Fallback mechanism

From coreboot
Latest revision as of 20:59, 25 February 2018

Introduction

This mechanism makes it possible to test coreboot images and to recover from certain non-booting ones.

This works by having two coreboot images in the same flash chip:

  • One fallback/ image: The working image.
  • One normal/ image: The image to be tested.

This feature has not been widely tested across boards. It also requires the board to export a reboot_counter entry in its CMOS layout.

It does not protect against human error when using the feature, nor against bugs in the code responsible for switching between the two images.

Use cases

  • Test new images much faster: if the image doesn't boot, coreboot falls back to the old known-working image, which saves a long reflashing procedure. Handy for faster bisecting.
  • Test new images more safely: despite the recommendation to have a way to reflash externally, many new users don't have one. Still, this method is not totally foolproof.
  • More compact testing setup: since reflashing tools are no longer mandatory, tests can be done with less hardware, which is very useful when traveling.

How it works

Coreboot increments a reboot count at each boot but never clears it. Whatever runs after coreboot is responsible for that.

That way, the count can be cleared by the OS once it has fully booted.

If the count reaches a certain threshold at boot (defined by CONFIG_MAX_REBOOT_CNT, typically 3), coreboot boots the fallback image.
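
The counting logic described above can be modeled with a small shell sketch. It is a simplified model under stated assumptions, not the actual bootblock code: the function name next_boot is hypothetical, fallback is assumed to be chosen once the counter has reached the maximum (standing in for CONFIG_MAX_REBOOT_CNT), and the counter is assumed to be incremented only while the normal image is still being attempted.

```shell
#!/bin/sh
# Simplified model of the boot decision (hypothetical helper, not coreboot code).
# Arguments: requested boot_option (Normal|Fallback), current reboot counter,
# maximum count (stands in for CONFIG_MAX_REBOOT_CNT).
# Prints: "<prefix> <new counter>".
next_boot() {
    option=$1
    count=$2
    max=$3
    if [ "$option" = "Normal" ] && [ "$count" -lt "$max" ]; then
        # Normal is still allowed: count this attempt and try normal/.
        echo "normal $((count + 1))"
    else
        # Threshold attained (or fallback requested): boot fallback/.
        echo "fallback $count"
    fi
}

# Simulate four boots in a row with a maximum of 3 and a counter that
# is never cleared (that is, the OS never finishes booting).
count=0
for attempt in 1 2 3 4; do
    result=$(next_boot Normal "$count" 3)
    echo "boot $attempt: $result"
    count=${result#* }   # keep only the new counter value
done
```

In this model the first three attempts go to normal/ and the fourth one lands on fallback/, because nothing reset the counter in between.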

Warnings

Because two images are used, it is easy to misidentify which image booted:

  • If the user mistakenly thinks the normal image is booting...
  • But the fallback image always boots...
  • And the normal image doesn't work...
  • And the user flashes the normal image into fallback/ because she thinks it boots fine...
  • Then the user has bricked her device and has to reflash it externally.

Fallback build

To configure the build for the fallback image, run:

$ make menuconfig

Then, in "General setup --->", near the top, set "CBFS prefix to use" to "fallback":

(fallback) CBFS prefix to use

Then, near the bottom, make sure that the following option is not selected:

[ ] Update existing coreboot.rom image

And, at the bottom of the "Chipset --->" menu, make sure to have:

Bootblock behaviour (Switch to normal if CMOS says so)  --->
[*] Do not clear reboot count after successful boot

You can then build the fallback image with the fallback.sh script.
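
Before building either image, it can be worth double-checking that the saved configuration matches the intended prefix. The following is a minimal sketch, assuming that the menu entries above end up in .config as CONFIG_CBFS_PREFIX and CONFIG_UPDATE_IMAGE (verify the symbol names against your own .config); check_config is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: verify that a coreboot .config matches the intended image.
# Usage: check_config <config file> <fallback|normal>
check_config() {
    config=$1
    prefix=$2
    # The CBFS prefix must match the image being built.
    grep -q "^CONFIG_CBFS_PREFIX=\"$prefix\"\$" "$config" || {
        echo "wrong CBFS prefix in $config (expected $prefix)" >&2
        return 1
    }
    if [ "$prefix" = "fallback" ]; then
        # The fallback image is built from scratch, not added to an existing ROM.
        if grep -q '^CONFIG_UPDATE_IMAGE=y$' "$config"; then
            echo "CONFIG_UPDATE_IMAGE must not be set for fallback" >&2
            return 1
        fi
    else
        # The normal image is added to an existing coreboot.rom.
        grep -q '^CONFIG_UPDATE_IMAGE=y$' "$config" || {
            echo "CONFIG_UPDATE_IMAGE must be set for normal" >&2
            return 1
        }
    fi
    echo "$config looks right for $prefix"
}
```

A typical call would be `check_config .config fallback` right after leaving menuconfig.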

Normal build

To configure the build for the normal image, run:

$ make menuconfig

Then, in "General setup --->", near the top, set "CBFS prefix to use" to "normal":

(normal) CBFS prefix to use

Then, near the bottom, make sure that the following option is selected:

[*] Update existing coreboot.rom image

And, at the bottom of the "Chipset --->" menu, make sure to have:

Bootblock behaviour (Switch to normal if CMOS says so)  --->
[*] Do not clear reboot count after successful boot

You can then build the normal part with the normal.sh script. It takes an existing coreboot image as its argument.
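
Once both builds are done, the resulting image should contain files under both prefixes. As a rough sanity check, the listing printed by `cbfstool <image> print` can be piped into a small filter. This is only a sketch: has_both_prefixes is a hypothetical helper, and it assumes that each image has a romstage entry named fallback/romstage or normal/romstage in the first column of the listing.

```shell
#!/bin/sh
# Hypothetical helper: read the output of "cbfstool <image> print" on stdin
# and check that both a fallback/ and a normal/ romstage are listed.
has_both_prefixes() {
    listing=$(cat)
    for prefix in fallback normal; do
        # Entry names are in the first column of the listing.
        printf '%s\n' "$listing" | grep -q "^$prefix/romstage[[:space:]]" || {
            echo "missing $prefix/romstage" >&2
            return 1
        }
    done
    echo "both fallback/ and normal/ are present"
}
```

For example, assuming cbfstool was built into build/: `./build/cbfstool ./build/coreboot.rom print | has_both_prefixes`.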

OS configuration

The manual way

One approach is to run switch-to-normal.sh before trying an image. It is, however, more error-prone than the systemd approach because:

  • You have to do it manually, each time, before testing an image.
  • If you then want to keep using that new image, you have to flash it again, this time to fallback/.

switch-to-normal.sh

#!/bin/sh
nvramtool -w boot_option=Normal
nvramtool -w reboot_counter=0

switch-to-fallback.sh

#!/bin/sh
nvramtool -w boot_option=Fallback
nvramtool -w reboot_counter=15

(Assuming that 15 is the maximum that can be stored in reboot_counter.)
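
The two scripts can also be folded into a single helper, which makes a dry run possible before touching the NVRAM. This is a sketch only: switch_image and the NVRAMTOOL override variable are hypothetical names, and the nvramtool parameters are the same ones used in the scripts above.

```shell
#!/bin/sh
# Hypothetical wrapper combining switch-to-normal.sh and switch-to-fallback.sh.
# NVRAMTOOL can be overridden, e.g. NVRAMTOOL="echo nvramtool" for a dry run.
NVRAMTOOL=${NVRAMTOOL:-nvramtool}

switch_image() {
    case $1 in
        normal)
            # Request normal/ and give it a fresh boot count.
            $NVRAMTOOL -w boot_option=Normal &&
            $NVRAMTOOL -w reboot_counter=0
            ;;
        fallback)
            # Request fallback/ and saturate the counter
            # (15 is assumed to be the maximum, as above).
            $NVRAMTOOL -w boot_option=Fallback &&
            $NVRAMTOOL -w reboot_counter=15
            ;;
        *)
            echo "usage: switch_image normal|fallback" >&2
            return 1
            ;;
    esac
}
```

With NVRAMTOOL="echo nvramtool", calling `switch_image normal` only prints the two commands it would run; without the override it needs the same privileges as the original scripts.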

Systemd

Here we use systemd to automatically reset the boot counter after each successful boot (or resume).

The normal image is then meant for daily use, and fallback is only used when something goes wrong.

To set this up, first install nvramtool (from the coreboot sources):

$ cd util/nvramtool
$ make
$ sudo make install

Then add the following systemd units at their respective paths:

  • /etc/systemd/system/coreboot@boot.service
  • /etc/systemd/system/coreboot@resume.service

Then enable them with:

$ sudo systemctl enable coreboot@boot.service
$ sudo systemctl start coreboot@boot.service
$ sudo systemctl enable coreboot@resume.service
$ sudo systemctl start coreboot@resume.service

Current limitations

  • The same cmos.layout has to be used in fallback and normal!
  • The user may misidentify which image booted and, because of that, end up reflashing a non-working image.
  • Some issues can arise when the NVRAM layout is not the same between normal/ and fallback/.
  • The number of failed boots tolerated is fixed at compile time.
  • In order to fully boot, some boards reset conditionally during the boot process, which results in an unpredictable increment of the boot count.
  • Example scripts exist only for systemd. Still, they are trivial to adapt to other init systems.
  • Payloads sometimes have fixed default locations when loading things from CBFS:
    • When using GRUB as a payload, grub.cfg is at etc/grub.cfg by default, so if you want to test GRUB as a payload, remember to change grub.cfg's path so that it does not interfere with the fallback's GRUB configuration.
    • The path of what SeaBIOS loads from CBFS can probably be changed with SeaBIOS CBFS symlinks, but this is not yet tested or documented in combination with the fallback mechanism.
  • Tested boards need to be listed somewhere.

Issues

thinkpad_acpi

This Linux driver can interact badly with the fallback/normal mechanism: when it is used with the volume_control=1 option, volume_mode=1 is also required; otherwise, after the computer is shut down, it will always boot from fallback.

A possible explanation is that the default volume_mode setting touches the NVRAM, so the driver probably corrupts it at shutdown when saving the ALSA state of the volume buttons "sound card" (called EC Mixer). At the next boot, coreboot then detects a corrupted NVRAM and restores its valid defaults.
