Coreboot v3



To obtain the v3 sources, refer to Download coreboot. Further v3 documentation is provided in the coreboot-v3/doc subdirectory. The design/newboot.lyx document describes the v3 structure and is a good read. A PDF version is available here on the wiki, but may not always be completely up to date.

How to add support for a new board

To add new mainboard support to v3:

  1. add mainboard vendor to mainboard/Kconfig,
  2. add mainboard model to mainboard/$vendor/Kconfig,
  3. create mainboard/$vendor/$model/Kconfig and mainboard/$vendor/$model/Makefile by copying from a similar model and modifying accordingly,
  4. perform same copy/modify procedure for mainboard/$vendor/$model/initram.c, mainboard/$vendor/$model/stage1.c, and mainboard/$vendor/$model/cmos.layout,
  5. create mainboard-specific files mainboard/$vendor/$model/dts and (possibly) mainboard/$vendor/$model/irq_tables.c, populate with board-specific data.

For embedded boards, initram.c and stage1.c may need significant changes between different boards.

How to add a new SuperIO device

TODO. Please refer to documentation in the source tree.

How to add a new northbridge

TODO. Please refer to documentation in the source tree.

How to add a new southbridge

TODO. Please refer to documentation in the source tree.

How to add support for a new architecture (CPU)

Implementation specification for the CAR code

  • What's to be provided to the rest of coreboot v3


  • Exported labels to the rest of coreboot v3


  • Example implementations
    • via cache test registers TBD
    • via MTRR registers TBD
    • what else is possible?

How coreboot starts after Reset

Whenever an x86 CPU wakes up after reset, it does so in Real Mode. This mode is limited to a 1MiB address space and 64k offsets, and the reset vector of the original 8086/88 was located at 0xFFFF0.

Nothing changed in this respect for current processors such as the P3: these newer CPUs also appear to start at 0xF000:0xFFF0 after a reset. But they do not. The base of the code segment register is 0xFFFF0000 after reset, so the CPU presents the physical address 0xFFFFFFF0 to the chipset, and the chipset is responsible for forwarding this area to the boot ROM. It's confusing: the CPU "thinks" it runs code at 0xF000:0xFFF0, but it actually fetches code from 0xFFFFFFF0. The developers must have been tanked up when they cast this design into silicon.
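
The difference between the apparent and the actual reset address can be checked with a little arithmetic. The following is only a minimal sketch in Python (not coreboot code); the function and constant names are made up for illustration.

```python
# Model of how the first fetch address is formed after reset.
# All names here are illustrative and do not come from the coreboot tree.

RESET_CS_SELECTOR = 0xF000    # visible selector value after reset
RESET_CS_BASE = 0xFFFF0000    # hidden segment base actually used after reset
RESET_IP = 0xFFF0             # instruction pointer after reset


def real_mode_address(selector, offset):
    """Classic 8086 rule: physical address = selector * 16 + offset."""
    return (selector << 4) + offset


def physical_address(cs_base, offset):
    """What the CPU really does: hidden segment base + offset (32 bit)."""
    return (cs_base + offset) & 0xFFFFFFFF


apparent = real_mode_address(RESET_CS_SELECTOR, RESET_IP)
actual = physical_address(RESET_CS_BASE, RESET_IP)

print(hex(apparent))  # address the selector value suggests
print(hex(actual))    # address the chipset actually sees
```

The first value is the 8086 reset vector 0xFFFF0, the second is the address just below 4GiB that the chipset has to forward to the boot ROM.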

On some chipsets there is an additional pitfall: the A20 gate. It was introduced to make the 80286 fully compatible with its predecessor, the 8086. When the old CPU accesses memory "behind" the 0xFFFFF (=1MiB) limit, the address wraps around to 0x00000. On the 80286 and newer processors, accesses above 0xFFFFF do not wrap around natively; the external A20 gate forces this wrap on the newer CPUs as well. On older CPUs it is open after reset, so the CPU cannot generate addresses with A20 set.
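
The wrap-around can be modelled in a few lines. This is a simplified sketch in Python, assuming the gate does nothing more than force address line 20 low; the name bus_address and its a20_masked parameter are hypothetical.

```python
# Simplified model of the A20 gate: when the gate masks address line 20,
# bit 20 of every address is forced to zero before it reaches the bus.

A20_BIT = 1 << 20


def bus_address(selector, offset, a20_masked):
    """Return the address seen on the bus for a real mode access."""
    linear = (selector << 4) + offset   # can reach 0x10FFEF on a 286 or newer
    if a20_masked:
        linear &= ~A20_BIT              # 8086-style wrap at 1MiB
    return linear


# 0xFFFF:0x0010 is the first byte "behind" the 1MiB limit.
print(hex(bus_address(0xFFFF, 0x0010, a20_masked=True)))   # wraps to 0x0
print(hex(bus_address(0xFFFF, 0x0010, a20_masked=False)))  # 0x100000
```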


The next pitfall on some chipsets is whether they are able to forward two address ranges to the boot ROM: 0xFFFF0000 and 0xF0000. If they are not, and the opcode at the reset vector does something like "jmp 0xF000:xxxx", the machine crashes immediately, because the far jump reloads the code segment register and forces its base address to 0xF0000 (selector × 16). After this, the CPU really does output addresses of the form 0xFxxxx to the chipset. If the chipset cannot forward both address ranges, the boot ROM can no longer be accessed. You are lost.
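
Why a far jump is fatal here while a near branch is harmless can be shown with a tiny model of the code segment register. This is a sketch in Python under the usual real mode rule (a far jump reloads the hidden base with selector × 16); the class CodeSegment is made up for illustration.

```python
# Toy model of CS:IP in real mode right after reset.

class CodeSegment:
    def __init__(self):
        self.base = 0xFFFF0000   # hidden base after reset
        self.ip = 0xFFF0

    def near_jump(self, offset):
        """A near branch changes only IP; the hidden base is untouched."""
        self.ip = offset & 0xFFFF

    def far_jump(self, selector, offset):
        """A far jump reloads CS, so the base becomes selector * 16."""
        self.base = selector << 4
        self.ip = offset & 0xFFFF

    def fetch_address(self):
        return (self.base + self.ip) & 0xFFFFFFFF


cpu = CodeSegment()
cpu.near_jump(0xFFA0)
print(hex(cpu.fetch_address()))   # still just below 4GiB

cpu = CodeSegment()
cpu.far_jump(0xF000, 0xFFA0)
print(hex(cpu.fetch_address()))   # now below 1MiB
```

After the near branch the fetch address stays in the 0xFFFF0000 range; after the far jump it drops into the 0xF0000 range, which the chipset may not map to the ROM.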


How to escape from these restrictions?

First of all, we must make sure not to touch the base address of the code segment register. This keeps us in the 0xFFFF0000 address space. We can ensure this by using only near (relative) branches instead of far jumps, so the opcode at the reset vector must be nothing other than a near branch! The next step depends on the chipset in use. Does it open the A20 gate after reset? If so, we must close it before switching to Linear Flat Mode.

Mandatory steps

  • loading a Global Descriptor Table
  • setting the PE (protection enable) bit in the CR0 register
  • reloading of the code segment register (with a far jump)

This requires only a small amount of code, so we can confine the pain of real mode to a very small part of the whole program.

Everything becomes easy once we reach the Linear Flat Mode. No more hardware pain, no more toolchain pain. Why Linear Flat Mode and not Protected Mode? Most people call this operating mode Protected Mode when they switch the CPU to its native 32-bit mode in this way. But it is only a Linear Flat Mode, as there is no protection at all. It is more like a 32-bit real mode: there is no address translation, no paging and no protection.

The Linear Flat Mode

When all segment registers use the same base address and limit, this is called the Linear Flat Mode. Advantages:

  • no limits in the 4GiB address space
  • everything is allowed
  • no access restrictions to RAM and I/O
  • clear and easy to understand source code, no "tricks" required to access space above 1MiB

Disadvantages:

  • no protection if someone is working with a NULL or invalid pointer
  • no protection if any accessed address is invalid
  • as everything is allowed, everything could work against you
  • no protection of stack overflow

There is still one pain after entering the Linear Flat Mode: The lack of system RAM. This issue will be addressed by the Cache As RAM solution. See below.

The Toolchain Pain and how to solve it

Modern toolchains seem to generate Real Mode opcodes correctly, but they do not really support Real Mode sections across all parts of the toolchain. The flaw is in the linker: you cannot link Real Mode (that is, 16-bit) sections if they contain unresolved symbols (the direction does not matter).

But you can link any Real Mode section that does not contain unresolved symbols! So to solve the toolchain pain, we only have to avoid unresolved symbols in our Real Mode sections. As we control their content, this is a workable approach.

To avoid unresolved symbols, every instruction that would normally refer to an external symbol must use a fixed address instead. For these fixed addresses to be correct at runtime, the ROM image needs a special layout.

All we need are four fixed addresses, and the correct code at these points:

  • reset vector
  • the program code to load the GDT and switch to Linear Flat Mode
  • the Global Descriptor Table for Linear Flat Mode
  • the program code entered in Linear Flat Mode

The first two in this list are real mode sections that need special handling; the last two are already 32-bit sections that we can link and use as expected.

These four fixed addresses are defined in the coreboot v3 menu when expert mode is enabled. This menu matters only to developers, as they must define the addresses and sizes of these four areas to fit the platform's requirements.

The linker script to achieve this layout looks like this:

OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")

        .stage0_1 CONFIG_STAGE_0_PA_BASE : AT ( 0 ) {
                _stage0_1 = .;
                _estage0_1 = .;

/* ############## Create the delicate workflow for reset ################### */

        .stage0_flat_mode CONFIG_STAGE_0_PA_FLAT_SIDE : \

        .stage0_gdt CONFIG_STAGE_0_PA_GDT : \
                AT ( CONFIG_STAGE_0_PA_GDT - CONFIG_STAGE_0_PA_BASE ) {

        .stage0_real_mode CONFIG_STAGE_0_PA_REAL_SIDE - 0xFFF00000 : \
                FILL(0x90);     /* fill with NOP opcodes */

        .stage0_reset CONFIG_STAGE_0_PA_RESET_VECTOR - 0xFFF00000 : \

        /* fill the image up to the end */
        .stage0_filler 0x0000FFFE : AT ( 0xFFFFFFFE - CONFIG_STAGE_0_PA_BASE ) {
                BYTE(0xFE);     /* Computer type (XT) */
                BYTE(0xFF);     /* Checksum byte */

        /DISCARD/ : {

With these labels our layout now looks like this:

[Figure: layout of the stage0 sections in the ROM image]

After linking the whole code we can check the result with:

[jbe@jupiter]~/coreboot-v3> objdump -h build/stage0.o

build/stage0.o:     file format elf32-i386

Idx Name          Size      VMA       LMA       File off  Algn
  0 .stage0_1     0000199d  ffffc100  00000000  00000100  2**5
                  CONTENTS, ALLOC, LOAD, CODE
  1 .stage0_flat_mode 00000178  fffffc00  00003b00  00001c00  2**0
  2 .stage0_gdt   00000028  ffffff00  00003e00  00001f00  2**0
  3 .stage0_real_mode 0000002e  000fffa0  00003ea0  00001fa0  2**0
  4 .stage0_reset 0000000e  000ffff0  00003ef0  00001ff0  2**0
  5 .stage0_filler 00000002  0000fffe  00003efe  00001ffe  2**0
                  CONTENTS, ALLOC, LOAD, DATA
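
The LMA and VMA columns follow directly from the fixed physical addresses. The Python sketch below redoes that arithmetic. Note that the value of CONFIG_STAGE_0_PA_BASE is not stated on this page; 0xFFFFC100 is inferred from the VMA of .stage0_1 in the listing above, and the helper names are made up.

```python
# Recompute the objdump columns from the physical addresses.
# PA_BASE is inferred from the listing, not taken from a config file.

PA_BASE = 0xFFFFC100          # assumed CONFIG_STAGE_0_PA_BASE
REAL_MODE_BIAS = 0xFFF00000   # subtracted for the 16-bit sections

# section name -> (physical address, LMA shown by objdump)
SECTIONS = {
    ".stage0_flat_mode": (0xFFFFFC00, 0x3B00),
    ".stage0_gdt":       (0xFFFFFF00, 0x3E00),
    ".stage0_real_mode": (0xFFFFFFA0, 0x3EA0),
    ".stage0_reset":     (0xFFFFFFF0, 0x3EF0),
}


def load_offset(physical):
    """Offset inside the image: physical address minus image base."""
    return physical - PA_BASE


def real_mode_vma(physical):
    """VMA used for the 16-bit sections so their offsets fit the linker."""
    return physical - REAL_MODE_BIAS


for name, (physical, lma) in SECTIONS.items():
    print(name, hex(load_offset(physical)), hex(lma))

# The image ends exactly at the 4GiB boundary.
image_size = 0x100000000 - PA_BASE
print(hex(image_size))
```

Under this assumption every computed offset matches the LMA column, and the two real mode sections get the VMAs 0xFFFA0 and 0xFFFF0 shown by objdump.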

This is the layout.

Now we will fill the sections with code. The first one is the last one in the ROM: The reset vector. To avoid any unresolved symbols in this section we must create the branch opcode manually.

       .section ".reset_first", "ax"
       .globl reset_entry

       .byte 0xe9

We can check the result with objdump:

[jbe@jupiter]~/coreboot-v3> objdump -mi8086 -d -j .stage0_reset build/stage0.o

build/stage0.o:     file format elf32-i386

Disassembly of section .stage0_reset:

000ffff0 <reset_entry>:
   ffff0:       e9 ad ff                jmp    ffa0 <gdt_limit+0xff79>
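
The three bytes in the disassembly can be derived by hand: opcode 0xE9 is a near jump with a 16-bit displacement relative to the address of the following instruction. A small Python sketch of that calculation (the function name jmp_rel16 is made up):

```python
def jmp_rel16(instr_offset, target_offset):
    """Encode a 16-bit near jump (opcode 0xE9) located at instr_offset."""
    next_ip = (instr_offset + 3) & 0xFFFF         # the jump is 3 bytes long
    disp = (target_offset - next_ip) & 0xFFFF     # wraps within 64k
    return bytes([0xE9, disp & 0xFF, disp >> 8])  # little-endian displacement


# Reset vector at offset 0xFFF0, real mode code at offset 0xFFA0.
print(jmp_rel16(0xFFF0, 0xFFA0).hex())
```

For the offsets used here the displacement is 0xFFA0 - 0xFFF3 = -0x53, which is 0xFFAD as a 16-bit value and gives the byte sequence e9 ad ff.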

The next step is to load a GDT, enable protected mode and jump to the flat mode code. This is also 16-bit code, so we cannot use any labels to load the GDT or to jump to the correct flat mode code address. But we aligned both to fixed physical addresses, so we can use these addresses now:

       .section ".real_mode", "ax"
       .globl real_mode_fallback_entry

       movl    %eax, %ebp;     /* save the BIST result */
       xorl    %eax, %eax
       movl    %eax, %cr3      /* Invalidate TLB */

       data32  lgdt %cs:(CONFIG_STAGE_0_PA_GDT-0xFFFF0000)

       movl    %cr0, %eax
       andl    $0x7FFAFFD1, %eax /* PG,AM,WP,NE,TS,EM,MP = 0 */
       orl     $0x60000001, %eax /* CD, NW, PE = 1 */
       movl    %eax, %cr0

       movl    %ebp, %eax      /* Restore BIST result */
       data32  ljmp $ROM_CODE_SEG, $CONFIG_STAGE_0_PA_FLAT_SIDE

We can also check the result with objdump:

[jbe@jupiter]~/coreboot-v3> objdump -mi8086 -d -j .stage0_real_mode build/stage0.o

build/stage0.o:     file format elf32-i386

000fffa0 <real_mode_fallback_entry>:
  fffa0:       fa                      cli
  fffa1:       66 89 c5                mov    %eax,%ebp
  fffa4:       66 31 c0                xor    %eax,%eax
  fffa7:       0f 22 d8                mov    %eax,%cr3
  fffaa:       66 2e 0f 01 16 00 ff    lgdtl  %cs:-256
  fffb1:       0f 20 c0                mov    %cr0,%eax
  fffb4:       66 25 d1 ff fa 7f       and    $0x7ffaffd1,%eax
  fffba:       66 0d 01 00 00 60       or     $0x60000001,%eax
  fffc0:       0f 22 c0                mov    %eax,%cr0
  fffc3:       66 89 e8                mov    %ebp,%eax
  fffc6:       66 ea 00 fc ff ff 08    ljmpl  $0x8,$0xfffffc00
  fffcd:       00
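
The two constants applied to CR0 can be rebuilt from the bit names in the comments. This Python sketch uses the architectural CR0 bit positions; the helper names are hypothetical, and the input value 0x60000010 is only an example.

```python
# Architectural CR0 bit positions for the flags named in the comments.
CR0_BITS = {
    "PE": 0, "MP": 1, "EM": 2, "TS": 3, "NE": 5,
    "WP": 16, "AM": 18, "NW": 29, "CD": 30, "PG": 31,
}


def mask(*names):
    """OR together the bit masks of the named CR0 flags."""
    value = 0
    for name in names:
        value |= 1 << CR0_BITS[name]
    return value


AND_MASK = 0xFFFFFFFF & ~mask("PG", "AM", "WP", "NE", "TS", "EM", "MP")
OR_MASK = mask("CD", "NW", "PE")


def new_cr0(old):
    """Apply the same and/or sequence as the real mode code."""
    return (old & AND_MASK) | OR_MASK


print(hex(AND_MASK))             # 0x7ffaffd1
print(hex(OR_MASK))              # 0x60000001
print(hex(new_cr0(0x60000010)))  # example input with CD and NW already set
```

The result keeps the cache disabled (CD and NW set) and turns on PE, while paging and the other listed features stay off.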

How it works at runtime


  1. After reset, the CPU starts to fetch opcodes virtually from address 0xF000:0xFFF0 (=0xFFFFFFF0 = CONFIG_STAGE_0_PA_RESET_VECTOR). It fetches our branch command and continues at 0xF000:0xFFA0 (=0xFFFFFFA0 = CONFIG_STAGE_0_PA_REAL_SIDE).
  2. The real mode code loads the GDT from the real mode offset 0xFF00 relative to the CS register. This results in the address 0xF000:0xFF00 (=0xFFFFFF00 = CONFIG_STAGE_0_PA_GDT).
  3. The switch to protected mode jumps to the 32-bit offset 0xFFFFFC00, but now the base address of the CS register is 0 (from the GDT entry at offset 8), so we arrive at CONFIG_STAGE_0_PA_FLAT_SIDE.
  4. The flat mode code then
    1. activates the full Linear Flat Mode by loading the remaining segment registers
    2. activates CAR (CPU dependent)
  5. At the end, it is time to jump into the stage0_1 code at label stage1_main(). From now on, there is no more need for any special physical layout.
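
The addresses in steps 1 to 3 can be cross-checked against the operands in the disassembly. The Python sketch below does this with the constants from the objdump output; it is a plain arithmetic model with made-up names, not an emulation of the CPU.

```python
CS_BASE_AFTER_RESET = 0xFFFF0000   # hidden CS base until CS is reloaded


def fetch(base, offset):
    """Physical address = segment base + offset, truncated to 32 bit."""
    return (base + offset) & 0xFFFFFFFF


# Step 1: reset vector and the target of the near branch.
reset_vector = fetch(CS_BASE_AFTER_RESET, 0xFFF0)
real_side = fetch(CS_BASE_AFTER_RESET, 0xFFA0)

# Step 2: objdump prints the 16-bit lgdt operand as -256, i.e. 0xFF00.
gdt_offset = -256 & 0xFFFF
gdt = fetch(CS_BASE_AFTER_RESET, gdt_offset)

# Step 3: after the far jump, selector 0x8 has base 0 in the flat GDT.
flat_side = fetch(0x00000000, 0xFFFFFC00)

for name, value in [("reset_vector", reset_vector), ("real_side", real_side),
                    ("gdt", gdt), ("flat_side", flat_side)]:
    print(name, hex(value))
```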

coreboot V3 in Stages and Phases

Each stage is a LAR entry. Stage0 and Stage1 execute in place (XIP). Starting with Stage2 the LAR entry is copied to memory and executed.

Stage 0

Cache As RAM (CAR) setup. See "How coreboot starts after Reset" and "How it works at runtime" above. Once CAR is set up, we can use C code.

Stage 1

This is the XIP root stage, to which every subsequent stage returns if it returns at all.

Stage1 Phase 1

Early CPU (for AMD - HT, FID/VID), chipset (SMbus), and memory initialization.

Stage1 Phase 2

Disable CAR and start using memory.

Stage 2

Memory has been set up and coreboot is ready to enumerate the mainboard devices. The device tree (dts) is an integral part of this process. The dts contains the mainboard device locations and settings. Each phase walks the device tree with a particular purpose.

Stage2 Phase 1

Not currently used.

device_operations functions:

  • .phase1_set_device_operations

Stage2 Phase 2

Used for dts and/or device fixups that need to happen prior to scanning for devices. Note that early device setup should happen in phase3 unless there is a really good reason to put it here.

device_operations functions:

  • .phase2_fixup

Stage2 Phase 3

Scan the bus, adding bridges and devices to the dts. Do chip- and device-specific initialization. This is particularly important for chipset and mainboard devices (plug-in cards will have a standard resource init).

device_operations functions:

  • .phase3_scan_bus
  • .phase3_chip_setup_dev
  • .phase3_enable

Stage2 Phase 4

Set up resources for all bridges and devices, e.g. normal PCI device BAR allocation.

device_operations functions:

  • .phase4_read_resource
  • .phase4_set_resource

Stage2 Phase 5

Device enable, e.g. the PCI command register.

device_operations functions:

  • .phase5_enable_resource

Stage2 Phase 6

Late initialization, e.g. PCI ROM init.

device_operations functions:

  • .phase6_init


When stage2 completes, control is returned to stage1 and a payload is loaded. The payload should not return to coreboot.
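
The idea that each stage2 phase walks the device tree and calls the matching device_operations hooks can be sketched as follows. This is a simplified model in Python, not the coreboot v3 implementation: only the hook names are taken from the lists above, while the Device class, the walk order and the run_stage2 function are assumptions made for illustration (the real code may, for example, recurse differently while scanning buses in phase 3).

```python
# Hook names per phase, as listed in the phase descriptions.
PHASES = [
    ("phase1_set_device_operations",),
    ("phase2_fixup",),
    ("phase3_scan_bus", "phase3_chip_setup_dev", "phase3_enable"),
    ("phase4_read_resource", "phase4_set_resource"),
    ("phase5_enable_resource",),
    ("phase6_init",),
]


class Device:
    """Hypothetical device tree node with optional per-phase hooks."""

    def __init__(self, name, ops=None, children=()):
        self.name = name
        self.ops = ops or {}            # hook name -> callable(device)
        self.children = list(children)


def walk(device):
    """Yield the device and all devices below it, parents first."""
    yield device
    for child in device.children:
        yield from walk(child)


def run_stage2(root):
    """Walk the whole tree once per phase and call the hooks that exist."""
    log = []
    for hooks in PHASES:
        for device in walk(root):
            for hook in hooks:
                func = device.ops.get(hook)
                if func is not None:
                    func(device)
                    log.append((hook, device.name))
    return log


# Usage: a root device with one child; missing hooks are simply skipped.
child = Device("pci0", {"phase6_init": lambda dev: None})
root = Device("root", {"phase3_scan_bus": lambda dev: None}, [child])
print(run_stage2(root))
```

A device only provides the hooks it needs, which mirrors the statement that phase 1 is currently unused.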

Creative Commons License
This file is licensed under Creative Commons Attribution 2.5 License.
In short: you are free to distribute and modify the file as long as you attribute its author(s) or licensor(s).