[coreboot] MP table multicore patch

Stefan Reinauer stepan at coresystems.de
Wed Feb 17 00:40:00 CET 2010


On 2/16/10 8:42 PM, Myles Watson wrote:
>
>
> On Tue, Feb 16, 2010 at 12:02 PM, Stefan Reinauer
> <stepan at coresystems.de <mailto:stepan at coresystems.de>> wrote:
>
>     On 2/16/10 5:11 AM, Timothy Pearson wrote:
>     > Here is a cleaned up and tested version of the SMP APIC
>     autodetect patch.
>     >
>     > Signed-off-by: Timothy Pearson
>     <tpearson at raptorengineeringinc.com
>     <mailto:tpearson at raptorengineeringinc.com>>
>
>  
>
>     And, can you describe high level, what the patch changes? It looks
>     to me
>     as if you are recursing through the tree instead of just walking the
>     "all_devices" list.
>     So this implies that you don't catch all devices when running through
>     all_devices. This sounds like a problem in the resource allocator and
>     maybe it should be fixed there instead?
>
> I don't understand why this would be a resource allocator problem. 
> Aren't we talking about the device tree?

Oh, sorry I didn't say more...  What the old code did was the following:

    device_t cpu;
    for(cpu = all_devices; cpu; cpu = cpu->next) {
     ....
     <- check if the device is an APIC and if it is, dump an APIC entry
in the mptable ->
     ....
    }

But it seems that the code does not see all devices -- anymore -- not
sure if it ever did, but I think to remember that's what happened at
some point, or was at least intended.

So now the code looks l

scan_for_apic_devices(parent)
{
    for(c_it=0; c_it < parent->links; c_it++) {
        for (child = parent->link[c_it].children; child; child =
child->sibling) {
....
            if (child->path.type == DEVICE_PATH_APIC_CLUSTER) {
                // Found an APIC cluster, scan it for APICs
                smp_scan_for_apic_devices(child);
            }
        }
    }
}

So two more steps are necessary:
- check all the downwards links of a device instead of just walking
devices and checking their type.
- run recursively in a special case on APIC clusters.

This sounds a whole lot like something changed in the way "all_devices"
works. And if "all_devices" does not mean "all devices" I am sure there
are more places in our code that need similar fixes.

> Maybe the real problem is related to the memory corruption seen with
> printk?
Not completely impossible, but I figured it's easier to ask the guy who
rewrote the resource allocator if he knows something about how the
intended behavior of "all_devices" ;-)
- Maybe the behavior is intended, then we just need to check in
Timothy's patch.
- Maybe the behavior is not intended, but that's how the code works
right now. Then we'd rather have to look at the resource allocator and
decide if we want "all_devices" to mean "all devices", or whether we
rename the variable, or redefine its meaning.
- Maybe none of the above applies, then we need to do nasty stack
corruption debugging. In this case it would be fundamentally wrong to
touch either the device tree code or the mp table creation code until we
fix the corruption in order to make sure we don't create funny special
cases.

So, if you have hints enlightening any of the maybes, please share!


Stefan



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.coreboot.org/pipermail/coreboot/attachments/20100217/f0eb41c8/attachment.html>


More information about the coreboot mailing list