[coreboot] serengeti_cheetah_fam10: Erratas triple fault in SimNow (was: AMD SimNOW Seg Fault)

Bernhard Kaindl bkaindl at ffii.org
Thu Mar 20 21:46:27 CET 2008


Hi,
    Are you saying SimNow itself segfaults on you or is it coreboot
which triple-faults inside SimNow?

Maybe this is something different:

I recently investigated why

src/mainboard/amd/serengeti_cheetah_fam10/cache_as_ram_auto.c

causes triple-faults inside the publicly available SimNow
(the non-NDA version) and here is one of the causes:

         /* FIXME: Check CPU revision to apply correct erratas */
         /* Rev B errata */
         /* Errata #169 - supercedes errata #131 */
         msr = rdmsr(0xC001001F);
         msr.hi |= 1 << (32 - 32);
         wrmsr(0xC001101F, msr);

This writes the value to a different MSR than the one indicated
in Erratum #169. To apply the erratum as described in the public
document, a change like this is needed:

         msr = rdmsr(0xC001001F);
         msr.hi |= 1 << (32 - 32);
-       wrmsr(0xC001101F, msr);
+       wrmsr(0xC001001F, msr);

The current code reads the correct MSR and sets the correct bit
(msr.hi holds bits 63:32, so bit 0 of msr.hi is bit 32 of the MSR;
writing "1 << 32" on the 32-bit msr.hi would be wrong), but it
writes the changed value to a private, undocumented or even
non-existent MSR. Most likely it is a typo.
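
To make the bit numbering concrete, here is a minimal sketch of how a
bit index of the 64-bit MSR maps onto a lo/hi pair. It assumes msr_t
holds two 32-bit halves, as coreboot's msr_t does. msr_set_bit() is a
hypothetical helper for illustration and not part of coreboot.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the 64-bit MSR value split into two 32-bit halves. */
typedef struct { uint32_t lo; uint32_t hi; } msr_t;

/* Hypothetical helper: set bit `bit` (0..63) of the 64-bit MSR value. */
static msr_t msr_set_bit(msr_t msr, unsigned bit)
{
    assert(bit < 64);
    if (bit < 32)
        msr.lo |= UINT32_C(1) << bit;        /* bits 31:0 live in lo  */
    else
        msr.hi |= UINT32_C(1) << (bit - 32); /* bits 63:32 live in hi */
    return msr;
}
```

With this helper, msr_set_bit(msr, 32) touches only bit 0 of msr.hi.
The result would then be written back to the same MSR index it was
read from.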

Sadly, bit 32 of 0xC001001F is also undocumented AFAICS, but
Erratum #169 says that it should be set. However, that erratum
was later updated to say that a register in the northbridge
must also be changed, and I have not found that part of the
erratum in coreboot yet.

With that change (I guess it's a fix) SimNow executes this code
but triple-faults on the next erratum implementation:

         /* Errata #202 [DIS_PIGGY_BACK_SCRUB]=1 */
         msr = rdmsr(0xC0011022);
         msr.hi |= 1 << 24;
         wrmsr(0xC0010022, msr);

Again, this writes the changed value to a different MSR than the
one that was read (0xC0011022 is read, 0xC0010022 is written). The
target is also undocumented or even non-existent, or it is a typo.
I also did not manage to find any information on Erratum #202, so
I guess it applies to AMD engineering samples only?

I have no suggestion on how to fix that part as I could not
find any documentation on it.

...

I also think that errata workarounds which are not essential in
the very earliest boot stage should not necessarily reside inside
the mainboard-specific cache_as_ram_auto.c, but should be moved to
a place in the compressed coreboot code where different boards
can share errata implementations for the CPUs which they support.

When everything is set up, exceptions from wrmsr could also be
handled better (I guess) than by causing triple faults. Linux has
wrmsr() variants with exception handling (e.g. wrmsr_safe()) in
include/asm-x86/msr.h which return a proper error code instead of
crashing.
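
As a rough sketch of that calling convention only: the access returns
an error code and the caller decides what to do. A real implementation
needs an exception handler that recovers from the #GP raised by the
instruction. Here a small simulated MSR file stands in for the
hardware so the example runs in user space. rdmsr_checked(),
wrmsr_checked() and the list of "existing" MSRs are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t lo; uint32_t hi; } msr_t;

/* Simulated MSR file. Which indices "exist" here is an assumption made
 * for the demonstration, not a statement about any real CPU. */
struct sim_msr { uint32_t index; msr_t value; };
static struct sim_msr sim_msrs[] = {
    { 0xC001001Fu, { 0, 0 } },
};

static struct sim_msr *sim_find(uint32_t index)
{
    for (size_t i = 0; i < sizeof(sim_msrs) / sizeof(sim_msrs[0]); i++)
        if (sim_msrs[i].index == index)
            return &sim_msrs[i];
    return NULL;
}

/* Returns 0 on success, -1 where real hardware would raise #GP. */
static int rdmsr_checked(uint32_t index, msr_t *out)
{
    struct sim_msr *m = sim_find(index);
    assert(out != NULL);
    if (m == NULL)
        return -1;
    *out = m->value;
    return 0;
}

/* Returns 0 on success, -1 where real hardware would raise #GP. */
static int wrmsr_checked(uint32_t index, msr_t value)
{
    struct sim_msr *m = sim_find(index);
    if (m == NULL)
        return -1;
    m->value = value;
    return 0;
}
```

A caller can then log a failed write to a mistyped index such as
0xC001101F and continue booting, instead of taking a triple fault.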

NetBSD has a very nice structure in place for that: you can add
an erratum simply by adding an entry to a table which specifies
which erratum shall be applied to which CPU:

http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/x86/x86/errata.c?rev=1.13&content-type=text/x-cvsweb-markup
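
A table-driven scheme along those lines could look roughly like the
sketch below. This is not NetBSD's code: the struct layout, the names
and the CPUID mask/match values are hypothetical placeholders, and the
MSR accessors are passed in as function pointers so the walker can be
exercised against a simulation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: the 64-bit MSR value split into two 32-bit halves. */
typedef struct { uint32_t lo; uint32_t hi; } msr_t;

/* Hypothetical entry: "on CPUs matching this signature, set this bit". */
struct erratum_fix {
    unsigned number;      /* erratum number, for logging */
    uint32_t cpuid_mask;  /* CPUID signature bits to compare */
    uint32_t cpuid_match; /* value those bits must have */
    uint32_t msr;         /* MSR index, used for read AND write */
    unsigned bit;         /* bit to set, 0..63 */
};

typedef msr_t (*rdmsr_fn)(uint32_t index);
typedef void (*wrmsr_fn)(uint32_t index, msr_t value);

/* Mask/match values are illustrative, not taken from a revision guide. */
static const struct erratum_fix erratum_table[] = {
    { 169, 0xFFFFFF00u, 0x00100F00u, 0xC001001Fu, 32 },
};

/* Apply every matching entry; returns how many were applied. */
static size_t apply_errata(uint32_t cpuid, const struct erratum_fix *tab,
                           size_t n, rdmsr_fn rd, wrmsr_fn wr)
{
    size_t applied = 0;
    for (size_t i = 0; i < n; i++) {
        if ((cpuid & tab[i].cpuid_mask) != tab[i].cpuid_match)
            continue;
        assert(tab[i].bit < 64);
        msr_t v = rd(tab[i].msr);
        if (tab[i].bit < 32)
            v.lo |= UINT32_C(1) << tab[i].bit;
        else
            v.hi |= UINT32_C(1) << (tab[i].bit - 32);
        wr(tab[i].msr, v); /* same index as the read, by construction */
        applied++;
    }
    return applied;
}

/* Simulated backend with a single register, for demonstration only. */
static uint32_t sim_index;
static msr_t sim_value;
static msr_t sim_rdmsr(uint32_t index) { (void)index; return sim_value; }
static void sim_wrmsr(uint32_t index, msr_t v) { sim_index = index; sim_value = v; }
```

Because each entry names the MSR only once, the kind of read/write
index mismatch seen above cannot happen by typo.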

Bernhard

On Thu, 20 Mar 2008, Marc Karasek wrote:

> I have gotten some cycles and they removed our proxy server
> (hurrah!!!).  So I recompiled the BIOS for SimNOW using buildrom on my
> test machine.
>
> When I went to run SimNOW it Seg Faults.  I tried it with the default
> BIOS image for the Cheetah BSD, I also tried one of the other BSDs.  All
> of them Seg Fault. :-(
>
> I made the mistake of updating Fedora8_64 with the latest RPMs.   Lesson
> learned, if it ain't broke don't fix it...
>
> Does anyone have any idea what, I am guessing, package could be causing
> this?  I have tried with both kernels that are on the machine, with no
> success.  It did work at one point,  before the update.  I can nuke the
> box and reinstall f8_64, but would rather not.
>
> --
> *********************
> Marc Karasek
> MTS
> Sun Microsystems
> mailto:marc.karasek at sun.com
> ph:770.360.6415
> *********************



