[coreboot] AMD cache setup is broken
scott at notabs.org
Tue Sep 7 02:40:59 CEST 2010
Stefan Reinauer <stefan.reinauer at coresystems.de> writes:
>> Can you see if the patches posted in
>> http://article.gmane.org/gmane.linux.bios/57707 make any difference for
> Did we ever figure out what is causing this?
The last time I really dug into this, it was fairly obvious that it was
caused by instruction fetch thrashing towards the ROM. I tried to amend
this with MTRR settings, but I was unable to make that work correctly.
For some reason it seemed like the HT requests were subtly changed when
the MTRR was applied, and didn't hit the legacy southbridge properly.
> The patch would require 4KB more stack on all supported systems, so if
> we can we should do things differently.
It doesn't have to be stack, but it is nice to have it memory managed
in some way. The unrv2b patch I posted addressing the same problem was
even more kludgy.
> Also, it's not really guaranteed that the code works from the new
> location since we don't compile coreboot with -fPIC (and as far as I
> understand the GCC guys, even that would not help), so I am a bit
> hesitant to check this in.
Agreed, it is a bit icky. Not sure what the best way to handle that is,
though. On the pro side, I assume breakage here is going to be obvious,
and (supposing these patches actually help Nick) this is an issue people
are running into with some regularity.
One necessary condition for caching MMIO such as the flash chip on
AMD family 10h processors is not well known:
If the processor has an L3 cache, then bit 15 of msr C001_102A
(ClLinesToNbDis) must be set. This bit needs to eventually be cleared
in order for the OS to use the L3 cache. But BIOS must not clear this
bit until cacheable accesses to the flash chip are no longer needed.
This situation applies only to family 10h processors that have L3 cache.
Often BIOS clears this bit too early, and slow execution results. As an
experiment, you could add code to set this bit before the slow function
and see what happens.
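For that experiment, a minimal sketch of the bit manipulation might look like this. The helper names are hypothetical and not taken from the coreboot tree. They only transform a 64-bit image of the MSR, on the assumption that the caller reads and writes MSR C001_102A itself with whatever rdmsr/wrmsr wrappers are available at that stage.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR number and bit position as given in this message. */
#define MSR_C001_102A       0xC001102Au
#define CL_LINES_TO_NB_DIS  (1ull << 15)

/*
 * Hypothetical helpers (names are illustrative only). They operate on a
 * 64-bit copy of the MSR value; the actual rdmsr/wrmsr is left to the caller.
 */

/* Set ClLinesToNbDis: needed while the flash chip is accessed cacheably. */
static inline uint64_t cl_lines_to_nb_dis_set(uint64_t msr_value)
{
	return msr_value | CL_LINES_TO_NB_DIS;
}

/* Clear ClLinesToNbDis: only once cacheable flash accesses are finished. */
static inline uint64_t cl_lines_to_nb_dis_clear(uint64_t msr_value)
{
	return msr_value & ~CL_LINES_TO_NB_DIS;
}

/* Report whether ClLinesToNbDis is set in a given MSR value. */
static inline bool cl_lines_to_nb_dis_is_set(uint64_t msr_value)
{
	return (msr_value & CL_LINES_TO_NB_DIS) != 0;
}
```

The intended use would be to read the MSR, pass the value through cl_lines_to_nb_dis_set(), and write it back just before the slow function. All other bits are left untouched.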
Last night I tried to debug this code on simnow. An HT modeling problem
kept me from getting past HT init. I may try it again today.
The recommended cacheability setting for MMIO is WP. At the point the
simnow model hangs in HT init, the setting is WB. While this should
be OK for family 10h, it will be important to use WP for families
14h and 15h. ClLinesToNbDis is properly set for MMIO caching at this point:
------------Effective memory type and destination by address------------
                   NORMAL   NORMAL   NORMAL       SMM      SMM      SMM
                   READ     WRITE    EXECUTE      READ     WRITE    EXECUTE
00000-C3FFF        UC MMIO....................  UC MMIO....................
C4000-CFFFF        WB DRAM....................  WB DRAM....................
D0000-FFFFF        UC MMIO....................  UC MMIO....................
00100000-00FFFFFF  UC DRAM
01000000-FFEFFFFF  UC MMIO
FFF00000-FFF7FFFF  WB MMIO  <=== really should be WP
FFF80000-FFFFFFFF  UC MMIO
00000040_010080C8 <=== good at this point
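To illustrate the WP recommendation for the FFF00000-FFF7FFFF range, here is a rough sketch of how a variable MTRR base/mask pair could be computed for it. The struct and function names are hypothetical. The memory type encodings and the mask valid bit (bit 11) follow the architectural MTRR definition. The physical address width is passed in as a parameter, and 48 bits is assumed in the note that follows.

```c
#include <stdint.h>

/* Architectural MTRR memory type encodings. */
enum mtrr_type {
	MTRR_UC = 0,	/* uncacheable */
	MTRR_WC = 1,	/* write-combining */
	MTRR_WT = 4,	/* write-through */
	MTRR_WP = 5,	/* write-protect */
	MTRR_WB = 6	/* write-back */
};

/* Valid bit in a variable MTRR PhysMask register. */
#define MTRR_PHYS_MASK_VALID (1ull << 11)

/* Hypothetical container for one variable MTRR pair. */
struct var_mtrr {
	uint64_t base;	/* value for the PhysBase register */
	uint64_t mask;	/* value for the PhysMask register */
};

/*
 * Encode one variable MTRR. Assumes size is a power of two and base is
 * aligned to size. phys_bits is the physical address width of the CPU.
 */
static inline struct var_mtrr var_mtrr_encode(uint64_t base, uint64_t size,
					      enum mtrr_type type,
					      unsigned int phys_bits)
{
	uint64_t addr_mask = (phys_bits >= 64) ? ~0ull
					       : ((1ull << phys_bits) - 1);
	struct var_mtrr m;

	m.base = base | (uint64_t)type;
	m.mask = (~(size - 1) & addr_mask) | MTRR_PHYS_MASK_VALID;
	return m;
}
```

For the 512 KB range above with a 48-bit address width, var_mtrr_encode(0xFFF00000, 0x80000, MTRR_WP, 48) gives a base of 0xFFF00005 and a mask of 0x0000FFFFFFF80800.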
Simnow testing with Tilapia confirms that the coreboot AMD family 10h
code _does_ have the problem of clearing ClLinesToNbDis too early. To
confirm this problem, someone testing on real AMD family 10h hardware
should remove the MSR C001_102A write from STOP_CAR_AND_CPU() and
from mct_ClrClToNB_D(). An AMD F10h system running an optimized legacy
BIOS can boot to a DOS prompt in less than one second. There is no
reason coreboot should be any slower.