[coreboot] AMD cache setup is broken

Scott Duplichan scott at notabs.org
Wed Sep 8 21:47:02 CEST 2010

]On Mon, Sep 6, 2010 at 6:40 PM, Scott Duplichan <scott at notabs.org> wrote:
]> Stefan Reinauer <stefan.reinauer at coresystems.de> writes:
]>>> Can you see if the patches posted in
]>>> http://article.gmane.org/gmane.linux.bios/57707 make any difference for
]>>> you?
]>> Did we ever figure out what is causing this?
]> The last time I really dug into this, it was fairly obvious that it was
]> caused by instruction fetch thrashing towards the ROM.  I tried to amend
]> this with MTRR settings, but I was unable to make that work correctly.
]> For some reason it seemed like the HT requests were subtly changed when
]> the MTRR was applied, and didn't hit the legacy southbridge properly.
]>> The patch would require 4KB more stack on all supported systems, so if
]>> we can we should do things differently.
]> It doesn't have to be stack, but it is nice to have it memory managed
]> in some way.  The unrv2b patch I posted addressing the same problem was
]> even more kludgy.
]>> Also, it's not really guaranteed that the code works from the new
]>> location since we don't compile coreboot with -fPIC (and as far as I
]>> understand the GCC guys, even that would not help), so I am a bit
]>> hesitant to check this in.
]> Agreed, it is a bit icky.  Not sure what the best way to handle that is,
]> though.  On the pro side, I assume breakage here is going to be obvious,
]> and (supposing these patches actually help Nick) this is an issue people
]> are running into with some regularity.
]> --
]>                                                        Arne.
]> One necessary condition for caching MMIO such as the flash chip on
]> AMD family 10h processors is not well known:
]> If the processor has an L3 cache, then bit 15 of msr C001_102A
]> (ClLinesToNbDis) must be set. This bit needs to eventually be cleared
]> in order for the OS to use the L3 cache. But BIOS must not clear this
]> bit until cacheable accesses to the flash chip are no longer needed.
]> This situation applies only to family 10h processors that have L3 cache.
]> Often BIOS clears this bit too early and slow execution results. As an
]> experiment, you could add code to set this bit before the slow function
]> and see what happens.
]> Last night I tried to debug this code on simnow. An HT modeling problem
]> kept me from getting past HT init. I may try it again today.
]> The recommended cacheability setting for MMIO is WP. At the point the
]> simnow model hangs in HT init, the setting is WB. While this should
]> be OK for family 10h, it will be important to use WP for families
]> 14h and 15h. ClLinesToNbDis is properly set for MMIO caching at this point
]> (HT init):
]> ------------Effective memory type and destination by address------------
]>             NORMAL    NORMAL    NORMAL     SMM       SMM       SMM
]>             READ      WRITE     EXECUTE    READ      WRITE     EXECUTE
]> 00000-C3FFF  UC MMIO....................    UC MMIO....................
]> C4000-CFFFF  WB DRAM....................    WB DRAM....................
]> D0000-FFFFF  UC MMIO....................    UC MMIO....................
]> 00100000-00FFFFFF  UC DRAM
]> 01000000-FFEFFFFF  UC MMIO
]> FFF00000-FFF7FFFF  WB MMIO   <=== really should be WP
]> -msr c001102a
]> 00000040_010080C8 <=== good at this point
]> Thanks,
]> Scott
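
The dumped value can be checked by hand: bit 15 is 0x8000, and the low
16 bits of 00000040_010080C8 are 0x80C8, so ClLinesToNbDis is set. A
minimal C sketch of that check follows. The macro and function names are
hypothetical (they are not coreboot's), and the MSR value is treated as a
plain 64-bit integer rather than being read with rdmsr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR C001_102A bit 15 (ClLinesToNbDis), as described in this thread. */
#define CL_LINES_TO_NB_DIS (UINT64_C(1) << 15)

/* Hypothetical helper: true if a raw MSR C001_102A value has bit 15 set. */
bool cl_lines_to_nb_dis_is_set(uint64_t msr_value)
{
	return (msr_value & CL_LINES_TO_NB_DIS) != 0;
}

/* Hypothetical self-check against the value shown in the simnow dump. */
void check_simnow_dump(void)
{
	const uint64_t dumped = UINT64_C(0x00000040010080C8);

	/* Bit set: cacheable accesses to the flash chip are still possible. */
	assert(cl_lines_to_nb_dis_is_set(dumped));

	/* The same value after an (early) clear of bit 15. */
	assert(!cl_lines_to_nb_dis_is_set(dumped & ~CL_LINES_TO_NB_DIS));
}
```
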
]> Simnow testing with Tilapia confirms that the coreboot AMD family 10h
]> code _does_ have the problem of clearing ClLinesToNbDis too early. To
]> confirm this problem, someone testing on real AMD family 10h hardware
]> should remove the msr C001_102A write from STOP_CAR_AND_CPU() and
]> from mct_ClrClToNB_D(). An AMD F10h system running an optimized legacy
]> bios can boot to a DOS prompt in less than one second. There is no
]> reason coreboot should be any slower.
]> Thanks,
]> Scott
]Ah, this makes sense now. Is c001_102a a shared msr? Is it ok for the
]APs to be disabled and just leave it enabled on the BSP until the
]ramstage is decompressed? I should have a patch ready this afternoon.

I checked on real hardware, and family 10h msr c001_102a is not shared
among cores on a die. I suppose it would be OK to do what you say. I 
imagine that would minimize code changes. The alternative is to remove
the two instances where ClLinesToNbDis is cleared early, and then
clear it at some later time, before AP MSRs are synced to the BSP values.
I see the BKDG was finally updated to address this situation:

    When BIOS is done executing from WP-IO the following steps are followed:
    1. MSRC001_102A[ClLinesToNbDis]=0.
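
Because the MSR is per core, the "clear it on the APs, leave it set on
the BSP" idea can be modelled roughly as below. This is only a sketch
under stated assumptions: the array stands in for one MSR C001_102A copy
per core (real code would do rdmsr/wrmsr on each core), the function
names are hypothetical, and core 0 being the BSP is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* MSR C001_102A bit 15 (ClLinesToNbDis). */
#define CL_LINES_TO_NB_DIS (UINT64_C(1) << 15)

/* Assumption for this model: core 0 is the BSP. */
#define BSP_INDEX 0

/* Clear the bit on every AP but leave the BSP's copy untouched, so the
 * BSP can keep executing from cacheable flash. */
void clear_cl_lines_to_nb_dis_on_aps(uint64_t *core_msr, size_t ncores)
{
	for (size_t i = 0; i < ncores; i++) {
		if (i != BSP_INDEX)
			core_msr[i] &= ~CL_LINES_TO_NB_DIS;
	}
}

/* Clear the bit on the BSP. Intended to be called only once the BSP no
 * longer needs cacheable accesses to the flash chip (for example after
 * the ramstage has been decompressed). */
void clear_cl_lines_to_nb_dis_on_bsp(uint64_t *core_msr, size_t ncores)
{
	if (ncores > BSP_INDEX)
		core_msr[BSP_INDEX] &= ~CL_LINES_TO_NB_DIS;
}
```

The point of splitting this into two calls is ordering: the second one
is the step the BKDG text above describes, and it has to come after the
last cacheable fetch from flash, not before.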

