[coreboot] GeodeLX RAM initialisation issue

Nathan Williams nathan at traverse.com.au
Tue Dec 1 00:17:47 CET 2009


Marc Jones wrote:
> On Fri, Nov 27, 2009 at 2:05 AM, Nathan Williams <nathan at traverse.com.au> wrote:
>> Nathan Williams wrote:
>>> Marc Jones wrote:
>>>> On Tue, Nov 24, 2009 at 1:09 AM, Nathan Williams <nathan at traverse.com.au> wrote:
>>>>> Marc Jones wrote:
>>>>>> On Mon, Nov 23, 2009 at 12:27 AM, Nathan Williams
>>>>>> <nathan at traverse.com.au> wrote:
>>>>>>> I managed to get the commercial BIOS to boot on my board and diffed it with coreboot:
>>>>>>>
>>>>>>> http://coreboot.pastebin.com/m39b22c21
>>>>>>>
>>>>>>> The only differences I can see are related to interrupts, which shouldn't matter in relation to
>>>>>>> my RAM problems.
>>>>>>>
>>>>>>> I have also run a memtest86 with the commercial BIOS (from bootable CDROM) and as a payload in coreboot.
>>>>>>> The commercial BIOS didn't have any errors, but my coreboot did.  So the hardware can't be too bad.
>>>>>> That looks like just the southbridge cs5536 target. The memory
>>>>>> differences would be in the processor geodelx target. Can you send
>>>>>> those results?
>>>>>>
>>>>>> Marc
>>>>>>
>>>>> I did some new MSR dumps.
>>>>>
>>>>> Diff:
>>>>> ./msrtool -t geodelx -t cs5536 -d amd_ref_bios
>>>>> http://coreboot.pastebin.com/m5e487f87
>>>>>
>>>>> AMD NAS reference BIOS:
>>>>> ./msrtool -t geodelx -t cs5536 -l -s amd_ref_bios
>>>>> http://coreboot.pastebin.com/madc04ac
>>>>>
>>>>> My Coreboot:
>>>>> ./msrtool -t geodelx -t cs5536 -l -s nathan_bios
>>>>> http://coreboot.pastebin.com/m7f35d855
>>>>>
>>>>>
>>>>> The diffs I did today show some differences with GLCP_DELAY_CONTROLS.
>>>>> Last time I added some code to force it to match the commercial BIOS
>>>>> GLCP_DELAY_CONTROLS MSR, but it didn't seem to make any difference.
>>>>>
>>>>> I also tested all the SODIMMS I have here (about 10) with the commercial BIOS.
>>>>> Each time I did a msrtool diff to one I saved on disk.
>>>>>
>>>>> Most are 333MHz, but 2 are 400MHz.  There weren't any changes to the MSRs.
>>>>>
>>>>> Could there be an issue with the initialisation sequence that reading MSRs
>>>>> after booting won't show?  Also, quite a few MSRs aren't defined in geodelx.c yet.
>>>>> Are there any obvious ones that should be added in?
>>>>>
>>>> --- AMD NAS reference BIOS
>>>> +++ Nathan's coreboot v3
>>>> #
>>>> # GLCP_DELAY_CONTROLS
>>>> #
>>>> -0x4c00000f 0x83f1_00aa_5696_0404
>>>> +0x4c00000f 0x8271_005a_5696_0404
>>>>
>>>> It looks like coreboot and the ref BIOS detect different DIMM
>>>> configurations. This timing setup could be part of the instability (I
>>>> don't think it explains the reset problem). Look at the code here:
>>>> SetDelayControl(void) and anywhere else that GLCP_DELAY_CONTROLS gets
>>>> set to see what might be happening. Make sure that MTest is disabled
>>>> in the ref BIOS setup. This setting is based on the number of devices
>>>> (load) there are on the DIMM.
>>>>
>>>> I didn't realize that so few registers were in the msr tool for
>>>> geodelx. You should add these:
>>>> 20000018h R/W Refresh and SDRAM Program (MC_CF07_DATA) 10071007_00000040h Page 227
>>>> 20000019h R/W Timing and Mode Program (MC_CF8F_DATA) 18000008_287337A3h Page 229
>>>> 2000001Ah R/W Feature Enables (MC_CF1017_DATA) 00000000_11080001h Page 231
>>>> 2000001Bh RO Performance Counters (MC_CFPERF_CNT1) 00000000_00000000h Page 232
>>>> 2000001Ch R/W Counter and CAS Control (MC_PERCNT2) 00000000_00FF00FFh Page 233
>>>> 2000001Dh R/W Clocking and Debug (MC_CFCLK_DBUG) 00000000_00001300h Page 233
>>>>
>>>> 4C00000Fh R/W GLCP I/O Delay Controls (GLCP_DELAY_CONTROLS) 00000000_00000000h Page 549
>>>> 4C000014h R/W GLCP System Reset and PLL Control (GLCP_SYS_RSTPLL) Bootstrap specific Page 554
>>>>
>>>> Marc
>>>>
>>> I've now added the MSRs and uploaded to pastebin:
>>>
>>> AMD NAS:
>>> http://coreboot.pastebin.com/m53aed60b
>>>
>>> My coreboot:
>>> http://coreboot.pastebin.com/md23bc6a
>>>
>>> ./msrtool -d AMD_NAS:
>>> http://coreboot.pastebin.com/m77663de5
>>>
>>> Tomorrow I'll try the tests on the NAS hardware, instead of our own motherboards
>>> just in case there are some hidden hardware issues.
>>>
>>> Regards,
>>> Nathan
>>>
>> On the NAS reference board I got the following diff between coreboot
>> and the commercial BIOS:
>>
>> http://coreboot.pastebin.com/m1353db1a
>>
>> As you can see there are a lot of latency differences.
>> Unfortunately it was only later that I realised that the differences are
>> because the bootstraps are set to bypass, which means coreboot uses 266 as
>> the speed, whereas the commercial BIOS uses 333.  So when I repeat the same
>> on our boards, the only difference in the geodelx MSRs is:
>>
>> # MC_CFCLK_DBUG
>> -0x2000001d 0x0000000000000000
>> +0x2000001d 0x0000000000001000
>> #    12 TRISTATE_DIS TRI-STATE Disable
>> -0: Tri-stating enabled
>> +1: Tri-stating disabled
> 
> 
> Nathan,
> 
> I don't think the tri-state disable bit explains the problems you have
> seen. Since the memory has the same settings, the problem must be
> somewhere else. You will need to go back to the reboot path to
> investigate. It seems like something in the reset isn't doing a
> complete reset, which causes a problem with the cache disable.
> 
> Marc
> 
> 

I am suspicious that the reset problem only occurs when I'm using a laptop hard drive
off the 44pin IDE connector on our board.  I have tried booting with a 3.5" drive
and external 12V, but I can't replicate the problem.  With the 3.5" drive, a reboot from
fsck works fine.  Hopefully the next PCB revision should perform better because we've
moved the 5V plane further away from the DDR tracks.

I don't know if I mentioned another problem that has similar symptoms.  Some RAM causes
the same cache disable problem, even if there are no IDE devices connected.  This happens
from power-up, so it's not a reset issue.

Nathan
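
As a rough illustration of the MSR comparisons discussed in this thread, the
Python sketch below XORs two 64-bit MSR values and lists the bit positions
that differ. The helper names parse_msr and differing_bits are hypothetical
and are not part of msrtool; the input values are the ones quoted in the
thread. Mapping the differing bits back to named fields still requires the
Geode LX data book.

```python
def parse_msr(text):
    """Parse a 64-bit MSR value written like '0x83f1_00aa_5696_0404'."""
    # Underscores and stray spaces are only visual separators in the dumps.
    return int(text.replace("_", "").replace(" ", ""), 16)


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


# GLCP_DELAY_CONTROLS (MSR 0x4c00000f): reference BIOS vs. coreboot
ref_delay = parse_msr("0x83f1_00aa_5696_0404")
cb_delay = parse_msr("0x8271_005a_5696_0404")
print("GLCP_DELAY_CONTROLS differs at bits:", differing_bits(ref_delay, cb_delay))

# MC_CFCLK_DBUG (MSR 0x2000001d): the two values from the later diff
dbug_a = parse_msr("0x0000000000000000")
dbug_b = parse_msr("0x0000000000001000")
print("MC_CFCLK_DBUG differs at bits:", differing_bits(dbug_a, dbug_b))
```

For the MC_CFCLK_DBUG pair this reports only bit 12, which is the
TRISTATE_DIS bit shown in the diff. For GLCP_DELAY_CONTROLS the differences
are confined to the upper 32 bits; the lower half (5696_0404) is identical.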



