[coreboot] [help] A K8/RS780/SB710 board MCE error

Marc Jones marcj303 at gmail.com
Wed Sep 15 18:27:22 CEST 2010


Hi Liu,


On Mon, Sep 13, 2010 at 8:49 PM, Liu Tao <liutao1980 at gmail.com> wrote:
> Hello everyone,
>
> I'm porting coreboot v4 to a K8/RS780/SB710-based mainboard, using the
> amd/mahogany and amd/tilapia_fam10 code as a reference. coreboot now
> boots the board and FILO loads Linux, but the kernel panics with an MCE
> during boot. I don't know much about MCEs, so any suggestions would be
> appreciated. Thanks very much.
>
> The mainboard architecture:
> CPU: socket F Opteron 2210 EE get_cpu_rev EAX=0x40f13 (1 cpu, dual core)
> DIMM: DDR2 333M (x1 / x2)
> HT Link0: off
> HT Link1: RS780->SB710
> HT Link2: off
> VGA off
> GFX off
> PCIE off
>
> coreboot code  revision: modified on r5692
>
> The MCE/panic message:
>
> HARDWARE ERROR
> CPU 0: Machine Check Exception:                4 Bank 0: f658a00000000833
> TSC 572507f34 ADDR 6000
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> Kernel panic - not syncing: Machine check
> ------------[ cut here ]------------
> WARNING: at kernel/smp.c:331 smp_call_function_mask+0x32/0x1ec()
> Modules linked in:
> Supported: Yes
> Pid: 1, comm: swapper Tainted: G   M      2.6.27.19-5-default #1
>
> Call Trace:
>  [<ffffffff8020d9f9>] show_trace_log_lvl+0x41/0x58
>  [<ffffffff80496a74>] dump_stack+0x69/0x6f
>  [<ffffffff8023bfba>] warn_on_slowpath+0x51/0x77
>  [<ffffffff8025b1c5>] smp_call_function_mask+0x32/0x1ec
>  [<ffffffff8025b3a8>] smp_call_function+0x29/0x2e
>  [<ffffffff8021a04a>] native_smp_send_stop+0x1a/0x26
>  [<ffffffff80496b36>] panic+0xbc/0x169
>  [<ffffffff80216366>] mce_log+0x0/0x7e
>  [<ffffffff80216740>] do_machine_check+0x31e/0x3cd
>  [<ffffffff8020d27f>] machine_check+0x7f/0x90
>  [<ffffffff802126c8>] setup_trampoline+0x20/0x30
>  [<ffffffff804919a5>] native_cpu_up+0x31e/0xc64
>  [<ffffffff80493d17>] _cpu_up+0x9a/0x11c
>  [<ffffffff80493df4>] cpu_up+0x5b/0x6f
>  [<ffffffff8095b708>] kernel_init+0xe1/0x1eb
>  [<ffffffff8020cf49>] child_rip+0xa/0x11
>
> ---[ end trace 4eaa2a86a8e2da22 ]---
>
> mcelog --k8 --ascii
>
> HARDWARE ERROR
> CPU 0: Machine Check Exception:                4 Bank 0: f658a00000000833
> TSC 572507f34 ADDR 6000
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> HARDWARE ERROR
> CPU 0 0 data cache TSC 572507f34
>  Data cache ECC error (syndrome b1)
>       bit45 = uncorrected ecc error
>       bit57 = processor context corrupt
>       bit61 = error uncorrected
>       bit62 = error overflow (multiple errors)
>  bus error 'local node origin, request didn't time out
>      data read mem transaction
>      memory access, level generic'
> STATUS f658a00000000833 MCGSTATUS 4
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
>
> Attached is the detailed boot message.
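
For reference, the status word in the report (f658a00000000833) can be
decoded by hand to cross-check the mcelog output. The following is a
minimal Python sketch, not mcelog's own code. The bit positions are
assumptions taken from the usual K8 MCi_STATUS layout and should be
checked against the BKDG; they do agree with the bits mcelog names
above (45, 57, 61, 62) and with the reported syndrome b1. The function
names and label strings are made up for illustration.

```python
# Sketch: decode a K8 machine-check status word by hand.
# Assumed layout (verify against the AMD BKDG for your CPU):
#   bits 63..57: VAL, OVER, UC, EN, MISCV, ADDRV, PCC
#   bits 54..47: ECC syndrome, bit 46: CECC, bit 45: UECC
#   bits 15..0 : MCA error code
STATUS_FLAGS = {
    63: "valid",
    62: "overflow",
    61: "uncorrected",
    60: "enabled",
    59: "misc_valid",
    58: "addr_valid",
    57: "context_corrupt",
    46: "corrected_ecc",
    45: "uncorrected_ecc",
}

# Field encodings of a bus error code, binary 0000 1PPT RRRR IILL.
PARTICIPATION = ["local node origin", "local node response",
                 "local node observed", "generic"]
REQUEST = ["generic", "generic read", "generic write", "data read",
           "data write", "instruction fetch", "prefetch", "evict", "snoop"]
ADDRESS_SPACE = ["memory access", "reserved", "I/O access", "generic"]
LEVEL = ["level 0", "level 1", "level 2", "level generic"]


def decode_status(status):
    """Split a 64-bit status word into flags, syndrome and error code."""
    flags = sorted(name for bit, name in STATUS_FLAGS.items()
                   if (status >> bit) & 1)
    return {
        "flags": flags,
        "syndrome": (status >> 47) & 0xFF,
        "error_code": status & 0xFFFF,
    }


def decode_bus_error(code):
    """Decode a 16-bit error code if it has the bus error form, else None."""
    if code & 0xF800 != 0x0800:
        return None
    rrrr = (code >> 4) & 0xF
    return {
        "participation": PARTICIPATION[(code >> 9) & 0x3],
        "timed_out": bool((code >> 8) & 1),
        "request": REQUEST[rrrr] if rrrr < len(REQUEST) else "reserved",
        "address_space": ADDRESS_SPACE[(code >> 2) & 0x3],
        "level": LEVEL[code & 0x3],
    }


if __name__ == "__main__":
    decoded = decode_status(0xF658A00000000833)
    print(decoded["flags"], hex(decoded["syndrome"]))
    print(decode_bus_error(decoded["error_code"]))
```

Under these assumptions the low 16 bits (0833) decode to the same bus
error fields mcelog prints: local node origin, no timeout, data read,
memory access, generic level.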

I haven't worked with K8 in a while, but this looks like it could be a
real CPU problem. Do you have another CPU to test with? The other
possibility is that an erratum workaround for your CPU revision is
missing. You could review the AMD K8 revision guide for cache and
MCA/MCE issues. Please let us know what you find.
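
To find the relevant entries in the revision guide you need the
family, model and stepping of the part, which are encoded in the
CPUID value quoted above (EAX=0x40f13). A minimal Python sketch,
assuming the standard CPUID function 1 EAX field layout;
decode_cpuid_eax is a hypothetical helper name, and mapping the result
to a named silicon revision is left to the tables in the revision
guide.

```python
def decode_cpuid_eax(eax):
    """Return (family, model, stepping) from CPUID function 1 EAX."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    family = base_family
    model = base_model
    # The extended fields only apply when the base family is 0xF.
    if base_family == 0xF:
        family += ext_family
        model |= ext_model << 4
    return family, model, stepping


if __name__ == "__main__":
    family, model, stepping = decode_cpuid_eax(0x40F13)
    print(f"family {family:#x}, model {model:#x}, stepping {stepping:#x}")
```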

Marc

-- 
http://se-eng.com



