[coreboot] [help] A K8/RS780/SB710 board MCE error
Marc Jones
marcj303 at gmail.com
Wed Sep 15 18:27:22 CEST 2010
Hi Liu,
On Mon, Sep 13, 2010 at 8:49 PM, Liu Tao <liutao1980 at gmail.com> wrote:
> Hello everyone,
>
> I'm porting coreboot v4 to a K8/RS780/SB710-based mainboard, using the
> amd/mahogany and amd/tilapia_fam10 code as a reference. coreboot now
> boots the board and FILO loads Linux, but the kernel crashes with an MCE
> during boot. I don't know much about MCEs, so any suggestions would be
> appreciated. Thanks very much.
>
> The mainboard architecture:
> CPU: socket F Opteron 2210 EE get_cpu_rev EAX=0x40f13 (1 cpu, dual core)
> DIMM: DDR2 333M (x1 / x2)
> HT Link0: off
> HT Link1: RS780->SB710
> HT Link2: off
> VGA off
> GFX off
> PCIE off
>
> coreboot code revision: modified on r5692
>
> The MCE/panic message:
>
> HARDWARE ERROR
> CPU 0: Machine Check Exception: 4 Bank 0: f658a00000000833
> TSC 572507f34 ADDR 6000
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> Kernel panic - not syncing: Machine check
> ------------[ cut here ]------------
> WARNING: at kernel/smp.c:331 smp_call_function_mask+0x32/0x1ec()
> Modules linked in:
> Supported: Yes
> Pid: 1, comm: swapper Tainted: G M 2.6.27.19-5-default #1
>
> Call Trace:
> [<ffffffff8020d9f9>] show_trace_log_lvl+0x41/0x58
> [<ffffffff80496a74>] dump_stack+0x69/0x6f
> [<ffffffff8023bfba>] warn_on_slowpath+0x51/0x77
> [<ffffffff8025b1c5>] smp_call_function_mask+0x32/0x1ec
> [<ffffffff8025b3a8>] smp_call_function+0x29/0x2e
> [<ffffffff8021a04a>] native_smp_send_stop+0x1a/0x26
> [<ffffffff80496b36>] panic+0xbc/0x169
> [<ffffffff80216366>] mce_log+0x0/0x7e
> [<ffffffff80216740>] do_machine_check+0x31e/0x3cd
> [<ffffffff8020d27f>] machine_check+0x7f/0x90
> [<ffffffff802126c8>] setup_trampoline+0x20/0x30
> [<ffffffff804919a5>] native_cpu_up+0x31e/0xc64
> [<ffffffff80493d17>] _cpu_up+0x9a/0x11c
> [<ffffffff80493df4>] cpu_up+0x5b/0x6f
> [<ffffffff8095b708>] kernel_init+0xe1/0x1eb
> [<ffffffff8020cf49>] child_rip+0xa/0x11
>
> ---[ end trace 4eaa2a86a8e2da22 ]---
>
> mcelog --k8 --ascii
>
> HARDWARE ERROR
> CPU 0: Machine Check Exception: 4 Bank 0: f658a00000000833
> TSC 572507f34 ADDR 6000
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> HARDWARE ERROR
> CPU 0 0 data cache TSC 572507f34
> Data cache ECC error (syndrome b1)
> bit45 = uncorrected ecc error
> bit57 = processor context corrupt
> bit61 = error uncorrected
> bit62 = error overflow (multiple errors)
> bus error 'local node origin, request didn't time out
> data read mem transaction
> memory access, level generic'
> STATUS f658a00000000833 MCGSTATUS 4
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
>
> Attached is the detailed boot message.
I haven't worked with K8 in a while, but this looks like it could be a
real CPU problem. Do you have another CPU to test with? The other
possibility is that an erratum workaround for your CPU revision is
missing. You could review the AMD K8 revision guide for cache and
MCA/MCE issues. Please let us know what you find.
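
If it helps when matching the error against the revision guide, the
STATUS value can also be decoded by hand. The following is only a rough
sketch: the helper names (bit, decode_mci_status) are made up for
illustration. Bits 45, 57, 61 and 62 are the ones mcelog names above; the
remaining bit positions, the syndrome field at bits 54:47, and the bus
error code layout are my assumptions about the K8 MCi_STATUS register
and should be checked against the BKDG for your CPU revision.

```python
# Sketch: split an MCi_STATUS value into the fields mcelog printed above.
# Bit positions other than 45/57/61/62 are assumptions; verify against the BKDG.

def bit(value, pos):
    """Return the single bit of value at position pos."""
    return (value >> pos) & 1

def decode_mci_status(status):
    """Decode a 64-bit MCi_STATUS value into a dict of fields (hypothetical helper)."""
    code = status & 0xFFFF  # low 16 bits: MCA error code
    fields = {
        "valid": bit(status, 63),
        "overflow": bit(status, 62),         # mcelog: error overflow
        "uncorrected": bit(status, 61),      # mcelog: error uncorrected
        "enabled": bit(status, 60),
        "addr_valid": bit(status, 58),
        "context_corrupt": bit(status, 57),  # mcelog: processor context corrupt
        "ecc_syndrome": (status >> 47) & 0xFF,
        "uncorrected_ecc": bit(status, 45),  # mcelog: uncorrected ecc error
        "error_code": code,
    }
    # Assumed bus error code format: 0000 1PPT RRRR IILL
    if (code & 0xF800) == 0x0800:
        fields["bus_error"] = {
            "participation": (code >> 9) & 0x3,
            "timeout": (code >> 8) & 0x1,
            "request": (code >> 4) & 0xF,
            "io_mem": (code >> 2) & 0x3,
            "level": code & 0x3,
        }
    return fields

if __name__ == "__main__":
    # STATUS value taken from the MCE report quoted above.
    for name, value in decode_mci_status(0xF658A00000000833).items():
        print(name, value)
```

With the STATUS from your log, the syndrome field from this sketch comes
out as 0xb1, which agrees with the "syndrome b1" line in the mcelog
output.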
Marc
--
http://se-eng.com