Difference between revisions of "Board:asus/kgpe-d16"
Latest revision as of 16:54, 10 May 2018
The ASUS KGPE-D16 is an AMD Family 10h / 15h, dual-CPU server and workstation motherboard released in 2009. It is well supported and stable under coreboot, with all CPUs, RAM, and peripherals functioning normally. Family 10h (61xx) processors do not currently support the isochronous mode required to enable the IOMMU, but Family 15h (62xx, 63xx) processors work well with the IOMMU enabled.
Both the KGPE-D16 and the KCMA-D8 support CPUs fast enough to play new games at high settings in a VM with a suitable video card.
While the D8 and D16 are owner controlled, have libre firmware and are relatively free as in freedom, Taiidan recommends purchasing the significantly faster OpenPOWER9 TALOS 2, which has an even higher level of security, freedom and performance. OpenPOWER is the only owner-controlled performance CPU architecture on the market now that both Intel and AMD have black box supervisor processors and hardware-enforced code signing for external flashing on their new hardware. If not enough people purchase them, there will be no TALOS 3, which would mean the end of libre hardware that is fast enough to compile modern libre software.
This board is automatically tested by Raptor Engineering's test stand. For more details please visit AutoTest/RaptorEngineering.
A basic system diagram is available in the official manual, Appendix A.1 and has been confirmed to match the hardware shipping from ASUS. Not indicated are the PCIe lane widths for the gigabit network controller, which are both x1. All legacy PCI devices share the same bus, and partially due to this design the SP5100 has severe issues with bridging high-bandwidth PCI peripherals. As such, an external PCI-PCIe bridge is recommended should you need to interface a high bandwidth legacy PCI device to this system; ASMedia controllers have been verified to function correctly.
Northbridge functions are distributed between the CPU internal northbridge and the SR5690 northbridge, which is effectively a HyperTransport to ALink/PCIe translator and switch. There is a separate SP5100 southbridge device, adjacent to the northbridge and residing under the smaller heatsink of the two. This device provides all traditional southbridge services including the LPC bridge and SATA controllers. All southbridge-destined messages, including CPU-originated power state control messages over HyperTransport, pass through the CPU northbridge and are routed to the southbridge via the SR5690 northbridge device.
Incidentally, this design places the IOMMU, which is part of the SR5690, in the correct location to properly shield the main CPU from all unauthorized traffic. If the southbridge connected directly to a HyperTransport link there would be no way to prevent unauthorized DMA from legacy PCI devices connected to the southbridge, or even from the southbridge's embedded microprocessor.
- You MUST use at least one real 8-pin EPS12V cable for one CPU, and two real 8-pin EPS12V cables for two CPUs (adapters may catch fire!)
- coreboot must be flashed externally when migrating from the proprietary BIOS. After coreboot has been flashed and booted at least once, flashrom can safely reprogram the ROM under Linux.
- When migrating from the proprietary BIOS, after flashing coreboot the CMOS memory must be cleared. Failing to clear the CMOS will typically result in odd hangs during the boot process.
- Enabling the serial console or EHCI debug console will drastically increase the time needed to boot.
- Having a serial console log level above 2 will drastically increase the time required for booting.
- The proprietary BMC module must be removed for coreboot to function - or flashed with the OpenBMC port
- All CPUs are split into two NUMA nodes, as each is two 2/4/6/8-core CPUs in one package, and memory is divided between the NUMA nodes (for example, one 16-core 6282SE with 32GB RAM has 2 nodes with 16GB per node). Not properly aligning RAM to the NUMA nodes will result in drastically decreased performance, so make sure you do!
- Turbo 2 and power saving seem to require a tickless system to function (nohz=on in the kernel cmdline); otherwise the extra cores are always woken up and will never enter CC6.
- Check your used CPU for damage to the pins and bottom components; a physically damaged CPU may still be sold as working, but it might not work in a dual socket configuration.
- The 63xx "Piledriver" series processors absolutely require microcode updates for safe operation due to the 2016 gain-root-via-NMI exploit that affects various versions of the burned-in microcode. An update is also required to enable the IOMMU due to an erratum.
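The tickless note above can be checked with a minimal sketch like the following, which only looks for `nohz=on` as a whole token on the kernel command line. The helper names `has_kernel_param` and `read_cmdline` are hypothetical; `/proc/cmdline` is the standard Linux location of the booted command line. The check says nothing about the kernel's build-time configuration.

```python
def has_kernel_param(cmdline: str, key: str, value: str) -> bool:
    """Return True if `key=value` appears as a whole token in a kernel command line."""
    # Kernel parameters are whitespace-separated, so compare whole tokens
    # rather than substrings (otherwise "xnohz=on" would match).
    return f"{key}={value}" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read().strip()
```

On a running system, `has_kernel_param(read_cmdline(), "nohz", "on")` reports whether the option was passed at boot.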
coreboot does not do fan control, so here are your options:
OpenBMC is the best choice for this, as you will have fan control no matter what the main operating system is doing.
- Install the OpenBMC port beta to the ASMB4-iKVM or ASMB5-iKVM modules that come with the main KGPE-D16 retail SKU, this provides fan control and a variety of other cool remote management features. The default configuration is 3 pin case fans and 4 pin PWM fans for the CPU fans as this is the only way to provide separate fan control zones due to ASUS not wiring up the rest of the SuperIO fan channels.
- Use fancontrol/pwmconfig to control your fans via Linux.
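To illustrate what software fan control under Linux does in principle, here is a minimal sketch of a linear temperature-to-PWM curve. It is not the actual fancontrol algorithm, and the function name and all thresholds (40 °C, 70 °C, PWM 80) are hypothetical values chosen for the example; the only fixed convention assumed is the 0-255 PWM range used by the Linux hwmon interface.

```python
def pwm_for_temp(temp_c: float,
                 min_temp: float = 40.0,   # hypothetical: below this, run at min_pwm
                 max_temp: float = 70.0,   # hypothetical: above this, run at max_pwm
                 min_pwm: int = 80,        # hypothetical lowest duty value
                 max_pwm: int = 255) -> int:
    """Map a temperature to a PWM duty value (0-255) with a linear ramp."""
    if temp_c <= min_temp:
        return min_pwm
    if temp_c >= max_temp:
        return max_pwm
    # Fraction of the way from min_temp to max_temp, in the range (0, 1).
    frac = (temp_c - min_temp) / (max_temp - min_temp)
    # Truncate towards zero so the result is always an integer duty value.
    return int(min_pwm + frac * (max_pwm - min_pwm))
```

A real setup would read temperatures from and write duty values to the hwmon sysfs files that pwmconfig detects, which is why pwmconfig should be run first.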
|Max RAM||192 GB||128GB normal or 192GB via special configuration - See below|
|PCI-e slots||4||5 physical, 4 concurrent|
|PCI slots||1||Via PCI Bridge that also connects onboard AST graphics chip|
|Other Expansion Slots||1 PIKE||ASUS Proprietary I/O Expansion Slot, Insert PIKE RAID card for second half of the motherboard SATA/SAS ports - half of the "PIKE" connector is simply a reversed PCI-e x4 slot.|
|EEPROM Type||DIP 8 SPI Socket|
|Factory EEPROM Size||2MB|
|Max EEPROM Size||???|
|TPM||YES||Compatibility with TPM modules that fit ASUS TPM header. (Connected over LPC.) Note these modules are proprietary. Owner controlled CRTM|
|Crossfire XDMA||???||Has ACS and dual PCI-e 2.0 x16 slots, so it should work (reported working on the vendor BIOS; a tester is needed for coreboot)|
|Blob Free Operations||YES|
|Native GFX Init||Partial||Text Mode Only||Now features proper EDID parsing.|
|BMC||Open Source||OpenBMC - open source remote management - Available for KGPE-D16 and KCMA-D8 boards via installation to the ASUS ASMB4-iKVM or ASMB5-iKVM module.|
|IOMMU||YES||v1.26 with Interrupt Remapping|
|IOMMU for Graphics||YES||Near-Native 3D gaming graphics performance with proper software configuration|
|SR-IOV||???||coreboot doesn't support SR-IOV|
|PCI-e ARI||???||Required for more than 8 SR-IOV VFs per device. AMD's docs say the chipset supports it; however, there are no firmware implementations that feature it.|
OpenBMC - Open Source Remote Management
Raptor Engineering is working on porting OpenBMC to the KGPE-D16 and KCMA-D8 under a crowdfunded contract. It should be done in a few months, and there is currently a beta available.
At the moment you require the ASUS ASMB4-iKVM or ASMB5-iKVM module to use it. Most KGPE-D16 retail SKUs should come with this; otherwise it is generally $30-60 used/new.
The following RAM models and configurations have been tested by either Raptor Engineering or a third party and are known to work as of the stated Git revision.
|Manufacturer||Model||Max working RAM / CPU||Size||Speed||Type||ECC||Populated Slots||CPU||Mainboard Type||Firmware|
|Micron||36KSF2G72PZ-1G4E1 (N/A)||16GB||DDR3-1333||Registered||Yes||A2 / C2||Opteron 6378||Coreboot 2268e0d or later|
|Micron||MT36KSF1G72PZ-1G6M1FF||32GB||8GB||DDR3-1600||Registered||Yes||All orange slots||Opteron 6262HE||1.03G||Internal development version of coreboot|
|Micron / HP||MT36JSF2G72PZ-1G6E1LG (HP: 672612-081)||32GB||16GB||DDR3-1600||Registered||Yes||A2 / C2 / E2 / G2||Opteron 6276||1.03G||Libreboot 20160907|
|Hynix/Hyundai||HMT151R7BFR4C-H9||16GB||4GB||DDR3-1333||Registered||Yes||A2 / C2 / E2 / G2; A2 / B2 / C2 / D2||Opteron 6276||1.03G||Libreboot 20160907|
|Kingston||9965525-055.A00LF||8GB||DDR3-1600||Unbuffered||Yes||A2 / C2 / E2 / F2||Opteron 6328||Coreboot 9fba481|
|Kingston||KVR16R11D4/16 (9965516-483.A00LF)||64GB||16GB||DDR3-1600||Registered||Yes||All orange slots (128GB)||Opteron 6278/6262HE||Libreboot 20160907|
|Kingston||KVR16R11D4K4/64I (9965516-477.A00LF)||64GB||16GB||DDR3-1600||Registered||Yes||All orange slots (128GB)||Opteron 6278/6262HE/6284SE||Libreboot 20160907|
|crucial ("crucial by Micron")||CT16G3ERSLD4160B (MT36KSF2G72PZ-1G6P1NE)||64GB||16GB||DDR3-1600||Registered||Yes||All orange slots (128GB)||Opteron 6278/6282SE/6284SE/6287SE||1.03G, 1.04||Libreboot 20160907|
|Micron||MT36KSF2G72PZ-1G6E1FE||64GB||16GB||DDR3-1600||Registered||Yes||All orange slots||Opteron 6378||1.04||Internal development version of coreboot (2017)|
|Micron||MT36KSF2G72PZ-1G6N1KG||64GB||16GB||DDR3-1600||Registered||Yes||All orange slots||Opteron 6378||1.04||Internal development version of coreboot (2017)|
|crucial ("crucial by Micron")||CT16G3ERSLD4160B (MT36KSF2G72PZ-1G6P1NE)||192GB||16GB||DDR3-1600||Registered||Yes||Leave H1, H2, G1, G2 empty (see page 2-16 in the ASUS manual), LVDDR3_SEL1 can be set to "Force 1.35V"||Opteron 6278/6282SE/6284SE/6287SE||1.03G, 1.04||coreboot d6735b0|
In addition to the 1 or 2 main CPUs, there are no fewer than three known secondary processors present on the mainboard. All are disabled when running under coreboot.
- There is a very poorly documented microprocessor inside the SR5690; purpose and type unknown. It is believed this processor requires a firmware upload from the main platform firmware or via JTAG in order to start execution.
- A single 8051 processor core is present inside the SB700 southbridge. It normally handles errata related to power states and may also be responsible for the blinking power LED in S3 suspend under the proprietary BIOS. It is believed accesses made by this processor are responsible for the flashrom write failure when the board is booted from the proprietary BIOS. This processor also requires a firmware upload from the main platform firmware or via JTAG in order to start execution.
- The BMC has an integrated ARM core. This is disabled by pin strap when the BMC firmware module is not installed.
Some processors may be present on or activated by add-on modules:
- The optional PIKE add-on cards use ARM cores to handle the SAS protocol, though this firmware is directly loaded from a Flash chip on the module and does not involve any non-local components (e.g. the main CPU never touches the firmware on these modules outside of a manual reflash operation). Raptor Engineering is currently unaware of any SAS controllers that operate without a secondary processor or use libre firmware; the protocol is simply too complex to handle via a mask ROM, and as there are only one or two suppliers of SAS controllers there is very little incentive to release the source code to the firmware. Writing a libre firmware to replace the existing firmware may technically be possible, however it is extremely unlikely this will ever happen due to the man-decades required.
- Installing an ASUS iKVM firmware module will activate the ARM core in the BMC, which has full system access to all peripherals and possibly memory. It is not recommended to use this module as the firmware is both highly privileged and proprietary, and is known to contain at least one critical security bug.
EHCI debug console
The EHCI debug console causes severe USB problems under both Libreboot and coreboot. This typically manifests as very slow boot / slow typing on USB keyboards. This issue appears to extend to the KCMA-D8 and KFSN4-DRE boards as well.
MMIO Resources Limit
The coreboot 32-bit MMIO space limits the use of large numbers of PCI-e devices, such as more than a few network interfaces or graphics cards. The limit is reached sooner with older multi-port NICs that have a switched design (e.g. 82576) than with the newer style native multi-port PCI-e setup (e.g. i350).
This is the reason for the "Not enough MMIO resources for SR-IOV" error when you attempt to enable SR-IOV on a system with both a quad port NIC and the onboard interfaces.
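The following sketch illustrates why adding devices can exhaust a fixed MMIO window. It assumes only the general PCI rule that BAR sizes are powers of two and each BAR is aligned to its own size. The function names, the BAR sizes and the window sizes are all hypothetical; they are not measurements from this board or values used by coreboot.

```python
MIB = 1 << 20


def mmio_required(bar_sizes):
    """Bytes needed to place all BARs, largest first, each aligned to its own size."""
    offset = 0
    for size in sorted(bar_sizes, reverse=True):
        if size <= 0 or size & (size - 1):
            raise ValueError("PCI BAR sizes must be positive powers of two")
        # Round the current offset up to the next multiple of the BAR size.
        offset = (offset + size - 1) // size * size
        offset += size
    return offset


def fits(bar_sizes, window_bytes):
    """True if all BARs fit into an MMIO window of the given size."""
    return mmio_required(bar_sizes) <= window_bytes


# Hypothetical example: one graphics card BAR plus two NIC BARs.
example_bars = [256 * MIB, 16 * MIB, 16 * MIB]
```

With these made-up numbers, `fits(example_bars, 512 * MIB)` is true while `fits(example_bars, 256 * MIB)` is false; every additional device or SR-IOV virtual function adds further BARs to the list.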
- Certain models and populations of DIMMs do not function under either coreboot or the proprietary BIOS. These failures may also be contingent on the exact PCB revision and / or CPU model installed. For a list of known failing combinations please visit KGPE-D16 Known Bad Configurations.
> 192 GB of RAM not working
The KGPE-D16 doesn't work with more than 192 GB RAM (reported by ThomasUmbach); supporting more would need further work by coreboot developers. To use 192 GB RAM it is necessary to leave unpopulated either the two DIMM slots next to the CPUs (in this case RAM training works well, but the system will be unstable) or the 4 closest slots on CPU1 (system stable). For more info see the RAM HCL on this page.
The 4 total PCI-e slots may be limiting, but as the board has PCI-e ACS you may be able to use an external PCI-e expansion system that supports ACS. You would still have IOMMU security and performance: with ACS support, the devices beyond the external switch are placed in separate IOMMU groups, so you maintain security and do not have to use the unsafe attachment override when attaching devices to virtual machines.
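The arithmetic behind the 192 GB figure can be sketched as follows. The slot names A1 to H2 (16 slots) are an assumption inferred from the slot labels used in the RAM HCL, and the empty slots G1, G2, H1, H2 are taken from the 192GB HCL entry; the variable and function names are hypothetical.

```python
# Assumed slot naming: channels A-H with two slots each, 16 slots in total.
ALL_SLOTS = [f"{channel}{index}" for channel in "ABCDEFGH" for index in (1, 2)]

# Slots left empty in the stable 192GB configuration from the RAM HCL.
EMPTY_FOR_192GB = {"G1", "G2", "H1", "H2"}


def total_gb(populated_slots, dimm_gb):
    """Total RAM when every populated slot holds a DIMM of the same size."""
    return len(populated_slots) * dimm_gb


stable_population = [slot for slot in ALL_SLOTS if slot not in EMPTY_FOR_192GB]

full_board_gb = total_gb(ALL_SLOTS, 16)      # all 16 slots with 16GB DIMMs
stable_gb = total_gb(stable_population, 16)  # 12 slots with 16GB DIMMs
```

With 16GB modules, a fully populated board would be 256 GB, which exceeds the working limit, while leaving four slots empty gives 12 x 16 GB = 192 GB.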
NOTE: MMIO space limit dependent.
MCM/NUMA notes - Read if you play video games
NOTE: All G34 CPUs are dual-MCM and thus have two NUMA nodes. If you play video games or need a single task with many threads, the socket C32 KCMA-D8 with a single MCM/NUMA node 4386 might have improved performance, although it is also possible to play games with a dual node CPU without stuttering.
The correct way to do this is to create a VM with properly pinned CPUs, including the iothread/emulator threads, and with all of its RAM on one node: the same node on which the interrupts for assigned devices (graphics, USB, etc.) are processed.
Turbo examples: With a 16-core CPU, to obtain Turbo 2 you would select 2 modules, and thus 4 cores, from each MCM/NUMA node, then allocate all of the hugepages/VM RAM on node 0, where the interrupts are assigned. This provides the best gaming performance with a 16-core CPU. With dual 8-core 6328 CPUs, the best VM gaming performance is gained by using node zero of both CPUs, with the hugepages RAM on the first node (zero) of the first CPU; this obtains 8 cores at 3.8 GHz. You would also need to isolate the CPUs not in use with the isolcpus kernel command line option, and move interrupts away if they somehow migrate to an isolated CPU.
Please contact Taiidan for advice on VM gaming for this board, how to obtain Crossfire xDMA in a VM, etc. With a capable graphics card you should be able to almost max out games circa 2017 at 1080p with one of the faster socket G34 CPUs.
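The 16-core selection described above can be sketched as follows. The CPU numbering (CPUs 0-7 on node 0, CPUs 8-15 on node 1, consecutive pairs forming a module) is an assumption for illustration only; check the real topology of your system before pinning anything. The function names are hypothetical, while `isolcpus=` is the kernel option named in the text.

```python
def pick_cores(node_cores, modules_per_node, cores_per_module=2):
    """Pick the first N modules' worth of CPUs from every NUMA node."""
    wanted = modules_per_node * cores_per_module
    picked = []
    for node, cores in sorted(node_cores.items()):
        if wanted > len(cores):
            raise ValueError(f"node {node} has only {len(cores)} CPUs")
        picked.extend(cores[:wanted])
    return sorted(picked)


def unused_cores(node_cores, picked):
    """CPUs that are not given to the VM and should therefore stay idle."""
    everything = [cpu for cores in node_cores.values() for cpu in cores]
    return sorted(set(everything) - set(picked))


def isolcpus_arg(cpus):
    """Format a CPU list as an isolcpus= kernel command line option."""
    return "isolcpus=" + ",".join(str(cpu) for cpu in sorted(cpus))


# Assumed topology of one 16-core CPU: two NUMA nodes with 8 CPUs each.
topology = {0: list(range(0, 8)), 1: list(range(8, 16))}
vm_cpus = pick_cores(topology, modules_per_node=2)
idle_cpus = unused_cores(topology, vm_cpus)
```

Under the assumed numbering, `vm_cpus` holds the 8 CPUs to pin the VM to, and `isolcpus_arg(idle_cpus)` gives the option that keeps the scheduler off the remaining CPUs. Placing the hugepages on node 0 and moving interrupts are separate steps not covered by this sketch.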
Free Software Foundation's "Respects Your Freedom" (RYF) certification
The Vikings D16 (a relabelled KGPE-D16) board is being sold with coreboot/Libreboot pre-installed. It is the first workstation/server mainboard to be "RYF - Respects Your Freedom" certified by the Free Software Foundation, which happened on March 6th, 2017.
CPUs recommended by users
Microcode updates from Taiidan:
Due to the Spectre exploit, all Opteron CPUs will soon have microcode updates according to AMD.
It is a philosophical issue: all x86_64 CPUs have microcode, but do you trust AMD now or AMD circa 2011-2013, when the G34 CPUs were released?
If the new 63xx microcode has some type of introduced security flaw then why not simply "fix" a bug and add the backdoor to the 62xx series as well?
I believe the improved performance of the 63xx series is good enough to justify the microcode updates - the 6287SE is nearly as fast as a 6386SE but it is quite hard to find.
|Processor sold by AMD||Part Number||Cores||Requires microcode updates for secure operation (ref)||Notes|
|Opteron 6386SE (fast)||OS6386YETGGHK||16||Yes|
|Opteron 6328||OS6328WKT8GHK or OS6328WKT8GHKWOF||8||Yes|
|Opteron 6287SE (2nd fastest)||?||16||No|
An 8-core CPU is not really worth it unless you need the slightly better single-threaded performance more than the second set of cores.