[LinuxBIOS] Remaining MCP55 bits (please comment!)

Lu, Yinghai yinghai.lu at amd.com
Wed Mar 21 17:06:51 CET 2007


Please check the comment below:

-----Original Message-----
From: linuxbios-bounces at linuxbios.org
[mailto:linuxbios-bounces at linuxbios.org] On Behalf Of Stefan Reinauer
Sent: Wednesday, March 21, 2007 5:40 AM
To: Ed Swierk
Cc: Carl-Daniel Hailfinger; Ward Vandewege; LinuxBIOS
Subject: Re: [LinuxBIOS] Remaining MCP55 bits (please comment!)



* Ed Swierk <eswierk at arastra.com> [070319 20:01]:
> Index: src/devices/hypertransport.c
> ===================================================================
> --- src/devices/hypertransport.c.orig
> +++ src/devices/hypertransport.c
> @@ -279,20 +279,7 @@ static void ht_collapse_early_enumeratio
>  		}
>  		/* Has the link failed? */
>  		if (ctrl & (1 << 4)) {
> -			/*
> -			 * Either the link has failed, or we have
> -			 * a CRC error.
> -			 * Sometimes this can happen due to link
> -			 * retrain, so lets knock it down and see
> -			 * if its transient
> -			 */
> -			ctrl |= ((1 << 4) | (1 <<8)); // Link fail + Crc
> -			pci_write_config16(prev.dev, prev.pos +
prev.ctrl_off, ctrl);
> -			ctrl = pci_read_config16(prev.dev, prev.pos +
prev.ctrl_off);
> -			if (ctrl & ((1 << 4) | (1 << 8))) {
> -				printk_alert("Detected error on
Hypertransport Link\n");
> -				return;
> -			}
> +			return;
 
This needs to be dropped. It backs out a very useful error detection and
is there due to Yinghai using a very old tree for the port.

YH: Actually hyptertransport.c is used for linuxbios_ram stage code, so
We already reset that in CAR stage code. So...
 
> @@ -396,25 +385,12 @@ unsigned int hypertransport_scan_chain(s
>  		/* Wait until the link initialization is complete */
>  		do {
>  			ctrl = pci_read_config16(prev.dev, prev.pos +
prev.ctrl_off);
> -
> -			if (ctrl & (1 << 6))
> -				goto end_of_chain;	// End of chain
> -
> -			if (ctrl & ((1 << 4) | (1 << 8))) {
> -				/*
> -				 * Either the link has failed, or we
have
> -				 * a CRC error.
> -				 * Sometimes this can happen due to link
> -				 * retrain, so lets knock it down and
see
> -				 * if its transient
> -				 */
> -				ctrl |= ((1 << 4) | (1 <<8)); // Link
fail + Crc
> -				pci_write_config16(prev.dev, prev.pos +
prev.ctrl_off, ctrl);
> -				ctrl = pci_read_config16(prev.dev,
prev.pos + prev.ctrl_off);
> -				if (ctrl & ((1 << 4) | (1 << 8))) {
> -					printk_alert("Detected error on
Hypertransport Link\n");
> -					goto end_of_chain;
> -				}
> +			/* Is this the end of the hypertransport chain?
> +			 * Has the link failed?
> +			 * If so further scanning is pointless.
> +			 */
> +			if (ctrl & ((1 << 6) | (1 << 4))) {
> +				goto end_of_chain;
>  			}
>  		} while((ctrl & (1 << 5)) == 0);

dito. 
YH: ....
 		
> @@ -449,39 +425,53 @@ unsigned int hypertransport_scan_chain(s
>  		if (flags & 0x1f) {
>  			break;
>  		}
> -
>  		flags &= ~0x1f; /* mask out base Unit ID */
> -                flags |= next_unitid & 0x1f;
> +
> +		count = (flags >> 5) & 0x1f; /* get unit count */
> +		not_use_count = 0;
> +		temp_unitid = next_unitid;
> +#if HT_CHAIN_END_UNITID_BASE < HT_CHAIN_UNITID_BASE


Yinghai: Could you please comment this a bit? What does it do? 
There are so many magic variables doing so much magic stuff.

Why is this ifdefed? Don't we want this on all systems? Why not?

YH: I have sent another patch, after that big tar ball, that one is more
clear and useful for strange HT device.

> Index: src/cpu/x86/mtrr/mtrr.c
> ===================================================================
> --- src/cpu/x86/mtrr/mtrr.c.orig
> +++ src/cpu/x86/mtrr/mtrr.c
> @@ -282,10 +282,16 @@ static void set_fixed_mtrr_resource(void
>  	
>  }
>  
> +#ifndef CONFIG_VAR_MTRR_HOLE
> +#define CONFIG_VAR_MTRR_HOLE 1
> +#endif

This needs some documentation as well. Why would one disable this?

Why would one leave it enabled? There are other HOLE variables as well?
Is this one not the same or directly depending?

YH: some IB or other device want the kernel to change mtrr from uncached
for WC, and some kernel can not handle subtract case. It seems latest
kernel already could change subtract to add schema and change some range
to WC...

> Index: src/cpu/x86/lapic/lapic_cpu_init.c
> ===================================================================
> --- src/cpu/x86/lapic/lapic_cpu_init.c.orig
> +++ src/cpu/x86/lapic/lapic_cpu_init.c
> @@ -13,6 +13,13 @@
>  #include <smp/atomic.h>
>  #include <smp/spinlock.h>
>  #include <cpu/cpu.h>
> +#include <cpu/x86/tsc.h>
> +
> +static inline void print_debug_tsc_x(const char *str, unsigned i,
unsigned val, unsigned val2)
> +{
> +        printk_debug("%s[%02x]=%08x%08x\n", str, i, val, val2);
> +
> +}
>  
>  #if CONFIG_SMP == 1
>  
> @@ -374,6 +381,7 @@ void initialize_cpus(struct bus *cpu_bus
>  {
>  	struct device_path cpu_path;
>  	struct cpu_info *info;
> +	tsc_t tsc;
>  
>  	/* Find the info struct for this cpu */
>  	info = cpu_info();
> @@ -400,6 +408,9 @@ void initialize_cpus(struct bus *cpu_bus
>  	
>          cpus_ready_for_init(); 
>  
> +        tsc = rdtsc();
> +        print_debug_tsc_x("\nFinal              : tsc ", 10,  tsc.hi,
tsc.lo);
> +
>  #if CONFIG_SMP == 1
>  	#if SERIAL_CPU_INIT == 0
>  	/* start all aps at first, so we can init ECC all together */
> @@ -419,6 +430,8 @@ void initialize_cpus(struct bus *cpu_bus
>  	/* Now wait the rest of the cpus stop*/
>  	wait_other_cpus_stop(cpu_bus);
>  #endif
> +        tsc = rdtsc();
> +        print_debug_tsc_x("\nFinal              : tsc ", 11,  tsc.hi,
tsc.lo);
>  
>  }
 

All changes to the whole file are just debugging. And not exactly nice.
I suggest dropping it.
 
> Index: src/northbridge/amd/amdk8/coherent_ht.c
> ===================================================================
> --- src/northbridge/amd/amdk8/coherent_ht.c.orig
> +++ src/northbridge/amd/amdk8/coherent_ht.c
> @@ -1,10 +1,3 @@
> -#if USE_DCACHE_RAM
> -
> -#include "coherent_ht_car.c"
> -
> -#else
> -


Does this mean we only need one version of coherent_ht.c again?
Can we then please also drop coherent_ht_car.c? (!!!!)

YH: depends if you like to use ROMCC with it.

> -#if CONFIG_USE_INIT
> +#if CONFIG_USE_PRINTK_IN_CAR

Is this rename consistent with the rest of the tree?

If not, can someone do a sed s/CONFIG_USE_INIT/CONFIG_USE_PRINTK_IN_CAR/
on all x86 boards in the tree?

YH: It make PRINTK can be used even you are not use initobjects.

> +static void enable_apic_ext_id(u8 node) 
> +{
> +#if ENABLE_APIC_EXT_ID==1
> +#warning "FIXME Is the right place to enable apic ext id here?"
> +
> +      u32 val;
> +
> +        val = pci_read_config32(NODE_HT(node), 0x68);
> +        val |= (HTTC_APIC_EXT_SPUR | HTTC_APIC_EXT_ID |
HTTC_APIC_EXT_BRD_CST);
> +        pci_write_config32(NODE_HT(node), 0x68, val);
> +#endif
> +}
> +
>  static void enable_routing(u8 node)
>  {
>  	u32 val;
> @@ -199,17 +208,6 @@ static void enable_routing(u8 node)
>  	print_spew(" done.\r\n");
>  }
>  
> -static void enable_apic_ext_id(u8 node)
> -{
> -
> -	u32 val;
> -
> -        val = pci_read_config32(NODE_HT(node), 0x68);
> -        val |= (HTTC_APIC_EXT_SPUR | HTTC_APIC_EXT_ID |
HTTC_APIC_EXT_BRD_CST);
> -        pci_write_config32(NODE_HT(node), 0x68, val);
> -}

Personally I think such function moves are nasty. I think it should be
dropped if there is no reason.

YH: MACRO?


> @@ -259,23 +257,6 @@ static void rename_temp_node(u8 node)
>  
>  	print_spew(" done.\r\n");
>  }
> -#if K8_HT_CHECK_PENDING_LINK == 1
> -static void wait_ht_stable(uint8_t node)
> -{       
> -        uint8_t linkn;
> -        for(linkn = 0; linkn<3; linkn++) {
> -                uint8_t regpos;
> -                uint16_t i;
> -                uint32_t reg;
> -                regpos = 0x98 + 0x20 * linkn;
> -                for(i = 0; i < 0xff; i++) { //wait to make sure it is
done
> -                        reg = pci_read_config32(NODE_HT(node),
regpos);
> -                        if ((reg & 0x10) == 0) break; // init
complete
> -                        udelay(10);
> -                } 
> -        }
> -}
> -#endif

Why is this dropped?
 
> -	if (!is_cpu_pre_e0()) {
> +#if K8_HT_FREQ_1G_SUPPORT == 1
> +    #if K8_REV_F_SUPPORT == 0
> +	if (!is_cpu_pre_e0())
> +    #endif 
> +	{
>  		return freq_cap;
>  	}
> +#endif

This nesting of ifdef statements looks wrong. Why would supporting more
than before drop code. This means "K8_NO_REV_E_AND_EARLIER_SUPPORT"
instead of "K8_REV_F_SUPPORT".  I believe it should be dropped (and many
places like that in the code) It just makes the code ugly to read for no
good reason. (Rev F are all using CAR anyways, so code size impact is
minimal)

Yinghai? What is it good for?

> @@ -676,11 +665,6 @@ static void setup_uniprocessor(void)
>  	disable_probes();
>  }
>  
> -struct setup_smp_result {
> -	int nodes;
> -	int needs_reset;
> -};
> -
>  #if CONFIG_MAX_PHYSICAL_CPUS > 2
>  static int optimize_connection_group(const u8 *opt_conn, int num) {
>  	int needs_reset = 0;
> @@ -695,21 +679,20 @@ static int optimize_connection_group(con
>  #endif
>  
>  #if CONFIG_MAX_PHYSICAL_CPUS > 1
> -static struct setup_smp_result setup_smp2(void)
> +static unsigned setup_smp2(void)


Above you start dropping "unsigned" (which i think is very cool) but
here you use it again for new stuff.  Unsigned _what_?

YH: I want to gcc to decide the size sometime if we don't care about the
size.

>  	print_spew("coherent_ht_finalize\r\n");
> +#if K8_REV_F_SUPPORT == 0
>  	rev_a0 = is_cpu_rev_a0();
> +#endif

I think we should completely drop all K8 checks for rev a and rev b and
instead put in one early check that prints

"Revision Ax and Bx systems are not supported by LinuxBIOS. Halted"

Because it is true. A rev B3 system will not be initialized correctly.

YH: some big cluster in LANL still use B3?

> Index: src/arch/i386/boot/tables.c
> ===================================================================
> --- src/arch/i386/boot/tables.c.orig
> +++ src/arch/i386/boot/tables.c
> @@ -10,38 +10,12 @@
>  #include <arch/acpi.h>
>  #include "linuxbios_table.h"
>  
> -// Global Descriptor Table, defined in c_start.S
> -extern uint8_t gdt;
> -extern uint8_t gdt_end;
> -
> -/* i386 lgdt argument */
> -struct gdtarg {
> -	unsigned short limit;
> -	unsigned int base;
> -} __attribute__((packed));
> -
> -// Copy GDT to new location and reload it
> -// 2003-07 by SONE Takeshi
> -// Ported from Etherboot to LinuxBIOS 2005-08 by Steve Magnani
> -void move_gdt(unsigned long newgdt)
> -{
> -	uint16_t num_gdt_bytes = &gdt_end - &gdt;
> -	struct gdtarg gdtarg;
> -
> -	printk_debug("Moving GDT to %#lx...", newgdt);
> -	memcpy((void*)newgdt, &gdt, num_gdt_bytes);
> -	gdtarg.base = newgdt;
> -	gdtarg.limit = num_gdt_bytes - 1;
> -	__asm__ __volatile__ ("lgdt %0\n\t" : : "m" (gdtarg));
> -	printk_debug("ok\n");
> -}
> -
>  struct lb_memory *write_tables(void)

I think dropping move_gdt _here_ (in tables) is good. BUT: It is a
regression caused by Yinghai working on an older tree. 

It should be added again elsewhere. Where?

We have this nasty bugger in v3 as well.. it must die ;-)

> -#if HAVE_MP_TABLE==1
> +#if HAVE_MP_TABLE

this looks wrong ^^^^^^

> Index: src/arch/i386/lib/cpu.c
> ===================================================================
> --- src/arch/i386/lib/cpu.c.orig
> +++ src/arch/i386/lib/cpu.c
> @@ -207,7 +207,6 @@ static void set_cpu_ops(struct device *c
>  			}
>  		}
>  	}
> -	die("Unknown cpu");

This should be a printk at least?

>  	return;
>   found:
>  	cpu->ops = driver->ops;
> @@ -223,7 +222,7 @@ void cpu_initialize(void)
>  	struct device *cpu;
>  	struct cpu_info *info;
>  	struct cpuinfo_x86 c;
> -
> +	
>  	info = cpu_info();
 
This should be dropped. 
 
> Index: src/pc80/serial.c
> ===================================================================
> --- src/pc80/serial.c.orig
> +++ src/pc80/serial.c
> @@ -92,6 +92,7 @@ static void uart_init(void)
>  	outb(UART_LCS, TTYS0_BASE + UART_LCR);
>  }
>  #else
> +#include "../lib/uart8250.c"
>  extern void uart8250_init(unsigned base_port, unsigned divisor,
unsigned lcs);
>  static void uart_init(void)
>  {

Why this include? Looks wrong. Include stinks (if used like that).

YH: make the CONFIG_PRINTK_IN_CAR happy


> Index: src/mainboard/gigabyte/m57sli/Options.lb
> ===================================================================
> --- src/mainboard/gigabyte/m57sli/Options.lb.orig
> +++ src/mainboard/gigabyte/m57sli/Options.lb
> @@ -213,7 +213,7 @@ default K8_HT_FREQ_1G_SUPPORT=1
>  default CONFIG_CONSOLE_VGA=1
>  default CONFIG_PCI_ROM_RUN=1
>  
> -default CONFIG_USBDEBUG_DIRECT=1
> +#default CONFIG_USBDEBUG_DIRECT=1
>  
>  #HT Unit ID offset, default is 1, the typical one, 0 mean only one HT
device
>  default HT_CHAIN_UNITID_BASE=0

No USB debug console? What other console is there on the m57sli?

YH: serial console

> Index: util/abuild/abuild
> ===================================================================
> --- util/abuild/abuild.orig
> +++ util/abuild/abuild
> @@ -142,14 +142,14 @@ EOF
>  			cat <<EOF
>  romimage "normal"
>  	option USE_FALLBACK_IMAGE=0
> -	option ROM_IMAGE_SIZE=0x16000
> +	option ROM_IMAGE_SIZE=0x17000
>  	option LINUXBIOS_EXTRA_VERSION=".0-normal"
>  	payload PAYLOAD
>  end
>  
>  romimage "fallback" 
>  	option USE_FALLBACK_IMAGE=1
> -	option ROM_IMAGE_SIZE=0x16000
> +	option ROM_IMAGE_SIZE=0x17000
>  	option LINUXBIOS_EXTRA_VERSION=".0-fallback"
>  	payload PAYLOAD
>  end

Which boards are failing without this option? They should get their 
specific Config.abuild instead. This option will break building images
with filo payload on qa.linuxbios.org for many small systems.







More information about the coreboot mailing list