Hey all,

It appears that on certain hardware, or in certain situations, HT can
come up with the LinkFail and CrcError bits set on some devices, even
though the bus isn't *currently* in an error state. This causes
'hypertransport_scan_chain()' to stop traversing the chain. I've made
the following patch, which clears the error bits and re-reads the
register to determine whether the error is transient. It also reports
the error rather than silently aborting the chain scan, which cost me
about 6 hours of hunting to find:

*****BEGIN CUT*****
Index: hypertransport.c
===================================================================
--- hypertransport.c    (revision 2064)
+++ hypertransport.c    (working copy)
@@ -345,12 +345,25 @@
                /* Wait until the link initialization is complete */
                do {
                        ctrl = pci_read_config16(prev.dev, prev.pos + prev.ctrl_off);
-                       /* Is this the end of the hypertransport chain?
-                        * Has the link failed?
-                        * If so further scanning is pointless.
-                        */
-                       if (ctrl & ((1 << 6) | (1 << 4))) {
-                               goto end_of_chain;
+
+                       if (ctrl & (1 << 6))
+                               goto end_of_chain;      // End of chain
+
+                       if (ctrl & ((1 << 4) | (1 << 8))) {
+                               /*
+                                * Either the link has failed, or we have
+                                * a CRC error.
+                                * Sometimes this can happen due to link
+                                * retrain, so let's knock it down and see
+                                * if it's transient
+                                */
+                               ctrl |= ((1 << 4) | (1 << 8)); // Link fail + Crc
+                               pci_write_config16(prev.dev, prev.pos + prev.ctrl_off, ctrl);
+                               ctrl = pci_read_config16(prev.dev, prev.pos + prev.ctrl_off);
+                               if (ctrl & ((1 << 4) | (1 << 8))) {
+                                       printk_alert("Detected error on Hypertransport Link\n");
+                                       goto end_of_chain;
+                               }
                        }
                } while((ctrl & (1 << 5)) == 0);

****END CUT*****
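
For anyone who wants to look at the clear-and-recheck logic in isolation, here is a minimal sketch that runs against a mock register instead of real PCI config space. It assumes the LinkFail and CRC error bits are write-1-to-clear, which is the behaviour the patch relies on. The names `mock_link`, `mock_read`, `mock_write`, `check_link` and the `HT_*` macros are made up for illustration and are not part of hypertransport.c; only the bit positions are taken from the patch.

```c
#include <stdint.h>

/* Bit positions in the HT link control register, as used in the patch. */
#define HT_LINK_FAIL  (1u << 4)   /* link failure */
#define HT_INIT_DONE  (1u << 5)   /* link initialization complete */
#define HT_END_CHAIN  (1u << 6)   /* end of chain */
#define HT_CRC_ERR    (1u << 8)   /* CRC error */
#define HT_ERR_MASK   (HT_LINK_FAIL | HT_CRC_ERR)

/* Hypothetical stand-in for a device's link control register. */
struct mock_link {
	uint16_t ctrl;   /* latched register contents */
	uint16_t stuck;  /* error bits the "hardware" keeps re-asserting */
};

static uint16_t mock_read(const struct mock_link *l)
{
	/* A persistent fault shows up again on every read. */
	return (uint16_t)(l->ctrl | l->stuck);
}

static void mock_write(struct mock_link *l, uint16_t val)
{
	/* Assumed write-1-to-clear semantics for the error bits.
	 * All other bits are left untouched in this simplified mock. */
	l->ctrl = (uint16_t)(l->ctrl & ~(val & HT_ERR_MASK));
}

enum link_status { LINK_OK, LINK_TRANSIENT_ERROR, LINK_PERSISTENT_ERROR };

/* Same idea as the patch: if an error bit is set, clear it and
 * re-read. Only an error that comes straight back counts as real. */
static enum link_status check_link(struct mock_link *l)
{
	uint16_t ctrl = mock_read(l);

	if (!(ctrl & HT_ERR_MASK))
		return LINK_OK;

	mock_write(l, (uint16_t)(ctrl | HT_ERR_MASK));
	ctrl = mock_read(l);

	return (ctrl & HT_ERR_MASK) ? LINK_PERSISTENT_ERROR
				    : LINK_TRANSIENT_ERROR;
}
```

A link with stale error bits (`ctrl` has them set, `stuck` is 0) comes back as `LINK_TRANSIENT_ERROR` and the scan could continue, while a link whose `stuck` field keeps re-asserting an error bit comes back as `LINK_PERSISTENT_ERROR`, which corresponds to the `goto end_of_chain` path in the patch.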