[LinuxBIOS] RESEND: v3 patch for console functions, and mods to lib/lar.c to support a void * parameter for running functions.

Fri Jan 4 04:14:52 CET 2008

On 04.01.2008 02:43, ron minnich wrote:
> I am restarting the CAR discussion a bit. I want to get a resync.
>   

Thanks.

> One part of the discussion involves the idea that I am trying to avoid
> return from disable_car because of some "bug" in our code. It is not
> that simple. The issue is that, when the code enters disable_car,the
> CPU has been using cache as ram, and using "memory" at non-memory
> addresses. In disable_car, the CPU is supposed to disable cache as
> ram, and continue -- should it return? The return is on the stack. The
> function that called disable_car has its own stack variables. They're
> actually in cache, not ram. So if the code is going to return, it will
> need a working stack. It will need to restore the stack from cache to
> ram. But where is the stack? at c000 in one case. Where is ram?
> Somewhere else.

That's something I don't understand. Are you saying that there might be
no RAM at address 0xC000? (Or at 0xC0000, IIRC.) That's well below 1MB
and on my local system 0xC0000-0xCFFFF is claimed by the Video ROM of
the embedded ATI graphics. Then again, I seem to remember that this area
can be RAM as well.

>  So it is possible that variables on stack are wrong,
> if they are pointers to variables on the stack, and there is the
> problem of changing esp, ss, and then fixing up the stack for the RET.
>
> Here's a useful comment:
> x86/car/cache_as_ram_post.c:
> /*
>         FIXME: I hope we don't need to change esp and ebp value here,
> so we can restore value from mmx sse back
>                 But the problem is the range is some io related, So
> don't go back
>         */
>
> If you look at the cache as ram for the amd and x86, you will find
> something interesting: it does not go back.
> x86:
>         leal    _iseg, %edi
>         jmp     *%edi
>
>
> For AMD, at the end of the mainboard cache as ram main, it calls
> post_cache_as_ram(void). At the end of that function is this:
>         /*copy and execute linuxbios_ram */
>         copy_and_run();
>         /* We will not return */
>
> in other words, once CAR is disabled, we chain to copy_and_run, and
> never return. We've also set up a new ESP.
>
> To put it simply, the most current CAR code *does not return*
> --because getting that right, for all CPUs, is tricky and not always
> possible.
>   

OK.

> So, expecting a return from disable car is not really in line with
> what we do now. In fact, what we do now on most of v2 -- certainly on
> opteron -- is merely what I'm proposing we do on V3.
>
> I'm pasting in some old mail now, with my comments in[[[]]]
>   

Thanks for that. It seems I missed the essence of that mail back then.

> =================================================================
> Marc Jones wrote:
>   
>> Carl-Daniel Hailfinger wrote:
>>
>>     
>>> Dumb question: We know the location of the stack when we enter
>>> disable_car(). We also know that all memory belongs to us. Can we copy
>>> CAR stack contents to some safe location and restore them after wbinvd()?
>>>
>>>       
>> This shouldn't be needed. If the stack needed to be moved you would want
>> to do it before the wbinvd so you were copying from cache to memory.
>> Much faster than memory to memory.
>>
>> Marc
>>
>>     
>
> and I am too smart for my own good.....
>
>
> Here is the deal. Look in cpu/amd/model_lx/cache_as_ram.inc.
> /* If you wanted to maintain the stack in memory you would need to set
> the tags as dirty
> [[[ Good point here! For the wbinvd to push the stack out, we need to
> set cache tags!
>     This is going to be tough on some CPUs -- how do we do it on LX?
> ]]]
>   

I assumed from one of the mails in that old thread that setting the
cache tags would be easy.

>          so the wbinvd would push out the old stack contents to memory */
> /* Clear the cache, the following code from crt0.S.lb will setup a new
> stack*/
> wbinvd
>
> For LX we didn't need to maintain the stack or any variables beyond CAR.
> [[[[ But on v3 we do .... ]]]]
> That meant that we didn't need to do any stack copy like the K8
> post_cache_as_ram()(that never returns). The LX cache_as_ram_main
> returns and the LB copy function is back in cache_as_ram.inc. Since
> there is nothing on the stack and everything is running out of the ROM a
> new stack is easily created.
>
> [[[ Please note the comment: "that never returns"]]]
>   

OK.

> [[[ and, note: if you look at that code, round about line 313, you'll
> notice that it does not return either]]]
>
> Sooo, if you want to maintain the stack in v3... I left a comment in the
>  v2 code. Recall that they cache is disabled so the way doesn't get
> reallocated and thus the tags are never marked dirty and written back.
> Since the tags are never marked dirty the wbind won't wb but it will invd.
>
> [[[meaning: stack cache lines will be invalidated but NOT written back ...]]]
>   

And that means the void * parameter to the running functions will be
invalidated as well because it is passed on the stack. BOOM. You might
be lucky enough that gcc generates code which loads the parameter into a
register before the wbinvd and does not try to reload that parameter
from your now-defunct old stack, but depending on luck is not an option
here.
If we kill our stack by making the wbinvd essentially an invd (because
tags are not dirty), no function arguments nor any local variables will
be available directly after the wbinvd. The only way out is to directly
call a function (not by pointer) whose address is known during link
time. Something like the call to copy_and_run() from v2 is needed here
as well.

> There are a couple ways to address this.
> 1. copy the stack to a new location.
> 2. Set the tags dirty with by writing the way MSRs.
> [[[ This is how you set the tags ]]]
>   

That's the way I'd prefer to handle things.

> 3. enable the cache and copy the stack back on it's self to dirty the tags.
> [[[This might work too to set the tags BUT -- the stack really has to
> be running at the right location. We would have to change this.]]]
>
> I am least sure about number three.
>
> I think that sums it up. I still think that it is best/fastest if the
> code returned to the CAR function and it setup a new stack in memory and
> then launched the next stage.
>
> [[[ Which is pretty much what I am proposing]]]
> =============================================================
> I hope this additional discussion is helpful.
>   

Indeed, thanks!

> Now, getting back to where we were: I am still arguing for what Marc
> argues in the last paragraph, stating "I still think that it is
> best/fastest if the
> code returned to the CAR function and it setup a new stack in memory and
> then launched the next stage.".
>   

As long as I can salvage the old stack contents in some way, that would
be acceptable for me as well.

> Carl-Daniel, I believe you are hoping we can do the return from
> disable_car, but as Marc points out in options 1-3 above, it's really
> not simple. This is not a "bug" per se. Fixing it will be complex. It
> is possible that you will need different solutions for each CPU
> *implementation* in some cases, not just each CPU type -- for example,
> the Intel CAR had to be changed for AMD, and different types of
> Opteron may well need different CAR return code.
>   

An explosion of code variants for CAR disabling would seriously
compromise maintainability of our code.

Regards,
Carl-Daniel