Dave Korn | 5 May 2010 18:54

Re: CFA: pseudo-reloc v2

[ redirected from cygwin-developers. ]
On 04/10/2009 05:11, Charles Wilson wrote:
[ thread seriously necro'd! ]
> Dave Korn wrote:
>> Charles Wilson wrote:
>>>   120 void
>>>   121 _pei386_runtime_relocator ()
>>>   122 {
>>>   123   static int was_init = 0;
>>>   124   if (was_init)
>>>   125     return;
>>>   126   ++was_init;
>>>   127   do_pseudo_reloc (&__RUNTIME_PSEUDO_RELOC_LIST__,&__RUNTIME_PSEUDO_RELOC_LIST_END__,&_image_base__);
>>>   128 }
>>   Maybe that static should be NO_COPY?  If everything gets remapped in the
>> forkee, do the relocs need rerunning?  (I'm not sure about the behaviour of
>> NtCreateProcess w.r.t modified .text section pages.)
> 
> Good guess!  With the following patch, all of these fork tests perform
> as expected.

  Aha, not so good as all that after all!  We need to re-apply relocs in the
forkee - but only if they *don't* point to regions covered by the .data/.bss
section copying at startup.  Argh!

>  One oddity; it turns out that __INSIDE_CYGWIN__ is not
> defined inside pseudo-reloc.c, so I used __CYGWIN__ as a guard.

  Dunno if we ever went into that, but it's right; pseudo-reloc.o is part of
the CRT in winsup/cygwin/lib/, and is linked statically into every exe and
(Continue reading)

Christopher Faylor | 5 May 2010 19:56
Favicon

Re: CFA: pseudo-reloc v2

On Wed, May 05, 2010 at 05:54:29PM +0100, Dave Korn wrote:
>[ redirected from cygwin-developers. ]
>On 04/10/2009 05:11, Charles Wilson wrote:
>[ thread seriously necro'd! ]
>> Dave Korn wrote:
>>> Charles Wilson wrote:
>>>>   120 void
>>>>   121 _pei386_runtime_relocator ()
>>>>   122 {
>>>>   123   static int was_init = 0;
>>>>   124   if (was_init)
>>>>   125     return;
>>>>   126   ++was_init;
>>>>   127   do_pseudo_reloc (&__RUNTIME_PSEUDO_RELOC_LIST__,&__RUNTIME_PSEUDO_RELOC_LIST_END__,&_image_base__);
>>>>   128 }
>>>   Maybe that static should be NO_COPY?  If everything gets remapped in the
>>> forkee, do the relocs need rerunning?  (I'm not sure about the behaviour of
>>> NtCreateProcess w.r.t modified .text section pages.)
>> 
>> Good guess!  With the following patch, all of these fork tests perform
>> as expected.
>
>  Aha, not so good as all that after all!  We need to re-apply relocs in the
>forkee - but only if they *don't* point to regions covered by the .data/.bss
>section copying at startup.  Argh!
>
>>  One oddity; it turns out that __INSIDE_CYGWIN__ is not
>> defined inside pseudo-reloc.c, so I used __CYGWIN__ as a guard.
>
>  Dunno if we ever went into that, but it's right; pseudo-reloc.o is part of
(Continue reading)

Dave Korn | 5 May 2010 20:58

Re: CFA: pseudo-reloc v2

On 05/05/2010 18:56, Christopher Faylor wrote:

> I like the idea but I have a few problems with this, some stylistic and
> some implementation.
> 
> Stylistic:

  Those all make sense to me, but I won't rework it yet until we've seen your
PoC/discussed further.

> Implementation:
> 
> I don't like keeping a list of "places we know we need to ignore" separate from
> the Cygwin DLL.  That means that if there is something new added besides data/bss
> this function becomes obsolete.
> 
> I think this argues for most of this functionality being moved to the
> Cygwin DLL itself so that an application gets improvements for free.  I
> should have suggested that when this function first made its way into
> the libcygwin.a (or maybe I did and someone will enlighten me about why that
> won't work).
> 
> I'll see I can come up with a proof-of-concept of what I'm talking about soon.
> 
> Btw, don't we have to worry about data/bss for DLLs too?  Is that
> handled by this change or is that not an issue?

  We do have to worry and it is handled.  This code gets linked statically
into each DLL and EXE, each instance referring just to its own pseudo-reloc
tables.  The Cygwin DLL copies all the data and bss for both DLLs and EXEs (at
(Continue reading)

Christopher Faylor | 5 May 2010 21:13
Favicon

Re: CFA: pseudo-reloc v2

On Wed, May 05, 2010 at 07:58:20PM +0100, Dave Korn wrote:
>On 05/05/2010 18:56, Christopher Faylor wrote:
>
>> I like the idea but I have a few problems with this, some stylistic and
>> some implementation.
>> 
>> Stylistic:
>
>  Those all make sense to me, but I won't rework it yet until we've seen your
>PoC/discussed further.
>
>> Implementation:
>> 
>> I don't like keeping a list of "places we know we need to ignore" separate from
>> the Cygwin DLL.  That means that if there is something new added besides data/bss
>> this function becomes obsolete.
>> 
>> I think this argues for most of this functionality being moved to the
>> Cygwin DLL itself so that an application gets improvements for free.  I
>> should have suggested that when this function first made its way into
>> the libcygwin.a (or maybe I did and someone will enlighten me about why that
>> won't work).
>> 
>> I'll see I can come up with a proof-of-concept of what I'm talking about soon.
>> 
>> Btw, don't we have to worry about data/bss for DLLs too?  Is that
>> handled by this change or is that not an issue?
>
>  We do have to worry and it is handled.  This code gets linked statically
>into each DLL and EXE, each instance referring just to its own pseudo-reloc
(Continue reading)

Dave Korn | 5 May 2010 21:48

Re: CFA: pseudo-reloc v2

On 05/05/2010 20:13, Christopher Faylor wrote:

> Yeah, I realized that two seconds after sending the message.  However,
> is this particular problem really an issue for DLLs?  DLLs should get
> their data/bss updated after _pei386_runtime_relocator() is called.  So
> it seems like you'd get the same thing being written twice.  It's not
> optimal but it shouldn't be fatal.
> 
> The program's data/bss is different since that gets copied during DLL
> initialization and before _pei386_runtime_relocator() is (was) called.  So
> I could see how it could be screwed up.

  Ah, right; I wasn't looking at how much later the dll sections got copied, I
just figured the safest and consistent solution was just to treat everything
the same.

> That's basically it and I have it more-or-less coded but I haven't
> finished thinking about DLLs.  Maybe that's more complication than is
> warranted.  I have to do more research there.  We could, and I think
> should, put most of the code in pseudo_reloc.c in cygwin1.dll, though,
> rather than duplicate it in every source file.

  Yeh, the only thing we need in the source file is to capture the module's
idea of its section start/end pointers, as we already do in the per_process;
we could consider passing pointers to the pseudo-relocs in that as well, but
horrible backward-compatibility problems could arise.  It would make sense to
inline the remnants of _pei386_runtime_relocator into _cygwin_crt0_common and
do away with the pseudo-reloc.c file altogether.

> This information is all recorded for fork() so it should be doable.  It is
(Continue reading)

Christopher Faylor | 5 May 2010 22:30
Favicon

Re: CFA: pseudo-reloc v2

On Wed, May 05, 2010 at 08:48:28PM +0100, Dave Korn wrote:
>On 05/05/2010 20:13, Christopher Faylor wrote:
>
>> Yeah, I realized that two seconds after sending the message.  However,
>> is this particular problem really an issue for DLLs?  DLLs should get
>> their data/bss updated after _pei386_runtime_relocator() is called.  So
>> it seems like you'd get the same thing being written twice.  It's not
>> optimal but it shouldn't be fatal.
>> 
>> The program's data/bss is different since that gets copied during DLL
>> initialization and before _pei386_runtime_relocator() is (was) called.  So
>> I could see how it could be screwed up.
>
>  Ah, right; I wasn't looking at how much later the dll sections got copied, I
>just figured the safest and consistent solution was just to treat everything
>the same.
>
>> That's basically it and I have it more-or-less coded but I haven't
>> finished thinking about DLLs.  Maybe that's more complication than is
>> warranted.  I have to do more research there.  We could, and I think
>> should, put most of the code in pseudo_reloc.c in cygwin1.dll, though,
>> rather than duplicate it in every source file.
>
>  Yeh, the only thing we need in the source file is to capture the module's
>idea of its section start/end pointers, as we already do in the per_process;
>we could consider passing pointers to the pseudo-relocs in that as well, but
>horrible backward-compatibility problems could arise.  It would make sense to
>inline the remnants of _pei386_runtime_relocator into _cygwin_crt0_common and
>do away with the pseudo-reloc.c file altogether.
>
(Continue reading)

Dave Korn | 6 May 2010 01:47

Re: CFA: pseudo-reloc v2

On 05/05/2010 21:30, Christopher Faylor wrote:

> I have something written now.  I'll dig through the cygwin archives to
> see if I can find the original message which started this but are there
> other test cases that I could use to verify that I caught all of the
> code paths in the DLL?

  http://cygwin.com/ml/cygwin/2010-04/msg00957.html comes with a couple of
testcases attached, although you can only be sure they've worked by running
them and seeing that no .stackdump file was generated in your $CWD.

> Chuck?  Do you have anything I could use to test what I did?

  There were some fork-related testcases in the original thread, but I didn't
refer back to them when I was revising this, so they're probably worth verifying:
   http://www.cygwin.com/ml/cygwin-developers/2009-10/msg00052.html

> What I did:
> 
> 1) Move pseudo-reloc.c out of lib and into the dll (making
> it a c++ file in the process).
> 
> 2) Record the three values needed by _pei386_runtime_relocator in the
> per_process structure.

  That bit worries me - even adding a single pointer in a place where there
would never have been a field before caused us enough trouble!  But, it's
probably the right thing to do; it's the defined mechanism for conveying
image-specific information from the module to the cygwin dll.

(Continue reading)

Charles Wilson | 6 May 2010 05:07
Favicon

Re: CFA: pseudo-reloc v2

On 5/5/2010 3:13 PM, Christopher Faylor wrote:

> That's basically it and I have it more-or-less coded but I haven't
> finished thinking about DLLs.  Maybe that's more complication than is
> warranted.  I have to do more research there.  We could, and I think
> should, put most of the code in pseudo_reloc.c in cygwin1.dll, though,
> rather than duplicate it in every source file.

I disagree with this statement.

I spent a lot of effort trying to synchronize our version of
pseudo_reloc.c with the mingw and mingw64 versions -- specifically so
that we could leverage Kai's v2 efforts.

If we -- meaning cygwin -- move most of the guts into the cygwin DLL,
then ... we either
  (1) fork our version from the mingw[32|64] version permanently, and
lose the possibility of "easy" code sharing between the three projects, or
  (2) this portion of the code lives in both places (pseudo_reloc.c and
some-other-cygwin-dll-source-file), but is #ifdef'ed in pseudo_reloc.c
when compiled on cygwin, because there's this other identical copy over
in some-other-cygwin-dll-source-file.

Yuck. (I don't mind "losing" the effort I put in, because whatever
happens we now have v2 support. But...why make it harder if somebody in
mingw-land invents v3? Or make it harder on them, if WE do?)

--
Chuck

(Continue reading)

Charles Wilson | 6 May 2010 05:12
Favicon

Re: CFA: pseudo-reloc v2

On 5/5/2010 4:30 PM, Christopher Faylor wrote:
> Chuck?  Do you have anything I could use to test what I did?

Yes, most recent version attached. (embedded READMEs describe expected
output).

--
Chuck
Attachment (pseudo-reloc-tests-v3.tar.bz2): application/octet-stream, 3582 bytes
Christopher Faylor | 6 May 2010 15:49
Favicon

Re: CFA: pseudo-reloc v2

On Wed, May 05, 2010 at 11:07:33PM -0400, Charles Wilson wrote:
>On 5/5/2010 3:13 PM, Christopher Faylor wrote:
>
>> That's basically it and I have it more-or-less coded but I haven't
>> finished thinking about DLLs.  Maybe that's more complication than is
>> warranted.  I have to do more research there.  We could, and I think
>> should, put most of the code in pseudo_reloc.c in cygwin1.dll, though,
>> rather than duplicate it in every source file.
>
>I disagree with this statement.
>
>I spent a lot of effort trying to synchronize our version of
>pseudo_reloc.c with the mingw and mingw64 versions -- specifically so
>that we could leverage Kai's v2 efforts.
>
>If we -- meaning cygwin -- move most of the guts into the cygwin DLL,
>then ... we either
>  (1) fork our version from the mingw[32|64] version permanently, and
>lose the possibility of "easy" code sharing between the three projects, or
>  (2) this portion of the code lives in both places (pseudo_reloc.c and
>some-other-cygwin-dll-source-file), but is #ifdef'ed in pseudo_reloc.c
>when compiled on cygwin, because there's this other identical copy over
>in some-other-cygwin-dll-source-file.

I kept the ifdef __CYGWIN__ stuff.  Moving the code into the DLL
actually simplifies the Cygwin part quite a bit since you can use things
like "winsup.h" and "small_printf".  And, my changes don't permute
things as much as Dave's.  Dave's changes were not really MinGW
friendly.

(Continue reading)


Gmane