Re: [PATCH v2] hugetlb: fix race condition in hugetlb_fault()
Chris Metcalf <cmetcalf <at> tilera.com>
2012-04-01 16:51:49 GMT
On 4/1/2012 8:10 AM, Hillf Danton wrote:
> On Sat, Mar 31, 2012 at 4:07 AM, Chris Metcalf <cmetcalf <at> tilera.com> wrote:
>> The race is as follows. Suppose a multi-threaded task forks a new
>> process, thus bumping up the ref count on all the pages. While the fork
>> is occurring (and thus we have marked all the PTEs as read-only), another
>> thread in the original process tries to write to a huge page, taking an
>> access violation from the write-protect and calling hugetlb_cow(). Now,
>> suppose the fork() fails. It will undo the COW and decrement the ref
>> count on the pages, so the ref count on the huge page drops back to 1.
>> Meanwhile hugetlb_cow() also decrements the ref count by one on the
>> original page, since the original address space doesn't need it any more,
>> having copied a new page to replace the original page. This leaves the
>> ref count at zero, and when we call unlock_page(), we panic.
>>
>> The solution is to take an extra reference to the page while we are
>> holding the lock on it.
>>
> If the following chart matches the above description,
>
> [...]
>
> would you please spin with description refreshed?
Done, and thanks! I added your timeline chart to my description; I figure
no harm in having it both ways.
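
To make the ref-count arithmetic in the description concrete, here is a
minimal user-space sketch in C. It is a toy model, not the kernel code:
struct toy_page, toy_get(), toy_put() and refcount_at_unlock() are
hypothetical names made up for illustration, and the steps are replayed
sequentially in the order the description gives rather than run as a real
concurrent race.

```c
#include <assert.h>

/* Toy stand-in for a huge page: only a ref count and a lock flag. */
struct toy_page {
    int refcount;
    int locked;
};

static void toy_get(struct toy_page *p) { p->refcount++; }
static void toy_put(struct toy_page *p) { p->refcount--; }

/*
 * Replay the sequence from the patch description and return the ref
 * count seen at the moment the fault path unlocks the page.
 * take_extra_ref != 0 models the fix (an extra reference held while
 * the page is locked); 0 models the original behaviour.
 */
int refcount_at_unlock(int take_extra_ref)
{
    struct toy_page page = { .refcount = 1, .locked = 0 };
    int at_unlock;

    toy_get(&page);          /* fork() shares the page: 1 -> 2 */

    page.locked = 1;         /* writer thread faults and locks the page */
    if (take_extra_ref)
        toy_get(&page);      /* the fix: pin the page while it is locked */

    toy_put(&page);          /* failed fork() drops its reference */
    toy_put(&page);          /* COW path drops the original mapping's ref */

    assert(page.locked);     /* we are still holding the page lock here */
    at_unlock = page.refcount;
    page.locked = 0;         /* the unlock_page() step */

    if (take_extra_ref)
        toy_put(&page);      /* release the pin after unlocking */

    return at_unlock;
}
```

In this model refcount_at_unlock(0) gives 0, which is the case that panics
in the description, while refcount_at_unlock(1) gives 1, so the page is
still referenced when it is unlocked.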
>> Cc: stable <at> kernel.org
> Let Andrew do the stable work, ok?
Fair point. I'm used to adding the Cc myself for things I push through the