Martin Mares | 1 Apr 2012 06:26

The linux-input mailing list has been moved

The linux-input mailing list has been moved to vger.kernel.org.
See http://vger.kernel.org/vger-lists.html for information on the
new list server (or consult your local oracle).

		Yours virtually,
					Martin Mares

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo <at> kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont <at> kvack.org"> email <at> kvack.org </a>

Hillf Danton | 1 Apr 2012 14:10

Re: [PATCH v2] hugetlb: fix race condition in hugetlb_fault()

On Sat, Mar 31, 2012 at 4:07 AM, Chris Metcalf <cmetcalf <at> tilera.com> wrote:
> The race is as follows.  Suppose a multi-threaded task forks a new
> process, thus bumping up the ref count on all the pages.  While the fork
> is occurring (and thus we have marked all the PTEs as read-only), another
> thread in the original process tries to write to a huge page, taking an
> access violation from the write-protect and calling hugetlb_cow().  Now,
> suppose the fork() fails.  It will undo the COW and decrement the ref
> count on the pages, so the ref count on the huge page drops back to 1.
> Meanwhile hugetlb_cow() also decrements the ref count by one on the
> original page, since the original address space doesn't need it any more,
> having copied a new page to replace the original page.  This leaves the
> ref count at zero, and when we call unlock_page(), we panic.
>
> The solution is to take an extra reference to the page while we are
> holding the lock on it.
>
If the following chart matches the above description,

===
	fork on CPU A				fault on CPU B
	=============				==============
	...
	down_write(&parent->mmap_sem);
	down_write_nested(&child->mmap_sem);
	...
	while duplicating vmas
		if error
			break;
	...
	up_write(&child->mmap_sem);
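The reference-count arithmetic in the quoted description can be replayed in user space. The following is only a minimal single-threaded sketch that fixes the interleaving in one order; struct model_page and the model_* helpers are hypothetical stand-ins, not the kernel's struct page, get_page() or put_page().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a page; not the kernel's struct page. */
struct model_page {
	int refcount;   /* number of references held */
	bool locked;    /* models the page lock */
	bool freed;     /* set when the last reference is dropped */
};

static void model_get_page(struct model_page *p)
{
	assert(!p->freed);
	p->refcount++;
}

static void model_put_page(struct model_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;
}

/*
 * Replays the interleaving from the description in one fixed order.
 * Returns true if the page was freed while still locked, which is the
 * situation that precedes the panic in unlock_page().
 */
bool model_fault_vs_failed_fork(bool take_extra_ref)
{
	struct model_page page = { .refcount = 1, .locked = false, .freed = false };
	bool freed_while_locked;

	model_get_page(&page);          /* fork: child's reference, count = 2 */

	page.locked = true;             /* fault path locks the page */
	if (take_extra_ref)
		model_get_page(&page);  /* the proposed fix: pin the page under the lock */

	model_put_page(&page);          /* failed fork drops its reference */
	model_put_page(&page);          /* COW drops the original mapping's reference */

	freed_while_locked = page.freed && page.locked;

	page.locked = false;            /* unlock_page() */
	if (take_extra_ref)
		model_put_page(&page);  /* drop the pin only after unlocking */

	return freed_while_locked;
}
```

Calling model_fault_vs_failed_fork(false) reports that the page was freed while locked; with the extra reference the count only reaches zero after the unlock.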

Hillf Danton | 1 Apr 2012 14:33

Re: [PATCH v3] arch/tile: support multiple huge page sizes dynamically

On Sat, Mar 31, 2012 at 3:37 AM, Chris Metcalf <cmetcalf <at> tilera.com> wrote:
> This change adds support for a new "super" bit in the PTE, and a
> new arch_make_huge_pte() method called from make_huge_pte().
> The Tilera hypervisor sees the bit set at a given level of the page
> table and gangs together 4, 16, or 64 consecutive pages from
> that level of the hierarchy to create a larger TLB entry.
>
> One extra "super" page size can be specified at each of the
> three levels of the page table hierarchy on tilegx, using the
> "hugepagesz" argument on the boot command line.  A new hypervisor
> API is added to allow Linux to tell the hypervisor how many PTEs
> to gang together at each level of the page table.
>
> To allow pre-allocating huge pages larger than the buddy allocator
> can handle, this change modifies the Tilera bootmem support to
> put all of memory on tilegx platforms into bootmem.
>
> As part of this change I eliminate the vestigial CONFIG_HIGHPTE
> support, which never worked anyway, and eliminate the hv_page_size()
> API in favor of the standard vma_kernel_pagesize() API.
>
> Reviewed-by: Hillf Danton <dhillf <at> gmail.com>
> Signed-off-by: Chris Metcalf <cmetcalf <at> tilera.com>
> ---
> This version of the patch adds a generic no-op definition to
> <linux/hugetlb.h> if "arch_make_huge_pte" is not #defined.  I'm following
> Linus's model in https://lkml.org/lkml/2012/1/19/443 which says you create
> the inline, then "#define func func" to indicate that the function exists.
>
> Hillf, let me know if you want to provide an Acked-by, or I'll leave it
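The "create the inline, then #define func func" convention mentioned in the patch notes can be sketched in a single file. This is an illustration under stated assumptions: model_pte_t, MODEL_PTE_SUPER and MODEL_ARCH_HAS_HOOK are hypothetical names, and the one-argument signature is simplified and is not taken from the patch.

```c
#include <assert.h>

/* Stand-in type; the kernel's pte_t is architecture-specific. */
typedef struct { unsigned long val; } model_pte_t;

#define MODEL_PTE_SUPER 0x1UL   /* hypothetical value for the "super" bit */
#define MODEL_ARCH_HAS_HOOK 1   /* remove this line to get the generic no-op */

/* --- what an architecture header might contain --- */
#ifdef MODEL_ARCH_HAS_HOOK
static inline model_pte_t arch_make_huge_pte(model_pte_t pte)
{
	pte.val |= MODEL_PTE_SUPER;
	return pte;
}
/* Defining the name to itself lets generic code test for the hook. */
#define arch_make_huge_pte arch_make_huge_pte
#endif

/* --- what the generic header might contain --- */
#ifndef arch_make_huge_pte
static inline model_pte_t arch_make_huge_pte(model_pte_t pte)
{
	return pte;   /* generic no-op when no architecture hook exists */
}
#endif

/* Generic caller: works with either definition. */
model_pte_t model_make_huge_pte(model_pte_t pte)
{
	assert(pte.val != 0);   /* the sketch expects a populated entry */
	return arch_make_huge_pte(pte);
}
```

Because the macro expands to its own name, call sites are unchanged, and the generic header needs only an #ifndef to decide whether to supply the fallback.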

Chris Metcalf | 1 Apr 2012 18:46

Re: [PATCH v3] arch/tile: support multiple huge page sizes dynamically

On 4/1/2012 8:33 AM, Hillf Danton wrote:
> On Sat, Mar 31, 2012 at 3:37 AM, Chris Metcalf <cmetcalf <at> tilera.com> wrote:
>> This change adds support for a new "super" bit in the PTE, and a
>> new arch_make_huge_pte() method called from make_huge_pte().
>> The Tilera hypervisor sees the bit set at a given level of the page
>> table and gangs together 4, 16, or 64 consecutive pages from
>> that level of the hierarchy to create a larger TLB entry.
>>
>> One extra "super" page size can be specified at each of the
>> three levels of the page table hierarchy on tilegx, using the
>> "hugepagesz" argument on the boot command line.  A new hypervisor
>> API is added to allow Linux to tell the hypervisor how many PTEs
>> to gang together at each level of the page table.
>>
>> To allow pre-allocating huge pages larger than the buddy allocator
>> can handle, this change modifies the Tilera bootmem support to
>> put all of memory on tilegx platforms into bootmem.
>>
>> As part of this change I eliminate the vestigial CONFIG_HIGHPTE
>> support, which never worked anyway, and eliminate the hv_page_size()
>> API in favor of the standard vma_kernel_pagesize() API.
>>
>> Reviewed-by: Hillf Danton <dhillf <at> gmail.com>
>> Signed-off-by: Chris Metcalf <cmetcalf <at> tilera.com>
>> ---
>> This version of the patch adds a generic no-op definition to
>> <linux/hugetlb.h> if "arch_make_huge_pte" is not #defined.  I'm following
>> Linus's model in https://lkml.org/lkml/2012/1/19/443 which says you create
>> the inline, then "#define func func" to indicate that the function exists.
>>

Fengguang Wu | 1 Apr 2012 10:30

Re: [PATCH 0/6] buffered write IO controller in balance_dirty_pages()

On Sun, Apr 01, 2012 at 09:46:06AM +0530, Suresh Jayaraman wrote:
> On 03/28/2012 05:43 PM, Fengguang Wu wrote:
> > Here is one possible solution to "buffered write IO controller", based on Linux
> > v3.3
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux.git  buffered-write-io-controller
> > 
> 
> The implementation looks unbelievably simple. I ran a few tests
> (throttling) and I found it working well generally.

Thanks for testing it out :)

> > Features:
> > - support blkio.weight
> > - support blkio.throttle.buffered_write_bps
> > 
> > Possibilities:
> > - it's trivial to support per-bdi .weight or .buffered_write_bps
> > 
> > Pros:
> > 1) simple
> > 2) virtually no space/time overheads
> > 3) independent of the block layer and IO schedulers, hence
> > 3.1) supports all filesystems/storages, eg. NFS/pNFS, CIFS, sshfs, ...
> > 3.2) supports all IO schedulers. One may use noop for SSDs, inside virtual machines, over iSCSI, etc.
> > 
> > Cons:
> > 1) don't try to smooth bursty IO submission in the flusher thread (*)
> > 2) don't support IOPS based throttling

Chris Metcalf | 1 Apr 2012 18:51

Re: [PATCH v2] hugetlb: fix race condition in hugetlb_fault()

On 4/1/2012 8:10 AM, Hillf Danton wrote:
> On Sat, Mar 31, 2012 at 4:07 AM, Chris Metcalf <cmetcalf <at> tilera.com> wrote:
>> The race is as follows.  Suppose a multi-threaded task forks a new
>> process, thus bumping up the ref count on all the pages.  While the fork
>> is occurring (and thus we have marked all the PTEs as read-only), another
>> thread in the original process tries to write to a huge page, taking an
>> access violation from the write-protect and calling hugetlb_cow().  Now,
>> suppose the fork() fails.  It will undo the COW and decrement the ref
>> count on the pages, so the ref count on the huge page drops back to 1.
>> Meanwhile hugetlb_cow() also decrements the ref count by one on the
>> original page, since the original address space doesn't need it any more,
>> having copied a new page to replace the original page.  This leaves the
>> ref count at zero, and when we call unlock_page(), we panic.
>>
>> The solution is to take an extra reference to the page while we are
>> holding the lock on it.
>>
> If the following chart matches the above description,
>
> [...]
>
> would you please spin with description refreshed?

Done, and thanks!  I added your timeline chart to my description; I figure
no harm in having it both ways.

>> Cc: stable <at> kernel.org
> Let Andrew do the stable work, ok?

Fair point.  I'm used to adding the Cc myself for things I push through the

Vivek Goyal | 1 Apr 2012 22:56

Re: [PATCH 0/6] buffered write IO controller in balance_dirty_pages()

On Wed, Mar 28, 2012 at 08:13:08PM +0800, Fengguang Wu wrote:
> 
> Here is one possible solution to "buffered write IO controller", based on Linux
> v3.3
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux.git  buffered-write-io-controller
> 
> Features:
> - support blkio.weight
> - support blkio.throttle.buffered_write_bps

Introducing a separate knob for buffered writes makes sense. It is different
from the throttling done at the block layer.

> 
> Possibilities:
> - it's trivial to support per-bdi .weight or .buffered_write_bps
> 
> Pros:
> 1) simple
> 2) virtually no space/time overheads
> 3) independent of the block layer and IO schedulers, hence
> 3.1) supports all filesystems/storages, eg. NFS/pNFS, CIFS, sshfs, ...
> 3.2) supports all IO schedulers. One may use noop for SSDs, inside virtual machines, over iSCSI, etc.
> 
> Cons:
> 1) don't try to smooth bursty IO submission in the flusher thread (*)

Yes, this is a core limitation of throttling while writing to cache. I think
once we had agreed that IO scheduler in general should be able to handle

Hillf Danton | 2 Apr 2012 04:21

Re: [PATCH v3] arch/tile: support multiple huge page sizes dynamically

On Mon, Apr 2, 2012 at 12:46 AM, Chris Metcalf <cmetcalf <at> tilera.com> wrote:
>
>  As it happens, I am the tile guru for this code :-)
>
I see.

>
> So does it make sense for me to push the two resulting changes through the
> tile tree?  I'd like to ask Linus to pull this stuff for 3.4 (I know, I'm
> late in the cycle for that), but obviously it's not much use without the
> part that you reviewed.
>
No more questions :)
-hd

Sergey Senozhatsky | 2 Apr 2012 08:35

[PATCH] kmemleak: do not leak object after tree insertion error

When tree insertion fails because the object already exists, the pointer
to the newly allocated object is lost when it is overwritten by the
lookup_object() result. Free the allocated object before the lookup happens.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky <at> gmail.com>

---

 mm/kmemleak.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 45eb621..d6eec2d 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -260,6 +260,7 @@ static struct early_log
 static int crt_early_log __initdata;

 static void kmemleak_disable(void);
+static void __delete_object(struct kmemleak_object *);

 /*
  * Print a warning and dump the stack trace.
@@ -576,6 +577,7 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
 	 * random memory blocks.
 	 */
 	if (node != &object->tree_node) {
+		__delete_object(object);
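The leak described in the changelog can be illustrated outside the kernel. This is a rough sketch only: a linked list stands in for the search tree, free() stands in for whatever release helper the real code uses, and every model_* name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tracked object; a list replaces the search tree. */
struct model_object {
	unsigned long ptr;
	struct model_object *next;
};

static struct model_object *model_head;
static int model_live;   /* allocations that have not been freed yet */

static struct model_object *model_lookup(unsigned long ptr)
{
	struct model_object *o;

	for (o = model_head; o; o = o->next)
		if (o->ptr == ptr)
			return o;
	return NULL;
}

int model_live_count(void)
{
	return model_live;
}

/* Track ptr; on a duplicate, free the new allocation and return the existing entry. */
struct model_object *model_create_object(unsigned long ptr)
{
	struct model_object *object = malloc(sizeof(*object));

	if (!object)
		return NULL;
	model_live++;
	object->ptr = ptr;

	if (model_lookup(ptr)) {
		/*
		 * Insertion would fail.  Free the new object first: the
		 * assignment below reuses the variable, and without the free
		 * the only pointer to the allocation would be lost.
		 */
		free(object);
		model_live--;
		object = model_lookup(ptr);
		return object;
	}

	object->next = model_head;
	model_head = object;
	return object;
}

/* Free every tracked object and check that nothing else is outstanding. */
void model_destroy_all(void)
{
	while (model_head) {
		struct model_object *next = model_head->next;

		free(model_head);
		model_live--;
		model_head = next;
	}
	assert(model_live == 0);
}
```

Creating the same address twice leaves exactly one live allocation, which is the property the patch aims to restore.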

Cyrill Gorcunov | 2 Apr 2012 11:54

Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE

On Mon, Apr 02, 2012 at 01:46:03PM +0400, Konstantin Khlebnikov wrote:
> Cyrill Gorcunov wrote:
> >On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> >>
> >>Add Cyrill. This conflicts with
> >>c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch in -mm.
> >
> >Thanks for CC'ing, Oleg. I think if this series goes in it won't
> >be a problem to update my patch accordingly.
> 
> In this patch I leave mm->exe_file lockless.
> After exec/fork we can change it only for current task and only if mm->mm_users == 1.
> 
> something like this:
> 
> task_lock(current);
> if (atomic_read(&current->mm->mm_users) == 1)
> 	set_mm_exe_file(current->mm, new_file);
> else
> 	ret = -EBUSY;
> task_unlock(current);
> 
> task_lock() protects this code against get_task_mm()

I see. Konstantin, the question is which way is more convenient for updating the
patch in linux-next. The c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch
is already in -mm, so either I wait until Andrew picks your series up and
send an updating patch on top, or I fetch your series, update my patch and
send it here as a reply. Hmm?

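The locking rule in Konstantin's quoted snippet can be tried out in user space. This is a sketch under assumptions: a pthread mutex stands in for task_lock()/task_unlock(), a C11 atomic stands in for mm->mm_users, a string stands in for the file pointer, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct model_mm {
	atomic_int mm_users;      /* number of users sharing this address space */
	const char *exe_file;     /* stands in for the struct file pointer */
};

struct model_task {
	pthread_mutex_t lock;     /* stands in for task_lock()/task_unlock() */
	struct model_mm *mm;
};

void model_task_init(struct model_task *task, struct model_mm *mm,
		     const char *exe_file)
{
	atomic_init(&mm->mm_users, 1);
	mm->exe_file = exe_file;
	pthread_mutex_init(&task->lock, NULL);
	task->mm = mm;
}

/* Models get_task_mm(): takes a new user reference under the task lock. */
struct model_mm *model_get_task_mm(struct model_task *task)
{
	struct model_mm *mm;

	pthread_mutex_lock(&task->lock);
	mm = task->mm;
	if (mm)
		atomic_fetch_add(&mm->mm_users, 1);
	pthread_mutex_unlock(&task->lock);
	return mm;
}

/* Change exe_file only while the caller is the sole user of the mm. */
int model_set_exe_file(struct model_task *task, const char *new_file)
{
	int ret = 0;

	assert(task->mm != NULL);
	pthread_mutex_lock(&task->lock);
	if (atomic_load(&task->mm->mm_users) == 1)
		task->mm->exe_file = new_file;
	else
		ret = -EBUSY;
	pthread_mutex_unlock(&task->lock);
	return ret;
}
```

Since both functions take the same lock, a new user cannot appear between the mm_users check and the assignment; once a second user exists, the update is refused with -EBUSY.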

