Eric W. Biederman | 2 Mar 2008 03:03

Re: [PATCH 0/2] Fix /proc/net in presence of net namespaces


- The experience from vserver, planetlab and OpenVZ is that it is good
  to be able to monitor processes in other namespaces.  

- The Linux experience says filesystems are a good way to do that.

- So we really want filesystem monitoring interfaces to depend on
  the filesystem mount options instead of current.

- Starting with making /proc and sysctls depend on current is a cheap
  way to get things up and going.

- When I consider breaking things up into multiple filesystems I run
  across the occasional file that depends on multiple namespaces:
  uids in /proc/sysvipc/* for example.  Luckily I have yet to find
  any directory structures that depend on more than one namespace.

  Maybe that can be handled properly by capturing multiple 
  namespaces at mount time but I am a bit leery of that.

- The visibility of namespaces should match the visibility of the
  processes that use them.  Access control can of course be more
  restricted.

- We want to see how namespaces connect to tasks.

Therefore.

/proc/net, /proc/sys, /proc/sysvipc, and probably a few others
should migrate under /proc/<pid>/task/<tid> (not under /proc/<pid>

Li Zefan | 3 Mar 2008 08:23

Re: [RFC] Prefixing cgroup generic control filenames with "cgroup."

Xpl++ wrote:
> Paul Menage wrote:
>> ...
>>
>> A compromise might be to keep "tasks" unprefixed, and say that future
>> names get the "cgroup." prefix; in this case I'd be inclined to add
>> the prefix to notify_on_release and release_agent on the grounds that
>> there's much less chance of breaking anyone with those files since (I
>> suspect) they're much less used.
>>   
> This makes most sense to me. It won't break any existing software 
> (most likely) while it seems reasonable to leave 'tasks' unprefixed as 
> this is something that any software using any subsystem of cgroup 
> would be using anyway and it is not that much associated with a 
> particular subsystem.
>

It makes the most sense to me too, though I still doubt that name
collisions will be a problem.

Paul Menage | 3 Mar 2008 09:38

Re: [RFC] Prefixing cgroup generic control filenames with "cgroup."

On Thu, Feb 28, 2008 at 2:06 PM, Paul Menage <menage <at> google.com> wrote:
> On Thu, Feb 28, 2008 at 1:33 PM,  <serge <at> hallyn.com> wrote:
>  >
>  >  You said the set of files belonging to cgroup itself is likely to increase
>  >  - do you have some candidates in mind?
>
>  Nothing concrete right now. One example that I already proposed was
>  the "cgroup.api" file but that's shelved for now, until such time as I
>  actually propose the binary API that it was intended to help support.
>

One likely new file that people agreed a while ago could be useful
would be a "procs" file, similar to "tasks", but acting (and
reporting) on thread groups rather than individual tasks.
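
To illustrate the difference between the two views, here is a minimal
sketch, assuming we already have a mapping from each thread ID (what
"tasks" reports) to its thread group ID (what a "procs" file would
report). The helper name and the input mapping are hypothetical; this
is not kernel code, only the grouping idea.

```python
from collections import defaultdict


def procs_from_tasks(tid_to_tgid):
    """Collapse a per-thread listing into a per-thread-group listing.

    tid_to_tgid: dict mapping thread ID -> thread group ID (hypothetical
    input, e.g. gathered by the caller from elsewhere).
    Returns a dict mapping each thread group ID to its sorted thread IDs.
    """
    groups = defaultdict(list)
    for tid, tgid in tid_to_tgid.items():
        groups[tgid].append(tid)
    # Sort both levels so the listing is stable.
    return {tgid: sorted(tids) for tgid, tids in sorted(groups.items())}
```

A "tasks"-style view would list every key of the input, while a
"procs"-style view would list only the keys of the result.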

Paul
Paul Menage | 3 Mar 2008 10:11

Re: [RFC] Prefixing cgroup generic control filenames with "cgroup."

So, there have been various options suggested over the course of this thread:

--

1) no code changes, just stake out all names matching a certain regexp
(e.g. "[a-z].*") as being potentially used by the kernel in the
future; document this, and let users who are worried about name
clashes avoid these names

pros: no work involved, avoids potentially complex changes to solve a
possibly non-problem.

cons: leaves an intermingled namespace; since this would be a
convention rather than an enforced rule, users might be unaware that
they're setting themselves up for a fall

--

2) separate out the kernel-generated names and user-generated names by
putting the user-generated names in a "groups" sub-directory (can be a
mount option that's automatically disabled for cpusets).

pros: completely solves problem of intermingled namespaces; makes it
easier to see sub-groups at a glance

cons: extra code, slightly more awkward to deal with in the general
case, is incompatible with the code that was in mainline in the brief
period of time since 2.6.24 was finalized.
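
As a rough sketch of what options 1 and 2 would mean for a userspace
tool: the regexp below is the example pattern quoted in option 1, and
the "groups" layout follows option 2. The function names and the mount
point are made up for illustration, and option 2 is shown for a single
level of grouping only.

```python
import re

# Option 1: names matching the example pattern are treated as
# potentially reserved for the kernel.
RESERVED_PATTERN = re.compile(r"[a-z].*")


def is_reserved(name):
    """Return True if a group name could clash with a future kernel file."""
    return RESERVED_PATTERN.fullmatch(name) is not None


# Option 2: user-created groups live in a "groups" sub-directory, so
# they can never collide with kernel-generated control files.
def group_path(mount_point, group_name):
    """Build the path of a user-created group under a hypothetical mount."""
    return mount_point.rstrip("/") + "/groups/" + group_name
```

Under option 1 a worried user would pick a name such as "Batch" or
"_jobs", which the pattern does not match; under option 2 any name is
safe because it sits below "groups".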

--

Balbir Singh | 3 Mar 2008 10:59

Re: [RFC] Prefixing cgroup generic control filenames with "cgroup."

Paul Menage wrote:
> On Thu, Feb 28, 2008 at 2:06 PM, Paul Menage <menage <at> google.com> wrote:
>> On Thu, Feb 28, 2008 at 1:33 PM,  <serge <at> hallyn.com> wrote:
>>  >
>>  >  You said the set of files belonging to cgroup itself is likely to increase
>>  >  - do you have some candidates in mind?
>>
>>  Nothing concrete right now. One example that I already proposed was
>>  the "cgroup.api" file but that's shelved for now, until such time as I
>>  actually propose the binary API that it was intended to help support.
>>
> 
> One likely new file that people agreed a while ago could be useful
> would be a "procs" file, similar to "tasks", but acting (and
> reporting) on thread groups rather than individual tasks.
> 
> Paul

Yes, I remember this. This feature would be extremely useful.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
Daniel Lezcano | 3 Mar 2008 15:20

network namespace ipv6 perfs

Hi,

Benjamin ran some performance tests to measure the impact of the
network namespace. The good news is that there is no impact:
performance is the same with or without namespaces. This was checked
using a real network device inside a network namespace.

These results are consistent with the ones previously made for ipv4.

http://lxc.sourceforge.net/network/bench_ipv6_graph.php

Thanks to Benjamin who did all the performance tests :)

Regards
	-- Daniel


Benjamin Thery | 3 Mar 2008 15:42

Re: network namespace ipv6 perfs

Daniel Lezcano wrote:
> Hi,
> 
> Benjamin ran some performance tests to measure the impact of the
> network namespace. The good news is that there is no impact:
> performance is the same with or without namespaces. This was checked
> using a real network device inside a network namespace.
> 
> These results are consistent with the ones previously made for ipv4.
> 
> http://lxc.sourceforge.net/network/bench_ipv6_graph.php
> 
> Thanks to Benjamin who did all the performance tests :)

One thing in these results may need explaining: the CPU utilization
overhead in the 'veth' case.

Compared to physical devices or macvlan, veth interfaces don't benefit
from hardware offloading mechanisms, i.e. checksums have to be computed
in software. That explains the big overhead in CPU utilization when
using this kind of virtual interface.
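
To give an idea of the per-packet work involved, here is a minimal
sketch of the 16-bit ones' complement Internet checksum (as described
in RFC 1071) over a plain byte buffer. It is only an illustration of
what "computed in software" means; it is not the kernel's
implementation, and it ignores the pseudo-header that TCP and UDP
checksums also cover.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data (RFC 1071)."""
    if len(data) % 2:
        # Pad odd-length input with a zero byte, as the RFC specifies.
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        # Add each big-endian 16-bit word.
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold carries back into the low 16 bits (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Every byte of every packet has to pass through a loop like this on the
CPU when the device cannot do it in hardware.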

Benjamin

> 
> Regards
>     -- Daniel
> 
> 
> 

Benjamin Thery | 3 Mar 2008 15:48

Re: network namespace ipv6 perfs

One more thing about these results: the kernel.
The version used to run these tests was 2.6.25-rc1 from Dave Miller's
net-2.6 tree.

(and I included results from a vanilla 2.6.23.16 as reference)

Benjamin

Daniel Lezcano wrote:
> Hi,
> 
> Benjamin ran some performance tests to measure the impact of the
> network namespace. The good news is that there is no impact:
> performance is the same with or without namespaces. This was checked
> using a real network device inside a network namespace.
> 
> These results are consistent with the ones previously made for ipv4.
> 
> http://lxc.sourceforge.net/network/bench_ipv6_graph.php
> 
> Thanks to Benjamin who did all the performance tests :)
> 
> Regards
>     -- Daniel
> 
> 
> 
> 
> 
> 

Pavel Emelyanov | 3 Mar 2008 15:55

Re: [Devel] Re: network namespace ipv6 perfs

Benjamin Thery wrote:
> Daniel Lezcano wrote:
>> Hi,
>>
>> Benjamin ran some performance tests to measure the impact of the
>> network namespace. The good news is that there is no impact:
>> performance is the same with or without namespaces. This was checked
>> using a real network device inside a network namespace.
>>
>> These results are consistent with the ones previously made for ipv4.
>>
>> http://lxc.sourceforge.net/network/bench_ipv6_graph.php
>>
>> Thanks to Benjamin who did all the performance tests :)
> 
> One thing in these results may need explaining: the CPU utilization
> overhead in the 'veth' case.
> 
> Compared to physical devices or macvlan, veth interfaces don't benefit
> from hardware offloading mechanisms, i.e. checksums have to be computed
> in software. That explains the big overhead in CPU utilization when

You can tune the veth devices not to compute checksums when unnecessary.

> using this kind of virtual interface.
> 
> Benjamin
> 
>> Regards
>>     -- Daniel

Benjamin Thery | 3 Mar 2008 16:04

Re: [Devel] Re: network namespace ipv6 perfs

On Mon, Mar 3, 2008 at 3:55 PM, Pavel Emelyanov <xemul <at> openvz.org> wrote:
> Benjamin Thery wrote:
>  > Daniel Lezcano wrote:
>  >> Hi,
>  >>
>  >> Benjamin ran some performance tests to measure the impact of the
>  >> network namespace. The good news is that there is no impact:
>  >> performance is the same with or without namespaces. This was checked
>  >> using a real network device inside a network namespace.
>  >>
>  >> These results are consistent with the ones previously made for ipv4.
>  >>
>  >> http://lxc.sourceforge.net/network/bench_ipv6_graph.php
>  >>
>  >> Thanks to Benjamin who did all the performance tests :)
>  >
>  > One thing in these results may need explaining: the CPU utilization
>  > overhead in the 'veth' case.
>  >
>  > Compared to physical devices or macvlan, veth interfaces don't benefit
>  > from hardware offloading mechanisms, i.e. checksums have to be computed
>  > in software. That explains the big overhead in CPU utilization when
>
>  You can tune the veth devices not to compute checksums when unnecessary.

Oh. This is interesting.

You mean with ethtool -K rx/tx?
I will give it a try.
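
For reference, a small sketch of how such a test run could drive
ethtool from a script. It only builds and runs the standard
"ethtool -K <dev> rx on|off tx on|off" command; the device name
"veth0" and both helper names are assumptions, and which on/off
combination actually gives the effect Pavel describes is left open
here.

```python
import subprocess


def checksum_offload_cmd(dev, rx_on, tx_on):
    """Build the ethtool argv that sets rx/tx checksum offload on dev."""

    def state(on):
        return "on" if on else "off"

    return ["ethtool", "-K", dev, "rx", state(rx_on), "tx", state(tx_on)]


def apply_checksum_offload(dev, rx_on, tx_on):
    """Run ethtool; needs root and an existing device (e.g. "veth0")."""
    subprocess.run(checksum_offload_cmd(dev, rx_on, tx_on), check=True)
```

Building the argument list separately makes it easy to log exactly
which settings were used for each benchmark run.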


