Andrew Gallatin | 6 Dec 2005 02:56

tuning TCP over 10GbE?


Are there any tuning tips for getting good endstation
performance out of a Solaris machine with a 10GbE nic?
With linux it seems to be something of an art form...

Also, what is the suggested way for a GLD driver to expose runtime
(rather than load-time) tuning parameters to the admin?  Specifically,
I'd like to be able to tune interrupt coalescing parameters on the
fly, without re-loading the driver.

Thanks,

Drew
kapil.sampath | 6 Dec 2005 09:23

Doubt in Trunking in Solaris 10

Hi,

I am configuring trunking for the built-in bge (Broadcom) interfaces in Solaris 10. The hardware is a Sun Fire V240.

I installed the Solaris 10 Update 1 patch on this box to get GLDv3 (Nemo) driver support.

bash-3.00# dladm create-aggr -d bge1 1
bash-3.00# dladm add-aggr -d bge2 1
bash-3.00# dladm show-aggr
key: 1 (0x0001) policy: L4      address: 0:3:ba:e0:12:10 (auto)
           device       address           speed           duplex  link    state
           bge1         0:3:ba:e0:12:10   0     Mbps    unknown unknown standby
           bge2         0:3:ba:e0:12:11   0     Mbps    unknown unknown standby

bash-3.00# ifconfig aggr1
aggr1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e0:12:10

bash-3.00# ifconfig aggr1 192.0.0.1/24 up
bash-3.00# ifconfig aggr1
aggr1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 192.0.0.1 netmask ffffff00 broadcast 192.0.0.255
        ether 0:3:ba:e0:12:10

bash-3.00# dladm show-aggr
key: 1 (0x0001) policy: L4      address: 0:3:ba:e0:12:10 (auto)
           device       address           speed           duplex  link    state
           bge1         0:3:ba:e0:12:10   0     Mbps    unknown down    standby
           bge2         0:3:ba:e0:12:11   0     Mbps    unknown down    standby

These are the steps I followed, but the speed is shown as 0 Mbps. I am aiming for 200 Mbps (100 for bge1 and 100 for bge2). Can you please help me resolve this issue?

Regards

Kapil Sampath

"Many of life's failures are people who did not realize how close they were to success when they gave up" - Thomas Edison


David.Edmondson | 6 Dec 2005 09:34

Re: Doubt in Trunking in Solaris 10

* kapil.sampath@... [20051206T082343]:
> bash-3.00# dladm show-aggr
> key: 1 (0x0001) policy: L4      address: 0:3:ba:e0:12:10 (auto)
>            device       address           speed           duplex  link    state
>            bge1         0:3:ba:e0:12:10   0     Mbps    unknown down    standby
>            bge2         0:3:ba:e0:12:11   0     Mbps    unknown down    standby

It sounds as though the cable isn't actually connected (the link state
is shown as "down").  Can you rule out cabling problems?
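
To make the check above easy to repeat, here is a minimal sketch that lists the aggregation ports whose link column is not "up". The `down_ports` helper is a hypothetical name, and the parsing assumes the exact column layout shown in this thread, which may differ in other releases. The sample input is copied from the report above.

```shell
#!/bin/sh
# List aggregation ports whose "link" column is anything other than "up".
# Assumes the column layout printed in this thread:
#   device  address  speed (two fields)  duplex  link  state
# down_ports is a hypothetical helper, not part of dladm.
down_ports() {
    awk '$1 ~ /^[a-z]+[0-9]+$/ && $(NF-1) != "up" { print $1 }'
}

# Sample input copied from the report above; on a live system use:
#   dladm show-aggr | down_ports
down_ports <<'EOF'
key: 1 (0x0001) policy: L4      address: 0:3:ba:e0:12:10 (auto)
           device       address           speed           duplex  link    state
           bge1         0:3:ba:e0:12:10   0     Mbps    unknown down    standby
           bge2         0:3:ba:e0:12:11   0     Mbps    unknown down    standby
EOF
```

With the sample input this prints bge1 and bge2, which matches the "down" link state quoted above. An empty result would suggest the cabling is fine and the problem lies elsewhere.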

dme.
-- 
David Edmondson, Solaris Engineering, Sun Microsystems.
Paul Durrant | 6 Dec 2005 09:46

Re: tuning TCP over 10GbE?

On 6 Dec 2005, at 01:56, Andrew Gallatin wrote:
>
> Also, what is the suggested way for a GLD driver to expose runtime
> (rather than load-time) tuning parameters to the admin?  Specifically,
> I'd like to be able to tune interrupt coalescing parameters on the
> fly, without re-loading the driver.

Maybe you should be looking at Nemo. Nemo incorporates feedback from
the IP stack to tune interrupt coalescing. In the meantime you could
hook the gldm_ioctl() entry point. GLD passes through ioctls that
it does not recognize, so you could have an application open your
driver and pass down configuration ioctls. You could even use <shudder>
ndd if you provide the appropriate ioctls (not that I'm recommending
this approach; it's pretty nasty, but it works).

Another, possibly more hackish, option would be to create a set
of kstats for your tunables and exploit the fact that the kstat
interface allows writes as well as reads.

If you're feeling adventurous, you could always add the long-missing
feature of writable dynamic driver properties to the
kernel driver framework and then use that ;-)

   Paul

--
Paul Durrant
Peter Memishian | 6 Dec 2005 10:08

Re: tuning TCP over 10GbE?


 > Maybe you should be looking at Nemo.

Indeed, though the Nemo interfaces are not yet stable for third-party use.
We're working on it -- but first we need to make a number of changes as
part of Clearview and other ongoing projects.

However, we are planning to add a first-class property interface to dladm
as part of upcoming wireless support, with the intent of later extending
this to other network interfaces.  This interface will be significantly
more admin-friendly than ndd -- especially because the dladm utility will
know the semantics associated with each property and thus will be able to
allow the administrator to interact with named enumerations and the like,
rather than piles of integers :-)

Stay tuned.
-- 
meem

Re: tuning TCP over 10GbE?

Andrew Gallatin wrote:

>Are there any tuning tips for getting good endstation
>performance out of a Solaris machine with a 10GbE nic?
>With linux it seems to be something of an art form...
>  
>
If the NIC supports jumbo frames, enabling them should give much better
performance.
For TCP, try tuning the following parameters with ndd:
tcp_xmit_hiwat, tcp_recv_hiwat
Also try adding the line "set ip:ip_squeue_fanout=1" to /etc/system.
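
As a rough sketch of how one might pick values for those two parameters (not a tested recipe): for a single connection to fill the pipe, the high-water marks generally need to be at least the bandwidth-delay product of the path. The snippet below computes that for a 10 Gb/s link and a hypothetical 1 ms round-trip time, then prints the ndd commands instead of running them. The RTT and the helper names are assumptions for illustration. The hiwat values are capped by tcp_max_buf, so that is likely to need raising first.

```shell
#!/bin/sh
# Assumed values for illustration only; measure the RTT of your own path.
LINK_BPS=10000000000   # 10 Gb/s link
RTT_US=1000            # hypothetical round-trip time: 1 ms

# Bandwidth-delay product in bytes: (bits/s / 8) * RTT in seconds.
# $1 = link speed in bits/s, $2 = RTT in microseconds
bdp_bytes() {
    echo $(( $1 / 8 * $2 / 1000000 ))
}

# Print (do not execute) the ndd commands for a given buffer size in bytes.
# Run the printed commands as root on the Solaris host once you are happy
# with the value.
print_ndd_cmds() {
    echo "ndd -set /dev/tcp tcp_max_buf $1"
    echo "ndd -set /dev/tcp tcp_xmit_hiwat $1"
    echo "ndd -set /dev/tcp tcp_recv_hiwat $1"
}

BUF=$(bdp_bytes "$LINK_BPS" "$RTT_US")
print_ndd_cmds "$BUF"
```

With these assumed numbers the buffer size works out to 1250000 bytes. Settings made with ndd do not persist across a reboot, so anything that proves useful has to be reapplied from a startup script.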

>Also, what is the suggested way for a GLD driver to expose runtime
>(rather than load-time) tuning parameters to the admin?  Specifically,
>I'd like to be able to tune interrupt coalescing parameters on the
>fly, without re-loading the driver.
>
>Thanks,
>
>Drew
>_______________________________________________
>networking-discuss mailing list
>networking-discuss@...
>  
>

Leonid Grossman | 6 Dec 2005 15:52

RE: tuning TCP over 10GbE?


> >Also, what is the suggested way for a GLD driver to expose runtime
> >(rather than load-time) tuning parameters to the admin?  Specifically,
> >I'd like to be able to tune interrupt coalescing parameters on the
> >fly, without re-loading the driver.

Hi Andrew,
At present, dynamic interrupt moderation is NIC-specific (assuming the
hardware supports it).
Which 10GbE cards are you using? For Xframe, the readme file on our
website should suffice, but we can give you more details on the
moderation keywords if needed. These keywords cover both time-based and
utilization-based moderation and can be changed on the fly. The ASIC
supports a separate moderation scheme for each tx/rx ring, but the
shipping Xframe driver doesn't implement multiple rings yet.
Cheers, Leonid

> >
> >Thanks,
> >
> >Drew
Andrew Gallatin | 6 Dec 2005 16:08

RE: tuning TCP over 10GbE?


Leonid Grossman writes:
 > 
 > > >Also, what is the suggested way for a GLD driver to expose runtime
 > > >(rather than load-time) tuning parameters to the admin?  Specifically,
 > > >I'd like to be able to tune interrupt coalescing parameters on the
 > > >fly, without re-loading the driver.
 > 
 > Hi Andrew,
 > At present, dynamic interrupt moderation is NIC-specific (assuming the
 > hardware supports it). 
 > What 10GbE cards you are using? For Xframe, the readme file is on our

I appreciate the response.  Actually, this is our own 10GbE PCI
Express nic, for which I'm finishing up a Solaris driver, so Xframe
tuning parameters will not help me.

Drew
Andrew Gallatin | 6 Dec 2005 16:20

Re: tuning TCP over 10GbE?


Peter Memishian writes:
 > 
 >  > Maybe you should be looking at Nemo.
 > 
 > Indeed, though the Nemo interfaces are not yet stable for third-party use.
 > We're working on it -- but first we need to make a number of changes as
 > part of Clearview and other ongoing projects.

Indeed.  Whatever we ship has to work on Solaris 10.  If there
is a patchset to Solaris 10 which adds Nemo support, then we can
think about it.

Drew
Leonid Grossman | 6 Dec 2005 16:33

RE: tuning TCP over 10GbE?


> 
> I appreciate the response.  Actually, this is our own 10GbE 
> PCI Express nic, for which I'm finishing up a Solaris driver, 
> so Xframe tuning parameters will not help me.

The idea stays the same: at present, interrupt moderation is
ASIC-specific.
Check your 10GbE hardware manual. If dynamic interrupt moderation is
supported by the hardware, you can add code to your Solaris driver to
use the feature; this is basically how our 10GbE Express NICs work
today.
Nemo will add a more generic framework, of course.

Leonid

> 
> Drew
> 
