Discussion:
Link aggregation ignoring policy
Sašo Kiselkov
2011-05-17 21:29:48 UTC
I'm having trouble getting my server (Sun Fire X2250) to spread transmit
traffic across an aggregated link created from its two on-board
interfaces. Below is the configuration of the aggregation:

# dladm show-aggr
LINK            POLICY   ADDRPOLICY           LACPACTIVITY  LACPTIMER   FLAGS
aggr1           L2,L3,L4 auto                 active        short       -----

# dladm show-aggr -x aggr1
LINK        PORT           SPEED DUPLEX   STATE     ADDRESS            PORTSTATE
aggr1       --             1000Mb full    up        0:23:8b:ce:45:c2   --
            e1000g0        1000Mb full    up        0:23:8b:ce:45:c2   attached
            e1000g1        1000Mb full    up        0:23:8b:ce:45:c3   attached

The other end is an Extreme Networks X450a switch stack, and everything is
configured correctly on its end of the link aggregation. I'm streaming
video over this link to a few dozen clients over UDP, and given the
transmit policy, I should see at least some transmit utilization on
both links. However, for some reason, the system doesn't load-balance
at all, instead sending almost everything (99.9%) through e1000g0.
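
To make the expectation concrete, here is a toy model of hash-based port selection. It is only a sketch: the function names pick_port and spread are made up for illustration, the port numbers are invented, and the XOR-modulo hash is an assumption, not the hash the aggr driver actually uses.

```shell
# Toy model only: NOT the real aggr driver hash.
# pick_port SRC_PORT DST_PORT NPORTS -> index of the member link chosen
pick_port() {
    echo $(( ($1 ^ $2) % $3 ))
}

# spread FLOW... -> "<flows on link 0> <flows on link 1>"
# Each FLOW is written as "src_port:dst_port" (hypothetical values).
spread() {
    c0=0
    c1=0
    for flow in "$@"; do
        src=${flow%%:*}
        dst=${flow##*:}
        if [ "$(pick_port "$src" "$dst" 2)" -eq 0 ]; then
            c0=$((c0 + 1))
        else
            c1=$((c1 + 1))
        fi
    done
    echo "$c0 $c1"
}
```

With mixed destination ports, "spread 5004:40000 5004:40001 5004:40002 5004:40003" gives an even split in this model, while a degenerate pattern such as "spread 5004:40000 5004:40002 5004:40004" puts every flow on link 0. That only shows how a port pattern can defeat a simple hash; it is not a diagnosis of this system.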

Can somebody please help me find out why the server is sending
everything through the first link and ignoring the second? The switch is
also streaming some data back to me, and there I can see a near-perfect
50/50 split across the two links, so the aggregation itself appears to
be configured fine...

Regards,
--
Saso
Giovanni Tirloni
2011-05-18 15:07:22 UTC
Have you experimented with L2,L3 or L2 policies?

Does it send anything "useful" through the almost idle link? If so,
what is it exactly?

Do you have direct access to the clients or is it going through a
proxy? Can you send just a few packet samples from the busy link?
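
When comparing policies, it helps to reduce each test run to a single number. The helper below is a minimal sketch: tx_share is a hypothetical name, and it assumes you have already read the outbound packet counters for the two member links (for example from the OPACKETS column of "dladm show-link -s", if your release provides it).

```shell
# tx_share PKTS_LINK0 PKTS_LINK1 -> "<percent on link 0> <percent on link 1>"
# Percentages are truncated to whole numbers by integer division.
tx_share() {
    total=$(( $1 + $2 ))
    if [ "$total" -eq 0 ]; then
        # No traffic at all; avoid dividing by zero.
        echo "0 0"
        return
    fi
    echo "$(( $1 * 100 / total )) $(( $2 * 100 / total ))"
}
```

For example, "tx_share 9990 10" prints "99 0". The policy itself can usually be switched with something like "dladm modify-aggr -P L2,L3 aggr1", but check dladm(1M) on your release for the exact syntax before relying on it.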
--
Giovanni Tirloni
Sašo Kiselkov
2011-05-18 15:34:46 UTC
My application is a video streaming server. Clients connect to it using
TCP for control (RTSP), and the server then streams video data to them in
a separate UDP stream. On the server side, a new process is spawned for
each stream, so each stream originates from a different socket. It
appears that the only traffic going out through the idle link is the
TCP portion of the load (which is relatively minuscule, a few dozen
packets per second at most), whereas all of the UDP traffic goes through
one interface (> 10k packets per second).

Initially I only had the L4 policy on it. Then I tried many different
combinations, but none helped.

The system (including the switch-side) is completely under my control.
I've attached a sample of the packet headers flying over the high-volume
interface.

Regards,
--
Saso
Sašo Kiselkov
2011-05-22 20:03:48 UTC
Sorry to reply to myself, but so far I still haven't discovered why my
aggr link always sends all data through a single link. I tried removing
the idle link from the aggregation and adding it back, but to no avail.
Upon putting the idle link back into the aggregation I noticed the
following in dmesg:

May 22 21:56:20 ba-sitel-iptv-mon-01 unix: [ID 665567 kern.warning] WARNING: kstat_create('aggr906001', 0, 'mac_tx_hwlane1'): namespace collision
May 22 21:56:20 ba-sitel-iptv-mon-01 unix: [ID 665567 kern.warning] WARNING: kstat_create('aggr901001', 0, 'mac_tx_hwlane1'): namespace collision
May 22 21:56:20 ba-sitel-iptv-mon-01 unix: [ID 665567 kern.warning] WARNING: kstat_create('aggr930001', 0, 'mac_tx_hwlane1'): namespace collision
May 22 21:56:20 ba-sitel-iptv-mon-01 unix: [ID 665567 kern.warning] WARNING: kstat_create('aggr932001', 0, 'mac_tx_hwlane1'): namespace collision
May 22 21:56:23 ba-sitel-iptv-mon-01 mac: [ID 435574 kern.info] NOTICE: e1000g1 link up, 1000 Mbps, full duplex

VLAN 930 is the one where all of the streaming takes place, so could it
be that the OS is having trouble assigning the VLAN to the physical
interface?
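
For reference, the names in those warnings appear to follow the traditional Solaris VLAN PPA convention (VLAN ID * 1000 + device instance), which is how aggr930001 maps to VLAN 930 on aggr1. A small sketch of that decoding, assuming the convention holds here (decode_vlan_ppa is a made-up helper name):

```shell
# decode_vlan_ppa NAME -> "<driver><instance> vlan <vid>"
# Assumes NAME is <driver><ppa> with ppa = VLAN ID * 1000 + instance.
decode_vlan_ppa() {
    name=$1
    # Strip the trailing digits to get the driver name (e.g. "aggr").
    driver=$(printf '%s' "$name" | sed 's/[0-9]*$//')
    # What remains after the driver name is the numeric PPA.
    ppa=${name#"$driver"}
    echo "${driver}$(( ppa % 1000 )) vlan $(( ppa / 1000 ))"
}
```

Running "decode_vlan_ppa aggr930001" prints "aggr1 vlan 930", so the four warnings would correspond to VLANs 906, 901, 930 and 932 on aggr1.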

BR,
--
Saso
