Large file transfer failure with IPMP

I had an issue today where large file transfers via SCP from within the data center were failing intermittently to an IPMP NIC pair. We quickly found that the transfer always failed to one IP but not the other.

Half of the IPMP pair is running on a CE interface at gigabit.
The other half is on an ERI interface that only supports 100 megabit.

Both interfaces ping perfectly well and SSH works to both just fine.
From outside of the data center, large files transfer to either IP just fine.
However, from inside of the data center large file transfers stall at 152K.
Also the ERI NICs report errors.
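
One way to look at those errors is the per-interface counters from netstat -i. The following is a minimal sketch, assuming the usual Solaris column layout (Name, Mtu, Net/Dest, Address, Ipkts, Ierrs, Opkts, Oerrs, Collis, Queue). The function name nic_errors is made up for illustration and is not part of any standard tool.

```shell
#!/bin/sh
# nic_errors (hypothetical helper): read `netstat -i` output on stdin and
# print every interface whose input errors, output errors, or collisions
# are non-zero.
# Assumed columns: Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
nic_errors() {
    awk '
        NF < 9 || $1 == "Name" { next }      # skip blank lines and header rows
        ($6 + $8 + $9) > 0 {
            printf "%s ierrs=%s oerrs=%s collis=%s\n", $1, $6, $8, $9
        }
    '
}

# Usage on the affected host:
#   netstat -i | nic_errors
```

Counters that keep climbing on one interface during a large transfer, while the other interface stays at zero, point at that link rather than at the host or the application.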

Failed test:

$ scp /tmp/USERtestDELETEME 192.168.1.100:/tmp/USERtestDELETEME2
USERtestDELETEME   100% |******************************************************************************************************************************|   263 MB    00:10

$ scp /tmp/USERtestDELETEME 192.168.1.101:/tmp/USERtestDELETEME2
USERtestDELETEME     0% |                                                                                                                              |   152 KB  - stalled -^CKilled by signal 2.

Previous settings:

Network speed settings for HOSTNAME

NDD settings for ce Instance 0

adv_autoneg_cap 1
adv_1000fdx_cap 1
adv_1000hdx_cap 1
adv_100T4_cap 0
adv_100fdx_cap 1
adv_100hdx_cap 1
adv_10fdx_cap 1
adv_10hdx_cap 1

NDD settings for eri Instance 0

adv_autoneg_cap 0
adv_100T4_cap 0
adv_100fdx_cap 1
adv_100hdx_cap 0
adv_10fdx_cap 0
adv_10hdx_cap 0

Interface  Link  Speed  Duplex  Link Partner Autoneg  LP Setting
_________  ____  _____  ______  ____________________  __________
ce0        UP    1000   FULL    ENABLED               1000_FULL
ce1        UP    1000   FULL    ENABLED               1000_FULL
eri0       UP    100    FULL    DISABLED              100_FULL

Root Cause:
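My theory is that the switch ports are running at gigabit while the NIC is locked to 100 Mbit with auto negotiation turned off.
If the traffic comes in at under 100 Mbit then the NIC handles it fine, but if the traffic exceeds that level, data is corrupted, packets drop and the transfer stalls.
By turning on auto negotiation, the NIC is able to tell the switch port to slow down to 100 Mbit.
Since eri0 already reported a 100 Mbit link before the change, a duplex mismatch is another plausible explanation: a forced-speed NIC facing an autonegotiating switch port typically leaves the switch side at half duplex, which gives the same symptoms (pings and small transfers work, sustained traffic produces errors and stalls).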

Solution:
Turn on auto negotiation for the ERI0 NIC.

Changes made:

ndd -set /dev/eri instance 0
ndd -set /dev/eri adv_autoneg_cap 1
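
To confirm the change took effect, the same ndd interface can be read back. This is a rough sketch, not the script that produced the reports in this post. It assumes the eri driver exposes the read-only link_speed and link_mode parameters with the encodings noted in the comments (check your driver documentation), and the helper name eri_link_state and the NDD override variable are invented for this example.

```shell
#!/bin/sh
# eri_link_state (hypothetical helper): print autoneg, speed and duplex for
# one eri instance. NDD can be overridden to point at a different binary.
NDD=${NDD:-ndd}

eri_link_state() {
    inst=${1:-0}
    # Select the instance first; the -get calls below then apply to it.
    "$NDD" -set /dev/eri instance "$inst" || return 1
    autoneg=$("$NDD" -get /dev/eri adv_autoneg_cap) || return 1
    speed=$("$NDD" -get /dev/eri link_speed) || return 1
    mode=$("$NDD" -get /dev/eri link_mode) || return 1

    # Assumed encodings: link_speed 1 = 100 Mbit, 0 = 10 Mbit;
    #                    link_mode  1 = full duplex, 0 = half duplex.
    case $speed in 1) speed_txt=100 ;; 0) speed_txt=10 ;; *) speed_txt=unknown ;; esac
    case $mode in 1) mode_txt=full ;; 0) mode_txt=half ;; *) mode_txt=unknown ;; esac

    echo "eri$inst autoneg=$autoneg speed=$speed_txt duplex=$mode_txt"
}

# Usage (typically as root):
#   eri_link_state 0
```

Note that a value set with ndd -set only lasts until the next reboot, so the setting also needs to be made persistent by whatever mechanism your site uses for driver parameters.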

New Settings:

Network speed settings for HOSTNAME

NDD settings for ce Instance 0

adv_autoneg_cap 1
adv_1000fdx_cap 1
adv_1000hdx_cap 1
adv_100T4_cap 0
adv_100fdx_cap 1
adv_100hdx_cap 1
adv_10fdx_cap 1
adv_10hdx_cap 1

NDD settings for eri Instance 0

adv_autoneg_cap 1
adv_100T4_cap 0
adv_100fdx_cap 1
adv_100hdx_cap 0
adv_10fdx_cap 0
adv_10hdx_cap 0

Interface  Link  Speed  Duplex  Link Partner Autoneg  LP Setting
_________  ____  _____  ______  ____________________  __________
ce0        UP    1000   FULL    ENABLED               1000_FULL
ce1        UP    1000   FULL    ENABLED               1000_FULL
eri0       UP    100    FULL    ENABLED               100_FULL

Test Passes:

$ scp /tmp/USERtestDELETEME 192.168.1.100:/tmp/USERtestDELETEME2
USERtestDELETEME   100% |******************************************************************************************************************************|   263 MB    00:10

$ scp /tmp/USERtestDELETEME 192.168.1.101:/tmp/USERtestDELETEME2
USERtestDELETEME   100% |******************************************************************************************************************************|   263 MB    00:23
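
The two timings are also a quick sanity check on the link speeds. Dividing the file size by the elapsed time gives the average throughput. The sketch below assumes scp's "MB" means 10^6 bytes and ignores SSH overhead, so treat the results as approximate; the helper name mbit_per_sec is made up for this example.

```shell
#!/bin/sh
# mbit_per_sec (hypothetical helper): average throughput in Mbit/s
# for a transfer of $1 megabytes that took $2 seconds.
mbit_per_sec() {
    echo "$1 $2" | awk '{ printf "%.1f\n", $1 * 8 / $2 }'
}

mbit_per_sec 263 10   # transfer to 192.168.1.100, prints 210.4
mbit_per_sec 263 23   # transfer to 192.168.1.101, prints 91.5
```

Roughly 91 Mbit is about as much as a healthy 100 Mbit full-duplex link can carry, which fits the second address (presumably the ERI side) now running cleanly at 100 full.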