
Occasionally, when you provide managed services for clients and there are issues, fingers get pointed and accusations get made about the integrity of the network, particularly if the medium in question is fibre or another less common network medium (like wireless).

We host a client’s server on our premises for backup purposes, on the end of 2 km of multimode fibre connected to media converters at both ends. When the client was having issues with the speed and integrity of the network (packet loss and timeouts), it was necessary to do a little research, first to test and then to prove that the issue was not caused by the fibre link. Of course, with the aid of an OTDR it’s easy to demonstrate that the fibre does not show losses, but an OTDR is a very expensive piece of equipment to buy or rent, and it does not provide any throughput data showing whether the endpoints are behaving as they should. As the client’s IT support were only testing from one Windows box to another and only using ping to illustrate the issue, it was necessary to do a little more digging.

I put together a short test plan:

1) A standard ping test.
2) A short packet capture while running a standard ping test.
3) An isolated packet capture to ascertain whether there are any obvious network issues (excessive ARP, retransmissions, etc.).
4) A flood ping test.
5) A bi-directional iperf test to measure the bandwidth and throughput of the fibre link through one of the client’s network switches.
6) A bi-directional iperf test to measure the bandwidth and throughput of the fibre link directly from the media converter.

As the ping test yielded no unusual results and the packet capture of it (tcpdump -i eth0 -s0 -w pingtest.pcap) didn’t show anything unusual either, I ran a flood ping back to my box. Note that it’s fairly important to only flood ping boxes that are capable of handling more traffic than you can generate (see Wikipedia).

#ping 10.202.4.130 -f

--- 10.202.4.130 ping statistics ---
11011 packets transmitted, 11010 received, 0% packet loss, time 1725ms
rtt min/avg/max/mdev = 0.129/0.133/5.692/0.055 ms, ipg/ewma 0.156/0.132 ms

Again, this illustrated that even with a massive burst of traffic in a short space of time, there was effectively no loss in transmission (one unanswered packet out of 11011).
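The summary line prints loss as a whole-number percentage, so small losses can hide behind "0%". As a minimal sketch (Python, with hypothetical helper names, and a regular expression that assumes the iputils-style summary line shown above), the exact figures can be recomputed from the raw counters:

```python
import re

# Summary line copied from the flood ping output above.
SUMMARY = "11011 packets transmitted, 11010 received, 0% packet loss, time 1725ms"


def parse_ping_summary(line):
    """Return (transmitted, received, elapsed_ms) from a ping summary line.

    Assumes the "N packets transmitted, M received, ... time Tms" layout.
    """
    match = re.search(
        r"(\d+) packets transmitted, (\d+) received.*?time (\d+)ms", line
    )
    if match is None:
        raise ValueError("unrecognised ping summary line")
    return tuple(int(group) for group in match.groups())


def loss_percent(transmitted, received):
    """Exact packet loss as a percentage of packets sent."""
    return 100.0 * (transmitted - received) / transmitted


tx, rx, elapsed_ms = parse_ping_summary(SUMMARY)
print(f"lost {tx - rx} of {tx} packets ({loss_percent(tx, rx):.4f}%)")
print(f"send rate: {tx / (elapsed_ms / 1000):.0f} packets/sec")
```

For the run above this works out to less than 0.01% loss, which is why the tool reports it as 0%.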

Next it was necessary to run test #5.  Iperf was installed on the remote side and on my laptop, so I started the server on the remote side using:

#iperf -s

and started the client side on my laptop using:

#iperf -c 10.202.4.130 -r

The results showed slow throughput in both directions:

[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  18.8 MBytes  15.8 Mbits/sec

This looked likely to be the cause of the problem, as the fibre link should have been running at 100 Mbps. The next step was to take the client’s switch out of the path, so I connected directly to the media converter and ran the test again:

#iperf -c 10.202.4.130 -r
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  112 MBytes   94.2 Mbits/sec

A much improved result! I ran the test again to verify the findings, then plugged into an alternative switch port on the client’s side and ran it once more. This time I got the 94 Mbps I was hoping to see, showing that the issue was with the original switch port rather than the fibre, and most likely caused by rate-limiting on that port.
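As a sanity check on the iperf figures, the reported bandwidth can be recomputed from the transfer size, and the roughly 94 Mbits/sec result can be compared with what TCP can carry at best over 100 Mbps Ethernet. The sketch below is illustrative Python with hypothetical function names. It assumes that iperf counts a MByte as 1024 x 1024 bytes and a Mbit as 10^6 bits, a standard 1500-byte MTU, and 12 bytes of TCP options (timestamps) per segment; none of these is stated in the results above.

```python
# Assumption: iperf reports MBytes in units of 1024*1024 bytes
# and Mbits in units of 10**6 bits.
BYTES_PER_MBYTE = 1024 * 1024

# Per-frame Ethernet overhead on the wire, in bytes:
# 14 header + 4 FCS + 8 preamble + 12 inter-frame gap.
ETHERNET_OVERHEAD = 38


def mbits_per_sec(transfer_mbytes, seconds):
    """Convert an iperf transfer size and interval into Mbits/sec."""
    return transfer_mbytes * BYTES_PER_MBYTE * 8 / seconds / 1e6


def max_tcp_goodput_mbps(line_rate_mbps, mtu=1500, tcp_options=12):
    """Upper bound on TCP payload throughput for full-sized frames.

    Subtracts 20 bytes of IPv4 header, 20 bytes of TCP header and the
    assumed TCP option bytes from each MTU-sized packet.
    """
    payload = mtu - 20 - 20 - tcp_options
    on_wire = mtu + ETHERNET_OVERHEAD
    return line_rate_mbps * payload / on_wire


for label, mbytes in (("via switch", 18.8), ("direct", 112)):
    print(f"{label}: {mbits_per_sec(mbytes, 10.0):.1f} Mbits/sec")

print(f"best case on 100 Mbps: {max_tcp_goodput_mbps(100):.1f} Mbits/sec")
```

Under those assumptions the first result works out to about 15.8 Mbits/sec, matching the report, and the best case is about 94.1 Mbits/sec, so the direct result is essentially line rate once protocol overhead is accounted for.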

Sometimes a simple ping is not enough to thoroughly test a network, and other tools need to be used to verify findings. iperf is excellent for providing a tangible measurement of throughput, and tcpdump and Wireshark are useful for looking for packet retransmissions, excessive ARP and other clues to performance issues.

One Comment

  1. Hi,

    I am facing issues with iperf: when I run the device under test as a UDP streamer and iperf as the server, I get the following error at the server end:

    [root@localhost src]# iperf -s -u -i 1
    ------------------------------------------------------------
    Server listening on UDP port 5001
    Receiving 1470 byte datagrams
    UDP buffer size: 108 KByte (default)
    ------------------------------------------------------------
    terminate called after throwing an instance of ‘std::bad_alloc’
    what(): St9bad_alloc
    Aborted

    Could you throw some light on what might be going wrong here?

    Thank you,

    Suneel.

