Aug 24

On standard Ethernet a TCP packet is normally up to 1500 bytes, 40 bytes of that being header information (20 for IP, 20 for TCP) and 1460 of it being data. Trouble is, sometimes you don’t have 1460 bytes of data to send. In a telnet session, for example, a keystroke is 1 byte, but we still need to send an entire packet for that single byte, meaning 41 bytes on the wire. This isn’t efficient, and too many small packets like this can bring a firewall to its knees. A solution was put in place years ago called Nagle’s algorithm, which holds small writes back until the data already in flight has been acknowledged or a full-sized packet can be assembled. Good for general use, not good for an iSCSI SAN, where every held-back write is added latency on an I/O. Here are some worthless and misleading numbers.
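To put a number on how wasteful those tiny packets are, here is a quick sketch of the arithmetic. It assumes a flat 40-byte header (20 IP + 20 TCP, no options) and ignores Ethernet framing; the function name payload_efficiency is made up for illustration.

```shell
#!/bin/sh
# Percentage of an on-the-wire packet that is actual payload, assuming a
# fixed 40-byte header (20 IP + 20 TCP, no options). Integer math, so the
# result is truncated to a whole percent.
payload_efficiency() {
    payload=$1
    header=40
    echo $(( payload * 100 / (payload + header) ))
}

echo "1-byte keystroke:       $(payload_efficiency 1)% payload"
echo "full 1460-byte segment: $(payload_efficiency 1460)% payload"
```

A lone keystroke works out to roughly 2% useful data, while a full segment is about 97%, which is the gap Nagle is trying to close.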

Test setup: a single SATA drive in a P4 running OpenSolaris Build 91, shared via iSCSI on a dedicated gigabit network with the MTU at 1500.

while true; do dd if=/dev/zero of=test.img bs=1024 count=100000; rm test.img; done
102400000 bytes (102 MB) copied, 3.84045 s, 26.7 MB/s
102400000 bytes (102 MB) copied, 4.24423 s, 24.1 MB/s

Now with Nagle disabled

while true; do dd if=/dev/zero of=test.img bs=1024 count=100000; rm test.img; done
102400000 bytes (102 MB) copied, 2.13878 s, 47.9 MB/s
102400000 bytes (102 MB) copied, 2.14086 s, 47.8 MB/s
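If you want to sanity check the dd figures or work out the speedup yourself, something like the following would do it. This is only a sketch: the helper name mbps is hypothetical, it assumes dd's convention of 1 MB = 10^6 bytes, and it only looks at the first run from each test above.

```shell
#!/bin/sh
# Throughput in MB/s (1 MB = 10^6 bytes, the same convention dd uses).
# Arguments: bytes copied, elapsed seconds.
mbps() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

before=$(mbps 102400000 3.84045)   # first run with Nagle enabled
after=$(mbps 102400000 2.13878)    # first run with Nagle disabled

echo "before: $before MB/s, after: $after MB/s"
# Ratio of the two rounded throughput figures.
awk -v b="$before" -v a="$after" 'BEGIN { printf "speedup: %.2fx\n", a / b }'
```

On those two runs that comes out to a bit under 1.8 times the throughput, from two data points, so treat it accordingly.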

For the statisticians among you, I did several more tests and used other utilities like cp, but this is a blog and blogs are quick so work with me here.

A bug was opened with the OpenSolaris folks (6621560), but in the meantime we can take care of this ourselves with one little command. To check the status of Nagle, run the following as root:

ndd -get /dev/tcp tcp_naglim_def

If that comes back with anything other than 1, Nagle is kicking you in the shorts. So let’s fix that.

ndd -set /dev/tcp tcp_naglim_def 1

Tada. Things should be a fair bit speedier now. Unfortunately, this is system-wide and may impact things like web servers, but if you are running Apache on your SAN you pretty much get what you deserve. Enjoy!
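As far as I know, ndd settings do not survive a reboot, so you may want to wrap the check and the fix in a small script you can rerun (or call at boot). Here is a minimal sketch using only the two ndd commands above. The function name nagle_needs_fix is my own invention, and the script simply does nothing on a box without ndd.

```shell
#!/bin/sh
# Decide whether Nagle still needs disabling, given the current
# tcp_naglim_def value. A value of 1 means Nagle is effectively off.
nagle_needs_fix() {
    [ "$1" != "1" ]
}

# Only touch the system if ndd exists (i.e. on Solaris/OpenSolaris).
# Must be run as root for the -set to succeed.
if command -v ndd >/dev/null 2>&1; then
    current=$(ndd -get /dev/tcp tcp_naglim_def)
    if nagle_needs_fix "$current"; then
        echo "tcp_naglim_def is $current, setting it to 1"
        ndd -set /dev/tcp tcp_naglim_def 1
    else
        echo "Nagle is already disabled"
    fi
else
    echo "ndd not found, nothing to do"
fi
```

Checking first means the script is safe to run repeatedly and tells you whether anything actually changed.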
