Handling TCP keep-alive

Keep-alives are useful in scenarios where either end of a TCP connection disappears without closing the session.

The following script in Python demonstrates sending a keep-alive message when there is no data activity for 60 seconds. If there is no response, 4 additional keep-alive messages are sent at intervals of 15 seconds. If none get a response, the connection is aborted.
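
A minimal sketch of such a script, assuming Linux and Python 3 (the tracebacks in this post come from a Python 2 run). It assumes that the 60-second idle time, the 15-second interval and the count of 4 map onto TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT. The helper name configure_keepalive, the 5-second receive timeout and the placeholder address are illustrative choices, not taken from the original script.

```python
import socket

HOST = "192.168.1.10"  # assumption: placeholder address, edit for your network
PORT = 8001            # same port as the netcat test listener


def configure_keepalive(sock, idle=60, interval=15, count=4):
    """Enable TCP keep-alive and tune its timers (Linux option names)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of silence before the first keep-alive probe is sent.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between probes while they go unanswered.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Unanswered probes tolerated before the kernel aborts the connection.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


def do_work(host, port, recv_timeout=5.0):
    """Read from the peer until it closes or the connection fails.

    Returns "closed" on an orderly close and "error" on a socket error.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        configure_keepalive(sock)
        sock.connect((host, port))
        sock.settimeout(recv_timeout)
        while True:
            try:
                req = sock.recv(10)
            except socket.timeout as err:
                if err.errno is None:
                    # Python's own receive timeout: no data yet, poll again.
                    continue
                # errno is set (ETIMEDOUT): the kernel gave up on the peer.
                print("Connection timed out:", err)
                return "error"
            except OSError as err:
                print("Other socket error:", err)
                return "error"
            if req == b"":
                print("Connection closed by peer")
                return "closed"
            print("Received:", req)
```

Calling do_work(HOST, PORT) starts the receive loop. The errno check matters on newer Python 3 releases, where socket.timeout and the kernel's "Connection timed out" error can be the same exception class.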

Edit the IP address and port to whatever works on your network.

Create a test TCP listener/server using netcat (nc on OS X) on the machine whose IP address is specified in the script:

netcat -l 8001
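
If netcat is not available, a small Python listener can stand in for it. This is only a sketch, assuming Python 3; make_listener and serve_one are hypothetical helper names, and port 8001 simply mirrors the netcat command.

```python
import socket


def make_listener(host="0.0.0.0", port=8001):
    """Return a TCP socket listening on host:port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for old sockets to expire.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def serve_one(listener):
    """Accept one connection, echo what it sends to stdout, return byte count."""
    conn, peer = listener.accept()
    print("Connection from", peer)
    received = 0
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                # Empty read: the client closed the connection.
                break
            received += len(data)
            print(data.decode(errors="replace"), end="")
    return received
```

Running serve_one(make_listener()) blocks until a client connects and then until that client closes the connection.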

Next, run the script:

python keepalive.py

It should establish a TCP connection with the listener. Interrupt the network by enabling a firewall or powering off a router. You’ll see the following when the connection times out:

Traceback (most recent call last):
  File "socket_test.py", line 39, in do_work
    req = sock.recv(10)
error: [Errno 110] Connection timed out
Other Socket err, exit and try creating socket again

On OS X or Windows, you can enable keep-alive, but you cannot set TCP_KEEPIDLE and the other parameters because the socket module does not define them there. You’ll get the following error message if you try:

Traceback (most recent call last):
  File "socket_test.py", line 65, in <module>
  File "socket_test.py", line 19, in do_work
    sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 60)
AttributeError: 'module' object has no attribute 'TCP_KEEPIDLE'
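
Which constants the socket module exposes depends on the platform and the Python release, so one workaround is to probe for them instead of assuming they exist. The sketch below assumes Python 3; set_keepalive is a hypothetical helper. It also assumes that on OS X the idle time is set through TCP_KEEPALIVE (value 0x10 in the system headers), and that on Windows socket.ioctl with SIO_KEEPALIVE_VALS can be used as a fallback when TCP_KEEPIDLE is missing.

```python
import socket
import sys


def set_keepalive(sock, idle=60, interval=15, count=4):
    """Enable keep-alive and apply whatever tuning this platform exposes.

    Returns the names of the options that were actually set.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = ["SO_KEEPALIVE"]

    # Set each Linux-style option only if this build defines it.
    wanted = (("TCP_KEEPIDLE", idle),
              ("TCP_KEEPINTVL", interval),
              ("TCP_KEEPCNT", count))
    for name, value in wanted:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied.append(name)

    if "TCP_KEEPIDLE" not in applied:
        if sys.platform == "darwin":
            # Assumption: 0x10 is TCP_KEEPALIVE (idle seconds) on OS X.
            option = getattr(socket, "TCP_KEEPALIVE", 0x10)
            sock.setsockopt(socket.IPPROTO_TCP, option, idle)
            applied.append("TCP_KEEPALIVE")
        elif hasattr(socket, "SIO_KEEPALIVE_VALS"):
            # Windows: (on/off, idle in ms, interval in ms).
            sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                       (1, idle * 1000, interval * 1000))
            applied.append("SIO_KEEPALIVE_VALS")
    return applied
```

Printing the returned list shows which settings took effect on the machine at hand; anything missing from it falls back to the operating system's defaults.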

Wireshark highlights keep-alive messages if TCP sequence number analysis is enabled.


3 thoughts on “Handling TCP keep-alive”

  1. That is an interesting mechanism; I will try it. But if you use the code below, doesn’t it miss the case quoted on the reference link, “TCP peer is being quiet – or if the TCP socket has gone away”? Zero data is caught by the socket.timeout exception, isn’t it?

            if req == '':
                # connection closed by peer, exit loop
                print 'Connection closed by peer'
  2. You mean socket.timeout is raised if there isn’t any data? Yes, but we loop back to check for data again. If no exception is raised and yet there is no data, that is when we have a problem, most probably due to TCP keep-alive.

  3. Yes, my bad, you’re right about the code. I ran some tests. Assuming we don’t have TCP keep-alive, if there is no data (meaning no activity, as opposed to receiving 0 bytes), socket.timeout is raised; I maintain a counter of timeouts and close the socket accordingly. Indeed, a 0-byte read happens only when the remote peer properly calls close(), so your check req == '' is the correct way to detect a close from the remote peer. In addition, I probably need the TCP keep-alive mechanism, because either my NAT router seems to send a reconnect to my server socket after only a few seconds, or the reconnect is coming from the client application; I am unsure which. Do you think TCP keep-alive packets can maintain the connection through the NAT router and stop the continuous reconnects?
