Enable IP multicast routing in Linux kernel


In this post I discuss how to enable multicast routing in a Linux system. It is a continuation of the post Wireless Router with Buildroot and Raspberry Pi, where I discussed how to build a basic Wi-Fi router with a Raspberry Pi. You’ll want to read that first.

Linux kernel configuration

Besides the kernel modules mentioned in the post(s) linked above, you’ll need a few additional modules.

IP multicast routing and tunneling

IPv6 protocol

Under Networking support, Networking options, enable

  • IP: multicasting
  • IP: tunneling – this is required if you want to use tunneling with mrouted
  • IP: multicast routing and its sub-options
  • The IPv6 protocol

In the absence of IPv6 support, smcroute fails with an error such as

Starting static multicast router daemon: INIT: ICMPv6 socket open; Errno(97): Address family not supported by protocol
INIT: MRT6_INIT failed; Errno(97): Address family not supported by protocol
smcroute.

IPv6 Multicast Routing

Under Networking support, Networking options, The IPv6 protocol, enable IPv6: multicast routing and its sub-options.

Packet Mangling

Enable packet mangling with TTL target support if you require support for changing TTL values with iptables.

Buildroot package configuration

The following Buildroot packages provide daemons for performing multicast routing. Enable mrouted and smcroute under Target packages, Networking applications. mrouted requires a glibc-based toolchain, so you will have to select glibc instead of uClibc if you want to use mrouted.

mrouted

smcroute

Perform build and prepare the SD card.

Setup multicast routing

The following procedure is performed from a root console. I usually use the serial console through the expansion header.

Use mrouted when proper IGMP signaling exists

mrouted

The default configuration file /etc/mrouted.conf should be enough, unless you want to perform tunneling.

If you don’t have proper IGMP signaling happening, you can still perform static multicast routing using

smcroute -d

smcroute requires a configuration file, which in my case is /etc/smcroute.conf and looks something like

mgroup from wlan0 group 225.0.0.1
mroute from wlan0 group 225.0.0.1 to usb0

If you don’t have an application and want to use ping to test multicast, you can enable ICMP echo responses thus

echo "0" > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts

You can use ping requests and receive responses from destination hosts

ping -t 10 225.0.0.1

Note the use of the time to live (TTL) parameter -t. Linux and Mac OS X will set the TTL to 1 before forwarding the message to the default gateway. You can dump ping messages along with their TTL values using

tcpdump -v host 224.0.0.1 or 225.0.0.1

Note the change in TTL from 10 to 1 in a packet routed through Mac OS X in the following dump

14:49:19.642140 IP (tos 0x0, ttl 10, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    10.211.55.12 > 225.0.0.1: ICMP echo request, id 4113, seq 92, length 64
14:49:19.642190 IP (tos 0x0, ttl 1, id 32573, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.2.10 > 225.0.0.1: ICMP echo request, id 4113, seq 92, length 64

If all is well with the routing daemon, IP variable /proc/sys/net/ipv4/conf/{all,interface}/mc_forwarding will be set to 1.

Some other files offer useful hints related to multicast routing. The following lists interfaces where multicast routing is active

cat /proc/net/ip_mr_vif

This lists multicast routing cache entries

cat /proc/net/ip_mr_cache

When using static multicast routing with smcroute, routing will work only when TTL is greater than 1. If the downstream hosts are transmitting packets with TTL at 1, you can use iptables to set TTL thus

iptables -t mangle -A PREROUTING -i wlan0 -j TTL --ttl-set 64

I’ve also had to wait a while after executing smcroute for NAT to kick in, so that the source IP address is translated to the address of the interface on the destination network. Note the change in source IP address in a message sequence captured using

tcpdump -v -i usb0 host 224.0.0.1 or 225.0.0.1

03:03:25.393056 IP (tos 0x0, ttl 63, id 16103, offset 0, flags [none], proto UDP (17), length 41)
    192.168.2.10.61312 > 225.0.0.1.4007: UDP, length 13
03:04:22.277348 IP (tos 0x0, ttl 63, id 8095, offset 0, flags [none], proto UDP (17), length 41)
    192.168.10.2.61312 > 225.0.0.1.4007: UDP, length 13

IP multicasting


IP multicasting is used to target a group of hosts by sending a single datagram. IP addresses in the range 224.0.0.0 through 239.255.255.255 are reserved for multicasting.
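The range check above fits in a few lines of Node.js. This is only a minimal sketch: the function name isMulticast is my own, and it assumes dotted-quad IPv4 strings.

```javascript
// Returns true when a dotted-quad IPv4 address falls within
// 224.0.0.0 through 239.255.255.255.
// isMulticast is an illustrative name, not part of any library.
function isMulticast(address) {
  var octets = address.split('.');
  if (octets.length !== 4) {
    return false;
  }
  for (var i = 0; i < 4; i++) {
    // Reject empty, non-numeric, or out-of-range octets
    if (!/^\d{1,3}$/.test(octets[i]) || Number(octets[i]) > 255) {
      return false;
    }
  }
  var first = Number(octets[0]);
  return first >= 224 && first <= 239;
}

console.log(isMulticast('225.0.0.1'));    // true
console.log(isMulticast('192.168.2.10')); // false
```

Only the first octet decides the outcome, because the whole range shares the leading bits 1110.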

To find out which hosts on your subnet support multicasting, try

ping 224.0.0.1

Here’s a Node.js code snippet that sends UDP datagrams to multicast group 225.0.0.1 at port 8001

var dgram = require('dgram');
var s = dgram.createSocket('udp4');
s.bind(8000);
var b = Buffer.from("Hello"); // new Buffer() is deprecated in current Node.js
s.send(b, 0, b.length, 8001, "225.0.0.1", function(err, bytes) {
  console.log("Sent " + bytes + " bytes");
  s.close();
});

A host that wants to receive datagrams sent to a multicast group must first request membership of that group. Here’s a Node.js code snippet that receives the datagram sent by the code above

var dgram = require('dgram');
var s = dgram.createSocket('udp4');
s.bind(8001, function() {
  s.addMembership('225.0.0.1');
});
s.on("message", function (msg, rinfo) {
  console.log("server got: " + msg + " from " +
    rinfo.address + ":" + rinfo.port);
});

.NET code that does something similar can be found in the UDP Tool at GitHub.

Receiving multicasts on Linux does not work when you bind the socket to a specific interface; for instance, s.bind(8001, 192.168.1.1... does not work. It looks like a Linux-only (or perhaps Unix-wide?) quirk, because it does not happen with either the Mono .NET runtime or Node.js on Windows.

Another quirk observed on Linux is the need to add a route to forward multicast IP packets received over a wireless LAN interface in access point mode

route add -net 225.0.0.0 netmask 255.0.0.0 gw 192.168.2.1

Or, a more generic

route add -net 224.0.0.0/4 gw 192.168.2.1

Check that the route has been added with

netstat -nr

Or, just

route

Using ping to determine MTU size


ping is a ubiquitous and versatile utility available from the command line of most operating systems. Here’s how it can be used to determine the maximum transmission unit (MTU) size, i.e. the maximum amount of data the network will forward in a single data packet.

On Ubuntu GNU/Linux

ping -s 50000 -M do localhost

Here, 50000 is the size of the payload in the ICMP echo request. The -M do option prohibits fragmentation. The ICMP message as a whole has 8 more bytes, as that is the size of the header. The command shows that the loopback adapter’s MTU size is 16436, and the ping fails.
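The relation between payload size and MTU is simple arithmetic. Here is a small sketch in Node.js; it assumes an IPv4 header of 20 bytes (no IP options) on top of the 8-byte ICMP header mentioned above, and the function name maxPingPayload is my own.

```javascript
var IPV4_HEADER_SIZE = 20; // assumption: IPv4 header without options
var ICMP_HEADER_SIZE = 8;  // ICMP echo header

// Largest ping payload that fits in one unfragmented packet
function maxPingPayload(mtu) {
  return mtu - IPV4_HEADER_SIZE - ICMP_HEADER_SIZE;
}

console.log(maxPingPayload(16436)); // 16408
console.log(maxPingPayload(1500));  // 1472
```

So with the loopback MTU of 16436 reported above, a payload of 16408 bytes should be the largest that goes through with fragmentation prohibited.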

On a Mac

ping -s 50000 -D localhost

This does not print the MTU size, so you’ll have to try different payload sizes until you hit the limit. -D prohibits fragmentation by setting the Don’t Fragment (DF) bit in the IP header.
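Trying payload sizes by hand gets tedious, and a binary search narrows it down quickly. The sketch below shows only the search logic. The probe function is hypothetical: in practice it would run ping with fragmentation prohibited and report whether a reply came back. Here I substitute a stand-in that pretends the MTU is 1500.

```javascript
// Finds the largest payload size in [low, high] for which probe(size)
// returns true, assuming every size up to the limit succeeds and every
// larger size fails. Returns -1 when even `low` fails.
function findLargestPayload(probe, low, high) {
  var best = -1;
  while (low <= high) {
    var mid = Math.floor((low + high) / 2);
    if (probe(mid)) {
      best = mid;       // mid works, try something larger
      low = mid + 1;
    } else {
      high = mid - 1;   // mid is too big, try something smaller
    }
  }
  return best;
}

// Stand-in for a real ping (an assumption for illustration): with an
// MTU of 1500, payloads up to 1500 - 28 = 1472 bytes get through, 28
// being the 20-byte IPv4 header plus the 8-byte ICMP header.
function fakeProbe(size) {
  return size <= 1472;
}

console.log(findLargestPayload(fakeProbe, 0, 65507)); // 1472
```

The search needs at most 17 probes to cover the whole range, compared to a great many manual attempts.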

On Windows 8

ping -l 50000 -f localhost

This prints the default MTU size as 65500, so the ping above works.

IP forwarding


Normally, commercial operating systems do not act as routers, although they possess the ability to.

It is fairly easy to enable routing, and there are instructions all over the Internet for doing so on Windows and Linux.

You’ll also need to add appropriate route settings to the routing table, so that the packets destined to a network are properly routed through a network interface on that network. This can be done using the route add command on Windows and Linux.

I have had problems routing multicast packets, i.e. those with destination addresses from 224.0.0.0 through 239.255.255.255 (formerly referred to as Class D addresses). Windows does not route multicast traffic by default. Using EnableMulticastForwarding does not automatically enable it either.

Applications that communicate


You are building an application that needs to communicate over a network, and maybe you have decided to build your own communication protocol. I hope you’re doing it because TCP over IP does not meet your needs. Here I cover some points to keep in mind when developing an application or protocol that communicates over a network.

Use an existing transport protocol

You’ll find it easier to layer your protocol on top of an existing transport protocol such as UDP or TCP over IP. It will require more work otherwise.

Protocol header

A protocol usually requires a header to transmit relevant information about the message. It can contain information such as version, sender address, receiver address, payload size, sequence number, and so on. One important consideration is the size of the header itself. Make it as small as possible so that it does not become a significant overhead.
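To make this concrete, here is a sketch of a compact 5-byte header in Node.js. The layout (1-byte version, 2-byte sequence number, 2-byte payload size, all big-endian) and the function names are assumptions of mine for illustration, not a standard format.

```javascript
// Hypothetical header layout:
//   byte 0     version
//   bytes 1-2  sequence number (big-endian)
//   bytes 3-4  payload size in bytes (big-endian)
var HEADER_SIZE = 5;

function encodeMessage(version, sequence, payload) {
  var header = Buffer.alloc(HEADER_SIZE);
  header.writeUInt8(version, 0);
  header.writeUInt16BE(sequence, 1);
  header.writeUInt16BE(payload.length, 3);
  return Buffer.concat([header, payload]);
}

function decodeMessage(buffer) {
  var size = buffer.readUInt16BE(3);
  return {
    version: buffer.readUInt8(0),
    sequence: buffer.readUInt16BE(1),
    payload: buffer.subarray(HEADER_SIZE, HEADER_SIZE + size)
  };
}

var decoded = decodeMessage(encodeMessage(1, 42, Buffer.from('Hello')));
console.log(decoded.version, decoded.sequence, decoded.payload.toString());
// 1 42 Hello
```

Five bytes of header on a five-byte payload is still a 100% overhead, which is why header size matters most for small messages.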

Message oriented vs stream oriented

It may be desirable to have message boundaries preserved. For instance, if the protocol has been asked to deliver a particular set of bytes, it should ideally provide the receiver that same set of bytes as a cohesive whole.

TCP is an example of a stream oriented protocol in the sense that there are no clear message boundaries. UDP is message oriented; each message or datagram can carry up to 65,507 bytes of data over IPv4 (65,535 bytes less the 20-byte IP header and the 8-byte UDP header).
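One common way to get message boundaries back on top of a stream is to prefix every message with its length. The Node.js sketch below assumes a 2-byte big-endian length prefix; the names FrameDecoder and frame are my own.

```javascript
// Accumulates stream chunks and emits complete messages. Each message
// on the wire is a 2-byte big-endian length followed by that many bytes.
function FrameDecoder(onMessage) {
  this.pending = Buffer.alloc(0);
  this.onMessage = onMessage;
}

FrameDecoder.prototype.push = function(chunk) {
  this.pending = Buffer.concat([this.pending, chunk]);
  while (this.pending.length >= 2) {
    var length = this.pending.readUInt16BE(0);
    if (this.pending.length < 2 + length) {
      break; // the rest of this message has not arrived yet
    }
    this.onMessage(this.pending.subarray(2, 2 + length));
    this.pending = this.pending.subarray(2 + length);
  }
};

// Adds the length prefix to a payload
function frame(payload) {
  var prefix = Buffer.alloc(2);
  prefix.writeUInt16BE(payload.length, 0);
  return Buffer.concat([prefix, payload]);
}

// Two messages arrive split across arbitrary chunk boundaries
var received = [];
var decoder = new FrameDecoder(function(msg) {
  received.push(msg.toString());
});
var wire = Buffer.concat([frame(Buffer.from('Hello')), frame(Buffer.from('World'))]);
decoder.push(wire.subarray(0, 3));
decoder.push(wire.subarray(3, 10));
decoder.push(wire.subarray(10));
console.log(received); // [ 'Hello', 'World' ]
```

The receiver gets exactly the two messages that were sent, regardless of how the stream happened to be chunked.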

Fragmentation and reassembly

Depending on the size of the data, it may need to be broken into smaller fragments, which are reassembled when received. To reassemble data, the fragments need to be put back in the order they were sent. The order can be indicated by adding a sequence number to each fragment. The application may also segment data as required; a protocol does not care about the contents of the data itself and is blissfully unaware that the data is segmented.

IP, and therefore TCP and UDP, transparently perform fragmentation and reassembly of data. TCP also further segments data sent by the application. The segments are reassembled at the receiver and provided to the application as a stream. The segment size needs to be such that the total length of the network packet does not exceed the maximum transmission unit (MTU) of the network.
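If you do end up fragmenting data yourself, the idea can be sketched as follows in Node.js. The fragment format (sequence number plus total count) and the function names are assumptions for illustration, and duplicate fragments are not handled.

```javascript
// Splits data into fragments of at most maxSize bytes, each tagged
// with a sequence number and the total number of fragments.
function fragment(data, maxSize) {
  var total = Math.ceil(data.length / maxSize);
  var fragments = [];
  for (var i = 0; i < total; i++) {
    fragments.push({
      sequence: i,
      total: total,
      data: data.subarray(i * maxSize, (i + 1) * maxSize)
    });
  }
  return fragments;
}

// Reassembles fragments that may have arrived in any order.
// Returns null while fragments are still missing.
function reassemble(fragments) {
  if (fragments.length === 0 || fragments.length < fragments[0].total) {
    return null;
  }
  var ordered = fragments.slice().sort(function(a, b) {
    return a.sequence - b.sequence;
  });
  return Buffer.concat(ordered.map(function(f) { return f.data; }));
}

var parts = fragment(Buffer.from('Hello, multicast!'), 5);
console.log(parts.length); // 4
// Simulate out-of-order arrival by reversing the fragments
console.log(reassemble(parts.slice().reverse()).toString()); // Hello, multicast!
```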

Retransmission

If your communication link is unreliable, such as a noisy wireless link, you’ll need to retransmit data that does not arrive at the receiver. Retransmission may also be required if data arrives but is corrupted.

One way to implement retransmission is by requiring the receiver to send an acknowledgement when data is correctly received. The sender can use a timer to resend data when an acknowledgement is not received. If several successive retries fail, the data transfer attempt may be abandoned, and an error reported to the software layer that uses the protocol. Each data fragment needs a unique identifier that should be used during acknowledgement.

Another way to implement retransmission is for the receiver to request it when a fragment with a particular sequence number is not received, after a more recent fragment has been received. This eliminates the need for acknowledgement.

IP is best effort; it neither retransmits nor prevents duplicate messages from arriving. UDP retains these drawbacks: large datagrams may be dropped if the network is unreliable, and datagrams may also arrive out of order. TCP handles retransmission, making it reliable and robust at the cost of throughput.
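The receiver-driven approach described above boils down to spotting gaps in the sequence numbers. Here is a minimal Node.js sketch. It assumes sequence numbers start at 0 and never wrap around, and the name GapDetector is my own. Note that a gap only shows up once a later fragment arrives, so losing the very last fragment goes unnoticed without some extra mechanism.

```javascript
// Tracks the sequence numbers seen so far and reports the ones to
// re-request: those below the highest received that never arrived.
function GapDetector() {
  this.received = {};
  this.highest = -1;
}

// Records a sequence number and returns the list of missing ones
GapDetector.prototype.receive = function(sequence) {
  this.received[sequence] = true;
  if (sequence > this.highest) {
    this.highest = sequence;
  }
  return this.missing();
};

GapDetector.prototype.missing = function() {
  var gaps = [];
  for (var s = 0; s < this.highest; s++) {
    if (!this.received[s]) {
      gaps.push(s);
    }
  }
  return gaps;
};

var detector = new GapDetector();
detector.receive(0);
detector.receive(1);
console.log(detector.receive(4)); // [ 2, 3 ]
console.log(detector.receive(2)); // [ 3 ]
```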

Error checking and correction

Error checking codes such as CRC codes can be added to data fragments so that errors during transmission can be detected. Redundancy in the data can ensure that data can be corrected even when there are errors. This is useful in scenarios where retransmission is expensive or not possible at all.

UDP and TCP are capable of checking header and data integrity based on a checksum value. They do not have data correction capability. Since TCP does retransmission, it can recover from errors by asking the sender to retransmit.
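The checksum used by UDP and TCP is the 16-bit ones' complement Internet checksum described in RFC 1071. The Node.js sketch below computes it over a plain buffer. It leaves out the pseudo-header that real UDP and TCP implementations include, and it assumes buffers no larger than a single IP packet so that the running sum stays within 32 bits.

```javascript
// Internet checksum (RFC 1071) over a buffer: add up 16-bit big-endian
// words, fold the carries back in, and take the ones' complement.
function internetChecksum(buffer) {
  var sum = 0;
  for (var i = 0; i < buffer.length; i += 2) {
    var high = buffer[i];
    // An odd-length buffer is padded with a zero byte
    var low = (i + 1 < buffer.length) ? buffer[i + 1] : 0;
    sum += (high << 8) | low;
  }
  while (sum > 0xffff) {
    sum = (sum & 0xffff) + (sum >>> 16);
  }
  return (~sum) & 0xffff;
}

// Sample bytes from the worked example in RFC 1071
var data = Buffer.from([0x00, 0x01, 0xf2, 0x03, 0xf4, 0xf5, 0xf6, 0xf7]);
console.log(internetChecksum(data).toString(16)); // 220d
```

A receiver verifies integrity by running the same computation over the data with the checksum appended; the result is 0 when nothing was corrupted.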

Multiple network paths

The communication protocol stack may have to deal with multiple network paths to the destination, for instance a Bluetooth PAN and a WiFi link. The decision to choose one over the others may be based on the knowledge of which is more reliable, is currently active, has better throughput and so on. IP prioritizes one interface over another using the routing metric.

Connection (re)establishment

The state of the connection can be detected using keepalive or heartbeat messages. If the receiver responds to heartbeat messages, the connection is alive. Otherwise, it is considered broken and an error reported to the application. Heartbeat messages compete with regular data, so they may be used when no data activity is present. Connection reestablishment may require user intervention in case of persistent problems with the network.

Protocols such as TCP initiate and maintain a session with the receiver. Termination of the connection is negotiated. TCP supports keepalive, which can be enabled on a per-connection basis. UDP, on the other hand, does not maintain a connection; it is entirely stateless. Termination of a connection due to persistent problems in the network is not handled gracefully by UDP.
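For a protocol of your own, liveness tracking can be as simple as remembering when the peer was last heard from. The Node.js sketch below is one way to do it. The name HeartbeatMonitor is my own, and the clock is passed in as a function so that the logic can be exercised without real timers.

```javascript
// Considers the connection alive as long as something (a heartbeat or
// regular data) was heard within the last timeoutMs milliseconds.
// `now` is an optional clock function, defaulting to Date.now.
function HeartbeatMonitor(timeoutMs, now) {
  this.timeoutMs = timeoutMs;
  this.now = now || Date.now;
  this.lastSeen = this.now();
}

// Call this whenever any message arrives from the peer
HeartbeatMonitor.prototype.heard = function() {
  this.lastSeen = this.now();
};

HeartbeatMonitor.prototype.isAlive = function() {
  return this.now() - this.lastSeen <= this.timeoutMs;
};

// Simulated clock, in milliseconds
var clock = 0;
var monitor = new HeartbeatMonitor(3000, function() { return clock; });
clock = 2000;
console.log(monitor.isAlive()); // true
clock = 4000;
console.log(monitor.isAlive()); // false, nothing heard for 4 seconds
monitor.heard();
console.log(monitor.isAlive()); // true
```

Because regular data also resets the timer, explicit heartbeat messages only need to be sent when the link is otherwise idle.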

Compression and encryption

Compression can reduce the bandwidth required to transport data. Domain specific compression algorithms are usually more efficient than generic compression algorithms like Deflate; for instance, JPEG is better at compressing image data, and MP3 is better at compressing music. Encryption ensures that data cannot be read by parties other than the sender and the receiver. Encryption is quite an elaborate and complex topic involving key exchange, and several kinds of crypto algorithms.

Rate/flow control

Rate control, also called throttling, prevents the network and network nodes from being overwhelmed, averting effects such as packet loss. It can also be used to divide the available bandwidth between users, when it is scarce. Rate control can also be applied when available bandwidth changes, commonly referred to as adaptive rate control.
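A token bucket is one common way to implement throttling; it is my choice for this illustration rather than the only option. In the Node.js sketch below, tokens represent bytes, they are replenished at a fixed rate up to a maximum burst size, and the clock is passed in as a function so the behaviour can be checked without waiting.

```javascript
// Token bucket: ratePerSecond tokens are added every second, up to
// `capacity`. Sending `bytes` is allowed only if that many tokens are
// available. `now` is an optional clock function returning milliseconds.
function TokenBucket(ratePerSecond, capacity, now) {
  this.rate = ratePerSecond;
  this.capacity = capacity;
  this.now = now || Date.now;
  this.tokens = capacity;
  this.last = this.now();
}

TokenBucket.prototype.tryConsume = function(bytes) {
  var current = this.now();
  var elapsedSeconds = (current - this.last) / 1000;
  this.last = current;
  // Replenish for the time that has passed, without exceeding capacity
  this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.rate);
  if (bytes > this.tokens) {
    return false; // the caller should wait and retry later
  }
  this.tokens -= bytes;
  return true;
};

// Simulated clock, in milliseconds
var clock = 0;
var bucket = new TokenBucket(1000, 1500, function() { return clock; });
console.log(bucket.tryConsume(1500)); // true, the bucket starts full
console.log(bucket.tryConsume(1));    // false, the bucket is empty
clock = 500;                          // half a second later: 500 tokens
console.log(bucket.tryConsume(500));  // true
```

Adaptive rate control would amount to changing the rate while the bucket is in use.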

Store and forward

Some considerations need to be made as to what happens to messages when delivery fails, when there is a power outage for instance. The protocol can store messages in a persistent queue and forward them at a later time. This is also sometimes referred to as fire and forget, since the application fires a message and is assured that the other end will receive it, even after a significant delay.
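The queueing logic can be sketched in a few lines of Node.js. This version keeps the queue in memory only, which is an assumption made for brevity; surviving a power outage would require writing the queue to persistent storage. The name StoreAndForward and the convention that send returns true on success are my own.

```javascript
// Queues messages and forwards them in order. A message leaves the
// queue only after send(message) reports success by returning true.
function StoreAndForward(send) {
  this.send = send;
  this.queue = [];
}

StoreAndForward.prototype.submit = function(message) {
  this.queue.push(message);
  this.flush();
};

// Call this again whenever the link may have come back
StoreAndForward.prototype.flush = function() {
  while (this.queue.length > 0) {
    if (!this.send(this.queue[0])) {
      return; // delivery failed, keep the message for a later attempt
    }
    this.queue.shift();
  }
};

var linkUp = false;
var delivered = [];
var outbox = new StoreAndForward(function(message) {
  if (!linkUp) {
    return false;
  }
  delivered.push(message);
  return true;
});
outbox.submit('a');
outbox.submit('b');
console.log(delivered); // []
linkUp = true;
outbox.flush();
console.log(delivered); // [ 'a', 'b' ]
```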

Software design patterns and data structures

Certain data structures and patterns that can be very useful are queues, priority queues, observer, and chain of responsibility.

Transfer lots of data between PCs


So, you just bought a new PC and are wondering how you’ll transfer tons of data from one to the other?

Here’s a quick tip. You probably have a gigabit Ethernet adapter on both and can use an Ethernet cable to transfer gigabytes of data in a couple of hours.

Just plug a regular Ethernet cable into both PCs, configure the network interfaces with static IP addresses, and you are ready to transfer your data. Older Ethernet adapters may have difficulty with a normal cable and may need a crossover cable, but most adapters auto-detect crossover and work with regular cables just fine.

Configuring the static IP address is operating system specific. So is file copying. On Windows, use the network adapter settings page to access Internet Protocol version 4 properties. Set the network address to something like 192.168.2.1 on one PC and 192.168.2.2 on the other. Set the network mask on both to 255.255.255.0.
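The two addresses need to be on the same subnet for the PCs to reach each other directly. If in doubt, the check is easy to do by hand or with a few lines of Node.js. This is a sketch that assumes well-formed dotted-quad input; the function names are my own.

```javascript
// Converts a dotted-quad IPv4 address to a number
function toInt(address) {
  return address.split('.').reduce(function(acc, octet) {
    return (acc * 256) + Number(octet);
  }, 0);
}

// Two addresses are on the same subnet when their network parts,
// obtained by applying the mask, are equal.
function sameSubnet(a, b, mask) {
  var m = toInt(mask);
  return ((toInt(a) & m) >>> 0) === ((toInt(b) & m) >>> 0);
}

console.log(sameSubnet('192.168.2.1', '192.168.2.2', '255.255.255.0')); // true
console.log(sameSubnet('192.168.2.1', '192.168.3.2', '255.255.255.0')); // false
```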

Now, just share a folder or disk you want to copy over the network, and access it from the other PC to copy whatever files you need. I got an average data transfer rate of about 25 megabytes per second. That is far less than the theoretical 125 MB per second I should get from gigabit Ethernet. I still need to figure out why, but it got the job done.

Of course, if you have an external USB drive, you can save yourself the pain above, and use that for doing the transfer, even though it may take at least twice as long.

TCP socket connection from the web browser


Web browsers do not support communicating with TCP hosts, other than web servers. In this post I take a different tack. I demonstrate a relay written with Node.js, that receives data from the browser over websockets, and sends it to a TCP socket. Data received over the TCP socket is similarly relayed back to the browser. This approach can also be used with UDP and other IP protocols.

JavaScript implementations of most modern browsers have typed arrays, that can be used to manipulate binary data. Latency and performance of JavaScript are important factors to consider. Some hosts may have tight timing requirements for responses that may be hard to meet.
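As a quick illustration of typed arrays, the snippet below builds a small binary message. The message layout (a 16-bit length followed by two bytes of data) is made up for this example. It runs unchanged in Node.js and in browsers that support typed arrays.

```javascript
// A 4-byte message: 16-bit big-endian length, then two data bytes
var buffer = new ArrayBuffer(4);
var view = new DataView(buffer);
view.setUint16(0, 2); // DataView writes big-endian by default

// A Uint8Array over the same memory gives byte-level access
var bytes = new Uint8Array(buffer);
bytes[2] = 0x48; // 'H'
bytes[3] = 0x69; // 'i'

console.log(Array.prototype.slice.call(bytes));        // [ 0, 2, 72, 105 ]
console.log(String.fromCharCode(bytes[2], bytes[3])); // Hi
```

The DataView and the Uint8Array share the same underlying ArrayBuffer, so writes through one are visible through the other.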

The websockets implementation used by the relay is based on the ws module. socket.io is also a good fit but I wanted to be as close to vanilla websockets as possible. The ws module can be installed as follows:


npm -g install ws

The client

Here’s the implementation of a test client. It requests the relay to open a new socket connection to www.google.com at port 80. It then sends an HTTP GET request, and shows the response to the GET request in a DIV element.

You’ll need some familiarity with jQuery to follow the code. Since I use the CDN hosted version of jQuery, you’ll need an internet connection.

<html>
<head>
  <title>Test Client</title>
</head>
<body>
  <div id="output">Output</div>

  <script src="http://code.jquery.com/jquery-1.7.2.min.js"></script>

  <script>
  $(document).ready(function() {
    var config = {
      relayURL: "ws://192.168.0.129:8080",
      remoteHost: "www.google.com",
      remotePort: 80
    };

    var client = new RelayClient(config, function(socket) {
      socket.onmessage = function(event) {
        $('div#output').html(event.data)
      };
      // HTTP/1.1 requires a Host header
      var get = 'GET / HTTP/1.1\r\nHost: ' + config.remoteHost + '\r\n\r\n';
      //var get = new Blob(['GET / HTTP/1.1\r\n\r\n']);
      socket.send(get);
    });
  });

  // Opens a websocket to the relay, asks it to connect to the remote
  // host, and calls handler(socket) once the relay reports success.
  function RelayClient(config, handler) {
    var connected = false;

    var socket = new WebSocket(config.relayURL);

    socket.onopen = function() {
      socket.send('open ' + config.remoteHost + ' ' + config.remotePort);
    };

    // The handler is expected to replace onmessage with its own callback
    socket.onmessage = function(event) {
      if (!connected && event.data === 'connected') {
        connected = true;
        handler(socket);
      }
    };
  }
  </script>
</body>
</html>

The relay

The relay implementation receives a message from the client containing an open request followed by the remote host and port. After a connection is established with the remote host, it sends a connected message to the client. After that, all messages from the client are simply relayed to the host, and vice-versa.

The ws module is used in tandem with the express web application framework. The express framework is set up to serve static files from the folder where the script is located.

var express = require('express');
var http = require('http');
var net = require('net');
var WebSocketServer = require('ws').Server;

// express.createServer() is gone from current Express versions, so
// wrap the app in an http.Server that ws can attach to.
var app = express();
app.use(express.static(__dirname));
var server = http.createServer(app);
server.listen(8080);

var wss = new WebSocketServer({server: server});

wss.on('connection', function(ws) {
  // new client connection
  var connected = false;
  var host = undefined;

  ws.on('message', function(message) {
    // Recent versions of ws deliver a Buffer, older ones a string
    var text = message.toString();
    if (!connected && text.substring(0, 4) == 'open') {
      var options = text.split(' ');
      console.log('Trying %s at port %s...', options[1], options[2]);
      host = net.connect(options[2], options[1], function() {
        connected = true;
        ws.send('connected');
      });
      host.on('data', function(data) {
        console.log('Got data from %s, sending to client.', options[1]);
        // The data is treated as UTF-8 text and sent as a text frame
        ws.send(data.toString());
      });
      host.on('end', function() {
        console.log('Host %s terminated connection.', options[1]);
        ws.close();
      });
      host.on('error', function(err) {
        console.log('Connection to %s failed: %s', options[1], err.message);
        ws.close();
      });
    } else if (host) {
      console.log('Got data from client, sending to host.');
      host.write(message);
    }
  });
});
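The relay above trusts whatever the client sends in the open request. A slightly more defensive relay could validate the request first. The function below is a sketch of such a check; parseOpenRequest is a name of my own and not part of the relay as written.

```javascript
// Parses a request of the form "open <host> <port>".
// Returns { host, port } or null when the request is malformed.
function parseOpenRequest(message) {
  var parts = String(message).trim().split(/\s+/);
  if (parts.length !== 3 || parts[0] !== 'open') {
    return null;
  }
  var port = Number(parts[2]);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    return null;
  }
  return { host: parts[1], port: port };
}

console.log(parseOpenRequest('open www.google.com 80'));
// { host: 'www.google.com', port: 80 }
console.log(parseOpenRequest('open www.google.com'));
// null
```

A relay exposed beyond a trusted network would also want to restrict which hosts and ports clients may open, since it otherwise lets any browser reach any TCP service the relay can reach.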

Some considerations

The data being sent and received is UTF-8. Blob support is currently limited to Firefox and desktop WebKit based browsers. The relay mechanism can be extended to support other protocols like UDP.