The GNU-VPE Protocols

Overview

\s-1GVPE\s0 can make use of a number of protocols. One of them is the \s-1GNU\s0 \s-1VPE\s0 protocol, which is used to authenticate tunnels and send encrypted data packets. This protocol is described in more detail in the second part of this document.

The first part of this document describes the transport protocols which are used by \s-1GVPE\s0 to send its data packets over the network.

PART 1: Transport protocols

\s-1GVPE\s0 offers a wide range of transport protocols that can be used to interchange data between nodes. Protocols differ in their overhead, speed, reliability, and robustness.

The following sections describe each transport protocol in more detail. They are sorted by efficiency, with the lowest-overhead transport listed first:

\s-1RAW\s0 \s-1IP\s0

This protocol is the best choice, performance-wise, as the minimum overhead per packet is only 38 bytes.

It works by sending the \s-1VPN\s0 payload using raw \s-1IP\s0 frames (using the protocol set by \*(C`ip-proto\*(C').

Using raw \s-1IP\s0 frames has the drawback that many firewalls block \*(L"unknown\*(R" protocols, so this transport only works if you have full \s-1IP\s0 connectivity between nodes.

\s-1ICMP\s0

This protocol offers very low overhead (minimum 42 bytes), and can sometimes tunnel through firewalls when other protocols cannot.

It works by prepending an \s-1ICMP\s0 header with type \*(C`icmp-type\*(C' and a code of 255. The default \*(C`icmp-type\*(C' is \*(C`echo-reply\*(C', so the resulting packets look like echo replies, which look rather strange to network administrators.
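As a rough illustration of the framing described above, the following Python sketch builds such a header. The four-byte layout (type, code, checksum) is an assumption inferred from the 4-byte difference between the raw \s-1IP\s0 and \s-1ICMP\s0 overheads quoted in this document, and the function names are hypothetical. The checksum is the standard Internet checksum used by \s-1ICMP\s0.

```python
import struct

ICMP_ECHO_REPLY = 0   # standard ICMP type number for echo-reply
GVPE_ICMP_CODE = 255  # code stated in the text above


def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def icmp_wrap(payload: bytes, icmp_type: int = ICMP_ECHO_REPLY) -> bytes:
    """Prepend an ASSUMED 4-byte ICMP header (type, code, checksum)."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBH", icmp_type, GVPE_ICMP_CODE, 0)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBH", icmp_type, GVPE_ICMP_CODE, checksum) + payload
```

A receiver can validate such a packet by running `internet_checksum` over the whole packet and checking that the result is zero.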

This transport should only be used if other transports (such as raw \s-1IP\s0) are unavailable or undesirable (due to their overhead).

\s-1UDP\s0

This is a good general choice for the transport protocol as \s-1UDP\s0 packets tunnel well through most firewalls and routers, and the overhead per packet is moderate (minimum 58 bytes).

It should be used if \s-1RAW\s0 \s-1IP\s0 is not available.

\s-1TCP\s0

This protocol is a very bad choice, as it not only has high overhead (more than 60 bytes), but the transport also retries on its own, which leads to congestion when the link has moderate packet loss (as both the \s-1TCP\s0 transport and the tunneled traffic will retry, increasing congestion more and more). It also has high latency and is quite inefficient.

It is only useful when tunneling through firewalls that block better protocols. If a node has no direct internet access but can reach an \s-1HTTP\s0 proxy that supports the \s-1CONNECT\s0 method, this transport can be used to tunnel through that proxy. For this to work, the \*(C`tcp-port\*(C' should be 443 (\*(C`https\*(C'), as most proxies do not allow connections to other ports.

It is an abuse of the usage a proxy was designed for, so make sure you are allowed to use it for \s-1GVPE\s0.
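For readers unfamiliar with the \s-1CONNECT\s0 method, the sketch below shows what a minimal tunnel request to a proxy looks like. It is generic \s-1HTTP\s0, not taken from the \s-1GVPE\s0 sources: the function names and the host name in the usage note are hypothetical, and the request \s-1GVPE\s0 really sends may carry additional headers.

```python
def build_connect_request(host: str, port: int = 443) -> bytes:
    """Return a minimal HTTP/1.1 CONNECT request for host:port."""
    target = f"{host}:{port}"
    lines = [
        f"CONNECT {target} HTTP/1.1",
        f"Host: {target}",
        "",  # the empty line terminates the request headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def connect_succeeded(status_line: str) -> bool:
    """A 2xx status code means the proxy has opened the tunnel."""
    parts = status_line.split()
    if len(parts) < 2:
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code.startswith("2")
```

For example, `build_connect_request("vpn.example.net")` asks the proxy for a tunnel to port 443 of that (made-up) host. Once the proxy answers with a 2xx status line, the connection carries arbitrary bytes in both directions.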

This protocol also has server and client sides. If the \*(C`tcp-port\*(C' is set to zero, other nodes cannot connect to this node directly. If the \*(C`tcp-port\*(C' is non-zero, the node can act both as a client and as a server.

\s-1DNS\s0

\s-1WARNING:\s0 Parsing and generating \s-1DNS\s0 packets is rather tricky. The code almost certainly contains buffer overflows and other, likely exploitable, bugs. You have been warned.

This is the worst choice of transport protocol with respect to overhead (overhead can be 2-3 times higher than the transferred data), and latency (which can be many seconds). Some \s-1DNS\s0 servers might not be prepared to handle the traffic and drop or corrupt packets. The client also has to constantly poll the server for data, so the client will constantly create traffic even if it doesn't need to transport packets.

In addition, the same problems as the \s-1TCP\s0 transport also plague this protocol.

Its only use is to tunnel through firewalls that do not allow direct internet access. Similar to using an \s-1HTTP\s0 proxy (as the \s-1TCP\s0 transport does), a client uses a local \s-1DNS\s0 server/forwarder (given by the \*(C`dns-forw-host\*(C' configuration value) as a proxy to send and receive data. This relies on an \*(C`NS\*(C' record pointing to the \s-1GVPE\s0 server (as given by the \*(C`dns-hostname\*(C' directive).

The only good side of this protocol is that it can tunnel through most firewalls mostly undetected, iff the local \s-1DNS\s0 server/forwarder is sane (which is true for most routers, wireless \s-1LAN\s0 gateways and nameservers).

Fine-tuning needs to be done by editing \*(C`src/vpn_dns.C\*(C' directly.
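To put the minimum per-packet overheads quoted in Part 1 into perspective, the following sketch computes the fraction of each packet that is actual payload. The dictionary keys and the function name are made up for this example. The \s-1TCP\s0 figure is only a floor, since the text says "more than 60 bytes", and \s-1DNS\s0 is left out because no fixed figure is given for it.

```python
# Minimum per-packet overhead in bytes, as quoted in Part 1.
MIN_OVERHEAD = {
    "rawip": 38,
    "icmp": 42,
    "udp": 58,
    "tcp": 60,  # ASSUMPTION: the text says "more than 60"; 60 is a floor
}


def efficiency(transport: str, payload_len: int) -> float:
    """Best-case fraction of the packet occupied by payload."""
    overhead = MIN_OVERHEAD[transport]
    return payload_len / (payload_len + overhead)
```

For a 100-byte payload this gives about 72% for raw \s-1IP\s0 and about 63% for \s-1UDP\s0. The gap narrows as packets get larger.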

PART 2: The GNU VPE protocol

This section, unfortunately, is not yet finished, although the protocol is stable (until bugs in the cryptography are found, which will likely completely change the following description). Nevertheless, it should give you an overview of the protocol.

Anatomy of a \s-1VPN\s0 packet

The exact layout and field lengths of a \s-1VPN\s0 packet are determined at compile time and do not change. The same structure is used for all transport protocols, be it \s-1RAW\s0 \s-1IP\s0 or \s-1TCP\s0.

 +------+------+--------+------+
 | HMAC | TYPE | SRCDST | DATA |
 +------+------+--------+------+

The \s-1HMAC\s0 field is present in all packets, even if not used (e.g. in auth request packets), in which case it is set to all zeroes. The checksum itself is calculated over the \s-1TYPE\s0, \s-1SRCDST\s0 and \s-1DATA\s0 fields in all cases.
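The coverage of the checksum can be sketched as follows. The digest (SHA-256), the truncated length of 12 bytes and all function names are assumptions made for this example. The text only says that the real values are fixed at compile time.

```python
import hashlib
import hmac

HMAC_LEN = 12  # ASSUMPTION: the real length is a compile-time setting


def packet_mac(key: bytes, ptype: int, srcdst: bytes, data: bytes) -> bytes:
    """MAC over TYPE, SRCDST and DATA, in that order."""
    covered = bytes([ptype]) + srcdst + data
    # ASSUMPTION: SHA-256 stands in for whatever digest the build uses.
    return hmac.new(key, covered, hashlib.sha256).digest()[:HMAC_LEN]


def build_packet(key: bytes, ptype: int, srcdst: bytes, data: bytes) -> bytes:
    """Lay out HMAC | TYPE | SRCDST | DATA."""
    return packet_mac(key, ptype, srcdst, data) + bytes([ptype]) + srcdst + data


def verify_packet(key: bytes, packet: bytes) -> bool:
    """Recompute the MAC and compare it in constant time."""
    if len(packet) < HMAC_LEN + 4:
        return False  # too short to hold HMAC, TYPE and SRCDST
    mac = packet[:HMAC_LEN]
    ptype = packet[HMAC_LEN]
    srcdst = packet[HMAC_LEN + 1:HMAC_LEN + 4]
    data = packet[HMAC_LEN + 4:]
    return hmac.compare_digest(mac, packet_mac(key, ptype, srcdst, data))
```

Because \s-1SRCDST\s0 is covered by the checksum, a packet whose node IDs were altered in transit fails verification at the destination.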

The \s-1TYPE\s0 field is a single byte and determines the purpose of the packet (e.g. \s-1RESET\s0, \s-1COMPRESSED/UNCOMPRESSED\s0 \s-1DATA\s0, \s-1PING\s0, \s-1AUTH\s0 \s-1REQUEST/RESPONSE\s0, \s-1CONNECT\s0 \s-1REQUEST/INFO\s0 etc.).

\s-1SRCDST\s0 is a three-byte field which contains the source and destination node IDs (12 bits each).
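Two 12-bit IDs fit exactly into 24 bits. The sketch below shows one way to pack them. The bit order used here (source ID in the upper 12 bits, big-endian) is an assumption for illustration only, and the layout actually used on the wire may differ.

```python
def pack_srcdst(src: int, dst: int) -> bytes:
    """Pack two 12-bit node IDs into three bytes (ASSUMED bit order)."""
    if not (0 <= src < 4096 and 0 <= dst < 4096):
        raise ValueError("node IDs must fit in 12 bits")
    return ((src << 12) | dst).to_bytes(3, "big")


def unpack_srcdst(field: bytes) -> tuple[int, int]:
    """Inverse of pack_srcdst: return (src, dst)."""
    if len(field) != 3:
        raise ValueError("SRCDST must be exactly three bytes")
    value = int.from_bytes(field, "big")
    return value >> 12, value & 0xFFF
```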

The \s-1DATA\s0 portion differs between each packet type, naturally, and is the only part that can be encrypted. Data packets contain more fields, as shown:

 +------+------+--------+------+-------+------+
 | HMAC | TYPE | SRCDST | RAND | SEQNO | DATA |
 +------+------+--------+------+-------+------+

\s-1RAND\s0 is a sequence of fully random bytes, used to increase the entropy of the data for encryption purposes.

\s-1SEQNO\s0 is a 32-bit sequence number. It is negotiated at every connection initialization and starts at some random 31-bit value. \s-1GVPE\s0 currently uses a sliding window of 512 packets/sequence numbers to detect reordering, duplication and replay attacks.
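A sliding window of this kind can be sketched with a bitmap, as below. This is a generic anti-replay window, not a transcription of the \s-1GVPE\s0 code: the class name is hypothetical, 32-bit wrap-around is ignored, and treating everything before the negotiated start value as already seen is a design choice of this sketch.

```python
WINDOW = 512  # window size stated in the text above
MASK = (1 << WINDOW) - 1


class ReplayWindow:
    """Bit i of `seen` is set when seqno (highest - i) was accepted."""

    def __init__(self, first_seqno: int):
        self.highest = first_seqno - 1
        self.seen = MASK  # ASSUMPTION: nothing before the start is valid

    def accept(self, seqno: int) -> bool:
        if seqno > self.highest:
            shift = seqno - self.highest
            if shift >= WINDOW:
                self.seen = 1  # the whole old window falls out of range
            else:
                self.seen = ((self.seen << shift) | 1) & MASK
            self.highest = seqno
            return True
        offset = self.highest - seqno
        if offset >= WINDOW:
            return False  # older than the window: cannot be checked
        bit = 1 << offset
        if self.seen & bit:
            return False  # duplicate or replayed packet
        self.seen |= bit  # late (reordered) packet, first time seen
        return True
```

Packets that arrive out of order are accepted as long as they fall within the last 512 sequence numbers and have not been seen before.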

The encryption is done on \s-1RAND+SEQNO+DATA\s0 in \s-1CBC\s0 mode with zero \s-1IV\s0 (or, equivalently, the \s-1IV\s0 is \s-1RAND+SEQNO\s0, encrypted with the block cipher, unless \s-1RAND\s0 size is decreased or increased over the default value).
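The equivalence stated in parentheses follows from writing out \s-1CBC\s0 mode. Here E_K is the block cipher and P_1 ... P_n are the plaintext blocks. The symbols are introduced only for this example, and it is assumed that \s-1RAND+SEQNO\s0 fills exactly the first block P_1:

```latex
\begin{aligned}
C_0 &= 0 \\
C_1 &= E_K(P_1 \oplus C_0) = E_K(P_1) \\
C_i &= E_K(P_i \oplus C_{i-1}), \qquad i = 2, \dots, n
\end{aligned}
```

Blocks 2 to n are therefore encrypted exactly as if \s-1CBC\s0 had been started with the initialization vector C_1 = E_K(P_1), that is, the encrypted \s-1RAND+SEQNO\s0 block. If \s-1RAND\s0 is resized so that \s-1RAND+SEQNO\s0 no longer matches the block size, P_1 no longer coincides with \s-1RAND+SEQNO\s0 and the second reading does not apply.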

The authentication protocol

Before nodes can exchange packets, they need to establish authenticity of the other side and a key. Every node has a private \s-1RSA\s0 key and the public \s-1RSA\s0 keys of all other nodes.

A host establishes a simplex connection by sending the other node an \s-1RSA\s0-encrypted auth request containing a random challenge (consisting of the encryption and authentication keys to use when sending packets, more random data and \s-1PKCS1_OAEP\s0 padding) and a random 16-byte \*(L"challenge-id\*(R" (used to detect duplicate auth packets). The destination node will respond by replying with an (unencrypted) hash of the decrypted challenge, which will authenticate that node. The destination node will also set the outgoing encryption parameters as given in the packet.

When the source node receives a correct auth reply (by verifying the hash and the id, which will expire after 120 seconds), it will start to accept data packets from the destination node.
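The bookkeeping on the initiating side can be sketched as follows. The \s-1RSA\s0 step is left out entirely, so the challenge is handed to `respond` directly. SHA-256 as the hash, the 64-byte challenge size and all names are assumptions made for this example. Only the 16-byte id and the 120-second expiry come from the text.

```python
import hashlib
import hmac
import os

CHALLENGE_TTL = 120  # seconds, as stated in the text above


class PendingChallenges:
    """Tracks outstanding auth requests on the initiating node."""

    def __init__(self):
        self._pending = {}  # challenge-id -> (expected hash, expiry time)

    def create(self, now: float) -> tuple[bytes, bytes]:
        challenge_id = os.urandom(16)
        # ASSUMPTION: 64 random bytes stand in for keys plus random data.
        challenge = os.urandom(64)
        expected = hashlib.sha256(challenge).digest()
        self._pending[challenge_id] = (expected, now + CHALLENGE_TTL)
        return challenge_id, challenge

    def verify(self, challenge_id: bytes, reply_hash: bytes, now: float) -> bool:
        entry = self._pending.get(challenge_id)
        if entry is None:
            return False  # unknown or already used id
        expected, expires = entry
        if now > expires:
            del self._pending[challenge_id]
            return False  # the id has expired
        if not hmac.compare_digest(expected, reply_hash):
            return False
        del self._pending[challenge_id]  # each id is accepted only once
        return True


def respond(challenge: bytes) -> bytes:
    """Destination side: prove knowledge of the decrypted challenge."""
    return hashlib.sha256(challenge).digest()
```

Only a node holding the matching private key could have decrypted the challenge, so a correct hash shows that the reply comes from the intended peer.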

This means that a node can only initiate a simplex connection, telling the other side the key it has to use when it sends packets. The challenge reply is only used to set the current \s-1IP\s0 address of the other side and protocol parameters.

This protocol is completely symmetric, so to be able to send packets the destination node must send a challenge in the exact same way as already described (so, in essence, two simplex connections are created per node pair).

Retrying

When there is no response to an auth request, the node will send auth requests in bursts with an exponential back-off. After some time it will resort to \s-1PING\s0 packets, which are very small (8 bytes + protocol header) and lightweight (no \s-1RSA\s0 operations required). A node that receives ping requests from an unconnected peer will respond by trying to create a connection.

In addition to the exponential back-off, there is a global rate-limit on a per-IP basis. It allows long bursts but will limit total packet rate to something like one control packet every ten seconds, to avoid accidental floods due to protocol problems (like an \s-1RSA\s0 key file mismatch between two nodes).
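A limiter with this behaviour (bursts allowed, long-term average of one packet per ten seconds) is commonly built as a token bucket. The sketch below uses that technique as a stand-in. Whether \s-1GVPE\s0 implements its limit this way is not stated here, and the class name and burst size are assumptions.

```python
class ControlRateLimiter:
    """Per-IP token bucket: one token is refilled every `interval` seconds."""

    def __init__(self, burst: int = 20, interval: float = 10.0):
        self.burst = burst        # ASSUMPTION: burst size is illustrative
        self.interval = interval  # ten seconds, as in the text above
        self._state = {}          # ip -> (tokens, time of last update)

    def allow(self, ip: str, now: float) -> bool:
        tokens, last = self._state.get(ip, (float(self.burst), now))
        # Refill for the elapsed time, never beyond the burst size.
        tokens = min(float(self.burst), tokens + (now - last) / self.interval)
        if tokens >= 1.0:
            self._state[ip] = (tokens - 1.0, now)
            return True
        self._state[ip] = (tokens, now)
        return False
```

A peer that has been quiet for a while can send a burst of control packets, but a sustained flood is throttled to the refill rate.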

The intervals between retries are limited by the \*(C`max-retry\*(C' configuration value. A node with \*(C`connect\*(C' = \*(C`always\*(C' will always retry. A node with \*(C`connect\*(C' = \*(C`ondemand\*(C' will only try (and retry) to connect as long as there are packets in the queue; this usually limits the retry period to \*(C`max-ttl\*(C' seconds.
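The effect of capping an exponential back-off can be sketched as follows. The starting interval of one second and the doubling factor are assumptions chosen for illustration, and the sketch ignores the bursts mentioned above. Only the cap corresponds to the \*(C`max-retry\*(C' value.

```python
from itertools import islice


def retry_intervals(max_retry: float, initial: float = 1.0, factor: float = 2.0):
    """Yield successive wait times, growing exponentially up to max_retry."""
    interval = initial  # ASSUMPTION: initial value and factor are made up
    while True:
        yield min(interval, max_retry)
        interval = min(interval * factor, max_retry)


def first_intervals(count: int, max_retry: float) -> list[float]:
    """Convenience helper: the first `count` wait times as a list."""
    return list(islice(retry_intervals(max_retry), count))
```

With a cap of 10 seconds, the waits grow as 1, 2, 4 and 8 seconds and then stay at 10 seconds.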

Sending packets over the \s-1VPN\s0 will reset the retry intervals as well, which means as long as somebody is trying to send packets to a given node, \s-1GVPE\s0 will try to connect every few seconds.

Routing and Protocol translation

The \s-1GVPE\s0 routing algorithm is easy, as there isn't much routing to speak of. When routing packets to another node, \s-1GVPE\s0 tries the following options, in order:

If the two nodes should be able to reach each other directly (common protocol, port known), then \s-1GVPE\s0 will send the packet directly to the other node.
If no such router exists, then \s-1GVPE\s0 will simply send the packet to the node with the highest priority available.
Failing all that, the packet will be dropped.
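The options listed above can be sketched as a small decision function. It covers only the cases spelled out in this list. The `Node` type, its fields, and the reading of "available" as "currently connected" are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    priority: int
    connected: bool = False
    directly_reachable: bool = False  # common protocol and known port


def next_hop(dest: Node, others: list[Node]) -> Optional[Node]:
    """Return the node to send to, or None if the packet is dropped."""
    if dest.directly_reachable:
        return dest  # send directly to the destination
    # ASSUMPTION: "available" is read as "currently connected".
    candidates = [n for n in others if n.connected and n is not dest]
    if candidates:
        return max(candidates, key=lambda n: n.priority)
    return None  # failing all that, drop the packet
```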

A host can usually declare itself unreachable directly by setting its port number(s) to zero. It can declare other hosts as unreachable by using a config-file that disables all protocols for these other hosts. Another option is to disable all protocols on that host in the other config files.

If two hosts cannot connect to each other because their \s-1IP\s0 address(es) are not known (such as dial-up hosts), one side will send a mediated connection request to a router (routers must be configured to act as routers!), which will send both the originating and the destination host a connection info request with protocol information and \s-1IP\s0 address of the other host (if known). Both hosts will then try to establish a direct connection to the other peer, which is usually possible even when both hosts are behind a \s-1NAT\s0 gateway.

Routing via other nodes works because the \s-1SRCDST\s0 field is not encrypted, so the router can just forward the packet to the destination host. Since each host uses its own private key, the router will not be able to decrypt or encrypt packets; it will just act as a simple router and protocol translator.