\chapter{\label{background}Background}
In this chapter we describe the key technologies involved and their
relation to our work.
% ----------------------------------------------------------------------
\section{\label{background:p4}P4}
P4 is a programming language designed for programming network
equipment. Its main features are protocol and target independence.
\textit{Protocol independence} refers to the separation of concerns
between the language and the protocols: P4, generally speaking, operates on
bits that are parsed and then made accessible in self-defined
structures called headers. The general flow is shown in
figure \ref{fig:p4fromnsg}: a parser parses the incoming packet and
prepares it for processing in the switching logic. Afterwards the
parsed data may be deparsed and the packet is output.
In the context of NAT64 this is a very important feature: while the
parser reads one protocol in the ingress pipeline
(e.g.\ IPv6), the deparser can emit a different protocol (e.g.\ IPv4).
\begin{figure}[htbp]
\includegraphics[scale=0.9]{p4-from-nsg}
\centering
\caption{P4 Protocol Independence~\cite{vanbever:_progr_networ_data_planes}}
\label{fig:p4fromnsg}
\end{figure}
The \textit{target independence} is the second major feature
of P4: it allows code to be compiled to different targets. While in
theory the P4 code should be completely target independent, in reality,
there are some modifications needed on a per-target basis and each
target faces different restrictions. The challenges arising from this
are discussed in section \ref{results:p4}.
Compared to general purpose programming languages, P4 lacks some
features, most notably loops, floating point operations and
modulo operations.
However, within its constraints, P4 can guarantee
operation at line speed, which general purpose programming languages
cannot guarantee and also fail to achieve in reality
(see section \ref{results:softwarenat64} for details).
% ok
% ----------------------------------------------------------------------
\section{\label{background:ip}IPv6, IPv4 and Ethernet}
RFC2460~\cite{rfc2460}, which specifies IPv6, was published in 1998. Both
IPv6 and IPv4 operate on layer 3 of the OSI model. In this thesis we only
consider transmission via Ethernet, which operates at
layer 2. Inside the Ethernet frame a field named ``type'' identifies
the higher level protocol.\footnote{
0x0800 for IPv4~\cite{rfc894} and 0x86DD for IPv6~\cite{rfc2464}.}
This is important because
an Ethernet frame can only reference one protocol, which makes IPv4 and
IPv6 mutually exclusive within a frame.
Figures \ref{fig:ipv4header} and \ref{fig:ipv6header} show
the packet headers of IPv4 and IPv6 to illustrate the differences
between the protocols. The most notable differences between
the two protocols for this thesis are:
\begin{itemize}
\item Different address lengths
\begin{itemize}
\item IPv4: 32 bits
\item IPv6: 128 bits
\end{itemize}
\item Lack of a header checksum in IPv6
\item Format of pseudo headers (see section \ref{background:checksums})
\end{itemize}
% ----------------------------------------------------------------------
\section{\label{background:arpndp}ARP and NDP, ICMP and ICMP6}
While IPv6 and IPv4 are primarily used as a ``shell'' to support
addressing for protocols that have no or limited addressing support
(like TCP or UDP), protocols like ARP~\cite{rfc826} and
NDP~\cite{rfc4861} resolve IPv4 and IPv6 addresses, respectively,
to hardware (MAC) addresses. Although both ARP and NDP are only
used prior to establishing a connection and their results are
cached, their availability is crucial for operating a switch, because
without ARP or NDP no connection will ever be established.
Figure \ref{fig:arpndp} illustrates a typical address resolution process.
\begin{figure}[htbp]
\includegraphics[scale=0.4]{arp-ndp}
\centering
\caption{ARP and NDP}
\label{fig:arpndp}
\end{figure}
The major differences between ARP and NDP in relation to P4 are:
\begin{itemize}
\item ARP is a separate protocol on the same layer as IPv6 and IPv4,
\item NDP is carried in ICMP6, which in turn is carried in IPv6,
\item NDP contains checksums over payload,
\item and NDP in ICMP6 contains optional, non-referenced option fields
(specifically: ICMP6 link layer address option).
\end{itemize}
ARP has to be a separate protocol, because before address resolution
IPv4 hosts have no way of reaching the target IPv4 address and thus
cannot use IPv4 itself for the resolution (a chicken-and-egg
problem).
NDP on the other hand already works within IPv6, as every IPv6 host is
required to have a self-assigned link local IPv6 address from the
IPv6 network \texttt{fe80::/10} (compare
RFC4291~\cite{rfc4291}). While ARP uses broadcasting for address
resolution, NDP uses multicasting. IPv6 hosts automatically
join multicast groups that embed parts of their
IPv6 addresses~\cite{rfc2710,wikipedia:_solic}. This way a
resolution request reaches significantly fewer hosts in IPv6 than in IPv4.
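
As a minimal sketch of how such a multicast group is derived, the
following Python snippet computes the solicited-node multicast address
described in RFC4291: the low 24 bits of the unicast address are
appended to the prefix \texttt{ff02::1:ff00:0/104}. The function name
and the example address are our own choices for illustration.

\begin{verbatim}
import ipaddress

def solicited_node_multicast(address: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group of a unicast address."""
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    # Only the low 24 bits of the unicast address are embedded.
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    return ipaddress.IPv6Address(base | low24)

print(solicited_node_multicast("2001:db8::1:800:200e:8c6c"))
# ff02::1:ff0e:8c6c
\end{verbatim}

Hosts whose addresses differ in the low 24 bits join different
groups, so a neighbor solicitation is usually delivered to very few hosts.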
As seen later in this document (compare
section \ref{results:netpfga:features}), the requirement to generate checksums
over payload poses difficult problems for some hardware targets. Even
more difficult is the use of options within ICMP6.
\begin{figure}[htbp]
\includegraphics[scale=0.4]{icmp6ndp}
\centering
\caption{ICMP6 Option Fields}
\label{fig:icmp6ndp}
\end{figure}
The problem arises from the layout of the options, as seen
in figure \ref{fig:icmp6ndp} and the following quote:
\begin{quote}
``Neighbor Discovery messages include zero or more options, some of
which may appear multiple times in the same message. Options should
be padded when necessary to ensure that they end on their natural
64-bit boundaries''.\footnote{Quote from \cite{rfc4861}.}
\end{quote}
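
To illustrate why this layout is inconvenient for a language without
loops, the following Python sketch walks the option area of a
Neighbor Discovery message. Each option starts with a type octet and a
length octet, where the length is given in units of 8 octets and
includes these two octets (RFC4861). The function name and the sample
bytes are assumptions made for this example.

\begin{verbatim}
def parse_ndp_options(data: bytes) -> list:
    """Split an NDP option area into (type, value) pairs."""
    options = []
    offset = 0
    while offset + 2 <= len(data):
        opt_type = data[offset]
        length = data[offset + 1] * 8  # unit: 8 octets
        if length == 0 or offset + length > len(data):
            raise ValueError("malformed NDP option")
        options.append((opt_type, data[offset + 2:offset + length]))
        offset += length
    if offset != len(data):
        raise ValueError("trailing bytes after last option")
    return options

# Source link-layer address option (type 1, length 1 = 8 octets)
sample = bytes([1, 1]) + bytes.fromhex("020000000001")
print(parse_ndp_options(sample))
\end{verbatim}

The number of iterations depends on the packet content, which is
exactly what a parser without loops cannot express directly.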
ICMP6 and ICMP are primarily used to signal errors in
communication. Specifically, signalling that a packet is too big to
pass a certain link and needs fragmentation is a common functionality
of both protocols. For a host (or a switch) to be able to emit ICMP6 and
ICMP messages, the host requires a valid IPv6 / IPv4 address.
Without ICMP6 / ICMP support, path MTU
discovery~\cite{rfc1191,rfc8201}
does not work and the sender needs to find
other ways to determine the MTU of the path.
% ----------------------------------------------------------------------
\section{\label{background:transition}IPv6 Translation Mechanisms}
While in this thesis we focus on NAT64 as a translation mechanism,
there are a variety of different approaches, some of which we
describe here.
% ----------------------------------------------------------------------
\subsection{\label{background:transition:stateless}Stateless NAT64}
Stateless NAT64 describes static mappings between IPv6 and IPv4
addresses. This can be based on longest prefix matching (LPM),
ranges, bitmasks or individual entries.
NAT64 translations as described in this thesis modify multiple layers
in the translation process:
\begin{itemize}
\item Ethernet (changing the type field)
\item IPv4 / IPv6 (changing the protocol, changing the fields)
\item TCP/UDP/ICMP/ICMP6 checksums
\end{itemize}
Figures \ref{fig:ipv6header} and \ref{fig:ipv4header} show the headers
of IPv4 and IPv6. As can be seen in the diagrams not only are the
addresses of different size, but fields have also been changed or
removed when the version changed. Depending on the NAT64
translation direction, a translator will need to re-arrange fields to
a different position, remove fields and add fields.
\begin{figure}[htbp]
\begin{verbatim}
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class | Flow Label |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload Length | Next Header | Hop Limit |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Source Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Destination Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\end{verbatim}
\centering
\caption{IPv6 Header~\cite{rfc2460}}
\label{fig:ipv6header}
\end{figure}
This in turn causes the size of the standard headers
to differ by 160 bits.\footnote{IPv6: 320 bits, IPv4: 160 bits (without options).}
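
The re-arrangement of fields can be sketched in Python as follows. This
is a simplified illustration, loosely oriented on the IP/ICMP
translation algorithm of RFC7915, and not the implementation used in
this thesis: the function names are hypothetical, the IPv4 addresses
are passed in by the caller, the hop limit is copied without
decrementing, and the fragmentation fields are simply set to
``don't fragment'' with an identification of zero.

\begin{verbatim}
import struct

def checksum16(data: bytes) -> int:
    """Internet checksum over an even number of bytes."""
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def ipv6_to_ipv4_header(ipv6_header: bytes,
                        src4: bytes, dst4: bytes) -> bytes:
    """Build a 20 byte IPv4 header from a 40 byte IPv6 header."""
    vtf, payload_len, next_header, hop_limit = struct.unpack(
        "!IHBB", ipv6_header[:8])
    traffic_class = (vtf >> 20) & 0xFF
    # ICMP6 (58) becomes ICMP (1), other protocol numbers are kept
    protocol = 1 if next_header == 58 else next_header
    header = struct.pack(
        "!BBHHHBBH",
        0x45,              # version 4, header length 5 * 32 bits
        traffic_class,     # type of service
        payload_len + 20,  # total length includes the IPv4 header
        0,                 # identification (assumption: zero)
        0x4000,            # flags: don't fragment, offset 0
        hop_limit,         # time to live
        protocol,
        0,                 # checksum placeholder
    ) + src4 + dst4
    csum = checksum16(header)
    return header[:10] + struct.pack("!H", csum) + header[12:]
\end{verbatim}

The flow label has no IPv4 counterpart and is dropped, while the header
checksum, which IPv6 lacks, has to be computed for the new header.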
% ----------------------------------------------------------------------
\subsection{\label{background:transition:statefulnat64}Stateful NAT64}
Stateful NAT64, defined in RFC6146~\cite{rfc6146}, describes how to
create 1:n mappings between IPv6 and IPv4 hosts. The motivation for
stateful NAT64 is similar to stateful NAT44~\cite{rfc3022}: while
NAT44 allows translating many (private) IPv4 addresses to one
(public) IPv4 address,
NAT64 allows translating many IPv6 addresses to one IPv4 address.
While the opposite stateful translation, mapping many IPv4 addresses
to one IPv6 address, is also technically possible,
the differences in address space size don't justify its use in general.
\begin{figure}[htbp]
\begin{verbatim}
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| IHL |Type of Service| Total Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |Flags| Fragment Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time to Live | Protocol | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Destination Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\end{verbatim}
\caption{IPv4 Header~\cite{rfc791}}
\label{fig:ipv4header}
\end{figure}
Stateful NAT64 in particular uses information in higher level
protocols to multiplex connections: given one IPv4 address and the TCP
protocol, outgoing connections from IPv6 hosts can be dynamically mapped
to the range of possible TCP ports. After a session is closed, the
port can be reused.
\begin{figure}[htbp]
\includegraphics[scale=0.5]{statefulnat64}
\centering
\caption{Stateful NAT64}
\label{fig:statefulnat64}
\end{figure}
The selection of mapped ports is usually based on the availability of
ports on the IPv4 side and not related to the original port. To support
stateful NAT64, the translator needs to store the mapping
in a table and purge entries regularly.
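
Such a table can be sketched in Python as follows. This is a deliberately
reduced model under our own assumptions: it handles a single IPv4
address and a single protocol, identifies a session only by the IPv6
source address and port, and uses an arbitrary idle timeout. The class
and method names are hypothetical.

\begin{verbatim}
import time

class Nat64SessionTable:
    """Maps (IPv6 address, port) to a port on one IPv4 address."""

    def __init__(self, ports=range(1024, 65536), timeout=300.0):
        self.free = list(ports)  # unused ports on the IPv4 side
        self.timeout = timeout   # idle time in seconds (assumption)
        self.by_v6 = {}          # (v6 address, port) -> mapped port
        self.by_port = {}        # mapped port -> (v6 address, port)
        self.last_seen = {}      # mapped port -> time of last packet

    def outgoing(self, v6_addr, v6_port, now=None):
        """Return the IPv4 side port for a packet from an IPv6 host."""
        now = time.monotonic() if now is None else now
        key = (v6_addr, v6_port)
        port = self.by_v6.get(key)
        if port is None:
            if not self.free:
                raise RuntimeError("no free port on the IPv4 side")
            port = self.free.pop(0)
            self.by_v6[key] = port
            self.by_port[port] = key
        self.last_seen[port] = now
        return port

    def incoming(self, port):
        """Return the IPv6 (address, port) for a reply, or None."""
        return self.by_port.get(port)

    def purge(self, now=None):
        """Remove idle sessions and make their ports available again."""
        now = time.monotonic() if now is None else now
        for port, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                key = self.by_port.pop(port)
                del self.by_v6[key]
                del self.last_seen[port]
                self.free.append(port)
\end{verbatim}

Replies arriving on the IPv4 side are matched with
\texttt{incoming()}; a reply to an unknown port has no session and
cannot be translated.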
Stateful NAT64 usually uses information found in protocols at layer 4
like TCP~\cite{rfc793} or UDP~\cite{rfc768}. However, it can also
support ICMP~\cite{rfc792} and ICMP6~\cite{rfc4443}.
% ----------------------------------------------------------------------
\subsection{\label{background:transition:Protocol dependent}Higher
Layer Protocol Dependent Translation}
Further translation can be achieved by using information in higher
level protocols like HTTP~\cite{rfc2616} or TLS~\cite{rfc4366}.
Application proxies like
nginx~\cite{nginx:_nginx_high_perfor_load_balan}
use layer 7 protocol information, like the requested hostname,
to proxy towards backends.
Within this proxying method, the underlying IP protocol can be changed
from IPv6 to IPv4 and vice versa. However, if using HTTPS with TLS
1.3~\cite{rfc8446}, the requested hostname that is usually used for
selecting the backend can be encrypted, which poses a challenge for
implementations.
While protocol dependent translation has the largest amount of
information to choose from for translation, it requires complex
parsers or even cryptographic methods. That reduces the
opportunities for protocol dependent translation to run on
less sophisticated devices.
% ----------------------------------------------------------------------
\subsection{\label{background:transition:prefixnat}Mapping IPv4
Addresses in IPv6}
As described in section \ref{background:ip}, one of the major
differences between IPv6 and IPv4 is the address length. As the whole
IPv4 Internet can be represented in only 32 bits, it is a common
practice to assign an IPv6 prefix for IPv6 hosts that represents a
mapping to the whole IPv4 Internet. In RFC6052~\cite{rfc6052} the well
known prefix \textit{64:ff9b::/96} is defined that can be used for
this purpose. One possibility to map
an IPv4 address into the prefix is by adding its integer value to the
prefix, treating it as an offset. Figure \ref{fig:ipv4embed}
shows example Python code of how this can be done.
\begin{figure}[htbp]
\begin{verbatim}
>>> import ipaddress
>>> prefix=ipaddress.IPv6Network("64:ff9b::/96")
>>> ipv4address=ipaddress.IPv4Address("192.0.2.0")
>>> int(ipv4address)
3221225984
>>> hex(3221225984)
'0xc0000200'
>>> prefix[int(ipv4address)]
IPv6Address('64:ff9b::c000:200')
\end{verbatim}
\centering
\caption{Representing an IPv4 address in an IPv6 Prefix}
\label{fig:ipv4embed}
\end{figure}
Network administrators can choose to use either the well known prefix
or to use a network block of their own to map the
Internet.\footnote{For instance
2a0a:e5c0:0:1::/96~\cite{ungleich:networkinfrastructure}.}
While a /96 prefix seems a natural selection (it provides exactly 32 bits),
other prefix lengths are defined in RFC6052 (see figure
\ref{fig:prefixlen}) that allow flexible embedding of the IPv4 address.
\begin{figure}[htbp]
\begin{verbatim}
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|PL| 0-------------32--40--48--56--64--72--80--88--96--104---------|
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|32| prefix |v4(32) | u | suffix |
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|40| prefix |v4(24) | u |(8)| suffix |
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|48| prefix |v4(16) | u | (16) | suffix |
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|56| prefix |(8)| u | v4(24) | suffix |
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|64| prefix | u | v4(32) | suffix |
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|96| prefix | v4(32) |
+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
\end{verbatim}
\centering
\caption{IPv4 Embedding Depending on the Prefix Length}
\label{fig:prefixlen}
\end{figure}
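
The layout in figure \ref{fig:prefixlen} can be expressed as a short
Python sketch: the IPv4 octets are written directly after the prefix,
skipping the reserved octet ``u'' (bits 64 to 71), which stays zero.
The function name is our own and the suffix is left at zero; the
example values are taken from RFC6052.

\begin{verbatim}
import ipaddress

def embed_ipv4(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address into an IPv6 prefix (RFC6052 layout)."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen not in (32, 40, 48, 56, 64, 96):
        raise ValueError("unsupported prefix length")
    out = bytearray(net.network_address.packed)
    pos = net.prefixlen // 8
    for octet in ipaddress.IPv4Address(ipv4).packed:
        if pos == 8:  # octet "u" is reserved and remains zero
            pos += 1
        out[pos] = octet
        pos += 1
    return ipaddress.IPv6Address(bytes(out))

print(embed_ipv4("64:ff9b::/96", "192.0.2.0"))
# 64:ff9b::c000:200
print(embed_ipv4("2001:db8:100::/40", "192.0.2.33"))
# 2001:db8:1c0:2:21::
\end{verbatim}

For a /96 prefix the result is identical to the offset approach shown
in figure \ref{fig:ipv4embed}.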
RFC6146, which describes stateful NAT64, states that
``IPv4 addresses of IPv4 hosts are algorithmically
translated to and from IPv6 addresses by using the algorithm defined
in [RFC6052]''~\cite{rfc6146}. While this sentence does not use the
typical RFC keywords like SHALL, REQUIRED, etc.~\cite{rfc2119}, we
interpret it as ``a stateful NAT64
translator SHALL implement IPv4 address embedding as described in the
algorithm of RFC6052''.
% ----------------------------------------------------------------------
\subsection{\label{background:transition:dns64}DNS64}
Tightly related to NAT64 is a technology known as
DNS64~\cite{rfc6147}. DNS64 tries to solve the problem of addressing
IPv4 only hosts from IPv6 only hosts
by adding a ``fake'' IPv6 (AAAA) DNS resource record, as shown in
figure \ref{fig:dns64}.
\begin{figure}[h]
\includegraphics[scale=0.4]{dns64}
\centering
\caption{Illustration of DNS64}
\label{fig:dns64}
\end{figure}
The DNS64 server queries the authoritative DNS server for an AAAA
record. However, as the host \textit{ipv4onlyhost.example.com} is only
reachable by IPv4, it only has an A entry. After receiving the
answer that there is no AAAA record, the DNS64 server asks for an
A record and receives the answer that the name
\textit{ipv4onlyhost.example.com} resolves to the IPv4 address
\textit{192.0.2.0}. The DNS64 server then embeds the IPv4 address in
the configured IPv6 prefix (\textit{64:ff9b::/96} in this case) and
returns a fake AAAA record to the IPv6 only host (pointing to
\textit{64:ff9b::c000:200} in this case). The IPv6 only host
then connects to this address. The NAT64 translator recognises
either that the address is part of a configured prefix or that it has
a dedicated table entry for mapping this IPv6 address to an IPv4
address and translates it accordingly.
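
The decision logic of the DNS64 server can be sketched in a few lines of
Python. The dictionary \texttt{records} is a hypothetical stand-in for
the answers of the authoritative DNS server, the function name is our
own, and the sketch assumes a /96 prefix so that the offset approach of
figure \ref{fig:ipv4embed} applies.

\begin{verbatim}
import ipaddress

def dns64_lookup(name, records, prefix="64:ff9b::/96"):
    """Return AAAA answers, synthesising them from A records if needed."""
    entry = records.get(name, {})
    if entry.get("AAAA"):
        return entry["AAAA"]  # real AAAA records are passed through
    net = ipaddress.IPv6Network(prefix)
    # No AAAA record: embed every A record into the prefix.
    return [str(net[int(ipaddress.IPv4Address(a))])
            for a in entry.get("A", [])]

records = {"ipv4onlyhost.example.com": {"A": ["192.0.2.0"]}}
print(dns64_lookup("ipv4onlyhost.example.com", records))
# ['64:ff9b::c000:200']
\end{verbatim}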
% ok
% ----------------------------------------------------------------------
\section{\label{background:checksums}Protocol Checksums}
One challenge for translating IPv6 to IPv4 is the checksums of higher level
protocols like TCP and UDP, which incorporate information from the lower
level protocols. The pseudo header for upper layer protocols for
IPv6 is defined in RFC2460~\cite{rfc2460} and shown in figure
\ref{fig:ipv6pseudoheader}; the IPv4 pseudo headers for UDP and TCP are
defined in RFC768 and RFC793 and are shown in figure \ref{fig:ipv4pseudoheader}.
\begin{figure}[htbp]
\begin{verbatim}
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Source Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Destination Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Upper-Layer Packet Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| zero | Next Header |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\end{verbatim}
\centering
\caption{IPv6 Pseudo Header}
\label{fig:ipv6pseudoheader}
\end{figure}
When translating, the checksum fields in the higher protocols need to be
adjusted. The checksums for TCP and UDP are calculated not only over the pseudo
headers, but also cover the payload of the packet. This is
important because some targets (like the NetFPGA) do not allow
accessing the payload (see section \ref{design:netpfga}).
\begin{figure}[htbp]
\begin{verbatim}
0 7 8 15 16 23 24 31
+--------+--------+--------+--------+
| source address |
+--------+--------+--------+--------+
| destination address |
+--------+--------+--------+--------+
| zero |protocol| TCP/UDP length |
+--------+--------+--------+--------+
\end{verbatim}
\centering
\caption{IPv4 Pseudo Header}
\label{fig:ipv4pseudoheader}
\end{figure}
The checksums for IPv4, TCP, UDP and ICMP6 are all based on the
``Internet Checksum''~\cite{rfc791,rfc1071}.
Its calculation can be summarised as follows:
\begin{quote}
The checksum field is the 16-bit one's complement of the one's
complement sum of all 16-bit words in the header. For purposes of
computing the checksum, the value of the checksum field
is zero.\footnote{Quote from Wikipedia~\cite{wikipedia:_ipv4}.}
\end{quote}
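
The calculation can be sketched in Python as shown below. Because the
ones' complement sum is associative, a checksum can also be updated
without access to the payload: the contribution of the replaced words
is subtracted and that of the new words is added, as described in
RFC1624. For TCP and UDP only the addresses differ between the two
pseudo headers, as length and protocol number contribute the same
value. The function names and the sample data are our own; this is an
illustration and not necessarily the method used on a specific target.

\begin{verbatim}
import ipaddress
import struct

def ones_complement_sum(data: bytes) -> int:
    """Ones' complement sum of all 16-bit words (odd input is padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total

def internet_checksum(data: bytes) -> int:
    return ~ones_complement_sum(data) & 0xFFFF

def adjust_checksum(old: int, removed: bytes, added: bytes) -> int:
    """New checksum after replacing word aligned `removed` by `added`."""
    total = ~old & 0xFFFF
    total += ~ones_complement_sum(removed) & 0xFFFF
    total += ones_complement_sum(added)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# IPv4 header with the checksum field set to zero
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(internet_checksum(header)))  # 0xb861

# Translating a (made up) 40 byte TCP segment from IPv6 to IPv4
src6 = ipaddress.IPv6Address("2001:db8::1").packed
dst6 = ipaddress.IPv6Address("64:ff9b::c000:200").packed
src4 = ipaddress.IPv4Address("198.51.100.1").packed
dst4 = ipaddress.IPv4Address("192.0.2.0").packed
segment = bytes(range(1, 41))
pseudo6 = src6 + dst6 + struct.pack("!I3xB", len(segment), 6)
pseudo4 = src4 + dst4 + struct.pack("!BBH", 0, 6, len(segment))
full6 = internet_checksum(pseudo6 + segment)
full4 = internet_checksum(pseudo4 + segment)
# The update only needs the old checksum and the addresses.
print(adjust_checksum(full6, src6 + dst6, src4 + dst4) == full4)
\end{verbatim}

The last line prints \texttt{True}: the incremental update yields the
same value as a full recalculation over pseudo header and payload.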
% ----------------------------------------------------------------------
\section{\label{background:networkdesign}Network Designs}
In relation to IPv6 and IPv4, there are in general three different
network designs possible.
IPv4 only networks are the oldest form.
These networks consist of
hosts that are either not configured for IPv6 or are even technically
incapable of enabling the IPv6 protocol. These nodes are connected to
an IPv4 router that is connected to the Internet. That router might be
capable of translating IPv4 to IPv6 and vice versa.
With the introduction of IPv6, hosts can run both IP stacks in
parallel; hosts in that configuration are called ``dualstack hosts''.
Dualstack hosts are capable of reaching both IPv6 and IPv4 hosts
directly without the need for any translation mechanism.
The last possible network design is based on IPv6 only hosts.
While it is technically easy to disable IPv4,
completely removing the IPv4 stack in current operating
systems is not an easy task~\cite{ungleich:_ipv4}.
%% \begin{figure}[h]
%% \includegraphics[scale=0.5]{v6only}
%% \centering
%% \caption{IPv6 only network}
%% \label{fig:v6onlynet}
%% \end{figure}
While the three network designs look similar, there are significant
differences in operating them and limitations that are not easy to
circumvent. In the following sections, we describe the limitations and
explain how a translation mechanism like our NAT64 implementation
should be deployed.
% ----------------------------------------------------------------------
\subsection{\label{background:networkdesign:ipv4}IPv4 Only Network Limitations}
As shown in figures \ref{fig:ipv4header} and \ref{fig:ipv6header}
the IPv4 address size is 32 bit, while the IPv6 address size is 128
bit.
Without an extension to the address space, there is no protocol independent
mapping of IPv4 addresses to IPv6 addresses\footnote{See section
\ref{background:ip}.} that can cover the whole IPv6 address space.
Thus IPv4 only hosts can
never address every host in the IPv6 Internet. While protocol
dependent translations can try to minimise the impact, accessing all
IPv6 addresses independent of the protocol is not possible.
% ok
% ----------------------------------------------------------------------
\subsection{\label{background:networkdesign:dualstack}Dualstack Network
Maintenance}
While dualstack hosts can address any host in either IPv6 or IPv4
networks, the deployment of dualstack hosts comes with a major
disadvantage: all network configurations double. The required routing
tables double, the firewall rules roughly double\footnote{The rule sets
even for identical policies in IPv6 and IPv4 networks are not
identical, but similar. For this reason we state that roughly double
the amount of firewall rules are required for the same policy to be
applied.} and the number of network supporting systems (like DHCPv4,
DHCPv6, router advertisement daemons, etc.) also roughly doubles.
Additionally, services that run on either IPv6 or IPv4 might need to be
configured to run in dualstack mode as well and not all software
might be capable of that.
So while there is the instant benefit of not requiring any transition mechanism
or translation method, we argue that the added complexity (and thus
operational cost) of running dualstack networks can be significant.
% ----------------------------------------------------------------------
\subsection{\label{background:networkdesign:v6only}IPv6 Only Networks}
IPv6 only networks are in our opinion the best choice for long term
deployments. Our reasons for this are the following: first, hosts
will eventually need to support IPv6; second,
IPv6 hosts can address the whole 32 bit IPv4 Internet mapped into
a single /96 IPv6 network. IPv6 only networks also allow the operators
to focus on one IP stack.
% ----------------------------------------------------------------------
\section{\label{background:netfpga}NetFPGA}
\begin{figure}[htbp]
\includegraphics[scale=0.4]{sumeboard}
\centering
\caption{NetFPGA Board~\cite{zilberman:_netfp_sume}}
\label{fig:netfpga}
\end{figure}
The NetFPGA~\cite{zilberman:_netfp_sume}
is an FPGA card featuring four 10 Gbit/s SFP+ ports. It
includes a Xilinx Virtex-7 690T FPGA, 27 MB of storage
for table data and 8 GB of DDR3 RAM. The NetFPGA can be
run inside a host (connected via PCIe Gen3) or as a standalone
card.
It can be used as a ``traditional'' FPGA, with the focus on designing
the logic. However, the NetFPGA also supports the P4 programming
language~\cite{netfpga:_p4_netpf_public_github} and thus abstracts
away the low level logic by providing a higher level interface.
For the purpose of this thesis we treat the NetFPGA as a standard P4
target, similar to other available P4
targets~\cite{networks:_tofin,networks:_tofin1,networks:_arist_series}.
In particular, we treat the NetFPGA as a
P4 capable, four port 10 Gbit/s network switch that allows us to
process packets at line speed.
% ok