Cleanup background

Nico Schottelius 2019-08-18 23:58:10 +02:00
parent d771c7de6b
commit d7d38ad820
17 changed files with 665 additions and 590 deletions


@ -1,12 +1,14 @@
\chapter{\label{background}Background} \chapter{\label{background}Background}
In this chapter we describe the key technologies involved. In this chapter we describe the key technologies involved and their
relation to our work.
% ----------------------------------------------------------------------
\section{\label{background:p4}P4} \section{\label{background:p4}P4}
P4 is a programming language designed to program inside network P4 is a programming language designed to program inside network
equipment. Its main features are protocol and target independence. equipment. Its main features are protocol and target independence.
The \textit{protocol independence} refers to the separation of concerns in The \textit{protocol independence} refers to the separation of concerns in
terms of language and protocols: P4 generally speaking operates on terms of language and protocols: P4, generally speaking, operates on
bits that are parsed and then accessible in the (self) defined bits that are parsed and then accessible in the self defined
structures, also called headers. The general flow can be seen in structures called headers. The general flow can be seen in
figure \ref{fig:p4fromnsg}: a parser parses the incoming packet and figure \ref{fig:p4fromnsg}: a parser parses the incoming packet and
prepares it for processing in the switching logic. Afterwards the prepares it for processing in the switching logic. Afterwards the
packets are output and deparsing of the parsed data might follow. packets are output and deparsing of the parsed data might follow.
@ -27,13 +29,13 @@ target faces different restrictions. The challenges arising from this
are discussed in section \ref{results:p4}. are discussed in section \ref{results:p4}.
As opposed to general purpose programming languages, P4 lacks some As opposed to general purpose programming languages, P4 lacks some
features, most notably loops, floating point operations and the features. Most notably loops, floating point operations and
modulo operator. modulo operations.
However within its constraints, P4 can guarantee However within its constraints, P4 can guarantee
operation at line speed, which general purpose programming languages operation at line speed, which general purpose programming languages
cannot guarantee and also fail to achieve in reality cannot guarantee and also fail to achieve in reality
(see section \ref{results:softwarenat64} for details). (see section \ref{results:softwarenat64} for details).
% ok
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{background:ip}IPv6, IPv4 and Ethernet} \section{\label{background:ip}IPv6, IPv4 and Ethernet}
The first IPv6 RFC was published in 1998~\cite{rfc2460}. Both IPv4 and The first IPv6 RFC was published in 1998~\cite{rfc2460}. Both IPv4 and
@ -43,7 +45,8 @@ layer 2. Inside the Ethernet frame a field named ``type'' specifies
the higher level protocol identifier.\footnote{ the higher level protocol identifier.\footnote{
0x0800 for IPv4~\cite{rfc894} and 0x86DD for IPv6~\cite{rfc2464}.} 0x0800 for IPv4~\cite{rfc894} and 0x86DD for IPv6~\cite{rfc2464}.}
This is important, because This is important, because
Ethernet can only carry either of the two protocols. Ethernet can only reference one protocol, which makes IPv4 and IPv6
mutually exclusive.
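As a rough illustration of the type field, the following Python sketch reads the EtherType of a raw Ethernet frame and names the protocol it references. It is not part of the thesis code: the function name and the sample frames are made up for this example, and VLAN tagged frames are not handled.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # RFC 894
ETHERTYPE_IPV6 = 0x86DD  # RFC 2464


def ethertype_of(frame: bytes) -> str:
    """Name the protocol referenced by the type field of an untagged frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 0-5: destination MAC, bytes 6-11: source MAC, bytes 12-13: type
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_IPV4:
        return "ipv4"
    if ethertype == ETHERTYPE_IPV6:
        return "ipv6"
    return "other"


if __name__ == "__main__":
    # Hypothetical frame: zeroed MAC addresses followed by the IPv6 type value
    print(ethertype_of(b"\x00" * 12 + b"\x86\xdd"))
```

Because a frame carries exactly one type value, a translator that turns an IPv6 packet into an IPv4 packet also has to rewrite this field.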
The figures \ref{fig:ipv4header} and \ref{fig:ipv6header} show the The figures \ref{fig:ipv4header} and \ref{fig:ipv6header} show the
packet headers of IPv4 and IPv6. The most notable differences between packet headers of IPv4 and IPv6. The most notable differences between
the two protocols for this thesis are: the two protocols for this thesis are:
@ -56,59 +59,7 @@ the two protocols for this thesis are:
\item Lack of a checksum in IPv6 \item Lack of a checksum in IPv6
\item Format of Pseudo headers (see section \ref{background:checksums}) \item Format of Pseudo headers (see section \ref{background:checksums})
\end{itemize} \end{itemize}
\begin{figure}[h]
\begin{verbatim}
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class | Flow Label |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload Length | Next Header | Hop Limit |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Source Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Destination Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\end{verbatim}
\centering
\caption{IPv6 Header~\cite{rfc2460}}
\label{fig:ipv6header}
\end{figure}
\begin{figure}[h]
\begin{verbatim}
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| IHL |Type of Service| Total Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |Flags| Fragment Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time to Live | Protocol | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Destination Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\end{verbatim}
\caption{IPv4 Header~\cite{rfc791}}
\label{fig:ipv4header}
\end{figure}
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{background:arpndp}ARP and NDP, ICMP and ICMP6} \section{\label{background:arpndp}ARP and NDP, ICMP and ICMP6}
While IPv6 and IPv4 are primarily used as a ``shell'' to support While IPv6 and IPv4 are primarily used as a ``shell'' to support
@ -157,7 +108,7 @@ typical layout of a neighbor advertisement messages.
\label{fig:icmp6ndp} \label{fig:icmp6ndp}
\end{figure} \end{figure}
The problem arises from the layout of the options, as seen in the The problem arises from the layout of the options, as seen in the
following quote: following quote and in figure \ref{fig:icmp6ndp}:
\begin{quote} \begin{quote}
``Neighbor Discovery messages include zero or more options, some of ``Neighbor Discovery messages include zero or more options, some of
which may appear multiple times in the same message. Options should which may appear multiple times in the same message. Options should
@ -181,20 +132,53 @@ While in this thesis the focus was in NAT64 as a translation mechanism,
there are a variety of different approaches, some of which we would there are a variety of different approaches, some of which we would
like to portray here. like to portray here.
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\subsection{\label{background:transition:staticnat64}Static NAT64} \subsection{\label{background:transition:stateless}Stateless NAT64}
Static NAT64 describes static mappings between IPv6 and IPv4 Stateless NAT64 describes static mappings between IPv6 and IPv4
addresses. This can be based on longest prefix matchings (LPM), addresses. This can be based on longest prefix matchings (LPM),
ranges, bitmasks or individual entries. ranges, bitmasks or individual entries.
NAT64 translations as described in this thesis modify multiple layers NAT64 translations as described in this thesis modify multiple layers
in the translation process: in the translation process:
\begin{itemize} \begin{itemize}
\item Ethernet (changing the type field) \item Ethernet (changing the type field)
\item IPv4 / IPv6 (changing the protocol, changing the fields) \item IPv4 / IPv6 (changing the protocol, changing the fields)
\item TCP/UDP/ICMP/ICMP6 checksums \item TCP/UDP/ICMP/ICMP6 checksums
\end{itemize} \end{itemize}
Figures \ref{fig:ipv6header} and \ref{fig:ipv4header} show the headers
of IPv4 and IPv6. As can be seen in the diagrams not only are the
addresses of different size, but fields have also been changed or
removed when the version changed. Depending on the NAT64
translation direction, a translator will need to re-arrange fields to
a different position, remove fields and add fields.
\begin{figure}[h]
\begin{verbatim}
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class | Flow Label |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload Length | Next Header | Hop Limit |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Source Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Destination Address +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\end{verbatim}
\centering
\caption{IPv6 Header~\cite{rfc2460}}
\label{fig:ipv6header}
\end{figure}
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\subsection{\label{background:transition:statefulnat64}Stateful NAT64} \subsection{\label{background:transition:statefulnat64}Stateful NAT64}
Stateful NAT64 as defined in RFC6146~\cite{rfc6146} defines how to Stateful NAT64 as defined in RFC6146~\cite{rfc6146} defines how to
@ -204,6 +188,31 @@ translating many IPv6 addresses to one IPv4 address. While the
opposite translation is also technically possible, the differences in opposite translation is also technically possible, the differences in
address space don't justify its use in general. address space don't justify its use in general.
\begin{figure}[h]
\begin{verbatim}
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| IHL |Type of Service| Total Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |Flags| Fragment Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time to Live | Protocol | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Destination Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\end{verbatim}
\caption{IPv4 Header~\cite{rfc791}}
\label{fig:ipv4header}
\end{figure}
Stateful NAT64 in particular uses information in higher level Stateful NAT64 in particular uses information in higher level
protocols to multiplex connections: Given one IPv4 address and the TCP protocols to multiplex connections: Given one IPv4 address and the TCP
protocol, outgoing connections from IPv6 hosts can be dynamically mapped protocol, outgoing connections from IPv6 hosts can be dynamically mapped
@ -381,9 +390,8 @@ access the payload.
+--------+--------+--------+--------+ +--------+--------+--------+--------+
| destination address | | destination address |
+--------+--------+--------+--------+ +--------+--------+--------+--------+
| zero |protocol| UDP length | | zero |protocol| TCP/UDP length |
+--------+--------+--------+--------+ +--------+--------+--------+--------+
\end{verbatim} \end{verbatim}
\centering \centering
\caption{IPv4 Pseudo Header} \caption{IPv4 Pseudo Header}
@ -391,43 +399,42 @@ access the payload.
\end{figure} \end{figure}
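The layout of the IPv4 pseudo header can be written down in a few lines of Python. This is only a sketch to make the field order concrete; the function name and the example addresses are assumptions and do not come from the thesis code.

```python
import socket
import struct


def ipv4_pseudo_header(src: str, dst: str, protocol: int, length: int) -> bytes:
    """Build the 12 byte IPv4 pseudo header used for TCP/UDP checksums."""
    return (
        socket.inet_aton(src)                        # source address (4 bytes)
        + socket.inet_aton(dst)                      # destination address (4 bytes)
        + struct.pack("!BBH", 0, protocol, length)   # zero, protocol, TCP/UDP length
    )


if __name__ == "__main__":
    # Hypothetical example: a UDP (protocol 17) datagram of 20 bytes
    print(ipv4_pseudo_header("192.0.2.1", "198.51.100.2", 17, 20).hex())
```

The pseudo header is never sent on the wire; it is only prepended to the TCP or UDP segment while the checksum is computed.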
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{background:networkdesign}Network Designs} \section{\label{background:networkdesign}Network Designs}
\begin{figure}[h] %% \begin{figure}[h]
\includegraphics[scale=0.5]{v4only} %% \includegraphics[scale=0.5]{v4only}
\centering %% \centering
\caption{IPv4 only network} %% \caption{IPv4 only network}
\label{fig:v4onlynet} %% \label{fig:v4onlynet}
\end{figure} %% \end{figure}
In relation to IPv6 and IPv4, there are in general three different In relation to IPv6 and IPv4, there are in general three different
network designs possible: network designs possible:
The oldest form are IPv4 only networks (see figure The oldest form are IPv4 only networks.
\ref{fig:v4onlynet}).
These networks consist of These networks consist of
hosts that are either not configured for IPv6 or are even technically hosts that are either not configured for IPv6 or are even technically
incapable of enabling the IPv6 protocol. These nodes are connected to incapable of enabling the IPv6 protocol. These nodes are connected to
an IPv4 router that is connected to the Internet. That router might be an IPv4 router that is connected to the Internet. That router might be
capable of translating IPv4 to IPv6 and vice versa. capable of translating IPv4 to IPv6 and vice versa.
\begin{figure}[h] %% \begin{figure}[h]
\includegraphics[scale=0.5]{dualstack} %% \includegraphics[scale=0.5]{dualstack}
\centering %% \centering
\caption{Dualstack network} %% \caption{Dualstack network}
\label{fig:dualstacknet} %% \label{fig:dualstacknet}
\end{figure} %% \end{figure}
With the introduction of IPv6, hosts can have a separate IP stack With the introduction of IPv6, hosts can have a separate IP stack
active and in that configuration hosts are called ``dualstack hosts'' active and in that configuration hosts are called ``dualstack hosts''.
(see figure \ref{fig:dualstacknet}).
Dualstack hosts are capable of reaching both IPv6 and IPv4 hosts Dualstack hosts are capable of reaching both IPv6 and IPv4 hosts
directly without the need of any translation mechanism. directly without the need of any translation mechanism.
The last possible network design is based on IPv6 only hosts, as shown The last possible network design is based on IPv6 only hosts.
in figure \ref{fig:v6onlynet}. While it is technically easy to disable IPv4, it While it is technically easy to disable IPv4, it
seems that completely removing the IPv4 stack in current operating seems that completely removing the IPv4 stack in current operating
systems is not an easy task~\cite{ungleich:_ipv4}. systems is not an easy task~\cite{ungleich:_ipv4}.
\begin{figure}[h] %% \begin{figure}[h]
\includegraphics[scale=0.5]{v6only} %% \includegraphics[scale=0.5]{v6only}
\centering %% \centering
\caption{IPv6 only network} %% \caption{IPv6 only network}
\label{fig:v6onlynet} %% \label{fig:v6onlynet}
\end{figure} %% \end{figure}
While the three network designs look similar, there are significant While the three network designs look similar, there are significant
differences in operating them and limitations that are not easy to differences in operating them and limitations that are not easy to
circumvent. In the following sections we describe the limitations and circumvent. In the following sections we describe the limitations and


@ -1,11 +1,17 @@
\chapter{\label{design}Design} \chapter{\label{design}Design}
Description of the theory/software/hardware that you designed.
%** Design.tex: How was the problem attacked, what was the design %** Design.tex: How was the problem attacked, what was the design
% the architecture % the architecture
In this chapter we describe the architecture of our solution. In this chapter we describe the architecture of our solution.
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{Design:General}General - FIXME} \section{\label{design:nat64}NAT64 with P4 - FIXME: elaborate}
In section \ref{background:transition} we discussed different
translation mechanisms for IPv6 and IPv4. In this thesis we focus on
the translation mechanisms stateless and stateful NAT64. While higher
layer protocol dependent translations are more flexible, this topic
has already been addressed in
\cite{nico18:_implem_layer_ipv4_ipv6_rever_proxy} and the focus in
this thesis is on the practicability of high speed NAT64.
The high level design can be seen in figure \ref{fig:switchdesign}: a The high level design can be seen in figure \ref{fig:switchdesign}: a
P4 capable switch is running our code to provide NAT64 P4 capable switch is running our code to provide NAT64
functionality. The P4 switch cannot manage its tables on its own and functionality. The P4 switch cannot manage its tables on its own and
@ -16,7 +22,7 @@ switch tables.
\begin{figure}[h] \begin{figure}[h]
\includegraphics[scale=0.5]{switchdesign} \includegraphics[scale=0.5]{switchdesign}
\centering \centering
\caption{General Design} \caption{P4 Switch Architecture}
\label{fig:switchdesign} \label{fig:switchdesign}
\end{figure} \end{figure}
The P4 switch can use any protocol to communicate with the controller, as The P4 switch can use any protocol to communicate with the controller, as
@ -25,7 +31,41 @@ port. The design allows our solution to be used as a standard NAT64
translation method or as an in network NAT64 translation (compare translation method or as an in network NAT64 translation (compare
figures \ref{fig:v6v4innetwork} and \ref{fig:v6v4standard}). The figures \ref{fig:v6v4innetwork} and \ref{fig:v6v4standard}). The
controller is implemented in python, the NAT64 solution is implemented controller is implemented in python, the NAT64 solution is implemented
in P4. in P4. The network
\begin{figure}[h]
\includegraphics[scale=0.5]{networkdesignnat64}
\centering
\caption{Network design}
\label{fig:switchdesign}
\end{figure}
from intro:
\begin{figure}[h]
\includegraphics[scale=0.4]{v6-v4-mixed}
\centering
\caption{Different network design with in network NAT64 translation}
\label{fig:v6v4mixed}
\end{figure}
Figure \ref{fig:v6v4standard} shows the standard NAT64
approach and figure \ref{fig:v6v4innetwork} shows our solution.
\begin{figure}[h]
\includegraphics[scale=0.7]{v6-v4-innetwork}
\centering
\caption{In Network NAT64 translation}
\label{fig:v6v4innetwork}
\end{figure}
\begin{figure}[h]
\includegraphics[scale=0.7]{v6-v4-standard}
\centering
\caption{Standard NAT64 translation}
\label{fig:v6v4standard}
\end{figure}
Describe network layouts Describe network layouts
\begin{verbatim} \begin{verbatim}
@ -60,14 +100,30 @@ TCP6:[2001:db8:1::a00:1]:2343"; sleep 2; done
mx h1 "echo V6-OK | socat - TCP6:[2001:db8:1::a00:1]:2343" mx h1 "echo V6-OK | socat - TCP6:[2001:db8:1::a00:1]:2343"
\end{verbatim} \end{verbatim}
% ----------------------------------------------------------------------
% ----------------------------------------------------------------------
\section{\label{design:statelessnat64}Stateless NAT64 - FIXME: write}
Only using /96. Using addition.
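The note above can be illustrated with a small Python sketch: with a /96 prefix, the IPv4 address occupies the lowest 32 bits of the IPv6 address, so the mapping is a plain integer addition (and a subtraction for the opposite direction). The prefix 2001:db8:1::/96 is an assumption taken from the addresses in the test commands; the function names are made up for this example and are not the P4 implementation.

```python
import ipaddress

# Assumption: example /96 prefix, matching addresses like 2001:db8:1::a00:1
NAT64_PREFIX = ipaddress.IPv6Network("2001:db8:1::/96")


def v4_to_v6(v4: str, prefix: ipaddress.IPv6Network = NAT64_PREFIX) -> ipaddress.IPv6Address:
    """Embed an IPv4 address into a /96 prefix by adding it to the prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("only /96 prefixes are supported")
    return ipaddress.IPv6Address(
        int(prefix.network_address) + int(ipaddress.IPv4Address(v4))
    )


def v6_to_v4(v6: str, prefix: ipaddress.IPv6Network = NAT64_PREFIX) -> ipaddress.IPv4Address:
    """Extract the embedded IPv4 address by subtracting the prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("only /96 prefixes are supported")
    addr = ipaddress.IPv6Address(v6)
    if addr not in prefix:
        raise ValueError("address is outside the NAT64 prefix")
    return ipaddress.IPv4Address(int(addr) - int(prefix.network_address))


if __name__ == "__main__":
    print(v4_to_v6("10.0.0.1"))
    print(v6_to_v4("2001:db8:1::a00:1"))
```

Restricting the sketch to /96 keeps the arithmetic free of bit shifting; other prefix lengths would need the IPv4 bits to be placed at a different offset.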
% ----------------------------------------------------------------------
\section{\label{design:statefulnat64}Stateful NAT64 - FIXME: write}
- controller selects "outgoing" IPv4 address range => base for sessions
- IPv4 addresses can be "random" (in our test case), but need
to be unique
- switch does not need to know about the "range", only about
sessions
- on session create, controller selects "random" ip (ring?)
- on session create, controller selects "random port" (next in range?)
- on session create controller adds choice into 2 tables:
incoming, outgoing
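The notes above can be sketched as a small session table in Python. This is a hypothetical illustration, not the controller of this thesis: the class and method names are invented, the two dictionaries stand in for the incoming and outgoing switch tables, and the sketch picks the next free address and port in ring order, which is only one of the options left open above.

```python
import ipaddress


class SessionTable:
    """Toy stateful NAT64 session table (assumed design, for illustration)."""

    def __init__(self, v4_pool, first_port=1024, last_port=65535):
        self.v4_pool = [ipaddress.IPv4Address(a) for a in v4_pool]
        self.first_port = first_port
        self.port_count = last_port - first_port + 1
        self.outgoing = {}  # (proto, v6 source, source port) -> (v4 address, v4 port)
        self.incoming = {}  # (proto, v4 address, v4 port) -> (v6 source, source port)
        self._next = 0      # ring position in the (address, port) space

    def create(self, proto, v6_src, src_port):
        """Return the IPv4 address and port for a session, creating it if needed."""
        key = (proto, ipaddress.IPv6Address(v6_src), src_port)
        if key in self.outgoing:
            return self.outgoing[key]
        total = len(self.v4_pool) * self.port_count
        for _ in range(total):
            idx = self._next
            self._next = (self._next + 1) % total
            addr = self.v4_pool[idx % len(self.v4_pool)]
            port = self.first_port + idx // len(self.v4_pool)
            mapped = (proto, addr, port)
            if mapped not in self.incoming:
                # The choice is recorded in both directions
                self.outgoing[key] = (addr, port)
                self.incoming[mapped] = (key[1], key[2])
                return addr, port
        raise RuntimeError("no free IPv4 address/port combination left")


if __name__ == "__main__":
    table = SessionTable(["192.0.2.1", "192.0.2.2"])
    print(table.create("tcp", "2001:db8::1", 40000))
```

The switch only ever sees the resulting session entries; the address range itself stays a concern of the controller.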
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{Design:BMV2}BMV2} \section{\label{Design:BMV2}BMV2}
Development of the thesis took place on a software emulated switch Development of the thesis took place on a software emulated switch
that is implemented using Open vSwitch ~\cite{openvswitch} that is implemented using Open vSwitch~\cite{openvswitch}
and the behavioral model and the behavioral model~\cite{_implem_your_switc_target_with_bmv2}.
~\cite{_implem_your_switc_target_with_bmv2}. The development followed The development followed
closely the general design shown in section closely the general design shown in section
\ref{Design:General}. Within the software emulation checksums can be \ref{design:nat64}. Within the software emulation checksums can be
computed with two different methods: computed with two different methods:
\begin{itemize} \begin{itemize}
\item Recalculating the checksum by inspecting headers and payload \item Recalculating the checksum by inspecting headers and payload
@ -76,7 +132,7 @@ computed with two different methods:
The BMV2 model is rather sophisticated and provides many standard The BMV2 model is rather sophisticated and provides many standard
features including checksumming over payload. This allows the BMV2 features including checksumming over payload. This allows the BMV2
model to operate as a full featured host, including advanced features model to operate as a full featured host, including advanced features
like responding to ICMP6 Neighbor discovery requests ~\cite{rfc4861} like responding to ICMP6 Neighbor discovery requests~\cite{rfc4861}
that include payload checksums. that include payload checksums.
A typical code to create the checksum can be found in figure A typical code to create the checksum can be found in figure
\ref{fig:checksum}. \ref{fig:checksum}.
@ -113,7 +169,7 @@ update_checksum_with_payload(meta.chk_icmp6_na_ns == 1,
\end{figure} \end{figure}
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{Design:NetPFGA}NetFPGA} \section{\label{Design:NetPFGA}NetFPGA - FIXME: relate things}
While the P4-NetFPGA project ~\cite{netfpga:_p4_netpf_public_github} While the P4-NetFPGA project ~\cite{netfpga:_p4_netpf_public_github}
allows compiling P4 to the NetFPGA, the design slightly varies. allows compiling P4 to the NetFPGA, the design slightly varies.
In particular, the NetFPGA P4 compiler does not support reading In particular, the NetFPGA P4 compiler does not support reading
@ -166,8 +222,8 @@ action delta_tcp_from_v6_to_v4()
\label{fig:checksumbydiff} \label{fig:checksumbydiff}
\end{figure} \end{figure}
The checksums for IPv4, TCP, UDP and ICMP6 are all based on the The checksums for IPv4, TCP, UDP and ICMP6 are all based on the
``Internet Checksum'' (~\cite{rfc791}, ~\cite{rfc1071}). Its calculation ``Internet Checksum''~\cite{rfc791},~\cite{rfc1071}.
can be summarised as follows: Its calculation can be summarised as follows:
\begin{quote} \begin{quote}
The checksum field is the 16-bit one's complement of the one's The checksum field is the 16-bit one's complement of the one's
complement sum of all 16-bit words in the header. For purposes of complement sum of all 16-bit words in the header. For purposes of
@ -182,26 +238,10 @@ not the full headers are used, but the pseudo headers (compare figures
To compensate for the carry bit, our code uses 17 bit integers for To compensate for the carry bit, our code uses 17 bit integers for
correcting the carry. correcting the carry.
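The following Python sketch shows the Internet checksum and a differential update in the style of RFC 1624 (HC' = \textasciitilde(\textasciitilde HC + \textasciitilde m + m')). It is not the P4 code of this thesis: the function names and the sample header are assumptions made for illustration. The intermediate sum may need a 17th bit, which is folded back into the lower 16 bits after every addition.

```python
def ones_complement_sum(words):
    """One's complement sum of 16 bit words."""
    total = 0
    for word in words:
        total += word
        # total may now need 17 bits; add the carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Full recalculation over all 16 bit words of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd length input with a zero byte
    words = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return ~ones_complement_sum(words) & 0xFFFF


def update_checksum(old_checksum: int, old_words, new_words) -> int:
    """Differential update: only the changed words are needed, not the payload."""
    terms = [~old_checksum & 0xFFFF]
    terms += [~word & 0xFFFF for word in old_words]
    terms += list(new_words)
    return ~ones_complement_sum(terms) & 0xFFFF


if __name__ == "__main__":
    # Sample IPv4 header with the checksum field set to zero
    header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(internet_checksum(header)))
```

The differential variant matters on targets that cannot read the payload: replacing the source address only requires the old checksum and the old and new address words.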
% FIXME: add note to python script / checksum diffing % FIXME: add note to python script / checksum diffing
% ----------------------------------------------------------------------
\section{\label{design:statefulnat64}Stateful NAT64}
- controller selects "outgoing" IPv4 address range => base for sessions
- IPv4 addresses can be "random" (in our test case), but need
to be unique
- switch does not need to know about the "range", only about
sessions
- on session create, controller selects "random" ip (ring?)
- on session create, controller selects "random port" (next in range?)
- on session create controller adds choice into 2 tables:
incoming, outgoing
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{design:ipv4embed}IPv4 embedding} \section{\label{design:benchmarks}Benchmarks}
Only using /96. Using addition.
% ----------------------------------------------------------------------
\section{\label{Design:Benchmarks}Benchmarks}
The benchmarks were performed on two hosts, a load generator and a The benchmarks were performed on two hosts, a load generator and a
nat64 translator. Both hosts were equipped with a dual port nat64 translator. Both hosts were equipped with a dual port
Intel X520 10 Gbit/s network card. Both hosts were connected using DAC Intel X520 10 Gbit/s network card. Both hosts were connected using DAC
@ -222,12 +262,12 @@ CPU @ 3.40GHz), enabled with hyperthreading and 16 GB RAM. The NAT64
translator is also equipped with a quad core CPU (Intel(R) Core(TM) translator is also equipped with a quad core CPU (Intel(R) Core(TM)
i7-4770 CPU @ 3.40GHz) and 16 GB RAM. i7-4770 CPU @ 3.40GHz) and 16 GB RAM.
The first 10 seconds of the benchmark were excluded to avoid the tcp The first 10 seconds of the benchmark were excluded to avoid the TCP
warm up phase.\footnote{iperf -O 10 parameter} warm up phase.\footnote{iperf -O 10 parameter}
\begin{figure}[h] \begin{figure}[h]
\includegraphics[scale=0.5]{netpfgadesign} \includegraphics[scale=0.5]{netpfgadesign}
\centering \centering
\caption{NAT64 with NetFPGA benchmark} \caption{NAT64 with NetFPGA benchmark}
\label{fig:netpfgadesign} \label{fig:netpfgadesign}
\end{figure} \end{figure}
% ok


@ -24,13 +24,15 @@ it motivates our work to support ease transition to IPv6 networks.
\section{\label{introduction:ipv4ipv6}IPv4 exhaustion and IPv6 adoption} \section{\label{introduction:ipv4ipv6}IPv4 exhaustion and IPv6 adoption}
The Internet has almost completely run out of public IPv4 space. The The Internet has almost completely run out of public IPv4 space. The
5 Regional Internet Registries (RIRs) report IPv4 exhaustion world wide 5 Regional Internet Registries (RIRs) report IPv4 exhaustion world wide
(\cite{ripe_exhaustion}, \cite{ripe_exhaustion},
\cite{apnic_exhaustion}, \cite{apnic_exhaustion},
\cite{lacnic:_ipv4_deplet_phases}, \cite{lacnic:_ipv4_deplet_phases},
\cite{afrinic:_afrin_ipv4_exhaus}, \cite{afrinic:_afrin_ipv4_exhaus},
\cite{arin:_ipv4_addres_option}) and LACNIC project complete \cite{arin:_ipv4_addres_option}.
exhaustion for 2020 (see figure \ref{fig:lacnicexhaust}). Figure \ref{fig:riripv4rundown} contains summarised data from all RIRs
and projects complete IPv4 address depletion by 2021.
The LACNIC project even predicts complete exhaustion for 2020 as shown
in figure \ref{fig:lacnicexhaust}.
\begin{figure}[h] \begin{figure}[h]
\includegraphics[scale=0.7]{lacnicdepletion} \includegraphics[scale=0.7]{lacnicdepletion}
\centering \centering
@ -38,7 +40,12 @@ exhaustion for 2020 (see figure \ref{fig:lacnicexhaust}).
~\cite{lacnic:_ipv4_deplet_phases}} ~\cite{lacnic:_ipv4_deplet_phases}}
\label{fig:lacnicexhaust} \label{fig:lacnicexhaust}
\end{figure} \end{figure}
\begin{figure}[h]
\includegraphics[scale=0.6]{rir-ipv4-rundown}
\centering
\caption{RIR IPv4 rundown projection from~\cite{huston:_ipv4_addres_repor}}
\label{fig:riripv4rundown}
\end{figure}
On the other hand IPv6 adoption grows significantly, with at least On the other hand IPv6 adoption grows significantly, with at least
three countries (India, US, Belgium) surpassing 50\% three countries (India, US, Belgium) surpassing 50\%
adoption~\cite{akamai:_ipv6_adopt_visual}, adoption~\cite{akamai:_ipv6_adopt_visual},
@ -48,64 +55,52 @@ of 2019-08-08~\cite{google:_ipv6_googl}, see figure \ref{fig:googlev6}.
\begin{figure}[h] \begin{figure}[h]
\includegraphics[scale=0.2]{googlev6} \includegraphics[scale=0.2]{googlev6}
\centering \centering
\caption{Google IPv6 Statistics, \caption{Google IPv6 Statistics from~\cite{google:_ipv6_googl}}
~\cite{google:_ipv6_googl}}
\label{fig:googlev6} \label{fig:googlev6}
\end{figure} \end{figure}
We conclude that IPv6 is a technology strongly gaining importance with We conclude that IPv6 is a technology strongly gaining importance with
the IPv4 depletion that is estimated to be world wide happening in the the IPv4 depletion that is estimated to be world wide happening in the
next years. Thus more devices will be using IPv6, while communication next years. Thus more devices will be using IPv6, while communication
to legacy IPv4 devices still needs to be provided. to legacy IPv4 devices still needs to be provided.
% ok
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\section{\label{introduction:motivation}Motivation} \section{\label{introduction:motivation}Motivation}
IPv6 nodes and IPv4 nodes cannot directly connect to each other, IPv6 nodes and IPv4 nodes cannot directly connect to each other,
because the protocols are incompatible to each other. because the protocols are incompatible to each other.
To allow communication between different protocol nodes, To allow communication between different protocol nodes,
several transition mechanisms have been several transition mechanisms have been
proposed (\cite{wikipedia:_ipv6}, \cite{rfc4213}). proposed~\cite{wikipedia:_ipv6},~\cite{rfc4213}.
\begin{figure}[h]
\includegraphics[scale=0.4]{v6-v6-separated}
\centering
\caption{Separated IPv6 and IPv4 network segments}
\label{fig:v6v4separated}
\end{figure}
However installation and configuration of the transition mechanisms However installation and configuration of the transition mechanisms
usually require in depth knowledge about both protocols and require usually require in depth knowledge about both protocols and require
additional hardware to be added in the network. additional hardware to be added in the network.
In this thesis we show an in-network transition method based on In this thesis we show an in-network transition method based on
NAT64~\cite{rfc6146}. Compared to traditional NAT64 methods which require an NAT64~\cite{rfc6146}. Compared to traditional NAT64 methods which
extra device in the network, our proposed method is transparent to the require hosts to explicitly use an extra device in the
user. This way neither the operator nor the end user has to configure network,\footnote{Usually the default router will take this role.}
extra devices. Figures \ref{fig:v6v4standard} shows the standard NAT64 our proposed method is transparent to the hosts.
approach and \ref{fig:v6v4innetwork} shows our solution. This way the routing and network configuration does not need to be
\begin{figure}[h] changed to support NAT64 within a network.
\includegraphics[scale=0.7]{v6-v4-innetwork}
\centering
\caption{In Network NAT64 translation}
\label{fig:v6v4innetwork}
\end{figure}
\begin{figure}[h]
\includegraphics[scale=0.7]{v6-v4-standard}
\centering
\caption{Standard NAT64 translation}
\label{fig:v6v4standard}
\end{figure}
Currently network operators have to focus on two network stacks when
designing networks: IPv6 and IPv4. While in a small scale setup this
might not introduce significant complexity, figure
\ref{fig:v6v4separated} shows how the complexity quickly grows
even with a small number of hosts.
\begin{figure}[h]
\includegraphics[scale=0.4]{v6-v4-mixed}
\centering
\caption{Different network design with in-network NAT64 translation}
\label{fig:v6v4mixed}
\end{figure}
The proposed in-network solution not only eases the installation and
deployment of IPv6, but also allows line speed translation, because
it is compiled into target dependent low level code that can run in
ASICs~\cite{networks:_tofin},
FPGAs~\cite{netfpga:_p4_netpf_public_github}
or even in
software~\cite{_implem_your_switc_target_with_bmv2}. Figure
\ref{fig:v6v4mixed} shows how the design differs for an in-network
solution.
Even on fast CPUs, software solutions like
tayga~\cite{lutchansky:_tayga_simpl_nat64_linux}
can be CPU bound (see section \ref{results:softwarenat64}) and are
incapable of translating protocols at line speed.


@ -12,7 +12,7 @@ objective of this thesis was to demonstrate the high speed
capabilities of NAT64 in hardware, no benchmarks were performed on the
P4 software implementation.
% ----------------------------------------------------------------------
\section{\label{results:p4}P4 based implementations}
****** TODO IPv6 udp -> IPv4
- Got 4-5 tuple ([proto], src ip, src port, dst ip, dst port) - Got 4-5 tuple ([proto], src ip, src port, dst ip, dst port)
@ -22,6 +22,392 @@ P4 software implementation.
Only the /96 prefix is supported, not the other embeddings described in
section \ref{background:transition:prefixnat}.
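
The /96 embedding mentioned above can be illustrated with a minimal
Python sketch. It is not taken from the thesis source code: the function
names are hypothetical and the prefix \texttt{64:ff9b::/96} (the
well-known prefix of RFC 6052) is only an assumed default. With a /96
prefix the IPv4 address simply occupies the last 32 bits of the IPv6
address.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix. A deployment may use a
# network-specific /96 prefix instead.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def embed_ipv4(ipv4, prefix=NAT64_PREFIX):
    """Embed an IPv4 address into the last 32 bits of a /96 IPv6 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Address(int(prefix.network_address) | v4)


def extract_ipv4(ipv6):
    """Recover the IPv4 address from the last 32 bits of an IPv6 address."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(ipv6)) & 0xFFFFFFFF)
```

Other prefix lengths place the IPv4 bits at different offsets and are
deliberately rejected by this sketch.
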
All planned features could be realised with P4 and a controller.
The language has some limitations on where if/switch statements can be
used.\footnote{In general, if and switch statements in actions lead to
errors, but not all constellations are forbidden.}
For this thesis the parsing capabilities of P4 were adequate. However,
P4 at the time of writing cannot parse ICMP6 options, as the upper
level protocol does not specify the number of options that follow and
parsing of 64 bit blocks is required.
P4/BMV2 does not support multiple LPM keys in a table; however, it
supports multiple keys with ternary matching.
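
One possible way to work around the missing support for multiple LPM
keys is to express each prefix as a ternary value/mask pair and to use
the prefix length as entry priority. The following Python sketch only
illustrates this idea; the function names are hypothetical and it is
not the controller code of this thesis.

```python
import ipaddress


def lpm_to_ternary(prefix):
    """Convert a prefix into a (value, mask, priority) ternary entry.

    The priority equals the prefix length, so that a longer prefix
    wins when several ternary entries match.
    """
    net = ipaddress.ip_network(prefix, strict=True)
    return int(net.network_address), int(net.netmask), net.prefixlen


def ternary_match(address, value, mask):
    """Return True if the address matches the ternary value/mask pair."""
    return (int(ipaddress.ip_address(address)) & mask) == value
```
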
When developing P4 programs, the reason for incorrect behaviour was
most often found in checksum problems. If frame checksum errors were
displayed by tcpdump, usually the effective length of the packet was
incorrect.
FIXME: IPv6: NDP: not easy to parse, as unknown number of following fields
The tooling around P4 is still fragile: we encountered many
bugs~\cite{schottelius:github1675}
and missing features~\cite{schottelius:github745,theojepsen:_get}
during development.
Hitting expression bug (FIXME: source)
1) Impossible to retrieve key from table: LPM: addr + mask -> addr and
mask might be used in controller
2) retrieving information from tables : no meta information, don't
know which table matched
3) type definitions separate Code sharing (controller, switch)
No switch in actions, No conditional execution in actions
Although not directly related to P4, the supporting scripts are usually
written in python2. As python2 handles unicode strings differently than
python3, effects like an IPv6
address ``changing'' can occur~\cite{appendix:p4:python2unicode}.
P4os - reusable code
idiomatic problem: Security issue: not checking checksums before
% ----------------------------------------------------------------------
\subsection{\label{Results:BMV2}BMV2}
The software implementation of P4 has the most features, which is
mostly due to the capability of checksumming the payload: acting
as a ``proper'' participant in NDP requires the host to calculate
checksums over the payload.
Table \ref{tab:p4bmv2features} lists the features of the BMV2 implementation.
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandled packets to
controller & fully implemented\footnote{Source code: \texttt{actions\_egress.p4}}\\
\hline
Controller to Switch & Controller can setup table entries &
fully implemented\footnote{Source code: \texttt{controller.py}}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
fully implemented\footnote{Source code:
\texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ARP & Switch can answer ARP request (without controller) & fully
implemented\footnote{Source code: \texttt{actions\_arp.p4}}\\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
fully implemented\footnote{Source code:
\texttt{actions\_nat64\_session.p4}, \texttt{controller.py}} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / BMV2 feature list}
\label{tab:p4bmv2features}
\end{center}
\end{table}
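
The difference between the ``Delta Checksum'' and ``Payload Checksum''
rows can be illustrated with a short Python sketch. It is not the P4
code referenced in the table; the function names are hypothetical and
the incremental update follows the equation $HC' = \sim(\sim HC + \sim m + m')$
from RFC 1624. The delta variant only needs the old checksum and the
16 bit words that changed, whereas the full variant needs every word
that is covered by the checksum.

```python
def ones_complement_sum(words):
    """Add 16 bit words with end-around carry."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_checksum(words):
    """Internet checksum over all words (requires access to all data)."""
    return ~ones_complement_sum(words) & 0xFFFF


def delta_checksum(old_checksum, old_words, new_words):
    """Update a checksum from the changed words only (RFC 1624, eqn. 3)."""
    total = ~old_checksum & 0xFFFF
    for word in old_words:
        total += ~word & 0xFFFF  # remove the old word
    for word in new_words:
        total += word            # add the new word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

For a translation only the address words change, so both functions are
expected to yield the same result while the delta variant never reads
the payload.
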
Responds to icmp, icmp6
ndp ~\cite{rfc4861}
arp
very easy to use
Fully functional host
Can compute checksums on its own.
With a focus on typical use cases of ICMP and ICMP6, the software implementation
supports translating echo request and echo reply messages, but does
not support all ICMP/ICMP6 translations that are defined in
RFC6145~\cite{rfc6145}.
Stateful : no automatic removal
Session management not benchmarked, as it is only a matter of creating
table entries.
Jool and tayga are supported by
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga}NetFPGA - FIXME: writing}
The reduced feature set of the NetFPGA implementation is due to two
factors: the compile time of between 2 and 6 hours per compile run, and
the lack of payload checksumming.
overview - general translation - not advanced features
% ----------------------------------------------------------------------
\subsubsection{\label{results:netpfga:features}Features}
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandled packets to
controller & portable\footnote{While the NetFPGA P4 implementation
does not have the clone3() extern that the BMV2 implementation offers,
communication to the controller can easily be realised by using one of
the additional ports of the NetFPGA and connecting a physical network
card to it.}\\
\hline
Controller to Switch & Controller can setup table entries &
portable\footnote{The p4utils suite offers easy access to the
switch tables. While the P4-NetFPGA support repository also offers
python scripts to modify the switch tables, the code is less
sophisticated and more fragile.}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
portable\footnote{NetFPGA/P4 does not offer calculating the checksum
over the payload. However delta checksumming can be used to create
the required checksum for replying.} \\
\hline
ARP & Switch can answer ARP request (without controller) &
portable\footnote{As ARP does not use checksums, integrating the
source code \texttt{actions\_arp.p4} into the NetFPGA code base is
enough to enable ARP support in the NetFPGA.} \\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
portable\footnote{ICMP/ICMP6 translations only require enabling the
ICMP/ICMP6 code in the NetFPGA code base.} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
portable\footnote{Same reasoning as ``Controller to switch''.} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
unsupported\footnote{To support creating payload checksums, either an
HDL module needs to be created or the generated
PX program needs to be modified.~\cite{schottelius:_exter_p4_netpf}} \\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / NetFPGA feature list}
\label{tab:p4netpfgafeatures}
\end{center}
\end{table}
% ----------------------------------------------------------------------
\subsubsection{\label{results:netpfga:stability}Stability}
Two different NetFPGA cards were used during the development of the
thesis. The first card had consistent ioctl errors (compare section
\ref{netpfgaioctlerror}) when writing table entries. The available
hardware tests (compare figures \ref{fig:hwtestnico} and
\ref{fig:hwtesthendrik}) showed failures in both cards; however, the
first card reported an additional ``10G\_Loopback'' failure. Due to
the inability to set table entries, no benchmarking was performed
on the first NetFPGA card.
\begin{figure}[h]
\includegraphics[scale=1.4]{hwtestnico}
\centering
\caption{Hardware Test NetFPGA card 1}
\label{fig:hwtestnico}
\end{figure}
\begin{figure}[h]
\includegraphics[scale=0.2]{hwtesthendrik}
\centering
\caption{Hardware Test NetFPGA card 2~\cite{hendrik:_p4_progr_fpga_semes_thesis_sa}}
\label{fig:hwtesthendrik}
\end{figure}
During the development and benchmarking, the second NetFPGA card stopped
functioning properly multiple times. In each case the card would not
forward packets anymore. Multiple reboots (3 were usually enough)
and reflashing the bitstream to the NetFPGA multiple times usually
restored the intended behaviour. However, due to these ``crashes'', it
was impossible to complete a full benchmark run that would last for
more than one hour.
Sometimes it was also required to reboot the host containing the
NetFPGA card 3 times to enable successful flashing.\footnote{Typical
output of the flashing process would be: ``fpga configuration failed. DONE PIN is not HIGH''}
% ----------------------------------------------------------------------
\subsubsection{\label{results:netpfga:performance}Performance}
As expected, the NetFPGA card performed at near line speed and offers
NAT64 translations at 9.28 Gbit/s. Single and multiple streams
performed almost identically and have been consistent through
multiple iterations of the benchmarks.
% ----------------------------------------------------------------------
\subsubsection{\label{results:netpfga:usability}Usability}
To use the NetFPGA, Vivado and SDNET provided by Xilinx need to be
installed. However a bug in the installer triggers an infinite loop,
if a certain shared library\footnote{The required shared library
is libncurses5.} is missing on the target operating system. The
installation program still appears to be progressing, but never
finishes.
While the NetFPGA card supports P4, the toolchains and supporting
scripts are in an immature state. The compilation process consists of
at least 9 different steps, which are interdependent.\footnote{See
source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate
shell scripts and python scripts that in turn generate JSON
data.\footnote{One compilation step calls the script
``config\_writes.py''. This script failed with a syntax error, as it
contained incomplete python code. The scripts config\_writes.py
and config\_writes.sh are generated by gen\_config\_writes.py.
The output of the script gen\_config\_writes.py depends on the content
of config\_writes.txt. That file is generated by the simulation
``xsim''. The file ``SimpleSumeSwitch\_tb.sv'' contains code that is
responsible for writing config\_writes.txt and uses a function
named axi4\_lite\_master\_write\_request\_control for generating the
output. This in turn is dependent on the output of a script named
gen\_testdata.py.}
However incorrect parsing generates syntactically incorrect
scripts or scripts that generate incorrect output. The toolchain
provided by the NetFPGA-P4 repository contains more than 80000 lines
of code. The supporting scripts for setting table entries require
setting the parameters for all possible actions, not only for the
selected action. Supplying only the required parameters results in a
crash of the supporting script.
The documentation for using the NetFPGA-P4 repository is scattered
across several places and does not contain a reference on how to use the
tools. The mapping of egress ports to their metadata fields is found in a
python script that is used for generating test data.
The compile process can take up to 6 hours and, because the different
steps are interdependent, errors in a previous stage were in our
experience detected hours after they happened. The resulting log
files of the compilation process can be up to 5 MB in size. Within
this log file various commands output references to other logfiles,
however the referenced logfiles do not exist before or after the
compile process.
During the compile process various informational, warning and error
messages are printed. However some informational messages constitute
critical errors, while on the other hand messages labelled as critical
errors or syntax errors often do not constitute a critical
error.\footnote{E.g. ``CRITICAL WARNING: [BD 41-737] Cannot set the
parameter TRANSLATION\_MODE on /axi\_interconnect\_0. It is
read-only.'' is a non critical warning.}
Contradictory
output is also generated.\footnote{While using version 2018.2, the following
message was printed: ``WARNING: command 'get\_user\_parameter' will be removed in the 2015.3
release, use 'get\_user\_parameters' instead''.}
Programs or scripts that are called during the compile process do not
necessarily exit non zero if they encountered a critical error. Thus
finding the source of an error can be difficult, because the compile
process continues after critical errors occurred. Not only do programs
that have critical errors exit ``successfully'', but python
scripts that encounter critical paths also do not abort with an
exception; instead they print an error message to stdout and exit
without an error.
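
The behaviour that would have made these errors easier to locate can be
sketched in a few lines of Python. The sketch is purely illustrative:
the names \texttt{CompileStepError}, \texttt{check\_generated\_script}
and \texttt{main} are hypothetical and not part of the NetFPGA
toolchain. A generated script is checked for syntax errors, a failure
raises an exception, and the caller turns it into a non zero exit code
so that later steps do not run.

```python
import sys


class CompileStepError(Exception):
    """Raised when a compile step produced unusable output."""


def check_generated_script(source):
    """Raise CompileStepError if the generated Python code does not parse."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        raise CompileStepError(f"generated script is invalid: {exc}") from exc


def main(source):
    """Return an exit code: 0 on success, 1 if the step failed."""
    try:
        check_generated_script(source)
    except CompileStepError as exc:
        # Errors go to stderr and are reflected in the exit code.
        print(exc, file=sys.stderr)
        return 1
    return 0
```

A wrapper script would call \texttt{sys.exit(main(...))} so that the
surrounding shell script can stop at the first failing step.
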
The most frequently encountered critical compile error is
``Run 'impl\_1' has not been launched. Unable to open''. This error
indicates that something in the previous compile steps failed, and its
cause can range from incorrectly generated testdata to unsupported LPM tables.
The NetFPGA kernel module provides access to virtual Linux
devices (nf0...nf3). However tcpdump does not see any packets that are
emitted from the switch. The only possibility to capture packets
that are emitted from the switch is by connecting a physical cable to
the port and capturing on the other side.
Jumbo frames\footnote{Frames with an MTU greater than 1500 bytes.} are
commonly used in 10 Gbit/s networks. According to
\cite{wikipedia:_jumbo}, even many gigabit network interface cards
support jumbo frames. However, according to emails on the private
NetFPGA mailing list, the NetFPGA only supports 1500 byte frames at
the moment and additional work is required to implement support for
bigger frames.
Our P4 source code is required to contain Xilinx
annotations\footnote{E.g. ``@Xilinx\_MaxPacketRegion(1024)''} that define
the maximum packet size in bits. We observed two different errors in
the output packet if the incoming packet exceeds the specified size:
\begin{itemize}
\item The output packet is longer than the original packet.
\item The output packet is corrupted.
\end{itemize}
While most of the P4 language is supported on the NetFPGA, some key
techniques are missing or not supported.
\begin{itemize}
\item Analysing / accessing payload is not supported
\item Checksum computation over payload is not supported
\item Using LPM tables can lead to compilation errors
\item Depending on the match type, only certain table sizes are allowed
\end{itemize}
Renaming variables in the declaration of the parser or deparser leads
to compilation errors. Function syntax is not supported. For this
reason our implementation uses \texttt{\#define} statements instead of functions.
FIXME:
General result: limited NAT64 is working, however
No Payload ; checksumming - requires controller
Hash function in progress ; No NDP, no ARP - focused on key factors of NAT64 translation,
other features can be supported by controller
Needed to debug internal parsing errors
debugging generated tcl code to debug impl1 error
% ----------------------------------------------------------------------
\section{\label{results:softwarenat64}Software based NAT64}
with Tayga and
Jool
Both cpu bound.
During the benchmark cpu bound, single thread
tayga: Single threaded
easy to use
Jool kernel module
100\% cpu usage on 1 core for udp
0\% visible cpu usage for tcp, might be tcp offloading
Integration with iptables
Requires routing
% ----------------------------------------------------------------------
% ----------------------------------------------------------------------
\section{\label{results:benchmark}NAT64 Benchmarks - FIXME: explain numbers}
We successfully implemented P4 code to realise
@ -153,385 +539,3 @@ ndp
controller support
netpfga consistent
% ----------------------------------------------------------------------
\section{\label{Results:BMV2}BMV2 - FIXME: write better}
The software implementation of P4 has most features, which is
mostly due to the capability of checksumming the payload: Acting
as a ``proper'' participant in NDP, requires the host to calculate
checksums over the payload.
List of features BMV2 ~\cite{tab:p4bmv2features}
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandeled packets to
controller & fully implemented\footnote{Source code: \texttt{actions\_egress.p4}}\\
\hline
Controller to Switch & Controller can setup table entries &
fully implemented\footnote{Source code: \texttt{controller.py}}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
fully implemented\footnote{Source code:
\texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ARP & Switch can answer ARP request (without controller) & fully
implemented\footnote{Source code: \texttt{actions\_arp.p4}}\\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
fully implemented\footnote{Source code:
\texttt{actions\_nat64\_session.p4}, \texttt{controller.py}} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / BMV2 feature list}
\label{tab:p4bmv2features}
\end{center}
\end{table}
Responds to icmp, icmp6
ndp ~\cite{rfc4861}
arp
very easy to use
Fully functional host
Can compute checksums on its own.
focus on typical use cases of icmp, icmp6, the software implementation
supports translating echo request and echo reply messages, but does
not support all ICMP/ICMP6 translations that are defined in
RFC6145~\cite{rfc6145}.
Stateful : no automatic removal
Session management not benchmarked, as it is only a matter of creating
table entries.
Jool and tayga are supported by
% ----------------------------------------------------------------------
\section{\label{results:netpfga}NetFPGA - FIXME: writing}
The reduced feature set of the NetPFGA implementation is due to two
factors: compile time. Between 2 to 6 hours per compile run. No
payload checksum
overview - general translation - not advanced features
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:features}Features}
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandeled packets to
controller & portable\footnote{While the NetFPGA P4 implementation
does not have the clone3() extern that the BMV2 implementation offers,
communication to the controller can easily be realised by using one of
the additional ports of the NetFPGA and connect a physical network
card to it.}\\
\hline
Controller to Switch & Controller can setup table entries &
portable\footnote{The p4utils suite offers an easy access to the
switch tables. While the P4-NetFPGA support repository also offers
python scripts to modify the switch tables, the code is less
sophisticated and more fragile.}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
portable\footnote{NetFPGA/P4 does not offer calculating the checksume
over the payload. However delta checksumming can be used to create
the required checksum for replying.} \\
\hline
ARP & Switch can answer ARP request (without controller) &
portable\footnote{As ARP does not use checksums, integrating the
source code \texttt{actions\_arp.p4} into the netpfga code base is
enough to enable ARP support in the NetPFGA.} \\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
portable\footnote{ICMP/ICMP6 translations only require enabling the
icmp/icmp6 code in the netpfga code base.} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
portable\footnote{Same reasoning as ``Controller to switch''.} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
unsupported\footnote{To support creating payload checksums, either an
HDL module needs to be created or to modify the generated
the PX program.~\cite{schottelius:_exter_p4_netpf}} \\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / NetFPGA feature list}
\label{tab:p4netpfgafeatures}
\end{center}
\end{table}
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:stability}Stability}
Two different NetPFGA cards were used during the development of the
thesis. The first card had consistent ioctl errors (compare section
\ref{netpfgaioctlerror}) when writing table entries. The available
hardware tests (compare figures \ref{fig:hwtestnico} and
\ref{fig:hwtesthendrik}) showed failures in both cards, however the
first card reported an additional ``10G\_Loopback'' failure. Due to
the inability of setting table entries, no benchmarking was performed
on the first NetFPGA card.
\begin{figure}[h]
\includegraphics[scale=1.4]{hwtestnico}
\centering
\caption{Hardware Test NetPFGA card 1}
\label{fig:hwtestnico}
\end{figure}
\begin{figure}[h]
\includegraphics[scale=0.2]{hwtesthendrik}
\centering
\caption{Hardware Test NetPFGA card 2, ~\cite{hendrik:_p4_progr_fpga_semes_thesis_sa}}
\label{fig:hwtesthendrik}
\end{figure}
During the development and benchmarking, the second NetFPGA card stopped to
function properly multiple times. In both cases the card would not
forward packets anymore. Multiple reboots (3 were usually enough)
and multiple times reflashing the bitstream to the NetFPGA usually
restored the intended behaviour. However due to this ``crashes'', it
was impossible to complete a full benchmark run that would last for
more than one hour.
Sometimes it was also required to reboot the host containing the
NetFPGA card 3 times to enable successful flashing.\footnote{Typical
output of the flashing process would be: ``fpga configuration failed. DONE PIN is not HIGH''}
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:performance}Performance}
As expected, the NetFGPA card performed at near line speed and offers
NAT64 translations at 9.28 Gbit/s. Single and multiple streams
performed almost exactly identical and have been consistent through
multiple iterations of the benchmarks.
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:usability}Usability}
To use the NetFGPA, Vivado and SDNET provided by Xilinx need to be
installed. However a bug in the installer triggers an infinite loop,
if a certain shared library\footnote{The required shared library
is libncurses5.} is missing on the target operating system. The
installation program seems still to be progressing, however does never
finish.
While the NetFPGA card supports P4, the toolchains and supporting
scripts are in a immature state. The compilation process consists of
at least 9 different steps, which are interdependent\footnote{See
source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate
shell scripts and python scripts that in turn generate JSON
data.\footnote{One compilation step calls the script
``config\_writes.py''. This script failed with a syntax error, as it
contained incomplete python code. The scripts config\_writes.py
and config\_writes.sh are generated by gen\_config\_writes.py.
The output of the script gen\_config\_writes.py depends on the content
of config\_writes.txt. That file is generated by the simulation
``xsim''. The file ``SimpleSumeSwitch\_tb.sv'' contains code that is
responsible for writing config\_writes.txt and uses a function
named axi4\_lite\_master\_write\_request\_control for generating the
output. This in turn is dependent on the output of a script named
gen\_testdata.py.}
However incorrect parsing generates syntactically incorrect
scripts or scripts that generate incorrect output. The toolchain
provided by the NetFGPA-P4 repository contains more than 80000 lines
of code. The supporting scripts for setting table entries require
setting the parameters for all possible actions, not only for the
selected action. Supplying only the required parameters results in a
crash of the supporting script.
The documentation for using the NetFPGA-P4 repository is very
distributed and does not contain a reference on how to use the
tools. Mapping of egress ports and their metadata field are found in a
python script that is used for generating test data.
The compile process can take up to 6 hours and because the different
steps are interdependent, errors in a previous stage were in our
experiences detected hours after they happened. The resulting log
files of the compilation process can be up to 5 MB in size. Within
this log file various commands output references to other logfiles,
however the referenced logfiles do not exist before or after the
compile process.
During the compile process various informational, warning and error
messages are printed. However some informational messages constitute
critical errors, while on the other hand critical errors and syntax
errors often do not constitue a critical
error.\footnote{F.i. ``CRITICAL WARNING: [BD 41-737] Cannot set the
parameter TRANSLATION\_MODE on /axi\_interconnect\_0. It is
read-only.'' is a non critical warning.}
Also contradicting
output is generated.\footnote{While using version 2018.2, the following
message was printed: ``WARNING: command 'get\_user\_parameter' will be removed in the 2015.3
release, use 'get\_user\_parameters' instead''.}
Programs or scripts that are called during the compile process do not
necessarily exit non zero if they encountered a critical error. Thus
finding the source of an error can be difficult due to the compile
process continuing after critical errors occured. Not only programs
that have critical errors exit ``successfully'', but also python
scripts that encounter critical paths don't abort with raise(), but
print an error message to stdout and don't abort with an error.
The most often encountered critical compile error is
``Run 'impl\_1' has not been launched. Unable to open''. This error
indicates that something in the previous compile steps failed and can
refer to incorrectly generated testdata to unsupported LPM tables.
The NetFPGA kernel module provides access to virtual Linux
devices (nf0...nf3). However tcpdump does not see any packets that are
emitted from the switch. The only possibility to capture packets
that are emitted from the switch is by connecting a physical cable to
the port and capturing on the other side.
Jumbo frames\footnote{Frames with an MTU greater than 1500 bytes.} are
commonly used in 10 Gbit/s networks. According to
\cite{wikipedia:_jumbo}, even many gigabit network interface cards
support jumbo frames. However, according to emails on the private
NetFPGA mailing list, the NetFPGA currently supports only 1500 byte frames,
and additional work is required to implement support for
bigger frames.
Our P4 source code is required to contain Xilinx
annotations\footnote{For instance, ``@Xilinx\_MaxPacketRegion(1024)''.} that define
the maximum packet size in bits. We observed two different errors in
the output packet if the incoming packet exceeds the specified size:
\begin{itemize}
\item The output packet is longer than the original packet.
\item The output packet is corrupted.
\end{itemize}
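Since the annotation is given in bits while frame sizes are usually
quoted in bytes, a quick unit check can help when choosing the value. The
sketch below only illustrates the arithmetic. The helper name is
hypothetical, and it assumes that the limit applies to the whole frame,
which the text above does not state.

```python
# Value taken from the annotation example in the footnote.
MAX_PACKET_REGION_BITS = 1024


def fits_in_region(frame_len_bytes, region_bits=MAX_PACKET_REGION_BITS):
    """Return True if a frame of the given byte length fits into the region.

    Assumption: the annotation limits the size of the whole frame.
    """
    return frame_len_bytes * 8 <= region_bits
```

With the example value of 1024 bits, only frames of up to 128 bytes fit, so
a 1500 byte frame would need a region of at least 12000 bits.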
While most of the P4 language is supported on the NetFPGA, some key
techniques are missing or not supported:
\begin{itemize}
\item Analysing / accessing payload is not supported
\item Checksum computation over payload is not supported
\item Using LPM tables can lead to compilation errors
\item Depending on the match type, only certain table sizes are allowed
\end{itemize}
Renaming variables in the declaration of the parser or deparser leads
to compilation errors. Function syntax is not supported. For this
reason our implementation uses \texttt{\#define} statements instead of functions.
FIXME:
General result: limited NAT64 is working, however
No Payload ; checksumming - requires controller
Hash function in progress ; No NDP, no ARP - focused on key factors of NAT64 translation,
other features can be supported by controller
Needed to debug internal parsing errors
debugging generated tcl code to debug impl1 error
% ----------------------------------------------------------------------
\section{\label{results:softwarenat64}Software NAT64 with Tayga and
Jool}
Both are CPU bound.
During the benchmark: CPU bound, single thread
Tayga: single threaded
easy to use
Jool: kernel module
100\% CPU usage on 1 core for UDP
0\% visible CPU usage for TCP, might be TCP offloading
Integration with iptables
Requires routing
% ----------------------------------------------------------------------
\section{\label{results:p4}P4}
All planned features could be realised with P4 and a controller.
The language restricts where if/switch statements can be
used.\footnote{In general, if and switch statements in actions lead to
errors, but not all combinations are forbidden.}
For this thesis the parsing capabilities of P4 were adequate. However,
at the time of writing P4 cannot parse ICMP6 options, as the upper
level protocol does not specify the number of options that follow and
parsing of 64 bit blocks is required.
P4/BMV2 does not support multiple LPM keys in a table; however, it
supports multiple keys with ternary matching.
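An LPM prefix can be written as a ternary value/mask pair, which is one
way to work with several prefix-like keys in a single table. The
following sketch shows the conversion on the controller side. The
function name is hypothetical, and using the prefix length as priority is
an assumption: whether a larger or a smaller number wins depends on the
target and its runtime interface.

```python
import ipaddress


def lpm_to_ternary(prefix):
    """Convert an LPM prefix such as '10.0.0.0/8' into (value, mask, priority).

    The mask has a 1 bit for every bit that must match exactly.
    """
    net = ipaddress.ip_network(prefix, strict=True)
    value = int(net.network_address)
    mask = int(net.netmask)
    # Assumption: longer prefixes should be preferred, so the prefix
    # length is returned as a priority hint for the table entry.
    return value, mask, net.prefixlen
```

Unlike LPM, ternary matching does not order overlapping entries by
itself, so the priority has to be installed together with each entry.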
When developing P4 programs, the reason for incorrect behaviour was
most often found in checksum problems. If frame checksum errors were
displayed by tcpdump, usually the effective length of the packet was
incorrect.
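For comparing against the values a switch emits, it can help to compute
the expected checksum offline. The sketch below implements the 16 bit
one's complement checksum described in RFC 1071, which IPv4, ICMP, TCP
and UDP use. It is a generic reference implementation and not code from
this thesis; which bytes have to be covered (for example a pseudo header)
depends on the protocol and is not handled here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the RFC 1071 checksum of data as a 16 bit integer."""
    if len(data) % 2:
        # Pad odd-length input with a zero byte, as RFC 1071 specifies.
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        # Add the data as big-endian 16 bit words.
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold the carries back into the lower 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Because the sum runs over a range determined by a length, a wrong length
changes the result even if every covered byte is correct.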
FIXME: IPv6: NDP: not easy to parse, as unknown number of following fields
The tooling around P4 is still fragile: we encountered many bugs
during development~\cite{schottelius:github1675}
as well as missing features~\cite{schottelius:github745,theojepsen:_get}.
Hitting expression bug (FIXME: source)
1) Impossible to retrieve key from table: LPM: addr + mask -> addr and
mask might be used in controller
2) retrieving information from tables : no meta information, don't
know which table matched
3) type definitions separate Code sharing (controller, switch)
No switch in actions, No conditional execution in actions
Although not directly related to P4, supporting scripts are usually
written in Python 2. Python 2 handles unicode strings differently from
Python 3, and thus effects like an IPv6
address ``changing'' can occur~\cite{appendix:p4:python2unicode}.
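One plausible mechanism for such a ``changing'' address is the confusion
of text and byte strings. This is an assumption, as the exact cause is not
described here. The sketch below reproduces the effect in Python 3 with
the standard \texttt{ipaddress} module: 16 bytes are interpreted as a
packed IPv6 address, even if they are the characters of a textual
address. The example address is chosen to be exactly 16 characters long.

```python
import ipaddress

text = "2001:db8:42::042"       # textual address, exactly 16 characters
raw = text.encode("ascii")      # the same characters as 16 raw bytes

# A text string is parsed as an address in the usual notation.
as_text = ipaddress.ip_address(text)

# A byte string of length 16 is taken as the packed (binary) address,
# so the ASCII codes of the characters become the address bits.
as_packed = ipaddress.ip_address(raw)

print(as_text)
print(as_packed)
```

The two printed addresses differ although both were created from the
``same'' string, so it is worth checking the type of a value before it is
passed to an address library.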
P4os - reusable code
idiomatic problem: Security issue: not checking checksums before

View file

@ -1,13 +1,19 @@
digraph G { digraph G {
node [ shape="box"]; node [ shape="box"];
rankdir="LR";
ipv6 [ label="IPv6" ] ipv6 [ label="IPv6" ]
icmp6 [ label="ICMP6" ] icmp6 [ label="ICMP6" ]
icmp6ns [ label="ICMP6 Neigbor Advertisement" ] icmp6ns [ label="ICMP6 Neigbor Advertisement" ]
icmp6nsll [ label="ICMP6 Neigbor Solicitation Link layer option" ] icmp6nsll [ label="ICMP6 Link layer option" ]
icmp6other [ label="More option fields" ] icmp6other [ label="Option field 1" ]
icmp6other2 [ label="Option field 2" ]
icmp6othern [ label="Option field n" ]
ipv6->icmp6->icmp6ns->icmp6nsll->icmp6other; ipv6->icmp6->icmp6ns->icmp6nsll->icmp6other->icmp6other2;
icmp6other2->icmp6othern [ style="dotted" ];
} }

View file

@ -0,0 +1,12 @@
digraph G {
node [ shape="box"];
# ORDER OF CREATION IS IMPORTANT FOR ORDERING!
v6host [ label="2001:db8:42::42" ];
nat64 [ label="NAT64 translator" ];
v4host [ label="10.0.0.42" ];
v6host->nat64 [ label="Connect to 2001:db8:42::10.0.0.42" ];
nat64->v4host [ dir=back label="Connect to 10.0.0.66" ];
}

View file

@ -3,56 +3,21 @@ graph G {
rankdir="LR"; rankdir="LR";
v6host1 [ label="IPv6 only host"];
v6host2 [ label="IPv6 only host"];
v6host3 [ label="IPv6 only host"];
v6host12 [ label="IPv6 only host"]; v6host12 [ label="IPv6 only host"];
v6host22 [ label="IPv6 only host"]; v6host22 [ label="IPv6 only host"];
v6host32 [ label="IPv6 only host"]; v6host32 [ label="IPv6 only host"];
v4host1 [ label="IPv4 only host"];
v4host2 [ label="IPv4 only host"];
v4host3 [ label="IPv4 only host"];
v4host12 [ label="IPv4 only host"]; v4host12 [ label="IPv4 only host"];
v4host22 [ label="IPv4 only host"]; v4host22 [ label="IPv4 only host"];
v4host32 [ label="IPv4 only host"]; v4host32 [ label="IPv4 only host"];
switchv6 [ label="Network Switch", shape="oval" ];
switchv4 [ label="Network Switch", shape="oval" ];
switchboth [ label="Network Switch with NAT64", shape="oval" ]; switchboth [ label="Network Switch with NAT64", shape="oval" ];
nat64gw [ label="NAT64 translator", rank=max ]; v6host12--switchboth;
v6host22--switchboth;
v6host32--switchboth;
subgraph cluster_seperate { v4host12--switchboth;
label="Separated IPv6/IPv4 networks"; v4host22--switchboth;
v4host32--switchboth;
rank="max";
v6host1--switchv6;
v6host2--switchv6;
v6host3--switchv6;
v4host1--switchv4;
v4host2--switchv4;
v4host3--switchv4;
switchv4--nat64gw;
switchv6--nat64gw;
};
subgraph cluster_mixed {
label="Mixed IPv6/IPv4 networks";
rank="min";
v6host12--switchboth;
v6host22--switchboth;
v6host32--switchboth;
v4host12--switchboth;
v4host22--switchboth;
v4host32--switchboth;
}
} }

View file

@ -5,17 +5,21 @@ graph G {
v6host [ label="IPv6 only host"]; v6host [ label="IPv6 only host"];
v4host [ label="IPv4 only host"]; v4host [ label="IPv4 only host"];
switch1 [ label="Network Switch", shape="oval" ]; switch1 [ label="Network Switch", shape="oval" ];
switch2 [ label="Network Switch", shape="oval" ]; switch2 [ label="Network Switch", shape="oval" ];
v6router [ label="IPv6 router" ];
v4router [ label="IPv4 router" ];
nat64gw [ label="NAT64 translator", rank=max ]; nat64gw [ label="NAT64 translator", rank=max ];
v6host--switch1; v6host--switch1;
v4host--switch2; v4host--switch2;
switch1--nat64gw; switch1--v6router--nat64gw;
switch2--nat64gw; switch2--v4router--nat64gw;

View file

@ -0,0 +1,28 @@
graph G {
node [ shape="box"];
rankdir="LR";
v6host1 [ label="IPv6 only host"];
v6host2 [ label="IPv6 only host"];
v6host3 [ label="IPv6 only host"];
v4host1 [ label="IPv4 only host"];
v4host2 [ label="IPv4 only host"];
v4host3 [ label="IPv4 only host"];
switchv6 [ label="Network Segment", shape="oval" ];
switchv4 [ label="Network Segment", shape="oval" ];
nat64gw [ label="Router /\nNAT64 translator", rank=max ];
v6host1--switchv6;
v6host2--switchv6;
v6host3--switchv6;
v4host1--switchv4;
v4host2--switchv4;
v4host3--switchv4;
switchv4--nat64gw;
switchv6--nat64gw;
}

View file

@ -133,6 +133,14 @@
title = {High speed NAT64 in P4 (git repository)}, title = {High speed NAT64 in P4 (git repository)},
howpublished = {\url{https://gitlab.ethz.ch/nsg/student-projects/ma-2019-19_high_speed_nat64_with_p4}}} howpublished = {\url{https://gitlab.ethz.ch/nsg/student-projects/ma-2019-19_high_speed_nat64_with_p4}}}
@Misc{nico18:_implem_layer_ipv4_ipv6_rever_proxy,
author = {Nico Schottelius and Sarah Plocher},
title = {Implementation of a Layer 7 IPv4 to IPv6 Reverse Proxy},
howpublished = {Protected git repository \url{https://gitlab.ethz.ch/nicosc/sdn-nat64/}, part of the Advanced topics in communication networks course fall 2019, \url{https://adv-net.ethz.ch/}},
year = 2018}
@Misc{schottelius:_exter_p4_netpf, @Misc{schottelius:_exter_p4_netpf,
author = {Nico Schottelius}, author = {Nico Schottelius},
title = {Extern for checksum'ing payload (P4-NetPFGA-public)}, title = {Extern for checksum'ing payload (P4-NetPFGA-public)},
@ -149,3 +157,9 @@
title = {Jumbo frame}, title = {Jumbo frame},
howpublished = {\url{https://en.wikipedia.org/wiki/Jumbo_frame}}, howpublished = {\url{https://en.wikipedia.org/wiki/Jumbo_frame}},
note = {Requested on 2019-08-15}} note = {Requested on 2019-08-15}}
@Misc{huston:_ipv4_addres_repor,
author = {Geoff Huston},
title = {IPv4 Address Report},
howpublished = {\url{https://ipv4.potaroo.net/}},
note = {Requested on 2019-08-18}}