corrections, corrections, corrections...

Nico Schottelius 2019-08-21 19:06:10 +02:00
parent a5858febf0
commit 2bc5964ae3
7 changed files with 85 additions and 75 deletions


@ -31,14 +31,14 @@ are discussed in section \ref{results:p4}.
As opposed to general purpose programming languages, P4 lacks some
features. Most notably loops, floating point operations and
modulo operations.
However within its constraints, P4 can guarantee
However, within its constraints, P4 can guarantee
operation at line speed, which general purpose programming languages
cannot guarantee and also fail to achieve in reality
(see section \ref{results:softwarenat64} for details).
% ok
% ----------------------------------------------------------------------
\section{\label{background:ip}IPv6, IPv4 and Ethernet}
The first IPv6 RFC was published in 1998~\cite{rfc2460}. Both IPv4 and
The first IPv6 RFC was published in 1998~\cite{rfc2460}. Both IPv6 and
IPv4 operate on layer 3 of the OSI model. In this thesis we only
consider transmission via Ethernet, which operates at
layer 2. Inside the Ethernet frame a field named ``type'' specifies
@ -78,7 +78,7 @@ Figure \ref{fig:arpndp} illustrates a typical address resolution process.
\caption{ARP and NDP}
\label{fig:arpndp}
\end{figure}
The major difference between ARP and NDP in relation to P4 are
The major differences between ARP and NDP in relation to P4 are
\begin{itemize}
\item ARP is a separate protocol on the same layer as IPv6 and IPv4,
\item NDP operates below ICMP6 which operates below IPv6,
@ -87,7 +87,7 @@ The major difference between ARP and NDP in relation to P4 are
(specifically: ICMP6 link layer address option).
\end{itemize}
ARP is required to be a separate protocol, because IPv4 hosts don't
know how to communicate with each other yet, because they don't have a
know how to communicate with each other yet, as they don't have a
way to communicate to the target IPv4 address (``The chicken and the
egg problem'').
NDP on the other hand already works within IPv6, as every IPv6 host is
@ -228,10 +228,10 @@ port can be reused again.
\caption{Stateful NAT64}
\label{fig:statefulnat64}
\end{figure}
The selection of mapped ports is usually based on the availability on
The selection of mapped ports is usually based on the availability of
the IPv4 side and not related to the original port. To support
stateful NAT64, the translator needs to store the mapping in a table and
purge entries regularly.
stateful NAT64, the translator needs to store the mapping
in a table and purge entries regularly.
Stateful NAT64 usually uses information found in protocols at layer 4
like TCP~\cite{rfc793} or UDP~\cite{rfc768}. However, it can also
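
The mapping table with regular purging described in this hunk can be sketched as follows. This is a minimal illustration in Python and not the thesis code: the class name `SessionTable`, the timeout value and the injectable clock are assumptions made only for this example.

```python
import time


class SessionTable:
    """Hypothetical stateful NAT64 session table (illustration only)."""

    def __init__(self, ports, timeout=60.0, clock=time.monotonic):
        self.free = list(ports)      # IPv4-side ports that are currently unused
        self.timeout = timeout       # seconds of inactivity before an entry is purged
        self.clock = clock           # injectable clock, which makes testing easy
        self.sessions = {}           # (v6_addr, v6_port) -> [v4_port, last_seen]

    def lookup_or_create(self, v6_addr, v6_port):
        """Return the mapped IPv4 port, allocating a free one if needed."""
        key = (v6_addr, v6_port)
        now = self.clock()
        entry = self.sessions.get(key)
        if entry is None:
            if not self.free:
                raise RuntimeError("no free IPv4 port available")
            # The mapped port depends on availability, not on the original port.
            entry = [self.free.pop(0), now]
            self.sessions[key] = entry
        entry[1] = now               # refresh the activity timestamp
        return entry[0]

    def purge(self):
        """Drop idle sessions and return their ports to the free list."""
        now = self.clock()
        expired = [key for key, (_, seen) in self.sessions.items()
                   if now - seen > self.timeout]
        for key in expired:
            port, _ = self.sessions.pop(key)
            self.free.append(port)
        return len(expired)
```

A translator would call `lookup_or_create()` for every outgoing packet and `purge()` periodically, so that ports become reusable once a session has been idle for longer than the timeout.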
@ -255,7 +255,7 @@ implementations.
While protocol dependent translation has the highest amount of
information to choose from for translation, complex parsers or even
cryptographic methods are required for it. That reduces the
opportunities of protocol dependent translations to run on devices
opportunities for protocol dependent translations to run on
less sophisticated devices.
% ----------------------------------------------------------------------
\subsection{\label{background:transition:prefixnat}Mapping IPv4
@ -340,7 +340,7 @@ The DNS64 DNS server will query the authoritative DNS server for an AAAA
record. However as the host \textit{ipv4onlyhost.example.com} is only
reachable by IPv4, it also only has an A entry. After receiving the
answer that there is no AAAA record, the DNS64 server will ask for an
A record and gets an answer that the name
A record and will get an answer that the name
\textit{ipv4onlyhost.example.com} resolves to the IPv4 address
\textit{192.0.2.0}. The DNS64 server then embeds the IPv4 address in
the configured IPv6 prefix (\textit{64:ff9b::/96} in this case) and
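
The embedding step described in this hunk can be sketched with Python's standard `ipaddress` module. The function names `embed_ipv4` and `extract_ipv4` are hypothetical, and the sketch assumes a /96 prefix, as in the example above.

```python
import ipaddress


def embed_ipv4(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Place the 32 bit IPv4 address in the low 32 bits of a /96 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))


def extract_ipv4(ipv6: str) -> ipaddress.IPv4Address:
    """Recover the embedded IPv4 address from the low 32 bits."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(ipv6)) & 0xFFFFFFFF)


print(embed_ipv4("64:ff9b::/96", "192.0.2.0"))   # 64:ff9b::c000:200
print(extract_ipv4("64:ff9b::c000:200"))         # 192.0.2.0
```

The same bit operation is what a NAT64 translator reverses when it receives a packet addressed to the prefix.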
@ -388,10 +388,10 @@ defined in RFC768 and RFC793 and are shown in \ref{fig:ipv4pseudoheader}.
\label{fig:ipv6pseudoheader}
\end{figure}
When translating, the checksum fields in the higher protocols need to be
adjusted. The checksums for TCP and UDP is calculated not only over the pseudo
adjusted. The checksums for TCP and UDP are calculated not only over the pseudo
headers, but also contain the payload of the packet. This is
important because some targets (like the NetFPGA) do not allow to
access the payload (see section \ref{design:netpfga}).
important because some targets (like the NetFPGA) do not allow
accessing the payload (see section \ref{design:netpfga}).
\begin{figure}[h]
\begin{verbatim}
0 7 8 15 16 23 24 31
@ -419,12 +419,6 @@ Its calculation can be summarised as follows:
% ----------------------------------------------------------------------
\section{\label{background:networkdesign}Network Designs}
%% \begin{figure}[h]
%% \includegraphics[scale=0.5]{v4only}
%% \centering
%% \caption{IPv4 only network}
%% \label{fig:v4onlynet}
%% \end{figure}
In relation to IPv6 and IPv4, there are in general three different
network designs possible:
The oldest form are IPv4 only networks.
@ -433,12 +427,6 @@ hosts that are either not configured for IPv6 or are even technically
incapable of enabling the IPv6 protocol. These nodes are connected to
an IPv4 router that is connected to the Internet. That router might be
capable of translating IPv4 to IPv6 and vice versa.
%% \begin{figure}[h]
%% \includegraphics[scale=0.5]{dualstack}
%% \centering
%% \caption{Dualstack network}
%% \label{fig:dualstacknet}
%% \end{figure}
With the introduction of IPv6, hosts can have a separate IP stack
active and in that configuration hosts are called ``dualstack hosts''.
@ -446,8 +434,8 @@ Dualstack hosts are capable of reaching both IPv6 and IPv4 hosts
directly without the need of any translation mechanism.
The last possible network design is based on IPv6 only hosts.
While it is technically easy to disable IPv4, it
seems that completely removing the IPv4 stack in current operating
While it is technically easy to disable IPv4,
completely removing the IPv4 stack in current operating
systems is not an easy task~\cite{ungleich:_ipv4}.
%% \begin{figure}[h]
%% \includegraphics[scale=0.5]{v6only}
@ -458,7 +446,7 @@ systems is not an easy task~\cite{ungleich:_ipv4}.
While the three network designs look similar, there are significant
differences in operating them and limitations that are not easy to
circumvent. In the following sections, we describe the limitations and
reason how a translation mechanism like our NAT64 implementation
explain how a translation mechanism like our NAT64 implementation
should be deployed.
% ----------------------------------------------------------------------
\subsection{\label{background:networkdesign:ipv4}IPv4 only network limitations}
@ -478,14 +466,14 @@ IPv6 addresses independent of the protocol is not possible.
maintenance}
While dualstack hosts can address any host in either IPv6 or IPv4
networks, the deployment of dualstack hosts comes with a major
disadvantage: all network configuration double. The required routing
disadvantage: all network configurations double. The required routing
tables double, the firewall rules roughly double\footnote{The rule sets
even for identical policies in IPv6 and IPv4 networks are not
identical, but similar. For this reason we state that roughly double
the amount of firewall rules are required for the same policy to be
applied.} and the number of network supporting systems, (like DHCPv4,
DHCPv6, router advertisement daemons, etc.) also roughly double.
Additionally services that run on either IPv6 or IPv4 might need to be
Additionally, services that run on either IPv6 or IPv4 might need to be
configured to run in dualstack mode as well and not every software
might be capable of that.
So while there is the instant benefit of not requiring any transition mechanism
@ -494,7 +482,7 @@ operational cost) of running dual stack networks can be significant.
% ----------------------------------------------------------------------
\subsection{\label{background:networkdesign:v6only}IPv6 only networks}
IPv6 only networks are in our opinion the best choice for long term
deployments. The reasons for this are as follows: First of all hosts
deployments. Our reasons for this are the following: First of all hosts
eventually will need to support IPv6 and secondly
IPv6 hosts can address the whole 32 bit IPv4 Internet mapped in
a single /96 IPv6 network. IPv6 only networks also allow the operators
@ -510,7 +498,7 @@ to focus on one IP stack.
The NetFPGA~\cite{zilberman:_netfp_sume}
is an FPGA card featuring four 10 Gbit/s SFP+ ports. It
includes the Xilinx Virtex-7 690T FPGA on board, 27 MB of storage,
allowing to save table data, and 8 GB of DDR3 RAM. The NetFPGA can be
to save table data, and 8 GB of DDR3 RAM. The NetFPGA can be
run inside a host (connected by PCI-E, gen 3) or as a standalone
card.


@ -15,10 +15,10 @@ Our in-network solution allows novel translations
without involving
external routers.\footnote{Compare
figures \ref{fig:v6v4standard} and \ref{fig:v6v4mixed}.}
We expect this to supporting migration to IPv6 only networks
We expect this to support migration to IPv6 only networks.
% P4
P4 has been proven for us as a suitable programming language for
P4 has been proven to us as a suitable programming language for
network equipment with great potential. However in the current state
the tooling and frameworks are still immature and need significant
work to become usable for solving day-to-day challenges or supporting
@ -26,7 +26,7 @@ large scale projects. Even with the current state drawbacks, P4 is a
very convincing language that has wide range of applications due to
its protocol independence and easy to understand architecture.
The availability of protocol independent programmable network
equipment opens up many possibilities for in network
equipment opens up many possibilities in network
programming. While this thesis focused on NAT64, the accompanying
technology DNS64~\cite{rfc6147} could also be implemented in P4, thus
completing the translation mechanism.
@ -54,7 +54,7 @@ depth translation like ICMP/ICMP6 specifics.
% P4
The P4 language has shown maturity, but the usability and ease of use
of the provided toolchains can be significantly improved. Additionally
of the provided toolchains can be significantly improved. Additionally,
we envision a stronger tie between the different tools in the P4
environment, like a collection of libraries and modules that could
form something on the line of a ``P4OS''. This operating system could
@ -67,7 +67,7 @@ The NetFPGA, from the hardware point of view, is a very
interesting hardware platform. Reducing the difficulties
we experienced with the surrounding toolchain and making
development experience more consistent has the potential to not only
make NetFPGA, but also the who set of P4 hardware more interesting for
make NetFPGA, but also the whole set of P4 hardware more interesting for
developers.
%% PMTU


@ -4,37 +4,37 @@
In this chapter we describe the architecture of our solution and our
design choices. We first introduce the general design of NAT64 in the
P4 architecture. Afterwards we describe the design differences
for the BMV2 and NetFPGA P4 architectures. Afterwards we discuss the
design of stateless and stateful NAT64 in relation to P4 an well as
of the BMV2 and NetFPGA P4 architectures. Afterwards we discuss the
design of stateless and stateful NAT64 in relation to P4 as well as
two existing software NAT64 solutions.
Lastly we discuss how we verify NAT64 functionality present
the network configurations that we use.
Lastly we discuss how we verify NAT64 functionality and
present the network configurations that we use.
% ----------------------------------------------------------------------
\section{\label{design:nat64}P4/NAT64}
\begin{figure}[h]
\includegraphics[scale=0.4]{switchdesign}
\includegraphics[scale=0.5]{switchdesign}
\centering
\caption{P4 Switch Architecture}
\label{fig:switchdesign}
\end{figure}
In section \ref{background:transition} we discussed different
translation mechanisms for IPv6 and IPv4. In this thesis we focus on
the translation mechanisms stateless and stateful NAT64. While higher
the translation mechanisms ``stateless'' and ``stateful'' NAT64. While higher
layer protocol dependent translations are more flexible, this topic
has already been addressed in
\cite{nico18:_implem_layer_ipv4_ipv6_rever_proxy} and the focus in
this thesis is on the practicability of high speed NAT64 with P4.
The high level design can be seen in figure \ref{fig:switchdesign}: a
P4 capable switch is running our code to provide NAT64
functionality. A P4 switch cannot manage its tables on it own and
functionality. A P4 switch cannot manage its tables on its own and
needs support for this from a controller. The controller also has the
role to handle unknown packets and can modify the runtime
configuration of the switch. This is especially useful in the case of
stateful NAT64.
If only static table entries
are required, they can usually be added at the start of a P4 switch
and the controller can also be omitted. However stateful
and the controller can also be omitted. However, stateful
NAT64 requires the use of a controller to create session entries in the
switch tables.
The P4 switch can use any protocol to communicate with the controller, as
@ -51,7 +51,8 @@ Software NAT64 solutions typically require routing to be applied to
transport the packet to the NAT64 translator as shown in figure
\ref{fig:v6v4standard}.
Our design differs here: while routing could be used like described
Our design differs here:
while routing could be used like described
above, NAT64 with P4 does not require any routing to be setup. Figure
\ref{fig:v6v4mixed} shows the network design that we realise using
P4. This design has multiple advantages: first it reduces the number
@ -64,15 +65,16 @@ segment.
\caption{In-network NAT64 translation}
\label{fig:v6v4mixed}
\end{figure}
% ----------------------------------------------------------------------
\section{\label{design:nat64:indepth}P4/NAT64}
P4 switches in general look very similar to regular switches, however
support executing logic while the packet passes through the switch.
When a packet enters the switch,
\digraph[scale=0.5]{abc}{rankdir=LR; a->b->c;}
support executing logic while the packet passes through the
switch. Figure \ref{fig:p4switch} illustrates how our solution is
implemented and translates packets.
\begin{figure}[h]
\includegraphics[scale=0.5]{p4switch}
\centering
\caption{Our P4 Switch Architecture}
\label{fig:p4switch}
\end{figure}
% ----------------------------------------------------------------------
\section{\label{design:bmv2}P4/BMV2}
\begin{figure}[h]
@ -200,10 +202,10 @@ and then calculates the difference including a
possible carry bit and adjusts the higher level protocol by this
difference (\texttt{delta\_tcp\_from\_v6\_to\_v4()}).
Figure \ref{fig:checksumbydiff} shows an
excerpt of the code used for adjust the checksum when translating TCP
excerpt of the code used for adjusting the checksum when translating TCP
from IPv6 to IPv4.
It is notable that
not the full headers are used, but only a ``pseudo header'' (compare figures
not the full headers are used, but only a ``pseudo header'' is (compare figures
\ref{fig:ipv6pseudoheader} and \ref{fig:ipv4pseudoheader}).
% ok
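
The idea of adjusting a checksum by the difference of the pseudo headers can be sketched in Python. This is an illustration under stated assumptions and not the P4 code of the thesis: the function names are hypothetical, and the arithmetic follows the incremental update of the Internet checksum (one's complement, as in RFC 1624). The point it shows is that the payload is never read when adjusting.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """16 bit one's complement sum with the carry folded back in."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(data: bytes) -> int:
    """Full Internet checksum over data (used here only for comparison)."""
    return ~ones_complement_sum(data) & 0xFFFF


def pseudo_header_v6(src, dst, length: int, proto: int) -> bytes:
    # source, destination, 32 bit length, 3 zero bytes, next header
    return src.packed + dst.packed + struct.pack("!I3xB", length, proto)


def pseudo_header_v4(src, dst, length: int, proto: int) -> bytes:
    # source, destination, zero byte, protocol, 16 bit length
    return src.packed + dst.packed + struct.pack("!BBH", 0, proto, length)


def adjust_checksum(old_checksum: int, old_pseudo: bytes, new_pseudo: bytes) -> int:
    """Update a checksum using only the two pseudo headers."""
    old_sum = ~old_checksum & 0xFFFF
    removed = ~ones_complement_sum(old_pseudo) & 0xFFFF   # subtract the old header
    added = ones_complement_sum(new_pseudo)               # add the new header
    total = old_sum + removed + added
    while total >> 16:                                    # fold the carry bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

For a TCP segment translated from IPv6 to IPv4, `adjust_checksum()` applied to the old checksum and both pseudo headers should give the same value as recomputing `checksum()` over the IPv4 pseudo header plus the whole segment.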
@ -243,7 +245,7 @@ entries are configured differently depending on the implementation:
Limitations in the P4/NetFPGA environment require to use table
entries. Jool supports individual entries as a special case of LPM,
with a network mask matching only one IP address. Tayga
support LPM for translation from IPv6 to IPv4, but requires individual
supports LPM to translate from IPv6 to IPv4, but requires individual
entries for translating from IPv4 to IPv6. Our P4/BMV2 offers the
highest degree of flexibility, as it provides support for individual
entries based on table entries and LPM table entries.
@ -261,7 +263,7 @@ to solve this problem:
that don't have a table entry, sets the table entry in the P4 switch
and inserts the original packet afterwards back into the switch.
\item With tayga we rely on the Linux kernel NAT44 capabilities
\item Jool implements its own stateful mechanism based on a port
\item Jool implements its own stateful mechanism based on port
ranges
\end{itemize}
All methods though operate in a very similar fashion: A ``controller''


@ -8,7 +8,7 @@ We distinguish the software implementation of P4 (BMV2) and the
hardware implementation (NetFPGA) due to significant differences in
deployment and development. We present benchmarks for the existing
software solutions as well as for our hardware implementation. As the
objective of this thesis was to demonstrate the high speed
objective of this thesis is to demonstrate the high speed
capabilities of NAT64 in hardware, no benchmarks were performed on the
P4 software implementation.
% ok
@ -17,7 +17,7 @@ P4 software implementation.
We successfully implemented P4 code to realise
NAT64~\cite{schottelius:thesisrepo}. It contains parsers
for all related protocols (IPv6, IPv4, UDP, TCP, ICMP, ICMP6, NDP,
ARP), supports EAMT as defined by RFC7757 ~\cite{rfc7757} and is
ARP), supports EAMT as defined by RFC7757~\cite{rfc7757}, and is
feature equivalent to the two compared software solutions
tayga~\cite{lutchansky:_tayga_simpl_nat64_linux} and
jool~\cite{mexico:_jool_open_sourc_siit_nat64_linux}.
@ -41,7 +41,7 @@ superset of LPM matching.
When developing P4 programs, the reason for the incorrect behaviour we
have seen was checksum problems. This is in retrospect expected,
as the main task our implementation does is modify headers on which
as the main task of our implementation is modifying headers on which
the checksums depend. In all cases we have seen Ethernet frame
checksum errors, the effective length of the packet was incorrect.
@ -54,7 +54,7 @@ the matching key from table or the name of the action called. Thus
if different table entries call the same action, it is impossible
within the action, or if forwarded to the controller, within the
controller to distinguish on which match the action was
triggered. This problem is very consistent within P4, as not even the
triggered. This problem is very consistent within P4, not even the
matching table name can be retrieved. While this information can be
added manually as additional fields in the table entries, we would
expect a language to support reading and forwarding this kind of meta
@ -68,7 +68,7 @@ that this duplication is a likely source of errors in bigger software
projects.
The supporting scripts in the P4 toolchain are usually written in
python2. However python2 ``is
python2. However, python2 ``is
legacy''~\cite{various:_shoul_i_python_python}. During development
errors with unicode string handling in python2 caused
changes to IPv6 addresses.
@ -136,7 +136,7 @@ fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\
\end{center}
\end{table}
The switch responds to ICMP echo requests, ICMP6 echo requests,
answers NDP and ARP requests. Overall P4/BMV is very easy to use
answers NDP and ARP requests. Overall P4/BMV is very easy to use,
even without a controller a fully functional network host can be
implemented.
@ -159,7 +159,7 @@ support of the NetFPGA P4 compiler to inspect payload and to compute
checksums over payload. While this can (partially) be compensated
using delta checksums, the compile time of 2 to 6 hours contributed to
a significantly slower development cycle compared to BMV2.
Lastly, the focus of this thesis was to implement high speed NAT64 on
Lastly, the focus of this thesis is to implement high speed NAT64 on
P4, which only requires a subset of the features that we realised on
BMV2. Table \ref{tab:p4netpfgafeatures} summarises the implemented
features and reasons about their implementation status.
@ -233,7 +233,7 @@ unsupported\footnote{To support creating payload checksums, either an
% ok
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:stability}Stability}
Two different NetPFGA cards were used during the development of the
Two different NetFPGA cards were used during the development of this
thesis. The first card had consistent ioctl errors (compare section
\ref{appendix:netfpgalogs:compilelogs}) when writing table entries. The available
hardware tests (compare figures \ref{fig:hwtestnico} and
@ -258,7 +258,7 @@ function properly multiple times. In theses cases the card would not
forward packets anymore. Multiple reboots (up to 3)
and multiple times reflashing the bitstream to the NetFPGA usually
restored the intended behaviour. However, due to these ``crashes'', it
was impossible for us run a benchmark for more than one hour.
was impossible for us to run a benchmark for more than one hour.
Similarly, sometimes flashing the bitstream to the NetFPGA would fail.
It was required to reboot the host containing the
NetFPGA card up to 3 times to enable successful flashing.\footnote{Typical
@ -284,12 +284,12 @@ To use the NetFPGA, the tools Vivado and SDNET provided by Xilinx need to be
installed. However a bug in the installer triggers an infinite loop,
if a certain shared library\footnote{The required shared library
is libncurses5.} is missing on the target operating system. The
installation program seems still to be progressing, however does never
finish.
installation program seems to be still progressing, however never
finishes.
While the NetFPGA card supports P4, the toolchains and supporting
scripts are in a immature state. The compilation process consists of
at least 9 different steps, which are interdependent\footnote{See
scripts are in an immature state. The compilation process consists of
at least 9 different steps, which are interdependent.\footnote{See
source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate
shell scripts and python scripts that in turn generate JSON
data.\footnote{One compilation step calls the script
@ -358,7 +358,7 @@ the port and capturing on the other side.
Jumbo frames\footnote{Frames with an MTU greater than 1500 bytes.} are
commonly used in 10 Gbit/s networks. According to
\ref{wikipedia:_jumbo}, even many gigabit network interface card
\cite{wikipedia:_jumbo}, even many gigabit network interface cards
support jumbo frames. However, according to emails on the private
NetFPGA mailing list, the NetFPGA only supports 1500 byte frames at
the moment and additional work is required to implement support for
@ -367,9 +367,9 @@ bigger frames.
Our P4 source code contains the required Xilinx
annotations\footnote{F.i. ``@Xilinx\_MaxPacketRegion(1024)''} that define
the maximum packet size in bits. We observed two different errors on
the output packet, if the incoming packets exceeds the specified size:
the output packet, if the incoming packets exceed the specified size:
\begin{itemize}
\item The output packet is longer then the original packet.
\item The output packet is longer than the original packet.
\item The output packet is corrupted.
\end{itemize}
@ -413,7 +413,7 @@ X520 cards. Figure \ref{fig:softwarenat64design}
shows the network setup.
When testing the NetFPGA/P4 performance, the X520 cards in the NAT64
translator are disconnected and instead the NetFPGA ports are
connected, as show in figure \ref{fig:netpfgadesign}. The load
connected, as shown in figure \ref{fig:netpfgadesign}. The load
generator is equipped with a quad core CPU (Intel(R) Core(TM) i7-6700
CPU @ 3.40GHz), enabled with hyperthreading and 16 GB RAM. The NAT64
translator is also equipped with a quad core CPU (Intel(R) Core(TM)

Binary file not shown.

21
doc/graphviz/p4switch.dot Normal file

@ -0,0 +1,21 @@
digraph G {
rankdir="TB";
v4host [ shape="box" label="IPv4 Host" rank="min" ];
v6host [ shape="box" label="IPv6 Host" ];
parser [ label="Parser"];
deparser [ label="Deparser"];
translation [ label="Translation"];
v4packet [ label="IPv4 Packet"];
v6packet [ label="IPv6 Packet"];
subgraph cluster_nat64 {
label="P4 Switch";
parser->v4packet->translation->v6packet->deparser;
}
deparser->v6host;
v4host->parser;
}


@ -162,7 +162,6 @@
title = {P4-Programming on an FPGA, Semester Thesis SA-2019-02},
howpublished = {\url{https://gitlab.ethz.ch/nsg/student-projects/sa-2019-02_p4_programming_sume_netfpga/blob/master/SA-2019-02.pdf}}}
@Misc{wikipedia:_jumbo,
author = {Wikipedia},
title = {Jumbo frame},