master-thesis/doc/Results.tex
2019-08-16 12:23:35 +02:00

641 lines
25 KiB
TeX
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

\chapter{\label{results}Results}
%** Results.tex: What were the results achieved including an evaluation
%
This section describes the achieved results and compares the P4 based
implementation with real world software solutions.
We distinguish the software implementation of P4 (BMV2) and the
hardware implementation (NetFPGA) due to significant differences in
deployment and development. We present benchmarks for the existing
software solutions as well as for our hardware implementation. As the
objective of this thesis was to demonstrate the high speed
capabilities of NAT64 in hardware, no benchmarks were performed on the
P4 software implementation.
% ----------------------------------------------------------------------
\section{\label{results:p4}NAT64 Overview - FIXME: verify numbers}
We successfully implemented P4 code to realise
NAT64\cite{schottelius:thesisrepo}. It contains parsers
for all related protocols (ipv6, ipv4, udp, tcp, icmp, icmp6, ndp,
arp), supports EAMT as defined by RFC7757 \cite{rfc7757} and is
feature equivalent to the two compared software solutions
tayga\cite{lutchansky:_tayga_simpl_nat64_linux} and
jool\cite{mexico:_jool_open_sourc_siit_nat64_linux}.
Due to limitations in the P4 environment of the
NetFPGA\cite{conclusion:netfpga} environment, the BMV2 implementation
is more feature rich. Table \ref{tab:benchmark} summarises the
achieved bandwidths of the NAT64 solutions.
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c |}
\hline
Solution & \multicolumn{3}{|c|}{Parallel connections} \\
& 1 & 20 & 3 \\
\hline
Tayga & 3.02 & 3.28 & 2.85\\
\hline
Jool & 6.67 & 16.8 ?? & 20.5 udp?\\
\hline
P4 / NetPFGA & 9.28 & 9.29 & 9.29\\
\hline
\end{tabular}
\end{minipage}
\caption{NAT64 Benchmark (client: IPv6, server: IPv4), all results in Gbit/sec (\%loss)}
\label{tab:benchmarkv6}
\end{center}
\end{table}
During the benchmarks the client -- CPU usage
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c |}
\hline
Solution & \multicolumn{3}{|c|}{Parallel connections} \\
& 1 & 20 & 3 \\
\hline
Tayga & 3.36 & 3.29 & 3.11 \\
\hline
Jool & 8.24 & 8.26 & 8.29\\
\hline
P4 / NetPFGA & 9.28 & 9.29 & 9.29\\
\hline
\end{tabular}
\end{minipage}
\caption{NAT64 Benchmark (client: IPv4, server: IPv6), all results in Gbit/sec (\%loss)}
\label{tab:benchmarkv4}
\end{center}
\end{table}
Feature comparison
speed - sessions - eamt
can act as host
lpm tables
ping
ping6 support
ndp
controller support
% ----------------------------------------------------------------------
\section{\label{Results:BMV2}BMV2}
The software implementation of P4 has most features, which is
mostly due to the capability of checksumming the payload: Acting
as a ``proper'' participant in NDP, requires the host to calculate
checksums over the payload.
List of features:
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandeled packets to
controller & fully implemented\footnote{Source code: \texttt{actions\_egress.p4}}\\
\hline
Controller to Switch & Controller can setup table entries &
fully implemented\footnote{Source code: \texttt{controller.py}}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
fully implemented\footnote{Source code:
\texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ARP & Switch can answer ARP request (without controller) & fully
implemented\footnote{Source code: \texttt{actions\_arp.p4}}\\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
fully implemented\footnote{Source code:
\texttt{actions\_nat64\_session.p4}, \texttt{controller.py}} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / BMV2 feature list}
\label{tab:p4bmv2features}
\end{center}
\end{table}
Responds to icmp, icmp6
ndp \cite{rfc4861}
arp
Fully functional host
Can compute checksums on its own.
focus on typical use cases of icmp, icmp6, the software implementation
supports translating echo request and echo reply messages, but does
not support all ICMP/ICMP6 translations that are defined in
RFC6145\cite{rfc6145}.
Stateful : no automatic removal
Session management not benchmarked, as it is only a matter of creating
table entries.
Jool and tayga are supported by
Trace files
\begin{verbatim}
create mode 100644 pcap/tcp-udp-delta-2019-07-17-1555-h1.pcap
create mode 100644 pcap/tcp-udp-delta-2019-07-17-1555-h3.pcap
create mode 100644 pcap/tcp-udp-delta-2019-07-17-1557-h1.pcap
create mode 100644 pcap/tcp-udp-delta-2019-07-17-1558-h3.pcap
\end{verbatim}
% ----------------------------------------------------------------------
\section{\label{results:netpfga}NetFPGA}
The reduced feature set of the NetPFGA implementation is due to two
factors: compile time. Between 2 to 6 hours per compile run. No
payload checksum
overview - general translation - not advanced features
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:features}Features}
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandeled packets to
controller & portable\footnote{While the NetFPGA P4 implementation
does not have the clone3() extern that the BMV2 implementation offers,
communication to the controller can easily be realised by using one of
the additional ports of the NetFPGA and connect a physical network
card to it.}\\
\hline
Controller to Switch & Controller can setup table entries &
portable\footnote{The p4utils suite offers an easy access to the
switch tables. While the P4-NetFPGA support repository also offers
python scripts to modify the switch tables, the code is less
sophisticated and more fragile.}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
portable\footnote{NetFPGA/P4 does not offer calculating the checksume
over the payload. However delta checksumming can be used to create
the required checksum for replying.} \\
\hline
ARP & Switch can answer ARP request (without controller) &
portable\footnote{As ARP does not use checksums, integrating the
source code \texttt{actions\_arp.p4} into the netpfga code base is
enough to enable ARP support in the NetPFGA.} \\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
portable\footnote{ICMP/ICMP6 translations only require enabling the
icmp/icmp6 code in the netpfga code base.} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
portable\footnote{Same reasoning as ``Controller to switch''.} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
unsupported\footnote{To support creating payload checksums, either an
HDL module needs to be created or to modify the generated
the PX program.\cite{schottelius:_exter_p4_netpf}} \\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / NetFPGA feature list}
\label{tab:p4netpfgafeatures}
\end{center}
\end{table}
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:stability}Stability}
Two different NetPFGA cards were used during the development of the
thesis. The first card had consistent ioctl errors (compare section
\ref{netpfgaioctlerror}) when writing table entries. The available
hardware tests (compare figures \ref{fig:hwtestnico} and
\ref{fig:hwtesthendrik}) showed failures in both cards, however the
first card reported an additional ``10G\_Loopback'' failure. Due to
the inability of setting table entries, no benchmarking was performed
on the first NetFPGA card.
\begin{figure}[h]
\includegraphics[scale=1.4]{hwtestnico}
\centering
\caption{Hardware Test NetPFGA card 1}
\label{fig:hwtestnico}
\end{figure}
\begin{figure}[h]
\includegraphics[scale=0.2]{hwtesthendrik}
\centering
\caption{Hardware Test NetPFGA card 2, \cite{hendrik:_p4_progr_fpga_semes_thesis_sa}}
\label{fig:hwtesthendrik}
\end{figure}
During the development and benchmarking, the second NetFPGA card stopped to
function properly multiple times. In both cases the card would not
forward packets anymore. Multiple reboots (3 were usually enough)
and multiple times reflashing the bitstream to the NetFPGA usually
restored the intended behaviour. However due to this ``crashes'', it
was impossible to complete a full benchmark run that would last for
more than one hour.
Sometimes it was also required to reboot the host containing the
NetFPGA card 3 times to enable successful flashing.\footnote{Typical
output of the flashing process would be: ``fpga configuration failed. DONE PIN is not HIGH''}
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:performance}Performance}
As expected, the NetFGPA card performed at near line speed and offers
NAT64 translations at 9.28 Gbit/s. Single and multiple streams
performed almost exactly identical and have been consistent through
multiple iterations of the benchmarks.
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:usability}Usability}
To use the NetFGPA, Vivado and SDNET provided by Xilinx need to be
installed. However a bug in the installer triggers an infinite loop,
if a certain shared library\footnote{The required shared library
is libncurses5.} is missing on the target operating system. The
installation program seems still to be progressing, however does never
finish.
While the NetFPGA card supports P4, the toolchains and supporting
scripts are in a immature state. The compilation process consists of
at least 9 different steps, which are interdependent\footnote{See
source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate
shell scripts and python scripts that in turn generate JSON
data.\footnote{One compilation step calls the script
``config\_writes.py''. This script failed with a syntax error, as it
contained incomplete python code. The scripts config\_writes.py
and config\_writes.sh are generated by gen\_config\_writes.py.
The output of the script gen\_config\_writes.py depends on the content
of config\_writes.txt. That file is generated by the simulation
``xsim''. The file ``SimpleSumeSwitch\_tb.sv'' contains code that is
responsible for writing config\_writes.txt and uses a function
named axi4\_lite\_master\_write\_request\_control for generating the
output. This in turn is dependent on the output of a script named
gen\_testdata.py.}
However incorrect parsing generates syntactically incorrect
scripts or scripts that generate incorrect output. The toolchain
provided by the NetFGPA-P4 repository contains more than 80000 lines
of code. The supporting scripts for setting table entries require
setting the parameters for all possible actions, not only for the
selected action. Supplying only the required parameters results in a
crash of the supporting script.
The documentation for using the NetFPGA-P4 repository is very
distributed and does not contain a reference on how to use the
tools. Mapping of egress ports and their metadata field are found in a
python script that is used for generating test data.
The compile process can take up to 6 hours and because the different
steps are interdependent, errors in a previous stage were in our
experiences detected hours after they happened. The resulting log
files of the compilation process can be up to 5 MB in size. Within
this log file various commands output references to other logfiles,
however the referenced logfiles do not exist before or after the
compile process.
During the compile process various informational, warning and error
messages are printed. However some informational messages constitute
critical errors, while on the other hand critical errors and syntax
errors often do not constitue a critical
error.\footnote{F.i. ``CRITICAL WARNING: [BD 41-737] Cannot set the
parameter TRANSLATION\_MODE on /axi\_interconnect\_0. It is
read-only.'' is a non critical warning.}
Also contradicting
output is generated.\footnote{While using version 2018.2, the following
message was printed: ``WARNING: command 'get\_user\_parameter' will be removed in the 2015.3
release, use 'get\_user\_parameters' instead''.}
Programs or scripts that are called during the compile process do not
necessarily exit non zero if they encountered a critical error. Thus
finding the source of an error can be difficult due to the compile
process continuing after critical errors occured. Not only programs
that have critical errors exit ``successfully'', but also python
scripts that encounter critical paths don't abort with raise(), but
print an error message to stdout and don't abort with an error.
The most often encountered critical compile error is
``Run 'impl\_1' has not been launched. Unable to open''. This error
indicates that something in the previous compile steps failed and can
refer to incorrectly generated testdata to unsupported LPM tables.
The NetFPGA kernel module provides access to virtual Linux
devices (nf0...nf3). However tcpdump does not see any packets that are
emitted from the switch. The only possibility to capture packets
that are emitted from the switch is by connecting a physical cable to
the port and capturing on the other side.
Jumbo frames\footnote{Frames with an MTU greater than 1500 bytes.} are
commonly used in 10 Gbit/s networks. According to
\ref{wikipedia:_jumbo}, even many gigabit network interface card
support jumbo frames. However according to emails on the private
NetPFGA mailing list, the NetFPGA only supports 1500 byte frames at
the moment and additional work is required to implement support for
bigger frames.
Our P4 source code required contains Xilinx
annotations\footnote{F.i. ``@Xilinx\_MaxPacketRegion(1024)''} that define
the maximum packet size in bits. We observed two different errors on
the output packet, if the incoming packets exceeds the specified size:
\begin{itemize}
\item The output packet is longer then the original packet.
\item The output packet is corrupted.
\end{itemize}
While most of the P4 language is supported on the netpfga, some key
techniques are missing or not supported.
\begin{itemize}
\item Analysing / accessing payload is not supported
\item Checksum computation over payload is not supported
\item Using LPM tables can lead to compilation errors
\item Depening on the match type, only certain table sizes are allowed
\end{itemize}
Renaming variables in the declaration of the parser or deparser lead
to compilation errors. Function syntax is not supported. For this
reason our implementation uses \texttt{\#define} statements instead of functions.
\begin{verbatim}
*** DONE 2019-07-21: Proof of v6->v4 working delta based
CLOSED: [2019-07-21 Sun 12:30]
#+BEGIN_CENTER
pcap/tcp-udp-delta-from-v6-2019-07-21-0853-h1.pcap | Bin 0 -> 4252 bytes
pcap/tcp-udp-delta-from-v6-2019-07-21-0853-h3.pcap | Bin 0 -> 2544 bytes
#+END_CENTER
\end{verbatim}
\begin{verbatim}
**** DONE Testing v4->v6 tcp: ok (version 10.0)
CLOSED: [2019-08-04 Sun 09:15]
#+BEGIN_CENTER
nico@ESPRIMO-P956:~/master-thesis/bin$ ./socat-connect-tcp-v4
+ echo from-v4-ok
+ socat - TCP:10.0.0.66:2345
TCPv6-ok
nico@ESPRIMO-P956:~/master-thesis/bin$ ./socat-listen-tcp-v6
from-v4-ok
#+END_CENTER
trace:
netfpga-nat64-2019-08-04-0907-enp2s0f0.pcap
netfpga-nat64-2019-08-04-0907-enp2s0f1.pcap
**** DONE Testing v4->v6 udp: ok (version 10.1)
trace:
create mode 100644 pcap/netfpga-nat64-udp-2019-08-04-0913-enp2s0f0.pcap
create mode 100644 pcap/netfpga-nat64-udp-2019-08-04-0913-enp2s0f1.pcap
\end{verbatim}
\begin{verbatim}
*** DONE 2019-08-04: version 10.1/10.2: new maxpacketregion: v4->v6 works
CLOSED: [2019-08-04 Sun 19:42]
#+BEGIN_CENTER
nico@ESPRIMO-P956:~/master-thesis/bin$ ./init_ipv4_esprimo.sh
nico@ESPRIMO-P956:~/master-thesis/bin$ ./set_ipv4_neighbor.sh
#+END_CENTER
Test 20 first:
- Does't work -> missed to add table entries
- Does work after setting table entries
- 300 works
- 1450 works
- 1500 does not work
Proof:
create mode 100644 pcap/netfpga-10.2-maxpacket-2019-08-04-1931-enp2s0f0.pcap
create mode 100644 pcap/netfpga-10.2-maxpacket-2019-08-04-1931-enp2s0f1.pcap
\end{verbatim}
\begin{verbatim}
*** DONE 2019-08-04: test v6 -> v4: works for 1420
CLOSED: [2019-08-04 Sun 20:30]
Proof:
#+BEGIN_CENTER
create mode 100644 pcap/netfpga-10.2-fromv6tov4-2019-08-04-1943-enp2s0f0.pcap
create mode 100644 pcap/netfpga-10.2-fromv6tov4-2019-08-04-1943-enp2s0f1.pcap
\end{verbatim}
General result: limited NAT64 is working, however
No Payload
checksumming - requires controller
Hash funktion in Arbeit
No NDP, no ARP - focused on key factors of NAT64 translation,
other features can be supported by controller
Needed to debug internal parsing errors
debugging generated tcl code to debug impl1 error
% ----------------------------------------------------------------------
\section{\label{results:tayga}Tayga}
During the benchmark cpu bound, single thread
tayga: Single threaded
% ----------------------------------------------------------------------
\section{\label{results:jool}Jool}
kernel module
high cpu usage for udp connetcinos
Integration with iptables
% ----------------------------------------------------------------------
\section{\label{results:p4}P4}
All planned features could be realised with P4 and a controller.
The language has some limitations on where if/switch statements can be
used.\footnote{In general, if and switch statements in actions lead to
errors, but not all constellations are forbidden.}
For this thesis the parsing capabilities of P4 were adequate. However
P4 at the time of writing cannot parse ICMP6 options, as the upper
level protocol does not specify the number of options that follow and
parsing of 64 bit blocks is required.
P4/BMV2 does not support for multiple LPM keys in a table, however it
supports multiple keys with ternary matching.
When developing P4 programs, the reason for incorrect behaviour was
most often found in checksum problems. If frame checksum errors where
displayed by tcpdump, usually the effective length of the packet was
incorrect.
NDP parsing problem
checksumming a frequent problem and helper
IPv6: NDP: not easy to parse, as unknown number of following fields
if things don't work, often a checksum problem.
\begin{verbatim}
ipaddress.ip_network("2001:db8:61::/64")
IPv6Network(u'3230:3031:3a64:6238:3a36:313a:3a2f:3634/128')
Fix:
from __future__ import unicode_literals
\end{verbatim}
The tooling around P4 is still fragile, encountered many bugs
in the development.\cite{schottelius:github1675}
or missing features (\cite{schottelius:github745},
\cite{theojepsen:_get})
Hitting expression bug
retrieving information from tables
\begin{verbatim}
Key and mask for matching destination is in table. We need this
information in the action. However this information is not exposed, so
we need to specify another parameter with the same information as in
the key(s).
Log from slack: (2019-03-14)
nico [1:55 PM]
If I use LPM for matching, can I easily get the network address from P4 or do I have to use a bitmask myself? In the latter case it is not exactly clear how to get the mask from the table
Nate Foster [1:58 PM]
You want to retrieve the address in the packet? In a table?
And do you want to do the retrieving from the data plane or the control plane? (edited)
nico [2:00 PM]
If I have a match in a table that matches on LPM, it can be any IP address in a network
For calculating the NAT64/NAT46 translation, I will need the base address, i.e. network address to do subtractions/additions
So it is fully data plane, what I would like to do
I'll commit sample code to show the use case more clearly
https://gitlab.ethz.ch/nicosc/master-thesis/blob/master/p4src/static-mapping.p4#L73
GitLab
p4src/static-mapping.p4 · master · nicosc / master-thesis
gitlab.ethz.ch
So the action nat64_static() is used in the table v6_networks.
In v6_networks I use a match on `hdr.ipv6.dst_addr: lpm;`
What I would like to be able is to get the network address ; I can do that manually, if I have the mask
I can also re-inject this parameter by another action argument, but I'd assume that I can somewhere read this out from the table / match
Nate Foster [2:15 PM]
To make sure I understand, in the data plane, you want to retrieve the address in the lpm pattern? (edited)
nico [2:16 PM]
I want to retrieve the key
Nate Foster [2:16 PM]
Wait. The value `hdr.ipv6.dst_addr` is the thing used in the match.
So you have that.
What you dont have is the IPv6 address and mask put into the table by the control plane.
I assume you want the latter, right?
nico [2:17 PM]
For example, if my matching key is 2001:db8::/32 and the real address is 2001:db8::f00, then I would like to retrieve 2001:db8:: and 32 from the table
exactly :slightly_smiling_face:
I can "fix" this by adding another argument, but it feels somewhat wrong to do that
Because the table already knows this information
Nate Foster [2:26 PM]
I cant think of a way other than the action parameter hack.
nico [2:26 PM]
Oh, ok
Is it because the information is "lost in hardware"?
Nate Foster [2:31 PM]
No youre right that most implementations have the value in memory. And one can imagine a different table API that allowed one to retrieve it in the data plane.
But unless I am missing something obvious, P4 hides it…
\end{verbatim}
no meta information
\begin{verbatim}
Is there any meta information for "from which table was the action
called" available? My use case is having a debug action that sends
packets to the controller and I use it as a default_action in various
tables; however know I don't know anymore from which table the action
was called. Is there any kind of meta information which table called
me available?
I could work around this by using if(! .. .hit) { my_action(table_id)
}, but it would not work with using default_action = ...
\end{verbatim}
type definitions separate
Code sharing (controller, switch)
\begin{verbatim}
*** DONE Synchronisation with the controller
- Double data type definition -> might differ
- TYPE_CPU for ethernet
- Port ingress offset (9 vs. 16 bit)
\end{verbatim}
No switch in actions, No conditional execution in actions
P4os - reusable code
\begin{verbatim}
Not addressed so far: how to create re-usable code fragments that can
be plugged in easily. There could be a hypothetical "P4OS" that
manages code fragments. This might include, but not limited to
downloading (signed?) source code, managing dependencies similar to
Linux package management, handling updates, etc.
\end{verbatim}
idomatic problem: Security issue: not checking checksums before