Cleanup appendix
parent
d99a65ed27
commit
27d4d449aa
4 changed files with 591 additions and 1492 deletions
|
@ -66,3 +66,6 @@ Long term supporting python3 would be helpful. P4OS.
|
|||
|
||||
|
||||
- react on FIN/RST (?) -- could be an addition
|
||||
P4os - reusable code
|
||||
|
||||
Future work: session handling
|
||||
|
|
273
doc/Results.tex
|
@ -14,12 +14,26 @@ P4 software implementation.
|
|||
% ok
|
||||
% ----------------------------------------------------------------------
|
||||
\section{\label{results:p4}P4 based implementations}
|
||||
We successfully implemented P4 code to realise
|
||||
NAT64~\cite{schottelius:thesisrepo}. It contains parsers
|
||||
for all related protocols (ipv6, ipv4, udp, tcp, icmp, icmp6, ndp,
|
||||
arp), supports EAMT as defined by RFC7757~\cite{rfc7757} and is
|
||||
feature equivalent to the two compared software solutions
|
||||
tayga~\cite{lutchansky:_tayga_simpl_nat64_linux} and
|
||||
jool~\cite{mexico:_jool_open_sourc_siit_nat64_linux}.
|
||||
Due to limitations in the P4 environment of the
NetFPGA~\cite{conclusion:netfpga}, the BMV2 implementation
|
||||
is more feature rich. Table \ref{tab:benchmark} summarises the
|
||||
achieved bandwidths of the NAT64 solutions.
|
||||
|
||||
BEFORE OR AFTER MARKER - FIXME
|
||||
|
||||
All planned features could be realised with P4 and a controller.
|
||||
For this thesis the parsing capabilities of P4 were adequate.
|
||||
However P4, at the time of writing, cannot parse ICMP6 options in
|
||||
general, as the upper level protocol does not specify the number
|
||||
of options that follow and parsing of an undefined number
|
||||
of 64 bit blocks is required.
|
||||
of 64 bit blocks is required, which P4 does not support.
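To illustrate why a variable number of options is hard to express in a fixed parser, the following Python sketch walks an NDP option area in software. It assumes the option layout of RFC 4861 (one byte type, one byte length counted in units of 8 octets); the function name \texttt{parse\_ndp\_options} is hypothetical and not part of the thesis code.

```python
import struct


def parse_ndp_options(data: bytes):
    """Split an NDP option area into (type, value) pairs.

    Each option starts with a 1-byte type and a 1-byte length that
    counts units of 8 octets (including these two header bytes), so
    the number of options is only known after walking the buffer.
    """
    options = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated option header")
        opt_type, opt_len = struct.unpack_from("!BB", data, offset)
        if opt_len == 0:
            raise ValueError("option length 0 is invalid")
        end = offset + opt_len * 8
        if end > len(data):
            raise ValueError("option exceeds buffer")
        # Value is everything after the type and length bytes.
        options.append((opt_type, data[offset + 2:end]))
        offset = end
    return options


# Example: a source link-layer address option followed by an MTU option.
sample = bytes([1, 1, 0x02, 0, 0, 0, 0, 1]) + bytes([5, 1, 0, 0]) + (1500).to_bytes(4, "big")
parsed = parse_ndp_options(sample)
```

The loop runs until the buffer is exhausted, which is exactly the open-ended iteration that the text describes as unsupported in P4.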
|
||||
|
||||
The language has some limitations on the placement of
conditional statements (\texttt{if/switch}).\footnote{In general,
|
||||
|
@ -61,34 +75,15 @@ The supporting scripts in the P4 toolchain are usually written in
|
|||
python2. However python2 ``is
|
||||
legacy''~\cite{various:_shoul_i_python_python}. During development
|
||||
errors with unicode string handling in python2 caused
|
||||
changes to IPv6 addresses.~\ref{appendix:p4:python2unicode}
|
||||
|
||||
P4os - reusable code
|
||||
|
||||
% idiomatic problem: Security issue: not checking checksums before
|
||||
|
||||
|
||||
****** TODO IPv6 udp -> IPv4
|
||||
- Got 4-5 tuple ([proto], src ip, src port, dst ip, dst port)
|
||||
- Does not / never signal end
|
||||
- Needs timeout for cleaning up
|
||||
|
||||
P4/BMV2 thus
allows us to most closely resemble any other translation implementation.
|
||||
|
||||
Only supporting /96, not other embeddings as described in
|
||||
section \ref{background:transition:prefixnat}.
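As a minimal sketch of what the /96 embedding means, the following Python code places an IPv4 address into the last 32 bits of a /96 IPv6 prefix and extracts it again. The prefix \texttt{64:ff9b::/96} (the well-known prefix of RFC 6052) and the function names are assumptions for illustration; the thesis implementation may use a different prefix.

```python
import ipaddress


def embed_ipv4(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 IPv6 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 96:
        raise ValueError("only /96 prefixes are handled in this sketch")
    v4 = int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Address(int(net.network_address) | v4)


def extract_ipv4(prefix: str, ipv6: str) -> ipaddress.IPv4Address:
    """Recover the IPv4 address from an address inside the /96 prefix."""
    net = ipaddress.IPv6Network(prefix)
    addr = ipaddress.IPv6Address(ipv6)
    if net.prefixlen != 96 or addr not in net:
        raise ValueError("address is not inside the given /96 prefix")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


mapped = embed_ipv4("64:ff9b::/96", "192.0.2.1")
restored = extract_ipv4("64:ff9b::/96", str(mapped))
```

Other prefix lengths split the IPv4 address around reserved bits, which is why this sketch rejects anything other than /96.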
|
||||
|
||||
changes to IPv6 addresses.\footnote{Compare section~\ref{appendix:p4:python2unicode}.}
|
||||
% ok
|
||||
% ----------------------------------------------------------------------
|
||||
\subsection{\label{Results:BMV2}BMV2}
|
||||
\section{\label{results:bmv2}P4/BMV2}
|
||||
The software implementation of P4 has the most features, which is
|
||||
mostly due to the capability of checksumming the payload: Acting
|
||||
as a ``proper'' participant in NDP, requires the host to calculate
|
||||
checksums over the payload.
|
||||
|
||||
|
||||
List of features BMV2 ~\cite{tab:p4bmv2features}
|
||||
|
||||
mostly due to the capability of creating checksums over the payload.
|
||||
It enables the switch to act as a ``proper'' participant in NDP, as
|
||||
this requires the host to calculate checksums over the payload.
|
||||
Table~\ref{tab:p4bmv2features} lists all implemented features.
|
||||
\begin{table}[htbp]
|
||||
\begin{center}\begin{minipage}{\textwidth}
|
||||
\begin{tabular}{| c | c | c |}
|
||||
|
@ -140,42 +135,38 @@ fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\
|
|||
\hline
|
||||
\end{tabular}
|
||||
\end{minipage}
|
||||
\caption{P4 / BMV2 feature list}
|
||||
\caption{P4/BMV2 feature list}
|
||||
\label{tab:p4bmv2features}
|
||||
\end{center}
|
||||
\end{table}
|
||||
The switch responds to ICMP echo requests and ICMP6 echo requests,
and answers NDP and ARP requests. Overall, P4/BMV2 is very easy to use:
even without a controller, a fully functional network host can be
implemented.
|
||||
|
||||
Responds to icmp, icmp6
|
||||
ndp ~\cite{rfc4861}
|
||||
arp
|
||||
|
||||
very easy to use
|
||||
|
||||
Fully functional host
|
||||
Can compute checksums on its own.
|
||||
|
||||
focus on typical use cases of icmp, icmp6, the software implementation
|
||||
supports translating echo request and echo reply messages, but does
|
||||
not support all ICMP/ICMP6 translations that are defined in
|
||||
This P4/BMV2 implementation supports translating ICMP/ICMP6
|
||||
echo request and echo reply messages, but does not support
|
||||
all ICMP/ICMP6 translations that are defined in
|
||||
RFC6145~\cite{rfc6145}.
|
||||
|
||||
Stateful : no automatic removal
|
||||
|
||||
Session management not benchmarked, as it is only a matter of creating
|
||||
table entries.
|
||||
|
||||
Jool and tayga are supported by
|
||||
|
||||
|
||||
% ----------------------------------------------------------------------
|
||||
\subsection{\label{results:netpfga}NetFPGA - FIXME: writing}
|
||||
The reduced feature set of the NetPFGA implementation is due to two
|
||||
factors: compile time. Between 2 to 6 hours per compile run. No
|
||||
payload checksum
|
||||
|
||||
overview - general translation - not advanced features
|
||||
\section{\label{results:netpfga}P4/NetFPGA}
|
||||
In the following section we describe the achieved feature set of
|
||||
P4/NetFPGA in detail and analyse differences to the BMV2 based
|
||||
implementation.
|
||||
% ok
|
||||
% ----------------------------------------------------------------------
|
||||
\subsubsection{\label{results:netpfga:features}Features}
|
||||
\subsection{\label{results:netpfga:features}Features}
|
||||
While the NetFPGA target supports P4, compared to P4/BMV2
we only implemented a reduced feature set on P4/NetFPGA. The first
reason for this is that the NetFPGA P4 compiler lacks
support for inspecting the payload and for computing
checksums over the payload. While this can (partially) be compensated
using delta checksums, the compile time of 2 to 6 hours contributed to
a significantly slower development cycle compared to BMV2.
|
||||
Lastly, the focus of this thesis was to implement high speed NAT64 on
|
||||
P4, which only requires a subset of the features that we realised on
|
||||
BMV2. Table \ref{tab:p4netpfgafeatures} summarises the implemented
|
||||
features and reasons about their implementation status.
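The idea behind delta checksums can be illustrated outside of P4. The following Python sketch follows the incremental update formula of RFC 1624, $HC' = \sim(\sim HC + \sim m + m')$, which adjusts an existing Internet checksum for changed 16 bit words without reading the payload. The function names are chosen for this illustration and are not taken from the thesis source code.

```python
def ones_add(a: int, b: int) -> int:
    """Add two 16-bit values in one's complement arithmetic."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def checksum(words) -> int:
    """Full Internet checksum over a sequence of 16-bit words."""
    acc = 0
    for w in words:
        acc = ones_add(acc, w)
    return ~acc & 0xFFFF


def update_checksum(old_csum: int, old_words, new_words) -> int:
    """Incrementally update a checksum (RFC 1624, equation 3).

    Only the words that change are needed; the rest of the packet,
    including the payload, is never touched.
    """
    acc = ~old_csum & 0xFFFF
    for w in old_words:
        acc = ones_add(acc, ~w & 0xFFFF)  # remove the old words
    for w in new_words:
        acc = ones_add(acc, w)            # add the new words
    return ~acc & 0xFFFF


# Example: two address words change, the other words stay the same.
before = [0x4500, 0x0054, 0xC000, 0x0201]
after = [0x4500, 0x0054, 0x0A00, 0x0001]
delta_result = update_checksum(checksum(before), before[2:], after[2:])
```

In a NAT64 setting the changed words would be the address fields that enter the pseudo header, so the update stays independent of the payload length.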
|
||||
\begin{table}[htbp]
|
||||
\begin{center}\begin{minipage}{\textwidth}
|
||||
\begin{tabular}{| c | c | c |}
|
||||
|
@ -239,12 +230,13 @@ unsupported\footnote{To support creating payload checksums, either an
|
|||
\hline
|
||||
\end{tabular}
|
||||
\end{minipage}
|
||||
\caption{P4 / NetFPGA feature list}
|
||||
\caption{P4/NetFPGA feature list}
|
||||
\label{tab:p4netpfgafeatures}
|
||||
\end{center}
|
||||
\end{table}
|
||||
% ok
|
||||
% ----------------------------------------------------------------------
|
||||
\subsubsection{\label{results:netpfga:stability}Stability}
|
||||
\subsection{\label{results:netpfga:stability}Stability}
|
||||
Two different NetFPGA cards were used during the development of the
|
||||
thesis. The first card had consistent ioctl errors (compare section
|
||||
\ref{netpfgaioctlerror}) when writing table entries. The available
|
||||
|
@ -266,25 +258,33 @@ on the first NetFPGA card.
|
|||
\label{fig:hwtesthendrik}
|
||||
\end{figure}
|
||||
During the development and benchmarking, the second NetFPGA card stopped
|
||||
functioning properly multiple times. In both cases the card would not
|
||||
forward packets anymore. Multiple reboots (3 were usually enough)
|
||||
functioning properly multiple times. In these cases the card would not
|
||||
forward packets anymore. Multiple reboots (up to 3)
|
||||
and multiple times reflashing the bitstream to the NetFPGA usually
|
||||
restored the intended behaviour. However, due to these ``crashes'', it
|
||||
was impossible to complete a full benchmark run that would last for
|
||||
more than one hour.
|
||||
|
||||
Sometimes it was also required to reboot the host containing the
|
||||
NetFPGA card 3 times to enable successful flashing.\footnote{Typical
|
||||
output of the flashing process would be: ``fpga configuration failed. DONE PIN is not HIGH''}
|
||||
was impossible for us to run a benchmark for more than one hour.
|
||||
Similarly, sometimes flashing the bitstream to the NetFPGA would fail.
|
||||
It was required to reboot the host containing the
|
||||
NetFPGA card up to 3 times to enable successful flashing.\footnote{Typical
|
||||
output of the flashing process would be: ``fpga configuration
|
||||
failed. DONE PIN is not HIGH''}
|
||||
% ok
|
||||
% ----------------------------------------------------------------------
|
||||
\subsubsection{\label{results:netpfga:performance}Performance}
|
||||
As expected, the NetFGPA card performed at near line speed and offers
|
||||
NAT64 translations at 9.28 Gbit/s. Single and multiple streams
|
||||
The NetFPGA card performed at near line speed and offers
|
||||
NAT64 translations at 9.28 Gbit/s (see section \ref{results:benchmark}
|
||||
for details).
|
||||
Single and multiple streams
|
||||
performed almost identically and were consistent across
multiple iterations of the benchmarks.
|
||||
% ok
|
||||
% ----------------------------------------------------------------------
|
||||
\subsubsection{\label{results:netpfga:usability}Usability}
|
||||
To use the NetFGPA, Vivado and SDNET provided by Xilinx need to be
|
||||
\subsection{\label{results:netpfga:usability}Usability}
|
||||
Handling and using the NetFPGA card is rather difficult. In
|
||||
this section we describe our findings and experiences with the card
|
||||
and its toolchain.
|
||||
|
||||
To use the NetFPGA, the tools Vivado and SDNET provided by Xilinx need to be
|
||||
installed. However a bug in the installer triggers an infinite loop,
|
||||
if a certain shared library\footnote{The required shared library
|
||||
is libncurses5.} is missing on the target operating system. The
|
||||
|
@ -388,36 +388,68 @@ techniques are missing or not supported.
|
|||
Renaming variables in the declaration of the parser or deparser leads
|
||||
to compilation errors. Function syntax is not supported. For this
|
||||
reason our implementation uses \texttt{\#define} statements instead of functions.
|
||||
|
||||
FIXME:
|
||||
|
||||
General result: limited NAT64 is working, however
|
||||
No Payload ; checksumming - requires controller
|
||||
Hash function in progress ; No NDP, no ARP - focused on key factors of NAT64 translation,
|
||||
other features can be supported by controller
|
||||
Needed to debug internal parsing errors
|
||||
debugging generated tcl code to debug impl1 error
|
||||
|
||||
%ok
|
||||
% ----------------------------------------------------------------------
|
||||
\section{\label{results:softwarenat64}Software based NAT64}
|
||||
with Tayga and
|
||||
Jool
|
||||
Both cpu bound.
|
||||
|
||||
During the benchmark cpu bound, single thread
|
||||
tayga: Single threaded
|
||||
easy to use
|
||||
|
||||
Jool kernel module
|
||||
100\% cpu usage on 1 core for udp
|
||||
0\% visible cpu usage for tcp, might be tcp offloading
|
||||
Integration with iptables
|
||||
Requires routing
|
||||
|
||||
|
||||
Both software solutions, Tayga and Jool, worked flawlessly. However, as
expected, both are CPU bound. In high load
scenarios both solutions fully utilise one core. Neither Tayga as a
user space program nor Jool as a kernel module implements
multi-threading.
|
||||
%ok
|
||||
% ----------------------------------------------------------------------
|
||||
\section{\label{results:benchmark}NAT64 Benchmarks - FIXME: explain
|
||||
numbers}
|
||||
\section{\label{results:benchmark}NAT64 Benchmarks}
|
||||
In this section we summarise the benchmarking results. In the
subsections we discuss the benchmark design and the individual results.
|
||||
|
||||
FIXME: summary here
|
||||
|
||||
MTU setting to 1500, as netpfga doesn't support jumbo frames
|
||||
|
||||
|
||||
iperf3, iperf 3.0.11
|
||||
|
||||
50 parallel = 2x 100\% cpu usage
40 parallel = 100\%, 70\% cpu usage
30 parallel = 70\%-100\%, 70\% cpu usage
|
||||
|
||||
Turning back on checksum offloading (see below)
|
||||
|
||||
30 parallel = 70\%, 30\% cpu usage
|
||||
|
||||
|
||||
\subsection{\label{benchmark:tayga:tcp}Tayga/TCP}
|
||||
|
||||
Tayga running at 100\% cpu load,
|
||||
|
||||
v4->v6 tcp
delivering
3.36 Gbit/s at P1
3.30 Gbit/s at P20
3.11 Gbit/s at P50

v6->v4 tcp
P1: 3.02 Gbit/s
P20: 3.28 Gbit/s
P50: 2.85 Gbit/s
|
||||
|
||||
Commands:
|
||||
|
||||
|
||||
UDP load generator hitting 100\% cpu at P20.
|
||||
TCP confirmed.
|
||||
Over bandwidth results
|
||||
|
||||
Feature comparison
|
||||
speed - sessions - eamt
|
||||
can act as host
|
||||
lpm tables
|
||||
ping
|
||||
ping6 support
|
||||
ndp
|
||||
controller support
|
||||
|
||||
netpfga consistent
|
||||
% ----------------------------------------------------------------------
|
||||
\subsection{\label{results:benchmark:design}Benchmark Design}
|
||||
\begin{figure}[h]
|
||||
|
@ -449,20 +481,10 @@ warm up phase.\footnote{iperf -O 10 parameter, see section \ref{design:tests}.}
|
|||
\end{figure}
|
||||
% ok
|
||||
% ----------------------------------------------------------------------
|
||||
|
||||
|
||||
We successfully implemented P4 code to realise
|
||||
NAT64~\cite{schottelius:thesisrepo}. It contains parsers
|
||||
for all related protocols (ipv6, ipv4, udp, tcp, icmp, icmp6, ndp,
|
||||
arp), supports EAMT as defined by RFC7757 ~\cite{rfc7757} and is
|
||||
feature equivalent to the two compared software solutions
|
||||
tayga~\cite{lutchansky:_tayga_simpl_nat64_linux} and
|
||||
jool~\cite{mexico:_jool_open_sourc_siit_nat64_linux}.
|
||||
Due to limitations in the P4 environment of the
|
||||
NetFPGA~\cite{conclusion:netfpga} environment, the BMV2 implementation
|
||||
is more feature rich. Table \ref{tab:benchmark} summarises the
|
||||
achieved bandwidths of the NAT64 solutions.
|
||||
|
||||
\newpage
|
||||
\subsection{\label{results:benchmark:v6v4tcp}IPv6 to IPv4 TCP
|
||||
Benchmark Results}
|
||||
some text
|
||||
|
||||
\begin{table}[htbp]
|
||||
\begin{center}\begin{minipage}{\textwidth}
|
||||
|
@ -487,8 +509,8 @@ Parallel connections & 1 & 10 & 20 & 50 \\
|
|||
\label{tab:benchmarkv6}
|
||||
\end{center}
|
||||
\end{table}
|
||||
|
||||
|
||||
% ---------------------------------------------------------------------
|
||||
\subsection{\label{results:benchmark:v4v6tcp}IPv4 to IPv6 TCP Benchmark Results}
|
||||
During the benchmarks the client -- CPU usage
|
||||
\begin{table}[htbp]
|
||||
\begin{center}\begin{minipage}{\textwidth}
|
||||
|
@ -514,7 +536,11 @@ Parallel connections & 1 & 10 & 20 & 50 \\
|
|||
\end{center}
|
||||
\end{table}
|
||||
|
||||
|
||||
% ---------------------------------------------------------------------
|
||||
\newpage
|
||||
\subsection{\label{results:benchmark:v6v4udp}IPv6 to IPv4 UDP
|
||||
Benchmark Results}
|
||||
other text
|
||||
\begin{table}[htbp]
|
||||
\begin{center}\begin{minipage}{\textwidth}
|
||||
\begin{tabular}{| c | c | c | c | c |}
|
||||
|
@ -540,7 +566,9 @@ Parallel connections & 1 & 10 & 20 & 50 \\
|
|||
\end{center}
|
||||
\end{table}
|
||||
|
||||
|
||||
% ---------------------------------------------------------------------
|
||||
\subsection{\label{results:benchmark:v4v6udp}IPv4 to IPv6 UDP Benchmark Results}
|
||||
last text
|
||||
\begin{table}[htbp]
|
||||
\begin{center}\begin{minipage}{\textwidth}
|
||||
\begin{tabular}{| c | c | c | c | c |}
|
||||
|
@ -565,18 +593,3 @@ Parallel connections & 1 & 10 & 20 & 50 \\
|
|||
\label{tab:benchmarkv4}
|
||||
\end{center}
|
||||
\end{table}
|
||||
|
||||
UDP load generator hitting 100\% cpu at P20.
|
||||
TCP confirmed.
|
||||
Over bandwidth results
|
||||
|
||||
Feature comparison
|
||||
speed - sessions - eamt
|
||||
can act as host
|
||||
lpm tables
|
||||
ping
|
||||
ping6 support
|
||||
ndp
|
||||
controller support
|
||||
|
||||
netpfga consistent
|
||||
|
|
BIN
doc/Thesis.pdf
Binary file not shown.
1807
doc/appendix.tex
File diff suppressed because it is too large
Load diff