\chapter{\label{results}Results}
%** Results.tex: What were the results achieved including an evaluation
%
This chapter describes the achieved results and compares the P4-based
implementation with real-world software solutions.

We distinguish between the software implementation of P4 (BMV2) and the
hardware implementation (NetFPGA) because of significant differences in
deployment and development. We present benchmarks for the existing
software solutions as well as for our hardware implementation. As the
objective of this thesis was to demonstrate the high-speed
capabilities of NAT64 in hardware, no benchmarks were performed on the
P4 software implementation.
% ----------------------------------------------------------------------
\section{\label{results:p4}NAT64 Benchmarks - FIXME: explain numbers}
We successfully implemented P4 code to realise
NAT64\cite{schottelius:thesisrepo}. It contains parsers
for all related protocols (IPv6, IPv4, UDP, TCP, ICMP, ICMP6, NDP,
ARP), supports EAMT as defined by RFC7757 \cite{rfc7757}, and is
feature equivalent to the two compared software solutions,
Tayga\cite{lutchansky:_tayga_simpl_nat64_linux} and
Jool\cite{mexico:_jool_open_sourc_siit_nat64_linux}.
Due to limitations in the P4 environment of the
NetFPGA (compare section \ref{conclusion:netfpga}), the BMV2 implementation
is more feature rich. Table \ref{tab:benchmarkv6} and the tables that
follow it summarise the achieved bandwidths of the NAT64 solutions.

\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c | c |}
\hline
Implementation & \multicolumn{4}{|c|}{min/avg/max in Gbit/s} \\
\hline
Tayga & 2.79 / 3.20 / 3.43 & 3.34 / 3.36 / 3.38 & 2.57 / 3.02 / 3.27 &
2.35 / 2.91 / 3.20 \\
\hline
Jool & 8.22 / 8.22 / 8.22 & 8.21 / 8.21 / 8.22 & 8.21 / 8.23 / 8.25
& 8.21 / 8.23 / 8.25\\
\hline
P4 / NetFPGA & 9.28 / 9.28 / 9.29 & 9.28 / 9.28 / 9.29 & 9.28 / 9.28
/ 9.29 & 9.28 / 9.28 / 9.29\\
\hline
Parallel connections & 1 & 10 & 20 & 50 \\
\hline
\end{tabular}
\end{minipage}
\caption{IPv6 to IPv4 TCP NAT64 Benchmark}
\label{tab:benchmarkv6}
\end{center}
\end{table}

During the benchmarks the client -- CPU usage

\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c | c |}
\hline
Implementation & \multicolumn{4}{|c|}{min/avg/max in Gbit/s} \\
\hline
Tayga & 2.90 / 3.15 / 3.34 & 2.87 / 3.01 / 3.22 &
2.68 / 2.85 / 3.09 & 2.60 / 2.78 / 2.88 \\
\hline
Jool & 7.18 / 7.56 / 8.24 & 7.97 / 8.05 / 8.09 &
8.05 / 8.08 / 8.10 & 8.10 / 8.12 / 8.13 \\
\hline
P4 / NetFPGA & 8.51 / 8.53 / 8.55 & 9.28 / 9.28 / 9.29 & 9.29 / 9.29 /
9.29 & 9.28 / 9.28 / 9.29 \\
\hline
Parallel connections & 1 & 10 & 20 & 50 \\
\hline
\end{tabular}
\end{minipage}
\caption{IPv4 to IPv6 TCP NAT64 Benchmark}
\label{tab:benchmarkv4}
\end{center}
\end{table}

\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c | c |}
\hline
Implementation & \multicolumn{4}{|c|}{avg bandwidth in Gbit/s / avg loss /
adjusted bandwidth} \\
\hline
Tayga & 8.02 / 70\% / 2.43 & 9.39 / 79\% / 1.97 & 15.43 / 86\% / 2.11
& 19.27 / 91\% / 1.73 \\
\hline
Jool & 6.44 / 0\% / 6.41 & 6.37 / 2\% / 6.25 &
16.13 / 64\% / 5.75 & 20.83 / 71\% / 6.04 \\
\hline
P4 / NetFPGA & 8.28 / 0\% / 8.28 & 9.26 / 0\% / 9.26 &
16.15 / 0\% / 16.15 & 15.8 / 0\% / 15.8 \\
\hline
Parallel connections & 1 & 10 & 20 & 50 \\
\hline
\end{tabular}
\end{minipage}
\caption{IPv6 to IPv4 UDP NAT64 Benchmark}
\label{tab:benchmarkudpv6}
\end{center}
\end{table}

\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c | c |}
\hline
Implementation & \multicolumn{4}{|c|}{avg bandwidth in Gbit/s / avg loss /
adjusted bandwidth} \\
\hline
Tayga & 6.78 / 84\% / 1.06 & 9.58 / 90\% / 0.96 &
15.67 / 91\% / 1.41 & 20.77 / 95\% / 1.04 \\
\hline
Jool & 4.53 / 0\% / 4.53 & 4.49 / 0\% / 4.49 & 13.26 / 0\% / 13.26 &
22.57 / 0\% / 22.57\\
\hline
P4 / NetFPGA & 7.04 / 0\% / 7.04 & 9.58 / 0\% / 9.58 &
9.78 / 0\% / 9.78 & 14.37 / 0\% / 14.37\\
\hline
Parallel connections & 1 & 10 & 20 & 50 \\
\hline
\end{tabular}
\end{minipage}
\caption{IPv4 to IPv6 UDP NAT64 Benchmark}
\label{tab:benchmarkudpv4}
\end{center}
\end{table}
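
The adjusted bandwidth in the UDP tables is consistent with scaling the
average bandwidth by the share of packets that were not lost. As a
sketch, assuming this is how the column was derived (the symbols
$B_{\mathrm{avg}}$, $\ell$ and $B_{\mathrm{adj}}$ are introduced here
for illustration only):
\[
B_{\mathrm{adj}} = B_{\mathrm{avg}} \cdot (1 - \ell)
\]
For example, Jool with 50 parallel connections in the IPv6 to IPv4
direction gives $20.83 \cdot (1 - 0.71) \approx 6.04$ Gbit/s. Small
deviations in other cells are to be expected, because the loss
percentages in the tables are rounded.
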

UDP load generator hitting 100\% CPU at P20.
TCP confirmed.
Over bandwidth results

Feature comparison
\begin{itemize}
\item speed - sessions - EAMT
\item can act as host
\item LPM tables
\item ping
\item ping6 support
\item NDP
\item controller support
\end{itemize}

NetFPGA consistent

% ----------------------------------------------------------------------
\section{\label{Results:BMV2}BMV2 - FIXME: write better}
The software implementation of P4 has the most features, mainly
because it can compute checksums over the payload: acting
as a ``proper'' participant in NDP requires the host to calculate
checksums over the payload.

Table \ref{tab:p4bmv2features} lists the features of the BMV2
implementation.

\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandled packets to
controller & fully implemented\footnote{Source code: \texttt{actions\_egress.p4}}\\
\hline
Controller to Switch & Controller can set up table entries &
fully implemented\footnote{Source code: \texttt{controller.py}}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
fully implemented\footnote{Source code:
\texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ARP & Switch can answer ARP request (without controller) & fully
implemented\footnote{Source code: \texttt{actions\_arp.p4}}\\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
fully implemented\footnote{Source code:
\texttt{actions\_nat64\_session.p4}, \texttt{controller.py}} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / BMV2 feature list}
\label{tab:p4bmv2features}
\end{center}
\end{table}

Responds to ICMP, ICMP6,
NDP \cite{rfc4861} and
ARP.

Very easy to use.

Fully functional host.
Can compute checksums on its own.

With a focus on typical use cases of ICMP and ICMP6, the software implementation
supports translating echo request and echo reply messages, but does
not support all ICMP/ICMP6 translations that are defined in
RFC6145\cite{rfc6145}.

Stateful: no automatic removal

Session management was not benchmarked, as it is only a matter of creating
table entries.

Jool and tayga are supported by

% ----------------------------------------------------------------------
\section{\label{results:netpfga}NetFPGA - FIXME: writing}
The reduced feature set of the NetFPGA implementation is due to two
factors: the compile time of 2 to 6 hours per compile run and the
lack of payload checksumming.

overview - general translation - not advanced features
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:features}Features}
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandled packets to
controller & portable\footnote{While the NetFPGA P4 implementation
does not have the clone3() extern that the BMV2 implementation offers,
communication to the controller can easily be realised by using one of
the additional ports of the NetFPGA and connecting a physical network
card to it.}\\
\hline
Controller to Switch & Controller can set up table entries &
portable\footnote{The p4utils suite offers easy access to the
switch tables. While the P4-NetFPGA support repository also offers
python scripts to modify the switch tables, the code is less
sophisticated and more fragile.}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
portable\footnote{NetFPGA/P4 does not offer calculating the checksum
over the payload. However, delta checksumming can be used to create
the required checksum for replying.} \\
\hline
ARP & Switch can answer ARP request (without controller) &
portable\footnote{As ARP does not use checksums, integrating the
source code \texttt{actions\_arp.p4} into the NetFPGA code base is
enough to enable ARP support in the NetFPGA.} \\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
portable\footnote{ICMP/ICMP6 translations only require enabling the
ICMP/ICMP6 code in the NetFPGA code base.} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
portable\footnote{Same reasoning as ``Controller to switch''.} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
unsupported\footnote{To support creating payload checksums, either an
HDL module needs to be created or the generated
PX program needs to be modified.\cite{schottelius:_exter_p4_netpf}} \\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / NetFPGA feature list}
\label{tab:p4netpfgafeatures}
\end{center}
\end{table}
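
The difference between the ``Delta Checksum'' and ``Payload Checksum''
rows can be illustrated in a few lines of Python. The following is a
minimal sketch of an incremental checksum update in the style of
RFC 1624, not the P4 code of this thesis; all function names and sample
values are hypothetical. It updates an Internet checksum from the
removed and added header words only, without reading the payload.

```python
def add16(a, b):
    """One's complement addition of two 16 bit words (end-around carry)."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def full_checksum(words):
    """Internet checksum over all 16 bit words, payload included."""
    acc = 0
    for word in words:
        acc = add16(acc, word)
    return ~acc & 0xFFFF


def delta_checksum(old_checksum, removed_words, added_words):
    """Update a checksum from the changed words only (RFC 1624, eqn. 3).

    HC' = ~(~HC + ~m + m'), applied to every removed word m and every
    added word m'. The payload is never read.
    """
    acc = ~old_checksum & 0xFFFF
    for word in removed_words:
        acc = add16(acc, ~word & 0xFFFF)
    for word in added_words:
        acc = add16(acc, word)
    return ~acc & 0xFFFF


# Illustrative values only: address words that are replaced during
# translation, followed by payload words that stay untouched.
old_header = [0x2001, 0x0DB8, 0x0000, 0x0001]
new_header = [0xC000, 0x0201]
payload = [0x1234, 0xABCD, 0x0042]

before = full_checksum(old_header + payload)
after = delta_checksum(before, old_header, new_header)
```

In this sketch \texttt{after} equals the checksum obtained by
recomputing over the new header words and the payload, although
\texttt{delta\_checksum} never receives the payload.
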
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:stability}Stability}
Two different NetFPGA cards were used during the development of the
thesis. The first card had consistent ioctl errors (compare section
\ref{netpfgaioctlerror}) when writing table entries. The available
hardware tests (compare figures \ref{fig:hwtestnico} and
\ref{fig:hwtesthendrik}) showed failures in both cards; however, the
first card reported an additional ``10G\_Loopback'' failure. Because
table entries could not be set, no benchmarking was performed
on the first NetFPGA card.
\begin{figure}[h]
\includegraphics[scale=1.4]{hwtestnico}
\centering
\caption{Hardware Test NetFPGA card 1}
\label{fig:hwtestnico}
\end{figure}
\begin{figure}[h]
\includegraphics[scale=0.2]{hwtesthendrik}
\centering
\caption{Hardware Test NetFPGA card 2, \cite{hendrik:_p4_progr_fpga_semes_thesis_sa}}
\label{fig:hwtesthendrik}
\end{figure}
During development and benchmarking, the second NetFPGA card stopped
functioning properly multiple times. In each case the card would no
longer forward packets. Multiple reboots (3 were usually enough)
and reflashing the bitstream to the NetFPGA multiple times usually
restored the intended behaviour. However, due to these ``crashes'', it
was impossible to complete a full benchmark run that would last for
more than one hour.

Sometimes it was also necessary to reboot the host containing the
NetFPGA card 3 times to enable successful flashing.\footnote{Typical
output of the flashing process would be: ``fpga configuration failed. DONE PIN is not HIGH''}
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:performance}Performance}
As expected, the NetFPGA card performed at near line speed and offers
NAT64 translations at 9.28 Gbit/s. Single and multiple streams
performed almost identically and the results have been consistent through
multiple iterations of the benchmarks.
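
The value of 9.28 Gbit/s is close to what a 10 Gbit/s Ethernet link
can carry as TCP payload. As a rough plausibility check, under
assumptions that are not stated in this chapter (1500 byte MTU, TCP
over IPv6 with the 12 byte timestamp option, and the usual Ethernet
per-frame overhead of a 14 byte header, a 4 byte frame check sequence,
an 8 byte preamble and a 12 byte inter-frame gap):
\[
\frac{1500 - 40 - 20 - 12}{1500 + 14 + 4 + 8 + 12} \cdot 10\,\mathrm{Gbit/s}
= \frac{1428}{1538} \cdot 10\,\mathrm{Gbit/s}
\approx 9.28\,\mathrm{Gbit/s}
\]
Whether the benchmark traffic actually used these parameters has not
been verified here, so the calculation only shows that the measured
value is plausible for line rate.
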
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:usability}Usability}
To use the NetFPGA, Vivado and SDNET provided by Xilinx need to be
installed. However, a bug in the installer triggers an infinite loop
if a certain shared library\footnote{The required shared library
is libncurses5.} is missing on the target operating system. The
installation program still seems to be progressing, but never
finishes.

While the NetFPGA card supports P4, the toolchains and supporting
scripts are in an immature state. The compilation process consists of
at least 9 different steps, which are interdependent.\footnote{See
source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate
shell scripts and python scripts that in turn generate JSON
data.\footnote{One compilation step calls the script
``config\_writes.py''. This script failed with a syntax error, as it
contained incomplete python code. The scripts config\_writes.py
and config\_writes.sh are generated by gen\_config\_writes.py.
The output of the script gen\_config\_writes.py depends on the content
of config\_writes.txt. That file is generated by the simulation
``xsim''. The file ``SimpleSumeSwitch\_tb.sv'' contains code that is
responsible for writing config\_writes.txt and uses a function
named axi4\_lite\_master\_write\_request\_control for generating the
output. This in turn depends on the output of a script named
gen\_testdata.py.}

However, incorrect parsing generates syntactically incorrect
scripts or scripts that generate incorrect output. The toolchain
provided by the NetFPGA-P4 repository contains more than 80000 lines
of code. The supporting scripts for setting table entries require
setting the parameters for all possible actions, not only for the
selected action. Supplying only the required parameters results in a
crash of the supporting script.

The documentation for using the NetFPGA-P4 repository is scattered
across many places and does not contain a reference on how to use the
tools. The mapping of egress ports to their metadata fields is only
found in a python script that is used for generating test data.

The compile process can take up to 6 hours and, because the different
steps are interdependent, errors in an earlier stage were in our
experience detected hours after they happened. The resulting log
files of the compilation process can be up to 5 MB in size. Within
this log file various commands output references to other log files;
however, the referenced log files do not exist before or after the
compile process.

During the compile process various informational, warning and error
messages are printed. However, some informational messages indicate
critical errors, while on the other hand messages labelled as critical
errors or syntax errors often are not
critical.\footnote{For instance, ``CRITICAL WARNING: [BD 41-737] Cannot set the
parameter TRANSLATION\_MODE on /axi\_interconnect\_0. It is
read-only.'' is a non-critical warning.}
Contradictory
output is also generated.\footnote{While using version 2018.2, the following
message was printed: ``WARNING: command 'get\_user\_parameter' will be removed in the 2015.3
release, use 'get\_user\_parameters' instead''.}

Programs or scripts that are called during the compile process do not
necessarily exit with a non-zero status if they encounter a critical
error. Thus, finding the source of an error can be difficult, because
the compile process continues after critical errors have occurred. Not
only do programs that encounter critical errors exit ``successfully'';
python scripts that reach critical code paths also do not abort with
raise(), but
print an error message to stdout and exit without an error.
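
One way to make such silent failures visible is to wrap every compile
step in a small helper that checks both the exit status and the
output. The following Python sketch is an illustration under stated
assumptions, not part of the NetFPGA toolchain: the names
\texttt{run\_step}, \texttt{StepFailed} and \texttt{FATAL\_LINE} are
made up, and the pattern that decides which log lines are fatal is a
placeholder that would have to be tuned per tool.

```python
import re
import subprocess
import sys

# Hypothetical pattern: which log lines count as fatal has to be decided
# per tool. Lines starting with "CRITICAL WARNING" are deliberately not matched.
FATAL_LINE = re.compile(r"^ERROR\b.*$", re.MULTILINE)


class StepFailed(RuntimeError):
    """Raised when a build step fails, even if it exited with status 0."""


def run_step(command):
    """Run one build step; abort on a bad exit status or a fatal log line."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if result.returncode != 0:
        raise StepFailed("%r exited with status %d" % (command, result.returncode))
    match = FATAL_LINE.search(result.stdout)
    if match:
        raise StepFailed("%r reported: %s" % (command, match.group(0)))
    return result.stdout


# A step that prints an error but still exits "successfully".
quiet_failure = [sys.executable, "-c", "print('ERROR: table not written')"]
try:
    run_step(quiet_failure)
    detected = False
except StepFailed:
    detected = True
```

With such a wrapper the first failing step stops the run, instead of
the failure surfacing hours later in a dependent step.
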

The most frequently encountered critical compile error is
``Run 'impl\_1' has not been launched. Unable to open''. This error
indicates that something in the previous compile steps failed; its
cause can range from incorrectly generated test data to unsupported LPM tables.

The NetFPGA kernel module provides access to virtual Linux
devices (nf0...nf3). However, tcpdump does not see any packets that are
emitted from the switch. The only way to capture packets
that are emitted from the switch is to connect a physical cable to
the port and capture on the other side.

Jumbo frames\footnote{Frames with an MTU greater than 1500 bytes.} are
commonly used in 10 Gbit/s networks. According to
\cite{wikipedia:_jumbo}, even many gigabit network interface cards
support jumbo frames. However, according to emails on the private
NetFPGA mailing list, the NetFPGA only supports 1500 byte frames at
the moment and additional work is required to implement support for
bigger frames.

Our P4 source code is required to contain Xilinx
annotations\footnote{For instance ``@Xilinx\_MaxPacketRegion(1024)''} that define
the maximum packet size in bits. We observed two different errors in
the output packet if the incoming packet exceeds the specified size:
\begin{itemize}
\item The output packet is longer than the original packet.
\item The output packet is corrupted.
\end{itemize}

While most of the P4 language is supported on the NetFPGA, some key
techniques are missing or not supported:
\begin{itemize}
\item Analysing / accessing payload is not supported
\item Checksum computation over payload is not supported
\item Using LPM tables can lead to compilation errors
\item Depending on the match type, only certain table sizes are allowed
\end{itemize}
Renaming variables in the declaration of the parser or deparser leads
to compilation errors. Function syntax is not supported. For this
reason our implementation uses \texttt{\#define} statements instead of functions.

FIXME:

General result: limited NAT64 is working, however:
\begin{itemize}
\item No payload checksumming - requires controller
\item Hash function in progress
\item No NDP, no ARP - focused on key factors of NAT64 translation,
other features can be supported by controller
\item Needed to debug internal parsing errors
\item Debugging generated tcl code to debug impl1 error
\end{itemize}

% ----------------------------------------------------------------------
\section{\label{results:softwarenat64}Software NAT64 with Tayga and
Jool}
\begin{itemize}
\item Both CPU bound.
\item During the benchmark CPU bound, single thread
\item Tayga: single threaded
\item easy to use
\item Jool kernel module
\item 100\% CPU usage on 1 core for UDP
\item 0\% visible CPU usage for TCP, might be TCP offloading
\item Integration with iptables
\item Requires routing
\end{itemize}

% ----------------------------------------------------------------------
\section{\label{results:p4}P4}
All planned features could be realised with P4 and a controller.
The language has some limitations on where if/switch statements can be
used.\footnote{In general, if and switch statements in actions lead to
errors, but not all constellations are forbidden.}

For this thesis the parsing capabilities of P4 were adequate. However,
at the time of writing P4 cannot parse ICMP6 options, as the upper
level protocol does not specify the number of options that follow and
parsing of 64 bit blocks is required.
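
The layout of these options shows where the difficulty comes from.
According to RFC 4861, each NDP option starts with a one byte type and
a one byte length, where the length is given in units of 8 octets
(64 bits), and options follow one another until the packet ends. The
Python sketch below walks such an option list; the function name and
the sample bytes are made up for illustration and are not taken from
the thesis code. It shows that the number of options is only known
after the whole option area has been traversed.

```python
def parse_ndp_options(data):
    """Split the option area of an NDP message into (type, value) pairs."""
    options = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated option header")
        option_type = data[offset]
        length = data[offset + 1] * 8  # length field counts 8 octet blocks
        if length == 0:
            raise ValueError("option length 0 is invalid")
        if offset + length > len(data):
            raise ValueError("truncated option")
        # The value is everything after the two byte type/length header.
        options.append((option_type, data[offset + 2:offset + length]))
        offset += length
    return options


# Source link-layer address option (type 1) followed by an MTU option (type 5).
sample = bytes([1, 1, 0x02, 0x00, 0x00, 0x00, 0x00, 0x01,
                5, 1, 0x00, 0x00, 0x00, 0x00, 0x05, 0xDC])
parsed = parse_ndp_options(sample)
```
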

P4/BMV2 does not support multiple LPM keys in a table; however, it
supports multiple keys with ternary matching.
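
A longest prefix match entry can be rewritten as a ternary entry,
which is why ternary matching can stand in for several LPM keys. The
sketch below is an illustration in Python with hypothetical names, not
code from the controller of this thesis. It converts a prefix into a
value/mask pair and uses the prefix length as priority, under the
assumption that the target picks the matching ternary entry with the
highest priority.

```python
import ipaddress


def lpm_to_ternary(prefix):
    """Return (value, mask, priority) for an IPv4 or IPv6 prefix."""
    network = ipaddress.ip_network(prefix)
    value = int(network.network_address)
    mask = int(network.netmask)
    # A longer prefix gets a higher priority, which emulates LPM.
    return value, mask, network.prefixlen


def ternary_match(address, entries):
    """Pick the matching entry with the highest priority, or None."""
    key = int(ipaddress.ip_address(address))
    hits = [entry for entry in entries if (key & entry[1]) == entry[0]]
    return max(hits, key=lambda entry: entry[2]) if hits else None


# Entries of one address family only; mixing IPv4 and IPv6 integers
# in one list is not handled by this sketch.
entries = [lpm_to_ternary("10.0.0.0/8"), lpm_to_ternary("10.1.0.0/16")]
best = ternary_match("10.1.2.3", entries)
```

Here \texttt{best} is the entry derived from the /16 prefix, because
both entries match and the longer prefix carries the higher priority.
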

When developing P4 programs, the reason for incorrect behaviour was
most often found in checksum problems. If frame checksum errors were
displayed by tcpdump, usually the effective length of the packet was
incorrect.

FIXME: IPv6: NDP: not easy to parse, as unknown number of following fields

The tooling around P4 is still fragile: we encountered many bugs
during development\cite{schottelius:github1675}
or missing features (\cite{schottelius:github745},
\cite{theojepsen:_get}).

Hitting expression bug (FIXME: source)

\begin{enumerate}
\item Impossible to retrieve key from table: LPM: addr + mask $\rightarrow$ addr and
mask might be used in controller
\item Retrieving information from tables: no meta information, don't
know which table matched
\item Type definitions separate; code sharing (controller, switch)
\end{enumerate}

No switch in actions, no conditional execution in actions

Although not directly related to P4, supporting scripts are usually
written in python2. However, python2
handles unicode strings differently and thus effects like an IPv6
address ``changing'' happen (compare section \ref{appendix:p4:python2unicode}).
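
One way such a ``changing'' address can arise, offered here as an
assumption rather than as the cause documented in the appendix, is the
\texttt{ipaddress} module: it interprets a byte string of exactly 16
bytes as a packed IPv6 address. In python2 an ordinary string literal
is a byte string, so a textual prefix that happens to be 16 characters
long may be taken as raw bytes. The sketch below reproduces the effect
in Python 3 by passing bytes explicitly; the example prefix is made up.

```python
import ipaddress

text = "2001:db8:61::/64"  # happens to be exactly 16 characters long

# Passed as text, the string is parsed as an IPv6 prefix.
as_text = ipaddress.ip_network(text)

# Passed as 16 raw bytes, the same characters are taken as a packed
# address, so every character becomes one byte of the address.
as_bytes = ipaddress.ip_address(text.encode("ascii"))
```

In this sketch \texttt{as\_text} is the expected prefix, while
\texttt{as\_bytes} is an unrelated address built from the ASCII codes
of the characters.
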

P4os - reusable code

idiomatic problem: Security issue: not checking checksums before