Adjust results section

This commit is contained in:
Nico Schottelius 2019-08-22 14:13:58 +02:00
parent 33963bc681
commit a0cf251917
4 changed files with 81 additions and 68 deletions


@ -377,8 +377,8 @@ this section we describe the IPv6 and IPv4 configurations as a basis
for the discussion. for the discussion.
All IPv6 addresses are from the documentation block All IPv6 addresses are from the documentation block
\textit{2001:DB8::/32}~\cite{rfc3849}. In particular the following sub \textit{2001:DB8::/32}~\cite{rfc3849}. In particular we use the sub
networks and IPv6 addresses are used: networks and IPv6 addresses shown in table \ref{tab:ipv6address}.
\begin{table}[htbp] \begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth} \begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c |} \begin{tabular}{| c | c |}
@ -407,7 +407,7 @@ networks and IPv6 addresses are used:
\end{table} \end{table}
We use private IPv4 addresses as specified by RFC1918~\cite{rfc1918} We use private IPv4 addresses as specified by RFC1918~\cite{rfc1918}
from the 10.0.0.0/8 range as follows: from the 10.0.0.0/8 range as shown in table \ref{tab:ipv4address}.
\begin{table}[htbp] \begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth} \begin{center}\begin{minipage}{\textwidth}


@ -161,8 +161,8 @@ using delta checksums, the compile time of 2 to 6 hours contributed to
a significantly slower development cycle compared to BMV2. a significantly slower development cycle compared to BMV2.
Lastly, the focus of this thesis is to implement high speed NAT64 on Lastly, the focus of this thesis is to implement high speed NAT64 on
P4, which only requires a subset of the features that we realised on P4, which only requires a subset of the features that we realised on
BMV2. Table \ref{tab:p4netpfgafeatures} summarises the implemented BMV2. In table \ref{tab:p4netpfgafeatures} we summarise the implemented
features and reasons about their implementation status. features and reason about their portability afterwards:
\begin{table}[htbp] \begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth} \begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |} \begin{tabular}{| c | c | c |}
@ -170,34 +170,23 @@ features and reasons about their implementation status.
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\ \textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline \hline
Switch to controller & Switch forwards unhandled packets to Switch to controller & Switch forwards unhandled packets to
controller & portable\footnote{While the NetFPGA P4 implementation controller & portable\\
does not have the clone3() extern that the BMV2 implementation offers,
communication to the controller can easily be realised by using one of
the additional ports of the NetFPGA and connect a physical network
card to it.}\\
\hline \hline
Controller to Switch & Controller can setup table entries & Controller to Switch & Controller can setup table entries &
portable\footnote{The p4utils suite offers an easy access to the portable\\
switch tables. While the P4-NetFPGA support repository also offers
python scripts to modify the switch tables, the code is less
sophisticated and more fragile.}\\
\hline \hline
NDP & Switch responds to ICMP6 neighbor & \\ NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) & & solicitation request (without controller) &
portable\footnote{NetFPGA/P4 does not offer calculating the checksum portable\\
over the payload. However delta checksumming can be used to create
the required checksum for replying.} \\
\hline \hline
ARP & Switch can answer ARP request (without controller) & ARP & Switch can answer ARP request (without controller) &
portable\footnote{As ARP does not use checksums, integrating the portable \\
source code \texttt{actions\_arp.p4} into the netpfga code base is
enough to enable ARP support in the NetPFGA.} \\
\hline \hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) & ICMP6 & Switch responds to ICMP6 echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\ portable\\
\hline \hline
ICMP & Switch responds to ICMP echo request (without controller) & ICMP & Switch responds to ICMP echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\ portable\\
\hline \hline
NAT64: TCP & Switch translates TCP with checksumming & \\ NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 & & from/to IPv6 to/from IPv4 &
@ -209,20 +198,17 @@ fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4
\hline \hline
NAT64: & Switch translates echo request/reply & \\ NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming & ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
portable\footnote{ICMP/ICMP6 translations only require enabling the portable\\
icmp/icmp6 code in the netpfga code base.} \\
\hline \hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings & NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
portable\footnote{Same reasoning as ``Controller to switch''.} \\ portable\\
\hline \hline
Delta Checksum & Switch can calculate checksum without payload Delta Checksum & Switch can calculate checksum without payload
inspection & inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\ fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline \hline
Payload Checksum & Switch can calculate checksum with payload inspection & Payload Checksum & Switch can calculate checksum with payload inspection &
unsupported\footnote{To support creating payload checksums, either an unsupported \\
HDL module needs to be created or to modify the generated
the PX program.~\cite{schottelius:_exter_p4_netpf}} \\
\hline \hline
\end{tabular} \end{tabular}
\end{minipage} \end{minipage}
@ -230,6 +216,44 @@ unsupported\footnote{To support creating payload checksums, either an
\label{tab:p4netpfgafeatures} \label{tab:p4netpfgafeatures}
\end{center} \end{center}
\end{table} \end{table}
The switch to controller communication differs,
because the P4/NetFPGA implementation does not have the clone3() extern
that the BMV2 implementation offers. However, communication to the
controller can easily be realised by using one of
the additional ports of the NetFPGA and connecting a physical network
card to it.
Communicating from the controller towards the switch also differs, as
the p4utils suite supporting BMV2 offers easy access to the switch
tables. While the P4-NetFPGA support repository also offers Python
scripts to modify the switch tables, the code is less sophisticated
and more fragile. While porting the existing code is possible, it
might be advantageous to rewrite those P4-NetFPGA scripts first.
The NAT64 session support is based on the P4 switch communicating with
the controller and vice versa. As we consider both features to be
portable, we also consider the NAT64 session feature to be portable.
P4/NetFPGA does not support calculating the checksum over the payload,
so the checksum for a reply to a neighbor solicitation packet cannot
be computed over the payload directly. However,
as the payload stays the same as in the request, our delta based
checksum approach can be reused in this situation. By the same
reasoning we consider our ICMP6 and ICMP code, which also requires
payload based checksums, to be portable.
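The delta based idea can be illustrated with a minimal Python sketch. It is not taken from \texttt{actions\_delta\_checksum.p4}; it assumes the incremental update rule of RFC 1624 ($HC' = \sim(\sim HC + \sim m + m')$) on 16-bit words, and the function names and example words are hypothetical. The point it shows is that only the changed header words are needed, never the payload.

```python
def ones_complement_sum(words):
    """Fold a sequence of 16-bit words into a 16-bit one's complement sum."""
    total = 0
    for word in words:
        total += word
        # Add the carry back into the lower 16 bits (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_checksum(words):
    """Internet checksum over all words; this needs access to the payload."""
    return ~ones_complement_sum(words) & 0xFFFF


def delta_checksum(old_checksum, old_words, new_words):
    """Update a checksum from the changed words only (assumed RFC 1624 rule)."""
    total = ~old_checksum & 0xFFFF
    for word in old_words:
        # Remove an old word by adding its one's complement.
        total = ones_complement_sum([total, ~word & 0xFFFF])
    for word in new_words:
        total = ones_complement_sum([total, word])
    return ~total & 0xFFFF


# Hypothetical example data: header words change, payload words do not.
old_header = [0x2001, 0x0DB8, 0x0000, 0x0001]
new_header = [0x0A00, 0x0001]
payload = [0x1234, 0xABCD, 0x0042]

old_checksum = full_checksum(old_header + payload)
updated = delta_checksum(old_checksum, old_header, new_header)
```

In this sketch \texttt{updated} equals the checksum that a full recomputation over the new header and the unchanged payload yields, which is why payload access is not required for the reply.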
ARP replies do not contain a checksum over the payload, thus the
existing ARP code can be directly integrated into P4/NetFPGA without
any changes.
While the P4/NetFPGA target currently does not support accessing the
payload or creating checksums over it, there are two possibilities to
extend the platform: either by creating an HDL module or by
modifying the generated PX
program~\cite{schottelius:_exter_p4_netpf}.
Due to the existing code complexity of the P4/NetFPGA platform, using
the HDL module based approach is likely to be more sustainable.
% ok % ok
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\subsection{\label{results:netpfga:stability}Stability} \subsection{\label{results:netpfga:stability}Stability}
@ -241,13 +265,13 @@ hardware tests (compare figures \ref{fig:hwtestnico} and
first card reported an additional ``10G\_Loopback'' failure. Due to first card reported an additional ``10G\_Loopback'' failure. Due to
the inability of setting table entries, no benchmarking was performed the inability of setting table entries, no benchmarking was performed
on the first NetFPGA card. on the first NetFPGA card.
\begin{figure}[h] \begin{figure}[htbp]
\includegraphics[scale=1.4]{hwtestnico} \includegraphics[scale=1.4]{hwtestnico}
\centering \centering
\caption{Hardware Test NetFPGA Card 1} \caption{Hardware Test NetFPGA Card 1}
\label{fig:hwtestnico} \label{fig:hwtestnico}
\end{figure} \end{figure}
\begin{figure}[h] \begin{figure}[htbp]
\includegraphics[scale=0.2]{hwtesthendrik} \includegraphics[scale=0.2]{hwtesthendrik}
\centering \centering
\caption{Hardware Test NetFPGA Card 2~\cite{hendrik:_p4_progr_fpga_semes_thesis_sa}} \caption{Hardware Test NetFPGA Card 2~\cite{hendrik:_p4_progr_fpga_semes_thesis_sa}}
@ -399,7 +423,7 @@ In this section we give an overview of the benchmark design
and summarise the benchmarking results. and summarise the benchmarking results.
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\subsection{\label{results:benchmark:design}Benchmark Design} \subsection{\label{results:benchmark:design}Benchmark Design}
\begin{figure}[h] \begin{figure}[htbp]
\includegraphics[scale=0.6]{softwarenat64design} \includegraphics[scale=0.6]{softwarenat64design}
\centering \centering
\caption{Benchmark Design for NAT64 in Software Implementations} \caption{Benchmark Design for NAT64 in Software Implementations}
@ -429,28 +453,30 @@ warm up phase.\footnote{iperf -O 10 parameter, see section \ref{design:tests}.}
% ok % ok
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\subsection{\label{results:benchmark:summary}Benchmark Summary} \subsection{\label{results:benchmark:summary}Benchmark Summary}
Overall \textbf{Tayga} has shown to be the slowest translator with an achieved Overall \textbf{Tayga} has shown to be the slowest translator with an
bandwidth of \textbf{about 3 Gbit/s}, followed by \textbf{Jool} that translates at achieved bandwidth of \textbf{about 3 Gbit/s}, followed by
about \textbf{8 Gbit/s}. \textbf{Our solution} is the fastest with an almost line rate \textbf{Jool} that translates at about \textbf{8 Gbit/s}. \textbf{Our
translation speed of about \textbf{9 Gbit/s}. solution} is the fastest with an almost line rate translation speed
of about \textbf{9 Gbit/s} (compare tables \ref{tab:benchmarkv6} and
\ref{tab:benchmarkv4}).
The TCP based benchmarks show realistic numbers, while iperf reports The TCP based benchmarks show realistic numbers, while iperf reports
above line rate speeds (up to 22 Gbit/s on a 10 Gbit/s link) above line rate speeds (up to 22 Gbit/s on a 10 Gbit/s link) for UDP
for UDP based benchmarks. For this reason we based benchmarks. For this reason we have summarised the UDP based
have summarised the UDP based benchmarks with their average loss benchmarks with their average loss instead of listing the bandwidth
instead of listing the bandwidth details. The ``adjusted bandwidth'' details. The ``adjusted bandwidth'' in the UDP benchmarks incorporates
in the UDP benchmarks incorporates the packet loss (compare tables the packet loss (compare tables \ref{tab:benchmarkv6v4udp} and
\ref{tab:benchmarkv6v4udp} and \ref{tab:benchmarkv6v4udp}). \ref{tab:benchmarkv4v6udp}).
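How the ``adjusted bandwidth'' could be derived is sketched below in Python. The formula is an assumption, not stated in the text: the reported bandwidth is scaled by the share of packets that were not lost. It is consistent with rows such as ``8.28 / 0\% / 8.28'' in the UDP tables; the function name and the second input are hypothetical.

```python
def adjusted_bandwidth(reported_gbit_s, loss_percent):
    """Scale the reported bandwidth by the share of packets that arrived.

    Assumed formula: adjusted = reported * (1 - loss / 100).
    """
    if not 0.0 <= loss_percent <= 100.0:
        raise ValueError("loss_percent must be between 0 and 100")
    return reported_gbit_s * (1.0 - loss_percent / 100.0)


# Without loss the adjusted value equals the reported one.
no_loss = adjusted_bandwidth(8.28, 0.0)

# Hypothetical input: an above line rate report with half the packets lost.
half_lost = adjusted_bandwidth(20.0, 50.0)
```

Under this assumption, an above line rate figure reported by iperf is reduced to the rate that the receiver actually observed.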
Both software solutions showed significant loss of packets in the UDP Both software solutions showed significant loss of packets in the UDP
based benchmarks (Tayga: up to 91\%, Jool up to 71\%), while the based benchmarks (Tayga: up to 91\%, Jool up to 71\%), while the
P4/NetFPGA showed a maximum of 0.01\% packet loss. Packet loss is only P4/NetFPGA showed a maximum of 0.01\% packet loss. Packet loss is only
recorded by iperf for UDP based benchmarks, as TCP packets are confirmed and recorded by iperf for UDP based benchmarks, as TCP packets are
resent if necessary. confirmed and resent if necessary.
Tayga has the highest variation of results, which might be due to Tayga has the highest variation of results, which might be due to
being fully CPU bound, even in the non-parallel benchmark. Jool has less being fully CPU bound, even in the non-parallel benchmark. Jool has
variation and in general the P4/NetFPGA solution behaves almost less variation and in general the P4/NetFPGA solution behaves almost
identical in different benchmark runs. identical in different benchmark runs.
The CPU load for TCP based benchmarks with Jool was almost negligible, The CPU load for TCP based benchmarks with Jool was almost negligible,
@ -460,10 +486,10 @@ utilised. When the translation for P4/NetFPGA happens within the
NetFPGA card, there was no CPU utilisation visible on the NAT64 host. NetFPGA card, there was no CPU utilisation visible on the NAT64 host.
We see lower bandwidth for translating IPv4 to IPv6 in all solutions. We see lower bandwidth for translating IPv4 to IPv6 in all solutions.
We suspect that this might be due to slightly increasing packet sizes that We suspect that this might be due to slightly increasing packet sizes
occur during this direction of translation. Not only does this vary that occur during this direction of translation. Not only does this
the IPv4 versus IPv6 bandwidth, but it might also cause fragmentation vary the IPv4 versus IPv6 bandwidth, but it might also cause
that slows down. fragmentation that slows down.
During the benchmarks with up to 10 parallel connections, no During the benchmarks with up to 10 parallel connections, no
significant CPU load was registered on the load generator. However significant CPU load was registered on the load generator. However
@ -484,11 +510,8 @@ Overall the performance of Tayga, a Linux user space program, is as
expected. We were surprised about the good performance of Jool, which, expected. We were surprised about the good performance of Jool, which,
while slower than the P4/NetFPGA solution, is almost on par with our solution. while slower than the P4/NetFPGA solution, is almost on par with our solution.
% ---------------------------------------------------------------------- % ----------------------------------------------------------------------
\newpage
\subsection{\label{results:benchmark:v6v4tcp}IPv6 to IPv4 TCP
Benchmark Results}
\begin{table}[htbp] \begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth} \begin{center}
\begin{tabular}{| c | c | c | c | c |} \begin{tabular}{| c | c | c | c | c |}
\hline \hline
Implementation & \multicolumn{4}{|c|}{min/avg/max in Gbit/s} \\ Implementation & \multicolumn{4}{|c|}{min/avg/max in Gbit/s} \\
@ -505,16 +528,14 @@ P4 / NetPFGA & 9.28 / 9.28 / 9.29 & 9.28 / 9.28 / 9.29 & 9.28 / 9.28
Parallel connections & 1 & 10 & 20 & 50 \\ Parallel connections & 1 & 10 & 20 & 50 \\
\hline \hline
\end{tabular} \end{tabular}
\end{minipage}
\caption{IPv6 to IPv4 TCP NAT64 Benchmark} \caption{IPv6 to IPv4 TCP NAT64 Benchmark}
\label{tab:benchmarkv6} \label{tab:benchmarkv6}
\end{center} \end{center}
\end{table} \end{table}
%ok %ok
% --------------------------------------------------------------------- % ---------------------------------------------------------------------
\subsection{\label{results:benchmark:v4v6tcp}IPv4 to IPv6 TCP Benchmark Results}
\begin{table}[htbp] \begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth} \begin{center}
\begin{tabular}{| c | c | c | c | c |} \begin{tabular}{| c | c | c | c | c |}
\hline \hline
Implementation & \multicolumn{4}{|c|}{min/avg/max in Gbit/s} \\ Implementation & \multicolumn{4}{|c|}{min/avg/max in Gbit/s} \\
@ -531,18 +552,13 @@ P4 / NetPFGA & 8.51 / 8.53 / 8.55 & 9.28 / 9.28 / 9.29 & 9.29 / 9.29 /
Parallel connections & 1 & 10 & 20 & 50 \\ Parallel connections & 1 & 10 & 20 & 50 \\
\hline \hline
\end{tabular} \end{tabular}
\end{minipage}
\caption{IPv4 to IPv6 TCP NAT64 Benchmark} \caption{IPv4 to IPv6 TCP NAT64 Benchmark}
\label{tab:benchmarkv4} \label{tab:benchmarkv4}
\end{center} \end{center}
\end{table} \end{table}
% --------------------------------------------------------------------- % ---------------------------------------------------------------------
\newpage
\subsection{\label{results:benchmark:v6v4udp}IPv6 to IPv4 UDP
Benchmark Results}
\begin{table}[htbp] \begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth} \begin{center}
\begin{tabular}{| c | c | c | c | c |} \begin{tabular}{| c | c | c | c | c |}
\hline \hline
Implementation & \multicolumn{4}{|c|}{avg bandwidth in gbit/s / avg loss / Implementation & \multicolumn{4}{|c|}{avg bandwidth in gbit/s / avg loss /
@ -560,16 +576,14 @@ P4 / NetPFGA & 8.28 / 0\% / 8.28 & 9.26 / 0\% / 9.26 &
Parallel connections & 1 & 10 & 20 & 50 \\ Parallel connections & 1 & 10 & 20 & 50 \\
\hline \hline
\end{tabular} \end{tabular}
\end{minipage}
\caption{IPv6 to IPv4 UDP NAT64 Benchmark} \caption{IPv6 to IPv4 UDP NAT64 Benchmark}
\label{tab:benchmarkv6v4udp} \label{tab:benchmarkv6v4udp}
\end{center} \end{center}
\end{table} \end{table}
%ok %ok
% --------------------------------------------------------------------- % ---------------------------------------------------------------------
\subsection{\label{results:benchmark:v4v6udp}IPv4 to IPv6 UDP Benchmark Results}
\begin{table}[htbp] \begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth} \begin{center}
\begin{tabular}{| c | c | c | c | c |} \begin{tabular}{| c | c | c | c | c |}
\hline \hline
Implementation & \multicolumn{4}{|c|}{avg bandwidth in gbit/s / avg loss / Implementation & \multicolumn{4}{|c|}{avg bandwidth in gbit/s / avg loss /
@ -587,9 +601,8 @@ P4 / NetPFGA & 7.04 / 0\% / 7.04 & 9.58 / 0\% / 9.58 &
Parallel connections & 1 & 10 & 20 & 50 \\ Parallel connections & 1 & 10 & 20 & 50 \\
\hline \hline
\end{tabular} \end{tabular}
\end{minipage}
\caption{IPv4 to IPv6 UDP NAT64 Benchmark} \caption{IPv4 to IPv6 UDP NAT64 Benchmark}
\label{tab:benchmarkv6v4udp} \label{tab:benchmarkv4v6udp}
\end{center} \end{center}
\end{table} \end{table}
%ok %ok

Binary file not shown.


@ -16,7 +16,7 @@ digraph G {
tableentry [ label="Create Table Entry" ]; tableentry [ label="Create Table Entry" ];
tablematch [ label="Table Match" ]; tablematch [ label="Table Match" ];
reinject [ label="Reinject packet" ]; reinject [ label="Reinject Packet" ];
controller [ label="Controller Reads Packet" ] controller [ label="Controller Reads Packet" ]
deparser [ label="Deparser"]; deparser [ label="Deparser"];