Nico Schottelius 2019-08-15 15:33:08 +02:00
parent f5774e1b47
commit e1949d2ac3
9 changed files with 229 additions and 60 deletions


@ -176,6 +176,11 @@ idomatic problem: Security issue: not checking checksums before
\section{\label{conclusion:netpfga}NetFPGA - all HERE}
personal note here
stopped working
reboot not enough
does not respond to any packet
tested various kernels for table debugging
MTU limitations: 1500 according to a private mail from Salvator Galea


@ -12,7 +12,7 @@ objective of this thesis was to demonstrate the high speed
capabilities of NAT64 in hardware, no benchmarks were performed on the
P4 software implementation.
% ----------------------------------------------------------------------
\section{\label{results:p4}NAT64 Overview}
\section{\label{results:p4}NAT64 Overview - FIXME: verify numbers}
We successfully implemented P4 code to realise
NAT64\cite{schottelius:thesisrepo}. It contains parsers
for all related protocols (ipv6, ipv4, udp, tcp, icmp, icmp6, ndp,
@ -24,7 +24,6 @@ Due to limitations in the P4 environment of the
NetFPGA (compare section \ref{conclusion:netpfga}), the BMV2 implementation
is more feature-rich. Tables \ref{tab:benchmarkv6} and
\ref{tab:benchmarkv4} summarise the achieved bandwidths of the NAT64 solutions.
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c |}
@ -40,13 +39,11 @@ P4 / NetPFGA & 9.28 & 9.29 & 9.29\\
\hline
\end{tabular}
\end{minipage}
\caption{NAT64 Benchmark (IPv6 initiating), all results in Gbit/sec (\%loss)}
\caption{NAT64 Benchmark (client: IPv6, server: IPv4), all results in Gbit/sec (\%loss)}
\label{tab:benchmarkv6}
\end{center}
\end{table}
During the benchmarks the client
During the benchmarks the client -- CPU usage
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c | c |}
@ -62,24 +59,91 @@ P4 / NetPFGA & 8.43 & 9.29 & 9.29\\
\hline
\end{tabular}
\end{minipage}
\caption{NAT64 Benchmark (IPv4 initiating), all results in Gbit/sec (\%loss)}
\caption{NAT64 Benchmark (client: IPv4, server: IPv6), all results in Gbit/sec (\%loss)}
\label{tab:benchmarkv4}
\end{center}
\end{table}
Feature comparison
speed - sessions - eamt
can act as host
lpm tables
ping
ping6 support
ndp
controller support
% ----------------------------------------------------------------------
\section{\label{Results:BMV2}BMV2}
The software implementation of P4 features most features, which is
mostly due to available externs that can checksum the payload: Acting
The software implementation of P4 offers the most features, mostly
because it can checksum the payload: acting
as a ``proper'' participant in NDP requires the host to calculate
checksums over the payload.
List of features:
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandled packets to
controller & fully implemented\footnote{Source code: \texttt{actions\_egress.p4}}\\
\hline
Controller to Switch & Controller can set up table entries &
fully implemented\footnote{Source code: \texttt{controller.py}}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
fully implemented\footnote{Source code:
\texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ARP & Switch can answer ARP request (without controller) & fully
implemented\footnote{Source code: \texttt{actions\_arp.p4}}\\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
fully implemented\footnote{Source code: \texttt{actions\_icmp6\_ndp\_icmp.p4}} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
fully implemented\footnote{Source code:
\texttt{actions\_nat64\_session.p4}, \texttt{controller.py}} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / BMV2 feature list}
\label{tab:p4bmv2features}
\end{center}
\end{table}
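The difference between the ``Delta Checksum'' and ``Payload Checksum'' rows can be illustrated with a minimal Python sketch. It is not taken from the thesis sources: the function names and the example addresses are hypothetical, and it assumes the incremental update described in RFC 1624 (equation 3). It only models TCP/UDP-style translation, where the address part of the pseudo header is the only checksummed input that changes.

```python
import ipaddress


def fold(total: int) -> int:
    """Fold carries back into the low 16 bits (one's complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def words(data: bytes):
    """Yield the 16-bit big-endian words of data, zero-padded to even length."""
    if len(data) % 2:
        data += b"\x00"
    for i in range(0, len(data), 2):
        yield (data[i] << 8) | data[i + 1]


def payload_checksum(data: bytes) -> int:
    """Internet checksum over all bytes; this has to read the payload."""
    return ~fold(sum(words(data))) & 0xFFFF


def delta_checksum(old_checksum: int, old_fields: bytes, new_fields: bytes) -> int:
    """Update a checksum after header fields changed (RFC 1624, eq. 3).

    Only the old checksum and the changed fields are read, never the payload.
    """
    total = ~old_checksum & 0xFFFF
    total += sum(~w & 0xFFFF for w in words(old_fields))  # remove old fields
    total += sum(words(new_fields))                       # add new fields
    return ~fold(total) & 0xFFFF


# Hypothetical example addresses (documentation prefixes), not from the thesis.
v6_pair = (ipaddress.ip_address("2001:db8::1").packed
           + ipaddress.ip_address("64:ff9b::c000:201").packed)
v4_pair = (ipaddress.ip_address("192.0.2.2").packed
           + ipaddress.ip_address("192.0.2.1").packed)
payload = b"example payload bytes"

old = payload_checksum(v6_pair + payload)      # checksum as sent by the IPv6 host
new = delta_checksum(old, v6_pair, v4_pair)    # translated without the payload
```

In this sketch \texttt{delta\_checksum} yields the same value as recomputing \texttt{payload\_checksum} over the translated addresses plus payload, which is why a target without payload access can still translate packets that already carry a valid checksum.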
Responds to icmp, icmp6
ndp \cite{rfc4861}
arp
test framework openvswitch
Fully functional host
Can compute checksums on its own.
@ -91,21 +155,123 @@ RFC6145\cite{rfc6145}.
Stateful : no automatic removal
% ----------------------------------------------------------------------
\section{\label{results:tayga}Tayga}
cpu bound, single thread
% ----------------------------------------------------------------------
\section{\label{results:jool}Jool}
Session management not benchmarked, as it is only a matter of creating
table entries.
Jool and tayga are supported by
% ----------------------------------------------------------------------
\section{\label{Results:NetPFGA}NetFPGA}
\subsection{\label{results:netpfga:checksum}Checksum computation}
\subsection{\label{results:netpfga:general}to be named}
The reduced feature set of the NetFPGA implementation is due to two
factors: the compile time of 2 to 6 hours per compile run, and the
lack of payload checksumming.
\begin{table}[htbp]
\begin{center}\begin{minipage}{\textwidth}
\begin{tabular}{| c | c | c |}
\hline
\textbf{Feature} & \textbf{Description} & \textbf{Status} \\
\hline
Switch to controller & Switch forwards unhandled packets to
controller & portable\footnote{While the NetFPGA P4 implementation
does not have the clone3() extern that the BMV2 implementation offers,
communication with the controller can easily be realised by connecting
a physical network card to one of the additional ports of the NetFPGA.}\\
\hline
Controller to Switch & Controller can set up table entries &
portable\footnote{The p4utils suite offers easy access to the
switch tables. While the P4-NetFPGA support repository also offers
Python scripts to modify the switch tables, the code is less
sophisticated and more fragile.}\\
\hline
NDP & Switch responds to ICMP6 neighbor & \\
& solicitation request (without controller) &
portable\footnote{NetFPGA/P4 does not offer calculating the checksum
over the payload. However, delta checksumming can be used to create
the required checksum for replying.} \\
\hline
ARP & Switch can answer ARP request (without controller) &
portable\footnote{As ARP does not use checksums, integrating the
source code \texttt{actions\_arp.p4} into the netpfga code base is
enough to enable ARP support in the NetFPGA.} \\
\hline
ICMP6 & Switch responds to ICMP6 echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
ICMP & Switch responds to ICMP echo request (without controller) &
portable\footnote{Same reasoning as NDP.} \\
\hline
NAT64: TCP & Switch translates TCP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: UDP & Switch translates UDP with checksumming & \\
& from/to IPv6 to/from IPv4 &
fully implemented\footnote{Source code: \texttt{actions\_nat64\_generic\_icmp.p4}} \\
\hline
NAT64: & Switch translates echo request/reply & \\
ICMP/ICMP6 & from/to ICMP6 to/from ICMP with checksumming &
portable\footnote{ICMP/ICMP6 translations only require enabling the
icmp/icmp6 code in the netpfga code base.} \\
\hline
NAT64: Sessions & Switch and controller create 1:n sessions/mappings &
portable\footnote{Same reasoning as ``Controller to switch''.} \\
\hline
Delta Checksum & Switch can calculate checksum without payload
inspection &
fully implemented\footnote{Source code: \texttt{actions\_delta\_checksum.p4}}\\
\hline
Payload Checksum & Switch can calculate checksum with payload inspection &
unsupported\footnote{To support creating payload checksums, either an
HDL module needs to be created or the generated PX program needs to be
modified\cite{schottelius:_exter_p4_netpf}.} \\
\hline
\end{tabular}
\end{minipage}
\caption{P4 / NetFPGA feature list}
\label{tab:p4netpfgafeatures}
\end{center}
\end{table}
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:stability}Stability}
Two different NetFPGA cards were used during the development of the
thesis. The first card had consistent ioctl errors (compare section
\ref{netpfgaioctlerror}) when writing table entries. The available
hardware tests (compare figures \ref{fig:hwtestnico} and
\ref{fig:hwtesthendrik}) showed failures in both cards; however, the
first card reported an additional ``10G\_Loopback'' failure. Because
table entries could not be set, no benchmarking was performed
on the first NetFPGA card.
\begin{figure}[h]
\includegraphics[scale=1.4]{hwtestnico}
\centering
\caption{Hardware Test NetFPGA card 1}
\label{fig:hwtestnico}
\end{figure}
\begin{figure}[h]
\includegraphics[scale=0.2]{hwtesthendrik}
\centering
\caption{Hardware Test NetFPGA card 2 \cite{hendrik:_p4_progr_fpga_semes_thesis_sa}}
\label{fig:hwtesthendrik}
\end{figure}
During the development and benchmarking, the second NetFPGA card stopped
functioning properly multiple times. In these cases the card would no
longer forward packets. Multiple reboots (three were usually enough)
and repeated reflashing of the bitstream to the NetFPGA usually
restored the intended behaviour.
% ----------------------------------------------------------------------
\subsection{\label{results:netpfga:performance}Performance}
As expected, the NetFPGA card performed at near line speed and offered
NAT64 translations at 9.28 Gbit/s.
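For context, an upper bound on the TCP goodput of a 10 Gbit/s Ethernet link can be estimated from the per-frame overhead. The following Python sketch is an illustration only: it assumes an MTU of 1500 bytes, standard Ethernet framing overhead (preamble, header, frame check sequence and inter-frame gap), and TCP with the 12-byte timestamp option. Whether the benchmarks were run with exactly these settings is an assumption, and the function name is hypothetical.

```python
# Per-frame Ethernet overhead in bytes:
# preamble + start-of-frame delimiter (8), header (14),
# frame check sequence (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12


def tcp_goodput_gbits(link_gbits: float, mtu: int, ip_header: int,
                      tcp_header: int = 20, tcp_options: int = 12) -> float:
    """Estimate the maximum TCP payload rate for back-to-back full-size frames.

    ip_header is 40 for IPv6 and 20 for IPv4 (without options);
    tcp_options defaults to 12 bytes, assuming TCP timestamps are enabled.
    """
    payload = mtu - ip_header - tcp_header - tcp_options
    bytes_on_wire = mtu + ETHERNET_OVERHEAD
    return link_gbits * payload / bytes_on_wire


ipv6_bound = tcp_goodput_gbits(10, 1500, ip_header=40)
ipv4_bound = tcp_goodput_gbits(10, 1500, ip_header=20)
```

Under these assumptions the IPv6 side of the translation is bounded at roughly 9.28 Gbit/s and the IPv4 side at roughly 9.41 Gbit/s, so a measured rate in this range is consistent with operation close to line speed.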
Checksum computation
Trace files
\begin{verbatim}
create mode 100644 pcap/tcp-udp-delta-2019-07-17-1555-h1.pcap
@ -195,37 +361,13 @@ General result: limited NAT64 is working, however
No NDP, no ARP - focused on key factors of NAT64 translation,
other features can be supported by controller
\section{\label{results:softwarenat64}NAT64 in Software}
Tayga, Jool
% ----------------------------------------------------------------------
\section{\label{results:tayga}Tayga}
During the benchmark cpu bound, single thread
tayga: Single threaded
fork:
\begin{verbatim}
| What? | Description | State in P4 | References |
|---------------------+------------------------------------------+-------------------+---------------------------------------------------------------------------------|
| Jool EAMT | Mapping with tables, multiple entries | Supported | https://www.jool.mx/en/eamt.html, https://www.jool.mx/en/run-eam.html, RFC 7757 |
| Jool SIIT | Mapping IPv6 to range of IPv4, one entry | Supported by EAMT | |
\end{verbatim}
\section{\label{results:features}Feature comparison}
speed - sessions - eamt
can act as host
lpm tables
ping
ping6 support
ndp
controller support
\section{todo - FIXME: remove}
\begin{verbatim}
***** A rather detailed drawing there
***** Longest section!
\end{verbatim}
% ----------------------------------------------------------------------
\section{\label{results:jool}Jool}
kernel module
high cpu usage for udp connections
Integration with iptables

Binary file not shown.


@ -1736,6 +1736,7 @@ The HW testing tool for the switch_calc design
testing>
\end{verbatim}
\label{netpfgaioctlerror}
\begin{verbatim}
>> table_cam_add_entry lookup_table send_to_port1 ff:ff:ff:ff:ff:ff =>
CAM_Init_ValidateContext() - done

Binary file not shown.


BIN
doc/images/hwtestnico.png Normal file

Binary file not shown.



@ -454,7 +454,8 @@
| | | |
| 2019-08-21 | hand in thesis | |
| | | |
* Thesis implementation
* DONE Thesis implementation
CLOSED: [2019-08-15 Thu 13:46]
** DONE Setup test VM for P4: 2a0a:e5c0:2:12:400:f0ff:fea9:c3e3
** DONE Get feature list of jool
** DONE Get feature list of tayga
@ -8939,7 +8940,8 @@ Proof:
create mode 100644 pcap/netfpga-10.2-fromv6tov4-2019-08-04-1943-enp2s0f1.pcap
#+END_CENTER
*** 2019-08-04: udp benchmark: very slow
*** DONE 2019-08-04: udp benchmark: very slow
CLOSED: [2019-08-15 Thu 13:46]
#+BEGIN_CENTER
nico@ESPRIMO-P956:~$ iperf3 -p 2345 -6 -B 2001:db8:42::42 -s
-----------------------------------------------------------
@ -9086,6 +9088,8 @@ iperf Done.
nico@ESPRIMO-P956:~$
#+END_CENTER
*** TODO 2019-08-15: netpfga "crash"
- 4th run got stuck
** The NetPFGA saga
Problems encountered:
- The logfile for a compile run is 10k+ lines
@ -9229,7 +9233,7 @@ nico@nsg-System:~/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/project
** graphviz:
- https://graphviz.gitlab.io/_pages/doc/info/shapes.html#polygon
** TODO References / Follow up
*** Board
*** TODO Board
*** DONE RFC 791 IPv4 https://tools.ietf.org/html/rfc791
CLOSED: [2019-08-13 Tue 12:31]
*** DONE RFC 792 ICMP https://tools.ietf.org/html/rfc792
@ -9285,10 +9289,14 @@ nico@nsg-System:~/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/project
CLOSED: [2019-08-13 Tue 12:49]
*** DONE Solicited node multicast address https://en.wikipedia.org/wiki/Solicited-node_multicast_address
CLOSED: [2019-08-13 Tue 12:52]
*** Scapy / IPv6: https://www.idsv6.de/Downloads/IPv6PacketCreationWithScapy.pdf
*** V1 model: https://github.com/p4lang/p4c/blob/master/p4include/v1model.p4
*** Cisco NAT64 https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipaddr_nat/configuration/xe-3s/nat-xe-3s-book/iadnat-stateful-nat64.pdf
*** Wiki_mac: https://en.wikipedia.org/wiki/MAC_address
*** DONE Scapy / IPv6: https://www.idsv6.de/Downloads/IPv6PacketCreationWithScapy.pdf
CLOSED: [2019-08-15 Thu 13:45]
*** DONE V1 model: https://github.com/p4lang/p4c/blob/master/p4include/v1model.p4
CLOSED: [2019-08-15 Thu 13:45]
*** DONE Cisco NAT64 https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipaddr_nat/configuration/xe-3s/nat-xe-3s-book/iadnat-stateful-nat64.pdf
CLOSED: [2019-08-15 Thu 13:45]
*** DONE Wiki_mac: https://en.wikipedia.org/wiki/MAC_address
CLOSED: [2019-08-15 Thu 13:45]
** TODO Writing Thesis
*** DONE Introduction: 1-2 pages
CLOSED: [2019-08-13 Tue 12:52]
@ -9308,8 +9316,11 @@ nico@nsg-System:~/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/project
*** TODO Send for review to Tobias/Thilo
*** DONE Create graph of with and without router
CLOSED: [2019-08-13 Tue 12:54]
*** TODO Add comparison with other solutions (Results?)
*** TODO Show / create graph of a bigger network
*** DONE Add comparison with other solutions (Results?)
CLOSED: [2019-08-15 Thu 13:45]
*** DONE Show / create graph of a bigger network
CLOSED: [2019-08-15 Thu 13:45]
*** TODO Update benchmarks
*** TODO Rough time table / effort
| | | Status |
| Abstract | 1 day | okayish |


@ -125,3 +125,13 @@
author = {Nico Schottelius},
title = {High speed NAT64 in P4 (git repository)},
howpublished = {\url{https://gitlab.ethz.ch/nsg/student-projects/ma-2019-19_high_speed_nat64_with_p4}}}
@Misc{schottelius:_exter_p4_netpf,
author = {Nico Schottelius},
title = {Extern for checksum'ing payload (P4-NetFPGA-public)},
howpublished = {\url{https://github.com/NetFPGA/P4-NetFPGA-public/issues/13}}}
@Misc{hendrik:_p4_progr_fpga_semes_thesis_sa,
author = {Hendrik Züllig, Supervisor: Prof. Dr. Laurent Vanbever; Tutor: Tobias Bühler},
title = {P4-Programming on an FPGA, Semester Thesis SA-2019-02},
howpublished = {\url{https://gitlab.ethz.ch/nsg/student-projects/sa-2019-02_p4_programming_sume_netfpga/blob/master/SA-2019-02.pdf}}}

Binary file not shown.
