diff --git a/doc/Background.tex b/doc/Background.tex index 7a96dbf..15ab2d0 100644 --- a/doc/Background.tex +++ b/doc/Background.tex @@ -31,14 +31,14 @@ are discussed in section \ref{results:p4}. As opposed to general purpose programming languages, P4 lacks some features. Most notably loops, floating point operations and modulo operations. -However within its constraints, P4 can guarantee +However, within its constraints, P4 can guarantee operation at line speed, which general purpose programming languages cannot guarantee and also fail to achieve in reality (see section \ref{results:softwarenat64} for details). % ok % ---------------------------------------------------------------------- \section{\label{background:ip}IPv6, IPv4 and Ethernet} -The first IPv6 RFC was published in 1998~\cite{rfc2460}. Both IPv4 and +The first IPv6 RFC was published in 1998~\cite{rfc2460}. Both IPv6 and IPv4 operate on layer 3 of the OSI model. In this thesis we only consider transmission via Ethernet, which operates at layer 2. Inside the Ethernet frame a field named ``type'' specifies @@ -78,7 +78,7 @@ Figure \ref{fig:arpndp} illustrates a typical address resolution process. \caption{ARP and NDP} \label{fig:arpndp} \end{figure} -The major difference between ARP and NDP in relation to P4 are +The major differences between ARP and NDP in relation to P4 are \begin{itemize} \item ARP is a separate protocol on the same layer as IPv6 and IPv4, \item NDP operates below ICMP6 which operates below IPv6, @@ -87,7 +87,7 @@ The major difference between ARP and NDP in relation to P4 are (specifically: ICMP6 link layer address option). \end{itemize} ARP is required to be a separate protocol, because IPv4 hosts don't -know how to communicate with each other yet, because they don't have a +know how to communicate with each other yet, as they don't have a way to communicate to the target IPv4 address (``The chicken and the egg problem''). 
NDP on the other hand already works within IPv6, as every IPv6 host is @@ -228,10 +228,10 @@ port can be reused again. \caption{Stateful NAT64} \label{fig:statefulnat64} \end{figure} -The selection of mapped ports is usually based on the availability on +The selection of mapped ports is usually based on the availability of the IPv4 side and not related to the original port. To support -stateful NAT64, the translator needs to store the mapping in a table and -purge entries regularly. +stateful NAT64, the translator needs to store the mapping +in a table and purge entries regularly. Stateful NAT64 usually uses information found in protocols at layer 4 like TCP~\cite{rfc793} or UDP~\cite{rfc768}. However, it can also @@ -255,7 +255,7 @@ implementations. While protocol dependent translation has the highest amount of information to choose from for translation, complex parsers or even cryptographic methods are required for it. That reduces the -opportunities of protocol dependent translations to run on devices +opportunities for protocol dependent translations to run on devices with less sophisticated devices. % ---------------------------------------------------------------------- \subsection{\label{background:transition:prefixnat}Mapping IPv4 @@ -340,7 +340,7 @@ The DNS64 DNS server will query the authoritative DNS server for an AAAA record. However as the host \textit{ipv4onlyhost.example.com} is only reachable by IPv4, it also only has an A entry. After receiving the answer that there is no AAAA record, the DNS64 server will ask for an -A record and gets an answer that the name +A record and will get an answer that the name \textit{ipv4onlyhost.example.com} resolves to the IPv4 address \textit{192.0.2.0}. The DNS64 server then embeds the IPv4 address in the configured IPv6 prefix (\textit{64:ff9b::/96} in this case) and @@ -388,10 +388,10 @@ defined in RFC768 and RFC793 and are shown in \ref{fig:ipv4pseudoheader}. 
\label{fig:ipv6pseudoheader} \end{figure} When translating, the checksum fields in the higher protocols need to be -adjusted. The checksums for TCP and UDP is calculated not only over the pseudo +adjusted. The checksums for TCP and UDP are calculated not only over the pseudo headers, but also contain the payload of the packet. This is -important because some targets (like the NetFPGA) do not allow to -access the payload (see section \ref{design:netpfga}). +important because some targets (like the NetFPGA) do not allow +accessing the payload (see section \ref{design:netpfga}). \begin{figure}[h] \begin{verbatim} 0 7 8 15 16 23 24 31 @@ -419,12 +419,6 @@ Its calculation can be summarised as follows: % ---------------------------------------------------------------------- \section{\label{background:networkdesign}Network Designs} -%% \begin{figure}[h] -%% \includegraphics[scale=0.5]{v4only} -%% \centering -%% \caption{IPv4 only network} -%% \label{fig:v4onlynet} -%% \end{figure} In relation to IPv6 and IPv4, there are in general three different network designs possible: The oldest form are IPv4 only networks. @@ -433,12 +427,6 @@ hosts that are either not configured for IPv6 or are even technically incapable of enabling the IPv6 protocol. These nodes are connected to an IPv4 router that is connected to the Internet. That router might be capable of translating IPv4 to IPv6 and vice versa. -%% \begin{figure}[h] -%% \includegraphics[scale=0.5]{dualstack} -%% \centering -%% \caption{Dualstack network} -%% \label{fig:dualstacknet} -%% \end{figure} With the introduction of IPv6, hosts can have a separate IP stack active and in that configuration hosts are called ``dualstack hosts''. @@ -446,8 +434,8 @@ Dualstack hosts are capable of reaching both IPv6 and IPv4 hosts directly without the need of any translation mechanism. The last possible network design is based on IPv6 only hosts. 
-While it is technically easy to disable IPv4, it -seems that completely removing the IPv4 stack in current operating +While it is technically easy to disable IPv4, +completely removing the IPv4 stack in current operating systems is not an easy task~\cite{ungleich:_ipv4}. %% \begin{figure}[h] %% \includegraphics[scale=0.5]{v6only} @@ -458,7 +446,7 @@ systems is not an easy task~\cite{ungleich:_ipv4}. While the three network designs look similar, there are significant differences in operating them and limitations that are not easy to circumvent. In the following sections, we describe the limitations and -reason how a translation mechanism like our NAT64 implementation +explain how a translation mechanism like our NAT64 implementation should be deployed. % ---------------------------------------------------------------------- \subsection{\label{background:networkdesign:ipv4}IPv4 only network limitations} @@ -478,14 +466,14 @@ IPv6 addresses independent of the protocol is not possible. maintenance} While dualstack hosts can address any host in either IPv6 or IPv4 networks, the deployment of dualstack hosts comes with a major -disadvantage: all network configuration double. The required routing +disadvantage: all network configurations double. The required routing tables double, the firewall rules roughly double\footnote{The rule sets even for identical policies in IPv6 and IPv4 networks are not identical, but similar. For this reason we state that roughly double the amount of firewall rules are required for the same policy to be applied.} and the number of network supporting systems, (like DHCPv4, DHCPv6, router advertisement daemons, etc.) also roughly double. -Additionally services that run on either IPv6 or IPv4 might need to be +Additionally, services that run on either IPv6 or IPv4 might need to be configured to run in dualstack mode as well and not every software might be capable of that. 
So while there is the instant benefit of not requiring any transition mechanism @@ -494,7 +482,7 @@ operational cost) of running dual stack networks can be significant. % ---------------------------------------------------------------------- \subsection{\label{background:networkdesign:v6only}IPv6 only networks} IPv6 only networks are in our opinion the best choice for long term -deployments. The reasons for this are as follows: First of all hosts +deployments. Our reasons for this are the following: First of all hosts eventually will need to support IPv6 and secondly IPv6 hosts can address the whole 32 bit IPv4 Internet mapped in a single /96 IPv6 network. IPv6 only networks also allow the operators @@ -510,7 +498,7 @@ to focus on one IP stack. The NetFPGA~\cite{zilberman:_netfp_sume} is an FPGA card featuring four 10 Gbit/s SFP+ ports. It includes the Xilinx Virtex-7 690T FPGA on board, 27 MB of storage, -allowing to save table data, and 8 GB of DDR3 RAM. The NetFPGA can be +to save table data, and 8 GB of DDR3 RAM. The NetFPGA can be run inside a host (connected by PCI-E, gen 3) or as a standalone card. diff --git a/doc/Conclusion.tex b/doc/Conclusion.tex index 4c5819c..e9ce6b2 100644 --- a/doc/Conclusion.tex +++ b/doc/Conclusion.tex @@ -15,10 +15,10 @@ Our in-network solution allows novel translations without involving external routers, without involving external routers.\footnote{Compare figures \ref{fig:v6v4standard} and \ref{fig:v6v4mixed}.} -We expect this to supporting migration to IPv6 only networks +We expect this to support migration to IPv6 only networks. % P4 -P4 has been proven for us as a suitable programming language for +P4 has been proven to us as a suitable programming language for network equipment with great potential. However in the current state the tooling and frameworks are still immature and need significant work to become usable for solving day-to-day challenges or supporting @@ -26,7 +26,7 @@ large scale projects. 
Even with the current state drawbacks, P4 is a very convincing language that has wide range of applications due to its protocol independence and easy to understand architecture. The availability of protocol independent programmable network -equipment opens up many possibilities for in network +equipment opens up many possibilities in network programming. While this thesis focused on NAT64, the accompanying technology DNS64~\cite{rfc6147} could also be implemented in P4, thus completing the translation mechanism. @@ -54,7 +54,7 @@ depth translation like ICMP/ICMP6 specifics. % P4 The P4 language has shown maturity, but the usability and ease of use -of the provided toolchains can be significantly improved. Additionally +of the provided toolchains can be significantly improved. Additionally, we envision a stronger tie between the different tools in the P4 environment, like a collection of libraries and modules that could form something on the line of a ``P4OS''. This operating system could @@ -67,7 +67,7 @@ The NetFPGA, from the hardware point of view, is a very interesting hardware platform. Reducing the difficulties we experienced with the surrounding toolchain and making development experience more consistent has the potential to not only -make NetFPGA, but also the who set of P4 hardware more interesting for +make NetFPGA, but also the whole set of P4 hardware more interesting for developers. %% PMTU diff --git a/doc/Design.tex b/doc/Design.tex index 53f350e..ce3b8a2 100644 --- a/doc/Design.tex +++ b/doc/Design.tex @@ -4,37 +4,37 @@ In this chapter we describe the architecture of our solution and our design choices. We first introduce the general design of NAT64 in the P4 architecture. Afterwards we describe the design differences -for the BMV2 and NetFPGA P4 architectures. Afterwards we discuss the -design of stateless and stateful NAT64 in relation to P4 an well as +of the BMV2 and NetFPGA P4 architectures. 
Afterwards we discuss the +design of stateless and stateful NAT64 in relation to P4 as well as two existing software NAT64 solutions. -Lastly we discuss how we verify NAT64 functionality present -the network configurations that we use. +Lastly we discuss how we verify NAT64 functionality and +present the network configurations that we use. % ---------------------------------------------------------------------- \section{\label{design:nat64}P4/NAT64} \begin{figure}[h] - \includegraphics[scale=0.4]{switchdesign} + \includegraphics[scale=0.5]{switchdesign} \centering \caption{P4 Switch Architecture} \label{fig:switchdesign} \end{figure} In section \ref{background:transition} we discussed different translation mechanisms for IPv6 and IPv4. In this thesis we focus on -the translation mechanisms stateless and stateful NAT64. While higher +the translation mechanisms ``stateless'' and ``stateful'' NAT64. While higher layer protocol dependent translations are more flexible, this topic has already been addressed in \cite{nico18:_implem_layer_ipv4_ipv6_rever_proxy} and the focus in this thesis is on the practicability of high speed NAT64 with P4. The high level design can be seen in figure \ref{fig:switchdesign}: a P4 capable switch is running our code to provide NAT64 -functionality. A P4 switch cannot manage its tables on it own and +functionality. A P4 switch cannot manage its tables on its own and needs support for this from a controller. The controller also has the role to handle unknown packets and can modify the runtime configuration of the switch. This is especially useful in the case of stateful NAT64. If only static table entries are required, they can usually be added at the start of a P4 switch -and the controller can also be omitted. However stateful +and the controller can also be omitted. However, stateful NAT64 requires the use of a controller to create session entries in the switch tables. 
The P4 switch can use any protocol to communicate with the controller, as @@ -51,7 +51,8 @@ Software NAT64 solutions typically require routing to be applied to transport the packet to the NAT64 translator as shown in figure \ref{fig:v6v4standard}. -Our design differs here: while routing could be used like described +Our design differs here: +while routing could be used like described above, NAT64 with P4 does not require any routing to be setup. Figure \ref{fig:v6v4mixed} shows the network design that we realise using P4. This design has multiple advantages: first it reduces the number @@ -64,15 +65,16 @@ segment. \caption{In-network NAT64 translation} \label{fig:v6v4mixed} \end{figure} -% ---------------------------------------------------------------------- -\section{\label{design:nat64:indepth}P4/NAT64} P4 switches in general look very similar to regular switches, however -support executing logic while the packet passes through the switch. - -When a packet enters the switch, - -\digraph[scale=0.5]{abc}{rankdir=LR; a->b->c;} - +support executing logic while the packet passes through the +switch. Figure \ref{fig:p4switch} illustrates how our solution is +implemented and translates packets. +\begin{figure}[h] + \includegraphics[scale=0.5]{p4switch} + \centering + \caption{Our P4 Switch Architecture} + \label{fig:p4switch} +\end{figure} % ---------------------------------------------------------------------- \section{\label{design:bmv2}P4/BMV2} \begin{figure}[h] @@ -200,10 +202,10 @@ and then calculates the difference including a possible carry bit and adjusts the higher level protocol by this difference (\texttt{delta\_tcp\_from\_v6\_to\_v4()}). Figure \ref{fig:checksumbydiff} shows an -excerpt of the code used for adjust the checksum when translating TCP +excerpt of the code used for adjusting the checksum when translating TCP from IPv6 to IPv4. 
It is notable that -not the full headers are used, but only a ``pseudo header'' (compare figures +not the full headers are used, but only a ``pseudo header'' is (compare figures \ref{fig:ipv6pseudoheader} and \ref{fig:ipv4pseudoheader}). % ok @@ -243,7 +245,7 @@ entries are configured differently depending on the implementation: Limitations in the P4/NetFPGA environment require to use table entries. Jool supports individual entries as a special case of LPM, with a network mask matching only one IP address. Tayga -support LPM for translation from IPv6 to IPv4, but requires individual +supports LPM to translate from IPv6 to IPv4, but requires individual entries for translating from IPv4 to IPv6. Our P4/BMV2 offers the highest degree of flexibility, as it provides support for individual entries based on table entries and LPM table entries. @@ -261,7 +263,7 @@ to solve this problem: that don't have a table entry, sets the table entry in the P4 switch and inserts the original packet afterwards back into the switch. \item With tayga we rely on the Linux kernel NAT44 capabilities -\item Jool implements its own stateful mechanism based on a port +\item Jool implements its own stateful mechanism based on port ranges \end{itemize} All methods though operate in a very similar fashion: A ``controller'' diff --git a/doc/Results.tex b/doc/Results.tex index 2d2b03d..680ed8b 100644 --- a/doc/Results.tex +++ b/doc/Results.tex @@ -8,7 +8,7 @@ We distinguish the software implementation of P4 (BMV2) and the hardware implementation (NetFPGA) due to significant differences in deployment and development. We present benchmarks for the existing software solutions as well as for our hardware implementation. As the -objective of this thesis was to demonstrate the high speed +objective of this thesis is to demonstrate the high speed capabilities of NAT64 in hardware, no benchmarks were performed on the P4 software implementation. % ok @@ -17,7 +17,7 @@ P4 software implementation. 
We successfully implemented P4 code to realise NAT64~\cite{schottelius:thesisrepo}. It contains parsers for all related protocols (IPv6, IPv4, UDP, TCP, ICMP, ICMP6, NDP, -ARP), supports EAMT as defined by RFC7757 ~\cite{rfc7757} and is +ARP), supports EAMT as defined by RFC7757~\cite{rfc7757}, and is feature equivalent to the two compared software solutions tayga~\cite{lutchansky:_tayga_simpl_nat64_linux} and jool~\cite{mexico:_jool_open_sourc_siit_nat64_linux}. @@ -41,7 +41,7 @@ superset of LPM matching. When developing P4 programs, the reason for incorrect behaviour we have seen were checksum problems. This is in retrospective expected, -as the main task our implementation does is modify headers on which +as the main task of our implementation is modifying headers on which the checksums depend. In all cases we have seen Ethernet frame checksum errors, the effective length of the packet was incorrect. @@ -54,7 +54,7 @@ the matching key from table or the name of the action called. Thus if different table entries call the same action, it is impossible within the action, or if forwarded to the controller, within the controller to distinguish on which match the action was -triggered. This problem is very consistent within P4, as not even the +triggered. This problem is very consistent within P4: not even the matching table name can be retrieved. While these information can be added manually as additional fields in the table entries, we would expect a language to support reading and forwarding this kind of meta @@ -68,7 +68,7 @@ that this duplication is a likely source of errors in bigger software projects. The supporting scripts in the P4 toolchain are usually written in -python2. However python2 ``is +python2. However, python2 ``is legacy''~\cite{various:_shoul_i_python_python}. During development errors with unicode string handling in python2 caused changes to IPv6 addresses. 
@@ -136,7 +136,7 @@ fully implemented\footnote{Source code: \texttt{checksum\_bmv2.p4}}\\ \end{center} \end{table} The switch responds to ICMP echo requests, ICMP6 echo requests, -answers NDP and ARP requests. Overall P4/BMV is very easy to use +answers NDP and ARP requests. Overall P4/BMV is very easy to use, even without a controller a fully functional network host can be implemented. @@ -159,7 +159,7 @@ support of the NetFPGA P4 compiler to inspect payload and to compute checksums over payload. While this can (partially) be compensated using delta checksums, the compile time of 2 to 6 hours contributed to a significant slower development cycle compared to BMV2. -Lastly, the focus of this thesis was to implement high speed NAT64 on +Lastly, the focus of this thesis is to implement high speed NAT64 on P4, which only requires a subset of the features that we realised on BMV2. Table \ref{tab:p4netpfgafeatures} summarises the implemented features and reasons about their implementation status. @@ -233,7 +233,7 @@ unsupported\footnote{To support creating payload checksums, either an % ok % ---------------------------------------------------------------------- \subsection{\label{results:netpfga:stability}Stability} -Two different NetPFGA cards were used during the development of the +Two different NetPFGA cards were used during the development of this thesis. The first card had consistent ioctl errors (compare section \ref{appendix:netfpgalogs:compilelogs}) when writing table entries. The available hardware tests (compare figures \ref{fig:hwtestnico} and @@ -258,7 +258,7 @@ function properly multiple times. In theses cases the card would not forward packets anymore. Multiple reboots (up to 3) and multiple times reflashing the bitstream to the NetFPGA usually restored the intended behaviour. However due to this ``crashes'', it -was impossible for us run a benchmark for more than one hour. +was impossible for us to run a benchmark for more than one hour. 
Similarly, sometimes flashing the bitstream to the NetFPGA would fail. It was required to reboot the host containing the NetFPGA card up to 3 times to enable successful flashing.\footnote{Typical @@ -284,12 +284,12 @@ To use the NetFPGA, the tools Vivado and SDNET provided by Xilinx need to be installed. However a bug in the installer triggers an infinite loop, if a certain shared library\footnote{The required shared library is libncurses5.} is missing on the target operating system. The -installation program seems still to be progressing, however does never -finish. +installation program seems to be still progressing, however never +finishes. While the NetFPGA card supports P4, the toolchains and supporting -scripts are in a immature state. The compilation process consists of -at least 9 different steps, which are interdependent\footnote{See +scripts are in an immature state. The compilation process consists of +at least 9 different steps, which are interdependent.\footnote{See source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate shell scripts and python scripts that in turn generate JSON data.\footnote{One compilation step calls the script @@ -358,7 +358,7 @@ the port and capturing on the other side. Jumbo frames\footnote{Frames with an MTU greater than 1500 bytes.} are commonly used in 10 Gbit/s networks. According to -\ref{wikipedia:_jumbo}, even many gigabit network interface card +\cite{wikipedia:_jumbo}, even many gigabit network interface card support jumbo frames. However according to emails on the private NetPFGA mailing list, the NetFPGA only supports 1500 byte frames at the moment and additional work is required to implement support for @@ -367,9 +367,9 @@ bigger frames. Our P4 source code required contains Xilinx annotations\footnote{F.i. ``@Xilinx\_MaxPacketRegion(1024)''} that define the maximum packet size in bits. 
We observed two different errors on -the output packet, if the incoming packets exceeds the specified size: +the output packet, if the incoming packets exceed the specified size: \begin{itemize} -\item The output packet is longer then the original packet. +\item The output packet is longer than the original packet. \item The output packet is corrupted. \end{itemize} @@ -413,7 +413,7 @@ X520 cards. Figure \ref{fig:softwarenat64design} shows the network setup. When testing the NetPFGA/P4 performance, the X520 cards in the NAT64 translator are disconnected and instead the NetPFGA ports are -connected, as show in figure \ref{fig:netpfgadesign}. The load +connected, as shown in figure \ref{fig:netpfgadesign}. The load generator is equipped with a quad core CPU (Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz), enabled with hyperthreading and 16 GB RAM. The NAT64 translator is also equipped with a quard core CPU (Intel(R) Core(TM) diff --git a/doc/Thesis.pdf b/doc/Thesis.pdf index 9076850..e641291 100644 Binary files a/doc/Thesis.pdf and b/doc/Thesis.pdf differ diff --git a/doc/graphviz/p4switch.dot b/doc/graphviz/p4switch.dot new file mode 100644 index 0000000..d63a792 --- /dev/null +++ b/doc/graphviz/p4switch.dot @@ -0,0 +1,21 @@ +digraph G { + rankdir="TB"; + + v4host [ shape="box" label="IPv4 Host" rank="min" ]; + v6host [ shape="box" label="IPv6 Host" ]; + + parser [ label="Parser"]; + deparser [ label="Deparser"]; + translation [ label="Translation"]; + v4packet [ label="IPv4 Packet"]; + v6packet [ label="IPv6 Packet"]; + + subgraph cluster_nat64 { + label="P4 Switch"; + parser->v4packet->translation->v6packet->deparser; + } + + deparser->v6host; + v4host->parser; + +} diff --git a/doc/refs/refs.bib b/doc/refs/refs.bib index f9dbc24..bb793a2 100644 --- a/doc/refs/refs.bib +++ b/doc/refs/refs.bib @@ -162,7 +162,6 @@ title = {P4-Programming on an FPGA, Semester Thesis SA-2019-02}, howpublished = 
{\url{https://gitlab.ethz.ch/nsg/student-projects/sa-2019-02_p4_programming_sume_netfpga/blob/master/SA-2019-02.pdf}}} - @Misc{wikipedia:_jumbo, author = {Wikipedia}, title = {Jumbo frame},