\chapter{\label{design}Design}
Description of the theory/software/hardware that you designed.
%** Design.tex: How was the problem attacked, what was the design
% the architecture
In this chapter we describe the architecture of our solution.

% ----------------------------------------------------------------------
\section{\label{Design:General}General - FIXME}
The high-level design can be seen in figure \ref{fig:switchdesign}: a
P4-capable switch runs our code to provide NAT64
functionality. The P4 switch cannot manage its tables on its own and
needs support from a controller for this. If only static table entries
are required, the controller can be omitted. However, stateful
NAT64 requires a controller to create session entries in the
switch tables.
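
As a rough illustration of what creating a session entry involves, the
following Python sketch maps each IPv6 flow to a source port on one
shared IPv4 address. It is a minimal sketch under assumptions and not
the controller of this thesis: the class \texttt{SessionTable}, the
address 10.0.6.1 and the port range are hypothetical, and installing
the entry in the switch tables is left out.
\begin{verbatim}
import ipaddress


class SessionTable:
    """Hypothetical session table: (IPv6 source, port) -> IPv4 port."""

    def __init__(self, ipv4_addr="10.0.6.1",
                 first_port=1024, last_port=65535):
        # Assumption: all sessions share this one IPv4 address.
        self.ipv4_addr = ipaddress.IPv4Address(ipv4_addr)
        self.free_ports = iter(range(first_port, last_port + 1))
        self.sessions = {}

    def lookup_or_create(self, ipv6_src, src_port):
        """Return (IPv4 address, port), allocating a port on first use."""
        key = (ipaddress.IPv6Address(ipv6_src), src_port)
        if key not in self.sessions:
            try:
                port = next(self.free_ports)
            except StopIteration:
                raise RuntimeError("no free ports left") from None
            self.sessions[key] = port
        return self.ipv4_addr, self.sessions[key]


table = SessionTable()
print(table.lookup_or_create("2001:db8:6::1", 40000))
\end{verbatim}
A second packet of the same flow gets the same mapping back, so the
controller only has to act on the first packet of a session.
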
\begin{figure}[h]
\includegraphics[scale=0.5]{switchdesign}
\centering
\caption{General Design}
\label{fig:switchdesign}
\end{figure}
The P4 switch can use any protocol to communicate with the controller, as
the connection to the controller is implemented as a separate Ethernet
port. The design allows our solution to be used as a standard NAT64
translation method or as an in-network NAT64 translation (compare
figures \ref{fig:v6v4innetwork} and \ref{fig:v6v4standard}). The
controller is implemented in Python; the NAT64 solution is implemented
in P4.

Describe network layouts
\begin{verbatim}
- IPv6 subnet 2001:db8::/32
- IPv6 hosts are in 2001:db8:6::/64
- IPv6 default router (::/0) is 2001:db8:6::42/64
- IPv4 mapped Internet "NAT64 prefix" 2001:db8:4444::/96 (should
  go into a table)
- IPv4 hosts are in 10.0.4.0/24
- IPv6 in IPv4 mapped hosts are in 10.0.6.0/24
- IPv4 default router = 10.0.0.42
\end{verbatim}

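
To make the role of the NAT64 prefix in the layout above concrete, the
following Python sketch embeds an IPv4 address in the low 32 bits of the
/96 prefix and extracts it again, in the style of RFC 6052. The helper
names are hypothetical and only /96 prefixes are handled; it is an
illustration, not the translation code of this thesis.
\begin{verbatim}
import ipaddress

# NAT64 prefix taken from the network layout above.
NAT64_PREFIX = ipaddress.IPv6Network("2001:db8:4444::/96")


def ipv4_to_nat64(ipv4, prefix=NAT64_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def nat64_to_ipv4(ipv6, prefix=NAT64_PREFIX):
    """Extract the embedded IPv4 address from a mapped IPv6 address."""
    v6 = ipaddress.IPv6Address(ipv6)
    if v6 not in prefix:
        raise ValueError("address is outside the NAT64 prefix")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


print(ipv4_to_nat64("10.0.4.1"))   # 2001:db8:4444::a00:401
print(nat64_to_ipv4("2001:db8:4444::a00:401"))   # 10.0.4.1
\end{verbatim}
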
Describe testing methods
\begin{verbatim}
def test_v4_udp_to_v6(self):
    print('mx h3 "echo V4-OK | socat - UDP:10.1.1.1:2342"')
    print('mx h1 "echo V6-OK | socat - UDP-LISTEN:2342"')

    return

p4@ubuntu:~$ mx h1 "echo V6-OK | socat - UDP6-LISTEN:2342"
p4@ubuntu:~/master-thesis/bin$ mx h3 "echo V4-OK | socat - UDP:10.1.1.1:2342"

while true; do
    mx h3 "echo V4-OK | socat - TCP-LISTEN:2343"
    sleep 2
done

while true; do
    mx h1 "echo V6-OK | socat - TCP6:[2001:db8:1::a00:1]:2343"
    sleep 2
done

mx h1 "echo V6-OK | socat - TCP6:[2001:db8:1::a00:1]:2343"
\end{verbatim}
% ----------------------------------------------------------------------
\section{\label{Design:BMV2}BMV2}
The development for this thesis took place on a software-emulated switch
that is implemented using Open vSwitch \cite{openvswitch}
and the behavioral model
\cite{_implem_your_switc_target_with_bmv2}. The development closely
followed the general design shown in section
\ref{Design:General}. Within the software emulation, checksums can be
computed with two different methods:
\begin{itemize}
\item Recalculating the checksum by inspecting headers and payload
\item Calculating the difference between the translated headers
\end{itemize}
The BMV2 model is rather sophisticated and provides many standard
features, including checksumming over the payload. This allows the BMV2
model to operate as a full-featured host, including advanced features
like responding to ICMP6 Neighbor Discovery requests \cite{rfc4861}
that include payload checksums.
Typical code for computing such a checksum can be found in figure
\ref{fig:checksum}.
\begin{figure}[h]
\begin{verbatim}
/* checksumming for icmp6_na_ns_option */
update_checksum_with_payload(meta.chk_icmp6_na_ns == 1,
    {
        hdr.ipv6.src_addr,  /* 128 */
        hdr.ipv6.dst_addr,  /* 128 */
        meta.cast_length,   /* 32 */
        24w0,               /* 24 0's */
        PROTO_ICMP6,        /* 8 */
        hdr.icmp6.type,     /* 8 */
        hdr.icmp6.code,     /* 8 */

        hdr.icmp6_na_ns.router,
        hdr.icmp6_na_ns.solicitated,
        hdr.icmp6_na_ns.override,
        hdr.icmp6_na_ns.reserved,
        hdr.icmp6_na_ns.target_addr,

        hdr.icmp6_option_link_layer_addr.type,
        hdr.icmp6_option_link_layer_addr.ll_length,
        hdr.icmp6_option_link_layer_addr.mac_addr
    },
    hdr.icmp6.checksum,
    HashAlgorithm.csum16
);
\end{verbatim}
\centering
\caption{Calculating the ICMP6 checksum including payload in BMV2}
\label{fig:checksum}
\end{figure}

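
The same computation can be reproduced outside the switch, for example
to check test packets. The following Python sketch builds the IPv6
pseudo header in the field order of figure \ref{fig:checksum} and
checksums it together with the ICMP6 message. The function names are
hypothetical helpers, not part of the thesis code, and the message is
assumed to have its checksum field set to zero.
\begin{verbatim}
import ipaddress
import struct


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole 16-bit word
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                        # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def icmp6_checksum(src, dst, icmp6_message):
    """Checksum over the IPv6 pseudo header and the ICMP6 message."""
    pseudo_header = (
        ipaddress.IPv6Address(src).packed        # 128 bit source
        + ipaddress.IPv6Address(dst).packed      # 128 bit destination
        + struct.pack("!I", len(icmp6_message))  # 32 bit length
        + b"\x00\x00\x00"                        # 24 zero bits
        + bytes([58])                            # 8 bit next header
    )
    return internet_checksum(pseudo_header + icmp6_message)


# Usage: an echo request (type 128, code 0) with a zeroed checksum field.
message = struct.pack("!BBHHH", 128, 0, 0, 1, 1) + b"V6-OK"
print(hex(icmp6_checksum("2001:db8:6::1", "2001:db8:6::42", message)))
\end{verbatim}
Once the computed value is written into the checksum field, running the
same function over the completed message yields zero, which is how a
receiver verifies the packet.
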
% ----------------------------------------------------------------------
\section{\label{Design:NetPFGA}NetFPGA}
While the P4-NetFPGA project \cite{netfpga:_p4_netpf_public_github}
allows compiling P4 to the NetFPGA, the design differs slightly from
the BMV2 version.
In particular, the NetFPGA P4 compiler does not support reading
the payload. For this reason, it also does not support
creating the checksum based on the payload.
To support checksum modifications in NAT64 on the NetFPGA, the
checksum is calculated on the NetFPGA using the differences between
the IPv6 and IPv4 headers. Figure \ref{fig:checksumbydiff} shows an
excerpt of the code used for calculating checksums on the NetFPGA.
\begin{figure}[h]
\begin{verbatim}
action v4sum() {
    bit<16> tmp = 0;

    tmp = tmp + (bit<16>) hdr.ipv4.src_addr[15:0];  // 16 bit
    tmp = tmp + (bit<16>) hdr.ipv4.src_addr[31:16]; // 16 bit
    tmp = tmp + (bit<16>) hdr.ipv4.dst_addr[15:0];  // 16 bit
    tmp = tmp + (bit<16>) hdr.ipv4.dst_addr[31:16]; // 16 bit

    tmp = tmp + (bit<16>) hdr.ipv4.totalLen -20;    // 16 bit
    tmp = tmp + (bit<16>) hdr.ipv4.protocol;        // 8 bit

    meta.v4sum = ~tmp;
}

/* analogous code for v6sum skipped */

action delta_tcp_from_v6_to_v4()
{
    v6sum();
    v4sum();

    bit<17> tmp = (bit<17>) hdr.tcp.checksum + (bit<17>) meta.v4sum;
    if (tmp[16:16] == 1) {
        tmp = tmp + 1;
        tmp[16:16] = 0;
    }
    tmp = tmp + (bit<17>) (0xffff - meta.v6sum);
    if (tmp[16:16] == 1) {
        tmp = tmp + 1;
        tmp[16:16] = 0;
    }

    hdr.tcp.checksum = (bit<16>) tmp;
}
\end{verbatim}
\centering
\caption{Calculating checksum based on header differences}
\label{fig:checksumbydiff}
\end{figure}
The checksums for IPv4, TCP, UDP and ICMP6 are all based on the
``Internet Checksum'' \cite{rfc791,rfc1071}. Its calculation
can be summarised as follows:
\begin{quote}
  The checksum field is the 16-bit one's complement of the one's
  complement sum of all 16-bit words in the header. For purposes of
  computing the checksum, the value of the checksum field
  is zero.\footnote{Quote from Wikipedia~\cite{wikipedia:_ipv4}.}
\end{quote}
As the calculation mainly depends on (one's complement) sums, the
checksums can be corrected after translating the protocol by applying
the difference between the relevant fields of the IPv6 and IPv4
headers. Note that the pseudo headers are used for this, not the full
headers (compare figures
\ref{fig:ipv6pseudoheader} and \ref{fig:ipv4pseudoheader}).
To handle the carry bit, our code uses 17-bit integers: a carry into
the 17th bit is added back to the lower 16 bits.
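
The correction can be checked against a full recomputation in a few
lines of Python. The sketch below follows the structure of figure
\ref{fig:checksumbydiff}, but all function names are hypothetical, the
addresses are example values from the documentation ranges used in this
chapter, and the TCP header is a zero-filled stand-in; it is meant as
an illustration of the arithmetic, not as the thesis implementation.
\begin{verbatim}
import ipaddress
import struct


def ones_sum(data):
    """One's complement sum of all 16-bit words, carries folded in."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def add16(a, b):
    """Add two 16-bit values and add a carry out of bit 15 back in."""
    tmp = a + b
    if tmp >> 16:
        tmp = (tmp + 1) & 0xFFFF
    return tmp


def pseudo_header(src, dst, length, proto):
    """IPv6 or IPv4 pseudo header, depending on the address family."""
    src, dst = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    if src.version == 6:
        return src.packed + dst.packed + struct.pack("!I3xB", length, proto)
    return src.packed + dst.packed + struct.pack("!BBH", 0, proto, length)


def full_checksum(pseudo, segment):
    """Recompute over pseudo header, TCP header and payload."""
    return ~ones_sum(pseudo + segment) & 0xFFFF


def delta_v6_to_v4(checksum, v6_pseudo, v4_pseudo):
    """Correct the checksum without reading the payload."""
    tmp = add16(checksum, ~ones_sum(v4_pseudo) & 0xFFFF)
    return add16(tmp, ones_sum(v6_pseudo))


# Stand-in TCP segment: zeroed 20 byte header plus a short payload.
segment = bytes(20) + b"V6-OK"
v6 = pseudo_header("2001:db8:6::1", "2001:db8:4444::a00:401",
                   len(segment), 6)
v4 = pseudo_header("10.0.6.1", "10.0.4.1", len(segment), 6)
old = full_checksum(v6, segment)
new = delta_v6_to_v4(old, v6, v4)
print(new == full_checksum(v4, segment))
\end{verbatim}
In one's complement arithmetic both 0x0000 and 0xffff represent zero,
so the corrected and the recomputed value can differ in exactly that
one case.
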
% FIXME: add note to python script / checksum diffing
% ----------------------------------------------------------------------
\section{\label{Design:Benchmarks}Benchmarks}
The benchmarks were performed on two hosts, a load generator and a
NAT64 translator. Both hosts were equipped with a dual-port
Intel X520 10 Gbit/s network card. The hosts were connected directly
using direct attach cables (DAC), without any equipment in between.
TCP offloading was enabled on the
X520 cards. Figure \ref{fig:softwarenat64design}
shows the network setup.
\begin{figure}[h]
\includegraphics[scale=0.5]{softwarenat64design}
\centering
\caption{NAT64 in software benchmark}
\label{fig:softwarenat64design}
\end{figure}
When testing the NetFPGA/P4 performance, the X520 cards in the NAT64
translator were disconnected and the NetFPGA ports were connected
instead, as shown in figure \ref{fig:netpfgadesign}. The load
generator is equipped with a quad-core CPU (Intel(R) Core(TM) i7-6700
CPU @ 3.40GHz) with hyperthreading enabled and 16 GB RAM. The NAT64
translator is also equipped with a quad-core CPU (Intel(R) Core(TM)
i7-4770 CPU @ 3.40GHz) and 16 GB RAM.

The first 10 seconds of the benchmark were excluded to avoid the TCP
warm-up phase.\footnote{iperf parameter \texttt{-O 10}.}

\begin{figure}[h]
\includegraphics[scale=0.5]{netpfgadesign}
\centering
\caption{NAT64 with NetFPGA benchmark}
\label{fig:netpfgadesign}
\end{figure}