diff --git a/doc/Conclusion.tex b/doc/Conclusion.tex index ad2a41f..d338254 100644 --- a/doc/Conclusion.tex +++ b/doc/Conclusion.tex @@ -174,191 +174,24 @@ idomatic problem: Security issue: not checking checksums before % ---------------------------------------------------------------------- \section{\label{conclusion:netpfga}NetFGPA - all HERE} -personal note here - -stopped working -reboot not enough -does not respond to any packet - -tested various kernels for table debugging - -MTU limitations: 1500 according to a private mail from Salvator Galea -cambridge / uk - -long compile process -error prone compile process many dependencies lpm not supported! Netpfga live, -Vivado -SDNET -xx k lines of supporting code - -Vivado installation: silent errors, infinite loop, missing libncurses5 - -82k lines of code that are interdependent -Many non critical error messages on the way -Zero exit fatal errors - -missing / spreaded documentation - -tcpdump on local nfX doesn't work -> can only debug on other endpoint - - -First card: Writing tables fails -hardware debug shows some errors -but hardware debug on correct card also shows some error -Debug ioctl errors when writing table entries - - -Output all ports -> port mapping documented only in a testdata script - - -hwtest: Execution fails due to missing djtgcfg - - -no payload accessq Many workarounds -Table size 63, table size 64, - -Table entries require arguments of all possible actions, not only used -one. 
- -Compile time hours - -Silent errors - -Unclear errors: broken board - -Due to the very fragile nature of the build framework from the -NetFPGA-Live repository, - -Renaming VARIABLES in the definition of - -Reproducibility: - -hours for finding right output ports - packet size / annotation Needed to debug internal parsing errors -3x rebooting to get card working with bitstream - -Variable renaming breaks the compile process - -\begin{verbatim} -It seems I was really mistaken for the last weeks -If I am not totally mistaken, the following is happening with the netpfga: -I was testing sending and receiving packets on the same computer; so I sent a packet on nfX and expected an answer on nf0, which is how I wanted to verify that the card works -So I ran tcpdump on nf0, send a packet with ping6 and scapy on nf{0,1,2,3} (edited) -I have never seen the switch emitting ANY packet back with tcpdump -Now with the card connected to another host, sending neighbor solicitation, I see duplicated packets on the other host - so it seems that it might have worked all the time, just that tcpdump on nfX on the host which contains the card does not show the packets - -\end{verbatim} debugging generated tcl code to debug impl1 error -Cable problems: -\begin{verbatim} -[ 488.265148] ixgbe 0000:02:00.0: failed to initialize because an unsupported SFP+ module type was detected. -[ 488.265157] ixgbe 0000:02:00.0: Reload the driver after installing a supported module. -[ 488.265605] ixgbe 0000:02:00.0: removed PHC on enp2s0f0 - -\end{verbatim} - function syntax not supported, using defines instead -4-6 MB logfiles for a compile process. - -confusing messages -\begin{verbatim} -WARNING: command 'get_user_parameter' will be removed in the 2015.3 -release, use 'get_user_parameters' instead - -\end{verbatim} - -critical non critical errors -\begin{verbatim} - - -CRITICAL WARNING: [BD 41-737] Cannot set the parameter TRANSLATION_MODE on /axi_interconnect_0. It is read-only. 
-\end{verbatim} - -\begin{verbatim} - - step9 (sume simulation, the longest step) in the process calls - "config_writes.py" - - config_writes.py fails with a syntax error, as it is incomplete - python code - - config_writes.py and config_writes.sh are generated by - gen_config_writes.py - - gen_config_writes.py reads config_writes.txt - - config_writes.txt is created in step 5 (sdnet simulation) - - step 5 consists of running xsc, xelab and xsim - - xsim (re-)generates config_writes.txt according to a watch ls -l - on the file: ${XILINX_VIVADO}/bin/xsim --runall - SimpleSumeSwitch_tb#work.glbl - - it seems (by grep -r) that ./Testbench/SimpleSumeSwitch_tb.sv is - responsible for writing config_writes.txt - - It seems that the "task" "SV_write_control" inside that file is - responsible for writing the content, which in turn uses - axi4_lite_master_write_request_control - - -\end{verbatim} - -\begin{verbatim} - - Cannot easily run P4 on notebook - changes to the system very - invasive - - Varous compiler bugs/limitations - - Very very deep rabbithole problems - - Hanging/sleeping issue -- unclear whether it does something or - not - - Open impl_1 error with unclear reason - - logfiles referenced that don't exist -Run output will be captured here: /home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/synth/runme.log -nico@nsg-System:~/master-thesis/netpfga/log$ ls -alh /home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/synth/runme.log -ls: cannot access '/home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/synth/runme.log': No such file or directory - - even "short" compile runs taking 30m+ - -control_sub_m02_data_fifo_0_synth_1: 
/home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/control_sub_m02_data_fifo_0_synth_1/runme.log -nico@nsg-System:~/master-thesis/netpfga/minip4/testdata$ less /home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/control_sub_m02_data_fifo_0_synth_1/runme.log -/home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/control_sub_m02_data_fifo_0_synth_1/runme.log: No such file or directory - - - Wrong warnings: using 2018.2, getting warnings about things - removed in 2015.3 -WARNING: command 'get_user_parameter' will be removed in the 2015.3 -release, use 'get_user_parameters' instead - - - A script/makefile generates a python script that generates a shell - script and later then a python script. If there is a mistake in - generating the first python script (syntax ok, but content is - not correct) then a much later stage of the compile process will - fail due to a syntax error in the third generated - script. However that syntax error is not fatal in the build - process and thus can only be seen with careful analysis of the - logfile, which is around 700 KiB or 10k lines per compile - process and contains 328 lines matching "error" and - "warning". - - Most of the error and warning messages seem to be non-critical - (even if saying they are). Then there are a variety of INFO - messages that actually constitute ERROR messages, but are not - flagged as such nor do they cause the build process to abort. 
- -\end{verbatim} - -LPM tables don't work - match type exact - table must be at least 64 in size - -multiple reboots sometimes required for flashing - - Damaged, enlarged packets \begin{verbatim} @@ -545,6 +378,14 @@ the learnings of the different layers were very much appreciated / liked It was a + +% ---------------------------------------------------------------------- +\section{\label{conclusion:netpfga2}NetFPGA2 - conclusion here} +Development was very time-intensive due to usability problems and +uncertainty about functionality (compare sections +\ref{results:netpfga:usability} and \ref{results:netpfga:stability}). + + \section{todo - FIXME: remove} \begin{verbatim} ***** Summary eher kurz diff --git a/doc/Results.tex b/doc/Results.tex index c678202..5523a8b 100644 --- a/doc/Results.tex +++ b/doc/Results.tex @@ -161,13 +161,14 @@ table entries. Jool and tayga are supported by % ---------------------------------------------------------------------- -\section{\label{Results:NetPFGA}NetFPGA} +\section{\label{results:netpfga}NetFPGA} The reduced feature set of the NetPFGA implementation is due to two factors: compile time. Between 2 to 6 hours per compile run. No payload checksum - - +overview - general translation - not advanced features +% ---------------------------------------------------------------------- +\subsection{\label{results:netpfga:features}Features} \begin{table}[htbp] \begin{center}\begin{minipage}{\textwidth} \begin{tabular}{| c | c | c |} @@ -235,7 +236,6 @@ unsupported\footnote{To support creating payload checksums, either an \label{tab:p4netpfgafeatures} \end{center} \end{table} - % ---------------------------------------------------------------------- \subsection{\label{results:netpfga:stability}Stability} Two different NetPFGA cards were used during the development of the @@ -262,15 +262,99 @@ During the development and benchmarking, the second NetFPGA card stopped to function properly multiple times. 
In both cases the card would not forward packets anymore. Multiple reboots (3 were usually enough) and multiple times reflashing the bitstream to the NetFPGA usually -restored the intended behaviour. - +restored the intended behaviour. However, due to these ``crashes'', it +was impossible to complete a full benchmark run that would last for +more than one hour. % ---------------------------------------------------------------------- \subsection{\label{results:netpfga:performance}Performance} As expected, the NetFGPA card performed at near line speed and offers -NAT64 translations at 9.28 Gbit/s. +NAT64 translations at 9.28 Gbit/s. Single and multiple streams +performed almost identically, and the results were consistent across +multiple iterations of the benchmarks. +% ---------------------------------------------------------------------- +\subsection{\label{results:netpfga:usability}Usability} +To use the NetFPGA, Vivado and SDNET provided by Xilinx need to be +installed. However, a bug in the installer triggers an infinite loop +if a certain shared library\footnote{The required shared library +is libncurses5.} is missing on the target operating system. The +installation program appears to be progressing, but it never +finishes. +While the NetFPGA card supports P4, the toolchains and supporting +scripts are in an immature state. The compilation process consists of +at least 9 different steps, which are interdependent.\footnote{See +source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate +shell scripts and Python scripts that in turn generate JSON +data.\footnote{One compilation step calls the script +``config\_writes.py''. This script failed with a syntax error, as it +contained incomplete Python code. The scripts config\_writes.py +and config\_writes.sh are generated by gen\_config\_writes.py. +The output of the script gen\_config\_writes.py depends on the content +of config\_writes.txt. That file is generated by the simulation +``xsim''. 
The file ``SimpleSumeSwitch\_tb.sv'' contains code that is +responsible for writing config\_writes.txt and uses a function +named axi4\_lite\_master\_write\_request\_control for generating the +output. This in turn is dependent on the output of a script named +gen\_testdata.py.} -Checksum computation +However, incorrect parsing generates syntactically incorrect +scripts or scripts that generate incorrect output. The toolchain +provided by the NetFPGA-P4 repository contains more than 80000 lines +of code. The supporting scripts for setting table entries require +setting the parameters for all possible actions, not only for the +selected action. Supplying only the required parameters results in a +crash of the supporting script. + +The documentation for using the NetFPGA-P4 repository is widely +scattered and does not contain a reference on how to use the +tools. The mapping of egress ports to their metadata fields is found only in a +Python script that is used for generating test data. + +The compile process can take up to 6 hours and, because the different +steps are interdependent, errors in an earlier stage were, in our +experience, detected hours after they occurred. The resulting log +files of the compilation process can be up to 5 MB in size. Within +these log files, various commands print references to other log files; +however, the referenced log files do not exist before or after the +compile process. + +During the compile process various informational, warning and error +messages are printed. However, some informational messages constitute +critical errors, while on the other hand critical errors and syntax +errors often do not constitute a critical +error.\footnote{For instance, ``CRITICAL WARNING: [BD 41-737] Cannot set the +parameter TRANSLATION\_MODE on /axi\_interconnect\_0. 
It is +read-only.'' is a non-critical warning.} +Contradictory +output is also generated.\footnote{While using version 2018.2, the following +message was printed: ``WARNING: command 'get\_user\_parameter' will be removed in the 2015.3 +release, use 'get\_user\_parameters' instead''.} + +The NetFPGA kernel module provides access to virtual Linux +devices (nf0 to nf3). However, tcpdump does not see any packets that are +emitted from the switch. The only way to capture packets +that are emitted from the switch is to connect a physical cable to +the port and capture on the other side. + +Jumbo frames\footnote{Frames with a payload of more than 1500 bytes.} are +commonly used in 10 Gbit/s networks. According to +\cite{wikipedia:_jumbo}, many gigabit network interface cards +support jumbo frames. However, according to emails on the private +NetFPGA mailing list, the NetFPGA only supports 1500-byte frames at +the moment, and additional work is required to implement support for +bigger frames. + +While most of the P4 language is supported on the NetFPGA, some key +techniques are missing or not supported. +\begin{itemize} +\item Analysing / accessing payload is not supported +\item Checksum computation over payload is not supported +\item Using LPM tables can lead to compilation errors +\item Depending on the match type, only certain table sizes are allowed +\end{itemize} +Renaming variables in the declaration of the parser or deparser leads +to compilation errors. Function syntax is not supported. For this +reason our implementation uses \texttt{\#define} statements instead of functions. 
Trace files \begin{verbatim} diff --git a/doc/Thesis.pdf b/doc/Thesis.pdf index 74edca2..15b7011 100644 Binary files a/doc/Thesis.pdf and b/doc/Thesis.pdf differ diff --git a/doc/appendix.tex b/doc/appendix.tex index 1b9497b..584f976 100644 --- a/doc/appendix.tex +++ b/doc/appendix.tex @@ -508,7 +508,6 @@ nf3: ERROR while getting interface flags: No such device nico@nsg-System:~/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/bitfiles$ \end{verbatim} % ---------------------------------------------------------------------- - \section{\label{chapterB:netpfga-kernelmodule}NetFPGA Kernel module} After a successful flash, loading the kernel module will enable nf devices to appear in the operating system. @@ -580,13 +579,15 @@ nico@nsg-System:~$ \end{verbatim} % ---------------------------------------------------------------------- - \section{\label{chapterB:netpfga-nftraffic}NetFPGA misses packets on nf*} While the nf devices appear in the operating system, packets emitted by the netpfga cannot be sniffed on the nf interfaces directly. Instead one has to sniff packets on a physical network card that is connected to the specific output port. +% ---------------------------------------------------------------------- +\section{\label{chapterB:netpfga-kernelmodule}NetFPGA Kernel module} + %--------------------------------------------------------------------------------------------------------- \chapter{\label{benchmark}Benchmark Logs} % ---------------------------------------------------------------------- diff --git a/doc/refs/refs.bib b/doc/refs/refs.bib index c6db87a..db09199 100644 --- a/doc/refs/refs.bib +++ b/doc/refs/refs.bib @@ -135,3 +135,10 @@ author = {Hendrik Züllig, Supervisor; Prof. Dr. 
Laurent Vanbever; Tutor: Tobias Bühler}, title = {P4-Programming on an FPGA, Semester Thesis SA-2019-02}, howpublished = {\url{https://gitlab.ethz.ch/nsg/student-projects/sa-2019-02_p4_programming_sume_netfpga/blob/master/SA-2019-02.pdf}}} + + +@Misc{wikipedia:_jumbo, + author = {Wikipedia}, + title = {Jumbo frame}, + howpublished = {\url{https://en.wikipedia.org/wiki/Jumbo_frame}}, + note = {Accessed on 2019-08-15}}