++netpfga results
parent e1949d2ac3
commit bf22fdcdb3
5 changed files with 110 additions and 177 deletions
@ -174,191 +174,24 @@ idomatic problem: Security issue: not checking checksums before
|
|||
|
||||
% ----------------------------------------------------------------------
|
||||
\section{\label{conclusion:netpfga}NetFGPA - all HERE}
|
||||
personal note here
|
||||
|
||||
|
||||
stopped working
|
||||
reboot not enough
|
||||
does not respond to any packet
|
||||
|
||||
tested various kernels for table debugging
|
||||
|
||||
MTU limitations: 1500 according to a private mail from Salvator Galea
|
||||
cambridge / uk
|
||||
|
||||
long compile process
|
||||
error prone compile process
|
||||
many dependencies
|
||||
lpm not supported!
|
||||
Netpfga live,
|
||||
Vivado
|
||||
SDNET
|
||||
xx k lines of supporting code
|
||||
|
||||
Vivado installation: silent errors, infinite loop, missing libncurses5
|
||||
|
||||
82k lines of code that are interdependent
|
||||
Many non critical error messages on the way
|
||||
Zero exit fatal errors
|
||||
|
||||
missing / scattered documentation
|
||||
|
||||
tcpdump on local nfX doesn't work -> can only debug on other endpoint
|
||||
|
||||
|
||||
First card: Writing tables fails
|
||||
hardware debug shows some errors
|
||||
but hardware debug on correct card also shows some error
|
||||
Debug ioctl errors when writing table entries
|
||||
|
||||
|
||||
Output all ports -> port mapping documented only in a testdata script
|
||||
|
||||
|
||||
hwtest: Execution fails due to missing djtgcfg
|
||||
|
||||
|
||||
no payload access
|
||||
|
||||
Many workarounds
|
||||
|
||||
Table size 63, table size 64,
|
||||
|
||||
Table entries require arguments of all possible actions, not only used
|
||||
one.
|
||||
|
||||
Compile time hours
|
||||
|
||||
Silent errors
|
||||
|
||||
Unclear errors: broken board
|
||||
|
||||
Due to the very fragile nature of the build framework from the
|
||||
NetFPGA-Live repository,
|
||||
|
||||
Renaming VARIABLES in the definition of
|
||||
|
||||
Reproducibility:
|
||||
|
||||
hours for finding right output ports
|
||||
|
||||
packet size / annotation
|
||||
|
||||
Needed to debug internal parsing errors
|
||||
|
||||
3x rebooting to get card working with bitstream
|
||||
|
||||
Variable renaming breaks the compile process
|
||||
|
||||
\begin{verbatim}
|
||||
It seems I was really mistaken for the last weeks
|
||||
If I am not totally mistaken, the following is happening with the netpfga:
|
||||
I was testing sending and receiving packets on the same computer; so I sent a packet on nfX and expected an answer on nf0, which is how I wanted to verify that the card works
|
||||
So I ran tcpdump on nf0, send a packet with ping6 and scapy on nf{0,1,2,3} (edited)
|
||||
I have never seen the switch emitting ANY packet back with tcpdump
|
||||
Now with the card connected to another host, sending neighbor solicitation, I see duplicated packets on the other host - so it seems that it might have worked all the time, just that tcpdump on nfX on the host which contains the card does not show the packets
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
debugging generated tcl code to debug impl1 error
|
||||
|
||||
Cable problems:
|
||||
\begin{verbatim}
|
||||
[ 488.265148] ixgbe 0000:02:00.0: failed to initialize because an unsupported SFP+ module type was detected.
|
||||
[ 488.265157] ixgbe 0000:02:00.0: Reload the driver after installing a supported module.
|
||||
[ 488.265605] ixgbe 0000:02:00.0: removed PHC on enp2s0f0
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
function syntax not supported, using defines instead
|
||||
|
||||
4-6 MB logfiles for a compile process.
|
||||
|
||||
confusing messages
|
||||
\begin{verbatim}
|
||||
WARNING: command 'get_user_parameter' will be removed in the 2015.3
|
||||
release, use 'get_user_parameters' instead
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
critical non critical errors
|
||||
\begin{verbatim}
|
||||
|
||||
|
||||
CRITICAL WARNING: [BD 41-737] Cannot set the parameter TRANSLATION_MODE on /axi_interconnect_0. It is read-only.
|
||||
\end{verbatim}
|
||||
|
||||
\begin{verbatim}
|
||||
- step9 (sume simulation, the longest step) in the process calls
|
||||
"config_writes.py"
|
||||
- config_writes.py fails with a syntax error, as it is incomplete
|
||||
python code
|
||||
- config_writes.py and config_writes.sh are generated by
|
||||
gen_config_writes.py
|
||||
- gen_config_writes.py reads config_writes.txt
|
||||
- config_writes.txt is created in step 5 (sdnet simulation)
|
||||
- step 5 consists of running xsc, xelab and xsim
|
||||
- xsim (re-)generates config_writes.txt according to a watch ls -l
|
||||
on the file: ${XILINX_VIVADO}/bin/xsim --runall
|
||||
SimpleSumeSwitch_tb#work.glbl
|
||||
- it seems (by grep -r) that ./Testbench/SimpleSumeSwitch_tb.sv is
|
||||
responsible for writing config_writes.txt
|
||||
- It seems that the "task" "SV_write_control" inside that file is
|
||||
responsible for writing the content, which in turn uses
|
||||
axi4_lite_master_write_request_control
|
||||
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
\begin{verbatim}
|
||||
- Cannot easily run P4 on notebook - changes to the system very
|
||||
invasive
|
||||
- Various compiler bugs/limitations
|
||||
- Very very deep rabbithole problems
|
||||
- Hanging/sleeping issue -- unclear whether it does something or
|
||||
not
|
||||
- Open impl_1 error with unclear reason
|
||||
- logfiles referenced that don't exist
|
||||
Run output will be captured here: /home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/synth/runme.log
|
||||
nico@nsg-System:~/master-thesis/netpfga/log$ ls -alh /home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/synth/runme.log
|
||||
ls: cannot access '/home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/synth/runme.log': No such file or directory
|
||||
- even "short" compile runs taking 30m+
|
||||
|
||||
control_sub_m02_data_fifo_0_synth_1: /home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/control_sub_m02_data_fifo_0_synth_1/runme.log
|
||||
nico@nsg-System:~/master-thesis/netpfga/minip4/testdata$ less /home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/control_sub_m02_data_fifo_0_synth_1/runme.log
|
||||
/home/nico/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/hw/project/simple_sume_switch.runs/control_sub_m02_data_fifo_0_synth_1/runme.log: No such file or directory
|
||||
|
||||
- Wrong warnings: using 2018.2, getting warnings about things
|
||||
removed in 2015.3
|
||||
WARNING: command 'get_user_parameter' will be removed in the 2015.3
|
||||
release, use 'get_user_parameters' instead
|
||||
|
||||
- A script/makefile generates a python script that generates a shell
|
||||
script and later then a python script. If there is a mistake in
|
||||
generating the first python script (syntax ok, but content is
|
||||
not correct) then a much later stage of the compile process will
|
||||
fail due to a syntax error in the third generated
|
||||
script. However that syntax error is not fatal in the build
|
||||
process and thus can only be seen with careful analysis of the
|
||||
logfile, which is around 700 KiB or 10k lines per compile
|
||||
process and contains 328 lines matching "error" and
|
||||
"warning".
|
||||
|
||||
Most of the error and warning messages seem to be non-critical
|
||||
(even if saying they are). Then there are a variety of INFO
|
||||
messages that actually constitute ERROR messages, but are not
|
||||
flagged as such nor do they cause the build process to abort.
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
LPM tables don't work
|
||||
|
||||
match type exact - table must be at least 64 in size
|
||||
|
||||
|
||||
multiple reboots sometimes required for flashing
|
||||
|
||||
|
||||
Damaged, enlarged packets
|
||||
|
||||
\begin{verbatim}
|
||||
|
@ -545,6 +378,14 @@ the learnings of the different layers were very much appreciated / liked
|
|||
|
||||
It was a
|
||||
|
||||
|
||||
% ----------------------------------------------------------------------
|
||||
\section{\label{conclusion:netpfga2}NetFPGA2 - conclusion here}
|
||||
Development was very time-intensive due to usability problems and
uncertainty of functionality (compare sections
\ref{results:netpfga:usability} and \ref{results:netpfga:stability}).
|
||||
|
||||
|
||||
\section{todo - FIXME: remove}
|
||||
\begin{verbatim}
|
||||
***** Keep the summary rather short
|
||||
|
|
100
doc/Results.tex
|
@ -161,13 +161,14 @@ table entries.
|
|||
Jool and tayga are supported by
|
||||
|
||||
% ----------------------------------------------------------------------
|
||||
\section{\label{Results:NetPFGA}NetFPGA}
|
||||
\section{\label{results:netpfga}NetFPGA}
|
||||
The reduced feature set of the NetFPGA implementation is due to two
factors: the compile time of between 2 and 6 hours per compile run, and
the missing support for payload checksums.
|
||||
|
||||
|
||||
|
||||
overview - general translation - not advanced features
|
||||
% ----------------------------------------------------------------------
|
||||
\subsection{\label{results:netpfga:features}Features}
|
||||
\begin{table}[htbp]
|
||||
\begin{center}\begin{minipage}{\textwidth}
|
||||
\begin{tabular}{| c | c | c |}
|
||||
|
@ -235,7 +236,6 @@ unsupported\footnote{To support creating payload checksums, either an
|
|||
\label{tab:p4netpfgafeatures}
|
||||
\end{center}
|
||||
\end{table}
|
||||
|
||||
% ----------------------------------------------------------------------
|
||||
\subsection{\label{results:netpfga:stability}Stability}
|
||||
Two different NetFPGA cards were used during the development of the
|
||||
|
@ -262,15 +262,99 @@ During the development and benchmarking, the second NetFPGA card stopped to
|
|||
function properly multiple times. In both cases the card would not
|
||||
forward packets anymore. Multiple reboots (3 were usually enough)
|
||||
and multiple times reflashing the bitstream to the NetFPGA usually
|
||||
restored the intended behaviour.
|
||||
|
||||
restored the intended behaviour. However, due to these ``crashes'', it
was impossible to complete a full benchmark run lasting
more than one hour.
|
||||
% ----------------------------------------------------------------------
|
||||
\subsection{\label{results:netpfga:performance}Performance}
|
||||
As expected, the NetFPGA card performed at near line speed and offers
|
||||
NAT64 translations at 9.28 Gbit/s.
|
||||
NAT64 translations at 9.28 Gbit/s. Single and multiple streams
performed almost identically and were consistent across
multiple iterations of the benchmarks.
|
||||
% ----------------------------------------------------------------------
|
||||
\subsection{\label{results:netpfga:usability}Usability}
|
||||
To use the NetFPGA, Vivado and SDNET provided by Xilinx need to be
installed. However, a bug in the installer triggers an infinite loop
if a certain shared library\footnote{The required shared library
  is libncurses5.} is missing on the target operating system. The
installation program still appears to be progressing, but it never
finishes.
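
A pre-flight check along the following lines could detect the missing
library before the installer is started. This is only a sketch: the
soname \texttt{libncurses.so.5} is an assumption about how libncurses5
is named on the target system, and the function name is hypothetical.

```python
import ctypes


def shared_library_available(soname: str) -> bool:
    """Return True if the dynamic loader can load the given shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # The loader could not find or load the library.
        return False
    return True


# Assumed soname of libncurses5; adjust for the target distribution.
SONAME = "libncurses.so.5"
if shared_library_available(SONAME):
    print(f"{SONAME} found")
else:
    print(f"{SONAME} missing: install it before running the installer")
```

Whether the installer needs further libraries is not covered by this check.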
|
||||
|
||||
While the NetFPGA card supports P4, the toolchains and supporting
scripts are in an immature state. The compilation process consists of
at least 9 different steps, which are interdependent.\footnote{See
  source code \texttt{bin/do-all-steps.sh}.} Some of the steps generate
shell scripts and Python scripts that in turn generate JSON
data.\footnote{One compilation step calls the script
  ``config\_writes.py''. This script failed with a syntax error, as it
  contained incomplete Python code. The scripts config\_writes.py
  and config\_writes.sh are generated by gen\_config\_writes.py.
  The output of the script gen\_config\_writes.py depends on the content
  of config\_writes.txt. That file is generated by the simulation
  ``xsim''. The file ``SimpleSumeSwitch\_tb.sv'' contains code that is
  responsible for writing config\_writes.txt and uses a function
  named axi4\_lite\_master\_write\_request\_control for generating the
  output. This in turn depends on the output of a script named
  gen\_testdata.py.}
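
Because a broken generated script is not fatal to the build, a check
between the steps could make such failures visible early. The following
is a minimal sketch and not part of the NetFPGA toolchain: it assumes
that the generated scripts are plain Python files, and the function
names are hypothetical.

```python
import ast
from pathlib import Path


def syntax_error_in(source: str, name: str):
    """Return 'name:line: message' for the first syntax error, or None."""
    try:
        ast.parse(source, filename=name)
    except SyntaxError as exc:
        return f"{name}:{exc.lineno}: {exc.msg}"
    return None


def check_generated_scripts(paths):
    """Collect problems of generated scripts: missing files and syntax errors."""
    problems = []
    for path in map(Path, paths):
        if not path.exists():
            problems.append(f"{path}: file was not generated")
            continue
        error = syntax_error_in(path.read_text(encoding="utf-8"), str(path))
        if error is not None:
            problems.append(error)
    return problems


# Incomplete Python code, similar in spirit to the broken generated script.
print(syntax_error_in("def broken(:\n    pass\n", "config_writes.py"))
```

A build script could call \texttt{check\_generated\_scripts} after each
generating step and abort if the returned list is not empty.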
|
||||
|
||||
Checksum computation
|
||||
However, incorrect parsing generates syntactically incorrect
scripts or scripts that generate incorrect output. The toolchain
provided by the NetFPGA-P4 repository contains more than 80000 lines
of code. The supporting scripts for setting table entries require
setting the parameters for all possible actions, not only for the
selected action. Supplying only the required parameters results in a
crash of the supporting script.
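
The workaround of supplying parameters for all actions can be sketched
as follows. The action names, the parameter names and the use of zero
as a placeholder are assumptions made for illustration only; they do
not describe the interface of the supporting scripts.

```python
def build_entry_arguments(actions, selected, values):
    """Return one value per parameter of every action, in declaration order.

    actions:  dict mapping action name -> list of parameter names (hypothetical)
    selected: name of the action that the table entry should trigger
    values:   dict mapping parameter name -> value for the selected action
    """
    if selected not in actions:
        raise KeyError(f"unknown action: {selected}")
    missing = [p for p in actions[selected] if p not in values]
    if missing:
        raise ValueError(f"missing parameters for {selected}: {missing}")
    arguments = []
    for name, parameters in actions.items():
        for parameter in parameters:
            # Parameters of unused actions get a placeholder zero (assumption).
            arguments.append(values[parameter] if name == selected else 0)
    return arguments


# Hypothetical table with two actions.
ACTIONS = {"nat64": ["v6_src", "v6_dst"], "set_egress_port": ["port"]}
print(build_entry_arguments(ACTIONS, "set_egress_port", {"port": 4}))
```

The sketch relies on Python dictionaries keeping their insertion order,
so the arguments appear in the order in which the actions are listed.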
|
||||
|
||||
The documentation for using the NetFPGA-P4 repository is
scattered and does not contain a reference on how to use the
tools. The mapping of egress ports to their metadata field is only found in a
Python script that is used for generating test data.
|
||||
|
||||
The compile process can take up to 6 hours and, because the different
steps are interdependent, errors in an earlier stage were in our
experience detected hours after they happened. The resulting log
files of the compilation process can be up to 5 MB in size. Within
these log files various commands output references to other logfiles;
however, the referenced logfiles do not exist before or after the
compile process.
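
Dangling logfile references can be listed with a few lines of Python.
The sketch below assumes that references follow the form ``Run output
will be captured here: \emph{path}'' that we observed in the build log;
other formats are not covered, and the function names are hypothetical.

```python
import os
import re

# Message format as observed in the build log; other formats are not covered.
REFERENCE = re.compile(r"Run output will be captured here:\s*(\S+)")


def referenced_logfiles(log_text):
    """Return all logfile paths that the build log refers to."""
    return REFERENCE.findall(log_text)


def missing_logfiles(log_text):
    """Return the referenced logfiles that do not exist on disk."""
    return [path for path in referenced_logfiles(log_text)
            if not os.path.exists(path)]


# Shortened, made-up example line; real paths are much longer.
SAMPLE = "Run output will be captured here: /nonexistent-dir-xyz/synth/runme.log\n"
print(missing_logfiles(SAMPLE))
```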
|
||||
|
||||
During the compile process various informational, warning and error
messages are printed. However, some informational messages constitute
critical errors, while on the other hand messages labelled as critical
errors and syntax errors often do not constitute a critical
error.\footnote{For instance, ``CRITICAL WARNING: [BD 41-737] Cannot set the
  parameter TRANSLATION\_MODE on /axi\_interconnect\_0. It is
  read-only.'' is a non critical warning.}
Contradictory
output is also generated.\footnote{While using version 2018.2, the following
  message was printed: ``WARNING: command 'get\_user\_parameter' will be removed in the 2015.3
  release, use 'get\_user\_parameters' instead''.}
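
To get an overview of a log file, the messages can be counted per
severity label. The sketch assumes that every message starts with one
of the labels \texttt{INFO}, \texttt{WARNING}, \texttt{CRITICAL WARNING}
or \texttt{ERROR} followed by a colon. As described above, the label
does not reliably indicate whether a message is critical, so the counts
only help to navigate the log.

```python
import re
from collections import Counter

# Anchored at the line start, so "CRITICAL WARNING" and "WARNING"
# are counted as separate labels.
SEVERITY = re.compile(r"^(CRITICAL WARNING|ERROR|WARNING|INFO):")


def count_severities(log_lines):
    """Count log lines per severity label; unlabelled lines are ignored."""
    counts = Counter()
    for line in log_lines:
        match = SEVERITY.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


SAMPLE = [
    "WARNING: command 'get_user_parameter' will be removed in the 2015.3",
    "release, use 'get_user_parameters' instead",
    "CRITICAL WARNING: [BD 41-737] Cannot set the parameter TRANSLATION_MODE",
]
print(count_severities(SAMPLE))
```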
|
||||
|
||||
The NetFPGA kernel module provides access to virtual Linux
devices (nf0...nf3). However, tcpdump does not see any packets that are
emitted from the switch. The only way to capture packets
that are emitted from the switch is to connect a physical cable to
the port and capture on the other side.
|
||||
|
||||
Jumbo frames\footnote{Frames with a payload of more than 1500 bytes.} are
commonly used in 10 Gbit/s networks. According to
\cite{wikipedia:_jumbo}, even many gigabit network interface cards
support jumbo frames. However, according to emails on the private
NetFPGA mailing list, the NetFPGA only supports 1500 byte frames at
the moment and additional work is required to implement support for
bigger frames.
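
The 1500 byte limit matters for NAT64 because the translation changes
the packet size. The following arithmetic sketch assumes an IPv4 header
without options (20 bytes) and the fixed IPv6 header (40 bytes) and
ignores extension headers; the function names are hypothetical.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header without extension headers
MTU = 1500        # bytes, limit reported for the NetFPGA


def translated_size_v4_to_v6(ipv4_packet_size, ipv4_header=IPV4_HEADER):
    """Size of the IPv6 packet after replacing the IPv4 header."""
    return ipv4_packet_size - ipv4_header + IPV6_HEADER


def fits_mtu(packet_size, mtu=MTU):
    """Return True if a packet of the given size fits into the MTU."""
    return packet_size <= mtu


for size in (1480, 1500):
    translated = translated_size_v4_to_v6(size)
    print(size, "->", translated, "fits" if fits_mtu(translated) else "too big")
```

Under these assumptions an IPv4 packet may be at most 1480 bytes long
to still fit into 1500 bytes after the translation to IPv6.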
|
||||
|
||||
While most of the P4 language is supported on the NetFPGA, some key
techniques are missing or not supported.
\begin{itemize}
\item Analysing / accessing payload is not supported
\item Checksum computation over payload is not supported
\item Using LPM tables can lead to compilation errors
\item Depending on the match type, only certain table sizes are allowed
\end{itemize}
Renaming variables in the declaration of the parser or deparser leads
to compilation errors. Function syntax is not supported. For this
reason our implementation uses \texttt{\#define} statements instead of functions.
|
||||
|
||||
Trace files
|
||||
\begin{verbatim}
|
||||
|
|
BIN
doc/Thesis.pdf
Binary file not shown.
|
@ -508,7 +508,6 @@ nf3: ERROR while getting interface flags: No such device
|
|||
nico@nsg-System:~/projects/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/minip4/simple_sume_switch/bitfiles$
|
||||
\end{verbatim}
|
||||
% ----------------------------------------------------------------------
|
||||
|
||||
\section{\label{chapterB:netpfga-kernelmodule}NetFPGA Kernel module}
|
||||
After a successful flash, loading the kernel module will enable nf
|
||||
devices to appear in the operating system.
|
||||
|
@ -580,13 +579,15 @@ nico@nsg-System:~$
|
|||
|
||||
\end{verbatim}
|
||||
% ----------------------------------------------------------------------
|
||||
|
||||
\section{\label{chapterB:netpfga-nftraffic}NetFPGA misses packets on nf*}
|
||||
While the nf devices appear in the operating system, packets emitted
by the NetFPGA cannot be sniffed on the nf interfaces
directly. Instead, one has to sniff packets on a physical network card
that is connected to the specific output port.
|
||||
|
||||
% ----------------------------------------------------------------------
|
||||
\section{\label{chapterB:netpfga-kernelmodule}NetFPGA Kernel module}
|
||||
|
||||
%---------------------------------------------------------------------------------------------------------
|
||||
\chapter{\label{benchmark}Benchmark Logs}
|
||||
% ----------------------------------------------------------------------
|
||||
|
|
|
@ -135,3 +135,10 @@
|
|||
author = {Hendrik Züllig, Supervisor; Prof. Dr. Laurent Vanbever; Tutor: Tobias Bühler},
|
||||
title = {P4-Programming on an FPGA, Semester Thesis SA-2019-02},
|
||||
howpublished = {\url{https://gitlab.ethz.ch/nsg/student-projects/sa-2019-02_p4_programming_sume_netfpga/blob/master/SA-2019-02.pdf}}}
|
||||
|
||||
|
||||
@Misc{wikipedia:_jumbo,
|
||||
author = {Wikipedia},
|
||||
title = {Jumbo frame},
|
||||
howpublished = {\url{https://en.wikipedia.org/wiki/Jumbo_frame}},
|
||||
note = {Requested on 2019-08-15}}
|
||||
|
|