Running BPFabric
The goal of this example is to demonstrate how to use BPFabric to route traffic between two endpoints.
In this example the switch will have 2 ports. Each port will be attached to a separate node and the goal is to have a working communication between those two nodes.
The controller will be running on the same machine as the software switch and will be responsible for installing the learning switch functionality to the switch.
In order to perform this example you will need two network interfaces that are not currently used by your machine. Each interface should be connected to a node that can generate and receive traffic (for instance a Raspberry PI).
If you are running BPFabric in a Virtual Machine you can use Bridged Network Adapters to attach NIC from your host OS into your VM. If you are using bridged network adapters make sure that the interface is configured in promiscuous mode.
If you do not have enough network interfaces or end nodes to create this topology physically, look at the Mininet example instead to emulate it.
Both nodes have been set up with a static IP on the network interface connected to the BPFabric switch. The first node has the IP 192.168.58.1 and the second one 192.168.58.2. You can temporarily configure a static IP with sudo ip addr add 192.168.58.1/24 dev en0. If you use a network manager, make sure it doesn't conflict with a manually configured IP like this, or configure the static IP directly in your network manager.
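Before going further it can help to confirm that both static addresses sit in the same subnet, otherwise the nodes will try to reach each other through a gateway instead of through the switch. The following is a minimal sketch using Python's standard ipaddress module; the helper name same_subnet is made up for illustration and is not part of BPFabric.

```python
import ipaddress


def same_subnet(interface_cidr: str, peer_ip: str) -> bool:
    """Return True if peer_ip falls inside the network of interface_cidr."""
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(peer_ip) in network


# Addresses taken from the example topology above.
print(same_subnet("192.168.58.1/24", "192.168.58.2"))
```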
This section goes through the steps of running a BPFabric software switch that communicates with the controller and executes the installed BPF function(s).
For the next step you will need either the softswitch or the DPDK switch running. Let's start with the SoftSwitch as it is the simplest to get up and running.
Regardless of which implementation of the switch you will be running you need some network interfaces that BPFabric can receive traffic from, and send traffic to.
In this case enp0s8 and enp0s9 are the two NICs of interest. Each of them is connected to a separate Linux machine.
ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 08:00:27:83:2a:a1 brd ff:ff:ff:ff:ff:ff
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 08:00:27:74:0e:14 brd ff:ff:ff:ff:ff:ff
4: enp0s9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 08:00:27:13:29:35 brd ff:ff:ff:ff:ff:ff
Let's run the softswitch against those two interfaces
cd ~/repos/BPFabric/softswitch/
sudo ./softswitch --dpid=1 --controller="127.0.0.1:9000" --promiscuous enp0s8 enp0s9
Setting up 2 interfaces
Interface enp0s8, index 0, fd 3
Interface enp0s9, index 1, fd 4
unable to connect to the controller: Connection refused
Let's start by dissecting the command line arguments passed to softswitch:
- --dpid=1: sets the datapath identifier of this switch to 1. The datapath identifier is unique to each switch and is used to identify the switch on the network. Defaults to a random value if not present.
- --controller="127.0.0.1:9000": sets the address of the controller to localhost on port 9000. This is used to establish the connection between the switch and the controller. Defaults to 127.0.0.1:9000 if no controller address is provided.
- --promiscuous: enables promiscuous mode on the interfaces so that all packets are received even if they are not addressed to this NIC. Alternatively, you can enable promiscuous mode on the NIC manually. Defaults to disabled.
- enp0s8 enp0s9: the interfaces to attach to this switch. The interfaces are added in order, so enp0s8 maps to port 0 of the switch and enp0s9 to port 1.
This will set up the two interfaces on ports 0 and 1 of the switch and attempt to establish a connection to the controller. At this point the controller is not running, so the Connection refused message will be shown every time the connection is attempted.
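If you want to check from another terminal whether anything is listening on the controller address, a plain TCP connection attempt is enough. The sketch below is not part of BPFabric; controller_reachable is a hypothetical helper, and the defaults simply mirror the 127.0.0.1:9000 address used above.

```python
import socket


def controller_reachable(host: str = "127.0.0.1", port: int = 9000,
                         timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print(controller_reachable())
```

It returns False while the switch is still printing Connection refused and True once a controller is listening.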
Now that you have the switch running you can skip to the next section and set up the controller.
This will set up the DPDK switch similarly to the Soft Switch above. A lot of options are internal to DPDK so refer to the official documentation on configuring DPDK and its Environment Abstraction Layer (EAL).
The first step is to enable huge pages; this is only necessary once per boot. In this case we set up 1024 pages of 2048 kB. Refer to the official documentation for more information on setting up huge pages for DPDK or on using 1G huge pages. 1G huge pages can improve performance but need to be configured at kernel start-up.
sudo -i
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
If you check meminfo you should now see the hugepages available
cat /proc/meminfo
[...]
HugePages_Total: 1024
HugePages_Free: 1024
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 2097152 kB
[...]
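The Hugetlb line is simply the number of pages multiplied by the page size: 1024 pages of 2048 kB give 2097152 kB, i.e. 2 GiB. As a small sketch, the snippet below parses a copy of the output above and checks that arithmetic; parse_meminfo is an illustrative helper, and on a real machine you would read /proc/meminfo instead of the embedded sample.

```python
def parse_meminfo(text: str) -> dict:
    """Parse 'Key: value [kB]' lines into a dict of integers."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


# Sample copied from the output above.
SAMPLE = """\
HugePages_Total: 1024
HugePages_Free: 1024
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 2097152 kB
"""

info = parse_meminfo(SAMPLE)
reserved_kb = info["HugePages_Total"] * info["Hugepagesize"]
print(reserved_kb, "kB =", reserved_kb // (1024 * 1024), "GiB")
```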
Now you need to load the drivers for the network interfaces that can be used by DPDK. First let's identify which interfaces to load the drivers for.
dpdk-devbind.py --status
Network devices using kernel driver
===================================
0000:00:03.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s3 drv=e1000 unused= *Active*
0000:00:08.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s8 drv=e1000 unused=
0000:00:09.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s9 drv=e1000 unused=
[...]
At this point let's load the vfio-pci driver for 0000:00:08.0 (enp0s8) and 0000:00:09.0 (enp0s9)
modprobe vfio-pci
dpdk-devbind.py --bind=vfio-pci 0000:00:08.0
dpdk-devbind.py --bind=vfio-pci 0000:00:09.0
If you receive the error Cannot bind to driver vfio-pci: [Errno 22] Invalid argument, your machine might not have an IOMMU available. In that case you first need to enable no-IOMMU mode for vfio. Once the command below has been executed, you can try to bind the network interfaces again.
echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
If you check the status again you should see the two interfaces now available for DPDK to use.
dpdk-devbind.py --status
Network devices using DPDK-compatible driver
============================================
0000:00:08.0 '82540EM Gigabit Ethernet Controller 100e' drv=vfio-pci unused=e1000
0000:00:09.0 '82540EM Gigabit Ethernet Controller 100e' drv=vfio-pci unused=e1000
cd ~/repos/BPFabric/dpdkswitch/build/
sudo ./bpfabric -l 0-1 -n 4 -- -q 1 -p 3 -d 1 -c 127.0.0.1:9000
EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Ignore mapping IO port bar(2)
EAL: Probe PCI driver: net_e1000_em (8086:100e) device: 0000:00:08.0 (socket -1)
EAL: Ignore mapping IO port bar(2)
EAL: Probe PCI driver: net_e1000_em (8086:100e) device: 0000:00:09.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Lcore 0: RX port 0 TX port 1
Lcore 1: RX port 1 TX port 0
Initializing port 0... EAL: Error enabling MSI-X interrupts for fd 23
done:
Port 0, MAC address: 08:00:27:74:0E:14
Initializing port 1... EAL: Error enabling MSI-X interrupts for fd 27
done:
Port 1, MAC address: 08:00:27:13:29:35
Checking link status...................done
Port 0 Link up at 1 Gbps FDX Autoneg
Port 1 Link up at 1 Gbps FDX Autoneg
L2FWD: entering main loop on lcore 1
L2FWD: -- lcoreid=1 portid=1
L2FWD: entering main loop on lcore 0
L2FWD: -- lcoreid=0 portid=0
Port statistics ====================================
Statistics for port 0 ------------------------------
Packets sent: 0
Packets received: 0
Packets dropped: 0
Statistics for port 1 ------------------------------
Packets sent: 0
Packets received: 0
Packets dropped: 0
Aggregate statistics ===============================
Total packets sent: 0
Total packets received: 0
Total packets dropped: 0
====================================================
unable to connect to the controller: Connection refused
The arguments to the DPDK switch implementation are split into two parts. Everything before -- is an EAL argument and everything after is specific to this implementation.
-l 0-1 -n 4 -- -q 1 -p 3 -d 1 -c 127.0.0.1:9000
The arguments to the EAL used here are the following. Please refer to the official documentation to see the available parameters:
- -l 0-1: use CPU cores 0 and 1
- -n 4: use 4 memory channels
The arguments to the implementation are:
- -q 1: use 1 receive queue per logical core.
- -p 3: hexadecimal bitmask of the ports to enable. In this case we have two ports and both need to be enabled. Port 0 is enabled by bit 0 and port 1 by bit 1, so the bitmask is 00000011 in binary, hence 3 in hexadecimal.
- -d 1: set the datapath identifier to 1 for this switch on the network.
- -c 127.0.0.1:9000: set the controller address to localhost on port 9000.
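The port bitmask passed to -p can be derived mechanically for any set of ports: set bit N for every port N that should be enabled, then print the result in hexadecimal. A short sketch of that calculation (port_mask is an illustrative name, not a BPFabric or DPDK function):

```python
def port_mask(ports):
    """Build a bitmask with bit N set for every port index N in ports."""
    mask = 0
    for port in ports:
        mask |= 1 << port
    return mask


mask = port_mask([0, 1])
# Binary form shows which ports are enabled; hex form is what -p expects.
print(f"{mask:08b} -> {mask:x}")
```

For ports 0 and 1 this prints 00000011 -> 3, matching the value used above. Enabling ports 0 to 3 would give f.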
Similarly to the softswitch, you will see Connection refused messages as the controller is not yet running.
Now you have a switch that's up and running, but no function has been installed on it. The next step is to start the controller and wait for the switch to establish a connection. Once the connection is established, a new function can be installed on the switch.
In a new terminal separate from the switch start the Command Line Interface (CLI) sample controller that allows interacting with the switches manually.
cd ~/repos/BPFabric/controller
python3 cli.py
--------------------------------------------------------------------------------
eBPF Switch Controller Command Line Interface - Netlab 2024
Simon Jouet <[email protected]> - University of Glasgow
--------------------------------------------------------------------------------
Documented commands (type help <topic>):
========================================
help
Undocumented commands:
======================
connections
After a few seconds you should see your switch establishing a connection to the controller. You will see the DPID of the switch that has established the connection, in this case 00000001. If you check your running switch you should no longer see Connection refused messages.
(Cmd) Connection from switch 00000001, version 1
Now that the connection is established, press the Enter key to get a clean prompt. If you then type connections you will see the list of active switch connections.
(Cmd) connections
dpid version connected at
========== ========= ===================
00000001 1 1707850336.484466
========== ========= ===================
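The connected at column appears to be a Unix timestamp (seconds since 1970-01-01 UTC); this is an assumption based on the magnitude of the value, not something the CLI output states. Under that assumption, a minimal sketch for turning it into a readable date is:

```python
from datetime import datetime, timezone


def format_connected_at(timestamp: float) -> str:
    """Format a Unix timestamp (assumed seconds since the epoch) as UTC."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


# Value copied from the connections table above.
print(format_connected_at(1707850336.484466))
```

For the value shown above this gives 2024-02-13 18:52:16 UTC.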
If you check the list of active functions on the switch you should see that it is empty. The first argument (1) is the DPID of the switch to which the command is issued. Here we issue the FunctionListRequest to get the list of installed functions.
(Cmd) 1 list
(Cmd) <Empty Table>
For this example we want the switch to act as a learning switch. For this you can use the provided example function learningswitch, which learns the mapping between a MAC address and a physical port. Check the implementation in learningswitch.c to understand how it works.
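As a rough mental model, a learning switch records the port on which each source MAC address was seen, and forwards a packet to the recorded port of its destination MAC, or floods it when the destination is still unknown. The Python sketch below illustrates that general technique only; it is not a translation of learningswitch.c, and the LearningSwitch class and FLOOD sentinel are hypothetical names chosen for this example.

```python
FLOOD = -1  # hypothetical sentinel: send out of every port except the input


class LearningSwitch:
    """Generic learning switch logic: MAC address -> port it was seen on."""

    def __init__(self):
        self.inports = {}

    def handle(self, src_mac: str, dst_mac: str, in_port: int) -> int:
        # Learn (or refresh) where the sender lives.
        self.inports[src_mac] = in_port
        # Forward to the known port, or flood if the destination is unknown.
        return self.inports.get(dst_mac, FLOOD)


sw = LearningSwitch()
# First packet: destination not learnt yet, so it is flooded.
print(sw.handle("00:e0:b4:5a:ad:d5", "c8:5b:76:fd:3b:e4", 0))
# Reply: the original sender was learnt on port 0.
print(sw.handle("c8:5b:76:fd:3b:e4", "00:e0:b4:5a:ad:d5", 1))
```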
Let's install the function on the switch by issuing a FunctionAddRequest. This message can be issued from the CLI using:
(Cmd) 1 add 0 learningswitch ../examples/learningswitch.o
(Cmd) Function has been installed
In this case we want to add the function on switch 1 at index 0 in the execution pipeline. As a description we name this function learningswitch, and the implementation binary (eBPF) is available at the path ../examples/learningswitch.o.
Now if we list the installed functions, we get the learning switch:
(Cmd) 1 list
(Cmd)
index name counter
======= ================ =========
0 learningswitch 0
======= ================ =========
The function learningswitch is installed at index 0 in the pipeline and so far 0 packets have passed through this function.
Now let's look at the state of the tables used by this function. We can do this by issuing a TablesListRequest to list the tables used by a function. In the CLI you can do that with:
(Cmd) 1 table 0 list
(Cmd)
name type key size value size max entries
========= ====== ========== ============ =============
inports HASH 6 4 256
========= ====== ========== ============ =============
The learning switch function (at index 0) has only one table, called inports. This table is a lookup table (HASH) with a key size of 6 bytes to store the MAC address and a value size of 4 bytes to store the port on which a packet with this MAC address was seen.
Listing the content of a table can be done through a TableListRequest, which returns the entire content of the table to the controller. If no traffic has passed through the switch this table should be empty. You can check this in the CLI using:
(Cmd) 1 table 0 inports list
(Cmd) <Empty Table>
Now that the function is installed let's check that the packets are switched correctly.
You should now have a switch with the learning switch function installed. Let's test that traffic can flow between nodes and that the learning switch function is behaving appropriately.
You can ping one node from the other and you should see that the communication is working
ping 192.168.58.2
PING 192.168.58.2 (192.168.58.2) 56(84) bytes of data.
64 bytes from 192.168.58.2: icmp_seq=1 ttl=64 time=4.07 ms
64 bytes from 192.168.58.2: icmp_seq=2 ttl=64 time=2.28 ms
[...]
You can now list the tables of the learning switch to see the entries that were learnt.
(Cmd) 1 table 0 inports list
(Cmd)
Key Value
============== ==========
00e0b45aadd5 00000000
c85b76fd3be4 01000000
============== ==========
In this case we have two entries in the table. The key in this table is the MAC address of the node and the value is the port on which the traffic arrived. You can see that traffic from 00:e0:b4:5a:ad:d5 is coming from port 0 and traffic from c8:5b:76:fd:3b:e4 from port 1. Note that if you are running BPFabric in a virtualised environment, for instance a VM or Mininet, you might have additional entries mapping to the host MAC addresses.
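The CLI prints keys and values as raw hexadecimal bytes. The key is the 6-byte MAC address, and the value 01000000 corresponding to port 1 suggests the 4-byte port number is stored in little-endian byte order; that byte order is an inference from the output above, not something confirmed elsewhere on this page. Under that assumption, a small sketch for decoding an entry (decode_entry is an illustrative helper):

```python
def decode_entry(key_hex: str, value_hex: str):
    """Decode an inports entry into (MAC string, port number).

    Assumes the value is a little-endian unsigned integer.
    """
    mac = ":".join(key_hex[i:i + 2] for i in range(0, len(key_hex), 2))
    port = int.from_bytes(bytes.fromhex(value_hex), byteorder="little")
    return mac, port


# Entries copied from the table listing above.
for key, value in [("00e0b45aadd5", "00000000"), ("c85b76fd3be4", "01000000")]:
    print(decode_entry(key, value))
```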
You can also inspect the functions installed on the switch and see that packets have passed through the learning switch function
(Cmd) 1 list
(Cmd)
index name counter
======= ================ =========
0 learningswitch 12
======= ================ =========