Function
A function is a single stage of the execution pipeline. Functions are compiled to eBPF, which keeps their logic platform-independent and provides some execution guarantees.
Each function can contain lookup tables that only it and the controller can access. Functions cannot modify the tables of other functions. The tables can be used to store state and make forwarding decisions when processing a packet. The content of a table persists between packets processed by the same function. Tables are removed if the function is removed.
Each function has a set of APIs exposed to query the tables and to raise events on the control plane. The set of calls exposed to the data plane is purposefully limited to those that would be possible from the data plane of a hardware switch.
Functions can be either passive or active. A passive function processes the packet and updates some state without making a forwarding decision. An active function returns a forwarding decision for the packet it has processed.
Multiple functions can be chained in a pipeline and dynamically added and removed to change the behaviour of the network at runtime.
The current implementation supports 3 types of tables: HASH, ARRAY and LPM. Other types of tables can be easily added if necessary.
BPF_MAP_TYPE_HASH defines a hash table to look up a value from a provided key. You can declare a HASH table in your function using the following:
struct bpf_map_def SEC("maps") inports = {
    .type = BPF_MAP_TYPE_HASH,
    .key_size = 6,
    .value_size = sizeof(uint32_t),
    .max_entries = 256,
};
In this case a hash table called inports is defined. This table can be used to map a MAC address to a port in the switch. The key is 6 bytes long to store the MAC address and the value is 4 bytes long to store the port. The table can hold a maximum of 256 entries.
You can insert an element in the hash table using the following:
bpf_map_update_elem(&inports, pkt->eth.h_source, &pkt->metadata.in_port, 0);
In this case &inports is the pointer to the table defined above, pkt->eth.h_source is the source MAC address from the incoming packet, &pkt->metadata.in_port is the pointer to the uint32 containing the incoming port from this packet's metadata, and flags is 0.
You can look up an element using:
bpf_map_lookup_elem(&inports, pkt->eth.h_dest, &out_port)
&inports is the pointer to the table, pkt->eth.h_dest is the lookup key (in this case the destination MAC address of the packet), and &out_port is the pointer that receives the value stored in the map.
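The update and lookup calls above combine naturally into a MAC learning switch. The following is a minimal host-side sketch of that logic, not the switch code itself: mock_inports, mock_update, mock_lookup and learn_and_forward are hypothetical names introduced here, and the flat array only imitates the behaviour of the inports table (including the -1 return for a missing entry) so the flow can be tried outside the switch.

```c
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 256
#define MAC_LEN 6
#define FLOOD (0x01ULL << 32)

/* Hypothetical stand-in for the inports table: a flat array searched
 * linearly. The real table lives in the switch as a BPF map. */
struct mock_entry {
    uint8_t mac[MAC_LEN];
    uint32_t port;
    int used;
};
static struct mock_entry mock_inports[MAX_ENTRIES];

/* Insert or overwrite an entry. Returns 0 on success, -1 if the table is full. */
int mock_update(const uint8_t *mac, uint32_t port)
{
    int free_slot = -1;
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (mock_inports[i].used && memcmp(mock_inports[i].mac, mac, MAC_LEN) == 0) {
            mock_inports[i].port = port;
            return 0;
        }
        if (!mock_inports[i].used && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    memcpy(mock_inports[free_slot].mac, mac, MAC_LEN);
    mock_inports[free_slot].port = port;
    mock_inports[free_slot].used = 1;
    return 0;
}

/* Returns 0 and fills *port when the key is present, -1 otherwise. */
int mock_lookup(const uint8_t *mac, uint32_t *port)
{
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (mock_inports[i].used && memcmp(mock_inports[i].mac, mac, MAC_LEN) == 0) {
            *port = mock_inports[i].port;
            return 0;
        }
    }
    return -1;
}

/* Learn the source address, then forward to the learned port or flood. */
uint64_t learn_and_forward(const uint8_t *src, const uint8_t *dst, uint32_t in_port)
{
    uint32_t out_port;
    mock_update(src, in_port);
    if (mock_lookup(dst, &out_port) == -1)
        return FLOOD;
    /* PORT is 0, so the port number alone is the forwarding decision. */
    return out_port;
}
```

The first packet toward an unknown destination is flooded; once the destination has sent a packet of its own, later packets go straight to its learned port.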
struct bpf_map_def SEC("maps") traffichist = {
    .type = BPF_MAP_TYPE_ARRAY,
    .key_size = sizeof(uint32_t),
    .value_size = sizeof(uint64_t),
    .max_entries = 24,
};
BPF_MAP_TYPE_ARRAY defines an array lookup table, mapping an index to a value.
In this case we create an array of 24 entries. The index of the array is a uint32 and the value is a uint64. This can be useful, for instance, to store information per port. In this example the switch might have 24 ports, so 24 entries are created to give a one-to-one mapping between a port and an entry in the table.
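As an illustration of the one-to-one mapping between ports and entries, here is a small host-side sketch of a per-port byte counter. It assumes the table is used to accumulate bytes per incoming port, which the text above does not state; mock_traffichist, count_bytes and bytes_seen are hypothetical names, and a plain C array stands in for the map.

```c
#include <stdint.h>

#define NUM_PORTS 24

/* Hypothetical stand-in for the traffichist table: one uint64 per port. */
static uint64_t mock_traffichist[NUM_PORTS];

/* Add len bytes to the counter of in_port.
 * Returns 0 on success, -1 if the port has no entry in the table. */
int count_bytes(uint32_t in_port, uint32_t len)
{
    if (in_port >= NUM_PORTS)
        return -1;
    mock_traffichist[in_port] += len;
    return 0;
}

/* Read the counter of in_port, or 0 for a port outside the table. */
uint64_t bytes_seen(uint32_t in_port)
{
    return in_port < NUM_PORTS ? mock_traffichist[in_port] : 0;
}
```

The bounds check matters because an array table has a fixed number of entries: index 24 does not exist in a 24-entry table.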
struct ipv4_lpm_key
{
    uint32_t prefixlen;
    uint32_t data;
};
struct bpf_map_def SEC("maps") lpm = {
    .type = BPF_MAP_TYPE_LPM_TRIE,
    .key_size = sizeof(struct ipv4_lpm_key),
    .value_size = sizeof(uint32_t),
    .max_entries = 256,
    .map_flags = BPF_F_NO_PREALLOC,
};
BPF_MAP_TYPE_LPM_TRIE creates a longest prefix match table using a trie. A common use for LPM is to map an IP address with a netmask to a port. In this case the key is a structure containing the prefix length of the netmask in bits, and the data is the IP address. The value is a uint32 and the table can contain a maximum of 256 entries.
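To make the longest prefix match semantics concrete, the sketch below reproduces them with a linear scan on the host. It is not the trie used by the table. The names struct route, prefix_matches and lpm_lookup are hypothetical, and addresses are kept in host byte order purely for readability, which is an assumption of this sketch rather than a statement about the byte order the switch expects.

```c
#include <stdint.h>
#include <stddef.h>

struct ipv4_lpm_key {
    uint32_t prefixlen;
    uint32_t data;
};

/* Hypothetical table entry: an LPM key and the port it maps to. */
struct route {
    struct ipv4_lpm_key key;
    uint32_t port;
};

/* Non-zero when the top prefixlen bits of addr equal those of the key.
 * Addresses are in host byte order (an assumption of this sketch). */
int prefix_matches(const struct ipv4_lpm_key *key, uint32_t addr)
{
    if (key->prefixlen == 0)
        return 1;
    if (key->prefixlen > 32)
        return 0;
    uint32_t mask = 0xFFFFFFFFu << (32 - key->prefixlen);
    return (addr & mask) == (key->data & mask);
}

/* Linear scan for the most specific matching route.
 * Returns 0 and fills *port on a match, -1 otherwise. */
int lpm_lookup(const struct route *routes, size_t n, uint32_t addr, uint32_t *port)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!prefix_matches(&routes[i].key, addr))
            continue;
        if (best < 0 || routes[i].key.prefixlen > routes[best].key.prefixlen)
            best = (int)i;
    }
    if (best < 0)
        return -1;
    *port = routes[best].port;
    return 0;
}
```

With routes for 10.0.0.0/8 on port 1 and 10.1.0.0/16 on port 2, the address 10.1.2.3 matches both prefixes and resolves to port 2, because the /16 entry is the longer match.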
The APIs exposed in the functions are the following:
static int bpf_map_lookup_elem(void *map, void *key, void *value);
static int bpf_map_update_elem(void *map, void *key, void *value, unsigned long long flags);
static int bpf_map_delete_elem(void *map, void *key);
static int bpf_mirror(unsigned long long out_port, void *buf, int len);
static int bpf_notify(int id, void *data, int len);
static int bpf_debug(unsigned long long arg);
- bpf_map_lookup_elem: Generic table lookup. Given a map, a key and a pointer for the value, the entry for this key is retrieved from the table. Returns -1 if the entry is not present in the map.
- bpf_map_update_elem: Generic table update and insert. Given a map, a key, a value and flags, this call adds the entry to the map. If the entry does not exist it is created. Returns -1 on error.
- bpf_map_delete_elem: Generic table remove. Given a map and a key, this call removes the entry from the map. Returns -1 on error.
- bpf_mirror: Mirror the packet in buf, of length len, to port out_port. If len is shorter than the length of the buffer, the packet is truncated.
- bpf_notify: Ask the agent to send a notification to the controller. The notification is identified by id and contains data of length len.
- bpf_debug: Print a debug statement in the switch for basic debugging. This only accepts a number as an argument.
The function should return a value indicating the forwarding decision. The forwarding decision is a 64-bit unsigned integer that contains a decision and an argument for this decision. The top 32 bits of the return value hold the decision and the bottom 32 bits hold its argument.
The decisions a function can return are:
#define PORT 0x00ULL
/** Flood the packet to all other ports */
#define FLOOD (0x01ULL << 32)
/** Send the packet to the controller */
#define CONTROLLER (0x02ULL << 32)
/** Drop the packet */
#define DROP (0x03ULL << 32)
/** Send the packet to the next pipeline stage */
#define NEXT (0x04ULL << 32)
- PORT: Forward the packet to a specific port. The packet can be forwarded to port 0 by returning PORT, to port 1 by returning PORT + 1, and so on.
- FLOOD: Forward the packet to all ports except the incoming port. This does not support arguments.
- CONTROLLER: Forward the packet to the controller as a PacketIn request. This does not support arguments.
- DROP: The packet is dropped. This does not support arguments.
- NEXT: The packet is passed to the next stage in the pipeline. This accepts as an argument the number of stages to skip. For instance, NEXT + 1 will skip the next function in the pipeline.
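The encoding of the return value and the skip argument of NEXT can be sketched on the host as follows. This is an illustration, not the switch's dispatcher: decision_of, argument_of, stage_fn and run_pipeline are hypothetical names, the stage signature is simplified to take an opaque pointer, and dropping a packet that runs past the last stage is an assumption of this sketch.

```c
#include <stdint.h>
#include <stddef.h>

#define PORT       0x00ULL
#define FLOOD      (0x01ULL << 32)
#define CONTROLLER (0x02ULL << 32)
#define DROP       (0x03ULL << 32)
#define NEXT       (0x04ULL << 32)

/* Top 32 bits: the decision, left in place so it compares against the macros. */
uint64_t decision_of(uint64_t ret)
{
    return ret & 0xFFFFFFFF00000000ULL;
}

/* Bottom 32 bits: the argument (a port number, or a number of stages to skip). */
uint32_t argument_of(uint64_t ret)
{
    return (uint32_t)(ret & 0xFFFFFFFFULL);
}

/* Hypothetical stage signature; a real function receives the packet. */
typedef uint64_t (*stage_fn)(void *pkt);

/* Run stages in order until one returns a decision other than NEXT.
 * Assumption: a packet that runs off the end of the pipeline is dropped. */
uint64_t run_pipeline(const stage_fn *stages, size_t n, void *pkt)
{
    size_t i = 0;
    while (i < n) {
        uint64_t ret = stages[i](pkt);
        if (decision_of(ret) != NEXT)
            return ret;
        /* Move to the following stage, plus any stages to skip. */
        i += 1 + (size_t)argument_of(ret);
    }
    return DROP;
}

/* Three toy stages used to exercise the runner. */
uint64_t skip_one(void *pkt)  { (void)pkt; return NEXT + 1; }
uint64_t to_port_9(void *pkt) { (void)pkt; return PORT + 9; }
uint64_t to_port_2(void *pkt) { (void)pkt; return PORT + 2; }
```

Running the pipeline { skip_one, to_port_9, to_port_2 } yields PORT + 2: the first stage returns NEXT + 1, so the second stage is skipped and the third one makes the forwarding decision.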