Sockets

NOTE: A substantial amount of thought went into the writing of this page, as sockets are one of the most important networking concepts used in Runtime, but they are also very complex, with a lot of subtle details to cover. This page was therefore written in a very specific way and is intended to be read from top to bottom. Unlike many of the other pages in this Wiki, which were intended to be used as references, this page was purposefully written to read more like a textbook, slowly introducing more topics and vocabulary in a (hopefully) cohesive way.

What are Sockets?

Sockets are a very general way for two (or more, depending on the type of socket) endpoints (code that generates or consumes data) to communicate with each other in both directions (as opposed to pipes, which are historically one-directional). We say "very general" because the communication could be between endpoints within the same process, between endpoints in different processes on the same machine, or between endpoints in different processes on different machines (that are connected over some network, such as your local Wi-Fi network or the Internet).

Generally, how sockets work is that each endpoint creates an object known as a socket that it will use to communicate. Then, depending on the type of socket (more on that later), a series of steps is taken to make the socket "visible" to the outside world (this series of steps and the meaning of being "visible" depend on the nature of the socket and which end of the connection that given socket will take on).

A socket is written to and read from through a handle known as a socket descriptor, which for all intents and purposes is the same as a file descriptor that you can write to and read from.

Types of Sockets

There are three kinds of sockets covered here: TCP, UDP, and Unix Domain Sockets (or just Unix sockets). We now do a deep dive into these three kinds, which differ in substantial ways.

TCP (Transmission Control Protocol)

As the name suggests, TCP is a protocol that controls/governs how some data is transmitted between two endpoints on a network. For a TCP-compliant connection (or just a "TCP connection"), there are exactly two endpoints which can send and receive messages from each other; we say that TCP is a connection-oriented protocol because messages cannot be transmitted using TCP if there does not exist a connection between two endpoints. The protocol guarantees that the data is transmitted reliably between the two endpoints, i.e. when the sender sends some data, the receiver will receive that data in the same order that it was sent, and the receiver will not receive duplicate data. We say that TCP is a reliable protocol.

When data is transmitted from one endpoint to another endpoint, it can be treated as a continuous stream of data. This has some significant consequences! If one endpoint sends two messages in quick succession to the other endpoint, such that the two messages are packed one after the other with no breaks in between, the receiving end will not be able to tell where in the received stream the first message ends and the second message begins! In other words, TCP does not preserve message boundaries; it does not include message delimiters. When we use TCP for communication, we must be sure to deal with this problem, especially when our messages are of variable length. The two sides must agree on how to mark where each message ends. Two common ways to do this are:

Agreeing on a specified delimiter sequence that marks the location where one message ends and another begins
Explicitly sending the length, in bytes, of the incoming message prior to sending the message so that the receiver knows how many bytes to read from the socket to obtain the entire message.
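
The second approach (sending the length first) can be sketched as follows. This is a minimal sketch and not Runtime's actual wire format: the 4-byte big-endian length header and the helper names send_framed(), recv_framed(), write_all(), and read_all() are assumptions made for illustration. The loops are there because a single read() or write() on a stream may transfer fewer bytes than requested.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <stdint.h>
#include <sys/types.h>   /* ssize_t */
#include <unistd.h>      /* read, write */

/* Write exactly len bytes; write() may accept fewer bytes than requested. */
static int write_all(int fd, const void *buf, size_t len) {
    const uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; returns -1 on error or if the peer closes early. */
static int read_all(int fd, void *buf, size_t len) {
    uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: a 4-byte big-endian length, then the payload. */
int send_framed(int fd, const void *msg, uint32_t len) {
    uint32_t header = htonl(len);
    if (write_all(fd, &header, sizeof(header)) < 0) return -1;
    return write_all(fd, msg, len);
}

/* Receive one message into buf (capacity cap).
 * Returns the payload length, or -1 on error or if it does not fit. */
ssize_t recv_framed(int fd, void *buf, uint32_t cap) {
    uint32_t header;
    uint32_t len;
    if (read_all(fd, &header, sizeof(header)) < 0) return -1;
    len = ntohl(header);
    if (len > cap) return -1;
    if (read_all(fd, buf, len) < 0) return -1;
    return (ssize_t)len;
}
```

With these helpers, sending "hello" and then "hi" back to back still lets the receiver recover a 5-byte message followed by a 2-byte message, even though the bytes travel as one continuous stream.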

When a socket is going to be used to make a TCP connection, it is declared as a stream socket (because of the stream nature of TCP communication) with the constant SOCK_STREAM. The sequence of steps needed to establish a connection between two TCP sockets starts with one socket connecting to another socket, which is running on a machine with a known address and is bound to a specific port. The address of the machine is (in Runtime and in PiE in general) an IPv4 (Internet Protocol, version 4) address, something like 192.168.0.24. The endpoint that is issuing the connection request is usually called the client, and the endpoint that is receiving the connection request is usually called the server. When the server accepts an incoming connection request from the client, a TCP connection has now been formed between the two endpoints, and writing data from one endpoint results in that data being made available to the other endpoint to read. Fantastic!

Recall that the sockets API is meant to be used between any two endpoints; the endpoints could be within the same process, in different processes on the same machine, or on different machines. The way this is accomplished is with this IPv4 (sometimes abbreviated as just IP) address. On a local network, each machine is assigned an address from a private range, commonly of the form 192.168.0.*. For example, the devices connected to your Wi-Fi network at home (which is a local network) are typically assigned IP addresses of the form 192.168.0.*. Traffic that leaves the local network for the public Internet uses a different, public IP address. In PiE, since the computers running Shepherd, Dawn, and Runtime (the Raspberry Pis) are all connected to the same local network, we use 192.168.0.* to communicate with Shepherd and Dawn in production. However, for endpoints on the same machine, there is another special IP address: 127.0.0.1 (a.k.a. "localhost" or "loopback address") which always points back to the device a process is running on. On my laptop, 127.0.0.1 points to my laptop; on your laptop, 127.0.0.1 points to your laptop, etc. This is how we test Runtime (and, generally, how processes on the same machine can communicate with each other through the sockets API). By telling net_handler to send all of its output to various ports on the 127.0.0.1 address, we can capture all of its output locally on the Raspberry Pi without needing to connect the real Shepherd or the real Dawn over a network to the Raspberry Pi.

UDP (User Datagram Protocol)

UDP is another protocol that governs/controls the transmission of data into and out of an endpoint on a network. UDP is an unreliable, connectionless protocol (as opposed to TCP, which is reliable and connection-oriented). Let's explore what this means.

With TCP, a concrete connection exists between two endpoints; data sent out from one endpoint is reliably transmitted and received in the order that it was sent by the other endpoint. When sending a message using UDP, the endpoint uses its socket (configured to use UDP) to send a message to some IP address and port number; this IP address and port number is the intended receiver of the message. However, there is no notion of a connection between the sender and the intended receiver; indeed, the intended receiver doesn't even have to exist, in which case the message that was sent out is simply dropped and lost forever. When both the sender and intended receiver exist and messages sent are being delivered successfully to the intended receiver, the messages received by the receiver are not necessarily reliable. In other words, if the sender sends message A first, and then message B next, it is possible for the receiver to receive message B first, then message A next, and then a duplicate message A after that. This is simply due to network unreliability and the nature of UDP (it does not offer reliability). The probability of these events happening is low, especially with low network congestion and when the receiver can process incoming messages at a rate faster than the rate at which the messages are arriving.

Another consequence of there being no notion of a "connection" between the sender and the intended receiver is that multiple senders can exist for the same intended receiver. There is nothing wrong with having two endpoints both sending messages to a third endpoint.

Lastly, UDP is not a stream-based protocol like TCP is. Rather, UDP is a datagram-based protocol, hence the name. A datagram is a "packet" or "container" that stores the contents of one message. Each time the sender sends data to the intended receiver, it calls a function. Each time that function is called, the data that it was called with is packaged into a datagram and sent out onto the network to the intended receiver. So, while UDP is unreliable, it is the delivery of whole packets (messages) that is unreliable, not the messages' contents. If a packet arrives at the intended receiver, it is guaranteed to arrive whole and uncorrupted. However, it is not guaranteed that a packet sent from a sender will arrive at the intended receiver successfully, or that packets sent from the sender will arrive at the intended receiver in the same order that they were sent, or that packets will not be duplicated on their way to the intended receiver.

A consequence of UDP using datagrams to transmit messages instead of a byte-stream like TCP is that messages sent using UDP are message delimited—each datagram corresponds to one message, and the receiver knows when one message ends and the next begins. This means that we do not need to worry about sending the length of the message along with the message itself, or coming up with some sequence of bytes that both endpoints agree on will represent one message ending and another one starting.

When a socket is going to be used for transmitting or receiving messages using UDP, it is first created as a datagram socket (due to the datagram nature of UDP) using the constant SOCK_DGRAM. Once this is done, the program can immediately begin to send and receive datagrams on that socket; there is no sequence of steps to establish a connection between a server and a client as with TCP, since there is no notion of a connection with UDP.

The same special IP addresses described in the "TCP" section apply to UDP socket addresses as well: 192.168.0.* is used for communicating between machines on the same local network, and 127.0.0.1 is used for communicating between endpoints on the local machine. A point to note: a UDP socket and a TCP socket are allowed to be bound to the same address and the same port, and they refer to two completely separate objects from the operating system's point of view. In other words, a program can open up a stream socket and bind it to 127.0.0.1:8000 (port 8000 on my machine) and the operating system will know to send incoming TCP messages on port 8000 to this socket. That same program can open up a datagram socket and bind it to 127.0.0.1:8000 (the exact same port and address on my machine) and the operating system will know to send incoming UDP datagrams on port 8000 to this socket. Incoming messages on one socket do not affect incoming messages on the other socket.

Unix Domain Sockets

The most important thing to know about Unix domain sockets is that they cannot be used between processes on different machines. They are kind of like a cross between a pipe and a socket.

Here are some ways that it is like a socket:

Two-way communication
Supports SOCK_DGRAM and SOCK_STREAM (datagram-based transmission and stream-based transmission, both of which are reliable)
Used and coded like a socket

Here are some ways that it is like a pipe:

Only allowed between processes on the same machine
(Usually) has a name in the file system (this is like a FIFO and not like a pure pipe)
Does not use any underlying protocol (IP, TCP, UDP) to process, wrap, pack, or unpack the sent messages; they are simply sent as essentially raw bytes from one endpoint to another

When creating a Unix domain socket endpoint, a socket is first created as either a stream-based (SOCK_STREAM) or datagram-based (SOCK_DGRAM) socket. Then, depending on which type of socket was specified, a connection sequence is initiated from a client to the server (for stream-based), or the new socket is immediately used to send and receive datagrams (for datagram-based). Notice the similarities between using TCP and UDP sockets, respectively. The "address" of a Unix domain socket is not an IP address like 127.0.0.1; rather, it is a pathname, such as /foo.sock. This file will be created in the filesystem when the socket is created and bound to that "address".

For Unix domain sockets, it is not necessary for a client to bind its socket to an "address" (file name) on the file system. Though this is usually done when establishing communication between a client and a server with Unix domain sockets, there is some special behavior that occurs when the client does not bind its socket to a file name after creating it. We use this behavior in Runtime, so it is helpful to cover it here.

If a Unix domain socket is created and then bound without a specific file path name, it is what's known as an abstract socket, which has no entry in the file system. The operating system chooses an internal name for the abstract socket that is unique and that the user does not need to know (on Linux this is called autobinding, and it happens when the address length passed to bind() covers only the address family). Abstract sockets allow arbitrary programs to bring up Unix sockets and use them to send data to a server that is bound to a well-known address. Since the names assigned to the abstract sockets are guaranteed to be unique, the server knows where to send its replies (i.e. the server can identify which client sent the data).

Additional Comments On All Sockets

EOF Condition

For TCP and Unix Domain stream-based sockets, when one end of a connection is closed, any attempt to read data from the end that is still open will report an EOF (end-of-file) condition (assuming that the end that is still open has already processed all prior incoming data from the end that just closed). This lets the receiver know that the other end has closed, and that it should therefore start cleanup or termination operations.

However, since UDP and Unix Domain datagram-based sockets are connectionless, closing all writers to a particular address will not "close the connection", since there is no notion of a connection. So, a socket using a datagram-based protocol will never receive an EOF; it is up to the programmer to decide when a datagram socket is no longer needed and should be closed.

One last thing to note: on stream-based sockets, an EOF is given by a call to read that returns 0. Recall that read returns the number of bytes read from the given file descriptor, and under normal circumstances, a read will block if there is nothing to read from the file descriptor. So, read returning with a value of 0 indicates that something abnormal happened to the connection. The confusing thing is that for datagram-based sockets, since EOF does not make sense, it is actually possible for read to return under normal conditions with a value of 0; that just means that the socket received a datagram, and that datagram contained no information (zero bytes).
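
The two meanings of a zero return value can be seen side by side in the following sketch. It uses socketpair(), a standard call not otherwise covered on this page, purely as a shortcut for getting two already-connected Unix domain sockets in one process; the function names are hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Stream socket: once the peer has closed, read() returns 0 (EOF). */
ssize_t read_after_peer_close(void) {
    int sv[2];
    char buf[8];
    ssize_t n;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    close(sv[1]);                       /* the other end goes away */
    n = read(sv[0], buf, sizeof(buf));
    close(sv[0]);
    return n;                           /* 0 here means EOF */
}

/* Datagram socket: read() returning 0 can simply be an empty datagram. */
ssize_t read_empty_datagram(void) {
    int sv[2];
    char buf[8];
    ssize_t n = -1;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0) return -1;
    if (send(sv[1], "", 0, 0) == 0)     /* send a zero-byte datagram */
        n = read(sv[0], buf, sizeof(buf));
    close(sv[0]);
    close(sv[1]);
    return n;                           /* 0 here, yet the peer was still open */
}
```

Both functions return 0 from read(), but only in the first one does that value mean the other end is gone.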

`struct sockaddr`

This structure is, as the name suggests, used to hold socket addresses. You can imagine that the IP address/port combinations that identify a TCP or UDP socket (ex. 192.168.0.24:6101) look very different from the pathname addresses of Unix domain sockets (ex. foo.sock). In order for the socket API to be as general as possible, many of the functions that are used when working with sockets take a pointer to struct sockaddr. This is just a general type: a pointer to one of the specific structures that hold specific types of addresses is cast to struct sockaddr * in the call to the function that requires it. Now let's look at the two most important specific address structures:

`struct sockaddr_in`

This structure is used for holding IPv4 addresses, and it has three fields:

struct sockaddr_in {
    sa_family_t    sin_family; // address family, will always be AF_INET for IPv4 addresses
    in_port_t      sin_port;   // port (in network byte order; see functions `htonl()` and `htons()` for more information)
    struct in_addr sin_addr;   // internet address; see function `inet_addr()` for more information
};
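
Filling in these fields might look like the following sketch. The helper name make_ipv4_addr() is hypothetical; the calls it makes (memset(), htons(), inet_addr()) are standard.

```c
#include <arpa/inet.h>    /* inet_addr, htons */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <stdint.h>
#include <string.h>       /* memset */
#include <sys/socket.h>   /* AF_INET */

/* Build an IPv4 socket address from a dotted-quad string and a port
 * given in host byte order. */
struct sockaddr_in make_ipv4_addr(const char *ip, uint16_t port) {
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));         /* clear padding and unused bytes */
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);            /* store port in network byte order */
    addr.sin_addr.s_addr = inet_addr(ip);   /* result is already network order */
    return addr;
}
```

A call such as make_ipv4_addr("127.0.0.1", 6101) produces a structure whose address can then be cast to struct sockaddr * and handed to the socket functions described later.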

`struct sockaddr_un`

This structure is used for holding Unix Domain socket addresses, and it has two fields:

struct sockaddr_un {
    sa_family_t  sun_family;  // address family, will always be AF_UNIX for Unix Domain socket addresses 
    char         sun_path[];  // string representing name of Unix Domain socket (ex. `foo.sock`)
};

For abstract sockets, the sun_path[] value is left untouched after the struct sockaddr_un variable is created.
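
For the ordinary pathname case, filling in a struct sockaddr_un might look like the sketch below. The helper name make_unix_addr() is hypothetical, and the length check is an added precaution (sun_path has a fixed, fairly small capacity) rather than something this page requires.

```c
#include <string.h>
#include <sys/socket.h>   /* AF_UNIX */
#include <sys/un.h>       /* struct sockaddr_un */

/* Build a Unix domain socket address for the given pathname.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int make_unix_addr(struct sockaddr_un *addr, const char *path) {
    if (strlen(path) >= sizeof(addr->sun_path)) return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);   /* safe: length was checked above */
    return 0;
}
```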

TCP vs. UDP

The differences between TCP and UDP are substantial. TCP is a connection-oriented, reliable, stream-based protocol; UDP is a connectionless, unreliable, datagram-based protocol. Under what circumstances would you prefer one over the other?

Well, when you need reliability, you will need to use TCP. That's just a given. If you want to use UDP, you would need to manually write a lot of code to ensure the reliability that you need, at which point you would basically just have written the TCP protocol. No need to reinvent the wheel, just use TCP.

When you need speed and dropping a few packets here and there is acceptable, you can't beat UDP. Since TCP needs to guarantee reliability, there is a lot of metadata that goes into sending TCP messages between two endpoints, which slows down the communication considerably, especially when network congestion is high.

Functions Used When Working With Sockets

The socket API is complex and has a lot of subtleties. Some of the below functions will be explained in great detail, others more briefly. Some of the functions are used for all sockets, some only for certain types of sockets, and some are used in different ways depending on the type of socket.

The `socket()` Function

The socket function has the following definition:

int socket(int domain, int type, int protocol);

This function creates a socket with the given parameters and returns the socket descriptor for the newly created socket to the caller. The first argument specifies which domain the socket will use to communicate; AF_UNIX for Unix Domain sockets, and AF_INET for both TCP and UDP sockets that use IPv4 addresses. The second argument specifies the type of communication the socket will use; SOCK_STREAM for stream-based communication, and SOCK_DGRAM for datagram-based communication. The third argument specifies the protocol to use (TCP, UDP, SCTP, etc.) but in most cases (and certainly in Runtime's use cases), given the domain and type values, the protocol can be determined (for example AF_INET and SOCK_STREAM implies the TCP protocol), in which case the protocol argument is just going to be 0.
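
As a small sketch of the domain/type combinations described above, the hypothetical helper below creates a socket with protocol 0 and then asks the kernel, through the standard getsockopt() option SO_TYPE, which type it ended up with.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Create a socket, query its type, then close it again.
 * Returns SOCK_STREAM or SOCK_DGRAM on success, -1 on failure.
 *
 *   AF_INET + SOCK_STREAM  -> TCP socket
 *   AF_INET + SOCK_DGRAM   -> UDP socket
 *   AF_UNIX + SOCK_STREAM  -> Unix domain stream socket
 *   AF_UNIX + SOCK_DGRAM   -> Unix domain datagram socket
 */
int socket_type_of(int domain, int type) {
    int actual = -1;
    socklen_t len = sizeof(actual);
    int fd = socket(domain, type, 0);   /* 0: infer protocol from domain + type */
    if (fd < 0) return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) < 0) actual = -1;
    close(fd);
    return actual;
}
```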

The `bind()` Function

The bind function has the following definition:

int bind(int socket, const struct sockaddr *address, socklen_t address_len);

This function binds a given socket descriptor (returned by a previous call to socket()) to the address specified by address (the second argument). The third argument is the length, in bytes, of the address structure that you passed in as the second argument (usually found by using the sizeof operator). If the socket being used has been created for TCP or UDP use, the second argument will be a pointer to a struct sockaddr_in with its fields populated with the appropriate values, cast to struct sockaddr * in the call to bind(); the third argument is, then, sizeof(struct sockaddr_in). Likewise, if the socket being used has been created for Unix sockets use, the second argument will be a pointer to a struct sockaddr_un with its fields populated, cast to struct sockaddr * in the call to bind(); the third argument is, then, sizeof(struct sockaddr_un).

A call to bind() will fail if the address and port (for TCP/UDP sockets) or socket path name (for Unix sockets) specified as the second argument is already in use by another process. Sometimes, if the process that was using that address and port previously exits prematurely or doesn't properly terminate, that socket and address will remain open and on the system for some predetermined "timeout time" (usually between 30 seconds and 2 minutes) before it is reclaimed by the operating system. If a bind() to that port and address occurs during this time, it will still fail, even though there is no process using it. The solution in this case is to wait for the timeout to expire before running your program again.
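
The "already in use" failure can be reproduced with the following sketch, which binds one TCP socket and then tries to bind a second one to exactly the same address and port. The function name is hypothetical. Binding to port 0 (which asks the operating system for any free port) and reading the chosen port back with getsockname() are conveniences assumed here so that the sketch does not depend on a particular port being free.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the errno of the second bind() (expected: EADDRINUSE),
 * 0 if it unexpectedly succeeded, or -1 if the setup itself failed. */
int second_bind_errno(void) {
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int result = -1;
    int first = socket(AF_INET, SOCK_STREAM, 0);
    int second = socket(AF_INET, SOCK_STREAM, 0);
    if (first < 0 || second < 0) goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = inet_addr("127.0.0.1");
    addr.sin_port = htons(0);               /* let the OS pick a free port */
    if (bind(first, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;

    /* Find out which port was chosen; addr now holds the bound address. */
    if (getsockname(first, (struct sockaddr *)&addr, &len) < 0) goto done;

    /* Same address, same port, different socket: this should fail. */
    if (bind(second, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        result = errno;
    else
        result = 0;
done:
    if (first >= 0) close(first);
    if (second >= 0) close(second);
    return result;
}
```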

Additional Remarks for TCP:

In normal circumstances, only the server end of a connection calls bind() in order to bind the socket created to listen to incoming requests; this socket is the one that is bound to a well-known address and port and therefore "visible" to the outside world. Incoming connections from clients will arrive at this socket. The client end of a connection does not call bind() on its socket; the client merely creates the socket and then makes a connection request to the (now visible) server socket using the connect() function (the next function covered). If the connection is accepted by the server, a full and stable TCP connection now exists between the client and the server (without any need for the client to also bind its socket).

Additional Remarks for UDP:

Since UDP is a connectionless protocol, the concepts of a "server" and a "client" are not as well-defined for UDP communication. UDP communication can only happen if the initial sender knows the location of the initial intended receiver, which means that the initial intended receiver must have bound their socket to some well-known address to make it "visible" to the outside world. Suppose you have two machines, machine A and machine B, that wish to communicate via UDP. It is known that machine A will reach out to machine B to initiate the exchange (i.e. the first message sent between the two machines is going to be from A to B). Therefore, machine B must make a socket and bind it to some well-known address so that machine A knows where to send that first message. In UDP, the functions for receiving messages also return the address of the machine that sent the message (see the function recvfrom()). Therefore, when machine B receives a message on its socket, it will now have machine A's address and can therefore specify machine A's address when sending a reply to machine A. So, there is no need for the "client" in a UDP exchange to also bind its socket (although it is more common to do so with UDP than with TCP).

Additional Remarks for Unix Domain Sockets:

To create a normal Unix domain socket and bind it to a pathname, specify a pathname when filling out the struct sockaddr_un. To create an abstract Unix domain socket (which does not have a pathname), simply create the struct sockaddr_un, specify the socket address family as AF_UNIX, and do not enter the pathname.
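
One way to get an operating-system-chosen abstract name on Linux is sketched below. This is Linux-specific and is an illustration only, not necessarily how Runtime does it: the function name is hypothetical, and the key assumption is the Linux autobind rule, under which passing an address length of sizeof(sa_family_t) to bind() makes the kernel assign a unique abstract name (one whose first sun_path byte is a null byte).

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind a Unix datagram socket without a pathname and report the name
 * the kernel picked. On success returns the socket descriptor and fills
 * *out and *out_len; returns -1 on failure. */
int bind_unnamed_unix_socket(struct sockaddr_un *out, socklen_t *out_len) {
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;          /* no pathname is entered */

    /* A length that covers only sun_family asks Linux to autobind. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(sa_family_t)) < 0) {
        close(fd);
        return -1;
    }

    /* Read back the name that was assigned. */
    *out_len = sizeof(*out);
    if (getsockname(fd, (struct sockaddr *)out, out_len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Two sockets created this way receive different names, which is what lets a server tell its clients apart.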

The `connect()` Function

The connect function has the following definition:

int connect(int socket, const struct sockaddr *address, socklen_t address_len);

This function is primarily used with stream-based sockets (i.e. TCP sockets and Unix Domain stream-based sockets). It is called from the client, and is used to make a connection request to a specified socket address and port. The specified socket address and port, given in the second argument, contains the address and port of the server to which the client is issuing a connection request. The specified socket in the first argument is the socket that was created on the client that will become the client end of the connection upon the server accepting the connection.

This function blocks, and returns once the connection to the specified address and port on the server side has been established (or the attempt has failed, in which case it returns -1). When this function returns successfully, a connection has been made between the specified socket and a socket on the server, and the client can now use its socket descriptor to communicate with the server.

The `accept()` Function

The accept function has the following definition:

int accept(int socket, struct sockaddr *restrict address, socklen_t *restrict address_len);

This function is only for use with stream-based sockets (i.e. TCP sockets and Unix Domain stream-based sockets). It is called from the server, and is used to accept an incoming connection request from a client. The first argument is the socket descriptor of the socket that has been bound by the server to the well-known, "visible" address (and set to be a listening socket; see the next function). The second argument here is not like the second arguments for bind() and connect(). In those functions, we populated the fields manually and then passed a pointer to the populated structure to the function in order to tell the function what address to bind to or which address to send a connection request to, respectively. In this function, we declare a variable of either type struct sockaddr_in or struct sockaddr_un (depending on what type of sockets are being used) and then pass in a pointer to that variable to accept(). accept() will fill in the fields of that variable with the address of the client that issued the connection request. In this way, we can perform some basic verification on the incoming connection request to see if the connection is what we expect.

The return value of this function is also not the same as any of the functions we have seen. accept() returns a socket descriptor that is connected to the client that made the connection request, and is used from this point on to communicate with the client. Make sure you understand this!! When a client makes a connection request to a server that has bound a socket to some well-known address, all incoming connection requests will go to this socket. This socket is not the socket that is used to talk to clients once their connection requests are accepted; it is the socket whose socket descriptor is returned by accept() that is used. This behavior makes sense; since TCP is a connection-oriented protocol and each connection must only have two ends, if the socket that was bound by the server to the well-known address and port was the one that became connected to the incoming client, the first connection request coming in would result in a connection between the server and client, but then all other incoming connection requests would be blocked until the server created another socket and bound it to the well-known address and port. You can imagine that this behavior doesn't scale well as the number of connection requests becomes large.

This function blocks, and only returns when there is at least one incoming connection request on the listening socket's file descriptor provided in the first argument.

The `listen()` Function

The listen function has the following definition:

int listen(int sockfd, int backlog);

This function is only for use with stream-based sockets (i.e. TCP sockets and Unix Domain stream-based sockets). It is called from the server, and is used to make the given socket descriptor a "listening" socket that listens for incoming connection requests from potential clients. After calling this function, the socket can now be used in calls to accept(). The second argument, backlog, is the maximum number of unprocessed incoming connection requests that can exist before incoming connection requests to that particular address are dropped. In other words, the second argument indicates the size of the queue for incoming connection requests on that socket.

This function does not block; it merely sets the specified socket into listening mode and does not do any waiting for incoming connection requests from clients (that is all done by calls to accept()).

The `sendto()` Function

The sendto function has the following definition:

ssize_t sendto(int socket, const void *message, size_t length, int flags, const struct sockaddr *dest_addr, socklen_t dest_len);

This function, though technically usable by all sockets, is only really used for sockets communicating with datagram-based protocols (i.e. UDP sockets and Unix datagram-based sockets). It is used by a datagram-based socket that wishes to send some data to a destination datagram-based socket. The first argument is the socket descriptor that the endpoint sending the data will use. The second argument is a pointer to the data to be sent (this is often a stream of bytes, like an array of uint8_t). The third argument is the number of bytes to send. The fourth argument is, for our purposes, always 0 (no special options or flags for sending). The fifth argument is a previously filled-out struct sockaddr_in or struct sockaddr_un specifying the address of the destination. And the sixth argument is the size of the destination address structure (so either sizeof(struct sockaddr_in) or sizeof(struct sockaddr_un)).

The `recvfrom()` Function

The recvfrom function has the following definition:

ssize_t recvfrom(int socket, void *restrict buffer, size_t length, int flags, struct sockaddr *restrict address, socklen_t *restrict address_len);

This function, like sendto(), is technically usable by all sockets, but is only really used for sockets communicating with datagram-based protocols (i.e. UDP sockets and Unix datagram-based sockets). It is used by a datagram-based socket to receive a message from some sender. The first argument is the socket descriptor of the socket receiving the message. The second argument is a pointer to a location into which the incoming message will be copied. The third argument is the number of bytes available at the location pointed to by the second argument (effectively, the size, in bytes, of the longest message that we are expecting to receive). The fourth argument is, for our purposes, always 0 (no special options or flags for receiving). The fifth argument is a pointer to a struct sockaddr_in or struct sockaddr_un into which the address of the sender will be copied when this function returns (this argument can be NULL if this information is not desired). And the sixth argument is a pointer to a variable that holds the size of the address structure given in the fifth argument; when the function returns, it holds the size of the address that was actually stored.

The `inet_addr()` and `inet_ntoa()` Functions

The inet_addr() and inet_ntoa() functions have the following definitions:

in_addr_t inet_addr(const char *cp);
char *inet_ntoa(struct in_addr in);

These two functions are useful for converting IPv4 addresses between the human-readable string form (ex. the string "192.168.0.24") and the form that is used to represent IP addresses internally in the struct sockaddr_in address structure. Note that the string holds only the address; the port (ex. 6101) is stored separately in the sin_port field. inet_addr() converts from the string to the internal representation (and is usually used to set the sin_addr field of struct sockaddr_in). inet_ntoa() converts from the internal representation to the string (and is usually used to print out IP addresses in a human-readable way).
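
A round trip through both functions might look like the sketch below. The helper name ip_round_trip() is hypothetical. Two details are worth knowing: inet_addr() signals an invalid string by returning INADDR_NONE, and inet_ntoa() returns a pointer to a static buffer that later calls overwrite, so the result is copied out.

```c
#include <arpa/inet.h>    /* inet_addr, inet_ntoa */
#include <netinet/in.h>   /* struct in_addr, INADDR_NONE */
#include <string.h>

/* Convert a dotted-quad string to its internal form and back to text.
 * Writes the text into out; returns 0 on success, or -1 if the input
 * is not a valid IPv4 address or out is too small. */
int ip_round_trip(const char *text, char *out, size_t cap) {
    struct in_addr addr;
    const char *back;
    addr.s_addr = inet_addr(text);      /* string -> network-order integer */
    if (addr.s_addr == INADDR_NONE) return -1;
    back = inet_ntoa(addr);             /* integer -> static string buffer */
    if (strlen(back) >= cap) return -1;
    strcpy(out, back);                  /* copy: the static buffer is reused */
    return 0;
}
```

Passing "192.168.0.24" succeeds and yields the same text back, while passing a string that includes a port, such as "192.168.0.24:6101", is rejected.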

The `htons` and `htonl` Functions

The htons and htonl functions have the following definitions:

uint32_t htonl(uint32_t hostlong);
uint16_t htons(uint16_t hostshort);

These two functions convert integers (either uint32_t for htonl() or uint16_t for htons()) from their representation on the Host machine to their representation on the Network. This helps explain the function names:

htonl: "Host TO Network, Longs"
htons: "Host TO Network, Shorts"

These functions are necessary because different machines have different ways of representing data; machines can be either big-endian or little-endian (we won't explain this here; you can read more about it elsewhere, but the point is that they are incompatible ways of representing data). But, since a connection could exist between a machine that is big-endian and a machine that is little-endian, there needed to be a standard for how to represent data on the network. Otherwise, one machine would send a stream of bytes to the other machine and it would interpret those bytes incorrectly. It was settled that data would be represented in big-endian over the network. These functions, then, do absolutely nothing on big-endian machines, but reverse the order of the bytes on little-endian machines to make the number big-endian instead. They are used for converting the address and port numbers when filling in struct sockaddr_in variables.
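
The effect of htons() can be made visible by looking at the individual bytes of its result, as in this small sketch (the helper name port_wire_bytes() is hypothetical). Port 6101 is 0x17D5 in hexadecimal, so in network byte order the byte 0x17 comes first.

```c
#include <arpa/inet.h>   /* htons */
#include <stdint.h>
#include <string.h>      /* memcpy */

/* Copy the in-memory bytes of htons(port) into out[0] and out[1].
 * Because the value is in network byte order, out[0] is always the
 * most significant byte, whatever the host's own byte order is. */
void port_wire_bytes(uint16_t port, uint8_t out[2]) {
    uint16_t net = htons(port);
    memcpy(out, &net, sizeof(net));
}
```

The matching functions ntohs() and ntohl() convert in the opposite direction, from network to host byte order.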

Lifetime Of A Communication Pathway (Concrete)

Now that we have an understanding of the various types of sockets and the functions used to work with them, we will step through the process of establishing a communication pathway between two endpoints, for both stream-based sockets and datagram-based sockets.

Stream-based

| | Server | | Client |
| --- | --- | --- | --- |
| | SETUP | | SETUP |
| `socket()` | Creates the socket | `socket()` | Creates the socket |
| `bind()` | Binds to well-known address | `bind()` | (Optional) |
| `listen()` | Sets to listening mode | `connect()` | Issues connection request to server |
| `accept()` | Waits for incoming connection request | | |
| | BLOCKING UNTIL CONNECTION ESTABLISHED | | BLOCKING UNTIL CONNECTION ESTABLISHED |
| `read()` | Receives incoming data from client | `write()` | Writes data to server |
| `write()` | Writes data to client | `read()` | Receives incoming data from server |
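The stream-based sequence above can be sketched in C roughly as follows, using TCP over the loopback interface. This is only an illustration under some assumptions: the function names are invented, the server binds to port 0 so the kernel picks a free port (a real server would bind to a fixed, well-known port), and error handling is kept minimal.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Server column: socket() -> bind() -> listen().
// Reports the kernel-chosen port through *port_out.
// Returns the listening socket descriptor, or -1 on error.
int stream_server_setup(uint16_t *port_out) {
    assert(port_out != NULL);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  // 127.0.0.1
    addr.sin_port = htons(0);                       // assumption: any free port

    socklen_t len = sizeof(addr);
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

// Client column: socket() -> connect(); the optional bind() is skipped.
// Returns the connected socket descriptor, or -1 on error.
int stream_client_connect(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After setup, the server would call accept() on the listening descriptor to obtain a new descriptor for the connection, and both sides would then use read() and write() on their descriptors. Normally the two functions run in different processes, but they can also be exercised from one process, because the kernel completes and queues the connection in the listen backlog before accept() is called.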

Datagram-based

| | Initial Intended Receiver ("Server") | | Initial Sender ("Client") |
| --- | --- | --- | --- |
| | SETUP | | SETUP |
| `socket()` | Creates the socket | `socket()` | Creates the socket |
| `bind()` | Binds to well-known address | `bind()` | (Optional) |
| | COMMUNICATION | | COMMUNICATION |
| `recvfrom()` | Receives incoming data from client; obtains client address | `sendto()` | Writes data to server; specifies server address |
| `sendto()` | Writes data to client; specifies address obtained from `recvfrom()` | `recvfrom()` | Receives incoming data from server |
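The datagram-based sequence above might look like the following sketch, using UDP over the loopback interface. As before, the function names are hypothetical, the "server" binds to port 0 so the kernel picks a free port (a real receiver would use a well-known port), and the "server" simply echoes back whatever it receives.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// "Server" setup: socket() -> bind(). Reports the chosen port via *port_out.
// Returns the socket descriptor, or -1 on error.
int dgram_bound_socket(uint16_t *port_out) {
    assert(port_out != NULL);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  // 127.0.0.1
    addr.sin_port = htons(0);                       // assumption: any free port

    socklen_t len = sizeof(addr);
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

// "Client" send: sendto() with the server's address specified explicitly.
ssize_t dgram_send_to_port(int fd, uint16_t port, const void *data, size_t len) {
    struct sockaddr_in dest;
    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dest.sin_port = htons(port);
    return sendto(fd, data, len, 0, (struct sockaddr *) &dest, sizeof(dest));
}

// "Server" communication: recvfrom() obtains the sender's address, and
// sendto() replies to that same address. Returns bytes echoed, or -1 on error.
ssize_t dgram_echo_once(int fd) {
    char buf[512];
    struct sockaddr_in sender;
    socklen_t sender_len = sizeof(sender);

    ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                         (struct sockaddr *) &sender, &sender_len);
    if (n < 0) {
        return -1;
    }
    return sendto(fd, buf, (size_t) n, 0, (struct sockaddr *) &sender, sender_len);
}
```

Notice that there is no listen(), accept(), or connect() anywhere: the client never binds explicitly (the kernel assigns it an address on its first sendto()), and the server learns where to reply only from the address filled in by recvfrom().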

Use in Runtime

Sockets are used pretty much everywhere in Runtime. The following briefly touches on where sockets are used; for more information on the details, please visit their respective wiki pages.

`net_handler`

Sockets are used in net_handler to talk to Dawn and Shepherd via network connections. A TCP connection is used with Shepherd, and both a TCP connection and UDP communication are used with Dawn. The UDP communication is used for transmitting gamepad data from Dawn to Runtime, and device data from Runtime to Dawn. Both of these kinds of data are sent at very high rates across the network, so it is acceptable for a packet or two to be dropped. All other communication is done via the TCP connections, because those messages need to be sent reliably (for example, the message for sending the robot into AUTO mode needs to be reliably sent!). There is also a Unix Domain datagram-based socket that connects net_handler with executor for the passing of challenge data between the two processes.

`executor`

As mentioned above, executor uses a Unix Domain datagram-based socket that connects it to net_handler for receiving and sending challenge inputs and outputs, respectively.

`net_handler_client`

net_handler_client uses sockets to connect to all of the outputs of the Runtime net_handler process in order to route all of net_handler's output to standard output. It simultaneously acts like a connected Shepherd and connected Dawn to provide inputs and read outputs from net_handler.

`dev_handler_client`

dev_handler_client needs to mimic lowcar devices connecting and disconnecting from Runtime. Since lowcar devices communicate via serial ports, which are essentially just raw byte streams, dev_handler_client uses Unix Domain stream-based sockets to "trick" dev_handler into thinking that it's talking with real Arduino devices.
