
Sockets

benliao1 edited this page Aug 18, 2020 · 11 revisions

What are Sockets?

Sockets are a very general way for two (or more, depending on the type of socket) endpoints (code that generates or consumes data) to communicate with each other in both directions (as opposed to pipes, which are historically one-directional). We say "very general" because the communication could be between endpoints within the same process, between endpoints in different processes on the same machine, or between endpoints in different processes on different machines (that are connected over some network, such as your local Wi-Fi network or the Internet).

Generally, each endpoint creates an object known as a socket that it will use to communicate. Then, depending on the type of socket (more on that later), a series of steps is taken to make the socket "visible" to the outside world (this series of steps and the meaning of being "visible" depend on the nature of the socket and on which end of the connection that socket will take on).

A socket is written to and read from through a handle known as a socket descriptor, which for all intents and purposes behaves like a file descriptor that you can write to and read from.
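To make the "socket descriptor is a file descriptor" point concrete, here is a minimal sketch in Python, whose socket module wraps the same system calls. It assumes a Unix-like system; the variable names are ours, and socketpair() is used only because it hands back two already-connected sockets without any setup.

```python
import os
import socket

# socketpair() returns two sockets that are already connected to each other.
left, right = socket.socketpair()

# fileno() exposes the underlying descriptor, which is just a small integer.
fd_left, fd_right = left.fileno(), right.fileno()

# The generic file-descriptor calls work on a socket descriptor directly.
os.write(fd_left, b"hello")
data = os.read(fd_right, 5)

left.close()
right.close()
```

Here the bytes written with os.write on one descriptor are read back with os.read on the other, without calling any socket-specific send or receive function.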

Types of Sockets

There are three "domains" of communication that sockets can be used with: TCP, UDP, and Unix Domain Sockets (or just Unix sockets). We now do a deep-dive into these three domains, which differ in substantial ways.

TCP (Transmission Control Protocol)

As the name suggests, TCP is a protocol that controls/governs how some data is transmitted between two endpoints on a network. For a TCP-compliant connection (or just a "TCP connection"), there are exactly two endpoints which can send messages to and receive messages from each other; we say that TCP is a connection-oriented protocol because messages cannot be transmitted using TCP if there does not exist a connection between two endpoints. The protocol guarantees that the data is transmitted reliably between the two endpoints, i.e. when the sender sends some data, the receiver will receive that data in the same order that it was sent and will not receive duplicate data. We say that TCP is a reliable protocol.

When data is transmitted from one endpoint to another endpoint, it can be treated as a continuous stream of data. This has some significant consequences! If one endpoint sends two messages in quick succession to the other endpoint, such that the two messages are packed one after the other with no breaks in between, the receiving end will not be able to tell where in the received stream the first message ends and the second message begins! In other words, TCP does not preserve message boundaries; it does not include message delimiters. When we use TCP for communication, we must be sure to deal with this problem, especially when our messages are of variable length. The two sides must agree on how to communicate the length of incoming messages to each other. Two common ways to do this are:

  • Agreeing on a specified delimiter sequence that marks the location where one message ends and another begins
  • Explicitly sending the length, in bytes, of the incoming message prior to sending the message so that the receiver knows how many bytes to read from the socket to obtain the entire message.
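As an illustration of the second approach, here is a minimal length-prefix sketch in Python. The helper names (send_msg, recv_exact, recv_msg) and the 4-byte header size are our own choices for this example, not anything Runtime defines, and a connected stream socket pair stands in for a real TCP connection.

```python
import socket
import struct

# Hypothetical wire format: a 4-byte unsigned length in network byte order,
# followed by that many payload bytes.
HEADER = struct.Struct("!I")

def send_msg(sock, payload):
    """Send one message: length header first, then the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exact(sock, n):
    """Keep reading until exactly n bytes have arrived."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    """Read the header to learn the length, then read exactly that much."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

# Two messages sent back to back are still recovered separately.
sender, receiver = socket.socketpair()
send_msg(sender, b"first")
send_msg(sender, b"second message")
messages = [recv_msg(receiver), recv_msg(receiver)]
sender.close()
receiver.close()
```

The loop in recv_exact matters because a single read on a stream socket may return fewer bytes than requested.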

When a socket is going to be used to make a TCP connection, it is declared as a stream socket (because of the stream nature of TCP communication) with the constant SOCK_STREAM. The sequence of steps needed to establish a connection between two TCP sockets starts with one socket connecting to another socket, which is running on a machine with a known address and is bound to a specific port. The address of the machine is (in Runtime and in PiE in general) an IPv4 (Internet Protocol, version 4) address, something like 192.168.0.24. The endpoint that is issuing the connection request is usually called the client, and the endpoint that is receiving the connection request is usually called the server. When the server accepts an incoming connection request from the client, a TCP connection has now been formed between the two endpoints, and writing data from one endpoint results in that data being made available to the other endpoint to read. Fantastic!

Recall that the sockets API is meant to be used between any two endpoints; the endpoints could be within the same process, in different processes on the same machine, or on different machines. The way this is accomplished is with the IPv4 (sometimes abbreviated as just IP) address. On a local network, each machine is assigned an address from a private range, very commonly 192.168.0.*. For example, the devices connected to your Wi-Fi network at home (which is a local network) are typically assigned IP addresses of the form 192.168.0.*. Traffic to and from the public Internet uses a separate, public IP address. In PiE, since the computers running Shepherd, Dawn, and Runtime (the Raspberry Pis) are all connected to the same local network, we use 192.168.0.* to communicate with Shepherd and Dawn in production. However, for endpoints on the same machine, there is another special IP address: 127.0.0.1 (a.k.a. "localhost" or "loopback address"), which always points back to the device a process is running on. On my laptop, 127.0.0.1 points to my laptop; on your laptop, 127.0.0.1 points to your laptop, etc. This is how we test Runtime (and, generally, how processes on the same machine can communicate with each other through the sockets API). By telling net_handler to send all of its output to various ports on the 127.0.0.1 address, we can capture all of its output locally on the Raspberry Pi without needing to connect the real Shepherd or the real Dawn over a network to the Raspberry Pi.
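The connection sequence described in this section can be sketched as follows, in Python, with both endpoints in one process talking over the loopback address. This is only an illustration: the variable names are ours, and binding to port 0 (which asks the operating system for any free port) is a convenience for the example, where a real server would bind to a known port.

```python
import socket

# Server side: create a stream socket, bind it to an address and port,
# and start listening for connection requests.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

# Client side: create a stream socket and issue the connection request.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))

# The server accepts the request; conn is the server's end of the connection.
conn, client_addr = server.accept()

# Data written on one end becomes readable on the other, in both directions.
# (Real code should loop on recv, since it may return fewer bytes than asked.)
client.sendall(b"ping")
received = conn.recv(4)
conn.sendall(b"pong")
reply = client.recv(4)

for s in (conn, client, server):
    s.close()
```

Note that accept returns a new socket for the connection; the listening socket stays available for further connection requests.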

UDP (User Datagram Protocol)

UDP is another protocol that governs/controls the transmission of data into and out of an endpoint on a network. UDP is an unreliable, connectionless protocol (as opposed to TCP, which is reliable and connection-oriented). Let's explore what this means.

With TCP, a concrete connection exists between two endpoints; data sent out from one endpoint is reliably transmitted and received in the order that it was sent by the other endpoint. When sending a message using UDP, the endpoint uses its socket (configured to use UDP) to send a message to some IP address and port number, which together identify the intended receiver of the message. However, there is no notion of a connection between the sender and the intended receiver; indeed, the intended receiver doesn't even have to exist, in which case the message that was sent out is simply dropped and lost forever. Even when both the sender and intended receiver exist and messages are being delivered successfully, delivery is not guaranteed to be reliable. In other words, if the sender sends message A first, and then message B next, it is possible for the receiver to receive message B first, then message A next, and then a duplicate message A after that. This is simply due to network unreliability and the nature of UDP (it does not offer reliability). The probability of these events happening is low, especially with low network congestion and when the receiver can process incoming messages at a rate faster than the rate at which the messages are arriving.

Another consequence of there being no notion of a "connection" between the sender and the intended receiver is that multiple senders can exist for the same intended receiver. There is nothing wrong with having two endpoints both sending messages to a third endpoint.

Lastly, UDP is not a stream-based protocol like TCP is. Rather, UDP is a datagram-based protocol, hence the name. A datagram is a "packet" or "container" that stores the contents of one message. Each time the sender sends data to the intended receiver, it calls a function. Each time that function is called, the data that it was called with is packaged into a datagram and sent out onto the network to the intended receiver. So, while UDP is unreliable, it is the delivery of whole packets that is unreliable, not the contents of the messages inside them. If a packet arrives at the intended receiver, it arrives whole, and the UDP checksum makes it very unlikely that corrupted contents are delivered. However, it is not guaranteed that a packet sent from a sender will arrive at the intended receiver successfully, or that packets sent from the sender will arrive at the intended receiver in the same order that they were sent, or that packets will not be duplicated on their way to the intended receiver.

A consequence of UDP using datagrams to transmit messages instead of a byte-stream like TCP is that messages sent using UDP are message delimited: each datagram corresponds to one message, and the receiver knows when one message ends and the next begins. This means that we do not need to worry about sending the length of the message along with the message itself, or about agreeing on some sequence of bytes that marks where one message ends and another one starts.

When a socket is going to be used for transmitting or receiving messages using UDP, it is first created as a datagram socket (due to the datagram nature of UDP) using the constant SOCK_DGRAM. Once this is done, the program can immediately begin to send and receive datagrams on that socket; there is no sequence of steps to establish a connection between a server and a client as with TCP, since there is no notion of a connection with UDP.
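Here is a minimal sketch of this in Python, again over the loopback address, with two senders and one receiver (all names are ours). It shows that no connection is set up and that each sendto call arrives as its own datagram. On loopback the datagrams are delivered in practice, but the code still sets a timeout and does not assume any arrival order, since UDP promises neither delivery nor ordering.

```python
import socket

# Receiver: create a datagram socket and bind it so senders can address it.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
receiver.settimeout(2.0)         # fail instead of hanging if a datagram is lost
dest = receiver.getsockname()

# Two independent senders; neither one connects to the receiver first.
sender_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_a.sendto(b"from A", dest)
sender_a.sendto(b"A again", dest)
sender_b.sendto(b"from B", dest)

# Each recvfrom call returns exactly one datagram plus the sender's address.
datagrams = [receiver.recvfrom(1024) for _ in range(3)]
payloads = sorted(payload for payload, _ in datagrams)
sender_addrs = {addr for _, addr in datagrams}

for s in (sender_a, sender_b, receiver):
    s.close()
```

The two messages from sender_a are never merged into one read, which is the message-delimited behavior described above.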

The same special IP addresses described in the "TCP" section apply to UDP socket addresses as well: 192.168.0.* is used for communicating between machines on the same local network, and 127.0.0.1 is used for communicating between endpoints on the local machine. A point to note: a UDP socket and a TCP socket are allowed to be bound to the same address and the same port, and they are two completely separate objects from the operating system's point of view. In other words, a program can open up a stream socket and bind it to 127.0.0.1:8000 (port 8000 on my machine) and the operating system will know to send incoming TCP data on port 8000 to this socket. That same program can open up a datagram socket and bind it to 127.0.0.1:8000 (the exact same port and address on my machine) and the operating system will know to send incoming UDP datagrams on port 8000 to this socket. Incoming messages on one socket do not affect incoming messages on the other socket.
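The following Python sketch illustrates the point about shared port numbers. Instead of the port 8000 used in the text, it lets the operating system pick a free TCP port and then binds a UDP socket to the same number; this assumes that no other program happens to hold that UDP port, in which case the second bind would fail. All names are ours.

```python
import socket

# Stream (TCP) socket bound to a port chosen by the OS.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.bind(("127.0.0.1", 0))
tcp.listen(1)
port = tcp.getsockname()[1]

# Datagram (UDP) socket bound to the very same address and port number.
# Assumption: no other process is already using this UDP port.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.bind(("127.0.0.1", port))
udp.settimeout(2.0)
same_address = udp.getsockname() == tcp.getsockname()

# A datagram sent to that port is delivered to the UDP socket...
probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
probe.sendto(b"datagram", ("127.0.0.1", port))
udp_payload, _ = udp.recvfrom(1024)

# ...while a TCP connection to that port reaches the stream socket.
client = socket.create_connection(("127.0.0.1", port), timeout=2.0)
conn, _ = tcp.accept()
client.sendall(b"stream")
tcp_payload = conn.recv(1024)

for s in (probe, client, conn, udp, tcp):
    s.close()
```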

Unix Domain Sockets

The most important thing to know about Unix domain sockets is that they cannot be used between processes on different machines. They are kind of like a cross between a pipe and a socket.

Here are some ways that they are like a socket:

  • Two-way communication
  • Supports SOCK_DGRAM and SOCK_STREAM (datagram-based transmission and stream-based transmission, both of which are reliable)
  • Used and coded like a socket

Here are some ways that they are like a pipe:

  • Only allowed between processes on the same machine
  • (Usually) has a name in the file system (this is like a FIFO and not like a pure pipe)
  • Does not use any underlying protocol (IP, TCP, UDP) to process, wrap, pack, or unpack the sent messages; they are simply sent as essentially raw bytes from one endpoint to another

When creating a Unix domain socket endpoint, a socket is first created as either a stream-based (SOCK_STREAM) or datagram-based (SOCK_DGRAM) socket. Then, depending on which type of socket was specified, a connection sequence is initiated from a client to the server (for stream-based), or the new socket is immediately used to send and receive datagrams (for datagram-based). Notice the similarities to using TCP and UDP sockets, respectively. The "address" of a Unix domain socket is not an IP address like 127.0.0.1; rather, it is a pathname, such as /foo.sock. This file will be created in the filesystem when the socket is bound to that "address".
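As a sketch of the stream-based case in Python (assuming a Unix-like system), the server below binds to a pathname and the client connects to that same pathname. The file name foo.sock is placed inside a temporary directory chosen for the example, rather than at the filesystem root, so that the script can run without special permissions; all names are ours.

```python
import os
import socket
import tempfile

# Hypothetical location for the socket file, inside a fresh temp directory.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "foo.sock")

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
existed_before = os.path.exists(path)
server.bind(path)                 # binding is what creates the file
existed_after = os.path.exists(path)
server.listen(1)

# The client connects to the pathname, just as a TCP client connects
# to an IP address and port.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)
conn, _ = server.accept()

client.sendall(b"hello over a path")
received = conn.recv(1024)

for s in (conn, client, server):
    s.close()

# Closing the socket does not remove the file, so clean it up ourselves.
os.unlink(path)
os.rmdir(workdir)
```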

For Unix domain sockets, it is not necessary for a client to bind its own socket to an "address" (file name) on the file system. Though binding is usually done when a client and server communicate over Unix domain sockets, there is some special behavior that occurs when the client does not bind its socket to a path after creating it. We use this behavior in Runtime, so it would be helpful to cover it here.

If a Unix domain socket is created and then bound without a specific file path name, it becomes what's known as an abstract socket (this is a Linux feature). The operating system chooses an internal name for the abstract socket that is unique and that the user does not need to know. This allows arbitrary programs to bring up Unix sockets and use them to send data to a server that is bound to a well-known address; since the assigned names of the abstract sockets are guaranteed to be unique, the server can identify which client sent each piece of incoming data and knows where to send its response.
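A rough sketch of this pattern in Python follows. It is Linux-only, and it assumes a recent CPython release, where binding an AF_UNIX socket to the empty string asks the kernel to assign an abstract name automatically. The server path and all variable names are ours, and this is not necessarily how Runtime sets up its sockets.

```python
import os
import socket
import tempfile

# Server: datagram socket bound to a well-known path (hypothetical location).
workdir = tempfile.mkdtemp()
server_path = os.path.join(workdir, "server.sock")
server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
server.bind(server_path)
server.settimeout(2.0)

# Client: bind without naming a path. On Linux the kernel assigns an
# abstract name, which starts with a null byte and has no file on disk.
client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
client.bind("")
client.settimeout(2.0)
client_name = client.getsockname()

# The client only needs to know the server's well-known address.
client.sendto(b"request", server_path)

# recvfrom reports the sender's name, so the server can answer that client.
payload, sender = server.recvfrom(1024)
server.sendto(b"reply", sender)
reply, _ = client.recvfrom(1024)

client.close()
server.close()
os.unlink(server_path)
os.rmdir(workdir)
```

The server never had to be told the client's name in advance; it learned it from the incoming datagram.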

Additional Comments On All Sockets

For TCP and Unix domain stream-based sockets, when one end of a connection is closed, any attempt to read from the end that is still open will return zero bytes, which signals EOF (end-of-file), once that end has processed all prior incoming data from the end that just closed. There is no special EOF character in the data itself. This lets the receiver know that the other end has closed, and that it should therefore start cleanup or termination operations.
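A small Python sketch of this behavior, using a connected stream socket pair in place of a real client and server (names are ours): data sent before the close is still delivered, and only then does a read report EOF.

```python
import socket

a, b = socket.socketpair()

# One end sends some data and then closes.
a.sendall(b"last words")
a.close()

# The open end first receives the data that was already in flight...
first = b.recv(1024)

# ...and after that, a read returns zero bytes, which means EOF.
second = b.recv(1024)
if second == b"":
    b.close()  # the peer is gone, so begin cleanup
```

In Python the zero-byte read shows up as an empty bytes object; in C, the equivalent read or recv call returns 0.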

However, since UDP and Unix domain datagram-based sockets are connectionless, closing all writers to a particular address will not "close the connection", since there is no concept of a connection. So, a socket using a datagram-based protocol will never receive an EOF; it is up to the programmer to decide when the socket is no longer needed and to close it.

Functions Used When Working With Sockets

The socket API is complex and has a lot of subtleties. Some of the functions below will be explained in great detail, others more briefly; some are used for all sockets, some only for certain types of sockets, and some are used in different ways depending on the type of socket.

socket

connect

accept

listen

bind

sendto

recvfrom

ip_addr

htons

htonl

Use in Runtime

net_handler

executor

net_handler_client

dev_handler_client

Additional Resources