The problem
The etherparse family of from_* functions parses a given u8 slice "downward" by design. This generally means that if etherparse knows the format of a payload at any point in the parsing process, it will parse it.
At first glance, this feature sounds great. One of the central tenets of the etherparse library is that it places a particular emphasis on the most popular packet-based protocols, and downwards parsing does just that. Suppose, for example, that you had a u8 slice containing a set of nested packets that all came from popular networking protocols, e.g. Ethernet -> IP -> TCP. With downward parsing, a Rust programmer could point a single from_ethernet function at this slice, and parse all three.
At the same time, the current implementation of downward parsing inhibits the use of etherparse for unpopular or custom protocols. Suppose, now, that you wanted to use etherparse to implement an IP router that receives IP packets on a set of interfaces and forwards them to their intended destinations. You should not have to care about the payload format. For each u8 slice that comes in on an interface, you would naturally reach for the from_ip function to parse it into a SlicedPacket. The problem is the payload pointer. I think most Rust programmers in this situation would assume that payload always points to the payload of the IP packet. But it does not. By design, if etherparse recognizes the format of the IP packet's payload as another packet (e.g. a TCP packet), payload instead points to the payload of that inner packet, because etherparse has gone ahead and eagerly parsed it for you.
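To make the mismatch concrete, here is a minimal sketch that does not use etherparse at all. The helpers ipv4_payload, tcp_payload, and sample_packet are hypothetical names invented for this illustration; they hand-parse just enough of the IPv4 and TCP headers to show the two different slices that "payload" could refer to.

```rust
/// Hypothetical helper (not part of etherparse): returns the IPv4 payload,
/// i.e. the bytes between the end of the IPv4 header and the total length,
/// without looking inside it.
fn ipv4_payload(packet: &[u8]) -> Option<&[u8]> {
    let first = *packet.first()?;
    // High nibble is the version, low nibble is the header length in 32-bit words.
    let header_len = usize::from(first & 0x0f) * 4;
    if first >> 4 != 4 || header_len < 20 {
        return None;
    }
    let total_len = usize::from(u16::from_be_bytes([*packet.get(2)?, *packet.get(3)?]));
    packet.get(header_len..total_len)
}

/// Hypothetical helper: returns the TCP payload of a TCP segment.
fn tcp_payload(segment: &[u8]) -> Option<&[u8]> {
    // High nibble of byte 12 is the data offset in 32-bit words.
    let data_offset = usize::from(*segment.get(12)? >> 4) * 4;
    if data_offset < 20 {
        return None;
    }
    segment.get(data_offset..)
}

/// Builds a 44-byte sample: 20-byte IPv4 header, 20-byte TCP header, 4 data bytes.
/// Checksums are left as zero because nothing in this sketch validates them.
fn sample_packet() -> Vec<u8> {
    let mut packet = vec![
        0x45, 0, 0, 44, // version/IHL, DSCP, total length = 44
        0, 0, 0, 0, // identification, flags, fragment offset
        64, 6, 0, 0, // TTL, protocol = 6 (TCP), checksum
        10, 0, 0, 1, // source address
        10, 0, 0, 2, // destination address
    ];
    let mut tcp_header = [0u8; 20];
    tcp_header[12] = 0x50; // data offset = 5 words, no options
    packet.extend_from_slice(&tcp_header);
    packet.extend_from_slice(b"data");
    packet
}

fn main() {
    let packet = sample_packet();
    // What an IP router wants: everything after the IP header.
    let ip = ipv4_payload(&packet).expect("valid IPv4 packet");
    // What eager downward parsing ends up at: the bytes after the TCP header.
    let tcp = tcp_payload(ip).expect("valid TCP segment");
    println!("IP payload: {} bytes, TCP payload: {} bytes", ip.len(), tcp.len());
}
```

A router forwarding this packet cares about the first slice (TCP header plus data), while the behavior described above would hand back the second, shorter one.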
In summary:
If you want to use etherparse to help you implement some part of the network layer -- in this specific case, an IP router -- you might have to violate the separation of concerns that is characteristic of the OSI model and dip into the transport layer.
While the current implementation of downward parsing makes it easy to use etherparse for popular protocols, it does so at the cost of introducing what I would characterize as inconsistent and confusing behavior that makes it harder to use etherparse for unpopular protocols.
Possible solutions
As with #26, I am happy to do the work of implementing the solution to this problem. However, before I do, I want to discuss how I should go about solving it. Among others, I see at least two solutions:
Do away with "downward" parsing.
Split payload into separate pointers, one per parsed layer, so that each one always points to the payload of the same layer.
I am personally in favor of the first solution, especially because even the packets of popular protocols do not always follow the particular "downward" nesting order (Ethernet -> IP -> TCP) that etherparse assumes they do. For example, it is perfectly acceptable (and perhaps even somewhat common) to nest an IP packet inside of a TCP packet. (See Tunneling) Likewise, I think that it is perfectly reasonable to ask Rust programmers to parse each payload themselves.
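As a rough sketch of what the first solution could look like from the caller's side, the following parses one layer at a time and leaves the decision to go deeper with the programmer. Everything here (Layer, Ipv4Info, TcpInfo, parse_ipv4, parse_tcp, ipv4_packet) is a hypothetical name made up for this example and is not etherparse's API; protocol number 253 is one of the values reserved for experimentation and stands in for a custom protocol.

```rust
/// Hypothetical type: one parsed header plus the bytes that follow it, left unparsed.
#[derive(Debug)]
struct Layer<'a, H> {
    header: H,
    payload: &'a [u8],
}

#[derive(Debug, PartialEq)]
struct Ipv4Info {
    protocol: u8,
    ttl: u8,
}

#[derive(Debug, PartialEq)]
struct TcpInfo {
    source_port: u16,
    destination_port: u16,
}

/// Parses only the IPv4 header; the payload is returned untouched.
fn parse_ipv4(bytes: &[u8]) -> Option<Layer<'_, Ipv4Info>> {
    let first = *bytes.first()?;
    let header_len = usize::from(first & 0x0f) * 4;
    if first >> 4 != 4 || header_len < 20 {
        return None;
    }
    let total_len = usize::from(u16::from_be_bytes([*bytes.get(2)?, *bytes.get(3)?]));
    let header = Ipv4Info { ttl: *bytes.get(8)?, protocol: *bytes.get(9)? };
    Some(Layer { header, payload: bytes.get(header_len..total_len)? })
}

/// Parses only the TCP header; called only if the caller chooses to.
fn parse_tcp(bytes: &[u8]) -> Option<Layer<'_, TcpInfo>> {
    let data_offset = usize::from(*bytes.get(12)? >> 4) * 4;
    if data_offset < 20 {
        return None;
    }
    // Indexing is safe here: byte 12 exists, so bytes 0..4 do too.
    let header = TcpInfo {
        source_port: u16::from_be_bytes([bytes[0], bytes[1]]),
        destination_port: u16::from_be_bytes([bytes[2], bytes[3]]),
    };
    Some(Layer { header, payload: bytes.get(data_offset..)? })
}

/// Test helper: wraps `payload` in a 20-byte IPv4 header (checksum left as zero).
fn ipv4_packet(protocol: u8, payload: &[u8]) -> Vec<u8> {
    let [len_hi, len_lo] = ((20 + payload.len()) as u16).to_be_bytes();
    let mut packet = vec![
        0x45, 0, len_hi, len_lo, 0, 0, 0, 0, 64, protocol, 0, 0, 10, 0, 0, 1, 10, 0, 0, 2,
    ];
    packet.extend_from_slice(payload);
    packet
}

fn main() {
    // A custom protocol: the caller gets the IP payload and stops there.
    let custom = ipv4_packet(253, b"custom");
    let ip = parse_ipv4(&custom).expect("valid IPv4 packet");
    println!("{:?}, {} payload bytes", ip.header, ip.payload.len());

    // A TCP segment: the caller opts in to parsing one more layer.
    let mut segment = vec![0u8; 20];
    segment[..2].copy_from_slice(&49152u16.to_be_bytes());
    segment[2..4].copy_from_slice(&80u16.to_be_bytes());
    segment[12] = 0x50; // data offset = 5 words
    segment.extend_from_slice(b"data");
    let packet = ipv4_packet(6, &segment);
    let ip = parse_ipv4(&packet).expect("valid IPv4 packet");
    if ip.header.protocol == 6 {
        let tcp = parse_tcp(ip.payload).expect("valid TCP segment");
        println!("{:?}, {} payload bytes", tcp.header, tcp.payload.len());
    }
}
```

With this shape, payload means the same thing at every layer, and tunneled or unrecognized contents are simply whatever bytes the caller decides to interpret next.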
karpawich changed the title from '"Downwards" parsing (currently) makes it difficult to work with unpopular protocols.' to '"Downwards" parsing (currently) makes it difficult to work with unpopular protocols' on Jul 18, 2022