
TLV Parser Instructions: The World’s FASTEST Way to Parse TLVs!

Tom Herbert, SiPanda CTO, September 9, 2024

Nothing strikes fear into the heart of a networking hardware vendor more than telling them they need to support TLVs! Okay, maybe that’s a bit of an exaggeration, but it’s really not that far off: TLVs were not designed with hardware in mind! The thing is, though, TLVs are super important, especially if we want to keep evolving the Internet. So we need a solution ASAP...


So what’s their problem?

A TLV, or Type-Length-Value, is a variable length structure for carrying a data object. The Type field gives the object’s type, the Length field gives the object’s length, and the Value is the object data of the respective length (pretty obvious, right!). A list of TLVs can be created where one TLV follows the next. TLVs are quite expressive and commonly used to convey a list of options in protocol headers like TCP options and IPv6 Hop-by-Hop options. It’s this expressiveness and flexibility that make TLVs problematic for hardware!


The challenges of TLVs for hardware are:

  • TLVs are variable length structures

  • The TLVs in a list can appear in any combination of types

  • The number of TLVs in a list may be unlimited


The first two points quickly preclude hardware optimizations we might use for fixed length protocol headers. For instance, we can’t just throw the whole TLV list into a TCAM; TLVs need to be processed sequentially, one at a time, in order. The third point isn't specific to hardware: forcing anyone to process a long list of TLVs is going to cost a lot of cycles and power.


TLV Parser Instructions to the rescue!

SiPanda has addressed these concerns by creating specialized CPU instructions for parsing TLVs as fast as possible with configurable limits to manage processing costs. Since these are CPU instructions we get all the benefits of full programmability. We describe the operation and show some example assembly.


As described in Parsing Mechanics, parsing TLVs is done in a loop over the TLVs. The flow for processing each TLV looks something like this:

Parser instructions for TLVs are designed to implement this flow logic. Three state variables are maintained in parser registers that correspond to similarly named variables in the flow diagram. DataHdr.Offset (corresponding to data_off) and DataHdr.Length (corresponding to data_len) give the offset and length of the current TLV being parsed; DataBound (corresponding to data_bnd) gives the maximum length of TLVs starting from the current TLV.


A TLV loop begins with a “loop head” instruction like prs.loadtlvloop or prs.tlvfastloop. Each loop iteration starts at the loop head. Processing for an iteration ends at a “stop node” instruction, indicated by a .stp suffix. When a “stop node” instruction finishes, program control jumps back to the loop head. A TLV loop can terminate for various reasons, including all the TLVs having been processed, an End-of-List (EOL) option type being seen, or a limit on the number of TLVs to process being exceeded.
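To make the loop structure concrete, here is a minimal software sketch of the same flow in Python. It is only a model of the control flow described above, not the instruction semantics: the names walk_tlvs, LoopExit, and max_tlvs are made up for illustration, and the TCP option format is used as the example simply because it has an End-of-List kind.

```python
from enum import Enum


class LoopExit(Enum):
    """Why the loop ended (hypothetical names, for illustration only)."""
    DONE = "all TLVs processed"
    EOL = "End-of-List seen"
    LIMIT = "TLV count limit exceeded"
    MALFORMED = "TLV overruns the list"


# TCP option kinds, used here only as a familiar example format.
KIND_EOL = 0
KIND_NOP = 1


def walk_tlvs(data: bytes, max_tlvs: int = 8):
    """Walk a TCP-style option list; return (visited TLVs, exit reason)."""
    data_off = 0            # offset of the current TLV
    data_bnd = len(data)    # bytes remaining from the current TLV onward
    seen = []

    while data_bnd > 0:                      # "loop head"
        if len(seen) >= max_tlvs:
            return seen, LoopExit.LIMIT
        kind = data[data_off]
        if kind == KIND_EOL:
            return seen, LoopExit.EOL
        if kind == KIND_NOP:
            data_len = 1                     # single byte, no Length field
        else:
            if data_bnd < 2:
                return seen, LoopExit.MALFORMED
            data_len = data[data_off + 1]    # TCP Length covers Kind and Length
            if data_len < 2 or data_len > data_bnd:
                return seen, LoopExit.MALFORMED
        seen.append((kind, data_off, data_len))
        # "stop node": advance to the next TLV, then go back to the loop head
        data_off += data_len
        data_bnd -= data_len

    return seen, LoopExit.DONE
```

Calling walk_tlvs on an option list returns the (kind, offset, length) of each TLV visited along with the reason the loop stopped, which covers the three termination cases listed above plus a malformed-length check that any software parser would need.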


Operation

We’ll show the operation of parsing TLVs with a simple example of a Hop-by-Hop Extension Header containing a single Minimum Path MTU option (RFC 9268). The diagram below shows the register states when parsing the Hop-by-Hop Extension Header commences.

The length of the Hop-by-Hop Extension header is computed by loading the extension header length field, multiplying the value by eight and adding eight. The instructions to do this set up the registers for running the TLV loop as shown below. 

In each loop iteration, the next TLV in the list is parsed to discern its type and length. The type can be used in a protocol lookup that returns a parse node for processing a TLV. The length is determined by loading the Option Length field and adding two. After setting the TLV length for the Minimum Path MTU the register states for our example are shown below.

When processing of a TLV node completes (at a “stop node” instruction), DataHdr.Offset is incremented by DataHdr.Length, DataBound is decremented by DataHdr.Length, and DataHdr.Length is set to zero. If DataBound reaches zero then the loop is terminated (as it would be in our example when the Minimum Path MTU option has been processed); if DataBound is greater than zero then the next TLV is parsed.
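The arithmetic in this walk-through can be traced in a few lines of Python. This is a sketch under stated assumptions rather than a model of the actual registers: offsets are taken relative to the start of the extension header, the initial bound is assumed to be the header length minus the two fixed bytes, and the function names are invented for the example.

```python
def hbh_length(hdr_ext_len: int) -> int:
    """Extension header length in bytes: Hdr Ext Len times eight, plus eight."""
    return hdr_ext_len * 8 + 8


def trace_hbh(hbh: bytes):
    """Return (type, data_off, data_len, data_bnd) for each TLV as it is parsed.

    Assumes offsets are relative to the start of the extension header.
    """
    total = hbh_length(hbh[1])
    if total > len(hbh):
        raise ValueError("extension header is truncated")
    data_off = 2              # options follow Next Header and Hdr Ext Len
    data_bnd = total - 2      # bytes of options still to be parsed
    trace = []
    while data_bnd > 0:
        opt_type = hbh[data_off]
        if opt_type == 0:     # Pad1 is a single byte with no length field
            data_len = 1
        else:
            if data_bnd < 2:
                raise ValueError("option is truncated")
            data_len = hbh[data_off + 1] + 2   # Option Length plus two
        if data_len > data_bnd:
            raise ValueError("option overruns the extension header")
        trace.append((opt_type, data_off, data_len, data_bnd))
        # what happens at a "stop node"
        data_off += data_len
        data_bnd -= data_len
    return trace


# Next Header 6, Hdr Ext Len 0, then one option of type 0x30 with 4 data bytes
example = bytes([6, 0, 0x30, 4, 0x05, 0xDC, 0x00, 0x00])
```

For the example bytes, hbh_length gives 8, so the bound starts at 6; the single option has length 4 + 2 = 6, and after the stop-node update the bound is zero and the loop ends, as in the text.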


Generic TLV Parsing in Instructions

Generic TLV parsing instructions are designed to handle various TLV formats. To continue our example, we show the assembly code for parsing an IPv6 Hop-by-Hop extension header with processing of the Minimum Path MTU option.


ipv6_hbh:

1 prs.load.h paccum, pcurptr

2 prs.lensetadd.b paccum[1], 8:8

3 prs.cam pnext,paccum[0]

4 prs.loadtlvloop paccum,pdatptr

5 prs.camjumploop paccum,pc

6 prs.load.h paccum, pdathdr /* Unknown option */

7 prs.cmpi.b paccum[0], 0:0xC0

8 prs.lensettlvadd.b.stp paccum[1], 2

pad1_option:

9 prs.lensetpad.stp pdathdr:1

padN_option:

10 prs.load.b paccum, pdathdr+1

11 prs.lensetpadadd.b.stp paccum, 2

min_path_mtu_option:

12 prs.load.b paccum, pdathdr+1

13 prs.lensettlvadd.b paccum[1], 2

14 prs.runthread.stp pmtu_handler


ipv6_hbh indicates the start of the instructions for the node, and the node is invoked when the Next Header field of an IPv6 header is zero. Lines 1 to 3 process the base Hop-by-Hop header, which includes computing the length of the extension header and performing a CAM lookup on the Next Header field.


Lines 4 and 5 constitute the “TLV loop”. The prs.loadtlvloop instruction is the “loop head” for the TLV loop. On the first execution of the instruction, loop state in the parser registers is initialized. When processing for each iteration completes, a jump is made back to the loop head instruction. At each iteration, conditions and limits for terminating the loop are checked. With each loop iteration, prs.loadtlvloop loads a single byte from the data pointer (pdathdr), which is the Option Kind, or TLV Type. prs.camjumploop then performs a CAM lookup on the Option Kind: if a match is found then a jump is made to the target node for processing the TLV; if no match is found then execution falls through to Line 6.


Lines 6 to 8 handle the case of an unknown Hop-by-Hop option. This entails setting the length of the TLV and checking the two high-order bits of the Option Kind to see if the option should be skipped or the packet should be dropped (RFC 8200). Lines 9 to 14 (pad1_option, padN_option, and min_path_mtu_option) parse single-byte padding (Option Kind 0), multi-byte padding (Option Kind 1), and the Minimum Path MTU option (Option Kind 48). For each one, we need to compute the TLV length; note that for the padding options we use prs.lensetpad and prs.lensetpadadd to account for the length of the padding and check it against limits for padding. In Line 14, prs.runthread is used to schedule a worker thread to process the logic of the Minimum Path MTU option.
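For readers who don't have RFC 8200 handy, the check on the two high-order bits amounts to the following. This is a plain Python illustration of the rule in Section 4.2 of that RFC; the enum and function names are hypothetical and not part of any SiPanda API.

```python
from enum import Enum


class UnknownOptionAction(Enum):
    """Actions encoded in the two high-order bits (RFC 8200, Section 4.2)."""
    SKIP = 0b00                            # skip the option, keep processing
    DISCARD = 0b01                         # silently discard the packet
    DISCARD_ICMP = 0b10                    # discard, always send ICMP Parameter Problem
    DISCARD_ICMP_UNLESS_MULTICAST = 0b11   # discard, send ICMP only if not multicast


def unknown_option_action(opt_type: int) -> UnknownOptionAction:
    """Decode what to do with an unrecognized option type."""
    if not 0 <= opt_type <= 0xFF:
        raise ValueError("option type must fit in one byte")
    return UnknownOptionAction((opt_type & 0xC0) >> 6)
```

The mask 0xC0 is the same one that appears in Line 7 of the listing: only when the masked value is zero may an unknown option be skipped.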


Optimized TLV Parsing in Instructions

A common TLV header format is exactly two bytes where the first byte gives the Type and the second byte gives the Length. This format is employed in several Internet protocols including IPv4 options, TCP options, UDP options, and IPv6 Hop-by-Hop and Destination options. To optimize for this case, we created prs.tlvfastloop and prs.camjumptlvloop instructions. We can use these to implement our example for parsing the Minimum Path MTU option in Hop-by-Hop options.


ipv6_hbh:

1 prs.load.h paccum, pcurptr

2 prs.lensetadd.b paccum[1], 8:8

3 prs.cam pnext,paccum[0]

4 prs.tlvfastloop paccum, pdatptr, 2

5 prs.camjumptlvloop.b.stp paccum,pc

min_path_mtu_option:

6 prs.runthread.stp pmtu_handler


Lines 1 to 3 are the same as before. Lines 4 and 5 implement the “fast” TLV loop.


prs.tlvfastloop scans the TLV list in search of non-padding options. If it encounters single-byte or multi-byte padding it skips over it, so that each invocation of prs.tlvfastloop returns the next non-padding option to process. Internally, the instruction reads and computes the TLV length and sets the DataHdr.Length register.


prs.camjumptlvloop performs a CAM lookup on the next non-padding option. If no entry is found then the instruction checks if the option is to be ignored (e.g. by checking the two high-order bits of the Option Kind). If a CAM entry is found then a jump is made to the address of the target node.


In Line 6, the handler for the Minimum Path MTU option is reduced to a single prs.runthread.stp instruction since the bookkeeping for the TLV length is handled by prs.tlvfastloop.
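As a rough software analogy for this division of labor, the sketch below splits the work the same way: one function skips padding and computes the length, and another does the lookup and dispatch. It assumes IPv6-style options (Pad1 is type 0, PadN is type 1, length excludes the two header bytes); a Python dict stands in for the CAM, and all names are invented for the example.

```python
PAD1, PADN = 0, 1


def next_option(opts: bytes, off: int):
    """Skip padding from `off`; return (type, offset, length) of the next
    non-padding option, or None when the list is exhausted."""
    while off < len(opts):
        opt_type = opts[off]
        if opt_type == PAD1:
            off += 1
            continue
        if off + 1 >= len(opts):
            raise ValueError("truncated option")
        length = opts[off + 1] + 2
        if off + length > len(opts):
            raise ValueError("option overruns list")
        if opt_type == PADN:
            off += length
            continue
        return opt_type, off, length
    return None


def parse_options(opts: bytes, handlers: dict):
    """Dispatch each non-padding option; `handlers` stands in for the CAM."""
    results = []
    off = 0
    while (found := next_option(opts, off)) is not None:
        opt_type, opt_off, length = found
        handler = handlers.get(opt_type)
        if handler is not None:
            # handler sees only the option data; length bookkeeping is done
            results.append(handler(opts[opt_off + 2:opt_off + length]))
        elif opt_type & 0xC0:
            raise ValueError(f"unknown option {opt_type:#x} must not be skipped")
        off = opt_off + length
    return results


# The per-option handler no longer touches lengths or offsets at all.
handlers = {0x30: lambda value: ("min_path_mtu", value)}
```

Note how the handler for type 0x30 shrinks to a single expression, much as the assembly node shrinks to a single instruction. The software version still walks the bytes one at a time, of course, which is exactly the cost the instructions are meant to hide.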


The benefits

Instructions for parsing TLVs are very powerful. The two instructions that comprise a TLV loop replace over three hundred plain integer instructions. These instructions can leverage a lot of internal parallelism and could employ techniques for prefetch and background computation. The net effect is that they deliver orders of magnitude better performance than a plain software implementation and come pretty close to the theoretical maximum performance of a hardware solution. Hence our claim: This is the world’s fastest way to parse TLVs. :-)


SiPanda

SiPanda was created to rethink the network datapath and bring both flexibility and wire-speed performance at scale to networking infrastructure. The SiPanda architecture enables data center infrastructure operators and application architects to build solutions for cloud service providers to edge compute (5G) that don’t require the compromises inherent in today’s network solutions. For more information, please visit www.sipanda.io. If you want to find out more about PANDA, you can email us at panda@sipanda.io. IP described here is covered by patent USPTO 12,026,546 and other patents pending.
