An algorithmic TCAM-based ternary lookup method is provided. The method stores entries for ternary lookup into several sub-tables. All entries in each sub-table have a sub-table key that includes the same common portion of the entry. No two sub-tables are associated with the same sub-table key. The method stores the keys in a sub-table keys table in TCAM. Each key has a different priority. The method stores the entries for each sub-table in random access memory. Each entry in a sub-table has a different priority. The method receives a search request to perform a ternary lookup for an input data item. A ternary lookup into the sub-table keys table stored in TCAM is performed to retrieve a sub-table index. The method performs a ternary lookup across the entries of the sub-table associated with the retrieved index to identify the highest-priority matched entry for the input data item.
G11C 7/10 - Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
G11C 11/406 - Management or control of the refreshing or charge-regeneration cycles
G11C 15/04 - Digital stores in which information comprising one or more characteristic parts is written into the store and in which information is read-out by searching for one or more of these characteristic parts, i.e. associative or content-addressed stores using semiconductor elements
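As an editorial illustration of the two-stage lookup described in the abstract above, the following Python sketch models the TCAM-resident sub-table keys table and the RAM-resident sub-table entries as ordinary lists; the class and field names are hypothetical and the hardware behavior is only approximated.

# Hypothetical software model of the algorithmic-TCAM lookup described above.
# A ternary entry matches a key when (key & mask) == (value & mask).

def ternary_match(key, value, mask):
    return (key & mask) == (value & mask)

class AlgorithmicTcam:
    def __init__(self):
        # "TCAM": sub-table keys, one per sub-table, each with a distinct priority.
        self.subtable_keys = []      # list of (value, mask, priority, subtable_index)
        # "RAM": per-sub-table entries, each with a distinct priority within its sub-table.
        self.subtables = {}          # subtable_index -> list of (value, mask, priority, result)

    def add_subtable(self, index, value, mask, priority):
        self.subtable_keys.append((value, mask, priority, index))
        self.subtables[index] = []

    def add_entry(self, index, value, mask, priority, result):
        self.subtables[index].append((value, mask, priority, result))

    def lookup(self, key):
        # Stage 1: ternary lookup of the sub-table keys table to get a sub-table index.
        matches = [(p, i) for (v, m, p, i) in self.subtable_keys if ternary_match(key, v, m)]
        if not matches:
            return None
        _, index = max(matches)      # highest-priority matching sub-table key
        # Stage 2: ternary lookup across the entries of the selected sub-table.
        hits = [(p, r) for (v, m, p, r) in self.subtables[index] if ternary_match(key, v, m)]
        return max(hits)[1] if hits else None

tcam = AlgorithmicTcam()
tcam.add_subtable(0, value=0x0A000000, mask=0xFF000000, priority=10)   # 10.0.0.0/8
tcam.add_entry(0, value=0x0A010000, mask=0xFFFF0000, priority=2, result="port 7")
tcam.add_entry(0, value=0x0A000000, mask=0xFF000000, priority=1, result="port 3")
print(tcam.lookup(0x0A010203))     # -> "port 7"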
A method for generating configuration data for configuring a hardware switch is described. The method receives a description of functionality for the hardware switch. Based on the description, the method generates sets of match and action entries to configure the hardware switch to process packets. The method then determines, for each packet header field in a parse graph that specifies instructions for a parser of the switch to extract packet header fields from packets, whether the packet header field is used or modified by at least one match or action entry. The method generates, for the parser of the hardware switch, configuration data that instructs the parser to extract (i) packet header fields used or modified by at least one match or action entry to a first set of registers and (ii) packet header fields not used by any match or action entries to a second set of registers.
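The field-partitioning step described above can be sketched as follows; this is a hedged illustration with hypothetical data structures (parse_graph, match_action_entries), not the claimed compiler.

# Hypothetical sketch of the parser-configuration step described above: fields that appear
# in at least one match or action entry go to a first register set, all other parsed
# fields go to a second register set.

parse_graph = {                                   # header -> fields the parser can extract
    "ethernet": ["dst", "src", "ethertype"],
    "ipv4": ["ttl", "protocol", "src_ip", "dst_ip"],
}

match_action_entries = [                          # fields referenced by generated entries
    {"match": ["ipv4.dst_ip"], "action_fields": ["ipv4.ttl"]},
    {"match": ["ethernet.ethertype"], "action_fields": []},
]

def build_parser_config(parse_graph, entries):
    used = set()
    for e in entries:
        used.update(e["match"])
        used.update(e["action_fields"])
    config = {}
    for header, fields in parse_graph.items():
        for f in fields:
            name = f"{header}.{f}"
            config[name] = 1 if name in used else 2   # register set 1 or 2
    return config

for field, reg_set in sorted(build_parser_config(parse_graph, match_action_entries).items()):
    print(f"{field} -> register set {reg_set}")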
Some embodiments provide a method for processing a packet for a pipeline of a hardware switch. The pipeline, in some embodiments, includes several different stages that match against packet header fields and modify packet header fields. The method receives a packet that includes a set of packet headers. The method then populates, for each packet header in the set of packet headers, (i) a first set of registers with packet header field values of the packet header that are used in the pipeline, and (ii) a second set of registers with packet header field values of the packet header that are not used in the pipeline.
Some embodiments of the invention provide a network forwarding element that can be dynamically reconfigured to adjust its data message processing to stay within a desired operating temperature or power consumption range. In some embodiments, the network forwarding element includes (1) a data-plane forwarding circuit (“data plane”) to process data tuples associated with data messages received by the forwarding element, and (2) a control-plane circuit (“control plane”) for configuring the data plane forwarding circuit. The data plane includes several data processing stages to process the data tuples.
H04L 41/0833 - Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability for reduction of network energy consumption
G06F 1/04 - Generating or distributing clock signals or signals derived directly therefrom
Some embodiments provide a method for an ingress packet processing pipeline of a network forwarding integrated circuit (IC). The ingress packet processing pipeline is for receiving packets from a port of the network forwarding IC and processing the packets to assign different packets to different queues of a traffic management unit of the network forwarding IC. The method receives state data from the traffic management unit. The method stores the state data in a stateful table. The method assigns a particular packet to a particular queue based on the state data received from the traffic management unit and stored in the stateful table.
H04L 45/7453 - Address table lookup; Address filtering using hashing
H04L 47/32 - Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
H04L 47/62 - Queue scheduling characterised by scheduling criteria
H04L 47/628 - Queue scheduling characterised by scheduling criteria for service slots or service orders based on packet size, e.g. shortest packet first
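A minimal Python sketch of the queue-assignment behavior described in the preceding abstract follows; the least-loaded policy and the class/field names are assumptions standing in for the claimed mechanism.

# Hypothetical model of the ingress-side queue selection described above: the traffic
# management unit pushes per-queue state (here, queue depth) into a stateful table, and
# the ingress pipeline picks a queue for each packet based on that state.

class IngressPipeline:
    def __init__(self, queue_ids):
        self.stateful_table = {q: 0 for q in queue_ids}   # queue id -> last reported depth

    def receive_state(self, queue_id, depth):
        # State data arriving from the traffic management unit.
        self.stateful_table[queue_id] = depth

    def assign_queue(self, packet):
        # Example policy (an assumption, not the patented one): least-loaded queue.
        return min(self.stateful_table, key=self.stateful_table.get)

pipeline = IngressPipeline(queue_ids=[0, 1, 2, 3])
for q, depth in [(0, 120), (1, 5), (2, 37), (3, 80)]:
    pipeline.receive_state(q, depth)
print(pipeline.assign_queue({"flow": "a"}))               # -> 1 (least-loaded queue)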
Some embodiments of the invention provide a forwarding element (e.g., a switch, a router, etc.) that has one or more data-plane message-processing pipelines with key-value processing circuits. The forwarding element's data plane key-value circuits allow the forwarding element to perform key-value services that would otherwise have to be performed by data compute nodes connected by the network fabric that includes the forwarding element. In some embodiments, the key-value (KV) services of the forwarding element and other similar forwarding elements supplement the key-value services of a distributed set of key-value servers by caching a subset of the most commonly used key-value pairs in the forwarding elements that connect the set of key-value servers with their client applications. In some embodiments, the key-value circuits of the forwarding element perform the key-value service operations at message-processing line rates at which the forwarding element forwards messages to the data compute nodes and/or to other network forwarding elements in the network fabric.
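The caching behavior described above can be illustrated with a short Python sketch; the admission policy, the KvCachingSwitch class, and backend_get are assumptions, with backend_get standing in for the distributed key-value server tier.

# Hypothetical sketch of the caching behavior described above: the forwarding element
# answers requests for a small set of hot key-value pairs itself and forwards everything
# else to the key-value servers.

class KvCachingSwitch:
    def __init__(self, capacity, backend_get):
        self.capacity = capacity
        self.cache = {}                       # hot key-value pairs cached in the data plane
        self.backend_get = backend_get

    def get(self, key):
        if key in self.cache:                 # served in the data plane, no server round trip
            return self.cache[key], "hit"
        value = self.backend_get(key)         # forwarded to a key-value server
        if len(self.cache) < self.capacity:
            self.cache[key] = value           # simplistic admission policy (an assumption)
        return value, "miss"

servers = {"user:42": "alice", "user:43": "bob"}
switch = KvCachingSwitch(capacity=1024, backend_get=servers.get)
print(switch.get("user:42"))                  # ('alice', 'miss') the first time
print(switch.get("user:42"))                  # ('alice', 'hit') afterwards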
A method of multicasting packets by a forwarding element that includes several packet replicators and several egress pipelines is provided. Each packet replicator receives a data structure associated with a multicast packet that identifies a multicast group. Each packet replicator identifies a first physical egress port of a first egress pipeline for sending the multicast packet to a member of the multicast group. The first physical egress port is a member of a link aggregation group (LAG). Each packet replicator determines that the first physical egress port is not operational and identifies a second physical egress port in the LAG for sending the multicast packet to the member of the multicast group. When a packet replicator is connected to the same egress pipeline as the second physical egress port, the packet replicator provides the identification of the second physical egress port to the egress pipeline to send the packet to the multicast member. Otherwise, the packet replicator drops the packet.
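A small Python sketch of the replicator behavior just described follows; the function and table names are hypothetical and the LAG-member selection rule is an assumption.

# Hypothetical model of the replicator behavior described above: each replicator resolves a
# multicast member to a LAG port, fails over to a live port in the same LAG if the first
# choice is down, and only emits the copy if the resolved port belongs to its own pipeline.

def replicate(replicator_pipeline, lag_ports, port_up, port_to_pipeline):
    primary = lag_ports[0]                                 # first physical egress port
    if port_up[primary]:
        chosen = primary
    else:
        live = [p for p in lag_ports[1:] if port_up[p]]    # pick another LAG member
        if not live:
            return "drop"
        chosen = live[0]
    if port_to_pipeline[chosen] == replicator_pipeline:
        return f"send via port {chosen}"
    return "drop"                                          # another replicator owns that port

port_up = {10: False, 11: True}
port_to_pipeline = {10: 0, 11: 1}
print(replicate(0, [10, 11], port_up, port_to_pipeline))   # -> drop (port 11 is on pipeline 1)
print(replicate(1, [10, 11], port_up, port_to_pipeline))   # -> send via port 11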
A method of identifying a failed egress path of a hardware forwarding element is provided. The method detects an egress link failure in a data plane of the forwarding element. The method generates a link failure signal in the data plane identifying the failed egress link. The method generates a packet that includes the identification of the egress link based on the link failure signal. The method sets the status of the egress link to failed in the data plane based on the identification of the egress link in the generated packet.
Some embodiments provide a method for an ingress packet processing pipeline of a network forwarding integrated circuit (IC). The ingress packet processing pipeline is for receiving packets from a port of the network forwarding IC and processing the packets to assign different packets to different queues of a traffic management unit of the network forwarding IC. The method receives state data from the traffic management unit. The method stores the state data in a stateful table. The method assigns a particular packet to a particular queue based on the state data received from the traffic management unit and stored in the stateful table.
H04L 45/7453 - Address table lookup; Address filtering using hashing
H04L 47/32 - Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
H04L 47/62 - Queue scheduling characterised by scheduling criteria
H04L 47/628 - Queue scheduling characterised by scheduling criteria for service slots or service orders based on packet size, e.g. shortest packet first
Some embodiments of the invention provide a network forwarding element that can be dynamically reconfigured to adjust its data message processing to stay within a desired operating temperature or power consumption range. In some embodiments, the network forwarding element includes (1) a data-plane forwarding circuit (“data plane”) to process data tuples associated with data messages received by the forwarding element, and (2) a control-plane circuit (“control plane”) for configuring the data plane forwarding circuit. The data plane includes several data processing stages to process the data tuples. The data plane also includes an idle-signal injecting circuit that receives from the control plane configuration data that the control plane generates based on the forwarding element's temperature. Based on the received configuration data, the idle-signal injecting circuit generates idle control signals for the data processing stages. Each stage that receives an idle control signal enters an idle state during which the majority of the components of that stage do not perform any operations, which reduces the power consumed and temperature generated by that stage during its idle state.
G06F 1/3234 - Power saving characterised by the action undertaken
G06F 3/06 - Digital input from, or digital output to, record carriers
G06F 13/16 - Handling requests for interconnection or transfer for access to memory bus
H04L 41/0833 - Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability for reduction of network energy consumption
H04L 45/00 - Routing or path finding of packets in data switching networks
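A rough Python sketch of the idle-injection scheme in the preceding abstract follows; the temperature-to-duty-cycle policy and all names are assumptions, not the patented configuration logic.

# Hypothetical sketch: the control plane turns a temperature reading into configuration
# data (here, an idle duty cycle), and the data-plane injector uses it to decide, cycle by
# cycle, whether the processing stages sit idle or process data.

def control_plane_config(temperature_c, target_c=90):
    # Assumed policy: inject more idle cycles the further the die is above target.
    over = max(0.0, temperature_c - target_c)
    return {"idle_fraction": min(0.5, over * 0.05)}        # cap at 50% idle

def idle_injector(config, total_cycles):
    idle_every = int(1 / config["idle_fraction"]) if config["idle_fraction"] else 0
    signals = []
    for cycle in range(total_cycles):
        idle = bool(idle_every) and cycle % idle_every == 0
        signals.append("idle" if idle else "process")
    return signals

cfg = control_plane_config(temperature_c=100)
print(cfg)                                                 # {'idle_fraction': 0.5}
print(idle_injector(cfg, total_cycles=8))                  # alternating 'idle'/'process'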
A method for generating configuration data for configuring a hardware switch is described. The method receives a description of functionality for the hardware switch. Based on the description, the method generates sets of match and action entries to configure the hardware switch to process packets. The method then determines, for each packet header field in a parse graph that specifies instructions for a parser of the switch to extract packet header fields from packets, whether the packet header field is used or modified by at least one match or action entry. The method generates, for the parser of the hardware switch, configuration data that instructs the parser to extract (i) packet header fields used or modified by at least one match or action entry to a first set of registers and (ii) packet header fields not used by any match or action entries to a second set of registers.
Some embodiments of the invention provide a forwarding element that has a data-plane circuit (data plane) that can be configured to implement a DDoS (distributed denial of service) attack detector. The data plane has several stages of configurable data processing circuits, which are typically configured to process data tuples associated with data messages received by the forwarding element in order to forward the data messages within a network. In some embodiments, the configurable data processing circuits of the data plane can also be configured to implement a DDoS attack detector (DDoS detector) in the data plane. In some embodiments, the forwarding element has a control-plane circuit (control plane) that configures the configurable data processing circuits of the data plane, while in other embodiments, a remote controller configures these data processing circuits.
G06F 17/18 - Complex mathematical operations for evaluating statistical data
H04L 43/0876 - Network utilisation, e.g. volume of load or congestion level
H04L 41/0816 - Configuration setting characterised by the conditions triggering a change of settings the condition being an adaptation, e.g. in response to network events
A method of multicasting packets by a forwarding element that includes several packet replicators and several egress pipelines is provided. Each packet replicator receives a data structure associated with a multicast packet that identifies a multicast group. Each packet replicator identifies a first physical egress port of a first egress pipeline for sending the multicast packet to a member of the multicast group. The first physical egress port is a member of a link aggregation group (LAG). Each packet replicator determines that the first physical egress port is not operational and identifies a second physical egress port in the LAG for sending the multicast packet to the member of the multicast group. When a packet replicator is connected to the same egress pipeline as the second physical egress port, the packet replicator provides the identification of the second physical egress port to the egress pipeline to send the packet to the multicast member. Otherwise, the packet replicator drops the packet.
An algorithmic TCAM-based ternary lookup method is provided. The method stores entries for ternary lookup into several sub-tables. All entries in each sub-table have a sub-table key that includes the same common portion of the entry. No two sub-tables are associated with the same sub-table key. The method stores the keys in a sub-table keys table in TCAM. Each key has a different priority. The method stores the entries for each sub-table in random access memory. Each entry in a sub-table has a different priority. The method receives a search request to perform a ternary lookup for an input data item. A ternary lookup into the sub-table keys table stored in TCAM is performed to retrieve a sub-table index. The method performs a ternary lookup across the entries of the sub-table associated with the retrieved index to identify the highest-priority matched entry for the input data item.
G11C 15/04 - Digital stores in which information comprising one or more characteristic parts is written into the store and in which information is read-out by searching for one or more of these characteristic parts, i.e. associative or content-addressed stores using semiconductor elements
15.
Messaging between remote controller and forwarding element
Some embodiments of the invention provide a forwarding element that can be configured through in-band data-plane messages from a remote controller that is a physically separate machine from the forwarding element. The forwarding element of some embodiments has data plane circuits that include several configurable message-processing stages, several storage queues, and a data-plane configurator. A set of one or more message-processing stages of the data plane is configured (1) to process configuration messages received by the data plane from the remote controller and (2) to store the configuration messages in a set of one or more storage queues. The data-plane configurator receives the configuration messages stored in the set of storage queues and configures one or more of the configurable message-processing stages based on configuration data in the configuration messages.
Some embodiments of the invention provide a forwarding element (e.g., a switch, a router, etc.) that has one or more data-plane message-processing pipelines with key-value processing circuits. The forwarding element's data plane key-value circuits allow the forwarding element to perform key-value services that would otherwise have to be performed by data compute nodes connected by the network fabric that includes the forwarding element. In some embodiments, the key-value (KV) services of the forwarding element and other similar forwarding elements supplement the key-value services of a distributed set of key-value servers by caching a subset of the most commonly used key-value pairs in the forwarding elements that connect the set of key-value servers with their client applications. In some embodiments, the key-value circuits of the forwarding element perform the key-value service operations at message-processing line rates at which the forwarding element forwards messages to the data compute nodes and/or to other network forwarding elements in the network fabric.
Some embodiments provide a network forwarding integrated circuit (IC) that includes at least one packet processing pipeline. The packet processing pipeline includes multiple match-action stages, at least one of which includes a stateful processing unit that operates at a line rate of the network forwarding IC. The stateful processing unit is configured to receive data stored in a memory location associated with a stateful table of the match-action stage. The data includes a set of values. The stateful processing unit is further configured to identify one of a maximum value and a minimum value from the set of values, and to output the identified value for use by a next match-action stage.
Some embodiments of the invention provide a forwarding element that can be configured through in-band data-plane messages from a remote controller that is a physically separate machine from the forwarding element. The forwarding element of some embodiments has data plane circuits that include several configurable message-processing stages, several storage queues, and a data-plane configurator. A set of one or more message-processing stages of the data plane is configured (1) to process configuration messages received by the data plane from the remote controller and (2) to store the configuration messages in a set of one or more storage queues. The data-plane configurator receives the configuration messages stored in the set of storage queues and configures one or more of the configurable message-processing stages based on configuration data in the configuration messages.
Some embodiments provide a method for a parser of a processing pipeline. The method receives a packet for processing by a set of match-action stages of the processing pipeline. The method stores packet header field (PHF) values from a first set of PHFs of the packet in a set of data containers. The first set of PHFs is for use by the match-action stages. For a second set of PHFs not used by the match-action stages, the method generates descriptive data that identifies locations of the PHFs of the second set within the packet. The method sends (i) the set of data containers to the match-action stages and (ii) the packet data and the generated descriptive data outside of the match-action stages to a deparser that uses the packet data, generated descriptive data, and the set of data containers as modified by the match-action stages to reconstruct a modified packet.
H04L 47/2441 - Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
H04L 67/63 - Routing a service request depending on the request content or context
Some embodiments provide a method for processing a packet for a pipeline of a hardware switch. The pipeline, in some embodiments, includes several different stages that match against packet header fields and modify packet header fields. The method receives a packet that includes a set of packet headers. The method then populates, for each packet header in the set of packet headers, (i) a first set of registers with packet header field values of the packet header that are used in the pipeline, and (ii) a second set of registers with packet header field values of the packet header that are not used in the pipeline.
A method of identifying a failed egress path of a hardware forwarding element is provided. The method detects an egress link failure in a data plane of the forwarding element. The method generates a link failure signal in the data plane identifying the failed egress link. The method generates a packet that includes the identification of the egress link based on the link failure signal. The method sets the status of the egress link to failed in the data plane based on the identification of the egress link in the generated packet.
Some embodiments provide novel circuits for augmenting the functionality of a data plane circuit of a forwarding element with one or more field programmable circuits and external memory circuits. The external memories in some embodiments serve as deep buffers that receive through one or more FPGAs a set of data messages from the data plane (DP) circuit to store temporarily. In some of these embodiments, one or more of the FPGAs implement schedulers that specify when data messages should be retrieved from the external memories and provided back to the data plane circuit for forwarding through the network. For instance, in some embodiments, a particular FPGA can perform a scheduling operation for a first set of data messages stored in its associated external memory, and can direct another FPGA to perform the scheduling operation for a second set of data messages stored in the particular FPGA's associated external memory. Specifically, in these embodiments, the particular FPGA determines when the first set of data messages stored in its associated external memory should be forwarded back to the data plane circuit for forwarding into the network, while directing the other FPGA to determine when the second set of data messages stored in the particular FPGA's external memory should be forwarded back to the data plane circuit.
Some embodiments provide novel circuits for recording data messages received by a data plane circuit of a forwarding element in an external memory outside of the data plane circuit. The external memory in some embodiments is outside of the forwarding element. In some embodiments, the data plane circuit encapsulates the received data messages that should be recorded with encapsulation headers, inserts into these headers addresses that identify locations for storing these data messages in a memory external to the data plane circuit, and forwards these encapsulated data messages so that these messages can be stored in the external memory by another circuit. Instead of encapsulating received data messages for storage, the data plane circuit in some embodiments encapsulates copies of the received data messages for storage. Accordingly, in these embodiments, the data plane circuit makes copies of the data messages that it needs to record.
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that can be configured to provide protection from a SYN-flood denial of service attack by validating the source of SYN data messages before allowing future messages to be forwarded to a protected server. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to operate as a connection-validation circuit that includes (1) a SYN-processing circuit to process SYN data messages received by the data plane, and (2) an ACK-processing circuit to process ACK data messages received by the data plane.
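One common way to validate a SYN source is a SYN-cookie-style challenge; the Python sketch below uses that approach purely as an illustrative assumption (the abstract does not specify the validation mechanism), with hypothetical names throughout.

# Hypothetical SYN-cookie-style sketch of the connection validation described above: the
# SYN-processing path answers a SYN with a challenge derived from the client's address, and
# the ACK-processing path only whitelists sources whose ACK echoes a valid challenge.
import hmac, hashlib

SECRET = b"data-plane-secret"                     # assumed per-device secret

def cookie(src_ip, src_port):
    msg = f"{src_ip}:{src_port}".encode()
    return int.from_bytes(hmac.new(SECRET, msg, hashlib.sha256).digest()[:4], "big")

validated_sources = set()

def process_syn(src_ip, src_port):
    # Reply to the client with a SYN-ACK carrying the cookie as the sequence number.
    return {"syn_ack_seq": cookie(src_ip, src_port)}

def process_ack(src_ip, src_port, ack_seq):
    if ack_seq == cookie(src_ip, src_port) + 1:   # client echoed cookie + 1
        validated_sources.add((src_ip, src_port))
        return "source validated; future messages forwarded to server"
    return "drop"

reply = process_syn("198.51.100.7", 40000)
print(process_ack("198.51.100.7", 40000, reply["syn_ack_seq"] + 1))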
A method for verifying data plane programs is provided in some embodiments. Because the behavior of a data plane program (e.g., a program written in the P4 language) is determined in part by the control plane populating match-action tables with specific forwarding rules, in some embodiments, programmers are provided with a way to document assumptions about the control plane using annotations (e.g., in the form of “assertions” or “assumptions” about the state based on the unknown control plane contribution). In some embodiments, annotations are added automatically to verify common properties, including checking that every header read or written is valid, that every expression has a well-defined value, and that all standard metadata is manipulated correctly. The method in some embodiments translates programs from a first language (e.g., P4) to a second language (e.g., Guarded Command Language (GCL)) for verification by a satisfiability modulo theory (SMT) solver.
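As a much-simplified illustration of the SMT-based checking described above, the sketch below uses the z3 solver's Python bindings (an assumption; install with pip install z3-solver) to check one hand-written assertion rather than a full P4-to-GCL translation.

# Hypothetical, simplified illustration of the verification flow described above. A P4-like
# program that reads ipv4.ttl is modeled symbolically; the control-plane assumption is that
# the installed rule only applies when the ipv4 header is valid. The solver then searches
# for an execution that violates the header-validity assertion.
from z3 import Bool, Solver, And, Not, sat

ipv4_valid = Bool("ipv4_valid")        # parser state: is the ipv4 header valid?
rule_applies = Bool("rule_applies")    # did the control-plane-installed rule match?

solver = Solver()
# "Assumption" annotation from the programmer: the control plane only installs rules
# that match on valid ipv4 packets.
solver.add(Not(And(rule_applies, Not(ipv4_valid))))
# Automatically added assertion: the action body reads ipv4.ttl, so a violation is a
# state where the rule applies while the header is invalid.
solver.add(And(rule_applies, Not(ipv4_valid)))

print("assertion violated" if solver.check() == sat else "assertion holds under assumption")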
A method of multicasting packets by a forwarding element that includes several packet replicators and several egress pipelines is provided. Each packet replicator receives a data structure associated with a multicast packet that identifies a multicast group. Each packet replicator identifies a first physical egress port of a first egress pipeline for sending the multicast packet to a member of the multicast group. The first physical egress port is a member of a link aggregation group (LAG). Each packet replicator determines that the first physical egress port is not operational and identifies a second physical egress port in the LAG for sending the multicast packet to the member of the multicast group. When a packet replicator is connected to the same egress pipeline as the second physical egress port, the packet replicator provides the identification of the second physical egress port to the egress pipeline to send the packet to the multicast member. Otherwise, the packet replicator drops the packet.
Some embodiments provide a data-plane forwarding circuit that can be configured to learn about a new message flow and to maintain metadata about the new message flow without a control plane first configuring the data plane to maintain metadata about the flow. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to operate as a flow-tracking circuit that includes (1) a flow-identifying circuit to identify message flows received by the data plane, and (2) a first set of storages to store metadata about the identified flows.
H04L 47/2483 - Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
H04L 41/22 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks comprising specially adapted graphical user interfaces [GUI]
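A short Python sketch of the data-plane flow learning in the preceding abstract follows; the FlowTracker class, the slot layout, and the use of Python's built-in hash are illustrative assumptions.

# Hypothetical sketch: a flow-identifying step hashes the flow key, and the first time a
# flow is seen a metadata slot is claimed for it in the data plane, without the control
# plane having to install the flow first.

class FlowTracker:
    def __init__(self, num_slots):
        self.slots = [None] * num_slots            # first set of storages: per-flow metadata

    def process(self, flow_key, pkt_len):
        index = hash(flow_key) % len(self.slots)   # flow-identifying step (assumed hash)
        slot = self.slots[index]
        if slot is None or slot["key"] != flow_key:
            # New (or colliding) flow: learn it in the data plane.
            slot = {"key": flow_key, "packets": 0, "bytes": 0}
            self.slots[index] = slot
        slot["packets"] += 1
        slot["bytes"] += pkt_len
        return slot

tracker = FlowTracker(num_slots=1024)
tracker.process(("10.0.0.1", "10.0.0.2", 6, 1234, 80), pkt_len=1500)
print(tracker.process(("10.0.0.1", "10.0.0.2", 6, 1234, 80), pkt_len=40))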
28.
Messaging between remote controller and forwarding element
Some embodiments of the invention provide a forwarding element that can be configured through in-band data-plane messages from a remote controller that is a physically separate machine from the forwarding element. The forwarding element of some embodiments has data plane circuits that include several configurable message-processing stages, several storage queues, and a data-plane configurator. A set of one or more message-processing stages of the data plane is configured (1) to process configuration messages received by the data plane from the remote controller and (2) to store the configuration messages in a set of one or more storage queues. The data-plane configurator receives the configuration messages stored in the set of storage queues and configures one or more of the configurable message-processing stages based on configuration data in the configuration messages.
A method for generating configuration data for configuring a hardware switch is described. The method receives a description of functionality for the hardware switch. Based on the description, the method generates sets of match and action entries to configure the hardware switch to process packets. The method then determines, for each packet header field in a parse graph that specifies instructions for a parser of the switch to extract packet header fields from packets, whether the packet header field is used or modified by at least one match or action entry. The method generates, for the parser of the hardware switch, configuration data that instructs the parser to extract (i) packet header fields used or modified by at least one match or action entry to a first set of registers and (ii) packet header fields not used by any match or action entries to a second set of registers.
Some embodiments provide a network forwarding element with a data-plane forwarding circuit that has a parameter collecting circuit to store and distribute parameter values computed by several machines in a network. In some embodiments, the machines perform distributed computing operations, and the parameter values that they compute are associated with the distributed computing operations. The parameter collecting circuit of the data-plane forwarding circuit (data plane) in some embodiments (1) stores a set of parameter values computed and sent by a first set of machines, and (2) distributes the collected parameter values to a second set of machines once it has collected the set of parameter values from all the machines in the first set. The first and second sets of machines are the same set of machines in some embodiments, while they are different sets of machines (e.g., one set has at least one machine that is not in the other set) in other embodiments. In some embodiments, the parameter collecting circuit performs computations on the parameter values that it collects and distributes the result of the computations once it has processed all the parameter values distributed by the first set of machines. The computations are aggregating operations (e.g., adding, averaging, etc.) that combine corresponding subsets of parameter values distributed by the first set of machines.
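A minimal Python sketch of the collect-then-distribute behavior described above follows; the ParameterCollector class and the element-wise sum are illustrative assumptions (the abstract also mentions averaging and other aggregations).

# Hypothetical model: the data plane accumulates one parameter vector from each machine in
# the first set and releases the aggregate (here, an element-wise sum) to the second set of
# machines only once every expected machine has contributed.

class ParameterCollector:
    def __init__(self, expected_machines, vector_len):
        self.expected = set(expected_machines)
        self.seen = set()
        self.accumulator = [0] * vector_len

    def submit(self, machine_id, values):
        if machine_id in self.seen:                     # ignore duplicate contributions
            return None
        self.seen.add(machine_id)
        self.accumulator = [a + v for a, v in zip(self.accumulator, values)]
        if self.seen == self.expected:                  # all contributions collected
            return list(self.accumulator)               # distribute to the second set
        return None

collector = ParameterCollector(expected_machines=["m1", "m2"], vector_len=3)
print(collector.submit("m1", [1.0, 2.0, 3.0]))          # None: still waiting
print(collector.submit("m2", [0.5, 0.5, 0.5]))          # [1.5, 2.5, 3.5]: result released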
Some embodiments use one or more FPGAs and external memories associated with the FPGAs to implement large, hash-addressable tables for a data plane (DP) circuit. These embodiments configure at least one message processing stage of the DP circuit to store (1) a first plurality of records for matching with a set of data messages received by the DP circuit, and (2) a redirection record redirecting data messages that do not match the first plurality of records to a DP egress port associated with the memory circuit. These embodiments configure an external memory circuit to store a larger, second set of records for matching with redirected data messages received through the DP egress port associated with the memory circuit. This external memory circuit is a hash-addressable memory in some embodiments. To determine whether a redirected data message matches a record in the second set of records, the method of some embodiments configures an FPGA associated with the hash-addressable external memory to use a collision-free hash process to generate a collision-free hash address value from a set of attributes of the data message. This hash address value specifies an address in the external memory for the record in the second set of records to compare with the redirected data message.
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
H03K 19/173 - Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
G06F 13/16 - Handling requests for interconnection or transfer for access to memory bus
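The on-chip-miss-then-redirect flow in the preceding abstract can be sketched briefly in Python; the ExtendedTable class is hypothetical, and the "perfect hash" is simulated by precomputing one slot per known key, which stands in for the FPGA's collision-free hash logic.

# Hypothetical sketch: a small on-chip table is tried first; a miss hits the redirection
# record and the lookup is retried against a larger external table addressed by a
# collision-free hash.

class ExtendedTable:
    def __init__(self, onchip, external_records):
        self.onchip = onchip                                               # small first set of records
        self.slots = {key: i for i, key in enumerate(external_records)}    # collision-free addresses
        self.external = [external_records[k] for k in external_records]    # hash-addressable memory

    def lookup(self, key):
        if key in self.onchip:                                             # on-chip match-action hit
            return self.onchip[key], "on-chip"
        if key in self.slots:                                              # redirected via DP egress port
            return self.external[self.slots[key]], "external"
        return None, "miss"

table = ExtendedTable(onchip={"10.0.0.1": "port 1"},
                      external_records={"10.0.0.2": "port 2", "10.0.0.3": "port 3"})
print(table.lookup("10.0.0.3"))                                            # ('port 3', 'external')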
32.
Augmenting data plane functionality with field programmable integrated circuits
Some embodiments use one or more FPGAs and external memories associated with the FPGAs to implement large, hash-addressable tables for a data plane (DP) circuit. These embodiments configure at least one message processing stage of the DP circuit to store (1) a first plurality of records for matching with a set of data messages received by the DP circuit, and (2) a redirection record redirecting data messages that do not match the first plurality of records to a DP egress port associated with the memory circuit. These embodiments configure an external memory circuit to store a larger, second set of records for matching with redirected data messages received through the DP egress port associated with the memory circuit. This external memory circuit is a hash-addressable memory in some embodiments. To determine whether a redirected data message matches a record in the second set of records, the method of some embodiments configures an FPGA associated with the hash-addressable external memory to use a collision-free hash process to generate a collision-free hash address value from a set of attributes of the data message. This hash address value specifies an address in the external memory for the record in the second set of records to compare with the redirected data message.
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that can be configured to provide protection from a SYN-flood denial of service attack by validating the source of SYN data messages before allowing future messages to be forwarded to a protected server. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to operate as a connection-validation circuit that includes (1) a SYN-processing circuit to process SYN data messages received by the data plane, and (2) an ACK-processing circuit to process ACK data messages received by the data plane.
G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
H04L 29/06 - Communication control; Communication processing characterised by a protocol
H04L 1/16 - Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
H04L 12/715 - Hierarchical routing, e.g. clustered networks or inter-domain routing
Some embodiments of the invention provide a forwarding element that can be configured through in-band data-plane messages from a remote controller that is a physically separate machine from the forwarding element. The forwarding element of some embodiments has data plane circuits that include several configurable message-processing stages, several storage queues, and a data-plane configurator. A set of one or more message-processing stages of the data plane is configured (1) to process configuration messages received by the data plane from the remote controller and (2) to store the configuration messages in a set of one or more storage queues. The data-plane configurator receives the configuration messages stored in the set of storage queues and configures one or more of the configurable message-processing stages based on configuration data in the configuration messages.
Some embodiments of the invention provide a network forwarding element that can be dynamically reconfigured to adjust its data message processing to stay within a desired operating temperature or power consumption range. In some embodiments, the network forwarding element includes (1) a data-plane forwarding circuit (“data plane”) to process data tuples associated with data messages received by the forwarding element, and (2) a control-plane circuit (“control plane”) for configuring the data plane forwarding circuit. The data plane includes several data processing stages to process the data tuples. The data plane also includes an idle-signal injecting circuit that receives from the control plane configuration data that the control plane generates based on the forwarding element's temperature. Based on the received configuration data, the idle-signal injecting circuit generates idle control signals for the data processing stages. Each stage that receives an idle control signal enters an idle state during which the majority of the components of that stage do not perform any operations, which reduces the power consumed and temperature generated by that stage during its idle state.
Some embodiments of the invention provide novel methods for storing data in a hash-addressed memory and retrieving stored data from the hash-addressed memory. In some embodiments, the method receives a search key and a data tuple. The method then uses a first hash function to generate a first hash value from the search key, and then uses this first hash value to identify an address in the hash-addressed memory. The method also uses a second hash function to generate a second hash value, and then stores this second hash value along with the data tuple in the memory at the address specified by the first hash value. To retrieve data from the hash-addressed memory, the method of some embodiments receives a search key. The method then uses the first hash function to generate a first hash value from the search key, and then uses this first hash value to identify an address in the hash-addressed memory. At the identified address, the hash-addressed memory stores a second hash value and a data tuple. The method retrieves a second hash value from the memory at the identified address, and compares this second hash value with a third hash value that the method generates from the search key by using the second hash function. When the second and third hash values match, the method retrieves the data tuple that the memory stores at the identified address.
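The store/retrieve scheme just described maps naturally onto a short Python sketch; the concrete hash functions (CRC32 for addressing, a truncated SHA-256 as the stored fingerprint) and the class name are assumptions.

# Hypothetical sketch: one hash of the search key picks the memory address, a second,
# independent hash is stored next to the data tuple, and a read only returns the tuple when
# the recomputed second hash matches the stored one.
import zlib, hashlib

def addr_hash(key, size):                       # first hash function -> memory address
    return zlib.crc32(key.encode()) % size

def fingerprint_hash(key):                      # second hash function -> stored value
    return hashlib.sha256(key.encode()).digest()[:2]

class HashAddressedMemory:
    def __init__(self, size):
        self.cells = [None] * size

    def store(self, key, data_tuple):
        self.cells[addr_hash(key, len(self.cells))] = (fingerprint_hash(key), data_tuple)

    def retrieve(self, key):
        cell = self.cells[addr_hash(key, len(self.cells))]
        if cell and cell[0] == fingerprint_hash(key):   # second and third hash values match
            return cell[1]
        return None                                      # empty cell or a colliding key

mem = HashAddressedMemory(size=4096)
mem.store("flow-42", ("10.0.0.1", 80))
print(mem.retrieve("flow-42"))                  # ('10.0.0.1', 80)
print(mem.retrieve("flow-99"))                  # None (hash mismatch or empty cell)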
A method of identifying a path for forwarding a packet by a packet forwarding element is provided. The method receives a packet that includes a plurality of fields that identify a particular packet flow. The method computes a plurality of hash values from the plurality of fields that identify the particular packet flow. Each hash value is computed using a different hash algorithm. Based on the plurality of hash values, the method identifies a plurality of paths configured to forward the packets of the particular flow. The method identifies the status of each of the plurality of paths. Each path status identifies whether or not the corresponding path is operational. The method selects an operational path in the plurality of paths to forward the packet based on a priority scheme that uses the plurality of identified path statuses.
H04L 69/325 - Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the network layer [OSI layer 3], e.g. X.25
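A Python sketch of the multi-hash path selection in the preceding abstract follows; the specific hash algorithms and the "first live candidate wins" priority rule are assumptions used for illustration.

# Hypothetical sketch: several different hashes of the flow-identifying fields each nominate
# one candidate path, the per-path status bits say which candidates are operational, and a
# fixed priority order picks the path that forwards the packet.
import hashlib

def flow_hashes(flow_fields, num_paths, algorithms=("md5", "sha1", "sha256")):
    key = "|".join(map(str, flow_fields)).encode()
    # Each hash value is computed with a different algorithm, as in the abstract.
    return [int(hashlib.new(a, key).hexdigest(), 16) % num_paths for a in algorithms]

def select_path(flow_fields, path_status):
    candidates = flow_hashes(flow_fields, num_paths=len(path_status))
    for path in candidates:                 # priority order: first hash first (assumed)
        if path_status[path]:               # status bit set -> path operational
            return path
    return None                             # no operational candidate

path_status = [True, False, True, True]     # path 1 has failed
flow = ("10.0.0.1", "10.0.0.9", 6, 5555, 443)
print(flow_hashes(flow, num_paths=4))       # candidate paths nominated by each hash
print(select_path(flow, path_status))       # highest-priority operational candidate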
Some embodiments provide a network forwarding element with a data-plane forwarding circuit that has a parameter collecting circuit to store and distribute parameter values computed by several machines in a network. In some embodiments, the machines perform distributed computing operations, and the parameter values that they compute are associated with the distributed computing operations. The parameter collecting circuit of the data-plane forwarding circuit (data plane) in some embodiments (1) stores a set of parameter values computed and sent by a first set of machines, and (2) distributes the collected parameter values to a second set of machines once it has collected the set of parameter values from all the machines in the first set. The first and second sets of machines are the same set of machines in some embodiments, while they are different sets of machines (e.g., one set has at least one machine that is not in the other set) in other embodiments. In some embodiments, the parameter collecting circuit performs computations on the parameter values that it collects and distributes the result of the computations once it has processed all the parameter values distributed by the first set of machines. The computations are aggregating operations (e.g., adding, averaging, etc.) that combine corresponding subsets of parameter values distributed by the first set of machines.
Some embodiments provide a method for an ingress packet processing pipeline of a network forwarding integrated circuit (IC). The ingress packet processing pipeline is for receiving packets from a port of the network forwarding IC and processing the packets to assign different packets to different queues of a traffic management unit of the network forwarding IC. The method receives state data from the traffic management unit. The method stores the state data in a stateful table. The method assigns a particular packet to a particular queue based on the state data received from the traffic management unit and stored in the stateful table.
H04L 47/32 - Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
H04L 47/628 - Queue scheduling characterised by scheduling criteria for service slots or service orders based on packet size, e.g. shortest packet first
H04L 49/109 - Integrated on microchip, e.g. switch-on-chip
H04L 47/62 - Queue scheduling characterised by scheduling criteria
Some embodiments of the invention provide a forwarding element (e.g., a switch, a router, etc.) that has one or more data-plane message-processing pipelines with key-value processing circuits. The forwarding element's data plane key-value circuits allow the forwarding element to perform key-value services that would otherwise have to be performed by data compute nodes connected by the network fabric that includes the forwarding element. In some embodiments, the key-value (KV) services of the forwarding element and other similar forwarding elements supplement the key-value services of a distributed set of key-value servers by caching a subset of the most commonly used key-value pairs in the forwarding elements that connect the set of key-value servers with their client applications. In some embodiments, the key-value circuits of the forwarding element perform the key-value service operations at message-processing line rates at which the forwarding element forwards messages to the data compute nodes and/or to other network forwarding elements in the network fabric.
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that can be configured to identify large data message flows that it processes for forwarding in a network. In this document, large data message flows are referred to as heavy hitter flows. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to implement a heavy hitter detection (HHD) circuit. The operations of the data plane's message processing stages are configured by a control plane of the data plane's forwarding element in some embodiments.
H04L 47/12 - Avoiding congestion; Recovering from congestion
H04L 47/2441 - Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
H04L 47/2483 - Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
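One common way to build such a detector is a count-min-sketch-style structure; the Python sketch below uses that approach as an illustrative assumption (the abstract does not name the data structure), with hypothetical class and parameter names.

# Hypothetical count-min-sketch-style sketch of a heavy-hitter detector: several
# hash-indexed counter rows are updated per packet, the minimum across rows bounds a flow's
# size, and flows whose estimate crosses a threshold are reported as heavy hitters.
import zlib

class HeavyHitterDetector:
    def __init__(self, rows=3, width=1024, threshold=1_000_000):
        self.width, self.threshold = width, threshold
        self.counters = [[0] * width for _ in range(rows)]

    def _index(self, row, flow_key):
        return zlib.crc32(f"{row}|{flow_key}".encode()) % self.width

    def observe(self, flow_key, pkt_len):
        estimates = []
        for row, counters in enumerate(self.counters):
            i = self._index(row, flow_key)
            counters[i] += pkt_len
            estimates.append(counters[i])
        return min(estimates) >= self.threshold       # True -> flagged as a heavy hitter

detector = HeavyHitterDetector(threshold=3000)
for _ in range(3):
    flagged = detector.observe(("10.0.0.1", "10.0.0.2", 443), pkt_len=1500)
print(flagged)      # True: the flow's estimated size has crossed the 3000-byte threshold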
Some embodiments provide a method for a network forwarding integrated circuit (IC). The method receives packet data with an instruction to copy a portion of the packet data to a temporary storage of the network forwarding IC. The portion is larger than a maximum entry size of the temporary storage. The method generates a header for each of multiple packet data sections for storage in entries of the temporary storage, with each packet data section including a sub-portion of the packet data portion. The method sends the packet data sections with the generated headers to the temporary storage for storage in multiple separate temporary storage entries.
A forwarding element includes data-plane forwarding circuitry for forwarding data messages received by the forwarding element to other network elements in a network. The data-plane forwarding circuitry includes several snapshot-match circuitry units. Each snapshot-match circuitry unit compares a set of header fields of incoming data messages with corresponding matching data. The data-plane forwarding circuitry also includes several snapshot-capture circuitry units. Each snapshot-capture circuitry unit stores a set of header fields of data messages that matches corresponding matching data.
Some embodiments provide a method for an ingress packet processing pipeline of a network forwarding integrated circuit (IC). The ingress packet processing pipeline is for receiving packets from a port of the network forwarding IC and processing the packets to assign different packets to different queues of a traffic management unit of the network forwarding IC. The method receives state data from the traffic management unit. The method stores the state data in a stateful table. The method assigns a particular packet to a particular queue based on the state data received from the traffic management unit and stored in the stateful table.
H04L 12/933 - Switch core, e.g. crossbar, shared memory or shared medium
H04L 12/709 - Route fault prevention or recovery, e.g. rerouting, route redundancy, virtual router redundancy protocol [VRRP] or hot standby router protocol [HSRP] using path redundancy using M+N parallel active paths
45.
Forwarding element data plane with computing parameter distributor
Some embodiments provide a network forwarding element with a data-plane forwarding circuit that has a parameter collecting circuit to store and distribute parameter values computed by several machines in a network. In some embodiments, the machines perform distributed computing operations, and the parameter values that they compute are associated with the distributed computing operations. The parameter collecting circuit of the data-plane forwarding circuit (data plane) in some embodiments (1) stores a set of parameter values computed and sent by a first set of machines, and (2) distributes the collected parameter values to a second set of machines once it has collected the set of parameter values from all the machines in the first set. The first and second sets of machines are the same set of machines in some embodiments, while they are different sets of machines (e.g., one set has at least one machine that is not in the other set) in other embodiments. In some embodiments, the parameter collecting circuit performs computations on the parameter values that it collects and distributes the result of the computations once it has processed all the parameter values distributed by the first set of machines. The computations are aggregating operations (e.g., adding, averaging, etc.) that combine corresponding subsets of parameter values distributed by the first set of machines.
Some embodiments provide a method for a hardware forwarding element. Based on a set of characteristics of a packet, the method determines to copy the packet to a particular temporary storage of a set of temporary storages of the hardware forwarding element. Based on a property of the particular temporary storage, the method stores only a particular portion of the packet in the particular temporary storage. A same-size portion of each packet copied to the particular temporary storage is stored in the particular temporary storage.
Some embodiments of the invention provide a network forwarding element that can be dynamically reconfigured to adjust its data message processing to stay within a desired operating temperature or power consumption range. In some embodiments, the network forwarding element includes (1) a data-plane forwarding circuit (“data plane”) to process data tuples associated with data messages received by the forwarding element, and (2) a control-plane circuit (“control plane”) for configuring the data plane forwarding circuit. The data plane includes several data processing stages to process the data tuples. The data plane also includes an idle-signal injecting circuit that receives from the control plane configuration data that the control plane generates based on the forwarding element's temperature. Based on the received configuration data, the idle-signal injecting circuit generates idle control signals for the data processing stages. Each stage that receives an idle control signal enters an idle state during which the majority of the components of that stage do not perform any operations, which reduces the power consumed and temperature generated by that stage during its idle state.
A method of detecting errors in a data plane of a packet forwarding element that includes a plurality of physical ternary content-addressable memories (TCAMs) is provided. The method configures a first set of physical TCAMs into a first logical TCAM. The method configures a second set of physical TCAMs into a second logical TCAM. The second logical TCAM includes the same number of physical TCAMs as the first logical TCAM. The method programs the first and second logical TCAMs to store a same set of data. The method requests a search for a particular content from the first and second logical TCAMs. The method generates an error signal when the first and second logical TCAMs do not produce the same search result.
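A compact Python model of this redundant-lookup check follows; the LogicalTcam class and the bank layout are hypothetical, and the ternary matching is the usual value/mask comparison.

# Hypothetical model: two logical TCAMs are built from equal-sized groups of physical
# TCAMs, programmed with the same data, searched in parallel, and any disagreement between
# their results raises an error signal.

def ternary_match(key, value, mask):
    return (key & mask) == (value & mask)

class LogicalTcam:
    def __init__(self, num_physical):
        self.banks = [[] for _ in range(num_physical)]     # one list per physical TCAM

    def program(self, bank, value, mask, priority, result):
        self.banks[bank].append((priority, value, mask, result))

    def search(self, key):
        hits = [(p, r) for bank in self.banks
                       for (p, v, m, r) in bank if ternary_match(key, v, m)]
        return max(hits)[1] if hits else None

def checked_search(tcam_a, tcam_b, key):
    a, b = tcam_a.search(key), tcam_b.search(key)
    if a != b:
        return None, "ERROR: logical TCAMs disagree"       # error signal
    return a, "ok"

primary, shadow = LogicalTcam(2), LogicalTcam(2)
for t in (primary, shadow):
    t.program(bank=0, value=0x0A000000, mask=0xFF000000, priority=1, result="port 3")
print(checked_search(primary, shadow, 0x0A0A0A0A))          # ('port 3', 'ok')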
Some embodiments provide a data-plane forwarding circuit that can be configured to learn about a new message flow and to maintain metadata about the new message flow without a control plane first configuring the data plane to maintain metadata about the flow. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to operate as a flow-tracking circuit that includes (1) a flow-identifying circuit to identify message flows received by the data plane, and (2) a first set of storages to store metadata about the identified flows.
Some embodiments of the invention provide a forwarding element that has one or more data-plane message-processing pipelines with key-value processing circuits. The forwarding element's data plane key-value circuits allow the forwarding element to perform key-value services that would otherwise have to be performed by data compute nodes connected by the network fabric that includes the forwarding element. In some embodiments, the key-value (KV) services of the forwarding element and other similar forwarding elements supplement the key-value services of a distributed set of key-value servers by caching a subset of the most commonly used key-value pairs in the forwarding elements that connect the set of key-value servers with their client applications. In some embodiments, the key-value circuits of the forwarding element perform the key-value service operations at message-processing line rates at which the forwarding element forwards messages to the data compute nodes and/or to other network forwarding elements.
Some embodiments provide a method for a hardware forwarding element that includes multiple queues. The method receives a packet at a multi-stage processing pipeline of the hardware forwarding element. The method determines, at one of the stages of the processing pipeline, to modify a setting of a particular one of the queues. The method stores an identifier for the particular queue and instructions to modify the queue setting with data passed through the processing pipeline for the packet. The stored information is subsequently used by the hardware forwarding element to modify the queue setting.
In a method for allocating physical queues of a network forwarding element, a request is received at the network forwarding element. The network forwarding element includes a plurality of physical queues, each having a fixed bandwidth, and the request identifies an allocation of a plurality of virtual queues at the network forwarding element. Based at least in part on the request, a configuration of the plurality of physical queues to the plurality of virtual queues is determined. The plurality of physical queues is configured according to the configuration, wherein the configuring includes allocating at least two physical queues to a virtual queue.
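A brief Python sketch of such an allocation step follows; the greedy first-fit policy and all names are assumptions standing in for the claimed configuration logic.

# Hypothetical sketch: each physical queue has a fixed bandwidth, and a virtual queue that
# asks for more than that is backed by several physical queues.
import math

def allocate_virtual_queues(physical_queue_ids, physical_bw_gbps, virtual_requests):
    free = list(physical_queue_ids)
    configuration = {}
    for vq, requested_gbps in virtual_requests.items():
        needed = math.ceil(requested_gbps / physical_bw_gbps)
        if needed > len(free):
            raise ValueError(f"not enough physical queues for {vq}")
        configuration[vq] = [free.pop(0) for _ in range(needed)]
    return configuration

# A virtual queue asking for 25 Gb/s gets three 10 Gb/s physical queues.
config = allocate_virtual_queues(physical_queue_ids=list(range(8)),
                                 physical_bw_gbps=10,
                                 virtual_requests={"vq-a": 25, "vq-b": 10})
print(config)                               # {'vq-a': [0, 1, 2], 'vq-b': [3]}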
Some embodiments provide a method for a packet processing pipeline of a network forwarding integrated circuit. The method stores two copies of a stateful table used by the packet processing pipeline. The stateful table is modified according to data processed by the packet processing pipeline. Upon receiving data to write to the stateful table, the method generates (i) a first copy of the received data along with an indicator for a first one of the copies of the stateful table and (ii) a second copy of the received data along with an indicator for a second one of the copies of the stateful table. The method sends the first copy of the received data into the packet processing pipeline before sending the second copy of the received data into the packet processing pipeline.
Some embodiments of the invention provide a network forwarding element that can be dynamically reconfigured to adjust its data message processing to stay within a desired operating temperature or power consumption range. In some embodiments, the network forwarding element includes (1) a data-plane forwarding circuit (“data plane”) to process data tuples associated with data messages received by the forwarding element, and (2) a control-plane circuit (“control plane”) for configuring the data plane forwarding circuit. The data plane includes several data processing stages to process the data tuples. The data plane also includes an idle-signal injecting circuit that receives from the control plane configuration data that the control plane generates based on the forwarding element's temperature. Based on the received configuration data, the idle-signal injecting circuit generates idle control signals for the data processing stages. Each stage that receives an idle control signal enters an idle state during which the majority of the components of that stage do not perform any operations, which reduces the power consumed and temperature generated by that stage during its idle state.
H04L 41/0833 - Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability for reduction of network energy consumption
G06F 1/3234 - Power saving characterised by the action undertaken
G06F 1/04 - Generating or distributing clock signals or signals derived directly therefrom
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that has a flow-size detection circuit that generates flow-size density distribution for all or some of the data message flows that it processes for forwarding in a network. The flow-size (FS) detection circuit in some embodiments generates statistical values regarding the processed data message flows, and based on these statistical values, it generates a FS density distribution that expresses a number of flows in different flow-size sub-ranges in a range of flow sizes. In some embodiments, the density distribution is a probabilistic density distribution that is based on probabilistic statistical values that the flow-size detection circuit generates for the data message flows that are processed for forwarding within the network. The FS detection circuit in some embodiments generates probabilistic statistical values for the data message flows by generating hash values from header values of the data message flows and accumulating flow-size values at memory locations identified by the generated hash values. In some embodiments, the generated hashes for different data message flows can collide, which results in the accumulated flow-size values being probabilistic values that might have a certain level of inaccuracy.
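The probabilistic density distribution described above can be approximated with a short Python sketch; the cell count, the CRC32 hash, and the bucket edges are illustrative assumptions.

# Hypothetical sketch: per-packet byte counts are accumulated at a hash-indexed location
# per flow (hash collisions make the counts probabilistic), and the occupied counters are
# then bucketed into flow-size sub-ranges to form the density distribution.
import zlib
from collections import Counter

NUM_CELLS = 4096
cells = [0] * NUM_CELLS                               # accumulated flow-size values

def observe(flow_key, pkt_len):
    cells[zlib.crc32(flow_key.encode()) % NUM_CELLS] += pkt_len

def density_distribution(bucket_edges):
    # Count how many (approximate) flows fall in each size sub-range.
    dist = Counter()
    for size in cells:
        if size == 0:
            continue
        for low, high in bucket_edges:
            if low <= size < high:
                dist[(low, high)] += 1
                break
    return dict(dist)

for i in range(100):
    observe(f"small-flow-{i}", pkt_len=200)
observe("elephant-flow", pkt_len=500_000)
print(density_distribution([(1, 1_000), (1_000, 100_000), (100_000, 10**9)]))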
Some embodiments provide a method for a parser of a processing pipeline. The method receives a packet for processing by a set of match-action stages of the processing pipeline. The method stores packet header field (PHF) values from a first set of PHFs of the packet in a set of data containers. The first set of PHFs is for use by the match-action stages. For a second set of PHFs not used by the match-action stages, the method generates descriptive data that identifies locations of the PHFs of the second set within the packet. The method sends (i) the set of data containers to the match-action stages and (ii) the packet data and the generated descriptive data outside of the match-action stages to a deparser that uses the packet data, generated descriptive data, and the set of data containers as modified by the match-action stages to reconstruct a modified packet.
H04L 47/2441 - Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
H04L 67/63 - Routing a service request depending on the request content or context
H04L 45/64 - Routing or path finding of packets in data switching networks using an overlay routing layer
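A minimal sketch of the parser behavior described in the abstract above, assuming a made-up field layout: fields the match-action stages use are copied into data containers, while the remaining fields are represented only by (offset, length) descriptors that travel with the raw packet data to the deparser.

```python
# Minimal sketch: extract the header fields the match-action stages use into
# data containers, and emit only (offset, length) descriptors for the rest.
# The field layout and the "used" set are hypothetical.

FIELD_LAYOUT = {            # field -> (byte offset, length) in the packet
    "eth_dst":  (0, 6),
    "eth_src":  (6, 6),
    "eth_type": (12, 2),
    "ip_ttl":   (22, 1),
    "ip_dst":   (30, 4),
}
USED_BY_PIPELINE = {"eth_dst", "ip_dst", "ip_ttl"}

def parse(packet: bytes):
    containers = {}          # PHF values handed to the match-action stages
    descriptors = {}         # locations of PHFs the stages never touch
    for name, (off, length) in FIELD_LAYOUT.items():
        if name in USED_BY_PIPELINE:
            containers[name] = packet[off:off + length]
        else:
            descriptors[name] = (off, length)
    # The raw packet and the descriptors bypass the match-action stages and
    # go straight to the deparser.
    return containers, descriptors, packet

if __name__ == "__main__":
    pkt = bytes(range(64))
    containers, descriptors, _ = parse(pkt)
    print(containers)
    print(descriptors)
```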
57.
Identifying and marking failed egress links in data plane
A method of identifying a failed egress link of a hardware forwarding element is provided. The method detects an egress link failure in a data plane of the forwarding element. The method generates a link failure signal in the data plane identifying the failed egress link. The method generates a packet that includes the identification of the egress link based on the link failure signal. The method sets the status of the egress link to failed in the data plane based on the identification of the egress link in the generated packet.
Embodiments described herein provide a network tester that is configured to perform packet modification at an egress pipeline of a programmable packet engine. A packet stream is received at an egress pipeline of an output port of the programmable packet engine, wherein the output port includes a packet modifier. Packets of the packet stream are modified at the packet modifier. The packet stream, including the modified packets, is transmitted through the egress pipeline of the output port.
Some embodiments provide a method for a hardware forwarding element. Based on a set of characteristics of a packet, the method determines to copy the packet to a particular temporary storage of a set of temporary storages of the hardware forwarding element. Based on a property of the particular temporary storage, the method stores only a particular portion of the packet in the particular temporary storage. A same-size portion of each packet copied to the particular temporary storage is stored in the particular temporary storage.
Some embodiments provide a method for a match-action stage of a packet processing pipeline. The method receives a set of data containers storing input packet data values for a particular packet. The set of data containers includes multiple subsets of data containers. The method performs a set of match operations using a first subset of the set of data containers. The method uses a set of arithmetic logic units (ALUs) to generate output packet data values to store in a second subset of the set of data containers. Output packet data values for a third subset of the data containers are generated without the set of ALUs.
A method of configuring a forwarding element that includes several message processing stages is provided. The method identifies a first processing stage that starts processing a first header field of a message and a second processing stage that is the last message processing stage that processes the first header field. The method configures a field of a packet header container to store the first header field from the beginning of the first message processing stage. The method identifies a second header field used in a third processing stage after the second processing stage. The method configures a set of circuitries in the data plane to initialize the container field after the end of the second processing stage. The method configures the field of the container to store the second header field of the message after the end of the second processing stage and before the start of the third processing stage.
H04L 12/741 - Header address processing for routing, e.g. table lookup
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06F 7/57 - Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups or for performing logical operations
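The container-reuse idea above is essentially a live-range assignment problem: a container field can be reinitialized and reused once the last stage that touches its current header field has passed. The sketch below shows a greedy assignment of that kind for invented field names and stage numbers; it is not the compiler used by the embodiments.

```python
# Minimal sketch: share one packet-header container between header fields
# whose live ranges (first stage used .. last stage used) do not overlap.
# Stage indices and field names are invented for the example.

def live_ranges(uses: dict) -> dict:
    """Map each field to (first stage that uses it, last stage that uses it)."""
    return {f: (min(stages), max(stages)) for f, stages in uses.items()}

def assign_containers(uses: dict) -> dict:
    """Greedy assignment: reuse a container once its current field is dead."""
    ranges = sorted(live_ranges(uses).items(), key=lambda kv: kv[1][0])
    container_free_at = []           # last stage at which each container is busy
    assignment = {}
    for field, (first, last) in ranges:
        for idx, busy_until in enumerate(container_free_at):
            if busy_until < first:   # container is free before this field starts
                assignment[field] = idx
                container_free_at[idx] = last
                break
        else:
            assignment[field] = len(container_free_at)
            container_free_at.append(last)
    return assignment

if __name__ == "__main__":
    # field -> stages that read or write it
    uses = {"vlan_id": [0, 1, 2], "tunnel_key": [4, 5], "mirror_id": [3]}
    print(assign_containers(uses))
    # All three live ranges are disjoint here, so a single container serves
    # them in turn, with the field initialized between uses.
```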
Some embodiments of the invention provide a method for reporting congestion in a network that includes several forwarding elements. In a data plane circuit of one of the forwarding elements, the method detects that a queue in the switching circuit of the data plane circuit is congested while a particular data message is stored in the queue as it is being processed through the data plane circuit. In the data plane circuit, the method then generates a report regarding the detected queue congestion, and sends this report to a data collector external to the forwarding element. To send the report, the data plane circuit in some embodiments duplicates the particular data message, stores information regarding the detected queue congestion in the duplicate data message, and sends the duplicate data message to the external data collector.
H04J 3/16 - Time-division multiplex systems in which the time allocation to individual channels within a transmission cycle is variable, e.g. to accommodate varying complexity of signals, to vary number of channels transmitted
H04L 12/28 - Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
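A minimal sketch of the duplicate-and-report step described above, using hypothetical Message and queue structures: when a queue crosses an invented depth threshold, the currently enqueued message is copied, congestion metadata is written into the copy, and the copy is handed to a stand-in for the external collector.

```python
# Minimal sketch: when a queue crosses its congestion threshold while a message
# is enqueued, duplicate the message, embed congestion metadata in the copy,
# and send the copy to an external collector. Field names are hypothetical.
from dataclasses import dataclass, field

CONGESTION_THRESHOLD = 100   # queue depth (in messages) considered congested

@dataclass
class Message:
    payload: bytes
    metadata: dict = field(default_factory=dict)

def enqueue(queue: list, msg: Message, queue_id: int, collector: list) -> None:
    queue.append(msg)
    if len(queue) > CONGESTION_THRESHOLD:
        report = Message(payload=msg.payload,
                         metadata={"queue_id": queue_id,
                                   "queue_depth": len(queue),
                                   "congested": True})
        collector.append(report)        # stands in for sending to the collector

if __name__ == "__main__":
    queue, collector = [], []
    for i in range(105):
        enqueue(queue, Message(payload=b"pkt%d" % i), queue_id=3, collector=collector)
    print(len(collector), "congestion reports generated")
```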
Some embodiments provide a method for a hardware forwarding element deparser. The method receives, from a match-action pipeline, (i) packet header field values stored in a set of data containers and (ii) a set of data indicating which packet header fields, of multiple possible packet header fields, to include in a packet constructed from the packet header field values. The method uses the received set of data and a list of data container identifiers for multiple possible packet header fields to generate an ordered list of references to data containers of the set of data containers. Based on the ordered list, the method constructs the packet using the packet header field values stored in the referenced data containers.
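A minimal sketch of the deparser step just described, with an invented field order and container identifiers: a validity set selects which of the possible header fields appear in this packet, an ordered list of container references is built from it, and the packet is reconstructed by concatenating the referenced container contents.

```python
# Minimal sketch: the deparser walks an ordered list of possible header fields,
# keeps only those flagged as valid for this packet, and concatenates the
# referenced container contents. Field order and names are illustrative.

FIELD_ORDER = ["eth_dst", "eth_src", "eth_type", "vlan", "ip_header"]

def build_reference_list(valid_fields: set, container_of: dict) -> list:
    """Ordered container identifiers for the fields present in this packet."""
    return [container_of[f] for f in FIELD_ORDER if f in valid_fields]

def construct_packet(reference_list: list, containers: dict) -> bytes:
    return b"".join(containers[ref] for ref in reference_list)

if __name__ == "__main__":
    container_of = {"eth_dst": "c0", "eth_src": "c1", "eth_type": "c2",
                    "vlan": "c3", "ip_header": "c4"}
    containers = {"c0": b"\xaa" * 6, "c1": b"\xbb" * 6, "c2": b"\x08\x00",
                  "c3": b"\x81\x00\x00\x0a", "c4": b"\x45" + b"\x00" * 19}
    # This packet carries no VLAN tag, so the vlan container is skipped.
    refs = build_reference_list({"eth_dst", "eth_src", "eth_type", "ip_header"},
                                container_of)
    print(refs)
    print(len(construct_packet(refs, containers)), "bytes")
```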
Some embodiments provide a method for a hardware forwarding element. The method receives a packet to add to a buffer. The packet is assigned a packet class. The method determines an amount of buffer space available for the assigned packet class. Different packet classes have different amounts of buffer space available in the buffer. When the available buffer space for the assigned packet class is large enough for the received packet, the method adds the packet to the buffer.
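A minimal sketch of per-class buffer admission as described above. The class names and byte budgets are invented; the point is only that each class is accounted for separately, so a packet is admitted only when its own class still has room.

```python
# Minimal sketch: per-class buffer accounting. Each packet class has its own
# budget; a packet is admitted only if its class still has room.

class ClassAwareBuffer:
    def __init__(self, budgets: dict):
        self.budgets = dict(budgets)     # class -> remaining bytes
        self.buffer = []

    def try_add(self, packet: bytes, packet_class: str) -> bool:
        if self.budgets.get(packet_class, 0) < len(packet):
            return False                  # not enough space left for this class
        self.budgets[packet_class] -= len(packet)
        self.buffer.append((packet_class, packet))
        return True

if __name__ == "__main__":
    buf = ClassAwareBuffer({"control": 256, "best_effort": 64})
    print(buf.try_add(b"x" * 100, "control"))      # True
    print(buf.try_add(b"x" * 100, "best_effort"))  # False: class budget too small
```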
A method of forwarding a multicast packet by a physical forwarding element is provided. The method receives a multicast packet that identifies a multicast group. The method scans a multicast tree associated with the multicast group to identify an ECMP group for forwarding the multicast packet to a member of the multicast group. The method calculates a group of hash values on several fields of the packet and uses a first hash value in the group of hash values to identify a first path in the ECMP group. The method determines that the first path has failed. The method uses a second hash value in the group to identify a second path in the ECMP group. The method forwards the multicast packet to the multicast member through the second path.
H04L 12/709 - Route fault prevention or recovery, e.g. rerouting, route redundancy, virtual router redundancy protocol [VRRP] or hot standby router protocol [HSRP] using path redundancy using M+N parallel active paths
H04L 12/753 - Routing tree discovery, e.g. converting from mesh topology to tree topology
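A minimal sketch of the hash-group fallback described in the abstract above: several differently salted hashes are computed over invented packet fields, the first hash selects an ECMP member, and the next hash in the group is consulted when the selected path is marked failed. The CRC32 salting and the liveness vector are illustrative only.

```python
# Minimal sketch: pick an ECMP member using a first hash of the packet fields,
# and fall back to a second (differently salted) hash if the chosen path is
# marked failed. Hash salts and the liveness vector are illustrative.
import zlib

def hash_group(fields: tuple, salts=(0x1, 0x2, 0x3)) -> list:
    data = repr(fields).encode()
    return [zlib.crc32(bytes([s]) + data) for s in salts]

def select_path(fields: tuple, ecmp_paths: list, path_up: list):
    for h in hash_group(fields):
        candidate = h % len(ecmp_paths)
        if path_up[candidate]:
            return ecmp_paths[candidate]
    return None   # every hashed candidate points at a failed path

if __name__ == "__main__":
    paths = ["port1", "port2", "port3", "port4"]
    up = [True, False, True, True]      # port2 has failed
    print(select_path(("239.1.1.1", "10.0.0.7", 5000), paths, up))
```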
69.
Runtime sharing of unit memories between match tables in a network forwarding element
A method of sharing unit memories between two match tables in a data plane packet processing pipeline of a physical forwarding element is provided. From a plurality of available unit memories of the packet processing pipeline, the method allocates a first set of unit memories to a first match table and a second set of unit memories to a second match table. The method determines that the first set of unit memories is filled to a threshold capacity after storing a plurality of entries in the first set of unit memories. The method de-allocates a first unit memory from the second match table by moving contents of the first unit memory to a second unit memory in the second set of unit memories. The method allocates the first unit memory to the first match table.
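A minimal sketch of the reallocation step described above, with invented unit-memory sizes and a made-up fill threshold: once table A crosses the threshold, the entries of one of table B's unit memories are moved into another of B's memories and the freed unit memory is handed to A.

```python
# Minimal sketch: hand one of match table B's unit memories to match table A
# once A is filled to a threshold. Unit capacity and threshold are invented.

UNIT_CAPACITY = 4            # entries per unit memory
THRESHOLD = 0.75             # fill ratio at which a table requests more memory

class MatchTable:
    def __init__(self, name, num_units):
        self.name = name
        self.units = [[] for _ in range(num_units)]   # each unit holds entries

    def fill_ratio(self):
        used = sum(len(u) for u in self.units)
        return used / (UNIT_CAPACITY * len(self.units))

def rebalance(table_a, table_b):
    """Move one unit memory from table_b to table_a if table_a is near full."""
    if table_a.fill_ratio() < THRESHOLD or len(table_b.units) < 2:
        return
    donor = table_b.units.pop(0)         # unit memory to reclaim from B
    target = table_b.units[0]            # another of B's unit memories
    if len(donor) + len(target) <= UNIT_CAPACITY:
        target.extend(donor)             # relocate B's entries out of the donor
        table_a.units.append([])         # the freed unit memory now serves A
    else:
        table_b.units.insert(0, donor)   # not enough room to relocate; undo

if __name__ == "__main__":
    a, b = MatchTable("A", 2), MatchTable("B", 3)
    a.units[0] = ["e1", "e2", "e3", "e4"]
    a.units[1] = ["e5", "e6"]
    b.units[0] = ["f1"]
    rebalance(a, b)
    print(len(a.units), "units in A;", len(b.units), "units in B")   # 3 and 2
```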
Some embodiments provide a method for a hardware forwarding element. The method adds a received packet to a buffer. The method determines whether adding the packet to the buffer causes the buffer to pass one of multiple flow control thresholds, each of which corresponds to a different packet priority. When adding the packet to the buffer causes the buffer to pass a particular flow control threshold corresponding to a particular priority, the method generates a flow control message for the particular priority.
Some embodiments of the invention provide a network forwarding element that can be dynamically reconfigured to adjust its data message processing to stay within a desired operating temperature or power consumption range. In some embodiments, the network forwarding element includes (1) a data-plane forwarding circuit (“data plane”) to process data tuples associated with data messages received by the IC, and (2) a control-plane circuit (“control plane”) for configuring the data plane forwarding circuit. The data plane includes several data processing stages to process the data tuples. The data plane also includes an idle-signal injecting circuit that receives from the control plane configuration data that the control plane generates based on the IC's temperature. Based on the received configuration data, the idle-signal injecting circuit generates idle control signals for the data processing stages. Each stage that receives an idle control signal enters an idle state during which most of the components of that stage do not perform any operations, which reduces the power consumed, and the heat generated, by that stage while it is idle.
Some embodiments provide a method for a parser of a processing pipeline. The method receives a packet for processing by a set of match-action stages of the processing pipeline. The method stores packet header field (PHF) values from a first set of PHFs of the packet in a set of data containers. The first set of PHFs are for use by the match-action stages. For a second set of PHFs not used by the match-action stages, the method generates descriptive data that identifies locations of the PHFs of the second set within the packet. The method sends (i) the set of data containers to the match-action stages and (ii) the packet data and the generated descriptive data outside of the match-action stages to a deparser that uses the packet data, generated descriptive data, and the set of data containers as modified by the match-action stages to reconstruct a modified packet.
Some embodiments provide a method for a deparser of a processing pipeline. The method receives, from a set of match-action stages of the pipeline, packet header field (PHF) values for a first set of PHFs of a packet processed by the match-action stages. The method also receives, directly from a parser of the pipeline, (i) packet data for the packet prior to any modification by the match-action stages and (ii) descriptive data that specifies locations within the packet data for a second set of PHFs of the packet that are not included in the first set of PHFs. The method constructs a packet from (i) the PHF values received for the first set of PHFs and (ii) the packet data received for the second set of PHFs. The descriptive data is used to extract packet header field values for the second set of PHFs from the packet data.
A forwarding element includes data plane forwarding circuitry for forwarding data messages received by the forwarding element to other network elements in a network. The data-plane forwarding circuitry includes several snapshot-match circuitry units. Each snapshot-match circuitry unit compares a set of header fields of incoming data messages with corresponding matching data. The data-plane forwarding circuitry also includes several snapshot-capture circuitry units. Each snapshot-capture circuitry unit stores a set of header fields of the data messages that match the corresponding matching data.
H04L 47/2441 - Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
Some embodiments provide a data plane circuit for a network forwarding element that searches for one or more patterns of characters stored in data messages received by the data plane circuit. In some embodiments, the data plane circuit analyzes the data messages as it processes them to forward them to their destinations in a network. Because the data messages are already flowing through the network, searching them for the character patterns as they pass through the data plane is more efficient than performing these searches on a separate set of servers, which typically operate at slower rates.
Some embodiments provide a data-plane forwarding circuit that can be configured to learn about a new message flow and to maintain metadata about the new message flow without requiring a control plane to first configure the data plane to maintain metadata about that flow. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to operate as a flow-tracking circuit that includes (1) a flow-identifying circuit to identify message flows received by the data plane, and (2) a first set of storages to store metadata about the identified flows.
Some embodiments of the invention provide a method for reporting congestion in a network that includes several forwarding elements. In a data plane circuit of one of the forwarding elements, the method detects that a queue in the switching circuit of the data plane circuit is congested while a particular data message is stored in the queue as it is being processed through the data plane circuit. In the data plane circuit, the method then generates a report regarding the detected queue congestion, and sends this report to a data collector external to the forwarding element. To send the report, the data plane circuit in some embodiments duplicates the particular data message, stores information regarding the detected queue congestion in the duplicate data message, and sends the duplicate data message to the external data collector.
Some embodiments provide a method for processing a packet for a pipeline of a hardware switch. The pipeline, in some embodiments, includes several different stages that match against packet header fields and modify packet header fields. The method receives a packet that includes a set of packet headers. The method then populates, for each packet header in the set of packet headers, (i) a first set of registers with packet header field values of the packet header that are used in the pipeline, and (ii) a second set of registers with packet header field values of the packet header that are not used in the pipeline.
A novel method for replicating and filtering multicast packets in a physical network is provided. Upon receiving a packet, the method generates a set of metadata as an ingress replication context for the received packet based on the content of the received packet. The generated ingress replication context includes a multicast group identifier, a replication identifier, a first layer exclusion identifier, and a second layer exclusion identifier. The method performs multicast replication of the packet by identifying logical ports and/or logical domains that are to be excluded from the multicast replication based on the content of the generated ingress replication context.
H04L 12/709 - Route fault prevention or recovery, e.g. rerouting, route redundancy, virtual router redundancy protocol [VRRP] or hot standby router protocol [HSRP] using path redundancy using M+N parallel active paths
H04L 12/743 - Header address processing for routing, e.g. table lookup using hashing techniques
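A minimal sketch of replication driven by the ingress replication context described above. The ReplicationContext fields, the group membership table, and all identifiers are invented; the sketch only shows how exclusion identifiers (here a port and a layer-2 domain) filter the set of replicas.

```python
# Minimal sketch: replicate a multicast packet to the group's ports, skipping
# any port whose port id or layer-2 domain matches an exclusion identifier in
# the ingress replication context. All identifiers are invented.
from dataclasses import dataclass

@dataclass
class ReplicationContext:
    mcast_group: int
    replication_id: int
    exclude_port: int        # e.g., the ingress port itself
    exclude_domain: int      # e.g., the source layer-2 domain

GROUP_MEMBERS = {            # mcast_group -> list of (port_id, domain_id)
    7: [(1, 10), (2, 10), (3, 20), (4, 30)],
}

def replicate(packet: bytes, ctx: ReplicationContext) -> list:
    copies = []
    for port_id, domain_id in GROUP_MEMBERS.get(ctx.mcast_group, []):
        if port_id == ctx.exclude_port or domain_id == ctx.exclude_domain:
            continue                         # excluded by the replication context
        copies.append((port_id, packet))
    return copies

if __name__ == "__main__":
    ctx = ReplicationContext(mcast_group=7, replication_id=42,
                             exclude_port=4, exclude_domain=10)
    print([port for port, _ in replicate(b"payload", ctx)])   # -> [3]
```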
81.
Configurable packet processing pipeline for handling non-packet data
Some embodiments provide a method for a packet processing pipeline of a network forwarding integrated circuit (IC). The packet processing pipeline includes multiple match-action stages for processing packets received by the network forwarding IC. Each packet is transmitted through the pipeline using a set of data containers. The method receives data, generated by the network forwarding IC, that is separate from the packets processed by the pipeline. The method transmits through the packet processing pipeline (i) a packet using a first set of data containers and (ii) the received data using a second set of data containers. The first and second sets of data containers are transmitted together through the packet processing pipeline. For at least one of the match-action stages, the method processes the packet data in the first set of data containers and the received data in the second set of data containers in the same clock cycle.
Some embodiments provide a method for processing a packet for a pipeline of a hardware switch. The pipeline, in some embodiments, includes several different stages that match against packet header fields and modify packet header fields. The method receives a packet that includes a set of packet headers. The method then populates, for each packet header in the set of packet headers, (i) a first set of registers with packet header field values of the packet header that are used in the pipeline, and (ii) a second set of registers with packet header field values of the packet header that are not used in the pipeline.
Some embodiments of the invention provide a path-and-latency tracking (PLT) method. At a forwarding element, this method in some embodiments detects the path traversed by a data message through a set of forwarding elements, and the latency that the data message experiences at each of the forwarding elements in the path. In some embodiments, the method has a forwarding element in the path insert its forwarding element identifier and path latency in a header of the data message that it forwards. The method of some embodiments also uses fast PLT operators in the data plane of the forwarding elements to detect new data message flows, to gather PLT data from these data message flows, and to detect path or latency changes for previously detected data message flows. In some embodiments, the method then uses control plane processes (e.g., of the forwarding elements or other devices) to collect and analyze the PLT data gathered in the data plane from new or existing flows.
Some embodiments provide a network forwarding element with a data-plane forwarding circuit that has a parameter collecting circuit to store and distribute parameter values computed by several machines in a network. In some embodiments, the machines perform distributed computing operations, and the parameter values that they compute are associated with the distributed computing operations. The parameter collecting circuit of the data-plane forwarding circuit (data plane) in some embodiments (1) stores a set of parameter values computed and sent by a first set of machines, and (2) distributes the collected parameter values to a second set of machines once it has collected the set of parameter values from all the machines in the first set. The first and second sets of machines are the same set of machines in some embodiments, while they are different sets of machines (e.g., one set has at least one machine that is not in the other set) in other embodiments. In some embodiments, the parameter collecting circuit performs computations on the parameter values that it collects and distributes the result of the computations once it has processed all the parameter values distributed by the first set of machines. The computations are aggregating operations (e.g., adding, averaging, etc.) that combine corresponding subsets of parameter values distributed by the first set of machines.
G06F 15/16 - Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
H04L 12/28 - Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
H04L 29/08 - Transmission control procedure, e.g. data link level control procedure
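A minimal sketch of the collect-then-distribute behavior described in the abstract above, with an element-wise average standing in for the aggregation: nothing is released until a vector has been received from every machine in the first set, and the aggregated result is then returned for distribution. The class and worker names are hypothetical.

```python
# Minimal sketch: collect one parameter vector from each worker, and once all
# expected workers have reported, return the aggregated result for
# distribution. The element-wise average is just one possible aggregation.

class ParameterCollector:
    def __init__(self, expected_workers):
        self.expected = set(expected_workers)
        self.received = {}                    # worker -> parameter vector

    def submit(self, worker: str, values: list):
        self.received[worker] = values
        if set(self.received) == self.expected:
            return self._aggregate()          # ready to distribute
        return None                           # still waiting for other workers

    def _aggregate(self):
        vectors = list(self.received.values())
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]

if __name__ == "__main__":
    coll = ParameterCollector({"m1", "m2", "m3"})
    print(coll.submit("m1", [1.0, 2.0]))      # None: waiting
    print(coll.submit("m2", [3.0, 4.0]))      # None: waiting
    print(coll.submit("m3", [5.0, 6.0]))      # [3.0, 4.0]: distribute to workers
```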
85.
Forwarding element data plane with computing parameter distributor
Some embodiments provide a network forwarding element with a data-plane forwarding circuit that has a parameter collecting circuit to store and distribute parameter values computed by several machines in a network. In some embodiments, the machines perform distributed computing operations, and the parameter values that they compute are associated with the distributed computing operations. The parameter collecting circuit of the data-plane forwarding circuit (data plane) in some embodiments (1) stores a set of parameter values computed and sent by a first set of machines, and (2) distributes the collected parameter values to a second set of machines once it has collected the set of parameter values from all the machines in the first set. The first and second sets of machines are the same set of machines in some embodiments, while they are different sets of machines (e.g., one set has at least one machine that is not in the other set) in other embodiments. In some embodiments, the parameter collecting circuit performs computations on the parameter values that it collects and distributes the result of the computations once it has processed all the parameter values distributed by the first set of machines. The computations are aggregating operations (e.g., adding, averaging, etc.) that combine corresponding subsets of parameter values distributed by the first set of machines.
Some embodiments provide a method for a traffic management circuit of a data plane forwarding circuit. The traffic management circuit receives data messages from a set of ingress pipelines and provides the data messages to a set of egress pipelines. The method identifies a flow control event. The method provides metadata regarding the flow control event to a message generation circuit of the data plane forwarding circuit via a bus between the traffic management circuit and the message generation circuit.
Some embodiments provide a method for a traffic management unit of a network forwarding integrated circuit (IC). The traffic management unit includes multiple queues for storing packets. Each stored packet is (i) received by the traffic management unit from one of multiple ingress packet processing pipelines and (ii) for processing by an egress packet processing pipeline after being released from the queue storing the packet. The method determines that a particular one of the queues has crossed a threshold amount of stored packet data. The method provides queue state data including an identifier of the particular queue and a current amount of data stored in the particular queue to at least a subset of the multiple ingress pipelines. The ingress pipelines use the provided data to process subsequent packets.
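A minimal sketch of the feedback loop just described, with an invented depth threshold: the traffic manager records a queue's identifier and depth once it crosses the threshold, and an ingress-side helper consults that state to steer subsequent packets away from the reported queue.

```python
# Minimal sketch: the traffic manager advertises (queue id, depth) whenever a
# queue crosses a threshold, and ingress pipelines consult that state when
# picking an output queue. Thresholds and queue ids are illustrative.

THRESHOLD = 80               # packets; crossing this triggers a state update

class TrafficManager:
    def __init__(self, num_queues):
        self.depths = [0] * num_queues
        self.advertised = {}                 # queue id -> last advertised depth

    def enqueue(self, qid: int):
        self.depths[qid] += 1
        if self.depths[qid] >= THRESHOLD:
            self.advertised[qid] = self.depths[qid]   # shared with ingress pipelines

def pick_queue(candidates: list, advertised: dict) -> int:
    """Ingress-side choice: avoid queues reported as over threshold."""
    healthy = [q for q in candidates if q not in advertised]
    return healthy[0] if healthy else candidates[0]

if __name__ == "__main__":
    tm = TrafficManager(num_queues=4)
    for _ in range(85):
        tm.enqueue(2)                        # queue 2 becomes congested
    print(tm.advertised)                     # {2: 85}
    print(pick_queue([2, 3], tm.advertised)) # ingress steers to queue 3
```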
A forwarding element includes data plane forwarding circuitry for forwarding data messages received by the forwarding element to other network elements in a network. The data-plane forwarding circuitry includes several snapshot-match circuitry units. Each snapshot-match circuitry unit compares a set of header fields of incoming data messages with corresponding matching data. The data-plane forwarding circuitry also includes several snapshot-capture circuitry units. Each snapshot-capture circuitry unit stores a set of header fields of the data messages that match the corresponding matching data.
Some embodiments provide a network forwarding integrated circuit (IC) including multiple configurable ingress pipelines, multiple configurable egress pipelines, a traffic management unit, and a statistics bus. The configurable ingress pipelines are for processing packets received from ports of the network forwarding IC. The configurable egress pipelines are for processing packets to be transmitted out the ports of the network forwarding IC. The traffic management unit includes multiple queues, each of which corresponds to one of the egress pipelines, and is for receiving a packet from an ingress pipeline and enqueuing the packet into one of the queues. The statistics bus connects the traffic management unit to at least a subset of the ingress pipelines, and is for providing the ingress pipelines with state information regarding the queues.
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that can be configured to identify large data message flows that it processes for forwarding in a network. In this document, large data message flows are referred to as heavy hitter flows. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to implement a heavy hitter detection (HHD) circuit. The operations of the data plane's message processing stages are configured by a control plane of the data plane's forwarding element in some embodiments.
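A minimal sketch of heavy-hitter detection with several hash-indexed counter rows, in the spirit of a count-min sketch: a flow is flagged once the minimum of its counters exceeds a byte threshold. The row count, width, threshold, and CRC32 salting are illustrative, not the embodiments' parameters.

```python
# Minimal sketch of heavy-hitter detection with multiple hash-indexed counter
# rows: a flow is flagged once the minimum of its counters exceeds a threshold.
import zlib

ROWS, WIDTH, THRESHOLD = 3, 256, 1000      # counter geometry and byte threshold
counters = [[0] * WIDTH for _ in range(ROWS)]

def indices(flow: tuple) -> list:
    data = repr(flow).encode()
    return [zlib.crc32(bytes([r]) + data) % WIDTH for r in range(ROWS)]

def observe(flow: tuple, nbytes: int) -> bool:
    """Update the flow's counters; return True if it now looks like a heavy hitter."""
    idx = indices(flow)
    for r, i in enumerate(idx):
        counters[r][i] += nbytes
    return min(counters[r][i] for r, i in enumerate(idx)) > THRESHOLD

if __name__ == "__main__":
    flow = ("10.0.0.1", "10.0.0.2", 6, 443)
    hits = [observe(flow, 300) for _ in range(5)]
    print(hits)    # the flow crosses the threshold on the fourth packet
```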
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that has a flow-size detection circuit that generates a flow-size density distribution for all or some of the data message flows that it processes for forwarding in a network. The flow-size (FS) detection circuit in some embodiments generates statistical values regarding the processed data message flows, and based on these statistical values, it generates an FS density distribution that expresses the number of flows in different flow-size sub-ranges within a range of flow sizes. In some embodiments, the density distribution is a probabilistic density distribution that is based on probabilistic statistical values that the flow-size detection circuit generates for the data message flows that are processed for forwarding within the network. The FS detection circuit in some embodiments generates probabilistic statistical values for the data message flows by generating hash values from header values of the data message flows and accumulating flow-size values at memory locations identified by the generated hash values. In some embodiments, the generated hashes for different data message flows can collide, which results in the accumulated flow-size values being probabilistic values that might have a certain level of inaccuracy.
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that can be configured to identify large data message flows that it processes for forwarding in a network. In this document, large data message flows are referred to as heavy hitter flows. To perform its forwarding operations, the data plane includes several data message processing stages that are configured to process the data tuples associated with the data messages received by the data plane. In some embodiments, parts of the data plane message-processing stages are also configured to implement a heavy hitter detection (HHD) circuit. The operations of the data plane's message processing stages are configured by a control plane of the data plane's forwarding element in some embodiments.
Some embodiments of the invention provide a data-plane forwarding circuit (data plane) that has a flow-size detection circuit that generates a flow-size density distribution for all or some of the data message flows that it processes for forwarding in a network. The flow-size (FS) detection circuit in some embodiments generates statistical values regarding the processed data message flows, and based on these statistical values, it generates an FS density distribution that expresses the number of flows in different flow-size sub-ranges within a range of flow sizes. In some embodiments, the density distribution is a probabilistic density distribution that is based on probabilistic statistical values that the flow-size detection circuit generates for the data message flows that are processed for forwarding within the network. The FS detection circuit in some embodiments generates probabilistic statistical values for the data message flows by generating hash values from header values of the data message flows and accumulating flow-size values at memory locations identified by the generated hash values. In some embodiments, the generated hashes for different data message flows can collide, which results in the accumulated flow-size values being probabilistic values that might have a certain level of inaccuracy.
Some embodiments provide a network forwarding integrated circuit (IC) for processing network packets. The network forwarding IC includes multiple packet processing pipelines and a traffic management unit. Each pipeline is configured to operate as an ingress pipeline and an egress pipeline. The traffic management unit is configured to receive a packet processed by an ingress pipeline and to enqueue the packet for output to a particular egress pipeline. A set of packets received by the network forwarding IC are processed by a first pipeline as an ingress pipeline and a second pipeline as an egress pipeline, then subsequently processed by the second pipeline as an ingress pipeline and a third pipeline as an egress pipeline.
Some embodiments of the invention provide a path-and-latency tracking (PLT) method. At a forwarding element, this method in some embodiments detects the path traversed by a data message through a set of forwarding elements, and the latency that the data message experiences at each of the forwarding elements in the path. In some embodiments, the method has a forwarding element in the path insert its forwarding element identifier and path latency in a header of the data message that it forwards. The method of some embodiments also uses fast PLT operators in the data plane of the forwarding elements to detect new data message flows, to gather PLT data from these data message flows, and to detect path or latency changes for previously detected data message flows. In some embodiments, the method then uses control plane processes (e.g., of the forwarding elements or other devices) to collect and analyze the PLT data gathered in the data plane from new or existing flows.
Some embodiments provide a method for processing a packet for a pipeline of a hardware switch. The pipeline, in some embodiments, includes several different stages that match against packet header fields and modify packet header fields. The method receives a packet that includes a set of packet headers. The method then populates, for each packet header in the set of packet headers, (i) a first set of registers with packet header field values of the packet header that are used in the pipeline, and (ii) a second set of registers with packet header field values of the packet header that are not used in the pipeline.
Some embodiments of the invention provide a forwarding element that can be configured through in-band data-plane messages from a remote controller that is a physically separate machine from the forwarding element. The forwarding element of some embodiments has data plane circuits that include several configurable message-processing stages, several storage queues, and a data-plane configurator. A set of one or more message-processing stages of the data plane are configured (1) to process configuration messages received by the data plane from the remote controller and (2) to store the configuration messages in a set of one or more storage queues. The data-plane configurator receives the configuration messages stored in the set of storage queues and configures one or more of the configurable message-processing stages based on configuration data in the configuration messages.