X-Git-Url: http://git.onelab.eu/?p=sliver-openvswitch.git;a=blobdiff_plain;f=DESIGN;h=f864135a83f1df2eb730f0b0b53d868d507638b5;hp=f9345d16e45485131001248256a18fc176902eeb;hb=HEAD;hpb=623e1caf2f493bfcd96e3f9381d4e89257c92798

diff --git a/DESIGN b/DESIGN
index f9345d16e..f864135a8 100644
--- a/DESIGN
+++ b/DESIGN
@@ -17,8 +17,14 @@ given controller receives OpenFlow asynchronous messages.  This
 section describes how all of these features interact.
 
 First, a service controller never receives any asynchronous messages
-unless it explicitly configures a miss_send_len greater than zero with
-an OFPT_SET_CONFIG message.
+unless it changes its miss_send_len from the service controller
+default of zero in one of the following ways:
+
+    - Sending an OFPT_SET_CONFIG message with nonzero miss_send_len.
+
+    - Sending any NXT_SET_ASYNC_CONFIG message: as a side effect, this
+      message changes the miss_send_len to
+      OFP_DEFAULT_MISS_SEND_LEN (128) for service controllers.
 
 Second, OFPT_FLOW_REMOVED and NXT_FLOW_REMOVED messages are generated
 only if the flow that was removed had the OFPFF_SEND_FLOW_REM flag
@@ -78,8 +84,8 @@ OFPP_LOCAL as a physical port and support OFPAT_ENQUEUE on it as well.
 OFPT_FLOW_MOD
 =============
 
-The OpenFlow 1.0 specification for the behavior of OFPT_FLOW_MOD is
-confusing.  The following table summarizes the Open vSwitch
+The OpenFlow specification for the behavior of OFPT_FLOW_MOD is
+confusing.  The following tables summarize the Open vSwitch
 implementation of its behavior in the following categories:
 
     - "match on priority": Whether the flow_mod acts only on flows
@@ -87,7 +93,12 @@ implementation of its behavior in the following categories:
 
     - "match on out_port": Whether the flow_mod acts only on flows
       that output to the out_port included in the flow_mod message (if
-      out_port is not OFPP_NONE).
+      out_port is not OFPP_NONE).  OpenFlow 1.1 and later have a
+      similar feature (not listed separately here) for out_group.
+
+    - "match on flow_cookie": Whether the flow_mod acts only on flows
+      whose flow_cookie matches an optional controller-specified value
+      and mask.
 
     - "updates flow_cookie": Whether the flow_mod changes the
       flow_cookie of the flow or flows that it matches to the
@@ -114,6 +125,11 @@ implementation of its behavior in the following categories:
     - "zeros counters": Whether the flow_mod resets per-flow packet
       and byte counters to zero.
 
+    - "may add a new flow": Whether the flow_mod may add a new flow to
+      the flow table.  (Obviously this is always true for "add"
+      commands but in some OpenFlow versions "modify" and
+      "modify-strict" can also add new flows.)
+
     - "sends flow_removed message": Whether the flow_mod generates a
       flow_removed message for the flow or flows that it affects.
 
@@ -122,11 +138,17 @@ indicated behavior, "---" means that it does not, an empty cell means
 that the property is not applicable, and other values are explained
 below the table.
 
+OpenFlow 1.0
+------------
+
                                           MODIFY          DELETE
                              ADD  MODIFY  STRICT  DELETE  STRICT
                              ===  ======  ======  ======  ======
-match on priority            ---    ---     yes     ---     yes
+match on priority            yes    ---     yes     ---     yes
 match on out_port            ---    ---     ---     yes     yes
+match on flow_cookie         ---    ---     ---     ---     ---
+match on table_id            ---    ---     ---     ---     ---
+controller chooses table_id  ---    ---     ---
 updates flow_cookie          yes    yes     yes
 updates OFPFF_SEND_FLOW_REM  yes     +       +
 honors OFPFF_CHECK_OVERLAP   yes     +       +
@@ -135,6 +157,7 @@ updates hard_timeout         yes     +       +
 resets idle timer            yes     +       +
 resets hard timer            yes    yes     yes
 zeros counters               yes     +       +
+may add a new flow           yes    yes     yes
 sends flow_removed message   ---    ---     ---      %       %
 
 (+) "modify" and "modify-strict" only take these actions when they
@@ -145,6 +168,271 @@ sends flow_removed message   ---    ---     ---      %       %
     (Each controller can separately control whether it wants to
     receive the generated messages.)
 
+OpenFlow 1.1
+------------
+
+OpenFlow 1.1 makes these changes:
+
+    - The controller must now specify the table_id of the flow table
+      to search and into which a flow may be inserted.  Behavior for
+      a table_id of 255 is undefined.
+
+    - A flow_mod, except an "add", can now match on the flow_cookie.
+
+    - When a flow_mod matches on the flow_cookie, "modify" and
+      "modify-strict" never insert a new flow.
+
+                                          MODIFY          DELETE
+                             ADD  MODIFY  STRICT  DELETE  STRICT
+                             ===  ======  ======  ======  ======
+match on priority            yes    ---     yes     ---     yes
+match on out_port            ---    ---     ---     yes     yes
+match on flow_cookie         ---    yes     yes     yes     yes
+match on table_id            yes    yes     yes     yes     yes
+controller chooses table_id  yes    yes     yes
+updates flow_cookie          yes    ---     ---
+updates OFPFF_SEND_FLOW_REM  yes     +       +
+honors OFPFF_CHECK_OVERLAP   yes     +       +
+updates idle_timeout         yes     +       +
+updates hard_timeout         yes     +       +
+resets idle timer            yes     +       +
+resets hard timer            yes    yes     yes
+zeros counters               yes     +       +
+may add a new flow           yes     #       #
+sends flow_removed message   ---    ---     ---      %       %
+
+(+) "modify" and "modify-strict" only take these actions when they
+    create a new flow, not when they update an existing flow.
+
+(%) "delete" and "delete-strict" generate a flow_removed message if
+    the deleted flow or flows have the OFPFF_SEND_FLOW_REM flag set.
+    (Each controller can separately control whether it wants to
+    receive the generated messages.)
+
+(#) "modify" and "modify-strict" only add a new flow if the flow_mod
+    does not match on any bits of the flow cookie.
+
+OpenFlow 1.2
+------------
+
+OpenFlow 1.2 makes these changes:
+
+    - Only "add" commands ever add flows; "modify" and "modify-strict"
+      never do.
+
+    - A new flag OFPFF_RESET_COUNTS now controls whether "modify" and
+      "modify-strict" reset counters, whereas previously they never
+      reset counters (except when they inserted a new flow).
+
+                                          MODIFY          DELETE
+                             ADD  MODIFY  STRICT  DELETE  STRICT
+                             ===  ======  ======  ======  ======
+match on priority            yes    ---     yes     ---     yes
+match on out_port            ---    ---     ---     yes     yes
+match on flow_cookie         ---    yes     yes     yes     yes
+match on table_id            yes    yes     yes     yes     yes
+controller chooses table_id  yes    yes     yes
+updates flow_cookie          yes    ---     ---
+updates OFPFF_SEND_FLOW_REM  yes    ---     ---
+honors OFPFF_CHECK_OVERLAP   yes    ---     ---
+updates idle_timeout         yes    ---     ---
+updates hard_timeout         yes    ---     ---
+resets idle timer            yes    ---     ---
+resets hard timer            yes    yes     yes
+zeros counters               yes     &       &
+may add a new flow           yes    ---     ---
+sends flow_removed message   ---    ---     ---      %       %
+
+(%) "delete" and "delete-strict" generate a flow_removed message if
+    the deleted flow or flows have the OFPFF_SEND_FLOW_REM flag set.
+    (Each controller can separately control whether it wants to
+    receive the generated messages.)
+
+(&) "modify" and "modify-strict" reset counters if the
+    OFPFF_RESET_COUNTS flag is specified.
+
+OpenFlow 1.3
+------------
+
+OpenFlow 1.3 makes these changes:
+
+    - Behavior for a table_id of 255 is now defined, for "delete" and
+      "delete-strict" commands, as meaning to delete from all tables.
+      A table_id of 255 is now explicitly invalid for other commands.
+
+    - New flags OFPFF_NO_PKT_COUNTS and OFPFF_NO_BYT_COUNTS for "add"
+      operations.
+
+The table for 1.3 is the same as the one shown above for 1.2.
+
+
+OpenFlow 1.4
+------------
+
+OpenFlow 1.4 does not change flow_mod semantics.
+
+
+OFPT_PACKET_IN
+==============
+
+The OpenFlow 1.1 specification for OFPT_PACKET_IN is confusing.  The
+definition in OF1.1 openflow.h is[*]:
+
+    /* Packet received on port (datapath -> controller). */
+    struct ofp_packet_in {
+        struct ofp_header header;
+        uint32_t buffer_id;     /* ID assigned by datapath. */
+        uint32_t in_port;       /* Port on which frame was received. */
+        uint32_t in_phy_port;   /* Physical Port on which frame was received. */
+        uint16_t total_len;     /* Full length of frame. */
+        uint8_t reason;         /* Reason packet is being sent (one of OFPR_*) */
+        uint8_t table_id;       /* ID of the table that was looked up */
+        uint8_t data[0];        /* Ethernet frame, halfway through 32-bit word,
+                                   so the IP header is 32-bit aligned.  The
+                                   amount of data is inferred from the length
+                                   field in the header.  Because of padding,
+                                   offsetof(struct ofp_packet_in, data) ==
+                                   sizeof(struct ofp_packet_in) - 2. */
+    };
+    OFP_ASSERT(sizeof(struct ofp_packet_in) == 24);
+
+The confusing part is the comment on the data[] member.  This comment
+is a leftover from OF1.0 openflow.h, in which the comment was correct:
+sizeof(struct ofp_packet_in) is 20 in OF1.0 and offsetof(struct
+ofp_packet_in, data) is 18.  When OF1.1 was written, the structure
+members were changed but the comment was carelessly not updated, and
+the comment became wrong: sizeof(struct ofp_packet_in) and
+offsetof(struct ofp_packet_in, data) are both 24 in OF1.1.
+
+That leaves the question of how to implement ofp_packet_in in OF1.1.
+The OpenFlow reference implementation for OF1.1 does not include any
+padding, that is, the first byte of the encapsulated frame immediately
+follows the 'table_id' member without a gap.  Open vSwitch therefore
+implements it the same way for compatibility.
+
+For an earlier discussion, please see the thread archived at:
+https://mailman.stanford.edu/pipermail/openflow-discuss/2011-August/002604.html
+
+[*] The quoted definition is directly from OF1.1.  Definitions used
+    inside OVS omit the 8-byte ofp_header members, so the sizes in
+    this discussion are 8 bytes larger than those declared in OVS
+    header files.
+
+
+VLAN Matching
+=============
+
+The 802.1Q VLAN header causes more trouble than any other 4 bytes in
+networking.  More specifically, three versions of OpenFlow and Open
+vSwitch have among them four different ways to match the contents and
+presence of the VLAN header.  The following table describes how each
+version works.
+
+    Match      NXM        OF1.0        OF1.1         OF1.2
+    -----  ---------  -----------  -----------  ------------
+     [1]   0000/0000  ????/1,??/?  ????/1,??/?  0000/0000,--
+     [2]   0000/ffff  ffff/0,??/?  ffff/0,??/?  0000/ffff,--
+     [3]   1xxx/1fff  0xxx/0,??/1  0xxx/0,??/1  1xxx/ffff,--
+     [4]   z000/f000  ????/1,0y/0  fffe/0,0y/0  1000/1000,0y
+     [5]   zxxx/ffff  0xxx/0,0y/0  0xxx/0,0y/0  1xxx/ffff,0y
+     [6]   0000/0fff    <none>       <none>        <none>
+     [7]   0000/f000    <none>       <none>        <none>
+     [8]   0000/efff    <none>       <none>        <none>
+     [9]   1001/1001    <none>       <none>     1001/1001,--
+    [10]   3000/3000    <none>       <none>        <none>
+
+Each column is interpreted as follows.
+
+    - Match: See the list below.
+
+    - NXM: xxxx/yyyy means NXM_OF_VLAN_TCI_W with value xxxx and mask
+      yyyy.  A mask of 0000 is equivalent to omitting
+      NXM_OF_VLAN_TCI(_W), a mask of ffff is equivalent to
+      NXM_OF_VLAN_TCI.
+
+    - OF1.0 and OF1.1: wwww/x,yy/z means dl_vlan wwww, OFPFW_DL_VLAN
+      x, dl_vlan_pcp yy, and OFPFW_DL_VLAN_PCP z.  ? means that the
+      given nibble is ignored (and conventionally 0 for wwww or yy,
+      conventionally 1 for x or z).  <none> means that the given match
+      is not supported.
+
+    - OF1.2: xxxx/yyyy,zz means OXM_OF_VLAN_VID_W with value xxxx and
+      mask yyyy, and OXM_OF_VLAN_PCP (which is not maskable) with
+      value zz.  A mask of 0000 is equivalent to omitting
+      OXM_OF_VLAN_VID(_W), a mask of ffff is equivalent to
+      OXM_OF_VLAN_VID.  -- means that OXM_OF_VLAN_PCP is omitted.
+      <none> means that the given match is not supported.
+
+The matches are:
+
+ [1] Matches any packet, that is, one without an 802.1Q header or with
+     an 802.1Q header with any TCI value.
+
+ [2] Matches only packets without an 802.1Q header.
+
+     NXM: Any match with (vlan_tci == 0) and (vlan_tci_mask & 0x1000)
+     != 0 is equivalent to the one listed in the table.
+
+     OF1.0: The spec doesn't define behavior if dl_vlan is set to
+     0xffff and OFPFW_DL_VLAN_PCP is not set.
+
+     OF1.1: The spec says explicitly to ignore dl_vlan_pcp when
+     dl_vlan is set to 0xffff.
+
+     OF1.2: The spec doesn't say what should happen if (vlan_vid == 0)
+     and (vlan_vid_mask & 0x1000) != 0 but (vlan_vid_mask != 0x1000),
+     but it would be straightforward to also interpret as [2].
+
+ [3] Matches only packets that have an 802.1Q header with VID xxx (and
+     any PCP).
+
+ [4] Matches only packets that have an 802.1Q header with PCP y (and
+     any VID).
+
+     NXM: z is ((y << 1) | 1).
+
+     OF1.0: The spec isn't very clear, but OVS implements it this way.
+
+     OF1.2: Presumably other masks such that (vlan_vid_mask & 0x1fff)
+     == 0x1000 would also work, but the spec doesn't define their
+     behavior.
+
+ [5] Matches only packets that have an 802.1Q header with VID xxx and
+     PCP y.
+
+     NXM: z is ((y << 1) | 1).
+
+     OF1.2: Presumably other masks such that (vlan_vid_mask & 0x1fff)
+     == 0x1fff would also work.
+
+ [6] Matches packets with no 802.1Q header or with an 802.1Q header
+     with a VID of 0.  Only possible with NXM.
+
+ [7] Matches packets with no 802.1Q header or with an 802.1Q header
+     with a PCP of 0.  Only possible with NXM.
+
+ [8] Matches packets with no 802.1Q header or with an 802.1Q header
+     with both VID and PCP of 0.  Only possible with NXM.
+
+ [9] Matches only packets that have an 802.1Q header with an
+     odd-numbered VID (and any PCP).  Only possible with NXM and
+     OF1.2.  (This is just an example; one can match on any desired
+     VID bit pattern.)
+
+[10] Matches only packets that have an 802.1Q header with an
+     odd-numbered PCP (and any VID).  Only possible with NXM.  (This
+     is just an example; one can match on any desired PCP bit
+     pattern.)
+
+Additional notes:
+
+    - OF1.2: The top three bits of OXM_OF_VLAN_VID are fixed to zero,
+      so bits 13, 14, and 15 in the masks listed in the table may be
+      set to arbitrary values, as long as the corresponding value bits
+      are also zero.  The suggested ffff mask for [2], [3], and [5]
+      allows a shorter OXM representation (the mask is omitted) than
+      the minimal 1fff mask.
+
 
 Flow Cookies
 ============
@@ -606,6 +894,39 @@ The following are explicitly *not* supported by in-band control:
    gateway.
 
 
+Action Reproduction
+===================
+
+It seems likely that many controllers, at least at startup, use the
+OpenFlow "flow statistics" request to obtain existing flows, then
+compare the flows' actions against the actions that they expect to
+find.  Before version 1.8.0, Open vSwitch always returned exact,
+byte-for-byte copies of the actions that had been added to the flow
+table.  The current version of Open vSwitch does not do this in
+some exceptional cases.  This section lists the exceptions that
+controller authors must keep in mind if they compare actual actions
+against desired actions in a bytewise fashion:
+
+    - Open vSwitch zeros padding bytes in action structures,
+      regardless of their values when the flows were added.
+
+    - Open vSwitch "normalizes" the instructions in OpenFlow 1.1
+      (and later) in the following way:
+
+        * OVS sorts the instructions into the following order:
+          Apply-Actions, Clear-Actions, Write-Actions,
+          Write-Metadata, Goto-Table.
+
+        * OVS drops Apply-Actions instructions that have empty
+          action lists.
+
+        * OVS drops Write-Actions instructions that have empty
+          action sets.
+
+Please report other discrepancies, if you notice any, so that we can
+fix or document them.
+
+
 Suggestions
 ===========
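
Editor's note on the Action Reproduction text added by this patch: a
controller that compares instructions bytewise could apply the same
normalization to its expected instructions first.  The C sketch below
is illustrative only.  The enum, the struct, and normalize_insts() are
hypothetical names, not OVS or OpenFlow identifiers, and the sketch
assumes that each instruction type appears at most once.  It drops
empty Apply-Actions and Write-Actions instructions, then sorts the
rest into the order that the patch lists.

```c
#include <stddef.h>

/* Instruction types, declared in the order in which the text says OVS
 * sorts them.  Hypothetical names for illustration only. */
enum inst_type {
    INST_APPLY_ACTIONS,
    INST_CLEAR_ACTIONS,
    INST_WRITE_ACTIONS,
    INST_WRITE_METADATA,
    INST_GOTO_TABLE
};

/* Hypothetical, simplified instruction record. */
struct inst {
    enum inst_type type;
    size_t n_actions;   /* Meaningful for Apply-Actions and Write-Actions. */
};

/* Normalizes the 'n' instructions in 'insts' in place and returns the
 * number of instructions that remain. */
size_t
normalize_insts(struct inst *insts, size_t n)
{
    size_t i, j, kept = 0;

    /* Drop Apply-Actions and Write-Actions that carry no actions. */
    for (i = 0; i < n; i++) {
        int droppable = insts[i].type == INST_APPLY_ACTIONS
                        || insts[i].type == INST_WRITE_ACTIONS;
        if (!(droppable && insts[i].n_actions == 0)) {
            insts[kept++] = insts[i];
        }
    }

    /* Insertion sort by type; the enum order is the target order. */
    for (i = 1; i < kept; i++) {
        struct inst tmp = insts[i];
        for (j = i; j > 0 && insts[j - 1].type > tmp.type; j--) {
            insts[j] = insts[j - 1];
        }
        insts[j] = tmp;
    }
    return kept;
}
```

Padding bytes inside individual actions would still need to be zeroed
separately before a bytewise comparison; the sketch does not model
action contents.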