1 <?xml version="1.0" encoding="utf-8"?>
2 <database title="Hardware VTEP Database">
4 This schema specifies relations that a VTEP can use to integrate
5 physical ports into logical switches maintained by a network
6 virtualization controller such as NSX.
14 VXLAN Tunnel End Point, an entity which originates and/or terminates
20 Hardware Switch Controller.
25 Network Virtualization Controller, e.g. NSX.
30 Virtual Routing and Forwarding instance.
34 <table name="Global" title="Top-level configuration.">
35 Top-level configuration for a hardware VTEP. There must be
36 exactly one record in the <ref table="Global"/> table.
38 <column name="switches">
39 The physical switches managed by the VTEP.
42 <group title="Database Configuration">
44 These columns primarily configure the database server
45 (<code>ovsdb-server</code>), not the hardware VTEP itself.
48 <column name="managers">
49 Database clients to which the database server should connect or
50 to which it should listen, along with options for how these
51 connections should be configured. See the <ref table="Manager"/>
52 table for more information.
57 <table name="Manager" title="OVSDB management connection.">
59 Configuration for a database connection to an Open vSwitch Database
64 The database server can initiate and maintain active connections
65 to remote clients. It can also listen for database connections.
68 <group title="Core Features">
69 <column name="target">
70 <p>Connection method for managers.</p>
72 The following connection methods are currently supported:
75 <dt><code>ssl:<var>ip</var></code>[<code>:<var>port</var></code>]</dt>
78 The specified SSL <var>port</var> (default: 6632) on the host at
79 the given <var>ip</var>, which must be expressed as an IP address
83 SSL key and certificate configuration happens outside the
88 <dt><code>tcp:<var>ip</var></code>[<code>:<var>port</var></code>]</dt>
90 The specified TCP <var>port</var> (default: 6632) on the host at
91 the given <var>ip</var>, which must be expressed as an IP address
94 <dt><code>pssl:</code>[<var>port</var>][<code>:<var>ip</var></code>]</dt>
97 Listens for SSL connections on the specified TCP <var>port</var>
98 (default: 6632). If <var>ip</var>, which must be expressed as an
99 IP address (not a DNS name), is specified, then connections are
100 restricted to the specified local IP address.
103 <dt><code>ptcp:</code>[<var>port</var>][<code>:<var>ip</var></code>]</dt>
105 Listens for connections on the specified TCP <var>port</var>
106 (default: 6632). If <var>ip</var>, which must be expressed as an
107 IP address (not a DNS name), is specified, then connections are
108 restricted to the specified local IP address.
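<p>
  The four target forms above can be told apart mechanically. The following
  is a minimal sketch in Python, assuming IPv4 literals only;
  the helper name parse_target is hypothetical and is not part of any OVSDB
  library. The default port of 6632 is the one stated above.
</p>

```python
DEFAULT_PORT = 6632  # default stated in the schema text for all four methods


def parse_target(target):
    """Split a Manager target string into (method, ip, port).

    Hypothetical helper for illustration only. It handles IPv4 literals
    and does not validate the address itself.
    """
    method, _, rest = target.partition(":")
    if method in ("ssl", "tcp"):
        # Active forms: ssl:ip[:port] and tcp:ip[:port]
        ip, _, port = rest.partition(":")
        if not ip:
            raise ValueError("active targets require an IP address")
    elif method in ("pssl", "ptcp"):
        # Passive forms: pssl:[port][:ip] and ptcp:[port][:ip]
        port, _, ip = rest.partition(":")
    else:
        raise ValueError("unsupported connection method: " + method)
    return method, (ip or None), (int(port) if port else DEFAULT_PORT)


print(parse_target("ssl:192.0.2.10"))       # ('ssl', '192.0.2.10', 6632)
print(parse_target("ptcp:6640:192.0.2.1"))  # ('ptcp', '192.0.2.1', 6640)
print(parse_target("pssl:"))                # ('pssl', None, 6632)
```

<p>
  Note that the active forms put the address first and the port second,
  while the passive forms put the port first and the optional address second.
</p>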
114 <group title="Client Failure Detection and Handling">
115 <column name="max_backoff">
116 Maximum number of milliseconds to wait between connection attempts.
117 Default is implementation-specific.
120 <column name="inactivity_probe">
121 Maximum number of milliseconds of idle time on connection to the
122 client before sending an inactivity probe message. If the Open
123 vSwitch database does not communicate with the client for the
124 specified number of milliseconds, it will send a probe. If a
125 response is not received for the same additional amount of time,
126 the database server assumes the connection has been broken
127 and attempts to reconnect. Default is implementation-specific.
128 A value of 0 disables inactivity probes.
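<p>
  The probe behavior described above can be sketched as a small decision
  function. This is an illustration in Python under stated assumptions, not
  the database server's actual logic: idle_ms is assumed to be the time since
  anything was last received on the connection, and the function name and
  return strings are hypothetical.
</p>

```python
def probe_action(idle_ms, inactivity_probe_ms, probe_sent):
    """Return the action a server might take after idle_ms of silence.

    idle_ms: milliseconds since the last message was received (assumption).
    inactivity_probe_ms: the configured inactivity_probe value.
    probe_sent: True if a probe is already outstanding.
    """
    if inactivity_probe_ms == 0:
        return "none"  # a value of 0 disables inactivity probes
    if not probe_sent:
        # No probe outstanding: send one once the idle interval has elapsed.
        return "send_probe" if idle_ms >= inactivity_probe_ms else "none"
    # A probe is outstanding: allow the same interval again for a reply.
    return "reconnect" if idle_ms >= 2 * inactivity_probe_ms else "wait"


print(probe_action(5000, 5000, False))   # send_probe
print(probe_action(10000, 5000, True))   # reconnect
```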
132 <group title="Status">
133 <column name="is_connected">
134 <code>true</code> if currently connected to this manager,
135 <code>false</code> otherwise.
138 <column name="status" key="last_error">
139 A human-readable description of the last error on the connection
140 to the manager; i.e. <code>strerror(errno)</code>. This key
141 will exist only if an error has occurred.
144 <column name="status" key="state"
145 type='{"type": "string", "enum": ["set", ["VOID", "BACKOFF", "CONNECTING", "ACTIVE", "IDLE"]]}'>
147 The state of the connection to the manager:
150 <dt><code>VOID</code></dt>
151 <dd>Connection is disabled.</dd>
153 <dt><code>BACKOFF</code></dt>
154 <dd>Attempting to reconnect at an increasing period.</dd>
156 <dt><code>CONNECTING</code></dt>
157 <dd>Attempting to connect.</dd>
159 <dt><code>ACTIVE</code></dt>
160 <dd>Connected, remote host responsive.</dd>
162 <dt><code>IDLE</code></dt>
163 <dd>Connection is idle. Waiting for response to keep-alive.</dd>
166 These values may change in the future. They are provided only for
171 <column name="status" key="sec_since_connect"
172 type='{"type": "integer", "minInteger": 0}'>
173 The amount of time since this manager last successfully connected
174 to the database (in seconds). Value is empty if manager has never
175 successfully connected.
178 <column name="status" key="sec_since_disconnect"
179 type='{"type": "integer", "minInteger": 0}'>
180 The amount of time since this manager last disconnected from the
181 database (in seconds). Value is empty if manager has never
185 <column name="status" key="locks_held">
186 Space-separated list of the names of OVSDB locks that the connection
187 holds. Omitted if the connection does not hold any locks.
190 <column name="status" key="locks_waiting">
191 Space-separated list of the names of OVSDB locks that the connection is
192 currently waiting to acquire. Omitted if the connection is not waiting
196 <column name="status" key="locks_lost">
197 Space-separated list of the names of OVSDB locks that the connection
198 has had stolen by another OVSDB client. Omitted if no locks have been
199 stolen from this connection.
202 <column name="status" key="n_connections"
203 type='{"type": "integer", "minInteger": 2}'>
205 When <ref column="target"/> specifies a connection method that
206 listens for inbound connections (e.g. <code>ptcp:</code> or
207 <code>pssl:</code>) and more than one connection is actually active,
208 the value is the number of active connections. Otherwise, this
209 key-value pair is omitted.
212 When multiple connections are active, status columns and key-value
213 pairs (other than this one) report the status of one arbitrarily
219 <group title="Connection Parameters">
221 Additional configuration for a connection between the manager
222 and the database server.
225 <column name="other_config" key="dscp"
226 type='{"type": "integer"}'>
227 The Differentiated Service Code Point (DSCP) is specified using 6 bits
228 in the Type of Service (TOS) field in the IP header. DSCP provides a
229 mechanism to classify the network traffic and provide Quality of
230 Service (QoS) on IP networks.
232 The DSCP value specified here is used when establishing the
233 connection between the manager and the database server. If no
234 value is specified, a default value of 48 is chosen. Valid DSCP
235 values must be in the range 0 to 63.
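<p>
  Because the DSCP occupies the upper 6 bits of the 8-bit TOS field, the
  byte written to the IP header is the DSCP value multiplied by 4. The sketch
  below, in Python, shows the conversion and one way a client might apply it
  to a socket. The function names are hypothetical; socket.IP_TOS is the
  standard socket option on platforms that support it.
</p>

```python
import socket

DEFAULT_DSCP = 48  # default stated in the schema text


def dscp_to_tos(dscp=DEFAULT_DSCP):
    """Convert a DSCP value (0 to 63) to the TOS byte that carries it."""
    if dscp not in range(64):
        raise ValueError("DSCP must be in the range 0 to 63")
    return dscp * 4  # same as shifting left by two bits


def apply_dscp(sock, dscp=DEFAULT_DSCP):
    """Set the TOS byte on an IPv4 socket before connecting (illustrative)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


print(dscp_to_tos())    # 192
print(dscp_to_tos(63))  # 252
```

<p>
  With the default DSCP of 48, the resulting TOS byte is 192 (0xC0).
</p>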
240 <table name="Physical_Switch" title="A physical switch.">
241 A physical switch that implements a VTEP.
243 <column name="ports">
244 The physical ports within the switch.
247 <group title="Network Status">
248 <column name="management_ips">
249 IPv4 or IPv6 addresses at which the switch may be contacted
250 for management purposes.
253 <column name="tunnel_ips">
255 IPv4 or IPv6 addresses on which the switch may originate or
260 This column is intended to allow a <ref table="Manager"/> to
261 determine the <ref table="Physical_Switch"/> that terminates
262 the tunnel represented by a <ref table="Physical_Locator"/>.
267 <group title="Identification">
269 Symbolic name for the switch, such as its hostname.
272 <column name="description">
273 An extended description for the switch, such as its switch login
277 <group title="Error Notification">
279 An entry in this column indicates to the NVC that this switch
280 has encountered a fault. The switch must clear this column
281 when the fault has been cleared.
284 <column name="switch_fault_status" key="mac_table_exhaustion">
285 Indicates that the switch has been unable to process MAC
286 entries requested by the NVC due to lack of table resources.
289 <column name="switch_fault_status" key="tunnel_exhaustion">
290 Indicates that the switch has been unable to create tunnels
291 requested by the NVC due to lack of resources.
294 <column name="switch_fault_status" key="unspecified_fault">
295 Indicates that an error has occurred in the switch but that no
296 more specific information is available.
302 <table name="Physical_Port" title="A port within a physical switch.">
303 A port within a <ref table="Physical_Switch"/>.
305 <column name="vlan_bindings">
306 Identifies how VLANs on the physical port are bound to logical switches.
307 If, for example, the map contains a (VLAN, logical switch) pair, a packet
308 that arrives on the port in the VLAN is considered to belong to the
309 paired logical switch.
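<p>
  As a rough illustration of the lookup this map implies, the Python sketch
  below models vlan_bindings as a dictionary from VLAN ID to a logical switch
  name. The switch names and the helper are hypothetical; in the database the
  map values are references to Logical_Switch records.
</p>

```python
# Hypothetical bindings for one physical port: VLAN ID to logical switch.
vlan_bindings = {100: "ls-web", 200: "ls-db"}


def logical_switch_for(bindings, vlan):
    """Return the logical switch bound to vlan on this port, or None."""
    return bindings.get(vlan)


print(logical_switch_for(vlan_bindings, 100))  # ls-web
print(logical_switch_for(vlan_bindings, 300))  # None
```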
312 <column name="vlan_stats">
313 Statistics for VLANs bound to logical switches on the physical port. An
314 implementation that fully supports such statistics would populate this
315 column with a mapping for every VLAN that is bound in <ref
316 column="vlan_bindings"/>. An implementation that does not support such
317 statistics or only partially supports them would not populate this column
318 or partially populate it, respectively.
321 <group title="Identification">
323 Symbolic name for the port. The name ought to be unique within a given
324 <ref table="Physical_Switch"/>, but the database is not capable of
328 <column name="description">
329 An extended description for the port.
332 <group title="Error Notification">
334 An entry in this column indicates to the NVC that the physical port has
335 encountered a fault. The switch must clear this column when the error
338 <column name="port_fault_status" key="invalid_vlan_map">
340 Indicates that a VLAN-to-logical-switch mapping requested by
341 the controller could not be instantiated by the switch
342 because of a conflict with local configuration.
345 <column name="port_fault_status" key="unspecified_fault">
347 Indicates that an error has occurred on the port but that no
348 more specific information is available.
355 <table name="Logical_Binding_Stats" title="Statistics for a VLAN on a physical port bound to a logical network.">
356 Reports statistics for the <ref table="Logical_Switch"/> with which a VLAN
357 on a <ref table="Physical_Port"/> is associated.
359 <group title="Statistics">
360 These statistics count only packets to which the binding applies.
362 <column name="packets_from_local">
363 Number of packets sent by the <ref table="Physical_Switch"/>.
366 <column name="bytes_from_local">
367 Number of bytes in packets sent by the <ref table="Physical_Switch"/>.
370 <column name="packets_to_local">
371 Number of packets received by the <ref table="Physical_Switch"/>.
374 <column name="bytes_to_local">
375 Number of bytes in packets received by the <ref
376 table="Physical_Switch"/>.
381 <table name="Logical_Switch" title="A layer-2 domain.">
382 A logical Ethernet switch, whose implementation may span physical and
383 virtual media, possibly crossing L3 domains via tunnels; a logical layer-2
384 domain; an Ethernet broadcast domain.
388 <group title="Per Logical-Switch Tunnel Key">
390 Tunnel protocols tend to have a field that allows the tunnel
391 to be partitioned into sub-tunnels: VXLAN has a VNI, GRE and
392 STT have a key, CAPWAP has a WSI, and so on. We call these
393 generically ``tunnel keys.'' Given that one needs to use a
394 tunnel key at all, there are at least two reasonable ways to
401 Per <ref table="Logical_Switch"/>+<ref table="Physical_Locator"/>
402 pair. That is, each logical switch may be assigned a different
403 tunnel key on every <ref table="Physical_Locator"/>. This model is
408 In this model, <ref table="Physical_Locator"/> carries the tunnel
409 key. Therefore, one <ref table="Physical_Locator"/> record will
410 exist for each logical switch carried at a given IP destination.
416 Per <ref table="Logical_Switch"/>. That is, every tunnel
417 associated with a particular logical switch carries the same tunnel
418 key, regardless of the <ref table="Physical_Locator"/> to which the
419 tunnel is addressed. This model may ease switch implementation
420 because it imposes fewer requirements on the hardware datapath.
424 In this model, <ref table="Logical_Switch"/> carries the tunnel
425 key. Therefore, one <ref table="Physical_Locator"/> record will
426 exist for each IP destination.
431 <column name="tunnel_key">
433 This column is used only in the tunnel key per <ref
434 table="Logical_Switch"/> model (see above), because only in that
435 model is there a tunnel key associated with a logical switch.
439 For <code>vxlan_over_ipv4</code> encapsulation, this column
440 is the VXLAN VNI that identifies a logical switch. It must
441 be in the range 0 to 16,777,215.
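<p>
  The upper bound follows from the 24-bit width of the VXLAN VNI field. A
  minimal range check might look like the Python sketch below; the helper
  name is hypothetical.
</p>

```python
VNI_BITS = 24
MAX_VNI = 2 ** VNI_BITS - 1  # 16777215


def is_valid_vni(tunnel_key):
    """Return True if tunnel_key fits in the 24-bit VXLAN VNI field."""
    return isinstance(tunnel_key, int) and tunnel_key in range(MAX_VNI + 1)


print(MAX_VNI)                 # 16777215
print(is_valid_vni(5000))      # True
print(is_valid_vni(16777216))  # False
```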
446 <group title="Identification">
448 Symbolic name for the logical switch.
451 <column name="description">
452 An extended description for the logical switch, such as its switch
458 <table name="Ucast_Macs_Local" title="Unicast MACs (local)">
460 Mapping of unicast MAC addresses to tunnels (physical
461 locators). This table is written by the HSC, so it contains the
462 MAC addresses that have been learned on physical ports by a
467 A MAC address that has been learned by the VTEP.
470 <column name="logical_switch">
471 The logical switch to which this mapping applies.
474 <column name="locator">
475 The physical locator to be used to reach this MAC address. In
476 this table, the physical locator will be one of the tunnel IP
477 addresses of the appropriate VTEP.
480 <column name="ipaddr">
481 The IP address to which this MAC corresponds. Optional field for
482 the purpose of ARP suppression.
487 <table name="Ucast_Macs_Remote" title="Unicast MACs (remote)">
489 Mapping of unicast MAC addresses to tunnels (physical
490 locators). This table is written by the NVC, so it contains the
491 MAC addresses that the NVC has learned. These include VM MAC
492 addresses, in which case the physical locators will be
493 hypervisor IP addresses. The NVC will also report MACs that it
494 has learned from other HSCs in the network, in which case the
495 physical locators will be tunnel IP addresses of the
500 A MAC address that has been learned by the NVC.
503 <column name="logical_switch">
504 The logical switch to which this mapping applies.
507 <column name="locator">
508 The physical locator to be used to reach this MAC address. In
509 this table, the physical locator will be either a hypervisor IP
510 address or a tunnel IP address of another VTEP.
513 <column name="ipaddr">
514 The IP address to which this MAC corresponds. Optional field for
515 the purpose of ARP suppression.
520 <table name="Mcast_Macs_Local" title="Multicast MACs (local)">
522 Mapping of multicast MAC addresses to tunnels (physical
523 locators). This table is written by the HSC, so it contains the
524 MAC addresses that have been learned on physical ports by a
525 VTEP. These may be learned by IGMP snooping, for example. This
526 table also specifies how to handle unknown unicast and broadcast packets.
531 A MAC address that has been learned by the VTEP.
534 The keyword <code>unknown-dst</code> is used as a special
535 ``Ethernet address'' that indicates the locations to which
536 packets in a logical switch whose destination addresses do not
537 otherwise appear in <ref table="Ucast_Macs_Local"/> (for
538 unicast addresses) or <ref table="Mcast_Macs_Local"/> (for
539 multicast addresses) should be sent.
543 <column name="logical_switch">
544 The logical switch to which this mapping applies.
547 <column name="locator_set">
548 The physical locator set to be used to reach this MAC address. In
549 this table, the physical locator set will contain one or more tunnel IP
550 addresses of the appropriate VTEP(s).
555 <table name="Mcast_Macs_Remote" title="Multicast MACs (remote)">
557 Mapping of multicast MAC addresses to tunnels (physical
558 locators). This table is written by the NVC, so it contains the
559 MAC addresses that the NVC has learned. This
560 table also specifies how to handle unknown unicast and broadcast
564 Multicast packet replication may be handled by a service node,
565 in which case the physical locators will be IP addresses of
566 service nodes. If the VTEP supports replication onto multiple
567 tunnels, then this may be used to replicate directly onto
568 VTEP-hypervisor tunnels.
573 A MAC address that has been learned by the NVC.
576 The keyword <code>unknown-dst</code> is used as a special
577 ``Ethernet address'' that indicates the locations to which
578 packets in a logical switch whose destination addresses do not
579 otherwise appear in <ref table="Ucast_Macs_Remote"/> (for
580 unicast addresses) or <ref table="Mcast_Macs_Remote"/> (for
581 multicast addresses) should be sent.
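<p>
  The fallback role of unknown-dst can be sketched as a lookup order. The
  Python example below is an illustration for a single logical switch,
  assuming the remote tables are modeled as plain dictionaries keyed by MAC
  string; the function name and the sample addresses are hypothetical.
</p>

```python
UNKNOWN_DST = "unknown-dst"


def locators_for(dst_mac, ucast_remote, mcast_remote):
    """Return the list of locators for dst_mac within one logical switch.

    ucast_remote maps a MAC to a single locator; mcast_remote maps a MAC
    (or the unknown-dst keyword) to a list of locators.
    """
    if dst_mac in ucast_remote:
        return [ucast_remote[dst_mac]]
    if dst_mac in mcast_remote:
        return mcast_remote[dst_mac]
    # No specific entry: fall back to the unknown-dst locator set, if any.
    return mcast_remote.get(UNKNOWN_DST, [])


ucast = {"00:11:22:33:44:55": "10.0.0.5"}
mcast = {UNKNOWN_DST: ["10.0.0.100", "10.0.0.101"]}

print(locators_for("00:11:22:33:44:55", ucast, mcast))  # ['10.0.0.5']
print(locators_for("ff:ff:ff:ff:ff:ff", ucast, mcast))  # ['10.0.0.100', '10.0.0.101']
```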
585 <column name="logical_switch">
586 The logical switch to which this mapping applies.
589 <column name="locator_set">
590 The physical locator set to be used to reach this MAC address. In
591 this table, the physical locator set will be either a service node IP
592 address or a set of tunnel IP addresses of hypervisors (and
593 potentially other VTEPs).
596 <column name="ipaddr">
597 The IP address to which this MAC corresponds. Optional field for
598 the purpose of ARP suppression.
603 <table name="Logical_Router" title="A logical L3 router.">
605 A logical router, or VRF. A logical router may be connected to one or more
606 logical switches. Subnet addresses and interface addresses may be configured on the
610 <column name="switch_binding">
611 Maps from an IPv4 or IPv6 address prefix in CIDR notation to a
612 logical switch. Multiple prefixes may map to the same switch. By
613 writing a 32-bit (or 128-bit for v6) address with a /N prefix
614 length, both the router's interface address and the subnet
615 prefix can be configured. For example, 192.68.1.1/24 creates a
616 /24 subnet for the logical switch attached to the interface and
617 assigns the address 192.68.1.1 to the router interface.
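<p>
  The way a single key yields both an interface address and a subnet can be
  checked with Python's standard ipaddress module. This is only a sketch of
  the interpretation described above; the helper name is hypothetical.
</p>

```python
import ipaddress


def split_binding(prefix):
    """Split a switch_binding key into (interface address, subnet)."""
    iface = ipaddress.ip_interface(prefix)
    return iface.ip, iface.network


ip, subnet = split_binding("192.68.1.1/24")
print(ip)      # 192.68.1.1
print(subnet)  # 192.68.1.0/24
```

<p>
  The same call accepts IPv6 keys, for which the full address is 128 bits.
</p>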
620 <column name="static_routes">
621 One or more static routes, mapping IP prefixes to next hop IP addresses.
624 <group title="Identification">
626 Symbolic name for the logical router.
629 <column name="description">
630 An extended description for the logical router.
635 <table name="Physical_Locator_Set">
637 A set of one or more <ref table="Physical_Locator"/>s.
641 This table exists only because OVSDB does not have a way to
642 express the type ``map from string to one or more <ref
643 table="Physical_Locator"/> records.''
646 <column name="locators"/>
649 <table name="Physical_Locator">
651 Identifies an endpoint to which logical switch traffic may be
652 encapsulated and forwarded.
656 For the <code>vxlan_over_ipv4</code> encapsulation, the only
657 encapsulation defined so far, all endpoints associated with a given <ref
658 table="Logical_Switch"/> must use a common tunnel key, which is carried
659 in the <ref table="Logical_Switch" column="tunnel_key"/> column of <ref
660 table="Logical_Switch"/>.
664 For some encapsulations yet to be defined, we expect <ref
665 table="Physical_Locator"/> to identify both an endpoint and a tunnel key.
666 When the first such encapsulation is defined, we expect to add a
667 ``tunnel_key'' column to <ref table="Physical_Locator"/> to allow the
668 tunnel key to be defined.
672 See the ``Per Logical-Switch Tunnel Key'' section in the <ref
673 table="Logical_Switch"/> table for further discussion of the model.
676 <column name="encapsulation_type">
677 The type of tunneling encapsulation.
680 <column name="dst_ip">
682 For <code>vxlan_over_ipv4</code> encapsulation, the IPv4 address of the
683 VXLAN tunnel endpoint.
687 We expect that this column could be used for IPv4 or IPv6 addresses in
688 encapsulations to be introduced later.
692 <group title="Bidirectional Forwarding Detection (BFD)">
694 BFD, defined in RFC 5880, allows point-to-point detection of
695 connectivity failures by occasional transmission of BFD control
696 messages. VTEPs are expected to implement BFD.
700 BFD operates by regularly transmitting BFD control messages at a
701 rate negotiated independently in each direction. Each endpoint
702 specifies the rate at which it expects to receive control messages,
703 and the rate at which it's willing to transmit them. An endpoint
704 which fails to receive BFD control messages for a period of three
705 times the expected reception interval will signal a connectivity
706 fault. In the case of a unidirectional connectivity issue, the
707 system not receiving BFD control messages will signal the problem
708 to its peer in the messages it transmits.
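<p>
  The negotiation and the factor of three described above can be made
  concrete with a little arithmetic. The Python sketch below assumes that
  each direction settles on the slower of the sender's transmit interval and
  the receiver's receive interval, and that the multiplier is fixed at three
  as stated in the text. The function names and sample intervals are
  hypothetical.
</p>

```python
DETECT_MULT = 3  # the factor of three stated in the text


def tx_interval_ms(local_min_tx, remote_min_rx):
    """Interval at which the local endpoint actually transmits."""
    return max(local_min_tx, remote_min_rx)


def detection_time_ms(local_min_rx, remote_min_tx, mult=DETECT_MULT):
    """Silence after which the local endpoint signals a connectivity fault."""
    return mult * max(local_min_rx, remote_min_tx)


# Sample values in milliseconds, chosen only for illustration.
print(tx_interval_ms(local_min_tx=100, remote_min_rx=1000))     # 1000
print(detection_time_ms(local_min_rx=1000, remote_min_tx=100))  # 3000
```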
712 A hardware VTEP is expected to use BFD to determine reachability of
713 devices at the end of the tunnels with which it exchanges data. This
714 can enable the VTEP to choose a functioning service node among a set of
715 service nodes providing high availability. It also enables the NVC to
716 report the health status of tunnels.
720 In most cases the BFD peer of a hardware VTEP will be an Open vSwitch
721 instance. The Open vSwitch implementation of BFD aims to comply
722 faithfully with the requirements put forth in RFC 5880. Open vSwitch
723 does not implement the optional Authentication or ``Echo Mode''
727 <group title="BFD Configuration">
729 A controller sets up key-value pairs in the <ref column="bfd"/>
730 column to enable and configure BFD.
733 <column name="bfd" key="enable" type='{"type": "boolean"}'>
734 True to enable BFD on this <ref table="Physical_Locator"/>.
737 <column name="bfd" key="min_rx"
738 type='{"type": "integer", "minInteger": 1}'>
739 The shortest interval, in milliseconds, at which this BFD session
740 offers to receive BFD control messages. The remote endpoint may
741 choose to send messages at a slower rate. Defaults to
745 <column name="bfd" key="min_tx"
746 type='{"type": "integer", "minInteger": 1}'>
747 The shortest interval, in milliseconds, at which this BFD session is
748 willing to transmit BFD control messages. Messages will actually be
749 transmitted at a slower rate if the remote endpoint is not willing to
750 receive as quickly as specified. Defaults to <code>100</code>.
753 <column name="bfd" key="decay_min_rx" type='{"type": "integer"}'>
754 An alternate receive interval, in milliseconds, that must be greater
755 than or equal to <ref column="bfd" key="min_rx"/>. The
756 implementation switches from <ref column="bfd" key="min_rx"/> to <ref
757 column="bfd" key="decay_min_rx"/> when there is no obvious incoming
758 data traffic at the interface, to reduce the CPU and bandwidth cost
759 of monitoring an idle interface. This feature may be disabled by
760 setting a value of 0. This feature is reset whenever <ref
761 column="bfd" key="decay_min_rx"/> or <ref column="bfd" key="min_rx"/>
765 <column name="bfd" key="forwarding_if_rx" type='{"type": "boolean"}'>
766 True to consider the interface capable of packet I/O as long as it
767 continues to receive any packets (not just BFD packets). This
768 prevents link congestion that causes consecutive BFD control packets
769 to be lost from marking the interface down.
772 <column name="bfd" key="cpath_down" type='{"type": "boolean"}'>
773 Set to true to notify the remote endpoint that traffic should not be
774 forwarded to this system for some reason other than a connectivity
775 failure on the interface being monitored. The typical underlying
776 reason is ``concatenated path down,'' that is, that connectivity
777 beyond the local system is down. Defaults to false.
780 <column name="bfd" key="check_tnl_key" type='{"type": "boolean"}'>
781 Set to true to make BFD accept only control messages with a tunnel
782 key of zero. By default, BFD accepts control messages with any
786 <column name="bfd" key="bfd_dst_mac">
787 Set to an Ethernet address in the form
788 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>
789 to set the MAC used as destination for transmitted BFD packets and
790 expected as destination for received BFD packets. The default is
791 <code>00:23:20:00:00:01</code>.
795 <group title="BFD Status">
797 The VTEP sets key-value pairs in the <ref column="bfd_status"/>
798 column to report the status of BFD on this interface. When BFD is
799 not enabled, with <ref column="bfd" key="enable"/>, the switch clears
800 all key-value pairs from <ref column="bfd_status"/>.
803 <column name="bfd_status" key="state"
804 type='{"type": "string",
805 "enum": ["set", ["admin_down", "down", "init", "up"]]}'>
806 Reports the state of the BFD session. The BFD session is fully
807 healthy and negotiated if <code>up</code>.
810 <column name="bfd_status" key="forwarding" type='{"type": "boolean"}'>
811 Reports whether the BFD session believes this <ref
812 table="Physical_Locator"/> may be used to forward traffic. Typically
813 this means the local session is signaling <code>up</code>, and the
814 remote system isn't signaling a problem such as concatenated path
818 <column name="bfd_status" key="diagnostic">
819 In case of a problem, set to a short message that reports what the
820 local BFD session thinks is wrong.
823 <column name="bfd_status" key="remote_state"
824 type='{"type": "string",
825 "enum": ["set", ["admin_down", "down", "init", "up"]]}'>
826 Reports the state of the remote endpoint's BFD session.
829 <column name="bfd_status" key="remote_diagnostic">
830 In case of a problem, set to a short message that reports what the
831 remote endpoint's BFD session thinks is wrong.