<?xml version="1.0" encoding="utf-8"?>
<database title="Hardware VTEP Database">
This schema specifies relations that a VTEP can use to integrate
physical ports into logical switches maintained by a network
virtualization controller such as NSX.
VXLAN Tunnel End Point, an entity which originates and/or terminates
Hardware Switch Controller.
Network Virtualization Controller, e.g. NSX.
Virtual Routing and Forwarding instance.
<table name="Global" title="Top-level configuration.">
Top-level configuration for a hardware VTEP. There must be
exactly one record in the <ref table="Global"/> table.
<column name="switches">
The physical switches managed by the VTEP.
<group title="Database Configuration">
These columns primarily configure the database server
(<code>ovsdb-server</code>), not the hardware VTEP itself.
<column name="managers">
Database clients to which the database server should connect or
to which it should listen, along with options for how these
connections should be configured. See the <ref table="Manager"/>
table for more information.
<table name="Manager" title="OVSDB management connection.">
Configuration for a database connection to an Open vSwitch Database
The database server can initiate and maintain active connections
to remote clients. It can also listen for database connections.
<group title="Core Features">
<column name="target">
<p>Connection method for managers.</p>
The following connection methods are currently supported:
<dt><code>ssl:<var>ip</var></code>[<code>:<var>port</var></code>]</dt>
The specified SSL <var>port</var> (default: 6632) on the host at
the given <var>ip</var>, which must be expressed as an IP address
SSL key and certificate configuration happens outside the
<dt><code>tcp:<var>ip</var></code>[<code>:<var>port</var></code>]</dt>
The specified TCP <var>port</var> (default: 6632) on the host at
the given <var>ip</var>, which must be expressed as an IP address
<dt><code>pssl:</code>[<var>port</var>][<code>:<var>ip</var></code>]</dt>
Listens for SSL connections on the specified TCP <var>port</var>
(default: 6632). If <var>ip</var>, which must be expressed as an
IP address (not a DNS name), is specified, then connections are
restricted to the specified local IP address.
<dt><code>ptcp:</code>[<var>port</var>][<code>:<var>ip</var></code>]</dt>
Listens for connections on the specified TCP <var>port</var>
(default: 6632). If <var>ip</var>, which must be expressed as an
IP address (not a DNS name), is specified, then connections are
restricted to the specified local IP address.
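The four connection methods above can be split into their parts with a
few lines of Python. This is only an illustrative sketch, not code from
any OVSDB implementation: the function name parse_target is hypothetical,
and the sketch assumes IPv4 addresses written without brackets.

```python
DEFAULT_PORT = 6632  # default port given in the text for all four methods


def parse_target(target):
    """Split a Manager target string into (method, ip, port)."""
    method, _, rest = target.partition(":")
    if method in ("ssl", "tcp"):
        # Active methods: the IP address is mandatory, the port optional.
        ip, _, port = rest.partition(":")
        if not ip:
            raise ValueError("active target needs an IP address: " + target)
    elif method in ("pssl", "ptcp"):
        # Passive methods: the port comes first and both parts are optional.
        port, _, ip = rest.partition(":")
        ip = ip or None
    else:
        raise ValueError("unsupported connection method: " + target)
    return (method, ip, int(port) if port else DEFAULT_PORT)


print(parse_target("ssl:192.0.2.10"))
print(parse_target("ptcp:6640:192.0.2.1"))
```

A passive target with no IP address, such as ptcp:, yields None for the
address, which corresponds to listening without a local address
restriction.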
<group title="Client Failure Detection and Handling">
<column name="max_backoff">
Maximum number of milliseconds to wait between connection attempts.
Default is implementation-specific.
<column name="inactivity_probe">
Maximum number of milliseconds of idle time on connection to the
client before sending an inactivity probe message. If the Open
vSwitch database does not communicate with the client for the
specified number of milliseconds, it will send a probe. If a
response is not received for the same additional amount of time,
the database server assumes the connection has been broken
and attempts to reconnect. Default is implementation-specific.
A value of 0 disables inactivity probes.
<group title="Status">
<column name="is_connected">
<code>true</code> if currently connected to this manager,
<code>false</code> otherwise.
<column name="status" key="last_error">
A human-readable description of the last error on the connection
to the manager; i.e. <code>strerror(errno)</code>. This key
will exist only if an error has occurred.
<column name="status" key="state"
type='{"type": "string", "enum": ["set", ["VOID", "BACKOFF", "CONNECTING", "ACTIVE", "IDLE"]]}'>
The state of the connection to the manager:
<dt><code>VOID</code></dt>
<dd>Connection is disabled.</dd>
<dt><code>BACKOFF</code></dt>
<dd>Attempting to reconnect at an increasing period.</dd>
<dt><code>CONNECTING</code></dt>
<dd>Attempting to connect.</dd>
<dt><code>ACTIVE</code></dt>
<dd>Connected, remote host responsive.</dd>
<dt><code>IDLE</code></dt>
<dd>Connection is idle. Waiting for response to keep-alive.</dd>
These values may change in the future. They are provided only for
<column name="status" key="sec_since_connect"
type='{"type": "integer", "minInteger": 0}'>
The amount of time since this manager last successfully connected
to the database (in seconds). Value is empty if manager has never
successfully connected.
<column name="status" key="sec_since_disconnect"
type='{"type": "integer", "minInteger": 0}'>
The amount of time since this manager last disconnected from the
database (in seconds). Value is empty if manager has never
<column name="status" key="locks_held">
Space-separated list of the names of OVSDB locks that the connection
holds. Omitted if the connection does not hold any locks.
<column name="status" key="locks_waiting">
Space-separated list of the names of OVSDB locks that the connection is
currently waiting to acquire. Omitted if the connection is not waiting
<column name="status" key="locks_lost">
Space-separated list of the names of OVSDB locks that the connection
has had stolen by another OVSDB client. Omitted if no locks have been
stolen from this connection.
<column name="status" key="n_connections"
type='{"type": "integer", "minInteger": 2}'>
When <ref column="target"/> specifies a connection method that
listens for inbound connections (e.g. <code>ptcp:</code> or
<code>pssl:</code>) and more than one connection is actually active,
the value is the number of active connections. Otherwise, this
key-value pair is omitted.
When multiple connections are active, status columns and key-value
pairs (other than this one) report the status of one arbitrarily
<group title="Connection Parameters">
Additional configuration for a connection between the manager
and the database server.
<column name="other_config" key="dscp"
type='{"type": "integer"}'>
The Differentiated Service Code Point (DSCP) is specified using 6 bits
in the Type of Service (TOS) field in the IP header. DSCP provides a
mechanism to classify the network traffic and provide Quality of
Service (QoS) on IP networks.
The DSCP value specified here is used when establishing the
connection between the manager and the database server. If no
value is specified, a default value of 48 is chosen. Valid DSCP
values must be in the range 0 to 63.
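As a rough illustration of how a 6-bit DSCP value sits in the TOS byte,
the sketch below validates the range given above and computes the
resulting byte. The function name dscp_to_tos is hypothetical. The sketch
relies on DSCP occupying the upper six bits of the byte, so the value is
shifted left by two bits (written as a multiplication by 4).

```python
DEFAULT_DSCP = 48  # default stated in the text


def dscp_to_tos(dscp=DEFAULT_DSCP):
    """Return the TOS byte that carries the given DSCP value."""
    if dscp not in range(64):  # valid DSCP values are 0 to 63
        raise ValueError("DSCP out of range: %r" % (dscp,))
    # DSCP fills the upper 6 bits, leaving the lower 2 bits at zero.
    return dscp * 4


print(dscp_to_tos())  # 192, the TOS byte for the default DSCP of 48
```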
<table name="Physical_Switch" title="A physical switch.">
A physical switch that implements a VTEP.
<column name="ports">
The physical ports within the switch.
<group title="Network Status">
<column name="management_ips">
IPv4 or IPv6 addresses at which the switch may be contacted
for management purposes.
<column name="tunnel_ips">
IPv4 or IPv6 addresses on which the switch may originate or
This column is intended to allow a <ref table="Manager"/> to
determine the <ref table="Physical_Switch"/> that terminates
the tunnel represented by a <ref table="Physical_Locator"/>.
<group title="Identification">
Symbolic name for the switch, such as its hostname.
<column name="description">
An extended description for the switch, such as its switch login
<group title="Error Notification">
An entry in this column indicates to the NVC that this switch
has encountered a fault. The switch must clear this column
when the fault has been cleared.
<column name="switch_fault_status" key="mac_table_exhaustion">
Indicates that the switch has been unable to process MAC
entries requested by the NVC due to lack of table resources.
<column name="switch_fault_status" key="tunnel_exhaustion">
Indicates that the switch has been unable to create tunnels
requested by the NVC due to lack of resources.
<column name="switch_fault_status" key="unspecified_fault">
Indicates that an error has occurred in the switch but that no
more specific information is available.
<table name="Physical_Port" title="A port within a physical switch.">
A port within a <ref table="Physical_Switch"/>.
<column name="vlan_bindings">
Identifies how VLANs on the physical port are bound to logical switches.
If, for example, the map contains a (VLAN, logical switch) pair, a packet
that arrives on the port in the VLAN is considered to belong to the
paired logical switch.
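The lookup described above amounts to a map query keyed by VLAN. The
sketch below is a simplified illustration: the map contents, the logical
switch names, and the function name logical_switch_for are all made up
for the example, and logical switches are represented by plain names
rather than database references.

```python
# Hypothetical vlan_bindings map for one port: VLAN ID to logical switch.
vlan_bindings = {100: "ls-web", 200: "ls-db"}


def logical_switch_for(vlan, bindings):
    """Return the logical switch bound to the VLAN, or None if unbound."""
    return bindings.get(vlan)


print(logical_switch_for(100, vlan_bindings))  # ls-web
print(logical_switch_for(300, vlan_bindings))  # None
```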
<column name="vlan_stats">
Statistics for VLANs bound to logical switches on the physical port. An
implementation that fully supports such statistics would populate this
column with a mapping for every VLAN that is bound in <ref
column="vlan_bindings"/>. An implementation that does not support such
statistics or only partially supports them would not populate this column
or partially populate it, respectively.
<group title="Identification">
Symbolic name for the port. The name ought to be unique within a given
<ref table="Physical_Switch"/>, but the database is not capable of
<column name="description">
An extended description for the port.
<group title="Error Notification">
An entry in this column indicates to the NVC that the physical port has
encountered a fault. The switch must clear this column when the error
<column name="port_fault_status" key="invalid_vlan_map">
Indicates that a VLAN-to-logical-switch mapping requested by
the controller could not be instantiated by the switch
because of a conflict with local configuration.
<column name="port_fault_status" key="unspecified_fault">
Indicates that an error has occurred on the port but that no
more specific information is available.
<table name="Logical_Binding_Stats" title="Statistics for a VLAN on a physical port bound to a logical network.">
Reports statistics for the <ref table="Logical_Switch"/> with which a VLAN
on a <ref table="Physical_Port"/> is associated.
<group title="Statistics">
These statistics count only packets to which the binding applies.
<column name="packets_from_local">
Number of packets sent by the <ref table="Physical_Switch"/>.
<column name="bytes_from_local">
Number of bytes in packets sent by the <ref table="Physical_Switch"/>.
<column name="packets_to_local">
Number of packets received by the <ref table="Physical_Switch"/>.
<column name="bytes_to_local">
Number of bytes in packets received by the <ref
table="Physical_Switch"/>.
<table name="Logical_Switch" title="A layer-2 domain.">
A logical Ethernet switch, whose implementation may span physical and
virtual media, possibly crossing L3 domains via tunnels; a logical layer-2
domain; an Ethernet broadcast domain.
<group title="Per Logical-Switch Tunnel Key">
Tunnel protocols tend to have a field that allows the tunnel
to be partitioned into sub-tunnels: VXLAN has a VNI, GRE and
STT have a key, CAPWAP has a WSI, and so on. We call these
generically ``tunnel keys.'' Given that one needs to use a
tunnel key at all, there are at least two reasonable ways to
Per <ref table="Logical_Switch"/>+<ref table="Physical_Locator"/>
pair. That is, each logical switch may be assigned a different
tunnel key on every <ref table="Physical_Locator"/>. This model is
In this model, <ref table="Physical_Locator"/> carries the tunnel
key. Therefore, one <ref table="Physical_Locator"/> record will
exist for each logical switch carried at a given IP destination.
Per <ref table="Logical_Switch"/>. That is, every tunnel
associated with a particular logical switch carries the same tunnel
key, regardless of the <ref table="Physical_Locator"/> to which the
tunnel is addressed. This model may ease switch implementation
because it imposes fewer requirements on the hardware datapath.
In this model, <ref table="Logical_Switch"/> carries the tunnel
key. Therefore, one <ref table="Physical_Locator"/> record will
exist for each IP destination.
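The difference between the two models shows up in how many locator
records are needed. The sketch below counts them for a small, made-up
set of tunnels. The function name locator_records and the sample data
are hypothetical and serve only to illustrate the record counts
described above.

```python
def locator_records(tunnels, per_logical_switch_key):
    """Count Physical_Locator records for (logical_switch, dst_ip) pairs."""
    if per_logical_switch_key:
        # Key carried by Logical_Switch: one locator per IP destination.
        return len({ip for _, ip in tunnels})
    # Key carried by Physical_Locator: one locator per (switch, IP) pair.
    return len(set(tunnels))


# Two logical switches share the destination 10.0.0.1.
tunnels = [("ls-a", "10.0.0.1"), ("ls-b", "10.0.0.1"), ("ls-a", "10.0.0.2")]
print(locator_records(tunnels, per_logical_switch_key=False))  # 3
print(locator_records(tunnels, per_logical_switch_key=True))   # 2
```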
<column name="tunnel_key">
This column is used only in the tunnel key per <ref
table="Logical_Switch"/> model (see above), because only in that
model is there a tunnel key associated with a logical switch.
For <code>vxlan_over_ipv4</code> encapsulation, this column
is the VXLAN VNI that identifies a logical switch. It must
be in the range 0 to 16,777,215.
<group title="Identification">
Symbolic name for the logical switch.
<column name="description">
An extended description for the logical switch, such as its switch
<table name="Ucast_Macs_Local" title="Unicast MACs (local)">
Mapping of unicast MAC addresses to tunnels (physical
locators). This table is written by the HSC, so it contains the
MAC addresses that have been learned on physical ports by a
A MAC address that has been learned by the VTEP.
<column name="logical_switch">
The logical switch to which this mapping applies.
<column name="locator">
The physical locator to be used to reach this MAC address. In
this table, the physical locator will be one of the tunnel IP
addresses of the appropriate VTEP.
<column name="ipaddr">
The IP address to which this MAC corresponds. Optional field for
the purpose of ARP suppression.
<table name="Ucast_Macs_Remote" title="Unicast MACs (remote)">
Mapping of unicast MAC addresses to tunnels (physical
locators). This table is written by the NVC, so it contains the
MAC addresses that the NVC has learned. These include VM MAC
addresses, in which case the physical locators will be
hypervisor IP addresses. The NVC will also report MACs that it
has learned from other HSCs in the network, in which case the
physical locators will be tunnel IP addresses of the
A MAC address that has been learned by the NVC.
<column name="logical_switch">
The logical switch to which this mapping applies.
<column name="locator">
The physical locator to be used to reach this MAC address. In
this table, the physical locator will be either a hypervisor IP
address or a tunnel IP address of another VTEP.
<column name="ipaddr">
The IP address to which this MAC corresponds. Optional field for
the purpose of ARP suppression.
<table name="Mcast_Macs_Local" title="Multicast MACs (local)">
Mapping of multicast MAC addresses to tunnels (physical
locators). This table is written by the HSC, so it contains the
MAC addresses that have been learned on physical ports by a
VTEP. These may be learned by IGMP snooping, for example. This
table also specifies how to handle unknown unicast and broadcast packets.
A MAC address that has been learned by the VTEP.
The keyword <code>unknown-dst</code> is used as a special
``Ethernet address'' that indicates the locations to which
packets in a logical switch whose destination addresses do not
otherwise appear in <ref table="Ucast_Macs_Local"/> (for
unicast addresses) or <ref table="Mcast_Macs_Local"/> (for
multicast addresses) should be sent.
<column name="logical_switch">
The logical switch to which this mapping applies.
<column name="locator_set">
The physical locator set to be used to reach this MAC address. In
this table, the physical locator set will contain one or more tunnel IP
addresses of the appropriate VTEP(s).
<table name="Mcast_Macs_Remote" title="Multicast MACs (remote)">
Mapping of multicast MAC addresses to tunnels (physical
locators). This table is written by the NVC, so it contains the
MAC addresses that the NVC has learned. This
table also specifies how to handle unknown unicast and broadcast
Multicast packet replication may be handled by a service node,
in which case the physical locators will be IP addresses of
service nodes. If the VTEP supports replication onto multiple
tunnels, then this may be used to replicate directly onto
VTEP-hypervisor tunnels.
A MAC address that has been learned by the NVC.
The keyword <code>unknown-dst</code> is used as a special
``Ethernet address'' that indicates the locations to which
packets in a logical switch whose destination addresses do not
otherwise appear in <ref table="Ucast_Macs_Remote"/> (for
unicast addresses) or <ref table="Mcast_Macs_Remote"/> (for
multicast addresses) should be sent.
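The fallback role of the unknown-dst keyword can be sketched as a
three-step lookup within one logical switch. This is a simplified
illustration rather than the behavior of any particular VTEP: the
function name locators_for, the dictionaries standing in for the two
remote tables, and the sample addresses are all hypothetical.

```python
UNKNOWN_DST = "unknown-dst"


def locators_for(dst_mac, ucast, mcast):
    """Pick the locators for a destination MAC in one logical switch."""
    if dst_mac in ucast:
        return [ucast[dst_mac]]  # unicast entry: a single locator
    if dst_mac in mcast:
        return mcast[dst_mac]    # multicast entry: a locator set
    # No specific entry: fall back to the unknown-dst locator set, if any.
    return mcast.get(UNKNOWN_DST, [])


# Hypothetical stand-ins for Ucast_Macs_Remote and Mcast_Macs_Remote.
ucast = {"00:11:22:33:44:55": "10.0.0.1"}
mcast = {UNKNOWN_DST: ["10.0.0.9"]}
print(locators_for("00:11:22:33:44:55", ucast, mcast))
print(locators_for("aa:bb:cc:dd:ee:ff", ucast, mcast))
```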
<column name="logical_switch">
The logical switch to which this mapping applies.
<column name="locator_set">
The physical locator set to be used to reach this MAC address. In
this table, the physical locator set will be either a service node IP
address or a set of tunnel IP addresses of hypervisors (and
potentially other VTEPs).
<column name="ipaddr">
The IP address to which this MAC corresponds. Optional field for
the purpose of ARP suppression.
<table name="Logical_Router" title="A logical L3 router.">
A logical router, or VRF. A logical router may be connected to one or more
logical switches. Subnet addresses and interface addresses may be configured on the
<column name="switch_binding">
Maps from an IPv4 or IPv6 address prefix in CIDR notation to a
logical switch. Multiple prefixes may map to the same switch. By
writing a 32-bit (or 128-bit for v6) address with a /N prefix
length, both the router's interface address and the subnet
prefix can be configured. For example, 192.68.1.1/24 creates a
/24 subnet for the logical switch attached to the interface and
assigns the address 192.68.1.1 to the router interface.
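The way a single key yields both the interface address and the subnet
prefix can be checked with the ipaddress module from the Python standard
library. The helper name split_binding is hypothetical, and the sketch
shows only the address arithmetic, not how a VTEP processes the column.

```python
import ipaddress


def split_binding(prefix):
    """Split a switch_binding key into (interface address, subnet)."""
    iface = ipaddress.ip_interface(prefix)
    return str(iface.ip), str(iface.network)


print(split_binding("192.68.1.1/24"))  # ('192.68.1.1', '192.68.1.0/24')
```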
<column name="static_routes">
One or more static routes, mapping IP prefixes to next hop IP addresses.
<group title="Identification">
Symbolic name for the logical router.
<column name="description">
An extended description for the logical router.
<table name="Arp_Sources_Local" title="ARP source addresses for logical routers">
MAC address to be used when a VTEP issues ARP requests on behalf
A distributed logical router is implemented by a set of VTEPs
(both hardware VTEPs and vswitches). In order for a given VTEP
to populate the local ARP cache for a logical router, it issues
ARP requests with a source MAC address that is unique to the VTEP. A
single per-VTEP MAC can be re-used across all logical
networks. This table contains the MACs that are used by the
VTEPs of a given HSC. The table provides the mapping from MAC to
physical locator for each VTEP so that replies to the ARP
requests can be sent back to the correct VTEP using the
appropriate physical locator.
<column name="src_mac">
The source MAC to be used by a given VTEP.
<column name="locator">
The <ref table="Physical_Locator"/> to use for replies to ARP
requests from this MAC address.
<table name="Arp_Sources_Remote" title="ARP source addresses for logical routers">
MAC address to be used when a remote VTEP issues ARP requests on behalf
This table is the remote counterpart of <ref
table="Arp_Sources_Local"/>. The NVC writes this table to notify
the HSC of the MACs that will be used by remote VTEPs when they
issue ARP requests on behalf of a distributed logical router.
<column name="src_mac">
The source MAC to be used by a given VTEP.
<column name="locator">
The <ref table="Physical_Locator"/> to use for replies to ARP
requests from this MAC address.
<table name="Physical_Locator_Set">
A set of one or more <ref table="Physical_Locator"/>s.
This table exists only because OVSDB does not have a way to
express the type ``map from string to one or more <ref
table="Physical_Locator"/> records.''
<column name="locators"/>
<table name="Physical_Locator">
Identifies an endpoint to which logical switch traffic may be
encapsulated and forwarded.
For the <code>vxlan_over_ipv4</code> encapsulation, the only
encapsulation defined so far, all endpoints associated with a given <ref
table="Logical_Switch"/> must use a common tunnel key, which is carried
in the <ref table="Logical_Switch" column="tunnel_key"/> column of <ref
table="Logical_Switch"/>.
For some encapsulations yet to be defined, we expect <ref
table="Physical_Locator"/> to identify both an endpoint and a tunnel key.
When the first such encapsulation is defined, we expect to add a
``tunnel_key'' column to <ref table="Physical_Locator"/> to allow the
tunnel key to be defined.
See the ``Per Logical-Switch Tunnel Key'' section in the <ref
table="Logical_Switch"/> table for further discussion of the model.
<column name="encapsulation_type">
The type of tunneling encapsulation.
<column name="dst_ip">
For <code>vxlan_over_ipv4</code> encapsulation, the IPv4 address of the
VXLAN tunnel endpoint.
We expect that this column could be used for IPv4 or IPv6 addresses in
encapsulations to be introduced later.
<group title="Bidirectional Forwarding Detection (BFD)">
BFD, defined in RFC 5880, allows point-to-point detection of
connectivity failures by occasional transmission of BFD control
messages. VTEPs are expected to implement BFD.
BFD operates by regularly transmitting BFD control messages at a
rate negotiated independently in each direction. Each endpoint
specifies the rate at which it expects to receive control messages,
and the rate at which it's willing to transmit them. An endpoint
which fails to receive BFD control messages for a period of three
times the expected reception interval will signal a connectivity
fault. In the case of a unidirectional connectivity issue, the
system not receiving BFD control messages will signal the problem
to its peer in the messages it transmits.
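The timing described above can be worked through numerically. The sketch
below assumes the multiplier of three given in the text and assumes that
the negotiated interval in one direction is the larger (slower) of what
the receiver offers and what the sender is willing to do. The function
and parameter names are hypothetical.

```python
DETECT_MULT = 3  # the text: three times the expected reception interval


def rx_interval_ms(local_min_rx, remote_min_tx):
    """Interval, in ms, at which control messages are expected to arrive."""
    # The slower (larger) of the two intervals wins the negotiation.
    return max(local_min_rx, remote_min_tx)


def detection_time_ms(local_min_rx, remote_min_tx):
    """Time without control messages after which a fault is signaled."""
    return DETECT_MULT * rx_interval_ms(local_min_rx, remote_min_tx)


# A receiver offering 1000 ms and a sender willing to send every 100 ms:
print(detection_time_ms(1000, 100))  # 3000
```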
A hardware VTEP is expected to use BFD to determine reachability of
devices at the end of the tunnels with which it exchanges data. This
can enable the VTEP to choose a functioning service node among a set of
service nodes providing high availability. It also enables the NVC to
report the health status of tunnels.
In most cases the BFD peer of a hardware VTEP will be an Open vSwitch
instance. The Open vSwitch implementation of BFD aims to comply
faithfully with the requirements put forth in RFC 5880. Open vSwitch
does not implement the optional Authentication or ``Echo Mode''
<group title="BFD Configuration">
A controller sets up key-value pairs in the <ref column="bfd"/>
column to enable and configure BFD.
<column name="bfd" key="enable" type='{"type": "boolean"}'>
True to enable BFD on this <ref table="Physical_Locator"/>.
<column name="bfd" key="min_rx"
type='{"type": "integer", "minInteger": 1}'>
The shortest interval, in milliseconds, at which this BFD session
offers to receive BFD control messages. The remote endpoint may
choose to send messages at a slower rate. Defaults to
<column name="bfd" key="min_tx"
type='{"type": "integer", "minInteger": 1}'>
The shortest interval, in milliseconds, at which this BFD session is
willing to transmit BFD control messages. Messages will actually be
transmitted at a slower rate if the remote endpoint is not willing to
receive as quickly as specified. Defaults to <code>100</code>.
<column name="bfd" key="decay_min_rx" type='{"type": "integer"}'>
An alternate receive interval, in milliseconds, that must be greater
than or equal to <ref column="bfd" key="min_rx"/>. The
implementation switches from <ref column="bfd" key="min_rx"/> to <ref
column="bfd" key="decay_min_rx"/> when there is no obvious incoming
data traffic at the interface, to reduce the CPU and bandwidth cost
of monitoring an idle interface. This feature may be disabled by
setting a value of 0. This feature is reset whenever <ref
column="bfd" key="decay_min_rx"/> or <ref column="bfd" key="min_rx"/>
<column name="bfd" key="forwarding_if_rx" type='{"type": "boolean"}'>
True to consider the interface capable of packet I/O as long as it
continues to receive any packets (not just BFD packets). This
prevents link congestion that causes consecutive BFD control packets
to be lost from marking the interface down.
<column name="bfd" key="cpath_down" type='{"type": "boolean"}'>
Set to true to notify the remote endpoint that traffic should not be
forwarded to this system for some reason other than a connectivity
failure on the interface being monitored. The typical underlying
reason is ``concatenated path down,'' that is, that connectivity
beyond the local system is down. Defaults to false.
<column name="bfd" key="check_tnl_key" type='{"type": "boolean"}'>
Set to true to make BFD accept only control messages with a tunnel
key of zero. By default, BFD accepts control messages with any
<column name="bfd" key="bfd_dst_mac">
Set to an Ethernet address in the form
<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>
to set the MAC used as destination for transmitted BFD packets and
expected as destination for received BFD packets. The default is
<code>00:23:20:00:00:01</code>.
<group title="BFD Status">
The VTEP sets key-value pairs in the <ref column="bfd_status"/>
column to report the status of BFD on this interface. When BFD is
not enabled, with <ref column="bfd" key="enable"/>, the switch clears
all key-value pairs from <ref column="bfd_status"/>.
<column name="bfd_status" key="state"
type='{"type": "string",
"enum": ["set", ["admin_down", "down", "init", "up"]]}'>
Reports the state of the BFD session. The BFD session is fully
healthy and negotiated if <code>up</code>.
<column name="bfd_status" key="forwarding" type='{"type": "boolean"}'>
Reports whether the BFD session believes this <ref
table="Physical_Locator"/> may be used to forward traffic. Typically
this means the local session is signaling <code>up</code>, and the
remote system isn't signaling a problem such as concatenated path
<column name="bfd_status" key="diagnostic">
In case of a problem, set to a short message that reports what the
local BFD session thinks is wrong.
<column name="bfd_status" key="remote_state"
type='{"type": "string",
"enum": ["set", ["admin_down", "down", "init", "up"]]}'>
Reports the state of the remote endpoint's BFD session.
<column name="bfd_status" key="remote_diagnostic">
In case of a problem, set to a short message that reports what the
remote endpoint's BFD session thinks is wrong.