*.loT
*.mod.c
*.o
-*.o
+*.obj
+*.exe
+*.exp
+*.ilk
+*.lib
+*.pdb
*.pyc
*.so
*~
James P. roampune@gmail.com
James Page james.page@ubuntu.com
Jarno Rajahalme jrajahalme@nicira.com
+Jason Kölker jason@koelker.net
Jean Tourrilhes jt@hpl.hp.com
Jeremy Stribling strib@nicira.com
Jesse Gross jesse@nicira.com
André Ruß andre.russ@hybris.com
Andreas Beckmann debian@abeckmann.de
Andrei Andone andrei.andone@softvision.ro
+Anshuman Manral anshuman.manral@outlook.com
Anton Matsiuk anton.matsiuk@gmail.com
Anuprem Chalvadi achalvadi@vmware.com
+Ariel Tubaltsev atubaltsev@vmware.com
Atzm Watanabe atzm@stratosphere.co.jp
Bastian Blank waldi@debian.org
Ben Basler bbasler@nicira.com
Henrik Amren henrik@nicira.com
Hiroshi Tanaka htanaka@nicira.com
Hiroshi Miyata miyahiro.dazu@gmail.com
+Hyojoon Kim joonk@gatech.edu
Igor Ganichev iganichev@nicira.com
Igor Sever igor@xorops.com
Jacob Cherkas jcherkas@nicira.com
Logan Rosen logatronico@gmail.com
Luca Falavigna dktrkranz@debian.org
Luiz Henrique Ozaki luiz.ozaki@gmail.com
+Marco d'Itri md@Linux.IT
Maxime Brun m.brun@alphalink.fr
Michael A. Collins mike.a.collins@ark-net.org
Michael Hu mhu@nicira.com
"/bin/link.exe", rename /bin/link.exe to something else so that the
Visual Studio's linker is used.
+* For pthread support, install the library, DLL, and includes of the
+pthreads-win32 project from
+ftp://sourceware.org/pub/pthreads-win32/prebuilt-dll-2-9-1-release to a
+directory (e.g.: C:/pthread).
+
* Get the Open vSwitch sources, either by cloning the repo with git
or by downloading a distribution tarball.
the right compiler, linker, libraries, Open vSwitch component installation
directories, etc. For example,
- % ./configure CC=./build-aux/cccl LD="`which link`" LIBS="-lws2_32 ..." \
+ % ./configure CC=./build-aux/cccl LD="`which link`" LIBS="-lws2_32" \
--prefix="C:/openvswitch/usr" --localstatedir="C:/openvswitch/var" \
- --sysconfdir="C:/openvswitch/etc"
+ --sysconfdir="C:/openvswitch/etc" --with-pthread="C:/pthread"
* Run make for the ported executables in the top source directory, e.g.:
- % make utilities/ovs-vsctl.exe ovsdb/ovsdb-server.exe
+ % make lib/vswitch-idl.h lib/vtep-idl.h ofproto/ipfix-entities.def
+ % make ovsdb/ovsdb-server.exe ovsdb/ovsdb-tool.exe ovsdb/ovsdb-client.exe \
+ utilities/ovs-vsctl.exe utilities/ovs-ofctl.exe \
+ utilities/ovs-dpctl.exe vswitchd/ovs-vswitchd.exe \
+ utilities/ovs-appctl.exe
OpenSSL, Open vSwitch and Visual C++
------------------------------------
include/windows/getopt.h
lib/getopt_long.c
+The following files are licensed under the 3-clause BSD license:
+ include/windows/netinet/icmp6.h
+ include/windows/netinet/ip6.h
+ lib/strsep.c
+
Files under the xenserver directory are licensed on a file-by-file
basis. Refer to each file for details.
C DIALECT
- Some C99 features are OK because they are widely implemented even in
-older compilers:
+ Some C99 features are OK because they are widely implemented:
* Flexible array members (e.g. struct { int foo[]; }).
only take on the values 0 or 1, because this behavior can't be
simulated on C89 compilers.
+ * Designated initializers (e.g. "struct foo foo = {.a = 1};" and
+ "int a[] = {[2] = 5};").
+
Don't use other C99 features that are not widely implemented in
older compilers:
- * Don't use designated initializers (e.g. don't write "struct foo
- foo = {.a = 1};" or "int a[] = {[2] = 5};").
-
* Don't mix declarations and code within a block.
* Don't use declarations in iteration statements (e.g. don't write
The table for 1.3 is the same as the one shown above for 1.2.
+OpenFlow 1.4
+------------
+
+OpenFlow 1.4 does not change flow_mod semantics.
+
+
OFPT_PACKET_IN
==============
1.11.x 2.6.18 to 3.8
2.0.x 2.6.32 to 3.10
2.1.x 2.6.32 to 3.11
+ 2.2.x 2.6.32 to 3.12
Open vSwitch userspace should also work with the Linux kernel module
built into Linux 3.3 and later.
tc-htb(8), tc-hfsc(8)), web resources (e.g. http://lartc.org/), or
mailing lists (e.g. http://vger.kernel.org/vger-lists.html#netdev).
+Q: Does Open vSwitch support OpenFlow meters?
+
+A: Since version 2.0, Open vSwitch has OpenFlow protocol support for
+ OpenFlow meters. There is no implementation of meters in the Open
+ vSwitch software switch (neither the kernel-based nor userspace
+ switches).
+
VLANs
-----
Q: What versions of OpenFlow does Open vSwitch support?
-A: Open vSwitch 1.9 and earlier support only OpenFlow 1.0 (plus
- extensions that bring in many of the features from later versions
- of OpenFlow).
+A: The following table lists the versions of OpenFlow supported by
+ each version of Open vSwitch:
- Open vSwitch 1.10 and later have experimental support for OpenFlow
- 1.2 and 1.3. On these versions of Open vSwitch, the following
- command enables OpenFlow 1.0, 1.2, and 1.3 on bridge br0:
+ Open vSwitch OF1.0 OF1.1 OF1.2 OF1.3 OF1.4
+ =============== ===== ===== ===== ===== =====
+ 1.9 and earlier yes --- --- --- ---
+ 1.10 yes --- [*] [*] ---
+ 1.11 yes --- [*] [*] ---
+ 2.0 yes [*] [*] [*] ---
+ 2.1 yes [*] [*] [*] ---
+ 2.2 yes [*] [*] [*] [%]
- ovs-vsctl set bridge br0 protocols=OpenFlow10,OpenFlow12,OpenFlow13
+ [*] Supported, with one or more missing features.
+ [%] Support is unsafe: ovs-vswitchd will abort when certain
+ unimplemented features are tested.
- Open vSwitch version 2.0 and later will have experimental support
- for OpenFlow 1.1, 1.2, and 1.3. On these versions of Open vSwitch,
- the following command enables OpenFlow 1.0, 1.1, 1.2, and 1.3 on
- bridge br0:
+ Because of missing features, OpenFlow 1.1, 1.2, and 1.3 must be
+ enabled manually. The following command enables OpenFlow 1.0, 1.1,
+ 1.2, and 1.3 on bridge br0:
ovs-vsctl set bridge br0 protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13
ovs-ofctl -O OpenFlow13 dump-flows br0
- Support for OpenFlow 1.1, 1.2, and 1.3 is still incomplete. Work
- to be done is tracked in OPENFLOW-1.1+ in the Open vSwitch sources
- (also via http://openvswitch.org/development/openflow-1-x-plan/).
- When support for a given OpenFlow version is solidly implemented,
- Open vSwitch will enable that version by default.
+ OpenFlow 1.4 is a special case, because it is not implemented
+ safely: ovs-vswitchd will abort when certain unimplemented features
+ are tested. Thus, for now it is suitable only for experimental
+ use. ovs-vswitchd will only allow OpenFlow 1.4 to be enabled
+ (which must be done in the same way described above) when it is
+ invoked with a special --enable-of14 command line option.
+
+ OPENFLOW-1.1+ in the Open vSwitch source tree tracks support for
+ OpenFlow 1.1 and later features. When support for a given OpenFlow
+ version is solidly implemented, Open vSwitch will enable that
+ version by default.
Q: Does Open vSwitch support MPLS?
- INSTALL.RHEL
- INSTALL.XenServer
- INSTALL.NetBSD
+ - INSTALL.DPDK
Build Requirements
------------------
Before starting ovs-vswitchd itself, you need to start its
configuration database, ovsdb-server. Each machine on which Open
vSwitch is installed should run its own copy of ovsdb-server.
-Configure it to use the database you created during step 7 of
-installation, above, to listen on a Unix domain socket, to connect to
-any managers specified in the database itself, and to use the SSL
+Configure it to use the database you created during installation (as
+explained above), to listen on a Unix domain socket, to connect to any
+managers specified in the database itself, and to use the SSL
configuration in the database:
% ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
--- /dev/null
+ Using Open vSwitch with DPDK
+ ============================
+
+Open vSwitch can use the Intel(R) DPDK library to operate entirely in
+userspace.  This file explains how to install and use Open vSwitch in
+such a mode.
+
+The DPDK support of Open vSwitch is considered experimental.
+It has not been thoroughly tested.
+
+This version of Open vSwitch should be built manually with "configure"
+and "make".
+
+Building and Installing:
+------------------------
+
+Using DPDK 1.6 is recommended.
+
+DPDK:
+cd DPDK
+Update config/defconfig_x86_64-default-linuxapp-gcc so that DPDK generates a
+single library file:
+CONFIG_RTE_BUILD_COMBINE_LIBS=y
+
+make install T=x86_64-default-linuxapp-gcc
+For details refer to http://dpdk.org/
+
+Linux kernel:
+Refer to intel-dpdk-getting-started-guide.pdf to understand the
+DPDK kernel requirements.
+
+OVS:
+cd $(OVS_DIR)/openvswitch
+./boot.sh
+./configure --with-dpdk=$(DPDK_BUILD)
+make
+
+Refer to INSTALL.userspace for general requirements of building
+userspace OVS.
+
+Using the DPDK with ovs-vswitchd:
+---------------------------------
+
+First, set up the DPDK devices:
+ - insert uio.ko
+ - insert igb_uio.ko
+ e.g. insmod DPDK/x86_64-default-linuxapp-gcc/kmod/igb_uio.ko
+ - mount hugetlbfs
+ e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/
+ - Bind the network device to igb_uio.
+ e.g. DPDK/tools/pci_unbind.py --bind=igb_uio eth1
+
+Refer to http://www.dpdk.org/doc/quick-start to verify the DPDK setup.
+
+Start vswitchd:
+DPDK configuration arguments can be passed to vswitchd via the `--dpdk`
+argument. The DPDK -c argument is ignored by ovs-dpdk, but it is a
+required parameter for DPDK initialization.
+
+ e.g.
+ ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach
+
+To use ovs-vswitchd with DPDK, create a bridge with datapath_type
+"netdev" in the configuration database. For example:
+
+ ovs-vsctl add-br br0
+ ovs-vsctl set bridge br0 datapath_type=netdev
+
+Now you can add dpdk devices. OVS expects DPDK device names to start with
+"dpdk" and end with the port id. vswitchd should print the number of dpdk
+devices found.
+
+ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
+
+Once the first DPDK port is added to vswitchd, it creates a polling thread
+that polls the dpdk device in a continuous loop. Therefore CPU utilization
+for that thread is always 100%.
+
+Restrictions:
+-------------
+
+  - This support is for physical NICs; it has been tested with Intel NICs
+    only.
+  - The vswitchd userspace datapath affines the polling thread, but devices
+    are assumed to be on NUMA node 0. Therefore, if a device is attached
+    to a nonzero NUMA node, switching performance will be suboptimal.
+  - A fixed number of polling threads and a fixed number of per-device
+    queues are configured.
+  - Works only with a 1500 MTU; a few changes in the DPDK lib are needed
+    to fix this issue.
+  - Currently the DPDK port does not make use of any offload functionality.
+
+Bug Reporting:
+--------------
+
+Please report problems to bugs@openvswitch.org.
if WIN32
AM_CPPFLAGS += -I $(top_srcdir)/include/windows
+AM_CPPFLAGS += $(PTHREAD_INCLUDES)
+AM_LDFLAGS += $(PTHREAD_LDFLAGS)
endif
-AM_CPPFLAGS += $(SSL_INCLUDES)
-
AM_CPPFLAGS += -I $(top_srcdir)/include
AM_CPPFLAGS += -I $(top_srcdir)/lib
AM_CPPFLAGS += -I $(top_builddir)/lib
+AM_CPPFLAGS += $(SSL_INCLUDES)
+
AM_CFLAGS = -Wstrict-prototypes
AM_CFLAGS += $(WARNING_FLAGS)
FAQ \
INSTALL \
INSTALL.Debian \
+ INSTALL.DPDK \
INSTALL.Fedora \
INSTALL.KVM \
INSTALL.Libvirt \
Post-v2.1.0
---------------------
+ - ovs-vsctl now reports when ovs-vswitchd fails to create a new port or
+ bridge.
- The "ovsdbmonitor" graphical tool has been removed, because it was
poorly maintained and not widely used.
- New "check-ryu" Makefile target for running Ryu tests for OpenFlow
- Upon the receipt of a SIGHUP signal, ovs-vswitchd no longer reopens its
log file (it will terminate instead). Please use 'ovs-appctl vlog/reopen'
instead.
+   - Support for Linux kernels up to 3.12. On kernel 3.12, OVS uses the
+     tunnel API for GRE and VXLAN.
+ - Added DPDK support.
v2.1.0 - xx xxx xxxx
OpenFlow 1.1+ support in Open vSwitch
=====================================
-Open vSwitch support for OpenFlow 1.1, 1.2, and 1.3 is a work in
+Open vSwitch support for OpenFlow 1.1 and beyond is a work in
progress. This file describes the work still to be done.
The Plan
The list of remaining work items for OpenFlow 1.1 is below. It is
probably incomplete.
- * OFPT_TABLE_MOD message. This is new in OF1.1, so we need to
- implement it. It should be implemented so that the default OVS
- behavior does not change. Simon Horman has posted a patch.
- [required for OF1.1 and OF1.2]
-
* MPLS. Simon Horman maintains a patch series that adds this
feature. This is partially merged.
[optional for OF1.1+]
In case 2) the VMs expect ARP replies from each other, but this is not
possible over a layer 3 tunnel. One solution is to have static MAC address
entries preconfigured on the VMs (e.g., `arp -f /etc/ethers` on startup on
-Unix based VMs), or have the hypervisor do proxy ARP.
+Unix based VMs), or have the hypervisor do proxy ARP. In this scenario, the
+eth0 interfaces need not be added to the br0 bridge in the examples below.
On the receiving side, the packet arrives without the original MAC header.
The LISP tunneling code attaches a header with hardcoded source and destination
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 lisp0 -- set Interface lisp0 type=lisp options:remote_ip=flow options:key=flow
-Flows on br0 are configured as follows:
+The last command sets up flow-based tunneling on the lisp0 interface. From
+the LISP point of view, this is like having the Tunnel Router map cache
+implemented as flow rules.
+
+Flows on br0 should be configured as follows:
priority=3,dl_dst=02:00:00:00:00:00,action=mod_dl_dst:<VMx_MAC>,output:1
priority=2,in_port=1,dl_type=0x0806,action=NORMAL
priority=1,in_port=1,dl_type=0x0800,vlan_tci=0,nw_src=<EID_prefix>,action=set_field:<OVSx_IP>->tun_dst,output:3
priority=0,action=NORMAL
-Optionally, if you want to use Instance ID in a flow, you can set it with
-"action=set_tunnel:<IID>".
+The third rule is like a map cache entry: the <EID_prefix> specified by the
+nw_src match field is mapped to the RLOC <OVSx_IP>, which is set as the tunnel
+destination for this particular flow.
+
+Optionally, if you want to use Instance ID in a flow, you can add
+"set_tunnel:<IID>" to the action list.
Patch
-----
-The patch should be in the body of the email following the descrition,
+The patch should be in the body of the email following the description,
separated by a blank line.
Patches should be in "diff -up" format. We recommend that you use Git
AC_MSG_RESULT([$kversion])
if test "$version" -ge 3; then
- if test "$version" = 3 && test "$patchlevel" -le 11; then
+ if test "$version" = 3 && test "$patchlevel" -le 12; then
: # Linux 3.x
else
- AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version newer than 3.11.x is not supported])
+ AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version newer than 3.12.x is not supported])
fi
else
if test "$version" -le 1 || test "$patchlevel" -le 5 || test "$sublevel" -le 31; then
AM_CONDITIONAL(LINUX_ENABLED, test -n "$KBUILD")
])
+dnl OVS_CHECK_DPDK
+dnl
+dnl Configure DPDK source tree
+AC_DEFUN([OVS_CHECK_DPDK], [
+ AC_ARG_WITH([dpdk],
+ [AC_HELP_STRING([--with-dpdk=/path/to/dpdk],
+                    [Specify the DPDK build directory])])
+
+ if test X"$with_dpdk" != X; then
+ RTE_SDK=$with_dpdk
+
+ DPDK_INCLUDE=$RTE_SDK/include
+ DPDK_LIB_DIR=$RTE_SDK/lib
+ DPDK_LIBS="$DPDK_LIB_DIR/libintel_dpdk.a"
+
+ LIBS="$DPDK_LIBS $LIBS"
+ CPPFLAGS="-I$DPDK_INCLUDE $CPPFLAGS"
+
+ AC_DEFINE([DPDK_NETDEV], [1], [System uses the DPDK module.])
+ else
+ RTE_SDK=
+ fi
+
+ AM_CONDITIONAL([DPDK_NETDEV], test -n "$RTE_SDK")
+])
+
dnl OVS_GREP_IFELSE(FILE, REGEX, [IF-MATCH], [IF-NO-MATCH])
dnl
dnl Greps FILE for REGEX. If it matches, runs IF-MATCH, otherwise IF-NO-MATCH.
-L*)
path=`echo "$1" | sed 's/-L//'`
- linkopt="$linkopt ${slash}LIBPATH:\"$path\""
+ linkopt="$linkopt ${slash}LIBPATH:$path"
cl_linkopt="${slash}link ${slash}LIBPATH:\"$path\""
;;
version_map = {"1.0": 0x01,
"1.1": 0x02,
"1.2": 0x03,
- "1.3": 0x04}
+ "1.3": 0x04,
+ "1.4": 0x05}
version_reverse_map = dict((v, k) for (k, v) in version_map.iteritems())
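The intent of the pair of maps just extended can be sketched standalone (this uses `.items()` so it runs under Python 3; the file itself uses `iteritems()` under Python 2):

```python
# OpenFlow wire-protocol version numbers keyed by version string,
# mirroring the table above.
version_map = {"1.0": 0x01,
               "1.1": 0x02,
               "1.2": 0x03,
               "1.3": 0x04,
               "1.4": 0x05}

# Reverse lookup: wire-protocol version number -> version string.
version_reverse_map = dict((v, k) for (k, v) in version_map.items())

print(version_reverse_map[0x05])  # -> 1.4
```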
token = None
"1.2": (OFP12_VERSION, OFP12_VERSION),
"1.3": (OFP13_VERSION, OFP13_VERSION),
"1.4": (OFP14_VERSION, OFP14_VERSION),
- "1.0+": (OFP10_VERSION, OFP13_VERSION),
- "1.1+": (OFP11_VERSION, OFP13_VERSION),
- "1.2+": (OFP12_VERSION, OFP13_VERSION),
- "1.3+": (OFP13_VERSION, OFP13_VERSION),
+ "1.0+": (OFP10_VERSION, OFP14_VERSION),
+ "1.1+": (OFP11_VERSION, OFP14_VERSION),
+ "1.2+": (OFP12_VERSION, OFP14_VERSION),
+ "1.3+": (OFP13_VERSION, OFP14_VERSION),
"1.4+": (OFP14_VERSION, OFP14_VERSION),
"1.0-1.1": (OFP10_VERSION, OFP11_VERSION),
"1.0-1.2": (OFP10_VERSION, OFP12_VERSION),
AC_ARG_VAR(KARCH, [Kernel Architecture String])
AC_SUBST(KARCH)
OVS_CHECK_LINUX
+OVS_CHECK_DPDK
AC_CONFIG_FILES(Makefile)
AC_CONFIG_FILES(datapath/Makefile)
int ovs_net_id __read_mostly;
+/* Check if we need to build a reply message.
+ * OVS userspace sets the NLM_F_ECHO flag if it needs the reply. */
+static bool ovs_must_notify(struct genl_info *info,
+ const struct genl_multicast_group *grp)
+{
+ return info->nlhdr->nlmsg_flags & NLM_F_ECHO ||
+ netlink_has_listeners(genl_info_net(info)->genl_sock, grp->id);
+}
+
static void ovs_notify(struct sk_buff *skb, struct genl_info *info,
struct genl_multicast_group *grp)
{
return &dp->ports[port_no & (DP_VPORT_HASH_BUCKETS - 1)];
}
+/* Called with ovs_mutex or RCU read lock. */
struct vport *ovs_lookup_vport(const struct datapath *dp, u16 port_no)
{
struct vport *vport;
+ nla_total_size(acts->actions_len); /* OVS_FLOW_ATTR_ACTIONS */
}
-/* Called with ovs_mutex. */
+/* Called with ovs_mutex or RCU read lock. */
static int ovs_flow_cmd_fill_info(struct sw_flow *flow, struct datapath *dp,
struct sk_buff *skb, u32 portid,
u32 seq, u32 flags, u8 cmd)
return err;
}
+/* Must be called with ovs_mutex. */
static struct sk_buff *ovs_flow_cmd_alloc_info(struct sw_flow *flow,
- struct genl_info *info)
+ struct genl_info *info,
+ bool always)
{
+ struct sk_buff *skb;
size_t len;
+ if (!always && !ovs_must_notify(info, &ovs_dp_flow_multicast_group))
+ return NULL;
+
len = ovs_flow_cmd_msg_size(ovsl_dereference(flow->sf_acts));
- return genlmsg_new_unicast(len, info, GFP_KERNEL);
+ skb = genlmsg_new_unicast(len, info, GFP_KERNEL);
+
+ if (!skb)
+ return ERR_PTR(-ENOMEM);
+
+ return skb;
}
+/* Must be called with ovs_mutex. */
static struct sk_buff *ovs_flow_cmd_build_info(struct sw_flow *flow,
struct datapath *dp,
struct genl_info *info,
- u8 cmd)
+ u8 cmd, bool always)
{
struct sk_buff *skb;
int retval;
- skb = ovs_flow_cmd_alloc_info(flow, info);
- if (!skb)
- return ERR_PTR(-ENOMEM);
+ skb = ovs_flow_cmd_alloc_info(flow, info, always);
+ if (!skb || IS_ERR(skb))
+ return skb;
retval = ovs_flow_cmd_fill_info(flow, dp, skb, info->snd_portid,
info->snd_seq, 0, cmd);
goto err_kfree;
}
} else if (info->genlhdr->cmd == OVS_FLOW_CMD_NEW) {
+ /* OVS_FLOW_CMD_NEW must have actions. */
error = -EINVAL;
goto error;
}
goto err_flow_free;
}
- reply = ovs_flow_cmd_build_info(flow, dp, info, OVS_FLOW_CMD_NEW);
+ reply = ovs_flow_cmd_build_info(flow, dp, info,
+ OVS_FLOW_CMD_NEW, false);
} else {
/* We found a matching flow. */
- struct sw_flow_actions *old_acts;
-
/* Bail out if we're not allowed to modify an existing flow.
* We accept NLM_F_CREATE in place of the intended NLM_F_EXCL
* because Generic Netlink treats the latter as a dump
if (!ovs_flow_cmp_unmasked_key(flow, &match))
goto err_unlock_ovs;
- /* Update actions. */
- old_acts = ovsl_dereference(flow->sf_acts);
- rcu_assign_pointer(flow->sf_acts, acts);
- ovs_nla_free_flow_actions(old_acts);
+ /* Update actions, if present. */
+ if (acts) {
+ struct sw_flow_actions *old_acts;
- reply = ovs_flow_cmd_build_info(flow, dp, info, OVS_FLOW_CMD_NEW);
+ old_acts = ovsl_dereference(flow->sf_acts);
+ rcu_assign_pointer(flow->sf_acts, acts);
+ ovs_nla_free_flow_actions(old_acts);
+ }
+ reply = ovs_flow_cmd_build_info(flow, dp, info,
+ OVS_FLOW_CMD_NEW, false);
/* Clear stats. */
if (a[OVS_FLOW_ATTR_CLEAR])
}
ovs_unlock();
- if (!IS_ERR(reply))
- ovs_notify(reply, info, &ovs_dp_flow_multicast_group);
- else
- netlink_set_err(sock_net(skb->sk)->genl_sock, 0,
- ovs_dp_flow_multicast_group.id, PTR_ERR(reply));
+ if (reply) {
+ if (!IS_ERR(reply))
+ ovs_notify(reply, info, &ovs_dp_flow_multicast_group);
+ else
+ netlink_set_err(sock_net(skb->sk)->genl_sock, 0,
+ ovs_dp_flow_multicast_group.id,
+ PTR_ERR(reply));
+ }
return 0;
err_flow_free:
goto unlock;
}
- reply = ovs_flow_cmd_build_info(flow, dp, info, OVS_FLOW_CMD_NEW);
+ reply = ovs_flow_cmd_build_info(flow, dp, info, OVS_FLOW_CMD_NEW, true);
if (IS_ERR(reply)) {
err = PTR_ERR(reply);
goto unlock;
goto unlock;
}
- reply = ovs_flow_cmd_alloc_info(flow, info);
- if (!reply) {
- err = -ENOMEM;
+ reply = ovs_flow_cmd_alloc_info(flow, info, false);
+ if (IS_ERR(reply)) {
+ err = PTR_ERR(reply);
goto unlock;
}
ovs_flow_tbl_remove(&dp->table, flow);
- err = ovs_flow_cmd_fill_info(flow, dp, reply, info->snd_portid,
- info->snd_seq, 0, OVS_FLOW_CMD_DEL);
- BUG_ON(err < 0);
+ if (reply) {
+ err = ovs_flow_cmd_fill_info(flow, dp, reply, info->snd_portid,
+ info->snd_seq, 0,
+ OVS_FLOW_CMD_DEL);
+ BUG_ON(err < 0);
+ }
ovs_flow_free(flow, true);
ovs_unlock();
- ovs_notify(reply, info, &ovs_dp_flow_multicast_group);
+ if (reply)
+ ovs_notify(reply, info, &ovs_dp_flow_multicast_group);
return 0;
unlock:
ovs_unlock();
return msgsize;
}
+/* Called with ovs_mutex or RCU read lock. */
static int ovs_dp_cmd_fill_info(struct datapath *dp, struct sk_buff *skb,
u32 portid, u32 seq, u32 flags, u8 cmd)
{
ovs_header->dp_ifindex = get_dpifindex(dp);
- rcu_read_lock();
err = nla_put_string(skb, OVS_DP_ATTR_NAME, ovs_dp_name(dp));
- rcu_read_unlock();
if (err)
goto nla_put_failure;
return -EMSGSIZE;
}
-static struct sk_buff *ovs_dp_cmd_build_info(struct datapath *dp,
- struct genl_info *info, u8 cmd)
+static struct sk_buff *ovs_dp_cmd_alloc_info(struct genl_info *info)
{
- struct sk_buff *skb;
- int retval;
-
- skb = genlmsg_new_unicast(ovs_dp_cmd_msg_size(), info, GFP_KERNEL);
- if (!skb)
- return ERR_PTR(-ENOMEM);
-
- retval = ovs_dp_cmd_fill_info(dp, skb, info->snd_portid, info->snd_seq, 0, cmd);
- if (retval < 0) {
- kfree_skb(skb);
- return ERR_PTR(retval);
- }
- return skb;
+ return genlmsg_new_unicast(ovs_dp_cmd_msg_size(), info, GFP_KERNEL);
}
-/* Called with ovs_mutex. */
+/* Called with rcu_read_lock or ovs_mutex. */
static struct datapath *lookup_datapath(struct net *net,
struct ovs_header *ovs_header,
struct nlattr *a[OVS_DP_ATTR_MAX + 1])
else {
struct vport *vport;
- rcu_read_lock();
vport = ovs_vport_locate(net, nla_data(a[OVS_DP_ATTR_NAME]));
dp = vport && vport->port_no == OVSP_LOCAL ? vport->dp : NULL;
- rcu_read_unlock();
}
return dp ? dp : ERR_PTR(-ENODEV);
}
if (!a[OVS_DP_ATTR_NAME] || !a[OVS_DP_ATTR_UPCALL_PID])
goto err;
- ovs_lock();
+ reply = ovs_dp_cmd_alloc_info(info);
+ if (!reply)
+ return -ENOMEM;
err = -ENOMEM;
dp = kzalloc(sizeof(*dp), GFP_KERNEL);
if (dp == NULL)
- goto err_unlock_ovs;
+ goto err_free_reply;
ovs_dp_set_net(dp, hold_net(sock_net(skb->sk)));
ovs_dp_change(dp, a);
+ /* So far only local changes have been made, now need the lock. */
+ ovs_lock();
+
vport = new_vport(&parms);
if (IS_ERR(vport)) {
err = PTR_ERR(vport);
goto err_destroy_ports_array;
}
- reply = ovs_dp_cmd_build_info(dp, info, OVS_DP_CMD_NEW);
- err = PTR_ERR(reply);
- if (IS_ERR(reply))
- goto err_destroy_local_port;
+ err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
+ info->snd_seq, 0, OVS_DP_CMD_NEW);
+ BUG_ON(err < 0);
ovs_net = net_generic(ovs_dp_get_net(dp), ovs_net_id);
list_add_tail_rcu(&dp->list_node, &ovs_net->dps);
ovs_notify(reply, info, &ovs_dp_datapath_multicast_group);
return 0;
-err_destroy_local_port:
- ovs_dp_detach_port(ovs_vport_ovsl(dp, OVSP_LOCAL));
err_destroy_ports_array:
+ ovs_unlock();
kfree(dp->ports);
err_destroy_percpu:
free_percpu(dp->stats_percpu);
err_free_dp:
release_net(ovs_dp_get_net(dp));
kfree(dp);
-err_unlock_ovs:
- ovs_unlock();
+err_free_reply:
+ kfree_skb(reply);
err:
return err;
}
struct datapath *dp;
int err;
+ reply = ovs_dp_cmd_alloc_info(info);
+ if (!reply)
+ return -ENOMEM;
+
ovs_lock();
dp = lookup_datapath(sock_net(skb->sk), info->userhdr, info->attrs);
err = PTR_ERR(dp);
if (IS_ERR(dp))
- goto unlock;
+ goto err_unlock_free;
- reply = ovs_dp_cmd_build_info(dp, info, OVS_DP_CMD_DEL);
- err = PTR_ERR(reply);
- if (IS_ERR(reply))
- goto unlock;
+ err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
+ info->snd_seq, 0, OVS_DP_CMD_DEL);
+ BUG_ON(err < 0);
__dp_destroy(dp);
- ovs_unlock();
+ ovs_unlock();
ovs_notify(reply, info, &ovs_dp_datapath_multicast_group);
-
return 0;
-unlock:
+
+err_unlock_free:
ovs_unlock();
+ kfree_skb(reply);
return err;
}
struct datapath *dp;
int err;
+ reply = ovs_dp_cmd_alloc_info(info);
+ if (!reply)
+ return -ENOMEM;
+
ovs_lock();
dp = lookup_datapath(sock_net(skb->sk), info->userhdr, info->attrs);
err = PTR_ERR(dp);
if (IS_ERR(dp))
- goto unlock;
+ goto err_unlock_free;
ovs_dp_change(dp, info->attrs);
- reply = ovs_dp_cmd_build_info(dp, info, OVS_DP_CMD_NEW);
- if (IS_ERR(reply)) {
- err = PTR_ERR(reply);
- netlink_set_err(sock_net(skb->sk)->genl_sock, 0,
- ovs_dp_datapath_multicast_group.id, err);
- err = 0;
- goto unlock;
- }
+ err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
+ info->snd_seq, 0, OVS_DP_CMD_NEW);
+ BUG_ON(err < 0);
ovs_unlock();
ovs_notify(reply, info, &ovs_dp_datapath_multicast_group);
-
return 0;
-unlock:
+
+err_unlock_free:
ovs_unlock();
+ kfree_skb(reply);
return err;
}
struct datapath *dp;
int err;
- ovs_lock();
+ reply = ovs_dp_cmd_alloc_info(info);
+ if (!reply)
+ return -ENOMEM;
+
+ rcu_read_lock();
dp = lookup_datapath(sock_net(skb->sk), info->userhdr, info->attrs);
if (IS_ERR(dp)) {
err = PTR_ERR(dp);
- goto unlock;
- }
-
- reply = ovs_dp_cmd_build_info(dp, info, OVS_DP_CMD_NEW);
- if (IS_ERR(reply)) {
- err = PTR_ERR(reply);
- goto unlock;
+ goto err_unlock_free;
}
+ err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
+ info->snd_seq, 0, OVS_DP_CMD_NEW);
+ BUG_ON(err < 0);
+ rcu_read_unlock();
- ovs_unlock();
return genlmsg_reply(reply, info);
-unlock:
- ovs_unlock();
+err_unlock_free:
+ rcu_read_unlock();
+ kfree_skb(reply);
return err;
}
return err;
}
-/* Called with ovs_mutex or RCU read lock. */
+static struct sk_buff *ovs_vport_cmd_alloc_info(void)
+{
+ return nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+}
+
+/* Called with ovs_mutex, only via ovs_dp_notify_wq(). */
struct sk_buff *ovs_vport_cmd_build_info(struct vport *vport, u32 portid,
u32 seq, u8 cmd)
{
u32 port_no;
int err;
- err = -EINVAL;
if (!a[OVS_VPORT_ATTR_NAME] || !a[OVS_VPORT_ATTR_TYPE] ||
!a[OVS_VPORT_ATTR_UPCALL_PID])
- goto exit;
+ return -EINVAL;
+
+ port_no = a[OVS_VPORT_ATTR_PORT_NO]
+ ? nla_get_u32(a[OVS_VPORT_ATTR_PORT_NO]) : 0;
+ if (port_no >= DP_MAX_PORTS)
+ return -EFBIG;
+
+ reply = ovs_vport_cmd_alloc_info();
+ if (!reply)
+ return -ENOMEM;
ovs_lock();
dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex);
err = -ENODEV;
if (!dp)
- goto exit_unlock;
-
- if (a[OVS_VPORT_ATTR_PORT_NO]) {
- port_no = nla_get_u32(a[OVS_VPORT_ATTR_PORT_NO]);
-
- err = -EFBIG;
- if (port_no >= DP_MAX_PORTS)
- goto exit_unlock;
+ goto exit_unlock_free;
+ if (port_no) {
vport = ovs_vport_ovsl(dp, port_no);
err = -EBUSY;
if (vport)
- goto exit_unlock;
+ goto exit_unlock_free;
} else {
for (port_no = 1; ; port_no++) {
if (port_no >= DP_MAX_PORTS) {
err = -EFBIG;
- goto exit_unlock;
+ goto exit_unlock_free;
}
vport = ovs_vport_ovsl(dp, port_no);
if (!vport)
vport = new_vport(&parms);
err = PTR_ERR(vport);
if (IS_ERR(vport))
- goto exit_unlock;
+ goto exit_unlock_free;
err = 0;
if (a[OVS_VPORT_ATTR_STATS])
ovs_vport_set_stats(vport, nla_data(a[OVS_VPORT_ATTR_STATS]));
- reply = ovs_vport_cmd_build_info(vport, info->snd_portid, info->snd_seq,
- OVS_VPORT_CMD_NEW);
- if (IS_ERR(reply)) {
- err = PTR_ERR(reply);
- ovs_dp_detach_port(vport);
- goto exit_unlock;
- }
+ err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
+ info->snd_seq, 0, OVS_VPORT_CMD_NEW);
+ BUG_ON(err < 0);
+ ovs_unlock();
ovs_notify(reply, info, &ovs_dp_vport_multicast_group);
+ return 0;
-exit_unlock:
+exit_unlock_free:
ovs_unlock();
-exit:
+ kfree_skb(reply);
return err;
}
struct vport *vport;
int err;
+ reply = ovs_vport_cmd_alloc_info();
+ if (!reply)
+ return -ENOMEM;
+
ovs_lock();
vport = lookup_vport(sock_net(skb->sk), info->userhdr, a);
err = PTR_ERR(vport);
if (IS_ERR(vport))
- goto exit_unlock;
+ goto exit_unlock_free;
if (a[OVS_VPORT_ATTR_TYPE] &&
nla_get_u32(a[OVS_VPORT_ATTR_TYPE]) != vport->ops->type) {
err = -EINVAL;
- goto exit_unlock;
- }
-
- reply = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
- if (!reply) {
- err = -ENOMEM;
- goto exit_unlock;
+ goto exit_unlock_free;
}
if (a[OVS_VPORT_ATTR_OPTIONS]) {
err = ovs_vport_set_options(vport, a[OVS_VPORT_ATTR_OPTIONS]);
if (err)
- goto exit_free;
+ goto exit_unlock_free;
}
if (a[OVS_VPORT_ATTR_STATS])
err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
info->snd_seq, 0, OVS_VPORT_CMD_NEW);
BUG_ON(err < 0);
-
ovs_unlock();
+
ovs_notify(reply, info, &ovs_dp_vport_multicast_group);
return 0;
-exit_free:
- kfree_skb(reply);
-exit_unlock:
+exit_unlock_free:
ovs_unlock();
+ kfree_skb(reply);
return err;
}
struct vport *vport;
int err;
+ reply = ovs_vport_cmd_alloc_info();
+ if (!reply)
+ return -ENOMEM;
+
ovs_lock();
vport = lookup_vport(sock_net(skb->sk), info->userhdr, a);
err = PTR_ERR(vport);
if (IS_ERR(vport))
- goto exit_unlock;
+ goto exit_unlock_free;
if (vport->port_no == OVSP_LOCAL) {
err = -EINVAL;
- goto exit_unlock;
+ goto exit_unlock_free;
}
- reply = ovs_vport_cmd_build_info(vport, info->snd_portid,
- info->snd_seq, OVS_VPORT_CMD_DEL);
- err = PTR_ERR(reply);
- if (IS_ERR(reply))
- goto exit_unlock;
-
- err = 0;
+ err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
+ info->snd_seq, 0, OVS_VPORT_CMD_DEL);
+ BUG_ON(err < 0);
ovs_dp_detach_port(vport);
+ ovs_unlock();
ovs_notify(reply, info, &ovs_dp_vport_multicast_group);
+ return 0;
-exit_unlock:
+exit_unlock_free:
ovs_unlock();
+ kfree_skb(reply);
return err;
}
struct vport *vport;
int err;
+ reply = ovs_vport_cmd_alloc_info();
+ if (!reply)
+ return -ENOMEM;
+
rcu_read_lock();
vport = lookup_vport(sock_net(skb->sk), ovs_header, a);
err = PTR_ERR(vport);
if (IS_ERR(vport))
- goto exit_unlock;
-
- reply = ovs_vport_cmd_build_info(vport, info->snd_portid,
- info->snd_seq, OVS_VPORT_CMD_NEW);
- err = PTR_ERR(reply);
- if (IS_ERR(reply))
- goto exit_unlock;
-
+ goto exit_unlock_free;
+ err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
+ info->snd_seq, 0, OVS_VPORT_CMD_NEW);
+ BUG_ON(err < 0);
rcu_read_unlock();
return genlmsg_reply(reply, info);
-exit_unlock:
+exit_unlock_free:
rcu_read_unlock();
+ kfree_skb(reply);
return err;
}
void ovs_flow_stats_update(struct sw_flow *flow, struct sk_buff *skb)
{
struct flow_stats *stats;
- __be16 tcp_flags = 0;
+ __be16 tcp_flags = flow->key.tp.flags;
int node = numa_node_id();
stats = rcu_dereference(flow->stats[node]);
- if ((flow->key.eth.type == htons(ETH_P_IP) ||
- flow->key.eth.type == htons(ETH_P_IPV6)) &&
- flow->key.ip.frag != OVS_FRAG_TYPE_LATER &&
- flow->key.ip.proto == IPPROTO_TCP &&
- likely(skb->len >= skb_transport_offset(skb) + sizeof(struct tcphdr))) {
- tcp_flags = TCP_FLAGS_BE16(tcp_hdr(skb));
- }
-
/* Check if already have node-specific stats. */
if (likely(stats)) {
spin_lock(&stats->lock);
spin_unlock(&stats->lock);
}
+/* Called with ovs_mutex. */
void ovs_flow_stats_get(struct sw_flow *flow, struct ovs_flow_stats *ovs_stats,
unsigned long *used, __be16 *tcp_flags)
{
memset(ovs_stats, 0, sizeof(*ovs_stats));
for_each_node(node) {
- struct flow_stats *stats = rcu_dereference(flow->stats[node]);
+ struct flow_stats *stats = ovsl_dereference(flow->stats[node]);
if (stats) {
/* Local CPU may write on non-local stats, so we must
/* The ICMPv6 type and code fields use the 16-bit transport port
* fields, so we need to store them in 16-bit network byte order.
*/
- key->ipv6.tp.src = htons(icmp->icmp6_type);
- key->ipv6.tp.dst = htons(icmp->icmp6_code);
+ key->tp.src = htons(icmp->icmp6_type);
+ key->tp.dst = htons(icmp->icmp6_code);
if (icmp->icmp6_code == 0 &&
(icmp->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION ||
if (key->ip.proto == IPPROTO_TCP) {
if (tcphdr_ok(skb)) {
struct tcphdr *tcp = tcp_hdr(skb);
- key->ipv4.tp.src = tcp->source;
- key->ipv4.tp.dst = tcp->dest;
- key->ipv4.tp.flags = TCP_FLAGS_BE16(tcp);
+ key->tp.src = tcp->source;
+ key->tp.dst = tcp->dest;
+ key->tp.flags = TCP_FLAGS_BE16(tcp);
}
} else if (key->ip.proto == IPPROTO_UDP) {
if (udphdr_ok(skb)) {
struct udphdr *udp = udp_hdr(skb);
- key->ipv4.tp.src = udp->source;
- key->ipv4.tp.dst = udp->dest;
+ key->tp.src = udp->source;
+ key->tp.dst = udp->dest;
}
} else if (key->ip.proto == IPPROTO_SCTP) {
if (sctphdr_ok(skb)) {
struct sctphdr *sctp = sctp_hdr(skb);
- key->ipv4.tp.src = sctp->source;
- key->ipv4.tp.dst = sctp->dest;
+ key->tp.src = sctp->source;
+ key->tp.dst = sctp->dest;
}
} else if (key->ip.proto == IPPROTO_ICMP) {
if (icmphdr_ok(skb)) {
/* The ICMP type and code fields use the 16-bit
* transport port fields, so we need to store
* them in 16-bit network byte order. */
- key->ipv4.tp.src = htons(icmp->type);
- key->ipv4.tp.dst = htons(icmp->code);
+ key->tp.src = htons(icmp->type);
+ key->tp.dst = htons(icmp->code);
}
}
if (key->ip.proto == NEXTHDR_TCP) {
if (tcphdr_ok(skb)) {
struct tcphdr *tcp = tcp_hdr(skb);
- key->ipv6.tp.src = tcp->source;
- key->ipv6.tp.dst = tcp->dest;
- key->ipv6.tp.flags = TCP_FLAGS_BE16(tcp);
+ key->tp.src = tcp->source;
+ key->tp.dst = tcp->dest;
+ key->tp.flags = TCP_FLAGS_BE16(tcp);
}
} else if (key->ip.proto == NEXTHDR_UDP) {
if (udphdr_ok(skb)) {
struct udphdr *udp = udp_hdr(skb);
- key->ipv6.tp.src = udp->source;
- key->ipv6.tp.dst = udp->dest;
+ key->tp.src = udp->source;
+ key->tp.dst = udp->dest;
}
} else if (key->ip.proto == NEXTHDR_SCTP) {
if (sctphdr_ok(skb)) {
struct sctphdr *sctp = sctp_hdr(skb);
- key->ipv6.tp.src = sctp->source;
- key->ipv6.tp.dst = sctp->dest;
+ key->tp.src = sctp->source;
+ key->tp.dst = sctp->dest;
}
} else if (key->ip.proto == NEXTHDR_ICMP) {
if (icmp6hdr_ok(skb)) {
__be16 tun_flags;
u8 ipv4_tos;
u8 ipv4_ttl;
-};
+} __packed __aligned(4); /* Minimize padding. */
static inline void ovs_flow_tun_key_init(struct ovs_key_ipv4_tunnel *tun_key,
const struct iphdr *iph, __be64 tun_id,
u32 priority; /* Packet QoS priority. */
u32 skb_mark; /* SKB mark. */
u16 in_port; /* Input switch port (or DP_MAX_PORTS). */
- } phy;
+ } __packed phy; /* Safe when right after 'tun_key'. */
struct {
u8 src[ETH_ALEN]; /* Ethernet source address. */
u8 dst[ETH_ALEN]; /* Ethernet destination address. */
u8 ttl; /* IP TTL/hop limit. */
u8 frag; /* One of OVS_FRAG_TYPE_*. */
} ip;
+ struct {
+ __be16 src; /* TCP/UDP/SCTP source port. */
+ __be16 dst; /* TCP/UDP/SCTP destination port. */
+ __be16 flags; /* TCP flags. */
+ } tp;
union {
struct {
struct {
__be32 src; /* IP source address. */
__be32 dst; /* IP destination address. */
} addr;
- union {
- struct {
- __be16 src; /* TCP/UDP/SCTP source port. */
- __be16 dst; /* TCP/UDP/SCTP destination port. */
- __be16 flags; /* TCP flags. */
- } tp;
- struct {
- u8 sha[ETH_ALEN]; /* ARP source hardware address. */
- u8 tha[ETH_ALEN]; /* ARP target hardware address. */
- } arp;
- };
+ struct {
+ u8 sha[ETH_ALEN]; /* ARP source hardware address. */
+ u8 tha[ETH_ALEN]; /* ARP target hardware address. */
+ } arp;
} ipv4;
struct {
struct {
struct in6_addr dst; /* IPv6 destination address. */
} addr;
__be32 label; /* IPv6 flow label. */
- struct {
- __be16 src; /* TCP/UDP/SCTP source port. */
- __be16 dst; /* TCP/UDP/SCTP destination port. */
- __be16 flags; /* TCP flags. */
- } tp;
struct {
struct in6_addr target; /* ND target address. */
u8 sll[ETH_ALEN]; /* ND source link layer address. */
#include <linux/icmpv6.h>
#include <linux/rculist.h>
#include <net/ip.h>
+#include <net/ip_tunnels.h>
#include <net/ipv6.h>
#include <net/ndisc.h>
if (match->mask && (match->mask->key.ip.proto == 0xff))
mask_allowed |= 1ULL << OVS_KEY_ATTR_ICMPV6;
- if (match->key->ipv6.tp.src ==
+ if (match->key->tp.src ==
htons(NDISC_NEIGHBOUR_SOLICITATION) ||
- match->key->ipv6.tp.src == htons(NDISC_NEIGHBOUR_ADVERTISEMENT)) {
+ match->key->tp.src == htons(NDISC_NEIGHBOUR_ADVERTISEMENT)) {
key_expected |= 1ULL << OVS_KEY_ATTR_ND;
- if (match->mask && (match->mask->key.ipv6.tp.src == htons(0xffff)))
+ if (match->mask && (match->mask->key.tp.src == htons(0xffff)))
mask_allowed |= 1ULL << OVS_KEY_ATTR_ND;
}
}
const struct ovs_key_tcp *tcp_key;
tcp_key = nla_data(a[OVS_KEY_ATTR_TCP]);
- if (orig_attrs & (1ULL << OVS_KEY_ATTR_IPV4)) {
- SW_FLOW_KEY_PUT(match, ipv4.tp.src,
- tcp_key->tcp_src, is_mask);
- SW_FLOW_KEY_PUT(match, ipv4.tp.dst,
- tcp_key->tcp_dst, is_mask);
- } else {
- SW_FLOW_KEY_PUT(match, ipv6.tp.src,
- tcp_key->tcp_src, is_mask);
- SW_FLOW_KEY_PUT(match, ipv6.tp.dst,
- tcp_key->tcp_dst, is_mask);
- }
+ SW_FLOW_KEY_PUT(match, tp.src, tcp_key->tcp_src, is_mask);
+ SW_FLOW_KEY_PUT(match, tp.dst, tcp_key->tcp_dst, is_mask);
attrs &= ~(1ULL << OVS_KEY_ATTR_TCP);
}
if (attrs & (1ULL << OVS_KEY_ATTR_TCP_FLAGS)) {
if (orig_attrs & (1ULL << OVS_KEY_ATTR_IPV4)) {
- SW_FLOW_KEY_PUT(match, ipv4.tp.flags,
+ SW_FLOW_KEY_PUT(match, tp.flags,
nla_get_be16(a[OVS_KEY_ATTR_TCP_FLAGS]),
is_mask);
} else {
- SW_FLOW_KEY_PUT(match, ipv6.tp.flags,
+ SW_FLOW_KEY_PUT(match, tp.flags,
nla_get_be16(a[OVS_KEY_ATTR_TCP_FLAGS]),
is_mask);
}
const struct ovs_key_udp *udp_key;
udp_key = nla_data(a[OVS_KEY_ATTR_UDP]);
- if (orig_attrs & (1ULL << OVS_KEY_ATTR_IPV4)) {
- SW_FLOW_KEY_PUT(match, ipv4.tp.src,
- udp_key->udp_src, is_mask);
- SW_FLOW_KEY_PUT(match, ipv4.tp.dst,
- udp_key->udp_dst, is_mask);
- } else {
- SW_FLOW_KEY_PUT(match, ipv6.tp.src,
- udp_key->udp_src, is_mask);
- SW_FLOW_KEY_PUT(match, ipv6.tp.dst,
- udp_key->udp_dst, is_mask);
- }
+ SW_FLOW_KEY_PUT(match, tp.src, udp_key->udp_src, is_mask);
+ SW_FLOW_KEY_PUT(match, tp.dst, udp_key->udp_dst, is_mask);
attrs &= ~(1ULL << OVS_KEY_ATTR_UDP);
}
const struct ovs_key_sctp *sctp_key;
sctp_key = nla_data(a[OVS_KEY_ATTR_SCTP]);
- if (orig_attrs & (1ULL << OVS_KEY_ATTR_IPV4)) {
- SW_FLOW_KEY_PUT(match, ipv4.tp.src,
- sctp_key->sctp_src, is_mask);
- SW_FLOW_KEY_PUT(match, ipv4.tp.dst,
- sctp_key->sctp_dst, is_mask);
- } else {
- SW_FLOW_KEY_PUT(match, ipv6.tp.src,
- sctp_key->sctp_src, is_mask);
- SW_FLOW_KEY_PUT(match, ipv6.tp.dst,
- sctp_key->sctp_dst, is_mask);
- }
+ SW_FLOW_KEY_PUT(match, tp.src, sctp_key->sctp_src, is_mask);
+ SW_FLOW_KEY_PUT(match, tp.dst, sctp_key->sctp_dst, is_mask);
attrs &= ~(1ULL << OVS_KEY_ATTR_SCTP);
}
const struct ovs_key_icmp *icmp_key;
icmp_key = nla_data(a[OVS_KEY_ATTR_ICMP]);
- SW_FLOW_KEY_PUT(match, ipv4.tp.src,
+ SW_FLOW_KEY_PUT(match, tp.src,
htons(icmp_key->icmp_type), is_mask);
- SW_FLOW_KEY_PUT(match, ipv4.tp.dst,
+ SW_FLOW_KEY_PUT(match, tp.dst,
htons(icmp_key->icmp_code), is_mask);
attrs &= ~(1ULL << OVS_KEY_ATTR_ICMP);
}
const struct ovs_key_icmpv6 *icmpv6_key;
icmpv6_key = nla_data(a[OVS_KEY_ATTR_ICMPV6]);
- SW_FLOW_KEY_PUT(match, ipv6.tp.src,
+ SW_FLOW_KEY_PUT(match, tp.src,
htons(icmpv6_key->icmpv6_type), is_mask);
- SW_FLOW_KEY_PUT(match, ipv6.tp.dst,
+ SW_FLOW_KEY_PUT(match, tp.dst,
htons(icmpv6_key->icmpv6_code), is_mask);
attrs &= ~(1ULL << OVS_KEY_ATTR_ICMPV6);
}
if (!nla)
goto nla_put_failure;
tcp_key = nla_data(nla);
- if (swkey->eth.type == htons(ETH_P_IP)) {
- tcp_key->tcp_src = output->ipv4.tp.src;
- tcp_key->tcp_dst = output->ipv4.tp.dst;
- if (nla_put_be16(skb, OVS_KEY_ATTR_TCP_FLAGS,
- output->ipv4.tp.flags))
- goto nla_put_failure;
- } else if (swkey->eth.type == htons(ETH_P_IPV6)) {
- tcp_key->tcp_src = output->ipv6.tp.src;
- tcp_key->tcp_dst = output->ipv6.tp.dst;
- if (nla_put_be16(skb, OVS_KEY_ATTR_TCP_FLAGS,
- output->ipv6.tp.flags))
- goto nla_put_failure;
- }
+ tcp_key->tcp_src = output->tp.src;
+ tcp_key->tcp_dst = output->tp.dst;
+ if (nla_put_be16(skb, OVS_KEY_ATTR_TCP_FLAGS,
+ output->tp.flags))
+ goto nla_put_failure;
} else if (swkey->ip.proto == IPPROTO_UDP) {
struct ovs_key_udp *udp_key;
if (!nla)
goto nla_put_failure;
udp_key = nla_data(nla);
- if (swkey->eth.type == htons(ETH_P_IP)) {
- udp_key->udp_src = output->ipv4.tp.src;
- udp_key->udp_dst = output->ipv4.tp.dst;
- } else if (swkey->eth.type == htons(ETH_P_IPV6)) {
- udp_key->udp_src = output->ipv6.tp.src;
- udp_key->udp_dst = output->ipv6.tp.dst;
- }
+ udp_key->udp_src = output->tp.src;
+ udp_key->udp_dst = output->tp.dst;
} else if (swkey->ip.proto == IPPROTO_SCTP) {
struct ovs_key_sctp *sctp_key;
if (!nla)
goto nla_put_failure;
sctp_key = nla_data(nla);
- if (swkey->eth.type == htons(ETH_P_IP)) {
- sctp_key->sctp_src = swkey->ipv4.tp.src;
- sctp_key->sctp_dst = swkey->ipv4.tp.dst;
- } else if (swkey->eth.type == htons(ETH_P_IPV6)) {
- sctp_key->sctp_src = swkey->ipv6.tp.src;
- sctp_key->sctp_dst = swkey->ipv6.tp.dst;
- }
+ sctp_key->sctp_src = output->tp.src;
+ sctp_key->sctp_dst = output->tp.dst;
} else if (swkey->eth.type == htons(ETH_P_IP) &&
swkey->ip.proto == IPPROTO_ICMP) {
struct ovs_key_icmp *icmp_key;
if (!nla)
goto nla_put_failure;
icmp_key = nla_data(nla);
- icmp_key->icmp_type = ntohs(output->ipv4.tp.src);
- icmp_key->icmp_code = ntohs(output->ipv4.tp.dst);
+ icmp_key->icmp_type = ntohs(output->tp.src);
+ icmp_key->icmp_code = ntohs(output->tp.dst);
} else if (swkey->eth.type == htons(ETH_P_IPV6) &&
swkey->ip.proto == IPPROTO_ICMPV6) {
struct ovs_key_icmpv6 *icmpv6_key;
if (!nla)
goto nla_put_failure;
icmpv6_key = nla_data(nla);
- icmpv6_key->icmpv6_type = ntohs(output->ipv6.tp.src);
- icmpv6_key->icmpv6_code = ntohs(output->ipv6.tp.dst);
+ icmpv6_key->icmpv6_type = ntohs(output->tp.src);
+ icmpv6_key->icmpv6_code = ntohs(output->tp.dst);
if (icmpv6_key->icmpv6_type == NDISC_NEIGHBOUR_SOLICITATION ||
icmpv6_key->icmpv6_type == NDISC_NEIGHBOUR_ADVERTISEMENT) {
static int validate_tp_port(const struct sw_flow_key *flow_key)
{
- if (flow_key->eth.type == htons(ETH_P_IP)) {
- if (flow_key->ipv4.tp.src || flow_key->ipv4.tp.dst)
- return 0;
- } else if (flow_key->eth.type == htons(ETH_P_IPV6)) {
- if (flow_key->ipv6.tp.src || flow_key->ipv6.tp.dst)
- return 0;
- }
+ if ((flow_key->eth.type == htons(ETH_P_IP) ||
+ flow_key->eth.type == htons(ETH_P_IPV6)) &&
+ (flow_key->tp.src || flow_key->tp.dst))
+ return 0;
return -EINVAL;
}
if (!flow)
return;
- if (flow->mask) {
- struct sw_flow_mask *mask = flow->mask;
-
- /* ovs-lock is required to protect mask-refcount and
- * mask list.
- */
- ASSERT_OVSL();
- BUG_ON(!mask->ref_count);
- mask->ref_count--;
-
- if (!mask->ref_count) {
- list_del_rcu(&mask->list);
- if (deferred)
- call_rcu(&mask->rcu, rcu_free_sw_flow_mask_cb);
- else
- kfree(mask);
- }
- }
-
if (deferred)
call_rcu(&flow->rcu, rcu_free_flow_callback);
else
return table_instance_rehash(ti, ti->n_buckets * 2);
}
+/* Remove 'mask' from the mask list, if it is not needed any more. */
+static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
+{
+ if (mask) {
+ /* ovs-lock is required to protect mask-refcount and
+ * mask list.
+ */
+ ASSERT_OVSL();
+ BUG_ON(!mask->ref_count);
+ mask->ref_count--;
+
+ if (!mask->ref_count) {
+ list_del_rcu(&mask->list);
+ call_rcu(&mask->rcu, rcu_free_sw_flow_mask_cb);
+ }
+ }
+}
+
+/* Must be called with OVS mutex held. */
void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
{
struct table_instance *ti = ovsl_dereference(table->ti);
BUG_ON(table->count == 0);
hlist_del_rcu(&flow->hash_node[ti->node_ver]);
table->count--;
+
+ /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
+ * accessible as long as the RCU read lock is held. */
+ flow_mask_remove(table, flow->mask);
}
static struct sw_flow_mask *mask_alloc(void)
return 0;
}
+/* Must be called with OVS mutex held. */
int ovs_flow_tbl_insert(struct flow_table *table, struct sw_flow *flow,
struct sw_flow_mask *mask)
{
linux/compat/include/linux/poison.h \
linux/compat/include/linux/rculist.h \
linux/compat/include/linux/rcupdate.h \
+ linux/compat/include/linux/reciprocal_div.h \
linux/compat/include/linux/rtnetlink.h \
linux/compat/include/linux/sctp.h \
linux/compat/include/linux/skbuff.h \
gfp_t flags)
{
struct flex_array *ret;
+ struct reciprocal_value reciprocal_elems = {0};
int elems_per_part = 0;
- int reciprocal_elems = 0;
int max_size = 0;
if (element_size) {
* 02110-1301, USA
*/
+#include <linux/version.h>
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0)
+
#include <linux/kconfig.h>
#if IS_ENABLED(CONFIG_NET_IPGRE_DEMUX)
}
#endif /* CONFIG_NET_IPGRE_DEMUX */
+
+#endif /* 3.12 */
* 02110-1301, USA
*/
+#include <linux/version.h>
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0)
+
#include <linux/module.h>
#include <linux/if.h>
#include <linux/if_tunnel.h>
}
return ret;
}
+#endif /* 3.12 */
#ifndef __LINUX_GSO_WRAPPER_H
#define __LINUX_GSO_WRAPPER_H
+#include <linux/version.h>
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0)
+
#include <linux/skbuff.h>
#include <net/protocol.h>
#define ip_local_out rpl_ip_local_out
int ip_local_out(struct sk_buff *skb);
+
+#endif /* 3.12 */
#endif
#include_next <linux/flex_array.h>
#else
+#include <linux/reciprocal_div.h>
#include <linux/types.h>
#include <asm/page.h>
int element_size;
int total_nr_elements;
int elems_per_part;
- u32 reciprocal_elems;
+ struct reciprocal_value reciprocal_elems;
struct flex_array_part *parts[];
};
/*
--- /dev/null
+#ifndef _LINUX_RECIPROCAL_DIV_WRAPPER_H
+#define _LINUX_RECIPROCAL_DIV_WRAPPER_H 1
+
+#include <linux/types.h>
+
+/*
+ * This algorithm is based on the paper "Division by Invariant
+ * Integers Using Multiplication" by Torbjörn Granlund and Peter
+ * L. Montgomery.
+ *
+ * The assembler implementation from Agner Fog, which this code is
+ * based on, can be found here:
+ * http://www.agner.org/optimize/asmlib.zip
+ *
+ * This optimization for A/B is helpful if the divisor B is mostly
+ * runtime invariant. The reciprocal of B is calculated in the
+ * slow-path with reciprocal_value(). The fast-path can then just use
+ * a much faster multiplication operation with a variable dividend A
+ * to calculate the division A/B.
+ */
+
+#define reciprocal_value rpl_reciprocal_value
+struct reciprocal_value {
+ u32 m;
+ u8 sh1, sh2;
+};
+
+struct reciprocal_value reciprocal_value(u32 d);
+
+#define reciprocal_divide rpl_reciprocal_divide
+static inline u32 reciprocal_divide(u32 a, struct reciprocal_value R)
+{
+ u32 t = (u32)(((u64)a * R.m) >> 32);
+ return (t + ((a - t) >> R.sh1)) >> R.sh2;
+}
+
+#endif /* _LINUX_RECIPROCAL_DIV_WRAPPER_H */
#include <linux/skbuff.h>
#include <net/ip_tunnels.h>
+#include <linux/version.h>
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,37) || \
defined(HAVE_GRE_CISCO_REGISTER)
#include_next <net/gre.h>
#endif /* LINUX_VERSION_CODE < KERNEL_VERSION(3,10,0) */
#endif /* HAVE_GRE_CISCO_REGISTER */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0)
+
#define gre_build_header rpl_gre_build_header
void gre_build_header(struct sk_buff *skb, const struct tnl_ptk_info *tpi,
int hdr_len);
addend += 4;
return addend;
}
+#endif
#endif
#ifndef __NET_IP_TUNNELS_WRAPPER_H
#define __NET_IP_TUNNELS_WRAPPER_H 1
+#include <linux/version.h>
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,12,0)
+#include_next <net/ip_tunnels.h>
+#else
+
#include <linux/if_tunnel.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
int iptunnel_xmit(struct rtable *rt,
struct sk_buff *skb,
__be32 src, __be32 dst, __u8 proto,
- __u8 tos, __u8 ttl, __be16 df);
+ __u8 tos, __u8 ttl, __be16 df, bool xnet);
int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto);
+
+#endif
#endif /* __NET_IP_TUNNELS_H */
#include <linux/netdevice.h>
#include <linux/udp.h>
+#include <linux/version.h>
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,12,0)
+#include_next <net/vxlan.h>
+#else
+
struct vxlan_sock;
typedef void (vxlan_rcv_t)(struct vxlan_sock *vs, struct sk_buff *skb, __be32 key);
struct vxlan_sock *vxlan_sock_add(struct net *net, __be16 port,
vxlan_rcv_t *rcv, void *data,
- bool no_share);
+ bool no_share, bool ipv6);
void vxlan_sock_release(struct vxlan_sock *vs);
__be16 vxlan_src_port(__u16 port_min, __u16 port_max, struct sk_buff *skb);
+#endif /* 3.12 */
#endif
* 02110-1301, USA
*/
+#include <linux/version.h>
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0)
+
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/in.h>
int iptunnel_xmit(struct rtable *rt,
struct sk_buff *skb,
__be32 src, __be32 dst, __u8 proto,
- __u8 tos, __u8 ttl, __be16 df)
+ __u8 tos, __u8 ttl, __be16 df, bool xnet)
{
int pkt_len = skb->len;
struct iphdr *iph;
skb->pkt_type = PACKET_HOST;
return 0;
}
+
+#endif /* 3.12 */
+#include <linux/kernel.h>
#include <asm/div64.h>
#include <linux/reciprocal_div.h>
-#include <linux/version.h>
-#if LINUX_VERSION_CODE < KERNEL_VERSION(3,3,0)
-/* definition is required since reciprocal_value() is not exported */
-u32 reciprocal_value(u32 k)
+/*
+ * For a description of the algorithm please have a look at
+ * include/linux/reciprocal_div.h
+ */
+
+struct reciprocal_value reciprocal_value(u32 d)
{
- u64 val = (1LL << 32) + (k - 1);
- do_div(val, k);
- return (u32)val;
+ struct reciprocal_value R;
+ u64 m;
+ int l;
+
+ l = fls(d - 1);
+ m = ((1ULL << 32) * ((1ULL << l) - d));
+ do_div(m, d);
+ ++m;
+ R.m = (u32)m;
+ R.sh1 = min(l, 1);
+ R.sh2 = max(l - 1, 0);
+
+ return R;
}
-#endif
* This code is derived from kernel vxlan module.
*/
+#include <linux/version.h>
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0)
+
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/kernel.h>
if (err)
return err;
- return iptunnel_xmit(rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df);
+ return iptunnel_xmit(rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df, false);
}
static void rcu_free_vs(struct rcu_head *rcu)
struct vxlan_sock *vxlan_sock_add(struct net *net, __be16 port,
vxlan_rcv_t *rcv, void *data,
- bool no_share)
+ bool no_share, bool ipv6)
{
return vxlan_socket_create(net, port, rcv, data);
}
queue_work(system_wq, &vs->del_work);
}
+
+#endif /* 3.12 */
return iptunnel_xmit(rt, skb, saddr,
OVS_CB(skb)->tun_key->ipv4_dst, IPPROTO_GRE,
OVS_CB(skb)->tun_key->ipv4_tos,
- OVS_CB(skb)->tun_key->ipv4_ttl, df);
+ OVS_CB(skb)->tun_key->ipv4_ttl, df, false);
err_free_rt:
ip_rt_put(rt);
error:
ovs_net = net_generic(net, ovs_net_id);
- rcu_assign_pointer(ovs_net->vport_net.gre_vport, NULL);
+ RCU_INIT_POINTER(ovs_net->vport_net.gre_vport, NULL);
ovs_vport_deferred_free(vport);
gre_exit();
}
return ERR_PTR(err);
}
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0)
+
static void lisp_fix_segment(struct sk_buff *skb)
{
struct udphdr *udph = udp_hdr(skb);
udph->len = htons(skb->len - skb_transport_offset(skb));
}
-static void handle_offloads(struct sk_buff *skb)
+static int handle_offloads(struct sk_buff *skb)
{
if (skb_is_gso(skb))
OVS_GSO_CB(skb)->fix_segment = lisp_fix_segment;
else if (skb->ip_summed != CHECKSUM_PARTIAL)
skb->ip_summed = CHECKSUM_NONE;
+ return 0;
}
+#else
+static int handle_offloads(struct sk_buff *skb)
+{
+ if (skb_is_gso(skb)) {
+ int err = skb_unclone(skb, GFP_ATOMIC);
+ if (unlikely(err))
+ return err;
+
+ skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL;
+ } else if (skb->ip_summed != CHECKSUM_PARTIAL)
+ skb->ip_summed = CHECKSUM_NONE;
+
+ skb->encapsulation = 1;
+ return 0;
+}
+#endif
static int lisp_send(struct vport *vport, struct sk_buff *skb)
{
lisp_build_header(vport, skb);
/* Offloading */
- handle_offloads(skb);
+ err = handle_offloads(skb);
+ if (err)
+ goto err_free_rt;
+
skb->local_df = 1;
df = OVS_CB(skb)->tun_key->tun_flags &
sent_len = iptunnel_xmit(rt, skb,
saddr, OVS_CB(skb)->tun_key->ipv4_dst,
IPPROTO_UDP, OVS_CB(skb)->tun_key->ipv4_tos,
- OVS_CB(skb)->tun_key->ipv4_ttl, df);
+ OVS_CB(skb)->tun_key->ipv4_ttl, df, false);
return sent_len > 0 ? sent_len + network_offset : sent_len;
vxlan_port = vxlan_vport(vport);
strncpy(vxlan_port->name, parms->name, IFNAMSIZ);
- vs = vxlan_sock_add(net, htons(dst_port), vxlan_rcv, vport, true);
+ vs = vxlan_sock_add(net, htons(dst_port), vxlan_rcv, vport, true, false);
if (IS_ERR(vs)) {
ovs_vport_free(vport);
return (void *)vs;
Package: openvswitch-switch
Architecture: linux-any
Suggests: openvswitch-datapath-module
-Depends: ${shlibs:Depends}, ${misc:Depends}, ${python:Depends}, openvswitch-common (= ${binary:Version}), module-init-tools, procps, uuid-runtime, netbase, python-argparse
+Depends: ${shlibs:Depends}, ${misc:Depends}, ${python:Depends}, openvswitch-common (= ${binary:Version}), kmod, procps, uuid-runtime, netbase, python-argparse
Description: Open vSwitch switch implementations
Open vSwitch is a production quality, multilayer, software-based,
Ethernet virtual switch. It is designed to enable massive network
Copyright (c) 2011 Gaetano Catalli
Copyright (C) 2000-2003 Geoffrey Wossum (gwossum@acm.org)
Copyright (C) 2000 The NetBSD Foundation, Inc.
+ Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project.
+ Copyright (c) 1982, 1986, 1990, 1993 The Regents of the University of California.
License:
lib/getopt_long.c
include/windows/getopt.h
+* The following files are licensed under the 3-clause BSD-license
+
+ include/windows/netinet/icmp6.h
+ include/windows/netinet/ip6.h
+ lib/strsep.c
+
* The following components are dual-licensed under the
GNU General Public License version 2 and the Apache License Version 2.0.
OVS_KEY_ATTR_TUNNEL, /* Nested set of ovs_tunnel attributes */
OVS_KEY_ATTR_SCTP, /* struct ovs_key_sctp */
OVS_KEY_ATTR_TCP_FLAGS, /* be16 TCP flags. */
-
+ OVS_KEY_ATTR_DP_HASH, /* u32 hash value */
+ OVS_KEY_ATTR_RECIRC_ID, /* u32 recirc id */
#ifdef __KERNEL__
+	/* Only used within the kernel data path. */
OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */
#endif
+ /* Experimental */
OVS_KEY_ATTR_MPLS = 62, /* array of struct ovs_key_mpls.
* The implementation may restrict
* @OVS_FLOW_ATTR_ACTIONS: Nested %OVS_ACTION_ATTR_* attributes specifying
* the actions to take for packets that match the key. Always present in
* notifications. Required for %OVS_FLOW_CMD_NEW requests, optional for
- * %OVS_FLOW_CMD_SET requests.
+ * %OVS_FLOW_CMD_SET requests. An %OVS_FLOW_CMD_SET without
+ * %OVS_FLOW_ATTR_ACTIONS will not modify the actions. To clear the actions,
+ * an %OVS_FLOW_ATTR_ACTIONS without any nested attributes must be given.
* @OVS_FLOW_ATTR_STATS: &struct ovs_flow_stats giving statistics for this
* flow. Present in notifications if the stats would be nonzero. Ignored in
* requests.
__be16 vlan_tci; /* 802.1Q TCI (VLAN ID and priority). */
};
+/* Data path hash algorithm for computing the datapath hash.
+ *
+ * The algorithm type only specifies which fields in a flow
+ * are used as part of the hash. Each datapath is free
+ * to use its own hash algorithm. The hash value is
+ * opaque to the user space daemon.
+ */
+enum ovs_recirc_hash_alg {
+ OVS_RECIRC_HASH_ALG_NONE,
+ OVS_RECIRC_HASH_ALG_L4,
+};
+/*
+ * struct ovs_action_recirc - %OVS_ACTION_ATTR_RECIRC action argument.
+ * @hash_alg: Algorithm used to compute the hash prior to recirculation.
+ * @hash_bias: Bias used when computing the hash prior to recirculation.
+ * @recirc_id: The recirculation label; zero is invalid.
+ */
+struct ovs_action_recirc {
+	uint32_t  hash_alg;     /* One of enum ovs_recirc_hash_alg. */
+	uint32_t  hash_bias;
+	uint32_t  recirc_id;    /* Recirculation label. */
+};
+
/**
* enum ovs_action_attr - Action types.
*
* indicate the new packet contents. This could potentially still be
* %ETH_P_MPLS if the resulting MPLS label stack is not empty. If there
* is no MPLS label stack, as determined by ethertype, no action is taken.
+ * @OVS_ACTION_ATTR_RECIRC: Recirculate within the data path.
*
* Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all
* fields within a header are modifiable, e.g. the IPv4 protocol and fragment
OVS_ACTION_ATTR_SAMPLE, /* Nested OVS_SAMPLE_ATTR_*. */
OVS_ACTION_ATTR_PUSH_MPLS, /* struct ovs_action_push_mpls. */
OVS_ACTION_ATTR_POP_MPLS, /* __be16 ethertype. */
+ OVS_ACTION_ATTR_RECIRC, /* struct ovs_action_recirc. */
__OVS_ACTION_ATTR_MAX
};
#define NXM_NX_TCP_FLAGS NXM_HEADER (0x0001, 34, 2)
#define NXM_NX_TCP_FLAGS_W NXM_HEADER_W(0x0001, 34, 2)
+/* Metadata dp_hash.
+ *
+ * Internal use only, not programmable from the controller.
+ *
+ * The dp_hash is used to carry the flow hash computed in the
+ * datapath.
+ *
+ * Prereqs: None.
+ *
+ * Format: 32-bit integer in network byte order.
+ *
+ * Masking: Fully maskable. */
+#define NXM_NX_DP_HASH NXM_HEADER (0x0001, 35, 4)
+#define NXM_NX_DP_HASH_W NXM_HEADER_W(0x0001, 35, 4)
+
+/* Metadata recirc_id.
+ *
+ * Internal use only, not programmable from the controller.
+ *
+ * The recirc_id is used for recirculation. Zero is reserved
+ * for the initially received packet.
+ *
+ * Prereqs: None.
+ *
+ * Format: 32-bit integer in network byte order.
+ *
+ * Masking: not maskable. */
+#define NXM_NX_RECIRC_ID NXM_HEADER (0x0001, 36, 4)
+
/* ## --------------------- ## */
/* ## Requests and replies. ## */
/* ## --------------------- ## */
OFP_ASSERT(sizeof(struct ofp13_instruction_meter) == 8);
enum ofp13_action_type {
- OFPAT13_PUSH_PBB = 26, /* Push a new PBB service tag (I-TAG) */
- OFPAT13_POP_PBB = 27 /* Pop the outer PBB service tag (I-TAG) */
+ OFPAT13_OUTPUT = 0, /* Output to switch port. */
+ OFPAT13_COPY_TTL_OUT = 11, /* Copy TTL "outwards" -- from next-to-outermost
+ to outermost */
+ OFPAT13_COPY_TTL_IN = 12, /* Copy TTL "inwards" -- from outermost to
+ next-to-outermost */
+ OFPAT13_SET_MPLS_TTL = 15, /* MPLS TTL */
+ OFPAT13_DEC_MPLS_TTL = 16, /* Decrement MPLS TTL */
+ OFPAT13_PUSH_VLAN = 17, /* Push a new VLAN tag */
+ OFPAT13_POP_VLAN = 18, /* Pop the outer VLAN tag */
+ OFPAT13_PUSH_MPLS = 19, /* Push a new MPLS Label Stack Entry */
+ OFPAT13_POP_MPLS = 20, /* Pop the outer MPLS Label Stack Entry */
+ OFPAT13_SET_QUEUE = 21, /* Set queue id when outputting to a port */
+ OFPAT13_GROUP = 22, /* Apply group. */
+ OFPAT13_SET_NW_TTL = 23, /* IP TTL. */
+ OFPAT13_DEC_NW_TTL = 24, /* Decrement IP TTL. */
+ OFPAT13_SET_FIELD = 25, /* Set a header field using OXM TLV format. */
+ OFPAT13_PUSH_PBB = 26, /* Push a new PBB service tag (I-TAG) */
+ OFPAT13_POP_PBB = 27 /* Pop the outer PBB service tag (I-TAG) */
};
/* enum ofp_config_flags value OFPC_INVALID_TTL_TO_CONTROLLER
};
OFP_ASSERT(sizeof(struct ofp14_role_prop_experimenter) == 12);
+/* Bundle control message types */
+enum ofp14_bundle_ctrl_type {
+ OFPBCT_OPEN_REQUEST = 0,
+ OFPBCT_OPEN_REPLY = 1,
+ OFPBCT_CLOSE_REQUEST = 2,
+ OFPBCT_CLOSE_REPLY = 3,
+ OFPBCT_COMMIT_REQUEST = 4,
+ OFPBCT_COMMIT_REPLY = 5,
+ OFPBCT_DISCARD_REQUEST = 6,
+ OFPBCT_DISCARD_REPLY = 7,
+};
+
+/* Bundle configuration flags. */
+enum ofp14_bundle_flags {
+ OFPBF_ATOMIC = 1 << 0, /* Execute atomically. */
+ OFPBF_ORDERED = 1 << 1, /* Execute in specified order. */
+};
+
+/* Message structure for ONF_ET_BUNDLE_CONTROL. */
+struct ofp14_bundle_ctrl_msg {
+ ovs_be32 bundle_id; /* Identify the bundle. */
+ ovs_be16 type; /* OFPBCT_*. */
+ ovs_be16 flags; /* Bitmap of OFPBF_* flags. */
+ /* Bundle Property list. */
+ /* struct ofp14_bundle_prop_header properties[0]; */
+};
+OFP_ASSERT(sizeof(struct ofp14_bundle_ctrl_msg) == 8);
+
+/* Message structure for OFP_BUNDLE_ADD_MESSAGE.
+ * Adding a message to a bundle is done with this message. */
+struct ofp14_bundle_add_msg {
+ ovs_be32 bundle_id; /* Identify the bundle. */
+ uint8_t pad[2]; /* Align to 64 bits. */
+    ovs_be16 flags;             /* Bitmap of OFPBF_* flags. */
+
+ struct ofp_header message; /* Message added to the bundle. */
+
+ /* If there is one property or more, 'message' is followed by:
+ * - Exactly (message.length + 7)/8*8 - (message.length) (between 0 and 7)
+ *   all-zero padding bytes. */
+
+ /* Bundle Property list. */
+ /* struct ofp14_bundle_prop_header properties[0]; */
+};
+OFP_ASSERT(sizeof(struct ofp14_bundle_add_msg) == 16);
#endif /* openflow/openflow-1.4.h */
OFP10_VERSION = 0x01,
OFP11_VERSION = 0x02,
OFP12_VERSION = 0x03,
- OFP13_VERSION = 0x04
+ OFP13_VERSION = 0x04,
+ OFP14_VERSION = 0x05
/* When we add real support for these versions, add them to the enum so
* that we get compiler warnings everywhere we might forget to provide
* support. Until then, keep them as macros to avoid those warnings. */
-#define OFP14_VERSION 0x05
#define OFP15_VERSION 0x06
};
# without warranty of any kind.
noinst_HEADERS += \
+ include/windows/arpa/inet.h \
+ include/windows/dirent.h \
include/windows/getopt.h \
+ include/windows/net/if.h \
+ include/windows/netdb.h \
+ include/windows/netinet/icmp6.h \
+ include/windows/netinet/in.h \
+ include/windows/netinet/in_systm.h \
+ include/windows/netinet/ip.h \
+ include/windows/netinet/ip6.h \
+ include/windows/netinet/tcp.h \
+ include/windows/poll.h \
+ include/windows/strings.h \
include/windows/syslog.h \
+ include/windows/sys/ioctl.h \
include/windows/sys/resource.h \
+ include/windows/sys/socket.h \
+ include/windows/sys/time.h \
+ include/windows/sys/uio.h \
+ include/windows/sys/un.h \
+ include/windows/sys/wait.h \
+ include/windows/unistd.h \
include/windows/windefs.h
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef __NET_IF_H
+#define __NET_IF_H 1
+
+#include <Netioapi.h>
+
+#define IFNAMSIZ IF_NAMESIZE
+
+#endif /* net/if.h */
--- /dev/null
+/*
+ * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the project nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+/*
+ * Copyright (c) 1982, 1986, 1993
+ * The Regents of the University of California. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the University nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * @(#)ip_icmp.h 8.1 (Berkeley) 6/10/93
+ */
+
+#ifndef _NETINET_ICMP6_H_
+#define _NETINET_ICMP6_H_
+
+#include "byte-order.h"
+
+#define ICMPV6_PLD_MAXLEN 1232 /* IPV6_MMTU - sizeof(struct ip6_hdr)
+ - sizeof(struct icmp6_hdr) */
+
+struct icmp6_hdr {
+ u_int8_t icmp6_type; /* type field */
+ u_int8_t icmp6_code; /* code field */
+ u_int16_t icmp6_cksum; /* checksum field */
+ union {
+ u_int32_t icmp6_un_data32[1]; /* type-specific field */
+ u_int16_t icmp6_un_data16[2]; /* type-specific field */
+ u_int8_t icmp6_un_data8[4]; /* type-specific field */
+ } icmp6_dataun;
+};
+
+#define icmp6_data32 icmp6_dataun.icmp6_un_data32
+#define icmp6_data16 icmp6_dataun.icmp6_un_data16
+#define icmp6_data8 icmp6_dataun.icmp6_un_data8
+#define icmp6_pptr icmp6_data32[0] /* parameter prob */
+#define icmp6_mtu icmp6_data32[0] /* packet too big */
+#define icmp6_id icmp6_data16[0] /* echo request/reply */
+#define icmp6_seq icmp6_data16[1] /* echo request/reply */
+#define icmp6_maxdelay icmp6_data16[0] /* mcast group membership */
+
+#define ICMP6_DST_UNREACH 1 /* dest unreachable, codes: */
+#define ICMP6_PACKET_TOO_BIG 2 /* packet too big */
+#define ICMP6_TIME_EXCEEDED 3 /* time exceeded, code: */
+#define ICMP6_PARAM_PROB 4 /* ip6 header bad */
+
+#define ICMP6_ECHO_REQUEST 128 /* echo service */
+#define ICMP6_ECHO_REPLY 129 /* echo reply */
+#define MLD_LISTENER_QUERY 130 /* multicast listener query */
+#define MLD_LISTENER_REPORT 131 /* multicast listener report */
+#define MLD_LISTENER_DONE 132 /* multicast listener done */
+#define MLD_LISTENER_REDUCTION MLD_LISTENER_DONE /* RFC3542 definition */
+
+/* RFC2292 decls */
+#define ICMP6_MEMBERSHIP_QUERY 130 /* group membership query */
+#define ICMP6_MEMBERSHIP_REPORT 131 /* group membership report */
+#define ICMP6_MEMBERSHIP_REDUCTION 132 /* group membership termination */
+
+/* the following are for backward compatibility with old KAME apps. */
+#define MLD6_LISTENER_QUERY MLD_LISTENER_QUERY
+#define MLD6_LISTENER_REPORT MLD_LISTENER_REPORT
+#define MLD6_LISTENER_DONE MLD_LISTENER_DONE
+
+#define ND_ROUTER_SOLICIT 133 /* router solicitation */
+#define ND_ROUTER_ADVERT 134 /* router advertisement */
+#define ND_NEIGHBOR_SOLICIT 135 /* neighbor solicitation */
+#define ND_NEIGHBOR_ADVERT 136 /* neighbor advertisement */
+#define ND_REDIRECT 137 /* redirect */
+
+#define ICMP6_ROUTER_RENUMBERING 138 /* router renumbering */
+
+#define ICMP6_WRUREQUEST 139 /* who are you request */
+#define ICMP6_WRUREPLY 140 /* who are you reply */
+#define ICMP6_FQDN_QUERY 139 /* FQDN query */
+#define ICMP6_FQDN_REPLY 140 /* FQDN reply */
+#define ICMP6_NI_QUERY 139 /* node information request */
+#define ICMP6_NI_REPLY 140 /* node information reply */
+#define MLDV2_LISTENER_REPORT 143 /* RFC3810 listener report */
+
+/* The definitions below are experimental. TBA */
+#define MLD_MTRACE_RESP 200 /* mtrace response (to sender) */
+#define MLD_MTRACE 201 /* mtrace messages */
+
+/* the following are for backward compatibility with old KAME apps. */
+#define MLD6_MTRACE_RESP MLD_MTRACE_RESP
+#define MLD6_MTRACE MLD_MTRACE
+
+#define ICMP6_MAXTYPE 201
+
+#define ICMP6_DST_UNREACH_NOROUTE 0 /* no route to destination */
+#define ICMP6_DST_UNREACH_ADMIN 1 /* administratively prohibited */
+#define ICMP6_DST_UNREACH_NOTNEIGHBOR 2 /* not a neighbor (obsolete) */
+#define ICMP6_DST_UNREACH_BEYONDSCOPE 2 /* beyond scope of source address */
+#define ICMP6_DST_UNREACH_ADDR 3 /* address unreachable */
+#define ICMP6_DST_UNREACH_NOPORT 4 /* port unreachable */
+#define ICMP6_DST_UNREACH_POLICY 5 /* source address failed ingress/egress policy */
+#define ICMP6_DST_UNREACH_REJROUTE 6 /* reject route to destination */
+#define ICMP6_DST_UNREACH_SOURCERT 7 /* error in source routing header */
+
+#define ICMP6_TIME_EXCEED_TRANSIT 0 /* ttl==0 in transit */
+#define ICMP6_TIME_EXCEED_REASSEMBLY 1 /* ttl==0 in reass */
+
+#define ICMP6_PARAMPROB_HEADER 0 /* erroneous header field */
+#define ICMP6_PARAMPROB_NEXTHEADER 1 /* unrecognized next header */
+#define ICMP6_PARAMPROB_OPTION 2 /* unrecognized option */
+
+#define ICMP6_INFOMSG_MASK 0x80 /* all informational messages */
+
+#define ICMP6_NI_SUBJ_IPV6 0 /* Query Subject is an IPv6 address */
+#define ICMP6_NI_SUBJ_FQDN 1 /* Query Subject is a Domain name */
+#define ICMP6_NI_SUBJ_IPV4 2 /* Query Subject is an IPv4 address */
+
+#define ICMP6_NI_SUCCESS 0 /* node information successful reply */
+#define ICMP6_NI_REFUSED 1 /* node information request is refused */
+#define ICMP6_NI_UNKNOWN 2 /* unknown Qtype */
+
+#define ICMP6_ROUTER_RENUMBERING_COMMAND 0 /* rr command */
+#define ICMP6_ROUTER_RENUMBERING_RESULT 1 /* rr result */
+#define ICMP6_ROUTER_RENUMBERING_SEQNUM_RESET 255 /* rr seq num reset */
+
+/* Used in kernel only */
+#define ND_REDIRECT_ONLINK 0 /* redirect to an on-link node */
+#define ND_REDIRECT_ROUTER 1 /* redirect to a better router */
+
+/*
+ * Multicast Listener Discovery
+ */
+struct mld_hdr {
+ struct icmp6_hdr mld_icmp6_hdr;
+ struct in6_addr mld_addr; /* multicast address */
+};
+
+/* definitions to provide backward compatibility to old KAME applications */
+#define mld6_hdr mld_hdr
+#define mld6_type mld_type
+#define mld6_code mld_code
+#define mld6_cksum mld_cksum
+#define mld6_maxdelay mld_maxdelay
+#define mld6_reserved mld_reserved
+#define mld6_addr mld_addr
+
+/* shortcut macro definitions */
+#define mld_type mld_icmp6_hdr.icmp6_type
+#define mld_code mld_icmp6_hdr.icmp6_code
+#define mld_cksum mld_icmp6_hdr.icmp6_cksum
+#define mld_maxdelay mld_icmp6_hdr.icmp6_data16[0]
+#define mld_reserved mld_icmp6_hdr.icmp6_data16[1]
+
+#define MLD_MINLEN 24
+
+/*
+ * Neighbor Discovery
+ */
+
+struct nd_router_solicit { /* router solicitation */
+ struct icmp6_hdr nd_rs_hdr;
+ /* could be followed by options */
+};
+
+#define nd_rs_type nd_rs_hdr.icmp6_type
+#define nd_rs_code nd_rs_hdr.icmp6_code
+#define nd_rs_cksum nd_rs_hdr.icmp6_cksum
+#define nd_rs_reserved nd_rs_hdr.icmp6_data32[0]
+
+struct nd_router_advert { /* router advertisement */
+ struct icmp6_hdr nd_ra_hdr;
+ u_int32_t nd_ra_reachable; /* reachable time */
+ u_int32_t nd_ra_retransmit; /* retransmit timer */
+ /* could be followed by options */
+};
+
+#define nd_ra_type nd_ra_hdr.icmp6_type
+#define nd_ra_code nd_ra_hdr.icmp6_code
+#define nd_ra_cksum nd_ra_hdr.icmp6_cksum
+#define nd_ra_curhoplimit nd_ra_hdr.icmp6_data8[0]
+#define nd_ra_flags_reserved nd_ra_hdr.icmp6_data8[1]
+#define ND_RA_FLAG_MANAGED 0x80
+#define ND_RA_FLAG_OTHER 0x40
+#define ND_RA_FLAG_HOME_AGENT 0x20
+
+/*
+ * Router preference values based on RFC4191.
+ */
+#define ND_RA_FLAG_RTPREF_MASK 0x18 /* 00011000 */
+
+#define ND_RA_FLAG_RTPREF_HIGH 0x08 /* 00001000 */
+#define ND_RA_FLAG_RTPREF_MEDIUM 0x00 /* 00000000 */
+#define ND_RA_FLAG_RTPREF_LOW 0x18 /* 00011000 */
+#define ND_RA_FLAG_RTPREF_RSV 0x10 /* 00010000 */
+
+#define nd_ra_router_lifetime nd_ra_hdr.icmp6_data16[1]
+
+struct nd_neighbor_solicit { /* neighbor solicitation */
+ struct icmp6_hdr nd_ns_hdr;
+ struct in6_addr nd_ns_target; /* target address */
+ /* could be followed by options */
+};
+
+#define nd_ns_type nd_ns_hdr.icmp6_type
+#define nd_ns_code nd_ns_hdr.icmp6_code
+#define nd_ns_cksum nd_ns_hdr.icmp6_cksum
+#define nd_ns_reserved nd_ns_hdr.icmp6_data32[0]
+
+struct nd_neighbor_advert { /* neighbor advertisement */
+ struct icmp6_hdr nd_na_hdr;
+ struct in6_addr nd_na_target; /* target address */
+ /* could be followed by options */
+};
+
+#define nd_na_type nd_na_hdr.icmp6_type
+#define nd_na_code nd_na_hdr.icmp6_code
+#define nd_na_cksum nd_na_hdr.icmp6_cksum
+#define nd_na_flags_reserved nd_na_hdr.icmp6_data32[0]
+#define ND_NA_FLAG_ROUTER CONSTANT_HTONL(0x80000000)
+#define ND_NA_FLAG_SOLICITED CONSTANT_HTONL(0x40000000)
+#define ND_NA_FLAG_OVERRIDE CONSTANT_HTONL(0x20000000)
+
+struct nd_redirect { /* redirect */
+ struct icmp6_hdr nd_rd_hdr;
+ struct in6_addr nd_rd_target; /* target address */
+ struct in6_addr nd_rd_dst; /* destination address */
+ /* could be followed by options */
+};
+
+#define nd_rd_type nd_rd_hdr.icmp6_type
+#define nd_rd_code nd_rd_hdr.icmp6_code
+#define nd_rd_cksum nd_rd_hdr.icmp6_cksum
+#define nd_rd_reserved nd_rd_hdr.icmp6_data32[0]
+
+struct nd_opt_hdr { /* Neighbor discovery option header */
+ u_int8_t nd_opt_type;
+ u_int8_t nd_opt_len;
+ /* followed by option specific data */
+};
+
+#define ND_OPT_SOURCE_LINKADDR 1
+#define ND_OPT_TARGET_LINKADDR 2
+#define ND_OPT_PREFIX_INFORMATION 3
+#define ND_OPT_REDIRECTED_HEADER 4
+#define ND_OPT_MTU 5
+#define ND_OPT_ADVINTERVAL 7
+#define ND_OPT_HOMEAGENT_INFO 8
+#define ND_OPT_SOURCE_ADDRLIST 9
+#define ND_OPT_TARGET_ADDRLIST 10
+#define ND_OPT_MAP 23 /* RFC 5380 */
+#define ND_OPT_ROUTE_INFO 24 /* RFC 4191 */
+#define ND_OPT_RDNSS 25 /* RFC 6106 */
+#define ND_OPT_DNSSL 31 /* RFC 6106 */
+
+struct nd_opt_route_info { /* route info */
+ u_int8_t nd_opt_rti_type;
+ u_int8_t nd_opt_rti_len;
+ u_int8_t nd_opt_rti_prefixlen;
+ u_int8_t nd_opt_rti_flags;
+ u_int32_t nd_opt_rti_lifetime;
+ /* prefix follows */
+};
+
+struct nd_opt_prefix_info { /* prefix information */
+ u_int8_t nd_opt_pi_type;
+ u_int8_t nd_opt_pi_len;
+ u_int8_t nd_opt_pi_prefix_len;
+ u_int8_t nd_opt_pi_flags_reserved;
+ u_int32_t nd_opt_pi_valid_time;
+ u_int32_t nd_opt_pi_preferred_time;
+ u_int32_t nd_opt_pi_reserved2;
+ struct in6_addr nd_opt_pi_prefix;
+};
+
+#define ND_OPT_PI_FLAG_ONLINK 0x80
+#define ND_OPT_PI_FLAG_AUTO 0x40
+
+struct nd_opt_rd_hdr { /* redirected header */
+ u_int8_t nd_opt_rh_type;
+ u_int8_t nd_opt_rh_len;
+ u_int16_t nd_opt_rh_reserved1;
+ u_int32_t nd_opt_rh_reserved2;
+ /* followed by IP header and data */
+};
+
+struct nd_opt_mtu { /* MTU option */
+ u_int8_t nd_opt_mtu_type;
+ u_int8_t nd_opt_mtu_len;
+ u_int16_t nd_opt_mtu_reserved;
+ u_int32_t nd_opt_mtu_mtu;
+};
+
+struct nd_opt_rdnss { /* RDNSS option RFC 6106 */
+ u_int8_t nd_opt_rdnss_type;
+ u_int8_t nd_opt_rdnss_len;
+ u_int16_t nd_opt_rdnss_reserved;
+ u_int32_t nd_opt_rdnss_lifetime;
+ /* followed by list of IP prefixes */
+};
+
+struct nd_opt_dnssl { /* DNSSL option RFC 6106 */
+ u_int8_t nd_opt_dnssl_type;
+ u_int8_t nd_opt_dnssl_len;
+ u_int16_t nd_opt_dnssl_reserved;
+ u_int32_t nd_opt_dnssl_lifetime;
+ /* followed by list of IP prefixes */
+};
+
+/*
+ * icmp6 namelookup
+ */
+
+struct icmp6_namelookup {
+ struct icmp6_hdr icmp6_nl_hdr;
+ u_int8_t icmp6_nl_nonce[8];
+ int32_t icmp6_nl_ttl;
+#if 0
+ u_int8_t icmp6_nl_len;
+ u_int8_t icmp6_nl_name[3];
+#endif
+ /* could be followed by options */
+};
+
+/*
+ * icmp6 node information
+ */
+struct icmp6_nodeinfo {
+ struct icmp6_hdr icmp6_ni_hdr;
+ u_int8_t icmp6_ni_nonce[8];
+ /* could be followed by reply data */
+};
+
+#define ni_type icmp6_ni_hdr.icmp6_type
+#define ni_code icmp6_ni_hdr.icmp6_code
+#define ni_cksum icmp6_ni_hdr.icmp6_cksum
+#define ni_qtype icmp6_ni_hdr.icmp6_data16[0]
+#define ni_flags icmp6_ni_hdr.icmp6_data16[1]
+
+#define NI_QTYPE_NOOP 0 /* NOOP */
+#define NI_QTYPE_SUPTYPES 1 /* Supported Qtypes */
+#define NI_QTYPE_FQDN 2 /* FQDN (draft 04) */
+#define NI_QTYPE_DNSNAME 2 /* DNS Name */
+#define NI_QTYPE_NODEADDR 3 /* Node Addresses */
+#define NI_QTYPE_IPV4ADDR 4 /* IPv4 Addresses */
+
+#define NI_SUPTYPE_FLAG_COMPRESS CONSTANT_HTONS(0x1)
+#define NI_FQDN_FLAG_VALIDTTL CONSTANT_HTONS(0x1)
+
+#ifdef NAME_LOOKUPS_04
+#define NI_NODEADDR_FLAG_LINKLOCAL CONSTANT_HTONS(0x1)
+#define NI_NODEADDR_FLAG_SITELOCAL CONSTANT_HTONS(0x2)
+#define NI_NODEADDR_FLAG_GLOBAL CONSTANT_HTONS(0x4)
+#define NI_NODEADDR_FLAG_ALL CONSTANT_HTONS(0x8)
+#define NI_NODEADDR_FLAG_TRUNCATE CONSTANT_HTONS(0x10)
+#define NI_NODEADDR_FLAG_ANYCAST CONSTANT_HTONS(0x20) /* just experimental. not in spec */
+#else /* draft-ietf-ipngwg-icmp-name-lookups-05 (and later?) */
+#define NI_NODEADDR_FLAG_TRUNCATE CONSTANT_HTONS(0x1)
+#define NI_NODEADDR_FLAG_ALL CONSTANT_HTONS(0x2)
+#define NI_NODEADDR_FLAG_COMPAT CONSTANT_HTONS(0x4)
+#define NI_NODEADDR_FLAG_LINKLOCAL CONSTANT_HTONS(0x8)
+#define NI_NODEADDR_FLAG_SITELOCAL CONSTANT_HTONS(0x10)
+#define NI_NODEADDR_FLAG_GLOBAL CONSTANT_HTONS(0x20)
+#define NI_NODEADDR_FLAG_ANYCAST CONSTANT_HTONS(0x40) /* just experimental. not in spec */
+#endif
+
+struct ni_reply_fqdn {
+ u_int32_t ni_fqdn_ttl; /* TTL */
+ u_int8_t ni_fqdn_namelen; /* length in octets of the FQDN */
+ u_int8_t ni_fqdn_name[3]; /* XXX: alignment */
+};
+
+/*
+ * Router Renumbering, as in router-renum-08.txt
+ */
+struct icmp6_router_renum { /* router renumbering header */
+ struct icmp6_hdr rr_hdr;
+ u_int8_t rr_segnum;
+ u_int8_t rr_flags;
+ u_int16_t rr_maxdelay;
+ u_int32_t rr_reserved;
+};
+
+#define ICMP6_RR_FLAGS_TEST 0x80
+#define ICMP6_RR_FLAGS_REQRESULT 0x40
+#define ICMP6_RR_FLAGS_FORCEAPPLY 0x20
+#define ICMP6_RR_FLAGS_SPECSITE 0x10
+#define ICMP6_RR_FLAGS_PREVDONE 0x08
+
+#define rr_type rr_hdr.icmp6_type
+#define rr_code rr_hdr.icmp6_code
+#define rr_cksum rr_hdr.icmp6_cksum
+#define rr_seqnum rr_hdr.icmp6_data32[0]
+
+struct rr_pco_match { /* match prefix part */
+ u_int8_t rpm_code;
+ u_int8_t rpm_len;
+ u_int8_t rpm_ordinal;
+ u_int8_t rpm_matchlen;
+ u_int8_t rpm_minlen;
+ u_int8_t rpm_maxlen;
+ u_int16_t rpm_reserved;
+ struct in6_addr rpm_prefix;
+};
+
+#define RPM_PCO_ADD 1
+#define RPM_PCO_CHANGE 2
+#define RPM_PCO_SETGLOBAL 3
+#define RPM_PCO_MAX 4
+
+struct rr_pco_use { /* use prefix part */
+ u_int8_t rpu_uselen;
+ u_int8_t rpu_keeplen;
+ u_int8_t rpu_ramask;
+ u_int8_t rpu_raflags;
+ u_int32_t rpu_vltime;
+ u_int32_t rpu_pltime;
+ u_int32_t rpu_flags;
+ struct in6_addr rpu_prefix;
+};
+#define ICMP6_RR_PCOUSE_RAFLAGS_ONLINK 0x80
+#define ICMP6_RR_PCOUSE_RAFLAGS_AUTO 0x40
+
+#define ICMP6_RR_PCOUSE_FLAGS_DECRVLTIME CONSTANT_HTONL(0x80000000)
+#define ICMP6_RR_PCOUSE_FLAGS_DECRPLTIME CONSTANT_HTONL(0x40000000)
+
+struct rr_result { /* router renumbering result message */
+ u_int16_t rrr_flags;
+ u_int8_t rrr_ordinal;
+ u_int8_t rrr_matchedlen;
+ u_int32_t rrr_ifid;
+ struct in6_addr rrr_prefix;
+};
+#define ICMP6_RR_RESULT_FLAGS_OOB CONSTANT_HTONS(0x0002)
+#define ICMP6_RR_RESULT_FLAGS_FORBIDDEN CONSTANT_HTONS(0x0001)
+
+/*
+ * icmp6 filter structures.
+ */
+
+struct icmp6_filter {
+ u_int32_t icmp6_filt[8];
+};
+
+#define ICMP6_FILTER_SETPASSALL(filterp) \
+ (void)memset(filterp, 0xff, sizeof(struct icmp6_filter))
+#define ICMP6_FILTER_SETBLOCKALL(filterp) \
+ (void)memset(filterp, 0x00, sizeof(struct icmp6_filter))
+#define ICMP6_FILTER_SETPASS(type, filterp) \
+ (((filterp)->icmp6_filt[(type) >> 5]) |= (1 << ((type) & 31)))
+#define ICMP6_FILTER_SETBLOCK(type, filterp) \
+ (((filterp)->icmp6_filt[(type) >> 5]) &= ~(1 << ((type) & 31)))
+#define ICMP6_FILTER_WILLPASS(type, filterp) \
+ ((((filterp)->icmp6_filt[(type) >> 5]) & (1 << ((type) & 31))) != 0)
+#define ICMP6_FILTER_WILLBLOCK(type, filterp) \
+ ((((filterp)->icmp6_filt[(type) >> 5]) & (1 << ((type) & 31))) == 0)
+
+/*
+ * Variables related to this implementation
+ * of the internet control message protocol version 6.
+ */
+
+/*
+ * IPv6 ICMP statistics.
+ * Each counter is an unsigned 64-bit value.
+ */
+#define ICMP6_STAT_ERROR 0 /* # of calls to icmp6_error */
+#define ICMP6_STAT_CANTERROR 1 /* no error (old was icmp) */
+#define ICMP6_STAT_TOOFREQ 2 /* no error (rate limitation) */
+#define ICMP6_STAT_OUTHIST 3 /* # of output messages */
+ /* space for 256 counters */
+#define ICMP6_STAT_BADCODE 259 /* icmp6_code out of range */
+#define ICMP6_STAT_TOOSHORT 260 /* packet < sizeof(struct icmp6_hdr) */
+#define ICMP6_STAT_CHECKSUM 261 /* bad checksum */
+#define ICMP6_STAT_BADLEN 262 /* calculated bound mismatch */
+ /*
+ * number of responses; this member is inherited from the netinet code,
+ * but for netinet6 code, it is already available in outhist[].
+ */
+#define ICMP6_STAT_REFLECT 263
+#define ICMP6_STAT_INHIST 264 /* # of input messages */
+ /* space for 256 counters */
+#define ICMP6_STAT_ND_TOOMANYOPT 520 /* too many ND options */
+#define ICMP6_STAT_OUTERRHIST 521
+ /* space for 13 counters */
+#define ICMP6_STAT_PMTUCHG 534 /* path MTU changes */
+#define ICMP6_STAT_ND_BADOPT 535 /* bad ND options */
+#define ICMP6_STAT_BADNS 536 /* bad neighbor solicitation */
+#define ICMP6_STAT_BADNA 537 /* bad neighbor advertisement */
+#define ICMP6_STAT_BADRS 538 /* bad router solicitation */
+#define ICMP6_STAT_BADRA 539 /* bad router advertisement */
+#define ICMP6_STAT_BADREDIRECT 540 /* bad redirect message */
+#define ICMP6_STAT_DROPPED_RAROUTE 541 /* discarded routes from router advertisement */
+
+#define ICMP6_NSTATS 542
+
+#define ICMP6_ERRSTAT_DST_UNREACH_NOROUTE 0
+#define ICMP6_ERRSTAT_DST_UNREACH_ADMIN 1
+#define ICMP6_ERRSTAT_DST_UNREACH_BEYONDSCOPE 2
+#define ICMP6_ERRSTAT_DST_UNREACH_ADDR 3
+#define ICMP6_ERRSTAT_DST_UNREACH_NOPORT 4
+#define ICMP6_ERRSTAT_PACKET_TOO_BIG 5
+#define ICMP6_ERRSTAT_TIME_EXCEED_TRANSIT 6
+#define ICMP6_ERRSTAT_TIME_EXCEED_REASSEMBLY 7
+#define ICMP6_ERRSTAT_PARAMPROB_HEADER 8
+#define ICMP6_ERRSTAT_PARAMPROB_NEXTHEADER 9
+#define ICMP6_ERRSTAT_PARAMPROB_OPTION 10
+#define ICMP6_ERRSTAT_REDIRECT 11
+#define ICMP6_ERRSTAT_UNKNOWN 12
+
+/*
+ * Names for ICMP sysctl objects
+ */
+#define ICMPV6CTL_STATS 1
+#define ICMPV6CTL_REDIRACCEPT 2 /* accept/process redirects */
+#define ICMPV6CTL_REDIRTIMEOUT 3 /* redirect cache time */
+#if 0 /*obsoleted*/
+#define ICMPV6CTL_ERRRATELIMIT 5 /* ICMPv6 error rate limitation */
+#endif
+#define ICMPV6CTL_ND6_PRUNE 6
+#define ICMPV6CTL_ND6_DELAY 8
+#define ICMPV6CTL_ND6_UMAXTRIES 9
+#define ICMPV6CTL_ND6_MMAXTRIES 10
+#define ICMPV6CTL_ND6_USELOOPBACK 11
+/*#define ICMPV6CTL_ND6_PROXYALL 12 obsoleted, do not reuse here */
+#define ICMPV6CTL_NODEINFO 13
+#define ICMPV6CTL_ERRPPSLIMIT 14 /* ICMPv6 error pps limitation */
+#define ICMPV6CTL_ND6_MAXNUDHINT 15
+#define ICMPV6CTL_MTUDISC_HIWAT 16
+#define ICMPV6CTL_MTUDISC_LOWAT 17
+#define ICMPV6CTL_ND6_DEBUG 18
+#define ICMPV6CTL_ND6_DRLIST 19
+#define ICMPV6CTL_ND6_PRLIST 20
+#define ICMPV6CTL_ND6_MAXQLEN 24
+#define ICMPV6CTL_MAXID 25
+
+#define ICMPV6CTL_NAMES { \
+ { 0, 0 }, \
+ { 0, 0 }, \
+ { "rediraccept", CTLTYPE_INT }, \
+ { "redirtimeout", CTLTYPE_INT }, \
+ { 0, 0 }, \
+ { 0, 0 }, \
+ { "nd6_prune", CTLTYPE_INT }, \
+ { 0, 0 }, \
+ { "nd6_delay", CTLTYPE_INT }, \
+ { "nd6_umaxtries", CTLTYPE_INT }, \
+ { "nd6_mmaxtries", CTLTYPE_INT }, \
+ { "nd6_useloopback", CTLTYPE_INT }, \
+ { 0, 0 }, \
+ { "nodeinfo", CTLTYPE_INT }, \
+ { "errppslimit", CTLTYPE_INT }, \
+ { "nd6_maxnudhint", CTLTYPE_INT }, \
+ { "mtudisc_hiwat", CTLTYPE_INT }, \
+ { "mtudisc_lowat", CTLTYPE_INT }, \
+ { "nd6_debug", CTLTYPE_INT }, \
+ { 0, 0 }, \
+ { 0, 0 }, \
+ { 0, 0 }, \
+ { 0, 0 }, \
+ { 0, 0 }, \
+ { "nd6_maxqueuelen", CTLTYPE_INT }, \
+}
+
+#endif /* !_NETINET_ICMP6_H_ */
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef __NETINET_IP_H
+#define __NETINET_IP_H 1
+
+#define IPTOS_PREC_INTERNETCONTROL 0xc0
+#define MAXTTL 255
+#define IPTOS_LOWDELAY 0x10
+#define IPTOS_THROUGHPUT 0x08
+
+#endif /* netinet/ip.h */
--- /dev/null
+/*
+ * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the project nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+/*
+ * Copyright (c) 1982, 1986, 1993
+ * The Regents of the University of California. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the University nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * @(#)ip.h 8.1 (Berkeley) 6/10/93
+ */
+
+#ifndef _NETINET_IP6_H_
+#define _NETINET_IP6_H_
+#include <inttypes.h>
+#include <netinet/in.h>
+#include "byte-order.h"
+
+/*
+ * Definition for internet protocol version 6.
+ * RFC 2460
+ */
+
+struct ip6_hdr {
+ union {
+ struct ip6_hdrctl {
+ u_int32_t ip6_un1_flow; /* 20 bits of flow-ID */
+ u_int16_t ip6_un1_plen; /* payload length */
+ u_int8_t ip6_un1_nxt; /* next header */
+ u_int8_t ip6_un1_hlim; /* hop limit */
+ } ip6_un1;
+ u_int8_t ip6_un2_vfc; /* 4 bits version, top 4 bits class */
+ } ip6_ctlun;
+ struct in6_addr ip6_src; /* source address */
+ struct in6_addr ip6_dst; /* destination address */
+};
+
+#define ip6_vfc ip6_ctlun.ip6_un2_vfc
+#define ip6_flow ip6_ctlun.ip6_un1.ip6_un1_flow
+#define ip6_plen ip6_ctlun.ip6_un1.ip6_un1_plen
+#define ip6_nxt ip6_ctlun.ip6_un1.ip6_un1_nxt
+#define ip6_hlim ip6_ctlun.ip6_un1.ip6_un1_hlim
+#define ip6_hops ip6_ctlun.ip6_un1.ip6_un1_hlim
+
+#define IPV6_VERSION 0x60
+#define IPV6_VERSION_MASK 0xf0
+
+#define IPV6_FLOWINFO_MASK CONSTANT_HTONL(0x0fffffff) /* flow info (28 bits) */
+#define IPV6_FLOWLABEL_MASK CONSTANT_HTONL(0x000fffff) /* flow label (20 bits) */
+#if 1
+/* ECN bits proposed by Sally Floyd */
+#define IP6TOS_CE 0x01 /* congestion experienced */
+#define IP6TOS_ECT 0x02 /* ECN-capable transport */
+#endif
+
+/*
+ * Extension Headers
+ */
+
+struct ip6_ext {
+ u_int8_t ip6e_nxt;
+ u_int8_t ip6e_len;
+};
+
+/* Hop-by-Hop options header */
+/* XXX should we pad it to force alignment on an 8-byte boundary? */
+struct ip6_hbh {
+ u_int8_t ip6h_nxt; /* next header */
+ u_int8_t ip6h_len; /* length in units of 8 octets */
+ /* followed by options */
+};
+
+/* Destination options header */
+/* XXX should we pad it to force alignment on an 8-byte boundary? */
+struct ip6_dest {
+ u_int8_t ip6d_nxt; /* next header */
+ u_int8_t ip6d_len; /* length in units of 8 octets */
+ /* followed by options */
+};
+
+/* Option types and related macros */
+#define IP6OPT_PAD1 0x00 /* 00 0 00000 */
+#define IP6OPT_PADN 0x01 /* 00 0 00001 */
+#define IP6OPT_JUMBO 0xC2 /* 11 0 00010 = 194 */
+#define IP6OPT_NSAP_ADDR 0xC3 /* 11 0 00011 */
+#define IP6OPT_TUNNEL_LIMIT 0x04 /* 00 0 00100 */
+#define IP6OPT_RTALERT 0x05 /* 00 0 00101 (KAME definition) */
+#define IP6OPT_ROUTER_ALERT 0x05 /* (RFC3542 def, recommended) */
+
+#define IP6OPT_RTALERT_LEN 4
+#define IP6OPT_RTALERT_MLD 0 /* Datagram contains an MLD message */
+#define IP6OPT_RTALERT_RSVP 1 /* Datagram contains an RSVP message */
+#define IP6OPT_RTALERT_ACTNET 2 /* contains an Active Networks msg */
+#define IP6OPT_MINLEN 2
+
+#define IP6OPT_TYPE(o) ((o) & 0xC0)
+#define IP6OPT_TYPE_SKIP 0x00
+#define IP6OPT_TYPE_DISCARD 0x40
+#define IP6OPT_TYPE_FORCEICMP 0x80
+#define IP6OPT_TYPE_ICMP 0xC0
+
+#define IP6OPT_MUTABLE 0x20
+
+/* IPv6 options: common part */
+struct ip6_opt {
+ u_int8_t ip6o_type;
+ u_int8_t ip6o_len;
+};
+
+/* Jumbo Payload Option */
+struct ip6_opt_jumbo {
+ u_int8_t ip6oj_type;
+ u_int8_t ip6oj_len;
+ u_int8_t ip6oj_jumbo_len[4];
+};
+#define IP6OPT_JUMBO_LEN 6
+
+/* NSAP Address Option */
+struct ip6_opt_nsap {
+ u_int8_t ip6on_type;
+ u_int8_t ip6on_len;
+ u_int8_t ip6on_src_nsap_len;
+ u_int8_t ip6on_dst_nsap_len;
+ /* followed by source NSAP */
+ /* followed by destination NSAP */
+};
+
+/* Tunnel Limit Option */
+struct ip6_opt_tunnel {
+ u_int8_t ip6ot_type;
+ u_int8_t ip6ot_len;
+ u_int8_t ip6ot_encap_limit;
+};
+
+/* Router Alert Option */
+struct ip6_opt_router {
+ u_int8_t ip6or_type;
+ u_int8_t ip6or_len;
+ u_int8_t ip6or_value[2];
+};
+/* Router alert values (in network byte order) */
+#define IP6_ALERT_MLD CONSTANT_HTONS(0x0000)
+#define IP6_ALERT_RSVP CONSTANT_HTONS(0x0001)
+#define IP6_ALERT_AN CONSTANT_HTONS(0x0002)
+
+/* Routing header */
+struct ip6_rthdr {
+ u_int8_t ip6r_nxt; /* next header */
+ u_int8_t ip6r_len; /* length in units of 8 octets */
+ u_int8_t ip6r_type; /* routing type */
+ u_int8_t ip6r_segleft; /* segments left */
+ /* followed by routing type specific data */
+};
+
+/* Type 0 Routing header */
+struct ip6_rthdr0 {
+ u_int8_t ip6r0_nxt; /* next header */
+ u_int8_t ip6r0_len; /* length in units of 8 octets */
+ u_int8_t ip6r0_type; /* always zero */
+ u_int8_t ip6r0_segleft; /* segments left */
+ u_int32_t ip6r0_reserved; /* reserved field */
+};
+
+/* Fragment header */
+struct ip6_frag {
+ u_int8_t ip6f_nxt; /* next header */
+ u_int8_t ip6f_reserved; /* reserved field */
+ u_int16_t ip6f_offlg; /* offset, reserved, and flag */
+ u_int32_t ip6f_ident; /* identification */
+};
+
+#define IP6F_OFF_MASK CONSTANT_HTONS(0xfff8) /* mask out offset from _offlg */
+#define IP6F_RESERVED_MASK CONSTANT_HTONS(0x0006) /* reserved bits in ip6f_offlg */
+#define IP6F_MORE_FRAG CONSTANT_HTONS(0x0001) /* more-fragments flag */
+
+/*
+ * Internet implementation parameters.
+ */
+#define IPV6_MAXHLIM 255 /* maximum hoplimit */
+#define IPV6_DEFHLIM 64 /* default hlim */
+#define IPV6_FRAGTTL 120 /* ttl for fragment packets, in slowtimo tick */
+#define IPV6_HLIMDEC 1 /* subtracted when forwarding */
+
+#define IPV6_MMTU 1280 /* minimal MTU and reassembly. 1024 + 256 */
+#define IPV6_MAXPACKET 65535 /* ip6 max packet size without Jumbo payload */
+
+#endif /* !_NETINET_IP6_H_ */
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef __SYS_SOCKET_H
+#define __SYS_SOCKET_H 1
+
+typedef unsigned short int sa_family_t;
+
+#endif /* sys/socket.h */
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#ifndef _UNISTD_H
+#define _UNISTD_H 1
+
+#define fsync _commit
+
+/* Standard file descriptors. */
+#define STDIN_FILENO 0 /* Standard input. */
+#define STDOUT_FILENO 1 /* Standard output. */
+#define STDERR_FILENO 2 /* Standard error output. */
+
+#endif /* unistd.h */
#include <WS2tcpip.h>
#include <windows.h>
#include <BaseTsd.h>
+#include <io.h>
+#include <inttypes.h>
+
+#pragma comment(lib, "advapi32")
#define inline __inline
#define __func__ __FUNCTION__
#define u_int32_t uint32_t
#define u_int64_t uint64_t
+typedef int pid_t;
+
+char *strsep(char **stringp, const char *delim);
+
+#define srandom srand
+#define random rand
+
#endif /* windefs.h */
lib_LTLIBRARIES += lib/libopenvswitch.la
lib_libopenvswitch_la_LIBADD = $(SSL_LIBS)
+
+if WIN32
+lib_libopenvswitch_la_LIBADD += ${PTHREAD_LIBS}
+endif
+
lib_libopenvswitch_la_LDFLAGS = -release $(VERSION)
lib_libopenvswitch_la_SOURCES = \
lib/dhparams.h \
lib/dirs.h \
lib/dpif-netdev.c \
+ lib/dpif-netdev.h \
lib/dpif-provider.h \
lib/dpif.c \
lib/dpif.h \
lib/ovs-atomic-c11.h \
lib/ovs-atomic-clang.h \
lib/ovs-atomic-flag-gcc4.7+.h \
- lib/ovs-atomic-gcc4+.c \
lib/ovs-atomic-gcc4+.h \
lib/ovs-atomic-gcc4.7+.h \
- lib/ovs-atomic-pthreads.c \
+ lib/ovs-atomic-locked.c \
+ lib/ovs-atomic-locked.h \
lib/ovs-atomic-pthreads.h \
lib/ovs-atomic.h \
+ lib/ovs-rcu.c \
+ lib/ovs-rcu.h \
lib/ovs-thread.c \
lib/ovs-thread.h \
lib/ovsdb-data.c \
lib/getopt_long.c \
lib/getrusage-windows.c \
lib/latch-windows.c \
+ lib/route-table-stub.c \
+ lib/strsep.c \
lib/stream-fd-windows.c
else
lib_libopenvswitch_la_SOURCES += \
lib/route-table.h
endif
+if DPDK_NETDEV
+lib_libopenvswitch_la_SOURCES += \
+ lib/netdev-dpdk.c \
+ lib/netdev-dpdk.h
+endif
+
if HAVE_POSIX_AIO
lib_libopenvswitch_la_SOURCES += lib/async-append-aio.c
else
*/
#include <config.h>
+#include <inttypes.h>
#include "backtrace.h"
+#include "vlog.h"
+
+VLOG_DEFINE_THIS_MODULE(backtrace);
#ifdef HAVE_BACKTRACE
#include <execinfo.h>
b->frames[i] = (uintptr_t) frames[i];
}
}
+
#else
void
backtrace_capture(struct backtrace *backtrace)
backtrace->n_frames = 0;
}
#endif
+
+static char *
+backtrace_format(const struct backtrace *b, struct ds *ds)
+{
+ if (b->n_frames) {
+ int i;
+
+ ds_put_cstr(ds, " (backtrace:");
+ for (i = 0; i < b->n_frames; i++) {
+ ds_put_format(ds, " 0x%08"PRIxPTR, b->frames[i]);
+ }
+ ds_put_cstr(ds, ")");
+ }
+
+ return ds_cstr(ds);
+}
+
+void
+log_backtrace_at(const char *msg, const char *where)
+{
+ struct backtrace b;
+ struct ds ds = DS_EMPTY_INITIALIZER;
+
+ backtrace_capture(&b);
+ if (msg) {
+ ds_put_format(&ds, "%s ", msg);
+ }
+
+ ds_put_cstr(&ds, where);
+ VLOG_ERR("%s", backtrace_format(&b, &ds));
+
+ ds_destroy(&ds);
+}
#define BACKTRACE_H 1
#include <stdint.h>
+#include "dynamic-string.h"
+
+/* log_backtrace() will save the backtrace of a running program
+ * into the log at the ERROR level.
+ *
+ * To use it, insert the following code at the point where a backtrace
+ * is desired:
+ * #include "backtrace.h"
+ *
+ * log_backtrace();
+ * // A message can be added with log_backtrace_msg("your message")
+ *
+ * A typical log will look like the following. The hex numbers listed after
+ * "backtrace" are the addresses of the backtrace.
+ *
+ * 2014-03-13T23:18:11.979Z|00002|backtrace(revalidator_6)|ERR|lib/dpif-netdev.c:1312: (backtrace: 0x00521f57 0x00460365 0x00463ea4 0x0046470b 0x0043b32d 0x0043bac3 0x0043bae2 0x0043943b 0x004c22b3 0x2b5b3ac94e9a 0x2b5b3b4a33fd)
+ *
+ * The following bash command can be used to view the backtrace in
+ * a more readable form.
+ * addr2line -p -e vswitchd/ovs-vswitchd <cut-and-paste back traces>
+ *
+ * A typical run and its output will look like:
+ * addr2line -p -e vswitchd/ovs-vswitchd 0x00521f57 0x00460365 0x00463ea4
+ * 0x0046470b 0x0043b32d 0x0043bac3 0x0043bae2 0x0043943b 0x004c22b3
+ * 0x2b5b3ac94e9a 0x2b5b3b4a33fd
+ *
+ * openvswitch/lib/backtrace.c:33
+ * openvswitch/lib/dpif-netdev.c:1312
+ * openvswitch/lib/dpif.c:937
+ * openvswitch/lib/dpif.c:1258
+ * openvswitch/ofproto/ofproto-dpif-upcall.c:1440
+ * openvswitch/ofproto/ofproto-dpif-upcall.c:1595
+ * openvswitch/ofproto/ofproto-dpif-upcall.c:160
+ * openvswitch/ofproto/ofproto-dpif-upcall.c:717
+ * openvswitch/lib/ovs-thread.c:268
+ * ??:0
+ * ??:0
+ */
+
+#define log_backtrace() log_backtrace_at(NULL, SOURCE_LOCATOR)
+#define log_backtrace_msg(msg) log_backtrace_at(msg, SOURCE_LOCATOR)
#define BACKTRACE_MAX_FRAMES 31
};
void backtrace_capture(struct backtrace *);
+void log_backtrace_at(const char *msg, const char *where);
#endif /* backtrace.h */
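The capture-and-format pattern above can be reproduced in a standalone program with glibc's `<execinfo.h>`. The sketch below is not OVS code: `bt_capture()` and `bt_format()` are hypothetical stand-ins that mirror what `backtrace_capture()` and `backtrace_format()` do, producing the same `(backtrace: 0x... 0x...)` hex form shown in the header comment.

```c
/* Standalone sketch (not OVS code): capture up to 31 frames with glibc's
 * backtrace() and render them as hex addresses, mirroring the
 * "(backtrace: 0x... 0x...)" format produced by backtrace_format(). */
#include <assert.h>
#include <execinfo.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_FRAMES 31

struct bt {
    int n_frames;
    uintptr_t frames[MAX_FRAMES];
};

void bt_capture(struct bt *b)
{
    void *frames[MAX_FRAMES];
    int i;

    /* backtrace() fills 'frames' with return addresses of the current
     * call chain and returns how many it found. */
    b->n_frames = backtrace(frames, MAX_FRAMES);
    for (i = 0; i < b->n_frames; i++) {
        b->frames[i] = (uintptr_t) frames[i];
    }
}

/* Formats 'b' into 'buf', e.g. " (backtrace: 0x00521f57 ...)". */
void bt_format(const struct bt *b, char *buf, size_t size)
{
    size_t len = 0;
    int i;

    buf[0] = '\0';
    len += snprintf(buf + len, size - len, " (backtrace:");
    for (i = 0; i < b->n_frames && len < size; i++) {
        len += snprintf(buf + len, size - len, " 0x%08" PRIxPTR,
                        b->frames[i]);
    }
    snprintf(buf + len, size - len, ")");
}
```

The resulting addresses can be fed to `addr2line` exactly as the header comment describes.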
-/* Copyright (c) 2013 Nicira, Inc.
+/* Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include "hmap.h"
#include "list.h"
#include "netdev.h"
-#include "netlink.h"
#include "odp-util.h"
#include "ofpbuf.h"
#include "ovs-thread.h"
ovs_mutex_lock(&mutex);
hmap_remove(all_bfds, &bfd->node);
netdev_close(bfd->netdev);
- ovs_refcount_destroy(&bfd->ref_cnt);
free(bfd->name);
free(bfd);
ovs_mutex_unlock(&mutex);
enum flags flags;
uint8_t version;
struct msg *msg;
+ const uint8_t *l7 = ofpbuf_get_udp_payload(p);
+
+ if (!l7) {
+ return; /* No UDP payload. */
+ }
/* This function is designed to follow section RFC 5880 6.8.6 closely. */
goto out;
}
- msg = ofpbuf_at(p, (uint8_t *)p->l7 - (uint8_t *)p->data, BFD_PACKET_LEN);
+ msg = ofpbuf_at(p, l7 - (uint8_t *)p->data, BFD_PACKET_LEN);
if (!msg) {
VLOG_INFO_RL(&rl, "%s: Received too-short BFD control message (only "
"%"PRIdPTR" bytes long, at least %d required).",
- bfd->name, (uint8_t *) ofpbuf_tail(p) - (uint8_t *) p->l7,
+ bfd->name, (uint8_t *) ofpbuf_tail(p) - l7,
BFD_PACKET_LEN);
goto out;
}
/*
- * Copyright (c) 2008, 2009, 2011 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2011, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
}
/* Scans 'bitmap' from bit offset 'start' to 'end', excluding 'end' itself.
- * Returns the bit offset of the lowest-numbered bit set to 1, or 'end' if
- * all of the bits are set to 0. */
+ * Returns the bit offset of the lowest-numbered bit set to 'target', or 'end'
+ * if all of the bits are set to '!target'. */
size_t
-bitmap_scan(const unsigned long int *bitmap, size_t start, size_t end)
+bitmap_scan(const unsigned long int *bitmap, bool target,
+ size_t start, size_t end)
{
/* XXX slow */
size_t i;
for (i = start; i < end; i++) {
- if (bitmap_is_set(bitmap, i)) {
+ if (bitmap_is_set(bitmap, i) == target) {
break;
}
}
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
void bitmap_set_multiple(unsigned long *, size_t start, size_t count,
bool value);
bool bitmap_equal(const unsigned long *, const unsigned long *, size_t n);
-size_t bitmap_scan(const unsigned long int *, size_t start, size_t end);
+size_t bitmap_scan(const unsigned long int *, bool target,
+ size_t start, size_t end);
size_t bitmap_count1(const unsigned long *, size_t n);
#define BITMAP_FOR_EACH_1(IDX, SIZE, BITMAP) \
- for ((IDX) = bitmap_scan(BITMAP, 0, SIZE); (IDX) < (SIZE); \
- (IDX) = bitmap_scan(BITMAP, (IDX) + 1, SIZE))
+ for ((IDX) = bitmap_scan(BITMAP, 1, 0, SIZE); (IDX) < (SIZE); \
+ (IDX) = bitmap_scan(BITMAP, 1, (IDX) + 1, SIZE))
#endif /* bitmap.h */
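The generalized `bitmap_scan()` above can be exercised in isolation. This is a minimal standalone version of the same (admittedly slow, as the `XXX` note says) linear scan, together with the `BITMAP_FOR_EACH_1` idiom built on top of it:

```c
/* Minimal standalone sketch of the generalized bitmap_scan(): scan
 * [start, end) for the first bit equal to 'target', returning 'end'
 * if none is found. */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

static bool
bitmap_is_set(const unsigned long *bitmap, size_t idx)
{
    return (bitmap[idx / BITS_PER_LONG] >> (idx % BITS_PER_LONG)) & 1;
}

size_t
bitmap_scan(const unsigned long *bitmap, bool target,
            size_t start, size_t end)
{
    size_t i;

    for (i = start; i < end; i++) {
        if (bitmap_is_set(bitmap, i) == target) {
            break;
        }
    }
    return i;
}

/* Iterate over every 1-bit, as BITMAP_FOR_EACH_1 does. */
#define BITMAP_FOR_EACH_1(IDX, SIZE, BITMAP)                        \
    for ((IDX) = bitmap_scan(BITMAP, 1, 0, SIZE); (IDX) < (SIZE);   \
         (IDX) = bitmap_scan(BITMAP, 1, (IDX) + 1, SIZE))
```

Passing `target` as `false` is what the new parameter buys: the same helper now finds the first 0-bit, e.g. a free slot in an allocation bitmap.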
#include "openvswitch/types.h"
#ifndef __CHECKER__
+#ifndef _WIN32
static inline ovs_be64
htonll(uint64_t n)
{
{
return htonl(1) == 1 ? n : ((uint64_t) ntohl(n) << 32) | ntohl(n >> 32);
}
+#endif /* _WIN32 */
#else
/* Making sparse happy with these functions also makes them unreadable, so
* don't bother to show it their implementations. */
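The fallback `htonll()`/`ntohll()` pair guarded by `#ifndef _WIN32` above builds a 64-bit network-order value from two 32-bit `htonl()`/`ntohl()` calls (on Windows, Winsock already supplies these names, hence the guard). A standalone sketch of the same construction, with hypothetical `my_` prefixes to avoid clashing with system headers:

```c
/* Standalone sketch of the 64-bit byte-order helpers: on a big-endian
 * host htonl(1) == 1 and the value passes through; on little-endian
 * hosts two 32-bit swaps are combined into one 64-bit swap. */
#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>

static uint64_t
my_htonll(uint64_t n)
{
    return htonl(1) == 1
           ? n
           : ((uint64_t) htonl(n) << 32) | htonl(n >> 32);
}

static uint64_t
my_ntohll(uint64_t n)
{
    return htonl(1) == 1
           ? n
           : ((uint64_t) ntohl(n) << 32) | ntohl(n >> 32);
}
```

The key property on either endianness is that the two functions are inverses, so a round trip preserves the value.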
/*
- * Copyright (c) 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
static uint32_t
hash_mpid(uint64_t mpid)
{
- return hash_bytes(&mpid, sizeof mpid, 0);
+ return hash_uint64(mpid);
}
static bool
netdev_close(cfm->netdev);
free(cfm->rmps_array);
- atomic_destroy(&cfm->extended);
- atomic_destroy(&cfm->check_tnl_key);
- ovs_refcount_destroy(&cfm->ref_cnt);
-
free(cfm);
}
if (timer_expired(&cfm->fault_timer)) {
long long int interval = cfm_fault_interval(cfm);
struct remote_mp *rmp, *rmp_next;
- bool old_cfm_fault = cfm->fault;
+ enum cfm_fault_reason old_cfm_fault = cfm->fault;
+ uint64_t old_flap_count = cfm->flap_count;
+ int old_health = cfm->health;
+ size_t old_rmps_array_len = cfm->rmps_array_len;
+ bool old_rmps_deleted = false;
bool old_rmp_opup = cfm->remote_opup;
bool demand_override;
bool rmp_set_opup = false;
cfm->health = 0;
} else {
int exp_ccm_recvd;
- int old_health = cfm->health;
rmp = CONTAINER_OF(hmap_first(&cfm->remote_mps),
struct remote_mp, node);
cfm->health = MIN(cfm->health, 100);
rmp->num_health_ccm = 0;
ovs_assert(cfm->health >= 0 && cfm->health <= 100);
-
- if (cfm->health != old_health) {
- seq_change(connectivity_seq_get());
- }
}
cfm->health_interval = 0;
}
" %lldms", cfm->name, rmp->mpid,
time_msec() - rmp->last_rx);
if (!demand_override) {
+ old_rmps_deleted = true;
hmap_remove(&cfm->remote_mps, &rmp->node);
free(rmp);
}
cfm->remote_opup = true;
}
- if (old_rmp_opup != cfm->remote_opup) {
- seq_change(connectivity_seq_get());
- }
-
if (hmap_is_empty(&cfm->remote_mps)) {
cfm->fault |= CFM_FAULT_RECV;
}
}
/* If there is a flap, increments the counter. */
- if (old_cfm_fault == false || cfm->fault == false) {
+ if (old_cfm_fault == 0 || cfm->fault == 0) {
cfm->flap_count++;
}
+ }
+ /* These variables represent the cfm session status.  It is desirable
+ * to update them in the database immediately after any change. */
+ if (old_health != cfm->health
+ || old_rmp_opup != cfm->remote_opup
+ || (old_rmps_array_len != cfm->rmps_array_len || old_rmps_deleted)
+ || old_cfm_fault != cfm->fault
+ || old_flap_count != cfm->flap_count) {
seq_change(connectivity_seq_get());
}
eth_push_vlan(packet, htons(ETH_TYPE_VLAN), htons(tci));
}
- ccm = packet->l3;
+ ccm = ofpbuf_get_l3(packet);
ccm->mdlevel_version = 0;
ccm->opcode = CCM_OPCODE;
ccm->tlv_offset = 70;
ovs_mutex_lock(&mutex);
eth = p->l2;
- ccm = ofpbuf_at(p, (uint8_t *)p->l3 - (uint8_t *)p->data, CCM_ACCEPT_LEN);
+ ccm = ofpbuf_at(p, (uint8_t *)ofpbuf_get_l3(p) - (uint8_t *)p->data,
+ CCM_ACCEPT_LEN);
if (!ccm) {
VLOG_INFO_RL(&rl, "%s: Received an unparseable 802.1ag CCM heartbeat.",
hash_metadata(ovs_be64 metadata_)
{
uint64_t metadata = (OVS_FORCE uint64_t) metadata_;
- return hash_2words(metadata, metadata >> 32);
+ return hash_uint64(metadata);
}
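Both hunks above replace piecewise hashing (`hash_bytes()`, `hash_2words()`) with a single `hash_uint64()` call on the raw 64-bit value. OVS's actual `hash_uint64()` is not shown in this patch; the sketch below uses the splitmix64 finalizer as an illustrative stand-in, which has the nice property that every step is invertible, so the whole mixer is a bijection on 64-bit values.

```c
/* Illustrative 64-bit mixer (splitmix64 finalizer), standing in for
 * hash_uint64().  This is NOT OVS's implementation; it only shows the
 * one-call pattern that replaced hash_bytes()/hash_2words(). */
#include <assert.h>
#include <stdint.h>

static uint64_t
mix64(uint64_t x)
{
    /* Each xor-shift and odd-constant multiply is invertible, so
     * distinct inputs always yield distinct 64-bit outputs. */
    x ^= x >> 30;
    x *= UINT64_C(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x *= UINT64_C(0x94d049bb133111eb);
    x ^= x >> 31;
    return x;
}

/* hash_mpid()-style wrapper: one mixer call on the raw 64-bit value
 * instead of hashing it byte by byte. */
static uint32_t
hash_mpid_sketch(uint64_t mpid)
{
    return (uint32_t) mix64(mpid);
}
```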
static struct cls_partition *
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#define OVS_PACKED(DECL) __pragma(pack(push, 1)) DECL __pragma(pack(pop))
#endif
+/* For defining a structure whose instances should be aligned on an N-byte
+ * boundary.
+ *
+ * e.g. The following:
+ * OVS_ALIGNED_STRUCT(64, mystruct) { ... };
+ * is equivalent to the following except that it specifies 64-byte alignment:
+ * struct mystruct { ... };
+ */
+#ifndef _MSC_VER
+#define OVS_ALIGNED_STRUCT(N, TAG) struct __attribute__((aligned(N))) TAG
+#else
+#define OVS_ALIGNED_STRUCT(N, TAG) __declspec(align(N)) struct TAG
+#endif
+
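The new `OVS_ALIGNED_STRUCT` macro can be checked with C11 `_Alignof`. A standalone sketch (the macro is reproduced here so the example builds on its own; `struct mystruct` follows the example in the comment above, e.g. to keep an instance on its own 64-byte cache line):

```c
/* Sketch of using OVS_ALIGNED_STRUCT: request 64-byte alignment and
 * verify it with C11 alignof.  The macro body is copied from the
 * compat header so this compiles standalone. */
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#ifndef _MSC_VER
#define OVS_ALIGNED_STRUCT(N, TAG) struct __attribute__((aligned(N))) TAG
#else
#define OVS_ALIGNED_STRUCT(N, TAG) __declspec(align(N)) struct TAG
#endif

OVS_ALIGNED_STRUCT(64, mystruct) {
    int counter;
};
```

Note that raising a struct's alignment also rounds its size up to a multiple of that alignment, so arrays of the struct stay aligned too.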
#ifdef _MSC_VER
#define CCALL __cdecl
#pragma section(".CRT$XCU",read)
#include "poll-loop.h"
#include "vlog.h"
-#pragma comment(lib, "advapi32")
-
VLOG_DEFINE_THIS_MODULE(daemon);
static bool detach; /* Was --service specified? */
if (now >= wakeup) {
break;
}
- sleep(wakeup - now);
+ xsleep(wakeup - now);
}
}
last_restart = time(NULL);
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
static int dpif_linux_init(void);
static int open_dpif(const struct dpif_linux_dp *, struct dpif **);
static uint32_t dpif_linux_port_get_pid(const struct dpif *,
- odp_port_t port_no);
+ odp_port_t port_no, uint32_t hash);
static int dpif_linux_refresh_channels(struct dpif *);
static void dpif_linux_vport_to_ofpbuf(const struct dpif_linux_vport *,
dpif->n_events = dpif->event_offset = 0;
/* Don't close dpif->epoll_fd since that would cause other threads that
- * call dpif_recv_wait(dpif) to wait on an arbitrary fd or a closed fd. */
+ * call dpif_recv_wait() to wait on an arbitrary fd or a closed fd. */
}
static int
}
static uint32_t
-dpif_linux_port_get_pid(const struct dpif *dpif_, odp_port_t port_no)
+dpif_linux_port_get_pid(const struct dpif *dpif_, odp_port_t port_no,
+ uint32_t hash OVS_UNUSED)
{
struct dpif_linux *dpif = dpif_linux_cast(dpif_);
uint32_t port_idx = odp_to_u32(port_no);
unsigned int nl_status = nl_dump_done(&iter->dump);
atomic_read(&iter->status, &dump_status);
- atomic_destroy(&iter->status);
free(iter);
return dump_status ? dump_status : nl_status;
}
return error;
}
+static int
+dpif_linux_handlers_set(struct dpif *dpif_ OVS_UNUSED,
+ uint32_t n_handlers OVS_UNUSED)
+{
+ return 0;
+}
+
static int
dpif_linux_queue_to_priority(const struct dpif *dpif OVS_UNUSED,
uint32_t queue_id, uint32_t *priority)
}
static int
-dpif_linux_recv(struct dpif *dpif_, struct dpif_upcall *upcall,
- struct ofpbuf *buf)
+dpif_linux_recv(struct dpif *dpif_, uint32_t handler_id OVS_UNUSED,
+ struct dpif_upcall *upcall, struct ofpbuf *buf)
{
struct dpif_linux *dpif = dpif_linux_cast(dpif_);
int error;
}
static void
-dpif_linux_recv_wait(struct dpif *dpif_)
+dpif_linux_recv_wait(struct dpif *dpif_, uint32_t handler_id OVS_UNUSED)
{
struct dpif_linux *dpif = dpif_linux_cast(dpif_);
dpif_linux_execute,
dpif_linux_operate,
dpif_linux_recv_set,
+ dpif_linux_handlers_set,
dpif_linux_queue_to_priority,
dpif_linux_recv,
dpif_linux_recv_wait,
#include "list.h"
#include "meta-flow.h"
#include "netdev.h"
+#include "netdev-dpdk.h"
#include "netdev-vport.h"
#include "netlink.h"
#include "odp-execute.h"
#include "odp-util.h"
#include "ofp-print.h"
#include "ofpbuf.h"
+#include "ovs-rcu.h"
#include "packets.h"
#include "poll-loop.h"
#include "random.h"
/* By default, choose a priority in the middle. */
#define NETDEV_RULE_PRIORITY 0x8000
+#define NR_THREADS 1
+
/* Configuration parameters. */
enum { MAX_FLOWS = 65536 }; /* Maximum number of flows in flow table. */
-/* Enough headroom to add a vlan tag, plus an extra 2 bytes to allow IP
- * headers to be aligned on a 4-byte boundary. */
-enum { DP_NETDEV_HEADROOM = 2 + VLAN_HEADER_LEN };
-
/* Queues. */
-enum { N_QUEUES = 2 }; /* Number of queues for dpif_recv(). */
enum { MAX_QUEUE_LEN = 128 }; /* Maximum number of packets per queue. */
enum { QUEUE_MASK = MAX_QUEUE_LEN - 1 };
BUILD_ASSERT_DECL(IS_POW2(MAX_QUEUE_LEN));
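The `IS_POW2` assertion matters because `head` and `tail` in `dp_netdev_queue` are free-running counters masked with `QUEUE_MASK` on each access: full/empty tests become plain subtraction and no modulo is needed. A minimal standalone sketch of that indexing scheme, with `int` elements standing in for the real upcall entries:

```c
/* Minimal sketch of the power-of-two ring indexing used by
 * dp_netdev_queue.  'head' and 'tail' only ever increment; the mask
 * selects the slot, and unsigned wraparound keeps head - tail correct. */
#include <assert.h>
#include <stdbool.h>

enum { MAX_QUEUE_LEN = 128 };           /* Must be a power of 2. */
enum { QUEUE_MASK = MAX_QUEUE_LEN - 1 };

struct ring {
    int slots[MAX_QUEUE_LEN];
    unsigned int head;                  /* Incremented on enqueue. */
    unsigned int tail;                  /* Incremented on dequeue. */
};

static bool
ring_push(struct ring *q, int value)
{
    if (q->head - q->tail >= MAX_QUEUE_LEN) {
        return false;                   /* Full. */
    }
    q->slots[q->head++ & QUEUE_MASK] = value;
    return true;
}

static bool
ring_pop(struct ring *q, int *value)
{
    if (q->tail == q->head) {
        return false;                   /* Empty. */
    }
    *value = q->slots[q->tail++ & QUEUE_MASK];
    return true;
}
```

This is the same idiom as `q->upcalls[q->tail++ & QUEUE_MASK]` in `dp_netdev_purge_queues()` above.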
struct ofpbuf buf; /* ofpbuf instance for upcall.packet. */
};
-/* A queue passing packets from a struct dp_netdev to its clients.
+/* A queue passing packets from a struct dp_netdev to its clients (handlers).
*
*
* Thread-safety
* =============
*
- * Any access at all requires the owning 'dp_netdev''s queue_mutex. */
+ * Any access at all requires the owning 'dp_netdev''s queue_rwlock
+ * and the queue's own mutex. */
struct dp_netdev_queue {
+ struct ovs_mutex mutex;
+ struct seq *seq; /* Incremented whenever a packet is queued. */
struct dp_netdev_upcall upcalls[MAX_QUEUE_LEN] OVS_GUARDED;
unsigned int head OVS_GUARDED;
unsigned int tail OVS_GUARDED;
* port_rwlock
* flow_mutex
* cls.rwlock
- * queue_mutex
+ * queue_rwlock
*/
struct dp_netdev {
const struct dpif_class *const class;
/* Queues.
*
- * Everything in 'queues' is protected by 'queue_mutex'. */
- struct ovs_mutex queue_mutex;
- struct dp_netdev_queue queues[N_QUEUES];
- struct seq *queue_seq; /* Incremented whenever a packet is queued. */
+ * 'queue_rwlock' protects modification of 'handler_queues' and
+ * 'n_handlers'.  Each queue in 'handler_queues' is protected by
+ * its own mutex. */
+ struct fat_rwlock queue_rwlock;
+ struct dp_netdev_queue *handler_queues;
+ uint32_t n_handlers;
/* Statistics.
*
- * ovsthread_counter is internally synchronized. */
- struct ovsthread_counter *n_hit; /* Number of flow table matches. */
- struct ovsthread_counter *n_missed; /* Number of flow table misses. */
- struct ovsthread_counter *n_lost; /* Number of misses not passed up. */
+ * ovsthread_stats is internally synchronized. */
+ struct ovsthread_stats stats; /* Contains 'struct dp_netdev_stats *'. */
/* Ports.
*
/* Forwarding threads. */
struct latch exit_latch;
- struct dp_forwarder *forwarders;
- size_t n_forwarders;
+ struct pmd_thread *pmd_threads;
+ size_t n_pmd_threads;
+ int pmd_count;
};
static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp,
odp_port_t)
OVS_REQ_RDLOCK(dp->port_rwlock);
+enum dp_stat_type {
+ DP_STAT_HIT, /* Packets that matched in the flow table. */
+ DP_STAT_MISS, /* Packets that did not match. */
+ DP_STAT_LOST, /* Packets not passed up to the client. */
+ DP_N_STATS
+};
+
+/* Contained by struct dp_netdev's 'stats' member. */
+struct dp_netdev_stats {
+ struct ovs_mutex mutex; /* Protects 'n'. */
+
+ /* Indexed by DP_STAT_*, protected by 'mutex'. */
+ unsigned long long int n[DP_N_STATS] OVS_GUARDED;
+};
+
+
/* A port in a netdev-based datapath. */
struct dp_netdev_port {
struct hmap_node node; /* Node in dp_netdev's 'ports'. */
odp_port_t port_no;
struct netdev *netdev;
struct netdev_saved_flags *sf;
- struct netdev_rx *rx;
+ struct netdev_rxq **rxq;
+ struct ovs_refcount ref_cnt;
char *type; /* Port type as requested by user. */
};
const struct hmap_node node; /* In owning dp_netdev's 'flow_table'. */
const struct flow flow; /* The flow that created this entry. */
- /* Number of references.
- * The classifier owns one reference.
- * Any thread trying to keep a rule from being freed should hold its own
- * reference. */
- struct ovs_refcount ref_cnt;
-
/* Protects members marked OVS_GUARDED.
*
* Acquire after datapath's flow_mutex. */
/* Statistics.
*
* Reading or writing these members requires 'mutex'. */
- long long int used OVS_GUARDED; /* Last used time, in monotonic msecs. */
- long long int packet_count OVS_GUARDED; /* Number of packets matched. */
- long long int byte_count OVS_GUARDED; /* Number of bytes matched. */
- uint16_t tcp_flags OVS_GUARDED; /* Bitwise-OR of seen tcp_flags values. */
+ struct ovsthread_stats stats; /* Contains "struct dp_netdev_flow_stats". */
/* Actions.
*
* Reading 'actions' requires 'mutex'.
* Writing 'actions' requires 'mutex' and (to allow for transactions) the
* datapath's flow_mutex. */
- struct dp_netdev_actions *actions OVS_GUARDED;
+ OVSRCU_TYPE(struct dp_netdev_actions *) actions;
};
-static struct dp_netdev_flow *dp_netdev_flow_ref(
- const struct dp_netdev_flow *);
-static void dp_netdev_flow_unref(struct dp_netdev_flow *);
+static void dp_netdev_flow_free(struct dp_netdev_flow *);
+
+/* Contained by struct dp_netdev_flow's 'stats' member. */
+struct dp_netdev_flow_stats {
+ struct ovs_mutex mutex; /* Guards all the other members. */
+
+ long long int used OVS_GUARDED; /* Last used time, in monotonic msecs. */
+ long long int packet_count OVS_GUARDED; /* Number of packets matched. */
+ long long int byte_count OVS_GUARDED; /* Number of bytes matched. */
+ uint16_t tcp_flags OVS_GUARDED; /* Bitwise-OR of seen tcp_flags values. */
+};
/* A set of datapath actions within a "struct dp_netdev_flow".
*
 * 'flow' is the dp_netdev_flow for which 'flow->actions == actions' or that
 * owns a reference to 'actions->ref_cnt' (or both). */
struct dp_netdev_actions {
- struct ovs_refcount ref_cnt;
-
/* These members are immutable: they do not change during the struct's
* lifetime. */
struct nlattr *actions; /* Sequence of OVS_ACTION_ATTR_* attributes. */
struct dp_netdev_actions *dp_netdev_actions_create(const struct nlattr *,
size_t);
-struct dp_netdev_actions *dp_netdev_actions_ref(
- const struct dp_netdev_actions *);
-void dp_netdev_actions_unref(struct dp_netdev_actions *);
+struct dp_netdev_actions *dp_netdev_flow_get_actions(
+ const struct dp_netdev_flow *);
+static void dp_netdev_actions_free(struct dp_netdev_actions *);
-/* A thread that receives packets from some ports, looks them up in the flow
- * table, and executes the actions it finds. */
-struct dp_forwarder {
+/* PMD: Poll mode drivers.  A PMD accesses devices via polling to eliminate
+ * the performance overhead of interrupt processing.  Therefore a netdev
+ * cannot implement rx-wait for these devices.  Instead, dpif-netdev must
+ * poll these devices to check their receive buffers, and a pmd thread
+ * polls the devices assigned to it.
+ *
+ * DPDK uses a PMD to access the NIC.
+ *
+ * A pmd thread receives packets from PMD ports, looks them up in the flow
+ * table, and executes the actions it finds.
+ */
+struct pmd_thread {
struct dp_netdev *dp;
pthread_t thread;
+ int id;
+ atomic_uint change_seq;
char *name;
- uint32_t min_hash, max_hash;
};
/* Interface to netdev-based datapath. */
OVS_REQ_WRLOCK(dp->port_rwlock);
static int do_del_port(struct dp_netdev *dp, odp_port_t port_no)
OVS_REQ_WRLOCK(dp->port_rwlock);
+static void dp_netdev_destroy_all_queues(struct dp_netdev *dp)
+ OVS_REQ_WRLOCK(dp->queue_rwlock);
static int dpif_netdev_open(const struct dpif_class *, const char *name,
bool create, struct dpif **);
static int dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf *,
- int queue_no, const struct flow *,
- const struct nlattr *userdata)
- OVS_EXCLUDED(dp->queue_mutex);
+ int queue_no, int type,
+ const struct flow *,
+ const struct nlattr *userdata);
static void dp_netdev_execute_actions(struct dp_netdev *dp,
- const struct flow *, struct ofpbuf *,
+ const struct flow *, struct ofpbuf *, bool may_steal,
struct pkt_metadata *,
const struct nlattr *actions,
- size_t actions_len)
- OVS_REQ_RDLOCK(dp->port_rwlock);
+ size_t actions_len);
static void dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf *packet,
- struct pkt_metadata *)
- OVS_REQ_RDLOCK(dp->port_rwlock);
-static void dp_netdev_set_threads(struct dp_netdev *, int n);
+ struct pkt_metadata *);
+
+static void dp_netdev_set_pmd_threads(struct dp_netdev *, int n);
static struct dpif_netdev *
dpif_netdev_cast(const struct dpif *dpif)
{
struct dp_netdev *dp;
int error;
- int i;
dp = xzalloc(sizeof *dp);
shash_add(&dp_netdevs, name, dp);
*CONST_CAST(const struct dpif_class **, &dp->class) = class;
*CONST_CAST(const char **, &dp->name) = xstrdup(name);
ovs_refcount_init(&dp->ref_cnt);
- atomic_flag_init(&dp->destroyed);
+ atomic_flag_clear(&dp->destroyed);
ovs_mutex_init(&dp->flow_mutex);
classifier_init(&dp->cls, NULL);
hmap_init(&dp->flow_table);
- ovs_mutex_init(&dp->queue_mutex);
- ovs_mutex_lock(&dp->queue_mutex);
- for (i = 0; i < N_QUEUES; i++) {
- dp->queues[i].head = dp->queues[i].tail = 0;
- }
- ovs_mutex_unlock(&dp->queue_mutex);
- dp->queue_seq = seq_create();
+ fat_rwlock_init(&dp->queue_rwlock);
- dp->n_hit = ovsthread_counter_create();
- dp->n_missed = ovsthread_counter_create();
- dp->n_lost = ovsthread_counter_create();
+ ovsthread_stats_init(&dp->stats);
ovs_rwlock_init(&dp->port_rwlock);
hmap_init(&dp->ports);
dp_netdev_free(dp);
return error;
}
- dp_netdev_set_threads(dp, 2);
*dpp = dp;
return 0;
static void
dp_netdev_purge_queues(struct dp_netdev *dp)
+ OVS_REQ_WRLOCK(dp->queue_rwlock)
{
int i;
- ovs_mutex_lock(&dp->queue_mutex);
- for (i = 0; i < N_QUEUES; i++) {
- struct dp_netdev_queue *q = &dp->queues[i];
+ for (i = 0; i < dp->n_handlers; i++) {
+ struct dp_netdev_queue *q = &dp->handler_queues[i];
+ ovs_mutex_lock(&q->mutex);
while (q->tail != q->head) {
struct dp_netdev_upcall *u = &q->upcalls[q->tail++ & QUEUE_MASK];
ofpbuf_uninit(&u->upcall.packet);
ofpbuf_uninit(&u->buf);
}
+ ovs_mutex_unlock(&q->mutex);
}
- ovs_mutex_unlock(&dp->queue_mutex);
}
/* Requires dp_netdev_mutex so that we can't get a new reference to 'dp'
OVS_REQUIRES(dp_netdev_mutex)
{
struct dp_netdev_port *port, *next;
+ struct dp_netdev_stats *bucket;
+ int i;
shash_find_and_delete(&dp_netdevs, dp->name);
- dp_netdev_set_threads(dp, 0);
- free(dp->forwarders);
+ dp_netdev_set_pmd_threads(dp, 0);
+ free(dp->pmd_threads);
dp_netdev_flow_flush(dp);
ovs_rwlock_wrlock(&dp->port_rwlock);
do_del_port(dp, port->port_no);
}
ovs_rwlock_unlock(&dp->port_rwlock);
- ovsthread_counter_destroy(dp->n_hit);
- ovsthread_counter_destroy(dp->n_missed);
- ovsthread_counter_destroy(dp->n_lost);
- dp_netdev_purge_queues(dp);
- seq_destroy(dp->queue_seq);
- ovs_mutex_destroy(&dp->queue_mutex);
+ OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &dp->stats) {
+ ovs_mutex_destroy(&bucket->mutex);
+ free_cacheline(bucket);
+ }
+ ovsthread_stats_destroy(&dp->stats);
+
+ fat_rwlock_wrlock(&dp->queue_rwlock);
+ dp_netdev_destroy_all_queues(dp);
+ fat_rwlock_unlock(&dp->queue_rwlock);
+
+ fat_rwlock_destroy(&dp->queue_rwlock);
classifier_destroy(&dp->cls);
hmap_destroy(&dp->flow_table);
ovs_mutex_destroy(&dp->flow_mutex);
seq_destroy(dp->port_seq);
hmap_destroy(&dp->ports);
- atomic_flag_destroy(&dp->destroyed);
- ovs_refcount_destroy(&dp->ref_cnt);
latch_destroy(&dp->exit_latch);
free(CONST_CAST(char *, dp->name));
free(dp);
dpif_netdev_get_stats(const struct dpif *dpif, struct dpif_dp_stats *stats)
{
struct dp_netdev *dp = get_dp_netdev(dpif);
+ struct dp_netdev_stats *bucket;
+ size_t i;
fat_rwlock_rdlock(&dp->cls.rwlock);
stats->n_flows = hmap_count(&dp->flow_table);
fat_rwlock_unlock(&dp->cls.rwlock);
- stats->n_hit = ovsthread_counter_read(dp->n_hit);
- stats->n_missed = ovsthread_counter_read(dp->n_missed);
- stats->n_lost = ovsthread_counter_read(dp->n_lost);
+ stats->n_hit = stats->n_missed = stats->n_lost = 0;
+ OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &dp->stats) {
+ ovs_mutex_lock(&bucket->mutex);
+ stats->n_hit += bucket->n[DP_STAT_HIT];
+ stats->n_missed += bucket->n[DP_STAT_MISS];
+ stats->n_lost += bucket->n[DP_STAT_LOST];
+ ovs_mutex_unlock(&bucket->mutex);
+ }
stats->n_masks = UINT32_MAX;
stats->n_mask_hit = UINT64_MAX;
return 0;
}
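The `OVSTHREAD_STATS_FOR_EACH_BUCKET` loop above shows the new pattern: each thread updates its own bucket under that bucket's mutex, and a reader sums all buckets, trading a little read-side work for contention-free updates. A simplified standalone sketch (bucket selection here is just `id % N_BUCKETS`; the real `ovsthread_stats` allocates buckets lazily per thread):

```c
/* Simplified sketch of the ovsthread_stats pattern: per-thread stat
 * buckets, each with its own mutex, summed by the reader. */
#include <assert.h>
#include <pthread.h>

enum { N_BUCKETS = 4 };
enum { STAT_HIT, STAT_MISS, STAT_LOST, N_STATS };

struct stats_bucket {
    pthread_mutex_t mutex;
    unsigned long long n[N_STATS];      /* Protected by 'mutex'. */
};

static struct stats_bucket buckets[N_BUCKETS];

static void
stats_init(void)
{
    for (int i = 0; i < N_BUCKETS; i++) {
        pthread_mutex_init(&buckets[i].mutex, NULL);
    }
}

static void
stats_count(int thread_id, int stat)
{
    struct stats_bucket *b = &buckets[thread_id % N_BUCKETS];

    pthread_mutex_lock(&b->mutex);
    b->n[stat]++;
    pthread_mutex_unlock(&b->mutex);
}

static unsigned long long
stats_sum(int stat)
{
    unsigned long long total = 0;

    for (int i = 0; i < N_BUCKETS; i++) {
        pthread_mutex_lock(&buckets[i].mutex);
        total += buckets[i].n[stat];
        pthread_mutex_unlock(&buckets[i].mutex);
    }
    return total;
}
```

This is why `dpif_netdev_get_stats()` now zeroes the totals and accumulates per bucket rather than reading three global counters.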
+static void
+dp_netdev_reload_pmd_threads(struct dp_netdev *dp)
+{
+ int i;
+
+ for (i = 0; i < dp->n_pmd_threads; i++) {
+ struct pmd_thread *f = &dp->pmd_threads[i];
+ int id;
+
+ atomic_add(&f->change_seq, 1, &id);
+ }
+}
+
static int
do_add_port(struct dp_netdev *dp, const char *devname, const char *type,
odp_port_t port_no)
struct netdev_saved_flags *sf;
struct dp_netdev_port *port;
struct netdev *netdev;
- struct netdev_rx *rx;
enum netdev_flags flags;
const char *open_type;
int error;
+ int i;
/* XXX reject devices already in some dp_netdev. */
return EINVAL;
}
- error = netdev_rx_open(netdev, &rx);
- if (error
- && !(error == EOPNOTSUPP && dpif_netdev_class_is_dummy(dp->class))) {
- VLOG_ERR("%s: cannot receive packets on this network device (%s)",
- devname, ovs_strerror(errno));
- netdev_close(netdev);
- return error;
+ port = xzalloc(sizeof *port);
+ port->port_no = port_no;
+ port->netdev = netdev;
+ port->rxq = xmalloc(sizeof *port->rxq * netdev_n_rxq(netdev));
+ port->type = xstrdup(type);
+ for (i = 0; i < netdev_n_rxq(netdev); i++) {
+ error = netdev_rxq_open(netdev, &port->rxq[i], i);
+ if (error
+ && !(error == EOPNOTSUPP && dpif_netdev_class_is_dummy(dp->class))) {
+ VLOG_ERR("%s: cannot receive packets on this network device (%s)",
+ devname, ovs_strerror(errno));
+ netdev_close(netdev);
+ return error;
+ }
}
error = netdev_turn_flags_on(netdev, NETDEV_PROMISC, &sf);
if (error) {
- netdev_rx_close(rx);
+ for (i = 0; i < netdev_n_rxq(netdev); i++) {
+ netdev_rxq_close(port->rxq[i]);
+ }
netdev_close(netdev);
+ free(port->rxq);
+ free(port);
return error;
}
-
- port = xmalloc(sizeof *port);
- port->port_no = port_no;
- port->netdev = netdev;
port->sf = sf;
- port->rx = rx;
- port->type = xstrdup(type);
+
+ if (netdev_is_pmd(netdev)) {
+ dp->pmd_count++;
+ dp_netdev_set_pmd_threads(dp, NR_THREADS);
+ dp_netdev_reload_pmd_threads(dp);
+ }
+ ovs_refcount_init(&port->ref_cnt);
hmap_insert(&dp->ports, &port->node, hash_int(odp_to_u32(port_no), 0));
seq_change(dp->port_seq);
}
}
+static void
+port_ref(struct dp_netdev_port *port)
+{
+ if (port) {
+ ovs_refcount_ref(&port->ref_cnt);
+ }
+}
+
+static void
+port_unref(struct dp_netdev_port *port)
+{
+ if (port && ovs_refcount_unref(&port->ref_cnt) == 1) {
+ int i;
+
+ netdev_close(port->netdev);
+ netdev_restore_flags(port->sf);
+
+ for (i = 0; i < netdev_n_rxq(port->netdev); i++) {
+ netdev_rxq_close(port->rxq[i]);
+ }
+ free(port->type);
+ free(port);
+ }
+}
+
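The `port_ref()`/`port_unref()` pair above follows the usual atomic refcount pattern: the unref observes the pre-decrement count, so the caller that sees 1 held the last reference and may free the object. A standalone sketch with C11 atomics (OVS's `ovs_refcount` is a thin wrapper over the same idea; `struct port` here is a hypothetical stand-in, not the dp_netdev_port above):

```c
/* Sketch of the refcount pattern: atomic_fetch_sub() returns the value
 * before the decrement, so "== 1" identifies the last holder. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct port {
    atomic_uint ref_cnt;
    int dummy_state;
};

static struct port *
port_create(void)
{
    struct port *p = calloc(1, sizeof *p);
    atomic_init(&p->ref_cnt, 1);        /* Creator holds one reference. */
    return p;
}

static void
port_ref(struct port *p)
{
    if (p) {
        atomic_fetch_add(&p->ref_cnt, 1);
    }
}

/* Returns true if this call released the last reference. */
static bool
port_unref(struct port *p)
{
    if (p && atomic_fetch_sub(&p->ref_cnt, 1) == 1) {
        free(p);
        return true;
    }
    return false;
}
```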
static int
get_port_by_name(struct dp_netdev *dp,
const char *devname, struct dp_netdev_port **portp)
hmap_remove(&dp->ports, &port->node);
seq_change(dp->port_seq);
+ if (netdev_is_pmd(port->netdev)) {
+ dp_netdev_reload_pmd_threads(dp);
+ }
- netdev_close(port->netdev);
- netdev_restore_flags(port->sf);
- netdev_rx_close(port->rx);
- free(port->type);
- free(port);
-
+ port_unref(port);
return 0;
}
return error;
}
+static void
+dp_netdev_flow_free(struct dp_netdev_flow *flow)
+{
+ struct dp_netdev_flow_stats *bucket;
+ size_t i;
+
+ OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &flow->stats) {
+ ovs_mutex_destroy(&bucket->mutex);
+ free_cacheline(bucket);
+ }
+ ovsthread_stats_destroy(&flow->stats);
+
+ cls_rule_destroy(CONST_CAST(struct cls_rule *, &flow->cr));
+ dp_netdev_actions_free(dp_netdev_flow_get_actions(flow));
+ ovs_mutex_destroy(&flow->mutex);
+ free(flow);
+}
+
static void
dp_netdev_remove_flow(struct dp_netdev *dp, struct dp_netdev_flow *flow)
OVS_REQ_WRLOCK(dp->cls.rwlock)
classifier_remove(&dp->cls, cr);
hmap_remove(&dp->flow_table, node);
- dp_netdev_flow_unref(flow);
-}
-
-static struct dp_netdev_flow *
-dp_netdev_flow_ref(const struct dp_netdev_flow *flow_)
-{
- struct dp_netdev_flow *flow = CONST_CAST(struct dp_netdev_flow *, flow_);
- if (flow) {
- ovs_refcount_ref(&flow->ref_cnt);
- }
- return flow;
-}
-
-static void
-dp_netdev_flow_unref(struct dp_netdev_flow *flow)
-{
- if (flow && ovs_refcount_unref(&flow->ref_cnt) == 1) {
- cls_rule_destroy(CONST_CAST(struct cls_rule *, &flow->cr));
- ovs_mutex_lock(&flow->mutex);
- dp_netdev_actions_unref(flow->actions);
- ovs_refcount_destroy(&flow->ref_cnt);
- ovs_mutex_unlock(&flow->mutex);
- ovs_mutex_destroy(&flow->mutex);
- free(flow);
- }
+ ovsrcu_postpone(dp_netdev_flow_free, flow);
}
static void
fat_rwlock_rdlock(&dp->cls.rwlock);
netdev_flow = dp_netdev_flow_cast(classifier_lookup(&dp->cls, flow, NULL));
- dp_netdev_flow_ref(netdev_flow);
fat_rwlock_unlock(&dp->cls.rwlock);
return netdev_flow;
HMAP_FOR_EACH_WITH_HASH (netdev_flow, node, flow_hash(flow, 0),
&dp->flow_table) {
if (flow_equal(&netdev_flow->flow, flow)) {
- return dp_netdev_flow_ref(netdev_flow);
+ return netdev_flow;
}
}
static void
get_dpif_flow_stats(struct dp_netdev_flow *netdev_flow,
struct dpif_flow_stats *stats)
- OVS_REQ_RDLOCK(netdev_flow->mutex)
{
- stats->n_packets = netdev_flow->packet_count;
- stats->n_bytes = netdev_flow->byte_count;
- stats->used = netdev_flow->used;
- stats->tcp_flags = netdev_flow->tcp_flags;
+ struct dp_netdev_flow_stats *bucket;
+ size_t i;
+
+ memset(stats, 0, sizeof *stats);
+ OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &netdev_flow->stats) {
+ ovs_mutex_lock(&bucket->mutex);
+ stats->n_packets += bucket->packet_count;
+ stats->n_bytes += bucket->byte_count;
+ stats->used = MAX(stats->used, bucket->used);
+ stats->tcp_flags |= bucket->tcp_flags;
+ ovs_mutex_unlock(&bucket->mutex);
+ }
}
static int
fat_rwlock_unlock(&dp->cls.rwlock);
if (netdev_flow) {
- struct dp_netdev_actions *actions = NULL;
-
- ovs_mutex_lock(&netdev_flow->mutex);
if (stats) {
get_dpif_flow_stats(netdev_flow, stats);
}
- if (actionsp) {
- actions = dp_netdev_actions_ref(netdev_flow->actions);
- }
- ovs_mutex_unlock(&netdev_flow->mutex);
-
- dp_netdev_flow_unref(netdev_flow);
if (actionsp) {
+ struct dp_netdev_actions *actions;
+
+ actions = dp_netdev_flow_get_actions(netdev_flow);
*actionsp = ofpbuf_clone_data(actions->actions, actions->size);
- dp_netdev_actions_unref(actions);
}
- } else {
+ } else {
error = ENOENT;
}
netdev_flow = xzalloc(sizeof *netdev_flow);
*CONST_CAST(struct flow *, &netdev_flow->flow) = *flow;
- ovs_refcount_init(&netdev_flow->ref_cnt);
ovs_mutex_init(&netdev_flow->mutex);
- ovs_mutex_lock(&netdev_flow->mutex);
- netdev_flow->actions = dp_netdev_actions_create(actions, actions_len);
+ ovsthread_stats_init(&netdev_flow->stats);
+
+ ovsrcu_set(&netdev_flow->actions,
+ dp_netdev_actions_create(actions, actions_len));
match_init(&match, flow, wc);
cls_rule_init(CONST_CAST(struct cls_rule *, &netdev_flow->cr),
flow_hash(flow, 0));
fat_rwlock_unlock(&dp->cls.rwlock);
- ovs_mutex_unlock(&netdev_flow->mutex);
-
return 0;
}
static void
clear_stats(struct dp_netdev_flow *netdev_flow)
- OVS_REQUIRES(netdev_flow->mutex)
{
- netdev_flow->used = 0;
- netdev_flow->packet_count = 0;
- netdev_flow->byte_count = 0;
- netdev_flow->tcp_flags = 0;
+ struct dp_netdev_flow_stats *bucket;
+ size_t i;
+
+ OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &netdev_flow->stats) {
+ ovs_mutex_lock(&bucket->mutex);
+ bucket->used = 0;
+ bucket->packet_count = 0;
+ bucket->byte_count = 0;
+ bucket->tcp_flags = 0;
+ ovs_mutex_unlock(&bucket->mutex);
+ }
}
static int
new_actions = dp_netdev_actions_create(put->actions,
put->actions_len);
- ovs_mutex_lock(&netdev_flow->mutex);
- old_actions = netdev_flow->actions;
- netdev_flow->actions = new_actions;
+ old_actions = dp_netdev_flow_get_actions(netdev_flow);
+ ovsrcu_set(&netdev_flow->actions, new_actions);
+
if (put->stats) {
get_dpif_flow_stats(netdev_flow, put->stats);
}
if (put->flags & DPIF_FP_ZERO_STATS) {
clear_stats(netdev_flow);
}
- ovs_mutex_unlock(&netdev_flow->mutex);
- dp_netdev_actions_unref(old_actions);
+ ovsrcu_postpone(dp_netdev_actions_free, old_actions);
} else if (put->flags & DPIF_FP_CREATE) {
error = EEXIST;
} else {
/* Overlapping flow. */
error = EINVAL;
}
- dp_netdev_flow_unref(netdev_flow);
}
ovs_mutex_unlock(&dp->flow_mutex);
netdev_flow = dp_netdev_find_flow(dp, &key);
if (netdev_flow) {
if (del->stats) {
- ovs_mutex_lock(&netdev_flow->mutex);
get_dpif_flow_stats(netdev_flow, del->stats);
- ovs_mutex_unlock(&netdev_flow->mutex);
}
dp_netdev_remove_flow(dp, netdev_flow);
- dp_netdev_flow_unref(netdev_flow);
} else {
error = ENOENT;
}
{
struct dp_netdev_flow_state *state = state_;
- dp_netdev_actions_unref(state->actions);
free(state);
}
return 0;
}
+/* XXX the caller must use 'actions' without quiescing */
static int
dpif_netdev_flow_dump_next(const struct dpif *dpif, void *iter_, void *state_,
const struct nlattr **key, size_t *key_len,
node = hmap_at_position(&dp->flow_table, &iter->bucket, &iter->offset);
if (node) {
netdev_flow = CONTAINER_OF(node, struct dp_netdev_flow, node);
- dp_netdev_flow_ref(netdev_flow);
}
fat_rwlock_unlock(&dp->cls.rwlock);
if (!node) {
}
if (actions || stats) {
- dp_netdev_actions_unref(state->actions);
state->actions = NULL;
- ovs_mutex_lock(&netdev_flow->mutex);
if (actions) {
- state->actions = dp_netdev_actions_ref(netdev_flow->actions);
+ state->actions = dp_netdev_flow_get_actions(netdev_flow);
*actions = state->actions->actions;
*actions_len = state->actions->size;
}
+
if (stats) {
get_dpif_flow_stats(netdev_flow, &state->stats);
*stats = &state->stats;
}
- ovs_mutex_unlock(&netdev_flow->mutex);
}
- dp_netdev_flow_unref(netdev_flow);
-
return 0;
}
flow_extract(execute->packet, md, &key);
ovs_rwlock_rdlock(&dp->port_rwlock);
- dp_netdev_execute_actions(dp, &key, execute->packet, md, execute->actions,
- execute->actions_len);
+ dp_netdev_execute_actions(dp, &key, execute->packet, false, md,
+ execute->actions, execute->actions_len);
ovs_rwlock_unlock(&dp->port_rwlock);
return 0;
}
+static void
+dp_netdev_destroy_all_queues(struct dp_netdev *dp)
+ OVS_REQ_WRLOCK(dp->queue_rwlock)
+{
+ size_t i;
+
+ dp_netdev_purge_queues(dp);
+
+ for (i = 0; i < dp->n_handlers; i++) {
+ struct dp_netdev_queue *q = &dp->handler_queues[i];
+
+ ovs_mutex_destroy(&q->mutex);
+ seq_destroy(q->seq);
+ }
+ free(dp->handler_queues);
+ dp->handler_queues = NULL;
+ dp->n_handlers = 0;
+}
+
+static void
+dp_netdev_refresh_queues(struct dp_netdev *dp, uint32_t n_handlers)
+ OVS_REQ_WRLOCK(dp->queue_rwlock)
+{
+ if (dp->n_handlers != n_handlers) {
+ size_t i;
+
+ dp_netdev_destroy_all_queues(dp);
+
+ dp->n_handlers = n_handlers;
+ dp->handler_queues = xzalloc(n_handlers * sizeof *dp->handler_queues);
+
+ for (i = 0; i < n_handlers; i++) {
+ struct dp_netdev_queue *q = &dp->handler_queues[i];
+
+ ovs_mutex_init(&q->mutex);
+ q->seq = seq_create();
+ }
+ }
+}
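The queue refresh above tears down and rebuilds the per-handler queue array only when the handler count actually changes. A standalone sketch of that pattern, with plain structs standing in for the real mutex/seq machinery (all names here are illustrative, not OVS APIs):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct queue {
    unsigned int head;
    unsigned int tail;
};

struct dp {
    struct queue *queues;
    uint32_t n_queues;
};

/* Rebuild the queue array only if 'n' differs from the current count;
 * a matching count is a no-op, as in dp_netdev_refresh_queues(). */
static void
refresh_queues(struct dp *dp, uint32_t n)
{
    if (dp->n_queues != n) {
        free(dp->queues);
        dp->queues = calloc(n, sizeof *dp->queues);
        dp->n_queues = n;
    }
}
```

Calling it twice with the same count leaves the existing array (and any queued state) untouched, which is why `handlers_set` with an unchanged `n_handlers` does not drop upcalls.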
+
static int
-dpif_netdev_recv_set(struct dpif *dpif OVS_UNUSED, bool enable OVS_UNUSED)
+dpif_netdev_recv_set(struct dpif *dpif, bool enable)
{
+ struct dp_netdev *dp = get_dp_netdev(dpif);
+
+ if ((dp->handler_queues != NULL) == enable) {
+ return 0;
+ }
+
+ fat_rwlock_wrlock(&dp->queue_rwlock);
+ if (!enable) {
+ dp_netdev_destroy_all_queues(dp);
+ } else {
+ dp_netdev_refresh_queues(dp, 1);
+ }
+ fat_rwlock_unlock(&dp->queue_rwlock);
+
+ return 0;
+}
+
+static int
+dpif_netdev_handlers_set(struct dpif *dpif, uint32_t n_handlers)
+{
+ struct dp_netdev *dp = get_dp_netdev(dpif);
+
+ fat_rwlock_wrlock(&dp->queue_rwlock);
+ if (dp->handler_queues) {
+ dp_netdev_refresh_queues(dp, n_handlers);
+ }
+ fat_rwlock_unlock(&dp->queue_rwlock);
+
return 0;
}
return 0;
}
-static struct dp_netdev_queue *
-find_nonempty_queue(struct dp_netdev *dp)
- OVS_REQUIRES(dp->queue_mutex)
+static bool
+dp_netdev_recv_check(const struct dp_netdev *dp, const uint32_t handler_id)
+ OVS_REQ_RDLOCK(dp->queue_rwlock)
{
- int i;
+ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
- for (i = 0; i < N_QUEUES; i++) {
- struct dp_netdev_queue *q = &dp->queues[i];
- if (q->head != q->tail) {
- return q;
- }
+ if (!dp->handler_queues) {
+        VLOG_WARN_RL(&rl, "upcall receiving disabled");
+ return false;
}
- return NULL;
+
+ if (handler_id >= dp->n_handlers) {
+        VLOG_WARN_RL(&rl, "handler index out of bounds");
+ return false;
+ }
+
+ return true;
}
static int
-dpif_netdev_recv(struct dpif *dpif, struct dpif_upcall *upcall,
- struct ofpbuf *buf)
+dpif_netdev_recv(struct dpif *dpif, uint32_t handler_id,
+ struct dpif_upcall *upcall, struct ofpbuf *buf)
{
struct dp_netdev *dp = get_dp_netdev(dpif);
struct dp_netdev_queue *q;
- int error;
+ int error = 0;
+
+ fat_rwlock_rdlock(&dp->queue_rwlock);
- ovs_mutex_lock(&dp->queue_mutex);
- q = find_nonempty_queue(dp);
- if (q) {
+ if (!dp_netdev_recv_check(dp, handler_id)) {
+ error = EAGAIN;
+ goto out;
+ }
+
+ q = &dp->handler_queues[handler_id];
+ ovs_mutex_lock(&q->mutex);
+ if (q->head != q->tail) {
struct dp_netdev_upcall *u = &q->upcalls[q->tail++ & QUEUE_MASK];
*upcall = u->upcall;
ofpbuf_uninit(buf);
*buf = u->buf;
-
- error = 0;
} else {
error = EAGAIN;
}
- ovs_mutex_unlock(&dp->queue_mutex);
+ ovs_mutex_unlock(&q->mutex);
+
+out:
+ fat_rwlock_unlock(&dp->queue_rwlock);
return error;
}
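The `q->head`/`q->tail` bookkeeping above is a power-of-two ring buffer: counters only ever increment, `& QUEUE_MASK` maps them to slots, and unsigned wraparound keeps `head - tail` correct. A minimal self-contained sketch of the same indexing scheme (names hypothetical, not the OVS types):

```c
#include <assert.h>

#define QUEUE_SIZE 64                /* must be a power of two */
#define QUEUE_MASK (QUEUE_SIZE - 1)

struct ring {
    unsigned int head;               /* incremented on enqueue */
    unsigned int tail;               /* incremented on dequeue */
    int slots[QUEUE_SIZE];
};

/* Returns 0 on success, -1 if the ring is full. */
static int
ring_push(struct ring *r, int v)
{
    if (r->head - r->tail >= QUEUE_SIZE) {
        return -1;                   /* full: would overwrite unread slot */
    }
    r->slots[r->head++ & QUEUE_MASK] = v;
    return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int
ring_pop(struct ring *r, int *v)
{
    if (r->head == r->tail) {
        return -1;                   /* empty, like the EAGAIN case above */
    }
    *v = r->slots[r->tail++ & QUEUE_MASK];
    return 0;
}
```

Because the counters are unsigned, `head - tail` yields the occupancy even after either counter wraps past `UINT_MAX`, so no modulo on the counters themselves is needed.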
static void
-dpif_netdev_recv_wait(struct dpif *dpif)
+dpif_netdev_recv_wait(struct dpif *dpif, uint32_t handler_id)
{
struct dp_netdev *dp = get_dp_netdev(dpif);
+ struct dp_netdev_queue *q;
uint64_t seq;
- ovs_mutex_lock(&dp->queue_mutex);
- seq = seq_read(dp->queue_seq);
- if (find_nonempty_queue(dp)) {
+ fat_rwlock_rdlock(&dp->queue_rwlock);
+
+ if (!dp_netdev_recv_check(dp, handler_id)) {
+ goto out;
+ }
+
+ q = &dp->handler_queues[handler_id];
+ ovs_mutex_lock(&q->mutex);
+ seq = seq_read(q->seq);
+ if (q->head != q->tail) {
poll_immediate_wake();
} else {
- seq_wait(dp->queue_seq, seq);
+ seq_wait(q->seq, seq);
}
- ovs_mutex_unlock(&dp->queue_mutex);
+
+ ovs_mutex_unlock(&q->mutex);
+
+out:
+ fat_rwlock_unlock(&dp->queue_rwlock);
}
static void
{
struct dpif_netdev *dpif_netdev = dpif_netdev_cast(dpif);
+ fat_rwlock_wrlock(&dpif_netdev->dp->queue_rwlock);
dp_netdev_purge_queues(dpif_netdev->dp);
+ fat_rwlock_unlock(&dpif_netdev->dp->queue_rwlock);
}
\f
/* Creates and returns a new 'struct dp_netdev_actions', with a reference count
struct dp_netdev_actions *netdev_actions;
netdev_actions = xmalloc(sizeof *netdev_actions);
- ovs_refcount_init(&netdev_actions->ref_cnt);
netdev_actions->actions = xmemdup(actions, size);
netdev_actions->size = size;
return netdev_actions;
}
-/* Increments 'actions''s refcount. */
struct dp_netdev_actions *
-dp_netdev_actions_ref(const struct dp_netdev_actions *actions_)
+dp_netdev_flow_get_actions(const struct dp_netdev_flow *flow)
{
- struct dp_netdev_actions *actions;
+ return ovsrcu_get(struct dp_netdev_actions *, &flow->actions);
+}
+
+static void
+dp_netdev_actions_free(struct dp_netdev_actions *actions)
+{
+ free(actions->actions);
+ free(actions);
+}
+\f
+
+static void
+dp_netdev_process_rxq_port(struct dp_netdev *dp,
+ struct dp_netdev_port *port,
+ struct netdev_rxq *rxq)
+{
+ struct ofpbuf *packet[NETDEV_MAX_RX_BATCH];
+ int error, c;
+
+ error = netdev_rxq_recv(rxq, packet, &c);
+ if (!error) {
+ struct pkt_metadata md = PKT_METADATA_INITIALIZER(port->port_no);
+ int i;
- actions = CONST_CAST(struct dp_netdev_actions *, actions_);
- if (actions) {
- ovs_refcount_ref(&actions->ref_cnt);
+ for (i = 0; i < c; i++) {
+ dp_netdev_port_input(dp, packet[i], &md);
+ }
+ } else if (error != EAGAIN && error != EOPNOTSUPP) {
+ static struct vlog_rate_limit rl
+ = VLOG_RATE_LIMIT_INIT(1, 5);
+
+ VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
+ netdev_get_name(port->netdev),
+ ovs_strerror(error));
}
- return actions;
}
-/* Decrements 'actions''s refcount and frees 'actions' if the refcount reaches
- * 0. */
-void
-dp_netdev_actions_unref(struct dp_netdev_actions *actions)
+static void
+dpif_netdev_run(struct dpif *dpif)
{
- if (actions && ovs_refcount_unref(&actions->ref_cnt) == 1) {
- ovs_refcount_destroy(&actions->ref_cnt);
- free(actions->actions);
- free(actions);
+ struct dp_netdev_port *port;
+ struct dp_netdev *dp = get_dp_netdev(dpif);
+
+ ovs_rwlock_rdlock(&dp->port_rwlock);
+
+ HMAP_FOR_EACH (port, node, &dp->ports) {
+ if (!netdev_is_pmd(port->netdev)) {
+ int i;
+
+ for (i = 0; i < netdev_n_rxq(port->netdev); i++) {
+ dp_netdev_process_rxq_port(dp, port, port->rxq[i]);
+ }
+ }
}
+
+ ovs_rwlock_unlock(&dp->port_rwlock);
}
-\f
+
+static void
+dpif_netdev_wait(struct dpif *dpif)
+{
+ struct dp_netdev_port *port;
+ struct dp_netdev *dp = get_dp_netdev(dpif);
+
+ ovs_rwlock_rdlock(&dp->port_rwlock);
+
+ HMAP_FOR_EACH (port, node, &dp->ports) {
+ if (!netdev_is_pmd(port->netdev)) {
+ int i;
+
+ for (i = 0; i < netdev_n_rxq(port->netdev); i++) {
+ netdev_rxq_wait(port->rxq[i]);
+ }
+ }
+ }
+ ovs_rwlock_unlock(&dp->port_rwlock);
+}
+
+struct rxq_poll {
+ struct dp_netdev_port *port;
+ struct netdev_rxq *rx;
+};
+
+static int
+pmd_load_queues(struct pmd_thread *f,
+ struct rxq_poll **ppoll_list, int poll_cnt)
+{
+ struct dp_netdev *dp = f->dp;
+ struct rxq_poll *poll_list = *ppoll_list;
+ struct dp_netdev_port *port;
+ int id = f->id;
+ int index;
+ int i;
+
+ /* Simple scheduler for netdev rx polling. */
+ ovs_rwlock_rdlock(&dp->port_rwlock);
+ for (i = 0; i < poll_cnt; i++) {
+ port_unref(poll_list[i].port);
+ }
+
+ poll_cnt = 0;
+ index = 0;
+
+ HMAP_FOR_EACH (port, node, &f->dp->ports) {
+ if (netdev_is_pmd(port->netdev)) {
+ int i;
+
+ for (i = 0; i < netdev_n_rxq(port->netdev); i++) {
+ if ((index % dp->n_pmd_threads) == id) {
+                poll_list = xrealloc(poll_list,
+                                     sizeof *poll_list * (poll_cnt + 1));
+
+ port_ref(port);
+ poll_list[poll_cnt].port = port;
+ poll_list[poll_cnt].rx = port->rxq[i];
+ poll_cnt++;
+ }
+ index++;
+ }
+ }
+ }
+
+ ovs_rwlock_unlock(&dp->port_rwlock);
+ *ppoll_list = poll_list;
+ return poll_cnt;
+}
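The scheduler in `pmd_load_queues()` numbers every pmd rx queue globally and gives thread `id` each queue whose index satisfies `index % n_pmd_threads == id`. The assignment logic in isolation might look like this (a sketch; the helper name and flat index array are inventions for illustration):

```c
#include <assert.h>

/* Fill 'owned' with the global rx-queue indexes assigned to 'thread_id'
 * when 'n_rxq' queues are distributed round-robin over 'n_threads'
 * threads.  Returns the number of queues assigned. */
static int
assign_rxqs(int n_rxq, int n_threads, int thread_id,
            int owned[], int max_owned)
{
    int cnt = 0;

    for (int i = 0; i < n_rxq && cnt < max_owned; i++) {
        if (i % n_threads == thread_id) {
            owned[cnt++] = i;
        }
    }
    return cnt;
}
```

Every queue lands on exactly one thread, and the per-thread counts differ by at most one, which is the property the simple modulo scheduler relies on.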
+
static void *
-dp_forwarder_main(void *f_)
+pmd_thread_main(void *f_)
{
- struct dp_forwarder *f = f_;
+ struct pmd_thread *f = f_;
struct dp_netdev *dp = f->dp;
- struct ofpbuf packet;
+ unsigned int lc = 0;
+ struct rxq_poll *poll_list;
+ unsigned int port_seq;
+ int poll_cnt;
+ int i;
- f->name = xasprintf("forwarder_%u", ovsthread_id_self());
+ f->name = xasprintf("pmd_%u", ovsthread_id_self());
set_subprogram_name("%s", f->name);
+ poll_cnt = 0;
+ poll_list = NULL;
- ofpbuf_init(&packet, 0);
- while (!latch_is_set(&dp->exit_latch)) {
- bool received_anything;
+ pmd_thread_setaffinity_cpu(f->id);
+reload:
+ poll_cnt = pmd_load_queues(f, &poll_list, poll_cnt);
+ atomic_read(&f->change_seq, &port_seq);
+
+ for (;;) {
+ unsigned int c_port_seq;
int i;
- ovs_rwlock_rdlock(&dp->port_rwlock);
- for (i = 0; i < 50; i++) {
- struct dp_netdev_port *port;
-
- received_anything = false;
- HMAP_FOR_EACH (port, node, &f->dp->ports) {
- if (port->rx
- && port->node.hash >= f->min_hash
- && port->node.hash <= f->max_hash) {
- int buf_size;
- int error;
- int mtu;
-
- if (netdev_get_mtu(port->netdev, &mtu)) {
- mtu = ETH_PAYLOAD_MAX;
- }
- buf_size = DP_NETDEV_HEADROOM + VLAN_ETH_HEADER_LEN + mtu;
-
- ofpbuf_clear(&packet);
- ofpbuf_reserve_with_tailroom(&packet, DP_NETDEV_HEADROOM,
- buf_size);
-
- error = netdev_rx_recv(port->rx, &packet);
- if (!error) {
- struct pkt_metadata md
- = PKT_METADATA_INITIALIZER(port->port_no);
-
- dp_netdev_port_input(dp, &packet, &md);
- received_anything = true;
- } else if (error != EAGAIN && error != EOPNOTSUPP) {
- static struct vlog_rate_limit rl
- = VLOG_RATE_LIMIT_INIT(1, 5);
-
- VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
- netdev_get_name(port->netdev),
- ovs_strerror(error));
- }
- }
- }
+ for (i = 0; i < poll_cnt; i++) {
+ dp_netdev_process_rxq_port(dp, poll_list[i].port, poll_list[i].rx);
+ }
- if (!received_anything) {
+ if (lc++ > 1024) {
+ ovsrcu_quiesce();
+
+            /* TODO: need a completely userspace-based signaling method
+             * to keep this thread entirely in userspace.
+             * For now, use an atomic counter. */
+ lc = 0;
+            atomic_read_explicit(&f->change_seq, &c_port_seq,
+                                 memory_order_consume);
+ if (c_port_seq != port_seq) {
break;
}
}
+ }
- if (received_anything) {
- poll_immediate_wake();
- } else {
- struct dp_netdev_port *port;
-
- HMAP_FOR_EACH (port, node, &f->dp->ports)
- if (port->rx
- && port->node.hash >= f->min_hash
- && port->node.hash <= f->max_hash) {
- netdev_rx_wait(port->rx);
- }
- seq_wait(dp->port_seq, seq_read(dp->port_seq));
- latch_wait(&dp->exit_latch);
- }
- ovs_rwlock_unlock(&dp->port_rwlock);
+    if (!latch_is_set(&f->dp->exit_latch)) {
+ goto reload;
+ }
- poll_block();
+ for (i = 0; i < poll_cnt; i++) {
+ port_unref(poll_list[i].port);
}
- ofpbuf_uninit(&packet);
+ free(poll_list);
free(f->name);
-
return NULL;
}
static void
-dp_netdev_set_threads(struct dp_netdev *dp, int n)
+dp_netdev_set_pmd_threads(struct dp_netdev *dp, int n)
{
int i;
- if (n == dp->n_forwarders) {
+ if (n == dp->n_pmd_threads) {
return;
}
/* Stop existing threads. */
latch_set(&dp->exit_latch);
- for (i = 0; i < dp->n_forwarders; i++) {
- struct dp_forwarder *f = &dp->forwarders[i];
+ dp_netdev_reload_pmd_threads(dp);
+ for (i = 0; i < dp->n_pmd_threads; i++) {
+ struct pmd_thread *f = &dp->pmd_threads[i];
xpthread_join(f->thread, NULL);
}
latch_poll(&dp->exit_latch);
- free(dp->forwarders);
+ free(dp->pmd_threads);
/* Start new threads. */
- dp->forwarders = xmalloc(n * sizeof *dp->forwarders);
- dp->n_forwarders = n;
+ dp->pmd_threads = xmalloc(n * sizeof *dp->pmd_threads);
+ dp->n_pmd_threads = n;
+
for (i = 0; i < n; i++) {
- struct dp_forwarder *f = &dp->forwarders[i];
+ struct pmd_thread *f = &dp->pmd_threads[i];
f->dp = dp;
- f->min_hash = UINT32_MAX / n * i;
- f->max_hash = UINT32_MAX / n * (i + 1) - 1;
- if (i == n - 1) {
- f->max_hash = UINT32_MAX;
- }
- xpthread_create(&f->thread, NULL, dp_forwarder_main, f);
+ f->id = i;
+ atomic_store(&f->change_seq, 1);
+
+        /* The pmd threads will distribute all of the devices' rx queues
+         * among themselves. */
+ xpthread_create(&f->thread, NULL, pmd_thread_main, f);
}
}
+
\f
+static void *
+dp_netdev_flow_stats_new_cb(void)
+{
+ struct dp_netdev_flow_stats *bucket = xzalloc_cacheline(sizeof *bucket);
+ ovs_mutex_init(&bucket->mutex);
+ return bucket;
+}
+
static void
dp_netdev_flow_used(struct dp_netdev_flow *netdev_flow,
- const struct ofpbuf *packet)
- OVS_REQUIRES(netdev_flow->mutex)
+ const struct ofpbuf *packet,
+ const struct flow *key)
+{
+ uint16_t tcp_flags = ntohs(key->tcp_flags);
+ long long int now = time_msec();
+ struct dp_netdev_flow_stats *bucket;
+
+ bucket = ovsthread_stats_bucket_get(&netdev_flow->stats,
+ dp_netdev_flow_stats_new_cb);
+
+ ovs_mutex_lock(&bucket->mutex);
+ bucket->used = MAX(now, bucket->used);
+ bucket->packet_count++;
+ bucket->byte_count += packet->size;
+ bucket->tcp_flags |= tcp_flags;
+ ovs_mutex_unlock(&bucket->mutex);
+}
+
+static void *
+dp_netdev_stats_new_cb(void)
{
- netdev_flow->used = time_msec();
- netdev_flow->packet_count++;
- netdev_flow->byte_count += packet->size;
- netdev_flow->tcp_flags |= packet_get_tcp_flags(packet, &netdev_flow->flow);
+ struct dp_netdev_stats *bucket = xzalloc_cacheline(sizeof *bucket);
+ ovs_mutex_init(&bucket->mutex);
+ return bucket;
+}
+
+static void
+dp_netdev_count_packet(struct dp_netdev *dp, enum dp_stat_type type)
+{
+ struct dp_netdev_stats *bucket;
+
+ bucket = ovsthread_stats_bucket_get(&dp->stats, dp_netdev_stats_new_cb);
+ ovs_mutex_lock(&bucket->mutex);
+ bucket->n[type]++;
+ ovs_mutex_unlock(&bucket->mutex);
}
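The `ovsthread_stats` buckets introduced above let each thread bump its own counters without contending on a shared cache line; readers sum all buckets on demand. A stripped-down sketch of that aggregation idea, without the lazy bucket allocation or mutexes of the real code (all names illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define N_BUCKETS 4        /* hypothetically, one bucket per thread */

struct stats_bucket {
    uint64_t packet_count;
    uint64_t byte_count;
};

/* Each thread writes only to its own bucket, so updates need no
 * cross-thread synchronization in this simplified model. */
static void
bucket_add(struct stats_bucket b[], int thread_id, uint64_t bytes)
{
    b[thread_id].packet_count++;
    b[thread_id].byte_count += bytes;
}

/* A reader computes the datapath-wide total by summing every bucket. */
static uint64_t
total_packets(const struct stats_bucket b[])
{
    uint64_t sum = 0;

    for (int i = 0; i < N_BUCKETS; i++) {
        sum += b[i].packet_count;
    }
    return sum;
}
```

This is the same shape as `get_dpif_flow_stats()` iterating `OVSTHREAD_STATS_FOR_EACH_BUCKET`: writes are per-bucket and cheap, reads pay the O(buckets) sum.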
static void
dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf *packet,
struct pkt_metadata *md)
- OVS_REQ_RDLOCK(dp->port_rwlock)
{
struct dp_netdev_flow *netdev_flow;
struct flow key;
if (packet->size < ETH_HEADER_LEN) {
+ ofpbuf_delete(packet);
return;
}
flow_extract(packet, md, &key);
if (netdev_flow) {
struct dp_netdev_actions *actions;
- ovs_mutex_lock(&netdev_flow->mutex);
- dp_netdev_flow_used(netdev_flow, packet);
- actions = dp_netdev_actions_ref(netdev_flow->actions);
- ovs_mutex_unlock(&netdev_flow->mutex);
+ dp_netdev_flow_used(netdev_flow, packet, &key);
- dp_netdev_execute_actions(dp, &key, packet, md,
+ actions = dp_netdev_flow_get_actions(netdev_flow);
+ dp_netdev_execute_actions(dp, &key, packet, true, md,
actions->actions, actions->size);
- dp_netdev_actions_unref(actions);
- dp_netdev_flow_unref(netdev_flow);
- ovsthread_counter_inc(dp->n_hit, 1);
- } else {
- ovsthread_counter_inc(dp->n_missed, 1);
- dp_netdev_output_userspace(dp, packet, DPIF_UC_MISS, &key, NULL);
+ dp_netdev_count_packet(dp, DP_STAT_HIT);
+ } else if (dp->handler_queues) {
+ dp_netdev_count_packet(dp, DP_STAT_MISS);
+ dp_netdev_output_userspace(dp, packet,
+ flow_hash_5tuple(&key, 0) % dp->n_handlers,
+ DPIF_UC_MISS, &key, NULL);
+ ofpbuf_delete(packet);
}
}
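Miss upcalls above are steered with `flow_hash_5tuple(&key, 0) % dp->n_handlers`, so every packet of a given flow reaches the same handler queue and per-flow ordering is preserved. A toy version of that stable mapping (the hash below is an FNV-1a-style stand-in, not OVS's `flow_hash_5tuple()`):

```c
#include <assert.h>
#include <stdint.h>

/* Deterministic toy hash of a 5-tuple: any fixed hash gives each flow a
 * stable handler while spreading distinct flows across handlers. */
static uint32_t
toy_flow_hash(uint32_t src_ip, uint32_t dst_ip,
              uint16_t src_port, uint16_t dst_port, uint8_t proto)
{
    uint32_t h = 2166136261u;
    uint32_t parts[3] = {
        src_ip, dst_ip, ((uint32_t) src_port << 16) | dst_port,
    };

    for (int i = 0; i < 3; i++) {
        h = (h ^ parts[i]) * 16777619u;
    }
    return (h ^ proto) * 16777619u;
}

/* Map a flow hash onto one of 'n_handlers' upcall queues. */
static uint32_t
pick_handler(uint32_t flow_hash, uint32_t n_handlers)
{
    return flow_hash % n_handlers;
}
```

Because the hash depends only on the 5-tuple, repeated packets of one flow always select the same queue index; changing `n_handlers` reshuffles flows, which is why the queues are rebuilt under the write lock in `handlers_set`.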
static int
dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf *packet,
- int queue_no, const struct flow *flow,
+ int queue_no, int type, const struct flow *flow,
const struct nlattr *userdata)
- OVS_EXCLUDED(dp->queue_mutex)
{
- struct dp_netdev_queue *q = &dp->queues[queue_no];
+ struct dp_netdev_queue *q;
int error;
- ovs_mutex_lock(&dp->queue_mutex);
+ fat_rwlock_rdlock(&dp->queue_rwlock);
+ q = &dp->handler_queues[queue_no];
+ ovs_mutex_lock(&q->mutex);
if (q->head - q->tail < MAX_QUEUE_LEN) {
struct dp_netdev_upcall *u = &q->upcalls[q->head++ & QUEUE_MASK];
struct dpif_upcall *upcall = &u->upcall;
struct ofpbuf *buf = &u->buf;
size_t buf_size;
- upcall->type = queue_no;
+ upcall->type = type;
/* Allocate buffer big enough for everything. */
buf_size = ODPUTIL_FLOW_KEY_BYTES;
if (userdata) {
buf_size += NLA_ALIGN(userdata->nla_len);
}
+ buf_size += packet->size;
ofpbuf_init(buf, buf_size);
/* Put ODP flow. */
NLA_ALIGN(userdata->nla_len));
}
- /* Steal packet data. */
- ovs_assert(packet->source == OFPBUF_MALLOC);
- upcall->packet = *packet;
- ofpbuf_use(packet, NULL, 0);
+ upcall->packet.data = ofpbuf_put(buf, packet->data, packet->size);
+ upcall->packet.size = packet->size;
- seq_change(dp->queue_seq);
+ seq_change(q->seq);
error = 0;
} else {
- ovsthread_counter_inc(dp->n_lost, 1);
+ dp_netdev_count_packet(dp, DP_STAT_LOST);
error = ENOBUFS;
}
- ovs_mutex_unlock(&dp->queue_mutex);
+ ovs_mutex_unlock(&q->mutex);
+ fat_rwlock_unlock(&dp->queue_rwlock);
return error;
}
static void
dp_execute_cb(void *aux_, struct ofpbuf *packet,
- const struct pkt_metadata *md OVS_UNUSED,
+ struct pkt_metadata *md,
const struct nlattr *a, bool may_steal)
OVS_NO_THREAD_SAFETY_ANALYSIS
{
case OVS_ACTION_ATTR_OUTPUT:
p = dp_netdev_lookup_port(aux->dp, u32_to_odp(nl_attr_get_u32(a)));
if (p) {
- netdev_send(p->netdev, packet);
+ netdev_send(p->netdev, packet, may_steal);
}
break;
userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
- /* Make a copy if we are not allowed to steal the packet's data. */
- if (!may_steal) {
- packet = ofpbuf_clone_with_headroom(packet, DP_NETDEV_HEADROOM);
- }
- dp_netdev_output_userspace(aux->dp, packet, DPIF_UC_ACTION, aux->key,
+ dp_netdev_output_userspace(aux->dp, packet,
+ flow_hash_5tuple(aux->key, 0)
+ % aux->dp->n_handlers,
+ DPIF_UC_ACTION, aux->key,
userdata);
- if (!may_steal) {
- ofpbuf_uninit(packet);
+
+ if (may_steal) {
+ ofpbuf_delete(packet);
}
break;
}
+
+ case OVS_ACTION_ATTR_RECIRC: {
+ const struct ovs_action_recirc *act;
+
+ act = nl_attr_get(a);
+ md->recirc_id = act->recirc_id;
+ md->dp_hash = 0;
+
+ if (act->hash_alg == OVS_RECIRC_HASH_ALG_L4) {
+ struct flow flow;
+
+ flow_extract(packet, md, &flow);
+ md->dp_hash = flow_hash_symmetric_l4(&flow, act->hash_bias);
+ }
+
+ dp_netdev_port_input(aux->dp, packet, md);
+ break;
+ }
+
case OVS_ACTION_ATTR_PUSH_VLAN:
case OVS_ACTION_ATTR_POP_VLAN:
case OVS_ACTION_ATTR_PUSH_MPLS:
case __OVS_ACTION_ATTR_MAX:
OVS_NOT_REACHED();
}
+
}
static void
dp_netdev_execute_actions(struct dp_netdev *dp, const struct flow *key,
- struct ofpbuf *packet, struct pkt_metadata *md,
+ struct ofpbuf *packet, bool may_steal,
+ struct pkt_metadata *md,
const struct nlattr *actions, size_t actions_len)
- OVS_REQ_RDLOCK(dp->port_rwlock)
{
struct dp_netdev_execute_aux aux = {dp, key};
- odp_execute_actions(&aux, packet, md, actions, actions_len, dp_execute_cb);
+ odp_execute_actions(&aux, packet, may_steal, md,
+ actions, actions_len, dp_execute_cb);
}
#define DPIF_NETDEV_CLASS_FUNCTIONS \
dpif_netdev_open, \
dpif_netdev_close, \
dpif_netdev_destroy, \
- NULL, \
- NULL, \
+ dpif_netdev_run, \
+ dpif_netdev_wait, \
dpif_netdev_get_stats, \
dpif_netdev_port_add, \
dpif_netdev_port_del, \
dpif_netdev_execute, \
NULL, /* operate */ \
dpif_netdev_recv_set, \
+ dpif_netdev_handlers_set, \
dpif_netdev_queue_to_priority, \
dpif_netdev_recv, \
dpif_netdev_recv_wait, \
--- /dev/null
+/*
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef DPIF_NETDEV_H
+#define DPIF_NETDEV_H 1
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+#include "openvswitch/types.h"
+#include "ofpbuf.h"
+#include "packets.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Enough headroom to add a vlan tag, plus an extra 2 bytes to allow IP
+ * headers to be aligned on a 4-byte boundary. */
+enum { DP_NETDEV_HEADROOM = 2 + VLAN_HEADER_LEN };
+
+static inline void dp_packet_pad(struct ofpbuf *b)
+{
+ if (b->size < ETH_TOTAL_MIN) {
+ ofpbuf_put_zeros(b, ETH_TOTAL_MIN - b->size);
+ }
+}
+
+#define NR_QUEUE 1
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* dpif-netdev.h */
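The new header's `dp_packet_pad()` zero-fills runt frames up to the Ethernet minimum. A self-contained mirror of that helper, with a plain struct standing in for `struct ofpbuf` (ETH_TOTAL_MIN taken as 60, the minimum frame length excluding the FCS):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_TOTAL_MIN 60   /* minimum Ethernet frame length, excluding FCS */

struct buf {
    char data[128];
    size_t size;
};

/* Zero-pad frames shorter than the Ethernet minimum up to it; longer
 * frames pass through untouched, as in dp_packet_pad() above. */
static void
pad_to_min(struct buf *b)
{
    if (b->size < ETH_TOTAL_MIN) {
        memset(b->data + b->size, 0, ETH_TOTAL_MIN - b->size);
        b->size = ETH_TOTAL_MIN;
    }
}
```

The real helper can rely on `ofpbuf_put_zeros()` growing the buffer; this sketch assumes the backing array is already large enough.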
/* Returns the Netlink PID value to supply in OVS_ACTION_ATTR_USERSPACE
* actions as the OVS_USERSPACE_ATTR_PID attribute's value, for use in
- * flows whose packets arrived on port 'port_no'.
+ * flows whose packets arrived on port 'port_no'. In the case where the
+ * provider allocates multiple Netlink PIDs to a single port, it may use
+ * 'hash' to spread load among them. The caller need not use a particular
+ * hash function; a 5-tuple hash is suitable.
+ *
+ * (The datapath implementation might use some different hash function for
+ * distributing packets received via flow misses among PIDs. This means
+ * that packets received via flow misses might be reordered relative to
+ * packets received via userspace actions. This is not ordinarily a
+ * problem.)
*
* A 'port_no' of UINT32_MAX should be treated as a special case. The
* implementation should return a reserved PID, not allocated to any port,
*
* A dpif provider that doesn't have meaningful Netlink PIDs can use NULL
* for this function. This is equivalent to always returning 0. */
- uint32_t (*port_get_pid)(const struct dpif *dpif, odp_port_t port_no);
+ uint32_t (*port_get_pid)(const struct dpif *dpif, odp_port_t port_no,
+ uint32_t hash);
/* Attempts to begin dumping the ports in a dpif. On success, returns 0
* and initializes '*statep' with any data needed for iteration. On
* updating flows as necessary if it does this. */
int (*recv_set)(struct dpif *dpif, bool enable);
+    /* Refreshes the poll loops and Netlink sockets associated with each
+     * port, when the number of upcall handlers (upcall receiving threads)
+     * is changed to 'n_handlers' and receiving packets for 'dpif' is
+     * enabled by recv_set().
+ *
+ * Since multiple upcall handlers can read upcalls simultaneously from
+ * 'dpif', each port can have multiple Netlink sockets, one per upcall
+ * handler. So, handlers_set() is responsible for the following tasks:
+ *
+     *     When upcall receiving is enabled, it extends or creates the
+     *     configuration to support:
+ *
+ * - 'n_handlers' Netlink sockets for each port.
+ *
+ * - 'n_handlers' poll loops, one for each upcall handler.
+ *
+ * - registering the Netlink sockets for the same upcall handler to
+ * the corresponding poll loop.
+     */
+ int (*handlers_set)(struct dpif *dpif, uint32_t n_handlers);
+
/* Translates OpenFlow queue ID 'queue_id' (in host byte order) into a
* priority value used for setting packet priority. */
int (*queue_to_priority)(const struct dpif *dpif, uint32_t queue_id,
uint32_t *priority);
- /* Polls for an upcall from 'dpif'. If successful, stores the upcall into
- * '*upcall', using 'buf' for storage. Should only be called if 'recv_set'
- * has been used to enable receiving packets from 'dpif'.
+ /* Polls for an upcall from 'dpif' for an upcall handler. Since there
+ * can be multiple poll loops (see ->handlers_set()), 'handler_id' is
+     * needed as an index to identify the corresponding poll loop. If
+ * successful, stores the upcall into '*upcall', using 'buf' for
+ * storage. Should only be called if 'recv_set' has been used to enable
+ * receiving packets from 'dpif'.
*
* The implementation should point 'upcall->key' and 'upcall->userdata'
* (if any) into data in the caller-provided 'buf'. The implementation may
*
* This function must not block. If no upcall is pending when it is
* called, it should return EAGAIN without blocking. */
- int (*recv)(struct dpif *dpif, struct dpif_upcall *upcall,
- struct ofpbuf *buf);
-
- /* Arranges for the poll loop to wake up when 'dpif' has a message queued
- * to be received with the recv member function. */
- void (*recv_wait)(struct dpif *dpif);
+ int (*recv)(struct dpif *dpif, uint32_t handler_id,
+ struct dpif_upcall *upcall, struct ofpbuf *buf);
+
+ /* Arranges for the poll loop for an upcall handler to wake up when 'dpif'
+ * has a message queued to be received with the recv member functions.
+ * Since there can be multiple poll loops (see ->handlers_set()),
+     * 'handler_id' is needed as an index to identify the corresponding
+     * poll loop. */
+ void (*recv_wait)(struct dpif *dpif, uint32_t handler_id);
/* Throws away any queued upcalls that 'dpif' currently has ready to
* return. */
return error;
}
-/* Returns the Netlink PID value to supply in OVS_ACTION_ATTR_USERSPACE actions
- * as the OVS_USERSPACE_ATTR_PID attribute's value, for use in flows whose
- * packets arrived on port 'port_no'.
+/* Returns the Netlink PID value to supply in OVS_ACTION_ATTR_USERSPACE
+ * actions as the OVS_USERSPACE_ATTR_PID attribute's value, for use in
+ * flows whose packets arrived on port 'port_no'. In the case where the
+ * provider allocates multiple Netlink PIDs to a single port, it may use
+ * 'hash' to spread load among them. The caller need not use a particular
+ * hash function; a 5-tuple hash is suitable.
+ *
+ * (The datapath implementation might use some different hash function for
+ * distributing packets received via flow misses among PIDs. This means
+ * that packets received via flow misses might be reordered relative to
+ * packets received via userspace actions. This is not ordinarily a
+ * problem.)
*
* A 'port_no' of ODPP_NONE is a special case: it returns a reserved PID, not
* allocated to any port, that the client may use for special purposes.
* update all of the flows that it installed that contain
* OVS_ACTION_ATTR_USERSPACE actions. */
uint32_t
-dpif_port_get_pid(const struct dpif *dpif, odp_port_t port_no)
+dpif_port_get_pid(const struct dpif *dpif, odp_port_t port_no, uint32_t hash)
{
return (dpif->dpif_class->port_get_pid
- ? (dpif->dpif_class->port_get_pid)(dpif, port_no)
+ ? (dpif->dpif_class->port_get_pid)(dpif, port_no, hash)
: 0);
}
dpif_flow_stats_extract(const struct flow *flow, const struct ofpbuf *packet,
long long int used, struct dpif_flow_stats *stats)
{
- stats->tcp_flags = packet_get_tcp_flags(packet, flow);
+ stats->tcp_flags = ntohs(flow->tcp_flags);
stats->n_bytes = packet->size;
stats->n_packets = 1;
stats->used = used;
* meaningful. */
static void
dpif_execute_helper_cb(void *aux_, struct ofpbuf *packet,
- const struct pkt_metadata *md,
+ struct pkt_metadata *md,
const struct nlattr *action, bool may_steal OVS_UNUSED)
{
struct dpif_execute_helper_aux *aux = aux_;
case OVS_ACTION_ATTR_SET:
case OVS_ACTION_ATTR_SAMPLE:
case OVS_ACTION_ATTR_UNSPEC:
+ case OVS_ACTION_ATTR_RECIRC:
case __OVS_ACTION_ATTR_MAX:
OVS_NOT_REACHED();
}
COVERAGE_INC(dpif_execute_with_help);
- odp_execute_actions(&aux, execute->packet, &execute->md,
+ odp_execute_actions(&aux, execute->packet, false, &execute->md,
execute->actions, execute->actions_len,
dpif_execute_helper_cb);
return aux.error;
return error;
}
-/* Polls for an upcall from 'dpif'. If successful, stores the upcall into
- * '*upcall', using 'buf' for storage. Should only be called if
- * dpif_recv_set() has been used to enable receiving packets on 'dpif'.
+/* Refreshes the poll loops and Netlink sockets associated with each port,
+ * when the number of upcall handlers (upcall receiving threads) is changed
+ * to 'n_handlers' and receiving packets for 'dpif' is enabled by
+ * recv_set().
+ *
+ * Since multiple upcall handlers can read upcalls simultaneously from
+ * 'dpif', each port can have multiple Netlink sockets, one per upcall
+ * handler. So, handlers_set() is responsible for the following tasks:
+ *
+ *     When upcall receiving is enabled, it extends or creates the
+ *     configuration to support:
+ *
+ * - 'n_handlers' Netlink sockets for each port.
+ *
+ * - 'n_handlers' poll loops, one for each upcall handler.
+ *
+ * - registering the Netlink sockets for the same upcall handler to
+ * the corresponding poll loop.
+ *
+ * Returns 0 if successful, otherwise a positive errno value. */
+int
+dpif_handlers_set(struct dpif *dpif, uint32_t n_handlers)
+{
+ int error = dpif->dpif_class->handlers_set(dpif, n_handlers);
+ log_operation(dpif, "handlers_set", error);
+ return error;
+}
+
+/* Polls for an upcall from 'dpif' for an upcall handler. Since there
+ * can be multiple poll loops, 'handler_id' is needed as an index to
+ * identify the corresponding poll loop. If successful, stores the upcall
+ * into '*upcall', using 'buf' for storage. Should only be called if
+ * 'recv_set' has been used to enable receiving packets from 'dpif'.
*
* 'upcall->key' and 'upcall->userdata' point into data in the caller-provided
* 'buf', so their memory cannot be freed separately from 'buf'.
* Returns 0 if successful, otherwise a positive errno value. Returns EAGAIN
* if no upcall is immediately available. */
int
-dpif_recv(struct dpif *dpif, struct dpif_upcall *upcall, struct ofpbuf *buf)
+dpif_recv(struct dpif *dpif, uint32_t handler_id, struct dpif_upcall *upcall,
+ struct ofpbuf *buf)
{
- int error = dpif->dpif_class->recv(dpif, upcall, buf);
+ int error = dpif->dpif_class->recv(dpif, handler_id, upcall, buf);
if (!error && !VLOG_DROP_DBG(&dpmsg_rl)) {
struct ds flow;
char *packet;
}
}
-/* Arranges for the poll loop to wake up when 'dpif' has a message queued to be
- * received with dpif_recv(). */
+/* Arranges for the poll loop for an upcall handler to wake up when
+ * 'dpif' has a message queued to be received with the recv member
+ * function. Since there can be multiple poll loops, 'handler_id' is
+ * needed as an index to identify the corresponding poll loop. */
void
-dpif_recv_wait(struct dpif *dpif)
+dpif_recv_wait(struct dpif *dpif, uint32_t handler_id)
{
- dpif->dpif_class->recv_wait(dpif);
+ dpif->dpif_class->recv_wait(dpif, handler_id);
}
/* Obtains the NetFlow engine type and engine ID for 'dpif' into '*engine_type'
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* "internal" (for a simulated port used to connect to the TCP/IP stack),
* and "gre" (for a GRE tunnel).
*
- * - A Netlink PID (see "Upcall Queuing and Ordering" below).
+ * - A Netlink PID for each upcall reading thread (see "Upcall Queuing and
+ * Ordering" below).
*
* The dpif interface has functions for adding and deleting ports. When a
* datapath implements these (e.g. as the Linux and netdev datapaths do), then
* connection consists of two flows with 1-ms latency to set up each one.
*
* To receive upcalls, a client has to enable them with dpif_recv_set(). A
- * datapath should generally support multiple clients at once (e.g. so that one
- * may run "ovs-dpctl show" or "ovs-dpctl dump-flows" while "ovs-vswitchd" is
- * also running) but need not support multiple clients enabling upcalls at
- * once.
+ * datapath should generally support being opened multiple times (e.g. so that
+ * one may run "ovs-dpctl show" or "ovs-dpctl dump-flows" while "ovs-vswitchd"
+ * is also running) but need not support more than one of these clients
+ * enabling upcalls at once.
*
*
* Upcall Queuing and Ordering
* PID in "action" upcalls is that dpif_port_get_pid() returns a constant value
* and all upcalls are appended to a single queue.
*
- * The ideal behavior is:
+ * The preferred behavior is:
*
* - Each port has a PID that identifies the queue used for "miss" upcalls
* on that port. (Thus, if each port has its own queue for "miss"
*
* - Upcalls that specify the "special" Netlink PID are queued separately.
*
+ * Multiple threads may want to read upcalls simultaneously from a single
+ * datapath.  To support multiple threads well, the preferred behavior above
+ * is extended:
+ *
+ * - Each port has multiple PIDs. The datapath distributes "miss" upcalls
+ * across the PIDs, ensuring that a given flow is mapped in a stable way
+ * to a single PID.
+ *
+ *    - For "action" upcalls, the thread can specify its own Netlink PID or
+ *      another thread's Netlink PID for the same port, for offloading
+ *      purposes (e.g. in a "round robin" manner).
+ *
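The stable flow-to-PID mapping described above can be sketched in isolation. The sketch below is illustrative only, under stated assumptions: `hash_flow` and `select_handler` are hypothetical names, and the mixing function is a simple stand-in, not the datapath's real hash. The point it demonstrates is that hashing the flow and reducing modulo the handler count maps a given flow to the same handler every time.

```c
#include <stdint.h>

/* Toy flow hash: a stand-in for the datapath's real hash function.
 * Deterministic for a given 5-tuple-like input. */
static uint32_t
hash_flow(uint32_t src, uint32_t dst, uint16_t tp_src, uint16_t tp_dst)
{
    uint32_t h = src * 0x9e3779b9u;
    h ^= dst + 0x9e3779b9u + (h << 6) + (h >> 2);
    h ^= (((uint32_t) tp_src << 16) | tp_dst)
         + 0x9e3779b9u + (h << 6) + (h >> 2);
    return h;
}

/* Picks the upcall handler for a flow.  Because the hash depends only on
 * the flow, the same flow always lands on the same handler, which keeps
 * per-flow upcall ordering intact. */
static uint32_t
select_handler(uint32_t src, uint32_t dst, uint16_t tp_src, uint16_t tp_dst,
               uint32_t n_handlers)
{
    return hash_flow(src, dst, tp_src, tp_dst) % n_handlers;
}
```

A design note: distributing by flow hash rather than round robin trades perfect load balance for ordering, since all upcalls of one flow stay in one handler's queue.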
*
* Packet Format
* =============
struct dpif_port *);
int dpif_port_get_name(struct dpif *, odp_port_t port_no,
char *name, size_t name_size);
-uint32_t dpif_port_get_pid(const struct dpif *, odp_port_t port_no);
+uint32_t dpif_port_get_pid(const struct dpif *, odp_port_t port_no,
+ uint32_t hash);
struct dpif_port_dump {
const struct dpif *dpif;
};
int dpif_recv_set(struct dpif *, bool enable);
-int dpif_recv(struct dpif *, struct dpif_upcall *, struct ofpbuf *);
+int dpif_handlers_set(struct dpif *, uint32_t n_handlers);
+int dpif_recv(struct dpif *, uint32_t handler_id, struct dpif_upcall *,
+ struct ofpbuf *);
void dpif_recv_purge(struct dpif *);
-void dpif_recv_wait(struct dpif *);
+void dpif_recv_wait(struct dpif *, uint32_t handler_id);
\f
/* Miscellaneous. */
#include "ovs-thread.h"
#include "random.h"
-/* This system's cache line size, in bytes.
- * Being wrong hurts performance but not correctness. */
-#define CACHE_LINE_SIZE 64 /* Correct for most CPUs. */
-BUILD_ASSERT_DECL(IS_POW2(CACHE_LINE_SIZE));
-
struct fat_rwlock_slot {
/* Membership in rwlock's list of "struct fat_rwlock_slot"s.
*
* Accessed only by the slot's own thread, so no synchronization is
* needed. */
unsigned int depth;
-
- /* To prevent two of these structures from accidentally occupying the same
- * cache line (causing "false sharing"), we cache-align each of these data
- * structures. That requires malloc()ing extra space and throwing away
- * some space at the beginning, which means that the pointer to this struct
- * isn't necessarily the pointer to the beginning of the block, and so we
- * need to retain the original pointer to free later.
- *
- * Accessed only by a single thread, so no synchronization is needed. */
- void *base; /* Pointer to pass to free() for this block. */
};
static void
}
list_remove(&slot->list_node);
- free(slot->base);
+ free_cacheline(slot);
}
static void
fat_rwlock_get_slot__(struct fat_rwlock *rwlock)
{
struct fat_rwlock_slot *slot;
- void *base;
/* Fast path. */
slot = ovsthread_getspecific(rwlock->key);
/* Slow path: create a new slot for 'rwlock' in this thread. */
- /* Allocate room for:
- *
- * - Up to CACHE_LINE_SIZE - 1 bytes before the per-thread, so that
- * the start of the slot doesn't potentially share a cache line.
- *
- * - The slot itself.
- *
- * - Space following the slot up to the end of the cache line, so
- * that the end of the slot doesn't potentially share a cache
- * line. */
- base = xmalloc((CACHE_LINE_SIZE - 1)
- + ROUND_UP(sizeof *slot, CACHE_LINE_SIZE));
- slot = (void *) ROUND_UP((uintptr_t) base, CACHE_LINE_SIZE);
-
- slot->base = base;
+ slot = xmalloc_cacheline(sizeof *slot);
slot->rwlock = rwlock;
ovs_mutex_init(&slot->mutex);
slot->depth = 0;
return NULL;
}
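The removed open-coded allocation above is replaced by `xmalloc_cacheline()` / `free_cacheline()`. A minimal sketch of the underlying technique, assuming the helper works roughly like this (the `_sketch` names are hypothetical, not OVS's actual implementation): over-allocate, round the pointer up to a cache-line boundary so two slots never share a line ("false sharing"), and stash the original `malloc()` pointer just before the aligned block so it can be freed later.

```c
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 64      /* Correct for most CPUs. */

/* Returns a block of at least 'size' bytes aligned to a cache-line
 * boundary.  Aborts on allocation failure. */
void *
xmalloc_cacheline_sketch(size_t size)
{
    /* Room for the alignment slack, the saved base pointer, and the
     * caller's data. */
    void *base = malloc((CACHE_LINE_SIZE - 1) + sizeof(void *) + size);
    if (!base) {
        abort();
    }

    /* Round up past the saved-pointer slot to the next cache line. */
    uintptr_t p = (uintptr_t) base + sizeof(void *);
    p = (p + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE * CACHE_LINE_SIZE;

    ((void **) p)[-1] = base;   /* Remember the block to free later. */
    return (void *) p;
}

/* Frees a block obtained from xmalloc_cacheline_sketch(). */
void
free_cacheline_sketch(void *p)
{
    if (p) {
        free(((void **) p)[-1]);
    }
}
```

Centralizing this in a helper lets callers like `fat_rwlock_get_slot__()` drop the per-struct `base` pointer and the hand-rolled rounding arithmetic.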
-static struct tcp_header *
-pull_tcp(struct ofpbuf *packet)
-{
- if (packet->size >= TCP_HEADER_LEN) {
- struct tcp_header *tcp = packet->data;
- int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4;
- if (tcp_len >= TCP_HEADER_LEN && packet->size >= tcp_len) {
- return ofpbuf_pull(packet, tcp_len);
- }
- }
- return NULL;
-}
-
-static struct udp_header *
-pull_udp(struct ofpbuf *packet)
-{
- return ofpbuf_try_pull(packet, UDP_HEADER_LEN);
-}
-
-static struct sctp_header *
-pull_sctp(struct ofpbuf *packet)
-{
- return ofpbuf_try_pull(packet, SCTP_HEADER_LEN);
-}
-
static struct icmp_header *
pull_icmp(struct ofpbuf *packet)
{
}
static void
-parse_tcp(struct ofpbuf *packet, struct ofpbuf *b, struct flow *flow)
+parse_tcp(struct ofpbuf *b, struct flow *flow)
{
- const struct tcp_header *tcp = pull_tcp(b);
- if (tcp) {
+ if (b->size >= TCP_HEADER_LEN) {
+ const struct tcp_header *tcp = b->data;
+
flow->tp_src = tcp->tcp_src;
flow->tp_dst = tcp->tcp_dst;
flow->tcp_flags = tcp->tcp_ctl & htons(0x0fff);
- packet->l7 = b->data;
}
}
static void
-parse_udp(struct ofpbuf *packet, struct ofpbuf *b, struct flow *flow)
+parse_udp(struct ofpbuf *b, struct flow *flow)
{
- const struct udp_header *udp = pull_udp(b);
- if (udp) {
+ if (b->size >= UDP_HEADER_LEN) {
+ const struct udp_header *udp = b->data;
+
flow->tp_src = udp->udp_src;
flow->tp_dst = udp->udp_dst;
- packet->l7 = b->data;
}
}
static void
-parse_sctp(struct ofpbuf *packet, struct ofpbuf *b, struct flow *flow)
+parse_sctp(struct ofpbuf *b, struct flow *flow)
{
- const struct sctp_header *sctp = pull_sctp(b);
- if (sctp) {
+ if (b->size >= SCTP_HEADER_LEN) {
+ const struct sctp_header *sctp = b->data;
+
flow->tp_src = sctp->sctp_src;
flow->tp_dst = sctp->sctp_dst;
- packet->l7 = b->data;
}
}
-static bool
+static void
parse_icmpv6(struct ofpbuf *b, struct flow *flow)
{
const struct icmp6_hdr *icmp = pull_icmpv6(b);
if (!icmp) {
- return false;
+ return;
}
/* The ICMPv6 type and code fields use the 16-bit transport port
nd_target = ofpbuf_try_pull(b, sizeof *nd_target);
if (!nd_target) {
- return false;
+ return;
}
flow->nd_target = *nd_target;
}
}
- return true;
+ return;
invalid:
memset(&flow->nd_target, 0, sizeof(flow->nd_target));
memset(flow->arp_sha, 0, sizeof(flow->arp_sha));
memset(flow->arp_tha, 0, sizeof(flow->arp_tha));
- return false;
-
+ return;
}
/* Initializes 'flow' members from 'packet' and 'md'
*
- * Initializes 'packet' header pointers as follows:
+ * Initializes 'packet' header l2 pointer to the start of the Ethernet
+ * header, and the layer offsets as follows:
*
- * - packet->l2 to the start of the Ethernet header.
+ * - packet->l2_5_ofs to the start of the MPLS shim header, or UINT16_MAX
+ * when there is no MPLS shim header.
*
- * - packet->l2_5 to the start of the MPLS shim header.
- *
- * - packet->l3 to just past the Ethernet header, or just past the
+ * - packet->l3_ofs to just past the Ethernet header, or just past the
* vlan_header if one is present, to the first byte of the payload of the
- * Ethernet frame.
- *
- * - packet->l4 to just past the IPv4 header, if one is present and has a
- * correct length, and otherwise NULL.
+ * Ethernet frame. UINT16_MAX if the frame is too short to contain an
+ * Ethernet header.
*
- * - packet->l7 to just past the TCP/UDP/SCTP/ICMP header, if one is
- * present and has a correct length, and otherwise NULL.
+ *    - packet->l4_ofs to just past the IPv4 header, if one is present and
+ *      long enough to contain the fields of interest for the flow;
+ *      otherwise UINT16_MAX.
*/
void
flow_extract(struct ofpbuf *packet, const struct pkt_metadata *md,
if (md) {
flow->tunnel = md->tunnel;
- if (md->in_port.odp_port != ODPP_NONE) {
- flow->in_port = md->in_port;
- };
+ flow->in_port = md->in_port;
flow->skb_priority = md->skb_priority;
flow->pkt_mark = md->pkt_mark;
}
packet->l2 = b.data;
- packet->l2_5 = NULL;
- packet->l3 = NULL;
- packet->l4 = NULL;
- packet->l7 = NULL;
+ ofpbuf_set_l2_5(packet, NULL);
+ ofpbuf_set_l3(packet, NULL);
+ ofpbuf_set_l4(packet, NULL);
if (b.size < sizeof *eth) {
return;
/* Parse mpls, copy l3 ttl. */
if (eth_type_mpls(flow->dl_type)) {
- packet->l2_5 = b.data;
+ ofpbuf_set_l2_5(packet, b.data);
parse_mpls(&b, flow);
}
/* Network layer. */
- packet->l3 = b.data;
+ ofpbuf_set_l3(packet, b.data);
if (flow->dl_type == htons(ETH_TYPE_IP)) {
const struct ip_header *nh = pull_ip(&b);
if (nh) {
- packet->l4 = b.data;
+ ofpbuf_set_l4(packet, b.data);
flow->nw_src = get_16aligned_be32(&nh->ip_src);
flow->nw_dst = get_16aligned_be32(&nh->ip_dst);
if (!(nh->ip_frag_off & htons(IP_FRAG_OFF_MASK))) {
if (flow->nw_proto == IPPROTO_TCP) {
- parse_tcp(packet, &b, flow);
+ parse_tcp(&b, flow);
} else if (flow->nw_proto == IPPROTO_UDP) {
- parse_udp(packet, &b, flow);
+ parse_udp(&b, flow);
} else if (flow->nw_proto == IPPROTO_SCTP) {
- parse_sctp(packet, &b, flow);
+ parse_sctp(&b, flow);
} else if (flow->nw_proto == IPPROTO_ICMP) {
const struct icmp_header *icmp = pull_icmp(&b);
if (icmp) {
flow->tp_src = htons(icmp->icmp_type);
flow->tp_dst = htons(icmp->icmp_code);
- packet->l7 = b.data;
}
}
}
return;
}
- packet->l4 = b.data;
+ ofpbuf_set_l4(packet, b.data);
if (flow->nw_proto == IPPROTO_TCP) {
- parse_tcp(packet, &b, flow);
+ parse_tcp(&b, flow);
} else if (flow->nw_proto == IPPROTO_UDP) {
- parse_udp(packet, &b, flow);
+ parse_udp(&b, flow);
} else if (flow->nw_proto == IPPROTO_SCTP) {
- parse_sctp(packet, &b, flow);
+ parse_sctp(&b, flow);
} else if (flow->nw_proto == IPPROTO_ICMPV6) {
- if (parse_icmpv6(&b, flow)) {
- packet->l7 = b.data;
- }
+ parse_icmpv6(&b, flow);
}
} else if (flow->dl_type == htons(ETH_TYPE_ARP) ||
flow->dl_type == htons(ETH_TYPE_RARP)) {
void
flow_get_metadata(const struct flow *flow, struct flow_metadata *fmd)
{
- BUILD_ASSERT_DECL(FLOW_WC_SEQ == 24);
+ BUILD_ASSERT_DECL(FLOW_WC_SEQ == 25);
+ fmd->dp_hash = flow->dp_hash;
+ fmd->recirc_id = flow->recirc_id;
fmd->tun_id = flow->tunnel.tun_id;
fmd->tun_src = flow->tunnel.ip_src;
fmd->tun_dst = flow->tunnel.ip_dst;
wc->masks.regs[idx] = mask;
}
+/* Calculates the 5-tuple hash from the given flow. */
+uint32_t
+flow_hash_5tuple(const struct flow *flow, uint32_t basis)
+{
+ uint32_t hash = 0;
+
+ if (!flow) {
+ return 0;
+ }
+
+ hash = mhash_add(basis, (OVS_FORCE uint32_t) flow->nw_src);
+ hash = mhash_add(hash, (OVS_FORCE uint32_t) flow->nw_dst);
+ hash = mhash_add(hash, ((OVS_FORCE uint32_t) flow->tp_src << 16)
+ | (OVS_FORCE uint32_t) flow->tp_dst);
+ hash = mhash_add(hash, flow->nw_proto);
+
+ return mhash_finish(hash, 13);
+}
+
/* Hashes 'flow' based on its L2 through L4 protocol information. */
uint32_t
flow_hash_symmetric_l4(const struct flow *flow, uint32_t basis)
flow->mpls_lse[0] = set_mpls_lse_values(ttl, tc, 1, htonl(label));
/* Clear all L3 and L4 fields. */
- BUILD_ASSERT(FLOW_WC_SEQ == 24);
+ BUILD_ASSERT(FLOW_WC_SEQ == 25);
memset((char *) flow + FLOW_SEGMENT_2_ENDS_AT, 0,
sizeof(struct flow) - FLOW_SEGMENT_2_ENDS_AT);
}
flow->mpls_lse[idx] = lse;
}
-static void
+static size_t
flow_compose_l4(struct ofpbuf *b, const struct flow *flow)
{
+ size_t l4_len = 0;
+
if (!(flow->nw_frag & FLOW_NW_FRAG_ANY)
|| !(flow->nw_frag & FLOW_NW_FRAG_LATER)) {
if (flow->nw_proto == IPPROTO_TCP) {
struct tcp_header *tcp;
- tcp = ofpbuf_put_zeros(b, sizeof *tcp);
+ l4_len = sizeof *tcp;
+ tcp = ofpbuf_put_zeros(b, l4_len);
tcp->tcp_src = flow->tp_src;
tcp->tcp_dst = flow->tp_dst;
tcp->tcp_ctl = TCP_CTL(ntohs(flow->tcp_flags), 5);
- b->l7 = ofpbuf_tail(b);
} else if (flow->nw_proto == IPPROTO_UDP) {
struct udp_header *udp;
- udp = ofpbuf_put_zeros(b, sizeof *udp);
+ l4_len = sizeof *udp;
+ udp = ofpbuf_put_zeros(b, l4_len);
udp->udp_src = flow->tp_src;
udp->udp_dst = flow->tp_dst;
- b->l7 = ofpbuf_tail(b);
} else if (flow->nw_proto == IPPROTO_SCTP) {
struct sctp_header *sctp;
- sctp = ofpbuf_put_zeros(b, sizeof *sctp);
+ l4_len = sizeof *sctp;
+ sctp = ofpbuf_put_zeros(b, l4_len);
sctp->sctp_src = flow->tp_src;
sctp->sctp_dst = flow->tp_dst;
- b->l7 = ofpbuf_tail(b);
} else if (flow->nw_proto == IPPROTO_ICMP) {
struct icmp_header *icmp;
- icmp = ofpbuf_put_zeros(b, sizeof *icmp);
+ l4_len = sizeof *icmp;
+ icmp = ofpbuf_put_zeros(b, l4_len);
icmp->icmp_type = ntohs(flow->tp_src);
icmp->icmp_code = ntohs(flow->tp_dst);
icmp->icmp_csum = csum(icmp, ICMP_HEADER_LEN);
- b->l7 = ofpbuf_tail(b);
} else if (flow->nw_proto == IPPROTO_ICMPV6) {
struct icmp6_hdr *icmp;
- icmp = ofpbuf_put_zeros(b, sizeof *icmp);
+ l4_len = sizeof *icmp;
+ icmp = ofpbuf_put_zeros(b, l4_len);
icmp->icmp6_type = ntohs(flow->tp_src);
icmp->icmp6_code = ntohs(flow->tp_dst);
struct in6_addr *nd_target;
struct nd_opt_hdr *nd_opt;
+ l4_len += sizeof *nd_target;
nd_target = ofpbuf_put_zeros(b, sizeof *nd_target);
*nd_target = flow->nd_target;
if (!eth_addr_is_zero(flow->arp_sha)) {
+ l4_len += 8;
nd_opt = ofpbuf_put_zeros(b, 8);
nd_opt->nd_opt_len = 1;
nd_opt->nd_opt_type = ND_OPT_SOURCE_LINKADDR;
memcpy(nd_opt + 1, flow->arp_sha, ETH_ADDR_LEN);
}
if (!eth_addr_is_zero(flow->arp_tha)) {
+ l4_len += 8;
nd_opt = ofpbuf_put_zeros(b, 8);
nd_opt->nd_opt_len = 1;
nd_opt->nd_opt_type = ND_OPT_TARGET_LINKADDR;
}
icmp->icmp6_cksum = (OVS_FORCE uint16_t)
csum(icmp, (char *)ofpbuf_tail(b) - (char *)icmp);
- b->l7 = ofpbuf_tail(b);
}
}
+ return l4_len;
}
/* Puts into 'b' a packet that flow_extract() would parse as having the given
void
flow_compose(struct ofpbuf *b, const struct flow *flow)
{
+ size_t l4_len;
+
/* eth_compose() sets l3 pointer and makes sure it is 32-bit aligned. */
eth_compose(b, flow->dl_dst, flow->dl_src, ntohs(flow->dl_type), 0);
if (flow->dl_type == htons(FLOW_DL_TYPE_NONE)) {
}
}
- b->l4 = ofpbuf_tail(b);
+ ofpbuf_set_l4(b, ofpbuf_tail(b));
- flow_compose_l4(b, flow);
+ l4_len = flow_compose_l4(b, flow);
- ip->ip_tot_len = htons((uint8_t *) b->data + b->size
- - (uint8_t *) b->l3);
+ ip->ip_tot_len = htons(b->l4_ofs - b->l3_ofs + l4_len);
ip->ip_csum = csum(ip, sizeof *ip);
} else if (flow->dl_type == htons(ETH_TYPE_IPV6)) {
struct ovs_16aligned_ip6_hdr *nh;
memcpy(&nh->ip6_src, &flow->ipv6_src, sizeof(nh->ip6_src));
memcpy(&nh->ip6_dst, &flow->ipv6_dst, sizeof(nh->ip6_dst));
- b->l4 = ofpbuf_tail(b);
+ ofpbuf_set_l4(b, ofpbuf_tail(b));
- flow_compose_l4(b, flow);
+ l4_len = flow_compose_l4(b, flow);
- nh->ip6_plen =
- b->l7 ? htons((uint8_t *) b->l7 - (uint8_t *) b->l4) : htons(0);
+ nh->ip6_plen = htons(l4_len);
} else if (flow->dl_type == htons(ETH_TYPE_ARP) ||
flow->dl_type == htons(ETH_TYPE_RARP)) {
struct arp_eth_header *arp;
- b->l3 = arp = ofpbuf_put_zeros(b, sizeof *arp);
+ arp = ofpbuf_put_zeros(b, sizeof *arp);
+ ofpbuf_set_l3(b, arp);
arp->ar_hrd = htons(1);
arp->ar_pro = htons(ETH_TYPE_IP);
arp->ar_hln = ETH_ADDR_LEN;
if (eth_type_mpls(flow->dl_type)) {
int n;
- b->l2_5 = b->l3;
+ b->l2_5_ofs = b->l3_ofs;
for (n = 1; n < FLOW_MAX_MPLS_LABELS; n++) {
if (flow->mpls_lse[n - 1] & htonl(MPLS_BOS_MASK)) {
break;
/* This sequence number should be incremented whenever anything involving flows
* or the wildcarding of flows changes. This will cause build assertion
* failures in places which likely need to be updated. */
-#define FLOW_WC_SEQ 24
+#define FLOW_WC_SEQ 25
#define FLOW_N_REGS 8
BUILD_ASSERT_DECL(FLOW_N_REGS <= NXM_NX_MAX_REGS);
* be looked at. This enables better wildcarding for datapath flows.
*/
struct flow {
+    /* Recirculation. */
+    uint32_t dp_hash;           /* Datapath computed hash value. The exact
+                                   computation is opaque to userspace. */
+ uint32_t recirc_id; /* Must be exact match. */
+
/* L1 */
struct flow_tnl tunnel; /* Encapsulating tunnel parameters. */
ovs_be64 metadata; /* OpenFlow Metadata. */
/* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */
BUILD_ASSERT_DECL(offsetof(struct flow, tp_dst) + 2
- == sizeof(struct flow_tnl) + 164
- && FLOW_WC_SEQ == 24);
+ == sizeof(struct flow_tnl) + 172
+ && FLOW_WC_SEQ == 25);
/* Incremental points at which flow classification may be performed in
* segments.
/* Represents the metadata fields of struct flow. */
struct flow_metadata {
+ uint32_t dp_hash; /* Datapath computed hash field. */
+ uint32_t recirc_id; /* Recirculation ID. */
ovs_be64 tun_id; /* Encapsulating tunnel ID. */
ovs_be32 tun_src; /* Tunnel outer IPv4 src addr */
ovs_be32 tun_dst; /* Tunnel outer IPv4 dst addr */
uint32_t flow_wildcards_hash(const struct flow_wildcards *, uint32_t basis);
bool flow_wildcards_equal(const struct flow_wildcards *,
const struct flow_wildcards *);
+uint32_t flow_hash_5tuple(const struct flow *flow, uint32_t basis);
uint32_t flow_hash_symmetric_l4(const struct flow *flow, uint32_t basis);
/* Initialize a flow with random fields that matter for nx_hash_fields. */
static inline uint32_t hash_int(uint32_t x, uint32_t basis);
static inline uint32_t hash_2words(uint32_t, uint32_t);
+static inline uint32_t hash_uint64(uint64_t);
+static inline uint32_t hash_uint64_basis(uint64_t x, uint32_t basis);
uint32_t hash_3words(uint32_t, uint32_t, uint32_t);
static inline uint32_t hash_boolean(bool x, uint32_t basis);
static inline uint32_t hash_2words(uint32_t x, uint32_t y)
{
- return mhash_finish(mhash_add(mhash_add(x, 0), y), 4);
+ return mhash_finish(mhash_add(mhash_add(x, 0), y), 8);
}
+static inline uint32_t hash_uint64(const uint64_t x)
+{
+ return hash_2words((uint32_t)(x >> 32), (uint32_t)x);
+}
+
+static inline uint32_t hash_uint64_basis(const uint64_t x,
+ const uint32_t basis)
+{
+ return hash_3words((uint32_t)(x >> 32), (uint32_t)x, basis);
+}
#ifdef __cplusplus
}
#endif
static inline bool
hmap_is_empty(const struct hmap *hmap)
{
- atomic_thread_fence(memory_order_acquire);
return hmap->n == 0;
}
-/* Copyright (c) 2011, 2012, 2013 Nicira, Inc.
+/* Copyright (c) 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
{
const struct lacp_pdu *pdu;
- pdu = ofpbuf_at(b, (uint8_t *)b->l3 - (uint8_t *)b->data, LACP_PDU_LEN);
+ pdu = ofpbuf_at(b, (uint8_t *)ofpbuf_get_l3(b) - (uint8_t *)b->data,
+ LACP_PDU_LEN);
if (pdu && pdu->subtype == 1
&& pdu->actor_type == 1 && pdu->actor_len == 20
hmap_destroy(&lacp->slaves);
list_remove(&lacp->node);
free(lacp->name);
- ovs_refcount_destroy(&lacp->ref_cnt);
free(lacp);
ovs_mutex_unlock(&mutex);
}
- /* Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ /* Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
= &lock_table__;
static void lockfile_unhash(struct lockfile *);
-#ifdef _WIN32
-static int lockfile_try_lock_windows(const char *name, pid_t *pidp,
- struct lockfile **lockfilep);
-static void lockfile_unlock_windows(struct lockfile * lockfile);
-#else
-static int lockfile_try_lock_posix(const char *name, pid_t *pidp,
- struct lockfile **lockfilep);
-#endif
+static int lockfile_try_lock(const char *name, pid_t *pidp,
+ struct lockfile **lockfilep)
+ OVS_REQUIRES(&lock_table_mutex);
+static void lockfile_do_unlock(struct lockfile * lockfile)
+ OVS_REQUIRES(&lock_table_mutex);
/* Returns the name of the lockfile that would be created for locking a file
* named 'filename_'. The caller is responsible for freeing the returned name,
lock_name = lockfile_name(file);
ovs_mutex_lock(&lock_table_mutex);
-#ifdef _WIN32
- error = lockfile_try_lock_windows(lock_name, &pid, lockfilep);
-#else
- error = lockfile_try_lock_posix(lock_name, &pid, lockfilep);
-#endif
+ error = lockfile_try_lock(lock_name, &pid, lockfilep);
ovs_mutex_unlock(&lock_table_mutex);
if (error) {
{
if (lockfile) {
ovs_mutex_lock(&lock_table_mutex);
-#ifdef _WIN32
- lockfile_unlock_windows(lockfile);
-#else
- lockfile_unhash(lockfile);
-#endif
+ lockfile_do_unlock(lockfile);
ovs_mutex_unlock(&lock_table_mutex);
COVERAGE_INC(lockfile_unlock);
#ifdef _WIN32
static void
-lockfile_unlock_windows(struct lockfile *lockfile)
+lockfile_do_unlock(struct lockfile *lockfile)
OVS_REQUIRES(&lock_table_mutex)
{
if (lockfile->fd >= 0) {
}
static int
-lockfile_try_lock_windows(const char *name, pid_t *pidp,
- struct lockfile **lockfilep)
+lockfile_try_lock(const char *name, pid_t *pidp, struct lockfile **lockfilep)
OVS_REQUIRES(&lock_table_mutex)
{
HANDLE lock_handle;
*lockfilep = lockfile;
return 0;
}
-#endif
+#else /* !_WIN32 */
+static void
+lockfile_do_unlock(struct lockfile *lockfile)
+{
+ lockfile_unhash(lockfile);
+}
-#ifndef _WIN32
static int
-lockfile_try_lock_posix(const char *name, pid_t *pidp,
- struct lockfile **lockfilep)
+lockfile_try_lock(const char *name, pid_t *pidp, struct lockfile **lockfilep)
OVS_REQUIRES(&lock_table_mutex)
{
struct flock l;
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
mac_table_hash(const struct mac_learning *ml, const uint8_t mac[ETH_ADDR_LEN],
uint16_t vlan)
{
- unsigned int mac1 = get_unaligned_u32(ALIGNED_CAST(uint32_t *, mac));
- unsigned int mac2 = get_unaligned_u16(ALIGNED_CAST(uint16_t *, mac + 4));
- return hash_3words(mac1, mac2 | (vlan << 16), ml->secret);
+ return hash_mac(mac, vlan, ml->secret);
}
static struct mac_entry *
bitmap_free(ml->flood_vlans);
ovs_rwlock_destroy(&ml->rwlock);
- ovs_refcount_destroy(&ml->ref_cnt);
free(ml);
}
}
flow_zero_wildcards(&match->flow, &match->wc);
}
+void
+match_set_dp_hash(struct match *match, uint32_t value)
+{
+ match_set_dp_hash_masked(match, value, UINT32_MAX);
+}
+
+void
+match_set_dp_hash_masked(struct match *match, uint32_t value, uint32_t mask)
+{
+ match->wc.masks.dp_hash = mask;
+ match->flow.dp_hash = value & mask;
+}
+
+void
+match_set_recirc_id(struct match *match, uint32_t value)
+{
+ match->flow.recirc_id = value;
+ match->wc.masks.recirc_id = UINT32_MAX;
+}
+
void
match_set_reg(struct match *match, unsigned int reg_idx, uint32_t value)
{
int i;
- BUILD_ASSERT_DECL(FLOW_WC_SEQ == 24);
+ BUILD_ASSERT_DECL(FLOW_WC_SEQ == 25);
if (priority != OFP_DEFAULT_PRIORITY) {
ds_put_format(s, "priority=%u,", priority);
format_uint32_masked(s, "pkt_mark", f->pkt_mark, wc->masks.pkt_mark);
+ if (wc->masks.recirc_id) {
+ format_uint32_masked(s, "recirc_id", f->recirc_id,
+ wc->masks.recirc_id);
+ }
+
+ if (f->dp_hash && wc->masks.dp_hash) {
+ format_uint32_masked(s, "dp_hash", f->dp_hash,
+ wc->masks.dp_hash);
+ }
+
if (wc->masks.skb_priority) {
ds_put_format(s, "skb_priority=%#"PRIx32",", f->skb_priority);
}
void match_zero_wildcarded_fields(struct match *);
+void match_set_dp_hash(struct match *, uint32_t value);
+void match_set_dp_hash_masked(struct match *, uint32_t value, uint32_t mask);
+
+void match_set_recirc_id(struct match *, uint32_t value);
+void match_set_recirc_id_masked(struct match *, uint32_t value, uint32_t mask);
+
void match_set_reg(struct match *, unsigned int reg_idx, uint32_t value);
void match_set_reg_masked(struct match *, unsigned int reg_idx,
uint32_t value, uint32_t mask);
/* ## -------- ## */
{
+ MFF_DP_HASH, "dp_hash", NULL,
+ MF_FIELD_SIZES(be32),
+ MFM_FULLY,
+ MFS_HEXADECIMAL,
+ MFP_NONE,
+ false,
+ NXM_NX_DP_HASH, "NXM_NX_DP_HASH",
+ NXM_NX_DP_HASH, "NXM_NX_DP_HASH",
+ OFPUTIL_P_NXM_OXM_ANY,
+ OFPUTIL_P_NXM_OXM_ANY,
+ -1,
+ }, {
+ MFF_RECIRC_ID, "recirc_id", NULL,
+ MF_FIELD_SIZES(be32),
+ MFM_NONE,
+ MFS_DECIMAL,
+ MFP_NONE,
+ false,
+ NXM_NX_RECIRC_ID, "NXM_NX_RECIRC_ID",
+ NXM_NX_RECIRC_ID, "NXM_NX_RECIRC_ID",
+ OFPUTIL_P_NXM_OXM_ANY,
+ OFPUTIL_P_NXM_OXM_ANY,
+ -1,
+ }, {
MFF_TUN_ID, "tun_id", "tunnel_id",
MF_FIELD_SIZES(be64),
MFM_FULLY,
mf_is_all_wild(const struct mf_field *mf, const struct flow_wildcards *wc)
{
switch (mf->id) {
+ case MFF_DP_HASH:
+ return !wc->masks.dp_hash;
+ case MFF_RECIRC_ID:
+ return !wc->masks.recirc_id;
case MFF_TUN_SRC:
return !wc->masks.tunnel.ip_src;
case MFF_TUN_DST:
mf_is_value_valid(const struct mf_field *mf, const union mf_value *value)
{
switch (mf->id) {
+ case MFF_DP_HASH:
+ case MFF_RECIRC_ID:
case MFF_TUN_ID:
case MFF_TUN_SRC:
case MFF_TUN_DST:
union mf_value *value)
{
switch (mf->id) {
+ case MFF_DP_HASH:
+ value->be32 = htonl(flow->dp_hash);
+ break;
+ case MFF_RECIRC_ID:
+ value->be32 = htonl(flow->recirc_id);
+ break;
case MFF_TUN_ID:
value->be64 = flow->tunnel.tun_id;
break;
const union mf_value *value, struct match *match)
{
switch (mf->id) {
+ case MFF_DP_HASH:
+ match_set_dp_hash(match, ntohl(value->be32));
+ break;
+ case MFF_RECIRC_ID:
+ match_set_recirc_id(match, ntohl(value->be32));
+ break;
case MFF_TUN_ID:
match_set_tun_id(match, value->be64);
break;
const union mf_value *value, struct flow *flow)
{
switch (mf->id) {
+ case MFF_DP_HASH:
+ flow->dp_hash = ntohl(value->be32);
+ break;
+ case MFF_RECIRC_ID:
+ flow->recirc_id = ntohl(value->be32);
+ break;
case MFF_TUN_ID:
flow->tunnel.tun_id = value->be64;
break;
mf_set_wild(const struct mf_field *mf, struct match *match)
{
switch (mf->id) {
+ case MFF_DP_HASH:
+ match->flow.dp_hash = 0;
+ match->wc.masks.dp_hash = 0;
+ break;
+ case MFF_RECIRC_ID:
+ match->flow.recirc_id = 0;
+ match->wc.masks.recirc_id = 0;
+ break;
case MFF_TUN_ID:
match_set_tun_id_masked(match, htonll(0), htonll(0));
break;
}
switch (mf->id) {
+ case MFF_RECIRC_ID:
case MFF_IN_PORT:
case MFF_IN_PORT_OXM:
case MFF_SKB_PRIORITY:
case MFF_ICMPV6_CODE:
return OFPUTIL_P_NONE;
+ case MFF_DP_HASH:
+ match_set_dp_hash_masked(match, ntohl(value->be32), ntohl(mask->be32));
+ break;
case MFF_TUN_ID:
match_set_tun_id_masked(match, value->be64, mask->be64);
break;
* to represent its value. */
enum OVS_PACKED_ENUM mf_field_id {
/* Metadata. */
+ MFF_DP_HASH, /* be32 */
+ MFF_RECIRC_ID, /* be32 */
MFF_TUN_ID, /* be64 */
MFF_TUN_SRC, /* be32 */
MFF_TUN_DST, /* be32 */
/*
* Copyright (c) 2011, 2013 Gaetano Catalli.
- * Copyright (c) 2013 YAMAMOTO Takashi.
+ * Copyright (c) 2013, 2014 YAMAMOTO Takashi.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include "rtbsd.h"
#include "connectivity.h"
#include "coverage.h"
+#include "dpif-netdev.h"
#include "dynamic-string.h"
#include "fatal-signal.h"
#include "ofpbuf.h"
VLOG_DEFINE_THIS_MODULE(netdev_bsd);
\f
-struct netdev_rx_bsd {
- struct netdev_rx up;
+struct netdev_rxq_bsd {
+ struct netdev_rxq up;
/* Packet capture descriptor for a system network device.
* For a tap device this is NULL. */
#endif
static void netdev_bsd_run(void);
+static int netdev_bsd_get_mtu(const struct netdev *netdev_, int *mtup);
static bool
is_netdev_bsd_class(const struct netdev_class *netdev_class)
return CONTAINER_OF(netdev, struct netdev_bsd, up);
}
-static struct netdev_rx_bsd *
-netdev_rx_bsd_cast(const struct netdev_rx *rx)
+static struct netdev_rxq_bsd *
+netdev_rxq_bsd_cast(const struct netdev_rxq *rxq)
{
- ovs_assert(is_netdev_bsd_class(netdev_get_class(rx->netdev)));
- return CONTAINER_OF(rx, struct netdev_rx_bsd, up);
+ ovs_assert(is_netdev_bsd_class(netdev_get_class(rxq->netdev)));
+ return CONTAINER_OF(rxq, struct netdev_rxq_bsd, up);
}
static const char *
return error;
}
-static struct netdev_rx *
-netdev_bsd_rx_alloc(void)
+static struct netdev_rxq *
+netdev_bsd_rxq_alloc(void)
{
- struct netdev_rx_bsd *rx = xzalloc(sizeof *rx);
- return &rx->up;
+ struct netdev_rxq_bsd *rxq = xzalloc(sizeof *rxq);
+ return &rxq->up;
}
static int
-netdev_bsd_rx_construct(struct netdev_rx *rx_)
+netdev_bsd_rxq_construct(struct netdev_rxq *rxq_)
{
- struct netdev_rx_bsd *rx = netdev_rx_bsd_cast(rx_);
- struct netdev *netdev_ = rx->up.netdev;
+ struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
+ struct netdev *netdev_ = rxq->up.netdev;
struct netdev_bsd *netdev = netdev_bsd_cast(netdev_);
int error;
if (!strcmp(netdev_get_type(netdev_), "tap")) {
- rx->pcap_handle = NULL;
- rx->fd = netdev->tap_fd;
+ rxq->pcap_handle = NULL;
+ rxq->fd = netdev->tap_fd;
error = 0;
} else {
ovs_mutex_lock(&netdev->mutex);
error = netdev_bsd_open_pcap(netdev_get_kernel_name(netdev_),
- &rx->pcap_handle, &rx->fd);
+ &rxq->pcap_handle, &rxq->fd);
ovs_mutex_unlock(&netdev->mutex);
}
}
static void
-netdev_bsd_rx_destruct(struct netdev_rx *rx_)
+netdev_bsd_rxq_destruct(struct netdev_rxq *rxq_)
{
- struct netdev_rx_bsd *rx = netdev_rx_bsd_cast(rx_);
+ struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
- if (rx->pcap_handle) {
- pcap_close(rx->pcap_handle);
+ if (rxq->pcap_handle) {
+ pcap_close(rxq->pcap_handle);
}
}
static void
-netdev_bsd_rx_dealloc(struct netdev_rx *rx_)
+netdev_bsd_rxq_dealloc(struct netdev_rxq *rxq_)
{
- struct netdev_rx_bsd *rx = netdev_rx_bsd_cast(rx_);
+ struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
- free(rx);
+ free(rxq);
}
/* The recv callback of the netdev class returns the number of bytes of the
* This function attempts to receive a packet from the specified network
* device. It is assumed that the network device is a system device or a tap
* device opened as a system one. In this case the read operation is performed
- * from rx->pcap.
+ * from rxq->pcap.
*/
static int
-netdev_rx_bsd_recv_pcap(struct netdev_rx_bsd *rx, struct ofpbuf *buffer)
+netdev_rxq_bsd_recv_pcap(struct netdev_rxq_bsd *rxq, struct ofpbuf *buffer)
{
struct pcap_arg arg;
int ret;
arg.data = buffer->data;
for (;;) {
- ret = pcap_dispatch(rx->pcap_handle, 1, proc_pkt, (u_char *) &arg);
+ ret = pcap_dispatch(rxq->pcap_handle, 1, proc_pkt, (u_char *) &arg);
if (ret > 0) {
buffer->size += arg.retval;
/*
* This function attempts to receive a packet from the specified network
* device. It is assumed that the network device is a tap device and
- * 'rx->fd' is initialized with the tap file descriptor.
+ * 'rxq->fd' is initialized with the tap file descriptor.
*/
static int
-netdev_rx_bsd_recv_tap(struct netdev_rx_bsd *rx, struct ofpbuf *buffer)
+netdev_rxq_bsd_recv_tap(struct netdev_rxq_bsd *rxq, struct ofpbuf *buffer)
{
size_t size = ofpbuf_tailroom(buffer);
for (;;) {
- ssize_t retval = read(rx->fd, buffer->data, size);
+ ssize_t retval = read(rxq->fd, buffer->data, size);
if (retval >= 0) {
buffer->size += retval;
return 0;
} else if (errno != EINTR) {
if (errno != EAGAIN) {
VLOG_WARN_RL(&rl, "error receiving Ethernet packet on %s: %s",
- ovs_strerror(errno), netdev_rx_get_name(&rx->up));
+ ovs_strerror(errno), netdev_rxq_get_name(&rxq->up));
}
return errno;
}
}
static int
-netdev_bsd_rx_recv(struct netdev_rx *rx_, struct ofpbuf *buffer)
+netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct ofpbuf **packet, int *c)
{
- struct netdev_rx_bsd *rx = netdev_rx_bsd_cast(rx_);
+ struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
+ struct netdev *netdev = rxq->up.netdev;
+ struct ofpbuf *buffer;
+ ssize_t retval;
+ int mtu;
+
+ if (netdev_bsd_get_mtu(netdev, &mtu)) {
+ mtu = ETH_PAYLOAD_MAX;
+ }
+
+    buffer = ofpbuf_new_with_headroom(VLAN_ETH_HEADER_LEN + mtu,
+                                      DP_NETDEV_HEADROOM);
+
+ retval = (rxq->pcap_handle
+ ? netdev_rxq_bsd_recv_pcap(rxq, buffer)
+ : netdev_rxq_bsd_recv_tap(rxq, buffer));
- return (rx->pcap_handle
- ? netdev_rx_bsd_recv_pcap(rx, buffer)
- : netdev_rx_bsd_recv_tap(rx, buffer));
+ if (retval) {
+ ofpbuf_delete(buffer);
+ } else {
+ dp_packet_pad(buffer);
+ packet[0] = buffer;
+ *c = 1;
+ }
+ return retval;
}
/*
* Registers with the poll loop to wake up from the next call to poll_block()
- * when a packet is ready to be received with netdev_rx_recv() on 'rx'.
+ * when a packet is ready to be received with netdev_rxq_recv() on 'rxq'.
*/
static void
-netdev_bsd_rx_wait(struct netdev_rx *rx_)
+netdev_bsd_rxq_wait(struct netdev_rxq *rxq_)
{
- struct netdev_rx_bsd *rx = netdev_rx_bsd_cast(rx_);
+ struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
- poll_fd_wait(rx->fd, POLLIN);
+ poll_fd_wait(rxq->fd, POLLIN);
}
-/* Discards all packets waiting to be received from 'rx'. */
+/* Discards all packets waiting to be received from 'rxq'. */
static int
-netdev_bsd_rx_drain(struct netdev_rx *rx_)
+netdev_bsd_rxq_drain(struct netdev_rxq *rxq_)
{
struct ifreq ifr;
- struct netdev_rx_bsd *rx = netdev_rx_bsd_cast(rx_);
+ struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
- strcpy(ifr.ifr_name, netdev_get_kernel_name(netdev_rx_get_netdev(rx_)));
- if (ioctl(rx->fd, BIOCFLUSH, &ifr) == -1) {
+ strcpy(ifr.ifr_name, netdev_get_kernel_name(netdev_rxq_get_netdev(rxq_)));
+ if (ioctl(rxq->fd, BIOCFLUSH, &ifr) == -1) {
VLOG_DBG_RL(&rl, "%s: ioctl(BIOCFLUSH) failed: %s",
- netdev_rx_get_name(rx_), ovs_strerror(errno));
+ netdev_rxq_get_name(rxq_), ovs_strerror(errno));
return errno;
}
return 0;
* system or a tap device.
*/
static int
-netdev_bsd_send(struct netdev *netdev_, const void *data, size_t size)
+netdev_bsd_send(struct netdev *netdev_, struct ofpbuf *pkt, bool may_steal)
{
struct netdev_bsd *dev = netdev_bsd_cast(netdev_);
const char *name = netdev_get_name(netdev_);
+ const void *data = pkt->data;
+ size_t size = pkt->size;
int error;
ovs_mutex_lock(&dev->mutex);
}
ovs_mutex_unlock(&dev->mutex);
+ if (may_steal) {
+ ofpbuf_delete(pkt);
+ }
+
return error;
}
convert_stats_tap(struct netdev_stats *stats, const struct if_data *ifd)
{
/*
 * Similar to convert_stats_system but swapping rx and tx
* because 'ifd' is stats for the network interface side of the
* tap device and what the caller wants is one for the character
* device side.
\
netdev_bsd_update_flags, \
\
- netdev_bsd_rx_alloc, \
- netdev_bsd_rx_construct, \
- netdev_bsd_rx_destruct, \
- netdev_bsd_rx_dealloc, \
- netdev_bsd_rx_recv, \
- netdev_bsd_rx_wait, \
- netdev_bsd_rx_drain, \
+ netdev_bsd_rxq_alloc, \
+ netdev_bsd_rxq_construct, \
+ netdev_bsd_rxq_destruct, \
+ netdev_bsd_rxq_dealloc, \
+ netdev_bsd_rxq_recv, \
+ netdev_bsd_rxq_wait, \
+ netdev_bsd_rxq_drain, \
}
const struct netdev_class netdev_bsd_class =
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include <errno.h>
+#include <pthread.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "connectivity.h"
+#include "dpif-netdev.h"
+#include "list.h"
+#include "netdev-dpdk.h"
+#include "netdev-provider.h"
+#include "netdev-vport.h"
+#include "odp-util.h"
+#include "ofp-print.h"
+#include "ofpbuf.h"
+#include "ovs-thread.h"
+#include "ovs-rcu.h"
+#include "packets.h"
+#include "shash.h"
+#include "seq.h"
+#include "sset.h"
+#include "unaligned.h"
+#include "timeval.h"
+#include "unixctl.h"
+#include "vlog.h"
+
+VLOG_DEFINE_THIS_MODULE(dpdk);
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+
+#define DPDK_PORT_WATCHDOG_INTERVAL 5
+
+#define OVS_CACHE_LINE_SIZE CACHE_LINE_SIZE
+#define OVS_VPORT_DPDK "ovs_dpdk"
+
+/*
+ * need to reserve tons of extra space in the mbufs so we can align the
+ * DMA addresses to 4KB.
+ */
+
+#define MTU_TO_MAX_LEN(mtu) ((mtu) + ETHER_HDR_LEN + ETHER_CRC_LEN)
+#define MBUF_SIZE(mtu) (MTU_TO_MAX_LEN(mtu) + (512) + \
+ sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+
+/* TODO: mempool size should be based on system resources. */
+#define NB_MBUF (4096 * 64)
+#define MP_CACHE_SZ (256 * 2)
+#define SOCKET0 0
+
+#define NON_PMD_THREAD_TX_QUEUE 0
+
+/* TODO: Needs per NIC value for these constants. */
+#define RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */
+#define RX_HTHRESH 32 /* Default values of RX host threshold reg. */
+#define RX_WTHRESH 16 /* Default values of RX write-back threshold reg. */
+
+#define TX_PTHRESH 36 /* Default values of TX prefetch threshold reg. */
+#define TX_HTHRESH 0 /* Default values of TX host threshold reg. */
+#define TX_WTHRESH 0 /* Default values of TX write-back threshold reg. */
+
+static const struct rte_eth_conf port_conf = {
+ .rxmode = {
+ .mq_mode = ETH_MQ_RX_RSS,
+ .split_hdr_size = 0,
+ .header_split = 0, /* Header Split disabled */
+ .hw_ip_checksum = 0, /* IP checksum offload disabled */
+ .hw_vlan_filter = 0, /* VLAN filtering disabled */
+ .jumbo_frame = 0, /* Jumbo Frame Support disabled */
+ .hw_strip_crc = 0,
+ },
+ .rx_adv_conf = {
+ .rss_conf = {
+ .rss_key = NULL,
+ .rss_hf = ETH_RSS_IPV4_TCP | ETH_RSS_IPV4 | ETH_RSS_IPV6,
+ },
+ },
+ .txmode = {
+ .mq_mode = ETH_MQ_TX_NONE,
+ },
+};
+
+static const struct rte_eth_rxconf rx_conf = {
+ .rx_thresh = {
+ .pthresh = RX_PTHRESH,
+ .hthresh = RX_HTHRESH,
+ .wthresh = RX_WTHRESH,
+ },
+};
+
+static const struct rte_eth_txconf tx_conf = {
+ .tx_thresh = {
+ .pthresh = TX_PTHRESH,
+ .hthresh = TX_HTHRESH,
+ .wthresh = TX_WTHRESH,
+ },
+ .tx_free_thresh = 0,
+ .tx_rs_thresh = 0,
+};
+
+enum { MAX_RX_QUEUE_LEN = 64 };
+enum { MAX_TX_QUEUE_LEN = 64 };
+enum { DRAIN_TSC = 200000ULL };
+
+static int rte_eal_init_ret = ENODEV;
+
+static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
+
+/* Contains all 'struct dpdk_dev's. */
+static struct list dpdk_list OVS_GUARDED_BY(dpdk_mutex)
+ = LIST_INITIALIZER(&dpdk_list);
+
+static struct list dpdk_mp_list OVS_GUARDED_BY(dpdk_mutex)
+ = LIST_INITIALIZER(&dpdk_mp_list);
+
+static pthread_t watchdog_thread;
+
+struct dpdk_mp {
+ struct rte_mempool *mp;
+ int mtu;
+ int socket_id;
+ int refcount;
+ struct list list_node OVS_GUARDED_BY(dpdk_mutex);
+};
+
+struct dpdk_tx_queue {
+ rte_spinlock_t tx_lock;
+ int count;
+ uint64_t tsc;
+ struct rte_mbuf *burst_pkts[MAX_TX_QUEUE_LEN];
+};
+
+struct netdev_dpdk {
+ struct netdev up;
+ int port_id;
+ int max_packet_len;
+
+ struct dpdk_tx_queue tx_q[NR_QUEUE];
+
+ struct ovs_mutex mutex OVS_ACQ_AFTER(dpdk_mutex);
+
+ struct dpdk_mp *dpdk_mp;
+ int mtu;
+ int socket_id;
+ int buf_size;
+ struct netdev_stats stats_offset;
+ struct netdev_stats stats;
+
+ uint8_t hwaddr[ETH_ADDR_LEN];
+ enum netdev_flags flags;
+
+ struct rte_eth_link link;
+ int link_reset_cnt;
+
+ /* In dpdk_list. */
+ struct list list_node OVS_GUARDED_BY(dpdk_mutex);
+};
+
+struct netdev_rxq_dpdk {
+ struct netdev_rxq up;
+ int port_id;
+};
+
+static int netdev_dpdk_construct(struct netdev *);
+
+static bool
+is_dpdk_class(const struct netdev_class *class)
+{
+ return class->construct == netdev_dpdk_construct;
+}
+
+/* TODO: use dpdk malloc for entire OVS.  In fact, huge pages should be used
+ * for all other segments: data, bss and text. */
+
+static void *
+dpdk_rte_mzalloc(size_t sz)
+{
+ void *ptr;
+
+ ptr = rte_zmalloc(OVS_VPORT_DPDK, sz, OVS_CACHE_LINE_SIZE);
+ if (ptr == NULL) {
+ out_of_memory();
+ }
+ return ptr;
+}
+
+void
+free_dpdk_buf(struct ofpbuf *b)
+{
+ struct rte_mbuf *pkt;
+
+ pkt = b->private_p;
+ if (!pkt) {
+ return;
+ }
+
+ rte_mempool_put(pkt->pool, pkt);
+}
+
+static struct dpdk_mp *
+dpdk_mp_get(int socket_id, int mtu) OVS_REQUIRES(dpdk_mutex)
+{
+ struct dpdk_mp *dmp = NULL;
+ char mp_name[RTE_MEMPOOL_NAMESIZE];
+
+ LIST_FOR_EACH (dmp, list_node, &dpdk_mp_list) {
+ if (dmp->socket_id == socket_id && dmp->mtu == mtu) {
+ dmp->refcount++;
+ return dmp;
+ }
+ }
+
+ dmp = dpdk_rte_mzalloc(sizeof *dmp);
+ dmp->socket_id = socket_id;
+ dmp->mtu = mtu;
+ dmp->refcount = 1;
+
+ snprintf(mp_name, RTE_MEMPOOL_NAMESIZE, "ovs_mp_%d", dmp->mtu);
+ dmp->mp = rte_mempool_create(mp_name, NB_MBUF, MBUF_SIZE(mtu),
+ MP_CACHE_SZ,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ socket_id, 0);
+
+ if (dmp->mp == NULL) {
+ return NULL;
+ }
+
+ list_push_back(&dpdk_mp_list, &dmp->list_node);
+ return dmp;
+}
+
+static void
+dpdk_mp_put(struct dpdk_mp *dmp)
+{
+
+ if (!dmp) {
+ return;
+ }
+
+ dmp->refcount--;
+ ovs_assert(dmp->refcount >= 0);
+
+#if 0
+ /* I could not find any API to destroy mp. */
+ if (dmp->refcount == 0) {
+ list_delete(dmp->list_node);
+ /* destroy mp-pool. */
+ }
+#endif
+}
+
+static void
+check_link_status(struct netdev_dpdk *dev)
+{
+ struct rte_eth_link link;
+
+ rte_eth_link_get_nowait(dev->port_id, &link);
+
+ if (dev->link.link_status != link.link_status) {
+ seq_change(connectivity_seq_get());
+
+ dev->link_reset_cnt++;
+ dev->link = link;
+ if (dev->link.link_status) {
+ VLOG_DBG_RL(&rl, "Port %d Link Up - speed %u Mbps - %s",
+ dev->port_id, (unsigned)dev->link.link_speed,
+ (dev->link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+ ("full-duplex") : ("half-duplex"));
+ } else {
+ VLOG_DBG_RL(&rl, "Port %d Link Down", dev->port_id);
+ }
+ }
+}
+
+static void *
+dpdk_watchdog(void *dummy OVS_UNUSED)
+{
+ struct netdev_dpdk *dev;
+
+ pthread_detach(pthread_self());
+
+ for (;;) {
+ ovs_mutex_lock(&dpdk_mutex);
+ LIST_FOR_EACH (dev, list_node, &dpdk_list) {
+ ovs_mutex_lock(&dev->mutex);
+ check_link_status(dev);
+ ovs_mutex_unlock(&dev->mutex);
+ }
+ ovs_mutex_unlock(&dpdk_mutex);
+ xsleep(DPDK_PORT_WATCHDOG_INTERVAL);
+ }
+
+ return NULL;
+}
+
+static int
+dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex)
+{
+ struct rte_pktmbuf_pool_private *mbp_priv;
+ struct ether_addr eth_addr;
+ int diag;
+ int i;
+
+ if (dev->port_id < 0 || dev->port_id >= rte_eth_dev_count()) {
+ return -ENODEV;
+ }
+
+ diag = rte_eth_dev_configure(dev->port_id, NR_QUEUE, NR_QUEUE, &port_conf);
+ if (diag) {
+ VLOG_ERR("eth dev config error %d", diag);
+ return diag;
+ }
+
+ for (i = 0; i < NR_QUEUE; i++) {
+ diag = rte_eth_tx_queue_setup(dev->port_id, i, 64, 0, &tx_conf);
+ if (diag) {
+ VLOG_ERR("eth dev tx queue setup error %d", diag);
+ return diag;
+ }
+ }
+
+ for (i = 0; i < NR_QUEUE; i++) {
+ diag = rte_eth_rx_queue_setup(dev->port_id, i, 64, 0, &rx_conf,
+ dev->dpdk_mp->mp);
+ if (diag) {
+ VLOG_ERR("eth dev rx queue setup error %d", diag);
+ return diag;
+ }
+ }
+
+ diag = rte_eth_dev_start(dev->port_id);
+ if (diag) {
+ VLOG_ERR("eth dev start error %d", diag);
+ return diag;
+ }
+
+ rte_eth_promiscuous_enable(dev->port_id);
+ rte_eth_allmulticast_enable(dev->port_id);
+
+ memset(&eth_addr, 0x0, sizeof(eth_addr));
+ rte_eth_macaddr_get(dev->port_id, &eth_addr);
+ VLOG_INFO_RL(&rl, "Port %d: "ETH_ADDR_FMT"",
+ dev->port_id, ETH_ADDR_ARGS(eth_addr.addr_bytes));
+
+ memcpy(dev->hwaddr, eth_addr.addr_bytes, ETH_ADDR_LEN);
+ rte_eth_link_get_nowait(dev->port_id, &dev->link);
+
+ mbp_priv = rte_mempool_get_priv(dev->dpdk_mp->mp);
+ dev->buf_size = mbp_priv->mbuf_data_room_size - RTE_PKTMBUF_HEADROOM;
+
+ dev->flags = NETDEV_UP | NETDEV_PROMISC;
+ return 0;
+}
+
+static struct netdev_dpdk *
+netdev_dpdk_cast(const struct netdev *netdev)
+{
+ return CONTAINER_OF(netdev, struct netdev_dpdk, up);
+}
+
+static struct netdev *
+netdev_dpdk_alloc(void)
+{
+ struct netdev_dpdk *netdev = dpdk_rte_mzalloc(sizeof *netdev);
+ return &netdev->up;
+}
+
+static int
+netdev_dpdk_construct(struct netdev *netdev_)
+{
+ struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_);
+ unsigned int port_no;
+ char *cport;
+ int err;
+ int i;
+
+ if (rte_eal_init_ret) {
+ return rte_eal_init_ret;
+ }
+
+ ovs_mutex_lock(&dpdk_mutex);
+ cport = netdev_->name + 4; /* Names always start with "dpdk" */
+
+ if (strncmp(netdev_->name, "dpdk", 4)) {
+ err = ENODEV;
+ goto unlock_dpdk;
+ }
+
+ port_no = strtol(cport, 0, 0); /* string must be null terminated */
+
+ for (i = 0; i < NR_QUEUE; i++) {
+ rte_spinlock_init(&netdev->tx_q[i].tx_lock);
+ }
+
+ ovs_mutex_init(&netdev->mutex);
+
+ ovs_mutex_lock(&netdev->mutex);
+ netdev->flags = 0;
+
+ netdev->mtu = ETHER_MTU;
+ netdev->max_packet_len = MTU_TO_MAX_LEN(netdev->mtu);
+
+ /* TODO: need to discover device node at run time. */
+ netdev->socket_id = SOCKET0;
+ netdev->port_id = port_no;
+
+ netdev->dpdk_mp = dpdk_mp_get(netdev->socket_id, netdev->mtu);
+ if (!netdev->dpdk_mp) {
+ err = ENOMEM;
+ goto unlock_dev;
+ }
+
+ err = dpdk_eth_dev_init(netdev);
+ if (err) {
+ goto unlock_dev;
+ }
+ netdev_->n_rxq = NR_QUEUE;
+
+ list_push_back(&dpdk_list, &netdev->list_node);
+
+unlock_dev:
+ ovs_mutex_unlock(&netdev->mutex);
+unlock_dpdk:
+ ovs_mutex_unlock(&dpdk_mutex);
+ return err;
+}
+
+static void
+netdev_dpdk_destruct(struct netdev *netdev_)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev_);
+
+ ovs_mutex_lock(&dev->mutex);
+ rte_eth_dev_stop(dev->port_id);
+ ovs_mutex_unlock(&dev->mutex);
+
+ ovs_mutex_lock(&dpdk_mutex);
+ list_remove(&dev->list_node);
+ dpdk_mp_put(dev->dpdk_mp);
+ ovs_mutex_unlock(&dpdk_mutex);
+
+ ovs_mutex_destroy(&dev->mutex);
+}
+
+static void
+netdev_dpdk_dealloc(struct netdev *netdev_)
+{
+ struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_);
+
+ rte_free(netdev);
+}
+
+static int
+netdev_dpdk_get_config(const struct netdev *netdev_, struct smap *args)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev_);
+
+ ovs_mutex_lock(&dev->mutex);
+
+ /* TODO: Allow to configure number of queues. */
+ smap_add_format(args, "configured_rx_queues", "%u", netdev_->n_rxq);
+ smap_add_format(args, "configured_tx_queues", "%u", netdev_->n_rxq);
+ ovs_mutex_unlock(&dev->mutex);
+
+ return 0;
+}
+
+static struct netdev_rxq *
+netdev_dpdk_rxq_alloc(void)
+{
+ struct netdev_rxq_dpdk *rx = dpdk_rte_mzalloc(sizeof *rx);
+
+ return &rx->up;
+}
+
+static struct netdev_rxq_dpdk *
+netdev_rxq_dpdk_cast(const struct netdev_rxq *rx)
+{
+ return CONTAINER_OF(rx, struct netdev_rxq_dpdk, up);
+}
+
+static int
+netdev_dpdk_rxq_construct(struct netdev_rxq *rxq_)
+{
+ struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq_);
+ struct netdev_dpdk *netdev = netdev_dpdk_cast(rx->up.netdev);
+
+ ovs_mutex_lock(&netdev->mutex);
+ rx->port_id = netdev->port_id;
+ ovs_mutex_unlock(&netdev->mutex);
+
+ return 0;
+}
+
+static void
+netdev_dpdk_rxq_destruct(struct netdev_rxq *rxq_ OVS_UNUSED)
+{
+}
+
+static void
+netdev_dpdk_rxq_dealloc(struct netdev_rxq *rxq_)
+{
+ struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq_);
+
+ rte_free(rx);
+}
+
+inline static void
+dpdk_queue_flush(struct netdev_dpdk *dev, int qid)
+{
+ struct dpdk_tx_queue *txq = &dev->tx_q[qid];
+ uint32_t nb_tx;
+
+ if (txq->count == 0) {
+ return;
+ }
+ rte_spinlock_lock(&txq->tx_lock);
+ nb_tx = rte_eth_tx_burst(dev->port_id, qid, txq->burst_pkts, txq->count);
+ if (nb_tx != txq->count) {
+ /* free buffers if we couldn't transmit packets */
+ rte_mempool_put_bulk(dev->dpdk_mp->mp,
+ (void **) &txq->burst_pkts[nb_tx],
+ (txq->count - nb_tx));
+ }
+ txq->count = 0;
+ rte_spinlock_unlock(&txq->tx_lock);
+}
+
+inline static struct ofpbuf *
+build_ofpbuf(struct rte_mbuf *pkt)
+{
+ struct ofpbuf *b;
+
+ b = ofpbuf_new(0);
+ b->private_p = pkt;
+
+ b->data = pkt->pkt.data;
+ b->base = (char *)b->data - DP_NETDEV_HEADROOM - VLAN_ETH_HEADER_LEN;
+ b->allocated = pkt->buf_len;
+ b->source = OFPBUF_DPDK;
+ b->size = rte_pktmbuf_data_len(pkt);
+
+ dp_packet_pad(b);
+
+ return b;
+}
+
+static int
+netdev_dpdk_rxq_recv(struct netdev_rxq *rxq_, struct ofpbuf **packet, int *c)
+{
+ struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq_);
+ struct netdev *netdev = rx->up.netdev;
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+ struct rte_mbuf *burst_pkts[MAX_RX_QUEUE_LEN];
+ int nb_rx;
+ int i;
+
+ dpdk_queue_flush(dev, rxq_->queue_id);
+
+ nb_rx = rte_eth_rx_burst(rx->port_id, rxq_->queue_id,
+ burst_pkts, MAX_RX_QUEUE_LEN);
+ if (!nb_rx) {
+ return EAGAIN;
+ }
+
+ for (i = 0; i < nb_rx; i++) {
+ packet[i] = build_ofpbuf(burst_pkts[i]);
+ }
+
+ *c = nb_rx;
+
+ return 0;
+}
+
+inline static void
+dpdk_queue_pkt(struct netdev_dpdk *dev, int qid,
+ struct rte_mbuf *pkt)
+{
+ struct dpdk_tx_queue *txq = &dev->tx_q[qid];
+ uint64_t diff_tsc;
+ uint64_t cur_tsc;
+ uint32_t nb_tx;
+
+ rte_spinlock_lock(&txq->tx_lock);
+ txq->burst_pkts[txq->count++] = pkt;
+ if (txq->count == MAX_TX_QUEUE_LEN) {
+ goto flush;
+ }
+ cur_tsc = rte_get_timer_cycles();
+ if (txq->count == 1) {
+ txq->tsc = cur_tsc;
+ }
+ diff_tsc = cur_tsc - txq->tsc;
+ if (diff_tsc >= DRAIN_TSC) {
+ goto flush;
+ }
+ rte_spinlock_unlock(&txq->tx_lock);
+ return;
+
+flush:
+ nb_tx = rte_eth_tx_burst(dev->port_id, qid, txq->burst_pkts, txq->count);
+ if (nb_tx != txq->count) {
+ /* free buffers if we couldn't transmit packets */
+ rte_mempool_put_bulk(dev->dpdk_mp->mp,
+ (void **) &txq->burst_pkts[nb_tx],
+ (txq->count - nb_tx));
+ }
+ txq->count = 0;
+ rte_spinlock_unlock(&txq->tx_lock);
+}
+
+/* Tx function. Transmit packets indefinitely */
+static void
+dpdk_do_tx_copy(struct netdev *netdev, char *buf, int size)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+ struct rte_mbuf *pkt;
+
+ pkt = rte_pktmbuf_alloc(dev->dpdk_mp->mp);
+ if (!pkt) {
+ ovs_mutex_lock(&dev->mutex);
+ dev->stats.tx_dropped++;
+ ovs_mutex_unlock(&dev->mutex);
+ return;
+ }
+
+ /* We have to do a copy for now */
+ memcpy(pkt->pkt.data, buf, size);
+
+ rte_pktmbuf_data_len(pkt) = size;
+ rte_pktmbuf_pkt_len(pkt) = size;
+
+ dpdk_queue_pkt(dev, NON_PMD_THREAD_TX_QUEUE, pkt);
+ dpdk_queue_flush(dev, NON_PMD_THREAD_TX_QUEUE);
+}
+
+static int
+netdev_dpdk_send(struct netdev *netdev,
+ struct ofpbuf *ofpbuf, bool may_steal)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+ int ret;
+
+ if (ofpbuf->size > dev->max_packet_len) {
+ VLOG_WARN_RL(&rl, "Too big size %d max_packet_len %d",
+ (int)ofpbuf->size, dev->max_packet_len);
+
+ ovs_mutex_lock(&dev->mutex);
+ dev->stats.tx_dropped++;
+ ovs_mutex_unlock(&dev->mutex);
+
+ ret = E2BIG;
+ goto out;
+ }
+
+ rte_prefetch0(&ofpbuf->private_p);
+ if (!may_steal ||
+ !ofpbuf->private_p || ofpbuf->source != OFPBUF_DPDK) {
+ dpdk_do_tx_copy(netdev, (char *) ofpbuf->data, ofpbuf->size);
+ } else {
+ struct rte_mbuf *pkt;
+ int qid;
+
+ pkt = ofpbuf->private_p;
+ ofpbuf->private_p = NULL;
+ rte_pktmbuf_data_len(pkt) = ofpbuf->size;
+ rte_pktmbuf_pkt_len(pkt) = ofpbuf->size;
+
+ qid = rte_lcore_id() % NR_QUEUE;
+
+ dpdk_queue_pkt(dev, qid, pkt);
+ }
+ ret = 0;
+
+out:
+ if (may_steal) {
+ ofpbuf_delete(ofpbuf);
+ }
+
+ return ret;
+}
+
+static int
+netdev_dpdk_set_etheraddr(struct netdev *netdev,
+ const uint8_t mac[ETH_ADDR_LEN])
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+ ovs_mutex_lock(&dev->mutex);
+ if (!eth_addr_equals(dev->hwaddr, mac)) {
+ memcpy(dev->hwaddr, mac, ETH_ADDR_LEN);
+ }
+ ovs_mutex_unlock(&dev->mutex);
+
+ return 0;
+}
+
+static int
+netdev_dpdk_get_etheraddr(const struct netdev *netdev,
+ uint8_t mac[ETH_ADDR_LEN])
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+ ovs_mutex_lock(&dev->mutex);
+ memcpy(mac, dev->hwaddr, ETH_ADDR_LEN);
+ ovs_mutex_unlock(&dev->mutex);
+
+ return 0;
+}
+
+static int
+netdev_dpdk_get_mtu(const struct netdev *netdev, int *mtup)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+ ovs_mutex_lock(&dev->mutex);
+ *mtup = dev->mtu;
+ ovs_mutex_unlock(&dev->mutex);
+
+ return 0;
+}
+
+static int
+netdev_dpdk_set_mtu(const struct netdev *netdev, int mtu)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+ int old_mtu, err;
+ struct dpdk_mp *old_mp;
+ struct dpdk_mp *mp;
+
+ ovs_mutex_lock(&dpdk_mutex);
+ ovs_mutex_lock(&dev->mutex);
+ if (dev->mtu == mtu) {
+ err = 0;
+ goto out;
+ }
+
+ mp = dpdk_mp_get(dev->socket_id, mtu);
+ if (!mp) {
+ err = ENOMEM;
+ goto out;
+ }
+
+ rte_eth_dev_stop(dev->port_id);
+
+ old_mtu = dev->mtu;
+ old_mp = dev->dpdk_mp;
+ dev->dpdk_mp = mp;
+ dev->mtu = mtu;
+ dev->max_packet_len = MTU_TO_MAX_LEN(dev->mtu);
+
+ err = dpdk_eth_dev_init(dev);
+ if (err) {
+
+ dpdk_mp_put(mp);
+ dev->mtu = old_mtu;
+ dev->dpdk_mp = old_mp;
+ dev->max_packet_len = MTU_TO_MAX_LEN(dev->mtu);
+ dpdk_eth_dev_init(dev);
+ goto out;
+ }
+
+ dpdk_mp_put(old_mp);
+out:
+ ovs_mutex_unlock(&dev->mutex);
+ ovs_mutex_unlock(&dpdk_mutex);
+ return err;
+}
+
+static int
+netdev_dpdk_get_carrier(const struct netdev *netdev_, bool *carrier);
+
+static int
+netdev_dpdk_get_stats(const struct netdev *netdev, struct netdev_stats *stats)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+ struct rte_eth_stats rte_stats;
+ bool gg;
+
+ netdev_dpdk_get_carrier(netdev, &gg);
+ ovs_mutex_lock(&dev->mutex);
+ rte_eth_stats_get(dev->port_id, &rte_stats);
+
+ *stats = dev->stats_offset;
+
+ stats->rx_packets += rte_stats.ipackets;
+ stats->tx_packets += rte_stats.opackets;
+ stats->rx_bytes += rte_stats.ibytes;
+ stats->tx_bytes += rte_stats.obytes;
+ stats->rx_errors += rte_stats.ierrors;
+ stats->tx_errors += rte_stats.oerrors;
+ stats->multicast += rte_stats.imcasts;
+
+ stats->tx_dropped += dev->stats.tx_dropped;
+ ovs_mutex_unlock(&dev->mutex);
+
+ return 0;
+}
+
+static int
+netdev_dpdk_set_stats(struct netdev *netdev, const struct netdev_stats *stats)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+ ovs_mutex_lock(&dev->mutex);
+ dev->stats_offset = *stats;
+ ovs_mutex_unlock(&dev->mutex);
+
+ return 0;
+}
+
+static int
+netdev_dpdk_get_features(const struct netdev *netdev_,
+ enum netdev_features *current,
+ enum netdev_features *advertised OVS_UNUSED,
+ enum netdev_features *supported OVS_UNUSED,
+ enum netdev_features *peer OVS_UNUSED)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev_);
+ struct rte_eth_link link;
+
+ ovs_mutex_lock(&dev->mutex);
+ link = dev->link;
+ ovs_mutex_unlock(&dev->mutex);
+
+ *current = 0;
+
+ if (link.link_duplex == ETH_LINK_AUTONEG_DUPLEX) {
+ if (link.link_speed == ETH_LINK_SPEED_AUTONEG) {
+ *current = NETDEV_F_AUTONEG;
+ }
+ } else if (link.link_duplex == ETH_LINK_HALF_DUPLEX) {
+ if (link.link_speed == ETH_LINK_SPEED_10) {
+ *current = NETDEV_F_10MB_HD;
+ }
+ if (link.link_speed == ETH_LINK_SPEED_100) {
+ *current = NETDEV_F_100MB_HD;
+ }
+ if (link.link_speed == ETH_LINK_SPEED_1000) {
+ *current = NETDEV_F_1GB_HD;
+ }
+ } else if (link.link_duplex == ETH_LINK_FULL_DUPLEX) {
+ if (link.link_speed == ETH_LINK_SPEED_10) {
+ *current = NETDEV_F_10MB_FD;
+ }
+ if (link.link_speed == ETH_LINK_SPEED_100) {
+ *current = NETDEV_F_100MB_FD;
+ }
+ if (link.link_speed == ETH_LINK_SPEED_1000) {
+ *current = NETDEV_F_1GB_FD;
+ }
+ if (link.link_speed == ETH_LINK_SPEED_10000) {
+ *current = NETDEV_F_10GB_FD;
+ }
+ }
+
+ return 0;
+}
+
+static int
+netdev_dpdk_get_ifindex(const struct netdev *netdev)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+ int ifindex;
+
+ ovs_mutex_lock(&dev->mutex);
+ ifindex = dev->port_id;
+ ovs_mutex_unlock(&dev->mutex);
+
+ return ifindex;
+}
+
+static int
+netdev_dpdk_get_carrier(const struct netdev *netdev_, bool *carrier)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev_);
+
+ ovs_mutex_lock(&dev->mutex);
+ check_link_status(dev);
+ *carrier = dev->link.link_status;
+ ovs_mutex_unlock(&dev->mutex);
+
+ return 0;
+}
+
+static long long int
+netdev_dpdk_get_carrier_resets(const struct netdev *netdev_)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev_);
+ long long int carrier_resets;
+
+ ovs_mutex_lock(&dev->mutex);
+ carrier_resets = dev->link_reset_cnt;
+ ovs_mutex_unlock(&dev->mutex);
+
+ return carrier_resets;
+}
+
+static int
+netdev_dpdk_set_miimon(struct netdev *netdev_ OVS_UNUSED,
+ long long int interval OVS_UNUSED)
+{
+ return 0;
+}
+
+static int
+netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
+ enum netdev_flags off, enum netdev_flags on,
+ enum netdev_flags *old_flagsp)
+ OVS_REQUIRES(dev->mutex)
+{
+ int err;
+
+ if ((off | on) & ~(NETDEV_UP | NETDEV_PROMISC)) {
+ return EINVAL;
+ }
+
+ *old_flagsp = dev->flags;
+ dev->flags |= on;
+ dev->flags &= ~off;
+
+ if (dev->flags == *old_flagsp) {
+ return 0;
+ }
+
+ if (dev->flags & NETDEV_UP) {
+ err = rte_eth_dev_start(dev->port_id);
+ if (err)
+ return err;
+ }
+
+ if (dev->flags & NETDEV_PROMISC) {
+ rte_eth_promiscuous_enable(dev->port_id);
+ }
+
+ if (!(dev->flags & NETDEV_UP)) {
+ rte_eth_dev_stop(dev->port_id);
+ }
+
+ return 0;
+}
+
+static int
+netdev_dpdk_update_flags(struct netdev *netdev_,
+ enum netdev_flags off, enum netdev_flags on,
+ enum netdev_flags *old_flagsp)
+{
+ struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_);
+ int error;
+
+ ovs_mutex_lock(&netdev->mutex);
+ error = netdev_dpdk_update_flags__(netdev, off, on, old_flagsp);
+ ovs_mutex_unlock(&netdev->mutex);
+
+ return error;
+}
+
+static int
+netdev_dpdk_get_status(const struct netdev *netdev_, struct smap *args)
+{
+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev_);
+ struct rte_eth_dev_info dev_info;
+
+ if (dev->port_id < 0) {
+ return ENODEV;
+ }
+
+ ovs_mutex_lock(&dev->mutex);
+ rte_eth_dev_info_get(dev->port_id, &dev_info);
+ ovs_mutex_unlock(&dev->mutex);
+
+ smap_add_format(args, "numa_id", "%d", rte_eth_dev_socket_id(dev->port_id));
+ smap_add_format(args, "driver_name", "%s", dev_info.driver_name);
+ smap_add_format(args, "min_rx_bufsize", "%u", dev_info.min_rx_bufsize);
+ smap_add_format(args, "max_rx_pktlen", "%u", dev_info.max_rx_pktlen);
+ smap_add_format(args, "max_rx_queues", "%u", dev_info.max_rx_queues);
+ smap_add_format(args, "max_tx_queues", "%u", dev_info.max_tx_queues);
+ smap_add_format(args, "max_mac_addrs", "%u", dev_info.max_mac_addrs);
+ smap_add_format(args, "max_hash_mac_addrs", "%u", dev_info.max_hash_mac_addrs);
+ smap_add_format(args, "max_vfs", "%u", dev_info.max_vfs);
+ smap_add_format(args, "max_vmdq_pools", "%u", dev_info.max_vmdq_pools);
+
+ smap_add_format(args, "pci-vendor_id", "0x%x", dev_info.pci_dev->id.vendor_id);
+ smap_add_format(args, "pci-device_id", "0x%x", dev_info.pci_dev->id.device_id);
+
+ return 0;
+}
+
+static void
+netdev_dpdk_set_admin_state__(struct netdev_dpdk *dev, bool admin_state)
+ OVS_REQUIRES(dev->mutex)
+{
+ enum netdev_flags old_flags;
+
+ if (admin_state) {
+ netdev_dpdk_update_flags__(dev, 0, NETDEV_UP, &old_flags);
+ } else {
+ netdev_dpdk_update_flags__(dev, NETDEV_UP, 0, &old_flags);
+ }
+}
+
+static void
+netdev_dpdk_set_admin_state(struct unixctl_conn *conn, int argc,
+ const char *argv[], void *aux OVS_UNUSED)
+{
+ bool up;
+
+ if (!strcasecmp(argv[argc - 1], "up")) {
+ up = true;
+ } else if ( !strcasecmp(argv[argc - 1], "down")) {
+ up = false;
+ } else {
+ unixctl_command_reply_error(conn, "Invalid Admin State");
+ return;
+ }
+
+ if (argc > 2) {
+ struct netdev *netdev = netdev_from_name(argv[1]);
+ if (netdev && is_dpdk_class(netdev->netdev_class)) {
+ struct netdev_dpdk *dpdk_dev = netdev_dpdk_cast(netdev);
+
+ ovs_mutex_lock(&dpdk_dev->mutex);
+ netdev_dpdk_set_admin_state__(dpdk_dev, up);
+ ovs_mutex_unlock(&dpdk_dev->mutex);
+
+ netdev_close(netdev);
+ } else {
+ unixctl_command_reply_error(conn, "Not a DPDK Interface");
+ netdev_close(netdev);
+ return;
+ }
+ } else {
+ struct netdev_dpdk *netdev;
+
+ ovs_mutex_lock(&dpdk_mutex);
+ LIST_FOR_EACH (netdev, list_node, &dpdk_list) {
+ ovs_mutex_lock(&netdev->mutex);
+ netdev_dpdk_set_admin_state__(netdev, up);
+ ovs_mutex_unlock(&netdev->mutex);
+ }
+ ovs_mutex_unlock(&dpdk_mutex);
+ }
+ unixctl_command_reply(conn, "OK");
+}
+
+static int
+dpdk_class_init(void)
+{
+ int result;
+
+ if (rte_eal_init_ret) {
+ return 0;
+ }
+
+ result = rte_pmd_init_all();
+ if (result) {
+ VLOG_ERR("Cannot init PMD");
+ return result;
+ }
+
+ result = rte_eal_pci_probe();
+ if (result) {
+ VLOG_ERR("Cannot probe PCI");
+ return result;
+ }
+
+ if (rte_eth_dev_count() < 1) {
+ VLOG_ERR("No Ethernet devices found. Try assigning ports to UIO.");
+ }
+
+ VLOG_INFO("Ethernet Device Count: %d", (int)rte_eth_dev_count());
+
+ list_init(&dpdk_list);
+ list_init(&dpdk_mp_list);
+
+ unixctl_command_register("netdev-dpdk/set-admin-state",
+ "[netdev] up|down", 1, 2,
+ netdev_dpdk_set_admin_state, NULL);
+
+ xpthread_create(&watchdog_thread, NULL, dpdk_watchdog, NULL);
+ return 0;
+}
+
+static struct netdev_class netdev_dpdk_class = {
+ "dpdk",
+ dpdk_class_init, /* init */
+ NULL, /* netdev_dpdk_run */
+ NULL, /* netdev_dpdk_wait */
+
+ netdev_dpdk_alloc,
+ netdev_dpdk_construct,
+ netdev_dpdk_destruct,
+ netdev_dpdk_dealloc,
+ netdev_dpdk_get_config,
+ NULL, /* netdev_dpdk_set_config */
+ NULL, /* get_tunnel_config */
+
+ netdev_dpdk_send, /* send */
+ NULL, /* send_wait */
+
+ netdev_dpdk_set_etheraddr,
+ netdev_dpdk_get_etheraddr,
+ netdev_dpdk_get_mtu,
+ netdev_dpdk_set_mtu,
+ netdev_dpdk_get_ifindex,
+ netdev_dpdk_get_carrier,
+ netdev_dpdk_get_carrier_resets,
+ netdev_dpdk_set_miimon,
+ netdev_dpdk_get_stats,
+ netdev_dpdk_set_stats,
+ netdev_dpdk_get_features,
+ NULL, /* set_advertisements */
+
+ NULL, /* set_policing */
+ NULL, /* get_qos_types */
+ NULL, /* get_qos_capabilities */
+ NULL, /* get_qos */
+ NULL, /* set_qos */
+ NULL, /* get_queue */
+ NULL, /* set_queue */
+ NULL, /* delete_queue */
+ NULL, /* get_queue_stats */
+ NULL, /* queue_dump_start */
+ NULL, /* queue_dump_next */
+ NULL, /* queue_dump_done */
+ NULL, /* dump_queue_stats */
+
+ NULL, /* get_in4 */
+ NULL, /* set_in4 */
+ NULL, /* get_in6 */
+ NULL, /* add_router */
+ NULL, /* get_next_hop */
+ netdev_dpdk_get_status,
+ NULL, /* arp_lookup */
+
+ netdev_dpdk_update_flags,
+
+ netdev_dpdk_rxq_alloc,
+ netdev_dpdk_rxq_construct,
+ netdev_dpdk_rxq_destruct,
+ netdev_dpdk_rxq_dealloc,
+ netdev_dpdk_rxq_recv,
+ NULL, /* rxq_wait */
+ NULL, /* rxq_drain */
+};
+
+int
+dpdk_init(int argc, char **argv)
+{
+ int result;
+
+ if (argc < 2 || strcmp(argv[1], "--dpdk")) {
+ return 0;
+ }
+
+ argc--;
+ argv++;
+
+ /* Make sure things are initialized ... */
+ result = rte_eal_init(argc, argv);
+ if (result < 0) {
+ ovs_abort(result, "Cannot init EAL\n");
+ }
+
+ rte_memzone_dump();
+ rte_eal_init_ret = 0;
+
+ return result;
+}
+
+void
+netdev_dpdk_register(void)
+{
+ netdev_register_provider(&netdev_dpdk_class);
+}
+
+int
+pmd_thread_setaffinity_cpu(int cpu)
+{
+ cpu_set_t cpuset;
+ int err;
+
+ CPU_ZERO(&cpuset);
+ CPU_SET(cpu, &cpuset);
+ err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+ if (err) {
+ VLOG_ERR("Thread affinity error %d", err);
+ return err;
+ }
+ RTE_PER_LCORE(_lcore_id) = cpu;
+
+ return 0;
+}
--- /dev/null
+#ifndef NETDEV_DPDK_H
+#define NETDEV_DPDK_H
+
+#include <config.h>
+#include "ofpbuf.h"
+
+#ifdef DPDK_NETDEV
+
+#include <rte_config.h>
+#include <rte_eal.h>
+#include <rte_debug.h>
+#include <rte_ethdev.h>
+#include <rte_errno.h>
+#include <rte_memzone.h>
+#include <rte_memcpy.h>
+#include <rte_cycles.h>
+#include <rte_spinlock.h>
+#include <rte_launch.h>
+#include <rte_malloc.h>
+
+int dpdk_init(int argc, char **argv);
+void netdev_dpdk_register(void);
+void free_dpdk_buf(struct ofpbuf *);
+int pmd_thread_setaffinity_cpu(int cpu);
+
+#else
+
+static inline int
+dpdk_init(int arg1 OVS_UNUSED, char **arg2 OVS_UNUSED)
+{
+ return 0;
+}
+
+static inline void
+netdev_dpdk_register(void)
+{
+ /* Nothing */
+}
+
+static inline void
+free_dpdk_buf(struct ofpbuf *buf OVS_UNUSED)
+{
+ /* Nothing */
+}
+
+static inline int
+pmd_thread_setaffinity_cpu(int cpu OVS_UNUSED)
+{
+ return 0;
+}
+
+#endif /* DPDK_NETDEV */
+#endif
#include <errno.h>
#include "connectivity.h"
+#include "dpif-netdev.h"
#include "flow.h"
#include "list.h"
#include "netdev-provider.h"
struct dummy_packet_conn conn OVS_GUARDED;
- FILE *tx_pcap, *rx_pcap OVS_GUARDED;
+ FILE *tx_pcap, *rxq_pcap OVS_GUARDED;
- struct list rxes OVS_GUARDED; /* List of child "netdev_rx_dummy"s. */
+ struct list rxes OVS_GUARDED; /* List of child "netdev_rxq_dummy"s. */
};
/* Max 'recv_queue_len' in struct netdev_dummy. */
#define NETDEV_DUMMY_MAX_QUEUE 100
-struct netdev_rx_dummy {
- struct netdev_rx up;
+struct netdev_rxq_dummy {
+ struct netdev_rxq up;
struct list node; /* In netdev_dummy's "rxes" list. */
struct list recv_queue;
int recv_queue_len; /* list_size(&recv_queue). */
return CONTAINER_OF(netdev, struct netdev_dummy, up);
}
-static struct netdev_rx_dummy *
-netdev_rx_dummy_cast(const struct netdev_rx *rx)
+static struct netdev_rxq_dummy *
+netdev_rxq_dummy_cast(const struct netdev_rxq *rx)
{
ovs_assert(is_dummy_class(netdev_get_class(rx->netdev)));
- return CONTAINER_OF(rx, struct netdev_rx_dummy, up);
+ return CONTAINER_OF(rx, struct netdev_rxq_dummy, up);
}
static void
dummy_packet_conn_set_config(&netdev->conn, args);
- if (netdev->rx_pcap) {
- fclose(netdev->rx_pcap);
+ if (netdev->rxq_pcap) {
+ fclose(netdev->rxq_pcap);
}
- if (netdev->tx_pcap && netdev->tx_pcap != netdev->rx_pcap) {
+ if (netdev->tx_pcap && netdev->tx_pcap != netdev->rxq_pcap) {
fclose(netdev->tx_pcap);
}
- netdev->rx_pcap = netdev->tx_pcap = NULL;
+ netdev->rxq_pcap = netdev->tx_pcap = NULL;
pcap = smap_get(args, "pcap");
if (pcap) {
- netdev->rx_pcap = netdev->tx_pcap = ovs_pcap_open(pcap, "ab");
+ netdev->rxq_pcap = netdev->tx_pcap = ovs_pcap_open(pcap, "ab");
} else {
- const char *rx_pcap = smap_get(args, "rx_pcap");
+ const char *rxq_pcap = smap_get(args, "rxq_pcap");
const char *tx_pcap = smap_get(args, "tx_pcap");
- if (rx_pcap) {
- netdev->rx_pcap = ovs_pcap_open(rx_pcap, "ab");
+ if (rxq_pcap) {
+ netdev->rxq_pcap = ovs_pcap_open(rxq_pcap, "ab");
}
if (tx_pcap) {
netdev->tx_pcap = ovs_pcap_open(tx_pcap, "ab");
return 0;
}
-static struct netdev_rx *
-netdev_dummy_rx_alloc(void)
+static struct netdev_rxq *
+netdev_dummy_rxq_alloc(void)
{
- struct netdev_rx_dummy *rx = xzalloc(sizeof *rx);
+ struct netdev_rxq_dummy *rx = xzalloc(sizeof *rx);
return &rx->up;
}
static int
-netdev_dummy_rx_construct(struct netdev_rx *rx_)
+netdev_dummy_rxq_construct(struct netdev_rxq *rxq_)
{
- struct netdev_rx_dummy *rx = netdev_rx_dummy_cast(rx_);
+ struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_);
struct netdev_dummy *netdev = netdev_dummy_cast(rx->up.netdev);
ovs_mutex_lock(&netdev->mutex);
}
static void
-netdev_dummy_rx_destruct(struct netdev_rx *rx_)
+netdev_dummy_rxq_destruct(struct netdev_rxq *rxq_)
{
- struct netdev_rx_dummy *rx = netdev_rx_dummy_cast(rx_);
+ struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_);
struct netdev_dummy *netdev = netdev_dummy_cast(rx->up.netdev);
ovs_mutex_lock(&netdev->mutex);
}
static void
-netdev_dummy_rx_dealloc(struct netdev_rx *rx_)
+netdev_dummy_rxq_dealloc(struct netdev_rxq *rxq_)
{
- struct netdev_rx_dummy *rx = netdev_rx_dummy_cast(rx_);
+ struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_);
free(rx);
}
static int
-netdev_dummy_rx_recv(struct netdev_rx *rx_, struct ofpbuf *buffer)
+netdev_dummy_rxq_recv(struct netdev_rxq *rxq_, struct ofpbuf **arr, int *c)
{
- struct netdev_rx_dummy *rx = netdev_rx_dummy_cast(rx_);
+ struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_);
struct netdev_dummy *netdev = netdev_dummy_cast(rx->up.netdev);
struct ofpbuf *packet;
- int retval;
ovs_mutex_lock(&netdev->mutex);
if (!list_is_empty(&rx->recv_queue)) {
if (!packet) {
return EAGAIN;
}
+ ovs_mutex_lock(&netdev->mutex);
+ netdev->stats.rx_packets++;
+ netdev->stats.rx_bytes += packet->size;
+ ovs_mutex_unlock(&netdev->mutex);
- if (packet->size <= ofpbuf_tailroom(buffer)) {
- memcpy(buffer->data, packet->data, packet->size);
- buffer->size += packet->size;
- retval = 0;
-
- ovs_mutex_lock(&netdev->mutex);
- netdev->stats.rx_packets++;
- netdev->stats.rx_bytes += packet->size;
- ovs_mutex_unlock(&netdev->mutex);
- } else {
- retval = EMSGSIZE;
- }
- ofpbuf_delete(packet);
-
- return retval;
+ dp_packet_pad(packet);
+ arr[0] = packet;
+ *c = 1;
+ return 0;
}
static void
-netdev_dummy_rx_wait(struct netdev_rx *rx_)
+netdev_dummy_rxq_wait(struct netdev_rxq *rxq_)
{
- struct netdev_rx_dummy *rx = netdev_rx_dummy_cast(rx_);
+ struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_);
struct netdev_dummy *netdev = netdev_dummy_cast(rx->up.netdev);
uint64_t seq = seq_read(rx->seq);
}
static int
-netdev_dummy_rx_drain(struct netdev_rx *rx_)
+netdev_dummy_rxq_drain(struct netdev_rxq *rxq_)
{
- struct netdev_rx_dummy *rx = netdev_rx_dummy_cast(rx_);
+ struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_);
struct netdev_dummy *netdev = netdev_dummy_cast(rx->up.netdev);
ovs_mutex_lock(&netdev->mutex);
}
static int
-netdev_dummy_send(struct netdev *netdev, const void *buffer, size_t size)
+netdev_dummy_send(struct netdev *netdev, struct ofpbuf *pkt, bool may_steal)
{
struct netdev_dummy *dev = netdev_dummy_cast(netdev);
+ const void *buffer = pkt->data;
+ size_t size = pkt->size;
if (size < ETH_HEADER_LEN) {
return EMSGSIZE;
}
ovs_mutex_unlock(&dev->mutex);
+ if (may_steal) {
+ ofpbuf_delete(pkt);
+ }
return 0;
}
netdev_dummy_update_flags,
- netdev_dummy_rx_alloc,
- netdev_dummy_rx_construct,
- netdev_dummy_rx_destruct,
- netdev_dummy_rx_dealloc,
- netdev_dummy_rx_recv,
- netdev_dummy_rx_wait,
- netdev_dummy_rx_drain,
+ netdev_dummy_rxq_alloc,
+ netdev_dummy_rxq_construct,
+ netdev_dummy_rxq_destruct,
+ netdev_dummy_rxq_dealloc,
+ netdev_dummy_rxq_recv,
+ netdev_dummy_rxq_wait,
+ netdev_dummy_rxq_drain,
};
static struct ofpbuf *
}
static void
-netdev_dummy_queue_packet__(struct netdev_rx_dummy *rx, struct ofpbuf *packet)
+netdev_dummy_queue_packet__(struct netdev_rxq_dummy *rx, struct ofpbuf *packet)
{
list_push_back(&rx->recv_queue, &packet->list_node);
rx->recv_queue_len++;
netdev_dummy_queue_packet(struct netdev_dummy *dummy, struct ofpbuf *packet)
OVS_REQUIRES(dummy->mutex)
{
- struct netdev_rx_dummy *rx, *prev;
+ struct netdev_rxq_dummy *rx, *prev;
- if (dummy->rx_pcap) {
- ovs_pcap_write(dummy->rx_pcap, packet);
- fflush(dummy->rx_pcap);
+ if (dummy->rxq_pcap) {
+ ovs_pcap_write(dummy->rxq_pcap, packet);
+ fflush(dummy->rxq_pcap);
}
prev = NULL;
LIST_FOR_EACH (rx, node, &dummy->rxes) {
#include <linux/pkt_sched.h>
#include <linux/rtnetlink.h>
#include <linux/sockios.h>
-#include <linux/version.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include "connectivity.h"
#include "coverage.h"
#include "dpif-linux.h"
+#include "dpif-netdev.h"
#include "dynamic-string.h"
#include "fatal-signal.h"
#include "hash.h"
int tap_fd;
};
-struct netdev_rx_linux {
- struct netdev_rx up;
+struct netdev_rxq_linux {
+ struct netdev_rxq up;
bool is_tap;
int fd;
};
static bool netdev_linux_miimon_enabled(void);
static void netdev_linux_miimon_run(void);
static void netdev_linux_miimon_wait(void);
+static int netdev_linux_get_mtu__(struct netdev_linux *netdev, int *mtup);
static bool
is_netdev_linux_class(const struct netdev_class *netdev_class)
return CONTAINER_OF(netdev, struct netdev_linux, up);
}
-static struct netdev_rx_linux *
-netdev_rx_linux_cast(const struct netdev_rx *rx)
+static struct netdev_rxq_linux *
+netdev_rxq_linux_cast(const struct netdev_rxq *rx)
{
ovs_assert(is_netdev_linux_class(netdev_get_class(rx->netdev)));
- return CONTAINER_OF(rx, struct netdev_rx_linux, up);
+ return CONTAINER_OF(rx, struct netdev_rxq_linux, up);
}
\f
static void netdev_linux_update(struct netdev_linux *netdev,
free(netdev);
}
-static struct netdev_rx *
-netdev_linux_rx_alloc(void)
+static struct netdev_rxq *
+netdev_linux_rxq_alloc(void)
{
- struct netdev_rx_linux *rx = xzalloc(sizeof *rx);
+ struct netdev_rxq_linux *rx = xzalloc(sizeof *rx);
return &rx->up;
}
static int
-netdev_linux_rx_construct(struct netdev_rx *rx_)
+netdev_linux_rxq_construct(struct netdev_rxq *rxq_)
{
- struct netdev_rx_linux *rx = netdev_rx_linux_cast(rx_);
+ struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_);
struct netdev *netdev_ = rx->up.netdev;
struct netdev_linux *netdev = netdev_linux_cast(netdev_);
int error;
}
static void
-netdev_linux_rx_destruct(struct netdev_rx *rx_)
+netdev_linux_rxq_destruct(struct netdev_rxq *rxq_)
{
- struct netdev_rx_linux *rx = netdev_rx_linux_cast(rx_);
+ struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_);
if (!rx->is_tap) {
close(rx->fd);
}
static void
-netdev_linux_rx_dealloc(struct netdev_rx *rx_)
+netdev_linux_rxq_dealloc(struct netdev_rxq *rxq_)
{
- struct netdev_rx_linux *rx = netdev_rx_linux_cast(rx_);
+ struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_);
free(rx);
}
}
static int
-netdev_linux_rx_recv_sock(int fd, struct ofpbuf *buffer)
+netdev_linux_rxq_recv_sock(int fd, struct ofpbuf *buffer)
{
size_t size;
ssize_t retval;
}
static int
-netdev_linux_rx_recv_tap(int fd, struct ofpbuf *buffer)
+netdev_linux_rxq_recv_tap(int fd, struct ofpbuf *buffer)
{
ssize_t retval;
size_t size = ofpbuf_tailroom(buffer);
}
static int
-netdev_linux_rx_recv(struct netdev_rx *rx_, struct ofpbuf *buffer)
+netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct ofpbuf **packet, int *c)
{
- struct netdev_rx_linux *rx = netdev_rx_linux_cast(rx_);
- int retval;
+ struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_);
+ struct netdev *netdev = rx->up.netdev;
+ struct ofpbuf *buffer;
+ ssize_t retval;
+ int mtu;
+
+ if (netdev_linux_get_mtu__(netdev_linux_cast(netdev), &mtu)) {
+ mtu = ETH_PAYLOAD_MAX;
+ }
+
+    buffer = ofpbuf_new_with_headroom(VLAN_ETH_HEADER_LEN + mtu,
+                                      DP_NETDEV_HEADROOM);
retval = (rx->is_tap
- ? netdev_linux_rx_recv_tap(rx->fd, buffer)
- : netdev_linux_rx_recv_sock(rx->fd, buffer));
- if (retval && retval != EAGAIN && retval != EMSGSIZE) {
- VLOG_WARN_RL(&rl, "error receiving Ethernet packet on %s: %s",
- ovs_strerror(errno), netdev_rx_get_name(rx_));
+ ? netdev_linux_rxq_recv_tap(rx->fd, buffer)
+ : netdev_linux_rxq_recv_sock(rx->fd, buffer));
+
+ if (retval) {
+ if (retval != EAGAIN && retval != EMSGSIZE) {
+ VLOG_WARN_RL(&rl, "error receiving Ethernet packet on %s: %s",
+                         netdev_rxq_get_name(rxq_), ovs_strerror(errno));
+ }
+ ofpbuf_delete(buffer);
+ } else {
+ dp_packet_pad(buffer);
+ packet[0] = buffer;
+ *c = 1;
}
return retval;
}
static void
-netdev_linux_rx_wait(struct netdev_rx *rx_)
+netdev_linux_rxq_wait(struct netdev_rxq *rxq_)
{
- struct netdev_rx_linux *rx = netdev_rx_linux_cast(rx_);
+ struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_);
poll_fd_wait(rx->fd, POLLIN);
}
static int
-netdev_linux_rx_drain(struct netdev_rx *rx_)
+netdev_linux_rxq_drain(struct netdev_rxq *rxq_)
{
- struct netdev_rx_linux *rx = netdev_rx_linux_cast(rx_);
+ struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_);
if (rx->is_tap) {
struct ifreq ifr;
- int error = af_inet_ifreq_ioctl(netdev_rx_get_name(rx_), &ifr,
+ int error = af_inet_ifreq_ioctl(netdev_rxq_get_name(rxq_), &ifr,
SIOCGIFTXQLEN, "SIOCGIFTXQLEN");
if (error) {
return error;
* The kernel maintains a packet transmission queue, so the caller is not
* expected to do additional queuing of packets. */
static int
-netdev_linux_send(struct netdev *netdev_, const void *data, size_t size)
+netdev_linux_send(struct netdev *netdev_, struct ofpbuf *pkt, bool may_steal)
{
+ const void *data = pkt->data;
+ size_t size = pkt->size;
+
for (;;) {
ssize_t retval;
retval = write(netdev->tap_fd, data, size);
}
+ if (may_steal) {
+ ofpbuf_delete(pkt);
+ }
+
if (retval < 0) {
/* The Linux AF_PACKET implementation never blocks waiting for room
* for packets, instead returning ENOBUFS. Translate this into
\
netdev_linux_update_flags, \
\
- netdev_linux_rx_alloc, \
- netdev_linux_rx_construct, \
- netdev_linux_rx_destruct, \
- netdev_linux_rx_dealloc, \
- netdev_linux_rx_recv, \
- netdev_linux_rx_wait, \
- netdev_linux_rx_drain, \
+ netdev_linux_rxq_alloc, \
+ netdev_linux_rxq_construct, \
+ netdev_linux_rxq_destruct, \
+ netdev_linux_rxq_dealloc, \
+ netdev_linux_rxq_recv, \
+ netdev_linux_rxq_wait, \
+ netdev_linux_rxq_drain, \
}
const struct netdev_class netdev_linux_class =
#include "flow.h"
#include "list.h"
+#include "dpif-netdev.h"
#include "netdev-provider.h"
#include "odp-util.h"
#include "ofp-print.h"
};
-struct netdev_rx_pltap {
- struct netdev_rx up;
+struct netdev_rxq_pltap {
+ struct netdev_rxq up;
int fd;
};
return CONTAINER_OF(netdev, struct netdev_pltap, up);
}
-static struct netdev_rx_pltap*
-netdev_rx_pltap_cast(const struct netdev_rx *rx)
+static struct netdev_rxq_pltap*
+netdev_rxq_pltap_cast(const struct netdev_rxq *rx)
{
ovs_assert(is_netdev_pltap_class(netdev_get_class(rx->netdev)));
- return CONTAINER_OF(rx, struct netdev_rx_pltap, up);
+ return CONTAINER_OF(rx, struct netdev_rxq_pltap, up);
}
static void sync_needed(struct netdev_pltap *dev)
static int netdev_pltap_up(struct netdev_pltap *dev) OVS_REQUIRES(dev->mutex);
-static struct netdev_rx *
-netdev_pltap_rx_alloc(void)
+static struct netdev_rxq *
+netdev_pltap_rxq_alloc(void)
{
- struct netdev_rx_pltap *rx = xzalloc(sizeof *rx);
+ struct netdev_rxq_pltap *rx = xzalloc(sizeof *rx);
return &rx->up;
}
static int
-netdev_pltap_rx_construct(struct netdev_rx *rx_)
+netdev_pltap_rxq_construct(struct netdev_rxq *rx_)
{
- struct netdev_rx_pltap *rx = netdev_rx_pltap_cast(rx_);
+ struct netdev_rxq_pltap *rx = netdev_rxq_pltap_cast(rx_);
struct netdev *netdev_ = rx->up.netdev;
struct netdev_pltap *netdev =
netdev_pltap_cast(netdev_);
}
static void
-netdev_pltap_rx_destruct(struct netdev_rx *rx_ OVS_UNUSED)
+netdev_pltap_rxq_destruct(struct netdev_rxq *rx_ OVS_UNUSED)
{
}
static void
-netdev_pltap_rx_dealloc(struct netdev_rx *rx_)
+netdev_pltap_rxq_dealloc(struct netdev_rxq *rx_)
{
- struct netdev_rx_pltap *rx = netdev_rx_pltap_cast(rx_);
+ struct netdev_rxq_pltap *rx = netdev_rxq_pltap_cast(rx_);
free(rx);
}
}
static int
-netdev_pltap_rx_recv(struct netdev_rx *rx_, struct ofpbuf *buffer)
+netdev_pltap_rxq_recv(struct netdev_rxq *rx_, struct ofpbuf **packet, int *c)
{
- size_t size = ofpbuf_tailroom(buffer);
- struct netdev_rx_pltap *rx = netdev_rx_pltap_cast(rx_);
+ struct netdev_rxq_pltap *rx = netdev_rxq_pltap_cast(rx_);
struct tun_pi pi;
struct iovec iov[2] = {
{ .iov_base = &pi, .iov_len = sizeof(pi) },
- { .iov_base = buffer->data, .iov_len = size }
};
+ struct ofpbuf *buffer = NULL;
+ size_t size;
+ int error = 0;
+
+ buffer = ofpbuf_new_with_headroom(VLAN_ETH_HEADER_LEN + ETH_PAYLOAD_MAX,
+ DP_NETDEV_HEADROOM);
+ size = ofpbuf_tailroom(buffer);
+ iov[1].iov_base = buffer->data;
+ iov[1].iov_len = size;
for (;;) {
ssize_t retval;
retval = readv(rx->fd, iov, 2);
if (retval >= 0) {
if (retval <= size) {
buffer->size += retval;
- return 0;
+ goto out;
} else {
- return EMSGSIZE;
+ error = EMSGSIZE;
+ goto out;
}
} else if (errno != EINTR) {
if (errno != EAGAIN) {
VLOG_WARN_RL(&rl, "error receiveing Ethernet packet on %s: %s",
- netdev_rx_get_name(rx_), ovs_strerror(errno));
+ netdev_rxq_get_name(rx_), ovs_strerror(errno));
}
- return errno;
+ error = errno;
+ goto out;
}
}
+out:
+ if (error) {
+ ofpbuf_delete(buffer);
+ } else {
+ dp_packet_pad(buffer);
+ packet[0] = buffer;
+ *c = 1;
+ }
+
+ return error;
}
static void
-netdev_pltap_rx_wait(struct netdev_rx *rx_)
+netdev_pltap_rxq_wait(struct netdev_rxq *rx_)
{
- struct netdev_rx_pltap *rx = netdev_rx_pltap_cast(rx_);
+ struct netdev_rxq_pltap *rx = netdev_rxq_pltap_cast(rx_);
struct netdev_pltap *netdev =
netdev_pltap_cast(rx->up.netdev);
if (rx->fd >= 0 && netdev_pltap_finalized(netdev)) {
}
static int
-netdev_pltap_send(struct netdev *netdev_, const void *buffer, size_t size)
+netdev_pltap_send(struct netdev *netdev_, struct ofpbuf *pkt, bool may_steal)
{
+ const void *buffer = pkt->data;
+ size_t size = pkt->size;
struct netdev_pltap *dev =
- netdev_pltap_cast(netdev_);
+ netdev_pltap_cast(netdev_);
+ int error = 0;
struct tun_pi pi = { 0, 0x86 };
struct iovec iov[2] = {
{ .iov_base = &pi, .iov_len = sizeof(pi) },
- { .iov_base = (char*) buffer, .iov_len = size }
+ { .iov_base = (char*) buffer, .iov_len = size }
};
- if (dev->fd < 0)
- return EAGAIN;
+ if (dev->fd < 0) {
+ error = EAGAIN;
+ goto out;
+ }
for (;;) {
ssize_t retval;
retval = writev(dev->fd, iov, 2);
if (retval >= 0) {
- if (retval != size + 4) {
- VLOG_WARN_RL(&rl, "sent partial Ethernet packet (%"PRIdSIZE" bytes of %"PRIuSIZE") on %s",
- retval, size + 4, netdev_get_name(netdev_));
- }
- return 0;
+ if (retval != size + 4) {
+ VLOG_WARN_RL(&rl, "sent partial Ethernet packet (%"PRIdSIZE" bytes of %"PRIuSIZE") on %s",
+ retval, size + 4, netdev_get_name(netdev_));
+ }
+ goto out;
} else if (errno != EINTR) {
if (errno != EAGAIN) {
VLOG_WARN_RL(&rl, "error sending Ethernet packet on %s: %s",
- netdev_get_name(netdev_), ovs_strerror(errno));
+ netdev_get_name(netdev_), ovs_strerror(errno));
}
- return errno;
+ error = errno;
+ goto out;
}
}
+out:
+ if (may_steal) {
+ ofpbuf_delete(pkt);
+ }
+ return error;
}
static void
}
static int
-netdev_pltap_rx_drain(struct netdev_rx *rx_)
+netdev_pltap_rxq_drain(struct netdev_rxq *rx_)
{
- struct netdev_rx_pltap *rx = netdev_rx_pltap_cast(rx_);
+ struct netdev_rxq_pltap *rx = netdev_rxq_pltap_cast(rx_);
char buffer[128];
int error;
netdev_pltap_update_flags,
- netdev_pltap_rx_alloc,
- netdev_pltap_rx_construct,
- netdev_pltap_rx_destruct,
- netdev_pltap_rx_dealloc,
- netdev_pltap_rx_recv,
- netdev_pltap_rx_wait,
- netdev_pltap_rx_drain,
+ netdev_pltap_rxq_alloc,
+ netdev_pltap_rxq_construct,
+ netdev_pltap_rxq_destruct,
+ netdev_pltap_rxq_dealloc,
+ netdev_pltap_rxq_recv,
+ netdev_pltap_rxq_wait,
+ netdev_pltap_rxq_drain,
};
this device. */
/* The following are protected by 'netdev_mutex' (internal to netdev.c). */
+ int n_rxq;
int ref_cnt; /* Times this devices was opened. */
struct shash_node *node; /* Pointer to element in global map. */
struct list saved_flags_list; /* Contains "struct netdev_saved_flags". */
* Network device implementations may read these members but should not modify
* them.
*
- * None of these members change during the lifetime of a struct netdev_rx. */
-struct netdev_rx {
+ * None of these members change during the lifetime of a struct netdev_rxq. */
+struct netdev_rxq {
struct netdev *netdev; /* Owns a reference to the netdev. */
+ int queue_id;
};
-struct netdev *netdev_rx_get_netdev(const struct netdev_rx *);
+struct netdev *netdev_rxq_get_netdev(const struct netdev_rxq *);
/* Network device class structure, to be defined by each implementation of a
* network device.
*
* - "struct netdev", which represents a network device.
*
- * - "struct netdev_rx", which represents a handle for capturing packets
+ * - "struct netdev_rxq", which represents a handle for capturing packets
* received on a network device
*
* Each of these data structures contains all of the implementation-independent
*
* Four stylized functions accompany each of these data structures:
*
- * "alloc" "construct" "destruct" "dealloc"
- * ------------ ---------------- --------------- --------------
- * netdev ->alloc ->construct ->destruct ->dealloc
- * netdev_rx ->rx_alloc ->rx_construct ->rx_destruct ->rx_dealloc
+ * "alloc" "construct" "destruct" "dealloc"
+ * ------------ ---------------- --------------- --------------
+ * netdev ->alloc ->construct ->destruct ->dealloc
+ * netdev_rxq ->rxq_alloc ->rxq_construct ->rxq_destruct ->rxq_dealloc
*
* Any instance of a given data structure goes through the following life
* cycle:
* implementation must not refer to base or derived state in the data
* structure, because it has already been uninitialized.
*
+ * If a netdev supports multi-queue I/O, then netdev->construct should
+ * initialize netdev->n_rxq to the number of queues.
+ *
* Each "alloc" function allocates and returns a new instance of the respective
* data structure. The "alloc" function is not given any information about the
* use of the new data structure, so it cannot perform much initialization.
const struct netdev_tunnel_config *
(*get_tunnel_config)(const struct netdev *netdev);
- /* Sends the 'size'-byte packet in 'buffer' on 'netdev'. Returns 0 if
- * successful, otherwise a positive errno value. Returns EAGAIN without
- * blocking if the packet cannot be queued immediately. Returns EMSGSIZE
- * if a partial packet was transmitted or if the packet is too big or too
- * small to transmit on the device.
+ /* Sends the buffer on 'netdev'.
+ * Returns 0 if successful, otherwise a positive errno value. Returns
+ * EAGAIN without blocking if the packet cannot be queued immediately.
+ * Returns EMSGSIZE if a partial packet was transmitted or if the packet
+ * is too big or too small to transmit on the device.
*
- * The caller retains ownership of 'buffer' in all cases.
+     * To retain ownership of 'buffer', the caller can set 'may_steal' to
+     * false.
*
* The network device is expected to maintain a packet transmission queue,
* so that the caller does not ordinarily have to do additional queuing of
* network device from being usefully used by the netdev-based "userspace
* datapath". It will also prevent the OVS implementation of bonding from
* working properly over 'netdev'.) */
- int (*send)(struct netdev *netdev, const void *buffer, size_t size);
+ int (*send)(struct netdev *netdev, struct ofpbuf *buffer, bool may_steal);
/* Registers with the poll loop to wake up from the next call to
* poll_block() when the packet transmission queue for 'netdev' has
int (*update_flags)(struct netdev *netdev, enum netdev_flags off,
enum netdev_flags on, enum netdev_flags *old_flags);
-/* ## ------------------- ## */
-/* ## netdev_rx Functions ## */
-/* ## ------------------- ## */
+/* ## -------------------- ## */
+/* ## netdev_rxq Functions ## */
+/* ## -------------------- ## */
/* If a particular netdev class does not support receiving packets, all these
* function pointers must be NULL. */
- /* Life-cycle functions for a netdev_rx. See the large comment above on
+ /* Life-cycle functions for a netdev_rxq. See the large comment above on
* struct netdev_class. */
- struct netdev_rx *(*rx_alloc)(void);
- int (*rx_construct)(struct netdev_rx *);
- void (*rx_destruct)(struct netdev_rx *);
- void (*rx_dealloc)(struct netdev_rx *);
-
- /* Attempts to receive a packet from 'rx' into the tailroom of 'buffer',
- * which should initially be empty. If successful, returns 0 and
- * increments 'buffer->size' by the number of bytes in the received packet,
- * otherwise a positive errno value. Returns EAGAIN immediately if no
- * packet is ready to be received.
- *
- * Must return EMSGSIZE, and discard the packet, if the received packet
- * is longer than 'ofpbuf_tailroom(buffer)'.
- *
- * Implementations may make use of VLAN_HEADER_LEN bytes of tailroom to
- * add a VLAN header which is obtained out-of-band to the packet. If
- * this occurs then VLAN_HEADER_LEN bytes of tailroom will no longer be
- * available for the packet, otherwise it may be used for the packet
- * itself.
- *
- * It is advised that the tailroom of 'buffer' should be
- * VLAN_HEADER_LEN bytes longer than the MTU to allow space for an
- * out-of-band VLAN header to be added to the packet.
+ struct netdev_rxq *(*rxq_alloc)(void);
+ int (*rxq_construct)(struct netdev_rxq *);
+ void (*rxq_destruct)(struct netdev_rxq *);
+ void (*rxq_dealloc)(struct netdev_rxq *);
+
+    /* Attempts to receive a batch of packets from 'rx' and place an array of
+     * pointers into 'pkt'.  The netdev is responsible for allocating the
+     * buffers.  '*cnt' is set to the number of packets in the batch.  Once
+     * the packets are returned to the caller, the netdev gives up ownership
+     * of the ofpbuf data.
+     *
+     * Implementations should allocate each buffer with DP_NETDEV_HEADROOM
+     * bytes of headroom and may add an out-of-band VLAN header to the packet.
+     *
+     * The caller is expected to pass an array of size MAX_RX_BATCH.
* This function may be set to null if it would always return EOPNOTSUPP
* anyhow. */
- int (*rx_recv)(struct netdev_rx *rx, struct ofpbuf *buffer);
+ int (*rxq_recv)(struct netdev_rxq *rx, struct ofpbuf **pkt, int *cnt);
/* Registers with the poll loop to wake up from the next call to
- * poll_block() when a packet is ready to be received with netdev_rx_recv()
+ * poll_block() when a packet is ready to be received with netdev_rxq_recv()
* on 'rx'. */
- void (*rx_wait)(struct netdev_rx *rx);
+ void (*rxq_wait)(struct netdev_rxq *rx);
/* Discards all packets waiting to be received from 'rx'. */
- int (*rx_drain)(struct netdev_rx *rx);
+ int (*rxq_drain)(struct netdev_rxq *rx);
};
int netdev_register_provider(const struct netdev_class *);
#include "flow.h"
#include "list.h"
+#include "dpif-netdev.h"
#include "netdev-provider.h"
#include "odp-util.h"
#include "ofp-print.h"
unsigned int change_seq;
};
-struct netdev_rx_tunnel {
- struct netdev_rx up;
+struct netdev_rxq_tunnel {
+ struct netdev_rxq up;
int fd;
};
return CONTAINER_OF(netdev, struct netdev_tunnel, up);
}
-static struct netdev_rx_tunnel *
-netdev_rx_tunnel_cast(const struct netdev_rx *rx)
+static struct netdev_rxq_tunnel *
+netdev_rxq_tunnel_cast(const struct netdev_rxq *rx)
{
ovs_assert(is_netdev_tunnel_class(netdev_get_class(rx->netdev)));
- return CONTAINER_OF(rx, struct netdev_rx_tunnel, up);
+ return CONTAINER_OF(rx, struct netdev_rxq_tunnel, up);
}
static struct netdev *
if (netdev->valid_remote_ip) {
const struct sockaddr_in *sin =
ALIGNED_CAST(const struct sockaddr_in *, &netdev->remote_addr);
- smap_add_format(args, "remote_ip", IP_FMT,
- IP_ARGS(sin->sin_addr.s_addr));
+ smap_add_format(args, "remote_ip", IP_FMT,
+ IP_ARGS(sin->sin_addr.s_addr));
}
if (netdev->valid_remote_port)
smap_add_format(args, "remote_port", "%"PRIu16,
- ss_get_port(&netdev->remote_addr));
+ ss_get_port(&netdev->remote_addr));
ovs_mutex_unlock(&netdev->mutex);
return 0;
}
return 0;
if (connect(dev->sockfd, (struct sockaddr*) sin, sizeof(*sin)) < 0) {
VLOG_DBG("%s: connect returned %s", netdev_get_name(&dev->up),
- ovs_strerror(errno));
+ ovs_strerror(errno));
return errno;
}
dev->connected = true;
netdev_tunnel_update_seq(dev);
VLOG_DBG("%s: connected to (%s, %d)", netdev_get_name(&dev->up),
- inet_ntop(AF_INET, &sin->sin_addr.s_addr, buf, 1024),
- ss_get_port(&dev->remote_addr));
+ inet_ntop(AF_INET, &sin->sin_addr.s_addr, buf, 1024),
+ ss_get_port(&dev->remote_addr));
return 0;
}
VLOG_DBG("tunnel_set_config(%s)", netdev_get_name(dev_));
SMAP_FOR_EACH(node, args) {
VLOG_DBG("arg: %s->%s", node->name, (char*)node->data);
- if (!strcmp(node->name, "remote_ip")) {
- struct in_addr addr;
- if (lookup_ip(node->data, &addr)) {
- VLOG_WARN("%s: bad 'remote_ip'", node->name);
- } else {
- sin->sin_family = AF_INET;
- sin->sin_addr = addr;
- netdev->valid_remote_ip = true;
- }
- } else if (!strcmp(node->name, "remote_port")) {
- sin->sin_port = htons(atoi(node->data));
- netdev->valid_remote_port = true;
- } else {
- VLOG_WARN("%s: unknown argument '%s'",
- netdev_get_name(dev_), node->name);
- }
+ if (!strcmp(node->name, "remote_ip")) {
+ struct in_addr addr;
+ if (lookup_ip(node->data, &addr)) {
+ VLOG_WARN("%s: bad 'remote_ip'", node->name);
+ } else {
+ sin->sin_family = AF_INET;
+ sin->sin_addr = addr;
+ netdev->valid_remote_ip = true;
+ }
+ } else if (!strcmp(node->name, "remote_port")) {
+ sin->sin_port = htons(atoi(node->data));
+ netdev->valid_remote_port = true;
+ } else {
+ VLOG_WARN("%s: unknown argument '%s'",
+ netdev_get_name(dev_), node->name);
+ }
}
error = netdev_tunnel_connect(netdev);
ovs_mutex_unlock(&netdev->mutex);
return error;
}
-static struct netdev_rx *
-netdev_tunnel_rx_alloc(void)
+static struct netdev_rxq *
+netdev_tunnel_rxq_alloc(void)
{
- struct netdev_rx_tunnel *rx = xzalloc(sizeof *rx);
+ struct netdev_rxq_tunnel *rx = xzalloc(sizeof *rx);
return &rx->up;
}
static int
-netdev_tunnel_rx_construct(struct netdev_rx *rx_)
+netdev_tunnel_rxq_construct(struct netdev_rxq *rx_)
{
- struct netdev_rx_tunnel *rx = netdev_rx_tunnel_cast(rx_);
+ struct netdev_rxq_tunnel *rx = netdev_rxq_tunnel_cast(rx_);
struct netdev *netdev_ = rx->up.netdev;
struct netdev_tunnel *netdev = netdev_tunnel_cast(netdev_);
}
static void
-netdev_tunnel_rx_destruct(struct netdev_rx *rx_ OVS_UNUSED)
+netdev_tunnel_rxq_destruct(struct netdev_rxq *rx_ OVS_UNUSED)
{
}
static void
-netdev_tunnel_rx_dealloc(struct netdev_rx *rx_)
+netdev_tunnel_rxq_dealloc(struct netdev_rxq *rx_)
{
- struct netdev_rx_tunnel *rx = netdev_rx_tunnel_cast(rx_);
+ struct netdev_rxq_tunnel *rx = netdev_rxq_tunnel_cast(rx_);
free(rx);
}
static int
-netdev_tunnel_rx_recv(struct netdev_rx *rx_, struct ofpbuf *buffer)
+netdev_tunnel_rxq_recv(struct netdev_rxq *rx_, struct ofpbuf **packet, int *c)
{
- size_t size = ofpbuf_tailroom(buffer);
- struct netdev_rx_tunnel *rx = netdev_rx_tunnel_cast(rx_);
+ struct netdev_rxq_tunnel *rx = netdev_rxq_tunnel_cast(rx_);
struct netdev_tunnel *netdev =
netdev_tunnel_cast(rx_->netdev);
+ struct ofpbuf *buffer = NULL;
+ size_t size;
+ int error = 0;
+
if (!netdev->connected)
return EAGAIN;
+ buffer = ofpbuf_new_with_headroom(VLAN_ETH_HEADER_LEN + ETH_PAYLOAD_MAX,
+ DP_NETDEV_HEADROOM);
+ size = ofpbuf_tailroom(buffer);
+
for (;;) {
ssize_t retval;
retval = recv(rx->fd, buffer->data, size, MSG_TRUNC);
- VLOG_DBG("%s: recv(%"PRIxPTR", %"PRIuSIZE", MSG_TRUNC) = %"PRIdSIZE,
- netdev_rx_get_name(rx_), (uintptr_t)buffer->data, size, retval);
+ VLOG_DBG("%s: recv(%"PRIxPTR", %"PRIuSIZE", MSG_TRUNC) = %"PRIdSIZE,
+ netdev_rxq_get_name(rx_), (uintptr_t)buffer->data, size, retval);
if (retval >= 0) {
- netdev->stats.rx_packets++;
- netdev->stats.rx_bytes += retval;
+ netdev->stats.rx_packets++;
+ netdev->stats.rx_bytes += retval;
if (retval <= size) {
buffer->size += retval;
- return 0;
+ goto out;
} else {
netdev->stats.rx_errors++;
netdev->stats.rx_length_errors++;
- return EMSGSIZE;
+ error = EMSGSIZE;
+ goto out;
}
} else if (errno != EINTR) {
if (errno != EAGAIN) {
VLOG_WARN_RL(&rl, "error receiveing Ethernet packet on %s: %s",
- netdev_rx_get_name(rx_), ovs_strerror(errno));
- netdev->stats.rx_errors++;
+ netdev_rxq_get_name(rx_), ovs_strerror(errno));
+ netdev->stats.rx_errors++;
}
- return errno;
+ error = errno;
+ goto out;
}
}
+out:
+ if (error) {
+ ofpbuf_delete(buffer);
+ } else {
+ dp_packet_pad(buffer);
+ packet[0] = buffer;
+ *c = 1;
+ }
+
+ return error;
}
static void
-netdev_tunnel_rx_wait(struct netdev_rx *rx_)
+netdev_tunnel_rxq_wait(struct netdev_rxq *rx_)
{
- struct netdev_rx_tunnel *rx =
- netdev_rx_tunnel_cast(rx_);
+ struct netdev_rxq_tunnel *rx =
+ netdev_rxq_tunnel_cast(rx_);
if (rx->fd >= 0) {
poll_fd_wait(rx->fd, POLLIN);
}
}
static int
-netdev_tunnel_send(struct netdev *netdev_, const void *buffer, size_t size)
+netdev_tunnel_send(struct netdev *netdev_, struct ofpbuf *pkt, bool may_steal)
{
+ const void *buffer = pkt->data;
+ size_t size = pkt->size;
struct netdev_tunnel *dev =
- netdev_tunnel_cast(netdev_);
- if (!dev->connected)
- return EAGAIN;
+ netdev_tunnel_cast(netdev_);
+ int error = 0;
+ if (!dev->connected) {
+ error = EAGAIN;
+ goto out;
+ }
for (;;) {
ssize_t retval;
retval = send(dev->sockfd, buffer, size, 0);
- VLOG_DBG("%s: send(%"PRIxPTR", %"PRIuSIZE") = %"PRIdSIZE,
- netdev_get_name(netdev_), (uintptr_t)buffer, size, retval);
+ VLOG_DBG("%s: send(%"PRIxPTR", %"PRIuSIZE") = %"PRIdSIZE,
+ netdev_get_name(netdev_), (uintptr_t)buffer, size, retval);
if (retval >= 0) {
- dev->stats.tx_packets++;
- dev->stats.tx_bytes += retval;
- if (retval != size) {
- VLOG_WARN_RL(&rl, "sent partial Ethernet packet (%"PRIdSIZE" bytes of "
- "%"PRIuSIZE") on %s", retval, size, netdev_get_name(netdev_));
- dev->stats.tx_errors++;
- }
- return 0;
+ dev->stats.tx_packets++;
+ dev->stats.tx_bytes += retval;
+ if (retval != size) {
+ VLOG_WARN_RL(&rl, "sent partial Ethernet packet (%"PRIdSIZE" bytes of "
+ "%"PRIuSIZE") on %s", retval, size, netdev_get_name(netdev_));
+ dev->stats.tx_errors++;
+ }
+ goto out;
} else if (errno != EINTR) {
if (errno != EAGAIN) {
VLOG_WARN_RL(&rl, "error sending Ethernet packet on %s: %s",
- netdev_get_name(netdev_), ovs_strerror(errno));
- dev->stats.tx_errors++;
+ netdev_get_name(netdev_), ovs_strerror(errno));
+ dev->stats.tx_errors++;
}
- return errno;
+ error = errno;
+ goto out;
}
}
+out:
+ if (may_steal) {
+ ofpbuf_delete(pkt);
+ }
+
+ return error;
}
static void
}
static int
-netdev_tunnel_rx_drain(struct netdev_rx *rx_)
+netdev_tunnel_rxq_drain(struct netdev_rxq *rx_)
{
struct netdev_tunnel *netdev =
netdev_tunnel_cast(rx_->netdev);
- struct netdev_rx_tunnel *rx =
- netdev_rx_tunnel_cast(rx_);
+ struct netdev_rxq_tunnel *rx =
+ netdev_rxq_tunnel_cast(rx_);
char buffer[128];
int error;
if (!netdev->connected)
- return 0;
+ return 0;
for (;;) {
- error = recv(rx->fd, buffer, 128, MSG_TRUNC);
- if (error) {
+ error = recv(rx->fd, buffer, 128, MSG_TRUNC);
+ if (error) {
if (error == -EAGAIN)
- break;
+ break;
else if (error != -EMSGSIZE)
- return error;
- }
+ return error;
+ }
}
return 0;
}
netdev_tunnel_update_flags,
- netdev_tunnel_rx_alloc,
- netdev_tunnel_rx_construct,
- netdev_tunnel_rx_destruct,
- netdev_tunnel_rx_dealloc,
- netdev_tunnel_rx_recv,
- netdev_tunnel_rx_wait,
- netdev_tunnel_rx_drain,
+ netdev_tunnel_rxq_alloc,
+ netdev_tunnel_rxq_construct,
+ netdev_tunnel_rxq_destruct,
+ netdev_tunnel_rxq_dealloc,
+ netdev_tunnel_rxq_recv,
+ netdev_tunnel_rxq_wait,
+ netdev_tunnel_rxq_drain,
};
static struct ovs_mutex mutex = OVS_MUTEX_INITIALIZER;
static pid_t pid = 0;
+#ifndef _WIN32
ovs_mutex_lock(&mutex);
if (pid <= 0) {
char *file_name = xasprintf("%s/%s", ovs_rundir(),
free(file_name);
}
ovs_mutex_unlock(&mutex);
+#endif
if (pid < 0) {
VLOG_ERR("%s: IPsec requires the ovs-monitor-ipsec daemon",
const char *netdev_vport_class_get_dpif_port(const struct netdev_class *);
+#ifndef _WIN32
enum { NETDEV_VPORT_NAME_BUFSIZE = 16 };
+#else
+enum { NETDEV_VPORT_NAME_BUFSIZE = 256 };
+#endif
const char *netdev_vport_get_dpif_port(const struct netdev *,
char namebuf[], size_t bufsize);
char *netdev_vport_get_dpif_port_strdup(const struct netdev *);
#include "fatal-signal.h"
#include "hash.h"
#include "list.h"
+#include "netdev-dpdk.h"
#include "netdev-provider.h"
#include "netdev-vport.h"
#include "ofpbuf.h"
static void restore_all_flags(void *aux OVS_UNUSED);
void update_device_args(struct netdev *, const struct shash *args);
+int
+netdev_n_rxq(const struct netdev *netdev)
+{
+ return netdev->n_rxq;
+}
+
+bool
+netdev_is_pmd(const struct netdev *netdev)
+{
+ return !strcmp(netdev->netdev_class->type, "dpdk");
+}
+
static void
netdev_initialize(void)
OVS_EXCLUDED(netdev_class_mutex, netdev_mutex)
#endif
netdev_register_provider(&netdev_tunnel_class);
netdev_register_provider(&netdev_pltap_class);
+ netdev_dpdk_register();
ovsthread_once_done(&once);
}
atomic_read(&rc->ref_cnt, &ref_cnt);
if (!ref_cnt) {
hmap_remove(&netdev_classes, &rc->hmap_node);
- atomic_destroy(&rc->ref_cnt);
free(rc);
error = 0;
} else {
netdev->netdev_class = rc->class;
netdev->name = xstrdup(name);
netdev->node = shash_add(&netdev_shash, name, netdev);
+
+ /* By default enable one rx queue per netdev. */
+ if (netdev->netdev_class->rxq_alloc) {
+ netdev->n_rxq = 1;
+ } else {
+ netdev->n_rxq = 0;
+ }
list_init(&netdev->saved_flags_list);
error = rc->class->construct(netdev);
}
}
-/* Attempts to open a netdev_rx handle for obtaining packets received on
- * 'netdev'. On success, returns 0 and stores a nonnull 'netdev_rx *' into
+/* Attempts to open a netdev_rxq handle for obtaining packets received on
+ * 'netdev'. On success, returns 0 and stores a nonnull 'netdev_rxq *' into
* '*rxp'. On failure, returns a positive errno value and stores NULL into
* '*rxp'.
*
* Some kinds of network devices might not support receiving packets. This
* function returns EOPNOTSUPP in that case.*/
int
-netdev_rx_open(struct netdev *netdev, struct netdev_rx **rxp)
+netdev_rxq_open(struct netdev *netdev, struct netdev_rxq **rxp, int id)
OVS_EXCLUDED(netdev_mutex)
{
int error;
- if (netdev->netdev_class->rx_alloc) {
- struct netdev_rx *rx = netdev->netdev_class->rx_alloc();
+ if (netdev->netdev_class->rxq_alloc && id < netdev->n_rxq) {
+ struct netdev_rxq *rx = netdev->netdev_class->rxq_alloc();
if (rx) {
rx->netdev = netdev;
- error = netdev->netdev_class->rx_construct(rx);
+ rx->queue_id = id;
+ error = netdev->netdev_class->rxq_construct(rx);
if (!error) {
ovs_mutex_lock(&netdev_mutex);
netdev->ref_cnt++;
*rxp = rx;
return 0;
}
- netdev->netdev_class->rx_dealloc(rx);
+ netdev->netdev_class->rxq_dealloc(rx);
} else {
error = ENOMEM;
}
/* Closes 'rx'. */
void
-netdev_rx_close(struct netdev_rx *rx)
+netdev_rxq_close(struct netdev_rxq *rx)
OVS_EXCLUDED(netdev_mutex)
{
if (rx) {
struct netdev *netdev = rx->netdev;
- netdev->netdev_class->rx_destruct(rx);
- netdev->netdev_class->rx_dealloc(rx);
+ netdev->netdev_class->rxq_destruct(rx);
+ netdev->netdev_class->rxq_dealloc(rx);
netdev_close(netdev);
}
}
-/* Attempts to receive a packet from 'rx' into the tailroom of 'buffer', which
- * must initially be empty. If successful, returns 0 and increments
- * 'buffer->size' by the number of bytes in the received packet, otherwise a
- * positive errno value.
+/* Attempts to receive a batch of packets from 'rx'.
*
* Returns EAGAIN immediately if no packet is ready to be received.
*
* Returns EMSGSIZE, and discards the packet, if the received packet is longer
* than 'ofpbuf_tailroom(buffer)'.
*
- * Implementations may make use of VLAN_HEADER_LEN bytes of tailroom to
- * add a VLAN header which is obtained out-of-band to the packet. If
- * this occurs then VLAN_HEADER_LEN bytes of tailroom will no longer be
- * available for the packet, otherwise it may be used for the packet
- * itself.
- *
* It is advised that the tailroom of 'buffer' should be
* VLAN_HEADER_LEN bytes longer than the MTU to allow space for an
* out-of-band VLAN header to be added to the packet. At the very least,
* This function may be set to null if it would always return EOPNOTSUPP
* anyhow. */
int
-netdev_rx_recv(struct netdev_rx *rx, struct ofpbuf *buffer)
+netdev_rxq_recv(struct netdev_rxq *rx, struct ofpbuf **buffers, int *cnt)
{
int retval;
- ovs_assert(buffer->size == 0);
- ovs_assert(ofpbuf_tailroom(buffer) >= ETH_TOTAL_MIN);
-
- retval = rx->netdev->netdev_class->rx_recv(rx, buffer);
+ retval = rx->netdev->netdev_class->rxq_recv(rx, buffers, cnt);
if (!retval) {
COVERAGE_INC(netdev_received);
- if (buffer->size < ETH_TOTAL_MIN) {
- ofpbuf_put_zeros(buffer, ETH_TOTAL_MIN - buffer->size);
- }
- return 0;
- } else {
- return retval;
}
+ return retval;
}
/* Arranges for poll_block() to wake up when a packet is ready to be received
* on 'rx'. */
void
-netdev_rx_wait(struct netdev_rx *rx)
+netdev_rxq_wait(struct netdev_rxq *rx)
{
- rx->netdev->netdev_class->rx_wait(rx);
+ rx->netdev->netdev_class->rxq_wait(rx);
}
/* Discards any packets ready to be received on 'rx'. */
int
-netdev_rx_drain(struct netdev_rx *rx)
+netdev_rxq_drain(struct netdev_rxq *rx)
{
- return (rx->netdev->netdev_class->rx_drain
- ? rx->netdev->netdev_class->rx_drain(rx)
+ return (rx->netdev->netdev_class->rxq_drain
+ ? rx->netdev->netdev_class->rxq_drain(rx)
: 0);
}
* immediately. Returns EMSGSIZE if a partial packet was transmitted or if
* the packet is too big or too small to transmit on the device.
*
- * The caller retains ownership of 'buffer' in all cases.
+ * To retain ownership of 'buffer', the caller can set 'may_steal' to false.
*
* The kernel maintains a packet transmission queue, so the caller is not
* expected to do additional queuing of packets.
* Some network devices may not implement support for this function. In such
* cases this function will always return EOPNOTSUPP. */
int
-netdev_send(struct netdev *netdev, const struct ofpbuf *buffer)
+netdev_send(struct netdev *netdev, struct ofpbuf *buffer, bool may_steal)
{
int error;
error = (netdev->netdev_class->send
- ? netdev->netdev_class->send(netdev, buffer->data, buffer->size)
+ ? netdev->netdev_class->send(netdev, buffer, may_steal)
: EOPNOTSUPP);
if (!error) {
COVERAGE_INC(netdev_sent);
}
\f
struct netdev *
-netdev_rx_get_netdev(const struct netdev_rx *rx)
+netdev_rxq_get_netdev(const struct netdev_rxq *rx)
{
ovs_assert(rx->netdev->ref_cnt > 0);
return rx->netdev;
}
const char *
-netdev_rx_get_name(const struct netdev_rx *rx)
+netdev_rxq_get_name(const struct netdev_rxq *rx)
{
- return netdev_get_name(netdev_rx_get_netdev(rx));
+ return netdev_get_name(netdev_rxq_get_netdev(rx));
}
static void
* any number of threads on the same or different netdev objects. The
* exceptions are:
*
- * netdev_rx_recv()
- * netdev_rx_wait()
- * netdev_rx_drain()
+ * netdev_rxq_recv()
+ * netdev_rxq_wait()
+ * netdev_rxq_drain()
*
* These functions are conditionally thread-safe: they may be called from
- * different threads only on different netdev_rx objects. (The client may
- * create multiple netdev_rx objects for a single netdev and access each
+ * different threads only on different netdev_rxq objects. (The client may
+ * create multiple netdev_rxq objects for a single netdev and access each
* of those from a different thread.)
*
* NETDEV_FOR_EACH_QUEUE
struct netdev;
struct netdev_class;
-struct netdev_rx;
+struct netdev_rxq;
struct netdev_saved_flags;
struct ofpbuf;
struct in_addr;
void netdev_enumerate_types(struct sset *types);
bool netdev_is_reserved_name(const char *name);
+int netdev_n_rxq(const struct netdev *netdev);
+bool netdev_is_pmd(const struct netdev *netdev);
+
/* Open and close. */
-int netdev_open(const char *name, const char *type, struct netdev **);
+int netdev_open(const char *name, const char *type, struct netdev **netdevp);
+
struct netdev *netdev_ref(const struct netdev *);
void netdev_close(struct netdev *);
int netdev_get_ifindex(const struct netdev *);
/* Packet reception. */
-int netdev_rx_open(struct netdev *, struct netdev_rx **);
-void netdev_rx_close(struct netdev_rx *);
+int netdev_rxq_open(struct netdev *, struct netdev_rxq **, int id);
+void netdev_rxq_close(struct netdev_rxq *);
-const char *netdev_rx_get_name(const struct netdev_rx *);
+const char *netdev_rxq_get_name(const struct netdev_rxq *);
-int netdev_rx_recv(struct netdev_rx *, struct ofpbuf *);
-void netdev_rx_wait(struct netdev_rx *);
-int netdev_rx_drain(struct netdev_rx *);
+int netdev_rxq_recv(struct netdev_rxq *rx, struct ofpbuf **buffers, int *cnt);
+void netdev_rxq_wait(struct netdev_rxq *);
+int netdev_rxq_drain(struct netdev_rxq *);
/* Packet transmission. */
-int netdev_send(struct netdev *, const struct ofpbuf *);
+int netdev_send(struct netdev *, struct ofpbuf *, bool may_steal);
void netdev_send_wait(struct netdev *);
/* Hardware address. */
int netdev_dump_queue_stats(const struct netdev *,
netdev_dump_queue_stats_cb *, void *aux);
+enum { NETDEV_MAX_RX_BATCH = 256 }; /* Maximum number of packets in an rxq_recv() batch. */
+
#ifdef __cplusplus
}
#endif
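The batched `netdev_rxq_recv()` prototype above hands the caller up to `NETDEV_MAX_RX_BATCH` packets per call. A hedged sketch of the resulting poll-loop shape (the `rxq_recv_sketch`/`drain_sketch` helpers are stand-ins invented for illustration, not OVS functions):

```c
#include <errno.h>

enum { NETDEV_MAX_RX_BATCH = 256 };   /* mirrors the new constant */

/* Hypothetical stand-in for netdev_rxq_recv(): pretends '*avail' packets
 * are queued and hands out at most NETDEV_MAX_RX_BATCH per call, storing
 * the batch size in '*cnt'; returns EAGAIN when nothing is ready. */
static int
rxq_recv_sketch(int *avail, int *cnt)
{
    if (*avail == 0) {
        return EAGAIN;
    }
    *cnt = *avail < NETDEV_MAX_RX_BATCH ? *avail : NETDEV_MAX_RX_BATCH;
    *avail -= *cnt;
    return 0;
}

/* Drain everything, counting packets processed across batches -- the
 * shape a pmd thread's receive loop takes with the batched API. */
static int
drain_sketch(int avail)
{
    int total = 0, cnt;

    while (!rxq_recv_sketch(&avail, &cnt)) {
        total += cnt;
    }
    return total;
}
```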
ovs_assert(status);
ofpbuf_uninit(&buf);
}
- atomic_destroy(&dump->status);
nl_pool_release(dump->sock);
seq_destroy(dump->status_seq);
return status >> 1;
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
struct nlmsgerr *err = ofpbuf_at(msg, NLMSG_HDRLEN, sizeof *err);
int code = EPROTO;
if (!err) {
- VLOG_ERR_RL(&rl, "received invalid nlmsgerr (%"PRIuSIZE"d bytes < %"PRIuSIZE"d)",
+ VLOG_ERR_RL(&rl, "received invalid nlmsgerr (%"PRIu32" bytes < %"PRIuSIZE")",
msg->size, NLMSG_HDRLEN + sizeof *err);
} else if (err->error <= 0 && err->error > INT_MIN) {
code = -err->error;
void *
nl_msg_put_uninit(struct ofpbuf *msg, size_t size)
{
- size_t pad = NLMSG_ALIGN(size) - size;
+ size_t pad = PAD_SIZE(size, NLMSG_ALIGNTO);
char *p = ofpbuf_put_uninit(msg, size + pad);
if (pad) {
memset(p + size, 0, pad);
void *
nl_msg_push_uninit(struct ofpbuf *msg, size_t size)
{
- size_t pad = NLMSG_ALIGN(size) - size;
+ size_t pad = PAD_SIZE(size, NLMSG_ALIGNTO);
char *p = ofpbuf_push_uninit(msg, size + pad);
if (pad) {
memset(p + size, 0, pad);
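Several hunks above replace the `NLMSG_ALIGN(size) - size` / `ROUND_UP(x, y) - x` spelling with `PAD_SIZE(size, align)`. The identity being relied on can be checked in isolation; the macro body below is an assumption written for this sketch, not OVS's exact definition:

```c
/* Bytes of zero padding needed to round SIZE up to a multiple of ALIGN
 * (equivalent to ROUND_UP(SIZE, ALIGN) - SIZE). */
#define PAD_SIZE_SKETCH(SIZE, ALIGN) \
    ((((SIZE) + (ALIGN) - 1) / (ALIGN)) * (ALIGN) - (SIZE))
```

For Netlink the alignment is `NLMSG_ALIGNTO` (4 bytes), so a 5-byte attribute payload gets 3 bytes of padding.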
/*
- * Copyright (c) 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
if (!p) {
VLOG_DBG_RL(&rl, "nx_match length %u, rounded up to a "
"multiple of 8, is longer than space in message (max "
- "length %"PRIuSIZE")", match_len, b->size);
+ "length %"PRIu32")", match_len, b->size);
return OFPERR_OFPBMC_BAD_LEN;
}
}
if (!p) {
VLOG_DBG_RL(&rl, "oxm length %u, rounded up to a "
"multiple of 8, is longer than space in message (max "
- "length %"PRIuSIZE")", match_len, b->size);
+ "length %"PRIu32")", match_len, b->size);
return OFPERR_OFPBMC_BAD_LEN;
}
int match_len;
int i;
- BUILD_ASSERT_DECL(FLOW_WC_SEQ == 24);
+ BUILD_ASSERT_DECL(FLOW_WC_SEQ == 25);
/* Metadata. */
+ if (match->wc.masks.dp_hash) {
+ if (!oxm) {
+ nxm_put_32m(b, NXM_NX_DP_HASH, htonl(flow->dp_hash),
+ htonl(match->wc.masks.dp_hash));
+ }
+ }
+
+ if (match->wc.masks.recirc_id) {
+ if (!oxm) {
+ nxm_put_32(b, NXM_NX_RECIRC_ID, htonl(flow->recirc_id));
+ }
+ }
+
if (match->wc.masks.in_port.ofp_port) {
ofp_port_t in_port = flow->in_port.ofp_port;
if (oxm) {
{
int match_len = nx_put_raw(b, false, match, cookie, cookie_mask);
- ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
return match_len;
}
ofpbuf_put_uninit(b, sizeof *omh);
match_len = nx_put_raw(b, true, match, cookie, cookie_mask) + sizeof *omh;
- ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
omh = ofpbuf_at(b, start_len, sizeof *omh);
omh->type = htons(OFPMT_OXM);
nx_match_from_string(const char *s, struct ofpbuf *b)
{
int match_len = nx_match_from_string_raw(s, b);
- ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
return match_len;
}
ofpbuf_put_uninit(b, sizeof *omh);
match_len = nx_match_from_string_raw(s, b) + sizeof *omh;
- ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
omh = ofpbuf_at(b, start_len, sizeof *omh);
omh->type = htons(OFPMT_OXM);
/*
- * Copyright (c) 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
struct ds;
struct match;
+struct mf_field;
struct mf_subfield;
struct ofpact_reg_move;
struct ofpact_reg_load;
static void
set_arp(struct ofpbuf *packet, const struct ovs_key_arp *arp_key)
{
- struct arp_eth_header *arp = packet->l3;
+ struct arp_eth_header *arp = ofpbuf_get_l3(packet);
arp->ar_op = arp_key->arp_op;
memcpy(arp->ar_sha, arp_key->arp_sha, ETH_ADDR_LEN);
set_arp(packet, nl_attr_get_unspec(a, sizeof(struct ovs_key_arp)));
break;
+ case OVS_KEY_ATTR_DP_HASH:
+ md->dp_hash = nl_attr_get_u32(a);
+ break;
+
+ case OVS_KEY_ATTR_RECIRC_ID:
+ md->recirc_id = nl_attr_get_u32(a);
+ break;
+
case OVS_KEY_ATTR_UNSPEC:
case OVS_KEY_ATTR_ENCAP:
case OVS_KEY_ATTR_ETHERTYPE:
}
static void
-odp_execute_actions__(void *dp, struct ofpbuf *packet, struct pkt_metadata *,
+odp_execute_actions__(void *dp, struct ofpbuf *packet, bool steal,
+ struct pkt_metadata *,
const struct nlattr *actions, size_t actions_len,
odp_execute_cb dp_execute_action, bool more_actions);
static void
-odp_execute_sample(void *dp, struct ofpbuf *packet, struct pkt_metadata *md,
- const struct nlattr *action,
+odp_execute_sample(void *dp, struct ofpbuf *packet, bool steal,
+ struct pkt_metadata *md, const struct nlattr *action,
odp_execute_cb dp_execute_action, bool more_actions)
{
const struct nlattr *subactions = NULL;
}
}
- odp_execute_actions__(dp, packet, md, nl_attr_get(subactions),
+ odp_execute_actions__(dp, packet, steal, md, nl_attr_get(subactions),
nl_attr_get_size(subactions), dp_execute_action,
more_actions);
}
static void
-odp_execute_actions__(void *dp, struct ofpbuf *packet, struct pkt_metadata *md,
+odp_execute_actions__(void *dp, struct ofpbuf *packet, bool steal,
+ struct pkt_metadata *md,
const struct nlattr *actions, size_t actions_len,
odp_execute_cb dp_execute_action, bool more_actions)
{
/* These only make sense in the context of a datapath. */
case OVS_ACTION_ATTR_OUTPUT:
case OVS_ACTION_ATTR_USERSPACE:
+ case OVS_ACTION_ATTR_RECIRC:
if (dp_execute_action) {
+ bool may_steal;
/* Allow 'dp_execute_action' to steal the packet data if we do
* not need it any more. */
- bool steal = !more_actions && left <= NLA_ALIGN(a->nla_len);
- dp_execute_action(dp, packet, md, a, steal);
+ may_steal = steal && (!more_actions && left <= NLA_ALIGN(a->nla_len));
+ dp_execute_action(dp, packet, md, a, may_steal);
}
break;
break;
case OVS_ACTION_ATTR_SAMPLE:
- odp_execute_sample(dp, packet, md, a, dp_execute_action,
+ odp_execute_sample(dp, packet, steal, md, a, dp_execute_action,
more_actions || left > NLA_ALIGN(a->nla_len));
break;
}
void
-odp_execute_actions(void *dp, struct ofpbuf *packet, struct pkt_metadata *md,
+odp_execute_actions(void *dp, struct ofpbuf *packet, bool steal,
+ struct pkt_metadata *md,
const struct nlattr *actions, size_t actions_len,
odp_execute_cb dp_execute_action)
{
- odp_execute_actions__(dp, packet, md, actions, actions_len,
+ odp_execute_actions__(dp, packet, steal, md, actions, actions_len,
dp_execute_action, false);
+
+ if (!actions_len && steal) {
+ /* Drop action. */
+ ofpbuf_delete(packet);
+ }
}
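The `may_steal` computation in `odp_execute_actions__()` only permits stealing on the final datapath action, and only when the caller passed `steal`, so earlier actions always see a live packet. A minimal sketch of that rule (the `last_stealer` helper is hypothetical, written only to make the condition concrete):

```c
#include <stdbool.h>

/* Returns the index of the action that was allowed to steal the packet,
 * or -1 if none was: stealing is permitted only when the caller allowed
 * it ('steal') and no further actions remain, mirroring
 * "steal && !more_actions && left <= NLA_ALIGN(a->nla_len)". */
static int
last_stealer(bool steal, int n_actions)
{
    int stole_at = -1;
    int i;

    for (i = 0; i < n_actions; i++) {
        bool more_actions = i + 1 < n_actions;
        bool may_steal = steal && !more_actions;

        if (may_steal) {
            stole_at = i;
        }
    }
    return stole_at;
}
```

This also explains the new tail of `odp_execute_actions()`: with an empty action list nothing ever steals, so a `steal` caller must free the packet itself (the drop case).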
struct pkt_metadata;
typedef void (*odp_execute_cb)(void *dp, struct ofpbuf *packet,
- const struct pkt_metadata *,
+ struct pkt_metadata *,
const struct nlattr *action, bool may_steal);
/* Actions that need to be executed in the context of a datapath are handed
* to 'dp_execute_action', if non-NULL. Currently this is called only for
* actions OVS_ACTION_ATTR_OUTPUT and OVS_ACTION_ATTR_USERSPACE so
* 'dp_execute_action' needs to handle only these. */
-void
-odp_execute_actions(void *dp, struct ofpbuf *packet, struct pkt_metadata *,
+void odp_execute_actions(void *dp, struct ofpbuf *packet, bool steal,
+ struct pkt_metadata *,
const struct nlattr *actions, size_t actions_len,
odp_execute_cb dp_execute_action);
#endif
case OVS_ACTION_ATTR_POP_VLAN: return 0;
case OVS_ACTION_ATTR_PUSH_MPLS: return sizeof(struct ovs_action_push_mpls);
case OVS_ACTION_ATTR_POP_MPLS: return sizeof(ovs_be16);
+ case OVS_ACTION_ATTR_RECIRC: return sizeof(struct ovs_action_recirc);
case OVS_ACTION_ATTR_SET: return -2;
case OVS_ACTION_ATTR_SAMPLE: return -2;
case OVS_KEY_ATTR_ARP: return "arp";
case OVS_KEY_ATTR_ND: return "nd";
case OVS_KEY_ATTR_MPLS: return "mpls";
+ case OVS_KEY_ATTR_DP_HASH: return "dp_hash";
+ case OVS_KEY_ATTR_RECIRC_ID: return "recirc_id";
case __OVS_KEY_ATTR_MAX:
default:
format_odp_sample_action(struct ds *ds, const struct nlattr *attr)
{
static const struct nl_policy ovs_sample_policy[] = {
- { NL_A_NO_ATTR, 0, 0, false }, /* OVS_SAMPLE_ATTR_UNSPEC */
- { NL_A_U32, 0, 0, false }, /* OVS_SAMPLE_ATTR_PROBABILITY */
- { NL_A_NESTED, 0, 0, false }, /* OVS_SAMPLE_ATTR_ACTIONS */
+ [OVS_SAMPLE_ATTR_PROBABILITY] = { .type = NL_A_U32 },
+ [OVS_SAMPLE_ATTR_ACTIONS] = { .type = NL_A_NESTED }
};
struct nlattr *a[ARRAY_SIZE(ovs_sample_policy)];
double percentage;
format_odp_userspace_action(struct ds *ds, const struct nlattr *attr)
{
static const struct nl_policy ovs_userspace_policy[] = {
- { NL_A_NO_ATTR, 0, 0, false }, /* OVS_USERSPACE_ATTR_UNSPEC */
- { NL_A_U32, 0, 0, false }, /* OVS_USERSPACE_ATTR_PID */
- { NL_A_UNSPEC, 0, 0, true }, /* OVS_USERSPACE_ATTR_USERDATA */
+ [OVS_USERSPACE_ATTR_PID] = { .type = NL_A_U32 },
+ [OVS_USERSPACE_ATTR_USERDATA] = { .type = NL_A_UNSPEC,
+ .optional = true },
};
struct nlattr *a[ARRAY_SIZE(ovs_userspace_policy)];
const struct nlattr *userdata_attr;
}
}
+static void
+format_odp_recirc_action(struct ds *ds,
+ const struct ovs_action_recirc *act)
+{
+ ds_put_format(ds, "recirc(");
+
+ if (act->hash_alg == OVS_RECIRC_HASH_ALG_L4) {
+ ds_put_format(ds, "hash_l4(%"PRIu32"), ", act->hash_bias);
+ }
+
+ ds_put_format(ds, "%"PRIu32")", act->recirc_id);
+}
+
static void
format_odp_action(struct ds *ds, const struct nlattr *a)
{
case OVS_ACTION_ATTR_USERSPACE:
format_odp_userspace_action(ds, a);
break;
+ case OVS_ACTION_ATTR_RECIRC:
+ format_odp_recirc_action(ds, nl_attr_get(a));
+ break;
case OVS_ACTION_ATTR_SET:
ds_put_cstr(ds, "set(");
format_odp_key_attr(nl_attr_get(a), NULL, NULL, ds, true);
case OVS_KEY_ATTR_ENCAP: return -2;
case OVS_KEY_ATTR_PRIORITY: return 4;
case OVS_KEY_ATTR_SKB_MARK: return 4;
+ case OVS_KEY_ATTR_DP_HASH: return 4;
+ case OVS_KEY_ATTR_RECIRC_ID: return 4;
case OVS_KEY_ATTR_TUNNEL: return -2;
case OVS_KEY_ATTR_IN_PORT: return 4;
case OVS_KEY_ATTR_ETHERNET: return sizeof(struct ovs_key_ethernet);
case OVS_KEY_ATTR_PRIORITY:
case OVS_KEY_ATTR_SKB_MARK:
+ case OVS_KEY_ATTR_DP_HASH:
+ case OVS_KEY_ATTR_RECIRC_ID:
ds_put_format(ds, "%#"PRIx32, nl_attr_get_u32(a));
if (!is_exact) {
ds_put_format(ds, "/%#"PRIx32, nl_attr_get_u32(ma));
}
break;
}
-
case OVS_KEY_ATTR_UNSPEC:
case __OVS_KEY_ATTR_MAX:
default:
}
}
+ {
+ uint32_t recirc_id;
+ int n = -1;
+
+ if (ovs_scan(s, "recirc_id(%"SCNi32")%n", &recirc_id, &n)) {
+ nl_msg_put_u32(key, OVS_KEY_ATTR_RECIRC_ID, recirc_id);
+ nl_msg_put_u32(mask, OVS_KEY_ATTR_RECIRC_ID, UINT32_MAX);
+ return n;
+ }
+ }
+
+ {
+ uint32_t dp_hash;
+ uint32_t dp_hash_mask;
+ int n = -1;
+
+ if (mask && ovs_scan(s, "dp_hash(%"SCNi32"/%"SCNi32")%n", &dp_hash,
+ &dp_hash_mask, &n)) {
+ nl_msg_put_u32(key, OVS_KEY_ATTR_DP_HASH, dp_hash);
+ nl_msg_put_u32(mask, OVS_KEY_ATTR_DP_HASH, dp_hash_mask);
+ return n;
+ } else if (ovs_scan(s, "dp_hash(%"SCNi32")%n", &dp_hash, &n)) {
+ nl_msg_put_u32(key, OVS_KEY_ATTR_DP_HASH, dp_hash);
+ if (mask) {
+ nl_msg_put_u32(mask, OVS_KEY_ATTR_DP_HASH, UINT32_MAX);
+ }
+ return n;
+ }
+ }
+
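The `ovs_scan()` clauses above parse tokens like `recirc_id(0x10)` out of a flow string, using `%n` to report how many bytes were consumed. For readers without the OVS tree, plain `sscanf()` approximates the same behavior (`parse_recirc_id` is a hypothetical helper; `%i` accepts decimal, `0x` hex, and leading-zero octal, as `SCNi32` does):

```c
#include <stdio.h>

/* Returns the number of bytes consumed on a successful parse of
 * "recirc_id(N)" at the start of 's', storing N in '*id'; returns 0 on
 * no match.  '%n' is only assigned if the closing ')' matched, so a
 * truncated token is rejected. */
static int
parse_recirc_id(const char *s, unsigned int *id)
{
    int v, n = -1;

    if (sscanf(s, "recirc_id(%i)%n", &v, &n) == 1 && n > 0) {
        *id = (unsigned int) v;
        return n;
    }
    return 0;
}
```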
{
uint64_t tun_id, tun_id_mask;
struct flow_tnl tun_key, tun_key_mask;
nl_msg_put_u32(buf, OVS_KEY_ATTR_SKB_MARK, data->pkt_mark);
+ if (flow->recirc_id) {
+ nl_msg_put_u32(buf, OVS_KEY_ATTR_RECIRC_ID, data->recirc_id);
+ }
+
+ if (flow->dp_hash) {
+ nl_msg_put_u32(buf, OVS_KEY_ATTR_DP_HASH, data->dp_hash);
+ }
+
/* Add an ingress port attribute if this is a mask or 'odp_in_port'
* is not the magical value "ODPP_NONE". */
if (is_mask || odp_in_port != ODPP_NONE) {
continue;
}
- if (type == OVS_KEY_ATTR_PRIORITY) {
+ switch (type) {
+ case OVS_KEY_ATTR_RECIRC_ID:
+ md->recirc_id = nl_attr_get_u32(nla);
+ wanted_attrs &= ~(1u << OVS_KEY_ATTR_RECIRC_ID);
+ break;
+ case OVS_KEY_ATTR_DP_HASH:
+ md->dp_hash = nl_attr_get_u32(nla);
+ wanted_attrs &= ~(1u << OVS_KEY_ATTR_DP_HASH);
+ break;
+ case OVS_KEY_ATTR_PRIORITY:
md->skb_priority = nl_attr_get_u32(nla);
wanted_attrs &= ~(1u << OVS_KEY_ATTR_PRIORITY);
- } else if (type == OVS_KEY_ATTR_SKB_MARK) {
+ break;
+ case OVS_KEY_ATTR_SKB_MARK:
md->pkt_mark = nl_attr_get_u32(nla);
wanted_attrs &= ~(1u << OVS_KEY_ATTR_SKB_MARK);
- } else if (type == OVS_KEY_ATTR_TUNNEL) {
+ break;
+ case OVS_KEY_ATTR_TUNNEL: {
enum odp_key_fitness res;
res = odp_tun_key_from_attr(nla, &md->tunnel);
} else if (res == ODP_FIT_PERFECT) {
wanted_attrs &= ~(1u << OVS_KEY_ATTR_TUNNEL);
}
- } else if (type == OVS_KEY_ATTR_IN_PORT) {
+ break;
+ }
+ case OVS_KEY_ATTR_IN_PORT:
md->in_port.odp_port = nl_attr_get_odp_port(nla);
wanted_attrs &= ~(1u << OVS_KEY_ATTR_IN_PORT);
+ break;
+ default:
+ break;
}
if (!wanted_attrs) {
? attrs[OVS_KEY_ATTR_ENCAP] : NULL);
enum odp_key_fitness encap_fitness;
enum odp_key_fitness fitness;
- ovs_be16 tci;
/* Calculate fitness of outer attributes. */
if (!is_mask) {
fitness = check_expectations(present_attrs, out_of_range_attr,
expected_attrs, key, key_len);
- /* Get the VLAN TCI value. */
- if (!is_mask && !(present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_VLAN))) {
- return ODP_FIT_TOO_LITTLE;
+ /* Set vlan_tci.
+ * Remove the TPID from dl_type since it's not the real Ethertype. */
+ flow->dl_type = htons(0);
+ flow->vlan_tci = (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_VLAN)
+ ? nl_attr_get_be16(attrs[OVS_KEY_ATTR_VLAN])
+ : htons(0));
+ if (!is_mask) {
+ if (!(present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_VLAN))) {
+ return ODP_FIT_TOO_LITTLE;
+ } else if (flow->vlan_tci == htons(0)) {
+ /* Corner case for a truncated 802.1Q header. */
+ if (fitness == ODP_FIT_PERFECT && nl_attr_get_size(encap)) {
+ return ODP_FIT_TOO_MUCH;
+ }
+ return fitness;
+ } else if (!(flow->vlan_tci & htons(VLAN_CFI))) {
+ VLOG_ERR_RL(&rl, "OVS_KEY_ATTR_VLAN 0x%04"PRIx16" is nonzero "
+ "but CFI bit is not set", ntohs(flow->vlan_tci));
+ return ODP_FIT_ERROR;
+ }
} else {
- tci = (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_VLAN)
- ? nl_attr_get_be16(attrs[OVS_KEY_ATTR_VLAN])
- : htons(0));
- if (!is_mask) {
- if (tci == htons(0)) {
- /* Corner case for a truncated 802.1Q header. */
- if (fitness == ODP_FIT_PERFECT && nl_attr_get_size(encap)) {
- return ODP_FIT_TOO_MUCH;
- }
- return fitness;
- } else if (!(tci & htons(VLAN_CFI))) {
- VLOG_ERR_RL(&rl, "OVS_KEY_ATTR_VLAN 0x%04"PRIx16" is nonzero "
- "but CFI bit is not set", ntohs(tci));
- return ODP_FIT_ERROR;
- }
+ if (!(present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ENCAP))) {
+ return fitness;
}
- /* Set vlan_tci.
- * Remove the TPID from dl_type since it's not the real Ethertype. */
- flow->dl_type = htons(0);
- flow->vlan_tci = tci;
}
- if (is_mask && !(present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ENCAP))) {
- return fitness;
- }
/* Now parse the encapsulated attributes. */
if (!parse_flow_nlattrs(nl_attr_get(encap), nl_attr_get_size(encap),
attrs, &present_attrs, &out_of_range_attr)) {
expected_attrs = 0;
/* Metadata. */
+ if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_RECIRC_ID)) {
+ flow->recirc_id = nl_attr_get_u32(attrs[OVS_KEY_ATTR_RECIRC_ID]);
+ expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_RECIRC_ID;
+ } else if (is_mask) {
+ /* Always match recirc_id exactly when the datapath does not specify it. */
+ flow->recirc_id = UINT32_MAX;
+ }
+
+ if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_DP_HASH)) {
+ flow->dp_hash = nl_attr_get_u32(attrs[OVS_KEY_ATTR_DP_HASH]);
+ expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_DP_HASH;
+ }
if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_PRIORITY)) {
flow->skb_priority = nl_attr_get_u32(attrs[OVS_KEY_ATTR_PRIORITY]);
expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY;
case OFPUTIL_ACTION_INVALID:
#define OFPAT10_ACTION(ENUM, STRUCT, NAME) case OFPUTIL_##ENUM:
#define OFPAT11_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) case OFPUTIL_##ENUM:
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) case OFPUTIL_##ENUM:
#include "ofp-util.def"
OVS_NOT_REACHED();
switch (code) {
case OFPUTIL_ACTION_INVALID:
#define OFPAT11_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) case OFPUTIL_##ENUM:
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) case OFPUTIL_##ENUM:
#include "ofp-util.def"
OVS_NOT_REACHED();
actions = ofpbuf_try_pull(openflow, actions_len);
if (actions == NULL) {
VLOG_WARN_RL(&rl, "OpenFlow message actions length %u exceeds "
- "remaining message length (%"PRIuSIZE")",
+ "remaining message length (%"PRIu32")",
actions_len, openflow->size);
return OFPERR_OFPBRC_BAD_LEN;
}
switch (code) {
case OFPUTIL_ACTION_INVALID:
#define OFPAT10_ACTION(ENUM, STRUCT, NAME) case OFPUTIL_##ENUM:
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) case OFPUTIL_##ENUM:
#include "ofp-util.def"
OVS_NOT_REACHED();
}
}
+enum ofperr
+ovs_instruction_type_from_inst_type(enum ovs_instruction_type *instruction_type,
+ const uint16_t inst_type)
+{
+ switch (inst_type) {
+
+#define DEFINE_INST(ENUM, STRUCT, EXTENSIBLE, NAME) \
+ case ENUM: \
+ *instruction_type = OVSINST_##ENUM; \
+ return 0;
+OVS_INSTRUCTIONS
+#undef DEFINE_INST
+
+ default:
+ return OFPERR_OFPBIC_UNKNOWN_INST;
+ }
+}
+
static inline struct ofp11_instruction *
instruction_next(const struct ofp11_instruction *inst)
{
instructions = ofpbuf_try_pull(openflow, instructions_len);
if (instructions == NULL) {
VLOG_WARN_RL(&rl, "OpenFlow message instructions length %u exceeds "
- "remaining message length (%"PRIuSIZE")",
+ "remaining message length (%"PRIu32")",
instructions_len, openflow->size);
error = OFPERR_OFPBIC_BAD_LEN;
goto exit;
void
ofpact_pad(struct ofpbuf *ofpacts)
{
- unsigned int rem = ofpacts->size % OFPACT_ALIGNTO;
- if (rem) {
- ofpbuf_put_zeros(ofpacts, OFPACT_ALIGNTO - rem);
+ unsigned int pad = PAD_SIZE(ofpacts->size, OFPACT_ALIGNTO);
+ if (pad) {
+ ofpbuf_put_zeros(ofpacts, pad);
}
}
* Used for OFPIT11_WRITE_ACTIONS. */
struct ofpact_nest {
struct ofpact ofpact;
- uint8_t pad[OFPACT_ALIGN(sizeof(struct ofpact)) - sizeof(struct ofpact)];
+ uint8_t pad[PAD_SIZE(sizeof(struct ofpact), OFPACT_ALIGNTO)];
struct ofpact actions[];
};
-BUILD_ASSERT_DECL(offsetof(struct ofpact_nest, actions) == OFPACT_ALIGNTO);
+BUILD_ASSERT_DECL(offsetof(struct ofpact_nest, actions) % OFPACT_ALIGNTO == 0);
static inline size_t
ofpact_nest_get_action_len(const struct ofpact_nest *on)
int ovs_instruction_type_from_name(const char *name);
enum ovs_instruction_type ovs_instruction_type_from_ofpact_type(
enum ofpact_type);
+enum ofperr ovs_instruction_type_from_inst_type(
+ enum ovs_instruction_type *instruction_type, const uint16_t inst_type);
+
#endif /* ofp-actions.h */
return &ofperr_of12;
case OFP13_VERSION:
return &ofperr_of13;
+ case OFP14_VERSION:
+ return &ofperr_of14;
default:
return NULL;
}
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
return type == OFPT11_STATS_REQUEST;
}
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
return type == OFPT11_STATS_REPLY;
}
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
if (hdrs->type == OFPT11_STATS_REQUEST ||
hdrs->type == OFPT11_STATS_REPLY) {
return (hdrs->stat == OFPST_VENDOR
enum ofpraw raw;
/* Set default outputs. */
- msg->l2 = msg->l3 = msg->data;
+ msg->l2 = msg->data;
+ ofpbuf_set_l3(msg, msg->data);
*rawp = 0;
len = msg->size;
info = raw_info_get(raw);
instance = raw_instance_get(info, hdrs.version);
msg->l2 = ofpbuf_pull(msg, instance->hdrs_len);
- msg->l3 = msg->data;
+ ofpbuf_set_l3(msg, msg->data);
min_len = instance->hdrs_len + info->min_body;
switch (info->extra_multiple) {
ofpbuf_prealloc_tailroom(buf, (instance->hdrs_len + info->min_body
+ extra_tailroom));
buf->l2 = ofpbuf_put_uninit(buf, instance->hdrs_len);
- buf->l3 = ofpbuf_tail(buf);
+ ofpbuf_set_l3(buf, ofpbuf_tail(buf));
oh = buf->l2;
oh->version = version;
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
ovs_assert(hdrs.type == OFPT11_STATS_REQUEST);
hdrs.type = OFPT11_STATS_REPLY;
break;
next = ofpbuf_new(MAX(1024, hdrs_len + len));
ofpbuf_put(next, msg->data, hdrs_len);
next->l2 = next->data;
- next->l3 = ofpbuf_tail(next);
+ ofpbuf_set_l3(next, ofpbuf_tail(next));
list_push_back(replies, &next->list_node);
*ofpmp_flags__(msg->data) |= htons(OFPSF_REPLY_MORE);
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
return &((struct ofp11_stats_msg *) oh)->flags;
default:
OVS_NOT_REACHED();
OFPRAW_OFPST11_TABLE_REPLY,
/* OFPST 1.2 (3): struct ofp12_table_stats[]. */
OFPRAW_OFPST12_TABLE_REPLY,
- /* OFPST 1.3 (3): struct ofp13_table_stats[]. */
+ /* OFPST 1.3+ (3): struct ofp13_table_stats[]. */
OFPRAW_OFPST13_TABLE_REPLY,
/* OFPST 1.0 (4): struct ofp10_port_stats_request. */
/* OFPST 1.1-1.2 (6): uint8_t[8][]. */
OFPRAW_OFPST11_GROUP_REPLY,
- /* OFPST 1.3 (6): uint8_t[8][]. */
+ /* OFPST 1.3+ (6): uint8_t[8][]. */
OFPRAW_OFPST13_GROUP_REPLY,
/* OFPST 1.1+ (7): void. */
case OFPUTIL_OFPAT10_OUTPUT:
case OFPUTIL_OFPAT11_OUTPUT:
+ case OFPUTIL_OFPAT13_OUTPUT:
error = parse_output(arg, ofpacts);
break;
break;
case OFPUTIL_OFPAT12_SET_FIELD:
+ case OFPUTIL_OFPAT13_SET_FIELD:
return set_field_parse(arg, ofpacts, usable_protocols);
case OFPUTIL_OFPAT10_STRIP_VLAN:
case OFPUTIL_OFPAT11_POP_VLAN:
+ case OFPUTIL_OFPAT13_POP_VLAN:
ofpact_put_STRIP_VLAN(ofpacts)->ofpact.compat = code;
break;
case OFPUTIL_OFPAT11_PUSH_VLAN:
+ case OFPUTIL_OFPAT13_PUSH_VLAN:
*usable_protocols &= OFPUTIL_P_OF11_UP;
error = str_to_u16(arg, "ethertype", ðertype);
if (error) {
break;
case OFPUTIL_OFPAT11_SET_QUEUE:
+ case OFPUTIL_OFPAT13_SET_QUEUE:
error = str_to_u32(arg, &ofpact_put_SET_QUEUE(ofpacts)->queue_id);
break;
break;
case OFPUTIL_OFPAT11_SET_NW_TTL:
+ case OFPUTIL_OFPAT13_SET_NW_TTL:
error = str_to_u8(arg, "TTL", &ttl);
if (error) {
return error;
break;
case OFPUTIL_OFPAT11_DEC_NW_TTL:
+ case OFPUTIL_OFPAT13_DEC_NW_TTL:
OVS_NOT_REACHED();
case OFPUTIL_OFPAT10_SET_TP_SRC:
case OFPUTIL_NXAST_SET_MPLS_TTL:
case OFPUTIL_OFPAT11_SET_MPLS_TTL:
+ case OFPUTIL_OFPAT13_SET_MPLS_TTL:
error = parse_set_mpls_ttl(ofpacts, arg);
break;
case OFPUTIL_OFPAT11_DEC_MPLS_TTL:
+ case OFPUTIL_OFPAT13_DEC_MPLS_TTL:
case OFPUTIL_NXAST_DEC_MPLS_TTL:
ofpact_put_DEC_MPLS_TTL(ofpacts);
break;
break;
case OFPUTIL_OFPAT11_PUSH_MPLS:
+ case OFPUTIL_OFPAT13_PUSH_MPLS:
case OFPUTIL_NXAST_PUSH_MPLS:
error = str_to_u16(arg, "push_mpls", ðertype);
if (!error) {
break;
case OFPUTIL_OFPAT11_POP_MPLS:
+ case OFPUTIL_OFPAT13_POP_MPLS:
case OFPUTIL_NXAST_POP_MPLS:
error = str_to_u16(arg, "pop_mpls", ðertype);
if (!error) {
break;
case OFPUTIL_OFPAT11_GROUP:
+ case OFPUTIL_OFPAT13_GROUP:
error = str_to_u32(arg, &ofpact_put_GROUP(ofpacts)->group_id);
break;
+        /* FIXME: handle these once the OFPAT13_* actions are implemented. */
+ case OFPUTIL_OFPAT13_COPY_TTL_OUT:
+ case OFPUTIL_OFPAT13_COPY_TTL_IN:
+ case OFPUTIL_OFPAT13_PUSH_PBB:
+ case OFPUTIL_OFPAT13_POP_PBB:
+ OVS_NOT_REACHED();
+
case OFPUTIL_NXAST_STACK_PUSH:
error = nxm_parse_stack_action(ofpact_put_STACK_PUSH(ofpacts), arg);
break;
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
ofp_packet_to_string(const void *data, size_t len)
{
struct ds ds = DS_EMPTY_INITIALIZER;
- const struct pkt_metadata md = PKT_METADATA_INITIALIZER(ODPP_NONE);
+ const struct pkt_metadata md = PKT_METADATA_INITIALIZER(0);
struct ofpbuf buf;
struct flow flow;
+ size_t l4_size;
ofpbuf_use_const(&buf, data, len);
flow_extract(&buf, &md, &flow);
flow_format(&ds, &flow);
- if (buf.l7) {
- if (flow.nw_proto == IPPROTO_TCP) {
- struct tcp_header *th = buf.l4;
- ds_put_format(&ds, " tcp_csum:%"PRIx16,
- ntohs(th->tcp_csum));
- } else if (flow.nw_proto == IPPROTO_UDP) {
- struct udp_header *uh = buf.l4;
- ds_put_format(&ds, " udp_csum:%"PRIx16,
- ntohs(uh->udp_csum));
- } else if (flow.nw_proto == IPPROTO_SCTP) {
- struct sctp_header *sh = buf.l4;
- ds_put_format(&ds, " sctp_csum:%"PRIx32,
- ntohl(sh->sctp_csum));
- }
+ l4_size = ofpbuf_get_l4_size(&buf);
+
+ if (flow.nw_proto == IPPROTO_TCP && l4_size >= TCP_HEADER_LEN) {
+ struct tcp_header *th = ofpbuf_get_l4(&buf);
+ ds_put_format(&ds, " tcp_csum:%"PRIx16, ntohs(th->tcp_csum));
+ } else if (flow.nw_proto == IPPROTO_UDP && l4_size >= UDP_HEADER_LEN) {
+ struct udp_header *uh = ofpbuf_get_l4(&buf);
+ ds_put_format(&ds, " udp_csum:%"PRIx16, ntohs(uh->udp_csum));
+ } else if (flow.nw_proto == IPPROTO_SCTP && l4_size >= SCTP_HEADER_LEN) {
+ struct sctp_header *sh = ofpbuf_get_l4(&buf);
+ ds_put_format(&ds, " sctp_csum:%"PRIx32, ntohl(sh->sctp_csum));
}
ds_put_char(&ds, '\n');
case OFP12_VERSION:
break;
case OFP13_VERSION:
+ case OFP14_VERSION:
return; /* no ports in ofp13_switch_features */
default:
OVS_NOT_REACHED();
static void print_wild(struct ds *string, const char *leader, int is_wild,
int verbosity, const char *format, ...)
- __attribute__((format(printf, 5, 6)));
+ PRINTF_FORMAT(5, 6);
static void print_wild(struct ds *string, const char *leader, int is_wild,
int verbosity, const char *format, ...)
int verbosity)
{
switch ((enum ofp_version)oh->version) {
+ case OFP14_VERSION:
case OFP13_VERSION:
ofp_print_ofpst_table_reply13(string, oh, verbosity);
break;
ofp_print_version(oh, string);
}
-static void
-ofp_print_not_implemented(struct ds *string)
-{
- ds_put_cstr(string, "NOT IMPLEMENTED YET!\n");
-}
-
static void
ofp_print_group(struct ds *s, uint32_t group_id, uint8_t type,
struct list *p_buckets)
ofp_print_group(s, gm.group_id, gm.type, &gm.buckets);
}
+static const char *
+ofp13_action_to_string(uint32_t bit)
+{
+ switch (bit) {
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
+ case 1u << ENUM: return NAME;
+#include "ofp-util.def"
+ }
+ return NULL;
+}
+
+static void
+print_table_action_features(struct ds *s,
+ const struct ofputil_table_action_features *taf)
+{
+ ds_put_cstr(s, " actions: ");
+ ofp_print_bit_names(s, taf->actions, ofp13_action_to_string, ',');
+ ds_put_char(s, '\n');
+
+ ds_put_cstr(s, " supported on Set-Field: ");
+ if (taf->set_fields) {
+ int i;
+
+ for (i = 0; i < MFF_N_IDS; i++) {
+ uint64_t bit = UINT64_C(1) << i;
+
+ if (taf->set_fields & bit) {
+ ds_put_format(s, "%s,", mf_from_id(i)->name);
+ }
+ }
+ ds_chomp(s, ',');
+ } else {
+ ds_put_cstr(s, "none");
+ }
+ ds_put_char(s, '\n');
+}
+
+static bool
+table_action_features_equal(const struct ofputil_table_action_features *a,
+ const struct ofputil_table_action_features *b)
+{
+ return a->actions == b->actions && a->set_fields == b->set_fields;
+}
+
+static void
+print_table_instruction_features(
+ struct ds *s, const struct ofputil_table_instruction_features *tif)
+{
+ int start, end;
+
+ ds_put_cstr(s, " next tables: ");
+ for (start = bitmap_scan(tif->next, 1, 0, 255); start < 255;
+ start = bitmap_scan(tif->next, 1, end, 255)) {
+ end = bitmap_scan(tif->next, 0, start + 1, 255);
+ if (end == start + 1) {
+ ds_put_format(s, "%d,", start);
+ } else {
+ ds_put_format(s, "%d-%d,", start, end - 1);
+ }
+ }
+ ds_chomp(s, ',');
+ if (ds_last(s) == ' ') {
+ ds_put_cstr(s, "none");
+ }
+ ds_put_char(s, '\n');
+
+ ds_put_cstr(s, " instructions: ");
+ if (tif->instructions) {
+ int i;
+
+ for (i = 0; i < 32; i++) {
+ if (tif->instructions & (1u << i)) {
+ ds_put_format(s, "%s,", ovs_instruction_name_from_type(i));
+ }
+ }
+ ds_chomp(s, ',');
+ } else {
+ ds_put_cstr(s, "none");
+ }
+ ds_put_char(s, '\n');
+
+ if (table_action_features_equal(&tif->write, &tif->apply)) {
+ ds_put_cstr(s, " Write-Actions and Apply-Actions features:\n");
+ print_table_action_features(s, &tif->write);
+ } else {
+ ds_put_cstr(s, " Write-Actions features:\n");
+ print_table_action_features(s, &tif->write);
+ ds_put_cstr(s, " Apply-Actions features:\n");
+ print_table_action_features(s, &tif->apply);
+ }
+}
+
+static bool
+table_instruction_features_equal(
+ const struct ofputil_table_instruction_features *a,
+ const struct ofputil_table_instruction_features *b)
+{
+ return (bitmap_equal(a->next, b->next, 255)
+ && a->instructions == b->instructions
+ && table_action_features_equal(&a->write, &b->write)
+ && table_action_features_equal(&a->apply, &b->apply));
+}
+
+static void
+ofp_print_table_features(struct ds *s, const struct ofp_header *oh)
+{
+ struct ofpbuf b;
+
+ ofpbuf_use_const(&b, oh, ntohs(oh->length));
+
+ for (;;) {
+ struct ofputil_table_features tf;
+ int retval;
+ int i;
+
+ retval = ofputil_decode_table_features(&b, &tf, true);
+ if (retval) {
+ if (retval != EOF) {
+ ofp_print_error(s, retval);
+ }
+ return;
+ }
+
+ ds_put_format(s, "\n table %"PRIu8":\n", tf.table_id);
+ ds_put_format(s, " name=\"%s\"\n", tf.name);
+ ds_put_format(s, " metadata: match=%#"PRIx64" write=%#"PRIx64"\n",
+ ntohll(tf.metadata_match), ntohll(tf.metadata_write));
+
+ ds_put_cstr(s, " config=");
+ ofp_print_table_miss_config(s, tf.config);
+
+ ds_put_format(s, " max_entries=%"PRIu32"\n", tf.max_entries);
+
+ if (table_instruction_features_equal(&tf.nonmiss, &tf.miss)) {
+ ds_put_cstr(s, " instructions (table miss and others):\n");
+ print_table_instruction_features(s, &tf.nonmiss);
+ } else {
+ ds_put_cstr(s, " instructions (other than table miss):\n");
+ print_table_instruction_features(s, &tf.nonmiss);
+ ds_put_cstr(s, " instructions (table miss):\n");
+ print_table_instruction_features(s, &tf.miss);
+ }
+
+ ds_put_cstr(s, " matching:\n");
+ for (i = 0; i < MFF_N_IDS; i++) {
+ uint64_t bit = UINT64_C(1) << i;
+
+ if (tf.match & bit) {
+ const struct mf_field *f = mf_from_id(i);
+
+ ds_put_format(s, " %s: %s\n",
+ f->name,
+ (tf.mask ? "arbitrary mask"
+ : tf.wildcard ? "exact match or wildcard"
+ : "must exact match"));
+ }
+ }
+ }
+}
+
static void
ofp_to_string__(const struct ofp_header *oh, enum ofpraw raw,
struct ds *string, int verbosity)
case OFPTYPE_TABLE_FEATURES_STATS_REQUEST:
case OFPTYPE_TABLE_FEATURES_STATS_REPLY:
- ofp_print_not_implemented(string);
+ ofp_print_table_features(string, oh);
break;
case OFPTYPE_HELLO:
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include "unaligned.h"
#include "type-props.h"
#include "vlog.h"
+#include "bitmap.h"
VLOG_DEFINE_THIS_MODULE(ofp_util);
void
ofputil_wildcard_from_ofpfw10(uint32_t ofpfw, struct flow_wildcards *wc)
{
- BUILD_ASSERT_DECL(FLOW_WC_SEQ == 24);
+ BUILD_ASSERT_DECL(FLOW_WC_SEQ == 25);
/* Initialize most of wc. */
flow_wildcards_init_catchall(wc);
case OFPUTIL_P_OF12_OXM:
case OFPUTIL_P_OF13_OXM:
+ case OFPUTIL_P_OF14_OXM:
return NXM_TYPICAL_LEN;
default:
case OFPUTIL_P_OF12_OXM:
case OFPUTIL_P_OF13_OXM:
+ case OFPUTIL_P_OF14_OXM:
return oxm_put_match(b, match);
}
};
/* Most users really don't care about some of the differences between
- * protocols. These abbreviations help with that. */
+ * protocols. These abbreviations help with that.
+ *
+ * Until it is safe to use the OpenFlow 1.4 protocol (which currently can
+ * cause aborts due to unimplemented features), we omit OpenFlow 1.4 from all
+ * abbreviations. */
static const struct proto_abbrev proto_abbrevs[] = {
- { OFPUTIL_P_ANY, "any" },
- { OFPUTIL_P_OF10_STD_ANY, "OpenFlow10" },
- { OFPUTIL_P_OF10_NXM_ANY, "NXM" },
- { OFPUTIL_P_ANY_OXM, "OXM" },
+ { OFPUTIL_P_ANY & ~OFPUTIL_P_OF14_OXM, "any" },
+ { OFPUTIL_P_OF10_STD_ANY & ~OFPUTIL_P_OF14_OXM, "OpenFlow10" },
+ { OFPUTIL_P_OF10_NXM_ANY & ~OFPUTIL_P_OF14_OXM, "NXM" },
+ { OFPUTIL_P_ANY_OXM & ~OFPUTIL_P_OF14_OXM, "OXM" },
};
#define N_PROTO_ABBREVS ARRAY_SIZE(proto_abbrevs)
enum ofputil_protocol ofputil_flow_dump_protocols[] = {
+ OFPUTIL_P_OF14_OXM,
OFPUTIL_P_OF13_OXM,
OFPUTIL_P_OF12_OXM,
OFPUTIL_P_OF11_STD,
return OFPUTIL_P_OF12_OXM;
case OFP13_VERSION:
return OFPUTIL_P_OF13_OXM;
+ case OFP14_VERSION:
+ return OFPUTIL_P_OF14_OXM;
default:
return 0;
}
return OFP12_VERSION;
case OFPUTIL_P_OF13_OXM:
return OFP13_VERSION;
+ case OFPUTIL_P_OF14_OXM:
+ return OFP14_VERSION;
}
OVS_NOT_REACHED();
case OFPUTIL_P_OF13_OXM:
return OFPUTIL_P_OF13_OXM;
+ case OFPUTIL_P_OF14_OXM:
+ return OFPUTIL_P_OF14_OXM;
+
default:
OVS_NOT_REACHED();
}
case OFPUTIL_P_OF13_OXM:
return ofputil_protocol_set_tid(OFPUTIL_P_OF13_OXM, tid);
+ case OFPUTIL_P_OF14_OXM:
+ return ofputil_protocol_set_tid(OFPUTIL_P_OF14_OXM, tid);
+
default:
OVS_NOT_REACHED();
}
case OFPUTIL_P_OF13_OXM:
return "OXM-OpenFlow13";
+
+ case OFPUTIL_P_OF14_OXM:
+ return "OXM-OpenFlow14";
}
/* Check abbreviations. */
if (!strcasecmp(s, "OpenFlow13")) {
return OFP13_VERSION;
}
+ if (!strcasecmp(s, "OpenFlow14")) {
+ return OFP14_VERSION;
+ }
return 0;
}
return "OpenFlow12";
case OFP13_VERSION:
return "OpenFlow13";
+ case OFP14_VERSION:
+ return "OpenFlow14";
default:
OVS_NOT_REACHED();
}
case OFPUTIL_P_OF11_STD:
case OFPUTIL_P_OF12_OXM:
case OFPUTIL_P_OF13_OXM:
+ case OFPUTIL_P_OF14_OXM:
/* There is only one variant of each OpenFlow 1.1+ protocol, and we
* verified above that we're not trying to change versions. */
OVS_NOT_REACHED();
omc = ofpbuf_try_pull(msg, sizeof *omc);
if (!omc) {
VLOG_WARN_RL(&bad_ofmsg_rl,
- "OFPMP_METER_CONFIG reply has %"PRIuSIZE" leftover bytes at end",
+ "OFPMP_METER_CONFIG reply has %"PRIu32" leftover bytes at end",
msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
oms = ofpbuf_try_pull(msg, sizeof *oms);
if (!oms) {
VLOG_WARN_RL(&bad_ofmsg_rl,
- "OFPMP_METER reply has %"PRIuSIZE" leftover bytes at end",
+ "OFPMP_METER reply has %"PRIu32" leftover bytes at end",
msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
switch (protocol) {
case OFPUTIL_P_OF11_STD:
case OFPUTIL_P_OF12_OXM:
- case OFPUTIL_P_OF13_OXM: {
+ case OFPUTIL_P_OF13_OXM:
+ case OFPUTIL_P_OF14_OXM: {
struct ofp11_flow_mod *ofm;
int tailroom;
nfm->command = ofputil_tid_command(fm, protocol);
nfm->cookie = fm->new_cookie;
match_len = nx_put_match(msg, &fm->match, fm->cookie, fm->cookie_mask);
- nfm = msg->l3;
+ nfm = ofpbuf_get_l3(msg);
nfm->idle_timeout = htons(fm->idle_timeout);
nfm->hard_timeout = htons(fm->hard_timeout);
nfm->priority = htons(fm->priority);
struct ofp12_packet_queue *opq12;
ovs_be32 port;
- qgcr11 = reply->l3;
+ qgcr11 = ofpbuf_get_l3(reply);
port = qgcr11->port;
opq12 = ofpbuf_put_zeros(reply, sizeof *opq12);
switch (protocol) {
case OFPUTIL_P_OF11_STD:
case OFPUTIL_P_OF12_OXM:
- case OFPUTIL_P_OF13_OXM: {
+ case OFPUTIL_P_OF13_OXM:
+ case OFPUTIL_P_OF14_OXM: {
struct ofp11_flow_stats_request *ofsr;
raw = (fsr->aggregate
match_len = nx_put_match(msg, &fsr->match,
fsr->cookie, fsr->cookie_mask);
- nfsr = msg->l3;
+ nfsr = ofpbuf_get_l3(msg);
nfsr->out_port = htons(ofp_to_u16(fsr->out_port));
nfsr->match_len = htons(match_len);
nfsr->table_id = fsr->table_id;
ofs = ofpbuf_try_pull(msg, sizeof *ofs);
if (!ofs) {
- VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_FLOW reply has %"PRIuSIZE" leftover "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_FLOW reply has %"PRIu32" leftover "
"bytes at end", msg->size);
return EINVAL;
}
ofs = ofpbuf_try_pull(msg, sizeof *ofs);
if (!ofs) {
- VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_FLOW reply has %"PRIuSIZE" leftover "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_FLOW reply has %"PRIu32" leftover "
"bytes at end", msg->size);
return EINVAL;
}
nfs = ofpbuf_try_pull(msg, sizeof *nfs);
if (!nfs) {
- VLOG_WARN_RL(&bad_ofmsg_rl, "NXST_FLOW reply has %"PRIuSIZE" leftover "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "NXST_FLOW reply has %"PRIu32" leftover "
"bytes at end", msg->size);
return EINVAL;
}
ofpbuf_use_const(&msg, reply, ntohs(reply->length));
ofpraw_pull_assert(&msg);
- asr = msg.l3;
+ asr = ofpbuf_get_l3(&msg);
stats->packet_count = ntohll(get_32aligned_be64(&asr->packet_count));
stats->byte_count = ntohll(get_32aligned_be64(&asr->byte_count));
stats->flow_count = ntohl(asr->flow_count);
switch (protocol) {
case OFPUTIL_P_OF11_STD:
case OFPUTIL_P_OF12_OXM:
- case OFPUTIL_P_OF13_OXM: {
+ case OFPUTIL_P_OF13_OXM:
+ case OFPUTIL_P_OF14_OXM: {
struct ofp12_flow_removed *ofr;
msg = ofpraw_alloc_xid(OFPRAW_OFPT11_FLOW_REMOVED,
nfr = ofpbuf_put_zeros(msg, sizeof *nfr);
match_len = nx_put_match(msg, &fr->match, 0, 0);
- nfr = msg->l3;
+ nfr = ofpbuf_get_l3(msg);
nfr->cookie = fr->cookie;
nfr->priority = htons(fr->priority);
nfr->reason = fr->reason;
ofpbuf_put_zeros(packet, 2);
ofpbuf_put(packet, pin->packet, pin->packet_len);
- npi = packet->l3;
+ npi = ofpbuf_get_l3(packet);
npi->buffer_id = htonl(pin->buffer_id);
npi->total_len = htons(pin->total_len);
npi->reason = pin->reason;
ofpbuf_put_zeros(packet, 2);
ofpbuf_put(packet, pin->packet, pin->packet_len);
- opi = packet->l3;
+ opi = ofpbuf_get_l3(packet);
opi->pi.buffer_id = htonl(pin->buffer_id);
opi->pi.total_len = htons(pin->total_len);
opi->pi.reason = pin->reason;
case OFPUTIL_P_OF12_OXM:
case OFPUTIL_P_OF13_OXM:
+ case OFPUTIL_P_OF14_OXM:
packet = ofputil_encode_ofp12_packet_in(pin, protocol);
break;
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
return sizeof(struct ofp11_port);
default:
OVS_NOT_REACHED();
break;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
+
default:
OVS_NOT_REACHED();
}
break;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
+
default:
OVS_NOT_REACHED();
}
case OFP12_VERSION:
case OFP13_VERSION:
return OFPC_COMMON | OFPC12_PORT_BLOCKED;
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
default:
/* Caller needs to check osf->header.version itself */
return 0;
raw = OFPRAW_OFPT11_FEATURES_REPLY;
break;
case OFP13_VERSION:
+ case OFP14_VERSION:
raw = OFPRAW_OFPT13_FEATURES_REPLY;
break;
default:
osf->actions = encode_action_bits(features->actions, of10_action_bits);
break;
case OFP13_VERSION:
+ case OFP14_VERSION:
osf->auxiliary_id = features->auxiliary_id;
/* fall through */
case OFP11_VERSION:
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
raw = OFPRAW_OFPT11_PORT_STATUS;
break;
opm->advertise = netdev_port_features_to_ofp11(pm->advertise);
break;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
default:
OVS_NOT_REACHED();
}
return b;
}
+struct ofp_prop_header {
+ ovs_be16 type;
+ ovs_be16 len;
+};
+
+static enum ofperr
+ofputil_pull_property(struct ofpbuf *msg, struct ofpbuf *payload,
+ uint16_t *typep)
+{
+ struct ofp_prop_header *oph;
+ unsigned int len;
+
+ if (msg->size < sizeof *oph) {
+ return OFPERR_OFPTFFC_BAD_LEN;
+ }
+
+ oph = msg->data;
+ len = ntohs(oph->len);
+ if (len < sizeof *oph || ROUND_UP(len, 8) > msg->size) {
+ return OFPERR_OFPTFFC_BAD_LEN;
+ }
+
+ *typep = ntohs(oph->type);
+ if (payload) {
+ ofpbuf_use_const(payload, msg->data, len);
+ ofpbuf_pull(payload, sizeof *oph);
+ }
+ ofpbuf_pull(msg, ROUND_UP(len, 8));
+ return 0;
+}
+
+static void PRINTF_FORMAT(2, 3)
+log_property(bool loose, const char *message, ...)
+{
+ enum vlog_level level = loose ? VLL_DBG : VLL_WARN;
+ if (!vlog_should_drop(THIS_MODULE, level, &bad_ofmsg_rl)) {
+ va_list args;
+
+ va_start(args, message);
+ vlog_valist(THIS_MODULE, level, message, args);
+ va_end(args);
+ }
+}
+
+static enum ofperr
+parse_table_ids(struct ofpbuf *payload, uint32_t *ids)
+{
+ uint16_t type;
+
+ *ids = 0;
+ while (payload->size > 0) {
+ enum ofperr error = ofputil_pull_property(payload, NULL, &type);
+ if (error) {
+ return error;
+ }
+ if (type < CHAR_BIT * sizeof *ids) {
+ *ids |= 1u << type;
+ }
+ }
+ return 0;
+}
+
+static enum ofperr
+parse_instruction_ids(struct ofpbuf *payload, bool loose, uint32_t *insts)
+{
+ *insts = 0;
+ while (payload->size > 0) {
+ enum ovs_instruction_type inst;
+ enum ofperr error;
+ uint16_t ofpit;
+
+ error = ofputil_pull_property(payload, NULL, &ofpit);
+ if (error) {
+ return error;
+ }
+
+ error = ovs_instruction_type_from_inst_type(&inst, ofpit);
+ if (!error) {
+ *insts |= 1u << inst;
+ } else if (!loose) {
+ return error;
+ }
+ }
+ return 0;
+}
+
+static enum ofperr
+parse_table_features_next_table(struct ofpbuf *payload,
+ unsigned long int *next_tables)
+{
+ size_t i;
+
+ memset(next_tables, 0, bitmap_n_bytes(255));
+ for (i = 0; i < payload->size; i++) {
+ uint8_t id = ((const uint8_t *) payload->data)[i];
+ if (id >= 255) {
+ return OFPERR_OFPTFFC_BAD_ARGUMENT;
+ }
+ bitmap_set1(next_tables, id);
+ }
+ return 0;
+}
+
+static enum ofperr
+parse_oxm(struct ofpbuf *b, bool loose,
+ const struct mf_field **fieldp, bool *hasmask)
+{
+ ovs_be32 *oxmp;
+ uint32_t oxm;
+
+ oxmp = ofpbuf_try_pull(b, sizeof *oxmp);
+ if (!oxmp) {
+ return OFPERR_OFPTFFC_BAD_LEN;
+ }
+ oxm = ntohl(*oxmp);
+
+ /* Determine '*hasmask'. If 'oxm' is masked, convert it to the equivalent
+ * unmasked version, because the table of OXM fields we support only has
+ * masked versions of fields that we support with masks, but we should be
+ * able to parse the masked versions of those here. */
+ *hasmask = NXM_HASMASK(oxm);
+ if (*hasmask) {
+ if (NXM_LENGTH(oxm) & 1) {
+ return OFPERR_OFPTFFC_BAD_ARGUMENT;
+ }
+ oxm = NXM_HEADER(NXM_VENDOR(oxm), NXM_FIELD(oxm), NXM_LENGTH(oxm) / 2);
+ }
+
+ *fieldp = mf_from_nxm_header(oxm);
+ if (!*fieldp) {
+ log_property(loose, "unknown OXM field %#"PRIx32, ntohl(*oxmp));
+ }
+ return *fieldp ? 0 : OFPERR_OFPBMC_BAD_FIELD;
+}
+
+static enum ofperr
+parse_oxms(struct ofpbuf *payload, bool loose,
+ uint64_t *exactp, uint64_t *maskedp)
+{
+ uint64_t exact, masked;
+
+ exact = masked = 0;
+ while (payload->size > 0) {
+ const struct mf_field *field;
+ enum ofperr error;
+ bool hasmask;
+
+ error = parse_oxm(payload, loose, &field, &hasmask);
+ if (!error) {
+ if (hasmask) {
+ masked |= UINT64_C(1) << field->id;
+ } else {
+ exact |= UINT64_C(1) << field->id;
+ }
+ } else if (error != OFPERR_OFPBMC_BAD_FIELD || !loose) {
+ return error;
+ }
+ }
+ if (exactp) {
+ *exactp = exact;
+ } else if (exact) {
+ return OFPERR_OFPBMC_BAD_MASK;
+ }
+ if (maskedp) {
+ *maskedp = masked;
+ } else if (masked) {
+ return OFPERR_OFPBMC_BAD_MASK;
+ }
+ return 0;
+}
+
+/* Converts an OFPMP_TABLE_FEATURES request or reply in 'msg' into an abstract
+ * ofputil_table_features in 'tf'.
+ *
+ * If 'loose' is true, this function ignores properties and values that it does
+ * not understand, as a controller would want to do when interpreting
+ * capabilities provided by a switch. If 'loose' is false, this function
+ * treats unknown properties and values as an error, as a switch would want to
+ * do when interpreting a configuration request made by a controller.
+ *
+ * A single OpenFlow message can specify features for multiple tables. Calling
+ * this function multiple times for a single 'msg' iterates through the tables
+ * in the message. The caller must initially leave 'msg''s layer pointers null
+ * and not modify them between calls.
+ *
+ * Returns 0 if successful, EOF if no tables were left in this 'msg', otherwise
+ * a positive "enum ofperr" value. */
+int
+ofputil_decode_table_features(struct ofpbuf *msg,
+ struct ofputil_table_features *tf, bool loose)
+{
+ struct ofp13_table_features *otf;
+ unsigned int len;
+
+ if (!msg->l2) {
+ msg->l2 = msg->data;
+ ofpraw_pull_assert(msg);
+ }
+
+ if (!msg->size) {
+ return EOF;
+ }
+
+ if (msg->size < sizeof *otf) {
+ return OFPERR_OFPTFFC_BAD_LEN;
+ }
+
+ otf = msg->data;
+ len = ntohs(otf->length);
+ if (len < sizeof *otf || len % 8 || len > msg->size) {
+ return OFPERR_OFPTFFC_BAD_LEN;
+ }
+ ofpbuf_pull(msg, sizeof *otf);
+
+ tf->table_id = otf->table_id;
+ if (tf->table_id == OFPTT_ALL) {
+ return OFPERR_OFPTFFC_BAD_TABLE;
+ }
+
+ ovs_strlcpy(tf->name, otf->name, OFP_MAX_TABLE_NAME_LEN);
+ tf->metadata_match = otf->metadata_match;
+ tf->metadata_write = otf->metadata_write;
+ tf->config = ntohl(otf->config);
+ tf->max_entries = ntohl(otf->max_entries);
+
+ while (msg->size > 0) {
+ struct ofpbuf payload;
+ enum ofperr error;
+ uint16_t type;
+
+ error = ofputil_pull_property(msg, &payload, &type);
+ if (error) {
+ return error;
+ }
+
+ switch ((enum ofp13_table_feature_prop_type) type) {
+ case OFPTFPT13_INSTRUCTIONS:
+ error = parse_instruction_ids(&payload, loose,
+ &tf->nonmiss.instructions);
+ break;
+ case OFPTFPT13_INSTRUCTIONS_MISS:
+ error = parse_instruction_ids(&payload, loose,
+ &tf->miss.instructions);
+ break;
+
+ case OFPTFPT13_NEXT_TABLES:
+ error = parse_table_features_next_table(&payload,
+ tf->nonmiss.next);
+ break;
+ case OFPTFPT13_NEXT_TABLES_MISS:
+ error = parse_table_features_next_table(&payload, tf->miss.next);
+ break;
+
+ case OFPTFPT13_WRITE_ACTIONS:
+ error = parse_table_ids(&payload, &tf->nonmiss.write.actions);
+ break;
+ case OFPTFPT13_WRITE_ACTIONS_MISS:
+ error = parse_table_ids(&payload, &tf->miss.write.actions);
+ break;
+
+ case OFPTFPT13_APPLY_ACTIONS:
+ error = parse_table_ids(&payload, &tf->nonmiss.apply.actions);
+ break;
+ case OFPTFPT13_APPLY_ACTIONS_MISS:
+ error = parse_table_ids(&payload, &tf->miss.apply.actions);
+ break;
+
+ case OFPTFPT13_MATCH:
+ error = parse_oxms(&payload, loose, &tf->match, &tf->mask);
+ break;
+ case OFPTFPT13_WILDCARDS:
+ error = parse_oxms(&payload, loose, &tf->wildcard, NULL);
+ break;
+
+ case OFPTFPT13_WRITE_SETFIELD:
+ error = parse_oxms(&payload, loose,
+ &tf->nonmiss.write.set_fields, NULL);
+ break;
+ case OFPTFPT13_WRITE_SETFIELD_MISS:
+ error = parse_oxms(&payload, loose,
+ &tf->miss.write.set_fields, NULL);
+ break;
+ case OFPTFPT13_APPLY_SETFIELD:
+ error = parse_oxms(&payload, loose,
+ &tf->nonmiss.apply.set_fields, NULL);
+ break;
+ case OFPTFPT13_APPLY_SETFIELD_MISS:
+ error = parse_oxms(&payload, loose,
+ &tf->miss.apply.set_fields, NULL);
+ break;
+
+ case OFPTFPT13_EXPERIMENTER:
+ case OFPTFPT13_EXPERIMENTER_MISS:
+ log_property(loose,
+ "unknown table features experimenter property");
+ error = loose ? 0 : OFPERR_OFPTFFC_BAD_TYPE;
+ break;
+ }
+ if (error) {
+ return error;
+ }
+ }
+
+ /* Fix inconsistencies:
+ *
+ * - Turn off 'mask' and 'wildcard' bits that are not in 'match',
+ * because a field must be matchable to be masked or wildcarded.
+ *
+ * - Turn on 'wildcard' bits that are set in 'mask', because a field
+ * that is arbitrarily maskable can be wildcarded entirely. */
+ tf->mask &= tf->match;
+ tf->wildcard &= tf->match;
+
+ tf->wildcard |= tf->mask;
+
+ return 0;
+}
+
+/* Encodes and returns a request to obtain the table features of a switch.
+ * The message is encoded for OpenFlow version 'ofp_version'. */
+struct ofpbuf *
+ofputil_encode_table_features_request(enum ofp_version ofp_version)
+{
+ struct ofpbuf *request = NULL;
+
+ switch (ofp_version) {
+ case OFP10_VERSION:
+ case OFP11_VERSION:
+ case OFP12_VERSION:
+ ovs_fatal(0, "dump-table-features needs OpenFlow 1.3 or later "
+ "(\'-O OpenFlow13\')");
+ case OFP13_VERSION:
+ case OFP14_VERSION:
+ request = ofpraw_alloc(OFPRAW_OFPST13_TABLE_FEATURES_REQUEST,
+ ofp_version, 0);
+ break;
+ default:
+ OVS_NOT_REACHED();
+ }
+
+ return request;
+}
+
/* ofputil_table_mod */
/* Decodes the OpenFlow "table mod" message in '*oh' into an abstract form in
otm->config = htonl(pm->config);
break;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
default:
OVS_NOT_REACHED();
}
if (raw == OFPRAW_OFPT12_ROLE_REQUEST ||
raw == OFPRAW_OFPT12_ROLE_REPLY) {
- const struct ofp12_role_request *orr = b.l3;
+ const struct ofp12_role_request *orr = ofpbuf_get_l3(&b);
if (orr->role != htonl(OFPCR12_ROLE_NOCHANGE) &&
orr->role != htonl(OFPCR12_ROLE_EQUAL) &&
}
} else if (raw == OFPRAW_NXT_ROLE_REQUEST ||
raw == OFPRAW_NXT_ROLE_REPLY) {
- const struct nx_role_request *nrr = b.l3;
+ const struct nx_role_request *nrr = ofpbuf_get_l3(&b);
BUILD_ASSERT(NX_ROLE_OTHER + 1 == OFPCR12_ROLE_EQUAL);
BUILD_ASSERT(NX_ROLE_MASTER + 1 == OFPCR12_ROLE_MASTER);
raw = ofpraw_pull_assert(&b);
ovs_assert(raw == OFPRAW_OFPT14_ROLE_STATUS);
- r = b.l3;
+ r = ofpbuf_get_l3(&b);
if (r->role != htonl(OFPCR12_ROLE_NOCHANGE) &&
r->role != htonl(OFPCR12_ROLE_EQUAL) &&
r->role != htonl(OFPCR12_ROLE_MASTER) &&
break;
case OFP13_VERSION:
+ case OFP14_VERSION:
ofputil_put_ofp13_table_stats(&stats[i], reply);
break;
nfmr = ofpbuf_try_pull(msg, sizeof *nfmr);
if (!nfmr) {
- VLOG_WARN_RL(&bad_ofmsg_rl, "NXST_FLOW_MONITOR request has %"PRIuSIZE" "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "NXST_FLOW_MONITOR request has %"PRIu32" "
"leftover bytes at end", msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
}
bad_len:
- VLOG_WARN_RL(&bad_ofmsg_rl, "NXST_FLOW_MONITOR reply has %"PRIuSIZE" "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "NXST_FLOW_MONITOR reply has %"PRIu32" "
"leftover bytes at end", msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
ofpacts_put_openflow_actions(po->ofpacts, po->ofpacts_len, msg,
ofp_version);
- opo = msg->l3;
+ opo = ofpbuf_get_l3(msg);
opo->buffer_id = htonl(po->buffer_id);
opo->in_port = htons(ofp_to_u16(po->in_port));
opo->actions_len = htons(msg->size - actions_ofs);
case OFP11_VERSION:
case OFP12_VERSION:
- case OFP13_VERSION: {
+ case OFP13_VERSION:
+    case OFP14_VERSION: {
struct ofp11_packet_out *opo;
size_t len;
ofpbuf_put_zeros(msg, sizeof *opo);
len = ofpacts_put_openflow_actions(po->ofpacts, po->ofpacts_len, msg,
ofp_version);
- opo = msg->l3;
+ opo = ofpbuf_get_l3(msg);
opo->buffer_id = htonl(po->buffer_id);
opo->in_port = ofputil_port_to_ofp11(po->in_port);
opo->actions_len = htons(len);
enum ofpraw type;
switch (ofp_version) {
+ case OFP14_VERSION:
case OFP13_VERSION:
case OFP12_VERSION:
case OFP11_VERSION:
const struct ofp11_port *op = ofpbuf_try_pull(b, sizeof *op);
return op ? ofputil_decode_ofp11_port(pp, op) : EOF;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
default:
OVS_NOT_REACHED();
}
NULL,
#define OFPAT10_ACTION(ENUM, STRUCT, NAME) NAME,
#define OFPAT11_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) NAME,
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) NAME,
#define NXAST_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) NAME,
#include "ofp-util.def"
};
: "Unknown action";
}
+enum ofputil_action_code
+ofputil_action_code_from_ofp13_action(enum ofp13_action_type type)
+{
+ switch (type) {
+
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
+ case ENUM: \
+ return OFPUTIL_##ENUM;
+#include "ofp-util.def"
+
+ default:
+ return OFPUTIL_ACTION_INVALID;
+ }
+}
+
/* Appends an action of the type specified by 'code' to 'buf' and returns the
* action. Initializes the parts of 'action' that identify it as having type
* <ENUM> and length 'sizeof *action' and zeros the rest. For actions that
{
switch (code) {
case OFPUTIL_ACTION_INVALID:
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) case OFPUTIL_##ENUM:
+#include "ofp-util.def"
OVS_NOT_REACHED();
#define OFPAT10_ACTION(ENUM, STRUCT, NAME) \
}
#define OFPAT11_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
OFPAT10_ACTION(ENUM, STRUCT, NAME)
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
+ OFPAT10_ACTION(ENUM, STRUCT, NAME)
#define NXAST_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
void \
ofputil_init_##ENUM(struct STRUCT *s) \
}
case OFP11_VERSION:
case OFP12_VERSION:
- case OFP13_VERSION: {
+ case OFP13_VERSION:
+    case OFP14_VERSION: {
struct ofp11_port_stats_request *req;
request = ofpraw_alloc(OFPRAW_OFPST11_PORT_REQUEST, ofp_version, 0);
req = ofpbuf_put_zeros(request, sizeof *req);
break;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
+
default:
OVS_NOT_REACHED();
}
return sizeof(struct ofp11_port_stats);
case OFP13_VERSION:
return sizeof(struct ofp13_port_stats);
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ return 0;
default:
OVS_NOT_REACHED();
}
}
bad_len:
- VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_PORT reply has %"PRIuSIZE" leftover "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_PORT reply has %"PRIu32" leftover "
"bytes at end", msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
return 0;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
+
default:
OVS_NOT_REACHED();
}
"(\'-O OpenFlow11\')");
case OFP11_VERSION:
case OFP12_VERSION:
- case OFP13_VERSION: {
+ case OFP13_VERSION:
+ case OFP14_VERSION: {
struct ofp11_group_stats_request *req;
request = ofpraw_alloc(OFPRAW_OFPST11_GROUP_REQUEST, ofp_version, 0);
req = ofpbuf_put_zeros(request, sizeof *req);
"(\'-O OpenFlow11\')");
case OFP11_VERSION:
case OFP12_VERSION:
- case OFP13_VERSION: {
+ case OFP13_VERSION:
+ case OFP14_VERSION:
request = ofpraw_alloc(OFPRAW_OFPST11_GROUP_DESC_REQUEST, ofp_version, 0);
break;
- }
default:
OVS_NOT_REACHED();
}
ofputil_append_of13_group_stats(ogs, replies);
break;
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
+
case OFP10_VERSION:
default:
OVS_NOT_REACHED();
ovs_fatal(0, "dump-group-features needs OpenFlow 1.2 or later "
"(\'-O OpenFlow12\')");
case OFP12_VERSION:
- case OFP13_VERSION: {
+ case OFP13_VERSION:
+ case OFP14_VERSION:
request = ofpraw_alloc(OFPRAW_OFPST12_GROUP_FEATURES_REQUEST,
- ofp_version, 0);
+ ofp_version, 0);
break;
- }
default:
OVS_NOT_REACHED();
}
}
if (!ogs11) {
- VLOG_WARN_RL(&bad_ofmsg_rl, "%s reply has %"PRIuSIZE" leftover bytes at end",
+ VLOG_WARN_RL(&bad_ofmsg_rl, "%s reply has %"PRIu32" leftover bytes at end",
ofpraw_get_name(raw), msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
gs->n_buckets = (length - base_len) / sizeof *obc;
obc = ofpbuf_try_pull(msg, gs->n_buckets * sizeof *obc);
if (!obc) {
- VLOG_WARN_RL(&bad_ofmsg_rl, "%s reply has %"PRIuSIZE" leftover bytes at end",
+ VLOG_WARN_RL(&bad_ofmsg_rl, "%s reply has %"PRIu32" leftover bytes at end",
ofpraw_get_name(raw), msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
ogds = ofpbuf_try_pull(msg, sizeof *ogds);
if (!ogds) {
- VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST11_GROUP_DESC reply has %"PRIuSIZE" "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST11_GROUP_DESC reply has %"PRIu32" "
"leftover bytes at end", msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
case OFP11_VERSION:
case OFP12_VERSION:
- case OFP13_VERSION: {
+ case OFP13_VERSION:
+ case OFP14_VERSION:
b = ofpraw_alloc(OFPRAW_OFPT11_GROUP_MOD, ofp_version, 0);
start_ogm = b->size;
ofpbuf_put_zeros(b, sizeof *ogm);
ogm->group_id = htonl(gm->group_id);
break;
- }
default:
OVS_NOT_REACHED();
struct ofputil_queue_stats_request *oqsr)
{
switch ((enum ofp_version)request->version) {
+ case OFP14_VERSION:
case OFP13_VERSION:
case OFP12_VERSION:
case OFP11_VERSION: {
switch (ofp_version) {
case OFP11_VERSION:
case OFP12_VERSION:
- case OFP13_VERSION: {
+ case OFP13_VERSION:
+ case OFP14_VERSION: {
struct ofp11_queue_stats_request *req;
request = ofpraw_alloc(OFPRAW_OFPST11_QUEUE_REQUEST, ofp_version, 0);
req = ofpbuf_put_zeros(request, sizeof *req);
return sizeof(struct ofp11_queue_stats);
case OFP13_VERSION:
return sizeof(struct ofp13_queue_stats);
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ return 0;
default:
OVS_NOT_REACHED();
}
}
bad_len:
- VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_QUEUE reply has %"PRIuSIZE" leftover "
+ VLOG_WARN_RL(&bad_ofmsg_rl, "OFPST_QUEUE reply has %"PRIu32" leftover "
"bytes at end", msg->size);
return OFPERR_OFPBRC_BAD_LEN;
}
break;
}
+ case OFP14_VERSION:
+ OVS_NOT_REACHED();
+ break;
+
default:
OVS_NOT_REACHED();
}
OFPAT11_ACTION(OFPAT12_SET_FIELD, ofp12_action_set_field, 1, "set_field")
OFPAT11_ACTION(OFPAT11_GROUP, ofp11_action_group, 0, "group")
+#ifndef OFPAT13_ACTION
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME)
+#endif
+OFPAT13_ACTION(OFPAT13_OUTPUT, ofp11_action_output, 0, "output")
+OFPAT13_ACTION(OFPAT13_COPY_TTL_OUT, ofp_action_header, 0, "copy_ttl_out")
+OFPAT13_ACTION(OFPAT13_COPY_TTL_IN, ofp_action_header, 0, "copy_ttl_in")
+OFPAT13_ACTION(OFPAT13_SET_MPLS_TTL, ofp11_action_mpls_ttl, 0, "set_mpls_ttl")
+OFPAT13_ACTION(OFPAT13_DEC_MPLS_TTL, ofp_action_header, 0, "dec_mpls_ttl")
+OFPAT13_ACTION(OFPAT13_PUSH_VLAN, ofp11_action_push, 0, "push_vlan")
+OFPAT13_ACTION(OFPAT13_POP_VLAN, ofp_action_header, 0, "pop_vlan")
+OFPAT13_ACTION(OFPAT13_PUSH_MPLS, ofp11_action_push, 0, "push_mpls")
+OFPAT13_ACTION(OFPAT13_POP_MPLS, ofp11_action_pop_mpls, 0, "pop_mpls")
+OFPAT13_ACTION(OFPAT13_SET_QUEUE, ofp11_action_set_queue, 0, "set_queue")
+OFPAT13_ACTION(OFPAT13_GROUP, ofp11_action_group, 0, "group")
+OFPAT13_ACTION(OFPAT13_SET_NW_TTL, ofp11_action_nw_ttl, 0, "set_nw_ttl")
+OFPAT13_ACTION(OFPAT13_DEC_NW_TTL, ofp_action_header, 0, "dec_nw_ttl")
+OFPAT13_ACTION(OFPAT13_SET_FIELD, ofp12_action_set_field, 1, "set_field")
+OFPAT13_ACTION(OFPAT13_PUSH_PBB, ofp11_action_push, 0, "push_pbb")
+OFPAT13_ACTION(OFPAT13_POP_PBB, ofp_action_header, 0, "pop_pbb")
+
#ifndef NXAST_ACTION
#define NXAST_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME)
#endif
#undef OFPAT10_ACTION
#undef OFPAT11_ACTION
+#undef OFPAT13_ACTION
#undef NXAST_ACTION
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
+#include "bitmap.h"
#include "compiler.h"
#include "flow.h"
#include "list.h"
* variant. */
OFPUTIL_P_OF12_OXM = 1 << 5,
OFPUTIL_P_OF13_OXM = 1 << 6,
-#define OFPUTIL_P_ANY_OXM (OFPUTIL_P_OF12_OXM | OFPUTIL_P_OF13_OXM)
+ OFPUTIL_P_OF14_OXM = 1 << 7,
+#define OFPUTIL_P_ANY_OXM (OFPUTIL_P_OF12_OXM | \
+                           OFPUTIL_P_OF13_OXM | \
+                           OFPUTIL_P_OF14_OXM)
#define OFPUTIL_P_NXM_OF11_UP (OFPUTIL_P_OF10_NXM_ANY | OFPUTIL_P_OF11_STD | \
OFPUTIL_P_ANY_OXM)
#define OFPUTIL_P_OF13_UP (OFPUTIL_P_OF13_OXM)
+#define OFPUTIL_P_OF14_UP (OFPUTIL_P_OF14_OXM)
+
/* All protocols. */
-#define OFPUTIL_P_ANY ((1 << 7) - 1)
+#define OFPUTIL_P_ANY ((1 << 8) - 1)
/* Protocols in which a specific table may be specified in flow_mods. */
#define OFPUTIL_P_TID (OFPUTIL_P_OF10_STD_TID | \
/* Abstract ofp_table_mod. */
struct ofputil_table_mod {
uint8_t table_id; /* ID of the table, 0xff indicates all tables. */
- uint32_t config;
+ enum ofp_table_config config;
};
enum ofperr ofputil_decode_table_mod(const struct ofp_header *,
struct ofpbuf *ofputil_encode_table_mod(const struct ofputil_table_mod *,
enum ofputil_protocol);
+/* Abstract ofp_table_features. */
+struct ofputil_table_features {
+ uint8_t table_id; /* Identifier of table. Lower numbered tables
+ are consulted first. */
+ char name[OFP_MAX_TABLE_NAME_LEN];
+ ovs_be64 metadata_match; /* Bits of metadata table can match. */
+ ovs_be64 metadata_write; /* Bits of metadata table can write. */
+    uint32_t config;         /* Bitmap of OFPTC_* values. */
+ uint32_t max_entries; /* Max number of entries supported. */
+
+ /* Table features related to instructions. There are two instances:
+ *
+ * - 'miss' reports features available in the table miss flow.
+ *
+ * - 'nonmiss' reports features available in other flows. */
+ struct ofputil_table_instruction_features {
+ /* Tables that "goto-table" may jump to. */
+ unsigned long int next[BITMAP_N_LONGS(255)];
+
+ /* Bitmap of OVSINST_* for supported instructions. */
+ uint32_t instructions;
+
+ /* Table features related to actions. There are two instances:
+ *
+ * - 'write' reports features available in a "write_actions"
+ * instruction.
+ *
+ * - 'apply' reports features available in an "apply_actions"
+ * instruction. */
+ struct ofputil_table_action_features {
+ uint32_t actions; /* Bitmap of supported OFPAT*. */
+            uint64_t set_fields;  /* Bitmap of MFF_* that "set-field"
+                                   * supports. */
+ } write, apply;
+ } nonmiss, miss;
+
+ /* MFF_* bitmaps.
+ *
+ * For any given field the following combinations are valid:
+ *
+ * - match=0, wildcard=0, mask=0: Flows in this table cannot match on
+ * this field.
+ *
+ * - match=1, wildcard=0, mask=0: Flows in this table must match on all
+ * the bits in this field.
+ *
+ * - match=1, wildcard=1, mask=0: Flows in this table must either match
+ * on all the bits in the field or wildcard the field entirely.
+ *
+ * - match=1, wildcard=1, mask=1: Flows in this table may arbitrarily
+ * mask this field (as special cases, they may match on all the bits
+ * or wildcard it entirely).
+ *
+ * Other combinations do not make sense.
+ */
+ uint64_t match; /* Fields that may be matched. */
+ uint64_t mask; /* Subset of 'match' that may have masks. */
+ uint64_t wildcard; /* Subset of 'match' that may be wildcarded. */
+};
+
+int ofputil_decode_table_features(struct ofpbuf *,
+ struct ofputil_table_features *, bool loose);
+struct ofpbuf *ofputil_encode_table_features_request(
+ enum ofp_version ofp_version);
+void ofputil_append_table_features_reply(
+ const struct ofputil_table_features *tf,
+ struct list *replies);
+
+uint16_t table_feature_prop_get_size(enum ofp13_table_feature_prop_type type);
+char *table_feature_prop_get_name(enum ofp13_table_feature_prop_type type);
+
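The comment block above enumerates which per-field combinations of the `match`, `wildcard`, and `mask` bitmaps are valid. As a hedged illustration, the rules reduce to "`wildcard` and `mask` are subsets of `match`, and any maskable field is also wildcardable". The helper below is hypothetical (not part of this patch), sketching that invariant for a single MFF_* bit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not OVS code): checks that the 'match',
 * 'wildcard', and 'mask' bits for the field at position 'bit' form one
 * of the valid combinations documented in ofputil_table_features. */
static bool
table_features_field_is_valid(uint64_t match, uint64_t wildcard,
                              uint64_t mask, int bit)
{
    bool m = (match >> bit) & 1;
    bool w = (wildcard >> bit) & 1;
    bool k = (mask >> bit) & 1;

    /* Valid: (0,0,0), (1,0,0), (1,1,0), (1,1,1).  Everything else,
     * e.g. maskable but not wildcardable, is rejected. */
    return m ? (k ? w : true) : (!w && !k);
}
```

For instance, a fully maskable field (match=1, wildcard=1, mask=1) passes, while a field claiming a mask without wildcard support (match=1, wildcard=0, mask=1) does not.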
/* Meter band configuration for all supported band types. */
struct ofputil_meter_band {
uint16_t type;
OFPUTIL_ACTION_INVALID,
#define OFPAT10_ACTION(ENUM, STRUCT, NAME) OFPUTIL_##ENUM,
#define OFPAT11_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) OFPUTIL_##ENUM,
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) OFPUTIL_##ENUM,
#define NXAST_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) OFPUTIL_##ENUM,
#include "ofp-util.def"
};
enum {
#define OFPAT10_ACTION(ENUM, STRUCT, NAME) + 1
#define OFPAT11_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) + 1
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) + 1
#define NXAST_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) + 1
OFPUTIL_N_ACTIONS = 1
#include "ofp-util.def"
int ofputil_action_code_from_name(const char *);
const char * ofputil_action_name_from_code(enum ofputil_action_code code);
+enum ofputil_action_code ofputil_action_code_from_ofp13_action(
+ enum ofp13_action_type type);
void *ofputil_put_action(enum ofputil_action_code, struct ofpbuf *buf);
#define OFPAT11_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
void ofputil_init_##ENUM(struct STRUCT *); \
struct STRUCT *ofputil_put_##ENUM(struct ofpbuf *);
+#define OFPAT13_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
+ void ofputil_init_##ENUM(struct STRUCT *); \
+ struct STRUCT *ofputil_put_##ENUM(struct ofpbuf *);
#define NXAST_ACTION(ENUM, STRUCT, EXTENSIBLE, NAME) \
void ofputil_init_##ENUM(struct STRUCT *); \
struct STRUCT *ofputil_put_##ENUM(struct ofpbuf *);
#include <stdlib.h>
#include <string.h>
#include "dynamic-string.h"
+#include "netdev-dpdk.h"
#include "util.h"
static void
b->allocated = allocated;
b->source = source;
b->size = 0;
- b->l2 = b->l2_5 = b->l3 = b->l4 = b->l7 = NULL;
+ b->l2 = NULL;
+ b->l2_5_ofs = b->l3_ofs = b->l4_ofs = UINT16_MAX;
list_poison(&b->list_node);
- b->private_p = NULL;
}
/* Initializes 'b' as an empty ofpbuf that contains the 'allocated' bytes of
void
ofpbuf_uninit(struct ofpbuf *b)
{
- if (b && b->source == OFPBUF_MALLOC) {
- free(b->base);
+ if (b) {
+ if (b->source == OFPBUF_MALLOC) {
+ free(b->base);
+ }
+ if (b->source == OFPBUF_DPDK) {
+ free_dpdk_buf(b);
+ }
}
}
-/* Returns a pointer that may be passed to free() to accomplish the same thing
- * as ofpbuf_uninit(b). The return value is a null pointer if ofpbuf_uninit()
- * would not free any memory. */
-void *
-ofpbuf_get_uninit_pointer(struct ofpbuf *b)
-{
- return b && b->source == OFPBUF_MALLOC ? b->base : NULL;
-}
-
/* Frees memory that 'b' points to and allocates a new ofpbuf */
void
ofpbuf_reinit(struct ofpbuf *b, size_t size)
ofpbuf_clone_with_headroom(const struct ofpbuf *buffer, size_t headroom)
{
struct ofpbuf *new_buffer;
- uintptr_t data_delta;
new_buffer = ofpbuf_clone_data_with_headroom(buffer->data, buffer->size,
headroom);
- data_delta = (char *) new_buffer->data - (char *) buffer->data;
-
if (buffer->l2) {
+ uintptr_t data_delta = (char *)new_buffer->data - (char *)buffer->data;
+
new_buffer->l2 = (char *) buffer->l2 + data_delta;
}
- if (buffer->l2_5) {
- new_buffer->l2_5 = (char *) buffer->l2_5 + data_delta;
- }
- if (buffer->l3) {
- new_buffer->l3 = (char *) buffer->l3 + data_delta;
- }
- if (buffer->l4) {
- new_buffer->l4 = (char *) buffer->l4 + data_delta;
- }
- if (buffer->l7) {
- new_buffer->l7 = (char *) buffer->l7 + data_delta;
- }
+ new_buffer->l2_5_ofs = buffer->l2_5_ofs;
+ new_buffer->l3_ofs = buffer->l3_ofs;
+ new_buffer->l4_ofs = buffer->l4_ofs;
return new_buffer;
}
return b;
}
-/* Frees memory that 'b' points to, as well as 'b' itself. */
-void
-ofpbuf_delete(struct ofpbuf *b)
-{
- if (b) {
- ofpbuf_uninit(b);
- free(b);
- }
-}
-
-/* Returns the number of bytes of headroom in 'b', that is, the number of bytes
- * of unused space in ofpbuf 'b' before the data that is in use. (Most
- * commonly, the data in a ofpbuf is at its beginning, and thus the ofpbuf's
- * headroom is 0.) */
-size_t
-ofpbuf_headroom(const struct ofpbuf *b)
-{
- return (char*)b->data - (char*)b->base;
-}
-
-/* Returns the number of bytes that may be appended to the tail end of ofpbuf
- * 'b' before the ofpbuf must be reallocated. */
-size_t
-ofpbuf_tailroom(const struct ofpbuf *b)
-{
- return (char*)ofpbuf_end(b) - (char*)ofpbuf_tail(b);
-}
-
static void
ofpbuf_copy__(struct ofpbuf *b, uint8_t *new_base,
size_t new_headroom, size_t new_tailroom)
new_allocated = new_headroom + b->size + new_tailroom;
switch (b->source) {
+ case OFPBUF_DPDK:
+ OVS_NOT_REACHED();
+
case OFPBUF_MALLOC:
if (new_headroom == ofpbuf_headroom(b)) {
new_base = xrealloc(b->base, new_allocated);
new_data = (char *) new_base + new_headroom;
if (b->data != new_data) {
uintptr_t data_delta = (char *) new_data - (char *) b->data;
+
b->data = new_data;
if (b->l2) {
b->l2 = (char *) b->l2 + data_delta;
}
- if (b->l2_5) {
- b->l2_5 = (char *) b->l2_5 + data_delta;
- }
- if (b->l3) {
- b->l3 = (char *) b->l3 + data_delta;
- }
- if (b->l4) {
- b->l4 = (char *) b->l4 + data_delta;
- }
- if (b->l7) {
- b->l7 = (char *) b->l7 + data_delta;
- }
}
}
void
ofpbuf_trim(struct ofpbuf *b)
{
+ ovs_assert(b->source != OFPBUF_DPDK);
+
if (b->source == OFPBUF_MALLOC
&& (ofpbuf_headroom(b) || ofpbuf_tailroom(b))) {
ofpbuf_resize__(b, 0, 0);
return dst;
}
-/* If 'b' contains at least 'offset + size' bytes of data, returns a pointer to
- * byte 'offset'. Otherwise, returns a null pointer. */
-void *
-ofpbuf_at(const struct ofpbuf *b, size_t offset, size_t size)
-{
- return offset + size <= b->size ? (char *) b->data + offset : NULL;
-}
-
-/* Returns a pointer to byte 'offset' in 'b', which must contain at least
- * 'offset + size' bytes of data. */
-void *
-ofpbuf_at_assert(const struct ofpbuf *b, size_t offset, size_t size)
-{
- ovs_assert(offset + size <= b->size);
- return ((char *) b->data) + offset;
-}
-
-/* Returns the byte following the last byte of data in use in 'b'. */
-void *
-ofpbuf_tail(const struct ofpbuf *b)
-{
- return (char *) b->data + b->size;
-}
-
-/* Returns the byte following the last byte allocated for use (but not
- * necessarily in use) by 'b'. */
-void *
-ofpbuf_end(const struct ofpbuf *b)
-{
- return (char *) b->base + b->allocated;
-}
-
-/* Clears any data from 'b'. */
-void
-ofpbuf_clear(struct ofpbuf *b)
-{
- b->data = b->base;
- b->size = 0;
-}
-
-/* Removes 'size' bytes from the head end of 'b', which must contain at least
- * 'size' bytes of data. Returns the first byte of data removed. */
-void *
-ofpbuf_pull(struct ofpbuf *b, size_t size)
-{
- void *data = b->data;
- ovs_assert(b->size >= size);
- b->data = (char*)b->data + size;
- b->size -= size;
- return data;
-}
-
-/* If 'b' has at least 'size' bytes of data, removes that many bytes from the
- * head end of 'b' and returns the first byte removed. Otherwise, returns a
- * null pointer without modifying 'b'. */
-void *
-ofpbuf_try_pull(struct ofpbuf *b, size_t size)
-{
- return b->size >= size ? ofpbuf_pull(b, size) : NULL;
-}
-
/* Returns the data in 'b' as a block of malloc()'d memory and frees the buffer
* within 'b'. (If 'b' itself was dynamically allocated, e.g. with
* ofpbuf_new(), then it should still be freed with, e.g., ofpbuf_delete().) */
ofpbuf_steal_data(struct ofpbuf *b)
{
void *p;
+ ovs_assert(b->source != OFPBUF_DPDK);
+
if (b->source == OFPBUF_MALLOC && b->data == b->base) {
p = b->data;
} else {
struct ds s;
ds_init(&s);
- ds_put_format(&s, "size=%"PRIuSIZE", allocated=%"PRIuSIZE", head=%"PRIuSIZE", tail=%"PRIuSIZE"\n",
+ ds_put_format(&s, "size=%"PRIu32", allocated=%"PRIu32", head=%"PRIuSIZE", tail=%"PRIuSIZE"\n",
b->size, b->allocated,
ofpbuf_headroom(b), ofpbuf_tailroom(b));
ds_put_hex_dump(&s, b->data, MIN(b->size, maxbytes), 0, false);
ofpbuf_delete(b);
}
}
+
+static inline void
+ofpbuf_adjust_layer_offset(uint16_t *offset, int increment)
+{
+ if (*offset != UINT16_MAX) {
+ *offset += increment;
+ }
+}
+
+/* Adjust the size of the l2_5 portion of the ofpbuf, updating the l2
+ * pointer and the layer offsets. The caller is responsible for
+ * modifying the contents. */
+void *
+ofpbuf_resize_l2_5(struct ofpbuf *b, int increment)
+{
+ if (increment >= 0) {
+ ofpbuf_push_uninit(b, increment);
+ } else {
+ ofpbuf_pull(b, -increment);
+ }
+
+ b->l2 = b->data;
+ /* Adjust layer offsets after l2_5. */
+ ofpbuf_adjust_layer_offset(&b->l3_ofs, increment);
+ ofpbuf_adjust_layer_offset(&b->l4_ofs, increment);
+
+ return b->l2;
+}
+
+/* Adjust the size of the l2 portion of the ofpbuf, updating the l2
+ * pointer and the layer offsets. The caller is responsible for
+ * modifying the contents. */
+void *
+ofpbuf_resize_l2(struct ofpbuf *b, int increment)
+{
+ ofpbuf_resize_l2_5(b, increment);
+ ofpbuf_adjust_layer_offset(&b->l2_5_ofs, increment);
+ return b->l2;
+}
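The rework above replaces the l2_5/l3/l4 pointers with uint16_t offsets from l2, which is why ofpbuf_resize_l2_5() only increments two offsets instead of rebasing four pointers. A minimal standalone sketch of the idea (a toy struct, not the real struct ofpbuf; names are illustrative):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model: a layer position stored as an offset from 'data' survives a
 * realloc() that moves the buffer, whereas a raw pointer would have to be
 * individually rebased after every move. */
struct toybuf {
    char *data;
    uint16_t l3_ofs;            /* UINT16_MAX if unset, as in ofpbuf. */
};

static char *
toybuf_l3(const struct toybuf *b)
{
    return b->l3_ofs != UINT16_MAX ? b->data + b->l3_ofs : NULL;
}

static void
toybuf_grow(struct toybuf *b, size_t new_size)
{
    b->data = realloc(b->data, new_size);   /* may move the block... */
    /* ...yet l3_ofs needs no adjustment. */
}
```

This is also what shrinks ofpbuf_clone_with_headroom() above: the offsets copy over verbatim, and only the one remaining pointer ('l2') needs the data_delta fixup.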
#include <stddef.h>
#include <stdint.h>
#include "list.h"
+#include "packets.h"
#include "util.h"
#ifdef __cplusplus
extern "C" {
#endif
-enum ofpbuf_source {
+enum OVS_PACKED_ENUM ofpbuf_source {
OFPBUF_MALLOC, /* Obtained via malloc(). */
OFPBUF_STACK, /* Un-movable stack space or static buffer. */
- OFPBUF_STUB /* Starts on stack, may expand into heap. */
+ OFPBUF_STUB, /* Starts on stack, may expand into heap. */
+    OFPBUF_DPDK,               /* Buffer data from DPDK-allocated memory;
+                                * see build_ofpbuf() in netdev-dpdk. */
};
/* Buffer for holding arbitrary data. An ofpbuf is automatically reallocated
* as necessary if it grows too large for the available memory. */
struct ofpbuf {
void *base; /* First byte of allocated space. */
- size_t allocated; /* Number of bytes allocated. */
- enum ofpbuf_source source; /* Source of memory allocated as 'base'. */
-
+ uint32_t allocated; /* Number of bytes allocated. */
+ uint32_t size; /* Number of bytes in use. */
void *data; /* First byte actually in use. */
- size_t size; /* Number of bytes in use. */
void *l2; /* Link-level header. */
- void *l2_5; /* MPLS label stack */
- void *l3; /* Network-level header. */
- void *l4; /* Transport-level header. */
- void *l7; /* Application data. */
-
+    uint16_t l2_5_ofs;         /* MPLS label stack offset from l2, or
+                                * UINT16_MAX. */
+    uint16_t l3_ofs;           /* Network-level header offset from l2, or
+                                * UINT16_MAX. */
+    uint16_t l4_ofs;           /* Transport-level header offset from l2, or
+                                * UINT16_MAX. */
+ enum ofpbuf_source source; /* Source of memory allocated as 'base'. */
struct list list_node; /* Private list element for use by owner. */
- void *private_p; /* Private pointer for use by owner. */
};
+void *ofpbuf_resize_l2(struct ofpbuf *, int increment);
+void *ofpbuf_resize_l2_5(struct ofpbuf *, int increment);
+static inline void *ofpbuf_get_l2_5(const struct ofpbuf *);
+static inline void ofpbuf_set_l2_5(struct ofpbuf *, void *);
+static inline void *ofpbuf_get_l3(const struct ofpbuf *);
+static inline void ofpbuf_set_l3(struct ofpbuf *, void *);
+static inline void *ofpbuf_get_l4(const struct ofpbuf *);
+static inline void ofpbuf_set_l4(struct ofpbuf *, void *);
+static inline size_t ofpbuf_get_l4_size(const struct ofpbuf *);
+static inline const void *ofpbuf_get_tcp_payload(const struct ofpbuf *);
+static inline const void *ofpbuf_get_udp_payload(const struct ofpbuf *);
+static inline const void *ofpbuf_get_sctp_payload(const struct ofpbuf *);
+static inline const void *ofpbuf_get_icmp_payload(const struct ofpbuf *);
+
void ofpbuf_use(struct ofpbuf *, void *, size_t);
void ofpbuf_use_stack(struct ofpbuf *, void *, size_t);
void ofpbuf_use_stub(struct ofpbuf *, void *, size_t);
void ofpbuf_init(struct ofpbuf *, size_t);
void ofpbuf_uninit(struct ofpbuf *);
-void *ofpbuf_get_uninit_pointer(struct ofpbuf *);
+static inline void *ofpbuf_get_uninit_pointer(struct ofpbuf *);
void ofpbuf_reinit(struct ofpbuf *, size_t);
struct ofpbuf *ofpbuf_new(size_t);
struct ofpbuf *ofpbuf_clone_data(const void *, size_t);
struct ofpbuf *ofpbuf_clone_data_with_headroom(const void *, size_t,
size_t headroom);
-void ofpbuf_delete(struct ofpbuf *);
+static inline void ofpbuf_delete(struct ofpbuf *);
-void *ofpbuf_at(const struct ofpbuf *, size_t offset, size_t size);
-void *ofpbuf_at_assert(const struct ofpbuf *, size_t offset, size_t size);
-void *ofpbuf_tail(const struct ofpbuf *);
-void *ofpbuf_end(const struct ofpbuf *);
+static inline void *ofpbuf_at(const struct ofpbuf *, size_t offset,
+ size_t size);
+static inline void *ofpbuf_at_assert(const struct ofpbuf *, size_t offset,
+ size_t size);
+static inline void *ofpbuf_tail(const struct ofpbuf *);
+static inline void *ofpbuf_end(const struct ofpbuf *);
void *ofpbuf_put_uninit(struct ofpbuf *, size_t);
void *ofpbuf_put_zeros(struct ofpbuf *, size_t);
void *ofpbuf_push_zeros(struct ofpbuf *, size_t);
void *ofpbuf_push(struct ofpbuf *b, const void *, size_t);
-size_t ofpbuf_headroom(const struct ofpbuf *);
-size_t ofpbuf_tailroom(const struct ofpbuf *);
+static inline size_t ofpbuf_headroom(const struct ofpbuf *);
+static inline size_t ofpbuf_tailroom(const struct ofpbuf *);
void ofpbuf_prealloc_headroom(struct ofpbuf *, size_t);
void ofpbuf_prealloc_tailroom(struct ofpbuf *, size_t);
void ofpbuf_trim(struct ofpbuf *);
void ofpbuf_padto(struct ofpbuf *, size_t);
void ofpbuf_shift(struct ofpbuf *, int);
-void ofpbuf_clear(struct ofpbuf *);
-void *ofpbuf_pull(struct ofpbuf *, size_t);
-void *ofpbuf_try_pull(struct ofpbuf *, size_t);
+static inline void ofpbuf_clear(struct ofpbuf *);
+static inline void *ofpbuf_pull(struct ofpbuf *, size_t);
+static inline void *ofpbuf_try_pull(struct ofpbuf *, size_t);
void *ofpbuf_steal_data(struct ofpbuf *);
char *ofpbuf_to_string(const struct ofpbuf *, size_t maxbytes);
+static inline struct ofpbuf *ofpbuf_from_list(const struct list *);
+void ofpbuf_list_delete(struct list *);
+static inline bool ofpbuf_equal(const struct ofpbuf *, const struct ofpbuf *);
+
+\f
+/* Returns a pointer that may be passed to free() to accomplish the same thing
+ * as ofpbuf_uninit(b). The return value is a null pointer if ofpbuf_uninit()
+ * would not free any memory. */
+static inline void *ofpbuf_get_uninit_pointer(struct ofpbuf *b)
+{
+    /* XXX: If 'source' is OFPBUF_DPDK, the buffer memory is leaked! */
+ return b && b->source == OFPBUF_MALLOC ? b->base : NULL;
+}
+
+/* Frees memory that 'b' points to, as well as 'b' itself. */
+static inline void ofpbuf_delete(struct ofpbuf *b)
+{
+ if (b) {
+ ofpbuf_uninit(b);
+ free(b);
+ }
+}
+
+/* If 'b' contains at least 'offset + size' bytes of data, returns a pointer to
+ * byte 'offset'. Otherwise, returns a null pointer. */
+static inline void *ofpbuf_at(const struct ofpbuf *b, size_t offset,
+ size_t size)
+{
+ return offset + size <= b->size ? (char *) b->data + offset : NULL;
+}
+
+/* Returns a pointer to byte 'offset' in 'b', which must contain at least
+ * 'offset + size' bytes of data. */
+static inline void *ofpbuf_at_assert(const struct ofpbuf *b, size_t offset,
+ size_t size)
+{
+ ovs_assert(offset + size <= b->size);
+ return ((char *) b->data) + offset;
+}
+
+/* Returns the byte following the last byte of data in use in 'b'. */
+static inline void *ofpbuf_tail(const struct ofpbuf *b)
+{
+ return (char *) b->data + b->size;
+}
+
+/* Returns the byte following the last byte allocated for use (but not
+ * necessarily in use) by 'b'. */
+static inline void *ofpbuf_end(const struct ofpbuf *b)
+{
+ return (char *) b->base + b->allocated;
+}
+
+/* Returns the number of bytes of headroom in 'b', that is, the number of bytes
+ * of unused space in ofpbuf 'b' before the data that is in use. (Most
+ * commonly, the data in a ofpbuf is at its beginning, and thus the ofpbuf's
+ * headroom is 0.) */
+static inline size_t ofpbuf_headroom(const struct ofpbuf *b)
+{
+ return (char*)b->data - (char*)b->base;
+}
+
+/* Returns the number of bytes that may be appended to the tail end of ofpbuf
+ * 'b' before the ofpbuf must be reallocated. */
+static inline size_t ofpbuf_tailroom(const struct ofpbuf *b)
+{
+ return (char*)ofpbuf_end(b) - (char*)ofpbuf_tail(b);
+}
+
+/* Clears any data from 'b'. */
+static inline void ofpbuf_clear(struct ofpbuf *b)
+{
+ b->data = b->base;
+ b->size = 0;
+}
+
+/* Removes 'size' bytes from the head end of 'b', which must contain at least
+ * 'size' bytes of data. Returns the first byte of data removed. */
+static inline void *ofpbuf_pull(struct ofpbuf *b, size_t size)
+{
+ void *data = b->data;
+ ovs_assert(b->size >= size);
+ b->data = (char*)b->data + size;
+ b->size -= size;
+ return data;
+}
+
+/* If 'b' has at least 'size' bytes of data, removes that many bytes from the
+ * head end of 'b' and returns the first byte removed. Otherwise, returns a
+ * null pointer without modifying 'b'. */
+static inline void *ofpbuf_try_pull(struct ofpbuf *b, size_t size)
+{
+ return b->size >= size ? ofpbuf_pull(b, size) : NULL;
+}
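The pull functions moved inline above keep a simple contract: pulling removes bytes from the head by advancing 'data' and shrinking 'size', and the "try" variant leaves the buffer untouched on failure. A self-contained sketch of that contract using a toy struct (illustrative names, not the real ofpbuf API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy mirror of the fields ofpbuf_pull()/ofpbuf_try_pull() manipulate. */
struct toy {
    uint8_t *data;
    uint32_t size;
};

/* Removes 'n' bytes from the head if available, returning the old head;
 * otherwise returns NULL and leaves 'b' unmodified. */
static void *
toy_try_pull(struct toy *b, size_t n)
{
    void *head = b->data;

    if (b->size < n) {
        return NULL;
    }
    b->data += n;
    b->size -= n;
    return head;
}
```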
static inline struct ofpbuf *ofpbuf_from_list(const struct list *list)
{
return CONTAINER_OF(list, struct ofpbuf, list_node);
}
-void ofpbuf_list_delete(struct list *);
-static inline bool
-ofpbuf_equal(const struct ofpbuf *a, const struct ofpbuf *b)
+static inline bool ofpbuf_equal(const struct ofpbuf *a, const struct ofpbuf *b)
{
return a->size == b->size && memcmp(a->data, b->data, a->size) == 0;
}
+static inline void *ofpbuf_get_l2_5(const struct ofpbuf *b)
+{
+ return b->l2_5_ofs != UINT16_MAX ? (char *)b->l2 + b->l2_5_ofs : NULL;
+}
+
+static inline void ofpbuf_set_l2_5(struct ofpbuf *b, void *l2_5)
+{
+ b->l2_5_ofs = l2_5 ? (char *)l2_5 - (char *)b->l2 : UINT16_MAX;
+}
+
+static inline void *ofpbuf_get_l3(const struct ofpbuf *b)
+{
+ return b->l3_ofs != UINT16_MAX ? (char *)b->l2 + b->l3_ofs : NULL;
+}
+
+static inline void ofpbuf_set_l3(struct ofpbuf *b, void *l3)
+{
+ b->l3_ofs = l3 ? (char *)l3 - (char *)b->l2 : UINT16_MAX;
+}
+
+static inline void *ofpbuf_get_l4(const struct ofpbuf *b)
+{
+ return b->l4_ofs != UINT16_MAX ? (char *)b->l2 + b->l4_ofs : NULL;
+}
+
+static inline void ofpbuf_set_l4(struct ofpbuf *b, void *l4)
+{
+ b->l4_ofs = l4 ? (char *)l4 - (char *)b->l2 : UINT16_MAX;
+}
+
+static inline size_t ofpbuf_get_l4_size(const struct ofpbuf *b)
+{
+ return b->l4_ofs != UINT16_MAX
+ ? (const char *)ofpbuf_tail(b) - (const char *)ofpbuf_get_l4(b) : 0;
+}
+
+static inline const void *ofpbuf_get_tcp_payload(const struct ofpbuf *b)
+{
+ size_t l4_size = ofpbuf_get_l4_size(b);
+
+ if (OVS_LIKELY(l4_size >= TCP_HEADER_LEN)) {
+ struct tcp_header *tcp = ofpbuf_get_l4(b);
+ int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4;
+
+ if (OVS_LIKELY(tcp_len >= TCP_HEADER_LEN && tcp_len <= l4_size)) {
+ return (const char *)tcp + tcp_len;
+ }
+ }
+ return NULL;
+}
+
+static inline const void *ofpbuf_get_udp_payload(const struct ofpbuf *b)
+{
+ return OVS_LIKELY(ofpbuf_get_l4_size(b) >= UDP_HEADER_LEN)
+ ? (const char *)ofpbuf_get_l4(b) + UDP_HEADER_LEN : NULL;
+}
+
+static inline const void *ofpbuf_get_sctp_payload(const struct ofpbuf *b)
+{
+ return OVS_LIKELY(ofpbuf_get_l4_size(b) >= SCTP_HEADER_LEN)
+ ? (const char *)ofpbuf_get_l4(b) + SCTP_HEADER_LEN : NULL;
+}
+
+static inline const void *ofpbuf_get_icmp_payload(const struct ofpbuf *b)
+{
+ return OVS_LIKELY(ofpbuf_get_l4_size(b) >= ICMP_HEADER_LEN)
+ ? (const char *)ofpbuf_get_l4(b) + ICMP_HEADER_LEN : NULL;
+}
+
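ofpbuf_get_tcp_payload() above validates the TCP header length before computing the payload pointer: the 4-bit data-offset field counts 32-bit words, so the header length in bytes is offset * 4 and must fall within [20, l4_size]. The function below is a toy re-derivation of just that check; the name and the assumed bit layout (data offset in the top 4 bits of a host-order tcp_ctl) are illustrative, not OVS's real TCP_OFFSET macro:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TCP_HEADER_LEN 20

/* Returns the byte offset of the TCP payload within the L4 region, or -1
 * if the data-offset field is inconsistent with the available bytes. */
static int
toy_tcp_payload_offset(uint16_t tcp_ctl, size_t l4_size)
{
    int tcp_len = ((tcp_ctl >> 12) & 0xf) * 4;  /* data offset, in bytes */

    return (tcp_len >= TOY_TCP_HEADER_LEN && (size_t) tcp_len <= l4_size)
           ? tcp_len : -1;
}
```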
#ifdef __cplusplus
}
#endif
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include <stdatomic.h>
-/* Nonstandard atomic types. */
-typedef _Atomic(uint8_t) atomic_uint8_t;
-typedef _Atomic(uint16_t) atomic_uint16_t;
-typedef _Atomic(uint32_t) atomic_uint32_t;
-typedef _Atomic(uint64_t) atomic_uint64_t;
-
-typedef _Atomic(int8_t) atomic_int8_t;
-typedef _Atomic(int16_t) atomic_int16_t;
-typedef _Atomic(int32_t) atomic_int32_t;
-typedef _Atomic(int64_t) atomic_int64_t;
+#define OMIT_STANDARD_ATOMIC_TYPES 1
+#define ATOMIC(TYPE) _Atomic(TYPE)
#define atomic_read(SRC, DST) \
atomic_read_explicit(SRC, DST, memory_order_seq_cst)
(*(ORIG) = atomic_fetch_xor_explicit(RMW, ARG, ORDER), (void) 0)
#define atomic_and_explicit(RMW, ARG, ORIG, ORDER) \
(*(ORIG) = atomic_fetch_and_explicit(RMW, ARG, ORDER), (void) 0)
-
-static inline void
-atomic_flag_init(volatile atomic_flag *object OVS_UNUSED)
-{
- /* Nothing to do. */
-}
-
-static inline void
-atomic_flag_destroy(volatile atomic_flag *object OVS_UNUSED)
-{
- /* Nothing to do. */
-}
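The atomics cleanup in this patch drops the per-type atomic typedefs in favor of a single ATOMIC(TYPE) macro mapping to C11 _Atomic, with operations expressed via the *_explicit macros. As a hedged illustration in plain C11 (independent of OVS's wrappers), here is the seq_cst read-modify-write pattern that a macro like atomic_add_explicit wraps; fetch-and-add returns the value held before the addition:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Plain C11 sketch: atomically adds 'arg' to '*counter' with sequentially
 * consistent ordering and returns the previous value. */
static uint64_t
fetch_add_seq_cst(_Atomic(uint64_t) *counter, uint64_t arg)
{
    return atomic_fetch_add_explicit(counter, arg, memory_order_seq_cst);
}
```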
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#define OVS_ATOMIC_CLANG_IMPL 1
-/* Standard atomic types. */
-typedef _Atomic(_Bool) atomic_bool;
-
-typedef _Atomic(char) atomic_char;
-typedef _Atomic(signed char) atomic_schar;
-typedef _Atomic(unsigned char) atomic_uchar;
-
-typedef _Atomic(short) atomic_short;
-typedef _Atomic(unsigned short) atomic_ushort;
-
-typedef _Atomic(int) atomic_int;
-typedef _Atomic(unsigned int) atomic_uint;
-
-typedef _Atomic(long) atomic_long;
-typedef _Atomic(unsigned long) atomic_ulong;
-
-typedef _Atomic(long long) atomic_llong;
-typedef _Atomic(unsigned long long) atomic_ullong;
-
-typedef _Atomic(size_t) atomic_size_t;
-typedef _Atomic(ptrdiff_t) atomic_ptrdiff_t;
-
-typedef _Atomic(intmax_t) atomic_intmax_t;
-typedef _Atomic(uintmax_t) atomic_uintmax_t;
-
-typedef _Atomic(intptr_t) atomic_intptr_t;
-typedef _Atomic(uintptr_t) atomic_uintptr_t;
-
-/* Nonstandard atomic types. */
-typedef _Atomic(uint8_t) atomic_uint8_t;
-typedef _Atomic(uint16_t) atomic_uint16_t;
-typedef _Atomic(uint32_t) atomic_uint32_t;
-typedef _Atomic(uint64_t) atomic_uint64_t;
-
-typedef _Atomic(int8_t) atomic_int8_t;
-typedef _Atomic(int16_t) atomic_int16_t;
-typedef _Atomic(int32_t) atomic_int32_t;
-typedef _Atomic(int64_t) atomic_int64_t;
+#define ATOMIC(TYPE) _Atomic(TYPE)
#define ATOMIC_VAR_INIT(VALUE) (VALUE)
#define atomic_init(OBJECT, VALUE) __c11_atomic_init(OBJECT, VALUE)
-#define atomic_destroy(OBJECT) ((void) (OBJECT))
/* Clang hard-codes these exact values internally but does not appear to
* export any names for them. */
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
} atomic_flag;
#define ATOMIC_FLAG_INIT { .b = false }
-static inline void
-atomic_flag_init(volatile atomic_flag *object OVS_UNUSED)
-{
- /* Nothing to do. */
-}
-
-static inline void
-atomic_flag_destroy(volatile atomic_flag *object OVS_UNUSED)
-{
- /* Nothing to do. */
-}
-
static inline bool
atomic_flag_test_and_set_explicit(volatile atomic_flag *object,
memory_order order)
+++ /dev/null
-/*
- * Copyright (c) 2013 Nicira, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at:
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#include <config.h>
-
-#include "ovs-atomic.h"
-#include "ovs-thread.h"
-
-#if OVS_ATOMIC_GCC4P_IMPL
-static struct ovs_mutex mutex = OVS_MUTEX_INITIALIZER;
-
-#define DEFINE_LOCKED_OP(TYPE, NAME, OPERATOR) \
- TYPE##_t \
- locked_##TYPE##_##NAME(struct locked_##TYPE *u, TYPE##_t arg) \
- { \
- TYPE##_t old_value; \
- \
- ovs_mutex_lock(&mutex); \
- old_value = u->value; \
- u->value OPERATOR arg; \
- ovs_mutex_unlock(&mutex); \
- \
- return old_value; \
- }
-
-#define DEFINE_LOCKED_TYPE(TYPE) \
- TYPE##_t \
- locked_##TYPE##_load(const struct locked_##TYPE *u) \
- { \
- TYPE##_t value; \
- \
- ovs_mutex_lock(&mutex); \
- value = u->value; \
- ovs_mutex_unlock(&mutex); \
- \
- return value; \
- } \
- \
- void \
- locked_##TYPE##_store(struct locked_##TYPE *u, TYPE##_t value) \
- { \
- ovs_mutex_lock(&mutex); \
- u->value = value; \
- ovs_mutex_unlock(&mutex); \
- } \
- DEFINE_LOCKED_OP(TYPE, add, +=); \
- DEFINE_LOCKED_OP(TYPE, sub, -=); \
- DEFINE_LOCKED_OP(TYPE, or, |=); \
- DEFINE_LOCKED_OP(TYPE, xor, ^=); \
- DEFINE_LOCKED_OP(TYPE, and, &=)
-
-DEFINE_LOCKED_TYPE(uint64);
-DEFINE_LOCKED_TYPE(int64);
-
-#endif /* OVS_ATOMIC_GCC4P_IMPL */
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#error "This header should only be included indirectly via ovs-atomic.h."
#endif
+#include "ovs-atomic-locked.h"
#define OVS_ATOMIC_GCC4P_IMPL 1
-#define DEFINE_LOCKLESS_ATOMIC(TYPE, NAME) typedef struct { TYPE value; } NAME
+#define ATOMIC(TYPE) TYPE
#define ATOMIC_BOOL_LOCK_FREE 2
-DEFINE_LOCKLESS_ATOMIC(bool, atomic_bool);
-
#define ATOMIC_CHAR_LOCK_FREE 2
-DEFINE_LOCKLESS_ATOMIC(char, atomic_char);
-DEFINE_LOCKLESS_ATOMIC(signed char, atomic_schar);
-DEFINE_LOCKLESS_ATOMIC(unsigned char, atomic_uchar);
-
#define ATOMIC_SHORT_LOCK_FREE 2
-DEFINE_LOCKLESS_ATOMIC(short, atomic_short);
-DEFINE_LOCKLESS_ATOMIC(unsigned short, atomic_ushort);
-
#define ATOMIC_INT_LOCK_FREE 2
-DEFINE_LOCKLESS_ATOMIC(int, atomic_int);
-DEFINE_LOCKLESS_ATOMIC(unsigned int, atomic_uint);
-
-#if ULONG_MAX <= UINTPTR_MAX
- #define ATOMIC_LONG_LOCK_FREE 2
- DEFINE_LOCKLESS_ATOMIC(long, atomic_long);
- DEFINE_LOCKLESS_ATOMIC(unsigned long, atomic_ulong);
-#elif ULONG_MAX == UINT64_MAX
- #define ATOMIC_LONG_LOCK_FREE 0
- typedef struct locked_int64 atomic_long;
- typedef struct locked_uint64 atomic_ulong;
-#else
- #error "not implemented"
-#endif
-
-#if ULLONG_MAX <= UINTPTR_MAX
- #define ATOMIC_LLONG_LOCK_FREE 2
- DEFINE_LOCKLESS_ATOMIC(long long, atomic_llong);
- DEFINE_LOCKLESS_ATOMIC(unsigned long long, atomic_ullong);
-#elif ULLONG_MAX == UINT64_MAX
- #define ATOMIC_LLONG_LOCK_FREE 0
- typedef struct locked_int64 atomic_llong;
- typedef struct locked_uint64 atomic_ullong;
-#else
- #error "not implemented"
-#endif
-
-#if SIZE_MAX <= UINTPTR_MAX
- DEFINE_LOCKLESS_ATOMIC(size_t, atomic_size_t);
- DEFINE_LOCKLESS_ATOMIC(ptrdiff_t, atomic_ptrdiff_t);
-#elif SIZE_MAX == UINT64_MAX
- typedef struct locked_uint64 atomic_size_t;
- typedef struct locked_int64 atomic_ptrdiff_t;
-#else
- #error "not implemented"
-#endif
-
-#if UINTMAX_MAX <= UINTPTR_MAX
- DEFINE_LOCKLESS_ATOMIC(intmax_t, atomic_intmax_t);
- DEFINE_LOCKLESS_ATOMIC(uintmax_t, atomic_uintmax_t);
-#elif UINTMAX_MAX == UINT64_MAX
- typedef struct locked_int64 atomic_intmax_t;
- typedef struct locked_uint64 atomic_uintmax_t;
-#else
- #error "not implemented"
-#endif
-
+#define ATOMIC_LONG_LOCK_FREE (ULONG_MAX <= UINTPTR_MAX ? 2 : 0)
+#define ATOMIC_LLONG_LOCK_FREE (ULLONG_MAX <= UINTPTR_MAX ? 2 : 0)
#define ATOMIC_POINTER_LOCK_FREE 2
-DEFINE_LOCKLESS_ATOMIC(intptr_t, atomic_intptr_t);
-DEFINE_LOCKLESS_ATOMIC(uintptr_t, atomic_uintptr_t);
-
-/* Nonstandard atomic types. */
-DEFINE_LOCKLESS_ATOMIC(uint8_t, atomic_uint8_t);
-DEFINE_LOCKLESS_ATOMIC(uint16_t, atomic_uint16_t);
-DEFINE_LOCKLESS_ATOMIC(uint32_t, atomic_uint32_t);
-DEFINE_LOCKLESS_ATOMIC(int8_t, atomic_int8_t);
-DEFINE_LOCKLESS_ATOMIC(int16_t, atomic_int16_t);
-DEFINE_LOCKLESS_ATOMIC(int32_t, atomic_int32_t);
-#if UINT64_MAX <= UINTPTR_MAX
- DEFINE_LOCKLESS_ATOMIC(uint64_t, atomic_uint64_t);
- DEFINE_LOCKLESS_ATOMIC(int64_t, atomic_int64_t);
-#else
- typedef struct locked_uint64 atomic_uint64_t;
- typedef struct locked_int64 atomic_int64_t;
-#endif
typedef enum {
memory_order_relaxed,
memory_order_seq_cst
} memory_order;
\f
-/* locked_uint64. */
-
-#define IF_LOCKED_UINT64(OBJECT, THEN, ELSE) \
- __builtin_choose_expr( \
- __builtin_types_compatible_p(typeof(OBJECT), struct locked_uint64), \
- (THEN), (ELSE))
-#define AS_LOCKED_UINT64(OBJECT) ((struct locked_uint64 *) (void *) (OBJECT))
-#define AS_UINT64(OBJECT) ((uint64_t *) (OBJECT))
-struct locked_uint64 {
- uint64_t value;
-};
-
-uint64_t locked_uint64_load(const struct locked_uint64 *);
-void locked_uint64_store(struct locked_uint64 *, uint64_t);
-uint64_t locked_uint64_add(struct locked_uint64 *, uint64_t arg);
-uint64_t locked_uint64_sub(struct locked_uint64 *, uint64_t arg);
-uint64_t locked_uint64_or(struct locked_uint64 *, uint64_t arg);
-uint64_t locked_uint64_xor(struct locked_uint64 *, uint64_t arg);
-uint64_t locked_uint64_and(struct locked_uint64 *, uint64_t arg);
+#define IS_LOCKLESS_ATOMIC(OBJECT) (sizeof(OBJECT) <= sizeof(void *))
\f
-#define IF_LOCKED_INT64(OBJECT, THEN, ELSE) \
- __builtin_choose_expr( \
- __builtin_types_compatible_p(typeof(OBJECT), struct locked_int64), \
- (THEN), (ELSE))
-#define AS_LOCKED_INT64(OBJECT) ((struct locked_int64 *) (void *) (OBJECT))
-#define AS_INT64(OBJECT) ((int64_t *) (OBJECT))
-struct locked_int64 {
- int64_t value;
-};
-int64_t locked_int64_load(const struct locked_int64 *);
-void locked_int64_store(struct locked_int64 *, int64_t);
-int64_t locked_int64_add(struct locked_int64 *, int64_t arg);
-int64_t locked_int64_sub(struct locked_int64 *, int64_t arg);
-int64_t locked_int64_or(struct locked_int64 *, int64_t arg);
-int64_t locked_int64_xor(struct locked_int64 *, int64_t arg);
-int64_t locked_int64_and(struct locked_int64 *, int64_t arg);
-\f
-#define ATOMIC_VAR_INIT(VALUE) { .value = (VALUE) }
-#define atomic_init(OBJECT, VALUE) ((OBJECT)->value = (VALUE), (void) 0)
-#define atomic_destroy(OBJECT) ((void) (OBJECT))
+#define ATOMIC_VAR_INIT(VALUE) VALUE
+#define atomic_init(OBJECT, VALUE) (*(OBJECT) = (VALUE), (void) 0)
static inline void
atomic_thread_fence(memory_order order)
}
}
-#define ATOMIC_SWITCH(OBJECT, LOCKLESS_CASE, \
- LOCKED_UINT64_CASE, LOCKED_INT64_CASE) \
- IF_LOCKED_UINT64(OBJECT, LOCKED_UINT64_CASE, \
- IF_LOCKED_INT64(OBJECT, LOCKED_INT64_CASE, \
- LOCKLESS_CASE))
-
#define atomic_is_lock_free(OBJ) \
- ((void) (OBJ)->value, \
- ATOMIC_SWITCH(OBJ, true, false, false))
+ ((void) *(OBJ), \
+     IS_LOCKLESS_ATOMIC(*(OBJ)))
#define atomic_store(DST, SRC) \
atomic_store_explicit(DST, SRC, memory_order_seq_cst)
-#define atomic_store_explicit(DST, SRC, ORDER) \
- (ATOMIC_SWITCH(DST, \
- (atomic_thread_fence(ORDER), \
- (DST)->value = (SRC), \
- atomic_thread_fence_if_seq_cst(ORDER)), \
- locked_uint64_store(AS_LOCKED_UINT64(DST), SRC), \
- locked_int64_store(AS_LOCKED_INT64(DST), SRC)), \
- (void) 0)
-
+#define atomic_store_explicit(DST, SRC, ORDER) \
+ ({ \
+ typeof(DST) dst__ = (DST); \
+ typeof(SRC) src__ = (SRC); \
+ memory_order order__ = (ORDER); \
+ \
+ if (IS_LOCKLESS_ATOMIC(*dst__)) { \
+ atomic_thread_fence(order__); \
+ *dst__ = src__; \
+ atomic_thread_fence_if_seq_cst(order__); \
+ } else { \
+ atomic_store_locked(DST, SRC); \
+ } \
+ (void) 0; \
+ })
#define atomic_read(SRC, DST) \
atomic_read_explicit(SRC, DST, memory_order_seq_cst)
-#define atomic_read_explicit(SRC, DST, ORDER) \
- (ATOMIC_SWITCH(SRC, \
- (atomic_thread_fence_if_seq_cst(ORDER), \
- (*DST) = (SRC)->value, \
- atomic_thread_fence(ORDER)), \
- *(DST) = locked_uint64_load(AS_LOCKED_UINT64(SRC)), \
- *(DST) = locked_int64_load(AS_LOCKED_INT64(SRC))), \
- (void) 0)
-
-#define atomic_op__(RMW, OP, ARG, ORIG) \
- (ATOMIC_SWITCH(RMW, \
- *(ORIG) = __sync_fetch_and_##OP(&(RMW)->value, ARG), \
- *(ORIG) = locked_uint64_##OP(AS_LOCKED_UINT64(RMW), ARG), \
- *(ORIG) = locked_int64_##OP(AS_LOCKED_INT64(RMW), ARG)), \
- (void) 0)
+#define atomic_read_explicit(SRC, DST, ORDER) \
+ ({ \
+ typeof(DST) dst__ = (DST); \
+ typeof(SRC) src__ = (SRC); \
+ memory_order order__ = (ORDER); \
+ \
+ if (IS_LOCKLESS_ATOMIC(*src__)) { \
+ atomic_thread_fence_if_seq_cst(order__); \
+ *dst__ = *src__; \
+ } else { \
+ atomic_read_locked(SRC, DST); \
+ } \
+ (void) 0; \
+ })
+
+#define atomic_op__(RMW, OP, ARG, ORIG) \
+ ({ \
+ typeof(RMW) rmw__ = (RMW); \
+ typeof(ARG) arg__ = (ARG); \
+ typeof(ORIG) orig__ = (ORIG); \
+ \
+ if (IS_LOCKLESS_ATOMIC(*rmw__)) { \
+ *orig__ = __sync_fetch_and_##OP(rmw__, arg__); \
+ } else { \
+ atomic_op_locked(RMW, OP, ARG, ORIG); \
+ } \
+ })
#define atomic_add(RMW, ARG, ORIG) atomic_op__(RMW, add, ARG, ORIG)
#define atomic_sub(RMW, ARG, ORIG) atomic_op__(RMW, sub, ARG, ORIG)
} atomic_flag;
#define ATOMIC_FLAG_INIT { false }
-static inline void
-atomic_flag_init(volatile atomic_flag *object OVS_UNUSED)
-{
- /* Nothing to do. */
-}
-
-static inline void
-atomic_flag_destroy(volatile atomic_flag *object OVS_UNUSED)
-{
- /* Nothing to do. */
-}
-
static inline bool
atomic_flag_test_and_set(volatile atomic_flag *object)
{
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#error "This header should only be included indirectly via ovs-atomic.h."
#endif
-/* C11 standardized atomic type. */
-typedef bool atomic_bool;
-
-typedef char atomic_char;
-typedef signed char atomic_schar;
-typedef unsigned char atomic_uchar;
-
-typedef short atomic_short;
-typedef unsigned short atomic_ushort;
-
-typedef int atomic_int;
-typedef unsigned int atomic_uint;
-
-typedef long atomic_long;
-typedef unsigned long atomic_ulong;
-
-typedef long long atomic_llong;
-typedef unsigned long long atomic_ullong;
-
-typedef size_t atomic_size_t;
-typedef ptrdiff_t atomic_ptrdiff_t;
-
-typedef intmax_t atomic_intmax_t;
-typedef uintmax_t atomic_uintmax_t;
-
-typedef intptr_t atomic_intptr_t;
-typedef uintptr_t atomic_uintptr_t;
-
-/* Nonstandard atomic types. */
-typedef int8_t atomic_int8_t;
-typedef uint8_t atomic_uint8_t;
-
-typedef int16_t atomic_int16_t;
-typedef uint16_t atomic_uint16_t;
-
-typedef int32_t atomic_int32_t;
-typedef uint32_t atomic_uint32_t;
-
-typedef int64_t atomic_int64_t;
-typedef uint64_t atomic_uint64_t;
+#define ATOMIC(TYPE) TYPE
typedef enum {
memory_order_relaxed = __ATOMIC_RELAXED,
#define ATOMIC_VAR_INIT(VALUE) (VALUE)
#define atomic_init(OBJECT, VALUE) (*(OBJECT) = (VALUE), (void) 0)
-#define atomic_destroy(OBJECT) ((void) (OBJECT))
#define atomic_thread_fence __atomic_thread_fence
#define atomic_signal_fence __atomic_signal_fence
--- /dev/null
+/*
+ * Copyright (c) 2013, 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include "ovs-atomic.h"
+#include "hash.h"
+#include "ovs-thread.h"
+
+#ifdef OVS_ATOMIC_LOCKED_IMPL
+static struct ovs_mutex *
+mutex_for_pointer(void *p)
+{
+ OVS_ALIGNED_STRUCT(CACHE_LINE_SIZE, aligned_mutex) {
+ struct ovs_mutex mutex;
+ char pad[PAD_SIZE(sizeof(struct ovs_mutex), CACHE_LINE_SIZE)];
+ };
+
+ static struct aligned_mutex atomic_mutexes[] = {
+#define MUTEX_INIT { .mutex = OVS_MUTEX_INITIALIZER }
+#define MUTEX_INIT4 MUTEX_INIT, MUTEX_INIT, MUTEX_INIT, MUTEX_INIT
+#define MUTEX_INIT16 MUTEX_INIT4, MUTEX_INIT4, MUTEX_INIT4, MUTEX_INIT4
+ MUTEX_INIT16, MUTEX_INIT16,
+ };
+ BUILD_ASSERT_DECL(IS_POW2(ARRAY_SIZE(atomic_mutexes)));
+
+ uint32_t hash = hash_pointer(p, 0);
+ uint32_t indx = hash & (ARRAY_SIZE(atomic_mutexes) - 1);
+ return &atomic_mutexes[indx].mutex;
+}
+
+void
+atomic_lock__(void *p)
+ OVS_ACQUIRES(mutex_for_pointer(p))
+{
+ ovs_mutex_lock(mutex_for_pointer(p));
+}
+
+void
+atomic_unlock__(void *p)
+ OVS_RELEASES(mutex_for_pointer(p))
+{
+ ovs_mutex_unlock(mutex_for_pointer(p));
+}
+#endif /* OVS_ATOMIC_LOCKED_IMPL */
--- /dev/null
+/* This header implements atomic operation locking helpers. */
+#ifndef IN_OVS_ATOMIC_H
+#error "This header should only be included indirectly via ovs-atomic.h."
+#endif
+
+#define OVS_ATOMIC_LOCKED_IMPL 1
+
+void atomic_lock__(void *);
+void atomic_unlock__(void *);
+
+#define atomic_store_locked(DST, SRC) \
+ (atomic_lock__(DST), \
+ *(DST) = (SRC), \
+ atomic_unlock__(DST), \
+ (void) 0)
+
+#define atomic_read_locked(SRC, DST) \
+ (atomic_lock__(SRC), \
+ *(DST) = *(SRC), \
+ atomic_unlock__(SRC), \
+ (void) 0)
+
+#define atomic_op_locked_add +=
+#define atomic_op_locked_sub -=
+#define atomic_op_locked_or |=
+#define atomic_op_locked_xor ^=
+#define atomic_op_locked_and &=
+#define atomic_op_locked(RMW, OP, OPERAND, ORIG) \
+ (atomic_lock__(RMW), \
+ *(ORIG) = *(RMW), \
+ *(RMW) atomic_op_locked_##OP (OPERAND), \
+ atomic_unlock__(RMW))
+++ /dev/null
-/*
- * Copyright (c) 2013, 2014 Nicira, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at:
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#include <config.h>
-
-#include "ovs-atomic.h"
-#include "ovs-thread.h"
-
-#if OVS_ATOMIC_PTHREADS_IMPL
-void
-atomic_flag_init(volatile atomic_flag *flag_)
-{
- atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
-
- pthread_mutex_init(&flag->mutex, NULL);
- atomic_flag_clear(flag_);
-}
-
-void
-atomic_flag_destroy(volatile atomic_flag *flag_)
-{
- atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
-
- pthread_mutex_destroy(&flag->mutex);
-}
-
-bool
-atomic_flag_test_and_set(volatile atomic_flag *flag_)
-{
- atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
- bool old_value;
-
- xpthread_mutex_lock(&flag->mutex);
- old_value = flag->b;
- flag->b = true;
- xpthread_mutex_unlock(&flag->mutex);
-
- return old_value;
-}
-
-bool
-atomic_flag_test_and_set_explicit(volatile atomic_flag *flag,
- memory_order order OVS_UNUSED)
-{
- return atomic_flag_test_and_set(flag);
-}
-
-void
-atomic_flag_clear(volatile atomic_flag *flag_)
-{
- atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
-
- xpthread_mutex_lock(&flag->mutex);
- flag->b = false;
- xpthread_mutex_unlock(&flag->mutex);
-}
-
-void
-atomic_flag_clear_explicit(volatile atomic_flag *flag,
- memory_order order OVS_UNUSED)
-{
- return atomic_flag_clear(flag);
-}
-
-#endif /* OVS_ATOMIC_PTHREADS_IMPL */
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#error "This header should only be included indirectly via ovs-atomic.h."
#endif
+#include "ovs-atomic-locked.h"
+
#define OVS_ATOMIC_PTHREADS_IMPL 1
-#define DEFINE_PTHREAD_ATOMIC(TYPE, NAME) \
- typedef struct { \
- TYPE value; \
- pthread_mutex_t mutex; \
- } NAME;
+#define ATOMIC(TYPE) TYPE
#define ATOMIC_BOOL_LOCK_FREE 0
-DEFINE_PTHREAD_ATOMIC(bool, atomic_bool);
-
#define ATOMIC_CHAR_LOCK_FREE 0
-DEFINE_PTHREAD_ATOMIC(char, atomic_char);
-DEFINE_PTHREAD_ATOMIC(signed char, atomic_schar);
-DEFINE_PTHREAD_ATOMIC(unsigned char, atomic_uchar);
-
#define ATOMIC_SHORT_LOCK_FREE 0
-DEFINE_PTHREAD_ATOMIC(short, atomic_short);
-DEFINE_PTHREAD_ATOMIC(unsigned short, atomic_ushort);
-
#define ATOMIC_INT_LOCK_FREE 0
-DEFINE_PTHREAD_ATOMIC(int, atomic_int);
-DEFINE_PTHREAD_ATOMIC(unsigned int, atomic_uint);
-
#define ATOMIC_LONG_LOCK_FREE 0
-DEFINE_PTHREAD_ATOMIC(long, atomic_long);
-DEFINE_PTHREAD_ATOMIC(unsigned long, atomic_ulong);
-
#define ATOMIC_LLONG_LOCK_FREE 0
-DEFINE_PTHREAD_ATOMIC(long long, atomic_llong);
-DEFINE_PTHREAD_ATOMIC(unsigned long long, atomic_ullong);
-
-DEFINE_PTHREAD_ATOMIC(size_t, atomic_size_t);
-DEFINE_PTHREAD_ATOMIC(ptrdiff_t, atomic_ptrdiff_t);
-
-DEFINE_PTHREAD_ATOMIC(intmax_t, atomic_intmax_t);
-DEFINE_PTHREAD_ATOMIC(uintmax_t, atomic_uintmax_t);
-
#define ATOMIC_POINTER_LOCK_FREE 0
-DEFINE_PTHREAD_ATOMIC(intptr_t, atomic_intptr_t);
-DEFINE_PTHREAD_ATOMIC(uintptr_t, atomic_uintptr_t);
-
-/* Nonstandard atomic types. */
-DEFINE_PTHREAD_ATOMIC(uint8_t, atomic_uint8_t);
-DEFINE_PTHREAD_ATOMIC(uint16_t, atomic_uint16_t);
-DEFINE_PTHREAD_ATOMIC(uint32_t, atomic_uint32_t);
-DEFINE_PTHREAD_ATOMIC(int8_t, atomic_int8_t);
-DEFINE_PTHREAD_ATOMIC(int16_t, atomic_int16_t);
-DEFINE_PTHREAD_ATOMIC(int32_t, atomic_int32_t);
-DEFINE_PTHREAD_ATOMIC(uint64_t, atomic_uint64_t);
-DEFINE_PTHREAD_ATOMIC(int64_t, atomic_int64_t);
typedef enum {
memory_order_relaxed,
memory_order_seq_cst
} memory_order;
-#define ATOMIC_VAR_INIT(VALUE) { VALUE, PTHREAD_MUTEX_INITIALIZER }
-#define atomic_init(OBJECT, VALUE) \
- ((OBJECT)->value = (VALUE), \
- pthread_mutex_init(&(OBJECT)->mutex, NULL), \
- (void) 0)
-#define atomic_destroy(OBJECT) \
- (pthread_mutex_destroy(&(OBJECT)->mutex), \
- (void) 0)
+#define ATOMIC_VAR_INIT(VALUE) (VALUE)
+#define atomic_init(OBJECT, VALUE) (*(OBJECT) = (VALUE), (void) 0)
static inline void
atomic_thread_fence(memory_order order OVS_UNUSED)
#define atomic_is_lock_free(OBJ) false
-#define atomic_store(DST, SRC) \
- (pthread_mutex_lock(&(DST)->mutex), \
- (DST)->value = (SRC), \
- pthread_mutex_unlock(&(DST)->mutex), \
- (void) 0)
+#define atomic_store(DST, SRC) atomic_store_locked(DST, SRC)
#define atomic_store_explicit(DST, SRC, ORDER) \
((void) (ORDER), atomic_store(DST, SRC))
-#define atomic_read(SRC, DST) \
- (pthread_mutex_lock(CONST_CAST(pthread_mutex_t *, &(SRC)->mutex)), \
- *(DST) = (SRC)->value, \
- pthread_mutex_unlock(CONST_CAST(pthread_mutex_t *, &(SRC)->mutex)), \
- (void) 0)
+#define atomic_read(SRC, DST) atomic_read_locked(SRC, DST)
#define atomic_read_explicit(SRC, DST, ORDER) \
((void) (ORDER), atomic_read(SRC, DST))
-#define atomic_op__(RMW, OPERATOR, OPERAND, ORIG) \
- (pthread_mutex_lock(&(RMW)->mutex), \
- *(ORIG) = (RMW)->value, \
- (RMW)->value OPERATOR (OPERAND), \
- pthread_mutex_unlock(&(RMW)->mutex), \
- (void) 0)
-
-#define atomic_add(RMW, OPERAND, ORIG) atomic_op__(RMW, +=, OPERAND, ORIG)
-#define atomic_sub(RMW, OPERAND, ORIG) atomic_op__(RMW, -=, OPERAND, ORIG)
-#define atomic_or( RMW, OPERAND, ORIG) atomic_op__(RMW, |=, OPERAND, ORIG)
-#define atomic_xor(RMW, OPERAND, ORIG) atomic_op__(RMW, ^=, OPERAND, ORIG)
-#define atomic_and(RMW, OPERAND, ORIG) atomic_op__(RMW, &=, OPERAND, ORIG)
-
-#define atomic_add_explicit(RMW, OPERAND, ORIG, ORDER) \
- ((void) (ORDER), atomic_add(RMW, OPERAND, ORIG))
-#define atomic_sub_explicit(RMW, OPERAND, ORIG, ORDER) \
- ((void) (ORDER), atomic_sub(RMW, OPERAND, ORIG))
-#define atomic_or_explicit(RMW, OPERAND, ORIG, ORDER) \
- ((void) (ORDER), atomic_or(RMW, OPERAND, ORIG))
-#define atomic_xor_explicit(RMW, OPERAND, ORIG, ORDER) \
- ((void) (ORDER), atomic_xor(RMW, OPERAND, ORIG))
-#define atomic_and_explicit(RMW, OPERAND, ORIG, ORDER) \
- ((void) (ORDER), atomic_and(RMW, OPERAND, ORIG))
+#define atomic_add(RMW, ARG, ORIG) atomic_op_locked(RMW, add, ARG, ORIG)
+#define atomic_sub(RMW, ARG, ORIG) atomic_op_locked(RMW, sub, ARG, ORIG)
+#define atomic_or( RMW, ARG, ORIG) atomic_op_locked(RMW, or, ARG, ORIG)
+#define atomic_xor(RMW, ARG, ORIG) atomic_op_locked(RMW, xor, ARG, ORIG)
+#define atomic_and(RMW, ARG, ORIG) atomic_op_locked(RMW, and, ARG, ORIG)
+
+#define atomic_add_explicit(RMW, ARG, ORIG, ORDER) \
+ ((void) (ORDER), atomic_add(RMW, ARG, ORIG))
+#define atomic_sub_explicit(RMW, ARG, ORIG, ORDER) \
+ ((void) (ORDER), atomic_sub(RMW, ARG, ORIG))
+#define atomic_or_explicit(RMW, ARG, ORIG, ORDER) \
+ ((void) (ORDER), atomic_or(RMW, ARG, ORIG))
+#define atomic_xor_explicit(RMW, ARG, ORIG, ORDER) \
+ ((void) (ORDER), atomic_xor(RMW, ARG, ORIG))
+#define atomic_and_explicit(RMW, ARG, ORIG, ORDER) \
+ ((void) (ORDER), atomic_and(RMW, ARG, ORIG))
\f
/* atomic_flag */
typedef struct {
bool b;
- pthread_mutex_t mutex;
} atomic_flag;
-#define ATOMIC_FLAG_INIT { false, PTHREAD_MUTEX_INITIALIZER }
+#define ATOMIC_FLAG_INIT { false }
-void atomic_flag_init(volatile atomic_flag *);
-void atomic_flag_destroy(volatile atomic_flag *);
+static inline bool
+atomic_flag_test_and_set(volatile atomic_flag *flag_)
+{
+ atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
+ bool old_value;
+
+ atomic_lock__(flag);
+ old_value = flag->b;
+ flag->b = true;
+ atomic_unlock__(flag);
-bool atomic_flag_test_and_set(volatile atomic_flag *);
-bool atomic_flag_test_and_set_explicit(volatile atomic_flag *, memory_order);
+ return old_value;
+}
-void atomic_flag_clear(volatile atomic_flag *);
-void atomic_flag_clear_explicit(volatile atomic_flag *, memory_order);
+static inline bool
+atomic_flag_test_and_set_explicit(volatile atomic_flag *flag,
+ memory_order order OVS_UNUSED)
+{
+ return atomic_flag_test_and_set(flag);
+}
+
+static inline void
+atomic_flag_clear(volatile atomic_flag *flag_)
+{
+ atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
+
+ atomic_lock__(flag);
+ flag->b = false;
+ atomic_unlock__(flag);
+}
+
+static inline void
+atomic_flag_clear_explicit(volatile atomic_flag *flag,
+ memory_order order OVS_UNUSED)
+{
+ atomic_flag_clear(flag);
+}
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
*
* (*) Not specified by C11.
*
+ * Atomic types may also be obtained via ATOMIC(TYPE), e.g. ATOMIC(void *).
+ * Only basic integer types and pointer types can be made atomic this way,
+ * so, for example, atomic structs are not supported.
+ *
* The atomic version of a type doesn't necessarily have the same size or
* representation as the ordinary version; for example, atomic_int might be a
- * typedef for a struct that also includes a mutex. The range of an atomic
- * type does match the range of the corresponding ordinary type.
+ * typedef for a struct. The range of an atomic type does match the range of
+ * the corresponding ordinary type.
*
* C11 says that one may use the _Atomic keyword in place of the typedef name,
* e.g. "_Atomic int" instead of "atomic_int". This library doesn't support
* ...
* atomic_init(&ai, 123);
*
- * C11 does not hav an destruction function for atomic types, but some
- * implementations of the OVS atomics do need them. Thus, the following
- * function is provided for destroying non-static atomic objects (A is any
- * atomic type):
- *
- * void atomic_destroy(A *object);
- *
- * Destroys 'object'.
- *
*
* Barriers
* ========
* ATOMIC_FLAG_INIT is an initializer for atomic_flag. The initial state is
* "clear".
*
- * C11 does not have an initialization or destruction function for atomic_flag,
- * because implementations should not need one (one may simply
- * atomic_flag_clear() an uninitialized atomic_flag), but some implementations
- * of the OVS atomics do need them. Thus, the following two functions are
- * provided for initializing and destroying non-static atomic_flags:
- *
- * void atomic_flag_init(volatile atomic_flag *object);
- *
- * Initializes 'object'. The initial state is "clear".
- *
- * void atomic_flag_destroy(volatile atomic_flag *object);
- *
- * Destroys 'object'.
+ * An atomic_flag may also be initialized at runtime with atomic_flag_clear().
*
*
* Operations
#endif
#undef IN_OVS_ATOMIC_H
+#ifndef OMIT_STANDARD_ATOMIC_TYPES
+typedef ATOMIC(bool) atomic_bool;
+
+typedef ATOMIC(char) atomic_char;
+typedef ATOMIC(signed char) atomic_schar;
+typedef ATOMIC(unsigned char) atomic_uchar;
+
+typedef ATOMIC(short) atomic_short;
+typedef ATOMIC(unsigned short) atomic_ushort;
+
+typedef ATOMIC(int) atomic_int;
+typedef ATOMIC(unsigned int) atomic_uint;
+
+typedef ATOMIC(long) atomic_long;
+typedef ATOMIC(unsigned long) atomic_ulong;
+
+typedef ATOMIC(long long) atomic_llong;
+typedef ATOMIC(unsigned long long) atomic_ullong;
+
+typedef ATOMIC(size_t) atomic_size_t;
+typedef ATOMIC(ptrdiff_t) atomic_ptrdiff_t;
+
+typedef ATOMIC(intmax_t) atomic_intmax_t;
+typedef ATOMIC(uintmax_t) atomic_uintmax_t;
+
+typedef ATOMIC(intptr_t) atomic_intptr_t;
+typedef ATOMIC(uintptr_t) atomic_uintptr_t;
+#endif /* !OMIT_STANDARD_ATOMIC_TYPES */
+
+/* Nonstandard atomic types. */
+typedef ATOMIC(uint8_t) atomic_uint8_t;
+typedef ATOMIC(uint16_t) atomic_uint16_t;
+typedef ATOMIC(uint32_t) atomic_uint32_t;
+typedef ATOMIC(uint64_t) atomic_uint64_t;
+
+typedef ATOMIC(int8_t) atomic_int8_t;
+typedef ATOMIC(int16_t) atomic_int16_t;
+typedef ATOMIC(int32_t) atomic_int32_t;
+typedef ATOMIC(int64_t) atomic_int64_t;
+
/* Reference count. */
struct ovs_refcount {
atomic_uint count;
atomic_init(&refcount->count, 1);
}
-/* Destroys 'refcount'. */
-static inline void
-ovs_refcount_destroy(struct ovs_refcount *refcount)
-{
- atomic_destroy(&refcount->count);
-}
-
/* Increments 'refcount'. */
static inline void
ovs_refcount_ref(struct ovs_refcount *refcount)
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+#include "ovs-rcu.h"
+#include "guarded-list.h"
+#include "list.h"
+#include "ovs-thread.h"
+#include "poll-loop.h"
+#include "seq.h"
+
+struct ovsrcu_cb {
+ void (*function)(void *aux);
+ void *aux;
+};
+
+struct ovsrcu_cbset {
+ struct list list_node;
+ struct ovsrcu_cb cbs[16];
+ int n_cbs;
+};
+
+struct ovsrcu_perthread {
+ struct list list_node; /* In global list. */
+
+ struct ovs_mutex mutex;
+ uint64_t seqno;
+ struct ovsrcu_cbset *cbset;
+};
+
+static struct seq *global_seqno;
+
+static pthread_key_t perthread_key;
+static struct list ovsrcu_threads;
+static struct ovs_mutex ovsrcu_threads_mutex;
+
+static struct guarded_list flushed_cbsets;
+static struct seq *flushed_cbsets_seq;
+
+static void ovsrcu_init(void);
+static void ovsrcu_flush_cbset(struct ovsrcu_perthread *);
+static void ovsrcu_unregister__(struct ovsrcu_perthread *);
+static bool ovsrcu_call_postponed(void);
+static void *ovsrcu_postpone_thread(void *arg OVS_UNUSED);
+static void ovsrcu_synchronize(void);
+
+static struct ovsrcu_perthread *
+ovsrcu_perthread_get(void)
+{
+ struct ovsrcu_perthread *perthread;
+
+ ovsrcu_init();
+
+ perthread = pthread_getspecific(perthread_key);
+ if (!perthread) {
+ perthread = xmalloc(sizeof *perthread);
+ ovs_mutex_init(&perthread->mutex);
+ perthread->seqno = seq_read(global_seqno);
+ perthread->cbset = NULL;
+
+ ovs_mutex_lock(&ovsrcu_threads_mutex);
+ list_push_back(&ovsrcu_threads, &perthread->list_node);
+ ovs_mutex_unlock(&ovsrcu_threads_mutex);
+
+ pthread_setspecific(perthread_key, perthread);
+ }
+ return perthread;
+}
+
+/* Indicates the end of a quiescent state. See "Details" near the top of
+ * ovs-rcu.h.
+ *
+ * Quiescent states don't stack or nest, so this always ends a quiescent state
+ * even if ovsrcu_quiesce_start() was called multiple times in a row. */
+void
+ovsrcu_quiesce_end(void)
+{
+ ovsrcu_perthread_get();
+}
+
+static void
+ovsrcu_quiesced(void)
+{
+ if (single_threaded()) {
+ ovsrcu_call_postponed();
+ } else {
+ static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+ if (ovsthread_once_start(&once)) {
+ xpthread_create(NULL, NULL, ovsrcu_postpone_thread, NULL);
+ ovsthread_once_done(&once);
+ }
+ }
+}
+
+/* Indicates the beginning of a quiescent state. See "Details" near the top of
+ * ovs-rcu.h. */
+void
+ovsrcu_quiesce_start(void)
+{
+ struct ovsrcu_perthread *perthread;
+
+ ovsrcu_init();
+ perthread = pthread_getspecific(perthread_key);
+ if (perthread) {
+ pthread_setspecific(perthread_key, NULL);
+ ovsrcu_unregister__(perthread);
+ }
+
+ ovsrcu_quiesced();
+}
+
+/* Indicates a momentary quiescent state. See "Details" near the top of
+ * ovs-rcu.h. */
+void
+ovsrcu_quiesce(void)
+{
+ ovsrcu_init();
+ ovsrcu_perthread_get()->seqno = seq_read(global_seqno);
+ seq_change(global_seqno);
+
+ ovsrcu_quiesced();
+}
+
+static void
+ovsrcu_synchronize(void)
+{
+ uint64_t target_seqno;
+
+ if (single_threaded()) {
+ return;
+ }
+
+ target_seqno = seq_read(global_seqno);
+ ovsrcu_quiesce_start();
+
+ for (;;) {
+ uint64_t cur_seqno = seq_read(global_seqno);
+ struct ovsrcu_perthread *perthread;
+ bool done = true;
+
+ ovs_mutex_lock(&ovsrcu_threads_mutex);
+ LIST_FOR_EACH (perthread, list_node, &ovsrcu_threads) {
+ if (perthread->seqno <= target_seqno) {
+ done = false;
+ break;
+ }
+ }
+ ovs_mutex_unlock(&ovsrcu_threads_mutex);
+
+ if (done) {
+ break;
+ }
+
+ seq_wait(global_seqno, cur_seqno);
+ poll_block();
+ }
+ ovsrcu_quiesce_end();
+}
+
+/* Registers 'function' to be called, passing 'aux' as argument, after the
+ * next grace period.
+ *
+ * This function is more conveniently called through the ovsrcu_postpone()
+ * macro, which provides a type-safe way to allow 'function''s parameter to be
+ * any pointer type. */
+void
+ovsrcu_postpone__(void (*function)(void *aux), void *aux)
+{
+ struct ovsrcu_perthread *perthread = ovsrcu_perthread_get();
+ struct ovsrcu_cbset *cbset;
+ struct ovsrcu_cb *cb;
+
+ cbset = perthread->cbset;
+ if (!cbset) {
+ cbset = perthread->cbset = xmalloc(sizeof *perthread->cbset);
+ cbset->n_cbs = 0;
+ }
+
+ cb = &cbset->cbs[cbset->n_cbs++];
+ cb->function = function;
+ cb->aux = aux;
+
+ if (cbset->n_cbs >= ARRAY_SIZE(cbset->cbs)) {
+ ovsrcu_flush_cbset(perthread);
+ }
+}
+
+static bool
+ovsrcu_call_postponed(void)
+{
+ struct ovsrcu_cbset *cbset, *next_cbset;
+ struct list cbsets;
+
+ guarded_list_pop_all(&flushed_cbsets, &cbsets);
+ if (list_is_empty(&cbsets)) {
+ return false;
+ }
+
+ ovsrcu_synchronize();
+
+ LIST_FOR_EACH_SAFE (cbset, next_cbset, list_node, &cbsets) {
+ struct ovsrcu_cb *cb;
+
+ for (cb = cbset->cbs; cb < &cbset->cbs[cbset->n_cbs]; cb++) {
+ cb->function(cb->aux);
+ }
+ list_remove(&cbset->list_node);
+ free(cbset);
+ }
+
+ return true;
+}
+
+static void *
+ovsrcu_postpone_thread(void *arg OVS_UNUSED)
+{
+ pthread_detach(pthread_self());
+
+ for (;;) {
+ uint64_t seqno = seq_read(flushed_cbsets_seq);
+ if (!ovsrcu_call_postponed()) {
+ seq_wait(flushed_cbsets_seq, seqno);
+ poll_block();
+ }
+ }
+
+ OVS_NOT_REACHED();
+}
+
+static void
+ovsrcu_flush_cbset(struct ovsrcu_perthread *perthread)
+{
+ struct ovsrcu_cbset *cbset = perthread->cbset;
+
+ if (cbset) {
+ guarded_list_push_back(&flushed_cbsets, &cbset->list_node, SIZE_MAX);
+ perthread->cbset = NULL;
+
+ seq_change(flushed_cbsets_seq);
+ }
+}
+
+static void
+ovsrcu_unregister__(struct ovsrcu_perthread *perthread)
+{
+ if (perthread->cbset) {
+ ovsrcu_flush_cbset(perthread);
+ }
+
+ ovs_mutex_lock(&ovsrcu_threads_mutex);
+ list_remove(&perthread->list_node);
+ ovs_mutex_unlock(&ovsrcu_threads_mutex);
+
+ ovs_mutex_destroy(&perthread->mutex);
+ free(perthread);
+
+ seq_change(global_seqno);
+}
+
+static void
+ovsrcu_thread_exit_cb(void *perthread)
+{
+ ovsrcu_unregister__(perthread);
+}
+
+static void
+ovsrcu_init(void)
+{
+ static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+ if (ovsthread_once_start(&once)) {
+ global_seqno = seq_create();
+ xpthread_key_create(&perthread_key, ovsrcu_thread_exit_cb);
+ list_init(&ovsrcu_threads);
+ ovs_mutex_init(&ovsrcu_threads_mutex);
+
+ guarded_list_init(&flushed_cbsets);
+ flushed_cbsets_seq = seq_create();
+
+ ovsthread_once_done(&once);
+ }
+}
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef OVS_RCU_H
+#define OVS_RCU_H 1
+
+/* Read-Copy-Update (RCU)
+ * ======================
+ *
+ * Introduction
+ * ------------
+ *
+ * Atomic pointer access makes it pretty easy to implement lock-free
+ * algorithms. There is one big problem, though: when a writer updates a
+ * pointer to point to a new data structure, some thread might be reading the
+ * old version, and there's no convenient way to free the old version when all
+ * threads are done with the old version.
+ *
+ * The function ovsrcu_postpone() solves that problem. The function pointer
+ * passed in as its argument is called only after all threads are done with old
+ * versions of data structures. The function callback frees an old version of
+ * data no longer in use. This technique is called "read-copy-update", or RCU
+ * for short.
+ *
+ *
+ * Details
+ * -------
+ *
+ * A "quiescent state" is a time at which a thread holds no pointers to memory
+ * that is managed by RCU; that is, when the thread is known not to reference
+ * memory that might be an old version of some object freed via RCU. For
+ * example, poll_block() includes a quiescent state, as does
+ * ovs_mutex_cond_wait().
+ *
+ * The following functions manage the recognition of quiescent states:
+ *
+ * void ovsrcu_quiesce(void)
+ *
+ * Recognizes a momentary quiescent state in the current thread.
+ *
+ * void ovsrcu_quiesce_start(void)
+ * void ovsrcu_quiesce_end(void)
+ *
+ * Brackets a time period during which the current thread is quiescent.
+ *
+ * A newly created thread is initially active, not quiescent.
+ *
+ * When a quiescent state has occurred in every thread, we say that a "grace
+ * period" has occurred. Following a grace period, all of the callbacks
+ * postponed before the start of the grace period may be invoked. OVS takes
+ * care of this automatically through the RCU mechanism: while a process still
+ * has only a single thread, it invokes the postponed callbacks directly from
+ * ovsrcu_quiesce() and ovsrcu_quiesce_start(); after additional threads have
+ * been created, it creates an extra helper thread to invoke callbacks.
+ *
+ *
+ * Use
+ * ---
+ *
+ * Use OVSRCU_TYPE(TYPE) to declare a pointer to RCU-protected data, e.g. the
+ * following declares an RCU-protected "struct flow *" named flowp:
+ *
+ * OVSRCU_TYPE(struct flow *) flowp;
+ *
+ * Use ovsrcu_get(TYPE, VAR) to read an RCU-protected pointer, e.g. to read the
+ * pointer variable declared above:
+ *
+ * struct flow *flow = ovsrcu_get(struct flow *, flowp);
+ *
+ * Use ovsrcu_set() to write an RCU-protected pointer and ovsrcu_postpone() to
+ * free the previous data. If more than one thread can write the pointer, then
+ * some form of external synchronization, e.g. a mutex, is needed to prevent
+ * writers from interfering with one another. For example, to write the
+ * pointer variable declared above while safely freeing the old value:
+ *
+ * static struct ovs_mutex mutex = OVS_MUTEX_INITIALIZER;
+ *
+ * static void
+ * free_flow(struct flow *flow)
+ * {
+ * free(flow);
+ * }
+ *
+ * void
+ * change_flow(struct flow *new_flow)
+ * {
+ * ovs_mutex_lock(&mutex);
+ * ovsrcu_postpone(free_flow,
+ * ovsrcu_get_protected(struct flow *, &flowp));
+ * ovsrcu_set(&flowp, new_flow);
+ * ovs_mutex_unlock(&mutex);
+ * }
+ *
+ */
+
+#include "compiler.h"
+#include "ovs-atomic.h"
+
+/* Use OVSRCU_TYPE(TYPE) to declare a pointer to RCU-protected data, e.g. the
+ * following declares an RCU-protected "struct flow *" named flowp:
+ *
+ * OVSRCU_TYPE(struct flow *) flowp;
+ *
+ * Use ovsrcu_get(TYPE, VAR) to read an RCU-protected pointer, e.g. to read the
+ * pointer variable declared above:
+ *
+ * struct flow *flow = ovsrcu_get(struct flow *, flowp);
+ *
+ * If the pointer variable is currently protected against change (because
+ * the current thread holds a mutex that protects it), ovsrcu_get_protected()
+ * may be used instead. Only on the Alpha architecture is this likely to
+ * generate different code, but it may be useful documentation.
+ *
+ * (With GNU C or Clang, you get a compiler error if TYPE is wrong; other
+ * compilers will merrily carry along accepting the wrong type.)
+ */
+#if __GNUC__
+#define OVSRCU_TYPE(TYPE) struct { ATOMIC(TYPE) p; }
+#define ovsrcu_get__(TYPE, VAR, ORDER) \
+ ({ \
+ TYPE value__; \
+ \
+ atomic_read_explicit(CONST_CAST(ATOMIC(TYPE) *, &(VAR)->p), \
+ &value__, ORDER); \
+ \
+ value__; \
+ })
+#define ovsrcu_get(TYPE, VAR) \
+ CONST_CAST(TYPE, ovsrcu_get__(TYPE, VAR, memory_order_consume))
+#define ovsrcu_get_protected(TYPE, VAR) \
+ CONST_CAST(TYPE, ovsrcu_get__(TYPE, VAR, memory_order_relaxed))
+#else /* not GNU C */
+struct ovsrcu_pointer { ATOMIC(void *) p; };
+#define OVSRCU_TYPE(TYPE) struct ovsrcu_pointer
+static inline void *
+ovsrcu_get__(const struct ovsrcu_pointer *pointer, memory_order order)
+{
+ void *value;
+ atomic_read_explicit(&CONST_CAST(struct ovsrcu_pointer *, pointer)->p,
+ &value, order);
+ return value;
+}
+#define ovsrcu_get(TYPE, VAR) \
+ CONST_CAST(TYPE, ovsrcu_get__(VAR, memory_order_consume))
+#define ovsrcu_get_protected(TYPE, VAR) \
+ CONST_CAST(TYPE, ovsrcu_get__(VAR, memory_order_relaxed))
+#endif
+
+/* Writes VALUE to the RCU-protected pointer whose address is VAR.
+ *
+ * Writers require external synchronization (e.g. a mutex).  See "Use" above
+ * for an example. */
+#define ovsrcu_set(VAR, VALUE) \
+ atomic_store_explicit(&(VAR)->p, VALUE, memory_order_release)
+
+/* Calls FUNCTION passing ARG as its pointer-type argument following the next
+ * grace period.  See "Use" above for an example. */
+void ovsrcu_postpone__(void (*function)(void *aux), void *aux);
+#define ovsrcu_postpone(FUNCTION, ARG) \
+ ((void) sizeof((FUNCTION)(ARG), 1), \
+ (void) sizeof(*(ARG)), \
+ ovsrcu_postpone__((void (*)(void *))(FUNCTION), ARG))
+
+/* Quiescent states. */
+void ovsrcu_quiesce_start(void);
+void ovsrcu_quiesce_end(void);
+void ovsrcu_quiesce(void);
+
+#endif /* ovs-rcu.h */
#include <unistd.h>
#include "compiler.h"
#include "hash.h"
+#include "ovs-rcu.h"
#include "poll-loop.h"
#include "socket-util.h"
#include "util.h"
ovs_mutex_cond_wait(pthread_cond_t *cond, const struct ovs_mutex *mutex_)
{
struct ovs_mutex *mutex = CONST_CAST(struct ovs_mutex *, mutex_);
- int error = pthread_cond_wait(cond, &mutex->lock);
+ int error;
+
+ ovsrcu_quiesce_start();
+ error = pthread_cond_wait(cond, &mutex->lock);
+ ovsrcu_quiesce_end();
+
if (OVS_UNLIKELY(error)) {
ovs_abort(error, "pthread_cond_wait failed");
}
aux = *auxp;
free(auxp);
+ ovsrcu_quiesce_end();
return aux.start(aux.arg);
}
forbid_forking("multiple threads exist");
multithreaded = true;
+ ovsrcu_quiesce_end();
aux = xmalloc(sizeof *aux);
aux->start = start;
ovs_mutex_unlock(&once->mutex);
}
\f
+bool
+single_threaded(void)
+{
+ return !multithreaded;
+}
+
/* Asserts that the process has not yet created any threads (beyond the initial
* thread).
*
}
}
+#ifndef _WIN32
/* Forks the current process (checking that this is allowed). Aborts with
* VLOG_FATAL if fork() returns an error, and otherwise returns the value
* returned by fork().
}
return pid;
}
+#endif
/* Notes that the process must not call fork() from now on, for the specified
* 'reason'. (The process may still fork() if it execs itself immediately
return !must_not_fork;
}
\f
-/* ovsthread_counter.
- *
- * We implement the counter as an array of N_COUNTERS individual counters, each
- * with its own lock. Each thread uses one of the counters chosen based on a
- * hash of the thread's ID, the idea being that, statistically, different
- * threads will tend to use different counters and therefore avoid
- * interfering with each other.
- *
- * Undoubtedly, better implementations are possible. */
-
-/* Basic counter structure. */
-struct ovsthread_counter__ {
- struct ovs_mutex mutex;
- unsigned long long int value;
-};
-
-/* Pad the basic counter structure to 64 bytes to avoid cache line
- * interference. */
-struct ovsthread_counter {
- struct ovsthread_counter__ c;
- char pad[ROUND_UP(sizeof(struct ovsthread_counter__), 64)
- - sizeof(struct ovsthread_counter__)];
-};
-
-#define N_COUNTERS 16
+/* ovsthread_stats. */
-struct ovsthread_counter *
-ovsthread_counter_create(void)
+void
+ovsthread_stats_init(struct ovsthread_stats *stats)
{
- struct ovsthread_counter *c;
int i;
- c = xmalloc(N_COUNTERS * sizeof *c);
- for (i = 0; i < N_COUNTERS; i++) {
- ovs_mutex_init(&c[i].c.mutex);
- c[i].c.value = 0;
+ ovs_mutex_init(&stats->mutex);
+ for (i = 0; i < ARRAY_SIZE(stats->buckets); i++) {
+ stats->buckets[i] = NULL;
}
- return c;
}
void
-ovsthread_counter_destroy(struct ovsthread_counter *c)
+ovsthread_stats_destroy(struct ovsthread_stats *stats)
{
- if (c) {
- int i;
-
- for (i = 0; i < N_COUNTERS; i++) {
- ovs_mutex_destroy(&c[i].c.mutex);
- }
- free(c);
- }
+ ovs_mutex_destroy(&stats->mutex);
}
-void
-ovsthread_counter_inc(struct ovsthread_counter *c, unsigned long long int n)
+void *
+ovsthread_stats_bucket_get(struct ovsthread_stats *stats,
+ void *(*new_bucket)(void))
{
- c = &c[hash_int(ovsthread_id_self(), 0) % N_COUNTERS];
-
- ovs_mutex_lock(&c->c.mutex);
- c->c.value += n;
- ovs_mutex_unlock(&c->c.mutex);
+ unsigned int idx = ovsthread_id_self() & (ARRAY_SIZE(stats->buckets) - 1);
+ void *bucket = stats->buckets[idx];
+ if (!bucket) {
+ ovs_mutex_lock(&stats->mutex);
+ bucket = stats->buckets[idx];
+ if (!bucket) {
+ bucket = stats->buckets[idx] = new_bucket();
+ }
+ ovs_mutex_unlock(&stats->mutex);
+ }
+ return bucket;
}
-unsigned long long int
-ovsthread_counter_read(const struct ovsthread_counter *c)
+size_t
+ovs_thread_stats_next_bucket(const struct ovsthread_stats *stats, size_t i)
{
- unsigned long long int sum;
- int i;
-
- sum = 0;
- for (i = 0; i < N_COUNTERS; i++) {
- ovs_mutex_lock(&c[i].c.mutex);
- sum += c[i].c.value;
- ovs_mutex_unlock(&c[i].c.mutex);
+ for (; i < ARRAY_SIZE(stats->buckets); i++) {
+ if (stats->buckets[i]) {
+ break;
+ }
}
- return sum;
+ return i;
}
+
\f
/* Parses /proc/cpuinfo for the total number of physical cores on this system
* across all CPU packages, not counting hyper-threads.
static long int n_cores;
if (ovsthread_once_start(&once)) {
+#ifndef _WIN32
parse_cpuinfo(&n_cores);
if (!n_cores) {
n_cores = sysconf(_SC_NPROCESSORS_ONLN);
}
+#else
+ SYSTEM_INFO sysinfo;
+ GetSystemInfo(&sysinfo);
+ n_cores = sysinfo.dwNumberOfProcessors;
+#endif
ovsthread_once_done(&once);
}
*
* Fully thread-safe. */
-struct ovsthread_counter *ovsthread_counter_create(void);
-void ovsthread_counter_destroy(struct ovsthread_counter *);
-void ovsthread_counter_inc(struct ovsthread_counter *, unsigned long long int);
-unsigned long long int ovsthread_counter_read(
- const struct ovsthread_counter *);
+struct ovsthread_stats {
+ struct ovs_mutex mutex;
+ void *volatile buckets[16];
+};
+
+void ovsthread_stats_init(struct ovsthread_stats *);
+void ovsthread_stats_destroy(struct ovsthread_stats *);
+
+void *ovsthread_stats_bucket_get(struct ovsthread_stats *,
+ void *(*new_bucket)(void));
+
+#define OVSTHREAD_STATS_FOR_EACH_BUCKET(BUCKET, IDX, STATS) \
+ for ((IDX) = ovs_thread_stats_next_bucket(STATS, 0); \
+ ((IDX) < ARRAY_SIZE((STATS)->buckets) \
+ ? ((BUCKET) = (STATS)->buckets[IDX], true) \
+ : false); \
+ (IDX) = ovs_thread_stats_next_bucket(STATS, (IDX) + 1))
+size_t ovs_thread_stats_next_bucket(const struct ovsthread_stats *, size_t);
\f
+bool single_threaded(void);
+
void assert_single_threaded_at(const char *where);
#define assert_single_threaded() assert_single_threaded_at(SOURCE_LOCATOR)
+#ifndef _WIN32
pid_t xfork_at(const char *where);
#define xfork() xfork_at(SOURCE_LOCATOR)
+#endif
void forbid_forking(const char *reason);
bool may_fork(void);
if (ovsthread_once_start(&once)) {
hmap_init(&addrs);
for (node = nodes; node < &nodes[ARRAY_SIZE(nodes)]; node++) {
- hmap_insert(&addrs, &node->hmap_node,
- hash_2words(node->ea64, node->ea64 >> 32));
+ hmap_insert(&addrs, &node->hmap_node, hash_uint64(node->ea64));
}
ovsthread_once_done(&once);
}
ea64 = eth_addr_to_uint64(ea);
- HMAP_FOR_EACH_IN_BUCKET (node, hmap_node, hash_2words(ea64, ea64 >> 32),
- &addrs) {
+ HMAP_FOR_EACH_IN_BUCKET (node, hmap_node, hash_uint64(ea64), &addrs) {
if (node->ea64 == ea64) {
return true;
}
/* Inserts a VLAN header according to the given TCI.  The packet passed in
 * must be an Ethernet packet.  Ignores the CFI bit of 'tci', using 0 instead.
*
- * Also sets 'packet->l2' to point to the new Ethernet header. */
+ * Also sets 'packet->l2' to point to the new Ethernet header and adjusts
+ * the layer offsets accordingly. */
void
eth_push_vlan(struct ofpbuf *packet, ovs_be16 tpid, ovs_be16 tci)
{
- struct eth_header *eh = packet->data;
struct vlan_eth_header *veh;
/* Insert new 802.1Q header. */
- struct vlan_eth_header tmp;
- memcpy(tmp.veth_dst, eh->eth_dst, ETH_ADDR_LEN);
- memcpy(tmp.veth_src, eh->eth_src, ETH_ADDR_LEN);
- tmp.veth_type = tpid;
- tmp.veth_tci = tci & htons(~VLAN_CFI);
- tmp.veth_next_type = eh->eth_type;
-
- veh = ofpbuf_push_uninit(packet, VLAN_HEADER_LEN);
- memcpy(veh, &tmp, sizeof tmp);
-
- packet->l2 = packet->data;
+ veh = ofpbuf_resize_l2(packet, VLAN_HEADER_LEN);
+ memmove(veh, (char *)veh + VLAN_HEADER_LEN, 2 * ETH_ADDR_LEN);
+ veh->veth_type = tpid;
+ veh->veth_tci = tci & htons(~VLAN_CFI);
}
/* Removes outermost VLAN header (if any is present) from 'packet'.
eth_pop_vlan(struct ofpbuf *packet)
{
struct vlan_eth_header *veh = packet->l2;
+
if (packet->size >= sizeof *veh
&& veh->veth_type == htons(ETH_TYPE_VLAN)) {
- struct eth_header tmp;
- memcpy(tmp.eth_dst, veh->veth_dst, ETH_ADDR_LEN);
- memcpy(tmp.eth_src, veh->veth_src, ETH_ADDR_LEN);
- tmp.eth_type = veh->veth_next_type;
-
- ofpbuf_pull(packet, VLAN_HEADER_LEN);
- packet->l2 = (char*)packet->l2 + VLAN_HEADER_LEN;
- memcpy(packet->data, &tmp, sizeof tmp);
+ memmove((char *)veh + VLAN_HEADER_LEN, veh, 2 * ETH_ADDR_LEN);
+ ofpbuf_resize_l2(packet, -VLAN_HEADER_LEN);
}
}
static void
set_ethertype(struct ofpbuf *packet, ovs_be16 eth_type)
{
- struct eth_header *eh = packet->data;
+ struct eth_header *eh = packet->l2;
if (eh->eth_type == htons(ETH_TYPE_VLAN)) {
ovs_be16 *p;
+ char *l2_5 = ofpbuf_get_l2_5(packet);
+
p = ALIGNED_CAST(ovs_be16 *,
- (char *)(packet->l2_5 ? packet->l2_5 : packet->l3) - 2);
+ (l2_5 ? l2_5 : (char *)ofpbuf_get_l3(packet)) - 2);
*p = eth_type;
} else {
eh->eth_type = eth_type;
static bool is_mpls(struct ofpbuf *packet)
{
- return packet->l2_5 != NULL;
+ return packet->l2_5_ofs != UINT16_MAX;
}
/* Set time to live (TTL) of an MPLS label stack entry (LSE). */
return lse;
}
-/* Push an new MPLS stack entry onto the MPLS stack and adjust 'packet->l2' and
- * 'packet->l2_5' accordingly. The new entry will be the outermost entry on
- * the stack.
- *
- * Previous to calling this function, 'packet->l2_5' must be set; if the MPLS
- * label to be pushed will be the first label in 'packet', then it should be
- * the same as 'packet->l3'. */
-static void
-push_mpls_lse(struct ofpbuf *packet, struct mpls_hdr *mh)
-{
- char * header;
- size_t len;
- header = ofpbuf_push_uninit(packet, MPLS_HLEN);
- len = (char *)packet->l2_5 - (char *)packet->l2;
- memmove(header, packet->l2, len);
- memcpy(header + len, mh, sizeof *mh);
- packet->l2 = (char*)packet->l2 - MPLS_HLEN;
- packet->l2_5 = (char*)packet->l2_5 - MPLS_HLEN;
-}
-
/* Sets the label stack entry of the outermost MPLS header. */
void
set_mpls_lse(struct ofpbuf *packet, ovs_be32 mpls_lse)
{
- struct mpls_hdr *mh = packet->l2_5;
-
/* Packet type should be MPLS to set label stack entry. */
if (is_mpls(packet)) {
+ struct mpls_hdr *mh = ofpbuf_get_l2_5(packet);
+
/* Update mpls label stack entry. */
mh->mpls_lse = mpls_lse;
}
void
push_mpls(struct ofpbuf *packet, ovs_be16 ethtype, ovs_be32 lse)
{
- struct mpls_hdr mh;
+ char * header;
+ size_t len;
if (!eth_type_mpls(ethtype)) {
return;
}
- set_ethertype(packet, ethtype);
-
if (!is_mpls(packet)) {
- /* Set MPLS label stack entry. */
- packet->l2_5 = packet->l3;
+ /* Set MPLS label stack offset. */
+ packet->l2_5_ofs = packet->l3_ofs;
}
+ set_ethertype(packet, ethtype);
+
/* Push new MPLS shim header onto packet. */
- mh.mpls_lse = lse;
- push_mpls_lse(packet, &mh);
+ len = packet->l2_5_ofs;
+ header = ofpbuf_resize_l2_5(packet, MPLS_HLEN);
+ memmove(header, header + MPLS_HLEN, len);
+ memcpy(header + len, &lse, sizeof lse);
}
/* If 'packet' is an MPLS packet, removes its outermost MPLS label stack entry.
void
pop_mpls(struct ofpbuf *packet, ovs_be16 ethtype)
{
- struct mpls_hdr *mh = NULL;
-
if (is_mpls(packet)) {
- size_t len;
- mh = packet->l2_5;
- len = (char*)packet->l2_5 - (char*)packet->l2;
+ struct mpls_hdr *mh = ofpbuf_get_l2_5(packet);
+ size_t len = packet->l2_5_ofs;
+
set_ethertype(packet, ethtype);
if (mh->mpls_lse & htonl(MPLS_BOS_MASK)) {
- packet->l2_5 = NULL;
- } else {
- packet->l2_5 = (char*)packet->l2_5 + MPLS_HLEN;
+ ofpbuf_set_l2_5(packet, NULL);
}
/* Shift the l2 header forward. */
memmove((char*)packet->data + MPLS_HLEN, packet->data, len);
- packet->size -= MPLS_HLEN;
- packet->data = (char*)packet->data + MPLS_HLEN;
- packet->l2 = (char*)packet->l2 + MPLS_HLEN;
+ ofpbuf_resize_l2_5(packet, -MPLS_HLEN);
}
}
/* Populates 'b' with an Ethernet II packet headed with the given 'eth_dst',
* 'eth_src' and 'eth_type' parameters. A payload of 'size' bytes is allocated
* in 'b' and returned. This payload may be populated with appropriate
- * information by the caller. Sets 'b''s 'l2' and 'l3' pointers to the
+ * information by the caller. Sets 'b''s 'l2' pointer and 'l3' offset to the
* Ethernet header and payload respectively. Aligns b->l3 on a 32-bit
* boundary.
*
eth->eth_type = htons(eth_type);
b->l2 = eth;
- b->l3 = data;
+ ofpbuf_set_l3(b, data);
return data;
}
packet_set_ipv4_addr(struct ofpbuf *packet,
ovs_16aligned_be32 *addr, ovs_be32 new_addr)
{
- struct ip_header *nh = packet->l3;
+ struct ip_header *nh = ofpbuf_get_l3(packet);
ovs_be32 old_addr = get_16aligned_be32(addr);
+ size_t l4_size = ofpbuf_get_l4_size(packet);
- if (nh->ip_proto == IPPROTO_TCP && packet->l7) {
- struct tcp_header *th = packet->l4;
+ if (nh->ip_proto == IPPROTO_TCP && l4_size >= TCP_HEADER_LEN) {
+ struct tcp_header *th = ofpbuf_get_l4(packet);
th->tcp_csum = recalc_csum32(th->tcp_csum, old_addr, new_addr);
- } else if (nh->ip_proto == IPPROTO_UDP && packet->l7) {
- struct udp_header *uh = packet->l4;
+ } else if (nh->ip_proto == IPPROTO_UDP && l4_size >= UDP_HEADER_LEN ) {
+ struct udp_header *uh = ofpbuf_get_l4(packet);
if (uh->udp_csum) {
uh->udp_csum = recalc_csum32(uh->udp_csum, old_addr, new_addr);
/* Returns true if the packet contains at least one routing header where
 * segments_left > 0.
*
- * This function assumes that L3 and L4 markers are set in the packet. */
+ * This function assumes that L3 and L4 offsets are set in the packet. */
static bool
packet_rh_present(struct ofpbuf *packet)
{
int nexthdr;
size_t len;
size_t remaining;
- uint8_t *data = packet->l3;
+ uint8_t *data = ofpbuf_get_l3(packet);
- remaining = (uint8_t *)packet->l4 - (uint8_t *)packet->l3;
+ remaining = packet->l4_ofs - packet->l3_ofs;
if (remaining < sizeof *nh) {
return false;
packet_update_csum128(struct ofpbuf *packet, uint8_t proto,
ovs_16aligned_be32 addr[4], const ovs_be32 new_addr[4])
{
- if (proto == IPPROTO_TCP && packet->l7) {
- struct tcp_header *th = packet->l4;
+ size_t l4_size = ofpbuf_get_l4_size(packet);
+
+ if (proto == IPPROTO_TCP && l4_size >= TCP_HEADER_LEN) {
+ struct tcp_header *th = ofpbuf_get_l4(packet);
th->tcp_csum = recalc_csum128(th->tcp_csum, addr, new_addr);
- } else if (proto == IPPROTO_UDP && packet->l7) {
- struct udp_header *uh = packet->l4;
+ } else if (proto == IPPROTO_UDP && l4_size >= UDP_HEADER_LEN) {
+ struct udp_header *uh = ofpbuf_get_l4(packet);
if (uh->udp_csum) {
uh->udp_csum = recalc_csum128(uh->udp_csum, addr, new_addr);
packet_set_ipv4(struct ofpbuf *packet, ovs_be32 src, ovs_be32 dst,
uint8_t tos, uint8_t ttl)
{
- struct ip_header *nh = packet->l3;
+ struct ip_header *nh = ofpbuf_get_l3(packet);
if (get_16aligned_be32(&nh->ip_src) != src) {
packet_set_ipv4_addr(packet, &nh->ip_src, src);
/* Modifies the IPv6 header fields of 'packet' to be consistent with 'src',
* 'dst', 'traffic class', and 'next hop'. Updates 'packet''s L4 checksums as
* appropriate. 'packet' must contain a valid IPv6 packet with correctly
- * populated l[347] markers. */
+ * populated l[34] offsets. */
void
packet_set_ipv6(struct ofpbuf *packet, uint8_t proto, const ovs_be32 src[4],
const ovs_be32 dst[4], uint8_t key_tc, ovs_be32 key_fl,
uint8_t key_hl)
{
- struct ovs_16aligned_ip6_hdr *nh = packet->l3;
+ struct ovs_16aligned_ip6_hdr *nh = ofpbuf_get_l3(packet);
if (memcmp(&nh->ip6_src, src, sizeof(ovs_be32[4]))) {
packet_set_ipv6_addr(packet, proto, nh->ip6_src.be32, src, true);
/* Sets the TCP source and destination port ('src' and 'dst' respectively) of
* the TCP header contained in 'packet'. 'packet' must be a valid TCP packet
- * with its l4 marker properly populated. */
+ * with its l4 offset properly populated. */
void
packet_set_tcp_port(struct ofpbuf *packet, ovs_be16 src, ovs_be16 dst)
{
- struct tcp_header *th = packet->l4;
+ struct tcp_header *th = ofpbuf_get_l4(packet);
packet_set_port(&th->tcp_src, src, &th->tcp_csum);
packet_set_port(&th->tcp_dst, dst, &th->tcp_csum);
/* Sets the UDP source and destination port ('src' and 'dst' respectively) of
* the UDP header contained in 'packet'. 'packet' must be a valid UDP packet
- * with its l4 marker properly populated. */
+ * with its l4 offset properly populated. */
void
packet_set_udp_port(struct ofpbuf *packet, ovs_be16 src, ovs_be16 dst)
{
- struct udp_header *uh = packet->l4;
+ struct udp_header *uh = ofpbuf_get_l4(packet);
if (uh->udp_csum) {
packet_set_port(&uh->udp_src, src, &uh->udp_csum);
/* Sets the SCTP source and destination port ('src' and 'dst' respectively) of
* the SCTP header contained in 'packet'. 'packet' must be a valid SCTP packet
- * with its l4 marker properly populated. */
+ * with its l4 offset properly populated. */
void
packet_set_sctp_port(struct ofpbuf *packet, ovs_be16 src, ovs_be16 dst)
{
- struct sctp_header *sh = packet->l4;
+ struct sctp_header *sh = ofpbuf_get_l4(packet);
ovs_be32 old_csum, old_correct_csum, new_csum;
uint16_t tp_len = packet->size - ((uint8_t*)sh - (uint8_t*)packet->data);
old_csum = sh->sctp_csum;
sh->sctp_csum = 0;
- old_correct_csum = crc32c(packet->l4, tp_len);
+ old_correct_csum = crc32c((void *)sh, tp_len);
sh->sctp_src = src;
sh->sctp_dst = dst;
- new_csum = crc32c(packet->l4, tp_len);
+ new_csum = crc32c((void *)sh, tp_len);
sh->sctp_csum = old_csum ^ old_correct_csum ^ new_csum;
}
-/* If 'packet' is a TCP packet, returns the TCP flags. Otherwise, returns 0.
- *
- * 'flow' must be the flow corresponding to 'packet' and 'packet''s header
- * pointers must be properly initialized (e.g. with flow_extract()). */
-uint16_t
-packet_get_tcp_flags(const struct ofpbuf *packet, const struct flow *flow)
-{
- if (dl_type_is_ip_any(flow->dl_type) &&
- flow->nw_proto == IPPROTO_TCP && packet->l7) {
- const struct tcp_header *tcp = packet->l4;
- return TCP_FLAGS(tcp->tcp_ctl);
- } else {
- return 0;
- }
-}
-
const char *
packet_tcp_flag_to_string(uint32_t flag)
{
}
/* Appends a string representation of the TCP flags value 'tcp_flags'
- * (e.g. obtained via packet_get_tcp_flags() or TCP_FLAGS) to 's', in the
+ * (e.g. from struct flow.tcp_flags or obtained via TCP_FLAGS) to 's', in the
* format used by tcpdump. */
void
packet_format_tcp_flags(struct ds *s, uint16_t tcp_flags)
ds_put_cstr(s, "[800]");
}
}
-
-void pkt_metadata_init(struct pkt_metadata *md, const struct flow_tnl *tnl,
- const uint32_t skb_priority,
- const uint32_t pkt_mark,
- const union flow_in_port *in_port)
-{
-
- tnl ? memcpy(&md->tunnel, tnl, sizeof(md->tunnel))
- : memset(&md->tunnel, 0, sizeof(md->tunnel));
-
- md->skb_priority = skb_priority;
- md->pkt_mark = pkt_mark;
- md->in_port.odp_port = in_port ? in_port->odp_port : ODPP_NONE;
-}
-
-void pkt_metadata_from_flow(struct pkt_metadata *md, const struct flow *flow)
-{
- pkt_metadata_init(md, &flow->tunnel, flow->skb_priority,
- flow->pkt_mark, &flow->in_port);
-}
#include "flow.h"
#include "openvswitch/types.h"
#include "random.h"
+#include "hash.h"
#include "util.h"
struct ofpbuf;
/* Datapath packet metadata */
struct pkt_metadata {
+ uint32_t recirc_id; /* Recirculation id carried with the
+ recirculating packets. 0 for packets
+ received from the wire. */
+ uint32_t dp_hash; /* hash value computed by the recirculation
+ action. */
struct flow_tnl tunnel; /* Encapsulating tunnel parameters. */
uint32_t skb_priority; /* Packet priority for QoS. */
uint32_t pkt_mark; /* Packet mark. */
};
#define PKT_METADATA_INITIALIZER(PORT) \
- (struct pkt_metadata){ { 0, 0, 0, 0, 0, 0}, 0, 0, {(PORT)} }
+ (struct pkt_metadata){ 0, 0, { 0, 0, 0, 0, 0, 0}, 0, 0, {(PORT)} }
-void pkt_metadata_init(struct pkt_metadata *md, const struct flow_tnl *tnl,
- const uint32_t skb_priority,
- const uint32_t pkt_mark,
- const union flow_in_port *in_port);
-void pkt_metadata_from_flow(struct pkt_metadata *md, const struct flow *flow);
+static inline struct pkt_metadata
+pkt_metadata_from_flow(const struct flow *flow)
+{
+ struct pkt_metadata md;
+
+ md.recirc_id = flow->recirc_id;
+ md.dp_hash = flow->dp_hash;
+ md.tunnel = flow->tunnel;
+ md.skb_priority = flow->skb_priority;
+ md.pkt_mark = flow->pkt_mark;
+ md.in_port = flow->in_port;
+
+ return md;
+}
bool dpid_from_string(const char *s, uint64_t *dpidp);
| ((uint64_t) ea[4] << 8)
| ea[5]);
}
+static inline uint64_t eth_addr_vlan_to_uint64(const uint8_t ea[ETH_ADDR_LEN],
+ uint16_t vlan)
+{
+ return (((uint64_t)vlan << 48) | eth_addr_to_uint64(ea));
+}
static inline void eth_addr_from_uint64(uint64_t x, uint8_t ea[ETH_ADDR_LEN])
{
ea[0] = x >> 40;
/* Set the top bit to indicate random Nicira address. */
ea[3] |= 0x80;
}
+static inline uint32_t hash_mac(const uint8_t ea[ETH_ADDR_LEN],
+ const uint16_t vlan, const uint32_t basis)
+{
+ return hash_uint64_basis(eth_addr_vlan_to_uint64(ea, vlan), basis);
+}
bool eth_addr_is_reserved(const uint8_t ea[ETH_ADDR_LEN]);
bool eth_addr_from_string(const char *, uint8_t ea[ETH_ADDR_LEN]);
void packet_set_udp_port(struct ofpbuf *, ovs_be16 src, ovs_be16 dst);
void packet_set_sctp_port(struct ofpbuf *, ovs_be16 src, ovs_be16 dst);
-uint16_t packet_get_tcp_flags(const struct ofpbuf *, const struct flow *);
void packet_format_tcp_flags(struct ds *, uint16_t);
const char *packet_tcp_flag_to_string(uint32_t flag);
uint32_t hash;
uint32_t seq;
uint8_t flags;
+ const char *l7 = ofpbuf_get_tcp_payload(packet);
if (flow->dl_type != htons(ETH_TYPE_IP)
|| flow->nw_proto != IPPROTO_TCP
- || !packet->l7) {
+ || !l7) {
return NULL;
}
- tcp = packet->l4;
+ tcp = ofpbuf_get_l4(packet);
flags = TCP_FLAGS(tcp->tcp_ctl);
- l7_length = (char *) ofpbuf_end(packet) - (char *) packet->l7;
+ l7_length = (char *) ofpbuf_tail(packet) - l7;
seq = ntohl(get_16aligned_be32(&tcp->tcp_seq));
/* Construct key. */
* continually expanding it. */
ofpbuf_shift(payload, (char *) payload->base - (char *) payload->data);
- ofpbuf_put(payload, packet->l7, l7_length);
+ ofpbuf_put(payload, l7, l7_length);
stream->seq_no += l7_length;
return payload;
} else {
if (rconn_is_connected(rc)) {
COVERAGE_INC(rconn_queued);
copy_to_monitor(rc, b);
- b->private_p = counter;
+
if (counter) {
rconn_packet_counter_inc(counter, b->size);
}
+
+ /* Use 'l2' as a private pointer while 'b' is in txq. */
+ ovs_assert(b->l2 == b->data);
+ b->l2 = counter;
+
list_push_back(&rc->txq, &b->list_node);
/* If the queue was empty before we added 'b', try to send some
{
struct ofpbuf *msg = ofpbuf_from_list(rc->txq.next);
unsigned int n_bytes = msg->size;
- struct rconn_packet_counter *counter = msg->private_p;
+ struct rconn_packet_counter *counter = msg->l2;
int retval;
/* Eagerly remove 'msg' from the txq. We can't remove it from the list
* after sending, if sending is successful, because it is then owned by the
* vconn, which might have freed it already. */
list_remove(&msg->list_node);
+ msg->l2 = msg->data; /* Restore 'l2'. */
retval = vconn_send(rc->vconn, msg);
if (retval) {
+ msg->l2 = counter; /* 'l2' is a private pointer while msg is in txq. */
list_push_front(&rc->txq, &msg->list_node);
if (retval != EAGAIN) {
report_error(rc, retval);
}
while (!list_is_empty(&rc->txq)) {
struct ofpbuf *b = ofpbuf_from_list(list_pop_front(&rc->txq));
- struct rconn_packet_counter *counter = b->private_p;
+ struct rconn_packet_counter *counter = b->l2;
if (counter) {
rconn_packet_counter_dec(counter, b->size);
}
#ifndef SFLOW_H
#define SFLOW_H 1
+#ifdef _WIN32
+#include "windefs.h"
+#endif
+
typedef enum {
SFL_DSCLASS_IFINDEX = 0,
SFL_DSCLASS_VLAN = 1,
* Thus, this file compiles all of the code regardless of the target, by
* writing "if (LINUX)" instead of "#ifdef __linux__". */
#ifdef __linux__
-#define LINUX 0
-#else
#define LINUX 1
+#else
+#define LINUX 0
#endif
#ifndef O_DIRECTORY
success = false;
val = dscp << 2;
if (setsockopt(fd, IPPROTO_IP, IP_TOS, &val, sizeof val)) {
+#ifndef _WIN32
if (sock_errno() != ENOPROTOOPT) {
+#else
+ if (sock_errno() != WSAENOPROTOOPT) {
+#endif
return sock_errno();
}
} else {
success = true;
}
if (setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &val, sizeof val)) {
+#ifndef _WIN32
if (sock_errno() != ENOPROTOOPT) {
+#else
+ if (sock_errno() != WSAENOPROTOOPT) {
+#endif
return sock_errno();
}
} else {
int dirfd;
int len;
- if (LINUX) {
+ if (!LINUX) {
return ENAMETOOLONG;
}
#include <config.h>
#include <stdio.h>
+#include <sys/types.h>
#ifdef _WIN32
#undef snprintf
}
return needed;
}
+
+int
+fseeko(FILE *stream, off_t offset, int whence)
+{
+ int error;
+ error = _fseeki64(stream, offset, whence);
+ if (error) {
+ return -1;
+ }
+ return error;
+}
#endif /* _WIN32 */
#include <stdarg.h>
#include <stddef.h>
+#include <sys/types.h>
/* Windows libc has defective snprintf() and vsnprintf():
*
#undef vsnprintf
#define vsnprintf ovs_vsnprintf
int ovs_vsnprintf(char *, size_t, const char *, va_list);
+
+int fseeko(FILE *stream, off_t offset, int whence);
#endif /* _WIN32 */
#endif /* stdio.h wrapper */
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
list_remove(&stp->node);
ovs_mutex_unlock(&mutex);
free(stp->name);
- ovs_refcount_destroy(&stp->ref_cnt);
free(stp);
}
}
return (state & (STP_DISABLED | STP_LEARNING | STP_FORWARDING)) != 0;
}
+/* Returns true if 'state' is one in which BPDUs should be received and
+ * transmitted on a port, false otherwise. */
+bool
+stp_listen_in_state(enum stp_state state)
+{
+ return (state &
+ (STP_LISTENING | STP_LEARNING | STP_FORWARDING)) != 0;
+}
+
/* Returns the name for the given 'role' (for use in debugging and log
* messages). */
const char *
pkt = ofpbuf_new(ETH_HEADER_LEN + LLC_HEADER_LEN + bpdu_size);
pkt->l2 = eth = ofpbuf_put_zeros(pkt, sizeof *eth);
llc = ofpbuf_put_zeros(pkt, sizeof *llc);
- pkt->l3 = ofpbuf_put(pkt, bpdu, bpdu_size);
+ ofpbuf_set_l3(pkt, ofpbuf_put(pkt, bpdu, bpdu_size));
/* 802.2 header. */
memcpy(eth->eth_dst, eth_addr_stp, ETH_ADDR_LEN);
const char *stp_state_name(enum stp_state);
bool stp_forward_in_state(enum stp_state);
bool stp_learn_in_state(enum stp_state);
+bool stp_listen_in_state(enum stp_state);
/* Role of an STP port. */
enum stp_role {
break;
default:
- NOT_REACHED();
+ OVS_NOT_REACHED();
}
}
retval = setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
if (retval) {
- VLOG_ERR("%s: setsockopt(TCP_NODELAY): %s", name, ovs_strerror(errno));
- close(fd);
- return errno;
+ int error = sock_errno();
+ VLOG_ERR("%s: setsockopt(TCP_NODELAY): %s",
+ name, sock_strerror(error));
+ closesocket(fd);
+ return error;
}
return new_fd_stream(name, fd, connect_status, streamp);
#include "ofpbuf.h"
#include "openflow/nicira-ext.h"
#include "openflow/openflow.h"
+#include "ovs-thread.h"
#include "packets.h"
#include "poll-loop.h"
#include "random.h"
+#include "socket-util.h"
#include "util.h"
#include "vlog.h"
#endif
};
+#ifdef _WIN32
+static void
+do_winsock_start(void)
+{
+ WSADATA wsaData;
+ int error;
+
+ error = WSAStartup(MAKEWORD(2, 2), &wsaData);
+ if (error != 0) {
+ VLOG_FATAL("WSAStartup failed: %s", sock_strerror(sock_errno()));
+ }
+}
+
+static void
+winsock_start(void)
+{
+ static pthread_once_t once = PTHREAD_ONCE_INIT;
+ pthread_once(&once, do_winsock_start);
+}
+#endif
+
/* Check the validity of the stream class structures. */
static void
check_stream_classes(void)
COVERAGE_INC(stream_open);
+#ifdef _WIN32
+ winsock_start();
+#endif
+
/* Look up the class. */
error = stream_lookup_class(name, &class);
if (!class) {
COVERAGE_INC(pstream_open);
+#ifdef _WIN32
+ winsock_start();
+#endif
+
/* Look up the class. */
error = pstream_lookup_class(name, &class);
if (!class) {
--- /dev/null
+/*-
+ * Copyright (c) 1990, 1993
+ * The Regents of the University of California. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the University nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <config.h>
+#include "util.h"
+
+/*
+ * Get next token from string *stringp, where tokens are possibly-empty
+ * strings separated by characters from delim.
+ *
+ * Writes NULs into the string at *stringp to end tokens.
+ * delim need not remain constant from call to call.
+ * On return, *stringp points past the last NUL written (if there might
+ * be further tokens), or is NULL (if there are definitely no more tokens).
+ *
+ * If *stringp is NULL, strsep returns NULL.
+ */
+#define _DIAGASSERT(q) ovs_assert(q)
+char *
+strsep(char **stringp, const char *delim)
+{
+ char *s;
+ const char *spanp;
+ int c, sc;
+ char *tok;
+
+ _DIAGASSERT(stringp != NULL);
+ _DIAGASSERT(delim != NULL);
+
+ if ((s = *stringp) == NULL)
+ return (NULL);
+ for (tok = s;;) {
+ c = *s++;
+ spanp = delim;
+ do {
+ if ((sc = *spanp++) == c) {
+ if (c == 0)
+ s = NULL;
+ else
+ s[-1] = 0;
+ *stringp = s;
+ return (tok);
+ }
+ } while (sc != 0);
+ }
+ /* NOTREACHED */
+}
#include "fatal-signal.h"
#include "hash.h"
#include "hmap.h"
+#include "ovs-rcu.h"
#include "ovs-thread.h"
#include "signals.h"
#include "seq.h"
time_left = timeout_when - now;
}
+ if (!time_left) {
+ ovsrcu_quiesce();
+ } else {
+ ovsrcu_quiesce_start();
+ }
+
#ifndef _WIN32
retval = poll(pollfds, n_pollfds, time_left);
if (retval < 0) {
}
#endif
+ if (time_left) {
+ ovsrcu_quiesce_end();
+ }
+
if (deadline <= time_msec()) {
#ifndef _WIN32
fatal_signal_handler(SIGALRM);
timespec_add(&monotonic_clock.warp, &monotonic_clock.warp, &ts);
ovs_mutex_unlock(&monotonic_clock.mutex);
seq_change(timewarp_seq);
- poll(NULL, 0, 10); /* give threads (eg. monitor) some chances to run */
+    /* Give threads (e.g. the monitor) a chance to run. */
+#ifndef _WIN32
+ poll(NULL, 0, 10);
+#else
+ Sleep(10);
+#endif
unixctl_command_reply(conn, "warped");
}
#include "poll-loop.h"
#include "shash.h"
#include "stream.h"
+#include "stream-provider.h"
#include "svec.h"
#include "vlog.h"
unixctl_command_reply__(conn, false, error);
}
-/* Creates a unixctl server listening on 'path', which may be:
+/* Creates a unixctl server listening on 'path', which for POSIX may be:
*
* - NULL, in which case <rundir>/<program>.<pid>.ctl is used.
*
- * - "none", in which case the function will return successfully but
- * no socket will actually be created.
- *
* - A name that does not start with '/', in which case it is put in
* <rundir>.
*
* - An absolute path (starting with '/') that gives the exact name of
* the Unix domain socket to listen on.
*
+ * For Windows, a kernel assigned TCP port is used and written in 'path'
+ * which may be:
+ *
+ * - NULL, in which case <rundir>/<program>.ctl is used.
+ *
+ * - An absolute path that gives the name of the file.
+ *
+ * For both POSIX and Windows, if the path is "none", the function will
+ * return successfully but no socket will actually be created.
+ *
* A program that (optionally) daemonizes itself should call this function
* *after* daemonization, so that the socket name contains the pid of the
* daemon instead of the pid of the program that exited. (Otherwise,
{
struct unixctl_server *server;
struct pstream *listener;
- char *punix_path;
+ char *punix_path, *abs_path = NULL;
int error;
+#ifdef _WIN32
+ FILE *file;
+#endif
*serverp = NULL;
if (path && !strcmp(path, "none")) {
return 0;
}
+#ifndef _WIN32
if (path) {
- char *abs_path = abs_file_name(ovs_rundir(), path);
+ abs_path = abs_file_name(ovs_rundir(), path);
punix_path = xasprintf("punix:%s", abs_path);
- free(abs_path);
} else {
punix_path = xasprintf("punix:%s/%s.%ld.ctl", ovs_rundir(),
program_name, (long int) getpid());
}
+#else
+ punix_path = xstrdup("ptcp:0:127.0.0.1");
+#endif
error = pstream_open(punix_path, &listener, 0);
if (error) {
goto exit;
}
+#ifdef _WIN32
+ if (path) {
+ abs_path = xstrdup(path);
+ } else {
+ abs_path = xasprintf("%s/%s.ctl", ovs_rundir(), program_name);
+ }
+
+ file = fopen(abs_path, "w");
+ if (!file) {
+ error = errno;
+ ovs_error(error, "could not open %s", abs_path);
+ goto exit;
+ }
+
+ fprintf(file, "%d\n", ntohs(listener->bound_port));
+ if (fflush(file) == EOF) {
+ error = EIO;
+ ovs_error(error, "write failed for %s", abs_path);
+ fclose(file);
+ goto exit;
+ }
+ fclose(file);
+#endif
+
unixctl_command_register("help", "", 0, 0, unixctl_help, NULL);
unixctl_command_register("version", "", 0, 0, unixctl_version, NULL);
*serverp = server;
exit:
+ if (abs_path) {
+ free(abs_path);
+ }
free(punix_path);
return error;
}
}
}
\f
-/* Connects to a unixctl server socket. 'path' should be the name of a unixctl
- * server socket. If it does not start with '/', it will be prefixed with the
- * rundir (e.g. /usr/local/var/run/openvswitch).
+/* On POSIX based systems, connects to a unixctl server socket. 'path' should
+ * be the name of a unixctl server socket. If it does not start with '/', it
+ * will be prefixed with the rundir (e.g. /usr/local/var/run/openvswitch).
+ *
+ * On Windows, connects to a localhost TCP port as written inside 'path'.
+ * 'path' should be an absolute path of the file.
*
* Returns 0 if successful, otherwise a positive errno value. If successful,
* sets '*client' to the new jsonrpc, otherwise to NULL. */
char *abs_path, *unix_path;
struct stream *stream;
int error;
+#ifdef _WIN32
+ FILE *file;
+ int port;
+
+    abs_path = xstrdup(path);
+ file = fopen(abs_path, "r");
+ if (!file) {
+ int error = errno;
+ ovs_error(error, "could not open %s", abs_path);
+ free(abs_path);
+ return error;
+ }
- *client = NULL;
+    error = fscanf(file, "%d", &port);
+    if (error != 1) {
+        ovs_error(errno, "failed to read port from %s", abs_path);
+        fclose(file);
+        free(abs_path);
+        return EINVAL;
+    }
+ fclose(file);
+ unix_path = xasprintf("tcp:127.0.0.1:%d", port);
+#else
abs_path = abs_file_name(ovs_rundir(), path);
unix_path = xasprintf("unix:%s", abs_path);
+#endif
+
+ *client = NULL;
+
error = stream_open_block(stream_open(unix_path, &stream, DSCP_DEFAULT),
&stream);
free(unix_path);
not used at all, the default socket is
\fB@RUNDIR@/\*(PN.\fIpid\fB.ctl\fR, where \fIpid\fR is \fB\*(PN\fR's
process ID.
+.IP
+On Windows, uses a kernel-chosen TCP port on localhost to listen
+for runtime management commands. The kernel-chosen TCP port value is written
+to a file whose absolute path is given by \fIsocket\fR. If \fB\-\-unixctl\fR
+is not used at all, the file is created as \fB\*(PN.ctl\fR in the configured
+\fIOVS_RUNDIR\fR directory.
+.IP
Specifying \fBnone\fR for \fIsocket\fR disables the control socket
feature.
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include "bitmap.h"
#include "byte-order.h"
#include "coverage.h"
+#include "ovs-rcu.h"
#include "ovs-thread.h"
#include "vlog.h"
#ifdef HAVE_PTHREAD_SET_NAME_NP
return xrealloc(p, *n * s);
}
+/* The desired minimum alignment for an allocated block of memory. */
+#define MEM_ALIGN MAX(sizeof(void *), 8)
+BUILD_ASSERT_DECL(IS_POW2(MEM_ALIGN));
+BUILD_ASSERT_DECL(CACHE_LINE_SIZE >= MEM_ALIGN);
+
+/* Allocates and returns 'size' bytes of memory in dedicated cache lines. That
+ * is, the memory block returned will not share a cache line with other data,
+ * avoiding "false sharing". (The memory returned will not be at the start of
+ * a cache line, though, so don't assume such alignment.)
+ *
+ * Use free_cacheline() to free the returned memory block. */
+void *
+xmalloc_cacheline(size_t size)
+{
+ void **payload;
+ void *base;
+
+ /* Allocate room for:
+ *
+ * - Up to CACHE_LINE_SIZE - 1 bytes before the payload, so that the
+ * start of the payload doesn't potentially share a cache line.
+ *
+ * - A payload consisting of a void *, followed by padding out to
+ * MEM_ALIGN bytes, followed by 'size' bytes of user data.
+ *
+ * - Space following the payload up to the end of the cache line, so
+ * that the end of the payload doesn't potentially share a cache line
+ * with some following block. */
+ base = xmalloc((CACHE_LINE_SIZE - 1)
+ + ROUND_UP(MEM_ALIGN + size, CACHE_LINE_SIZE));
+
+ /* Locate the payload and store a pointer to the base at the beginning. */
+ payload = (void **) ROUND_UP((uintptr_t) base, CACHE_LINE_SIZE);
+ *payload = base;
+
+ return (char *) payload + MEM_ALIGN;
+}
+
+/* Like xmalloc_cacheline() but clears the allocated memory to all zero
+ * bytes. */
+void *
+xzalloc_cacheline(size_t size)
+{
+ void *p = xmalloc_cacheline(size);
+ memset(p, 0, size);
+ return p;
+}
+
+/* Frees a memory block allocated with xmalloc_cacheline() or
+ * xzalloc_cacheline(). */
+void
+free_cacheline(void *p)
+{
+ if (p) {
+ free(*(void **) ((uintptr_t) p - MEM_ALIGN));
+ }
+}
+
char *
xasprintf(const char *format, ...)
{
size_t size;
/* Get maximum path length or at least a reasonable estimate. */
+#ifndef _WIN32
path_max = pathconf(".", _PC_PATH_MAX);
+#else
+ path_max = MAX_PATH;
+#endif
size = (path_max < 0 ? 1024
: path_max > 10240 ? 10240
: path_max);
return ok;
}
+void
+xsleep(unsigned int seconds)
+{
+ ovsrcu_quiesce_start();
+#ifdef _WIN32
+ Sleep(seconds * 1000);
+#else
+ sleep(seconds);
+#endif
+ ovsrcu_quiesce_end();
+}
+
#ifdef _WIN32
\f
char *
{
return ovs_format_message(GetLastError());
}
+
+int
+ftruncate(int fd, off_t length)
+{
+ int error;
+
+ error = _chsize_s(fd, length);
+ if (error) {
+ return -1;
+ }
+ return 0;
+}
#endif
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#ifndef UTIL_H
#define UTIL_H 1
+#include <inttypes.h>
#include <limits.h>
#include <stdarg.h>
#include <stdbool.h>
/* Returns X rounded up to the nearest multiple of Y. */
#define ROUND_UP(X, Y) (DIV_ROUND_UP(X, Y) * (Y))
+/* Returns the least number that, when added to X, yields a multiple of Y. */
+#define PAD_SIZE(X, Y) (ROUND_UP(X, Y) - (X))
+
/* Returns X rounded down to the nearest multiple of Y. */
#define ROUND_DOWN(X, Y) ((X) / (Y) * (Y))
#define RDP2_4(X) (RDP2_5(X) | (RDP2_5(X) >> 2))
#define RDP2_5(X) ( (X) | ( (X) >> 1))
+/* This system's cache line size, in bytes.
+ * Being wrong hurts performance but not correctness. */
+#define CACHE_LINE_SIZE 64
+BUILD_ASSERT_DECL(IS_POW2(CACHE_LINE_SIZE));
+
#ifndef MIN
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
#endif
char *xvasprintf(const char *format, va_list) PRINTF_FORMAT(1, 0) MALLOC_LIKE;
void *x2nrealloc(void *p, size_t *n, size_t s);
+void *xmalloc_cacheline(size_t) MALLOC_LIKE;
+void *xzalloc_cacheline(size_t) MALLOC_LIKE;
+void free_cacheline(void *);
+
void ovs_strlcpy(char *dst, const char *src, size_t size);
void ovs_strzcpy(char *dst, const char *src, size_t size);
uint64_t bitwise_get(const void *src, unsigned int src_len,
unsigned int src_ofs, unsigned int n_bits);
+void xsleep(unsigned int seconds);
#ifdef _WIN32
\f
char *ovs_format_message(int error);
char *ovs_lasterror_to_string(void);
+int ftruncate(int fd, off_t length);
#endif
#ifdef __cplusplus
return false;
}
\f
+static void
+sha1_update_int(struct sha1_ctx *sha1_ctx, uintmax_t x)
+{
+ sha1_update(sha1_ctx, &x, sizeof x);
+}
+
static void
do_init(void)
{
struct sha1_ctx sha1_ctx;
uint8_t random_seed[16];
struct timeval now;
- pid_t pid, ppid;
- uid_t uid;
- gid_t gid;
/* Get seed data. */
get_entropy_or_die(random_seed, sizeof random_seed);
xgettimeofday(&now);
- pid = getpid();
- ppid = getppid();
- uid = getuid();
- gid = getgid();
/* Convert seed into key. */
sha1_init(&sha1_ctx);
sha1_update(&sha1_ctx, random_seed, sizeof random_seed);
- sha1_update(&sha1_ctx, &pid, sizeof pid);
- sha1_update(&sha1_ctx, &ppid, sizeof ppid);
- sha1_update(&sha1_ctx, &uid, sizeof uid);
- sha1_update(&sha1_ctx, &gid, sizeof gid);
+ sha1_update(&sha1_ctx, &now, sizeof now);
+ sha1_update_int(&sha1_ctx, getpid());
+#ifndef _WIN32
+ sha1_update_int(&sha1_ctx, getppid());
+ sha1_update_int(&sha1_ctx, getuid());
+ sha1_update_int(&sha1_ctx, getgid());
+#endif
sha1_final(&sha1_ctx, sha1);
/* Generate key. */
{ \
LIST_INITIALIZER(&VLM_##MODULE.list), \
#MODULE, /* name */ \
- { [ 0 ... VLF_N_FACILITIES - 1] = VLL_INFO }, /* levels */ \
+ { VLL_INFO, VLL_INFO, VLL_INFO }, /* levels */ \
VLL_INFO, /* min_level */ \
true /* honor_rate_limits */ \
};
[WIN32=no])
AM_CONDITIONAL([WIN32], [test "$WIN32" = yes])
if test "$WIN32" = yes; then
+ AC_ARG_WITH([pthread],
+ [AS_HELP_STRING([--with-pthread=DIR],
+ [root of the pthread-win32 directory])],
+ [
+ case "$withval" in
+ "" | y | ye | yes | n | no)
+ AC_MSG_ERROR([Invalid --with-pthread value])
+ ;;
+ *)
+ PTHREAD_INCLUDES="-I$withval/include"
+ PTHREAD_LDFLAGS="-L$withval/lib/x86"
+ PTHREAD_LIBS="-lpthreadVC2"
+ AC_SUBST([PTHREAD_INCLUDES])
+ AC_SUBST([PTHREAD_LDFLAGS])
+ AC_SUBST([PTHREAD_LIBS])
+ ;;
+ esac
+ ], [
+ AC_MSG_ERROR([pthread directory not specified])
+ ]
+ )
AC_DEFINE([WIN32], [1], [Define to 1 if building on WIN32.])
AH_BOTTOM([#ifdef WIN32
#include "include/windows/windefs.h"
ofproto/ofproto-dpif-mirror.h \
ofproto/ofproto-dpif-monitor.c \
ofproto/ofproto-dpif-monitor.h \
+ ofproto/ofproto-dpif-rid.c \
+ ofproto/ofproto-dpif-rid.h \
ofproto/ofproto-dpif-sflow.c \
ofproto/ofproto-dpif-sflow.h \
ofproto/ofproto-dpif-upcall.c \
ofproto_libofproto_la_CPPFLAGS = $(AM_CPPFLAGS)
ofproto_libofproto_la_CFLAGS = $(AM_CFLAGS)
ofproto_libofproto_la_LIBADD = lib/libsflow.la
+if WIN32
+ofproto_libofproto_la_LIBADD += ${PTHREAD_LIBS}
+endif
+
# Distribute this generated file in order not to require Python at
# build time if ofproto/ipfix.xml is not modified.
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
ovs_mutex_destroy(&bond->mutex);
free(bond->hash);
free(bond->name);
- ovs_refcount_destroy(&bond->ref_cnt);
free(bond);
}
static unsigned int
bond_hash_src(const uint8_t mac[ETH_ADDR_LEN], uint16_t vlan, uint32_t basis)
{
- return hash_3words(hash_bytes(mac, ETH_ADDR_LEN, 0), vlan, basis);
+ return hash_mac(mac, vlan, basis);
}
static unsigned int
static enum ofp_packet_in_reason
wire_reason(struct ofconn *ofconn, const struct ofproto_packet_in *pin)
{
- if (pin->generated_by_table_miss && pin->up.reason == OFPR_ACTION) {
+ if (pin->miss_type == OFPROTO_PACKET_IN_MISS_FLOW
+ && pin->up.reason == OFPR_ACTION) {
enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
if (protocol != OFPUTIL_P_NONE
ovs_mutex_unlock(&rule->mutex);
if (flags & NXFMF_ACTIONS) {
- fu.ofpacts = rule->actions->ofpacts;
- fu.ofpacts_len = rule->actions->ofpacts_len;
+ struct rule_actions *actions = rule_get_actions(rule);
+ fu.ofpacts = actions->ofpacts;
+ fu.ofpacts_len = actions->ofpacts_len;
} else {
fu.ofpacts = NULL;
fu.ofpacts_len = 0;
OAM_N_TYPES
};
+enum ofproto_packet_in_miss_type {
+ /* Not generated by a flow miss or table-miss flow. */
+ OFPROTO_PACKET_IN_NO_MISS,
+
+ /* The packet_in was generated directly by a table-miss flow, that is, a
+ * flow with priority 0 that wildcards all fields. See OF1.3.3 section
+ * 5.4.
+ *
+ * (Our interpretation of "directly" is "not via groups". Packet_ins
+ * generated by table-miss flows via groups use
+ * OFPROTO_PACKET_IN_NO_MISS.) */
+ OFPROTO_PACKET_IN_MISS_FLOW,
+
+ /* The packet-in was generated directly by a table-miss, but not a
+ * table-miss flow. That is, it was generated by the OpenFlow 1.0, 1.1, or
+ * 1.2 table-miss behavior. */
+ OFPROTO_PACKET_IN_MISS_WITHOUT_FLOW,
+};
+
/* A packet_in, with extra members to assist in queuing and routing it. */
struct ofproto_packet_in {
struct ofputil_packet_in up;
struct list list_node; /* For queuing. */
uint16_t controller_id; /* Controller ID to send to. */
int send_len; /* Length that the action requested sending. */
-
- /* True if the packet_in was generated directly by a table-miss flow, that
- * is, a flow with priority 0 that wildcards all fields. (Our
- * interpretation of "directly" is "not via groups".) */
- bool generated_by_table_miss;
+ enum ofproto_packet_in_miss_type miss_type;
};
/* Basics. */
pin.up.reason = OFPR_NO_MATCH;
pin.up.fmd.in_port = OFPP_LOCAL;
pin.send_len = b.size;
- pin.generated_by_table_miss = false;
+ pin.miss_type = OFPROTO_PACKET_IN_NO_MISS;
connmgr_send_packet_in(fo->connmgr, &pin);
ofpbuf_uninit(&b);
/* What to do to an in_band_rule. */
enum in_band_op {
ADD, /* Add the rule to ofproto's flow table. */
- DELETE /* Delete the rule from ofproto's flow table. */
+ DEL /* Delete the rule from ofproto's flow table. */
};
/* A rule to add to or delete from ofproto's flow table. */
/* Mark all the existing rules for deletion. (Afterward we will re-add any
* rules that are still valid.) */
HMAP_FOR_EACH (ib_rule, hmap_node, &ib->rules) {
- ib_rule->op = DELETE;
+ ib_rule->op = DEL;
}
if (ib->n_remotes && !eth_addr_is_zero(ib->local_mac)) {
ofpacts.data, ofpacts.size);
break;
- case DELETE:
+ case DEL:
if (ofproto_delete_flow(ib->ofproto,
&rule->match, rule->priority)) {
/* ofproto doesn't have the rule anymore so there's no reason
/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
}
void
-netflow_flow_update(struct netflow *nf, struct flow *flow,
+netflow_flow_update(struct netflow *nf, const struct flow *flow,
ofp_port_t output_iface,
const struct dpif_flow_stats *stats)
OVS_EXCLUDED(mutex)
atomic_sub(&netflow_count, 1, &orig);
collectors_destroy(nf->collectors);
ofpbuf_uninit(&nf->packet);
- ovs_refcount_destroy(&nf->ref_cnt);
free(nf);
}
}
void netflow_flow_clear(struct netflow *netflow, struct flow *flow);
-void netflow_flow_update(struct netflow *nf, struct flow *flow,
+void netflow_flow_update(struct netflow *nf, const struct flow *flow,
ofp_port_t output_iface,
const struct dpif_flow_stats *);
/*
- * Copyright (c) 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
dpif_ipfix_clear(di);
dpif_ipfix_bridge_exporter_destroy(&di->bridge_exporter);
hmap_destroy(&di->flow_exporter_map);
- ovs_refcount_destroy(&di->ref_cnt);
free(di);
ovs_mutex_unlock(&mutex);
}
bool need_revalidate;
bool has_mirrors;
- int ref_cnt;
+ struct ovs_refcount ref_cnt;
};
struct mbundle {
struct mbridge *mbridge;
mbridge = xzalloc(sizeof *mbridge);
- mbridge->ref_cnt = 1;
+ ovs_refcount_init(&mbridge->ref_cnt);
hmap_init(&mbridge->mbundles);
return mbridge;
{
struct mbridge *mbridge = CONST_CAST(struct mbridge *, mbridge_);
if (mbridge) {
- ovs_assert(mbridge->ref_cnt > 0);
- mbridge->ref_cnt++;
+ ovs_refcount_ref(&mbridge->ref_cnt);
}
return mbridge;
}
return;
}
- ovs_assert(mbridge->ref_cnt > 0);
- if (--mbridge->ref_cnt) {
- return;
- }
+ if (ovs_refcount_unref(&mbridge->ref_cnt) == 1) {
+ for (i = 0; i < MAX_MIRRORS; i++) {
+ if (mbridge->mirrors[i]) {
+ mirror_destroy(mbridge, mbridge->mirrors[i]->aux);
+ }
+ }
- for (i = 0; i < MAX_MIRRORS; i++) {
- if (mbridge->mirrors[i]) {
- mirror_destroy(mbridge, mbridge->mirrors[i]->aux);
+ HMAP_FOR_EACH_SAFE (mbundle, next, hmap_node, &mbridge->mbundles) {
+ mbridge_unregister_bundle(mbridge, mbundle->ofbundle);
}
- }
- HMAP_FOR_EACH_SAFE (mbundle, next, hmap_node, &mbridge->mbundles) {
- mbridge_unregister_bundle(mbridge, mbundle->ofbundle);
+ hmap_destroy(&mbridge->mbundles);
+ free(mbridge);
}
-
- hmap_destroy(&mbridge->mbundles);
- free(mbridge);
}
bool
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include "hmap.h"
+#include "hash.h"
+#include "ovs-thread.h"
+#include "ofproto-dpif-rid.h"
+
+struct rid_map {
+ struct hmap map;
+};
+
+struct rid_node {
+ struct hmap_node node;
+ uint32_t recirc_id;
+};
+
+struct rid_pool {
+ struct rid_map ridmap;
+ uint32_t base; /* IDs in the range of [base, base + n_ids). */
+ uint32_t n_ids; /* Total number of ids in the pool. */
+ uint32_t next_free_id; /* Possible next free id. */
+};
+
+struct recirc_id_pool {
+ struct ovs_mutex lock;
+ struct rid_pool rids;
+};
+
+#define RECIRC_ID_BASE 300
+#define RECIRC_ID_N_IDS 1024
+
+static void rid_pool_init(struct rid_pool *rids,
+ uint32_t base, uint32_t n_ids);
+static void rid_pool_uninit(struct rid_pool *pool);
+static uint32_t rid_pool_alloc_id(struct rid_pool *pool);
+static void rid_pool_free_id(struct rid_pool *rids, uint32_t rid);
+static struct rid_node *rid_pool_find(struct rid_pool *rids, uint32_t id);
+static struct rid_node *rid_pool_add(struct rid_pool *rids, uint32_t id);
+
+struct recirc_id_pool *
+recirc_id_pool_create(void)
+{
+ struct recirc_id_pool *pool;
+
+ pool = xmalloc(sizeof *pool);
+ rid_pool_init(&pool->rids, RECIRC_ID_BASE, RECIRC_ID_N_IDS);
+ ovs_mutex_init(&pool->lock);
+
+ return pool;
+}
+
+void
+recirc_id_pool_destroy(struct recirc_id_pool *pool)
+{
+ rid_pool_uninit(&pool->rids);
+ ovs_mutex_destroy(&pool->lock);
+}
+
+uint32_t
+recirc_id_alloc(struct recirc_id_pool *pool)
+{
+ uint32_t id;
+
+ ovs_mutex_lock(&pool->lock);
+ id = rid_pool_alloc_id(&pool->rids);
+ ovs_mutex_unlock(&pool->lock);
+
+ return id;
+}
+
+void
+recirc_id_free(struct recirc_id_pool *pool, uint32_t id)
+{
+ ovs_mutex_lock(&pool->lock);
+ rid_pool_free_id(&pool->rids, id);
+ ovs_mutex_unlock(&pool->lock);
+}
+
+static void
+rid_pool_init(struct rid_pool *rids, uint32_t base, uint32_t n_ids)
+{
+ rids->base = base;
+ rids->n_ids = n_ids;
+ rids->next_free_id = base;
+ hmap_init(&rids->ridmap.map);
+}
+
+static void
+rid_pool_uninit(struct rid_pool *rids)
+{
+ struct rid_node *rid, *next;
+
+    HMAP_FOR_EACH_SAFE (rid, next, node, &rids->ridmap.map) {
+ hmap_remove(&rids->ridmap.map, &rid->node);
+ free(rid);
+ }
+
+ hmap_destroy(&rids->ridmap.map);
+}
+
+static struct rid_node *
+rid_pool_find(struct rid_pool *rids, uint32_t id)
+{
+ size_t hash;
+ struct rid_node *rid;
+
+ hash = hash_int(id, 0);
+    HMAP_FOR_EACH_WITH_HASH (rid, node, hash, &rids->ridmap.map) {
+ if (id == rid->recirc_id) {
+ return rid;
+ }
+ }
+ return NULL;
+}
+
+static struct rid_node *
+rid_pool_add(struct rid_pool *rids, uint32_t id)
+{
+ struct rid_node *rid = xmalloc(sizeof *rid);
+ size_t hash;
+
+ rid->recirc_id = id;
+ hash = hash_int(id, 0);
+ hmap_insert(&rids->ridmap.map, &rid->node, hash);
+ return rid;
+}
+
+static uint32_t
+rid_pool_alloc_id(struct rid_pool *rids)
+{
+ uint32_t id;
+
+ if (rids->n_ids == 0) {
+ return 0;
+ }
+
+ if (!(rid_pool_find(rids, rids->next_free_id))) {
+ id = rids->next_free_id;
+ goto found_free_id;
+ }
+
+    for (id = rids->base; id < rids->base + rids->n_ids; id++) {
+        if (!rid_pool_find(rids, id)) {
+            goto found_free_id;
+        }
+    }
+
+ /* Not available. */
+ return 0;
+
+found_free_id:
+ rid_pool_add(rids, id);
+
+    if (id + 1 < rids->base + rids->n_ids) {
+ rids->next_free_id = id + 1;
+ } else {
+ rids->next_free_id = rids->base;
+ }
+
+ return id;
+}
+
+static void
+rid_pool_free_id(struct rid_pool *rids, uint32_t id)
+{
+    struct rid_node *rid;
+
+    if (id >= rids->base && id < rids->base + rids->n_ids) {
+        rid = rid_pool_find(rids, id);
+        if (rid) {
+            hmap_remove(&rids->ridmap.map, &rid->node);
+            free(rid);
+        }
+    }
+}
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef OFPROTO_DPIF_RID_H
+#define OFPROTO_DPIF_RID_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+struct recirc_id_pool;
+
+/*
+ * Recirculation ID pool.
+ * ======================
+ *
+ * A recirculation ID must be unique within each datapath. The
+ * recirculation ID pool keeps track of the recirculation IDs in use.
+ *
+ * Typically, there is one recirculation ID pool for each backer.
+ *
+ * In theory, a recirculation ID can be any uint32_t value except 0.
+ * The implementation usually limits it to a smaller range to ease
+ * debugging.
+ *
+ * Thread-safety
+ * =============
+ *
+ * All APIs are thread safe.
+ *
+ */
+struct recirc_id_pool *recirc_id_pool_create(void);
+void recirc_id_pool_destroy(struct recirc_id_pool *pool);
+uint32_t recirc_id_alloc(struct recirc_id_pool *pool);
+void recirc_id_free(struct recirc_id_pool *pool, uint32_t recirc_id);
+#endif
dpif_sflow_del_port__(ds, dsp);
}
hmap_destroy(&ds->ports);
- ovs_refcount_destroy(&ds->ref_cnt);
free(ds);
}
}
#include "ofproto-dpif-ipfix.h"
#include "ofproto-dpif-sflow.h"
#include "ofproto-dpif-xlate.h"
+#include "ovs-rcu.h"
#include "packets.h"
#include "poll-loop.h"
#include "seq.h"
udpif_destroy(struct udpif *udpif)
{
udpif_set_threads(udpif, 0, 0);
- udpif_flush();
+ udpif_flush(udpif);
list_remove(&udpif->list_node);
latch_destroy(&udpif->exit_latch);
seq_destroy(udpif->reval_seq);
seq_destroy(udpif->dump_seq);
- atomic_destroy(&udpif->flow_limit);
- atomic_destroy(&udpif->n_flows);
- atomic_destroy(&udpif->n_flows_timestamp);
ovs_mutex_destroy(&udpif->n_flows_mutex);
free(udpif);
}
udpif_set_threads(struct udpif *udpif, size_t n_handlers,
size_t n_revalidators)
{
+ int error;
+
+ ovsrcu_quiesce_start();
/* Stop the old threads (if any). */
if (udpif->handlers &&
(udpif->n_handlers != n_handlers
udpif->n_handlers = 0;
}
+ error = dpif_handlers_set(udpif->dpif, 1);
+ if (error) {
+ VLOG_ERR("failed to configure handlers in dpif %s: %s",
+ dpif_name(udpif->dpif), ovs_strerror(error));
+ return;
+ }
+
/* Start new threads (if necessary). */
if (!udpif->handlers && n_handlers) {
size_t i;
xpthread_create(&udpif->dispatcher, NULL, udpif_dispatcher, udpif);
xpthread_create(&udpif->flow_dumper, NULL, udpif_flow_dumper, udpif);
}
+
+ ovsrcu_quiesce_end();
}
/* Waits for all ongoing upcall translations to complete. This ensures that
}
}
-/* Removes all flows from all datapaths. */
+/* Removes all flows from a single datapath. */
void
-udpif_flush(void)
+udpif_flush(struct udpif *udpif)
+{
+ size_t n_handlers, n_revalidators;
+
+ n_handlers = udpif->n_handlers;
+ n_revalidators = udpif->n_revalidators;
+
+ udpif_set_threads(udpif, 0, 0);
+ dpif_flow_flush(udpif->dpif);
+ udpif_set_threads(udpif, n_handlers, n_revalidators);
+}
+
+/* Removes all flows from all datapaths. */
+static void
+udpif_flush_all_datapaths(void)
{
struct udpif *udpif;
LIST_FOR_EACH (udpif, list_node, &all_udpifs) {
- dpif_flow_flush(udpif->dpif);
+ udpif_flush(udpif);
}
}
+
\f
/* Destroys and deallocates 'upcall'. */
static void
set_subprogram_name("dispatcher");
while (!latch_is_set(&udpif->exit_latch)) {
recv_upcalls(udpif);
- dpif_recv_wait(udpif->dpif);
+ dpif_recv_wait(udpif->dpif, 0);
latch_wait(&udpif->exit_latch);
poll_block();
}
size_t i;
ovs_mutex_lock(&handler->mutex);
- if (!handler->n_upcalls) {
+ /* Must check the 'exit_latch' again to make sure the main thread is
+ * not joining on the handler thread. */
+ if (!handler->n_upcalls
+ && !latch_is_set(&handler->udpif->exit_latch)) {
ovs_mutex_cond_wait(&handler->wake_cond, &handler->mutex);
}
upcall = xmalloc(sizeof *upcall);
ofpbuf_use_stub(&upcall->upcall_buf, upcall->upcall_stub,
sizeof upcall->upcall_stub);
- error = dpif_recv(udpif->dpif, &upcall->dpif_upcall,
+ error = dpif_recv(udpif->dpif, 0, &upcall->dpif_upcall,
&upcall->upcall_buf);
if (error) {
/* upcall_destroy() can only be called on successfully received
port = xout->slow & (SLOW_CFM | SLOW_BFD | SLOW_LACP | SLOW_STP)
? ODPP_NONE
: odp_in_port;
- pid = dpif_port_get_pid(udpif->dpif, port);
+ pid = dpif_port_get_pid(udpif->dpif, port, 0);
odp_put_userspace_action(pid, &cookie, sizeof cookie.slow_path, buf);
}
type = classify_upcall(upcall);
if (type == MISS_UPCALL) {
uint32_t hash;
- struct pkt_metadata md;
+ struct pkt_metadata md = pkt_metadata_from_flow(&flow);
- pkt_metadata_from_flow(&md, &flow);
flow_extract(packet, &md, &miss->flow);
-
hash = flow_hash(&miss->flow, 0);
existing_miss = flow_miss_find(&misses, ofproto, &miss->flow,
hash);
} else {
miss = existing_miss;
}
- miss->stats.tcp_flags |= packet_get_tcp_flags(packet, &miss->flow);
+ miss->stats.tcp_flags |= ntohs(miss->flow.tcp_flags);
miss->stats.n_bytes += packet->size;
miss->stats.n_packets++;
pin->up.cookie = OVS_BE64_MAX;
flow_get_metadata(&miss->flow, &pin->up.fmd);
pin->send_len = 0; /* Not used for flow table misses. */
- pin->generated_by_table_miss = false;
+ pin->miss_type = OFPROTO_PACKET_IN_NO_MISS;
ofproto_dpif_send_packet_in(miss->ofproto, pin);
}
}
void *aux OVS_UNUSED)
{
atomic_store(&enable_megaflows, false);
- udpif_flush();
+ udpif_flush_all_datapaths();
unixctl_command_reply(conn, "megaflows disabled");
}
void *aux OVS_UNUSED)
{
atomic_store(&enable_megaflows, true);
- udpif_flush();
+ udpif_flush_all_datapaths();
unixctl_command_reply(conn, "megaflows enabled");
}
void udpif_revalidate(struct udpif *);
void udpif_get_memory_usage(struct udpif *, struct simap *usage);
struct seq *udpif_dump_seq(struct udpif *);
-void udpif_flush(void);
+void udpif_flush(struct udpif *);
#endif /* ofproto-dpif-upcall.h */
* it did not arrive on a "real" port. 'ofpp_none_bundle' exists for
* when an input bundle is needed for validation (e.g., mirroring or
 * OFPP_NORMAL processing). It is not connected to an 'ofproto', nor does it have
- * any 'port' structs, so care must be taken when dealing with it.
- * The bundle's name and vlan mode are initialized in lookup_input_bundle() */
-static struct xbundle ofpp_none_bundle;
+ * any 'port' structs, so care must be taken when dealing with it. */
+static struct xbundle ofpp_none_bundle = {
+ .name = "OFPP_NONE",
+ .vlan_mode = PORT_VLAN_TRUNK
+};
/* Node in 'xport''s 'skb_priorities' map. Used to maintain a map from
* 'priority' (the datapath's term for QoS queue) to the dscp bits which all
OVS_REQ_RDLOCK(xlate_rwlock);
static void xlate_normal(struct xlate_ctx *);
static void xlate_report(struct xlate_ctx *, const char *);
- static void xlate_table_action(struct xlate_ctx *, ofp_port_t in_port,
- uint8_t table_id, bool may_packet_in);
+static void xlate_table_action(struct xlate_ctx *, ofp_port_t in_port,
+ uint8_t table_id, bool may_packet_in,
+ bool honor_table_miss);
static bool input_vid_is_valid(uint16_t vid, struct xbundle *, bool warn);
static uint16_t input_vid_to_vlan(const struct xbundle *, uint16_t vid);
static void output_normal(struct xlate_ctx *, const struct xbundle *,
: NULL;
}
-static enum stp_state
+static bool
xport_stp_learn_state(const struct xport *xport)
{
struct stp_port *sp = xport_get_stp_port(xport);
return stp_forward_in_state(sp ? stp_port_get_state(sp) : STP_DISABLED);
}
+static bool
+xport_stp_listen_state(const struct xport *xport)
+{
+ struct stp_port *sp = xport_get_stp_port(xport);
+ return stp_listen_in_state(sp ? stp_port_get_state(sp) : STP_DISABLED);
+}
+
/* Returns true if STP should process 'flow'. Sets fields in 'wc' that
* were used to make the determination.*/
static bool
/* Special-case OFPP_NONE, which a controller may use as the ingress
* port for traffic that it is sourcing. */
if (in_port == OFPP_NONE) {
- ofpp_none_bundle.name = "OFPP_NONE";
- ofpp_none_bundle.vlan_mode = PORT_VLAN_TRUNK;
return &ofpp_none_bundle;
}
actions_offset = nl_msg_start_nested(odp_actions, OVS_SAMPLE_ATTR_ACTIONS);
odp_port = ofp_port_to_odp_port(xbridge, flow->in_port.ofp_port);
- pid = dpif_port_get_pid(xbridge->dpif, odp_port);
+ pid = dpif_port_get_pid(xbridge->dpif, odp_port, 0);
cookie_offset = odp_put_userspace_action(pid, cookie, cookie_size, odp_actions);
nl_msg_end_nested(odp_actions, actions_offset);
/* If 'struct flow' gets additional metadata, we'll need to zero it out
* before traversing a patch port. */
- BUILD_ASSERT_DECL(FLOW_WC_SEQ == 24);
+ BUILD_ASSERT_DECL(FLOW_WC_SEQ == 25);
if (!xport) {
xlate_report(ctx, "Nonexistent output port");
} else if (xport->config & OFPUTIL_PC_NO_FWD) {
xlate_report(ctx, "OFPPC_NO_FWD set, skipping output");
return;
- } else if (check_stp && !xport_stp_forward_state(xport)) {
- xlate_report(ctx, "STP not in forwarding state, skipping output");
- return;
+ } else if (check_stp) {
+ if (eth_addr_equals(ctx->base_flow.dl_dst, eth_addr_stp)) {
+ if (!xport_stp_listen_state(xport)) {
+ xlate_report(ctx, "STP not in listening state, "
+ "skipping bpdu output");
+ return;
+ }
+ } else if (!xport_stp_forward_state(xport)) {
+ xlate_report(ctx, "STP not in forwarding state, "
+ "skipping output");
+ return;
+ }
}
if (mbridge_has_mirrors(ctx->xbridge->mbridge) && xport->xbundle) {
ctx->xout->slow |= special;
} else if (may_receive(peer, ctx)) {
if (xport_stp_forward_state(peer)) {
- xlate_table_action(ctx, flow->in_port.ofp_port, 0, true);
+ xlate_table_action(ctx, flow->in_port.ofp_port, 0, true, true);
} else {
/* Forwarding is disabled by STP. Let OFPP_NORMAL and the
* learning action look at the packet, then drop it. */
struct flow old_base_flow = ctx->base_flow;
size_t old_size = ctx->xout->odp_actions.size;
mirror_mask_t old_mirrors = ctx->xout->mirrors;
- xlate_table_action(ctx, flow->in_port.ofp_port, 0, true);
+ xlate_table_action(ctx, flow->in_port.ofp_port, 0, true, true);
ctx->xout->mirrors = old_mirrors;
ctx->base_flow = old_base_flow;
ctx->xout->odp_actions.size = old_size;
ctx->rule = rule;
actions = rule_dpif_get_actions(rule);
do_xlate_actions(actions->ofpacts, actions->ofpacts_len, ctx);
- rule_actions_unref(actions);
ctx->rule = old_rule;
ctx->recurse--;
}
}
static void
-xlate_table_action(struct xlate_ctx *ctx,
- ofp_port_t in_port, uint8_t table_id, bool may_packet_in)
+xlate_table_action(struct xlate_ctx *ctx, ofp_port_t in_port, uint8_t table_id,
+ bool may_packet_in, bool honor_table_miss)
{
if (xlate_resubmit_resource_check(ctx)) {
ofp_port_t old_in_port = ctx->xin->flow.in_port.ofp_port;
bool skip_wildcards = ctx->xin->skip_wildcards;
uint8_t old_table_id = ctx->table_id;
struct rule_dpif *rule;
+ enum rule_dpif_lookup_verdict verdict;
+ enum ofputil_port_config config = 0;
ctx->table_id = table_id;
* original input port (otherwise OFPP_NORMAL and OFPP_IN_PORT will
* have surprising behavior). */
ctx->xin->flow.in_port.ofp_port = in_port;
- rule_dpif_lookup_in_table(ctx->xbridge->ofproto, &ctx->xin->flow,
- !skip_wildcards ? &ctx->xout->wc : NULL,
- table_id, &rule);
+ verdict = rule_dpif_lookup_from_table(ctx->xbridge->ofproto,
+ &ctx->xin->flow,
+ !skip_wildcards
+ ? &ctx->xout->wc : NULL,
+ honor_table_miss,
+ &ctx->table_id, &rule);
ctx->xin->flow.in_port.ofp_port = old_in_port;
if (ctx->xin->resubmit_hook) {
ctx->xin->resubmit_hook(ctx->xin, rule, ctx->recurse);
}
- if (!rule && may_packet_in) {
- struct xport *xport;
-
- /* XXX
- * check if table configuration flags
- * OFPTC11_TABLE_MISS_CONTROLLER, default.
- * OFPTC11_TABLE_MISS_CONTINUE,
- * OFPTC11_TABLE_MISS_DROP
- * When OF1.0, OFPTC11_TABLE_MISS_CONTINUE is used. What to do? */
- xport = get_ofp_port(ctx->xbridge, ctx->xin->flow.in_port.ofp_port);
- choose_miss_rule(xport ? xport->config : 0,
- ctx->xbridge->miss_rule,
- ctx->xbridge->no_packet_in_rule, &rule);
+ switch (verdict) {
+ case RULE_DPIF_LOOKUP_VERDICT_MATCH:
+ goto match;
+ case RULE_DPIF_LOOKUP_VERDICT_CONTROLLER:
+ if (may_packet_in) {
+ struct xport *xport;
+
+ xport = get_ofp_port(ctx->xbridge,
+ ctx->xin->flow.in_port.ofp_port);
+ config = xport ? xport->config : 0;
+ break;
+ }
+ /* Fall through to drop */
+ case RULE_DPIF_LOOKUP_VERDICT_DROP:
+ config = OFPUTIL_PC_NO_PACKET_IN;
+ break;
+ default:
+ OVS_NOT_REACHED();
}
+
+ choose_miss_rule(config, ctx->xbridge->miss_rule,
+ ctx->xbridge->no_packet_in_rule, &rule);
+
+match:
if (rule) {
xlate_recursively(ctx, rule);
rule_dpif_unref(rule);
const struct ofputil_bucket *bucket;
uint32_t basis;
- basis = hash_bytes(ctx->xin->flow.dl_dst, sizeof ctx->xin->flow.dl_dst, 0);
+ basis = hash_mac(ctx->xin->flow.dl_dst, 0, 0);
bucket = group_best_live_bucket(ctx, group, basis);
if (bucket) {
memset(&wc->masks.dl_dst, 0xff, sizeof wc->masks.dl_dst);
table_id = ctx->table_id;
}
- xlate_table_action(ctx, in_port, table_id, false);
+ xlate_table_action(ctx, in_port, table_id, false, false);
}
static void
&ctx->xout->odp_actions,
&ctx->xout->wc);
- odp_execute_actions(NULL, packet, &md, ctx->xout->odp_actions.data,
+ odp_execute_actions(NULL, packet, false, &md, ctx->xout->odp_actions.data,
ctx->xout->odp_actions.size, NULL);
pin = xmalloc(sizeof *pin);
pin->controller_id = controller_id;
pin->send_len = len;
- pin->generated_by_table_miss = (ctx->rule
- && rule_dpif_is_table_miss(ctx->rule));
+ /* If the rule is a table-miss rule then this is
+ * a table miss handled by a table-miss rule.
+ *
+ * Else, if the rule is internal and has a controller action,
+ * the latter being implied by the rule being processed here,
+ * then this is a table miss handled without a table-miss rule.
+ *
+ * Otherwise this is not a table miss. */
+ pin->miss_type = OFPROTO_PACKET_IN_NO_MISS;
+ if (ctx->rule) {
+ if (rule_dpif_is_table_miss(ctx->rule)) {
+ pin->miss_type = OFPROTO_PACKET_IN_MISS_FLOW;
+ } else if (rule_dpif_is_internal(ctx->rule)) {
+ pin->miss_type = OFPROTO_PACKET_IN_MISS_WITHOUT_FLOW;
+ }
+ }
ofproto_dpif_send_packet_in(ctx->xbridge->ofproto, pin);
ofpbuf_delete(packet);
}
break;
case OFPP_TABLE:
xlate_table_action(ctx, ctx->xin->flow.in_port.ofp_port,
- 0, may_packet_in);
+ 0, may_packet_in, true);
break;
case OFPP_NORMAL:
xlate_normal(ctx);
ovs_assert(ctx->table_id < ogt->table_id);
xlate_table_action(ctx, ctx->xin->flow.in_port.ofp_port,
- ogt->table_id, true);
+ ogt->table_id, true, true);
break;
}
ctx.exit = false;
if (!xin->ofpacts && !ctx.rule) {
- rule_dpif_lookup(ctx.xbridge->ofproto, flow,
- !xin->skip_wildcards ? wc : NULL, &rule);
+ ctx.table_id = rule_dpif_lookup(ctx.xbridge->ofproto, flow,
+ !xin->skip_wildcards ? wc : NULL,
+ &rule);
if (ctx.xin->resubmit_stats) {
rule_dpif_credit_stats(rule, ctx.xin->resubmit_stats);
}
}
out:
- rule_actions_unref(actions);
rule_dpif_unref(rule);
}
#include "ofproto-dpif-ipfix.h"
#include "ofproto-dpif-mirror.h"
#include "ofproto-dpif-monitor.h"
+#include "ofproto-dpif-rid.h"
#include "ofproto-dpif-sflow.h"
#include "ofproto-dpif-upcall.h"
#include "ofproto-dpif-xlate.h"
* - Do include packets and bytes from datapath flows which have not
* recently been processed by a revalidator. */
struct ovs_mutex stats_mutex;
- uint64_t packet_count OVS_GUARDED; /* Number of packets received. */
- uint64_t byte_count OVS_GUARDED; /* Number of bytes received. */
- long long int used; /* Last used time (msec). */
+ struct dpif_flow_stats stats OVS_GUARDED;
};
static void rule_get_stats(struct rule *, uint64_t *packets, uint64_t *bytes,
bool recv_set_enable; /* Enables or disables receiving packets. */
+ struct recirc_id_pool *rid_pool; /* Recirculation ID pool. */
+
/* True if the datapath supports variable-length
* OVS_USERSPACE_ATTR_USERDATA in OVS_ACTION_ATTR_USERSPACE actions.
* False if the datapath supports only 8-byte (or shorter) userdata. */
ovs_rwlock_destroy(&backer->odp_to_ofport_lock);
hmap_destroy(&backer->odp_to_ofport_map);
shash_find_and_delete(&all_dpif_backers, backer->type);
+ recirc_id_pool_destroy(backer->rid_pool);
free(backer->type);
dpif_close(backer->dpif);
-
free(backer);
}
struct shash_node *node;
struct list garbage_list;
struct odp_garbage *garbage, *next;
+
struct sset names;
char *backer_name;
const char *name;
}
backer->variable_length_userdata = check_variable_length_userdata(backer);
backer->max_mpls_depth = check_max_mpls_depth(backer);
+ backer->rid_pool = recirc_id_pool_create();
if (backer->recv_set_enable) {
udpif_set_threads(backer->udpif, n_handlers, n_revalidators);
ofpbuf_init(&actions, 64);
start = nl_msg_start_nested(&actions, OVS_ACTION_ATTR_USERSPACE);
nl_msg_put_u32(&actions, OVS_USERSPACE_ATTR_PID,
- dpif_port_get_pid(backer->dpif, ODPP_NONE));
+ dpif_port_get_pid(backer->dpif, ODPP_NONE, 0));
nl_msg_put_unspec_zero(&actions, OVS_USERSPACE_ATTR_USERDATA, 4);
nl_msg_end_nested(&actions, start);
const struct ofpbuf *ofpacts, struct rule_dpif **rulep)
{
struct ofputil_flow_mod fm;
+ struct classifier *cls;
int error;
match_init_catchall(&fm.match);
return error;
}
- if (rule_dpif_lookup_in_table(ofproto, &fm.match.flow, NULL, TBL_INTERNAL,
- rulep)) {
- rule_dpif_unref(*rulep);
- } else {
- OVS_NOT_REACHED();
- }
+ cls = &ofproto->up.tables[TBL_INTERNAL].cls;
+ fat_rwlock_rdlock(&cls->rwlock);
+ *rulep = rule_dpif_cast(rule_from_cls_rule(
+ classifier_lookup(cls, &fm.match.flow, NULL)));
+ ovs_assert(*rulep != NULL);
+ fat_rwlock_unlock(&cls->rwlock);
return 0;
}
}
static void
-flush(struct ofproto *ofproto OVS_UNUSED)
+flush(struct ofproto *ofproto_)
{
- udpif_flush();
+ struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
+ struct dpif_backer *backer = ofproto->backer;
+
+ if (backer) {
+ udpif_flush(backer->udpif);
+ }
}
static void
learning_packet = bond_compose_learning_packet(bundle->bond,
e->mac, e->vlan,
&port_void);
- learning_packet->private_p = port_void;
+ /* Temporarily use l2 as a private pointer (see below). */
+ ovs_assert(learning_packet->l2 == learning_packet->data);
+ learning_packet->l2 = port_void;
list_push_back(&packets, &learning_packet->list_node);
}
}
error = n_packets = n_errors = 0;
LIST_FOR_EACH (learning_packet, list_node, &packets) {
int ret;
+ void *port_void = learning_packet->l2;
- ret = ofproto_dpif_send_packet(learning_packet->private_p, learning_packet);
+ /* Restore l2. */
+ learning_packet->l2 = learning_packet->data;
+ ret = ofproto_dpif_send_packet(port_void, learning_packet);
if (ret) {
error = ret;
n_errors++;
long long int used;
ovs_mutex_lock(&rule->stats_mutex);
- used = rule->used;
+ used = rule->stats.used;
ovs_mutex_unlock(&rule->stats_mutex);
if (now > used + idle_timeout * 1000) {
const struct dpif_flow_stats *stats)
{
ovs_mutex_lock(&rule->stats_mutex);
- rule->packet_count += stats->n_packets;
- rule->byte_count += stats->n_bytes;
- rule->used = MAX(rule->used, stats->used);
+ rule->stats.n_packets += stats->n_packets;
+ rule->stats.n_bytes += stats->n_bytes;
+ rule->stats.used = MAX(rule->stats.used, stats->used);
ovs_mutex_unlock(&rule->stats_mutex);
}
return rule_is_table_miss(&rule->up);
}
+bool
+rule_dpif_is_internal(const struct rule_dpif *rule)
+{
+ return rule_is_internal(&rule->up);
+}
+
ovs_be64
rule_dpif_get_flow_cookie(const struct rule_dpif *rule)
OVS_REQUIRES(rule->up.mutex)
return rule_get_actions(&rule->up);
}
-/* Lookup 'flow' in 'ofproto''s classifier. If 'wc' is non-null, sets
- * the fields that were relevant as part of the lookup. */
-void
+/* Lookup 'flow' in table 0 of 'ofproto''s classifier.
+ * If 'wc' is non-null, sets the fields that were relevant as part of
+ * the lookup. Returns the table_id where a match or miss occurred.
+ *
+ * The return value will be zero unless there was a miss and
+ * OFPTC_TABLE_MISS_CONTINUE is in effect for the sequence of tables
+ * where misses occur. */
+uint8_t
rule_dpif_lookup(struct ofproto_dpif *ofproto, const struct flow *flow,
struct flow_wildcards *wc, struct rule_dpif **rule)
{
- struct ofport_dpif *port;
+ enum rule_dpif_lookup_verdict verdict;
+ enum ofputil_port_config config = 0;
+ uint8_t table_id = 0;
- if (rule_dpif_lookup_in_table(ofproto, flow, wc, 0, rule)) {
- return;
+ verdict = rule_dpif_lookup_from_table(ofproto, flow, wc, true,
+ &table_id, rule);
+
+ switch (verdict) {
+ case RULE_DPIF_LOOKUP_VERDICT_MATCH:
+ return table_id;
+ case RULE_DPIF_LOOKUP_VERDICT_CONTROLLER: {
+ struct ofport_dpif *port;
+
+ port = get_ofp_port(ofproto, flow->in_port.ofp_port);
+ if (!port) {
+ VLOG_WARN_RL(&rl, "packet-in on unknown OpenFlow port %"PRIu16,
+ flow->in_port.ofp_port);
+ }
+ config = port ? port->up.pp.config : 0;
+ break;
}
- port = get_ofp_port(ofproto, flow->in_port.ofp_port);
- if (!port) {
- VLOG_WARN_RL(&rl, "packet-in on unknown OpenFlow port %"PRIu16,
- flow->in_port.ofp_port);
+ case RULE_DPIF_LOOKUP_VERDICT_DROP:
+ config = OFPUTIL_PC_NO_PACKET_IN;
+ break;
+ default:
+ OVS_NOT_REACHED();
}
- choose_miss_rule(port ? port->up.pp.config : 0, ofproto->miss_rule,
+ choose_miss_rule(config, ofproto->miss_rule,
ofproto->no_packet_in_rule, rule);
+ return table_id;
}
-bool
-rule_dpif_lookup_in_table(struct ofproto_dpif *ofproto,
- const struct flow *flow, struct flow_wildcards *wc,
- uint8_t table_id, struct rule_dpif **rule)
+static struct rule_dpif *
+rule_dpif_lookup_in_table(struct ofproto_dpif *ofproto, uint8_t table_id,
+ const struct flow *flow, struct flow_wildcards *wc)
{
+ struct classifier *cls = &ofproto->up.tables[table_id].cls;
const struct cls_rule *cls_rule;
- struct classifier *cls;
- bool frag;
-
- *rule = NULL;
- if (table_id >= N_TABLES) {
- return false;
- }
+ struct rule_dpif *rule;
- if (wc) {
- memset(&wc->masks.dl_type, 0xff, sizeof wc->masks.dl_type);
- if (is_ip_any(flow)) {
- wc->masks.nw_frag |= FLOW_NW_FRAG_MASK;
+ fat_rwlock_rdlock(&cls->rwlock);
+ if (ofproto->up.frag_handling != OFPC_FRAG_NX_MATCH) {
+ if (wc) {
+ memset(&wc->masks.dl_type, 0xff, sizeof wc->masks.dl_type);
+ if (is_ip_any(flow)) {
+ wc->masks.nw_frag |= FLOW_NW_FRAG_MASK;
+ }
}
- }
- cls = &ofproto->up.tables[table_id].cls;
- fat_rwlock_rdlock(&cls->rwlock);
- frag = (flow->nw_frag & FLOW_NW_FRAG_ANY) != 0;
- if (frag && ofproto->up.frag_handling == OFPC_FRAG_NORMAL) {
- /* We must pretend that transport ports are unavailable. */
- struct flow ofpc_normal_flow = *flow;
- ofpc_normal_flow.tp_src = htons(0);
- ofpc_normal_flow.tp_dst = htons(0);
- cls_rule = classifier_lookup(cls, &ofpc_normal_flow, wc);
- } else if (frag && ofproto->up.frag_handling == OFPC_FRAG_DROP) {
- cls_rule = &ofproto->drop_frags_rule->up.cr;
- /* Frag mask in wc already set above. */
+ if (flow->nw_frag & FLOW_NW_FRAG_ANY) {
+ if (ofproto->up.frag_handling == OFPC_FRAG_NORMAL) {
+ /* We must pretend that transport ports are unavailable. */
+ struct flow ofpc_normal_flow = *flow;
+ ofpc_normal_flow.tp_src = htons(0);
+ ofpc_normal_flow.tp_dst = htons(0);
+ cls_rule = classifier_lookup(cls, &ofpc_normal_flow, wc);
+ } else {
+ /* Must be OFPC_FRAG_DROP (we don't have OFPC_FRAG_REASM). */
+ cls_rule = &ofproto->drop_frags_rule->up.cr;
+ }
+ } else {
+ cls_rule = classifier_lookup(cls, flow, wc);
+ }
} else {
cls_rule = classifier_lookup(cls, flow, wc);
}
- *rule = rule_dpif_cast(rule_from_cls_rule(cls_rule));
- rule_dpif_ref(*rule);
+ rule = rule_dpif_cast(rule_from_cls_rule(cls_rule));
+ rule_dpif_ref(rule);
fat_rwlock_unlock(&cls->rwlock);
- return *rule != NULL;
+ return rule;
+}
+
+/* Look up 'flow' in 'ofproto''s classifier starting from table '*table_id'.
+ * Stores the rule that was found in '*rule', or NULL if none was found.
+ * Updates 'wc', if nonnull, to reflect the fields that were used during the
+ * lookup.
+ *
+ * If 'honor_table_miss' is true, the first lookup occurs in '*table_id', but
+ * if none is found then the table miss configuration for that table is
+ * honored, which can result in additional lookups in other OpenFlow tables.
+ * In this case the function updates '*table_id' to reflect the final OpenFlow
+ * table that was searched.
+ *
+ * If 'honor_table_miss' is false, then only one table lookup occurs, in
+ * '*table_id'.
+ *
+ * Returns:
+ *
+ * - RULE_DPIF_LOOKUP_VERDICT_MATCH if a rule (in '*rule') was found.
+ *
+ * - RULE_DPIF_LOOKUP_VERDICT_DROP if no rule was found and a table miss
+ * configuration specified that the packet should be dropped in this
+ * case. (This occurs only if 'honor_table_miss' is true, because only in
+ * this case does the table miss configuration matter.)
+ *
+ * - RULE_DPIF_LOOKUP_VERDICT_CONTROLLER if no rule was found otherwise. */
+enum rule_dpif_lookup_verdict
+rule_dpif_lookup_from_table(struct ofproto_dpif *ofproto,
+ const struct flow *flow,
+ struct flow_wildcards *wc,
+ bool honor_table_miss,
+ uint8_t *table_id, struct rule_dpif **rule)
+{
+ uint8_t next_id;
+
+ for (next_id = *table_id;
+ next_id < ofproto->up.n_tables;
+ next_id++, next_id += (next_id == TBL_INTERNAL))
+ {
+ *table_id = next_id;
+ *rule = rule_dpif_lookup_in_table(ofproto, *table_id, flow, wc);
+ if (*rule) {
+ return RULE_DPIF_LOOKUP_VERDICT_MATCH;
+ } else if (!honor_table_miss) {
+ return RULE_DPIF_LOOKUP_VERDICT_CONTROLLER;
+ } else {
+ switch (table_get_config(&ofproto->up, *table_id)
+ & OFPTC11_TABLE_MISS_MASK) {
+ case OFPTC11_TABLE_MISS_CONTINUE:
+ break;
+
+ case OFPTC11_TABLE_MISS_CONTROLLER:
+ return RULE_DPIF_LOOKUP_VERDICT_CONTROLLER;
+
+ case OFPTC11_TABLE_MISS_DROP:
+ return RULE_DPIF_LOOKUP_VERDICT_DROP;
+ }
+ }
+ }
+
+ return RULE_DPIF_LOOKUP_VERDICT_CONTROLLER;
}
/* Given a port configuration (specified as zero if there's no port), chooses
{
struct rule_dpif *rule = rule_dpif_cast(rule_);
ovs_mutex_init_adaptive(&rule->stats_mutex);
- rule->packet_count = 0;
- rule->byte_count = 0;
- rule->used = rule->up.modified;
+ rule->stats.n_packets = 0;
+ rule->stats.n_bytes = 0;
+ rule->stats.used = rule->up.modified;
return 0;
}
struct rule_dpif *rule = rule_dpif_cast(rule_);
ovs_mutex_lock(&rule->stats_mutex);
- *packets = rule->packet_count;
- *bytes = rule->byte_count;
- *used = rule->used;
+ *packets = rule->stats.n_packets;
+ *bytes = rule->stats.n_bytes;
+ *used = rule->stats.used;
ovs_mutex_unlock(&rule->stats_mutex);
}
if (reset_counters) {
ovs_mutex_lock(&rule->stats_mutex);
- rule->packet_count = 0;
- rule->byte_count = 0;
+ rule->stats.n_packets = 0;
+ rule->stats.n_bytes = 0;
ovs_mutex_unlock(&rule->stats_mutex);
}
static enum ofperr
group_modify(struct ofgroup *group_, struct ofgroup *victim_)
{
+ struct ofproto_dpif *ofproto = ofproto_dpif_cast(group_->ofproto);
struct group_dpif *group = group_dpif_cast(group_);
struct group_dpif *victim = group_dpif_cast(victim_);
group_construct_stats(group);
ovs_mutex_unlock(&group->stats_mutex);
+ ofproto->backer->need_revalidate = REV_FLOW_TABLE;
+
return 0;
}
ds_put_cstr(result, "OpenFlow actions=");
ofpacts_format(actions->ofpacts, actions->ofpacts_len, result);
ds_put_char(result, '\n');
-
- rule_actions_unref(actions);
}
static void
if (!packet->size) {
flow_compose(packet, flow);
} else {
- union flow_in_port in_port = flow->in_port;
- struct pkt_metadata md;
+ struct pkt_metadata md = pkt_metadata_from_flow(flow);
/* Use the metadata from the flow and the packet argument
* to reconstruct the flow. */
- pkt_metadata_init(&md, NULL, flow->skb_priority,
- flow->pkt_mark, &in_port);
-
flow_extract(packet, &md, flow);
}
}
}
if (rule || ofpacts) {
- uint16_t tcp_flags;
-
- tcp_flags = packet ? packet_get_tcp_flags(packet, flow) : 0;
trace.result = ds;
trace.flow = *flow;
- xlate_in_init(&trace.xin, ofproto, flow, rule, tcp_flags, packet);
+ xlate_in_init(&trace.xin, ofproto, flow, rule, ntohs(flow->tcp_flags),
+ packet);
if (ofpacts) {
trace.xin.ofpacts = ofpacts;
trace.xin.ofpacts_len = ofpacts_len;
unixctl_command_register("dpif/dump-flows", "[-m] bridge", 1, 2,
ofproto_unixctl_dpif_dump_flows, NULL);
}
+
+
+/* Returns true if 'rule' is an internal rule, false otherwise. */
+bool
+rule_is_internal(const struct rule *rule)
+{
+ return rule->table_id == TBL_INTERNAL;
+}
\f
/* Linux VLAN device support (e.g. "eth0.10" for VLAN 10.)
*
return !hmap_is_empty(&ofproto->realdev_vid_map);
}
+
static ofp_port_t
vsp_realdev_to_vlandev__(const struct ofproto_dpif *ofproto,
ofp_port_t realdev_ofp_port, ovs_be16 vlan_tci)
}
}
+uint32_t
+ofproto_dpif_alloc_recirc_id(struct ofproto_dpif *ofproto)
+{
+ struct dpif_backer *backer = ofproto->backer;
+
+ return recirc_id_alloc(backer->rid_pool);
+}
+
+void
+ofproto_dpif_free_recirc_id(struct ofproto_dpif *ofproto, uint32_t recirc_id)
+{
+ struct dpif_backer *backer = ofproto->backer;
+
+ recirc_id_free(backer->rid_pool, recirc_id);
+}
+
const struct ofproto_class ofproto_dpif_class = {
init,
enumerate_types,
union user_action_cookie;
struct dpif_flow_stats;
+struct ofproto;
struct ofproto_dpif;
struct ofproto_packet_in;
struct ofport_dpif;
struct OVS_LOCKABLE rule_dpif;
struct OVS_LOCKABLE group_dpif;
+enum rule_dpif_lookup_verdict {
+ RULE_DPIF_LOOKUP_VERDICT_MATCH, /* A match occurred. */
+ RULE_DPIF_LOOKUP_VERDICT_CONTROLLER, /* A miss occurred and the packet
+ * should be passed to
+ * the controller. */
+ RULE_DPIF_LOOKUP_VERDICT_DROP, /* A miss occurred and the packet
+ * should be dropped. */
+};
+
/* For lock annotation below only. */
extern struct ovs_rwlock xlate_rwlock;
size_t ofproto_dpif_get_max_mpls_depth(const struct ofproto_dpif *);
-void rule_dpif_lookup(struct ofproto_dpif *, const struct flow *,
- struct flow_wildcards *, struct rule_dpif **rule);
+uint8_t rule_dpif_lookup(struct ofproto_dpif *, const struct flow *,
+ struct flow_wildcards *, struct rule_dpif **rule);
-bool rule_dpif_lookup_in_table(struct ofproto_dpif *, const struct flow *,
- struct flow_wildcards *, uint8_t table_id,
- struct rule_dpif **rule);
+enum rule_dpif_lookup_verdict rule_dpif_lookup_from_table(struct ofproto_dpif *,
+ const struct flow *,
+ struct flow_wildcards *,
+ bool honor_table_miss,
+ uint8_t *table_id,
+ struct rule_dpif **rule);
void rule_dpif_ref(struct rule_dpif *);
void rule_dpif_unref(struct rule_dpif *);
bool rule_dpif_is_fail_open(const struct rule_dpif *);
bool rule_dpif_is_table_miss(const struct rule_dpif *);
+bool rule_dpif_is_internal(const struct rule_dpif *);
struct rule_actions *rule_dpif_get_actions(const struct rule_dpif *);
struct ofport_dpif *odp_port_to_ofport(const struct dpif_backer *, odp_port_t);
+/*
+ * Recirculation
+ * =============
+ *
+ * Recirculation is a technique that allows a frame to re-enter the packet
+ * processing path one or more times to achieve more flexible packet
+ * processing in the data path.  Examples include MPLS handling and
+ * selecting a slave port from a bond.
+ *
+ * Data path and user space interface
+ * -----------------------------------
+ *
+ * Two new fields, recirc_id and dp_hash, are added to the current flow data
+ * structure.  They are both of type uint32_t.  In addition, a new action,
+ * RECIRC, is added.
+ *
+ * The recirc_id value distinguishes the iterations of recirculation that a
+ * packet has gone through.  A newly received packet is considered to have a
+ * recirc_id of 0.  The recirc_id is managed by user space and is opaque to
+ * the data path.
+ *
+ * On the other hand, dp_hash can only be computed by the data path and is
+ * opaque to user space; in fact, user space may not be able to recompute the
+ * hash value.  The dp_hash value should be wildcarded for a newly received
+ * packet.  The RECIRC action specifies whether the hash is computed and, if
+ * so, which fields are included in the computation.  The computed hash value
+ * is stored into the dp_hash field prior to recirculation.
+ *
+ * The RECIRC action computes and sets the dp_hash field, sets the recirc_id
+ * field, and then reprocesses the packet as if it had been received on the
+ * same input port.  The RECIRC action works like a function call: actions
+ * listed after the RECIRC action are executed after it returns.  RECIRC
+ * actions can be nested; the data path implementation limits the number of
+ * recirculations executed to prevent unreasonable nesting depth or an
+ * infinite loop.
+ *
+ * Both flow fields and the RECIRC action are exposed at the OpenFlow level
+ * via Nicira extensions.
+ *
+ * Post-recirculation flows
+ * ------------------------
+ *
+ * At the OpenFlow level, post-recirculation rules are always hidden from the
+ * controller.  They are installed in table 254, which is set up as a hidden
+ * table at boot time.  Those rules are managed only by the local user space
+ * program.
+ *
+ * To speed up classifier lookups, recirc_id is always reflected into the
+ * metadata field, since recirc_id must be matched exactly.
+ *
+ * Classifier lookup always starts with table 254.  A post-recirculation flow
+ * lookup should find its hidden rule within this table.  A newly received
+ * packet, on the other hand, should miss all post-recirculation rules because
+ * its recirc_id is zero, and then hit a pre-installed lower-priority rule
+ * that redirects the classifier to start the lookup from table 0:
+ *
+ * * , actions=resubmit(,0)
+ *
+ * Post-recirculation data path flows are managed like other data path flows.
+ * They are created on demand.  Miss handling, stats collection, and
+ * revalidation work the same way as for regular flows.
+ */
+
+uint32_t ofproto_dpif_alloc_recirc_id(struct ofproto_dpif *ofproto);
+void ofproto_dpif_free_recirc_id(struct ofproto_dpif *ofproto, uint32_t recirc_id);
#endif /* ofproto-dpif.h */
/*
- * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include "ofp-util.h"
#include "ofproto/ofproto.h"
#include "ovs-atomic.h"
+#include "ovs-rcu.h"
#include "ovs-thread.h"
#include "shash.h"
#include "simap.h"
/* OpenFlow actions. See struct rule_actions for more thread-safety
* notes. */
- struct rule_actions *actions OVS_GUARDED;
+ OVSRCU_TYPE(struct rule_actions *) actions;
/* In owning meter's 'rules' list. An empty list if there is no meter. */
struct list meter_list_node OVS_GUARDED_BY(ofproto_mutex);
void ofproto_rule_ref(struct rule *);
void ofproto_rule_unref(struct rule *);
-struct rule_actions *rule_get_actions(const struct rule *rule)
- OVS_EXCLUDED(rule->mutex);
-struct rule_actions *rule_get_actions__(const struct rule *rule)
- OVS_REQUIRES(rule->mutex);
+static inline struct rule_actions *
+rule_get_actions(const struct rule *rule)
+{
+ return ovsrcu_get(struct rule_actions *, &rule->actions);
+}
/* Returns true if 'rule' is an OpenFlow 1.3 "table-miss" rule, false
* otherwise.
{
return rule->cr.priority == 0 && cls_rule_is_catchall(&rule->cr);
}
+bool rule_is_internal(const struct rule *);
/* A set of actions within a "struct rule".
*
* 'rule' is the rule for which 'rule->actions == actions') or that owns a
* reference to 'actions->ref_count' (or both). */
struct rule_actions {
- struct ovs_refcount ref_count;
-
/* These members are immutable: they do not change during the struct's
* lifetime. */
struct ofpact *ofpacts; /* Sequence of "struct ofpacts". */
struct rule_actions *rule_actions_create(const struct ofproto *,
const struct ofpact *, size_t);
-void rule_actions_ref(struct rule_actions *);
-void rule_actions_unref(struct rule_actions *);
+void rule_actions_destroy(struct rule_actions *);
/* A set of rules to which an OpenFlow operation applies. */
struct rule_collection {
*
* - 'write_setfields' and 'apply_setfields' to OFPXMT12_MASK.
*
- * - 'metadata_match' and 'metadata_write' to UINT64_MAX.
+ * - 'metadata_match' and 'metadata_write' to OVS_BE64_MAX.
*
* - 'instructions' to OFPIT11_ALL.
*
* information in 'flow' is extracted from 'packet', except for
* flow->tunnel and flow->in_port, which are assigned the correct values
* for the incoming packet. The register values are zeroed. 'packet''s
- * header pointers (e.g. packet->l3) are appropriately initialized.
- * packet->l3 is aligned on a 32-bit boundary.
+ * header pointers and offsets (e.g. packet->l3) are appropriately
+ * initialized. packet->l3 is aligned on a 32-bit boundary.
*
* The implementation should add the statistics for 'packet' into 'rule'.
*
#include "ofproto-provider.h"
#include "openflow/nicira-ext.h"
#include "openflow/openflow.h"
+#include "ovs-rcu.h"
#include "packets.h"
#include "pinsched.h"
#include "pktbuf.h"
}
ovs_mutex_lock(&ofproto_mutex);
- HEAP_FOR_EACH (evg, size_node, &table->eviction_groups_by_size) {
- heap_rebuild(&evg->rules);
- }
-
fat_rwlock_rdlock(&table->cls.rwlock);
cls_cursor_init(&cursor, &table->cls, NULL);
CLS_CURSOR_FOR_EACH (rule, cr, &cursor) {
- if (!rule->eviction_group
- && (rule->idle_timeout || rule->hard_timeout)) {
- eviction_group_add_rule(rule);
+ if (rule->idle_timeout || rule->hard_timeout) {
+ if (!rule->eviction_group) {
+ eviction_group_add_rule(rule);
+ } else {
+ heap_raw_change(&rule->evg_node,
+ rule_eviction_priority(p, rule));
+ }
}
}
fat_rwlock_unlock(&table->cls.rwlock);
+
+ HEAP_FOR_EACH (evg, size_node, &table->eviction_groups_by_size) {
+ heap_rebuild(&evg->rules);
+ }
ovs_mutex_unlock(&ofproto_mutex);
}
}
rule = rule_from_cls_rule(classifier_find_match_exactly(
&ofproto->tables[0].cls, match, priority));
if (rule) {
- ovs_mutex_lock(&rule->mutex);
- must_add = !ofpacts_equal(rule->actions->ofpacts,
- rule->actions->ofpacts_len,
+ struct rule_actions *actions = rule_get_actions(rule);
+ must_add = !ofpacts_equal(actions->ofpacts, actions->ofpacts_len,
ofpacts, ofpacts_len);
- ovs_mutex_unlock(&rule->mutex);
} else {
must_add = true;
}
/* Reading many of the rule fields and writing on 'modified'
* requires the rule->mutex. Also, rule->actions may change
* if rule->mutex is not held. */
+ const struct rule_actions *actions;
+
ovs_mutex_lock(&rule->mutex);
+ actions = rule_get_actions(rule);
if (rule->idle_timeout == fm->idle_timeout
&& rule->hard_timeout == fm->hard_timeout
&& rule->flags == (fm->flags & OFPUTIL_FF_STATE)
&& (!fm->modify_cookie || (fm->new_cookie == rule->flow_cookie))
&& ofpacts_equal(fm->ofpacts, fm->ofpacts_len,
- rule->actions->ofpacts,
- rule->actions->ofpacts_len)) {
+ actions->ofpacts, actions->ofpacts_len)) {
/* Rule already exists and need not change, only update the
modified timestamp. */
rule->modified = time_msec();
}
}
-struct rule_actions *
-rule_get_actions(const struct rule *rule)
- OVS_EXCLUDED(rule->mutex)
-{
- struct rule_actions *actions;
-
- ovs_mutex_lock(&rule->mutex);
- actions = rule_get_actions__(rule);
- ovs_mutex_unlock(&rule->mutex);
-
- return actions;
-}
-
-struct rule_actions *
-rule_get_actions__(const struct rule *rule)
- OVS_REQUIRES(rule->mutex)
-{
- rule_actions_ref(rule->actions);
- return rule->actions;
-}
-
static void
ofproto_rule_destroy__(struct rule *rule)
OVS_NO_THREAD_SAFETY_ANALYSIS
{
cls_rule_destroy(CONST_CAST(struct cls_rule *, &rule->cr));
- rule_actions_unref(rule->actions);
+ rule_actions_destroy(rule_get_actions(rule));
ovs_mutex_destroy(&rule->mutex);
- ovs_refcount_destroy(&rule->ref_count);
rule->ofproto->ofproto_class->rule_dealloc(rule);
}
struct rule_actions *actions;
actions = xmalloc(sizeof *actions);
- ovs_refcount_init(&actions->ref_count);
actions->ofpacts = xmemdup(ofpacts, ofpacts_len);
actions->ofpacts_len = ofpacts_len;
actions->provider_meter_id
return actions;
}
-/* Increments 'actions''s ref_count. */
-void
-rule_actions_ref(struct rule_actions *actions)
+static void
+rule_actions_destroy_cb(struct rule_actions *actions)
{
- if (actions) {
- ovs_refcount_ref(&actions->ref_count);
- }
+ free(actions->ofpacts);
+ free(actions);
}
/* Frees 'actions', deferring the actual free until all current RCU readers
 * have quiesced. */
void
-rule_actions_unref(struct rule_actions *actions)
+rule_actions_destroy(struct rule_actions *actions)
{
- if (actions && ovs_refcount_unref(&actions->ref_count) == 1) {
- ovs_refcount_destroy(&actions->ref_count);
- free(actions->ofpacts);
- free(actions);
+ if (actions) {
+ ovsrcu_postpone(rule_actions_destroy_cb, actions);
}
}
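The hunk above replaces reference counting on `rule_actions` with RCU-deferred destruction: `rule_actions_destroy()` no longer frees immediately, it postpones the free until every thread that might still be reading the actions has quiesced. A minimal sketch of that postpone-then-flush pattern follows; the names `rcu_postpone()` and `rcu_quiesce()` are illustrative stand-ins, not OVS's API (the real `ovsrcu_postpone()` tracks per-thread quiescent states rather than using a single global list):

```c
#include <assert.h>
#include <stdlib.h>

/* A deferred-free callback, queued until all RCU readers quiesce. */
struct postponed {
    void (*cb)(void *);
    void *arg;
    struct postponed *next;
};

static struct postponed *postponed_list;
static int frees;

/* Stand-in for ovsrcu_postpone(): queue 'cb(arg)' instead of running it
 * now, so concurrent readers can keep using 'arg' until a grace period. */
static void rcu_postpone(void (*cb)(void *), void *arg)
{
    struct postponed *p = malloc(sizeof *p);
    p->cb = cb;
    p->arg = arg;
    p->next = postponed_list;
    postponed_list = p;
}

/* Stand-in for a grace period: every reader has quiesced, so it is now
 * safe to run the deferred destructors. */
static void rcu_quiesce(void)
{
    while (postponed_list) {
        struct postponed *p = postponed_list;
        postponed_list = p->next;
        p->cb(p->arg);
        free(p);
    }
}

/* Plays the role of rule_actions_destroy_cb(): the real free happens here. */
static void destroy_cb(void *arg)
{
    free(arg);
    frees++;
}
```

The payoff is visible throughout the rest of the diff: readers call `rule_get_actions()` without taking `rule->mutex`, because a postponed block stays valid until they quiesce.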
ofproto_rule_has_out_port(const struct rule *rule, ofp_port_t port)
OVS_REQUIRES(ofproto_mutex)
{
- return (port == OFPP_ANY
- || ofpacts_output_to_port(rule->actions->ofpacts,
- rule->actions->ofpacts_len, port));
+ if (port == OFPP_ANY) {
+ return true;
+ } else {
+ const struct rule_actions *actions = rule_get_actions(rule);
+ return ofpacts_output_to_port(actions->ofpacts,
+ actions->ofpacts_len, port);
+ }
}
/* Returns true if 'rule' has an OpenFlow group action whose group ID is 'group_id'. */
ofproto_rule_has_out_group(const struct rule *rule, uint32_t group_id)
OVS_REQUIRES(ofproto_mutex)
{
- return (group_id == OFPG11_ANY
- || ofpacts_output_to_group(rule->actions->ofpacts,
- rule->actions->ofpacts_len, group_id));
+ if (group_id == OFPG_ANY) {
+ return true;
+ } else {
+ const struct rule_actions *actions = rule_get_actions(rule);
+ return ofpacts_output_to_group(actions->ofpacts,
+ actions->ofpacts_len, group_id);
+ }
}
/* Returns true if a rule related to 'op' has an OpenFlow OFPAT_OUTPUT or
static uint32_t
hash_cookie(ovs_be64 cookie)
{
- return hash_2words((OVS_FORCE uint64_t)cookie >> 32,
- (OVS_FORCE uint64_t)cookie);
+ return hash_uint64((OVS_FORCE uint64_t)cookie);
}
static void
fs.hard_timeout = rule->hard_timeout;
created = rule->created;
modified = rule->modified;
- actions = rule_get_actions__(rule);
+ actions = rule_get_actions(rule);
flags = rule->flags;
ovs_mutex_unlock(&rule->mutex);
fs.flags = flags;
ofputil_append_flow_stats_reply(&fs, &replies);
-
- rule_actions_unref(actions);
}
rule_collection_unref(&rules);
&byte_count, &used);
ovs_mutex_lock(&rule->mutex);
- actions = rule_get_actions__(rule);
+ actions = rule_get_actions(rule);
created = rule->created;
ovs_mutex_unlock(&rule->mutex);
ofpacts_format(actions->ofpacts, actions->ofpacts_len, results);
ds_put_cstr(results, "\n");
-
- rule_actions_unref(actions);
}
/* Adds a pretty-printed description of all flows to 'results', including
*CONST_CAST(uint8_t *, &rule->table_id) = table - ofproto->tables;
rule->flags = fm->flags & OFPUTIL_FF_STATE;
- rule->actions = rule_actions_create(ofproto, fm->ofpacts, fm->ofpacts_len);
+ ovsrcu_set(&rule->actions,
+ rule_actions_create(ofproto, fm->ofpacts, fm->ofpacts_len));
list_init(&rule->meter_list_node);
rule->eviction_group = NULL;
list_init(&rule->expirable);
error = OFPERR_OFPBRC_EPERM;
for (i = 0; i < rules->n; i++) {
struct rule *rule = rules->rules[i];
+ const struct rule_actions *actions;
struct ofoperation *op;
bool actions_changed;
bool reset_counters;
continue;
}
+ actions = rule_get_actions(rule);
actions_changed = !ofpacts_equal(fm->ofpacts, fm->ofpacts_len,
- rule->actions->ofpacts,
- rule->actions->ofpacts_len);
+ actions->ofpacts,
+ actions->ofpacts_len);
op = ofoperation_create(group, rule, type, 0);
if (actions_changed || reset_counters) {
struct rule_actions *new_actions;
- op->actions = rule->actions;
+ op->actions = rule_get_actions(rule);
new_actions = rule_actions_create(ofproto,
fm->ofpacts, fm->ofpacts_len);
- ovs_mutex_lock(&rule->mutex);
- rule->actions = new_actions;
- ovs_mutex_unlock(&rule->mutex);
+ ovsrcu_set(&rule->actions, new_actions);
rule->ofproto->ofproto_class->rule_modify_actions(rule,
reset_counters);
if (!(flags & NXFMF_ACTIONS)) {
actions = NULL;
} else if (!op) {
- actions = rule->actions;
+ actions = rule_get_actions(rule);
} else {
/* An operation is in progress. Use the previous version of the flow's
* actions, so that when the operation commits we report the change. */
case OFOPERATION_MODIFY:
case OFOPERATION_REPLACE:
- actions = op->actions ? op->actions : rule->actions;
+ actions = op->actions ? op->actions : rule_get_actions(rule);
break;
case OFOPERATION_DELETE:
- actions = rule->actions;
+ actions = rule_get_actions(rule);
break;
default:
}
}
+enum ofp_table_config
+table_get_config(const struct ofproto *ofproto, uint8_t table_id)
+{
+ unsigned int value;
+ atomic_read(&ofproto->tables[table_id].config, &value);
+ return (enum ofp_table_config)value;
+}
+
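The new `table_get_config()` above reads the per-table miss configuration with `atomic_read()` so that concurrent writers in `table_mod()` need no lock. The same pattern in plain C11 atomics, with illustrative type and enum names (OVS's real code uses `enum ofp_table_config` and its own atomics wrappers):

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative table-miss configurations, not OVS's actual values. */
enum miss_config { MISS_CONTROLLER, MISS_CONTINUE, MISS_DROP };

struct table {
    atomic_uint config;         /* Written by table_mod(), read lock-free. */
};

/* Mirrors table_get_config() above: an atomic load gives a consistent
 * snapshot of the configuration without taking any mutex. */
static enum miss_config table_get_config(struct table *table)
{
    unsigned int value = atomic_load(&table->config);
    return (enum miss_config) value;
}
```

Storing the config as an atomic integer rather than the enum type itself sidesteps any question of whether the enum's underlying type is lock-free.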
static enum ofperr
table_mod(struct ofproto *ofproto, const struct ofputil_table_mod *tm)
{
- /* XXX Reject all configurations because none are currently supported */
- return OFPERR_OFPTMFC_BAD_CONFIG;
+    /* Only accept currently supported configurations. */
+ if (tm->config & ~OFPTC11_TABLE_MISS_MASK) {
+ return OFPERR_OFPTMFC_BAD_CONFIG;
+ }
if (tm->table_id == OFPTT_ALL) {
int i;
struct rule_actions *old_actions;
ovs_mutex_lock(&rule->mutex);
- old_actions = rule->actions;
- rule->actions = op->actions;
+ old_actions = rule_get_actions(rule);
+ ovsrcu_set(&rule->actions, op->actions);
ovs_mutex_unlock(&rule->mutex);
op->actions = NULL;
- rule_actions_unref(old_actions);
+ rule_actions_destroy(old_actions);
}
rule->flags = op->flags;
}
hmap_remove(&group->ofproto->deletions, &op->hmap_node);
}
list_remove(&op->group_node);
- rule_actions_unref(op->actions);
+ rule_actions_destroy(op->actions);
free(op);
}
oftable_disable_eviction(table);
classifier_destroy(&table->cls);
free(table->name);
- atomic_destroy(&table->config);
}
/* Changes the name of 'table' to 'name'. If 'name' is NULL or the empty
{
struct ofproto *ofproto = rule->ofproto;
struct oftable *table = &ofproto->tables[rule->table_id];
+ struct rule_actions *actions;
bool may_expire;
ovs_mutex_lock(&rule->mutex);
cookies_insert(ofproto, rule);
- if (rule->actions->provider_meter_id != UINT32_MAX) {
- uint32_t meter_id = ofpacts_get_meter(rule->actions->ofpacts,
- rule->actions->ofpacts_len);
+ actions = rule_get_actions(rule);
+ if (actions->provider_meter_id != UINT32_MAX) {
+ uint32_t meter_id = ofpacts_get_meter(actions->ofpacts,
+ actions->ofpacts_len);
struct meter *meter = ofproto->meters[meter_id];
list_insert(&meter->rules, &rule->meter_list_node);
}
bool ofproto_has_vlan_usage_changed(const struct ofproto *);
int ofproto_port_set_realdev(struct ofproto *, ofp_port_t vlandev_ofp_port,
ofp_port_t realdev_ofp_port, int vid);
+\f
+/* Table configuration */
+
+enum ofp_table_config table_get_config(const struct ofproto *,
+ uint8_t table_id);
#ifdef __cplusplus
}
} else if (open_mode == OVSDB_LOG_READ_WRITE) {
flags = O_RDWR;
} else if (open_mode == OVSDB_LOG_CREATE) {
+#ifndef _WIN32
if (stat(name, &s) == -1 && errno == ENOENT
&& lstat(name, &s) == 0 && S_ISLNK(s.st_mode)) {
/* 'name' is a dangling symlink. We want to create the file that
} else {
flags = O_RDWR | O_CREAT | O_EXCL;
}
+#else
+ flags = O_RDWR | O_CREAT | O_EXCL;
+#endif
} else {
OVS_NOT_REACHED();
}
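The `#ifndef _WIN32` guard above exists because Windows has neither `lstat()` nor POSIX symlinks, so the dangling-symlink special case only makes sense on POSIX systems. There, the condition the code tests for can be sketched in isolation (the file names below are arbitrary test fixtures, not anything OVS uses):

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if 'name' is a dangling symlink: the link itself exists
 * (lstat() succeeds and reports a symlink) but its target does not
 * (stat() fails with ENOENT). */
static int is_dangling_symlink(const char *name)
{
    struct stat s;
    return stat(name, &s) == -1 && errno == ENOENT
           && lstat(name, &s) == 0 && S_ISLNK(s.st_mode);
}
```

When the check fires, ovsdb-server creates the file the link points to rather than replacing the link, which is why the surrounding code chooses its `open()` flags based on this result.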
print "\nextern struct ovsdb_idl_column %s_columns[%s_N_COLUMNS];" % (structName, structName.upper())
print '''
+const struct %(s)s *%(s)s_get_for_uuid(const struct ovsdb_idl *, const struct uuid *);
const struct %(s)s *%(s)s_first(const struct ovsdb_idl *);
const struct %(s)s *%(s)s_next(const struct %(s)s *);
#define %(S)s_FOR_EACH(ROW, IDL) \\
# First, next functions.
print '''
+const struct %(s)s *
+%(s)s_get_for_uuid(const struct ovsdb_idl *idl, const struct uuid *uuid)
+{
+ return %(s)s_cast(ovsdb_idl_get_row_for_uuid(idl, &%(p)stable_classes[%(P)sTABLE_%(T)s], uuid));
+}
+
const struct %(s)s *
%(s)s_first(const struct ovsdb_idl *idl)
{
-/* Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+/* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
ds_put_format(out, "%s row, with UUID "UUID_FMT", ",
title, UUID_ARGS(ovsdb_row_get_uuid(row)));
if (!row->txn_row
- || bitmap_scan(row->txn_row->changed, 0, n_columns) == n_columns) {
+ || bitmap_scan(row->txn_row->changed, 1, 0, n_columns) == n_columns) {
ds_put_cstr(out, "existed in the database before this "
"transaction and was not modified by the transaction.");
} else if (!row->txn_row->old) {
* "OVSTunnel", if <name> is an OVS tunnel.
+ * "OVSPatchPort", if <name> is a patch port
+
- OVS_BRIDGE: If TYPE is anything other than "OVSBridge", set to
the name of the OVS bridge to which the port should be attached.
- OVS_TUNNEL_OPTIONS: For "OVSTunnel" interfaces, this field should be
used to specify the tunnel options like remote_ip, key, etc.
+ - OVS_PATCH_PEER: For "OVSPatchPort" devices, this field specifies
+ the patch's peer on the other bridge.
+
Note
----
OVS_TUNNEL_TYPE=gre
OVS_TUNNEL_OPTIONS="options:remote_ip=A.B.C.D"
+
+Patch Ports:
+
+==> ifcfg-patch-ovs-0 <==
+DEVICE=patch-ovs-0
+ONBOOT=yes
+DEVICETYPE=ovs
+TYPE=OVSPatchPort
+OVS_BRIDGE=ovsbridge0
+OVS_PATCH_PEER=patch-ovs-1
+
+==> ifcfg-patch-ovs-1 <==
+DEVICE=patch-ovs-1
+ONBOOT=yes
+DEVICETYPE=ovs
+TYPE=OVSPatchPort
+OVS_BRIDGE=ovsbridge1
+OVS_PATCH_PEER=patch-ovs-0
+
+
Reporting Bugs
--------------
retval=$?
ovs-vsctl -t ${TIMEOUT} -- --if-exists del-port "$OVS_BRIDGE" "$DEVICE"
;;
+ OVSPatchPort)
+ ovs-vsctl -t ${TIMEOUT} -- --if-exists del-port "$OVS_BRIDGE" "$DEVICE"
+ ;;
*)
echo $"Invalid OVS interface type $TYPE"
exit 1
ovs-vsctl -t ${TIMEOUT} -- --may-exist add-port "$OVS_BRIDGE" "$DEVICE" $OVS_OPTIONS -- set Interface "$DEVICE" type=$OVS_TUNNEL_TYPE $OVS_TUNNEL_OPTIONS ${OVS_EXTRA+-- $OVS_EXTRA}
${OTHERSCRIPT} ${CONFIG} ${2}
;;
+ OVSPatchPort)
+ ifup_ovs_bridge
+ ovs-vsctl -t ${TIMEOUT} -- --may-exist add-port "$OVS_BRIDGE" "$DEVICE" $OVS_OPTIONS -- set Interface "$DEVICE" type=patch options:peer="${OVS_PATCH_PEER}" ${OVS_EXTRA+-- $OVS_EXTRA}
+ ;;
*)
echo $"Invalid OVS interface type $TYPE"
exit 1
valgrind_wrappers = \
tests/valgrind/ovs-appctl \
tests/valgrind/ovs-ofctl \
+ tests/valgrind/ovstest \
tests/valgrind/ovs-vsctl \
tests/valgrind/ovs-vswitchd \
tests/valgrind/ovsdb-client \
tests/valgrind/test-file_name \
tests/valgrind/test-flows \
tests/valgrind/test-hash \
- tests/valgrind/test-heap \
tests/valgrind/test-hindex \
tests/valgrind/test-hmap \
tests/valgrind/test-json \
tests_test_hash_SOURCES = tests/test-hash.c
tests_test_hash_LDADD = lib/libopenvswitch.la
-noinst_PROGRAMS += tests/test-heap
-tests_test_heap_SOURCES = tests/test-heap.c
-tests_test_heap_LDADD = lib/libopenvswitch.la
-
noinst_PROGRAMS += tests/test-hindex
tests_test_hindex_SOURCES = tests/test-hindex.c
tests_test_hindex_LDADD = lib/libopenvswitch.la
tests/idltest.c: tests/idltest.h
+noinst_PROGRAMS += tests/ovstest
+tests_ovstest_SOURCES = tests/ovstest.c \
+ tests/ovstest.h \
+ tests/test-heap.c
+tests_ovstest_LDADD = lib/libopenvswitch.la
+
noinst_PROGRAMS += tests/test-reconnect
tests_test_reconnect_SOURCES = tests/test-reconnect.c
tests_test_reconnect_LDADD = lib/libopenvswitch.la
])
])
+m4_define([CFM_CHECK_DB], [
+CFM_VSCTL_LIST_IFACE([$1], [cfm_fault_status], [cfm_fault_status : [[$2]]])
+CFM_VSCTL_LIST_IFACE([$1], [cfm_flap_count], [cfm_flap_count : $3])
+CFM_VSCTL_LIST_IFACE([$1], [cfm_health], [cfm_health : [[$4]]])
+CFM_VSCTL_LIST_IFACE([$1], [cfm_remote_mpids], [cfm_remote_mpids : [[$5]]])
+CFM_VSCTL_LIST_IFACE([$1], [cfm_remote_opstate], [cfm_remote_opstate : $6])
+])
+
+# This test checks the update of cfm status to OVSDB at startup.
+# The cfm status should be updated to OVSDB within 3.5 * cfm_interval.
+AT_SETUP([cfm - check update ovsdb])
+#Create 2 bridges connected by patch ports and enable cfm
+OVS_VSWITCHD_START([add-br br1 -- \
+ set bridge br1 datapath-type=dummy \
+ other-config:hwaddr=aa:55:aa:56:00:00 -- \
+ add-port br1 p1 -- set Interface p1 type=patch \
+ options:peer=p0 -- \
+ add-port br0 p0 -- set Interface p0 type=patch \
+ options:peer=p1 -- \
+ set Interface p0 other_config:cfm_interval=300 other_config:cfm_extended=true -- \
+ set Interface p1 other_config:cfm_interval=300 other_config:cfm_extended=true])
+
+ovs-appctl time/stop
+
+AT_CHECK([ovs-vsctl set Interface p0 cfm_mpid=1])
+# check cfm status update in OVSDB.
+for i in `seq 0 14`; do ovs-appctl time/warp 100; done
+CFM_CHECK_DB([p0], [recv], [1], [], [], [up])
+
+# Enabling cfm on p1 should increment the cfm_flap_count on p0.
+AT_CHECK([ovs-vsctl set interface p1 cfm_mpid=2])
+for i in `seq 0 14`; do ovs-appctl time/warp 100; done
+CFM_CHECK_DB([p0], [], [2], [], [2], [up])
+CFM_CHECK_DB([p1], [], [0], [], [1], [up])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
# test cfm under demand mode.
AT_SETUP([cfm - demand mode])
#Create 2 bridges connected by patch ports and enable cfm
m4_define([TEST_HEAP],
[AT_SETUP([heap library -- m4_bpatsubst([$1], [-], [ ])])
- AT_CHECK([test-heap $1])
+ AT_CHECK([ovstest test-heap $1])
AT_CLEANUP])
TEST_HEAP([insert-delete-same-order])
actions=learn(table=1,idle_timeout=10, hard_timeout=20, fin_idle_timeout=5, fin_hard_timeout=10, priority=10, cookie=0xfedcba9876543210, in_port=99,NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_OF_IN_PORT[]->NXM_NX_REG1[16..31])
]])
AT_CHECK([ovs-ofctl parse-flows flows.txt], [0],
-[[usable protocols: any
+[[usable protocols: any,OXM-OpenFlow14
chosen protocol: OpenFlow10-table_id
OFPT_FLOW_MOD (xid=0x1): ADD actions=learn(table=1)
OFPT_FLOW_MOD (xid=0x2): ADD actions=learn(table=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],output:NXM_OF_IN_PORT[],load:0xa->NXM_NX_REG0[5..10])
actions=learn(table=1, in_port=1, load:OXM_OF_IN_PORT[]->NXM_NX_REG1[], load:0xfffffffe->OXM_OF_IN_PORT[])
]])
AT_CHECK([ovs-ofctl -O OpenFlow12 parse-flows flows.txt], [0],
-[[usable protocols: any
+[[usable protocols: any,OXM-OpenFlow14
chosen protocol: OXM-OpenFlow12
OFPT_FLOW_MOD (OF1.2) (xid=0x1): ADD actions=learn(table=1,output:OXM_OF_IN_PORT[])
OFPT_FLOW_MOD (OF1.2) (xid=0x2): ADD actions=learn(table=1,in_port=1,load:OXM_OF_IN_PORT[]->NXM_NX_REG1[],load:0xfffffffe->OXM_OF_IN_PORT[])
table=1 priority=0 actions=flood
]])
AT_CHECK([ovs-ofctl parse-flows flows.txt], [0],
-[[usable protocols: OXM,OpenFlow10+table_id,NXM+table_id,OpenFlow11
+[[usable protocols: OXM,OpenFlow10+table_id,NXM+table_id,OpenFlow11,OXM-OpenFlow14
chosen protocol: OpenFlow10+table_id
OFPT_FLOW_MOD (xid=0x1): ADD table:255 actions=learn(table=1,in_port=99,NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_OF_IN_PORT[]->NXM_NX_REG1[16..31])
OFPT_FLOW_MOD (xid=0x2): ADD table:255 actions=learn(table=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],output:NXM_OF_IN_PORT[])
ip,actions=learn(eth_type=0x800,OXM_OF_IPV4_DST[])
]])
AT_CHECK([ovs-ofctl parse-flows flows.txt], [0],
-[[usable protocols: any
+[[usable protocols: any,OXM-OpenFlow14
chosen protocol: OpenFlow10-table_id
OFPT_FLOW_MOD (xid=0x1): ADD actions=learn(table=1,eth_type=0x800,load:0x5->NXM_OF_IP_DST[])
OFPT_FLOW_MOD (xid=0x2): ADD ip actions=learn(table=1,load:NXM_OF_IP_DST[]->NXM_NX_REG1[])
port 3: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
tx pkts=9, bytes=540, drop=0, errs=0, coll=0
])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+# This test is much like the previous, but adds idle timeouts and sends
+# two different flows to the bridge. This tests that the statistics are
+# attributed correctly.
+AT_SETUP([learning action - self-modifying flow with idle_timeout])
+OVS_VSWITCHD_START
+ADD_OF_PORTS([br0], 1, 2, 3)
+
+ovs-appctl time/stop
+# Set up flow table for TCPv4 port learning.
+AT_CHECK([[ovs-ofctl add-flow br0 'actions=load:3->NXM_NX_REG0[0..15],learn(table=0,idle_timeout=5,priority=65535,NXM_OF_ETH_SRC[],NXM_OF_VLAN_TCI[0..11],output:NXM_NX_REG0[0..15]),output:2']])
+
+# Trace some packets arriving. The particular packets don't matter.
+for i in `seq 1 10`; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
+ ovs-appctl time/warp 10
+ if [[ $i -eq 1 ]]; then
+ sleep 1
+ fi
+done
+
+# Trace some packets arriving. This is a different flow from the previous.
+# Note that we advance time by 1 second between each packet here.
+for i in `seq 1 10`; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
+ ovs-appctl time/warp 1000
+ if [[ $i -eq 1 ]]; then
+ sleep 1
+ fi
+done
+
+# Check that the first packet of each flow went out port 2 and the rest out
+# port 3.
+AT_CHECK(
+ [(ovs-ofctl dump-ports br0 2; ovs-ofctl dump-ports br0 3) | STRIP_XIDS], [0],
+ [OFPST_PORT reply: 1 ports
+ port 2: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
+ tx pkts=2, bytes=120, drop=0, errs=0, coll=0
+OFPST_PORT reply: 1 ports
+ port 3: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
+ tx pkts=18, bytes=1080, drop=0, errs=0, coll=0
+])
+
+# Check for the learning entry.
+ovs-appctl time/warp 1000
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0],
+[[ n_packets=2, n_bytes=120, actions=load:0x3->NXM_NX_REG0[0..15],learn(table=0,idle_timeout=5,priority=65535,NXM_OF_ETH_SRC[],NXM_OF_VLAN_TCI[0..11],output:NXM_NX_REG0[0..15]),output:2
+ n_packets=9, n_bytes=540, idle_timeout=5, priority=65535,vlan_tci=0x0000/0x0fff,dl_src=50:54:00:00:00:06 actions=output:3
+NXST_FLOW reply:
+]])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+# This test is much like the previous, but adds hard timeouts and sends
+# two different flows to the bridge. This tests that the statistics are
+# attributed correctly.
+AT_SETUP([learning action - self-modifying flow with hard_timeout])
+OVS_VSWITCHD_START
+ADD_OF_PORTS([br0], 1, 2, 3)
+
+ovs-appctl time/stop
+# Set up flow table for TCPv4 port learning.
+AT_CHECK([[ovs-ofctl add-flow br0 'actions=load:3->NXM_NX_REG0[0..15],learn(table=0,hard_timeout=10,priority=65535,NXM_OF_ETH_SRC[],NXM_OF_VLAN_TCI[0..11],output:NXM_NX_REG0[0..15]),output:2']])
+
+# Trace some packets arriving. The particular packets don't matter.
+for i in `seq 1 10`; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
+ if [[ $i -eq 1 ]]; then
+ sleep 1
+ fi
+ ovs-appctl time/warp 10
+done
+
+# Trace some packets arriving. This is a different flow from the previous.
+# Note that we advance time by 2 seconds between each packet here.
+for i in `seq 1 10`; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
+    # Note: the hard_timeout should fire immediately after packet #6.
+    # Packet #7 re-installs the flow, and the following 3 packets
+    # (#8, #9, #10) use the flow.
+    # It is difficult to predict the exact timing of rule expiry
+    # because it is affected by the flow dumper thread via udpif_dump_seq.
+    # The hard_timeout value for this test was chosen to overcome the uncertainty.
+ if [[ $i -eq 1 -o $i -eq 6 -o $i -eq 7 ]]; then
+ sleep 1
+ fi
+ ovs-appctl time/warp 2000
+done
+
+# Check that the first packet of each flow went out port 2 and the rest out
+# port 3.
+AT_CHECK(
+ [(ovs-ofctl dump-ports br0 2; ovs-ofctl dump-ports br0 3) | STRIP_XIDS], [0],
+ [OFPST_PORT reply: 1 ports
+ port 2: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
+ tx pkts=3, bytes=180, drop=0, errs=0, coll=0
+OFPST_PORT reply: 1 ports
+ port 3: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
+ tx pkts=17, bytes=1020, drop=0, errs=0, coll=0
+])
+
+# Check for the learning entry.
+ovs-appctl time/warp 1000
+sleep 1
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0],
+[[ n_packets=3, n_bytes=180, actions=load:0x3->NXM_NX_REG0[0..15],learn(table=0,hard_timeout=10,priority=65535,NXM_OF_ETH_SRC[],NXM_OF_VLAN_TCI[0..11],output:NXM_NX_REG0[0..15]),output:2
+ n_packets=3, n_bytes=180, hard_timeout=10, priority=65535,vlan_tci=0x0000/0x0fff,dl_src=50:54:00:00:00:06 actions=output:3
+NXST_FLOW reply:
+]])
OVS_VSWITCHD_STOP
AT_CLEANUP
OpenFlow 1.1: vendor 0, type 3, code 5
OpenFlow 1.2: vendor 0, type 3, code 5
OpenFlow 1.3: vendor 0, type 3, code 5
+OpenFlow 1.4: vendor 0, type 3, code 5
])
AT_CHECK([ovs-ofctl print-error OFPBIC_BAD_EXP_TYPE], [0], [dnl
OpenFlow 1.1: vendor 0, type 3, code 5
OpenFlow 1.2: vendor 0, type 3, code 6
OpenFlow 1.3: vendor 0, type 3, code 6
+OpenFlow 1.4: vendor 0, type 3, code 6
])
AT_CLEANUP
])
AT_CLEANUP
+AT_SETUP([OFPST_TABLE_FEATURES request - OF1.3])
+AT_KEYWORDS([ofp-print OFPT_STATS_REQUEST])
+AT_CHECK([ovs-ofctl ofp-print "\
+04 13 09 40 00 00 00 d5 00 0c 00 01 00 00 00 00 \
+09 30 00 00 00 00 00 00 74 61 62 6c 65 30 00 00 \
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 \
+00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff \
+ff ff ff ff ff ff ff ff 00 00 00 03 00 0f 42 40 \
+00 00 00 2c 00 01 00 08 00 00 00 00 00 02 00 08 \
+00 00 00 00 00 03 00 08 00 00 00 00 00 04 00 08 \
+00 00 00 00 00 05 00 08 00 00 00 00 00 00 00 00 \
+00 01 00 2c 00 01 00 08 00 00 00 00 00 02 00 08 \
+00 00 00 00 00 03 00 08 00 00 00 00 00 04 00 08 \
+00 00 00 00 00 05 00 08 00 00 00 00 00 00 00 00 \
+00 02 01 01 01 02 03 04 05 06 07 08 09 0a 0b 0c \
+0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c \
+1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c \
+2d 2e 2f 30 31 32 33 34 35 36 37 38 39 3a 3b 3c \
+3d 3e 3f 40 41 42 43 44 45 46 47 48 49 4a 4b 4c \
+4d 4e 4f 50 51 52 53 54 55 56 57 58 59 5a 5b 5c \
+5d 5e 5f 60 61 62 63 64 65 66 67 68 69 6a 6b 6c \
+6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 7b 7c \
+7d 7e 7f 80 81 82 83 84 85 86 87 88 89 8a 8b 8c \
+8d 8e 8f 90 91 92 93 94 95 96 97 98 99 9a 9b 9c \
+9d 9e 9f a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac \
+ad ae af b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc \
+bd be bf c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc \
+cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc \
+dd de df e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec \
+ed ee ef f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc \
+fd 00 00 00 00 00 00 00 00 03 01 01 01 02 03 04 \
+05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 \
+15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 \
+25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 \
+35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 42 43 44 \
+45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 \
+55 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 62 63 64 \
+65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 \
+75 76 77 78 79 7a 7b 7c 7d 7e 7f 80 81 82 83 84 \
+85 86 87 88 89 8a 8b 8c 8d 8e 8f 90 91 92 93 94 \
+95 96 97 98 99 9a 9b 9c 9d 9e 9f a0 a1 a2 a3 a4 \
+a5 a6 a7 a8 a9 aa ab ac ad ae af b0 b1 b2 b3 b4 \
+b5 b6 b7 b8 b9 ba bb bc bd be bf c0 c1 c2 c3 c4 \
+c5 c6 c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3 d4 \
+d5 d6 d7 d8 d9 da db dc dd de df e0 e1 e2 e3 e4 \
+e5 e6 e7 e8 e9 ea eb ec ed ee ef f0 f1 f2 f3 f4 \
+f5 f6 f7 f8 f9 fa fb fc fd 00 00 00 00 00 00 00 \
+00 04 00 84 00 00 00 08 00 00 00 00 00 0b 00 08 \
+00 00 00 00 00 0c 00 08 00 00 00 00 00 0f 00 08 \
+00 00 00 00 00 10 00 08 00 00 00 00 00 11 00 08 \
+00 00 00 00 00 12 00 08 00 00 00 00 00 13 00 08 \
+00 00 00 00 00 14 00 08 00 00 00 00 00 15 00 08 \
+00 00 00 00 00 16 00 08 00 00 00 00 00 17 00 08 \
+00 00 00 00 00 18 00 08 00 00 00 00 00 19 00 08 \
+00 00 00 00 00 1a 00 08 00 00 00 00 00 1b 00 08 \
+00 00 00 00 00 00 00 00 00 05 00 84 00 00 00 08 \
+00 00 00 00 00 0b 00 08 00 00 00 00 00 0c 00 08 \
+00 00 00 00 00 0f 00 08 00 00 00 00 00 10 00 08 \
+00 00 00 00 00 11 00 08 00 00 00 00 00 12 00 08 \
+00 00 00 00 00 13 00 08 00 00 00 00 00 14 00 08 \
+00 00 00 00 00 15 00 08 00 00 00 00 00 16 00 08 \
+00 00 00 00 00 17 00 08 00 00 00 00 00 18 00 08 \
+00 00 00 00 00 19 00 08 00 00 00 00 00 1a 00 08 \
+00 00 00 00 00 1b 00 08 00 00 00 00 00 00 00 00 \
+00 06 00 84 00 00 00 08 00 00 00 00 00 0b 00 08 \
+00 00 00 00 00 0c 00 08 00 00 00 00 00 0f 00 08 \
+00 00 00 00 00 10 00 08 00 00 00 00 00 11 00 08 \
+00 00 00 00 00 12 00 08 00 00 00 00 00 13 00 08 \
+00 00 00 00 00 14 00 08 00 00 00 00 00 15 00 08 \
+00 00 00 00 00 16 00 08 00 00 00 00 00 17 00 08 \
+00 00 00 00 00 18 00 08 00 00 00 00 00 19 00 08 \
+00 00 00 00 00 1a 00 08 00 00 00 00 00 1b 00 08 \
+00 00 00 00 00 00 00 00 00 07 00 84 00 00 00 08 \
+00 00 00 00 00 0b 00 08 00 00 00 00 00 0c 00 08 \
+00 00 00 00 00 0f 00 08 00 00 00 00 00 10 00 08 \
+00 00 00 00 00 11 00 08 00 00 00 00 00 12 00 08 \
+00 00 00 00 00 13 00 08 00 00 00 00 00 14 00 08 \
+00 00 00 00 00 15 00 08 00 00 00 00 00 16 00 08 \
+00 00 00 00 00 17 00 08 00 00 00 00 00 18 00 08 \
+00 00 00 00 00 19 00 08 00 00 00 00 00 1a 00 08 \
+00 00 00 00 00 1b 00 08 00 00 00 00 00 00 00 00 \
+00 08 00 dc 80 00 4c 08 00 01 3e 04 00 01 40 04 \
+80 00 04 08 00 00 00 02 80 00 00 04 00 01 42 04 \
+00 01 00 04 00 01 02 04 00 01 04 04 00 01 06 04 \
+00 01 08 04 00 01 0a 04 00 01 0c 04 00 01 0e 04 \
+80 00 08 06 80 00 06 06 80 00 0a 02 00 00 08 02 \
+80 00 0c 02 80 00 0e 01 80 00 44 04 80 00 46 01 \
+80 00 48 01 80 00 16 04 80 00 18 04 80 00 34 10 \
+80 00 36 10 80 00 38 04 80 00 14 01 00 00 0a 01 \
+80 00 10 01 80 00 12 01 00 01 3a 01 00 01 34 01 \
+80 00 2a 02 80 00 2c 04 80 00 2e 04 80 00 30 06 \
+80 00 32 06 80 00 1a 02 80 00 1c 02 00 01 44 02 \
+80 00 1e 02 80 00 20 02 80 00 22 02 80 00 24 02 \
+80 00 26 01 80 00 28 01 80 00 3a 01 80 00 3c 01 \
+80 00 3e 10 80 00 40 06 80 00 42 06 00 00 00 00 \
+00 0a 00 dc 80 00 4c 08 00 01 3e 04 00 01 40 04 \
+80 00 04 08 00 00 00 02 80 00 00 04 00 01 42 04 \
+00 01 00 04 00 01 02 04 00 01 04 04 00 01 06 04 \
+00 01 08 04 00 01 0a 04 00 01 0c 04 00 01 0e 04 \
+80 00 08 06 80 00 06 06 80 00 0a 02 00 00 08 02 \
+80 00 0c 02 80 00 0e 01 80 00 44 04 80 00 46 01 \
+80 00 48 01 80 00 16 04 80 00 18 04 80 00 34 10 \
+80 00 36 10 80 00 38 04 80 00 14 01 00 00 0a 01 \
+80 00 10 01 80 00 12 01 00 01 3a 01 00 01 34 01 \
+80 00 2a 02 80 00 2c 04 80 00 2e 04 80 00 30 06 \
+80 00 32 06 80 00 1a 02 80 00 1c 02 00 01 44 02 \
+80 00 1e 02 80 00 20 02 80 00 22 02 80 00 24 02 \
+80 00 26 01 80 00 28 01 80 00 3a 01 80 00 3c 01 \
+80 00 3e 10 80 00 40 06 80 00 42 06 00 00 00 00 \
+00 0c 00 a8 80 00 4c 08 00 01 3e 04 00 01 40 04 \
+80 00 04 08 00 00 00 02 80 00 00 04 00 01 42 04 \
+00 01 00 04 00 01 02 04 00 01 04 04 00 01 06 04 \
+00 01 08 04 00 01 0a 04 00 01 0c 04 00 01 0e 04 \
+80 00 08 06 80 00 06 06 00 00 08 02 80 00 0c 02 \
+80 00 0e 01 80 00 44 04 80 00 46 01 80 00 16 04 \
+80 00 18 04 80 00 34 10 80 00 36 10 00 00 0a 01 \
+80 00 10 01 80 00 12 01 00 01 3a 01 80 00 2a 02 \
+80 00 2c 04 80 00 2e 04 80 00 30 06 80 00 32 06 \
+80 00 1a 02 80 00 1c 02 80 00 1e 02 80 00 20 02 \
+80 00 22 02 80 00 24 02 00 0d 00 a8 80 00 4c 08 \
+00 01 3e 04 00 01 40 04 80 00 04 08 00 00 00 02 \
+80 00 00 04 00 01 42 04 00 01 00 04 00 01 02 04 \
+00 01 04 04 00 01 06 04 00 01 08 04 00 01 0a 04 \
+00 01 0c 04 00 01 0e 04 80 00 08 06 80 00 06 06 \
+00 00 08 02 80 00 0c 02 80 00 0e 01 80 00 44 04 \
+80 00 46 01 80 00 16 04 80 00 18 04 80 00 34 10 \
+80 00 36 10 00 00 0a 01 80 00 10 01 80 00 12 01 \
+00 01 3a 01 80 00 2a 02 80 00 2c 04 80 00 2e 04 \
+80 00 30 06 80 00 32 06 80 00 1a 02 80 00 1c 02 \
+80 00 1e 02 80 00 20 02 80 00 22 02 80 00 24 02 \
+00 0e 00 a8 80 00 4c 08 00 01 3e 04 00 01 40 04 \
+80 00 04 08 00 00 00 02 80 00 00 04 00 01 42 04 \
+00 01 00 04 00 01 02 04 00 01 04 04 00 01 06 04 \
+00 01 08 04 00 01 0a 04 00 01 0c 04 00 01 0e 04 \
+80 00 08 06 80 00 06 06 00 00 08 02 80 00 0c 02 \
+80 00 0e 01 80 00 44 04 80 00 46 01 80 00 16 04 \
+80 00 18 04 80 00 34 10 80 00 36 10 00 00 0a 01 \
+80 00 10 01 80 00 12 01 00 01 3a 01 80 00 2a 02 \
+80 00 2c 04 80 00 2e 04 80 00 30 06 80 00 32 06 \
+80 00 1a 02 80 00 1c 02 80 00 1e 02 80 00 20 02 \
+80 00 22 02 80 00 24 02 00 0f 00 a8 80 00 4c 08 \
+00 01 3e 04 00 01 40 04 80 00 04 08 00 00 00 02 \
+80 00 00 04 00 01 42 04 00 01 00 04 00 01 02 04 \
+00 01 04 04 00 01 06 04 00 01 08 04 00 01 0a 04 \
+00 01 0c 04 00 01 0e 04 80 00 08 06 80 00 06 06 \
+00 00 08 02 80 00 0c 02 80 00 0e 01 80 00 44 04 \
+80 00 46 01 80 00 16 04 80 00 18 04 80 00 34 10 \
+80 00 36 10 00 00 0a 01 80 00 10 01 80 00 12 01 \
+00 01 3a 01 80 00 2a 02 80 00 2c 04 80 00 2e 04 \
+80 00 30 06 80 00 32 06 80 00 1a 02 80 00 1c 02 \
+80 00 1e 02 80 00 20 02 80 00 22 02 80 00 24 02 \
+"], [0], [OFPST_TABLE_FEATURES reply (OF1.3) (xid=0xd5):
+ table 0:
+ name="table0"
+ metadata: match=0xffffffffffffffff write=0xffffffffffffffff
+ config=Unknown
+ max_entries=1000000
+ instructions (table miss and others):
+ next tables: 1-253
+ instructions: apply_actions,clear_actions,write_actions,write_metadata,goto_table
+ Write-Actions and Apply-Actions features:
+ actions: output,copy_ttl_out,copy_ttl_in,set_mpls_ttl,dec_mpls_ttl,push_vlan,pop_vlan,push_mpls,pop_mpls,set_queue,group,set_nw_ttl,dec_nw_ttl,set_field,push_pbb,pop_pbb
+ supported on Set-Field: tun_id,tun_src,tun_dst,metadata,in_port,in_port_oxm,pkt_mark,reg0,reg1,reg2,reg3,reg4,reg5,reg6,reg7,eth_src,eth_dst,vlan_tci,vlan_vid,vlan_pcp,mpls_label,mpls_tc,ip_src,ip_dst,ipv6_src,ipv6_dst,nw_tos,ip_dscp,nw_ecn,nw_ttl,arp_op,arp_spa,arp_tpa,arp_sha,arp_tha,tcp_src,tcp_dst,udp_src,udp_dst,sctp_src,sctp_dst
+ matching:
+ tun_id: exact match or wildcard
+ tun_src: exact match or wildcard
+ tun_dst: exact match or wildcard
+ metadata: exact match or wildcard
+ in_port: exact match or wildcard
+ in_port_oxm: exact match or wildcard
+ pkt_mark: exact match or wildcard
+ reg0: exact match or wildcard
+ reg1: exact match or wildcard
+ reg2: exact match or wildcard
+ reg3: exact match or wildcard
+ reg4: exact match or wildcard
+ reg5: exact match or wildcard
+ reg6: exact match or wildcard
+ reg7: exact match or wildcard
+ eth_src: exact match or wildcard
+ eth_dst: exact match or wildcard
+ eth_type: exact match or wildcard
+ vlan_tci: exact match or wildcard
+ vlan_vid: exact match or wildcard
+ vlan_pcp: exact match or wildcard
+ mpls_label: exact match or wildcard
+ mpls_tc: exact match or wildcard
+ mpls_bos: exact match or wildcard
+ ip_src: exact match or wildcard
+ ip_dst: exact match or wildcard
+ ipv6_src: exact match or wildcard
+ ipv6_dst: exact match or wildcard
+ ipv6_label: exact match or wildcard
+ nw_proto: exact match or wildcard
+ nw_tos: exact match or wildcard
+ ip_dscp: exact match or wildcard
+ nw_ecn: exact match or wildcard
+ nw_ttl: exact match or wildcard
+ ip_frag: exact match or wildcard
+ arp_op: exact match or wildcard
+ arp_spa: exact match or wildcard
+ arp_tpa: exact match or wildcard
+ arp_sha: exact match or wildcard
+ arp_tha: exact match or wildcard
+ tcp_src: exact match or wildcard
+ tcp_dst: exact match or wildcard
+ tcp_flags: exact match or wildcard
+ udp_src: exact match or wildcard
+ udp_dst: exact match or wildcard
+ sctp_src: exact match or wildcard
+ sctp_dst: exact match or wildcard
+ icmp_type: exact match or wildcard
+ icmp_code: exact match or wildcard
+ icmpv6_type: exact match or wildcard
+ icmpv6_code: exact match or wildcard
+ nd_target: exact match or wildcard
+ nd_sll: exact match or wildcard
+ nd_tll: exact match or wildcard
+])
+AT_CLEANUP
+
AT_SETUP([OFPT_BARRIER_REQUEST - OF1.0])
AT_KEYWORDS([ofp-print])
AT_CHECK([ovs-ofctl ofp-print '01 12 00 08 00 00 00 01'], [0], [dnl
OVS_VSWITCHD_STOP
AT_CLEANUP
+AT_SETUP([ofproto-dpif - Table Miss - OFPTC_TABLE_MISS_CONTROLLER])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+
+AT_CHECK([ovs-ofctl monitor br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+NXST_FLOW reply:
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - goto table and OFPTC_TABLE_MISS_CONTROLLER])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl -OOpenFlow12 add-flow br0 'table=0 actions=goto_table(1)'])
+
+AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl -OOpenFlow12 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=3, n_bytes=180, actions=goto_table:1
+OFPST_FLOW reply (OF1.2):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - resubmit and OFPTC_TABLE_MISS_CONTROLLER])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl -OOpenFlow12 add-flow br0 'table=0 actions=resubmit(1,1)'])
+
+AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl -OOpenFlow12 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=3, n_bytes=180, actions=resubmit(1,1)
+OFPST_FLOW reply (OF1.2):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - OFPTC_TABLE_MISS_CONTINUE])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl add-flow br0 'table=1 dl_src=10:11:11:11:11:11 actions=controller'])
+AT_CHECK([ovs-ofctl -OOpenFlow11 mod-table br0 all continue])
+
+dnl Miss table 0, Hit table 1
+AT_CHECK([ovs-ofctl monitor br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=10:11:11:11:11:11,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+])
+
+dnl Hit table 0, Miss all other tables, sent to controller
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): table_id=253 cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=253 cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=253 cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl -OOpenFlow12 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ table=1, n_packets=3, n_bytes=180, dl_src=10:11:11:11:11:11 actions=CONTROLLER:65535
+OFPST_FLOW reply (OF1.2):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - goto table and OFPTC_TABLE_MISS_CONTINUE])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_DATA([flows.txt], [dnl
+table=0 actions=goto_table(1)
+table=2 dl_src=10:11:11:11:11:11 actions=controller
+])
+AT_CHECK([ovs-ofctl -OOpenFlow12 add-flows br0 flows.txt])
+AT_CHECK([ovs-ofctl -OOpenFlow11 mod-table br0 all continue])
+
+dnl Hit table 0, Miss table 1, Hit table 2
+AT_CHECK([ovs-ofctl monitor br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=10:11:11:11:11:11,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): table_id=2 cookie=0x0 total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=2 cookie=0x0 total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=2 cookie=0x0 total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+])
+
+dnl Hit table 1, Miss all other tables, sent to controller
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): table_id=253 cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=253 cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+dnl
+NXT_PACKET_IN (xid=0x0): table_id=253 cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9,tcp_flags=0x010 tcp_csum:0
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl -OOpenFlow12 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=6, n_bytes=360, actions=goto_table:1
+ table=2, n_packets=3, n_bytes=180, dl_src=10:11:11:11:11:11 actions=CONTROLLER:65535
+OFPST_FLOW reply (OF1.2):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - resubmit and OFPTC_TABLE_MISS_CONTINUE])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_DATA([flows.txt], [dnl
+table=0 actions=resubmit(1,1)
+table=2 dl_src=10:11:11:11:11:11 actions=controller
+])
+AT_CHECK([ovs-ofctl -OOpenFlow12 add-flows br0 flows.txt])
+AT_CHECK([ovs-ofctl -OOpenFlow11 mod-table br0 all continue])
+
+dnl Hit table 0, Miss table 1, Dropped
+AT_CHECK([ovs-ofctl monitor br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=10:11:11:11:11:11,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+])
+
+dnl Hit table 1, Dropped
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl -OOpenFlow12 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=6, n_bytes=360, actions=resubmit(1,1)
+ table=2, dl_src=10:11:11:11:11:11 actions=CONTROLLER:65535
+OFPST_FLOW reply (OF1.2):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - OFPTC_TABLE_MISS_DROP])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl -OOpenFlow11 mod-table br0 all drop])
+
+AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+dnl Test that missed packets are dropped
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+NXST_FLOW reply:
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - goto table and OFPTC_TABLE_MISS_DROP])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl -OOpenFlow12 add-flow br0 'table=0 actions=goto_table(1)'])
+AT_CHECK([ovs-ofctl -OOpenFlow11 mod-table br0 all drop])
+
+AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+dnl Test that missed packets are dropped
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl -OOpenFlow12 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=3, n_bytes=180, actions=goto_table:1
+OFPST_FLOW reply (OF1.2):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - Table Miss - resubmit and OFPTC_TABLE_MISS_DROP])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl -OOpenFlow12 add-flow br0 'table=0 actions=resubmit(1,1)'])
+AT_CHECK([ovs-ofctl -OOpenFlow11 mod-table br0 all drop])
+
+AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+dnl Test that missed packets are dropped
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9),tcp_flags(0x010)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl -OOpenFlow12 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=3, n_bytes=180, actions=resubmit(1,1)
+OFPST_FLOW reply (OF1.2):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
AT_SETUP([ofproto-dpif - controller])
OVS_VSWITCHD_START([dnl
add-port br0 p1 -- set Interface p1 type=dummy
OVS_VSWITCHD_STOP
AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - table-miss flow (OpenFlow 1.0)])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+# A table-miss flow has priority 0 and no match
+AT_CHECK([ovs-ofctl --protocols=OpenFlow10 add-flow br0 'priority=0 actions=output:CONTROLLER'])
+
+dnl Singleton controller action.
+AT_CHECK([ovs-ofctl monitor -P openflow10 --protocols=OpenFlow10 br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=10:11:11:11:11:11,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=10),tcp_flags(0x002)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10,tcp_flags=0x002 tcp_csum:0
+dnl
+OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10,tcp_flags=0x002 tcp_csum:0
+dnl
+OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via action) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10,tcp_flags=0x002 tcp_csum:0
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+
+AT_CHECK([ovs-ofctl --protocols=OpenFlow10 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=3, n_bytes=180, priority=0 actions=CONTROLLER:65535
+NXST_FLOW reply:
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+
+AT_SETUP([ofproto-dpif - table-miss flow (OpenFlow 1.3)])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+# A table-miss flow has priority 0 and no match
+AT_CHECK([ovs-ofctl --protocols=OpenFlow13 add-flow br0 'priority=0 actions=output:CONTROLLER'])
+
+dnl Singleton controller action.
+AT_CHECK([ovs-ofctl monitor -P openflow10 --protocols=OpenFlow13 br0 65534 --detach --no-chdir --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3 ; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=10:11:11:11:11:11,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=10),tcp_flags(0x002)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+OFPT_PACKET_IN (OF1.3) (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10,tcp_flags=0x002 tcp_csum:0
+dnl
+OFPT_PACKET_IN (OF1.3) (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10,tcp_flags=0x002 tcp_csum:0
+dnl
+OFPT_PACKET_IN (OF1.3) (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered)
+tcp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10,tcp_flags=0x002 tcp_csum:0
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+
+AT_CHECK([ovs-ofctl --protocols=OpenFlow13 dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ n_packets=3, n_bytes=180, priority=0 actions=CONTROLLER:65535
+OFPST_FLOW reply (OF1.3):
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
AT_SETUP([ofproto-dpif - ARP modification slow-path])
OVS_VSWITCHD_START
ADD_OF_PORTS([br0], [1], [2])
# Check the packets that were output.
AT_CHECK([ovs-ofctl parse-pcap p2.pcap], [0], [dnl
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.0.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=40:44:44:44:44:41
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.0.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=40:44:44:44:44:41
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.0.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
-arp,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=40:44:44:44:44:41
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.0.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=40:44:44:44:44:41
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.0.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=40:44:44:44:44:41
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.0.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=00:00:00:00:00:00
+arp,metadata=0,in_port=ANY,vlan_tci=0x0000,dl_src=80:88:88:88:88:88,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.128.1,arp_tpa=192.168.0.2,arp_op=2,arp_sha=50:54:00:00:00:05,arp_tha=40:44:44:44:44:41
])
OVS_VSWITCHD_STOP
OVS_VSWITCHD_STOP
AT_CLEANUP
-AT_SETUP([ofproto - hard limits on flow table size (OpenFLow 1.0)])
+AT_SETUP([ofproto - hard limits on flow table size (OpenFlow 1.0)])
OVS_VSWITCHD_START
# Configure a maximum of 4 flows.
AT_CHECK(
OVS_VSWITCHD_STOP
AT_CLEANUP
-AT_SETUP([ofproto - hard limits on flow table size (OpenFLow 1.2)])
+AT_SETUP([ofproto - hard limits on flow table size (OpenFlow 1.2)])
OVS_VSWITCHD_START
# Configure a maximum of 4 flows.
AT_CHECK(
OVS_VSWITCHD_STOP
AT_CLEANUP
+AT_SETUP([ofproto - eviction upon table overflow, with modified hard timeout])
+OVS_VSWITCHD_START
+# Configure a maximum of 4 flows.
+AT_CHECK(
+ [ovs-vsctl \
+ -- --id=@t0 create Flow_Table flow-limit=4 overflow-policy=evict \
+ -- set bridge br0 flow_tables:0=@t0 \
+ | ${PERL} $srcdir/uuidfilt.pl],
+ [0], [<0>
+])
+# Add 4 flows.
+for in_port in 4 3 2 1; do
+ ovs-ofctl add-flow br0 hard_timeout=1${in_port},in_port=$in_port,actions=drop
+done
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ hard_timeout=11, in_port=1 actions=drop
+ hard_timeout=12, in_port=2 actions=drop
+ hard_timeout=13, in_port=3 actions=drop
+ hard_timeout=14, in_port=4 actions=drop
+NXST_FLOW reply:
+])
+# Sleep, then modify the flow that expires soonest to refresh its timeout.
+sleep 2
+AT_CHECK([ovs-ofctl mod-flows br0 in_port=1,actions=drop])
+sleep 2
+# Adding another flow will cause the one that expires soonest to be evicted.
+AT_CHECK([ovs-ofctl add-flow br0 in_port=5,actions=drop])
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ hard_timeout=11, in_port=1 actions=drop
+ hard_timeout=13, in_port=3 actions=drop
+ hard_timeout=14, in_port=4 actions=drop
+ in_port=5 actions=drop
+NXST_FLOW reply:
+])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto - eviction upon table overflow, with modified idle timeout])
+OVS_VSWITCHD_START([add-port br0 p1 -- set interface p1 type=dummy ofport_request=1])
+# Configure a maximum of 4 flows.
+AT_CHECK(
+ [ovs-vsctl \
+ -- --id=@t0 create Flow_Table flow-limit=4 overflow-policy=evict \
+ -- set bridge br0 flow_tables:0=@t0 \
+ | ${PERL} $srcdir/uuidfilt.pl],
+ [0], [<0>
+])
+# Add 4 flows.
+for in_port in 4 3 2 1; do
+ ovs-ofctl add-flow br0 idle_timeout=1${in_port},in_port=$in_port,actions=drop
+done
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ idle_timeout=11, in_port=1 actions=drop
+ idle_timeout=12, in_port=2 actions=drop
+ idle_timeout=13, in_port=3 actions=drop
+ idle_timeout=14, in_port=4 actions=drop
+NXST_FLOW reply:
+])
+# Sleep, then receive a packet on the flow that expires soonest to refresh its idle timeout.
+sleep 2
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 'in_port(1)'])
+sleep 2
+# Adding another flow will cause the one that expires soonest to be evicted.
+AT_CHECK([ovs-ofctl add-flow br0 in_port=5,actions=drop])
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ idle_timeout=13, in_port=3 actions=drop
+ idle_timeout=14, in_port=4 actions=drop
+ in_port=5 actions=drop
+ n_packets=1, n_bytes=60, idle_timeout=11, in_port=1 actions=drop
+NXST_FLOW reply:
+])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
AT_SETUP([ofproto - asynchronous message control (OpenFlow 1.0)])
OVS_VSWITCHD_START
AT_CHECK([ovs-ofctl -P openflow10 monitor br0 --detach --no-chdir --pidfile])
AT_SETUP([ovs-ofctl parse-flows choice of protocol])
# This doesn't cover some potential vlan_tci test cases.
for test_case in \
- 'tun_id=0 NXM,OXM' \
- 'tun_id=0/0x1 NXM,OXM' \
- 'tun_src=1.2.3.4 NXM,OXM' \
- 'tun_src=1.2.3.4/0.0.0.1 NXM,OXM' \
- 'tun_dst=1.2.3.4 NXM,OXM' \
- 'tun_dst=1.2.3.4/0.0.0.1 NXM,OXM' \
+ 'tun_id=0 NXM,OXM,OXM-OpenFlow14' \
+ 'tun_id=0/0x1 NXM,OXM,OXM-OpenFlow14' \
+ 'tun_src=1.2.3.4 NXM,OXM,OXM-OpenFlow14' \
+ 'tun_src=1.2.3.4/0.0.0.1 NXM,OXM,OXM-OpenFlow14' \
+ 'tun_dst=1.2.3.4 NXM,OXM,OXM-OpenFlow14' \
+ 'tun_dst=1.2.3.4/0.0.0.1 NXM,OXM,OXM-OpenFlow14' \
'tun_flags=0 none' \
'tun_flags=1/1 none' \
'tun_tos=0 none' \
'tun_ttl=0 none' \
- 'metadata=0 NXM,OXM,OpenFlow11' \
- 'metadata=1/1 NXM,OXM,OpenFlow11' \
- 'in_port=1 any' \
+ 'metadata=0 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'metadata=1/1 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'in_port=1 any,OXM-OpenFlow14' \
'skb_priority=0 none' \
- 'pkt_mark=1 NXM,OXM' \
- 'pkt_mark=1/1 NXM,OXM' \
- 'reg0=0 NXM,OXM' \
- 'reg0=0/1 NXM,OXM' \
- 'reg1=1 NXM,OXM' \
- 'reg1=1/1 NXM,OXM' \
- 'reg2=2 NXM,OXM' \
- 'reg2=2/1 NXM,OXM' \
- 'reg3=3 NXM,OXM' \
- 'reg3=3/1 NXM,OXM' \
- 'reg4=4 NXM,OXM' \
- 'reg4=4/1 NXM,OXM' \
- 'reg5=5 NXM,OXM' \
- 'reg5=5/1 NXM,OXM' \
- 'reg6=6 NXM,OXM' \
- 'reg6=6/1 NXM,OXM' \
- 'reg7=7 NXM,OXM' \
- 'reg7=7/1 NXM,OXM' \
- 'dl_src=00:11:22:33:44:55 any' \
- 'dl_src=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM,OpenFlow11' \
- 'dl_dst=00:11:22:33:44:55 any' \
- 'dl_dst=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM,OpenFlow11' \
- 'dl_type=0x1234 any' \
- 'dl_type=0x0800 any' \
- 'dl_type=0x0806 any' \
- 'dl_type=0x86dd any' \
- 'vlan_tci=0 any' \
- 'vlan_tci=0x1009 any' \
- 'vlan_tci=0x1009/0x1 NXM,OXM' \
- 'dl_vlan=9 any' \
- 'vlan_vid=11 any' \
- 'vlan_vid=11/0x1 NXM,OXM' \
- 'dl_vlan_pcp=6 any' \
- 'vlan_pcp=5 any' \
- 'mpls,mpls_label=5 NXM,OXM,OpenFlow11' \
- 'mpls,mpls_tc=1 NXM,OXM,OpenFlow11' \
- 'mpls,mpls_bos=0 NXM,OXM' \
- 'ip,ip_src=1.2.3.4 any' \
- 'ip,ip_src=192.168.0.0/24 any' \
- 'ip,ip_src=192.0.168.0/255.0.255.0 NXM,OXM,OpenFlow11' \
- 'ip,ip_dst=1.2.3.4 any' \
- 'ip,ip_dst=192.168.0.0/24 any' \
- 'ip,ip_dst=192.0.168.0/255.0.255.0 NXM,OXM,OpenFlow11' \
- 'ipv6,ipv6_src=::1 NXM,OXM' \
- 'ipv6,ipv6_src=::1/::1 NXM,OXM' \
- 'ipv6,ipv6_dst=::1 NXM,OXM' \
- 'ipv6,ipv6_dst=::1/::1 NXM,OXM' \
- 'ipv6,ipv6_label=5 NXM,OXM' \
- 'ipv6,ipv6_label=5/1 NXM,OXM' \
- 'ip,nw_proto=1 any' \
- 'ipv6,nw_proto=1 NXM,OXM' \
- 'ip,nw_tos=0xf0 any' \
- 'ipv6,nw_tos=0xf0 NXM,OXM' \
- 'ip,ip_dscp=0x3c any' \
- 'ipv6,ip_dscp=0x3c NXM,OXM' \
- 'ip,nw_ecn=1 NXM,OXM' \
- 'ipv6,nw_ecn=1 NXM,OXM' \
- 'ip,nw_ttl=5 NXM,OXM' \
- 'ipv6,nw_ttl=5 NXM,OXM' \
- 'ip,ip_frag=no NXM,OXM' \
- 'ipv6,ip_frag=no NXM,OXM' \
- 'arp,arp_op=0 any' \
- 'arp,arp_spa=1.2.3.4 any' \
- 'arp,arp_spa=1.2.3.4/0.0.0.1 NXM,OXM,OpenFlow11' \
- 'arp,arp_tpa=1.2.3.4 any' \
- 'arp,arp_tpa=1.2.3.4/0.0.0.1 NXM,OXM,OpenFlow11' \
- 'arp,arp_sha=00:11:22:33:44:55 NXM,OXM' \
- 'arp,arp_sha=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM' \
- 'arp,arp_tha=00:11:22:33:44:55 NXM,OXM' \
- 'arp,arp_tha=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM' \
- 'tcp,tcp_src=80 any' \
- 'tcp,tcp_src=0x1000/0x1000 NXM,OXM' \
- 'tcp6,tcp_src=80 NXM,OXM' \
- 'tcp6,tcp_src=0x1000/0x1000 NXM,OXM' \
- 'tcp,tcp_dst=80 any' \
- 'tcp,tcp_dst=0x1000/0x1000 NXM,OXM' \
- 'tcp6,tcp_dst=80 NXM,OXM' \
- 'tcp6,tcp_dst=0x1000/0x1000 NXM,OXM' \
- 'udp,udp_src=80 any' \
- 'udp,udp_src=0x1000/0x1000 NXM,OXM' \
- 'udp6,udp_src=80 NXM,OXM' \
- 'udp6,udp_src=0x1000/0x1000 NXM,OXM' \
- 'udp,udp_dst=80 any' \
- 'udp,udp_dst=0x1000/0x1000 NXM,OXM' \
- 'udp6,udp_dst=80 NXM,OXM' \
- 'udp6,udp_dst=0x1000/0x1000 NXM,OXM' \
- 'icmp,icmp_type=1 any' \
- 'icmp,icmp_code=2 any' \
- 'icmp6,icmpv6_type=1 NXM,OXM' \
- 'icmp6,icmpv6_code=2 NXM,OXM'
+ 'pkt_mark=1 NXM,OXM,OXM-OpenFlow14' \
+ 'pkt_mark=1/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg0=0 NXM,OXM,OXM-OpenFlow14' \
+ 'reg0=0/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg1=1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg1=1/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg2=2 NXM,OXM,OXM-OpenFlow14' \
+ 'reg2=2/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg3=3 NXM,OXM,OXM-OpenFlow14' \
+ 'reg3=3/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg4=4 NXM,OXM,OXM-OpenFlow14' \
+ 'reg4=4/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg5=5 NXM,OXM,OXM-OpenFlow14' \
+ 'reg5=5/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg6=6 NXM,OXM,OXM-OpenFlow14' \
+ 'reg6=6/1 NXM,OXM,OXM-OpenFlow14' \
+ 'reg7=7 NXM,OXM,OXM-OpenFlow14' \
+ 'reg7=7/1 NXM,OXM,OXM-OpenFlow14' \
+ 'dl_src=00:11:22:33:44:55 any,OXM-OpenFlow14' \
+ 'dl_src=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'dl_dst=00:11:22:33:44:55 any,OXM-OpenFlow14' \
+ 'dl_dst=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'dl_type=0x1234 any,OXM-OpenFlow14' \
+ 'dl_type=0x0800 any,OXM-OpenFlow14' \
+ 'dl_type=0x0806 any,OXM-OpenFlow14' \
+ 'dl_type=0x86dd any,OXM-OpenFlow14' \
+ 'vlan_tci=0 any,OXM-OpenFlow14' \
+ 'vlan_tci=0x1009 any,OXM-OpenFlow14' \
+ 'vlan_tci=0x1009/0x1 NXM,OXM,OXM-OpenFlow14' \
+ 'dl_vlan=9 any,OXM-OpenFlow14' \
+ 'vlan_vid=11 any,OXM-OpenFlow14' \
+ 'vlan_vid=11/0x1 NXM,OXM,OXM-OpenFlow14' \
+ 'dl_vlan_pcp=6 any,OXM-OpenFlow14' \
+ 'vlan_pcp=5 any,OXM-OpenFlow14' \
+ 'mpls,mpls_label=5 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'mpls,mpls_tc=1 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'mpls,mpls_bos=0 NXM,OXM,OXM-OpenFlow14' \
+ 'ip,ip_src=1.2.3.4 any,OXM-OpenFlow14' \
+ 'ip,ip_src=192.168.0.0/24 any,OXM-OpenFlow14' \
+ 'ip,ip_src=192.0.168.0/255.0.255.0 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'ip,ip_dst=1.2.3.4 any,OXM-OpenFlow14' \
+ 'ip,ip_dst=192.168.0.0/24 any,OXM-OpenFlow14' \
+ 'ip,ip_dst=192.0.168.0/255.0.255.0 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'ipv6,ipv6_src=::1 NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,ipv6_src=::1/::1 NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,ipv6_dst=::1 NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,ipv6_dst=::1/::1 NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,ipv6_label=5 NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,ipv6_label=5/1 NXM,OXM,OXM-OpenFlow14' \
+ 'ip,nw_proto=1 any,OXM-OpenFlow14' \
+ 'ipv6,nw_proto=1 NXM,OXM,OXM-OpenFlow14' \
+ 'ip,nw_tos=0xf0 any,OXM-OpenFlow14' \
+ 'ipv6,nw_tos=0xf0 NXM,OXM,OXM-OpenFlow14' \
+ 'ip,ip_dscp=0x3c any,OXM-OpenFlow14' \
+ 'ipv6,ip_dscp=0x3c NXM,OXM,OXM-OpenFlow14' \
+ 'ip,nw_ecn=1 NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,nw_ecn=1 NXM,OXM,OXM-OpenFlow14' \
+ 'ip,nw_ttl=5 NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,nw_ttl=5 NXM,OXM,OXM-OpenFlow14' \
+ 'ip,ip_frag=no NXM,OXM,OXM-OpenFlow14' \
+ 'ipv6,ip_frag=no NXM,OXM,OXM-OpenFlow14' \
+ 'arp,arp_op=0 any,OXM-OpenFlow14' \
+ 'arp,arp_spa=1.2.3.4 any,OXM-OpenFlow14' \
+ 'arp,arp_spa=1.2.3.4/0.0.0.1 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'arp,arp_tpa=1.2.3.4 any,OXM-OpenFlow14' \
+ 'arp,arp_tpa=1.2.3.4/0.0.0.1 NXM,OXM,OpenFlow11,OXM-OpenFlow14' \
+ 'arp,arp_sha=00:11:22:33:44:55 NXM,OXM,OXM-OpenFlow14' \
+ 'arp,arp_sha=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM,OXM-OpenFlow14' \
+ 'arp,arp_tha=00:11:22:33:44:55 NXM,OXM,OXM-OpenFlow14' \
+ 'arp,arp_tha=00:11:22:33:44:55/00:ff:ff:ff:ff:ff NXM,OXM,OXM-OpenFlow14' \
+ 'tcp,tcp_src=80 any,OXM-OpenFlow14' \
+ 'tcp,tcp_src=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'tcp6,tcp_src=80 NXM,OXM,OXM-OpenFlow14' \
+ 'tcp6,tcp_src=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'tcp,tcp_dst=80 any,OXM-OpenFlow14' \
+ 'tcp,tcp_dst=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'tcp6,tcp_dst=80 NXM,OXM,OXM-OpenFlow14' \
+ 'tcp6,tcp_dst=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'udp,udp_src=80 any,OXM-OpenFlow14' \
+ 'udp,udp_src=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'udp6,udp_src=80 NXM,OXM,OXM-OpenFlow14' \
+ 'udp6,udp_src=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'udp,udp_dst=80 any,OXM-OpenFlow14' \
+ 'udp,udp_dst=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'udp6,udp_dst=80 NXM,OXM,OXM-OpenFlow14' \
+ 'udp6,udp_dst=0x1000/0x1000 NXM,OXM,OXM-OpenFlow14' \
+ 'icmp,icmp_type=1 any,OXM-OpenFlow14' \
+ 'icmp,icmp_code=2 any,OXM-OpenFlow14' \
+ 'icmp6,icmpv6_type=1 NXM,OXM,OXM-OpenFlow14' \
+ 'icmp6,icmpv6_code=2 NXM,OXM,OXM-OpenFlow14'
do
set $test_case
echo
AT_CHECK([ovs-ofctl parse-flows flows.txt
], [0], [stdout])
AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0],
-[[usable protocols: any
+[[usable protocols: any,OXM-OpenFlow14
chosen protocol: OpenFlow10-table_id
OFPT_FLOW_MOD: ADD tcp,tp_src=123 out_port:5 actions=FLOOD
OFPT_FLOW_MOD: ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 actions=drop
AT_CHECK([ovs-ofctl --protocols OpenFlow11 parse-flows flows.txt
], [0], [stdout])
AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0],
-[[usable protocols: any
+[[usable protocols: any,OXM-OpenFlow14
chosen protocol: OpenFlow11
OFPT_FLOW_MOD (OF1.1): ADD tcp,tp_src=123 out_port:5 actions=FLOOD
OFPT_FLOW_MOD (OF1.1): ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 actions=drop
AT_CHECK([ovs-ofctl --protocols OpenFlow12 parse-flows flows.txt
], [0], [stdout])
AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0],
-[[usable protocols: NXM,OXM
+[[usable protocols: NXM,OXM,OXM-OpenFlow14
chosen protocol: OXM-OpenFlow12
OFPT_FLOW_MOD (OF1.2): ADD tcp,tp_src=123 actions=FLOOD
OFPT_FLOW_MOD (OF1.2): ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 actions=set_field:4103->vlan_vid,set_field:2->vlan_pcp
AT_CHECK([ovs-ofctl parse-flows flows.txt
], [0], [stdout])
AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0],
-[[usable protocols: OXM,NXM+table_id
+[[usable protocols: OXM,NXM+table_id,OXM-OpenFlow14
chosen protocol: NXM+table_id
NXT_FLOW_MOD: ADD table:255 tcp,tp_src=123 actions=FLOOD
NXT_FLOW_MOD: ADD table:255 in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 actions=drop
])
AT_CHECK([ovs-ofctl -F nxm parse-flows flows.txt], [0], [stdout])
AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0], [dnl
-usable protocols: NXM,OXM
+usable protocols: NXM,OXM,OXM-OpenFlow14
chosen protocol: NXM-table_id
NXT_FLOW_MOD: ADD tcp,tp_src=123 actions=FLOOD
NXT_FLOW_MOD: ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 actions=drop
]])
AT_CHECK([ovs-ofctl -F nxm -mmm parse-flows flows.txt], [0], [stdout], [stderr])
AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0],
-[[usable protocols: NXM,OXM
+[[usable protocols: NXM,OXM,OXM-OpenFlow14
chosen protocol: NXM-table_id
NXT_FLOW_MOD: ADD NXM_OF_ETH_TYPE(0800), NXM_OF_IP_PROTO(06), NXM_OF_TCP_SRC(007b) actions=FLOOD
NXT_FLOW_MOD: ADD NXM_OF_IN_PORT(fffe), NXM_OF_ETH_SRC(000ae4256bb0), NXM_OF_VLAN_TCI_W(1009/1fff) actions=drop
dnl such as tunnels and metadata.
AT_SETUP([ovs-ofctl -F option and NXM features])
AT_CHECK([ovs-ofctl -F openflow10 add-flow dummy tun_id=123,actions=drop],
- [1], [], [ovs-ofctl: none of the usable flow formats (NXM,OXM) is among the allowed flow formats (OpenFlow10)
+ [1], [], [ovs-ofctl: none of the usable flow formats (NXM,OXM,OXM-OpenFlow14) is among the allowed flow formats (OpenFlow10)
])
AT_CHECK([ovs-ofctl -F openflow10 add-flow dummy metadata=123,actions=drop],
- [1], [], [ovs-ofctl: none of the usable flow formats (NXM,OXM,OpenFlow11) is among the allowed flow formats (OpenFlow10)
+ [1], [], [ovs-ofctl: none of the usable flow formats (NXM,OXM,OpenFlow11,OXM-OpenFlow14) is among the allowed flow formats (OpenFlow10)
])
AT_CLEANUP
AT_SETUP([ovs-ofctl dump-flows rejects bad -F option])
OVS_VSWITCHD_START
AT_CHECK([ovs-ofctl -F openflow10 dump-flows unix:br0.mgmt reg0=0xabcdef], [1], [],
- [ovs-ofctl: none of the usable flow formats (NXM,OXM) is among the allowed flow formats (OpenFlow10)
+ [ovs-ofctl: none of the usable flow formats (NXM,OXM,OXM-OpenFlow14) is among the allowed flow formats (OpenFlow10)
])
OVS_VSWITCHD_STOP
AT_CLEANUP
[vxlan_system]],
[
# Try creating the port
-AT_CHECK([ovs-vsctl add-port br0 reserved_name], [0], [], [])
+AT_CHECK([ovs-vsctl add-port br0 reserved_name], [0], [], [dnl
+ovs-vsctl: Error detected while setting up 'reserved_name'. See ovs-vswitchd log for details.
+])
# Detect the warning log message
AT_CHECK([sed -n "s/^.*\(|bridge|WARN|.*\)$/\1/p" ovs-vswitchd.log], [0], [dnl
|bridge|WARN|could not create interface reserved_name, name is reserved
[vxlan_system]],
[
# Try creating the port
-AT_CHECK([ovs-vsctl add-port br0 reserved_name], [0], [], [])
+AT_CHECK([ovs-vsctl add-port br0 reserved_name], [0], [], [dnl
+ovs-vsctl: Error detected while setting up 'reserved_name'. See ovs-vswitchd log for details.
+])
# Detect the warning log message
AT_CHECK([sed -n "s/^.*\(|bridge|WARN|.*\)$/\1/p" ovs-vswitchd.log], [0], [dnl
|bridge|WARN|could not create interface reserved_name, name is reserved
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/* The mother of all test programs that links with libopenvswitch.la */
+
+#include <config.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <stdlib.h>
+#include "command-line.h"
+#include "ovstest.h"
+#include "util.h"
+
+static struct command *commands = NULL;
+static size_t n_commands = 0;
+static size_t allocated_commands = 0;
+
+static void
+add_command(struct command *cmd)
+{
+ const struct command nil = {NULL, 0, 0, NULL};
+
+ while (n_commands + 1 >= allocated_commands) {
+ commands = x2nrealloc(commands, &allocated_commands,
+ sizeof *cmd);
+ }
+
+ commands[n_commands] = *cmd;
+ commands[n_commands + 1] = nil;
+ n_commands++;
+}
+
+static void
+list(int argc OVS_UNUSED, char *argv[] OVS_UNUSED)
+{
+ const struct command *p;
+
+    for (p = commands; p->name != NULL; p++) {
+        printf("%s, %d, %d\n", p->name, p->min_args, p->max_args);
+ }
+}
+
+static void
+add_top_level_commands(void)
+{
+ struct command help_cmd = {"--help", 0, 0, list};
+
+ add_command(&help_cmd);
+}
+
+void
+ovstest_register(const char *test_name, ovstest_func f,
+ const struct command *sub_commands)
+{
+ struct command test_cmd;
+ int max_args = 0;
+
+ if (sub_commands) {
+ const struct command *p;
+
+        for (p = sub_commands; p->name != NULL; p++) {
+ if (p->max_args > max_args) {
+ max_args = p->max_args;
+ }
+ }
+ }
+    max_args++; /* Allow one extra argument for the subcommand name. */
+
+ test_cmd.name = test_name;
+ test_cmd.min_args = 1;
+ test_cmd.max_args = max_args;
+ test_cmd.handler = f;
+
+ add_command(&test_cmd);
+}
+
+static void
+cleanup(void)
+{
+ if (allocated_commands) {
+ free(commands);
+ }
+}
+
+int
+main(int argc, char *argv[])
+{
+ set_program_name(argv[0]);
+
+ add_top_level_commands();
+ if (argc > 1) {
+ run_command(argc - 1, argv + 1, commands);
+ }
+ cleanup();
+
+ return 0;
+}
--- /dev/null
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef OVSTEST_H
+#define OVSTEST_H
+
+/* Overview
+ * ========
+ *
+ * The OVS tests directory contains many small test programs.  One downside
+ * of building them as individual programs is that they all have to be
+ * relinked whenever a library function is modified.
+ *
+ * ovstest is an attempt to improve the overall build time by linking
+ * all test programs into a single program, ovstest.  Regardless of
+ * the number of test programs, linking is done only once to produce
+ * ovstest.
+ *
+ * With ovstest, each test program becomes a subcommand of ovstest.
+ * For example, the 'mytest' program can now be invoked as 'ovstest mytest'.
+ *
+ * 'ovstest --help' lists all the test programs that can be invoked.
+ *
+ * The Usage comment section below documents how a new test program can
+ * be added to ovstest.
+ */
+
+typedef void (*ovstest_func)(int argc, char *argv[]);
+
+void
+ovstest_register(const char *test_name, ovstest_func f,
+                 const struct command *sub_commands);
+
+/* Usage
+ * =====
+ *
+ * For each test program, its 'main' function should be named
+ * '<test_name>_main()'.
+ *
+ * OVSTEST_REGISTER registers the test program with the ovstest top level
+ * command line parser.  <test_name> is expected as argv[1] when invoking
+ * ovstest.
+ *
+ * In case the test program has sub commands, its command array can be
+ * passed in as <sub_commands>.  Otherwise, NULL can be used instead.
+ *
+ * Example:
+ * ----------
+ *
+ * Suppose the test program is called my-test.c
+ * ...
+ *
+ * static void
+ * my_test_main(int argc, char *argv[])
+ * {
+ * ....
+ * }
+ *
+ * // The last parameter is NULL in case my-test.c does
+ * // not have sub commands. Otherwise, its command
+ * // array can replace the NULL here.
+ *
+ * OVSTEST_REGISTER("my-test", my_test_main, NULL);
+ */
+
+#define OVSTEST_REGISTER(name, function, sub_commands) \
+ OVS_CONSTRUCTOR(register_##function) { \
+ ovstest_register(name, function, sub_commands); \
+ }
+
+#endif
/*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
ovs_assert(orig == 2); \
atomic_read(&x, &value); \
ovs_assert(value == 8); \
- \
- atomic_destroy(&x); \
}
static void
/*
- * Copyright (c) 2012 Nicira, Inc.
+ * Copyright (c) 2012, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include "command-line.h"
#include "random.h"
#include "util.h"
+#include "ovstest.h"
#undef NDEBUG
#include <assert.h>
test_heap_insert_delete_same_order_with_dups, },
{ "raw-insert", 0, 0, test_heap_raw_insert, },
{ "raw-delete", 0, 0, test_heap_raw_delete, },
+ { NULL, 0, 0, NULL, },
};
-int
-main(int argc, char *argv[])
+static void
+test_heap_main(int argc, char *argv[])
{
set_program_name(argv[0]);
run_command(argc - 1, argv + 1, commands);
-
- return 0;
}
+
+OVSTEST_REGISTER("test-heap", test_heap_main, commands);
}
if (buf->size) {
- printf("%"PRIuSIZE" extra bytes after last record\n", buf->size);
+ printf("%"PRIu32" extra bytes after last record\n", buf->size);
}
}
odp_flow_key_from_flow(&odp_key, &flow, flow.in_port.odp_port);
if (odp_key.size > ODPUTIL_FLOW_KEY_BYTES) {
- printf ("too long: %"PRIuSIZE" > %d\n",
+ printf ("too long: %"PRIu32" > %d\n",
odp_key.size, ODPUTIL_FLOW_KEY_BYTES);
exit_code = 1;
}
assert(port_no < b->n_ports);
lan = b->ports[port_no];
if (lan) {
- const void *data = pkt->l3;
+ const void *data = ofpbuf_get_l3(pkt);
size_t size = (char *) ofpbuf_tail(pkt) - (char *) data;
int i;
replaced by the process ID read from the pidfile, and uses that file
as if it had been specified directly as the target.
.IP
+On Windows, \fItarget\fR can be an absolute path to a file that contains
+a localhost TCP port on which an Open vSwitch daemon is listening
+for control channel connections. By default, each daemon writes the
+TCP port on which it is listening for control connection into the file
+\fIprogram\fB.ctl\fR located inside the configured \fIOVS_RUNDIR\fR
+directory. If \fItarget\fR is not an absolute path, \fBovs\-appctl\fR
+looks for a file named \fItarget\fB.ctl\fR in the configured \fIOVS_RUNDIR\fR
+directory.
+.IP
The default target is \fBovs\-vswitchd\fR.
.
.SH COMMON COMMANDS
}
if (cmd_error) {
+ jsonrpc_close(client);
fputs(cmd_error, stderr);
ovs_error(0, "%s: server returned an error", target);
exit(2);
char *socket_name;
int error;
+#ifndef _WIN32
if (target[0] != '/') {
char *pidfile_name;
pid_t pid;
free(pidfile_name);
socket_name = xasprintf("%s/%s.%ld.ctl",
ovs_rundir(), target, (long int) pid);
+#else
+    /* On Windows, if 'target' contains ':', we assume that it is an
+     * absolute path. */
+ if (!strchr(target, ':')) {
+ socket_name = xasprintf("%s/%s.ctl", ovs_rundir(), target);
+#endif
} else {
socket_name = xstrdup(target);
}
\fBdump\-tables \fIswitch\fR
Prints to the console statistics for each of the flow tables used by
\fIswitch\fR.
+.TP
+\fBdump\-table\-features \fIswitch\fR
+Prints to the console features for each of the flow tables used by
+\fIswitch\fR.
.
.TP
\fBdump\-ports \fIswitch\fR [\fInetdev\fR]
.IP \fBarp_spa=\fIip\fR[\fB/\fInetmask\fR]
.IQ \fBarp_tpa=\fIip\fR[\fB/\fInetmask\fR]
When \fBdl_type\fR specifies either ARP or RARP, \fBarp_spa\fR and
-\fBarp_tha\fR match the source and target IPv4 address, respectively.
+\fBarp_tpa\fR match the source and target IPv4 address, respectively.
An address may be specified as an IP address or host name
(e.g. \fB192.168.1.1\fR or \fBwww.example.com\fR). The optional
\fInetmask\fR allows restricting a match to an IPv4 address prefix.
.IQ "\fBOXM-OpenFlow13\fR"
These are the standard OXM (OpenFlow Extensible Match) flow format in
OpenFlow 1.2 and 1.3, respectively.
+.IP "\fBOXM-OpenFlow14\fR"
+The standard OXM (OpenFlow Extensible Match) flow format in OpenFlow
+1.4. OpenFlow 1.4 is not yet well supported; in particular, the
+implementation is unsafe, such that sending an unsupported message in
+OpenFlow 1.4 to \fBovs\-vswitchd\fR can cause it to crash.
.RE
.
.IP
collections of flow formats:
.RS
.IP "\fBany\fR"
-Any supported flow format.
+Any supported flow format except \fBOXM-OpenFlow14\fR, which is not
+yet well supported (see above).
.IP "\fBOpenFlow10\fR"
\fBOpenFlow10\-table_id\fR or \fBOpenFlow10+table_id\fR.
.IP "\fBNXM\fR"
\fBNXM\-table_id\fR or \fBNXM+table_id\fR.
.IP "\fBOXM\fR"
-\fBOXM-OpenFlow12\fR or \fBOXM-OpenFlow13\fR.
+\fBOXM-OpenFlow12\fR or \fBOXM-OpenFlow13\fR. \fBOXM-OpenFlow14\fR is
+not included because it is not yet well supported (see above).
.RE
.
.IP
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
-#include <sys/fcntl.h>
+#include <fcntl.h>
#include <sys/stat.h>
#include <sys/time.h>
" show SWITCH show OpenFlow information\n"
" dump-desc SWITCH print switch description\n"
" dump-tables SWITCH print table stats\n"
+ " dump-table-features SWITCH print table features\n"
" mod-port SWITCH IFACE ACT modify port behavior\n"
" mod-table SWITCH MOD modify flow table behavior\n"
" get-frags SWITCH print fragment handling behavior\n"
dump_trivial_stats_transaction(argv[1], OFPRAW_OFPST_TABLE_REQUEST);
}
+static void
+ofctl_dump_table_features(int argc OVS_UNUSED, char *argv[])
+{
+ struct ofpbuf *request;
+ struct vconn *vconn;
+
+ open_vconn(argv[1], &vconn);
+ request = ofputil_encode_table_features_request(vconn_get_version(vconn));
+ if (request) {
+ dump_stats_transaction(vconn, request);
+ }
+
+ vconn_close(vconn);
+}
+
static bool
fetch_port_by_features(const char *vconn_name,
const char *port_name, ofp_port_t port_no,
case OFP11_VERSION:
case OFP12_VERSION:
case OFP13_VERSION:
+ case OFP14_VERSION:
break;
default:
OVS_NOT_REACHED();
if (ofptype_pull(&type, reply)
|| type != OFPTYPE_ECHO_REPLY
|| reply->size != payload
- || memcmp(request->l3, reply->l3, payload)) {
+ || memcmp(ofpbuf_get_l3(request), ofpbuf_get_l3(reply), payload)) {
printf("Reply does not match request. Request:\n");
ofp_print(stdout, request, request->size, verbosity + 2);
printf("Reply:\n");
ofp_print(stdout, reply, reply->size, verbosity + 2);
}
- printf("%"PRIuSIZE" bytes from %s: xid=%08"PRIx32" time=%.1f ms\n",
+ printf("%"PRIu32" bytes from %s: xid=%08"PRIx32" time=%.1f ms\n",
reply->size, argv[1], ntohl(rpy_hdr->xid),
(1000*(double)(end.tv_sec - start.tv_sec))
+ (.001*(end.tv_usec - start.tv_usec)));
ovs_fatal(0, "Trailing garbage in hex data");
}
if (match_expout.size != sizeof(struct ofp10_match)) {
- ovs_fatal(0, "Input is %"PRIuSIZE" bytes, expected %"PRIuSIZE,
+ ovs_fatal(0, "Input is %"PRIu32" bytes, expected %"PRIuSIZE,
match_expout.size, sizeof(struct ofp10_match));
}
ovs_fatal(0, "Trailing garbage in hex data");
}
if (match_in.size != sizeof(struct ofp10_match)) {
- ovs_fatal(0, "Input is %"PRIuSIZE" bytes, expected %"PRIuSIZE,
+ ovs_fatal(0, "Input is %"PRIu32" bytes, expected %"PRIuSIZE,
match_in.size, sizeof(struct ofp10_match));
}
ovs_fatal(0, "Trailing garbage in hex data");
}
if (match_in.size != sizeof(struct ofp11_match)) {
- ovs_fatal(0, "Input is %"PRIuSIZE" bytes, expected %"PRIuSIZE,
+ ovs_fatal(0, "Input is %"PRIu32" bytes, expected %"PRIuSIZE,
match_in.size, sizeof(struct ofp11_match));
}
{ "snoop", 1, 1, ofctl_snoop },
{ "dump-desc", 1, 1, ofctl_dump_desc },
{ "dump-tables", 1, 1, ofctl_dump_tables },
+ { "dump-table-features", 1, 1, ofctl_dump_table_features },
{ "dump-flows", 1, 2, ofctl_dump_flows },
{ "dump-aggregate", 1, 2, ofctl_dump_aggregate },
{ "queue-stats", 1, 3, ofctl_queue_stats },
/*
- * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
const char *arg,
struct ovsdb_symbol_table *);
+/* The post_db_reload_check framework allows ovs-vsctl to perform
+ * additional checks after OVSDB transactions have been successfully
+ * recorded and reloaded by ovs-vswitchd.
+ *
+ * For example, when a new interface is added to OVSDB, ovs-vswitchd will
+ * either store a positive value in its 'ofport' column on successfully
+ * bringing up the new interface, or -1 on failure.
+ *
+ * Unless the --no-wait command line option is specified,
+ * post_db_reload_do_checks() is called right after the configuration
+ * changes have been picked up (i.e. reloaded) by ovs-vswitchd.  Any error
+ * detected after the OVSDB reload is reported as an ovs-vsctl error;
+ * ovs-vswitchd logs more detailed messages about those errors.
+ *
+ * The current implementation only checks for post-reload failures on new
+ * interface additions with the 'add-br' and 'add-port' commands.
+ *
+ * post_db_reload_expect_iface() keeps track of the interfaces to be
+ * checked after the OVSDB reload. */
+static void post_db_reload_check_init(void);
+static void post_db_reload_do_checks(const struct vsctl_context *);
+static void post_db_reload_expect_iface(const struct ovsrec_interface *);
+
+static struct uuid *neoteric_ifaces;
+static size_t n_neoteric_ifaces;
+static size_t allocated_neoteric_ifaces;
+
int
main(int argc, char *argv[])
{
ovsdb_idl_add_column(ctx->idl, &ovsrec_port_col_interfaces);
ovsdb_idl_add_column(ctx->idl, &ovsrec_interface_col_name);
+ ovsdb_idl_add_column(ctx->idl, &ovsrec_interface_col_ofport);
}
static void
{
bool may_exist = shash_find(&ctx->options, "--may-exist") != NULL;
const char *br_name, *parent_name;
+ struct ovsrec_interface *iface;
int vlan;
br_name = ctx->argv[1];
if (!parent_name) {
struct ovsrec_port *port;
- struct ovsrec_interface *iface;
struct ovsrec_bridge *br;
iface = ovsrec_interface_insert(ctx->txn);
} else {
struct vsctl_bridge *parent;
struct ovsrec_port *port;
- struct ovsrec_interface *iface;
struct ovsrec_bridge *br;
int64_t tag = vlan;
bridge_insert_port(br, port);
}
+ post_db_reload_expect_iface(iface);
vsctl_context_invalidate_cache(ctx);
}
for (i = 0; i < n_ifaces; i++) {
ifaces[i] = ovsrec_interface_insert(ctx->txn);
ovsrec_interface_set_name(ifaces[i], iface_names[i]);
+ post_db_reload_expect_iface(ifaces[i]);
}
port = ovsrec_port_insert(ctx->txn);
ds_put_char(&ctx->output, '\n');
}
+static void
+post_db_reload_check_init(void)
+{
+ n_neoteric_ifaces = 0;
+}
+
+static void
+post_db_reload_expect_iface(const struct ovsrec_interface *iface)
+{
+ if (n_neoteric_ifaces >= allocated_neoteric_ifaces) {
+ neoteric_ifaces = x2nrealloc(neoteric_ifaces,
+ &allocated_neoteric_ifaces,
+ sizeof *neoteric_ifaces);
+ }
+ neoteric_ifaces[n_neoteric_ifaces++] = iface->header_.uuid;
+}
+
+static void
+post_db_reload_do_checks(const struct vsctl_context *ctx)
+{
+ struct ds dead_ifaces = DS_EMPTY_INITIALIZER;
+ size_t i;
+
+ for (i = 0; i < n_neoteric_ifaces; i++) {
+ const struct uuid *uuid;
+
+ uuid = ovsdb_idl_txn_get_insert_uuid(ctx->txn, &neoteric_ifaces[i]);
+ if (uuid) {
+ const struct ovsrec_interface *iface;
+
+ iface = ovsrec_interface_get_for_uuid(ctx->idl, uuid);
+ if (iface && (!iface->ofport || *iface->ofport == -1)) {
+ ds_put_format(&dead_ifaces, "'%s', ", iface->name);
+ }
+ }
+ }
+
+ if (dead_ifaces.length) {
+ dead_ifaces.length -= 2; /* Strip off trailing comma and space. */
+ ovs_error(0, "Error detected while setting up %s. See ovs-vswitchd "
+ "log for details.", ds_cstr(&dead_ifaces));
+ }
+
+ ds_destroy(&dead_ifaces);
+}
+
static void
pre_cmd_destroy(struct vsctl_context *ctx)
{
&ovsrec_open_vswitch_col_next_cfg);
}
+ post_db_reload_check_init();
symtab = ovsdb_symbol_table_create();
for (c = commands; c < &commands[n_commands]; c++) {
ds_init(&c->output);
}
}
error = xstrdup(ovsdb_idl_txn_get_error(txn));
- ovsdb_idl_txn_destroy(txn);
- txn = the_idl_txn = NULL;
switch (status) {
case TXN_UNCOMMITTED:
ovsdb_idl_run(idl);
OVSREC_OPEN_VSWITCH_FOR_EACH (ovs, idl) {
if (ovs->cur_cfg >= next_cfg) {
+ post_db_reload_do_checks(&ctx);
goto done;
}
}
}
done: ;
}
+ ovsdb_idl_txn_destroy(txn);
ovsdb_idl_destroy(idl);
exit(EXIT_SUCCESS);
#define IFACE_STATS_INTERVAL (5 * 1000) /* In milliseconds. */
static long long int iface_stats_timer = LLONG_MIN;
+/* Set to true to allow experimental use of OpenFlow 1.4.
+ * This is false initially because OpenFlow 1.4 is not yet safe to use: it can
+ * abort due to unimplemented features. */
+static bool allow_of14;
+
/* In some datapaths, creating and destroying OpenFlow ports can be extremely
* expensive. This can cause bridge_reconfigure() to take a long time during
* which no other work can be done. To deal with this problem, we limit port
ovsdb_idl_destroy(idl);
}
+/* Enables use of OpenFlow 1.4. This is off by default because OpenFlow 1.4 is
+ * not yet safe to use: it can abort due to unimplemented features. */
+void
+bridge_enable_of14(void)
+{
+ allow_of14 = true;
+}
+
/* Looks at the list of managers in 'ovs_cfg' and extracts their remote IP
* addresses and ports into '*managersp' and '*n_managersp'. The caller is
* responsible for freeing '*managersp' (with free()).
static uint32_t
bridge_get_allowed_versions(struct bridge *br)
{
+ uint32_t allowed_versions;
+
if (!br->cfg->n_protocols)
return 0;
- return ofputil_versions_from_strings(br->cfg->protocols,
- br->cfg->n_protocols);
+ allowed_versions = ofputil_versions_from_strings(br->cfg->protocols,
+ br->cfg->n_protocols);
+ if (!allow_of14) {
+ allowed_versions &= ~(1u << OFP14_VERSION);
+ }
+ return allowed_versions;
}
/* Set NetFlow configuration on 'br'. */
}
}
- if (bitmap_scan(port_num_bitmap, 0, STP_MAX_PORTS) != STP_MAX_PORTS
+ if (bitmap_scan(port_num_bitmap, 1, 0, STP_MAX_PORTS) != STP_MAX_PORTS
&& port_num_counter) {
VLOG_ERR("bridge %s: must manually configure all STP port "
"IDs or none, disabling", br->name);
sset_destroy(&splinter_ifaces);
- if (bitmap_scan(splinter_vlans, 0, 4096) >= 4096) {
+ if (bitmap_scan(splinter_vlans, 1, 0, 4096) >= 4096) {
free(splinter_vlans);
return NULL;
}
-/* Copyright (c) 2008, 2009, 2010, 2011, 2012 Nicira, Inc.
+/* Copyright (c) 2008, 2009, 2010, 2011, 2012, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
void bridge_init(const char *remote);
void bridge_exit(void);
+void bridge_enable_of14(void);
+
void bridge_run(void);
void bridge_wait(void);
\fBovs\-vswitchd\fR emits a log message if \fBmlockall()\fR is
unavailable or unsuccessful.
.
+.IP "\fB\-\-enable\-of14\fR"
+Specifying this option allows OpenFlow 1.4 to be used if it is enabled
+through the \fBprotocols\fR column in the \fBBridge\fR table. Without
+this option, \fBovs\-vswitchd\fR will not use OpenFlow 1.4 even if it
+is enabled that way. This option is present because OpenFlow 1.4
+support is not safe: the daemon will abort when certain unimplemented
+features are tested. Thus, for now it is suitable only for
+experimental use. When the support is implemented safely, this option
+will be removed.
+.
.SS "Daemon Options"
.ds DD \
\fBovs\-vswitchd\fR detaches only after it has connected to the \
-/* Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+/* Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
#include "vconn.h"
#include "vlog.h"
#include "lib/vswitch-idl.h"
+#include "lib/netdev-dpdk.h"
VLOG_DEFINE_THIS_MODULE(vswitchd);
bool exiting;
int retval;
- proctitle_init(argc, argv);
set_program_name(argv[0]);
+    retval = dpdk_init(argc, argv);
+ argc -= retval;
+ argv += retval;
+
+ proctitle_init(argc, argv);
service_start(&argc, &argv);
remote = parse_options(argc, argv, &unixctl_path);
fatal_ignore_sigpipe();
OPT_BOOTSTRAP_CA_CERT,
OPT_ENABLE_DUMMY,
OPT_DISABLE_SYSTEM,
- DAEMON_OPTION_ENUMS
+ OPT_ENABLE_OF14,
+ DAEMON_OPTION_ENUMS,
+ OPT_DPDK,
};
static const struct option long_options[] = {
{"help", no_argument, NULL, 'h'},
{"bootstrap-ca-cert", required_argument, NULL, OPT_BOOTSTRAP_CA_CERT},
{"enable-dummy", optional_argument, NULL, OPT_ENABLE_DUMMY},
{"disable-system", no_argument, NULL, OPT_DISABLE_SYSTEM},
+ {"enable-of14", no_argument, NULL, OPT_ENABLE_OF14},
+ {"dpdk", required_argument, NULL, OPT_DPDK},
{NULL, 0, NULL, 0},
};
char *short_options = long_options_to_short_options(long_options);
dp_blacklist_provider("system");
break;
+ case OPT_ENABLE_OF14:
+ bridge_enable_of14();
+ break;
+
case '?':
exit(EXIT_FAILURE);
+ case OPT_DPDK:
+ break;
+
default:
abort();
}
vlog_usage();
printf("\nOther options:\n"
" --unixctl=SOCKET override default control socket name\n"
+ " --enable-of14 allow enabling OF1.4 (unsafely!)\n"
" -h, --help display this help message\n"
" -V, --version display version information\n");
exit(EXIT_SUCCESS);
static unsigned int cached;
if (!cached) {
+#ifndef _WIN32
long int value = sysconf(_SC_PAGESIZE);
+#else
+ long int value;
+ SYSTEM_INFO sysinfo;
+ GetSystemInfo(&sysinfo);
+ value = sysinfo.dwPageSize;
+#endif
if (value >= 0) {
cached = value;
}
#endif
int mem_total, mem_used;
+#ifndef _WIN32
if (pagesize <= 0 || phys_pages <= 0 || avphys_pages <= 0) {
return;
}
mem_total = phys_pages * (pagesize / 1024);
mem_used = (phys_pages - avphys_pages) * (pagesize / 1024);
+#else
+ MEMORYSTATUS memory_status;
+ GlobalMemoryStatus(&memory_status);
+
+    /* Report in kB to match the sysconf()-based path above. */
+    mem_total = memory_status.dwTotalPhys / 1024;
+    mem_used = (memory_status.dwTotalPhys - memory_status.dwAvailPhys) / 1024;
+#endif
smap_add_format(stats, "memory", "%d,%d", mem_total, mem_used);
} else {
static const char file_name[] = "/proc/meminfo";
static void
get_process_stats(struct smap *stats)
{
+#ifndef _WIN32
struct dirent *de;
DIR *dir;
}
closedir(dir);
+#endif /* _WIN32 */
}
static void
{"name": "Open_vSwitch",
- "version": "7.4.0",
- "cksum": "951746691 20389",
+ "version": "7.5.0",
+ "cksum": "1448369194 20560",
"tables": {
"Open_vSwitch": {
"columns": {
"enum": ["set", ["OpenFlow10",
"OpenFlow11",
"OpenFlow12",
- "OpenFlow13"]]},
+ "OpenFlow13",
+ "OpenFlow14"]]},
"min": 0, "max": "unlimited"}},
"fail_mode": {
"type": {"key": {"type": "string",
"groups": {
"type": {"key": "string", "min": 0, "max": "unlimited"}},
"prefixes": {
- "type": {"key": "string", "min": 0, "max": 3}}}},
+ "type": {"key": "string", "min": 0, "max": 3}},
+ "external_ids": {
+ "type": {"key": "string", "value": "string",
+ "min": 0, "max": "unlimited"}}}},
"QoS": {
"columns": {
"type": {
</column>
<column name="protocols">
- List of OpenFlow protocols that may be used when negotiating a
- connection with a controller. A default value of
- <code>OpenFlow10</code> will be used if this column is empty.
+ <p>
+ List of OpenFlow protocols that may be used when negotiating a
+ connection with a controller. A default value of
+ <code>OpenFlow10</code> will be used if this column is empty.
+ </p>
+
+ <p>
+ The current implementation of OpenFlow 1.4 support is not safe:
+ <code>ovs-vswitchd</code> will abort when certain unimplemented
+ features are tested. Thus, for now it is suitable only for
+ experimental use. For this reason, OpenFlow 1.4 is supported only
+ if, in addition to specifying <code>OpenFlow14</code> in this field,
+ <code>ovs-vswitchd</code> is invoked with the
+ <code>--enable-of14</code> option. (When support becomes safe, this
+ option will be removed.)
+ </p>
</column>
</group>
one flow table. Currently this limit is 3.
</p>
</column>
+
+ <group title="Common Columns">
+ The overall purpose of these columns is described under <code>Common
+ Columns</code> at the beginning of this document.
+
+ <column name="external_ids"/>
+ </group>
</table>
<table name="QoS" title="Quality of Service configuration">