sliver-openvswitch.git
10 years agoMerge branch 'master' of git://openvswitch.org/openvswitch
Giuseppe Lettieri [Thu, 8 Aug 2013 14:42:27 +0000 (16:42 +0200)]
Merge branch 'master' of git://openvswitch.org/openvswitch

10 years agonetdev: Make netdev_from_name() take a reference to its returned netdev.
Ben Pfaff [Fri, 26 Jul 2013 00:05:46 +0000 (17:05 -0700)]
netdev: Make netdev_from_name() take a reference to its returned netdev.

This API change is necessary for thread safety, to be added in an upcoming
commit.  Otherwise, the client would not be able to safely use the returned
netdev because it could already have been destroyed.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
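
The reference-taking lookup described above can be sketched roughly as follows. This is only an illustration, not the netdev code itself: `struct dev`, `dev_from_name()`, `dev_unref()` and the static registry are hypothetical names, and a threaded version would perform the lookup and the increment under a single lock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for 'struct netdev'; not the real OVS type. */
struct dev {
    const char *name;
    int ref_cnt;                /* Number of outstanding references. */
};

/* Hypothetical registry.  The registry itself holds one reference. */
static struct dev registry[] = {
    { "eth0", 1 },
    { "eth1", 1 },
};

/* Returns the device named 'name' with its reference count already
 * incremented, or NULL if there is none.  The caller owns the returned
 * reference and must release it with dev_unref(). */
struct dev *
dev_from_name(const char *name)
{
    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; i++) {
        if (!strcmp(registry[i].name, name)) {
            registry[i].ref_cnt++;
            return &registry[i];
        }
    }
    return NULL;
}

/* Releases a reference obtained from dev_from_name(). */
void
dev_unref(struct dev *dev)
{
    if (dev) {
        assert(dev->ref_cnt > 0);
        dev->ref_cnt--;
    }
}
```

Because the count is raised before the pointer is handed back, the device cannot be destroyed between the lookup and the caller's first use.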
10 years agonetdev: Make netdev_get_devices() take a reference to each netdev.
Ben Pfaff [Thu, 25 Jul 2013 23:27:39 +0000 (16:27 -0700)]
netdev: Make netdev_get_devices() take a reference to each netdev.

This API change is necessary for thread safety, to be added in an upcoming
commit.  Otherwise, the client would not be able to actually use any of
the returned netdevs because they could already have been destroyed.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-provider: Remove unused function netdev_assert_class().
Ben Pfaff [Sat, 27 Jul 2013 00:16:08 +0000 (17:16 -0700)]
netdev-provider: Remove unused function netdev_assert_class().

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-bsd: Use xmemdup0() to simplify netdev_bsd_get_next_hop().
Ben Pfaff [Thu, 25 Jul 2013 22:38:29 +0000 (15:38 -0700)]
netdev-bsd: Use xmemdup0() to simplify netdev_bsd_get_next_hop().

Signed-off-by: Ben Pfaff <blp@nicira.com>
CC: Ed Maste <emaste@freebsd.org>
CC: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
10 years agonetdev-linux: Move variable declaration inward in netdev_linux_cache_cb().
Ben Pfaff [Fri, 26 Jul 2013 19:42:02 +0000 (12:42 -0700)]
netdev-linux: Move variable declaration inward in netdev_linux_cache_cb().

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-linux: Remove useless member 'peer', which was always zero.
Ben Pfaff [Wed, 24 Jul 2013 17:44:42 +0000 (10:44 -0700)]
netdev-linux: Remove useless member 'peer', which was always zero.

Also, correct a comment on netdev_linux_get_features().

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-linux: Remove unused struct netdev_linux member.
Ben Pfaff [Wed, 24 Jul 2013 17:37:37 +0000 (10:37 -0700)]
netdev-linux: Remove unused struct netdev_linux member.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-linux: Remove pointless layers of indirection for tap devices.
Ben Pfaff [Fri, 26 Jul 2013 00:04:30 +0000 (17:04 -0700)]
netdev-linux: Remove pointless layers of indirection for tap devices.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-linux: Remove unneeded struct forward declarations from header.
Ben Pfaff [Fri, 26 Jul 2013 18:20:09 +0000 (11:20 -0700)]
netdev-linux: Remove unneeded struct forward declarations from header.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-vport: Use ovs_mutex rather than a raw pthread_mutex_t.
Ben Pfaff [Wed, 31 Jul 2013 21:15:05 +0000 (14:15 -0700)]
netdev-vport: Use ovs_mutex rather than a raw pthread_mutex_t.

I'd forgotten even to use the xpthread variants here.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-bsd: Don't assume 'struct netdev' has offset 0.
Ben Pfaff [Fri, 2 Aug 2013 19:19:49 +0000 (12:19 -0700)]
netdev-bsd: Don't assume 'struct netdev' has offset 0.

The data items returned by netdev_get_devices() are "struct netdev *"s.
The code fixed up by this commit used them as "struct netdev_bsd *",
which happens to work because struct netdev happens to be at offset 0 in
each struct but it's better to do a proper cast in case someday
struct netdev gets moved to a nonzero offset.

Signed-off-by: Ben Pfaff <blp@nicira.com>
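
A minimal sketch of what a "proper cast" means here, assuming hypothetical types (`struct netdev_like`, `struct netdev_bsd_like`) and a locally defined `CONTAINER_OF` macro rather than the real OVS definitions. The base struct is deliberately placed at a nonzero offset to show that a plain pointer cast would break.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generic 'struct netdev'. */
struct netdev_like {
    const char *name;
};

/* Hypothetical provider struct.  The embedded base is NOT the first
 * member, so it lives at a nonzero offset. */
struct netdev_bsd_like {
    int cache_valid;
    struct netdev_like up;
};

/* Converts a pointer to 'member' into a pointer to the enclosing 'type'. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* Correct regardless of where 'up' sits inside the provider struct. */
struct netdev_bsd_like *
netdev_bsd_like_cast(struct netdev_like *netdev)
{
    assert(netdev != NULL);
    return CONTAINER_OF(netdev, struct netdev_bsd_like, up);
}
```

A direct `(struct netdev_bsd_like *) netdev` cast gives the same address only while `up` stays at offset 0, which is the assumption the commit removes.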
10 years agonetdev-bsd: Correctly handle IPv4 netmasks.
Ben Pfaff [Wed, 31 Jul 2013 22:22:12 +0000 (15:22 -0700)]
netdev-bsd: Correctly handle IPv4 netmasks.

netdev_bsd_get_in4() did not set anything in its 'netmask' output argument
if the IPv4 address was cached, leaving it indeterminate.  It would also
mark the cache as valid even if there was an error retrieving the netmask.
This fixes both problems.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
CC: Ed Maste <emaste@freebsd.org>
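
The two fixes can be illustrated with a small sketch. Everything here is an assumption made for illustration: `struct addr_cache`, `cached_get_in4()` and the `query_fn` callbacks stand in for the real cache fields and ioctl() calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache of an interface's IPv4 address and netmask. */
struct addr_cache {
    bool valid;
    uint32_t address;
    uint32_t netmask;
};

/* Hypothetical query callback standing in for an ioctl(); returns 0 on
 * success or a positive errno value. */
typedef int query_fn(uint32_t *value);

int
cached_get_in4(struct addr_cache *cache, query_fn *get_addr,
               query_fn *get_mask, uint32_t *address, uint32_t *netmask)
{
    assert(cache && address && netmask);
    if (!cache->valid) {
        uint32_t addr, mask;
        int error;

        error = get_addr(&addr);
        if (error) {
            return error;
        }
        error = get_mask(&mask);
        if (error) {
            return error;           /* Cache stays invalid on failure. */
        }
        cache->address = addr;
        cache->netmask = mask;
        cache->valid = true;        /* Only after both queries succeed. */
    }
    *address = cache->address;
    *netmask = cache->netmask;      /* Set on the cached path as well. */
    return 0;
}

/* Sample callbacks for trying the sketch out. */
int ok_addr(uint32_t *v) { *v = 0x0a000001; return 0; }
int ok_mask(uint32_t *v) { *v = 0xffffff00; return 0; }
int bad_mask(uint32_t *v) { (void) v; return 5; }
```

Both output arguments are written on every successful return, and the cache is marked valid only once the address and the netmask have both been retrieved.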
10 years agonetdev-bsd: Fix fd leak on error path.
Ben Pfaff [Thu, 25 Jul 2013 21:41:12 +0000 (14:41 -0700)]
netdev-bsd: Fix fd leak on error path.

Signed-off-by: Ben Pfaff <blp@nicira.com>
CC: Ed Maste <emaste@freebsd.org>
10 years agonetdev-bsd: Fix typo in label name.
Ben Pfaff [Thu, 25 Jul 2013 21:14:09 +0000 (14:14 -0700)]
netdev-bsd: Fix typo in label name.

Signed-off-by: Ben Pfaff <blp@nicira.com>
CC: Ed Maste <emaste@freebsd.org>
10 years agonetdev-bsd: Fix memory leak on error path.
Ben Pfaff [Thu, 25 Jul 2013 21:03:32 +0000 (14:03 -0700)]
netdev-bsd: Fix memory leak on error path.

Signed-off-by: Ben Pfaff <blp@nicira.com>
CC: Ed Maste <emaste@freebsd.org>
10 years agobfd: Fix build on netbsd-6.
YAMAMOTO Takashi [Thu, 8 Aug 2013 00:33:24 +0000 (09:33 +0900)]
bfd: Fix build on netbsd-6.

ip.h requires in_systm.h here.

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoUpdate OPENFLOW-1.1+ to differentiate optional and required features
Simon Horman [Wed, 7 Aug 2013 00:28:00 +0000 (09:28 +0900)]
Update OPENFLOW-1.1+ to differentiate optional and required features

The purpose of this patch is primarily to provide details on which
unimplemented features are optional and which are required as this
may be of interest to those working on OpenFlow 1.1+ coverage.

This patch also:
* Clarifies the text of some entries which seemed difficult to understand
  for the authors of this patch.
* Adds entries for features that were missing from the existing list.
  N.B: It is entirely possible that there are still missing entries.
* Expands some entries into sub-entries where some portions of
  a feature are required and others are optional

Co-authored-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-ofctl: Add "ofp-parse" command for printing OpenFlow from a file.
Ben Pfaff [Tue, 6 Aug 2013 16:45:07 +0000 (09:45 -0700)]
ovs-ofctl: Add "ofp-parse" command for printing OpenFlow from a file.

Test provided by Alex Wang <alexw@nicira.com>.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoodp-util: Always export the priority and skb_mark netlink attributes.
Andy Zhou [Sat, 3 Aug 2013 19:23:15 +0000 (12:23 -0700)]
odp-util: Always export the priority and skb_mark netlink attributes.

The current Netlink protocol allows a default value of zero if either mark
or priority is not specified (this is part of the ABI).  Until now, when
userspace serializes either the value or mask, it looked at the value and
omitted the netlink attribute if it is zero.  This is a bug because an
exact match on zero turns into a wildcard of the field.

These two fields (plus input port and EtherType) are special because they
can be omitted whereas most other values are required to be fully
specified.  These protocol variations tend to cause bugs (as above) when we
evolve the protocol because an exception that makes sense in one context
might not be logical in another.  Since the default value for mark and
priority are merely shorthands, we can push the protocol in a more
consistent direction by ignoring the shortcut and always serializing the
values.  This is what this commit does.

Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com added Jesse's text to the commit message]
Signed-off-by: Ben Pfaff <blp@nicira.com>
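
One reading of the bug, sketched with hypothetical types: `struct attr` models a single optional attribute, and the three functions are illustrative names, not odp-util or Netlink APIs. The sketch assumes that a missing mask attribute is interpreted as "fully wildcarded".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of one optional attribute. */
struct attr {
    bool present;
    uint32_t value;
};

/* Old behaviour: the decision to emit the mask attribute was based on the
 * key's value, so a zero key suppressed the mask as well. */
struct attr
serialize_mask_old(uint32_t key, uint32_t mask)
{
    struct attr a = { key != 0, mask };
    return a;
}

/* New behaviour: the attribute is always emitted. */
struct attr
serialize_mask_new(uint32_t key, uint32_t mask)
{
    struct attr a = { true, mask };
    (void) key;
    return a;
}

/* Receiver side: a missing mask attribute means "fully wildcarded". */
uint32_t
parse_mask(struct attr a)
{
    return a.present ? a.value : 0;
}
```

With the old serializer, an exact match on zero (`key == 0`, `mask == UINT32_MAX`) arrives at the receiver as a wildcard; with the new one the mask survives the round trip.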
10 years agocfm: update remote opstate only when a CCM is received.
Paul Ingram [Sat, 3 Aug 2013 07:12:36 +0000 (07:12 +0000)]
cfm: update remote opstate only when a CCM is received.

The remote opstate for a CFM interface is presumed to be up unless a CCM is
received which signals opstate down. This means that an interface configured
for CFM demand mode may incorrectly appear to be opstate up if it has not
received a CCM within the last fault interval.

We should remember the last remote opstate for a CFM interface and only
change it when a CCM arrives signaling a change.

Bug #18806
Signed-off-by: Paul Ingram <pingram@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agobfd: Optimize BFD for Megaflows.
Gurucharan Shetty [Sat, 3 Aug 2013 13:46:26 +0000 (13:46 +0000)]
bfd: Optimize BFD for Megaflows.

The current situation is that whenever any packet enters the
userspace, bfd_should_process_flow() looks at the UDP destination
port to figure out whether that is a BFD packet. This means that
UDP destination port cannot be wildcarded for all the other flows
too.

To optimize BFD for megaflows, we introduce a new
'bfd:bfd_dst_mac' field in the database. Whenever this field is set
by a controller, it is assumed that all the BFD packets to/from
this interface will have the destination mac address set as the one
specified in the bfd:bfd_dst_mac field. If this field is set, we
first look at the destination mac address of a packet and if it
does not match the mac address set in bfd:bfd_dst_mac, we do not
process that packet as bfd. If the field does match, we go ahead
and look at the UDP destination port too.

Also, change the default BFD destination mac address to
"00:23:20:00:00:01".

Feature #18850.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
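
The two-stage check might look roughly like this. The struct and function names are hypothetical, the UDP port is kept in host byte order for simplicity, and the `looked_at_udp` output is an illustrative way of modelling whether the UDP destination port had to be examined (and therefore could not be wildcarded).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6
#define BFD_DEST_PORT 3784      /* Single-hop BFD control port (RFC 5881). */

/* Hypothetical subset of the packet fields relevant to the check. */
struct pkt_fields {
    uint8_t dl_dst[ETH_ADDR_LEN];
    uint16_t udp_dst;           /* Host byte order, for simplicity. */
};

/* Hypothetical per-interface BFD configuration. */
struct bfd_cfg {
    bool check_dst_mac;         /* True if bfd:bfd_dst_mac is set. */
    uint8_t dst_mac[ETH_ADDR_LEN];
};

/* Returns true if 'pkt' should be handled as BFD.  '*looked_at_udp' reports
 * whether the UDP destination port was examined at all. */
bool
should_process_as_bfd(const struct bfd_cfg *cfg, const struct pkt_fields *pkt,
                      bool *looked_at_udp)
{
    assert(cfg && pkt && looked_at_udp);
    *looked_at_udp = false;
    if (cfg->check_dst_mac
        && memcmp(cfg->dst_mac, pkt->dl_dst, ETH_ADDR_LEN)) {
        return false;           /* Wrong MAC: UDP port never inspected. */
    }
    *looked_at_udp = true;
    return pkt->udp_dst == BFD_DEST_PORT;
}
```

Packets whose destination MAC does not match are rejected before the UDP port is read, which is what lets the port stay wildcarded for ordinary traffic.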
10 years agoBFD: Populate ToS field in BFD packets.
Pavithra Ramesh [Sat, 20 Jul 2013 07:17:47 +0000 (07:17 +0000)]
BFD: Populate ToS field in BFD packets.

Signed-off-by: Pavithra Ramesh <paramesh@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoBFD: Edit the unit test time/stop command
Pavithra Ramesh [Thu, 1 Aug 2013 09:55:22 +0000 (09:55 +0000)]
BFD: Edit the unit test time/stop command

Run the ovs-appctl time/stop command after OVS_VSWITCHD_START.
Also increase the wait time before checking if BFD session is up in
test 4.

Signed-off-by: Pavithra Ramesh <paramesh@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif-xlate: Take responsibility for ofproto_receive().
Ethan Jackson [Fri, 2 Aug 2013 19:43:03 +0000 (12:43 -0700)]
ofproto-dpif-xlate: Take responsibility for ofproto_receive().

ofproto_receive() is a slightly odd function which doesn't fit
perfectly in either ofproto-dpif or ofproto-dpif-xlate.  However, it's
much easier to reason about its thread safety in ofproto-dpif-xlate,
so this patch moves it there and renames it xlate_receive().

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Cleanup lookup functions.
Ethan Jackson [Sat, 3 Aug 2013 02:31:02 +0000 (19:31 -0700)]
ofproto-dpif-xlate: Cleanup lookup functions.

This patch allows the lookup functions to take NULL as an argument as
a convenience.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Make vlan splinters thread safe.
Ethan Jackson [Fri, 26 Jul 2013 00:42:24 +0000 (17:42 -0700)]
ofproto-dpif: Make vlan splinters thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Guard rule statistics with a mutex.
Ethan Jackson [Sat, 3 Aug 2013 20:13:26 +0000 (13:13 -0700)]
ofproto-dpif: Guard rule statistics with a mutex.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Maintain a pointer to struct dpif.
Ethan Jackson [Sat, 6 Jul 2013 18:46:48 +0000 (11:46 -0700)]
ofproto-dpif-xlate: Maintain a pointer to struct dpif.

This allows us to move some minor functionality from ofproto-dpif to
ofproto-dpif-xlate, where it's easier to ensure it's thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoodp-util: add verbose mode for displaying dp flow.
Andy Zhou [Sat, 3 Aug 2013 19:23:14 +0000 (12:23 -0700)]
odp-util: add verbose mode for displaying dp flow.

When verbose mode is turned on, all dp flow fields described by the netlink
attributes are displayed, including fully wildcarded attributes.
Otherwise, the fully wildcarded attributes are omitted for brevity.

Added -m option to "ovs-dpctl dump-flows" to enable verbose mode. It is
off by default.

Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com added documentation]
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Don't trace on deep resubmit.
Ethan Jackson [Fri, 2 Aug 2013 03:52:01 +0000 (20:52 -0700)]
ofproto-dpif-xlate: Don't trace on deep resubmit.

While this code is useful for debugging, removing it allows us to hide
ofproto_trace() in ofproto-dpif. ofproto_trace() is a complex function
which could be difficult to make "obviously" thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Refactor stp_get_port() calls.
Ethan Jackson [Fri, 2 Aug 2013 21:55:31 +0000 (14:55 -0700)]
ofproto-dpif-xlate: Refactor stp_get_port() calls.

I had intended to fold this into a previous patch.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: avoid losing track of kernel flows upon reinstallation
Andy Zhou [Sat, 3 Aug 2013 03:22:17 +0000 (20:22 -0700)]
ofproto-dpif: avoid losing track of kernel flows upon reinstallation

This commit fixes a problem whereby userspace can lose track of a
flow installed in the kernel, instead believing that the flow is
not installed.  The most visible consequence of this bug was a
message in the ovs-vswitchd log warning about an unexpected flow
in the kernel.  Other possible consequences included loss of
statistics and failure to update actions when the OpenFlow flow
table changed.

The problem arose in the following scenario.  Suppose userspace
sets up a kernel flow due to an arriving packet.  Before kernel
flow setup completes, another packet for that flow arrives.  The
kernel sends the new packet to userspace after userspace has
completed processing the batch of packets that set up the flow.
Userspace then attempts to reinstall the kernel flow.  This fails
with EEXIST, so userspace then marked the flow as not-installed,
even though it was successfully installed before and remains
installed.  The next time userspace dumped the kernel flow
table to gather statistics, it would complain about an unexpected
flow and delete it.

In practice, we have seen these messages with netperf TCP_CRR tests and
UDP stream tests.

This patch fixes the problem by changing userspace so that, once
it successfully installs a flow in the kernel, it will not reinstall
it when it sees another packet for the flow in userspace.  This
has the downside that, if something goes wrong and a flow
disappears from the kernel (e.g. ovs-dpctl del-flows), then userspace
won't reinstall it (until it tries to delete it).  (This is in fact
the reason why until now userspace reinstalled flows it knew it
already installed.)

Some more background may be warranted.  There are two EEXIST error
cases:

       1. A subfacet was installed successfully in a previous (recent)
          batch.  Now we've attempted to reinstall exactly the same
          subfacet in this batch.

       2. A subfacet was installed successfully in a previous (recent)
          batch or earlier in the current batch.  We've attempted to
          install a subfacet for an overlapping megaflow.

Before megaflows, installation errors were ignored completely.
Since megaflows were introduced, they have been handled by
considering on any installation error that the given subfacet is
not installed.  This works well for case #2 but causes case #1 to
yield unexpected flows, as described at the top of the commit
message.

This commit adds the wrinkle that we never try to reinstall
exactly the same subfacet that we know we installed successfully
earlier (and haven't deleted) unless its actions change.  This
ought to work just as well for case #2, and avoids the problem
with case #1.

Prepared with assistance from Ethan.

Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com rewrote the commit message]
Signed-off-by: Ben Pfaff <blp@nicira.com>
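
The resulting rule can be sketched as a tiny state machine. The names below (`struct subfacet_like`, `handle_miss()`, the `install_fn` callbacks) are hypothetical and greatly simplified compared with ofproto-dpif.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, minimal model of a subfacet's install state. */
struct subfacet_like {
    bool installed;
};

/* Hypothetical install callback: returns 0 on success or an errno value. */
typedef int install_fn(void);

/* Handles another packet for 'sf' arriving in userspace.  Returns true if
 * an install was attempted. */
bool
handle_miss(struct subfacet_like *sf, bool actions_changed,
            install_fn *install)
{
    assert(sf && install);
    if (sf->installed && !actions_changed) {
        return false;               /* Known to be installed: leave it. */
    }
    /* Any error, including EEXIST from an overlapping megaflow (case #2),
     * means this particular subfacet is considered not installed. */
    sf->installed = !install();
    return true;
}

/* Sample callbacks for trying the sketch out. */
int install_ok(void) { return 0; }
int install_eexist(void) { return EEXIST; }
```

Case #1 never reaches the install call, so its EEXIST can no longer flip an installed subfacet to not-installed, while case #2 is handled as before.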
10 years agoofproto-dpif: Always un-wildcard fields that are being set.
Justin Pettit [Sat, 3 Aug 2013 04:17:31 +0000 (21:17 -0700)]
ofproto-dpif: Always un-wildcard fields that are being set.

The ODP library has an optimization to not set a header if the field was
not changed, regardless of whether an action to set the field was
present.  That library is also responsible for un-wildcarding fields
that are being modified.  This leads to a problem where a packet matches
a flow that updates a field, but that particular packet's field already
has that value.  As such, an overly loose megaflow will be generated
that doesn't match on that field and the actions won't update it.  A
second packet that should have the field set will match that flow and
will not be modified.

This commit changes the behavior to always un-wildcard fields that are
being modified.  Since the ODP library updates the entire header if a
field in it is modified, and all those fields will be un-wildcarded, the
generated flows may be different.  However, they should be correct.

Bug #18946.

Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
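
A minimal sketch of the changed behavior for a single field, using hypothetical types (`struct flow_like`, `struct wc_like`) and a hypothetical `commit_set_tos()` helper rather than the real ODP library code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-field flow and its wildcard mask. */
struct flow_like {
    uint8_t nw_tos;
};

struct wc_like {
    uint8_t nw_tos_mask;        /* 0 = wildcarded, UINT8_MAX = exact. */
};

/* Applies "set nw_tos = new_tos".  Returns true if a datapath set action
 * had to be emitted for this particular packet. */
bool
commit_set_tos(struct flow_like *flow, struct wc_like *wc, uint8_t new_tos)
{
    assert(flow && wc);
    wc->nw_tos_mask = UINT8_MAX;    /* Always un-wildcard a field being set. */
    if (flow->nw_tos == new_tos) {
        return false;               /* Optimization: value already matches. */
    }
    flow->nw_tos = new_tos;
    return true;
}
```

The megaflow now matches on the field even when the first packet already carried the target value, so a later packet with a different value cannot hit a flow that lacks the set action.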
10 years agoasync-append: Refactor to avoid requiring enabling while single threaded.
Ben Pfaff [Sat, 3 Aug 2013 00:32:25 +0000 (17:32 -0700)]
async-append: Refactor to avoid requiring enabling while single threaded.

Until now, the async append interface has required async_append_enable()
to be called while the process was still single-threaded, with the
rationale being that async_append_enable() could race with
async_append_write() on some existing async_append object.  This was a
difficult problem when the async append interface was introduced, because
at the time Open vSwitch did not have any infrastructure for inter-thread
synchronization.

Now it is easy to solve, by introducing synchronization into the
async append module.  However, that's more or less wasted, because the
client is already required to serialize access to async append objects.
Moreover, vlog, the only existing client, needs to serialize access for
other reasons, so it wouldn't even be possible to just drop the client's
synchronization.

This commit therefore takes another approach.  It drops the
async_append_enable() interface entirely.  Now any existing async_append
object is always enabled.  The responsibility for "enabling", then, now
rests in whether the client creates and uses an async_append object, and
so vlog now takes care of that by itself.  Also, since vlog now has to
deal with sometimes having an async_append and sometimes not having one,
we might as well allow creating an async_append to fail, thereby slightly
simplifying the "no async I/O" implementation from "write synchronously"
to "always fail creating an async_append".

Reported-by: Shih-Hao Li <shihli@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Handle learn action flow mods asynchronously.
Ethan Jackson [Fri, 12 Jul 2013 00:17:00 +0000 (17:17 -0700)]
ofproto-dpif: Handle learn action flow mods asynchronously.

Once we have multiple threads running, having each execute flow mods
created by the learn action won't be tenable.  It essentially will
require us to make the core ofproto module thread safe, which is not
the direction we want to go.  This patch punts on the problem by
handing flow mods to ofproto-dpif to handle later.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Take control of the qdscp map.
Ethan Jackson [Sat, 6 Jul 2013 17:25:06 +0000 (10:25 -0700)]
ofproto-dpif-xlate: Take control of the qdscp map.

This will make locking easier in future patches.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Pull STP xlation into ofproto-dpif-xlate.
Ethan Jackson [Sat, 6 Jul 2013 16:31:35 +0000 (09:31 -0700)]
ofproto-dpif-xlate: Pull STP xlation into ofproto-dpif-xlate.

This patch pulls the STP xlation code into ofproto-dpif-xlate where it
will be easier to guard.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agotag: Retire the venerable tag library.
Ethan Jackson [Fri, 2 Aug 2013 00:07:08 +0000 (17:07 -0700)]
tag: Retire the venerable tag library.

This patch retires a venerable library whose inception dates before
the first patch of the current repository: tags.  They have served us
well, but their time has come for the reasons listed below.

1) They don't actually help much.
In theory, tags had been used to reduce revalidation necessary when
using bonds, mac-learning, and frequently changing flow tables.  With
bonds and mac-learning, changes happen so rarely that tagging
isn't worth it.  That leaves flow table changes. With the complex flow
tables in my testing, the revalidate_set gets so overwhelmed with
tags, that we end up revalidating every facet every time through the
run loop.  In other words, the tags are giving us no benefit.

2) They complicate the code.
This patch simplifies the code and removes a couple of rather ugly
kludges.

3) They complicate locking once threading hits.
Because of the calculate_flow_tag() function, the table_dpif structure
would require locking in a multi-threaded OVS.  Though this problem
isn't insurmountable, it's annoying and probably would cause lock
contention.

Of course, we could try to work around these problems with a more
advanced tagging infrastructure, but this moves in the opposite direction
from where we should be going.  Ideally we'll have a more-or-less stateless
ofproto-dpif supporting a massive number of datapath flows.  Tags (or
facets for that matter) aren't going to work in this new world.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agobond: Stop using tags.
Ethan Jackson [Fri, 2 Aug 2013 01:23:13 +0000 (18:23 -0700)]
bond: Stop using tags.

This patch transitions bonding away from using tags as required by
future patches.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agomac-learning: Stop using tags.
Ethan Jackson [Fri, 2 Aug 2013 01:04:07 +0000 (18:04 -0700)]
mac-learning: Stop using tags.

This patch transitions mac learning away from using tags as required
by future patches.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev: Minor formatting improvements.
Ben Pfaff [Wed, 24 Jul 2013 21:20:43 +0000 (14:20 -0700)]
netdev: Minor formatting improvements.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-linux: Don't assume 'struct netdev' has offset 0.
Ben Pfaff [Fri, 2 Aug 2013 19:19:49 +0000 (12:19 -0700)]
netdev-linux: Don't assume 'struct netdev' has offset 0.

The data items returned by netdev_get_devices() are "struct netdev *"s.
The code fixed up by this commit used them as "struct netdev_linux *",
which happens to work because struct netdev happens to be at offset 0 in
each struct but it's better to do a proper cast in case someday
struct netdev gets moved to a nonzero offset.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-dummy: Fix memory leak on error path in netdev_rx_dummy_recv().
Ben Pfaff [Tue, 30 Jul 2013 19:26:08 +0000 (12:26 -0700)]
netdev-dummy: Fix memory leak on error path in netdev_rx_dummy_recv().

This code failed to free the packet if it was too big for the caller.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-linux: Initialize change_seq for tap devices too.
Ben Pfaff [Fri, 26 Jul 2013 23:27:19 +0000 (16:27 -0700)]
netdev-linux: Initialize change_seq for tap devices too.

change_seq is supposed to always be nonzero but tap devices got this wrong.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-linux: Fix fd leak on error path.
Ben Pfaff [Fri, 26 Jul 2013 00:03:03 +0000 (17:03 -0700)]
netdev-linux: Fix fd leak on error path.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif-linux: Fix theoretical memory leak on error path.
Ben Pfaff [Thu, 1 Aug 2013 21:07:35 +0000 (14:07 -0700)]
dpif-linux: Fix theoretical memory leak on error path.

If a notification is bigger than 4 kB (I doubt one would be), then the
lack of ofpbuf_uninit() would cause a memory leak.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoodp-util: Always serialize tunnel mask attributes.
Jesse Gross [Thu, 1 Aug 2013 23:17:47 +0000 (16:17 -0700)]
odp-util: Always serialize tunnel mask attributes.

A tunnel value attribute is not allowed to have an empty IP destination
address but this is legal for masks. This drops both the checks for
serializing masks and also the sanity checks on them.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agodatapath: Introduce is_mask when serializing netlink attributes.
Jesse Gross [Thu, 1 Aug 2013 23:17:46 +0000 (16:17 -0700)]
datapath: Introduce is_mask when serializing netlink attributes.

The intention is clearer than if we rederive it in every location.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoofproto-dpif-xlate: Unmask mark when used for tunnels.
Jesse Gross [Thu, 1 Aug 2013 20:31:28 +0000 (13:31 -0700)]
ofproto-dpif-xlate: Unmask mark when used for tunnels.

The tunnel lookup uses the skb_mark as part of the port find process
but it isn't unmasked along with the other fields. This adds it to
the list of significant fields.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif: Hide rule_calculate_tag().
Ethan Jackson [Thu, 1 Aug 2013 23:05:11 +0000 (16:05 -0700)]
ofproto-dpif: Hide rule_calculate_tag().

No one uses it except ofproto-dpif.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoINSTALL, CodingStyle: Recognize that Clang is an acceptable compiler.
Ben Pfaff [Thu, 1 Aug 2013 22:27:52 +0000 (15:27 -0700)]
INSTALL, CodingStyle: Recognize that Clang is an acceptable compiler.

Clang has nice static analysis and works well as an Open vSwitch compiler,
so mention it more explicitly.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif-xlate: Don't try to optimize goto table.
Ethan Jackson [Sat, 27 Jul 2013 19:24:15 +0000 (12:24 -0700)]
ofproto-dpif-xlate: Don't try to optimize goto table.

This patch reverts commit 5559942 (ofproto-dpif: GOTO_TABLE recursion
removal.) by reintroducing the recursion through xlate_table_action().
The main reason to do this is the introduction of new rule locking in
future patches.  The code before this patch was relatively difficult
to lock in a clean straight-forward manner.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: Accept any 802.2 eth_type mask but override to be exact match
Andy Zhou [Thu, 1 Aug 2013 17:49:46 +0000 (10:49 -0700)]
datapath: Accept any 802.2 eth_type mask but override to be exact match

When key.eth_type is absent it is interpreted to be 802.2, which is
represented by a special value. In order to prevent inadvertent matches
on this opaque value, the mask is forced to be either fully wildcarded
or fully exact.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Accept any in_port mask but override to be exact match
Andy Zhou [Thu, 1 Aug 2013 17:49:45 +0000 (10:49 -0700)]
datapath: Accept any in_port mask but override to be exact match

Pre mega flow, netlink allows the in_port key attribute
to be missing. Missing in_port is interpreted as DP_MAX_PORTS.

For backward compatibility, mega flow implementation will always allow
the mask of in_port to be specified, as if the in_port key attribute
is always specified.

To prevent accidental match of DP_MAX_PORTS, whose value is opaque to
user space, we will always force the mask to be exact match,
regardless of the value supplied by the netlink message. A missing
in_port mask continues to mean wildcarded match, same as other masks.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
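
The mask rule described above reduces to a small decision, sketched here with a hypothetical helper. The function name, the 32-bit mask width and the boolean "attribute present" flag are assumptions made for illustration, not the datapath's actual representation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the in_port mask that takes effect.  If the mask attribute is
 * missing, the field is wildcarded; if it is present, the supplied value is
 * ignored and the mask is forced to exact match. */
uint32_t
effective_in_port_mask(bool mask_attr_present, uint32_t requested_mask)
{
    (void) requested_mask;      /* Deliberately ignored. */
    return mask_attr_present ? UINT32_MAX : 0;
}
```

Under this rule a partial mask can never match the opaque "missing in_port" value by accident.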
10 years agomac-learning: Make the mac-learning module thread safe.
Ethan Jackson [Mon, 22 Jul 2013 18:11:54 +0000 (11:11 -0700)]
mac-learning: Make the mac-learning module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agobond: Make the bond module thread safe.
Ethan Jackson [Tue, 23 Jul 2013 01:17:23 +0000 (18:17 -0700)]
bond: Make the bond module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agobfd: Make the BFD module thread safe.
Ethan Jackson [Fri, 26 Jul 2013 23:18:00 +0000 (16:18 -0700)]
bfd: Make the BFD module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agocfm: Make the CFM module thread safe.
Ethan Jackson [Tue, 23 Jul 2013 20:09:38 +0000 (13:09 -0700)]
cfm: Make the CFM module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agostp: Make the STP module thread safe.
Ethan Jackson [Sat, 6 Jul 2013 16:34:34 +0000 (09:34 -0700)]
stp: Make the STP module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoconfigure: Distinguish glibc and NetBSD pthread_setname_np() variants.
Ben Pfaff [Thu, 1 Aug 2013 16:35:56 +0000 (09:35 -0700)]
configure: Distinguish glibc and NetBSD pthread_setname_np() variants.

Reported-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Tested-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Always allow tunnel mask to be specified in the netlink
Andy Zhou [Thu, 1 Aug 2013 03:39:49 +0000 (20:39 -0700)]
datapath: Always allow tunnel mask to be specified in the netlink

A Netlink message usually only accepts a mask when there is a
corresponding key attribute. Tunnel mask and eth_type are the
only two exceptions so far.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoovs-atomic-pthreads: Fix "has incomplete type" error.
Alex Wang [Wed, 31 Jul 2013 23:09:11 +0000 (16:09 -0700)]
ovs-atomic-pthreads: Fix "has incomplete type" error.

Commit 97be153858b4cd175cbe7862b8e1624bf22ab98a (clang: Add
annotations for thread safety check.) defined 'struct ovs_mutex'
variable in 'atomic_flag' in 'ovs-atomic-pthreads.h'. This
caused "mutex: has incomplete type" error in compilation when
'ovs-atomic-pthreads.h' is included.

This commit goes back to using 'pthread_mutex_t' for that variable
and adds a test for the 'atomic_flag' related functions.

Reported-by: Gurucharan Shetty <gshetty@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoodp-util: fix bug in setting ipv4 frag flag mask
Andy Zhou [Wed, 31 Jul 2013 20:54:12 +0000 (13:54 -0700)]
odp-util: fix bug in setting ipv4 frag flag mask

This bug causes the flag mask to always mask only 1 bit, not the 2 bits
possible. While at it, make the top 6 bits exact match.

Bug #18834.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Support for Linux kernel 3.9.
Kyle Mestery [Wed, 31 Jul 2013 21:18:19 +0000 (17:18 -0400)]
datapath: Support for Linux kernel 3.9.

In certain cases we need to ensure we save off skb->cb before
calling __skb_gso_segment() since in kernels >= 3.9 skb->cb is
used by this routine.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agobfd: Downgrade long delay messages to INFO level.
Ethan Jackson [Wed, 31 Jul 2013 18:29:28 +0000 (11:29 -0700)]
bfd: Downgrade long delay messages to INFO level.

On my system the long delay messages can cause a transient BFD unit
test failure due to the log checking.  These messages don't *really*
need to be at WARN level, so this patch downgrades them.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoMerge remote-tracking branch 'ovs-dev/master'
Giuseppe Lettieri [Wed, 31 Jul 2013 20:17:34 +0000 (22:17 +0200)]
Merge remote-tracking branch 'ovs-dev/master'

Conflicts:
.gitignore

10 years agoofproto-dpif-ipfix: Make the ofproto-dpif-ipfix module thread safe.
Ethan Jackson [Tue, 23 Jul 2013 00:39:14 +0000 (17:39 -0700)]
ofproto-dpif-ipfix: Make the ofproto-dpif-ipfix module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-sflow: Make the ofproto-dpif-sflow module thread safe.
Ethan Jackson [Mon, 22 Jul 2013 19:32:19 +0000 (12:32 -0700)]
ofproto-dpif-sflow: Make the ofproto-dpif-sflow module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agotunnel: Make the ofproto-dpif tunnel module thread safe.
Ethan Jackson [Tue, 23 Jul 2013 19:03:37 +0000 (12:03 -0700)]
tunnel: Make the ofproto-dpif tunnel module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agolacp: Make the LACP module thread safe.
Ethan Jackson [Tue, 23 Jul 2013 01:35:28 +0000 (18:35 -0700)]
lacp: Make the LACP module thread safe.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agocompiler: Fix OVS_LOCKS_EXCLUDED on non clang compilers.
Ethan Jackson [Wed, 31 Jul 2013 17:49:34 +0000 (10:49 -0700)]
compiler: Fix OVS_LOCKS_EXCLUDED on non clang compilers.

This patch renames OVS_LOCKS_EXCLUDED to simply OVS_EXCLUDED so it's
more consistent with the other thread safety annotations.  It also
adds it to the non-clang compilers.
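
A minimal sketch of how an annotation like this can be defined so that it
expands to a clang attribute when thread safety analysis is available and
to nothing on other compilers.  The macro bodies, 'struct demo_mutex', and
demo_counter_inc() are illustrative assumptions, not the definitions
actually used in the tree:

```c
#include <assert.h>
#include <pthread.h>

/* Compilers other than clang do not provide __has_feature(). */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(c_thread_safety_attributes)
/* clang: these attributes are checked under -Wthread-safety. */
#define OVS_LOCKABLE __attribute__((lockable))
#define OVS_EXCLUDED(...) __attribute__((locks_excluded(__VA_ARGS__)))
#else
/* Other compilers: the annotations expand to nothing. */
#define OVS_LOCKABLE
#define OVS_EXCLUDED(...)
#endif

/* Hypothetical lock type and protected data, for illustration only. */
struct OVS_LOCKABLE demo_mutex {
    pthread_mutex_t lock;
};

static struct demo_mutex demo_mutex = { PTHREAD_MUTEX_INITIALIZER };
static int demo_counter;

/* The caller must NOT hold 'demo_mutex': the function takes it itself. */
int demo_counter_inc(void) OVS_EXCLUDED(demo_mutex);

int
demo_counter_inc(void)
{
    int error, value;

    error = pthread_mutex_lock(&demo_mutex.lock);
    assert(!error);
    value = ++demo_counter;
    error = pthread_mutex_unlock(&demo_mutex.lock);
    assert(!error);
    return value;
}
```

Because the fallback definitions are empty, the same declaration compiles
unchanged with GCC and other non-clang compilers.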

Signed-off-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: always export priority and skb_mark in netlink message
Andy Zhou [Wed, 31 Jul 2013 02:49:12 +0000 (19:49 -0700)]
datapath: always export priority and skb_mark in netlink message

Handling of missing attributes in netlink can be tricky and turns out
to be error prone. The value (savings in netlink bandwidth) does not
seem to be significant enough to justify allowing them. This patch
series makes both kernel and userspace always export the priority and
skb_mark attributes. There will be follow-on patches in the
direction of making all attributes explicit.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoovsdb-doc: Add ovsdb-doc to distribution tar ball.
Gurucharan Shetty [Wed, 31 Jul 2013 16:24:46 +0000 (09:24 -0700)]
ovsdb-doc: Add ovsdb-doc to distribution tar ball.

Certain platforms like xenserver do not have the latest
python libraries that are needed by ovsdb-doc (which in turn
creates ovs-vswitchd.conf.db.5). When we run 'make dist' and
copy the tarball over to the xenserver ddk environment, we
already include ovs-vswitchd.conf.db.5. But the absence of
ovsdb-doc results in an attempt to regenerate ovs-vswitchd.conf.db.5,
which fails because of the missing python libraries.

Instead of producing ovsdb-doc from ovsdb-doc.in dynamically, we
statically provide ovsdb-doc and pass on the version information
to it through the command line option --version.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agotest-atomic: Re-enable atomic read-write-modify tests.
Ben Pfaff [Mon, 15 Jul 2013 21:14:05 +0000 (14:14 -0700)]
test-atomic: Re-enable atomic read-write-modify tests.

This reverts commit 05d299e0ccca80736cd4438c3224540c5448a7d4 (test-atomic:
Drop atomic read-modify-write tests for the moment.) because the
test for detecting whether GCC supports atomic operation built-ins has
been fixed.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoconfigure: Add configure-time check for GCC 4.0+ atomic built-ins.
Ben Pfaff [Mon, 15 Jul 2013 21:13:53 +0000 (14:13 -0700)]
configure: Add configure-time check for GCC 4.0+ atomic built-ins.

We found out earlier that GCC sometimes produces an error only at link time
for atomic built-ins that are not supported on a platform.  This actually
tries the link at configure time and should thus reliably detect whether
the atomic built-ins are really supported.
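
As a rough illustration of what such a check has to exercise, the C
fragment below uses a few of GCC's __sync built-ins on a 32-bit object.  A
configure test can compile and link something along these lines, and a
link failure then shows that the built-ins are not really available.  The
function name is made up for this sketch, and the test program that
configure actually uses may differ:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe that exercises several GCC __sync built-ins.  On a
 * platform that lacks them, the compiler emits calls to out-of-line
 * helpers that nothing defines, so the failure appears only at link time. */
int
atomic_builtins_probe(void)
{
    uint32_t x = 1;
    uint32_t old;

    old = __sync_fetch_and_add(&x, 2);           /* x: 1 -> 3, returns 1. */
    assert(old == 1);

    old = __sync_val_compare_and_swap(&x, 3, 7); /* x: 3 -> 7, returns 3. */
    assert(old == 3);

    __sync_synchronize();                        /* Full memory barrier. */
    return (int) x;
}
```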

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoclang: Add annotations for thread safety check.
Ethan Jackson [Tue, 30 Jul 2013 22:31:48 +0000 (15:31 -0700)]
clang: Add annotations for thread safety check.

This commit adds annotations for thread safety checking.  The
check can be run by passing the -Wthread-safety flag to clang.

Co-authored-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: link upper device for port devices
Jiri Pirko [Tue, 30 Jul 2013 23:27:31 +0000 (16:27 -0700)]
datapath: link upper device for port devices

Link the upper device properly so that IFLA_MASTER gets filled in.
Set the master to port 0 of the datapath to which the port belongs.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoBFD: Unit tests for BFD.
Pavithra Ramesh [Sat, 27 Jul 2013 09:58:06 +0000 (09:58 +0000)]
BFD: Unit tests for BFD.

Signed-off-by: Pavithra Ramesh <paramesh@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoovs-vswitchd.conf.db: Correct the header and footer lines.
Gurucharan Shetty [Tue, 30 Jul 2013 17:53:02 +0000 (10:53 -0700)]
ovs-vswitchd.conf.db: Correct the header and footer lines.

Right now, the following 2 lines are what the header and footer
look like for ovs-vswitchd.conf.db:

@VERSION@(5)   Open vSwitch Manual    @VERSION@(5)
Open vSwitch   Open_vSwitch           @VERSION@(5)

After this commit, they look like this:
ovs-vswitchd.conf.db(5)   Open vSwitch Manual   ovs-vswitchd.conf.db(5)
Open vSwitch              1.12.90               ovs-vswitchd.conf.db(5)

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Only track drop flows that are installed
Andy Zhou [Tue, 30 Jul 2013 17:52:35 +0000 (10:52 -0700)]
ofproto-dpif: Only track drop flows that are installed

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Unset DPIF_FP_MODIFY flag when creating a new flo
Andy Zhou [Tue, 30 Jul 2013 17:52:34 +0000 (10:52 -0700)]
ofproto-dpif: Unset DPIF_FP_MODIFY flag when creating a new flo

Remove the DPIF_FP_MODIFY flag when creating a new flow. When flows arrive in
a batch, userspace may push down multiple unique flow definitions that
overlap when wildcards are applied. Kernels that support flow wildcarding
will reject these flows as duplicates (EEXIST), which will be logged
at a lower logging level.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agovalgrind: Update glibc timer_create() suppression.
Ethan Jackson [Mon, 29 Jul 2013 22:27:15 +0000 (15:27 -0700)]
valgrind: Update glibc timer_create() suppression.

For some reason the current suppression fails to actually suppress the
warning.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
10 years agoovs-dev.py: Use custom suppressions when running valgrind.
Ethan Jackson [Mon, 29 Jul 2013 22:26:17 +0000 (15:26 -0700)]
ovs-dev.py: Use custom suppressions when running valgrind.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: Add vxlan and flow_dissector to gitignore.
Ethan Jackson [Mon, 29 Jul 2013 21:23:27 +0000 (14:23 -0700)]
datapath: Add vxlan and flow_dissector to gitignore.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agoAvoid C preprocessor trick where macro has the same name as a function.
Ben Pfaff [Mon, 29 Jul 2013 22:24:45 +0000 (15:24 -0700)]
Avoid C preprocessor trick where macro has the same name as a function.

In C, one can do preprocessor tricks by making a macro expansion include
the macro's own name.  We actually used this in the tree to automatically
provide function arguments, e.g.:

    int f(int x, const char *file, int line);
    #define f(x) f(x, __FILE__, __LINE__)

...

    f(1);    /* Expands to a call like f(1, __FILE__, __LINE__); */

However it's somewhat confusing, so this commit stops using that trick.
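
One way to avoid the trick is to give the function and the macro different
names.  The sketch below assumes a renamed function f_at() and a trivial
body that just records its arguments; both are illustrative and not taken
from the tree:

```c
#include <assert.h>

/* Where the most recent call came from (illustrative bookkeeping). */
static const char *last_file;
static int last_line;

/* The function that does the work gets its own, distinct name... */
int
f_at(int x, const char *file, int line)
{
    last_file = file;
    last_line = line;
    return x * 2;
}

/* ...so the convenience macro no longer expands to its own name. */
#define f(x) f_at(x, __FILE__, __LINE__)
```

A call such as f(21) now visibly expands to f_at(21, __FILE__, __LINE__),
and f_at() can still be called directly with an explicit location.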

Reported-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ed Maste <emaste@freebsd.org>
10 years agoofproto-dpif: Tolerate spontaneous changes in datapath port numbers.
Ben Pfaff [Mon, 29 Jul 2013 22:11:49 +0000 (15:11 -0700)]
ofproto-dpif: Tolerate spontaneous changes in datapath port numbers.

This can happen on ESX.

Also adds a test to make sure this works.

Bug #17634.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Tested-by: Guolin Yang <gyang@vmware.com>
10 years agoofproto-dpif: Correctly refresh all ports on ENOBUFS from dpif_port_poll().
Ben Pfaff [Thu, 13 Jun 2013 20:20:17 +0000 (13:20 -0700)]
ofproto-dpif: Correctly refresh all ports on ENOBUFS from dpif_port_poll().

dpif_port_poll() is allowed to return ENOBUFS if something might have
changed, but the specific change isn't easily reportable.  type_run()
didn't handle this case, so it wouldn't notice any changes when this
happened.

dpif-netdev (including dpif-dummy) uses ENOBUFS exclusively to report
changes, so this fixes a problem there.  dpif-linux rarely uses ENOBUFS but
it can do so if a kernel-to-user Netlink buffer overflows.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix missing VLAN netlink attribute handling
Andy Zhou [Mon, 29 Jul 2013 21:05:23 +0000 (14:05 -0700)]
datapath: Fix missing VLAN netlink attribute handling

Missing VLAN netlink attribute should be interpreted as exact match
of no VLAN tag, instead of wildcarded match for all VLAN tags.

Bug #18736.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoovs-dev.py: Add support for clang builds.
Ethan Jackson [Sat, 27 Jul 2013 00:06:15 +0000 (17:06 -0700)]
ovs-dev.py: Add support for clang builds.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-dev.py: Rely on configure for warning options.
Ethan Jackson [Mon, 29 Jul 2013 20:34:57 +0000 (13:34 -0700)]
ovs-dev.py: Rely on configure for warning options.

Both -Wall and -Wextra are handled by autoconf, so there's no longer a
need for ovs-dev.py to pass them through CFLAGS.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: fix a bug in SF_FLOW_KEY_PUT macro
Andy Zhou [Mon, 29 Jul 2013 20:26:08 +0000 (13:26 -0700)]
datapath: fix a bug in SF_FLOW_KEY_PUT macro

This bug will cause mask values to corrupt the flow key value. So far
the bug has not shown up because we don't write a mask value when
there are no mask Netlink attributes.  However, it needs to be fixed for
the next and future commits, where we will start to set default
values for key and mask for missing Netlink attributes.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: list: Fix double fetch of pointer in hlist_entry_safe()
Pravin B Shelar [Fri, 26 Jul 2013 20:52:24 +0000 (13:52 -0700)]
datapath: list: Fix double fetch of pointer in hlist_entry_safe()

This patch backports commit f65846a1800ef8c48d (list: Fix double
fetch of pointer in hlist_entry_safe()) from the upstream kernel.
It fixes the following panic. Thanks to Jesse for helping to
debug this issue.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000118
[129608.216422] IP: [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [openvswitch]
[129608.216918] PGD 11500a067 PUD 120851067 PMD 0
[129608.216994] Oops: 0000 [#1] SMP
[129608.217049] CPU 0
[129608.218697]
[129608.218726] Pid: 0, comm: swapper/0 Tainted: G           O
3.2.39-server-nn21 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform
[129608.219288] RIP: 0010:[<ffffffffa02436da>]  [<ffffffffa02436da>]
ovs_masked_flow_lookup+0xda/0x140 [openvswitch]
[129608.219454] RSP: 0018:ffff88013fc03b60  EFLAGS: 00010282
[129608.219536] RAX: 0000000000000020 RBX: ffff880123087100 RCX:
ffff88012098e000
[129608.219719] RDX: ffff8800b3b0ca30 RSI: 000000000000010a RDI:
ffff88011df8c000
[129608.220121] RBP: ffff88013fc03c30 R08: 0000000000000001 R09:
0000000020069825
[129608.220287] R10: 0000000000000020 R11: 0000000000000001 R12:
ffff880036e1c6c0
[129608.220451] R13: ffff88013fc03b98 R14: 0000000000000024 R15:
ffffffffffffffe0
[129608.220618] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000)
knlGS:0000000000000000
[129608.220794] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[129608.220911] CR2: 0000000000000118 CR3: 00000001190c9000 CR4:
00000000000406f0
[129608.221122] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[129608.221320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[129608.221488] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000,
task ffffffff81c0d020)
[129608.221669] Stack:
[129608.221725]  0000000000000044 0000000000000020 ffff88013fc03c60
0000000000000000
[129608.221906]  0000000000000000 0000000000000000 0000000000000000
f014232200000002
[129608.222069]  1973f0142322015a 0000000600080000 1973f0140579f414
000000002f1dc7ec
[129608.222211] Call Trace:
[129608.222264]  <IRQ>
[129608.222316]  [<ffffffffa02445bd>] ovs_flow_lookup+0x5d/0x70
[openvswitch]
[129608.222411]  [<ffffffffa0242550>] ovs_dp_process_received_packet+0x70/0x110 [openvswitch]
[129608.222541]  [<ffffffff8104d6ec>] ? resched_task+0x2c/0x80
[129608.222644]  [<ffffffffa0249b20>] ? netdev_create+0x120/0x120
[openvswitch]
[129608.222743]  [<ffffffffa02483f8>] ovs_vport_receive+0x38/0x40
[openvswitch]
[129608.222838]  [<ffffffffa0249bc3>] netdev_frame_hook+0xa3/0xf0
[openvswitch]
[129608.222933]  [<ffffffffa0249b20>] ? netdev_create+0x120/0x120
[openvswitch]
[129608.223029]  [<ffffffff81539318>] __netif_receive_skb+0x1c8/0x620
[129608.223114]  [<ffffffff81325740>] ? map_single+0x60/0x60
[129608.223192]  [<ffffffff81539b71>] process_backlog+0xb1/0x190
[129608.223274]  [<ffffffff8153ae64>] net_rx_action+0x134/0x290
[129608.223355]  [<ffffffff8106e1a8>] __do_softirq+0xa8/0x210
[129608.223433]  [<ffffffff816458de>] ? _raw_spin_lock+0xe/0x20
[129608.223513]  [<ffffffff8164fcac>] call_softirq+0x1c/0x30
[129608.223590]  [<ffffffff81016215>] do_softirq+0x65/0xa0
[129608.223665]  [<ffffffff8106e58e>] irq_exit+0x8e/0xb0
[129608.223738]  [<ffffffff81650573>] do_IRQ+0x63/0xe0
[129608.223808]  [<ffffffff81645d6e>] common_interrupt+0x6e/0x6e
[129608.223887]  <EOI>
[129608.223933]  [<ffffffff8103ced6>] ? native_safe_halt+0x6/0x10
[129608.224014]  [<ffffffff8101c6a3>] default_idle+0x53/0x1d0
[129608.224092]  [<ffffffff81013236>] cpu_idle+0xd6/0x120
[129608.224167]  [<ffffffff8161785e>] rest_init+0x72/0x74
[129608.224252]  [<ffffffff81cfcbaa>] start_kernel+0x3b5/0x3c2
[129608.224331]  [<ffffffff81cfc347>]
x86_64_start_reservations+0x132/0x136
[129608.224421]  [<ffffffff81cfc140>] ? early_idt_handlers+0x140/0x140
[129608.224506]  [<ffffffff81cfc44d>] x86_64_start_kernel+0x102/0x111
[129608.224589] Code: 25 48 63 53 28 48 8d 42 01 48 c1 e0 04 49 01 c7 49
8b 07 48 85 c0 74 61 4d 8b 3f 48 c1 e2 04 48 83 c2 10 49 29 d7 4d 85 ff
74 26 <4d> 39 a7 38 01 00 00 75 cd 48 8b 95 38 ff ff ff 4c 89 ee 49 8d
[129608.224949] RIP  [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [openvswitch]

Original commit msg:

list: Fix double fetch of pointer in hlist_entry_safe()

The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure.  This is OK for normal lock-based use because in
that case, the pointer cannot change.  However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time.  This use case can result in the following sequence
of events:

1.  CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
    pointer and sees that it is non-NULL.

2.  CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
    just fetched a pointer to.  Because this is the last entry
    in the list, the pointer fetched by CPU 0 is now NULL.

3.  CPU 0 refetches the pointer, obtains NULL, and then gets a
    NULL-pointer crash.

This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events.  Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.
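
The single-fetch idea can be sketched in userspace as follows.  The macro
is modelled on the fix described above and relies on GCC's statement
expressions and __typeof__; container_of() is a simplified stand-in for
the kernel's version, and 'struct demo_flow' and counted_fetch() are
hypothetical helpers that exist only to count how often the pointer
expression is evaluated:

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node {
    struct hlist_node *next, **pprev;
};

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

#define hlist_entry(ptr, type, member) container_of(ptr, type, member)

/* Copies 'ptr' into a temporary inside a statement expression, then tests
 * and converts the temporary, so 'ptr' is evaluated exactly once. */
#define hlist_entry_safe(ptr, type, member)                 \
    ({ __typeof__(ptr) ____ptr = (ptr);                     \
       ____ptr ? hlist_entry(____ptr, type, member) : NULL; \
    })

/* Hypothetical element type, for illustration only. */
struct demo_flow {
    int id;
    struct hlist_node node;
};

/* Counts evaluations of the pointer expression passed to the macro. */
static int fetch_count;

static struct hlist_node *
counted_fetch(struct hlist_node *p)
{
    fetch_count++;
    return p;
}
```

With a macro that tested 'ptr' and then used 'ptr' again, counted_fetch()
would run twice per lookup, and that second fetch is the window in which
an RCU-protected pointer can change.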

Many thanks to Eric Dumazet for spotting the root cause!

Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Li Zefan <lizefan@huawei.com>
Bug #17099

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agoImplement OpenFlow 1.3 queue stats duration feature.
Ben Pfaff [Wed, 17 Jul 2013 22:56:22 +0000 (15:56 -0700)]
Implement OpenFlow 1.3 queue stats duration feature.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofp-util: Fix port and queue stat counting for OpenFlow 1.3.
Ben Pfaff [Wed, 17 Jul 2013 22:53:26 +0000 (15:53 -0700)]
ofp-util: Fix port and queue stat counting for OpenFlow 1.3.

OpenFlow 1.0, 1.1, and 1.2 all have the same struct size for port and
queue stats.  OpenFlow 1.3 has larger structs, but the ofp-util code didn't
realize that.  This fixes the problem.

It appears that the only consequence of this problem would have been
printing the wrong count in ofp-print output.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Use non rcu hlist_del() flow table entry.
Pravin B Shelar [Thu, 25 Jul 2013 18:28:02 +0000 (11:28 -0700)]
datapath: Use non rcu hlist_del() flow table entry.

Flow table destruction is done in RCU callback context, so
there is no need to use the RCU variant of hlist_del().

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Use correct type while allocating flex array.
Pravin B Shelar [Thu, 25 Jul 2013 18:25:21 +0000 (11:25 -0700)]
datapath: Use correct type while allocating flex array.

The flex array is used to allocate hash buckets, which are of type struct
hlist_head, but we use `struct hlist_head *` to calculate the
array size.  Since hlist_head is the size of a pointer, it works fine.

This patch uses the correct type.
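
A small sketch of why the old size calculation happened to work.  The
struct layout matches the kernel's hlist_head, which holds a single
pointer; the two helper functions are hypothetical and only contrast the
two sizeof expressions:

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node;

/* Same layout as the kernel's hlist_head: a single pointer. */
struct hlist_head {
    struct hlist_node *first;
};

/* Old calculation: element size taken from the pointer type. */
size_t
bucket_bytes_old(size_t n_buckets)
{
    return n_buckets * sizeof(struct hlist_head *);
}

/* New calculation: element size taken from the element type itself. */
size_t
bucket_bytes_new(size_t n_buckets)
{
    return n_buckets * sizeof(struct hlist_head);
}
```

On common ABIs both functions return the same value, but only the second
stays correct if struct hlist_head ever grows beyond a single pointer.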

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years ago.gitignore: ignore temporary _debian packaging content
Mark Hamilton [Thu, 25 Jul 2013 19:41:55 +0000 (12:41 -0700)]
.gitignore: ignore temporary _debian packaging content

Signed-off-by: Mark Hamilton <mhamilton@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif: Make dpifs thread-safe, and document it.
Ben Pfaff [Thu, 25 Jul 2013 17:31:42 +0000 (10:31 -0700)]
dpif: Make dpifs thread-safe, and document it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agodpif-netdev: Make internally thread-safe by introducing a global mutex.
Ben Pfaff [Tue, 23 Jul 2013 23:56:26 +0000 (16:56 -0700)]
dpif-netdev: Make internally thread-safe by introducing a global mutex.

This can be improved later but it is the simple thing to do for now.

I marked a couple of races with XXX.  I don't have a really good solution
for these, but I hope to find one.  They may be harmless in practice.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>