sliver-openvswitch.git
11 years ago  Merge remote-tracking branch 'origin/ovs-dev' into bsd-port  bsd-port
Giuseppe Lettieri [Sat, 30 Jun 2012 12:10:38 +0000 (14:10 +0200)]
Merge remote-tracking branch 'origin/ovs-dev' into bsd-port

Conflicts:
acinclude.m4
lib/automake.mk

11 years ago  revert commit 217e648afe5
Giuseppe Lettieri [Sat, 30 Jun 2012 10:45:56 +0000 (12:45 +0200)]
revert commit 217e648afe5

prefer fix by Ed in upstream git master

11 years ago  ovs-vswitchd: Call mlockall() from the daemon, not the parent or monitor.
Ben Pfaff [Fri, 29 Jun 2012 16:22:59 +0000 (09:22 -0700)]
ovs-vswitchd: Call mlockall() from the daemon, not the parent or monitor.

mlockall(2) says:

       Memory  locks  are not inherited by a child created via fork(2) and are
       automatically removed  (unlocked)  during  an  execve(2)  or  when  the
       process terminates.

which means that --mlockall was ineffective in combination with --detach
or --monitor or both.  Both are used in the most common production
configuration of Open vSwitch, so this means that --mlockall has never been
effective in production.

Signed-off-by: Ben Pfaff <blp@nicira.com>
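
Note on the ordering described above: a memory lock belongs to the process that takes it, so it has to be taken after fork(2), in the process that keeps running. The following is a minimal sketch under that assumption. The helper name spawn_locked_child() is hypothetical and is not an Open vSwitch function; the child here exits immediately where a real daemon would enter its main loop.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a "daemon" child and locks memory in the child.
 * Returns the child's exit status, or -1 on fork/wait failure. */
int spawn_locked_child(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: lock here.  A lock taken by the parent before fork()
         * would not have been inherited by this process. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
            /* Typically a privilege or resource-limit problem; this
             * sketch just reports it and continues unlocked. */
            perror("mlockall");
        }
        /* ... the daemon's main loop would run here ... */
        _exit(0);
    }

    /* Parent: only waits for the child in this sketch. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_locked_child() from a parent that had itself called mlockall() earlier would still leave the child unlocked without the call inside the child branch.
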
11 years ago  Route-table implementation for (Free)BSD
Ed Maste [Fri, 29 Jun 2012 21:11:24 +0000 (21:11 +0000)]
Route-table implementation for (Free)BSD

This is a trivial implementation of the route-table functionality for
FreeBSD, as needed by ofproto/ofproto-dpif-sflow.c.  It has not yet
been extensively tested.

Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  vlandev: Move Linux #include under #ifdef __linux__
Ed Maste [Fri, 29 Jun 2012 21:13:54 +0000 (21:13 +0000)]
vlandev: Move Linux #include under #ifdef __linux__

Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  Debian packaging to remove system-id.conf
Arun Sharma [Fri, 29 Jun 2012 02:56:39 +0000 (19:56 -0700)]
Debian packaging to remove system-id.conf

Make "--purge" in the Debian packaging remove /etc/openvswitch/system-id.conf.

Signed-off-by: Arun Sharma <arun.sharma@calsoftinc.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  openflow: Move enums for "packet_out" and "flow_mod" to common header.
Ben Pfaff [Thu, 28 Jun 2012 05:48:01 +0000 (22:48 -0700)]
openflow: Move enums for "packet_out" and "flow_mod" to common header.

Reviewed-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofproto: Report nonexistent ports and queues as errors in queue stats.
Ben Pfaff [Mon, 11 Jun 2012 18:23:06 +0000 (11:23 -0700)]
ofproto: Report nonexistent ports and queues as errors in queue stats.

Until now, Open vSwitch has ignored missing ports and queues in most cases
in queue stats requests, simply returning an empty set of statistics.
It seems that it is better to report an error, so this commit does this.

Reported-by: Prabina Pattnaik <Prabina.Pattnaik@nechclst.in>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ovs-ofctl: Fix handling of unexpected replies in dump_stats_transaction().
Ben Pfaff [Mon, 11 Jun 2012 18:15:31 +0000 (11:15 -0700)]
ovs-ofctl: Fix handling of unexpected replies in dump_stats_transaction().

dump_stats_transaction() ignored errors and other non-stats replies to
its request and would continue to wait forever.  This fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  openflow: Add OpenFlow 1.2 to ofp_to_string__()
Simon Horman [Wed, 27 Jun 2012 08:19:52 +0000 (17:19 +0900)]
openflow: Add OpenFlow 1.2 to ofp_to_string__()

Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com fixed testsuite failure.]
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ovs-ctl: Add additional options to strace wrapper.
Ethan Jackson [Wed, 27 Jun 2012 20:41:17 +0000 (13:41 -0700)]
ovs-ctl: Add additional options to strace wrapper.

It's useful to know how long each system call took, and at what
time each system call happened.  In addition, this patch causes
strace to print strings more fully, allowing log messages to be seen
in the output.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  tests: Fix MockXenAPI to make the ovs-xapi-sync test case pass again.
Ben Pfaff [Wed, 27 Jun 2012 16:56:20 +0000 (09:56 -0700)]
tests: Fix MockXenAPI to make the ovs-xapi-sync test case pass again.

Commit 1dc6839d2d (xenserver: Improve efficiency of code by using
get_all_records_where()) updated the ovs-xapi-sync script and caused a unit
test failure.  This fixes it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  xenserver: Improve efficiency of code by using get_all_records_where()
Rob Hoes [Wed, 27 Jun 2012 15:14:21 +0000 (16:14 +0100)]
xenserver: Improve efficiency of code by using get_all_records_where()

Replace the get_record() calls for network references, which caused as
many slave-to-master calls as there are Network records plus one.
The get_all_records_where() call gets exactly what is needed with a single
call.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Dominic Curran <dominic.curran@citrix.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  lib/meta-flow: introduce a macro, CASE_MFF_REGS, to catch "case MFF_REG<N>:"
Isaku Yamahata [Wed, 27 Jun 2012 14:23:25 +0000 (07:23 -0700)]
lib/meta-flow: introduce a macro, CASE_MFF_REGS, to catch "case MFF_REG<N>:"

Introduce a macro instead for
With this macro, the code is a bit reduced.
test: compile-tested and unit tests passed.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
[blp@nicira.com moved the macro declaration, moved trailing colon from
 macro definition to invocation, adjusted style slightly]
Signed-off-by: Ben Pfaff <blp@nicira.com>
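
A sketch of the idea behind the macro, with the trailing colon left to the invocation as the bracketed note describes. The enum below is an illustrative subset, the count of four registers is an assumption, and mf_is_register() is a hypothetical user, not a function from lib/meta-flow.

```c
/* Illustrative subset of field IDs; the real enum is much larger. */
enum mf_field_id {
    MFF_TUN_ID,
    MFF_REG0,
    MFF_REG1,
    MFF_REG2,
    MFF_REG3,
    MFF_IN_PORT
};

/* One macro stands in for a run of case labels.  The final colon is left
 * to the invocation, so "CASE_MFF_REGS:" reads like an ordinary label. */
#define CASE_MFF_REGS \
    case MFF_REG0: case MFF_REG1: case MFF_REG2: case MFF_REG3

/* Hypothetical user of the macro: returns 1 for any register field. */
int mf_is_register(enum mf_field_id id)
{
    switch (id) {
    CASE_MFF_REGS:
        return 1;

    case MFF_TUN_ID:
    case MFF_IN_PORT:
    default:
        return 0;
    }
}
```

Every switch that treats all registers alike can then use the single label, and adding a register means touching only the macro.
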
11 years ago  nicira-ext: Fix typo in comment.
Ben Pfaff [Wed, 27 Jun 2012 14:12:04 +0000 (07:12 -0700)]
nicira-ext: Fix typo in comment.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  Add OXM_OF_METADATA field as a step toward OpenFlow 1.1 support.
Joe Stringer [Tue, 26 Jun 2012 13:09:44 +0000 (01:09 +1200)]
Add OXM_OF_METADATA field as a step toward OpenFlow 1.1 support.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  meta-flow: Accept NXM and OXM field names, support NXM and OXM for output.
Ben Pfaff [Tue, 26 Jun 2012 17:52:34 +0000 (10:52 -0700)]
meta-flow: Accept NXM and OXM field names, support NXM and OXM for output.

This commit makes actions that accept NXM header values also accept OXM
header values and accept OXM field names where previously only NXM field
names were accepted.

This makes it possible to add new OXM fields that don't have NXM header
values, e.g. the OXM "metadata" field.

Inspired by Joe Stringer's patch:
http://openvswitch.org/pipermail/dev/2012-June/018344.html

Reported-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  lib/meta-flow: use symbolic value instead of ~7
Isaku Yamahata [Wed, 27 Jun 2012 04:26:57 +0000 (13:26 +0900)]
lib/meta-flow: use symbolic value instead of ~7

mf_is_value_valid(): use a symbolic value instead of ~7 for the VLAN PCP.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
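
To illustrate why a symbolic value is clearer than ~7: the PCP occupies the top three bits of the 16-bit VLAN TCI, so the valid range of a PCP value follows from the mask and shift. This is a sketch only. The constant values below are assumptions made for this example, and vlan_pcp_is_valid() is a hypothetical stand-in for the check inside mf_is_value_valid().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: the PCP is the top 3 bits of the VLAN TCI. */
#define VLAN_PCP_MASK  0xe000
#define VLAN_PCP_SHIFT 13

/* Hypothetical stand-in for the check: a PCP value is valid only if it
 * has no bits set outside the 3-bit PCP range. */
bool vlan_pcp_is_valid(uint8_t pcp)
{
    /* (VLAN_PCP_MASK >> VLAN_PCP_SHIFT) is 7, so this is the same test as
     * !(pcp & ~7), but it states where the 7 comes from. */
    return !(pcp & ~(VLAN_PCP_MASK >> VLAN_PCP_SHIFT));
}
```
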
11 years ago  nicira-ext: Fix nx-action documentation
Joe Stringer [Wed, 27 Jun 2012 03:38:39 +0000 (15:38 +1200)]
nicira-ext: Fix nx-action documentation

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  SubmittingPatches: Correct mailing list to use for sending patches.
Ben Pfaff [Wed, 27 Jun 2012 01:38:20 +0000 (18:38 -0700)]
SubmittingPatches: Correct mailing list to use for sending patches.

Reported-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  Update dates and notes for 1.6.1 release.
Justin Pettit [Tue, 26 Jun 2012 04:44:56 +0000 (21:44 -0700)]
Update dates and notes for 1.6.1 release.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years ago  lib: Do not assume sig_atomic_t is int.
Ed Maste [Tue, 26 Jun 2012 14:43:54 +0000 (14:43 +0000)]
lib: Do not assume sig_atomic_t is int.

On FreeBSD sig_atomic_t is long, which causes the comparison in
fatal_signal_run to be true when no signal has been reported.

Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
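
A sketch of the kind of failure described above, assuming the code marks "no signal pending" with the sentinel SIG_ATOMIC_MAX. The function names are hypothetical and this is not the actual fatal_signal_run() code. If the stored value were copied into an int on a platform where sig_atomic_t is long, the sentinel would not survive the conversion and the "is a signal pending?" comparison would be true even with nothing pending.

```c
#include <signal.h>
#include <stdint.h>

/* SIG_ATOMIC_MAX serves as the "no signal pending" sentinel. */
static volatile sig_atomic_t stored_sig_nr = SIG_ATOMIC_MAX;

/* Would be called from a signal handler in a real program. */
void record_signal(int sig_nr)
{
    stored_sig_nr = sig_nr;
}

/* Returns the pending signal number and clears it, or -1 if none. */
int take_pending_signal(void)
{
    /* Keep the local in sig_atomic_t.  Writing "int sig_nr" here would
     * truncate the sentinel where sig_atomic_t is wider than int, making
     * the comparison below succeed when no signal was recorded. */
    sig_atomic_t sig_nr = stored_sig_nr;

    if (sig_nr == SIG_ATOMIC_MAX) {
        return -1;
    }
    stored_sig_nr = SIG_ATOMIC_MAX;
    return (int) sig_nr;
}
```
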
11 years ago  Setting miss_send_len on receiving NXT_SET_ASYNC_CONFIG message.
Mehak Mahajan [Tue, 26 Jun 2012 19:30:26 +0000 (12:30 -0700)]
Setting miss_send_len on receiving NXT_SET_ASYNC_CONFIG message.

For the service controllers to receive any asynchronous messages,
miss_send_len must be set to a non-zero value (refer to DESIGN).  On
receiving the NXT_SET_ASYNC_CONFIG message, miss_send_len is set
to the default value unless it was already set to a non-zero value by
an OFPT_SET_CONFIG message.

Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
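
The rule above can be sketched as follows. The struct, the handler names, and the default of 128 are assumptions made for this example and are not taken from the patch.

```c
/* Assumed default for this sketch. */
#define DEFAULT_MISS_SEND_LEN 128

/* Hypothetical per-connection state. */
struct controller_conn {
    int miss_send_len;          /* 0 means "not configured yet". */
};

/* OFPT_SET_CONFIG: the controller picks the value explicitly. */
void handle_set_config(struct controller_conn *conn, int miss_send_len)
{
    conn->miss_send_len = miss_send_len;
}

/* NXT_SET_ASYNC_CONFIG: fall back to the default only if nothing
 * non-zero was configured earlier. */
void handle_set_async_config(struct controller_conn *conn)
{
    if (conn->miss_send_len == 0) {
        conn->miss_send_len = DEFAULT_MISS_SEND_LEN;
    }
}
```
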
11 years ago  Merge branch 'ovs-dev', remote branch 'origin' into bsd-port
Giuseppe Lettieri [Tue, 26 Jun 2012 12:16:32 +0000 (14:16 +0200)]
Merge branch 'ovs-dev', remote branch 'origin' into bsd-port

11 years ago  ofproto-dpif-governor: Improve performance when most flows get set up.
Ben Pfaff [Mon, 25 Jun 2012 16:48:44 +0000 (09:48 -0700)]
ofproto-dpif-governor: Improve performance when most flows get set up.

The "flow setup governor" was introduced to avoid the cost of setting up
short flows when there are many of them.  It works very well for short
flows, in fact.  However, when the bulk of flows are short, but still long
enough to be set up by the governor, we end up with the worst of both
worlds: OVS processes the first 5 packets of every flow "by hand" and then
it still has to set up a flow.

This commit refines the flow setup governor so that, when most of the flows
that go through it actually get set up, it in turn starts setting up most
flows at the first packet.  When it does this, it continues to sample a
small fraction of the flows in the governor's usual manner, so that if the
behavior changes it can react to it.

This increases netperf TCP_CRR transactions per second by about 25% in my
test setup, without affecting "ovs-benchmark rate" performance.

(I found that to get relatively stable performance for TCP_CRR, regardless
of whether Open vSwitch or any kind of bridging was involved, I had to pin
the netperf processes on each side of the link to a single core.  I found
that my NIC's interrupts were already pinned.  Thanks to Luca Giraudo
<lgiraudo@nicira.com> for these hints.)

Bug #12080.
Reported-by: Gurucharan Shetty <gshetty@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofp-util: Avoid use-after-free in ofputil_encode_flow_mod().
Ben Pfaff [Sun, 24 Jun 2012 05:34:39 +0000 (22:34 -0700)]
ofp-util: Avoid use-after-free in ofputil_encode_flow_mod().

nx_put_match() can reallocate the ofpbuf's data so we need to reload the
pointer.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
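
The general pattern behind this fix, sketched with a hypothetical growable buffer rather than the real ofpbuf API: any pointer into the buffer taken before an append may dangle afterwards, because the append can reallocate. Remembering an offset and recomputing the pointer avoids that.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer (not the real ofpbuf). */
struct buf {
    uint8_t *data;
    size_t size;
    size_t allocated;
};

/* Appends 'n' uninitialized bytes and returns a pointer to them.
 * May move b->data, invalidating earlier pointers into the buffer. */
static void *buf_put_uninit(struct buf *b, size_t n)
{
    if (b->size + n > b->allocated) {
        size_t new_allocated = (b->size + n) * 2;
        uint8_t *p = realloc(b->data, new_allocated);
        if (!p) {
            abort();
        }
        b->data = p;
        b->allocated = new_allocated;
    }
    void *tail = b->data + b->size;
    b->size += n;
    return tail;
}

/* Appends a 2-byte length header followed by 'body'.  Returns the offset
 * of the header within the buffer. */
size_t encode_message(struct buf *b, const void *body, size_t body_len)
{
    uint16_t len = (uint16_t) body_len;
    size_t hdr_ofs = b->size;           /* Offsets survive reallocation. */
    uint8_t *hdr = buf_put_uninit(b, sizeof len);
    memset(hdr, 0, sizeof len);

    /* This append may reallocate, so 'hdr' must not be used after it... */
    memcpy(buf_put_uninit(b, body_len), body, body_len);

    /* ...until it has been reloaded from the (possibly new) base. */
    hdr = b->data + hdr_ofs;
    memcpy(hdr, &len, sizeof len);
    return hdr_ofs;
}
```

Starting from an empty buffer, the second append in encode_message() outgrows the first allocation, which is exactly the case where a stale header pointer would be dangerous.
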
11 years ago  dpif-linux: Zero 'stats' outputs of dpif_operate() ops on failure.
Ben Pfaff [Wed, 20 Jun 2012 17:55:41 +0000 (10:55 -0700)]
dpif-linux: Zero 'stats' outputs of dpif_operate() ops on failure.

When DPIF_OP_FLOW_PUT or DPIF_OP_FLOW_DEL operations failed, they left
their 'stats' outputs uninitialized.  For DPIF_OP_FLOW_DEL, this meant that
the caller would read indeterminate data:

Conditional jump or move depends on uninitialised value(s)
   at 0x805C1EB: subfacet_reset_dp_stats (ofproto-dpif.c:4410)
    by 0x80637D2: expire_batch (ofproto-dpif.c:3471)
    by 0x8066114: run (ofproto-dpif.c:3513)
    by 0x8059DF4: ofproto_run (ofproto.c:1035)
    by 0x8052E17: bridge_run (bridge.c:2005)
    by 0x8053F74: main (ovs-vswitchd.c:108)

It's unusual for a delete operation to fail.  The most common reason is an
administrator running "ovs-dpctl del-flows".

The only user of DPIF_OP_FLOW_PUT did not request stats, so this doesn't
fix an actual bug for that case.

Bug #11797.
Reported-by: James Schmidt <jschmidt@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
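
A sketch of the convention this commit establishes: an operation with an output parameter defines that output on every path, including failure. The struct and function are hypothetical, and backend_error stands in for the result of the real datapath call.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical statistics output. */
struct flow_stats {
    uint64_t n_packets;
    uint64_t n_bytes;
};

/* Hypothetical delete operation.  'stats' may be NULL if the caller does
 * not want statistics. */
int flow_del(int backend_error, struct flow_stats *stats)
{
    if (backend_error) {
        /* On failure, hand back zeros rather than leaving the caller to
         * read indeterminate memory. */
        if (stats) {
            memset(stats, 0, sizeof *stats);
        }
        return backend_error;
    }

    if (stats) {
        stats->n_packets = 10;  /* Placeholder values for the sketch. */
        stats->n_bytes = 640;
    }
    return 0;
}
```
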
11 years ago  ovs-bugtool: Avoid running ethtool on non-physical devices.
Gurucharan Shetty [Mon, 25 Jun 2012 21:14:39 +0000 (14:14 -0700)]
ovs-bugtool: Avoid running ethtool on non-physical devices.

There may be hundreds of OVS internal devices. In such a
situation, ovs-bugtool can take a very long time to complete
because multiple ethtool commands are run on each interface in
/sys/class/net. Once ovs-bugtool completes, most of the ethtool
command outputs are incomplete with "timeouts", as we only allow
30 seconds for CAP_NETWORK_STATUS.

With this patch, we only run ethtool on those interfaces
that have an associated "device" entry. All physical interfaces have
this entry in /sys/class/net/${interface_name}/. A virtual interface
can have this entry too, if it has an underlying virtual device.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years ago  ofproto-dpif: Place high priority on sending CCMs.
Ethan Jackson [Fri, 22 Jun 2012 00:57:30 +0000 (17:57 -0700)]
ofproto-dpif: Place high priority on sending CCMs.

It's very important to get CCMs out as quickly as possible to avoid
causing a fault when there is really no problem.  This patch sends
CCMs as part of port_run_fast() in an attempt to move in this
direction.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  bridge: Run fast when adding and deleting ports.
Ethan Jackson [Fri, 22 Jun 2012 01:14:33 +0000 (18:14 -0700)]
bridge: Run fast when adding and deleting ports.

Adding and deleting ports can be extremely expensive so it makes
sense to get important work done before and after doing it.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  Reapplying the dscp changes: No need to restart DB/OVS on changing dscp value.
Mehak Mahajan [Thu, 21 Jun 2012 19:22:42 +0000 (12:22 -0700)]
Reapplying the dscp changes: No need to restart DB/OVS on changing dscp value.

This patch reapplies the changes that were reverted by commit 59efa47
(Revert DSCP update changes.). It also addresses the problem introduced by
the original commits, cd8fca2 (jsonrpc: Correctly setting the dscp value
before reconnect.) and b2e18d (No need to restart DB / OVS on changing
dscp value.), that caused numerous unit test failures on some systems (as
diagnosed by valgrind).

With this change there is no need to restart the DB or OVS after configuring a
different dscp value for the manager or controller connection, respectively. On
detecting a change in the dscp value on the socket, the previous socket is
closed, a new socket is created, and the connection is re-established with the
newly configured dscp value.

Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
11 years ago  odp-util: Include <config.h> first.
Ben Pfaff [Thu, 21 Jun 2012 17:42:20 +0000 (10:42 -0700)]
odp-util: Include <config.h> first.

Otherwise _GNU_SOURCE doesn't get defined early enough and on some systems
LLONG_MIN is missing when odp-util.c tries to use it indirectly through
token-bucket.h.

Reported-by: Michael Hu <mhu@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofproto-dpif: Add coverage counters for facet revalidation.
Ben Pfaff [Thu, 21 Jun 2012 15:31:59 +0000 (08:31 -0700)]
ofproto-dpif: Add coverage counters for facet revalidation.

Revalidation can be very expensive, so it may be useful for performance
monitoring to keep track of how often it is necessary and for what reasons.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofproto-dpif: Remove unused coverage counters.
Ben Pfaff [Thu, 14 Jun 2012 21:44:36 +0000 (14:44 -0700)]
ofproto-dpif: Remove unused coverage counters.

Nothing ever increments these.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  netdev-linux: Break ethtool coverage counter into "get" and "set" versions.
Ben Pfaff [Thu, 14 Jun 2012 21:37:22 +0000 (14:37 -0700)]
netdev-linux: Break ethtool coverage counter into "get" and "set" versions.

Reads and writes have different performance implications so it's better to
separate them.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofproto: Periodically log a summary of flow table changes.
Ben Pfaff [Thu, 14 Jun 2012 21:16:55 +0000 (14:16 -0700)]
ofproto: Periodically log a summary of flow table changes.

We expect this to occasionally be useful for debugging.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  token-bucket: New library for generic rate-limiting.
Ben Pfaff [Thu, 31 May 2012 00:16:16 +0000 (17:16 -0700)]
token-bucket: New library for generic rate-limiting.

This commit converts two rate-limiters in the tree to use the library.
I intend to use the library elsewhere in the future.

Signed-off-by: Ben Pfaff <blp@nicira.com>
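
For readers unfamiliar with the technique, a token bucket can be sketched as below. This is a generic illustration, not the API of the new library: the struct, the function names, and the choice of milliseconds as the time unit are all assumptions. The caller passes the current time explicitly so the logic is easy to test.

```c
#include <stdbool.h>

/* Generic token bucket (hypothetical; not the library's interface). */
struct bucket {
    unsigned int rate;          /* Tokens added per millisecond. */
    unsigned int burst;         /* Maximum tokens the bucket can hold. */
    unsigned int tokens;        /* Tokens currently available. */
    long long int last_fill;    /* Time of the last refill, in ms. */
};

/* Initializes 'b' with a completely full bucket. */
void bucket_init(struct bucket *b, unsigned int rate, unsigned int burst,
                 long long int now)
{
    b->rate = rate;
    b->burst = burst;
    b->tokens = burst;
    b->last_fill = now;
}

/* Refills 'b' for the time elapsed since the last refill, then tries to
 * take 'n' tokens.  Returns true if the tokens were available. */
bool bucket_withdraw(struct bucket *b, unsigned int n, long long int now)
{
    if (now > b->last_fill) {
        unsigned long long int elapsed = now - b->last_fill;
        unsigned long long int room = b->burst - b->tokens;

        /* Cap at 'burst' without ever computing an overflowing product. */
        if (b->rate > 0 && elapsed > room / b->rate) {
            b->tokens = b->burst;
        } else {
            b->tokens += (unsigned int) (elapsed * b->rate);
        }
        b->last_fill = now;
    }

    if (b->tokens < n) {
        return false;
    }
    b->tokens -= n;
    return true;
}
```

A caller that is refused simply drops or defers the work and tries again later; the burst size bounds how much can pass after an idle period.
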
11 years ago  sat-math: Introduce macro version of SAT_MUL.
Ben Pfaff [Thu, 31 May 2012 00:05:34 +0000 (17:05 -0700)]
sat-math: Introduce macro version of SAT_MUL.

The macro version can be used in a constant expression, such as an
initializer for a variable with static lifetime.  (Otherwise, it's better
to use the function.)

Signed-off-by: Ben Pfaff <blp@nicira.com>
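
A sketch of the distinction drawn above, for unsigned int operands. The macro body here is one plausible definition written for this example and may differ from the one in sat-math.h; max_tokens, get_max_tokens(), and this sat_mul() are hypothetical.

```c
#include <limits.h>

/* Saturating multiply usable in constant expressions.  Unlike a function,
 * it evaluates its arguments more than once, so avoid side effects. */
#define SAT_MUL(X, Y)                                                   \
    ((Y) == 0 ? 0u                                                      \
     : (X) <= UINT_MAX / (Y) ? (unsigned int) (X) * (unsigned int) (Y)  \
     : UINT_MAX)

/* Function form, preferable when the operands are not constants. */
unsigned int sat_mul(unsigned int x, unsigned int y)
{
    return SAT_MUL(x, y);
}

/* A variable with static lifetime needs a constant initializer, which a
 * function call cannot provide but the macro can. */
static const unsigned int max_tokens = SAT_MUL(UINT_MAX, 2u);

unsigned int get_max_tokens(void)
{
    return max_tokens;
}
```
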
11 years ago  pinsched: Completely fill the token bucket at initialization.
Ben Pfaff [Wed, 30 May 2012 21:33:08 +0000 (14:33 -0700)]
pinsched: Completely fill the token bucket at initialization.

This code, which dates to August 2008, initially sets the packet-in
scheduler token buckets to 10% full, without any rationale.  I suspect
that this is just a typo for 100% full, which I think would be more
conventional, so this commit switches to that.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  Merge branch 'ovs-dev' of /usr/local/git/openvswitch into ovs-dev
Giuseppe Lettieri [Thu, 21 Jun 2012 09:24:15 +0000 (11:24 +0200)]
Merge branch 'ovs-dev' of /usr/local/git/openvswitch into ovs-dev

11 years ago  build: automake complains IntegrationGuide is missing
Isaku Yamahata [Thu, 21 Jun 2012 02:25:48 +0000 (11:25 +0900)]
build: automake complains IntegrationGuide is missing

Changeset 502c471406b32e5afcdea62fa8307f9856d05437 added IntegrationGuide
but did not add it to EXTRA_DIST, so automake complains.
This patch adds the file to EXTRA_DIST.

> make[3]: Leaving directory `/openvswitch/build/datapath'
> The distribution is missing the following files:
> IntegrationGuide
> make[2]: *** [dist-hook-git] Error 1
> make[2]: *** Waiting for unfinished jobs....
> make[2]: Leaving directory `/openvswitch/build'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/openvswitch/build'
> make: *** [all] Error 2

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  IntegrationGuide: A guide to help platform integrators.
Justin Pettit [Tue, 15 May 2012 17:47:42 +0000 (10:47 -0700)]
IntegrationGuide: A guide to help platform integrators.

Provide a guide to integrating OVS on new platforms.

Co-authored-by: Gurucharan Shetty <gshetty@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years ago  cfm: Minor whitespace cleanup.
Ethan Jackson [Wed, 20 Jun 2012 23:04:27 +0000 (16:04 -0700)]
cfm: Minor whitespace cleanup.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  cfm: Warn when delayed sending CCMs.
Ethan Jackson [Tue, 19 Jun 2012 20:24:43 +0000 (13:24 -0700)]
cfm: Warn when delayed sending CCMs.

We've recently seen problems where OVS can get delayed sending CCM
probes by several seconds.  This can cause tunnels to flap, and
generally wreak havoc.  It's easy to detect when this is happening,
so, at a minimum, a warning should be helpful to those debugging
problems.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  cfm: Log the start of new fault intervals.
Ethan Jackson [Tue, 19 Jun 2012 20:03:16 +0000 (13:03 -0700)]
cfm: Log the start of new fault intervals.

When debugging CFM, it's useful to know exactly when each fault
interval starts in relation to other CFM events.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  docs: Add references to the database schema documentation.
Ben Pfaff [Wed, 20 Jun 2012 22:13:38 +0000 (15:13 -0700)]
docs: Add references to the database schema documentation.

I field lots of questions about "where's the documentation?"  Perhaps this
will help.

The changes to ovs-vsctl(8) add a couple of references to
ovs-vswitchd.conf.db(5) but they also rephrase a couple of paragraphs in
what seems to me an easier to understand style.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  FAQ: Add additional entries.
Justin Pettit [Tue, 19 Jun 2012 23:44:54 +0000 (16:44 -0700)]
FAQ: Add additional entries.

Does some cleanup and adds entries that cover:

    - OVS isn't Linux-specific.
    - Point out PORTING guide.
    - Explanation of LTS releases.
    - Supported versions of OpenFlow.
    - Missing features from userspace datapath and upstream kernel
      module.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years ago  FAQ: Add FAQ entries from website.
Justin Pettit [Tue, 19 Jun 2012 01:03:52 +0000 (18:03 -0700)]
FAQ: Add FAQ entries from website.

The openvswitch.org web site has a FAQ.  This commit integrates those
entries into the FAQ file.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years ago  ovs-lib: Add time stamps to Valgrind log messages.
Ben Pfaff [Wed, 20 Jun 2012 17:29:49 +0000 (10:29 -0700)]
ovs-lib: Add time stamps to Valgrind log messages.

Sometimes it's easier to interpret Valgrind warnings when you can
correlate them with other events.

Suggested-by: James Schmidt <jschmidt@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofproto-dpif-governor: Wake up only when there is genuinely work to do.
Ben Pfaff [Wed, 20 Jun 2012 20:18:25 +0000 (13:18 -0700)]
ofproto-dpif-governor: Wake up only when there is genuinely work to do.

Until now, governor_wait() has awakened the poll loop whenever the
generation timer expires, to allow it to shrink the governor to the next
smaller size in governor_run().  However, if the governor is already the
smallest possible size, then governor_run() will not have anything to do
and will not restart the timer, which means that governor_wait() will again
immediately wake up the poll loop, and we end up using 100% CPU.

This is kind of hard to trigger because normally the client will destroy
a governor in such a case.  However, if there are too many subfacets, the
client will keep even a minimum-size governor, triggering the bug.

Bug #12106.
Reported-by: Alex Yip <alex@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  Revert DSCP update changes.
Ben Pfaff [Wed, 20 Jun 2012 16:31:42 +0000 (09:31 -0700)]
Revert DSCP update changes.

This reverts commit cd8fca2ba0a7d036da069a4484d501bdc7a6f611 (jsonrpc:
Correctly setting the dscp value before reconnect.) and commit
b2e18db292cd4962af3248f11e9f17e6eaf9c033 (No need to restart DB / OVS on
changing dscp value.), which on some systems causes numerous unit test
failures that valgrind diagnoses as:

Conditional jump or move depends on uninitialised value(s)
   at 0x805F63F: jsonrpc_session_set_dscp (jsonrpc.c:1061)
   by 0x804F45D: ovsdb_jsonrpc_server_set_remotes (jsonrpc-server.c:417)
   by 0x804B775: reconfigure_from_db (ovsdb-server.c:656)
   by 0x804C231: main (ovsdb-server.c:159)

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  jsonrpc: Correctly setting the dscp value before reconnect.
Mehak Mahajan [Wed, 20 Jun 2012 03:13:19 +0000 (20:13 -0700)]
jsonrpc: Correctly setting the dscp value before reconnect.

In commit b2e18d (No need to restart DB / OVS on changing dscp value.), the
dscp value was wrongly set after the reconnect.

Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
Reported-by: Ravi Kerur <rkerur@gmail.com>
11 years ago  datapath: Support for kernel 3.4.
Pravin B Shelar [Wed, 20 Jun 2012 01:04:27 +0000 (18:04 -0700)]
datapath: Support for kernel 3.4.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
11 years ago  datapath: Make 'struct work_struct' consistent with kernel definition.
Pravin B Shelar [Wed, 20 Jun 2012 00:22:54 +0000 (17:22 -0700)]
datapath: Make 'struct work_struct' consistent with kernel definition.

Starting with kernel 3.4, the net_device structure has a delayed_work in
net_device->pm_qos_req, and delayed_work needs the work_struct definition.
OVS has its own workq implementation, which redefines work_struct,
so we need to make it consistent with the work_struct defined
in the kernel's workqueue.h to get a correct net_device definition.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
11 years ago  lib: Minor const tweak in smap library.
Ethan Jackson [Tue, 19 Jun 2012 01:40:31 +0000 (18:40 -0700)]
lib: Minor const tweak in smap library.

The source argument of smap_clone() isn't modified, and thus can be
declared const.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  No need to restart DB / OVS on changing dscp value.
Mehak Mahajan [Thu, 7 Jun 2012 23:57:56 +0000 (16:57 -0700)]
No need to restart DB / OVS on changing dscp value.

With this change there is no need to restart the DB or OVS after configuring a
different dscp value for the manager or controller connection, respectively. On
detecting a change in the dscp value on the socket, the previous socket is
closed, a new socket is created, and the connection is re-established with the
newly configured dscp value.

Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
11 years ago  debian: Make DKMS automatically build for running kernel.
Ben Pfaff [Mon, 18 Jun 2012 16:33:23 +0000 (09:33 -0700)]
debian: Make DKMS automatically build for running kernel.

By default DKMS doesn't build on demand for each kernel booted or updated.
Adding AUTOINSTALL=yes gives it this behavior.  Based on a small sample of
Debian packages and how-to guides for Ubuntu, AUTOINSTALL=yes is what most
packages use and what users expect.

Fix-suggested-by: Kirill Kabardin
Reported-by: Ralf Heiringhoff <ralf@frosty-geek.net>
Reported-at: https://bugs.launchpad.net/bugs/962189
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  xenserver, rhel: Enable extra ovs-ctl options from init scripts.
Ben Pfaff [Fri, 15 Jun 2012 17:26:34 +0000 (10:26 -0700)]
xenserver, rhel: Enable extra ovs-ctl options from init scripts.

This is useful for passing wrapper script options and possibly for other
purposes.

Bug #11889.
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  debian: Enable passing extra options to ovs-ctl from init scripts.
Ben Pfaff [Fri, 15 Jun 2012 00:12:19 +0000 (17:12 -0700)]
debian: Enable passing extra options to ovs-ctl from init scripts.

This is useful for passing wrapper script options and possibly for other
purposes.

Bug #11889.
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ovs-ctl: Add support for running daemons under valgrind or strace.
Ben Pfaff [Fri, 15 Jun 2012 00:09:30 +0000 (17:09 -0700)]
ovs-ctl: Add support for running daemons under valgrind or strace.

This is occasionally useful for debugging.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ovs-ctl: Document --ovs-brcompatd-priority.
Ben Pfaff [Fri, 15 Jun 2012 00:07:24 +0000 (17:07 -0700)]
ovs-ctl: Document --ovs-brcompatd-priority.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  openflow: Add ofp11_group
Simon Horman [Wed, 6 Jun 2012 07:26:36 +0000 (16:26 +0900)]
openflow: Add ofp11_group

OFPG11_ANY may be used as the out_group for ofp11_flow_mod and
ofp11_flow_stats_request.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  lib: Utilize smaps in the idl.
Ethan Jackson [Tue, 22 May 2012 08:53:07 +0000 (01:53 -0700)]
lib: Utilize smaps in the idl.

String to string maps are used all over the Open vSwitch database.
Before this patch, they were implemented in the idl as parallel
string arrays.  This strategy has proven a bit cumbersome.  With
this patch, string to string maps are implemented using the smap
library.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  lib: New data structure - smap.
Ethan Jackson [Tue, 22 May 2012 10:47:36 +0000 (03:47 -0700)]
lib: New data structure - smap.

A smap is a string to string hash map.  It has a cleaner interface
than the shash, which was traditionally used for the same purpose.
This patch implements the data structure, and changes netdev and
its providers to use it.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  bridge: Simplify VLAN splinter memory management.
Ethan Jackson [Tue, 22 May 2012 23:16:08 +0000 (16:16 -0700)]
bridge: Simplify VLAN splinter memory management.

Before this patch, the VLAN splinter memory management operated on
blocks of memory instead of ovsrec_ports.  This strategy is
problematic in future patches when more than simply calling
'free()' needs to be done to destroy splinter ports.  This patch
solves the problem by keeping track of entire ovsrec_ports instead
of just the memory allocated to create them.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years ago  ofproto: Remove member whose value is always -1 from struct ofoperation.
Ben Pfaff [Thu, 14 Jun 2012 16:30:28 +0000 (09:30 -0700)]
ofproto: Remove member whose value is always -1 from struct ofoperation.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  tests: Add $(check_DATA) to check-valgrind dependencies.
Ben Pfaff [Wed, 13 Jun 2012 20:26:27 +0000 (13:26 -0700)]
tests: Add $(check_DATA) to check-valgrind dependencies.

Otherwise if you run "check-valgrind" in a tree where you've never run
"check", you get some test failures because some data files don't get
generated before the tests run.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  vlog: Avoid use-after-free in corner case.
Ben Pfaff [Tue, 12 Jun 2012 23:45:20 +0000 (16:45 -0700)]
vlog: Avoid use-after-free in corner case.

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofp-util: Improve return type of ofputil_decode_packet_in().
Ben Pfaff [Thu, 7 Jun 2012 03:13:49 +0000 (23:13 -0400)]
ofp-util: Improve return type of ofputil_decode_packet_in().

"enum ofperr" is clearer than "int".

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  openflow-1.0.h: Clarify meaning of nw_tos in struct ofp_action_nw_tos.
Ben Pfaff [Thu, 7 Jun 2012 03:08:24 +0000 (23:08 -0400)]
openflow-1.0.h: Clarify meaning of nw_tos in struct ofp_action_nw_tos.

So that I don't have to figure it out yet again.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ofp-util: Remove unused functions make_add_simple_flow(), make_del_flow().
Ben Pfaff [Thu, 31 May 2012 03:30:05 +0000 (20:30 -0700)]
ofp-util: Remove unused functions make_add_simple_flow(), make_del_flow().

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  Allow general masking of IPv6 addresses rather than just CIDR masks.
Ben Pfaff [Wed, 23 May 2012 05:49:31 +0000 (22:49 -0700)]
Allow general masking of IPv6 addresses rather than just CIDR masks.

OF1.2 and later make these fields fully maskable so we might as well also.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  Allow general masking of IPv4 addresses rather than just CIDR masks.
Ben Pfaff [Wed, 23 May 2012 05:06:03 +0000 (22:06 -0700)]
Allow general masking of IPv4 addresses rather than just CIDR masks.

OF1.1 and later make these fields fully maskable so we might as well also.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
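
To make the difference concrete: a CIDR mask is a run of 1-bits followed only by 0-bits, while a general mask may select any bits. The sketch below uses host-byte-order 32-bit values and hypothetical function names; it is not code from this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if 'mask' (host byte order) is a CIDR mask, that is, some
 * number of 1-bits followed only by 0-bits. */
bool ipv4_mask_is_cidr(uint32_t mask)
{
    uint32_t x = ~mask;         /* For a CIDR mask this is 2**n - 1... */
    return (x & (x + 1)) == 0;  /* ...so adding 1 shares no bits with it. */
}

/* Matches 'addr' against 'value' under an arbitrary mask: only the bits
 * set in 'mask' are compared. */
bool ipv4_masked_match(uint32_t addr, uint32_t value, uint32_t mask)
{
    return ((addr ^ value) & mask) == 0;
}
```

For example, the mask 255.0.0.255 (0xff0000ff) is not a CIDR mask, yet under it 10.0.1.5 matches the value 10.0.0.5 because only the first and last octets are compared.
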
11 years ago  ofp-util: Implement translation to and from OpenFlow 1.1 ofp_match.
Ben Pfaff [Sat, 9 Jun 2012 22:49:16 +0000 (15:49 -0700)]
ofp-util: Implement translation to and from OpenFlow 1.1 ofp_match.

This is another step toward OpenFlow 1.1 support.  The change does not
affect any outwardly visible OpenFlow behavior yet.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  packets: Add ETH_TYPE_MPLS and ETH_TYPE_MPLS_MCAST.
Ben Pfaff [Tue, 22 May 2012 07:15:25 +0000 (00:15 -0700)]
packets: Add ETH_TYPE_MPLS and ETH_TYPE_MPLS_MCAST.

We need these for OpenFlow 1.1 ofp_match support even if we don't support
MPLS matching (which we don't, yet).

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
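
For reference, the two Ethertypes in question are the IANA-assigned values for MPLS unicast and MPLS multicast. A minimal sketch follows; the helper eth_type_is_mpls() is hypothetical and only illustrates how the constants might be used.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_TYPE_MPLS       0x8847  /* MPLS unicast. */
#define ETH_TYPE_MPLS_MCAST 0x8848  /* MPLS multicast. */

/* Hypothetical helper: returns true if 'eth_type' (host byte order)
 * is one of the two MPLS Ethertypes. */
static bool
eth_type_is_mpls(uint16_t eth_type)
{
    return eth_type == ETH_TYPE_MPLS || eth_type == ETH_TYPE_MPLS_MCAST;
}
```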
11 years agopackets: Define IPPROTO_SCTP if not provided by <netinet/in.h>.
Ben Pfaff [Tue, 22 May 2012 04:34:46 +0000 (21:34 -0700)]
packets: Define IPPROTO_SCTP if not provided by <netinet/in.h>.

SUSv3 doesn't require IPPROTO_SCTP so some systems might not provide it.

IPPROTO_SCTP isn't used in the tree yet so this doesn't fix a real bug.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
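
The usual way to provide such a fallback is a guarded definition, sketched below. The value 132 is the IANA-assigned protocol number for SCTP; the exact placement of the definition in the tree is not shown here.

```c
#include <netinet/in.h>

/* Fall back to the IANA-assigned protocol number when the system
 * header does not define IPPROTO_SCTP. */
#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132
#endif
```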
11 years agoopenflow-1.0: Rename ofp_match to ofp10_match, OFPFW_* to OFPFW10_*.
Ben Pfaff [Tue, 22 May 2012 04:51:03 +0000 (21:51 -0700)]
openflow-1.0: Rename ofp_match to ofp10_match, OFPFW_* to OFPFW10_*.

This better fits our general policy of adding a version number suffix
to structures and constants whose values differ from one OpenFlow
version to the next.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoopenflow-1.1.h: Fix OFPFW11_* definitions.
Ben Pfaff [Tue, 22 May 2012 07:25:08 +0000 (00:25 -0700)]
openflow-1.1.h: Fix OFPFW11_* definitions.

OFPFW_DL_SRC and OFPFW_DL_DST don't exist in OpenFlow 1.1.  Replace them
by the correct enums.

Most of the change here is due to respacing since DL_VLAN_PCP is one
character wider than any previous name.

This doesn't fix a real bug because these constants didn't have any users
in the tree.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agodatapath: Check correct return value from skb_gso_segment()
Pravin B Shelar [Tue, 12 Jun 2012 18:19:16 +0000 (11:19 -0700)]
datapath: Check correct return value from skb_gso_segment()

Fix return check typo.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #11933

11 years agoFAQ: Mention high CPU usage as symptom of looping the network.
Ben Pfaff [Tue, 12 Jun 2012 18:19:36 +0000 (11:19 -0700)]
FAQ: Mention high CPU usage as symptom of looping the network.

Suggested-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoAdd a FAQ.
Ben Pfaff [Tue, 12 Jun 2012 16:40:11 +0000 (09:40 -0700)]
Add a FAQ.

I wrote most of this myself.  The answer to "I can't seem to use Open
vSwitch in a wireless network" is based on a response by Jesse Gross:
http://openvswitch.org/pipermail/discuss/2011-January/004707.html

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoofproto: Update comment.
Ben Pfaff [Wed, 30 May 2012 20:15:00 +0000 (13:15 -0700)]
ofproto: Update comment.

CC: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agonx-match: Add parsing and serialisation of OXM matches.
Simon Horman [Mon, 11 Jun 2012 16:56:12 +0000 (09:56 -0700)]
nx-match: Add parsing and serialisation of OXM matches.

This code, which leverages the existing NXM implementation,
adds parsing and serialisation of OXM matches. Test cases
have also been provided.

This patch only implements parsing and serialisation of OXM fields that
are already handled by NXM.

It should be noted that in OXM ports are 32 bits wide whereas in NXM they
are 16 bits wide. This has been handled as a special case, as all other
field widths are the same in both OXM and NXM.

This patch does not address differences in wildcarding between OXM and NXM.
It is planned that liberal wildcarding policy dictated by either OXM or
NXM will be implemented.

This patch also does not address any (subtle?) differences between
OXM and NXM treatment of specific fields. It is envisaged that this
can be handled by subsequent patches.

Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com adjusted style, added a comment, changed in_port special
 case, enabled NXM extensions to OXM]
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agodpif-netdev: Fix use-after-free in dpif_netdev_recv.
Ben Pfaff [Sat, 9 Jun 2012 15:55:29 +0000 (11:55 -0400)]
dpif-netdev: Fix use-after-free in dpif_netdev_recv.

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoopenflow: Use spaces for indentation
Simon Horman [Fri, 8 Jun 2012 07:03:34 +0000 (16:03 +0900)]
openflow: Use spaces for indentation

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agopackets: Rename compose_benign_packet().
Ethan Jackson [Fri, 8 Jun 2012 01:24:29 +0000 (18:24 -0700)]
packets: Rename compose_benign_packet().

The name compose_rarp() more clearly describes what it's doing now.

Requested-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agolacp: Print may_enable flag in appctl output.
Ethan Jackson [Thu, 7 Jun 2012 21:21:36 +0000 (14:21 -0700)]
lacp: Print may_enable flag in appctl output.

I would have found this helpful when debugging a problem recently.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agopackets: Use RARPs for learning packets.
Ethan Jackson [Thu, 7 Jun 2012 22:27:22 +0000 (15:27 -0700)]
packets: Use RARPs for learning packets.

Traditionally Open vSwitch had used 802.2 SNAP packets to update
upstream switch learning tables when necessary.  This approach had
advantages in that debugging information could be embedded in the
packet helping hapless admins figure out what's going on.  However,
since both qemu and VMware use RARP for this purpose, it seems
appropriate to fall in line with the de facto standard.

Requested-by: Ben Basler <bbasler@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agodatapath: Fix sparse warning on BUILD_BUG_ON_NOT_POWER_OF_2 definition.
Pravin B Shelar [Thu, 7 Jun 2012 22:20:27 +0000 (15:20 -0700)]
datapath: Fix sparse warning on BUILD_BUG_ON_NOT_POWER_OF_2 definition.

BUILD_BUG_ON_NOT_POWER_OF_2 could be defined in kernel.h or bug.h
depending on the kernel version.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
11 years agodatapath: Fix use-after-free bug in dp_notify.
Pravin B Shelar [Thu, 7 Jun 2012 22:18:17 +0000 (15:18 -0700)]
datapath: Fix use-after-free bug in dp_notify.

In the unregister case, dp_notify accesses the vport after detaching
it.  This patch fixes that.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
11 years agoofproto: Fix use after free in ofoperation_complete().
Ethan Jackson [Thu, 7 Jun 2012 20:05:41 +0000 (13:05 -0700)]
ofproto: Fix use after free in ofoperation_complete().

In one edge case, ofoperation_complete() destroys its rule without
telling its ofoperation that the rule is gone.  Later in the same
function, ofoperation_destroy() attempts to modify the rule, which
has already been destroyed.

Bug #11797.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto-dpif: Avoid calling eth_addr_is_reserved().
Ethan Jackson [Wed, 6 Jun 2012 22:06:15 +0000 (15:06 -0700)]
ofproto-dpif: Avoid calling eth_addr_is_reserved().

eth_addr_is_reserved() is a bit more expensive than it used to be,
so it makes sense to avoid calling it when convenient as an
optimization.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agopackets: Update the reserved protocols list.
Ethan Jackson [Wed, 6 Jun 2012 22:22:52 +0000 (15:22 -0700)]
packets: Update the reserved protocols list.

The protocols added in this patch should be considered "reserved"
and not forwarded when "forward-bpdu" is false, nor should they be
mirrored.

Bug #11755.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agopackets: Generalize reserved RSPAN protocols.
Ethan Jackson [Fri, 1 Jun 2012 21:33:41 +0000 (14:33 -0700)]
packets: Generalize reserved RSPAN protocols.

Open vSwitch refuses to mirror certain destination addresses in
addition to those classified by eth_addr_is_reserved().  Looking
through the uses of eth_addr_is_reserved(), one finds that no
callers should be using the additional addresses which mirroring
drops.  This patch folds the additional addresses dropped in the
mirroring code into the more general eth_addr_is_reserved()
function.

This patch also changes the implementation in a way that is
slightly less efficient, but much easier to read and extend in the
future.

Bug #11755.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
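
A table-driven check of the kind described above, slower than hand-rolled comparisons but easy to extend, might look like the sketch below. It is an assumption-laden illustration: the function name eth_addr_is_reserved_sketch() is hypothetical, and the table holds only the IEEE 802.1D bridge-filtered range 01:80:c2:00:00:00 through 01:80:c2:00:00:0f, not the actual list used in the tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One reserved address range: an address plus the bits that must match. */
struct reserved_range {
    uint8_t addr[6];
    uint8_t mask[6];
};

/* Illustrative table.  Adding a protocol means adding one entry. */
static const struct reserved_range reserved_ranges[] = {
    /* IEEE 802.1D bridge-filtered group addresses,
     * 01:80:c2:00:00:00 through 01:80:c2:00:00:0f. */
    { {0x01, 0x80, 0xc2, 0x00, 0x00, 0x00},
      {0xff, 0xff, 0xff, 0xff, 0xff, 0xf0} },
};

/* Hypothetical sketch: returns true if 'ea' falls in any table entry. */
static bool
eth_addr_is_reserved_sketch(const uint8_t ea[6])
{
    size_t n = sizeof reserved_ranges / sizeof reserved_ranges[0];

    for (size_t i = 0; i < n; i++) {
        const struct reserved_range *r = &reserved_ranges[i];
        bool match = true;

        for (int j = 0; j < 6; j++) {
            if ((ea[j] ^ r->addr[j]) & r->mask[j]) {
                match = false;
                break;
            }
        }
        if (match) {
            return true;
        }
    }
    return false;
}
```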
11 years agopackets: Fix eth_addr_equal_except().
Ethan Jackson [Thu, 7 Jun 2012 00:37:46 +0000 (17:37 -0700)]
packets: Fix eth_addr_equal_except().

It turns out that eth_addr_equal_except() computed the exact
opposite of what it purported to.  It returned true if the two
arguments were *not* equal.  This is extremely confusing, so this
patch changes it.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
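
The intended (corrected) sense of such a comparison can be sketched as below. This assumes that the mask selects the bits that must agree; the function name eth_addr_equal_masked() is hypothetical and the body is not copied from the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: returns true if 'a' and 'b' agree in every bit
 * that is set in 'mask', false otherwise.  The bug described above
 * amounted to returning the negation of this result. */
static bool
eth_addr_equal_masked(const uint8_t a[6], const uint8_t b[6],
                      const uint8_t mask[6])
{
    for (int i = 0; i < 6; i++) {
        if ((a[i] ^ b[i]) & mask[i]) {
            return false;       /* A compared bit differs. */
        }
    }
    return true;
}
```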
11 years agodpif-linux: Fix invalid format specifier.
Ethan Jackson [Wed, 6 Jun 2012 01:02:30 +0000 (18:02 -0700)]
dpif-linux: Fix invalid format specifier.

This fixes the following warning on my system. "format '%d' expects
argument of type 'int', but argument 5 has type 'long int'"

Signed-off-by: Ethan Jackson <ethan@nicira.com>
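
The general shape of this kind of fix is to make the conversion specifier agree with the argument type, as in the sketch below. The function format_count() is a hypothetical stand-in; the actual call site and arguments in dpif-linux are not reproduced here.

```c
#include <stdio.h>

/* Hypothetical stand-in for the affected call: formats a 'long int'.
 * Using "%d" here would trigger the warning quoted above; "%ld" matches
 * the argument type. */
static int
format_count(char *buf, size_t size, long int count)
{
    return snprintf(buf, size, "%ld", count);
}
```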
11 years agoovsdb-client: Fix bugs in man page
Bruce Davie [Wed, 6 Jun 2012 01:49:51 +0000 (18:49 -0700)]
ovsdb-client: Fix bugs in man page

In commit 53ffefe9 (ovsdb-client: Make "server" and "database"
arguments optional.), two errors were introduced.  "list-columns"
appeared twice in the list of commands; the first instance should be
"list-tables".  The "monitor" command now lists optional "column"
arguments.

Signed-off-by: Bruce Davie <bsd@nicira.com>
Signed-off-by: Bruce Davie <bdavie@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years agodpif-linux: Log details when a packet is lost.
Ben Pfaff [Fri, 1 Jun 2012 21:40:31 +0000 (17:40 -0400)]
dpif-linux: Log details when a packet is lost.

Until now, when a packet was dropped in the kernel-to-user buffers, we
logged the occurrence but nothing that would allow a person reading the
log after the fact to learn why it was dropped.  This commit adds details
that identify the major sources of packets in the buffer, which should
help.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agodpif-linux: Slightly refactor internal data structures.
Ben Pfaff [Wed, 23 May 2012 23:55:09 +0000 (16:55 -0700)]
dpif-linux: Slightly refactor internal data structures.

An initial attempt also replaced the 'uint32_t ready_mask' in struct
dpif_linux by a 'bool ready' in each struct dpif_channel, but I wasn't
happy with the result (the ready_mask bitmap works out really well) and so
I dropped that part.

Signed-off-by: Ben Pfaff <blp@nicira.com>
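
To illustrate why a bitmap is convenient for tracking ready channels, the sketch below pops the lowest-numbered ready channel from a 32-bit mask. The helper next_ready_channel() is hypothetical and only shows the idea; it is not the dpif-linux code.

```c
#include <stdint.h>

/* Hypothetical sketch: returns the index of the lowest set bit in
 * '*ready_mask' and clears it, or -1 if no channel is ready.  A single
 * comparison against zero tells the caller whether any work remains,
 * which a per-channel 'bool ready' cannot do without a scan. */
static int
next_ready_channel(uint32_t *ready_mask)
{
    if (!*ready_mask) {
        return -1;
    }

    int i = 0;
    while (!(*ready_mask & (UINT32_C(1) << i))) {
        i++;
    }
    *ready_mask &= ~(UINT32_C(1) << i);
    return i;
}
```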
11 years agodpif-linux: Avoid pessimal behavior when kernel-to-user buffers overflow.
Ben Pfaff [Wed, 23 May 2012 21:56:20 +0000 (14:56 -0700)]
dpif-linux: Avoid pessimal behavior when kernel-to-user buffers overflow.

When a kernel-to-user Netlink buffer overflows, the kernel reports
ENOBUFS without passing along an actual message.  When it does this,
we should immediately try again, because we know that there is a
message waiting, instead of reporting the error to the caller.

This improves the OVS response rate to "hping3 --flood" traffic by
a few percentage points in my testing.

Signed-off-by: Ben Pfaff <blp@nicira.com>
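
The retry behavior described above can be sketched as follows. Everything named here is an assumption for illustration: recv_fn is a hypothetical receive callback that returns 0 on success or a positive errno value, stub_recv() stands in for the Netlink socket, and the retry bound is added so that the sketch cannot loop forever.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical receive callback: returns 0 on success, otherwise a
 * positive errno value. */
typedef int recv_fn(void *aux, void *buf, size_t size);

/* Calls 'recv_one' and, when it reports ENOBUFS (buffer overflow with
 * no message delivered), retries immediately instead of returning the
 * error, up to 'max_retries' extra attempts. */
static int
recv_retry_enobufs(recv_fn *recv_one, void *aux, void *buf, size_t size,
                   int max_retries)
{
    int tries = 0;
    int error;

    do {
        error = recv_one(aux, buf, size);
    } while (error == ENOBUFS && tries++ < max_retries);
    return error;
}

/* Stub standing in for a socket: reports ENOBUFS '*aux' times, then
 * succeeds. */
static int
stub_recv(void *aux, void *buf, size_t size)
{
    int *overflows_left = aux;

    (void) buf;
    (void) size;
    if (*overflows_left > 0) {
        (*overflows_left)--;
        return ENOBUFS;
    }
    return 0;
}
```

With two pending overflow reports and a bound of five retries, the caller sees success on the third attempt and never observes ENOBUFS.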