sliver-openvswitch.git
14 years agodatapath: Add configure test for skb_warn_if_lro().
Jesse Gross [Thu, 6 May 2010 19:31:43 +0000 (12:31 -0700)]
datapath: Add configure test for skb_warn_if_lro().

Some distributions backport this function, so use a configure
test instead of a version check.

CC: Alexey I. Froloff <raorn@altlinux.org>
14 years agoofproto: Implement ofp_action_output "max_len" feature.
Ben Pfaff [Tue, 4 May 2010 19:29:39 +0000 (12:29 -0700)]
ofproto: Implement ofp_action_output "max_len" feature.

The "max_len" feature of ofp_action_output is completely unimplemented
currently.  Implement it.

Reported-by: Tetsuo NAKAGAWA <nakagawa@mxc.nes.nec.co.jp>
14 years agonetdev-linux: Optimize removing policing from an interface.
Ben Pfaff [Mon, 3 May 2010 22:38:31 +0000 (15:38 -0700)]
netdev-linux: Optimize removing policing from an interface.

It is very expensive to start a subprocess and, especially, to wait for it
to complete.  This replaces the most common subprocess operation in
netdev_linux_set_policing() by a Netlink socket operation, which is much
faster.

Without this and the other netdev-linux commits, my 1000-interface test
case runs in 1 min 48 s.  With them, it runs in 25 seconds.

14 years agonetdev-linux: Cache policing values.
Ben Pfaff [Mon, 3 May 2010 22:37:24 +0000 (15:37 -0700)]
netdev-linux: Cache policing values.

Without this and the following netdev-linux commits, my 1000-interface test
case runs in 1 min 48 s.  With them, it runs in 25 seconds.

14 years agonetdev-linux: Factor out removing policing.
Ben Pfaff [Mon, 3 May 2010 22:31:38 +0000 (15:31 -0700)]
netdev-linux: Factor out removing policing.

This is duplicated code that the following commit will rewrite.

14 years agonetdev-linux: Factor out obtaining an RTNL socket.
Ben Pfaff [Mon, 3 May 2010 21:31:16 +0000 (14:31 -0700)]
netdev-linux: Factor out obtaining an RTNL socket.

Another function needs this same functionality in an upcoming commit, so
factor this into a new function get_rtnl_sock().

14 years agodpif-linux: Use hash instead of sorted array.
Ben Pfaff [Mon, 3 May 2010 20:47:28 +0000 (13:47 -0700)]
dpif-linux: Use hash instead of sorted array.

With 1000 network devices being added or removed, sorting the array was a
profiling hot spot.  Using a hash makes it drop off the profile.

14 years agobridge: Optimize trunk port common case.
Ben Pfaff [Mon, 3 May 2010 18:47:56 +0000 (11:47 -0700)]
bridge: Optimize trunk port common case.

Profiling with qprof showed that bitmap_set_multiple() and bitmap_equal()
were eating up quite a bit of CPU time during bridge reconfiguration (up
to about 10% of total runtime).  This is completely avoidable in the common
case where a port trunks all VLANs, where we don't really need a bitmap at
all.  This commit implements that optimization.
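The shortcut described above can be sketched as follows. This is only an illustration, not the bridge code itself: `struct port_sketch`, `port_trunks_vlan()`, and the convention that a NULL bitmap means "trunk every VLAN" are assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define N_VLANS 4096
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define VLAN_WORDS ((N_VLANS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical port: 'trunks' is NULL when the port trunks all VLANs, so the
 * common case needs no bitmap allocation, filling, or comparison. */
struct port_sketch {
    const unsigned long *trunks;
};

static bool
bitmap_is_set_sketch(const unsigned long *bitmap, size_t bit)
{
    return (bitmap[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1;
}

/* Returns true if 'port' carries 'vlan'. */
static bool
port_trunks_vlan(const struct port_sketch *port, int vlan)
{
    assert(vlan >= 0 && vlan < N_VLANS);
    return !port->trunks || bitmap_is_set_sketch(port->trunks, vlan);
}
```

With this shape, two ports that both trunk everything compare equal by a pointer check alone.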

14 years agodynamic-string: Optimize ds_put_char().
Ben Pfaff [Mon, 3 May 2010 19:30:37 +0000 (12:30 -0700)]
dynamic-string: Optimize ds_put_char().

A qprof profile showed ds_put_char() and ds_put_uninit() as 4% of total
runtime.  This commit inlines the common case, which reduces them to 1%
total.
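A minimal sketch of the "inline the common case" idea follows. The `struct dstr` type and both function names are hypothetical stand-ins, not the real dynamic-string API; the point is only that the fast path stays inline while growth is pushed into a separate function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical dynamic string used only for this illustration. */
struct dstr {
    char *string;      /* Null-terminated contents, or NULL if empty. */
    size_t length;     /* Bytes in use, not counting the terminator. */
    size_t allocated;  /* Bytes available, not counting the terminator. */
};

/* Slow path: grow the buffer, then append.  Kept out of line. */
static void
dstr_put_char_slow(struct dstr *ds, char c)
{
    size_t new_allocated = ds->allocated ? 2 * ds->allocated : 8;
    char *p = realloc(ds->string, new_allocated + 1);

    assert(p != NULL);
    ds->string = p;
    ds->allocated = new_allocated;
    ds->string[ds->length++] = c;
    ds->string[ds->length] = '\0';
}

/* Fast path: when room is available, append without any function call. */
static inline void
dstr_put_char(struct dstr *ds, char c)
{
    if (ds->length < ds->allocated) {
        ds->string[ds->length++] = c;
        ds->string[ds->length] = '\0';
    } else {
        dstr_put_char_slow(ds, c);
    }
}
```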

14 years agobridge: Optimize port_lookup() using a hash.
Ben Pfaff [Mon, 3 May 2010 20:42:39 +0000 (13:42 -0700)]
bridge: Optimize port_lookup() using a hash.

Before this commit and the preceding one, with 1000 interfaces strcmp()
took 36% and port_lookup() took 8% of total runtime when reconfiguring
bridges.  With these two commits the percentage is reduced to 3% and 0%,
respectively.
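The effect of replacing a linear strcmp() scan with a hash can be sketched like this. The table layout, the bucket count, and the FNV-1a hash are choices made for the example and are not taken from the bridge code.

```c
#include <stddef.h>
#include <string.h>

#define N_BUCKETS 64            /* Arbitrary size chosen for the example. */

/* Hypothetical port record, chained within its hash bucket. */
struct port_entry {
    const char *name;
    struct port_entry *next;
};

struct port_table {
    struct port_entry *buckets[N_BUCKETS];
};

/* FNV-1a string hash; any reasonable string hash would do here. */
static unsigned int
hash_name(const char *name)
{
    unsigned int hash = 2166136261u;

    for (; *name; name++) {
        hash ^= (unsigned char) *name;
        hash *= 16777619u;
    }
    return hash;
}

static void
port_table_insert(struct port_table *table, struct port_entry *port)
{
    struct port_entry **bucket
        = &table->buckets[hash_name(port->name) % N_BUCKETS];

    port->next = *bucket;
    *bucket = port;
}

/* Only names that land in the same bucket are compared with strcmp(). */
static struct port_entry *
port_table_lookup(const struct port_table *table, const char *name)
{
    struct port_entry *port;

    for (port = table->buckets[hash_name(name) % N_BUCKETS]; port;
         port = port->next) {
        if (!strcmp(port->name, name)) {
            return port;
        }
    }
    return NULL;
}
```

With 1000 interfaces, a lookup then compares against a handful of names on average instead of up to 1000.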

14 years agobridge: Optimize iface_lookup() and port_lookup_iface() with a hash.
Ben Pfaff [Wed, 5 May 2010 21:00:47 +0000 (14:00 -0700)]
bridge: Optimize iface_lookup() and port_lookup_iface() with a hash.

Before this commit and the following one, with 1000 interfaces strcmp()
took 36% and port_lookup() took 8% of total runtime when reconfiguring
bridges.  With these two commits the percentage is reduced to 3% and 0%,
respectively.

14 years agoovs-vswitchd: Implement "exit" unixctl command.
Ben Pfaff [Mon, 3 May 2010 22:43:49 +0000 (15:43 -0700)]
ovs-vswitchd: Implement "exit" unixctl command.

This is useful for profiling, since common profilers do not print anything
until the process terminates, and only if the process terminates in the
ordinary way by calling exit().

14 years agosflow: Always add poller and sampler together.
Neil McKee [Wed, 5 May 2010 20:26:23 +0000 (13:26 -0700)]
sflow: Always add poller and sampler together.

The ofproto_sflow_add_poller() and ofproto_sflow_add_sampler() calls
should always be made together, either when a port is added
dynamically with ofproto_sflow_add_port() or when the sflow_agent is
first created in ofproto_sflow_set_options().  I was seeing odd
behavior where either the pollers or the samplers would never be
instantiated depending on the order that things happened.  (It's OK to
add the same sampler or poller again, because the library routines
sfl_agent_addPoller() and sfl_agent_addSampler() will just return the
existing one if it is there.  Perhaps we should add a comment to make
that clear?).

I changed the parameters to ofproto_sflow_add_sampler to make it work
the same way as ofproto_sflow_add_poller, where the options are
extracted from os->options within the function itself.

14 years agosflow: Properly fill in initial destination VLAN in sFlow output.
Neil McKee [Wed, 5 May 2010 20:24:27 +0000 (13:24 -0700)]
sflow: Properly fill in initial destination VLAN in sFlow output.

14 years agosflow: Include Ethernet FCS in frame_length to comply with sFlow spec.
Neil McKee [Wed, 5 May 2010 20:24:03 +0000 (13:24 -0700)]
sflow: Include Ethernet FCS in frame_length to comply with sFlow spec.

14 years agovswitchd: Fix documentation of "agent" column in "sFlow" table.
Ben Pfaff [Wed, 5 May 2010 17:52:46 +0000 (10:52 -0700)]
vswitchd: Fix documentation of "agent" column in "sFlow" table.

The documentation now accurately reflects the implementation.  The
implementation, however, leaves a great deal to be desired.

14 years agobridge: Fix double-free in sFlow configuration.
Ben Pfaff [Wed, 5 May 2010 17:50:38 +0000 (10:50 -0700)]
bridge: Fix double-free in sFlow configuration.

14 years agoovs-vsctl: Add sFlow to supported set of tables.
Ben Pfaff [Wed, 5 May 2010 17:38:24 +0000 (10:38 -0700)]
ovs-vsctl: Add sFlow to supported set of tables.

Somehow this one got left out accidentally.

Reported-by: Neil McKee <neil.mckee@inmon.com>
14 years agoxenserver: Make Open vSwitch disable itself in "bridge" mode.
Ben Pfaff [Tue, 4 May 2010 17:20:58 +0000 (10:20 -0700)]
xenserver: Make Open vSwitch disable itself in "bridge" mode.

When /etc/xensource/network.conf contains the word "bridge", the system
is supposed to use the Linux bridge, not Open vSwitch.  This commit makes
OVS honor this setting, which until now it has mainly ignored.

Reported-by: Reid Price <reid@nicira.com>
14 years agoxenserverd: Give XAPI a grace period before refreshing network UUIDs.
Ben Pfaff [Sat, 1 May 2010 21:27:53 +0000 (14:27 -0700)]
xenserverd: Give XAPI a grace period before refreshing network UUIDs.

XAPI updates the pool.conf file before it actually refreshes its database
from the new pool, so we need to wait for that to happen.  Hard-coded
delays aren't a good idea, but in the long term XAPI will probably be
adding a hook script for us to use, so this may be an OK stopgap measure.

Bug #2756.

14 years agodatapath: Don't hold dp_mutex when setting internal devs MTU.
Jesse Gross [Tue, 27 Apr 2010 01:08:54 +0000 (18:08 -0700)]
datapath: Don't hold dp_mutex when setting internal devs MTU.

We currently acquire dp_mutex when we are notified that the MTU
of a device attached to the datapath has changed so that we can
set the internal devices to the minimum MTU.  However, it is not
required to hold dp_mutex because we already have RTNL lock and it
causes a deadlock, so don't do it.

Specifically, the issue is that DP mutex is acquired twice: once in
dp_device_event() before calling set_internal_devs_mtu() and then
again in internal_dev_change_mtu() when it is actually being changed
(since the MTU can also be set directly).  Since it's not a recursive
mutex, deadlock.
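The double acquisition can be modeled in userspace with a toy lock. Everything below is hypothetical: `struct toy_mutex` reports a second acquisition by returning false, whereas a real non-recursive mutex would block forever at that point, and the function names only mirror the roles of dp_device_event() and internal_dev_change_mtu().

```c
#include <stdbool.h>

/* Toy non-recursive lock.  A real mutex would block instead of returning
 * false, which is what turns the double acquisition into a deadlock. */
struct toy_mutex {
    bool held;
};

static bool
toy_mutex_trylock(struct toy_mutex *mutex)
{
    if (mutex->held) {
        return false;
    }
    mutex->held = true;
    return true;
}

static void
toy_mutex_unlock(struct toy_mutex *mutex)
{
    mutex->held = false;
}

/* Stands in for the MTU-change handler: it takes the lock itself because it
 * can also be called directly. */
static bool
change_mtu_sketch(struct toy_mutex *dp_mutex, int *mtu, int new_mtu)
{
    if (!toy_mutex_trylock(dp_mutex)) {
        return false;           /* Second acquisition: would deadlock. */
    }
    *mtu = new_mtu;
    toy_mutex_unlock(dp_mutex);
    return true;
}

/* Old shape of the notifier: holds the lock across the call. */
static bool
device_event_locked(struct toy_mutex *dp_mutex, int *mtu, int new_mtu)
{
    bool ok;

    if (!toy_mutex_trylock(dp_mutex)) {
        return false;
    }
    ok = change_mtu_sketch(dp_mutex, mtu, new_mtu);
    toy_mutex_unlock(dp_mutex);
    return ok;
}

/* New shape: the notifier does not take the lock at all. */
static bool
device_event_unlocked(struct toy_mutex *dp_mutex, int *mtu, int new_mtu)
{
    return change_mtu_sketch(dp_mutex, mtu, new_mtu);
}
```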

14 years agoxenserver: Wire up emergency reset plug-in and call it on manager change
Justin Pettit [Fri, 30 Apr 2010 22:09:34 +0000 (15:09 -0700)]
xenserver: Wire up emergency reset plug-in and call it on manager change

Add code to "emergency_reset" plug-in method to actually do its work.
Also, when the manager is changed, call the emergency reset command to
clear out any configuration that may have been done by a previous
controller.

14 years agoovs-vsctl: Add emergency reset command
Justin Pettit [Fri, 30 Apr 2010 22:06:51 +0000 (15:06 -0700)]
ovs-vsctl: Add emergency reset command

Add the "emer-reset" command, which is used to clear the configuration of
items likely to have been configured by the manager.  This will leave
the core networking configuration as it was.

14 years agoovsdb-idl: Add "safe" iterator macro to generated code.
Ben Pfaff [Fri, 30 Apr 2010 21:16:25 +0000 (14:16 -0700)]
ovsdb-idl: Add "safe" iterator macro to generated code.

14 years agodatapath: Ensure packet length matches headers during checksum setup.
Jesse Gross [Mon, 26 Apr 2010 21:19:28 +0000 (14:19 -0700)]
datapath: Ensure packet length matches headers during checksum setup.

During the setup of checksumming pointers we need to make sure that
the transport headers are in the skb linear data area.  However, we
don't currently verify that the lengths in the packet headers are
within the size of the packet.  This commit makes that check explicitly,
before a BUG() check does it for us.
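The kind of check involved can be illustrated for an IPv4 header. This is a userspace sketch under stated assumptions, not the datapath code: `ipv4_lengths_ok()` is a hypothetical helper that works on a raw byte buffer rather than an skb.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HEADER_LEN 20

/* Returns true if the IPv4 header at 'data' fits in 'len' bytes and the
 * lengths it claims are consistent with the actual packet size. */
static bool
ipv4_lengths_ok(const uint8_t *data, size_t len)
{
    size_t header_len, total_len;

    if (len < IPV4_MIN_HEADER_LEN) {
        return false;
    }
    header_len = (size_t) (data[0] & 0x0f) * 4;      /* IHL, in bytes. */
    total_len = ((size_t) data[2] << 8) | data[3];   /* Network byte order. */
    return header_len >= IPV4_MIN_HEADER_LEN
           && header_len <= total_len
           && total_len <= len;
}
```

A packet that fails such a check can be dropped or left without checksum setup, instead of reaching an assertion later.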

CC: "Nick Couchman" <Nick.Couchman@seakr.com>
14 years agobridge: Immediately drop interfaces that can't be opened.
Jesse Gross [Thu, 29 Apr 2010 21:54:10 +0000 (14:54 -0700)]
bridge: Immediately drop interfaces that can't be opened.

Previously we would keep interfaces around that couldn't be opened
because they might be internal interfaces that are created later.
However, this leads to a race condition if the interface appears
after our attempt to open it has failed, since some later operations
may then succeed.  Instead, give up on the interface immediately if it can't
be opened and isn't internal (which we control and so won't have
this issue).

Bug #2737

14 years agovport: Better handle too-long network device names in vport_del().
Ben Pfaff [Tue, 27 Apr 2010 19:41:11 +0000 (12:41 -0700)]
vport: Better handle too-long network device names in vport_del().

The 'count' argument to strncpy_from_user() is supposed to include space
for the null terminator, so add it in.  Also, refuse names that have more
than IFNAMSIZ-1 characters outright, instead of truncating them.

14 years agodatapath: Check device name length more carefully in create_dp().
Ben Pfaff [Tue, 27 Apr 2010 17:45:28 +0000 (10:45 -0700)]
datapath: Check device name length more carefully in create_dp().

I don't see any value in silently truncating device names.  Doing so will
sow confusion in userspace.  This commit makes too-long device names
return ENAMETOOLONG.
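Rejecting rather than truncating can be sketched in userspace as below. `copy_devname()` and `IFNAMSIZ_SKETCH` are hypothetical names for the example; the value 16 matches the usual Linux IFNAMSIZ but is hard-coded here as an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ_SKETCH 16      /* Assumed buffer size, terminator included. */

/* Copies 'src' into 'dst' only if it fits with its null terminator.
 * Returns 0 on success or -ENAMETOOLONG, leaving 'dst' untouched. */
static int
copy_devname(char dst[IFNAMSIZ_SKETCH], const char *src)
{
    size_t len = strlen(src);

    if (len >= IFNAMSIZ_SKETCH) {
        return -ENAMETOOLONG;
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

The caller then sees an explicit error instead of a device silently created under a shortened name.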

14 years agodatapath: Always null-terminate network device name in create_dp().
Ben Pfaff [Tue, 27 Apr 2010 17:43:24 +0000 (10:43 -0700)]
datapath: Always null-terminate network device name in create_dp().

strncpy() does not null-terminate its output buffer if the source string's
length is at least as large as its 'count' argument.  We know that the
source and destination buffers are the same size and that the source buffer
is null-terminated, so just use strcpy().

This fixes a kernel BUG message that often occurred when strlen(devname)
was exactly IFNAMSIZ-1.  In such a case, if
internal_dev_port.devname[IFNAMSIZ-1] happened to be nonzero, it would
eventually fail the following check in alloc_netdev_mq():
BUG_ON(strlen(name) >= sizeof(dev->name));

Bug #2722.
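The strncpy() behavior at issue is easy to demonstrate in userspace. The helper below is hypothetical and exists only to show when the terminator is missing; it does not reproduce the create_dp() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if strncpy(dst, src, count) leaves a null terminator within
 * the first 'count' bytes of 'dst'. */
static bool
strncpy_terminates(const char *src, size_t count)
{
    char dst[64];

    assert(count <= sizeof dst);
    memset(dst, 'X', sizeof dst);       /* Simulate stale nonzero contents. */
    strncpy(dst, src, count);
    return memchr(dst, '\0', count) != NULL;
}
```

When the source is shorter than 'count', strncpy() pads with null bytes and the result is terminated. When the source length is at least 'count', exactly 'count' bytes are copied and no terminator is written, so whatever already followed in the buffer becomes part of the string.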

14 years agodatapath: Fix argument to strncpy_from_user().
Ben Pfaff [Tue, 27 Apr 2010 17:21:12 +0000 (10:21 -0700)]
datapath: Fix argument to strncpy_from_user().

The strncpy_from_user() function's 'count' argument is documented to
include the trailing null byte, but create_dp() did not include it.  This
commit adds it in.

14 years agoofproto: Avoid buffer copy in OFPT_PACKET_IN path.
Ben Pfaff [Tue, 27 Apr 2010 16:40:46 +0000 (09:40 -0700)]
ofproto: Avoid buffer copy in OFPT_PACKET_IN path.

When a dpif passes an odp_msg down to ofproto, and ofproto transforms it
into an ofp_packet_in to send to the controller, until now this always
involved a full copy of the packet inside ofproto.  This commit eliminates
this copy by ensuring that there is always enough headroom in the ofpbuf
that holds the odp_msg to replace it by an ofp_packet_in in-place.
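The in-place header swap relies on headroom in front of the data. A minimal sketch, assuming a hypothetical `struct buf_sketch` rather than the real ofpbuf, looks like this:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer: unused bytes between 'base' and 'data' are headroom. */
struct buf_sketch {
    uint8_t *base;   /* Start of the allocated memory. */
    uint8_t *data;   /* Start of the bytes in use. */
    size_t size;     /* Number of bytes in use. */
};

static size_t
buf_headroom(const struct buf_sketch *b)
{
    return (size_t) (b->data - b->base);
}

/* Drops 'old_len' bytes of header from the front and prepends 'new_len'
 * bytes from 'hdr'.  The payload is never moved.  Returns false, leaving
 * the buffer unchanged, if there is not enough headroom. */
static bool
replace_header(struct buf_sketch *b, size_t old_len,
               const void *hdr, size_t new_len)
{
    if (old_len > b->size || new_len > buf_headroom(b) + old_len) {
        return false;
    }
    b->data += old_len;         /* Pull the old header. */
    b->size -= old_len;
    b->data -= new_len;         /* Push room for the new header. */
    b->size += new_len;
    memcpy(b->data, hdr, new_len);
    return true;
}
```

The copy is avoided only if whoever allocates the buffer reserves enough headroom up front for the larger of the two headers.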

From Jean Tourrilhes <jt@hpl.hp.com>, with some revisions.

14 years agoxenserver: Use start_daemon for xenserverd also in /etc/init.d/openvswitch.
Ben Pfaff [Tue, 27 Apr 2010 16:37:06 +0000 (09:37 -0700)]
xenserver: Use start_daemon for xenserverd also in /etc/init.d/openvswitch.

Reported-by: Justin Pettit <jpettit@nicira.com>
14 years agoxenserver: Report correct daemon names at startup in /etc/init.d/openvswitch.
Ben Pfaff [Tue, 27 Apr 2010 16:36:30 +0000 (09:36 -0700)]
xenserver: Report correct daemon names at startup in /etc/init.d/openvswitch.

Reported-by: Justin Pettit <jpettit@nicira.com>
14 years agoxenserver: Use daemon-specific dir for pidfile in /etc/init.d/openvswitch.
Ben Pfaff [Tue, 27 Apr 2010 16:35:45 +0000 (09:35 -0700)]
xenserver: Use daemon-specific dir for pidfile in /etc/init.d/openvswitch.

Reported-by: Justin Pettit <jpettit@nicira.com>
14 years agoxenserver: Avoid using unset $nice variable in /etc/init.d/openvswitch.
Ben Pfaff [Tue, 27 Apr 2010 16:35:03 +0000 (09:35 -0700)]
xenserver: Avoid using unset $nice variable in /etc/init.d/openvswitch.

Reported-by: Justin Pettit <jpettit@nicira.com>
14 years agoxenserver: Fix typo in prompt
Justin Pettit [Mon, 26 Apr 2010 23:42:05 +0000 (16:42 -0700)]
xenserver: Fix typo in prompt

14 years agoxenserver: Factor redundancy out of /etc/init.d/openvswitch.
Ben Pfaff [Mon, 26 Apr 2010 21:18:33 +0000 (14:18 -0700)]
xenserver: Factor redundancy out of /etc/init.d/openvswitch.

We probably have too many configuration variables in any case, but at least
we can use just one shell function to deal with them.

14 years agoxenserver: Gracefully refresh network UUIDs on pool join or leave.
Ben Pfaff [Mon, 26 Apr 2010 21:18:32 +0000 (14:18 -0700)]
xenserver: Gracefully refresh network UUIDs on pool join or leave.

The vswitch database is supposed to maintain an up-to-date UUID for the
system's networks in the Bridge table as external-ids:network-uuids.  On
XenServer systems, /opt/xensource/libexec/interface-reconfigure updates
these fields as bridges are brought up and down.  Most of the time, that is
sufficient.  However, there is one exception: when a XenServer host enters
or leaves a pool, interface-reconfigure is not invoked, and neither is any
other script.  So this commit introduces a new, XenServer-specific daemon
that monitors the XenServer's pool membership status and refreshes the
network UUIDs (by invoking the refresh-network-uuids script) if it changes.

Bug #2097.

14 years agoofproto: Fix bad memory access sending large numbers of port stats replies.
Ben Pfaff [Mon, 26 Apr 2010 22:39:52 +0000 (15:39 -0700)]
ofproto: Fix bad memory access sending large numbers of port stats replies.

The append_stats_reply() function can modify its pointer argument, but
append_port_stat() was failing to propagate this change back to its own
caller.  So when append_stats_reply() did in fact modify it (which happens
when the 64 kB maximum OpenFlow message length was exceeded), the
handle_port_stats_request() function would then access freed memory on the
next call to append_port_stat().

Bug #2714.
Reported-by: Ram Jothikumar <rjothikumar@nicira.com>
Debugging help by Justin Pettit <jpettit@nicira.com>
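The underlying pattern is a callee that may replace the caller's pointer. The sketch below uses hypothetical types and names (`struct reply`, `append_value()`, `append_port_value()`) and a 4-entry limit in place of the 64 kB message limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define REPLY_MAX 4             /* Stand-in for the maximum message size. */

struct reply {
    size_t n;
    int values[REPLY_MAX];
};

/* Appends 'value'.  If '*replyp' is full, it is "sent" (freed here) and
 * replaced by a fresh reply, so '*replyp' may change. */
static void
append_value(struct reply **replyp, int value)
{
    struct reply *reply = *replyp;

    if (reply->n == REPLY_MAX) {
        free(reply);
        reply = calloc(1, sizeof *reply);
        assert(reply != NULL);
        *replyp = reply;
    }
    reply->values[reply->n++] = value;
}

/* Intermediate helper.  It must take 'struct reply **' too: if it took a
 * plain 'struct reply *', the replacement would be lost and the caller
 * would keep using the freed reply on the next call. */
static void
append_port_value(struct reply **replyp, int port)
{
    append_value(replyp, port * 10);
}
```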

14 years agoofpbuf: New function ofpbuf_push_zeros().
Ben Pfaff [Fri, 9 Apr 2010 19:36:16 +0000 (12:36 -0700)]
ofpbuf: New function ofpbuf_push_zeros().

14 years agovswitchd: Rename bridge_reconfigure_controller().
Ben Pfaff [Mon, 26 Apr 2010 21:25:27 +0000 (14:25 -0700)]
vswitchd: Rename bridge_reconfigure_controller().

Suggested-by: Justin Pettit <jpettit@nicira.com>
14 years agoxenserver: Fix sense of -d test in /etc/init.d/openvswitch.
Ben Pfaff [Mon, 26 Apr 2010 20:12:35 +0000 (13:12 -0700)]
xenserver: Fix sense of -d test in /etc/init.d/openvswitch.

It doesn't make sense to create a directory if it already exists.

14 years agoxenserver: Rewrite refresh-network-uuids script for decent performance.
Ben Pfaff [Wed, 21 Apr 2010 17:49:12 +0000 (10:49 -0700)]
xenserver: Rewrite refresh-network-uuids script for decent performance.

Calling "interface-reconfigure up" can take a couple of seconds, but all
we have to do here, really, is fetch the network UUIDs and invoke
ovs-vsctl, which is much faster.  So rewrite this script in Python and make
it do just that.

14 years agosocket-util: Move get_mtime() here from stream-ssl.
Ben Pfaff [Wed, 21 Apr 2010 17:47:45 +0000 (10:47 -0700)]
socket-util: Move get_mtime() here from stream-ssl.

An upcoming commit will add a new user for this function in another file,
so export it and move it to a common library file.

14 years agovswitchd: Enable in-band control to managers.
Ben Pfaff [Mon, 26 Apr 2010 17:48:31 +0000 (10:48 -0700)]
vswitchd: Enable in-band control to managers.

ovsdb-server must be able to connect to the OVSDB managers over in-band
control (because the manager may be what configures the OpenFlow
controllers).  This commit enables that.

14 years agosocket-util: Factor out new function inet_parse_active().
Ben Pfaff [Mon, 26 Apr 2010 17:43:55 +0000 (10:43 -0700)]
socket-util: Factor out new function inet_parse_active().

An upcoming commit needs to parse connection strings without connecting to
them, so this change enables that.

14 years agoofproto: Allow client to pass down extra (IP,port) tuples for in-band.
Ben Pfaff [Tue, 20 Apr 2010 23:36:01 +0000 (16:36 -0700)]
ofproto: Allow client to pass down extra (IP,port) tuples for in-band.

ovs-vswitchd needs to be able to tell ofproto where the OVSDB managers are,
so that in-band control can allow traffic to it even if there is no
connection to the controller yet.  This adds the basis for that feature.

14 years agoin-band: Generalize the in-band code to arbitrary (IP,port) pairs.
Ben Pfaff [Mon, 26 Apr 2010 17:16:45 +0000 (10:16 -0700)]
in-band: Generalize the in-band code to arbitrary (IP,port) pairs.

Until now the in-band code has taken an rconn (recently, multiple rconns)
and used its remote IP address as the one for which to set up flows.  But
we also need to support in-band control to the OVSDB manager, and OVSDB
does not use rconns.  This commit takes the first step toward this support
by generalizing the in-band code to take an arbitrary number of (IP,port)
pairs as remotes for which to set up flows.

14 years agogre: Ensure skb properties are consistently set.
Jesse Gross [Sat, 24 Apr 2010 01:54:47 +0000 (18:54 -0700)]
gre: Ensure skb properties are consistently set.

The skb local fragmentation and checksum offloading properties were
sometimes either overwritten or not copied by later operations.  This
ensures that they are consistently correct and solves issues like
TSO not working in certain circumstances on 2.6.18 kernels.

14 years agodatapath: Update 'struct ovs_skb_cb' comments.
Jesse Gross [Mon, 26 Apr 2010 16:52:36 +0000 (09:52 -0700)]
datapath: Update 'struct ovs_skb_cb' comments.

14 years agoin-band: Use NULL for null pointer constant, instead of 0.
Ben Pfaff [Tue, 20 Apr 2010 21:11:23 +0000 (14:11 -0700)]
in-band: Use NULL for null pointer constant, instead of 0.

Suggested-by: Justin Pettit <jpettit@nicira.com>
14 years agoin-band: Refactor in_band_set_remotes().
Ben Pfaff [Tue, 20 Apr 2010 21:10:32 +0000 (14:10 -0700)]
in-band: Refactor in_band_set_remotes().

Seems easier to understand this way.

Suggested-by: Justin Pettit <jpettit@nicira.com>
14 years agoin-band: Refactor slightly to be easier to understand.
Ben Pfaff [Tue, 20 Apr 2010 21:06:25 +0000 (14:06 -0700)]
in-band: Refactor slightly to be easier to understand.

Suggested-by: Justin Pettit <jpettit@nicira.com>
14 years agoin-band: Avoid magic number in refresh_remotes().
Ben Pfaff [Tue, 20 Apr 2010 20:58:24 +0000 (13:58 -0700)]
in-band: Avoid magic number in refresh_remotes().

The initial value of min_refresh() can only matter if there are no remotes,
in which case there is nothing to refresh anyhow.  So avoid the magic
number "10" as the initial minimum, since it has no real significance.

Suggested-by: Justin Pettit <jpettit@nicira.com>
14 years agoin-band: Better adapt to new rconn usage pattern.
Ben Pfaff [Tue, 20 Apr 2010 20:58:40 +0000 (13:58 -0700)]
in-band: Better adapt to new rconn usage pattern.

Previously, in-band control was always handed a single rconn whose remote
target changed as the selected controller was changed or added or removed.
Now, however, the rconns handed to in-band control never change their
remote IP targets (instead, new rconns are added and old ones are removed),
so there is no point in looking for changes in remote IP address.

Suggested-by: Justin Pettit <jpettit@nicira.com>
14 years agoHave git ignore new symlinks and dynamic files from datapath builds
Justin Pettit [Thu, 22 Apr 2010 05:39:20 +0000 (22:39 -0700)]
Have git ignore new symlinks and dynamic files from datapath builds

14 years agoveth: Do a better job cleaning up on rmmod
Justin Pettit [Wed, 21 Apr 2010 05:46:04 +0000 (22:46 -0700)]
veth: Do a better job cleaning up on rmmod

The veth driver doesn't clean itself up very well when removed.  This
commit destroys any outstanding veth devices and then unregisters its
sysfs entry.

14 years agodatapath: Define kmemdup() for kernels older than 2.6.19
Justin Pettit [Wed, 21 Apr 2010 05:42:35 +0000 (22:42 -0700)]
datapath: Define kmemdup() for kernels older than 2.6.19

The new GRE code requires the kmemdup function, but it's not available
on 2.6.18 kernels.  It has been backported to Xen, so only define it for
non-Xen kernels older than 2.6.19.

14 years agoxenserver: Clean-up space/tabs issues in vif script
Justin Pettit [Wed, 21 Apr 2010 10:53:22 +0000 (03:53 -0700)]
xenserver: Clean-up space/tabs issues in vif script

Our vif script had a mishmash of tab and space indentations.  The
original vif script only uses spaces, so I went with that style.

14 years agoxenserver: Set internal network-uuids in Bridge table on XS5.5
Justin Pettit [Wed, 21 Apr 2010 10:42:52 +0000 (03:42 -0700)]
xenserver: Set internal network-uuids in Bridge table on XS5.5

On XenServer 5.5, interface-reconfigure is not called when creating
internal bridges, so we jump through extra hoops to determine the
network UUIDs.  The code that handled this was not properly retrieving
the UUIDs from XAPI, so the field would never be set.  This commit
corrects that.

Bug #2666

14 years agoovs-ofctl: Document that "actions" must be last in flow specifications.
Ben Pfaff [Wed, 21 Apr 2010 20:29:03 +0000 (13:29 -0700)]
ovs-ofctl: Document that "actions" must be last in flow specifications.

Bug #2447.
Reported-by: Reid Price <reid@nicira.com>
14 years agogre: Fix ICMP translation for path MTU discovery.
Jesse Gross [Tue, 20 Apr 2010 20:51:59 +0000 (16:51 -0400)]
gre: Fix ICMP translation for path MTU discovery.

The translation of fragmentation-needed messages from outside the
tunnel to inside didn't quite make the transition from the old
GRE implementation to the new one intact.  This fixes a number of
minor bugs in the implementation.  The primary issues are with computing
the tunnel header length and comparing the input vs. output values
for tunnel parameters such as the key.

14 years agostream: Fix typo in comment.
Ben Pfaff [Tue, 20 Apr 2010 21:54:10 +0000 (14:54 -0700)]
stream: Fix typo in comment.

14 years agoin-band: Refresh both local and remote rules even if local rules change.
Ben Pfaff [Tue, 20 Apr 2010 20:41:53 +0000 (13:41 -0700)]
in-band: Refresh both local and remote rules even if local rules change.

This code should call refresh_remotes() even if refresh_local() returns
true.  That is, the normal C short-circuit evaluation of || is not desired
here.  So always call both.
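The difference between the two shapes can be shown with counters. The function names below are stand-ins for refresh_local() and refresh_remotes(), and both are made to report a change so that the short-circuit is visible.

```c
#include <stdbool.h>

static int local_calls, remote_calls;   /* Counters for the demonstration. */

static bool
refresh_local_sketch(void)
{
    local_calls++;
    return true;                /* Pretend the local rules changed. */
}

static bool
refresh_remotes_sketch(void)
{
    remote_calls++;
    return true;
}

/* Short-circuit shape: the remotes are skipped whenever the local refresh
 * returns true. */
static bool
refresh_short_circuit(void)
{
    return refresh_local_sketch() || refresh_remotes_sketch();
}

/* Shape that always calls both, then combines the results. */
static bool
refresh_both(void)
{
    bool local_changed = refresh_local_sketch();
    bool remotes_changed = refresh_remotes_sketch();

    return local_changed || remotes_changed;
}
```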

14 years agoin-band: Really reinstall in-band rules in in_band_flushed().
Ben Pfaff [Tue, 20 Apr 2010 20:39:35 +0000 (13:39 -0700)]
in-band: Really reinstall in-band rules in in_band_flushed().

14 years agoin-band: Remove comment that only a single controller is supported.
Ben Pfaff [Mon, 19 Apr 2010 23:27:56 +0000 (16:27 -0700)]
in-band: Remove comment that only a single controller is supported.

14 years agoovs-openflowd: Prefer --fail=standalone|secure over --fail=open|closed.
Ben Pfaff [Tue, 20 Apr 2010 17:48:15 +0000 (10:48 -0700)]
ovs-openflowd: Prefer --fail=standalone|secure over --fail=open|closed.

The "standalone" and "secure" terminology is less confusing.

This retains support for "open" and "closed" but does not document it.

14 years agoofproto: Add support for master/slave controller coordination.
Ben Pfaff [Tue, 20 Apr 2010 18:00:58 +0000 (11:00 -0700)]
ofproto: Add support for master/slave controller coordination.

Now that Open vSwitch has support for multiple simultaneous controllers,
there is some need for a degree of coordination among them.  For now, the
plan is for the controllers themselves to take the lead on this.  This
commit adds a small bit of OVS infrastructure: the ability for a controller
to designate itself as a "master" or a "slave".  There may be at most one
master at a time; when a controller designates itself as the master, then
any existing master is demoted to slave status.  Slave controllers are not
allowed to modify the flow table or global configuration; any attempt to
do so is rejected with a "bad request" error.

Feature #2495.
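The role rules described above can be sketched as follows. The types and function names are hypothetical, and the third role, `ROLE_OTHER` for a controller that has designated itself as neither master nor slave, is an assumption added for the example.

```c
#include <stdbool.h>
#include <stddef.h>

enum role_sketch {
    ROLE_OTHER,                 /* Assumed default: no role requested. */
    ROLE_MASTER,
    ROLE_SLAVE
};

struct controller_sketch {
    enum role_sketch role;
};

/* Sets the role of controller 'idx' among 'n' controllers.  Promoting one
 * controller to master demotes any existing master to slave, so there is
 * at most one master at a time. */
static void
set_role(struct controller_sketch *controllers, size_t n, size_t idx,
         enum role_sketch role)
{
    size_t i;

    if (role == ROLE_MASTER) {
        for (i = 0; i < n; i++) {
            if (controllers[i].role == ROLE_MASTER) {
                controllers[i].role = ROLE_SLAVE;
            }
        }
    }
    controllers[idx].role = role;
}

/* Slaves may not modify the flow table or global configuration. */
static bool
may_modify_flow_table(const struct controller_sketch *controller)
{
    return controller->role != ROLE_SLAVE;
}
```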

14 years agoAdd support for multiple OpenFlow controllers on a single bridge.
Ben Pfaff [Tue, 20 Apr 2010 17:43:42 +0000 (10:43 -0700)]
Add support for multiple OpenFlow controllers on a single bridge.

With this commit, Open vSwitch permits a bridge to have any number of
OpenFlow controllers.  When multiple controllers are configured, Open
vSwitch connects to all of them simultaneously.  Details of configuration
are in the vswitch schema documentation.

OpenFlow 1.0 does not specify how multiple controllers coordinate in
interacting with a single switch, so more than one controller should be
specified only if the controllers are themselves designed to coordinate
with each other.

An upcoming commit will provide a simple means for coordination between
multiple controllers.

Feature #2495.

14 years agoofproto: Bundle all controller-related settings into a struct.
Ben Pfaff [Tue, 20 Apr 2010 17:05:57 +0000 (10:05 -0700)]
ofproto: Bundle all controller-related settings into a struct.

Many ofproto settings are controller-related.  Upcoming commits will add
to ofproto the ability to support multiple controllers, so it is important
to be able to refer to controller settings as a group.  Hence, this commit
bundles them into a new "struct ofproto_controller".

14 years agoin-band: Support an arbitrary number of controllers.
Ben Pfaff [Fri, 16 Apr 2010 20:50:51 +0000 (13:50 -0700)]
in-band: Support an arbitrary number of controllers.

14 years agoin-band: Drop in-band flows when turning off in-band control.
Ben Pfaff [Tue, 6 Apr 2010 22:24:38 +0000 (15:24 -0700)]
in-band: Drop in-band flows when turning off in-band control.

Destroying the in-band control object didn't remove the flows related to
in-band control, so they could persist until another event caused the
flow table to be reset.  Changing or removing the controller is one such
event, which would probably happen at the same time as turning off in-band
control, so this is a rather minor flaw, but it still seems good to fix it.

14 years agoin-band: Fix memory leak in in_band_destroy().
Ben Pfaff [Tue, 6 Apr 2010 22:07:54 +0000 (15:07 -0700)]
in-band: Fix memory leak in in_band_destroy().

14 years agoin-band: Fix null pointer dereference.
Ben Pfaff [Tue, 6 Apr 2010 19:46:12 +0000 (12:46 -0700)]
in-band: Fix null pointer dereference.

Triggering this would require deleting the in-use datapath at just the
right time, but we still don't want it to happen.

14 years agoin-band: Fix memory leak in get_remote_mac().
Ben Pfaff [Tue, 6 Apr 2010 19:40:11 +0000 (12:40 -0700)]
in-band: Fix memory leak in get_remote_mac().

If the call to netdev_open_default() failed then next_hop_dev was not
freed, but it should be.

14 years agoin-band: Fix inconsistency in in-band code.
Ben Pfaff [Tue, 6 Apr 2010 00:12:43 +0000 (17:12 -0700)]
in-band: Fix inconsistency in in-band code.

The IBR_TO_LOCAL_ARP and IBR_FROM_LOCAL_ARP flows are dropped if there is
no local MAC.  I don't see why IBR_FROM_LOCAL_DHCP should be different, so
this commit adds it too.

14 years agoofproto: Make valgrind happy.
Ben Pfaff [Wed, 7 Apr 2010 22:27:35 +0000 (15:27 -0700)]
ofproto: Make valgrind happy.

The "flags" member of struct odp_flow is not used for adding or deleting
flows, but valgrind doesn't know that.  By zeroing it out we can suppress
spurious warnings.

14 years agofail-open: Fix typo in comment.
Ben Pfaff [Wed, 7 Apr 2010 19:57:21 +0000 (12:57 -0700)]
fail-open: Fix typo in comment.

14 years agoovs-openflowd: Remove documentation for obsolete --mgmt-id option.
Ben Pfaff [Wed, 7 Apr 2010 20:30:14 +0000 (13:30 -0700)]
ovs-openflowd: Remove documentation for obsolete --mgmt-id option.

Also remove unused OPT_MGMT_ID enum.

14 years agoovsdb-doc: Distinguish hyphens and minus signs in nroff output.
Ben Pfaff [Fri, 16 Apr 2010 17:31:05 +0000 (10:31 -0700)]
ovsdb-doc: Distinguish hyphens and minus signs in nroff output.

In nroff, a minus sign (\-) should generally be used in literal text,
whereas hyphens are generally correct elsewhere.  This roughly corresponds
to bold versus non-bold text, so this commit makes ovsdb-doc output "-"
that appears in input as a minus sign if it is bold or a hyphen if it is
not.
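The rule can be sketched as a small escaping function. The example is written in C for consistency with the rest of this log and is not the ovsdb-doc implementation; `nroff_escape_dashes()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Copies 'in' to 'out' (capacity 'out_size'), writing each '-' as the nroff
 * minus sign "\-" when 'bold' is true and as a plain hyphen otherwise.
 * Returns false if 'out' is too small. */
static bool
nroff_escape_dashes(const char *in, bool bold, char *out, size_t out_size)
{
    size_t n = 0;

    for (; *in; in++) {
        if (*in == '-' && bold) {
            if (n + 2 >= out_size) {
                return false;
            }
            out[n++] = '\\';
            out[n++] = '-';
        } else {
            if (n + 1 >= out_size) {
                return false;
            }
            out[n++] = *in;
        }
    }
    if (n >= out_size) {
        return false;
    }
    out[n] = '\0';
    return true;
}
```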

14 years agoCodingStyle: Drop advice about breaking lines before binary operators.
Ben Pfaff [Mon, 19 Apr 2010 21:37:56 +0000 (14:37 -0700)]
CodingStyle: Drop advice about breaking lines before binary operators.

I like the style that was prescribed here--I find it slightly easier to
read--but everyone else who submits code seems to prefer breaking
lines after binary operators instead.  No point in fighting the tide.

14 years agoDocument GRE port options.
Jesse Gross [Mon, 19 Apr 2010 20:35:30 +0000 (16:35 -0400)]
Document GRE port options.

14 years agoxenserver: Restore original InterfaceReconfigure*.py on uninstall.
Ben Pfaff [Mon, 19 Apr 2010 18:40:32 +0000 (11:40 -0700)]
xenserver: Restore original InterfaceReconfigure*.py on uninstall.

The %post script fragment in the RPM spec was moving aside the original
InterfaceReconfigure{,Bridge,Vswitch}.py scripts, but the %postun was
not restoring them.  This commit restores them in %postun.

Without this change, installing the openvswitch RPM on a stock XenServer
and then uninstalling it breaks XenServer networking.

Bug #2624.
Reported-by: Ram Jothikumar <rjothikumar@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
14 years agoFix broken build by adding forgotten header file to list of headers.
Ben Pfaff [Mon, 19 Apr 2010 18:26:07 +0000 (11:26 -0700)]
Fix broken build by adding forgotten header file to list of headers.

14 years agoUpdate fake bond devices' statistics with the sum of bond slaves' stats.
Ben Pfaff [Mon, 19 Apr 2010 18:12:27 +0000 (11:12 -0700)]
Update fake bond devices' statistics with the sum of bond slaves' stats.

Needed by XAPI to accurately report bond statistics.

Ugh.

Bug NIC-63.

14 years agotunneling: Remove old GRE implementation.
Jesse Gross [Thu, 8 Apr 2010 14:22:35 +0000 (10:22 -0400)]
tunneling: Remove old GRE implementation.

The new GRE implementation provides a complete drop-in replacement
for the old Linux-based implementation.  Therefore, remove the
old implementation and rename "grenew" to "gre".

14 years agotunneling: Add userspace support for new GRE implementation.
Jesse Gross [Sun, 11 Apr 2010 13:38:49 +0000 (09:38 -0400)]
tunneling: Add userspace support for new GRE implementation.

Add a netdev that supports the new datapath GRE implementation.
It currently coexists with the old implementation so it is named
"grenew".

14 years agonetdev: Allow get_ifindex and get_features to be null.
Jesse Gross [Sun, 11 Apr 2010 13:37:19 +0000 (09:37 -0400)]
netdev: Allow get_ifindex and get_features to be null.

Allow netdev providers to set get_ifindex and get_features to
null if they would always return EOPNOTSUPP.  This is particularly
useful for virtual devices.
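
A minimal sketch of how a generic wrapper might treat a null callback as
"not supported".  All names ending in _sketch, the trimmed-down structs,
and the negative-errno return convention are assumptions made for
illustration; they are not the actual netdev provider interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct netdev_sketch;

/* Hypothetical, trimmed-down provider class. */
struct netdev_class_sketch {
    const char *type;
    /* May be NULL if the provider would always return EOPNOTSUPP. */
    int (*get_ifindex)(const struct netdev_sketch *);
};

struct netdev_sketch {
    const struct netdev_class_sketch *netdev_class;
    int ifindex;
};

/* Generic wrapper: a null callback means "operation not supported".
 * Assumed convention: positive ifindex on success, negative errno on
 * failure. */
int
netdev_sketch_get_ifindex(const struct netdev_sketch *netdev)
{
    assert(netdev != NULL && netdev->netdev_class != NULL);
    return (netdev->netdev_class->get_ifindex
            ? netdev->netdev_class->get_ifindex(netdev)
            : -EOPNOTSUPP);
}

/* A provider backed by a real device implements the callback... */
static int
physical_get_ifindex(const struct netdev_sketch *netdev)
{
    return netdev->ifindex;
}

const struct netdev_class_sketch physical_class = {
    "system", physical_get_ifindex
};

/* ...while a virtual provider simply leaves it null instead of
 * supplying a stub that returns EOPNOTSUPP. */
const struct netdev_class_sketch virtual_class = { "virtual", NULL };
```

With this arrangement the null check lives in one place, so each virtual
provider avoids carrying its own do-nothing stub.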

14 years agonetdev-linux: Don't free a member of a struct.
Jesse Gross [Tue, 30 Mar 2010 22:40:01 +0000 (18:40 -0400)]
netdev-linux: Don't free a member of a struct.

We allocate struct netdev_linux, which contains struct netdev, but
free the netdev.  In practice this makes no difference because the
netdev is the first member of the struct, but we should be correct
anyway.
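
The pattern described above can be sketched as follows.  The struct
layouts and the *_sketch names are hypothetical stand-ins, not the real
netdev-linux definitions; the point is only that the pointer handed to
free() should be the one that malloc() returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generic device structure. */
struct netdev_sketch {
    const char *name;
};

/* Hypothetical stand-in for the Linux-specific container. */
struct netdev_linux_sketch {
    struct netdev_sketch netdev;  /* First member, so both addresses match. */
    int cached_fd;
};

/* Recovers the containing struct from a pointer to its 'netdev' member. */
struct netdev_linux_sketch *
netdev_linux_sketch_cast(struct netdev_sketch *netdev)
{
    return (struct netdev_linux_sketch *)
        ((char *) netdev - offsetof(struct netdev_linux_sketch, netdev));
}

/* Allocates the container but hands out a pointer to the embedded member. */
struct netdev_sketch *
netdev_linux_sketch_create(const char *name)
{
    struct netdev_linux_sketch *dev = malloc(sizeof *dev);
    if (!dev) {
        return NULL;
    }
    dev->netdev.name = name;
    dev->cached_fd = -1;
    return &dev->netdev;
}

void
netdev_linux_sketch_destroy(struct netdev_sketch *netdev)
{
    assert(netdev != NULL);
    /* Free the allocation itself, not the member.  This stays correct
     * even if 'netdev' is later moved away from offset 0. */
    free(netdev_linux_sketch_cast(netdev));
}
```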

14 years agonetdev-linux: Check notifications are for netdev-linux device.
Jesse Gross [Tue, 30 Mar 2010 22:39:20 +0000 (18:39 -0400)]
netdev-linux: Check notifications are for netdev-linux device.

When receiving a change notification from rtnetlink we checked whether
a netdev of that name existed and if so tried to handle it.  This also
checks that the type of the device is one handled by netdev-linux.

14 years agotunneling: Add datapath GRE support.
Jesse Gross [Sat, 17 Apr 2010 19:23:31 +0000 (15:23 -0400)]
tunneling: Add datapath GRE support.

Add a new vport type that implements GRE support inside of the
datapath instead of relying on Linux devices.  This provides
greater scalability, performance, and control.

The new GRE implementation supports nearly all features of the
Linux implementation.  It does not currently support multicast,
NBMA tunnels, or non-Ethernet devices.

This implementation of GRE has several important benefits over the
existing Linux implementation.  The first is simply that it is not a
Linux device.  Linux devices are fairly heavyweight both in terms
of memory consumption and interactions with the rest of the system
(notifications, processes polling, etc.).  There are many pieces of
code that make assumptions about the maximum reasonable number of
ports.  Simply maintaining the state of several thousand devices is
enough to fully occupy the CPU.

A tighter coupling between the GRE implementation and datapath
also allows more flexibility.  The key can be set and retrieved
from the flow table, which allows even greater scalability.
There will probably be additional use cases in the future.

14 years agodatapath: Add function to copy skb checksum bits.
Jesse Gross [Wed, 7 Apr 2010 17:55:28 +0000 (13:55 -0400)]
datapath: Add function to copy skb checksum bits.

Some kernels don't copy the checksum offload state in the skb
header when doing different types of copies.  Xen adds even more
fields, which are also not consistently copied.  The result is
uninitialized memory and random outcomes.  This adds a function to
consistently copy these bits across all kernel versions.

14 years agodatapath: Add skb_csum_help compatibility function.
Jesse Gross [Thu, 1 Apr 2010 21:34:18 +0000 (17:34 -0400)]
datapath: Add skb_csum_help compatibility function.

Later kernel versions remove the direction argument from
skb_checksum_help.  This provides a compatibility function so we
can have consistent syntax across versions.

Since CHECKSUM_PARTIAL is the same as CHECKSUM_HW on older kernels
this allows a unified code path for computing checksums.

14 years agodatapath: Genericize hash table.
Jesse Gross [Fri, 2 Apr 2010 20:46:18 +0000 (16:46 -0400)]
datapath: Genericize hash table.

Currently the flow hash table assumes that it is storing flows.
However, we will need additional types of hash tables in the
future so remove assumptions about flows and convert the datapath
to use the new table.

14 years agodpif-linux: Clean up vports that are no longer in config.
Jesse Gross [Sat, 10 Apr 2010 05:19:29 +0000 (01:19 -0400)]
dpif-linux: Clean up vports that are no longer in config.

If the config changes while ovs-vswitchd is not running it is possible
that there could be some vports which are no longer needed but won't
be destroyed when closed because they aren't open.  This deletes
unneeded vports at the same time that we clean up unneeded datapaths.

14 years agonetdev: Add function netdev_is_open().
Jesse Gross [Tue, 13 Apr 2010 17:57:40 +0000 (13:57 -0400)]
netdev: Add function netdev_is_open().

Add netdev_is_open(), which checks to see if a given netdev is
currently open.  It will be used to assist in cleaning up old ports
that are no longer in use.

14 years agodatapath: Add generic virtual port layer.
Jesse Gross [Mon, 12 Apr 2010 19:53:39 +0000 (15:53 -0400)]
datapath: Add generic virtual port layer.

Currently the datapath directly accesses devices through their
Linux functions.  Obviously this doesn't work for virtual devices
that are not backed by an actual Linux device.  This creates a
new virtual port layer which handles all interaction with devices.

The existing support for Linux devices was then implemented on top
of this layer as two device types.  It splits out and renames dp_dev
to internal_dev.  There were several places where datapath devices
had to be handled in a special manner, and this cleans that up by putting
all the special casing in a single location.
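
As a rough illustration of such a layer, the sketch below routes every
device interaction through a per-type table of function pointers.  The
struct members, the counters, and the type strings "netdev" and
"internal" are assumptions chosen for the example, not the datapath's
actual vport definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct vport_sketch;

/* Hypothetical per-type operations table. */
struct vport_ops_sketch {
    const char *type;
    int (*send)(struct vport_sketch *, size_t len);
};

struct vport_sketch {
    const struct vport_ops_sketch *ops;
    size_t to_device;   /* Bytes handed to a backing Linux device. */
    size_t to_stack;    /* Bytes delivered to the local network stack. */
};

/* Type backed by an ordinary Linux device. */
static int
netdev_send(struct vport_sketch *vport, size_t len)
{
    vport->to_device += len;
    return 0;
}

/* Type for the datapath's own internal devices; the special casing
 * for them lives only here. */
static int
internal_send(struct vport_sketch *vport, size_t len)
{
    vport->to_stack += len;
    return 0;
}

static const struct vport_ops_sketch netdev_ops = { "netdev", netdev_send };
static const struct vport_ops_sketch internal_ops = {
    "internal", internal_send
};

static const struct vport_ops_sketch *const vport_types[] = {
    &netdev_ops, &internal_ops
};

/* Binds 'vport' to the operations registered for 'type'. */
int
vport_sketch_init(struct vport_sketch *vport, const char *type)
{
    size_t i;

    for (i = 0; i < sizeof vport_types / sizeof vport_types[0]; i++) {
        if (!strcmp(vport_types[i]->type, type)) {
            vport->ops = vport_types[i];
            vport->to_device = vport->to_stack = 0;
            return 0;
        }
    }
    return -EINVAL;     /* Unknown port type. */
}

/* Callers never need to know which kind of device is behind the port. */
int
vport_sketch_send(struct vport_sketch *vport, size_t len)
{
    assert(vport != NULL && vport->ops != NULL);
    return vport->ops->send(vport, len);
}
```

Under this design, adding a port type that has no Linux device behind it
only requires another operations table.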

14 years agodatapath: Don't read net namespace on kernels that don't use them.
Jesse Gross [Mon, 12 Apr 2010 19:53:10 +0000 (15:53 -0400)]
datapath: Don't read net namespace on kernels that don't use them.

Use macros to eliminate the network namespace argument before it
gets to the compiler.  This allows us to specify a namespace on
kernels that know about them and prevent the compiler from complaining
on kernels that don't.
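
The macro technique can be sketched in plain userspace C.  Everything
here is hypothetical: SKETCH_HAVE_NET_NS stands in for whatever test a
real build would use, and lookup_dev() and init_net_sketch are invented
for the example.

```c
#include <assert.h>
#include <string.h>

/* Uncomment to model a kernel that knows about network namespaces. */
/* #define SKETCH_HAVE_NET_NS 1 */

#ifdef SKETCH_HAVE_NET_NS
struct net_sketch { int id; };
struct net_sketch init_net_sketch = { 0 };

/* Namespace-aware variant: takes the namespace as its first argument. */
int
lookup_dev(struct net_sketch *net, const char *name)
{
    (void) net;
    assert(name != NULL);
    return strcmp(name, "eth0") ? -1 : 2;
}
#else
/* Older variant: no namespace argument exists at all. */
int
lookup_dev(const char *name)
{
    assert(name != NULL);
    return strcmp(name, "eth0") ? -1 : 2;
}

/* Discard the first argument during preprocessing.  The compiler never
 * sees the namespace expression, so it need not even be declared. */
#define lookup_dev(net, name) lookup_dev(name)
#endif

/* The caller is written once, in the namespace-aware style. */
int
find_port(const char *name)
{
    return lookup_dev(&init_net_sketch, name);
}
```

Because the argument is removed before compilation, the same call site
builds in both configurations without any #ifdef at the caller.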

14 years agodatapath: Add rtnl_is_locked compatibility function.
Jesse Gross [Sun, 11 Apr 2010 13:52:40 +0000 (09:52 -0400)]
datapath: Add rtnl_is_locked compatibility function.

rtnl_is_locked wasn't added until 2.6.26 so provide an implementation
of it.

14 years agodatapath: Add dev_get_stats compatibility function.
Jesse Gross [Tue, 6 Apr 2010 02:38:31 +0000 (22:38 -0400)]
datapath: Add dev_get_stats compatibility function.

The dev_get_stats function wasn't added until 2.6.29 so provide
a replacement for it.