sliver-openvswitch.git
10 years agodpif-netdev: Unwildcard entire odp_port in dpif_netdev_mask_from_nlattrs().
Ben Pfaff [Sat, 5 Apr 2014 17:27:05 +0000 (10:27 -0700)]
dpif-netdev: Unwildcard entire odp_port in dpif_netdev_mask_from_nlattrs().

One case in the dpif_netdev_mask_from_nlattrs() function accidentally
unwildcarded only a 16-bit subset of the mask's odp_port.  On little-endian
machines this subset was the lower bits, which happened to work out OK,
but on big-endian machines this subset was the upper bits, which doesn't
work and causes a test failure.  (The problem was actually visible in the
test expected results on little-endian machines, but we had not noticed.)

This commit unwildcards the whole field, fixing the problem, and updates
the test expected results to match.
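
The byte-order dependence described above can be reproduced with a
minimal sketch.  The union below is a hypothetical stand-in, not the OVS
definition: it only assumes that a 16-bit view shares storage with the
32-bit odp_port.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a port field that can be viewed at two widths.
 * The member names echo the commit message; this is not the OVS type. */
union port_field {
    uint32_t odp_port;  /* full 32-bit datapath port number */
    uint16_t ofp_port;  /* 16-bit view sharing the same storage */
};

/* Sets only 16 of the 32 mask bits, as in the bug described above. */
static uint32_t
partial_port_mask(void)
{
    union port_field mask;

    memset(&mask, 0, sizeof mask);
    mask.ofp_port = UINT16_MAX;
    return mask.odp_port;
}

/* Sets all 32 mask bits, so the result does not depend on byte order. */
static uint32_t
full_port_mask(void)
{
    union port_field mask;

    memset(&mask, 0, sizeof mask);
    mask.odp_port = UINT32_MAX;
    return mask.odp_port;
}
```

On a little-endian machine partial_port_mask() yields 0x0000ffff and on
a big-endian machine 0xffff0000, while full_port_mask() is 0xffffffff on
both.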

This fixes the failure of test 732 seen here:
https://buildd.debian.org/status/fetch.php?pkg=openvswitch&arch=sparc&ver=2.1.0%2Bgit20140325-1&stamp=1396438624

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoovsdb: Remove SPECS in favor of referring to RFC 7047.
Ben Pfaff [Fri, 4 Apr 2014 16:43:54 +0000 (09:43 -0700)]
ovsdb: Remove SPECS in favor of referring to RFC 7047.

Also, add some clarifications relative to RFC 7047 to ovsdb-server(1).

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoflow: Rearrange struct flow for better megaflows.
Ethan Jackson [Fri, 4 Apr 2014 00:31:03 +0000 (17:31 -0700)]
flow: Rearrange struct flow for better megaflows.

Since the dp_hash will often be a hash of the 5-tuple, it makes sense
to put it with the L4 header so it hits in the last classifier lookup
stage.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agotest-reconnect: Fix a warning.
Alex Wang [Fri, 4 Apr 2014 16:58:28 +0000 (09:58 -0700)]
test-reconnect: Fix a warning.

This commit fixes the "return discards 'const' qualifier
from pointer target type" warning issued when compiling
test-reconnect.c.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofpbuf: fix struct comment
Lorand Jakab [Fri, 4 Apr 2014 08:09:52 +0000 (11:09 +0300)]
ofpbuf: fix struct comment

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoAvoid static declarations of arrays with unknown size.
Gurucharan Shetty [Thu, 3 Apr 2014 23:07:43 +0000 (16:07 -0700)]
Avoid static declarations of arrays with unknown size.

Visual Studio does not like it.

This commit is similar to commit 3815d6c2c
(Avoid designated initializers and static decls of arrays
of unknown size.) but touches more files.
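
As an illustration only (the array and helper names are hypothetical,
and the actual changes in the touched files may look different), one
portable way to avoid a static forward declaration of an array with
unknown size is to define the array before its first use:

```c
#include <assert.h>
#include <stddef.h>

/* The pattern the commit title refers to looks roughly like this:
 *
 *     static const int primes[];          // size unknown at this point
 *     ...
 *     static const int primes[] = { 2, 3, 5 };
 *
 * Defining the array, with its initializer, ahead of the code that uses
 * it makes the forward declaration unnecessary. */
static const int primes[] = { 2, 3, 5 };

/* Number of elements, computed from the complete array type. */
static size_t
n_primes(void)
{
    return sizeof primes / sizeof primes[0];
}

/* Sums the elements, to show the array is fully usable here. */
static int
sum_primes(void)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < n_primes(); i++) {
        sum += primes[i];
    }
    return sum;
}
```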

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agojsonrpc: Return received JSON-RPC messages immediately in jsonrpc_recv().
Ben Pfaff [Thu, 3 Apr 2014 22:27:18 +0000 (15:27 -0700)]
jsonrpc: Return received JSON-RPC messages immediately in jsonrpc_recv().

Until now, jsonrpc_recv() used separate iterations of its loop to receive
data, feed it to the JSON-RPC parser, and return the received message.
This is unnecessarily complicated and can occasionally mean that the
jsonrpc object has received and parsed but not returned a message.  This
commit refactors the code to receive data, feed it to the parser, and
return the received message in a single iteration, and simplifies the code
in the process.
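
The single-iteration shape can be sketched with a toy receiver.  This is
not the jsonrpc code: the struct, the newline-delimited "messages", and
the function name are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy receiver: 'input' plays the role of the byte stream and a newline
 * ends a message.  All names here are hypothetical. */
struct line_rx {
    const char *input;  /* pending bytes, NUL-terminated */
    size_t pos;         /* next byte to receive */
    char msg[32];       /* most recently completed message */
    size_t len;         /* bytes of the message parsed so far */
};

/* Receives a byte, feeds it to the "parser", and returns as soon as a
 * message completes, all within one loop iteration.  Returns false when
 * the input is exhausted without completing a message. */
static bool
line_rx_recv(struct line_rx *rx)
{
    while (rx->input[rx->pos] != '\0') {
        char c = rx->input[rx->pos++];

        if (c == '\n') {
            rx->msg[rx->len] = '\0';
            rx->len = 0;
            return true;
        }
        if (rx->len < sizeof rx->msg - 1) {
            rx->msg[rx->len++] = c;
        }
    }
    return false;
}
```

Because the message is returned in the same iteration that completes
it, the receiver never holds a parsed but unreturned message.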

Reported-by: Chris Hydon <chydon@aristanetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Support OF version-specific table-miss behaviours
Simon Horman [Wed, 2 Apr 2014 09:04:27 +0000 (18:04 +0900)]
ofproto: Support OF version-specific table-miss behaviours

OpenFlow 1.1 and 1.2 specify that if a table-miss occurs then the default
behaviour is to forward the packet to the controller using a packet-in
message. Until this patch, this is the default behaviour that Open
vSwitch uses for all OpenFlow versions.

OpenFlow 1.3+ specifies that if a table-miss occurs then the default
behaviour is simply to drop the packet. This patch implements this
behaviour using the following logic:

If a table-miss occurs and the table-miss behaviour for the table
has not been set using a table_mod (in which case it is no longer
the default setting) then:

* Install a facet in the datapath with a drop action
  if there are no pre-OF1.3 controllers connected which would receive
  a packet_in message.

  Note that this covers both the case where there are only OF1.3+
  controllers and the case where there are no controllers at all.

* Otherwise send a packet_in message to all pre-OF1.3 controllers.

  This covers both the case where there are only pre-OF1.3
  controllers and the case where there are both pre-OF1.3 and OF1.3+
  controllers.
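
The default decision described in the two bullets above can be sketched
as a small function.  The enum and function names are hypothetical and
the sketch covers only the case where no table_mod has been applied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; this is not the ofproto interface. */
enum miss_action {
    MISS_DROP,            /* install a drop facet in the datapath */
    MISS_SEND_PACKET_IN   /* send packet_in to pre-OF1.3 controllers */
};

/* Default table-miss behaviour when no table_mod has changed it:
 * drop unless at least one pre-OF1.3 controller would receive the
 * packet_in message. */
static enum miss_action
default_table_miss_action(size_t n_pre_of13_controllers)
{
    return n_pre_of13_controllers > 0 ? MISS_SEND_PACKET_IN : MISS_DROP;
}
```

The number of OF1.3+ controllers does not appear as a parameter because,
per the description above, it does not affect the default outcome.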

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoMerge branch 'master' of ssh://git.onelab.eu/git/sliver-openvswitch
Giuseppe Lettieri [Tue, 1 Apr 2014 15:46:48 +0000 (17:46 +0200)]
Merge branch 'master' of ssh://git.onelab.eu/git/sliver-openvswitch

10 years agorhel: Add Patch Port support to initscripts
Jason Kölker [Mon, 31 Mar 2014 23:34:14 +0000 (23:34 +0000)]
rhel: Add Patch Port support to initscripts

Allows setting up type=patch ports through sysconfig ifcfg-* files.

Signed-off-by: Jason Kölker <jason@koelker.net>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
10 years agounit-test: merge test-heap into ovstest
Andy Zhou [Mon, 31 Mar 2014 01:20:07 +0000 (18:20 -0700)]
unit-test: merge test-heap into ovstest

Modify test-heap.c to use ovstest framework.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agounit-test: Add ovstest
Andy Zhou [Sat, 29 Mar 2014 01:13:00 +0000 (18:13 -0700)]
unit-test: Add ovstest

Changing one of the files in the Open vSwitch ``lib'' directory
causes 43 binaries to be relinked, which takes a lot of time even with
parallel ``make''.  31 of those binaries are in the ``tests''
directory.  ovstest attempts to combine most of those binaries into a
single test program that just takes a subcommand name as its first
command-line argument.
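
A minimal sketch of the subcommand idea follows.  It is not the ovstest
implementation: the table, the two placeholder test functions, and
run_test() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*test_main_fn)(int argc, char *argv[]);

/* One entry per former stand-alone test program (names hypothetical). */
struct test_command {
    const char *name;
    test_main_fn main_fn;
};

/* Placeholder test bodies; real tests would do actual work here. */
static int
test_heap_main(int argc, char *argv[])
{
    (void) argc;
    (void) argv;
    return 0;
}

static int
test_random_main(int argc, char *argv[])
{
    (void) argc;
    (void) argv;
    return 0;
}

static const struct test_command commands[] = {
    { "test-heap", test_heap_main },
    { "test-random", test_random_main },
};

/* Dispatches on argv[1].  Returns the subcommand's status, or -1 if the
 * subcommand is missing or unknown. */
static int
run_test(int argc, char *argv[])
{
    size_t i;

    if (argc < 2) {
        return -1;
    }
    for (i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (!strcmp(argv[1], commands[i].name)) {
            return commands[i].main_fn(argc - 1, argv + 1);
        }
    }
    return -1;
}
```

With this shape only one binary has to be relinked when a library file
changes, instead of one binary per test.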

The following patch makes use of this infrastructure.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agodebian: Depend on 'kmod' instead of module-init-tools.
Ben Pfaff [Mon, 31 Mar 2014 20:38:50 +0000 (13:38 -0700)]
debian: Depend on 'kmod' instead of module-init-tools.

CC: 733696@bugs.debian.org
Reported-by: md@Linux.IT (Marco d'Itri)
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agovswitchd: Add external_ids to Flow_Table table in database schema.
Ben Pfaff [Mon, 31 Mar 2014 20:34:52 +0000 (13:34 -0700)]
vswitchd: Add external_ids to Flow_Table table in database schema.

Every other table has an external_ids column, which can be useful to
controller writers for integration purposes, so add one to Flow_Table also.

Reported-by: Ariel Tubaltsev <atubaltsev@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoutil: xleep for Windows.
Gurucharan Shetty [Fri, 28 Mar 2014 22:15:02 +0000 (15:15 -0700)]
util: xleep for Windows.

Windows does not have a sleep(seconds). But it does have
a Sleep(milliseconds). Sleep() in Windows does not have a
return value. Since we are not using the return value for xsleep()
anywhere as of now, don't return any.
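
A minimal sketch of such a wrapper, assuming only the standard Sleep()
and sleep() calls; the function names are hypothetical and the real
xsleep() also has to deal with RCU quiescent states, which this sketch
leaves out.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Converts seconds to the milliseconds that Sleep() expects. */
static unsigned long
seconds_to_ms(unsigned int seconds)
{
    return (unsigned long) seconds * 1000;
}

/* Sleeps for 'seconds' on either platform.  Returns nothing, because
 * Sleep() on Windows has no return value to pass along. */
static void
sleep_seconds(unsigned int seconds)
{
#ifdef _WIN32
    Sleep(seconds_to_ms(seconds));
#else
    sleep(seconds);
#endif
}
```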

Introduced by commit 275eebb9 (utils: Introduce xsleep for RCU quiescent state)

CC: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Add support for kernels 3.13
Pravin Shelar [Wed, 2 Apr 2014 03:55:21 +0000 (20:55 -0700)]
datapath: Add support for kernels 3.13

Add support for building the in-tree kernel datapath for
Linux kernels up to 3.13. There were some changes in the
netlink area which required adding new compatibility code
for this layer. Also, some new per-cpu stats initialization
code was added.

Based on patch from Kyle Mestery.

Signed-off-by: Kyle Mestery <mestery@noironetworks.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Kyle Mestery <mestery@noironetworks.com>
10 years agobridge: don't bring up internal ports by default.
Flavio Leitner [Tue, 1 Apr 2014 21:05:20 +0000 (18:05 -0300)]
bridge: don't bring up internal ports by default.

It should be an administrator task to bring up devices once they
are configured properly.

Currently, Fedora is deleting the bridges when the interface is
brought down. Therefore, there is no bridge on the next boot and
the initscripts can apply the networking configuration properly
for a new bridge.

However, if the system didn't execute ifdown for some reason, the
bridge is left in the ovsdb and since internal ports are brought
up by default, there is no way for initscripts to know whether the
administrator has already configured it or not.

This patch reverts commit bef071a5fdf8e2dd87677b04b3cf7a8f5094edcb
(bridge: Always "up" internal devices.).

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofpbuf: Abstract 'l2' pointer and document usage conventions.
Jarno Rajahalme [Wed, 2 Apr 2014 22:44:21 +0000 (15:44 -0700)]
ofpbuf: Abstract 'l2' pointer and document usage conventions.

Rename 'l2' to 'frame' and add new ofpbuf_set_frame() and ofpbuf_l2().
ofpbuf_set_frame() also resets all the layer offsets.  ofpbuf_l2()
returns NULL if the packet has no Ethernet header, as indicated either
by unset l3 offset or NULL frame pointer.  Callers of ofpbuf_l2() are
supposed to check the return value, unless they can otherwise be sure
that the packet has a valid Ethernet header.
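
The NULL-returning convention can be sketched as follows.  The struct,
the sentinel value, and the function name are hypothetical stand-ins
rather than the definitions in lib/ofpbuf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel for "offset not set"; the real value may differ. */
#define OFS_UNSET UINT16_MAX

/* Hypothetical cut-down buffer with only the fields needed here. */
struct pkt_buf {
    void *frame;      /* start of the frame, or NULL */
    uint16_t l3_ofs;  /* offset of the L3 header, or OFS_UNSET */
};

/* Returns the Ethernet header, or NULL if the buffer has none, as
 * indicated by a NULL frame pointer or an unset L3 offset. */
static void *
pkt_l2(const struct pkt_buf *b)
{
    return b->frame && b->l3_ofs != OFS_UNSET ? b->frame : NULL;
}
```

A caller that cannot otherwise be sure an Ethernet header is present
would check the result before dereferencing it.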

The recent commit 437d0d22 made some assumptions that were not valid
regarding the use of the 'l2' pointer in rconn module and by
compose_rarp().  This is now fixed as follows: rconn now relies on the
fact that once OpenFlow messages are given to rconn for transport, the
frame pointer is no longer needed to refer to the OpenFlow header; and
compose_rarp() now sets the frame pointer and offsets as expected.

In addition to storing network frames, ofpbufs are also used for
handling OpenFlow messages and action lists.  lib/ofpbuf.h now has a
comment documenting the current usage conventions and invariants.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofpbuf: Rename trivial _get_ functions without the "get".
Jarno Rajahalme [Thu, 3 Apr 2014 18:51:54 +0000 (11:51 -0700)]
ofpbuf: Rename trivial _get_ functions without the "get".

Code reads better without the "get", for example "ofpbuf_l3()"
vs. "ofpbuf_get_l3()".  L4 payload access functions still use the
"get" (e.g., "ofpbuf_get_tcp_payload()").

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agotests: add ovstest to .gitignore
Lorand Jakab [Thu, 3 Apr 2014 18:42:31 +0000 (21:42 +0300)]
tests: add ovstest to .gitignore

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agounit-test: Link 29 test programs into ovstest
Andy Zhou [Tue, 1 Apr 2014 07:47:01 +0000 (00:47 -0700)]
unit-test: Link 29 test programs into ovstest

Improve link speed by linking 29 test programs into ovstest.

On my machine, running the following command against a fully
built tree:

  $ touch lib/random.c; time make

improves the overall build time from 7 seconds to 3.5 seconds.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agounit-test: Improve ovstest user interface
Andy Zhou [Mon, 31 Mar 2014 20:38:04 +0000 (13:38 -0700)]
unit-test: Improve ovstest user interface

Improve help output. Running without arguments now exits with an error
message and an error code. Simplify OVSTEST_REGISTER() since not all
test programs use sub_commands.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agojsonrpc-server: Combine notifications when connection becomes backlogged.
Ben Pfaff [Wed, 2 Apr 2014 15:43:17 +0000 (08:43 -0700)]
jsonrpc-server: Combine notifications when connection becomes backlogged.

Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management.  Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database.  However,
this is not ideal because of situations where disconnection happens
repeatedly.  For example:

     - A manager toggles a column back and forth between two or more values
       quickly (in which case the data transmitted over the monitoring
       connections always increases quickly, without bound).

     - A manager repeatedly extends the contents of some column in some row
       (in which case the data transmitted over the monitoring connection
       grows with O(n**2) in the length of the string).

A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount.  This commit implements this new way.
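
One way to picture combining updates is a per-row pending record that
keeps the oldest "old" value and the newest "new" value.  This is a
simplified sketch with hypothetical names and a single integer column,
not the ovsdb-server data structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pending update for one row with one integer column. */
struct pending_update {
    bool valid;     /* true if there is something to send */
    int old_value;  /* value the client last saw */
    int new_value;  /* latest value in the database */
};

/* Folds a change into the pending update instead of queueing it.  If
 * the column ends up back at the value the client last saw, nothing
 * needs to be sent at all. */
static void
track_change(struct pending_update *p, int old_value, int new_value)
{
    if (!p->valid) {
        p->valid = true;
        p->old_value = old_value;
    }
    p->new_value = new_value;
    if (p->old_value == p->new_value) {
        p->valid = false;
    }
}
```

Under this scheme the amount of queued data per row stays constant no
matter how many times the column toggles while the connection is
backlogged.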

Bug #1211786.
Bug #1221378.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agojsonrpc-server: Track monitor updates separately from sending them.
Ben Pfaff [Tue, 1 Apr 2014 23:26:47 +0000 (16:26 -0700)]
jsonrpc-server: Track monitor updates separately from sending them.

This will make combining monitor updates easier in the next commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoRevert "json: New function json_serialized_length()."
Ben Pfaff [Wed, 2 Apr 2014 14:11:20 +0000 (07:11 -0700)]
Revert "json: New function json_serialized_length()."

This reverts commit 1600fa6853872e16130366351a2c14f6fa8b547c.

Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management.  Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database.  However,
this is not ideal because of situations where disconnection happens
repeatedly.  For example:

     - A manager toggles a column back and forth between two or more values
       quickly (in which case the data transmitted over the monitoring
       connections always increases quickly, without bound).

     - A manager repeatedly extends the contents of some column in some row
       (in which case the data transmitted over the monitoring connection
       grows with O(n**2) in the length of the string).

A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount.  An upcoming patch implements this new way.  This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoRevert "ovsdb-data: New functions for predicting serialized length of data."
Ben Pfaff [Wed, 2 Apr 2014 14:10:40 +0000 (07:10 -0700)]
Revert "ovsdb-data: New functions for predicting serialized length of data."

This reverts commit 0ea7bec76d804a2c4efccd3dbdaa3e30cf536a5c.

Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management.  Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database.  However,
this is not ideal because of situations where disconnection happens
repeatedly.  For example:

     - A manager toggles a column back and forth between two or more values
       quickly (in which case the data transmitted over the monitoring
       connections always increases quickly, without bound).

     - A manager repeatedly extends the contents of some column in some row
       (in which case the data transmitted over the monitoring connection
       grows with O(n**2) in the length of the string).

A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount.  An upcoming patch implements this new way.  This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoRevert "jsonrpc-server: Disconnect connections that queue too much data."
Ben Pfaff [Wed, 2 Apr 2014 14:09:47 +0000 (07:09 -0700)]
Revert "jsonrpc-server: Disconnect connections that queue too much data."

This reverts commit 60533a405b2eb77214b767767fe143c8645f82d5.

Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management.  Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database.  However,
this is not ideal because of situations where disconnection happens
repeatedly.  For example:

     - A manager toggles a column back and forth between two or more values
       quickly (in which case the data transmitted over the monitoring
       connections always increases quickly, without bound).

     - A manager repeatedly extends the contents of some column in some row
       (in which case the data transmitted over the monitoring connection
       grows with O(n**2) in the length of the string).

A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount.  An upcoming patch implements this new way.  This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoRevert "jsonrpc-server: Add test for disconnecting connections with too long queues."
Ben Pfaff [Wed, 2 Apr 2014 21:22:25 +0000 (14:22 -0700)]
Revert "jsonrpc-server: Add test for disconnecting connections with too long queues."

This reverts commit 631583739f9aec55d4cbe25fb856143ccde48ab6.

Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management.  Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database.  However,
this is not ideal because of situations where disconnection happens
repeatedly.  For example:

     - A manager toggles a column back and forth between two or more values
       quickly (in which case the data transmitted over the monitoring
       connections always increases quickly, without bound).

     - A manager repeatedly extends the contents of some column in some row
       (in which case the data transmitted over the monitoring connection
       grows with O(n**2) in the length of the string).

A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount.  An upcoming patch implements this new way.  This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoovsdb-server: Send update for _version for changes due to weak ref removal.
Ben Pfaff [Tue, 1 Apr 2014 21:33:24 +0000 (14:33 -0700)]
ovsdb-server: Send update for _version for changes due to weak ref removal.

When a row was deleted, and that caused the transaction manager to remove
a remaining weak reference to the row, and no other change had been made
to the row in that transaction, and a client was monitoring modifications
of the _version column in the row, then the monitor update for the column
did not include the old contents of the _version column, even though it
should have.  This commit fixes the problem.

Probably most clients only look at the new value of the column, if they
monitor _version at all, and this bug is really old, so it's probably
not a serious bug.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoovsdb-client: Support all vlog options.
Ben Pfaff [Wed, 2 Apr 2014 22:03:41 +0000 (15:03 -0700)]
ovsdb-client: Support all vlog options.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agodebian: Allow kmod or module-init-tools for backward compatibility.
Ben Pfaff [Wed, 2 Apr 2014 21:54:51 +0000 (14:54 -0700)]
debian: Allow kmod or module-init-tools for backward compatibility.

Commit d473844693 (debian: Depend on 'kmod' instead of module-init-tools.)
switched from depending on module-init-tools to depending on kmod, which
is the new name of the appropriate package in Debian.  Unfortunately,
while kmod is the right name for the latest Debian distribution, it doesn't
have that name in old distributions and thus breaks the build on those.
This commit should work OK in either case, since it allows both names.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
10 years agonetdev-dpdk: Remove alloc from packet recv.
Pravin Shelar [Mon, 31 Mar 2014 20:17:24 +0000 (13:17 -0700)]
netdev-dpdk: Remove alloc from packet recv.

On DPDK packet recv, OVS is given a pointer to an mbuf which has
information about a packet, for example a pointer to data and size.
By moving the mbuf to ofpbuf we can let DPDK allocate the ofpbuf and
pass that to OVS for processing the packet.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoofpbuf: Add DPDK mbuf to ofpbuf.
Pravin Shelar [Tue, 1 Apr 2014 03:41:53 +0000 (20:41 -0700)]
ofpbuf: Add DPDK mbuf to ofpbuf.

Define data, base and size access APIs for DPDK.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoofpbuf: Add ofpbuf_init_dpdk()
Pravin Shelar [Mon, 31 Mar 2014 20:22:39 +0000 (13:22 -0700)]
ofpbuf: Add ofpbuf_init_dpdk()

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoofpbuf: Introduce access api for base, data and size.
Pravin Shelar [Sun, 30 Mar 2014 08:31:50 +0000 (01:31 -0700)]
ofpbuf: Introduce access api for base, data and size.

These functions will be used by later patches.  Following patch
does not change functionality.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoofpbuf: Add private pointer for dpdk
Pravin Shelar [Mon, 31 Mar 2014 19:44:06 +0000 (12:44 -0700)]
ofpbuf: Add private pointer for dpdk

netdev-dpdk uses this pointer to store the DPDK mbuf. This patch fixes a
compilation error in DPDK.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agodatapath: Minimize ovs_flow_cmd_new|set critical sections.
Jarno Rajahalme [Wed, 2 Apr 2014 18:14:58 +0000 (11:14 -0700)]
datapath: Minimize ovs_flow_cmd_new|set critical sections.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Split ovs_flow_cmd_new_or_set().
Jarno Rajahalme [Wed, 2 Apr 2014 18:14:58 +0000 (11:14 -0700)]
datapath: Split ovs_flow_cmd_new_or_set().

Following patch will be easier to reason about with separate
ovs_flow_cmd_new() and ovs_flow_cmd_set() functions.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Minimize ovs_flow_cmd_del critical section.
Jarno Rajahalme [Wed, 2 Apr 2014 18:14:58 +0000 (11:14 -0700)]
datapath: Minimize ovs_flow_cmd_del critical section.

ovs_flow_cmd_del() now allocates reply (if needed) after the flow has
already been removed from the flow table.  If the reply allocation
fails, a netlink error is signaled with netlink_set_err(), as is
already done in ovs_flow_cmd_new_or_set() in the similar situation.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Reduce locking requirements.
Jarno Rajahalme [Wed, 2 Apr 2014 18:14:58 +0000 (11:14 -0700)]
datapath: Reduce locking requirements.

Reduce and clarify locking requirements for ovs_flow_cmd_alloc_info(),
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info().

A datapath pointer is available only when holding a lock.  Change
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info() to take a
dp_ifindex directly, rather than a datapath pointer that is then
(only) used to get the dp_ifindex.  This is useful, since the
dp_ifindex is available even when the datapath pointer is not, both
before and after taking a lock, which makes further critical section
reduction possible.

Make ovs_flow_cmd_alloc_info() take an 'acts' argument instead of a
'flow' pointer.  This allows some future patches to do the allocation
before acquiring the flow pointer.

The locking requirements after this patch are:

ovs_flow_cmd_alloc_info(): May be called without locking, must not be
  called while holding the RCU read lock (due to memory allocation).
  If 'acts' belongs to a flow in the flow table, however, then the
  caller must hold ovs_mutex.

ovs_flow_cmd_fill_info(): Either ovs_mutex or RCU read lock must be held.

ovs_flow_cmd_build_info(): This calls both of the above, so the caller
  must hold ovs_mutex.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath/flow: Fix ovs_flow_stats_get/clear RCU dereference.
Jarno Rajahalme [Wed, 2 Apr 2014 18:14:58 +0000 (11:14 -0700)]
datapath/flow: Fix ovs_flow_stats_get/clear RCU dereference.

For ovs_flow_stats_get() using ovsl_dereference() was wrong, since
flow dumps call this with RCU read lock.

ovs_flow_stats_clear() is always called with ovs_mutex, so can use
ovsl_dereference().

Also, make the ovs_flow_stats_get() 'flow' argument const to make
later patches cleaner.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Fix typo.
Jarno Rajahalme [Wed, 2 Apr 2014 18:14:58 +0000 (11:14 -0700)]
datapath: Fix typo.

Incorrect struct name was confusing, even though otherwise
inconsequential.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agovtep: Add IP address configuration for bfd.
Bruce Davie [Wed, 2 Apr 2014 16:11:48 +0000 (09:11 -0700)]
vtep: Add IP address configuration for bfd.

The OVS implementation of BFD allows configuration of the source and
destination IP addresses of BFD packets. This patch adds the same
configuration option to the VTEP schema.

Signed-off-by: Bruce Davie <bsd@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
10 years agoofproto.at: Fix races in rule eviciton tests
YAMAMOTO Takashi [Mon, 31 Mar 2014 05:04:35 +0000 (14:04 +0900)]
ofproto.at: Fix races in rule eviciton tests

Bump timeout differences, because timeouts that differ by 1s might end up
with the same position in the heap, as rule_eviction_priority() uses
1024ms as a unit.
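
The arithmetic behind this can be shown with a small sketch.  It assumes
only that expiration times in milliseconds are reduced to whole 1024ms
units; the function name is hypothetical and the rest of
rule_eviction_priority() is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Reduces an expiration time in milliseconds to 1024ms units. */
static uint32_t
eviction_time_unit(uint64_t expiration_ms)
{
    return (uint32_t) (expiration_ms >> 10);
}
```

For example, expirations at 10240ms and 11240ms are 1s apart but both
fall in unit 10, so they compare equal.  A difference of at least 1024ms
always lands in distinct units, which is why larger timeout differences
avoid the race.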

Also, use time/stop to avoid relying on how long an add-flow would take.

These tests were introduced by commit 6d56c1f1
("ofproto: Update rule's priority in eviction group.").

Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Kmindg G <kmindg@gmail.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
10 years agoofproto-dpif: Fix races in MPLS tests
YAMAMOTO Takashi [Mon, 31 Mar 2014 03:31:46 +0000 (12:31 +0900)]
ofproto-dpif: Fix races in MPLS tests

Depending on how output buffers are flushed, there might
still be races.  But this at least makes the race windows smaller.

Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
10 years agoofproto: Fix a race in pause and resume test
YAMAMOTO Takashi [Mon, 31 Mar 2014 03:22:20 +0000 (12:22 +0900)]
ofproto: Fix a race in pause and resume test

Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
10 years agoofproto-dpif.at: Fix races in table-miss tests
YAMAMOTO Takashi [Mon, 31 Mar 2014 03:13:02 +0000 (12:13 +0900)]
ofproto-dpif.at: Fix races in table-miss tests

Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
10 years agolib/ofpbuf: Compact
Jarno Rajahalme [Mon, 24 Mar 2014 16:17:01 +0000 (09:17 -0700)]
lib/ofpbuf: Compact

This patch shrinks the struct ofpbuf from 104 to 48 bytes on 64-bit
systems, or from 52 to 36 bytes on 32-bit systems (counting in the
'l7' removal from an earlier patch).  This may help contribute to
cache efficiency, and will speed up initializing, copying and
manipulating ofpbufs.  This is potentially important for the DPDK
datapath, but the rest of the code base may also see a little benefit.

Changes are:

- Remove 'l7' pointer (previous patch).
- Use offsets instead of layer pointers for l2_5, l3, and l4 using
  'l2' as basis.  Usually 'data' is the same as 'l2', but this is not
  always the case (e.g., when parsing or constructing a packet), so it
  can not be easily used as the offset basis.  Also, packet parsing is
  faster if we do not need to maintain the offsets each time we pull
  data from the ofpbuf.
- Use uint32_t for 'allocated' and 'size', as 2^32 is enough even for
  the largest possible messages/packets.
- Use packed enum for 'source'.
- Rearrange to avoid unnecessary padding.
- Remove 'private_p', which was used only in two cases, both of which
  had the invariant ('l2' == 'data'), so we can temporarily use 'l2'
  as a private pointer.
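
The effect of replacing layer pointers with offsets can be sketched with
two cut-down structs.  These are hypothetical, much smaller than struct
ofpbuf, and the 16-bit offset width is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pointer per layer. */
struct buf_with_pointers {
    void *l2;
    void *l2_5;
    void *l3;
    void *l4;
};

/* One base pointer plus small offsets relative to 'l2'. */
struct buf_with_offsets {
    void *l2;
    uint16_t l2_5_ofs;
    uint16_t l3_ofs;
    uint16_t l4_ofs;
};

/* Recovers the L3 pointer from the base pointer and its offset. */
static void *
buf_l3(const struct buf_with_offsets *b)
{
    return (char *) b->l2 + b->l3_ofs;
}
```

On common 64-bit ABIs the first struct is 32 bytes and the second 16,
at the cost of one addition each time a layer pointer is needed.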

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Minimize dp and vport critical sections.
Jarno Rajahalme [Fri, 28 Mar 2014 20:44:23 +0000 (13:44 -0700)]
datapath: Minimize dp and vport critical sections.

Move most memory allocations away from the ovs_mutex critical
sections.  vport allocations still happen while the lock is taken, as
changing that would require major refactoring. Also, vports are
created very rarely so it should not matter.

ovs_dp_cmd_get() now only takes the rcu_read_lock(), rather
than ovs_lock(), as nothing needs to be changed.  This was done by
ovs_vport_cmd_get() already.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agodatapath: Make flow mask removal symmetric.
Jarno Rajahalme [Sat, 29 Mar 2014 22:52:32 +0000 (15:52 -0700)]
datapath: Make flow mask removal symmetric.

Masks are inserted when flows are inserted to the table, so it is
logical to correspondingly remove masks when flows are removed from
the table, in ovs_flow_table_remove().

This allows ovs_flow_free() to be called without locking, which will
be used by later patches.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Build flow cmd netlink reply only if needed.
Jarno Rajahalme [Sat, 29 Mar 2014 22:52:32 +0000 (15:52 -0700)]
datapath: Build flow cmd netlink reply only if needed.

Use netlink_has_listeners() and NLM_F_ECHO flag to determine if a
reply is needed or not for OVS_FLOW_CMD_NEW, OVS_FLOW_CMD_SET, or
OVS_FLOW_CMD_DEL.  Currently, OVS userspace does not request a reply
for OVS_FLOW_CMD_NEW, but usually does for OVS_FLOW_CMD_DEL, as stats
may have changed.
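
The decision described above can be sketched as a small predicate.  This is not the kernel code: the function name and the boolean standing in for the result of netlink_has_listeners() are assumptions, and the flag constant is a local copy of the NLM_F_ECHO value from linux/netlink.h so that the sketch builds in userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of NLM_F_ECHO (0x08 in linux/netlink.h), for the sketch. */
#define SKETCH_NLM_F_ECHO 0x08

/* Returns true if a reply message is worth building: either the
 * requester asked for an echo, or someone listens on the multicast
 * group.  'group_has_listeners' is a hypothetical stand-in for the
 * result of netlink_has_listeners(). */
static bool reply_needed(uint16_t nlmsg_flags, bool group_has_listeners)
{
    return (nlmsg_flags & SKETCH_NLM_F_ECHO) != 0 || group_has_listeners;
}
```

When the predicate is false, the allocation and fill of the reply can be skipped entirely.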

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agorecirculation: Adjust ovs_key_attr ABI
Andy Zhou [Fri, 28 Mar 2014 20:36:28 +0000 (13:36 -0700)]
recirculation: Adjust ovs_key_attr ABI

Jesse helped to clarify how to maintain the ABI. Make the
adjustment accordingly and add some comments.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Use with RCU_INIT_POINTER(x, NULL) in vport-gre.c
Monam Agarwal [Fri, 28 Mar 2014 03:12:22 +0000 (20:12 -0700)]
datapath: Use with RCU_INIT_POINTER(x, NULL) in vport-gre.c

This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL).

rcu_assign_pointer() ensures that the initialization of a structure
is carried out before a pointer to that structure is stored.
In the case of a NULL pointer, there is no structure to initialize,
so rcu_assign_pointer(p, NULL) can safely be converted to RCU_INIT_POINTER(p, NULL).
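
A userspace analogy of the reasoning above, using C11 atomics rather than the kernel RCU macros.  The release store plays the role of rcu_assign_pointer() and the relaxed store the role of RCU_INIT_POINTER(p, NULL); all names below are hypothetical, and the acquire load is a conservative stand-in for rcu_dereference().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct config {
    int mtu;
};

static _Atomic(struct config *) current_config;

/* Publishing a structure: the release store orders the earlier
 * initialization of '*cfg' before the pointer becomes visible. */
static void publish_config(struct config *cfg)
{
    assert(cfg != NULL);
    atomic_store_explicit(&current_config, cfg, memory_order_release);
}

/* Storing NULL: there is no structure whose initialization must be
 * ordered, so a relaxed store is sufficient. */
static void clear_config(void)
{
    atomic_store_explicit(&current_config, NULL, memory_order_relaxed);
}

/* Reader side of the sketch. */
static struct config *read_config(void)
{
    return atomic_load_explicit(&current_config, memory_order_acquire);
}
```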

Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agosparse: workaround for a bug in sparse.
Pritesh Kothari [Fri, 28 Mar 2014 19:20:00 +0000 (12:20 -0700)]
sparse: workaround for a bug in sparse.

sparse emits the following warning:
lib/dpif-netdev.c:1755:15: warning: Initializer entry defined twice
lib/dpif-netdev.c:1755:15:   also defined here
This is due to a bug in sparse, which doesn't like an inlined function
that expands a #define within it. This commit removes 'inline' to make
sparse happy.

Signed-off-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agosparse: fix the order of include paths to make sparse happy.
Pritesh Kothari [Fri, 28 Mar 2014 19:19:59 +0000 (12:19 -0700)]
sparse: fix the order of include paths to make sparse happy.

This fix restores the order of the include paths so that the local include/
comes before the system /usr/include in the #include path. This makes sure that
include/linux/types.h and include/linux/openvswitch.h take precedence over the
similar files in the /usr/include/ directory.

Signed-off-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agorecirculation: Some cosmetic fixes
YAMAMOTO Takashi [Thu, 27 Mar 2014 14:38:57 +0000 (14:38 +0000)]
recirculation: Some cosmetic fixes

Wrap long lines, fix whitespaces, and fix a typo in a comment.
No functional changes are intended.

Cc: Andy Zhou <azhou@nicira.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andy Zhou <azhou@nicira.com>
10 years agolib/packet.h: add hash_mac()
Andy Zhou [Fri, 28 Mar 2014 03:22:37 +0000 (20:22 -0700)]
lib/packet.h: add hash_mac()

Add hash_mac() and apply it when appropriate.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agolib/hash.h: add hash_uint64()
Andy Zhou [Fri, 28 Mar 2014 02:38:04 +0000 (19:38 -0700)]
lib/hash.h: add hash_uint64()

Add hash_uint64() and apply it when appropriate.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agolib/hash.h: fix hash_2words
Andy Zhou [Fri, 28 Mar 2014 02:08:36 +0000 (19:08 -0700)]
lib/hash.h: fix hash_2words

Number of bytes in 2 words should be 8, instead of 4 bytes,
to better follow the mhash_finish() API. It is unlikely the fix
will improve the quality of hashing results.
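
For illustration, a MurmurHash3-style sketch of hashing two 32-bit words.  The helper names are local to this example and only modeled on the hash API mentioned above; the point is that the finishing step receives the number of bytes actually hashed, which is 8 for two words.

```c
#include <assert.h>
#include <stdint.h>

static inline uint32_t rotl32(uint32_t x, int k)
{
    assert(k > 0 && k < 32);
    return (x << k) | (x >> (32 - k));
}

/* Mixes one 32-bit word into the running hash (MurmurHash3 body step). */
static inline uint32_t mix_add(uint32_t hash, uint32_t data)
{
    data *= 0xcc9e2d51;
    data = rotl32(data, 15);
    data *= 0x1b873593;
    hash ^= data;
    hash = rotl32(hash, 13);
    return hash * 5 + 0xe6546b64;
}

/* Final avalanche; 'n_bytes' is the total input length in bytes. */
static inline uint32_t mix_finish(uint32_t hash, uint32_t n_bytes)
{
    hash ^= n_bytes;
    hash ^= hash >> 16;
    hash *= 0x85ebca6b;
    hash ^= hash >> 13;
    hash *= 0xc2b2ae35;
    hash ^= hash >> 16;
    return hash;
}

/* Two 32-bit words are 8 bytes of input, so finish with 8, not 4. */
static inline uint32_t hash_two_words(uint32_t x, uint32_t y, uint32_t basis)
{
    return mix_finish(mix_add(mix_add(basis, x), y), 8);
}
```

Passing 4 instead of 8 still yields a valid hash, just a different one, which matches the remark that the fix is about following the API rather than improving quality.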

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-appctl: A port for Windows.
Gurucharan Shetty [Fri, 28 Mar 2014 15:37:36 +0000 (08:37 -0700)]
ovs-appctl: A port for Windows.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agounixctl: Add support for Windows.
Gurucharan Shetty [Fri, 28 Mar 2014 15:19:59 +0000 (08:19 -0700)]
unixctl: Add support for Windows.

For Windows, use a kernel assigned localhost TCP port to listen for
runtime management connections and then write it into a file
so that a client can read it and then make a TCP connection.
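
The mechanism can be sketched as below.  This is not the unixctl code: it uses POSIX sockets so that it builds on a typical development machine (Windows would use Winsock and closesocket()), and the function names and file format are assumptions for the example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listens on a kernel-assigned localhost TCP port and writes the port
 * number to 'port_file'.  Returns the listening fd, or -1 on error. */
static int listen_on_assigned_port(const char *port_file, int *portp)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof sin;
    FILE *f;
    int fd;

    assert(port_file != NULL && portp != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(0);            /* 0 asks the kernel to choose. */

    if (bind(fd, (struct sockaddr *) &sin, sizeof sin) < 0
        || listen(fd, 10) < 0
        || getsockname(fd, (struct sockaddr *) &sin, &len) < 0) {
        close(fd);
        return -1;
    }
    *portp = ntohs(sin.sin_port);

    f = fopen(port_file, "w");
    if (!f) {
        close(fd);
        return -1;
    }
    fprintf(f, "%d\n", *portp);
    fclose(f);
    return fd;
}

/* Client side: reads the port back from the file, or -1 on error. */
static int read_port_file(const char *port_file)
{
    int port = -1;
    FILE *f = fopen(port_file, "r");

    if (f) {
        if (fscanf(f, "%d", &port) != 1) {
            port = -1;
        }
        fclose(f);
    }
    return port;
}
```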

Since we do not yet have the infrastructure to create pidfiles on
Windows, we create the *.ctl file without a pid. This
should be okay since we use a different OVS_RUNDIR when we run
multiple copies of a daemon.

We do not generate man pages on Windows. But we still update them
for Windows so that anyone can read them elsewhere. Since we do not
generate them directly, we cannot dynamically show the configured
OVS_RUNDIR on Windows. So, I have a not so nice \fIOVS_RUNDIR\fR
in the man page.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agostream-tcp: Use closesocket instead of close for sockets.
Gurucharan Shetty [Thu, 27 Mar 2014 23:39:35 +0000 (16:39 -0700)]
stream-tcp: Use closesocket instead of close for sockets.

We should use closesocket() when closing sockets so that
closing sockets works on both POSIX and Windows.
(On POSIX, we #define closesocket close.)

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-appctl: Close the connection during error.
Gurucharan Shetty [Thu, 27 Mar 2014 22:40:25 +0000 (15:40 -0700)]
ovs-appctl: Close the connection during error.

When we send a wrong command to an Open vSwitch daemon, it returns
an error. In that case, close the connection to the daemon.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-upcall: Fix a race.
Alex Wang [Tue, 25 Mar 2014 20:57:25 +0000 (13:57 -0700)]
ofproto-dpif-upcall: Fix a race.

Commit 61057e884ca9c(ofproto-dpif-upcall: Slightly simplify
udpif_upcall_handler().) restructured the main loop in
udpif_upcall_handler() and discarded the check for the
'exit_latch' after acquiring the mutex.  This makes it
possible for the following race:

- main thread sets the 'exit_latch' after the handler thread
  has checked it.
- main thread acquires the handler thread mutex and signals the
  condition variable of the handler thread.
- main thread releases the mutex and joins the handler thread.
- handler thread acquires the mutex, finds that n_upcalls is 0
  and waits for the condition variable to be signaled.
- then OVS will hang forever.

This commit fixes the above issue by adding a check for the
'exit_latch' after acquiring the mutex.
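
The shape of the fix can be sketched with plain pthreads.  This is a simplified model, not udpif_upcall_handler(): a boolean protected by the mutex stands in for 'exit_latch', and all names are hypothetical.  The key point is that the exit condition is re-checked while holding the mutex, before every wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct handler {
    pthread_mutex_t mutex;
    pthread_cond_t wake;
    bool exit_requested;        /* Stand-in for 'exit_latch'. */
    int n_upcalls;              /* Pending work. */
    int n_handled;              /* Work completed so far. */
};

static void *handler_main(void *arg)
{
    struct handler *h = arg;

    pthread_mutex_lock(&h->mutex);
    for (;;) {
        /* Check the exit request under the mutex, so a signal sent
         * before we start waiting cannot be missed. */
        while (!h->exit_requested && h->n_upcalls == 0) {
            pthread_cond_wait(&h->wake, &h->mutex);
        }
        if (h->exit_requested) {
            break;
        }
        h->n_handled += h->n_upcalls;
        h->n_upcalls = 0;
    }
    pthread_mutex_unlock(&h->mutex);
    return NULL;
}

/* Queues one unit of work and wakes the handler. */
static void handler_post(struct handler *h)
{
    pthread_mutex_lock(&h->mutex);
    h->n_upcalls++;
    pthread_cond_signal(&h->wake);
    pthread_mutex_unlock(&h->mutex);
}

/* Requests exit, wakes the handler, and waits for it to finish. */
static void handler_stop(struct handler *h, pthread_t thread)
{
    int error;

    pthread_mutex_lock(&h->mutex);
    h->exit_requested = true;
    pthread_cond_signal(&h->wake);
    pthread_mutex_unlock(&h->mutex);

    error = pthread_join(thread, NULL);
    assert(error == 0);
    (void) error;
}
```

Without the check inside the wait loop, the handler could block on the condition variable after the only wakeup had already been delivered, which is the hang described above.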

Bug #1217229

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoovs-vsctl: Improve error reporting
Andy Zhou [Thu, 27 Mar 2014 16:10:31 +0000 (17:10 +0100)]
ovs-vsctl: Improve error reporting

ovs-vsctl is a command-line interface to the Open vSwitch database,
and as such it just modifies the desired Open vSwitch configuration in
the database.  ovs-vswitchd, on the other hand, monitors the database
and implements the actual configuration specified in the database.
This can lead to surprises when the user makes a change to the
database, with ovs-vsctl, that ovs-vswitchd cannot actually
implement. In such a case, the ovs-vsctl command silently succeeds
(because the database was successfully updated) but its desired
effects don't actually take place. One good example of such a change
is attempting to add a port with a misspelled name (e.g. ``ovs-vsctl
add-port br0 fth0'', where fth0 should be eth0); another is creating
a bridge or a port whose name is longer than supported
(e.g. ``ovs-vsctl add-br'' with a 16-character bridge name on
Linux). It can take users a long time to realize the error, because it
requires looking through the ovs-vswitchd log.

The patch improves the situation by checking whether operations that
ovs executes succeed and reporting an error when
they do not.  This patch only reports add-br and add-port
operation errors, by examining the `ofport' value that
ovs-vswitchd stores into the database record for the newly created
interface.  Until ovs-vswitchd finishes implementing the new
configuration, this column is empty, and after it finishes it is
either -1 (on failure) or a positive number (on success).
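
The three states of the `ofport' column can be sketched as a small classifier.  The enum and function names are hypothetical, and treating every non-positive value as a failure is an assumption that goes slightly beyond the "-1 or positive" rule stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum port_status {
    PORT_PENDING,               /* Column still empty. */
    PORT_FAILED,                /* ovs-vswitchd could not create the port. */
    PORT_OK                     /* Port exists with a valid OpenFlow port. */
};

/* 'has_ofport' is false while the column is empty; 'ofport' is only
 * meaningful when 'has_ofport' is true. */
static enum port_status classify_ofport(bool has_ofport, int64_t ofport)
{
    if (!has_ofport) {
        return PORT_PENDING;
    }
    return ofport > 0 ? PORT_OK : PORT_FAILED;
}
```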

Signed-off-by: Andy Zhou <azhou@nicira.com>
Co-authored-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovsdb-idlc: Generate new *_get_for_uuid() functions.
Ben Pfaff [Fri, 28 Mar 2014 02:11:20 +0000 (19:11 -0700)]
ovsdb-idlc: Generate new *_get_for_uuid() functions.

This allows a client to obtain the IDL version of a row given its UUID.
It isn't normally useful, but there's a specialized use case for getting
the IDL version of a row given the UUID returned by
ovsdb_idl_txn_get_insert_uuid() following transaction commit.

An alternative would be to generate table-specific versions of
ovsdb_idl_txn_get_insert_uuid().  That seems reasonable to me too.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agostream-tcp: Fix error message for failed TCP_NODELAY setting on Windows.
Ben Pfaff [Thu, 27 Mar 2014 17:04:55 +0000 (10:04 -0700)]
stream-tcp: Fix error message for failed TCP_NODELAY setting on Windows.

Reported-by: Gurucharan Shetty <gshetty@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Kyle Mestery <mestery@noironetworks.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
10 years agoSetting tag sliver-openvswitch-2.1.90-2 sliver-openvswitch-2.1.90-2
Thierry Parmentelat [Tue, 25 Mar 2014 13:44:19 +0000 (14:44 +0100)]
Setting tag sliver-openvswitch-2.1.90-2
fix packaging

10 years agomentioning files under /usr/lib/ rather than the dir itself as this conflicts with...
Thierry Parmentelat [Mon, 24 Mar 2014 13:09:20 +0000 (14:09 +0100)]
mentioning files under /usr/lib/ rather than the dir itself as this conflicts with the 'filesystem' rpm

10 years agoFAQ: Fix supported kernel version.
Pravin [Tue, 25 Mar 2014 18:08:19 +0000 (11:08 -0700)]
FAQ: Fix supported kernel version.

OVS 2.1 does not support kernel 3.12.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Kyle Mestery <mestery@noironetworks.com>
10 years agolib/ofpbuf: Remove 'l7' pointer.
Jarno Rajahalme [Tue, 25 Mar 2014 22:26:23 +0000 (15:26 -0700)]
lib/ofpbuf: Remove 'l7' pointer.

Now that we don't need to parse TCP flags from the packet after
extraction, we usually do not need the 'l7' pointer any more.  When
needed, ofpbuf_get_tcp|udp|sctp|icmp_payload() or ofpbuf_get_l4_size()
can be used instead.

Removal of 'l7' was requested by Pravin for the DPDK datapath work, as
it simplifies packet parsing a bit.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agolib/ofpbuf: Inline the trivial ofpbuf functions.
Jarno Rajahalme [Tue, 25 Mar 2014 22:26:23 +0000 (15:26 -0700)]
lib/ofpbuf: Inline the trivial ofpbuf functions.

Inline the most trivial ofpbuf functions to allow for better optimization.
Also inline the most often used ofpbuf_pull() and ofpbuf_try_pull(), which
should help streamline packet parsing.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agolib/pcap: Use ofpbuf_tail() instead of ofpbuf_end().
Jarno Rajahalme [Tue, 25 Mar 2014 22:26:23 +0000 (15:26 -0700)]
lib/pcap: Use ofpbuf_tail() instead of ofpbuf_end().

Using ofpbuf_end() to compute payload length would fail if the ofpbuf
had any tailroom.
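
A minimal sketch of the difference, using a simplified buffer struct rather than the real struct ofpbuf (field and function names here are assumptions): the tail marks the end of the bytes in use, while the end marks the end of the allocation, so the two differ by the tailroom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified buffer: 'size' bytes in use starting at 'data', inside an
 * allocation of 'allocated' bytes starting at 'base'. */
struct buf {
    char *base;
    char *data;
    uint32_t allocated;
    uint32_t size;
};

/* First byte past the data in use. */
static char *buf_tail(const struct buf *b)
{
    return b->data + b->size;
}

/* First byte past the allocated space. */
static char *buf_end(const struct buf *b)
{
    return b->base + b->allocated;
}

/* Payload length must be measured up to the tail, not the end. */
static size_t payload_len(const struct buf *b, const char *payload)
{
    assert(payload >= b->data && payload <= buf_tail(b));
    return (size_t) (buf_tail(b) - payload);
}
```

With 16 bytes of headroom, 60 bytes of data, and a 128-byte allocation, a payload starting 14 bytes into the data is 46 bytes long, whereas measuring to the end would wrongly give 98.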

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agolockfile: Minor code cleanup.
Ben Pfaff [Tue, 25 Mar 2014 19:35:28 +0000 (12:35 -0700)]
lockfile: Minor code cleanup.

There were more superficial differences between Windows and non-Windows
versions of the code due to naming.  This commit reduces those differences.

CC: Gurucharan Shetty <shettyg@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif-netdev: user space datapath recirculation
Andy Zhou [Tue, 4 Mar 2014 23:36:03 +0000 (15:36 -0800)]
dpif-netdev: user space datapath recirculation

Add basic recirculation infrastructure and user space
data path support for it. The following bond mega flow patch will
make use of this infrastructure.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Added Per backer recirculation ID management
Andy Zhou [Mon, 27 Jan 2014 09:18:30 +0000 (01:18 -0800)]
ofproto-dpif: Added Per backer recirculation ID management

Recirculation ID needs to be unique per datapath. Its usage will be
tracked by the backer that corresponds to the datapath.

In theory, Recirculation ID can be any uint32_t value, except 0. This
implementation limits to a smaller range just for ease of debugging.
Making the range size 0 effectively disables recirculation.
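
A rough sketch of such a limited-range allocator follows.  It is not the ofproto-dpif implementation: the pool structure, the bitmap-style array, and the POOL_MAX bound are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RECIRC_ID_NONE 0        /* 0 is never a valid recirculation ID. */
#define POOL_MAX 64             /* Hypothetical upper bound on range size. */

struct recirc_id_pool {
    uint32_t base;              /* First ID in the range, nonzero. */
    uint32_t n_ids;             /* Range size; 0 disables recirculation. */
    bool in_use[POOL_MAX];
};

static void pool_init(struct recirc_id_pool *pool, uint32_t base,
                      uint32_t n_ids)
{
    assert(base != RECIRC_ID_NONE && n_ids <= POOL_MAX);
    pool->base = base;
    pool->n_ids = n_ids;
    memset(pool->in_use, 0, sizeof pool->in_use);
}

/* Returns a free ID from the range, or RECIRC_ID_NONE if none is left. */
static uint32_t pool_alloc(struct recirc_id_pool *pool)
{
    uint32_t i;

    for (i = 0; i < pool->n_ids; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return pool->base + i;
        }
    }
    return RECIRC_ID_NONE;
}

/* Returns 'id' to the pool; IDs outside the range are ignored. */
static void pool_free(struct recirc_id_pool *pool, uint32_t id)
{
    if (id >= pool->base && id - pool->base < pool->n_ids) {
        pool->in_use[id - pool->base] = false;
    }
}
```

One pool per backer keeps IDs unique per datapath, and a pool initialized with a range size of 0 never hands out an ID.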

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agolib/flow: add dp_hash and recirc_id to struct flow
Andy Zhou [Tue, 4 Mar 2014 22:20:19 +0000 (14:20 -0800)]
lib/flow: add dp_hash and recirc_id to struct flow

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-bsd: compilation fixes
YAMAMOTO Takashi [Tue, 25 Mar 2014 20:13:43 +0000 (05:13 +0900)]
netdev-bsd: compilation fixes

This fixes regressions from commit f7791740
("netdev: Rename netdev_rx to netdev_rxq")
and commit df1e5a3b
("netdev: Extend rx_recv to pass multiple packets.").

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Clarify locking.
Jarno Rajahalme [Tue, 25 Mar 2014 16:12:44 +0000 (09:12 -0700)]
datapath: Clarify locking.

Remove unnecessary locking from functions that are always called with
appropriate locking.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
10 years agodpif-netdev: Fix a compilation warning
Andy Zhou [Tue, 25 Mar 2014 04:10:39 +0000 (21:10 -0700)]
dpif-netdev: Fix a compilation warning

Building OVS tree without DPDK produced the following warning message:
    lib/dpif-netdev.c:1868:5: error: statement with no effect

This error message complains that the return value of the following
macro is not used:
#define pmd_thread_setaffinity_cpu(c) (0)

The patch fixes this warning by turning the stub macros
into inline functions.
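
A sketch of the stub as an inline function.  The function name comes from the macro above; the parameter type and the caller are assumptions for the example.

```c
#include <assert.h>

/* Macro form, for comparison: a call used as a statement expands to
 * "(0);", which the compiler reports as a statement with no effect.
 *
 *     #define pmd_thread_setaffinity_cpu(c) (0)
 */

/* Inline stub: a discarded function return value does not trigger
 * that diagnostic.  The 'int' parameter type is assumed. */
static inline int pmd_thread_setaffinity_cpu(int cpu)
{
    (void) cpu;
    return 0;
}

/* Hypothetical caller that uses the stub as a plain statement. */
static int start_pmd_thread(int cpu)
{
    pmd_thread_setaffinity_cpu(cpu);
    return cpu;
}
```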

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
10 years agolooks like the package now produces files in /usr/lib
Thierry Parmentelat [Mon, 24 Mar 2014 08:35:43 +0000 (09:35 +0100)]
looks like the package now produces files in /usr/lib

10 years agonetdev-dpdk: Use multiple core for dpdk IO.
Pravin [Fri, 21 Mar 2014 05:07:44 +0000 (22:07 -0700)]
netdev-dpdk: Use multiple core for dpdk IO.

DPDK needs _lcore_id to be set in order to use multiple cores.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
10 years agodpif-netdev: Add DPDK netdev.
Pravin [Tue, 25 Mar 2014 02:23:08 +0000 (19:23 -0700)]
dpif-netdev: Add DPDK netdev.

This patch adds a DPDK netdev-class to the userspace datapath. Now
OVS can use a DPDK port for IO by just configuring the DPDK port and then
adding a dpdk type port to the userspace datapath.

Refer to the INSTALL.DPDK doc for further info.

This is based on a patch from Gerald Rogers.

Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
10 years agoutils: Introduce xsleep for RCU quiescent state
Pravin [Fri, 21 Mar 2014 16:20:42 +0000 (09:20 -0700)]
utils: Introduce xsleep for RCU quiescent state

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofpbuf: Add OFPBUF_DPDK type.
Pravin [Thu, 20 Mar 2014 18:00:14 +0000 (11:00 -0700)]
ofpbuf: Add OFPBUF_DPDK type.

This will be used by DPDK for zero copy IO.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agonetdev: Add support multiqueue recv.
Pravin [Fri, 21 Mar 2014 03:52:06 +0000 (20:52 -0700)]
netdev: Add support multiqueue recv.

New netdev types like DPDK can support multi-queue IO. This
patch adds support for that.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
10 years agonetdev: Rename netdev_rx to netdev_rxq
Pravin [Fri, 21 Mar 2014 02:38:14 +0000 (19:38 -0700)]
netdev: Rename netdev_rx to netdev_rxq

Preparation for multi queue netdev IO.  There are no functional changes
in this patch.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
10 years agodpif-netdev: Add poll-mode-device thread.
Pravin [Thu, 20 Mar 2014 17:57:41 +0000 (10:57 -0700)]
dpif-netdev: Add poll-mode-device thread.

This patch adds a PMD type netdev for network devices with poll-mode
drivers.  Since there is no way to get a signal on packet receive
from these devices, we need to poll them in a busy loop.  To minimize
system call overhead, this patch uses the dpif-thread exclusively
for PMD devices, and the rest of the devices, which need system calls to
do IO, are moved to dpif-netdev-run().
PMD devices like DPDK work in userspace, so there is no system call
overhead for them.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
10 years agodpif-netdev: Add ref-counting for port.
Pravin [Thu, 20 Mar 2014 17:57:19 +0000 (10:57 -0700)]
dpif-netdev: Add ref-counting for port.

The DPDK poll mode thread needs to keep a ref to the dpif-port.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
10 years agonetdev: Send ofpbuf directly to netdev.
Pravin [Thu, 20 Mar 2014 17:56:51 +0000 (10:56 -0700)]
netdev: Send ofpbuf directly to netdev.

The DPDK netdev needs to access the ofpbuf while sending a buffer. This
patch changes netdev_send accordingly.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
10 years agonetdev: Extend rx_recv to pass multiple packets.
Pravin [Thu, 20 Mar 2014 17:54:37 +0000 (10:54 -0700)]
netdev: Extend rx_recv to pass multiple packets.

DPDK can receive multiple packets at once, but the current netdev API does
not allow that.  This patch allows dpif-netdev to receive a batch
of packets in a single rx_recv() call for any netdev port.  This will be
used by dpdk-netdev.
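
The batched receive interface can be pictured roughly like this.  It is a toy model, not the netdev API: the fake receive queue, the BATCH_MAX limit, and the function signature are all assumptions for the example.

```c
#include <assert.h>
#include <errno.h>

#define BATCH_MAX 32            /* Hypothetical per-call packet limit. */

struct packet {
    int id;
};

/* Toy receive queue backed by a preloaded array of packets. */
struct fake_rxq {
    struct packet *ring;
    int n_ring;
    int next;
};

/* Stores up to BATCH_MAX packets in 'pkts' and their count in '*cnt'.
 * Returns 0 on success, or EAGAIN if no packet is available. */
static int rxq_recv(struct fake_rxq *rxq, struct packet **pkts, int *cnt)
{
    int n = 0;

    assert(rxq->next <= rxq->n_ring);
    while (n < BATCH_MAX && rxq->next < rxq->n_ring) {
        pkts[n++] = &rxq->ring[rxq->next++];
    }
    *cnt = n;
    return n ? 0 : EAGAIN;
}
```

A caller loops over the returned array once per call instead of paying the per-call overhead for every packet.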

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Avoid assigning a NULL pointer to flow actions.
Jarno Rajahalme [Tue, 25 Mar 2014 00:34:48 +0000 (17:34 -0700)]
datapath: Avoid assigning a NULL pointer to flow actions.

Flow SET can accept an empty set of actions, with the intended
semantics of leaving existing actions unmodified.  This seems to have
been broken after OVS 1.7, as we have assigned the flow's actions
pointer to NULL in this case, but we never check for the NULL pointer
later on.  This patch restores the intended behavior and documents it
in the include/linux/openvswitch.h.
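
The intended semantics can be sketched as follows.  The types and function are hypothetical simplifications, not the datapath code; NULL stands in for "no actions supplied in the request".

```c
#include <assert.h>
#include <stddef.h>

struct flow_actions {
    size_t len;
};

struct flow {
    struct flow_actions *actions;
};

/* Applies a SET request.  When the request carries no actions
 * ('new_actions' is NULL), the existing actions are left in place
 * instead of being overwritten with a NULL pointer. */
static void flow_set(struct flow *flow, struct flow_actions *new_actions)
{
    assert(flow != NULL);
    if (new_actions) {
        flow->actions = new_actions;
    }
}
```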

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Remove a debugging message.
Jarno Rajahalme [Tue, 25 Mar 2014 00:34:48 +0000 (17:34 -0700)]
datapath: Remove a debugging message.

This was left in accidentally.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agolib/flow: Fix flow_hash_5tuple().
Jarno Rajahalme [Tue, 25 Mar 2014 00:34:48 +0000 (17:34 -0700)]
lib/flow: Fix flow_hash_5tuple().

First part of the hash was discarded as basis was used too late.

Also be explicit about the input type expected by mhash_add().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
10 years agopackets: packet metadata from flow function instead of macro.
Gurucharan Shetty [Fri, 21 Mar 2014 17:36:52 +0000 (10:36 -0700)]
packets: packet metadata from flow function instead of macro.

Commit 03fbdf8d9c80a (lib/flow: Retain ODPP_NONE on flow_extract())
replaced packet metadata initialization function by a macro.
Visual studio does not like nested structure initialization that
is done in that macro.

This commit replaces the macro by a function.

CC: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agodatapath: Compact sw_flow_key.
Jarno Rajahalme [Tue, 18 Mar 2014 23:32:45 +0000 (16:32 -0700)]
datapath: Compact sw_flow_key.

Minimize padding in sw_flow_key and move 'tp' to the main struct.
These changes simplify code when accessing the transport port numbers
and the tcp flags, and make the sw_flow_key 8 bytes smaller on 64-bit
systems (128->120 bytes).  These changes also make the keys for IPv4
packets fit in one cache line.

There is a valid concern for safety of packing the struct
ovs_key_ipv4_tunnel, as it would be possible to take the address of
the tun_id member as a __be64 * which could result in unaligned access
in some systems. However:

- sw_flow_key itself is 64-bit aligned, so the tun_id within is always
  64-bit aligned.
- We never make arrays of ovs_key_ipv4_tunnel (which would force every
  second tun_key to be misaligned).
- We never take the address of the tun_id into a __be64 *.
- Wherever we use struct ovs_key_ipv4_tunnel outside the sw_flow_key,
  it is on the stack (in tunnel input functions), where the compiler has
  full control of the alignment.
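
The trade-off can be illustrated with a reduced example.  The struct layouts below are invented for the sketch and are not the real ovs_key_ipv4_tunnel or sw_flow_key; the attributes are GCC/Clang extensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: on typical 64-bit ABIs the 64-bit member forces
 * 8-byte alignment, so the 20 bytes of fields are padded to 24. */
struct tunnel_key_padded {
    uint64_t tun_id;
    uint32_t src;
    uint32_t dst;
    uint16_t flags;
    uint8_t tos;
    uint8_t ttl;
};

/* Packed layout with 4-byte alignment: exactly 20 bytes. */
struct tunnel_key_packed {
    uint64_t tun_id;
    uint32_t src;
    uint32_t dst;
    uint16_t flags;
    uint8_t tos;
    uint8_t ttl;
} __attribute__((packed, aligned(4)));

/* The enclosing key is 64-bit aligned and holds the tunnel key first,
 * so 'tun_id' inside it is still 64-bit aligned. */
struct flow_key {
    struct tunnel_key_packed tun_key;
    uint32_t in_port;
} __attribute__((aligned(8)));

/* Access by member, never through a separate uint64_t pointer, so the
 * compiler always knows the alignment it may assume. */
static uint64_t flow_key_tun_id(const struct flow_key *key)
{
    assert(key != NULL);
    return key->tun_key.tun_id;
}
```

Here the packed tunnel key leaves room for 'in_port' in what would otherwise be padding, which is the kind of saving the commit describes.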

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoofp-util: Fix inconsistencies in table features type.
Jarno Rajahalme [Mon, 24 Mar 2014 17:16:03 +0000 (10:16 -0700)]
ofp-util: Fix inconsistencies in table features type.

'metadata_match' and 'metadata_write' fields are defined as ovs_be64,
but sometimes used and referred to as uint64_t.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agodatapath: Use TCP flags in the flow key for stats.
Jarno Rajahalme [Tue, 18 Mar 2014 23:32:45 +0000 (16:32 -0700)]
datapath: Use TCP flags in the flow key for stats.

We already extract the TCP flags for the key, might as well use that
for stats.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Fix output of SCTP mask.
Jarno Rajahalme [Tue, 18 Mar 2014 23:32:45 +0000 (16:32 -0700)]
datapath: Fix output of SCTP mask.

The 'output' argument of the ovs_nla_put_flow() is the one from which
the bits are written to the netlink attributes.  For SCTP we
accidentally used the bits from the 'swkey' instead.  This caused the
mask attributes to include the bits from the actual flow key instead
of the mask.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoofp-util: Implement OFPMP_TABLE_FEATURES decoding and printing.
Alexander Wu [Mon, 24 Mar 2014 06:20:04 +0000 (23:20 -0700)]
ofp-util: Implement OFPMP_TABLE_FEATURES decoding and printing.

Signed-off-by: Alexander Wu <alexander.wu@huawei.com>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>