Claudio-Daniel Freire [Sun, 2 Oct 2011 19:10:58 +0000 (21:10 +0200)]
Merge with head
Claudio-Daniel Freire [Sun, 2 Oct 2011 19:08:52 +0000 (21:08 +0200)]
Robustness improvements:
- avoid using hostname all the time, it's sensitive to DNS glitches.
- kill by only pid, pid+ppid does not always work (especially with sudo)
Alina Quereilhac [Sun, 2 Oct 2011 15:17:06 +0000 (17:17 +0200)]
continuing overlay testing
Alina Quereilhac [Sun, 2 Oct 2011 13:38:14 +0000 (15:38 +0200)]
examples/wireless_overlay.py
Claudio-Daniel Freire [Sun, 2 Oct 2011 08:53:02 +0000 (10:53 +0200)]
Fix rpmFusion installation: needs sudo
Claudio-Daniel Freire [Sun, 2 Oct 2011 06:12:24 +0000 (08:12 +0200)]
Switch to using IPs for inter-node communication, DNS sometimes fails (and with 100 nodes, sometimes = always)
Claudio-Daniel Freire [Sun, 2 Oct 2011 00:17:31 +0000 (02:17 +0200)]
Fix slave retrials: was cleaning up, deleting keys, but it still needed them.
Claudio-Daniel Freire [Sun, 2 Oct 2011 00:17:08 +0000 (02:17 +0200)]
Fix multicast forwarder when no router is present.
Claudio-Daniel Freire [Sun, 2 Oct 2011 00:16:45 +0000 (02:16 +0200)]
Add some traces to allow debugging
Alina Quereilhac [Sat, 1 Oct 2011 21:37:07 +0000 (23:37 +0200)]
changed examples/wireless_overlay.py
Claudio-Daniel Freire [Sat, 1 Oct 2011 21:04:58 +0000 (23:04 +0200)]
Missing return
Claudio-Daniel Freire [Sat, 1 Oct 2011 20:25:26 +0000 (22:25 +0200)]
Allow not connecting a router
Claudio-Daniel Freire [Sat, 1 Oct 2011 20:23:45 +0000 (22:23 +0200)]
Forward all packets to the default multicast egress when a MulticastForwarder is not connected to a router.
Claudio-Daniel Freire [Sat, 1 Oct 2011 18:46:08 +0000 (20:46 +0200)]
Lower sound quality too
Claudio-Daniel Freire [Sat, 1 Oct 2011 18:26:31 +0000 (20:26 +0200)]
Merge with head
Claudio-Daniel Freire [Sat, 1 Oct 2011 18:25:50 +0000 (20:25 +0200)]
Low-bitrate version of the big buck bunny video
Claudio-Daniel Freire [Sat, 1 Oct 2011 18:25:19 +0000 (20:25 +0200)]
Avoid deadlocks
Alina Quereilhac [Sat, 1 Oct 2011 18:08:55 +0000 (20:08 +0200)]
wireless_overlay.py is now working with unicast vlc stream
Claudio-Daniel Freire [Sat, 1 Oct 2011 08:51:13 +0000 (10:51 +0200)]
Do not re-install keys on retrials (they're no longer available)
Claudio-Daniel Freire [Sat, 1 Oct 2011 08:43:36 +0000 (10:43 +0200)]
Disable Nagle's algorithm to decrease TCP tunnel delay
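
Note: a minimal sketch of what disabling Nagle's algorithm usually looks like in Python, assuming the tunnel endpoint is an ordinary TCP socket. The helper name make_low_latency_socket is hypothetical and does not come from the project's code.

```python
import socket

def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY tells the stack to send small segments immediately instead
    # of coalescing them, trading some bandwidth efficiency for lower latency.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

For a tunnel that carries many small packets, this avoids the extra delay of waiting for outstanding ACKs before sending the next small write.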
Claudio-Daniel Freire [Sat, 1 Oct 2011 08:24:02 +0000 (10:24 +0200)]
- Detect SSH misconfigurations in PL nodes
- Retry slaves when possible
Claudio-Daniel Freire [Sat, 1 Oct 2011 08:23:19 +0000 (10:23 +0200)]
Fix TUN<->TunChannel connection in netns: must set TunChan in non-ethernet mode
Alina Quereilhac [Sat, 1 Oct 2011 07:33:35 +0000 (09:33 +0200)]
examples/wireless_overlay.py still not working
Claudio-Daniel Freire [Fri, 30 Sep 2011 23:27:23 +0000 (20:27 -0300)]
Attempt at fixing TCP tunnels. Still in need of extensive testing (passed tests at least once though)
Claudio-Daniel Freire [Fri, 30 Sep 2011 22:08:58 +0000 (19:08 -0300)]
Ping flood test (exposing a serious TCP tunnel bug)
Claudio-Daniel Freire [Fri, 30 Sep 2011 14:59:44 +0000 (11:59 -0300)]
Fix backwards condition in IP header inspection for TCP tunnels
Claudio-Daniel Freire [Fri, 30 Sep 2011 06:26:47 +0000 (08:26 +0200)]
Merge with head
Claudio-Daniel Freire [Fri, 30 Sep 2011 06:25:59 +0000 (08:25 +0200)]
Make application deployment more robust.
New ways of detecting bad nodes: ping build master, if unreachable, this is bad in many ways, so blacklist
Claudio-Daniel Freire [Fri, 30 Sep 2011 06:25:26 +0000 (08:25 +0200)]
Switch SystemRandom with os.urandom
Claudio-Daniel Freire [Fri, 30 Sep 2011 06:24:42 +0000 (08:24 +0200)]
Fix unassign node, was not clearing attributes as it was supposed to
Alina Quereilhac [Thu, 29 Sep 2011 21:32:59 +0000 (23:32 +0200)]
Continuing wireless overlay ...
Claudio-Daniel Freire [Thu, 29 Sep 2011 01:19:07 +0000 (03:19 +0200)]
Merge with myself
Claudio-Daniel Freire [Thu, 29 Sep 2011 01:05:28 +0000 (03:05 +0200)]
Fix routing for disconnected interfaces
Claudio-Daniel Freire [Wed, 28 Sep 2011 22:46:08 +0000 (19:46 -0300)]
Several fixes:
- Add specific hostname to application deployment logging
- Use os.urandom instead of random.SystemRandom (more appropriate)
- Fix fuckminsterfülerene-style deadlock in spanning deployment by including
the application class when defining deployment groups. This ensures
ordered dependencies (which are implemented as thread synchronization),
and avoids deadlocks.
Claudio-Daniel Freire [Wed, 28 Sep 2011 12:05:32 +0000 (14:05 +0200)]
Merge with head
Claudio-Daniel Freire [Wed, 28 Sep 2011 12:04:37 +0000 (14:04 +0200)]
Parallelize more aggressively during liveliness tests
Claudio-Daniel Freire [Wed, 28 Sep 2011 12:04:16 +0000 (14:04 +0200)]
Use persistent connections only when supported
Claudio-Daniel Freire [Wed, 28 Sep 2011 12:03:41 +0000 (14:03 +0200)]
Check broken hosts when deploying Yum dependencies - some lack connectivity or have HD failures
Alina Quereilhac [Tue, 27 Sep 2011 13:10:59 +0000 (15:10 +0200)]
examples/wireless_overlay.py
Alina Quereilhac [Tue, 27 Sep 2011 13:03:50 +0000 (15:03 +0200)]
wireless_overlay is working. Only the vlc part is missing
Claudio-Daniel Freire [Tue, 27 Sep 2011 02:03:18 +0000 (04:03 +0200)]
Merge with head
Claudio-Daniel Freire [Tue, 27 Sep 2011 02:02:34 +0000 (04:02 +0200)]
Parallelize node liveliness tests
Claudio-Daniel Freire [Tue, 27 Sep 2011 02:02:14 +0000 (04:02 +0200)]
Limit ControlPath's length (it's got a rather small limit)
Claudio-Daniel Freire [Tue, 27 Sep 2011 02:01:25 +0000 (04:01 +0200)]
Filter blacklisted nodes in util.getNodes
Claudio-Daniel Freire [Tue, 27 Sep 2011 02:00:59 +0000 (04:00 +0200)]
Make multicall threadsafe
Claudio-Daniel Freire [Tue, 27 Sep 2011 02:00:08 +0000 (04:00 +0200)]
Notify the underlying reason when a node is UNRESPONSIVE (it's not always just the slice not being created)
Claudio-Daniel Freire [Tue, 27 Sep 2011 01:57:07 +0000 (03:57 +0200)]
Allow sporadic failures while polling application status
Alina Quereilhac [Mon, 26 Sep 2011 11:04:36 +0000 (13:04 +0200)]
merge
Alina Quereilhac [Mon, 26 Sep 2011 11:04:00 +0000 (13:04 +0200)]
bug fixes and wireless overlay experiment
Claudio-Daniel Freire [Sun, 25 Sep 2011 20:26:46 +0000 (22:26 +0200)]
Merge with head
Claudio-Daniel Freire [Sun, 25 Sep 2011 20:26:30 +0000 (22:26 +0200)]
PL utilities, useful for experiment designers
Claudio-Daniel Freire [Sun, 25 Sep 2011 20:26:12 +0000 (22:26 +0200)]
Use system.multicall to accelerate batch API calls
Alina Quereilhac [Sun, 25 Sep 2011 17:20:10 +0000 (19:20 +0200)]
wireless overlay example with ns3 and netns
Claudio-Daniel Freire [Sat, 24 Sep 2011 07:23:08 +0000 (09:23 +0200)]
Tons of SSH improvements:
- Use TCP Keepalives to immediately sense broken connections
- Use SSH Keepalives to detect tampered connections
- Use persistent connections to speed up batch commands considerably
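
Note: the three SSH items above map naturally onto standard OpenSSH client options. The sketch below only assembles an illustrative command line; the function name ssh_command, the control directory, and the timing values are assumptions, not the project's actual settings.

```python
import os

def ssh_command(host, control_dir, keepalive_secs=30):
    """Build an ssh argv list with keepalives and connection sharing (illustrative)."""
    # %r, %h and %p are expanded by ssh itself (remote user, host, port).
    control_path = os.path.join(control_dir, "%r@%h:%p")
    return [
        "ssh",
        "-o", "TCPKeepAlive=yes",                          # TCP-level keepalives
        "-o", "ServerAliveInterval=%d" % keepalive_secs,   # SSH-level keepalives
        "-o", "ServerAliveCountMax=3",
        "-o", "ControlMaster=auto",                        # reuse one connection
        "-o", "ControlPath=%s" % control_path,
        "-o", "ControlPersist=60",
        host,
    ]
```

Keeping the control directory short matters, because ControlPath names a Unix-domain socket and such paths have a small maximum length.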
Claudio-Daniel Freire [Sat, 24 Sep 2011 07:21:20 +0000 (09:21 +0200)]
Trap errors in dropped packet trace dumps - no need to break the whole overlay if something goes wrong there
Claudio-Daniel Freire [Sat, 24 Sep 2011 07:20:32 +0000 (09:20 +0200)]
Give the PLC API some time to recover when retrying
Alina Quereilhac [Thu, 22 Sep 2011 16:46:36 +0000 (18:46 +0200)]
forgot the wimax overlay test
Alina Quereilhac [Thu, 22 Sep 2011 16:45:41 +0000 (18:45 +0200)]
wimax support.. still ongoing...
Alina Quereilhac [Wed, 21 Sep 2011 13:09:38 +0000 (15:09 +0200)]
fixing wimax in ns3
Claudio-Daniel Freire [Mon, 19 Sep 2011 07:03:03 +0000 (09:03 +0200)]
Fix timeout option spec
Claudio-Daniel Freire [Mon, 19 Sep 2011 06:08:45 +0000 (08:08 +0200)]
Better network failure recovery: added some retries on connection error in application, added ssh timeout with automatic retry on timeout, in case of connection glitches
Claudio-Daniel Freire [Mon, 19 Sep 2011 06:07:47 +0000 (08:07 +0200)]
Fix a few corner-case bugs in resource allocation
Claudio-Daniel Freire [Mon, 19 Sep 2011 06:07:17 +0000 (08:07 +0200)]
Enable min/max cpu/load, forgot to do so in metadata when they were added
Claudio-Daniel Freire [Mon, 19 Sep 2011 01:02:20 +0000 (03:02 +0200)]
Ignore errors on yum cleanup, not really important
Claudio-Daniel Freire [Sun, 18 Sep 2011 23:12:08 +0000 (01:12 +0200)]
Retry operations on networking errors. Really common from wan
Claudio-Daniel Freire [Sun, 18 Sep 2011 23:11:40 +0000 (01:11 +0200)]
Fix in node rating
Claudio-Daniel Freire [Fri, 16 Sep 2011 03:41:21 +0000 (05:41 +0200)]
Fix missing variable in classqueue
Claudio-Daniel Freire [Fri, 16 Sep 2011 03:40:53 +0000 (05:40 +0200)]
Don't silence important errors
Claudio-Daniel Freire [Wed, 14 Sep 2011 04:57:40 +0000 (06:57 +0200)]
Merge with head
Claudio-Daniel Freire [Wed, 14 Sep 2011 04:57:25 +0000 (06:57 +0200)]
Make PlanetLab select lightly loaded nodes when given the chance (ie, when more candidates than necessary are available)
Alina Quereilhac [Sun, 11 Sep 2011 12:33:31 +0000 (14:33 +0200)]
A little more on ExperimentSuite
Alina Quereilhac [Sun, 11 Sep 2011 11:12:20 +0000 (13:12 +0200)]
ExperimentSuite still not working...
Alina Quereilhac [Sat, 10 Sep 2011 17:48:17 +0000 (19:48 +0200)]
experimentsuite test working
Alina Quereilhac [Sat, 10 Sep 2011 13:49:39 +0000 (15:49 +0200)]
more on experiment suite
Alina Quereilhac [Fri, 9 Sep 2011 11:50:30 +0000 (13:50 +0200)]
ExperimentSuite structure. Still not working, missing proxy and tests.
Claudio-Daniel Freire [Fri, 9 Sep 2011 05:24:43 +0000 (07:24 +0200)]
Fix logging
Claudio-Daniel Freire [Fri, 9 Sep 2011 05:21:54 +0000 (07:21 +0200)]
Attempt at fixing NS3 in PL:
- Fix in shutdown order (again)
- Do not check tun_port when raising tun channels, FD channels have no port and it's ok
Claudio-Daniel Freire [Thu, 8 Sep 2011 10:52:18 +0000 (12:52 +0200)]
Make servers able to launch when a stale ctrl.sock from a previous server remains.
ctrl.sock sockets are usually left behind when servers are killed or die, since they're not cleaned up automatically by the OS like other sockets
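
Note: one common way to cope with a leftover Unix-domain socket file is to probe it before binding, and remove it only if nothing is listening. This is a sketch under that assumption, not necessarily how the commit does it; bind_control_socket is a hypothetical name.

```python
import os
import socket

def bind_control_socket(path):
    """Bind a Unix-domain server socket at `path`, removing a stale file first."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except FileNotFoundError:
        pass  # no leftover file, nothing to clean up
    except ConnectionRefusedError:
        # The file exists but no process is listening: it is stale.
        os.unlink(path)
    else:
        # Something accepted the connection, so a live server owns the path.
        raise RuntimeError("a server is already listening on %s" % path)
    finally:
        probe.close()

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    return server
```

Probing first keeps a second server from deleting the socket of one that is still running.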
Claudio-Daniel Freire [Thu, 8 Sep 2011 10:51:03 +0000 (12:51 +0200)]
Fix metadata bug: tun_cipher should also be flagged as META, make tun_cipher in ns3's fdnd only support PLAIN cipher
Claudio-Daniel Freire [Thu, 8 Sep 2011 10:49:34 +0000 (12:49 +0200)]
Fix shutdown order to respect creation order (important when running nepi-in-nepi)
Claudio-Daniel Freire [Wed, 7 Sep 2011 21:52:11 +0000 (23:52 +0200)]
Do not use shell=True with Popen, some distros use dash, we need bash.
Claudio-Daniel Freire [Wed, 7 Sep 2011 18:27:38 +0000 (20:27 +0200)]
Escape quotes as well - it's not always OK to leave them unquoted
Claudio-Daniel Freire [Wed, 7 Sep 2011 02:45:17 +0000 (04:45 +0200)]
Fix NS3: --enable-threading no longer valid or needed
Claudio-Daniel Freire [Wed, 7 Sep 2011 02:44:47 +0000 (04:44 +0200)]
Fix TUN shutdown: waitkill was not effective because of a faulty if_alive
Claudio-Daniel Freire [Tue, 6 Sep 2011 18:06:40 +0000 (20:06 +0200)]
Fix sudo in popen_python code
Claudio-Daniel Freire [Mon, 5 Sep 2011 15:17:45 +0000 (17:17 +0200)]
Make sure proxies load the right version of nepi in case multiple ones are installed.
Claudio-Daniel Freire [Mon, 5 Sep 2011 01:20:01 +0000 (03:20 +0200)]
Fix NO_PI detection in netns
Claudio-Daniel Freire [Mon, 5 Sep 2011 01:19:40 +0000 (03:19 +0200)]
Wait for SERVER_READY or PROXY_READ, instead of expecting it as the first line.
Allows spurious stderr output in environment_setup code (happens in OpenSUSE)
Claudio-Daniel Freire [Mon, 5 Sep 2011 01:18:33 +0000 (03:18 +0200)]
Fix testbed recovery after bad merge with TCP handshake stuff
Claudio-Daniel Freire [Mon, 5 Sep 2011 01:17:46 +0000 (03:17 +0200)]
Fix testbed proxy serialization in the presence of missing values (ie: defaults or None)
Claudio-Daniel Freire [Sun, 4 Sep 2011 17:31:46 +0000 (19:31 +0200)]
Merge with HEAD, close aly's branch.
Claudio-Daniel Freire [Sun, 4 Sep 2011 17:30:41 +0000 (19:30 +0200)]
Fix metadata breakage from recent commit
Claudio-Daniel Freire [Sun, 4 Sep 2011 17:27:37 +0000 (19:27 +0200)]
Merge TCP handshake stuff
Claudio-Daniel Freire [Sun, 4 Sep 2011 16:54:21 +0000 (18:54 +0200)]
Merge non-handshake stuff
Alina Quereilhac [Sun, 4 Sep 2011 13:20:53 +0000 (15:20 +0200)]
log "Connected" after successful handshake in tunchannel_impl.py
Alina Quereilhac [Fri, 2 Sep 2011 10:19:31 +0000 (12:19 +0200)]
added Tun device for netns
Claudio-Daniel Freire [Wed, 31 Aug 2011 18:08:38 +0000 (20:08 +0200)]
WORKING WORKING WOOOHOOO!!!!!!!!!
I'm outa here... must... get... drunk...
Alina Quereilhac [Wed, 31 Aug 2011 17:38:38 +0000 (19:38 +0200)]
tcp_handshake works!
Alina Quereilhac [Wed, 31 Aug 2011 16:22:55 +0000 (18:22 +0200)]
udp and gre are working. tcp_handshake is not working yet.
Claudio-Daniel Freire [Wed, 31 Aug 2011 12:57:44 +0000 (14:57 +0200)]
Multicast fixes