%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% NEPI, a framework to manage network experiments
% Copyright (C) 2013 INRIA
%
% This program is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License version 2 as
% published by the Free Software Foundation;
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program. If not, see <http://www.gnu.org/licenses/>.
%
% Author: Alina Quereilhac <alina.quereilhac@inria.fr>
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

During the past decades, a wide variety of platforms for conducting network
experiments, including simulators, emulators and live testbeds,
have been made available to the research community.
Some of these platforms are tailored to very specific use cases (e.g.
PlanetLab for very realistic Internet application-level scenarios),
while others support more generic ones (e.g. ns-3 for controllable
and repeatable experimentation). Nevertheless, no single platform is
able to satisfy all possible scenarios, and so researchers often rely
on different platforms to evaluate their ideas.

Given the huge diversity of available platforms, a large disparity is to
be expected in the way experiments are carried out from one platform to
another. Indeed, different platforms provide their own mechanisms to
access resources and different tools to conduct experiments.
These tools vary widely. For instance, to run an ns-3 simulation it is
necessary to write a C++ program, while to conduct an experiment using
PlanetLab nodes, one must first provision resources through a special web
service, and then connect to the nodes using SSH to launch any applications
involved in the experiment.

Mastering such a diversity of tools can be a daunting task,
but the complexity of conducting network experiments is not limited
to having to master different tools and services.
Designing and implementing the programs and scripts to run an experiment
can be a time-consuming and difficult task, especially if distributed
resources need to be synchronized to perform the right action at the
right time. Detecting and handling possible errors during experiment
execution also poses a challenge, even more so when dealing with large
experiments. Additionally, difficulties related to instrumenting the
experiment and gathering the results must also be considered.

In this context, the challenges that NEPI addresses are manifold.
First, to reduce the complexity of running network experiments.
Second, to simplify the use of different experimentation platforms,
making it easy to switch from one to another.
Third, to simplify the
use of resources from different platforms at the same time in
a single experiment.

The approach proposed by NEPI consists of exposing a generic API
that researchers can use to \emph{program} experiments, and
providing the libraries that can execute those experiments on
target network experimentation platforms. The API shields
researchers from the details required to actually run an experiment
on a given platform, while the libraries provide the code to
automatically perform the steps necessary to deploy the experiment.

The API is generic enough to describe potentially any
type of experiment, while the architecture of the libraries was
designed to be extensible to support arbitrary platforms.
As a consequence, any new platform can be supported in
NEPI without changing the API, in a way that is transparent
to the user.

\section{Experiment Description}

NEPI represents experiments as graphs of interconnected resources.
A resource is an abstraction of any component that takes part in an
experiment and that can be controlled by NEPI.
It can be a software or hardware component, such as a virtual
machine, a switch, a remote application process or a sensor node.

Resources in NEPI are described by a set of attributes, traces and
connections. The attributes define the configuration of the resource,
the traces represent the results that can be collected for that resource
during the experiment, and the connections represent how a resource relates
to other resources in the experiment.

\begin{figure}
  \centering
  \includegraphics[width=0.5\textwidth]{intro_resource}
  \caption{Properties of a resource of type LinuxApplication}
  \label{fig:intro_resources}
\end{figure}

Examples of attributes are a Linux hostname, an IP address to be
assigned to a network interface, or a command to run as a remote application.
Examples of traces are the standard output or standard error of a
running application, or a tcpdump capture on a network interface.

Resources are also associated with a type (e.g. a Linux host,
a Tap device on PlanetLab, an application running on a Linux host, etc).
Different types of resources expose different attributes and traces
and can be connected to other specific types (e.g. a resource representing
a wireless channel can have an attribute SSID and be connected to a
Linux interface but not directly to a Linux host resource).
Figure \ref{fig:intro_resources} exemplifies this concept.
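
The resource model described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the \texttt{Resource}
class, its field names, the type names and the \texttt{ALLOWED} table are
hypothetical and do not reproduce NEPI's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical table of which resource types may be connected to which.
# The type names are illustrative, not NEPI's actual identifiers.
ALLOWED = {
    ("WirelessChannel", "LinuxInterface"),
    ("LinuxInterface", "LinuxHost"),
    ("LinuxApplication", "LinuxHost"),
}


# eq=False keeps identity-based comparison, which is what we want for
# objects that reference each other through their connection lists.
@dataclass(eq=False)
class Resource:
    rtype: str                                        # resource type
    attributes: dict = field(default_factory=dict)    # configuration
    traces: set = field(default_factory=set)          # collectable results
    connections: list = field(default_factory=list)   # related resources

    def connect(self, other: "Resource") -> None:
        """Connect two resources if their types are compatible."""
        pair = (self.rtype, other.rtype)
        if pair not in ALLOWED and pair[::-1] not in ALLOWED:
            raise ValueError(f"cannot connect {self.rtype} to {other.rtype}")
        self.connections.append(other)
        other.connections.append(self)


channel = Resource("WirelessChannel", attributes={"ssid": "demo"})
iface = Resource("LinuxInterface", traces={"tcpdump"})
host = Resource("LinuxHost", attributes={"hostname": "host.example.org"})

channel.connect(iface)   # allowed: channel <-> interface
iface.connect(host)      # allowed: interface <-> host
```

In this sketch, calling \texttt{channel.connect(host)} raises a
\texttt{ValueError}, mirroring the rule that a wireless channel cannot be
connected directly to a host.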

There are two types of connections between resources. The
first one is used to define the \emph{topology graph} of the experiment.
This graph provides information about which resources will interact
with which other resources during the experiment
(e.g. application A should run on host B, and host B will be connected
to wireless channel D through a network interface C).
Figure \ref{fig:intro_topo_graph} shows a topology graph
describing an experiment.

\begin{figure}
  \centering
  \includegraphics[width=0.8\textwidth]{intro_topo_graph}
  \caption{A topology graph representation of an abstract experiment}
  \label{fig:intro_topo_graph}
\end{figure}

The second type of connection (called a condition to differentiate it
from the first type) specifies the \emph{dependencies graph}.
This graph is optional and imposes constraints on the experiment
workflow, that is, the order in which different events occur during the
experiment. For instance, as depicted in Figure \ref{fig:intro_dependencies_graph},
a condition on the experiment could specify that
a server application has to start before a client application does, or that
a network interface needs to be stopped (go down) at a certain time after
the beginning of the experiment.

\begin{figure}
  \centering
  \includegraphics[width=0.8\textwidth]{intro_dependencies_graph}
  \caption{A dependencies graph representation involving two application
  resources in an experiment}
  \label{fig:intro_dependencies_graph}
\end{figure}

It is important to note that the \emph{topology graph} also defines
implicit and compulsory workflow constraints
(e.g. if an application is \emph{topologically} connected to a host,
the host will always need to be up and running before the application
can run on it).
The difference is that the \emph{dependencies graph} adds complementary
constraints specified by the user, related to the behavior of the
experiment.
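
One way to picture how the two kinds of constraints combine is to merge them
into a single precedence graph and compute a valid start order. The sketch
below does this with the standard \texttt{graphlib} module (Python 3.9 or
later). The resource names and the \texttt{start\_order} helper are
hypothetical, and NEPI's internal scheduler may work quite differently.

```python
from graphlib import TopologicalSorter

# Implicit constraints from the topology graph: each resource maps to the
# resources it is attached to, which must be up and running first.
topology = {
    "server_app": {"host_b"},
    "client_app": {"host_a"},
    "iface_c": {"host_b"},
}

# Explicit, user-defined conditions from the dependencies graph:
# the client may only start once the server has started.
conditions = {
    "client_app": {"server_app"},
}


def start_order(topology, conditions):
    """Merge both constraint sets and return one valid start order."""
    merged = {}
    for graph in (topology, conditions):
        for resource, predecessors in graph.items():
            merged.setdefault(resource, set()).update(predecessors)
    # static_order() yields every node after all of its predecessors.
    return list(TopologicalSorter(merged).static_order())


order = start_order(topology, conditions)
```

Any order returned here places each host before the resources attached to it
and the server application before the client application. Several orders can
satisfy the constraints, so the exact sequence is not fixed.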

This technique for modeling experiments is generic enough to be used
to describe experiments involving resources from any experimentation
environment (e.g. testbeds, simulators, emulators, etc). However, it
does not by itself provide any information about how to actually deploy
and run an experiment using concrete resources.

\section{Experiment Life Cycle}

The Experiment Description by itself is not enough to conduct an experiment.
In order to run an experiment it is necessary to translate the description
into concrete actions and to perform these actions on the specific resources
taking part in the experiment. NEPI does this for the user in an automated
manner.

\begin{figure}
  \centering
  \includegraphics[width=0.8\textwidth]{intro_life_cycle}
  \caption{Common stages of a network experiment life cycle}
  \label{fig:intro_life_cycle}
\end{figure}

Given that different resources require performing actions in
different ways (e.g. deploying an application on
a Linux machine is different from deploying a mobile wireless robot),
NEPI abstracts the life cycle of resources into common stages associated
with generic actions, and allows different implementations of
these actions to be plugged in for different types of resources.
Figure \ref{fig:intro_life_cycle} shows the three
main stages of the network experiment life cycle, \emph{Deployment},
\emph{Control} and \emph{Result (collection)}, and the actions that are
involved in each of them.

\begin{figure}
  \centering
  \includegraphics[width=\textwidth]{intro_state_transitions}
  \caption{Resource state transitions}
  \label{fig:intro_state_transitions}
\end{figure}

In order to be able to control different types of resources in
a uniform way, NEPI assigns a generic state to each of these
actions and expects all resources to follow the same set of
state transitions during the experiment life. The states and
state transitions are depicted in Figure
\ref{fig:intro_state_transitions}.
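
The idea of a uniform set of state transitions can be illustrated with a small
state machine. The state names, action names and transition table below are
assumptions made for this sketch and are not taken from the figure. Only the
general principle, that each action moves a resource from one generic state to
the next, comes from the text.

```python
from enum import Enum, auto


class State(Enum):
    # Assumed state names, for illustration only.
    NEW = auto()
    READY = auto()
    STARTED = auto()
    STOPPED = auto()
    RELEASED = auto()


# Assumed transition table: action -> (required state, resulting state).
TRANSITIONS = {
    "deploy": (State.NEW, State.READY),
    "start": (State.READY, State.STARTED),
    "stop": (State.STARTED, State.STOPPED),
    "release": (State.STOPPED, State.RELEASED),
}


class ResourceLifeCycle:
    """Tracks the generic state of a single resource."""

    def __init__(self):
        self.state = State.NEW

    def apply(self, action):
        """Perform an action if the current state allows it."""
        required, result = TRANSITIONS[action]
        if self.state is not required:
            raise RuntimeError(
                f"cannot {action} from state {self.state.name}"
            )
        self.state = result
        return self.state


resource = ResourceLifeCycle()
resource.apply("deploy")
resource.apply("start")
```

Because every resource type follows the same table, a controller can drive
heterogeneous resources without knowing how each action is carried out.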

It is important to note that NEPI does not require these states
to be globally synchronized for all resources (e.g. resources
are not required to be all ready or started at the same time).
NEPI does not even require all resources to be declared and known
at the beginning of the experiment, making it possible to use
an \emph{interactive deployment} mode, where new resources can be
declared and deployed on the fly, according to the needs of the experiment.
This interactive mode can be useful to run experiments with the
purpose of exploring a new technology, or to use NEPI as an adaptive
experimentation tool that can change an experiment according to
external conditions or measurements.

\section{Resource Management: The EC \& The RMs}

The Experiment Controller (EC) is the entity that is responsible for
translating the Experiment Description into a running experiment.
It holds the \emph{topology} and \emph{dependencies} graphs, and it
exposes a generic experiment control API that the user can
invoke to deploy experiments, control resources and collect results.

\begin{figure}
  \centering
  \includegraphics[width=\textwidth]{intro_ec}
  \caption{User interacting with the Experiment Controller}
  \label{fig:intro_ec}
\end{figure}

As shown in Figure \ref{fig:intro_ec}, the user declares the resources and
their dependencies directly with the EC.
When the user requests the EC to deploy a certain resource or a
group of resources, the EC takes care of performing all the necessary
actions without further user intervention, including the sequencing of
actions to respect user-defined and topology-specific dependencies,
through internal scheduling mechanisms.

The EC is a generic entity responsible for the global orchestration of
the experiment. As such, it abstracts itself from the details of how to
control concrete resources and relies on other entities called Resource Managers
(RMs) to perform resource-specific actions.

For each resource that the user registers in the \emph{topology graph}, the EC
instantiates an RM of the corresponding type. An RM is a resource-specific
controller, and different types of resources require different types of
RMs, specifically adapted to manage them.
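
The relation between registered resources and RM instances can be sketched as
a simple type registry. All class and method names in this toy model are
hypothetical, chosen only to mirror the description above. It is not a copy of
NEPI's implementation.

```python
import itertools


class ResourceManager:
    """Base class for the toy RMs in this sketch."""
    rtype = "base"

    def __init__(self, guid):
        self.guid = guid


class HostRM(ResourceManager):
    rtype = "Host"


class ApplicationRM(ResourceManager):
    rtype = "Application"


class ExperimentController:
    """Toy controller that creates one RM per registered resource."""

    def __init__(self, rm_classes):
        # Map each resource type name to the RM class that manages it.
        self._registry = {cls.rtype: cls for cls in rm_classes}
        self._guids = itertools.count(1)
        self.rms = {}

    def register_resource(self, rtype):
        """Instantiate the RM matching the type and return its identifier."""
        if rtype not in self._registry:
            raise KeyError(f"no RM available for resource type {rtype!r}")
        guid = next(self._guids)
        self.rms[guid] = self._registry[rtype](guid)
        return guid


ec = ExperimentController([HostRM, ApplicationRM])
host = ec.register_resource("Host")
app = ec.register_resource("Application")
```

Supporting a new kind of resource in this model only requires passing one more
RM class to the controller, while the controller code stays unchanged.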

The EC communicates with the RMs through a well-defined API that exposes
the necessary methods (actions) to achieve all the state transitions defined by the
common resource life cycle. Each type of RM must provide a specific implementation
for each action and ensure that the correct state transition has been achieved
for the resource (e.g. upon invocation of the START action, the RM must take
the necessary steps to start the resource and set itself to state STARTED).
This decoupling between the EC and the RMs makes it possible to extend the
control capabilities of NEPI to arbitrary resources, as long as an RM can be
implemented to support them.

As an example, a testbed \emph{X} could allow host resources to be controlled using a
certain API X, which could be accessed via HTTP, XML-RPC, or any other protocol.
In order to allow NEPI to run experiments using this type of resource, it would
suffice to create a new RM of type host X, which extends the common RM API and
uses the API X to manage the resources.
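
The testbed X example can be made concrete with a short sketch. Everything
here is hypothetical: the \texttt{ResourceManager} base class stands in for
the common RM API, \texttt{FakeApiX} stands in for a client of testbed X, and
the method names \texttt{power\_on} and \texttt{power\_off} are invented for
illustration.

```python
from abc import ABC, abstractmethod


class ResourceManager(ABC):
    """Stand-in for the common RM API: generic actions set generic states."""

    def __init__(self):
        self.state = "NEW"

    def start(self):
        self.do_start()          # resource-specific steps
        self.state = "STARTED"   # generic state transition

    def stop(self):
        self.do_stop()
        self.state = "STOPPED"

    @abstractmethod
    def do_start(self):
        """Resource-specific start logic."""

    @abstractmethod
    def do_stop(self):
        """Resource-specific stop logic."""


class FakeApiX:
    """Stand-in for a client of testbed X's API; it only records calls."""

    def __init__(self):
        self.calls = []

    def power_on(self, hostname):
        self.calls.append(("power_on", hostname))

    def power_off(self, hostname):
        self.calls.append(("power_off", hostname))


class HostXRM(ResourceManager):
    """RM for hosts of testbed X, delegating actions to API X."""

    def __init__(self, api, hostname):
        super().__init__()
        self.api = api
        self.hostname = hostname

    def do_start(self):
        self.api.power_on(self.hostname)

    def do_stop(self):
        self.api.power_off(self.hostname)


api = FakeApiX()
rm = HostXRM(api, "node1.testbed-x.example")
rm.start()
```

The base class owns the state bookkeeping, so a new RM only has to supply the
platform-specific steps. A real RM would replace \texttt{FakeApiX} with actual
calls to the testbed.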

Figure \ref{fig:intro_resource_management} illustrates how the user, the EC,
the RMs and the resources collaborate to run an experiment.

\begin{figure}
  \centering
  \includegraphics[width=\textwidth]{intro_resource_management}
  \caption{Resource management in NEPI}
  \label{fig:intro_resource_management}
\end{figure}