doc/user_manual/ec_api.tex

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%    NEPI, a framework to manage network experiments
%    Copyright (C) 2013 INRIA
%
%    This program is free software: you can redistribute it and/or modify
%    it under the terms of the GNU General Public License as published by
%    the Free Software Foundation, either version 3 of the License, or
%    (at your option) any later version.
%
%    This program is distributed in the hope that it will be useful,
%    but WITHOUT ANY WARRANTY; without even the implied warranty of
%    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%    GNU General Public License for more details.
%
%    You should have received a copy of the GNU General Public License
%    along with this program.  If not, see <http://www.gnu.org/licenses/>.
%
% Author: Alina Quereilhac <alina.quereilhac@inria.fr>
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


The ExperimentController (EC) is the entity in charge of turning the
experiment description into a running experiment.
To do this, the EC needs to know which resources are to be
used, how they should be configured, and how they relate to one another.
For this purpose, the EC exposes methods to register resources, specify their
configuration, and register dependencies between them. These methods are part of
the EC design API.
Likewise, in order to deploy and control resources, and to collect data,
the EC exposes another set of methods, which form the execution API.
These two APIs are described in detail in the rest of this chapter.

\section{The experiment script}

NEPI is a Python-based framework, and all its classes and functions can
be used by importing the \emph{nepi} module from a Python script.

In particular, the ExperimentController class can be imported as follows:

\begin{lstlisting}[language=Python]

from nepi.execution.ec import ExperimentController

\end{lstlisting}

Once this is done, an ExperimentController must be instantiated
for a particular experiment. The ExperimentController constructor takes
the optional argument \emph{exp\_id}. This argument is important because
it defines the experiment identity and makes it possible to distinguish among
different experiments. If an experiment id is not explicitly given, NEPI will
automatically generate a unique id for the experiment.

\begin{lstlisting}[language=Python]

ec = ExperimentController(exp_id="my-exp-id")

\end{lstlisting}

The experiment id can always be retrieved as follows:

\begin{lstlisting}[language=Python]

exp_id = ec.exp_id

\end{lstlisting}

%TODO: What is the run_id ??

\section{The design API}

Once an ExperimentController has been instantiated, it is possible to start
describing the experiment. The design API is the set of methods that
make this possible.

\subsection{Registering resources}

Every resource supported by NEPI is controlled by a specific ResourceManager
(RM). The RM instances are automatically created by the EC, and the user does
not need to interact with them directly.

Each type of RM is associated with a \emph{type\_id} which uniquely identifies
a concrete kind of resource (e.g.\ a PlanetLab node, an application that runs on
a Linux machine, etc.).
The \emph{type\_ids} are string identifiers, and they are required
to register a resource with the EC.

To discover all the available RMs and their \emph{type\_ids} we
can use the ResourceFactory class.
This class is a \emph{Singleton} that holds the templates and information
of all the RMs supported by NEPI. We can retrieve this information as follows:

\begin{lstlisting}[language=Python]

from nepi.execution.resource import ResourceFactory

# Print every registered type_id together with its help text
for type_id in ResourceFactory.resource_types():
    rm_type = ResourceFactory.get_resource_type(type_id)
    print("%s : %s" % (type_id, rm_type.get_help()))

\end{lstlisting}

Once the \emph{type\_id} of the resource is known, registering a
new resource with the EC is simple:

\begin{lstlisting}[language=Python]

type_id = "SomeRMType"
guid = ec.register_resource(type_id)

\end{lstlisting}

When a resource is registered, the EC instantiates an RM of the
requested \emph{type\_id} and assigns a globally unique identifier
(guid) to it. The guid is an incremental integer, and it
is the value returned by the \emph{register\_resource} method.
The EC keeps internal references to all RMs, which the user can
reference using the corresponding guid value.

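
The guid bookkeeping described above can be illustrated with a small
self-contained model. This is only a sketch under stated assumptions, not
NEPI's implementation: the \emph{ToyController} class and its
\emph{get\_resource} method are hypothetical names, and starting the counter
at 1 is an assumption.

\begin{lstlisting}[language=Python]
import itertools


class ToyController(object):
    """Toy stand-in for the EC that hands out incremental guids."""

    def __init__(self):
        self._next_guid = itertools.count(1)  # assumption: guids start at 1
        self._resources = {}                  # guid -> resource description

    def register_resource(self, type_id):
        # Assign the next integer guid and keep an internal reference
        guid = next(self._next_guid)
        self._resources[guid] = {"type_id": type_id, "attributes": {}}
        return guid

    def get_resource(self, guid):
        # Hypothetical helper: look a resource up by its guid
        return self._resources[guid]


toy_ec = ToyController()
node_guid = toy_ec.register_resource("SomeNodeType")
app_guid = toy_ec.register_resource("SomeAppType")
print("node guid: %d, app guid: %d" % (node_guid, app_guid))
\end{lstlisting}

The point of the sketch is that the user only ever handles the integer guid,
while the controller keeps the actual resource objects to itself.
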
\subsection{Attributes}

ResourceManagers expose the configurable parameters of resources
through a list of attributes. An attribute can be seen as a
\emph{{name:value}} pair that represents a certain aspect of
the resource (either descriptive information or configuration information).

It is possible to discover the list of attributes exposed by an
RM type as follows:

\begin{lstlisting}[language=Python]
from nepi.execution.resource import ResourceFactory

type_id = "SomeRMType"
rm_type = ResourceFactory.get_resource_type(type_id)

# Print the name and help text of each attribute of this RM type
for attr in rm_type.get_attributes():
    print("        %s : %s" % (attr.name, attr.help))

\end{lstlisting}

To retrieve or configure the value of a certain attribute of
a registered resource we can use the \emph{get} and \emph{set}
methods of the EC.

\begin{lstlisting}[language=Python]

old_value = ec.get(guid, "attr_name")
ec.set(guid, "attr_name", new_value)
new_value = ec.get(guid, "attr_name")

\end{lstlisting}

% Critical attribute
Since each RM type exposes the characteristics of a particular type
of resource, it is to be expected that different RMs will have different
attributes. However, there is one particular attribute that is common to all RMs.
This is the \emph{critical} attribute, and it indicates to the EC
how it should behave when a failure occurs during the experiment.
The \emph{critical} attribute has a default value of \emph{True}, since
all resources are considered critical by default.
When this attribute is set to \emph{False} the EC will ignore failures on that
resource and carry on with the experiment. Otherwise, the EC will immediately
interrupt the experiment.

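
The decision rule behind the \emph{critical} attribute can be written down in
a few lines. The following is a minimal sketch of that rule only, assuming a
plain dictionary of attribute values per guid; \emph{handle\_failure} and
\emph{ExperimentAborted} are hypothetical names and do not belong to the NEPI
API.

\begin{lstlisting}[language=Python]
class ExperimentAborted(Exception):
    """Raised by the toy handler when a critical resource fails."""


def handle_failure(attributes, guid):
    """Decide what to do when the resource identified by guid fails.

    attributes maps guid -> dict of attribute values. A missing
    "critical" entry is treated as True, mirroring the default.
    """
    if attributes.get(guid, {}).get("critical", True):
        raise ExperimentAborted("critical resource %d failed" % guid)
    return "ignored"


# Resource 1 is explicitly non-critical, resource 2 keeps the default
attributes = {1: {"critical": False}, 2: {}}

result_non_critical = handle_failure(attributes, 1)

try:
    handle_failure(attributes, 2)
    aborted = False
except ExperimentAborted:
    aborted = True

print("non-critical failure: %s, experiment aborted: %s"
      % (result_non_critical, aborted))
\end{lstlisting}
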
\subsection{Traces}

A Trace represents a stream of data collected during the experiment and associated
with a single resource. ResourceManagers expose a list of traces, which are identified
by name. Some traces are enabled by default, while others must be
explicitly activated.

It is possible to discover the list of traces exposed by an
RM type as follows:

\begin{lstlisting}[language=Python]
from nepi.execution.resource import ResourceFactory

type_id = "SomeRMType"
rm_type = ResourceFactory.get_resource_type(type_id)

# Print the name of each trace and whether it is enabled by default
for trace in rm_type.get_traces():
    print("        %s : %s" % (trace.name, trace.enabled))

\end{lstlisting}

The \emph{enable\_trace} method enables a specific trace for an
RM instance:

\begin{lstlisting}[language=Python]

ec.enable_trace(guid, "trace-name")

print(ec.trace_enabled(guid, "trace-name"))

\end{lstlisting}

\subsection{Registering connections}

In order to describe the experiment set-up, resources need to be
associated with one another. Through the process of connecting resources
the \emph{topology graph} is constructed. For example, a certain application might
need to be configured and executed on a certain node, and this
must be indicated to the EC by connecting the application RM to the node
RM.

Connections are registered using the \emph{register\_connection} method,
which receives the guids of the two RMs.

\begin{lstlisting}[language=Python]

ec.register_connection(node_guid, app_guid)

\end{lstlisting}

The order in which the guids are given is not important, since the
\emph{topology graph} is not directed, and the corresponding
RMs \emph{`know'} internally how to interpret the connection
relationship.

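
Why the argument order does not matter can be seen in a toy undirected graph
that stores every connection in both directions. This is a sketch for
illustration, not the data structure NEPI uses internally; the
\emph{TopologyGraph} class and its methods are hypothetical.

\begin{lstlisting}[language=Python]
class TopologyGraph(object):
    """Toy undirected graph keyed by guid."""

    def __init__(self):
        self._neighbours = {}  # guid -> set of connected guids

    def register_connection(self, guid1, guid2):
        # Store the edge in both directions so argument order is irrelevant
        self._neighbours.setdefault(guid1, set()).add(guid2)
        self._neighbours.setdefault(guid2, set()).add(guid1)

    def connected(self, guid1, guid2):
        return guid2 in self._neighbours.get(guid1, set())

    def edges(self):
        # Each edge is reported once, as an order-independent frozenset
        return set(frozenset((a, b))
                   for a, peers in self._neighbours.items()
                   for b in peers)


node_guid, app_guid = 1, 2

graph_a = TopologyGraph()
graph_a.register_connection(node_guid, app_guid)

graph_b = TopologyGraph()
graph_b.register_connection(app_guid, node_guid)

# Both registration orders produce the same topology
same_topology = graph_a.edges() == graph_b.edges()
print("same topology: %s" % same_topology)
\end{lstlisting}
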
\subsection{Registering conditions}

All ResourceManagers must go through the same sequence of state transitions.
Associated with those states are the actions that trigger the transitions.
As an example, an RM will initially be in the state NEW. When the DEPLOY action
is invoked, it will transition to the DISCOVERED, then PROVISIONED, then READY
states. Likewise, the action START will make an RM pass from state READY to
STARTED, and the action STOP will change an RM from state STARTED to STOPPED.

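
The transitions just described can be summarised as a lookup table. The
sketch below is a simplified model, not NEPI code: it uses plain strings for
states and actions, leaves out the FAILED and RELEASED states, and rejecting
every combination that is not listed is an assumption of this model.
\emph{TRANSITIONS} and \emph{apply\_action} are hypothetical names.

\begin{lstlisting}[language=Python]
# (current state, action) -> states visited, in order
TRANSITIONS = {
    ("NEW", "DEPLOY"): ["DISCOVERED", "PROVISIONED", "READY"],
    ("READY", "START"): ["STARTED"],
    ("STARTED", "STOP"): ["STOPPED"],
}


def apply_action(state, action):
    """Return the list of states visited when action is applied in state."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError("action %s is not valid in state %s"
                         % (action, state))


# Walk one RM through its whole life cycle
state = "NEW"
history = [state]
for action in ("DEPLOY", "START", "STOP"):
    visited = apply_action(state, action)
    history.extend(visited)
    state = visited[-1]

# In this model an RM cannot be started before it has been deployed
try:
    apply_action("NEW", "START")
    rejected = False
except ValueError:
    rejected = True

print(" -> ".join(history))
\end{lstlisting}
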
Using these states and actions, it is possible to specify workflow dependencies
between resources. For instance, it would be possible to indicate that
one application should start after another application by registering a
condition with the EC.

\begin{lstlisting}[language=Python]

from nepi.execution.resource import ResourceState, ResourceAction

# Start app1 only after app2 has reached the STARTED state
ec.register_condition(app1_guid, ResourceAction.START,
        app2_guid, ResourceState.STARTED)

\end{lstlisting}

The above invocation should be read ``Application 1 should START after application 2
has STARTED''. It is also possible to indicate a relative time from the moment a state
change occurs to the moment the action should be taken, as follows:

\begin{lstlisting}[language=Python]

from nepi.execution.resource import ResourceState, ResourceAction

# Start app1 no earlier than 5 seconds after app2 has STARTED
ec.register_condition(app1_guid, ResourceAction.START,
        app2_guid, ResourceState.STARTED, time="5s")

\end{lstlisting}

This line should be read ``Application 1 should START at least 5 seconds after
application 2 has STARTED''. \\

Allowed actions are: DEPLOY, START and STOP. \\

Existing states are: NEW, DISCOVERED, PROVISIONED, READY, STARTED, STOPPED,
FAILED and RELEASED. \\

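
The meaning of a condition with a relative time can be made concrete with a
small check. This is a sketch of the semantics only, under stated assumptions:
\emph{parse\_seconds}, \emph{condition\_met} and \emph{state\_times} are
hypothetical names, only the \emph{s} (seconds) suffix is handled, and NEPI's
own time parsing may accept other formats.

\begin{lstlisting}[language=Python]
def parse_seconds(spec):
    """Parse a delay such as "5s" into seconds (only "s" is handled)."""
    if not spec.endswith("s"):
        raise ValueError("unsupported time specification: %r" % spec)
    return float(spec[:-1])


def condition_met(state_times, guid, state, now, time=None):
    """True if guid reached state and, if time is given, that long ago.

    state_times maps (guid, state) -> timestamp in seconds at which
    the state was reached.
    """
    reached_at = state_times.get((guid, state))
    if reached_at is None:
        return False
    delay = parse_seconds(time) if time else 0.0
    return now - reached_at >= delay


# Application 2 (guid 2) reached STARTED at t = 100 s
state_times = {(2, "STARTED"): 100.0}

too_early = condition_met(state_times, 2, "STARTED", now=103.0, time="5s")
on_time = condition_met(state_times, 2, "STARTED", now=105.0, time="5s")
no_delay = condition_met(state_times, 2, "STARTED", now=100.0)
never_started = condition_met(state_times, 3, "STARTED", now=200.0)

print("%s %s %s %s" % (too_early, on_time, no_delay, never_started))
\end{lstlisting}

The check is phrased as ``at least'' the given delay: the action becomes
allowed once the delay has elapsed, and nothing in this model forces it to
happen at exactly that instant.
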
\section{The execution API}

\subsection{Deploying an experiment}

%TODO: Talk about groups
%TODO: Talk about interactive deployment

\subsection{Getting attributes}

\subsection{Querying the state}

\subsection{Getting traces}

% TODO: Give examples of Traces (how to collect traces to the local repo, talk about the Collector RM)

% how to retrieve an application trace when the Node failed? (critical attribute)

\subsection{The collector RM}

