\documentclass[12pt]{article}
%\usepackage{epsf}
\usepackage{epsfig}
%\usepackage{alg,alg2}
%\input{psfig.sty}
%\input{preamble-isca}
%\setcounter{secnumdepth}{4}
\textheight 9in % 1in top and bottom margin
\textwidth 6.5in % 1in left and right margin
\oddsidemargin 0in % Both side margins are now 1in
\evensidemargin 0in \topmargin -0.5 in
% The header goes .5in from top of the page and from the text.
\begin{document}
\normalsize
\bibliographystyle{plain}
\clearpage
\pagenumbering{arabic}
\title{Preliminary report on EAGER-1049758:
\\Investigating Network Testbed Usage}
\author{Jelena Mirkovic and Alefiya Hussain \\
USC/ISI\\
\{sunshine,hussain\}@isi.edu
}
\maketitle
\section*{Introduction}
This is a preliminary report on our findings from investigating testbed usage patterns.
Most of our conclusions at this point come from investigating the DETER
testbed data~\cite{Deter}.
We are in the process of obtaining and analyzing data
from other testbeds, and this analysis will be completed by the end of our grant.
We have started to analyze the data along several dimensions,
specifically to answer the following questions about testbeds:
\begin{enumerate}
\item Do testbeds help people conduct useful research?
\item What aspects of testbeds hinder their wider use?
\item Can testbed use/management policies be improved and how?
\end{enumerate}
Most of the findings reported here relate to questions 1 and 3. Unfortunately, testbeds today
do not collect enough data, nor the right data, to answer the above questions
conclusively, so we are often forced to draw bold conclusions based on our interpretation of the
existing, limited data.
We observe that there are three primary types of experimentation patterns
on testbeds today: (a) {\bf Hypothesis Validation}, where the experimenter
rigorously explores the parameter space to validate
a particular hypothesis;
(b) {\bf Deployment Study}, where the experimenter
installs new technology to study its impact and/or to test it out;
and (c) {\bf Exploration}, where the experimenter takes
unknown technology and immerses it into the testbed to study it further.
In the subsequent sections, we classify
all three of these experimentation patterns as
{\it research} experiments.
Additionally, we have {\it class} experiments
on DETER, which arise from the use of DETER in graduate and undergraduate
security courses across 22 universities worldwide.
(A complete list of academic institutions that use DETER in classes is attached as
Appendix A.)
\subsection*{Experiment Duration}
{\it Experiment duration} is defined
as the time lapse from
when an experiment is allocated
testbed resources
to when the experiment releases
the assigned resources.
Figure~\ref{expdur} shows the \textit{cumulative distribution function (cdf)} of
duration for our two
experiment categories, research and class
experiments.
Since each allocation is plotted as an
independent event, the same experiment (identified by its name) can generate multiple
points in Figure~\ref{expdur} if it allocated and released resources multiple times in the course
of its lifetime.
Research experiment duration is heavy-tailed, with 26\% of experiments
lasting less than 15 minutes, 51\% lasting less than 1.5 hours and 90\%
lasting less than a day, but a few experiments last more than a year.
Class experiment duration is also heavy-tailed, but longer experiments
dominate more: fewer than 7\% last less than 15 minutes, 35\% last
less than 1.5 hours and 96\% last less than a day. The longest class experiments
last for a few weeks. We attribute this longer duration of class
experiments, when compared with research experiments, to the fact that
class experiments are usually well specified in advance by the class
instructor. Class users can thus simply allocate resources for the experiment and do
useful work, while research users may need several trial resource
allocations while they test out their setup and discover the combination
that works best for their research purpose.
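To make this computation concrete, the sketch below (in Python) shows one way to
derive duration samples from an allocation/release event log. The
\texttt{(exp\_id, event, time)} record format is an illustrative assumption,
not the actual DETER log schema.
\begin{verbatim}
# A minimal sketch, assuming events arrive as (exp_id, event, time)
# tuples sorted by time -- an illustrative format, not DETER's schema.
def duration_samples(events):
    open_alloc = {}  # exp_id -> time of the pending allocation
    samples = []
    for exp_id, event, time in events:
        if event == "alloc":
            open_alloc[exp_id] = time
        elif event == "release" and exp_id in open_alloc:
            # each alloc/release pair yields one duration sample,
            # so one experiment can contribute many points
            samples.append(time - open_alloc.pop(exp_id))
    return sorted(samples)  # sorted samples form the empirical cdf
\end{verbatim}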
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{data/edur.pdf}
\caption{Experiment duration for research and class experiments}
\label{expdur}
\end{center}
\end{figure}
\subsection*{Experiment Lifetime}
{\it Experiment lifetime} is defined as the
time lapse from the first experiment creation
event to the last experiment deallocation event.
Thus each unique experiment is only represented by one point in
Figure~\ref{explife}.
Research experiment life is heavy-tailed, with almost
51\% of experiments lasting less than 10 minutes. Conversely, only 1.4\%
of class experiments last less than 10 minutes. To verify our hypothesis
that short research experiment life is due to users trying to determine
the best setting for their purpose, we examined the percentage of short
experiments that are followed by longer experiments in the same project.
If we define ``short'' as lasting 10 minutes or less, 3,676 out of 3,682
(or 99\%) short research experiments are followed by a longer experiment
in the same project. Similarly, we investigated how many short
experiments are preceded by a long experiment, hypothesizing that this
is due to the user perfecting their scripts and automating the
experiment so that it can run in under 10 minutes. This time, 3,678 out of
3,682 (or 99\%) short research experiments were preceded by a longer
experiment. We conclude that short research experiments occur often in
the middle of an experimentation stream, when users either want to
investigate a new setup or have sufficiently automated their
experiments that they can finish quickly.
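The followed-by/preceded-by test itself is simple; a minimal sketch follows,
assuming per-experiment records of project, start time and lifetime
(illustrative field names), and taking ``longer'' to mean a greater lifetime
than the short experiment's.
\begin{verbatim}
# Count short experiments (lifetime <= 600 s) that have a longer
# experiment after them and/or before them in the same project.
def short_experiment_context(experiments, short=600):
    by_project = {}
    for proj, start, life in experiments:
        by_project.setdefault(proj, []).append((start, life))
    total = followed = preceded = 0
    for exps in by_project.values():
        exps.sort()  # order by start time
        for i, (start, life) in enumerate(exps):
            if life > short:
                continue
            total += 1
            if any(l > life for _, l in exps[i + 1:]):
                followed += 1  # a longer experiment comes later
            if any(l > life for _, l in exps[:i]):
                preceded += 1  # a longer experiment came earlier
    return total, followed, preceded
\end{verbatim}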
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{data/elife.pdf}
\caption{Experiment life for research and class experiments}
\label{explife}
\end{center}
\end{figure}
\subsection*{Experiment Size}
Figure~\ref{expsize} shows experiment size in number of nodes for class
and research experiments. Since experiment size can change during
an experiment's lifetime, we plot each value as a new data point in the graph. We
notice that a large percentage of experiments are small: 6\% of research
and 27\% of class experiments require only 1 node, and 77\% of research
experiments and 97\% of class experiments require fewer than 10 nodes.
But the distributions are heavy-tailed, with a few experiments requiring
$>$ 100 nodes (research) and $>$ 10 nodes (class). Coupled with the
experiment duration data, this suggests that most testbed
experiments are short and small, which may have implications if testbeds
were to implement more advanced resource scheduling algorithms than the
currently used first-come-first-served approach.
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{data/esize.pdf}
\caption{Experiment size for research and class experiments}
\label{expsize}
\end{center}
\end{figure}
\subsection*{Project Size}
Figure~\ref{projsize} shows the project size in number of unique
experiments (one experiment can be allocated resources multiple times but it
still accounts for only one data point) for class and research projects.
We notice that most projects have a small number of experiments: 56\% of
research projects and 23\% of class
projects have fewer than 10 experiments. Again, this distribution is
heavy-tailed, with a few projects generating hundreds of experiments.
Distributions of the number of allocations per project (Figure~\ref{projswap}) are similarly heavy-tailed.
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{data/psize.pdf}
\caption{Project size in number of experiments}
\label{projsize}
\end{center}
\end{figure}
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{data/pswaps.pdf}
\caption{Project resource allocations}
\label{projswap}
\end{center}
\end{figure}
\subsection*{Project Lifetime}
Figure~\ref{projlife} shows the distribution of project lifetime,
from the first to the last event (experiment creation, resource allocation, resource release or modification). We notice that the smallest lifetime is 46 days,
with more than half of the research projects being active for more than
2.5 years and one third of class projects being active for more than a
year. Coupled with project activity data, such as the number of experiments
and the number of resource allocations, and with experiment activity data, such as
duration, this shows that people use testbeds in multiple short visits
spread over a long time period.
Additionally, we found that
a significant number of projects are created but never used, that is,
not a single experiment is created within these projects.
The percentages are 24\% for DETER~\cite{Deter}, 24\% for Emulab~\cite{Emulab}
and 11\% for the Schooner-WAIL testbed~\cite{Wail}.
We contacted the PIs of these unused DETER projects to understand their reasons for not using the
testbed, and the responses we received can be broadly classified as:
\begin{enumerate}
\item Found that simulation or live deployment was a better fit for my research,
\item Could not find sponsors or students for the project,
\item Expected there to be some specific software or hardware in the testbed, which proved wrong, and
\item Did not really need to create experiments, since the goal was simply to learn how to build a testbed of our own.
\end{enumerate}
We are still investigating the root causes of this phenomenon, but the fact
that we observe these trends across very different testbeds suggests that
testbeds need better experimentation tools to eliminate many
cases in categories (1) and (3). The ability to retire and resurrect projects would enable
better accounting by eliminating projects in category (2), and a special project type for
people who seek to build testbeds would eliminate projects in category (4).
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{data/plife.pdf}
\caption{Project lifetime}
\label{projlife}
\end{center}
\end{figure}
\subsection*{User Activity Patterns}
We define an ``active'' user as a user who has manipulated (created,
allocated, released, modified) an experiment within a project that
they belong to. Figure~\ref{active} plots the percentage of active users
in a project against the number of project members for research and
class projects. We notice that for small projects ($<$10 members for
research projects and $<$50 members for classes) the percentage of active
users varies widely. The lowest percentage of active research users is
20\%, while it is 3\% for class users. For large projects, however, a
large percentage of users is active. This effect is counterintuitive and
requires further investigation.
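Concretely, the per-project percentage we plot can be computed as in the sketch
below; the membership and event structures are illustrative assumptions.
\begin{verbatim}
# Percentage of each project's members who triggered at least one
# experiment event (create/allocate/release/modify) in that project.
def active_percentage(members, events):
    # members: project -> set of user ids (assumed structure)
    # events: (project, user) pairs, one per experiment manipulation
    actors = {}
    for proj, user in events:
        actors.setdefault(proj, set()).add(user)
    return {proj: 100.0 * len(actors.get(proj, set()) & users)
                  / len(users)
            for proj, users in members.items() if users}
\end{verbatim}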
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{5.pdf}
\includegraphics[width=5in]{4.pdf}
\caption{Percentage of active vs all project members for research projects
(top) and class projects (bottom)}
\label{active}
\end{center}
\end{figure}
Additionally, we found that
a number of users never manipulate
(create, allocate or release resources for, modify) an experiment.
The percentages are 10\% for DETER and 15\% for Schooner-WAIL.
Closer investigation shows three causes of such behavior:
\begin{enumerate}
\item Users tend to open duplicate accounts if they forget their password and
cannot retrieve it the regular way, or if they change institutional affiliation,
\item PIs tend to create projects but do not manipulate experiments -- their students/employees do, and
\item Students in class projects may work with an experiment already
set up by the instructor or TA.
\end{enumerate}
Cause (1) is preventable with better account management by testbed operators. Cause (2) can be easily identified and the corresponding user records taken out of the statistics. Testbeds need better accounting to detect behavior due to cause (3).
\subsection*{Testbed Usage in Classes}
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=6in]{data/sigedu/1.pdf}
\includegraphics[width=6in]{data/sigedu/2.pdf}
\caption{Resource usage per class}
\label{cluse}
\end{center}
\end{figure}
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{data/sigedu/3.pdf}
\caption{Resource usage in the testbed}
\label{alluse}
\end{center}
\end{figure}
The past few semesters have brought a large increase in class usage of
the DETER testbed. This prompted us to implement a few administrative
policies to ensure that our resources are divided fairly between classes,
and that class usage does not compromise our research usage. These
policies were first enforced in Fall 2010.
We ask instructors at the start of the semester to email us a schedule
of their planned DETER exercises: start time, submission deadline and
the maximum number of machines the class may need, assuming the worst
case when all students work simultaneously. This data is entered into an
online document, shared via Google Docs with all class instructors (with
edit access) for that semester. Each week we impose a resource limit on
each class according to this online schedule, equal to 2/3 of the
anticipated demand recorded in the schedule. This ensures, to some extent, that no class can starve
other classes of resources. Additionally, we make sure that the sum of
all class limits for the week does not exceed 2/3 of all testbed
resources. This ensures that some resources remain available for our
research users.
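The weekly limit computation reduces to a few lines, sketched below. How limits
are reconciled when their sum would exceed the aggregate cap is our assumption
(here, proportional scaling); the policy above only requires that the sum stay
under 2/3 of testbed resources.
\begin{verbatim}
# Weekly per-class limits: 2/3 of each class's anticipated demand,
# scaled down (an assumed reconciliation) if the limits together
# would exceed 2/3 of the testbed.
def weekly_limits(demand, testbed_size):
    limits = {c: (2 * d) // 3 for c, d in demand.items()}
    cap = (2 * testbed_size) // 3
    total = sum(limits.values())
    if total > cap:
        limits = {c: (l * cap) // total for c, l in limits.items()}
    return limits

# Example: three hypothetical classes on a 400-node testbed
print(weekly_limits({"cs530": 120, "sec401": 60, "net450": 90}, 400))
\end{verbatim}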
In Fall 2010 there were ten courses that used the DETER testbed, ranging in
enrollment from 10 to 100 students. Figure~\ref{cluse} plots the
number of machines used by each class, and the resource limit set on
the course (2/3 of the maximum resource demand in a given week). If the
instructor provided no demand for some week (e.g., no exercise was
planned then), no limit was set. The graph legend shows the institution
name and the class size. We notice two trends from these graphs. First,
larger classes tend to request more resources but underutilize them,
frequently staying well below their set limits, while smaller classes
tend to bump against their limits often. We attribute this effect to
greater multiplexing in a larger class, which ensures that resources are
used in a more uniform manner. The second effect we noticed is that
classes tend to use resources outside of their planned intervals. It is
possible that this is due to instructors moving exercise deadlines
without updating our online schedule. Another possibility is that
instructors set up exercises prior to assigning them to
students. Both these effects merit further investigation and fine-tuning
of our policies to better match observed usage patterns.
Figure~\ref{alluse} plots the number of machines used by all classes and the
total number of machines used in DETER over the course of the Fall 2010
semester. It also shows the aggregate resource limit of 2/3 of DETER
resources that is set over the class demand. We observe that class usage
stays well below this imposed limit. We also observe that this is not
due to a lack of testbed resources -- in all cases there were free
resources in the testbed that could have been allocated to classes, since
total utilization stayed below 80\%. This observed effect may be due to
instructors overestimating their resource needs, but it may also be due
to us setting too-strict limits on some classes (i.e., those from Figure~\ref{cluse} that
tend to bump against their limits often), forcing them to wait for
resources even when there are free machines in the testbed.
We draw three conclusions from these observations. First, the 2/3 aggregate
limit on class resources can be relaxed, or at least enforced only
when testbed resources are running low instead of all the time. Second,
we need a better approach to ensure fairness of resource allocation
between courses, since obviously some courses need more and some need
fewer resources than their instructors originally estimated. Third, we
need a better resource allocation policy that ensures that a course is
only denied resources when there is a real, and not just a possible, resource
shortage.
\section*{Appendix A: Institutions that use DETER in class}
\begin{enumerate}
\item UC Berkeley
\item University of Southern California
\item Stevens Institute of Technology
\item UC Los Angeles
\item Lehigh University
\item Jordan University of Science and Technology, Jordan
\item Colorado State University
\item IIT Delhi, India
\item Sao Paulo State University, Brazil
\item Youngstown State University
\item University of Nebraska - Lincoln
\item San Jose State University
\item Vanderbilt University
\item University of Portland
\item Johns Hopkins University
\item George Mason University
\item Saint Louis University
\item Radford University
\item University of Memphis
\item NYU Polytechnic Institute
\item Southern Illinois University at Edwardsville
\item Bar Ilan University, Israel
\end{enumerate}
\bibliography{nsfreport}
\end{document}