\documentclass[10pt]{article}
%\usepackage{epsf}
\usepackage{epsfig}
%\usepackage{alg,alg2}
%\input{psfig.sty}
\usepackage[usenames]{color}
%\input{preamble-isca}
\setcounter{secnumdepth}{4}
\textheight 9in % 1in top and bottom margin
\textwidth 6.5in % 1in left and right margin
\oddsidemargin 0in % Both side margins are now 1in
\evensidemargin 0in \topmargin -0.5 in
% The header goes .5in from top of the page and from the text.
\begin{document}
\normalsize
\bibliographystyle{alpha}
\clearpage
\pagenumbering{arabic}
\title{Investigating Network Testbed Usage}
\author{Jelena Mirkovic, Hao Shi, Alefiya Hussain and Ted Faber\\
USC/ISI\\
\{sunshine, haoshi, hussain, faber\}@isi.edu
}
\maketitle
\begin{abstract}
\end{abstract}
\section{Introduction}
These are the key questions we seek to answer:
\begin{enumerate}
\item Do testbeds help people conduct useful research?
\item What aspects of testbeds hinder their wider use?
\item Can testbed use/management policies be improved and how?
\end{enumerate}
Most of the findings reported here relate to questions 1 and 3. Unfortunately, testbeds today
do not collect enough data, nor do they collect the right data, to answer the above questions
conclusively, so we are often forced to draw bold conclusions based on our interpretation of the
existing, limited data.
We observe that there are three primary types of experimentation patterns
on testbeds today: (a) {\bf Hypothesis Validation}, where the experimenter
rigorously explores the parameter space to validate a particular hypothesis;
(b) {\bf Deployment Study}, where the experimenter installs new technology
to study its impact and/or to test it out; and (c) {\bf Exploration}, where
the experimenter takes unknown technology and immerses it in the testbed to
study it further. In the subsequent sections, we classify all three
experimentation patterns as {\it research} experiments.
Additionally, we have {\it class} experiments on DETER, which arise from the
use of DETER in graduate and undergraduate security courses at 22
universities worldwide. The testbed is also used for internal
testing -- we exclude all data from such experiments and users from our statistics.
\section{Related Work}
\section{Network Testbeds}
\section{User Diversity}
\begin{itemize}
\item Testbed use evolves after a few years to become outsider-dominated.
\item Class use grows easily.
\item Most users are in the US, as expected.
\end{itemize}
\section{Testbed Usage Patterns}
\textbf{Hypothesis:} People use testbeds in multiple visits. Some experiments are small because users are exploring the parameter space.
\subsection{Experiment and Project measures}
{\color{red} clustering TODO}
\subsubsection{Experiment Duration}
{\it Experiment duration} is defined as the time lapse from when an
experiment is allocated testbed resources to when it releases the
assigned resources.
Figure~\ref{exdur} shows the \textit{cumulative distribution function (cdf)} of
duration for our two experiment categories, research and class experiments.
Since each allocation is plotted as an independent event, the same
experiment (identified by its name) can generate multiple points in
Figure~\ref{exdur} if it allocated and released resources multiple times
over the course of its lifetime.
Research experiment duration is heavy-tailed, with 26\% of experiments
lasting less than 15 minutes, 51\% lasting less than 1.5 hours and 90\%
lasting less than a day, but a few experiments last more than a year.
Class experiment duration is also heavy-tailed, but longer experiments
dominate more: fewer than 7\% last less than 15 minutes, 35\% last
less than 1.5 hours and 96\% last less than a day. The longest class
experiments last a few weeks. We attribute the longer duration of class
experiments, compared with research experiments, to the fact that
class experiments are usually well specified in advance by the class
instructor. Class users can thus simply allocate resources for the experiment
and do useful work, while research users may need several trial resource
allocations while they test their setup and discover the combination
that works best for their research purpose.
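To make the measurement concrete, the following is a minimal sketch of how per-allocation durations can be computed from an event log; the record format (experiment name, event type, timestamp) and event names are our illustrative assumptions, not the actual testbed log schema.
\begin{verbatim}
# Sketch: one duration per allocate/release pair, so an experiment
# swapped in N times contributes N points to the cdf. The event
# format below is an assumed, simplified version of a testbed log.
from datetime import datetime

def allocation_durations(events):
    pending = {}  # experiment name -> time of its open allocation
    for name, etype, ts in sorted(events, key=lambda e: e[2]):
        if etype == "allocate":
            pending[name] = ts
        elif etype == "release" and name in pending:
            yield (ts - pending.pop(name)).total_seconds()

events = [
    ("exp1", "allocate", datetime(2010, 9, 13, 10, 0)),
    ("exp1", "release",  datetime(2010, 9, 13, 11, 30)),
    ("exp1", "allocate", datetime(2010, 9, 14, 9, 0)),
    ("exp1", "release",  datetime(2010, 9, 14, 9, 10)),
]
durations = sorted(allocation_durations(events))
# Empirical cdf: fraction of durations <= each observed value.
cdf = [(d, (i + 1) / len(durations)) for i, d in enumerate(durations)]
\end{verbatim}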
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/exdur.pdf}
\caption{Experiment duration distribution}
\label{exdur}
\end{center}
\end{figure}
We also sought to correlate experiment duration and size, as shown in Figures~\ref{dvss} and \ref{cdvss}. As can be seen, there is no obvious correlation.
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/dvss.pdf}
\caption{Research: Experiment duration vs size}
\label{dvss}
\end{center}
\end{figure}
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/cdvss.pdf}
\caption{Class: Experiment duration vs size}
\label{cdvss}
\end{center}
\end{figure}
\subsubsection{Experiment Lifetime}
{\it Experiment lifetime} is defined as the time lapse from the first
experiment creation event to the last experiment deallocation event.
Thus each unique experiment is only represented by one point in
Figure~\ref{exlife}.
Research experiment lifetime is heavy-tailed, with almost
51\% of experiments lasting less than 10 minutes. Conversely, only 1.4\%
of class experiments last less than 10 minutes. To verify our hypothesis
that short research experiment lifetimes are due to users trying to determine
the best setting for their purpose, we examined the percentage of short
experiments that are followed by longer experiments in the same project.
If we define ``short'' as lasting 10 minutes or less, 3,835 out of 3,846
(or 99\%) short research experiments are followed by a longer experiment
in the same project. Similarly, we investigated how many short
experiments are preceded by a long experiment, hypothesizing that this
occurs when users have perfected their scripts and automated the
experiment so that it can run in under 10 minutes. This time, 3,839 out of
3,846 (or 99\%) short research experiments were preceded by a longer
experiment. We conclude that short research experiments often occur in
the middle of an experimentation stream, when users either want to
investigate a new setup or have sufficiently automated their
experiments that they finish quickly.
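The adjacency check above can be sketched as follows; the per-experiment records (project, first creation, last deallocation) and the 10-minute threshold mirror our definitions, but the field layout is an illustrative assumption.
\begin{verbatim}
# Sketch: for each short experiment (lifetime <= 10 minutes), test
# whether a longer experiment exists after it, or before it, in the
# same project. Times are seconds; the record layout is assumed.
SHORT = 600  # 10 minutes

def adjacency_stats(experiments):
    # experiments: list of (project, start, end) lifetimes
    total_short = followed = preceded = 0
    by_project = {}
    for proj, start, end in experiments:
        by_project.setdefault(proj, []).append((start, end))
    for runs in by_project.values():
        runs.sort()
        for i, (start, end) in enumerate(runs):
            if end - start > SHORT:
                continue
            total_short += 1
            if any(e - s > SHORT for s, e in runs[i + 1:]):
                followed += 1
            if any(e - s > SHORT for s, e in runs[:i]):
                preceded += 1
    return total_short, followed, preceded
\end{verbatim}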
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/exlife.pdf}
\caption{Experiment life distribution}
\label{exlife}
\end{center}
\end{figure}
To validate the theory that short experiments are related to longer, more useful
experiments, we investigated the minimum elapsed time between a short experiment and
a longer one either following or preceding it. Figure~\ref{gaps} shows the \textit{cdf} of
these time intervals. About half of the short experiments are performed within an hour of
a longer experiment, and 96\% are performed within a day of a longer experiment. This validates
our theory.
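A sketch of the gap computation plotted in Figure~\ref{gaps}: for one short experiment, we take the minimum elapsed time to any longer experiment in the same project; interval endpoints are assumed to be in seconds.
\begin{verbatim}
# Sketch: smallest gap between a short experiment and the nearest
# longer one, measured between the closest interval endpoints.
def min_gap(short_run, longer_runs):
    s_start, s_end = short_run
    gaps = []
    for l_start, l_end in longer_runs:
        if l_start >= s_end:      # longer experiment follows
            gaps.append(l_start - s_end)
        elif l_end <= s_start:    # longer experiment precedes
            gaps.append(s_start - l_end)
        else:                     # the two overlap in time
            gaps.append(0)
    return min(gaps) if gaps else None
\end{verbatim}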
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/gaps.pdf}
\caption{Gap duration between a short experiment and a longer one}
\label{gaps}
\end{center}
\end{figure}
\subsubsection{Project Size}
Figure~\ref{projsize} shows the project size in number of unique
experiments (one experiment can be allocated resources multiple times but it
still accounts for only one data point) for class and research projects.
We notice that most projects have a small number of experiments: 56\% of
research projects have fewer than 10 experiments, as do 23\% of class
projects. Again, this distribution is heavy-tailed, with a few projects
generating hundreds of experiments. The distribution of the number of
allocations per project (Figure~\ref{projswap}) is similarly heavy-tailed.
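Both per-project counts can be derived from the same event log as before; here is a minimal sketch, with the added assumption that each record also carries a project identifier.
\begin{verbatim}
# Sketch: unique experiments per project (each name counted once,
# however many times it was swapped in) versus total allocations.
def project_counts(events):
    # events: (project, experiment_name, event_type, timestamp)
    experiments, allocations = {}, {}
    for project, name, etype, _ts in events:
        experiments.setdefault(project, set()).add(name)
        if etype == "allocate":
            allocations[project] = allocations.get(project, 0) + 1
    return {p: len(s) for p, s in experiments.items()}, allocations
\end{verbatim}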
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/projsize.pdf}
\caption{Number of experiments per project}
\label{projsize}
\end{center}
\end{figure}
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/projswap.pdf}
\caption{Number of allocations per project}
\label{projswap}
\end{center}
\end{figure}
\subsubsection{Project Lifetime}
Figure~\ref{projlife} shows the distribution of project lifetime,
from the first to the last event (experiment creation, resource allocation,
resource release, or modification). We notice that the smallest lifetime is 46 days,
with more than half of the research projects being active for more than
2.5 years and one third of class projects being active for more than a
year. Coupled with project activity data such as the number of experiments
and the number of resource allocations, and with experiment activity data such as
duration, this shows that people use testbeds in multiple short visits
spread over a long time period.
Additionally, we found that a significant number of projects are created
but never used, that is, not a single experiment is created within them.
The percentages are 24\% for DETER~\cite{Deter}, 24\% for Emulab~\cite{Emulab}
and 11\% for the Schooner-WAIL testbed~\cite{Wail}.
We contacted the PIs of these unused DETER projects to understand their reasons for not using the
testbed; the responses we received can be broadly classified as:
\begin{enumerate}
\item Found that simulation or live deployment was a better fit for their research,
\item Could not find sponsors or students for the project,
\item Expected specific software or hardware in the testbed, which proved wrong, and
\item Did not really need to create experiments, since the goal was simply to learn how to build a testbed of their own.
\end{enumerate}
We are still investigating the root causes of this phenomenon, but the fact
that we observe these trends across very different testbeds suggests that
testbeds need better experimentation tools to eliminate many cases in
categories (1) and (3). The ability to retire and resurrect projects would
improve accounting by eliminating projects in category (2), and a special
project type for people who seek to build testbeds would eliminate projects
in category (4).
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/projlife.pdf}
\caption{Project lifetime}
\label{projlife}
\end{center}
\end{figure}
\subsection{User Activity Patterns}
User measures (TODO)
We define an ``active'' user as a user who has manipulated (created,
allocated, released, or modified) an experiment within a project to which
they belong. Figure~\ref{projauser} plots the percentage of active users
in a project against the number of project members for research and
class projects. We notice that for small projects ($<$10 members for
research projects and $<$50 members for classes) the percentage of active
users varies widely. The lowest percentage of active research users is
20\%, while it is 3\% for class users. For large projects, however, a
large percentage of users is active. This effect is counterintuitive and
requires further investigation.
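The quantity plotted in Figure~\ref{projauser} reduces to a simple per-project set computation; a minimal sketch, assuming hypothetical membership and manipulation tables keyed by project:
\begin{verbatim}
# Sketch: fraction of project members who manipulated (created,
# allocated, released or modified) at least one experiment.
def active_fraction(members, manipulators):
    # members: project -> set of user ids
    # manipulators: project -> set of user ids seen in the event log
    return {proj: len(manipulators.get(proj, set()) & users) / len(users)
            for proj, users in members.items() if users}
\end{verbatim}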
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/clprojauser.pdf}
\includegraphics[width=5in]{figs/resprojauser.pdf}
\caption{Percentage of active users among all members in class and research projects}
\label{projauser}
\end{center}
\end{figure}
Additionally, we found that some users never manipulate
(create, allocate or release resources for, or modify) an experiment.
The percentages are 10\% for DETER and 15\% for Schooner-WAIL.
Closer investigation shows three causes of such behavior:
\begin{enumerate}
\item Users tend to open duplicate accounts if they forget their password and
cannot retrieve it the regular way, or if they change institutional affiliation,
\item PIs tend to create projects but do not manipulate experiments -- their
students and employees do, and
\item Students in class projects may work with an experiment already
set up by the instructor or TA.
\end{enumerate}
Cause (1) is preventable with better account management by testbed operations. Cause (2) can be easily identified and the corresponding user records removed from the statistics. Testbeds need better accounting to detect behavior due to cause (3).
We now look deeper into these statistics. Figure~\ref{uex} shows the number of unique experiments manipulated by a user. One unique experiment counts only once, even if the user performed multiple runs of it. Around half of research and class users manipulate fewer than five experiments. Around 90\% of research users manipulate 20 experiments or fewer, and 97\% manipulate 50 or fewer. The majority of class users (95\%) manipulate 10 experiments or fewer.
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/uex.pdf}
\caption{Experiments per user}
\label{uex}
\end{center}
\end{figure}
Figure \ref{uproj} shows the number of projects that a user belongs to. Around 90\% of research users and 97\% of class users belong to only one project.
{\color{red} Check if multiple-project users are in fact PIs}
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/uproj.pdf}
\caption{Projects per user}
\label{uproj}
\end{center}
\end{figure}
Figure~\ref{ulife} shows the distribution of a user's manipulation lifetime, i.e., the elapsed time between the first and the last experiment manipulation by the user. Around 22\% of class users and 35\% of research users never manipulate an experiment, which is indicated by a zero lifetime on our x-axis. Half of research users and 35\% of class users are active for a week or less. Around 82\% of research users and almost all class users are active for a year or less.
{\color{red} It's surprising that class users remain active longer than research users -- investigate this further}
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/ulife.pdf}
\caption{User manipulation lifetime}
\label{ulife}
\end{center}
\end{figure}
We further define the ``active lifetime'' of a user as the sum of all time during which an experiment swapped in by this user was holding testbed resources. Figure~\ref{uactive} shows the distribution of this time for research and class users. 60\% of class and research users are active for under one day. 90\% of class users are active for under two weeks, and 90\% of research users are active for under two months.
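Because one user can hold several experiments at once, we sketch the computation as the union of the user's swapped-in intervals, so concurrent experiments are not double-counted; whether the actual accounting merges overlaps in this way is our assumption.
\begin{verbatim}
# Sketch: total time covered by a user's swapped-in intervals,
# merging overlaps. Intervals are (swap_in, swap_out) in seconds.
def active_lifetime(intervals):
    total, current_end = 0, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            total += end - start        # disjoint interval
            current_end = end
        elif end > current_end:
            total += end - current_end  # extends the merged span
            current_end = end
    return total
\end{verbatim}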
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/uactive.pdf}
\caption{User active lifetime}
\label{uactive}
\end{center}
\end{figure}
\subsection{Idleness}
{\color{red} Explain that we want to investigate how well resources are used and whether we could have a better idle-swap policy to serve more users.}
{\color{red} Explain how we calculate idle time and how we gather statistics.}
These statistics cover September 13, 2010 to April 1, 2011.
54\% of timeslots had a free node that was not allocated to an experiment.
26\% of timeslots had an idle node, 12\% had a node with only network activity,
3\% had a node with only CPU activity, and 1\% had a node with both network and CPU activity.
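The classification behind these percentages can be sketched as follows; the per-node activity flags and their thresholds are illustrative assumptions about how the monitoring data is reduced.
\begin{verbatim}
# Sketch: classify one node in one timeslot from its allocation
# status and (assumed) boolean CPU/network activity flags.
def classify(node):
    if not node["allocated"]:
        return "free"          # not assigned to any experiment
    if node["cpu"] and node["net"]:
        return "cpu+net"
    if node["cpu"]:
        return "cpu-only"
    if node["net"]:
        return "net-only"
    return "idle"              # allocated but showing no activity
\end{verbatim}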
We next seek to understand the causes of idleness and look for two different patterns: (1) instances where some nodes
in an experiment were never used, and (2) instances where all nodes in an experiment were idle for some time. We now treat
class and research projects together.
Only 0.5\% of experiment instances had one or more never-used nodes. This means that users can usually estimate well
what size of network they need for experimentation.
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/timeidle.pdf}
\caption{Distribution of experiment idleness}
\label{timeidle}
\end{center}
\end{figure}
Figure~\ref{timeidle} shows the distribution of the percentage of idle time in an experiment. A little more than 40\% of experiments are never idle; the idleness distribution then increases almost linearly until it reaches 100\%.
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/mididle.pdf}
\caption{Distribution of experiment idle time in the middle}
\label{mididle}
\end{center}
\end{figure}
We further investigate the duration of idle gaps, seeking to understand whether testbeds could be better utilized if idle experiments were auto-swapped out after only a brief idle period and swapped back in later.
Figure~\ref{mididle} shows the longest idle interval for each experiment that had at least one idle timeslot. Around 65\% of experiments are idle for less than 1 hour, and 96\% of them are idle for less than 4 hours. This means that the idle-swap timeout set at 4 hours is a good estimate.
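Given a per-timeslot idle/busy sequence for an experiment, the longest idle interval is the longest run of idle slots; a minimal sketch, with the slot length as an assumed parameter:
\begin{verbatim}
# Sketch: longest run of consecutive idle timeslots, converted to
# seconds using an assumed fixed slot length.
def longest_idle_run(slots, slot_seconds=300):
    # slots: iterable of booleans, True when the experiment was idle
    longest = current = 0
    for idle in slots:
        current = current + 1 if idle else 0
        longest = max(longest, current)
    return longest * slot_seconds
\end{verbatim}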
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=5in]{figs/endidle.pdf}
\caption{Distribution of experiment idle time at the end}
\label{endidle}
\end{center}
\end{figure}
Figure~\ref{endidle} shows the duration of the idle interval before the experiment is swapped out. About a quarter of experiments are idle for less than 1 hour before being swapped out. There is an almost flat line near the 1-hour mark, going from 25\% to nearly 50\% of experiments; these are mostly class experiments, for which the idle-swap timeout is automatically set to 1 hour. A similar flat line is visible around 4 hours, from 62\% to 96\%; these are research experiments, for which the idle-swap timeout is automatically set to 4 hours. Again, these figures suggest that our settings are appropriate.
{\color{red} Failed swapin data (TODO)}
{\color{red} Planetlab (TODO)}
{\color{red} Starbed (TODO)}
\section{Experiment Characteristics}
Topology (TODO)
OS (TODO)
Link characteristics (TODO)
SEER (TODO)
\section{What hinders testbed use}
Survey results (TODO)
\section{Implications for the Future}
\section{Conclusions}
\end{document}