<!DOCTYPE html>
<html data-wf-page="5f71dd169010d6326b65485d">
<head>
<meta charset="utf-8" />
<title>Haven Secrets Manager • Case Study</title>
<meta content="width=device-width, initial-scale=1" name="viewport" />
<link href="assets/css/style.css" rel="stylesheet" type="text/css" />
<script
src="https://ajax.googleapis.com/ajax/libs/webfont/1.6.26/webfont.js"
type="text/javascript"
></script>
<link
rel="stylesheet"
href="https://fonts.googleapis.com/css?family=Inter:regular,500,600,700"
media="all"
/>
<script type="text/javascript">
WebFont.load({ google: { families: ["Inter:regular,500,600,700"] } });
</script>
<script type="text/javascript">
!(function (o, c) {
var n = c.documentElement,
t = " w-mod-";
(n.className += t + "js"),
("ontouchstart" in o ||
(o.DocumentTouch && c instanceof DocumentTouch)) &&
(n.className += t + "touch");
})(window, document);
</script>
<link
href="assets/images/haven-logo.png"
rel="shortcut icon"
type="image/x-icon"
/>
<link href="assets/images/haven-logo.png" rel="apple-touch-icon" />
<script
src="https://kit.fontawesome.com/d019875f94.js"
crossorigin="anonymous"
></script>
<meta
name="image"
property="og:image"
content="assets/images/thumbnail.png"
/>
</head>
<body>
<div class="navigation-wrap">
<div
data-collapse="medium"
data-animation="default"
data-duration="400"
role="banner"
class="navigation w-nav"
>
<div class="navigation-container">
<div class="navigation-left">
<a
href="/"
aria-current="page"
class="brand w-nav-brand w--current"
aria-label="home"
>
<img
src="assets/images/haven-logo.png"
alt=""
class="template-logo"
/>
</a>
<nav role="navigation" class="nav-menu w-nav-menu">
<a href="/case-study" class="link-block w-inline-block">
<div>Case Study</div>
</a>
<a href="/team" class="link-block w-inline-block">
<div>The Team</div>
</a>
</nav>
</div>
<div class="navigation-right">
<div class="login-buttons">
<a href="https://github.com/haven-secrets" target="_blank">
<span style="color: #00f2b1">
<i class="fab fa-github fa-lg"></i>
</span>
</a>
</div>
</div>
</div>
<div class="w-nav-overlay" data-wf-ignore="" id="w-nav-overlay-0"></div>
</div>
</div>
<div id="sidebar" class="toc">
</div>
<div class="section header">
<article class="container case-study-container">
<div class="hero-text-container">
<h1 class="h1 centered">Case Study</h1>
</div>
<div id="case-study">
<br />
<br />
<h2 class="h2">1 Introduction</h2>
<br>
<p>
Your application has secrets. If those secrets leak, you may have to urgently update them and redeploy applications. Leaked secrets can cost you in man-hours, in reputation, and in revenue.
</p>
<br>
<p>
Haven is an open-source solution for easily and securely managing those application secrets. Haven abstracts away the complexity of secrets management for
software engineers, so they can have the peace of mind to focus on
their most important work. In this case study, we describe how we
designed and built Haven, along with some of the technical
challenges we encountered. But first, let’s start with an overview
of secrets.
</p>
<br>
<br>
<h2>2 Secrets</h2>
<br>
<h3>2.1 What is a secret?</h3>
<p>
A secret is something you want to keep <em>secret</em>. More
specifically, it's a sensitive piece of data that authenticates or
authorizes you to a system. [1] One example is a connection
string that you pass to a database so you can authenticate a session
and request data from it. Another is an API token that you
supply when you call your cloud provider so you can read
from and write to its storage. Both provide access to sensitive data,
so you don't want them falling into the wrong hands.
</p>
<br>
<h4>Secrets vs. sensitive information</h4>
<p>
You probably have other sensitive information you’d like to keep
secret too, like PII (personally identifiable information). But
that information isn’t a secret if it doesn’t directly grant you access to a
system. While any sensitive information should be stored securely,
the scope of our discussion here is limited to application secrets.
</p>
<br>
<h4>Secrets vs. configuration</h4>
<p>
Configuration is important because it influences how your
application operates, but not all of it is really private. Your
application needs to know what environment to run in: should it run
in dev, or prod? That’s a piece of configuration, but it doesn’t
authenticate or authorize you in any way—so it’s not a secret.
</p>
<br />
<h3>2.2 So what?</h3>
<br>
<h4>
Secrets are the keys to your kingdom—yet they're constantly leaked
</h4>
<p>
In 2019, researchers at North Carolina State University scanned
almost 13% of GitHub’s public repositories and found "not only is
secret leakage pervasive–affecting over 100,000 repositories–but
that thousands of new, unique secrets are leaked every day." [2] They
noted it wasn't just inexperienced developers leaking secrets in
hobby projects. Several large, prominent organizations were also
leaking their secrets, including a popular website used by millions
of college applicants in the US and a major government agency in
Europe. In both cases, they exposed their respective AWS
credentials.
<a target="_blank" href="https://github.com/search?q=removed+aws+key&type=Commits"
>See for yourself</a
>
how common it is.
</p>
<br>
<h4>Developers make honest but costly mistakes</h4>
<p>
According to the principle of least privilege, anyone working on
your application should have access only to the secrets they
need to do their work. If you’re on a small team
where you know everyone personally and trust their intentions, it’s
tempting not to follow this principle. But sharing secrets freely is
dangerous, because people make mistakes, and over-privileging
developers can lead to honest but costly ones. For example, in
2017 DigitalOcean discovered that their "primary database had been
deleted." Their press release stated: “The root cause
of this incident was an engineer-driven configuration error. A
process performing automated testing was misconfigured using
production credentials.” [3]
</p>
<br>
<h4>Malicious actors cause damage</h4>
<p>
Even if developers could be perfect, there are always bad actors out
there, and mishandled secrets can give attackers unauthorized
access to your systems. In 2019, Capital One had a data breach that
affected over 100 million individuals due to a vulnerability related
to configuration secrets involving AWS S3 buckets. The attacker
previously worked for AWS and was able to exploit a misconfigured
firewall to extract files in a Capital One directory stored on AWS's
servers. [4]
</p>
<br />
<h3>2.3 Common practices</h3>
<br>
<h4>Encryption</h4>
<p>
While encrypting a secret protects it from immediate threat, it
isn’t a complete solution. For example, in a Rails application, the
convention is to
<a
href="https://guides.rubyonrails.org/security.html#custom-credentials"
>store your secrets in an encrypted secrets file</a
>. You can store the encryption key to unlock it in another file or
in an environment variable, but <em>that’s</em> a secret too. And a
particularly sensitive one: anyone who has access to it (and your
application) can read and edit <em>any</em> of your secrets. So, you
don’t want it to leak. But if you plan to secure it via encryption
first, you’ll just kick the can down the road. This is an important
problem we’ll revisit later.
</p>
<img
src="assets/images/case-study/encryption-problem.png"
class="case-study-image large-image"
/>
<br>
<h4>Environment variables</h4>
<p>
The Twelve-Factor App methodology made popular the practice of
storing configuration in environment variables to separate
configuration from code. [5] Since secrets are often discussed in
the context of configuration, it may feel natural to store your
secrets in environment variables if you do so with your
configuration. Environment variables aren’t <em>bad</em>, but it’s
dangerous to depend on them to carry the weight of managing your
secrets.
</p>
<br>
<h5>Sourcing environment variables from files</h5>
<p>
Environment variables are often set from files. For example, in
Node.js development, they’re set in a <code>.env</code> file and
then the contents of that file are loaded into the application’s
environment.
</p>
<img
src="assets/images/case-study/dotenv-a.png"
class="case-study-image"
/>
<img
src="assets/images/case-study/dotenv-b.png"
class="case-study-image"
/>
<p>
This has the advantage that you simply need to use a different
<code>.env</code>
file for production versus development environments. But if you
populate environment variables from files, you must ensure those files don’t get
accidentally checked into a public repository or otherwise leaked.
Plus, there's a glaring unanswered question: how are those
<code>.env</code> files distributed, and is that done securely?
</p>
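<br>
<p>
To make the mechanism concrete, here is a minimal sketch of what a
<code>.env</code> loader does, assuming simple <code>KEY=value</code> lines.
The function names and sample values are hypothetical, and the real
<code>dotenv</code> package handles more cases. The point is that the secrets
sit in a plaintext file and end up in the process environment.
</p>

```javascript
// Minimal sketch of a .env loader; not the dotenv package itself.
function parseEnvFile(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    if (/^(["']).*\1$/.test(value)) value = value.slice(1, -1);
    vars[key] = value;
  }
  return vars;
}

function loadIntoEnv(vars, env = process.env) {
  for (const [key, value] of Object.entries(vars)) {
    // Do not overwrite variables that are already set (a common default).
    if (!(key in env)) env[key] = value;
  }
  return env;
}

// Hypothetical file contents, inlined here so the sketch is self-contained.
const sample = [
  "# hypothetical development settings",
  "NODE_ENV=development",
  'DATABASE_URL="postgres://user:password@localhost:5432/app"',
].join("\n");

const loaded = loadIntoEnv(parseEnvFile(sample), {});
console.log(Object.keys(loaded));
```

<p>
Nothing in this flow encrypts the values or controls who can read them, which
is why the file itself has to be protected and distributed with care.
</p>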
<br>
<h5>Environment set as part of some other system</h5>
<p>
Some deployment and CI/CD tools provide a built-in way to set
environment variables. Heroku, a popular Platform-as-a-Service,
allows users to manually do so in a control panel. This is tedious
and error-prone, and there is no fine-grained access control. A more
general downside of letting your tools take care of it is how many
different tools there are. Every time your team adopts a new one,
developers have to stop and learn each new tool’s method for setting
them, and each time they do, that’s a new opportunity for secrets to
be mishandled.
</p>
<br>
<h5>The leaky environment</h5>
<p>
Regardless of <em>how</em> environment variables are set, they're
leaky. Environment variables are implicitly made available to all
child processes, so they're passed to anything the application
calls. They’re often dumped in plaintext for debugging and error
reporting, so the secrets stored within them can easily end up in
logs.
</p>
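<br>
<p>
The inheritance described above can be observed with a short Node.js sketch.
It assumes a made-up variable named <code>DEMO_API_TOKEN</code> and spawns a
second Node.js process twice: once with default options, and once with an
explicit, restricted environment.
</p>

```javascript
const { spawnSync } = require("node:child_process");

// DEMO_API_TOKEN is a made-up variable used only for this illustration.
process.env.DEMO_API_TOKEN = "s3cr3t-value";

// A tiny child program that reports whether it can see the variable.
const childScript =
  "process.stdout.write(process.env.DEMO_API_TOKEN || 'not set')";

// No `env` option: the child receives a copy of the parent's environment.
const inherited = spawnSync(process.execPath, ["-e", childScript], {
  encoding: "utf8",
});

// Explicit `env`: the child only receives what is listed here.
const restricted = spawnSync(process.execPath, ["-e", childScript], {
  encoding: "utf8",
  env: { PATH: process.env.PATH || "" },
});

console.log("default spawn:", inherited.stdout);
console.log("restricted spawn:", restricted.stdout);
```

<p>
The default is the leaky one: unless the parent deliberately restricts the
environment, every program it launches can read the secret.
</p>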
<br>
<h3>2.4 Secrets in teams</h3>
<br>
<h4>Secrets get shared</h4>
<p>
Let's take a hypothetical team of four developers, and call them
Alice, Bob, Charlie, and David. Suppose:
</p>
<ul>
<li>Alice emails a config file containing secrets to Bob</li>
<li>Bob Slacks a particular secret to Charlie</li>
<li>Charlie accidentally checks it into version control</li>
<li>
David pulls that code and works off of it, and unintentionally
writes code that later ends up logging that secret to a log file.
</li>
</ul>
<p>
Their secrets are getting around. If you were on that team, would
you know where your secrets are?
</p>
<br>
<p>
Maybe our hypothetical team tries to share secrets securely. Maybe
Alice decides to take a <em>screenshot</em> of a secret and send
that to Bob in Slack—and maybe she even goes back into Slack and
deletes that screenshot once the recipient has got the secret. Or
alternatively, maybe she puts that secret in a file and locks it
with a password, sends the locked file over Slack and sends the
password to Bob over some other communication channel, like email.
</p>
<br>
<p>
This still doesn’t look great. Without a system in place, you have
to get creative to share your secrets securely, and that makes them
even harder to track.
</p>
<br>
<h4>Teams change</h4>
<p>Now suppose the following events occur:</p>
<ul>
<li>Alice quits</li>
<li>Bob moves over from production to development</li>
<li>New-hire Emily joins the team</li>
</ul>
<br>
<p>
When Alice quits, how do you ensure Alice doesn't retain her access
to secrets? Does Bob still have production credentials? When Emily
joins, does she have to ask around to find what secrets she needs?
</p>
<br>
<p>
Without a system in place, Emily may not get, in a complete and
controlled manner, all the secrets she’ll need to do her work. She
may not even know she needs some secrets until she gets to work,
finds out she needs some, and has to hunt them down.
</p>
<br>
<h4>Secrets change</h4>
<p>
Let's add one more type of event to the mix, which will surely
happen much more often than personnel changes:
</p>
<ul>
<li>Charlie updates an API token</li>
</ul>
<p>
How do you ensure Charlie's teammates use the updated version? And
what about applications that depend on it? The old token is invalid,
so applications will crash if they try using it. Unless there’s some
system for managing secrets sanely, Charlie may have to hunt down
the people that need to know or the places where it needs to be
updated, and hope that he got them all.
</p>
<br>
<h4>The security/productivity balancing act</h4>
<p>
While secrets are extremely sensitive, they must be accessible to
you and your application.
</p>
<img
src="assets/images/case-study/tension.png"
class="case-study-image"
/>
<p>
There’s a tension between security and productivity, though,
especially when it comes to sharing secrets with other developers.
For example, how do you decide on and maintain access levels in your
team? Then, how do you securely distribute credentials to team
members who need them? And if someone leaves the team, how do you
know which secrets they've accessed that you now have to update?
</p>
<br>
<p>
It may feel convenient to simply be able to access secrets at any
time, but if you don’t have structure, process, and security around
your secrets, you’ll lose visibility and control over them.
Eventually, you’ll find yourself with a big headache called secret
sprawl.
</p>
<br>
<h3>2.5 The problem of Secret Sprawl</h3>
<p>
In all of the scenarios we just examined, we saw hints of secret
sprawl. Secret sprawl is what you have when your secrets could be
anywhere. Secret sprawl means your secrets are littered across your
code, infrastructure, config, and communication channels.
</p>
<figure>
<img
src="assets/images/case-study/ccc-sprawl.png"
class="case-study-image"
/>
<figcaption>Secrets get sprawled across code, config, and communication
channels.</figcaption>
</figure>
<figure>
<img
src="assets/images/case-study/infrastructure-sprawl.png"
class="case-study-image large-image"
/>
<figcaption>Secrets get sprawled across your infrastructure.</figcaption>
</figure>
<br>
<h4>The questions you can't answer</h4>
<p>
Secret sprawl means you can't answer questions like these with any
degree of confidence:
</p>
<ul>
<li>Who has access to what secrets?</li>
<li>When was a particular secret shared or used?</li>
<li>
If you need to change a secret, where do you have to change it?
</li>
</ul>
<br>
<br>
<h2>3 Secrets Managers</h2>
<br>
<h3>3.1 Centralization</h3>
<p>
To prevent secret sprawl, you must have a single source of
truth—one place where all your secrets live. Establishing this
"single source of truth" is <em>centralization</em>. Centralization
tames secret sprawl, and paves the way to gaining visibility and
control around your secrets.
</p>
<figure>
<img
src="assets/images/case-study/ccc-fixed.gif"
class="case-study-image"
/>
<figcaption>Secret sprawl to centralization</figcaption>
</figure>
<p>
Previously, we showed that secrets might be sprawled across your
infrastructure. Perhaps they are even passed down service-to-service
in your pipeline. But you want it to look more like this:
</p>
<p>
<img
src="assets/images/case-study/infrastructure-fixed.gif"
class="case-study-image large-image"
/>
</p>
<p>
After centralization, only your app has secrets; more precisely, each
service that needs secrets gets only the secrets it needs. You
thereby reduce the attack surface of your application.
</p>
<br />
<h3>3.2 Encryption</h3>
<p>
In section 2.3, we mentioned that encryption alone isn't a complete
solution. But when you combine it with centralization, you're well
on the way. Your secrets should be encrypted client-side, meaning
they never leave your device before they’re encrypted. They should
remain encrypted in transit and at rest too, so they’re never seen
nor persisted in plaintext. That includes encryption in
communication channels, temporary stops, and persistent storage.
Each step adds a layer of security, as illustrated below.
</p>
<img
src="assets/images/case-study/encryption-best-practices.gif"
class="case-study-image large-image"
/>
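<br>
<p>
As a rough illustration of client-side encryption, the sketch below encrypts a
secret locally with Node's built-in <code>crypto</code> module before anything
would be sent or stored. AES-256-GCM, the function names, and the sample
connection string are assumptions made for this example; it is not a
description of how any particular secrets manager implements encryption.
</p>

```javascript
const crypto = require("node:crypto");

// Encrypt a secret locally with AES-256-GCM before it is sent or stored.
function encryptSecret(plaintext, key) {
  const iv = crypto.randomBytes(12); // fresh 96-bit nonce for every encryption
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(plaintext, "utf8"),
    cipher.final(),
  ]);
  // The auth tag lets the reader detect tampering with the ciphertext.
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// Decrypt and verify; throws if the ciphertext or tag was modified.
function decryptSecret({ iv, ciphertext, tag }, key) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([
    decipher.update(ciphertext),
    decipher.final(),
  ]).toString("utf8");
}

// For the demo the key is generated locally. In a real system the key is
// itself a secret and would come from a key management service.
const key = crypto.randomBytes(32);
const sealed = encryptSecret("postgres://user:password@db.example.com/app", key);
console.log("stored form:", sealed.ciphertext.toString("base64"));
```

<p>
Only the <code>iv</code>, <code>ciphertext</code>, and <code>tag</code> would
ever leave the device. The plaintext and the key stay local, which is what
makes the question of where the key lives so important.
</p>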
<h3>3.3 Secrets managers</h3>
<p>
A secrets manager is a system that helps you securely store and
manage your secrets. Secrets managers are inherently centralized and
invariably use encryption in some way. Beyond that, they vary in the
use case they are targeted for and in the features they offer.
</p>
<br>
<h4>How to choose a secrets manager</h4>
<p>
In recent years, a number of secrets management solutions have
popped up. There are several things to consider when you’re choosing
a secrets manager. First, a secrets manager must keep your secrets
<strong>safe</strong>. To do that, it should encrypt your secrets.
Second, how does it accommodate multiple users? How does it let you
<strong>share access</strong> safely? Third, you need to know how to
actually use it in your applications. How do
<strong>applications</strong> actually <strong>get secrets</strong>?
You might have to significantly adjust your workflow depending on
the solution you pick.
</p>
<br>
<h4>How secrets managers work with your application</h4>
<p>
Let’s zoom in on that last question: how applications get secrets.
That’s done in one of two ways: either your application
fetches the secrets it needs (so you have to write more application
code), <em>or</em> your application is run in a context
that already provides the secrets it needs.
</p>
<ol>
<li>
A secrets manager might
require you to make an API call within your code to fetch secrets
from it, in which case it’s more of a decoupled and passive
component.
</li>
<li>
On the other hand, your application might also be run
with the secrets it needs already available. For example, if you use
an orchestration service, such as Puppet or Docker Swarm, there
will likely be a built-in way of specifying secrets, which will then
be made available in the environment your application code executes
in.
</li>
</ol>
<p>
The secrets manager SecretHub also works the second way.
SecretHub runs your application as a child process
and injects the secrets into the environment of that process. This
gives SecretHub some level of control, as it can monitor the
standard output and standard error streams of your application.
</p>
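<br>
<p>
The child-process approach can be sketched in a few lines of Node.js. This is
an illustration of the general technique under our own assumptions, not
SecretHub's actual implementation: the function names, the
<code>API_TOKEN</code> variable, and the <code>[REDACTED]</code> placeholder
are all hypothetical.
</p>

```javascript
const { spawnSync } = require("node:child_process");

// Replace every occurrence of a secret value in `text` with a placeholder.
function redact(text, secrets) {
  let result = text;
  for (const value of Object.values(secrets)) {
    if (value !== "") result = result.split(value).join("[REDACTED]");
  }
  return result;
}

// Run `command` with `secrets` added to its environment, and return its
// output with any secret values masked.
function runWithSecrets(command, args, secrets) {
  const child = spawnSync(command, args, {
    env: { ...process.env, ...secrets },
    encoding: "utf8",
  });
  return {
    status: child.status,
    stdout: redact(child.stdout, secrets),
    stderr: redact(child.stderr, secrets),
  };
}

// Hypothetical secret, and a child program that carelessly prints it.
const secrets = { API_TOKEN: "tok_12345" };
const result = runWithSecrets(
  process.execPath,
  ["-e", "console.log('token is ' + process.env.API_TOKEN)"],
  secrets
);
console.log(result.stdout);
```

<p>
Because the parent owns the child's output streams, it can mask secret values
before they reach a terminal or a log file. The application code itself does
not change; it simply reads its environment as usual.
</p>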
<br />
<h3>3.4 Existing solutions</h3>
<br>
<h4>Vault by HashiCorp</h4>
<p>
Vault is the most popular commercial solution. It's highly flexible
and extensible. For example, it integrates with the storage backend
and identity provider of your choice, and it can integrate with a broad array of plugins.
But Vault is widely
regarded as complex, and can be overkill for many teams. Their own
docs admit this, stating "Vault is a complex system that has many
different pieces." [6] It is probably the best choice if you need some
of the features that only Vault offers, and if your team or
organization has the expertise and bandwidth to manage the Vault
beast.
</p>
<br>
<h4>Other commercial solutions</h4>
<p>
Although Vault is dominant, there are other players on the market.
</p>
<ul>
<li>Doppler is an early-stage YC startup that launched in late 2020 whose
focus is making it "super easy" to manage secrets. One thing that
may give users pause is that secrets are sent to Doppler in plaintext.</li>
<li>EnvKey has a different security model—it takes a "zero trust"
approach and encrypts secrets client-side before they are sent over
the network. Like Doppler, EnvKey is easy to get started with, but
it is not as feature-rich: for example, it lacks secret versioning
and the ability to segregate permissions on a per-project basis.</li>
<li>SecretHub has client-side encryption and is feature-rich, but
it is complex. SecretHub also redacts secrets from standard output and standard error,
which helps to prevent secrets being visible in logs locally and/or
in any logs that might be shipped off to third parties.</li>
</ul>
<p>
Ultimately, all commercial solutions are third parties that you have
to trust. Many teams prefer using open-source software for a variety
of reasons, and when it comes to secrets management, there is one
strong reason to: you have full control over the system.
</p>
<br>
<h4>Open-source solutions</h4>
<p>
There are open-source solutions out there, ranging from utility-like
tools that tend to require you do a lot to get up and running, to
more complete solutions with UIs and built-in access control. The
latter, though, tend to be targeted toward specific use cases. For
example, Confidant, which was developed by Lyft in 2015, has a
nice UI and intuitive access control, but it is Docker-centric and
AWS-centric; it assumes you are using Docker and AWS roles for authorization. An example of a more
utility-like tool is credstash, which, like Confidant, uses AWS under
the hood. But it has limited functionality. For example, it doesn't offer logs or the
ability to segregate secrets by project and environment. It also
requires a fair bit of setup. For example, you need to have an AWS KMS key
and your developers all need AWS credentials.
</p>
<br>
<h4>AWS Secrets Manager</h4>
<p>
AWS Secrets Manager came out in 2018, and if you're AWS-native, it may
be perfect for your team. However, if your team doesn't already use
AWS services, it's not exactly a plug-and-play solution. Navigating
the AWS ecosystem presents a steep learning curve in itself, and
Secrets Manager does not come with access control set up for you out
of the box.
</p>
<br>
<h4>Summary of existing solutions</h4>
<p>
Existing solutions can be categorized in very broad strokes as
‘lightweight’ or ‘heavyweight’, where lightweight solutions emphasize
getting started quickly and heavyweight solutions emphasize
features.
</p>
<img
src="assets/images/case-study/existing-solutions.png"
class="case-study-image"
/>
<p>
Even in the lightweight category, there’s some diversity. For
example, Doppler emphasizes usability, while EnvKey
emphasizes security. The heavyweight ones, such as SecretHub
and Vault, tend to offer more features but at the cost of greater
complexity. There are also several other open-source solutions, but
they either have a lot of overhead or are built for a niche use
case.
</p>
<br />
<h3>3.5 A new solution</h3>
<p>
While the solutions above provide some helpful ways to manage
application secrets, we found none of them ideal for small teams
that want to hit the ground running. Because of this, we built Haven.
</p>
<img
src="assets/images/case-study/existing-solutions-with-haven.png"
class="case-study-image"
/>
<br>
<br>
<h2>4 Introducing Haven</h2>
<br>
<p>
Haven is an open-source secrets manager built with small teams and
ease of use in mind. It protects application secrets using best
practices, plus it’s easy to integrate and use in your applications.
</p>
<br>
<p>
After we identified the components we’d need to build a good secrets manager, we realized
AWS had trusted and long-standing services for some crucial
components.
</p>
<br>
<ol>
<li>
Since secrets need to be encrypted and decrypted with an
encryption key and "successful key management is critical to the
security of a cryptosystem" (<a
href="https://en.wikipedia.org/wiki/Key_management"
>Wikipedia</a
>), we opted to use the highly vetted AWS Key Management Service
(KMS). This is the only AWS service Haven uses that does not
have a free tier: KMS costs $1 per month per key.
</li>
<br>
<li>
Authentication and authorization is another crucial piece, and we
chose AWS Identity and Access Management (IAM). Every time an
entity makes a request to a non-public AWS resource, the request
goes through IAM first. Using IAM as the gatekeeper for all
storage and encryption logic meant that we could ensure only
entities we authorized could read secrets, write secrets, and so
on.
</li>
<br>
<li>
Since we wanted IAM to be the gatekeeper for storage, we also use
AWS for storage, and we had several storage options to choose
from. Haven sits in the critical path of your application being
served, so low latency and high availability were important. (On
the other hand, scalability was not a concern for us.) We chose
Amazon DynamoDB because it has good documentation, high availability, and
single-digit-millisecond latency.
</li>
</ol>
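<br>
<p>
To give a feel for how IAM can act as the gatekeeper for both storage and
encryption, here is a sketch of a read-only policy document built in
JavaScript. The document structure and the ARN formats follow the IAM JSON
policy format, but the function name, the table name, the key ID, and the
choice of actions are assumptions for illustration, not the policies Haven
actually provisions.
</p>

```javascript
// Build an IAM policy document that allows read-only access to one table of
// secrets and decryption with one KMS key. All parameter values used below
// are hypothetical.
function buildReadOnlyPolicy({ region, accountId, tableName, keyId }) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "ReadSecrets",
        Effect: "Allow",
        Action: ["dynamodb:GetItem", "dynamodb:Query"],
        Resource: `arn:aws:dynamodb:${region}:${accountId}:table/${tableName}`,
      },
      {
        Sid: "DecryptSecrets",
        Effect: "Allow",
        Action: ["kms:Decrypt"],
        Resource: `arn:aws:kms:${region}:${accountId}:key/${keyId}`,
      },
    ],
  };
}

const policy = buildReadOnlyPolicy({
  region: "us-east-1",
  accountId: "123456789012",
  tableName: "BlueJay-dev-secrets",
  keyId: "00000000-0000-0000-0000-000000000000",
});
console.log(JSON.stringify(policy, null, 2));
```

<p>
An entity holding only this policy could read and decrypt secrets from that
one table, but could not write to it or touch any other table, because IAM
denies anything that is not explicitly allowed.
</p>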
<br>
<p>
Although we use AWS under the hood, you don't need cloud expertise
to productively use Haven. The Haven Admin needs an AWS
account—that’s it.
</p>
<br />
<h3>4.1 How does Haven work?</h3>
<p>
The architecture of a Haven instance can be split up into two
components: the client side and the corresponding AWS infrastructure
side. On the client side, each user (be it a Haven Admin, a
developer, or an application server) uses the Haven application to
interact with the instance’s secrets. All of these users have Haven
installed on their own machines and use it to interact
with the same AWS infrastructure, albeit with varying levels of
permissions.
</p>
<br />
<img src="assets/images/case-study/architecture.png" class="case-study-image large-image">
<br />
<p>
To showcase how Haven works, let’s walk through a common workflow,
including how you set up Haven as an Admin, how you add a project to
Haven, how you add developers to your Haven projects, and how you
run your applications with Haven.
</p>
<br>
<h4>Setting up Haven</h4>
<p>
The Haven Admin is the person responsible for creating every project
and every user, assigning permissions to the users, and reviewing
access logs.
</p>
<br>
<p>
Haven offers both a UI and CLI, which share the same Haven Core
package under the hood. To get started with the Haven CLI, you
install the haven-secrets-cli package from npm. After installing the
npm package, you run <code>haven setup</code>, which makes you
the Haven Admin.
</p>
<br>
<p>
During setup, Haven connects to your AWS account and provisions the
backend resources for creating projects and their environments,
adding users, setting permissions, and adding and updating secrets.
Note that your AWS account is the <em>only</em> place your secrets
will ever be stored with Haven—there is no external Haven server
with your secrets.
</p>
<img
src="assets/images/case-study/admin-setup.png"
class="case-study-image large-image"
/>
<p>
Haven provisions a file called havenAccountFile, which contains your
Haven credentials. All you need to do is place this havenAccountFile
in your home directory.
</p>
<br>
<h5>Integrating your projects</h5>
<p>
Let’s say you have an application called BlueJay that you want to
integrate with Haven. Note that as the Haven Admin, you are the only
person who will be able to create or delete projects. After you run
<code>haven createProject BlueJay</code>, Haven provisions a
DynamoDB table and a set of IAM permission groups for your application,
BlueJay.
</p>
<img
src="assets/images/case-study/admin-create-bluejay.png"
class="case-study-image large-image"
/>
<p>Next, you add all your secrets for the BlueJay project.</p>
<img
src="assets/images/case-study/admin-put-bluejay.png"
class="case-study-image large-image"
/>
<p>Your project BlueJay is now integrated with Haven.</p>
<br>
<h4>Adding developers to Haven projects</h4>
<img
src="assets/images/case-study/admin-create-user.png"
class="case-study-image large-image"
/>
<p>
When you create a user, Haven provisions temporary user credentials.
They’re saved to your computer, and you’ll then send this file to
the intended user.
</p>
<img
src="assets/images/case-study/admin-temp-credentials.png"
class="case-study-image large-image"
/>
<p>
Each developer has to install the haven-secrets-cli package from npm
on their personal computer. Let’s switch over to the new user’s
point of view, where we see that they’ve received the temporary
credentials.
</p>
<img
src="assets/images/case-study/dev-temp-credentials.png"
class="case-study-image large-image"
/>
<p>
The developer puts this file in their home directory. Note that
Haven users other than the admin don't need an AWS account since
they'll be connecting to the Haven Admin’s AWS account. Initially,
the developer can’t interact with any projects and secrets. The
developer must run <code>haven userSetup</code> on their computer
after placing their havenAccountFile in their home directory. Haven
will then fetch their permanent credentials.
</p>
<img
src="assets/images/case-study/dev-perm-credentials.png"
class="case-study-image large-image"
/>
<p>
They are now able to start interacting with Haven based on the
permissions you give to them. You’re able to grant them read and/or
write permissions for secrets on a per-project, per-environment
basis. (Granting permissions is an admin-only capability.) Depending
on their permissions, they are able to create, update, and/or read
secrets, and also run the application locally using
<code>haven run</code>. Below, we depict the developer fetching a
secret that you, the admin, have stored.
</p>
<img
src="assets/images/case-study/dev-get.png"
class="case-study-image large-image"
/>
<p>
If the developer is not authorized to access that particular secret,
they’re denied access:
</p>
<img
src="assets/images/case-study/dev-get-blocked.png"
class="case-study-image large-image"
/>
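<p>
The per-project, per-environment decision described above can be
modeled in a few lines. This is only a minimal sketch of the idea, not
Haven’s implementation: Haven enforces permissions through IAM in the
admin’s AWS account, and the <code>Grant</code> shape, the
<code>isAllowed</code> helper, the user name, and the environment
names below are all hypothetical.
</p>

```typescript
// The two kinds of access an admin can grant on secrets.
type Action = "read" | "write";

// One grant: a user may perform one action on one project's environment.
// (Hypothetical shape, used only for this illustration.)
type Grant = { user: string; project: string; environment: string; action: Action };

// A request is allowed only if a grant matches it on every field.
function isAllowed(
  grants: Grant[],
  user: string,
  project: string,
  environment: string,
  action: Action
): boolean {
  return grants.some(
    (g) =>
      g.user === user &&
      g.project === project &&
      g.environment === environment &&
      g.action === action
  );
}

// Hypothetical grants: a developer who may only read BlueJay's "dev" secrets.
const grants: Grant[] = [
  { user: "dev1", project: "BlueJay", environment: "dev", action: "read" },
];

console.log(isAllowed(grants, "dev1", "BlueJay", "dev", "read")); // true
console.log(isAllowed(grants, "dev1", "BlueJay", "prod", "read")); // false
console.log(isAllowed(grants, "dev1", "BlueJay", "dev", "write")); // false
```

<p>
Anything without a matching grant is denied by default, which mirrors
the blocked request shown above.
</p>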
<br>
<h4>Using Haven in your application</h4>
<p>
Finally, let’s look at how you can use Haven in your applications.
We assume you run your application on a server, or rather, in an
environment that has a filesystem. The Haven Admin creates a “server
user” under their AWS account, which is really just another Haven
user, like a developer. As we saw in the previous section, the
Haven Admin receives a havenAccountFile for the new server user. The
Haven Admin SSHs into the server where your application runs,
installs Haven globally or as a dependency in your project, and
places the havenAccountFile in the home directory of the
operating-system user that the application will run as. Then, the
Haven Admin will run <code>haven userSetup</code>, just as the
developer did in the previous section.
</p>
<br>
<p>
This server is now able to start interacting with Haven based on the
permissions you give it. You can run your application with
<code>haven run</code>.
</p>
<img src="assets/images/cli.gif" class="case-study-image" />
<br>
<br>
<h2>5 Building Haven</h2>
<br>
<p>
In section 3.2, we noted that there are three questions you should
ask about any secrets manager. The decisions we made and the
challenges we faced in building Haven can be described pretty well
by answering those questions:
</p>
<ul>
<li>How does it keep your secrets <strong>safe</strong>?</li>
<li>How does it let you <strong>share access</strong> safely?</li>
<li>
How do <strong>applications</strong> actually
<strong>get secrets</strong>?
</li>
</ul>
<br />
<h3>5.1 Keeping your secrets safe</h3>
<br>
<h4>Solving the “master key” problem</h4>
<p>
The very notion of encrypting your secrets has an inherent problem:
what do you do with the encryption key? Assume you use symmetric
encryption, so the encryption key both encrypts and decrypts your
secrets. But then that encryption key is itself a secret—and a
particularly sensitive one, since it can unlock all of your secrets.
You might try to encrypt that key with another key, but that would
be yet another key that you have to encrypt.
</p>
<img
src="assets/images/case-study/encryption-problem.png"
class="case-study-image large-image"
/>
<p>
One way to solve this problem is to have a trusted third-party
service store an encryption key that you don’t have physical access
to. Instead, you dictate who has permissions to use it to perform
encryption and decryption operations. For Haven, that trusted
third-party service is the battle-tested AWS Key Management Service
(KMS). We use KMS to store this key, which we’ll call a “master
key”, and limit encryption/decryption access to it via AWS IAM
policies. We don’t need to worry about safely storing this master
key since AWS handles that. Now let’s see why “master key” is an
appropriate name (hint: it decrypts <em>other</em> encryption keys).
</p>
<br>
<h4>Key wrapping</h4>
<p>
Key wrapping is an encryption best practice and refers to the
technique of using two or more layers of keys to protect your data.
It involves generating a unique data encryption key for each secret
and encrypting the secret using that encryption key. Then the data
encryption key is encrypted by the master encryption key. The
encrypted key and encrypted secret are then stored until decrypted
later. To decrypt your data, you perform this process in reverse:
decrypting the data encryption key with the master key and then
using the data encryption key to decrypt your secret data. Key
wrapping is also sometimes called envelope encryption.
</p>
<img
src="assets/images/case-study/envelope-encryption.png"
class="case-study-image large-image"
/>
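<p>
To make the two layers concrete, here is a minimal sketch of key
wrapping using Node’s built-in <code>crypto</code> module with
AES-256-GCM. It is an illustration under stated assumptions, not how
Haven encrypts secrets: Haven delegates this work to the AWS
Encryption SDK, and the master key stays inside KMS rather than in
local memory. The names <code>seal</code>, <code>open</code>,
<code>wrapSecret</code>, <code>unwrapSecret</code>, and
<code>localMasterKey</code> are hypothetical.
</p>

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Ciphertext plus the nonce and authentication tag AES-256-GCM needs to decrypt.
type Sealed = { iv: Buffer; tag: Buffer; data: Buffer };

// What gets stored: the encrypted secret next to its encrypted data key.
type Envelope = { wrappedKey: Sealed; secret: Sealed };

function seal(key: Buffer, plaintext: Buffer): Sealed {
  const iv = randomBytes(12); // fresh 96-bit nonce for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

function open(key: Buffer, sealed: Sealed): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAuthTag(sealed.tag); // final() throws if the key or data is wrong
  return Buffer.concat([decipher.update(sealed.data), decipher.final()]);
}

// Encrypt the secret with a fresh data key, then encrypt that key with the master key.
function wrapSecret(masterKey: Buffer, secret: string): Envelope {
  const dataKey = randomBytes(32); // unique data encryption key per secret
  return {
    secret: seal(dataKey, Buffer.from(secret, "utf8")),
    wrappedKey: seal(masterKey, dataKey),
  };
}

// Reverse the process: recover the data key first, then the secret.
function unwrapSecret(masterKey: Buffer, envelope: Envelope): string {
  const dataKey = open(masterKey, envelope.wrappedKey);
  return open(dataKey, envelope.secret).toString("utf8");
}

// Hypothetical local stand-in for the master key (assumption: in Haven it lives in KMS).
const localMasterKey = randomBytes(32);
const envelope = wrapSecret(localMasterKey, "example-secret-value");
console.log(unwrapSecret(localMasterKey, envelope));
```

<p>
Note that the master key only ever touches the 32-byte data key,
never the secret itself.
</p>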
<p>
Key wrapping has two advantages. First, it’s harder to brute force
the encrypted data, since each secret is encrypted using a different
key. Second, you reduce the attack surface area: the master key
never sees your plaintext data, only plaintext data encryption
keys, so an attacker would need access to both your secrets storage
and the master key. In addition, there’s one less instance of
your plaintext secrets traveling along the wire. You may be
wondering what you do with these encrypted data encryption keys: you
store them alongside the encrypted secret itself, often in the same
database row.
</p>
<br>
<h4>Implementing encryption best practices</h4>
<p>
It's a common saying in software that you shouldn't "roll your own
crypto"—you should use a vetted cryptographic library. We use the
AWS Encryption SDK, a client-side encryption library, because it
adheres to cryptography best practices like key wrapping. The AWS
Encryption SDK requires a master key, so Haven uses the master key
in AWS KMS that it creates for you during initial setup.
</p>
<br>
<p>
Haven follows the best practices of encrypting your secrets
client-side, in transit, and at rest. When you add or update a
secret, it's first encrypted on the client using the SDK and then
sent encrypted in transit via TLS to be stored in Amazon DynamoDB,
where it is encrypted at rest.
</p>
<br>
<h4>Our encryption scheme as a whole</h4>
<img
src="assets/images/case-study/haven-envelope-encryption.png"
class="case-study-image large-image"
/>
<p>
The diagram above shows Haven’s encryption scheme from start to
finish. First, to encrypt a secret, a unique data encryption key is
generated and used to encrypt the secret on the client side, as
seen in the top right. Then, as shown in the top left, that data
encryption key is encrypted using the single master key stored in
KMS. Both encrypted pieces of information are then sent to DynamoDB
over TLS and stored alongside each other, as shown at the bottom of
the diagram. Thus Haven encrypts your data on the client side, in
transit, and at rest in DynamoDB.
</p>
<br>
<h4>Storing and fetching secrets</h4>
<p>
As we see below, Haven first makes a request to the AWS Encryption
SDK to encrypt a secret. The SDK checks that the caller has
the IAM permission to encrypt, and if so, generates a data
encryption key, encrypts the secret value with it, and then encrypts
the data encryption key with the master key. Then, Haven takes this
encrypted data and (if the user has permission) stores it in the
database. Haven also stores the secret’s name, version number, and
whether the secret is flagged in that same row.
</p>
<img
src="assets/images/case-study/put-secret.png"
class="case-study-image large-image"
/>
<p>
When you fetch a secret, Haven first fetches the encrypted secret
from the database, then decrypts it using the Encryption SDK.
</p>
<img
src="assets/images/case-study/get-secret.png"
class="case-study-image large-image"
/>
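<p>
The put and get flows above can be sketched as follows. This is a
simplified model under several assumptions, not Haven’s code: an
in-memory object stands in for the project’s DynamoDB table, a single
local AES-256-GCM key stands in for the Encryption SDK and KMS,
permission checks are omitted, and we assume each put appends a new
version. The names <code>SecretRow</code>, <code>Table</code>,
<code>putSecret</code>, and <code>getSecret</code> are hypothetical.
</p>

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// One stored row: metadata in the clear, the secret value only as ciphertext.
type SecretRow = {
  name: string;
  version: number;
  flagged: boolean;
  encrypted: string; // base64 of nonce + auth tag + ciphertext
};

// Hypothetical in-memory table standing in for a project's DynamoDB table.
type Table = { [name: string]: SecretRow[] };

// Stand-in for the Encryption SDK: AES-256-GCM under a single local key.
function encrypt(key: Buffer, value: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(value, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), data]).toString("base64");
}

function decrypt(key: Buffer, encrypted: string): string {
  const raw = Buffer.from(encrypted, "base64");
  // Layout written by encrypt(): 12-byte nonce, 16-byte tag, then ciphertext.
  const decipher = createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  const plain = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]);
  return plain.toString("utf8");
}

// Encrypt on the client first, then store the ciphertext with its metadata.
function putSecret(table: Table, key: Buffer, name: string, value: string): SecretRow {
  const versions = table[name] || [];
  table[name] = versions;
  const row: SecretRow = {
    name,
    version: versions.length + 1, // assumption: each put appends a new version
    flagged: false,
    encrypted: encrypt(key, value),
  };
  versions.push(row);
  return row;
}

// Fetch the encrypted row first, then decrypt it on the client.
function getSecret(table: Table, key: Buffer, name: string): string {
  const versions = table[name] || [];
  const latest = versions[versions.length - 1];
  if (!latest) throw new Error("no secret named " + name);
  return decrypt(key, latest.encrypted);
}
```

<p>
The plaintext never reaches the table in either direction: encryption
happens before the write, and decryption happens after the read.
</p>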
<br>
<h4>Running the UI web application on localhost</h4>
<img
src="assets/images/case-study/localhost-ui.png"
class="case-study-image large-image"
/>
<p>
We chose to run the Admin/Developer UI Dashboard from localhost in order to avoid the security issues that any application running on the public web faces. [7] We were inspired by EnvKey,
whose FAQ states:
</p>
<blockquote>
Unfortunately, it's still not possible to implement true
zero-knowledge end-to-end encryption on the web. Apart from a
fundamental chicken-and-egg problem when it comes to server trust,
there's no way to protect against all those ever-so-convenient
browser extensions that so many folks have given full-page
permissions. [8]
</blockquote>
<p>
A second reason we run the UI app locally is to make it clear that
Haven does not have a backend “Haven” server, so we could not snoop