Commit
Merge branch 'main' into binder-server
Showing 8 changed files with 204 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,7 +19,7 @@ organizations: | |
- name: Project Jupyter | ||
url: "https://jupyter.org" | ||
- name: The Turing Way | ||
url: "https://the-turing-way.netlify.app/" | ||
url: "https://the-turing-way.org/" | ||
|
||
# Short bio (displayed in user profile at end of posts) | ||
bio: | ||
|
@@ -36,9 +36,6 @@ social: | |
- icon: envelope | ||
icon_pack: fas | ||
link: 'mailto:[email protected]' | ||
- icon: twitter | ||
icon_pack: fab | ||
link: https://twitter.com/drsarahlgibson | ||
- icon: github | ||
icon_pack: fab | ||
link: https://github.com/sgibson91 | ||
|
@@ -50,10 +47,11 @@ social: | |
# Set this to `[]` or comment out if you are not using People widget. | ||
user_groups: | ||
- Product and Services Team | ||
|
||
--- | ||
|
||
Sarah Gibson is an Open Source Infrastructure Engineer at 2i2c, an open source contributor and advocate. | ||
She holds more than two years of experience as a Research Engineer at a national institute for data science and artificial intelligence, as well as holding a core contributor role in the open source projects [Binder](https://jupyter.org/binder), [JupyterHub](https://jupyter.org/hub), and [_The Turing Way_](https://the-turing-way.netlify.app/). | ||
She holds more than two years of experience as a Research Engineer at a national institute for data science and artificial intelligence, as well as holding a core contributor role in the open source projects [Binder](https://jupyter.org/binder), [JupyterHub](https://jupyter.org/hub), and [_The Turing Way_](https://the-turing-way.org). | ||
Sarah is passionate about working with domain experts to leverage cloud computing in order to accelerate cutting-edge, data-intensive research and disseminating the results in an open, reproducible and reusable manner. | ||
|
||
Sarah holds a Fellowship with the [Software Sustainability Institute](https://software.ac.uk) and advocates for best software practices in research. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
--- | ||
title: "Announcing our formal commitment to open technology" | ||
date: "2025-01-15" | ||
banner: | ||
image: "" | ||
authors: ["Yuvi Panda", "Chris Holdgraf"] | ||
tags: [open-source] | ||
categories: [organization] | ||
featured: false | ||
draft: false | ||
--- | ||
|
||
In this post, we're sharing our [Commitment to Open Technology](../../../open-technology/index.md). It is focused on _software licenses_ for reasons we'll describe below. We hope that it clarifies what kind of licenses we'll use, and assures our communities that we will not change our stance towards open source technology in the future. This ensures 2i2c's long-term commitment to community-owned and open infrastructure. | ||
|
||
Being a platform and service provider gives us a lot of power, and also introduces a potential source of _lock-in_ for our member communities. While 2i2c's organizational mission and culture are strongly aligned with open infrastructure, we believe it's important to encode commitments like these in a formal way to provide both transparency and accountability to our member communities. | ||
|
||
## Our commitment to open technology | ||
|
||
Below we copy the original language of this policy from our [Commitment to Open Technology](../../../open-technology/index.md): | ||
|
||
<!-- TODO: When we switch to MyST, we should embed this rather than copy/paste --> | ||
|
||
_Definitions of MUST, MUST NOT, SHOULD, MAY, etc are defined in [RFC 2119](https://tools.ietf.org/html/rfc2119)_ | ||
|
||
1. All engineering artifacts (code, documentation, etc) produced by 2i2c's engineering team MUST be licensed under an open source license approved by a non-profit organization that is not 2i2c. | ||
2. Open Source Projects originating at 2i2c, or stewarded by 2i2c, MUST NOT require a [Contributor Licensing Agreement](https://en.wikipedia.org/wiki/Contributor_License_Agreement) that includes Copyright Assignment to 2i2c. | ||
3. The list of external organizations that define licenses we accept are | ||
1. [the Open Source Initiative](https://opensource.org/) | ||
2. the [Organization for Ethical Source](https://ethicalsource.dev/). | ||
4. Modifying (1), (2), or (3) MUST be done through a 2/3 majority vote of 2i2c staff. | ||
|
||
## What does this commitment mean? | ||
|
||
In plain language, here's what this commitment means: | ||
|
||
1. We'll only use open source licenses that have been approved by standard non-profits that are broadly recognized by the tech industry. | ||
2. For anything we build, we won't require contributors to give up the rights to their contributions via CLAs, so that it is much harder for 2i2c to change our licenses in the future. | ||
3. Changing this policy will require organization-wide agreement, and in the future we'll give authority over this policy to a group of people representing our member communities. | ||
|
||
## Why are licenses and CLAs important? | ||
|
||
Many organizations claim to be committed to open infrastructure, while retaining the ability to _change this commitment in the future when it is in their interests_. A classic example of this is a "bait and switch" that looks something like this: | ||
|
||
1. A company releases software under an open source license and professes to build an open source community around it. | ||
2. However, they retain the rights to all of the code in their projects through a [Contributor License Agreement](https://en.wikipedia.org/wiki/Contributor_License_Agreement) (CLA) with copyright assignment. This generally means that contributors must _give up the rights to their contribution_ in order to make that contribution. | ||
3. Once their product has gained traction and it is in their interests, the company can _change the license_ to whatever they wish (even one that is not open source) because they retain the rights to all contributions in the codebase. | ||
4. They then leverage this new position as owners of a proprietary project to extract business value or grow their position in a market. | ||
|
||
Think this sounds unlikely? Here are just a few recent examples of companies that have switched their license after many years of releasing their technology under an open source license: | ||
|
||
- [Redis](https://redis.io/blog/redis-adopts-dual-source-available-licensing/) | ||
- [HashiCorp / Terraform](https://www.hashicorp.com/blog/hashicorp-adopts-business-source-license) | ||
- [Elasticsearch](https://en.wikipedia.org/wiki/Elasticsearch#Licensing_changes) | ||
|
||
We want to assure our communities that 2i2c is not headed down this path, so that they can confidently treat us as a long-term service partner. | ||
|
||
## What does this change about 2i2c's open source commitment? | ||
|
||
In short: nothing. These are the principles that 2i2c has been committed to since its inception, and they were already implied by our [Right to Replicate](../../../right-to-replicate/). However, we wanted to make these commitments formally in order to hold ourselves more accountable for sticking to them, and to provide more transparency for our community members and stakeholders. | ||
|
||
## Who is this for? | ||
|
||
We imagine three audiences for this policy: | ||
|
||
1. **2i2c present and future staff** who want to ensure that their organization remains committed to our open principles. This document provides a sense of psychological safety to have bold discussions about structuring our approach to open source. | ||
2. **Member communities and 2i2c stakeholders** who need to have an understanding of the guarantees that we provide in order to trust 2i2c as a service developer and provider. This is similar to the effect our [Right to Replicate](/right-to-replicate) has. | ||
3. **Open source communities** who need to understand our long-term commitment and goals around open technology in order to trust us as a peer and collaborator. | ||
|
||
## We'd love feedback | ||
|
||
We hope that this post clarifies both our intent and the reasons we think it's important. We'd love feedback on early refinements to these principles that would make them more effective, as well as on ways that we can provide more community oversight and participation in evolving these policies. If you have any thoughts to share, please send them by e-mail to [[email protected]](mailto:[email protected]). | ||
|
||
--- | ||
|
||
**Acknowledgements**: _The creation of this policy and the rationale behind it was led by [Yuvi Panda](../../../authors/yuvi-panda/) with feedback from 2i2c's team. This blog post was co-written with [Chris Holdgraf](../../../authors/chris-holdgraf)._ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
--- | ||
title: "Designing for an ecosystem: a case study in cross-project open source contribution" | ||
date: "2025-01-21" | ||
authors: ["Chris Holdgraf", "Angus Hollands"] | ||
tags: [open source] | ||
categories: [impact] | ||
featured: false | ||
draft: false | ||
--- | ||
|
||
A key challenge in the open source space is that projects are often independent and autonomous, with relatively few formal ways to collaborate and coordinate efforts. While this usually isn't a big deal, it means missed opportunities to grow the impact of an ecosystem whenever that growth requires coordinated development among multiple stakeholders. | ||
|
||
This is one of the reasons we created 2i2c's open community hub platform. By deploying a single platform built entirely on open infrastructure that we contribute back to, we gain visibility across a variety of projects, along with the need to combine them for a specific end-user outcome. One such development scenario recently came up involving [Jupyter Book 2][jb2] and [JupyterHub](https://jupyterhub.org/). | ||
|
||
## Allowing readers to "bring their own Binders" | ||
|
||
We've recently been working to integrate [Jupyter Book 2][jb2] workflows with our community hubs for a more seamless experience (for example, having book pages link back to interactive cloud sessions that allow users to interact with the content). We imagine a network of Jupyter Books that all build upon the same core infrastructures (JupyterHub, Binder, etc) for cloud-based computing. Our hope is to allow a user to _bring their own Binder_ with them so that they can interact with another book's content with their own cloud infrastructure. For example: | ||
|
||
- A student with access to `binder.myuniversity.edu` could read a Jupyter Book created by a professor at `otheruniversity.edu`. | ||
- The Jupyter Book is defined with a [Binder specification](https://repo2docker.readthedocs.io/en/latest/specification.html) that has a recipe for re-building the environment needed to run the book's content. | ||
- From the professor's book, the student can choose to launch an interactive Binder session on _their university's Binder_, allowing them to interact with the book's content on their own infrastructure. | ||
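To make the idea concrete, here is a minimal sketch of how a launch link could be assembled for whichever BinderHub the reader chooses. It assumes the book's source lives in a GitHub repository and relies on BinderHub's `/v2/gh/<org>/<repo>/<ref>` URL form. The function name, the hostnames, and the repository are hypothetical and are not taken from Jupyter Book or 2i2c code.

```python
from urllib.parse import quote


def binder_launch_url(binder_url: str, org: str, repo: str, ref: str = "HEAD") -> str:
    """Build a launch link for a GitHub repository on a reader-chosen BinderHub.

    `binder_url` is whichever BinderHub the reader has access to, so the same
    book can be opened on different infrastructure.
    """
    base = binder_url.rstrip("/")
    # Quote each path segment so that characters such as "/" in a ref
    # cannot change the structure of the URL.
    segments = [quote(part, safe="") for part in (org, repo, ref)]
    return f"{base}/v2/gh/" + "/".join(segments)


# The same (hypothetical) book repository, launched on two different Binders.
print(binder_launch_url("https://binder.myuniversity.edu", "professor", "course-book"))
print(binder_launch_url("https://mybinder.org/", "professor", "course-book", ref="main"))
```

Only the first argument changes between the two calls, which is the point of the workflow: the book describes the environment, and the reader supplies the infrastructure.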
|
||
We want a workflow like this to be as seamless and uncomplicated as possible. We also want it to follow the same fundamental workflow as the [nbgitpuller-based launch buttons](https://docs.2i2c.org/community/content/). Along the way, we realized that we needed to coordinate development across [Jupyter Book 2][jb2], [JupyterHub](https://jupyter.readthedocs.io), and [BinderHub](https://binderhub.readthedocs.io). | ||
|
||
{{< figure src="./featured.png" caption="The three projects (Jupyter Book, BinderHub, and JupyterHub) that needed to work together to enable 'bring your own binderhub' workflows." >}} | ||
|
||
## Getting Jupyter Book to discover JupyterHub | ||
|
||
As we began developing this workflow, we realized that there was a blocker in the JupyterHub and BinderHub ecosystem that needed to be fixed. We needed a way to **ask a JupyterHub whether it had an unauthenticated endpoint for service discovery**. Basically, a way to ask a hub "what kind of hub are you, and how can we launch an interactive session on you?" Doing this is simple enough: JupyterHub already has a way of reporting its version and application type, which allows us to infer how to launch interactive sessions. But we hit a snag when making the request from a web page. | ||
|
||
By default, JupyterHub disallows certain kinds of [Cross-Origin Resource Sharing](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) (CORS) requests, in order to restrict other applications from abusing a JupyterHub's API. If you hit parts of a JupyterHub API from _the command line_, things work fine. But if you do the same thing via JavaScript from a website, the request is disallowed. This is a problem if we want Jupyter Book (a web application) to be able to make requests to JupyterHub's API. | ||
|
||
So, we realized that we needed to make an **upstream contribution in JupyterHub** in order to **enable an interaction between JupyterHub and Jupyter Book**. In this case, it was a relatively simple fix: allowing CORS requests for the specific API endpoint we needed (which is a very lightweight endpoint that is not vulnerable to security risks, and is broadly useful to make accessible)[^1]. The work spanned three PRs: | ||
|
||
- https://github.com/jupyterhub/jupyterhub/pull/4966 allows CORS requests for the API that was needed for service discovery in JupyterHub. | ||
- https://github.com/jupyterhub/binderhub/pull/1906 enables this workflow on a BinderHub so that its services can be discovered. | ||
- https://github.com/jupyter-book/myst-theme/pull/503 adds new launch button functionality to [Jupyter Book 2][jb2] that allows readers to bring their own Binder / JupyterHub links for launching (this is what necessitated the two PRs above). | ||
|
||
[^1]: This actually required an interesting bit of team discussion that was much easier with a few 2i2c staff on the JupyterHub team. The original request from Angus was interpreted as opening up the _entire hub API_ to external requests (which is a bad idea!) but we were able to quickly discuss this with the JupyterHub team to clarify that this was only about a very specific API endpoint. This is the kind of communication loop that often goes haywire when you have people contributing to a project without historical relationships to the project's maintainers. | ||
|
||
As a result of this upstream contribution loop, JupyterHub can now accept API requests at its "service discovery" endpoint, which means that Jupyter Book (and any other web application) can more easily learn about a hub's capabilities and version. | ||
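As a rough illustration of what service discovery looks like from a client's point of view, here is a minimal sketch. It assumes that the endpoint in question is the JupyterHub REST API root at `/hub/api/`, which replies with a small JSON document containing a `version` field; the post does not name the endpoint, so treat that path as an assumption. The function names and the `kind` label are hypothetical.

```python
import json
from urllib.request import urlopen


def discovery_endpoint(hub_url: str) -> str:
    """Return the assumed unauthenticated info endpoint for a JupyterHub."""
    return hub_url.rstrip("/") + "/hub/api/"


def parse_hub_info(body: str) -> dict:
    """Extract the fields a launcher would need from the endpoint's JSON reply."""
    info = json.loads(body)
    version = info.get("version")
    if not isinstance(version, str):
        raise ValueError("response does not look like a JupyterHub info document")
    # "kind" is a label invented for this sketch, not a field JupyterHub returns.
    return {"kind": "jupyterhub", "version": version}


def discover(hub_url: str, timeout: float = 10.0) -> dict:
    """Query a hub from the command line, where CORS restrictions do not apply."""
    with urlopen(discovery_endpoint(hub_url), timeout=timeout) as response:
        return parse_hub_info(response.read().decode("utf-8"))


# Offline demonstration with a hand-written reply (not captured from a real hub).
print(parse_hub_info('{"version": "5.2.1"}'))
```

A script like this was never blocked, because CORS is enforced by browsers. The upstream change matters for the browser case, where the equivalent `fetch` call from a book page only succeeds if the hub sends the appropriate CORS headers for that endpoint.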
|
||
We wanted to share this short vignette because it's a good reflection of the kind of value that 2i2c tries to provide, given its role in helping to build and enhance networks of infrastructure, domain communities, and open source communities. In this case, we enabled a _cross-project_ workflow that required knowledge of each project, and a vision for how they could be used together in a way that exceeded the sum of their parts. | ||
|
||
We think there's a lot more potential in these kinds of workflows, and are eager to continue our work to identify and enhance community-centric infrastructure for interactive computing. | ||
|
||
[jb2]: https://next.jupyterbook.org/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
--- | ||
title: "Enforcing per-user storage quotas with `jupyterhub-home-nfs`" | ||
subtitle: "" | ||
summary: "" | ||
authors: ["Sarah Gibson"] | ||
tags: [open-source] | ||
categories: [impact] | ||
date: 2025-01-28T09:57:28+00:00 | ||
lastmod: 2025-01-28T10:10:14+00:00 | ||
featured: false | ||
draft: false | ||
|
||
# Featured image | ||
# To use, add an image named `featured.jpg/png` to your page's folder. | ||
# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight. | ||
image: | ||
caption: "" | ||
focal_point: "" | ||
preview_only: false | ||
|
||
# Projects (optional). | ||
# Associate this post with one or more of your projects. | ||
# Simply enter your project's folder or file name without extension. | ||
# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`. | ||
# Otherwise, set `projects = []`. | ||
projects: ["nasa-veda"] | ||
--- | ||
|
||
When sharing a storage disk between users, as is usually the case in a JupyterHub deployment, it is important to put guardrails in place so that one user cannot use up the storage capacity that everyone else depends on. | ||
To this end, 2i2c, in close collaboration with [Development Seed](https://developmentseed.org), has developed the [`jupyterhub-home-nfs` project](https://github.com/2i2c-org/jupyterhub-home-nfs), a Helm chart that enforces per-user quotas on storage space. | ||
|
||
{{% callout note %}} | ||
Note that this feature is currently available to AWS-hosted hubs only and will be rolled out to other cloud providers in the future. | ||
{{% /callout %}} | ||
|
||
Under the hood, the Helm chart runs [NFS Ganesha](https://github.com/nfs-ganesha/nfs-ganesha) as an in-cluster NFS server, backed by [XFS](https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-xfs) as the underlying filesystem. Storage quota is enforced through XFS's native quota management utility `xfs_quota`. | ||
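For readers unfamiliar with `xfs_quota`, the sketch below shows the kind of command that sets a hard storage limit. It assumes XFS project quotas with one project per home directory, already set up on the mounted filesystem; the post only says that `xfs_quota` is used, so this is an illustration of the mechanism and not the chart's actual implementation. The function name, mount point, project ID, and limit are hypothetical.

```python
import shlex


def xfs_quota_limit_command(mount_point: str, project: str, hard_limit: str) -> list[str]:
    """Build an `xfs_quota` invocation that sets a hard block limit on a project."""
    # -x enables expert mode, which the `limit` subcommand requires.
    # -c passes the subcommand; -p selects project quotas and
    # bhard is the hard limit on disk blocks (for example "10g").
    subcommand = f"limit -p bhard={hard_limit} {project}"
    return ["xfs_quota", "-x", "-c", subcommand, mount_point]


# Hypothetical values: project 1001 mapped to one user's home directory.
command = xfs_quota_limit_command("/export", "1001", "10g")
print(shlex.join(command))
```

Building the command as an argument list and quoting it only for display avoids shell-injection problems if the project identifier ever comes from user-controlled input.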
|
||
Since this feature moves our infrastructure away from managed filesystems (such as AWS's Elastic File System) that cannot support per-user storage quotas, we have also developed monitoring and alerting mechanisms that let us know when the disks are getting full, as well as automated backups for disaster recovery. | ||
|
||
If you would like to try this on your 2i2c-managed hub, [please get in touch](https://docs.2i2c.org/support/). | ||
|
||
This project can also be used with _any_ Kubernetes-based JupyterHub, as per our [Right to Replicate policy](https://2i2c.org/right-to-replicate/), so please try it out on your own deployment and let us know what you think! | ||
|
||
## Credit | ||
|
||
This project was developed and deployed in collaboration with [Tarashish Mishra](https://developmentseed.org/team/tarashish-mishra/) from [Development Seed](https://developmentseed.org). |