End of year retrospective 2024
choldgraf committed Dec 9, 2024
1 parent 23ce38d commit 907e0b9
Showing 11 changed files with 222 additions and 2 deletions.
4 changes: 2 additions & 2 deletions config/_default/menus.toml
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,12 @@
# The weight parameter defines the order that the links will appear in.

[[main]]
name = "Hub platform"
name = "Platform"
url = "platform/"
weight = 10

[[main]]
name = "Community impact"
name = "Impact"
url = "communities/"
weight = 11

Expand Down
Binary file added content/blog/2024/frx/images/leaderboard.png
62 changes: 62 additions & 0 deletions content/blog/2024/frx/index.md
@@ -0,0 +1,62 @@
---
title: "`frx-challenges`: A new tool to host data challenges for Frictionless Research Exchanges"
date: 2024-12-06
---

2i2c is pleased to announce the `frx-challenges` project, a new open source tool to help communities host data challenges on shared infrastructure:

https://github.com/2i2c-org/frx-challenges

This project aims to make it easier for communities to enable their users to **submit code and data** that are **evaluated on secure infrastructure with access to private data and resources**.

{{< figure
src="images/leaderboard.png"
width="75%"
caption="An example leaderboard for a data challenge, taken from the [Cellmap Challenge](https://cellmapchallenge.janelia.org/). Users make submissions that are run against secure and private infrastructure and data, and receive feedback about each submission's performance. Learn more about the FRX challenges project here: https://2i2c.org/frx-challenges/"
>}}

It is designed to be lightweight and flexible, and can be run on a variety of shared infrastructure. For those who wish to run this project on cloud infrastructure, we've also published a Helm Chart to help you deploy `frx-challenges` with Kubernetes:

https://2i2c.org/frx-challenges-helm-chart/

While it can be run on its own, we believe that it naturally complements other tools and services for interactive computing and data, such as **JupyterHub**, **Jupyter Book**, and **Binder**. More on that below.

Below is a brief description of the motivation behind this project.

## What are Frictionless Research Exchanges?

The project is heavily inspired by David Donoho's vision of **Frictionless Research Exchanges** (FRX) as described in [_Data Science at the Singularity_](https://arxiv.org/abs/2310.00865).

In this article, Donoho describes three key pillars for Frictionless Research Exchanges:

> The three initiatives are related but separate; and all three have to come together, and in a particularly strong way, to provide the conditions for the new era. Here they are:
>
> - [FR-1: Data] datafication of everything, with a culture of research data sharing. One can now find datasets publicly available online on a bewildering variety of topics, from chest x-rays to cosmic microwave background measurements to uber routes to geospatial crop identifications.
> - [FR-2: Re-execution] research code sharing including the ability to exactly re-execute the same complete workflow by different researchers.
> - [FR-3: Challenges] adopting challenge problems as a new paradigm powering scientific research. The paradigm includes: a shared public dataset, a prescribed and quantified task performance metric, a set of enrolled competitors seeking to outperform each other on the task, and a public leaderboard. Thousands of such challenges with millions of entries have now taken place, across many fields.

We considered the landscape of tools and services, and felt that [FR-1] and [FR-2] were already well-served by a variety of tools and services for community workspace infrastructure (e.g., JupyterHub: https://jupyterhub.readthedocs.io), sharable computational environments (e.g., BinderHub: https://binderhub.readthedocs.io), authoring and reading computational narratives (e.g., Jupyter Book: https://jupyterbook.org and MyST: https://mystmd.org), and data I/O tools and standards (e.g., Zarr: https://zarr.readthedocs.io and Intake: https://intake.readthedocs.io).

However, there was a natural missing piece for **[FR-3 Challenges]**, and we could not identify any community-managed infrastructure that facilitated data challenges. Filling this gap is the goal of `frx-challenges`.

## Why facilitate data challenges?

Data challenges are harder than you think! While it is simple enough to run somebody else's code locally, data challenges require a systematic, secure, and automated approach to accepting and evaluating submissions in a fair and repeatable way. Here are some of the big challenges to tackle:

- **Submissions must retain user and team identity**, which means that we must keep track of users and their submissions over time, since data challenges are designed to encourage iterative improvement and optimization.
- **Evaluations must use potentially complex resources and data**, since many data challenges operate by publicly sharing a small dataset and then evaluating submissions against a much more complex dataset.
- **Evaluations must be totally secure**, so that submissions can't do nefarious things like mine cryptocurrency or gain access to the challenge's private data.
- **Evaluations must be automated**, so that running the challenge does not require extensive human intervention and can scale to many users.
- **Evaluations must be flexible**, so that the infrastructure can accept a variety of submission types (e.g., code, data, or model weights), run them in arbitrary environments designed by the organizers, and run them on the right hardware to get the job done.
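
To make a few of these requirements concrete, here is a minimal sketch in Python of the bookkeeping behind a challenge: submissions keep their team identity, scoring is automated against data that participants never see, and a leaderboard reports each team's best result. This is an illustration only and is not how `frx-challenges` is implemented; the names `Submission`, `Challenge`, `accuracy`, and `PRIVATE_LABELS` are hypothetical, and the sketch does nothing to sandbox untrusted code, which is the hard part in practice.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Submission:
    team: str           # identity travels with every submission
    payload: List[int]  # stand-in for code, data, or model weights


@dataclass
class Challenge:
    # Organizer-supplied scoring function; higher is better in this sketch.
    evaluate: Callable[[List[int]], float]
    # Every score a team has received, in submission order.
    history: Dict[str, List[float]] = field(default_factory=dict)

    def submit(self, submission: Submission) -> float:
        """Score a submission without human intervention and record it."""
        score = self.evaluate(submission.payload)
        self.history.setdefault(submission.team, []).append(score)
        return score

    def leaderboard(self) -> List[Tuple[str, float]]:
        """Return (team, best score) pairs, best first."""
        best = {team: max(scores) for team, scores in self.history.items()}
        return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Hypothetical private labels held by the organizers.
PRIVATE_LABELS = [1, 0, 1, 1, 0]


def accuracy(predictions: List[int]) -> float:
    """Fraction of predictions that match the private labels."""
    matches = sum(p == t for p, t in zip(predictions, PRIVATE_LABELS))
    return matches / len(PRIVATE_LABELS)


challenge = Challenge(evaluate=accuracy)
challenge.submit(Submission("team-a", [1, 0, 0, 0, 0]))  # 3 of 5 correct
challenge.submit(Submission("team-a", [1, 0, 1, 1, 0]))  # 5 of 5 correct
challenge.submit(Submission("team-b", [1, 1, 1, 1, 0]))  # 4 of 5 correct
print(challenge.leaderboard())  # [('team-a', 1.0), ('team-b', 0.8)]
```

Keeping the full score history per team, rather than only the best score, is what lets a challenge show iterative improvement over time.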

These are just a few of the major challenges that we've tried to address with `frx-challenges`, and we're excited to see how it goes with our first assisted community challenge: the [Cellmap Challenge](https://cellmapchallenge.janelia.org/).

If you're interested in learning more or participating in this project, follow along at its GitHub repository:

https://github.com/2i2c-org/frx-challenges

The project is still in its **very early stages**, and we imagine it will evolve significantly. We welcome feedback on how it can more effectively serve a variety of communities.

## Acknowledgements

Many thanks to the Howard Hughes Medical Institute (HHMI) for collaborating with us on the [Cellmap Challenge](https://cellmapchallenge.janelia.org/), which led to the creation of this project.
19 changes: 19 additions & 0 deletions content/blog/2024/funding-czi/czi-logo.md
@@ -0,0 +1,19 @@
---
title: Financial support from the Chan Zuckerberg Initiative for sustaining 2i2c's mission to help communities create and share knowledge with open infrastructure
date: "2024-10-13"
authors: ["Chris Holdgraf"]
tags: [report]
categories: [organization]
featured: false
draft: false
---

We are proud to announce that 2i2c has received financial support from [The Chan Zuckerberg Initiative](https://chanzuckerberg.com/) to sustain our efforts at helping open science communities create and share knowledge with open infrastructure.

{{< figure
src="images/czi-logo.png"
width="75%"
caption="Funding comes from the [Open Science Initiative](https://www.navigation.org/grants/open-science) of The Navigation Fund, which is '...dedicated to transforming scientific research by enhancing collaboration and innovation. We support tools and approaches that move beyond traditional practices, making scientific knowledge more accessible and impactful.'"
>}}

This builds upon [previous core support provided by CZI](../../2021/czi-core-support/), and provides an additional **~$700K over 1 year** to help 2i2c sustain its mission. We are incredibly grateful to CZI for their support, and this funding provides key runway for 2i2c to serve its community network and explore a sustainable and scalable model for impact.
Binary file added content/blog/2024/funding-czi/images/image 2.png
30 changes: 30 additions & 0 deletions content/blog/2024/funding-navigation/index.md
@@ -0,0 +1,30 @@
---
title: Financial support from The Navigation Fund for identifying and building a scalable sustainability model
date: "2024-12-08"
authors: ["Chris Holdgraf"]
tags: [report]
categories: [organization]
featured: false
draft: false
---

We are proud to announce that 2i2c has received financial support from [The Navigation Fund](https://www.navigation.org/) to assist us in our mission to design and build a sustainable and scalable model for helping communities create and share knowledge with open infrastructure.

{{< figure
src="images/tnf-logo.png"
width="75%"
caption="Funding comes from the [Open Science Initiative](https://www.navigation.org/grants/open-science) of The Navigation Fund, which is '...dedicated to transforming scientific research by enhancing collaboration and innovation. We support tools and approaches that move beyond traditional practices, making scientific knowledge more accessible and impactful.'"
>}}

The award totals **~$1.5M over 2 years**. It provides support for several key strategic roles that are traditionally difficult to fund in a young organization: product management, delivery management, and business development. Here are the key goals this funding works toward:

- **Goal #1: Delivery**. Develop the operating structure and team skills to
efficiently scale our product and service delivery.
- **Goal #2: Product**. Develop a product system that continuously improves and
delivers value and impact at scale.
- **Goal #3: Sustainability**. Build a business model that is competitive and gives
us resources to sustain and scale our service.

We believe this is a critical step in helping our organization define and build a pathway to sustainability so that our service remains accessible, scalable, and resilient for years to come.

We're incredibly honored to be supported by the Navigation Fund, and excited to continue our work helping communities create and share knowledge with open infrastructure.
Binary file added content/blog/2024/retrospective/images/maus.png
107 changes: 107 additions & 0 deletions content/blog/2024/retrospective/index.md
@@ -0,0 +1,107 @@
---
title: "2024 impact report: new team structure, new funding, and growth in our network"
date: "2024-12-08"
authors: ["Chris Holdgraf"]
tags: [report]
categories: [organization]
featured: false
draft: false
---

2024 has been a busy year for 2i2c, with many highs and lows, a lot of impact, and significant organizational change. As the year comes to an end, I want to reflect on the work we've done in 2024, and where we aim to go in 2025.

## The main idea

In 2024, 2i2c began an organizational transformation that I now call the "$1M to $2M budget jump". This is a point in an organization's lifecycle when your team has grown enough in size and complexity that you must change the ways that you organize. The informal ways that you worked as a small group don't suffice anymore, and you have to put more effort into aligning and coordinating everyone to ensure you have impact as a group.

Every organization hits this point at a different time, but it seems to be around when your annual budget goes from $1M to $2M, thus the name. Getting to the other side of this gap with an intact runway and team is hard, and I suspect that 2i2c's fully distributed nature means that we hit these scaling milestones earlier than many organizations.

At an organizational level for our team, this meant a lot of introspection and planning, a few new roles, a few departing team members, a funding crunch, a successful effort to dig out of it, and a new system of work organizing our team. We'll share more about all of this later, but here are the major implications for our team:

- **We've raised another $2.2M in funding** to support our efforts in scaling and sustaining our network of community hubs. This gives us roughly another 2 years of projected runway (with some assumptions about revenue from contracts and grants).
- **We've hired and incorporated more strategic and systems-level roles** to give our team support as it grows: a Product Lead, a Delivery Lead / Chief of Staff, and a People Operations Manager.
- **We've re-organized our team into separate business development and product teams**, in order to focus on _providing an excellent technical platform and a collection of services that maximizes community impact_, as well as _sustaining this service for our communities_.

This has allowed us to coordinate our service enhancement and development efforts more effectively, and has increased our ability to deliver improvements to our communities and to upstream projects.

For more details about our impact, see the summaries below.

## Community impact

2i2c's core mission is to support its network of communities that create and share knowledge with open infrastructure. Here are the highlights of how our community network has grown and had impact.

First, we've grown our network of hubs and users through several new partnerships: we **grew the number of active hubs from ~75 to ~105** and **grew the number of Monthly Active Users (MAUs) from ~8100 to ~9100**.
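
As a quick worked calculation, the relative growth implied by these figures can be computed as below. The helper `percent_growth` is a name introduced here for illustration, and because the inputs are approximate the results are approximate too.

```python
def percent_growth(start: float, end: float) -> float:
    """Relative change from start to end, expressed as a percentage."""
    return (end - start) / start * 100


hubs = percent_growth(75, 105)     # active hubs, from the figures above
maus = percent_growth(8100, 9100)  # Monthly Active Users, from the figures above

print(f"Active hubs: +{hubs:.0f}%")  # Active hubs: +40%
print(f"MAUs: +{maus:.1f}%")         # MAUs: +12.3%
```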

{{< figure
src="images/maus.png"
width="75%"
caption="You can see an interactive version of these numbers in our platform usage dashboard: https://2i2c.org/kpis/cloud/"
>}}

Beyond the numbers, we also re-focused our team on reporting on impact stories from our collaborations with community members, and have published these into a (growing) list of impact reports on our blog:

{{< figure
src="images/impact-gallery.png"
width="75%"
caption="Our impact gallery is a new place to share stories of our impact with user research communities as well as open source communities: https://2i2c.org/category/impact/"
>}}

Here are a few community highlights from this year:

- We served around 20 communities from Latin America and Africa through the Catalyst project: https://2i2c.org/blog/2024/catalyst-partner-highlights/
- Our community partner Openscapes was invited to the White House to discuss the importance of open science: https://openscapes.org/events/2024-09-26-openscapes-whitehouse/
- The NeuroHackademy used our infrastructure to support their annual summer school: http://2i2c.org/blog/2024/neurohackademy-summer-school-reflections/
- We enabled ephemeral and sharable interactive computing environments for workshops in the geospatial community: http://2i2c.org/blog/2024/amerigeo-workshop/
- We ran a pilot for an HHMI-funded Spyglass project for reproducing their pre-print with a live interactive environment, using BinderHub support for publishing infrastructure: https://2i2c.org/blog/2024/hhmi-spyglass-mysql/

## Open source technology enhancements

Our second pillar of impact is to improve the ecosystem of open infrastructure and the open science workflows it enables. We use collaborations with our community partners to drive new cycles of development in open source tools that we support. Here's a brief overview of our impact across the open source ecosystem this year.

In 2024, 2i2c team members [authored **over 500 pull requests**](https://github.com/search?q=author%3Acholdgraf+author%3Aharoldcampbell+author%3Aaprilmj+author%3Acolliand+author%3Ajmunroe+author%3Ajnywong+author%3AGman0909+author%3AconsideRatio+author%3Ageorgianaelena+author%3Asgibson91+author%3Ayuvipanda+author%3Aagoose77+org%3Ajupyter+org%3Ajupyter-server+org%3Ajupyterhub+org%3Ajupyterlab+org%3Abinder-examples+org%3Aexecutablebooks+org%3Acryptnono+org%3Adask+org%3Apydata+org%3Arocker-org+org%3Apangeo-data+org%3Ajupyter-book+is%3Apr+merged%3A%3E%3D2024-01-01&type=pullrequests) that were merged in our key open source communities. Find the list of these communities here:

https://compass.2i2c.org/open-source/key-communities/

Here are a few highlights where we focused our effort this year. Each of these efforts required development with and for our community network, as well as upstream contributions and support:

- We released the **JupyterHub Fancy Profiles** project, which allows for a more flexible and modern interface to launch environments with JupyterHub:

https://2i2c.org/blog/2024/jupyterhub-fancy-profiles-rollout/
- We used this to allow users to **build and launch custom environments in JupyterHub** in a way that users can also share with others. Here's a community with which we've piloted this functionality:

https://2i2c.org/blog/2024/nasa-ephemeral-hubs/
- We've added a Grafana dashboard for **resource and cost monitoring with JupyterHub** to give communities more visibility over their projected cloud costs:

https://2i2c.org/blog/2024/aws-cost-attribution/
- We began **incorporating Jupyter Book 2.0 workflows into our community hubs** and laid a foundation for enabling our community networks to communicate with one another more effectively using the new MyST document engine. Here are two posts about this:

https://2i2c.org/blog/2024/project-pythia-cookoff/

https://2i2c.org/blog/2024/jupyter-book-2/
- We built `frx-challenges`, a tool to help communities host data challenges with secure, automated evaluation of submissions:

https://github.com/2i2c-org/frx-challenges

This was built in collaboration with the **HHMI Cellmap Challenge** competition:

https://cellmapchallenge.janelia.org/


## Looking to next year

2025 is going to be a critical year for 2i2c to build upon the work it has begun this year and achieve a more sustainable and scalable community model. Here are the main areas that will guide our work in the new year, pulled from our [recent proposal to The Navigation Fund](../funding-navigation/):

- **Goal #1: Delivery**. Develop the operating structure and team skills to
efficiently scale our product and service delivery.
- **Goal #2: Product**. Develop a product system that continuously improves and
delivers value and impact at scale.
- **Goal #3: Sustainability**. Build a business model that is competitive and gives
us resources to sustain and scale our service.

These are the key goals 2i2c must achieve in order to ensure that its service remains impactful, sustainable, scalable, and accessible. We believe that we've laid a strong foundation to get there, and are excited to begin work next year.

Overall, 2024 has been a challenging but also rewarding year for our team. We've encountered and successfully worked through a number of scaling challenges, and we've made significant progress in laying a foundation on which we can build for years to come.

I'm incredibly proud of 2i2c's team for all of their hard work this year, and also honored to be working with a network of communities that care about open infrastructure, and its value for creating and sharing knowledge with the world. Here's to another year of impact!

2 changes: 2 additions & 0 deletions layouts/_default/_markup/render-link.html
Expand Up @@ -5,6 +5,8 @@
{{ $title = replace $title "https://github.com/" "" }}
{{ $title = replace $title "/issues/" "#" }}
{{ $title = replace $title "/pull/" "#" }}
{{ $title = replace $title "https://" "" }}
{{ $title = replace $title "http://" "" }}

{{/* This is a custom link resolver that resolves both internal and external links, thanks to @cmd-ntrf for the helpful fix. */}}

Expand Down
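
The effect of the two added `replace` lines can be sketched outside Hugo. The Python function below mirrors the ordered replacement chain from the hunk above, under the assumption that `$title` starts out as the raw URL text of a link (the line that sets it is not visible in this diff). The function name `link_title` and the issue number in the example are hypothetical.

```python
def link_title(title: str) -> str:
    """Apply the template's replacements to a link title, in the same order."""
    replacements = [
        ("https://github.com/", ""),
        ("/issues/", "#"),
        ("/pull/", "#"),
        ("https://", ""),  # added in this commit
        ("http://", ""),   # added in this commit
    ]
    for old, new in replacements:
        title = title.replace(old, new)
    return title


# GitHub links were already shortened before this commit.
print(link_title("https://github.com/2i2c-org/frx-challenges/issues/1"))
# 2i2c-org/frx-challenges#1

# With the two new rules, other bare URLs lose their scheme as well.
print(link_title("https://2i2c.org/kpis/cloud/"))
# 2i2c.org/kpis/cloud/
print(link_title("http://2i2c.org/blog/2024/amerigeo-workshop/"))
# 2i2c.org/blog/2024/amerigeo-workshop/
```

Order matters here: the GitHub-specific rule runs first, so GitHub URLs are reduced to `org/repo#number` before the generic scheme-stripping rules apply.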