futureverse/future.p2p

future.p2p - A Peer-to-Peer Compute Cluster in R

- Share R compute among friends across the world



TL;DR

library(future)

## Resolve futures via a P2P cluster shared among friends
plan(future.p2p::cluster, cluster = "alice/friends")

## Create future
f <- future(Sys.getpid())
  
## Get results
v <- value(f)
print(v)

Installation

install.packages('future.p2p', repos = c('https://futureverse.r-universe.dev', 'https://cloud.r-project.org'))

Getting started

In order to join a future P2P cluster, you must:

  1. have an SSH key pair configured, and

  2. have a pico.sh account.

See the 'Getting Started' vignette for how to set this up. The gist for creating an SSH key pair, if you don't already have one, is to run:

$ ssh-keygen
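
If you are unsure whether a key pair already exists, a check along the following lines can help. This is a minimal sketch: the helper name ensure_ssh_key, the Ed25519 key type, and the default path ~/.ssh/id_ed25519 are assumptions for illustration. Keys of other types (e.g. id_rsa) are not detected by this check, and plain ssh-keygen as above works equally well.

```shell
# Hypothetical helper (not part of future.p2p or pico.sh).
# Creates an Ed25519 key pair only if the given private key file is missing.
ensure_ssh_key() {
  key="${1:-$HOME/.ssh/id_ed25519}"   # assumed default path
  if [ -f "$key" ]; then
    echo "Key already exists: $key"
  else
    # Prompts for a passphrase, just like plain `ssh-keygen`
    ssh-keygen -t ed25519 -f "$key"
  fi
}
```

Calling ensure_ssh_key without an argument checks the assumed default path; pass another path to check a different key file.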

With the key pair in place, create a pico.sh account by logging in to their server:

$ ssh pico.sh

Choose your pico.sh username, which will also be your P2P cluster username, and press ENTER. Finally, verify SSH access to pipe.pico.sh (note: pipe, not pico):

$ ssh pipe.pico.sh

That's it!

Set up a shared P2P cluster

Let's assume P2P users 'alice', 'bob', 'carol', and 'diana' decide to share a P2P cluster and user 'alice' agrees to host it. Hosting a P2P cluster only means that you control who has access - there's no extra load added. So, to host, 'alice' calls:

{alice}$ Rscript -e future.p2p::host_cluster --users=bob,carol,diana --cluster=alice/friends

A future P2P cluster can be hosted from anywhere in the world, and it does not have to be on a machine where you run your own R analysis.

Parallelize via P2P cluster (all users)

Any user with access to the 'alice/friends' cluster can use it. In our example, this means 'bob', 'carol', 'diana', and 'alice' may use the P2P cluster at the same time. Just like with any other future backend, we use plan() to specify that we want to parallelize via the P2P cluster.

For example,

library(future)
plan(future.p2p::cluster, cluster = "alice/friends")

## Evaluate an R expression via the P2P cluster
f <- future(Sys.getpid())

## Retrieve value
v <- value(f)
print(v)

Share your compute power with your friends (any user)

Without parallel workers, the P2P cluster is useless and will not process any parallel tasks. This is where the peer-to-peer concept comes in, where we contribute our idle compute cycles to the cluster for others to make use of. To contribute your R compute power to the alice/friends cluster, launch a P2P worker as:

{bob}$ Rscript -e future.p2p::worker --cluster=alice/friends

This will contribute one parallel worker to the P2P cluster. You can contribute additional ones by repeating the same command one or more times.
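
Repeating the command by hand can be replaced by a small loop. The sketch below starts several workers in the background from one shell. The function name start_workers and the RSCRIPT override (handy for a dry run with echo) are assumptions for illustration; the worker command itself is the one shown above.

```shell
# Hypothetical helper: launch N P2P workers for a cluster and wait for them.
# Usage: start_workers <number-of-workers> <cluster>
start_workers() {
  n="$1"
  cluster="$2"
  i=1
  while [ "$i" -le "$n" ]; do
    # RSCRIPT is an assumed override; it defaults to the real Rscript
    ${RSCRIPT:-Rscript} -e future.p2p::worker --cluster="$cluster" &
    i=$((i + 1))
  done
  # Block until every background worker has exited
  wait
}
```

For example, start_workers 4 alice/friends would contribute four workers from the current machine. The function returns once all of them have exited.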

Appendix

Connecting to the same pico.sh account from different machines

If you have multiple computers, you can add the public SSH keys for those as well. Log in again with ssh pico.sh and go to the pubkeys menu, where you can add additional public SSH keys. This way, you can use your pico.sh account from multiple computer systems, which can be handy if you want to set up parallel workers on one system and harness their compute power from another.

Set up a worker to connect to pico.sh via a jumphost

{bob}$ Rscript -e future.p2p::worker --ssh_args="-J somehost" --cluster=alice/friends

Troubleshoot Wormhole

If you are behind a firewall with a proxy, wormhole might fail to establish an outbound connection. For example, if you try:

> system2(future.p2p::find_wormhole(), args = c("send", "--text", "hello"))

it might stall forever. If that happens, press Ctrl-C to interrupt, then retry with the proxy settings disabled:

> Sys.unsetenv("http_proxy")
> system2(future.p2p::find_wormhole(), args = c("send", "--text", "hello"))
On the other computer, please run: wormhole receive (or wormhole-william recv)                                                       
Wormhole code is: 53-visitor-physique

If the latter works for you, launch R with the environment variable http_proxy cleared, e.g.

{bob}$ http_proxy="" Rscript -e future.p2p::worker --cluster=alice/friends
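
To see which proxy settings are in effect before launching a worker, the environment can be inspected from the shell. This is a minimal sketch, assuming the conventional variable names http_proxy and https_proxy in either letter case; only http_proxy is mentioned above, and the function name show_proxy_vars is hypothetical.

```shell
# Hypothetical helper: list proxy-related environment variables, if any.
# Matches http_proxy and https_proxy in lower or upper case.
show_proxy_vars() {
  env | grep -i -E '^(http|https)_proxy=' || echo "no proxy variables set"
}
```

If show_proxy_vars lists a variable other than http_proxy, clearing that one in the same way may be worth trying as well.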