task-api

Golem - application communication interface

This work is in its alpha stage and under heavy development, so the interface may change frequently.

This repository contains the interface that a Golem-compatible application should implement, as well as constants used by the protocol. The interface and the constants are defined in the Protocol Buffers files under the golem_task_api/proto directory.

This repository also contains language-specific packages generated from the gRPC protocol definitions, which can be used for a concrete implementation. They are provided to ease application development, but using them is not required. If you don't see a programming language you're interested in, feel free to create an issue or even a pull request and we will add it.

The API

The API is divided into two independent parts - requestor and provider.

Requestor

For the requestor side, the app should implement a long-running RPC service that implements the RequestorApp interface from the proto files (a minimal handler sketch follows the method list below). The app should assume it will have access to a single directory (let's call it work_dir). Each task will have its own separate working directory under the main work_dir. You can assume that for a given task_id the first call will always be CreateTask and that the following directories will exist under work_dir and will be empty:

  • {task_id}
  • {task_id}/{constants.TASK_INPUTS_DIR}
  • {task_id}/{constants.SUBTASK_INPUTS_DIR}
  • {task_id}/{constants.TASK_OUTPUTS_DIR}
  • {task_id}/{constants.SUBTASK_OUTPUTS_DIR}
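
As a rough illustration (the constant values below are placeholders, not the actual names shipped with this repository), a requestor app could resolve these per-task paths like so:

    from pathlib import Path

    # Placeholder values; the real names are defined by the protocol constants
    # (constants.TASK_INPUTS_DIR and friends) distributed with this repository.
    TASK_INPUTS_DIR = "task_inputs"
    SUBTASK_INPUTS_DIR = "subtask_inputs"
    TASK_OUTPUTS_DIR = "task_outputs"
    SUBTASK_OUTPUTS_DIR = "subtask_outputs"

    def task_dirs(work_dir: Path, task_id: str) -> dict:
        """Resolve the per-task directories a requestor app may rely on."""
        task_dir = work_dir / task_id
        return {
            "task_dir": task_dir,
            "task_inputs": task_dir / TASK_INPUTS_DIR,
            "subtask_inputs": task_dir / SUBTASK_INPUTS_DIR,
            "task_outputs": task_dir / TASK_OUTPUTS_DIR,
            "subtask_outputs": task_dir / SUBTASK_OUTPUTS_DIR,
        }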

RPC methods

  • CreateTask
    • Takes three arguments: task_id, max_subtasks_count, and task_params_json.
    • Should treat {work_dir}/{task_id} as the working directory for the given task.
    • task_params_json is a JSON string containing app-specific task parameters. The format of these parameters is entirely up to the application developer.
    • Will only be called once for a given task_id.
    • Can assume that task_id is unique per node.
    • Can assume {task_id}/{constants.TASK_INPUTS_DIR} contains all the resources provided by the task creator.
    • Returns env_id and prerequisites_json specifying the environment and prerequisites required for providers to compute the task. See the Environments section for details.
  • NextSubtask
    • Takes three arguments: task_id, subtask_id and opaque_node_id.
    • opaque_node_id is an identifier of the node which is going to compute the requested subtask. 'Opaque' means that the identifier doesn't allow obtaining any further information about the node (e.g. its public key or IP address).
    • Can assume that subtask_id is unique per node.
    • Can assume CreateTask was called earlier with the same task_id.
    • Can return an empty message, meaning that the app refuses to assign a subtask to the provider node (for whatever reason).
    • Returns subtask_params_json, which is a JSON string containing subtask-specific parameters.
    • Also returns resources, which is a list of names of files required for computing the subtask. Files with these names must be present in the {task_id}/{constants.SUBTASK_INPUTS_DIR} directory.
  • HasPendingSubtasks
    • Takes one argument: task_id.
    • Returns a boolean indicating whether there are any more pending subtasks waiting for computation at the given moment.
    • If it returns true, the next NextSubtask call should succeed (although it can still return an empty message).
  • Verify
    • Takes two arguments: task_id and subtask_id, which specify the subtask whose results should be verified.
    • Will be called with only valid task_id and subtask_id values.
    • Returns result which is one of the defined verification result statuses:
      • SUCCESS - the subtask was computed correctly,
      • FAILURE - the subtask was computed incorrectly,
      • INCONCLUSIVE - cannot determine whether the subtask was computed correctly,
      • AWAITING_DATA - cannot perform verification until results of other subtasks are available.
    • Also returns reason which is a string providing more detail about the result.
    • For successfully verified subtasks it can also merge the partial results into the final one.
  • DiscardSubtasks
    • Takes two arguments: task_id and subtask_ids.
    • Should discard results of given subtasks and any dependent subtasks.
    • Returns a list of subtask IDs that have been discarded.
    • In a simple case where subtasks are independent of each other, it will return the same list it received.
  • Benchmark
    • Takes no arguments.
    • Returns a score which indicates how efficient the machine is for this type of task.
    • Shouldn't take much time (preferably less than a minute on mid-range machines).
  • AbortTask
    • Takes one argument: task_id.
    • Will be called when the task is aborted by the user or has timed out. Should stop all running subtask verifications for this task and perform any other necessary cleanup.
  • AbortSubtask
    • Takes two arguments: task_id and subtask_id.
    • Will be called when the subtask is aborted by the user or has timed out. Should stop verification of the subtask (if it's running) and perform any other necessary cleanup.
  • Shutdown
    • Takes no arguments.
    • Should gracefully terminate the service.

When the last subtask is successfully verified on the requestor's side, the {work_dir}/{task_id}/{constants.TASK_OUTPUTS_DIR} directory should contain all result files and nothing else.
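
To make the flow above concrete, here is a minimal, non-normative sketch of a requestor handler. It paraphrases the interface described above in plain Python rather than using the generated gRPC classes, and the splitting, assignment and verification logic is stubbed out with app-specific placeholders (the _split_into_subtasks helper and the example image/tag values are hypothetical):

    import json
    from pathlib import Path

    class RequestorHandler:
        """Illustrative skeleton of the RequestorApp behaviour described above.
        Method names and return shapes are paraphrased; the real service is
        defined by the proto files and the generated gRPC code."""

        def __init__(self, work_dir: Path):
            self.work_dir = work_dir
            self.pending = {}  # task_id -> list of subtask params (app-specific)

        def create_task(self, task_id, max_subtasks_count, task_params_json):
            task_params = json.loads(task_params_json)
            # App-specific: inspect {task_id}/TASK_INPUTS_DIR and split the work
            # into at most max_subtasks_count pieces.
            self.pending[task_id] = self._split_into_subtasks(
                task_params, max_subtasks_count)
            # Tell Golem which environment and prerequisites providers need.
            prerequisites = {"image": "myorg/myapp", "tag": "1.0"}  # example values
            return "docker_cpu", json.dumps(prerequisites)

        def next_subtask(self, task_id, subtask_id, opaque_node_id):
            if not self.pending.get(task_id):
                return None  # empty message: refuse to assign a subtask
            params = self.pending[task_id].pop()
            resources = params.pop("resources", [])  # file names in SUBTASK_INPUTS_DIR
            return json.dumps(params), resources

        def has_pending_subtasks(self, task_id):
            return bool(self.pending.get(task_id))

        def verify(self, task_id, subtask_id):
            # App-specific check of the subtask outputs; on success, merge the
            # partial results into TASK_OUTPUTS_DIR.
            return "SUCCESS", ""

        def discard_subtasks(self, task_id, subtask_ids):
            # Independent subtasks: nothing depends on them, so return as-is.
            return subtask_ids

        def _split_into_subtasks(self, task_params, max_subtasks_count):
            # Hypothetical helper: real splitting logic is entirely app-specific.
            return [dict(task_params, part=i) for i in range(max_subtasks_count)]

Golem itself drives these calls over gRPC; the sketch only shows where the app-specific decisions (splitting, assignment, verification) plug in.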

Environments

Both provider and requestor apps run on top of Golem's execution environments. The environment for the requestor is specified in the application definition and cannot vary. The provider environment is specified by the return value of the CreateTask call. A single application could use different environments for different types of tasks; it could also use different environments for the requestor and the provider. Environments have their own unique IDs and prerequisites formats. Prerequisites are additional requirements for the environment to run the app (e.g. Docker environment prerequisites specify an image). Environment IDs and prerequisites formats are listed in the envs.proto file.

Currently the following environments are supported:

  • docker_cpu - standard Docker environment
    Prerequisites format:
    {
        "image": "...",
        "tag":   "..."
    }
  • docker_gpu - GPU-enabled Docker environment, Linux only
    Prerequisites format: same as docker_cpu
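
For example, a requestor app targeting the standard Docker environment could build its CreateTask reply roughly like this (the image and tag values are made up):

    import json

    env_id = "docker_cpu"
    prerequisites_json = json.dumps({
        "image": "myorg/myapp",  # made-up image name
        "tag": "1.0",            # made-up tag
    })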

Provider

The provider app should implement a short-lived RPC service that implements the ProviderApp interface from the proto files (a minimal sketch follows the command list below). Short-lived means that only one request will be issued per service instance, i.e. the service should shut down automatically after handling its first and only request.

RPC commands

  • Compute
    • Gets a single working directory task_work_dir to operate on.
    • Different subtasks of the same task will have the same task_work_dir.
    • Takes three arguments: task_id, subtask_id and subtask_params_json.
    • Can assume the {task_work_dir}/{constants.SUBTASK_INPUTS_DIR} directory exists.
    • Can assume that under {task_work_dir}/{constants.SUBTASK_INPUTS_DIR} are the resources specified in the corresponding NextSubtask call.
    • Returns a filepath (relative to task_work_dir) of the result file, which will be sent back to the requestor with its name unchanged.
  • Benchmark
    • Takes no arguments.
    • Returns a score which indicates how efficient the machine is for this type of task.
    • Shouldn't take much time (preferably less than a minute on mid-range machines).
  • Shutdown
    • Takes no arguments.
    • Should gracefully terminate the service.
    • Can be called when the provider wants to interrupt a task computation or benchmark.
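
And a correspondingly minimal, non-normative sketch of the provider side, again paraphrasing the interface above rather than the generated gRPC code (the result naming scheme is made up):

    import json
    from pathlib import Path

    SUBTASK_INPUTS_DIR = "subtask_inputs"  # placeholder for constants.SUBTASK_INPUTS_DIR

    class ProviderHandler:
        """Illustrative skeleton of the ProviderApp behaviour described above."""

        def compute(self, task_work_dir: Path, task_id, subtask_id,
                    subtask_params_json):
            params = json.loads(subtask_params_json)
            inputs_dir = task_work_dir / SUBTASK_INPUTS_DIR
            # App-specific computation using the files in inputs_dir goes here.
            result_name = f"{subtask_id}.result"            # made-up naming scheme
            (task_work_dir / result_name).write_bytes(b"")  # placeholder output
            return result_name  # relative to task_work_dir, sent back unchanged

        def benchmark(self):
            # Quick app-specific performance probe, well under a minute.
            return 1000.0

        def shutdown(self):
            # One request per service instance, so there is little to clean up.
            pass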
