Maintenance updates #193

Merged · 15 commits · Aug 7, 2023
58 changes: 29 additions & 29 deletions .github/workflows/erlang.yml
@@ -6,37 +6,37 @@ jobs:

   build:

-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04

     strategy:
       matrix:
-        otp: ['23.3', '24.0']
-        rebar: ['3.16.1']
+        otp: ['25', '26']
+        rebar: ['3.22']

     steps:
-      - uses: actions/checkout@v2
-      - uses: erlef/setup-beam@v1
-        id: setup-beam
-        with:
-          otp-version: ${{matrix.otp}}
-          rebar3-version: ${{matrix.rebar}}
-      - name: Restore _build
-        uses: actions/cache@v2
-        with:
-          path: _build
-          key: _build-cache-for-os-${{runner.os}}-otp-${{steps.setup-beam.outputs.otp-version}}-rebar3-${{steps.setup-beam.outputs.rebar3-version}}-hash-${{hashFiles('rebar.lock')}}
-      - name: Restore rebar3's cache
-        uses: actions/cache@v2
-        with:
-          path: ~/.cache/rebar3
-          key: rebar3-cache-for-os-${{runner.os}}-otp-${{steps.setup-beam.outputs.otp-version}}-rebar3-${{steps.setup-beam.outputs.rebar3-version}}-hash-${{hashFiles('rebar.lock')}}
-      - name: Compile
-        run: rebar3 compile
-      - name: Format check
-        run: rebar3 format --verify
-      - name: Run tests and verifications
-        run: rebar3 test
-      - name: Upload code coverage
-        uses: codecov/codecov-action@v1
-        with:
-          file: "_build/test/covertool/worker_pool.covertool.xml"
+      - uses: actions/checkout@v3
+      - uses: erlef/setup-beam@v1
+        id: setup-beam
+        with:
+          otp-version: ${{matrix.otp}}
+          rebar3-version: ${{matrix.rebar}}
+      - name: Restore _build
+        uses: actions/cache@v3
+        with:
+          path: _build
+          key: _build-cache-for-os-${{runner.os}}-otp-${{steps.setup-beam.outputs.otp-version}}-rebar3-${{steps.setup-beam.outputs.rebar3-version}}-hash-${{hashFiles('rebar.lock')}}
+      - name: Restore rebar3's cache
+        uses: actions/cache@v3
+        with:
+          path: ~/.cache/rebar3
+          key: rebar3-cache-for-os-${{runner.os}}-otp-${{steps.setup-beam.outputs.otp-version}}-rebar3-${{steps.setup-beam.outputs.rebar3-version}}-hash-${{hashFiles('rebar.lock')}}
+      - name: Compile
+        run: rebar3 compile
+      - name: Format check
+        run: rebar3 format --verify
+      - name: Run tests and verifications
+        run: rebar3 test
+      - name: Upload code coverage
+        uses: codecov/codecov-action@v3
+        with:
+          file: "_build/test/covertool/worker_pool.covertool.xml"
14 changes: 3 additions & 11 deletions .gitignore
@@ -1,14 +1,6 @@
 rebar3.crashdump
-.rebar3
+doc/
+codecov.json
+_build/
-all.coverdata
-doc
-.DS_Store
-_*
 erl_crash.dump
-*.beam
-*.log
-*~
-.idea
-*.iml
-*.orig
 logs
4 changes: 2 additions & 2 deletions LICENSE
@@ -1,7 +1,7 @@

                                  Apache License
                            Version 2.0, January 2004
-                        http://www.apache.org/licenses/
+                        https://www.apache.org/licenses/

 TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

@@ -193,7 +193,7 @@
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

-       http://www.apache.org/licenses/LICENSE-2.0
+       https://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
94 changes: 53 additions & 41 deletions README.md
@@ -1,47 +1,45 @@
-# Worker Pool [![Build Status](https://travis-ci.org/inaka/worker_pool.svg?branch=main)](https://travis-ci.org/inaka/worker_pool)[![codecov](https://codecov.io/gh/inaka/worker_pool/branch/main/graph/badge.svg)](https://codecov.io/gh/inaka/worker_pool)
+# Worker Pool [![Build Status](https://github.com/inaka/worker_pool/actions/workflows/erlang.yml/badge.svg)](https://github.com/inaka/worker_pool/actions/workflows/erlang.yml)[![codecov](https://codecov.io/gh/inaka/worker_pool/branch/main/graph/badge.svg)](https://codecov.io/gh/inaka/worker_pool)

-<img src="http://img3.wikia.nocookie.net/__cb20140705120849/clubpenguin/images/thumb/f/ff/MINIONS.jpg/481px-MINIONS.jpg" align="right" style="float:right" height="400" />
+<img src="https://img3.wikia.nocookie.net/__cb20140705120849/clubpenguin/images/thumb/f/ff/MINIONS.jpg/481px-MINIONS.jpg" align="right" style="float:right" height="400" />

A pool of gen servers.

-### Abstract
+## Abstract

The goal of **worker pool** is pretty straightforward: to provide a transparent way to manage a pool of workers and make a _best effort_ to balance the load among them, distributing the tasks requested of the pool.

-### Documentation
+## Documentation

-The documentation can be generated from code using [edoc](http://www.erlang.org/doc/apps/edoc/chapter.html) with ``rebar3 edoc`` or using [erldocs](https://github.com/erldocs/erldocs) with ``make erldocs``. It is also available online [here](https://hexdocs.pm/worker_pool/)
+The documentation can be generated from code using [rebar3_ex_doc](https://github.com/starbelly/rebar3_ex_doc) with `rebar3 ex_doc`. It is also available online [here](https://hexdocs.pm/worker_pool/)

-### Usage
+## Usage

All user functions are exposed through the [wpool module](https://hexdocs.pm/worker_pool/wpool.html).

-#### Starting the Application
-**Worker Pool** is an erlang application that can be started using the functions in the [`application`](http://erldocs.com/current/kernel/application.html) module. For convenience, `wpool:start/0` and `wpool:stop/0` are also provided.
+### Starting the Application
+
+**Worker Pool** is an Erlang application that can be started using the functions in the [`application`](https://erldocs.com/current/kernel/application.html) module. For convenience, `wpool:start/0` and `wpool:stop/0` are also provided.

-#### Starting a Pool
+### Starting a Pool

To start a new worker pool, you can either use `wpool:start_pool` (if you want to supervise it yourself) or `wpool:start_sup_pool` (if you want the pool to live under wpool's supervision tree). You can provide several options on any of those calls (a combined example follows the list):

* **overrun_warning**: The number of milliseconds after which a task is considered *overrun* (i.e. delayed), so a warning is emitted using **overrun_handler**. The task is monitored until it is finished, so more than one warning might be emitted for a single task. The rounds of warnings are not equally timed; an exponential backoff algorithm is used instead: after each warning the overrun time is doubled (i.e. with `overrun_warning = 1000`, warnings would be emitted after 1000, 2000, 4000, 8000 ... milliseconds). The default value for this setting is `infinity` (i.e. no warnings are emitted).
* **max_overrun_warnings**: The maximum number of overrun warnings emitted before killing a delayed task: that is, killing the worker running the task. If this parameter is set to a value other than `infinity`, the rounds of warnings become equally timed (i.e. with `overrun_warning = 1000` and `max_overrun_warnings = 5`, the task would be killed after 5 seconds of execution). The default value for this setting is `infinity` (i.e. delayed tasks are not killed).

**NOTE:** Since the worker is killed, some of its messages might be lost if you are using a worker strategy other than `available_worker` (see the worker strategies below).

* **overrun_handler**: The module and function to call when a task is *overrun*. The default value for this setting is `{error_logger, warning_report}`. Report values are:

  * *{alert, AlertType}*: Where `AlertType` is `overrun` on regular warnings, or `max_overrun_limit` when the worker is about to be killed.
  * *{pool, Pool}*: The pool name.
  * *{worker, Pid}*: The pid of the worker.
  * *{task, Task}*: A description of the task.
  * *{runtime, Runtime}*: The runtime of the current round.

* **workers**: The number of workers in the pool. The default value for this setting is `100`.
* **worker_type**: The type of the worker. The only available value is `gen_server`, which is also the default. Eventually we'll add `gen_statem` as well.
-* **worker**: The [`gen_server`](http://erldocs.com/current/stdlib/gen_server.html) module that each worker will run and the `InitArgs` to use on the corresponding `start_link` call used to initiate it. The default value for this setting is `{wpool_worker, undefined}`. That means that if you don't provide a worker implementation, the pool will be generated with this default one. [`wpool_worker`](https://hexdocs.pm/worker_pool/wpool_worker.html) is a module that implements a very simple RPC-like interface.
+* **worker**: The [`gen_server`](https://erldocs.com/current/stdlib/gen_server.html) module that each worker will run and the `InitArgs` to use on the corresponding `start_link` call used to initiate it. The default value for this setting is `{wpool_worker, undefined}`. That means that if you don't provide a worker implementation, the pool will be generated with this default one. [`wpool_worker`](https://hexdocs.pm/worker_pool/wpool_worker.html) is a module that implements a very simple RPC-like interface.
* **worker_opt**: Options that will be passed to each `gen_server` worker. These are the same options described in the `gen_server` documentation.
* **worker_shutdown**: The `shutdown` option to be used in the child specs of the workers. Defaults to `5000`.
* **strategy**: Not the worker selection strategy (discussed below) but the supervisor flags to be used in the supervisor over the individual workers (`wpool_process_sup`). Defaults to `{one_for_one, 5, 60}`.
@@ -53,34 +51,43 @@
* **callbacks**: Initial list of callback modules implementing `wpool_process_callbacks` to be called on certain worker events. This option will only work if `enable_callbacks` is set to **true**. Callbacks can be added and removed later via `wpool_pool:add_callback_module/2` and `wpool_pool:remove_callback_module/2`.
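
Putting several of these options together, a minimal sketch of starting a supervised pool (`my_pool` and `my_worker` are illustrative names, not part of wpool's API):

```erlang
%% Start a supervised pool of 50 workers running the hypothetical
%% my_worker gen_server, warning about tasks that run longer than 500ms.
{ok, _Pid} = wpool:start_sup_pool(my_pool,
                                  [{workers, 50},
                                   {worker, {my_worker, []}},
                                   {overrun_warning, 500},
                                   {overrun_handler, {error_logger, warning_report}}]).
```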

-#### Using the Workers
+### Using the Workers

Since the workers are `gen_server`s, messages can be `call`ed or `cast`ed to them. To do that you can use `wpool:call` and `wpool:cast` as you would use the equivalent functions on `gen_server`.
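
For instance, a short sketch (assuming the `my_pool` pool from above, whose workers implement the usual `gen_server` callbacks):

```erlang
%% Synchronous request: wpool picks a worker and gen_server:call's it.
Reply = wpool:call(my_pool, {do_work, some_arg}),
%% Asynchronous request: wpool picks a worker and gen_server:cast's it.
ok = wpool:cast(my_pool, {log_event, other_arg}).
```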

-##### Choosing a Strategy
+#### Choosing a Strategy

Beyond the regular parameters for `gen_server`, wpool also provides an extra optional parameter: **Strategy**, the strategy used to pick the worker that will perform the task. If not provided, the result of `wpool:default_strategy/0` is used. The available strategies are defined in the `t:wpool:strategy/0` type and are also described below:

-###### best_worker
-Picks the worker with the smaller queue of messages. Loosely based on [this article](http://lethain.com/load-balancing-across-erlang-process-groups/). This strategy is usually useful when your workers always perform the same task, or tasks with expectedly similar runtimes.
+##### best_worker
+
+Picks the worker with the smallest message queue. Loosely based on [this article](https://lethain.com/load-balancing-across-erlang-process-groups/). This strategy is usually useful when your workers always perform the same task, or tasks with expectedly similar runtimes.

-###### random_worker
+##### random_worker
+
Just picks a random worker. This is the fastest strategy for selecting a worker; it's ideal if your workers will perform many short tasks.

-###### next_worker
+##### next_worker
+
Picks the next worker in a round-robin fashion. That ensures an even distribution of tasks.

-###### available_worker
+##### available_worker
+
Instead of just picking one of the workers in the queue and sending the request to it, this strategy queues the request and waits until a worker is available to perform it. That may make the worker selection part of the process much slower (thus generating the need for an additional parameter, **Worker_Timeout**, that controls how many milliseconds the client is willing to spend on worker selection, regardless of the global **Timeout** for the call).
This strategy ensures that, if a worker crashes, no messages are lost in its message queue.
It also ensures that, if a task takes too long, it doesn't block other tasks: as soon as another worker is free, it can pick up the next task in the list.

-###### next_available_worker
+##### next_available_worker
+
In a way, this strategy behaves like `available_worker` in the sense that it will pick the first worker it can find that is not running any task at the moment, but the difference is that it will fail if all workers are busy.

-###### hash_worker
+##### hash_worker
+
This strategy takes a key and selects a worker using [`erlang:phash2/2`](https://www.erlang.org/doc/man/erlang.html#phash-2). This ensures that tasks classified under the same key will be delivered to the same worker, which is useful to classify events by key and work on them sequentially on the worker, distributing different keys across different workers.
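
A sketch of selecting a strategy per request, assuming `wpool:call/3` takes the strategy and `wpool:call/4` additionally takes a timeout (`my_pool` is an illustrative name):

```erlang
%% Pick any worker at random.
R1 = wpool:call(my_pool, get_state, random_worker),
%% Wait for a free worker, with a 5-second call timeout.
R2 = wpool:call(my_pool, get_state, available_worker, 5000),
%% Requests sharing a key are serialized on the same worker.
R3 = wpool:call(my_pool, {update, user_42}, {hash_worker, user_42}).
```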

-#### Broadcasting a Pool
+### Broadcasting a Pool

Wpool provides a way to `broadcast` a message to every worker within the given Pool.

```erlang
%% (the broadcast call itself is collapsed in this diff view; see the sketch below)
ok
```

**NOTE:** These messages don't get queued; they go straight to the workers' message queues, so if you're using the available_worker strategy to balance the load and you have some tasks queued up waiting for the next available worker, the broadcast will reach all the workers **before** the queued-up tasks.
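
Since the example above is collapsed in the diff, here is a sketch of what a broadcast might look like, assuming the default `wpool_worker` workers (which accept `{Module, Function, Args}` casts); `my_pool` is an illustrative name:

```erlang
%% Every worker in the pool applies io:format/1; broadcast returns ok right away.
ok = wpool:broadcast(my_pool, {io, format, ["Hello, world!~n"]}).
```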

-#### Watching a Pool
+### Watching a Pool

Wpool provides a way to get live statistics about a pool. To do that, you can use `wpool:stats/1`.
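
A sketch, assuming the returned proplist includes a `total_message_queue_len` entry (`my_pool` is an illustrative name):

```erlang
Stats = wpool:stats(my_pool),
%% Aggregate length of all the workers' message queues.
QueueLen = proplists:get_value(total_message_queue_len, Stats).
```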

-#### Stopping a Pool
-To stop a pool, just use `wpool:stop/1`.
+### Stopping a Pool
+
+To stop a pool, just use `wpool:stop_pool/1`.

-### Examples
+## Examples

To see how `wpool` is used, you can check the [test](test) folder, where you'll find many different scenarios exercised in the different suites.

-If you want to see **worker_pool** in a _real life_ project, I recommend you to check [sumo_db](https://github.com/inaka/sumo_db), another open-source library from [Inaka](http://inaka.github.io/) that uses wpool intensively.
+If you want to see **worker_pool** in a _real-life_ project, I recommend checking out [sumo_db](https://github.com/inaka/sumo_db), another open-source library from [Inaka](https://inaka.github.io/) that uses wpool intensively.

-### Benchmarks
-**wpool** comes with a very basic [benchmarker](test/wpool_bench.erl) that let's you compare different strategies against the default `wpool_worker`. If you want to do the same in your project, you can use `wpool_bench` as a template and replace the worker and the tasks by your own ones.
+## Benchmarks
+
+**wpool** comes with a very basic [benchmarker](https://github.com/inaka/worker_pool/blob/main/test/wpool_bench.erl) that lets you compare different strategies against the default `wpool_worker`. If you want to do the same in your project, you can use `wpool_bench` as a template and replace the worker and the tasks with your own.

-### Contact Us
+## Contact Us
+
If you find any **bugs** or have a **problem** while using this library, please [open an issue](https://github.com/inaka/worker_pool/issues/new) in this repo (or a pull request :)).

-### On Hex.pm
+## On Hex.pm

Worker Pool is available on [Hex.pm](https://hex.pm/packages/worker_pool).
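
To depend on it with rebar3, a minimal sketch of the relevant `rebar.config` entry (the Hex package name is `worker_pool`; pinning a specific version is up to you):

```erlang
%% rebar.config: pull the latest worker_pool release from Hex.pm
{deps, [worker_pool]}.
```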

-### Requirements
-**Required OTP version 23** or or higher. We only provide guarantees that the system runs on `OTP23+` since that's what we're testing it in, but the `minimum_otp_vsn` is `"21"` because some systems where **worker_pool** is integrated do require it.
+## Requirements
+
+**Required OTP version: 25 or higher.** We only guarantee that the system runs on `OTP25+`, since that's what we test it on, but the `minimum_otp_vsn` is `"21"` because some systems where **worker_pool** is integrated require it.
6 changes: 0 additions & 6 deletions priv/overview.edoc

This file was deleted.
