Shortened and added documentation
DoctressWasTaken committed Feb 19, 2022
1 parent 21d3a7e commit 7198b07
Showing 5 changed files with 98 additions and 186 deletions.
2 changes: 0 additions & 2 deletions .env
@@ -1,5 +1,3 @@
# Defaults to this server
SERVER=NA1
# Compose settings
COMPOSE_FILE=compose-services.yaml
COMPOSE_PROJECT_NAME=lightshield
217 changes: 36 additions & 181 deletions README.md
@@ -1,195 +1,50 @@

# Lightshield (LS)
# Lightshield

This tool provides a fully automated system to permanently mirror the official Riot API content
on your local system. It uses a number of microservices to split requests against different endpoints
across separate processes and allows scaling each service according to your current data needs.
This is especially useful in the early stages of setting up a data repository.
A self-contained pipeline to keep a local mirror of the RiotGames API.
Does:
- Pull regular updates on player ranks
- Update players' match histories
- Save match_details and match_timeline for matches

The setup is meant for projects that use a lot of match data without wanting to build their own data pipeline.
Lightshield currently does not pull **all** match details data but only a select subset (for details
check the SQL schemas in the postgres folder). Changes can be made easily by expanding or replacing the service.
Does not:
- Give real-time updates on players
- Give per-match updates on player ranks
- Work well with personal key rate limits

Lightshield is optimized to not repeat calls unnecessarily. This comes at the cost of data being added in a
less-than-real-time fashion.

### Standalone Multi-Host Ratelimiter
Lightshield offers its ratelimiter as a standalone [python library](https://pypi.org/project/lightshield/) which only
requires the redis services included in this larger package.

## What Lightshield does **not** do well
#### *Deliver up-to-date data*
While it can get up-to-date data on users, it is not set up for this by default.
For more details see [here.](Rate%20Limiting.md)

#### *Gather data on unranked players*
By default Lightshield pulls data through the league endpoint, which requires a user to be ranked.
As such, gathering information on unranked players is not supported by default but can be added manually.

## Structure in Short

Lightshield handles data through a distributable, scalable network of triangular microservice structures.
All data collected is stored in a dedicated postgres database. Task handling and scalability are provided through a
buffering redis database.

- Each cluster of services is responsible for a single server and requires an additional proxy for rate limiting and
handling of the API key.

Each step of data processing is stored inside the Postgres database, from which a single manager service creates tasks.
Worker services then process the tasks, add the data to the DB and close the task.


## Requirements
Lightshield runs on docker; the images can either be built from this repository or pulled
[directly from DockerHub](https://hub.docker.com/u/lightshield).


## Setup

### Env variables
Copy the included secrets_template.env and rename the copy to secrets.env.

### I. Network

Initialize the network used to bundle all services together and allow communication:
```shell script
docker network create lightshield
```
The name can be changed but then has to be updated in the compose files as well.
If you are planning on running it through docker swarm, use the appropriate network type.

### II. Database
Set up a postgres database either locally, in docker (attached to the network), or remotely. The services currently expect
no password verification; as such, a remote postgres instance should only be used if you can limit access through other means.

DB connection details can be configured through a secrets.env file (template file included).

Lightshield requires the tables listed in the postgres/ folder to be set up in the specified database, under schemas
corresponding to the server they will contain data for. E.g. pulling data for NA1 requires setting up a schema `na1` (lower case)
with all tables inside said schema, as well as a user `na1` which will be used to pull data from said schema.
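
As a minimal sketch of that setup step (assuming a locally reachable postgres instance and the `asyncpg` client; the actual table definitions come from the SQL files in the postgres/ folder):

```python
import asyncio

import asyncpg


async def prepare_server(server: str):
    """Create the per-server schema and user, e.g. 'na1' for NA1."""
    name = server.lower()
    # Connection details are placeholders; adjust them to your secrets.env values.
    con = await asyncpg.connect(host='localhost', port=5432,
                                user='postgres', database='lightshield')
    await con.execute(f'CREATE USER {name}')
    await con.execute(f'CREATE SCHEMA IF NOT EXISTS {name} AUTHORIZATION {name}')
    # The tables from the postgres/ folder still have to be created inside this schema.
    await con.close()


asyncio.run(prepare_server('NA1'))
```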

### III. Centralized Services
Services included in the `compose-global.yaml` file are meant to be run centrally, meaning they are server-agnostic (usually a one-off).
Currently this refers to the proxy service as well as the buffer redis db.
Start the services by explicitly referencing the compose file:
```shell
# Pull (specify a tag if needed, defaults to latest)
TAG=latest docker-compose -f compose-global.yaml pull
# Or build
docker-compose -f compose-global.yaml build
# And then run
docker-compose -f compose-global.yaml up -d
```

### IV. Server Specific Structure

#### For docker-compose
Run docker compose with default parameters. The services require the selected server to be passed into the container via
the environment variable `SERVER`. In addition, make sure to use different project names, either through `-p [project name]`
or through the env variable `COMPOSE_PROJECT_NAME`. This stops multiple server setups from overwriting one another.

```shell script
# Build from source
docker-compose build
# Or pull from docker hub (specify a tag if needed, defaults to latest)
TAG=latest docker-compose pull
# Run either with the -p tag
SERVER=EUW1 docker-compose -p lightshield_euw1 up -d
# Or env variable
SERVER=EUW1 COMPOSE_PROJECT_NAME=lightshield_euw1 docker-compose up -d
```
#### For docker-swarm
Follow the same guidelines explained for docker-compose. The images can either be built or pulled from DockerHub.
`SERVER` still needs to be passed into the service.
The individual project name is passed through the stack name.
```shell script
SERVER=EUW1 docker stack deploy -c compose-services.yaml lightshield_euw1
```
The compose file includes the base_image service, which is just a unified default image for all other services. As such
it isn't an actual runnable service and would usually just exit immediately when run in docker-compose. Swarm, however, tries
to rebuild and restart the service continuously, so you need to manually remove it from the stack to avoid that.

<hr>

## Functionality

Services are generally structured in a triangular form: the persistent postgres DB as source of truth,
a single non-scalable manager that selects new tasks, and a scalable number of microservices that work through those tasks.

2 lists are used to buffer tasks:
- A list of non-selected tasks that are yet to be worked on.
- A list of selected tasks that are being or have been worked on.

The manager service looks at both lists to determine if it should add new tasks to the first (non-selected) list.
Each worker service pulls tasks from the first list and adds them to the second list. Tasks are never explicitly marked as done,
as tasks are by default no longer eligible once their result is inserted in the DB; e.g. if a summonerId has to be pulled, the manager
will only select accounts without a summonerId as tasks. Once the summonerId is added, the account is automatically ineligible.
All tasks in the second (selected) list that are older than `TASK_BLOCKING` (secrets.env) minutes are periodically removed by the manager,
making space for new tasks. As this is the only way tasks are removed, keep the limit just high enough to ensure that tasks
currently being worked on are not removed, while not letting the queue overflow with already finished tasks.
A sketch of this pattern is shown below.
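
A minimal sketch of this two-list pattern, assuming the redis-py async client and illustrative key names (`tasks_open`, `tasks_selected`); the actual buffer layout and selection queries live in the individual manager and worker services:

```python
import asyncio
import time

import redis.asyncio as aioredis  # pip install redis

TASK_BLOCKING = 30  # minutes, mirrors the secrets.env setting


async def manager(redis: aioredis.Redis, fetch_tasks):
    """Refill the open list and expire stale entries from the selected list."""
    while True:
        open_count = await redis.llen('tasks_open')
        selected_count = await redis.zcard('tasks_selected')
        if open_count + selected_count < 1000:
            # fetch_tasks queries postgres for still-eligible work,
            # e.g. accounts that do not have a summonerId yet.
            for task in await fetch_tasks(limit=200):
                await redis.rpush('tasks_open', task)
        # Drop selected tasks older than TASK_BLOCKING minutes.
        cutoff = time.time() - TASK_BLOCKING * 60
        await redis.zremrangebyscore('tasks_selected', '-inf', cutoff)
        await asyncio.sleep(10)


async def worker(redis: aioredis.Redis, process):
    """Pull a task, mark it as selected, and let the DB insert make it ineligible."""
    while True:
        task = await redis.lpop('tasks_open')
        if task is None:
            await asyncio.sleep(1)
            continue
        await redis.zadd('tasks_selected', {task: time.time()})
        await process(task)  # API call + insert into postgres
```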
Lightshield runs almost fully out of the box by simply starting the docker-compose file.
- No services start by default; they have to be turned on manually in the included interface.
- *Match Details* and *Match Timeline* JSON files are by default saved in the `match_data/` folder located in the
project folder itself. If the folder doesn't exist, it has to be created.
- Lightshield does not expose any ports outwards by default; all port exposure is localhost only. As such,
accessing the web interface is only possible if the service is started locally or through ssh tunneling. For ssh tunneling
the following ports should be bound: `ssh -L 8301:localhost:8301 -L 8302:localhost:8302`. This enables communication
with both the frontend and backend services, and the interface is then reachable locally under `localhost:8301`.

### Services

#### League Ranking
Uses the league-exp endpoint to crawl all ranked users in a continuous loop. This service has no manager and only needs to be run as a one-off.
Via the `UPDATE_INTERVAL=1` variable in the compose file you can set the delay between cycles in hours. By default, after finishing
with Challenger the service will wait 1 hour before restarting on Iron games.

#### Summoner ID
Uses the summoner endpoint to pull the remaining IDs for each account that were not included in the league ranking response.
This is a one-time task for each user, but it will take a lot of initial requests until all are done.

#### Match History
Pulls and updates the match history of users. Prioritizes users for which no calls have been made so far and, once all users have
had their history pulled, users that have played the most soloQ matches since their last update.
Use the `MIN_MATCHES=20` variable to set how many new matches a player has to play to even be considered for an update.
Because each match involves 10 players whose histories would be refreshed, consider keeping this number high so as to not make 10 match-history
calls per new match. Setting it to 10 or 20 means that for each match played, on average only 1 or 0.5 calls have to be made (see the sketch below).
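
A quick back-of-the-envelope illustration of that trade-off (a simplification that ignores unranked participants and overlapping updates):

```python
def history_calls_per_played_match(min_matches: int, participants: int = 10) -> float:
    """Average match-history calls triggered per match played, across all participants."""
    # Every played match counts towards 10 players, but a player's history is only
    # refreshed once they have accumulated min_matches new games.
    return participants / min_matches


print(history_calls_per_played_match(10))  # 1.0
print(history_calls_per_played_match(20))  # 0.5
```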

#### Match Details
Pulls match details for buffered match_ids. This service pulls match details and adds a select number of attributes to the DB
(check the SQL files for more info). If more or less data is needed, you have to update the service and DB schema accordingly.
Matches are pulled newest to oldest, and matches older than the cutoff defined through `DETAILS_CUTOFF` (secrets.env) are skipped, as sketched below.
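
Illustratively, the cutoff acts as a filter on the buffered matches before they become tasks (a sketch only; the field names and the unit of `DETAILS_CUTOFF` are assumptions):

```python
from datetime import datetime, timedelta

DETAILS_CUTOFF = 90  # days; mirrors the secrets.env setting


def eligible(matches):
    """Yield buffered matches newest-first, skipping anything older than the cutoff."""
    cutoff = datetime.utcnow() - timedelta(days=DETAILS_CUTOFF)
    for match in sorted(matches, key=lambda m: m['timestamp'], reverse=True):
        if match['timestamp'] >= cutoff:
            yield match
```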

#### Match Timeline
Currently not implemented, WIP.


## Lightshield Tools

Tools and code sections of the Lightshield framework that were a better fit to be provided as a dependency
rather than included in the main project.

#### What currently doesn't work:
- The keys used to save data in Redis are not linked to the API key; as such, multiple API keys have to use
multiple Redis servers.

### Ratelimiter (WIP)

Multi-Host async ratelimiting service. The clients each sync via a central redis server.

Set up the proxy in an async context with redis connection details.
```python
from lightshield.proxy import Proxy
import aiohttp

async def run():
    p = Proxy()
    # Initiate the redis connector in async context
    await p.init(host='localhost', port=5432)
```

Make calls directly to one endpoint.
Provide the server the calls are run against and an identifier that helps you recognize which endpoint the requests
target.
```python
async with aiohttp.ClientSession(headers={'X-Riot-Token': ''}) as session:
    zone = await p.get_endpoint(server='europe', zone='league-exp')
    for page in range(1, 10):
        zone.request('https://euw1.api.riotgames.com/lol/league-exp/v4/entries/RANKED_SOLO_5x5/SILVER/I?page=%s' % page, session)
```

### Settings (WIP)
The settings file contains a number of variables that are used across the project.
Variables can be set through:
`ENV > config.json > default`
```python
from lightshield import settings

headers = {'X-Riot-Token': settings.API_KEY}
```
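
A minimal sketch of how such an `ENV > config.json > default` lookup could be resolved (an illustration, not the actual `lightshield.settings` implementation; the file name `config.json` is taken from the precedence above):

```python
import json
import os


def resolve(name: str, default=None):
    """Return a setting, preferring environment variables over config.json over the default."""
    if name in os.environ:
        return os.environ[name]
    try:
        with open('config.json') as f:
            config = json.load(f)
        if name in config:
            return config[name]
    except FileNotFoundError:
        pass
    return default


API_KEY = resolve('API_KEY', default='')
```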
### Data Storage
#### Postgres
Data is by default stored in the `lightshield` database in the included postgres service. Data is saved partly in
platform-wide schemas (e.g. `EUW1. | NA1.`), partly centrally, and partly in region-wide schemas (e.g. `europe. | americas.`).
Details on structure can be found in the corresponding [folder.](postgres)
#### JSON Files
Both Details and Timeline files are saved locally instead of in the DB. This reduces load on the DB overall and
speeds up the services, as neither large JSON blobs nor multiple normalized table entries have to be inserted per match.
Only identifying parameters are stored in the DB.
The following folder structure is used to save data locally for JSON files:
##### Match Details
`match_data/details/[patch]/[day]/[platform]/[matchid].json`
##### Match Timeline
`match_data/timeline/[platform]/[match_id[:5]]/[matchid].json`
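
For illustration, the documented layout translates into paths like the following (the helper functions and the example patch, day and match id values are hypothetical, not part of Lightshield):

```python
from pathlib import Path


def details_path(patch: str, day: str, platform: str, match_id: str) -> Path:
    return Path('match_data') / 'details' / patch / day / platform / f'{match_id}.json'


def timeline_path(platform: str, match_id: str) -> Path:
    # Timelines are bucketed by the first five characters of the match id.
    return Path('match_data') / 'timeline' / platform / match_id[:5] / f'{match_id}.json'


print(details_path('12.3', '2022-02-19', 'EUW1', 'EUW1_5711234567'))
# match_data/details/12.3/2022-02-19/EUW1/EUW1_5711234567.json
print(timeline_path('EUW1', 'EUW1_5711234567'))
# match_data/timeline/EUW1/EUW1_/EUW1_5711234567.json
```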
52 changes: 52 additions & 0 deletions Rate Limiting.md
@@ -0,0 +1,52 @@
# Rate Limiting

Lightshield offers a multi-host ratelimiting solution by syncing limits through a redis server.

Ratelimits are automatically generated and depend on the user to provide the proper server/method.

### How does it work

A user initiates an endpoint by providing 2 values:
- Server
- Zone

E.g. `server=EUW1, zone='match_details'`

Ratelimits are then checked as follows:
- Server -> App-Rate-Limit
- Zone -> Method-Rate-Limit

The naming scheme for each zone is up to the user, as the individual endpoints are not predefined.


### Usage

```python
from lightshield.proxy import Proxy
import aiohttp
import asyncio

async def run(proxy):
    # Initiate the redis connector in async context
    await proxy.init(host='localhost', port=5432)

    # Create an endpoint for your requests
    zone = await proxy.get_endpoint(server='europe', zone='league-exp')

    async with aiohttp.ClientSession(headers={'X-Riot-Token': ''}) as session:
        for page in range(1, 10):
            # Pass request url + session to the API
            zone.request('https://euw1.api.riotgames.com/lol/league-exp/v4/entries/RANKED_SOLO_5x5/SILVER/I?page=%s' % page, session)

def main():
    # Set up the proxy instance
    proxy = Proxy()
    asyncio.run(run(proxy))

if __name__ == "__main__":
    main()
```


### Source code
See [here](lightshield/proxy)
12 changes: 10 additions & 2 deletions compose-services.yaml
@@ -1,12 +1,13 @@
version: '3.7'
services:

  ## Drakebane
  ### Drakebane
  drakebane_frontend:
    hostname: drakebane_frontend
    build:
      dockerfile: Dockerfile
      context: drakebane/frontend
    image: lightshield/drakebane_frontend:${TAG}
    restart: always
    ports:
      - 127.0.0.1:8301:80
@@ -16,6 +17,7 @@ services:
    build:
      dockerfile: Dockerfile
      context: drakebane/backend
    image: lightshield/drakebane_backend:${TAG}
    restart: always
    volumes:
      - drakebane_settings:/project/configs/
@@ -30,6 +32,7 @@ services:
    build:
      dockerfile: Dockerfile
      context: redis
    image: lightshield/redis:${TAG}
    restart: always
    volumes:
      - redis_data:/data
@@ -41,6 +44,7 @@ services:
    build:
      dockerfile: Dockerfile
      context: postgres
    image: lightshield/postgres:${TAG}
    restart: always
    ports:
      - 127.0.0.1:8303:5432
@@ -65,6 +69,7 @@ services:
    build:
      dockerfile: Dockerfile
      context: services/summoner_id
    image: lightshield/summoner_id:${TAG}
    restart: always

  match_history:
@@ -78,6 +83,7 @@ services:
    build:
      dockerfile: Dockerfile
      context: services/match_details
    image: lightshield/match_details:${TAG}
    volumes:
      - type: bind
        source: ./match_data
@@ -89,18 +95,20 @@ services:
    build:
      dockerfile: Dockerfile
      context: services/match_timeline
    image: lightshield/match_timeline:${TAG}
    volumes:
      - type: bind
        source: ./match_data
        target: /project/data
    restart: always

  # Glue -- Tiny service to do some linking between data
  ### Glue -- Tiny service to do some linking between data
  glue:
    hostname: glue
    build:
      dockerfile: Dockerfile
      context: services/glue
    image: lightshield/glue:${TAG}
    restart: always

volumes:
1 change: 0 additions & 1 deletion drakebane/backend/server.py
@@ -49,7 +49,6 @@ async def update_settings(self):
        await con.set("regions", json.dumps(self.settings["regions"]))
        await con.set("apiKey", self.settings["apiKey"])

        formatted = {}
        for key, value in self.settings["services"].items():
            if value:
                await con.set("service_%s" % key, "true")
