|
3 | 3 | ![tscached logo]
|
4 | 4 | (https://github.com/zachm/tscached/raw/master/logo/logo.png)
|
5 | 5 |
|
6 |
| -tscached is a smart caching proxy, built with Redis, for time series data in the [KairosDB](https://kairosdb.github.io/) format. |
| 6 | +tscached is a smart caching proxy for time series data in the [KairosDB](https://kairosdb.github.io/) format. By engineering toward a drastic improvement in user experience, tscached makes dashboards and charts load over **100x faster** than a standard configuration of KairosDB. |
7 | 7 |
|
8 |
| -Inspired by [arussellsaw/postcache](https://github.com/arussellsaw/postcache) - tscached goes one |
9 |
| -step further: *A previously issued query will be reissued across only the elapsed time since its |
10 |
| -last execution.* This provides a substantial improvement in serving high-volume load, especially temporally long queries that return thousands of time series. Using only simple techniques - consistent hashing, read-through caching, and backend load chunking - we provide user-perceived read latency improvements of up to 100x. |
| 8 | +## Motivation |
11 | 9 |
|
12 |
| -There are several different frontends to use with a Kairos-compliant API like this one, but the most full-featured remains (as always) [Grafana](http://grafana.org/) with [this plugin](https://github.com/grafana/kairosdb-datasource) installed. |
| 10 | +KairosDB is a powerful, scalable solution for storing large amounts of time-series data. It's built on top of an off-the-shelf data store (Cassandra), it ingests data *really fast*, and it stores that data in a *lossless* schema. |
13 | 11 |
|
14 |
| -## Design Docs |
15 |
| -They may be out of date with the current release, although likely not by much: [DESIGN.md](https://github.com/zachm/tscached/blob/master/DESIGN.md). |
| 12 | +Unfortunately, getting data back *out* of KairosDB can be challenging: read performance just isn't as good as write performance. The use cases bear this out: our sources write one datapoint at a time, but our consumers request **hours** of data from **thousands** of time series, which will be formatted into charts and dashboards. To design well for both scenarios is very difficult, so it makes sense to separate a read-performance solution into its own system. |
| 13 | + |
| 14 | +## Design |
| 15 | + |
| 16 | +If you're interested in the original design docs, you can find them [here](https://github.com/zachm/tscached/blob/master/DESIGN.md). |
| 17 | + |
| 18 | +tscached makes a few assumptions: |
| 19 | +* Most time series are **write once, read never.** Users care about only a small fraction of total data, but they need to be able to access all of it in a pinch.
| 20 | +* Grafana doesn't care what it's talking to: By reimplementing the KairosDB API, tscached is truly a drop-in solution. |
| 21 | +* Consistent hashing is cheap: We can create easy Redis keys based on a query's semantic parts, including its grouping and aggregation components. |
| 22 | +* Redis is fast: We can have plenty of *O(n)* logic during processing because we've lowered our accesses to *O(1).* |
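The key-building idea in the list above can be sketched as follows. This is a minimal illustration, not tscached's actual implementation: the function name `cache_key`, the `tscached:kquery` prefix, and the choice of SHA-256 are all assumptions made for the example. The fields (`name`, `tags`, `group_by`, `aggregators`) are the parts of a KairosDB metric query that determine its result; time bounds are deliberately left out so that the same query issued later maps to the same key.

```python
import hashlib
import json


def cache_key(query, prefix="tscached:kquery"):
    """Build a deterministic Redis-style key from a query's semantic parts.

    Hypothetical helper: the prefix and field selection are assumptions.
    """
    # Keep only the parts that define *what* is being asked for.
    # Start/end times are excluded on purpose.
    semantic = {
        "name": query["name"],
        "tags": query.get("tags", {}),
        "group_by": query.get("group_by", []),
        "aggregators": query.get("aggregators", []),
    }
    # sort_keys yields one canonical serialization regardless of dict order.
    canonical = json.dumps(semantic, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "{}:{}".format(prefix, digest)
```

Two queries that differ only in key order or time range produce the same key, so a single `GET` against Redis is enough to find any previously cached result.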
| 23 | + |
| 24 | +tscached makes a few advancements, too: |
| 25 | +* A previously issued (and cached) query will be reissued across **only the elapsed time since its |
| 26 | +last execution.** While a one-hour tscached query first requires one hour's worth of KairosDB data, the same query made one minute later requires only one minute's worth of data. Dashboard refresh rate is the lowest common denominator! |
| 27 | +* Caching **metadata** speeds up the user experience when making dashboards with Grafana. No more lag on dropdown menus! |
| 28 | +* Dashboards can be **pre-cached,** eliminating the initial cold scenario, using a *readahead* script included with the service. |
| 29 | +* Long queries are **chunked.** Splitting a six-hour query into six one-hour queries, for instance, can improve performance by up to 10x. The client never knows the difference. |
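Two of the ideas above, fetching only the elapsed window and chunking long ranges, come down to simple interval arithmetic. The sketch below assumes millisecond timestamps and uses hypothetical helper names (`missing_window`, `chunk_range`); it shows the technique in isolation and is not taken from the tscached source.

```python
def missing_window(cached_end_ms, now_ms):
    """Return the (start, end) range the cache does not cover yet, or None."""
    if cached_end_ms >= now_ms:
        return None  # Cache is already current; nothing to fetch.
    return (cached_end_ms, now_ms)


def chunk_range(start_ms, end_ms, chunk_ms):
    """Split [start_ms, end_ms) into consecutive windows of at most chunk_ms."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    chunks = []
    cursor = start_ms
    while cursor < end_ms:
        upper = min(cursor + chunk_ms, end_ms)
        chunks.append((cursor, upper))
        cursor = upper
    return chunks
```

With a one-hour chunk size, a six-hour range yields six windows that can be requested from the backend independently and merged before the response is returned. A query cached one minute ago yields a single one-minute window from `missing_window`.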
| 30 | + |
| 31 | +Credit where credit is due: [arussellsaw/postcache](https://github.com/arussellsaw/postcache) was a huge inspiration. Postcache is a great solution if an office has 10 monitors all showing the same dashboard such that all load is exactly the same. However, if an office has hundreds of engineers loading thousands of different dashboards, postcache won't help much, since no two dashboards will create the same exact load nor have the same refresh rates. |
| 32 | + |
| 33 | +There are several different frontends to use with a Kairos-compliant API like this one, but the most full-featured remains (as always) [Grafana](http://grafana.org/) with [this plugin](https://github.com/grafana/kairosdb-datasource) installed. And if you're looking to send system metrics *into* KairosDB, do check out [Fullerite](http://github.com/Yelp/fullerite/): it's cross-compatible with Diamond, super efficient, and supports KairosDB out of the box! |
| 34 | + |
| 35 | +## High-Level Architecture |
| 36 | +tscached is designed to fit well into a common scenario, where a frontend like Grafana sends read requests to a backing KairosDB cluster. From KairosDB's perspective, tscached behaves just like any other client. From Grafana's perspective, tscached behaves just like any other KairosDB server. This diagram shows one way to hook it all together. |
| 37 | + |
| 38 | + |
| 39 | + |
| 40 | + |
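Because tscached reimplements the KairosDB API, a client should be able to talk to it exactly as it would to KairosDB. The sketch below (Python 3) builds a standard KairosDB query request for the `/api/v1/datapoints/query` endpoint. The base URL `http://localhost:8008` and the metric name are assumptions for illustration, and `build_query_request` is a hypothetical helper, not part of tscached.

```python
import json
from urllib.request import Request


def build_query_request(base_url, metric, hours=1):
    """Build a KairosDB-style POST request for the last `hours` of one metric."""
    body = {
        "start_relative": {"value": hours, "unit": "hours"},
        "metrics": [{"name": metric}],
    }
    return Request(
        base_url.rstrip("/") + "/api/v1/datapoints/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed address of a locally running tscached instance.
request = build_query_request("http://localhost:8008", "cpu.load")
```

Passing `request` to `urllib.request.urlopen` would send it. Pointing the same request at a KairosDB host instead should need no other change, which is the sense in which tscached is a drop-in.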
| 41 | +## Installation and Use |
| 42 | + |
| 43 | +### Developing |
| 44 | + |
| 45 | +Building is known to work on OS X (El Capitan) and on Ubuntu Trusty. |
| 46 | + |
| 47 | +On OS X, you'll need to have these installed: |
| 48 | +* ```make``` et al. (available from the Xcode package)
| 49 | +* [Homebrew](http://brew.sh/), so you don't break the system Python. |
| 50 | +```bash |
| 51 | +brew install python |
| 52 | +pip install virtualenv |
| 53 | +make run |
| 54 | +``` |
| 55 | + |
| 56 | +On Ubuntu, you pretty much just need python2.7 and the standard development packages. |
| 57 | + |
| 58 | +You can also run a single-threaded server that will auto-refresh on code changes: |
| 59 | +```bash |
| 60 | +make debug |
| 61 | +``` |
| 62 | + |
| 63 | + |
| 64 | +### Within a Container |
| 65 | +If you're into Docker, the included Dockerfile is pretty self-explanatory. |
| 66 | +```bash |
| 67 | +$ docker run -d -p 8008:8008 --name=tscached .
| 68 | +``` |
| 69 | + |
| 70 | +### As a Debian Package |
| 71 | +tscached can be deployed via .DEB files and the Upstart system init framework. You'll need [dh-virtualenv](https://github.com/spotify/dh-virtualenv), among other things, to build. The Debian packaging has been tested on **Ubuntu 14.04 Trusty only**, but do feel free to submit patches for other releases. |
| 72 | +```bash |
| 73 | +$ make package |
| 74 | +``` |
| 75 | + |
| 76 | +### Configuration Files |
| 77 | + |
| 78 | +```tscached.uwsgi.ini``` contains some uWSGI-specific details, such as port assignments and number of threads/processes to run. It will accept the standard uWSGI INI options. |
| 79 | + |
| 80 | +```tscached.yaml``` contains all relevant configuration details. For initial use, you'll **definitely** want to adjust the host/port entries for Redis and KairosDB. Most of the rest is (moderately) self-explanatory. |
| 81 | + |
| 82 | + |
| 83 | +# Contributing |
| 84 | + |
| 85 | +Bug reports, success (or failure) stories, questions, suggestions, feature requests, and (documentation or code) patches are all very welcome. |
| 86 | + |
| 87 | +Feel free to ping @zachtm on Twitter if you'd like help running/configuring/dealing with this software. |
| 88 | + |
| 89 | +This project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. |
| 90 | + |
| 91 | +# Copyright |
| 92 | + |
| 93 | +Copyright 2016 Zach Musgrave. |
| 94 | + |
| 95 | +# License |
| 96 | + |
| 97 | +GNU GPLv3 - See the included LICENSE file for more details. |