Commit a435a1a

committed
new readme
1 parent 2462b12 commit a435a1a

1 file changed: +89 -7 lines changed

README.md

Lines changed: 89 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -3,13 +3,95 @@
33
![tscached logo]
44
(https://github.com/zachm/tscached/raw/master/logo/logo.png)
55

6-
tscached is a smart caching proxy, built with Redis, for time series data in the [KairosDB](https://kairosdb.github.io/) format.
6+
tscached is a smart caching proxy for time series data in the [KairosDB](https://kairosdb.github.io/) format. By engineering toward a drastic improvement in user experience, tscached makes dashboards and charts load over **100x faster** than a standard configuration of KairosDB.
77

8-
Inspired by [arussellsaw/postcache](https://github.com/arussellsaw/postcache) - tscached goes one
9-
step further: *A previously issued query will be reissued across only the elapsed time since its
10-
last execution.* This provides a substantial improvement in serving high-volume load, especially temporally long queries that return thousands of time series. Using only simple techniques - consistent hashing, read-through caching, and backend load chunking - we provide user-perceived read latency improvements of up to 100x.
8+
## Motivation
119

12-
There are several different frontends to use with a Kairos-compliant API like this one, but the most full-featured remains (as always) [Grafana](http://grafana.org/) with [this plugin](https://github.com/grafana/kairosdb-datasource) installed.
10+
KairosDB is a powerful, scalable solution for storing large amounts of time-series data. It's built on top of an off-the-shelf data store (Cassandra), it ingests data *really fast*, and it stores that data in a *lossless* schema.
1311

14-
## Design Docs
15-
They may be out of date with the current release, although likely not by much: [DESIGN.md](https://github.com/zachm/tscached/blob/master/DESIGN.md).
12+
Unfortunately, getting data back *out* of KairosDB can be challenging: read performance just isn't as good as write performance. The use cases bear this out: our sources write one datapoint at a time, but our consumers request **hours** of data from **thousands** of time series, which will be formatted into charts and dashboards. To design well for both scenarios is very difficult, so it makes sense to separate a read-performance solution into its own system.
13+
14+
## Design
15+
16+
If you're interested in the original design docs, you can find them [here](https://github.com/zachm/tscached/blob/master/DESIGN.md).
17+
18+
tscached makes a few assumptions:
19+
* Most time series are **write once, read never.** Users care about only a small fraction of total data, but they need to be able to access all of it in a pinch.
20+
* Grafana doesn't care what it's talking to: By reimplementing the KairosDB API, tscached is truly a drop-in solution.
21+
* Consistent hashing is cheap: We can create easy Redis keys based on a query's semantic parts, including its grouping and aggregation components.
22+
* Redis is fast: We can have plenty of *O(n)* logic during processing because we've lowered our accesses to *O(1).*
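
As a rough illustration of the key-building assumption above, the sketch below hashes a query's semantic parts into a deterministic Redis-style key. It is not taken from the tscached source: the function name `cache_key`, the `tscached:kquery` prefix, the choice of SHA-256, and the exact set of fields are all assumptions made for the example. The time range is left out on purpose so that a refreshed query maps to the same key.

```python
import hashlib
import json


def cache_key(metric, prefix="tscached:kquery"):
    """Build a deterministic key from a query's semantic parts (hypothetical helper)."""
    # Keep only the parts that change the meaning of the result.
    # The time range is deliberately excluded so refreshes reuse the key.
    semantic = {
        "name": metric["name"],
        "tags": metric.get("tags", {}),
        "group_by": metric.get("group_by", []),
        "aggregators": metric.get("aggregators", []),
    }
    # sort_keys yields one canonical encoding regardless of dict ordering.
    canonical = json.dumps(semantic, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "{}:{}".format(prefix, digest)
```

Two queries that differ only in key ordering produce the same key, while any change to tags, grouping, or aggregation produces a different one.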
23+
24+
tscached makes a few advancements, too:
25+
* A previously issued (and cached) query will be reissued across **only the elapsed time since its
26+
last execution.** While a one-hour tscached query first requires one hour's worth of KairosDB data, the same query made one minute later requires only one minute's worth of data. Dashboard refresh rate is the lowest common denominator!
27+
* Caching **metadata** speeds up the user experience when making dashboards with Grafana. No more lag on dropdown menus!
28+
* Dashboards can be **pre-cached,** eliminating the initial cold scenario, using a *readahead* script included with the service.
29+
* Long queries are **chunked.** Splitting a six-hour query into six one-hour queries, for instance, can improve performance by up to 10x. The client never knows the difference.
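
The incremental re-query and chunking ideas above can be sketched in a few lines. This is only an illustration under stated assumptions: timestamps are plain seconds, and the names `window_to_fetch` and `chunk_range`, along with the one-hour default chunk size, are hypothetical rather than part of tscached.

```python
def window_to_fetch(query_start, now, cached_end=None):
    """Return the (start, end) range that must still come from the backend."""
    if cached_end is None or cached_end < query_start:
        # Cold cache (or the cache no longer overlaps): fetch everything.
        return (query_start, now)
    # Warm cache: only the time elapsed since the last execution is missing.
    return (cached_end, now)


def chunk_range(start, end, chunk_seconds=3600):
    """Split [start, end) into consecutive sub-ranges of at most chunk_seconds."""
    chunks = []
    cursor = start
    while cursor < end:
        upper = min(cursor + chunk_seconds, end)
        chunks.append((cursor, upper))
        cursor = upper
    return chunks
```

With these helpers, a one-hour query repeated one minute later needs only a 60-second window from the backend, and a six-hour cold query becomes six one-hour backend queries.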
30+
31+
Credit where credit is due: [arussellsaw/postcache](https://github.com/arussellsaw/postcache) was a huge inspiration. Postcache is a great solution if an office has 10 monitors all showing the same dashboard such that all load is exactly the same. However, if an office has hundreds of engineers loading thousands of different dashboards, postcache won't help much, since no two dashboards will create the same exact load nor have the same refresh rates.
32+
33+
There are several different frontends to use with a Kairos-compliant API like this one, but the most full-featured remains (as always) [Grafana](http://grafana.org/) with [this plugin](https://github.com/grafana/kairosdb-datasource) installed. And if you're looking to send system metrics *into* KairosDB, do check out [Fullerite](http://github.com/Yelp/fullerite/): it's cross-compatible with Diamond, super efficient, and supports KairosDB out of the box!
34+
35+
## High-Level Architecture
36+
tscached is designed to fit well into a common scenario, where a frontend like Grafana sends read requests to a backing KairosDB cluster. From KairosDB's perspective, tscached behaves just like any other client. From Grafana's perspective, tscached behaves just like any other KairosDB server. This diagram shows one way to hook it all together.
37+
38+
![architecture](https://github.com/zachm/tscached/raw/master/example_architecture.png)
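
Because tscached presents itself as a KairosDB server, a client should be able to send it an ordinary KairosDB query. The sketch below builds (but does not send) such a request with the Python 3 standard library. The path `/api/v1/datapoints/query` and the body shape follow the KairosDB REST API. The helper name `build_query_request`, the metric name, and the assumption that tscached serves the same path on port 8008 are illustrative only.

```python
import json
from urllib.request import Request


def build_query_request(base_url, metric_name, hours=1):
    """Build a KairosDB-style POST query aimed at a tscached endpoint (hypothetical helper)."""
    body = {
        # Relative start time, as accepted by the KairosDB query API.
        "start_relative": {"value": hours, "unit": "hours"},
        "metrics": [{"name": metric_name}],
    }
    return Request(
        url=base_url.rstrip("/") + "/api/v1/datapoints/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would issue the query once a server is actually listening at the chosen address.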
39+
40+
41+
## Installation and Use
42+
43+
### Developing
44+
45+
Building is known to work on OS X (El Capitan) and on Ubuntu Trusty.
46+
47+
On OS X, you'll need to have these installed:
48+
* ```make``` et al. (available from the Xcode package)
49+
* [Homebrew](http://brew.sh/), so you don't break the system Python.
50+
```bash
51+
brew install python
52+
pip install virtualenv
53+
make run
54+
```
55+
56+
On Ubuntu, you pretty much just need python2.7 and the standard development packages.
57+
58+
You can also run a single-threaded server that will auto-refresh on code changes:
59+
```bash
60+
make debug
61+
```
62+
63+
64+
### Within a Container
65+
If you're into Docker, the included Dockerfile is pretty self-explanatory.
66+
```bash
67+
$ docker build -t tscached .   # image tag "tscached" is an assumption
$ docker run -d -p 8008:8008 --name=tscached tscached
68+
```
69+
70+
### As a Debian Package
71+
tscached can be deployed via .DEB files and the Upstart system init framework. You'll need [dh-virtualenv](https://github.com/spotify/dh-virtualenv), among other things, to build. The Debian packaging has been tested on **Ubuntu 14.04 Trusty only**, but do feel free to submit patches for other releases.
72+
```bash
73+
$ make package
74+
```
75+
76+
### Configuration Files
77+
78+
```tscached.uwsgi.ini``` contains some uWSGI-specific details, such as port assignments and number of threads/processes to run. It will accept the standard uWSGI INI options.
79+
80+
```tscached.yaml``` contains all relevant configuration details. For initial use, you'll **definitely** want to adjust the host/port entries for Redis and KairosDB. Most of the rest is (moderately) self-explanatory.
81+
82+
83+
# Contributing
84+
85+
Bug reports, success (or failure) stories, questions, suggestions, feature requests, and (documentation or code) patches are all very welcome.
86+
87+
Feel free to ping @zachtm on Twitter if you'd like help running/configuring/dealing with this software.
88+
89+
This project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
90+
91+
# Copyright
92+
93+
Copyright 2016 Zach Musgrave.
94+
95+
# License
96+
97+
GNU GPLv3 - See the included LICENSE file for more details.
