This is a (relatively) simple Python memoizing module (ie. a function cache), in which any dict-like can be used as the actual storage object.
Lets walk through a simple example. First we need somewhere to cache values. We will use a simple dictionary for this, but you can use anything with a dict-like interface (see lower for the specification), including persistent storage mechanisms.
# Make a store.
store = {}
Now we create the Memoizer
object itself. Any keyword arguments will be stored as options to the "default" region (more on this later).
# Initialize the cache object.
from memoize import Memoizer
memo = Memoizer(store)
There is a direct API for retrieving a value. We pass it the key we want, and a function that is used to calculate it.
def basic_func():
print 'called'
return 123
# Manually retrieve a value.
memo.get('basic', basic_func)
# stdout > called
# return > 123
# If we ask for the same key it will not run the function again.
memo.get('basic', basic_func)
# return > 123
You can check for a given key, and manually remove something from the cache if you want:
memo.exists('basic')
# return > True
memo.delete('basic')
memo.exists('basic')
# return > False
memo.get('basic', basic_func)
# stdout > called
# return > 123
You can specify the positional and keyword arguments to call the function with:
def adder_func(a, b):
print called
return a + b
# Passing args...
memo.get('adder-1', adder_func, (1, 2))
# stdout > called
# return > 3
memo.get('adder-1', adder_func, (1, 2))
# return > 3
# And kwargs...
memo.get('adder-2', adder_func, (), {'a':3, 'b':4})
# stdout > called
# return > 7
Values need not live forever; we can easily supply maximum ages or expiry times for the data.
memo.get('expires', basic_func, max_age=1)
# stdout > called
# return > 123
memo.get('expires', basic_func, max_age=1)
# return > 123
# Let us wait for this to expire...
import time
time.sleep(1.1)
# The function will get called again!
memo.get('expires', basic_func, max_age=1)
# stdout > called
# return > 123
Alternatively you can pass a expire
which is the explicit unix time for the data to expire at.
You can see how much time is left until something expires, or set an expiration manually:
memo.get('expires', basic_func, max_age=60)
# stdout > called
# return > 123
memo.ttl('expires')
# return > something slightly less than 60
# Wait a bit and see what happens...
time.sleep(1)
memo.ttl('expires')
# return > something slightly less than 59
# Manually set the remaining ttl of the data.
memo.expire('expires', 10)
memo.ttl('expires')
# return > something slightly less than 10
# Set an explicit expiry time.
memo.expire_at('expires', time.time() + 3600)
memo.ttl('expires')
# return > something slightly less than 3600
If you need to regenerate your content based off an external resource that is not reflected by the arguments of the function when called, then you can use etags.
An etag is any object that represents the current state of the resources that will be drawn upon to generate the final value. If the etag changes then it is assumed that the value is out of date. Note that not providing an etag will not trigger a regeneration.
store = {}
memo = Memoizer(store)
memo.get('etagged', basic_func, etag='a')
# stdout > called
# Doesnt call the function again if the etag is the same.
memo.get('etagged', basic_func, etag='a')
# Change the etag.
memo.get('etagged', basic_func, etag='b')
# stdout > called
You can also supply a function that is called with the same arguments that the function will be called with, and its return value is used as the etag.
state = []
def get_etag():
return len(state)
memo.get('etagged', basic_func, etagger=get_etag)
The Memoizer
object can be applied as a decorator to a function, which will automatically cache its return values keyed on the function name, and arguments provided. This is only reliable as long as the repr
of the arguments is deterministic (ie. no dicts which can change order).
The inclusion of arguments into the key is the primary difference between the direct get method, and using the decorator; functions have effectively been memoized!
You can manually specify a key as a positional argument if there will be a name collision by another function with the same name.
Note: Most options we have used previously are perfectly valid to use in the decorator declaration, with an obvious exception of arguments as those are handled by calling the function. Features that we will explain on the decorators (ie. regions, locks, etc) apply equally to the direct get method.
Basic usage:
@memo
def cached_func():
print 'called'
return 456
cached_func()
# stdout > called
# return > 456
cached_func()
# return > 456
Using options:
# This should be obvious as to what it does...
@memo(max_age=60)
def one_minute():
return 'value'
one_minute()
# return > 'value'
Many of the Memoizer methods have been applied to the wrapped function as well!
one_minute.exists()
# return > True
one_minute.ttl()
# return > slightly less than 60
one_minute.expire(10)
one_minute.ttl()
# return > slightly less than 10
one_minute.delete()
one_minute.exists()
# return > False
Since the cache for the function is keyed by the arguments, you must provide all of the positional and keyword arguments that you are checking against to these methods.
@memo
def adder(a, b):
return a + b
adder(1, 2)
# return > 3
# This will not work as expected!
adder.exists()
# return > False
# But this does...
adder.exists((1, 2))
# return > True
Care has been taken to key this keeping in mind how the arguments will be accepted by the function. Ergo you can specify positional arguments by a keyword and it will still use the same key.
# Note that this results in the same argument values.
adder.exists((2, ), {'a': 1})
# return > True
A namespace a simply a string prefix for the final key. The prefixed key is only ever seen by the store itself.
store = {}
memo = Memoizer(store, namespace='master')
memo.get('key', str)
store.keys()
# return > ['master:key']
store.clear()
memo.get('key2', str, namespace='2')
store.keys()
# return > ['2:key2']
store.clear()
@memo(namespace='3')
def func():
pass
func()
store.keys()
# return > ['3:__main__.func()']
Regions are sets of default values. A region named "default" is initially created and populated with keyword arguments passed to the constructor.
A region is simply a dictionary within the Memoizer.regions dictionary (mapped by name). It is referenced by name as an kwarg in all methods.
store = {}
memo = Memoizer(store)
memo.regions['short'] = {'max_age': 60}
memo.regions['long'] = {'max_age': 3600}
memo.get('key1', str)
memo.get('key2', str, region='short')
memo.get('key3', str, region='long')
memo.ttl('key1')
# return > None
memo.ttl('key2')
# return > ~60
memo.ttl('key3')
# return > ~3600
A simple form a region inheritance may exist by a region naming another region as its "parent" (again, by name). Please see the tests for a demonstration.
You can specify the store to use on a global, region, or individual bases.
primary = {}
secondary = {}
tertiary = {}
memo = Memoizer(default)
memo.regions['secondary'] = {'store': secondary}
memo.get('key1', str)
memo.get('key2', str, region='secondary')
memo.get('key3', str, store=tertiary)
primary.keys()
# return > 'key1'
secondary.keys()
# return > 'key2'
tertiary.keys()
# return > 'key3'
It is possible that two threads will attempt to generate content at the same time, but that can be a waste of resources. Ergo, you (or the store
) can provide a locking implementation to attempt to stem this waste.
If the store
has a lock
method, that will be used. You can also provide a lock
function as an option that will override the store's native lock.
You may provide a float timeout
option as well to override the default.
The lock constructor will be called with the key for which a value is about to be calculated. The function MUST return an object with an acquire(timeout)
and a release()
method.
The acquire
method will be called with a single float representing the maximum amount of time to block waiting for a lock, and the boolean value of the return value MUST indicate if the lock was acquired. Any exceptions thrown will not be caught. The release
method will be called only if the lock was acquired.
We have provided a small wrapper and lock implementation for use with Redis.
from redis import Redis
db = Redis()
import memoize.redis
store = memoize.redis.wrap(db)
memo = memoize.Memoizer(store)
# Use!
If you want to integrate PyMemoize in a django project, you can use Django's cache framework as a store for memoized values. This way, you can leverage any cache that has a Django integration, such as Redis, Memcached, or even database cache.
import memoize.djangocache
store = memoize.djangocache.Cache('default')
memo = memoize.Memoizer(store, namespace='memoize')
If you want to use another cache declared in your django settings, you can replace default
with the name of the cache.
There is a relatively minimal interface that a store must offer to be used with this package.
Keys are ALWAYS strings.
The valued stored are ALWAYS tuples. The first item is in integer representing the current protocol version (currently 1). For protocol 1 the fields are:
0: protocol version (always 1, thus far)
1: creation time
2: expiry time
3: etag
4: value
There are constants in memoize.core
that hold the index values so you will
not have to hardcode these (ie. CREATION_INDEX
, and ETAG_INDEX
).
Return the requested data tuple. MUST raise a KeyError, or return None of the key does not exist.
Return the requested data tuple. MUST return None if the key does not exist.
Store the data tuple. This may optionally set an expiry time with the store's native method. If the native method does not support float times, then round up to the next usable time so that the store does not expire a value before we intend it to.
ie. If using a store that has second resolution, set the expiry time to: math.ceil(expiry)
.
Delete the data tuple. MAY throw an KeyError.
Return a lock object as specified by the locking section below.
Return the time-to-live of the given key as a float, or None if it does not exist or will not expire. If the native expiry mechanism does not support float times then take care that the returned value is less than the "real" expiry time.
ie. If using a store that has second resolution, return: native_ttl - 1
.