This repository was archived by the owner on Mar 7, 2025. It is now read-only.

Commit 276df73

CHG: Extend README.

1 parent a11877c

File tree

1 file changed: +20 −6 lines


README.md

Lines changed: 20 additions & 6 deletions
```diff
@@ -12,11 +12,11 @@ Sometimes it can be beneficial to speedup pure function calls by using [memoizat
 
 The cache storage required for implementing such a speedup often is an associative container (i.e. a key/value store).
 Programming language standard libraries provide such containers, often implemented as a [hash table](https://en.wikipedia.org/wiki/Hash_table) or a [red-black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree).
-These implementations are fine for performance, but do not actually cover all cases because of the lack of retention management
+These implementations are fine for performance, but do not actually cover all use cases because of the lack of retention management.
 
 Suppose your input data covers the whole space that can be represented by a 64-bit integer.
-There probably is some (generally non-uniform) distribution with which the input values arrive, but it's possible that over time _all_ possible values pass by.
-Any cache without retention management will then grow to potentially enormous dimensions in memory which is undesirable.
+There probably is some (generally non-uniform) probability distribution with which the input values arrive, but it's statistically possible that over time _all_ possible values pass by.
+Any cache without retention management will then grow to potentially enormous dimensions in memory, which is undesirable, especially in memory-constrained environments.
 
 The cache implemented in this library uses a FIFO-style sequential data storage with fixed size, pre-allocated memory.
 When the cache is full, the oldest item is evicted.
```
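The fixed-size, FIFO-style storage described in this hunk can be sketched as follows. This is a minimal illustration of the idea, not the crate's actual implementation; `FifoCache` and its methods are hypothetical names.

```rust
// Minimal sketch of a fixed-size FIFO cache with pre-allocated storage.
// Illustrative only; not the actual `MemoCache` API.
struct FifoCache<K, V> {
    slots: Vec<Option<(K, V)>>, // fixed-size, pre-allocated slots
    cursor: usize,              // next slot to write (and overwrite)
}

impl<K: PartialEq, V> FifoCache<K, V> {
    fn new(capacity: usize) -> Self {
        Self {
            slots: (0..capacity).map(|_| None).collect(),
            cursor: 0,
        }
    }

    // Linear scan over the occupied slots; no hashing required.
    fn get(&self, key: &K) -> Option<&V> {
        self.slots
            .iter()
            .flatten()
            .find(|(k, _)| k == key)
            .map(|(_, v)| v)
    }

    // Writes at the cursor, evicting whatever was there (the oldest
    // entry once the cache is full), then advances with wrap-around.
    fn insert(&mut self, key: K, value: V) {
        self.slots[self.cursor] = Some((key, value));
        self.cursor = (self.cursor + 1) % self.slots.len();
    }
}

fn main() {
    let mut cache = FifoCache::new(2);
    cache.insert(1u64, "one");
    cache.insert(2u64, "two");
    cache.insert(3u64, "three"); // cache full: the oldest entry (key 1) is evicted
    assert!(cache.get(&1).is_none());
    assert_eq!(cache.get(&3), Some(&"three"));
}
```

Because lookup is a linear scan, this layout trades asymptotic lookup cost for predictable, allocation-free inserts and bounded memory.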
```diff
@@ -38,7 +38,7 @@ impl Process {
 Each single call to this function results in the resource costs of the calculation.
 We can add memoization to this function in two different ways:
 
-- Using `MemoCache::get_or_insert_with`,
+- Using `MemoCache::get_or_insert_with` (or `get_or_try_insert_with`),
 - Using `MemoCache::get` and `MemoCache::insert`.
 
 For each of the following examples: each call to `calculate` will first check if the input value is already in the cache.
```
````diff
@@ -62,7 +62,7 @@ impl Process {
 }
 ```
 
-For fallible insert functions, there's `get_or_try_insert_with`.
+For fallible insert functions, there's `get_or_try_insert_with`, which returns a `Result`.
 
 ### Example B: `get` and `insert`
 
````
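The `get_or_insert_with` semantics referenced in this hunk (compute and insert only on a cache miss) can be sketched on the same fixed-size FIFO idea. This is a hedged sketch, not the crate's code; `Cache` and its method are illustrative names.

```rust
// Sketch of `get_or_insert_with`-style memoization semantics on a
// fixed-size FIFO store. Illustrative only; not the real `MemoCache`.
struct Cache {
    slots: Vec<Option<(u64, u64)>>,
    cursor: usize,
}

impl Cache {
    fn new(capacity: usize) -> Self {
        Self {
            slots: (0..capacity).map(|_| None).collect(),
            cursor: 0,
        }
    }

    // Returns the cached value for `key`; the closure `f` runs (and its
    // result is inserted) only on a cache miss.
    fn get_or_insert_with<F: FnOnce(&u64) -> u64>(&mut self, key: u64, f: F) -> u64 {
        if let Some((_, v)) = self.slots.iter().flatten().find(|(k, _)| *k == key) {
            return *v; // hit: no recomputation
        }
        let value = f(&key); // miss: pay the calculation cost once
        self.slots[self.cursor] = Some((key, value));
        self.cursor = (self.cursor + 1) % self.slots.len();
        value
    }
}

fn main() {
    let mut cache = Cache::new(4);
    let mut calls = 0;
    assert_eq!(cache.get_or_insert_with(3, |&x| { calls += 1; x * x }), 9);
    assert_eq!(cache.get_or_insert_with(3, |&x| { calls += 1; x * x }), 9);
    assert_eq!(calls, 1); // the expensive closure ran only on the first call
}
```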
```diff
@@ -95,12 +95,26 @@ However, if the input data set size is greater than the cache size, elements wil
 In this scenario, the fixed size of the cache, and/or the retention management aspect of `MemoCache` must weigh against the loss in performance over a `HashTable`.
 Always analyze your input data and perform measurements to select the cache size / type you use.
 
+The current implementation of the cache is focused on simplicity, making it outperform a `HashTable` under the right circumstances.
+
 Run the included benchmarks using [criterion](https://crates.io/crates/criterion) by invoking: `cargo bench`
 
+## Implementation details
+
+This cache stores its key/value pairs in a fixed-size array.
+A slot in this array represents a key/value pair and is either empty or occupied.
+A cursor pointing to an array slot keeps track of the next slot to be overwritten.
+
+Movement of the cursor is linear and incremental, always pointing to the next empty slot or the oldest slot.
+When the cursor is at the end of the array it wraps around to the beginning, so any next insert will overwrite an already existing slot.
+
+The implementation of the cache makes no assumptions whatsoever about the input data probability distribution, keeping the cache clean and simple.
+
 ## TODO
 
-- Investigate potential advanced cache improvements (e.g. start [here](https://en.wikipedia.org/wiki/Cache_replacement_policies)).
+- Currently, the implementation focuses on simplicity and makes no assumptions about the data arrival probability distribution. However, exploiting such a distribution could potentially be very beneficial. Investigate cache performance improvements (e.g. start [here](https://en.wikipedia.org/wiki/Cache_replacement_policies)).
 - Perhaps add cursor motion policies based on estimated input data probability distributions (e.g. in the current implementation an often-seen input value will still be overwritten by cursor movement).
+- More detailed benchmarks w.r.t. insert / lookup performance.
 
 ## License
 
```