Commit 0696d28

[Docs] Update main README.md and FAQ for missing python header (#26)
* Update README.md * Add faq
1 parent 544a3d4 commit 0696d28

2 files changed

+157
-70
lines changed

README.md

Lines changed: 156 additions & 69 deletions
@@ -3,16 +3,74 @@
33
[![Build](https://github.com/cacheMon/libCacheSim-python/actions/workflows/build.yml/badge.svg)](https://github.com/cacheMon/libCacheSim-python/actions/workflows/build.yml)
44
[![Documentation](https://github.com/cacheMon/libCacheSim-python/actions/workflows/docs.yml/badge.svg)](https://docs.libcachesim.com/python)
55

6-
Python bindings for [libCacheSim](https://github.com/1a1a11a/libCacheSim), a high-performance cache simulator and analysis library.
6+
7+
libCacheSim is fast, building on the features of the [underlying libCacheSim library](https://github.com/1a1a11a/libCacheSim):
8+
9+
- **High performance** - over 20M requests/sec for a realistic trace replay
10+
- **High memory efficiency** - predictable and small memory footprint
11+
- **Parallelism out-of-the-box** - uses multiple CPU cores to speed up trace analysis and cache simulations
12+
13+
libCacheSim is flexible and easy to use with:
14+
15+
- **Seamless integration** with the [open-source cache dataset](https://github.com/cacheMon/cache_dataset), which consists of thousands of traces hosted on S3
16+
- **High-throughput simulation** with the [underlying libCacheSim lib](https://github.com/1a1a11a/libCacheSim)
17+
- **Detailed cache requests** and other internal data control
18+
- **Customized plugin cache development** without any compilation
19+
20+
## Prerequisites
21+
22+
- OS: Linux / macOS
23+
- Python: 3.9 -- 3.13
724

825
## Installation
926

27+
### Quick Install
28+
1029
Binary installers for the latest released version are available at the [Python Package Index (PyPI)](https://pypi.org/project/libcachesim).
1130

1231
```bash
1332
pip install libcachesim
1433
```
1534

35+
### Recommended Installation with uv
36+
37+
It's recommended to use [uv](https://docs.astral.sh/uv/), a very fast Python environment manager, to create and manage Python environments:
38+
39+
```bash
40+
uv venv --python 3.12 --seed
41+
source .venv/bin/activate
42+
uv pip install libcachesim
43+
```
44+
45+
### Advanced Features Installation
46+
47+
For users who want to run LRB, ThreeLCache, and GLCache eviction algorithms:
48+
49+
!!! important
50+
    If `uv` cannot find prebuilt wheels for your machine, the build system skips these algorithms by default.
51+
52+
To enable them, you need to install all third-party dependencies first:
53+
54+
```bash
55+
git clone https://github.com/cacheMon/libCacheSim-python.git
56+
cd libCacheSim-python
57+
bash scripts/install_deps.sh
58+
59+
# If you cannot install software directly (e.g., no sudo access)
60+
bash scripts/install_deps_user.sh
61+
```
62+
63+
Then, you can reinstall libcachesim using the following commands (you may need to add `--no-cache-dir` to force a build from scratch):
64+
65+
```bash
66+
# Enable LRB
67+
CMAKE_ARGS="-DENABLE_LRB=ON" uv pip install libcachesim
68+
# Enable ThreeLCache
69+
CMAKE_ARGS="-DENABLE_3L_CACHE=ON" uv pip install libcachesim
70+
# Enable GLCache
71+
CMAKE_ARGS="-DENABLE_GLCACHE=ON" uv pip install libcachesim
72+
```
73+
1674
### Installation from sources
1775

1876
If there are no wheels suitable for your environment, consider building from source.
@@ -29,6 +87,42 @@ python -m pytest tests/
2987

3088
## Quick Start
3189

90+
### Cache Simulation
91+
92+
With libcachesim installed, you can run a cache simulation with an eviction algorithm of your choice on a cache trace:
93+
94+
```python
95+
import libcachesim as lcs
96+
97+
# Step 1: Get one trace from S3 bucket
98+
URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
99+
dl = lcs.DataLoader()
100+
dl.load(URI)
101+
102+
# Step 2: Open trace and process efficiently
103+
reader = lcs.TraceReader(
104+
    trace=dl.get_cache_path(URI),
105+
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
106+
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
107+
)
108+
109+
# Step 3: Initialize cache
110+
cache = lcs.S3FIFO(cache_size=1024*1024)
111+
112+
# Step 4: Process entire trace efficiently (C++ backend)
113+
obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
114+
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")
115+
116+
# Step 4.1: Process with limited number of requests
117+
cache = lcs.S3FIFO(cache_size=1024*1024)
118+
obj_miss_ratio, byte_miss_ratio = cache.process_trace(
119+
    reader,
120+
    start_req=0,
121+
    max_req=1000,
122+
)
123+
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")
124+
```
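The two values returned by `process_trace` can be understood with a small stand-alone sketch. It does not use libCacheSim: it replays a hand-written trace through a plain byte-capacity FIFO cache (a simpler policy than S3FIFO) and assumes the usual definitions, where the object miss ratio is missed requests over all requests and the byte miss ratio is missed bytes over all requested bytes. The function name `fifo_miss_ratios` and the toy trace are hypothetical and exist only for illustration.

```python
from collections import OrderedDict


def fifo_miss_ratios(trace, cache_size):
    """Replay (obj_id, obj_size) pairs through a byte-capacity FIFO cache.

    Returns (object_miss_ratio, byte_miss_ratio). This is an illustrative
    stand-in, not libCacheSim's implementation.
    """
    cache = OrderedDict()  # obj_id -> obj_size, oldest insertion first
    used = 0               # bytes currently stored
    n_req = n_miss = 0
    total_bytes = miss_bytes = 0

    for obj_id, obj_size in trace:
        n_req += 1
        total_bytes += obj_size
        if obj_id in cache:
            continue  # hit: FIFO does not reorder on access
        n_miss += 1
        miss_bytes += obj_size
        if obj_size > cache_size:
            continue  # object can never fit: count the miss, do not admit
        # Evict the oldest objects until the new one fits.
        while used + obj_size > cache_size:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[obj_id] = obj_size
        used += obj_size

    obj_ratio = n_miss / n_req if n_req else 0.0
    byte_ratio = miss_bytes / total_bytes if total_bytes else 0.0
    return obj_ratio, byte_ratio


# Toy trace: (obj_id, obj_size in bytes)
toy_trace = [(1, 100), (2, 300), (1, 100), (3, 200), (2, 300)]
obj_ratio, byte_ratio = fifo_miss_ratios(toy_trace, cache_size=400)
print(f"Object miss ratio: {obj_ratio:.4f}, Byte miss ratio: {byte_ratio:.4f}")
```

The two ratios differ whenever object sizes are uneven, which is why ignoring object sizes changes the byte miss ratio but not the way requests are counted.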
125+
32126
### Basic Usage
33127

34128
```python
@@ -46,7 +140,9 @@ print(cache.get(req)) # False (first access)
46140
print(cache.get(req)) # True (second access)
47141
```
48142

49-
### Trace Processing
143+
### Trace Analysis
144+
145+
Here is an example demonstrating how to use `TraceAnalyzer`:
50146

51147
```python
52148
import libcachesim as lcs
@@ -56,25 +152,40 @@ URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
56152
dl = lcs.DataLoader()
57153
dl.load(URI)
58154

59-
# Step 2: Open trace and process efficiently
60-
reader = lcs.TraceReader(dl.get_cache_path(URI))
155+
reader = lcs.TraceReader(
156+
    trace=dl.get_cache_path(URI),
157+
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
158+
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
159+
)
61160

62-
# Step 3: Initialize cache
63-
cache = lcs.S3FIFO(cache_size=1024*1024)
161+
analysis_option = lcs.AnalysisOption(
162+
    req_rate=True,  # Keep basic request rate analysis
163+
    access_pattern=False,  # Disable access pattern analysis
164+
    size=True,  # Keep size analysis
165+
    reuse=False,  # Disable reuse analysis for small datasets
166+
    popularity=False,  # Disable popularity analysis for small datasets (< 200 objects)
167+
    ttl=False,  # Disable TTL analysis
168+
    popularity_decay=False,  # Disable popularity decay analysis
169+
    lifetime=False,  # Disable lifetime analysis
170+
    create_future_reuse_ccdf=False,  # Disable experimental features
171+
    prob_at_age=False,  # Disable experimental features
172+
    size_change=False,  # Disable size change analysis
173+
)
174+
175+
analysis_param = lcs.AnalysisParam()
176+
177+
analyzer = lcs.TraceAnalyzer(
178+
    reader, "example_analysis", analysis_option=analysis_option, analysis_param=analysis_param
179+
)
64180

65-
# Step 4: Process entire trace efficiently (C++ backend)
66-
obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
67-
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")
181+
analyzer.run()
68182
```
69183

70-
> [!NOTE]
71-
> We DO NOT ignore the object size by defaults, you can add `reader_init_params = lcs.ReaderInitParam(ignore_obj_size=False)` to the initialization of `TraceReader` if needed.
72-
73-
## Custom Cache Policies
184+
## Plugin System
74185

75-
Implement custom cache replacement algorithms using pure Python functions - **no C/C++ compilation required**.
186+
libCacheSim lets you develop your own cache eviction algorithms and test them through the plugin system, with no C/C++ compilation required.
76187

77-
### Python Hook Cache Overview
188+
### Plugin Cache Overview
78189

79190
The `PluginCache` allows you to define custom caching behavior through Python callback functions. You need to implement these callback functions:
80191

@@ -87,74 +198,51 @@ The `PluginCache` allows you to define custom caching behavior through Python ca
87198
| `remove_hook` | `(data: Any, obj_id: int) -> None` | Clean up when object removed |
88199
| `free_hook` | `(data: Any) -> None` | [Optional] Final cleanup |
89200

90-
<details>
91-
<summary>An example for LRU</summary>
201+
### Example: Implementing LRU via Plugin System
92202

93203
```python
94204
from collections import OrderedDict
95-
from libcachesim import PluginCache, CommonCacheParams, Request, SyntheticReader, LRU
96-
97-
98-
class StandaloneLRU:
99-
def __init__(self):
100-
self.cache_data = OrderedDict()
101-
102-
def cache_hit(self, obj_id):
103-
if obj_id in self.cache_data:
104-
obj_size = self.cache_data.pop(obj_id)
105-
self.cache_data[obj_id] = obj_size
106-
107-
def cache_miss(self, obj_id, obj_size):
108-
self.cache_data[obj_id] = obj_size
109-
110-
def cache_eviction(self):
111-
evicted_id, _ = self.cache_data.popitem(last=False)
112-
return evicted_id
113-
114-
def cache_remove(self, obj_id):
115-
if obj_id in self.cache_data:
116-
del self.cache_data[obj_id]
117-
205+
from typing import Any
118206

119-
def cache_init_hook(common_cache_params: CommonCacheParams):
120-
return StandaloneLRU()
207+
import libcachesim as lcs  # needed for lcs.SyntheticReader used below
from libcachesim import PluginCache, LRU, CommonCacheParams, Request
121208

209+
def init_hook(_: CommonCacheParams) -> Any:
210+
    return OrderedDict()
122211

123-
def cache_hit_hook(cache, request: Request):
124-
cache.cache_hit(request.obj_id)
212+
def hit_hook(data: Any, req: Request) -> None:
213+
    data.move_to_end(req.obj_id, last=True)
125214

215+
def miss_hook(data: Any, req: Request) -> None:
216+
    data[req.obj_id] = req.obj_size
126217

127-
def cache_miss_hook(cache, request: Request):
128-
cache.cache_miss(request.obj_id, request.obj_size)
218+
def eviction_hook(data: Any, _: Request) -> int:
219+
    return data.popitem(last=False)[0]
129220

221+
def remove_hook(data: Any, obj_id: int) -> None:
222+
    data.pop(obj_id, None)
130223

131-
def cache_eviction_hook(cache, request: Request):
132-
return cache.cache_eviction()
133-
134-
135-
def cache_remove_hook(cache, obj_id):
136-
cache.cache_remove(obj_id)
137-
138-
139-
def cache_free_hook(cache):
140-
cache.cache_data.clear()
141-
224+
def free_hook(data: Any) -> None:
225+
    data.clear()
142226

143227
plugin_lru_cache = PluginCache(
144-
cache_size=1024,
145-
cache_init_hook=cache_init_hook,
146-
cache_hit_hook=cache_hit_hook,
147-
cache_miss_hook=cache_miss_hook,
148-
cache_eviction_hook=cache_eviction_hook,
149-
cache_remove_hook=cache_remove_hook,
150-
cache_free_hook=cache_free_hook,
151-
cache_name="CustomizedLRU",
228+
    cache_size=128,
229+
    cache_init_hook=init_hook,
230+
    cache_hit_hook=hit_hook,
231+
    cache_miss_hook=miss_hook,
232+
    cache_eviction_hook=eviction_hook,
233+
    cache_remove_hook=remove_hook,
234+
    cache_free_hook=free_hook,
235+
    cache_name="Plugin_LRU",
152236
)
153-
```
154-
</details>
155237

238+
reader = lcs.SyntheticReader(num_objects=1000, num_of_req=10000, obj_size=1)
239+
req_miss_ratio, byte_miss_ratio = plugin_lru_cache.process_trace(reader)
240+
ref_req_miss_ratio, ref_byte_miss_ratio = LRU(128).process_trace(reader)
241+
print(f"plugin req miss ratio {req_miss_ratio}, ref req miss ratio {ref_req_miss_ratio}")
242+
print(f"plugin byte miss ratio {byte_miss_ratio}, ref byte miss ratio {ref_byte_miss_ratio}")
243+
```
156244

157-
Another simple implementation via hook functions for S3FIFO respectively is given in [examples](examples/plugin_cache/s3fifo.py).
245+
By defining custom hook functions for cache initialization, hit, miss, eviction, removal, and cleanup, users can easily prototype and test their own cache eviction algorithms.
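
To see how the hooks cooperate, the sketch below expresses FIFO with the same hook signatures as the LRU example and drives them with a tiny pure-Python loop, so it runs without libcachesim installed. The `FakeRequest` class and the `replay` driver are hypothetical stand-ins: the call order they use (hit hook on a hit; eviction hook until the object fits, then miss hook on a miss) is an assumption made for illustration, not a description of libCacheSim's actual driver.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeRequest:
    """Hypothetical stand-in for libcachesim.Request (only the fields used here)."""
    obj_id: int
    obj_size: int = 1


# FIFO written with the same hook signatures as the LRU example.
def init_hook(_params: Any) -> Any:
    return OrderedDict()  # obj_id -> obj_size, in insertion order


def hit_hook(data: Any, req: FakeRequest) -> None:
    pass  # FIFO ignores hits; LRU would move the entry to the end here


def miss_hook(data: Any, req: FakeRequest) -> None:
    data[req.obj_id] = req.obj_size


def eviction_hook(data: Any, _req: FakeRequest) -> int:
    return data.popitem(last=False)[0]  # evict the oldest insertion


def remove_hook(data: Any, obj_id: int) -> None:
    data.pop(obj_id, None)  # not called by the toy driver below


def replay(requests, cache_size: int) -> float:
    """Toy driver with an ASSUMED hook call order; returns the request miss ratio."""
    data = init_hook(None)
    resident = {}  # driver-side bookkeeping: obj_id -> obj_size
    used = misses = total = 0
    for req in requests:
        total += 1
        if req.obj_id in resident:
            hit_hook(data, req)
            continue
        misses += 1
        if req.obj_size > cache_size:
            continue  # too large to admit
        while used + req.obj_size > cache_size:
            victim = eviction_hook(data, req)
            used -= resident.pop(victim)
        miss_hook(data, req)
        resident[req.obj_id] = req.obj_size
        used += req.obj_size
    return misses / total if total else 0.0


requests = [FakeRequest(i) for i in (1, 2, 3, 1, 4, 1)]
print(f"FIFO miss ratio: {replay(requests, cache_size=3):.4f}")
```

Swapping the body of `hit_hook` for `data.move_to_end(req.obj_id)` turns the same set of hooks into LRU, which is the only place where the two policies differ.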
158246

159247
### Getting Help
160248

@@ -208,7 +296,6 @@ If you used libCacheSim in your research, please cite the above papers.
208296

209297
---
210298

211-
212299
## License
213300
See [LICENSE](LICENSE) for details.
214301

docs/src/en/getting_started/quickstart.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ To enable them, you need to install all third-party dependencies first.
3737
bash scripts/install_deps_user.sh
3838
```
3939

40-
Then, you can reinstall libcachesim using the following commands:
40+
Then, you can reinstall libcachesim using the following commands (you may need to add `--no-cache-dir` to force a build from scratch):
4141

4242
```bash
4343
# Enable LRB
