[![Build](https://github.com/cacheMon/libCacheSim-python/actions/workflows/build.yml/badge.svg)](https://github.com/cacheMon/libCacheSim-python/actions/workflows/build.yml)
[![Documentation](https://github.com/cacheMon/libCacheSim-python/actions/workflows/docs.yml/badge.svg)](docs.libcachesim.com/python)

Python bindings for [libCacheSim](https://github.com/1a1a11a/libCacheSim), a high-performance cache simulator and analysis library.

libCacheSim is fast with the features of the [underlying libCacheSim library](https://github.com/1a1a11a/libCacheSim):

- **High performance** - over 20M requests/sec for realistic trace replay
- **High memory efficiency** - predictable and small memory footprint
- **Parallelism out-of-the-box** - uses many CPU cores to speed up trace analysis and cache simulations

libCacheSim is flexible and easy to use with:

- **Seamless integration** with the [open-source cache dataset](https://github.com/cacheMon/cache_dataset), consisting of thousands of traces hosted on S3
- **High-throughput simulation** with the [underlying libCacheSim library](https://github.com/1a1a11a/libCacheSim)
- **Detailed control** over cache requests and other internal data
- **Customized plugin cache development** without any compilation

## Prerequisites

- OS: Linux / macOS
- Python: 3.9–3.13

## Installation

### Quick Install

Binary installers for the latest released version are available at the [Python Package Index (PyPI)](https://pypi.org/project/libcachesim).

```bash
pip install libcachesim
```

### Recommended Installation with uv

We recommend using [uv](https://docs.astral.sh/uv/), a very fast Python environment manager, to create and manage Python environments:

```bash
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install libcachesim
```

### Advanced Features Installation

For users who want to run the LRB, ThreeLCache, and GLCache eviction algorithms:

!!! important
    If `uv` cannot find prebuilt wheels for your machine, the build system skips these algorithms by default.

To enable them, install all third-party dependencies first:

```bash
git clone https://github.com/cacheMon/libCacheSim-python.git
cd libCacheSim-python
bash scripts/install_deps.sh

# If you cannot install software directly (e.g., no sudo access)
bash scripts/install_deps_user.sh
```

Then reinstall libcachesim with the following commands (you may need to add `--no-cache-dir` to force a build from scratch):

```bash
# Enable LRB
CMAKE_ARGS="-DENABLE_LRB=ON" uv pip install libcachesim
# Enable ThreeLCache
CMAKE_ARGS="-DENABLE_3L_CACHE=ON" uv pip install libcachesim
# Enable GLCache
CMAKE_ARGS="-DENABLE_GLCACHE=ON" uv pip install libcachesim
```

### Installation from source

If there are no wheels suitable for your environment, consider building from source.
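
A from-source build boils down to cloning the repository, installing from the source tree, and running the test suite. The sketch below is illustrative; the `pip install .` step is an assumption, not taken verbatim from the project's instructions, so adjust it to your environment:

```bash
# Clone the repository and install from the source tree
git clone https://github.com/cacheMon/libCacheSim-python.git
cd libCacheSim-python
pip install .  # assumed install command; adjust to your environment

# Verify the installation by running the test suite
python -m pytest tests/
```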

## Quick Start

### Cache Simulation

With libcachesim installed, you can run a cache simulation with an eviction algorithm of your choice on a cache trace:

```python
import libcachesim as lcs

# Step 1: Get one trace from the S3 bucket
URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
dl = lcs.DataLoader()
dl.load(URI)

# Step 2: Open the trace
reader = lcs.TraceReader(
    trace=dl.get_cache_path(URI),
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
)

# Step 3: Initialize a cache
cache = lcs.S3FIFO(cache_size=1024 * 1024)

# Step 4: Process the entire trace efficiently (C++ backend)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")

# Step 4.1: Process a limited number of requests
cache = lcs.S3FIFO(cache_size=1024 * 1024)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(
    reader,
    start_req=0,
    max_req=1000,
)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")
```
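
As a small extension of the example above (not part of the original snippet), the same reader can be swept across several eviction algorithms. The sketch uses only classes shown in this README (`lcs.LRU`, `lcs.S3FIFO`) and reuses the reader across `process_trace` calls, as the example above already does:

```python
# Sketch: compare a few eviction algorithms on the same trace
for algo in (lcs.LRU, lcs.S3FIFO):
    cache = algo(cache_size=1024 * 1024)
    obj_mr, byte_mr = cache.process_trace(reader)
    print(f"{algo.__name__}: object miss ratio {obj_mr:.4f}, byte miss ratio {byte_mr:.4f}")
```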

### Basic Usage

```python
import libcachesim as lcs

# Create a cache (LRU shown here; any algorithm in the library works)
cache = lcs.LRU(cache_size=1024 * 1024)

# Build a request and issue it twice
req = lcs.Request()
req.obj_id = 1
req.obj_size = 100

print(cache.get(req))  # False (first access)
print(cache.get(req))  # True (second access)
```

### Trace Analysis

Here is an example demonstrating how to use `TraceAnalyzer`:

```python
import libcachesim as lcs

# Step 1: Get one trace from the S3 bucket
URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
dl = lcs.DataLoader()
dl.load(URI)

# Step 2: Open the trace
reader = lcs.TraceReader(
    trace=dl.get_cache_path(URI),
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
)

# Step 3: Select the analyses to run
analysis_option = lcs.AnalysisOption(
    req_rate=True,             # Keep basic request rate analysis
    access_pattern=False,      # Disable access pattern analysis
    size=True,                 # Keep size analysis
    reuse=False,               # Disable reuse analysis for small datasets
    popularity=False,          # Disable popularity analysis for small datasets (< 200 objects)
    ttl=False,                 # Disable TTL analysis
    popularity_decay=False,    # Disable popularity decay analysis
    lifetime=False,            # Disable lifetime analysis
    create_future_reuse_ccdf=False,  # Disable experimental feature
    prob_at_age=False,               # Disable experimental feature
    size_change=False,         # Disable size change analysis
)

analysis_param = lcs.AnalysisParam()

# Step 4: Create and run the analyzer
analyzer = lcs.TraceAnalyzer(
    reader, "example_analysis", analysis_option=analysis_option, analysis_param=analysis_param
)

analyzer.run()
```

## Plugin System

libCacheSim allows you to develop your own cache eviction algorithms and test them via the plugin system, without any C/C++ compilation required.

### Plugin Cache Overview

The `PluginCache` allows you to define custom caching behavior through Python callback functions. You need to implement the following callbacks:

| Hook | Signature | Description |
|------|-----------|-------------|
| `init_hook` | `(CommonCacheParams) -> Any` | Create the cache's internal data structure |
| `hit_hook` | `(data: Any, req: Request) -> None` | Update state on a cache hit |
| `miss_hook` | `(data: Any, req: Request) -> None` | Insert the object on a cache miss |
| `eviction_hook` | `(data: Any, req: Request) -> int` | Return the ID of the object to evict |
| `remove_hook` | `(data: Any, obj_id: int) -> None` | Clean up when object removed |
| `free_hook` | `(data: Any) -> None` | [Optional] Final cleanup |

### Example: Implementing LRU via the Plugin System

```python
from collections import OrderedDict
from typing import Any

from libcachesim import PluginCache, LRU, CommonCacheParams, Request, SyntheticReader


def init_hook(_: CommonCacheParams) -> Any:
    return OrderedDict()


def hit_hook(data: Any, req: Request) -> None:
    data.move_to_end(req.obj_id, last=True)


def miss_hook(data: Any, req: Request) -> None:
    data[req.obj_id] = req.obj_size


def eviction_hook(data: Any, _: Request) -> int:
    return data.popitem(last=False)[0]


def remove_hook(data: Any, obj_id: int) -> None:
    data.pop(obj_id, None)


def free_hook(data: Any) -> None:
    data.clear()


plugin_lru_cache = PluginCache(
    cache_size=128,
    cache_init_hook=init_hook,
    cache_hit_hook=hit_hook,
    cache_miss_hook=miss_hook,
    cache_eviction_hook=eviction_hook,
    cache_remove_hook=remove_hook,
    cache_free_hook=free_hook,
    cache_name="Plugin_LRU",
)

reader = SyntheticReader(num_objects=1000, num_of_req=10000, obj_size=1)
req_miss_ratio, byte_miss_ratio = plugin_lru_cache.process_trace(reader)
ref_req_miss_ratio, ref_byte_miss_ratio = LRU(128).process_trace(reader)
print(f"plugin req miss ratio {req_miss_ratio}, ref req miss ratio {ref_req_miss_ratio}")
print(f"plugin byte miss ratio {byte_miss_ratio}, ref byte miss ratio {ref_byte_miss_ratio}")
```

By defining custom hook functions for cache initialization, hit, miss, eviction, removal, and cleanup, you can easily prototype and test your own cache eviction algorithms.
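
For instance, switching the policy above from LRU to FIFO only means leaving the recency order untouched on a hit. The sketch below is illustrative (it is not one of the repository's examples) and reuses the hooks and reader defined above:

```python
def fifo_hit_hook(data: Any, req: Request) -> None:
    # FIFO does not reorder objects on a hit
    pass


plugin_fifo_cache = PluginCache(
    cache_size=128,
    cache_init_hook=init_hook,          # same OrderedDict state as the LRU example
    cache_hit_hook=fifo_hit_hook,       # ignore hits
    cache_miss_hook=miss_hook,          # insert on miss
    cache_eviction_hook=eviction_hook,  # evict the oldest inserted object
    cache_remove_hook=remove_hook,
    cache_free_hook=free_hook,
    cache_name="Plugin_FIFO",
)

fifo_req_miss_ratio, _ = plugin_fifo_cache.process_trace(reader)
print(f"plugin FIFO req miss ratio {fifo_req_miss_ratio}")
```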

### Getting Help

---

## License

See [LICENSE](LICENSE) for details.