Commit b5dc347

Overhaul compressed lists classes using typed lists from biocutils (#5)
Major rewrite of the package, update tests, documentation and tutorial.
1 parent bb2f8aa commit b5dc347

24 files changed, 1045 additions and 299 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # Changelog

+## Version 0.2.0
+
+- Major changes to the package; Switch to typed lists from the biocutils package.
+
 ## Version 0.1.0 - 0.1.1

 - Initial implementation of various classes - Partitioning and CompressedLists.

README.md

Lines changed: 8 additions & 6 deletions
@@ -17,9 +17,8 @@ pip install compressed-lists

 ## Usage

-
 ```py
-from compressed_lists import CompressedIntegerList, CompressedStringList
+from compressed_lists import CompressedIntegerList, CompressedStringList, Partitioning

 # Create a CompressedIntegerList
 int_data = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
@@ -38,9 +37,12 @@ print(squared[0]) # [1, 4, 9]
 # Convert to a regular Python list
 regular_list = int_list.to_list()

-# Create a CompressedStringList
-char_data = [["apple", "banana"], ["cherry", "date", "elderberry"], ["fig"]]
-char_list = CompressedStringList.from_list(char_data)
+# Create a CompressedStringList from lengths
+import biocutils as ut
+char_data = ut.StringList(["apple", "banana", "cherry", "date", "elderberry", "fig"])
+
+char_list = CompressedStringList(char_data, partitioning=Partitioning.from_lengths([2,3,1]))
+print(char_list)
 ```

 ### Partitioning
@@ -61,7 +63,7 @@ start, end = part[1] # Returns (3, 5)

 > [!NOTE]
 >
-> Check out the [documentation](https://biocpy.github.io/compressed-lists) for extending CompressedLists to custom data types.
+> Check out the [documentation](https://biocpy.github.io/compressed-lists) for available compressed list implementations and extending `CompressedLists` to custom data types.

 <!-- biocsetup-notes -->

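The new README example builds a `CompressedStringList` from a flat `StringList` plus `Partitioning.from_lengths([2,3,1])`. As a minimal sketch of the idea, and not the package's implementation, the plain-Python snippet below shows how a list of lengths can be turned into half-open `(start, end)` ranges over flat data. The helper names `ranges_from_lengths` and `split_by_ranges` are hypothetical and exist only for illustration.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def ranges_from_lengths(lengths: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn per-element lengths into half-open (start, end) ranges.

    Hypothetical helper for illustration; not part of compressed_lists.
    """
    ends = list(accumulate(lengths))  # cumulative end offsets
    starts = [0] + ends[:-1]          # each range starts where the previous one ended
    return list(zip(starts, ends))


def split_by_ranges(
    flat: Sequence[str], ranges: Sequence[Tuple[int, int]]
) -> List[List[str]]:
    """Slice the flat data once per range to rebuild the nested list."""
    return [list(flat[start:end]) for start, end in ranges]


flat = ["apple", "banana", "cherry", "date", "elderberry", "fig"]
ranges = ranges_from_lengths([2, 3, 1])
nested = split_by_ranges(flat, ranges)

print(ranges)
print(nested)
```

Under this sketch, lengths of `[3, 2, 4]` give `(3, 5)` as the second range, which is consistent with the `part[1]` comment shown in the README hunk header above.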
docs/tutorial.md

Lines changed: 30 additions & 24 deletions
@@ -7,28 +7,31 @@ kernelspec:
 # Basic Usage

 ```{code-cell}
-from compressed_lists import CompressedIntegerList, CompressedStringList
+from compressed_lists import CompressedIntegerList, CompressedStringList, Partitioning

 # Create a CompressedIntegerList
 int_data = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
 names = ["A", "B", "C"]
 int_list = CompressedIntegerList.from_list(int_data, names)

 # Access elements
-print(int_list[0]) # [1, 2, 3]
-print(int_list["B"]) # [4, 5]
-print(int_list[1:3]) # Slice of elements
+print(int_list[0])
+print(int_list["B"])
+print(int_list[1:3])

 # Apply a function to each element
 squared = int_list.lapply(lambda x: [i**2 for i in x])
-print(squared[0]) # [1, 4, 9]
+print(squared[0])

 # Convert to a regular Python list
 regular_list = int_list.to_list()

-# Create a CompressedStringList
-char_data = [["apple", "banana"], ["cherry", "date", "elderberry"], ["fig"]]
-char_list = CompressedStringList.from_list(char_data)
+# Create a CompressedStringList from lengths
+import biocutils as ut
+char_data = ut.StringList(["apple", "banana", "cherry", "date", "elderberry", "fig"])
+
+char_list = CompressedStringList(char_data, partitioning=Partitioning.from_lengths([2,3,1]))
+print(char_list)
 ```

 ## Partitioning
@@ -57,7 +60,7 @@ print(start, end)
 Create a new class that inherits from `CompressedList` with appropriate type annotations:

 ```python
-from typing import List, TypeVar, Generic
+from typing import List
 from compressed_lists import CompressedList, Partitioning
 import numpy as np

@@ -72,31 +75,32 @@ The constructor should initialize the superclass with the appropriate data:

 ```python
 def __init__(self,
-unlist_data: Any, # Replace with your data type
-partitioning: Partitioning,
-element_metadata: dict = None,
-metadata: dict = None):
+unlist_data: Any, # Replace with your data type
+partitioning: Partitioning,
+element_type: Any = None,
+element_metadata: Optional[dict] = None,
+metadata: Optional[dict] = None):
 super().__init__(unlist_data, partitioning,
-element_type="custom_type", # Set your element type
-element_metadata=element_metadata,
-metadata=metadata)
+element_type="custom_type", # Set your element type
+element_metadata=element_metadata,
+metadata=metadata)
 ```

-## 3. Implement _extract_range Method
+## 3. Implement `extract_range` Method

 This method defines how to extract a range of elements from your unlisted data:

 ```python
-def _extract_range(self, start: int, end: int) -> List[T]:
+def extract_range(self, start: int, end: int) -> List[T]:
 """Extract a range from unlisted data."""
 # For example, with numpy arrays:
-return self.unlist_data[start:end].tolist()
+return self.unlist_data[start:end]

 # Or for other data types:
-# return self.unlist_data[start:end]
+# return self.unlist_data[start:end, :]
 ```

-## 4. Implement from_list Class Method
+## 4. Implement `from_list` Class Method

 This factory method creates a new instance from a list:

@@ -140,7 +144,7 @@ class CompressedFloatList(CompressedList):
 element_metadata=element_metadata,
 metadata=metadata)

-def _extract_range(self, start: int, end: int) -> List[float]:
+def extract_range(self, start: int, end: int) -> List[float]:
 return self.unlist_data[start:end].tolist()

 @classmethod
@@ -176,14 +180,16 @@ class MyObject:
 def __init__(self, value):
 self.value = value

-class CompressedMyObjectList(CompressedList[List[MyObject]]):
+class CompressedMyObjectList(CompressedList):
 # Implementation details...

-def _extract_range(self, start: int, end: int) -> List[MyObject]:
+def extract_range(self, start: int, end: int) -> List[MyObject]:
 return self.unlist_data[start:end]

 @classmethod
 def from_list(cls, lst: List[List[MyObject]], ...):
 # Custom flattening and storage logic
 # ...
 ```
+
+Check out the `CompressedBiocFrameList` for a complete example of this usecase.

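The tutorial changes above rename `_extract_range` to `extract_range` and keep `from_list` as the factory method. As a rough, self-contained sketch of how those two hooks fit together, the snippet below uses a hypothetical stand-in base class, `MiniCompressedList`, rather than the package's `CompressedList`. Its constructor, attribute names, and indexing behavior are assumptions made for illustration and may differ from the real base class.

```python
from typing import Any, List, Sequence, Tuple


class MiniCompressedList:
    """Hypothetical stand-in for a compressed list; not the package's base class."""

    def __init__(self, unlist_data: Any, ranges: Sequence[Tuple[int, int]]):
        self.unlist_data = unlist_data  # flat storage for all elements
        self.ranges = list(ranges)      # half-open (start, end) per element

    def extract_range(self, start: int, end: int) -> Any:
        """Subclasses decide how to slice their own storage type."""
        raise NotImplementedError

    def __len__(self) -> int:
        return len(self.ranges)

    def __getitem__(self, index: int) -> Any:
        start, end = self.ranges[index]
        return self.extract_range(start, end)

    def to_list(self) -> List[Any]:
        return [self[i] for i in range(len(self))]


class MiniFloatList(MiniCompressedList):
    def extract_range(self, start: int, end: int) -> List[float]:
        # Plain list slicing; an array-backed store would slice the array instead.
        return list(self.unlist_data[start:end])

    @classmethod
    def from_list(cls, lst: List[List[float]]) -> "MiniFloatList":
        flat: List[float] = []
        ranges: List[Tuple[int, int]] = []
        offset = 0
        for element in lst:
            flat.extend(element)  # append this element's values to flat storage
            ranges.append((offset, offset + len(element)))
            offset += len(element)
        return cls(flat, ranges)


floats = MiniFloatList.from_list([[1.5, 2.5], [], [3.0]])
print(floats[0])
print(floats.to_list())
```

The point of the split is that the base class only needs the ranges to index elements, while each subclass supplies the slicing rule for its own storage.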
setup.cfg

Lines changed: 2 additions & 0 deletions
@@ -50,6 +50,8 @@ python_requires = >=3.9
 install_requires =
 importlib-metadata; python_version<"3.8"
 biocutils
+numpy
+biocframe


 [options.packages.find]

src/compressed_lists/CompressedIntegerList.py

Lines changed: 0 additions & 92 deletions
This file was deleted.

src/compressed_lists/CompressedStringList.py

Lines changed: 0 additions & 86 deletions
This file was deleted.

src/compressed_lists/__init__.py

Lines changed: 8 additions & 3 deletions
@@ -16,6 +16,11 @@
 del version, PackageNotFoundError

 from .partition import Partitioning
-from .CompressedList import CompressedList
-from .CompressedIntegerList import CompressedIntegerList
-from .CompressedStringList import CompressedStringList
+from .base import CompressedList
+from .integer_list import CompressedIntegerList
+from .string_list import CompressedStringList, CompressedCharacterList
+from .bool_list import CompressedBooleanList
+from .float_list import CompressedFloatList
+from .numpy_list import CompressedNumpyList
+from .biocframe_list import CompressedBiocFrameList
+from .split_generic import splitAsCompressedList
