Commit 98f9018

GH-2: Write a good README describing the basic usage (GH-8)
2 parents 7aa2d28 + 7b9ac03 commit 98f9018

2 files changed

+66
-1
lines changed
README.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
# FuzzyMap <img src="https://avatars.githubusercontent.com/u/108220139" align="right" height="64" />

[![PyPI](https://img.shields.io/pypi/v/fuzzymap.svg)](https://pypi.org/project/fuzzymap/)
[![License](https://img.shields.io/pypi/l/fuzzymap.svg)](https://github.com/pysnippet/fuzzymap/blob/master/LICENSE)

## What is FuzzyMap?

`FuzzyMap` is a polymorphic Python dictionary. It returns the value of the exact key if that key exists. Otherwise, it
returns the value of the most similar key that satisfies the given ratio. The same mechanism applies when setting a new
key or replacing an existing one. If the key is not found and no existing key matches it by the given ratio, the lookup
returns `None`.
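
The lookup and assignment rules described above can be sketched with the standard library's `difflib`. This is only an
illustration of the idea, not the `fuzzymap` implementation: the class name `FuzzyDictSketch`, the `ratio` keyword, its
0 to 1 scale, and the choice of `SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher


class FuzzyDictSketch(dict):
    """Hypothetical stand-in that mimics the behaviour described above."""

    def __init__(self, *args, ratio=0.5, **kwargs):
        super().__init__(*args, **kwargs)
        self.ratio = ratio  # assumed similarity threshold in the range [0, 1]

    def _closest(self, key):
        # Return the stored key most similar to `key`,
        # or None if no stored key reaches the threshold.
        best_key, best_score = None, self.ratio
        for candidate in self.keys():
            score = SequenceMatcher(None, key, candidate).ratio()
            if score >= best_score:
                best_key, best_score = candidate, score
        return best_key

    def __getitem__(self, key):
        if key in self:  # an exact key always wins
            return super().__getitem__(key)
        match = self._closest(key)
        return super().__getitem__(match) if match is not None else None

    def __setitem__(self, key, value):
        if key not in self:
            match = self._closest(key)
            if match is not None:
                key = match  # overwrite the value stored under the similar key
        super().__setitem__(key, value)
```

With a single stored key such as `"SK Rapid Wien - First Vienna FC"`, indexing this sketch with
`"Rapid Wien - First Vienna"` resolves to the stored entry, while an unrelated string yields `None`.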

## How does it work?

Suppose you have scraped data from multiple sources that do not share a unique identifier, and you want to compare the
values of the items that represent the same entity. There is usually a field that has a nearly equivalent value in
each source, and you can use that field to identify the corresponding items in the other sources' data.

## Let's look at the following example

There is a live data parser that collects the coefficients (odds) of football matches from different bookmakers at
once, then calculates and logs the existing forks (arbitrage opportunities). Many bookmakers alter the names of the
teams, so they cannot be compared directly with the names on other sites.

```python
from fuzzymap import FuzzyMap

src1 = {
    'Rapid Wien - First Vienna': {'w1': 1.93, 'x': 2.32, 'w2': 7.44},
    'Al Bourj - Al Nejmeh': {'w1': 26, 'x': 11.5, 'w2': 1.05},
    # hundreds of other teams' data
}

src2 = FuzzyMap({
    'Bourj FC - Nejmeh SC Beirut': {'w1': 32, 'x': 12, 'w2': 1.05},
    'SK Rapid Wien - First Vienna FC': {'w1': 1.97, 'x': 2.3, 'w2': 8.2},
    # hundreds of other teams' data
})

for team, coefs1 in src1.items():
    # Fuzzy lookup: the src1 team name is matched against the keys of src2
    coefs2 = src2[team]

    # On the first iteration:
    # coefs1 = {"w1": 1.93, "x": 2.32, "w2": 7.44}
    # coefs2 = {"w1": 1.97, "x": 2.3, "w2": 8.2}
    handle_fork(coefs1, coefs2)  # handle_fork is assumed to be defined by the user
```

A human can easily tell that "Rapid Wien - First Vienna" and "SK Rapid Wien - First Vienna FC" are the same match. In
the above example, `src2` is defined as a `FuzzyMap`, which makes its keys fuzzy-matchable, so we can get the item
corresponding to a key of `src1`. The graph below shows the associations of the `FuzzyMap` keys.

```mermaid
graph LR
    src1team1[Rapid Wien - First Vienna]-->src1coefs1["{'w1': 1.93, 'x': 2.32, 'w2': 7.44}"]
    src1team2[Al Bourj - Al Nejmeh]-->src1coefs2["{'w1': 26, 'x': 11.5, 'w2': 1.05}"]
    src2team1[SK Rapid Wien - First Vienna FC]-->src2coefs1["{'w1': 1.97, 'x': 2.3, 'w2': 8.2}"]
    src2team2[Bourj FC - Nejmeh SC Beirut]-->src2coefs2["{'w1': 32, 'x': 12, 'w2': 1.05}"]
    src1team1-->src2coefs1
    src1team2-->src2coefs2
```
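
To see how the team names from the two sources in this example could be paired up, here is a rough sketch that uses
`difflib.get_close_matches` from the standard library. The helper `pair_names` and the `cutoff` of 0.5 are assumptions
chosen for this illustration; `fuzzymap` may score similarity differently and use another threshold.

```python
from difflib import get_close_matches

# Team names taken from the two example sources
src1_names = ["Rapid Wien - First Vienna", "Al Bourj - Al Nejmeh"]
src2_names = ["Bourj FC - Nejmeh SC Beirut", "SK Rapid Wien - First Vienna FC"]


def pair_names(names, candidates, cutoff=0.5):
    """Map each name to its most similar candidate, or None below the cutoff."""
    pairs = {}
    for name in names:
        # n=1 keeps only the best-scoring candidate
        matches = get_close_matches(name, candidates, n=1, cutoff=cutoff)
        pairs[name] = matches[0] if matches else None
    return pairs


pairs = pair_names(src1_names, src2_names)
for name, match in pairs.items():
    print(f"{name!r} -> {match!r}")
```

A lower cutoff accepts looser matches at the cost of more false pairings, so the value is worth tuning against real
data.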

## License

Copyright (C) 2022 Artyom Vancyan. [GPLv2](LICENSE)

setup.py

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@
     version=version,
     author="Artyom Vancyan",
     author_email="[email protected]",
-    # description="",
+    description="Python dictionary with a FUZZY key-matching opportunity",
     # long_description=long_description,
     # long_description_content_type="text/markdown",
     url="https://github.com/pysnippet/fuzzymap",
