Commit c5e8399

Fix typos discovered by codespell
1 parent 41e0777 commit c5e8399

39 files changed, 223 insertions(+), 167 deletions(-)

extra/DEVELOPER_DOCS/Listeners.md

Lines changed: 1 addition & 1 deletion
@@ -194,7 +194,7 @@ model = chain(
 )
 ```

-but the standalone `Tok2VecTransformer` has an additional `split_trf_batch` chained inbetween the model
+but the standalone `Tok2VecTransformer` has an additional `split_trf_batch` chained in between the model
 and `trfs2arrays`:

 ```

extra/DEVELOPER_DOCS/Satellite Packages.md

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ This is a list of all the active repos relevant to spaCy besides the main one, w

 These packages are always pulled in when you install spaCy. Most of them are direct dependencies, but some are transitive dependencies through other packages.

-- [spacy-legacy](https://github.com/explosion/spacy-legacy): When an architecture in spaCy changes enough to get a new version, the old version is frozen and moved to spacy-legacy. This allows us to keep the core library slim while also preserving backwards compatability.
+- [spacy-legacy](https://github.com/explosion/spacy-legacy): When an architecture in spaCy changes enough to get a new version, the old version is frozen and moved to spacy-legacy. This allows us to keep the core library slim while also preserving backwards compatibility.
 - [thinc](https://github.com/explosion/thinc): Thinc is the machine learning library that powers trainable components in spaCy. It wraps backends like Numpy, PyTorch, and Tensorflow to provide a functional interface for specifying architectures.
 - [catalogue](https://github.com/explosion/catalogue): Small library for adding function registries, like those used for model architectures in spaCy.
 - [confection](https://github.com/explosion/confection): This library contains the functionality for config parsing that was formerly contained directly in Thinc.
@@ -67,7 +67,7 @@ These repos are used to support the spaCy docs or otherwise present information

 These repos are used for organizing data around spaCy, but are not something an end user would need to install as part of using the library.

-- [spacy-models](https://github.com/explosion/spacy-models): This repo contains metadata (but not training data) for all the spaCy models. This includes information about where their training data came from, version compatability, and performance information. It also includes tests for the model packages, and the built models are hosted as releases of this repo.
+- [spacy-models](https://github.com/explosion/spacy-models): This repo contains metadata (but not training data) for all the spaCy models. This includes information about where their training data came from, version compatibility, and performance information. It also includes tests for the model packages, and the built models are hosted as releases of this repo.
 - [wheelwright](https://github.com/explosion/wheelwright): A tool for automating our PyPI builds and releases.
 - [ec2buildwheel](https://github.com/explosion/ec2buildwheel): A small project that allows you to build Python packages in the manner of cibuildwheel, but on any EC2 image. Used by wheelwright.


extra/DEVELOPER_DOCS/StringStore-Vocab.md

Lines changed: 1 addition & 1 deletion
@@ -145,7 +145,7 @@ These are things stored in the vocab:
 - `get_noun_chunks`: a syntax iterator
 - lex attribute getters: functions like `is_punct`, set in language defaults
 - `cfg`: **not** the pipeline config, this is mostly unused
-- `_unused_object`: Formerly an unused object, kept around until v4 for compatability
+- `_unused_object`: Formerly an unused object, kept around until v4 for compatibility

 Some of these, like the Morphology and Vectors, are complex enough that they
 need their own explanations. Here we'll just look at Vocab-specific items.

extra/example_data/textcat_example_data/CC_BY-SA-3.0.txt

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ CONDITIONS.
 Collection will not be considered an Adaptation for the purpose of
 this License. For the avoidance of doubt, where the Work is a musical
 work, performance or phonogram, the synchronization of the Work in
-timed-relation with a moving image ("synching") will be considered an
+timed-relation with a moving image ("syncing") will be considered an
 Adaptation for the purpose of this License.
 b. "Collection" means a collection of literary or artistic works, such as
 encyclopedias and anthologies, or performances, phonograms or
@@ -264,7 +264,7 @@ subject to and limited by the following restrictions:
 UNLESS OTHERWISE MUTUALLY AGREED TO BY THE PARTIES IN WRITING, LICENSOR
 OFFERS THE WORK AS-IS AND MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY
 KIND CONCERNING THE WORK, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE,
-INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTIBILITY,
+INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF
 LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OF ABSENCE OF ERRORS,
 WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION

spacy/cli/info.py

Lines changed: 1 addition & 1 deletion
@@ -84,7 +84,7 @@ def info(


 def info_spacy() -> Dict[str, Any]:
-    """Generate info about the current spaCy intallation.
+    """Generate info about the current spaCy installation.

     RETURNS (dict): The spaCy info.
     """

spacy/glossary.py

Lines changed: 1 addition & 1 deletion
@@ -354,7 +354,7 @@ def explain(term):
     # https://github.com/ltgoslo/norne
     "EVT": "Festivals, cultural events, sports events, weather phenomena, wars, etc.",
     "PROD": "Product, i.e. artificially produced entities including speeches, radio shows, programming languages, contracts, laws and ideas",
-    "DRV": "Words (and phrases?) that are dervied from a name, but not a name in themselves, e.g. 'Oslo-mannen' ('the man from Oslo')",
+    "DRV": "Words (and phrases?) that are derived from a name, but not a name in themselves, e.g. 'Oslo-mannen' ('the man from Oslo')",
     "GPE_LOC": "Geo-political entity, with a locative sense, e.g. 'John lives in Spain'",
     "GPE_ORG": "Geo-political entity, with an organisation sense, e.g. 'Spain declined to meet with Belgium'",
 }
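
The hunk above only touches a description string, but its header shows the entries sit near an `explain(term)` function. As a rough illustration of how such a label lookup can work, here is a minimal stand-in. It is not spaCy's implementation: the two-entry `GLOSSARY` and the `None` fallback are assumptions made for this sketch.

```python
from typing import Optional

# Hypothetical miniature glossary; the two descriptions are copied from the
# corrected side of the diff, everything else is illustrative only.
GLOSSARY = {
    "EVT": "Festivals, cultural events, sports events, weather phenomena, wars, etc.",
    "DRV": "Words (and phrases?) that are derived from a name, but not a name in themselves, e.g. 'Oslo-mannen' ('the man from Oslo')",
}


def explain(term: str) -> Optional[str]:
    """Return the description for a label, or None if the label is unknown."""
    return GLOSSARY.get(term)


print(explain("DRV"))
print(explain("NOT_A_LABEL"))
```

Because the descriptions are user-facing strings, a typo such as "dervied" would surface verbatim to anyone calling the lookup, which is why a spelling-only fix is still worth committing.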

spacy/lang/ht/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -22,10 +22,12 @@ class HaitianCreoleDefaults(BaseDefaults):
     stop_words = STOP_WORDS
     tag_map = TAG_MAP

+
 class HaitianCreole(Language):
     lang = "ht"
     Defaults = HaitianCreoleDefaults

+
 @HaitianCreole.factory(
     "lemmatizer",
     assigns=["token.lemma"],
@@ -49,4 +51,5 @@ def make_lemmatizer(
         nlp.vocab, model, name, mode=mode, overwrite=overwrite, scorer=scorer
     )

+
 __all__ = ["HaitianCreole"]

spacy/lang/ht/lex_attrs.py

Lines changed: 3 additions & 0 deletions
@@ -49,6 +49,7 @@
     "P": "Pa",
 }

+
 def like_num(text):
     text = text.strip().lower()
     if text.startswith(("+", "-", "±", "~")):
@@ -69,9 +70,11 @@ def like_num(text):
         return True
     return False

+
 def norm_custom(text):
     return NORM_MAP.get(text, text.lower())

+
 LEX_ATTRS = {
     LIKE_NUM: like_num,
     NORM: norm_custom,
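
The diff shows only the first and last lines of `like_num`, so the middle of the function is not visible here. The sketch below is a hypothetical, simplified number check written for illustration. It reuses the two visible steps (normalizing the text and testing for a sign prefix with a tuple argument to `str.startswith`); the separator handling and the `isdigit` test are assumptions, not the code from the file.

```python
def like_num_sketch(text: str) -> bool:
    """Hypothetical, simplified number check; not the function from the diff."""
    text = text.strip().lower()
    # str.startswith accepts a tuple, so one call covers all four sign prefixes.
    if text.startswith(("+", "-", "±", "~")):
        text = text[1:]
    # Assumption: treat "," and "." as separators inside numbers and drop them.
    text = text.replace(",", "").replace(".", "")
    return text.isdigit()


print(like_num_sketch("+10"))
print(like_num_sketch("~1,000.5"))
print(like_num_sketch("mèsi"))
```

A function of this shape is what a `LIKE_NUM` entry in a lexical attribute table would point to, as in the `LEX_ATTRS` mapping at the end of the hunk.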

spacy/lang/ht/punctuation.py

Lines changed: 38 additions & 23 deletions
@@ -16,28 +16,43 @@
 _prefixes_elision = "m n l y t k w"
 _prefixes_elision += " " + _prefixes_elision.upper()

-TOKENIZER_PREFIXES = LIST_PUNCT + LIST_QUOTES + [
-    r"(?:({pe})[{el}])(?=[{a}])".format(
-        a=ALPHA, el=ELISION, pe=merge_chars(_prefixes_elision)
-    )
-]
+TOKENIZER_PREFIXES = (
+    LIST_PUNCT
+    + LIST_QUOTES
+    + [
+        r"(?:({pe})[{el}])(?=[{a}])".format(
+            a=ALPHA, el=ELISION, pe=merge_chars(_prefixes_elision)
+        )
+    ]
+)

-TOKENIZER_SUFFIXES = LIST_PUNCT + LIST_QUOTES + LIST_ELLIPSES + [
-    r"(?<=[0-9])%", # numbers like 10%
-    r"(?<=[0-9])(?:{h})".format(h=HYPHENS), # hyphens after numbers
-    r"(?<=[{a}])['’]".format(a=ALPHA), # apostrophes after letters
-    r"(?<=[{a}])['’][mwlnytk](?=\s|$)".format(a=ALPHA), # contractions
-    r"(?<=[{a}0-9])\)", # right parenthesis after letter/number
-    r"(?<=[{a}])\.(?=\s|$)".format(a=ALPHA), # period after letter if space or end of string
-    r"(?<=\))[\.\?!]", # punctuation immediately after right parenthesis
-]
+TOKENIZER_SUFFIXES = (
+    LIST_PUNCT
+    + LIST_QUOTES
+    + LIST_ELLIPSES
+    + [
+        r"(?<=[0-9])%", # numbers like 10%
+        r"(?<=[0-9])(?:{h})".format(h=HYPHENS), # hyphens after numbers
+        r"(?<=[{a}])['’]".format(a=ALPHA), # apostrophes after letters
+        r"(?<=[{a}])['’][mwlnytk](?=\s|$)".format(a=ALPHA), # contractions
+        r"(?<=[{a}0-9])\)", # right parenthesis after letter/number
+        r"(?<=[{a}])\.(?=\s|$)".format(
+            a=ALPHA
+        ), # period after letter if space or end of string
+        r"(?<=\))[\.\?!]", # punctuation immediately after right parenthesis
+    ]
+)

-TOKENIZER_INFIXES = LIST_ELLIPSES + LIST_ICONS + [
-    r"(?<=[0-9])[+\-\*^](?=[0-9-])",
-    r"(?<=[{al}{q}])\.(?=[{au}{q}])".format(
-        al=ALPHA_LOWER, au=ALPHA_UPPER, q=CONCAT_QUOTES
-    ),
-    r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
-    r"(?<=[{a}0-9])(?:{h})(?=[{a}])".format(a=ALPHA, h=HYPHENS),
-    r"(?<=[{a}][{el}])(?=[{a}])".format(a=ALPHA, el=ELISION),
-]
+TOKENIZER_INFIXES = (
+    LIST_ELLIPSES
+    + LIST_ICONS
+    + [
+        r"(?<=[0-9])[+\-\*^](?=[0-9-])",
+        r"(?<=[{al}{q}])\.(?=[{au}{q}])".format(
+            al=ALPHA_LOWER, au=ALPHA_UPPER, q=CONCAT_QUOTES
+        ),
+        r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
+        r"(?<=[{a}0-9])(?:{h})(?=[{a}])".format(a=ALPHA, h=HYPHENS),
+        r"(?<=[{a}][{el}])(?=[{a}])".format(a=ALPHA, el=ELISION),
+    ]
+)
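
This hunk is a formatting-only change: the same lists are concatenated in the same order, just wrapped in parentheses. To make the suffix rules easier to read, here is a small sketch that exercises two of the patterns with Python's `re` module directly. The two chosen patterns need none of the spaCy character-class constants; the variable names and sample strings are made up for this example, and the sketch does not go through spaCy's tokenizer.

```python
import re

# Two suffix patterns copied verbatim from the diff above.
PERCENT_AFTER_DIGIT = re.compile(r"(?<=[0-9])%")
PUNCT_AFTER_PAREN = re.compile(r"(?<=\))[\.\?!]")

# The lookbehind is zero-width: only "%" itself is matched, and only when a
# digit comes immediately before it.
print(PERCENT_AFTER_DIGIT.search("10%") is not None)
print(PERCENT_AFTER_DIGIT.search("a%") is not None)

# Sentence punctuation is matched only directly after a closing parenthesis.
match = PUNCT_AFTER_PAREN.search("(wi)!")
print(match.group(0), match.start())
```

Because the lookbehind consumes no characters, a rule like this can mark `%` as its own suffix without pulling the preceding digit into the match.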

spacy/lang/ht/stop_words.py

Lines changed: 1 addition & 2 deletions
@@ -39,8 +39,7 @@

 men mèsi oswa osinon

-"""
-.split()
+""".split()
 )

 # Add common contractions, with and without apostrophe variants
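
The change above joins `.split()` onto the closing triple quote, which is purely a layout fix. For readers unfamiliar with the idiom, here is a minimal sketch of how a whitespace-separated string literal becomes a set of stop words. The four words are the ones visible in the diff context; the rest of the real list is not reproduced, and wrapping the result in `set(...)` is an assumption suggested by the closing parenthesis in the hunk.

```python
# Hypothetical miniature stop list; only the four words visible in the diff.
STOP_WORDS = set(
    """
men mèsi oswa osinon
""".split()
)

print(sorted(STOP_WORDS))
print("mèsi" in STOP_WORDS)
```

Calling `str.split()` with no arguments splits on any run of whitespace and drops empty strings, so the blank lines and line breaks inside the literal do not produce empty entries.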
