Commit 2aeed2c

Merge pull request #122 from maxlath/split-simplify-claims-and-simplify-snaks
Split simplifyClaim and simplifySnak
2 parents ad69e34 + 740a737 commit 2aeed2c

File tree

13 files changed

+8456
-4145
lines changed

docs/simplify_claims.md

Lines changed: 104 additions & 74 deletions
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,24 @@
11
# Simplify claims
22
*associated Wikibase doc: [DataModel](https://www.mediawiki.org/wiki/Wikibase/DataModel)*
33

4-
`simplify.claims` functions are part of the larger [`simplify.entity` functions family](simplify_entities_data.md)
4+
`simplifyClaims` functions are part of the larger [`simplifyEntity` functions family](simplify_entities_data.md)
55

66
## Summary
77

88
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
99
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
1010

1111
- [Intro](#intro)
12-
- [simplify.claims](#simplifyclaims)
13-
- [simplify.propertyClaims](#simplifypropertyclaims)
14-
- [simplify.claim](#simplifyclaim)
15-
- [simplify.qualifiers](#simplifyqualifiers)
16-
- [simplify.propertyQualifiers](#simplifypropertyqualifiers)
17-
- [simplify.qualifier](#simplifyqualifier)
18-
- [simplify.references](#simplifyreferences)
19-
- [simplify.snaks](#simplifysnaks)
20-
- [simplify.propertySnaks](#simplifypropertysnaks)
21-
- [simplify.snak](#simplifysnak)
12+
- [simplifyClaims](#simplifyclaims)
13+
- [simplifyPropertyClaims](#simplifypropertyclaims)
14+
- [simplifyClaim](#simplifyclaim)
15+
- [simplifyQualifiers](#simplifyqualifiers)
16+
- [simplifyPropertyQualifiers](#simplifypropertyqualifiers)
17+
- [simplifyQualifier](#simplifyqualifier)
18+
- [simplifyReferences](#simplifyreferences)
19+
- [simplifySnaks](#simplifysnaks)
20+
- [simplifyPropertySnaks](#simplifypropertysnaks)
21+
- [simplifySnak](#simplifysnak)
2222
- [Options](#options)
2323
- [Add prefixes to entities and properties ids](#add-prefixes-to-entities-and-properties-ids)
2424
- [Keep rich values](#keep-rich-values)
@@ -105,12 +105,13 @@ we could have
105105
"P279": [ "Q340169", "Q2342494", "Q386724" ]
106106
```
107107

108-
That's what `simplify.claims`, `simplify.propertyClaims`, `simplify.claim` do, each at their own level.
108+
That's what `simplifyClaims`, `simplifyPropertyClaims`, `simplifyClaim` do, each at their own level.
109109

110-
## simplify.claims
111-
you just need to pass your entity' claims object to simplify.claims as such:
110+
## simplifyClaims
111+
You just need to pass your entity's claims object to `simplifyClaims`, like so:
112112
```js
113-
const simplifiedClaims = wbk.simplify.claims(entity.claims)
113+
import { simplifyClaims } from 'wikibase-sdk'
114+
const simplifiedClaims = simplifyClaims(entity.claims)
114115
```
115116

116117
in your workflow, that could give something like:
@@ -119,45 +120,74 @@ in your workflow, that could give something like:
119120
const url = wbk.getEntities('Q535')
120121
const { entities } = await fetch(url).then(res => res.json())
121122
const entity = entities.Q535
122-
const simplifiedClaims = wbk.simplify.claims(entity.claims)
123+
const simplifiedClaims = simplifyClaims(entity.claims)
123124
```
124125

125126
To keep things simple, "weird" values are removed (for instance, statements of datatype `wikibase-item` but set to `somevalue` instead of the expected Q id).
126127

127-
Note that you don't need to instantiate a `wbk` object to access those `simplify` functions, as they can directly imported: `import { simplify } from 'wikibase-sdk'`
128+
Note that those functions are also available on the `wbk.simplify` object: `wbk.simplify.claims`, etc.
128129

129-
## simplify.propertyClaims
130-
Same as simplify.claims but expects an array of claims, typically the array of claims of a specific property:
130+
## simplifyPropertyClaims
131+
Simplify an array of claims, typically the array of claims of a specific property:
131132
```js
132-
const simplifiedP31Claims = wbk.simplify.propertyClaims(entity.claims.P31)
133+
import { simplifyPropertyClaims } from 'wikibase-sdk'
134+
const simplifiedP31Claims = simplifyPropertyClaims(entity.claims.P31, options)
133135
```
134136

135-
## simplify.claim
136-
Same as simplify.claims but expects a unique claim
137+
## simplifyClaim
138+
Simplify a single claim
137139
```js
138-
const simplifiedP31Claim = wbk.simplify.claim(entity.claims.P31[0])
140+
import { simplifyClaim } from 'wikibase-sdk'
141+
const simplifiedP31Claim = simplifyClaim(entity.claims.P31[0], options)
139142
```
140143

141-
## simplify.qualifiers
142-
Same interface as [simplify.claims](#simplifyclaims) but taking a qualifiers object
143-
144-
## simplify.propertyQualifiers
145-
Same interface as [simplify.propertyClaims](#simplifypropertyclaims) but taking an array of qualifiers
144+
## simplifyQualifiers
145+
Simplify a qualifiers object
146+
```js
147+
import { simplifyQualifiers } from 'wikibase-sdk'
148+
const claim = entity.claims.P31[0]
149+
const simplifiedQualifiers = simplifyQualifiers(claim.qualifiers, options)
150+
```
146151

147-
## simplify.qualifier
148-
Same interface as [simplify.claim](#simplifyclaim) but taking a qualifier object
152+
## simplifyPropertyQualifiers
153+
Simplify an array of qualifiers
154+
```js
155+
import { simplifyPropertyQualifiers } from 'wikibase-sdk'
156+
const claim = entity.claims.P31[0]
157+
const simplifiedP580Qualifiers = simplifyPropertyQualifiers(claim.qualifiers.P580, options)
158+
```
149159

150-
## simplify.references
151-
Same interface as [simplify.claims](#simplifyclaims) but taking an array of reference records
160+
## simplifyQualifier
161+
Simplify a qualifier
162+
```js
163+
import { simplifyQualifier } from 'wikibase-sdk'
164+
const claim = entity.claims.P31[0]
165+
const simplifiedQualifier = simplifyQualifier(claim.qualifiers.P580[0], options)
166+
```
152167

153-
## simplify.snaks
154-
Same interface as [simplify.claims](#simplifyclaims), but with a name that hints that it could also accept qualifiers or reference records.
168+
## simplifyReferences
169+
Simplify an array of references
170+
```js
171+
import { simplifyReferences } from 'wikibase-sdk'
172+
const claim = entity.claims.P31[0]
173+
const simplifiedReferences = simplifyReferences(claim.references, options)
174+
```
155175

156-
## simplify.propertySnaks
157-
Same interface as [simplify.propertyClaims](#simplifypropertyclaims), but with a name that hints that it could also accept an array of qualifiers snaks or an array of reference snaks.
176+
## simplifyReference
177+
Simplify a reference
178+
```js
179+
import { simplifyReference } from 'wikibase-sdk'
180+
const claim = entity.claims.P31[0]
181+
const simplifiedReference = simplifyReference(claim.references[0], options)
182+
```
158183

159-
## simplify.snak
160-
Same interface as [simplify.claim](#simplifyclaim), but with a name that hints that it could also accept a qualifier or reference record [snak](https://www.wikidata.org/wiki/Wikidata:Glossary/en#Snak).
184+
## simplifySnak
185+
Simplify a [snak](https://www.wikidata.org/wiki/Wikidata:Glossary/en#Snak), be it a claim `mainsnak`, a qualifier snak, or a reference snak
186+
```js
187+
import { simplifySnak } from 'wikibase-sdk'
188+
const claim = entity.claims.P31[0]
189+
const simplifiedSnak = simplifySnak(claim.mainsnak, options)
190+
```
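
To make the three snak locations concrete, here is a minimal sketch that walks one claim and collects its main snak, qualifier snaks, and reference snaks. It assumes the standard Wikibase JSON layout (`mainsnak`, `qualifiers` keyed by property id, `references[].snaks` keyed by property id). The `collectSnaks` helper and the sample claim are hypothetical and not part of wikibase-sdk.

```js
// Hypothetical helper, not part of wikibase-sdk: lists every snak found in a claim
function collectSnaks (claim) {
  const snaks = [ { role: 'mainsnak', snak: claim.mainsnak } ]
  // Qualifiers are grouped by property id, each property holding an array of snaks
  for (const propertySnaks of Object.values(claim.qualifiers || {})) {
    for (const snak of propertySnaks) snaks.push({ role: 'qualifier', snak })
  }
  // Each reference record holds its own snaks, also grouped by property id
  for (const reference of claim.references || []) {
    for (const propertySnaks of Object.values(reference.snaks)) {
      for (const snak of propertySnaks) snaks.push({ role: 'reference', snak })
    }
  }
  return snaks
}

// Made-up sample claim following the Wikibase JSON layout
const sampleClaim = {
  mainsnak: { snaktype: 'value', property: 'P31', datatype: 'wikibase-item', datavalue: { type: 'wikibase-entityid', value: { id: 'Q5' } } },
  qualifiers: {
    P580: [ { snaktype: 'value', property: 'P580', datatype: 'time', datavalue: { type: 'time', value: { time: '+2001-01-01T00:00:00Z', precision: 9 } } } ],
  },
  references: [
    { snaks: { P854: [ { snaktype: 'value', property: 'P854', datatype: 'url', datavalue: { type: 'string', value: 'https://example.org' } } ] } },
  ],
}

const roles = collectSnaks(sampleClaim).map(({ role }) => role)
console.log(roles)
```

Per the description above, each collected `snak` is the kind of object `simplifySnak` is meant to accept, whichever part of the claim it came from.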
161191

162192
## Options
163193

@@ -167,9 +197,9 @@ Same interface as [simplify.claim](#simplifyclaim), but with a name that hints t
167197
It may be useful to prefix entities and properties ids in case you work with data from several domains/sources. This can be done by setting an entity prefix and/or a property prefix in the options:
168198
```js
169199
const options = { entityPrefix: 'wd', propertyPrefix: 'wdt' }
170-
wbk.simplify.claims(entity.claims, options)
171-
wbk.simplify.propertyClaims(entity.claims.P31, options)
172-
wbk.simplify.claim(entity.claims.P31[0], options)
200+
simplifyClaims(entity.claims, options)
201+
simplifyPropertyClaims(entity.claims.P31, options)
202+
simplifyClaim(entity.claims.P31[0], options)
173203
```
174204
Results would then look something like
175205
```json
@@ -181,7 +211,7 @@ Results would then look something like
181211
### Keep rich values
182212
> `keepRichValues`
183213
184-
By default, `simplify.claims` returns only the simpliest values, so just a string for `monolingualtext` values and just a number for `quantity` values.
214+
By default, `simplifyClaims` returns only the simplest values, so just a string for `monolingualtext` values and just a number for `quantity` values.
185215
By setting `keepRichValues=true`,
186216
- `monolingualtext` values will be objects on the pattern `{ text, language }`
187217
- `quantity` values will be objects on the pattern `{ amount, unit, upperBound, lowerBound }`
@@ -191,9 +221,9 @@ By setting `keepRichValues=true`,
191221
192222
You can keep the value's types by passing `keepTypes: true` in the options:
193223
```js
194-
wbk.simplify.claims(entity.claims, { keepTypes: true })
195-
wbk.simplify.propertyClaims(entity.claims.P50, { keepTypes: true })
196-
wbk.simplify.claim(entity.claims.P50[0], { keepTypes: true })
224+
simplifyClaims(entity.claims, { keepTypes: true })
225+
simplifyPropertyClaims(entity.claims.P50, { keepTypes: true })
226+
simplifyClaim(entity.claims.P50[0], { keepTypes: true })
197227
```
198228
Results would then look something like
199229
```json
@@ -233,9 +263,9 @@ If one if missing from this list (probably because it's new) you are welcome to
233263
234264
You can keep qualifiers by passing `keepQualifiers: true` in the options:
235265
```js
236-
wbk.simplify.claims(entity.claims, { keepQualifiers: true })
237-
wbk.simplify.propertyClaims(entity.claims.P50, { keepQualifiers: true })
238-
wbk.simplify.claim(entity.claims.P50[0], { keepQualifiers: true })
266+
simplifyClaims(entity.claims, { keepQualifiers: true })
267+
simplifyPropertyClaims(entity.claims.P50, { keepQualifiers: true })
268+
simplifyClaim(entity.claims.P50[0], { keepQualifiers: true })
239269
```
240270
Results would then look something like
241271
```json
@@ -266,9 +296,9 @@ Results would then look something like
266296
267297
You can keep references by passing `keepReferences: true` in the options:
268298
```js
269-
wbk.simplify.claims(entity.claims, { keepReferences: true })
270-
wbk.simplify.propertyClaims(entity.claims.P50, { keepReferences: true })
271-
wbk.simplify.claim(entity.claims.P50[0], { keepReferences: true })
299+
simplifyClaims(entity.claims, { keepReferences: true })
300+
simplifyPropertyClaims(entity.claims.P50, { keepReferences: true })
301+
simplifyClaim(entity.claims.P50[0], { keepReferences: true })
272302
```
273303
Results would then look something like
274304
```json
@@ -297,9 +327,9 @@ Results would then look something like
297327
You can keep claim ids (a.k.a. `guid`), references and qualifiers hashes by passing `keepIds: true` in the options:
298328

299329
```js
300-
wbk.simplify.claims(entity.claims, { keepIds: true })
301-
wbk.simplify.propertyClaims(entity.claims.P50, { keepIds: true })
302-
wbk.simplify.claim(entity.claims.P50[0], { keepIds: true })
330+
simplifyClaims(entity.claims, { keepIds: true })
331+
simplifyPropertyClaims(entity.claims.P50, { keepIds: true })
332+
simplifyClaim(entity.claims.P50[0], { keepIds: true })
303333
```
304334
Results would then look something like
305335
```json
@@ -318,9 +348,9 @@ Results would then look something like
318348
You can keep references and qualifiers hashes by passing `keepHashes: true` in the options:
319349

320350
```js
321-
wbk.simplify.claims(entity.claims, { keepHashes: true })
322-
wbk.simplify.propertyClaims(entity.claims.P50, { keepHashes: true })
323-
wbk.simplify.claim(entity.claims.P50[0], { keepHashes: true })
351+
simplifyClaims(entity.claims, { keepHashes: true })
352+
simplifyPropertyClaims(entity.claims.P50, { keepHashes: true })
353+
simplifyClaim(entity.claims.P50[0], { keepHashes: true })
324354
```
325355

326356
This option has no effect if neither `keepQualifiers` nor `keepReferences` is `true`.
@@ -351,26 +381,26 @@ Results would then look something like
351381
352382
By default, [non-truthy statements](https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Truthy_statements) are filtered out (keeping only claims of rank `preferred` if any, otherwise only claims of rank `normal`). This can be disabled with this option.
353383
```js
354-
wbk.simplify.claims(entity.claims, { keepNonTruthy: true })
355-
wbk.simplify.propertyClaims(entity.claims.P1082, { keepNonTruthy: true })
384+
simplifyClaims(entity.claims, { keepNonTruthy: true })
385+
simplifyPropertyClaims(entity.claims.P1082, { keepNonTruthy: true })
356386
```
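
The default filtering rule described above (preferred claims if any, otherwise normal ones) can be sketched in a few lines. This is only an illustration of the rule as stated in the text, not the library's implementation; `keepTruthy` and the sample claims are hypothetical.

```js
// Hypothetical sketch of the truthy filter described above, not the library's code
function keepTruthy (propertyClaims) {
  const preferred = propertyClaims.filter(claim => claim.rank === 'preferred')
  // Preferred claims win when at least one exists
  if (preferred.length > 0) return preferred
  // Otherwise fall back on normal claims, which leaves deprecated ones out
  return propertyClaims.filter(claim => claim.rank === 'normal')
}

// Made-up claims, reduced to the fields the filter looks at
const populationClaims = [
  { id: 'a', rank: 'normal' },
  { id: 'b', rank: 'preferred' },
  { id: 'c', rank: 'deprecated' },
]

const truthyIds = keepTruthy(populationClaims).map(claim => claim.id)
const withoutPreferred = populationClaims.filter(claim => claim.rank !== 'preferred')
const fallbackIds = keepTruthy(withoutPreferred).map(claim => claim.id)
console.log(truthyIds, fallbackIds)
```

With `keepNonTruthy: true`, this filtering step would simply be skipped.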
357387

358388
#### Keep ranks
359389
> `keepRanks`
360390
```js
361-
wbk.simplify.claims(entity.claims, { keepRanks: true })
362-
wbk.simplify.propertyClaims(entity.claims.P1082, { keepRanks: true })
363-
wbk.simplify.claim(entity.claims.P1082[0], { keepRanks: true })
391+
simplifyClaims(entity.claims, { keepRanks: true })
392+
simplifyPropertyClaims(entity.claims.P1082, { keepRanks: true })
393+
simplifyClaim(entity.claims.P1082[0], { keepRanks: true })
364394
```
365395
This is mostly useful in combination with `keepNonTruthy`. Example: a city might have several population claims, with only the most recent having a `preferred` rank.
366396

367397
```js
368398
// By default, the simplification only keeps the claim of rank 'preferred'
369-
wbk.simplify.propertyClaims(city.claims.P1082, { keepRanks: true })
399+
simplifyPropertyClaims(city.claims.P1082, { keepRanks: true })
370400
// => [ { value: 100000, rank: 'preferred' } ]
371401

372402
// But the other claims can also be returned thanks to 'keepNonTruthy'
373-
wbk.simplify.propertyClaims(city.claims.P1082, { keepRanks: true, keepNonTruthy: true })
403+
simplifyPropertyClaims(city.claims.P1082, { keepRanks: true, keepNonTruthy: true })
374404
// => [
375405
// { value: 100000, rank: 'preferred' },
376406
// { value: 90000, rank: 'normal' },
@@ -383,47 +413,47 @@ wbk.simplify.propertyClaims(city.claims.P1082, { keepRanks: true, keepNonTruthy:
383413
#### Customize novalue value
384414
> `novalueValue`
385415
```js
386-
wbk.simplify.claims(claimWithNoValue, { novalueValue: '-' })
416+
simplifyClaims(claimWithNoValue, { novalueValue: '-' })
387417
// => '-'
388418
```
389419

390420
#### Customize somevalue value
391421
> `somevalueValue`
392422
```js
393-
wbk.simplify.claims(claimWithSomeValue, { somevalueValue: '?' })
423+
simplifyClaims(claimWithSomeValue, { somevalueValue: '?' })
394424
// => '?'
395425
```
396426

397427
#### Keep snaktypes
398428
> `keepSnaktypes`
399429
```js
400-
wbk.simplify.claims(claimWithSomeValue, { keepSnaktypes: true })
430+
simplifyClaims(claimWithSomeValue, { keepSnaktypes: true })
401431
// => { value: undefined, snaktype: 'somevalue' }
402-
wbk.simplify.claims(claimWithSomeValue, { keepSnaktypes: true, somevalueValue: '?' })
432+
simplifyClaims(claimWithSomeValue, { keepSnaktypes: true, somevalueValue: '?' })
403433
// => { value: '?', snaktype: 'somevalue' }
404434
```
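
How `novalueValue`, `somevalueValue`, and `keepSnaktypes` combine can be summarized with a small sketch. It only reproduces the outputs shown in the examples above for snaks without a value; `placeholderFor` is a hypothetical name, and the library's internals may differ.

```js
// Hypothetical sketch, not the library's code: picks what a valueless snak simplifies to
function placeholderFor (snaktype, options = {}) {
  if (snaktype !== 'novalue' && snaktype !== 'somevalue') {
    throw new Error(`unexpected snaktype: ${snaktype}`)
  }
  const { novalueValue, somevalueValue, keepSnaktypes } = options
  // Each valueless snaktype has its own customizable placeholder (undefined when not set)
  const value = snaktype === 'novalue' ? novalueValue : somevalueValue
  // With keepSnaktypes, the placeholder is wrapped so that the snaktype is not lost
  return keepSnaktypes ? { value, snaktype } : value
}

const dash = placeholderFor('novalue', { novalueValue: '-' })
const wrapped = placeholderFor('somevalue', { keepSnaktypes: true, somevalueValue: '?' })
const wrappedDefault = placeholderFor('somevalue', { keepSnaktypes: true })
console.log(dash, wrapped, wrappedDefault)
```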
405435

406436
### Keep all
407437
> `keepAll`
408438
Activates all the `keep` options detailed above:
409439
```js
410-
wbk.simplify.claims(claims, { keepAll: true })
440+
simplifyClaims(claims, { keepAll: true })
411441
// Is equivalent to
412-
wbk.simplify.claims(claims, { keepQualifiers: true, keepReferences: true, keepIds: true, keepHashes: true, keepTypes: true, keepSnaktypes: true, keepRanks: true })
442+
simplifyClaims(claims, { keepQualifiers: true, keepReferences: true, keepIds: true, keepHashes: true, keepTypes: true, keepSnaktypes: true, keepRanks: true })
413443
```
414444
Those options can then be disabled one by one
415445
```js
416-
wbk.simplify.claims(claims, { keepAll: true, keepTypes: false })
446+
simplifyClaims(claims, { keepAll: true, keepTypes: false })
417447
```
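
One way to picture how `keepAll` and an explicit override such as `keepTypes: false` could interact is the sketch below. The option names come from the equivalence shown above; `expandKeepAll` is a hypothetical helper and only an assumption about the behavior, not the library's code.

```js
// Option names taken from the documented equivalence
const keepOptions = [ 'keepQualifiers', 'keepReferences', 'keepIds', 'keepHashes', 'keepTypes', 'keepSnaktypes', 'keepRanks' ]

// Hypothetical sketch: expands keepAll into the individual flags
function expandKeepAll (options = {}) {
  if (!options.keepAll) return options
  const expanded = { ...options }
  for (const name of keepOptions) {
    // Only fill in flags the caller did not set explicitly,
    // so that `keepTypes: false` survives the expansion
    if (expanded[name] === undefined) expanded[name] = true
  }
  return expanded
}

const expanded = expandKeepAll({ keepAll: true, keepTypes: false })
console.log(expanded.keepTypes, expanded.keepRanks)
```

Note that the documented equivalence does not list `keepNonTruthy` or `keepRichValues`, so the sketch leaves them untouched.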
418448

419449
### Change time parser
420450

421-
By default, `simplify.claims` functions use [`wikidataTimeToISOString`](general_helpers.md#wikidataTimeToISOString) to parse [Wikidata time values](https://www.mediawiki.org/wiki/Wikibase/DataModel#Dates_and_times).
451+
By default, `simplifyClaims` functions use [`wikidataTimeToISOString`](general_helpers.md#wikidataTimeToISOString) to parse [Wikidata time values](https://www.mediawiki.org/wiki/Wikibase/DataModel#Dates_and_times).
422452

423453
You can nevertheless request to use a different converter by setting the option `timeConverter`:
424454

425455
```js
426-
wbk.simplify.claims(claims, { timeConverter: 'iso' })
456+
simplifyClaims(claims, { timeConverter: 'iso' })
427457
```
428458

429459
Possible modes:
@@ -446,5 +476,5 @@ If none of those format fits your needs, you can pass a custom time converter fu
446476
```
447477
```js
448478
const timeConverterFn = ({ time, precision }) => `foo/${time}/${precision}/bar`
449-
wbk.simplify.claims(claims, { timeConverter })
479+
simplifyClaims(claims, { timeConverter: timeConverterFn })
450480
```
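
As a slightly more realistic illustration of a custom converter, the sketch below returns a plain year number for year-precision values and the raw Wikibase time string otherwise. It assumes, as in the example above, that the converter receives an object with `time` and `precision`, and relies on the Wikibase convention that precision `9` means "year". The name `yearOrRaw` is hypothetical.

```js
// Hypothetical custom converter: a number for year-precision values, the raw string otherwise
const yearOrRaw = ({ time, precision }) => {
  // Wikibase time strings look like '+1952-03-11T00:00:00Z': sign, year, month, day, time
  const match = time.match(/^([+-])(\d+)-/)
  if (precision === 9 && match) {
    const year = parseInt(match[2], 10)
    return match[1] === '-' ? -year : year
  }
  return time
}

const yearOnly = yearOrRaw({ time: '+1952-00-00T00:00:00Z', precision: 9 })
const fullDay = yearOrRaw({ time: '+1952-03-11T00:00:00Z', precision: 11 })
console.log(yearOnly, fullDay)
```

Such a function would then be passed through the `timeConverter` option, in the same way as the converter above.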

scripts/compare_datatypes.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
#!/usr/bin/env ts-node
22
import { kebabCase } from 'lodash-es'
33
import { red, green } from 'tiny-chalk'
4-
import { parsers } from '../src/helpers/parse_claim.js'
4+
import { parsers } from '../src/helpers/parse_snak.js'
55
import { readJsonFile } from '../tests/lib/utils.js'
66

77
const supportedTypes = Object.keys(parsers)

src/helpers/parse_claim.ts renamed to src/helpers/parse_snak.ts

Lines changed: 12 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,8 @@
11
import { wikibaseTimeToEpochTime, wikibaseTimeToISOString, wikibaseTimeToSimpleDay } from './time.js'
22
import type { TimeInputValue } from './time.js'
3+
import type { DataType } from '../types/claim.js'
4+
import type { SimplifySnakOptions } from '../types/simplify_claims.js'
5+
import type { SnakValue } from '../types/snakvalue.js'
36

47
const simple = datavalue => datavalue.value
58

@@ -105,22 +108,16 @@ for (const [ datatype, parser ] of Object.entries(parsers)) {
105108
normalizedParsers[normalizeDatatype(datatype)] = parser
106109
}
107110

108-
export function parseClaim (datatype, datavalue, options, claimId) {
109-
// Known case of missing datatype: form.claims, sense.claims, mediainfo.statements
111+
export function parseSnak (datatype: DataType | undefined, datavalue: SnakValue, options: SimplifySnakOptions) {
112+
// @ts-expect-error Known case of missing datatype: form.claims, sense.claims, mediainfo.statements
110113
datatype = datatype || datavalue.type
111114

112-
try {
113-
// Known case requiring normalization
114-
// - legacy "muscial notation" datatype
115-
// - mediainfo won't have datatype="globe-coordinate", but datavalue.type="globecoordinate"
116-
const parser = normalizedParsers[normalizeDatatype(datatype)]
117-
return parser(datavalue, options)
118-
} catch (err) {
119-
if (err.message === 'parsers[datatype] is not a function') {
120-
err.message = `${datatype} claim parser isn't implemented
121-
Claim id: ${claimId}
122-
Please report to https://github.com/maxlath/wikibase-sdk/issues`
123-
}
124-
throw err
115+
// Known case requiring normalization
116+
// - legacy "musical notation" datatype
117+
// - mediainfo won't have datatype="globe-coordinate", but datavalue.type="globecoordinate"
118+
const parser = normalizedParsers[normalizeDatatype(datatype)]
119+
if (!parser) {
120+
throw new Error(`${normalizeDatatype(datatype)} claim parser isn't implemented. Please report to https://github.com/maxlath/wikibase-sdk/issues`)
125121
}
122+
return parser(datavalue, options)
126123
}
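
The lookup-then-throw pattern adopted in `parseSnak` can be illustrated with a self-contained sketch. Everything here is a stand-in: the two-entry `parsers` table, the `normalizeDatatype` implementation (assumed to lowercase and drop non-letters, since the real one is not shown in this diff), and the `parseValue` name are hypothetical.

```js
// Hypothetical stand-in for the module's parsers table
const parsers = {
  string: datavalue => datavalue.value,
  'globe-coordinate': datavalue => [ datavalue.value.latitude, datavalue.value.longitude ],
}

// Assumption: normalization makes 'globe-coordinate' and 'globecoordinate' share one key
const normalizeDatatype = datatype => datatype.toLowerCase().replace(/[^a-z]/g, '')

const normalizedParsers = {}
for (const [ datatype, parser ] of Object.entries(parsers)) {
  normalizedParsers[normalizeDatatype(datatype)] = parser
}

function parseValue (datatype, datavalue) {
  // Fall back on datavalue.type when the datatype is missing
  datatype = datatype || datavalue.type
  const parser = normalizedParsers[normalizeDatatype(datatype)]
  // An explicit check replaces the former try/catch on the error message
  if (!parser) throw new Error(`${datatype} parser isn't implemented`)
  return parser(datavalue)
}

const coords = parseValue(undefined, { type: 'globecoordinate', value: { latitude: 48.85, longitude: 2.35 } })
console.log(coords)
```

Checking for a missing parser before calling it avoids matching on an engine-specific `TypeError` message, which is what the removed `catch` block depended on.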
