Remove strange wikidata punctuation on location specifiers #220
Comments
This is a problem upstream: https://www.wikidata.org/wiki/Q5885293 |
Yes. The question is how to solve this. I guess we would like to remove things from our corpus that people might want to keep in Wikidata, so there will not be a perfect alignment with Wikidata. Maybe we could add a CSV of entries to exclude, which the script that updates from Wikidata would apply? Or do you have another solution? |
I mean, those misspellings could just be fixed on Wikidata? EDIT: AFAIK those additional commas don't introduce any errors into our corpus |
No, I know. My point is that sooner or later we might end up with differences, but maybe not in the next couple of months. In that case, fixing this in Wikidata is probably easiest. |
They need to be edited on wikidata:
|
Ping @salgo60. Is this something you could take a pass on? |
|
@MansMeg what problem did you find with Q117288109?
|
I think that one is actually a problem with how we grab the data. Here we use the alias, which is incorrect. @BobBorges, right? |
All checked, but not all changed, as I didn't see a problem...
Off topic: I mentioned your project today as a model for how other organizations should work with their metadata. |
Should be fixed now. If we find this as an issue again, we could write a unit test. Caused by trailing commas (removed on wikidata) and alias/i-ort in the format |
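The unit test mentioned above could look roughly like the sketch below. It is only an illustration: the helper name `rows_with_trailing_punctuation`, the set of characters checked, and the inline sample data are assumptions, and a real test would read `location_specifier.csv` from the repository instead.

```python
import csv
import io


def rows_with_trailing_punctuation(csv_text, chars=",;"):
    """Return every CSV row in which some cell ends with one of `chars`.

    Hypothetical helper: the characters to check are an assumption,
    chosen because the reported problem was trailing commas.
    """
    bad = []
    for row in csv.reader(io.StringIO(csv_text)):
        if any(cell.rstrip().endswith(tuple(chars)) for cell in row):
            bad.append(row)
    return bad


def test_no_trailing_punctuation():
    # Inline sample standing in for the real metadata file (assumption).
    sample = "Q5885293,Kråkered\n"
    assert rows_with_trailing_punctuation(sample) == []
```

Run against a file that still contained the row `Q5885293,"Kråkered,"`, the helper would return that row and the test would fail.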
See for example:
Q5885293,"Kråkered," (now fixed)
We should not include location specifiers with punctuation if the ordinary name exists (like Kråkered in this case).
see:
https://github.com/welfare-state-analytics/riksdagen-corpus/blob/main/corpus/metadata/location_specifier.csv
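The rule stated above (drop a punctuated specifier when the ordinary name exists) could be sketched as follows. This is a minimal illustration, not the project's actual update script: the function name and the two-field row layout `(wiki_id, location)` are assumptions based on the example row.

```python
import string


def drop_punctuated_duplicates(rows):
    """Drop a location that differs from another location of the same
    Wikidata ID only by leading or trailing punctuation/whitespace.

    `rows` is an iterable of (wiki_id, location) pairs; this layout is
    an assumption, not the confirmed schema of location_specifier.csv.
    """
    rows = list(rows)
    names_by_id = {}
    for wiki_id, location in rows:
        names_by_id.setdefault(wiki_id, set()).add(location)

    strip_chars = string.punctuation + string.whitespace
    kept = []
    for wiki_id, location in rows:
        stripped = location.strip(strip_chars)
        # Skip only when the clean form is already present for this ID,
        # so a name that exists solely in punctuated form is preserved.
        if stripped != location and stripped in names_by_id[wiki_id]:
            continue
        kept.append((wiki_id, location))
    return kept
```

A punctuated name with no clean counterpart is kept on purpose, since removing it would silently lose the location; such cases are better fixed on Wikidata.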