Rank / Classify Sources #34
Denny Vrandečić on his vision of sources:

> It would be really good to rank sources if objective criteria could be applied to the ranking.

@BobBorges, listen to Denny above: he says that en:Wikipedia ranks sources. I guess it would be better if the ranking is done by your project and SBL… I use the Wikidata rank feature and mark wrong facts, e.g. bad precision or "not stated in the birth record"… In the long run we get a rather good quality measurement. I like the way your project tests its data against external "sources" like Wikidata, but I miss seeing SBL in a metadata round-trip ecosystem… Using Wikidata for handling contradicting sources.
Thanks @salgo60 and @BobBorges for the insightful discussion; the quality of references is something we deeply care about. Let me first just say that ProVe is based on research [1] that takes the quality of sources into account by comparing the degree to which the textual content of external references supports the verbalisation of Wikidata triples. We take that as a basis to build a tool (the one in this repo) that could be of use to Wikidata editors. The output classifies sources into several types/boxes/colours, which goes exactly in the direction Denny is pointing at.

That said, I tend to agree with @BobBorges that objective criteria here are a challenging issue. We would be really keen on compiling different 'feelings' and approaches to the quality of sources under various perspectives, perhaps by building a dataset that we can use to improve the model behind ProVe.

[1] Amaral, G., Rodrigues, O. and Simperl, E., 2022. ProVe: A pipeline for automated provenance verification of knowledge graphs against textual sources. Semantic Web (Preprint), pp. 1–34.
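The classify-into-colours idea described above can be sketched roughly as follows. Note that the bucket names, the thresholds, and the word-overlap `support_score` are illustrative assumptions for this sketch; ProVe itself uses a trained verification model, not this naive scoring.

```python
# Sketch: bucket a reference by how well its text supports a verbalised
# Wikidata triple. Scoring and thresholds are invented for illustration.

def verbalise(subject: str, predicate: str, obj: str) -> str:
    """Turn a triple into a simple natural-language sentence."""
    return f"{subject} {predicate} {obj}"

def support_score(claim: str, reference_text: str) -> float:
    """Illustrative stand-in: fraction of claim words found in the reference."""
    claim_words = set(claim.lower().split())
    ref_words = set(reference_text.lower().split())
    return len(claim_words & ref_words) / len(claim_words)

def classify(score: float) -> str:
    """Map a support score to a colour bucket (thresholds are made up)."""
    if score >= 0.8:
        return "green"   # reference clearly supports the claim
    if score >= 0.4:
        return "yellow"  # partial or ambiguous support
    return "red"         # reference does not support the claim

claim = verbalise("Ada Lovelace", "was born in", "London")
print(classify(support_score(claim, "Ada Lovelace was born in London in 1815.")))
# -> green
```

A dataset of human judgements, as proposed above, could replace the thresholds and scoring here with a learned model.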
Thanks @albertmeronyo. I recommend delving into the "architecture" behind Wikidata and Denny Vrandečić's vision, particularly the types of research projects that can be undertaken regarding sources; see the video where I pointed out that we need facts with sources, and also metadata telling us whether we can trust a source.
I believe one key takeaway from @BobBorges' project is this: looking ahead, as data-driven research becomes more prominent and metadata round-tripping improves, it will become increasingly important to explicitly define the trustworthiness and quality of datasets.

Example of a research project using Wikibase:
See Wikidata_talk:WikiProject_Reference_Verification.
I stated in 2019 that we need to rank sources, see T222142. Wikidata has now been used a lot by the research project "Riksdagens Corpus" (@BobBorges), and we agree that sources like Svenskt Biografiskt Lexikon-ID (P3217) / Svenskt biografiskt lexikon (Q379406) / Tvåkammar-riksdagen 1867–1970 (Q110346241) are very good sources. However, they are just text strings, so using them in Wikidata takes some manual work; see issue #78.
My suggestion: add a ranking value for sources, so more people can agree on and understand that e.g. Svenskt Biografiskt Lexikon-ID (P3217) is high quality and has a quality process. I think there was some measurement for prizes, i.e. that getting the Nobelpriset (Q7191) is ranked higher than getting prize xxx; see my thoughts from 2019 that prizes could be a way of evaluating research in different countries: "T216409 Nobelprize as part of evaluating research in different countries".
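One hypothetical way to encode such a ranking value, purely as an illustration (the tier names, scores, and the `SOURCE_RANK` table below are invented for this sketch; no such Wikidata feature exists today):

```python
# Hypothetical source-ranking table mapping Wikidata property/item IDs to
# a quality tier. Tiers and entries are invented for illustration only.

SOURCE_RANK = {
    "P3217": ("Svenskt Biografiskt Lexikon-ID", "high"),       # curated, editorial process
    "Q379406": ("Svenskt biografiskt lexikon", "high"),
    "Q110346241": ("Tvåkammar-riksdagen 1867–1970", "high"),
}

def rank_of(source_id: str) -> str:
    """Return the quality tier of a source, defaulting to 'unranked'."""
    return SOURCE_RANK.get(source_id, (None, "unranked"))[1]

print(rank_of("P3217"))  # a curated source in the table: "high"
print(rank_of("Q42"))    # not in the table: "unranked"
```

A shared, community-maintained table like this could feed the dashboards suggested below, where "unranked" sources are exactly the ones that need community review.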
Maybe we can have dashboards showing how different research projects support PROV and use quality sources, to motivate research to move faster in the right direction…