We will parse preprints that mention techniques such as XAS, XPS, and XANES to build dictionaries that, hopefully, tell us
- what the most important terms are
- in which context they appear
These dictionaries can then guide standardization efforts.
We will discuss our results with domain experts.
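As a rough illustration of the dictionary-building idea, the following sketch counts how often each technique name occurs in a set of texts and which words appear near it. It is only a minimal sketch under stated assumptions: the function name `build_dictionary`, the five-word context window, and the simple regex tokenizer are all hypothetical choices for illustration, not part of any existing tool in this project.

```python
import re
from collections import Counter, defaultdict

# Technique names taken from the text above; extend as needed.
TECHNIQUES = ("XAS", "XPS", "XANES")


def build_dictionary(texts, techniques=TECHNIQUES, window=5):
    """Count technique mentions and the words that appear near them.

    Hypothetical helper: `window` is the number of tokens kept on each
    side of a technique mention (an assumed value, not a tuned one).
    """
    term_counts = Counter()          # how often each technique is mentioned
    contexts = defaultdict(Counter)  # technique -> neighbouring word counts
    for text in texts:
        # Very simple tokenizer: words that start with a letter.
        tokens = re.findall(r"[A-Za-z][A-Za-z0-9\-]*", text)
        for i, tok in enumerate(tokens):
            if tok not in techniques:
                continue
            term_counts[tok] += 1
            lo = max(0, i - window)
            neighbours = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            # Skip other technique names so contexts hold ordinary words only.
            contexts[tok].update(
                w.lower() for w in neighbours if w not in techniques
            )
    return term_counts, contexts


if __name__ == "__main__":
    sample = [
        "The XPS spectra show a binding energy shift.",
        "XANES and XPS were measured at the edge.",
    ]
    counts, ctx = build_dictionary(sample)
    print(counts.most_common())
    print(ctx["XPS"].most_common(5))
```

A real run would probably also need stop-word filtering and multi-word phrase detection before the output is useful to domain experts.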
The scientific questions we can, hopefully, answer are:
- Can we create good vocabularies/dictionaries in this way?
- Can we propose these as "minimal reporting standards"?
- How well does this work across domains? Is it a good way to derive minimal reporting standards for many different domains?
pyamiimage
Extracts text from images. This can be useful for common axis labels in graphs, or for legends (what the substances are).
Extract words and phrases from text. The most immediate targets would be figure captions and table captions.
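To make the caption idea concrete, here is a minimal sketch that pulls figure and table captions out of plain text with a regular expression. It assumes captions start a line with "Figure N.", "Fig. N:" or "Table N." and fit on one line; the pattern and the `extract_captions` name are hypothetical and will not cover every layout found in preprints.

```python
import re

# Assumed caption shape: "Figure 1. ...", "Fig. 2: ..." or "Table 3. ..."
# at the start of a line. Real preprints will need a more tolerant pattern.
CAPTION_RE = re.compile(
    r"^(Fig(?:ure|\.)?|Table)\s+(\d+)[.:]\s*(.+)$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_captions(text):
    """Return (kind, number, caption) tuples for each caption-like line."""
    captions = []
    for match in CAPTION_RE.finditer(text):
        label = match.group(1).lower()
        kind = "table" if label.startswith("tab") else "figure"
        captions.append((kind, int(match.group(2)), match.group(3).strip()))
    return captions


if __name__ == "__main__":
    page = (
        "Figure 1. XANES spectra of Fe foil.\n"
        "Some body text.\n"
        "Table 2: Binding energies from XPS.\n"
    )
    for caption in extract_captions(page):
        print(caption)
```

The extracted caption strings could then be fed into the same term-counting step used for the body text.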