Language and script in metadata sheets #254

Open
danbalogh opened this issue Jan 5, 2024 · 4 comments

@danbalogh
Collaborator

@alevivier : I understand that you are or will soon be away on fieldwork. This is not urgent and can be addressed once you are back and have time. The matter came up in another discussion (see #250 for precedents) and we wanted to keep it in sight. Also, please assign anyone else who should be involved in this. I also have a number of other concerns about our metadata sheets. More and more people seem to be using them, yet I know of no instructions for filling in many of the cells properly, and several details don't work out as well as they could. I think a guide should be drafted for the metadata sheets (this one seems very much out of date), and the fields and vocabularies need to be finalised before too many people start filling them out. I'm happy to be involved in that discussion and just as happy to be left out of it so long as it takes place.

The question now is: what purpose does it serve to have entries for language and script in the metadata spreadsheets? In their current state, the fields only contain information that is also encoded in the XML files and can be pulled from there (#250), so recording the same information redundantly in the metadata seems to offer no advantage while creating space for human error.
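To make the "space for human error" concrete, here is a minimal sketch of a consistency check between the sheet and the XML. Everything named in it is an assumption for illustration only: the column names `filename` and `language`, the semicolon separator, and the idea that the XML-side language codes have already been collected into a dictionary.

```python
import csv
import io


def find_language_mismatches(sheet_csv, xml_languages):
    """Return rows whose recorded languages differ from those found in the XML.

    sheet_csv     -- CSV export of the metadata sheet, as a string
                     (hypothetical columns: "filename", "language")
    xml_languages -- dict mapping filename to the set of language codes
                     extracted from that XML file (assumed to exist already)
    """
    mismatches = []
    for row in csv.DictReader(io.StringIO(sheet_csv)):
        # Assumed convention: several codes in one cell are separated by ";".
        recorded = {c.strip() for c in row["language"].split(";") if c.strip()}
        encoded = xml_languages.get(row["filename"], set())
        if recorded != encoded:
            mismatches.append((row["filename"], sorted(recorded), sorted(encoded)))
    return mismatches


# Made-up sample data, not taken from any real sheet.
SAMPLE_SHEET = "filename,language\nA.xml,tam; san\nB.xml,san\n"
SAMPLE_XML_LANGUAGES = {"A.xml": {"tam", "san"}, "B.xml": {"tel"}}
```

If the sheet values always match the XML, a check like this only ever confirms the duplication. If they sometimes differ, it shows which copy would need correcting.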

@danbalogh
Collaborator Author

Some relevant bits from #250:

Michaël said it is not a problem to generate a smart display of the language information extracted from the XML, such as "Text in Tamil, with parts in Sanskrit. Translations in English and in French"
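As a rough illustration of what such a generated display could involve (this is not Michaël's implementation), the sketch below builds that kind of sentence from `xml:lang` attributes. It assumes TEI-style markup in which the edition sits in a `div` with `type="edition"` and each translation in a `div` with `type="translation"`. The code-to-name table and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical lookup table; a real project would use its own vocabulary.
LANGUAGE_NAMES = {"tam": "Tamil", "san": "Sanskrit", "eng": "English", "fra": "French"}


def _name(code):
    # Drop any script subtag such as "-Latn" before looking up the language.
    return LANGUAGE_NAMES.get(code.split("-")[0], code)


def summarise_languages(xml_text):
    """Build a short human-readable summary from xml:lang attributes."""
    root = ET.fromstring(xml_text)
    main, parts, translations = None, [], []
    for div in root.iter(TEI + "div"):
        kind = div.get("type")
        if kind == "edition":
            main = div.get(XML_LANG)
            # Any descendant tagged with a different language counts as a "part".
            for element in div.iter():
                code = element.get(XML_LANG)
                if code and code != main and code not in parts:
                    parts.append(code)
        elif kind == "translation":
            code = div.get(XML_LANG)
            if code and code not in translations:
                translations.append(code)

    sentences = []
    if main:
        text = "Text in " + _name(main)
        if parts:
            text += ", with parts in " + " and ".join(_name(c) for c in parts)
        sentences.append(text)
    if translations:
        label = "Translations" if len(translations) > 1 else "Translation"
        sentences.append(label + " in " + " and in ".join(_name(c) for c in translations))
    return ". ".join(sentences)


# Made-up sample document for illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div type="edition" xml:lang="tam-Latn"><p>abc <foreign xml:lang="san-Latn">def</foreign></p></div>
<div type="translation" xml:lang="eng"><p>x</p></div>
<div type="translation" xml:lang="fra"><p>y</p></div>
</body></text></TEI>"""
```

Under those assumptions, `summarise_languages(SAMPLE)` returns "Text in Tamil, with parts in Sanskrit. Translations in English and in French".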

Arlo said, "I am fairly sure the matter of redundant representation of language and script metadata was discussed by Adeline, manu and myself when we were working on the template and guide for the metadata spreadsheet, but I don't remember why we accepted/required the redundancy."

Dan said, "I also think that the redundancy was discussed and also have no clear recollection of the details. But I think that back then the idea was to record something slightly different in the metadata, perhaps by allowing a freetext description of the language of the inscription (e.g. "non-Standard Sanskrit" or "Sanskrit with boundary descriptions in Telugu"). That way, the redundancy is only partial. But what we have in the sheets now can be matched 100% to the data encoded in the XML, so the redundancy is a bad thing."

@alevivier
Collaborator

alevivier commented Jan 15, 2024

@danbalogh
Here is the latest version of the metadata guide: https://docs.google.com/document/d/1RqePCIm7SOBGl0M_V_q95TU87ogJR-NsZsdRrS0YnGE/edit?usp=sharing
It was written according to the latest version of the template.

@alevivier
Collaborator

About the question of redundancy of languages and scripts in metadata:
I don't remember why we chose to record languages and scripts both in the metadata and in the edition. If you feel this information is redundant, of course it can be deleted from one of the two places. But we need to be sure that this information can still be searched in the database. I don't know how the database will be searchable (by field or by free-text search).

@danbalogh
Collaborator Author

Thanks for sharing the MDT Guide. Are the people who have already recorded metadata aware of the existence of this guide? I'll read and add comments.

On the redundancy issue: I'm not in a position to decide, but the redundancy is definitely there, and so far nobody has explained why it might be necessary or even useful. If no explanation occurs to anyone, then script and language should be removed from the metadata table. (The XML files may also have script/language information associated with specific parts of the inscription, so they contain more information than the metadata table. It is only the data in the metadata table that can be deleted.)
Search with language/script filtering will of course have to be implemented.
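A minimal sketch of what field-based filtering could look like once language and script are pulled from the XML. The `Inscription` record, its field names and the sample values are all hypothetical, and nothing here reflects how the actual database is or will be built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inscription:
    # Hypothetical record, populated from the XML rather than from the sheet.
    identifier: str
    languages: frozenset
    scripts: frozenset


def filter_inscriptions(records, language=None, script=None):
    """Keep records matching every criterion that is given; None means 'any'."""
    return [
        record
        for record in records
        if (language is None or language in record.languages)
        and (script is None or script in record.scripts)
    ]


# Made-up sample records for illustration.
RECORDS = [
    Inscription("INS001", frozenset({"tam", "san"}), frozenset({"grantha"})),
    Inscription("INS002", frozenset({"san"}), frozenset({"telugu"})),
]
```

For example, `filter_inscriptions(RECORDS, language="san", script="telugu")` keeps only the second record. A free-text search would need a different approach.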

2 participants