JournalsDB tables were initially populated by loading existing classic files into tables. Since then, additions to the tables have been made by exporting the tables to Google Sheets, making edits, and reimporting the data. The export/import process records the editorial history and enforces some data requirements, but it is time-intensive when more than a few records are being added.
We need to implement a process in run.py that triggers the bulk import of data from files, modified so that the data undergo validity checks prior to the db insert. This discussion will focus on the bulk import of ISSNs.
Starting from the assumption that the bibstems for new records already exist in the master table, we need a task that:
- reads a formatted text file containing a bibstem, an ISSN_type, and an ISSN per row
- verifies that each bibstem is already in journalsdb (i.e. has a masterid)
- verifies that the ISSN isn't already in journalsdb.idents
- checks whether there is an existing ISSN of the given ISSN_type for that bibstem
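The file-reading step could be sketched as follows. This assumes a tab-separated layout with one (bibstem, ISSN_type, ISSN) triple per line; the actual file format isn't specified here, so the delimiter and the `read_issn_file` helper are illustrative:

```python
import csv
from io import StringIO

def read_issn_file(fileobj):
    """Parse a tab-separated file of (bibstem, ISSN_type, ISSN) rows.

    Rows that don't have exactly three non-empty fields are collected
    separately so the caller can report them to the curator.
    """
    rows, bad_rows = [], []
    reader = csv.reader(fileobj, delimiter="\t")
    for lineno, fields in enumerate(reader, start=1):
        fields = [f.strip() for f in fields]
        if len(fields) != 3 or not all(fields):
            bad_rows.append((lineno, fields))
            continue
        bibstem, issn_type, issn = fields
        rows.append({"bibstem": bibstem, "issn_type": issn_type, "issn": issn})
    return rows, bad_rows

# example usage with an in-memory file
sample = StringIO("ApJ\tprint\t0004-637X\nBadRow\tonly-two-fields\n")
rows, bad = read_issn_file(sample)
```

Collecting malformed rows instead of raising on the first one lets the task report all file problems in a single pass.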
If the given ISSN is truly new, the task should import the masterid, ISSN_type, and ISSN into idents. If it is not new, or if there is some other issue, it should provide actionable information to the user about what is wrong.
The procedure for this will be something like:
- read the file
- search journalsdb for the bibstem, and check whether that bibstem already has any ISSNs
- search journalsdb for the ISSN, and return a warning if it has already been assigned to another bibstem, or to the same bibstem but with a different ISSN_type
- if the bibstem exists, the ISSN does not, and the ISSN_type has not already been assigned to the bibstem, insert (masterid, ISSN, ISSN_type)
- if the bibstem exists, the ISSN does not, but the ISSN_type has already been assigned (and you're sure the new record is a valid alternate ISSN for this bibstem), insert (masterid, ISSN, ISSN_type_altN), where altN is generated from existing table data for alternate ISSNs
- if the bibstem does not exist, reject the row and warn the user
- if the bibstem exists and the ISSN already exists for the same bibstem, reject the row and notify the user that the ISSN is already present
- if the bibstem exists, and the ISSN exists but for a different bibstem and/or ISSN_type, reject the row and warn the user that there's a problematic record requiring investigation
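The branching above can be expressed as a pure function over the lookup results, which keeps the accept/reject logic testable independently of the database. The function and its argument names are illustrative, not taken from journalsmanager, and the `_altN` suffix scheme is one possible way to realize the alternate-type naming described above:

```python
def classify_row(bibstem_masterid, existing_types_for_bibstem,
                 issn_owner, issn_type):
    """Decide what to do with one (bibstem, ISSN_type, ISSN) row.

    bibstem_masterid: masterid for the bibstem, or None if not in master
    existing_types_for_bibstem: set of ISSN_types already in idents for it
    issn_owner: (masterid, ISSN_type) if the ISSN is already in idents,
                else None
    Returns ("insert", type_to_use) or ("reject", reason).
    """
    if bibstem_masterid is None:
        return ("reject", "bibstem not found in master")
    if issn_owner is not None:
        owner_id, owner_type = issn_owner
        if owner_id == bibstem_masterid and owner_type == issn_type:
            return ("reject", "ISSN already present for this bibstem")
        return ("reject", "ISSN assigned to another bibstem and/or "
                          "ISSN_type; needs investigation")
    if issn_type in existing_types_for_bibstem:
        # generate the next free alternate slot: print_alt1, print_alt2, ...
        n = 1
        while f"{issn_type}_alt{n}" in existing_types_for_bibstem:
            n += 1
        return ("insert", f"{issn_type}_alt{n}")
    return ("insert", issn_type)
```

Because the function takes plain values rather than ORM objects, every branch of the procedure can be unit-tested without a database fixture.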
The code for inserting an identifier of a given type already exists in `journalsmanager.tasks.task_db_load_identifier`, so the new code required is for reading the file, performing the sqlalchemy queries needed to validate the incoming data, and generating logging messages of the appropriate level (info/warning) that will direct the curator to what needs to be resolved.
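The two lookups reduce to simple queries; the real implementation would use sqlalchemy against the journalsdb models, but a stdlib sqlite3 stand-in illustrates the shape. The `master`/`idents` table and column names here are assumptions for illustration, not the actual journalsdb schema:

```python
import sqlite3

def lookup_bibstem_and_issn(conn, bibstem, issn):
    """Run the two validation lookups for one incoming row.

    Returns (masterid or None,
             set of ISSN_types already assigned to that masterid,
             (masterid, id_type) owning the ISSN, or None if unassigned).
    Table/column names (master, idents, id_type, id_value) are illustrative.
    """
    cur = conn.cursor()
    cur.execute("SELECT masterid FROM master WHERE bibstem = ?", (bibstem,))
    hit = cur.fetchone()
    masterid = hit[0] if hit else None
    types = set()
    if masterid is not None:
        cur.execute("SELECT id_type FROM idents WHERE masterid = ?",
                    (masterid,))
        types = {r[0] for r in cur.fetchall()}
    cur.execute("SELECT masterid, id_type FROM idents WHERE id_value = ?",
                (issn,))
    owner = cur.fetchone()
    return masterid, types, (tuple(owner) if owner else None)
```

The results feed directly into the validation decision; rejected rows would be reported via `logging.warning` and accepted ones handed off to the existing insert task.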