Easy export of data #166

Open
kristian-lange opened this issue Aug 3, 2024 · 8 comments

Comments

@kristian-lange
Collaborator

No description provided.

@kristian-lange
Collaborator Author

Here is a repo with a Python script that can serve as a template for a database exporter.
https://github.com/kristian-lange/omm_db_exporter

I'm not sure though whether the exported data are enough. They only contain the JSON data from the job_results table.

@kristian-lange
Collaborator Author

Updated to work with the new database schema

@kristian-lange
Collaborator Author

kristian-lange commented Nov 6, 2024

@nclaidiere @smathot Please have a look at project https://github.com/kristian-lange/omm_db_exporter.

This simple Python script can be extended to your needs, e.g. to take the database's host, username, and password as arguments, or to use IDs other than the study ID to query the database.

So far it takes a study ID as an argument and writes a file in CSV format with the job result data. It can deal with inconsistencies in the job variables, e.g. if new variables were added while a study was running.

Is this okay? Any wishes?

Edit: In addition to the job result data and all job variables, the CSV contains the following fields:

  • participant_name
  • participant_identifier
  • participant_meta
  • job_id
  • job_position
  • study_id
  • study_name
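
One way to cope with job variables that differ between rows is to collect every column name first and leave missing values empty. The sketch below illustrates that idea only; it is not the code in omm_db_exporter. The `rows_to_csv` helper and the sample rows (including the `response` and `rt` variables) are hypothetical.

```python
import csv
import io


def rows_to_csv(rows):
    """Write dicts with differing keys as CSV text; missing values stay empty."""
    # Collect column names in first-seen order across all rows.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical job results: the second row has a variable that was added later.
rows = [
    {"study_id": 1, "job_id": 10, "response": "left"},
    {"study_id": 1, "job_id": 11, "response": "right", "rt": 532},
]
csv_text = rows_to_csv(rows)
print(csv_text)
```

Rows recorded before a variable existed simply get an empty cell in that column, so the header stays consistent for the whole file.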

@smathot
Collaborator

smathot commented Nov 7, 2024

This looks ok to me. It will be a bit difficult for @nclaidiere to have an opinion before he can actually use it to export data, so let's pick this up when we've migrated the system to the new server. But in general terms, this is indeed what we need.

@nclaidiere
Collaborator

I agree, it is difficult to know, but this looks good. Two questions though:

  • The goal was to be able to download data for a date range, a list of study IDs, a list of individuals, or a combination of those. Is that easily implemented here?
  • We agree that there will still be a download button in the server UI to download the results of a specific experiment directly, right?

@kristian-lange
Collaborator Author

Hi @nclaidiere and @smathot!

I addressed the missing points and added:

  • support for lists of study IDs
  • support for date ranges
  • command argument parsing

$ ./export.py -h
usage: export.py [-h] -i IDS [-f FROM] [-t TO]

Exports job result data into a file in CSV format

options:
  -h, --help            show this help message and exit
  -i IDS, --ids IDS     list of study IDs, comma-separated
  -f FROM, --from FROM  'from' date in format YYYY-MM-DD
  -t TO, --to TO        'to' date in format YYYY-MM-DD

Example: ./export.py --ids=1,2 --from=2024-11-01 --to=2024-11-03
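
For readers who want to extend the options, an interface like the one in the help text above could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the actual source of export.py: the `parse_ids` and `build_parser` helpers and the `from_date`/`to_date` attribute names are hypothetical. `dest` is set explicitly because `from` is a reserved word in Python.

```python
import argparse
from datetime import date


def parse_ids(text):
    """Turn '1,2' into [1, 2]; argparse reports a ValueError as a usage error."""
    return [int(part) for part in text.split(",") if part.strip()]


def build_parser():
    parser = argparse.ArgumentParser(
        prog="export.py",
        description="Exports job result data into a file in CSV format",
    )
    parser.add_argument(
        "-i", "--ids", required=True, type=parse_ids, metavar="IDS",
        help="list of study IDs, comma-separated",
    )
    # date.fromisoformat raises ValueError on input it cannot parse,
    # which argparse turns into an error message for the user.
    parser.add_argument(
        "-f", "--from", dest="from_date", type=date.fromisoformat,
        metavar="FROM", help="'from' date in format YYYY-MM-DD",
    )
    parser.add_argument(
        "-t", "--to", dest="to_date", type=date.fromisoformat,
        metavar="TO", help="'to' date in format YYYY-MM-DD",
    )
    return parser


# Same arguments as in the example call above.
args = build_parser().parse_args(
    ["--ids=1,2", "--from=2024-11-01", "--to=2024-11-03"]
)
print(args.ids, args.from_date, args.to_date)
```

Converting the values at parse time means the rest of the script receives a list of integers and `date` objects (or `None` when a date is omitted) rather than raw strings.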

@smathot
Collaborator

smathot commented Nov 12, 2024

Nice touch with the command-line options!

@nclaidiere
Collaborator

Fantastic, thanks!
