Issue with the Format and Head of the Dataset Files #1

Open
BAEK26 opened this issue Jul 9, 2024 · 0 comments

BAEK26 commented Jul 9, 2024

Hello,

I am trying to use the dataset provided in this repository, which is hosted on Google Drive. However, I am having trouble with the file format and with an unexpected header at the start of the files.

The dataset files are named *.sql.gz, and I am not sure how to use them properly. After decompressing them, I found an unidentified header that looks like this:

2502.dat��������������������������������������������������������������������������������������������0000600�0000765�0000024�00605276750�14577532764�011407� 0������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������
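
For what it's worth, the visible fields in that line (a file name followed by zero-padded octal numbers) resemble a POSIX tar header, so the decompressed file may be a tar archive rather than plain SQL. Below is a minimal sketch of how such a block could be decoded, assuming the standard ustar field offsets. `parse_tar_header` is a hypothetical helper name used only for illustration, not something from this repository.

```python
def parse_tar_header(block: bytes) -> dict:
    """Decode the leading fields of a 512-byte POSIX (ustar) tar header.

    Assumption: the block follows the standard ustar layout
    (name 0-99, mode 100-107, uid 108-115, gid 116-123,
    size 124-135, mtime 136-147, checksum 148-155, typeflag 156).
    """
    if len(block) < 512:
        raise ValueError("a tar header block is 512 bytes")

    def text(lo: int, hi: int) -> str:
        # Text fields are NUL-terminated and NUL-padded.
        return block[lo:hi].split(b"\0", 1)[0].decode("utf-8", "replace")

    def octal(lo: int, hi: int) -> int:
        # Numeric fields are ASCII octal digits padded with NULs or spaces.
        digits = block[lo:hi].strip(b"\0 ")
        return int(digits, 8) if digits else 0

    return {
        "name": text(0, 100),
        "mode": octal(100, 108),
        "uid": octal(108, 116),
        "gid": octal(116, 124),
        "size": octal(124, 136),    # member size in bytes
        "mtime": octal(136, 148),   # seconds since the Unix epoch
        "typeflag": block[156:157].decode("ascii", "replace"),
    }
```

If that guess is right, Python's standard `tarfile.is_tarfile()` should return `True` for the decompressed file, and `tarfile.open()` should be able to list its members. I have not confirmed this against the actual dataset.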

Additionally, the content of the files includes SQL INSERT statements. Here is an example snippet:

INSERT INTO unics_cordis.project_members VALUES (3519719, 176013, '888897376', '1949542', 'Washington University at St Louis', 'HES', 'US', NULL, 'St. Louis', NULL, 0, 796899, 'partner', NULL, NULL, NULL, 'MISSING', 38.6272733, -90.1978889, NULL, NULL);
INSERT INTO unics_cordis.project_members VALUES (3643708, 160140, '953456360', '1941838', 'OWL BIOMEDICAL INC', 'PRC', 'US', 'ROBIN HILL ROAD 75', 'GOLETA', '93117', 552909, 242125, 'participant', NULL, 'OWL', NULL, NULL, NULL, NULL, NULL, NULL, NULL);
INSERT INTO unics_cordis.project_members VALUES (3709647, 988792, '999572682', '2941692', 'BROWN UNIVERSITY', 'HES', 'US', '164 ANGELL STREET', 'PROVIDENCE, RI', '02912', 0, 1707, 'partner', NULL, 'UBR', NULL, NULL, 41.8284845, -71.4008336, NULL, NULL);

Could you please provide guidance on the following?

1. How should the dataset files be extracted and used?
2. What is the unidentified header, and what is its purpose?
3. Are any additional steps or tools needed to work with the SQL data?

Thank you for your assistance!

Best regards,
Jong-eun
