ihm.pkl files differ and lot of missing values in an episode #139

sivakumarlakkoju · 2023-02-03T10:03:23Z

After creating the benchmark dataset for in-hospital mortality risk, the ihm.pkl files differ when I run the verification test.
Also, the CSVs for each episode have a lot of missing values at each timestamp. For example, capillary refill rate never has a value. Is this normal?

Is there something I'm doing wrong while building the dataset? Please let me know. Thank you.

PS: Is there any possibility of getting the updated library with 50+ variables as mentioned previously?

hrayrhar · 2023-04-13T06:36:07Z

Hi Siva,

Since the code hasn't been updated for a while, some things might not work as expected with newer versions of the libraries. Have you tried using the exact library versions specified in the requirements.txt file?

> PS: Is there any possibility of getting the updated library with 50+ variables as mentioned previously?

Unfortunately, we wrote code only for 17 variables.

sivakumarlakkoju · 2023-04-13T06:52:35Z

Hey Hrayr, thanks for replying.
Yes, I tried with the exact versions and ran the process multiple times, but the end result is always the same.

I get the following warning when running the validate_events script:
DtypeWarning: Columns (5) have mixed types. Specify dtype option on import or set low_memory=False.
events_df = pd.read_csv(os.path.join(args.subjects_root_path, subject, 'events.csv'), index_col=False,

> Unfortunately, we wrote code only for 17 variables.

Okay.

Also, I'd like to know the rationale behind the choice of imputation values listed in Table 3 of the paper.

hrayrhar · 2023-04-14T00:49:18Z

Hi Siva,

Unfortunately, the tests I wrote before are too rigid and detect even insignificant differences. The current version of the code does not pass those tests, but I have verified manually that all the produced csv files match with those generated by older and tested versions of the code. I am currently trying to write better tests.
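For anyone who wants to compare two generated pickle files without failing on insignificant floating-point differences, here is a minimal sketch of a tolerance-based comparison. It is not the repository's test code. The function names `approx_equal` and `pickles_match` and the tolerance values are hypothetical, and the sketch assumes the pickles hold plain Python containers (dicts, lists, tuples, numbers, strings); NumPy arrays would need something like `numpy.allclose` instead.

```python
import math
import pickle


def approx_equal(a, b, rel_tol=1e-6, abs_tol=1e-9):
    """Recursively compare nested structures, tolerating small float differences."""
    if isinstance(a, float) and isinstance(b, float):
        # Treat NaN as equal to NaN so that missing values line up.
        if math.isnan(a) and math.isnan(b):
            return True
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            approx_equal(a[k], b[k], rel_tol, abs_tol) for k in a
        )
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(
            approx_equal(x, y, rel_tol, abs_tol) for x, y in zip(a, b)
        )
    # Everything else (ints, strings, None, ...) must match exactly.
    return a == b


def pickles_match(path_a, path_b, **tolerances):
    """Load two pickle files and compare their contents approximately."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return approx_equal(pickle.load(fa), pickle.load(fb), **tolerances)
```

A call such as `pickles_match("mine/ihm.pkl", "reference/ihm.pkl")` (both paths are placeholders) would then report whether the two files agree up to the chosen tolerances.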

> Also the csv's for each episode have lot of missing values at each time stamp, for example capillary refill rate, always has no value, is this a norm?

Most episodes have a lot of missing data. But if you suspect that any particular csv file is incorrect, please paste it here and I will verify it against my local version.
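To get a quick picture of how sparse a given episode file is, a small sketch like the one below reports the fraction of empty cells per column. It uses only the standard library. The function name `missing_fraction` is hypothetical, and the column names in the sample are illustrative assumptions rather than a guaranteed match for the generated files.

```python
import csv
import io


def missing_fraction(csv_file):
    """Return {column name: fraction of rows where the cell is empty}."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            # Short rows yield None for missing cells; treat those as empty too.
            if (row[name] or "").strip() == "":
                counts[name] += 1
    return {name: (counts[name] / total if total else 0.0) for name in counts}


# Illustrative episode with assumed column names.
sample = io.StringIO(
    "Hours,Capillary refill rate,Heart Rate\n"
    "0.5,,80\n"
    "1.0,,\n"
    "1.5,,82\n"
)
for column, fraction in missing_fraction(sample).items():
    print(f"{column}: {fraction:.0%} missing")
```

For a real file, pass an open file handle, for example `with open(path, newline="") as f: missing_fraction(f)`, where `path` points at the episode csv you want to inspect.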

> I get the following warning when running the validate_events script

I get that warning too. It has no effect, don't worry about it.
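If the warning is distracting, the message itself names two standard `pandas.read_csv` options: pass an explicit `dtype` for the mixed column, or set `low_memory=False`. The sketch below shows both on a tiny in-memory CSV. It is not a change made in the repository, and the column name `VALUE` is an assumption chosen for illustration; the warning only identifies the column by position.

```python
import io

import pandas as pd

# Tiny stand-in for events.csv; "VALUE" mixes numeric and text entries.
csv_text = "ITEMID,VALUE\n1,7.4\n2,abc\n3,12\n"

# Option 1: tell pandas up front to read the mixed column as strings.
df_typed = pd.read_csv(io.StringIO(csv_text), index_col=False,
                       dtype={"VALUE": str})

# Option 2: let pandas infer types from the whole file in one pass
# instead of chunk by chunk (uses more memory on large files).
df_full = pd.read_csv(io.StringIO(csv_text), index_col=False,
                      low_memory=False)

print(df_typed["VALUE"].tolist())
```

Option 1 is the more explicit of the two, since the column type no longer depends on what happens to be in the file.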

sivakumarlakkoju · 2023-04-14T14:00:34Z

> Unfortunately, the tests I wrote before are too rigid and detect even insignificant differences. The current version of the code does not pass those tests, but I have verified manually that all the produced csv files match with those generated by older and tested versions of the code. I am currently trying to write better tests.

I'll wait for the updated tests, thank you.

> Most episodes have a lot of missing data. But if you suspect that any particular csv file is incorrect, please paste here, I will verify with the local version.

Will paste one soon, just to be sure.

hrayrhar · 2023-04-16T20:26:17Z

Hi Siva,

I have updated the tests. They are still not ideal, but you should get the same results if you follow the exact installation and benchmark building instructions of README.md. You can find updated information about the tests in mimic3benchmark/tests/README.md.

sivakumarlakkoju · 2023-04-17T06:45:18Z

Hi Hrayr,
Thank you for updating the tests. I've rerun the benchmark creation process and tested it with the updates you made, and it worked for me.

Can you please comment on this?
"I'd like to know the rationale behind choosing the impute values, as mentioned in table 3 of the paper."

Thanks in advance.
