Patient's visits list is not ordered - solved it by adding two lines #306

Open
smessica opened this issue Nov 8, 2024 · 0 comments

smessica commented Nov 8, 2024

I'm working with the MIMIC-III dataset using your library, and there is a bug in the order of visits per patient. It breaks the logic of custom task functions, and it also causes a bug in your readmission task function (https://github.com/sunlabuiuc/PyHealth/blob/master/pyhealth/tasks/readmission_prediction.py) and possibly in other tasks too.
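
To illustrate why visit order matters for a readmission-style label, here is a simplified sketch. It is not PyHealth's code: `readmission_labels` and `time_window_days` are hypothetical names, and it assumes the label is derived from the gap between each visit and the next one in the stored list. The dates are made up.

```python
from datetime import datetime


def readmission_labels(encounter_times, time_window_days=15):
    """Label each visit 1 if the next stored visit starts within the window.

    Hypothetical helper for illustration only; it trusts the list order.
    """
    labels = []
    for current, following in zip(encounter_times, encounter_times[1:]):
        gap_days = (following - current).days
        labels.append(1 if gap_days < time_window_days else 0)
    return labels


# Two admissions roughly five months apart (made-up dates).
chronological = [datetime(2100, 1, 1), datetime(2100, 6, 1)]
out_of_order = list(reversed(chronological))

labels_ok = readmission_labels(chronological)   # gap is 151 days
labels_bad = readmission_labels(out_of_order)   # gap is -151 days
print(labels_ok, labels_bad)
```

With the visits out of order the gap becomes negative, so it falls below the window and the visit is labeled as a readmission even though the two admissions are months apart.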

The issue is in the mimic3.py file under https://github.com/sunlabuiuc/PyHealth/blob/master/pyhealth/datasets/mimic3, in the basic_unit(p_id, p_info) function (line 105):

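On line 114, the groupby breaks the date order of the visits (by default, pandas iterates groups sorted by the group key, so the visits come out ordered by HADM_ID rather than by ADMITTIME):
for v_id, v_info in p_info.groupby("HADM_ID"):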
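
The reordering can be reproduced on a toy frame. This is a minimal sketch with made-up values, not MIMIC data; only the column names `HADM_ID` and `ADMITTIME` come from the code above.

```python
import pandas as pd

# Toy admissions table: the earlier admission has the larger HADM_ID.
p_info = pd.DataFrame(
    {
        "HADM_ID": [200, 100],
        "ADMITTIME": ["2100-01-01 00:00:00", "2100-06-01 00:00:00"],
    }
)

# Default groupby (sort=True) iterates groups in ascending key order.
default_order = [int(v_id) for v_id, _ in p_info.groupby("HADM_ID")]

# sort=False keeps groups in order of first appearance in the frame.
appearance_order = [
    int(v_id) for v_id, _ in p_info.groupby("HADM_ID", sort=False)
]

print(default_order)     # [100, 200]
print(appearance_order)  # [200, 100]
```

The default iteration puts the June admission (HADM_ID 100) before the January one (HADM_ID 200), which is the behavior described above.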

I fixed it by changing two lines of code:

  1. p_info = p_info.sort_values(by="ADMITTIME") (to make sure the visits are ordered by their date; it might work without this line as well)
  2. for _, row in p_info.iterrows(): (instead of the groupby)

def basic_unit(p_id, p_info):
    # sort admissions chronologically before building the visits
    p_info = p_info.sort_values(by="ADMITTIME")

    patient = Patient(
        patient_id=p_id,
        birth_datetime=strptime(p_info["DOB"].values[0]),
        death_datetime=strptime(p_info["DOD_HOSP"].values[0]),
        gender=p_info["GENDER"].values[0],
        ethnicity=p_info["ETHNICITY"].values[0],
    )
    # load visits: iterate over the sorted rows instead of groupby("HADM_ID")
    for _, row in p_info.iterrows():
        visit = Visit(
            visit_id=row["HADM_ID"],
            patient_id=p_id,
            encounter_time=strptime(row["ADMITTIME"]),
            discharge_time=strptime(row["DISCHTIME"]),
            discharge_status=row["HOSPITAL_EXPIRE_FLAG"],
            insurance=row["INSURANCE"],
            language=row["LANGUAGE"],
            religion=row["RELIGION"],
            marital_status=row["MARITAL_STATUS"],
            ethnicity=row["ETHNICITY"],
        )
        # add visit
        patient.add_visit(visit)

    return patient
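
If keeping the groupby is preferred, one possible alternative is to sort by ADMITTIME first and then group with sort=False, which yields groups in order of first appearance. This is a sketch on made-up data, not a tested patch for PyHealth; `visits_in_admission_order` is a hypothetical helper name.

```python
import pandas as pd


def visits_in_admission_order(p_info: pd.DataFrame) -> list:
    """Return HADM_IDs ordered by ADMITTIME (hypothetical helper)."""
    # Parse the timestamps so the sort is chronological, not lexical.
    ordered = p_info.assign(
        ADMITTIME=pd.to_datetime(p_info["ADMITTIME"])
    ).sort_values("ADMITTIME")
    # sort=False keeps the groups in the order the sorted rows appear.
    return [int(v_id) for v_id, _ in ordered.groupby("HADM_ID", sort=False)]


# Made-up admissions, deliberately stored out of chronological order.
toy = pd.DataFrame(
    {
        "HADM_ID": [300, 100, 200],
        "ADMITTIME": [
            "2100-03-01 00:00:00",
            "2100-05-01 00:00:00",
            "2100-01-01 00:00:00",
        ],
    }
)

visit_order = visits_in_admission_order(toy)
print(visit_order)  # [200, 300, 100]
```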

If you would like visual proof, we can discuss it further via email; I did not want to upload MIMIC-III data here since it is sensitive.
