In this project, I'll analyze a medical appointment dataset to identify the factors that affect whether patients miss their appointments. I'll first assess and clean the data so it is easier to analyze. Then I'll perform exploratory data analysis (EDA) to understand the variables and the relationships between them, and visualize the results. The dataset contains information from more than 100k medical appointments in Brazil and focuses on whether or not patients show up for their appointments.
Questions Under Investigation:
- What's the percentage of missed appointments?
- Do SMS reminders help patients show up for their appointments?
- Does the welfare program Bolsa Familia help patients show up for their appointments?
- Which neighbourhood has the highest number of no-shows?
- Are the health condition variables (hypertension, diabetes, alcoholism, handicap) related to no-shows?
- Is age related to no-show? What is the age distribution for shows and no-shows?
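
Most of the questions above come down to the same calculation: the share of appointments that were missed, either overall or within each group of some variable. A minimal sketch of that calculation in plain Python is shown here. The field names `no_show` and `sms_received` are assumptions for illustration and may not match the dataset's actual column names, and the records are made up, not taken from the data.

```python
from collections import defaultdict


def no_show_rate(records, group_key=None):
    """Return the percentage of missed appointments.

    With no group_key, returns a single overall percentage.
    With a group_key, returns a dict mapping each group value
    to the percentage of missed appointments in that group.
    """
    if group_key is None:
        missed = sum(1 for r in records if r["no_show"])
        return 100.0 * missed / len(records)

    counts = defaultdict(lambda: [0, 0])  # group value -> [missed, total]
    for r in records:
        counts[r[group_key]][0] += int(r["no_show"])
        counts[r[group_key]][1] += 1
    return {group: 100.0 * m / t for group, (m, t) in counts.items()}


# Made-up records for illustration only (hypothetical field names).
toy = [
    {"no_show": True, "sms_received": 1},
    {"no_show": False, "sms_received": 1},
    {"no_show": False, "sms_received": 0},
    {"no_show": True, "sms_received": 0},
    {"no_show": False, "sms_received": 0},
    {"no_show": False, "sms_received": 1},
    {"no_show": False, "sms_received": 0},
    {"no_show": True, "sms_received": 1},
]

overall = no_show_rate(toy)
by_sms = no_show_rate(toy, "sms_received")
print(f"Overall no-show rate: {overall:.1f}%")
for group, rate in sorted(by_sms.items()):
    print(f"sms_received={group}: {rate:.1f}%")
```

The same helper could be reused with a welfare, neighbourhood, or health condition field as `group_key`. A difference in rates between groups only suggests an association; it does not show that the variable causes patients to miss appointments.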