Maternal health refers to the health of women during pregnancy, childbirth, and the postpartum period. Despite significant progress over the past 20 years, 295 000 women died during or after pregnancy and childbirth in 2017. This number is unacceptably high. The most frequent direct causes of maternal injury and death are excessive blood loss, infection, and high blood pressure; indirect causes include anaemia and malaria. Most maternal deaths are preventable with timely management by skilled professionals working together across different disciplines.
Imagine you are a data scientist working alongside clinicians. You are interested in analysing maternal health data and identifying evidence-based actions with the aim of improving health outcomes. Produce a report answering the following questions.
- Build and fit a linear model in which the response variable is Systolic BP and the explanatory variable(s) are of your choice. Explain why you chose these explanatory variables.
- Apply principal component analysis (PCA) to reduce the number of variables.
- Investigate the relationship between age and heart rate by grouping patients by age. Provide a graphical representation. (Hint: calculate the mean heart rate for each group.) Explain your choice of age intervals.
Blood pressure is recorded as two numbers: the systolic pressure and the diastolic pressure. Assume that high systolic blood pressure starts at 140 and high diastolic blood pressure starts at 90. Normal systolic pressure is in the range (110, 140), and normal diastolic pressure (the lower of the two numbers) is in the range (70, 90).
- Describe how you would investigate associations between the high/high, normal/normal, and low/low pairs of systolic and diastolic blood pressure. Calculate and interpret the following: (1) support, (2) confidence, (3) conviction, (4) lift.
- Find clusters of patients with similar Systolic BP.
- Calculate the correlation between age and Systolic BP.
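
As a starting point for the linear-model and correlation tasks, the following is a minimal sketch of a one-variable least-squares fit together with the Pearson correlation coefficient. The numbers are made-up illustrative values, not rows from the maternal health data, and the choice of age as the single explanatory variable is only an assumption for the example; the report should justify its own choice.

```python
from math import sqrt

# Made-up illustrative values, NOT taken from the maternal health dataset.
age = [19, 23, 25, 29, 32, 35, 40, 48]
systolic_bp = [100, 110, 120, 115, 130, 125, 140, 150]


def mean(xs):
    return sum(xs) / len(xs)


def centred_sums(x, y):
    """Return the centred sums of products and squares (Sxy, Sxx, Syy)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy, sxx, syy


sxy, sxx, syy = centred_sums(age, systolic_bp)

# Pearson correlation between age and systolic BP.
r = sxy / sqrt(sxx * syy)

# Ordinary least squares for: systolic_bp = intercept + slope * age.
slope = sxy / sxx
intercept = mean(systolic_bp) - slope * mean(age)

# Coefficient of determination from the residual sum of squares.
fitted = [intercept + slope * a for a in age]
ss_res = sum((y - f) ** 2 for y, f in zip(systolic_bp, fitted))
r_squared = 1 - ss_res / syy

print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r_squared:.3f}")
```

With a single explanatory variable and an intercept, R^2 equals the square of the correlation coefficient, which gives a quick consistency check between the two tasks.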
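
For the PCA task, one possible approach is sketched below: standardise the numeric columns, take a singular value decomposition, and keep the leading components. The four columns (age, systolic BP, diastolic BP, heart rate) and all values are assumptions made for illustration; the real dataset may contain other variables.

```python
import numpy as np

# Made-up rows: [age, systolic BP, diastolic BP, heart rate]. Not real data.
X = np.array([
    [22, 110, 70, 72],
    [28, 120, 80, 76],
    [35, 140, 90, 80],
    [41, 135, 85, 70],
    [19, 100, 65, 88],
    [50, 150, 95, 66],
], dtype=float)

# Standardise each column so that variables on larger scales do not dominate.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# The rows of Vt are the principal directions; singular values come sorted descending.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# Variance explained by each component, and its share of the total.
explained_variance = s ** 2 / (len(Z) - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Project onto the first k components (k = 2 is an arbitrary choice here).
k = 2
scores = Z @ Vt[:k].T

print("explained variance ratio:", np.round(explained_ratio, 3))
print("scores shape:", scores.shape)
```

The number of components to keep is a judgement call; a common heuristic is to retain enough components to cover a chosen share of the total variance and to state that threshold in the report.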
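
The age-grouping hint can be sketched as follows. The age bands are hypothetical and chosen only to make the example work; the report should explain its own intervals (for instance, bands that keep a reasonable number of patients in each group). The (age, heart rate) pairs are made-up values.

```python
from collections import defaultdict

# Made-up (age, heart_rate) pairs, NOT taken from the maternal health dataset.
records = [(17, 80), (19, 76), (23, 70), (28, 78),
           (33, 66), (37, 70), (45, 88), (52, 76)]

# Hypothetical age bands, inclusive on the left and exclusive on the right.
bands = [(10, 20), (20, 30), (30, 40), (40, 60)]


def band_label(age):
    """Return a label such as '20-29' for the band containing age, or None."""
    for lo, hi in bands:
        if lo <= age < hi:
            return f"{lo}-{hi - 1}"
    return None


# Collect heart rates per band, skipping ages outside every band.
groups = defaultdict(list)
for age, heart_rate in records:
    label = band_label(age)
    if label is not None:
        groups[label].append(heart_rate)

# Mean heart rate per age band.
mean_hr = {label: sum(values) / len(values) for label, values in groups.items()}

for label, value in mean_hr.items():
    print(f"{label}: {value:.1f}")
```

The resulting dictionary maps each band to its mean heart rate, which can then be passed to a bar chart for the graphical representation.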
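
For the association task, the four measures can be computed for a rule of the form "systolic level A implies diastolic level B". With n patients, n_A in systolic level A, n_B in diastolic level B, and n_AB in both: support = n_AB / n, confidence = n_AB / n_A, lift = confidence / (n_B / n), and conviction = (1 - n_B / n) / (1 - confidence). The sketch below assumes cut-offs of 110 and 140 for systolic and 70 and 90 for diastolic pressure, treats values below the lower cut-off as low, and treats the boundary values themselves as shown in the comments; these boundary choices are assumptions, and the readings are made up.

```python
# Made-up (systolic, diastolic) readings, NOT taken from the maternal health dataset.
readings = [(150, 95), (145, 92), (142, 85), (120, 80), (125, 75),
            (118, 92), (100, 60), (105, 65), (130, 85), (160, 100)]


def categorise(value, low_cut, high_cut):
    """Assumed rule: >= high_cut is high, < low_cut is low, otherwise normal."""
    if value >= high_cut:
        return "high"
    if value < low_cut:
        return "low"
    return "normal"


# Label each reading as a (systolic level, diastolic level) pair.
labelled = [(categorise(s, 110, 140), categorise(d, 70, 90)) for s, d in readings]


def rule_metrics(pairs, level):
    """Metrics for the rule: systolic == level  =>  diastolic == level."""
    n = len(pairs)
    n_a = sum(1 for s, _ in pairs if s == level)
    n_b = sum(1 for _, d in pairs if d == level)
    n_ab = sum(1 for s, d in pairs if s == level and d == level)
    if n_a == 0 or n_b == 0:
        raise ValueError(f"no readings at level {level!r} on one side of the rule")
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)
    # Conviction is unbounded when the rule never fails (confidence == 1).
    conviction = float("inf") if confidence == 1 else (1 - n_b / n) / (1 - confidence)
    return {"support": support, "confidence": confidence,
            "lift": lift, "conviction": conviction}


results = {level: rule_metrics(labelled, level) for level in ("high", "normal", "low")}
for level, metrics in results.items():
    print(level, metrics)
```

A lift above 1 suggests the two levels occur together more often than they would if they were independent, and a larger conviction means the rule fails less often than independence would predict.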
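
The clustering task does not name a method. One simple option, used here purely as an assumption, is k-means (Lloyd's algorithm) on the single Systolic BP variable. The sketch uses made-up readings, a hypothetical choice of k = 3, and a deterministic initialisation so that the result is reproducible.

```python
# Made-up systolic readings, NOT taken from the maternal health dataset.
systolic = [90, 95, 100, 118, 120, 122, 125, 140, 145, 150, 160]


def kmeans_1d(values, k, n_iter=100):
    """One-dimensional k-means; returns (centres, clusters)."""
    # Deterministic start: centres at evenly spaced positions in the sorted data.
    ordered = sorted(values)
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Update step: each centre moves to the mean of its cluster.
        new_centres = [sum(c) / len(c) if c else centres[j]
                       for j, c in enumerate(clusters)]
        if new_centres == centres:
            break  # converged: assignments no longer change the centres
        centres = new_centres
    return centres, clusters


centres, clusters = kmeans_1d(systolic, k=3)
for centre, members in zip(centres, clusters):
    print(f"centre {centre:.2f}: {members}")
```

The number of clusters is a modelling choice; comparing the within-cluster spread for several values of k is one way to justify it in the report.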