Takes ~9% of runtime in mode 1. A quick test using the underlying numpy arrays seems to be much faster, but I still need to confirm that the numbers are the same.
result = np.full(len(df), p[f'baseline_odds_of_healthcareseeking_{subgroup}'], dtype=float)

# Pull out the underlying numpy arrays once, to avoid pandas indexing overhead
age_years = df.age_years.values
sex = df.sex.values
region = df.region_of_residence.values
li_urban = df.li_urban.values
li_wealth = df.li_wealth.values

# Predict behaviour due to the 'average symptom'
if subgroup == 'children':
    result[age_years >= 5] *= p['odds_ratio_children_age_5to14']
elif subgroup == 'adults':
    result[(age_years >= 35) & (age_years < 60)] *= p['odds_ratio_adults_age_35to59']
    result[age_years >= 60] *= p['odds_ratio_adults_age_60plus']
result[li_urban] *= p[f'odds_ratio_{subgroup}_setting_urban']
result[sex == 'F'] *= p[f'odds_ratio_{subgroup}_sex_Female']
result[region == 'Central'] *= p[f'odds_ratio_{subgroup}_region_Central']
result[region == 'Southern'] *= p[f'odds_ratio_{subgroup}_region_Southern']
result[(li_wealth == 4) | (li_wealth == 5)] *= p[f'odds_ratio_{subgroup}_wealth_higher']

# Predict for symptom-specific odds ratios
for symptom, odds in care_seeking_odds_ratios.items():
    has_symptom = df[f'sy_{symptom}'].values > 0
    result[has_symptom] *= odds

# Convert odds to probabilities
result = 1 / (1 + 1 / result)
result = pd.Series(result, index=df.index)

# If a random number generator is supplied, provide boolean outcomes, not probabilities
if rng is not None:
    outcome = rng.random_sample(len(result)) < result
    return outcome
else:
    return result
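One way to confirm the numbers are the same is to run a pandas-masked version and a numpy-masked version on the same inputs and compare the outputs. The following is only a rough sketch on a toy population: the function names `odds_pandas` and `odds_numpy`, the odds values, and the reduced set of columns are made up for illustration and are not the model's actual code or parameters.

```python
import numpy as np
import pandas as pd


def odds_pandas(df, base, or_urban, or_60plus):
    """Reference version: boolean-mask updates on a pandas Series."""
    result = pd.Series(base, index=df.index, dtype=float)
    result[df.li_urban] *= or_urban
    result[df.age_years >= 60] *= or_60plus
    return 1 / (1 + 1 / result)  # odds -> probability


def odds_numpy(df, base, or_urban, or_60plus):
    """Faster version: the same updates on the underlying numpy arrays."""
    result = np.full(len(df), base, dtype=float)
    result[df.li_urban.values] *= or_urban
    result[df.age_years.values >= 60] *= or_60plus
    return pd.Series(1 / (1 + 1 / result), index=df.index)


# Toy population with made-up values (hypothetical, not model parameters)
gen = np.random.RandomState(0)
n = 1000
df = pd.DataFrame({
    'li_urban': gen.random_sample(n) < 0.3,
    'age_years': gen.randint(15, 90, size=n),
})

expected = odds_pandas(df, base=0.5, or_urban=1.2, or_60plus=0.8)
actual = odds_numpy(df, base=0.5, or_urban=1.2, or_60plus=0.8)

# Raises AssertionError if the two versions disagree beyond float tolerance
pd.testing.assert_series_equal(expected, actual)
```

For the boolean outcomes, the same comparison should hold as long as both versions are given an rng seeded identically, since each draws `len(result)` random numbers exactly once.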