Data Architect (and former Data Scientist) and RAP Advocate
I'm an experienced data scientist, having worked in both academia (Physics, Astronomy, Galaxy Modelling, Material Science, Neutron Diffraction, Cell Biology, Genomics) and in the NHS (Secondary Care, Primary Care, Deaths, Mental Health, COVID), using a variety of data science (mostly machine learning and deep learning) techniques.
Currently, I'm mainly focussed on leading the NHSE RAP squad, which is working to upskill the other parts of the analytical community, making resources, training others, and raising awareness of how concepts from DevOps can improve the lives of analysts. Learn more on our website.
I've also recently developed an interest in LLMs, and produced a few tutorials to help people get started:
- RAG (Retrieval Augmented Generation): an LLM which looks things up in a database before responding - a cheap and easy way of making it seem like an LLM has local knowledge
- RAG with sources: shows you how to get the LLM to give sources for its claims, and generally how to have more control over the prompts used in the pipeline.
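
A minimal sketch of the idea behind both tutorials, not the code from them: retrieve the most relevant entries from a small "database", then build a prompt that asks the model to cite them. Everything here is an assumption made for illustration: the `DOCUMENTS` list, the `doc-a`-style source labels, the word-overlap scoring (real pipelines usually use embeddings and a vector store), and the prompt wording. The call to an actual LLM is deliberately left out.

```python
import math
import re
from collections import Counter

# Hypothetical mini "database": each entry has a source label and some text.
DOCUMENTS = [
    {"source": "doc-a", "text": "pandas provides DataFrame objects for tabular data."},
    {"source": "doc-b", "text": "PySpark is the Python API for Apache Spark."},
    {"source": "doc-c", "text": "Keras is a high-level API for building neural networks."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents whose words overlap most with the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d["text"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Put the retrieved text in the prompt, numbered so the LLM can cite it."""
    context = "\n".join(
        f"[{i}] ({d['source']}) {d['text']}" for i, d in enumerate(retrieved, 1)
    )
    return (
        "Answer using only the numbered context below, and cite the "
        "numbers you relied on.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    question = "What is PySpark?"
    print(build_prompt(question, retrieve(question, DOCUMENTS)))
```

The resulting prompt string is what would be sent to the LLM; because the context is numbered and labelled, the answer can point back to where each claim came from.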
The tools I'm most interested in developing for:
- Python
- PySpark
- pandas
- sklearn
- statsmodels
- TensorFlow / Keras
- PyTorch
- ANSI SQL (specifically as used with Spark)
The repositories I'm making fall into a few categories:
- making data easier to use and access
- data science code (such as starter code, or explorations of techniques)
- documentation of existing health data