Leverage a large language model (LLM) to classify public comments by topic, sentiment, and intent; from the Office of Head Start in the HHS Administration for Children and Families.

HHS/acf-ohs-nprm-2024-18279

acf-ohs-nprm-2024-18279

Analysis of public comments received on the proposed rule on Supporting the Head Start Workforce and Consistent Quality Programming.

The purpose of open-sourcing this repository is to be transparent about how AI was used to help analyze public comments efficiently, and to provide a starting point for others who would like to explore using commercial large language models (LLMs) to aid the public comment analysis process.

Links to Project Documentation:

How to Run: Outlines how to replicate the project and run the files in this repo.
Technical Documentation: Full technical documentation for this project including technical considerations for future project iterations and the rationale behind some of our choices.
Cloud Architecture: A detailed outline of how we structured our cloud infrastructure.
Lessons Learned: A collection of lessons learned from the Policy Team and the Data Surge Team.

Folder explanation:

inputs/: Should hold the pickle file and the file used for bill tagging.

json_outputs/: Holds one output file for each chunk of text that is sent to ChatGPT with a prompt.

logs/: Log files are created when you run data_processing.py and gpt_parallel.py. Logs are timestamped; they indicate whether there were any issues with particular comments when sending them to ChatGPT, and how long each script took to run.

outputs/: Holds an "intermediate" folder and a "final" folder. The "intermediate" folder holds the chunked pickle file created partway through the pipeline. The "final" folder holds the final CSVs exported in long and wide formats, as well as failed_jsons_files.csv and the summary documents.
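
The folder layout above can be sketched in Python as follows. This is a minimal illustration, not the repository's actual code: the folder names come from the list above, but the helper functions (`make_layout`, `make_logger`, `save_chunk_result`), the log file naming scheme, and the JSON file naming scheme are all assumptions made for the example.

```python
import json
import logging
from datetime import datetime
from pathlib import Path

# Folder names are taken from the list above; everything else is illustrative.
FOLDERS = ["inputs", "json_outputs", "logs", "outputs/intermediate", "outputs/final"]


def make_layout(root: Path) -> None:
    """Create the expected folders under `root` if they do not exist yet."""
    for name in FOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)


def make_logger(root: Path, script_name: str) -> logging.Logger:
    """Return a logger writing to a timestamped file in logs/ (hypothetical naming)."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    handler = logging.FileHandler(root / "logs" / f"{script_name}_{stamp}.log")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(f"{script_name}_{stamp}")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


def save_chunk_result(root: Path, comment_id: str, chunk_index: int, result: dict) -> Path:
    """Write one JSON file per chunk into json_outputs/ (hypothetical file naming)."""
    path = root / "json_outputs" / f"{comment_id}_chunk{chunk_index}.json"
    path.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return path
```

Calling `make_layout(Path("."))` before running the scripts would ensure that every folder exists. Writing one file per chunk means a failed chunk can be identified and retried without re-running the others.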
