Check load_s3_data still works #22
Comments
athrado
pushed a commit
that referenced
this issue
Jul 19, 2022
athrado
added a commit
that referenced
this issue
Mar 20, 2023
* S3 loading for raw EPC data and suppl data (#22) * Fix requirements * Updates for loading from S3 (#19) * Loading and downlaoding from S3 and fixes (#19, #24, #25) * Load batches (download control) (#24) * Show download options (#24) * Fix column loading (#24) * Minor loading fixes (#24) * Fix batch laoding (#24) * Remove MACOSX dirs (#24) * Minor loading fixes (#24) * Minor loading fixes (#24) * Fix paths (#24) * Minor loading fixes (#24) * Remove debug prints (#24) * Upload notebook (#24) * Flexible bucket name (#19) * Allow loading geojson files (#19) * Fix geopandas loading function (#19) * Fix issues with data loading (#19) * Minor fixes to EPC with MCS installation dates (#25) * Pull in MCS code and adjust processing functions (#28) * Add guide for processing new data (#28) * First attempt at describing feature processing - put on hold (#29) * Update gitignore and remove old files * Update doc strings (#28) * Postcode format for backwards compatibility * Fix data loading (paths) * Fix data loading (paths) * Fix data loading (encoding) * Fixes for data loading (#19) * Update notebook (#19) * Update bucket name * Batch handling * Update notebook * Make capping optional (#33) * Add geopandas (#30) * Simplify IMD data loader (#32) * Make features optional and fix duplication (#34) * Fix indixing issue (#35) * Improve loading by country (#36) * Fix loading Scotland data and features (#38) * Update Scotland feature mapping (#38) * Analysis and loading notebooks * Merging function for processed datasets (#39) * Minor fix for MCS processing * Minor fix for loading MCS data * Merging EPC and MCS data in meaningful way (#39) * Notebook for showcasing processing pipeline (#39) * Adjust postcode formatting and loading for more general use (#40) * Verbose function and doc string adjustment (#39) * Add verbose option (#39) * Get latest MCS installers batch (#39) * Minor fixes to install date cmputation (#39) * Merge all datasets (#39) * Update notebooks with latest 
version for loading and merging (#39) * ToDos and comments (#39) * Add ToDo (#39) * Showcase postcode functions (#40) * Delete how_to_process_new_data.py * Notebook update * Add Todo and descriptions (#39) * Notebook with completed outputs * Take over quarter 4 fix from #43 * Update outputs * Remove main function (#44) * Remove no longer used imports (#44) * Improve doc string (#44) * Add geojson to supported file type (#44) * Convert string to path (#44) * Remove unnecessary S3 call (#44) * Fix docstring (#44) * Fix usecols for old batches (#44) * Update column renaming (#44) * Fix loading function for certificates (#44) * Rename EPC batch function (#44) * Fix doc string (#44) * Remove print statements (#44) * Rename column (#44) * Refer to bucket via config (#44) * Main call to run from terminal (#44) * Fix docstrings and kwargs (#44) * Adjust output for final merged version (#44) * Improve postcode loading (#44) * Data formatting in function (#44) * Postcode merging (#44) * Update doc string (#44) * Remove potential duplicates in usecols (#44) * Fix input options (#44) * Shorter code (#44) * Change output to outputs/input to inputs (#44) * Updated doc string (#44) * Updated doc string (#44) * Remove old code fragments (#44) * Updates (#44) * Comment and fix reload raw data (#44) * Update comments (#44) * Update docstring (#44) * Update docstring (#44) * Update pipeline with usecol options and docstrings (#44) * Fix line break (#44) * Rename and update install date computation (#44) * Remove unnecessary gitkeeps (#44) * Update configs for MCS (#44) * Remove notebook as python file (#44) * Delete zip file after unzipping and unneccesary code removed (#44) * Update comments and docstrings (#44) * Remove duplicate loading function (#44) * Update MCS feature selection (#44) * Rename script and add necessary MCS fields (#44) * Update gitignore (#44) * Update notebooks (#44) * Remove duplicates (#44) * Merge issues (#44) * Merge issues (#44) * Remove notebook tags 
(#44) * Add kwargs (#44) * Outputs after final runs (#44) * Rename file and remove notebook tags (#44) * Remove duplicated file (#44) * Adjust kwargs for S3 loading and saving functions (#44) * Remove old function for latest batch (#44) * Adjust output path for merged file (#44) * Fix kwarg dtype (#44) * Update notebooks (#44) * Remove dupl file (#44) --------- Co-authored-by: Julia Suter <[email protected]>
While merge conflicts were being resolved, the function load_s3_data in getters/data_getters.py was changed. Check that it still works and that the change hasn't broken anything that depends on it.
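One way to run such a check without touching a real bucket is a small smoke test against an in-memory stand-in for S3. The sketch below is only an illustration. The real signature of load_s3_data is not shown in this issue, so the loader here is a hypothetical stand-in with assumed parameters (client, bucket, key). FakeS3Client, the bucket name, the keys and the column names are likewise invented for the example. In the repository, the stand-in would be replaced by the function imported from getters/data_getters.py.

```python
import csv
import io
import json


class FakeS3Client:
    """In-memory stand-in for an S3 client (hypothetical, for testing only)."""

    def __init__(self, objects):
        self._objects = objects  # maps (bucket, key) -> bytes

    def get_object(self, Bucket, Key):
        # Same response shape as boto3's get_object: a dict with a readable "Body".
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


def load_s3_data(client, bucket, key):
    """Hypothetical stand-in for the real loader; dispatches on file extension."""
    body = client.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    if key.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(body)))
    if key.endswith((".json", ".geojson")):
        return json.loads(body)
    raise ValueError(f"Unsupported file type: {key}")


def run_smoke_checks():
    """Load one object per file type and report what came back."""
    bucket = "example-bucket"  # assumed name, not the project's bucket
    client = FakeS3Client(
        {
            (bucket, "inputs/epc.csv"): b"POSTCODE,RATING\nAB1 2CD,C\n",
            (bucket, "inputs/areas.geojson"): json.dumps(
                {"type": "FeatureCollection", "features": []}
            ).encode("utf-8"),
            (bucket, "inputs/notes.txt"): b"not a supported type",
        }
    )

    results = {
        "csv": load_s3_data(client, bucket, "inputs/epc.csv"),
        "geojson": load_s3_data(client, bucket, "inputs/areas.geojson"),
    }

    # An unsupported extension should fail loudly rather than return junk.
    try:
        load_s3_data(client, bucket, "inputs/notes.txt")
        results["unsupported_raises"] = False
    except ValueError:
        results["unsupported_raises"] = True
    return results


if __name__ == "__main__":
    print(run_smoke_checks())
```

Covering one object per supported file type, plus one unsupported type, makes it more likely that a branch dropped or duplicated during the merge shows up as a failed check rather than as a silent change in behaviour.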