Commit

i #284 Edited download_pipermail() and Added refresh_pipermail() and process_gz_to_mbox_in_folder()

- download_pipermail: Attempts to download the .txt file first. If it is unavailable, falls back to .gz. When the .gz file is used, unzips it and writes the output as .mbox
- Added log messages
- download_pipermail: Added timeout parameter to handle servers that take too long to respond
- Added refresh_pipermail function
- Updated vignettes/download_mail.Rmd to include refresh_pipermail
- Added process_gz_to_mbox_in_folder function
daomcgill committed Sep 17, 2024
1 parent 69ca163 commit b9a886b
Showing 10 changed files with 197 additions and 463 deletions.
3 changes: 1 addition & 2 deletions NAMESPACE
@@ -5,7 +5,6 @@ export(assign_exact_identity)
export(bipartite_graph_projection)
export(commit_message_id_coverage)
export(community_oslom)
-export(convert_pipermail_to_mbox)
export(dependencies_to_sdsmj)
export(download_bugzilla_perceval_rest_issue_comments)
export(download_bugzilla_perceval_traditional_issue_comments)
@@ -133,13 +132,13 @@ export(parse_r_dependencies)
export(parse_r_function_definition)
export(parse_r_function_dependencies)
export(parse_rfile_ast)
+export(process_gz_to_mbox_in_folder)
export(query_src_text)
export(query_src_text_class_names)
export(query_src_text_namespace)
export(read_temporary_file)
export(recolor_network_by_community)
export(refresh_jira_issues)
-export(refresh_mod_mbox)
export(refresh_pipermail)
export(smell_missing_links)
export(smell_organizational_silo)
501 changes: 127 additions & 374 deletions R/mail.R

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions conf/helix.yml
@@ -69,8 +69,8 @@ mailing_list:
project_key_2:
# archive_url: https://mta.openssl.org/mailman/listinfo/
mailing_list: https://mta.openssl.org/pipermail/openssl-project/
-start_year_month: 201903
-end_year_month: 202103
+start_year_month: 202203
+end_year_month: 202303
save_folder_path: "../save_folder_mail_2"

issue_tracker:
17 changes: 0 additions & 17 deletions man/convert_pipermail_to_mbox.Rd

This file was deleted.

15 changes: 10 additions & 5 deletions man/download_pipermail.Rd

Some generated files are not rendered by default.

23 changes: 23 additions & 0 deletions man/process_gz_to_mbox_in_folder.Rd

Some generated files are not rendered by default.

39 changes: 0 additions & 39 deletions man/refresh_mod_mbox.Rd

This file was deleted.

36 changes: 15 additions & 21 deletions man/refresh_pipermail.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion tools.yml
@@ -7,7 +7,7 @@ refactoring_miner: ~/RefactoringMiner-1.0/bin/RefactoringMiner
# https://github.com/boyter/scc
scc: ~/scc/scc
# universal-ctags
-utags: /usr/local/Cellar/universal-ctags/p6.1.20240901.0/bin/ctags
+utags: /usr/local/Cellar/universal-ctags/HEAD-40b5861/bin/ctags
# https://archdia.com/
dv8: /Applications/DV84/bin/dv8-console
# OSLOM: http://oslom.org/
20 changes: 18 additions & 2 deletions vignettes/download_mail.Rmd
@@ -64,9 +64,8 @@ save_folder_path <- conf[["mailing_list"]][["pipermail"]][["project_key_1"]][["s
- end_year_month: The ending date for downloading archives (in YYYYMM format).
- save_folder_path: The local directory where the downloaded archives will be saved.


# Pipermail Downloader

Use the download_pipermail() function to download the archives and save them as .mbox files in the specified directory. Each file is named kaiaulu_YYYYMM.mbox, where YYYYMM is the year and month of the archive.
```{r}
# Download archives
download_pipermail(
@@ -79,3 +78,20 @@ download_pipermail(
```
After running this function, the .mbox files will be saved in the specified directory with filenames like kaiaulu_202310.mbox, kaiaulu_202311.mbox, etc.
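
The naming scheme and the .txt-then-.gz fallback described in the commit message can be illustrated with a small sketch. This is not Kaiaulu's implementation: `pipermail_month_plan` is a hypothetical helper, and the `YYYY-Month.txt` / `YYYY-Month.txt.gz` URL layout is an assumption about how Pipermail archives are usually published. The function performs no network access; it only computes which URLs would be tried and where the result would be saved.

```r
# Hypothetical helper (not part of Kaiaulu): maps a YYYYMM string to the
# archive URLs to try, in order, and to the local .mbox path.
pipermail_month_plan <- function(mailing_list, year_month, save_folder_path) {
  stopifnot(grepl("^[0-9]{6}$", year_month))
  year <- substr(year_month, 1, 4)
  month <- as.integer(substr(year_month, 5, 6))
  stopifnot(month >= 1, month <= 12)

  # Assumed Pipermail layout: <list url>/<YYYY>-<MonthName>.txt[.gz]
  # month.name is base R's built-in vector of English month names.
  base <- paste0(sub("/+$", "", mailing_list), "/", year, "-", month.name[month])

  list(
    txt_url = paste0(base, ".txt"),     # tried first
    gz_url  = paste0(base, ".txt.gz"),  # fallback if the .txt is unavailable
    dest    = file.path(save_folder_path, paste0("kaiaulu_", year_month, ".mbox"))
  )
}

plan <- pipermail_month_plan(
  mailing_list = "https://mta.openssl.org/pipermail/openssl-project/",
  year_month = "202310",
  save_folder_path = "../save_folder_mail_2"
)
print(plan$dest)
```

Keeping the URL and filename computation separate from the download itself makes the fallback order easy to inspect before any request is sent.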

# Pipermail Refresher
A mailing list keeps receiving messages after you download it, so the most recent month on disk may be incomplete and newer months may be missing. The refresh_pipermail() function automates bringing the local archive up to date.

## How refresh_pipermail Works

1. Checks if the folder is empty: If the folder is empty, it downloads archives from start_year_month to the current month using download_pipermail().
2. Finds the most recent file: If the folder is not empty, the function finds the most recent month’s file (based on the filename) and deletes it.
3. Redownloads from the most recent month: The function then redownloads the archives from that month up to the current month.
```{r}
# Refresh archives
refresh_pipermail(
mailing_list = mailing_list,
start_year_month = start_year_month,
save_folder_path = save_folder_path
)
```
This keeps the local archive current: the function redownloads the most recent month already saved, which may have been incomplete, and downloads any newer months that have since been added to the mailing list.
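
The three refresh steps can be sketched as a pure planning function. This is an illustrative sketch, not the code in R/mail.R: `year_month_seq` and `refresh_plan` are hypothetical names, and it assumes that "empty folder" means "no kaiaulu_YYYYMM.mbox files present". It decides which file to delete and which months to download, without touching the disk or the network.

```r
# Hypothetical helper: every YYYYMM from `from` to `to`, inclusive.
year_month_seq <- function(from, to) {
  start <- as.Date(paste0(from, "01"), format = "%Y%m%d")
  end <- as.Date(paste0(to, "01"), format = "%Y%m%d")
  if (start > end) return(character(0))
  format(seq(start, end, by = "month"), "%Y%m")
}

# Hypothetical planner mirroring the three steps described above.
refresh_plan <- function(existing_files, start_year_month,
                         current_year_month = format(Sys.Date(), "%Y%m")) {
  # Extract the YYYYMM part of names ending in <YYYYMM>.mbox;
  # files that do not match are ignored.
  hits <- regexpr("[0-9]{6}(?=\\.mbox$)", existing_files, perl = TRUE)
  months_on_disk <- regmatches(existing_files, hits)

  # Step 1: nothing saved yet, so download the full range.
  if (length(months_on_disk) == 0) {
    return(list(delete = character(0),
                download = year_month_seq(start_year_month, current_year_month)))
  }

  # Steps 2 and 3: drop the latest month and redownload from there.
  # Fixed-width YYYYMM strings sort chronologically, so max() is safe.
  latest <- max(months_on_disk)
  list(delete = paste0("kaiaulu_", latest, ".mbox"),
       download = year_month_seq(latest, current_year_month))
}

plan_existing <- refresh_plan(
  c("kaiaulu_202406.mbox", "kaiaulu_202407.mbox"),
  start_year_month = "202401",
  current_year_month = "202409"
)
plan_empty <- refresh_plan(character(0), "202407", "202409")
print(plan_existing)
```

Deleting and redownloading the latest saved month is what lets a partially downloaded month be completed on the next refresh.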
