Commit
Khemarato Bhikkhu committed Jun 22, 2023
1 parent 7e4d140, commit cc76b30
Showing 7 changed files with 160 additions and 21 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
name: Archive.org Saver | ||
on: | ||
  workflow_dispatch: | ||
  schedule: | ||
    - cron: "40 3 15 5,11 *" | ||
jobs: | ||
  Archive: | ||
    runs-on: ubuntu-latest | ||
    steps: | ||
      - name: Checkout the Code | ||
        uses: actions/checkout@v3 | ||
        with: | ||
          ref: main | ||
      - name: Install Dependencies | ||
        run: | | ||
          cd ~ | ||
          printf "${{ secrets.ARCHIVE_ORG_AUTH }}" > archive.org.auth | ||
          pip install tqdm | ||
      - name: Run the Site Archiver | ||
        run: | | ||
          python scripts/archive_site.py | ||
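
The `schedule` trigger above uses the cron expression `40 3 15 5,11 *`, which reads as 03:40 on the 15th of May and of November; GitHub Actions evaluates schedules in UTC. As a minimal sketch (not part of the commit), the following Python spells out the run times that expression implies. The constant and function names are illustrative assumptions, and the helper handles only this one expression rather than cron syntax in general.

```python
from datetime import datetime, timezone

# Fields of the cron expression "40 3 15 5,11 *" from the workflow.
# These names are hypothetical; they do not appear in the commit.
MINUTE, HOUR, DAY_OF_MONTH, MONTHS = 40, 3, 15, (5, 11)


def next_runs(after, count=3):
    """Return the next `count` scheduled times strictly after `after`.

    `after` must be a timezone-aware datetime; results are in UTC.
    """
    runs = []
    year = after.year
    while len(runs) < count:
        for month in MONTHS:
            candidate = datetime(
                year, month, DAY_OF_MONTH, HOUR, MINUTE, tzinfo=timezone.utc
            )
            # Keep only times in the future, and stop once we have enough.
            if candidate > after and len(runs) < count:
                runs.append(candidate)
        year += 1
    return runs
```

For example, `next_runs(datetime(2023, 6, 22, tzinfo=timezone.utc))` yields 15 November 2023, 15 May 2024, and 15 November 2024, each at 03:40 UTC.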
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
--- | ||
title: "MN 126 Bhūmija Sutta: With Bhūmija" | ||
translator: sujato | ||
slug: "mn126" | ||
external_url: "https://suttacentral.net/mn126/en/sujato" | ||
drive_links: | ||
- "https://drive.google.com/file/d/1-wtKLyWIOCRvOPFYMJ2L4VT_NK-Fnsvv/view?usp=drivesdk" | ||
course: thought | ||
tags: | ||
- imagery | ||
- path | ||
- mn | ||
year: 2018 | ||
pages: 4 | ||
--- | ||
|
||
> heaping sand in a bucket, sprinkling it thoroughly with water, and pressing it out. But by doing this, they couldn’t extract any oil, regardless of whether they made a wish | ||
It's not wishing for *nibbāna* that leads there, but rather putting in the intelligent effort required to walk the path. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
import re | ||
| ||
# download the latest lychee output from GitHub | ||
input_file = "lycheeout.txt" | ||
output_file = "urls.txt" | ||
| ||
# Regular expression pattern to match the desired URLs | ||
pattern = r"✔ \[200\] (https?://\S+)" | ||
| ||
# Open the input and output files | ||
with open(input_file, "r") as f_in, open(output_file, "w") as f_out: | ||
    # Read each line from the input file | ||
    for line in f_in: | ||
        # Find the URLs matching the pattern | ||
        match = re.search(pattern, line) | ||
        if match: | ||
            url = match.group(1) | ||
            # Write the URL to the output file | ||
            f_out.write(url + "\n") | ||
| ||
print("URLs extracted and saved to 'urls.txt'.") | ||
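
The extraction step in the script above can also be exercised without touching the filesystem. The sketch below reuses the script's pattern verbatim but wraps it in a function; the function name and the sample lines are assumptions made for illustration, and the sample lines only imitate the `✔ [200] <url>` shape that the pattern implies, not actual lychee output.

```python
import re

# Same pattern as in the script: a check mark, a 200 status, then the URL.
OK_LINE = re.compile(r"✔ \[200\] (https?://\S+)")


def extract_ok_urls(lines):
    """Return the URLs from lines that report a 200 status, in input order."""
    urls = []
    for line in lines:
        match = OK_LINE.search(line)
        if match:
            urls.append(match.group(1))
    return urls


# Hypothetical input lines, made up for this example.
sample = [
    "✔ [200] https://example.com/a.pdf\n",
    "✗ [404] https://example.com/missing\n",
    "unrelated line\n",
]
print(extract_ok_urls(sample))
```

Keeping the matching logic in a function like this makes it possible to check the pattern against a few known lines before running it over a full report.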
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
import re | ||
| ||
input_file = "urls.txt" | ||
output_file = "filteredurls.txt" | ||
| ||
# Regular expression patterns | ||
exclude_pattern = r"https?://(web\.)?archive\.org" | ||
include_pattern = r"(https?://(?!.*archive\.org)\S*?(\.html?|\.mp3|\.pdf)|https?://\S*?/download\S*)" | ||
| ||
# Set to store unique URLs | ||
unique_urls = set() | ||
| ||
# Open the input file | ||
with open(input_file, "r") as f_in: | ||
    # Read each line from the input file | ||
    for line in f_in: | ||
        # Exclude URLs matching the exclude pattern | ||
        if re.search(exclude_pattern, line): | ||
            continue | ||
| ||
        # Find the URLs matching the include pattern | ||
        match = re.search(include_pattern, line) | ||
        if match: | ||
            url = match.group(0) | ||
            # Add the URL to the set | ||
            unique_urls.add(url) | ||
| ||
# Open the output file | ||
with open(output_file, "w") as f_out: | ||
    # Write the unique URLs to the output file | ||
    for url in unique_urls: | ||
        f_out.write(url + "\n") | ||
| ||
print("Filtered URLs (with duplicates removed) extracted and saved to 'filteredurls.txt'.") | ||
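
To see what the two patterns in the script above keep and drop, the filtering can be sketched as a pure function. The patterns are copied from the script unchanged. The function name and sample URLs are assumptions for illustration, and one detail differs deliberately from the commit: a `dict` stands in for the `set` so that the output order follows the input order, whereas the script's set gives no ordering guarantee.

```python
import re

# Patterns copied from the script.
EXCLUDE = re.compile(r"https?://(web\.)?archive\.org")
INCLUDE = re.compile(
    r"(https?://(?!.*archive\.org)\S*?(\.html?|\.mp3|\.pdf)|https?://\S*?/download\S*)"
)


def filter_urls(lines):
    """Return unique archivable URLs, preserving first-seen order."""
    seen = {}  # dict keys act as an insertion-ordered set
    for line in lines:
        # Skip anything that already points at archive.org.
        if EXCLUDE.search(line):
            continue
        match = INCLUDE.search(line)
        if match:
            seen.setdefault(match.group(0), None)
    return list(seen)


# Hypothetical input, made up for this example.
sample = [
    "https://web.archive.org/web/2020/https://example.com/a.pdf\n",
    "https://example.com/talk.mp3\n",
    "https://example.com/talk.mp3\n",
    "https://example.com/about\n",
    "https://example.org/download/book\n",
]
print(filter_urls(sample))
```

One property worth knowing when reading the include pattern: because `\S*?` is lazy and the pattern ends at the file extension, anything after the first matching extension is cut off, so `https://example.com/page.html?x=1` is recorded as `https://example.com/page.html`.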