Merge pull request #15 from C2DH/dev
url issue fix
memerchik authored May 27, 2024
2 parents 1718ebb + 62bf924 commit 67d2e68
Showing 5 changed files with 115 additions and 57 deletions.
File renamed without changes.
2 changes: 1 addition & 1 deletion article-check-script/script.py
@@ -8,7 +8,7 @@

 def main_func(REPO_URL, PERSONAL_TOKEN, NEW_REPO_NAME):

-    f = open("../config.json")
+    f = open("./config.json")

     config_file = json.load(f)

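The hunk above swaps one working-directory-relative path for another, so the script still depends on where it is launched from. A minimal sketch of an alternative, assuming a hypothetical `load_config` helper (not part of this commit) that resolves the file against an explicit base directory and falls back to the directory of the script itself:

```python
import json
from pathlib import Path


def load_config(config_name="config.json", base_dir=None):
    """Load a JSON config file relative to base_dir.

    Hypothetical helper for illustration only. When base_dir is omitted,
    the directory containing this source file is used, so the result does
    not depend on the current working directory.
    """
    if base_dir is None:
        base_dir = Path(__file__).resolve().parent
    config_path = Path(base_dir) / config_name
    # The context manager closes the file even if json.load raises.
    with open(config_path, encoding="utf-8") as config_file:
        return json.load(config_file)
```

Called as `load_config(base_dir="..")` or `load_config()`, this would make the intended location explicit at the call site instead of encoding it in a relative string.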
71 changes: 35 additions & 36 deletions preflight.py
@@ -9,6 +9,8 @@

 BASE_URL = "https://journalofdigitalhistory.org/en/notebook-viewer/"

+FIRST_PARAGRAPH = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
+

 def encode_notebook_url(url):
     # URL-encode the string
@@ -94,42 +96,39 @@ def generate_report(output, notebook, workspace="", action_outputs={}, contents=
     print(f"::debug::generatereport output:{output} notebook:{notebook}")
     output_filepath = os.path.join(workspace, output)
     count = len(contents["cells"])
-    # Open the file for writing and handle any errors
-    try:
-        with open(output_filepath, "w", encoding="utf-8") as output_file:
-            output_file.write(
-                f"# Report for {notebook} \u2764 \n\n"
-            )  # Write "#Hello" with a heart utf8 character
-            # count cells of each type. check for empty cells
-            cell_types = {"code_empty": 0}
-            for cell in contents["cells"]:
-                cell_type = cell["cell_type"]
-                if cell_type not in cell_types:
-                    cell_types[cell_type] = 0
-                cell_types[cell_type] += 1
-                if cell_type == "code":
-                    if len(cell["source"]) == 0:
-                        cell_types["code_empty"] += 1
-            # write cell counts
-            config_file = open("config.json")
-
-            config_file_text = json.load(config_file)
-
-            config_file.close()
-            first_paragraph = str(config_file_text["first_paragraph"] + "\n\n")
-            output_file.write(first_paragraph)
-            output_file.write("## Cell Counts \n")
-            output_file.write(f"**all cells: {count}** \n")
-            for cell_type, count in cell_types.items():
-                output_file.write(f"{cell_type}: {count} \n")
-            # write every action_output
-            output_file.write("\n## Action Outputs\n")
-            for key, value in action_outputs.items():
-                output_file.write(f"{value}\n")
-    except IOError:
-        # bad
-        print(f"::error::Bad things happened when open or write to {output}!")
-        sys.exit(1)
+
+    def write_result(count):
+        try:
+            with open(output_filepath, "w", encoding="utf-8") as output_file:
+                output_file.write(
+                    f"# Report for {notebook} \u2764 \n\n"
+                )  # Write "#Hello" with a heart utf8 character
+                # count cells of each type. check for empty cells
+                cell_types = {"code_empty": 0}
+                for cell in contents["cells"]:
+                    cell_type = cell["cell_type"]
+                    if cell_type not in cell_types:
+                        cell_types[cell_type] = 0
+                    cell_types[cell_type] += 1
+                    if cell_type == "code":
+                        if len(cell["source"]) == 0:
+                            cell_types["code_empty"] += 1
+                # write cell counts
+                output_file.write(FIRST_PARAGRAPH + "\n\n")
+                output_file.write("## Cell Counts \n")
+                output_file.write(f"**all cells: {count}** \n")
+                for cell_type, count in cell_types.items():
+                    output_file.write(f"{cell_type}: {count} \n")
+                # write every action_output
+                output_file.write("\n## Action Outputs\n")
+                for key, value in action_outputs.items():
+                    output_file.write(f"{value}\n")
+                output_file.close()
+        except IOError:
+            # bad
+            write_result(count)
+
+    write_result(count)


 def main(
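In the new `write_result` helper above, the `except IOError` branch calls `write_result(count)` again, so a write failure that persists (for example a missing output directory) would keep recursing until Python's recursion limit is reached, where the previous code printed an `::error::` line and exited. A minimal sketch of a bounded alternative, assuming a hypothetical `write_with_retry` helper and arbitrary `attempts` and `delay` defaults, none of which are part of this commit:

```python
import time


def write_with_retry(path, text, attempts=3, delay=0.0):
    """Write text to path, retrying on OSError at most `attempts` times.

    Hypothetical helper for illustration only. Returns the 1-based number
    of the attempt that succeeded and re-raises the last OSError if every
    attempt fails. IOError is an alias of OSError in Python 3.
    """
    for attempt in range(1, attempts + 1):
        try:
            # The with-block closes the file; no explicit close() is needed.
            with open(path, "w", encoding="utf-8") as output_file:
                output_file.write(text)
            return attempt
        except OSError as err:
            print(f"::warning::write attempt {attempt} failed: {err}")
            if attempt == attempts:
                raise
            time.sleep(delay)
```

The caller could then catch the final `OSError`, log it, and call `sys.exit(1)` as the earlier version of `generate_report` did, which keeps the failure visible in the workflow log.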
89 changes: 73 additions & 16 deletions preflight_report.md
@@ -1,38 +1,95 @@
-# Report for example/article-altair.ipynb ❤
+# Report for example/article_urls_problem.ipynb ❤

 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

 ## Cell Counts
-**all cells: 147**
+**all cells: 111**
 code_empty: 0
-markdown: 115
-code: 32
+markdown: 69
+code: 42

 ## Action Outputs

 ### Size
-**total cells: 147**
+**total cells: 111**

 ### Check URLs


-**Impossible to verify (non-404 error code - 4):**
+> [!CAUTION]
+> **Invalid URLs are present, please review the referenced URLs list.**
+**Invalid URLs (404 - 13):**


+> [!WARNING]
+> **Invalid URL (404):** https://github.com/hepplerj/whatisdigitalhumanities.
+> [!WARNING]
+> **Invalid URL (404):** https://weltliteratur.net/dh-tools-used-in-research/,
+> [!WARNING]
+> **Invalid URL (404):** https://doi.org/10.34666/k1de-j489,
+> [!WARNING]
+> **Invalid URL (404):** https://dh-abstracts.library.virginia.edu/downloads.
+> [!WARNING]
+> **Invalid URL (404):** https://github.com/Digital-Humanities-Quarterly/dhq-journal/tree/main/data/dhq-xml.
+> [!WARNING]
+> **Invalid URL (404):** https://github.com/ZoeLeBlanc/dhq_scraper.
+> [!WARNING]
+> **Invalid URL (404):** https://github.com/jrladd/network_navigator.
+> [!WARNING]
+> **Invalid URL (404):** https://networknavigator.jrladd.com/,
+> [!WARNING]
+> **Invalid URL (404):** https://getbootstrap.com/docs/4.0/about/history/>
+> [!WARNING]
+> **Invalid URL (404):** https://github.com/EvanLi/Github-Ranking/blob/master/Top100/Top-100-stars.md>.
+> [!WARNING]
+> **Invalid URL (404):** https://raw.githubusercontent.com/melaniewalsh/sample-social-network-datasets/master/sample-datasets/quakers/quaker-edges.csv"
+> [!WARNING]
+> **Invalid URL (404):** https://networkx.org/documentation/stable/developer/about_us.html,
+> [!WARNING]
+> **Invalid URL (404):** https://graphology.github.io/,
+**Impossible to verify (non-404 error code - 6):**

 Invalid URL (Other - 301): https://orcid.org/sites/default/files/images/orcid_16x16.png
-Invalid URL (Other - 301): https://worldcat.org/identities/lccn-n50070935/
-Invalid URL (Other - 301): https://worldcat.org/identities/lccn-n50055088/
-Invalid URL (Other - 301): https://worldcat.org/identities/lccn-no90018047/
+Invalid URL (Other - 403): https://www.canva.com/ai-image-generator/
+Invalid URL (Other - 301): http://tapor.ca/tools/category/Network%20Analysis.
+Invalid URL (Other - 302): https://dh-abstracts.library.cmu.edu.
+Invalid URL (Other - 503): https://web.archive.org/web/20080912001706/https://gephi.org/
+Invalid URL (Other - 302): https://github.com/jrladd/network_navigator/pull/7>.

 > [!TIP]
 > Even if some of the urls listed above don't seem to be broken, try to replace them with the valid ones as they might become unavailable soon.

-**Valid URLs (200 - 6):**
+**Valid URLs (200 - 16):**

-2. https://orcid.org/0000-0001-8618-6800
-3. https://orcid.org/0000-0002-6951-8014
-4. https://orcid.org/0000-0002-0301-2029
-5. https://licensebuttons.net/l/by/4.0/88x31.png
-6. https://creativecommons.org/licenses/by/4.0/
-7. https://github.com/jdh-observer/jdh002-VeaK58WBs82C
+2. https://orcid.org/0000-0002-5440-062X
+3. https://orcid.org/0000-0003-2012-8805
+4. https://licensebuttons.net/l/by-nc-nd/4.0/88x31.png
+5. https://creativecommons.org/licenses/by-nc-nd/4.0/
+13. https://digitalhumanities.org/dhq/
+18. https://github.com/gephi/gephi/commit/c1a1b8316427b1e5c6e0ea5ea8031a016b7ba2c1
+22. https://stackoverflow.com/questions/29288592/bootstrap-min-css-size-from-cdn-shows-a-surprising-size-in-devtools>.
+23. https://github.com/jrladd/network_navigator/issues/3>
+27. https://felix-kling.de/jsnetworkx/
+28. https://d3js.org
+29. https://www.khronos.org/api/webgl
+30. https://d3js.org/what-is-d3#d3-is-for-bespoke-visualization
+32. https://www.sigmajs.org/
+33. http://vis.stanford.edu/papers/d3
+34. https://medialab.sciencespo.fr/en/tools/
+35. https://github.com/melaniewalsh/sample-social-network-datasets
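Many of the URLs reported above end in characters such as `.`, `,`, `>` or `"`, which look like surrounding prose or markup captured along with the link; this may contribute to some of the 404 results, though the report itself does not confirm the cause. A minimal sketch of trimming such characters before a URL is checked, assuming a hypothetical `clean_url` helper and an illustrative character set, neither of which comes from this repository:

```python
# Characters that commonly trail a URL in running text or markup.
# Illustrative choice only: ")" and "/" are deliberately left out because
# they often belong to legitimate URLs.
TRAILING_JUNK = ".,;:>\"'"


def clean_url(url):
    """Return url with trailing punctuation from TRAILING_JUNK removed."""
    return url.rstrip(TRAILING_JUNK)


print(clean_url("https://networknavigator.jrladd.com/,"))
print(clean_url("https://github.com/jrladd/network_navigator/pull/7>."))
```

A URL that genuinely ends in one of these characters would be altered by this approach, so a checker might try the original string first and fall back to the cleaned one.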

10 changes: 6 additions & 4 deletions report.md
@@ -3,6 +3,7 @@
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

## Cell Counts

**all cells: 71**
code_empty: 1
markdown: 35
@@ -11,21 +12,22 @@ code: 36
## Action Outputs

### Size

**total cells: 71**
## Kernel Checks:

> [!CAUTION]
-> Error: Python versions don't match. The notebook is using **python-3.7.10**, when **python-3.11** is required.
+> Error: Python versions don't match. The notebook is using **python-3.7.10**, when **python-3.11** is required.

> [!TIP]
-> Try changing **runtime.txt** to resolve the error above.
+> Try changing **runtime.txt** to resolve the error above.
### Citations Found with problem:


### Check HTML


### Check Output Sizes and Rules
- Table found in output of cell 16
> First words of input cell: df=pd.read_csv("https://raw.githubusercontent.com/mkrzmr/jdh/main/script/file_AA_sm.csv") df.drop('Unnamed: 0', axis=1, inplace=True)
Expand All @@ -46,7 +48,7 @@ Total number of audios: 4
| Mimetype | Presence |
| --- | --- |
| text/html | True |
-| text/plain | False |
+| text/plain | True |
| image/png | True |
| audio | False |
