@Abeelha Abeelha commented Sep 25, 2025

UPDATE 2025-09-30:
1. Tests Access Frontend (Not API)

  • Tests now fetch actual WordPress frontend HTML pages
  • Use requests.get(post['url']) to load real pages
  • Parse HTML with BeautifulSoup to verify content
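
A minimal sketch of this frontend-first approach, assuming each discovered post is a dict with a `url` key, as in `post['url']` above. The helper names `fetch_frontend_html` and `visible_text` are hypothetical, and the standard-library `html.parser` stands in for BeautifulSoup here only to keep the sketch dependency-free; the tests themselves use BeautifulSoup.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the reader-visible text of an HTML page as one string."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(collector.chunks)


def fetch_frontend_html(post, timeout=10):
    """Load the real frontend page for a post (hypothetical helper)."""
    import requests  # third-party; imported lazily so visible_text works without it

    response = requests.get(post["url"], timeout=timeout)
    response.raise_for_status()  # a 4xx/5xx page should fail the test early
    return response.text
```

A test would then call `visible_text(fetch_frontend_html(post))` and assert on the result, so it exercises what a visitor actually sees rather than what the API returns.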

2. Removed Unnecessary Code

  • No custom printing (pytest handles output)
  • No file saving (results managed by pytest)
  • No main() functions (use pytest runner)
  • Removed old test files with custom runners

3. Proper Pytest Format

  • All tests start with test_ prefix
  • Use standard pytest assertions
  • Clean, focused test functions
  • Pytest handles all reporting
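
For illustration, a test module in this format might look roughly like the sketch below. `SAMPLE_HTML` is a hypothetical stand-in for a page fetched from the frontend (in the real suite a fixture would likely supply it), and the test names are made up. Nothing needs to be imported from pytest: it collects functions whose names start with `test_` and reports failed `assert` statements itself.

```python
import re

# Hypothetical stand-in for HTML fetched from the WordPress frontend.
SAMPLE_HTML = """
<html><head><title>Example post</title></head>
<body><article><h1>Example post</h1><p>Body text of the post.</p></article></body></html>
"""


def test_page_has_single_h1():
    # A plain assert is enough; pytest reports the failing expression.
    assert len(re.findall(r"<h1\b", SAMPLE_HTML)) == 1


def test_page_has_nonempty_title():
    match = re.search(r"<title>(.*?)</title>", SAMPLE_HTML, re.S)
    assert match is not None
    assert match.group(1).strip()
```

Running `pytest` in the directory containing such a file picks up both functions; no `main()` or custom runner is involved.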

4. More Generic Tests

  • Tests discover actual posts from WordPress automatically
  • Check patterns (HTML structure, content length) not specific values
  • Work with any posts, not hardcoded to specific ones
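
A rough sketch of what "generic" can mean here. It assumes posts are discovered through the standard WordPress REST endpoint `/wp-json/wp/v2/posts`, whose items carry `link` and `title.rendered` fields; whether the tests discover posts this way is an assumption. The function names and the 200-character threshold are hypothetical.

```python
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts start tags and the total length of text content."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.text_length = 0

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def handle_data(self, data):
        self.text_length += len(data.strip())


def posts_from_api_payload(payload):
    """Map REST API post objects to the {'url', 'title'} dicts the tests use."""
    return [
        {"url": item["link"], "title": item["title"]["rendered"]}
        for item in payload
    ]


def generic_problems(html, min_text_length=200):
    """Return a list of pattern-level problems; an empty list means OK."""
    counter = _TagCounter()
    counter.feed(html)
    counter.close()
    problems = []
    if counter.counts.get("h1", 0) < 1:
        problems.append("no <h1> heading")
    if counter.counts.get("p", 0) < 1:
        problems.append("no <p> paragraphs")
    if counter.text_length < min_text_length:
        problems.append(f"text shorter than {min_text_length} characters")
    return problems
```

Because the checks look only at structure and length, the same assertion (`generic_problems(html) == []`) can be applied to every discovered post without hardcoding any titles or content.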

Functional Test Results

Test Summary

Comprehensive Tests: 86.7% success rate (13/15 tests passed)
Specific Post Test: 40.0% success rate (2/5 tests passed)

Test Case: "Second Renaissance: what's in a name?"

Issues Found

1. Footnotes Not Converted (FAILED)

Raw markdown footnote syntax appears in HTML:

[^1]: An intere...

Should be converted to proper HTML footnotes with superscript links.
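
One way to detect this failure generically is to look for leftover footnote markers in the page text. The sketch below is an assumption about how such a check could be written, not the code in this PR; `RAW_FOOTNOTE` and `raw_footnotes` are hypothetical names.

```python
import re

# Matches an unconverted reference such as [^1] or a definition such as [^1]:
RAW_FOOTNOTE = re.compile(r"\[\^[^\]\s]+\]:?")


def raw_footnotes(text):
    """Return every raw markdown footnote marker found in the text."""
    return RAW_FOOTNOTE.findall(text)
```

A test can then assert `raw_footnotes(page_text) == []`. Posts that intentionally show markdown syntax inside code samples would need to be excluded or handled separately.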

2. Special Markdown Blocks Not Processed (FAILED)

Obsidian-style callout blocks remain unconverted:

[!note]
For an outline and introduction to the Second...

Should be converted to HTML callout/note blocks.
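
A similar pattern check can flag unconverted callouts. This is a sketch under the assumption that the marker survives as literal `[!type]` text in the rendered page; the function name is hypothetical.

```python
import re

# Obsidian callout marker, e.g. [!note] or [!warning]; captures the type name.
RAW_CALLOUT = re.compile(r"\[!([A-Za-z][\w-]*)\]")


def unconverted_callouts(text):
    """Return the callout types whose markers were left as plain text."""
    return RAW_CALLOUT.findall(text)
```

Returning the type names rather than a boolean makes a failing assertion show which kinds of callout were missed.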

3. Author Attribution Issues (PARTIAL)

  • WordPress shows default author instead of actual authors
  • Original site displays authors with pictures and proper attribution
  • Author metadata from frontmatter not properly mapped

4. Image File Missing (FAILED)

  • Image HTML tag correctly generated with proper alt text
  • Image file uploaded to WordPress Media Library (ID: 80)
  • Image file not accessible at URL: /wp-content/uploads/first-renaissance-to-second-renaissance-bridge.webp
  • Returns HTTP 404 error when accessing image URL
  • Expected: the image should display; instead, the page shows only the alt text because the file is missing
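
A check for this class of failure could resolve every `<img>` source on the page and confirm it responds with HTTP 200. The sketch below is hypothetical (function names, the injectable `get_status` callable, and the use of `requests.head` are assumptions) and uses the standard-library `html.parser` in place of BeautifulSoup to stay dependency-free.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _ImageSources(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(html, page_url):
    """Return absolute URLs for all images referenced by the page."""
    parser = _ImageSources()
    parser.feed(html)
    parser.close()
    return [urljoin(page_url, src) for src in parser.sources]


def broken_images(html, page_url, get_status):
    """Return image URLs whose status, as reported by get_status, is not 200."""
    return [url for url in image_urls(html, page_url) if get_status(url) != 200]


def http_status(url, timeout=10):
    """Default status lookup (third-party requests, imported lazily)."""
    import requests

    return requests.head(url, allow_redirects=True, timeout=timeout).status_code
```

In a test, `assert broken_images(html, post["url"], http_status) == []` would have caught the 404 above even though the `<img>` tag itself was generated correctly.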

Working Correctly

  • Featured images working (ID: 80)
  • Content structure (headings, paragraphs) converted correctly
  • No wiki-style links or other raw markdown artifacts, apart from the footnote and callout issues listed above
  • HTML formatting proper
  • Image HTML tags generated correctly with alt text

Test Files

Tests can be run from etl/8-functional-tests/:

  • test_specific_post_migration.py - Specific post comparison
  • functional_test_comprehensive.py - General quality tests

Results saved to etl/8-functional-tests/output/

@Abeelha Abeelha self-assigned this Sep 26, 2025
@@ -0,0 +1,264 @@
#!/usr/bin/env python3
Member

afaict the tests access content via the api rather than accessing the frontend. A proper functional test (hardcoded) would visit a page like lifeitself.org/blog/xyz and then check that content is correct in various ways.


    def run_comprehensive_tests(self):
        """Run comprehensive test suite on discovered posts."""
        print("\n" + "="*70)
Member

All this printing is mostly unnecessary. If we were using pytest or some other test suite, this would be reported from the individual tests. Overall, running all the tests should be handled by the test runner, not by a function like this one.


return success

def save_results(self):
Member

we don't need to save results to a file.

print(f"\nDetailed results saved to: {output_dir / 'comprehensive_test_results.json'}")


def main():
Member

i don't need a main function. again use a test runner.

print(f"Error fetching WordPress content: {e}")
return None

def test_footnotes(self, original_html, wordpress_content):
Member

this seems very specific. i would have tried to compare the content in some more generic way. not sure how exactly but would start with something very rough and go from there.

@Abeelha
Collaborator Author

Abeelha commented Sep 30, 2025

@rufuspollock
Let's add CodeRabbit to the projects. It's much better for PRs and flags potential issues to fix; see the image for reference:

[image: firefox_inoU5YZmZp]

@rufuspollock
Member

@Abeelha how do we add coderabbit?

@rufuspollock rufuspollock merged commit efbf9b8 into main Oct 1, 2025
@Abeelha
Collaborator Author

Abeelha commented Oct 2, 2025

@rufuspollock I will ask Anuar (he is the one who uses it in portalJS) how to set it up in tomorrow's meeting; then we can have this helpful AI in the PRs.
