wiki-connect/ParseWiki

WikiConnect ParseWiki

A powerful PHP library for parsing MediaWiki-style content from raw wiki text.


📚 Overview

This library allows you to extract:

  • Templates (single, multiple, nested)
  • Internal wiki links
  • External links
  • Citations (references)
  • Categories (with or without display text)

Perfect for handling wiki-formatted text in PHP projects.

πŸ—‚οΈ Project Structure

  • ParserTemplates: Parses multiple templates.
  • ParserTemplate: Parses a single template.
  • ParserInternalLinks: Parses internal wiki links.
  • ParserExternalLinks: Parses external links.
  • ParserCitations: Parses citations and references.
  • ParserCategories: Parses categories from wiki text.
  • DataModel classes:
    • Attribute
    • Citation
    • ExternalLink
    • InternalLink
    • Parameters
    • Template
  • tests/: Contains PHPUnit test files:
    • ParserCategoriesTest
    • ParserCitationsTest
    • ParserExternalLinksTest
    • ParserInternalLinksTest
    • ParserTemplatesTest
    • ParserTemplateTest
    • DataModel tests:
      • AttributeTest
      • ParametersTest
      • TemplateTest

🚀 Features

  • ✅ Parse single and multiple templates.
  • ✅ Support nested templates.
  • ✅ Handle named and unnamed template parameters.
  • ✅ Extract internal links with or without display text.
  • ✅ Extract external links with or without labels.
  • ✅ Parse citations, including attributes and special characters.
  • ✅ Parse categories, with support for custom namespaces, whitespace, and special characters.
  • ✅ Full PHPUnit test coverage.

🧩 Wikitext Features Support

| Feature | Read ✅ | Modify ✏️ | Replace 🔄 |
|---|---|---|---|
| Templates | ✅ Yes | ✅ Yes | ✅ Yes |
| Parameters | ✅ Yes | ✅ Yes | ✅ Yes |
| Citations | ✅ Yes | ✅ Yes | ✅ Yes |
| Citations>Attributes | ✅ Yes | ✅ Yes | ✅ Yes |
| Internal Links | ✅ Yes | | |
| External Links | ✅ Yes | | |
| Categories | ✅ Yes | | |
| HTML Tags | | | |
| Parser Functions | | | |
| Tables | | | |
| Sections | | | |
| Magic Words | | | |

🟡 Note: Some features are partially supported or under development. Contributions are welcome!


βš™οΈ Requirements

  • PHP 8.0 or higher
  • PHPUnit 9 or higher

💻 Installation

composer require wiki-connect/parsewiki

Make sure Composer's autoloader (vendor/autoload.php) is loaded so that classes in the WikiConnect\ParseWiki namespace resolve through PSR-4 autoloading.
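
A quick way to confirm that autoloading works is to check that one of the library's classes can be resolved. The following is a minimal sketch, assuming the script sits in the project root next to the vendor/ directory (that location is an assumption, adjust the path to your layout):

```php
<?php
// Sketch only: assumes this file lives next to Composer's vendor/ directory.
$autoload = __DIR__ . '/vendor/autoload.php';

if (is_file($autoload)) {
    require_once $autoload; // Composer-generated autoloader
}

// ParserTemplates is one of the classes listed under Project Structure.
$available = class_exists('WikiConnect\\ParseWiki\\ParserTemplates');

echo $available
    ? 'ParseWiki classes are autoloadable.' . PHP_EOL
    : 'ParseWiki not found; run the composer require command and load vendor/autoload.php.' . PHP_EOL;
```

If the second message appears, the package is either not installed or the autoloader path does not match your project layout.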


🧪 Running Tests

vendor/bin/phpunit tests

Test Coverage:

  • Templates: Single, multiple, nested, named/unnamed parameters.
  • Internal Links: Simple, with display text, special characters.
  • External Links: With/without labels, multiple links, whitespace handling.
  • Citations: With/without attributes, special characters.
  • Categories: Simple, with display text, custom namespaces, whitespace, special characters.

✨ Example Usage

Parsing Templates

use WikiConnect\ParseWiki\ParserTemplates;

$text = '{{Infobox person|name=John Doe|birth_date=1990-01-01}}';

$parser = new ParserTemplates($text);
$templates = $parser->getTemplates();

foreach ($templates as $template) {
    echo $template->getName();
    print_r($template->getParameters());
}
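
Nested templates are the reason a single non-greedy pattern is not enough for this job: the first `}}` may close an inner template rather than the outer one. The sketch below illustrates the general idea of tracking brace depth. It is not the ParseWiki implementation, `findTopLevelTemplates` is a hypothetical helper, and it ignores triple-brace parameters such as `{{{1}}}`:

```php
<?php
// Illustrative sketch only: NOT how ParseWiki is implemented.
// Finds top-level {{...}} spans by tracking brace depth, so a nested
// template stays inside its parent instead of ending it early.
function findTopLevelTemplates(string $text): array
{
    $found = [];
    $depth = 0;
    $start = 0;
    $length = strlen($text);

    for ($i = 0; $i < $length - 1; $i++) {
        $pair = substr($text, $i, 2);

        if ($pair === '{{') {
            if ($depth === 0) {
                $start = $i; // remember where the outermost template opens
            }
            $depth++;
            $i++; // skip the second brace of the pair
        } elseif ($pair === '}}' && $depth > 0) {
            $depth--;
            $i++; // skip the second brace of the pair
            if ($depth === 0) {
                $found[] = substr($text, $start, $i - $start + 1);
            }
        }
    }

    return $found;
}

$sample = 'A {{Outer|x={{Inner|1}}|y=2}} and {{Second}}.';
print_r(findTopLevelTemplates($sample));
```

Here the inner `{{Inner|1}}` is returned as part of `{{Outer|...}}` rather than as a separate top-level match.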

Parsing and Editing a Single Template

use WikiConnect\ParseWiki\ParserTemplate;

$text = '{{Infobox_Person|name=John Doe|birth_date=1990-01-01}}';

$parser = new ParserTemplate($text);
$template = $parser->getTemplate();

// Edit the template
$template->setName('Infobox person');
$template->parameters->set('birth_place', '[[New York City|New York]]');

$new_template = $template->toString();
echo $new_template; // {{Infobox person|name=John Doe|birth_date=1990-01-01|birth_place=[[New York City|New York]]}}

Parsing Internal Links

use WikiConnect\ParseWiki\ParserInternalLinks;

$text = 'See [[Main Page|the main page]] and [[Help]].';

$parser = new ParserInternalLinks($text);
$links = $parser->getTargets();

foreach ($links as $link) {
    echo 'Target: ' . $link->getTarget() . PHP_EOL;
    echo 'Text: ' . ($link->getText() ?? $link->getTarget()) . PHP_EOL;
}
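
To make the target/text distinction concrete, here is a standalone sketch of the `[[Target]]` and `[[Target|Display text]]` syntax using a regular expression. It is an illustration only, not the library's code. `extractInternalLinks` is a hypothetical helper, and it makes no attempt to exclude category or file links:

```php
<?php
// Illustrative sketch only: NOT how ParseWiki is implemented.
// Matches [[Target]] and [[Target|Display text]].
function extractInternalLinks(string $text): array
{
    $links = [];
    preg_match_all(
        '/\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]/',
        $text,
        $matches,
        PREG_SET_ORDER
    );

    foreach ($matches as $match) {
        $display = $match[2] ?? ''; // absent when the link has no pipe
        $links[] = [
            'target' => trim($match[1]),
            'text'   => $display !== '' ? trim($display) : null,
        ];
    }

    return $links;
}

print_r(extractInternalLinks('See [[Main Page|the main page]] and [[Help]].'));
```

A link without display text yields `null` for `text`, which is why the example above falls back to the target with `??`.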

Parsing External Links

use WikiConnect\ParseWiki\ParserExternalLinks;

$text = 'Visit [https://example.com Example Site] and [https://nolabel.com].';

$parser = new ParserExternalLinks($text);
$links = $parser->getLinks();

foreach ($links as $link) {
    echo 'URL: ' . $link->getLink() . PHP_EOL;
    echo 'Label: ' . ($link->getText() ?: 'No label') . PHP_EOL;
}

Parsing Citations

use WikiConnect\ParseWiki\ParserCitations;

$text = 'Some text with a citation.<ref name="source">This is a citation</ref>';

$parser = new ParserCitations($text);
$citations = $parser->getCitations();

foreach ($citations as $citation) {
    echo 'Content: ' . $citation->getContent() . PHP_EOL;
    echo 'Attributes: ' . $citation->getAttributes() . PHP_EOL;
}
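
For readers unfamiliar with `<ref>` syntax, the sketch below shows how a citation splits into attributes and content. It is a simplified illustration, not the library's implementation. `extractCitations` is a hypothetical helper that handles only double-quoted attribute values and skips self-closing tags such as `<ref name="x" />`:

```php
<?php
// Illustrative sketch only: NOT how ParseWiki is implemented.
// Splits <ref ...>content</ref> into an attribute map and the content.
function extractCitations(string $text): array
{
    $citations = [];
    // The lookbehind (?<!\/) rejects self-closing tags like <ref name="x" />.
    preg_match_all('/<ref\b([^>]*)(?<!\/)>(.*?)<\/ref>/is', $text, $matches, PREG_SET_ORDER);

    foreach ($matches as $match) {
        $attributes = [];
        // Only name="value" pairs with double quotes are recognised here.
        preg_match_all('/([\w-]+)\s*=\s*"([^"]*)"/', $match[1], $pairs, PREG_SET_ORDER);
        foreach ($pairs as $pair) {
            $attributes[$pair[1]] = $pair[2];
        }

        $citations[] = [
            'attributes' => $attributes,
            'content'    => $match[2],
        ];
    }

    return $citations;
}

print_r(extractCitations('Some text.<ref name="source">This is a citation</ref>'));
```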

Parsing Categories

use WikiConnect\ParseWiki\ParserCategories;

$text = 'Some text [[Category:Science]] and [[Category:Math|Displayed]].';

$parser = new ParserCategories($text);
$categories = $parser->getCategories();

foreach ($categories as $category) {
    echo 'Category: ' . $category . PHP_EOL;
}
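
The category variants mentioned under Features (display text, custom namespaces, surrounding whitespace) can be pictured with the following standalone sketch. It is an illustration under stated assumptions, not the library's code. `extractCategories` and its `$namespace` parameter are hypothetical:

```php
<?php
// Illustrative sketch only: NOT how ParseWiki is implemented.
// Matches [[Category:Name]] and [[Category:Name|Display]], tolerating
// whitespace around the namespace, the name, and the display text.
function extractCategories(string $text, string $namespace = 'Category'): array
{
    $pattern = '/\[\[\s*' . preg_quote($namespace, '/')
        . '\s*:\s*([^\[\]|]+?)\s*(?:\|([^\[\]]*))?\]\]/iu';

    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $categories = [];
    foreach ($matches as $match) {
        $display = $match[2] ?? ''; // absent when there is no pipe
        $categories[] = [
            'name'    => $match[1],
            'display' => $display !== '' ? trim($display) : null,
        ];
    }

    return $categories;
}

print_r(extractCategories('Some text [[Category:Science]] and [[ Category : Math | Displayed ]].'));
```

Passing a different `$namespace` value is how this sketch would handle a localized or custom category namespace.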

🙌 Author

Developed with ❤️ by Gerges.
