Module-ification of MultiMarkdown #25

Oblomov · 2011-01-02T10:39:10Z

Hello,

I took the liberty of refactoring the (Perl) MultiMarkdown files into a module + executable interface. I think I didn't break anything, and there is subsequent work that could be done by exploiting the new Markdown() interface (with options) to override default settings without pushing metadata through, but I believe that in its current state this could already be merged upstream.

This allows MultiMarkdown.pl to be used as a module and still find the correct ASCIIMathML.pm

As a first step, all modules are moved to lib and the MultiMarkdown.pl executable is renamed to lib/MultiMarkdown.pm, while bin/MultiMarkdown.pl is replaced by a simple wrapper that just invokes the MultiMarkdown module.

In the process, simplify the MMDPath detection logic.

Bring the command-line parsing stuff back into the MultiMarkdown.pl executable, and export the Markdown method from the module, making it accept options to be used to override global settings.

This allows us to make the settings overridable via the options passed to Markdown() in a much simpler way than the clumsy if (defined $opts{somesetting}) { $g_somesetting = $opts{somesetting} ; } that would need to be repeated for every setting.

By setting up the tagged extractor used by _HashHTMLBlocks outside of the sub, we can shave off some runtime when processing is repeated.

The search & replace in _DoDefinitionLists takes a lot of time even if no replacement is being done. Optimize by bailing out early if no line looking like a definition is found.

Skip the block tag hashing if there are no block tags.

This fixes an issue where a heading follows a list with a single item containing a block quote: the heading would be absorbed by the block quote and fail to get expanded (a similar failure happened with standard Markdown, where the heading would get expanded but still appear within the blockquote).

This brings the module interface on par with Text::MultiMarkdown from CPAN, except for some extra parameters offered by that module.

Merge the Text::MultiMarkdown work from Doran. This includes adding new switches to disable MMD enhancements, including the documentation from Text::MultiMarkdown, and minor changes to function names to align with the Text::MultiMarkdown source (ease of diff).

Also move ASCIIMathML into Text/, making it easier to find.

This allows the final newline at the end of code blocks to be customized. The default is to have no newline (as per MMD), but it can be set to "\n" to emulate classic Markdown.

Processing those blocks early introduces other bugs against the testsuite; we have to find a better solution.

Go back to Markdown-compatible output. Find a test case where this effectively breaks the HTML parser.

This ensures that e.g. a blockquote following them is properly recognized.

The original Markdown implementation supports "running blockquotes": if any line in a paragraph starts with the '>' character, that line and all subsequent ones are split from the paragraph and become a blockquote. This is inconsistent with the list behavior (lists don't start mid-paragraph).

Additionally, if a blockquote occurs within a non-block list item (e.g. a standalone item, or an item in a sequence of items not separated by empty lines), mismatched markup is generated, with interleaved 'blockquote' and 'li' tag pairs, because Markdown starts out thinking it's in span mode and then reparses the span-mode output in block mode.

Blockquote-in-list detection is solved by letting the list item processor check for the existence of lines starting with '>' in the whole item. Since this is inefficient, we allow the user to disable running blockquotes, in which case blockquotes cannot start mid-paragraph (consistent with the list behavior) and the blockquote-in-list detection is much more efficient.
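The paragraph-splitting rule described here can be sketched roughly as follows. This is only an illustration of the behavior, not the code from the patch: the function name `split_running_blockquote` and the option name `running_blockquotes` are made up for the example.

```perl
use strict;
use warnings;

# Split one paragraph into (leading text, quoted part).
# NOTE: the function and the 'running_blockquotes' option name are
# hypothetical; they only illustrate the rule described in the text.
sub split_running_blockquote {
    my ($para, %opts) = @_;
    my $running = exists $opts{running_blockquotes}
                ? $opts{running_blockquotes}
                : 1;                       # classic Markdown behavior
    my @lines = split /\n/, $para;

    # A blockquote may always start on the first line of a paragraph.
    return ('', $para) if @lines && $lines[0] =~ /^[ ]{0,3}>/;

    # With running blockquotes disabled, nothing starts mid-paragraph.
    return ($para, '') unless $running;

    # Otherwise the first '>' line and everything after it is split off.
    for my $i (1 .. $#lines) {
        next unless $lines[$i] =~ /^[ ]{0,3}>/;
        return (join("\n", @lines[0 .. $i - 1]),
                join("\n", @lines[$i .. $#lines]));
    }
    return ($para, '');
}

my ($text, $quote) = split_running_blockquote("intro\n> quoted\nmore");
print "text: [$text]\nquote: [$quote]\n";
```

With the option switched off, the whole paragraph comes back as plain text, which is what makes the cheaper blockquote-in-list check possible.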

This is enabled by passing a "leading" regexp as the running_lists option to MMD. In this case, a mid-paragraph line starting with a list item (ordered or unordered) and preceded by a line ending with the leading regexp will switch the paragraph to list mode.

Allow text in footnotes to reference other footnotes. Refactor in-text footnote _mark_ processing from footnote _text_ processing, and process footnote marks when processing each footnote text.

Oblomov added 30 commits December 17, 2010 13:41

Use __FILE__ rather than $0 to find self

3761e0e

This allows MultiMarkdown.pl to be used as a module and still find the correct ASCIIMathML.pm
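As a rough sketch of why `__FILE__` helps here: `$0` names the script that was launched, while `__FILE__` names the source file being compiled, so it still points at the right place when the code is loaded as a module. The helper name `module_dir` is made up for this example, and `Cwd::abs_path` is used as a stand-in for whatever link resolution the patch itself performs.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;
use Cwd qw(abs_path);

# Hypothetical helper: the directory containing *this source file*.
sub module_dir {
    # abs_path() resolves symlinks and relative components; fall back to
    # the raw name if it cannot be resolved.
    my $file = abs_path(__FILE__);
    $file = __FILE__ unless defined $file;
    return dirname($file);
}

# Build the path to the support module portably.
my $support = File::Spec->catfile(module_dir(), 'ASCIIMathML.pm');
print "Would look for the support module at: $support\n";
```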

follow symlinks to find the support module

c391fca

Start lib-ification of MultiMarkdown

83f3ddd

As a first step, all modules are moved to lib and the MultiMarkdown.pl executable is renamed to lib/MultiMarkdown.pm, while bin/MultiMarkdown.pl is replaced by a simple wrapper that just invokes the MultiMarkdown module.

Rework mmd2* scripts to use the new layout

f54bc59

In the process, simplify the MMDPath detection logic.

More modularization progress

8bb14a5

Use File::Spec for path joins

9a4f433

Improve binary vs module split

583bf95

Bring the command-line parsing stuff back into the MultiMarkdown.pl executable, and export the Markdown method from the module, making it accept options to be used to override global settings.

Collect settings into a hash

890e89c

This allows us to make the settings overridable via the options passed to Markdown() in a much simpler way than the clumsy if (defined $opts{somesetting}) { $g_somesetting = $opts{somesetting} ; } that would need to be repeated for every setting.
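Something along these lines is what a settings hash makes possible. The setting names and the helper `merged_settings` below are placeholders chosen for the example; the real names in the patch may well differ.

```perl
use strict;
use warnings;

# Hypothetical defaults; the actual setting names may differ.
my %g_settings = (
    empty_element_suffix => ' />',
    tab_width            => 4,
    use_metadata         => 1,
);

# Return the settings for one call: the defaults, overridden by every
# caller-supplied option that names a known setting and is defined.
sub merged_settings {
    my (%opts) = @_;
    my %settings = %g_settings;
    for my $key (keys %opts) {
        next unless exists $g_settings{$key};   # ignore unknown options
        next unless defined $opts{$key};        # undef means "keep default"
        $settings{$key} = $opts{$key};
    }
    return %settings;
}

my %s = merged_settings(tab_width => 8, bogus => 1);
print "tab_width=$s{tab_width}\n";
```

One loop replaces the per-setting `if (defined ...)` boilerplate, and adding a new setting only means adding a key to the defaults.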

Prepare _HashHTMLBlocks extractor

0bec43c

By setting up the tagged extractor used by _HashHTMLBlocks outside of the sub, we can shave off some runtime when processing is repeated.
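The general pattern looks roughly like this. `make_extractor` is a simplified, hypothetical stand-in for the real tagged extractor (it does not handle nested tags); the point is only that the closure is constructed once at file scope rather than on every call.

```perl
use strict;
use warnings;

my $builds = 0;   # counts how often an extractor is constructed

# Hypothetical stand-in for an expensive extractor constructor.
sub make_extractor {
    my ($tag) = @_;
    $builds++;
    # Compiled once per extractor; no support for nested tags.
    my $re = qr{<\Q$tag\E\b[^>]*>.*?</\Q$tag\E>}s;
    return sub {
        my ($text) = @_;
        return $text =~ /($re)/ ? $1 : undef;
    };
}

# Built once, outside the sub, then reused by every call.
my $extract_div = make_extractor('div');

sub first_div {
    my ($text) = @_;
    return $extract_div->($text);
}

first_div("<div>a</div>") for 1 .. 3;
print "extractor built $builds time(s)\n";
```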

Early bailout from _DoDefinitionLists

d758734

The search & replace in _DoDefinitionLists takes a lot of time even if no replacement is being done. Optimize by bailing out early if no line looking like a definition is found.
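A minimal sketch of the early bailout, assuming a definition line is "up to three spaces, a colon, then whitespace". Both that test and the simplified replacement below are illustrative guesses, not the actual _DoDefinitionLists code.

```perl
use strict;
use warnings;

sub do_definition_lists {
    my ($text) = @_;

    # Cheap check first: without a candidate line there is nothing to
    # replace, so skip the expensive substitution entirely.
    return $text unless $text =~ /^[ ]{0,3}:[ \t]+\S/m;

    # Simplified stand-in for the expensive search & replace: turn
    # "term\n: definition" pairs into a minimal <dl>.
    $text =~ s{^(\S[^\n]*)\n[ ]{0,3}:[ \t]+([^\n]+)$}
              {<dl>\n<dt>$1</dt>\n<dd>$2</dd>\n</dl>}mg;
    return $text;
}

print do_definition_lists("Apple\n: A fruit"), "\n";
```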

Small _HashHTMLBlocks optimization

be84578

Skip the block tag hashing if there are no block tags.

Read links when finding who we are

9573fa2

Move some lines around, getting closer to the typical Perl module

662fe2f

Object-oriented interface

3a5d919

This brings the module interface on par with Text::MultiMarkdown from CPAN, except for some extra parameters offered by that module.
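For illustration, an interface of this shape could look as follows. The package name `My::MMD` is hypothetical and the "conversion" is a trivial stand-in (tab expansion only); the sketch is loosely modeled on a constructor plus a `markdown` routine usable both as a method and as a plain function, and is not the patch's actual code.

```perl
package My::MMD;   # hypothetical package name, for illustration only
use strict;
use warnings;
use Scalar::Util qw(blessed);

sub new {
    my ($class, %opts) = @_;
    my $self = { tab_width => 4, %opts };   # defaults, then overrides
    return bless $self, $class;
}

# Usable as $obj->markdown($text, \%opts) or My::MMD::markdown($text, \%opts).
sub markdown {
    my ($self, $text, $opts);
    if (blessed($_[0]) && $_[0]->isa(__PACKAGE__)) {
        ($self, $text, $opts) = @_;
    } else {
        ($text, $opts) = @_;
        $self = __PACKAGE__->new;
    }
    my %settings = (%$self, %{ $opts || {} });   # per-call overrides

    # Stand-in for the real conversion: expand tabs only.
    my $pad = ' ' x $settings{tab_width};
    $text =~ s/\t/$pad/g;
    return $text;
}

package main;

my $mmd = My::MMD->new(tab_width => 2);
print $mmd->markdown("\tindented"), "\n";
print My::MMD::markdown("\tindented", { tab_width => 8 }), "\n";
```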

Whitespace-cleanup the main module

8795bc1

Merge Text::MultiMarkdown

a786b28

Merge the Text::MultiMarkdown work from Doran. This includes adding new switches to disable MMD enhancements, including the documentation from Text::MultiMarkdown, and minor changes to function names to align with the Text::MultiMarkdown source (ease of diff).

Text::MultiMarkdown alias to MultiMarkdown

909ef19

Also move ASCIIMathML into Text/, making it easier to find.

Handle 'Keywords' in metadata correctly

08f1b0a

Emulate Markdown when called as Markdown.pl

03edab7

New codeblocks_newline option

1e20257

This allows the final newline at the end of code blocks to be customized. The default is to have no newline (as per MMD), but it can be set to "\n" to emulate classic Markdown.
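A small sketch of how such an option might be applied when emitting a code block. `render_code_block` is a hypothetical helper, and it assumes the code has already been HTML-escaped.

```perl
use strict;
use warnings;

# Hypothetical renderer for an already-escaped code block body.
sub render_code_block {
    my ($code, %opts) = @_;
    # Default: no final newline (MMD style); "\n" emulates classic Markdown.
    my $nl = defined $opts{codeblocks_newline} ? $opts{codeblocks_newline} : '';
    $code =~ s/\n+\z//;   # drop trailing newlines from the source block
    return "<pre><code>$code$nl</code></pre>";
}

print render_code_block("print 1;\n"), "\n";
print render_code_block("print 1;\n", codeblocks_newline => "\n"), "\n";
```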

Revert 7f8cade

e621a95

Processing those blocks early introduces other bugs against the testsuite; we have to find a better solution.

More Markdown.pl emulation

aa19530

Command-line option to emulate Markdown

bcfadcb

Add leading spaces to _all_ lines in a blockquote

eda6547

Don't strip final whitespace in list items

c2f071d

Go back to Markdown-compatible output. Find a test case where this effectively breaks the HTML parser.

Outer lists should properly end block-style

54b32b4

This ensures that e.g. a blockquote following them is properly recognized.

Support running lists

a09ade1

This is enabled by passing a "leading" regexp as the running_lists option to MMD. In this case, a mid-paragraph line starting with a list item (ordered or unordered) and preceded by a line ending with the leading regexp will switch the paragraph to list mode.
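The detection rule can be sketched like this. The helper `running_list_start` is made up for the example, and tolerating trailing spaces after the leading match is an assumption of the sketch.

```perl
use strict;
use warnings;

# Return the index of the first line at which the paragraph switches to
# list mode, or -1 if it never does. $leading is the "leading" regexp,
# e.g. qr/:/ to allow lists after a line ending with a colon.
sub running_list_start {
    my ($para, $leading) = @_;
    return -1 unless defined $leading;      # feature disabled

    my @lines  = split /\n/, $para;
    my $marker = qr/^[ ]{0,3}(?:[*+-]|\d+\.)[ \t]+/;   # unordered or ordered item

    for my $i (1 .. $#lines) {
        return $i
            if $lines[$i] =~ $marker
            && $lines[$i - 1] =~ /(?:$leading)[ \t]*$/;
    }
    return -1;
}

print running_list_start("Shopping list:\n- eggs\n- milk", qr/:/), "\n";
```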

Merge remote-tracking branch 'upstream/master'

99fdcb6

Oblomov added 15 commits March 7, 2011 18:33

More readlink fixes

1739db3

Add HTML5 block-level tags

a49f18a

Handle multiple references to the same footnote/glossary

8212a1a

Glaring horror error from previous commit

e23970e

Output footnotes ordered by their counter

8939649

Allow footnotes in footnotes

33418f4

Allow text in footnotes to reference other footnotes. Refactor in-text footnote _mark_ processing from footnote _text_ processing, and process footnote marks when processing each footnote text.
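The split between mark processing and text processing could look roughly like this. The `[^id]` mark syntax, the `%notes` input, and the helper names are assumptions made for the sketch; the output format is deliberately minimal.

```perl
use strict;
use warnings;

# Hypothetical input: footnote id => footnote text.
my %notes = (a => 'See also[^b].', b => 'Final note.');
my (%number, @order);   # counter per id, and ids in order of first use

# Turn one mark into a numbered link, queueing the note on first use.
sub mark_for {
    my ($id) = @_;
    return "[^$id]" unless exists $notes{$id};   # unknown id: leave as-is
    unless ($number{$id}) {
        push @order, $id;
        $number{$id} = scalar @order;
    }
    return sprintf '<a href="#fn:%s">%d</a>', $id, $number{$id};
}

# Mark processing, shared by the body and by each footnote text.
sub do_footnote_marks {
    my ($text) = @_;
    $text =~ s/\[\^(\w+)\]/mark_for($1)/ge;
    return $text;
}

sub render {
    my ($body) = @_;
    my $html = do_footnote_marks($body);
    # @order may grow while we iterate: notes referenced only from other
    # notes are discovered when their referrer's text is processed.
    for (my $i = 0; $i < @order; $i++) {
        $html .= "\n" . ($i + 1) . '. ' . do_footnote_marks($notes{ $order[$i] });
    }
    return $html;
}

print render('Text[^a].'), "\n";
```

Because a note is queued only the first time its mark is seen, footnotes that reference each other in a cycle do not cause endless processing.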

Support math between $..$

521faa1

ASCIIMathML: some UTF-8 support

a2bc6d2

Support UTF-8 in ASCIIMathML

b892c60

All scripts: fall back to __FILE__ if not link

497e33e

More UTF-8 for ASCIIMathML

38df87e

Some currency symbols

094052f

Treat floor/ceil as parenthesis

a7e1159

ASCIIMathML: more trig

f51648f

Better HTML5 support + my stuff

38dfb5f
