Module-ification of MultiMarkdown #25

Open: Oblomov wants to merge 45 commits into base: master

@Oblomov Oblomov commented Jan 2, 2011

Hello,

I took the liberty of refactoring the (Perl) MultiMarkdown files into a module plus an executable wrapper. I don't think I broke anything. Further work could exploit the new Markdown() interface (with options) to override default settings without pushing metadata through, but I believe the current state can already be merged upstream.

This allows MultiMarkdown.pl to be used as a module and still find the correct
ASCIIMathML.pm.

As a first step, all modules are moved to lib and the MultiMarkdown.pl
executable is renamed to lib/MultiMarkdown.pm, while bin/MultiMarkdown.pl
is replaced by a simple wrapper that just invokes the MultiMarkdown module.
In the process, simplify the MMDPath detection logic.

Bring the command-line parsing back into the MultiMarkdown.pl
executable, and export the Markdown method from the module, making it
accept options that override global settings.

This allows us to make the settings overridable via the options passed
to Markdown() in a much simpler way than the clumsy

    if (defined $opts{somesetting}) {
        $g_somesetting = $opts{somesetting};
    }

that would need to be repeated for every setting.
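
One way to avoid the per-setting check is to keep the defaults in a hash and merge the caller's options over it in a single loop. This is only a sketch of the idea, not the code from the commit: the setting names and the `effective_settings` helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical defaults; the real setting names live in the module.
my %defaults = (
    empty_element_suffix => ' />',
    tab_width            => 4,
);

# Merge caller-supplied options over the defaults in one pass,
# instead of writing one "if (defined ...)" block per setting.
sub effective_settings {
    my (%opts) = @_;
    my %settings = %defaults;
    for my $key (keys %opts) {
        next unless exists $defaults{$key};   # ignore unknown options
        next unless defined $opts{$key};      # undef means "keep the default"
        $settings{$key} = $opts{$key};
    }
    return %settings;
}

my %s = effective_settings(tab_width => 8, bogus => 1);
print "$s{tab_width}\n";   # prints 8
```

Unknown keys are silently dropped here; a stricter variant could warn instead.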

Setting up the tagged extractor used by _HashHTMLBlocks outside of the
sub shaves off some runtime when processing is repeated.

The search & replace in _DoDefinitionLists takes a lot of time even if
no replacement is being done. Optimize by bailing out early if no line
looking like a definition is found.
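
The early bail-out can be sketched as a cheap regexp test placed in front of the expensive substitution. The function below is a simplified, hypothetical stand-in for _DoDefinitionLists (it handles only one term followed by one definition line), and the pattern used to recognise a definition line is an assumption.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for the definition-list pass.
sub do_definition_lists {
    my ($text) = @_;

    # Cheap pre-check: a definition line starts with up to three spaces,
    # a colon, and whitespace. If there is none, skip the costly rewrite.
    return $text unless $text =~ /^[ ]{0,3}:[ \t]+\S/m;

    # Simplified rewrite: a term line followed by one ": definition" line.
    $text =~ s{^(\S[^\n]*)\n[ ]{0,3}:[ \t]+([^\n]+)$}
              {<dl>\n<dt>$1</dt>\n<dd>$2</dd>\n</dl>}mg;
    return $text;
}

print do_definition_lists("Apple\n: A fruit\n");
```

For documents without definition lists, only the single pre-check match runs, which is where the saving would come from.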

Skip the block tag hashing if there are no block tags.

This fixes an issue where a heading follows a list with a single
item containing a block quote: the heading would be absorbed by the
block quote and fail to get expanded (a similar failure happened with
standard Markdown, where the heading would get expanded but still
appear within the blockquote).

This brings the module interface on par with Text::MultiMarkdown from
CPAN, except for some extra parameters offered by that module.

Merge the Text::MultiMarkdown work from Doran. This includes adding new
switches to disable MMD enhancements, including the documentation from
Text::MultiMarkdown, and minor changes to function names to align with
the Text::MultiMarkdown source (ease of diff).
Also move ASCIIMathML into Text/, making it easier to find.

This allows the final newline at the end of code blocks to be
customized. The default is to have no newline (as per MMD), but
it can be set to "\n" to emulate classic Markdown.
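
As an illustration of the configurable trailing newline, here is a minimal sketch. The `render_code_block` helper and the `final_newline` option name are hypothetical and are not taken from the commit; only the default (no newline) and the "\n" alternative come from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap already-encoded code in <pre><code>, placing a
# configurable string just before the closing tags.
sub render_code_block {
    my ($code, %opts) = @_;
    my $final_newline = defined $opts{final_newline} ? $opts{final_newline} : '';
    $code =~ s/\n+\z//;    # drop trailing newlines from the block body
    return "<pre><code>" . $code . $final_newline . "</code></pre>";
}

print render_code_block("say 1;\n"), "\n";                          # MMD style
print render_code_block("say 1;\n", final_newline => "\n"), "\n";   # classic Markdown style
```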

Processing those blocks early introduces other bugs against the
test suite; we have to find a better solution.

Go back to Markdown-compatible output. Find a test case where this
effectively breaks the HTML parser.

This ensures that e.g. a blockquote following them is properly
recognized.

The original Markdown implementation supports "running blockquotes": if
any line in a paragraph starts with the '>' character, that line and all
the subsequent ones are split from the paragraph and become a
blockquote. This is inconsistent with the list behavior (lists don't
start mid-paragraph).

Additionally, if a blockquote happens within a non-block list item (e.g.
a standalone item or an item in a sequence of items not separated by
empty lines), mismatched markup is generated, with interleaved
'blockquote' and 'li' tag pairs because Markdown starts thinking it's in
span mode, and then reparses the span-mode output in block mode.

Blockquote-in-list detection is solved by letting the list item
processor check for existence of >-starting lines in the whole item.
Since this is inefficient, we allow the user to disable running
blockquotes, in which case blockquotes cannot start mid-paragraph
(consistently with the list behavior) and the blockquote-in-list
detection is much more efficient.
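
The difference between the two modes can be sketched on a single paragraph. The `split_running_blockquote` helper below is hypothetical and much simpler than the real block parser; it only shows where the split would happen with running blockquotes enabled or disabled.

```perl
use strict;
use warnings;

# Hypothetical helper: split one paragraph into (prose, quote) parts.
# With running blockquotes enabled, the first line starting with '>' and
# every line after it are split off. When disabled, only a paragraph that
# begins with '>' counts as a blockquote.
sub split_running_blockquote {
    my ($para, $running) = @_;
    my @lines = split /\n/, $para;
    return ($para, '') unless @lines;

    my $quote = qr/^[ ]{0,3}>/;
    if (!$running) {
        # Only the first line needs checking, hence the cheaper detection.
        return $lines[0] =~ $quote ? ('', $para) : ($para, '');
    }
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ $quote;
        return (join("\n", @lines[0 .. $i - 1]),
                join("\n", @lines[$i .. $#lines]));
    }
    return ($para, '');
}

my ($prose, $quoted) = split_running_blockquote("text\n> quoted\nmore", 1);
print "prose: $prose\nquote: $quoted\n";
```

With the second argument false, the same input stays a single paragraph, which matches the list behavior described above.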

Running lists are enabled by passing a "leading" regexp as the running_lists
option to MMD. In this case, a mid-paragraph line starting with a list
item (ordered or unordered) and preceded by a line ending with the
leading regexp will switch the paragraph to list mode.
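
A rough sketch of the rule, under the assumption that the leading regexp is matched against the end of the previous line. The `running_list_start` helper and the simplified list-marker pattern are hypothetical; only the running_lists option name comes from the description above.

```perl
use strict;
use warnings;

# Return the index of the first line where a running list would start,
# or -1 if there is none. $leading is the regexp that the previous line
# must end with (optionally followed by trailing whitespace).
sub running_list_start {
    my ($para, $leading) = @_;
    my @lines  = split /\n/, $para;
    my $marker = qr/^[ ]{0,3}(?:[*+-]|\d+[.])[ \t]+/;   # simplified list marker
    for my $i (1 .. $#lines) {
        next unless $lines[$i] =~ $marker;
        return $i if $lines[$i - 1] =~ /(?:$leading)\s*\z/;
    }
    return -1;
}

my $text = "Shopping list:\n- eggs\n- milk";
print running_list_start($text, qr/:/), "\n";   # prints 1
```

Here a colon is used as the leading regexp, so a list item only interrupts the paragraph when the line before it ends with a colon.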