CommonMark testing #20
base: master
Conversation
Added -h/--help and --version flags. Changed the functionality of calling ink to make it more testable and to permit standard forms of input (pipes, redirections, etc.):
- `$ ink` reads from stdin
- `$ ink file.md` reads the contents of the file
- `$ ink -m string` parses the string directly
Sources/InkCLI/main.swift
```swift
exit(0)
}

var markdown = CommandLine.arguments[1]
if CommandLine.arguments.contains(where: { $0 == "--version" }) {
```
Suggested change:

```diff
- if CommandLine.arguments.contains(where: { $0 == "--version" }) {
+ if CommandLine.arguments.contains("--version") {
```
This is cleaned up in #19, and I'll merge those changes into this branch once it's finalized.
@john-mueller Would you like a review on this, or is it pending some other changes?
@john-mueller You can now reference my test generation experiment. I am doing a hack to parse the CommonMark spec in the same way as their Python test extractor. I am stuck, having failed three times to reproduce the normalization step, which they also perform with a custom Python HTML parser. My latest attempt uses SwiftSoup as the base parser. It is working, but it is "fixing" all the malformed-HTML tests. I may have to use it as a base and hand-fix the tests that are difficult to normalize.
Perhaps you could build a Python tool that pre-normalizes the JSON version of the CommonMark spec tests; I could then use that in my hacked XCTest generator.
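Such a pre-normalizer could be quite small. The sketch below is hypothetical: the `markdown`/`html` field names match the spec's JSON test dump, but the normalization rules shown are illustrative choices, not a reimplementation of the spec's official `normalize.py`.

```python
import json
import re
import sys

def normalize_html(html: str) -> str:
    """Simplified stand-in for the spec's normalizer: drop the newline
    emitted after a closing tag, then collapse remaining whitespace
    runs into single spaces."""
    html = re.sub(r"(</[A-Za-z][A-Za-z0-9]*>)\n", r"\1", html)
    return re.sub(r"[ \t\n]+", " ", html).strip()

def prenormalize(tests: list) -> list:
    # Rewrite each test case's expected HTML in normalized form,
    # leaving all other fields (markdown, example number, ...) intact.
    return [{**t, "html": normalize_html(t["html"])} for t in tests]

def main() -> None:
    # Usage: python prenormalize.py < spec-tests.json > normalized.json
    json.dump(prenormalize(json.load(sys.stdin)), sys.stdout, indent=2)
```

The XCTest generator could then run Ink's output through the same `normalize_html` and compare against the pre-normalized expectations with plain string equality.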
Ultimately, test coverage needs to live in Ink. I am now using my tool to single-step through Ink to find out why and where it does not grok. Then I add the test case to Ink with the CommonMark line reference and a new name, since the CommonMark tests do not have descriptive names. For now I am manually normalizing whitespace for the crazy test fragments. See my latest PR on Ink as an example.
@JohnSundell I've updated the file, and I think it would be beneficial to merge for people interested in improving test coverage. Let me know if you think anything should be changed, or if there's a better way to present this information.

The document no longer directly addresses whether we would set up some automated form of CommonMark validation through Danger, Bitrise, etc. While I still think that would be valuable, a text-based report would work better for now than a binary pass/fail CI step. For one thing, Ink passes slightly different tests under the CommonMark Python script and my native XCTest methods, even though Ink produces the same output for both and both use the same spec data; the HTML normalizers they use make slightly different choices. In addition, CommonMark test sections overlap (a test in the links section might also require indented code blocks to be implemented in order to pass). This can lead to tests that previously passed by accident suddenly failing, even though the test itself is not relevant until a future feature is implemented. If we do implement some automated testing, I think it would be good to simply output some text like
Then the author can check those specific newly failing tests to see whether they are relevant, or whether they are failing because of a completely separate unimplemented feature.
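For illustration, a newly-failing report like the one described might be generated from two pass/fail maps. The `{example_number: passed}` shape is an assumption for this sketch, not an existing format in either tool.

```python
def newly_failing(previous: dict, current: dict) -> list:
    """IDs of tests that passed in the previous run but fail in the current one."""
    return sorted(
        test_id
        for test_id, passed in current.items()
        if not passed and previous.get(test_id, False)
    )

def report(previous: dict, current: dict) -> str:
    broken = newly_failing(previous, current)
    if not broken:
        return "No newly failing CommonMark tests."
    lines = ["Newly failing CommonMark tests (check whether they are relevant):"]
    lines += [f"  example {test_id}" for test_id in broken]
    return "\n".join(lines)
```

Only tests that flip from pass to fail show up; tests that were already failing (e.g. because they depend on an unimplemented feature) stay out of the report.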
@steve-h Anything you'd add here? I've briefly looked at your tool, but haven't gotten a chance to actually clone and run it.
I have not succeeded with the normalization problem either. There are nasty differences: a newline and a blank are equivalent whitespace in the output, but the difference will still fail XCTest's string comparison. The CommonMark spec uses a lot of newlines in the way it defines the expected output. I wish they had pre-minimized the test answers.
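One blunt way around that class of failure, sketched here as an assumption rather than anything either tool currently does, is to squash every whitespace run to a single space on both sides before asserting equality:

```python
import re

def squash_ws(html: str) -> str:
    # Any run of spaces, tabs, or newlines counts as one space;
    # leading/trailing whitespace is ignored entirely.
    return re.sub(r"\s+", " ", html).strip()

def html_equal(expected: str, actual: str) -> bool:
    """Loose equality: a newline and a blank are the same whitespace."""
    return squash_ws(expected) == squash_ws(actual)
```

The obvious cost is that this also hides real differences inside `<pre>`/`<code>` blocks, where whitespace is significant, so those tests would still need hand-normalized expectations.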
My XCTests are actually passing more examples than the Python script (219 of 649 for XCTest, 214 of 649 for Python). The numbers are close enough that I think the XCTest suite is performing basically correctly, with HTML quibbles making up the difference. Either way, there is a useful broad-spectrum overview of which sections fail most of their tests versus only one or two.
I agree with trying SwiftSoup as the first-pass parser instead of the Python HTML parser. The manual cleanup problem is still there, though: SwiftSoup added a missing </a> tag that changed a test, so unfortunately it is interfering too much to be a strict replication of the test.
Yeah, in the short term I was envisioning using these tools to identify areas that need to be implemented and tested, and then writing an equivalent, named test for the actual merge. |
I had a brainwave today for my generator tool's normalization problem. I used the Swiftmarkdown library, which wraps cmark (GitHub flavoured), to generate the expected result HTML instead of using the HTML from the spec. Researching the cmark code showed me that it often emits a newline after a closing HTML element. So I then have an easy normalize of
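A guess at how stripping those post-closing-element newlines might look — purely illustrative, not the author's actual code:

```python
import re

# cmark often emits "\n" immediately after a closing HTML element
# (e.g. "</p>\n", "</li>\n"); removing exactly those newlines
# before comparing removes most of the layout noise while leaving
# all other whitespace untouched.
CLOSING_TAG_NEWLINE = re.compile(r"(</[A-Za-z][A-Za-z0-9-]*>)\n")

def drop_newline_after_close(html: str) -> str:
    return CLOSING_TAG_NEWLINE.sub(r"\1", html)
```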
I just tried this and got a few more passes; now only 455 failures "out-of-the-box" with my pending backslash escape change. |
I think this would be useful to merge, if only so people don't have to reinvent the wheel if they want to see areas where they could work on adding conformance. |
As mentioned in benchmarking.md, this PR is not ready to be merged at this stage; it is just here for discussion.

The CLI improvements are ready to be merged as PR #19.

I think this can be merged, just as documentation.