
CommonMark testing #20

Open
wants to merge 8 commits into base: master
Conversation

@john-mueller (Contributor) commented Dec 1, 2019

As mentioned in benchmarking.md, this PR is not ready to be merged at this stage, just here for discussion. The CLI improvements are ready to be merged as PR #19.

I think this can be merged, just as documentation.

- Added `-h`/`--help` and `--version` flags
- Changed how `ink` is called, to make it more testable and to permit standard forms of input (pipes, redirections, etc.)

`$ ink` reads from stdin
`$ ink file.md` reads contents of file
`$ ink -m string` parses string directly
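The three input modes above can be sketched as a small dispatch function. This is only an illustration of the behaviour described in the PR; the names `Input` and `input(from:)` are invented here and are not Ink's actual CLI code:

```swift
import Foundation

// Hypothetical sketch of the input-source dispatch described above.
// `Input` and `input(from:)` are invented names, not Ink's real API.
enum Input: Equatable {
    case stdin                  // `ink` with no arguments
    case file(path: String)    // `ink file.md`
    case literal(String)       // `ink -m "# Title"`
}

func input(from arguments: [String]) -> Input {
    // arguments[0] is the executable name itself.
    if let flag = arguments.firstIndex(of: "-m"), arguments.indices.contains(flag + 1) {
        return .literal(arguments[flag + 1])
    }
    if arguments.count > 1 {
        return .file(path: arguments[1])
    }
    return .stdin
}
```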
@john-mueller john-mueller mentioned this pull request Dec 1, 2019
@JohnSundell JohnSundell added the depends on other PR This PR depends on another, unmerged PR. label Dec 1, 2019
```swift
exit(0)
}

var markdown = CommandLine.arguments[1]
if CommandLine.arguments.contains(where: { $0 == "--version" }) {
```

Suggested change:

```diff
- if CommandLine.arguments.contains(where: { $0 == "--version" }) {
+ if CommandLine.arguments.contains("--version") {
```

@john-mueller (Contributor, Author)

This is cleaned up in #19, and I'll merge those changes into this branch once it's finalized.

@JohnSundell (Owner)

@john-mueller Would you like a review on this, or is it pending some other changes?

@JohnSundell JohnSundell removed the depends on other PR This PR depends on another, unmerged PR. label Dec 5, 2019
@steve-h (Contributor) commented Dec 6, 2019

@john-mueller You can now reference my test generation experiment.
https://github.com/steve-h/InkTesting

I am doing a hack to parse the CommonMark spec, as is done with their Python test extractor. Then I was able to generate easy-to-regenerate XCTest versions that I can use as TDD code when working on Ink.

I am stuck, having failed three times at the normalization step, which they also perform with a custom Python HTML parser. My latest attempt uses SwiftSoup as the base parser; it works, but it is "fixing" all the malformed HTML tests.

I may have to use it as a base and hand-fix the tests that are difficult to normalize.

@steve-h (Contributor) commented Dec 6, 2019

Perhaps you could build a Python tool that pre-normalizes the JSON output of the CommonMark spec tests; then I could use that in my hacked XCTest generator.
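For reference, the spec's JSON dump (produced by `python test/spec_tests.py --dump-tests` in the CommonMark repository) is an array of objects with `markdown`, `html`, `example`, `start_line`, `end_line`, and `section` fields, which a generator could decode directly. A minimal sketch, not part of either tool:

```swift
import Foundation

// Sketch: decoding the CommonMark spec.json test dump with Codable.
// The field names match the spec dump; everything else is illustrative.
struct SpecTest: Decodable {
    let markdown: String
    let html: String
    let example: Int
    let section: String
}

let json = """
[{"markdown":"# Hi\\n","html":"<h1>Hi</h1>\\n","example":1,
  "start_line":1,"end_line":2,"section":"ATX headings"}]
"""
let tests = try JSONDecoder().decode([SpecTest].self, from: Data(json.utf8))
```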

@steve-h (Contributor) commented Dec 6, 2019

Ultimately, test coverage needs to live in Ink. I am now using my tool to single-step through Ink to find out why and where it does not grok. Then I add the test case to Ink with the CommonMark line reference and a new name, since the CommonMark tests do not have descriptive names.

I am manually normalizing whitespace for now for the crazy test fragments. See my latest PR on Ink as an example.

@john-mueller (Contributor, Author)

@JohnSundell I've updated the file, and I think it would be beneficial to merge for people interested in improving test coverage. Let me know if you think anything should be changed, or if there's a better way to present this information.

The document no longer directly addresses whether we would set up some automated form of CommonMark validation through Danger, Bitrise, etc. While I still think it would be valuable, I think a text-based report rather than a binary pass/fail CI step would work better for now.

For one thing, Ink passes slightly different sets of tests under the CommonMark Python script and my native XCTest methods, even though Ink produces the same output in both cases and both use the same spec data. This is because the two HTML normalizers make slightly different choices.

In addition, CommonMark test sections overlap (a test in the links section might also require indented code blocks to be implemented to pass). This can lead to tests that previously passed by accident suddenly failing, even though the test itself is not relevant until a future feature is implemented.

If we do implement some automated testing, I think it would be good to simply output some text like

> master: 191 passed, 458 failed, 0 errored, 0 skipped
> pr: 200 passed, 449 failed, 0 errored, 0 skipped
> 11 newly passing tests: 41 42 43 235 236 237 238 266 271 272 536 
> 2 newly failing tests: 45 46 

Then the author can check those specific newly failing tests to see if they are relevant or if they are failing because of a completely separate unimplemented feature.
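The report proposed above reduces to a set difference over the passing example numbers from the two runs. A minimal sketch, with made-up numbers standing in for real results:

```swift
// Sketch of the proposed master-vs-PR comparison; the sets of passing
// CommonMark example numbers here are made up for illustration.
let masterPassing: Set<Int> = [41, 44, 45, 46]
let prPassing: Set<Int> = [41, 42, 43, 44]

let newlyPassing = prPassing.subtracting(masterPassing).sorted()  // [42, 43]
let newlyFailing = masterPassing.subtracting(prPassing).sorted()  // [45, 46]

print("\(newlyPassing.count) newly passing tests: \(newlyPassing.map(String.init).joined(separator: " "))")
print("\(newlyFailing.count) newly failing tests: \(newlyFailing.map(String.init).joined(separator: " "))")
```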

@john-mueller (Contributor, Author)

@steve-h Anything you'd add here? I've briefly looked at your tool, but haven't gotten a chance to actually clone and run it.

@steve-h (Contributor) commented Dec 6, 2019

I have not succeeded at the normalization problem either. There are nasty differences: a single newline and a single space are equivalent whitespace in HTML, but the difference will still fail an XCTest string comparison. The CommonMark spec has a lot of newlines in the way the expected output is written; I wish they had pre-minimized the test answers.
There are so many failing tests right now that I have decided to start chopping trees instead of staring at the forest. I agree that at least acknowledging the existing tooling is worthwhile, and we need to chop half the forest before giving up on CommonMark or even submitting Ink to be in the test matrix.

@john-mueller (Contributor, Author)

My XCTests are actually passing more than the Python script (219 of 649 for XCTest, 214 of 649 for Python). The numbers are close enough that I think the XCTest suite is performing basically correctly, with HTML quibbles making up the difference. In either case, there is a useful broad spectrum overview of which sections fail most tests vs. one or two tests.
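That per-section overview could come from grouping results by the spec's `section` field. A sketch with an invented result type and invented data:

```swift
// Hypothetical per-section summary; the result type and data are invented
// for illustration, not taken from either test tool.
struct Result {
    let example: Int
    let section: String
    let passed: Bool
}

let results = [
    Result(example: 1, section: "Tabs", passed: true),
    Result(example: 2, section: "Tabs", passed: false),
    Result(example: 3, section: "Links", passed: false),
]

// Count passes and failures within each spec section.
let summary = Dictionary(grouping: results, by: { $0.section })
    .mapValues { group in
        (passed: group.filter { $0.passed }.count,
         failed: group.filter { !$0.passed }.count)
    }
```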

@steve-h (Contributor) commented Dec 6, 2019

I agree with trying SwiftSoup as the first parser instead of the Python HTML parser, but the manual cleanup problem is still there. I had SwiftSoup add a missing </a> tag that changed a test, so unfortunately it interferes too much to be a strict replication of the test suite.
Neither approach magically assigns a nicer test case name than the example number. I tried to compensate by putting the spec preamble in a comment when generating the XCTest, but it is still ugly.
Our solutions are very close right now, and I just published mine to show that we have a path to a CommonMark test suite, if that is useful.
I think it is still better to go blindly into battle with it and start porting at least the tests that help preserve real markdown behaviour. Waiting for people to complain about their markdown not parsing in a particular instance may be more work than just conforming more.

@john-mueller (Contributor, Author)

> Neither magically assigns a nice test case name other than example.

Yeah, in the short term I was envisioning using these tools to identify areas that need to be implemented and tested, and then writing an equivalent, named test for the actual merge.

@steve-h (Contributor) commented Dec 15, 2019

I had a brainwave today for my generator tool's normalization problem.

I used the SwiftMarkdown library, which wraps cmark (GitHub flavoured), to generate the expected result HTML instead of taking the HTML from the spec. Reading the cmark code showed me that it often emits a newline after a closing HTML element, so an easy normalization of `>\n<` to `><` gets rid of those.

```swift
let cMarkhtml = try! markdownToHTML(test.markdown, options: [.normalize, .noBreaks])
let normalizedcMark = cMarkhtml.replacingOccurrences(of: ">\n<", with: "><")
    .replacingOccurrences(of: "<hr />", with: "<hr>")
    .replacingOccurrences(of: "<br />", with: "<br>")
```

I just tried this and got a few more passes; now only 455 failures "out-of-the-box" with my pending backslash escape change.

@john-mueller (Contributor, Author)

I think this would be useful to merge, if only so people don't have to reinvent the wheel if they want to see areas where they could work on adding conformance.
