Lack of documentation. #110

ruipgil · 2014-07-22T12:28:20Z

The documentation is lacking; even the example in the README is outdated or poorly explained.

aredridel · 2014-07-22T15:24:05Z

Yes indeed! Needs some TLC.

Any aspects you want to see first?

ruipgil · 2014-07-22T19:06:29Z

Since the project is widely used with JSDOM, an annotated JSDOM example should always be kept up to date.
Also, you could use tests as examples to make sure everything works. These would be usage tests rather than unit tests, and with them you would only need to point people to the example's source code.

eGavr · 2014-08-06T16:09:20Z

Can you give a real working example of using your tool in Node.js without jQuery?

aredridel · 2014-08-09T03:17:53Z

The v1.0.1 README now has an example.

eGavr · 2014-08-09T07:32:38Z

Thank you, but that does not seem like a simple example. Why is it so difficult? That's a lot of code for such a simple task...

Can I do something like this:

var parser = require('parse5');
var html = '<p>blah</p>';

console.log(parser.parse(html));

and after console.log receive the full DOM tree?

danyaPostfactum · 2014-08-09T08:53:37Z

The html5 module does not include a DOM implementation, so you have to provide one.
If you just need a DOM tree:

var HTML5 = require('html5');
var jsdom = require('jsdom');

var DOMImplementation = jsdom.level(3).DOMImplementation;
var parser = new HTML5.DOMParser(new DOMImplementation());

var document = parser.parse('<p>I am a very small HTML document</p>');

console.log(document.getElementsByTagName("p")[0].textContent);

Also, take a look at SAXParser:

var HTML5 = require('html5');

var parser = new HTML5.SAXParser();

parser.contentHandler = {
    startDocument: function() {},
    endDocument: function() {},
    startElement: function(uri, localName, qName, atts) {
        console.log('Start of <' + localName + '> element');
    },
    endElement: function(uri, localName, qName) {
        console.log('End of <' + localName + '> element');
    },
    characters: function(ch, start, length) {
        console.log('Characters: ' + ch);
    }
};

parser.parse('<p>I am a very small HTML document</p>');

Start of <html> element
Start of <head> element
End of <head> element
Start of <body> element
Start of <p> element
Characters: I am a very small HTML document
End of <p> element
End of <body> element
End of <html> element

eGavr · 2014-08-09T09:05:38Z

Great! I think SAXParser is exactly what I need!

BUT!

<p>I am a very small HTML document</p>

Where are the html, head, etc. elements in the input?

Can I get information about exactly what was in the input?

danyaPostfactum · 2014-08-09T10:02:49Z

Where are the html, head, etc. elements in the input?

The parser creates all of these elements as the HTML spec requires (browsers do the same).
You can use the fragment parsing algorithm instead:

parser.parseFragment('<p>I am a very small HTML document</p>', 'body');

Fragment parsing was broken. I just fixed it, so you need to pull the latest change (I'm still not sure I fixed the bug properly).

Can I get information about exactly what was in the input?

No, you receive repaired, well-formed output. This parser may create, reject, or reparent elements, and so on, as the HTML5 parsing specification requires.

eGavr · 2014-08-09T11:01:02Z

var HTML5 = require('html5');

var parser = new HTML5.SAXParser();

parser.contentHandler = {
    startDocument: function() { console.log('!!!!'); },
    endDocument: function() { console.log('????'); },
    startElement: function(uri, localName, qName, atts) {
        console.log('qName == ', qName);
        console.log(atts);
        console.log('Start of <' + localName + '> element');
    },
    endElement: function(uri, localName, qName) {
        console.log('End of <' + localName + '> element');
    },
    characters: function(ch, start, length) {
        console.log('Characters: ' + ch);
    }
};

parser.parseFragment('<p>I am a very small HTML document</p>', 'body');

This code doesn't work! I'm sorry ) Probably there is a silly mistake I haven't noticed! Can you help me?

eGavr · 2014-08-09T11:03:10Z

Can you give some information about contentHandler?
So far I know about startDocument, endDocument, startElement, endElement, and characters!

Is there anything else?

danyaPostfactum · 2014-08-09T11:17:27Z

This code works. You need to pull this fix: 4ff67be
It is not available via npm yet.

Is there anything else?

No. There is also a lexicalHandler, which can handle comments, doctypes, and CDATA sections, but that feature is not implemented yet (it would be very easy to add).

eGavr · 2014-08-09T18:34:33Z

Are you going to do this? :)
And can you give an approximate release date for these changes?

I mean, it would be great if you could combine contentHandler and lexicalHandler into one handler!

That way, everybody would be able to build the DOM tree of the HTML however they want!

aredridel · 2014-08-10T01:28:32Z

Start an issue for 'em -- this one's about docs! -- and we'll go from there.

aredridel · 2014-08-10T01:31:25Z

And that fix is shipped in v1.0.3

danyaPostfactum · 2014-08-10T08:37:28Z

A lexical handler can now be defined:

parser.lexicalHandler = {
    comment: function(data) {
        console.log('Comment: ' + data);
    },
    startDTD: function(name, publicIdentifier, systemIdentifier) {
        console.log('Doctype: ' + name);
    },
    endDTD: function() {}
};

contentHandler is required, while lexicalHandler is optional.

http://www.saxproject.org/apidoc/org/xml/sax/ContentHandler.html
http://www.saxproject.org/apidoc/org/xml/sax/ext/LexicalHandler.html
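
On the earlier request for a single handler: since both handlers are plain objects with callback methods, one object could plausibly serve as both. The following is only a sketch, assuming the parser looks up nothing beyond the method names shown in this thread. createCombinedHandler and its events array are hypothetical names, and the callbacks at the bottom are invoked by hand so the snippet runs without the html5 package.

```javascript
// Hypothetical helper: one object carrying both the contentHandler
// and the lexicalHandler callbacks, logging every event in order.
function createCombinedHandler() {
    var events = [];
    return {
        events: events,

        // contentHandler callbacks (signatures as used in this thread)
        startDocument: function() { events.push('startDocument'); },
        endDocument: function() { events.push('endDocument'); },
        startElement: function(uri, localName, qName, atts) {
            events.push('start:' + localName);
        },
        endElement: function(uri, localName, qName) {
            events.push('end:' + localName);
        },
        characters: function(ch, start, length) {
            events.push('text:' + ch);
        },

        // lexicalHandler callbacks
        comment: function(data) { events.push('comment:' + data); },
        startDTD: function(name, publicIdentifier, systemIdentifier) {
            events.push('doctype:' + name);
        },
        endDTD: function() {}
    };
}

var combined = createCombinedHandler();

// Assumption: with the real parser the same object would be assigned twice:
//   parser.contentHandler = combined;
//   parser.lexicalHandler = combined;
// Here the callbacks are invoked by hand to simulate a tiny document.
combined.startDocument();
combined.startDTD('html', '', '');
combined.endDTD();
combined.comment(' note ');
combined.startElement('', 'p', 'p', null);
combined.characters('blah', 0, 4);
combined.endElement('', 'p', 'p');
combined.endDocument();

console.log(combined.events.join('\n'));
```

Whether the real parser accepts one shared object depends on how it calls the handlers, so treat this as something to try rather than documented behaviour.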

everybody would be able to build the DOM tree of the HTML however they want!

Right. With SAXParser, they can.
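
To make that concrete, here is a rough sketch of a contentHandler that assembles its own tree from the SAX callbacks. createTreeBuilder and replayEvents are hypothetical names, not part of html5. replayEvents stands in for parser.parse() by replaying the event sequence printed earlier in this thread, so the snippet runs without the package. The empty uri and null atts arguments are placeholders, because the thread does not show what the parser actually passes for them.

```javascript
// Hypothetical helper: a contentHandler that builds a tree of plain objects.
function createTreeBuilder() {
    var root = { name: '#document', attributes: null, children: [] };
    var stack = [root]; // currently open elements, innermost last

    return {
        root: root,
        startDocument: function() {},
        endDocument: function() {},
        startElement: function(uri, localName, qName, atts) {
            var node = { name: localName, attributes: atts, children: [] };
            stack[stack.length - 1].children.push(node);
            stack.push(node);
        },
        endElement: function(uri, localName, qName) {
            // Never pop the synthetic root node.
            if (stack.length > 1) {
                stack.pop();
            }
        },
        characters: function(ch, start, length) {
            // Assumes ch is the text itself, as the printed output
            // earlier in the thread suggests.
            stack[stack.length - 1].children.push(ch);
        }
    };
}

// Stand-in for parser.parse('<p>I am a very small HTML document</p>'):
// replays the events from the output shown earlier in the thread.
function replayEvents(handler) {
    handler.startDocument();
    handler.startElement('', 'html', 'html', null);
    handler.startElement('', 'head', 'head', null);
    handler.endElement('', 'head', 'head');
    handler.startElement('', 'body', 'body', null);
    handler.startElement('', 'p', 'p', null);
    handler.characters('I am a very small HTML document', 0, 31);
    handler.endElement('', 'p', 'p');
    handler.endElement('', 'body', 'body');
    handler.endElement('', 'html', 'html');
    handler.endDocument();
}

var builder = createTreeBuilder();
replayEvents(builder);
console.log(JSON.stringify(builder.root, null, 2));
```

With the real parser, the idea would be to assign builder to parser.contentHandler and call parser.parse(html) instead of replayEvents(builder).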

eGavr · 2014-08-10T21:44:51Z

Is it in v1.0.3?

aredridel · 2014-08-10T22:16:19Z

Yep.
