html

This package offers a simple HTML parser, motivated by a desire to query the DOM and extract information from it.

The current parser is NOT spec compliant, and is not guaranteed to work on all HTML input. This may change.

usage

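package main

// Relative import of the package from the directory above this file.
import html "../"
import "core:fmt"

main :: proc() {
	// Parse the input into a document tree.
	doc := html.parse("<html><ul><li>one</li><li>two</li><li>three</li></ul></html>")
	// Free the document's dynamic arrays when main returns.
	defer html.document_delete(doc)

	iter := html.node_iterator_from_document(doc)

	// Visit every node depth-first and print it.
	for node in html.node_iterator_depth_first(&iter) {
		fmt.println(html.node_to_string(node))
	}
}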

All strings on a Node are slices into the original input string, so the input must stay alive for as long as the document is in use. The dynamic arrays for the attributes and children are freed with html.document_delete.
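
To illustrate what slicing into the input means in practice, here is a minimal sketch in plain Odin. It does not use this package at all; the `input` and `text` names are made up for the example, and it only shows that a substring is a view over the same bytes rather than a copy.

```odin
package main

import "core:fmt"

main :: proc() {
	input := "<li>one</li>"

	// Slicing a string yields a view (pointer + length) over the same bytes;
	// nothing is copied or allocated.
	text := input[4:7]
	fmt.println(text)

	// Because `text` borrows from `input`, the input buffer has to stay alive
	// for as long as the slice is in use.
	fmt.println(len(text), len(input))
}
```

If a node string needs to outlive the input, cloning it (for example with `strings.clone` from `core:strings`) is one way to get an independent copy.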

roadmap

- record parse errors
- spec compliance
  - respect content model: e.g. special handling for <script>, <pre>, etc.
- stream in source data with a reader
- support unicode input instead of just ascii

potholes

Special tags like <script> are not handled specially. Such a tag is expected to have its inner HTML treated as raw text, but this version of the parser parses script content as HTML.
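
Until raw-text handling lands, one possible workaround is to pull script content out of the source yourself before parsing. The sketch below is an assumption about how that could look, not part of this package: `script_body` is a hypothetical helper, it only matches a lowercase `<script>` tag with no attributes, and it returns a slice of the input rather than a copy.

```odin
package main

import "core:fmt"
import "core:strings"

// script_body is a hypothetical helper, not part of this package.
// It returns the text between the first "<script>" and the next "</script>"
// untouched, which is how raw-text content is meant to be read.
script_body :: proc(src: string) -> (body: string, ok: bool) {
	open_tag :: "<script>"
	close_tag :: "</script>"

	start := strings.index(src, open_tag)
	if start < 0 {
		return "", false
	}
	start += len(open_tag)

	// Search for the closing tag only after the opening tag.
	end := strings.index(src[start:], close_tag)
	if end < 0 {
		return "", false
	}
	return src[start:start + end], true
}

main :: proc() {
	// The "<" inside the script is what an HTML-only parse would misread as a tag.
	src := "<script>if (a < b) { run() }</script>"
	if body, ok := script_body(src); ok {
		fmt.println(body)
	}
}
```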

JackMordaunt/odin-html
