Implement XPath on traits instead of concrete types #120
Comments
I guess the biggest motivation for doing this would be decoupling the XPath parser and evaluator from sxd-document so that they could be used with other backends too. I suspect the main motivation would be parsing huge documents that do not fit into memory? I don't think running XPath against arbitrary data is common enough to justify such a change by itself (an abstraction like this makes the code more complex), and I think having anyone who wants that build a document manually is the most reasonable decision. I wonder if this is the right approach, though: I don't know how feasible it is to evaluate XPath without a DOM, and I can see a lot of complications, since XPath queries can have both forward and backward dependencies. If there is already a library that provides this, or a specification of how it would have to be done properly, that would be really valuable.
The biggest I've heard of would be html5ever, which is indeed a DOM structure.
Having a "streaming XPath" is a truly interesting idea, but I'm not sure how one would go about it. As you mention:
It's definitely not possible for an arbitrary XPath to be applied in such a manner, so we'd have to either limit the input or determine if a given XPath is "streamable".
Agreed.
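As a rough illustration of what deciding whether an XPath is "streamable" could mean, here is a minimal sketch that only inspects the axes of a location path. The `Axis` enum, `is_stream_friendly`, and `is_streamable` are hypothetical names, not part of sxd-xpath, and the rule used (reject any axis that can reach nodes outside the context node's own subtree) is an assumption chosen to be conservative. Predicates are ignored entirely, even though they can also force lookahead.

```rust
/// The thirteen axes defined by XPath 1.0.
/// (Hypothetical type for this sketch; not the sxd-xpath API.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Axis {
    Ancestor,
    AncestorOrSelf,
    Attribute,
    Child,
    Descendant,
    DescendantOrSelf,
    Following,
    FollowingSibling,
    Namespace,
    Parent,
    Preceding,
    PrecedingSibling,
    SelfAxis,
}

impl Axis {
    /// Assumption: an axis is stream-friendly if it only selects the
    /// context node itself or nodes inside its subtree, so a single
    /// forward pass never has to revisit earlier input.
    pub fn is_stream_friendly(self) -> bool {
        matches!(
            self,
            Axis::Attribute
                | Axis::Child
                | Axis::Descendant
                | Axis::DescendantOrSelf
                | Axis::Namespace
                | Axis::SelfAxis
        )
    }
}

/// Conservative check over the axes of a location path, one per step.
/// Predicates are not modelled here.
pub fn is_streamable(steps: &[Axis]) -> bool {
    steps.iter().all(|axis| axis.is_stream_friendly())
}
```

Under this rule, a path like `/sheet/row/@id` (child, child, attribute) would pass, while `//cell/parent::row` would be rejected because of the `parent` step.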
If someone really did want to apply these against html5ever, I think the strongest path would be to spin up a branch that just wildly hacks this crate to work against those nodes. That would give very concrete ideas of what kind of abstraction is needed.
The C++ library Qt supports XQuery (and XPath) on classes that derive from QAbstractXmlNodeModel. http://doc.qt.io/qt-5/qabstractxmlnodemodel.html http://doc.qt.io/qt-5/xquery-introduction.html
A backend that can place cursors in enormous documents would allow this. Such a backend might keep indexes on nodes, as XML databases do.
Do you know of any other concrete implementations of that base model? I see
There is one for HTML documents: https://github.com/jgehring/qhtmlnodemodel Qt comes with an example for file trees: https://code.woboq.org/qt5/qtxmlpatterns/examples/xmlpatterns/filetree/filetree.cpp.html Here's a blog with the rationale for the use of an abstract node model: https://englich.wordpress.com/2007/11/15/query-your-toaster/
Cool, thank you! What was the specific use case that made you originally open this issue?
KDE has a few uses of it. One maps binary MS Office documents to a QAbstractXmlNodeModel. https://lxr.kde.org/ident?_i=QAbstractXmlNodeModel https://lxr.kde.org/source/playground/libs/binschema/cpp/msoxmlnodemodel.cpp
I was thinking of doing some XPath code and noticed quite a few XML implementations in Rust. Quite a few developers have started XML parsers and DOMs with different trade-offs. For each of them, adding XPath is quite a task. For developers who want to use XPath in Rust code, there's not much choice. My concrete use case at the time was working with gigabyte spreadsheets. I ended up parsing into a special struct and had to forgo the convenience of sxd-xpath.
sxd-xpath works with sxd-document. Data needs to be converted to an sxd-document DOM before an XPath can be run on it.
If sxd-xpath worked on traits, it could be used on any data structure that implements those traits.
The traits might look something like this:
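A minimal sketch of one possible shape, with every name hypothetical (`NodeKind`, `XPathNode`, `descendants_named`, and the toy `Entry`/`ToyNode` backend are illustrations, not taken from sxd-xpath or sxd-document). It assumes an evaluator needs only the node kind, a name, parent/child/attribute navigation, and a string value. Namespaces and document order are left out.

```rust
/// Node kinds from the XPath 1.0 data model (namespace nodes omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeKind { Root, Element, Attribute, Text, Comment, ProcessingInstruction }

/// Hypothetical trait: what an evaluator might ask of any backend.
pub trait XPathNode: Clone + PartialEq {
    fn kind(&self) -> NodeKind;
    fn local_name(&self) -> Option<String>;
    fn parent(&self) -> Option<Self>;
    fn children(&self) -> Vec<Self>;
    fn attributes(&self) -> Vec<Self>;
    fn string_value(&self) -> String;
}

/// Rough equivalent of `.//name`, written only against the trait.
pub fn descendants_named<N: XPathNode>(node: &N, name: &str) -> Vec<N> {
    let mut found = Vec::new();
    for child in node.children() {
        if child.kind() == NodeKind::Element && child.local_name().as_deref() == Some(name) {
            found.push(child.clone());
        }
        found.extend(descendants_named(&child, name));
    }
    found
}

/// Toy backend: a flat slice of entries linked by parent index.
#[derive(Debug)]
pub struct Entry {
    pub kind: NodeKind,
    pub name: Option<&'static str>,
    pub text: &'static str,
    pub parent: Option<usize>,
}

/// A node handle is just the slice plus an index into it.
#[derive(Debug, Clone, Copy)]
pub struct ToyNode<'a> {
    pub entries: &'a [Entry],
    pub index: usize,
}

impl<'a> PartialEq for ToyNode<'a> {
    fn eq(&self, other: &Self) -> bool {
        std::ptr::eq(self.entries, other.entries) && self.index == other.index
    }
}

impl<'a> ToyNode<'a> {
    /// Entries whose parent is this node, split into attributes and non-attributes.
    fn related(&self, want_attributes: bool) -> Vec<Self> {
        (0..self.entries.len())
            .filter(|&i| self.entries[i].parent == Some(self.index))
            .filter(|&i| (self.entries[i].kind == NodeKind::Attribute) == want_attributes)
            .map(|i| ToyNode { entries: self.entries, index: i })
            .collect()
    }
}

impl<'a> XPathNode for ToyNode<'a> {
    fn kind(&self) -> NodeKind { self.entries[self.index].kind }
    fn local_name(&self) -> Option<String> {
        self.entries[self.index].name.map(str::to_string)
    }
    fn parent(&self) -> Option<Self> {
        self.entries[self.index].parent.map(|i| ToyNode { entries: self.entries, index: i })
    }
    fn children(&self) -> Vec<Self> { self.related(false) }
    fn attributes(&self) -> Vec<Self> { self.related(true) }
    fn string_value(&self) -> String {
        // Own text, then the text of element and text-node descendants.
        let mut value = self.entries[self.index].text.to_string();
        for child in self.children() {
            if matches!(child.kind(), NodeKind::Element | NodeKind::Text) {
                value.push_str(&child.string_value());
            }
        }
        value
    }
}
```

The point of the toy backend is that `descendants_named` never mentions `ToyNode`. Any structure that can answer the six trait methods, such as a spreadsheet parsed into a custom struct, could be queried the same way without first being copied into a DOM.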