Ensure SnippetsTrees are written with file = unix filename #7

Open
tarrow opened this issue Apr 18, 2016 · 2 comments

Comments


tarrow commented Apr 18, 2016

results.xml contains lines like:

<results>
<result pre="ric multi-attribute utility values " name0="exclude" value0="exclude" post="important domains and non-health outcomes, while p" xpath="/*[local-name()='html'][1]/*[local-name()='body'][1]/*[local-name()='div'][1]/*[local-name()='div'][3]/*[local-name()='div'][1]/*[local-name()='p'][1]"/>
</results>

A snippetsTree is an element that contains one or more results elements.
A projectSnippetsTree is an element that contains snippetsTree elements, one for each paper that is addressed.

However, we build snippetsTrees directly from results. Indeed, the current code in SnippetsTree.java relies on them being saved in a file called exactly results/pluginname/option/results.xml (see line 107). This doesn't make sense, because a snippetsTree written to file gets a name of the form plugin.option.snippets.xml, which makes it impossible to read a snippetsTree back in from a file and have it as a valid object.

I think this shows where we've introduced two different functions of the ami code that should be more strongly decoupled: 1) mining information from papers and 2) formatting it for human reading.

A machine doesn't really need either the snippetsTree or the projectSnippetsTree. We should probably stop making these (including in the case where they contain post-processed data from the mine, such as word counts) and leave that to a tool further down the line.


tarrow commented Apr 18, 2016

I misunderstood this. We don't actually get the filename from the name of the file; it is an XML attribute in the snippetsTree. This attribute is evidently written in a platform-dependent format, which I'll now track down and try to fix.
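
For illustration, here is a minimal sketch of how a path could be normalized to Unix-style separators before being written into that attribute. The class and method names (`UnixFilenames`, `toUnixPath`) are hypothetical and do not come from the ami code, and the sketch assumes the stored paths never contain a literal backslash as part of a file name.

```java
// Hypothetical helper, not part of the ami code base.
final class UnixFilenames {
    private UnixFilenames() {}

    /**
     * Replace Windows-style backslash separators with forward slashes so the
     * value written to the XML attribute is the same on every platform.
     * Assumption: no file name legitimately contains a backslash.
     */
    static String toUnixPath(String path) {
        if (path == null) {
            return null;
        }
        return path.replace('\\', '/');
    }

    public static void main(String[] args) {
        // Example attribute value as it might be written on Windows.
        String windowsStyle = "results\\pluginname\\option\\results.xml";
        System.out.println(toUnixPath(windowsStyle));
    }
}
```

Paths that already use forward slashes pass through unchanged, so the same call would be safe on every platform.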

tarrow changed the title from "SnippetsTree and Results are conflated" to "Ensure SnippetsTrees are written with file = unix filename" on Apr 18, 2016

tarrow commented Apr 29, 2016

We probably don't need to do this if all of the logic is migrated into wanda. ProjectSnippetsTrees are probably the only summary that we will write. They do still have the file names in them, but this is probably not necessary. The file names are only there because without them we don't know which plugin wrote the results file (this should really be stored in the results element itself).
