Indexing error due to zope acquisition #165

Open
sauzher opened this issue Feb 17, 2017 · 2 comments

Comments

@sauzher
Contributor

sauzher commented Feb 17, 2017

Suppose the Solr schema.xml defines an index named foo.
Suppose a number of objects in the Plone tree (e.g. with ids bar and baz) have a sibling (or ancestor) object with the id foo.

Solr raises an exception when indexing bar and baz:

Caused by: java.lang.IllegalArgumentException: Document contains at least one immense term in field="foo" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.

I suppose this is because foo is acquired in the context of bar and baz, so Solr receives the whole HTML page of foo as the value to push into the foo index field.

@mauritsvanrees
Member

Don't you have the same problem when you do this without Solr? I expect that when you add an index foo to the Plone portal_catalog and create the same situation, you run into similar problems. There may not be an actual error, but the foo HTML will still end up in the index. I am pretty sure I have seen this happen before.

I think the best solution would be to define your own indexer based on plone.indexer that catches this situation.
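
As a rough illustration of what such an indexer would need to catch, here is a minimal standalone sketch. It does not use Zope or Plone: `Wrapper`, `Item`, `unwrap` and `foo_index_value` are hypothetical names that only imitate acquisition in plain Python. In a real site, `aq_base` from the Acquisition package would play the role of `unwrap`, and the function would be registered through plone.indexer's `indexer` decorator.

```python
class Item:
    """Toy content object; attributes stand in for fields and child objects."""

    def __init__(self, id, **attrs):
        self.id = id
        self.__dict__.update(attrs)


class Wrapper:
    """Toy stand-in for an acquisition wrapper (not the real Acquisition API)."""

    def __init__(self, base, parent=None):
        self.base = base
        self.parent = parent

    def __getattr__(self, name):
        # Only called when normal lookup fails: try the wrapped object,
        # then fall back to the containment chain, as acquisition does.
        try:
            return getattr(self.base, name)
        except AttributeError:
            if self.parent is None:
                raise
            return getattr(self.parent, name)


def unwrap(obj):
    """Toy analogue of aq_base: return the object without its wrapper."""
    return obj.base if isinstance(obj, Wrapper) else obj


def foo_index_value(obj):
    """Return obj's own foo, or None when foo could only be acquired."""
    return getattr(unwrap(obj), "foo", None)


# A site containing a content object with id foo and a sibling bar.
site = Item("site")
site.foo = Item("foo", body="<html>whole page</html>")
site.bar = Item("bar")
bar = Wrapper(site.bar, parent=Wrapper(site))

naive = getattr(bar, "foo", None)  # acquired: the sibling object foo
safe = foo_index_value(bar)        # bar has no foo of its own

# An object that really defines foo still gets indexed normally.
baz = Wrapper(Item("baz", foo="own value"), parent=Wrapper(site))
own = foo_index_value(baz)
```

The point of the guard is that the lookup is done on the unwrapped object, so a value that exists only on a sibling or ancestor never reaches the index.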

Maybe collective.solr could catch situations like this, and truncate the value at 32766 (or a configurable length) before sending it to Solr. But the solution with your own indexer should help here as well.
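
A truncation step along those lines could look roughly like the sketch below. This is not existing collective.solr behaviour; `truncate_utf8` and `MAX_TERM_BYTES` are hypothetical names, and the limit is simply the number quoted in the Solr error message. Because the limit applies to the UTF-8 encoding, the cut is made on bytes rather than characters.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the Solr error message


def truncate_utf8(value, max_bytes=MAX_TERM_BYTES):
    """Shorten value so that its UTF-8 encoding fits in max_bytes."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # errors="ignore" drops a multi-byte character that was cut in half
    # at the boundary, so the result is always valid text.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


short = truncate_utf8("abc")
clipped = truncate_utf8("x" * 40000)
accented = truncate_utf8("é" * 5, max_bytes=5)
```

Truncating keeps the document indexable, but the field would still hold the wrong (acquired) value, so it complements rather than replaces a custom indexer.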

@sauzher
Contributor Author

sauzher commented Mar 27, 2017

Yes, the problem exists with portal_catalog too, but there it does not raise an error, and bar, baz, and all objects in the subtree are still indexed (with an anomalous value in the foo index). With c.solr these objects will be missing from the index entirely. That's my point.

Thanks.
