[Question] How to index custom NamedBlobFile? #240

Open
NicolasGoeddel opened this issue Oct 29, 2019 · 4 comments

Comments

@NicolasGoeddel

Hi,

what do I have to do to index a plone.namedfile.field.NamedBlobFile field in my own content schema, which is derived from plone.supermodel.model.Schema? In this case the file will be a PDF. Because the field name is simply pdf, I added the line

<field name="pdf"        type="text"     indexed="true"  stored="true" />

to schema.xml and restarted Solr. I thought that was all, and that the PDF would be reindexed whenever the content of the field changes. That does not seem to be the case, so I probably do not understand the relationship between schema.xml and the content fields of Plone.

Here is the simplified schema where I want the file in pdf to be globally searchable.

from plone.supermodel import model
from zope import schema
from plone.namedfile.field import NamedBlobFile

class IIndexedContent(model.Schema):
    title = schema.TextLine(
        title=u'Titel',
        required=True,
    )
    pdf = NamedBlobFile(
        title=u'PDF',
        required=False,
    )

Thank you!

I am using Plone 5.2-rc2, Python 3.6, collective.solr 8.0.0a1

@NicolasGoeddel
Author

I found out that collective.solr.solr.SolrConnection.add() builds the XML request for Solr. It does pick up the pdf field, but the value it sends is just the string representation of the file object, something like this:

<field name="pdf" update="set">&lt;plone.namedfile.file.NamedBlobFile object at 0x7f0c24891048 oid 0x1e33 in &lt;Connection at 7f0c27916b38&gt;&gt;</field>

How can I add a handler for this kind of file field so that it does the same thing as the BinaryAdder in indexer.py?
I could hack something together that might work, but I hope there is already a way to register my own handlers or something similar.
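
To illustrate why the field above ends up with an object address instead of the PDF text, here is a minimal, self-contained sketch. It does not use collective.solr or plone.namedfile: `FakeNamedBlobFile` and `render_field` are hypothetical stand-ins, and the assumption is only that the value is turned into a string and XML-escaped without any text extraction.

```python
from xml.sax.saxutils import escape


class FakeNamedBlobFile:
    """Hypothetical stand-in for plone.namedfile.file.NamedBlobFile."""

    def __init__(self, data, filename):
        self.data = data
        self.filename = filename


def render_field(name, value):
    """Render one <field> element naively, using str(value) as the text."""
    return '<field name="%s" update="set">%s</field>' % (name, escape(str(value)))


blob = FakeNamedBlobFile(b"%PDF-1.4 ...", "report.pdf")
xml = render_field("pdf", blob)
# The element contains the escaped default repr of the object,
# not anything derived from blob.data.
print(xml)
```

Nothing in this path ever reads `blob.data`, which is why a dedicated handler for binary content is needed.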

@NicolasGoeddel
Author

I now understand how the DefaultAdder and its subclasses, such as BinaryAdder, work. They are chosen based on the portal_type of the content object being indexed. I think it would be a nice idea to have something similar for field types.
Alternatively, would it be possible to create some kind of decorator for Dexterity schema definitions that automatically picks the right data extractor when a field is binary or similar?
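
A rough sketch of that selection logic, as I understand it, in plain Python. This is not the collective.solr code: the `ADDERS` dict, `lookup_adder` and the `describe` method are hypothetical and only mimic a lookup keyed by portal_type with a fallback to a default adder.

```python
from types import SimpleNamespace


class DefaultAdder:
    """Hypothetical default: field values are sent to the index as-is."""

    def __init__(self, context):
        self.context = context

    def describe(self):
        return "default"


class BinaryAdder(DefaultAdder):
    """Hypothetical adder that would extract text from binary content."""

    def describe(self):
        return "binary"


# Registry keyed by portal_type; only "File" has a special adder at first.
ADDERS = {"File": BinaryAdder}


def lookup_adder(context):
    """Pick the adder registered for the portal_type, else the default."""
    factory = ADDERS.get(context.portal_type, DefaultAdder)
    return factory(context)


file_obj = SimpleNamespace(portal_type="File")
custom_obj = SimpleNamespace(portal_type="IndexedContent")

before = lookup_adder(custom_obj).describe()  # no entry yet for this type
ADDERS["IndexedContent"] = BinaryAdder        # register under the type name
after = lookup_adder(custom_obj).describe()

print(lookup_adder(file_obj).describe(), before, after)
```

With a lookup like this, a custom type only gets the binary handling once something is registered under its own portal_type name; a per-field lookup would need a different key.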

@WhiteDiamondz

WhiteDiamondz commented Feb 4, 2021

Hey @NicolasGoeddel!
I was having trouble with exactly the same thing and came across your open issue.
I was wondering what the best solution was that you found for this scenario.
Following what you described, I looked at indexer.py and found the adapter declared for the Archetypes File type, like this:

<adapter
    factory=".indexer.DXFileBinaryAdder"
    for="Products.Archetypes.interfaces.IBaseObject"
    name="File"
    />

This registration makes sure that a File is indexed with the BinaryAdder, so we do not end up with a field like
<field name="pdf" update="set">&lt;plone.namedfile.file.NamedBlobFile object at 0x7f0c24891048 oid 0x1e33 in &lt;Connection at 7f0c27916b38&gt;&gt;</field>
which is something I had also noticed while trying to create a specific text field in my Solr schema.xml.

Is the best solution to declare an adapter in the configure.zcml file of collective.solr?
If so, since we want to tie it to our add-on, what would be the best way to proceed?
I am currently trying something like the code below:

<adapter
    factory=".indexer.DXFileBinaryAdder"
    for="My.AddOn.interfaces.ICustomContentType"
    name="CustomContentType"
    />

@WhiteDiamondz

After declaring the adapter from my previous post in configure.zcml, it looks like I got the desired result!
Thanks for explaining and sharing your discoveries even though you had not received any answers. This helped a lot!
