Fulltextsearch crashes after a few hours #142

Open
Omaha2002 opened this issue Nov 11, 2021 · 2 comments

Comments

@Omaha2002

Fulltextsearch works at first, but after a few hours it crashes. I suspect a faulty file, but I don't know how to find it, because the logging only says:

OCA\FullTextSearch\Service\IndexService->updateDocument("*** sensiti ... *")

I can't find the file name in /var/log/elasticsearch/elasticsearch.log either.

Any suggestions?

@Omaha2002

Maybe fulltextsearch should skip a faulty file instead of crashing?

@teambvd

teambvd commented Feb 26, 2022

@Omaha2002 - This sounds like it may be related to Unicode / non-UTF-8 character issues, at least based on my experience with other sync applications (both times I hit something similar, emojis were the cause, though admittedly that wasn't with Nextcloud). Does your backing DB log anything at the time of the event, or immediately preceding it?

The most annoying thing about such issues is that, since the problem is with character interpretation, not only do we hit an ungraceful exit, but we can't log the failure properly either, because we can't interpret the very character we'd need to log. TO BE CLEAR - I'm not saying that's what's happening here, just referring to my past experience elsewhere.

Couple of things you might try -

  • Dump a list of all file and folder names (essentially all the paths) into a text file, files.txt
  • Search that output for non-ASCII characters (anything outside \x00-\x7F) with:
    grep --color='auto' -P '[^\x00-\x7F]' files.txt

It's dirty, but if the offending character is in the filename/path, that should find it. If that turns up nothing, the next thing I'd do is enable logging for all queries on the database, re-run the fulltext index, and once it crashes, check which file was queried just before.
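The filename check above can also be done in one pass without an intermediate files.txt. The following is only a rough Python sketch of the same idea, not anything Nextcloud or fulltextsearch ships: the function name `find_suspect_paths` and the example path are made up for illustration, and it assumes a Linux-style filesystem where names are stored as raw bytes. Unlike the grep, it also distinguishes names that are not valid UTF-8 at all from names that are merely non-ASCII (such as emojis).

```python
import os


def find_suspect_paths(root):
    """Yield (path_bytes, reason) for every file or folder name under root
    that is either not valid UTF-8 or contains non-ASCII characters."""
    # Passing a bytes root makes os.walk return raw bytes names, so names
    # that cannot be decoded are seen exactly as they are stored on disk.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                yield full, "invalid UTF-8"
                continue
            if not name.isascii():
                yield full, "non-ASCII"


# Example use (the path is a placeholder, adjust to your data directory):
# for path, reason in find_suspect_paths("/path/to/data"):
#     print(reason, path.decode("utf-8", "backslashreplace"))
```

Non-ASCII names are not necessarily broken, so treat the output as a list of candidates to look at first rather than a list of faulty files.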

The last option, if neither of the above works out for some reason, would be to search character by character through each and every file. I've got this crude VBScript which does that, but it only works on plain text files, so I'm not sure how helpful it'd be, especially as it wasn't written with anything like this in mind:
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile("C:\FSO\New Text Document.txt", 1)
    Do Until objFile.AtEndOfStream
        strCharacters = objFile.Read(1)
        Wscript.Echo strCharacters
    Loop
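
For anyone not on Windows, a comparable content check can be sketched in Python. This is an illustration under assumptions, not a tested tool: the name `scan_file` is hypothetical, it reads each file fully into memory (so it suits modest file sizes), and like the VBScript it is only meaningful for plain text files. Instead of echoing every character, it reports where the non-ASCII bytes are and where UTF-8 decoding first fails, if it does.

```python
def scan_file(path):
    """Return (non_ascii_offsets, first_bad_offset) for one file.

    non_ascii_offsets: byte offsets of every byte above 0x7F.
    first_bad_offset:  byte offset where UTF-8 decoding first fails,
                       or None if the whole file is valid UTF-8.
    """
    with open(path, "rb") as f:
        data = f.read()

    # Iterating over bytes yields integers, so b > 0x7F means "not ASCII".
    non_ascii_offsets = [i for i, b in enumerate(data) if b > 0x7F]

    try:
        data.decode("utf-8")
        first_bad_offset = None
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first undecodable byte.
        first_bad_offset = exc.start

    return non_ascii_offsets, first_bad_offset
```

A file with an emoji in it will show non-ASCII offsets but no bad offset; a file with a genuinely broken byte sequence will show both.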

EDIT: For the record, the 🍇 (grapes emoji) was behind my most recent run-in with this. We were backing up / archiving emails as flat .eml files. It worked fine for everyone except one end user, and we couldn't figure out why their backups kept failing over and over. It turned out they had one email (out of 480k) with that emoji in the subject line (which was the file name in this case).
