Occasionally, when processing articles, wallabag does not handle national characters properly.
The behavior is deterministic in the sense that for a given page it is always the same: the page is either processed properly or it isn't.
This URL corresponds to a page that is always processed incorrectly: error example
You will find that the article title is handled properly even though it contains national characters. For example, it contains the word: más
On the other hand, the article content is not handled properly. Very early in the text you can see, for example, that the word automóvil is wrong: garbled characters appear in place of the ó.
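For readers unfamiliar with this kind of garbling, here is a minimal Python sketch of one common way it can arise: UTF-8 bytes being decoded with a single-byte charset. This is only an assumption about the mechanism; nothing above confirms that this is what happens inside wallabag, and `misdecode` is a hypothetical helper written for illustration.

```python
def misdecode(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset.

    Hypothetical helper: it imitates a parser that assumes a single-byte
    charset for a page that was actually served as UTF-8.
    """
    return text.encode("utf-8").decode(wrong_encoding)


if __name__ == "__main__":
    # The accented letter takes two bytes in UTF-8, so it turns into
    # two unrelated characters when decoded as Latin-1.
    print(misdecode("automóvil"))
    print(misdecode("más"))
```

Each accented letter becomes two characters, so the garbled word is one character longer than the original.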
Surprisingly enough, some articles are handled properly. This URL from the same site contains national characters as well but is processed correctly: correct sample
I've done some research.
Looking into the Postgres tables where content is recorded, I see in the entry table that the content is already garbled there. Therefore it is not a matter of how the articles are rendered when they are displayed. The problem arises earlier, when parsing the article.
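If someone wants to check stored content for this kind of damage, a rough Python sketch follows. It assumes the garbling comes from UTF-8 bytes read as Latin-1, which is not confirmed for this case, and `looks_like_mojibake` is a hypothetical helper, not part of wallabag. It takes a string (for example, one copied from the entry table) and tests whether reversing that mistake yields different, valid text.

```python
def looks_like_mojibake(stored: str) -> bool:
    """Heuristic check: does the string look like UTF-8 read as Latin-1?

    Reverses the suspected mistake: re-encode as Latin-1 and decode as
    UTF-8. If that succeeds and changes the text, the stored value was
    probably mis-decoded. Correctly stored accented text normally fails
    the UTF-8 decode step and is reported as clean.
    """
    try:
        repaired = stored.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return repaired != stored


if __name__ == "__main__":
    print(looks_like_mojibake("autom\u00c3\u00b3vil"))  # garbled form
    print(looks_like_mojibake("automóvil"))             # correct form
```

This is only a heuristic: text damaged through a different charset (Windows-1252, for instance) may not be detected.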
I've tested the problem URL at the site f43.me and the problem is reproduced there. The text shows incorrect character combinations wherever national characters are present. When I enable debug on this site... well, no errors are reported. Curiously enough, the language is properly identified as es (which stands for Spanish).
Finally, I've enabled graby debug logs, collected them while the article was being parsed, and will attach them to this case.
As far as I can see, no error is reported, and the logs show that the contents have garbage characters... but moving from there to a possible solution is really beyond my capabilities.