HTML characters should not convert to symbol when editing as HTML #22337
Comments
The entities should never be converted to characters. The database should contain entities. The browser will show the entities correctly when showing as HTML.
This is not a simple fix. All HTML entities correspond to characters that can be written directly in UTF-8 (see https://dev.w3.org/html5/html-author/charref), but most of the entities are understood only by web browsers. However, the post content (or the editor "output") may be used in other places, like RSS feeds, emails, etc. The "htmlspecialchars" entities are the only ones required for XML/HTML and are generally understood everywhere. Storing other entities in the DB would probably cause some backwards compatibility issues and affect several other WP components: Formatting, Charset, perhaps Database, and possibly others.
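To illustrate the distinction drawn above, here is a minimal sketch that escapes only the characters required for XML/HTML and leaves everything else as a literal character. It uses Python's `html.escape` as a stand-in for PHP's `htmlspecialchars`; the sample string is made up, and this is not WordPress code.

```python
import html

# Made-up post content: special characters plus an em dash (U+2014).
content = 'Fish & chips <em>"fresh"</em> \u2014 daily'

# html.escape rewrites only &, <, > and (with quote=True) the two quote characters.
# The em dash is not touched and stays a literal UTF-8 character.
escaped = html.escape(content, quote=True)

print(escaped)
```

The em dash passes through unchanged, which is why content stored this way stays readable in consumers that know only the basic XML entities, such as feed readers.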
Hi, I would like to describe the issue from an accessibility point of view. What does that have to do with entities? Well, I have trouble visually distinguishing left/right single/double quotes, em/en dashes, etc., so I type the entity (I use the numbered entities, as I do a lot of work in XML, where HTML entities aren't defined) because it is easier for me to distinguish them when proofreading. But in WordPress they get converted, so when proofreading I have trouble determining whether the wrong combination came out of my fingers.
This issue is also preventing me from using non-BMP Unicode (including emoji) at all. If I try to save a draft with such a character, I get the error "Updating failed. Could not update post in the database." I believe this is because my MySQL database[1] uses utf8mb3, which can only store BMP characters (which excludes many characters, such as “🛈”, and most emoji). So I tried to enter the HTML entity instead (as demonstrated by OP), but the editor automatically replaces it with the Unicode character, thwarting my attempt to use entities as a workaround (and I haven't found any other workaround). [1] I'm on a hosted solution (EasyWP) where I'm not sure that I can change the database/table/column character sets. Even if I could, I still think this behavior in WP should be addressed.
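The workaround attempted above can be sketched as follows: replace every code point outside the BMP (above U+FFFF, which is what utf8mb3 cannot store) with a numeric character reference, and leave BMP characters alone. The helper name `encode_non_bmp` is hypothetical and this is not WordPress code; per the report, the editor would currently decode such references again before saving.

```python
def encode_non_bmp(text: str) -> str:
    """Replace code points above U+FFFF with hexadecimal character references."""
    return "".join(
        f"&#x{ord(ch):X};" if ord(ch) > 0xFFFF else ch
        for ch in text
    )

# U+1F6C8 is the circled information sign mentioned above; U+2014 is an em dash (BMP).
sample = "Info \U0001F6C8 \u2014 done"

print(encode_non_bmp(sample))
```

Only the non-BMP character is rewritten, so the result contains nothing that needs more than three bytes per character in UTF-8.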
Describe the bug
Carrying over an issue from here: https://wordpress.org/support/topic/editor-entities-in-text-mode-copy-paste-in-visual-mode/. This issue focused on the Classic Editor, but it seems to be the case with Gutenberg as well.
When editing as HTML and typing the HTML entity for a symbol (e.g. &mdash;, the entity for —), it gets converted to the symbol. However, when typing the entity for & (&amp;), that does not get converted. We should not convert any of them while editing as HTML.
To reproduce
Steps to reproduce the behavior:
Expected behavior
While editing as HTML, the characters should not convert.
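The asymmetry in the report (the dash entity is converted, the ampersand entity is not) is what a decode-then-re-escape round trip would produce. The sketch below shows that mechanism in Python; `round_trip` is a hypothetical helper, and whether the editor works exactly this way is an assumption, not something the report confirms.

```python
import html

def round_trip(source: str) -> str:
    """Hypothetical editor round trip: decode every entity, then re-escape only &, <, >."""
    decoded = html.unescape(source)            # "&mdash;" becomes the literal dash
    return html.escape(decoded, quote=False)   # only &, < and > are turned back into entities

print(round_trip("&mdash; and &amp;"))
```

Under this assumption, the entity for & survives only because it is re-created on the way out, while every other entity is lost after decoding.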
Screenshots
Editor version (please complete the following information):
Possibly related to: #13860