Check duplicate issues.
Description
[Windows] TRootBrowser cannot display non-ASCII filenames (GDK encoding + font fallback)
1. The Problem
On Windows platforms, TRootBrowser cannot correctly display filenames or directory paths containing characters outside the standard ASCII/Latin-1 range, including but not limited to Chinese, Japanese, and Korean (CJK) characters.
Instead of rendering properly, these filenames appear as garbled text (Mojibake). This is a long-standing limitation that affects both custom source builds and the official pre-built MSVC binaries on Windows.
I am aware that ROOT already features a new web-based GUI based on OpenUI5, but this encoding issue has existed for many years and continues to affect users who rely on the TRootBrowser on Windows.
2. Potential Root Cause Analysis
My local investigation suggests that this rendering failure stems from two interconnected issues within the legacy GDK graphics backend located under graf2d/win32gdk/gdk/src/gdk/win32/:
- Windows Encoding Context Mismatch (gdkim-win32.c):
Inside gdk_nmbstowchar_ts, the implementation relies on the standard C runtime function mbstowcs to convert multi-byte sequences into wide characters. On Windows, the behavior of mbstowcs is strictly tied to the system's active legacy ANSI code page (e.g., CP936/GBK on Chinese Windows, CP932/Shift-JIS on Japanese Windows).
This introduces an encoding management conflict: the backend attempts to decode the incoming string using the OS legacy code page, whereas the upper layers of ROOT pass down strings uniformly encoded in UTF-8. This mismatch causes mbstowcs to misinterpret the UTF-8 byte streams, resulting in corrupted wide-character output.
- Missing Font Fallback Mechanism (gdkfont-win32.c):
When rendering these characters, gdk_wchar_text_handle attempts to map each character to a font that covers its Unicode block. If the primary font has no exact match for that block, the font pointer is silently set to NULL. Consequently, any non-ASCII characters outside the primary font's immediate coverage are dropped entirely during the drawing cycle rather than substituted.
Additional context on encoding: ROOT's build system already includes config/root-manifest.xml.in, which explicitly declares <activeCodePage>UTF-8</activeCodePage>. This means ROOT officially expects the process to operate under a UTF-8 code page on Windows. Since strings entering the GDK backend are therefore expected to be UTF-8 already, a fix based on UTF-8 decoding (rather than relying on the system ANSI code page) is not only safe but also consistent with ROOT's own configuration intent.
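To make the mismatch concrete, here is a small sketch in plain C. It does not call the CRT or GDK. It only models the difference between reading the six UTF-8 bytes of 你好 as UTF-8 and reading them with a double-byte rule in the style of CP936. The function names and the simplified lead-byte rule (0x81..0xFE) are assumptions made for illustration, not the actual mbstowcs implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a UTF-8 buffer: every byte that is not a
   continuation byte (10xxxxxx) starts a new code point. */
size_t count_utf8_code_points(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t n = 0;
    for (size_t i = 0; i < len; ++i)
        if ((s[i] & 0xC0) != 0x80)
            ++n;
    return n;
}

/* Simplified model of a double-byte code page such as CP936 (assumption):
   a byte in 0x81..0xFE is treated as a lead byte and swallows the byte
   that follows it; any other byte is a single-byte character. */
size_t count_dbcs_chars(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t n = 0, i = 0;
    while (i < len) {
        if (s[i] >= 0x81 && s[i] <= 0xFE && i + 1 < len)
            i += 2;   /* lead byte plus trail byte */
        else
            i += 1;   /* single-byte character */
        ++n;
    }
    return n;
}
```

For the bytes E4 BD A0 E5 A5 BD, the UTF-8 reading yields two code points, while the double-byte reading pairs the bytes as E4BD, A0E5, A5BD and yields three unrelated characters, which is the kind of garbling described above.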
3. Proposed Fix
To resolve this without introducing massive architectural changes to the native graphics engine, I have tested two minimal patches within graf2d/win32gdk/gdk/src/gdk/win32/:
- In gdkim-win32.c: Replace the system-dependent mbstowcs call with an explicit, manual UTF-8 decoding loop. Since ROOT's manifest already sets the active code page to UTF-8, this ensures that incoming multi-byte filenames are always consistently decoded as UTF-8 into proper wide characters, bypassing the interference of the local Windows ANSI code page.
- In gdkfont-win32.c: Modify the fallback logic so that when an exact Unicode block match is not found, the system defaults to the first available font in the active font set instead of returning NULL. This guarantees that characters are at least rendered via a fallback font rather than being silently discarded.
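As a rough illustration of the first item, a manual UTF-8 decoding loop could look like the sketch below. It is not the tested patch itself: the function name utf8_to_utf16, its signature, and the choice to substitute U+FFFD for malformed input are assumptions. The output is UTF-16 code units, since wchar_t is 16 bits wide on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode UTF-8 into UTF-16 code units. Returns the number of units
   written to dst and stops early if dst is full. Malformed input is
   replaced by U+FFFD instead of aborting the conversion. */
size_t utf8_to_utf16(const unsigned char *src, size_t src_len,
                     uint16_t *dst, size_t dst_cap)
{
    assert(src != NULL || src_len == 0);
    assert(dst != NULL || dst_cap == 0);
    size_t i = 0, out = 0;
    while (i < src_len) {
        uint32_t cp;
        size_t extra;                 /* continuation bytes expected */
        unsigned char b = src[i++];

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; } /* stray byte */

        size_t need = extra;
        while (extra > 0 && i < src_len && (src[i] & 0xC0) == 0x80) {
            cp = (cp << 6) | (uint32_t)(src[i++] & 0x3F);
            --extra;
        }
        /* Reject truncated sequences, overlong forms, surrogates
           and values beyond U+10FFFF. */
        if (extra != 0 ||
            (need == 1 && cp < 0x80) ||
            (need == 2 && cp < 0x800) ||
            (need == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            cp = 0xFFFD;

        if (cp < 0x10000) {
            if (out + 1 > dst_cap) break;
            dst[out++] = (uint16_t)cp;
        } else {                      /* encode as a surrogate pair */
            if (out + 2 > dst_cap) break;
            cp -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return out;
}
```

Decoding the UTF-8 bytes of 你好.txt with this sketch yields the six units U+4F60, U+597D, '.', 't', 'x', 't'. On Windows, calling MultiByteToWideChar with CP_UTF8 would be an alternative to a hand-written loop; which of the two fits the GDK code better is left open here.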
Expected Behavior After Fix
After applying the proposed fixes, CJK filenames are correctly displayed in TRootBrowser and no longer appear as garbled text or missing characters.
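The font fallback change proposed for gdkfont-win32.c can be sketched independently of GDK. The FontEntry struct and pick_font function below are hypothetical stand-ins, and the real GDK font structures differ. The sketch only shows the intended lookup order: an exact range match first, then the first font in the set, and NULL only when the set is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one font of a font set together with the
   code point range it is registered for. */
typedef struct {
    const char *name;
    uint32_t first;   /* first code point covered */
    uint32_t last;    /* last code point covered */
} FontEntry;

/* Return the font whose range contains cp. If none matches, fall back
   to the first font of the set rather than NULL, so the caller still
   has something to draw with. */
const FontEntry *pick_font(const FontEntry *fonts, size_t count, uint32_t cp)
{
    assert(fonts != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        if (cp >= fonts[i].first && cp <= fonts[i].last)
            return &fonts[i];
    return count > 0 ? &fonts[0] : NULL;
}
```

With such a fallback the character is always handed to some font. Whether a real glyph or a placeholder box appears then depends on what that font and the Windows text layer can supply, which is in line with the "minimal, defensive" scope of the proposal.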
4. Notes
- This proposed fix does not aim to implement a perfect, comprehensive "best match font fallback" system; it is a minimal, defensive patch designed to ensure text visibility and prevent silent glyph dropping on Windows.
- These underlying GDK routines seem to have remained unchanged in the upstream Windows port for a long time, leading to consistent reproduction across multiple ROOT versions.
- I am more than happy to submit a Pull Request (PR) with these minimal changes if the maintainers agree that this is an acceptable approach for stabilizing the native Windows GUI experience.
Reproducer
Steps to Reproduce
- On a Windows machine, create a dummy file with non-ASCII characters in its name (e.g., 你好.txt) in any directory.
- Open Windows Terminal / Command Prompt, and launch ROOT with the web GUI explicitly disabled to force the legacy native interface.
- Open the traditional TBrowser from the ROOT prompt.
- Navigate to the directory where 你好.txt was created.
- Observed Behavior: The filename 你好.txt is not displayed correctly. It either appears as garbled text (Mojibake) or the non-ASCII characters are silently dropped in the tree and icon views.
ROOT version
This issue has been observed across multiple versions of ROOT (including v6.38.04 and v6.39.99) on Windows platforms.
Installation method
pre-built binary
Operating system
Windows 10 / 11 (Simplified Chinese Edition, System ANSI Code Page: CP936 / GBK)
Additional context
No response