Commit 5a20c85

Update the Unicode support FAQ documentation
1 parent 5bcd74b commit 5a20c85

1 file changed

+20
-14
lines changed

src/site/markdown/faq.md

Lines changed: 20 additions & 14 deletions
@@ -47,12 +47,26 @@ DLL" with release builds of Log4cxx and "Multithread DLL Debug" with debug build
 Yes. Apache Log4cxx exposes API methods in multiple string flavors supporting differently encoded
 textual content, like `char*`, `std::string`, `wchar_t*`, `std::wstring`, `CFStringRef` et al. All
 provided texts will be converted to the `LogString` type before further processing, which is one of
-several supported Unicode representations selected by the `LOG4CXX_CHAR` cmake option. If methods are
+several supported internal representations and is selected by the `LOG4CXX_CHAR` cmake option. If methods are
 used that take `LogString` as arguments, the macro `LOG4CXX_STR()` can be used to convert literals
-to the current `LogString` type. FileAppenders support an encoding property as well, which should be
-explicitly specified to `UTF-8` or `UTF-16` for e.g. XML files. The important point is to get the
-chain of input, internal processing and output correct and that might need some additional setup in
-the app using Log4cxx:
+to the current `LogString` type.
+
+The default external representation is controlled by the `LOG4CXX_CHARSET` cmake option.
+FileAppenders support an `Encoding` property allowing character set encoding control per appender.
+For example, you can use `UTF-8` or `UTF-16` when writing XML or JSON layouts.
+Log4cxx also implements character set encodings for `US-ASCII` (`ISO646-US` or `ANSI_X3.4-1968`)
+and `ISO-8859-1` (`ISO-LATIN-1` or `CP1252`).
+You are highly encouraged to stick to `UTF-8` for the best support from tools, API and operating systems.
+
+The `locale` character set encoding provides support beyond the above internally implemented options.
+It allows you to use any multi-byte encoding provided by the standard library.
+See also [some SO post](https://stackoverflow.com/questions/571359/how-do-i-set-the-proper-initial-locale-for-a-c-program-on-windows)
+on setting the default locale in C++.
+
+```
+std::setlocale( LC_ALL, "" ); /* Set locale for C functions */
+std::locale::global(std::locale("")); /* set locale for C++ functions */
+```
 
 According to the [libc documentation](https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html),
 all programs start in the `C` locale by default, which is the [same as ANSI_X3.4-1968](https://stackoverflow.com/questions/48743106/whats-ansi-x3-4-1968-encoding)
@@ -72,13 +86,5 @@ loggername - ?????????? ???? ??????????????
 The important thing to understand is that this is some always applied, backwards compatible default
 behaviour and even the case when the current environment sets a locale like `en_US.UTF-8`. One might
 need to explicitly tell the app at startup to use the locale of the environment and make things
-compatible with Unicode this way. See also [some SO post](https://stackoverflow.com/questions/571359/how-do-i-set-the-proper-initial-locale-for-a-c-program-on-windows)
-on setting the default locale in C++.
-
-```
-std::setlocale( LC_ALL, "" ); /* Set locale for C functions */
-std::locale::global(std::locale("")); /* set locale for C++ functions */
-```
+compatible with Unicode this way.
 
-See [LOGCXX-483](https://issues.apache.org/jira/browse/LOGCXX-483) or [GHPR #31](https://github.com/apache/logging-log4cxx/pull/31#issuecomment-668870727)
-for additional details.
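
A note on the two-line locale snippet in the FAQ text: `std::locale("")` throws `std::runtime_error` when the environment names a locale the system does not provide, so an application may want a fallback at startup. The following is a minimal sketch under that assumption. The helper name `useEnvironmentLocale` is hypothetical and is not part of Log4cxx; it uses only the C++ standard library.

```cpp
#include <clocale>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Log4cxx API): adopt the locale of the
// environment and fall back to the classic "C" locale if it is unavailable.
inline std::string useEnvironmentLocale()
{
    // Set the locale for C functions; a null result means the request failed.
    if (std::setlocale(LC_ALL, "") == nullptr)
    {
        std::setlocale(LC_ALL, "C");
    }

    // Set the locale for C++ functions; std::locale("") throws
    // std::runtime_error if the environment's locale name is not supported.
    try
    {
        std::locale::global(std::locale(""));
    }
    catch (const std::runtime_error&)
    {
        std::locale::global(std::locale::classic());
    }

    // Report the name of the global locale now in effect.
    return std::locale().name();
}
```

Calling this once at the start of `main()`, before any logging is configured, keeps the fallback in one place. The returned name can be logged to confirm which locale is actually in effect.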
