It provides support for
libutf8 is for you if your application supports 8-bit and multibyte locales such as Chinese or Japanese, and you wish to add UTF-8 locale support but your system lacks the corresponding support.
libutf8 is also for you if your application supports only 8-bit locales and you wish to add UTF-8 locale support. Because libutf8 implements an ISO/ANSI C compatible set of types and functions, the support you add for libutf8 will also work automatically (without libutf8) with other multibyte locales, as far as the system supports them.
libutf8 concentrates on 8-bit and UTF-8 encodings and therefore avoids the complexity needed to support other multibyte locales.
To use this library, as a C/C++ package developer:
Add
    #ifdef HAVE_LIBUTF8
    #include <libutf8.h>
    #endif
after all system include files are included.
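As an illustration of the include pattern above, here is a minimal sketch of a source file that counts the characters of a multibyte string using only ISO C functions. The helper name count_chars is hypothetical, and HAVE_LIBUTF8 is assumed to be defined by your own configure test when libutf8 is found. Without it, the same file compiles against the system's functions.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Included after all system headers. HAVE_LIBUTF8 is assumed to be
   set by the package's own configure test. */
#ifdef HAVE_LIBUTF8
#include <libutf8.h>
#endif

/* Hypothetical helper: returns the number of characters in the
   multibyte string s under the current locale, or (size_t)-1 if s
   contains an invalid sequence. */
size_t count_chars(const char *s)
{
    mbstate_t state;
    const char *p = s;

    memset(&state, 0, sizeof state);   /* initial conversion state */
    /* With a null destination, mbsrtowcs only counts characters. */
    return mbsrtowcs(NULL, &p, 0, &state);
}
```

Because only standard types and functions appear outside the #ifdef, the rest of the file does not need to know whether libutf8 is in use.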
$ ./configure --prefix=/usr/local
$ make
$ make install
Special configuration options:
--with-traditional-mbstowcs
    The traditional semantics of the mbrtowc, mbrlen, mbsrtowcs, and
    mbsnrtowcs functions in ISO C 89 Amendment 1 is to process complete
    multibyte characters. When an incomplete multibyte character is
    encountered, processing stops before this character. An mbstate_t
    contains shift state only (i.e., for 8-bit and UTF-8 encodings, no
    information at all).
    The new ISO C 99 semantics is to process all available bytes of an
    incomplete multibyte character, and to store in an mbstate_t the
    parse state of the incomplete multibyte character, as far as it has
    been read.
    libutf8 by default implements the new semantics.
    --with-traditional-mbstowcs enables the traditional one instead.
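The difference between the two semantics can be sketched with a small stand-alone UTF-8 decoder. This is not libutf8's code: the utf8_state type, the two function names, and the INCOMPLETE/INVALID macros are hypothetical, chosen to mirror the (size_t)-2 and (size_t)-1 return values of mbrtowc. Checks for overlong forms and surrogates are left out for brevity.

```c
#include <stddef.h>

/* Hypothetical parse state: the bits collected so far and how many
   continuation bytes are still expected. Not libutf8's mbstate_t. */
typedef struct {
    unsigned long partial;
    int remaining;
} utf8_state;

#define INCOMPLETE ((size_t)-2)
#define INVALID    ((size_t)-1)

/* New-style semantics: consume every available byte and remember an
   unfinished character in *st. Returns the number of bytes consumed
   by this call when a character is completed, INCOMPLETE when all n
   bytes were consumed without completing one, INVALID on a malformed
   sequence. */
size_t decode_restartable(unsigned long *out, const unsigned char *s,
                          size_t n, utf8_state *st)
{
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned char c = s[i];

        if (st->remaining == 0) {
            if (c < 0x80) {
                *out = c;
                return i + 1;
            } else if ((c & 0xE0) == 0xC0) {
                st->partial = c & 0x1F;
                st->remaining = 1;
            } else if ((c & 0xF0) == 0xE0) {
                st->partial = c & 0x0F;
                st->remaining = 2;
            } else if ((c & 0xF8) == 0xF0) {
                st->partial = c & 0x07;
                st->remaining = 3;
            } else {
                return INVALID;
            }
        } else {
            if ((c & 0xC0) != 0x80)
                return INVALID;
            st->partial = (st->partial << 6) | (c & 0x3F);
            if (--st->remaining == 0) {
                *out = st->partial;
                return i + 1;
            }
        }
    }
    return INCOMPLETE;
}

/* Traditional-style semantics: no state survives the call. If the
   character is not complete within n bytes, nothing is consumed and
   the caller must present the same bytes again, with more appended. */
size_t decode_complete_only(unsigned long *out, const unsigned char *s,
                            size_t n)
{
    utf8_state scratch = {0, 0};   /* discarded on return */
    unsigned long wc;
    size_t r = decode_restartable(&wc, s, n, &scratch);

    if (r != INCOMPLETE && r != INVALID)
        *out = wc;
    return r;
}
```

With the three bytes E2 82 AC (U+20AC) split after the second byte, decode_restartable accepts the first two bytes and finishes the character from the third alone, whereas decode_complete_only needs all three bytes in a single call.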
--with-nontraditional-wcstombs
    The traditional semantics of the wcsrtombs and wcsnrtombs functions
    in ISO C 89 Amendment 1 is to process complete multibyte characters.
    When a multibyte character cannot be stored in the destination
    buffer without overflowing it, conversion stops before this
    character. An mbstate_t contains shift state only (i.e., for 8-bit
    and UTF-8 encodings, no information at all).
    The new ISO C 99 semantics is to write as many bytes as allowed,
    even at the risk of writing an incomplete multibyte character. An
    mbstate_t keeps track of how far the current multibyte character
    has been written.
    libutf8 by default implements the traditional semantics.
    --with-nontraditional-wcstombs enables the new one instead.
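The two output behaviours described above can be sketched in the same way. Again this is an illustration under assumptions, not libutf8's implementation: utf8_wstate, utf8_encode, encode_bounded, and the allow_partial flag are hypothetical names, and code points are not range-checked.

```c
#include <stddef.h>

/* Hypothetical write state: how many bytes of the current character
   have already been written. Not libutf8's mbstate_t. */
typedef struct {
    int written;
} utf8_wstate;

/* Encode one code point as UTF-8 into buf; returns the byte count.
   Range checks (surrogates, values above U+10FFFF) are omitted. */
static int utf8_encode(unsigned long wc, unsigned char buf[4])
{
    if (wc < 0x80) {
        buf[0] = (unsigned char)wc;
        return 1;
    }
    if (wc < 0x800) {
        buf[0] = (unsigned char)(0xC0 | (wc >> 6));
        buf[1] = (unsigned char)(0x80 | (wc & 0x3F));
        return 2;
    }
    if (wc < 0x10000) {
        buf[0] = (unsigned char)(0xE0 | (wc >> 12));
        buf[1] = (unsigned char)(0x80 | ((wc >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (wc & 0x3F));
        return 3;
    }
    buf[0] = (unsigned char)(0xF0 | (wc >> 18));
    buf[1] = (unsigned char)(0x80 | ((wc >> 12) & 0x3F));
    buf[2] = (unsigned char)(0x80 | ((wc >> 6) & 0x3F));
    buf[3] = (unsigned char)(0x80 | (wc & 0x3F));
    return 4;
}

/* Convert src[0..nsrc) into dst[0..cap). Returns the bytes written;
   *src_done receives the number of code points fully written.
   allow_partial == 0: stop before a character that does not fit.
   allow_partial != 0: fill dst to its last byte and record in *st how
   much of the unfinished character was written, so that the next call
   (starting at that same code point) can resume it. */
size_t encode_bounded(unsigned char *dst, size_t cap,
                      const unsigned long *src, size_t nsrc,
                      size_t *src_done, int allow_partial,
                      utf8_wstate *st)
{
    size_t out = 0, i = 0;

    while (i < nsrc) {
        unsigned char buf[4];
        int len = utf8_encode(src[i], buf);
        int k = st->written;   /* resume a partly written character */

        if (!allow_partial && (size_t)(len - k) > cap - out)
            break;
        while (k < len && out < cap)
            dst[out++] = buf[k++];
        if (k < len) {
            st->written = k;
            break;
        }
        st->written = 0;
        i++;
    }
    *src_done = i;
    return out;
}
```

Converting 'a' followed by U+20AC into a two-byte buffer writes one byte when allow_partial is 0, and two bytes (the second being the lead byte E2) when it is nonzero. In the second case the output buffer ends in the middle of a character, which is the risk the text above mentions.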
This library can be built and installed in two variants:
* A library libutf8.so and a header file <libutf8.h>. (Both are
  installed through "make install".)
* A library libutf8_plug.so. This library can be used with LD_PRELOAD
  to override the mbs/wcs functions present in the C library.
On GNU/Linux and Solaris:
$ export LD_PRELOAD=/usr/local/lib/libutf8_plug.so
On OSF/1:
$ export _RLD_LIST=/usr/local/lib/libutf8_plug.so:DEFAULT
Last modified: 13 March 2016.