Sources of the conversion tables
The source of a table is indicated by the directory it sits in.
- unicode.org-mappings
-
The most official mappings. They were available at
ftp://ftp.unicode.org/Public/MAPPINGS/ but have since been withdrawn
(because the Unicode Consortium doesn't want to be liable for the
conversion problems that are due to vendor software).
- csets-1.7
-
Mark Leisher's additional tables. Very high quality.
Bug reports to: Mark Leisher <mleisher@crl.nmsu.edu>
- csets-new, joke
-
Additions to Mark Leisher's tables.
Bug reports to: <bruno@clisp.org>
- yasuoka
-
Koichi Yasuoka's tables.
http://www.kudpc.kyoto-u.ac.jp/~yasuoka/CJK.html
- big5p
-
From CMEX's Big5+ distribution.
Has lots of inconsistencies.
- moztw.org
-
http://moztw.org/docs/big5/
- adobe
-
Table of PostScript character names.
- dkuug.dk
-
http://www.dkuug.dk/cultreg
ftp://dkuug.dk/cultreg/registrations/charmap/
ftp://dkuug.dk/i18n/WG15-collection/
Bug reports to: Keld Jørn Simonsen <keld@dkuug.dk>
- whatwg
-
https://github.com/whatwg/encoding as of 2016-10-02
- glibc-2.2-iconv, glibc-2.3.2-iconv, glibc-2.3.6-iconv, glibc-2.23-iconv
-
Generated from glibc-2.2 / glibc-2.3.2 / glibc-2.3.6 / glibc-2.23 (2016) 'iconv'.
Very close to ftp://dkuug.dk/i18n/WG15-collection.
High quality, mostly consistent with the official mappings.
Bug reports to: Keld Jørn Simonsen <keld@dkuug.dk>, Ulrich Drepper <drepper@redhat.com>
- glibc-2.2-charmaps, glibc-2.3.6-charmaps, glibc-2007-charmaps, glibc-2.23-charmaps
-
glibc's tables.
High quality, mostly consistent with the official mappings.
Bug reports to: Keld Jørn Simonsen <keld@dkuug.dk>, Ulrich Drepper <drepper@redhat.com>
- libiconv-1.[012345678], libiconv-1.9.2, libiconv-1.10, libiconv-1.11, libiconv-1.14
-
Generated from GNU libiconv's 'iconv'.
http://www.haible.de/bruno/packages-libiconv.html
Bug reports to: Bruno Haible <bruno@clisp.org>
- jdk-1.1.8
-
Generated from JDK 1.1.8.
- jdk-1.3.1
-
Generated from JDK 1.3.1_16.
- jdk-1.3.0beta
-
Generated from JDK 1.3.0beta, as modified by IBM.
Single-byte encodings high quality, but CP874 is different. Multi-byte
encodings frequently broken.
- jdk-1.4.2
-
Generated from JDK 1.4.2_10.
- jdk-1.5.0
-
Generated from JDK 1.5.0_06.
- solaris-2.7
-
Generated from Solaris 2.7 'iconv'.
Relatively good, but some tables are rather buggy.
- solaris
-
From http://developers.sun.com/dev/gadc/technicalpublications/articles/gb18030.html.
- osf1-5.1
-
Taken from OSF/1 (Tru64) 5.1.
- hpux
-
Taken from HP-UX 10.
- aix-4.3.2
-
Taken from AIX-4.3.2.
- windows-2000
-
Taken from the IBM ICU charmap repository, in CVS.
Based on a Windows COM service for charset conversion.
- windows-xp
-
Taken from Windows XP SP1.
Based on MultiByteToWideChar.
- windows-2016
-
Taken from Windows 10 in October 2016.
Based on MultiByteToWideChar.
- microsoft-2005
-
From http://www.microsoft.com/globaldev/reference/ in December 2005.
- clisp
-
Generated from CLISP on 1999-12-04.
High quality, tries to be consistent with everything.
Bug reports to: Bruno Haible <bruno@clisp.org>
- icu-1.3.1
-
Generated from the data in IBM's ICU 1.3.1 package.
Bug reports to: <icu4c@us.ibm.com>
- icu-20000203
-
Generated from data provided by Helena Shih.
Except where noted otherwise, identical to icu-1.3.1.
Bug reports to: Helena Shih <hshih@us.ibm.com>
- icu-1.7
-
Generated from the data in IBM's ICU 1.7 package.
Bug reports to: <icu4c@us.ibm.com>
- icu-2.2
-
Generated from the data in IBM's ICU 2.2 package.
- icu-2.8
-
Generated from the data in IBM's ICU 2.8 package.
- icu-3.4
-
Generated from the data in IBM's ICU 3.4 package.
- mono-1.1.11
-
Generated from mono-1.1.11.
Bug reports to: www.mono-project.com
- freebsd-iconv-0.4
-
Generated from FreeBSD's iconv-0.4.
Poor quality, some buggy mappings.
Bug reports to: Konstantin Chuguev <joy@urc.ac.ru>
- zos
-
Taken from z/OS.
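Many of the directories above are described as "generated from" a converter. As a rough illustration of what that can mean for a single-byte encoding, the sketch below feeds every byte value to a converter and records the resulting code point. It is not the procedure actually used for any of the tables listed here: it relies on Python's built-in codecs rather than any of the listed implementations, and both the function name dump_single_byte_table and the "0xXX<TAB>0xXXXX" output layout are assumptions made for the example.

```python
def dump_single_byte_table(encoding):
    """Return one 'byte<TAB>code point' line per byte the codec can decode.

    Hypothetical helper: it queries Python's own codec for `encoding`,
    not any of the converters listed above.
    """
    lines = []
    for byte in range(256):
        try:
            char = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            # The codec defines no mapping for this byte; leave it out.
            continue
        if len(char) != 1:
            # Skip anything that does not map to exactly one code point.
            continue
        lines.append(f"0x{byte:02X}\t0x{ord(char):04X}")
    return lines


if __name__ == "__main__":
    # Print the table for one encoding, one mapping per line.
    for line in dump_single_byte_table("cp1252"):
        print(line)
```

Tables produced this way from different converters can then be compared line by line, which is where vendor-specific differences show up.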
Comparison of conversion tables
Bruno Haible <bruno@clisp.org>
Last modified: 19 January 2020.