Tools for manipulating conversion tables

The tools are here: tools.tar.bz2.

Tools for analyzing conversion tables

The script addnames adds the character name of each character and prints the resulting commented table to standard output.
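
The kind of annotation addnames performs can be sketched as follows. This is only an illustration, not the script itself: it assumes a two-column table whose second column is the Unicode code point in hexadecimal, the comment layout is hypothetical, and Python's built-in unicodedata module stands in for the UnicodeData file the real script reads.

```python
import unicodedata

def add_name(line):
    """Append the Unicode character name to one table line as a comment.

    Assumes tab-separated columns with the Unicode code point (hex) in
    the second column; the "# NAME" comment layout is hypothetical.
    """
    columns = line.split("\t")
    code_point = int(columns[1], 16)
    # Characters without a name (e.g. control characters) get a placeholder.
    name = unicodedata.name(chr(code_point), "<unnamed>")
    return f"{line}\t# {name}"

for entry in ["0x41\t0x0041", "0x5C\t0x00A5"]:
    print(add_name(entry))
```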

The script table-diff compares two tables. When the tables are known to be in the simplified format (without comments), it is equivalent to a simple diff -c1.
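
For tables in the simplified format, the comparison amounts to a context diff with one line of context. A minimal sketch, assuming one mapping per line; the sample entries and the function name are hypothetical, and Python's difflib stands in for diff -c1:

```python
import difflib

def table_diff(old_lines, new_lines, old_name="old", new_name="new"):
    """Return a context diff with one line of context, as a list of lines."""
    return list(difflib.context_diff(old_lines, new_lines,
                                     fromfile=old_name, tofile=new_name,
                                     n=1, lineterm=""))

# Hypothetical sample tables that differ in the mapping of byte 0x5C.
old = ["0x41\t0x0041", "0x5C\t0x005C", "0x7E\t0x007E"]
new = ["0x41\t0x0041", "0x5C\t0x00A5", "0x7E\t0x007E"]

for line in table_diff(old, new):
    print(line)
```

Changed entries appear prefixed with "! ", and two identical tables produce no output at all.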

Tools for creating conversion tables

The script glibc2table converts a table in glibc format to a table in the described format.

The script posix2table converts a table in POSIX format (as found on dkuug.dk) to a table in the described format.

The program kuten2table converts a table in decimal kuten form to a table in hexadecimal normal form.
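
The arithmetic behind such a conversion can be sketched as follows. This assumes that "kuten" means a row/cell pair (ku, ten), each in the range 1..94, and that the hexadecimal form is the usual two-byte code with 0x20 added to each component; the exact input and output layout of kuten2table is not described here, so the function below is hypothetical.

```python
def kuten_to_hex(ku, ten):
    """Convert a decimal row/cell pair to a two-byte hexadecimal code.

    Assumption: both ku and ten lie in 1..94 and map to the byte range
    0x21..0x7E by adding 0x20.
    """
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError(f"kuten out of range: {ku}-{ten}")
    return "0x%02X%02X" % (ku + 0x20, ten + 0x20)

print(kuten_to_hex(16, 1))   # 0x3021
print(kuten_to_hex(94, 94))  # 0x7E7E
```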

The following files extract conversion tables from a system:

unicode.org-mappings/EASTASIA-2/UNIHAN.sh, unicode.org-mappings/EASTASIA-3/UNIHAN.sh, unicode.org-mappings/EASTASIA-31/UNIHAN.sh, unicode.org-mappings/EASTASIA-32/UNIHAN.sh
from UNIHAN.TXT
unicode.org-mappings/VENDORS/MICSFT/WindowsBestFit/Makefile, unicode.org-mappings/VENDORS/MICSFT/WindowsBestFit/bestfit2table
from Windows "bestfit" tables
whatwg/Makefile, whatwg/index2table.c
from https://github.com/whatwg/encoding
glibc-2.1-iconv/table.lisp, glibc-2.1-iconv/table-from.c
from glibc's iconv
glibc-2.2-iconv/table-from.c, glibc-2.2-iconv/table-to.c, glibc-2.2-iconv/alltables.sh
from glibc's iconv
glibc-2.2.2-iconv/table-from.c, glibc-2.2.2-iconv/table-to.c, glibc-2.2.2-iconv/alltables.sh
from glibc's iconv
glibc-2.3.2-iconv/table-from.c, glibc-2.3.2-iconv/table-to.c, glibc-2.3.2-iconv/alltables.sh
from glibc's iconv
glibc-2.3.6-iconv/table-from.c, glibc-2.3.6-iconv/table-to.c, glibc-2.3.6-iconv/alltables.sh
from glibc's iconv
glibc-2.23-iconv/table-from.c, glibc-2.23-iconv/table-to.c, glibc-2.23-iconv/alltables.sh
from glibc's iconv
libiconv-0.0/table.lisp, libiconv-0.1/table.lisp, libiconv-0.2/table.lisp
from GNU libiconv
libiconv-x.y/table-from.c, libiconv-x.y/table-to.c
from GNU libiconv
jdk-1.1.7b/table.sh
from the JDK's native2ascii program
jdk-*/tables.sh, jdk-*/table.sh, jdk-*/table_from.java, jdk-*/table_to.java
from the sun.io.* converters
solaris-2.7/table-from.c, solaris-2.7/table-to.c
from Solaris iconv
osf1-5.1/table-from.c, osf1-5.1/table-to.c
from OSF/1 (Tru64) iconv
windows-xp/w32-table-from.c
from Windows XP
windows-xp-sp3/from/w32-table-from.c, windows-xp-sp3/to/w32-table-to.c
from Windows XP SP3
windows-2016/from/w32-table-from.c, windows-2016/to/w32-table-to.c
from Windows 10 (2016)
clisp/table.lisp
from CLISP's builtin tables
mono-1.1.11/table_from.cs, mono-1.1.11/table_to.cs, mono-1.1.11/table.sh, mono-1.1.11/tables.sh
from Mono 1.1.11
freebsd-iconv-0.4/table-from.c, freebsd-iconv-0.4/table.lisp
from FreeBSD's iconv-0.4

Tools for incorporating a table into glibc

The program table2glibc converts a table to the glibc format. The header (with name and aliases) and the footer (with the wcwidth information) must be added by hand.

Tools for analyzing lists of characters

These tools work on sorted character lists (one column instead of two).

The script addnames1 is similar to addnames, but works on character lists.

The script set-diff compares two lists of characters.
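
The comparison of two character lists can be pictured as a pair of set differences. A minimal sketch, assuming each entry is a hexadecimal code point such as 0x0041; the function name and the way results are returned are hypothetical and need not match what set-diff prints:

```python
def set_diff(list_a, list_b):
    """Return (entries only in list_a, entries only in list_b).

    Both results are sorted by numeric code point value.
    """
    a, b = set(list_a), set(list_b)

    def by_code_point(entry):
        return int(entry, 16)

    return sorted(a - b, key=by_code_point), sorted(b - a, key=by_code_point)

# Hypothetical sample lists.
first = ["0x0041", "0x005C", "0x007E"]
second = ["0x0041", "0x007E", "0x00A5", "0x203E"]

only_first, only_second = set_diff(first, second)
print(only_first)   # ['0x005C']
print(only_second)  # ['0x00A5', '0x203E']
```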

Notes

All the scripts expect to be started from the current directory. They also refer to a list of Unicode character names: unpack the extended UnicodeData file unicodedata.tar.bz2 and point the environment variable UNICODEDATA to the unpacked file.
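
How a tool might locate and read the name list through UNICODEDATA can be sketched as follows. This is not the scripts' own code: it assumes the variable holds the path of a single file and that the extended file keeps the semicolon-separated layout of the standard UnicodeData.txt (code point first, name second). Both function names are hypothetical.

```python
import os

def load_names(path):
    """Read a UnicodeData-style file into a dict {code point: name}."""
    names = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            # Skip blank lines and comments, should the file contain any.
            if not line or line.startswith("#"):
                continue
            fields = line.split(";")
            if len(fields) >= 2:
                names[int(fields[0], 16)] = fields[1]
    return names

def names_from_environment():
    """Load the name list from the file named by UNICODEDATA."""
    path = os.environ.get("UNICODEDATA")
    if path is None:
        raise RuntimeError("environment variable UNICODEDATA is not set")
    return load_names(path)
```

With UNICODEDATA set, names_from_environment()[0x41] would return the name recorded for U+0041.
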
Bruno Haible <bruno@clisp.org>

Last modified: 5 October 2016.