First created at: 4/14/2010
Last updated at: 4/29/2010

Libc iconv enhancement
----------------------

OVERVIEW

This project extends the current iconv API framework in libc so that it
is compatible with GNU libiconv. It also adds a few Solaris-specific
features, such as iconvstr(3C) and additional iconv code conversion
behaviors for non-identical and illegal byte sequences, as described
below:

- iconv_open(3C) will understand and properly handle "" (i.e., the empty
  string), "char", and "wchar_t" as possible values for the fromcode and
  tocode arguments, as shown with change bars in [2] and also described
  in [3].

- iconv_open(3C) will parse the fromcode and tocode arguments and will
  understand and properly handle transliteration and code conversion
  behavior modification requests when one or more of the following
  indicators are appended to the names pointed to by the arguments, as
  shown with change bars in [2] and also described in [3]:

      "//ILLEGAL_DISCARD"
      "//ILLEGAL_REPLACE_HEX"
      "//ILLEGAL_RESTORE_HEX"
      "//NON_IDENTICAL_DISCARD"
      "//NON_IDENTICAL_REPLACE_HEX"
      "//NON_IDENTICAL_RESTORE_HEX"
      "//NON_IDENTICAL_TRANSLITERATE"
      "//IGNORE"
          (An alias for "//NON_IDENTICAL_DISCARD//ILLEGAL_DISCARD".)
      "//REPLACE_HEX"
          (An alias for
          "//NON_IDENTICAL_REPLACE_HEX//ILLEGAL_REPLACE_HEX".)
      "//RESTORE_HEX"
          (An alias for
          "//NON_IDENTICAL_RESTORE_HEX//ILLEGAL_RESTORE_HEX".)
      "//TRANSLIT"
          (An alias for "//NON_IDENTICAL_TRANSLITERATE".)

  "//IGNORE" and "//TRANSLIT" are provided primarily for GNU libiconv
  compatibility.

- The tocode and fromcode names of the current pass-through iconv will
  go through the alias matching mechanism so that the pass-through
  iconvs also recognize and support alias names. (As an example,
  iconv_open("8859-1", "ISO-8859-1") will also activate the
  pass-through iconv code conversion unless there is a corresponding
  iconv code conversion module in the current system.)
- iconvctl(3C) will be added as a new function, with some additional
  Solaris extensions, as described in [2] and [3]. This also provides
  compatibility with the GNU libiconv API of the same name.

- iconvstr(3C) will be added as a new function as described in [2] and
  [3]. This provides compatibility with the kernel iconv framework, in
  particular kiconvstr(9F) [9], and also provides a string-based code
  conversion scheme in addition to the current buffer-based code
  conversion scheme of iconv(3C). The main focus of the function at
  this point is compatibility with kiconvstr(9F).

To support the above, we will add three more Contracted Consolidation
Private interfaces, defined between libc (owned by ON) and the iconv
code conversion shared object modules (owned mostly by G11N), as
described in [3] and [4]:

- _icv_open_attr()
- _icv_iconvctl()
- _icv_iconvstr()

Existing iconv code conversion shared object modules and geniconvtbl
binary table driven iconv code conversions will continue to be supported
without any modification and will remain compatible with the new and
enhanced iconv framework; they will simply not provide the new
functionality specified in this spec.

To provide the new functionality described in this spec, iconv code
conversion shared object modules and geniconvtbl binary tables must be
updated with the necessary changes as specified in [2] and [3].

Once the case is approved by PSARC, the project team will also seek
approval of the contract [4] from the responsible engineering managers
of the related parties.

As a side note, we will do follow-up projects that update the existing
iconv code conversion shared object modules and geniconvtbl-based binary
tables, and that add the iconv code conversions and name aliases needed
to be more compatible with other platforms and systems in terms of code
conversion coverage and support for additional codeset/charset names.
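As an illustration of the indicator aliases listed in the overview, the
following is a minimal sketch in C of how a name such as
"UTF-8//IGNORE" could be split into a base codeset name and a set of
behavior flags. It is not the libc implementation. The function names
parse_code_name() and lookup_indicator(), the SK_* flag values, and the
choices to compare indicators case-sensitively and to ignore an empty
indicator are all assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; this memo does not define the real values. */
enum {
    SK_ILLEGAL_DISCARD            = 1 << 0,
    SK_ILLEGAL_REPLACE_HEX        = 1 << 1,
    SK_ILLEGAL_RESTORE_HEX        = 1 << 2,
    SK_NON_IDENTICAL_DISCARD      = 1 << 3,
    SK_NON_IDENTICAL_REPLACE_HEX  = 1 << 4,
    SK_NON_IDENTICAL_RESTORE_HEX  = 1 << 5,
    SK_NON_IDENTICAL_TRANSLITERATE = 1 << 6
};

struct indicator {
    const char *name;       /* indicator text without the leading "//" */
    unsigned int flags;     /* flag bits the indicator stands for */
};

/* The last four entries are the aliases named in the overview. */
static const struct indicator indicators[] = {
    { "ILLEGAL_DISCARD", SK_ILLEGAL_DISCARD },
    { "ILLEGAL_REPLACE_HEX", SK_ILLEGAL_REPLACE_HEX },
    { "ILLEGAL_RESTORE_HEX", SK_ILLEGAL_RESTORE_HEX },
    { "NON_IDENTICAL_DISCARD", SK_NON_IDENTICAL_DISCARD },
    { "NON_IDENTICAL_REPLACE_HEX", SK_NON_IDENTICAL_REPLACE_HEX },
    { "NON_IDENTICAL_RESTORE_HEX", SK_NON_IDENTICAL_RESTORE_HEX },
    { "NON_IDENTICAL_TRANSLITERATE", SK_NON_IDENTICAL_TRANSLITERATE },
    { "IGNORE", SK_NON_IDENTICAL_DISCARD | SK_ILLEGAL_DISCARD },
    { "REPLACE_HEX", SK_NON_IDENTICAL_REPLACE_HEX | SK_ILLEGAL_REPLACE_HEX },
    { "RESTORE_HEX", SK_NON_IDENTICAL_RESTORE_HEX | SK_ILLEGAL_RESTORE_HEX },
    { "TRANSLIT", SK_NON_IDENTICAL_TRANSLITERATE }
};

/* OR the flags of the indicator s[0..len) into *flags; -1 if unknown. */
static int
lookup_indicator(const char *s, size_t len, unsigned int *flags)
{
    size_t i;

    for (i = 0; i < sizeof (indicators) / sizeof (indicators[0]); i++) {
        if (strlen(indicators[i].name) == len &&
            strncmp(indicators[i].name, s, len) == 0) {
            *flags |= indicators[i].flags;
            return (0);
        }
    }
    return (-1);
}

/*
 * Copy the part of name before the first "//" into base and collect the
 * flags of every "//"-separated indicator that follows. Returns 0 on
 * success, or -1 if base is too small or an indicator is not recognized.
 */
static int
parse_code_name(const char *name, char *base, size_t base_len,
    unsigned int *flags)
{
    const char *p, *next;
    size_t len;

    assert(name != NULL && base != NULL && flags != NULL);

    p = strstr(name, "//");
    len = (p != NULL) ? (size_t)(p - name) : strlen(name);
    if (len >= base_len)
        return (-1);
    memcpy(base, name, len);
    base[len] = '\0';
    *flags = 0;

    while (p != NULL) {
        p += 2;                         /* skip the "//" separator */
        next = strstr(p, "//");
        len = (next != NULL) ? (size_t)(next - p) : strlen(p);
        /* Assumption: an empty indicator is silently ignored. */
        if (len != 0 && lookup_indicator(p, len, flags) != 0)
            return (-1);
        p = next;
    }
    return (0);
}
```

With this sketch, parsing "UTF-8//IGNORE" yields the base name "UTF-8"
and the same flags as "UTF-8//NON_IDENTICAL_DISCARD//ILLEGAL_DISCARD",
which mirrors the alias relationship stated in the overview. A name
that begins with "//", such as "//TRANSLIT", yields an empty base name,
matching the "" case accepted for fromcode and tocode.
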

Lastly, the iconv_open_into(3C) API and some extra features of the
iconv(1) CLI from GNU libiconv are not included in this project at this
time. They are planned for one or two future projects, as necessary.


INTERFACE STABILITY AND RELEASE BINDING

There is no notable interface imported.

Exported interfaces are:

Interface               Stability               Note
---------               ---------               ----
iconv_open(3C),         Committed               [2]
iconv(3C),
iconvctl(3C),
iconvstr(3C),
iconv.h(3HEAD),
geniconvtbl(4)

_icv_open_attr(),       Contracted              [3], [4]
_icv_iconvctl(),        Consolidation
_icv_iconvstr()         Private

This project seeks Micro/Patch release binding.


REFERENCES

[1] Related CRs:

    6803313 Solaris iconv does not support transliteration (Rails need
            transliteration to work on Solaris)
    6912982 iconv_open() should allow an empty string / "" / as any
            argument
    6913721 alias supported is wanted for pass-through iconv tocode and
            fromcode names
    Bug 11957 - iconv doesn't support wchar_t as an encoding name
            (http://defect.opensolaris.org/bz/show_bug.cgi?id=11957)

[2] Updated and new man pages at the materials directory of the case:

    iconv_open.3c, iconv_open.3c.diff,
    iconv.3c, iconv.3c.diff,
    iconvctl.3c (new man page),
    iconvstr.3c (new man page),
    iconv.h.3head, iconv.h.3head.diff,
    geniconvtbl.4, geniconvtbl.4.diff

    (See geniconvtbl.4.diff first since geniconvtbl.4 has text with
    change bars from PSARC/2001/659 too.)

[3] Localization guide at the materials directory of the case:

    iconv-l10n-guide.txt

[4] Contract template file at the materials directory of the case:

    contract-template.txt

[5] PSARC/1993/153 iconv/iconv_open/iconv_close

[6] PSARC/1999/292 Addition of geniconvtbl(1)

[7] PSARC/2001/072 GNU gettext support
    (For /usr/lib/iconv/alias and the alias support mechanism in iconv.)

[8] PSARC/2001/659 Non-identical character conversion support in
    geniconvtbl(1)

[9] PSARC/2007/173 kiconv

[10] PSARC/2009/561 Pass-through iconv code conversion

END_OF_MEMO.