Sun Microsystems                    Systems Architecture Committee
_________________________________________________________________

Subject:       Codeset Independent ldterm(7M) and stty(1)

Submitted by:  Ienup Sung

File:          PSARC/1999/140/opinion.ms

Date:          July 21st, 1999

Committee:     Joseph Kowalski, Tim Marsland, Terrence Miller,
               David Robinson, Glenn Skinner.

Steering Committee:
               Solaris Operating Environment     soesc-opinion@scs.eng
               Operating Systems and Networking  onsc@eng

1.  Summary

     The current ldterm(7M) and stty(1) implementations are EUC
     codeset specific and contain EUC representation dependencies.
     This project provides a codeset independent ldterm(7M) module
     and stty(1) command.

     Currently, non-EUC locales push a pair of code conversion
     modules around the ldterm module to circumvent this
     restriction. This procedure is cumbersome and error-prone.

2.  Decision & Precedence Information

     The project is approved as specified in reference [1], but as
     modified by the required technical changes listed in
     Appendix A below.

     The project may be delivered in a minor release of Solaris.

3.  Interfaces

PSARC/1999/140              Copyright 1999 Sun Microsystems

     The project exports the following interfaces.

 _____________________________________________________________________
|                         Interfaces Exported                         |
|_____________________________________________________________________|
| Interface     | Classification | Comments                           |
|_______________|________________|____________________________________|
| stty(1)       | Stable         | defeucw behavior in non-EUC        |
|               |                | locales.                           |
| ldterm(7M)    | Stable         | Expanded definition of EUC_WSET    |
|               |                | and EUC_WGET in non-EUC locales.   |
| ldterm(7M)    | Consolidation  | ioctls used only by stty(1).       |
|   CSDATA_SET  | Private        |                                    |
|   CSDATA_GET  |                |                                    |
| ldterm.dat    | Sun Private    | Data file name and format as       |
|               |                | defined in [1].                    |
|_______________|________________|____________________________________|

4.  Opinion

     Originally this project was presented in a form considerably
     different from the final proposal.
     The original design downloaded large amounts of data through
     multiple calls to new ldterm ioctls. The committee had two
     major issues with this implementation.

     1.  The switch between codesets was not atomic. The behavior
         was not well defined when a codeset width table was
         partially downloaded.

     2.  The large downloaded codeset width tables were per-stream
         and not shared. This could easily result in massive
         amounts of physical kernel memory being consumed.

     The revised proposal addressed both of these concerns.

     It is worth noting that although the committee suggested
     several possible ways to address its concerns, the project
     team returned with an implementation that had not been
     discussed. This is mentioned here because project teams have
     expressed concern about being constrained by solutions
     suggested by the committee. This should not be the case, and
     it was not with this project.

     The committee was still concerned by the physical memory
     consumed by the project, which is potentially 280 kilobytes
     but is only 16 kilobytes in the initial implementation. The
     additional memory will be consumed as additional Unicode
     code planes are defined. After consideration, the committee
     decided that because the memory consumption was not excessive
     and much of it was deferred in time, the project was
     acceptable as proposed. However, this concern is captured as
     a Technical Change Advised.

     Finally, the committee was insistent that stty options
     continue to have the same semantics as before. This
     discussion resulted in the two Technical Changes Required for
     this project.

5.  Minority Opinion(s)

     None.

6.  Advisory Information

     None.

7.  Appendices

7.1.  Appendix A: Technical Changes Required

     1.  The project must support the existing ``stty -g''
         behavior. Specifically, that is the ability to produce
         output in a form appropriate as an argument to another
         stty command. With this project, that should include the
         value of the non-EUC code set.

     2.
         The output of ``stty -a'' should report the ``code set
         width data'' name. This is usually the locale name, but
         the label ``locale'' should not be used in the stty
         output, so that it remains available to a future project
         that may more correctly require the ``locale'' label.
         Differences could arise where sub-locales exist or
         character sets are shared among locales. The committee
         suggested (but did not require) that the identifier
         ``csdata'' be used.

7.2.  Appendix B: Technical Changes Advised

     1.  The project is advised not to lock down physical memory
         for a large Unicode width table when it is not required.
         Possible implementations include using pageable kernel
         memory or using an optionally loadable module.

7.3.  Appendix C: Reference Material

     Unless stated otherwise, path names are relative to the case
     directory PSARC/1999/140.

     1.  Project Specification (as if for a fast-track).
         File: commit.materials/spec

     2.  Technical Description
         File: commit.materials/ldterm-csi.txt

     3.  Diff-marked man page.
         File: commit.materials/stty.1

     4.  Diff-marked man page.
         File: commit.materials/ldterm.7m

     5.  Project Plan
         File: incept.materials/ldterm_csi_projplan.ps

     6.  One Pager
         File: 990223_ienup.sung

     7.  Dependency Discussion.
         File: incept.materials/stty-euc-dependency.txt

     8.  Dependency Discussion.
         File: incept.materials/ldterm-euc-dependency.txt

     9.  Background Codeset Information.
         File: commit.materials/csicookbookappendix.pdf

     10. Solaris Internationalization Guide for Developers
         Overview of en_US.UTF-8 Locale Support
         File: User documentation (current).
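
     Illustration for Appendix A, item 1. The requirement that
     ``stty -g'' output remain usable as an argument to another
     stty command can be pictured as a save-and-restore round
     trip. The following is a minimal Python sketch of that round
     trip, not part of the project materials. The helper names
     save_settings and restore_settings are hypothetical. The
     sketch assumes only that ``stty -g'' prints a single string
     that stty accepts back unchanged; it treats that string as
     opaque and makes no assumption about how the code set width
     data name is encoded in it.

```python
import subprocess


def save_settings(run=subprocess.run):
    """Return the current terminal settings in the reusable `stty -g` form.

    `run` is a hypothetical injection point so the helper can be exercised
    without a terminal; by default it is subprocess.run.
    """
    # stty reports on the terminal attached to its standard input,
    # which the child process inherits from the caller.
    result = run(["stty", "-g"], capture_output=True, text=True, check=True)
    return result.stdout.strip()


def restore_settings(saved, run=subprocess.run):
    """Pass a string produced by `stty -g` back to stty as its argument."""
    # The saved string is treated as opaque: whatever -g emitted,
    # including any code set information, goes back unchanged.
    run(["stty", saved], check=True)
```

     A caller would typically invoke save_settings() before
     changing terminal modes and restore_settings(saved)
     afterwards, with a terminal on standard input. Because the
     string is passed back verbatim, any code set value the
     project adds to the ``stty -g'' output survives the round
     trip without the caller needing to understand it.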