#ident	"@(#)ldterm-csi.txt	1.9 99/04/28 SMI"
								3/20/1998
								is@eng.sun.com
Codeset independent ldterm(7M) and stty(1)
------------------------------------------

1. Overview

Current ldterm(7M) and stty(1) implementations are EUC codeset specific and
also have EUC representation dependencies. This memo provides a design for a
codeset independent (CSI) ldterm(7M) module and stty(1) command.

The design in this memo can be summarized as follows:

- Provide three sets of internal methods in the ldterm(7M) to handle various
  codesets:

  (1) EUC codeset methods (default)
  (2) PC environment originated codeset methods
  (3) UTF-8 codeset methods

  By default, ldterm(7M) will start and run with method set (1) from
  the list above.

- Three new I_STR ioctl message commands specifically for the ldterm(7M) will
  be added:

  CSINFO_SET	This call takes a pointer to a ldterm_cs_header_t data
		structure, and uses it to set the line discipline definition
		and also for a possible switch of the internal methods and
		data for the current locale's codeset.

		When this message is received, the ldterm(7M) will check
		the validity of the message and, if the message contains
		correct info, it will store the header info.

  CSDATA_SET	Depending on the header info previously set by the 'CSINFO_SET'
		command, in particular the 'csinfo_num' data field of the
		header, the ldterm(7M) will accept one or more 'CSDATA_SET'
		messages and accumulate them internally.

		When it receives the final 'CSDATA_SET', the ldterm(7M) will
		validate the messages received so far, install the received
		data as the data that will be used in the ldterm(7M), and then
		switch to the corresponding methods. If the validation
		fails, the ldterm(7M) will negatively acknowledge the message.

		It is the responsibility of stty(1) to always send exactly
		'csinfo_num' 'CSDATA_SET' ioctl messages after
		the 'CSINFO_SET'.

  CSINFO_GET	This call takes a pointer to a ldterm_cs_header_t structure
		and returns in it the codeset header info currently in use by
		the ldterm(7M) module.

  The three new ioctl commands will be added to the <sys/csiioctl.h> header
  file. The EUC_WSET and EUC_WGET commands will not be removed.

- Any locale that wants to utilize the (internal) non-EUC codeset methods of
  ldterm will provide a /usr/lib/locale/<locale>/LC_CTYPE/ldterm.dat file.

  The ldterm.dat file will contain info such as the codeset type and
  the codeset and/or character widths of the current locale.

- Upon a user request for the 'defeucw' mode setting, the stty(1) command
  will check if the current locale has the
  /usr/lib/locale/<locale>/LC_CTYPE/ldterm.dat file. If it does, the stty(1)
  command will read in the file and pass its content down to the ldterm(7M)
  module by using the CSINFO_SET and CSDATA_SET ioctl message commands.

  The current behavior on EUC will not be changed.

  For a 'write settings' request, i.e., stty -a, we will not change the current
  implementation. Thus, if stty(1) is executed with the -a option and
  the current locale is not an EUC one, it will print out:

	eucw ?, scrw ?

  If the current locale is an EUC one, stty(1) will print out the
  byte widths and screen column widths for the EUC codesets. For instance,
  in any single byte locale we support, stty -a will give the following
  result:

	eucw 1:1:0:0, scrw 1:1:0:0


2. Detail design

2.1. ldterm.dat file and header files

The ldterm.dat file will have one of the file structures shown in
Figure 2.1.1 or Figure 2.1.2:


	File (byte) offset
			+--------------------------+
		0	| ldterm data header info  |
			+--------------------------+
		3	| ldterm eucpc data 1      |
			+--------------------------+
		25	| ldterm eucpc data 2      |
			+--------------------------+
		47	|                          |
			:           ...            :
			|                          |
			+--------------------------+
		201	| ldterm eucpc data 10     |
			+--------------------------+
		223


		Note:	The size of the ldterm data header info is 3 bytes; it
			consists of the 'version', 'codeset_type', and
			'csinfo_num' data fields of ldterm_cs_header_t.
			The 'csinfo_num' data field of the ldterm data header
			is 10 in the above example and the size of each ldterm
			eucpc data record is 22 bytes.

	Figure 2.1.1: ldterm.dat file structure example for EUC or PC
				originated codeset


	File (byte) offset
			+--------------------------+
		0	| ldterm data header info  |
			+--------------------------+
		3	| Unicode data for Plane00 |
			+--------------------------+
		16387	| Unicode data for Plane01 |
			+--------------------------+
		32771	|                          |
			:           ...            :
			|                          |
			+--------------------------+
		262147	| Unicode data for Plane16 |
			+--------------------------+
		278531


		Note:	The size of the ldterm data header info is 3 bytes; it
			consists of the 'version', 'codeset_type', and
			'csinfo_num' data fields of ldterm_cs_header_t.
			The 'csinfo_num' data field of the ldterm data header
			is 16 planes in the above example and the size of each
			Unicode plane is 16384 bytes.

   Figure 2.1.2: ldterm.dat file structure example for Unicode/UTF-8 codeset
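
The offsets in both figures follow from the 3-byte header and the fixed
record sizes given in the notes. A minimal sketch of that arithmetic is shown
below; the SKETCH_* macros and sketch_* function names are illustrative only
and are not part of the proposed headers.

```c
#include <stddef.h>

/* Sizes taken from the notes under Figures 2.1.1 and 2.1.2. */
#define	SKETCH_HEADER_SIZE	3	/* version, codeset_type, csinfo_num */
#define	SKETCH_EUCPC_SIZE	22	/* one ldterm eucpc data record */
#define	SKETCH_PLANE_SIZE	16384	/* data for one Unicode plane */

/* File offset of the i'th (0-based) ldterm eucpc data record. */
size_t
sketch_eucpc_offset(size_t i)
{
	return (SKETCH_HEADER_SIZE + i * SKETCH_EUCPC_SIZE);
}

/* File offset of the data for Unicode plane 'plane' (0-based). */
size_t
sketch_plane_offset(size_t plane)
{
	return (SKETCH_HEADER_SIZE + plane * SKETCH_PLANE_SIZE);
}
```

For example, sketch_eucpc_offset(9) gives 201, the offset of "ldterm eucpc
data 10" in Figure 2.1.1, and sketch_plane_offset(1) gives 16387, the offset
of Plane01 in Figure 2.1.2.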


Definitions and data types that can be used to create and process
the content of the 'ldterm.dat' file are shown below; they will be added to
the ldterm header file, <sys/ldterm.h>:

	/* Next version will be the current LDTERM_DATA_VERSION + 1. */
	#define LDTERM_DATA_VERSION		1

	/* Supported codeset types. */
	#define LDTERM_CS_TYPE_MIN		1

	#define LDTERM_CS_TYPE_EUC		1
	#define LDTERM_CS_TYPE_PCCS		2
	#define LDTERM_CS_TYPE_UTF8		3

	#define LDTERM_CS_TYPE_MAX		3

	/* ldterm codeset header information. */
	struct _ldterm_cs_header {
		unsigned char	version;	/* version: 1 ~ 255 */
		unsigned char	codeset_type;
		unsigned char	csinfo_num;	/* the number of */
						/* codesets/planes */
	};
	typedef struct _ldterm_cs_header ldterm_cs_header_t;

	/*
	 * The maximum number of bytes in a character of the codeset that
	 * can be handled by ldterm.
	 */
	#define LDTERM_CS_MAX_BYTE_LENGTH	10

	/*
	 * The following two data structures provide codeset-specific
	 * information for EUC and PC originated codesets (ldterm_eucpc_data_t)
	 * and for the Unicode/UTF-8 codeset (ldterm_unicode_data_cell_t).
	 */
	struct _ldterm_eucpc_data {
		unsigned char	byte_length;
		unsigned char	screen_width;
		unsigned char	byte_range_start[LDTERM_CS_MAX_BYTE_LENGTH];
		unsigned char	byte_range_end[LDTERM_CS_MAX_BYTE_LENGTH];
	};
	typedef struct _ldterm_eucpc_data ldterm_eucpc_data_t;

	/* 
	 * Representing a single Unicode plane (65536 code points, four per
	 * element) requires 16384 'ldterm_unicode_data_cell_t' elements.
	 */
	struct _ldterm_unicode_data_cell {
		unsigned char u0:2;
		unsigned char u1:2;
		unsigned char u2:2;
		unsigned char u3:2;
	};
	typedef struct _ldterm_unicode_data_cell ldterm_unicode_data_cell_t;


Possible values for each data field of "ldterm_cs_header_t" are as follows:

- version:
	LDTERM_DATA_VERSION

- codeset_type:
	LDTERM_CS_TYPE_EUC if the current locale is an EUC one.
	LDTERM_CS_TYPE_PCCS if the current locale is a PC originated codeset one.
	LDTERM_CS_TYPE_UTF8 if the current locale is a UTF-8 one.

- csinfo_num:
	If the codeset_type is LDTERM_CS_TYPE_EUC, it will have the number of
	supplementary codesets supported in the locale. Valid values are 0 to
	3.

	If the codeset_type is LDTERM_CS_TYPE_PCCS, it will have the number of
	distinguishable sub-codesets in the codeset of the locale. Valid
	values are 1 to 10. The number excludes ASCII sub-codeset.

	If the codeset_type is LDTERM_CS_TYPE_UTF8, it will contain
	the number of Unicode planes that the locale supports.
	Valid values are 1 to 16.
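
The ranges above can be checked mechanically. The following is a minimal
sketch of such a check, assuming the definitions proposed for <sys/ldterm.h>
(repeated locally so the fragment stands alone); sketch_cs_header_valid() is
a hypothetical helper and not part of this design.

```c
#include <stdbool.h>

/* Local copies of the definitions proposed for <sys/ldterm.h>. */
#define	LDTERM_DATA_VERSION	1
#define	LDTERM_CS_TYPE_EUC	1
#define	LDTERM_CS_TYPE_PCCS	2
#define	LDTERM_CS_TYPE_UTF8	3

typedef struct _ldterm_cs_header {
	unsigned char	version;
	unsigned char	codeset_type;
	unsigned char	csinfo_num;
} ldterm_cs_header_t;

/*
 * Hypothetical helper: return true if every field of the header is within
 * the ranges listed in the memo.
 */
bool
sketch_cs_header_valid(const ldterm_cs_header_t *hp)
{
	if (hp->version != LDTERM_DATA_VERSION)
		return (false);

	switch (hp->codeset_type) {
	case LDTERM_CS_TYPE_EUC:
		return (hp->csinfo_num <= 3);			/* 0 to 3 */
	case LDTERM_CS_TYPE_PCCS:
		return (hp->csinfo_num >= 1 && hp->csinfo_num <= 10);
	case LDTERM_CS_TYPE_UTF8:
		return (hp->csinfo_num >= 1 && hp->csinfo_num <= 16);
	default:
		return (false);
	}
}
```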

Possible values for each data field of "ldterm_eucpc_data_t" are as follows:

- If the 'codeset_type' is LDTERM_CS_TYPE_EUC, there will be three 
  "ldterm_eucpc_data_t" elements:

	-- The first element's:
		byte_length: The byte length of EUC supplementary codeset one.
		screen_width: The screen column width of EUC supplementary
			codeset one.
	-- The second element's:
		byte_length: The byte length of EUC supplementary codeset two.
		screen_width: The screen column width of EUC supplementary
			codeset two.
	-- The third element's:
		byte_length: The byte length of EUC supplementary codeset three.
		screen_width: The screen column width of EUC supplementary
			codeset three.

- If the codeset_type is LDTERM_CS_TYPE_PCCS, each distinguishable
  sub-codeset will be represented by one "ldterm_eucpc_data_t" element,
  which will have:

	-- The i'th element's:
		byte_length: The byte length of sub-codeset i.
		screen_width: The screen column width of sub-codeset i.
		byte_range_start: The start of the range for each byte of
			sub-codeset i; the start value is included in the range.
		byte_range_end:	The end of the range for each byte of sub-codeset i;
			the end value is included in the range.

- If the codeset_type is LDTERM_CS_TYPE_UTF8, the Unicode width info is
  too irregular to categorize into supplementary codesets or sub-codesets
  as is done for EUC or PC originated codesets. We will therefore provide
  a character-by-character width table like the following example source:

	#include <sys/ldterm.h>

        /*
         * The following two tables contain width information for Unicode.
         * Each value in the table "ucode" is an index into the "width_tbl"
         * vector.
         *
         * There are only three different widths: zero, one, or two.
         * The value -1 means that the particular code point is not yet
         * assigned or is not a Unicode character, i.e., U+FFFE and U+FFFF.
         */
        static const int width_tbl[4] = { 0, 1, 2, -1 };

        ldterm_unicode_data_cell_t ucode[16][16384] = {
	{  /* Plane 00 a.k.a. BMP */
/*              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F               */
/*              ----------------------------------------------               */
/* U+0000 */    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,    /* U+000F */
/* U+0010 */    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,    /* U+001F */
/* U+0020 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+002F */
/* U+0030 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+003F */
/* U+0040 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+004F */
/* U+0050 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+005F */
/* U+0060 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+006F */
/* U+0070 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3,    /* U+007F */
/* U+0080 */    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,    /* U+008F */
/* U+0090 */    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,    /* U+009F */
/* U+00A0 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+00AF */

        ...

/* U+FF50 */    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,    /* U+FF5F */
/* U+FF60 */    3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+FF6F */
/* U+FF70 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+FF7F */
/* U+FF80 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+FF8F */
/* U+FF90 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+FF9F */
/* U+FFA0 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,    /* U+FFAF */
/* U+FFB0 */    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3,    /* U+FFBF */
/* U+FFC0 */    3, 3, 1, 1, 1, 1, 1, 1, 3, 3, 1, 1, 1, 1, 1, 1,    /* U+FFCF */
/* U+FFD0 */    3, 3, 1, 1, 1, 1, 1, 1, 3, 3, 1, 1, 1, 3, 3, 3,    /* U+FFDF */
/* U+FFE0 */    2, 2, 2, 2, 2, 2, 2, 3, 1, 1, 1, 1, 1, 1, 1, 3,    /* U+FFEF */
/* U+FFF0 */    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3     /* U+FFFF */
        },
        { 0 },  /* Plane 1 */
        { 0 },  /* Plane 2 */

        ...

        { 0 },  /* Plane 16 */
        };

  The ldterm(7M) will access the above tables by using the following algorithm:

        plane = get_plane(utf8);
        rowcol = get_rowcolumn(utf8);
        i = rowcol / 4;
        j = rowcol % 4;
        switch (j) {
        case 0:
                width = width_tbl[ucode[plane][i].u0];
                break;
        case 1:
                width = width_tbl[ucode[plane][i].u1];
                break;
        case 2:
                width = width_tbl[ucode[plane][i].u2];
                break;
        case 3:
                width = width_tbl[ucode[plane][i].u3];
                break;
        }


  Our Unicode/UTF-8 locales conform to Unicode 2.1 and will also
  conform to Unicode 3.0 when it becomes available.
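
  The lookup above packs four 2-bit indices into each cell. The sketch below
  exercises that packing for a single plane. It assumes a compiler that
  accepts 'unsigned char' bit-fields, as the proposed structure does; the
  sketch_* names, including the setter, are hypothetical and exist only to
  make the fragment testable.

```c
#include <assert.h>

#define	SKETCH_CELLS_PER_PLANE	16384	/* 65536 code points, 4 per cell */

typedef struct _ldterm_unicode_data_cell {
	unsigned char u0:2;
	unsigned char u1:2;
	unsigned char u2:2;
	unsigned char u3:2;
} ldterm_unicode_data_cell_t;

static const int sketch_width_tbl[4] = { 0, 1, 2, -1 };

/* Hypothetical setter: store width_tbl index 'idx' for position 'rowcol'. */
void
sketch_set_index(ldterm_unicode_data_cell_t *plane, unsigned int rowcol,
    unsigned int idx)
{
	ldterm_unicode_data_cell_t *cellp;

	assert(rowcol / 4 < SKETCH_CELLS_PER_PLANE && idx <= 3);
	cellp = &plane[rowcol / 4];
	switch (rowcol % 4) {
	case 0:
		cellp->u0 = idx;
		break;
	case 1:
		cellp->u1 = idx;
		break;
	case 2:
		cellp->u2 = idx;
		break;
	case 3:
		cellp->u3 = idx;
		break;
	}
}

/* Lookup that follows the switch shown in the memo. */
int
sketch_get_width(const ldterm_unicode_data_cell_t *plane, unsigned int rowcol)
{
	const ldterm_unicode_data_cell_t *cellp;
	unsigned int idx = 0;

	assert(rowcol / 4 < SKETCH_CELLS_PER_PLANE);
	cellp = &plane[rowcol / 4];
	switch (rowcol % 4) {
	case 0:
		idx = cellp->u0;
		break;
	case 1:
		idx = cellp->u1;
		break;
	case 2:
		idx = cellp->u2;
		break;
	case 3:
		idx = cellp->u3;
		break;
	}
	return (sketch_width_tbl[idx]);
}
```

  Because the setter and the lookup both go through the field names u0 to u3,
  the sketch does not depend on how the compiler orders bit-fields within
  the byte. A data file shared between build and run time would need that
  order to be fixed, which this sketch does not address.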


2.2. ldtermstd_state_t data structure at <sys/ldterm.h>

The EUC specific data fields will not be removed because we still need them.
We will, however, add three more data fields, t_csheaderp, t_csdatap, and
t_csmethodsp, to support different codeset types, as shown below. Added or
changed lines are marked with a vertical bar ('|') at the beginning of the line:

|	typedef struct _ldterm_cs_methods {
|	    void (*ldterm_dispwidth)(uchar_t, void *, int);
|	    void (*ldterm_memwidth)(uchar_t, void *);
|	    char (*ldterm_non_ascii_trailing_char)(ldtermstd_state_t *);
|	    char (*ldterm_output_msg)(queue_t *, mblk_t *, mblk_t **,
|				ldtermstd_state_t *, size_t, int);
|	} ldterm_cs_methods_t;

	typedef struct ldterm_mod {

		...

	    /*
|	     * The following are for EUC and also other types of codeset
|	     * processing.
	     */
            uchar_t t_codeset;  /* current code set indicator (read side) */
            uchar_t t_eucleft;  /* bytes left to get in current char (read) */
            uchar_t t_eucign;   /* bytes left to ignore (output post proc) */
            uchar_t t_eucpad;   /* padding ... for eucwioc */
            eucioc_t eucwioc;   /* eucioc structure (have to use bcopy) */
            uchar_t *t_eucp;    /* ptr to parallel array of column widths */
            mblk_t  *t_eucp_mp; /* the m_blk that holds parallel array */
            uchar_t t_maxeuc;   /* the max length in memory bytes of an EUC */
            int     t_eucwarn;  /* bad EUC counter */
|
|	    /*
|	     * The t_csheaderp, t_csdatap, and, t_csmethodsp data fields are
|	     * to have support for various codesets.
|	     */
|	    ldterm_cs_header_t	*t_csheaderp;
|	    void		*t_csdatap;
|	    ldterm_cs_methods_t	*t_csmethodsp;
	} ldtermstd_state_t;


2.3. UNKNOWN_WIDTH macro and typetab[] change at <sys/ldterm.h>

We will also define an UNKNOWN_WIDTH macro in the header file:

	#define EUC_BSWIDTH	254
	#define EUC_NLWIDTH	253
	#define EUC_CRWIDTH	252
|
|	#define UNKNOWN_WIDTH	251
|
	#define EUC_MAXW	4

Details are described in sections 2.6.1 and 2.6.4.

We will put T_SS2 and T_SS3 at typetab[0x8e] and typetab[0x8f] as shown
below:

	static char typetab[256] = {
	/* 000 */  CONTROL,	CONTROL,	CONTROL, 	CONTROL,
	/* 004 */  CONTROL,	CONTROL,	CONTROL,	CONTROL,

	...

	/* 214 */  CONTROL,	CONTROL,	T_SS2,		T_SS3,

	...

	};


2.4. <sys/csiioctl.h>

The /usr/include/sys/csiioctl.h header file will contain the following:

	#ifndef CSI_IOC
	#define CSI_IOC         (('C' | 128) << 8)
	#endif
	#define CSINFO_SET      (CSI_IOC | 1)
	#define CSDATA_SET      (CSI_IOC | 2)
	#define CSINFO_GET      (CSI_IOC | 3)
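
For reference, the numeric values of these commands can be worked out at
compile time. The fragment below repeats the proposed definitions and checks
them with C11 _Static_assert; it assumes an ASCII execution character set,
where 'C' is 0x43.

```c
/* Definitions as proposed for <sys/csiioctl.h>. */
#ifndef CSI_IOC
#define	CSI_IOC		(('C' | 128) << 8)
#endif
#define	CSINFO_SET	(CSI_IOC | 1)
#define	CSDATA_SET	(CSI_IOC | 2)
#define	CSINFO_GET	(CSI_IOC | 3)

/* 'C' is 0x43 in ASCII, so ('C' | 128) is 0xC3 and CSI_IOC is 0xC300. */
_Static_assert(CSI_IOC == 0xC300, "CSI_IOC");
_Static_assert(CSINFO_SET == 0xC301, "CSINFO_SET");
_Static_assert(CSDATA_SET == 0xC302, "CSDATA_SET");
_Static_assert(CSINFO_GET == 0xC303, "CSINFO_GET");
```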


2.5. stty(1)

2.5.1. stty.h

The header file will have one more bit flag:

	#define	CSI_CSW		32


2.5.2. stty.c and sttyparse.c

Two additional global variables for the support of new codeset types will be
added to stty.c:

	static ldterm_cs_header_t *cswp; /* User side codeset width header
					   pointer */
	static ldterm_cs_header_t kcswp; /* kernel side codeset width header */

After the setlocale() invocation in the main() routine, the stty(1) command
will try to read the /usr/lib/locale/<locale>/LC_CTYPE/ldterm.dat file. If
there is no such file in the directory, the command will assume the locale is
an EUC locale.

If there is an ldterm.dat file, the command will mmap() the header portion of
the file to 'cswp'.

The get_ttymode() function will retrieve the current width header info from
the ldterm(7M) module into 'kcswp' by using an ioctl() with the CSINFO_GET
command. If the ioctl() returns zero and the current codeset is not an EUC
codeset, the CSI_CSW bit flag will also be set to indicate the current
terminal mode. If the current codeset is an EUC one, we will call ioctl()
with EUC_WGET to get the EUC codeset width information. The function will
set the EUCW bit flag if the ioctl() call with the EUC_WGET command is
acknowledged.

In sttyparse(), if the user specified "defeucw" on the command line
and the current locale is a non-EUC one, the content of 'cswp' will be
saved into 'kcswp'. (Also, if the current locale's codeset is a multibyte
one, it will enable 'cs8' and disable 'istrip', 'cs7' and 'parenb'.)

The set_ttymode() function will check the 'CSI_CSW' bit flag in the terminal
mode and, if it is set, the function will send down a CSINFO_SET command with
'kcswp' to the ldterm(7M). After the initial CSINFO_SET command is
acknowledged, the function will mmap() the remainder of
the ldterm.dat file and then send down the necessary number of CSDATA_SET
commands to the ldterm(7M) for the current codeset.
If the CSI_CSW bit flag is not set but the EUCW bit flag is, it will send
the EUC_WSET command downstream.
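
The message sequence that set_ttymode() sends for a non-EUC locale can be
sketched as a plan over the ldterm.dat image: one CSINFO_SET carrying the
3-byte header, followed by 'csinfo_num' CSDATA_SET messages. The sketch
below assumes one record per CSDATA_SET message, with the record sizes from
section 2.1. The sketch_* names are hypothetical, and the I_STR ioctl()
calls that would carry each message are not shown.

```c
#include <stddef.h>

#define	LDTERM_CS_TYPE_UTF8	3
#define	CSI_IOC		(('C' | 128) << 8)
#define	CSINFO_SET	(CSI_IOC | 1)
#define	CSDATA_SET	(CSI_IOC | 2)

typedef struct sketch_msg {
	int	cmd;	/* CSINFO_SET or CSDATA_SET */
	size_t	off;	/* offset of the payload in the ldterm.dat image */
	size_t	len;	/* payload length in bytes */
} sketch_msg_t;

/*
 * Fill 'out' with the messages for the ldterm.dat image 'dat' and return
 * their count, or 0 if 'out' is too small or the image is truncated.
 */
size_t
sketch_plan_msgs(const unsigned char *dat, size_t datlen,
    sketch_msg_t *out, size_t outcap)
{
	size_t recsz, n, i;

	if (datlen < 3)
		return (0);
	n = dat[2];					/* csinfo_num */
	recsz = (dat[1] == LDTERM_CS_TYPE_UTF8) ? 16384 : 22;
	if (outcap < n + 1 || datlen < 3 + n * recsz)
		return (0);

	out[0].cmd = CSINFO_SET;
	out[0].off = 0;
	out[0].len = 3;
	for (i = 0; i < n; i++) {
		out[i + 1].cmd = CSDATA_SET;
		out[i + 1].off = 3 + i * recsz;
		out[i + 1].len = recsz;
	}
	return (n + 1);
}
```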


2.6. ldterm(7M)

2.6.1. Codeset type specific methods

The internal codeset specific methods are as follows:

- EUC codeset methods:
  static void __ldterm_dispwidth_euc(uchar_t c, void *w, int mode);
  static void __ldterm_memwidth_euc(uchar_t c, void *w);
  static char __ldterm_non_ascii_trailing_char_euc(ldtermstd_state_t *tp);
  static char __ldterm_output_msg_euc(queue_t *q, mblk_t *imp, mblk_t **omp,
			ldtermstd_state_t *tp, size_t bsize, int echoing);

- PC environment originated codeset methods:
  static void __ldterm_dispwidth_pccs(uchar_t c, void *w, int mode);
  static void __ldterm_memwidth_pccs(uchar_t c, void *w);
  static char __ldterm_non_ascii_trailing_char_pccs(ldtermstd_state_t *tp);
  static char __ldterm_output_msg_pccs(queue_t *q, mblk_t *imp, mblk_t **omp,
			ldtermstd_state_t *tp, size_t bsize, int echoing);

- UTF-8 codeset methods:
  static void __ldterm_dispwidth_utf8(uchar_t c, void *w, int mode);
  static void __ldterm_memwidth_utf8(uchar_t c, void *w);
  static char __ldterm_non_ascii_trailing_char_utf8(ldtermstd_state_t *tp);
  static char __ldterm_output_msg_utf8(queue_t *q, mblk_t *imp, mblk_t **omp,
			ldtermstd_state_t *tp, size_t bsize, int echoing);

  For the UTF-8 codeset, it is impossible to know the display width,
  i.e., screen column width, of a character by simply looking at the first
  byte, so __ldterm_dispwidth_utf8() will always report UNKNOWN_WIDTH.
  The UNKNOWN_WIDTH macro will be defined in <sys/ldterm.h>.

  Except the __ldterm_output_msg_euc() method, other __ldterm_output_msg_*()
  methods will not use typetab[], notrantab[], 


2.6.2. ldtermopen()

ldtermopen() will allocate memory blocks for t_csheaderp, t_csdatap, and
t_csmethodsp of the ldterm module's state pointer 'tp'. The t_csheaderp and
t_csdatap will be initialized with C locale (EUC) width info. The
t_csmethodsp will be initialized with EUC codeset methods. The memory
allocations and initializations will be done before the qprocson() invocation.


2.6.3. ldtermclose()

ldtermclose() will free the memory blocks assigned to the t_csheaderp,
t_csdatap, and t_csmethodsp data fields if they are not NULL pointers. The
memory deallocation will be done after the qprocsoff() invocation.


2.6.4. ldterm_docanon()

To figure out the type of the character at the end of the canonical buffer,
we will use the current codeset specific method of the ldterm(7M),
'tp->ldterm_non_ascii_trailing_char()'.

We will replace ldterm_euc_erase() and ldterm_tokerase() with more generic,
codeset independent ones:

  static void ldterm_csi_erase(queue_t *, size_t, ldtermstd_state_t *);
  static void ldterm_csi_werase(queue_t *, size_t, ldtermstd_state_t *);

When ldterm_csi_erase() or ldterm_csi_werase() encounters
UNKNOWN_WIDTH during its erase operation and the current codeset type is
LDTERM_CS_TYPE_UTF8, it will compute the width of the corresponding character
by calling the function:

  static int ldterm_utf8_width(uchar_t *u8char, int length);

The above function will use the algorithm presented in section 2.1 to
figure out the column width. If the UTF-8 bytes given in
'u8char' do not form a valid character within 'length' bytes, the function
will return -1. Otherwise, it will return the width of the character.
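
A minimal sketch of the contract described above follows. It assumes the
modern definition of UTF-8 (at most four bytes, no overlong forms, no
surrogates), which may be stricter than what the module would accept, and it
reads "within the 'length'" as: the first character must be complete within
'length' bytes. The sketch_* names are hypothetical, and
sketch_lookup_width() is only a stand-in for the table lookup of section
2.1, not real width data.

```c
/* Smallest code point that legitimately needs 1, 2, 3, or 4 bytes. */
static const long sketch_min_cp[4] = { 0x0, 0x80, 0x800, 0x10000 };

/*
 * Decode one UTF-8 character from the first 'length' bytes of 's'.
 * Returns the code point, or -1 if the bytes do not form a complete,
 * well-formed character.
 */
long
sketch_utf8_decode(const unsigned char *s, int length)
{
	long cp;
	int need, i;

	if (length < 1)
		return (-1);
	if (s[0] < 0x80) {
		cp = s[0];
		need = 0;
	} else if ((s[0] & 0xE0) == 0xC0) {
		cp = s[0] & 0x1F;
		need = 1;
	} else if ((s[0] & 0xF0) == 0xE0) {
		cp = s[0] & 0x0F;
		need = 2;
	} else if ((s[0] & 0xF8) == 0xF0) {
		cp = s[0] & 0x07;
		need = 3;
	} else {
		return (-1);	/* stray continuation or invalid first byte */
	}
	if (length < need + 1)
		return (-1);	/* character is cut off */
	for (i = 1; i <= need; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return (-1);
		cp = (cp << 6) | (s[i] & 0x3F);
	}
	/* Reject overlong forms, out-of-range values, and surrogates. */
	if (cp < sketch_min_cp[need] || cp > 0x10FFFF ||
	    (cp >= 0xD800 && cp <= 0xDFFF))
		return (-1);
	return (cp);
}

/* Stand-in for the table lookup of section 2.1; not real width data. */
int
sketch_lookup_width(long cp)
{
	/* U+FF50..U+FF5E are two columns wide in the table excerpt of 2.1. */
	return ((cp >= 0xFF50 && cp <= 0xFF5E) ? 2 : 1);
}

/* Width of the character in 'u8char', or -1 if it is not valid. */
int
sketch_utf8_width(const unsigned char *u8char, int length)
{
	long cp = sketch_utf8_decode(u8char, length);

	return (cp < 0 ? -1 : sketch_lookup_width(cp));
}
```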

If the state of the ldterm(7M) has TS_MEUC, i.e., if the ldterm(7M) is
processing a codeset that is a multibyte one and/or a multi-column width
one, it will use the current codeset specific methods to figure out the
display width (screen column width) and memory width (byte length) of
each character.

Maintenance of t_eucleft, t_eucp, and, t_codeset will be codeset independent.


2.6.5. ldterm_tabcols()

If the function encounters UNKNOWN_WIDTH in the 't_eucp' vector and
the current codeset type is LDTERM_CS_TYPE_UTF8, it will replace the value of
'*t_eucp' with the return value from the ldterm_utf8_width() function
described in section 2.6.4 so that correct column positions for the tab can
be returned.


2.6.6. ldterm_kill()

If the current t_state contains TS_MEUC, the rubout will be done by using
the values in 't_eucp' instead of actually looking into the character
returned from ldterm_unget(). If '*t_eucp' is 1, we will send the character
returned from ldterm_unget() to ldterm_rubout(). Otherwise, we will
send ' ' (an ASCII space character) to ldterm_rubout().
If '*t_eucp' is UNKNOWN_WIDTH and the current codeset type is
LDTERM_CS_TYPE_UTF8, the function will replace '*t_eucp' with the return
value from the ldterm_utf8_width() function described in section 2.6.4 so
that correct rubouts can be done for the UTF-8 character.
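
The decision described above can be sketched as a small helper.
sketch_rubout_char() and its parameters are hypothetical; 'resolved' stands
for the value that ldterm_utf8_width() would return for the character. The
memo does not say what happens when that value is -1, so this sketch simply
leaves the entry unchanged in that case.

```c
#define	UNKNOWN_WIDTH	251

/*
 * Hypothetical helper: pick the character to hand to the rubout routine
 * for one erased character, given its entry in the width vector.
 * 'resolved' is consulted only when the entry is UNKNOWN_WIDTH and the
 * codeset is UTF-8.
 */
char
sketch_rubout_char(unsigned char *widthp, char c, int is_utf8, int resolved)
{
	/* Replace an unknown width with the computed one, if there is one. */
	if (*widthp == UNKNOWN_WIDTH && is_utf8 && resolved >= 0)
		*widthp = (unsigned char)resolved;

	/* A one-column entry rubs out the character itself, else a space. */
	return (*widthp == 1 ? c : ' ');
}
```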


2.6.7. ldterm_do_ioctl()

- CSINFO_SET:
  If the ioctl command is CSINFO_SET, the function will first check the
  validity of the message by looking at the user-supplied data. If the
  user-supplied data is not valid, it will negatively acknowledge the
  message. If the data is valid, the module will save it in a temporary
  data structure, which will later be saved in the module's state as
  't_csheaderp'.

  After that, the function will wait for CSDATA_SET command(s) from stty(1).
  Once the function has received all necessary codeset width data with
  the CSDATA_SET command(s), it will check the validity of the received data
  and, if the data is correct, initialize the following data fields of
  the module state with proper values:

	t_maxeuc:	the max byte length of the codeset.
	t_state:	TS_MEUC will be bitwise OR'ed in if the current codeset's
			screen column width is bigger than 1.

	t_eucp_mp:	if the 't_maxeuc' is bigger than 1 and/or the 't_state'
			has TS_MEUC set, we will allocate a memory block of
			CANBSIZ to the field if it does not have one yet.
			Otherwise, this data field will be freed and/or
			nullified.
	t_eucp:		if the 't_maxeuc' is bigger than 1 and/or the 't_state'
			has TS_MEUC set, the 't_eucp' will point to an address
			within 't_eucp_mp'. Otherwise, this data field will be
			nullified.
	t_csheaderp:	the newly received codeset header information will be
			placed here.
	t_csdatap:	the newly received codeset width tables will be
			placed here.
	t_csmethodsp:	if the new codeset type is different from the previous
			one, we will also switch the methods to match
			the new codeset type.

  For each command we receive, we will acknowledge or negatively acknowledge
  it depending on the validity of the message, and also pass it downstream.

- CSINFO_GET:
  If the ioctl command is CSINFO_GET, the function will copy the necessary
  data from 't_csheaderp' to the user-supplied memory block and then
  acknowledge the message.
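
The CSINFO_SET and CSDATA_SET handling above amounts to a small state
machine: accept a header, count data messages, and commit on the last one.
The sketch below models only that bookkeeping. The sketch_* names are
hypothetical, the 'valid' argument stands in for the module's own data
validation, and committing immediately when 'csinfo_num' is 0 is an
assumption that the memo does not spell out.

```c
#include <stdbool.h>

#define	LDTERM_CS_TYPE_MIN	1
#define	LDTERM_CS_TYPE_EUC	1
#define	LDTERM_CS_TYPE_PCCS	2
#define	LDTERM_CS_TYPE_MAX	3

typedef struct sketch_cs_state {
	unsigned char	cur_type;	/* codeset type currently in effect */
	bool		pending;	/* header accepted, data still expected */
	unsigned char	new_type;	/* codeset type from pending header */
	unsigned char	expected;	/* csinfo_num from pending header */
	unsigned char	received;	/* CSDATA_SET messages seen so far */
} sketch_cs_state_t;

/* Handle CSINFO_SET. Returns true for an ACK and false for a NAK. */
bool
sketch_csinfo_set(sketch_cs_state_t *sp, unsigned char type, unsigned char num)
{
	if (type < LDTERM_CS_TYPE_MIN || type > LDTERM_CS_TYPE_MAX) {
		sp->pending = false;
		return (false);
	}
	if (num == 0) {		/* assumption: nothing to wait for */
		sp->cur_type = type;
		sp->pending = false;
		return (true);
	}
	sp->pending = true;
	sp->new_type = type;
	sp->expected = num;
	sp->received = 0;
	return (true);
}

/* Handle CSDATA_SET. Returns true for an ACK and false for a NAK. */
bool
sketch_csdata_set(sketch_cs_state_t *sp, bool valid)
{
	if (!sp->pending || !valid) {
		sp->pending = false;	/* drop the partial sequence */
		return (false);
	}
	if (++sp->received == sp->expected) {
		sp->cur_type = sp->new_type;	/* commit: switch methods */
		sp->pending = false;
	}
	return (true);
}
```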


3. Impact to any other components

In the on998 gate, only the crash(1M) command makes use of
the 'ldtermstd_state_t' data fields, especially the t_euc* data fields, to
print out the content of the system memory image. We will not change the
crash(1M) command.

One debug info file needs to be changed to incorporate the addition of
three data fields to the 'ldtermstd_state_t':

	ldtermstd_state.dbg