| Internet/Intranet Input Method
Architecture For Java, Network Computer and all other platforms. |
|---|---|
| A White Paper | |
Hideki Hiura Sun Microsystems, Inc. | |
| Revision SDK2 1.0 | |
| Last updated: 2/12/'99 | |
|
|
18-Mar-97 Copyright © 1995-99 Sun Microsystems, Inc. All Rights
Reserved. |
This document describes the high-level architecture of the
Internet/Intranet Input Method (IIIM), with its advantages of:
IIIM defines a set of interfaces, conventions, and protocols. These components:
The IIIM Framework enables flexible, scalable and efficient Input Method Services (IMS) that harmonize with existing platform input methods, and extend the IMS beyond the platform boundary. |
The IIIM provides a distributed input method (IM) that integrates a
variety of IM engines and interfaces on multiple platforms.
Most mature software platforms, such as UNIX, Windows, Macintosh, and Java, provide a pair of platform-specific IM interfaces: an IM application programming interface (IMAPI) and an IM service provider interface (IMSPI).
IMAPI provides the application programmer with an interface for writing IM-enabled applications, and IMSPI allows IM engine providers to plug in their IM service module as part of the system services. IMAPI and IMSPI closely reflect the architecture of each platform, so programming models tend to differ between platforms. Consequently, IM providers who wish to support multiple platforms must implement a different input method for each interface and architecture. Because porting IM software between incompatible platforms is difficult, IM providers generally concentrate on a single platform. As a result, one platform is enriched with a variety of attractive input methods while the rest have only a limited selection. Even platforms that share a network do not interoperate, because IM products have evolved in platform-dependent ways. Users on IM-poor platforms cannot use the input methods of IM-rich platforms, even when the machines sit in the same room.
The platform-specific evolution of IMAPI and IMSPI does not mean that IMAPI, IMSPI, and the input method itself must always be platform dependent. One possible solution would be to standardize IMAPI and IMSPI across platforms and encourage platform vendors, application writers, and IM service providers to migrate to the new interface. However, this is unlikely to succeed, because so many well-developed input methods are already on the market. A key requirement for a distributed IM service is to supply virtually all the input methods available on existing platforms. Ideally, existing input methods would be enabled without extensive modification; otherwise it would be unrealistic to expect all IM service providers to port their IM engines to a new IMSPI.
The IIIM approach does not define new SPIs, but rather federates the different
existing IMAPI, IMSPI, and IM engines. IIIM defines three key concepts:
|
Platform Neutralness |
An important design goal of IIIM is platform neutralness. In the
context of IIIM, Java is yet another platform which defines another new
IMSPI and IMAPI. To leverage existing IMSPIs, IMAPIs, and IM engines on
differing platforms, it is essential to absorb the differences in the
components that interface directly with each of the IMSPIs, IMAPIs, and IM engines,
and in their communication infrastructure.
IIIM defines server and client frameworks and conventions that are adaptable to Java and to all platforms which have IMSPI, IMAPI, and IM engines.
|
Distributed IM infrastructure |
IIIM, as a distributed IM infrastructure, defines a new protocol which
is highly optimized for IM service. An initial design option for IIIM
was to limit the target platform to Java, using Java RMI (Remote Method
Invocation) and JNI (Java Native Interface). However, introducing a heavy
dependency on the Java platform was contrary to our principle of
providing an integrated IM solution for all existing platforms.
By writing another agent to interface with the other side of the API, such as a protocol conversion agent, the IMEs on Windows can supply IM services to applications running on UNIX and Macintosh. There are also two issues concerning efficiency: footprint and run-time performance. |
Efficiency |
An input method can be a frequently and heavily used application for
users who rely on language input to the computer. IM performance
directly affects how efficiently those users can work.
The first client implementation of IIIM was in Java. At the time there was a suggestion to use Java's RMI (Remote Method Invocation) rather than inventing a new protocol from the beginning. Efficiency is the primary reason that IIIM uses its own highly optimized protocol designed for input methods, rather than an easy-to-use generic network framework such as Java's RMI or OMG's CORBA. The performance of network-based IM has been well measured, and its implications have been recognized by several studies of the X Input Method protocols, which researchers have investigated heavily for years. Based on these studies, and on careful studies of Java RMI performance for distributed IM use, the IIIM design team decided to invent a new, highly optimized protocol that balances performance and extensibility. |
Scalability |
The IIIM architecture supports dispersion of the input method load at
several levels. The IIIM cascade mechanism enables dispersion at the
Language Engine layer. This is important because the Language Engine layer often
becomes a bottleneck, as it is typically heavily CPU and I/O bound.
Allowing dispersion of the Language Engine layer throughout the network
guarantees scalability in most cases.
The IIIM Primary Composition Engine (PCE), with a downloadable syntax rule, enables practical client/server load dispersion for languages which typically need multi-level composition/conversion operations, such as Japanese and Korean. The IIIM server can delegate simple preprocessing to each client by downloading a syntax rule to the PCE in the IIIM Client Framework. The preprocessing itself is trivial, so the added load on the client is negligible. As a result, the number of interactions between the server and the client decreases dramatically, which improves scalability. Further load dispersion can be achieved through the Collaborative Lightweight Engine downloading mechanism: the Language Engine can delegate any part of its IM functionality to the client by downloading objects that act as its front end. Rather than using costly generic network frameworks, such as Java's RMI or OMG's CORBA, the IIIM infrastructure uses a specially designed and optimized protocol. This ensures scalability among existing multilingual, network-capable input methods when network bandwidth is a bottleneck. |
Extensibility |
For IM service providers, it is essential that an IM framework
provide features that differentiate their products from others. Some examples are:
The Lightweight Engine downloading mechanism and the IIIM protocol extension mechanism enable engine-specific enhancement of IM functionality. Cooperation with special applications depends on the platform IMSPI capability. |
Security |
| The security model for IIIM technology is built around each platform's security mechanism. For example, IIIMJCF on Java2 fully utilizes Java2's security policy mechanism. The objects downloaded from the IIIM Server fully comply with the Java sandbox security management model. |
Agent/Bridge/Cascade Model |
|
One of the goals of IIIM infrastructure is to leverage the existing IM
engines that are available on different platforms. IM engine
technology, although complex, is a mature technology that has produced
many viable intelligent engines on several platforms. It would be
unrealistic to expect IM service providers to port their IM engines
onto a new IMSPI each time one is defined. Clearly,
utilizing the existing IM engines, without requiring extensive
modifications, would provide an ideal solution.
The IIIM Bridge is a pseudo IM client module that uses existing platform specific proprietary IMSPIs to leverage existing platform engines. This is done without requiring extensive modifications to the existing engines:
Another goal of IIIM infrastructure is to allow the client to be as thin as possible. With IIIM server as an agent for widely distributed IM federation, IIIM client can be free from complex resource management.
| |
Multilingual/Multiscript Support. |
IIIM uses UTF16 with a source identifier that identifies the
language as its primitive of text. Among the several Unicode/ISO/IEC 10646
representations, UTF16 has the smallest footprint that supports the full
Unicode repertoire.
UCS2 was not considered because it only
covers a subset of the Unicode repertoire. UCS4, on the other hand, covers the full
Unicode repertoire and makes character handling easier, but it has a large
footprint which does not meet IIIM's maximum scalability
requirement. Legacy encodings for each
locale, with a legacy extension mechanism such as ISO2022 or Compound
Text, were another option that was considered. This approach can be useful for maintaining backward
compatibility with systems that assume codesets are octet-oriented and
upwardly compatible with ASCII.
To avoid the negative heritage of all these models, the entire IIIM Framework was built from scratch. IIIMText for IIIM uses the source identifier as part of the protocol primitive and of the primitive data structure of the IIIM Framework. This elevates UTF16-based text from multiscript support to multilingual support. |
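The footprint argument and the role of the source identifier can be illustrated with a short Python sketch. The `TaggedText` class, its field names, and the language tags are hypothetical stand-ins chosen for this example; they do not reproduce the actual IIIMText layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """Hypothetical language-tagged text; not the real IIIMText structure."""
    language: str  # source identifier, written here as a short tag (assumed form)
    utf16: bytes   # text encoded as big-endian UTF-16


def tag(language: str, text: str) -> TaggedText:
    """Attach a language identifier to UTF-16 encoded text."""
    return TaggedText(language, text.encode("utf-16-be"))


# Three BMP characters plus U+20BB7, which lies outside the BMP and
# therefore needs a surrogate pair in UTF-16.
sample = "日本語" + "\U00020BB7"

utf16_size = len(sample.encode("utf-16-be"))  # 3 * 2 bytes + 4 bytes
utf32_size = len(sample.encode("utf-32-be"))  # 4 code points * 4 bytes

# The same code point, U+76F4, tagged with two different languages.
japanese = tag("ja", "\u76f4")
chinese = tag("zh", "\u76f4")
```

The encoded bytes of `japanese` and `chinese` are identical, so the code point alone identifies only the script. The tag is what lets a consumer treat the two objects as text in different languages.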
IIIM Framework |
| The entire distributed IM framework, the IIIM Framework (IIIMF), consists of the IIIM Server Framework (IIIMSF), the IIIM Client Framework (IIIMCF), and the IIIM Protocol (IIIMP). The framework itself and the protocol are designed to be independent of the operating system that the client or server is running on. IIIMF offers IM services across platform boundaries via the platform-independent distributed IM protocol (IIIMP). Adaptation to platform-specific capabilities is designed to be externalized and pluggable. |
IIIM Server Framework - Platform Neutral IIIM Server |
The IIIM Server Framework (IIIMSF) consists of:
The following diagram illustrates how each pluggable module interoperates inside IIIMSF.
Because the X11R6 XIM protocol specification lacks a lookup-choice protocol, some extensions are required to support the IIIMP bridge for the X11R6 XIM protocol. One way around this limitation is the X11R5 XIMP protocol, which defines the lookup-choice protocol and a standard extension, and would be less problematic than the X11R6 XIM protocol. However, our approach on Solaris was to merge the IIIMP bridge and the IIIMP agent, as they already support the major Language Engines on UNIX and we can get almost the same coverage with more stability. The IIIMP driver, under the Language Engine Interface, enables the functions as the IIIMP agent (with the multilingual, multi-language-engine capability it already has).
|
IIIM Client Framework |
The IIIM Client Framework (IIIMCF) consists of:
|
IIIM Java Client Framework |
The IIIM Java Client Framework (IIIMJCF) is a Java2 adaptation of IIIMCF,
implementing the full IIIM Client Framework as a 100% pure Java input method.
A Java2-based platform can take advantage of the full IIIM Framework
capability.
|
IIIM X Client Framework |
| The IIIM X Client Framework (IIIMXCF) is an X Window System adaptation of IIIMCF. IIIMXCF is implemented as an extension to Xlib, supporting the IIIM Protocol as an alternative IM protocol to the XIM Protocol. |
IIIM Protocol |
| IIIM Protocol (IIIMP) is a protocol designed from scratch to support a platform-independent distributed IM infrastructure, and it is a key component of the IIIM Framework. IIIMP is an extensible, platform-independent, window-system-independent, and language-independent IM protocol. It is specially designed for distributed IM service in heterogeneous and highly diverse network environments. The protocol is also designed to be extremely efficient on the low-bandwidth Internet and on low-bandwidth areas of an intranet (e.g. a PPP connection over a modem). |
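To give a feel for why a purpose-built binary protocol can stay compact on a slow link, here is a minimal Python sketch of length-prefixed framing for a single IM message. The 4-byte header layout, the opcode value, and the function names are assumptions made for illustration only. They are not the IIIMP wire format, which is defined in the protocol specification.

```python
import struct
from typing import Tuple

# Hypothetical header: 1-byte opcode, 1 pad byte, 2-byte payload length (big-endian).
HEADER = struct.Struct(">BxH")
OP_COMMIT_STRING = 0x01  # made-up opcode for this sketch


def pack_message(opcode: int, text: str) -> bytes:
    """Frame a text payload as header + big-endian UTF-16 bytes."""
    payload = text.encode("utf-16-be")
    return HEADER.pack(opcode, len(payload)) + payload


def unpack_message(data: bytes) -> Tuple[int, str]:
    """Inverse of pack_message; returns (opcode, text)."""
    opcode, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    return opcode, payload.decode("utf-16-be")


message = pack_message(OP_COMMIT_STRING, "かな")
```

In this sketch a two-character commit occupies a 4-byte header plus 4 bytes of text. A generic object-serialization framework would typically add class descriptors and call metadata on top of the same payload.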
Resource Manager/Session Manager | |
The Resource Manager/Session Manager is a core part of IIIM Server. It
manages:
|
Protocol Driver Manager | |
| The Protocol Driver Manager provides a framework for plugging in an IM protocol
driver. This provides a transparent mechanism for an Input Method
Engine to access IM protocols. The IIIM Framework's default configuration
supports (but is not limited to) the following three major IM protocol
drivers:
|
Visual Feedback Rendering Objects | |
During input composition, some languages require visual user feedback
for the intermediate composed string and others do not.
IIIM Framework provides two options:
The X Window System allows a process to render images on windows that belong to other processes. Traditionally, an IM server renders intermediate visual feedback directly or indirectly onto the client window. This is called server rendering. On other platforms, which do not allow this type of remote window rendering, the IM server generally asks the client to render the intermediate visual feedback instead. IIIM Framework supports the following three methods for rendering:
|
IIIM Client Framework enables an IIIM client to access the input
methods resident on the IIIM server host. This is done via IIIM
protocol (IIIMP). IIIM Client Framework is:
There have been several studies done on the implementation of the IIIM Client Framework in a variety of programming languages on different platforms:
|
Platform IMSPI Adapter | |
| The Platform IMSPI Adapter layer interfaces with the platform-specific IMSPI, separating it from the rest of the IIIM Client components. This layer absorbs the differences of IMSPI models among platforms, and provides a consistent view to the rest of the IIIM Client components. |
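The adapter idea can be sketched in Python as follows. Each adapter converts one platform's native key event into a single neutral event type, so the rest of the client never sees platform details. The event shapes, class names, and field names are hypothetical and are not taken from any real IMSPI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    """Platform-neutral key event (hypothetical shape)."""
    char: str
    shift: bool


class XLikeAdapter:
    """Adapts an X-style event: a dict with a keysym and a modifier bitmask."""

    SHIFT_MASK = 1  # lowest modifier bit, assumed to mean Shift in this sketch

    def to_neutral(self, raw: dict) -> KeyEvent:
        return KeyEvent(raw["keysym"], bool(raw["state"] & self.SHIFT_MASK))


class JavaLikeAdapter:
    """Adapts a Java-style event: a (char, list of modifier names) pair."""

    def to_neutral(self, raw: tuple) -> KeyEvent:
        char, modifiers = raw
        return KeyEvent(char, "SHIFT" in modifiers)


# Two differently shaped native events describing the same keystroke.
events = [
    XLikeAdapter().to_neutral({"keysym": "a", "state": 1}),
    JavaLikeAdapter().to_neutral(("a", ["SHIFT"])),
]
```

Both native events normalize to the same `KeyEvent`, which is the "consistent view" the upper layers rely on.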
Dynamic Lightweight Engine Switching/Stacking with IIIM Event/Object Manager | |
IIIMCF defines the IIIM Event/Object Manager to control event flow. All
incoming and outgoing events from IIIMCF components are managed by IIIM
Event/Object Manager. IIIM Event/Object Manager supports dynamic
configuration including Dynamic lightweight engine switching/stacking
through dynamically loadable manager rules. This mechanism allows IM
modules to be configured as a stack of:
|
Primary Composition Engine and Downloadable Syntax Rules |
The Primary Composition Engine (PCE) is a special built-in, state-machine-based
Language Engine which handles simple input composition
operations. Under the default IIIM Event/Object Manager rule, all
incoming events from the Platform IMSPI Adapter are dispatched to PCE
first.
PCE supports dynamic configuration of its event binding through a dynamically loadable syntax rule, com.sun.iiim.pce1.s1. The expressive power of this rule covers languages from simple European key composition to Korean hangul and Japanese Romaji-kana conversion, including pre-edit feedback and status feedback. |
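As a rough illustration of state-machine-based composition driven by a loadable rule, here is a minimal Python sketch. The rule table format (a plain mapping from key sequences to output text), the class name, and the sample rules are assumptions for illustration. They do not reproduce the com.sun.iiim.pce1.s1 syntax.

```python
class CompositionSketch:
    """Tiny key-sequence composer driven by a replaceable rule table."""

    def __init__(self, rules):
        self.load_rules(rules)

    def load_rules(self, rules):
        """Swap in a new rule table, as a downloaded rule might."""
        self.rules = dict(rules)
        # Every proper prefix of a rule key is a valid intermediate state.
        # (Keys that are both a rule and a prefix are not handled here.)
        self.prefixes = {k[:i] for k in self.rules for i in range(1, len(k))}
        self.pending = ""  # pre-edit keys not yet committed

    def feed(self, key):
        """Consume one keystroke and return the text it commits, if any."""
        candidate = self.pending + key
        if candidate in self.rules:
            self.pending = ""
            return self.rules[candidate]
        if candidate in self.prefixes:
            self.pending = candidate
            return ""
        # No rule matches: commit pending keys literally, then retry the key alone.
        flushed, self.pending = self.pending, ""
        return flushed + (self.feed(key) if flushed else key)

    def compose(self, keys):
        """Feed a whole key sequence and return all committed text."""
        return "".join(self.feed(k) for k in keys)


# One table mixing a European dead-key rule with a few Romaji-kana rules.
rules = {"'e": "é", "ka": "か", "ki": "き", "a": "あ"}
pce = CompositionSketch(rules)
```

With these rules, `pce.compose("kaki'e")` walks through the pending states `k` and `'` locally and commits three composed characters. In a client/server setting only committed text would need to cross the network, which is where the reduction in round trips comes from.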
Downloadable LightWeight Engine |
| The Lightweight Engine (LWE) is an Input Method Engine that runs on the IIIM Client Framework. The architecture of an LWE is determined by the platform on which IIIMCF is implemented. An LWE for IIIMJCF must be 100% pure Java and comply with all Java applet restrictions. On the client side, an LWE is completely resident at run time, and can be loaded from the IIIM server on demand. An LWE provides collaborative IM support with the IIIM server; however, an LWE can also be developed as a purely stand-alone engine. |
Downloadable GUI Objects | |
|
The IM user interface can be categorized into the
following four regions:
IIIMCF provides default GUI modules for all four regions; however, any of the GUI modules can be dynamically downloaded from the IIIM Server. The GUI, in particular, is an area where IM providers can effectively show their originality. GUI object downloading may often be triggered by the Language Engine Module when a Language Engine is selected.
|
Disconnected Mode |
IIIMCF is capable of providing Input Method support in
IIIMCF uses:
|
Key Binding Synchronization | |
| The Primary Key Binding Rule is determined by PCE, LWE, and the language engines loaded onto the remote IIIM Server. By default, the IIIM Framework uses the com.sun.iiim.pce1.s1 syntax rule for PCE. The language engines and PCE loaded onto the remote IIIM Server can synchronize on key binding by downloading the Language Engine Specific Key Binding Rule to IIIMCF. The language engines loaded onto the remote IIIM Server can also synchronize with LWE by downloading their own collaborative LWEs onto IIIMCF. |
JavaOS1.x Lightweight Engine Compatibility Box | |
| Earlier versions of IIIMJCF in JavaOS1.x provided a simple Lightweight Engine interface which is incompatible with IIIMJCF for Java2. The compatibility box enables LWEs, written for IIIMJCF in JavaOS1.x, to run inside IIIMJCF for Java2. |
![]() | Protocol Specification |
Jini based discovery and lookup | |
| Jini discovery and lookup service can be considered as an additional mechanism for the lookup and downloading of IIIM objects, especially for IIIMJCF. Once IIIMJCF itself becomes Jini enabled (depending on whether or not the JDK IM framework engine SPI supports Jini-based IM discovery and lookup), it is technically possible to load IIIMJCF itself dynamically onto a Java2 platform. |
Voice input | |
| IIIM Protocol works as a lightweight container protocol framework for IM. Collaborative voice input device support alongside other input methods, including a Network Voice Recognition Server as well as a Local Lightweight Voice Recognition Engine, can technically be incorporated seamlessly. |
|
|