.de Sc
\\s-1\\$1\\s0\\$2
..
.ds cA 2005/471
.ds aR \s-1PSARC\s0
.LP
.so /shared/sac/Tools/lib/amac
.Co
.ds LF \fI\*(aR/\*(cA\fP
.ds RF \fICopyright 2006 Sun Microsystems\fP
.if n .ds CF
.IP \fBSubject:\fP 15
BrandZ Support for non-native zones
.IP "\fBSubmitted by:\fP" 15
Nils Nieuwejaar
.IP \fBFile:\fP 15
\*(aR/\*(cA/opinion.ms
.IP \fBDate:\fP 15
July 28th, 2006
.IP "\fBCommittee:\fP" 15
Ed Gould (Opinion by Alan Hargreaves),
James D Carlson,
Glenn Skinner,
William Sommerfeld,
Gary Winiger.

.IP "\fBProduct Approval Committee:\fP" 15
solaris-pac-opinion@sun.com
.pn 2
.NH
Summary
.LP
BrandZ is an extension of the zones infrastructure that allows the
creation of zones that emulate non-native operating system
environments, such as Linux.  Future projects may extend this project
to build other non-native operating environments.

.NH
Decision & Precedence Information
.LP
The project is approved as specified in reference [1].
.LP
The project may be delivered in a patch release of Solaris.
.LP
The project supersedes \*(aR/2003/445: Janus: Linux binary compatibility
for Solaris x86.
.LP
The project depends on \*(aR/2006/440: BrandZ-aware Installer and may
not be delivered before it.

.NH
Interfaces
.LP
The project exports the following interfaces.
.if n .ne 8
.if t .ne 3
.TS H
box;
c s s
l | l | l.
Interfaces Exported
_
Interface	Classification	Comments
_
.TH
T{
Linux Interfaces:

System calls (structure, semantics and calling conventions)

/dev (names and major/minor #'s)
.br
/proc
.br
signal numbers
.br
error numbers
T}	External	T{
This category includes all of the different Linux interfaces that the
lx brand emulates.
T}
_
T{
.na
AT_SUN_BRAND_BASE
AT_SUN_BRAND_LDDATA
AT_SUN_BRAND_LDENTRY
AT_SUN_BRAND_BRANDNAME
AT_SUN_BRAND_PHDR
AT_SUN_BRAND_PHENT
AT_SUN_BRAND_PHNUM
AT_SUN_BRAND_ENTRY
T}	T{
Project Private
T}	T{
Additional AUX vector flags used to convey brand information to the
Solaris linker
T}
_
config.xml	T{
Project Private
T}	Brand definition
_
platform.xml	T{
Project Private
T}	T{
Virtual platform definition
T}
_
struct modlbrand	T{
Consolidation Private
T}	T{
Kernel/brand module linkage interface
T}
_
struct brand	T{
Project Private
T}	T{
Kernel/brand operational interface
T}
_
struct brand_ops	T{
Project Private
T}	T{
Kernel/brand operational interface
T}
_
struct brand_mach_ops	T{
Project Private
T}	T{
Arch-specific kernel/brand operational interface
T}
_
struct brand_attr	T{
Project Private
T}	T{
Userspace/kernel interface
T}
_
struct lx_brand_registration	T{
Project Private
T}	T{
Userspace/kernel interface
T}
_
rd_helper_ops_t	T{
Consolidation Private
T}	T{
librtld_db.so helper plugin interface
T}
_
T{
.na
brand_open()
brand_close()
brand_is_native()
brand_get_boot()
brand_get_halt()
brand_get_initname()
brand_get_install()
brand_get_modname()
brand_get_postclone()
brand_get_verify()
brand_platform_iter_gmounts()
brand_platform_iter_lmounts()
brand_platform_iter_devdir()
brand_platform_iter_link()
T}	T{
Project Private
T}	T{
libbrand.so.1 is a new library for parsing the BrandZ .xml files
T}
_
T{
zonecfg_get_brand()
zone_get_brand()
T}	T{
Contracted Project Private
T}	T{
Added to libzonecfg.so.1
.br
Contract in reference [4]
T}
_
zonecfg(1M)	Evolving	Added -B <brand> option
_
zoneadm(1M)	T{
Project Private
T}	T{
Added -f (force) option to mount and boot commands
.br
Added "brand" column to verbose "list" output
T}
_
T{
.na
lockd(1M)
statd(1M)
T}	T{
Consolidation Private
T}	T{
Added -P option to indicate portmapper usage
T}
_
libnsl(3LIB)	T{
Consolidation Private
T}	T{
Add __use_portmapper() to resurrect old portmapper support
T}
_
streamio(7I)	Evolving	T{
Add support for TIOCSCTTY, TIOCNOTTY, TIOCSETLD and TIOCGETLD
T}
_
uucopy(2)	Evolving	T{
.na
Added to libc.so.1
.br
See design doc 3.5.2
T}
_
set_setcontext_enforcement(3C)	T{
Consolidation Private
T}	T{
.na
Added to libc.so.1
.br
See design doc 3.6.2
T}
_
setsigacthandler(3C)	T{
Consolidation Private
T}	T{
.na
Added to libc.so.1
.br
See design doc 3.6.1
T}
_
lx-install(1M)	Evolving	T{
Invoked by zoneadm(1M), but options are user-visible
T}
_
lx-syscall(7D)	Evolving	Linux syscall provider
_
lx_ptm(7D)	T{
Project Private
T}	Linux pty master driver
_
ldlinux(7M)	T{
Project Private
T}	T{
STREAMS module that provides Linux termio(7I) semantics
T}
_
lx_afs(7D)	T{
Project Private
T}	Linux automounter support
_
lx_audio(7D)	T{
Project Private
T}	T{
Layered driver to convert Linux semantics to Solaris
T}
_
.TE
.LP
The project imports the following interfaces.
.if n .ne 8
.if t .ne 3
.TS H
box;
c s s
l | l | l.
Interfaces Imported
_
Interface	Classification	Comments
_
.TH
Linux syscall Interface	External
_
T{
.na
rpm2cpio(1M) CLI
rpm CLI
T}	External	Used to install Red Hat software
_
T{
Linux statd(1M) and lockd(1M) uid/gid #'s
T}	External	T{
Used to support NFS locking within lx branded zones
T}
_
T{
.na
glibc ABI
  gethostbyname_r
  gethostbyaddr_r
  getservbyname_r
  getservbyport_r
  openlog
  syslog
  closelog
  __progname
T}	External	T{
Used to provide naming services to Solaris statd(1M) and lockd(1M)
daemons. See section 3.8 of the design doc.
T}
_
RHEL 3.x contents	External	T{
/etc files, rc.d scripts, etc. which we modify at install time.
T}
_
Linux ELF format	External	T{
Object file format for Linux binaries
T}
.TE
.NH
Opinion
.LP
.NH 2
The \fIlx\fP Brand
.LP
The word \fILinux\fP is a trademark. To avoid trademark issues, the name
\fIlx\fP is used to refer to Linux-branded zones.

\*(aR had concerns about the management of this namespace for future
brands and releases of Linux. As a result, references to specific
releases of the Linux kernel were removed from the documentation, and
the lx brand will not be associated with specific releases of the Linux
kernel.

.NH 2
Executable Stacks
.LP
For compatibility BrandZ has to allow Linux applications to run with
executable stacks, so those applications are vulnerable to any security
holes that are opened by those stacks. However, since the application is
running inside a zone, any damage would be confined to that branded zone. A
compromised zone will not be able to bring down the system, and will
neither have access to, nor be able to damage, applications or data in
other zones.

Considered more generally, a BrandZ-hosted Linux environment will be
subject to any security holes in the Linux user-space.  However, it will
not be vulnerable to any security holes that depend on kernel support or
kernel bugs.  This would arguably make a BrandZ-hosted RHEL 3 environment
more secure than a native RHEL 3 environment.

.NH 2
Truss, Apptrace and Dbx
.LP
Truss has been updated to recognize the new Solaris system calls; it
has not been, and will not be, updated to understand and display the
Linux system calls issued by the application.  An lx-syscall DTrace
provider makes that information available.

dbx does not currently work, but this appears to be a bug rather than a
fundamental limitation of the design.  This is still under investigation
and is being tracked as:

        6445248 dbx cannot grok Linux processes

.NH 2
Live Upgrade and Packaging Tools
.LP
Live Upgrade does not run with zones. The packaging tools will go
into the install gate. The bug

	63242179 packaging tools need to be brand aware

has been filed and links to this case.

\*(aR/2006/440 has been submitted and approved for working with Live
Upgrade.

.NH 2
Audio
.LP
The zones infrastructure currently has no notion of a device-specific
attribute, which is needed to support systems with multiple audio devices.
Adding such a capability would have required an extensive overhaul of
how devices are configured and managed.

Rather than redesign the core of the zones configuration tools simply
to solve one Linux corner case, the project team chose to use the
generic attributes mechanism to support audio devices.

.NH 2
Solaris Trusted Extensions
.LP
After discussions between the project teams for this case and for
Trusted Extensions, it was determined that lx branded zones will not be
supported on trusted systems where labels are active.

.NH 2
lxrun
.LP
\*(aR/2006/441 has been submitted and approved to EOL \fIlxrun\fP.

.NH 2
Process Auditing
.LP
Processes running in an lx-branded zone do not have their Linux system
calls audited.  Otherwise, they are subject to all the standard auditing.
For example, Linux process creation/exit events are captured as for any
other process.  The Solaris system calls that the brand library uses to
emulate the Linux system calls are subject to auditing.

The only restriction is that the Solaris audit processing tools cannot run
inside the Linux zone, so the audit records must be consumed by tools
running in the global zone.

.NH 2
Signals to init
.LP
During inception, \*(aR expressed a concern about how the lx init would
deal with system-generated signals that it was not expecting. The project
team has addressed these concerns as follows.

With standard Solaris zones, the kernel and init are in agreement on how to
handle the death of init: the kernel restarts the process, and the
resurrected init process uses a state file to pick up where its predecessor
left off.

The Linux init is not prepared to handle this kind of restart.  When it is
restarted, it works its way through the entire boot process again.  This
means that all the rc.d scripts are rerun, and we end up with multiple
instances of services like crond, syslogd, and so on.

Since the Linux init cannot simply ignore SIGSEGV, and since it is not
prepared to handle a warm restart, the only action that will deliver a
sensible result is to reboot the zone.  Regardless of whether this is the
expected behavior on a native Linux system, it's the behavior that will be
implemented inside a Linux zone.

.NH 2
Delegated Administration of Solaris-specific capabilities
.LP
Linux-branded zones will always be second-class citizens in many ways.  As
our real goal is to increase Solaris adoption, using BrandZ as one part of
a migration strategy, we view this as a feature rather than a bug.

To address these specific issues: ZFS delegation will not work within a
Linux zone.  Given sufficient customer interest, we could possibly support
the ZFS utilities, but it would take a significant amount of engineering
work, and would violate our "one binary type per zone" model. It should be
noted that this in no way affects being able to install and run a Linux
zone on a ZFS filesystem.

Supporting network delegation is significantly more feasible.  By emulating
the ioctl()s needed to perform network configuration tasks, we should be
able to support network delegation using Linux configuration tools.  This
would not be a trivial engineering effort, but it would certainly fit
within the overall BrandZ model.

.NH 2
Impact on Zones Upgrade
.LP
The zones test suite, which is run as a regular part of the PIT suite, will
be extended to include testing of lx-branded zones.


.NH
Minority Opinion(s)
.LP
None.
.NH
Advisory Information
.LP
None.
.NH
Appendices
.NH 2
Appendix A: Technical Changes Required
.LP
None.
.NH 2
Appendix B: Technical Changes Advised
.LP
None.
.NH 2
Appendix C: Reference Material
.LP
Unless stated otherwise, path names are relative to the case
directory \*(aR/\*(cA.
.RS
.IP 1.
\fBSpecification\fP
.br
File: final.materials/design.pdf
.br
File: committment.materials/onepager
.br
File: committment.materials/what_works
.IP 2.
\fB20 Questions\fP
.br
File: final.materials/20_questions
.IP 3.
\fBMan Pages\fP
.br
File: committment.materials/brand.dtd.1
.br
File: committment.materials/brands.5
.br
File: committment.materials/design.pdf
.br
File: committment.materials/lx.5
.br
File: committment.materials/zone_platform.dtd.1
.br
File: committment.materials/zoneadm.1m
.br
File: committment.materials/zonecfg.1m
.br
File: committment.materials/zones.5
.IP 4.
\fBContract between Solaris Core Technologies and Solaris Install\fP
.br
File: contract-01

.RE
