#ident "@(#)issues 1.2 08/10/15 SAC" Inception Review October 15, 2008: Roamer-1. When the material talks about current interface limitation, 4.1.2, why it's a problem to allow a driver to get more that *2* MSI-X? Those integrated device drivers should be prepared that it can not get any MSI-X interrupt vector, and it might try the legacy INTX instead. So it should not be a problem even all MSI-X vectors have been given to those attached drivers. Late-attached drivers will just use legacy INTX interrupts. The justification for current *hard-coded* limitation doesn't make sense. ANS: You're right. Drivers could revert to INTX (FIXED) interrupts if the system has run out of MSI-X vectors. The current limit of 2 vectors was derived based on the number of slots in current systems with PCIe (and thus MSI-X) support, and based upon restrictions of the interrupt vector space and priority assignments therein imposed by the current design for low-level interrupt management on x64 based systems. That was just the value that could safely be selected at that time due to those constraints. Roamer-2. How the IRM framework decide to decrease the number of interrupt vectors that have been given to a driver? 4.2.1 talk about how driver participate the IRM interfaces, but it's obscure how the framework can wisely move interrupt resources around drivers. ANS: The "Design & Implementation" specification provides more details, including pseudo code of the algorithms used. The implementation is purely mathematical, and employs the size of individual requests as a weighting factor when computing how many interrupts to give to each device. The goal is to take a set of requests and the total number of vectors in a pool, and compute the largest possible number of vectors that can be given to each device. Larger requests may be less fulfilled than smaller requests, and smaller requests may be totally fulfilled. 
The results of the computations are always consistent, whether the final
I/O configuration (and set of requests) was the result of a series of
hotplug operations or the initial boot-time configuration of the system.

Roamer-3. How does the IRM framework make a *wise* decision about which
driver can take more interrupt vectors than others? For example, suppose
you have a 10GbE NIC and a 1GbE NIC in the box, and both drivers ask for
16 vectors when not enough vectors are left. Giving the same number of
interrupt vectors to the two driver instances would be unreasonable. As
part of the Crossbow project, hardware resources are allocated according
to the real link speed and bandwidth needs. But as a low-level I/O
framework, IRM has no knowledge of that information. How do you show
that your "management" is reasonable?

ANS: One aspect of the design is that the algorithms implemented to do
these computations are modular. Additional algorithms can be added, and
then selected in the future to rebalance different pools based on
different policies. All of the devinfo nodes associated with all of the
requests in each pool are visible to these algorithms, so there is an
opportunity to expand our repertoire of algorithms in the future to give
different preferences to different types of devices, or to make more
elaborate policy decisions. The project delivers only two generic
algorithms to begin with, but there is room to evolve the underlying
implementation without changing the interfaces visible to drivers.

Roamer-4. What is the perimeter of IRM? In a virtualized environment,
interrupts might have been bound to CPUs in an exclusive zone or a guest
domain. When IRM asks for such interrupt vectors back from the driver,
who will take care of the interrupt re-targeting? It is outside the
driver's control, and I cannot find any relevant information in this
document.

Garrett-5. The interfaces are marked Committed. I have some concerns
with this, as I read the project details.
I'd feel a lot better if we had a more complete description of how this
would be used by some typical device drivers, along with some real
experience with them, before raising the commitment. If the project team
has some sample implementations that can use this, then I might change
my position. But in the absence of that, I'd feel better with an
Uncommitted binding while we get some experience with the APIs at hand.

ANS: This is a good suggestion, and in fact the project team has decided
that a better commitment level for these interfaces is CONSOLIDATION
PRIVATE. Our experience with the interfaces is limited. The Atlas/Neptune
team is actively converting the nxge driver to this project's interfaces,
and we have had technical consultations with HBA driver developers
(working on the QLogic and Emulex drivers) and with the Crossbow project
team. Input from these other teams has been taken into consideration in
our project, but certainly as more practical experience is gained we may
evolve our interfaces. The CONSOLIDATION PRIVATE commitment level will
allow us to manage changes to the interfaces by only having to affect
consumers in the ON consolidation as the interfaces evolve. A more
detailed example of how to use these interfaces will be written up for
future inclusion in the WDD, and to improve the example that is already
in the project's manpages. These additional examples are in the case
materials. The example used is derived from the real-world example of
how the Atlas/Neptune driver will utilize the interfaces, but slightly
generalized to represent an idealized, non-specific hypothetical driver.

Garrett-6. How is the default number of interrupts that will be
allocated to a non-IRM driver determined?

ANS: As previously answered regarding how the current default value of 2
was derived, the default number is based on what is appropriate for the
platform. On existing platforms with existing PCIe nexus drivers, the
current value is what is appropriate.
In this project, the default value becomes a property of the interrupt
pool when the pool is created by a nexus driver. That value will always
be whatever is appropriate for the platform on which the nexus driver
creates its interrupt pools, and it will increase on future platforms
where it makes sense to do so.

Durrant-7. I can't find any mention in the specification of an interface
for a device driver to discover or control interrupt CPU binding. In
general, for a driver with high data throughput, multiple interrupts on
the same CPU are pointless at best and in many cases harmful to
performance; in fact, multiple interrupts using the same CPU core, cache,
or even chip/package can be harmful, since they may cause needless CPU
contention. So, if a device driver is being given an interface to request
more interrupts, then that interface really should allow some control
over which CPUs those interrupts are bound to, whether this is specific
or by specification of a policy (e.g. one-per-cpu, one-per-cache,
one-per-chip, etc.). Also, for interrupts allocated using the existing
ddi_intr_alloc() interface there really should be a means to discover
the CPU binding, unless that call can be superseded by one that also
gives control over CPU binding for the initial allocation.

ANS: The scope of this project is just to make the number of available
interrupts given to each driver instance a dynamically managed value. A
driver may take the current number of available interrupts and other
factors into consideration during its attach(9F) routine before it
decides how many interrupt resources it should actually allocate and how
it will set up its handlers. What this project really boils down to is
notifying the driver through a callback mechanism when the number of
available interrupts has changed, at which time it can revisit those
original decisions.
We still depend on drivers to request whatever number of interrupt
resources is appropriate, as they already do today without this project.
Binding interrupt vectors to specific processors is a low-level function
beyond the scope of this project, best handled by platform-specific
nexus drivers or managed in the long term by something like intrd.
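The callback mechanism described in the answer to Durrant-7 can be
modeled in userland roughly as follows. Every name here (irm_pool_t,
irm_register, irm_set_avail, driver_cb) is a hypothetical illustration,
not one of the project's actual DDI interfaces:

```c
#include <assert.h>

/*
 * Illustrative userland model of the callback mechanism: a pool
 * tracks the number of available vectors and notifies registered
 * drivers when that number changes.  All names are hypothetical;
 * they are not the project's DDI interfaces.
 */
#define MAX_DRIVERS 8

typedef void (*irm_callback_t)(void *arg, int navail);

typedef struct {
    int navail;                       /* vectors currently available */
    int ncb;                          /* registered callbacks */
    irm_callback_t cb[MAX_DRIVERS];
    void *cbarg[MAX_DRIVERS];
} irm_pool_t;

/* A driver opts in to dynamic management by registering a callback. */
static void
irm_register(irm_pool_t *pool, irm_callback_t cb, void *arg)
{
    pool->cb[pool->ncb] = cb;
    pool->cbarg[pool->ncb] = arg;
    pool->ncb++;
}

/* The framework changes availability and notifies every driver. */
static void
irm_set_avail(irm_pool_t *pool, int navail)
{
    int i;

    pool->navail = navail;
    for (i = 0; i < pool->ncb; i++)
        pool->cb[i](pool->cbarg[i], navail);
}

/* A driver's state: what it wants versus what it has set up. */
typedef struct {
    int want;
    int inuse;
} driver_t;

/* On notification, the driver revisits its attach-time decision. */
static void
driver_cb(void *arg, int navail)
{
    driver_t *drv = arg;

    drv->inuse = (drv->want < navail) ? drv->want : navail;
}
```

In this model, when the framework shrinks the pool from 16 to 4
available vectors, the driver's callback runs and it scales itself back
from 8 vectors in use to 4 -- the "revisit those original decisions"
step described in the answer above.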