#pragma ident "@(#)ddi-interrupts.txt 1.48 04/10/11 SMI"

Title: Advanced DDI Interrupt Functions.

Date: 10/09/04

Authors: david.kahn@sun.com
         wesley.shao@sun.com
         govinda.tatti@sun.com

Abstract:

This document describes a new set of ddi advanced interrupt functions
that support existing and new buses with new interrupt types. The
advanced functions provide more control for device drivers using
existing or new interrupt types. Legacy support for existing interrupt
functions and data structures is maintained for complete driver
compatibility.

Table of Contents

    1. Introduction
    2. Scope
    3. Theory of Operation.
    4. DDI Legacy and Retained Interrupt Functions
        4.1. Interrupt functions available in both old and new framework
        4.2. Retained for compatibility with the existing interrupt
             framework
    5. Data Definitions.
        5.1. Interrupt handler type.
        5.2. Interrupt pri, given as 'pri' in this document.
        5.3. Soft Interrupt pri. (soft_pri)
        5.4. Interrupt identification (type and inum)
        5.5. Interrupt types.
        5.6. Interrupt flags.
        5.7. Interrupt Resource Management Callback types and definitions.
        5.8. Interrupt handle.
        5.9. New DDI generic error codes.
    6. DDI Advanced Interrupt Functions for use by Device Drivers.
        6.1. ddi_intr_get_supported_types(9f)
        6.2. ddi_intr_get_nintrs(9f), ddi_intr_get_navail(9f)
        6.3. ddi_intr_register_management_cb(9f),
             ddi_intr_cancel_management_cb(9f),
             ddi_intr_enable_management_cb(9f),
             ddi_intr_disable_management_cb(9f)
        6.4. ddi_intr_alloc(9f), ddi_intr_free(9f)
        6.5. ddi_intr_get_cap(9f), ddi_intr_set_cap(9f)
        6.6. ddi_intr_get_hilevel_pri(9f)
        6.7. ddi_intr_get_pri(9f), ddi_intr_set_pri(9f)
        6.8. ddi_intr_add_handler(9f), ddi_intr_dup_handler(9f),
             ddi_intr_remove_handler(9f)
        6.9. ddi_intr_enable(9f), ddi_intr_disable(9f),
             ddi_intr_block_enable(9f), ddi_intr_block_disable(9f)
        6.10. ddi_intr_set_mask(9f), ddi_intr_clr_mask(9f)
        6.11. ddi_intr_get_pending(9f)
        6.12.
              ddi_intr_add_softint(9f), ddi_intr_remove_softint(9f),
              ddi_intr_get_softint_pri(9f), ddi_intr_set_softint_pri(9f),
              ddi_intr_trigger_softint(9f)
    7. References

    A. Bus Specific Binding Information
        A.1. Application of ddi Interrupt Functions to SBus.
        A.2. Application of ddi Interrupt Functions to ISA-like Buses.
        A.3. Application of ddi Interrupt Functions to PCI* Buses.
    B. Example Pseudo-Code.
        B.1. Simple MSI Example.

1. Introduction

New buses with new programmable and variable interrupt functionality
require new support for drivers. At the same time, the existing ddi
interrupt interfaces are antiquated and need updating, while maintaining
compatibility for existing ddi-compliant drivers.

Legacy devices on legacy buses signal interrupts by using an external
interrupt pin or pins. These signals may be routed out of band from the
device hierarchy to a separate interrupt concentrator and may be
converted to a different form when sent to a CPU for servicing. This
document uses the term "fixed interrupts" for discrete interrupts
signaled by an interrupt pin.

On-board devices using fixed interrupts may have "optimized" wiring for
their interrupt signals. Plug-in PCI devices under a plug-in pci-pci
bridge even have their interrupt signals "twisted" to aid in automatic
distribution of PCI interrupts before those signals are routed in some
"optimized" fashion to an interrupt controller on the motherboard.

When these fixed interrupts are routed to an interrupt concentrator, and
their message converted to a cpu message, the signal that the cpu
receives may be shared among several interrupting devices. In many
cases, especially in traditional "Wintel PC" based systems, the number
of unique signals an interrupt concentrator could issue to the CPU was
limited. These limitations often caused configuration problems and/or
required extensive sharing of interrupts between devices, which can also
cause problems when devices are not serviced frequently enough.

Conventional PCI specifications [1] included optional support for
Message Signaled Interrupts (MSI), which are in-band messages
implemented as a posted write to an address specified by software with a
data value specified by software. The address and data values used are
host-bridge specific.

MSIs, unlike fixed interrupts, are in-band messages targeting an address
range in the host bridge. Since the messages are in-band, the receipt of
the message can be used to "push" any data associated with the
interrupt.

MSIs are, by definition, unshared. Each MSI message assigned to a device
is guaranteed to be a unique message in the system. PCI functions can
request between 1 and 32 MSI messages, in powers of two. The system
software may allocate fewer MSI messages to a function than the function
requested. The host bridge will have some limitation in the number of
unique MSI messages that can be allocated for devices.

The introduction of PCI-Express [3] extended PCI and MSI by requiring
the use of MSI for PCI functions. PCI-Express is a serial point-to-point
bus with no external wires. For legacy purposes PCI-Express includes
INTx (INTA-INTD) emulation messages for compatibility with existing
software. However, within any one PCI-Express domain, the four INTx
emulation messages are shared by any device using INTx emulation within
that hierarchy. Thus, depending on INTx emulation is generally a bad
idea due to the nature of its implementation.

A PCI-SIG MSI-X ECN [2] extended MSI by adding the ability for a
function to allocate more (up to 2048) messages, making the address and
data value used for each message independent of any other MSI-X message,
and allowing software to choose to use the same MSI address/data value
in multiple MSI-X "slots", as an architected method for dealing with the
case when the system allocates fewer MSI/X messages to the device than
the device requested.

Since the nature of MSI/X is an allocation protocol for limited system
resources, an allocation algorithm must be provided for a driver to
request and manage its own resources.

For existing systems where the host bridge does not implement MSI/X, a
set of get/set capability functions must be provided so that the driver
can query the capabilities of the system and allow it to decide which
interrupt type it should use.

Fixed interrupts in the current DDI implementation use hints about the
device type and certain properties for overrides when assigning a
software interrupt priority to an interrupt. The proposed framework
provides more control, when available, for interrupt priority settings.

Finally, this proposal is a merger of the ideas from two previous
proposals. One of the two previous proposals was a simple set of
wrappers that just addressed the addition of MSI support on all pci*
buses. The other proposal was a more generic proposal specifying a
single "intr-op" function that would replace the existing framework, but
did not provide easy to use 'wrapper' functions for the leaf drivers.
This proposal merges the two ideas and provides a generic set of ddi(9f)
wrapper functions for use by leaf drivers.

2. Scope

- Maintain Compatibility for existing DDI compliant drivers.

- Add support for new interrupt types MSI/X.

- Add get/set capability, resource management and priority management
  interfaces to the new framework, making new bus features available to
  drivers that need them.

- Make the new framework generic enough to support other new (and
  unknown) interrupt types, where possible.

- No support for multiple interrupt types from a single device/function
  is required, since no known bus type supports more than one interrupt
  type at a time.

3. Theory of Operation.

The new interrupt functions deal with two types of interrupts.

Fixed or discrete interrupts are issued by devices that have interrupt
pins and issue an electrical signal on that pin to indicate an interrupt
condition. The pins may be wired in platform specific ways to an
interrupt controller which passes the event to the CPU. This document
refers to these interrupts as "fixed interrupts". The configuration of
fixed interrupts may be limited on certain platforms due to interrupt
concentrator limitations, but these configuration issues are handled by
the platform itself, either in BIOS or in Plug-N-Play configuration
software. The Advanced Configuration and Power Interface (ACPI)
specification provides a new model for configuration and power
management for x86 platforms. Refer to [5] for more details.

For the purposes of supporting legacy software, PCI-Express INTx
emulation messages are treated as a form of fixed interrupts.

Fixed interrupts do not require allocation of resources as variable
interrupts do. Therefore, once configured by the platform firmware or by
configuration software (such as boot/config or plug-n-play
configuration), there is no separate allocation step required to use
fixed interrupts. However, the interrupt allocation function must be
called in order to associate the interrupt with an interrupt handle.

New ddi functions are available that allow a driver to determine the set
of interrupts supported by both the device and by the root complex and
any device in between the target device and the root complex. This
allows advanced drivers to determine the best interrupt option.

Variable interrupts are issued by devices as in-band messages. For MSI
and MSI-X (referred to collectively as MSI/X) these messages are posted
writes targeting a resource in the root complex (host bridge). The
address and data value used in the posted write message are assigned by
software, and the number of unique messages available may be limited by
the root complex hardware.
Thus, variable interrupts must be allocated and assigned before they can
be used. They go through the following states as they are set up by the
driver and the framework.

    variable ints --->  Uninitialized
    (attach/detach state)   |  ^
                            |  |
                            |  |
                            v  |
    fixed ints ------>  Allocated      (resources allocated)
    (attach/detach state)   |  ^
                            |  |
                            |  |
                            v  |
                        Assigned       (handler assigned)
                            |  ^
                            |  |
                            |  |
                            v  |
                        Enabled        (interrupt enabled)
                            |  ^
                            |  |
                            |  |
                            v  |
                        Masked         (Per-vector masking)
                                       (Device/host-bridge)

Since resources for variable interrupts are generally limited by the
root complex implementation, a set of resource management callback
interfaces is provided to allow a driver to be notified when more
interrupt resources are available and also allow the framework to notify
device drivers when the framework wants to attempt to reclaim some
interrupt resources (e.g., for hot plug).

Drivers that use MSI-X and want to use lots of interrupts, perhaps an
interrupt per active data stream, can benefit by implementing the
resource management callbacks. The framework may be willing to hold back
fewer resources in hot-plug capable systems from a driver that
implements both management interfaces, since the implementation provides
a method to tell the driver that it should release some of its
interrupts as soon as it can so they can be made available to other
devices. (The new devices may have been recently onlined or otherwise
made available to the "system".)

Resource management callbacks have the following states:

            (none)
               |
               |
               | (register)
               |
               v
         idle/enabled <------------------
               |     ^                  ^
               |     |                  |
     (trigger) |     |                  |
               |     |                  |
               v     |                  |
           callback ----------------    |
               |   DDI_INTR_M_ENABLE    |
               |                        |
               |                        |
               |DDI_INTR_M_DISABLE      |
               |                        |
               |                        |
               v                        |
           disabled ------------------->|
                         (enable)

The registration call places the callback in the *idle/enabled* state,
and the framework will invoke the callback when the internal trigger or
condition exists.
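
The callback state diagram above can be read as a small transition
function. The following is a minimal sketch of that reading, not the
framework's implementation: the cb_state_t type and the cb_register,
cb_trigger and cb_enable helpers are hypothetical names introduced here
for illustration, and only the two return values come from this
document (section 5.7).

```c
/* Callback return values, as defined in section 5.7 of this document. */
#define DDI_INTR_M_ENABLE   0   /* Re-enable the callback */
#define DDI_INTR_M_DISABLE  1   /* Disable the callback */

/* Hypothetical state names for this sketch; they are not DDI symbols. */
typedef enum {
    CB_NONE,            /* no callback registered */
    CB_IDLE_ENABLED,    /* armed, waiting for the trigger */
    CB_DISABLED         /* registered but disarmed */
} cb_state_t;

/* (register): a newly registered callback starts out armed. */
cb_state_t
cb_register(cb_state_t s)
{
    return (s == CB_NONE ? CB_IDLE_ENABLED : s);
}

/*
 * (trigger): the handler runs only when the callback is armed, and its
 * return value selects the next state.
 */
cb_state_t
cb_trigger(cb_state_t s, int handler_rv)
{
    if (s != CB_IDLE_ENABLED)
        return (s);
    return (handler_rv == DDI_INTR_M_ENABLE ?
        CB_IDLE_ENABLED : CB_DISABLED);
}

/* (enable): the driver re-arms a disarmed callback. */
cb_state_t
cb_enable(cb_state_t s)
{
    return (s == CB_DISABLED ? CB_IDLE_ENABLED : s);
}
```

In this model a trigger that arrives while the callback is disabled is
simply ignored; whether the real framework remembers such a trigger is
not stated here.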
The callback handler returns either DDI_INTR_M_ENABLE or
DDI_INTR_M_DISABLE. In the former case, the callback is immediately
re-armed. In the latter case, the callback remains disarmed until the
driver manually rearms it by calling the *enable* function.

The driver may also manually disarm the callback by calling the
*disable* function. The effect of the *disable* function (not shown on
the state diagram) is to change the state to *disabled*.

There is also a *cancel* function which un-registers the callback. The
*cancel* function is not shown on the state diagram.

4. DDI Legacy and Retained Interrupt Functions

4.1. Interrupt related functions available in both old and new framework

    condvar(9f) - cv* functions
    mutex(9f)   - mutex* functions
    rwlock(9f)  - rw* functions

4.2. Retained for compatibility with the existing interrupt framework

    ddi_get_iblock_cookie(9f)
    ddi_add_intr(9f)
    ddi_remove_intr(9f)
    ddi_dev_nintrs(9f)
    ddi_get_soft_iblock_cookie(9f)
    ddi_add_softintr(9f)
    ddi_remove_softintr(9f)
    ddi_trigger_softintr(9f)
    ddi_idevice_cookie(9s)
    ddi_iblock_cookie(9s)
    ddi_intr_hilevel(9f)
    interrupt handlers with a single argument.

5. Data Definitions.

5.1. Interrupt handler type.

The following are defined in ddi_intr.h and included by sunddi.h

    /*
     * Typedef for driver's interrupt handler
     */
    typedef int (ddi_intr_handler_t)(void *arg1, void *arg2);

5.2. Interrupt pri, given as 'pri' in this document.

The following are defined in ddi_intr.h and included by sunddi.h

    #define DDI_INTR_PRI_MIN    1
    #define DDI_INTR_PRI_MAX    12

pri is a small integer range from DDI_INTR_PRI_MIN to DDI_INTR_PRI_MAX
for most drivers and represents virtual priority. pri can be used
directly in lock initialization calls: mutex_init, rw_init, etc.

5.3. Soft Interrupt pri. (soft_pri)

The following are defined in ddi_intr.h and included by sunddi.h

    /* Used in calls to allocate soft interrupt priority.
     */
    #define DDI_INTR_SOFTPRI_DEFAULT    1

    /* soft pri must be a number within min/max values */
    #define DDI_INTR_SOFTPRI_MIN        1
    #define DDI_INTR_SOFTPRI_MAX        9

5.4. Interrupt identification (type and inum)

Specific interrupts are always specified by the combination of
interrupt-type and inum.

For legacy buses, inum is the same as it is now and refers to the nth
interrupt, typically as defined by the device's "interrupts" property.
For pci fixed interrupts, inum refers to the interrupt number. For
MSI/X, inum is the relative interrupt number, from 0 to 31 for MSI and
from 0 to 2047 for MSI-X. 0 means the first and 31/2047 means the last
relative interrupt.

In some cases, a range of interrupts can be specified by the combination
of interrupt-type, inum and count. In this case, the range refers to
interrupt number inum through interrupt number inum+count-1 for the
given interrupt-type.

5.5. Interrupt types.

The following are defined in ddi_intr.h and included by sunddi.h

    /* HW interrupt types */
    #define DDI_INTR_TYPE_FIXED     0x01
    #define DDI_INTR_TYPE_MSI       0x02
    #define DDI_INTR_TYPE_MSIX      0x04

5.6. Interrupt flags.

The following are defined in ddi_intr.h and included by sunddi.h

    /*
     * Interrupt flags specify certain capabilities for a given
     * interrupt (by type and inum).
     * RO/RW refer to use by ddi_intr_set_cap(9f)
     */
    #define DDI_INTR_FLAG_LEVEL     0x0001  /* (RW) level trigger */
    #define DDI_INTR_FLAG_EDGE      0x0002  /* (RW) edge triggered */
    #define DDI_INTR_FLAG_MASKABLE  0x0010  /* (RO) maskable */
    #define DDI_INTR_FLAG_PENDING   0x0020  /* (RO) int pending supported */
    #define DDI_INTR_FLAG_BLOCK     0x0100  /* (RO) requires block enable */
                                            /* and disable */

5.7. Interrupt Resource Management Callback types and definitions.

The following are defined in ddi_intr.h and included by sunddi.h

    /*
     * Typedef for interrupt callback functions.
     */
    typedef int (ddi_intr_cb_t)(void *cb_arg1, void *cb_arg2);
    typedef void *ddi_intr_cb_id_t;

    /*
     * Definitions for resource management callback returnval
     */
    #define DDI_INTR_M_ENABLE   0   /* Re-enable the callback */
    #define DDI_INTR_M_DISABLE  1   /* Disable the callback */

    /*
     * Typedef for resource management op
     */
    typedef enum {
        DDI_INTR_M_OP_AVAILABLE = 0,    /* Int resources are avail. */
        DDI_INTR_M_OP_NEEDED            /* Int resources are needed */
    } ddi_intr_management_op_t;

5.8. Interrupt handles.

The following are defined in ddi_intr.h and included by sunddi.h

    /*
     * Typedef for interrupt handles
     */
    typedef void *ddi_intr_handle_t;
    typedef void *ddi_softint_handle_t;

5.9. New DDI generic error codes.

The following new generic error codes are added to sunddi.h

    DDI_EAGAIN - Resources (currently) unavailable
    DDI_EINVAL - Invalid arguments

5.10. Behavior flag

The following are defined in ddi_intr.h and included by sunddi.h

    /*
     * Definitions for behavior flag used with ddi_intr_alloc().
     */
    #define DDI_INTR_ALLOC_STRICT   1   /* Strict allocation */

6. DDI Advanced Interrupt Functions for use by Device Drivers.

This section describes the new interrupt interfaces. Unless otherwise
specified, all functions return DDI_SUCCESS or DDI_FAILURE. Some
functions, such as ddi_intr_get_supported_types(),
ddi_intr_get_nintrs(), ddi_intr_alloc() and ddi_intr_get_navail(),
return DDI_INTR_NOTFOUND if the hardware device is found not to support
any interrupts.

Each function description includes a "Context" tag. The "Context" tag
describes the context (user, kernel or interrupt) that the function may
be called in.

6.1. ddi_intr_get_supported_types(9f)

    #include

    int ddi_intr_get_supported_types(dev_info_t *dip, int *typesp);

Return, as a bit mask in the integer pointed to by the *typesp*
argument, the hardware interrupt types supported by both the device and
the host. An interrupt type is usable by this device if it is returned
by this function.
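
A driver typically tests the returned bit mask against the interrupt
type values from section 5.5 and picks one type to use. The sketch below
shows one way to do that; pick_intr_type is a hypothetical helper, and
its preference order (MSI-X, then MSI, then fixed) is an assumption of
this example rather than a policy stated in this document.

```c
#include <stddef.h>

/* HW interrupt types, as defined in section 5.5 of this document. */
#define DDI_INTR_TYPE_FIXED 0x01
#define DDI_INTR_TYPE_MSI   0x02
#define DDI_INTR_TYPE_MSIX  0x04

/*
 * Hypothetical helper: choose a single interrupt type from the bit mask
 * that ddi_intr_get_supported_types() stores through typesp.
 * Returns 0 if none of the three known type bits is set.
 */
int
pick_intr_type(int types)
{
    /* Preference order is an assumption of this sketch. */
    static const int preference[] = {
        DDI_INTR_TYPE_MSIX, DDI_INTR_TYPE_MSI, DDI_INTR_TYPE_FIXED
    };
    size_t i;

    for (i = 0; i < sizeof (preference) / sizeof (preference[0]); i++) {
        if (types & preference[i])
            return (preference[i]);
    }
    return (0);
}
```

A driver that only implements fixed interrupts would instead test the
DDI_INTR_TYPE_FIXED bit alone.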
Note that soft interrupts are always usable and the sw interrupt types
are not returned by this function.

Context: User or kernel non-interrupt context

6.2. ddi_intr_get_nintrs(9f), ddi_intr_get_navail(9f)

    #include

    int ddi_intr_get_nintrs(dev_info_t *dip, int type, int *nintrsp);
    int ddi_intr_get_navail(dev_info_t *dip, int type, int *navailp);

ddi_intr_get_nintrs: Return, in the integer pointed to by the argument
*nintrsp*, the number of interrupts the device supports for the given
interrupt type.

ddi_intr_get_navail: Return, in the integer pointed to by the argument
*navailp*, the number of interrupts that are available to this
particular hardware device for the given interrupt type. Note that the
number of interrupts currently available is a snapshot in time, and may
not be the same by the time the value is used.

Context: User or kernel non-interrupt context

6.3. ddi_intr_register_management_cb(9f),
     ddi_intr_cancel_management_cb(9f),
     ddi_intr_enable_management_cb(9f),
     ddi_intr_disable_management_cb(9f)

    #include

    int ddi_intr_register_management_cb(dev_info_t *dip,
        ddi_intr_management_op_t op, int type, ddi_intr_cb_t cb,
        void *cb_arg1, void *cb_arg2, ddi_intr_cb_id_t *cbidp);
    int ddi_intr_cancel_management_cb(ddi_intr_cb_id_t cbid);
    int ddi_intr_enable_management_cb(ddi_intr_cb_id_t cbid);
    int ddi_intr_disable_management_cb(ddi_intr_cb_id_t cbid);

Interrupt Resource Management Functions.

These functions define a set of resource management callbacks that can
be used with certain interrupt types to allow the system framework and
device driver to cooperatively manage the interrupt resource pool. In
this version of the spec, these calls are useful for managing MSI-X
resources.

ddi_intr_register_management_cb can be used to register callback
handlers for the operation given by the *op* argument, for the interrupt
type given by the *type* argument.
The argument *cb* is the callback handler; *cb_arg1* and *cb_arg2* are
the first and second arguments that the callback handler will be invoked
with. Upon successful registration, a callback ID which can be used in
the remaining resource management function calls is returned in the
ddi_intr_cb_id_t pointed to by the *cbidp* argument.

The callback function has the following definition:

    typedef int (ddi_intr_cb_t)(void *cb_arg1, void *cb_arg2);

The callback function is called in kernel non-interrupt context, and may
sleep. All callbacks and any other framework functions waiting for the
same condition may be invoked simultaneously (i.e., a broadcast
condition) when the given trigger occurs. Thus, the callback handler
must not assume that the given trigger is still true within its code. It
can only assume that the condition was true when the callback handler
was scheduled to be invoked.

The callback function may return one of two values:

    DDI_INTR_M_ENABLE  - Rearm the callback.
    DDI_INTR_M_DISABLE - Do not rearm the callback.

DDI_INTR_M_ENABLE tells the framework to rearm the callback. Thus, if
the condition given by the *op* argument is true or later becomes true,
the callback handler for this *op* will be called again.

DDI_INTR_M_DISABLE tells the framework not to rearm the callback. The
callback will only be rearmed later if there is a subsequent call to
ddi_intr_enable_management_cb.

Callback "ops":

DDI_INTR_M_OP_AVAILABLE - Call when intr resources are avail.

Upon successful registration of this callback op, the given callback
handler will be scheduled to be called if there are resources available
for new interrupt allocations. Note that while the callback may be
invoked, the only guarantee of availability of new resources is at the
time the trigger occurred.

The callback handler for this op should return DDI_INTR_M_ENABLE if it
still wants notification of new interrupt resources.
The callback handler for this op should return DDI_INTR_M_DISABLE if the
callback handler does not need notification of new resources at this
time. Later, when the driver instance needs new interrupt resources, it
can call ddi_intr_enable_management_cb to "rearm" the callback handler.

DDI_INTR_M_OP_NEEDED - Call when the "system" needs intrs.

Upon successful registration of this callback op, the given callback
handler will be called when the system framework needs new interrupt
resources. By registering this callback op, the driver agrees to
co-manage interrupt resources with the system framework and, when
requested to "free" interrupt resources, agrees to do so, if possible,
at the time of the callback or shortly after the callback is invoked.

A driver's willingness to cooperatively manage interrupt resources
allows the system to give it more interrupt resources than it might
otherwise give to a single instance of a device driver, since the system
knows it will be able to reclaim some resources if it needs them for a
new device.

A successful callback that has released interrupt resources within the
callback should return DDI_INTR_M_ENABLE. If the callback schedules a
release of interrupt resources later, the callback handler should return
DDI_INTR_M_DISABLE and, after the scheduled release has occurred, call
ddi_intr_enable_management_cb to re-enable the callback. If it is not
possible to release any interrupt resources at this time, the callback
should return DDI_INTR_M_DISABLE and call ddi_intr_enable_management_cb
later, when it is possible to release interrupts.

ddi_intr_cancel_management_cb may be used to de-register (remove) the
callback handler given by the argument *cbid*. The *cbid* is provided by
the system framework when the callback is registered.

ddi_intr_enable_management_cb may be used to "rearm" a callback handler
that was previously "disarmed" by either a return value of
DDI_INTR_M_DISABLE from the callback or a previous call to
ddi_intr_disable_management_cb. The argument *cbid* specifies which
callback trigger should be enabled.

ddi_intr_disable_management_cb may be used to "disarm" a callback
handler. The argument *cbid* specifies the callback handler to be
disabled. Once disabled, the callback will not be invoked until it is
enabled via a call to ddi_intr_enable_management_cb.

Context: User or kernel non-interrupt context
Callback context: User or kernel non-interrupt context (may block)

6.4. ddi_intr_alloc(9f), ddi_intr_free(9f)

    #include

    int ddi_intr_alloc(dev_info_t *dip, ddi_intr_handle_t *h_array,
        int type, int inum, int count, int *actualp, int behavior);
    int ddi_intr_free(ddi_intr_handle_t h);

Allocate/free interrupts of a given type.

ddi_intr_alloc allocates interrupts of the interrupt type given by
*type* beginning at the interrupt number described by *type* and *inum*.
If ddi_intr_alloc allocates any interrupts, it returns the actual number
of interrupts allocated in the integer pointed to by the *actualp*
argument and returns that number of interrupt handles in the interrupt
handle array pointed to by the *h_array* argument. h_array must be
pre-allocated by the caller as a *count* sized array of
ddi_intr_handle_t's.

See section 5.4 for the meaning of the *inum* argument.

The value of the *behavior* flag controls the allocation algorithm. If
*behavior* is set to DDI_INTR_ALLOC_STRICT, the call fails if fewer than
*count* interrupts are currently available. If *behavior* is not set to
DDI_INTR_ALLOC_STRICT, the call allocates *count* or fewer interrupts.

The integer value returned in the int pointed to by the *actualp*
argument depends on the *behavior* flag.
In all cases, if any interrupts are allocated, then the number of
interrupt vectors allocated shall be returned in the int pointed to by
the *actualp* argument and ddi_intr_alloc returns DDI_SUCCESS. If
*behavior* is zero, and no interrupt vectors were allocated,
ddi_intr_alloc returns zero in the int pointed to by the *actualp*
argument. If *behavior* is non-zero and no interrupt vectors were
allocated, ddi_intr_alloc returns the number of vectors currently
available for the given interrupt type (as with ddi_intr_get_navail(9f))
in the int pointed to by the *actualp* argument.

The handle for each allocated interrupt vector, if any, is returned in
the array of handles given by the *h_array* argument.

ddi_intr_alloc return values:

    DDI_SUCCESS - Normal non-error return. No errors were encountered
                  and interrupts were allocated, but check the number of
                  interrupts allocated in *actualp.

    DDI_FAILURE - Unspecified error occurred. No interrupts were
                  allocated.

    DDI_EAGAIN  - Not enough interrupt resources available to satisfy
                  the request. No interrupts were allocated.

    DDI_EINVAL  - The request cannot be satisfied now or ever. No
                  interrupt resources were allocated. (Typically, a
                  logic error in the arguments, e.g., count is too large
                  with non-zero behavior.)

ddi_intr_free releases the system resources and interrupt vectors
associated with the ddi_intr_handle_t h, including any resources
associated with the handle h itself. Once freed, the handle h may not be
used in any further calls.

Context: User or kernel non-interrupt context

6.5. ddi_intr_get_cap(9f), ddi_intr_set_cap(9f)

    #include

    int ddi_intr_get_cap(ddi_intr_handle_t h, int *flagsp);
    int ddi_intr_set_cap(ddi_intr_handle_t h, int flags);

Get/set the interrupt capabilities by interrupt type.

ddi_intr_get_cap returns interrupt capability flags for the interrupt
specified by the *h* argument. The flags are returned in the integer
pointed to by the *flagsp* argument.

DDI_INTR_FLAG_LEVEL and DDI_INTR_FLAG_EDGE flags specify that for
discrete interrupts, the host supports level, edge or both types of
triggers.

DDI_INTR_FLAG_MASKABLE indicates that the interrupt can be masked either
by the device or by the host bridge.

DDI_INTR_FLAG_PENDING indicates that the interrupt supports an
'interrupt pending' bit.

DDI_INTR_FLAG_BLOCK indicates that all interrupts of the given type must
be block-enabled and are not individually maskable.

ddi_intr_set_cap allows the driver to specify DDI_INTR_FLAG_LEVEL or
DDI_INTR_FLAG_EDGE for a discrete interrupt given by the *h* argument,
where that capability has both DDI_INTR_FLAG_LEVEL and
DDI_INTR_FLAG_EDGE flags returned in the ddi_intr_get_cap call. The
flags are specified in the *flags* argument.

ddi_intr_set_cap may be called after interrupts are allocated and prior
to adding the interrupt handler. At all other times it returns failure.

Context: User or kernel non-interrupt context

6.6. ddi_intr_get_hilevel_pri(9f)

    #include

    int ddi_intr_get_hilevel_pri(void);

Returns the minimum pri level for a hi-level interrupt. The return value
can be compared to other pri values, such as those returned from
ddi_intr_get_pri(9f), to determine if a given interrupt priority is a
hi-level interrupt.

Context: User or kernel non-interrupt context

6.7. ddi_intr_get_pri(9f), ddi_intr_set_pri(9f)

    #include

    int ddi_intr_get_pri(ddi_intr_handle_t h, int *prip);
    int ddi_intr_set_pri(ddi_intr_handle_t h, int pri);

Get/set the interrupt priority level for the interrupt given by the
argument *h*.

ddi_intr_get_pri returns a small integer in the int pointed to by the
argument *prip*. The small integer is typically in the range
DDI_INTR_PRI_MIN .. DDI_INTR_PRI_MAX and represents the current software
priority setting for the interrupt given by the argument *h*.

ddi_intr_set_pri may only be called prior to adding the interrupt
handler (or when an interrupt handler is unassigned).
The priority returned from ddi_intr_get_pri (if not changed by a call to
ddi_intr_set_pri) may be used in calls to mutex_init, rw_init as the
iblock_cookie argument.

*pri* is a virtual relative priority described as a small integer.

Context: User or kernel non-interrupt context

6.8. ddi_intr_add_handler(9f), ddi_intr_dup_handler(9f),
     ddi_intr_remove_handler(9f)

    #include

    int ddi_intr_add_handler(ddi_intr_handle_t h,
        ddi_intr_handler_t inthandler, void *arg1, void *arg2);
    int ddi_intr_dup_handler(ddi_intr_handle_t from, int to_inum,
        ddi_intr_handle_t *to);
    int ddi_intr_remove_handler(ddi_intr_handle_t h);

Add/remove/duplicate interrupt handlers.

ddi_intr_add_handler adds an interrupt handler given by the *inthandler*
argument for the previously allocated interrupt given by the *h*
argument. The arguments *arg1* and *arg2* are passed as the first and
second arguments, respectively, to the interrupt handler.

The interrupt handler is defined as:

    typedef int (ddi_intr_handler_t func)(void *arg1, void *arg2);

func is the interrupt handler; arg1 and arg2 are the static data
arguments specified when the interrupt handler is added.

ddi_intr_dup_handler is for use with MSI-X, where an unallocated
interrupt vector is permitted to use the same MSI address/data pair, and
therefore uses the same interrupt handler and handler arguments as a
previously set interrupt vector. ddi_intr_dup_handler copies the entry
from the interrupt given by *from* to the entry for the MSI-X interrupt
given by the argument *to_inum* and, if successful, returns the new
interrupt handle for the new interrupt in *to.

ddi_intr_remove_handler removes the given handler association.

ddi_intr_add_handler must be called after ddi_intr_alloc, but before
ddi_intr_enable is called.

ddi_intr_dup_handler must be called after the interrupt handler has been
added for the interrupt given by *from*, and the interrupt given by
*to_inum* must NOT have been previously allocated, or have a handler
associated with it.

ddi_intr_remove_handler may be used to unassociate handlers when the
interrupt is disabled and to remove 'duped' interrupt handlers when they
have been disabled (see ddi_intr_disable and ddi_intr_block_disable). If
a handler has been duplicated via ddi_intr_dup_handler, all added and
duplicated instances of the handler must be removed via
ddi_intr_remove_handler in order for the handler to be completely
removed.

In all cases, the interrupt source is not automatically enabled, if the
interrupt type and implementation allow that. The interrupt must be
enabled (see ddi_intr_enable and ddi_intr_block_enable) before it can be
used.

Context: User or kernel non-interrupt context

6.9. ddi_intr_enable(9f), ddi_intr_disable(9f),
     ddi_intr_block_enable(9f), ddi_intr_block_disable(9f)

    #include

    int ddi_intr_enable(ddi_intr_handle_t h);
    int ddi_intr_disable(ddi_intr_handle_t h);
    int ddi_intr_block_enable(ddi_intr_handle_t *h_array, int count);
    int ddi_intr_block_disable(ddi_intr_handle_t *h_array, int count);

Enable/Disable assigned interrupts.

ddi_intr_enable enables the interrupt given by the *h* argument.

ddi_intr_block_enable enables a range of interrupts given by the *count*
and *h_array* arguments, where *count* must be at least 1 and *h_array*
is a pointer to a count-sized array of interrupt handles.

ddi_intr_disable disables the interrupt given by *h*.

ddi_intr_block_disable disables a range of interrupts given by the
*count* and *h_array* arguments, where *count* must be at least 1 and
*h_array* is a pointer to a count-sized array of interrupt handles.
Also, ddi_intr_block_disable must be called if ddi_intr_block_enable was
used to enable the interrupts.

ddi_intr_get_cap() returns the RO flag DDI_INTR_FLAG_BLOCK if the device
or host bridge supports the interrupt block enable/disable feature for
the given interrupt type. The block functions can be used only if the
device or host bridge supports the block enable/disable feature.
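
The choice between the per-interrupt and the block enable calls follows
directly from the capability flags. The sketch below shows that
decision; intr_enable_method_t and choose_enable_method are hypothetical
names used only for illustration, while the flag values are those given
in section 5.6.

```c
/* Capability flags, as defined in section 5.6 of this document. */
#define DDI_INTR_FLAG_MASKABLE  0x0010  /* (RO) maskable */
#define DDI_INTR_FLAG_BLOCK     0x0100  /* (RO) requires block enable */

/* Hypothetical result type for this sketch; not a DDI symbol. */
typedef enum {
    INTR_ENABLE_SINGLE, /* use ddi_intr_enable / ddi_intr_disable */
    INTR_ENABLE_BLOCK   /* use ddi_intr_block_enable / _block_disable */
} intr_enable_method_t;

/*
 * Hypothetical helper: given the flags that ddi_intr_get_cap() stores
 * through flagsp, decide which pair of enable/disable calls to use.
 */
intr_enable_method_t
choose_enable_method(int cap_flags)
{
    return ((cap_flags & DDI_INTR_FLAG_BLOCK) ?
        INTR_ENABLE_BLOCK : INTR_ENABLE_SINGLE);
}
```

Whichever method is chosen at enable time should be remembered by the
driver, because interrupts enabled with ddi_intr_block_enable must be
disabled with ddi_intr_block_disable.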
For example, ddi_intr_block_enable/ddi_intr_block_disable are useful for enabling/disabling MSI interrupts when the optional per-vector masking capability is not supported.

Once enabled by either of the enable calls, the interrupt may be taken and passed to the driver's interrupt service routine. Enabling an interrupt implies clearing any system or device mask bits associated with the interrupt.

These functions may only be called after interrupt handlers have been added via ddi_intr_add_handler or duplicated via ddi_intr_dup_handler.

Context: User or kernel non-interrupt context.

6.10. ddi_intr_set_mask(9f), ddi_intr_clr_mask(9f)

    #include

    int ddi_intr_set_mask(ddi_intr_handle_t h);
    int ddi_intr_clr_mask(ddi_intr_handle_t h);

Set/clear interrupt mask.

ddi_intr_set_mask sets the interrupt mask associated with the interrupt given by *h* if the device, host bridge, or system supports the masking operation. In-flight interrupts may still be taken and delivered to the driver, but if the operation is supported and the call returns successfully, no new interrupts issued by the device will result in the device driver's interrupt handler being called until the mask is cleared.

ddi_intr_clr_mask clears the interrupt mask associated with the interrupt given by *h* if the hardware supports the masking operation. The mask may not be cleared directly if the OS implementation has also temporarily masked the interrupt.

A call to ddi_intr_clr_mask must be preceded by a call to ddi_intr_set_mask. It is not necessary to call ddi_intr_clr_mask when adding and enabling the interrupt; the system framework will clear any device or system mask bits when the interrupt is initially enabled.

Context: Any

6.11. ddi_intr_get_pending(9f)

    #include

    int ddi_intr_get_pending(ddi_intr_handle_t h, int *pendingp);

Get interrupt pending bit.

In many cases, the device or host bridge supports the ability to read an interrupt pending bit. This function provides access to that capability.
ddi_intr_get_cap returns the RO flag DDI_INTR_FLAG_PENDING if the device supports interrupt pending bits for the given interrupt type.

ddi_intr_get_pending returns non-zero in the integer pointed to by the *pendingp* argument if an interrupt is pending for the interrupt defined by *h*, which must be an allocated interrupt. If the DDI_INTR_FLAG_PENDING capability is not supported, ddi_intr_get_pending returns DDI_FAILURE and stores zero in the integer pointed to by *pendingp*.

Context: Any

6.12. ddi_intr_add_softint(9f), ddi_intr_remove_softint(9f), ddi_intr_get_softint_pri(9f), ddi_intr_set_softint_pri(9f), ddi_intr_trigger_softint(9f)

    #include

    int ddi_intr_add_softint(dev_info_t *dip, ddi_softint_handle_t *h, int soft_pri, ddi_intr_handler_t handler, void *arg1);
    int ddi_intr_trigger_softint(ddi_softint_handle_t h, void *arg2);
    int ddi_intr_remove_softint(ddi_softint_handle_t h);
    int ddi_intr_get_softint_pri(ddi_softint_handle_t h, int *soft_prip);
    int ddi_intr_set_softint_pri(ddi_softint_handle_t h, int soft_pri);

Softint management functions.

ddi_intr_add_softint adds the soft interrupt handler given by the *handler* argument, with the handler argument *arg1*, using the soft interrupt priority given by the *soft_pri* argument, and returns the interrupt handle in *h. *soft_pri* is a relative priority value between DDI_INTR_SOFTPRI_MIN and DDI_INTR_SOFTPRI_MAX. Most drivers should use the default soft_pri value DDI_INTR_SOFTPRI_DEFAULT.

The soft interrupt handler, when triggered, has the following form:

    typedef int (ddi_intr_handler_t h)(void *arg1, void *arg2);

ddi_intr_trigger_softint triggers the soft interrupt specified by the handle *h* argument, and specifies that *arg2* shall be passed to the interrupt handler as its *arg2* argument.

ddi_intr_remove_softint removes the handler for the soft interrupt identified by the *h* handle argument. Once removed, the soft interrupt can no longer be triggered; however, any trigger calls already in progress may be delivered to the handler.
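The add/trigger/remove sequence described above can be sketched in user space as follows. This is only an illustration: every type, constant value, and function body in the "stand-ins" part is a hypothetical mock written so the sketch compiles outside the kernel, not the framework's implementation. The mock runs the handler synchronously inside the trigger call, whereas a real system would deliver it later at the soft interrupt priority. The names xx_state, xx_soft_handler and xx_softint_demo are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Hypothetical user-space stand-ins, NOT the real DDI definitions ---- */
#define DDI_SUCCESS              0
#define DDI_FAILURE              (-1)
#define DDI_INTR_CLAIMED         1      /* placeholder value */
#define DDI_INTR_SOFTPRI_MIN     1      /* placeholder value */
#define DDI_INTR_SOFTPRI_MAX     9      /* placeholder value */
#define DDI_INTR_SOFTPRI_DEFAULT 1      /* placeholder value */

typedef struct dev_info dev_info_t;     /* opaque; unused by the mock */
typedef int (ddi_intr_handler_t)(void *arg1, void *arg2);

typedef struct mock_softint {
    ddi_intr_handler_t *handler;        /* NULL when no handler is added */
    void *arg1;
    int soft_pri;
} *ddi_softint_handle_t;

static struct mock_softint mock_slot;   /* the mock supports one softint */

static int
ddi_intr_add_softint(dev_info_t *dip, ddi_softint_handle_t *h, int soft_pri,
    ddi_intr_handler_t handler, void *arg1)
{
    (void) dip;
    if (handler == NULL || soft_pri < DDI_INTR_SOFTPRI_MIN ||
        soft_pri > DDI_INTR_SOFTPRI_MAX)
        return (DDI_FAILURE);
    mock_slot.handler = handler;
    mock_slot.arg1 = arg1;
    mock_slot.soft_pri = soft_pri;
    *h = &mock_slot;
    return (DDI_SUCCESS);
}

static int
ddi_intr_trigger_softint(ddi_softint_handle_t h, void *arg2)
{
    if (h == NULL || h->handler == NULL)
        return (DDI_FAILURE);           /* removed softints cannot fire */
    (void) h->handler(h->arg1, arg2);   /* mock only: runs synchronously */
    return (DDI_SUCCESS);
}

static int
ddi_intr_remove_softint(ddi_softint_handle_t h)
{
    if (h == NULL || h->handler == NULL)
        return (DDI_FAILURE);
    h->handler = NULL;
    return (DDI_SUCCESS);
}

/* ---- Driver-side sketch ---- */
struct xx_state {                       /* hypothetical per-instance data */
    int pending_batch;                  /* passed to the handler as arg2 */
    int drained;                        /* work completed by the handler */
};

static int
xx_soft_handler(void *arg1, void *arg2)
{
    struct xx_state *sp = arg1;         /* arg1: fixed at add time */
    int *batch = arg2;                  /* arg2: supplied at trigger time */

    sp->drained += *batch;
    return (DDI_INTR_CLAIMED);
}

int
xx_softint_demo(struct xx_state *sp, int batch)
{
    ddi_softint_handle_t sh;

    if (ddi_intr_add_softint(NULL, &sh, DDI_INTR_SOFTPRI_DEFAULT,
        xx_soft_handler, sp) != DDI_SUCCESS)
        return (DDI_FAILURE);

    /* arg2 points into instance data so it outlives the trigger call */
    sp->pending_batch = batch;
    if (ddi_intr_trigger_softint(sh, &sp->pending_batch) != DDI_SUCCESS) {
        (void) ddi_intr_remove_softint(sh);
        return (DDI_FAILURE);
    }
    return (ddi_intr_remove_softint(sh));
}
```

Note that arg2 is taken from instance data rather than from a local variable; with deferred delivery, a pointer to a stack variable could be stale by the time the handler runs.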
ddi_intr_get_softint_pri returns the soft interrupt priority (a small integer value) in the int pointed to by the argument *soft_prip* for the interrupt associated with the handle given by the *h* argument.

ddi_intr_set_softint_pri may be used to change the relative soft priority associated with the soft interrupt defined by the *h* argument to the priority given by the *soft_pri* argument, which must be a valid soft priority value as defined above.

Context: User or kernel non-interrupt context
Context: ddi_intr_trigger_softint: Any

7. References

[1] PCI Local Bus Specification, Revision 2.2, Dec 18, 1998
    Published by the PCI Local Bus Special Interest Group.
    Officially available at http://www.pcisig.com/
    Locally available at http://noho.sfbay/~dmk/pci/pci22.pdf
    Note: Section 6.8 describes the MSI capability and operation.

[2] PCI SIG Engineering Change Notice - MSI-X, June 10, 2003
    MSI-X addition and MSI Per Vector Masking addition to PCI 2.3 and PCI 3.0.
    Published by the PCI Special Interest Group.
    Officially available at http://www.pcisig.com/
    Locally available at http://noho.sfbay/~dmk/pci/msi-x_ecn.pdf
    Note: Describes changes to PCI 2.3 to add MSI-X and add PVM to MSI.

[3] PCI Express Base Specification, Revision 1.0a, April 15, 2003
    Published by the PCI Special Interest Group.
    Officially available at http://www.pcisig.com/
    Locally available at http://noho.sfbay/~dmk/pci_express_base10a.pdf

[4] PCI Express Engineering Change Notice - MSI-X, Oct. 31, 2003
    Unpublished PCI-Express ECN, permits functions to use MSI or MSI-X.
    Officially available in PCI-Express SW-WG member area at pcisig.com
    Locally available at: http://noho.sfbay/~dmk/pci-express/PciEx_draft_ECN_MSI-X_031031.pdf
    Note: ECN is member approved, probably awaiting publication by the SIG.

[5] ACPI: Framework and Config - PSARC/1998/300
    http://sac.sfbay.sun.com/arc/PSARC/1998/300

A. Bus Specific Binding Information

A.1. Application of ddi Interrupt Functions to SBus.
SBus supports discrete level interrupt sources. An SBus device may choose between 7 SBus interrupt levels. Each level is a discrete wire that may be shared with other SBus devices using the same SBus interrupt level.

In some cases interrupts may be shared between devices. Thus, an interrupt handler must poll its device to confirm that the device is actually asserting the interrupt.

As with all discrete interrupt sources, physical SBus interrupt levels may be routed independently of the device hierarchy. For example, they may be routed to an interrupt concentrator in another branch of the device tree.

Interrupt routing information is platform specific and may be represented using a combination of "interrupt-parent" and "interrupt-map" properties in the device tree, or may be represented in a host bridge specific manner.

inum is as currently specified and specifies the relative interrupt number in the "interrupts" property. Each entry in the "interrupts" property contains the SBus interrupt level used by that device. Note that in some systems, "interrupts" may contain a system dependent number that does not directly represent interrupt priority.

Some host bridges may support per-vector masking of SBus sources, but a per-vector mask operation on a shared vector will block all sources using that interrupt vector, so the implementation might not honor calls to block an interrupt source unless that interrupt source is unique and not shared.

A.2. Application of ddi Interrupt Functions to ISA-like Buses.

ISA and ISA-like devices support a single discrete interrupt.

As with all discrete interrupt sources, physical INTx interrupt lines may be routed independently of the device hierarchy. For example, they may be routed to an interrupt concentrator in another branch of the device tree.
Interrupt routing information is platform specific and may be represented using a combination of "interrupt-parent" and "interrupt-map" properties in the device tree, or may be represented in a host bridge specific manner.

The interrupt source may be level or edge or programmable. Most devices, given a choice, should use level sources, with the exception of periodic device interrupts. Note that there is no guarantee that a periodic edge interrupt will always be serviced.

In systems with x86 BIOS, the BIOS may provide the interrupt assignment and routing tables to the client program. For x86 systems equipped with I/O APICs, the interrupt assignment and routing tables could be retrieved through ACPI or MPSpec tables.

Some platforms may support per-vector masking for non-shared ISA interrupts.

A.3. Application of ddi Interrupt Functions to PCI* Buses.

PCI* buses include Conventional PCI, PCI-X, PCI-Express and other flavors of the PCI specification.

All PCI* buses support INTx interrupts, although PCI-Express does not have discrete interrupt wires and instead supports INTx via INTx emulation messages.

INTx can be level or edge triggered. Most devices must use level triggers, particularly if the INTx line is shared, except for some periodic interrupt sources. There is no guarantee that an edge triggered interrupt will always be serviced. Plug-in cards must use level triggers with INTx.

As with all discrete interrupt sources, physical INTx interrupt lines may be routed independently of the device hierarchy. For example, they may be routed to an interrupt concentrator in another branch of the device tree.

Interrupt routing information is platform specific and may be represented using a combination of "interrupt-parent" and "interrupt-map" properties in the device tree, or may be represented in a host bridge specific manner.
In systems with x86 BIOS, the BIOS may provide the interrupt assignment and routing tables to the client program.

On all PCI* buses, use of INTx [emulation] versus MSI versus MSI-X is mutually exclusive. The driver/device cannot use two types of interrupts simultaneously.

INTx emulation is typically treated as a form of discrete interrupts. Since the INTx emulation messages do not use discrete wires, they always target the root complex (host bridge); however, the host bridge may present a representation of an interrupt concentrator to the client program as a root complex device for compatibility with existing systems.

Typically, INTx is a shared resource. On some systems, certain devices may have unique interrupts for a given slot, but if the device contains a pci-pci bridge, the interrupts under that bridge will be combined into a single set of INTx signals.

On PCI-Express, INTx emulation is always a shared resource, and the 4 virtual INTx "signals" are always shared by all devices using INTx emulation compared to the same interrupt source. INTx emulation consists of a pair of INTx_ENABLE and INTx_DISABLE messages, so devices can emulate level or edge semantics. With edge semantics, there is no guarantee that an edge triggered interrupt will always be serviced.

All flavors of PCI also support Message Signaled Interrupts (MSI) or MSI-X, an extended form of MSI.

MSI/X is always effectively edge triggered, since the interrupt is signaled with a posted write command by the device targeting a pre-allocated area of "memory" on the host bridge. However, some host bridges have the ability to "latch" the acceptance of an MSI/X message and can effectively treat it as a level signaled interrupt.
Devices are permitted to send more than one MSI/X message prior to an outstanding interrupt being serviced; however, the PCI specifications state that there is no guarantee that additional MSI/X messages will be serviced until the first of a set of MSI/X messages targeting the same address/data values has been serviced. Therefore, there is only a guarantee of servicing one MSI/X message per set of MSI/X messages. Other than certain devices that send periodic interrupts, devices should, in general, send only one MSI/X message per interrupt source until that interrupt has been serviced.

With MSI/X, vectors must be allocated by the implementation and assigned to the device.

Some flavors of MSI support individual per-vector mask and status bits (PVM). MSI-X always supports PVM. Additionally, the host bridge may support a form of PVM even if PVM bits are not present in the MSI capability block. Note that PVM bits are required in MSI-X, so a given source (inum) can always be blocked with MSI-X.

Note that the number of unique MSI/X vectors is limited by the total number of unique vectors supported by any given host bridge implementation, so MSI/X vectors should be treated as a scarce resource by drivers. The implementation should always keep some MSI/X vectors available for hot-plug events, if that bus segment supports slots with the hot plug capability. Even with that reservation, it will still be possible to run out of MSI/X vectors on some systems. Thus, in some cases, critical drivers may choose to fall back to INTx emulation mode, if they can survive with the added latency and reduced number of interrupts. Other devices may choose to fail (with an appropriate error message) or wait until sufficient MSI/X resources are available before allowing use of the device.

Default interrupt priority is assigned based on the class code of the device.
For fixed PCI interrupts (INTx), the "interrupt-priorities" property may be used to override the default class code based interrupt priority assignments. Interrupt priority for MSI/X may be overridden by the device driver but, in general, drivers should simply accept the default interrupt levels assigned by the implementation.

Native PCI devices should avoid using INTx or INTx emulation when MSI/X is available in the device and supported by the host bridge implementation. PCI drivers may need to fall back to using INTx or INTx emulation if there are no MSI/X vectors available.

B. Example Pseudo-Code.

B.1. Simple MSI Example.

    /* Somewhere in the driver's attach function ... */

    int
    xx_msi_setup(dev_info_t *dip)
    {
        int type, count, actual, flags, inum, ret;
        ddi_intr_handle_t *htable, *h;
        size_t n;

        (void) ddi_intr_get_supported_types(dip, &type);
        if ((type & DDI_INTR_TYPE_MSI) == 0) {
            /* The host or the device does not support MSI */
            /* fall back to INTx - not shown */;
        }

        type = DDI_INTR_TYPE_MSI;   /* for readability only */
        count = ddi_intr_get_nintrs(dip, type);
        if (count == 0) {
            /* The device does not support MSI */
            /* fall back to INTx - not shown */;
        }

        flags = inum = 0;   /* for readability only */

        /* Allocate an array of interrupt handles */
        n = count * sizeof (ddi_intr_handle_t);
        htable = kmem_alloc(n, KM_SLEEP);
        /* xxx - Store htable and count in instance data (not shown) */

        /* Allocate 'count' interrupts */
        type = DDI_INTR_TYPE_MSI;
        ret = ddi_intr_alloc(dip, htable, type, inum, count, &actual, 0);
        if ((ret != DDI_SUCCESS) || (actual == 0)) {
            /* error or wait case - not shown */;
        }
        if (actual < count) {
            /* We got fewer vectors than we wanted */
            /* device dependent handling - not shown */;
        }

        /* Add the interrupt handlers; handler, arg1, arg2 are driver-defined */
        for (inum = 0, h = htable; inum < actual; ++inum, ++h) {
            /* Priority/hi-level management goes here */

            /* add handler for inum; *h is the handle for this vector */
            if (ddi_intr_add_handler(*h, handler, arg1, arg2) != DDI_SUCCESS) {
                /* error case -- clean up and fail the attach? */
            }
        }

        /* Enable only the interrupts that were actually allocated */
        (void) ddi_intr_block_enable(htable, actual);

        /* DONE */
        return (0);
    }
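
B.1 does not show teardown. The following is a minimal user-space sketch of the reverse sequence implied by sections 6.8 and 6.9: block-disable the vectors that were block-enabled, remove each handler once its interrupt is disabled, then free each interrupt. The mock handle structure and the bodies of the three ddi_intr_* functions are hypothetical stand-ins so the sketch compiles outside the kernel; the single-handle signature used for ddi_intr_free is an assumption, since section 6.4 is not reproduced here. xx_msi_teardown is an invented name.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Hypothetical user-space stand-ins, NOT the real DDI definitions ---- */
#define DDI_SUCCESS 0
#define DDI_FAILURE (-1)

typedef struct mock_intr {
    int allocated;      /* set by a (not shown) mock ddi_intr_alloc */
    int has_handler;    /* set by a (not shown) mock ddi_intr_add_handler */
    int enabled;        /* set by a (not shown) mock ddi_intr_block_enable */
} *ddi_intr_handle_t;

static int
ddi_intr_block_disable(ddi_intr_handle_t *h_array, int count)
{
    int i;

    if (h_array == NULL || count < 1)   /* count must be at least 1 */
        return (DDI_FAILURE);
    for (i = 0; i < count; i++)
        h_array[i]->enabled = 0;
    return (DDI_SUCCESS);
}

static int
ddi_intr_remove_handler(ddi_intr_handle_t h)
{
    /* a handler may only be removed while its interrupt is disabled */
    if (h->enabled || !h->has_handler)
        return (DDI_FAILURE);
    h->has_handler = 0;
    return (DDI_SUCCESS);
}

static int
ddi_intr_free(ddi_intr_handle_t h)      /* assumed signature */
{
    if (h->has_handler || !h->allocated)
        return (DDI_FAILURE);
    h->allocated = 0;
    return (DDI_SUCCESS);
}

/* ---- Driver-side sketch: undo xx_msi_setup for 'actual' vectors ---- */
int
xx_msi_teardown(ddi_intr_handle_t *htable, int actual)
{
    int i;

    /* 1. Block-disable, matching the block-enable used at setup */
    if (ddi_intr_block_disable(htable, actual) != DDI_SUCCESS)
        return (DDI_FAILURE);

    /* 2. Remove each handler, then 3. free each interrupt */
    for (i = 0; i < actual; i++) {
        if (ddi_intr_remove_handler(htable[i]) != DDI_SUCCESS)
            return (DDI_FAILURE);
        if (ddi_intr_free(htable[i]) != DDI_SUCCESS)
            return (DDI_FAILURE);
    }
    return (DDI_SUCCESS);
}
```

In a real driver the handle array itself would be released afterwards with kmem_free(htable, n), using the same size that was passed to kmem_alloc.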