Template Version: @(#)onepager.txt 1.35 07/11/07 SMI

1. Introduction
    1.1. Project/Component Working Name:
         CPU idle notification interface

    1.2. Name of Document Author/Supplier:
         Author: Gerry Liu

    1.3. Date of This Document:
         Feb 21, 2009

4. Technical Description:
    4.1. Problem
         A CPU idle notification mechanism is needed to signal components
         that are interested in CPU idle state change events when a CPU
         enters or exits the idle state. This mechanism could be used by
         the following components:
            A) Memory power saving driver
            B) Lazy TLB flush on x86 systems
            C) CPU power management framework

    4.2. Proposal
         We propose to add the following data structures and interfaces
         to the OpenSolaris kernel.

    4.2.1. CPU idle notification data structure
            typedef void *cpu_idle_callback_handle_t;
            typedef void *cpu_idle_callback_context_t;
            typedef void *cpu_idle_prop_handle_t;

            typedef union cpu_idle_prop_value {
                intptr_t    cipv_intptr;
                uint32_t    cipv_uint32;
                uint64_t    cipv_uint64;
                hrtime_t    cipv_hrtime;
            } cpu_idle_prop_value_t;

    4.2.2. Prototype of entering idle state notification callback
            typedef void (*cpu_idle_enter_cbfn)(void *arg,
                cpu_idle_callback_context_t *ctxp);

        The entering idle state notification callback must obey all
        constraints that apply to the idle thread, because it is called
        in idle thread context. It may be called with interrupts either
        enabled or disabled.
        arg is the parameter passed in when registering the callback.
        ctxp is an opaque parameter used to retrieve properties.

    4.2.3. Prototype of exiting idle state notification callback
            typedef void (*cpu_idle_exit_cbfn)(void *arg,
                cpu_idle_callback_context_t *ctxp, int flags);

        The exiting idle state notification callback is called in either
        idle thread context or interrupt context. The flags argument
        distinguishes the calling context.
        arg is the parameter passed in when registering the callback.
        ctxp is an opaque parameter used to retrieve properties.
        Flags for the exiting idle state notification callback:
            CPU_IDLE_CB_FLAG_INTR: called in interrupt context
            CPU_IDLE_CB_FLAG_IDLE: called in idle thread context

    4.2.4. CPU idle notification callback data structures
            typedef struct cpu_idle_callback {
                int                     version;
                cpu_idle_enter_cbfn     idle_enter;
                cpu_idle_exit_cbfn      idle_exit;
            } cpu_idle_callback_t;

        At least one of idle_enter and idle_exit must be non-NULL.
        The version field is used to match the cpu_idle_callback_t
        structure against the CPU idle notification framework.

    4.2.5. Register CPU idle notification callback
            int cpu_idle_register_callback(uint_t priority,
                cpu_idle_callback_t *callbackp, void *arg,
                cpu_idle_callback_handle_t *hdlp);

        This interface registers a callback to be called when the CPU
        idle state changes. All registered callbacks are called in
        priority order, from high to low, when a CPU enters the idle
        state, and in the reverse order when the CPU exits the idle
        state. If the CPU is predicted to sleep for only a short time or
        to be under heavy load, the framework may skip calling the
        registered callbacks on idle entry/exit to avoid overhead and
        reduce the performance penalty.
        This interface must not be called from callback handlers.
        priority determines the calling order of registered callbacks.
        arg is passed back to the registered callback; its meaning is
        determined by the callback.
        hdlp is used to store the created handle on success.
        It returns zero on success and an error number on failure.

    4.2.6. Deregister CPU idle notification callback
            int cpu_idle_unregister_callback(
                cpu_idle_callback_handle_t hdlp);

        This interface deregisters a registered callback. It must not be
        called from a callback handler.
        It returns zero on success and an error number on failure.

    4.2.7. Signal entering idle state event
            void cpu_idle_enter(int state);

        This interface notifies the CPU idle notification subsystem that
        a specific CPU is entering the idle state.
        state is the idle state the CPU is going to enter.

    4.2.8. Signal exiting idle state event
            void cpu_idle_exit(int flags);

        This interface notifies the CPU idle notification subsystem that
        a specific CPU is exiting from the idle state.
        Flags for exiting idle state:
            CPU_IDLE_CB_FLAG_INTR: called in interrupt context
            CPU_IDLE_CB_FLAG_IDLE: called in idle thread context

    4.2.9. Get idle callback context for calling CPU
            cpu_idle_callback_context_t cpu_idle_get_context(void);

    4.2.10. Create property handle
            int cpu_idle_prop_create_handle(const char *propname,
                cpu_idle_prop_handle_t *prophdl);

        This function creates a handle for the named property if the
        property is supported.
        It returns zero on success or an error code on failure.

    4.2.11. Set property value
            void cpu_idle_property_set(cpu_idle_prop_handle_t prophdl,
                cpu_idle_callback_context_t ctx,
                cpu_idle_prop_value_t val);

        This function is used by a property provider to set the value of
        a property.

    4.2.12. Get property value
            int cpu_idle_property_get(cpu_idle_prop_handle_t prophdl,
                cpu_idle_callback_context_t ctx,
                cpu_idle_prop_value_t *valp);

        This function tries to get the value of a property. It may fail
        if the property does not support fast access.
        It returns zero on success and an error number on failure.

            uint32_t cpu_idle_prop_get_uint32(
                cpu_idle_prop_handle_t prophdl,
                cpu_idle_callback_context_t ctx);
            uint64_t cpu_idle_prop_get_uint64(
                cpu_idle_prop_handle_t prophdl,
                cpu_idle_callback_context_t ctx);
            intptr_t cpu_idle_prop_get_intptr(
                cpu_idle_prop_handle_t prophdl,
                cpu_idle_callback_context_t ctx);
            hrtime_t cpu_idle_prop_get_hrtime(
                cpu_idle_prop_handle_t prophdl,
                cpu_idle_callback_context_t ctx);

        This suite of functions retrieves the value of a property that
        supports fast access.

    4.2.13. Supported properties
        Unless specially noted, all properties are per-logical-CPU
        statistics. The framework supports the following properties by
        default.
     _____________________________________________________________________
    | Property Name   | Data Type | Fast   | Description                  |
    |                 |           | Access |                              |
    |_________________|___________|________|______________________________|
    | idle-state      | int       | Yes    | CPU idle state to enter      |
    | idle-enter-ts   | hrtime_t  | Yes    | Timestamp for entering idle  |
    | idle-exit-ts    | hrtime_t  | Yes    | Timestamp for exiting idle   |
    | last-idle-time  | hrtime_t  | Yes    | Last idle period             |
    | last-busy-time  | hrtime_t  | Yes    | Last busy period             |
    | interrupt-count | uint64_t  | Yes    | Interrupt count in last busy |
    |                 |           |        | period                       |
    |_________________|___________|________|______________________________|

        Other property providers may add new properties to the framework
        in the future. For example, a CPU power management driver may
        support "idle-latency", "max-idle-time", etc.

6. Resources and Schedule:
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
               ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open
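
Illustrative sketch (not part of the proposal):
    The calling order described in 4.2.5 (high to low priority on idle
    entry, reverse order on idle exit) can be sketched with a small
    user-space mock. This is only a sketch under stated assumptions, not
    the kernel implementation. The callback types are copied from 4.2.1
    through 4.2.4. Everything prefixed mock_ or demo_, the table size,
    the numeric value of CPU_IDLE_CB_FLAG_IDLE, the version value 1, and
    the use of the callback pointer as the handle are hypothetical and
    chosen only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Types as given in 4.2.1 through 4.2.4 of the proposal. */
typedef void *cpu_idle_callback_handle_t;
typedef void *cpu_idle_callback_context_t;
typedef void (*cpu_idle_enter_cbfn)(void *arg,
    cpu_idle_callback_context_t *ctxp);
typedef void (*cpu_idle_exit_cbfn)(void *arg,
    cpu_idle_callback_context_t *ctxp, int flags);
typedef struct cpu_idle_callback {
    int                 version;
    cpu_idle_enter_cbfn idle_enter;
    cpu_idle_exit_cbfn  idle_exit;
} cpu_idle_callback_t;

/* ASSUMPTION: the proposal names this flag but gives no value. */
#define CPU_IDLE_CB_FLAG_IDLE   0x2

/* Hypothetical mock registry, kept sorted by descending priority. */
#define MOCK_MAX_CB 8
typedef struct mock_entry {
    unsigned int        priority;
    cpu_idle_callback_t *cb;
    void                *arg;
} mock_entry_t;
static mock_entry_t mock_table[MOCK_MAX_CB];
static int mock_count;

/* Hypothetical stand-in for cpu_idle_register_callback(). */
int
mock_register(unsigned int priority, cpu_idle_callback_t *cb, void *arg,
    cpu_idle_callback_handle_t *hdlp)
{
    int i;

    /* 4.2.4: at least one of idle_enter and idle_exit is non-NULL. */
    if (cb == NULL || hdlp == NULL ||
        (cb->idle_enter == NULL && cb->idle_exit == NULL))
        return (-1);
    if (mock_count == MOCK_MAX_CB)
        return (-1);
    /* Shift lower-priority entries down to make room. */
    for (i = mock_count; i > 0 && mock_table[i - 1].priority < priority; i--)
        mock_table[i] = mock_table[i - 1];
    mock_table[i].priority = priority;
    mock_table[i].cb = cb;
    mock_table[i].arg = arg;
    mock_count++;
    *hdlp = cb;     /* ASSUMPTION: handle is just the callback pointer */
    return (0);
}

/* Idle entry: call callbacks from high to low priority. */
void
mock_idle_enter(cpu_idle_callback_context_t *ctxp)
{
    for (int i = 0; i < mock_count; i++)
        if (mock_table[i].cb->idle_enter != NULL)
            mock_table[i].cb->idle_enter(mock_table[i].arg, ctxp);
}

/* Idle exit: call callbacks in the reverse order. */
void
mock_idle_exit(cpu_idle_callback_context_t *ctxp, int flags)
{
    for (int i = mock_count - 1; i >= 0; i--)
        if (mock_table[i].cb->idle_exit != NULL)
            mock_table[i].cb->idle_exit(mock_table[i].arg, ctxp, flags);
}

/* Demo callbacks record one character per call into a trace string. */
static char trace[32];

static void
trace_add(char c)
{
    size_t n = strlen(trace);

    assert(n + 1 < sizeof (trace));
    trace[n] = c;
    trace[n + 1] = '\0';
}

void
demo_enter(void *arg, cpu_idle_callback_context_t *ctxp)
{
    (void) ctxp;
    trace_add(*(const char *)arg);              /* upper case on enter */
}

void
demo_exit(void *arg, cpu_idle_callback_context_t *ctxp, int flags)
{
    (void) ctxp;
    (void) flags;
    trace_add((char)tolower((unsigned char)*(const char *)arg));
}

/* Registers 'A' at priority 10 and 'B' at priority 20, then idles once. */
const char *
mock_demo(void)
{
    static char a = 'A', b = 'B';
    static cpu_idle_callback_t cb = { 1, demo_enter, demo_exit };
    cpu_idle_callback_handle_t ha, hb;
    cpu_idle_callback_context_t ctx = NULL;

    mock_count = 0;
    trace[0] = '\0';
    if (mock_register(10, &cb, &a, &ha) != 0 ||
        mock_register(20, &cb, &b, &hb) != 0)
        return (NULL);
    mock_idle_enter(&ctx);
    mock_idle_exit(&ctx, CPU_IDLE_CB_FLAG_IDLE);
    return (trace);
}
```

    Calling mock_demo() returns "BAab": the priority-20 callback (B) runs
    before the priority-10 callback (A) on idle entry, and the two run in
    the opposite order on idle exit.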