1. Introduction
    1.1. Project/Component Working Name:
         LatencyTOP for OpenSolaris
    1.2. Name of Document Author/Supplier:
         krishnendu.sadhukhan@sun.com
    1.3. Date of This Document:
         05/20/09
        1.3.1. Date this project was conceived:
               05/20/09
    1.4. Name of Major Document Customer(s)/Consumer(s):
        1.4.1. The PAC or CPT you expect to review your project:
               Solaris PAC
        1.4.2. The ARC(s) you expect to review your project:
               PSARC
        1.4.3. The Director/VP who is "Sponsoring" this project:
               greg.lavendar@sun.com
        1.4.4. The name of your business unit:
               Software
    1.5. Email Aliases:
        1.5.1. Responsible Manager: darrin.johnson@sun.com
        1.5.2. Responsible Engineer: krishnendu.sadhukhan@sun.com
        1.5.3. Marketing Manager: mike.mulkey@sun.com
        1.5.4. Interest List: latencytop-dev@opensolaris.org

2. Project Summary
    2.1. Project Description:
         LatencyTOP is an observability tool that can be used to
         identify latencies in application and system software.
         Latency occurs when a process or thread cannot run and goes
         to sleep because some resource is unavailable. Developers can
         use this tool to identify latency holes in their
         applications; system programmers can equally use it to
         identify latencies in system processes.

         This tool was initially developed for Linux. Intel and Sun
         are collaborating on developing it for OpenSolaris. The first
         version of the tool is already available for use on the
         OpenSolaris website.
    2.2. Risks and Assumptions:
         LatencyTOP uses probes from the proc, sched and lockstat
         DTrace providers, so the OpenSolaris version on which
         LatencyTOP is run must have these probes available.

3. Business Summary
    3.1. Problem Area:
         Latency affects the performance of applications and systems.
         This tool can be used to locate latencies; removing them
         improves the performance of applications and systems.
    3.2. Market/Requester:
    3.3. Business Justification:
         LatencyTOP is being jointly developed by Sun and Intel
         through the OpenSolaris community.
    3.4. Competitive Analysis:
         LatencyTOP was originally developed for Linux. Making this
         tool available in OpenSolaris will grow the user base of
         OpenSolaris and will provide OpenSolaris with a competitive
         advantage over Linux.
    3.5. Opportunity Window/Exposure:
         The first version of the tool is already available on
         OpenSolaris for download and use.
    3.6. How will you know when you are done?:
         Version 1.0 of the tool is already complete.

4. Technical Description:
    4.1. Details:
         The original LatencyTOP project is hosted on
         http://www.latencytop.org
         The OpenSolaris port of this project is hosted on
         http://opensolaris.org/os/project/latencytop
    4.2. Bug/RFE Number(s):
         6825817 Integrate latencyTOP into OpenSolaris
         6847419 man page for LatencyTOP
    4.3. In Scope:
         LatencyTOP traces two types of latencies:
           i)  an LWP goes to the sleep state because it is waiting
               for some resource to become available before it can
               run again
           ii) an LWP spins in order to acquire a synchronization
               object
    4.4. Out of Scope:
         LatencyTOP is only an observability tool; it does not allow
         the user to accomplish anything other than getting latency
         statistics.
         LatencyTOP does not detect busy loops inside user
         applications. Nor does it detect delays that are not caused
         by waiting, e.g. a process not running because a
         higher-priority process consumes a lot of CPU time.
    4.5. Interfaces:
         Minor binding only.

         INTERFACES                              COMMITMENT LEVEL
         ==========                              ================
         /usr/bin/i86/latencytop                 committed
         /usr/bin/amd64/latencytop               committed
         /bin/i86/latencytop                     committed
         /bin/amd64/latencytop                   committed
         /usr/bin/latencytop (hard link)         committed
         /bin/latencytop (hard link)             committed

         LatencyTOP has a UI based on libcurses.
    4.6. Doc Impact:
         A new section 1M man page will be required (see Appendix).
         The OpenSolaris system administration documentation needs to
         include this tool.
    4.7. Admin/Config Impact:
         None
    4.8. HA Impact:
         None
    4.9. I18N/L10N Impact:
         None
    4.10. Packaging & Delivery:
         A new package called SUNWlatencytop will be introduced.
    4.11. Security Impact:
         The user must have DTrace privileges to run LatencyTOP.
    4.12. Dependencies:
         LatencyTOP uses the Solaris DTrace APIs, specifically the
         following DTrace providers: sched, proc and lockstat.

5. Reference Documents:
         http://monaco.sfbay/detail.jsf?cr=6825817
         http://opensolaris.org/os/project/latencytop

6. Resources and Schedule:
    6.1. Projected Availability:
         LatencyTOP is currently available through OpenSolaris.
    6.2. Cost of Effort:
         This project is being jointly developed by Sun and Intel
         through OpenSolaris. One Sun engineer is required to sponsor
         the effort and facilitate the integration into OpenSolaris.
    6.3. Cost of Capital Resources:
         None
    6.4. Product Approval Committee requested information:
        6.4.1. Consolidation or Component Name: sfw
        6.4.3. Type of CPT Review and Approval expected: FastTrack
        6.4.4. Project Boundary Conditions: None
        6.4.5. Is this a necessary project for OEM agreements: No
        6.4.6. Notes: N/A
        6.4.7. Target RTI Date/Release: onnv_120
        6.4.8. Target Code Design Review Date: 06/20/2009
        6.4.9. Update approval addition: No
    6.5. ARC review type:
         FastTrack
    6.6. ARC Exposure:
         open
        6.6.1. Rationale: N/A

7. Prototype Availability:
    7.1. Prototype Availability:
         Version 1.0 of LatencyTOP is already available on
         OpenSolaris.
    7.2. Prototype Cost:
         N/A

8. Appendix

System Administration Commands                      latencytop(1M)

NAME
     latencytop - report statistics related to latencies in the
     system and in applications

SYNOPSIS
     latencytop [-o log-file] [-k log-level] [-t interval] [-f]
         [-s] [-l log-interval] [-h]

DESCRIPTION
     LatencyTOP is an observability tool that reports statistics
     about latencies in the system and in applications. The tool
     reports where latencies occur, and what kind they are, in the
     system and in the applications running on it. The statistics
     can then be used to improve the performance and throughput of
     applications and the system by removing the latencies.

     The tool analyzes system activity periodically and displays
     the data in the output window. Two types of latencies are
     tracked: an LWP going in and out of sleep, and an LWP spinning
     in order to acquire a synchronization object. The tool uses
     the Solaris DTrace framework to collect the statistics
     corresponding to these two scenarios of inactivity of the
     system and application LWPs.

     The output window is divided into two sections: the upper part
     displays the system-wide statistics, while the lower part
     displays statistics about individual processes. The user can
     navigate the list of processes (using the < and > keys) and
     select the one of interest; the tool then displays statistics
     about the selected process in the lower part of the window. If
     the t or T key is pressed, the tool displays the LWP-specific
     view of the selected process. Thus, the t or T key can be used
     to toggle between the process view and the thread view.

     During execution, a user can force a refresh of the analysis
     by pressing the r or R key. The interval time is restored to
     the default or to a specified value (if -t was used).

     To quit the application, the user must press the q or Q key.

OPTIONS
     The following options are supported:

     -o [log-file]
          Specifies the log file where output will be written. The
          default log file is /var/log/latencytop.log.

     -k [log-level]
          Specifies the level of logging in the log file. Valid
          values are: 0 = none (default), 1 = unknown, and 2 = all.

     -t [interval]
          Specifies the interval, in seconds, at which the tool
          collects statistics from the system. The possible values
          are between 1 and 60; the default is 5 seconds.

     -f   Filters large interruptible latencies (e.g. sleep).

     -s   Monitors the sched (PID=0) process for any latency.

     -l [log-interval]
          Writes data to the log file every log-interval seconds;
          log-interval must be > 60.

     -h   Displays the command's usage.

EXAMPLES
     Example 1 Running the tool

     The following command launches the tool with default values
     for options.

       % latencytop

     Example 2 Setting the interval

     The following command sets the interval to two seconds.

       % latencytop -t 2

     Example 3 Setting the log file

     The following command sets the log file to
     /tmp/latencytop.log.

       % latencytop -o /tmp/latencytop.log

     Example 4 Setting the log level

     The following command sets the log level to "all".

       % latencytop -k 2

EXIT STATUS
     0    Successful operation.

     1    An error occurred.

ATTRIBUTES
     See attributes(5) for descriptions of the following
     attributes:

     ____________________________________________________________
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    |_____________________________|_____________________________|
    | Architecture                | x86, SPARC                  |
    |_____________________________|_____________________________|
    | Availability                | SUNWlatencytop              |
    |_____________________________|_____________________________|
    | Interface Stability         | Volatile                    |
    |_____________________________|_____________________________|

SEE ALSO
     kstat(1M), dtrace(1M)

     Among non-SunOS man pages, xscreensaver(1), from the
     OpenWindows man pages.

USAGE
     You must have DTrace privileges to run LatencyTOP.
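
The value constraints listed under OPTIONS can be illustrated with a small validation routine. This is only a sketch in Python, not the tool's actual implementation: the function name validate_options and its parameter names are hypothetical, and the only facts taken from the man page are the ranges themselves (log level 0 to 2, interval 1 to 60 seconds with a default of 5, log interval greater than 60 seconds).

```python
def validate_options(log_level=0, interval=5, log_interval=None):
    """Check option values against the ranges stated in the man page.

    Hypothetical helper for illustration only. Returns a list of error
    strings; an empty list means every value is within its stated range.
    """
    errors = []

    # -k: 0 = none (default), 1 = unknown, 2 = all
    if log_level not in (0, 1, 2):
        errors.append("-k log-level must be 0, 1, or 2")

    # -t: sampling interval in seconds, between 1 and 60 (default 5)
    if not 1 <= interval <= 60:
        errors.append("-t interval must be between 1 and 60 seconds")

    # -l: optional; when given, must be greater than 60 seconds
    if log_interval is not None and log_interval <= 60:
        errors.append("-l log-interval must be greater than 60 seconds")

    return errors


if __name__ == "__main__":
    # Defaults are valid, so no errors are reported.
    print(validate_options())
    # An interval of 0 and a log interval of exactly 60 are both rejected.
    print(validate_options(interval=0, log_interval=60))
```

Collecting all violations in a list, rather than stopping at the first one, lets a caller report every out-of-range option in a single usage message.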