Please contact if you have questions. 

Dates & Times

July 15 2024
9AM - 4PM, Monday


In person in B453, Black Diamond Room (1st floor, before the glass doors). Hybrid options are available, but in-person attendance is preferred.


We will hold a hybrid, full-day TAU workshop.


Parallel Performance Evaluation Using TAU
To help computational scientists evaluate the performance of their parallel scientific applications, we present the TAU (Tuning and Analysis Utilities) performance system and its interfaces with other tools such as PAPI, Perfetto, OTF2, and Vampir. This two-day workshop will cover performance evaluation of applications on tri-lab OCF platforms and will include consulting sessions. It will focus on performance data collection, analysis, and optimization. After describing and demonstrating how performance data (both profile and trace data) can be collected in a straightforward manner using TAU’s automated instrumentation, the bulk of the workshop will cover how to analyze the collected data and drill down to find performance bottlenecks and determine their causes.
The workshop will include sample codes that illustrate the different instrumentation and measurement choices available to users. Topics will cover generating performance profiles and traces with memory utilization and headroom, I/O, and hardware performance counter data using PAPI. The workshop will demonstrate scalable tracing using OTF2 and visualization using the Vampir trace analysis tool. Performance data analysis using ParaProf and PerfExplorer will be demonstrated using the performance data management framework (TAUdb) that includes TAU’s performance database.
The workshop will also feature cross-experiment analysis, including comparing the effects of multi-core architectures on code performance. The demonstrations will include using TAU with programming models such as ROCm, Intel oneAPI (DPC++/SYCL), OpenCL, OpenACC, and CUDA.
The workshop will also cover using TAU in the Extreme-Scale Scientific Software Stack (E4S) with containers and AWS. E4S is a curated, Spack-based collection of HPC and AI/ML tools and includes PETSc, Trilinos, TAU, HPCToolkit, HDF5, as well as TensorFlow, PyTorch, and other generative AI toolkits available on commercial cloud platforms.
We will attempt to collect and analyze performance data for additional user codes during the hands-on portion of the workshop. Users and developers are welcome to contact the instructor ahead of time to begin collecting data to discuss at the workshop.
The second day of the workshop is dedicated to one-on-one consultation sessions with Dr. Shende for more in-depth instruction and help in addressing performance bottlenecks in your codes. Please see below for information on scheduling an appointment.


Monday, July 15: 
9:00 PDT - 12:00 PDT:
 *   Introduction to TAU, E4S, AWS for hands-on
 *   Instrumentation using tau_exec with MPI, OpenMP OMPT, CUDA, ROCm, Level Zero, and OpenACC
 *   I/O and memory evaluation
 *   Hands-on with Paraprof, tau_exec, and
12:00 PDT - 1:30 PDT:
 *   Lunch Break
1:30 PDT - 4:00 PDT:
 *   Demonstration of analysis tools: ParaProf, TAUdb, PerfExplorer, Vampir, and Jumpshot
 *   Running TAU on AWS with Adaptive Computing’s ODDC
 *   Hands-on
Tuesday, July 16:
 *   One-on-one consultation sessions by appointment (schedule by sending email to <>)
TAU is available on LC systems - please see:

What to Bring

Please bring your LC username, your LC token, and a laptop if you wish to follow along with the hands-on exercises.


Registration is open below.

Please register as in-person only if you will attend in the B453 Black Diamond Room.

A Webex link will be sent after registration.


No cost.


Please call the LC Hotline or send e-mail to:


Participant Information
If not checked, it is assumed you will join by Webex. Class size is limited by room size.