In conjunction with our Python user community, Livermore Computing (LC) maintains Python and a set of site-specific packages (modules) on all production TOSS systems. This information covers the supported versions of Python and its site-packages, a description of each site-package, and Python development techniques for LC environments.
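As a quick way to check which Python version and site-packages directory a given session is actually using, a minimal standard-library sketch follows. The function name `describe_python_environment` is hypothetical, and the values it reports depend entirely on the interpreter that runs it; nothing here is specific to LC systems.

```python
import sys
import sysconfig


def describe_python_environment():
    """Return the running interpreter's version and package location."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return {
        "version": version,
        "executable": sys.executable,
        # Directory where pure-Python third-party packages are installed.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_python_environment().items():
        print(f"{key}: {value}")
```

Running this under each available Python installation is one simple way to confirm that the interpreter and package directory match what you expect.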
Note: SSA has been deprecated. It will not run with Intel compilers version 16 and above, and its results cannot be viewed in Inspector version 2016 and above. It can still be run with older compiler and Inspector versions.
Modern x86 processors include vector units, also known as Single Instruction, Multiple Data (SIMD) units, that can operate on multiple data objects with a single instruction. These are exposed through the 128-bit Streaming SIMD Extensions (SSE) and, starting with Intel's Sandy Bridge architecture, the 256-bit Advanced Vector Extensions (AVX).
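To make the register widths concrete, the following sketch computes how many data elements fit in one vector register. It is plain arithmetic, not a measurement: the helper name `simd_lanes` is hypothetical, and the result is only the number of elements one packed instruction can touch, not the speedup a real code will see.

```python
SSE_BITS = 128  # register width for SSE
AVX_BITS = 256  # register width for AVX


def simd_lanes(register_bits, element_bits):
    """Number of elements of a given width that fit in one vector register."""
    if register_bits <= 0 or element_bits <= 0:
        raise ValueError("widths must be positive")
    if register_bits % element_bits != 0:
        raise ValueError("element width must evenly divide register width")
    return register_bits // element_bits


if __name__ == "__main__":
    for name, bits in (("SSE", SSE_BITS), ("AVX", AVX_BITS)):
        print(name,
              "32-bit lanes:", simd_lanes(bits, 32),
              "64-bit lanes:", simd_lanes(bits, 64))
```

For example, a 128-bit register holds four 32-bit values or two 64-bit values, and a 256-bit register holds twice as many of each.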
Vampir is a full-featured tool suite for analyzing the performance and message-passing characteristics of parallel applications. Vampir is based on run-time tracing of program events, collected as OTF files by other tools and libraries such as VampirTrace, TAU, Score-P, and Open|SpeedShop.
TotalView is a sophisticated and powerful tool for debugging and analyzing both serial and parallel programs. It provides source-level debugging for serial, parallel, multi-process, multi-threaded, accelerator/GPU, and hybrid applications written in C/C++ and Fortran. Most HPC platforms and systems are supported. Both a graphical user interface and a command-line interface are provided.
TAU (Tuning and Analysis Utilities) is a comprehensive profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, Java, and Python. It can gather performance information through instrumentation of functions, methods, basic blocks, and statements. All C++ language features are supported, including templates and namespaces.
PapiEx is a PAPI-based program for measuring the hardware performance events of an application from the command line. It supports both PAPI preset events and native events, as well as multiple threads of execution, including pthreads and OpenMP threads. For MPI programs, PapiEx can gather statistics across tasks. PapiEx also measures the total time spent in I/O and MPI calls.
The Performance Application Programming Interface (PAPI) provides portable, machine- and operating-system-independent access to the hardware performance counters found on most modern processors. Any of over 100 preset events can be counted through either a simple high-level programming interface or a more complete low-level interface, from either C or Fortran.
mpiP is a lightweight profiling library for MPI applications. Because it collects only statistical information about MPI functions, mpiP generates considerably less overhead and much less data than tracing tools. All the information captured by mpiP is task-local. It uses communication only during report generation, typically at the end of the experiment, to merge results from all of the tasks into one output file.
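The collect-locally, merge-at-the-end pattern described above can be illustrated with a small pure-Python simulation. This is a conceptual sketch only: it does not use MPI, it is not mpiP's code or API, and every name in it (`new_task_stats`, `record_call`, `merge_task_stats`) is hypothetical. Each simulated task keeps its own per-function call counts and times, and the only step that combines data across tasks is the final merge.

```python
from collections import defaultdict


def new_task_stats():
    """Create an empty statistics table for one simulated task."""
    return defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def record_call(local_stats, mpi_function, seconds):
    """Update one task's local statistics; no other task is involved."""
    entry = local_stats[mpi_function]
    entry["calls"] += 1
    entry["seconds"] += seconds


def merge_task_stats(per_task_stats):
    """Combine all task-local tables into one report-style summary."""
    merged = new_task_stats()
    for local_stats in per_task_stats:
        for name, entry in local_stats.items():
            merged[name]["calls"] += entry["calls"]
            merged[name]["seconds"] += entry["seconds"]
    return {name: dict(entry) for name, entry in merged.items()}


if __name__ == "__main__":
    # Two simulated tasks record their own calls independently.
    task0, task1 = new_task_stats(), new_task_stats()
    record_call(task0, "MPI_Send", 0.5)
    record_call(task1, "MPI_Barrier", 1.0)
    # The merge is the only cross-task step, done once at the end.
    for name, entry in sorted(merge_task_stats([task0, task1]).items()):
        print(name, entry["calls"], entry["seconds"])
```

Because each table holds only aggregate counts and times rather than a record per event, its size grows with the number of distinct functions called, not with the length of the run, which is the reason a statistical profile stays far smaller than a trace.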