Content tagged with #Compilers

This page lists available online tutorials related to parallel programming and using LC's HPC systems. NOTE: archived tutorials are no longer updated and may contain broken links and other QA issues.

Our Development Environment Software consists of compilers and preprocessors, debugging software, memory-related software, profiling tools, tracing tools, and performance analysis tools.

Numerous compilers are available to provide a rich programming environment for scientific and technical computing. A detailed list of single-CPU, distributed-memory, and shared-memory compilers, along with the machines on which each is available, is provided. Note this page is deprecated in favor of

Intel's VTune Amplifier is a performance profiling tool for C, C++, and Fortran code that can identify where in the code time is being spent in both serial and threaded applications. For threaded applications, it can also determine the amount of concurrency and identify bottlenecks created by synchronization primitives.

Allinea DDT is a powerful, easy-to-use graphical debugger capable of debugging:

TotalView is a sophisticated and powerful tool used for debugging and analyzing both serial and parallel programs. TotalView provides source level debugging for serial, parallel, multi-process, multi-threaded, accelerator/GPU and hybrid applications written in C/C++ and Fortran. Most HPC platforms and systems are supported. Both a graphical user interface and command line interface are provided.

TAU (Tuning and Analysis Utilities) is a comprehensive profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, Java, and Python. It is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements. All C++ language features are supported including templates and namespaces.

UPDATE: This page is largely deprecated.
In conjunction with our Python user community, Livermore Computing (LC) maintains Python and a set of site-specific packages (modules) on all production CHAOS systems. The information herein, which includes the supported versions of Python and site-packages, a description of each site-package, and Python development techniques, is intended to help users work with Python in LC environments.

PAPI (the Performance Application Programming Interface) provides machine- and operating-system-independent (portable) access to the hardware performance counters found on most modern processors. Any of over 100 preset events can be counted through either a simple high-level programming interface or a more complete low-level interface, from either C or Fortran.

Modern x86 processors include vector units that can operate on multiple data objects with a single instruction, known as Single Instruction, Multiple Data (SIMD) units. These are implemented in the 128-bit Streaming SIMD Extensions (SSE) and, starting with Intel's Sandy Bridge architecture, the 256-bit Advanced Vector Extensions (AVX).
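
To make the idea of one instruction operating on several data objects concrete, here is a minimal sketch in C using the 128-bit SSE intrinsics from `<xmmintrin.h>`. The function name `add_arrays` is hypothetical and chosen only for illustration; the sketch assumes a compiler that defines `__SSE__` when SSE is available (as GCC and Clang do on x86-64) and falls back to a plain scalar loop otherwise. It is not taken from any LC page.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Hypothetical helper: computes out[i] = a[i] + b[i] for i in [0, n). */
void add_arrays(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

#if defined(__SSE__)
    /* A 128-bit SSE register holds four 32-bit floats, so each
       _mm_add_ps performs four additions with one instruction. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif

    /* Scalar loop: handles the remaining 0 to 3 elements, or the whole
       array when SSE is not available at compile time. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The unaligned load and store variants are used so the sketch works on any `float` array; aligned variants exist but require 16-byte-aligned pointers. A 256-bit AVX version would follow the same pattern with eight floats per register, and in practice an optimizing compiler can often vectorize the plain scalar loop on its own.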