Livermore Computing (LC) maintains Python and a set of site-specific packages (modules) on all production TOSS and CORAL systems. This page describes how to use Python in LC environments and, where possible, links to external sites. General information about the Python programming language is available at the Python website. An LC Confluence Space allows users to share tips and tricks and to post requests for site-package additions, and an Institutional Confluence Space provides additional information, including a supported Python Wheelhouse.
Future Python 2 Support
The Python Software Foundation officially ended Python 2 support on January 1, 2020, so the Python community will not improve Python 2 even if a bug or security problem is found. However, Livermore Computing will continue to support existing Python 2 installations on current systems and will likely maintain a Python 2 installation beyond January 1, 2020 on new systems (and through major OS updates, e.g., TOSS 4) for the foreseeable future.
In the long term, users are strongly encouraged to port their Python scripts to Python 3, as Python 2 will eventually be retired from LC systems. We do not currently have a retirement date, as it will depend on any major bugs, security issues, and continued Python 2 support in existing Python packages (e.g., numpy, scipy). Users will be given ample notice before LC drops support for Python 2.
Below are a few references that may help port Python 2 applications to Python 3:
On TOSS 3 x86 clusters, LC maintains the default Python command for version 2 as /usr/tce/bin/python and for version 3 as /usr/tce/bin/python3. Unless users modify their standard search paths, typing python at a UNIX prompt invokes /usr/tce/bin/python, because /usr/tce/bin takes precedence over all other paths in the standard search path that LC configures. Note that this is not the same as the OS-included /usr/bin/python.
The LC-supported Python installations were built with the Spack package manager and ultimately reside in /collab/usr/gapps; symlinks are created in /usr/tce/packages/python for convenience. Only a few of the Python binaries are made available directly in /usr/tce/bin/ (python, ipython, cython, f2py, idle, pyMPI, and virtualenv), so other commands (pytest, pip, and others) will default to their OS-installed versions in /usr/bin. To get all of the LC-supported Python installation's executables in your $PATH, load the python module by running module load python. Instructions on loading the python module can be found below, and more general information about modules at LC can be found on the LC Confluence wiki.
LC maintains similar Python installations on CORAL EA and CORAL systems in /usr/tcetmp (rather than /usr/tce as on TOSS 3 x86).
On TOSS 4 x86 clusters and CORAL 2 systems, there is no default python command in the standard $PATH. This is consistent with the wider Linux community. Users should use the version-specific python2 and python3 commands. As on TOSS 3 systems, these should default to /usr/tce/bin/python2 and /usr/tce/bin/python3 rather than their /usr/bin counterparts. The only additional Python binaries that are available are idle3, ipython, and virtualenv. These correspond to the default python, which is currently version 3.9.12. As on TOSS 3 clusters, users can switch python versions using modules, which will make the full set of python binaries available in their $PATH. Note that once a python module is loaded, there will be a default python command that corresponds to the version of the loaded module. Also, because of the way LC installs Python using Spack and its view feature, loading the python module will expose the entire view's bin directory, including some more common commands such as R, meson, and ninja.
In addition to the default version, LC maintains an older version of Python—typically the previous default—to ensure a smooth transition between upgrades. This old version is available in /usr/tce/bin/python-<old_version>. For example, when version 2.7.14 replaced version 2.7.13 as the default version, the latter became accessible via the python-2.7.13 command. LC supports the old version until the transition is believed to be complete. Newer versions may be similarly available. Users can load the python module to modify their default.
For each Python version, LC supports a set of modules, also known as site-packages, generally beneficial to our user community. Our package selection process begins when a package is requested by a user. LC first studies if it can generally benefit the Python users on the LC machines before committing to install and maintain it. If it turns out to be too specific to the individual requester, we recommend that this user use virtualenv to manage their own Python environment. Users may also make a side-installation and add the installation path to their Python search path via the PYTHONPATH environment variable. Site package requests can be made and tracked on the site-package request wiki. The complete list of the site-packages that LC maintains is available through the following links (note that for TOSS 4 and newer systems, we will not maintain this list, but instead recommend that users run python3 -m pip list --format=columns to view the full list):
- Python 2.7.13 for TOSS 3
- Python 2.7.14 for TOSS 3
- Python 2.7.16 for TOSS 3
- Python 3.6.0 for TOSS 3
- Python 3.6.4 for TOSS 3
- Python 3.7.2 for TOSS 3
- Python 3.8.2 for TOSS 3
- Python 2.7.13 for CORAL EA and CORAL
- Python 2.7.14 for CORAL EA and CORAL
- Python 2.7.16 for CORAL EA and CORAL
- Python 3.6.4 for CORAL EA and CORAL
- Python 3.7.2 for CORAL EA and CORAL
- Python 3.8.2 for CORAL EA and CORAL
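On systems where the lists above are not maintained, the information that python3 -m pip list reports can also be gathered from inside Python. The following is a minimal sketch, assuming Python 3.8 or newer (where importlib.metadata is part of the standard library); the helper name installed_packages is made up for illustration and is not part of any LC tooling.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) for each visible distribution."""
    found = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip installs whose metadata cannot be read.
            continue
        if name:
            found.add((name, version))
    # Case-insensitive sort by package name, similar to pip's listing.
    return sorted(found, key=lambda item: item[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print("{:30s} {}".format(name, version))
```

Run it with the interpreter you care about (for example, a virtualenv's python) to see exactly which packages that interpreter can see.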
Note: Our operating system installation includes a build of Python in /usr/bin/python[2,3], which is typically a slightly older version of Python. You may use this version if it works for you; however, be aware that updates, patches, and site-package additions to this Python build are usually scheduled along with the OS upgrades. We therefore encourage you to use /usr/tce/bin/python because we can more flexibly modify that installation to your needs.
We have installed virtualenv to enable users to manage their own Python build while leveraging the LC-supported Python installation. Virtualenv copies portions of a base Python installation into a user-specified directory while still pointing to the base Python installation. The virtualenv copy thus has access to the base Python installation's site-packages (and will track updates) while also allowing the user to install additional or updated site packages.
To create a virtualenv environment using the default python, run (replace <yourprefix> with a directory where you have write access, such as your home directory or workspace directory):
/usr/tce/packages/python/default/bin/virtualenv --system-site-packages <yourprefix>
Note that you need to supply the --system-site-packages argument to pick up the default site-packages. This command creates a virtualenv environment in the specified <yourprefix> directory. LC suggests that <yourprefix> should use the $SYS_TYPE environment variable (e.g., virtualenv --system-site-packages /usr/workspace/$USER/local/$SYS_TYPE/venv), as installed packages may not be portable across different LC systems. You can then access this Python via <yourprefix>/bin/python or by running source <yourprefix>/bin/activate to set your $PATH such that you can just run python. There are many options and customizations possible with virtualenv. More information about virtualenv is available on the virtualenv Web page. Once a user has created a virtualenv, they may run <yourprefix>/bin/pip install <somepackage> to install additional packages. Here is an example session using virtualenv (with some output removed for brevity):
[lee218@rzwhippet17:lee218]$ /usr/tce/packages/python/python-3.10.8/bin/virtualenv --system-site-packages /usr/workspace/$USER/local/$SYS_TYPE/venv-3.10.8
[lee218@rzwhippet17:lee218]$ /usr/workspace/$USER/local/$SYS_TYPE/venv-3.10.8/bin/pip install torch
Collecting torch
[notice] A new release of pip available: 22.3.1 -> 23.0
[notice] To update, run: /usr/WS2/lee218/local/toss_4_x86_64_ib/venv-3.10.8/bin/python -m pip install --upgrade pip
[lee218@rzwhippet17:lee218]$ /usr/workspace/$USER/local/$SYS_TYPE/venv-3.10.8/bin/python -c 'import torch ; print(torch.__file__)'
/usr/workspace/lee218/local/toss_4_x86_64_ib/venv-3.10.8/lib/python3.10/site-packages/torch/__init__.py
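A script can confirm that it is running inside a virtualenv rather than the base installation. The sketch below relies only on the standard sys module; the function name in_virtual_environment is hypothetical, and the check assumes the usual behavior of virtualenv (older releases set sys.real_prefix, newer ones make sys.base_prefix differ from sys.prefix).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtualenv/venv."""
    # Older virtualenv releases record the base installation in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and the standard venv module leave
    # sys.base_prefix pointing at the base installation.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter :", sys.executable)
    print("prefix      :", sys.prefix)
    print("virtualenv? :", in_virtual_environment())
```

Running this with <yourprefix>/bin/python should report the virtualenv directory as the prefix.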
First, it is worth noting that users should not specify the --user flag when installing site-packages. This installs packages in $HOME/.local regardless of the architecture. This can cause incompatibility when moving between hardware architectures or OS versions.
If you are using virtualenv, you can install site packages directly into your virtualenv Python environment. One method is to use pip, which is included in the /bin directory of your virtualenv environment. There are several ways to use pip, as mentioned in the pip documentation. One method is to download the source tarball for the desired package and then run:
% <yourprefix>/bin/pip install <packagename-version>.tar.gz
You can also manually install packages using distutils by running the following command from the package source directory with your virtualenv python:
% <yourprefix>/bin/python setup.py install
With virtualenv and the above examples, the site package can be imported directly, without having to set PYTHONPATH, when you run <yourprefix>/bin/python.
If you do not use virtualenv, you can still install your own site packages, but they cannot be built into the actual Python installation. Instead, you will have to run distutils with the --prefix option, for example:
% /usr/tce/packages/python/default/bin/python setup.py install --prefix=<siteprefix>
Note that the path used to install the package should be /usr/tce/packages/python/default/bin/python, not /usr/tce/bin/python. The /usr/tce/bin/python path is a script, and because of the symlink structure and how distutils works, the build may not work properly if you build the package with the /usr/tce/bin/python path. We also suggest including a $SYS_TYPE directory in the specified siteprefix (e.g., --prefix=/usr/workspace/$USER/local/$SYS_TYPE) to separate installations for various OS versions and architectures. To load the site package, you will need to add the installation directory, which is typically <siteprefix>/lib/python<version>/site-packages, to your PYTHONPATH environment variable. It is worth noting again that because home file systems are mounted across multiple platforms, we generally advise against installing with the --user option, which installs packages in $HOME/.local. This may cause conflicts when trying to run Python codes across multiple OS versions or across various architectures.
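To work out which directory to add to PYTHONPATH for a given prefix, something like the following sketch may help. It assumes the conventional layout <siteprefix>/lib/python<major>.<minor>/site-packages; some platforms place compiled extensions under lib64 instead, so check the install output if imports fail. The function name site_packages_dir and the example prefix are hypothetical.

```python
import os
import sys


def site_packages_dir(siteprefix):
    """Return the conventional site-packages directory under a --prefix install."""
    version = "python{}.{}".format(sys.version_info[0], sys.version_info[1])
    return os.path.join(siteprefix, "lib", version, "site-packages")


if __name__ == "__main__":
    # Hypothetical prefix that follows the $SYS_TYPE suggestion.
    prefix = os.path.join(
        "/usr/workspace",
        os.environ.get("USER", "me"),
        "local",
        os.environ.get("SYS_TYPE", "unknown_sys_type"),
    )
    print(site_packages_dir(prefix))
```

The printed directory is the value to append to PYTHONPATH for the interpreter version that ran the script.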
Additional information about installing Python modules is available on the Installing Python Modules page. Furthermore, an LC-hosted wheelhouse exists with already approved and built packages. Information on how to use the LC-hosted wheelhouse can be found on the LLNL Python Wheelhouse wiki.
There are certain scenarios where the LC Python and virtualenv do not suffice. Another option is to use the Conda package manager. One common reason to use Anaconda is to run the Spyder IDE. LLNL has procured a site-wide Anaconda license allowing users to install their own Anaconda distribution and install their own site-packages through the default Anaconda repository. LC has downloaded the installers, which are available in /collab/usr/gapps/python/$SYS_TYPE/conda/. On TOSS 4 systems, a base Anaconda installation can be found in /collab/usr/gapps/python/toss_4_x86_64_ib/anaconda3/bin/conda. Note that this is a rolling "default" symlink that may change to a newer version once available. To guarantee consistency, users can use the full path to a specific version, such as /collab/usr/gapps/python/toss_4_x86_64_ib/anaconda3-2023.03/bin/conda.
TensorFlow and PyTorch on LC Systems
For MPI parallelism in Python, users are advised to use the mpi4py package, which is included in LC Python installations.
The executable for Python 3 is python3 as opposed to python. Loading a Python 3.X module on TOSS 3 and CORAL 1 systems will only affect the version of the python3 command and not the python command. With TOSS 4 this behavior has changed and loading a Python 3.X module will modify one's default python command. Users are still advised to explicitly run python2 or python3 to avoid confusion.
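Because the meaning of the python command differs between systems and loaded modules, a script can guard against being launched by the wrong interpreter. This is a minimal sketch; the minimum version (3, 6) and the name version_ok are illustrative assumptions. The syntax is deliberately kept compatible with Python 2 so that the error message is shown rather than a SyntaxError.

```python
import sys

MINIMUM = (3, 6)  # hypothetical minimum; adjust to what your script needs


def version_ok(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = sys.version.split()[0]
    if not version_ok():
        sys.exit("This script needs Python %d.%d or newer; found %s"
                 % (MINIMUM + (found,)))
    print("Running under Python " + found)
```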
Jupyter Notebook is an open-source web application that allows users to create and share documents that contain live code, equations, visualizations, and narrative text. Jupyter Notebook runs as a web server, and thus LC users are not allowed to run their own installations of Jupyter (including Conda or similar distributions) on LC systems. LC does, however, deploy a modified JupyterHub service that adheres to LC security policies. Users may connect to this service at https://lc.llnl.gov/jupyter for the CZ and at https://rzlc.llnl.gov/jupyter for the RZ. The LC JupyterHub deployment allows users to select a login node that they have access to and remotely spawn a notebook on that login node. The default kernel has been set to Python 3; however, users may create a virtualenv installation to create their own custom kernels. More information on how to use the LC JupyterHub service is available at https://hpc.llnl.gov/services/jupyterhub-and-jupyter-notebooks and https://lc.llnl.gov/confluence/display/LC/JupyterHub+and+Jupyter+Notebook.
Below are a few basic, useful techniques for running Python. Users are also encouraged to share pointers on the Python tips and tricks wiki.
You can use Lmod Modules to control which version of Python to use. For more information, see the TOSS 3 Lmod wiki.
To see which versions of Python are available:
[lee218@rzgenie5:~]$ module avail python
-------------------------- /usr/tce/modulefiles/Core ---------------------------
   python/2.7.11    python/2.7.16 (D)    python/3.6.4
   python/2.7.13    python/3.5.1         python/3.7.2
   python/2.7.14    python/3.6.0         python/3.8.2
  Where:
   D:  Default Module
To use the default version of Python:
% python
Python 2.7.14 (default, Jan 17 2018, 10:04:29)
[GCC 4.9.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
To use a specific version of Python:
% module load python/2.7.11
% python
Python 2.7.11 (default, Jun 10 2016, 11:20:30)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Invoking Python in a Script
Place the following line at the beginning of your Python script:
#! /bin/env python3
Change the permissions of your script to add execute privileges:
% chmod +x myscript.py
Run your script like an executable:
% ./myscript.py
Note: While you can also use #! /usr/tce/bin/python3, we do not recommend doing so because it will cause problems if you are using modules to manage which Python version you are running. Also, be aware that /bin/env python3 runs whichever python3 appears first in your $PATH, so the interpreter can change when you load a module or activate a virtualenv. If your script depends on a specific Python version, you should use the full path to that version instead.
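To see which interpreter a shebang line actually resolved to, a small script along these lines can be made executable and run. It is a sketch using only the standard library; the name interpreter_report is made up for illustration.

```python
#! /bin/env python3
"""Report which interpreter is running this script."""
import shutil
import sys


def interpreter_report():
    """Collect the running interpreter and the first python3 found on PATH."""
    return {
        "running": sys.executable,
        "first python3 on PATH": shutil.which("python3"),
        "version": sys.version.split()[0],
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print("{:<24}{}".format(key, value))
```

Running it before and after module load python/<version> shows how the loaded module changes the interpreter that the shebang line selects.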
Running Python at Scale
Due to its use of shared libraries and the way Python searches for site-packages, running Python at large node/task counts can have detrimental effects on shared file systems, impacting not only the performance of your script but also that of the entire compute center. To avoid overloading the file systems, users are advised to launch their large-scale Python jobs using Spindle. This is generally as simple as prepending spindle to one's command line, e.g., spindle srun python myscript.py myarg1 myarg2.
Python, Multiprocessing, srun, and mpibind
On node-scheduled LC clusters, mpibind is enabled by default. This is known to have a detrimental performance impact on Python scripts that use multiprocessing as the primary mode of parallelism, as all processes and threads launched by a given python process will be bound to a single socket. To work around this issue, users of python and multiprocessing are advised to explicitly disable mpibind, e.g., srun -n 1 --mpibind=off python my_mp_script.py. This allows the workers to migrate across sockets and thus take advantage of all CPUs/cores on a node.
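One way to check whether binding is restricting a multiprocessing job is to print how many CPUs the parent and its workers are allowed to run on. The sketch below assumes Linux, where os.sched_getaffinity is available, and falls back to os.cpu_count elsewhere; the function name allowed_cpus is hypothetical.

```python
import multiprocessing
import os


def allowed_cpus(_=None):
    """Return how many CPUs the calling process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):  # available on Linux
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("parent may use", allowed_cpus(), "of", os.cpu_count(), "CPUs")
    # Workers inherit the parent's CPU affinity, so they report the same limit.
    with multiprocessing.Pool(processes=2) as pool:
        print("workers may use", pool.map(allowed_cpus, range(2)))
```

Comparing the output of srun -n 1 python with that of srun -n 1 --mpibind=off python should show whether the workers are confined to a subset of the node.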
Python Integrated Development Environments (IDEs)
Users who prefer the Spyder IDE may install it themselves using Anaconda (see the section above on Conda). The base conda installation on LC TOSS 4 systems contains a spyder executable, which can be accessed via /collab/usr/gapps/python/toss_4_x86_64_ib/anaconda3/bin/spyder.
All TOSS 3 systems include a build of eclipse with the pydev plugin. To load eclipse, you can run the following commands: module load opt ; module load eclipse ; eclipse or you can directly invoke /opt/rh/rh-eclipse46/root/bin/eclipse. For instructions on how to set up a pydev project, please refer to the pydev manual at https://www.pydev.org/manual_101_root.html.
Since IDEs run on LC systems may not perform well with X forwarding, users are advised to consider logging in to LC systems using VNC. Please refer to https://hpc.llnl.gov/software/visualization-software/vnc-realvnc for more information.
A lot of documentation for Python is available on the Web and in published hardcopy. The Python Web site contains links that are extremely useful. Documents and information specific to LLNL or not easily found at the Python Web site are:
- Python Site Packages (LLNL)
- Debugging Python: Python script debugging using pdb
- Limited Python debugging capabilities are available in TotalView
- pyDoc: Documentation generator and online help system
- pyMPI: An introduction to parallel Python using MPI
If you know of other documentation that should be listed here, please contact the LC Hotline.
If you are getting SSL certificate errors, such as
requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)
you can set your REQUESTS_CA_BUNDLE environment variable to /etc/pki/tls/cert.pem. LC has patched its requests installations to automatically set this variable if it is not already defined.
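The variable can also be set from within a script, provided this happens before any requests calls are made. The following is a minimal sketch; the helper name ensure_ca_bundle is made up for illustration, and the bundle path is the one given above.

```python
import os

CA_BUNDLE = "/etc/pki/tls/cert.pem"  # path given in the text above


def ensure_ca_bundle(path=CA_BUNDLE, environ=os.environ):
    """Set REQUESTS_CA_BUNDLE to `path` unless a value is already defined."""
    return environ.setdefault("REQUESTS_CA_BUNDLE", path)


if __name__ == "__main__":
    print("REQUESTS_CA_BUNDLE =", ensure_ca_bundle())
    # Import and use requests only after the variable has been set.
```

Using setdefault keeps any value the user has already exported, which mirrors the behavior described for LC's patched requests installations.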
Some users have reported SSL errors such as:
SSLError: HTTPSConnectionPool(host='example.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1131)')))
It was discovered that this happens when conda installs openssl 3.X. This may be fixed by forcing conda to downgrade openssl to 1.1.1:
conda install --force-reinstall openssl=1.1.1
On TOSS 3 x86 systems, the LC Python installations and their site packages were built with gcc-4.9.3. However, many site-packages that users add require a newer libstdc++, resulting in errors such as:
ImportError: /usr/tce/packages/gcc/gcc-4.9.3/lib64/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /path/to/some_extension.so)
The best way to work around this issue is to force your Python script to load the libstdc++.so from a newer gcc build. It is important that this be loaded before loading any other modules that may bring in the version 4.9.3 libstdc++.so. For example, below is the code required to load the 8.1.0 libstdc++.so:
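import ctypes
# Load the newer libstdc++ first so that later imports resolve against it.
ctypes.CDLL("/usr/tce/packages/gcc/gcc-8.1.0/lib64/libstdc++.so")
# Placeholder for the extension module that needs the newer libstdc++.
import some_extension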
mpi4py one-sided communication
If locks don't appear to be working as expected using mpi4py's Win class, setting MPICH_ASYNC_PROGRESS=1 may resolve the issue.
h5py on CORAL systems
The default h5py on CORAL systems was built with MPI support. If run outside of jsrun/lrun (i.e., directly invoking python and importing h5py), users may receive an error/warning message or get a segmentation fault. Users who need to use h5py when directly running python outside of jsrun will need to use a side-installed, non-MPI h5py by adding the appropriate directory to their PYTHONPATH environment variable. These side installs can be found in /usr/tce/packages/python/python-2.7.16/lib/python2.7/h5py-serial-2.7.16, /usr/tce/packages/python/python-3.7.2/lib/python3.7/h5py-serial-3.7.2, and /usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial. For example:
% lrun -n 1 python- -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-2.7.16/lib/python2.7/site-packages/h5py/__init__.pyc
% python -c 'import h5py; print(h5py.__file__)'
Segmentation fault
% setenv PYTHONPATH /usr/tce/packages/python/python-2.7.16/lib/python2.7/h5py-serial-2.7.16
% python -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-2.7.16/lib/python2.7/h5py-serial-2.7.16/h5py/__init__.pyc
% lrun -n 1 python3-3.7.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.7.2/lib/python3.7/site-packages/h5py/__init__.py
% python3-3.7.2 -c 'import h5py; print(h5py.__file__)'
Segmentation fault
% setenv PYTHONPATH /usr/tce/packages/python/python-3.7.2/lib/python3.7/h5py-serial-3.7.2
% python3-3.7.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.7.2/lib/python3.7/h5py-serial-3.7.2/h5py/__init__.py
% lrun -n 1 python3-3.8.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial/h5py/__init__.py
% python3-3.8.2 -c 'import h5py; print(h5py.__file__)'
Segmentation fault
% setenv PYTHONPATH /usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial
% python3-3.8.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial/h5py/__init__.py