Livermore Computing (LC) maintains Python and a set of site-specific packages (modules) on all production TOSS and CORAL systems. This page describes how to use Python in LC environments and links to external sites where possible. General information about the Python programming language is available at the Python website. An LC Confluence Space allows users to share tips and tricks and to post requests for site-package additions, and an Institutional Confluence Space provides additional information, including a supported Python Wheelhouse.
Future Python 2 Support
The Python Software Foundation officially ended Python 2 support on January 1, 2020, so the Python community will not improve Python 2 even if a bug or security problem is found. However, Livermore Computing will continue to support existing Python 2 installations on current systems and will likely maintain a Python 2 installation beyond the January 1, 2020 end-of-support date on new systems (and through major OS updates, e.g., TOSS 4) for the foreseeable future.
In the long term, users are strongly encouraged to port their Python scripts to Python 3, as Python 2 will eventually be retired from LC systems. We do not currently have a retirement date, as such a move will depend on major bugs, security issues, and the level of Python 2 support in existing Python packages (e.g., numpy, scipy). Users will be given ample notice before LC drops support for Python 2.
Below are a few references that may help port Python 2 applications to Python 3:
docs.python.org/3/howto/pyporting.html
docs.python.org/2/library/2to3.html
Installation
The LC-supported Python installations were built with the Spack package manager and ultimately reside in /collab/usr/gapps; however, symlinks are created in /usr/tce/packages/python for convenience. To get all of the LC-supported Python installation's executables in your $PATH, load the python module by running module load python. Instructions on loading the python module can be found below, and more general information about modules at LC can be found on the LC Confluence wiki.
On TOSS 4 x86 clusters and CORAL 2 systems, there is no default python command in the standard $PATH. This is consistent with the wider Linux community. Users should use the version-specific python2 and python3 commands, which should default to /usr/tce/bin/python2 and /usr/tce/bin/python3 rather than the /usr/bin counterparts. The only additional Python binaries that are available in /usr/tce/bin are idle3, ipython, and virtualenv. These correspond to the default python, which is currently version 3.9.12. Other commands (pytest, pip, and others) will default to their /usr/bin OS-installed versions. Users can switch python versions using modules, which will make the full set of python binaries available in their $PATH. Note that when one loads a python module, there will be a default python command which will correspond with the version of the loaded module. Also, due to the way LC installs Python using Spack and its view feature, loading the python module will expose the entire view's bin directory including some more common commands like R, meson, and ninja.
For each Python version, LC supports a set of modules, also known as site-packages, generally beneficial to our user community. Our package selection process begins when a package is requested by a user. LC first studies if it can generally benefit the Python users on the LC machines before committing to install and maintain it. If it turns out to be too specific to the individual requester, we recommend that this user use virtualenv to manage their own Python environment. Users may also make a side-installation and add the installation path to their Python search path via the PYTHONPATH environment variable. Site package requests can be made and tracked on the site-package request wiki. The complete list of the site-packages that LC maintains is available by running python3 -m pip list --format=columns.
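The same package inventory can also be collected from inside Python. The following is a minimal sketch that assumes Python 3.8 or newer (where importlib.metadata is part of the standard library); the function name installed_packages is only illustrative and is not an LC-provided tool.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_packages():
    """Return a name-sorted list of (name, version) pairs visible to this interpreter."""
    found = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata["Name"]
            version = dist.version
        except Exception:  # skip distributions with unreadable metadata
            continue
        if name:
            found[name.lower()] = (name, version)
    return [found[key] for key in sorted(found)]


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name:30s} {version}")
```

Running this with different interpreters (for example the default python3 and a virtualenv python) gives a quick way to compare which packages each one can see.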
Note: Our operating system installation includes a build of Python in /usr/bin/python[2,3], which is typically a slightly older version of Python. You may use this version if it works for you; however, be aware that updates, patches, and site-package additions to this Python build are usually scheduled along with the OS upgrades. We therefore encourage you to use /usr/tce/bin/python because we can more flexibly modify that installation to your needs.
Using virtualenv to Manage Your Own Python Environment
We have installed virtualenv to enable users to manage their own Python build while leveraging the LC-supported Python installation. Virtualenv effectively creates a Python installation into a user-specified directory while still pointing to the base Python installation. The virtualenv copy thus has access to the base Python installation's site-packages (and will track updates) while also allowing the user to install additional or updated site packages.
To create a virtualenv environment using the default python, run (replace <yourprefix> with a directory where you have write access, such as your home directory or workspace directory):
/usr/tce/packages/python/default/bin/virtualenv --system-site-packages <yourprefix>
Note that you need to supply the --system-site-packages argument to pick up the default site-packages. This command creates a virtualenv environment in the specified <yourprefix> directory. LC suggests that <yourprefix> include the $SYS_TYPE environment variable (e.g., virtualenv --system-site-packages /usr/workspace/$USER/local/$SYS_TYPE/venv), as installed packages may not be portable across different LC systems. You can then access this Python via <yourprefix>/bin/python or by running source <yourprefix>/bin/activate to set your $PATH such that you can just run python. There are many options and customizations possible with virtualenv. More information about virtualenv is available on the virtualenv Web page. Once you have created a virtualenv, you may run <yourprefix>/bin/pip install <somepackage> to install additional packages. Please note that running pip install --user is highly discouraged on LC systems, since it installs packages in your $HOME directory, which is shared across different architectures and OS versions, and thus may cause unexpected and hard-to-debug issues. Here is an example session using virtualenv (with some output removed for brevity):
[lee218@rzwhippet17:lee218]$ /usr/tce/packages/python/python-3.10.8/bin/virtualenv --system-site-packages /usr/workspace/$USER/local/$SYS_TYPE/venv-3.10.8
[lee218@rzwhippet17:lee218]$ /usr/workspace/$USER/local/$SYS_TYPE/venv-3.10.8/bin/pip install torch
Collecting torch
[notice] A new release of pip available: 22.3.1 -> 23.0
[notice] To update, run: /usr/WS2/lee218/local/toss_4_x86_64_ib/venv-3.10.8/bin/python -m pip install --upgrade pip
[lee218@rzwhippet17:lee218]$ /usr/workspace/$USER/local/$SYS_TYPE/venv-3.10.8/bin/python -c 'import torch ; print(torch.__file__)'
/usr/workspace/lee218/local/toss_4_x86_64_ib/venv-3.10.8/lib/python3.10/site-packages/torch/__init__.py
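To confirm from within a script that it is actually running under a virtualenv, and to build a per-architecture prefix in the style suggested above, something like the following sketch may help. The helper names are hypothetical, and the fallback strings used when $USER or $SYS_TYPE are unset are placeholders rather than LC conventions.

```python
import os
import sys


def in_virtual_environment():
    """True if this interpreter is running from a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def suggested_prefix(name="venv"):
    """Build a prefix that includes $SYS_TYPE so each architecture gets its own."""
    user = os.environ.get("USER", "unknown_user")  # placeholder fallback
    sys_type = os.environ.get("SYS_TYPE", "unknown_sys_type")  # placeholder fallback
    return os.path.join("/usr/workspace", user, "local", sys_type, name)


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
    print("suggested prefix:", suggested_prefix("venv-3.10.8"))
```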
Installing Your Own Site Packages
First, it is worth noting that users should not specify the --user flag when installing site-packages. This installs packages in $HOME/.local regardless of the architecture. This can cause incompatibility when moving between hardware architectures or OS versions.
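If you suspect that packages left over from an earlier pip install --user are interfering with a script, the standard site module can report where the per-user site directory is and whether it is active. This is a small diagnostic sketch; the function name user_site_status is illustrative only.

```python
import os
import site


def user_site_status():
    """Report the per-user site-packages directory and whether Python will use it."""
    user_site = site.getusersitepackages()
    return {
        "user_site": user_site,
        "exists": os.path.isdir(user_site),
        # ENABLE_USER_SITE is False when disabled (for example by setting
        # PYTHONNOUSERSITE or running python -s) and None when disabled
        # for security reasons.
        "enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in user_site_status().items():
        print(f"{key}: {value}")
```

If the directory exists and is enabled, setting the PYTHONNOUSERSITE environment variable to any non-empty value is one way to make Python ignore it for a given run.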
If you are using virtualenv, you can install site packages directly into your virtualenv Python environment. One method is to use pip, which is included in the /bin directory of your virtualenv environment. There are several ways to use pip, as mentioned in the pip documentation. One method is to download the source tarball for the desired package and then run:
% <yourprefix>/bin/pip install <packagename-version>.tar.gz
You can also manually install packages using distutils by running the following command from the package source directory with your virtualenv python:
% <yourprefix>/bin/python setup.py install
With virtualenv and the above examples, the site package can be imported directly, without having to set PYTHONPATH, when you run <yourprefix>/bin/python.
If you do not use virtualenv, you can still install your own site packages, but they cannot be built into the actual Python installation. Instead, you will have to run distutils with the --prefix option, for example:
% /usr/tce/packages/python/default/bin/python setup.py install --prefix=<siteprefix>
We suggest including a $SYS_TYPE directory in the specified siteprefix (e.g., --prefix=/usr/workspace/$USER/local/$SYS_TYPE) to separate installations for various OS versions and architectures. To load the site package, you will need to add the installation directory, which is typically <siteprefix>/lib/python<version>/site-packages, to your PYTHONPATH environment variable. It is worth noting again that because home file systems are mounted across multiple platforms, we generally advise against installing with the --user option, which installs packages in $HOME/.local. This may cause conflicts when trying to run Python codes across multiple OS versions or across various architectures.
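As an alternative to exporting PYTHONPATH in the shell, a script can add a side installation to its own search path. The sketch below assumes the conventional <siteprefix>/lib/python<version>/site-packages layout; the helper names are hypothetical, and some packages may install into a different subdirectory (such as lib64), so check where the files actually landed.

```python
import os
import sys


def site_packages_under(prefix):
    """Return the conventional site-packages directory under a --prefix install."""
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return os.path.join(prefix, "lib", version, "site-packages")


def add_side_install(prefix):
    """Append the side installation to sys.path if it exists; return the path checked."""
    path = site_packages_under(prefix)
    if os.path.isdir(path) and path not in sys.path:
        # Appending keeps LC-supported packages ahead of the side install.
        # PYTHONPATH entries, by contrast, are searched before the standard
        # library and the main site-packages directory.
        sys.path.append(path)
    return path


if __name__ == "__main__":
    prefix = os.path.join("/usr/workspace", os.environ.get("USER", ""), "local",
                          os.environ.get("SYS_TYPE", ""))
    print("side install directory:", add_side_install(prefix))
```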
Additional information about installing Python modules is available on the Installing Python Modules page.
Conda
There are certain scenarios where the LC Python and virtualenv do not suffice. Another option is to use the Conda package manager. One common reason to use Anaconda is to run the Spyder IDE. LLNL has procured a site-wide Anaconda license allowing users to install their own Anaconda distribution and install their own site-packages through the default Anaconda repository. LC has downloaded the installers, which are available in /collab/usr/gapps/python/$SYS_TYPE/conda/. On TOSS 4 systems, a base Anaconda installation can be found in /collab/usr/gapps/python/toss_4_x86_64_ib/anaconda3/bin/conda. Note that this is a rolling "default" symlink that may change to a newer version once available. To guarantee consistency, users can use the full path to a specific version, such as /collab/usr/gapps/python/toss_4_x86_64_ib/anaconda3-2023.03/bin/conda.
TensorFlow and PyTorch on LC Systems
LC does not directly support TensorFlow or PyTorch. However, guidance on how to build these packages is available on the LC wikis TensorFlow on LC and PyTorch on LC.
Parallel Python
For MPI parallelism in Python, users are advised to use the mpi4py package, which is included in LC Python installations.
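As a rough illustration of how an mpi4py script typically divides work, the sketch below queries the rank and size from MPI.COMM_WORLD and gives each rank a contiguous slice of the work items. The helper names and the serial fallback (used when mpi4py cannot be imported) are assumptions made for this example, not part of mpi4py or of the LC installation.

```python
def rank_and_size():
    """Return (rank, size) from MPI, or (0, 1) when mpi4py is unavailable."""
    try:
        from mpi4py import MPI  # importing this module initializes MPI
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def chunk_for_rank(n_items, rank, size):
    """Split range(n_items) into near-equal contiguous chunks, one per rank."""
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


if __name__ == "__main__":
    rank, size = rank_and_size()
    mine = chunk_for_rank(100, rank, size)
    print(f"rank {rank} of {size} handles {len(mine)} items starting at {mine.start}")
```

Launched with, for example, srun -n 4 python3 myscript.py, each of the four ranks would report its own share of the 100 items.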
Python 3
The executable for Python 3 is python3 as opposed to python. Loading a Python 3.X module on TOSS 3 and CORAL 1 systems will only affect the version of the python3 command and not the python command. With TOSS 4 this behavior has changed and loading a Python 3.X module will modify one's default python command. Users are still advised to explicitly run python2 or python3 to avoid confusion.
Jupyter Notebook and LC's JupyterHub
Jupyter Notebook is an open-source web application that allows users to create and share documents that contain live code, equations, visualizations, and narrative text. Jupyter Notebook runs as a web server, and thus LC users are not allowed to run their own installations of Jupyter (including Conda or similar distributions) on LC systems. LC does, however, deploy a modified JupyterHub service that adheres to LC security policies. Users may connect to this service at https://lc.llnl.gov/jupyter for the CZ and at https://rzlc.llnl.gov/jupyter for the RZ. The LC JupyterHub deployment allows users to select a login node that they have access to and remotely spawn a notebook on that login node. The default kernel has been set to Python 3; however, users may create a virtualenv installation to create their own custom kernels. More information on how to use the LC JupyterHub service is available at https://hpc.llnl.gov/services/jupyterhub-and-jupyter-notebooks and https://lc.llnl.gov/confluence/display/LC/JupyterHub+and+Jupyter+Notebook.
Useful Techniques
Below are a few basic, useful techniques for running Python. Users are also encouraged to share pointers on the Python tips and tricks wiki.
Using Python with Lmod Modules
You can use Lmod Modules to control which version of Python to use. For more information, see the TOSS 3 Lmod wiki.
To see which versions of Python are available:
[lee218@poodle18:~]$ module avail python
-------------------------- /usr/tce/modulefiles/Core ---------------------------
python/2.7.18 python/3.9.12 (D) python/3.10.8
Where:
D: Default Module
To use the default version of python3:
[lee218@poodle18:~]$ python3
Python 3.9.12 (main, Apr 15 2022, 09:20:22)
[GCC 10.3.1 20210422 (Red Hat 10.3.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
To use a specific version of python3:
[lee218@poodle18:~]$ module load python/3.10.8
[lee218@poodle18:~]$ python3
Python 3.10.8 (main, Jan 5 2023, 10:38:19)
[GCC 10.3.1 20210422 (Red Hat 10.3.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Invoking Python in a Script
Place the following line at the beginning of your Python script:
#! /bin/env python3
Change the permissions of your script to add execute privileges:
% chmod +x myscript.py
Run your script like an executable:
% ./myscript.py
Note: While you can also use #! /usr/tce/bin/python3, we do not recommend doing so because it will cause problems if you are using modules to manage which Python version you are running. Also, be aware that /bin/env python3 runs whichever python3 comes first in your $PATH, so the interpreter can change when you load a different module or activate a virtualenv. However, if your script depends on a specific Python version, you should use the full path.
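Because the interpreter chosen through /bin/env depends on the environment, it can be useful for a script to report which Python it is running under and to stop early if the version is too old. The following sketch shows one way to do that; the MINIMUM value and the function names are hypothetical and should be adapted to your script.

```python
#! /bin/env python3
import sys

MINIMUM = (3, 8)  # hypothetical requirement; adjust for your script


def interpreter_report():
    """Describe which interpreter is actually running this script."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return f"{sys.executable} (Python {version})"


def meets_minimum(minimum=MINIMUM):
    """True if the running interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= tuple(minimum)


if __name__ == "__main__":
    print(interpreter_report())
    if not meets_minimum():
        sys.exit(f"This script needs Python {MINIMUM[0]}.{MINIMUM[1]} or newer")
```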
Running Python at Scale
Due to its usage of shared libraries and how Python searches for site-packages, running Python at large node/task counts can have detrimental effects on shared file systems, impacting not only the performance of your script but also that of the entire compute center. To avoid overloading the file systems, users are advised to launch their large-scale Python jobs using Spindle. This is generally as simple as prepending spindle to one's command line, e.g., spindle srun python myscript.py myarg1 myarg2.
Python, Multiprocessing, srun, and mpibind
On node-scheduled LC clusters, mpibind is enabled by default. This is known to have a detrimental performance impact on Python scripts that use multiprocessing as the primary mode of parallelism, as all worker processes and threads launched by a given python process will be bound to a single socket. To work around this issue, users of python and multiprocessing are advised to explicitly disable mpibind, e.g., srun -n 1 --mpibind=off python my_mp_script.py. This allows the workers to migrate across sockets and thus take advantage of all CPUs/cores on a node.
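One way to see the effect of the binding is to compare the number of CPUs the process is allowed to run on with the number of CPUs on the node. The sketch below uses os.sched_getaffinity, which is available on Linux, and sizes a multiprocessing pool accordingly; the function names are illustrative, and the summed squares are just placeholder work.

```python
import multiprocessing
import os


def usable_cpu_count():
    """Number of CPUs this process may run on, honoring any CPU binding."""
    if hasattr(os, "sched_getaffinity"):  # Linux only
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def square(x):
    """Placeholder unit of work for the pool."""
    return x * x


if __name__ == "__main__":
    workers = usable_cpu_count()
    # os.cpu_count() reports all CPUs on the node, regardless of binding.
    print(f"{workers} of {os.cpu_count()} CPUs are usable by this process")
    with multiprocessing.Pool(processes=workers) as pool:
        print(sum(pool.map(square, range(1000))))
```

If the first number is much smaller than the second when run under srun, the job is likely still bound, and the --mpibind=off option described above is worth trying.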
Python Integrated Development Environments (IDEs)
For users who are looking for a Python IDE, we first recommend using VS Code. Documentation on how to set up VS Code on LC systems is available at Visual Studio (VS) Code | LLNL Developer Homepage.
Users who prefer the Spyder IDE may install it themselves using Anaconda (see the Conda section above). The base conda installation on LC TOSS 4 systems does contain a spyder executable, which can be accessed via /collab/usr/gapps/python/toss_4_x86_64_ib/anaconda3/bin/spyder.
Since IDEs run on LC systems may not perform well with X forwarding, users are advised to consider logging in to LC systems using VNC. Please refer to https://hpc.llnl.gov/software/visualization-software/vnc-realvnc for more information.
Documentation
A lot of documentation for Python is available on the Web and in published hardcopy. The Python Web site contains links that are extremely useful. Documents and information specific to LLNL or not easily found at the Python Web site are:
- Python Site Packages (LLNL)
- Debugging Python: Python script debugging using pdb
- Limited Python debugging capabilities are available in TotalView
- pyDoc: Documentation generator and online help system
If you know of other documentation that should be listed here, please contact the LC Hotline.
Troubleshooting
SSL Certificates
If you are getting SSL certificate errors, such as
requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)
you can set your REQUESTS_CA_BUNDLE environment variable to /etc/pki/tls/cert.pem. LC has patched its requests installations to automatically set this variable if it is not already defined.
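The variable can also be set from within a script before any HTTPS requests are made. The sketch below only sets it when it is not already defined, so a value exported in your shell still wins; the function name ensure_ca_bundle is hypothetical.

```python
import os

LC_CA_BUNDLE = "/etc/pki/tls/cert.pem"  # path given in the text above


def ensure_ca_bundle(path=LC_CA_BUNDLE):
    """Set REQUESTS_CA_BUNDLE unless it is already defined; return the value in effect."""
    return os.environ.setdefault("REQUESTS_CA_BUNDLE", path)


if __name__ == "__main__":
    bundle = ensure_ca_bundle()
    print(f"REQUESTS_CA_BUNDLE={bundle} (file exists: {os.path.exists(bundle)})")
```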
Some users have reported SSL errors such as:
SSLError: HTTPSConnectionPool(host='example.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1131)')))
It was discovered that this happens when conda installs openssl 3.X. This may be fixed by forcing conda to downgrade openssl to 1.1.1:
conda install --force-reinstall openssl=1.1.1
CXXABI
On TOSS 3 x86 systems, the LC Python installations and their site packages were built with gcc-4.9.3; however, many site-packages that users add require a newer libstdc++, resulting in errors such as:
ImportError: /usr/tce/packages/gcc/gcc-4.9.3/lib64/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /path/to/some_extension.so)
The best way to work around this issue is to force your Python script to load the libstdc++.so from a newer gcc build. It is important that this be loaded before loading any other modules that may bring in the version 4.9.3 libstdc++.so. For example, below is the code required to load the 8.1.0 libstdc++.so:
import ctypes
ctypes.CDLL("/usr/tce/packages/gcc/gcc-8.1.0/lib64/libstdc++.so")
import some_extension
mpi4py one-sided communication
If locks don't appear to be working as expected using mpi4py's Win class, setting MPICH_ASYNC_PROGRESS=1 may resolve the issue.
h5py on CORAL systems
The default h5py on CORAL systems was built with MPI support. If run outside of jsrun/lrun (i.e., directly invoking python and importing h5py), users may receive an error/warning message or get a segmentation fault. Users who need to use h5py when directly running python outside of jsrun will need to use a side-installed, non-MPI h5py by adding the appropriate directory to their PYTHONPATH environment variable. These side installs can be found in /usr/tce/packages/python/python-2.7.16/lib/python2.7/h5py-serial-2.7.16, /usr/tce/packages/python/python-3.7.2/lib/python3.7/h5py-serial-3.7.2, and /usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial. For example:
% lrun -n 1 python- -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-2.7.16/lib/python2.7/site-packages/h5py/__init__.pyc
% python -c 'import h5py; print(h5py.__file__)'
Segmentation fault
% setenv PYTHONPATH /usr/tce/packages/python/python-2.7.16/lib/python2.7/h5py-serial-2.7.16
% python -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-2.7.16/lib/python2.7/h5py-serial-2.7.16/h5py/__init__.pyc
% lrun -n 1 python3-3.7.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.7.2/lib/python3.7/site-packages/h5py/__init__.py
% python3-3.7.2 -c 'import h5py; print(h5py.__file__)'
Segmentation fault
% setenv PYTHONPATH /usr/tce/packages/python/python-3.7.2/lib/python3.7/h5py-serial-3.7.2
% python3-3.7.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.7.2/lib/python3.7/h5py-serial-3.7.2/h5py/__init__.py
% lrun -n 1 python3-3.8.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial/h5py/__init__.py
% python3-3.8.2 -c 'import h5py; print(h5py.__file__)'
Segmentation fault
% setenv PYTHONPATH /usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial
% python3-3.8.2 -c 'import h5py; print(h5py.__file__)'
/usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial/h5py/__init__.py
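For scripts that are always run serially, the same effect as setting PYTHONPATH can be approximated inside the script by placing the serial side install at the front of sys.path before h5py is imported. This is only a sketch: the function name prefer_serial_h5py is hypothetical, and the directory shown in the usage section is one of the side-install paths listed above.

```python
import os
import sys


def prefer_serial_h5py(serial_dir):
    """Put a serial h5py side install ahead of the default one on sys.path.

    Must be called before the first 'import h5py'. Returns False (and leaves
    sys.path untouched) if the directory does not exist on this system.
    """
    if not os.path.isdir(serial_dir):
        return False
    if serial_dir in sys.path:
        sys.path.remove(serial_dir)
    sys.path.insert(0, serial_dir)
    return True


if __name__ == "__main__":
    serial_dir = "/usr/tce/packages/python/python-3.8.2/lib/python3.8/h5py-serial"
    if prefer_serial_h5py(serial_dir):
        print("serial h5py directory placed first on sys.path")
    else:
        print("serial h5py directory not found; sys.path unchanged")
```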