VNC: NICE DCV

Overview

NICE DCV is a 3D/GLX-capable Virtual Network Computing (VNC) server that provides a securely authenticated and encrypted way for users to create and view a virtual desktop with 3D/GLX applications that persists even if no client is actually viewing it. As an example, let's say you are at an airport getting ready to fly to Texas for a conference, and you wish to create an animation with VisIt, an OpenGL application, that is going to take a couple of hours to render. You log into a DCV cluster (pascal, for example), launch a batch job, run a NICE DCV session on the batch node, and connect to that virtual desktop session. You launch VisIt inside the DCV session, click the button to create your animation, and then disconnect from the session and catch your flight. You can then reconnect to your DCV session from the comfort of your hotel in Texas and view VisIt's window, which has been happily updating the whole time. Science is saved again!

As in the example above, NICE DCV allows users to start any GUI-based program on the cluster that opens a main window, then disconnect from the cluster without closing the window or exiting the program. The user can reconnect later to the virtual desktop and view any changes in their program. NICE DCV at LLNL is secure, highly performant, and supports OpenGL programs such as VisIt and ParaView. You may connect to your DCV virtual desktop with a DCV "thick" client on a Windows or Linux platform, or via a web browser from Windows, macOS, or Linux. DCV shares functionality with RealVNC; however, DCV only runs on compute nodes (not login nodes) and exits when your batch job ends. For these reasons, if you do not need 3D/GLX capability, it is recommended that you use RealVNC instead, which runs independently of Slurm and provides longer-lived virtual sessions, albeit with 2D graphics.

Environment

DCV Clusters

NICE DCV is only installed on the compute nodes of specific LC Linux systems with GPUs. As of January 2020, these systems include:

  • CZ:  pascal, surface
  • RZ:  rzhasgpu
  • SCF:  max - GPU nodes only (pgpu partition)

Location

NICE DCV is accessed by running /usr/bin/dcvsession on a GPU compute node in one of the DCV clusters mentioned above. Note that this software is not installed on login nodes or on other clusters.
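
Once you have a compute node on one of these clusters, a quick sanity check confirms the software is available (a hypothetical session; prompts and output will vary):

    $ which dcvsession
    /usr/bin/dcvsession
    $ dcvsession -h       # lists the available options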

Settings

No special settings are necessary for NICE DCV.

Accounts and Banks

You must have an account on the DCV cluster(s) where you intend to do your work. These systems are listed above. For details on requesting an LC account, see https://hpc.llnl.gov/accounts/new-account-setup. When you request an account, you should also specify the appropriate bank; otherwise, you will need to request a bank later through the LC Hotline. Contact the Hotline if this is unclear or you don't have a bank. Having an account does not always mean you have a bank!
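
If you are unsure which banks you have on a given cluster, standard Slurm accounting commands can list your associations. This is a hedged sketch - the available fields may differ on your system, and the LC Hotline can always confirm:

    # Run on the cluster in question; shows the banks (Slurm accounts) tied to your user
    sacctmgr show associations user=$USER format=Cluster,Account,Partition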

Usage

There are two different usage cases, with instructions for each below. 

  1. Interactive: Your workstation's connection to the DCV session stays up the whole time. That is, you won't log off, close your terminal window, or lose your VPN connection (if you use one) during your DCV session.
  2. Batch: You may disconnect from your DCV session and/or plan to reconnect to it later, possibly from somewhere else.

In both cases, the instructions include steps that must be performed on the LC DCV cluster, and steps that must be performed on your local workstation.  Also, instructions will differ for Mac/Linux and Windows machines.

QUICKSTART: For convenience, if you've already done all of the one-time software installation/setup before, and just want to see the commands to get a DCV session going, see the Quickstart section near the end of this document.

Interactive Usage Instructions - to be performed on the DCV cluster

  1. Log into a DCV cluster where you have an account/bank. This will place you on a login node.
  2. Reserve your compute node(s) using salloc. Some examples with notes are shown below - see the salloc man page for additional options:

    salloc                   allocate 1 node, default time limit and default partition.
    salloc -N 8 -t 4:00:00   allocate 8 nodes for 4 hours, default partition
    salloc -t 120            allocate 1 node for 120 minutes, default partition
    salloc -p pgpu           allocate 1 node, default time limit, in the pgpu partition (required for max)
    salloc -N 2 -p pvis      allocate 2 nodes, default time limit, in the pvis partition (such as on pascal)
  3. When your salloc command completes, you will be placed on a compute node. Then start the DCV session by running:

    dcvsession -o <YOUR_OS_TYPE>

    where <YOUR_OS_TYPE> is either osx (Mac), lin (Linux) or win (Windows).  Use dcvsession -h  to see other options.
  4. Carefully review the output from dcvsession; it provides instructions on how to connect to your DCV session from your workstation. Instructions may differ between RZ and CZ hosts, and will differ between Mac/Linux and Windows.
  5. You are now ready to follow the instructions for your local workstation - see the DCV Instructions For Your Local Workstation section below.
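
For orientation, a hypothetical end-to-end interactive startup on a CZ cluster looks like this (the cluster, partition, node names, and username are placeholders - your own dcvsession output is authoritative):

    [joeuser@pascal83 ~]$ salloc -N 1 -p pvis -t 2:00:00     # run on the login node
    salloc: Granted job allocation 123456
    [joeuser@pascal23 ~]$ dcvsession -o osx                  # now on the compute node
    ... connection instructions for your workstation are printed here ...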

 

Batch Usage Instructions - to be performed on the DCV cluster

  1. Log into a DCV cluster where you have an account/bank. This will place you on a login node.
  2. Create a small job script that requests your batch node(s), job time limit, and partition, and then launches your DCV session. If you do not specify these, the system defaults will be used, which may not be adequate for your purposes. For details on creating job scripts, see Building a Job Script. A sample job script that allocates 4 nodes for 8 hours in the pvis partition is shown below. Note that it sleeps for a long time - presumably longer than you will need the DCV session and its batch nodes for your work.

    $ cat dcv.sh
    #!/bin/sh
    # Request 4 nodes for 8 hours in the pvis partition
    #SBATCH -N 4
    #SBATCH -t 8:00:00
    #SBATCH -p pvis
    # Start the DCV session (match -o to your workstation OS: win, osx, or lin),
    # then sleep to keep the allocation - and the DCV session - alive
    /usr/bin/dcvsession -o win
    sleep 10d
  3. Submit your job using sbatch: sbatch dcv.sh. You can add additional sbatch options if desired - see the sbatch man page for details.
  4. After your job starts to run (squeue and other commands can be used to monitor this), look at the contents of your job's output file (named slurm-<job#>.out by default) and carefully review the instructions on connecting to your DCV session from your local machine. A hypothetical example of this sequence is sketched after this list.
  5. You are now ready to follow the instructions for your local workstation - see the DCV Instructions For Your Local Workstation section below.
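
The submit-and-monitor sequence from steps 3 and 4, sketched with a hypothetical job ID (yours will differ):

    $ sbatch dcv.sh
    Submitted batch job 123456
    $ squeue -u joeuser        # wait for the job to reach the R (running) state
    $ cat slurm-123456.out     # dcvsession's connection instructions appear here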

 

DCV Instructions For Your Local Workstation

These instructions should be performed on your local workstation after you have completed the instructions discussed above for Interactive Usage or Batch Usage. They will differ between Mac/Linux and Windows.

Windows Instructions

One time only: Install SSH software on your Windows workstation, since Windows does not include SSH by default. Two options are described here:

  1. Install PuTTY with plink.exe from the PuTTY website. The plink.exe executable provides a command line interface that can be used from a Windows DOS command (cmd) terminal window on your workstation. For more information about plink.exe, see the PuTTY documentation.
  2. Install Cygwin and use SSH in that environment. Note that if you select this option, you will need to use the -o lin option when you start dcvsession on the LC cluster, as discussed above for Interactive Usage or Batch Usage.  Then skip the remainder of these Windows instructions and follow the Mac/Linux Instructions below.

Web Browser Method

1. Open a DOS command terminal window on your Windows workstation.  Simply type cmd in the Windows search box and a black terminal window should appear.

2. Follow the instructions in the output of the dcvsession command you started on an LC cluster compute node, as discussed under Interactive Usage or Batch Usage above. For example, if you are using plink.exe, type the following command in your DOS cmd terminal window (use the actual command shown in your dcvsession output):

plink.exe  -L 8443:pascal23:8443 joeuser@pascal sleep 180
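
For reference, the pieces of this example command (the hostnames and username are placeholders from the sample output) are:

    -L 8443:pascal23:8443   forward local port 8443 to port 8443 on compute node pascal23
    joeuser@pascal          log in to the cluster's login node (pascal) as joeuser
    sleep 180               a placeholder remote command that holds the tunnel open;
                            the tunnel closes after roughly 180 seconds if unused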


3. Note: the plink.exe command will need to be in your Windows path. The easiest way to ensure this is to simply cd to the directory where you installed it.

4. You will be prompted to authenticate - use your LC password (PIN + token).  If successful, continue to the next step. Do NOT type anything else into your DOS cmd terminal window, as this may terminate your connection.

5. In a web browser on your local Windows workstation, go to the URL shown in the output from your dcvsession command. For example: https://localhost:8443. This should connect you to a DCV authentication window. Login with your LC username and RSA token credentials.

6. You should then see your "virtual desktop" running on the compute node where your dcvsession command was run.

7. You can now use your virtual desktop to do work on your compute node allocation. For example, you could open a terminal/konsole and then launch a parallel job.

8. To exit your DCV session, log out of the virtual desktop. Or, you may simply exit your interactive Slurm session if you are not running in batch mode.
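
If you started the session as a batch job, the sleep in the job script keeps the allocation alive even after you log out of the desktop. A hypothetical way to release the nodes explicitly (the job ID will differ):

    $ squeue -u joeuser     # find your DCV job's ID
    $ scancel 123456        # cancel the job and release the nodes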

 

Windows DCV "Thick Client" Method

As an alternative to the web browser method above, you can use the "DCV Endstation" client. It can be downloaded and installed from the vendor's website. Make sure you download the Client and not the Server software.

  1. After starting dcvsession on your compute node(s) as described in the Interactive Usage or Batch Usage sections above, launch your DCV Endstation client and follow the thick-client instructions in the output of your dcvsession command.
  2. You can also set up a shortcut to connect to a cluster once the proxy server is running. To do so, create a file named <cluster>.dcv on your desktop or home directory, where <cluster> is the name of the cluster you are connecting to. This only needs to be configured once per cluster on your workstation.
    For a CZ system, add the following lines:
    [version]
    format=1.0
    [connect]
    user=<YOUR LC USER NAME>
    proxyhost=127.0.0.1
    proxyport=1080
    proxytype=Socks5

    For an RZ system, add the following lines (note the different proxyport):
    [version]
    format=1.0
    [connect]
    user=<YOUR LC USER NAME>
    proxyhost=127.0.0.1
    proxyport=1081
    proxytype=Socks5


  3. Start an SSH SOCKS proxy server on your workstation to tunnel traffic to and from the allocated compute node.
    For a CZ cluster, type:
    ssh -fN -D 1080 <user>@pascal

    For an RZ cluster, first set up a tunnel through the RZ gateway, and then start the proxy server directed at the forwarded port on your workstation:
    ssh -fN -L 11081:rzhasgpu:22 rzgw
    ssh -fN -D 1081 -p 11081 localhost


  4. Launch your DCV Endstation client and connect to the hostname of the allocated batch node.
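
Putting the pieces together, a hypothetical CZ thick-client session might look like this (the username and node names are placeholders):

    # Once per reboot: start the SOCKS proxy on your workstation
    ssh -fN -D 1080 joeuser@pascal
    # Then open pascal.dcv (or launch DCV Endstation directly) and connect to the
    # compute node hostname (e.g., pascal23) reported by your dcvsession output.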
     

Mac / Linux Instructions

Linux and macOS users must use a web browser to access their virtual sessions.

  1. First, you will need to set up an SSH tunnel to the compute node(s) you reserved. On your local workstation in a terminal window, use the actual command that appears in the output of dcvsession as described in the Interactive Usage or Batch Usage sections above. For example:

    ssh -f -L 8443:pascal16:8443 joeuser@pascal sleep 180

    Note: the number at the end (180 here) is the time in seconds after which the SSH tunnel closes if unused; it is configurable. (A variation using a different local port is sketched after this list.)
  2. In a web browser on your local workstation, go to the URL shown in the output from your dcvsession command. For example: https://localhost:8443. This should connect you to a DCV authentication window.
  3. Login with your LC username and RSA token credentials. You should then see your "virtual desktop" on the compute node where your dcvsession command was run.
  4. You can now use your virtual desktop to do work on your compute node allocation. For example, you could open a terminal/konsole and then launch a parallel job.
  5. To exit your DCV session, log out of the virtual desktop.
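
If local port 8443 is already taken on your workstation (by another tunnel, for example), you can forward a different local port - a hypothetical variation:

    # Forward local port 9443 instead, then browse to https://localhost:9443
    ssh -f -L 9443:pascal16:8443 joeuser@pascal sleep 180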

 

Quickstart

This section provides a quick summary of commands/actions for getting a DCV session going, provided you've already performed all of the one-time software installation and setup requirements.

Task / Step                                  Where        Command / Notes
1. Acquire compute node(s) on a DCV cluster  DCV cluster  salloc (interactive)
                                                          sbatch myjobscript (batch)
2. Start dcvsession on a compute node        DCV cluster  dcvsession -o win (Windows)
                                                          dcvsession -o osx (Mac)
                                                          dcvsession -o lin (Linux)
3. Open a local terminal window              Workstation  Windows: open a DOS cmd window
                                                          Mac/Linux: open a terminal window as usual
4. Connect to the compute node DCV session   Workstation  Windows: plink.exe -L 8443:pascal23:8443 joeuser@pascal sleep 180
                                                          Mac/Linux: ssh -f -L 8443:pascal23:8443 joeuser@pascal sleep 180
                                                          Note: use the actual command from the dcvsession output in step 2
5. Open the virtual desktop in a browser     Workstation  https://localhost:8443
                                                          Note: use the actual URL from the dcvsession output in step 2

 

Help

Troubleshooting Tips

  1. For the Windows DCV thick client: if you get a message similar to bind: Address already in use when setting up the SSH proxy connection:
    • Determine whether you have already established the proxy (you generally only set it up once until your machine is rebooted). Try connecting to the port with the nc or netcat command; if it reports "Connected", the proxy is already running and you can proceed past dcvsession's proxy-setup step:
      $ nc -v localhost 1080
      Ncat: Version 7.60 ( https://nmap.org/ncat )
      Ncat: Connected to ::1:1080.
    • Try using a different port. For instance, use port 1082 instead of 1080. You'll then need to change your .dcv file to use the same port number, as sketched below.
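
A hypothetical example of moving to port 1082 - the proxy and the .dcv file must agree:

    # Start the SOCKS proxy on port 1082 instead of 1080 (CZ example)
    ssh -fN -D 1082 joeuser@pascal
    # Then edit <cluster>.dcv so it matches:
    #   proxyport=1082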

Help is also available from the LC Hotline: lc-hotline@llnl.gov or (925) 422-4531.

 

 

UCRL-MI-128467-REV-1