VNC: NICE DCV

Overview

NICE DCV is a 3D/GLX-capable Virtual Network Computing (VNC) server that provides a securely authenticated and encrypted way for users to create and view a virtual desktop with 3D/GLX applications that persists even if no client is actually viewing it. As an example, let's say you are at an airport getting ready to fly to Texas for a conference, and you wish to create an animation with VisIt, an OpenGL application, that is going to take a couple of hours to create. You log into the Pascal cluster, launch a batch job, run a NICE DCV session on the batch node, and connect to that virtual desktop session. You launch VisIt inside the DCV session, click the button to create your animation, and then disconnect from the session and catch your flight. You can then reconnect to your DCV session from the comfort of your hotel in Texas and view VisIt's window, which has been happily updating the whole time. Science is saved again!

As in the example above, NICE DCV allows users to start any GUI-based program on the cluster that opens a main window, and then disconnect from the cluster without closing the main window and exiting the program. The user can then reconnect later to the virtual desktop to view any changes in their program. NICE DCV at LLNL is secure, highly performant, and supports OpenGL programs such as VisIt and ParaView. You may connect to your DCV virtual desktop with a DCV "thick" client on a Windows or Linux platform, or via a web browser from Windows, macOS, or Linux. DCV shares functionality with RealVNC; however, DCV only runs on the batch nodes (not the login node) and exits when your batch job is over. For these reasons, if you do not need the 3D/GLX capability, it is recommended that you use RealVNC instead, which runs independently from Slurm and provides longer-term virtual sessions, albeit with 2D graphics.

Environment

Machines

NICE DCV is installed only on the batch nodes of Surface and Pascal on the CZ, Rzhasgpu on the RZ, and the GPU nodes of Max on the SCF.

Location

NICE DCV is accessed by running /usr/bin/dcvsession.

Settings

No special settings are necessary for NICE DCV.

Usage

  1. Get the proper account(s) and bank(s). You must have an account on either the pascal or rzhasgpu clusters to use DCV. You can request an account through the ID Management system. Once you have an account, you must also request a bank through the LC Hotline; call extension 2-4531 to set this up.
  2. Log into pascal.llnl.gov or rzhasgpu.llnl.gov and, from the login node, reserve a batch node. Contact the hotline if this is unclear or you don't have a bank. Having an account does not mean you have a bank!
    1. UPDATE: As of Feb. 2019, the --load-nvcache option to salloc or sbatch should not be used.
    2. If you will be reserving the node from a persistent terminal, such as on your workstation in your office, you may use the salloc command.
      1. Reserve a node with the salloc command; run with no options, salloc reserves one node in the default Slurm partition for the default length of time. You may optionally request a different length of time (-t <HH:MM:SS>) or supply other Slurm options.
        1. Note: On Max, not all compute nodes have GPUs; therefore, you must specify the -p pgpu option to salloc or sbatch to obtain a compute node containing a GPU.
      2. On the batch node, run /usr/bin/dcvsession -o <YOUR_OS_TYPE>
        • <YOUR_OS_TYPE> is "osx", "lin", or "win".
        • Use the -h flag with dcvsession to see available options.
        • Note: Read the output from dcvsession carefully, as it will provide instructions to connect to your batch node. Instructions may differ between RZ and CZ hosts.
        • Note for Windows users: Windows does not include SSH by default. You can either install Cygwin and use SSH in that environment, or you can install PuTTY with plink.exe from the PuTTY website. For more information about plink, see the PuTTY documentation. dcvsession will assume you are using plink, but if you are using SSH with Cygwin, simply use "lin" for your OS type instead of "win".
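      3. For example, a complete interactive sequence might look like the following (a minimal sketch; the batch node name and walltime are illustrative, so adjust the options to your needs):
        [<user>@pascal83:~]$ salloc -N 1 -t 4:00:00
        [<user>@pascal10:~]$ /usr/bin/dcvsession -o lin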
    3. If you will be submitting your job from a terminal that will be closed, such as on your laptop or a VPN session that will end, you should use the sbatch command instead of salloc. With this method, you will need to create a small job script, submit it, and then look at the job output for instructions on connecting to your DCV session.
      1. Below is a sample job script that starts a DCV session and then sleeps for a long period of time, presumably longer than you will want the node reserved.
        1. [<user>@pascal83:~]$ cat dcv.sh

          #!/bin/sh
          # Start a DCV session for a Windows client (use "lin" or "osx" as needed)
          /usr/bin/dcvsession -o win
          # Keep the job, and thus the DCV session, alive for up to 10 days
          sleep 10d

      2. You should then submit the job using sbatch: sbatch dcv.sh, adding additional Slurm options if desired.

      3. Finally, look at the contents of slurm-<job#>.out to view instructions on connecting to your DCV session.
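
      4. For example, a minimal sketch of the batch workflow (the walltime is illustrative, and squeue is the standard Slurm command for checking job status):

        sbatch -t 8:00:00 dcv.sh
        squeue -u $USER              # confirm the job is running
        cat slurm-<job#>.out         # connection instructions appear here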

  3. To access your virtual desktop you may use a web browser or, if using Linux or Windows, a "thick" client. The DCV Endstation client may be downloaded and installed from the vendor's website. Make sure you download the Client and not the Server software.
  4. Follow the instructions printed in the output of the dcvsession command to connect to your session.
    1. If using a web browser, make sure you use

      https://localhost:<port number>

      in the URL, where the port number is the first (left-most) number in your SSH tunnel.
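
      For example, if dcvsession instructs you to create a tunnel such as the following (the batch node name and port are hypothetical; use the exact values printed for your session):

        ssh -fN -L 8443:pascal10:8443 <user>@pascal.llnl.gov

      you would then browse to https://localhost:8443.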

    2. If using the thick client, you can also set up a shortcut to connect to a cluster once the proxy server is running. To do so, create a file named <cluster>.dcv on your desktop or home directory, where <cluster> is the name of the cluster you are connecting to (pascal or rzhasgpu). This only ever needs to be configured once per cluster on your workstation.
      1. For the CZ system, add the following lines:
        [version]
        format=1.0

        [connect]
        user=<YOUR LC USER NAME>
        proxyhost=127.0.0.1
        proxyport=1080
        proxytype=Socks5
      2. For the RZ system, add the following lines (note the different port on the proxyport line):
        [version]
        format=1.0

        [connect]
        user=<YOUR LC USER NAME>
        proxyhost=127.0.0.1
        proxyport=1081
        proxytype=Socks5
      3. Start an SSH SOCKS proxy server on your workstation to tunnel traffic to and from the allocated batch node. On the RZ, you'll need to set up an intermediary tunnel first.
        1. When using Pascal, type:
          1. ssh -fN -D 1080 <user>@pascal
        2. When using Rzhasgpu, first set up a tunnel through the RZ gateway, and then start the proxy server directed at the forwarded port on your workstation:
          1. ssh -fN -L 11081:rzhasgpu:22 rzgw
          2. ssh -fN -D 1081 -p 11081 localhost
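        Before launching the client, you can optionally confirm that the proxy is listening (assuming nc/netcat is installed on your workstation; use port 1081 on the RZ):
          nc -vz localhost 1080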
      4. Launch your DCV Endstation client and connect to the hostname of the allocated batch node.
  5. To exit your DCV session, log out of the virtual desktop, or simply exit your interactive Slurm session from step 2.
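    If you reserved the node with sbatch rather than salloc, the job keeps running after you log out of the desktop; you can release the node early by cancelling the job with the standard Slurm command (where <job#> is the job number reported by sbatch):

      scancel <job#>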

Help

Troubleshooting Tips

  • If you get a message similar to bind: Address already in use when trying to set up the SSH proxy connection:
    • Determine whether you have already established the proxy (you generally only set it up once, until your machine is rebooted). Try connecting to the port with the nc or netcat command; if it reports a connection, the proxy is already running and you can proceed past the dcvsession prompt to set up the proxy:
      • ~$ nc -v localhost 1080
        Ncat: Version 7.60 ( https://nmap.org/ncat )
        Ncat: Connected to ::1:1080.
    • Try using a different port. For instance, use port 1082 instead of 1080. You'll then need to change your .dcv file to use the same port number.
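      • A minimal sketch of that workaround (the port choice is arbitrary; any free local port works as long as your .dcv file matches):
        ssh -fN -D 1082 <user>@pascal    # start the proxy on the alternate port
        # then set proxyport=1082 in your <cluster>.dcv file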

Help is also available from the LC Hotline: lc-hotline@llnl.gov, (925) 422-4531.

UCRL-MI-128467-REV-1