The Green Data Oasis (GDO) facilitates collaboration between LLNL researchers and external collaborators by providing an easy way to share data. It was designed as a data portal and is not intended for data analysis or other CPU-intensive activities.
Account Requests and Disk Allocations
GDO accounts and disk allocations are approved by the Multiprogrammatic and Institutional Computing (M&IC) Program Leaders and are renewed on a yearly basis. Account requesters must meet these requirements:
- Project is associated with the M&IC Program or has a Grand Challenge or LDRD allocation.
- Project has a U.S.-citizen LLNL employee to manage data on the GDO.
- Data to be shared has been released for unlimited distribution.
All users of the system must sign an acknowledgment agreeing to the GDO terms and conditions. When a project's allocation has expired, user accounts and data associated with that project will be removed. Log-in accounts can only be granted to U.S. citizens. Foreign national project members will be able to access the GDO in the same way as external collaborators (see below).
Nature of Data Allowed on the GDO
All LLNL data placed on the GDO by a user must first be authorized for general distribution through the Laboratory's Information Management (IM) procedures. See the data review and release FAQ for more details. No Unclassified Controlled Information (UCI) is allowed anywhere on the system. Data on the GDO from external sources must be part of an ongoing collaboration that is associated with an LLNL science project. External data must be checked for validity by an LLNL data custodian (a U.S.-citizen project member) before it is made available for others to view. External data need not go through the Lab's IM process, but it must not carry any legal obligation for the Lab to protect it. It is the project's responsibility to ensure that the data has no associated legal implications.
Local (LLNL) project members will have log-in access, using two-factor authentication as provided by RSA tokens (i.e., one-time passwords). External collaborators will not have log-in accounts. They will be able to retrieve data using FTP, HTTP, or another approved protocol. Projects can install software, but any server software (e.g., software that enables access from off-machine) first requires the approval of the GDO project leader. In some cases, collaborators will be able to upload data to the GDO. The system will provide a means of specifying access control that limits the hosts from which, and the times during which, an upload can occur. A data check by a responsible project data custodian will be required before the data can be moved into a publicly readable area. The data custodian must be a U.S. citizen. See the rules in the Nature of Data Allowed on the GDO section for the type of external data that is allowed on the GDO.
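As an illustration only, the host and time restrictions on uploads described above could be modeled as follows. This is a minimal Python sketch under stated assumptions, not the GDO's actual mechanism: the `UploadWindow` class, the `upload_allowed` function, and the example network and hours are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time
from ipaddress import ip_address, ip_network
from typing import Sequence


@dataclass(frozen=True)
class UploadWindow:
    """Hypothetical rule: one permitted source network and a daily time range."""
    network: str  # CIDR block of hosts allowed to upload
    start: time   # window opens (inclusive)
    end: time     # window closes (exclusive)


def upload_allowed(host: str, when: datetime, windows: Sequence[UploadWindow]) -> bool:
    """Return True if `host` may upload at `when` under any configured window."""
    addr = ip_address(host)
    for w in windows:
        in_network = addr in ip_network(w.network)
        in_hours = w.start <= when.time() < w.end
        if in_network and in_hours:
            return True
    return False  # default deny: anything not explicitly permitted is refused


# Example values are made up (192.0.2.0/24 is a documentation-only range).
WINDOWS = [UploadWindow("192.0.2.0/24", time(8, 0), time(18, 0))]
```

The sketch denies by default, so an upload is accepted only when both the source host and the time of day match a configured window. Even then, the data would still need the custodian's check before being moved to a publicly readable area.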
Networking and Connectivity
A project's GDO disk space can be NFS-exported (read-only) to another host on the green network. This will allow data-intensive visualization or analysis to be done on a project's local green network servers. The GDO will not be NFS-exported to any host on the yellow network, nor will any other special trust relationship exist between the GDO and the yellow network. All traffic between the GDO and the yellow network will go through the standard Lab network channels. All network traffic to and from the GDO will be closely monitored. For those projects without green network computing resources, the LC-managed system called Ebert is a viable option. Contact the LC Hotline or visit ICC's High Performance Computing Resources for details about Snowbert.
Backups and Data Loss
Data on the GDO RAID system will not be backed up. Users are responsible for backing up this data to other systems, such as the High Performance Storage System (HPSS). Other partitions on the GDO, including user home directories and system directories, will be backed up. The GDO uses bleeding-edge technology. While every reasonable precaution against data loss will be taken, hardware crashes and disk failures will occur, possibly resulting in loss of data.
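Because users are responsible for their own backups of RAID data, it may help to record checksums before copying data elsewhere so that a restored copy can be verified. The following is a minimal Python sketch of that idea. It does not perform the transfer to HPSS or any other system, and the function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical helpers rather than part of any GDO or HPSS tooling.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to `root`) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """List paths from `before` that are missing or differ in `after`."""
    return sorted(p for p, d in before.items() if after.get(p) != d)
```

A manifest built before the backup can be compared with one built from a restored copy. An empty result from `changed_files` indicates that every original file is present with identical contents.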