NOTE: The term "S3" is often overloaded and can quickly lead to confusion if the parties in a conversation make assumptions without verifying them. S3 is both a service by Amazon Web Services (AWS) and a protocol.
More accurately described as "AWS S3", the service provides object storage cloud capacity off-premises for a fee and can be thought of as "the cloud" proper. It happens that AWS S3 uses a specific protocol to communicate with its object storage service known as the "S3 protocol," also commonly referred to as simply "S3."
We recommend checking with your communication participants regularly to ensure these distinctions are understood.
As the S3 protocol is the de facto standard for communication with both AWS S3 and various on-premises object storage solutions, it is the protocol we aim to serve. To learn more about object-based storage, please visit Wikipedia.
The following is not intended to be a full tutorial. These are basic commands to get started. Please refer to the s3api documentation for more info on AWS CLI functional uses.
On-premises Object Storage
Today, LC provides object storage compatible with the S3 protocol, backed by NetApp's StorageGRID product. This offering is intended for basic individual-user use cases (read: not group data-sharing repositories), with allotments of 4 TB or less in an HPC-adjacent instance.
In the near future, LC intends to greatly expand its object storage offerings (using the S3 protocol), adding both large-capacity high-performance storage and long-term streaming media archival storage alongside our current StorageGRID solution.
Provisioning
First, you'll need to provision some on-premises object storage capacity if you haven't done so already. This is a one-time step. LC users can self-provision by navigating to LaunchIT, a web "store front" for HPC-adjacent databases, message brokers, and object storage within LC. In the CZ and the SCF, the URL is launchit.llnl.gov. For the RZ, the URL is rzlaunchit.llnl.gov. Once logged in, look for "Object Storage Allocations" in the menu at the top.
Basic Setup
After allocating S3 object storage via LaunchIT, you will need to configure a profile using the AWS command line interface (CLI). The CLI described below works against both the LC on-premises object store(s) and, where the network enclave permits access, AWS S3.
In LC, version 2 of the AWS CLI is installed at /usr/global/awscli/aws. You can either run the aws command with the full path or add this directory to your $PATH variable, for example:
export PATH=$PATH:/usr/global/awscli
To begin creating a profile called lc-objstore with the AWS CLI, run:
aws configure --profile lc-objstore
After running this command, you'll be prompted to provide the access-key and secret-key provided to you with your object storage account. Below is an example of the prompts you'll see. The values "my-access-key" and "my-secret-key" should be replaced with your information, and you can simply hit Enter to pass the prompts for "Default region name" and "Default output format":
AWS Access Key ID [None]: my-access-key
AWS Secret Access Key [None]: my-secret-key
Default region name [None]: [enter]
Default output format [None]: [enter]
After completing the above, update your profile to use a locally customized certificate bundle by running:
aws configure --profile lc-objstore set ca_bundle /etc/pki/tls/cert.pem
Note: replace "lc-objstore" with the name of the profile you've created.
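To confirm what the two `aws configure` commands wrote, you can inspect the CLI's configuration file. The sketch below is illustrative only: it assumes the default file location `~/.aws/config` and the standard `[profile NAME]` section naming used by the AWS CLI, and the helper names `read_profile_settings` and `read_profile_from_disk` are hypothetical. The access key and secret key are normally stored separately in `~/.aws/credentials`, so they are not expected to appear here.

```python
import configparser
from pathlib import Path


def read_profile_settings(config_text, profile="lc-objstore"):
    """Return the settings stored for one AWS CLI profile as a dict.

    In the AWS CLI config file, named profiles live in sections called
    "[profile NAME]"; only the default profile uses a bare "[default]".
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = "default" if profile == "default" else f"profile {profile}"
    if not parser.has_section(section):
        return {}
    return dict(parser.items(section))


def read_profile_from_disk(profile="lc-objstore", path=None):
    """Read the profile from the config file (assumed default: ~/.aws/config)."""
    config_path = Path(path) if path else Path.home() / ".aws" / "config"
    if not config_path.exists():
        return {}
    return read_profile_settings(config_path.read_text(), profile)


if __name__ == "__main__":
    settings = read_profile_from_disk("lc-objstore")
    print(settings.get("ca_bundle", "ca_bundle is not set for this profile"))
```

If `ca_bundle` is missing from the result, rerunning the `aws configure ... set ca_bundle` command above is the likely fix.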
Below are examples both at the shell and using the boto3 python library.
S3 at the shell
The S3 provisioned for you should come with an existing bucket. In the sections below, you'll see how to list existing buckets, modify buckets (uploading and deleting objects from buckets), query buckets (listing and downloading objects from a bucket), and create new buckets.
List buckets
aws s3api --profile lc-objstore --endpoint-url https://osgha.llnl.gov list-buckets
Note: replace "lc-objstore" with the name of the profile you've created.
After this, you'll see output similar to:
{
    "Owner": {
        "DisplayName": "<username>",
        "ID": "<Account ID>"
    },
    "Buckets": [
        {
            "CreationDate": "2022-01-22T00:29:03.070Z",
            "Name": "cz-my-bucket"
        }
    ]
}
Note the name of your bucket from the output. Here the bucket name is "cz-my-bucket".
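If you want to pick bucket names out of this output in a script rather than by eye, the JSON can be parsed directly. This is a minimal sketch using only the Python standard library; the function name `bucket_names` is hypothetical, and the sample text simply mirrors the output shown above.

```python
import json


def bucket_names(list_buckets_json):
    """Extract bucket names from the JSON text printed by `aws s3api list-buckets`."""
    response = json.loads(list_buckets_json)
    return [bucket["Name"] for bucket in response.get("Buckets", [])]


# Sample text with the same shape as the CLI output shown above.
sample_output = """
{
    "Owner": {"DisplayName": "<username>", "ID": "<Account ID>"},
    "Buckets": [
        {"CreationDate": "2022-01-22T00:29:03.070Z", "Name": "cz-my-bucket"}
    ]
}
"""

print(bucket_names(sample_output))  # prints ['cz-my-bucket']
```

For quick one-off use, the AWS CLI's built-in `--query 'Buckets[].Name'` option gives a similar result without any scripting.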
Upload an object
In the following command, you upload an object to be named my-hosts (from the --key flag) to an existing bucket cz-my-bucket under the profile lc-objstore. The content of my-hosts is /etc/hosts specified by the --body flag and info on the object is specified by --metadata:
aws s3api --profile lc-objstore --endpoint-url https://osgha.llnl.gov put-object --key my-hosts --body /etc/hosts --metadata '{"hosts":"local-cluster"}' --bucket cz-my-bucket
Note: Replace values with your specific data. For example, replace "lc-objstore" with the name of the profile you've created and replace "cz-my-bucket" with the name of your bucket.
This returns something like:
{
    "ETag": "\"8b14f67903beb8db9d332a552aa60c3f\""
}
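For comparison, here is a rough sketch of the same upload from Python. It assumes boto3 is installed and that the profile and endpoint from this page are in use; the helper names `upload_file_as_object` and `make_client` are hypothetical. The upload helper only needs an object with a boto3-compatible `put_object` method, which also makes it easy to exercise without network access.

```python
def upload_file_as_object(client, bucket, key, path, metadata=None):
    """Upload the file at `path` to `bucket` under `key`; return the object's ETag.

    `client` is expected to be a boto3 S3 client (or any object exposing a
    compatible put_object method).
    """
    with open(path, "rb") as body:
        response = client.put_object(
            Bucket=bucket,
            Key=key,
            Body=body,
            Metadata=metadata or {},
        )
    # The service returns the ETag wrapped in literal double quotes.
    return response["ETag"].strip('"')


def make_client(profile="lc-objstore", endpoint="https://osgha.llnl.gov"):
    """Build an S3 client from a named profile (requires boto3)."""
    import boto3  # imported lazily so the helper above has no hard dependency

    session = boto3.Session(profile_name=profile)
    return session.client("s3", endpoint_url=endpoint)


# Example call (needs boto3, valid credentials, and network access):
# etag = upload_file_as_object(make_client(), "cz-my-bucket", "my-hosts",
#                              "/etc/hosts", {"hosts": "local-cluster"})
```

As with the shell command, replace the profile, bucket, and key names with your own values.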
List objects in a bucket
aws s3api --profile lc-objstore --endpoint-url https://osgha.llnl.gov list-objects --bucket cz-my-bucket
Note: Replace values with your specific data. For example, replace "lc-objstore" with the name of the profile you've created and replace "cz-my-bucket" with the name of your bucket.
Output here will look like:
{
    "Contents": [
        {
            "LastModified": "2022-01-22T00:34:15.065Z",
            "ETag": "\"8b14f67903beb8db9d332a552aa60c3f\"",
            "StorageClass": "STANDARD",
            "Key": "my-hosts",
            "Owner": {
                "DisplayName": "<username>",
                "ID": "<Account ID>"
            },
            "Size": 33340
        }
    ]
}
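When listing from Python, keep in mind that a single S3 list request returns at most 1,000 keys, so larger buckets need pagination. The sketch below shows one way to follow continuation tokens with a boto3-style client. It assumes the endpoint supports the `list_objects_v2` call, and the function name `iter_objects` is hypothetical.

```python
def iter_objects(client, bucket, prefix=None):
    """Yield (key, size) for every object in `bucket`, following pagination.

    `client` is expected to be a boto3 S3 client (or any object exposing a
    compatible list_objects_v2 method).
    """
    kwargs = {"Bucket": bucket}
    if prefix:
        kwargs["Prefix"] = prefix
    while True:
        response = client.list_objects_v2(**kwargs)
        # "Contents" is absent when the page holds no objects.
        for item in response.get("Contents", []):
            yield item["Key"], item["Size"]
        if not response.get("IsTruncated"):
            break
        # Ask for the next page, starting where this one stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]


# Example call (needs boto3, valid credentials, and network access):
# import boto3
# client = boto3.Session(profile_name="lc-objstore").client(
#     "s3", endpoint_url="https://osgha.llnl.gov")
# for key, size in iter_objects(client, "cz-my-bucket"):
#     print(key, size)
```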
Download an object
You can download an object via:
aws s3api --profile lc-objstore --endpoint-url https://osgha.llnl.gov get-object --key my-hosts downloaded-file --bucket cz-my-bucket
Note: Replace values with your specific data. For example, replace "lc-objstore" with the name of the profile you've created and replace "cz-my-bucket" with the name of your bucket.
The download command returns a message like:
{
    "AcceptRanges": "bytes",
    "ContentType": "binary/octet-stream",
    "LastModified": "Sat, 22 Jan 2022 00:34:15 GMT",
    "ContentLength": 33340,
    "ETag": "\"8b14f67903beb8db9d332a552aa60c3f\"",
    "Metadata": {
        "hosts": "local-cluster"
    }
}
To check the downloaded file:
$ tail -1 downloaded-file
172.19.3.171 oslic21-san0 oslic21.llnl.gov oslic21
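Beyond eyeballing the last line, you can compare the downloaded file against the ETag reported by the service. For objects uploaded in a single request, the ETag is commonly the hex MD5 digest of the content, but that is an assumption here: it does not hold for multipart uploads and may not hold for every storage configuration. The helper names `md5_hex` and `matches_etag` are hypothetical.

```python
import hashlib


def md5_hex(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_etag(path, etag):
    """Compare a local file with an ETag, ignoring the ETag's surrounding quotes.

    Only meaningful for single-request uploads: multipart ETags contain a "-"
    and are not plain MD5 digests, so they are rejected here.
    """
    cleaned = etag.strip('"').lower()
    if "-" in cleaned:
        raise ValueError("multipart ETag; not a plain MD5 digest")
    return md5_hex(path) == cleaned


# Example, using the ETag value exactly as it appears in the JSON output:
# matches_etag("downloaded-file", '"8b14f67903beb8db9d332a552aa60c3f"')
```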
Delete an object
aws s3api --profile lc-objstore --endpoint-url https://osgha.llnl.gov delete-object --key my-hosts --bucket cz-my-bucket
Note: Replace values with your specific data. For example, replace "lc-objstore" with the name of the profile you've created and replace "cz-my-bucket" with the name of your bucket.
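A delete request typically reports success whether or not the key existed, so it can be useful to confirm afterwards that the key is really gone. The sketch below does this from Python with a boto3-style client. It assumes the endpoint supports `list_objects_v2`, and the function name `delete_and_confirm` is hypothetical.

```python
def delete_and_confirm(client, bucket, key):
    """Delete `key` from `bucket`, then return True if it no longer appears in a listing.

    `client` is expected to be a boto3 S3 client (or any object exposing
    compatible delete_object and list_objects_v2 methods).
    """
    client.delete_object(Bucket=bucket, Key=key)
    # List keys starting with `key`; an exact match would sort first, so a
    # single page is enough to tell whether the object is still present.
    response = client.list_objects_v2(Bucket=bucket, Prefix=key)
    remaining = [item["Key"] for item in response.get("Contents", [])]
    return key not in remaining


# Example call (needs boto3, valid credentials, and network access):
# import boto3
# client = boto3.Session(profile_name="lc-objstore").client(
#     "s3", endpoint_url="https://osgha.llnl.gov")
# print(delete_and_confirm(client, "cz-my-bucket", "my-hosts"))
```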
Create a bucket
To create additional buckets, run:
aws s3api --profile lc-objstore --endpoint-url https://osgha.llnl.gov create-bucket --bucket cz-my-bucket
Note: Replace values with your specific data. For example, replace "lc-objstore" with the name of the profile you've created and replace "cz-my-bucket" with the name of your bucket.
After running this, you should see something like:
{ "Location": "/cz-my-bucket" }
Note that if you choose a bucket name that has already been used by another user, including cz-my-bucket as above, you'll see
An error occurred (BucketAlreadyExists) when calling the CreateBucket operation: The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again.
So, you'll have to get more creative.
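One way to avoid collisions in a shared namespace is to append a short random suffix to a descriptive name. The sketch below generates such a name and checks it against the general S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit). The function names and the "prefix-label-suffix" layout are hypothetical choices, not an LC requirement.

```python
import re
import secrets

# General S3 naming rules: 3-63 characters, lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
_VALID_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name):
    """Return True if `name` satisfies the general S3 bucket naming rules."""
    return bool(_VALID_NAME.fullmatch(name)) and ".." not in name


def suggest_bucket_name(prefix, label):
    """Build "<prefix>-<label>-<8 random hex characters>" and validate it."""
    suffix = secrets.token_hex(4)  # 4 random bytes rendered as 8 hex characters
    candidate = f"{prefix}-{label}-{suffix}".lower()
    # Replace characters that are not allowed in bucket names with hyphens.
    candidate = re.sub(r"[^a-z0-9.-]", "-", candidate)
    if not is_valid_bucket_name(candidate):
        raise ValueError(f"cannot build a valid bucket name from {candidate!r}")
    return candidate


print(suggest_bucket_name("cz", "my-bucket"))
```

The printed name can then be passed to `create-bucket` via `--bucket`. A random suffix makes a collision unlikely but not impossible, so the BucketAlreadyExists error is still worth handling.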
S3 with python boto3
For the full boto3 documentation, see https://boto3.amazonaws.com/v1/documentation/api/latest/index.html
Create a virtual environment and install boto3
python3 -m venv boto-venv
source boto-venv/bin/activate
pip install -U pip
pip install boto3
Start python, import boto3, and set up the connection details:
$ python
>>> import boto3
>>> import boto3.session
>>> session = boto3.session.Session(profile_name='lc-objstore')
>>> endpoint = 'https://osgha.llnl.gov'
>>> s3 = session.resource(service_name='s3', endpoint_url=endpoint)
>>> client = s3.meta.client
Note: Replace values with your specific data. For example, replace "lc-objstore" with the name of the profile you've created and replace "cz-my-bucket" with the name of your bucket.
List buckets
>>> for bucket in s3.buckets.all():
...     print(bucket.name)
...
mybucket
Here mybucket is the name of the bucket returned.
List files in a bucket
>>> for o in s3.Bucket('mybucket').objects.all():
...     print("Key: " + o.key)
...     print("Size: " + str(o.size))
...
Key: my-hosts
Size: 40246
Note: Replace values with your specific data. For example, replace "lc-objstore" with the name of the profile you've created and replace "cz-my-bucket" with the name of your bucket.
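The remaining operations from the shell section (upload, download, delete) have counterparts in the boto3 resource API as well. The following is a minimal sketch rather than a definitive recipe: the function name `round_trip` and the file paths are hypothetical, and `s3` is assumed to be a resource object created from your profile and endpoint with `session.resource(...)`.

```python
def round_trip(s3, bucket_name, key, source_path, dest_path):
    """Upload a file, download it to a new path, then delete the object.

    `s3` is expected to be a boto3 S3 resource (or any object exposing
    compatible Bucket(...) and Object(...) factories).
    """
    bucket = s3.Bucket(bucket_name)
    # Upload the local file under the given key.
    bucket.upload_file(Filename=source_path, Key=key)
    # Download the same object to a different local path.
    bucket.download_file(Key=key, Filename=dest_path)
    # Remove the object from the bucket.
    s3.Object(bucket_name, key).delete()


# Example call (needs boto3, valid credentials, and network access):
# import boto3
# session = boto3.Session(profile_name="lc-objstore")
# s3 = session.resource(service_name="s3", endpoint_url="https://osgha.llnl.gov")
# round_trip(s3, "mybucket", "my-hosts", "/etc/hosts", "downloaded-file")
```

Because the object is deleted at the end, this is mainly useful as a quick check that your credentials, endpoint, and bucket all work together.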