Rclone

From SD4H wiki
Revision as of 17:46, 24 April 2025 by Dbrownlee (talk | contribs)

Rclone is a powerful client that can interact with multiple storage backends. It offers good support for our Ceph implementation of the S3 API and delivers good transfer speed out of the box. It can also be used to mount an Object Store as a traditional file system.

Configuration

First download rclone or use the script installation. Then get your S3 access key ID and secret from OpenStack.
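If you have the OpenStack CLI configured, EC2-style (S3) credentials can typically be created and listed as sketched below; the exact output columns may differ on your cloud:

```shell
# Create an EC2-style (S3) credential pair for the current project
openstack ec2 credentials create

# List existing credentials; the Access and Secret columns are what
# rclone needs as access_key_id and secret_access_key
openstack ec2 credentials list
```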

Create the following file:

 ~/.config/rclone/rclone.conf
[my-project]
type = s3
provider = Other
env_auth = false
access_key_id = <S3 ID from previous step>
secret_access_key = <S3 secret from previous step>
endpoint = https://objets.juno.calculquebec.ca
acl = private

You can then list the current buckets, create a new bucket, and copy a file into it:

$rclone lsd my-project:
          -1 2024-01-19 14:12:34        -1 backups
          -1 2024-03-07 14:23:26        -1 my-bucket
$rclone mkdir my-project:test
$rclone lsd my-project:
          -1 2024-01-19 14:12:34        -1 backups
          -1 2024-03-07 14:23:26        -1 my-bucket
          -1 2025-04-15 18:08:32        -1 test
$rclone copy my-file.txt my-project:test
$rclone ls  my-project:test/
    12408 my-file.txt

Mounting an Object Store

To allow mounting by non-root users, in /etc/fuse.conf, uncomment:

user_allow_other
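On most distributions this line ships commented out. One way to uncomment it in place (a sketch; you can just as well edit the file by hand):

```shell
# Uncomment user_allow_other in /etc/fuse.conf (requires root)
sudo sed -i 's/^#\s*user_allow_other/user_allow_other/' /etc/fuse.conf
```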

Mount the Object Store in daemon mode with:

rclone mount <rclone config block>:<bucket> /path/to/mount/dir --daemon --daemon-wait 0 --allow-other --read-only
# For example:
#rclone mount c3g-data-repos:ihec_data /mnt/ihec_data_objstr --daemon --daemon-wait 0 --allow-other --read-only
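A mount started with --daemon stays active until it is unmounted. To detach it later:

```shell
# Unmount a FUSE mount created by rclone
fusermount -u /path/to/mount/dir
```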

A systemd service may be used to auto-mount the Object Store on boot, with a service file in /etc/systemd/system/.

# Mount the ihec_data_objstr, even after a restart
[Unit]
Description=My Object Store automount
After=network.target

[Service]
ExecStart=/usr/bin/rclone mount <rclone config block>:<bucket> /path/to/mount/point/dir --no-modtime --fast-list --transfers 50 --checkers 50 --allow-other --read-only
# For example:
# ExecStart=/usr/bin/rclone mount c3g-data-repos:ihec_data /mnt/ihec_data_objstr --no-modtime --fast-list --transfers 50 --checkers 50 --allow-other --read-only
ExecStop=/usr/bin/fusermount -u /path/to/mount/point/dir
# For example:
# ExecStop=/usr/bin/fusermount -u /mnt/ihec_data_objstr
Restart=always
SyslogIdentifier=ihec_data_objstr
User=ihec
Group=ihec
Environment=RCLONE_CONFIG=/home/ihec/.config/rclone/rclone.conf
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target
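Assuming the unit file above is saved as, say, /etc/systemd/system/my-objstore-mount.service (the filename is hypothetical), it is enabled with the usual systemd steps:

```shell
# Reload unit files, then enable and start the mount service
sudo systemctl daemon-reload
sudo systemctl enable --now my-objstore-mount.service

# Check that the mount service is running
systemctl status my-objstore-mount.service
```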

Mounting a public Object Store without using credentials

Public Object Stores may be accessed or mounted read-only without OpenStack credentials. This relies on a bucket syntax in which the bucket name is prefixed with the OpenStack project ID.

Your ~/.config/rclone/rclone.conf does not need to contain an access_key_id and secret_access_key, only:

[my-public-project]
type = s3
provider = Other
env_auth = false
endpoint = https://objets.juno.calculquebec.ca

Then combine the OS project ID and the bucket name like so:

rclone lsd my-public-project:<OS project ID>:<bucket name>
# For example:
# rclone lsd my-public-project:d5f8b8e8e3e2442f81573b2f0951013b:ihec_data
# or
# rclone mount my-public-project:d5f8b8e8e3e2442f81573b2f0951013b:ihec_data /mnt/ihec_data_objstr --daemon --daemon-wait 0 --allow-other --read-only

No problems, only solutions

1. I cannot upload files larger than 48 GB.

In some situations rclone cannot determine the size of the file it is uploading and falls back to the default `--s3-chunk-size 5M` to split the file for a multipart upload. Since the server limits a multipart upload to 10,000 chunks, uploads beyond roughly 48 GB crash. You can solve this by setting a larger chunk size:
$rclone copy --s3-chunk-size 50M my-large-file.cram my-project:test

Note that rclone buffers chunks in memory, so the chunk size (times the number of concurrent transfers) must fit in your machine's RAM.
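As a back-of-the-envelope check (the ~48 GB ceiling comes from 5 MiB x 10,000 chunks), the minimum chunk size for a given file can be computed like this:

```shell
# Minimum S3 chunk size (in MiB) for a file, given the 10,000-chunk limit
file_size_gib=200          # example: a 200 GiB file
max_chunks=10000
# ceil(file size in MiB / max_chunks), using integer arithmetic
min_chunk_mib=$(( (file_size_gib * 1024 + max_chunks - 1) / max_chunks ))
echo "$min_chunk_mib"      # -> 21, so pass e.g. --s3-chunk-size 21M or larger
```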