Backing up Object Store

From SD4H wiki
Revision as of 20:28, 3 June 2025


Object Store data, while stored redundantly via Ceph, is not backed up by default. Buckets are backed up to the TSM tape system only upon request, by following the procedures listed on this page.

The following contents and policies apply to backups by default:

What is in the backup?

* Only the bucket data is backed up. The IAM policies of the buckets or objects are not currently backed up.
* Only the current version of the data is seen by the backup system. Object chunks or versioned objects are not seen by the backup system.

What is the backup policy?

* Backups are run on a daily basis.
* The current object and one modified version of the object are kept (this is different from full bucket versioning).
* The modified version is kept for 6 months; after that period, only the current object is kept.
* Deleted objects are kept for 6 months.

Backup Procedure

Please follow this procedure to request backups of your buckets.

Email the list of buckets

Send the list of buckets to back up to sd4h support (juno@calculquebec.ca), along with the name and ID of the project where the buckets live.

Give us permission

An IAM policy statement must be applied to all the buckets you want to back up so that the TSM robot user in charge of the backup can access them. This can be done with the AWS CLI.

First, ensure that my-bucket currently has no policy. Check bucket my-bucket using profile my-profile (as defined in ~/.aws/config and ~/.aws/credentials files):

$aws s3api get-bucket-policy --profile my-profile --bucket my-bucket

An error occurred (NoSuchBucketPolicy) when calling the GetBucketPolicy operation: The bucket policy does not exist

If that command returns something, the new policy statements must be added to the existing policy (which is not covered here).
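Merging is not covered by this page, but the idea can be illustrated. The following is a minimal sketch, assuming the existing policy has already been saved and parsed as JSON; the helper name `merge_statement` and the `alice` user in the demo are hypothetical, and the TSM statement is copied from this page's policy.json.

```python
import copy
import json

# TSM statement from this page's policy.json (my-bucket is the page's placeholder).
TSM_STATEMENT = {
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/tsm"]},
    "Action": ["s3:ListBucket", "s3:GetObject"],
    "Resource": ["arn:aws:s3:::my-bucket/*", "arn:aws:s3:::my-bucket"],
}


def merge_statement(existing_policy, new_statement):
    """Return a copy of existing_policy with new_statement appended once.

    Hypothetical helper: the input policy is left untouched.
    """
    merged = copy.deepcopy(existing_policy)
    statements = merged.get("Statement", [])
    # A policy may hold a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    # Skip the append if an identical statement is already present.
    if new_statement not in statements:
        statements.append(copy.deepcopy(new_statement))
    merged["Statement"] = statements
    return merged


if __name__ == "__main__":
    # Hypothetical existing policy granting read access to another user.
    existing = {
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": ["arn:aws:iam:::user/alice"]},
                "Action": ["s3:GetObject"],
                "Resource": ["arn:aws:s3:::my-bucket/*"],
            }
        ]
    }
    print(json.dumps(merge_statement(existing, TSM_STATEMENT), indent=2))
```

The printed document could then be saved to a file and loaded with put-bucket-policy in place of the original policy. Review the merged result before applying it, since put-bucket-policy replaces the whole policy.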

The following policy.json needs to be applied.

File : policy.json
{
"Statement": [
  {
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/tsm"]},
    "Action": [
      "s3:ListBucket",
      "s3:GetObject"
    ],
    "Resource": [
      "arn:aws:s3:::my-bucket/*",
      "arn:aws:s3:::my-bucket"
    ]
  }
]
}
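Because the policy names the bucket in its Resource entries, each bucket needs its own copy with the bucket name substituted. One possible way to generate these copies is sketched below; the function name `backup_policy` and the bucket names `bucket-a` and `bucket-b` are illustrative assumptions, not part of the procedure.

```python
import json


def backup_policy(bucket):
    """Build the read-only TSM policy from this page for one bucket."""
    return {
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": ["arn:aws:iam:::user/tsm"]},
                "Action": ["s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}/*",  # the objects in the bucket
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical bucket names; replace with the buckets to back up.
    for bucket in ["bucket-a", "bucket-b"]:
        print(json.dumps(backup_policy(bucket), indent=2))
```

Each printed document can be saved to its own file and loaded onto the matching bucket.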

Load the policy onto the bucket my-bucket using the profile my-profile:

$aws s3api put-bucket-policy --policy file://policy.json --profile my-profile --bucket my-bucket

Confirm policy applied

As we did before, request the bucket's IAM policy, ensuring that the contents of policy.json are listed.

$aws s3api get-bucket-policy --profile my-profile --bucket my-bucket
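The comparison can also be done programmatically. The sketch below assumes the command's default JSON output wraps the policy as a JSON string in a "Policy" field, as the AWS CLI does for get-bucket-policy; whether this object store returns the statements byte-for-byte as submitted is also an assumption. The function names are hypothetical.

```python
import json


def applied_statements(cli_output):
    """Extract the statement list from get-bucket-policy JSON output."""
    response = json.loads(cli_output)
    policy = response["Policy"]
    # The policy is normally returned as a JSON string inside the response.
    if isinstance(policy, str):
        policy = json.loads(policy)
    statements = policy.get("Statement", [])
    # Normalise a single statement object to a one-element list.
    return [statements] if isinstance(statements, dict) else statements


def policy_is_applied(cli_output, expected_policy):
    """True if every statement of expected_policy appears in the output."""
    applied = applied_statements(cli_output)
    return all(stmt in applied for stmt in expected_policy["Statement"])
```

For example, the command output can be captured to a file, read back as a string, and passed to `policy_is_applied` together with the parsed contents of policy.json.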

Restore Procedure

List bucket

Send the list of buckets or objects to restore to sd4h support.

Give us permission

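You will be asked to create a destination bucket, named with the -restore suffix, for each bucket you want to restore (for example, my-bucket-restore for my-bucket). The following policy must be applied to each destination bucket so the TSM robot user can write to it.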

File : policy.json
{
"Statement": [
  {
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/tsm"]},
    "Action": [
      "s3:ListBucket",
      "s3:GetObject",
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:AbortMultipartUpload"
    ],
    "Resource": [
      "arn:aws:s3:::my-bucket-restore/*",
      "arn:aws:s3:::my-bucket-restore"
    ]
  }
]
}
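When several buckets are restored, the destination name and its policy can be derived from the original bucket name. This is a minimal sketch under that assumption; `restore_bucket_name` and `restore_policy` are hypothetical helper names, and the actions are those listed in the policy above.

```python
import json


def restore_bucket_name(bucket):
    """Name of the destination bucket, using the -restore suffix."""
    return f"{bucket}-restore"


def restore_policy(bucket):
    """Build the restore policy from this page for the destination bucket."""
    target = restore_bucket_name(bucket)
    return {
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": ["arn:aws:iam:::user/tsm"]},
                "Action": [
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": [
                    f"arn:aws:s3:::{target}/*",  # objects written by the restore
                    f"arn:aws:s3:::{target}",    # the destination bucket itself
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(restore_policy("my-bucket"), indent=2))
```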

Once this is done, we will restore your data to that bucket.