How to Move Data from AWS S3 to OCI OS
Goal: Transfer 10+ TB of data from an Amazon Web Services Simple Storage Service bucket to an Oracle Cloud Infrastructure Object Storage bucket
General approach: use rclone to move directly from one endpoint to the other
Setup on OCI
- Create the bucket
- Generate a secret key (user icon at the top right > User Settings > Customer Secret Keys > Generate Secret Key). Save the secret key (it will be called secret_access_key later). Once the key is created, click on the access key name and copy the access key (it will be called access_key_id later).
- Get the namespace: open the CLI (the CLI icon at the top of the web console), run
oci os ns get
and copy the result shown under the key “data”.
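The namespace and the region are combined into the endpoint URL that goes into the rclone config further down. A minimal sketch of assembling it in the shell; oci_s3_endpoint is a hypothetical helper name, "mynamespace" is a placeholder, and the URL pattern is simply the one used in the config in this post:

```shell
#!/usr/bin/env bash
# Build the S3-compatible endpoint URL for OCI Object Storage.
# oci_s3_endpoint is a hypothetical helper; the URL pattern is the one
# that appears in the rclone config in this post.
oci_s3_endpoint() {
  local namespace="$1" region="$2"
  printf 'https://%s.compat.objectstorage.%s.oraclecloud.com\n' "$namespace" "$region"
}

# Placeholder namespace; use the "data" value returned by `oci os ns get`.
oci_s3_endpoint "mynamespace" "us-ashburn-1"
```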
Setup on AWS
- Create an instance with FAST data transfer and lots of compute power (I used c5n.18xlarge).
- Give that instance an IAM role that has read access to the bucket with the target data.
- I used the Amazon Linux image.
- SSH into that instance:
ssh -i <key location> ec2-user@<ip addr>
(Note: if your permissions are set up properly, you will need to be on your corporate network, or on the VPN if offsite, to reach your EC2 instances.)
- install tmux:
sudo yum install tmux
- install rclone:
sudo yum install rclone
- start tmux (you want things to keep running even if you disconnect)
- set up the rclone config file (running rclone config file prints the path rclone is using):
[aws]
type = s3
provider = AWS
env_auth = true
region = <aws-region>
endpoint =
acl = private
server_side_encryption = aws:kms
sse_kms_key_id = arn:aws:kms:us-east-1:*
storage_class = INTELLIGENT_TIERING

[oci]
type = s3
provider = Other
env_auth = false
access_key_id = <from above>
secret_access_key = <from above>
region = <oci region>
endpoint = https://<oci namespace from above>.compat.objectstorage.<region>.oraclecloud.com
The AWS region for me is us-east-1 and the OCI region is us-ashburn-1. You can see what region you are in on the OCI web console top bar and in the AWS web console URL. I don’t think the acl, server-side encryption, KMS key ID, and storage class settings in the aws section are actually necessary, but whatever.
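If you would rather not leave the OCI secret in a file on the instance, rclone can also read a remote's settings from environment variables named RCLONE_CONFIG_<REMOTE>_<OPTION>. A sketch of the oci remote defined that way; set_oci_remote_env is a hypothetical helper name and every value passed to it is a placeholder:

```shell
#!/usr/bin/env bash
# Define the "oci" remote through rclone's RCLONE_CONFIG_<REMOTE>_<OPTION>
# environment variables instead of the config file.
# set_oci_remote_env is a hypothetical helper; its arguments are placeholders.
set_oci_remote_env() {
  local access_key="$1" secret_key="$2" namespace="$3" region="$4"
  export RCLONE_CONFIG_OCI_TYPE="s3"
  export RCLONE_CONFIG_OCI_PROVIDER="Other"
  export RCLONE_CONFIG_OCI_ENV_AUTH="false"
  export RCLONE_CONFIG_OCI_ACCESS_KEY_ID="$access_key"
  export RCLONE_CONFIG_OCI_SECRET_ACCESS_KEY="$secret_key"
  export RCLONE_CONFIG_OCI_REGION="$region"
  export RCLONE_CONFIG_OCI_ENDPOINT="https://${namespace}.compat.objectstorage.${region}.oraclecloud.com"
}

# Usage (placeholders, not real credentials):
#   set_oci_remote_env "<access_key_id>" "<secret_access_key>" "<namespace>" "us-ashburn-1"
#   rclone lsd oci:<bucket name>
```

The variables only live in the current shell (here, the tmux session), so they disappear when the session ends.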
- Test whether it all works:
rclone lsd oci:<bucket name>
and
rclone lsd aws:<bucket name>
- You can test putting things into the oci bucket:
touch test.txt && rclone copy test.txt oci:<bucket name>/test/
and go take a look at the bucket in the web console (rclone copy treats the destination as a directory, so the file lands at test/test.txt).
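The two listing tests can be folded into one preflight check that you run before starting the big transfer. A sketch, assuming rclone is on the PATH and the remotes are configured as described; check_remotes is a hypothetical helper that only wraps rclone lsd:

```shell
#!/usr/bin/env bash
# Check that every remote path given as an argument can be listed.
# check_remotes is a hypothetical helper; it only wraps `rclone lsd`.
check_remotes() {
  local failed=0 remote
  for remote in "$@"; do
    if rclone lsd "$remote" >/dev/null 2>&1; then
      echo "ok:   $remote"
    else
      echo "FAIL: $remote"
      failed=1
    fi
  done
  # Non-zero exit status if any remote could not be listed.
  return "$failed"
}

# Usage (bucket names are placeholders):
#   check_remotes "aws:<bucket name>" "oci:<bucket name>" && echo "ready to sync"
```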
On the AWS EC2 VM that you have set up, run
rclone sync aws:<bucket name> oci:<bucket name> --progress --transfers 1000
and now you wait… You could change the number of transfers based on how well things are going. You are in tmux, so you can split the window (Ctrl-b then ", the default tmux binding) and run top or htop in the new pane to watch your CPU usage.
We NEED to clean up a security hole we just made (although not a big one, we already trusted AWS to have our data): go back into the OCI online console and delete the secret key you created.
Also, make sure to shut down the EC2 instance
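Before deleting the key and shutting the instance down, it can be worth confirming that everything arrived. A sketch that wraps the sync from above with a tunable transfer count and follows it with rclone check; run_transfer and the TRANSFERS variable are hypothetical names, and --size-only is used on the assumption that checksums may not be comparable between the two stores (for example with KMS-encrypted S3 objects):

```shell
#!/usr/bin/env bash
# Run the sync, then compare source and destination by object size.
# run_transfer and TRANSFERS are hypothetical names; TRANSFERS defaults
# to the 1000 used in this post.
run_transfer() {
  local src="$1" dst="$2" transfers="${TRANSFERS:-1000}"
  rclone sync "$src" "$dst" --progress --transfers "$transfers" || return 1
  # --one-way: only verify that every source object exists on the destination.
  rclone check "$src" "$dst" --size-only --one-way
}

# Usage (bucket names are placeholders):
#   TRANSFERS=500 run_transfer "aws:<bucket name>" "oci:<bucket name>"
```

A non-zero exit status from the check means at least one object is missing or has a different size, in which case re-running the sync should only copy the differences.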
Post Mortem - do better next time:
Running on the EC2 instance type specified, I was using 100% of the available compute and not saturating the network (hitting around 250-300 MB/s each way), with about half of the RAM in use. That could be because I didn’t have enough CPU power to handle the processing, or because AWS S3 or OCI OS can’t move data faster. This can and should be tuned better for more intense transfers, as should the number of transfers to run in parallel. Good enough for me though: for 9 TB this took 52 minutes…
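For comparing runs while tuning, the average throughput implied by the totals is a handy number. A back-of-the-envelope sketch, assuming decimal units (1 TB = 1,000,000 MB) and integer arithmetic; avg_mb_per_s is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Average throughput implied by a total size and a wall-clock duration.
# Assumes decimal units: 1 TB = 1,000,000 MB. Integer division, so the
# result is rounded down. avg_mb_per_s is a hypothetical helper.
avg_mb_per_s() {
  local terabytes="$1" minutes="$2"
  echo $(( terabytes * 1000000 / (minutes * 60) ))
}

avg_mb_per_s 9 52   # 9 TB in 52 minutes; prints 2884
```

That works out to roughly 2.9 GB/s averaged over the whole run, which is well above the 250-300 MB/s reading quoted above, so the two numbers were presumably measured differently; the sketch only does the arithmetic.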