Access Ozone object store with Amazon Boto3 client

This recipe shows how to access the Ozone object store from a Boto3 client. The following APIs were verified:

  • Create bucket
  • List bucket
  • Head bucket
  • Delete bucket
  • Upload file
  • Download file
  • Delete objects (keys)
  • Head object
  • Multipart upload

Requirements

Boto3 requires a recent version of Python 3. See the Boto3 installation requirements here: https://boto3.amazonaws.com/v1/documentation/api/latest/index.html

Obtain resource to Ozone

For details on creating ‘s3’ resources, see the Amazon Boto3 documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html

    import boto3

    s3 = boto3.resource(
        's3',
        endpoint_url='http://localhost:9878',
        aws_access_key_id='testuser/scm@EXAMPLE.COM',
        aws_secret_access_key='c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999',
    )

Here ‘endpoint_url’ points to the Ozone s3 endpoint.

Obtain client to Ozone via session

For details on sessions, see the Amazon Boto3 documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html

Create a session:

    session = boto3.session.Session()

Obtain an s3 client to Ozone via the session:

    s3_client = session.client(
        service_name='s3',
        aws_access_key_id='testuser/scm@EXAMPLE.COM',
        aws_secret_access_key='c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999',
        endpoint_url='http://localhost:9878',
    )

Here ‘endpoint_url’ points to the Ozone s3 endpoint. The code samples below use both ‘s3’ and ‘s3_client’.

If you are connecting to a secured cluster, Boto3 client credentials can be configured in several other ways. In those cases, omit the ‘aws_access_key_id’ and ‘aws_secret_access_key’ arguments when creating the Ozone s3 client.

For details, see the Boto3 documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html
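As one illustration of leaving out the key arguments, the sketch below builds the client arguments and includes explicit keys only when both are supplied. Otherwise Boto3 resolves credentials through its default chain, for example the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables or the shared credentials file. The helper names ‘ozone_client_kwargs’ and ‘make_ozone_client’ are hypothetical, and the default endpoint is the same local one used in the samples above.

```python
def ozone_client_kwargs(endpoint_url="http://localhost:9878",
                        access_key=None, secret_key=None):
    """Build keyword arguments for session.client() against an Ozone s3 endpoint."""
    kwargs = {"service_name": "s3", "endpoint_url": endpoint_url}
    # Pass keys explicitly only when both are supplied; otherwise leave them
    # out so that Boto3 falls back to its default credential chain.
    if access_key is not None and secret_key is not None:
        kwargs["aws_access_key_id"] = access_key
        kwargs["aws_secret_access_key"] = secret_key
    return kwargs


def make_ozone_client(**options):
    """Create the s3 client (hypothetical helper, not part of Boto3)."""
    import boto3  # imported here so the helper above works without boto3

    session = boto3.session.Session()
    return session.client(**ozone_client_kwargs(**options))
```

Calling ‘make_ozone_client()’ with no arguments then relies entirely on the externally configured credentials.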

Create a bucket

    response = s3_client.create_bucket(Bucket='bucket1')
    print(response)

This will create a bucket ‘bucket1’ in Ozone volume ‘s3v’.

List buckets

    response = s3_client.list_buckets()
    print('Existing buckets:')
    for bucket in response['Buckets']:
        print(f' {bucket["Name"]}')

This will list all buckets in Ozone volume ‘s3v’.

Head a bucket

    response = s3_client.head_bucket(Bucket='bucket1')
    print(response)

This sends a HEAD request for bucket ‘bucket1’ in Ozone volume ‘s3v’, which checks that the bucket exists and is accessible.
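Because ‘head_bucket’ raises an exception instead of returning a status when the bucket is missing, an existence check has to catch that error. The following is a minimal sketch, assuming the Ozone s3 endpoint answers a missing bucket with HTTP 404 as Amazon S3 does. ‘bucket_exists’ is a hypothetical helper; it reads the ‘response’ attribute that botocore's ‘ClientError’ carries rather than importing botocore directly.

```python
def bucket_exists(client, bucket):
    """Return True if head_bucket succeeds, False on a 404-style error."""
    try:
        client.head_bucket(Bucket=bucket)
        return True
    except Exception as exc:
        # botocore's ClientError exposes the parsed error as exc.response.
        error = (getattr(exc, "response", None) or {}).get("Error", {})
        if str(error.get("Code", "")) in ("404", "NoSuchBucket"):
            return False
        raise  # anything else (permissions, connectivity) is re-raised
```

With the client created earlier, ‘bucket_exists(s3_client, 'bucket1')’ can guard a ‘create_bucket’ call.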

Delete a bucket

    response = s3_client.delete_bucket(Bucket='bucket1')
    print(response)

This will delete the bucket ‘bucket1’ from Ozone volume ‘s3v’.

Upload a file

    # upload_file returns None on success, so this prints 'None'
    response = s3.Bucket('bucket1').upload_file('./README.md', 'README.md')
    print(response)

This will upload ‘README.md’ to Ozone and create a key ‘README.md’ in bucket ‘bucket1’ of volume ‘s3v’.

Download a file

    # download_file returns None on success, so this prints 'None'
    response = s3.Bucket('bucket1').download_file('README.md', 'download.md')
    print(response)

This will download ‘README.md’ from Ozone volume ‘s3v’ and save it locally as ‘download.md’.

Head an object

    response = s3_client.head_object(Bucket='bucket1', Key='README.md')
    print(response)

This retrieves the metadata of object ‘README.md’ in bucket ‘bucket1’ of Ozone volume ‘s3v’ without downloading its content.
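The upload and head calls can be combined into a simple check that an upload arrived. The sketch below uploads a file with the client's ‘upload_file’ method and compares the ‘ContentLength’ reported by ‘head_object’ with the local file size. ‘upload_and_verify’ is a hypothetical helper name, and a size comparison is only a basic sanity check, not a content checksum.

```python
import os


def upload_and_verify(client, path, bucket, key):
    """Upload a local file, then confirm the stored size matches the local size."""
    client.upload_file(path, bucket, key)  # returns None on success
    head = client.head_object(Bucket=bucket, Key=key)
    return head["ContentLength"] == os.path.getsize(path)
```

For example, ‘upload_and_verify(s3_client, './README.md', 'bucket1', 'README.md')’ would repeat the upload shown earlier and return True if the sizes agree.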

Delete Objects

    response = s3_client.delete_objects(
        Bucket='bucket1',
        Delete={
            'Objects': [
                {
                    'Key': 'README4.md',
                },
                {
                    'Key': 'README3.md',
                },
            ],
            'Quiet': False,
        },
    )

This will delete objects ‘README3.md’ and ‘README4.md’ from Ozone volume ‘s3v’ in bucket ‘bucket1’.
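The ‘Delete’ argument is easy to get wrong by hand when the key list is long. As a sketch, the hypothetical helpers below build that structure from a plain list of keys and collect any per-key failures the response reports under ‘Errors’. This assumes the response follows the Amazon S3 shape. The keys are sent in batches of at most 1000, which is the Amazon S3 limit per request; whether the Ozone s3 endpoint enforces the same limit is an assumption.

```python
def build_delete_request(keys, quiet=False):
    """Build the 'Delete' argument of delete_objects from a list of keys."""
    return {"Objects": [{"Key": key} for key in keys], "Quiet": quiet}


def delete_keys(client, bucket, keys, batch_size=1000):
    """Delete keys in batches and return the list of reported errors."""
    keys = list(keys)
    errors = []
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        response = client.delete_objects(
            Bucket=bucket, Delete=build_delete_request(batch)
        )
        # 'Errors' is absent when every key in the batch was deleted.
        errors.extend(response.get("Errors", []))
    return errors
```

An empty return value from ‘delete_keys(s3_client, 'bucket1', ['README3.md', 'README4.md'])’ would mean no key was reported as failed.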

Multipart upload

    # Start the multipart upload and keep its upload ID
    response = s3_client.create_multipart_upload(Bucket='bucket1', Key='key1')
    print(response)
    uid = response['UploadId']
    print(uid)

    # Part 1: copy an existing object into the upload
    response = s3_client.upload_part_copy(
        Bucket='bucket1',
        CopySource='/bucket1/maven.gz',
        Key='key1',
        PartNumber=1,
        UploadId=str(uid)
    )
    print(response)
    etag1 = response.get('CopyPartResult').get('ETag')
    print(etag1)

    # Part 2: copy a second existing object
    response = s3_client.upload_part_copy(
        Bucket='bucket1',
        CopySource='/bucket1/maven1.gz',
        Key='key1',
        PartNumber=2,
        UploadId=str(uid)
    )
    print(response)
    etag2 = response.get('CopyPartResult').get('ETag')
    print(etag2)

    # Complete the upload, listing each part with its ETag
    response = s3_client.complete_multipart_upload(
        Bucket='bucket1',
        Key='key1',
        MultipartUpload={
            'Parts': [
                {
                    'ETag': str(etag1),
                    'PartNumber': 1,
                },
                {
                    'ETag': str(etag2),
                    'PartNumber': 2,
                },
            ],
        },
        UploadId=str(uid),
    )
    print(response)

This uses ‘maven.gz’ and ‘maven1.gz’ in Ozone volume ‘s3v’ as copy sources to create a new object ‘key1’ in the same volume. Note that the ‘ETag’ returned for each part is required when completing the upload.
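The same sequence generalizes to any number of copy sources. The sketch below wraps the three calls in a hypothetical helper, ‘multipart_copy’, that collects each part's ‘ETag’ in order and aborts the upload if any step fails. The abort step uses Boto3's ‘abort_multipart_upload’ and assumes the Ozone s3 endpoint supports it, which was not among the APIs verified in this recipe.

```python
def multipart_copy(client, bucket, key, copy_sources):
    """Assemble `key` from existing objects, one part per copy source."""
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        # Part numbers start at 1 and follow the order of the sources.
        for number, source in enumerate(copy_sources, start=1):
            result = client.upload_part_copy(
                Bucket=bucket,
                CopySource=source,
                Key=key,
                PartNumber=number,
                UploadId=upload_id,
            )
            parts.append(
                {"ETag": result["CopyPartResult"]["ETag"], "PartNumber": number}
            )
        return client.complete_multipart_upload(
            Bucket=bucket,
            Key=key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Assumption: the endpoint supports aborting, so no partial upload lingers.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

The example above would then be written as ‘multipart_copy(s3_client, 'bucket1', 'key1', ['/bucket1/maven.gz', '/bucket1/maven1.gz'])’.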
