Under File Storage Extension API


This page is intended for developers of under storage extensions. Please look at managing extensions for a guide to using existing extensions.

Introduction

Under storage extensions provide a framework that enables additional storage systems to work with Alluxio and make it convenient to develop modules for storage systems that Alluxio does not already support. Extensions are built as JARs and placed at a specific extensions location, where core Alluxio picks them up. This page describes the mechanics of how extensions work in Alluxio and provides detailed instructions for developing an under storage extension.

If the modules included in core Alluxio do not use the interface supported by your desired storage system, you may choose to implement an under storage extension.

Implementing an Under Storage Extension

Building a new under storage connector involves:

  • Implementing the required under storage interface
  • Declaring the service implementation
  • Bundling up the implementation and transitive dependencies in an uber JAR

A reference implementation can be found in the alluxio-extensions repository. The rest of this section describes the steps involved in writing a new under storage extension. The sample project, called DummyUnderFileSystem, uses Maven as its build and dependency management tool and forwards all operations to a local filesystem.
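To make "forwards all operations to a local filesystem" concrete, here is a minimal sketch of the idea using only the JDK. It is not the reference implementation and does not implement Alluxio's UnderFileSystem interface. The class name LocalForwardingSketch, its method names, and the dummy:// to local path mapping are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical stand-in; it does not implement Alluxio's UnderFileSystem. */
final class LocalForwardingSketch {
  static final String SCHEME = "dummy://";

  /** Strips the assumed dummy:// scheme and treats the remainder as a local path. */
  static Path toLocalPath(String ufsPath) {
    if (ufsPath == null || !ufsPath.startsWith(SCHEME)) {
      throw new IllegalArgumentException("Unsupported path: " + ufsPath);
    }
    return Paths.get(ufsPath.substring(SCHEME.length()));
  }

  /** Forwards directory creation to the local filesystem. */
  static boolean mkdirs(String ufsPath) {
    try {
      Files.createDirectories(toLocalPath(ufsPath));
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Forwards the existence check to the local filesystem. */
  static boolean exists(String ufsPath) {
    return Files.exists(toLocalPath(ufsPath));
  }

  /** Forwards deletion; returns false if nothing existed at the path. */
  static boolean delete(String ufsPath) {
    try {
      return Files.deleteIfExists(toLocalPath(ufsPath));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Each operation follows the same pattern: translate the under storage path, then delegate to java.nio.file. A real extension would apply that pattern inside the methods required by the under storage interface.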

Implement the Under Storage Interface

The HDFS Submodule and S3A Submodule are good examples of how to enable a storage system to serve as Alluxio’s underlying storage.

Step 1: Implement the interface UnderFileSystem

The UnderFileSystem interface is defined in the module org.alluxio:alluxio-core-common. Choose to extend either BaseUnderFileSystem or ObjectUnderFileSystem to implement the UnderFileSystem interface. ObjectUnderFileSystem is suitable for connecting to object storage and abstracts away mapping file system operations to an object store.

  public class DummyUnderFileSystem extends BaseUnderFileSystem {
    // Implement filesystem operations
    ...
  }

or,

  public class DummyUnderFileSystem extends ObjectUnderFileSystem {
    // Implement object store operations
    ...
  }
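To illustrate what "mapping file system operations to an object store" involves, the sketch below derives a directory listing from a flat list of object keys by treating "/" as a delimiter. This is a generic illustration of the technique, not ObjectUnderFileSystem's actual code. The class PrefixListingSketch and its method are hypothetical.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper; not part of Alluxio. */
final class PrefixListingSketch {
  /**
   * Returns the immediate children of dir, derived only from flat object keys.
   * Child "directories" are reported with a trailing slash.
   */
  static SortedSet<String> listChildren(List<String> keys, String dir) {
    // Normalize the directory to a key prefix ending in "/"; "" means the root.
    String prefix = dir.isEmpty() || dir.endsWith("/") ? dir : dir + "/";
    SortedSet<String> children = new TreeSet<>();
    for (String key : keys) {
      if (!key.startsWith(prefix) || key.length() == prefix.length()) {
        continue; // Outside the directory, or the directory marker itself.
      }
      String rest = key.substring(prefix.length());
      int slash = rest.indexOf('/');
      // Keep only the first path component below the prefix.
      children.add(slash < 0 ? rest : rest.substring(0, slash + 1));
    }
    return children;
  }
}
```

For the keys data/a.txt, data/sub/b.txt, and other/d.txt, listing data yields a.txt and sub/. Real object stores usually perform this grouping on the server side through prefix and delimiter parameters, so a client does not have to scan every key.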

Step 2: Implement the interface UnderFileSystemFactory

The under storage factory defines which paths the UnderFileSystem implementation supports and how to create the UnderFileSystem implementation.

  public class DummyUnderFileSystemFactory implements UnderFileSystemFactory {
    ...
    @Override
    public UnderFileSystem create(String path, UnderFileSystemConfiguration conf) {
      // Create the under storage instance
    }

    @Override
    public boolean supportsPath(String path) {
      // Choose which schemes to support, e.g., dummy://
    }
  }
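As one possible shape for the supportsPath check, the sketch below accepts only paths that start with the dummy:// scheme used in the example above. The class DummyPathSupportSketch is a hypothetical standalone helper rather than part of the factory interface, and the null handling is an assumption.

```java
/** Hypothetical helper showing a scheme-based path check. */
final class DummyPathSupportSketch {
  static final String SCHEME = "dummy://";

  /** Returns true only for non-null paths under the dummy:// scheme. */
  static boolean supportsPath(String path) {
    return path != null && path.startsWith(SCHEME);
  }
}
```

A factory's supportsPath method could delegate to a check like this, so that paths with other schemes are left for other factories to claim.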

Declare the Service

Create a file at src/main/resources/META-INF/services/alluxio.underfs.UnderFileSystemFactory advertising the implemented UnderFileSystemFactory to the ServiceLoader.

  alluxio.underfs.dummy.DummyUnderFileSystemFactory

Build

Include all transitive dependencies of the extension project in the built JAR using either maven-shade-plugin or maven-assembly-plugin.

In addition, to avoid collisions, set the scope of the alluxio-core-common dependency to provided. The Maven definition would look like:

  <dependencies>
    <!-- Core Alluxio dependencies -->
    <dependency>
      <groupId>org.alluxio</groupId>
      <artifactId>alluxio-core-common</artifactId>
      <scope>provided</scope>
    </dependency>
    ...
  </dependencies>

Test

Extend AbstractUnderFileSystemContractTest to test that the defined UnderFileSystem adheres to the contract between Alluxio and an under storage module. Look at the reference implementation to include parameters such as the working directory for the test.

  public final class DummyUnderFileSystemContractTest extends AbstractUnderFileSystemContractTest {
    ...
  }

How it Works

Service Discovery

Extension JARs are loaded dynamically at runtime by Alluxio servers, which enables Alluxio to talk to new under storage systems without requiring a restart. Alluxio servers use the Java ServiceLoader to discover implementations of the under storage API. Providers include implementations of the alluxio.underfs.UnderFileSystemFactory interface. An implementation is advertised by including a text file in META-INF/services with a single line naming the class that implements the interface.
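The discovery step can be sketched with the JDK's java.util.ServiceLoader. The interface SketchStorageFactory below is a hypothetical stand-in that models only supportsPath, and the "first factory that supports the path wins" selection rule is an assumption, not a description of how Alluxio chooses among factories.

```java
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical stand-in for the factory interface; only supportsPath is modeled. */
interface SketchStorageFactory {
  boolean supportsPath(String path);
}

final class FactoryDiscoverySketch {
  /** Returns the first factory that claims the path, if any. */
  static Optional<SketchStorageFactory> find(
      Iterable<SketchStorageFactory> factories, String path) {
    for (SketchStorageFactory factory : factories) {
      if (factory.supportsPath(path)) {
        return Optional.of(factory);
      }
    }
    return Optional.empty();
  }

  /**
   * Looks up providers advertised in META-INF/services files visible to the
   * given class loader, then applies the same selection rule.
   */
  static Optional<SketchStorageFactory> discover(ClassLoader loader, String path) {
    // ServiceLoader is Iterable, so it can be passed straight to find().
    return find(ServiceLoader.load(SketchStorageFactory.class, loader), path);
  }
}
```

If no provider file is visible to the class loader, ServiceLoader yields no providers and discover returns an empty Optional. That is why the service declaration file has to be packaged inside the extension JAR.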

Dependency Management

Implementors are required to include transitive dependencies in their extension JARs. Alluxio performs isolated classloading for each extension JAR to avoid dependency conflicts between Alluxio servers and extensions.
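The general idea of per-JAR classloading can be sketched with the JDK's URLClassLoader, giving each extension JAR its own loader so that one extension's bundled dependencies are not visible to another. Alluxio's actual class loader is not shown on this page, so the class IsolatedLoadingSketch and its methods are assumptions. Note also that URLClassLoader delegates to its parent first; isolating an extension from classes already on the server classpath would need a child-first loader, which this sketch does not implement.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical sketch; not Alluxio's extension class loader. */
final class IsolatedLoadingSketch {
  /** Creates a dedicated class loader whose only URL is the given extension JAR. */
  static URLClassLoader loaderFor(File jar, ClassLoader parent) {
    try {
      return new URLClassLoader(new URL[] {jar.toURI().toURL()}, parent);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Bad JAR location: " + jar, e);
    }
  }

  /** Collects the providers of a service that are visible through one loader. */
  static <S> List<S> loadProviders(Class<S> service, ClassLoader loader) {
    List<S> providers = new ArrayList<>();
    for (S provider : ServiceLoader.load(service, loader)) {
      providers.add(provider);
    }
    return providers;
  }
}
```

Calling loaderFor once per JAR and then loadProviders with each loader keeps classes bundled in one JAR from being resolved against copies bundled in another.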

Contributing your Under Storage extension to Alluxio

Congratulations! You have developed a new under storage extension for Alluxio. Let the community know by submitting a pull request to the Alluxio repository that adds your extension to the list of extensions section on the documentation page.