List of Configuration Properties


All Alluxio configuration settings fall into one of six categories: Common (shared by Master and Worker), Master specific, Worker specific, User specific, Resource Manager specific (used when running Alluxio with cluster managers like Mesos and YARN), and Security specific (shared by Master, Worker, and User).

Common Configuration

The common configuration contains constants shared by different components.
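
Most of these properties can be set in conf/alluxio-site.properties. Below is a minimal sketch using a few of the common properties listed in this table; the hostnames and paths are placeholders, not recommended values.

# Root under storage for Alluxio (placeholder HDFS address)
alluxio.underfs.address=hdfs://namenode:9000/alluxio
# Store temporary files for object store uploads on a dedicated disk (placeholder path)
alluxio.tmp.dirs=/mnt/disk1/alluxio-tmp
# Enable ZooKeeper-based fault tolerance (placeholder quorum)
alluxio.zookeeper.enabled=true
alluxio.zookeeper.address=zk1:2181,zk2:2181,zk3:2181

Note that a few properties in this table (for example alluxio.conf.dir and alluxio.logs.dir) must be passed as JVM properties instead, as noted in their descriptions.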

Property Name | Default | Description
alluxio.conf.dir${alluxio.home}/confThe directory containing files used to configure Alluxio. Note: This property must be specified as a JVM property; it is not accepted in alluxio-site.properties.
alluxio.debugfalseSet to true to enable debug mode which has additional logging and info in the Web UI.
alluxio.extensions.dir${alluxio.home}/extensionsThe directory containing Alluxio extensions.
alluxio.fuse.cached.paths.max500Maximum number of Alluxio paths to cache for FUSE conversion.
alluxio.fuse.debug.enabledfalseRun FUSE in debug mode, and have the fuse process log every FS request.
alluxio.fuse.fs.namealluxio-fuseThe FUSE file system name.
alluxio.fuse.maxwrite.bytes128KBMaximum granularity of write operations, capped by the kernel to 128KB max (as of Linux 3.16.0).
alluxio.fuse.user.group.translation.enabledfalseWhether to translate Alluxio users and groups into Unix users and groups when exposing Alluxio files through the FUSE API. When this property is set to false, the user and group for all FUSE files will match the user who started the alluxio-fuse process.
alluxio.home/opt/alluxioAlluxio installation directory.
alluxio.jvm.monitor.info.threshold1secWhen the JVM monitor thread detects extra sleep time longer than this threshold, it logs a message at INFO level.
alluxio.jvm.monitor.sleep.interval1secThe time for the JVM monitor thread to sleep.
alluxio.jvm.monitor.warn.threshold10secWhen the JVM monitor thread detects extra sleep time longer than this threshold, it logs a message at WARN level.
alluxio.locality.compare.node.ipfalseWhether to try to resolve the node IP address for locality checking.
alluxio.locality.nodeValue to use for determining node locality
alluxio.locality.ordernode,rackOrdering of locality tiers
alluxio.locality.rackValue to use for determining rack locality
alluxio.locality.scriptalluxio-locality.shA script to determine tiered identity for locality checking
alluxio.logger.typeConsoleThe type of logger.
alluxio.logs.dir${alluxio.work.dir}/logsThe path to store log files. Note: This property must be specified as a JVM property; it is not accepted in alluxio-site.properties.
alluxio.logserver.hostnameThe hostname of Alluxio logserver. Note: This property must be specified as a JVM property; it is not accepted in alluxio-site.properties.
alluxio.logserver.logs.dir${alluxio.work.dir}/logsDefault location for remote log files. Note: This property must be specified as a JVM property; it is not accepted in alluxio-site.properties.
alluxio.logserver.port45600Default port number to receive logs from alluxio servers. Note: This property must be specified as a JVM property; it is not accepted in alluxio-site.properties.
alluxio.logserver.threads.max2048The maximum number of threads used by logserver to service logging requests.
alluxio.logserver.threads.min512The minimum number of threads used by logserver to service logging requests.
alluxio.metrics.conf.file${alluxio.conf.dir}/metrics.propertiesThe file path of the metrics system configuration file. By default it is metrics.properties in the conf directory.
alluxio.network.host.resolution.timeout5secDuring startup of the Master and Worker processes Alluxio needs to ensure that they are listening on externally resolvable and reachable host names. To do this, Alluxio will automatically attempt to select an appropriate host name if one was not explicitly specified. This represents the maximum amount of time spent waiting to determine if a candidate host name is resolvable over the network.
alluxio.network.netty.heartbeat.timeout30secThe amount of time the server will wait before closing a netty connection if there has not been any incoming traffic. The client will periodically heartbeat when there is no activity on a connection. This value should be the same on the clients and server.
alluxio.network.thrift.frame.size.bytes.max16MB(Experimental) The largest allowable frame size used for Thrift RPC communication.
alluxio.proxy.s3.deletetypeALLUXIO_AND_UFSDelete type when deleting buckets and objects through S3 API. Valid options are ALLUXIO_AND_UFS (delete both in Alluxio and UFS), ALLUXIO_ONLY (delete only the buckets or objects in Alluxio namespace).
alluxio.proxy.s3.multipart.temporary.dir.suffix_s3_multipart_tmpSuffix for the directory which holds parts during a multipart upload.
alluxio.proxy.s3.writetypeCACHE_THROUGHWrite type when creating buckets and objects through S3 API. Valid options are MUST_CACHE (write will only go to Alluxio and must be stored in Alluxio), CACHE_THROUGH (try to cache, write to UnderFS synchronously), THROUGH (no cache, write to UnderFS synchronously).
alluxio.proxy.stream.cache.timeout1hourThe timeout for the input and output streams cache eviction in the proxy.
alluxio.proxy.web.bind.host0.0.0.0The hostname that the Alluxio proxy’s web server runs on. See multi-homed networks.
alluxio.proxy.web.hostnameThe hostname Alluxio proxy’s web UI binds to.
alluxio.proxy.web.port39999The port Alluxio proxy’s web UI runs on.
alluxio.site.conf.dir${alluxio.conf.dir}/,${user.home}/.alluxio/,/etc/alluxio/Comma-separated search path for alluxio-site.properties. Note: This property must be specified as a JVM property; it is not accepted in alluxio-site.properties.
alluxio.test.modefalseFlag used only during tests to allow special behavior.
alluxio.tmp.dirs/tmpThe path(s) to store Alluxio temporary files, use commas as delimiters. If multiple paths are specified, one will be selected at random per temporary file. Currently, only files to be uploaded to object stores are stored in these paths.
alluxio.underfs.address${alluxio.work.dir}/underFSStorageAlluxio directory in the under file system.
alluxio.underfs.allow.set.owner.failurefalseWhether to allow setting owner in UFS to fail. When set to true, it is possible file or directory owners diverge between Alluxio and UFS.
alluxio.underfs.gcs.directory.suffix/Directories are represented in GCS as zero-byte objects named with the specified suffix.
alluxio.underfs.gcs.owner.id.to.username.mappingOptionally, specify a preset gcs owner id to Alluxio username static mapping in the format “id1=user1;id2=user2”. The Google Cloud Storage IDs can be found at the console address https://console.cloud.google.com/storage/settings . Please use the “Owners” one.
alluxio.underfs.hdfs.configuration${alluxio.conf.dir}/core-site.xml:${alluxio.conf.dir}/hdfs-site.xmlLocation of the HDFS configuration file.
alluxio.underfs.hdfs.implorg.apache.hadoop.hdfs.DistributedFileSystemThe implementation class of the HDFS as the under storage system.
alluxio.underfs.hdfs.prefixeshdfs://,glusterfs:///,maprfs:///Optionally, specify which prefixes should run through the HDFS implementation of UnderFileSystem. The delimiter is any whitespace and/or ‘,’.
alluxio.underfs.hdfs.remotefalseBoolean indicating whether or not the under storage worker nodes are remote with respect to Alluxio worker nodes. If set to true, Alluxio will not attempt to discover locality information from the under storage because locality is impossible. This will improve performance. The default value is false.
alluxio.underfs.listing.length1000The maximum number of directory entries to list in a single query to under file system. If the total number of entries is greater than the specified length, multiple queries will be issued.
alluxio.underfs.object.store.mount.shared.publiclyfalseWhether or not to share an object storage under storage system mount point with all Alluxio users. Note that this configuration has no effect on HDFS or local UFS.
alluxio.underfs.object.store.read.retry.base.sleep50msBlock reads from an object store automatically retry for transient errors with an exponential backoff. This property determines the base time in the exponential backoff.
alluxio.underfs.object.store.read.retry.max.num20Block reads from an object store automatically retry for transient errors with an exponential backoff. This property determines the maximum number of retries.
alluxio.underfs.object.store.read.retry.max.sleep30secBlock reads from an object store automatically retry for transient errors with an exponential backoff. This property determines the maximum wait time in the backoff.
alluxio.underfs.object.store.service.threads20The number of threads in executor pool for parallel object store UFS operations.
alluxio.underfs.oss.connection.max1024The maximum number of OSS connections.
alluxio.underfs.oss.connection.timeout50secThe timeout when connecting to OSS.
alluxio.underfs.oss.connection.ttl-1The TTL of OSS connections in ms.
alluxio.underfs.oss.socket.timeout50secThe timeout of OSS socket.
alluxio.underfs.s3.admin.threads.max20The maximum number of threads to use for metadata operations when communicating with S3. These operations may be fairly concurrent and frequent but should not take much time to process.
alluxio.underfs.s3.disable.dns.bucketsfalseOptionally, set this to true to make all S3 requests use path-style addressing.
alluxio.underfs.s3.endpointOptionally, to reduce data latency or to access resources located in a different AWS region, specify a regional endpoint for AWS requests. An endpoint is a URL that is the entry point for a web service. For example, s3.cn-north-1.amazonaws.com.cn is the entry point for the Amazon S3 service in the Beijing region.
alluxio.underfs.s3.owner.id.to.username.mappingOptionally, specify a preset s3 canonical id to Alluxio username static mapping, in the format “id1=user1;id2=user2”. The AWS S3 canonical ID can be found at the console address https://console.aws.amazon.com/iam/home?#security_credential . Please expand the “Account Identifiers” tab and refer to “Canonical User ID”.
alluxio.underfs.s3.proxy.hostOptionally, specify a proxy host for communicating with S3.
alluxio.underfs.s3.proxy.portOptionally, specify a proxy port for communicating with S3.
alluxio.underfs.s3.threads.max40The maximum number of threads to use for communicating with S3 and the maximum number of concurrent connections to S3. Includes both threads for data upload and metadata operations. This number should be at least as large as the max admin threads plus max upload threads.
alluxio.underfs.s3.upload.threads.max20The maximum number of threads to use for uploading data to S3 for multipart uploads. These operations can be fairly expensive, so multiple threads are encouraged. However, this also splits the bandwidth between threads, meaning the overall latency for completing an upload will be higher for more threads.
alluxio.underfs.s3a.consistency.timeout1minThe duration to wait for metadata consistency from the under storage. This is only used by internal Alluxio operations which should be successful, but may appear unsuccessful due to eventual consistency.
alluxio.underfs.s3a.default.mode0700Mode (in octal notation) for S3 objects if mode cannot be discovered.
alluxio.underfs.s3a.directory.suffix/Directories are represented in S3 as zero-byte objects named with the specified suffix.
alluxio.underfs.s3a.inherit_acltrueSet this to false to disable inheriting bucket ACLs on objects.
alluxio.underfs.s3a.list.objects.v1falseWhether to use version 1 of GET Bucket (List Objects) API.
alluxio.underfs.s3a.request.timeout1minThe timeout for a single request to S3. Infinity if set to 0. Setting this property to a non-zero value can improve performance by avoiding the long tail of requests to S3. For very slow connections to S3, consider increasing this value or setting it to 0.
alluxio.underfs.s3a.secure.http.enabledfalseWhether or not to use HTTPS protocol when communicating with S3.
alluxio.underfs.s3a.server.side.encryption.enabledfalseWhether or not to encrypt data stored in S3.
alluxio.underfs.s3a.signer.algorithmThe signature algorithm which should be used to sign requests to the s3 service. This is optional, and if not set, the client will automatically determine it. For interacting with an S3 endpoint which only supports v2 signatures, set this to “S3SignerType”.
alluxio.underfs.s3a.socket.timeout50secLength of the socket timeout when communicating with S3.
alluxio.web.resources${alluxio.home}/core/server/common/src/main/webappPath to the web application resources.
alluxio.web.threads1How many threads to use for the web server.
alluxio.work.dir${alluxio.home}The directory to use for Alluxio’s working directory. By default, the journal, logs, and under file system data (if using local filesystem) are written here.
alluxio.zookeeper.addressAddress of ZooKeeper.
alluxio.zookeeper.connection.timeout15sConnection timeout to use when connecting to Zookeeper
alluxio.zookeeper.election.path/electionElection directory in ZooKeeper.
alluxio.zookeeper.enabledfalseIf true, setup master fault tolerant mode using ZooKeeper.
alluxio.zookeeper.leader.inquiry.retry10The number of retries to inquire leader from ZooKeeper.
alluxio.zookeeper.leader.path/leaderLeader directory in ZooKeeper.
alluxio.zookeeper.session.timeout60sSession timeout to use when connecting to Zookeeper
aws.accessKeyIdThe access key of S3 bucket.
aws.secretKeyThe secret key of S3 bucket.
fs.cos.access.keyThe access key of COS bucket.
fs.cos.app.idThe app id of COS bucket.
fs.cos.connection.max1024The maximum number of COS connections.
fs.cos.connection.timeout50secThe timeout of connecting to COS.
fs.cos.regionThe region name of COS bucket.
fs.cos.secret.keyThe secret key of COS bucket.
fs.cos.socket.timeout50secThe timeout of COS socket.
fs.gcs.accessKeyIdThe access key of GCS bucket.
fs.gcs.secretAccessKeyThe secret key of GCS bucket.
fs.oss.accessKeyIdThe access key of OSS bucket.
fs.oss.accessKeySecretThe secret key of OSS bucket.
fs.oss.endpointThe endpoint key of OSS bucket.
fs.swift.apikey(deprecated) The API key used for user:tenant authentication.
fs.swift.auth.methodChoice of authentication method: [tempauth (default), swiftauth, keystone, keystonev3].
fs.swift.auth.urlAuthentication URL for REST server, e.g., http://server:8090/auth/v1.0.
fs.swift.passwordThe password used for user:tenant authentication.
fs.swift.regionService region when using Keystone authentication.
fs.swift.simulationWhether to simulate a single node Swift backend for testing purposes: true or false (default).
fs.swift.tenantSwift tenant for authentication.
fs.swift.use.public.urlWhether the REST server is in a public domain: true (default) or false.
fs.swift.userSwift user for authentication.

Master Configuration

The master configuration specifies information regarding the master node, such as the address and the port number.
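
As a sketch, a master host might set a handful of these in conf/alluxio-site.properties; the hostname, journal path, and backup time below are placeholders.

alluxio.master.hostname=master-node-1
alluxio.master.port=19998
# Keep the journal on shared storage when running multiple masters (placeholder path)
alluxio.master.journal.folder=hdfs://namenode:9000/alluxio/journal
# Take a daily metadata backup at an off-peak UTC time
alluxio.master.daily.backup.enabled=true
alluxio.master.daily.backup.time=03:00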

Property Name | Default | Description
alluxio.master.audit.logging.enabledfalseSet to true to enable file system master audit. Note: This property must be specified as a JVM property; it is not accepted in alluxio-site.properties.
alluxio.master.audit.logging.queue.capacity10000Capacity of the queue used by audit logging.
alluxio.master.backup.directory/alluxio_backupsDefault directory for writing master metadata backups. This path is an absolute path of the root UFS. For example, if the root ufs directory is hdfs://host:port/alluxio/data, the default backup directory will be hdfs://host:port/alluxio_backups.
alluxio.master.bind.host0.0.0.0The hostname that Alluxio master binds to. See multi-homed networks.
alluxio.master.connection.timeout0Timeout of connections between master and client. A value of 0 means never timeout
alluxio.master.daily.backup.enabledfalseWhether or not to enable daily primary master metadata backup.
alluxio.master.daily.backup.files.retained3The maximum number of backup files to keep in the backup directory.
alluxio.master.daily.backup.time05:00Default UTC time for writing daily master metadata backups. The accepted time format is hour:minute which is based on a 24-hour clock (E.g., 05:30, 06:00, and 22:04). Backing up metadata requires a pause in master metadata changes, so please set this value to an off-peak time to avoid interfering with other users of the system.
alluxio.master.file.async.persist.handleralluxio.master.file.async.DefaultAsyncPersistHandlerThe handler for processing the async persistence requests.
alluxio.master.format.file_prefix_formatThe file prefix of the file generated in the journal directory when the journal is formatted. The master will search for a file with this prefix when determining if the journal is formatted.
alluxio.master.heartbeat.timeout10minTimeout between leader master and standby master indicating a lost master.
alluxio.master.hostnameThe hostname of Alluxio master.
alluxio.master.journal.checkpoint.period.entries2000000The number of journal entries to write before creating a new journal checkpoint.
alluxio.master.journal.flush.batch.time5msTime to wait for batching journal writes.
alluxio.master.journal.flush.timeout5minThe amount of time to keep retrying journal writes before giving up and shutting down the master.
alluxio.master.journal.folder${alluxio.work.dir}/journalThe path to store master journal logs.
alluxio.master.journal.formatter.classalluxio.master.journalv0.ProtoBufJournalFormatterThe class to serialize the journal in a specified format.
alluxio.master.journal.gc.period2minFrequency with which to scan for and delete stale journal checkpoints.
alluxio.master.journal.gc.threshold5minMinimum age for garbage collecting checkpoints.
alluxio.master.journal.init.from.backupA uri for a backup to initialize the journal from. When the master becomes primary, if it sees that its journal is freshly formatted, it will restore its state from the backup. When running multiple masters, this property must be configured on all masters since it isn’t known during startup which master will become the first primary.
alluxio.master.journal.log.size.bytes.max10MBIf a log file is bigger than this value, it will rotate to next file.
alluxio.master.journal.retry.interval1secThe amount of time to sleep between retrying journal flushes
alluxio.master.journal.tailer.shutdown.quiet.wait.time5secBefore the standby master shuts down its tailer thread, there should be no update to the leader master’s journal in this specified time period.
alluxio.master.journal.tailer.sleep.time1secTime for the standby master to sleep for when it cannot find anything new in leader master’s journal.
alluxio.master.journal.temporary.file.gc.threshold30minMinimum age for garbage collecting temporary checkpoint files.
alluxio.master.journal.typeUFSThe type of journal to use. Valid options are UFS (store journal in UFS) and NOOP (do not use a journal).
alluxio.master.journal.ufs.optionThe configuration to use for the journal operations.
alluxio.master.jvm.monitor.enabledfalseWhether to start the JVM monitor thread on the master.
alluxio.master.keytab.fileKerberos keytab file for Alluxio master.
alluxio.master.lineage.checkpoint.classalluxio.master.lineage.checkpoint.CheckpointLatestPlannerThe class name of the checkpoint strategy for lineage output files. The default strategy is to checkpoint the latest completed lineage, i.e. the lineage whose output files are completed.
alluxio.master.lineage.checkpoint.interval5minThe interval between Alluxio’s checkpoint scheduling.
alluxio.master.lineage.recompute.interval5minThe interval between Alluxio’s recompute executions. The executor scans all the lost files tracked by lineage and re-executes the corresponding jobs.
alluxio.master.lineage.recompute.log.path${alluxio.logs.dir}/recompute.logThe path to the log that the recompute executor redirects the job’s stdout into.
alluxio.master.log.config.report.heartbeat.interval1hThe interval for periodically logging the configuration check report.
alluxio.master.master.heartbeat.interval2minThe interval between Alluxio masters’ heartbeats.
alluxio.master.metastore.inode.inherit.owner.and.grouptrueWhether to inherit the owner/group from the parent when creating a new inode path if empty
alluxio.master.mount.table.root.alluxio/Alluxio root mount point.
alluxio.master.mount.table.root.optionConfiguration for the UFS of Alluxio root mount point.
alluxio.master.mount.table.root.readonlyfalseWhether Alluxio root mount point is readonly.
alluxio.master.mount.table.root.sharedtrueWhether Alluxio root mount point is shared.
alluxio.master.mount.table.root.ufs${alluxio.underfs.address}The UFS mounted to Alluxio root mount point.
alluxio.master.periodic.block.integrity.check.interval1hrThe period for the block integrity check, disabled if <= 0.
alluxio.master.periodic.block.integrity.check.repairtrueWhether the system should delete orphaned blocks found during the periodic integrity check. This is an experimental feature.
alluxio.master.port19998The port that Alluxio master node runs on.
alluxio.master.principalKerberos principal for Alluxio master.
alluxio.master.startup.block.integrity.check.enabledtrueWhether the system should be checked on startup for orphaned blocks (blocks having no corresponding files but still taking system resource due to various system failures). Orphaned blocks will be deleted during master startup if this property is true. This property is available since 1.7.1
alluxio.master.startup.consistency.check.enabledtrueWhether the system should be checked for consistency with the underlying storage on startup. During the time the check is running, Alluxio will be in read only mode. Enabled by default.
alluxio.master.thrift.shutdown.timeout60secMaximum time to wait for thrift servers to stop on shutdown
alluxio.master.tieredstore.global.level0.aliasMEMThe name of the highest storage tier in the entire system.
alluxio.master.tieredstore.global.level1.aliasSSDThe name of the second highest storage tier in the entire system.
alluxio.master.tieredstore.global.level2.aliasHDDThe name of the third highest storage tier in the entire system.
alluxio.master.tieredstore.global.levels3The total number of storage tiers in the system.
alluxio.master.ttl.checker.interval1hourTime interval to periodically delete the files with expired ttl value.
alluxio.master.ufs.block.location.cache.capacity1000000The capacity of the UFS block locations cache. This cache caches UFS block locations for files that are persisted but not in Alluxio space, so that listing status of these files do not need to repeatedly ask UFS for their block locations. If this is set to 0, the cache will be disabled.
alluxio.master.ufs.path.cache.capacity100000The capacity of the UFS path cache. This cache is used to approximate the Once metadata load behavior (see alluxio.user.file.metadata.load.type). Larger caches will consume more memory, but will better approximate the Once behavior.
alluxio.master.ufs.path.cache.threads64The maximum size of the thread pool for asynchronously processing paths for the UFS path cache. Greater number of threads will decrease the amount of staleness in the async cache, but may impact performance. If this is set to 0, the cache will be disabled, and alluxio.user.file.metadata.load.type=Once will behave like Always.
alluxio.master.web.bind.host0.0.0.0The hostname Alluxio master web UI binds to. See multi-homed networks.
alluxio.master.web.hostnameThe hostname of Alluxio Master web UI.
alluxio.master.web.port19999The port Alluxio web UI runs on.
alluxio.master.whitelist/A comma-separated list of prefixes of the paths which are cacheable. Alluxio will try to cache a cacheable file when it is read for the first time.
alluxio.master.worker.connect.wait.time5secAlluxio master will wait a period of time after start up for all workers to register, before it starts accepting client requests. This property determines the wait time.
alluxio.master.worker.heartbeat.interval10secThe interval between Alluxio master and worker heartbeats.
alluxio.master.worker.threads.maxA third of the max file descriptors limit, if between 2048 and 32768The maximum number of incoming RPC requests to the master that can be handled. This value is used to configure the maximum number of threads in the master's Thrift thread pool.
alluxio.master.worker.threads.min512The minimum number of threads used to handle incoming RPC requests to master. This value is used to configure minimum number of threads in Thrift thread pool with master.
alluxio.master.worker.timeout5minTimeout between master and worker indicating a lost worker.

Worker Configuration

The worker configuration specifies information regarding the worker nodes, such as the address and the port number.
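
For illustration, a worker with a memory tier on top of an SSD tier might be configured as follows in conf/alluxio-site.properties; the sizes and paths are placeholders.

# Two tiers: memory on top, SSD below (aliases must match the master's global tiers)
alluxio.worker.memory.size=16GB
alluxio.worker.tieredstore.levels=2
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
alluxio.worker.tieredstore.level1.alias=SSD
alluxio.worker.tieredstore.level1.dirs.path=/mnt/ssd/alluxio
alluxio.worker.tieredstore.level1.dirs.quota=200GB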

Property Name | Default | Description
alluxio.worker.allocator.classalluxio.worker.block.allocator.MaxFreeAllocatorThe strategy that a worker uses to allocate space among storage directories in certain storage layer. Valid options include: alluxio.worker.block.allocator.MaxFreeAllocator, alluxio.worker.block.allocator.GreedyAllocator, alluxio.worker.block.allocator.RoundRobinAllocator.
alluxio.worker.bind.host0.0.0.0The hostname Alluxio’s worker node binds to. See multi-homed networks.
alluxio.worker.block.heartbeat.interval1secThe interval between block workers’ heartbeats.
alluxio.worker.block.heartbeat.timeout${alluxio.worker.master.connect.retry.timeout}The timeout value of block workers’ heartbeats. If the worker can’t connect to master before this interval expires, the worker will exit.
alluxio.worker.block.master.client.pool.size11The block master client pool size on the Alluxio workers.
alluxio.worker.block.threads.max2048The maximum number of incoming RPC requests to block worker that can be handled. This value is used to configure maximum number of threads in Thrift thread pool with block worker. This value should be greater than the sum of alluxio.user.block.worker.client.threads across concurrent Alluxio clients. Otherwise, the worker connection pool can be drained, preventing new connections from being established.
alluxio.worker.block.threads.min256The minimum number of threads used to handle incoming RPC requests to block worker. This value is used to configure minimum number of threads in Thrift thread pool with block worker.
alluxio.worker.data.bind.host0.0.0.0The hostname that the Alluxio worker’s data server runs on. See multi-homed networks.
alluxio.worker.data.folder/alluxioworker/A relative path within each storage directory used as the data folder for Alluxio worker to put data for tiered store.
alluxio.worker.data.folder.permissionsrwxrwxrwxThe permission set for the worker data folder. If short circuit is used this folder should be accessible by all users (rwxrwxrwx).
alluxio.worker.data.folder.tmp.tmp_blocksA relative path in alluxio.worker.data.folder used to store the temporary data for uncommitted files.
alluxio.worker.data.hostnameThe hostname of Alluxio worker data service.
alluxio.worker.data.port29999The port Alluxio’s worker’s data server runs on.
alluxio.worker.data.server.classalluxio.worker.netty.NettyDataServerSelects the networking stack to run the worker with. Valid options are: alluxio.worker.netty.NettyDataServer.
alluxio.worker.data.server.domain.socket.addressThe path to the domain socket. Short-circuit reads make use of a UNIX domain socket when this is set (non-empty). This is a special path in the file system that allows the client and the AlluxioWorker to communicate. You will need to set a path to this socket. The AlluxioWorker needs to be able to create the path. If alluxio.worker.data.server.domain.socket.as.uuid is set, the path should be the home directory for the domain socket. The full path for the domain socket will be {path}/{uuid}.
alluxio.worker.data.server.domain.socket.as.uuidfalseIf true, the property alluxio.worker.data.server.domain.socket.address is the path to the home directory for the domain socket and a unique identifier is used as the domain socket name. In addition, clients ignore alluxio.user.hostname while detecting a local worker for short circuit ops. If false, the property is the absolute path to the UNIX domain socket.
alluxio.worker.data.tmp.subdir.max1024The maximum number of sub-directories allowed to be created in alluxio.worker.data.tmp.folder.
alluxio.worker.evictor.classalluxio.worker.block.evictor.LRUEvictorThe strategy that a worker uses to evict block files when a storage layer runs out of space. Valid options include alluxio.worker.block.evictor.LRFUEvictor, alluxio.worker.block.evictor.GreedyEvictor, alluxio.worker.block.evictor.LRUEvictor.
alluxio.worker.evictor.lrfu.attenuation.factor2.0An attenuation factor in [2, INF) to control the behavior of LRFU.
alluxio.worker.evictor.lrfu.step.factor0.25A factor in [0, 1] to control the behavior of LRFU: smaller value makes LRFU more similar to LFU; and larger value makes LRFU closer to LRU.
alluxio.worker.file.buffer.size1MBThe buffer size for worker to write data into the tiered storage.
alluxio.worker.file.persist.pool.size64The size of the thread pool per worker, in which the thread persists an ASYNC_THROUGH file to under storage.
alluxio.worker.file.persist.rate.limit2GBThe rate limit of asynchronous persistence per second.
alluxio.worker.file.persist.rate.limit.enabledfalseWhether to enable rate limiting when performing asynchronous persistence.
alluxio.worker.filesystem.heartbeat.interval1secThe heartbeat interval between the worker and file system master.
alluxio.worker.free.space.timeout10secThe duration for which a worker will wait for eviction to make space available for a client write request.
alluxio.worker.hostnameThe hostname of Alluxio worker.
alluxio.worker.jvm.monitor.enabledfalseWhether to start the JVM monitor thread on the worker.
alluxio.worker.keytab.fileKerberos keytab file for Alluxio worker.
alluxio.worker.master.connect.retry.timeout1hourRetry period before workers give up on connecting to master
alluxio.worker.memory.size2/3 of total system memory, or 1GB if system memory size cannot be determinedMemory capacity of each worker node.
alluxio.worker.network.netty.async.cache.manager.threads.max8The maximum number of threads used to cache blocks asynchronously in the netty data server.
alluxio.worker.network.netty.backlogNetty socket option for SO_BACKLOG: the number of connections queued.
alluxio.worker.network.netty.block.reader.threads.max2048The maximum number of threads used to read blocks in the netty data server.
alluxio.worker.network.netty.block.writer.threads.max1024The maximum number of threads used to write blocks in the netty data server.
alluxio.worker.network.netty.boss.threads1How many threads to use for accepting new requests.
alluxio.worker.network.netty.buffer.receiveNetty socket option for SO_RCVBUF: the proposed buffer size that will be used for receives.
alluxio.worker.network.netty.buffer.sendNetty socket option for SO_SNDBUF: the proposed buffer size that will be used for sends.
alluxio.worker.network.netty.channelEPOLLNetty channel type: NIO or EPOLL. If EPOLL is not available, this will automatically fall back to NIO.
alluxio.worker.network.netty.file.transferMAPPEDWhen returning files to the user, select how the data is transferred; valid options are MAPPED (uses java MappedByteBuffer) and TRANSFER (uses Java FileChannel.transferTo).
alluxio.worker.network.netty.file.writer.threads.max1024The maximum number of threads used to write files to UFS in the netty data server.
alluxio.worker.network.netty.reader.buffer.size.packets16The maximum number of parallel data packets when a client reads from a worker.
alluxio.worker.network.netty.rpc.threads.max2048The maximum number of threads used to handle worker side RPCs in the netty data server.
alluxio.worker.network.netty.shutdown.quiet.period2secThe quiet period. When the netty server is shutting down, it will ensure that no RPCs occur during the quiet period. If an RPC occurs, then the quiet period will restart before shutting down the netty server.
alluxio.worker.network.netty.shutdown.timeout15secMaximum amount of time to wait until the netty server is shutdown (regardless of the quiet period).
alluxio.worker.network.netty.watermark.high32KBDetermines how many bytes can be in the write queue before switching to non-writable.
alluxio.worker.network.netty.watermark.low8KBOnce the high watermark limit is reached, the queue must be flushed down to the low watermark before switching back to writable.
alluxio.worker.network.netty.worker.threads0How many threads to use for processing requests. Zero defaults to #cpuCores * 2.
alluxio.worker.network.netty.writer.buffer.size.packets16The maximum number of parallel data packets when a client writes to a worker.
alluxio.worker.port29998The port Alluxio’s worker node runs on.
alluxio.worker.principalKerberos principal for Alluxio worker.
alluxio.worker.session.timeout1minTimeout between worker and client connection indicating a lost session connection.
alluxio.worker.tieredstore.block.lock.readers1000The max number of concurrent readers for a block lock.
alluxio.worker.tieredstore.block.locks1000Total number of block locks for an Alluxio block worker. Larger value leads to finer locking granularity, but uses more space.
alluxio.worker.tieredstore.level0.aliasMEMThe alias of the top storage tier on this worker. It must match one of the global storage tiers from the master configuration. Placing an alias that is lower in the global hierarchy before an alias with a higher position in the worker hierarchy is not allowed. So by default, SSD cannot come before MEM on any worker.
alluxio.worker.tieredstore.level0.dirs.path/mnt/ramdisk on Linux, /Volumes/ramdisk on OSXThe path of storage directory for the top storage tier. Note for MacOS the value should be /Volumes/.
alluxio.worker.tieredstore.level0.dirs.quota${alluxio.worker.memory.size}The capacity of the top storage tier.
alluxio.worker.tieredstore.level0.reserved.ratioFraction of space reserved in the top storage tier. This has been deprecated, please use high and low watermark instead.
alluxio.worker.tieredstore.level0.watermark.high.ratio0.95The high watermark of the space in the top storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level0.watermark.low.ratio0.7The low watermark of the space in the top storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level1.aliasThe alias of the second storage tier on this worker.
alluxio.worker.tieredstore.level1.dirs.pathThe path of storage directory for the second storage tier.
alluxio.worker.tieredstore.level1.dirs.quotaThe capacity of the second storage tier.
alluxio.worker.tieredstore.level1.reserved.ratioFraction of space reserved in the second storage tier. This has been deprecated, please use high and low watermark instead.
alluxio.worker.tieredstore.level1.watermark.high.ratio0.95The high watermark of the space in the second storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level1.watermark.low.ratio0.7The low watermark of the space in the second storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level2.aliasThe alias of the third storage tier on this worker.
alluxio.worker.tieredstore.level2.dirs.pathThe path of storage directory for the third storage tier.
alluxio.worker.tieredstore.level2.dirs.quotaThe capacity of the third storage tier.
alluxio.worker.tieredstore.level2.reserved.ratioFraction of space reserved in the third storage tier. This has been deprecated, please use high and low watermark instead.
alluxio.worker.tieredstore.level2.watermark.high.ratio0.95The high watermark of the space in the third storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level2.watermark.low.ratio0.7The low watermark of the space in the third storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.levels1The number of storage tiers on the worker.
alluxio.worker.tieredstore.reserver.enabledtrueWhether to enable tiered store reserver service or not.
alluxio.worker.tieredstore.reserver.interval1secThe period of the space reserver service, which keeps a certain portion of space available on each layer.
alluxio.worker.tieredstore.retry3The number of retries that the worker uses to process blocks.
alluxio.worker.ufs.block.open.timeout5minTimeout to open a block from UFS.
alluxio.worker.ufs.instream.cache.enabledtrueEnable caching for seekable under storage input stream, so that subsequent seek operations on the same file will reuse the cached input stream. This will improve position read performance as the open operations of some under file system would be expensive. The cached input stream would be stale, when the UFS file is modified without notifying alluxio.
alluxio.worker.ufs.instream.cache.expiration.time5minCached UFS instream expiration time.
alluxio.worker.ufs.instream.cache.max.size5000The max entries in the UFS instream cache.
alluxio.worker.web.bind.host0.0.0.0The hostname Alluxio worker’s web server binds to. See multi-homed networks.
alluxio.worker.web.hostnameThe hostname Alluxio worker’s web UI binds to.
alluxio.worker.web.port30000The port Alluxio worker’s web UI runs on.

User Configuration

The user configuration specifies values regarding file system access.
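
As a sketch, an application could tune client behavior in conf/alluxio-site.properties (or the equivalent client-side configuration); the values here are illustrative.

# Persist new files to the under storage synchronously instead of caching only
alluxio.user.file.writetype.default=CACHE_THROUGH
# Cache data read from the under storage, but do not promote between tiers
alluxio.user.file.readtype.default=CACHE
# Use smaller blocks than the 512MB default
alluxio.user.block.size.bytes.default=128MB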

Property Name | Default | Description
alluxio.user.app.idThe custom id to use for labeling this client’s info, such as metrics. If unset, a random long will be used. This value is displayed in the client logs on initialization. Note that using the same app id will cause client info to be aggregated, so different applications must set their own ids or leave this value unset to use a randomly generated id.
alluxio.user.block.master.client.pool.gc.interval120secThe interval at which block master client GC checks occur.
alluxio.user.block.master.client.pool.gc.threshold120secA block master client is closed if it has been idle for more than this threshold.
alluxio.user.block.master.client.pool.size.max10The maximum number of block master clients cached in the block master client pool.
alluxio.user.block.master.client.pool.size.min0The minimum number of block master clients cached in the block master client pool. For long running processes, this should be set to zero.
alluxio.user.block.remote.read.buffer.size.bytes8MBThe size of the file buffer to read data from remote Alluxio worker.
alluxio.user.block.remote.reader.classalluxio.client.netty.NettyRemoteBlockReaderSelects networking stack to run the client with. Currently only alluxio.client.netty.NettyRemoteBlockReader (read remote data using netty) is valid.
alluxio.user.block.remote.writer.classalluxio.client.netty.NettyRemoteBlockWriterSelects networking stack to run the client with for block writes.
alluxio.user.block.size.bytes.default512MBDefault block size for Alluxio files.
alluxio.user.block.worker.client.pool.gc.threshold300secA block worker client is closed if it has been idle for more than this threshold.
alluxio.user.block.worker.client.pool.size.max128The maximum number of block worker clients cached in the block worker client pool.
alluxio.user.block.worker.client.read.retry5The maximum number of workers to retry before the client gives up on reading a block
alluxio.user.block.worker.client.threads10The number of threads used by a block worker client pool for heartbeating to a worker. Increase this value if worker failures affect client connections to healthy workers.
alluxio.user.conf.cluster.default.enabledtrueWhen this property is true, an Alluxio client will load the default values of configuration properties set by Alluxio master.
alluxio.user.date.format.patternMM-dd-yyyy HH:mm:ss:SSSDisplay formatted date in cli command and web UI by given date format pattern.
alluxio.user.failed.space.request.limits3The number of times to request space from the file system before aborting.
alluxio.user.file.buffer.bytes8MBThe size of the file buffer to use for file system reads/writes.
alluxio.user.file.cache.partially.read.blocktrueThis property is deprecated as of 1.7 and has no effect. Use the read type to control caching behavior.
alluxio.user.file.copyfromlocal.write.location.policy.classalluxio.client.file.policy.RoundRobinPolicyThe default location policy for choosing workers for writing a file’s blocks using copyFromLocal command.
alluxio.user.file.delete.uncheckedfalseWhether to check if the UFS contents are in sync with Alluxio before attempting to delete persisted directories recursively.
alluxio.user.file.master.client.pool.gc.interval120secThe interval at which file system master client GC checks occur.
alluxio.user.file.master.client.pool.gc.threshold120secA fs master client is closed if it has been idle for more than this threshold.
alluxio.user.file.master.client.pool.size.max10The maximum number of fs master clients cached in the fs master client pool.
alluxio.user.file.master.client.pool.size.min0The minimum number of fs master clients cached in the fs master client pool. For long running processes, this should be set to zero.
alluxio.user.file.metadata.load.typeOnceThe behavior of loading metadata from UFS. When information about a path is requested and the path does not exist in Alluxio, metadata can be loaded from the UFS. Valid options are Always, Never, and Once. Always will always access UFS to see if the path exists in the UFS. Never will never consult the UFS. Once will access the UFS the “first” time (according to a cache), but not after that. This parameter is ignored if a metadata sync is performed, via the parameter “alluxio.user.file.metadata.sync.interval”
alluxio.user.file.metadata.sync.interval-1The interval for syncing UFS metadata before invoking an operation on a path. -1 means no sync will occur. 0 means Alluxio will always sync the metadata of the path before an operation. If you specify a time interval, Alluxio will (best effort) not re-sync a path within that time interval. Syncing the metadata for a path must interact with the UFS, so it is an expensive operation. If a sync is performed for an operation, the configuration of “alluxio.user.file.metadata.load.type” will be ignored.
alluxio.user.file.passive.cache.enabledtrueWhether to cache files to local Alluxio workers when the files are read from remote workers (not UFS).
alluxio.user.file.readtype.defaultCACHE_PROMOTEDefault read type when creating Alluxio files. Valid options are CACHE_PROMOTE (move data to highest tier if already in Alluxio storage, write data into highest tier of local Alluxio if data needs to be read from under storage), CACHE (write data into highest tier of local Alluxio if data needs to be read from under storage), NO_CACHE (no data interaction with Alluxio; if the read is from Alluxio, data migration or eviction will not occur).
alluxio.user.file.seek.buffer.size.bytes1MBThe file seek buffer size. This is only used when alluxio.user.file.cache.partially.read.block is enabled.
alluxio.user.file.waitcompleted.poll1secThe time interval to poll a file for its completion status when using waitCompleted.
alluxio.user.file.write.avoid.eviction.policy.reserved.size.bytes0MBThe portion of space reserved in worker when user use the LocalFirstAvoidEvictionPolicy class as file write location policy.
alluxio.user.file.write.location.policy.classalluxio.client.file.policy.LocalFirstPolicyThe default location policy for choosing workers for writing a file’s blocks.
alluxio.user.file.write.tier.default0The default tier for choosing where to write a block. Any integer is a valid value. Non-negative values identify tiers starting from top going down (0 identifies the first tier, 1 identifies the second tier, and so on). If the provided value is greater than the number of tiers, it identifies the last tier. Negative values identify tiers starting from the bottom going up (-1 identifies the last tier, -2 identifies the second to last tier, and so on). If the absolute value of the provided value is greater than the number of tiers, it identifies the first tier.
alluxio.user.file.writetype.defaultMUST_CACHEDefault write type when creating Alluxio files. Valid options are MUST_CACHE (write will only go to Alluxio and must be stored in Alluxio), CACHE_THROUGH (try to cache, write to UnderFS synchronously), THROUGH (no cache, write to UnderFS synchronously).
alluxio.user.heartbeat.interval1secThe interval between Alluxio workers’ heartbeats.
alluxio.user.hostnameThe hostname to use for the client. Note: this property is deprecated; set alluxio.locality.node instead.
alluxio.user.lineage.enabledfalseFlag to enable lineage feature.
alluxio.user.lineage.master.client.threads10The number of threads used by a lineage master client to talk to the lineage master.
alluxio.user.local.reader.packet.size.bytes8MBWhen a client reads from a local worker, the maximum data packet size.
alluxio.user.local.writer.packet.size.bytes64KBWhen a client writes to a local worker, the maximum data packet size.
alluxio.user.metrics.collection.enabledfalseEnable collecting client-side metrics and heartbeating them to the master.
alluxio.user.metrics.heartbeat.interval3secThe time period of the client-master heartbeat that sends the client-side metrics.
alluxio.user.network.netty.channelEPOLLType of netty channels. If EPOLL is not available, this will automatically fall back to NIO.
alluxio.user.network.netty.channel.pool.disabledfalseDisable netty channel pool. This should be turned on if the client version is >= 1.3.0 but server version is <= 1.2.x.
alluxio.user.network.netty.channel.pool.gc.threshold300secA netty channel is closed if it has been idle for more than this threshold.
alluxio.user.network.netty.channel.pool.size.max1024The maximum number of netty channels cached in the netty channel pool.
alluxio.user.network.netty.channel.pool.size.min0The minimum number of netty channels cached in the netty channel pool. For long running processes, this should be set to zero.
alluxio.user.network.netty.reader.buffer.size.packets16When a client reads from a remote worker, the maximum number of packets to buffer by the client.
alluxio.user.network.netty.reader.packet.size.bytes64KBWhen a client reads from a remote worker, the maximum packet size.
alluxio.user.network.netty.timeout30secThe maximum time for a netty client (for block reads and block writes) to wait for a response from the data server.
alluxio.user.network.netty.worker.threads0How many threads to use for remote block worker client to read from remote block workers.
alluxio.user.network.netty.writer.buffer.size.packets16When a client writes to a remote worker, the maximum number of packets to buffer by the client.
alluxio.user.network.netty.writer.close.timeout30minThe timeout to close a netty writer client.
alluxio.user.network.netty.writer.packet.size.bytes64KBWhen a client writes to a remote worker, the maximum packet size.
alluxio.user.network.socket.timeout10minThe timeout of a socket created by a user to connect to the master.
alluxio.user.rpc.retry.base.sleep50msAlluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the base time in the exponential backoff.
alluxio.user.rpc.retry.max.duration2minAlluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum duration to retry for before giving up. Note that, this value is set to 5s for fs and fsadmin CLIs.
alluxio.user.rpc.retry.max.num.retry100Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum number of retries. This property has been deprecated by time-based retry using: alluxio.user.rpc.retry.max.duration
alluxio.user.rpc.retry.max.sleep3secAlluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum wait time in the backoff.
alluxio.user.short.circuit.enabledtrueIf set to true, short circuit reads/writes are enabled, allowing clients to read/write data without going through Alluxio workers when the data is local.
alluxio.user.ufs.block.read.concurrency.max2147483647The maximum concurrent readers for one UFS block on one Block Worker.
alluxio.user.ufs.block.read.location.policyalluxio.client.file.policy.LocalFirstPolicyWhen an Alluxio client reads a file from the UFS, it delegates the read to an Alluxio worker. The client uses this policy to choose which worker to read through. Builtin choices: [alluxio.client.block.policy.DeterministicHashPolicy, alluxio.client.file.policy.LocalFirstAvoidEvictionPolicy, alluxio.client.file.policy.LocalFirstPolicy, alluxio.client.file.policy.MostAvailableFirstPolicy, alluxio.client.file.policy.RoundRobinPolicy, alluxio.client.file.policy.SpecificHostPolicy].
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards1When alluxio.user.ufs.block.read.location.policy is set to alluxio.client.block.policy.DeterministicHashPolicy, this specifies the number of hash shards.
alluxio.user.ufs.delegation.read.buffer.size.bytes8MBSize of the read buffer when reading from the UFS through the Alluxio worker. Each read request will fetch at least this many bytes, unless the read reaches the end of the file.
alluxio.user.ufs.delegation.write.buffer.size.bytes2MBSize of the write buffer when writing to the UFS through the Alluxio worker. Each write request will write at least this many bytes, unless the write is at the end of the file.
alluxio.user.ufs.file.reader.classalluxio.client.netty.NettyUnderFileSystemFileReaderSelects networking stack to run the client with for reading from under file system through a worker’s data server. Currently only alluxio.client.netty.NettyUnderFileSystemFileReader (remote read using netty) is valid.
alluxio.user.ufs.file.writer.classalluxio.client.netty.NettyUnderFileSystemFileWriterSelects networking stack to run the client with for writing to under file system through a worker’s data server. Currently only alluxio.client.netty.NettyUnderFileSystemFileWriter (remote write using netty) is valid.

Resource Manager Configuration

When running Alluxio with resource managers like Mesos and YARN, Alluxio has additional configuration options.
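
For example, a YARN deployment might size the Alluxio processes with the properties below; the CPU and memory figures are placeholders, not recommendations.

alluxio.integration.master.resource.cpu=2
alluxio.integration.master.resource.mem=4GB
alluxio.integration.worker.resource.cpu=2
alluxio.integration.worker.resource.mem=8GB
alluxio.integration.yarn.workers.per.host.max=1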

Property Name | Default | Description
alluxio.integration.master.resource.cpu1The number of CPUs to run an Alluxio master for YARN framework.
alluxio.integration.master.resource.mem1024MBThe amount of memory to run an Alluxio master for YARN framework.
alluxio.integration.mesos.alluxio.jar.urlhttp://downloads.alluxio.org/downloads/files/${alluxio.version}/alluxio-${alluxio.version}-bin.tar.gzURL to download an Alluxio distribution from during Mesos deployment.
alluxio.integration.mesos.jdk.pathjdk1.8.0_151If installing Java from a remote URL during Mesos deployment, this must be set to the directory name of the untarred JDK.
alluxio.integration.mesos.jdk.urlLOCALA URL from which to install the JDK during Mesos deployment. Defaults to LOCAL, which tells Mesos to use the local JDK on the system. When using this property, alluxio.integration.mesos.jdk.path must also be set correctly.
alluxio.integration.mesos.master.nameAlluxioMasterThe name of the master process to use within Mesos.
alluxio.integration.mesos.master.node.count1The number of Alluxio master processes to run within Mesos.
alluxio.integration.mesos.principalalluxioThe Mesos principal for the Alluxio Mesos Framework.
alluxio.integration.mesos.role*Mesos role for the Alluxio Mesos Framework.
alluxio.integration.mesos.secretSecret token for authenticating with Mesos.
alluxio.integration.mesos.userThe Mesos user for the Alluxio Mesos Framework. Defaults to the current user.
alluxio.integration.mesos.worker.nameAlluxioWorkerThe name of the worker process to use within Mesos.
alluxio.integration.worker.resource.cpu1The number of CPUs to run an Alluxio worker for YARN framework.
alluxio.integration.worker.resource.mem1024MBThe amount of memory to run an Alluxio worker for YARN framework.
alluxio.integration.yarn.workers.per.host.max1The number of workers to run on an Alluxio host for YARN framework.

Security Configuration

The security configuration specifies information regarding the security features, such as authentication and file permissions. Authentication settings take effect for the master, worker, and user. File permission settings only take effect for the master. See Security for more information about security features.
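
A minimal sketch of enabling simple authentication and permission checks in conf/alluxio-site.properties (the umask value is illustrative):

# Simple authentication: the server trusts the username reported by the client
alluxio.security.authentication.type=SIMPLE
# Enforce file permission checks on the master
alluxio.security.authorization.permission.enabled=true
# With this umask, new directories get permission 750 and new files get 640
alluxio.security.authorization.permission.umask=027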

Property Name | Default | Description
alluxio.security.authentication.custom.provider.classThe class to provide customized authentication implementation, when alluxio.security.authentication.type is set to CUSTOM. It must implement the interface ‘alluxio.security.authentication.AuthenticationProvider’.
alluxio.security.authentication.typeSIMPLEThe authentication mode. Currently three modes are supported: NOSASL, SIMPLE, CUSTOM. The default value SIMPLE indicates that simple authentication is enabled: the server trusts whoever the client claims to be.
alluxio.security.authorization.permission.enabledtrueWhether to enable access control based on file permission.
alluxio.security.authorization.permission.supergroupsupergroupThe super group of Alluxio file system. All users in this group have super permission.
alluxio.security.authorization.permission.umask022The umask of creating file and directory. The initial creation permission is 777, and the difference between directory and file is 111. So for default umask value 022, the created directory has permission 755 and file has permission 644.
alluxio.security.group.mapping.cache.timeout1minTime for cached group mapping to expire.
alluxio.security.group.mapping.classalluxio.security.group.provider.ShellBasedUnixGroupsMappingThe class to provide the user-to-groups mapping service, which the master uses to look up the group memberships of a given user. It must implement the interface ‘alluxio.security.group.GroupMappingService’. The default implementation executes the ‘groups’ shell command to fetch the group memberships of a given user.
alluxio.security.login.impersonation.usernameHDFS_USERWhen alluxio.security.authentication.type is set to SIMPLE or CUSTOM, a user application uses this property to indicate the IMPERSONATED user requesting the Alluxio service. If it is not set explicitly, or set to NONE, impersonation will not be used. A special value of ‘HDFS_USER’ can be specified to impersonate the Hadoop client user.
alluxio.security.login.usernameWhen alluxio.security.authentication.type is set to SIMPLE or CUSTOM, user application uses this property to indicate the user requesting Alluxio service. If it is not set explicitly, the OS login user will be used.