Ozone can use topology-related information (for example, rack placement) to optimize read and write pipelines. A fully rack-aware cluster requires three separate pieces of configuration:
- The topology information must be configured for Ozone.
- Topology-related information should be used when Ozone chooses 3 different datanodes for a specific pipeline/container (write path).
- When Ozone reads a key, it should prefer to read from the closest node (read path).
Ozone uses RAFT replication for open containers (write) and asynchronous replication for closed, immutable containers (cold data). Because RAFT requires a low-latency network, topology-aware placement is available only for closed containers. See the Containers page for more information about open vs. closed containers.
The topology hierarchy can be configured with the net.topology.node.switch.mapping.impl configuration key. Its value should name an implementation of org.apache.hadoop.net.CachedDNSToSwitchMapping. As this is a Hadoop class, the configuration is exactly the same as the Hadoop configuration.
A static list can be configured with the help of
The second configuration option should point to a text file. The file is a two-column text file, with columns separated by whitespace. The first column is a DNS name or IP address, and the second column specifies the rack to which the address maps. If no entry corresponding to a host in the cluster is found, /default-rack is assumed.
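The file format and the fallback described above can be illustrated with a short Python sketch. It is not Hadoop's implementation; the function names `parse_rack_table` and `resolve_rack`, the sample hosts, and the choice to skip malformed lines are assumptions made for illustration only.

```python
DEFAULT_RACK = "/default-rack"


def parse_rack_table(text):
    """Parse a two-column, whitespace-separated mapping of host -> rack."""
    table = {}
    for line in text.splitlines():
        columns = line.split()
        # Skip blank or malformed lines (an assumption of this sketch).
        if len(columns) != 2:
            continue
        host, rack = columns
        table[host] = rack
    return table


def resolve_rack(table, host):
    """Return the rack for a host, falling back to the default rack."""
    return table.get(host, DEFAULT_RACK)


# Hypothetical file content: DNS name or IP address, then the rack.
SAMPLE = """
10.5.0.4      /rack1
10.5.0.5      /rack1
dn3.example   /rack2
"""

table = parse_rack_table(SAMPLE)
print(resolve_rack(table, "10.5.0.4"))   # a listed host
print(resolve_rack(table, "10.5.0.99"))  # an unlisted host
```

Any host that does not appear in the first column ends up in the default rack, so a partially filled file silently groups the remaining nodes together.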
Rack information can also be identified with the help of an external script.
An external script is specified with the net.topology.script.file.name parameter in the configuration files. Unlike the Java class, the external topology script is not included with the Ozone distribution and must be provided by the administrator. Ozone passes multiple IP addresses as arguments (ARGV) when forking the topology script. The number of IP addresses sent to the topology script in one call is controlled with net.topology.script.number.args and defaults to 100. If net.topology.script.number.args is changed to 1, the topology script is forked once for each IP address submitted.
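A topology script of this kind could look like the following Python sketch. The subnet-to-rack table is hypothetical and must be replaced with the real layout of the cluster. The sketch also assumes that the caller expects one rack name per argument on standard output, separated by whitespace and in the same order as the arguments.

```python
#!/usr/bin/env python3
import ipaddress
import sys

# Hypothetical subnet-to-rack table; replace it with your own layout.
SUBNET_TO_RACK = {
    ipaddress.ip_network("10.5.0.0/24"): "/rack1",
    ipaddress.ip_network("10.5.1.0/24"): "/rack2",
}
DEFAULT_RACK = "/default-rack"


def rack_for(address):
    """Map one IP address to a rack name, or to the default rack."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not a valid IP address (for example a host name).
        return DEFAULT_RACK
    for subnet, rack in SUBNET_TO_RACK.items():
        if ip in subnet:
            return rack
    return DEFAULT_RACK


def main(argv):
    # Print one rack per argument, in the order the arguments were given.
    print(" ".join(rack_for(arg) for arg in argv))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Because the script receives up to net.topology.script.number.args addresses per invocation, it handles any number of arguments instead of assuming exactly one.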
Placement of closed containers can be configured with the ozone.scm.container.placement.impl configuration key. The available container placement policies can be found in the
By default, SCMContainerPlacementRandom is used. For topology awareness, SCMContainerPlacementRackAware can be used.
This placement policy complies with the algorithm used in HDFS. With the default of 3 replicas, two replicas will be on the same rack and the third will be on a different rack.
This implementation applies to network topologies like “/rack/node”. It is not recommended if the network topology has more layers.
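The replica layout described above (two replicas on one rack, the third on another) can be sketched in Python as follows. This is only an illustration of the resulting layout, not the SCMContainerPlacementRackAware implementation: the function `choose_rack_aware` and the sample cluster are hypothetical, and the sketch ignores factors a real placement policy has to consider, such as free space and node health.

```python
import random
from collections import Counter, defaultdict


def choose_rack_aware(datanodes, rng=None):
    """Pick 3 datanodes: two on one rack, the third on a different rack.

    `datanodes` maps a node name to its rack, e.g. {"dn1": "/rack1"}.
    """
    rng = rng or random.Random()
    by_rack = defaultdict(list)
    for node, rack in datanodes.items():
        by_rack[rack].append(node)

    # Racks that can host the two co-located replicas.
    pair_racks = sorted(r for r, nodes in by_rack.items() if len(nodes) >= 2)
    if not pair_racks or len(by_rack) < 2:
        raise ValueError("need two racks, one of them with two or more nodes")

    pair_rack = rng.choice(pair_racks)
    pair = rng.sample(sorted(by_rack[pair_rack]), 2)

    # The third replica goes to any node on a different rack.
    other_rack = rng.choice(sorted(r for r in by_rack if r != pair_rack))
    third = rng.choice(sorted(by_rack[other_rack]))
    return pair + [third]


# Hypothetical cluster with a two-level "/rack/node" topology.
cluster = {
    "dn1": "/rack1", "dn2": "/rack1",
    "dn3": "/rack2", "dn4": "/rack2",
    "dn5": "/rack3",
}
chosen = choose_rack_aware(cluster, random.Random(42))
rack_counts = Counter(cluster[node] for node in chosen)
print(chosen, dict(rack_counts))
```

Losing a whole rack leaves at least one replica available, while two of the three copies can still be written within a single rack.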
Finally, the read path should also be configured to read data from the closest pipeline.
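What "closest" means for a two-level topology can be sketched in Python. The functions `distance` and `sort_by_distance` are hypothetical, and the distance values (0 for the same node, 2 for the same rack, 4 for different racks) are an assumption of this sketch that counts the hops from each node up to their common ancestor.

```python
def distance(a, b):
    """Distance between two "/rack/node" locations.

    0 = same node, 2 = same rack, 4 = different racks.
    """
    if a == b:
        return 0
    rack_a = a.rsplit("/", 1)[0]
    rack_b = b.rsplit("/", 1)[0]
    return 2 if rack_a == rack_b else 4


def sort_by_distance(client, replicas):
    """Order replica locations from closest to farthest from the client."""
    return sorted(replicas, key=lambda location: distance(client, location))


# Hypothetical replica locations for one container.
replicas = ["/rack2/dn4", "/rack1/dn2", "/rack1/dn1"]
ordered = sort_by_distance("/rack1/dn1", replicas)
print(ordered)  # ['/rack1/dn1', '/rack1/dn2', '/rack2/dn4']
```

A client running on /rack1/dn1 would try the local replica first, then the replica on the same rack, and only then the replica on the remote rack.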
- Hadoop documentation about
- Design doc