Testing tools

Testing is one of the most important parts of developing a distributed system. We have the following types of tests.

This page describes the existing test tools that are part of the Ozone source base.

Note: we have more tests (such as TPC-DS and TPC-H tests via Spark or Hive) which are not included here because they use external tools only.

Unit test

Like almost every Java project, we have the good old unit tests inside each of our projects.

Integration test (JUnit)

Traditional unit tests are supposed to test only one unit, but we also have higher-level unit tests. They use MiniOzoneCluster, a helper that starts real daemons (SCM, OM, datanodes) during the unit test.

From the Maven/Java point of view they are just simple unit tests (the JUnit library is used), but to separate them (and solve some dependency problems) we moved all of these tests to hadoop-ozone/integration-test.

Smoketest

We use a docker-compose based pseudo-cluster to run different configurations of Ozone. To be sure that the different configurations can be started, we implemented acceptance tests with the help of https://robotframework.org/.

The smoketests are available from the distribution (./smoketest), but the robot files define only the tests: usually they start a CLI and check the output.
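The "start a CLI and check the output" pattern can be sketched in a few lines of Python. This is only an illustrative analogue, not the Robot Framework code used by the smoketests: the helper name run_and_check and the stand-in command are hypothetical, chosen so the sketch runs without an Ozone installation.

```python
import subprocess
import sys


def run_and_check(command, expected):
    """Run a CLI command and verify that its standard output contains `expected`."""
    # check=True makes a non-zero exit code fail the test immediately.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    if expected not in result.stdout:
        raise AssertionError(f"{expected!r} not found in output: {result.stdout!r}")
    return result.stdout


# Stand-in command (hypothetical); a real smoketest would invoke the ozone CLI here.
STAND_IN = [sys.executable, "-c", "print('volume created')"]
output = run_and_check(STAND_IN, "volume created")
```

In a real environment the command would have to be executed inside the right container, which is what the environment-specific definitions take care of.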

To run the tests in different environments (docker-compose, Kubernetes) you need a definition to start the containers and execute the right tests in the right containers.

These test definitions are included in the compose directory (check ./compose/*/test.sh or ./compose/test-all.sh).

For example a simple way to test the distribution package:

  cd compose/ozone
  ./test.sh

Blockade

Blockade is a tool to test network failures and partitions (it’s inspired by the legendary Jepsen tests).

Blockade tests are implemented with the help of pytest and can be started from the ./blockade directory of the distribution.

  cd blockade
  pip install pytest==2.8.7 blockade
  python -m pytest -s .

See the README in the blockade directory for more details.

MiniChaosOzoneCluster

This is a way to get chaos on your machine. It can be started from the source code, and it starts and randomly kills a MiniOzoneCluster (which runs real daemons).

Freon

Freon is a command line application which is included in the Ozone distribution. It’s a load generator which is used in our stress tests.

Random keys:

In randomkeys mode, the data written into the Ozone cluster is randomly generated. Each key is 10 KB in size.

The number of volumes/buckets/keys can be configured. The replication type and factor (e.g. replicate with Ratis to 3 nodes) can also be configured.

For more information use:

bin/ozone freon --help

For example:

  ozone freon randomkeys --num-of-volumes=10 --num-of-buckets 10 --num-of-keys 10 --replication-type=RATIS --factor=THREE

  ***************************************************
  Status: Success
  Git Base Revision: 48aae081e5afacbb3240657556b26c29e61830c3
  Number of Volumes created: 10
  Number of Buckets created: 100
  Number of Keys added: 1000
  Ratis replication factor: THREE
  Ratis replication type: RATIS
  Average Time spent in volume creation: 00:00:00,035
  Average Time spent in bucket creation: 00:00:00,319
  Average Time spent in key creation: 00:00:03,659
  Average Time spent in key write: 00:00:10,894
  Total bytes written: 10240000
  Total Execution time: 00:00:16,898
  ***************************************************
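The totals in this summary follow from the command-line flags. The short Python sketch below reproduces the arithmetic; it assumes (as the reported numbers suggest) that the bucket count is per volume and the key count is per bucket, and the function name expected_totals is made up for illustration.

```python
# 10 KB per key, as stated for randomkeys mode.
KEY_SIZE_BYTES = 10 * 1024


def expected_totals(num_volumes, buckets_per_volume, keys_per_bucket,
                    key_size=KEY_SIZE_BYTES):
    """Derive the totals Freon should report from the per-level counts."""
    buckets = num_volumes * buckets_per_volume
    keys = buckets * keys_per_bucket
    return {
        "volumes": num_volumes,
        "buckets": buckets,
        "keys": keys,
        "bytes": keys * key_size,
    }


totals = expected_totals(10, 10, 10)
print(totals)
# {'volumes': 10, 'buckets': 100, 'keys': 1000, 'bytes': 10240000}
```

The result matches the 100 buckets, 1000 keys and 10240000 bytes reported by the run.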

Genesis

Genesis is a microbenchmarking tool. It’s also included in the distribution (ozone genesis), but it doesn’t require a real cluster. It measures different parts of the code in an isolated way (e.g. the code which saves the data to the local RocksDB-based key-value stores).

Example run:

  ozone genesis -benchmark=BenchMarkRocksDbStore

  # JMH version: 1.19
  # VM version: JDK 11.0.1, VM 11.0.1+13-LTS
  # VM invoker: /usr/lib/jvm/java-11-openjdk-11.0.1.13-3.el7_6.x86_64/bin/java
  # VM options: -Dproc_genesis -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender
  # Warmup: 2 iterations, 1 s each
  # Measurement: 20 iterations, 1 s each
  # Timeout: 10 min per iteration
  # Threads: 4 threads, will synchronize iterations
  # Benchmark mode: Throughput, ops/time
  # Benchmark: org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test
  # Parameters: (backgroundThreads = 4, blockSize = 8, maxBackgroundFlushes = 4, maxBytesForLevelBase = 512, maxOpenFiles = 5000, maxWriteBufferNumber = 16, writeBufferSize = 64)

  # Run progress: 0.00% complete, ETA 00:00:22
  # Fork: 1 of 1
  # Warmup Iteration 1: 213775.360 ops/s
  # Warmup Iteration 2: 32041.633 ops/s
  Iteration 1: 196342.348 ops/s
  ·stack: <delayed till summary>
  Iteration 2: 41926.816 ops/s
  ·stack: <delayed till summary>
  Iteration 3: 210433.231 ops/s
  ·stack: <delayed till summary>
  Iteration 4: 46941.951 ops/s
  ·stack: <delayed till summary>
  Iteration 5: 212825.884 ops/s
  ·stack: <delayed till summary>
  Iteration 6: 145914.351 ops/s
  ·stack: <delayed till summary>
  Iteration 7: 141838.469 ops/s
  ·stack: <delayed till summary>
  Iteration 8: 205334.438 ops/s
  ·stack: <delayed till summary>
  Iteration 9: 163709.519 ops/s
  ·stack: <delayed till summary>
  Iteration 10: 162494.608 ops/s
  ·stack: <delayed till summary>
  Iteration 11: 199155.793 ops/s
  ·stack: <delayed till summary>
  Iteration 12: 209679.298 ops/s
  ·stack: <delayed till summary>
  Iteration 13: 193787.574 ops/s
  ·stack: <delayed till summary>
  Iteration 14: 127004.147 ops/s
  ·stack: <delayed till summary>
  Iteration 15: 145511.080 ops/s
  ·stack: <delayed till summary>
  Iteration 16: 223433.864 ops/s
  ·stack: <delayed till summary>
  Iteration 17: 169752.665 ops/s
  ·stack: <delayed till summary>
  Iteration 18: 165217.191 ops/s
  ·stack: <delayed till summary>
  Iteration 19: 191038.476 ops/s
  ·stack: <delayed till summary>
  Iteration 20: 196335.579 ops/s
  ·stack: <delayed till summary>

  Result "org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test":
  167433.864 ±(99.9%) 43530.883 ops/s [Average]
  (min, avg, max) = (41926.816, 167433.864, 223433.864), stdev = 50130.230
  CI (99.9%): [123902.981, 210964.748] (assumes normal distribution)

  Secondary result "org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test:·stack":
  Stack profiler:
  ....[Thread state distributions]....................................................................
  78.9% RUNNABLE
  20.0% TIMED_WAITING
  1.1% WAITING
  ....[Thread state: RUNNABLE]........................................................................
  59.8% 75.8% org.rocksdb.RocksDB.put
  16.5% 20.9% org.rocksdb.RocksDB.get
  0.7% 0.9% java.io.UnixFileSystem.delete0
  0.7% 0.9% org.rocksdb.RocksDB.disposeInternal
  0.3% 0.4% java.lang.Long.formatUnsignedLong0
  0.1% 0.2% org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test
  0.1% 0.1% java.lang.Long.toUnsignedString0
  0.1% 0.1% org.apache.hadoop.ozone.genesis.generated.BenchMarkRocksDbStore_test_jmhTest.test_thrpt_jmhStub
  0.0% 0.1% java.lang.Object.clone
  0.0% 0.0% java.lang.Thread.currentThread
  0.4% 0.5% <other>
  ....[Thread state: TIMED_WAITING]...................................................................
  20.0% 100.0% java.lang.Object.wait
  ....[Thread state: WAITING].........................................................................
  1.1% 100.0% jdk.internal.misc.Unsafe.park

  # Run complete. Total time: 00:00:38

  Benchmark (backgroundThreads) (blockSize) (maxBackgroundFlushes) (maxBytesForLevelBase) (maxOpenFiles) (maxWriteBufferNumber) (writeBufferSize) Mode Cnt Score Error Units
  BenchMarkRocksDbStore.test 4 8 4 512 5000 16 64 thrpt 20 167433.864 ± 43530.883 ops/s
  BenchMarkRocksDbStore.test:·stack 4 8 4 512 5000 16 64 thrpt NaN ---
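
To make the summary easier to read, the Python sketch below recomputes the (min, avg, max) figures from the 20 measured iterations of the run above. It is not part of Genesis or JMH; the function name summarize and the regular expression are assumptions made for this illustration. Warmup lines are skipped because they start with "#".

```python
import re
import statistics

# Measured iterations copied from the example run, plus one warmup line
# to show that warmup scores are excluded from the summary.
JMH_OUTPUT = """\
# Warmup Iteration 2: 32041.633 ops/s
Iteration 1: 196342.348 ops/s
Iteration 2: 41926.816 ops/s
Iteration 3: 210433.231 ops/s
Iteration 4: 46941.951 ops/s
Iteration 5: 212825.884 ops/s
Iteration 6: 145914.351 ops/s
Iteration 7: 141838.469 ops/s
Iteration 8: 205334.438 ops/s
Iteration 9: 163709.519 ops/s
Iteration 10: 162494.608 ops/s
Iteration 11: 199155.793 ops/s
Iteration 12: 209679.298 ops/s
Iteration 13: 193787.574 ops/s
Iteration 14: 127004.147 ops/s
Iteration 15: 145511.080 ops/s
Iteration 16: 223433.864 ops/s
Iteration 17: 169752.665 ops/s
Iteration 18: 165217.191 ops/s
Iteration 19: 191038.476 ops/s
Iteration 20: 196335.579 ops/s
"""

# Matches only lines that begin with "Iteration", so "# Warmup ..." is ignored.
ITERATION = re.compile(r"^Iteration\s+\d+:\s+([\d.]+) ops/s", re.MULTILINE)


def summarize(jmh_output):
    """Return count, min, average and max of the measured iteration scores."""
    scores = [float(value) for value in ITERATION.findall(jmh_output)]
    return {
        "count": len(scores),
        "min": min(scores),
        "avg": round(statistics.mean(scores), 3),
        "max": max(scores),
    }


summary = summarize(JMH_OUTPUT)
print(summary)
# {'count': 20, 'min': 41926.816, 'avg': 167433.864, 'max': 223433.864}
```

The recomputed values agree with the "(min, avg, max)" line of the result. The wide spread between the slowest and fastest iteration also explains the large error term in the final table.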