GPU acceleration

GPU acceleration is an experimental feature. For updates on the progress of GPU acceleration, or if you want to leave feedback that could help improve the feature, join the discussion in the OpenSearch forum.

When running a natural language processing (NLP) model in your OpenSearch cluster with a machine learning (ML) node, you can achieve better performance on the ML node using graphics processing unit (GPU) acceleration. GPUs can work in tandem with the CPUs in your cluster to speed up model upload and training.

Supported GPUs

Currently, ML nodes support the following GPU instances:

- NVIDIA GPU instances with CUDA 11.6
- AWS Inferentia instances

If you need GPU power, you can provision GPU instances through Amazon Elastic Compute Cloud (Amazon EC2). For more information on how to provision a GPU instance, see Recommended GPU Instances.
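If you prefer the command line, launching an instance might look like the following AWS CLI sketch. The AMI ID, instance type, and key pair shown here are placeholders, not recommendations:

```bash
# Minimal sketch: launch a single accelerator instance with the AWS CLI.
# Replace the placeholder AMI ID, instance type, and key pair with your own values.
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type inf1.xlarge \
  --count 1 \
  --key-name my-key-pair
```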

Supported images

You can use GPU acceleration with both Docker images that include CUDA 11.6 and Amazon Machine Images (AMIs).
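For example, on a host with the NVIDIA Container Toolkit installed, you can expose the host GPUs to an OpenSearch container using Docker's --gpus flag. This is a minimal sketch for a local test; the image tag and single-node setting are assumptions:

```bash
# Minimal sketch: run an OpenSearch container with all host GPUs attached.
# Assumes the NVIDIA Container Toolkit is installed on the host.
docker run -d --gpus all \
  -p 9200:9200 -p 9600:9600 \
  -e "discovery.type=single-node" \
  opensearchproject/opensearch:2.5.0
```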

PyTorch

GPU-accelerated ML nodes require PyTorch 1.12.1 to work with ML models.
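To confirm which PyTorch version is installed in your Python environment, you can run a quick check (a sketch; assumes python refers to the environment your ML tooling uses):

```bash
# Print the installed PyTorch version; it should report 1.12.1
python -c "import torch; print(torch.__version__)"
```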

Setting up a GPU-accelerated ML node

Depending on the GPU, you can provision a GPU-accelerated ML node manually or by using automated initialization scripts.

Preparing an NVIDIA ML node

NVIDIA uses CUDA to increase node performance. To take advantage of CUDA, your drivers must expose the nvidia-uvm kernel module as a device node inside the /dev directory. To check for it, enter ls -al /dev | grep nvidia-uvm.
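If the device nodes exist, the check prints entries similar to the following (the major and minor numbers vary by system; this output is illustrative only):

```bash
ls -al /dev | grep nvidia-uvm
# Example output (illustrative only):
# crw-rw-rw- 1 root root 234, 0 Jan 01 00:00 nvidia-uvm
# crw-rw-rw- 1 root root 234, 1 Jan 01 00:00 nvidia-uvm-tools
```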

If the nvidia-uvm device node does not exist, run the following nvidia-uvm-init.sh script:

```bash
#!/bin/bash
## Script to initialize nvidia device nodes.
## https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-verifications

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`

  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  mknod -m 666 /dev/nvidiactl c 195 255
else
  exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`

  mknod -m 666 /dev/nvidia-uvm c $D 0
  mknod -m 666 /dev/nvidia-uvm-tools c $D 0
else
  exit 1
fi
```

After verifying that nvidia-uvm exists under /dev, you can start OpenSearch inside your cluster.
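If you want the check and initialization to happen automatically before startup, you can wrap them in a small guard script. This is a sketch under two assumptions: nvidia-uvm-init.sh is in the current directory, and OPENSEARCH_HOME points to your OpenSearch installation:

```bash
#!/bin/bash
# Initialize the NVIDIA UVM device nodes if they are missing, then start OpenSearch.
if ! ls /dev | grep -q nvidia-uvm; then
  sudo ./nvidia-uvm-init.sh || { echo "nvidia-uvm initialization failed" >&2; exit 1; }
fi
"$OPENSEARCH_HOME"/bin/opensearch
```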

Preparing an AWS Inferentia ML node

Depending on the Linux operating system running on AWS Inferentia, you can use the following commands and scripts to provision an ML node and run OpenSearch inside your cluster.

To start, download and install OpenSearch on your cluster.

Then extract OpenSearch and set up your environment variables. This example extracts OpenSearch into the opensearch-2.5.0 directory, so OPENSEARCH_HOME=~/opensearch-2.5.0:

  1. echo "export OPENSEARCH_HOME=~/opensearch-2.5.0" | tee -a ~/.bash_profile
  2. echo "export PYTORCH_VERSION=1.12.1" | tee -a ~/.bash_profile
  3. source ~/.bash_profile
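To confirm that the variables are set in your current shell, you can echo them back (a quick sanity check, not part of the original setup):

```bash
# Both values should print; if they are empty, re-run: source ~/.bash_profile
echo "$OPENSEARCH_HOME"   # e.g. /home/<user>/opensearch-2.5.0
echo "$PYTORCH_VERSION"   # 1.12.1
```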

Next, create a shell script file called prepare_torch_neuron.sh. You can copy and customize one of the following examples based on your Linux operating system:

After you’ve run the script, exit your current terminal and open a new one before starting OpenSearch.

GPU acceleration has been tested only on Ubuntu 20.04 and Amazon Linux 2. However, other Linux operating systems may also work.

Ubuntu 20.04

```bash
. /etc/os-release
sudo tee /etc/apt/sources.list.d/neuron.list > /dev/null <<EOF
deb https://apt.repos.neuron.amazonaws.com ${VERSION_CODENAME} main
EOF
wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | sudo apt-key add -

# Update OS packages
sudo apt-get update -y

################################################################################################################
# To install or update to Neuron versions 1.19.1 and newer from previous releases:
# - DO NOT skip the 'aws-neuron-dkms' install or upgrade step; you MUST install or upgrade to the latest Neuron driver
################################################################################################################

# Install OS headers
sudo apt-get install linux-headers-$(uname -r) -y

# Install Neuron Driver
sudo apt-get install aws-neuronx-dkms -y

####################################################################################
# Warning: If the Linux kernel is updated as a result of the OS package update,
# the Neuron driver (aws-neuron-dkms) should be reinstalled after reboot
####################################################################################

# Install Neuron Tools
sudo apt-get install aws-neuronx-tools -y

######################################################
# Only for Ubuntu 20 - Install Python 3.7
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get install python3.7
######################################################

# Install Python venv and activate a Python virtual environment to install
# the Neuron pip packages.
cd ~
sudo apt-get install -y python3.7-venv g++
python3.7 -m venv pytorch_venv
source pytorch_venv/bin/activate
pip install -U pip

# Set the pip repository to point to the Neuron repository
pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com

# Install Neuron PyTorch
pip install torch-neuron torchvision

# If you need to trace the Neuron model, install torch-neuron with this command instead:
# pip install torch-neuron neuron-cc[tensorflow] "protobuf==3.20.1" torchvision
# If you need to trace a Hugging Face model, also install transformers:
# pip install transformers

# Copy the torch-neuron lib to OpenSearch
PYTORCH_NEURON_LIB_PATH=~/pytorch_venv/lib/python3.7/site-packages/torch_neuron/lib/
mkdir -p $OPENSEARCH_HOME/lib/torch_neuron; cp -r $PYTORCH_NEURON_LIB_PATH/ $OPENSEARCH_HOME/lib/torch_neuron
export PYTORCH_EXTRA_LIBRARY_PATH=$OPENSEARCH_HOME/lib/torch_neuron/lib/libtorchneuron.so
echo "export PYTORCH_EXTRA_LIBRARY_PATH=$OPENSEARCH_HOME/lib/torch_neuron/lib/libtorchneuron.so" | tee -a ~/.bash_profile

# Increase the JVM stack size to >= 2 MB
echo "-Xss2m" | tee -a $OPENSEARCH_HOME/config/jvm.options

# Increase the max file descriptors to 65535
echo "$(whoami) - nofile 65535" | sudo tee -a /etc/security/limits.conf

# Increase max virtual memory areas vm.max_map_count to 262144
sudo sysctl -w vm.max_map_count=262144
```
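Before opening a new terminal, you can sanity-check the installation. neuron-ls ships with the aws-neuronx-tools package; the import check assumes the pytorch_venv environment created above:

```bash
# Confirm the Neuron kernel module is loaded
lsmod | grep neuron
# List the Inferentia devices visible to the Neuron tooling
/opt/aws/neuron/bin/neuron-ls
# Confirm torch-neuron imports cleanly in the virtual environment
~/pytorch_venv/bin/python -c "import torch, torch_neuron; print(torch.__version__)"
```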

Amazon Linux 2

```bash
# Configure Linux for Neuron repository updates
sudo tee /etc/yum.repos.d/neuron.repo > /dev/null <<EOF
[neuron]
name=Neuron YUM Repository
baseurl=https://yum.repos.neuron.amazonaws.com
enabled=1
metadata_expire=0
EOF
sudo rpm --import https://yum.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB

# Update OS packages
sudo yum update -y

################################################################################################################
# To install or update to Neuron versions 1.19.1 and newer from previous releases:
# - DO NOT skip the 'aws-neuron-dkms' install or upgrade step; you MUST install or upgrade to the latest Neuron driver
################################################################################################################

# Install OS headers
sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r) -y

# Install Neuron Driver
####################################################################################
# Warning: If the Linux kernel is updated as a result of the OS package update,
# the Neuron driver (aws-neuron-dkms) should be reinstalled after reboot
####################################################################################
sudo yum install aws-neuronx-dkms -y

# Install Neuron Tools
sudo yum install aws-neuronx-tools -y

# Install Python venv and activate a Python virtual environment to install
# the Neuron pip packages.
cd ~
sudo yum install -y python3.7-venv gcc-c++
python3.7 -m venv pytorch_venv
source pytorch_venv/bin/activate
pip install -U pip

# Set the pip repository to point to the Neuron repository
pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com

# Install Neuron PyTorch
pip install torch-neuron torchvision

# If you need to trace the Neuron model, install torch-neuron with this command instead:
# pip install torch-neuron neuron-cc[tensorflow] "protobuf<4" torchvision
# If you need to trace a Hugging Face model, also install transformers:
# pip install transformers

# Copy the torch-neuron lib to OpenSearch
PYTORCH_NEURON_LIB_PATH=~/pytorch_venv/lib/python3.7/site-packages/torch_neuron/lib/
mkdir -p $OPENSEARCH_HOME/lib/torch_neuron; cp -r $PYTORCH_NEURON_LIB_PATH/ $OPENSEARCH_HOME/lib/torch_neuron
export PYTORCH_EXTRA_LIBRARY_PATH=$OPENSEARCH_HOME/lib/torch_neuron/lib/libtorchneuron.so
echo "export PYTORCH_EXTRA_LIBRARY_PATH=$OPENSEARCH_HOME/lib/torch_neuron/lib/libtorchneuron.so" | tee -a ~/.bash_profile

# Increase the JVM stack size to >= 2 MB
echo "-Xss2m" | tee -a $OPENSEARCH_HOME/config/jvm.options

# Increase the max file descriptors to 65535
echo "$(whoami) - nofile 65535" | sudo tee -a /etc/security/limits.conf

# Increase max virtual memory areas vm.max_map_count to 262144
sudo sysctl -w vm.max_map_count=262144
```

When the script finishes running, open a new terminal for the settings to take effect. Then start OpenSearch.
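Starting OpenSearch from a tarball installation typically looks like the following (a sketch; assumes the OPENSEARCH_HOME variable from the earlier export):

```bash
# Run OpenSearch in the foreground from the extracted directory
cd "$OPENSEARCH_HOME"
./bin/opensearch
```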

OpenSearch should now be running inside your GPU-accelerated cluster. However, if any errors occur during provisioning, you can install the GPU accelerator drivers manually.

Preparing an ML node manually

If the previous two scripts do not provision your GPU-accelerated node properly, you can install the drivers for AWS Inferentia manually:

  1. Deploy an AWS accelerator instance based on your chosen Linux operating system. For instructions, see Deploy on AWS accelerator instance.

  2. Set the OPENSEARCH_HOME environment variable to point to your OpenSearch directory. The following command uses a directory named opensearch-2.5.0:

     ```bash
     OPENSEARCH_HOME=~/opensearch-2.5.0
     ```

  3. Copy the Neuron library into OpenSearch and set the PYTORCH_EXTRA_LIBRARY_PATH path. This example assumes the pytorch_venv virtual environment created by the earlier scripts; a sanity check for the exported path follows this list:

     ```bash
     PYTORCH_NEURON_LIB_PATH=~/pytorch_venv/lib/python3.7/site-packages/torch_neuron/lib/
     mkdir -p $OPENSEARCH_HOME/lib/torch_neuron; cp -r $PYTORCH_NEURON_LIB_PATH/ $OPENSEARCH_HOME/lib/torch_neuron
     export PYTORCH_EXTRA_LIBRARY_PATH=$OPENSEARCH_HOME/lib/torch_neuron/lib/libtorchneuron.so
     ```

  4. (Optional) To monitor the GPU usage of your accelerator instance, install Neuron tools, which allow you to monitor the models running inside your instance:

     ```bash
     # Install Neuron Tools
     sudo apt-get install aws-neuronx-tools -y
     # Add Neuron tools to your PATH
     export PATH=/opt/aws/neuron/bin:$PATH
     # Test Neuron tools
     neuron-top
     ```

  5. To make sure you have enough memory to upload a model, increase the JVM stack size to >= 2 MB:

     ```bash
     echo "-Xss2m" | sudo tee -a $OPENSEARCH_HOME/config/jvm.options
     ```

  6. Start OpenSearch.
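As the sanity check mentioned in step 3, you can confirm that the exported path actually points to the Neuron shared library before starting OpenSearch (a sketch, not part of the original steps):

```bash
# The file should exist and be non-empty
ls -l "$PYTORCH_EXTRA_LIBRARY_PATH"
```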

Troubleshooting

Due to the amount of data required to work with ML models, you might encounter the following max file descriptors or vm.max_map_count errors when trying to run OpenSearch in your cluster:

```
[1]: max file descriptors [8192] for opensearch process is too low, increase to at least [65535]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
```

To troubleshoot the max file descriptors error, run the following command:

```bash
echo "$(whoami) - nofile 65535" | sudo tee -a /etc/security/limits.conf
```
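Changes to /etc/security/limits.conf take effect at the next login, so log out and back in, then verify the new limit:

```bash
# Should print 65535 after you log back in
ulimit -n
```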

To fix the vm.max_map_count error, run this command to increase the count to 262144:

```bash
sudo sysctl -w vm.max_map_count=262144
```
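Note that sysctl -w changes the value only until the next reboot. To persist it, you can append the setting to /etc/sysctl.conf (a common approach on most Linux distributions, not specific to OpenSearch):

```bash
# Make the setting survive reboots, then reload the sysctl configuration
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```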

Next steps

If you want to try a GPU-accelerated cluster using AWS Inferentia with a pretrained HuggingFace model, see Compiling and Deploying HuggingFace Pretrained BERT.