Deploying an MXNet Model with TVM in C++ (InsightFace as an Example)

Ever since the AI hype began, deep learning frameworks have multiplied. As AI practitioners, we typically go through an annotate-train-deploy cycle, and deployment is the most painful step, especially across platforms (for example, when mobile requires native integration). Inference frameworks are just as numerous, but most of them either deliver mediocre performance, lack the operators you need, or make model conversion difficult. The arrival of TVM is, to a large extent, a blessing for model deployment.

Tutorials on deploying with TVM are still scarce online, especially for C++ and mobile. Taking MobileFaceNet from the InsightFace Model Zoo as an example, this post covers how to compile an MXNet model, run inference in Python and in C++, compare faces by cosine distance, and deploy on Android.

Installation

Building TVM requires the LLVM compiler; you can follow the official installation tutorial.

LLVM 7.0 may cause compilation errors; LLVM 6.0.1 is recommended.
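For reference, TVM reads its build options from a config.cmake copied from cmake/config.cmake into the build directory; enabling the LLVM backend means pointing USE_LLVM at your llvm-config binary. A minimal sketch, assuming a Homebrew-style LLVM 6 install (the path is machine-specific):

    # build/config.cmake (copied from cmake/config.cmake in the TVM repo)
    # Point USE_LLVM at your llvm-config; this path is only an example.
    set(USE_LLVM /usr/local/opt/llvm@6/bin/llvm-config)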

Compiling the Model

TVM applies a series of optimizations to the computation graph, and compiling a model produces several output files. Before compiling, you need to specify the target platform, architecture, instruction set, and other parameters.

    import numpy as np
    import nnvm.compiler
    import nnvm.testing
    import tvm
    from tvm.contrib import graph_runtime
    import mxnet as mx
    from mxnet import ndarray as nd

    # Load the MXNet checkpoint (emore1-symbol.json / emore1-0000.params).
    prefix, epoch = "emore1", 0
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
    image_size = (112, 112)
    opt_level = 3
    shape_dict = {'data': (1, 3, *image_size)}
    # "target" is the platform you are compiling for.
    target = tvm.target.create("llvm -mcpu=haswell")
    #target = tvm.target.create("llvm -mcpu=broadwell")
    nnvm_sym, nnvm_params = nnvm.frontend.from_mxnet(sym, arg_params, aux_params)
    with nnvm.compiler.build_config(opt_level=opt_level):
        graph, lib, params = nnvm.compiler.build(nnvm_sym, target, shape_dict, params=nnvm_params)
    lib.export_library("./deploy_lib.so")
    print('lib exported successfully')
    with open("./deploy_graph.json", "w") as fo:
        fo.write(graph.json())
    with open("./deploy_param.params", "wb") as fo:
        fo.write(nnvm.compiler.save_param_dict(params))

Running this script produces three files: deploy_lib.so, deploy_graph.json, and deploy_param.params. deploy_lib.so is the compiled dynamic library, deploy_graph.json is the computation graph used at deployment time, and deploy_param.params contains the model parameters.
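The same script also handles cross-compilation: only the target and the export call change. Below is a hedged sketch for a 64-bit ARM Android build, based on TVM's Android examples; the target triple and the use of tvm.contrib.ndk are assumptions you should check against your TVM version (TVM_NDK_CC must point at an NDK standalone-toolchain compiler):

    # Sketch: cross-compile the same model for arm64 Android.
    # Assumes TVM's NDK helper; TVM_NDK_CC must point to the NDK toolchain compiler.
    from tvm.contrib import ndk

    target = tvm.target.create("llvm -target=arm64-linux-android")
    with nnvm.compiler.build_config(opt_level=opt_level):
        graph, lib, params = nnvm.compiler.build(nnvm_sym, target, shape_dict, params=nnvm_params)
    lib.export_library("./deploy_lib_arm64.so", ndk.create_shared)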

A Quick Test with the TVM Python Runtime

The TVM runtime has no third-party dependencies; just clone tvm and run make runtime.

    import numpy as np
    import tvm
    from tvm.contrib import graph_runtime
    import time

    ctx = tvm.cpu()
    # Load the compiled module back.
    loaded_json = open("./deploy_graph.json").read()
    loaded_lib = tvm.module.load("./deploy_lib.so")
    loaded_params = bytearray(open("./deploy_param.params", "rb").read())
    data_shape = (1, 3, 112, 112)
    input_data = tvm.nd.array(np.random.uniform(size=data_shape).astype("float32"))
    module = graph_runtime.create(loaded_json, loaded_lib, ctx)
    module.load_params(loaded_params)
    # Tiny benchmark: time 100 forward passes.
    for i in range(100):
        t0 = time.time()
        module.run(data=input_data)
        print(time.time() - t0)
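The benchmark above only measures latency. To sanity-check the compiled graph end to end, you can also read the embedding back; a small sketch continuing the script above (the (1, 128) output shape is an assumption taken from the C++ code later in this post):

    # Continues the runtime-test script above.
    module.run(data=input_data)
    # Copy output 0 into a fresh array; shape (1, 128) assumes a 128-d embedding.
    embedding = module.get_output(0, tvm.nd.empty((1, 128))).asnumpy()
    # L2-normalize, mirroring the normalization done in the C++ code below.
    embedding = embedding / np.sqrt((embedding ** 2).sum())
    print(embedding.shape)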

Running the MobileFaceNet Face Recognition Model in C++

In C++, the TVM runtime needs only the .so file exported at compile time, plus "tvm_runtime_pack.cc" compiled into your project. The runtime footprint is also small, only a few hundred KB.

The C++ code below takes an aligned face image as input and outputs the normalized face embedding.

    #include <stdio.h>
    #include <cstring>
    #include <fstream>
    #include <opencv2/opencv.hpp>
    #include <tvm/runtime/module.h>
    #include <tvm/runtime/registry.h>
    #include <tvm/runtime/packed_func.h>

    class FR_MFN_Deploy
    {
    private:
        void *handle;

    public:
        FR_MFN_Deploy(std::string modelFolder)
        {
            // Load the compiled dynamic library.
            tvm::runtime::Module mod_syslib = tvm::runtime::Module::LoadFromFile(modelFolder + "/deploy_lib.so");
            // Load the graph.
            std::ifstream json_in(modelFolder + "/deploy_graph.json");
            std::string json_data((std::istreambuf_iterator<char>(json_in)), std::istreambuf_iterator<char>());
            json_in.close();
            int device_type = kDLCPU;
            int device_id = 0;
            // Get the global function that creates the graph runtime module.
            tvm::runtime::Module mod = (*tvm::runtime::Registry::Get("tvm.graph_runtime.create"))(json_data, mod_syslib, device_type, device_id);
            this->handle = new tvm::runtime::Module(mod);
            // Load the parameters.
            std::ifstream params_in(modelFolder + "/deploy_param.params", std::ios::binary);
            std::string params_data((std::istreambuf_iterator<char>(params_in)), std::istreambuf_iterator<char>());
            params_in.close();
            TVMByteArray params_arr;
            params_arr.data = params_data.c_str();
            params_arr.size = params_data.length();
            tvm::runtime::PackedFunc load_params = mod.GetFunction("load_params");
            load_params(params_arr);
        }

        cv::Mat forward(cv::Mat inputImageAligned)
        {
            // MobileFaceNet preprocessing is already baked into the compiled graph;
            // here we only convert uint8 BGR to a float32 RGB blob via OpenCV's dnn helper.
            cv::Mat tensor = cv::dnn::blobFromImage(inputImageAligned, 1.0, cv::Size(112, 112), cv::Scalar(0, 0, 0), true);
            DLTensor *input;
            constexpr int dtype_code = kDLFloat;
            constexpr int dtype_bits = 32;
            constexpr int dtype_lanes = 1;
            constexpr int device_type = kDLCPU;
            constexpr int device_id = 0;
            constexpr int in_ndim = 4;
            const int64_t in_shape[in_ndim] = {1, 3, 112, 112};
            TVMArrayAlloc(in_shape, in_ndim, dtype_code, dtype_bits, dtype_lanes, device_type, device_id, &input);
            TVMArrayCopyFromBytes(input, tensor.data, 112 * 3 * 112 * 4);
            tvm::runtime::Module *mod = (tvm::runtime::Module *)handle;
            tvm::runtime::PackedFunc set_input = mod->GetFunction("set_input");
            set_input("data", input);
            tvm::runtime::PackedFunc run = mod->GetFunction("run");
            run();
            tvm::runtime::PackedFunc get_output = mod->GetFunction("get_output");
            tvm::runtime::NDArray res = get_output(0);
            cv::Mat vector(128, 1, CV_32F);
            memcpy(vector.data, res->data, 128 * 4);
            // L2-normalize the embedding.
            cv::Mat _l2;
            cv::multiply(vector, vector, _l2);
            float l2 = cv::sqrt(cv::sum(_l2).val[0]);
            vector = vector / l2;
            TVMArrayFree(input);
            return vector;
        }
    };

We can extract embeddings from two aligned face photos:

    cv::Mat A = cv::imread("/Users/yujinke/Desktop/align_id/aligned/20171231115821836_face.jpg");
    cv::Mat B = cv::imread("/Users/yujinke/Desktop/align_id/aligned/20171231115821836_idcard.jpg");
    FR_MFN_Deploy deploy("./models");
    cv::Mat v2 = deploy.forward(B);
    cv::Mat v1 = deploy.forward(A);

Measuring Cosine Similarity

Since forward() already L2-normalizes each vector, the dot product of two embeddings is exactly their cosine similarity:

    inline float CosineDistance(const cv::Mat &v1, const cv::Mat &v2)
    {
        return static_cast<float>(v1.dot(v2));
    }

    std::cout << CosineDistance(v1, v2) << std::endl;
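In practice you compare this score against a decision threshold; the value below is a purely illustrative placeholder that must be tuned on your own validation data:

    // Hypothetical decision rule: 0.6f is a placeholder threshold,
    // not a value from this post; tune it on your own data.
    const float kThreshold = 0.6f;
    bool samePerson = CosineDistance(v1, v2) > kThreshold;
    std::cout << (samePerson ? "match" : "non-match") << std::endl;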

A Minimal CMake Configuration

    cmake_minimum_required(VERSION 3.6)
    project(tvm_mobilefacenet)

    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")
    set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})
    set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})
    # Path to your TVM source checkout.
    set(HOME_TVM /Users/jackyu/downloads/tvm-0.5)

    find_package(OpenCV REQUIRED)
    include_directories(${OpenCV_INCLUDE_DIRS})
    include_directories(${HOME_TVM}/include)
    include_directories(${HOME_TVM}/3rdparty/dmlc-core/include)
    include_directories(${HOME_TVM}/3rdparty/dlpack/include)

    # tvm_runtime_pack.cc bundles the whole TVM runtime into this build.
    add_executable(tvm_mobilefacenet tvm_runtime_pack.cc main.cpp)
    target_link_libraries(tvm_mobilefacenet ${OpenCV_LIBS} dl pthread)

Todo: deploying the complete face recognition pipeline on Android.