Debugging in PyNative Mode

Overview

MindSpore supports two running modes, each optimized differently for debugging and execution:

  • PyNative mode: also known as dynamic graph mode. The operators in the neural network are dispatched and executed one by one, which makes it easy to write and debug neural network models.

  • Graph mode: also known as static graph mode or graph mode. The neural network model is compiled into a single graph and then dispatched for execution. This mode uses techniques such as graph optimization to improve running performance and also facilitates large-scale deployment and cross-platform execution.

By default, MindSpore runs in PyNative mode, and you can switch to Graph mode with context.set_context(mode=context.GRAPH_MODE); likewise, when MindSpore is in Graph mode, you can switch back to PyNative mode with context.set_context(mode=context.PYNATIVE_MODE).
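For example, a minimal sketch of switching between the two modes (the device_target value here is an assumption; use the backend available in your environment):

import numpy as np
from mindspore import context

# Run in Graph mode: the whole network is compiled and executed as one graph.
context.set_context(mode=context.GRAPH_MODE, device_target="GPU")

# Switch back to PyNative mode: operators are dispatched and executed one by one.
context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")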

In PyNative mode, you can execute single operators, ordinary functions, and networks, as well as compute gradients separately. The following sections describe how to use these features and the points to note.

Executing a Single Operator

Execute a single operator and print the result, as shown in the following example.

import numpy as np
import mindspore.nn as nn
from mindspore import context, Tensor

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

conv = nn.Conv2d(3, 4, 3, bias_init='zeros')
input_data = Tensor(np.ones([1, 3, 5, 5]).astype(np.float32))
output = conv(input_data)
print(output.asnumpy())

Output:

[[[[-0.02190447 -0.05208071 -0.05208071 -0.05208071 -0.06265172]
   [-0.01529094 -0.05286242 -0.05286242 -0.05286242 -0.04228776]
   [-0.01529094 -0.05286242 -0.05286242 -0.05286242 -0.04228776]
   [-0.01529094 -0.05286242 -0.05286242 -0.05286242 -0.04228776]
   [-0.01430791 -0.04892948 -0.04892948 -0.04892948 -0.01096004]]

  [[ 0.00802889 -0.00229866 -0.00229866 -0.00229866 -0.00471579]
   [ 0.01172971  0.02172665  0.02172665  0.02172665  0.03261888]
   [ 0.01172971  0.02172665  0.02172665  0.02172665  0.03261888]
   [ 0.01172971  0.02172665  0.02172665  0.02172665  0.03261888]
   [ 0.01784375  0.01185635  0.01185635  0.01185635  0.01839031]]

  [[ 0.04841832  0.03321705  0.03321705  0.03321705  0.0342317 ]
   [ 0.0651359   0.04310361  0.04310361  0.04310361  0.03355784]
   [ 0.0651359   0.04310361  0.04310361  0.04310361  0.03355784]
   [ 0.0651359   0.04310361  0.04310361  0.04310361  0.03355784]
   [ 0.04680437  0.03465693  0.03465693  0.03465693  0.00171057]]

  [[-0.01783456 -0.00459451 -0.00459451 -0.00459451  0.02316688]
   [ 0.01295831  0.00879035  0.00879035  0.00879035  0.01178642]
   [ 0.01295831  0.00879035  0.00879035  0.00879035  0.01178642]
   [ 0.01295831  0.00879035  0.00879035  0.00879035  0.01178642]
   [ 0.05016355  0.03958241  0.03958241  0.03958241  0.03443141]]]]

Executing an Ordinary Function

Combine several operators into a function, execute the operators by calling the function directly, and print the result, as shown in the following example.

Sample code

import numpy as np
from mindspore import context, Tensor
from mindspore.ops import functional as F

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

def tensor_add_func(x, y):
    z = F.tensor_add(x, y)
    z = F.tensor_add(z, x)
    return z

x = Tensor(np.ones([3, 3], dtype=np.float32))
y = Tensor(np.ones([3, 3], dtype=np.float32))
output = tensor_add_func(x, y)
print(output.asnumpy())

Output

[[3. 3. 3.]
 [3. 3. 3.]
 [3. 3. 3.]]

Improving PyNative Performance

To speed up forward computation in PyNative mode, MindSpore provides the Staging function, which compiles a Python function or a method of a Python class into a computational graph in PyNative mode and improves execution speed through techniques such as graph optimization, as shown in the following example.

import numpy as np
import mindspore.nn as nn
from mindspore import context, Tensor
import mindspore.ops.operations as P
from mindspore.common.api import ms_function

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

class TensorAddNet(nn.Cell):
    def __init__(self):
        super(TensorAddNet, self).__init__()
        self.add = P.TensorAdd()

    @ms_function
    def construct(self, x, y):
        res = self.add(x, y)
        return res

x = Tensor(np.ones([4, 4]).astype(np.float32))
y = Tensor(np.ones([4, 4]).astype(np.float32))
net = TensorAddNet()

z = net(x, y)  # Staging mode
tensor_add = P.TensorAdd()
res = tensor_add(x, z)  # PyNative mode
print(res.asnumpy())

Output

[[3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]]

In the preceding sample code, the ms_function decorator is added to the construct method of the TensorAddNet class. The decorator compiles the construct method into a computational graph, which is dispatched and executed as a whole graph once the inputs are given, whereas F.tensor_add in the earlier sample code is executed directly in the ordinary PyNative way.
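To observe the effect of Staging, one simple approach is to time both execution paths. The following is only a minimal sketch under my own assumptions (the tensor size and loop count are arbitrary, and both paths are called once beforehand so that graph compilation and first-call overhead are excluded from the measurement):

import time
import numpy as np
import mindspore.ops.operations as P
from mindspore import context, Tensor
from mindspore.common.api import ms_function

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

tensor_add = P.TensorAdd()

@ms_function
def staged_add(x, y):
    return tensor_add(x, y)

x = Tensor(np.ones([1024, 1024]).astype(np.float32))
y = Tensor(np.ones([1024, 1024]).astype(np.float32))

# Warm up both paths so compilation is not included in the timing.
staged_add(x, y)
tensor_add(x, y)

start = time.time()
for _ in range(100):
    tensor_add(x, y).asnumpy()  # plain PyNative execution
print("PyNative: %.4f s" % (time.time() - start))

start = time.time()
for _ in range(100):
    staged_add(x, y).asnumpy()  # staged (graph-compiled) execution
print("Staging:  %.4f s" % (time.time() - start))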

Note that if a function decorated with ms_function contains operators that do not require parameter training (such as pooling and tensor_add), these operators can be called directly inside the decorated function, as shown in the following example.

Sample code

import numpy as np
import mindspore.nn as nn
from mindspore import context, Tensor
import mindspore.ops.operations as P
from mindspore.common.api import ms_function

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

tensor_add = P.TensorAdd()

@ms_function
def tensor_add_fn(x, y):
    res = tensor_add(x, y)
    return res

x = Tensor(np.ones([4, 4]).astype(np.float32))
y = Tensor(np.ones([4, 4]).astype(np.float32))
z = tensor_add_fn(x, y)
print(z.asnumpy())

Output

[[2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]]

If the decorated function contains operators that require parameter training (such as Convolution and BatchNorm), these operators must be instantiated outside the decorated function, as shown in the following example.

Sample code

import numpy as np
import mindspore.nn as nn
from mindspore import context, Tensor
from mindspore.common.api import ms_function

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

conv_obj = nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3, stride=2, padding=0)

@ms_function
def conv_fn(x):
    res = conv_obj(x)
    return res

input_data = np.random.randn(2, 3, 6, 6).astype(np.float32)
z = conv_fn(Tensor(input_data))
print(z.asnumpy())

Output

[[[[ 0.10377571 -0.0182163  -0.05221086]
   [ 0.1428334  -0.01216263  0.03171652]
   [-0.00673915 -0.01216291  0.02872104]]

  [[ 0.02906547 -0.02333629 -0.0358406 ]
   [ 0.03805163 -0.00589525  0.04790922]
   [-0.01307234 -0.00916951  0.02396654]]

  [[ 0.01477884 -0.06549098 -0.01571796]
   [ 0.00526886 -0.09617482  0.04676902]
   [-0.02132788 -0.04203424  0.04523344]]

  [[ 0.04590619 -0.00251453 -0.00782715]
   [ 0.06099087 -0.03445276  0.00022781]
   [ 0.0563223  -0.04832596 -0.00948266]]]

 [[[ 0.08444098 -0.05898955 -0.039262  ]
   [ 0.08322686 -0.0074796   0.0411371 ]
   [-0.02319113  0.02128408 -0.01493311]]

  [[ 0.02473745 -0.02558945 -0.0337843 ]
   [-0.03617039 -0.05027632 -0.04603915]
   [ 0.03672804  0.00507637 -0.08433761]]

  [[ 0.09628943  0.01895323 -0.02196114]
   [ 0.04779419 -0.0871575   0.0055248 ]
   [-0.04382382 -0.00511185 -0.01168541]]

  [[ 0.0534859   0.02526264  0.04755395]
   [-0.03438103 -0.05877855  0.06530266]
   [ 0.0377498  -0.06117418  0.00546303]]]]

Debugging Network Training Models

PyNative mode also supports computing gradients separately. As shown in the following example, grad_all can be used to compute the gradients with respect to all inputs of a function or network.

Sample code

from mindspore.ops import composite as C
import mindspore.context as context

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

def mul(x, y):
    return x * y

def mainf(x, y):
    return C.grad_all(mul)(x, y)

print(mainf(1, 2))

Output

(2, 1)

For mul(x, y) = x * y, the gradient with respect to x is y and the gradient with respect to y is x, so grad_all returns (2, 1) for the inputs (1, 2). During network training, obtain the gradients and then call the optimizer to update the parameters (setting breakpoints during the backward gradient computation is not supported yet), and then compute the loss through a forward pass. In this way, network training can be performed in PyNative mode.

Complete LeNet sample code

import numpy as np
import mindspore.nn as nn
import mindspore.ops.operations as P
from mindspore.nn import Dense
from mindspore import context, Tensor, ParameterTuple
from mindspore.common.initializer import TruncatedNormal
from mindspore.ops import composite as C
from mindspore.common import dtype as mstype
from mindspore.nn.wrap.cell_wrapper import WithLossCell
from mindspore.nn.loss import SoftmaxCrossEntropyWithLogits
from mindspore.nn.optim import Momentum

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

def conv(in_channels, out_channels, kernel_size, stride=1, padding=0):
    """weight initial for conv layer"""
    weight = weight_variable()
    return nn.Conv2d(in_channels, out_channels,
                     kernel_size=kernel_size, stride=stride, padding=padding,
                     weight_init=weight, has_bias=False, pad_mode="valid")

def fc_with_initialize(input_channels, out_channels):
    """weight initial for fc layer"""
    weight = weight_variable()
    bias = weight_variable()
    return nn.Dense(input_channels, out_channels, weight, bias)

def weight_variable():
    """weight initial"""
    return TruncatedNormal(0.02)

class LeNet5(nn.Cell):
    """
    Lenet network
    Args:
        num_class (int): Num classes. Default: 10.

    Returns:
        Tensor, output tensor

    Examples:
        >>> LeNet(num_class=10)
    """
    def __init__(self, num_class=10):
        super(LeNet5, self).__init__()
        self.num_class = num_class
        self.batch_size = 32
        self.conv1 = conv(1, 6, 5)
        self.conv2 = conv(6, 16, 5)
        self.fc1 = fc_with_initialize(16 * 5 * 5, 120)
        self.fc2 = fc_with_initialize(120, 84)
        self.fc3 = fc_with_initialize(84, self.num_class)
        self.relu = nn.ReLU()
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.reshape = P.Reshape()

    def construct(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.max_pool2d(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.max_pool2d(x)
        x = self.reshape(x, (self.batch_size, -1))
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x


class GradWrap(nn.Cell):
    """ GradWrap definition """
    def __init__(self, network):
        super(GradWrap, self).__init__(auto_prefix=False)
        self.network = network
        self.weights = ParameterTuple(filter(lambda x: x.requires_grad, network.get_parameters()))

    def construct(self, x, label):
        weights = self.weights
        return C.grad_by_list(self.network, weights)(x, label)

net = LeNet5()
optimizer = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), 0.1, 0.9)
criterion = nn.SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True)
net_with_criterion = WithLossCell(net, criterion)
train_network = GradWrap(net_with_criterion)
train_network.set_train()

input_data = Tensor(np.ones([net.batch_size, 1, 32, 32]).astype(np.float32) * 0.01)
label = Tensor(np.ones([net.batch_size]).astype(np.int32))
output = net(Tensor(input_data))
loss_output = criterion(output, label)
grads = train_network(input_data, label)
success = optimizer(grads)
loss = loss_output.asnumpy()
print(loss)

Output

2.3050091

With the execution method above, you can set breakpoints anywhere needed in the construct function, obtain the intermediate results of network execution, and debug the network with pdb.
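For example, the following is a minimal sketch of this debugging workflow (the SmallNet class and the breakpoint location are assumptions made for illustration; in practice you would place pdb.set_trace() inside the construct method of your own network, such as the LeNet5 above):

import pdb
import numpy as np
import mindspore.nn as nn
from mindspore import context, Tensor

context.set_context(mode=context.PYNATIVE_MODE, device_target="GPU")

class SmallNet(nn.Cell):
    def __init__(self):
        super(SmallNet, self).__init__()
        self.dense = nn.Dense(4, 2)
        self.relu = nn.ReLU()

    def construct(self, x):
        x = self.dense(x)
        pdb.set_trace()  # execution pauses here; inspect intermediate results, e.g. x.asnumpy()
        x = self.relu(x)
        return x

net = SmallNet()
output = net(Tensor(np.ones([1, 4]).astype(np.float32)))
print(output.asnumpy())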