Android NN API
Table of Contents
1. Android NN API
1.1. NN API
1.2. Example
1.3. NN HAL
https://developer.android.com/ndk/guides/neuralnetworks/
- The NN API is aimed at machine-learning frameworks such as TensorFlow Lite
- It provides basic tensor operations, with which you can build a computation graph and evaluate it
- It supports inference only, not training
- Together with the underlying Android NN HAL, it supports various kinds of hardware acceleration
- It only exposes a C API (part of the NDK)
1.2. Example
Use the NN API to compute a simple \(w + x\):
// 2018-08-06 11:04
#include <NeuralNetworks.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    ANeuralNetworksModel* model = NULL;
    ANeuralNetworksModel_create(&model);

    // Operand 0: the constant weight w (scalar float32 tensor).
    // scale/zeroPoint must be 0 for float operands, so initialize them
    // explicitly instead of leaving the struct members undefined.
    ANeuralNetworksOperandType tensorW;
    tensorW.type = ANEURALNETWORKS_TENSOR_FLOAT32;
    tensorW.scale = 0.f;
    tensorW.zeroPoint = 0;
    tensorW.dimensionCount = 0;
    tensorW.dimensions = NULL;

    // Operand 1: the input x.
    ANeuralNetworksOperandType tensorX;
    tensorX.type = ANEURALNETWORKS_TENSOR_FLOAT32;
    tensorX.scale = 0.f;
    tensorX.zeroPoint = 0;
    tensorX.dimensionCount = 0;
    tensorX.dimensions = NULL;

    // Operand 3: the output w + x.
    ANeuralNetworksOperandType tensorOut;
    tensorOut.type = ANEURALNETWORKS_TENSOR_FLOAT32;
    tensorOut.scale = 0.f;
    tensorOut.zeroPoint = 0;
    tensorOut.dimensionCount = 0;
    tensorOut.dimensions = NULL;

    // Operand 2: the fused activation argument required by ADD.
    ANeuralNetworksOperandType activationType;
    activationType.type = ANEURALNETWORKS_INT32;
    activationType.scale = 0.f;
    activationType.zeroPoint = 0;
    activationType.dimensionCount = 0;
    activationType.dimensions = NULL;

    int ret = 0;
    ret += ANeuralNetworksModel_addOperand(model, &tensorW);         // operand 0
    ret += ANeuralNetworksModel_addOperand(model, &tensorX);         // operand 1
    ret += ANeuralNetworksModel_addOperand(model, &activationType);  // operand 2
    ret += ANeuralNetworksModel_addOperand(model, &tensorOut);       // operand 3

    // w = 111, supplied as a constant value for operand 0.
    float w_value = 111.0;
    ret += ANeuralNetworksModel_setOperandValue(model, 0, &w_value, sizeof(w_value));

    // No fused activation after the ADD.
    int32_t noneValue = ANEURALNETWORKS_FUSED_NONE;
    ret += ANeuralNetworksModel_setOperandValue(model, 2, &noneValue, sizeof(noneValue));

    // operand 3 = ADD(operand 0, operand 1, operand 2)
    uint32_t addInputIndexes[3] = {0, 1, 2};
    uint32_t addOutputIndexes[1] = {3};
    ret += ANeuralNetworksModel_addOperation(model, ANEURALNETWORKS_ADD,
                                             3, addInputIndexes,
                                             1, addOutputIndexes);

    // The model's input is x (operand 1) and its output is operand 3.
    uint32_t modelInputIndexes[1] = {1};
    uint32_t modelOutputIndexes[1] = {3};
    ret += ANeuralNetworksModel_identifyInputsAndOutputs(model,
                                                         1, modelInputIndexes,
                                                         1, modelOutputIndexes);
    ret += ANeuralNetworksModel_finish(model);

    // compilation
    ANeuralNetworksCompilation* compilation;
    ret += ANeuralNetworksCompilation_create(model, &compilation);
    ret += ANeuralNetworksCompilation_finish(compilation);

    // run
    ANeuralNetworksExecution* run = NULL;
    ret += ANeuralNetworksExecution_create(compilation, &run);

    float myInput = 10;
    ret += ANeuralNetworksExecution_setInput(run, 0, NULL, &myInput, sizeof(myInput));

    float myOutput = 0;
    ret += ANeuralNetworksExecution_setOutput(run, 0, NULL, &myOutput, sizeof(myOutput));

    ANeuralNetworksEvent* run_end = NULL;
    ret += ANeuralNetworksExecution_startCompute(run, &run_end);
    ret += ANeuralNetworksEvent_wait(run_end);

    printf("%f\n", myOutput);

    ANeuralNetworksEvent_free(run_end);
    ANeuralNetworksExecution_free(run);
    ANeuralNetworksCompilation_free(compilation);
    ANeuralNetworksModel_free(model);
    return 0;
}
With w = 111 and the input x = 10, running the program should print 121.000000. Build it with the following Android.bp:
cc_binary {
    name: "nnapi_test",
    srcs: ["nnapi_test.cpp"],
    cflags: ["-Wno-error=unused-parameter"],
    include_dirs: ["frameworks/ml/nn/runtime/include"],
    shared_libs: ["libneuralnetworks"],
}
1.3. NN HAL
The NN HAL is defined under `hardware/interfaces/neuralnetworks`, and a sample implementation of the HAL lives in `frameworks/ml/nn/driver/sample`.
IDevice
IDevice mainly implements queries about the device. A rough sketch of the interface is shown below, followed by more detail on two of its methods.
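For orientation, here is a paraphrased sketch of the 1.0 IDevice interface as a C++ driver sees it. It is not copied from `hardware/interfaces/neuralnetworks/1.0`, and the exact generated C++ spellings (base class, callback typedefs) may differ slightly:

// Paraphrased sketch of the V1_0 IDevice HIDL interface (not verbatim).
// The *_cb types are the HIDL-generated callbacks used to return results.
struct IDevice : public android::hidl::base::V1_0::IBase {
    // Report what the device can do and how fast / power-hungry it is.
    virtual Return<void> getCapabilities(getCapabilities_cb cb) = 0;

    // For each operation in `model`, report whether the device supports it.
    virtual Return<void> getSupportedOperations(const Model& model,
                                                getSupportedOperations_cb cb) = 0;

    // Compile `model` for this device; the resulting IPreparedModel is
    // delivered asynchronously through `callback`.
    virtual Return<ErrorStatus> prepareModel(const Model& model,
                                             const sp<IPreparedModelCallback>& callback) = 0;
};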
getCapabilities
getCapabilities returns which classes of computation the device supports and the corresponding performance numbers, for example:
Return<void> SampleDriverMinimal::getCapabilities(getCapabilities_cb cb) {
    Capabilities capabilities = {
            .float32Performance = {.execTime = 0.4f, .powerUsage = 0.5f},
            .quantized8Performance = {.execTime = 1.0f, .powerUsage = 1.0f}};
    cb(ErrorStatus::NONE, capabilities);
    return Void();
}
This says the device supports both float32 and quantized 8-bit computation, together with the relative execution time and power usage of each. The NN runtime on top uses this information to decide whether to dispatch work to this device.
getSupportedOperations
getSupportedOperations reports, for each operation in a given model, whether this device can execute it (a sketch of a driver-side implementation follows the list below). The NN API defines 29 operations in total in `NeuralNetworks.h`:
ANEURALNETWORKS_ADD = 0
ANEURALNETWORKS_AVERAGE_POOL_2D = 1
ANEURALNETWORKS_CONCATENATION = 2
ANEURALNETWORKS_CONV_2D = 3
ANEURALNETWORKS_DEPTHWISE_CONV_2D = 4
ANEURALNETWORKS_DEPTH_TO_SPACE = 5
ANEURALNETWORKS_DEQUANTIZE = 6
ANEURALNETWORKS_EMBEDDING_LOOKUP = 7
ANEURALNETWORKS_FLOOR = 8
ANEURALNETWORKS_FULLY_CONNECTED = 9
ANEURALNETWORKS_HASHTABLE_LOOKUP = 10
ANEURALNETWORKS_L2_NORMALIZATION = 11
ANEURALNETWORKS_L2_POOL_2D = 12
ANEURALNETWORKS_LOCAL_RESPONSE_NORMALIZATION = 13
ANEURALNETWORKS_LOGISTIC = 14
ANEURALNETWORKS_LSH_PROJECTION = 15
ANEURALNETWORKS_LSTM = 16
ANEURALNETWORKS_MAX_POOL_2D = 17
ANEURALNETWORKS_MUL = 18
ANEURALNETWORKS_RELU = 19
ANEURALNETWORKS_RELU1 = 20
ANEURALNETWORKS_RELU6 = 21
ANEURALNETWORKS_RESHAPE = 22
ANEURALNETWORKS_RESIZE_BILINEAR = 23
ANEURALNETWORKS_RNN = 24
ANEURALNETWORKS_SOFTMAX = 25
ANEURALNETWORKS_SPACE_TO_DEPTH = 26
ANEURALNETWORKS_SVDF = 27
ANEURALNETWORKS_TANH = 28
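A driver answers getSupportedOperations with one boolean per operation in the model, and the runtime uses this to decide what to hand off to the device. The following is only a hedged sketch modelled on the sample driver, not copied from it; the restriction to ADD and MUL is an illustrative assumption:

Return<void> SampleDriverMinimal::getSupportedOperations(const Model& model,
                                                         getSupportedOperations_cb cb) {
    // One entry per operation in the model, in the same order.
    std::vector<bool> supported(model.operations.size(), false);
    for (size_t i = 0; i < model.operations.size(); ++i) {
        const Operation& op = model.operations[i];
        // Illustrative assumption: this device only handles simple
        // element-wise operations.
        supported[i] = (op.type == OperationType::ADD ||
                        op.type == OperationType::MUL);
    }
    cb(ErrorStatus::NONE, supported);
    return Void();
}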
IPreparedModel
IPreparedModel mainly defines the `execute` method, which lets the runtime delegate the model's operations to the device for execution.
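As a rough sketch only (SamplePreparedModel, mModel and runModelOnDevice are hypothetical names, and a real driver would run the computation asynchronously rather than inline), an execute implementation has this shape in the 1.0 HAL:

Return<ErrorStatus> SamplePreparedModel::execute(const Request& request,
                                                 const sp<IExecutionCallback>& callback) {
    // `request` references the memory pools holding the input and output
    // buffers for this invocation of the prepared model.
    // runModelOnDevice() is a hypothetical helper standing in for the
    // device-specific computation; mModel is the model captured at
    // prepareModel() time.
    ErrorStatus status = runModelOnDevice(mModel, request);

    // Report completion (or failure) back to the NN runtime.
    callback->notify(status);
    return ErrorStatus::NONE;
}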