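When running inference with TFLite models on a PC, I found that the float model runs fast but the quantized model is very slow, especially for more complex models. The cause is that quantized inference is not optimized for Intel x86_64, so it runs slowly.
The fix is to run the model on a device with an ARM CPU, which has dedicated optimizations for quantized inference and will be fast.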
INT TFLITE very much slower than FLOAT TFLITE
This is likely because quantized int requires ARM NEON to be faster than float. On a PC (which is what I assume you are running on), float is likely better. This is because quantized int relies on special instructions that have not been emphasized on Intel x86_64.
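To check this on your own machine, a rough timing comparison can be sketched as below. This is a minimal sketch, not a rigorous benchmark: the file names `model_float.tflite` and `model_quant.tflite` are hypothetical placeholders, the helper names are my own, and the input is all zeros matching the model's declared shape and dtype, on the assumption that the input values do not change the latency much.

```python
import time


def mean_latency_ms(invoke, warmup=3, runs=20):
    """Return the average wall-clock time of invoke() in milliseconds."""
    for _ in range(warmup):
        invoke()  # warm-up runs are not timed
    start = time.perf_counter()
    for _ in range(runs):
        invoke()
    return (time.perf_counter() - start) * 1000.0 / runs


def benchmark_tflite(model_path, num_threads=1):
    """Time interpreter.invoke() for one TFLite model (hypothetical helper)."""
    # Imported here so the timing helper above works without TensorFlow.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path,
                                      num_threads=num_threads)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    # Zero input with the shape and dtype the model declares (float32, uint8, ...).
    data = np.zeros(inp["shape"], dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], data)
    return mean_latency_ms(interpreter.invoke)


if __name__ == "__main__":
    # Placeholder paths: replace with your own float and quantized models.
    for path in ("model_float.tflite", "model_quant.tflite"):
        print(path, "%.2f ms" % benchmark_tflite(path))
```

Running the same script on the PC and on an ARM device should show whether the quantized model is slower than the float one only on x86_64.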
Reference: https://github.com/tensorflow/tensorflow/issues/21698