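An overview of floating-point support in ARMv8-A/R and ARMv7-A/R can be found in Arm's documentation: ARM Floating Point
What I care about most there is floating-point support in SIMD (which on ARM roughly means NEON).
ARMv8 in AArch64 state supports IEEE 754 fairly well: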
Floating-point support in AArch64 state SIMD is IEEE 754-2008 compliant with:
Configurable rounding modes
Configurable Default NaN behavior
Configurable Flush-to-zero behavior
Floating-point computation using AArch32 Advanced SIMD instructions remains unchanged from Armv7.
ARMv7, by contrast, is somewhat weaker:
The Armv7-A/R Advanced SIMD extension (NEON) offers single-precision floating-point support and performs IEEE 754 floating-point arithmetic with the following restrictions:
Denormalized numbers are flushed to zero
Only default NaNs are supported
The Round to Nearest rounding mode is used
Untrapped floating-point exception handling is used for all floating-point exceptions
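So when you use NEON to accelerate floating-point code on an ARMv7 platform, pay close attention to whether the precision is sufficient! Take the SLEEF Vectorized Math Library as an example: its AArch32 reference is full of notes like this: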
This function may less accurate than the scalar function since AArch32 NEON is not IEEE 754-compliant.
This is a good point to review how IEEE floats are encoded (figure source: CSAPP).
Let's test just the "Denormalized numbers are flushed to zero" item. The C and NEON versions are as follows:
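#include <stdio.h>
#include <arm_neon.h>

#define ITER_NUM 130

void test_float()
{
    float x = 0.5;
    float y = 1.5;
    int i;

    // First loop: halve y ITER_NUM times, pushing it below FLT_MIN
    for (i = 0; i < ITER_NUM; i++)
        y *= x;
    // The second loop only scales the result back up so it prints nicely
    for (i = 0; i < 10; i++)
        y *= 10000;
    printf("y %d\r\n", (int)y);
}

void test_float_neon()
{
    float y[2];
    float32x2_t fvecx = vdup_n_f32(0.5);
    float32x2_t fvecy = vdup_n_f32(1.5);
    int i;

    // Same first loop, but each multiply is a two-lane NEON instruction
    for (i = 0; i < ITER_NUM; i++)
        fvecy = vmul_f32(fvecx, fvecy);
    // Store both lanes to memory, then scale lane 0 with scalar code
    vst1_f32(y, fvecy);
    for (i = 0; i < 10; i++)
        y[0] *= 10000;
    printf("y %d\r\n", (int)y[0]);
}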
Under IEEE 754, the smallest positive normalized 32-bit float is
#define FLT_MIN 1.175494351e-38F /* min positive value */
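With ITER_NUM defined as 130, y is about 1.102e-39 (1.5 × 2^-130) after the first loop, which is clearly below the smallest normalized value and can only be represented as a denormalized number. On an ARMv7 platform, test_float_neon() should therefore print 0, while in AArch64 state on an ARMv8 platform its output should match the C version test_float() and be nonzero.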
Emulating an ARMv7 platform with QEMU's mcimx6ul-evk machine and an ARMv8 platform with virt -cpu cortex-a57 confirms this.
BTW, to get the "Denormalized numbers are flushed to zero" behavior with Visual Studio (SSE on x86), you can write:
#include <xmmintrin.h>
#include <pmmintrin.h>
#include <stdio.h>
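int main()
{
    /* Flush denormalized results to zero (FTZ) */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    /* Treat denormalized inputs as zero (DAZ) */
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    /* test_float() is the C version defined above */
    test_float();
    return 0;
}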