我已经在这个问题上工作了一段时间,我希望有人可以指出我的错误.我想我再也看不到穿过树林的森林了.
我有一个用于测试的LeMaker HiKey开发板.它的AArch64,所以它有NEON和其他cpu功能,如AES,SHA和CRC32:
$cat /proc/cpuinfo
Processor : AArch64 Processor rev 3 (aarch64)
...
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
...
当我尝试编译程序时:
$cat test.cxx
#if (defined(__ARM_NEON__) || defined(__ARM_NEON))
# define NEON_INTRINSICS_AVAILABLE 1
#else
# define NEON_INTRINSICS_AVAILABLE 0
#endif
#if BOOL_NEON_INTRINSICS_AVAILABLE
# include <arm_neon.h>
# if defined(__ARM_FEATURE_CRC32) || (__ARM_ACLE >= 200)
# include <arm_acle.h>
# endif
#endif
#include <stdint.h>
int main(int argc, char* argv[])
{
uint32_t crc = 0;
crc = __crc32b(crc, (uint8_t)0);
return 0
}
它导致以下结果:
$g++ test.cxx -o test.exe
test.cxx: In function ‘int main(int, char**)’:
test.cxx:20:33: error: ‘__crc32b’ was not declared in this scope
crc = __crc32b(crc, (uint8_t)0);
^
test.cxx:22:1: error: expected ‘;’ before ‘}’ token
}
^
$clang++ test.cxx -o test.exe
test.cxx:20:9: error: use of undeclared identifier '__crc32b'
crc = __crc32b(crc, (uint8_t)0);
^
test.cxx:21:11: error: expected ';' after return statement
return 0
^
;
2 errors generated.
文件系统的grep显示arm_acle.h实际上是标题:
$grep -IR '__crc32' /usr/lib
/usr/lib/gcc/.../include/arm_acle.h:__crc32b (uint32_t __a, uint8_t __b)
...
根据ARM® C Language Extensions,第9.7节CRC32内在函数,假定在定义__ARM_FEATURE_CRC32时存在缺失的符号.检查arm_acle.h证实了这一点.
为了完整性,我尝试使用-march = native编译,但编译器拒绝了它.
为什么编译器没有定义__ARM_FEATURE_CRC32?
如何使用主板上提供的本机功能来编译程序?
$gcc --version
gcc (Debian/Linaro 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$clang --version
Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
Target: aarch64-unknown-linux-gnu
Thread model: posix
$g++ -dM -E - </dev/null | egrep -i '(arm|neon|acle)'
#define __ARM_NEON 1
$clang++ -dM -E - </dev/null | egrep -i '(arm|neon|acle)'
#define __ARM_64BIT_STATE 1
#define __ARM_ACLE 200
#define __ARM_ALIGN_MAX_STACK_PWR 4
#define __ARM_ARCH 8
#define __ARM_ARCH_ISA_A64 1
#define __ARM_ARCH_PROFILE 'A'
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FP 0xe
#define __ARM_FP16_FORMAT_IEEE 1
#define __ARM_FP_FENV_ROUNDING 1
#define __ARM_NEON 1
#define __ARM_NEON_FP 0xe
#define __ARM_PCS_AAPCS64 1
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __ARM_SIZEOF_WCHAR_T 4
解决方法:
至于为什么默认情况下不启用此功能;这是编译器所针对的基线ABI中不存在的可选功能,即编译器生成的二进制文件应该能够在缺少CRC功能的设备上运行.
至少对于gcc,您可以使用-march修饰符crc启用此功能,如下所示:
$gcc -dM -E - -march=armv8-a+crc < /dev/null | egrep -i '(arm|neon|acle|crc)'
#define __ARM_FEATURE_CRC32 1
#define __ARM_NEON 1
有关如何设置此内容的更多文档,请参阅https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/AArch64-Options.html(或旧版gcc版本的相同页面).
我想人们可以期待-march = native做同样的事情,但是这个选项目前似乎只是针对x86架构实现的.