unittest failed by math_avx512_ldouble_float_double_schar_ #24

yaozhongxiao · 2021-02-01T01:59:41Z

version / revision	Operating System	Compiler & Version	Compiler Flags	CPU
std_simd::master	Linux	gcc (GCC) 11.0.0 20210119 (experimental)	make test	x86_64

Testcase

3592 - math_avx512_ldouble_float_double_schar_uchar_0 (Failed)
3593 - math_avx512_ldouble_float_double_schar_uchar_1 (Failed)
3594 - math_avx512_ldouble_float_double_schar_uchar_2 (Failed)
3595 - math_avx512_ldouble_float_double_schar_uchar_3 (Failed)
3596 - math_avx512_ldouble_float_double_schar_uchar_4 (Failed)
3597 - math_avx512_ldouble_float_double_schar_uchar_5 (Failed)
3598 - math_avx512_ldouble_float_double_schar_uchar_6 (Failed)
3599 - math_avx512_ldouble_float_double_schar_uchar_7 (Failed)
3600 - math_avx512_ldouble_float_double_schar_uchar_8 (Failed)

Actual Results

99% tests passed, 9 tests failed out of 4821

Label Time Summary:
AVX = 2250.66 sec*proc (804 tests)
AVX2 = 2282.78 sec*proc (803 tests)
AVX512 = 1805.20 sec*proc (803 tests)
KNL = 0.13 sec*proc (3 tests)
SSE = 0.22 sec*proc (5 tests)
SSE2 = 2840.78 sec*proc (801 tests)
SSE4_2 = 2787.19 sec*proc (801 tests)
SSSE3 = 2892.13 sec*proc (801 tests)

Total Test time (real) = 14874.98 sec

The following tests FAILED:
3592 - math_avx512_ldouble_float_double_schar_uchar_0 (Failed)
3593 - math_avx512_ldouble_float_double_schar_uchar_1 (Failed)
3594 - math_avx512_ldouble_float_double_schar_uchar_2 (Failed)
3595 - math_avx512_ldouble_float_double_schar_uchar_3 (Failed)
3596 - math_avx512_ldouble_float_double_schar_uchar_4 (Failed)
3597 - math_avx512_ldouble_float_double_schar_uchar_5 (Failed)
3598 - math_avx512_ldouble_float_double_schar_uchar_6 (Failed)
3599 - math_avx512_ldouble_float_double_schar_uchar_7 (Failed)
3600 - math_avx512_ldouble_float_double_schar_uchar_8 (Failed)
FAILED: CMakeFiles/test_random
cd /home/zhongxiao.yzx/workspace/cpp_libs/std-simd/build-disk1-zhongxiao.yzx-compiler-gcc-release-bin-g++ && /usr/local/bin/ctest --schedule-random
ninja: build stopped: subcommand failed.
make: *** [Makefile:37: test] Error 1

Expected Results

100% Passed

yaozhongxiao · 2021-02-01T08:27:21Z

It seems that the position of the nan within the vector changes the result.

TEST_TYPES(V, fpclassify, real_test_types)
  {
    using T = typename V::value_type;
    using intv = std::experimental::fixed_size_simd<int, V::size()>;
    constexpr T inf = std::__infinity_v<T>;
    constexpr T denorm_min = std::__infinity_v<T>;
    constexpr T nan = std::__quiet_NaN_v<T>;
    constexpr T max = std::__finite_max_v<T>;
    constexpr T norm_min = std::__norm_min_v<T>;
    test_values<V>(
      {0., 1., -1.,
#if __GCC_IEC_559 >= 2
       -0., inf, -inf, denorm_min, -denorm_min, nan,
       norm_min * 0.9, -norm_min * 0.9,
#endif
       max, -max, norm_min, -norm_min
      },
      [](const V input) {  // <------ target point
        COMPARE(NOFPEXCEPT(isfinite(input)),
                !V([&](auto i) { return std::isfinite(input[i]) ? 0 : 1; }))
         ...

Breaking at the "target point" in the debugger shows the following data:

$1 = { >, std::experimental::parallelism_v2::_SimdImplX86 >, false>> = {}, , 16>::_SimdBase1> = {}, 
  _M_data = {> = {_M_data = {nan(0x400000), 1, -1, -0, inf, -inf, inf, -inf, nan(0x400000),
        1.05794489e-38, -1.05794489e-38, 3.40282347e+38, -3.40282347e+38, 1.17549435e-38, 
        -1.17549435e-38, 0}}, }}

FAIL: ┍ at /home/zhongxiao.yzx/workspace/cpp_libs/std-simd/tests/math.cpp:153 (0x4158db)):
FAIL: │ v1 (m[0111 0000 1111 1111]) == v2 (m[0111 0000 0111 1111]) -> m[1111 1111 0111 1111]
  [nan (nan), 1 (0x1p+0), -1 (-0x1p+0), -0 (-0x0p+0), inf (inf), -inf (-inf), inf (inf), -inf (-inf), nan (nan),
   1.05794e-38 (0x1.ccccccp-127), -1.05794e-38 (-0x1.ccccccp-127), 3.40282e+38 (0x1.fffffep+127),
   -3.40282e+38 (-0x1.fffffep+127), 1.17549e-38 (0x1p-126), -1.17549e-38 (-0x1p-126), 0 (0x0p+0)]
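To see which lane the two masks disagree on, the comparison can be redone by hand. The following is a minimal sketch, not part of the std-simd test suite: it assumes the masks are printed with lane 0 first (which is consistent with the expected mask v2), and all helper names (reported_input, scalar_isfinite, parse_mask, format_mask, differing_lanes) are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

using Lanes = std::array<float, 16>;
using Mask = std::array<bool, 16>;

// Lane values transcribed from the FAIL output (float, 16 lanes).
inline Lanes reported_input() {
  using L = std::numeric_limits<float>;
  const float nan = L::quiet_NaN(), inf = L::infinity();
  const float max = L::max(), norm_min = L::min();
  const Lanes v = {{nan, 1.f, -1.f, -0.f, inf, -inf, inf, -inf, nan,
                    norm_min * 0.9f, -norm_min * 0.9f, max, -max,
                    norm_min, -norm_min, 0.f}};
  return v;
}

// Per-lane reference result, i.e. what the right-hand side of COMPARE computes.
inline Mask scalar_isfinite(const Lanes& v) {
  Mask m{};
  for (std::size_t i = 0; i < v.size(); ++i) m[i] = std::isfinite(v[i]);
  return m;
}

// Parses a mask as printed in the log, e.g. "0111 0000 1111 1111".
// Assumption: the first digit is lane 0; spaces are ignored.
inline Mask parse_mask(const std::string& text) {
  Mask m{};
  std::size_t lane = 0;
  for (char c : text)
    if ((c == '0' || c == '1') && lane < m.size()) m[lane++] = (c == '1');
  assert(lane == m.size());  // exactly 16 digits expected
  return m;
}

// Renders a mask in the same style as the log: lane 0 first, groups of four.
inline std::string format_mask(const Mask& m) {
  std::string s = "m[";
  for (std::size_t i = 0; i < m.size(); ++i) {
    if (i != 0 && i % 4 == 0) s += ' ';
    s += m[i] ? '1' : '0';
  }
  return s + "]";
}

// Indices of the lanes on which two masks disagree.
inline std::vector<std::size_t> differing_lanes(const Mask& a, const Mask& b) {
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] != b[i]) out.push_back(i);
  return out;
}
```

Under that lane-order assumption, feeding the reported v1 string and the scalar reference into differing_lanes yields only lane 8, the second nan: the first nan in lane 0 is classified as not finite, the one in lane 8 is reported as finite.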

yaozhongxiao · 2021-02-02T01:49:31Z

It seems that __k1 ^ _mm512_mask_fpclass_ps_mask(__k1, __xi, 0x99) does not detect nan correctly for float in some cases.

diff --git a/experimental/bits/simd_x86.h b/experimental/bits/simd_x86.h
index d6dda28..72de4a2 100644
--- a/experimental/bits/simd_x86.h
+++ b/experimental/bits/simd_x86.h
@@ -2983,8 +2983,11 @@ template <typename _Abi>
          {
            const auto __xi = __to_intrin(__x);
            constexpr auto __k1 = _Abi::template _S_implicit_mask_intrin<_Tp>();
-           if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
-             return __k1 ^ _mm512_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+           if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4) {
+              // FIXME(): _mm512_mask_fpclass_ps_mask can not eval the nan for float
+              // to some extent, see issue: https://github.com/VcDevel/std-simd/issues/24
+              // return __k1 ^ _mm512_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+           }
            else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
              return __k1 ^ _mm512_mask_fpclass_pd_mask(__k1, __xi, 0x99);
            else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
@@ -2996,7 +2999,7 @@ template <typename _Abi>
            else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
              return __k1 ^ _mm_mask_fpclass_pd_mask(__k1, __xi, 0x99);
          }
-       else if constexpr (__is_avx512_abi<_Abi>())
+       if constexpr (__is_avx512_abi<_Abi>())
          {
            // if all exponent bits are set, __x is either inf or NaN
            using _I = __int_for_sizeof_t<_Tp>;
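For readers unfamiliar with the 0x99 immediate: in the fpclass instructions each bit of the immediate selects one category, and 0x99 sets bit 0 (quiet NaN), bit 3 (+infinity), bit 4 (-infinity) and bit 7 (signaling NaN). The fpclass result is therefore a "not finite" mask, and XOR-ing it with __k1 turns it into isfinite. A portable scalar model of just these four categories, as a rough sketch (the constant and function names are hypothetical, and the other four category bits are deliberately not modelled):

```cpp
#include <cstdint>
#include <cstring>

// Category bits of the fpclass immediate that are set in 0x99.
constexpr unsigned kQNaN = 0x01;    // bit 0: quiet NaN
constexpr unsigned kPosInf = 0x08;  // bit 3: +infinity
constexpr unsigned kNegInf = 0x10;  // bit 4: -infinity
constexpr unsigned kSNaN = 0x80;    // bit 7: signaling NaN
static_assert((kQNaN | kPosInf | kNegInf | kSNaN) == 0x99,
              "0x99 selects exactly the NaN and infinity categories");

// Builds a float from its IEEE 754 binary32 bit pattern.
inline float from_bits(std::uint32_t u) {
  float x;
  std::memcpy(&x, &u, sizeof x);
  return x;
}

// Scalar model: returns the matching category bit, or 0 for any finite value.
inline unsigned nonfinite_category(float x) {
  std::uint32_t u;
  std::memcpy(&u, &x, sizeof u);
  const bool negative = (u >> 31) != 0;
  const std::uint32_t exponent = (u >> 23) & 0xFFu;
  const std::uint32_t fraction = u & 0x7FFFFFu;
  if (exponent != 0xFFu) return 0;  // zero, denormal or normal
  if (fraction == 0) return negative ? kNegInf : kPosInf;
  return (fraction & 0x400000u) ? kQNaN : kSNaN;  // top fraction bit set: quiet
}

// Scalar counterpart of one lane of __k1 ^ fpclass(__k1, x, 0x99).
inline bool model_isfinite(float x) {
  return (nonfinite_category(x) & 0x99u) == 0;
}
```

The nan(0x400000) shown by the debugger corresponds to the bit pattern 0x7FC00000, which this model puts in the quiet NaN category, so the 0x99 immediate itself should cover it.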

mattkretz · 2021-02-19T10:33:37Z

This failure is strange. Why do the input values differ from those defined in test_values, i.e. the initial 0. is a nan and the denorm_min values are inf?
If you think _mm512_mask_fpclass_ps_mask really doesn't work for this case, you should reduce it to a minimal test case and determine whether the compiler miscompiled it or whether the CPU is broken. I have not seen this fail on any of my AVX512 machines.
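One possible shape for such a reduced test case, calling the intrinsic directly and comparing against std::isfinite. This is only a sketch under several assumptions: GCC or Clang on x86, __k1 taken to be the full 16-lane mask 0xFFFF, and the function names (scalar_isfinite_bits, fpclass_isfinite_bits, run_reproducer, fill_reported_input) invented for illustration. It does not go through the simd types at all, so it can only tell whether the intrinsic misbehaves in isolation.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define REPRO_HAVE_X86 1
#endif

// Reference: bit i is set when lane i is finite according to std::isfinite.
inline int scalar_isfinite_bits(const float* lanes) {
  int bits = 0;
  for (int i = 0; i < 16; ++i)
    if (std::isfinite(lanes[i])) bits |= 1 << i;
  return bits;
}

#ifdef REPRO_HAVE_X86
// Same expression as in simd_x86.h, with the implicit mask assumed to be 0xFFFF.
// The target attribute enables AVX-512 for this function only.
__attribute__((target("avx512f,avx512dq")))
inline int fpclass_isfinite_bits(const float* lanes) {
  const __m512 x = _mm512_loadu_ps(lanes);
  const __mmask16 k1 = 0xFFFF;
  const __mmask16 nonfinite = _mm512_mask_fpclass_ps_mask(k1, x, 0x99);
  return static_cast<std::uint16_t>(k1 ^ nonfinite);
}
#endif

// Returns the intrinsic-based mask, or -1 when AVX-512F/DQ is not available.
inline int run_reproducer(const float* lanes) {
#ifdef REPRO_HAVE_X86
  if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512dq"))
    return fpclass_isfinite_bits(lanes);
#endif
  return -1;
}

// Fills lanes[0..15] with the values shown in the failing comparison.
inline void fill_reported_input(float* lanes) {
  using L = std::numeric_limits<float>;
  const float nan = L::quiet_NaN(), inf = L::infinity();
  const float max = L::max(), norm_min = L::min();
  const float values[16] = {nan, 1.f, -1.f, -0.f, inf, -inf, inf, -inf, nan,
                            norm_min * 0.9f, -norm_min * 0.9f, max, -max,
                            norm_min, -norm_min, 0.f};
  for (int i = 0; i < 16; ++i) lanes[i] = values[i];
}
```

If run_reproducer and scalar_isfinite_bits agree on the affected machine at every optimization level, the intrinsic and the CPU are probably fine and the problem is more likely in how the value reaches this code path; if they disagree, comparing the generated assembly between optimization levels would be the next step.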

yaozhongxiao mentioned this issue Feb 22, 2021

Fix compilation error and warning #26

Closed
