Intel Architecture Manual Updates: bfloat16 for Cooper Lake

Intel recently released a new version of its document for software developers revealing some additional details about its upcoming Xeon Scalable 'Cooper Lake-SP' processors. As it turns out, the new CPUs will support the AVX512_BF16 instructions and therefore the bfloat16 format. Meanwhile, the main intrigue here is that, at this point, AVX512_BF16 appears to be supported only by the Cooper Lake-SP microarchitecture, but not by its direct successor, the Ice Lake-SP microarchitecture.

bfloat16 is a truncated 16-bit version of the 32-bit IEEE 754 single-precision floating-point format. It preserves all eight exponent bits, but reduces the precision of the significand from 24 bits to 8 bits to save memory, bandwidth, and processing resources, while still retaining the same dynamic range. The bfloat16 format was designed primarily for machine learning and near-sensor computing applications, where precision is needed near zero but not so much at the maximum of the range. The number representation is supported by Intel's upcoming FPGAs as well as its Nervana neural network processors, and by Google's TPUs. Given that Intel supports the bfloat16 format across two of its product lines, it makes sense to support it elsewhere as well, which is what the company is going to do by adding AVX512_BF16 instruction support to its upcoming Xeon Scalable 'Cooper Lake-SP' platform.

AVX-512 Support Propagation by Various Intel CPUs
(a newer microarchitecture supports the extensions introduced by the older ones in its line)

Xeon (general-purpose):
- Skylake-SP: AVX512F, AVX512CD, AVX512BW, AVX512DQ, AVX512VL
- Cannon Lake: adds AVX512VBMI, AVX512IFMA
- Cascade Lake-SP: adds AVX512_VNNI
- Cooper Lake: adds AVX512_BF16
- Ice Lake: adds AVX512_VNNI, AVX512_VBMI2, AVX512_BITALG, AVX512+VAES, AVX512+GFNI, AVX512+VPCLMULQDQ, AVX512_VPOPCNTDQ (but not AVX512_BF16)

Xeon Phi:
- Knights Landing: AVX512F, AVX512CD, AVX512ER, AVX512PF
- Knights Mill: adds AVX512_4FMAPS, AVX512_4VNNIW

Source: Intel Architecture Instruction Set Extensions and Future Features Programming Reference (page 16)

The list of Intel's AVX512_BF16 Vector Neural Network Instructions includes VCVTNE2PS2BF16, VCVTNEPS2BF16, and VDPBF16PS. All of them can operate on 128-bit, 256-bit, or 512-bit vectors, and each width comes in plain, masked, and zero-masked intrinsic forms, so software developers can pick one of a total of nine versions based on their requirements.

Intel AVX512_BF16 Instructions

VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed BF16 Data

Intel C/C++ Compiler Intrinsic Equivalent:
VCVTNE2PS2BF16 __m128bh _mm_cvtne2ps_pbh (__m128, __m128);
VCVTNE2PS2BF16 __m128bh _mm_mask_cvtne2ps_pbh (__m128bh, __mmask8, __m128, __m128);
VCVTNE2PS2BF16 __m128bh _mm_maskz_cvtne2ps_pbh (__mmask8, __m128, __m128);
VCVTNE2PS2BF16 __m256bh _mm256_cvtne2ps_pbh (__m256, __m256);
VCVTNE2PS2BF16 __m256bh _mm256_mask_cvtne2ps_pbh (__m256bh, __mmask16, __m256, __m256);
VCVTNE2PS2BF16 __m256bh _mm256_maskz_cvtne2ps_pbh (__mmask16, __m256, __m256);
VCVTNE2PS2BF16 __m512bh _mm512_cvtne2ps_pbh (__m512, __m512);
VCVTNE2PS2BF16 __m512bh _mm512_mask_cvtne2ps_pbh (__m512bh, __mmask32, __m512, __m512);
VCVTNE2PS2BF16 __m512bh _mm512_maskz_cvtne2ps_pbh (__mmask32, __m512, __m512);

VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data

Intel C/C++ Compiler Intrinsic Equivalent:
VCVTNEPS2BF16 __m128bh _mm_cvtneps_pbh (__m128);
VCVTNEPS2BF16 __m128bh _mm_mask_cvtneps_pbh (__m128bh, __mmask8, __m128);
VCVTNEPS2BF16 __m128bh _mm_maskz_cvtneps_pbh (__mmask8, __m128);
VCVTNEPS2BF16 __m128bh _mm256_cvtneps_pbh (__m256);
VCVTNEPS2BF16 __m128bh _mm256_mask_cvtneps_pbh (__m128bh, __mmask8, __m256);
VCVTNEPS2BF16 __m128bh _mm256_maskz_cvtneps_pbh (__mmask8, __m256);
VCVTNEPS2BF16 __m256bh _mm512_cvtneps_pbh (__m512);
VCVTNEPS2BF16 __m256bh _mm512_mask_cvtneps_pbh (__m256bh, __mmask16, __m512);
VCVTNEPS2BF16…


