Module |
Header | /Engine/Source/Runtime/Core/Public/Math/Float16.h
Include | #include "Math/Float16.h"
class FFloat16
16-bit float components and conversion.
IEEE 754 half-precision (16-bit) float, represented by a 10-bit mantissa M, a 5-bit exponent E, and a 1-bit sign S.
Specials:
E=0, M=0 == 0.0
E=0, M!=0 == Denormalized value (M / 2^10) * 2^-14
0<E<31, M=any == (1 + M / 2^10) * 2^(E-15)
E=31, M=0 == Infinity
E=31, M!=0 == NaN
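The decoding rules above can be sketched in standalone C++. This is an illustration of the bit layout only, not the engine's implementation: `DecodeHalfBits` and `CheckDecodeExamples` are hypothetical names that do not appear in `Float16.h`, and the sketch assumes the sign occupies the top bit, followed by the exponent and then the mantissa.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of FFloat16): decode a raw 16-bit pattern
// using the S / E / M rules listed above.
inline float DecodeHalfBits(std::uint16_t Bits)
{
    const int Sign     = (Bits >> 15) & 0x1;   // 1-bit sign S
    const int Exponent = (Bits >> 10) & 0x1F;  // 5-bit exponent E
    const int Mantissa = Bits & 0x3FF;         // 10-bit mantissa M

    float Magnitude;
    if (Exponent == 0)
    {
        // Zero (M == 0) or denormal: (M / 2^10) * 2^-14 == M * 2^-24
        Magnitude = std::ldexp(static_cast<float>(Mantissa), -24);
    }
    else if (Exponent == 31)
    {
        // Infinity (M == 0) or NaN (M != 0)
        Magnitude = (Mantissa == 0)
            ? std::numeric_limits<float>::infinity()
            : std::numeric_limits<float>::quiet_NaN();
    }
    else
    {
        // Normal value: (1 + M / 2^10) * 2^(E - 15)
        Magnitude = std::ldexp(1.0f + Mantissa / 1024.0f, Exponent - 15);
    }
    return Sign ? -Magnitude : Magnitude;
}

// A few bit patterns worked through by hand.
inline void CheckDecodeExamples()
{
    assert(DecodeHalfBits(0x3C00) == 1.0f);      // E=15, M=0
    assert(DecodeHalfBits(0xC000) == -2.0f);     // S=1, E=16, M=0
    assert(DecodeHalfBits(0x7BFF) == 65504.0f);  // largest finite half
    assert(std::isinf(DecodeHalfBits(0x7C00)));  // E=31, M=0
    assert(std::isnan(DecodeHalfBits(0x7E00)));  // E=31, M!=0
}
```

Working on the integer fields directly keeps each branch a one-to-one match with a row of the list, which makes the sketch easy to check against it.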
Conversion from a 32-bit float uses RTNE (round to nearest, ties to even).
Legacy code truncated during the conversion; SetTruncate can be used for backwards compatibility.
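To make the difference between the two rounding modes concrete, here is a minimal standalone sketch of a 32-bit to 16-bit conversion that supports both. It is not the engine's code: `EncodeHalfBits`, `ERoundSketch`, and `CheckEncodeExamples` are hypothetical names, and the handling of overflow (mapped to infinity in both modes) and of NaN (mapped to a single quiet NaN pattern) are assumptions of this sketch that the text above does not specify.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical rounding selector for this sketch only.
enum class ERoundSketch { NearestEven, Truncate };

// Hypothetical helper: convert a 32-bit float to a half-precision bit pattern.
inline std::uint16_t EncodeHalfBits(float Value, ERoundSketch Mode)
{
    std::uint32_t F;
    std::memcpy(&F, &Value, sizeof(F));  // reinterpret the float's bits safely

    const std::uint32_t Sign = (F >> 16) & 0x8000u;
    const std::uint32_t Abs  = F & 0x7FFFFFFFu;

    if (Abs > 0x7F800000u) { return static_cast<std::uint16_t>(Sign | 0x7E00u); }  // NaN

    const int HalfExp = static_cast<int>(Abs >> 23) - 112;  // rebias 127 -> 15
    if (HalfExp >= 31) { return static_cast<std::uint16_t>(Sign | 0x7C00u); }  // Inf / overflow (assumption)
    if (HalfExp < -10) { return static_cast<std::uint16_t>(Sign); }            // too small: signed zero

    std::uint32_t Mant  = Abs & 0x007FFFFFu;
    std::uint32_t Shift = 13;  // 23 - 10 mantissa bits are dropped for normals
    std::uint32_t Base  = 0;
    if (HalfExp >= 1)
    {
        Base = static_cast<std::uint32_t>(HalfExp) << 10;
    }
    else
    {
        // Half denormal: make the implicit leading 1 explicit and shift further.
        Mant |= 0x00800000u;
        Shift = static_cast<std::uint32_t>(14 - HalfExp);
    }

    std::uint32_t Result = Base + (Mant >> Shift);
    const std::uint32_t Remainder = Mant & ((1u << Shift) - 1u);
    const std::uint32_t Halfway   = 1u << (Shift - 1u);

    // RTNE: round up above the halfway point, or at it when the result is odd.
    // A carry out of the mantissa correctly bumps the exponent.
    if (Mode == ERoundSketch::NearestEven &&
        (Remainder > Halfway || (Remainder == Halfway && (Result & 1u))))
    {
        ++Result;
    }
    return static_cast<std::uint16_t>(Sign | Result);
}

inline void CheckEncodeExamples()
{
    const float ThreeQuarterStep = 1.0f + 3.0f / 4096.0f;  // 1 + 0.75 * 2^-10
    assert(EncodeHalfBits(ThreeQuarterStep, ERoundSketch::NearestEven) == 0x3C01);
    assert(EncodeHalfBits(ThreeQuarterStep, ERoundSketch::Truncate)    == 0x3C00);

    const float Tie = 1.0f + 1.0f / 2048.0f;  // exactly halfway; even neighbor is 0x3C00
    assert(EncodeHalfBits(Tie, ERoundSketch::NearestEven) == 0x3C00);
}
```

Truncation always rounds toward zero, so it introduces a consistent downward bias in magnitude; RTNE picks the closer of the two neighboring half values and breaks exact ties toward the one with an even mantissa.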
Name | Description
---|---
Encoded |
Name | Description
---|---
FFloat16() | Default constructor.
FFloat16 ( | Copy constructor.
FFloat16 ( | Conversion constructor. Convert from Fp32 to Fp16.
Type | Name | Description
---|---|---
float | GetFloat() | Convert from Fp16 to Fp32.
 | IsNegative() | Is the float negative, without converting.
 | Set ( | Convert from Fp32 to Fp16 (RTNE).
 | SetTruncate ( | Convert from Fp32 to Fp16 (backward-compatible truncating conversion).
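One plausible reading of "without converting" for IsNegative() is a test of the sign bit on the stored 16-bit pattern. The sketch below illustrates that idea with a hypothetical stand-in type, `FHalfSketch`; it is not the engine's `FFloat16`, and both the `bool` return type and the position of the sign bit are assumptions of this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for illustration only; not the engine's FFloat16.
struct FHalfSketch
{
    std::uint16_t Encoded = 0;  // raw half-precision bit pattern

    // Sign test on the raw bits: no Fp16 -> Fp32 conversion is needed.
    bool IsNegative() const { return (Encoded & 0x8000u) != 0; }
};

inline void CheckSignExamples()
{
    FHalfSketch MinusTwo;
    MinusTwo.Encoded = 0xC000;  // -2.0 in half precision
    assert(MinusTwo.IsNegative());

    FHalfSketch One;
    One.Encoded = 0x3C00;       // 1.0 in half precision
    assert(!One.IsNegative());
}
```

Note that a pure sign-bit test also reports negative zero (0x8000) as negative; whether the engine's method behaves the same way is not stated in the table above.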