update library.json, minor edits, license (#6)
RobTillaart committed Dec 18, 2021
1 parent a2fdd9b commit 445058d
Showing 15 changed files with 55 additions and 35 deletions.
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2015-2021 Rob Tillaart
Copyright (c) 2015-2022 Rob Tillaart

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
54 changes: 40 additions & 14 deletions README.md
@@ -14,14 +14,14 @@ Arduino library to implement float16 data type.

## Description

This **experimental** library defines the float16 (2 byte) data type, including conversion
This **experimental** library defines the float16 (2 byte) data type, including conversion
function to and from float32 type. It is definitely **work in progress**.

The library implements the **Printable** interface so one can directly print the
The library implements the **Printable** interface so one can directly print the
float16 values in any stream e.g. Serial.

The primary usage of the float16 data type is to efficiently store and transport
a floating point number. As it uses only 2 bytes where float and double have typical
The primary usage of the float16 data type is to efficiently store and transport
a floating point number. As it uses only 2 bytes where float and double have typical
4 and 8 bytes, gains can be made at the price of range and precision.


@@ -31,13 +31,39 @@ a floating point number. As it uses only 2 bytes where float and double have typ
| attribute | value | notes |
|:----------|:-------------|:--------|
| size | 2 bytes | layout s eeeee mmmmmmmmmm
| sign | 1 bit |
| exponent | 5 bit |
| mantissa | 11 bit | ~ 3 digits
| minimum | 5.96046 E−8 | smallest positive number.
| | 1.0009765625 | 1 + 2^−10 = smallest nr larger than 1.
| maximum | 65504 |
| | |
| sign | 1 bit |
| exponent | 5 bit |
| mantissa | 11 bit | ~ 3 digits
| minimum | 5.96046 E−8 | smallest positive number.
| | 1.0009765625 | 1 + 2^−10 = smallest nr larger than 1.
| maximum | 65504 |
| | |


#### example values

```cpp
/*
SIGN EXP MANTISSA
0 01111 0000000000 = 1
0 01111 0000000001 = 1 + 2^−10 = 1.0009765625 (next smallest float after 1)
1 10000 0000000000 = −2
0 11110 1111111111 = 65504 (max half precision)
0 00001 0000000000 = 2^−14 ≈ 6.10352 × 10^−5 (minimum positive normal)
0 00000 1111111111 = 2^−14 − 2^−24 ≈ 6.09756 × 10^−5 (maximum subnormal)
0 00000 0000000001 = 2^−24 ≈ 5.96046 × 10^−8 (minimum positive subnormal)
0 00000 0000000000 = 0
1 00000 0000000000 = −0
0 11111 0000000000 = infinity
1 11111 0000000000 = −infinity
0 01101 0101010101 = 0.333251953125 ≈ 1/3
*/
```
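
The bit patterns above can be checked by hand with a small decoder. The sketch below is only an illustration of the s eeeee mmmmmmmmmm layout, written as plain C++ rather than as an Arduino sketch; `halfBitsToDouble()` and `checkExampleValues()` are hypothetical names and are not part of the float16 library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of the float16 library):
// decode a half-precision bit pattern  s eeeee mmmmmmmmmm  into a double.
double halfBitsToDouble(uint16_t bits)
{
  const int sign     = (bits >> 15) & 0x01;
  const int exponent = (bits >> 10) & 0x1F;
  const int mantissa = bits & 0x3FF;

  double value;
  if (exponent == 0)
  {
    // zero and subnormals: mantissa * 2^-24, no implicit leading 1
    value = std::ldexp((double)mantissa, -24);
  }
  else if (exponent == 31)
  {
    // all exponent bits set: infinity (mantissa 0) or NaN
    value = (mantissa == 0) ? INFINITY : NAN;
  }
  else
  {
    // normal numbers: (1024 + mantissa) * 2^(exponent - 15 - 10)
    value = std::ldexp((double)(1024 + mantissa), exponent - 25);
  }
  return sign ? -value : value;
}

// Checks a few rows of the example table; all values are exact in a double.
void checkExampleValues()
{
  assert(halfBitsToDouble(0x3C00) == 1.0);               // 0 01111 0000000000
  assert(halfBitsToDouble(0x3C01) == 1.0009765625);      // 0 01111 0000000001
  assert(halfBitsToDouble(0xC000) == -2.0);              // 1 10000 0000000000
  assert(halfBitsToDouble(0x7BFF) == 65504.0);           // 0 11110 1111111111
  assert(halfBitsToDouble(0x0001) == std::ldexp(1.0, -24)); // min subnormal
  assert(halfBitsToDouble(0x3555) == 0.333251953125);    // 0 01101 0101010101
}
```

The decoder works on the raw 16-bit pattern, so it can be compared against whatever a conversion routine produces for the same bits.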


## Interface
@@ -66,7 +92,7 @@ See array example for efficient storage using set/getBinary() functions.

#### Compare

Standard compare functions. Since 0.1.5 these are quite optimized,
Standard compare functions. Since 0.1.5 these are quite optimized,
so it is fast to compare e.g. 2 measurements.

- **bool operator == (const float16& f)**
@@ -80,7 +106,7 @@ so it is fast to compare e.g. 2 measurements.
#### Math (basic)

Math is done by converting to double, doing the math, and converting back.
These operators are added for convenience only.
These operators are added for convenience only.
Not planned to optimize these.
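
The convert, compute, convert-back pattern can be sketched on raw 16-bit patterns as follows. This is an illustration under stated assumptions, not the library's code: `halfToDouble()`, `doubleToHalf()`, `addHalf()` and `mulHalf()` are hypothetical names, and the rounding shown (round to nearest via `std::lrint` in the default rounding mode) is an assumption about how such a conversion might be done.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helpers for illustration; not the float16 library's implementation.

// Decode a half-precision bit pattern into a double.
double halfToDouble(uint16_t bits)
{
  const int exponent = (bits >> 10) & 0x1F;
  const int mantissa = bits & 0x3FF;
  double value;
  if (exponent == 0)       value = std::ldexp((double)mantissa, -24);   // zero, subnormal
  else if (exponent == 31) value = (mantissa == 0) ? INFINITY : NAN;    // inf, NaN
  else                     value = std::ldexp((double)(1024 + mantissa), exponent - 25);
  return (bits & 0x8000) ? -value : value;
}

// Encode a double as a half-precision bit pattern, rounding to nearest.
uint16_t doubleToHalf(double d)
{
  const uint16_t sign = std::signbit(d) ? 0x8000 : 0x0000;
  const double a = std::fabs(d);

  if (std::isnan(a)) return (uint16_t)(sign | 0x7E00);   // a quiet NaN pattern
  if (a >= 65520.0)  return (uint16_t)(sign | 0x7C00);   // too large: rounds to INF
  if (a < std::ldexp(1.0, -14))
  {
    // subnormal range: nearest multiple of 2^-24 (1024 becomes the minimum normal)
    return (uint16_t)(sign | std::lrint(std::ldexp(a, 24)));
  }

  int e;
  const double m = std::frexp(a, &e);           // a = m * 2^e, m in [0.5, 1)
  long sig = std::lrint(std::ldexp(m, 11));     // 11-bit significand, 1024..2048
  if (sig == 2048) { sig = 1024; e++; }         // rounding carried into the exponent
  assert(sig >= 1024 && sig < 2048);
  return (uint16_t)(sign | ((e + 14) << 10) | (sig - 1024));
}

// Basic math: widen to double, compute, narrow back to 16 bits.
uint16_t addHalf(uint16_t x, uint16_t y)
{
  return doubleToHalf(halfToDouble(x) + halfToDouble(y));
}

uint16_t mulHalf(uint16_t x, uint16_t y)
{
  return doubleToHalf(halfToDouble(x) * halfToDouble(y));
}
```

With these helpers, `addHalf(0x3C00, 0x3C00)` (1 + 1) yields `0x4000` (2), while `addHalf(0x3C00, 0x0001)` returns `0x3C00` again, because 1 + 2^−24 is not representable in 16 bits. Every operation pays for two conversions, which is why such operators are a convenience rather than a fast path.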

- **float16 operator + (const float16& f)**
@@ -106,7 +132,7 @@ negation operator.
## Future


#### 0.1.6
#### 0.1.x

- update documentation.
- unit tests of the above.
3 changes: 1 addition & 2 deletions examples/float16_test0/float16_test0.ino
@@ -1,13 +1,11 @@
//
// FILE: float16_test0.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2015-03-11
// URL: https://github.com/RobTillaart/float16
//


/*
SIGN EXP MANTISSA
0 01111 0000000000 = 1
@@ -29,6 +27,7 @@
0 01101 0101010101 = 0.333251953125 ≈ 1/3
*/


#include "float16.h"


1 change: 0 additions & 1 deletion examples/float16_test1/float16_test1.ino
@@ -1,7 +1,6 @@
//
// FILE: float16_test1.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2015-03-11
// URL: https://github.com/RobTillaart/float16
1 change: 0 additions & 1 deletion examples/float16_test_all/float16_test_all.ino
@@ -1,7 +1,6 @@
//
// FILE: float16_test_all.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2021-11-27
// URL: https://github.com/RobTillaart/float16
1 change: 0 additions & 1 deletion examples/float16_test_array/float16_test_array.ino
@@ -1,7 +1,6 @@
//
// FILE: float16_test_array.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2015-03-11
// URL: https://github.com/RobTillaart/float16
1 change: 0 additions & 1 deletion examples/float16_test_negative/float16_test_negative.ino
@@ -1,7 +1,6 @@
//
// FILE: float16_test_negative.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2021-11-26
// URL: https://github.com/RobTillaart/float16
@@ -1,7 +1,6 @@
//
// FILE: float16_test_performance.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2021-11-26
// URL: https://github.com/RobTillaart/float16
@@ -15,6 +14,7 @@ uint32_t start, stop;
volatile float f;
volatile bool b;


void setup()
{
while (!Serial);
1 change: 0 additions & 1 deletion examples/float16_test_powers2/float16_test_powers2.ino
@@ -1,7 +1,6 @@
//
// FILE: float16_test_powers2.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2015-03-11
// URL: https://github.com/RobTillaart/float16
3 changes: 1 addition & 2 deletions examples/float16_test_special/float16_test_special.ino
@@ -1,18 +1,17 @@
//
// FILE: float16_test_special.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: test float16
// DATE: 2021-11-26
// URL: https://github.com/RobTillaart/float16
//

// test special values ...
// https://github.com/RobTillaart/float16/issues/2


#include "float16.h"


uint16_t value[32] =
{
0xFC00, 0xF400, 0xEC00, 0xE400, 0xDC00, 0xD400, 0xCC00, 0xC400,
10 changes: 6 additions & 4 deletions float16.cpp
@@ -1,24 +1,26 @@
//
// FILE: float16.cpp
// AUTHOR: Rob Tillaart
// VERSION: 0.1.4
// VERSION: 0.1.5
// PURPOSE: library for Float16s for Arduino
// URL: http://en.wikipedia.org/wiki/Half-precision_floating-point_format
//
// HISTORY:
// 0.1.00 2015-03-10 initial version
// 0.1.01 2015-03-12 make base conversion separate functions
// 0.1.02 2015-03-14 getting rounding right
// 0.1.03
// 0.1.03
// 0.1.4 2021-11-26 setup repo to get it working again.
// still experimental.
//
// 0.1.5 2021-12-02 add basic math, optimize compare operators
// 0.1.6 2021-12-18 update library.json, license, minor edits


#include "float16.h"

// #define DEBUG


// CONSTRUCTOR
float16::float16(double f)
{
@@ -258,7 +260,7 @@ uint16_t float16::f32tof16(float f) const
// normal numbers
exp = exp - 127 + 15;
// overflow does not fit => INF
if (exp > 30)
if (exp > 30)
{
return sgn ? 0xFC00 : 0x7C00; // -INF : INF
}
4 changes: 2 additions & 2 deletions float16.h
@@ -2,7 +2,7 @@
//
// FILE: float16.h
// AUTHOR: Rob Tillaart
// VERSION: 0.1.5
// VERSION: 0.1.6
// PURPOSE: Arduino library to implement float16 data type.
// half-precision floating point format,
// used for efficient storage and transport.
@@ -12,7 +12,7 @@

#include "Arduino.h"

#define FLOAT16_LIB_VERSION (F("0.1.5"))
#define FLOAT16_LIB_VERSION (F("0.1.6"))


class float16: public Printable
2 changes: 1 addition & 1 deletion library.json
@@ -15,7 +15,7 @@
"type": "git",
"url": "https://github.com/RobTillaart/float16.git"
},
"version": "0.1.5",
"version": "0.1.6",
"license": "MIT",
"frameworks": "arduino",
"platforms": "*",
2 changes: 1 addition & 1 deletion library.properties
@@ -1,5 +1,5 @@
name=float16
version=0.1.5
version=0.1.6
author=Rob Tillaart <[email protected]>
maintainer=Rob Tillaart <[email protected]>
sentence=Arduino library to implement float16 data type.
3 changes: 1 addition & 2 deletions test/unit_test_001.cpp
@@ -38,6 +38,7 @@

unittest_setup()
{
fprintf(stderr, "FLOAT16_LIB_VERSION: %s\n", (char*) FLOAT16_LIB_VERSION);
}


@@ -48,8 +49,6 @@ unittest_teardown()

unittest(test_constructor)
{
fprintf(stderr, "FLOAT16_LIB_VERSION: %s\n", (char*) FLOAT16_LIB_VERSION);

float16 zero;
assertEqualFloat(0.000, zero.toDouble(), 1e-3);
float16 one(1);