apache · JFinis · Nov 10, 2023 · wgtmac · Jan 15, 2024 · etseidl
diff --git a/LogicalTypes.md b/LogicalTypes.md
@@ -253,7 +253,7 @@ Used in contexts where precision is traded off for smaller footprint and potenti
 
 The primitive type is a 2-byte fixed length binary.
 
-The sort order for `FLOAT16` is signed (with special handling of NANs and signed zeros); it uses the same [logic](https://github.com/apache/parquet-format#sort-order) as `FLOAT` and `DOUBLE`.
+The type-defined sort order for `FLOAT16` is signed (with special handling of NaNs and signed zeros), as for `FLOAT` and `DOUBLE`. It is recommended that writers use IEEE754TotalOrder when writing columns of this type for a well-defined handling of NaNs and signed zeros. See the `ColumnOrder` union in the [Thrift definition](src/main/thrift/parquet.thrift) for details.
 
 ## Temporal Types
 

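Note on the `FLOAT16` change above: the bit trick described for `DOUBLE` in the Thrift comment carries over to 2-byte values. The following is a minimal sketch, not part of the patch. It assumes the two bytes hold the IEEE 754 half-precision bits in little-endian order, and the function names are hypothetical.

```rust
/// Maps raw half-precision bits to a signed integer key whose natural
/// ordering matches IEEE 754 totalOrder.
/// Assumption: `bytes` holds the FLOAT16 value in little-endian order.
fn float16_total_order_key(bytes: [u8; 2]) -> i16 {
    // view the bits as a signed 16-bit integer
    let bits = i16::from_le_bytes(bytes);
    // flip all non-sign bits if the sign bit is set
    bits ^ (((bits >> 15) as u16) >> 1) as i16
}

/// Returns true if x orders at or below y under totalOrder.
fn float16_total_order(x: [u8; 2], y: [u8; 2]) -> bool {
    float16_total_order_key(x) <= float16_total_order_key(y)
}
```

A writer could use the key to track min/max over raw values without decoding them to a wider float type. With this key, `-0` sorts below `+0`, and a NaN with the sign bit set sorts below negative infinity.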
diff --git a/README.md b/README.md
@@ -146,40 +146,13 @@ documented in [LogicalTypes.md][logical-types].
 [logical-types]: LogicalTypes.md
 
 ### Sort Order
-
 Parquet stores min/max statistics at several levels (such as Column Chunk,
-Column Index and Data Page). Comparison for values of a type obey the
-following rules:
-
-1.  Each logical type has a specified comparison order. If a column is
-    annotated with an unknown logical type, statistics may not be used
-    for pruning data. The sort order for logical types is documented in
-    the [LogicalTypes.md][logical-types] page.
-2.  For primitive types, the following rules apply:
-
-    * BOOLEAN - false, true
-    * INT32, INT64 - Signed comparison.
-    * FLOAT, DOUBLE - Signed comparison with special handling of NaNs and
-      signed zeros.   The details are documented in the
-      [Thrift definition](src/main/thrift/parquet.thrift) in the
-      `ColumnOrder` union. They are summarized here but the Thrift definition
-      is considered authoritative:
-      * NaNs should not be written to min or max statistics fields.
-      * If the computed max value is zero (whether negative or positive),
-        `+0.0` should be written into the max statistics field.
-      * If the computed min value is zero (whether negative or positive),
-        `-0.0` should be written into the min statistics field.
-
-      For backwards compatibility when reading files:
-      * If the min is a NaN, it should be ignored.
-      * If the max is a NaN, it should be ignored.
-      * If the min is +0, the row group may contain -0 values as well.
-      * If the max is -0, the row group may contain +0 values as well.
-      * When looking for NaN values, min and max should be ignored.
-
-    * BYTE_ARRAY and FIXED_LEN_BYTE_ARRAY - Lexicographic unsigned byte-wise
-      comparison.
-
+Column Index, and Data Page). These statistics are computed according to a
+sort order, which is defined for each column in the file footer. Parquet
+supports common sort orders for logical and primitive types, as well as
+special orders for types whose common sort order is ambiguous (e.g., NaN
+ordering for floating point types). The details are documented in the
+[Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union.
 
 ## Nested Encoding
 To encode nested columns, Parquet uses the Dremel encoding with definition and

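Note on the README change above: since the sort order is now a per-column property in the footer, a reader has to check it before trusting min/max statistics. The sketch below illustrates that check under simplifying assumptions. The enums and the function are hypothetical and do not mirror any existing reader API. Rejecting `IEEE754TotalOrder` on non-floating-point columns is an assumption derived from the restriction stated in the Thrift comment, and the extra rules for unknown logical types are left out.

```rust
/// Hypothetical reader-side view of the `ColumnOrder` union.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnOrder {
    TypeDefined,
    Ieee754Total,
    /// A union member this reader does not recognize.
    Unknown,
}

/// Hypothetical, simplified classification of a column's type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnKind {
    Float,   // physical type FLOAT
    Double,  // physical type DOUBLE
    Float16, // logical type FLOAT16
    Other,
}

/// Returns true if min/max statistics may be considered for pruning.
fn stats_usable(order: ColumnOrder, kind: ColumnKind) -> bool {
    match order {
        // Simplification: ignores the case of an unknown logical type.
        ColumnOrder::TypeDefined => true,
        // Only floating point columns may use the total order.
        ColumnOrder::Ieee754Total => matches!(
            kind,
            ColumnKind::Float | ColumnKind::Double | ColumnKind::Float16
        ),
        // Unsupported order: min and max stats should be ignored.
        ColumnOrder::Unknown => false,
    }
}
```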
diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
@@ -288,7 +288,7 @@ struct MapType {}     // see LogicalTypes.md
 struct ListType {}    // see LogicalTypes.md
 struct EnumType {}    // allowed for BINARY, must be encoded with UTF-8
 struct DateType {}    // allowed for INT32
-struct Float16Type {} // allowed for FIXED[2], must encoded raw FLOAT16 bytes
+struct Float16Type {} // allowed for FIXED[2], must be encoded as raw FLOAT16 bytes (see LogicalTypes.md)
 
 /**
  * Logical type to annotate a column that is always null.
@@ -788,7 +788,7 @@ struct ColumnMetaData {
   /** total byte size of all uncompressed pages in this column chunk (including the headers) **/
   6: required i64 total_uncompressed_size
 
-  /** total byte size of all compressed, and potentially encrypted, pages 
+  /** total byte size of all compressed, and potentially encrypted, pages
    *  in this column chunk (including the headers) **/
   7: required i64 total_compressed_size
 
@@ -903,17 +903,20 @@ struct RowGroup {
    * in this row group **/
   5: optional i64 file_offset
 
-  /** Total byte size of all compressed (and potentially encrypted) column data 
+  /** Total byte size of all compressed (and potentially encrypted) column data
    *  in this row group **/
   6: optional i64 total_compressed_size
-  
+
   /** Row group ordinal in the file **/
   7: optional i16 ordinal
 }
 
 /** Empty struct to signal the order defined by the physical or logical type */
 struct TypeDefinedOrder {}
 
+/** Empty struct to signal IEEE 754 total order for floating point types */
+struct IEEE754TotalOrder {}
+
 /**
  * Union to specify the order used for the min_value and max_value fields for a
  * column. This union takes the role of an enhanced enum that allows rich
@@ -922,6 +925,7 @@ struct TypeDefinedOrder {}
  * Possible values are:
  * * TypeDefinedOrder - the column uses the order defined by its logical or
  *                      physical type (if there is no logical type).
+ * * IEEE754TotalOrder - the floating point column uses IEEE 754 total order.
  *
  * If the reader does not support the value of this union, min and max stats
  * for this column should be ignored.
@@ -941,6 +945,7 @@ union ColumnOrder {
    *   UINT64 - unsigned comparison
    *   DECIMAL - signed comparison of the represented value
    *   DATE - signed comparison
+   *   FLOAT16 - signed comparison of the represented value (*)
    *   TIME_MILLIS - signed comparison
    *   TIME_MICROS - signed comparison
    *   TIMESTAMP_MILLIS - signed comparison
@@ -962,15 +967,19 @@ union ColumnOrder {
    *   BYTE_ARRAY - unsigned byte-wise comparison
    *   FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison
    *
-   * (*) Because the sorting order is not specified properly for floating
-   *     point values (relations vs. total ordering) the following
+   * (*) Because the precise sorting order is ambiguous for floating
+   *     point types due to underspecified handling of NaN and -0/+0,
+   *     it is recommended that writers use IEEE_754_TOTAL_ORDER
+   *     for these types.
+   *
+   *     If TYPE_ORDER is used for floating point types, then the following
    *     compatibility rules should be applied when reading statistics:
    *     - If the min is a NaN, it should be ignored.
    *     - If the max is a NaN, it should be ignored.
    *     - If the min is +0, the row group may contain -0 values as well.
    *     - If the max is -0, the row group may contain +0 values as well.
    *     - When looking for NaN values, min and max should be ignored.
-   * 
+   *
    *     When writing statistics the following rules should be followed:
    *     - NaNs should not be written to min or max statistics fields.
    *     - If the computed max value is zero (whether negative or positive),
@@ -979,6 +988,58 @@ union ColumnOrder {
    *       `-0.0` should be written into the min statistics field.
    */
   1: TypeDefinedOrder TYPE_ORDER;
+
+  /**
+   * The floating point type is ordered according to the totalOrder predicate,
+   * as defined in section 5.10 of IEEE-754 (2008 revision). Only columns of
+   * physical type FLOAT or DOUBLE, or logical type FLOAT16 may use this ordering.
+   *
+   * Intuitively, this orders floats mathematically, but defines -0 to be less
+   * than +0, -NaN to be less than anything else, and +NaN to be greater than
+   * anything else. It also defines an order between different bit representations
+   * of the same value.
+   *
+   * The formal definition is as follows:
+   *   a) If x<y, totalOrder(x, y) is true.
+   *   b) If x>y, totalOrder(x, y) is false.
+   *   c) If x=y:
+   *     1) totalOrder(−0, +0) is true.
+   *     2) totalOrder(+0, −0) is false.
+   *     3) If x and y represent the same floating-point datum:
+   *        i) If x and y have negative sign, totalOrder(x, y) is true if and
+   *           only if the exponent of x ≥ the exponent of y
+   *       ii) otherwise totalOrder(x, y) is true if and only if the exponent
+   *           of x ≤ the exponent of y.
+   *   d) If x and y are unordered numerically because x or y is NaN:
+   *     1) totalOrder(−NaN, y) is true where −NaN represents a NaN with
+   *        negative sign bit and y is a floating-point number.
+   *     2) totalOrder(x, +NaN) is true where +NaN represents a NaN with
+   *        positive sign bit and x is a floating-point number.
+   *     3) If x and y are both NaNs, then totalOrder reflects a total ordering
+   *        based on:
+   *         i) negative sign orders below positive sign
+   *        ii) signaling orders below quiet for +NaN, reverse for −NaN
+   *       iii) lesser payload, when regarded as an integer, orders below
+   *            greater payload for +NaN, reverse for −NaN.
+   *
+   * Note that this ordering can be implemented efficiently in software
+   * by flipping all non-sign bits in case of a set sign bit to achieve a
+   * two's-complement-like representation and then performing a signed
+   * integer comparison on the resulting bits.
+   * E.g., this is a possible implementation for DOUBLE in Rust:
+   *
+   *   pub fn total_order(x: f64, y: f64) -> bool {
+   *     // reinterpret the bits as signed integers
+   *     let mut x_int = x.to_bits() as i64;
+   *     let mut y_int = y.to_bits() as i64;
+   *     // flip all non-sign bits if the sign bit is set
+   *     x_int ^= (((x_int >> 63) as u64) >> 1) as i64;
+   *     y_int ^= (((y_int >> 63) as u64) >> 1) as i64;
+   *     // signed integer comparison yields totalOrder(x, y)
+   *     x_int <= y_int
+   *   }
+   */
+  2: IEEE754TotalOrder IEEE_754_TOTAL_ORDER;
 }
 
 struct PageLocation {
@@ -1148,30 +1209,30 @@ struct FileMetaData {
    */
   7: optional list<ColumnOrder> column_orders;
 
-  /** 
+  /**
    * Encryption algorithm. This field is set only in encrypted files
    * with plaintext footer. Files with encrypted footer store algorithm id
    * in FileCryptoMetaData structure.
    */
   8: optional EncryptionAlgorithm encryption_algorithm
 
-  /** 
-   * Retrieval metadata of key used for signing the footer. 
-   * Used only in encrypted files with plaintext footer. 
-   */ 
+  /**
+   * Retrieval metadata of key used for signing the footer.
+   * Used only in encrypted files with plaintext footer.
+   */
   9: optional binary footer_signing_key_metadata
 }
 
 /** Crypto metadata for files with encrypted footer **/
 struct FileCryptoMetaData {
-  /** 
+  /**
    * Encryption algorithm. This field is only used for files
    * with encrypted footer. Files with plaintext footer store algorithm id
    * inside footer (FileMetaData structure).
    */
   1: required EncryptionAlgorithm encryption_algorithm
-    
-  /** Retrieval metadata of key used for encryption of footer, 
+
+  /** Retrieval metadata of key used for encryption of footer,
    *  and (possibly) columns **/
   2: optional binary key_metadata
 }
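
Note on the `TYPE_ORDER` writer rules in the `ColumnOrder` comment: the rules (no NaNs in min/max, `-0.0` for a zero min, `+0.0` for a zero max) can be sketched for `DOUBLE` as below. This is an illustrative sketch, not part of the patch. The function name is hypothetical, and returning `None` when every value is NaN is an assumption about how a writer would signal that no statistics should be written.

```rust
/// Computes (min, max) for a DOUBLE column under the TYPE_ORDER writer rules.
/// Returns None if there is no non-NaN value (assumed to mean: write no stats).
fn type_order_min_max(values: &[f64]) -> Option<(f64, f64)> {
    // NaNs should not be written to min or max statistics fields.
    let mut non_nan = values.iter().copied().filter(|v| !v.is_nan());
    let first = non_nan.next()?;
    let (mut min, mut max) = (first, first);
    for v in non_nan {
        if v < min {
            min = v;
        }
        if v > max {
            max = v;
        }
    }
    // A zero min (either sign) is written as -0.0.
    if min == 0.0 {
        min = -0.0;
    }
    // A zero max (either sign) is written as +0.0.
    if max == 0.0 {
        max = 0.0;
    }
    Some((min, max))
}
```

The zero normalization is needed because `-0.0 == 0.0` under ordinary float comparison, so the loop alone cannot tell which zero it has kept.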