From 30f4e65eae97d0c90dc28432874ed25512125ec0 Mon Sep 17 00:00:00 2001
From: Matthias Kretz
Date: Fri, 1 Nov 2024 10:55:22 +0100
Subject: [PATCH] P3488

ChangeLog:

	* P3488_fp_excess_precision/discussion.tex:
	* P3488_fp_excess_precision/main.tex:
	* P3488_fp_excess_precision/resolutions.tex:
---
 P3488_fp_excess_precision/discussion.tex  | 47 +++++++++++-------
 P3488_fp_excess_precision/main.tex        | 60 ++++++++++++++++++++---
 P3488_fp_excess_precision/resolutions.tex | 15 +++---
 3 files changed, 89 insertions(+), 33 deletions(-)

diff --git a/P3488_fp_excess_precision/discussion.tex b/P3488_fp_excess_precision/discussion.tex
index f8ca8a0..bbf4517 100644
--- a/P3488_fp_excess_precision/discussion.tex
+++ b/P3488_fp_excess_precision/discussion.tex
@@ -1,5 +1,6 @@
 \section{Discussion}
 
+A general observation:
 A simplification where the implementation were free to use excess precision at
 runtime as it deems best would lead to surprising results:
 Consider two floating-point values \code{a} and \code{b} where
@@ -7,12 +8,25 @@ \section{Discussion}
 With arbitrary excess precision the optimizer would then be allowed to replace
 \code{a + b - b} with \code{a}.
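+
+As a concrete illustration (the function and the values are only a sketch;
+they are chosen such that \code{a + b} rounds to \code{b} in binary32):
+\medskip
+\begin{lstlisting}
+float surprise()
+{
+  float a = 1.0f;
+  float b = 1e8f;    // 1e8f + 1.0f rounds to 1e8f in binary32
+  return a + b - b;  // without excess precision: 0.0f
+                     // with (e.g.) double excess precision: 1.0f, i.e. the
+                     // result of replacing a + b - b with a
+}
+\end{lstlisting}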
 
+A general consequence of excess precision is that \fp evaluation leads to
+double rounding and thus potentially worse errors.
+Where the second rounding occurs is not fully reproducible and can change with
+unrelated code changes in the translation unit\footnote{e.g.\ because of
+register allocation}.
+
+Without excess precision, \code{std::float16_t} and \code{std::bfloat16_t}
+must either use a soft-float implementation or require dedicated hardware.
+Using \float (binary32) instructions is impossible with the currently possible
+values for \code{FLT_EVAL_METHOD}.
+An implementation that wants to evaluate \std\code{float16_t} /
+\std\code{bfloat16_t} in higher intermediate precision needs to set
+\code{FLT_EVAL_METHOD} to 1 or 2 (or 32?).
+
 \subsection{strictest: Disallow all excess precision}\label{d:1}
 
 I believe [expr.pre] p6 is fairly clear that it was never the design intent to
 exclude all excess precision.
 
-Implications:
+Implications of disallowing all excess precision:
 
 \begin{itemize}
   \item \Fp contraction into FMAs is non-conforming.
@@ -29,36 +43,31 @@ \subsection{strictest: Disallow all excess precision}\label{d:1}
 
 \subsection{compatible: Do exactly the same as C}\label{d:2}
 
-This may have been the original intent, but [lex.fcon] p3 suggests otherwise.
+It might have been the original intent to do the same as C, but [lex.fcon] p3
+suggests otherwise.
+Implications of adopting this as the resolution:
 
 \begin{itemize}
   \item \code{float x = 3.14f;} can require 8, 12, 16, or even more bytes to be
     stored in the resulting binary.
+    (This is the status quo of GCC since version 13.)
 
   \item \code{float x = 3.14f; assert(x == 3.14f);} is allowed to fail
    depending on implementation, target, and compiler flags.
-
-  \item \code{std::float16_t} and \code{std::bfloat16_t} can either use a
-    soft-float implementation or requires dedicated hardware.
-    Double rounding, by using binary32 instructions is impossible with
-    \code{FLT_EVAL_METHOD == 0}.
-    An implementation that wants to evaluate
-    \code{std::float16_t}/\code{std::bfloat16_t} in higher intermediate
-    precision needs to set \code{FLT_EVAL_METHOD} to 1 or 2 (or 32?).
+    (This is the status quo of GCC since version 13.)
 \end{itemize}
 
-\subsection{like C but only for run-time evaluation}\label{d:3}
+\subsection{like C but only for runtime evaluation}\label{d:3}
 
 \begin{itemize}
   \item The intent here appears to be that we want to prescribe reproducible
    \fp behavior.
-    In other words, all floating-point code that is \emph{not} evaluated at
-    run-time is reproducible.
-
-  \item However, we acknowledge the existence of hardware where this comes at
-    unreasonable performance cost. Because of these cases --- and only for
-    these --- the non-zero \code{FLT_EVAL_METHOD} modes exist.
-  \item Consequence: evaluation at higher precision leads to double rounding
-    and thus potentially worse errors.
+  \item However, since that has potentially dramatic consequences for runtime
+    performance, this restriction is only a recommendation for runtime
+    evaluation.
+    We thus acknowledge the existence of hardware where reproducible \fp
+    behavior comes at an unreasonable performance cost.
+    Because of these cases --- and only for these --- the non-zero
+    \code{FLT_EVAL_METHOD} modes exist.
 \end{itemize}

diff --git a/P3488_fp_excess_precision/main.tex b/P3488_fp_excess_precision/main.tex
index b3d87cb..062460a 100644
--- a/P3488_fp_excess_precision/main.tex
+++ b/P3488_fp_excess_precision/main.tex
@@ -1,6 +1,6 @@
 \newcommand\wgTitle{Floating-Point Excess Precision}
 \newcommand\wgName{Matthias Kretz }
-\newcommand\wgDocumentNumber{D3488R0}
+\newcommand\wgDocumentNumber{P3488R0}
 \newcommand\wgGroup{SG6, EWG}
 \newcommand\wgTarget{\CC{}26}
 %\newcommand\wgAcknowledgements{ }
@@ -327,20 +327,68 @@ \section{Floating-point contraction}
 Adding such an “attribute” to \CC{} itself is material for another paper, but
 should not be done in a resolution to a core issue.
 
+\subsection{Guaranteed opt-out of \fp contraction}
+
+It appears that, according to the footnote of [expr.pre] p6, the expression
+\lstinline@a * b + c@ can be transformed into an FMA, whereas
+\lstinline@auto(a * b) + c@ cannot.
+Likewise \lstinline@auto ab = a * b; ab + c@ would not lead to \fp contraction.
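+
+To illustrate the three forms just mentioned (under this reading of the
+footnote; the function names are only for illustration):
+\medskip
+\begin{lstlisting}
+float f1(float a, float b, float c)
+{ return a * b + c; }        // contraction into an FMA is allowed
+
+float f2(float a, float b, float c)
+{ return auto(a * b) + c; }  // per the footnote: no contraction
+
+float f3(float a, float b, float c)
+{
+  auto ab = a * b;
+  return ab + c;             // likewise: no contraction across ab
+}
+\end{lstlisting}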
+
+It is unclear whether a simple \fp wrapper class would inhibit \fp contraction:
+\medskip
+\begin{lstlisting}
+class Float
+{
+  float x;
+
+public:
+  Float(float xx) : x(xx) {}
+
+  friend Float operator+(Float a, Float b) { return a.x + b.x; }
+  friend Float operator*(Float a, Float b) { return a.x * b.x; }
+};
+
+Float test(Float a, Float b, Float c)
+{ return a * b + c; }  // is contraction allowed or not?
+\end{lstlisting}
+
+The copy constructor of \code{Float} implicitly copies the data member \code{x}.
+But there is no assignment or cast expression.
+The return statements in the binary operators of \code{Float} call the
+\code{Float(float)} constructor, which copies the \code{float} into \code{xx}
+and subsequently into \code{x}.
+Neither copy uses a cast or an assignment expression.
+Consequently, this wrapper class would still allow \fp contraction, correct?
+
+With a minor change of the \code{Float(float)} constructor to
+\medskip
+\begin{lstlisting}
+  Float(float xx) : x(float(xx)) {}
+\end{lstlisting}
+\fp contraction would be inhibited.
+
+I believe we need to clarify whether this matches the intent and at least
+add a note in the wording to explain this subtlety.
+
+
+
 \section{Wording}
 
-Very much TBD.
-But here's at least a sketch:
+TBD.
+But here's at least a sketch if we agree on adopting \ref{o:3}:
 
 \begin{enumerate}
-  \item Clarify [expr.pre] that it only provides this freedom for run-time
+  \item Clarify [expr.pre] that it only provides this freedom for runtime
    evaluation.
 
-  \item Clarify [expr.pre] that \fp contraction is a conforming transformation (but not required)
+  \item Clarify [expr.pre] that \fp contraction is a conforming transformation
+    (but not required).
+
+  \item Add the above \code{Float} class example to [expr.pre]?
 
   \item Stop inheriting \code{FLT_EVAL_METHOD} verbatim from C.
    We need to write our own wording that clarifies \code{FLT_EVAL_METHOD} only
-    applies to run-time evaluation and not to constants.
+    applies to runtime evaluation and not to constants.
    Also we need to consider adopting and adjusting the wording from Annex H,
    which is important for \code{std::float16_t} and \code{std::bfloat16_t}.
 \end{enumerate}
diff --git a/P3488_fp_excess_precision/resolutions.tex b/P3488_fp_excess_precision/resolutions.tex
index 4c73762..700056e 100644
--- a/P3488_fp_excess_precision/resolutions.tex
+++ b/P3488_fp_excess_precision/resolutions.tex
@@ -29,22 +29,21 @@ \subsection{compatible: Do exactly the same as C}\label{o:2}
 \end{itemize}
 \discussionref{2}
 
-\subsection{like C but only for run-time evaluation}\label{o:3}
+\subsection{like C but only for runtime evaluation}\label{o:3}
 \begin{itemize}
-  \item \code{FLT_EVAL_METHOD} only applies to run-time evaluation of \fp expressions.
+  \item \code{FLT_EVAL_METHOD} only applies to runtime evaluation of \fp expressions.
 
   \item The value of a \fp literal is always rounded to the precision of its type.
 
-  \item Evaluation of floating-point expressions at compile time is not allowed
+  \item Evaluation of floating-point expressions at compile-time is not allowed
    to use excess precision.
 
   \item \code{FLT_EVAL_METHOD != 0} at runtime is permitted.
-
-  \item Floating-point evaluation at runtime can use higher precision and is
-    only required to round to the precision of the floating-point type on cast
-    and assignment. The intermediate precision is exposed to the program via
-    \code{FLT_EVAL_METHOD} and thus cannot exceed \code{long double}.
+    \Fp evaluation at runtime can use higher precision and is only required to
+    round to the precision of the \fp type on cast and assignment
+    (see the sketch below).
+    The intermediate precision is exposed to the program via
+    \code{FLT_EVAL_METHOD}.
 \end{itemize}
 \discussionref{3}
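+
+A minimal sketch of what this means for user code (assuming an implementation
+that sets \code{FLT_EVAL_METHOD == 2}, i.e.\ evaluation in \code{long double};
+the function names are illustrative only):
+\medskip
+\begin{lstlisting}
+float uses_excess_precision(float a, float b, float c)
+{
+  // the whole expression may be evaluated in long double; no intermediate
+  // rounding to float is required
+  return a * b + c;
+}
+
+float forces_float_rounding(float a, float b, float c)
+{
+  float ab = a * b;                    // assignment: rounded to float here
+  return static_cast<float>(ab + c);   // cast: rounded to float here
+}
+\end{lstlisting}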