Skip to content

[libc++] Refactor basic_filebuf::overflow() #144793

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ldionne
Copy link
Member

@ldionne ldionne commented Jun 18, 2025

Refactor the function to streamline the logic so it matches the specification in [filebuf.virtuals] more closely. In particular, avoid modifying the put area pointers when we loop around after a partial codecvt conversion.

Note that we're technically not up-to-spec in this implementation, since the Standard says that we shouldn't try more than once after a partial codecvt conversion. However, this refactoring attempts not to change any functionality.

Refactor the function to streamline the logic so it matches the specification
in [filebuf.virtuals] more closely. In particular, avoid modifying the put
area pointers when we loop around after a partial codecvt conversion.

Note that we're technically not up-to-spec in this implementation, since
the Standard says that we shouldn't try more than once after a partial
codecvt conversion. However, this refactoring attempts not to change any
functionality.
@ldionne ldionne requested a review from a team as a code owner June 18, 2025 21:08
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jun 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 18, 2025

@llvm/pr-subscribers-libcxx

Author: Louis Dionne (ldionne)

Changes

Refactor the function to streamline the logic so it matches the specification in [filebuf.virtuals] more closely. In particular, avoid modifying the put area pointers when we loop around after a partial codecvt conversion.

Note that we're technically not up-to-spec in this implementation, since the Standard says that we shouldn't try more than once after a partial codecvt conversion. However, this refactoring attempts not to change any functionality.


Full diff: https://github.com/llvm/llvm-project/pull/144793.diff

1 Files Affected:

  • (modified) libcxx/include/fstream (+36-21)
diff --git a/libcxx/include/fstream b/libcxx/include/fstream
index 00aa00ff7e9cd..f9cbac2ab0578 100644
--- a/libcxx/include/fstream
+++ b/libcxx/include/fstream
@@ -835,35 +835,50 @@ typename basic_filebuf<_CharT, _Traits>::int_type basic_filebuf<_CharT, _Traits>
   }
   if (this->pptr() != this->pbase()) {
     if (__always_noconv_) {
-      size_t __nmemb = static_cast<size_t>(this->pptr() - this->pbase());
-      if (std::fwrite(this->pbase(), sizeof(char_type), __nmemb, __file_) != __nmemb)
+      size_t __n = static_cast<size_t>(this->pptr() - this->pbase());
+      if (std::fwrite(this->pbase(), sizeof(char_type), __n, __file_) != __n)
         return traits_type::eof();
     } else {
-      char* __extbe = __extbuf_;
-      codecvt_base::result __r;
+      if (!__cv_)
+        std::__throw_bad_cast();
+
+      // See [filebuf.virtuals]
+      char_type* __b = this->pbase();
+      char_type* __p = this->pptr();
+      const char_type* __end;
+      char* __extbuf_end = __extbuf_;
       do {
-        if (!__cv_)
-          std::__throw_bad_cast();
-
-        const char_type* __e;
-        __r = __cv_->out(__st_, this->pbase(), this->pptr(), __e, __extbuf_, __extbuf_ + __ebs_, __extbe);
-        if (__e == this->pbase())
+        codecvt_base::result __r = __cv_->out(__st_, __b, __p, __end, __extbuf_, __extbuf_ + __ebs_, __extbuf_end);
+        if (__end == __b)
           return traits_type::eof();
+
+        // No conversion needed: output characters directly to the file, done.
         if (__r == codecvt_base::noconv) {
-          size_t __nmemb = static_cast<size_t>(this->pptr() - this->pbase());
-          if (std::fwrite(this->pbase(), 1, __nmemb, __file_) != __nmemb)
+          size_t __n = static_cast<size_t>(__p - __b);
+          if (std::fwrite(__b, 1, __n, __file_) != __n)
             return traits_type::eof();
-        } else if (__r == codecvt_base::ok || __r == codecvt_base::partial) {
-          size_t __nmemb = static_cast<size_t>(__extbe - __extbuf_);
-          if (std::fwrite(__extbuf_, 1, __nmemb, __file_) != __nmemb)
+          break;
+
+          // Conversion successful: output the converted characters to the file, done.
+        } else if (__r == codecvt_base::ok) {
+          size_t __n = static_cast<size_t>(__extbuf_end - __extbuf_);
+          if (std::fwrite(__extbuf_, 1, __n, __file_) != __n)
+            return traits_type::eof();
+          break;
+
+          // Conversion partially successful: output converted characters to the file and repeat with the
+          // remaining characters.
+        } else if (__r == codecvt_base::partial) {
+          size_t __n = static_cast<size_t>(__extbuf_end - __extbuf_);
+          if (std::fwrite(__extbuf_, 1, __n, __file_) != __n)
             return traits_type::eof();
-          if (__r == codecvt_base::partial) {
-            this->setp(const_cast<char_type*>(__e), this->pptr());
-            this->__pbump(this->epptr() - this->pbase());
-          }
-        } else
+          __b = const_cast<char_type*>(__end);
+          continue;
+
+        } else {
           return traits_type::eof();
-      } while (__r == codecvt_base::partial);
+        }
+      } while (true);
     }
     this->setp(__pb_save, __epb_save);
   }

@@ -835,35 +835,50 @@ typename basic_filebuf<_CharT, _Traits>::int_type basic_filebuf<_CharT, _Traits>
}
if (this->pptr() != this->pbase()) {
if (__always_noconv_) {
size_t __nmemb = static_cast<size_t>(this->pptr() - this->pbase());
if (std::fwrite(this->pbase(), sizeof(char_type), __nmemb, __file_) != __nmemb)
size_t __n = static_cast<size_t>(this->pptr() - this->pbase());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name this __put_buffer_size?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. This isn't really the size of the put area though, as that would be epptr() - pbase(). This is really the number of characters inside the put area that we want to write to the underlying file stream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants