diff --git a/blog/2023-12-30-fast-string/index.html b/blog/2023-12-30-fast-string/index.html
index 4f03048..973b65d 100644
--- a/blog/2023-12-30-fast-string/index.html
+++ b/blog/2023-12-30-fast-string/index.html
@@ -26,7 +26,7 @@
 ---------------------------------^^
 /* Actual capacity data. */ |||
 /* Is it large? */ ||
- /* Is it medium? */ |
For a small string, both of those flag bits are 0. This is important and it's the final piece to a puzzle: where do we store the size, for small strings? Well, we store it in the remaining 6 bits of capacity data. But we don't just store the size, oh no. We store the remaining capacity (max_size - size). This lovely treat allows us to use that final byte as the null terminator when the string is full, since the two flag bits will be 0 and the remaining capacity will be 0, thus the byte will be 0.
This means folly's string allows for 23 bytes of small string data in a 24 byte string. That's 23:34 = 0.958
, compared to the previous 15:32 = 0.469
. Our string is 24 bytes, compared to previous 32 bytes, too! A very impressive design.
There's a trick which all three of the string classes use called empty member optimization and I'll explain it because it's another example of how crazy C++ is. In C++, an empty struct can't have the size of 0. It generally has the size of 1. This is important for addressing, as I'll show here.
struct empty
+ /* Is it medium? */ |
For a small string, both of those flag bits are 0. This is important and it's the final piece to a puzzle: where do we store the size, for small strings? Well, we store it in the remaining 6 bits of capacity data. But we don't just store the size, oh no. We store the remaining capacity (max_size - size). This lovely treat allows us to use that final byte as the null terminator when the string is full, since the two flag bits will be 0 and the remaining capacity will be 0, thus the byte will be 0.
This means folly's string allows for 23 bytes of small string data in a 24 byte string. That's 23:24 = 0.958
, compared to the previous 15:32 = 0.469
. Our string is 24 bytes, compared to previous 32 bytes, too! A very impressive design.
There's a trick which all three of the string classes use called empty member optimization and I'll explain it because it's another example of how crazy C++ is. In C++, an empty struct can't have the size of 0. It generally has the size of 1. This is important for addressing, as I'll show here.
struct empty
{ };
struct foo
diff --git a/blog/feed.xml b/blog/feed.xml
index c5cbcf7..40814a2 100644
--- a/blog/feed.xml
+++ b/blog/feed.xml
@@ -1 +1 @@
-2023-12-30T21:14:01.778401081Z jank blog https://jank-lang.org/blog/ jank's new persistent string is fast 2023-12-30T00:00:00Z 2023-12-30T00:00:00Z https://jank-lang.org/blog/2023-12-30-fast-string Jeaye Wilkerson <p>One thing I've been meaning to do is build a custom string class for jank. I had some time, during the holidays, between wrapping up this quarter's work and starting on next quarter's, so I decided to see if I could beat both <code>std::string</code> and <code>folly::fbstring</code>, in terms of performance. After all, if we're gonna make a string class, it'll need to be fast. :)</p> jank development update - Load all the modules! 2023-12-17T00:00:00Z 2023-12-17T00:00:00Z https://jank-lang.org/blog/2023-12-17-module-loading Jeaye Wilkerson <p>I've been quiet for the past couple of months, finishing up this work on jank's module loading, class path handling, aliasing, and var referring. Along the way, I ran into some very interesting bugs and we're in for a treat of technical detail in this holiday edition of jank development updates! A warm shout out to my <a href="https://github.com/sponsors/jeaye">Github sponsors</a> and <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for sponsoring this work.</p> jank development update - Module loading 2023-10-14T00:00:00Z 2023-10-14T00:00:00Z https://jank-lang.org/blog/2023-10-14-module-loading Jeaye Wilkerson <p>For the past month and a half, I've been building out jank's support for <code>clojure.core/require</code>, including everything from class path handling to compiling jank files to intermediate code written to the filesystem. This is a half-way report for the quarter. 
As a warm note, my work on jank this quarter is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>.</p> jank development update - Object model results 2023-08-26T00:00:00Z 2023-08-26T00:00:00Z https://jank-lang.org/blog/2023-08-26-object-model Jeaye Wilkerson <p>As summer draws to a close, in the Pacific Northwest, so too does my term of sponsored work focused on a faster object model for jank. Thanks so much to <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for funding jank's development. The past quarter has been quite successful and I'm excited to share the results.</p> jank development update - A faster object model 2023-07-08T00:00:00Z 2023-07-08T00:00:00Z https://jank-lang.org/blog/2023-07-08-object-model Jeaye Wilkerson <p>This quarter, my work on jank is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>. The terms of the work are to research a new object model for jank, with the goal of making jank code faster across the board. This is a half-way report and I'm excited to share my results!</p> jank development update - Optimizing a ray tracer 2023-04-07T00:00:00Z 2023-04-07T00:00:00Z https://jank-lang.org/blog/2023-04-07-ray-tracing Jeaye Wilkerson <p>After the <a href="/blog/2023-01-13-optimizing-sequences">last post</a>, which focused on optimizing jank's sequences, I wanted to get jank running a ray tracer I had previously written in Clojure. 
In this post, I document what was required to start ray tracing in jank and, more importantly, how I chased down the run time in a fierce battle with Clojure's performance.</p> jank development update - Optimizing sequences 2023-01-13T00:00:00Z 2023-01-13T00:00:00Z https://jank-lang.org/blog/2023-01-13-optimizing-sequences Jeaye Wilkerson <p>In this episode of jank's development updates, we follow an exciting few weekends as I was digging deep into Clojure's sequence implementation, building jank's equivalent, and then benchmarking and profiling in a dizzying race to the bottom.</p> jank development update - Lots of new changes 2022-12-08T00:00:00Z 2022-12-08T00:00:00Z https://jank-lang.org/blog/2022-12-08-progress-update Jeaye Wilkerson <p>I was previously giving updates only in the <a href="https://clojurians.slack.com/archives/C03SRH97FDK">#jank</a> Slack channel, but some of these are getting large enough to warrant more prose. Thus, happily, I can announce that jank has a new blog and I have a <i>lot</i> of new progress to report! Let's get into the details.</p>
\ No newline at end of file
+2023-12-30T22:08:40.115640319Z jank blog https://jank-lang.org/blog/ jank's new persistent string is fast 2023-12-30T00:00:00Z 2023-12-30T00:00:00Z https://jank-lang.org/blog/2023-12-30-fast-string Jeaye Wilkerson <p>One thing I've been meaning to do is build a custom string class for jank. I had some time, during the holidays, between wrapping up this quarter's work and starting on next quarter's, so I decided to see if I could beat both <code>std::string</code> and <code>folly::fbstring</code>, in terms of performance. After all, if we're gonna make a string class, it'll need to be fast. :)</p> jank development update - Load all the modules! 2023-12-17T00:00:00Z 2023-12-17T00:00:00Z https://jank-lang.org/blog/2023-12-17-module-loading Jeaye Wilkerson <p>I've been quiet for the past couple of months, finishing up this work on jank's module loading, class path handling, aliasing, and var referring. Along the way, I ran into some very interesting bugs and we're in for a treat of technical detail in this holiday edition of jank development updates! A warm shout out to my <a href="https://github.com/sponsors/jeaye">Github sponsors</a> and <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for sponsoring this work.</p> jank development update - Module loading 2023-10-14T00:00:00Z 2023-10-14T00:00:00Z https://jank-lang.org/blog/2023-10-14-module-loading Jeaye Wilkerson <p>For the past month and a half, I've been building out jank's support for <code>clojure.core/require</code>, including everything from class path handling to compiling jank files to intermediate code written to the filesystem. This is a half-way report for the quarter. 
As a warm note, my work on jank this quarter is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>.</p> jank development update - Object model results 2023-08-26T00:00:00Z 2023-08-26T00:00:00Z https://jank-lang.org/blog/2023-08-26-object-model Jeaye Wilkerson <p>As summer draws to a close, in the Pacific Northwest, so too does my term of sponsored work focused on a faster object model for jank. Thanks so much to <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for funding jank's development. The past quarter has been quite successful and I'm excited to share the results.</p> jank development update - A faster object model 2023-07-08T00:00:00Z 2023-07-08T00:00:00Z https://jank-lang.org/blog/2023-07-08-object-model Jeaye Wilkerson <p>This quarter, my work on jank is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>. The terms of the work are to research a new object model for jank, with the goal of making jank code faster across the board. This is a half-way report and I'm excited to share my results!</p> jank development update - Optimizing a ray tracer 2023-04-07T00:00:00Z 2023-04-07T00:00:00Z https://jank-lang.org/blog/2023-04-07-ray-tracing Jeaye Wilkerson <p>After the <a href="/blog/2023-01-13-optimizing-sequences">last post</a>, which focused on optimizing jank's sequences, I wanted to get jank running a ray tracer I had previously written in Clojure. 
In this post, I document what was required to start ray tracing in jank and, more importantly, how I chased down the run time in a fierce battle with Clojure's performance.</p> jank development update - Optimizing sequences 2023-01-13T00:00:00Z 2023-01-13T00:00:00Z https://jank-lang.org/blog/2023-01-13-optimizing-sequences Jeaye Wilkerson <p>In this episode of jank's development updates, we follow an exciting few weekends as I was digging deep into Clojure's sequence implementation, building jank's equivalent, and then benchmarking and profiling in a dizzying race to the bottom.</p> jank development update - Lots of new changes 2022-12-08T00:00:00Z 2022-12-08T00:00:00Z https://jank-lang.org/blog/2022-12-08-progress-update Jeaye Wilkerson <p>I was previously giving updates only in the <a href="https://clojurians.slack.com/archives/C03SRH97FDK">#jank</a> Slack channel, but some of these are getting large enough to warrant more prose. Thus, happily, I can announce that jank has a new blog and I have a <i>lot</i> of new progress to report! Let's get into the details.</p>
\ No newline at end of file