diff --git a/blog/2023-12-30-fast-string/index.html b/blog/2023-12-30-fast-string/index.html
index 4f03048..973b65d 100644
--- a/blog/2023-12-30-fast-string/index.html
+++ b/blog/2023-12-30-fast-string/index.html
@@ -26,7 +26,7 @@
 ---------------------------------^^
 /* Actual capacity data. */ |||
 /* Is it large? */ ||
- /* Is it medium? */ |

For a small string, both of those flag bits are 0. This is important and it's the final piece to a puzzle: where do we store the size, for small strings? Well, we store it in the remaining 6 bits of capacity data. But we don't just store the size, oh no. We store the remaining capacity (max_size - size). This lovely treat allows us to use that final byte as the null terminator when the string is full, since the two flag bits will be 0 and the remaining capacity will be 0, thus the byte will be 0.

This means folly's string allows for 23 bytes of small string data in a 24 byte string. That's 23:34 = 0.958, compared to the previous 15:32 = 0.469. Our string is 24 bytes, compared to previous 32 bytes, too! A very impressive design.

Empty member optimization

There's a trick which all three of the string classes use called empty member optimization and I'll explain it because it's another example of how crazy C++ is. In C++, an empty struct can't have the size of 0. It generally has the size of 1. This is important for addressing, as I'll show here.

struct empty
+              /* Is it medium? */ |

For a small string, both of those flag bits are 0. This is important and it's the final piece to a puzzle: where do we store the size, for small strings? Well, we store it in the remaining 6 bits of capacity data. But we don't just store the size, oh no. We store the remaining capacity (max_size - size). This lovely treat allows us to use that final byte as the null terminator when the string is full, since the two flag bits will be 0 and the remaining capacity will be 0, thus the byte will be 0.

This means folly's string allows for 23 bytes of small string data in a 24 byte string. That's 23:24 = 0.958, compared to the previous 15:32 = 0.469. Our string is 24 bytes, compared to previous 32 bytes, too! A very impressive design.
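
To make the remaining-capacity trick concrete, here's a minimal sketch. It is not folly's actual code: small_buf, assign, and the field names are made up for illustration, and I'm assuming the two flag bits are the high bits of the final byte, so any value from 0 to 23 leaves them as 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

/* Hypothetical 24 byte small string buffer, for illustration only.
 * Assumption: the two flag bits are the high bits of the final byte. */
struct small_buf
{
  static constexpr std::size_t max_size{ 23 };

  /* 23 bytes of data, plus one final byte shared by the
   * remaining capacity and the null terminator. */
  char bytes[max_size + 1];

  void assign(char const * const s, std::size_t const len)
  {
    assert(len <= max_size);
    std::memcpy(bytes, s, len);
    /* Terminator directly after the data. */
    bytes[len] = '\0';
    /* Remaining capacity in the final byte. It fits in 6 bits, so the
     * flag bits stay 0. When len == max_size, this writes 0 and the
     * final byte is also the terminator. */
    bytes[max_size] = static_cast<char>(max_size - len);
  }

  std::size_t size() const
  { return max_size - static_cast<std::size_t>(bytes[max_size]); }

  char const* c_str() const
  { return bytes; }
};

static_assert(sizeof(small_buf) == 24, "23 bytes of data in a 24 byte string");
```

Assigning "hello" leaves 18 in the final byte, so size() comes back as 5. Assigning a 23 character string leaves 0 there, which is exactly the null terminator the string needs.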

Empty member optimization

There's a trick which all three of the string classes use called empty member optimization and I'll explain it because it's another example of how crazy C++ is. In C++, an empty struct can't have the size of 0. It generally has the size of 1. This is important for addressing, as I'll show here.

struct empty
 { };
 
 struct foo
diff --git a/blog/feed.xml b/blog/feed.xml
index c5cbcf7..40814a2 100644
--- a/blog/feed.xml
+++ b/blog/feed.xml
@@ -1 +1 @@
-2023-12-30T21:14:01.778401081Zjank bloghttps://jank-lang.org/blog/jank's new persistent string is fast2023-12-30T00:00:00Z2023-12-30T00:00:00Zhttps://jank-lang.org/blog/2023-12-30-fast-stringJeaye Wilkerson<p>One thing I&apos;ve been meaning to do is build a custom string class for jank. I had some time, during the holidays, between wrapping up this quarter&apos;s work and starting on next quarter&apos;s, so I decided to see if I could beat both <code>std::string</code> and <code>folly::fbstring</code>, in terms of performance. After all, if we&apos;re gonna make a string class, it&apos;ll need to be fast. :)</p>jank development update - Load all the modules!2023-12-17T00:00:00Z2023-12-17T00:00:00Zhttps://jank-lang.org/blog/2023-12-17-module-loadingJeaye Wilkerson<p>I&apos;ve been quiet for the past couple of months, finishing up this work on jank&apos;s module loading, class path handling, aliasing, and var referring. Along the way, I ran into some very interesting bugs and we&apos;re in for a treat of technical detail in this holiday edition of jank development updates! A warm shout out to my <a href="https://github.com/sponsors/jeaye">Github sponsors</a> and <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for sponsoring this work.</p>jank development update - Module loading2023-10-14T00:00:00Z2023-10-14T00:00:00Zhttps://jank-lang.org/blog/2023-10-14-module-loadingJeaye Wilkerson<p>For the past month and a half, I&apos;ve been building out jank&apos;s support for <code>clojure.core/require</code>, including everything from class path handling to compiling jank files to intermediate code written to the filesystem. This is a half-way report for the quarter. 
As a warm note, my work on jank this quarter is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>.</p>jank development update - Object model results2023-08-26T00:00:00Z2023-08-26T00:00:00Zhttps://jank-lang.org/blog/2023-08-26-object-modelJeaye Wilkerson<p>As summer draws to a close, in the Pacific Northwest, so too does my term of sponsored work focused on a faster object model for jank. Thanks so much to <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for funding jank&apos;s development. The past quarter has been quite successful and I&apos;m excited to share the results.</p>jank development update - A faster object model2023-07-08T00:00:00Z2023-07-08T00:00:00Zhttps://jank-lang.org/blog/2023-07-08-object-modelJeaye Wilkerson<p>This quarter, my work on jank is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>. The terms of the work are to research a new object model for jank, with the goal of making jank code faster across the board. This is a half-way report and I&apos;m excited to share my results!</p>jank development update - Optimizing a ray tracer2023-04-07T00:00:00Z2023-04-07T00:00:00Zhttps://jank-lang.org/blog/2023-04-07-ray-tracingJeaye Wilkerson<p>After the <a href="/blog/2023-01-13-optimizing-sequences">last post</a>, which focused on optimizing jank&apos;s sequences, I wanted to get jank running a ray tracer I had previously written in Clojure. 
In this post, I document what was required to start ray tracing in jank and, more importantly, how I chased down the run time in a fierce battle with Clojure&apos;s performance.</p>jank development update - Optimizing sequences2023-01-13T00:00:00Z2023-01-13T00:00:00Zhttps://jank-lang.org/blog/2023-01-13-optimizing-sequencesJeaye Wilkerson<p>In this episode of jank&apos;s development updates, we follow an exciting few weekends as I was digging deep into Clojure&apos;s sequence implementation, building jank&apos;s equivalent, and then benchmarking and profiling in a dizzying race to the bottom.</p>jank development update - Lots of new changes2022-12-08T00:00:00Z2022-12-08T00:00:00Zhttps://jank-lang.org/blog/2022-12-08-progress-updateJeaye Wilkerson<p>I was previously giving updates only in the <a href="https://clojurians.slack.com/archives/C03SRH97FDK">#jank</a> Slack channel, but some of these are getting large enough to warrant more prose. Thus, happily, I can announce that jank has a new blog and I have a <i>lot</i> of new progress to report! Let&apos;s get into the details.</p>
\ No newline at end of file
+2023-12-30T22:08:40.115640319Zjank bloghttps://jank-lang.org/blog/jank's new persistent string is fast2023-12-30T00:00:00Z2023-12-30T00:00:00Zhttps://jank-lang.org/blog/2023-12-30-fast-stringJeaye Wilkerson<p>One thing I&apos;ve been meaning to do is build a custom string class for jank. I had some time, during the holidays, between wrapping up this quarter&apos;s work and starting on next quarter&apos;s, so I decided to see if I could beat both <code>std::string</code> and <code>folly::fbstring</code>, in terms of performance. After all, if we&apos;re gonna make a string class, it&apos;ll need to be fast. :)</p>jank development update - Load all the modules!2023-12-17T00:00:00Z2023-12-17T00:00:00Zhttps://jank-lang.org/blog/2023-12-17-module-loadingJeaye Wilkerson<p>I&apos;ve been quiet for the past couple of months, finishing up this work on jank&apos;s module loading, class path handling, aliasing, and var referring. Along the way, I ran into some very interesting bugs and we&apos;re in for a treat of technical detail in this holiday edition of jank development updates! A warm shout out to my <a href="https://github.com/sponsors/jeaye">Github sponsors</a> and <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for sponsoring this work.</p>jank development update - Module loading2023-10-14T00:00:00Z2023-10-14T00:00:00Zhttps://jank-lang.org/blog/2023-10-14-module-loadingJeaye Wilkerson<p>For the past month and a half, I&apos;ve been building out jank&apos;s support for <code>clojure.core/require</code>, including everything from class path handling to compiling jank files to intermediate code written to the filesystem. This is a half-way report for the quarter. 
As a warm note, my work on jank this quarter is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>.</p>jank development update - Object model results2023-08-26T00:00:00Z2023-08-26T00:00:00Zhttps://jank-lang.org/blog/2023-08-26-object-modelJeaye Wilkerson<p>As summer draws to a close, in the Pacific Northwest, so too does my term of sponsored work focused on a faster object model for jank. Thanks so much to <a href="https://www.clojuriststogether.org/">Clojurists Together</a> for funding jank&apos;s development. The past quarter has been quite successful and I&apos;m excited to share the results.</p>jank development update - A faster object model2023-07-08T00:00:00Z2023-07-08T00:00:00Zhttps://jank-lang.org/blog/2023-07-08-object-modelJeaye Wilkerson<p>This quarter, my work on jank is being sponsored by <a href="https://www.clojuriststogether.org/">Clojurists Together</a>. The terms of the work are to research a new object model for jank, with the goal of making jank code faster across the board. This is a half-way report and I&apos;m excited to share my results!</p>jank development update - Optimizing a ray tracer2023-04-07T00:00:00Z2023-04-07T00:00:00Zhttps://jank-lang.org/blog/2023-04-07-ray-tracingJeaye Wilkerson<p>After the <a href="/blog/2023-01-13-optimizing-sequences">last post</a>, which focused on optimizing jank&apos;s sequences, I wanted to get jank running a ray tracer I had previously written in Clojure. 
In this post, I document what was required to start ray tracing in jank and, more importantly, how I chased down the run time in a fierce battle with Clojure&apos;s performance.</p>jank development update - Optimizing sequences2023-01-13T00:00:00Z2023-01-13T00:00:00Zhttps://jank-lang.org/blog/2023-01-13-optimizing-sequencesJeaye Wilkerson<p>In this episode of jank&apos;s development updates, we follow an exciting few weekends as I was digging deep into Clojure&apos;s sequence implementation, building jank&apos;s equivalent, and then benchmarking and profiling in a dizzying race to the bottom.</p>jank development update - Lots of new changes2022-12-08T00:00:00Z2022-12-08T00:00:00Zhttps://jank-lang.org/blog/2022-12-08-progress-updateJeaye Wilkerson<p>I was previously giving updates only in the <a href="https://clojurians.slack.com/archives/C03SRH97FDK">#jank</a> Slack channel, but some of these are getting large enough to warrant more prose. Thus, happily, I can announce that jank has a new blog and I have a <i>lot</i> of new progress to report! Let&apos;s get into the details.</p>
\ No newline at end of file