<p>While the pipelining model is parallelizable, it can fail to scale when <em>data skew</em> occurs. For instance, in a large social media graph, some influential nodes may have hundreds of times more followers than others. When such a graph serves as the outer relation in a multi-relation CQ, the thread processing an influential node must handle significantly more work than the threads processing less-connected nodes. This imbalance leaves threads idle.</p>
<p>This imbalance in k-way joins makes it difficult to scale pipelined join operations on massively parallel hardware and hinders adaptation to SIMD-based architectures, such as GPUs and AVX-supported CPUs.</p>
<p>A natural way to reason about how each relation and column contributes to the CQ is to construct a <em>query graph</em>: nodes represent logical variables (i.e., column names), and edges represent the relations. In this model, worst-case join planning can be viewed as an <em>edge cover</em> problem: find a minimal set of edges (relations) that touches every vertex (variable). For the two example query graphs shown earlier in this section, the worst case of the first query is dominated by the variable set $\{A, D, C\}$, while the set $\{A, B\}$ suffices for the second query, indicating that its worst-case output size is governed by fewer variables.</p>
<h2 id="5-worst-case-optimal-join">5. Worst Case Optimal Join</h2>
<p>Inspired by the AGM bound, Hung Q. Ngo, Christopher Ré, and Atri Rudra proposed a generic framework (missing reference) for designing worst-case optimal joins. Their algorithm can be described with the following pseudocode:</p>
<p>At each recursion level, the algorithm selects a join variable—typically chosen by heuristics such as frequency of occurrence across relations. It then <strong>projects</strong> the selected variable from all participating relations and intersects these value sets to determine all possible assignments to that logical variable. For each value in the intersection, the algorithm grounds the query accordingly and recursively applies the same process to the partially grounded query. The recursion continues until every variable in the query is bound, yielding a complete join result. This project-intersect-join pattern matches the construction suggested in the original AGM paper.</p>
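<p>As a concrete sketch of this project-intersect-recurse loop (a hypothetical Python rendering, not the authors' original pseudocode), each relation can be represented as a (schema, tuple-set) pair; the sketch assumes every variable appears in at least one relation:</p>

```python
def generic_join(variables, relations):
    """Bind one variable at a time, NPRR-style.

    variables: ordered list of variable names still to bind.
    relations: list of (schema, tuples), where schema is a tuple of
    variable names and tuples is a set of same-arity value tuples.
    """
    if not variables:
        return [()]                      # all variables bound: one empty tail
    x, rest = variables[0], variables[1:]

    # Project x from every relation mentioning it, then intersect.
    candidates = None
    for schema, tuples in relations:
        if x in schema:
            i = schema.index(x)
            vals = {t[i] for t in tuples}
            candidates = vals if candidates is None else candidates & vals

    out = []
    for v in sorted(candidates):
        # Ground x := v: filter matching tuples and drop the x column.
        grounded = []
        for schema, tuples in relations:
            if x in schema:
                i = schema.index(x)
                grounded.append((schema[:i] + schema[i + 1:],
                                 {t[:i] + t[i + 1:] for t in tuples if t[i] == v}))
            else:
                grounded.append((schema, tuples))
        for tail in generic_join(rest, grounded):
            out.append((v,) + tail)
    return out
```

<p>Running it on the triangle query <code>Q(a,b,c) :- R(a,b), S(b,c), T(c,a)</code> with one triangle in the data yields the single binding <code>(1, 2, 3)</code>.</p>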
<p>One notable aspect of the generic WCOJ algorithm is that it requires allocating temporary buffers at each level of the recursion (or nested for-loop) to store intermediate results: the partially grounded tuples. This introduces more memory overhead than traditional left-deep binary joins, where intermediate results are pipelined or materialized in a global buffer. If we use the same data structures for both storage and join processing (as is common in left-deep plans), this memory pressure can severely impact performance. A common solution is to store each relation as a prefix trie.</p>
<p>For example, with the relation A above stored as a trie, the operation <code class="language-plaintext highlighter-rouge">A[1]</code> can be implemented as finding the pointer to the subtrie rooted at the value 1; instead of a temporary buffer, we only need to store a single pointer.</p>
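<p>A minimal sketch of this idea in Python (the relation contents here are made up for illustration) represents the trie as nested dicts, one level per column:</p>

```python
# A binary relation A(x, y) stored as a prefix trie: one dict level per column.
A = {(1, 4), (1, 5), (2, 6)}

trie = {}
for x, y in A:
    trie.setdefault(x, {}).setdefault(y, {})

# "A[1]" is just a pointer to the subtrie rooted at x = 1; no matching
# tuples are copied into a temporary buffer.
sub = trie[1]          # the y-values reachable under x = 1
```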
<p>An implementation of this idea is <em>Leapfrog Triejoin (LFTJ)</em> <a class="citation" href="#veldhuizen2014leapfrog">(Veldhuizen)</a>, introduced by Todd L. Veldhuizen and used in the commercial system LogicBlox. LFTJ is specifically designed for scenarios where all column values are integers and each relation is indexed by a sorted trie. In such tries, each level corresponds to a join variable, and the children (subtries) of every node are kept in sorted order. The pseudocode below describes LFTJ in iterator-model style:</p>
<p>Below is a concrete example of the algorithm’s operation. Initially, the algorithm positions an iterator over each input relation’s join column; in this example, the iterators for relations A, B, and C start at 0, 0, and 2, respectively. It then takes the maximum of these values as a lower bound on the next joined value, which is 2, currently held by relation C. Next, it uses the probing function (leapfrog-seek) to search for the candidate value 2 in another relation; here, relation A is chosen arbitrarily. The search finds that 2 is not present in A, so A’s iterator advances to the smallest value greater than 2, namely 3. With 3 as the new lower bound, the algorithm repeats the search in relation B. This advancing of iterators continues until a candidate value (in the example, 8) is present in all relations. Such a value lies in the intersection of all join columns, and the algorithm proceeds with the inner loop of the generic join.</p>
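<p>The single-variable leapfrog loop above can be sketched in Python over plain sorted lists (a simplification of LFTJ, which runs this loop per trie level; the example data matches the walkthrough):</p>

```python
from bisect import bisect_left

def leapfrog_join(rels):
    """Intersect k sorted, duplicate-free integer lists leapfrog-style."""
    if any(not r for r in rels):
        return []
    pos = [0] * len(rels)
    out = []
    hi = max(r[0] for r in rels)          # initial candidate lower bound
    while True:
        for i, r in enumerate(rels):
            # leapfrog-seek: advance iterator i to the first value >= hi
            pos[i] = bisect_left(r, hi, pos[i])
            if pos[i] == len(r):
                return out                # one relation exhausted: done
            if r[pos[i]] > hi:
                hi = r[pos[i]]            # overshoot: new candidate, restart
                break
        else:
            out.append(hi)                # hi found in every relation
            hi += 1                       # move past it (integer keys)
```

<p>On A = [0, 1, 3, 4, 5, 6, 7, 8, 9, 11], B = [0, 2, 6, 7, 8, 9], and C = [2, 4, 5, 8, 10], the iterators leapfrog exactly as described and the only joined value is 8.</p>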
<p>Although LFTJ is a compelling algorithm for pipelining worst-case optimal joins, its reliance on sorted tries for relation storage can be limiting. Sorted tries support efficient sequential iteration, but the ordering constraint means individual lookups are not constant-time. For systems that require truly constant-time indexed access, a hash-trie-based algorithm is more appealing.</p>
<p>To interpret a free join plan, simply treat each bracketed group [] as a loop level. Within each level, iterate over the first subatom of the group, and use the scanned value to ground (or filter) and look up the remaining subatoms. For example, the triangle query with WCOJ optimizations can be represented in pseudocode as follows:</p>
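<p>For the triangle query <code>Q(a,b,c) :- R(a,b), S(b,c), T(c,a)</code>, the loop-level reading can be sketched in Python as follows (the hash-map layout and the specific grouping of subatoms are illustrative, not the paper's notation):</p>

```python
from collections import defaultdict

def triangle(R, S, T):
    """Q(a,b,c) :- R(a,b), S(b,c), T(c,a), one loop level per variable."""
    Rab, Sbc, Tca = defaultdict(set), defaultdict(set), defaultdict(set)
    for a, b in R: Rab[a].add(b)
    for b, c in S: Sbc[b].add(c)
    for c, a in T: Tca[a].add(c)          # keyed by a for the level-3 lookup

    out = []
    for a in Rab.keys() & Tca.keys():     # level 1: scan R(a), filter with T(a)
        for b in Rab[a] & Sbc.keys():     # level 2: scan R(a,b), filter with S(b)
            for c in Sbc[b] & Tca[a]:     # level 3: scan S(b,c), filter with T(c,a)
                out.append((a, b, c))
    return out
```

<p>Each loop level scans one subatom and uses the bound value to filter and look up the others, which is exactly the bracketed-group interpretation above.</p>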
<p>One bonus contribution of the original Free Join paper is that it also presents an algorithm for implementing the join plan using a novel data structure called the Lazy Generalized Hash Trie (LGHT). Similar to how the sorted trie enables the pipelining of worst-case optimal joins in LFTJ, LGHT makes it possible to fully pipeline hash-based WCOJ.</p>
<h2 id="whats-next-">What’s Next ?</h2>
<p>In this article, we have explored a wide range of processing algorithms for conjunctive queries, but most of them run only on single-CPU systems—especially the worst-case optimal join techniques. Scaling these methods to parallel processing environments remains a complex and open research question. Recent work on adapting worst-case optimal joins to parallel hardware (see <a class="citation" href="#wu2025honeycomb">(Wu and Suciu)</a> <a class="citation" href="#lai2022accelerating">(Lai et al.)</a>) shows promising directions, though none of these approaches has matured for use in real-world databases. I look forward to discussing these developments and sharing my thoughts on parallelizing conjunctive query processing in future blog articles.</p>
<h2 id="reference">Reference</h2>
<ol class="bibliography"><li><span id="graefe1993volcano">Graefe, Goetz, and William J. McKenna. “The Volcano Optimizer Generator: Extensibility and Efficient Search.” <i>Proceedings of IEEE 9th International Conference on Data Engineering</i>, IEEE, 1993, pp. 209–18.</span></li>
<li><span id="palvo2024multiway">Pavlo, Andy. <i>Lecture Note of Andy Pavlo on Multi-Way Join Algorithm</i>. 2024, https://15721.courses.cs.cmu.edu/spring2024/notes/10-multiwayjoins.pdf.</span></li>
<li><span id="lai2022accelerating">Lai, Zhuohang, et al. “Accelerating Multi-Way Joins on the GPU.” <i>The VLDB Journal</i>, 2022, pp. 1–25.</span></li>
<li><span id="atserias2013size">Atserias, Albert, et al. “Size Bounds and Query Plans for Relational Joins.” <i>SIAM Journal on Computing</i>, vol. 42, no. 4, 2013, pp. 1737–67.</span></li>