Node from stream backrefs optimisation #532

matt-o-how · 2025-01-13T16:09:52Z

Use a Vec<NodePtr> stack instead of a NodePtr list of SExps in node_from_stream_backrefs, and add a new traverse_path_with_vec() function to handle backrefs.

coveralls-official · 2025-01-13T16:19:03Z

Pull Request Test Coverage Report for Build 12826960470

Details

93 of 98 (94.9%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.01%) to 93.874%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/serde/de_br.rs	91	96	94.79%

Totals
Change from base Build 12813454877:	0.01%
Covered Lines:	6084
Relevant Lines:	6481

💛 - Coveralls

arvidn

It looks correct, as far as I can tell. I think we need tests for all interesting cases, to make sure it works. I'm also interested in seeing a benchmark: does this make a difference? I would expect it to at least use less memory, which typically means it runs faster on small machines (like a Raspberry Pi).

arvidn · 2025-01-13T16:47:04Z

src/traverse_path.rs

@@ -72,6 +72,83 @@ pub fn traverse_path(allocator: &Allocator, node_index: &[u8], args: NodePtr) ->
    Ok(Reduction(cost, arg_list))
 }

+pub fn traverse_path_with_vec(


it would be good to have unit tests for this function

arvidn · 2025-01-13T16:53:26Z

src/serde/de_br.rs

@@ -22,7 +22,7 @@ pub fn node_from_stream_backrefs(
    f: &mut Cursor<&[u8]>,
    mut backref_callback: impl FnMut(NodePtr),
 ) -> io::Result<NodePtr> {
-    let mut values = allocator.nil();
+    let mut values = Vec::<NodePtr>::new();


one idea I had was that you could make this Vec<(NodePtr, Option<NodePtr>)>, where the optional NodePtr is a cache of nodes you've created for this stack "link", in case there are multiple references to the same one.
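A minimal sketch of the caching idea above, under stated assumptions: `Arena`, `Node`, `Stack` and `materialize` are hypothetical stand-ins written for illustration, not clvm_rs's `Allocator` or the PR's code. Each stack entry carries an optional cached list node for "this entry and everything below it", so a second back-reference to the same stack link creates no new pairs.

```rust
/// Toy stand-in for an allocator handle (hypothetical, not clvm_rs's NodePtr).
type NodePtr = usize;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Node {
    Nil,
    Atom(u64),
    Pair(NodePtr, NodePtr),
}

/// Minimal arena that counts how many pairs it has created.
struct Arena {
    nodes: Vec<Node>,
    pairs_created: usize,
}

impl Arena {
    fn new() -> Self {
        // index 0 is always nil
        Arena { nodes: vec![Node::Nil], pairs_created: 0 }
    }
    fn nil(&self) -> NodePtr {
        0
    }
    fn new_atom(&mut self, v: u64) -> NodePtr {
        self.nodes.push(Node::Atom(v));
        self.nodes.len() - 1
    }
    fn new_pair(&mut self, first: NodePtr, rest: NodePtr) -> NodePtr {
        self.nodes.push(Node::Pair(first, rest));
        self.pairs_created += 1;
        self.nodes.len() - 1
    }
    /// Collect the atoms of a proper list, for inspection in tests.
    fn list_atoms(&self, mut node: NodePtr) -> Vec<u64> {
        let mut out = Vec::new();
        while let Node::Pair(first, rest) = self.nodes[node] {
            if let Node::Atom(v) = self.nodes[first] {
                out.push(v);
            }
            node = rest;
        }
        out
    }
}

/// Each entry holds the parsed value and, lazily, the cached list node
/// for the stack link that starts at this entry.
type Stack = Vec<(NodePtr, Option<NodePtr>)>;

/// Return the list of entries `0..len`, most recent entry first.
/// Assumes `len <= stack.len()`. Cached links are reused.
fn materialize(arena: &mut Arena, stack: &mut Stack, len: usize) -> NodePtr {
    // Walk down until we hit a cached link or the bottom of the stack.
    let mut start = len;
    let mut tail = arena.nil();
    while start > 0 {
        if let Some(cached) = stack[start - 1].1 {
            tail = cached;
            break;
        }
        start -= 1;
    }
    // Build the missing links bottom-up, caching each one.
    for i in start..len {
        tail = arena.new_pair(stack[i].0, tail);
        stack[i].1 = Some(tail);
    }
    tail
}
```

Because the cache lives inside the entry, popping an entry discards its cached link with it, while the links cached on the entries below stay valid.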

arvidn · 2025-01-13T16:54:56Z

src/traverse_path.rs

+    // find first non-zero byte
+    let first_bit_byte_index = first_non_zero(node_index);
+
+    let mut cost: Cost = TRAVERSE_BASE_COST


this version doesn't need to track cost, I don't think. In fact, I think this is sufficiently different (and specialized) that it makes sense to move it into the de_br.rs file.

arvidn · 2025-01-13T16:58:07Z

src/traverse_path.rs

+    let mut bitmask = 0x01;
+
+    // if we move from parsing the Vec stack to parsing the SExp stack use the following variables
+    let mut parsing_sexp = false;


it might be simpler to have a separate loop in the beginning that just reads 1-bits (we're still on the Vec-stack), until it hits a 0-bit (we select a stack item), and then moves into the next loop that only considers nodes in the allocator.
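The two-loop structure suggested above could look roughly like the following sketch. It makes several simplifying assumptions that are not in the PR: the path is a `u64` rather than a big-endian byte slice, `Node`, `Found` and `traverse` are hypothetical names, and index 0 of the node slice is nil. Bits are consumed from the least significant end, a 1-bit means "rest", a 0-bit means "first", and the most significant set bit is the terminator. The Vec stack is only borrowed and walked with a cursor, never cloned.

```rust
/// Toy stand-in for an allocator handle (hypothetical).
type NodePtr = usize;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Node {
    Nil,
    Atom(u64),
    Pair(NodePtr, NodePtr),
}

#[derive(Debug, PartialEq)]
enum Found {
    /// The path ended on a node that already exists in the arena.
    Node(NodePtr),
    /// The path ended while still on the Vec stack: the result is the list
    /// of the bottom `len` stack entries, which the caller must build.
    StackTail(usize),
}

fn traverse(nodes: &[Node], stack: &[NodePtr], mut path: u64) -> Result<Found, &'static str> {
    if path == 0 {
        return Ok(Found::Node(0)); // assumption: index 0 is nil
    }
    // Phase 1: we are still on the Vec stack. Every 1-bit steps one entry
    // down; the first 0-bit selects the entry under the cursor.
    let mut len = stack.len();
    let mut current = loop {
        if path == 1 {
            return Ok(Found::StackTail(len));
        }
        let go_rest = path & 1 == 1;
        path >>= 1;
        if len == 0 {
            // An empty stack is nil, and nil has neither first nor rest.
            return Err("path into atom");
        }
        len -= 1;
        if !go_rest {
            break stack[len];
        }
    };
    // Phase 2: only nodes that live in the arena are considered from here.
    while path != 1 {
        current = match nodes[current] {
            Node::Pair(first, rest) => {
                if path & 1 == 1 {
                    rest
                } else {
                    first
                }
            }
            _ => return Err("path into atom"),
        };
        path >>= 1;
    }
    Ok(Found::Node(current))
}
```

Splitting the loops removes the need for a `parsing_sexp` flag: the first loop can only end by selecting a stack item, running off the stack, or exhausting the path.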

arvidn · 2025-01-13T16:58:57Z

src/traverse_path.rs

+) -> Response {
+    // the vec is a stack so a ChiaLisp list of (3 . (2 . (1 . NIL))) would be [1, 2, 3]
+    // however entries in this vec may be ChiaLisp SExps so it may look more like [1, (2 . NIL), 3]
+    let mut arg_list: Vec<NodePtr> = args.to_owned();


it would be more efficient to just keep an index into args, rather than cloning it

arvidn

I think these things are still needed:

preserve the existing function, partly to control when we switch over to the new one, and also to be able to test that both behave the same
ensure the new function produces the same result as the old one, e.g. with a fuzzer
ensure the new function behaves the same with regard to limits on the number of pairs created by Allocator. This can be tested in a fuzzer by building with the counters build feature
benchmark to demonstrate that this is an improvement (this should probably be done early, as we might want to scrap this idea if it doesn't carry its weight)
survey the mainnet and testnet blockchains to see if back references into the parse-stack ever exist in the wild
unit tests for all edge cases
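The "same result as the old one" item in the list above can be illustrated with a small differential check. This is only a sketch on toy functions: `old_lookup` (clone the stack, then pop) and `new_lookup` (index with underflow guards) are hypothetical stand-ins for the old and new deserializers, and the hand-rolled xorshift generator replaces a real fuzzer so that the example needs no external crates.

```rust
/// Hypothetical stand-in for a node handle.
type NodePtr = u32;

/// "Old" behaviour: copy the whole stack, then pop `depth` entries.
fn old_lookup(args: &[NodePtr], depth: usize) -> Option<NodePtr> {
    let mut arg_list: Vec<NodePtr> = args.to_owned();
    for _ in 0..depth {
        arg_list.pop()?;
    }
    arg_list.last().copied()
}

/// "New" behaviour: borrow the slice and compute an index instead.
fn new_lookup(args: &[NodePtr], depth: usize) -> Option<NodePtr> {
    // checked_sub guards against underflow on empty or too-short stacks
    let top = args.len().checked_sub(1)?;
    let idx = top.checked_sub(depth)?;
    Some(args[idx])
}

/// Tiny xorshift64 generator; the seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Run both implementations on `rounds` pseudo-random inputs and return
/// the first input on which they disagree, if any.
fn first_divergence(seed: u64, rounds: usize) -> Option<(Vec<NodePtr>, usize)> {
    let mut rng = XorShift(seed);
    for _ in 0..rounds {
        let len = (rng.next_u64() % 8) as usize;
        let args: Vec<NodePtr> = (0..len).map(|_| rng.next_u64() as u32).collect();
        // depth may exceed len on purpose, to exercise the underflow paths
        let depth = (rng.next_u64() % 10) as usize;
        if old_lookup(&args, depth) != new_lookup(&args, depth) {
            return Some((args, depth));
        }
    }
    None
}
```

Returning the diverging input rather than panicking keeps the failing case available for a regression unit test. A coverage-guided fuzzer would explore inputs far more effectively than this fixed-seed loop.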

matt-o-how requested a review from arvidn January 13, 2025 16:18

arvidn reviewed Jan 13, 2025

arvidn reviewed Jan 15, 2025

matt-o-how added 26 commits January 17, 2025 10:06

initial commit

9da72e6

basic structure

67994b2

comment for clarity

736258a

Improve clarity in error messages and comments

173c1ca

passing tests!

5124107

clippy cleanups

f3e61b6

clarify comment

dd6c782

remove Cost and move into de_br

71904ef

use an index to simulate stack rather than cloning vec

8cee7f8

add catch for underflow

dfb9272

catch another underflow

427eeaf

add fuzz against old

766fdc4

fmt

f6a7a9d

add pair count fuzz

ab2e56d

fmt fuzz

b073026

try adding features to fuzzer cargo.toml

bf16556

forward the counters feature

4ed7198

fix name in fuzz cargo.toml

88cb1d3

add all features to cargo fuzz github action

2d0e86f

return error when traversing empty values stack

955843d

special case 0xfe 0x01

bef9f4a

special case 0xfe 0x00

a6c3475

treat empty stack as sexp

71dcba5

prevent underflow in empty case

532a64a

parse as empty list when we empty the list

4e58534

add benchmarking for node_from_bytes_backrefs_old

cb47c16

matt-o-how force-pushed the node_from_stream_backrefs_optimisation branch from 166b35f to cb47c16 Compare January 17, 2025 10:09
