Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconvenient apis for append_repeated_* #16

Closed
Buckram123 opened this issue Jan 2, 2024 · 13 comments
Closed

Inconvenient apis for append_repeated_* #16

Buckram123 opened this issue Jan 2, 2024 · 13 comments

Comments

@Buckram123
Copy link

Buckram123 commented Jan 2, 2024

Hey, I really like using anybuf, but there are a couple of inconvenient methods

Append repeated strings

Currently append_repeated_string accepts &[&str] but IMO it could be improved to take &[impl AsRef<str>] instead. The reason why it's inconvenient, because in the cases where you have Vec<String> right now it's unavoidable in safe rust to not allocate a new vector Vec<&str> and take slice of that:

// Real case: cosmos.bank.v1beta1.DenomUnit
self.aliases
    .iter()
    .map(String::as_str)
    .collect::<Vec<&str>>()
    .as_slice(),

Append repeated bytes

Since append_bytes already accepts impl AsRef<[u8]>, I would assume repeated should take &[impl AsRef<[u8]>] as well

Append repeated message

Currently append_repeated_message accepts &[&Anybuf], but similar to the methods above it's tricky to get this type:

// coins: Vec<Anybuf> here is created from `Vec<cosmos.base.v1beta1.Coin>` 
coins_anybuf.iter().collect::<Vec<&Anybuf>>().as_slice()

For this one I would suggest to take simply &[Anybuf].

@Buckram123 Buckram123 changed the title Inconvenient apis for append_* Inconvenient apis for append_repeated_* Jan 2, 2024
@webmaster128
Copy link
Contributor

webmaster128 commented Jan 3, 2024

Thanks for bringing this up. I addressed the first two in #17 but am not really happy with the diff. It requires a lot of additional type annotations as inference does not work for:

  • empty slices &[]
  • byte literals like b"", b"a" which are fixed length arrays by default

This issue might be mostly in test code since literals are more common there. Any thoughts on that PR?

@webmaster128
Copy link
Contributor

in the cases where you have Vec<String> right now it's unavoidable in safe rust to not allocate a new vector Vec<&str> and take slice of that:

This is a very good point. I doubt this is possible, no matter safe or unsafe Rust. A &[&str] is a list of references aka pointers and you need to have those strored in memory one after another. There may be more efficient ways to store this new list than allocating a new vector for the references, e.g. with a reusable scratch area. But this is low level fiddling you don't want in app logic.

@Buckram123
Copy link
Author

I doubt this is possible, no matter safe or unsafe Rust. A &[&str] is a list of references aka pointers and you need to have those strored in memory one after another.

Well, technically it is possible, if you leak all string slices back to the vector of strings, but this is simply a bad solution

There may be more efficient ways to store this new list than allocating a new vector for the references, e.g. with a reusable scratch area. But this is low level fiddling you don't want in app logic.

What do you mean by reusable scratch area?

@webmaster128
Copy link
Contributor

webmaster128 commented Jan 4, 2024

if you leak all string slices back to the vector of strings, but this is simply a bad solution

But a vector of Strings is a list of { ptr, capacity, len } structs (probablly 3x size_of<usize> depending on alignment). A list of references should be elements of size size_of<usize>

What do you mean by reusable scratch area?

You can have a let mut my_strings: Vec<&str> = Vec::with_capacity(20); that you create once per contract execution and then fill with new &str lists every time you need them. As long as you stay below your initial capacity, no new memory is allocated.

@Buckram123
Copy link
Author

Buckram123 commented Jan 4, 2024

But a vector of Strings is a list of { ptr, capacity, len } structs (probablly 3x size_of<usize> depending on alignment). A list of references should be elements of size size_of<usize>

Right, they are different sizes, that's why I say it's bad(along with the fact that leaks are bad), just for example:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f47c05ded74ec4bd6474e54209ff934a

You can have a let mut my_strings: Vec<&str> = Vec::with_capacity(20); that you create once per contract execution and then fill with new &str lists every time you need them. As long as you stay below your initial capacity, no new memory is allocated.

Yeah, that could be helpful in some cases I will remember this workaround, thanks :)

@webmaster128
Copy link
Contributor

#18 has a solution for appending repeated message. It keeps &Anybuf elements possible. This way a single message can be added multiple times without cloning.

@webmaster128
Copy link
Contributor

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f47c05ded74ec4bd6474e54209ff934a

Ah nice, now I see what you mean: Reuse the raw memory of the vector because you don't need it anymore and know the slice of references is shorter. Interesting idea for sure but nothing you want to show an auditor :D

@Buckram123
Copy link
Author

#18 has a solution for appending repeated message.

Nice! Can't wait to use it 🚀

This way a single message can be added multiple times without cloning.

Right, didn't think about having same message in repeating message. Thanks for covering it as well :)

@webmaster128
Copy link
Contributor

Could you try latest main in your project? If everything is fine, I can make a breaking 0.4.0 release with the two changes discussed.

@Buckram123
Copy link
Author

Buckram123 commented Jan 23, 2024

Could you try latest main in your project? If everything is fine, I can make a breaking 0.4.0 release with the two changes discussed.

Yes, thanks for the change! It was compatible with previous version in my case, and was able to remove every .iter().collect::<Vec<_>>() for append_repeated_message and .iter().map(String::as_str).collect::<Vec<_>>() for append_repeated_string without any issues!
Few examples:

             .append_string(2, &self.receiver)
             .append_string(3, &self.token_a)
             .append_string(4, &self.token_b)
-            .append_repeated_string(
-                5,
-                self.amounts_a
-                    .iter()
-                    .map(String::as_str)
-                    .collect::<Vec<_>>()
-                    .as_ref(),
-            )
-            .append_repeated_string(
-                6,
-                self.amounts_b
-                    .iter()
-                    .map(String::as_str)
-                    .collect::<Vec<_>>()
-                    .as_ref(),
-            )
+            .append_repeated_string(5, &self.amounts_a)
+            .append_repeated_string(6, &self.amounts_b)
             .append_repeated_int64(7, &self.tick_indexes_a_to_b)
             .append_repeated_uint64(8, &self.fees)
-            .append_repeated_message(9, options_messages.iter().collect::<Vec<_>>().as_ref())
+            .append_repeated_message(9, &options_messages)
             .into_vec()
-                    .collect::<Vec<_>>()
-                    .as_ref(),
-            )
+            .append_repeated_string(5, &self.amounts_a)
+            .append_repeated_string(6, &self.amounts_b)
             .append_repeated_int64(7, &self.tick_indexes_a_to_b)
             .append_repeated_uint64(8, &self.fees)
-            .append_repeated_message(9, options_messages.iter().collect::<Vec<_>>().as_ref())
+            .append_repeated_message(9, &options_messages)

@Buckram123
Copy link
Author

Another thing to discuss, if it will be better to take IntoIterator, instead of a slice, consider this example:

        let coins_anybuf: Vec<Anybuf> = self
            .register_fee
            .iter()
            .map(Coin::to_anybuf)
            .collect::<Vec<_>>();
        Anybuf::new()
            .append_uint64(1, self.msg_submit_tx_max_messages)
            .append_repeated_message(2, &coins_anybuf)

With this change we won't need to collect it:

        let coins_anybuf = self
           .register_fee
           .iter()
           .map(Coin::to_anybuf);
       Anybuf::new()
           .append_uint64(1, self.msg_submit_tx_max_messages)
           .append_repeated_message(2, coins_anybuf)

Not sure about drawbacks, except of bigger binaries when used carelessly

@webmaster128
Copy link
Contributor

Okay, I see. Not bad. But let's break this down into separate steps. I'll release a version with the changes we discussed already. For everything else, let's make a new ticket.

@webmaster128
Copy link
Contributor

Released as part of 0.4.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants