Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iolist_size crashes on non-byte aligned bit strings #320

Open
varnerac opened this issue Jun 23, 2022 · 11 comments
Open

iolist_size crashes on non-byte aligned bit strings #320

varnerac opened this issue Jun 23, 2022 · 11 comments
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed

Comments

@varnerac
Copy link
Contributor

import gleeunit
import gleam/bit_builder

pub fn main() {
  gleeunit.main()
}

pub fn bit_string_size_test() {
  <<1:int-size(1)>>
  |> bit_builder.from_bit_string()
  |> bit_builder.byte_size()
}

crashes with

λ ~/bitstringsize/ master* gleam test
  Compiling gleam_stdlib
  Compiling gleeunit
  Compiling bitstringsize
   Compiled in 1.41s
    Running bitstringsize_test.main
F
Failures:

  1) bitstringsize_test:bit_string_size_test/0: module 'bitstringsize_test'
     Failure: badarg
     Stacktrace:
       erlang.iolist_size
       gleam@bit_builder.byte_size
     Output: 

Finished in 0.006 seconds
1 tests, 1 failures
@lpil
Copy link
Member

lpil commented Jun 23, 2022

Thank you

@lpil lpil transferred this issue from gleam-lang/gleam Jun 23, 2022
@lpil lpil added bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed labels Jun 23, 2022
@varnerac
Copy link
Contributor Author

varnerac commented Jun 23, 2022

How do we fix this? Do bit strings need to be byte-aligned? Erlang iolists are formed from byte-aligned binaries. Do we deprecate byte_size in bit_string and bit_builder and replace it with bit_size?

@lpil
Copy link
Member

lpil commented Jun 27, 2022

Great point. Do they always need to be byte aigned? Is there a function to get the bit size of an iolist?

@varnerac
Copy link
Contributor Author

Iolists need to be byte-aligned according to their Erlang type specs. You can combine bit strings into lists. However, they will fail for byte_size, even if they add up to a byte-aligned value:

1> iolist_size([<<1:1>>, <<0:15>>]).            
** exception error: bad argument
     in function  iolist_size/1
        called as iolist_size([<<1:1>>,<<0,0:7>>])
        *** argument 1: not an iodata term

There is a bit_size function for bit strings, but you'd have to fold/accumulate it across bit strings in Gleam land.

The real problem will be networks and file I/O. I think both need byte boundaries. I don't think you can send/write a fraction of a byte. I feel like you'd be forced into size checking a bit string list before doing anything useful with it.

@lpil
Copy link
Member

lpil commented Jun 27, 2022

I think being byte aligned makes sense, but it has some uncomfortable API implications. If a bit string is added that is not byte aligned should it fail? That implies it should return a Result, which would make the API much more uncomfortable to use. Another option would be to automatically add padding.

I also wonder if the name BitBuilder is misleading if it's byte aligned.

@varnerac
Copy link
Contributor Author

varnerac commented Jun 27, 2022

The problem is that we cannot tell if the bit string is byte-aligned at compile-time. Here's the option I've been thinking about:

  • Deprecate bit_builder.byte_size. Add a bit_builder.bit_size implementation. The bit_size implementation attempts to use the Erlang iolist_size() BIF first, then falls back to fold/accumulate with bit_size/1 if that fails
  • Eat the cost of the NIF size call when passing a bit builder into a file or network device. Also, we can try the operation on the file descriptor first, catch it and return an error. Those calls (file. network) will typically return a Result anyway, so it's not any less ergonomic.

@sdancer
Copy link

sdancer commented Oct 28, 2022

<<1:int-size(1)>>

forbidding expressions where the static size isn't multiple of 8 to compile would be a good start

@sdancer
Copy link

sdancer commented Oct 28, 2022

can someone showcase a use case of partial bitstrings?

19> A.
<<1:1>>
20> <<A, 1:8>>.         
** exception error: construction of binary failed
     in function  eval_bits:eval_exp_field/6 (eval_bits.erl, line 143)
        *** segment 1 of type 'integer': expected an integer but got: <<1:1>>
     in call from eval_bits:create_binary/2 (eval_bits.erl, line 77)
     in call from eval_bits:expr_grp/5 (eval_bits.erl, line 68)

they don't seem to work at the moment

@varnerac
Copy link
Contributor Author

gleam-lang/gleam#1591 is related

1> A=<<1:1>>.
* 1:4: syntax error before: '<'
1> A= <<1:1>>.
<<1:1>>
2> <<A, 1:8>>.         
** exception error: bad argument
     in function  eval_bits:eval_exp_field1/6 (eval_bits.erl, line 123)
     in call from eval_bits:create_binary/2 (eval_bits.erl, line 81)
     in call from eval_bits:expr_grp/4 (eval_bits.erl, line 72)
3> <<A/bitstring, 1:8>>. 
<<128,1:1>>

@varnerac
Copy link
Contributor Author

I think the answer is to:

  • Rename BitString to ByteString in Gleam
  • Do a runtime check on ByteStrings to make sure they byte-align
  • Consider implementing BitStrings (vs. ByteStrings) as an Erlang-specific library

@lpil
Copy link
Member

lpil commented Oct 29, 2022

forbidding expressions where the static size isn't multiple of 8 to compile would be a good start

We will not do this, bit strings of any size are perfectly valid in BEAM languages.

The plan here is to make BitBuilders always byte aligned, as they are intended to be in Erlang. The name is misleading, we may want to change it to BinaryBuilder or ByteBuilder

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants