Skip to content

xinz/crockford_base32

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

54 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CrockfordBase32

Module Version Hex Docs

An Elixir Implementation of Douglas Crockford's Base32 encoding.

Please see https://www.crockford.com/base32.html.

This library can encode an integer or a bitstring in Crockford's Base32, and also provide the way to decode the corresponding encoding.

Installation

def deps do
  [
    {:crockford_base32, "~> 0.8"}
  ]
end

Usage

Encode

Encode an integer:

iex> CrockfordBase32.encode(1234)
"16J"

Encode an integer with checksum: true:

iex> CrockfordBase32.encode(1234, checksum: true)
"16JD"

Encode an inetger, and insert hyphens (-) per the step size(via split_size) in encoded result:

iex> CrockfordBase32.encode(1234, split_size: 2)
"16-J"
iex> CrockfordBase32.encode(1234, split_size: 1)
"1-6-J"
iex> CrockfordBase32.encode(1234, split_size: 1, checksum: true)
"1-6-J-D"

Encode a bitstring, and optional split_size and checksum options are both working:

iex> CrockfordBase32.encode(<<12345678::size(48)>>)
"00001F319R"
iex> CrockfordBase32.encode("abc")
"C5H66"
iex> CrockfordBase32.encode("abc", checksum: true)
"C5H66C"
iex> CrockfordBase32.encode("abc", checksum: true, split_size: 3)
"C5H-66C"
iex> CrockfordBase32.encode(<<5::size(3)>>)
"M"

Decode

There will internally remove all hyphen(s) before decoding.

Decode the encoded to an integer:

iex> CrockfordBase32.decode_to_integer("16J")
{:ok, 1234}
iex> CrockfordBase32.decode_to_integer("16-J")
{:ok, 1234}
iex> CrockfordBase32.decode_to_integer("16-j")
{:ok, 1234}

With a check symbol, and decode the encoded to an integer:

iex> CrockfordBase32.decode_to_integer("16JD", checksum: true)
{:ok, 1234}
iex> CrockfordBase32.decode_to_integer("16J1", checksum: true)
:error_checksum

Decode the encoded to a bitstring:

iex> CrockfordBase32.decode_to_bitstring("00001F319R")
{:ok, <<0, 0, 0, 188, 97, 78>>}
iex> CrockfordBase32.decode_to_bitstring("C5H66")
{:ok, "abc"}
iex> CrockfordBase32.decode_to_bitstring("C5H-66")
{:ok, "abc"}
iex> CrockfordBase32.decode_to_bitstring("c5H-66")
{:ok, "abc"}
iex> CrockfordBase32.decode_to_bitstring("c5h-66")
{:ok, "abc"}
iex> CrockfordBase32.decode_to_bitstring("c5h66")
{:ok, "abc"}
iex> CrockfordBase32.decode_to_bitstring("M")
{:ok, <<5::size(3)>>}

With a check symbol, and decode the encoded to a bitstring:

iex> CrockfordBase32.decode_to_bitstring("C5H66C", checksum: true)
{:ok, "abc"}
iex> CrockfordBase32.decode_to_bitstring("C5H66D", checksum: true)
:error_checksum

Some invalid cases:

iex> CrockfordBase32.decode_to_bitstring(<<1, 2, 3>>)
:error
iex> CrockfordBase32.decode_to_bitstring(<<>>)
:error
iex> CrockfordBase32.decode_to_integer(<<1, 2, 3>>)
:error
iex> CrockfordBase32.decode_to_integer(<<>>)
:error

Fixed Size Encoding

In some cases, you may want to encode the fixed size bytes, we can do this be with a better performance leverages the benefit of the pattern match of Elixir/Erlang. I use this feature to implement a ULID in Elixir.

Refer ULID specification, a ULID concatenates a UNIX timestamp in milliseconds(a 48 bit integer) and a randomness in 80-bit, since an integer in bits are padded with some <<0::1>> leading when needed, and a ULID in 128-bit after encoded its length is 26 (can be divisible by 5), apply the fixed size encoding with type: :integer can efficiently encode/decode a ULID, for example:

defmoule ULID do

  defmoule Base32.Bits128 do
    use CrockfordBase32,
      bits_size: 128,
      type: :integer # Optional, defaults to `:bitstring`
  end

end

Then we can use ULID.Base32.Bits128 to encode/decode a 128-bit bitstring which concatenates an integer (as UNIX timestamp in millisecond) in 48-bit and a randomly generated in 80-bit.

Padding 0-bit when decoding

Crockford's Base32 avoids the use of padding characters by zero-extending the data to ensure the bit-length is a multiple of 5, there is no need to retain additional padding bits(<<0::size(1)>>) in the decoded result, so there may some decoded bits that are not as complete as expected, for example:

A string("01HY3B3HQ5FMEVJN8ME7C4HZDM") is a 26 length randomly generated string as a suffix of TypeID, TypeID's specification defines its suffix base32 encoding be with two zeroed bits are pre-pended to the 128-bits of the UUID, resulting in 130-bits of data.

Notice: Please ignore case in the following parameter "s", CrockfordBase32 is not case sensitive, but TypeID only uses lowercase.

iex> s = "01HY3B3HQ5FMEVJN8ME7C4HZDM"
iex> {:ok, input} = CrockfordBase32.decode_to_bitstring(s)
{:ok, <<0, 99, 225, 172, 113, 185, 95, 71, 110, 85, 69, 28, 118, 18, 63, 109>>}
iex> bit_size(input)
128
iex> CrockfordBase32.encode(<<input::bitstring, 0::size(2)>>)
"01HY3B3HQ5FMEVJN8ME7C4HZDM"
iex> <<0::size(2), uuid::bitstring>> = <<input::bitstring, 0::size(2)>>
<<0, 99, 225, 172, 113, 185, 95, 71, 110, 85, 69, 28, 118, 18, 63, 109,
  0::size(2)>>
iex> uuid
<<1, 143, 134, 177, 198, 229, 125, 29, 185, 85, 20, 113, 216, 72, 253, 180>>

We must explicitly append two zero bits(<<0::size(2)>>) into the <<0, 99, 225, 172, 113, 185, 95, 71, 110, 85, 69, 28, 118, 18, 63, 109>> and only later we can get the correct uuid bitstring.

Similar to TypeID, pre-defining a fixed-size bit string encoding, we can do this:

defmoule Typeid do

  defmoule Base32.Bits130 do
    use CrockfordBase32,
      bits_size: 130
  end

end

Use "Typeid.Base32.Bits130" and then do not need to manually to pad the zero bit(s), it will use its fixed size to handle the padding.

iex> Typeid.Base32.Bits130.decode("01HY3B3HQ5FMEVJN8ME7C4HZDM")
{:ok,
 <<0, 99, 225, 172, 113, 185, 95, 71, 110, 85, 69, 28, 118, 18, 63, 109,
   0::size(2)>>}
iex> {:ok, input} = Typeid.Base32.Bits130.decode("01HY3B3HQ5FMEVJN8ME7C4HZDM")
{:ok,
 <<0, 99, 225, 172, 113, 185, 95, 71, 110, 85, 69, 28, 118, 18, 63, 109,
   0::size(2)>>}
iex> bit_size(input)
130
iex> <<0::size(2), uuid::bitstring>> = input
<<0, 99, 225, 172, 113, 185, 95, 71, 110, 85, 69, 28, 118, 18, 63, 109,
  0::size(2)>>
iex> uuid
<<1, 143, 134, 177, 198, 229, 125, 29, 185, 85, 20, 113, 216, 72, 253, 180>>
iex> Typeid.Base32.Bits130.encode(input)
"01HY3B3HQ5FMEVJN8ME7C4HZDM"

Custom alphabet

There is a way to custom alphabet in the encoding, for example:

  defmodule Typeid.Base32 do
    use CrockfordBase32,
      bits_size: 130,
      alphabet: '0123456789abcdefghjkmnpqrstvwxyz'
  end

Use "Typeid.Base32" to satisfy TypeID's specification uses 0123456789abcdefghjkmnpqrstvwxyz as its alphabet.

Credits

These libraries or tools are very helpful in understanding and reference, thanks!

About

An Elixir implementation of Douglas Crockford's Base32 encoding with an integer or a bitstring

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages