Document Mina daemon's databases #15767

Open
georgeee opened this issue Jun 20, 2024 · 8 comments

@georgeee
Member

georgeee commented Jun 20, 2024

Produce a document describing the databases that the daemon uses.

The result of this task is expected to be a table of databases and their descriptions.

The table's columns, I imagine, would be:

  • Location of the database
  • Description
  • OCaml type of the key
  • OCaml type of the value

Relates to #13971

@georgeee
Member Author

This document is needed to analyze usage of RocksDB in block production/processing.

Major focus should be on frontier and ledger-related DBs.

@georgeee
Member Author

georgeee commented Jun 20, 2024

AFAIU all the DBs we have are key-value stores.

We probably have more than one key-value space within some of these DBs, so let's have one "key-value space" per row in the resulting table.

To start on the task, I suggest launching a Mina node that connects to mainnet and then checking what you have in .mina-config.
Ideally we'd be able to inspect every DB with a manual tool and check that no keys are unaccounted for, but that might be too tedious, so just investigating the usages in the codebase may be enough.

What I see on a mainnet node:

  • genesis/<...> <- three genesis ledger folders
  • frontier <- transition frontier, where the recent blocks are kept
  • wallets <- locally stored secret keys, empty if no key is configured
  • mina_net2/<..> <- some databases used by the networking implementation
    • mina_net2/block_db <- LMDB database for blocks, to be used in the future by the new catchup algorithm
  • root <- some data on the "root" transition, including uncommitted snarked ledgers
  • trust <- database that stores information about peers; currently it effectively contains the peers we know and whether each peer is banned until some datetime

@Geometer1729
Member

Geometer1729 commented Jun 26, 2024

root

The root/snarked_ledger location is defined here:

let snarked_ledger root = Filename.concat root "snarked_ledger"

This path is given to Ledger.Db where Ledger is Mina_ledger.Ledger.
So the snarked_ledger Db type gets defined here:
module Db :
  Merkle_ledger.Intf.Ledger.DATABASE
    with module Location = Location_at_depth
    with module Addr = Location_at_depth.Addr
    with type root_hash := Ledger_hash.t
     and type hash := Ledger_hash.t
     and type key := Public_key.Compressed.t
     and type token_id := Token_id.t
     and type token_id_set := Token_id.Set.t
     and type account := Account.t
     and type account_id_set := Account_id.Set.t
     and type account_id := Account_id.t =
  Database.Make (Inputs)

which calls this functor:
module Make (Inputs : Intf.Inputs.DATABASE) = struct

On this module:

module Inputs = struct
  module Key = Public_key.Compressed
  module Token_id = Token_id
  module Account_id = Account_id
  module Balance = struct
    include Currency.Balance
    let to_int = to_nanomina_int
  end
  module Account = Account.Stable.Latest
  module Hash = Hash.Stable.Latest
  module Kvdb = Kvdb
  module Location = Location_at_depth
  module Location_binable = Location_binable
  module Storage_locations = Storage_locations
end

The Kvdb.t type it takes seems to be a generic database type with Bigstring.t for both key and value.
But from calls to Kvdb like this:

let get_raw { kvdb; depth; _ } location =
  Kvdb.get kvdb ~key:(Location.serialize ~ledger_depth:depth location)

it looks like the keys are serialized Location.t values, which I think is this variant type:
type t = Generic of Bigstring.t | Account of Addr.t | Hash of Addr.t
[@@deriving hash, sexp, compare]

and the value appears to be the Token_id.Set.t type from the inputs.
I'm not sure where Token_id comes from; there are a few opens at the top of the file.
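
To make the "serialized Location.t as key" reading concrete, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: addresses are plain strings, the one-byte tag encoding is made up (the real Location.serialize takes ~ledger_depth and has its own layout), and a Hashtbl stands in for Kvdb.

```ocaml
(* Toy stand-in for the ledger's Location.t: addresses are plain strings. *)
type location =
  | Generic of string
  | Account of string
  | Hash of string

(* Hypothetical encoding: one tag byte, then the payload. This is NOT the
   real Location.serialize layout. *)
let serialize = function
  | Generic s -> "\x00" ^ s
  | Account a -> "\x01" ^ a
  | Hash a -> "\x02" ^ a

(* In-memory table standing in for Kvdb, with raw string keys and values. *)
let kvdb : (string, string) Hashtbl.t = Hashtbl.create 16

let set_raw location value = Hashtbl.replace kvdb (serialize location) value

let get_raw location = Hashtbl.find_opt kvdb (serialize location)

let () =
  (* The same address under two constructors yields two distinct raw keys. *)
  set_raw (Account "0101") "account-bytes" ;
  set_raw (Hash "0101") "hash-bytes"
```

In this sketch each constructor forms its own key space inside one store, so the value stored could differ per constructor. Whether that holds for the real ledger database is something the table would need to confirm.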

The root/root location is defined here and seems to store a hash of the genesis state.

@Geometer1729
Member

trust

src/lib/trust_system/peer_trust.ml

The key is Peer_id.t, which I believe refers to this:

module Peer_id = struct
  type t = int [@@deriving sexp, yojson]
  let ip t = Unix.Inet_addr.of_string (sprintf "127.0.0.%d" t)
end

and the value is Record.t, which I believe refers to this:

module V1 = struct
  type t =
    { trust : float
    ; trust_last_updated : Core.Time.Stable.V1.t
    ; banned_until_opt : Core.Time.Stable.V1.t option
    }
  let to_latest = Fn.id
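
As a rough illustration of how such a record could answer "is this peer banned right now", here is a standard-library sketch. The assumptions: timestamps are plain floats instead of Core.Time values, the peer key is a string, and is_banned and banned_peers are hypothetical helpers, not functions from peer_trust.ml.

```ocaml
(* Simplified copy of the record above; times are float timestamps here. *)
type record =
  { trust : float
  ; trust_last_updated : float
  ; banned_until_opt : float option
  }

(* A peer counts as banned while [now] is before its ban expiry. *)
let is_banned ~now r =
  match r.banned_until_opt with Some until -> now < until | None -> false

(* Hashtbl standing in for the trust database; string keys are an assumption. *)
let db : (string, record) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.replace db "peer-a"
    { trust = 0.5; trust_last_updated = 100.; banned_until_opt = None } ;
  Hashtbl.replace db "peer-b"
    { trust = -1.; trust_last_updated = 100.; banned_until_opt = Some 200. }

(* Collect the keys of all peers whose ban has not yet expired. *)
let banned_peers ~now =
  Hashtbl.fold (fun k r acc -> if is_banned ~now r then k :: acc else acc) db []
```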

@Geometer1729 Geometer1729 self-assigned this Jun 26, 2024
@Geometer1729
Member

Geometer1729 commented Jun 27, 2024

frontier

It looks like this module:

module Rocks = Rocksdb.Serializable.GADT.Make (Schema)

calls this functor:

module Make (Key : Intf.Key.S) : Intf.Database.S with type 'a g := 'a Key.t =

with Schema as its argument.

I think that results in the key type being this variant type (a GADT):

type _ t =
  | Db_version : int t
  | Transition : State_hash.Stable.V1.t -> Mina_block.Stable.V2.t t
  | Arcs : State_hash.Stable.V1.t -> State_hash.Stable.V1.t list t
  | Root : Root_data.Minimal.Stable.V2.t t
  | Best_tip : State_hash.Stable.V1.t t
  | Protocol_states_for_root_scan_state
      : Mina_state.Protocol_state.Value.Stable.V2.t list t

Here Root_data.Minimal.Stable.V2 refers to this type:

type t = { hash : State_hash.Stable.V1.t; common : Common.Stable.V2.t }

and I think this defines something like a type family, giving the value type for each key:

let binable_data_type (type a) : a t -> a Bin_prot.Type_class.t = function
  | Db_version ->
      [%bin_type_class: int]
  | Transition _ ->
      [%bin_type_class: Mina_block.Stable.Latest.t]
  | Arcs _ ->
      [%bin_type_class: State_hash.Stable.Latest.t list]
  | Root ->
      [%bin_type_class: Root_data.Minimal.Stable.Latest.t]
  | Best_tip ->
      [%bin_type_class: State_hash.Stable.Latest.t]
  | Protocol_states_for_root_scan_state ->
      [%bin_type_class: Mina_state.Protocol_state.Value.Stable.Latest.t list]
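
The pattern above (a GADT whose type parameter fixes the value type of each key) can be sketched with the standard library alone. This is a toy under stated assumptions: ints and strings stand in for the real hash and block types, raw_key is a made-up key encoding, and Marshal replaces the bin_prot type classes purely for illustration.

```ocaml
(* Toy GADT key: the type parameter names the value type stored under it. *)
type _ key =
  | Db_version : int key
  | Best_tip : string key
  | Arcs : string -> string list key

(* Each key maps to a distinct raw string (hypothetical encoding). *)
let raw_key : type a. a key -> string = function
  | Db_version -> "db_version"
  | Best_tip -> "best_tip"
  | Arcs hash -> "arcs:" ^ hash

(* Raw store holding serialized values, standing in for RocksDB. *)
let store : (string, string) Hashtbl.t = Hashtbl.create 16

(* The key's type parameter forces the caller to pass a matching value. *)
let set : type a. a key -> a -> unit =
 fun k v -> Hashtbl.replace store (raw_key k) (Marshal.to_string v [])

(* Reading decodes at the type the key promises. *)
let get : type a. a key -> a option =
 fun k ->
  Option.map
    (fun bytes -> (Marshal.from_string bytes 0 : a))
    (Hashtbl.find_opt store (raw_key k))

let () =
  set Db_version 2 ;
  set (Arcs "parent") [ "child1"; "child2" ]
```

With this shape, something like set Db_version "two" is rejected at compile time. For the table, this suggests listing each constructor of the frontier schema as its own key-value space.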

@Geometer1729
Member

mina_net2/block_db

Based on the language server hovers in this code:

let read_body { statuses; logger; blocks; env } body_ref =
  let impl txn =
    try
      if Lmdb.Map.get ~txn statuses body_ref = full_status then (
        match read_body_impl blocks txn body_ref with
        | Ok r ->

It seems like the key here is Blake2.t and the values are Mina_block.Body.t
I can't find where the path for this database is defined, so I'm just going by it being LMDB and the module name being appropriate.
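
The read_body snippet refers to both a statuses map and a blocks map, so this database may hold at least two key-value spaces indexed by the same body reference. A minimal sketch of that status-gated read follows. The assumptions: Hashtbl stands in for the LMDB maps, strings stand in for Blake2.t and the block body, and the status constructors and error tags are hypothetical names.

```ocaml
(* Hypothetical status values; the real code compares against full_status. *)
type status = Partial | Full

(* Two tables keyed by the same body reference. *)
let statuses : (string, status) Hashtbl.t = Hashtbl.create 16

let blocks : (string, string) Hashtbl.t = Hashtbl.create 16

(* Return the body only when its status says it was stored completely. *)
let read_body body_ref =
  match Hashtbl.find_opt statuses body_ref with
  | None -> Error `Unknown_ref
  | Some Partial -> Error `Not_full
  | Some Full -> (
      match Hashtbl.find_opt blocks body_ref with
      | Some body -> Ok body
      | None -> Error `Missing_body )

let () =
  Hashtbl.replace statuses "ref-a" Full ;
  Hashtbl.replace blocks "ref-a" "body-a" ;
  Hashtbl.replace statuses "ref-b" Partial
```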

@Geometer1729
Member

genesis

The comment at the end of this code seems to confirm that the ledger here is the genesis ledger stored in RocksDB:

Mina_ledger.Ledger.commit ledger ;
let dirname = Option.value_exn (Mina_ledger.Ledger.get_directory ledger) in
let root_hash =
  Ledger_hash.to_base58_check @@ Mina_ledger.Ledger.merkle_root ledger
in
let%bind () = Unix.mkdir ~p:() genesis_dir in
let tar_path = genesis_dir ^/ hash_filename root_hash ~ledger_name_prefix in
[%log trace]
  "Creating $ledger tar file for $root_hash at $path from database at $dir"
  ~metadata:
    [ ("ledger", `String ledger_name_prefix)
    ; ("root_hash", `String root_hash)
    ; ("path", `String tar_path)
    ; ("dir", `String dirname)
    ] ;
(* This sleep for 5s is a hack for rocksdb. It seems like rocksdb would need some
   time to stablize *)

The language server links that commit call to this definition:

let commit t =
  assert (not t.is_committing) ;
  t.is_committing <- true ;
  assert_is_attached t ;
  let parent = get_parent t in
  let old_root_hash = merkle_root t in
  let account_data = Map.to_alist t.maps.accounts in
  t.maps <-
    { accounts = Location_binable.Map.empty
    ; hashes = Addr.Map.empty
    ; token_owners = Token_id.Map.empty
    ; locations = Account_id.Map.empty
    } ;
  Base.set_batch parent account_data ;
  Debug_assert.debug_assert (fun () ->
      [%test_result: Hash.t]
        ~message:
          "Parent merkle root after committing should be the same as the \
           old one in the mask"
        ~expect:old_root_hash (Base.merkle_root parent) ;
      [%test_result: Hash.t]
        ~message:"Merkle root of the mask should delegate to the parent now"
        ~expect:(merkle_root t) (Base.merkle_root parent) ) ;
  t.is_committing <- false

The part of this that touches the database seems to be Base.set_batch parent account_data.
Following the language server again, Base seems to come from here:
module Base :
  Base_merkle_tree_intf.S
    with module Addr = Location.Addr
     and module Location = Location
     and type account := Account.t
     and type root_hash := Hash.t
     and type hash := Hash.t
     and type key := Key.t
     and type token_id := Token_id.t
     and type token_id_set := Token_id.Set.t
     and type account_id := Account_id.t
     and type account_id_set := Account_id.Set.t
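
For orientation, the commit shown above follows a common "buffer in a mask, flush to the parent" shape. Here is a minimal standard-library sketch of that shape. The assumptions: string keys and values, a Hashtbl as the parent database, and made-up helper names (mask_set, mask_get); it omits hashes, token owners, and the merkle-root assertions.

```ocaml
module StringMap = Map.Make (String)

(* Parent "database": a mutable table from location to account data. *)
let parent : (string, string) Hashtbl.t = Hashtbl.create 16

(* Write a list of (key, value) pairs to the parent in one call. *)
let set_batch db pairs = List.iter (fun (k, v) -> Hashtbl.replace db k v) pairs

(* A mask buffers writes in memory until commit. *)
type mask = { mutable accounts : string StringMap.t }

let mask_set m k v = m.accounts <- StringMap.add k v m.accounts

(* Reads prefer the mask and fall back to the parent. *)
let mask_get m k =
  match StringMap.find_opt k m.accounts with
  | Some v -> Some v
  | None -> Hashtbl.find_opt parent k

(* Commit: clear the mask and push the buffered accounts to the parent. *)
let commit m =
  let account_data = StringMap.bindings m.accounts in
  m.accounts <- StringMap.empty ;
  set_batch parent account_data
```

In this sketch only set_batch writes to the underlying store, which matches the observation that the database interaction in commit is the Base.set_batch call.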

I'm not entirely sure what's going on, but this seems to point to the same code as root/snarked_ledger.
So I checked whether dumps of the two databases look comparable, hoping they use the same types, which would make sense given that both databases represent snapshots of the ledger.
The outputs are extremely similar.
Last few lines of genesis_ledger:

$tid!0x3BD4E3A4BEC457DC36730A0295CB9542CF01AEDC6EC0F3A03240314649D90CF6 ==> rb5V̛,IA~:
$tid!0x3BD55C98C4DF679BF75D58287C49464CF382C94E0F493339E2C3AD996EEAF377 ==>
Q       1Hz>\~)4jW<8m
$tid!0x3BD611554C36EADB0C2FF4FD7831B035C1150534CE07515C901125CE1D7F82E8 ==> :o<p~e_Lg=($k<
$tid!0x3BD8498E090010ED1FCE99004B9D9E3B0A8E7C91F71B9283B333277236A54EFA ==> s-@_N6    R)pMI~
id!0x3BDA4F97891922FDEB5018F7F1C1C8756191778C3B07DDC685B4D9321263A305 ==> O3{W2~
"X"i;dV]        K6
$tid!0x3BDA9A6DFF6496A75B097EB7BB7949CB27A2374D505E4D1E3B44D032B4FEACA0 ==> 4)x|Au]y_6$fLa-
$tid!0x3BDAF9FF99C601137912EA4191D6EF2E3DCAD5A4F508B3584A8972CAC1A8B04D ==> fh0y)yUxP0

Last few lines of snarked_ledger:

$tid!0x3BD4E3A4BEC457DC36730A0295CB9542CF01AEDC6EC0F3A03240314649D90CF6 ==> rb5V̛,IA~:
$tid!0x3BD55C98C4DF679BF75D58287C49464CF382C94E0F493339E2C3AD996EEAF377 ==>
Q       1Hz>\~)4jW<8m
$tid!0x3BD611554C36EADB0C2FF4FD7831B035C1150534CE07515C901125CE1D7F82E8 ==> :o<p~e_Lg=($k<
$tid!0x3BD8498E090010ED1FCE99004B9D9E3B0A8E7C91F71B9283B333277236A54EFA ==> s-@_N6    R)pMI~
id!0x3BDA4F97891922FDEB5018F7F1C1C8756191778C3B07DDC685B4D9321263A305 ==> O3{W2~
"X"i;dV]        K6
$tid!0x3BDA9A6DFF6496A75B097EB7BB7949CB27A2374D505E4D1E3B44D032B4FEACA0 ==> 4)x|Au]y_6$fLa-
$tid!0x3BDAF9FF99C601137912EA4191D6EF2E3DCAD5A4F508B3584A8972CAC1A8B04D ==> fh0y)yUxP0

They are not completely identical according to diff, but having several lines in common makes it feel like a safe bet that the types are the same. So the key should be Location.t and the value should be Token_id.Set.t, as in root/snarked_ledger.

@Geometer1729
Member

wallets/store/<..>

AFAICT there are no databases in here, just keys in individual files. Based on this it looks like the private keys are stored separately from public keys and the filenames of the private keys depend on the public keys.

let get_path { path; cache } public_key =
  (* TODO: Do we need to version this? *)
  let filename =
    Public_key.Compressed.Table.find cache public_key
    |> Option.bind ~f:(function
         | Locked file | Unlocked (file, _) ->
             Option.return file
         | Hd_account _ ->
             Option.return
               (Public_key.Compressed.to_base58_check public_key ^ ".index") )
    |> Option.value ~default:(get_privkey_filename public_key)
  in
  path ^/ filename
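
The lookup order in get_path (cached filename first, then an ".index" name for HD accounts, then a default derived from the public key) can be sketched with the standard library as follows. The assumptions: public keys are plain strings, the Unlocked payload is dropped, and default_filename is a hypothetical placeholder for get_privkey_filename, whose real naming scheme is not shown here.

```ocaml
(* Simplified cache entry; the keypair carried by Unlocked is omitted. *)
type entry =
  | Locked of string
  | Unlocked of string
  | Hd_account

let cache : (string, entry) Hashtbl.t = Hashtbl.create 16

(* Placeholder naming scheme, NOT the real get_privkey_filename. *)
let default_filename public_key = public_key ^ ".default"

(* Resolve the on-disk path for a key: cache hit, HD index file, or default. *)
let get_path ~path public_key =
  let filename =
    match Hashtbl.find_opt cache public_key with
    | Some (Locked file) | Some (Unlocked file) -> file
    | Some Hd_account -> public_key ^ ".index"
    | None -> default_filename public_key
  in
  Filename.concat path filename

let () =
  Hashtbl.replace cache "pkA" (Locked "a.sk") ;
  Hashtbl.replace cache "pkB" Hd_account
```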

Here it seems to parse the key, but it just reads plain lines from a file:

let%map lines = Reader.file_lines (path ^/ file) in

I was able to run a node with a wallet, but the wallets/store/ directory remained empty for at least an hour or two. I assume I would need to actually produce a block for this code to run, so I haven't been able to confirm that this is the actual behavior of a node.
