foundationdb-beam/ex_fdbmonitor

ExFdbmonitor

An Elixir application that manages FoundationDB clusters using the BEAM's distributed capabilities.

ExFdbmonitor starts and supervises fdbmonitor (the FoundationDB management process), bootstraps new clusters, and handles scaling operations — all coordinated across nodes via Erlang distribution.

How it works

  1. First node — detects that no FDB peers exist, creates the cluster file, writes a foundationdb.conf, and runs configure new single <storage_engine>.
  2. Subsequent nodes — discover existing peers via :erlang.nodes(), copy the cluster file, and join the cluster.
  3. Redundancy — once enough nodes are registered, scale_up configures coordinators and the declared redundancy mode ("double", "triple").
  4. Restarts — on restart the bootstrap config is ignored (data files already exist). The node re-includes itself if necessary and re-evaluates redundancy automatically.

All mutating FDB operations are serialized through ExFdbmonitor.MgmtServer, a DGenServer backed by FDB itself. This prevents concurrent fdbcli commands from interleaving across nodes.
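
The serialization idea can be illustrated with an ordinary local GenServer. This is only a sketch: the module name SerializerSketch and its run/2 function are hypothetical, and it stands in for ExFdbmonitor.MgmtServer without reproducing DGenServer's FDB-backed, cross-node behaviour. It shows only that a single process handling one call at a time keeps commands from interleaving.

```elixir
defmodule SerializerSketch do
  # Hypothetical local stand-in for a serializing management server.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Runs the zero-arity function inside the server process. Because a
  # GenServer handles one message at a time, two calls never overlap.
  def run(server, fun) when is_function(fun, 0) do
    GenServer.call(server, {:run, fun}, :infinity)
  end

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:run, fun}, _from, state), do: {:reply, fun.(), state}
end
```

Callers on any number of processes can invoke SerializerSketch.run(server, fn -> ... end); each function finishes before the next one starts.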

Requirements

  • Elixir ~> 1.18
  • FoundationDB client and server packages (releases)

Usage

See examples/example_app/README.md for a tutorial on using ExFdbmonitor in your application.

Configuration

FDB executable paths

If your FoundationDB installation is not in the default location, you must set the following application config keys. The paths shown here are the defaults.

config :ex_fdbmonitor,
       fdbmonitor: "/usr/local/libexec/fdbmonitor",
       fdbcli: "/usr/local/bin/fdbcli",
       fdbserver: "/usr/local/libexec/fdbserver",
       fdbdr: "/usr/local/bin/fdbdr",
       backup_agent: "/usr/local/foundationdb/backup_agent/backup_agent",
       dr_agent: "/usr/local/bin/dr_agent"

Minimal (single-node dev)

# config/dev.exs
import Config

config :ex_fdbmonitor,
  etc_dir: ".my_app/dev/fdb/etc",
  run_dir: ".my_app/dev/fdb/run"

config :ex_fdbmonitor,
  bootstrap: [
    conf: [
      data_dir: ".my_app/dev/fdb/data",
      log_dir: ".my_app/dev/fdb/log",
      fdbservers: [[port: 5000]]
    ]
  ]

Multi-node production

# config/runtime.exs
import Config

# Returns the first address reported for the named network
# interface (e.g. "eth0") as a string.
addr = fn interface ->
  {:ok, addrs} = :inet.getifaddrs()
  :proplists.get_value(to_charlist(interface), addrs)[:addr]
  |> :inet.ntoa()
  |> to_string()
end

config :ex_fdbmonitor,
  etc_dir: "/var/lib/my_app/fdb/etc",
  run_dir: "/var/lib/my_app/fdb/run"

config :ex_fdbmonitor,
  bootstrap: [
  
    # nodes must communicate with coordinators over the
    # network interface
    cluster: [coordinator_addr: addr.("eth0")],
    
    conf: [
      data_dir: "/var/lib/my_app/fdb/data",
      log_dir: "/var/lib/my_app/fdb/log",
      storage_engine: "ssd-2",
      
      # We're defining 2 fdbservers per node
      fdbservers: [[port: 4500], [port: 4501]],
      
      # When safe to do so, ex_fdbmonitor will upgrade
      # to 'double' redundancy automatically
      redundancy_mode: "double"
    ]
  ]

Configuration reference

| Key | Required | Description |
| --- | --- | --- |
| :etc_dir | yes | Directory for fdb.cluster and foundationdb.conf |
| :run_dir | yes | Directory for fdbmonitor pid file |
| :bootstrap | no | Bootstrap config (ignored after first successful start) |

Bootstrap keys:

| Key | Description |
| --- | --- |
| cluster: [coordinator_addr:] | IP address for the initial coordinator (default "127.0.0.1") |
| conf: [data_dir:] | FDB data directory |
| conf: [log_dir:] | FDB log directory |
| conf: [storage_engine:] | Storage engine (default "ssd-2") |
| conf: [fdbservers:] | List of [port: N] keyword lists, one per fdbserver process |
| conf: [redundancy_mode:] | "single", "double", or "triple" (default: nil / single) |
| fdbcli: | Extra fdbcli args to run at bootstrap (optional, repeatable) |
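
To make the fdbservers key concrete, here is a minimal sketch of how a list of [port: N] entries could map to per-process sections in foundationdb.conf, which uses [fdbserver.<ID>] section headers. The module ConfSketch and its function are hypothetical, and the file ExFdbmonitor actually writes may contain more settings or differ in layout.

```elixir
defmodule ConfSketch do
  # Hypothetical helper: one "[fdbserver.<port>]" section header per entry.
  def fdbserver_sections(fdbservers) do
    Enum.map_join(fdbservers, "\n", fn opts ->
      # Each entry is a keyword list such as [port: 4500].
      port = Keyword.fetch!(opts, :port)
      "[fdbserver.#{port}]"
    end)
  end
end

IO.puts(ConfSketch.fdbserver_sections([[port: 4500], [port: 4501]]))
```

With the two-port list from the multi-node example, this prints one section header for port 4500 and one for port 4501.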

Bootstrap flow

On application start, ExFdbmonitor runs two phases:

Phase 1 (before any processes start):

  • If the conf file and data dir are empty (first boot), write the config files. If any node returned by :erlang.nodes() is already running FDB, copy its cluster file. Otherwise, create a new cluster file and generate configure new single <engine>.
  • If files already exist (restart), skip — use existing cluster file.

Phase 2 (after fdbmonitor / fdbserver are running):

  • Start ExFdbmonitor.MgmtServer (connects to FDB for distributed coordination).
  • Register this node's machine_id.
  • Call scale_up(redundancy_mode, [node()]) — includes the node back into FDB and configures redundancy when enough nodes are present.
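
The Phase 1 decision described above can be summarized as a small pure function. This is a sketch under assumptions: the module BootstrapSketch, the function phase1_action/2, and the returned atoms are hypothetical names chosen for illustration, not ExFdbmonitor's internals.

```elixir
defmodule BootstrapSketch do
  # first_boot? is true when the conf file and data dir are empty.
  # fdb_peers is the list of connected nodes already running FDB.

  # Restart: files already exist, so the bootstrap config is skipped.
  def phase1_action(false, _fdb_peers), do: :use_existing_cluster_file

  # First boot with at least one FDB peer: join by copying its cluster file.
  def phase1_action(true, [peer | _rest]), do: {:copy_cluster_file_from, peer}

  # First boot with no peers: this node creates the cluster.
  def phase1_action(true, []), do: :create_new_cluster
end
```

For example, BootstrapSketch.phase1_action(true, []) selects the new-cluster path, which corresponds to the first node in a deployment.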

Public API

ExFdbmonitor.leave/0

Gracefully remove the current node from the cluster. Downgrades redundancy if needed, reassigns coordinators, excludes the node (blocks until data is moved), and stops the local fdbmonitor. To rejoin, restart the :ex_fdbmonitor application.

Redundancy modes

| Mode | Min nodes | Min coordinators |
| --- | --- | --- |
| "single" | 1 | 1 |
| "double" | 3 | 3 |
| "triple" | 5 | 5 |

scale_up stores the declared mode as a ceiling. scale_down auto-determines the highest mode the surviving nodes can support, capped at that ceiling. This prevents a scale-down/scale-up cycle from accidentally exceeding the operator's intent.
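
The ceiling rule can be sketched as a pure function over the node counts in the table above. The module RedundancySketch and effective_mode/2 are hypothetical names for illustration, and the sketch assumes the ceiling is always one of the three mode strings.

```elixir
defmodule RedundancySketch do
  # {mode, rank, minimum nodes}, ordered from highest mode to lowest.
  @modes [{"triple", 3, 5}, {"double", 2, 3}, {"single", 1, 1}]

  # Highest mode that node_count nodes can support without exceeding ceiling.
  def effective_mode(node_count, ceiling)
      when is_integer(node_count) and node_count >= 1 do
    ceiling_rank = rank_of(ceiling)

    {mode, _rank, _min_nodes} =
      Enum.find(@modes, fn {_mode, rank, min_nodes} ->
        rank <= ceiling_rank and node_count >= min_nodes
      end)

    mode
  end

  # Looks up the rank of a mode string; raises on an unknown mode.
  defp rank_of(mode) do
    {_mode, rank, _min_nodes} = List.keyfind(@modes, mode, 0)
    rank
  end
end
```

Under this sketch, seven nodes with a declared ceiling of "double" stay at "double", and a cluster that shrinks to two nodes falls back to "single".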

Scaling example

When a node is shutting down gracefully:

# On the departing node:
ExFdbmonitor.leave()

When a node returns after a graceful shutdown:

# Later, restart the :ex_fdbmonitor application to rejoin:
Application.stop(:ex_fdbmonitor)
Application.ensure_all_started(:ex_fdbmonitor)

Testing

ExFdbmonitor provides sandbox modules for integration testing:

# Single-node sandbox
sandbox = ExFdbmonitor.Sandbox.Single.checkout("my-test", starting_port: 5000)
# ... run tests ...
ExFdbmonitor.Sandbox.Single.checkin(sandbox, drop?: true)

# 3-node double-redundancy sandbox
sandbox = ExFdbmonitor.Sandbox.Double.checkout("my-test", starting_port: 5500)
# ... run tests ...
ExFdbmonitor.Sandbox.Double.checkin(sandbox, drop?: true)

Sandboxes start isolated local_cluster nodes with their own FDB processes. Pass drop?: true to delete all data on checkin.
