# `Wallet::new()` hangs indefinitely on stale persisted state #53
## Summary
`Wallet::new()` hangs indefinitely when `orange.sqlite` contains persisted state
from a previous initialization that partially succeeded and then failed (e.g.,
Electrum connected but Spark timed out, or the process was killed mid-init).
There is no alternative initialization path: `Wallet::new()` is the only entry
point, and it always performs the full startup sequence, including Spark
reconnection, LDK ChannelManager state recovery, and chain sync.
## Reproduction
1. Create a wallet with a valid Electrum source but a Spark backend that is slow or intermittently unreachable.
2. Kill the process mid-initialization (after Electrum connects but before Spark completes).
3. Attempt to call `Wallet::new()` again with the same storage path.
**Expected:** Wallet initializes successfully (possibly with degraded Spark functionality).

**Actual:** `Wallet::new()` hangs indefinitely, never returning or erroring.
## Workaround
Deleting `orange.sqlite` before retrying resolves the hang: the wallet
reinitializes cleanly from the seed. This loses any cached state (network graph,
scorer, Spark sync offset) but is acceptable for development.
## Root Cause (suspected)
On startup, `Wallet::new()` reads persisted state from `orange.sqlite`:

- `ldk_data[""]["manager"]` — LDK ChannelManager (channels, HTLCs, peer info)
- `ldk_data["spark"]["cache"]` — Spark account info, sync offset, last sync time
- `ldk_data[""]["network_graph"]`, `[""]["scorer"]` — routing metadata
When this state is stale or inconsistent (from a partial init), the startup
sequence appears to block on:
- Peer reconnection attempts for channels that were never fully opened
- Spark sync from a cached offset that is inconsistent with the actual state
- Possibly an unrecoverable LDK state machine condition
### Observed state in `orange.sqlite` after a failed init
`ldk_data` table namespaces:

```
(root)      -> manager, network_graph, node_metrics, scorer, output_sweeper
bdk_wallet  -> descriptor, change_descriptor, local_chain, network
spark/cache -> account_info, last_sync_time, lightning_address, sync_offset
```
## Proposal

### Option 1: Init timeout with clean retry (minimal change)
Add a configurable timeout to `Wallet::new()`. On timeout, clear
connection-related persisted state and retry:
```rust
impl Wallet {
    pub async fn new(config: WalletConfig) -> Result<Self, InitFailure> {
        // existing implementation
    }

    pub async fn new_with_timeout(
        config: WalletConfig,
        timeout: Duration,
    ) -> Result<Self, InitFailure> {
        match tokio::time::timeout(timeout, Self::new(config.clone())).await {
            Ok(result) => result,
            Err(_) => {
                // Timed out: drop stale connection state and retry once.
                Self::clear_connection_state(&config)?;
                Self::new(config).await
            }
        }
    }
}
```

### Option 2: Separate init paths (better long-term)
```rust
impl Wallet {
    /// Full init — current behavior.
    pub async fn new(config: WalletConfig) -> Result<Self, InitFailure>;

    /// Clean init — clears persisted connection state, keeps key material
    /// and tx history. Forces fresh handshakes with all backends.
    pub async fn new_clean(config: WalletConfig) -> Result<Self, InitFailure>;

    /// Read-only open — loads persisted state without connecting to any
    /// backend. Suitable for balance display, tx history, address generation.
    pub async fn open_readonly(config: WalletConfig) -> Result<Self, InitFailure>;
}
```

`open_readonly` would also benefit mobile apps, where fast app launch is
critical: the full init sequence (Spark + LDK + chain sync) adds significant
startup latency.
## Environment
- orange-sdk rev: `2762df2`
- Network: regtest
- Chain source: Electrum (local electrs v0.10.9)
- Extra config: `ExtraConfig::Spark(SparkWalletConfig::default())`
- Platform: macOS (aarch64), Rust 1.88
## Our current workaround
We are implementing a timeout wrapper on our side.