Atomicity & Write-Ahead Logging

Two-phase execution model and reliable async operations

The Challenge

IC canisters face unique constraints:

  • Execution limits: A few billion instructions per message
  • Async calls: Inter-canister calls can fail
  • Upgrades: Subnets and canisters can be upgraded at any time
  • Crashes: Unexpected failures can occur

Financial operations must remain consistent despite these challenges.

Two-Phase Execution Model

Every critical operation follows this pattern:

Phase 1: Synchronous (Atomic)

  • All state changes are atomic (all-or-nothing)
  • User receives immediate feedback
  • No async operations are queued if validation fails
  • State persists before async work begins

Duration: Typically < 200ms
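The all-or-nothing behaviour of Phase 1 can be illustrated with a minimal sketch. Everything here is hypothetical: `State`, `PendingOutflow`, `withdraw_sync`, and the toy health check are illustrative names and rules, not the protocol's actual types. The sketch validates before mutating, which is one simple way to guarantee that a failed validation leaves no state change and no queued async work.

```rust
use std::collections::VecDeque;

/// Hypothetical in-memory stand-in for the canister's state.
#[derive(Debug, Default)]
struct State {
    supply_shares: u64,
    debt: u64,
    wal: VecDeque<PendingOutflow>,
}

/// Hypothetical record of async work queued for Phase 2.
#[derive(Debug, PartialEq)]
struct PendingOutflow {
    op_id: u64,
    amount: u64,
}

#[derive(Debug, PartialEq)]
enum WithdrawError {
    InsufficientShares,
    HealthFactorTooLow,
}

/// Phase 1: validate, mutate state, and enqueue the async work in one
/// synchronous step. On any error, nothing is changed and nothing is queued.
fn withdraw_sync(state: &mut State, op_id: u64, amount: u64) -> Result<(), WithdrawError> {
    if amount > state.supply_shares {
        return Err(WithdrawError::InsufficientShares);
    }
    let remaining = state.supply_shares - amount;
    // Toy health check (assumption): remaining shares must cover the debt.
    if remaining < state.debt {
        return Err(WithdrawError::HealthFactorTooLow);
    }
    // All checks passed: commit the state change and the queued work together.
    state.supply_shares = remaining;
    state.wal.push_back(PendingOutflow { op_id, amount });
    Ok(())
}
```

A rejected call returns an error immediately and leaves `supply_shares` and `wal` untouched; an accepted call updates both before any async work starts.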

Phase 2: Asynchronous (WAL-backed)

  • Operations persist across canister upgrades
  • Automatic retries with exponential backoff
  • Idempotent execution (safe to retry)
  • Error tracking for manual intervention

Duration: Variable (seconds to minutes)

Write-Ahead Log (WAL)

The WAL is a persistent queue of pending async operations stored in stable memory.

WAL Entry Structure

Each entry tracks:

  • Kind: Operation type ("outflow", "liquidation", etc.)
  • Status: Current execution state
  • Retry info: Attempts, max retries, backoff, next attempt time
  • Audit trail: First seen, last update, last error
  • Payload: Operation-specific data
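As a rough illustration, the fields listed above could map onto a record like the following. The field names, integer widths, timestamp unit, and the defaults in `WalEntry::new` are assumptions made for this sketch; the real entry layout is not shown in this document.

```rust
/// Execution state of an entry (variant names follow the WAL status states).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WalStatus {
    Enqueued,
    InFlight,
    Succeeded,
    FailedRetryable,
    FailedPermanent,
}

/// Hypothetical WAL entry mirroring the fields described above.
#[derive(Debug, Clone)]
struct WalEntry {
    id: u64,
    // Kind: operation type such as "outflow" or "liquidation".
    kind: String,
    // Status: current execution state.
    status: WalStatus,
    // Retry info.
    attempts: u32,
    max_retries: u32,
    backoff_secs: u64,
    next_attempt_at: u64,
    // Audit trail (timestamps in seconds, an assumption).
    first_seen_at: u64,
    last_update_at: u64,
    last_error: Option<String>,
    // Payload: operation-specific data, kept opaque here.
    payload: Vec<u8>,
}

impl WalEntry {
    /// Creates a fresh entry that is eligible for its first execution at `now`.
    fn new(id: u64, kind: &str, payload: Vec<u8>, now: u64) -> Self {
        WalEntry {
            id,
            kind: kind.to_string(),
            status: WalStatus::Enqueued,
            attempts: 0,
            max_retries: 5,  // assumed default
            backoff_secs: 2, // assumed initial backoff
            next_attempt_at: now,
            first_seen_at: now,
            last_update_at: now,
            last_error: None,
            payload,
        }
    }
}
```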

WAL Status States

| Status | Meaning |
|---|---|
| Enqueued | Queued for first execution |
| InFlight | Currently executing |
| Succeeded | Completed successfully |
| FailedRetryable | Failed, will retry |
| FailedPermanent | Failed, needs intervention |
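Read together, the statuses suggest a small state machine. The sketch below encodes one plausible set of transitions; the exact set is an assumption inferred from the status meanings (in particular, that a manual reset moves `FailedPermanent` back to `Enqueued`), and `can_transition` is a hypothetical helper.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WalStatus {
    Enqueued,
    InFlight,
    Succeeded,
    FailedRetryable,
    FailedPermanent,
}

/// Returns true if moving from `from` to `to` is allowed in this sketch.
fn can_transition(from: WalStatus, to: WalStatus) -> bool {
    use WalStatus::*;
    matches!(
        (from, to),
        // Picked up for a first attempt or for a retry.
        (Enqueued, InFlight)
            | (FailedRetryable, InFlight)
            // Outcome of an attempt.
            | (InFlight, Succeeded)
            | (InFlight, FailedRetryable)
            | (InFlight, FailedPermanent)
            // Manual reset after intervention (assumed to re-enqueue).
            | (FailedPermanent, Enqueued)
    )
}
```

`Succeeded` has no outgoing transitions here, which matches the idea that completed operations are never executed again.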

Retry Policy

Operations use exponential backoff. The delay before each retry doubles, starting at 2 seconds:

  • Attempt 1: 2 seconds
  • Attempt 2: 4 seconds
  • Attempt 3: 8 seconds
  • Attempt 4: 16 seconds
  • Attempt 5: 32 seconds

Total backoff delay before permanent failure: ~62 seconds (2 + 4 + 8 + 16 + 32)
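The schedule above is a doubling sequence: the delay for attempt n is 2 × 2^(n−1) seconds. A minimal sketch, where the constant and function names are assumptions chosen for illustration:

```rust
/// Base delay and attempt limit taken from the schedule above.
const BASE_DELAY_SECS: u64 = 2;
const MAX_ATTEMPTS: u32 = 5;

/// Delay in seconds for a 1-based attempt number, or `None` once the
/// attempt number is outside the schedule.
fn backoff_delay_secs(attempt: u32) -> Option<u64> {
    if attempt == 0 || attempt > MAX_ATTEMPTS {
        return None;
    }
    // 2, 4, 8, 16, 32 for attempts 1 through 5.
    Some(BASE_DELAY_SECS << (attempt - 1))
}

/// Sum of all delays in the schedule.
fn total_backoff_secs() -> u64 {
    (1..=MAX_ATTEMPTS).filter_map(backoff_delay_secs).sum()
}
```

With these constants, `total_backoff_secs()` evaluates to 62, matching the total listed above.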

Timer-Based Execution

A background timer processes the WAL every 30 seconds, acquiring pending operations in batches of up to 256 and executing those whose next attempt time has passed.
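The selection step of each timer tick might look like the sketch below. It assumes that "pending" means `Enqueued` or `FailedRetryable`, and that entries are filtered by due time before the batch limit is applied; both are interpretations of the description above, and `due_batch` and the trimmed-down `WalEntry` are hypothetical.

```rust
/// Maximum number of entries picked up per timer tick.
const BATCH_SIZE: usize = 256;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WalStatus {
    Enqueued,
    InFlight,
    Succeeded,
    FailedRetryable,
    FailedPermanent,
}

/// Trimmed-down entry holding only the fields the scheduler needs.
#[derive(Debug, Clone)]
struct WalEntry {
    id: u64,
    status: WalStatus,
    next_attempt_at: u64,
}

/// Returns the ids of up to `BATCH_SIZE` entries that are pending and whose
/// next attempt time is at or before `now`.
fn due_batch(entries: &[WalEntry], now: u64) -> Vec<u64> {
    entries
        .iter()
        // Only entries waiting for a first attempt or for a retry.
        .filter(|e| matches!(e.status, WalStatus::Enqueued | WalStatus::FailedRetryable))
        // Skip entries whose backoff has not elapsed yet.
        .filter(|e| e.next_attempt_at <= now)
        .take(BATCH_SIZE)
        .map(|e| e.id)
        .collect()
}
```

In a canister, a function like this would be called from the periodic timer callback; entries left over after the batch limit wait for the next tick.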

Idempotency Guarantees

Multiple layers prevent double-execution:

| Layer | Mechanism |
|---|---|
| Operation ID | Unique ID per operation, duplicates rejected |
| Status Check | Already-succeeded operations skipped |
| Per-Entry Locking | Exclusive lock prevents concurrent execution |
| Handler Deduplication | Pool tracks processed withdrawal IDs |
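The four layers can be sketched with plain in-memory collections. `Wal`, `Pool`, and their methods are hypothetical stand-ins, and using the same ID for the operation and the withdrawal is a simplification for this example.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Enqueued,
    InFlight,
    Succeeded,
}

/// Hypothetical WAL with an operation-ID index and per-entry locks.
#[derive(Default)]
struct Wal {
    entries: HashMap<u64, Status>,
    locked: HashSet<u64>,
}

impl Wal {
    /// Layer 1: an operation ID can be enqueued only once.
    fn enqueue(&mut self, op_id: u64) -> bool {
        if self.entries.contains_key(&op_id) {
            return false;
        }
        self.entries.insert(op_id, Status::Enqueued);
        true
    }

    /// Layers 2 and 3: skip unknown or already-succeeded operations, then
    /// take an exclusive lock so only one executor runs the entry.
    fn try_begin(&mut self, op_id: u64) -> bool {
        match self.entries.get(&op_id) {
            None | Some(Status::Succeeded) => return false,
            Some(_) => {}
        }
        // `insert` returns false if the lock is already held.
        if !self.locked.insert(op_id) {
            return false;
        }
        self.entries.insert(op_id, Status::InFlight);
        true
    }

    /// Marks the operation as done and releases its lock.
    fn finish_ok(&mut self, op_id: u64) {
        self.entries.insert(op_id, Status::Succeeded);
        self.locked.remove(&op_id);
    }
}

/// Layer 4: the handler remembers which withdrawal IDs it has processed.
#[derive(Default)]
struct Pool {
    processed: HashSet<u64>,
    paid_out: u64,
}

impl Pool {
    /// Pays out at most once per withdrawal ID, even if called repeatedly.
    fn withdraw(&mut self, withdrawal_id: u64, amount: u64) {
        if self.processed.insert(withdrawal_id) {
            self.paid_out += amount;
        }
    }
}
```

The handler-side check matters because a call can time out after the pool has already processed it; a retry then reaches the pool a second time and must have no effect.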

Failure Scenarios

Scenario 1: Transient Network Failure

User withdraws 1 BTC → Lending Canister burns shares (committed) → Pool call times out

Recovery: WAL marks FailedRetryable → Waits 2 seconds → Retries → Eventually succeeds
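The bookkeeping for a failed attempt could be sketched as follows. `record_failure` and the trimmed-down `WalEntry` are hypothetical; the sketch assumes the delay doubles from 2 seconds and that an entry becomes `FailedPermanent` once its failures exceed `max_retries`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WalStatus {
    InFlight,
    FailedRetryable,
    FailedPermanent,
}

/// Trimmed-down entry holding only the retry-related fields.
#[derive(Debug)]
struct WalEntry {
    status: WalStatus,
    attempts: u32,
    max_retries: u32,
    next_attempt_at: u64,
    last_error: Option<String>,
}

/// Records a failed attempt at time `now` (seconds) and either schedules
/// the next retry or marks the entry as permanently failed.
fn record_failure(entry: &mut WalEntry, now: u64, error: &str) {
    entry.attempts += 1;
    entry.last_error = Some(error.to_string());
    if entry.attempts > entry.max_retries {
        // Retries exhausted: leave the entry for manual intervention.
        entry.status = WalStatus::FailedPermanent;
    } else {
        // Exponential backoff: 2, 4, 8, ... seconds after this failure.
        entry.status = WalStatus::FailedRetryable;
        entry.next_attempt_at = now + (2u64 << (entry.attempts - 1));
    }
}
```

For the timed-out pool call in this scenario, the first failure at time `t` sets `next_attempt_at` to `t + 2` and leaves the entry in `FailedRetryable`, so a later timer tick picks it up again.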

Scenario 2: Subnet or Canister Upgrade

100 withdrawals pending in WAL → Operator upgrades subnet or canister → Heap cleared, timers stopped

Recovery: post_upgrade() reinitializes timers → WAL entries preserved (stable storage) → Operations resume

Scenario 3: Health Factor Violation

User tries to withdraw → Lending Canister burns shares → Health factor check fails

Recovery: Lending Canister re-mints shares (rollback) → No WAL entry created → User receives immediate error

Scenario 4: Permanent Failure

Withdrawal with invalid address → Pool rejects permanently → WAL marks FailedPermanent

Recovery: Admin investigates → Fixes root cause → Manually resets operation → Retry succeeds

Atomic Operations Table

| Operation | Atomic State Changes | Async Operations |
|---|---|---|
| Deposit | Mint supply shares | Pool-initiated |
| Withdraw | Burn supply shares, health check | Pool withdrawal, ckAsset burn |
| Borrow | Mint debt shares, health check | Pool withdrawal, ckAsset burn |
| Repay | Burn debt shares | Pool-initiated |
| Liquidate | Burn debt, burn collateral, mint treasury | Collateral transfer, change refund |

Persistence Properties

The WAL uses stable storage that survives:

  • Canister upgrades
  • Canister crashes
  • Node restarts

The WAL ensures that once a user's state is updated, the corresponding async operation is never silently lost: it is retried until it completes, even across canister upgrades and network failures, or it is marked FailedPermanent for manual intervention.