Performance Gabo Laptop 2026-05-26 20:22

Performance report from fresh snapshots at http://localhost:58001/debug/network and /debug/sqlite.

Executive summary

The app is not currently blocked by SQLite lock/busy errors. The main performance problem is still network/discovery sync latency, especially waiting on already-connected peer syncs.

The daemon attempts a large number of peer syncs; many fail or get preempted (1210 ok against 3272 dial_failed and 852 preempted), and the successful path often takes seconds.

SQLite health

| Signal | Current state |
|---|---:|
| Write transactions in flight | None |
| Recent SQLITE_BUSY / begin_busy | 0 |
| WAL file size | 0 B |
| DB size | 4.0 GiB |
| Worst write caller | blob.(*Index).PutMany-range1 |
| PutMany p99 total | 458 ms |
| PutMany p99 hold | 120 ms |
| WAL checkpoint p99 | 17.6 ms |

SQLite looks mostly healthy right now:

No active writer lock.
No recent busy timeout events.
WAL is truncated to 0B, so there is no large WAL backlog.
Writes are present, but not dominating wall time.
Some reads are moderately slow:
- ListAccounts p99: 328 ms
- ListEntityMentions p99: 289 ms
- syncing.(*Server).loadStore p99: 156 ms

So: SQLite is contributing some latency, but it is not the current main bottleneck.

Network / sync performance

| Signal | Current state |
|---|---:|
| Daemon uptime in snapshot | 3m50s |
| connected_sync p50 | 4.02 s |
| connected_sync p90 | 18.51 s |
| connected_sync p99 | 21.09 s |
| Discovery connected p50 | 17.34 s |
| Discovery connected p99 | 21.29 s |
| Dial p99 | 10.85 s |
| Reconcile RPC p99 | 1.96 s |
| Bitswap fetch p99 | 313 ms |

The bad user-visible delay is:

connected_sync: reconcile + download from connected peers, and it waits for all of them.

That phase has p50 around 4 seconds and p90/p99 around 18–21 seconds.

Sync outcomes

| Outcome | Count |
|---|---:|
| ok | 1210 |
| dial_failed | 3272 |
| preempted | 852 |
| protocol_mismatch | 243 |
| rpc_error | 102 |
| putmany_failed | 0 |

This is important:

putmany_failed = 0, so SQLite writes are not failing.
dial_failed = 3272, very high.
preempted = 852, also high.
Many peer sync attempts are wasted or cancelled.
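
To put the counts in proportion, the short sketch below computes each outcome's share of all attempts. It assumes that every sync attempt is counted under exactly one outcome, which the snapshot does not state explicitly; the helper names `sum` and `share` are illustrative.

```go
package main

import "fmt"

// sum adds up all outcome counts.
func sum(counts []int) int {
	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}

// share returns count as a percentage of total.
func share(count, total int) float64 {
	return 100 * float64(count) / float64(total)
}

func main() {
	// Counts copied from the sync outcomes table.
	names := []string{"ok", "dial_failed", "preempted", "protocol_mismatch", "rpc_error", "putmany_failed"}
	counts := []int{1210, 3272, 852, 243, 102, 0}

	total := sum(counts)
	fmt.Printf("total attempts: %d\n", total)
	for i, name := range names {
		fmt.Printf("%-18s %5d  %5.1f%%\n", name, counts[i], share(counts[i], total))
	}
}
```

Under that assumption the snapshot covers 5679 attempts, of which roughly 21% are ok, 58% are dial_failed, and 15% are preempted.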

Bandwidth / data movement

| Signal | Current state |
|---|---:|
| Total traffic since startup | 214.6 MiB |
| HTTP server loopback out | 188.7 MiB |
| libp2p remote total | 24.7 MiB |
| SQLite grew | 3.0 MiB |
| Bitswap unique received | 4.5 MiB |
| Duplicate bitswap data | 33.4 KiB / 0.7% |

Most traffic is local frontend/backend gRPC-Web loopback (188.7 of 214.6 MiB, about 88%), not external network traffic.

Bitswap itself looks healthy:

Fetches complete.
Completeness ratio is 1.00.
Duplicate waste is low.
putmany_failed = 0.

Likely bottleneck

Current bottleneck is:

Frontend asks daemon for document/content
        ↓
Daemon runs discovery
        ↓
Daemon syncs with many peers
        ↓
Many peer syncs fail to dial, time out, get preempted, or take seconds
        ↓
connected_sync waits too long
        ↓
User sees slow app

Not:

SQLite locked

At least in this snapshot, SQLite is not locked.

What looks wrong

Discovery waits too long on peer syncs
- connected_sync p90 = 18.51s
- connected_sync p99 = 21.09s
Too many failed peer attempts
- dial_failed = 3272
- dial failures alone outnumber successful syncs (3272 vs. 1210 ok).
Preemption is high
- preempted = 852
- suggests work starts, then gets cancelled before useful completion.
SQLite is okay now
- no begin_busy
- no in-flight writer
- WAL is 0B
- writes are measurable but not the main pain.

Recommendation to discuss with team

Focus on reducing discovery/network fanout before optimizing SQLite further:

dedupe concurrent discovery requests for the same resource;
reduce number of peers queried per discovery;
stop earlier once enough peers answered/found content;
avoid waiting for all connected peer syncs;
prioritize historically good peers;
cache negative/positive discovery results briefly;
reduce frontend query block requests to daemon.

SQLite should still be watched, especially PutMany and slow reads, but the current app performance issue is primarily network/discovery contention, not SQLite lock contention.

Recap

The app is currently slow mainly because connected_sync takes seconds to tens of seconds, with many dial_failed and preempted peer syncs. SQLite is not currently locked: no writer in flight, no recent begin_busy, no putmany_failed, and the WAL file is 0B.

DB: 4.0 GiB; WAL: 0B.
Worst SQLite write p99: PutMany around 458 ms total, 120 ms hold.
Worst user-visible network phase: connected_sync p99 around 21 s.
Best next fix area: reduce/dedupe/short-circuit discovery peer sync work.
