Questions from integrating teams

FAQ

Every question asked by teams evaluating TinySync, answered honestly. Where an entry carries a status label: Answered is settled in design or code, Deferred was consciously postponed, and Open is genuinely undecided.

Tree & mutation semantics

How do you prevent a move from creating a cycle? If A moves X into Y while B moves Y into X, both pass their version checks — is there an ancestry check at commit, inside the transaction?

Yes — inside the transaction. Every MoveRename is validated in the same Postgres transaction that commits it: the server rejects the move if the destination parent is the item itself or any of its descendants, via a recursive-CTE ancestry walk (is_descendant). The rejection surfaces as a normal mutation conflict, not a 500.

The scenario in the question (two cross-moves racing in separate transactions) is sharper: each ancestry check reads pre-race state, so both moves can pass validation. A serialization hardening (taking the per-vault lock before validation rather than after) is tracked for exactly this case.
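As a rough illustration of the ancestry check and of why the cross-move race is sharper, here is a minimal in-memory sketch. The server does this with a recursive CTE inside Postgres; the parent map, `validate_move`, and the sample tree below are hypothetical stand-ins, not TinySync code.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the item tree: item_id -> parent_id (None = vault root).
Parents = Dict[str, Optional[str]]


def is_descendant(parents: Parents, candidate: str, ancestor: str) -> bool:
    """Return True if `candidate` sits anywhere below `ancestor`."""
    node = parents.get(candidate)
    while node is not None:
        if node == ancestor:
            return True
        node = parents.get(node)
    return False


def validate_move(parents: Parents, item: str, new_parent: str) -> bool:
    """Accept a move only if the destination is neither the item nor one of its descendants."""
    return new_parent != item and not is_descendant(parents, new_parent, item)


# Y lives under X, so moving X into Y (or into itself) must be rejected.
nested: Parents = {"root": None, "A": "root", "X": "A", "Y": "X"}
move_into_child = validate_move(nested, "X", "Y")
move_into_self = validate_move(nested, "X", "X")
move_up = validate_move(nested, "Y", "A")

# The race: X and Y are siblings, and both cross-moves are checked against
# the same pre-race state, so each one looks valid in isolation.
siblings: Parents = {"root": None, "X": "root", "Y": "root"}
a_moves_x_into_y = validate_move(siblings, "X", "Y")
b_moves_y_into_x = validate_move(siblings, "Y", "X")
```

Both racing checks return True here, which is why validation has to happen under the per-vault lock rather than before it.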

Is DeleteSubtree atomic from a reader's point of view, or can a local process observe it half-deleted?

Two different readers, two answers. Against the server, it is atomic: one transaction tombstones the folder and every descendant and appends a single DeleteSubtree event — no API reader can see a half-deleted subtree, and the log carries one event, not one per descendant.

On a device's local filesystem, application is necessarily progressive — files are removed one at a time, so a local process can observe the subtree mid-removal. Dirty local files are preserved as conflict copies before any destructive apply.

Answered · More: Groups & Access
Do you support hierarchy in vaults?

Inside a vault: full folder hierarchy, with moves and renames as first-class single operations. Vaults themselves do not nest — a vault is the unit of access control (group ↔ vault edges). The shared-task use case (a non-member who needs one task's files but not the team drive) is modelled as a per-task vault plus a task-participant group, so access stays additive.

Ordering, the log & recovery

How is seq assigned: Postgres sequence, max+1, or advisory lock? Monotonic with no gaps under concurrent commits?

None of the three. The commit transaction increments a per-vault counter — UPDATE vaults SET latest_seq = latest_seq + 1 … RETURNING — and the row lock is held until commit. That serializes assignment per vault, and because the increment lives in the same transaction as the change-log insert, an abort rolls both back together.

So: monotonic and gapless per vault, by construction. A Postgres sequence would leak gaps on abort; max+1 races; advisory locks add a second locking domain for nothing.
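The gapless property can be sketched with SQLite standing in for Postgres. This is an assumption-laden illustration: the table layout and `commit_event` are hypothetical, and a separate SELECT replaces the `RETURNING` clause so the sketch runs on older SQLite builds.

```python
import sqlite3

# Autocommit mode, so the transaction boundaries below are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE vaults (id TEXT PRIMARY KEY, latest_seq INTEGER NOT NULL);
    CREATE TABLE change_log (
        vault_id TEXT, seq INTEGER, event TEXT,
        PRIMARY KEY (vault_id, seq)
    );
    INSERT INTO vaults VALUES ('v1', 0);
""")


def commit_event(vault_id: str, event: str, fail: bool = False) -> int:
    """Assign the next seq and append the event in one transaction."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute(
            "UPDATE vaults SET latest_seq = latest_seq + 1 WHERE id = ?", (vault_id,)
        )
        (seq,) = conn.execute(
            "SELECT latest_seq FROM vaults WHERE id = ?", (vault_id,)
        ).fetchone()
        if fail:
            raise RuntimeError("simulated abort after the increment")
        conn.execute("INSERT INTO change_log VALUES (?, ?, ?)", (vault_id, seq, event))
        conn.execute("COMMIT")
        return seq
    except Exception:
        conn.execute("ROLLBACK")  # undoes the counter increment too
        raise


first = commit_event("v1", "CreateFile")
try:
    commit_event("v1", "ModifyFile", fail=True)
except RuntimeError:
    pass
second = commit_event("v1", "ModifyFile")
seqs = [row[0] for row in conn.execute("SELECT seq FROM change_log ORDER BY seq")]
```

The aborted commit leaves no hole: the counter rolls back with the transaction, so the next successful commit reuses the number.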

A seq gap is "an error, not a skip." Error how — halt, retry, or panic-resync? Is there a recovery path?

Halt first, then retry, then escalate. Replay applies events strictly sequentially; a gap raises LogGap and stops applying at the cursor — nothing after the gap is touched. The next sync re-fetches from last_seq_processed, which clears transient gaps (a page boundary, a racing read). If local state is ever genuinely unreconcilable, the engine marks itself for panic-resync: snapshot, re-hash, preserve divergent bytes as conflict copies, rebuild.
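The halt-then-retry behavior can be sketched as follows. Only `LogGap` and `last_seq_processed` come from the answer above; the `Replayer` class and the event tuples are hypothetical, and the sketch assumes each fetched page starts right after the cursor.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


class LogGap(Exception):
    """Raised when the next event is not exactly cursor + 1."""

    def __init__(self, expected: int, got: int):
        super().__init__(f"expected seq {expected}, got {got}")
        self.expected = expected
        self.got = got


@dataclass
class Replayer:
    last_seq_processed: int = 0
    applied: List[str] = field(default_factory=list)

    def replay(self, events: Iterable[Tuple[int, str]]) -> None:
        for seq, payload in events:
            expected = self.last_seq_processed + 1
            if seq != expected:
                # Stop at the cursor; nothing after the gap is applied.
                raise LogGap(expected, seq)
            self.applied.append(payload)
            self.last_seq_processed = seq


replayer = Replayer()
gap_seen = None
try:
    replayer.replay([(1, "a"), (2, "b"), (4, "d")])  # seq 3 is missing
except LogGap as gap:
    gap_seen = gap
cursor_after_gap = replayer.last_seq_processed

# The next sync re-fetches from the cursor; a transient gap clears itself.
replayer.replay([(3, "c"), (4, "d")])
```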

PanicResync is "safe to re-run" after a mid-resync crash — does it resume hashing where it left off, or start over?

It starts over. Safety comes from ordering, not checkpointing: fetch snapshot → enumerate and hash local content → preserve local-only or mismatched bytes as conflict copies → rebuild the local index from the snapshot → only then discard the pending-op queue. Every step is idempotent, and the one destructive step (discarding pending ops) happens last — so a crash at any point re-runs from the top without losing data.
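A minimal model of that ordering, with a crash injected mid-run, might look like this. Everything here is a hypothetical stand-in: `LocalState`, `panic_resync`, the use of SHA-256, and keying conflict copies by hash are assumptions chosen to make each step idempotent in the sketch.

```python
import hashlib
from typing import Callable, Dict, List, Optional


class SimulatedCrash(Exception):
    """Models the process dying partway through a resync."""


class LocalState:
    def __init__(self, files: Dict[str, bytes], pending_ops: List[str]):
        self.files = files                            # path -> local bytes
        self.pending_ops = pending_ops                # queued, unsent operations
        self.index: Dict[str, str] = {}               # path -> content hash
        self.conflict_copies: Dict[str, bytes] = {}   # keyed by hash: re-runs overwrite


def panic_resync(
    state: LocalState,
    fetch_snapshot: Callable[[], Dict[str, str]],
    crash_after_step: Optional[int] = None,
) -> None:
    def checkpoint(step: int) -> None:
        if crash_after_step == step:
            raise SimulatedCrash(f"crashed after step {step}")

    snapshot = fetch_snapshot()                       # 1. fetch snapshot
    checkpoint(1)
    local = {p: hashlib.sha256(b).hexdigest() for p, b in state.files.items()}
    checkpoint(2)                                     # 2. local content hashed
    for path, digest in local.items():                # 3. preserve divergent bytes
        if snapshot.get(path) != digest:
            state.conflict_copies[digest] = state.files[path]
    checkpoint(3)
    state.index = dict(snapshot)                      # 4. rebuild index from snapshot
    checkpoint(4)
    state.pending_ops.clear()                         # 5. the one destructive step, last


server_snapshot = {"a.txt": hashlib.sha256(b"server").hexdigest()}
state = LocalState(files={"a.txt": b"local edit"}, pending_ops=["ModifyFile a.txt"])
try:
    panic_resync(state, lambda: server_snapshot, crash_after_step=3)
except SimulatedCrash:
    pass
ops_after_crash = list(state.pending_ops)  # still intact after the crash

panic_resync(state, lambda: server_snapshot)  # re-run from the top
```

After the re-run there is still exactly one conflict copy, because the second pass rewrites the same entry instead of adding another.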

Idempotency keys live 7–30 days. Is the client retry window guaranteed shorter than that retention, to prevent double-apply?

Honest status: the 7–30 day figure is design intent — key pruning is not implemented yet, so today keys never expire and the question can't bite. When pruning lands, the stated invariant will be retention > maximum client retry/offline window.

Two mitigating layers exist regardless: base_item_version preconditions reject a stale replay of any existing-item mutation, and sibling-name uniqueness rejects a replayed create. A very late replay is far more likely to get a clean conflict than a double-apply.
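To make the layering concrete, here is a small sketch of a replay arriving before and after its idempotency key has been pruned. The in-memory tables, `modify_file`, and the string outcomes are hypothetical; only the `base_item_version` precondition comes from the answer above.

```python
from typing import Dict

items: Dict[str, dict] = {"doc": {"version": 1, "content_hash": "h1"}}
idempotency_keys: Dict[str, str] = {}  # key -> recorded outcome


def modify_file(key: str, item_id: str, base_item_version: int, content_hash: str) -> str:
    if key in idempotency_keys:
        # Replay inside the retention window: same answer, nothing re-applied.
        return idempotency_keys[key]
    item = items[item_id]
    if item["version"] != base_item_version:
        # Stale precondition: the item moved on since the client built this op.
        return "conflict"
    item["version"] += 1
    item["content_hash"] = content_hash
    idempotency_keys[key] = "applied"
    return "applied"


first = modify_file("k1", "doc", base_item_version=1, content_hash="h2")
retry = modify_file("k1", "doc", base_item_version=1, content_hash="h2")

idempotency_keys.clear()  # simulate key pruning
late_replay = modify_file("k1", "doc", base_item_version=1, content_hash="h2")
```

The late replay no longer finds its key, but its version precondition is stale, so it ends in a conflict rather than a second apply.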

Realtime, fanout & scale

One WebSocket per vault per device, or one socket per device multiplexed across all vaults?

Both exist today. The original surface is one socket per vault (/v1/vaults/{id}/events). A device-level stream (/v1/devices/stream) multiplexes every vault the device's groups reach over one socket, authenticated by the device credential. The device stream is the direction of travel; the per-vault socket remains for single-vault clients.

How does the wake hub resolve a vault's subscribers (architecture reference §19)? Where is this information stored?

Nowhere persistent — that's the point. Each server process keeps an in-memory map of vault_id → broadcast channel; a WebSocket subscribes on connect and the entry is reaped when the last receiver drops. Across processes, the committing server publishes one Postgres NOTIFY; every process's listener re-broadcasts to its local subscribers. An offline device subscribes to nothing and costs nothing; correctness never depends on a wake arriving, because catch-up is always log replay from the client-owned cursor.
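The per-process half of that design can be sketched as a plain map with reaping. This is a Python stand-in under assumptions: the class shape and callback-based receivers are hypothetical, and the cross-process NOTIFY hop is left out.

```python
from typing import Callable, Dict, List, Set

Receiver = Callable[[int], None]


class WakeHub:
    """In-memory vault_id -> receivers map; nothing is persisted."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Set[Receiver]] = {}

    def subscribe(self, vault_id: str, receiver: Receiver) -> Callable[[], None]:
        self._subscribers.setdefault(vault_id, set()).add(receiver)

        def unsubscribe() -> None:
            receivers = self._subscribers.get(vault_id)
            if receivers is None:
                return
            receivers.discard(receiver)
            if not receivers:
                # Last receiver dropped: reap the entry.
                del self._subscribers[vault_id]

        return unsubscribe

    def wake(self, vault_id: str, seq: int) -> int:
        """Deliver a wake hint to local subscribers; return how many were reached."""
        receivers = list(self._subscribers.get(vault_id, ()))
        for receiver in receivers:
            receiver(seq)
        return len(receivers)

    def tracked_vaults(self) -> List[str]:
        return sorted(self._subscribers)


hub = WakeHub()
seen_a: List[int] = []
seen_b: List[int] = []
drop_a = hub.subscribe("vault-1", seen_a.append)
drop_b = hub.subscribe("vault-1", seen_b.append)
reached = hub.wake("vault-1", 7)

drop_a()
drop_b()
reached_after_drop = hub.wake("vault-1", 8)  # nobody listening, nothing stored
```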

Open
How many concurrent connections have you tested to, and what's the fanout latency at that number?

We haven't load-tested at scale yet — no number we'd stand behind exists, and we'd rather say that than invent one. The fanout design keeps per-wake work small (a fixed-size hint, no payload, no per-device queue), so the expected first ceilings are connection count per process and the NOTIFY listener, not per-message cost. A load-testing pass is the planned source of real numbers.

Open
Where does Postgres LISTEN/NOTIFY start to hurt, and what's the plan past it?

Known ceilings: a single notification channel, an 8 KB payload cap (the wake payload is ~100 bytes, so headroom is real), notifications funneled through one listener connection per server process, and the queue sharing the database's resources. Because wakes are lossy-by-design hints with poll fallback, saturation degrades latency, not correctness.

The successor — sharded channels or an external pub-sub (Redis/NATS) behind the same WakeHub trait — is deliberately undecided until load tests say where the cliff actually is.

When a device reconnects with thousands of queued ops, is submission throttled, or can one device starve others?

The client submits sequentially — one op at a time, in creation order, stopping on first error — so a single device cannot flood the server with parallel writes. But there is no server-side fairness or rate limiting yet; a device replaying a huge queue contends like any other traffic. Per-device rate limiting is one of the operational reasons device identity exists, and is tracked as hardening work.
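The client-side discipline amounts to a loop like the one below. It is a sketch only: `drain` and the boolean `submit` callback are hypothetical names, and a real client would distinguish error kinds.

```python
from collections import deque
from typing import Callable, Deque, List


def drain(queue: Deque[str], submit: Callable[[str], bool]) -> int:
    """Submit ops one at a time in creation order; stop on the first failure."""
    sent = 0
    while queue:
        if not submit(queue[0]):
            break  # leave the failed op and everything after it queued
        queue.popleft()
        sent += 1
    return sent


pending: Deque[str] = deque(["op1", "op2", "op3"])
accepted: List[str] = []


def fake_submit(op: str) -> bool:
    # Hypothetical server stub that rejects op2.
    if op == "op2":
        return False
    accepted.append(op)
    return True


sent = drain(pending, fake_submit)
```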

Open
With a per-vault sequence number, how many vaults can a device subscribe to before performance degrades? (PowerSync declares 1000.)

No measured number yet. The per-vault costs on a device are a cursor, a local engine instance, and (on the original per-vault surface) a socket; the device stream (/v1/devices/stream) removes the socket multiplier. We expect the practical limit to come from local engine instances rather than the protocol, and the answer for task-scale fanout (hundreds of vaults) is part of the groups rollout validation.

Blobs & transfer

Two devices upload identical content at once. Both see the blob missing, both PUT. Since content_hash is the primary key, how is the collision handled: upsert, two-phase, or a 500?

Upsert. The metadata row is written with ON CONFLICT (content_hash) DO UPDATE, and the store write is idempotent because the hash is the key — both writers store identical bytes at the same address. Both PUTs return 200; nobody retries; the race is the happy path.
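The shape of that upsert can be tried out with SQLite, which has supported `ON CONFLICT ... DO UPDATE` since version 3.24. The column set, the `SET size = excluded.size` clause, SHA-256 as the hash, and the dict standing in for the blob store are all assumptions made for this sketch.

```python
import hashlib
import sqlite3
from typing import Dict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (content_hash TEXT PRIMARY KEY, size INTEGER NOT NULL)")
store: Dict[str, bytes] = {}  # stand-in for the content-addressed blob store


def put_blob(body: bytes) -> int:
    content_hash = hashlib.sha256(body).hexdigest()
    # Idempotent store write: identical bytes land at the same address.
    store[content_hash] = body
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO blobs (content_hash, size) VALUES (?, ?) "
            "ON CONFLICT (content_hash) DO UPDATE SET size = excluded.size",
            (content_hash, len(body)),
        )
    return 200


status_a = put_blob(b"same bytes")
status_b = put_blob(b"same bytes")  # the "racing" second writer
row_count = conn.execute("SELECT COUNT(*) FROM blobs").fetchone()[0]
```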

Are transfers chunked, or is each blob a single monolithic PUT/GET? What's the practical ceiling?

Single PUT/GET, with a hard server-enforced 50 MB cap: the ceiling is explicit rather than emergent. Chunked transfer, resumable upload, and content-defined-chunking dedup are consciously post-beta: they change transfer economics, not the protocol shape, which is why whole-file content addressing ships first. Raising the cap means changing one constant; chunking is the scheduled follow-up when large files enter scope.

Is partially transferred content checksummed before it's committed, so a truncated transfer can't be mistaken for a complete one?

On upload, fully: the client hashes the staged bytes before sending, the server independently re-hashes the body and rejects any mismatch with the addressed hash, and a mutation only commits if its blob row already exists — a truncated upload can never become visible metadata.

On download, the server validates stored size and the client writes temp-then-atomic-rename, so a torn write never surfaces. Client-side re-hashing of downloaded bytes against the content hash is a flagged hardening item — tracked, not yet implemented.
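Both halves can be sketched in a few lines: a server-side re-hash that rejects a truncated body, and a client-side write that goes to a temp file before an atomic rename. The function names are hypothetical and SHA-256 is an assumption; the source does not name the hash.

```python
import hashlib
import os
import tempfile
from typing import Dict


class HashMismatch(Exception):
    """The uploaded body does not hash to the address it was PUT to."""


def accept_upload(addressed_hash: str, body: bytes, store: Dict[str, bytes]) -> None:
    actual = hashlib.sha256(body).hexdigest()
    if actual != addressed_hash:
        raise HashMismatch(f"addressed {addressed_hash[:8]}, got {actual[:8]}")
    store[addressed_hash] = body


def write_download(dest_path: str, body: bytes) -> None:
    """Write to a temp file in the same directory, then rename atomically."""
    directory = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(body)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, dest_path)  # readers see old content or new, never a torn file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


blob_store: Dict[str, bytes] = {}
content = b"complete file contents"
address = hashlib.sha256(content).hexdigest()

accept_upload(address, content, blob_store)
truncated_rejected = False
try:
    accept_upload(address, content[:5], blob_store)  # truncated transfer
except HashMismatch:
    truncated_rejected = True

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "file.bin")
    write_download(target, blob_store[address])
    with open(target, "rb") as fh:
        downloaded = fh.read()
    leftovers = os.listdir(workdir)
```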

Identity, access & tokens

Why is there no notion of Identity/Subject in Drive? Shouldn't there be a first-class UserID/BotID, plus groups to bundle identities?

The absence of users is a deliberate boundary; the rest of the proposal is the ratified design. TinySync's subject is the device — a (user × machine) session — because idempotency, echo suppression, surgical revocation, and abuse attribution all need a device-grained identity, while user semantics differ per ecosystem and live in the Platform Service. Groups are exactly the bundling layer proposed: authorization is granted to groups, devices are members, and TinySync treats group meaning as opaque. See the group model — adopted 2026-06-10 and implemented; the schema is migration 0004.

Without a "user", how is vault access granted before a user has logged in on any device? Wouldn't tasket have to grant every vault to every device at each login?

No — that's precisely what groups decouple. Vault grants are group ↔ vault edges, created whenever the product decides (before any device exists). When a user activates Drive on a machine, the Platform Service registers one device and adds it to the user's groups — a single membership operation, after which every current and future vault edge of those groups applies automatically. Adding a vault never touches devices; adding a device never touches vaults.

A team vault plus 500 task vaults would mean 501 short-lived tokens per device per user. How is that avoided?

By not issuing per-vault tokens. Under the adopted model a device holds exactly one long-lived credential — one token per (user × physical device) — and vault reach is resolved at request time from group membership. The 501-token scenario becomes 1 token plus membership edges: n devices + m vaults + sparse edges, instead of n×m credentials.

Shouldn't the client device report its cursor and sync forward, instead of the server storing a per-device cursor?

It already works exactly that way — the server never had a per-device cursor. The change log is keyed (vault_id, seq) and nothing else; clients call GET /log?after=N with a cursor they own, and the server answers and forgets. Any document suggesting server-side cursors was describing a discarded sketch; the decision record states the ground truth.
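A client-owned cursor loop might look roughly like this. The in-memory log, `get_log`, and its `limit` parameter are hypothetical stand-ins for the real `GET /log?after=N` endpoint, whose paging details are not described here.

```python
from typing import Callable, List, Tuple

Event = Tuple[int, str]  # (seq, payload)

SERVER_LOG: List[Event] = [
    (1, "CreateFolder"),
    (2, "CreateFile"),
    (3, "ModifyFile"),
    (4, "MoveRename"),
    (5, "Delete"),
]


def get_log(after: int, limit: int = 2) -> List[Event]:
    """Stateless stand-in for GET /log?after=N: answers and forgets."""
    return [event for event in SERVER_LOG if event[0] > after][:limit]


def catch_up(cursor: int, apply: Callable[[str], None]) -> int:
    """Page forward from the client's own cursor until the log is exhausted."""
    while True:
        page = get_log(after=cursor)
        if not page:
            return cursor
        for seq, payload in page:
            apply(payload)
            cursor = seq


applied: List[str] = []
new_cursor = catch_up(2, applied.append)  # the device had already processed seq 2
```

The server side holds no per-device state in this sketch; the only thing that advances is the cursor the client passes in.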

Could we issue a single token per (Identity, DeviceID) per session, with subscriptions resolved from replicated membership data?

Yes, that is the adopted direction, almost verbatim: token subject = device, one per (user × device) session; authorization = group membership resolved at request time; the Platform Service owns the user → devices mapping and keeps memberships current. The schema migration for this model has landed.

What is the status of the solutioning and design around "groups"?

Ratified. The model — opaque device groups as the unit of authorization, vault access as group ↔ vault edges, union semantics across memberships — is specified in Groups & Access, and the identity decision underpinning it was adopted 2026-06-10. Implemented: migration 0004 plus the group-management APIs (PUT /v1/groups/{id}, edge PUT/DELETEs, GET /v1/groups/{id}). Build your integration design against the group model now — it is the committed shape, not a proposal under debate.

Access is all-or-nothing per vault today. Is per-folder or per-item access an additive layer or a core redesign?

Additive, by design. Groups stay the vault-level gate; containers (tagged views inside a vault carrying namespace and policy) are the designed hook for finer-grained rules, though container-level ACLs are not yet specified. Per-item sharing already exists in one form: public links target an item_id and are time-limited and revocable. Finer-grained access is a layer on top, not a rewrite of the authorization chain.

Platform adoption

What are the API shapes — create vault, grant access, upload changes, the rest?

The live surface:

  • Vault lifecycle: POST /v1/vaults
  • Access grant: PUT /v1/groups/{gid}/vaults/{vid} + PUT /v1/groups/{gid}/devices/{did} (admin)
  • Content: PUT/GET /v1/vaults/{id}/blobs/{hash}
  • Metadata changes: POST /v1/vaults/{id}/mutations (typed: CreateFile, CreateFolder, ModifyFile, MoveRename, Delete)
  • Catch-up & wake: GET /v1/vaults/{id}/snapshot · GET /v1/vaults/{id}/log?after=N · GET /v1/vaults/{id}/history · GET /v1/vaults/{id}/events (WebSocket)
  • Device management: GET /v1/vaults/{id}/devices
  • Sharing: POST/GET /v1/vaults/{id}/public-file-links · …/revoke · GET /p/{token}
  • Session layer: POST /v1/devices (registration) · GET /v1/devices · POST /v1/devices/{id}/revoke (admin, or the device revoking itself) · GET /v1/devices/me/vaults · GET /v1/devices/stream (multiplexed wake stream)
Answered
Where do I read about the sync architecture, the fanout approach, scalability, the vault-to-device mapping, and metadata vs blob syncing?

The Understand journey is written for exactly this: ch.2 (engine), ch.3 (the full protocol as one story — fanout included), ch.4 (data model, metadata/blob split). The vault → device mapping is the group_devices/group_vaults edge tables. Wake subscriptions are ephemeral (see the wake-hub entry above). Scalability honesty: see the Open entries in "Realtime, fanout & scale".

Deferred
What are the delta-sync (chunking) semantics?

There are none yet, on purpose — transfers are whole-file with a 50 MB cap, and chunking/delta is post-beta scope. See "Blobs & transfer" above for the full answer.

How do you plan to support multi-account?

The identity model already carries it: a device is a (user × machine) session, so two users — or two accounts — on one machine are simply two device identities. What's deferred is the client-side work: storing multiple credentials, routing each vault's engine to the right identity, and the setup UX. It's cut from v1 scope, with the model hook in place.

How do you sync and show vault metadata in a web app (like the tasket drive page)?

Through the Platform, not by teaching browsers the sync protocol. A web session authenticates against the Platform Service, which proxies TinySync's tree and history APIs (snapshot, log, item metadata) read-through — names, sizes, structure render without downloading content. Whether file bytes are then fetched directly from TinySync's blob endpoint or proxied through the Platform (for per-file permission enforcement) is a flagged open sub-question.

How do you provide multi-tenancy, and how do you model for containers?

Tenancy is composed from three orthogonal pieces: vaults are storage namespaces, groups gate who reaches them, and containers carve product namespaces and policy inside a shared vault. TinySync never interprets what a tenant means — group and container semantics are the Platform Service's to define, which is what lets one TinySync serve multiple products and orgs without embedding any of their business logic.

How do we sync vaults on orphan tasks, or tasks where a non-member participant needs access without team-vault access?

Both are group-mapping patterns, not new machinery. Per-task vault + per-task participant group: a participant's device joins the task group and reaches that vault — whether or not they're in the team group, and whether or not the task has a team at all. An orphan task is just a vault whose group contains its participants. Access is the union of group memberships, so the non-member sees the task vault and nothing else.
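The union rule is easy to see with two edge sets. The table names `group_devices` and `group_vaults` come from this FAQ; the rows, IDs, and `reachable_vaults` are made up for illustration.

```python
from typing import Set, Tuple

# (group_id, device_id) and (group_id, vault_id) edges, as plain sets.
group_devices: Set[Tuple[str, str]] = {
    ("team-42", "dev-alice"),
    ("task-7", "dev-alice"),
    ("task-7", "dev-bob"),      # Bob participates in the task but is not on the team
}
group_vaults: Set[Tuple[str, str]] = {
    ("team-42", "vault-team"),
    ("task-7", "vault-task-7"),
}


def reachable_vaults(device_id: str) -> Set[str]:
    """A device reaches the union of vaults across all groups it belongs to."""
    groups = {group for group, device in group_devices if device == device_id}
    return {vault for group, vault in group_vaults if group in groups}


alice_vaults = reachable_vaults("dev-alice")
bob_vaults = reachable_vaults("dev-bob")
stranger_vaults = reachable_vaults("dev-unknown")
```

Bob's device reaches the task vault and nothing else, with no per-device grant on either vault.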

Answered
Can we get sequence diagrams for user stories of Drive integrated in Neo?

The integration lifecycle diagrams — device registration, group assignment, active sync, revocation, re-registration — are in Integration Lifecycle, and the end-to-end sync flows are in A File's Journey. Product-specific story diagrams (e.g. "attach a file to a task") are a docs request we're happy to take — tell us which stories matter first.

Positioning

Answered · More: Decide § Scribe vs Drive (the same embeddability argument, internally)
Why should Neo use TinySync over alternatives like Seafile?

Because they solve different problems. Seafile is a complete, self-hosted file product: its own user accounts, sharing model, web UI, and clients. Adopting it inside Neo means embedding a second identity system and a second product surface, then bridging both forever.

TinySync is a sync substrate built to be embedded: it has no user concept to collide with Neo's — identity stays in the Platform Service — access maps to Neo's own teams and tasks via opaque groups, and the end-user surface is the OS itself (placeholders in Finder/Explorer, on-demand hydration), not a separate app. Cloud-authoritative ordering with deterministic conflict copies, a Rust engine, and an API small enough to audit complete the case.

Honest trade-off in the other direction: Seafile ships block-level delta sync today; TinySync defers chunking past beta. If raw large-file transfer efficiency is the deciding criterion right now, that's Seafile's column.