Questions from integrating teams

FAQ

Every question asked by teams evaluating TinySync, answered honestly. Each entry carries a status: Answered is settled in design or code, Deferred was consciously punted, and Open is genuinely undecided.

Tree & mutation semantics

Answered. More: A File's Journey § What the server validates

How do you prevent a move from creating a cycle? If A moves X into Y while B moves Y into X, both pass their version checks — is there an ancestry check at commit, inside the transaction?

Yes — inside the transaction. Every MoveRename is validated in the same Postgres transaction that commits it: the server rejects the move if the destination parent is the item itself or any of its descendants, via a recursive-CTE ancestry walk (is_descendant). The rejection surfaces as a normal mutation conflict, not a 500.

The scenario in the question, two cross-moves racing in separate transactions, is sharper: each ancestry check reads pre-race state, so both moves can pass validation. A serialization hardening (taking the per-vault lock before validation rather than after) is tracked for exactly this case.
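The ancestry rule itself is small enough to sketch outside SQL. The following is a minimal illustration, not the server's code: the real check is described above as a recursive CTE inside the commit transaction, while this version walks a hypothetical in-memory parent map. The names `Parents` and `validate_move` are assumptions made for the example; only `is_descendant` appears in the answer above.

```python
from typing import Dict, Optional

# Hypothetical in-memory tree: item_id -> parent_id (None marks the vault root).
Parents = Dict[str, Optional[str]]


def is_descendant(parents: Parents, candidate: str, ancestor: str) -> bool:
    """Return True if `candidate` is `ancestor` itself or sits anywhere below it."""
    node: Optional[str] = candidate
    while node is not None:
        if node == ancestor:
            return True
        node = parents.get(node)  # step one level up towards the root
    return False


def validate_move(parents: Parents, item: str, new_parent: str) -> None:
    """Reject a move whose destination is the item itself or one of its descendants."""
    if is_descendant(parents, new_parent, item):
        # Surfaced as an ordinary conflict, mirroring "not a 500" in the text.
        raise ValueError(f"conflict: moving {item} under {new_parent} creates a cycle")


tree: Parents = {"root": None, "X": "root", "Y": "root"}
validate_move(tree, "X", "Y")  # allowed: Y is not below X
tree["X"] = "Y"                # apply the move
# validate_move(tree, "Y", "X") would now raise, because X sits below Y.
```

The sketch only shows the single-transaction rule. It does not model the racing case, where two validations each read the tree before either move is applied.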

Answered. More: Truth & Conflict § Folder subtree semantics

Is DeleteSubtree atomic from a reader's point of view, or can a local process observe it half-deleted?

Two different readers, two answers. Against the server, it is atomic: one transaction tombstones the folder and every descendant and appends a single DeleteSubtree event — no API reader can see a half-deleted subtree, and the log carries one event, not one per descendant.

On a device's local filesystem, application is necessarily progressive — files are removed one at a time, so a local process can observe the subtree mid-removal. Dirty local files are preserved as conflict copies before any destructive apply.

Answered. More: Groups & Access

Do you support hierarchy in vaults?

Inside a vault: full folder hierarchy, with moves and renames as first-class single operations. Vaults themselves do not nest — a vault is the unit of access control (group ↔ vault edges). The shared-task use case (a non-member who needs one task's files but not the team drive) is modelled as a per-task vault plus a task-participant group, so access stays additive.

Ordering, the log & recovery

Answered. More: Truth & Conflict § The server's ledger

How is seq assigned: Postgres sequence, max+1, or advisory lock? Monotonic with no gaps under concurrent commits?

None of the three. The commit transaction increments a per-vault counter — UPDATE vaults SET latest_seq = latest_seq + 1 … RETURNING — and the row lock is held until commit. That serializes assignment per vault, and because the increment lives in the same transaction as the change-log insert, an abort rolls both back together.

So: monotonic and gapless per vault, by construction. A Postgres sequence would leak gaps on abort; max+1 races; advisory locks add a second locking domain for nothing.
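A runnable sketch of why the counter stays gapless under aborts. It uses SQLite as a stand-in for Postgres, so the locking differs (SQLite takes a database-wide write lock where Postgres would lock the vault row) and the `RETURNING` clause is replaced by a follow-up `SELECT`. The `change_log` table layout and the `fail` flag are assumptions made for the demonstration.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """In-memory SQLite stand-in; autocommit mode so BEGIN/COMMIT are explicit."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE vaults (id TEXT PRIMARY KEY, latest_seq INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE change_log (vault_id TEXT, seq INTEGER, event TEXT, "
        "PRIMARY KEY (vault_id, seq))"
    )
    conn.execute("INSERT INTO vaults VALUES ('v1', 0)")
    return conn


def commit_event(conn: sqlite3.Connection, vault_id: str, event: str, fail: bool = False) -> int:
    """Increment the vault counter and append the log row in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before touching the counter
    try:
        conn.execute("UPDATE vaults SET latest_seq = latest_seq + 1 WHERE id = ?", (vault_id,))
        (seq,) = conn.execute(
            "SELECT latest_seq FROM vaults WHERE id = ?", (vault_id,)
        ).fetchone()
        if fail:
            raise RuntimeError("simulated abort after the increment")
        conn.execute("INSERT INTO change_log VALUES (?, ?, ?)", (vault_id, seq, event))
        conn.execute("COMMIT")
        return seq
    except Exception:
        conn.execute("ROLLBACK")  # undoes the increment together with the insert
        raise


db = open_db()
first = commit_event(db, "v1", "CreateFile a.txt")
```

An aborted call leaves `latest_seq` where it was, so the next successful commit reuses the number the aborted one briefly held.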

Answered. More: Inside the Engine § Lifecycle

A seq gap is "an error, not a skip." Error how — halt, retry, or panic-resync? Is there a recovery path?

Halt first, then retry, then escalate. Replay applies events strictly sequentially; a gap raises LogGap and stops applying at the cursor — nothing after the gap is touched. The next sync re-fetches from last_seq_processed, which clears transient gaps (a page boundary, a racing read). If local state is ever genuinely unreconcilable, the engine marks itself for panic-resync: snapshot, re-hash, preserve divergent bytes as conflict copies, rebuild.
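The halt-then-retry behaviour can be sketched as follows. This is an illustration under assumptions, not the engine's code: the `Replayer` class and `Event` shape are hypothetical, and only `LogGap` and `last_seq_processed` are names taken from the answer above.

```python
from dataclasses import dataclass
from typing import Iterable, List


class LogGap(Exception):
    """Raised when the next event is not exactly last_seq_processed + 1."""


@dataclass
class Event:
    seq: int
    payload: str


class Replayer:
    def __init__(self, last_seq_processed: int = 0) -> None:
        self.last_seq_processed = last_seq_processed
        self.applied: List[str] = []

    def apply_page(self, events: Iterable[Event]) -> None:
        """Apply events strictly in order; stop at the cursor on the first gap."""
        for event in events:
            expected = self.last_seq_processed + 1
            if event.seq != expected:
                # Nothing at or after the gap is applied; the cursor stays put.
                raise LogGap(f"expected seq {expected}, got {event.seq}")
            self.applied.append(event.payload)
            self.last_seq_processed = event.seq


engine = Replayer()
try:
    engine.apply_page([Event(1, "a"), Event(2, "b"), Event(4, "d")])
except LogGap:
    pass  # halt; the next sync re-fetches from engine.last_seq_processed
engine.apply_page([Event(3, "c"), Event(4, "d")])  # the re-fetch fills the gap
```

Escalation to panic-resync is left out of the sketch; it would sit above this loop, triggered when repeated re-fetches cannot close the gap.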

Answered. More: Truth & Conflict § Invariants

PanicResync is "safe to re-run" after a mid-resync crash — does it resume hashing where it left off, or start over?

It starts over. Safety comes from ordering, not checkpointing: fetch snapshot → enumerate and hash local content → preserve local-only or mismatched bytes as conflict copies → rebuild the local index from the snapshot → only then discard the pending-op queue. Every step is idempotent, and the one destructive step (discarding pending ops) happens last — so a crash at any point re-runs from the top without losing data.

Open. More: Truth & Conflict § Retention

Idempotency keys live 7–30 days. Is the client retry window guaranteed shorter than that retention, to prevent double-apply?

Honest status: the 7–30 day figure is design intent — key pruning is not implemented yet, so today keys never expire and the question can't bite. When pruning lands, the stated invariant will be retention > maximum client retry/offline window.

Two mitigating layers exist regardless: base_item_version preconditions reject a stale replay of any existing-item mutation, and sibling-name uniqueness rejects a replayed create. A very late replay is far more likely to get a clean conflict than a double-apply.
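A sketch of the two mitigating layers, modelling the situation after an idempotency key has been pruned, so the server has no memory of the first attempt. Everything here is a hypothetical in-memory model: `VaultState`, `Item`, and `Conflict` are names invented for the example, and only `base_item_version` comes from the answer above.

```python
from dataclasses import dataclass
from typing import Dict


class Conflict(Exception):
    """A clean, expected rejection rather than a server error."""


@dataclass
class Item:
    parent_id: str
    name: str
    version: int = 1


class VaultState:
    def __init__(self) -> None:
        self.items: Dict[str, Item] = {}  # item_id -> Item

    def create(self, item_id: str, parent_id: str, name: str) -> None:
        """Sibling-name uniqueness: a replayed create collides with its first run."""
        for existing in self.items.values():
            if existing.parent_id == parent_id and existing.name == name:
                raise Conflict(f"name {name!r} already exists under {parent_id}")
        self.items[item_id] = Item(parent_id, name)

    def rename(self, item_id: str, base_item_version: int, new_name: str) -> int:
        """Version precondition: a stale replay no longer matches the item."""
        item = self.items[item_id]
        if item.version != base_item_version:
            raise Conflict(f"stale base version {base_item_version}, item is at {item.version}")
        item.name = new_name
        item.version += 1
        return item.version


vault = VaultState()
vault.create("i1", "root", "notes.txt")
vault.rename("i1", base_item_version=1, new_name="notes-v2.txt")
# Replaying either call now raises Conflict instead of applying twice.
```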

Realtime, fanout & scale

Answered. More: Decide § Device Identity & Cardinality

One WebSocket per vault per device, or one socket per device multiplexed across all vaults?

Both exist today. The original surface is one socket per vault (/v1/vaults/{id}/events). A device-level stream (/v1/devices/stream) multiplexes every vault the device's groups reach over one socket, authenticated by the device credential. The device stream is the direction of travel; the per-vault socket remains for single-vault clients.

Answered. More: Architecture reference §19

How does the wake hub resolve a vault's subscribers (architecture reference §19)? Where is this information stored?

Nowhere persistent — that's the point. Each server process keeps an in-memory map of vault_id → broadcast channel; a WebSocket subscribes on connect and the entry is reaped when the last receiver drops. Across processes, the committing server publishes one Postgres NOTIFY; every process's listener re-broadcasts to its local subscribers. An offline device subscribes to nothing and costs nothing; correctness never depends on a wake arriving, because catch-up is always log replay from the client-owned cursor.
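The in-process half of this can be sketched in a few lines. This is a simplified, single-threaded model: the real hub is described as a map of broadcast channels fed by a Postgres NOTIFY listener, whereas here a receiver is a plain callback and `on_notify` is called by hand. The method names are assumptions; only the `WakeHub` name appears elsewhere in this FAQ.

```python
from typing import Callable, Dict, Set

Receiver = Callable[[str], None]


class WakeHub:
    def __init__(self) -> None:
        # vault_id -> live receivers; exists only in this process's memory.
        self._channels: Dict[str, Set[Receiver]] = {}

    def subscribe(self, vault_id: str, receiver: Receiver) -> Callable[[], None]:
        """Register a receiver (on socket connect) and return its unsubscribe hook."""
        self._channels.setdefault(vault_id, set()).add(receiver)

        def unsubscribe() -> None:
            receivers = self._channels.get(vault_id)
            if receivers is None:
                return
            receivers.discard(receiver)
            if not receivers:
                del self._channels[vault_id]  # reap the entry with its last receiver

        return unsubscribe

    def on_notify(self, vault_id: str) -> int:
        """Re-broadcast one cross-process notification to local subscribers."""
        receivers = list(self._channels.get(vault_id, ()))
        for receiver in receivers:
            receiver(vault_id)
        return len(receivers)


hub = WakeHub()
woken = []
leave = hub.subscribe("v1", lambda vault_id: woken.append(vault_id))
hub.on_notify("v1")  # the connected device is nudged to fetch the log
leave()              # socket closed: the vault's entry disappears
hub.on_notify("v1")  # no subscribers, no work, no error
```

A missed wake costs nothing but latency in this model, which matches the text: the device still catches up from its own cursor on the next sync.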

Open

How many concurrent connections have you tested to, and what's the fanout latency at that number?

We haven't load-tested at scale yet — no number we'd stand behind exists, and we'd rather say that than invent one. The fanout design keeps per-wake work small (a fixed-size hint, no payload, no per-device queue), so the expected first ceilings are connection count per process and the NOTIFY listener, not per-message cost. A load-testing pass is the planned source of real numbers.

Open

Where does Postgres LISTEN/NOTIFY start to hurt, and what's the plan past it?

Known ceilings: a single notification channel, a payload cap of just under 8,000 bytes in Postgres's default configuration (the wake payload is ~100 bytes, so headroom is real), notifications funneled through one listener connection per server process, and the queue sharing the database's resources. Because wakes are lossy-by-design hints with poll fallback, saturation degrades latency, not correctness.

The successor — sharded channels or an external pub-sub (Redis/NATS) behind the same WakeHub trait — is deliberately undecided until load tests say where the cliff actually is.

Open. More: Decide § Attribution and abuse control

When a device reconnects with thousands of queued ops, is submission throttled, or can one device starve others?

The client submits sequentially — one op at a time, in creation order, stopping on first error — so a single device cannot flood the server with parallel writes. But there is no server-side fairness or rate limiting yet; a device replaying a huge queue contends like any other traffic. Per-device rate limiting is one of the operational reasons device identity exists, and is tracked as hardening work.
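The client-side discipline described above amounts to a short loop. A minimal sketch, assuming a hypothetical `submit` callable that returns True when the server accepts an op:

```python
from collections import deque
from typing import Callable, Deque


def drain(pending: Deque[str], submit: Callable[[str], bool]) -> int:
    """Submit queued ops one at a time in creation order, stopping on first error."""
    sent = 0
    while pending:
        op = pending[0]     # oldest op first
        if not submit(op):  # first failure ends this run; later ops are not tried
            break
        pending.popleft()   # drop the op only once the server has accepted it
        sent += 1
    return sent


queue: Deque[str] = deque(["create a", "modify a", "move a"])
accepted = drain(queue, lambda op: op != "move a")
# "move a" stays at the head of the queue for the next attempt.
```

Because at most one request is in flight per device, the load a reconnecting device adds is bounded by round-trip time, not by queue length. Fairness across devices is still the server's job, as noted above.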

Open

With a per-vault sequence number, how many vaults can a device subscribe to before performance degrades? (PowerSync declares 1000.)

No measured number yet. The per-vault costs on a device are a cursor, a local engine instance, and (on the per-vault surface) a socket; the device-level stream (/v1/devices/stream) removes the socket multiplier. We expect the practical limit to come from local engine instances rather than the protocol, and the answer for task-scale fanout (hundreds of vaults) is part of the groups rollout validation.

Blobs & transfer

Answered. More: A File's Journey § Blob first, always

Two devices upload identical content at once. Both see the blob missing, both PUT. Since content_hash is the primary key, how is the collision handled: upsert, two-phase, or a 500?

Upsert. The metadata row is written with ON CONFLICT (content_hash) DO UPDATE, and the store write is idempotent because the hash is the key — both writers store identical bytes at the same address. Both PUTs return 200; nobody retries; the race is the happy path.
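Why the race is harmless follows from content addressing, which a small model makes concrete. This is a sketch under assumptions: the FAQ does not name the hash function, so SHA-256 is used purely for illustration, and `BlobStore` is a hypothetical in-memory stand-in for the metadata row plus the object store. The re-hash of the body mirrors the server-side check described elsewhere in this FAQ.

```python
import hashlib
from typing import Dict


class HashMismatch(Exception):
    """The body does not hash to the address it was PUT at."""


class BlobStore:
    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}  # content_hash -> bytes

    def put(self, content_hash: str, body: bytes) -> int:
        """Idempotent PUT: identical bytes at the same address, however many writers."""
        # SHA-256 is an assumption made for this example.
        if hashlib.sha256(body).hexdigest() != content_hash:
            raise HashMismatch(content_hash)
        self.blobs[content_hash] = body  # upsert: a second write changes nothing
        return 200


store = BlobStore()
data = b"same bytes from two devices"
address = hashlib.sha256(data).hexdigest()
status_a = store.put(address, data)  # device A
status_b = store.put(address, data)  # device B, racing with A
```

Both calls succeed and the store holds one copy, so neither device has anything to retry.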

Deferred. More: The Shape § Scope honesty

Are transfers chunked, or is each blob a single monolithic PUT/GET? What's the practical ceiling?

Single PUT/GET, with a hard server-enforced 50 MB cap — the ceiling is explicit rather than emergent. Chunked transfer, resumable upload, and content-defined-chunking dedup are consciously post-beta: they change transfer economics, not the protocol shape, which is why whole-file content addressing ships first. Raising the cap is a constant; chunking is the scheduled follow-up when large files enter scope.

Answered. More: Truth & Conflict § Invariants

Is partially transferred content checksummed before it's committed, so a truncated transfer can't be mistaken for a complete one?

On upload, fully: the client hashes the staged bytes before sending, the server independently re-hashes the body and rejects any mismatch with the addressed hash, and a mutation only commits if its blob row already exists — a truncated upload can never become visible metadata.

On download, the server validates stored size and the client writes temp-then-atomic-rename, so a torn write never surfaces. Client-side re-hashing of downloaded bytes against the content hash is a flagged hardening item — tracked, not yet implemented.
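The temp-then-atomic-rename pattern on the download side looks roughly like this. A minimal sketch, not the client's code: the function name and the `.partial-` prefix are invented, and the client-side size check is an added assumption (the text above only says the server validates stored size). Re-hashing the bytes is deliberately left out, since that is the hardening item not yet implemented.

```python
import os
import tempfile


def write_atomically(dest_path: str, data: bytes, expected_size: int) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    if len(data) != expected_size:
        raise OSError(f"truncated download: got {len(data)} of {expected_size} bytes")
    directory = os.path.dirname(os.path.abspath(dest_path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())       # bytes reach disk before the rename
        os.replace(tmp_path, dest_path)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)          # never leave a partial file behind
        raise
```

A crash before `os.replace` leaves the previous file untouched, which is why a torn write never surfaces at the destination path.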

Identity, access & tokens

Answered. More: Decide § Device Identity · The Six Elements

Why is there no notion of Identity/Subject in Drive? Shouldn't there be a first-class UserID/BotID, plus groups to bundle identities?

The absence of users is a deliberate boundary; the rest of the proposal is the ratified design. TinySync's subject is the device — a (user × machine) session — because idempotency, echo suppression, surgical revocation, and abuse attribution all need a device-grained identity, while user semantics differ per ecosystem and live in the Platform Service. Groups are exactly the bundling layer proposed: authorization is granted to groups, devices are members, and TinySync treats group meaning as opaque. See the group model — adopted 2026-06-10 and implemented; the schema is migration 0004.

Answered. More: Groups & Access § Key invariant

Without a "user", how is vault access granted before a user has logged in on any device? Wouldn't tasket have to grant every vault to every device at each login?

No — that's precisely what groups decouple. Vault grants are group ↔ vault edges, created whenever the product decides (before any device exists). When a user activates Drive on a machine, the Platform Service registers one device and adds it to the user's groups — a single membership operation, after which every current and future vault edge of those groups applies automatically. Adding a vault never touches devices; adding a device never touches vaults.
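The decoupling is easiest to see as two edge sets and a join. A sketch with hypothetical in-memory data: the set names echo the group_devices and group_vaults edge tables mentioned later in this FAQ, but the identifiers and the `reachable_vaults` helper are invented for the example.

```python
from typing import Set, Tuple

# (group_id, device_id) and (group_id, vault_id) edges; sample data only.
group_devices: Set[Tuple[str, str]] = {("team", "laptop-1")}
group_vaults: Set[Tuple[str, str]] = {("team", "team-vault")}


def reachable_vaults(device_id: str) -> Set[str]:
    """Resolve vault reach at request time: the union over the device's groups."""
    groups = {g for (g, d) in group_devices if d == device_id}
    return {v for (g, v) in group_vaults if g in groups}


# Adding a vault touches only a group-to-vault edge, never a device.
group_vaults.add(("team", "task-7-vault"))

# Adding a device is one membership edge; it inherits every vault edge at once.
group_devices.add(("team", "phone-1"))
```

With n devices and m vaults the stored state is the edges, and nothing is recomputed or re-granted when either side grows.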

Answered. More: Decide § The decision

A team vault plus 500 task vaults would mean 501 short-lived tokens per device per user. How is that avoided?

By not issuing per-vault tokens. Under the adopted model a device holds exactly one long-lived credential — one token per (user × physical device) — and vault reach is resolved at request time from group membership. The 501-token scenario becomes 1 token plus membership edges: n devices + m vaults + sparse edges, instead of n×m credentials.

Answered. More: Decide § Dumb server, smart cursor

Shouldn't the client device report its cursor and sync forward, instead of the server storing a per-device cursor?

It already works exactly that way — the server never had a per-device cursor. The change log is keyed (vault_id, seq) and nothing else; clients call GET /log?after=N with a cursor they own, and the server answers and forgets. Any document suggesting server-side cursors was describing a discarded sketch; the decision record states the ground truth.
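A compact model of "the server answers and forgets". This is a sketch, not the service: `ChangeLog` and `Client` are hypothetical classes, `read_after` stands in for GET /log?after=N, and the `limit` paging parameter is an assumption.

```python
from typing import Dict, List, Tuple


class ChangeLog:
    """Server side: rows keyed by (vault_id, seq); no per-device state anywhere."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, int], str] = {}

    def append(self, vault_id: str, seq: int, event: str) -> None:
        self._rows[(vault_id, seq)] = event

    def read_after(self, vault_id: str, after: int, limit: int = 100) -> List[Tuple[int, str]]:
        """Return up to `limit` events with seq > after, in order; remembers nothing."""
        seqs = sorted(s for (v, s) in self._rows if v == vault_id and s > after)
        return [(s, self._rows[(vault_id, s)]) for s in seqs[:limit]]


class Client:
    """Client side: the cursor lives here and nowhere else."""

    def __init__(self, cursor: int = 0) -> None:
        self.cursor = cursor
        self.seen: List[str] = []

    def catch_up(self, log: ChangeLog, vault_id: str, page_size: int = 2) -> None:
        while True:
            page = log.read_after(vault_id, self.cursor, page_size)
            if not page:
                return
            for seq, event in page:
                self.seen.append(event)
                self.cursor = seq  # advance only after the event is applied


log = ChangeLog()
for n in range(1, 6):
    log.append("v1", n, f"event-{n}")
fresh, returning = Client(), Client(cursor=3)
fresh.catch_up(log, "v1")      # replays events 1 through 5
returning.catch_up(log, "v1")  # replays only events 4 and 5
```

Two clients at different cursors read the same rows without the server tracking either of them.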

Answered. More: Decide § Consequences · The Platform Service

Could we issue a single token per (Identity, DeviceID) per session, with subscriptions resolved from replicated membership data?

Yes, that is the adopted direction, almost verbatim: token subject = device, one per (user × device) session; authorization = group membership resolved at request time; the Platform Service owns the user → devices mapping and keeps memberships current. The schema migration for this has landed.

Answered. More: Decide § Consequences

What is the status of the solutioning and design around "groups"?

Ratified. The model — opaque device groups as the unit of authorization, vault access as group ↔ vault edges, union semantics across memberships — is specified in Groups & Access, and the identity decision underpinning it was adopted 2026-06-10. Implemented: migration 0004 plus the group-management APIs (PUT /v1/groups/{id}, edge PUT/DELETEs, GET /v1/groups/{id}). Build your integration design against the group model now — it is the committed shape, not a proposal under debate.

Deferred. More: Groups & Access § Containers

Access is all-or-nothing per vault today. Is per-folder or per-item access an additive layer or a core redesign?

Additive, by design. Groups stay the vault-level gate; containers (tagged views inside a vault carrying namespace and policy) are the designed hook for finer-grained rules, though container-level ACLs are not yet specified. Per-item sharing already exists in one form: public links target an item_id and are time-limited and revocable. Finer-grained access is a layer on top of the authorization chain, not a rewrite of it.

Platform adoption

Answered. More: Architecture reference §15

What are the API shapes — create vault, grant access, upload changes, the rest?

The live surface:

Vault lifecycle — POST /v1/vaults
Access grant — PUT /v1/groups/{gid}/vaults/{vid} + PUT /v1/groups/{gid}/devices/{did} (admin)
Content — PUT/GET /v1/vaults/{id}/blobs/{hash}
Metadata changes — POST /v1/vaults/{id}/mutations (typed: CreateFile, CreateFolder, ModifyFile, MoveRename, Delete)
Catch-up & wake — GET /v1/vaults/{id}/snapshot · GET /v1/vaults/{id}/log?after=N · GET /v1/vaults/{id}/history · GET /v1/vaults/{id}/events (WebSocket)
Device management — GET /v1/vaults/{id}/devices
Sharing — POST/GET /v1/vaults/{id}/public-file-links · …/revoke · GET /p/{token}
Session layer — POST /v1/devices (registration) · GET /v1/devices · POST /v1/devices/{id}/revoke (admin, or the device revoking itself) · GET /v1/devices/me/vaults · GET /v1/devices/stream (multiplexed wake stream)

Answered

Where do I read about the sync architecture, the fanout approach, scalability, the vault-to-device mapping, and metadata vs blob syncing?

The Understand journey is written for exactly this: ch.2 (engine), ch.3 (the full protocol as one story — fanout included), ch.4 (data model, metadata/blob split). The vault → device mapping is the group_devices/group_vaults edge tables. Wake subscriptions are ephemeral (see the wake-hub entry above). Scalability honesty: see the Open entries in "Realtime, fanout & scale".

Deferred

What are the delta-sync (chunking) semantics?

There are none yet, on purpose — transfers are whole-file with a 50 MB cap, and chunking/delta is post-beta scope. See "Blobs & transfer" above for the full answer.

Deferred. More: The Shape § Explicitly cut for v1

How do you plan to support multi-account?

The identity model already carries it: a device is a (user × machine) session, so two users — or two accounts — on one machine are simply two device identities. What's deferred is the client-side work: storing multiple credentials, routing each vault's engine to the right identity, and the setup UX. It's cut from v1 scope, with the model hook in place.

Answered. More: Platform model § Web view layer

How do you sync and show vault metadata in a web app (like the tasket drive page)?

Through the Platform, not by teaching browsers the sync protocol. A web session authenticates against the Platform Service, which proxies TinySync's tree and history APIs (snapshot, log, item metadata) read-through — names, sizes, structure render without downloading content. Whether file bytes are then fetched directly from TinySync's blob endpoint or proxied through the Platform (for per-file permission enforcement) is a flagged open sub-question.

Answered. More: Groups & Access § Containers

How do you provide multi-tenancy, and how do you model for containers?

Tenancy is composed from three orthogonal pieces: vaults are storage namespaces, groups gate who reaches them, and containers carve product namespaces and policy inside a shared vault. TinySync never interprets what a tenant means — group and container semantics are the Platform Service's to define, which is what lets one TinySync serve multiple products and orgs without embedding any of their business logic.

Answered. More: Groups & Access § Practical patterns

How do we sync vaults on orphan tasks, or tasks where a non-member participant needs access without team-vault access?

Both are group-mapping patterns, not new machinery. Per-task vault + per-task participant group: a participant's device joins the task group and reaches that vault — whether or not they're in the team group, and whether or not the task has a team at all. An orphan task is just a vault whose group contains its participants. Access is the union of group memberships, so the non-member sees the task vault and nothing else.

Answered

Can we get sequence diagrams for user stories of Drive integrated in Neo?

The integration lifecycle diagrams — device registration, group assignment, active sync, revocation, re-registration — are in Integration Lifecycle, and the end-to-end sync flows are in A File's Journey. Product-specific story diagrams (e.g. "attach a file to a task") are a docs request we're happy to take — tell us which stories matter first.

Positioning

Answered. More: Decide § Scribe vs Drive (the same embeddability argument, internally)

Why should Neo use TinySync over alternatives like Seafile?

Because they solve different problems. Seafile is a complete, self-hosted file product: its own user accounts, sharing model, web UI, and clients. Adopting it inside Neo means embedding a second identity system and a second product surface, then bridging both forever.

TinySync is a sync substrate built to be embedded: it has no user concept to collide with Neo's — identity stays in the Platform Service — access maps to Neo's own teams and tasks via opaque groups, and the end-user surface is the OS itself (placeholders in Finder/Explorer, on-demand hydration), not a separate app. Cloud-authoritative ordering with deterministic conflict copies, a Rust engine, and an API small enough to audit complete the case.

Honest trade-off in the other direction: Seafile ships block-level delta sync today; TinySync defers chunking past beta. If raw large-file transfer efficiency is the deciding criterion right now, that's Seafile's column.