CloudFS

Idea: cloud-native filesystem, in the spirit of local-first software.

  • Works well with cloud services, such as object stores.
  • Ability to snapshot, go back in time, similar to git.
  • Uses content-addressed storage built on Merkle trees and G-trees.
  • Ability to self-host storage while still reaching it through a cloud proxy.
  • Ability to lazy-load, fetching only the content you are interested in.
  • Ability to work offline.

Architecture

CloudFS consists of three components: a client, a server, and an optional relay. At its core it is a basic client/server architecture, in which the server (self-hosted or hosted in the cloud) manages the state of the filesystem and stores the metadata locally. The data itself can live in any key-value store and is immutable, because it is content-addressed: each blob is keyed by its hash. This also makes data storage easy to partition or replicate.
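To make the immutability point concrete, here is a minimal sketch of a content-addressed store. It assumes SHA-256 as the hash and an in-memory dict as the key-value backend. Both are illustrative choices, as are the names ContentStore, put, and get. None of this is taken from the CloudFS code.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: a blob's key is the hash of its bytes."""

    def __init__(self):
        # hex digest -> bytes; stands in for any key-value backend (assumption)
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Storing the same content twice is a no-op, so entries never change.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # A reader can check integrity without trusting the backend.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("blob does not match its key")
        return data


store = ContentStore()
key = store.put(b"hello")
```

Because a key fully determines its value, copies of a blob on different backends can never disagree, which is what makes partitioning and replication straightforward.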

Optionally, a cloud-hosted relay can sit between the client and the server to facilitate communication between them. The relay caches any blobs requested through it, so that access remains possible even when the server is on a low-uplink connection. It also keeps the filesystem readable while the server is down, as long as the required chunks are cached.
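The relay's caching behaviour can be sketched as a read-through cache. This is a rough illustration, not the CloudFS protocol: the Server and Relay classes, the online flag, and the fetch method are all hypothetical names, and network transport is replaced by direct method calls.

```python
import hashlib


class Server:
    """Stand-in for the CloudFS server; holds blobs keyed by hash."""

    def __init__(self):
        self.blobs = {}
        self.online = True  # flipped to False to simulate an outage

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blobs[key] = data
        return key

    def fetch(self, key: str) -> bytes:
        if not self.online:
            raise ConnectionError("server unreachable")
        return self.blobs[key]


class Relay:
    """Read-through cache in front of the server."""

    def __init__(self, server: Server):
        self.server = server
        self.cache = {}

    def fetch(self, key: str) -> bytes:
        # Blobs are immutable, so a cached copy never goes stale.
        if key in self.cache:
            return self.cache[key]
        data = self.server.fetch(key)  # only reached on a cache miss
        self.cache[key] = data
        return data


server = Server()
relay = Relay(server)
key = server.put(b"chunk")
relay.fetch(key)       # first request goes to the server and fills the cache
server.online = False
relay.fetch(key)       # still served, now from the relay's cache
```

Cache invalidation is not needed in this sketch precisely because the data is content-addressed. A blob that was never requested through the relay is still unavailable during an outage.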

Primitives

Blob store

Each blob is stored as a Merkle tree of data chunks; the root hash of the tree identifies the blob.
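A minimal sketch of computing such a root hash follows. Several details are assumptions made for illustration: fixed-size chunking (with a tiny CHUNK_SIZE so the example is readable), SHA-256, a binary tree, one-byte prefixes to separate leaf and interior hashes, and promoting an unpaired node unchanged. The actual chunking and tree layout in CloudFS may differ.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen for the demo (assumption)


def leaf_hash(chunk: bytes) -> bytes:
    # The 0x00 prefix keeps leaf hashes distinct from interior hashes.
    return hashlib.sha256(b"\x00" + chunk).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    # An empty blob still yields one (empty) chunk so it has a root hash.
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def merkle_root(chunks: list) -> bytes:
    level = [leaf_hash(c) for c in chunks]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                nxt.append(node_hash(pair[0], pair[1]))
            else:
                nxt.append(pair[0])  # odd node out: promote unchanged
        level = nxt
    return level[0]


chunks = split_chunks(b"abcdefghij")   # three chunks: abcd, efgh, ij
root = merkle_root(chunks)
```

Changing any byte changes the hash of its chunk and of every node on the path to the root, so the root hash commits to the whole blob, while unchanged chunks keep their hashes and can be reused or fetched lazily.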

Key-Value store

Uses G-trees whose values are blob hashes (the Merkle tree root hashes from the blob store).
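The sketch below shows only the role this store plays: mapping names to blob root hashes, with each version of the mapping itself identified by a hash so that older versions stay addressable. It is explicitly not a G-tree. A flat dictionary hashed as a whole stands in for the tree, and the names SnapshotStore, commit, and lookup, the JSON encoding, and the placeholder hash strings are all hypothetical.

```python
import hashlib
import json


def snapshot_id(entries: dict) -> str:
    # Sorted keys give a canonical encoding, so equal mappings hash equally.
    encoded = json.dumps(entries, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


class SnapshotStore:
    """Flat stand-in for the key-value store (NOT a G-tree)."""

    def __init__(self):
        # snapshot id -> mapping of path -> blob root hash
        self.snapshots = {}

    def commit(self, entries: dict) -> str:
        sid = snapshot_id(entries)
        self.snapshots[sid] = dict(entries)  # copy, so later edits cannot leak in
        return sid

    def lookup(self, sid: str, path: str) -> str:
        return self.snapshots[sid][path]


kv = SnapshotStore()
v1 = kv.commit({"notes.txt": "root-hash-1"})
v2 = kv.commit({"notes.txt": "root-hash-2"})
old = kv.lookup(v1, "notes.txt")   # the earlier version is still readable
```

This flat version copies every entry on each commit. A tree-shaped structure would presumably avoid that by letting successive snapshots share their unchanged nodes.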

Implementation