Ryyn (massive multi-device virtual FS)

That's great, good news! :partying_face:

Hm, this error is unexpected. I recognise it, but I haven't seen it for a very long time, not since before this app was really anything.

There is almost nothing a user can do wrong here; very few variables. This is most likely the app layer or lower.

Well, I'm on it :face_with_monocle:

5 Likes

Is this still effectively Windows-only for now?

1 Like

Yes, unfortunately.
Targeting Unix is more or less the highest priority, once the most rudimentary functionality is solid.

1 Like

Apologies :grinning_face_with_smiling_eyes:

1 Like

I noticed in your log that it tries to create the system manifest; that's what it does when none exists. It would seem that the second instance is not using the same workspace key. But when I ran this manually, I had the exact same result as you did! So, I entered --wsp-key manually. Did you as well?
After that I ran with the default workspace (i.e. derived from the evm-key) and then it worked fine: it joined, was detected by the first instance at the other device, etc.
But the automated scenarios always use the custom wsp key, so it's not that.

I think I remember seeing this cost insufficient peers error when trying to create an existing scratchpad… a very odd error msg in that case. Well, I should try that and confirm it.
But it would make sense insofar as you and I have probably used the same wsp key in our second instances, so for some reason they did not consider the workspace as existing, i.e. the scratchpad_check_existence query returned false… I haven't seen that yet.

Confirmed now that I did use the exact same wsp key.
I tried starting the second instance of the first workspace again, and since there is no re-entering of credentials (starting only requires a pwd and instance id), the exact same key etc. is used as the first time. And now it worked. It correctly detected the workspace, joined it, and the first instance at the other device saw it joining.

Have you tried just doing ".\ryyn_daemon.exe [id]" again?

To me, this looks like the exists check returned false, and it tried to create a scratchpad that already existed.
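For anyone following along, this is the suspected shape of the bug, as a minimal sketch with stub types (only scratchpad_check_existence is the actual query named above; everything else is made up):

```rust
// Hypothetical sketch, not Ryyn's actual code. The stubs stand in for
// the real network client; only the control flow is the point here.
struct Client;
struct SecretKey;
type Addr = [u8; 32];

impl Client {
    fn scratchpad_check_existence(&self, _addr: &Addr) -> bool { unimplemented!() }
    fn join_existing(&self, _addr: &Addr) { unimplemented!() }
    fn create_system_manifest(&self, _addr: &Addr) { unimplemented!() }
}

// Hypothetical helper: manifest address derived from the workspace key.
fn manifest_address(_wsp_key: &SecretKey) -> Addr { unimplemented!() }

fn join_or_create_workspace(client: &Client, wsp_key: &SecretKey) {
    let addr = manifest_address(wsp_key);
    if client.scratchpad_check_existence(&addr) {
        // Expected path for a second instance: the manifest exists, join it.
        client.join_existing(&addr);
    } else {
        // A false negative from the existence check lands here and tries to
        // create the system manifest scratchpad that already exists on the
        // network, which fails (possibly surfacing as the odd
        // "cost insufficient peers" error).
        client.create_system_manifest(&addr);
    }
}
```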

2 Likes

I tried this quite a few times with the same result each time.

I didn't try setting the --wsp-key manually at first.

I have tried a few times now, and after doing the init with --wsp-key specified, when I try to run the daemon, I get 'Error: Invalid bytes representation'. I'm using the same secret key as before, but specifying it as --wsp-key to see if it makes any difference.

1 Like

Okay, I see.

Yes, it's actually a pain to enter/paste the secret keys. The input step is where things mostly go wrong. With all those args for the init, it's easy to mix them up. I really dislike having to look so closely and double-check. I'm sure you have already checked, but that could give such an error msg (as could a malformed key and a few other things).
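For reference, this is roughly the class of check that produces such a message. A minimal sketch, assuming a 64-char hex string for the 32-byte key (the function name is made up):

```rust
// hex = "0.4" in Cargo.toml

// Hypothetical sketch of parsing a 32-byte hex workspace key.
// Anything that isn't exactly 64 hex chars fails before any key
// derivation happens, which is the kind of thing that surfaces as
// "Error: Invalid bytes representation".
fn parse_wsp_key(input: &str) -> Result<[u8; 32], String> {
    let bytes = hex::decode(input.trim())
        .map_err(|e| format!("Invalid bytes representation: {e}"))?;
    let key: [u8; 32] = bytes
        .try_into()
        .map_err(|_| "Invalid bytes representation: expected 32 bytes".to_string())?;
    Ok(key)
}
```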

It's working quite smoothly now between two machines here. I'm keeping a terminal at each, then just opening a new tab and joining the same workspace at a new custom base dir, 0, 1, 2 etc.

Will have to continue tomorrow though.

6 Likes

Beautiful!!

7 Likes

I'm doing a new release right now for a bug fix. Just so you know, if anyone is about to try it atm, hold off a little :slight_smile:

Here it is now: v0.1.0-alpha.5

6 Likes

You get this working, and this will probably be the first thing I ever spend crypto on to accomplish a necessary goal.

4 Likes

For technical users wanting to experiment.


I do this when I'm trying things out manually now.

The example has a total of 6 "devices" (3 instances on two machines).

Used in the example:

  • 2 machines
  • An EVM key
  • A workspace key (32-byte hex)
  • A program folder, in the example F:\programs\ryyn

– Workspace

In the example I call my workspace wsp-101, so that gives F:\programs\ryyn\wsp-101 as the dir where the devices on this machine will install.
Next workspace when I experiment could be e.g. wsp-102.

– IDs

The machines are m0 and m1, with three instances each: a0, a1 and a2.
Thus, the instance ids are m0a0, m0a1 … m1a1, m1a2.

– Base dir and install path

The base dir passed to the cli is then F:\programs\ryyn\wsp-101\m0a2 etc.
The install path will thus be F:\programs\ryyn\wsp-101\m0a2\ryyn, under which that instance's files will reside: app.lock, device_id, evm-key.enc, the data dir with dbs etc.

– Drives

Within every instance base dir I place a drive dir, e.g. F:\programs\ryyn\wsp-101\m0a2\drive. That's where I will create and mount vaults for this specific instance. These can normally be placed anywhere; your vaults are created from where your actual files are, and you mount vaults to where you want them replicated on your machine. But this placement structure keeps it cohesive and tidy for trying things out.
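Putting the paths together, one instance (m0a2 here) then looks like this on disk; the install-dir contents are the files listed above:

```
F:\programs\ryyn\wsp-101\m0a2\    <- base dir passed to the cli
├── drive\                        <- vaults created/mounted here
└── ryyn\                         <- install path
    ├── app.lock
    ├── device_id
    ├── evm-key.enc
    └── data\                     <- dbs etc.
```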

– Initiate and start

Each instance is initiated with the line you can see to the left. This registers the id in an index, with a port for comms with the daemon, and stores the key(s) to file, encrypted with the password. Then, in the terminal for the daemon, the RYYN_PWD env var is set:
(screenshot: setting the RYYN_PWD env var)

This allows calling the cli without having to include the pwd every time (it is used to decrypt the keys from file).
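For reference, in PowerShell that is e.g. $env:RYYN_PWD = "your-password" (in a classic cmd shell: set RYYN_PWD=your-password).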

Then we can start the daemon in the terminal to the right, passing the corresponding id (m0a2 here). I do the same on both machines. You can see that the cmd info lists 6 devices.


(Masked in the screenshot: evm-key and wsp-key. Not only the evm-key but also the wsp-key must be kept secret; the wsp-key gives access to all the files in the workspace.)

– Create vault

Create a folder on one of the "drives", say my-vault, then call .\ryyn.exe m0a2 vault create F:\programs\ryyn\wsp-101\m0a2\drive\my-vault.


(The name/id shown there is, by mistake, a double hex encoding of the hex prefixes. There is no extra formatting of ids.)

And subsequently, other instances (here m1a0, i.e. on machine 1, and m0a0 on machine 0) discover the vault:


– Mount a vault

The other devices only registered the vault; they haven't mounted it yet.
So, we mount it on, say, m1a1. We create a folder, say vault-at-m0a2, and call
.\ryyn.exe <id> vault mount <path\for\your\mount> <vault-prefix> [any-name]

i.e.:
.\ryyn.exe m1a1 vault mount C:\Users\Public\ryyn\wsp-101\m1a1\drive\vault-at-m0a2 0717d632 v-m0a2

The m1a1 instance receives the cmd and we can see the setup of the mount-specific components:

And m0a2 sees it a couple of seconds after the system manifest was updated:
(screenshot: m0a2 sees the m1a1 mount)

And the instances can see this updated info:

(the application is not filling in all details yet, like the number of mounts of a vault)

– Sync files

Now that we have 2 mounts to a vault, we can try syncing files.
Drag and drop a file to one of the drive folders. See the activity in the daemon terminals.

m1a1, which mounted the vault created by m0a2, adds a file to the mount:
(by the green rectangle you can see that there was some substantial delay in processing; some delays can be seen, but this one was unusually long)

On m0a2 we pull the first event promptly:

Then, after the unexpected delay (the green rectangle) at m1a1, m0a2 pulls the event with the file version manifest (addresses of all chunks etc.), decrypts and assembles the chunks to a local file:
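For the curious, that pull-decrypt-assemble step conceptually looks like this. A sketch with stubbed helpers, not Ryyn's actual API; only the flow is from the description above:

```rust
// Hypothetical sketch of materializing a file version from its manifest.
type ChunkAddr = [u8; 32];
type ChunkHash = [u8; 32];

// A file version manifest: addresses and hashes of all chunks, in order.
struct FileVersionManifest {
    chunks: Vec<(ChunkAddr, ChunkHash)>,
}

// Stubs standing in for the real network/crypto layer.
fn fetch_chunk(_addr: &ChunkAddr) -> Vec<u8> { unimplemented!() }
fn kdf(_scope_key: &[u8; 32], _hash: &ChunkHash) -> [u8; 32] { unimplemented!() }
fn decrypt(_key: &[u8; 32], _data: &[u8]) -> Vec<u8> { unimplemented!() }
fn hash_of(_data: &[u8]) -> ChunkHash { unimplemented!() }

fn materialize(scope_key: &[u8; 32], manifest: &FileVersionManifest) -> Vec<u8> {
    let mut file = Vec::new();
    for (addr, hash) in &manifest.chunks {
        let encrypted = fetch_chunk(addr);
        // Deterministic per-chunk key (see the conceptual overview further
        // down the thread): K_chunk = KDF(workspace_scope_key, chunk_hash)
        let k_chunk = kdf(scope_key, hash);
        let plain = decrypt(&k_chunk, &encrypted);
        assert_eq!(hash_of(&plain), *hash); // manifest hash doubles as integrity check
        file.extend_from_slice(&plain);
    }
    file
}
```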


Well, as for basic functionality, that's pretty much it.

5 Likes

Keen to look at this once it's ready for folk with grown-up OSes :slight_smile:

2 Likes

Yes, indeed @Southside. I am a bit childish though, so it makes sense :blush:

@DavidMc0, if you don't mind, I'd happily take a look at the exact cmds you pass in (without the evm key, or pwd if it's not a dummy, which you of course know but I have to say it). You can send it in PM if you want.

3 Likes

A new release is available: v0.1.0-alpha.6

This contains multiple bug fixes that have made the basic functionality quite stable now.

Since the previous release, the focus has been on e2e tests. First, CI was extended to spin up a local ant network for running the e2e tests on. Then I built a structure for implementing scenarios, covering both common use cases and extreme/faulty behavior. These scenarios are specified in .toml files and run on the different environments (mock network, local network and, for special use, live network). The bugs fixed in this release were all found using these scenarios.

I have been putting a large effort into simulations, stress testing and scenarios as a way to develop and verify the invariants of the system. I find it irreplaceable for thoroughly covering large numbers of cases in a short time.
In this release only a rudimentary scaffold was set up; the next step is being worked on now. It will give a lot more realistic loads and will be an invaluable tool and assistance in making the functionality rock solid. Very satisfying work!


A note about using new versions:
Make sure to delete the folder of the previous version AND the file index.toml found at C:\Users\<user>\AppData\Local\ryyn. This last bit has not been documented previously, and would definitely have caused issues, since the format in it was updated across earlier releases.
So, just delete it before using this new version.

14 Likes

What's up with the title? you might say. I was about to come up with a more descriptive and informative title. I had a vague memory of having prefixed titles here on the forum with GitHub: [app name] for posts about apps that also had code. This was several years ago when I wrote the first apps for Autonomi. I think it was meant to show that there is code for this thing. I squinted at it now and thought, yeah, I think that's the impression I still get… So, there it is.

Now that IF is over, things will change slightly on the Ryyn side.

It was early this winter that I started thinking about writing some kind of application running on Autonomi, directing focus here again after quite some time. It became clear to me pretty soon what it would be; it's very similar to what I built to run on the network several years ago (virtual filesystem, database etc.). And the reason is still the same: that's what I want and have needed for years. It is such a PITA and annoyance with backups, or just my data that I want also on my phone and on device a, b, c… and just trying to keep one's digital things in one place, accessible without worry, without lots of maintenance and fiddling. And I want it to be mine of course, and private when I wish so. I know many share the experience! Even though I'm technical enough to work with these things, it's just immensely frustrating to me, and I do not want to spend time on it (hours here and there). I'd rather spend years building a better solution :smiley:

I happened to see this IF thing about a week or so before the applications closed, and I was doubtful until the very last whether I should join in. I had looked forward to the quiet work in anonymity, only announcing something when there was something really solid to use. So, that didn't happen. There were pros and cons to participating: some input was received from the community, which is and will always be very valuable, but I did rush some development as well, causing a bit of technical debt. Actually though, I find that some good advancements get done when you've got pressure on you; you can find out a class of things that you wouldn't have otherwise. But it's not good for code quality in the long run (though isn't that the reality for like… 99% of all development?). So, really, I don't know for sure either way in this case. But I was thinking along the way that Ryyn wasn't really something for IF. There is no marketing going to be done. There is no selling, no making money on it. It's just (going to be) there for anyone to use.

During IF I did largely stick to my anticipated "work in peace and quiet", which to the outside may have looked like nothing much happening at all. And on the usability front that was pretty much the case! The work was (and is) focused on the foundation: finding and discovering the best designs of components, the design of the system itself, problem framing and the actual conceptual model. That model changed, mostly from distributed backup/virtual file system, over some intermediate undefined states, into what it is now (which doesn't have a good name, but I'll include an overview below). The conceptual model is quite firmly grounded now (but not entirely unfolded). However, I see different concepts branching off of the same solutions. But not for a while… and maybe not by me.

This application, with the ambitions I have, is a long-haul project. Most work has been, and continues to be, very complex, in order to make the system simple: simple to understand, simple to use and simple to maintain (and of course, that's a target no-one reaches fully; a mirage?). It will take a lot of time to get to where I want it to be. There is so much being worked on that no user will ever think about, just to allow that very thing: for them to never think about it.
There is a similar application that is a somewhat suitable comparison for grasping scale: Syncthing (github).
I'm impressed by the user adoption they have and the long time they've kept up the work. They are an excellent scale reference, with 11 years of work (at least?) and hundreds of contributors (a handful of larger ones, and a major champ). Most of their work was done early on though, but I'm sure the longevity has depended on the ongoing work.

We're building somewhat different things, but I think the scale is not wildly off. I have high ambitions for this application. I'm confident that I can build something simpler, with better features, using less code. (I also have the advantage that I may learn from their mistakes.) Plenty could say the same; given enough time, hard things become doable. But it is also the case that I am not constraining this to any particular timeline. I will simply do this. As long as Autonomi keeps developing and becoming more and more usable, I will get my device sync built so I can rest in peace with my data. (I'm of course joking a bit.) It's a huge motivator that this is something many both want and need, and it is a great satisfaction to be able to do something for others - you know, something that actually matters.

Anyway, I think all that means there will be no more IF things for me, since I won't start building another app, and as I understood it, the same application can't participate twice. So for a while I may not put that much effort into involving others in the progress. It takes a lot more time than it would seem (to me at least) to produce material like tutorials, user docs etc. But that does not mean that things are not moving. My language will be commits. It takes a while to learn it, but I will be telling you all a lot about progress that way :slight_smile: (and occasional notes here when something lands)

But! I will include an overview of the conceptual model now - already implemented in code - below this post. This overview is not perfect, consistent, complete or finished, because this too takes ages to write up (for me at least).
I will always answer questions, or help out if there are issues using the application. So just shoot, whenever.

16 Likes

On a high level this is really something like a Massive Multi-Device Virtual FS. Or multi-device could be exchanged for distributed (though that's still not quite the same).

Below is an overview of what's under that.

Ryyn conceptual model (lower level)

Short description:

A workspace holds vaults. Each vault can be mounted anywhere; mounts append to their own stream and merge all streams, so state converges deterministically, with no central log.

  • A workspace is owned by one key.
  • A vault is a shareable, network-resident dataset created from a local folder.
  • A mount exposes a vault at any path on any device; all mounts of a vault stay in sync.
  • Optional per-path sharing grants read/write access to parts of a vault.
  • The vault is the network source of truth, a mount-partitioned event DAG, and mounts are authoritative I/O projections into it.
  • A workspace can contain any number of vaults. Each vault can be mounted on any* number of devices.

*well, not any ANY, but an as-yet-unspecified, fairly large number :slight_smile:

In summary: Mountable Vaults are a workspace-scoped, key-owned model where each vault is a network-resident, mount-partitioned event DAG that can be mounted in many places across devices; all mounts converge deterministically by merging the vault's streams.
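To make the convergence claim concrete, here is a minimal sketch of the merge idea (illustrative types, not Ryyn's actual code; the tuple is the tie-breaker mentioned later in the thread):

```rust
// Illustrative sketch of the convergence idea. Every mount appends to its
// own stream; a mount's view is produced by merging the union of all
// streams in a deterministic total order, so every mount ends up with the
// same result. (The causal DAG ordering is resolved first in the real
// system; this only shows the deterministic ordering part.)
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Event {
    ts: u64,        // event timestamp
    author_id: u32, // the mount that wrote the event
    event_id: u64,  // unique id of the event
    // ...op payload (create/modify/delete, path, manifest ref, etc.)
}

fn merge(streams: &[Vec<Event>]) -> Vec<Event> {
    let mut all: Vec<Event> = streams.iter().flatten().cloned().collect();
    // Derived Ord compares (ts, author_id, event_id), so all mounts
    // sort the union of streams identically.
    all.sort();
    all
}
```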

Glossary

  • Workspace (forest) — A forest of network-native vault trees, owned by one key, each vault mountable anywhere across devices, all mounts kept in sync.
  • Vault (tree) — Network-native dataset structured as a mount-partitioned event DAG forming an index over the dataset history.
  • Mount (binding/peer) — Writable attachment of a vault tree 1:1 to a local path; maintains a partition of the vault event DAG. Edits flow both ways (mount ↔ vault ↔ mount); all mounts of a vault reflect the same dataset and stay synchronized via the network. Multiple mounts of a vault are peers.
  • Vault DAG — The union of all mount streams for a vault; a logical source of truth reflecting dataset changes.
  • Mount (event) stream — A partition of the logical vault DAG stored in the network. Written to only by its mount.
  • Reconciling — Deterministic merge on a mount that consumes the vault DAG and produces the local filesystem view.
  • Subpath grant — Capability granting R/W to (sub)paths of a vault.
  • Key hierarchy — Workspace key → derivations for vaults/mounts/data/etc.
  • Sync (replicate & reconcile) — Each mount appends to its own stream, fetches all other mounts' streams, and merges the union locally via deterministic reconciliation, then applies the ops reflected by the events, fetching and applying data when needed, keeping the filesystem synced with the other mounts.
  • Worktree (alias) — Developer-friendly synonym for mount.
  • Repo/dataset (aliases) — Developer-friendly synonyms for vault.
  • Profile (per-device) — User-composed set of mount points per device.
  • P2P side-channel — Ephemeral signals (presence, hints, metrics). Not part of the event DAG.
  • Hash-identified chunk — File version manifests use the chunk's hash as its identifier (integrity), not as the storage locator.
  • Key-addressed chunk containers — Network location is under a public key address derived from the workspace key hierarchy.
  • Deterministic per-chunk key — Encryption key K_chunk = KDF(workspace_scope_key, chunk_hash); symmetric and unique per chunk (see the sketch below the diagram).
  • Capability grant (per-chunk) — Sharing = passing K_chunk (directly or wrapped) + its location in the network (or the chunk itself); grants just that chunk.
  • Capability grant (per-file-version) — Sharing = passing a file link, consisting of the chunk locations, the assembly index, and their K_chunk keys.
  • File link — Used for sharing a file outside the workspace. A self-contained descriptor used to fetch, decrypt, and assemble exactly one file version without access to anything else in the vault. (Sharing entire vaults uses a different mechanism.)
                ┌────────────────────────────────────────┐
                │              WORKSPACE KEY             │
                └────────────────────────────────────────┘
                                   │ derives
                          ┌────────┴────────┐
                          │                 │
                    ┌─────▼─────┐      ┌────▼─────┐
                    │  VAULT A  │      │  VAULT B │
                    └─────┬─────┘      └────┬─────┘
                          │                 │
     (logical) VAULT DAG = union of mount streams per vault
                          │                 │
──────────────────────────┼─────────────────┼───────────────────  (network, as a 2D line)
                 (chunks) │        (mount streams of VAULT B)
                          │
        ┌─────────────────┴─┬────────────┬───────────────┐
        │                   │            │               │
   STREAM A1            STREAM A2     STREAM A3       STREAM A4
 (written by Mnt A1)   (by Mnt A2)   (by Mnt A3)     (by Mnt A4)

Devices (each mount writes its own stream; all mounts read all streams in the vault):
┌──────────────────────────────┐       ┌──────────────────────────────┐
│ MOUNT A1 @ /work/proj        │       │ MOUNT A3 @ D:\proj           │
│ writes → STREAM A1           │       │ writes → STREAM A3           │
│ reads  ← A1,A2,A3,A4 → merge │       │ reads  ← A1,A2,A3,A4 → merge │
│ reconciles ⇒ local FS view   │       │ reconciles ⇒ local FS view   │
└────▲─────────────────────────┘       └────────────────────────▲─────┘
     └───── (P2P: presence/hints/metrics only; not in DAG) ─────┘
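To make the chunk-crypto entries in the glossary concrete, here is a minimal sketch of the per-chunk key derivation and the file link. It assumes a blake3 keyed hash as the KDF and illustrative types; Ryyn's actual KDF and structures may differ:

```rust
// blake3 = "1" in Cargo.toml

// The glossary's "Deterministic per-chunk key":
//   K_chunk = KDF(workspace_scope_key, chunk_hash)
// The same chunk in the same workspace scope derives the same key,
// so no per-chunk key ever needs to be stored.
fn derive_chunk_key(workspace_scope_key: &[u8; 32], chunk_hash: &[u8; 32]) -> [u8; 32] {
    let mut hasher = blake3::Hasher::new_keyed(workspace_scope_key);
    hasher.update(chunk_hash);
    *hasher.finalize().as_bytes()
}

// Illustrative "file link": everything needed to fetch, decrypt and
// assemble exactly one file version, and nothing else in the vault.
struct FileLink {
    chunks: Vec<ChunkRef>, // in assembly order (the "assembly index")
}

struct ChunkRef {
    location: [u8; 32], // key-addressed container location (illustrative form)
    k_chunk: [u8; 32],  // per-chunk decryption key
}
```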
15 Likes

This is so good to read Edward. There's another project I wish would write something like this, but they continue to remain even more in the background than you!

The only bit I didn't like is that it will take a while! Although that is obviously necessary to get all the things I liked even more. :man_shrugging:

I think this will be one of the most useful and widely used applications we will see on Autonomi, as well as an outstanding, well-executed piece of development. Good luck, and please get a move on! :rofl:

11 Likes

I've not read this but thought it might be useful when considering UX:

3 Likes

They write "must converge to the exact same state without losing data" and then advise using CRDTs with a Last-Write-Wins strategy, which is simply discarding data…

3 Likes

Thank you Mark, that's very nice to hear :smiling_face:

yes for sure :rofl:

Yes indeed, if they discard it though. They keep a log with all pending actions and their values. But going back on anything would require a layer of later reconciliation, like a manual 3-way merge or so. I don't know if they had that, but I guess they would have mentioned it in that case.

Ryyn started with a deterministic LWW tie-breaker (ts, author_id, event_id) on event DAG conflicts (i.e. causality is already handled) for simplicity, but no data is discarded; it's just a convenience for now because it's simple to implement.
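As a sketch, that tie-breaker is just a deterministic comparison of the tuple; every mount compares the same fields, so every mount picks the same winner, and the losing event remains in the DAG (illustrative field types, not Ryyn's actual definitions):

```rust
#[derive(PartialEq, Eq)]
struct EventMeta {
    ts: u64,             // event timestamp
    author_id: [u8; 32], // id of the writing mount
    event_id: [u8; 32],  // unique event id
}

// Deterministic LWW tie-break, only consulted for events that conflict in
// the DAG (causality is already resolved at this point). No event is
// deleted; the "loser" just doesn't win this particular conflict.
fn winner<'a>(a: &'a EventMeta, b: &'a EventMeta) -> &'a EventMeta {
    if (a.ts, a.author_id, a.event_id) >= (b.ts, b.author_id, b.event_id) {
        a
    } else {
        b
    }
}
```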

5 Likes