[Pre-RFC] Labelled Data

joshuef · December 2, 2019, 12:36pm

Not if the data is siloed away in some app’s container that you don’t know to check.

That’s the crux of this proposal. It’s not removing the ability to have folder structures; it’s more about enabling data to be located and modified by any app targeting that data label.

Currently (well… previously, as we don’t have this implemented), you’d do something like:

// this is all pseudo API; <data> and <my app id> are placeholders
let myPhoto = <data>;
let app = new Safe(<my app id>);

// saved in the app's own container
app.save('/profile_pic', myPhoto);

// and to retrieve (reusing the same `app` handle)

// retrieved from the app's own container
let photo = app.get('/profile_pic');

Only this app knows about this data. No other app can access this photo.

With this proposal, for apps to manage their own data:

// more pseudocode; <data> and the app ids are placeholders
let myPhoto = <data>;
let app = new Safe(<my app id>);
// automatically labelled with `appId`, and saved in that index
// ALSO has 'photo' label applied automatically
app.save(myPhoto);

// and to retrieve (reusing the same `app` handle)

// retrieved from the app's own index
let photo = app.get('/profile_pic');

// BUT ALSO

let someOtherApp = new Safe(<another app id>);

// if another app has 'photo' permissions
let samePhoto = someOtherApp.getFromIndex('photos', '/profile_pic');
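
To make the flow above concrete, here is a minimal in-memory sketch. It is only an illustration under stated assumptions: `LabelStore`, `grant` and the three-argument `save` are hypothetical names invented for this example, not a real SAFE API, and the type label is passed explicitly rather than detected automatically.

```javascript
// Hypothetical in-memory stand-in for the network; not a real SAFE API.
class LabelStore {
  constructor() {
    this.indexes = new Map();     // label -> Map(path -> data)
    this.permissions = new Map(); // appId -> Set(labels the app may read)
  }

  // Record that the user allows `appId` to read the `label` index.
  grant(appId, label) {
    if (!this.permissions.has(appId)) this.permissions.set(appId, new Set());
    this.permissions.get(appId).add(label);
  }

  // Reference the data from every given label index, creating indexes on demand.
  put(labels, path, data) {
    for (const label of labels) {
      if (!this.indexes.has(label)) this.indexes.set(label, new Map());
      this.indexes.get(label).set(path, data);
    }
  }

  // Read from one index, but only if the app holds permission for that label.
  get(appId, label, path) {
    const allowed = this.permissions.get(appId);
    if (!allowed || !allowed.has(label)) {
      throw new Error(`${appId} has no permission for label '${label}'`);
    }
    const index = this.indexes.get(label);
    return index ? index.get(path) : undefined;
  }
}

class Safe {
  constructor(store, appId) {
    this.store = store;
    this.appId = appId;
    store.grant(appId, appId); // an app can always read its own index
  }

  // Always labelled with the app id; type labels are supplied by the caller here.
  save(path, data, typeLabels = []) {
    this.store.put([this.appId, ...typeLabels], path, data);
  }

  // Retrieve from the app's own index.
  get(path) {
    return this.store.get(this.appId, this.appId, path);
  }

  // Retrieve from a shared label index (requires permission).
  getFromIndex(label, path) {
    return this.store.get(this.appId, label, path);
  }
}

const store = new LabelStore();
const photoApp = new Safe(store, 'photo-app');
photoApp.save('/profile_pic', 'photo-bytes', ['photo']);

const otherApp = new Safe(store, 'other-app');
store.grant('other-app', 'photo'); // the user approves access to the 'photo' label
const shared = otherApp.getFromIndex('photo', '/profile_pic');
```

An app that was never granted the 'photo' label gets an error from `getFromIndex`, which mirrors the point that data labelled only with an app id behaves like that app's own container.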

It fits the semantic web, in that each index could be an RDF struct explaining what’s within and referencing the data via URL, just accessible across apps.

Indeed, it could / would be great to have these labels applied from a data’s RDF automatically. In which case, each index is simply a quick reference of all data of a particular type. Though as we don’t have RDF baked in yet, labels are perhaps a shortcut.

oetyng · December 2, 2019, 12:49pm

Aha, OK thanks.
So, to build a folder view on top of this, we explicitly state that a label corresponds to a folder somehow. But how?
Would it be enough to post-/prefix the label with “folder” (+ delimiter)?

Then, to determine the hierarchy, how is that done?

For example, if I want to have the tree-views “root/photos/food” and “root/photos/animals” would “frog.jpg” have label “folder_root/photos/animals” and “ceviche.jpg” have label “folder_root/photos/food”? Or something else?

In that case, that one label could serve as both the multi-index as well as the tree-view definition.
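
As a rough sketch of the idea being asked about, a tree-view could be derived from such labels as below. The `folder_` prefix and the `/` delimiter are just the convention floated in this post, and `buildTree` and the item shape are hypothetical names for illustration only.

```javascript
// Assumed convention: a label starting with "folder_" carries a
// slash-separated path that places the item in the tree-view.
const FOLDER_PREFIX = 'folder_';

function newNode() {
  return { folders: new Map(), files: [] };
}

function buildTree(items) {
  const root = newNode();
  for (const item of items) {
    for (const label of item.labels) {
      if (!label.startsWith(FOLDER_PREFIX)) continue; // plain labels do not affect the tree
      const segments = label.slice(FOLDER_PREFIX.length).split('/').filter(Boolean);
      let node = root;
      for (const segment of segments) {
        if (!node.folders.has(segment)) node.folders.set(segment, newNode());
        node = node.folders.get(segment);
      }
      node.files.push(item.name);
    }
  }
  return root;
}

const tree = buildTree([
  { name: 'frog.jpg', labels: ['folder_root/photos/animals', 'photo'] },
  { name: 'ceviche.jpg', labels: ['folder_root/photos/food', 'photo'] },
]);
```

Here `tree` holds a `root` folder containing `photos`, which in turn contains `animals` (with frog.jpg) and `food` (with ceviche.jpg), while the plain 'photo' label is left for indexing only.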

joshuef · December 2, 2019, 12:55pm

We’d build a folder completely separately to this. safe files put <folder> would create a FilesContainer.

Within that container, whatever data you have can/will be individually indexed/labelled. And then the FilesContainer (which is our folder), would / could be labelled as well.

In terms of drawing out hierarchy from these indices, you wouldn’t necessarily be able to have one true hierarchy. The data is flat within each index (as I imagine it atm). You could create some kind of hierarchy via ‘Smart Folders’, as @JimCollinson suggests above, eg.

That seems correct yeh.

(It may be worth noting the ‘label-combo’ string such as photos/animals is just a workaround for current limitations with container APIs. Ideally you’d just have an index for each label and the client libs/network would handle permission crossover for you.)

bochaco · December 2, 2019, 12:57pm

I was imagining it as this:

You upload data with any app, all data is labelled automatically and linked from a Root index container
The Root index container data is represented using RDF, you have indexes with URLs to the data
You can refer to indexed data using their labels with a label-URL (also with an API) like safe:///<label>/<file and/or path>. A label could be linked to a FilesContainer, so you can pass the path of a file in such a URL after the label.
FilesContainers can also be created where the link to the file is a label-URL rather than an ImmutableData URL (this can create a circular link; we can solve it as OSs do, I guess)
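
A small sketch of how such a label-URL might be split into its label and path parts. `parseLabelUrl` is a hypothetical helper, and the exact URL grammar is an assumption based only on the safe:///<label>/<file and/or path> shape described above.

```javascript
// Hypothetical helper: split safe:///<label>/<path> into its two parts.
function parseLabelUrl(url) {
  const prefix = 'safe:///';
  if (!url.startsWith(prefix)) {
    throw new Error(`not a label-URL: ${url}`);
  }
  const rest = url.slice(prefix.length);
  const slash = rest.indexOf('/');
  const label = slash === -1 ? rest : rest.slice(0, slash);
  if (label === '') {
    throw new Error(`label-URL has no label: ${url}`);
  }
  // The path is optional; it only makes sense when the label links to a FilesContainer.
  const path = slash === -1 ? '' : rest.slice(slash);
  return { label, path };
}

const parsed = parseLabelUrl('safe:///photos/animals/frog.jpg');
```

With this input, `parsed.label` is 'photos' and `parsed.path` is '/animals/frog.jpg'; resolving the path against the linked FilesContainer is left out of the sketch.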

joshuef · December 2, 2019, 12:58pm

Yeh, @bochaco, all that would be grand

oetyng · December 2, 2019, 1:02pm

Okok, this is the part I missed.
With hierarchy I meant solely for the tree-view (as it’s basically one and the same) so not for the labeling. But I thought that tree-view over containers would be scrapped and replaced with tree-view emulated on top of labels instead.

Hmm… I’m not done mulching that, but I think I’d prefer the emulation over doing both actually, for simplicity … (if it in fact would make it simpler…).

bochaco · December 2, 2019, 1:08pm

But by emulation, do you mean every time you query something? I see the FilesContainers as being such an emulation, but persisted on the network (I guess with better performance/efficiency…?)

joshuef · December 2, 2019, 1:14pm

At least in the OP I’m suggesting to scrap the root level ‘containers’ as we had envisaged them previously.

But you could still create your own FilesContainer data struct (that we use for NRS resolution of websites eg).

Smart Folder emulation atop labels could be another (perhaps app level) feature?
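
One way an app-level Smart Folder could be emulated atop flat label indexes is as a saved query that intersects them. This is only a sketch under assumptions: the thread does not define Smart Folders precisely, and `smartFolder`, the index shape and the example URLs are all made up for illustration.

```javascript
// indexes: Map(label -> Set of data URLs); the data is flat within each index.
// A "smart folder" here is simply the set of URLs present in every requested label index.
function smartFolder(indexes, labels) {
  if (labels.length === 0) return [];
  const sets = labels.map((label) => indexes.get(label) || new Set());
  const [first, ...rest] = sets;
  return [...first].filter((url) => rest.every((set) => set.has(url)));
}

// Placeholder URLs, not real network addresses.
const indexes = new Map([
  ['photos', new Set(['safe://frog', 'safe://ceviche'])],
  ['animals', new Set(['safe://frog', 'safe://tiger-essay'])],
]);

const animalPhotos = smartFolder(indexes, ['photos', 'animals']);
```

Because the result is computed on each query, nothing extra is persisted; that is the trade-off against a FilesContainer discussed above.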

happybeing · December 2, 2019, 1:22pm

So the motivations here are:

to overcome a problem caused by apps having their own container which leads to data being known only to the owning app, and
to provide a general indexing mechanism that will enhance access to semantically labelled data, or non RDF data that is explicitly labelled by app or user, or labelled automatically according to content type for example

Is that fair / any others?

The first only applies to data created by apps which choose to use their own container, so I’m wondering what the use cases are for that and if it’s still needed, or could be handled in other ways (permissions for example). I can’t remember the discussions on this and I’m not sure I understood it anyway, so can someone give a summary of why we have app containers and some use cases?

I’m wondering if app containers are still needed, and also whether labelling might conflict with the reason an app would use them.

I’m liking the idea of a built in flexible index, and the way this is described lends itself to a good UI/API. The implementation also seems much easier to understand than I’m used to with this kind of feature.

bochaco · December 2, 2019, 1:30pm

To me the app container becomes a label (with < app type > ??), which will make more sense when you start sharing data across apps. E.g. I have my chat app customisation created as an RDF by chat-app-A, but I could import that to chat-app-B by simply sharing the data created with that label with chat-app-B.

joshuef · December 2, 2019, 1:31pm

They wouldn’t be needed.

If you only label it with your app, it’s effectively the same as your own container. Any other app would still have to specifically ask for permission for the data that an app’s put.

Ah yeh. Missed that bit sorry @happybeing. Yeh I think that’s fair.

No silos
indexed data
flexibility of data access (ie smart folders, etc) maybe being another aim

happybeing · December 2, 2019, 1:36pm

Thanks (both of you). I think it’s best to drop the idea of app owned data then, though obviously an app can achieve this functionality as you’ve described.

Much better to encourage a chat client/app to use ontologies to store messages, user identity etc and not encourage labelling as ‘RiotChatMessage’ and so on.

It might be worthwhile looking at how these issues are being handled in Solid too, or we might end up with unnecessary incompatibilities, and we might get some useful ideas and feedback. I think it’s an area of active work.

Mindphreaker · December 2, 2019, 1:59pm

I generally like the possibility of being able to label / tag data. However here are some initial thoughts / concerns:

1. Just for organisational purposes, is it necessary to implement this on the network layer? I mainly think of SAFE as a data / storage layer and maybe it would make sense to settle this on the app layer (each app will find their way / interface of organising the data for the user)? If there are technical reasons which would make data fetching faster (e.g. faster queries etc.) then I would encourage the proposal.
2. Most, if not all, applications I can think of which are using labels / tags, use them for organising their data. It’s a well known form of data organisation. However I think it’s not a good idea to extend this functionality and use it for access / permission management because users are simply just not used to it. Maybe it’s just me but when I think about the term “label” or “tag” I would not assume in any way that just because I give some data a label that it’s automatically shared to other people for instance.
2.1. If I’m not mistaken, we just had a huge rework of the data model regarding Public / Shared / Private data. Adding too many permission concepts would be confusing.

So the bottom line of my message is: Might be a good idea for making data queries more usable and / or technically efficient, but not a good concept for permission management.

P.S.: I also share happybeing’s concerns that it’s too much effort for most users to use labels in practice.

joshuef · December 2, 2019, 2:05pm

Thanks for the thoughts @Mindphreaker!

I imagine labels more often to be used by app devs than users (though there is scope for both). The burden does not have to be put on end users for there to be very tangible benefit.

It’s necessary to implement some form of indexing of users’ data. Otherwise there’s no way for anyone to know what data they own / what data an application has put on the network. And without that, can anyone actually be in control of their data?

IMO this just takes that a step further, providing some well-worn functionality (app controlled data) and extending automated indexing with some useful presets (easy access to file types).

The gravy (or danger), as you note, comes in w/ permissions (and flexibility). For which I’d let @JimCollinson opine perhaps on how he thinks that fits in the overall pic.

Mindphreaker · December 2, 2019, 2:19pm

Thanks for the quick response. I agree that the burden can be transferred to the app or its developers and I’m sure that there are good use-cases for this, however this feels contradictory to what you said afterwards:

It feels like you are saying this from an end-user perspective. As such, if the tagging is “outsourced” (automated, or similar), how does this reflect the end-user’s ability to be in organisational control of their data? This only applies if the tagging is done by the end-user, right? Otherwise they would not know about the organisation beneath and in turn could not benefit from it?

Anyhow, regarding my second concern (permissions) I think it’s not important from which perspective it is seen. Also from an app developer perspective I think it’s not a good UX approach to mix labels with permissions logic because it’s just very uncommon and therefore error prone.

joshuef · December 2, 2019, 2:48pm

No, as any/all applications are indexed automatically (at least by <appId>), even data created by an app but never labelled explicitly by a user will have an index which the user could then inspect (via an app most likely).

This is where applications will have to provide some help to make these raw indexes usable. (see @JimCollinson’s mention of ‘Smart Folders’ eg).

That all depends how it’s presented IMO. I don’t see anything insurmountable, nor indeed that different to managing folder permissions as has been the current idea: “Allow <app> to access/edit photos?” is the same permission whether that’s a ‘container’ called Photos or a label… (you know equally as much about what is inside that photos ‘folder’).

Traktion · December 2, 2019, 6:28pm

I missed this post before, but it sounds like a really positive change. More flexible and practical with few down sides. I like it!

david-beinn · December 2, 2019, 8:11pm

Sounds a good idea!

Having slightly skimmed the thread due to lack of time I think the comment that has stuck out the most to me is @happybeing’s suggestion to start at the UX and work backwards.

Aside from all the other benefits of SAFE I actually think there is a gap in the market for a storage system with a really great UX that is the right blend of familiar and innovative, and naturally leads the user into organising their data in a way that is both powerful and intuitive, as well as allowing easy sharing with both apps and other individuals, or even self-publishing to customers. This could definitely be a tool that is part of that.

My suggestion would be perhaps to have a fairly large set of default labels to lead people into actually bothering to label things, and to make it an integral part of the UX to be able to look things up by label. Default labels would also make it easier for apps to locate data, as I think has been mentioned.

In short, great idea I think, only concern is to make sure it doesn’t become unmanageable for the user.

happybeing · December 2, 2019, 9:55pm

Along these lines, I added labels under the description of a GitHub repo for the first time today and found it a very pleasant UX, much more so than adding tags to a post on this forum, for example. So worth a look @jimcollinson

mav · December 2, 2019, 11:47pm

Seems like a powerful way of organising data. It reminds me a bit of the union mount file system feature.

Sounds like middleware.

I’m not advocating this level of detail for initial development, but as an idea it would be neat to be able to add labelling ‘plugins’ that take the raw file input and can sequentially add or remove labels as needed.

eg a photo app would come with label middleware to simply always apply the ‘photo’ label. A more complex middleware might do image processing and add more labels such as ‘tiger’ or ‘tree’ etc. A third party might sell labelling middleware for the camera app that looks up a database and adds labels for the botanical names to any pictures of trees or flowers… etc. The point being, the method of labelling should be flexible and I think a handy way to implement this flexibility would be via chains of labelling logic depending how the user wants their labels.
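
The chain idea above might look something like the following sketch. Everything here is hypothetical: the middleware signature, the function names and the `detectedSubjects` field (a stand-in for real image processing) are assumptions for illustration, not a proposed API.

```javascript
// Each middleware takes (file, labels) and returns a new Set of labels.

// Simple middleware: always apply the 'photo' label.
const alwaysPhoto = (file, labels) => new Set([...labels, 'photo']);

// Stand-in for image recognition: it only reads pre-computed, hypothetical metadata.
const detectSubjects = (file, labels) =>
  new Set([...labels, ...(file.detectedSubjects || [])]);

// A middleware may also remove labels.
const dropDrafts = (file, labels) => {
  const next = new Set(labels);
  next.delete('draft');
  return next;
};

// Run the middlewares in order, feeding each one the labels produced so far.
function applyLabelChain(file, middlewares, initial = []) {
  return middlewares.reduce((labels, mw) => mw(file, labels), new Set(initial));
}

const labels = applyLabelChain(
  { name: 'tiger.jpg', detectedSubjects: ['tiger', 'tree'] },
  [alwaysPhoto, detectSubjects, dropDrafts],
  ['draft'],
);
```

After the chain runs, `labels` contains 'photo', 'tiger' and 'tree', and the initial 'draft' label has been removed. Because each step is a plain function, a user or third party could reorder or swap steps without touching the others.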

How is unicode handled? It’s a minefield but a global network should presumably work with global languages and character sets?
