The essence of SAFE apps

Yes, theoretically you can, but above I mentioned all the reasons why that won’t make sense until apps (or the RDBMS software) are modified to handle such a novel architecture. A summary of my arguments:

  • do you want to pay for 1 PUT per transaction? No, cross that out - you also need one for the transaction log, so at least 2 PUTs for each transaction. I don’t know how much that could cost, but it won’t be cheap. (OT: since data is mirrored 4 times, should that be 4 and 8 PUTs, respectively?)
  • do you want to wait (say) 1 second for each transaction to complete before the GUI is responsive again?
  • do you want to risk DB corruption if you get disconnected?

Once v1.0 is out, I may move my small site to MaidSafe, but I’ve decided to change the software I use and “go static” because I am not too eager to deal with these issues.

1 Like

This is what you said. I’m saying I don’t think this is the case and asking if you are sure about this. You repeat your assertion with more detailed assumptions as far as I can see.

Why not just admit you don’t know how a MySQL database will work on SAFE Drive, at least at the low level?

We both expect performance may be poor, but that may be entirely OK for some applications. I am pretty sure, based on discussions on the forum, that it will not be one PUT per transaction, which is why I’m asking you if you are sure about that and, if so, to back it up. If you aren’t, please stop repeating it.

EDIT:
Let me elaborate:

As I understand it, every write to a file on SAFE Drive does not result in a PUT. Instead there is some caching that batches write operations together. I don’t recall exactly how the batching works, or what precautions, if any, there are against data loss, so it remains to be seen what kind of applications this would be suitable for.
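
To make the batching idea concrete, here is a minimal sketch of the kind of write coalescing I have in mind, assuming a hypothetical `putChunk` call and an in-memory buffer; the real SAFE Drive mechanism may well work differently.

```typescript
// Hypothetical sketch of write coalescing: repeated writes to the same
// chunk are merged in memory and flushed as a single PUT. `putChunk` is a
// stand-in for whatever the real SAFE API exposes, not an actual call.

type ChunkId = number;

class WriteBuffer {
  private dirty = new Map<ChunkId, Uint8Array>();

  // Record a write; no network PUT happens yet.
  write(chunk: ChunkId, data: Uint8Array): void {
    this.dirty.set(chunk, data); // later writes to the same chunk replace earlier ones
  }

  // Flush all dirty chunks: one PUT per *chunk*, not one per write().
  async flush(putChunk: (id: ChunkId, data: Uint8Array) => Promise<void>): Promise<number> {
    let puts = 0;
    for (const [id, data] of this.dirty) {
      await putChunk(id, data);
      puts++;
    }
    this.dirty.clear();
    return puts;
  }
}

// Example: 100 small writes touching only 3 distinct chunks -> 3 PUTs on flush.
const buf = new WriteBuffer();
for (let i = 0; i < 100; i++) {
  buf.write(i % 3, new Uint8Array([i]));
}
buf.flush(async () => {}).then((n) => console.log(`PUTs issued: ${n}`)); // 3
```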

1 Like

SAFE apps can use their client interface to target the data of individual users. In the case of a facebook-style app, people could fill out a profile form and put it to the network; only those people who have the location of the profile, or even the encryption key to it, can then add that person to their ‘friend list’.

So a person can ‘put’ their profile to the network, and a ‘facebook’ client could be used to get that profile; its location could be stored in a contact list for easy access to the profile.

This way each individual is responsible for storing their own profile. So then, if you store pictures, perhaps as an album, you would reference the already-put pictures from your profile, so that there is no need to double-put.
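
As a rough illustration of the profile / contact-list idea above, here is a sketch. `put` and `get` are stand-ins for whatever the SAFE client API turns out to be, backed here by a plain Map so the example runs on its own; the safe:// addresses are purely illustrative.

```typescript
// Hypothetical sketch: a profile is PUT once by its owner; friends store only
// its address (and, for private data, a decryption key), never a second copy.

const store = new Map<string, string>();
const put = async (address: string, content: string) => { store.set(address, content); };
const get = async (address: string) => store.get(address);

interface Contact { name: string; profileAddress: string; }

async function main() {
  // Alice puts her profile to the network once.
  const aliceAddress = "safe://alice/profile"; // illustrative address only
  await put(aliceAddress, JSON.stringify({ name: "Alice", bio: "hello" }));

  // Bob's 'facebook' client records only the address in his contact list.
  const contacts: Contact[] = [{ name: "Alice", profileAddress: aliceAddress }];

  // Rendering Bob's friend list GETs each profile by its stored address.
  for (const c of contacts) {
    console.log(c.name, await get(c.profileAddress));
  }
}

main();
```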

Imagine: you have photos on facebook, photos on your camera, and photos in your dropbox; in Our case, we have Our photos on SAFE Network…

This is a huge reduction in personal redundancy: imagine that facebook copies your photos 4 times, your camera stores them once, maybe your computer does too, and dropbox also has its own redundancy, so files are really stored 10 times over. Glad SAFE Network solves this type of issue.

I’m sure that performance will not be as abysmal as you think, if at all. We are pulling chunks of data, small pieces; even a huge file is broken down into small pieces, let’s say some KBs each, and put, and those chunks are quick to get… so even a 1 TB file could be pulled almost instantly.

@dallyshalla just to clarify, I don’t expect SAFE REST API performance to be poor, nor that of apps written for this API. The above refers only to a standard MySQL application (e.g. a WordPress CMS) with files accessed via SAFE Drive.

Is this a scenario where you expect good performance? If so… :slight_smile:

Mark, let’s get a SAFE CMS; I must be back in that land again where I’m all ready to be doing stuff on the SAFE Network versus the dated (previously standard) software.

Just last night @frabrunelle and I were thinking over the FIX messaging system, and how ugly that was; it will also need a rewrite from XML to something prettier :slight_smile: like JSON… look out Wall St.

SAFE apps give us a chance to do things right; @janitor, definitely, we’re looking into previously built systems so we can work out the nuances we don’t enjoy, even though they still work. Here’s our chance to show the improvement and deliver it.

Static content on SAFE Network, I see, is so simple: you put it once, and there you go, access from anywhere for anyone you want (no recurring subscriptions, etc.).

2 Likes

Because I don’t need to know! I already know that a running DB spread over, say, 25 1 MB chunks will see changes in most of those chunks as I INSERT data, which modifies tables and appends to the transaction log, so a single INSERT cannot be fewer than 2 PUT transactions. (If it were, i.e. if it modified just one chunk, that would be awful, because if that chunk by some chance got corrupted, you would have both a corrupt DB and a corrupt transaction log.)

In the worst case, a single transaction could mean 25 PUTs (i.e. a DELETE that touches all 25 chunks).
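
For what it’s worth, the arithmetic behind those two numbers, under my own assumptions (25 x 1 MB chunks, every data write paired with a transaction-log append); this is illustration, not measured SAFE behaviour:

```typescript
// Back-of-the-envelope PUT counts; the chunk counts are assumptions, not facts.
const dbChunks = 25;

// Best case: one data chunk plus one transaction-log chunk rewritten.
const putsPerInsert = (dataChunks = 1, logChunks = 1) => dataChunks + logChunks;

// Worst case: an operation (e.g. a large DELETE) that rewrites every chunk.
const putsWorstCase = () => dbChunks;

console.log("INSERT, best case:", putsPerInsert());   // 2
console.log("DELETE, worst case:", putsWorstCase());  // 25
```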

For the app to work, the DB data must be present on your device (i.e. downloaded). You cannot create INSERTs and DELETEs without that.

You need to download the DB and that’s a scenario that I mentioned on this page yesterday (5-6 comments ago).
I said I thought that would work fine (assuming clean application exit), but you’d have to flush changes after application exit and download the DB every time you start it on a different device.

5-6 months ago we had a similar discussion and I said I’m not bothered by this slight inconvenience. I happen to use WP with MySQL, but I’ll simply use something else with MaidSafe. Every technology, especially in v1.0, has some drawbacks. Those who appreciate the advantages enough will find something that works well with MaidSafe; others will wait until v2 and some sort of atomic DB writes are supported. To me that’s not a big issue or a show stopper.

@dallyshalla I agree completely. And… this will take time, so there will be a demand for interim solutions if they turn out to be feasible in terms of performance and PUT costs.

It’s not hard to implement something like a CMS-to-static-page generator for WordPress, for example. The CMS database could then be located either on SAFE, locally, or left where it is on an old-style hosting service. It’s not a great solution, but it could help people get masses of useful content onto SAFE ahead of a SAFE CMS. That’s good for all, I think.
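
A rough sketch of the generator idea, with a mock posts array standing in for rows queried from the existing MySQL install; the file names and markup are illustrative only. The resulting ./site directory is what you would then upload to SAFE.

```typescript
// CMS-to-static sketch: emit one static HTML file per post.
import { mkdirSync, writeFileSync } from "fs";

interface Post { slug: string; title: string; body: string; }

// Stand-in for rows pulled out of the CMS database.
const posts: Post[] = [
  { slug: "hello-safe", title: "Hello SAFE", body: "First static post." },
];

mkdirSync("site", { recursive: true });
for (const p of posts) {
  const html = `<!doctype html><title>${p.title}</title><article>${p.body}</article>`;
  writeFileSync(`site/${p.slug}.html`, html);
}
console.log(`Generated ${posts.length} static page(s) in ./site`);
```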

1 Like

@janitor glad you see this; correct, if you had to pull the whole DB each time it would be cumbersome. Yet you can make some metadata for the segment of the SQL database you want, so that instead of pulling a whole MySQL DB to query, you use the file container, scan the metadata of the files inside, and then download just the content of the one file that holds the row in the MySQL DB you wanted…

So each PUT doesn’t have to cost as much as a full 1 MB;
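
If I understand the suggestion, it amounts to something like this sketch: keep a small metadata index per container, scan only that, and GET just the one data file you need. Every name and address here is illustrative, not an actual SAFE structure.

```typescript
// Hypothetical container metadata: small descriptors of each data file.
interface FileMeta { address: string; table: string; keyRange: [number, number]; }

const containerIndex: FileMeta[] = [
  { address: "safe://db/users-0", table: "users", keyRange: [1, 1000] },
  { address: "safe://db/users-1", table: "users", keyRange: [1001, 2000] },
];

// Find which file holds user id 1500 by scanning metadata only.
function locate(table: string, id: number): FileMeta | undefined {
  return containerIndex.find(
    (m) => m.table === table && id >= m.keyRange[0] && id <= m.keyRange[1]
  );
}

const hit = locate("users", 1500);
console.log(hit ? `GET only ${hit.address}` : "not found"); // GET only safe://db/users-1
```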

The following is not true IMO, and you have not addressed this point - you’ve just repeated the same assertion.

I already know that a running DB spread over, say, 25 1 MB chunks will see changes in most of those chunks as I INSERT data, which modifies tables and appends to the transaction log, so a single INSERT cannot be fewer than 2 PUT transactions

How do you know this? The database structure and the pattern of transactions mean this will be the case in some situations, but unless you have actual knowledge of a specific application, you cannot assume it is always the case. Yet you keep repeating it as fact. It’s not, because you’ve not backed it up with application-specific evidence or analysis of multiple applications.

The following is even more suspect, because it is only true if every single block in the database is modified, which would be rare in many, if not most, applications, because designers will already have tried to minimise this for performance reasons.

You need to download the DB and that’s a scenario that I mentioned on this page yesterday

So again you’ve just ignored what I said and restated the same points. This is a familiar pattern with your posts. Rather than address a point of criticism you’ve simply repeated yourself.

It’s a good point @DavidMtl. I often think not just of apps in this way, but also of the nodes, each performing various tasks within different and ever-changing close groups.

1 Like

You are saying MaidSafe will pull blocks, not entire files. Is that how it works?

But anyway, let’s see how much that could save us:

  • How much data from a DB index file do I need to use the index? All.
  • How much data from a table do I need for a SELECT * type of query? All.
  • How much data from a table do I need for other (e.g. DELETE) operations? In case of MySQL, I suppose all.
  • For all other data, how much do I need to wait for every single non-cached IO operation to complete? Say I click on a menu item that generates 15 DB operations. How long do you estimate that could take? Would such an app be considered usable?
  • Most of the files that I have to GET (such as indexes) also have to be PUT (because they change). So making any use of an app would likely require several PUTs.
  • Are you sure all PUTs are priced on a per capacity basis? Or does MaidSafe just allocate less space while charging you the same?
  • What is the minimum unit of update for files that are 1MB or larger? Is it a chunk, or a file? If it’s a file, then the entire file has to be PUT when it’s updated and the flexible chunk size would have no bearing on cost here.

Why do you have to focus on what I clearly said is the worst case scenario? And you called me a pessimist!
My baseline scenario (in this topic) has always been that each update to MySQL will generate at least 2 PUTs, one for the data and one for the transaction log.

Designers (I assume you’re talking about MaidSafe developers) either have this plan for v1.0 or they don’t. I already said that perhaps by v2.0 there may be a better way to handle this. So it looks like you’re either repeating my earlier claims, or you are speculating that this optimization will happen in MaidSafe v1.0.

I thought I spotted the same pattern in your replies the other day; I just didn’t want to call you out on it. I’m referring to my questions from this comment here, when I asked you - because you’re in favor of “appropriate regulation” - whether you think that MaidSafe should be regulated to prevent it from turning into a monopoly, and if so, how that should be done.

You saw a generic discussion about how write buffering will work with regular, flat files.
You are assuming that each MySQL write works the same way. It doesn’t: writes are inter-dependent.

A write to one place in the DB often means that a write in another place (such as the transaction log or some table or index) must happen. MaidSafe 1.0 would have to know which writes can be buffered, which cannot, and which should be grouped with others in some particular way.
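
To show what I mean by inter-dependent, here is a generic write-ahead-logging sketch (not MySQL internals): the log record has to become durable before the data page, so a buffering layer cannot reorder or silently drop either write.

```typescript
// Generic WAL sketch: every logical INSERT produces two ordered writes.
interface DurableStore { put(key: string, value: string): Promise<void>; }

class InMemoryStore implements DurableStore {
  data = new Map<string, string>();
  async put(key: string, value: string) { this.data.set(key, value); }
}

async function insertRow(store: DurableStore, row: { id: number; name: string }) {
  // 1) Append to the transaction log first...
  await store.put(`log/${row.id}`, JSON.stringify({ op: "INSERT", row }));
  // 2) ...and only then update the data page. Flushing (2) before (1), or
  //    dropping (1), is exactly what leaves a corrupt DB after a crash.
  await store.put(`table/users/${row.id}`, JSON.stringify(row));
}

const s = new InMemoryStore();
insertRow(s, { id: 1, name: "alice" })
  .then(() => console.log("writes per INSERT:", s.data.size)); // 2
```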

That’s why I said I don’t need to know how MaidSafe v1.0 will work on the low level, because:

  • I haven’t seen such claims on the forum that relate to RDBMS files
  • From experience I know those things are hard to do even on a local filesystem (if one attempts to write one from scratch)
  • For v1.0 I would consider plans to implement something like that a bad idea, because when one deals with such a complex app, best practice is to build a stable core rather than fancy database-caching features

So regardless of that discussion about write buffering of plain files that you mentioned above, I am telling you that I don’t believe write buffering and coalescing applies to an RDBMS until I see a statement to that effect.

If you write-buffer databases like that, it will work great, except when you get disconnected, or your app crashes, or your MaidSafe client crashes. Then you will end up with a corrupt database.

I didn’t count a whole bunch of additional updates that happen in WordPress’s MySQL (e.g. everything generated by plugins) because I don’t consider WordPress a single-client app, so it wouldn’t be fair, and those aren’t strictly required, although we know that, faced with the choice of running WP without a single plugin, most people wouldn’t run it at all.
Something like SQLite might be more appropriate for single-client apps in MaidSafe v1.0, but even in those cases my baseline assumption (1 PUT per update) would probably hold for databases that are not fully downloaded to the client every time the app is executed (and uploaded back after a clean shutdown, assuming the user accesses it from different clients and/or wants to use MaidSafe as their primary data store and not as a backup device).

I outlined all of these problems and mentioned some potential workarounds (such as keeping 3 copies of a DB on MaidSafe and deleting the oldest one every time the latest is error-free) earlier, so this comment is also merely a repeat of what I had said before, and I’m not sure how effective it will be. I guess it won’t be.
But if there’s no definitive proof one way or another, we’ll have to conclude this after v1.0 is available.
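
The rotation workaround I mentioned could look roughly like this; `upload`, `verify` and `remove` are placeholders for whatever the app would actually use, not a real API.

```typescript
// Keep the last few DB snapshots; drop the oldest only once the newest verifies.
const KEEP = 3;
const snapshots: string[] = []; // addresses of previously uploaded DB copies

async function rotateSnapshot(
  address: string,
  upload: (a: string) => Promise<void>,
  verify: (a: string) => Promise<boolean>,
  remove: (a: string) => Promise<void>
): Promise<void> {
  await upload(address);
  if (await verify(address)) {
    snapshots.push(address);
    while (snapshots.length > KEEP) {
      await remove(snapshots.shift()!); // oldest goes only after the newest checks out
    }
  }
}

// Example run with no-op stand-ins:
rotateSnapshot("safe://mydb/snapshot-4", async () => {}, async () => true, async () => {})
  .then(() => console.log("snapshots kept:", snapshots.length));
```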

@janitor I’m glad you’ve finally admitted that despite making definitive statements, we cannot be definitive about this.

That was my whole contention: I agreed with you that I’m pessimistic about the practicality of MySQL on SAFE, but said that some of the reasons you gave were, in my opinion, assumptions, because they differ from my understanding of how SAFE works at the filesystem and REST API level.

My information comes from having followed forum and reddit discussions with David very closely on this, as well as studying the initial REST API proposal.

You clearly have not done this, as the following shows:

You are saying MaidSafe will pull blocks, not entire files. Is that how it works?

This is exactly what I have been saying about a MySQL database sitting on SAFE Drive. And this also relates to why I think you are wrong to say that there will be two PUTs per transaction.

Firstly, SAFE Drive only downloads the blocks affected, as needed. Secondly, it does not do a PUT every time a block is modified on Drive. I am not sure whether the batching mechanism is finalised yet, but I have read about it in some detail (in the REST API design, I think). I made both these points earlier.

It is misleading, and I believe irresponsible, to make definitive statements about whether an existing MySQL app will work adequately when you have not done enough research to establish how the SAFE filesystem works, and not to make clear that you are making assumptions you have not validated. You give an unnecessarily negative view of SAFE Network - and this is the reason for my intervention.

I appear to have more understanding of SAFE Drive than you, but I don’t pretend to be able to say what will or won’t work.

My whole issue with you in this discussion is not about whether or not MySQL will work, but about you saying it won’t, when in reality we don’t know yet. Neither of us expects it to work, but we should remain open until we know.

Interesting discussion, but I don’t think that centralised databases will be appropriate for safe apps.

Sure, you can treat safe net like a virtual drive and overlay traditional DBMS systems on it. However, this is the old centralised way, not the new distributed way. The safe net itself is essentially a big database and developers should take advantage of this.

You could have a predetermined sequence of references which users retrieve and then save data files to. The users could then share the file with one, some or all other users. This would need no centralised application or storage engine.

We must remember that the safe net itself is like a huge storage device. Anyone can write to it and share stuff with others. Why bother centralizing the saving process when it can be avoided?

As every user can read public data, they can query and filter it locally. Likewise, private data will only be accessible by those who have access, but can then query/filter it. This puts security in the hands of the user, rather than a third party.

I suspect each client application would index the data for its own needs to keep the system performant too. Any data it doesn’t need or access could be ignored.
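
A small sketch of what that client-side indexing could look like: fetch the public records the app cares about once, build a local index, and answer queries from it without touching the network again. Addresses and tags are purely illustrative.

```typescript
interface PublicRecord { address: string; tags: string[]; }

// Stand-in for records already fetched from public data.
const fetched: PublicRecord[] = [
  { address: "safe://posts/1", tags: ["safe", "cms"] },
  { address: "safe://posts/2", tags: ["mysql"] },
];

// Local index: tag -> addresses of matching records.
const byTag = new Map<string, string[]>();
for (const r of fetched) {
  for (const t of r.tags) {
    byTag.set(t, [...(byTag.get(t) ?? []), r.address]);
  }
}

console.log(byTag.get("cms")); // [ 'safe://posts/1' ]
```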

I am sure distributed databases which work in this sort of way will be developed to ease application development. I am also sure developers will come up with some excellent solutions for such heavily distributed databases.

We must remember that this sort of technique can scale massively. There is no limitation on any centralised hardware, disk space, cpu or memory. With the right techniques, safe net could allow hugely scalable solutions to distributed data issues.

7 Likes

Agreed, NoSQL is the way on SAFE Network. The discussion of MySQL in no way implies otherwise :slight_smile:

1 Like

I wonder if we’ll get a glimpse of the possibilities in testnet3

1 Like

Combine a Wordpress port to NodeJS with some reworking to use NoSQL instead of MySQL (assuming that’s not the plan anyway), and we’ll have a massive user-base ready to jump ship.

Oh…and then there is this – maybe that’s not functional though as it might be the opposite of what we’d want. Hmm, my understanding here is at a limit.

What about just using caching to bypass most DB queries?

edit: after further reading on NodeJS and NoSQL, I’m wondering how well either of these is going to scale. I’m NO expert on these things anymore, just researching other people’s opinions. Perhaps a new thread/topic on what the best languages and DBs might be for programming on MAIDSAFE would be in order?

I suspect Wordpress/blog style sites could very easily be ported to safe net, but with a completely different technology stack. No database management system needed.

Essentially, the author would just need a nice tool to write static content, then leave the browser to piece it together, filter it, etc. The user experience could all be client-side.

It would get trickier to post comments, but a sort of predefined index-number sequence, where people post data to the next free slot, could work. This way, the user would store their comments as their own data (and could remove them, etc.) and the blog would just retrieve them at load time via the client.
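
A sketch of that next-free-slot idea, with a Map standing in for the network store and an illustrative address scheme; how slots would really be claimed and made tamper-proof is an open question.

```typescript
const network = new Map<string, string>(); // stand-in for the network store

function slotAddress(postId: string, slot: number): string {
  return `safe://comments/${postId}/${slot}`; // illustrative address format
}

async function postComment(postId: string, text: string): Promise<string> {
  for (let slot = 0; ; slot++) {
    const addr = slotAddress(postId, slot);
    if (!network.has(addr)) {   // first free slot wins
      network.set(addr, text);  // the commenter owns (and could remove) this entry
      return addr;
    }
  }
}

async function loadComments(postId: string): Promise<string[]> {
  const out: string[] = [];
  for (let slot = 0; network.has(slotAddress(postId, slot)); slot++) {
    out.push(network.get(slotAddress(postId, slot))!);
  }
  return out;
}

postComment("hello-safe", "Nice post!")
  .then(() => loadComments("hello-safe"))
  .then((comments) => console.log(comments)); // [ 'Nice post!' ]
```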

Why would someone want to run a single-user WordPress on NoSQL?

+1
That’s what I’m talking about!

Someone, some organisation, some company. Why do individuals use wordpress now? Maybe you can be more specific.

Wordpress has plugins that create static content - cached pages.

I suspect this is largely for historical reasons; it wasn’t that long ago that browsers were only capable of displaying largely static content.

Additionally, online storage for hosting and local storage have traditionally been very different things. This makes modifying content fiddly. Safe net presents a virtual drive and a public URL out of the box.

Ofc, you could use Dropbox in a similar way, but maybe the use case wasn’t worth the tooling effort to support it. Maybe browsers are only now reaching the point where it’s a reasonable option, too.

One thing is certain, static content on safe net will scale supremely well. We should take advantage of that.

1 Like