EDIT: Some of the information in this post is misguided, because ADs are implemented as link objects rather than data objects. So an AD should really be called an ALD, and this changes some of the logic and assumptions my post was based on.
Not smart in my view. Remember, we want to support devices that do not have a large disk capacity, and users with privacy concerns who need no trace left on the device.
One notable use case is database operations. Some very large databases handle thousands of mutations per minute, or even per second, during their peak hours. If you make all data immutable, the space required for these types of databases will balloon and swamp the storage capacity. It's one reason these multi-terabyte databases do not keep a log of every mutation made on them and only take snapshots of the data. And even these snapshots are kept only for a set period of time.
In my opinion you must provide the ability for fast-changing data (and databases) to not keep every change made to them. At least have it as an option.
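To put a rough number on that growth, here is a back-of-envelope sketch. The mutation rate of 1,000 per second is taken from the low end of "thousands per second"; the 200 bytes per mutation is purely my assumption for illustration, not a measured figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def append_only_growth_bytes(mutations_per_second, bytes_per_mutation, seconds):
    """Bytes added if every mutation is appended and nothing is ever reclaimed."""
    return mutations_per_second * bytes_per_mutation * seconds


# Assumed workload: 1,000 mutations/s, 200 bytes appended per mutation.
per_day = append_only_growth_bytes(1000, 200, SECONDS_PER_DAY)
print(f"{per_day / 1e9:.2f} GB appended per day")
```

Under those assumptions a single busy database adds roughly 17 GB a day that can never be reclaimed, regardless of how large the live data set actually is.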
Also, databases on append-only data have no simple field-change function. Either you have a procedure that tracks through all the changes to reconstruct the record (which may need to read many MDs, so access becomes very slow due to lag time), and that is time consuming when done on every record read. All that energy wasted. Or the database makes a complete copy of the record and appends it, so that reconstructing the record is easy.
To build just one object for display, it may require tens to hundreds of relational database records to be read from multiple files, plus index records too. If the database is very active, the work to reconstruct each of those hundreds of records could require more than one MD per record. And it's not a parallel situation, since the index field for other files is held in the records being read.
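The read-cost argument above can be sketched in a few lines. This is only a toy model: `AppendOnlyRecords`, `set_field` and `read` are hypothetical names I made up, not anything from the SAFE API, and each appended entry stands in for data that might live in a separate MD.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyRecords:
    """Hypothetical append-only store: changes are appended, never overwritten."""
    changes: dict = field(default_factory=dict)  # record id -> list of (field, value)

    def set_field(self, record_id, name, value):
        # A "field change" can only be expressed as one more appended entry.
        self.changes.setdefault(record_id, []).append((name, value))

    def read(self, record_id):
        """Replay the whole change chain; return (record, entries_read)."""
        record = {}
        chain = self.changes.get(record_id, [])
        for name, value in chain:
            record[name] = value  # later entries override earlier ones
        return record, len(chain)


store = AppendOnlyRecords()
store.set_field("cust-1", "name", "Ann")
store.set_field("cust-1", "balance", 10)
for new_balance in range(11, 111):  # 100 further balance updates
    store.set_field("cust-1", "balance", new_balance)

record, entries_read = store.read("cust-1")
print(record, entries_read)
```

Here a record with two fields costs 102 entry reads to rebuild, and that cost is paid on every read. Appending a full copy of the record on each change avoids the replay, but only by storing the whole record again every time.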
Then in my opinion there is a use case for temporary files too. For instance, editors that store a temp file and discard it once the editing session is finished. The temp file is useless once discarded, since the saved file and the previous file are the actual files. Remember, SAFE will be run on devices that cannot hold large temp files on disk, or that must not hold them because of privacy concerns.
Also, these application temporary files are often heavily mutated, some on every group of characters changed and others on larger changes. This is for recovery purposes, and if someone wants privacy (whistle blowers, ordinary people) on a shared device then the temp files have to be on the network. If you keep all these mutations, that represents a lot of wasted space for no benefit. The changes are saved when the file is saved and the session is over. Think of all those hundreds of millions of Word documents that office staff work on each day, and you want to save every character, line and paragraph change (for no benefit). That's many terabytes or more a day of useless data (never accessed again, with no information gained or lost by keeping or not keeping it).
And deleting this very temporary information does not take away meaningful information, since each version of those documents is still kept in immutable data as the files. That reinforces the point that it's keeping data with no benefit to the people using those applications or to the future world.
The world of data storage is a lot more than web sites.
So in my view keeping web site changes is good. BUT not EVERY character or word or tag that is changed during an editing session. Just keep the saved files for goodness sake.
tl;dr
- Remember, one of the early promises was that you could log in on any device and when you logged out there would be no trace left. Requiring that temp files are stored on the device means there is a trace. On SSDs and memory sticks, wiping files with overwrite methods doesn't work properly and files can often be recovered. EDIT: even if you encrypt the temp files, the fact that they even existed (metadata) can cause problems. Remember the ex-NSA chief who said "we kill people based on metadata".
- Databases will require a method to reconstruct records by tracing through all the appended changes and building the record.
- This dramatically increases the time to access data, since a lot of those records will now span multiple MDs because of the changes made to the record.
- Index files now become almost useless (speed wise & size wise) due to having to reconstruct the index record. Just read up on how they optimise index records and you might get an idea of the problems that append-only data will cause.
- The multi-terabyte databases with thousands of mutations per minute or second will result in dataspace blowout for no measurable benefit.
- The world of data is so much more than web sites and I agree that each version of the website should be kept, but not all the temporary data/files involved in the edit of each web page.
- Privacy & Security
- Once you have append-only data, you force temporary files back onto the device, if indeed the device can support them. This has serious implications for those in the world who want privacy and security for their data. Not every activist or whistle blower can have a device that can support large temp files and that also cannot be taken from them. Often they use shared devices, or phones/tablets that can be taken from them. If the temp files are not on SAFE but on the device, then the device can betray them.
- But if you put all temp files on SAFE then data space will balloon. For instance, if I have 10 MB of documents and I edit each on average 3 times a year, then I end up adding 60-300 MB of appended temporary files, and 3 MB to 30 MB of immutable data (the various versions). Now I have 60-300 MB of pure wasted space stored on SAFE. That is many times more than the versions themselves require.
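
One way to reproduce the figures in that last example is sketched below. The multipliers are my own back-solved assumptions, not measurements: temp-file churn of 2x to 10x the document size per editing session, and each saved version adding 10% to 100% of the document as new immutable data.

```python
def yearly_storage_mb(total_docs_mb, edits_per_year, temp_churn_factor, new_data_percent):
    """Return (temp_mb, versions_mb) added per year under the stated assumptions."""
    # Temp data: each editing session appends churn_factor times the document size.
    temp_mb = total_docs_mb * edits_per_year * temp_churn_factor
    # Versions: each save adds new_data_percent of the document as immutable data.
    versions_mb = total_docs_mb * edits_per_year * new_data_percent / 100
    return temp_mb, versions_mb


low = yearly_storage_mb(10, 3, temp_churn_factor=2, new_data_percent=10)
high = yearly_storage_mb(10, 3, temp_churn_factor=10, new_data_percent=100)
print("low estimate :", low)   # (temp MB, version MB)
print("high estimate:", high)
```

With those assumed ranges the low case gives 60 MB of temp data against 3 MB of versions, and the high case 300 MB against 30 MB, so the discarded temp data outweighs the kept versions by 10 to 20 times.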