@chriso, per @southside’s latest upload challenges and the ERROR throw just posted on Discord…
Here is a bit of logic below that might help:
The root cause of the ERROR throw (panic code not there)
might be the antnode operator’s OS /swap configuration, i.e. virtual memory on disk (what I call VRAM below),
which is perhaps related to node operators using /swap in a further bid
to oversubscribe disk space
so as to get more nodes operating in RAM
to win more rewards.
Perhaps Maidsafe should consider not throwing an error in this upload instance, as it appears to be potentially an OS-induced /swap from VRAM (disk),
a “wait for the code I need to load” moment…
Workaround?
Possibly consider adding a “Waiting to Upload” message to the CLI API, echoed by the UI, sent from the close group antnode gossip consensus process that responds to an upload client’s request for a quote.
That is,
the close group consensus process must check whether there are enough antnodes present and “ready” to provide a quote.
If there are not, the close group sends a gossip message to every upload client (CLI or Dave UI app) requesting quotes at that moment,
signalling “wait” to the current set of upload/“I need a quote” requests while a “quote consensus forum” is formed among the close group nodes.
The proposed gossip message must be received
through the client CLI API, which causes the local client to
echo it at the CLI prompt,
with a wait state added to accept a CLI (or Dave UI API) choice
that triggers the
local uploader CLI or UI uploader logic to:
IF (above msg received)
run a function to echo the message and wait for a response,
then branch on the uploader’s choice:
“Wait 1 minute, or Retry from Upload Start Now, or Stop and upload later, please enter Wait/Retry/Stop” (enter W, R or S, then confirm with Y/N), with the same messaging passed up the stack to a UI client API?
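To make the branch above a bit more concrete, here is a rough Python sketch of the client-side prompt. Everything in it is hypothetical: `prompt_upload_choice`, `handle_wait_signal` and the `msg_received` flag are names I made up for illustration, not anything in the real ant CLI or Dave app, and the "default to stop after three bad answers" rule is my own assumption.

```python
from typing import Callable

# Hypothetical mapping of the W/R/S keys proposed above to actions.
CHOICES = {"W": "wait", "R": "retry", "S": "stop"}


def prompt_upload_choice(read: Callable[[str], str] = input,
                         max_attempts: int = 3) -> str:
    """Ask for W/R/S, then a Y/N confirmation.

    Falls back to 'stop' if no confirmed choice is given within
    max_attempts (an assumption, not part of the original proposal).
    """
    for _ in range(max_attempts):
        answer = read("Wait 1 minute, Retry from upload start now, "
                      "or Stop and upload later? [W/R/S]: ").strip().upper()
        if answer not in CHOICES:
            continue  # unrecognised key, ask again
        confirm = read(f"Confirm '{CHOICES[answer]}'? [Y/N]: ").strip().upper()
        if confirm == "Y":
            return CHOICES[answer]
    return "stop"


def handle_wait_signal(msg_received: bool,
                       read: Callable[[str], str] = input) -> str:
    """IF the (hypothetical) wait message arrived, echo it and ask the uploader."""
    if not msg_received:
        return "proceed"
    print("Waiting to Upload: the close group cannot provide enough quotes yet.")
    return prompt_upload_choice(read)
```

Passing `read` in as a parameter means a UI client could supply its own dialog instead of the terminal `input`, which is roughly the "same messaging passed up the stack" idea.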
Just “thinking out loud” here…
It’s likely that
the lack of activity asking for quotes from antnodes in the close group
has left one or more of those antnodes
slow to respond,
given that
more than one of the antnodes in the close group may have their Linux OS configured
to push that object to VRAM, if /swap is enabled,
resulting in:
fewer than five “quote forum” close group antnodes responding within the given time, and so fewer than the five quotes necessary.
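The "fewer than five quotes in time" check could look something like the sketch below. The five-quote requirement is from the point above; the `QuoteReply` type, the `quorum_status` function and the timeout value passed in are my own placeholders, not the real antnode code.

```python
from dataclasses import dataclass
from typing import List

REQUIRED_QUOTES = 5  # five quotes are needed, per the point above


@dataclass
class QuoteReply:
    node_id: str      # hypothetical identifier of the responding antnode
    latency_s: float  # seconds the node took to answer the quote request


def quorum_status(replies: List[QuoteReply], timeout_s: float,
                  required: int = REQUIRED_QUOTES) -> str:
    """Return 'ready' if enough nodes answered within the timeout, else 'wait'.

    A node that was swapped out and answered late counts the same as a
    node that never answered at all.
    """
    on_time = [r for r in replies if r.latency_s <= timeout_s]
    return "ready" if len(on_time) >= required else "wait"
```

A "wait" result here is what would trigger the “Waiting to Upload” message instead of an outright error.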
That said,
it might be a programmatic consideration for Maidsafe as well
to keep that quote object in memory by prioritizing it
(making it immune to OS VRAM /swap actions),
so all antnodes remain responsive to providing quotes as their first priority
(versus every other antnode process/function, including shuffling chunks around, or writing and reading copies of chunks to local storage (slightly lower priority), etc.).
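As a sketch of the "quotes first" ordering only (it does not show the memory pinning part, which on Linux would be something like `mlock` and is a separate mechanism), a node-side task queue could be prioritized along these lines. The task names and their numeric priorities are my assumptions for illustration, not how antnode actually schedules work.

```python
import heapq
import itertools

# Lower number = served first. The ordering below the quote task is an
# assumption for illustration only.
PRIORITY = {"quote": 0, "replicate": 1, "store_chunk": 2}


class NodeTaskQueue:
    """Priority queue that always hands out pending quote requests first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within a priority

    def push(self, kind, payload):
        heapq.heappush(self._heap,
                       (PRIORITY[kind], next(self._counter), kind, payload))

    def pop(self):
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self):
        return len(self._heap)
```

For example, if a replication task is queued before two quote requests, both quote requests still come out first, in the order they arrived.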
As a general observation of node operator configuration behaviour,
one can see how some node operators might be making deliberate use of OS VRAM swap to cram more antnodes into system memory and win more rewards with more nodes operating, further oversubscribing their disks in this /swap-assisted manner,
which could be contributing to the above ERROR behaviour seen by upload clients…
AND/OR there is another exception use case that comes to mind, which may be a contributing factor:
ANOTHER exception use case causing the same ERROR?
Consider the case where
members are leaving the close group and
there is a temporary shortage of antnodes providing quotes,
caused by rapid new antnode “joins” or “leaves” to/from the Autonomi Network,
AND, at the same time (concurrently),
perhaps coinciding exactly at that moment,
there is a surge of upload client requests for quotes to service their uploads within the affected/changing close group responsible for providing the quote.
That causes a drop in the responsive “quote forum” close group member count,
when antnodes leave their old close group to be re-ordered into a new close group
before newly assigned members join the old close group in time TO RESPOND QUICKLY ENOUGH TO THE UPLOAD REQUEST.
The bit of logic expressed above
will handle both exception use cases,
better serving the CLI API or UI client API handling the upload action
and providing better overall UI/UX to the uploader.
The other alternative in the latter exception use case is to make “quote forum” formation and response the main priority,
before shuffling close group antnode members to another existing close group or creating a new close group.
Anyway, food for thought, trying to troubleshoot this latest upload “ERROR”.
(IMO the current “ERROR” condition shown to the uploader during upload indicates that the current prioritization of uploader-facing system functions needs to be reordered, to always serve uploads as the FIRST priority, to make this upload “ERROR” problem go away, but also to deal effectively with node operators’ OS /swap configuration behaviour as well.)
I hope the above helps.