Not sure. It could be defined as a fixed timeout, e.g. 60s, or it could work like the bitcoin mempool and grow/shrink up to a fixed buffer size for EVs, say 100MB of memory, with the oldest purged if it gets full. I'm not sure which design is most appropriate, but it seems difficult to enforce at the network level, so it probably ends up being up to each node anyhow, depending on how they want to manage the risk of failures.
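To make the two options concrete, here's a rough sketch of a node-local buffer that combines both: entries expire after a timeout and the oldest are purged when a memory cap is hit. Everything here is hypothetical (the `EvBuffer` name, its methods, treating an EV as a blob of bytes); it's just an illustration of the policy, not how any node actually does it. It only uses the node's own clock.

```python
import time
from collections import OrderedDict


class EvBuffer:
    """Hypothetical node-local buffer of pending EVs (illustrative names only)."""

    def __init__(self, max_bytes=100 * 1024 * 1024, ttl_seconds=60.0,
                 clock=time.monotonic):
        self.max_bytes = max_bytes      # e.g. 100MB cap, chosen by the node
        self.ttl_seconds = ttl_seconds  # e.g. 60s timeout, chosen by the node
        self.clock = clock              # local clock only, no network time
        self.entries = OrderedDict()    # key -> (inserted_at, payload), oldest first
        self.used_bytes = 0

    def _drop_oldest(self):
        _, (_, payload) = self.entries.popitem(last=False)
        self.used_bytes -= len(payload)

    def purge_expired(self):
        """Drop every entry that has been held for ttl_seconds or longer."""
        now = self.clock()
        while self.entries:
            oldest_key = next(iter(self.entries))
            inserted_at, _ = self.entries[oldest_key]
            if now - inserted_at < self.ttl_seconds:
                break
            self._drop_oldest()

    def add(self, key, payload):
        """Store payload, purging the oldest entries if the cap would be exceeded."""
        self.purge_expired()
        if len(payload) > self.max_bytes:
            return False  # can never fit, so refuse it
        if key in self.entries:
            _, old_payload = self.entries.pop(key)
            self.used_bytes -= len(old_payload)
        while self.used_bytes + len(payload) > self.max_bytes:
            self._drop_oldest()
        self.entries[key] = (self.clock(), payload)
        self.used_bytes += len(payload)
        return True

    def get(self, key):
        """Return the payload for key, or None if it expired or was purged."""
        self.purge_expired()
        item = self.entries.get(key)
        return None if item is None else item[1]
```

Since both limits are just constructor arguments, each node could pick its own values depending on how much failure risk it's willing to carry.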
I don’t think there’s any need to introduce network time; this feature can still work on local time alone.
It’ll be interesting to see how multi-section networks go.
From the last test network's speeds, 1k in 3s compared to 900k in 11s is worth looking at; those are roughly the sizes of a metadata message and a single full chunk.
For example, if there are 5 hops to negotiate each way, in current routing that’s 5 metadata request hops and 5 chunk data response hops (5*3s + 5*11s = 70s), whereas with EV it’s 5 metadata request hops, 5 metadata response hops, 2 metadata EV hops and 2 data chunk EV hops (5*3s + 5*3s + 2*3s + 2*11s = 58s). It's only a very rough approximation, but EV looks good there.
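Here's the same back-of-envelope sum as a small script, so the hop count can be varied. The function and constant names are made up for illustration, and the model is exactly as crude as the one above: a flat 3s per metadata hop and 11s per chunk hop, with the EV step assumed to always be 2 metadata hops plus 2 chunk hops.

```python
METADATA_HOP_S = 3   # ~1k message, per the test network figure
CHUNK_HOP_S = 11     # ~900k chunk, per the test network figure


def current_routing_seconds(hops):
    """Metadata request out over `hops`, full chunk back over `hops`."""
    return hops * METADATA_HOP_S + hops * CHUNK_HOP_S


def ev_routing_seconds(hops, ev_hops=2):
    """Metadata request and metadata response over `hops`, then the EV
    exchange: `ev_hops` metadata hops plus `ev_hops` chunk hops."""
    negotiation = 2 * hops * METADATA_HOP_S
    ev_exchange = ev_hops * METADATA_HOP_S + ev_hops * CHUNK_HOP_S
    return negotiation + ev_exchange


if __name__ == "__main__":
    for hops in range(1, 8):
        print(hops, current_routing_seconds(hops), ev_routing_seconds(hops))
```

With these numbers the crossover is between 3 and 4 hops (42s vs 46s at 3 hops, 56s vs 52s at 4 hops), so under this model EV only comes out ahead on longer routes.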
As you say, it will depend on the time for each handshake, validation etc. vs the time for data transfer. It doesn’t help that both of these will also change in the future, depending on improvements to computation / networking / crypto primitives etc.