This is not intended to be original in any form - as usual, it is mostly to capture my own thought processes in an exhibitionist (the spell checker refuses every variant of this word, but you get what I am talking about) kind of way. With that, let's proceed.
So, first wild assumption: there are no useful (i.e. human-undetectable) cryptographic hash collisions. With that, one could walk over every piece of content in the world and enumerate it by sha1(what_you_get_from_network).
So, with that assumption in mind, let's just throw away the entire host-based URI thing and use the hash to address the content. Seriously - I either click on my URLs or search for them - so, it really does not matter. Let's call this new scheme "hash", with the new hash://sha1:4e1243bd22c66e76c2ba9eddc1f91394e57f9f83/
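A minimal sketch of how such a name would be derived - the hash:// scheme itself is, of course, made up here:

```python
import hashlib

def hash_url(content: bytes, algo: str = "sha1") -> str:
    """Build a hypothetical hash:// URL for a blob of content."""
    digest = hashlib.new(algo, content).hexdigest()
    return f"hash://{algo}:{digest}/"

# The digest used in the text is sha1 of "test" plus a trailing newline -
# i.e. the usual `echo test | sha1sum` output:
print(hash_url(b"test\n"))
# hash://sha1:4e1243bd22c66e76c2ba9eddc1f91394e57f9f83/
```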
This is shorter than a lot of existing URLs. Moreover, with our first assumption, you can tell that as soon as you get anything other than "test" (strictly, "test" plus a trailing newline - that is what this particular digest covers) in response - it's the wrong content.
So, if you were a service provider (SP) transiting this content, you could trivially cache the whole thing and give near-instantaneous replies for content you have already seen.
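A toy model of that SP-side cache, assuming sha1 as the only algorithm. The point is that every entry is self-verifying, so the cache never has to trust its own storage:

```python
import hashlib

class VerifyingCache:
    """Toy content store keyed by sha1 hex digest, as an SP-side
    cache might be.  Entries that no longer match their digest are
    treated as misses rather than served."""

    def __init__(self):
        self._store = {}

    def put(self, body: bytes) -> str:
        digest = hashlib.sha1(body).hexdigest()
        self._store[digest] = body
        return digest

    def get(self, digest: str):
        body = self._store.get(digest)
        # Re-verify on the way out: a corrupted entry is worse than a miss.
        if body is not None and hashlib.sha1(body).hexdigest() != digest:
            return None
        return body
```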
Of course, this whole new hash:// protocol is a non-starter - even with the current catastrophic worldwide finances, I probably have better chances to retire than to see it ever implemented. Why?
Because http://4e1243bd22c66e76c2ba9eddc1f91394e57f9f83.sha1.subdomain.tld/ could serve just as well...
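The rewrite between the two forms is purely mechanical. A sketch, with subdomain.tld standing in for whatever zone would actually be agreed upon:

```python
def to_http(url: str, zone: str = "subdomain.tld") -> str:
    """Rewrite hash://<algo>:<hex>/ into the equivalent
    http://<hex>.<algo>.<zone>/ form.  The zone name is a placeholder
    for a convention that would have to be agreed upon."""
    scheme, rest = url.split("://", 1)
    if scheme != "hash":
        raise ValueError("not a hash:// URL")
    algo, digest = rest.rstrip("/").split(":", 1)
    return f"http://{digest}.{algo}.{zone}/"
```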
Provided that it were agreed upon, such a "subdomain.tld" would mean that the originator says:
1) "I authorize anyone to make a cached copy of the content - deriving the algorithm from the subdomain name and the expected hash value from the sub-subdomain - and to serve that copy as an immediate replacement for the content I am offering at that URL."
2) "I authorize MITM on the subdomain.tld domain by your name servers, redirecting the lookups to whatever content servers exist at your install - given that the client can verify the content anyway."
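The client-side check implied by (2) is cheap: derive the algorithm and expected digest from the host name itself, hash whatever body arrives, and discard mismatches. A sketch, again with subdomain.tld as a placeholder zone:

```python
import hashlib

def verify_response(host: str, body: bytes, zone: str = "subdomain.tld") -> bool:
    """Check a body fetched from <hex>.<algo>.<zone> against its own
    host name.  Hosts outside the convention, unknown algorithms, and
    mismatched content all fail the check."""
    suffix = "." + zone
    if not host.endswith(suffix):
        return False
    labels = host[: -len(suffix)].split(".")
    if len(labels) != 2:
        return False
    digest, algo = labels
    try:
        return hashlib.new(algo, body).hexdigest() == digest
    except ValueError:  # hashlib does not know this algorithm
        return False
```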
Of course, you would not use this kind of URL directly, or for content that is still changing - so you'd use either an iframe-based container or a redirect.
But this seems to allow a few gradual steps towards content-addressed static data:
1) dedicated domain, same as now, RTT ~50-200 ms
2) caching at the ISP level, RTT ~30 ms
3) caching at the LAN level, RTT ~1 ms
4) caching at the user-agent level, RTT ~0.1 ms
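The four tiers above can be modeled as a single nearest-first lookup chain; because every hit is verified, a stale or misbehaving cache degrades into a miss rather than into wrong content. A sketch:

```python
import hashlib

def fetch(digest: str, tiers):
    """Walk the cache tiers nearest-first (user agent, LAN, ISP, origin).
    Each tier is just a callable taking a hex digest and returning bytes
    or None; the first *verified* hit wins, so lying caches are skipped."""
    for tier in tiers:
        body = tier(digest)
        if body is not None and hashlib.sha1(body).hexdigest() == digest:
            return body
    return None  # no verified copy anywhere; fall back to the plain URL
```

For example, a user-agent miss and a stale LAN copy would both be skipped in favor of a correct ISP-level copy further down the chain.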
So overall, it seems like a doable and reasonable approach with a fallback?
--a
p.s. This was for static content. You can do similar tricks for the code, but that's a topic for another day...