Gnutella Protocol Development

3.3 The download mesh

3.3.1 Purpose of the download mesh

The purpose of the download mesh is to help people find more
sources for the files they are looking for without having to
requery the network. These supplementary sources are called alternate
locations, or alt-locs in this document. Now that downloading the
same file from multiple sources is widely deployed (this is sometimes
called swarm downloading, although that expression has a more
specific meaning), servents benefit from knowing about several
sources for the files they want to download.

There is no original proposal for the download mesh. The roots of
the download mesh specification can be found in the HUGE proposal.
The wide adoption of the HUGE protocol led to the creation of the
download mesh, using URNs as the way to uniquely identify a given
file on the GNet. Note that not all parts of the HUGE specification
are related to the download mesh; for example, querying by URN is
unrelated to the way the download mesh works.

Basically, the solution chosen to construct the download mesh is to
try to make each servent aware, in a decentralized way, of the other
servents that share the same files on the GNet. There are many ways
to do that; the solution presented here was chosen as a good
compromise between efficiency and low bandwidth usage.

The download mesh is most effective at finding sources for popular
files that have many sources on the GNet. However, the goal is to
make the mesh work better for every file, including rare files.
Recent changes in the download mesh should help it get closer to
this goal.

3.3.2 Headers

The HUGE proposal uses two headers to indicate respectively the URN
associated with a file and known alternate locations for this file.
Those headers are X-Gnutella-Content-URN to give the URN of the file,
and X-Gnutella-Alternate-Location to indicate alternate locations for
that file. New headers were introduced in 2003 to construct a new
download mesh, because the older one suffered from poorly
implemented servents, which led the mesh to be mostly inefficient.
One of the reasons for introducing the new, smaller headers was to
use less bandwidth.

Former legacy format example :

   X-Gnutella-Alternate-Location: http://1.2.3.4:6546/uri-res/N2R?
urn:sha1:OJUNVQ75FQMZ5RXR3LJUDIQSGSVC5RFE 2002-12-27T12:35:51Z\r\n, 
http://1.2.3.5:6461/uri-res/N2R?
urn:sha1:OJUNVQ75FQMZ5RXR3LJUDIQSGSVC5RFE 2002-12-27T11:38:51Z
   X-Gnutella-Content-URN: urn:sha1:OJUNVQ75FQMZ5RXR3LJUDIQSGSVC5RFE

New concise format example :

   X-Alt: 1.2.3.4:6347,1.2.3.5
   X-Gnutella-Content-URN: urn:sha1:OJUNVQ75FQMZ5RXR3LJUDIQSGSVC5RFE

The X-Alt header is the replacement for the legacy
X-Gnutella-Alternate-Location header. The port number MAY be omitted
if it is 6346. Legacy format is allowed in X-Alt headers, but newer
clients SHOULD only send the new concise format.

If a servent implements PFSP, it SHOULD submit and accept partial
ranges available using the PFSP X-Available-Ranges header.

Servents implementing push proxy MAY also use another X-Alt format,
as follows :

   X-Alt: <GUID>;1.2.3.4:6346;1.2.3.5:6347
   
Again, the port MAY be omitted if it is 6346. <GUID> is the Base32
encoded version of the proxied host's 16-byte Gnutella GUID. Both
concise formats MAY also be mixed together by some vendors. Thus the
following header is valid :

   X-Alt: 1.2.3.4:6347,<GUID>;1.2.3.5;1.2.3.6:6347,1.2.3.7:6348
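As an illustration of the concise formats above, here is a minimal
Python sketch of an X-Alt parser. The function names parse_host and
parse_x_alt are hypothetical and not part of any servent. The sketch
assumes that entries are separated by commas, that an entry
containing semicolons is a push-proxy entry (GUID first, then its
proxies), and that the legacy URL format is handled elsewhere.

```python
DEFAULT_PORT = 6346  # port implied when an entry omits it


def parse_host(text):
    """Split 'ip' or 'ip:port' into (ip, port), applying the default port."""
    host, sep, port = text.strip().partition(":")
    return (host, int(port) if sep else DEFAULT_PORT)


def parse_x_alt(value):
    """Parse a concise X-Alt value (assumed layout, see lead-in).

    Returns (direct, proxied):
      direct  -- list of (ip, port) tuples
      proxied -- list of (guid, [(ip, port), ...]) tuples
    Legacy URL entries are NOT handled and raise ValueError.
    """
    direct, proxied = [], []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if ";" in entry:
            # Push-proxy entry: Base32 GUID followed by its proxies.
            guid, *proxies = entry.split(";")
            hosts = [parse_host(p) for p in proxies if p.strip()]
            proxied.append((guid.strip(), hosts))
        else:
            direct.append(parse_host(entry))
    return direct, proxied
```

For the header value "1.2.3.4:6347,1.2.3.5", this sketch yields two
direct locations, the second one on port 6346.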
 
There was no agreement on how to maintain compatibility with the
legacy servents using the older headers. Thus the legacy headers may
be considered deprecated. They SHOULD still be understood by newer
servents to benefit from the alt-locs given by older servents, but
the new concise alt-loc headers MUST be used by every servent
willing to participate in the new download mesh.

Thus servents SHOULD answer hosts that send old headers with legacy
headers, as this implies that the remote host is using the older
mesh. There is no harm in submitting alternate locations coming from
the older mesh, as they will be checked and dropped if they are
not valid.

In addition to the original proposal, a new X-NAlts header was also
added to indicate bad (expired, false or malicious) alternate
locations. The format of the header is the same as the X-Alt header.
Example : 

   X-NAlts: 1.2.3.4:6346, 1.2.3.5:6341
  
The port MAY be omitted if it is 6346. 

An alt-loc SHOULD be considered expired if a 404 HTTP response was 
received or if the socket couldn't connect to the remote host, 
probably meaning that the servent has disconnected, but SHOULD NOT
be considered expired when the server is busy (503 response), or
when a Requested Range Not Satisfiable (416 response) is received.
   
A servent that sends malformed HTTP headers SHOULD also be removed
from the mesh. Chances are that its download mesh implementation is
also bad, and thus it should be considered a bad alt-loc. If this
servent sends alt-locs, they SHOULD be discarded as well.

Servents SHOULD NOT add to the mesh uploaders which queued their
download requests, so that these uploaders will not be overloaded
with more download requests. Nor are such uploaders put in the bad
alt-locs, as the uploader exists and has the file. Some servents MAY
also do the same for busy servents (503 response).
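The rules above can be sketched as a small classification function.
This is only one reading of the text: the names Verdict and classify
are hypothetical, the malformed and queued flags are assumed to be
computed by the caller, and treating 200 and 206 as success is an
assumption. Responses the text does not cover, as well as 503 and
416, are left unlisted here (neither advertised nor reported as bad).

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    GOOD = "good"          # may be advertised in X-Alt
    BAD = "bad"            # reported in X-NAlts
    UNLISTED = "unlisted"  # neither advertised nor reported


def classify(status: Optional[int],
             malformed: bool = False,
             queued: bool = False) -> Verdict:
    """Classify one download attempt against an alt-loc.

    status    -- HTTP status code, or None if the connection failed
    malformed -- True if the remote sent malformed HTTP headers
    queued    -- True if the remote queued the download request
    """
    if malformed:
        return Verdict.BAD       # bad headers: treat as a bad alt-loc
    if status is None or status == 404:
        return Verdict.BAD       # unreachable or file not found: expired
    if queued:
        return Verdict.UNLISTED  # exists and has the file, but do not add
    if status in (200, 206):     # assumption: these mean a working source
        return Verdict.GOOD
    return Verdict.UNLISTED      # 503, 416 and anything not covered
```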

3.3.3 Description

When downloading a file from uploaders, the downloader SHOULD
inform the uploaders about the other locations it knows for this
file and from which it has successfully downloaded. The downloader
MUST NOT inform the uploader about alternate locations from which it
has not actually downloaded yet.

If, for example, the downloader has 10 locations and tries eight of
them, out of which the first five worked and the last three did not
work, each of the first five uploaders must be informed that the
last three uploaders are bad, and each of these good uploaders must
also be informed that the other four good uploaders are good. The
downloader says nothing about the last two locations: because it has
not tried them, it has no way of deciding whether they are good or
bad. If there are many alt-locs available, the servent should not
submit too many of them, in order to spare bandwidth. A maximum of
10 alternate locations for a given file is suggested.
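The example above can be sketched in Python as follows. The names
fmt and feedback_headers are hypothetical. Applying the suggested
limit of 10 separately to the good list and to the bad list is an
assumption, since the text does not say how the limit is split.

```python
MAX_ALT_LOCS = 10    # suggested maximum per file
DEFAULT_PORT = 6346  # port that MAY be omitted


def fmt(ip, port):
    """Render one location in the concise format."""
    return ip if port == DEFAULT_PORT else f"{ip}:{port}"


def feedback_headers(uploader, good, bad):
    """Headers a downloader sends to one uploader it downloaded from.

    good -- (ip, port) locations it has successfully downloaded from
    bad  -- (ip, port) locations it tried and found bad
    Untried locations must not appear in either list.
    """
    # Never tell an uploader about itself.
    others = [loc for loc in good if loc != uploader][:MAX_ALT_LOCS]
    headers = {}
    if others:
        headers["X-Alt"] = ",".join(fmt(*loc) for loc in others)
    if bad:
        headers["X-NAlts"] = ",".join(fmt(*loc) for loc in bad[:MAX_ALT_LOCS])
    return headers


# Five good uploaders, three bad ones; two untried locations are omitted.
good = [(f"1.2.3.{i}", 6346) for i in range(1, 6)]
bad = [(f"1.2.3.{i}", 6347) for i in range(6, 9)]
for uploader in good:
    print(fmt(*uploader), feedback_headers(uploader, good, bad))
```

Each good uploader receives the four other good locations in X-Alt
and the three bad locations in X-NAlts.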

To submit alt-locs (good or bad) to an uploader, the downloader has
two options. If it implements download by chunk and the download is
still in progress, it SHOULD submit alt-locs when requesting the
next chunks of data. If not, then it SHOULD implement HEAD requests,
and send one after the file has been downloaded, carrying the
alt-locs to submit.
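A minimal sketch of the two options, building the request text
only. The function name build_feedback_request is hypothetical. The
/uri-res/N2R? request path is taken from the legacy example in
3.3.2; sending the Host, Range and X-Gnutella-Content-URN headers in
this way is an assumption about a typical HTTP/1.1 request, not a
requirement stated in this section.

```python
def build_feedback_request(host, urn, good, bad, byte_range=None):
    """Return request text carrying alt-locs to one uploader.

    good, bad  -- lists of already formatted 'ip' or 'ip:port' strings
    byte_range -- (first, last) for a chunk request while the download
                  is in progress, or None for a HEAD request sent
                  after the download has finished
    """
    method = "GET" if byte_range is not None else "HEAD"
    lines = [f"{method} /uri-res/N2R?{urn} HTTP/1.1", f"Host: {host}"]
    if byte_range is not None:
        first, last = byte_range
        lines.append(f"Range: bytes={first}-{last}")
    lines.append(f"X-Gnutella-Content-URN: {urn}")
    if good:
        lines.append("X-Alt: " + ",".join(good))
    if bad:
        lines.append("X-NAlts: " + ",".join(bad))
    # HTTP header lines end with CRLF; a blank line ends the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


urn = "urn:sha1:OJUNVQ75FQMZ5RXR3LJUDIQSGSVC5RFE"
print(build_feedback_request("1.2.3.9:6346", urn,
                             ["1.2.3.4:6347", "1.2.3.5"], ["1.2.3.6"]))
```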

Similarly, the uploader stores the alternate locations given
to it by each downloader, and sends them back to all the other
downloaders of the same file. The difference in this case is that the
uploader will not send requests to the alternate locations to check
their validity. This would cause too much unneeded traffic, as the
uploader has no other reason to connect to the alternate locations
indicated by the downloaders. Instead, with the scheme described here
the uploader relies on the downloaders to verify the validity of the
alt-locs, as part of their function. Unlike downloaders, uploaders
MUST NOT submit bad alternate locations.

The fact that the uploader has no way to check the validity of the
alternate locations was the main flaw in the initial download mesh
mechanism, and that is one of the reasons which led to changing the
initial specification, notably to add a way to remove bad alternate
locations from the download mesh. However, a good download mesh
implementation can avoid this issue.

Alternate locations can also be obtained from QueryHits replying to
Queries submitted by the user, when the hash value is included in the
QueryHit. In this case the host's address MAY be added to the
mesh once it has been checked that this is a valid alternate location.
Also, if the host sending the QueryHit implements GGEP, it SHOULD
send an ALT GGEP extension (see 3.3.4). These alt-locs, as always,
MUST be checked by the downloader before being submitted.

The good practices to keep a high quality download mesh are as 
follows :

1. Test alt-locs before forwarding them. Downloading clients MUST
   test every alternate location before submitting it to their
   uploaders, using the X-Alt header for good alt-locs and the
   X-NAlts header for bad alt-locs. Each known alt-loc (good or bad)
   SHOULD be submitted to each uploader after the test.

2. Inform uploaders about bad locations. As uploaders have no way to
   know when an entry expires, a downloader MUST inform the uploader
   about every bad alt-loc it knows.

3. Clean expired entries. The uploaders MUST notably remove
   alt-locs that are submitted using the X-NAlts header. Uploaders
   SHOULD have some tolerance though, and not remove the host from
   their list of alternate locations unless two (maybe three)
   downloaders failed to download from the host. This also helps
   against malicious servents trying to destroy the mesh.

4. Minimize transfers. Alt-locs should be exchanged between servents
   as often as necessary, but no more often. Hence, a servent SHOULD
   NOT send the same alt-locs more than once to another servent.
   Similarly, it should not submit the same bad alt-loc more than
   once.

Points 1, 2 and 3 are the absolute prerequisites for participating
in the new download mesh (via the new concise headers).
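On the uploader side, points 3 and 4 could be sketched as below.
The class name UploaderMesh and its methods are hypothetical. The
threshold of two reports follows the "two (maybe three)" suggestion;
counting distinct downloaders rather than raw reports is an
assumption, made so that one servent repeating X-NAlts cannot remove
a location on its own.

```python
from collections import defaultdict

BAD_REPORT_THRESHOLD = 2  # "two (maybe three)" downloaders


class UploaderMesh:
    """Alt-locs an uploader keeps for one file (illustrative only)."""

    def __init__(self):
        self.locations = set()                 # alt-locs believed good
        self.bad_reporters = defaultdict(set)  # alt-loc -> reporters
        self.sent = defaultdict(set)           # downloader -> alt-locs sent

    def add_good(self, loc):
        """Record a location received in an X-Alt header."""
        self.locations.add(loc)

    def report_bad(self, downloader, loc):
        """Record an X-NAlts report; drop loc after enough reporters."""
        if loc not in self.locations:
            return
        self.bad_reporters[loc].add(downloader)
        if len(self.bad_reporters[loc]) >= BAD_REPORT_THRESHOLD:
            self.locations.discard(loc)
            del self.bad_reporters[loc]

    def to_send(self, downloader, limit=10):
        """Return alt-locs not yet sent to this downloader (point 4)."""
        fresh = self.locations - self.sent[downloader] - {downloader}
        batch = sorted(fresh)[:limit]
        self.sent[downloader].update(batch)
        return batch
```

A second call to to_send for the same downloader returns only
locations learned since the first call.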

There are various options for implementing point 3. Some vendors
make their entries expire after a given amount of time (for instance,
two hours). Other vendors cycle their alt-locs so that each of
them is submitted regularly to the downloaders, which can in turn
notify the uploader when a bad location is found among the alt-locs
it submitted. The second solution is better, as it ensures that
every alt-loc will be tested regularly by the downloaders. Both
solutions can be combined, notably to take into account the
possibility that some wrongly implemented downloaders will not give
feedback on the expired entries.
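Combining the two options might look like the sketch below. The
class name CyclingAltLocs is hypothetical, the two-hour maximum age
is the example value from the text, and the injectable clock
parameter exists only to make the behaviour easy to check.

```python
import time
from collections import deque

MAX_AGE_SECONDS = 2 * 60 * 60  # "two hours", the example value above


class CyclingAltLocs:
    """Round-robin list of alt-locs with time-based expiry (sketch)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.queue = deque()  # (loc, time added); next to send on the left

    def add(self, loc):
        """Add a location, or refresh its timestamp if already known."""
        self.queue = deque((l, t) for l, t in self.queue if l != loc)
        self.queue.append((loc, self.clock()))

    def next_batch(self, count):
        """Return up to count locations, rotating so all get a turn."""
        now = self.clock()
        # Time-based expiry covers downloaders that never give feedback.
        self.queue = deque((l, t) for l, t in self.queue
                           if now - t <= MAX_AGE_SECONDS)
        batch = []
        for _ in range(min(count, len(self.queue))):
            item = self.queue.popleft()
            batch.append(item[0])
            self.queue.append(item)  # move to the back of the cycle
        return batch
```

With three stored locations and batches of two, successive calls
return the first two, then the third followed by the first again, so
every location is eventually handed to a downloader for testing.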

For servents implementing PFSP there are some additional
requirements; see section 3.3.5 below.

Under these rules, the alternate locations are propagated through the
download mesh from uploader (source of the file) to uploader, using
the downloaders to check the alt-locs and then submit them to the
other uploaders of the same file.

Bad alt-locs are removed from the mesh with the use of X-NAlts
headers, which allow the downloaders to notify each uploader that
submitted a bad location. Remember, though, that unlike X-Alt
headers, X-NAlts headers are not propagated further by the uploaders.

The downloaders are doing most of the maintenance work on the 
download mesh, while the uploaders are blindly trusting the
downloaders. The advantage of this scheme is that it benefits from
the fact that downloaders will naturally search for and find new
alternate locations while downloading a file from multiple sources,
and thus can maintain the download mesh with a very low bandwidth
cost.

3.3.4 GGEP extension

Servents implementing GGEP SHOULD use an ALT GGEP extension to
submit alternate locations in QueryHits.

If a server has alt-locs for a file whose hash matches the hash in a
query it receives, the server SHOULD send alternate locations in the
QueryHit using the GGEP extension. See the GGEP ALT extension in
appendix C.

3.3.5 Additional requirements

The basic requirement for a servent is to implement the HUGE
specification. However, some additional features may benefit
the download mesh.

A server implementing PFSP MUST add itself as an alternate location.
It SHOULD do so when requesting the second chunk of data (or
alternatively, although the first way is preferred, it MAY add itself
to the mesh by sending a HEAD request at the end of the download).
It is assumed that servents which are not PFSP-aware will simply not
be able to use those partial sources, but they will propagate them
anyway to other servents (because 503 and 416 responses do not stop
the alternate locations from being propagated), which may use them
if they implement PFSP.

A servent which does not implement PFSP but does implement HTTP HEAD
requests MAY send a HEAD request to the uploaders once it has
finished downloading the file, to add itself to the download mesh.

Persistent connections are not required. However, they can help the
download mesh logic avoid sending duplicate alternate locations to
the same servent.


3.3.6 Sources

- HUGE Proposal v0.94 : 
  http://rfc-gnutella.sf.net/src/draft-gdf-huge-0_94.txt
- PFSP v1.0 : 
  http://rfc-gnutella.sf.net/src/Partial_File_Sharing_Protocol_1.0.txt
- HTTP/1.1 : http://www.w3.org/Protocols/rfc2616/rfc2616.html

3.3.7 Credits

These specifications were written by Mathias Bollaert and Sumeet
Thadani (LimeWire LLC), from ideas discussed and agreed on in the
GDF. Andrew Mickish from Freepeers (BearShare) proposed the best
practices.


 

 

 
