Gnutella Protocol Development

4.6 Link compression extension

Compression of the message traffic over a Gnutella connection is 
OPTIONAL. The following assumes the "deflate" scheme is used, but any
compression algorithm MAY be used.

Compression is meant to be done on a per-connection basis.  The 
"deflate" scheme is handled by the www.zlib.org library (the deflate/
inflate routines). Each end of a connection therefore keeps its own
compression dictionary and history, which gives a better compression
ratio than compressing each message individually.

However, this stream compression of the traffic means that we need to
individually compress the same packet on each connection, using a 
dedicated compressing state maintained per connection.
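
The per-connection state described above can be sketched in Python
with the standard zlib module. This is only an illustration, not part
of the specification: the CompressedLink class and its method names
are hypothetical, and flushing after every message is a simplification.

```python
import zlib


class CompressedLink:
    """Hypothetical holder for one connection's deflate/inflate state."""

    def __init__(self):
        # One compressor and one decompressor per connection, never shared.
        self._deflater = zlib.compressobj()
        self._inflater = zlib.decompressobj()

    def encode(self, message: bytes) -> bytes:
        # Simplification: flush after each message so the peer can decode it.
        return (self._deflater.compress(message)
                + self._deflater.flush(zlib.Z_SYNC_FLUSH))

    def decode(self, data: bytes) -> bytes:
        return self._inflater.decompress(data)


# A message with no internal redundancy: compressing it alone gains nothing.
message = bytes(range(256))

sender = CompressedLink()     # our end of the connection
receiver = CompressedLink()   # the remote end of the same connection

first = sender.encode(message)
second = sender.encode(message)   # same bytes again, history now helps

decoded_first = receiver.decode(first)
decoded_second = receiver.decode(second)

standalone = zlib.compress(message)  # per-message compression, no history
print(len(first), len(second), len(standalone))
```

The second copy of the message should come out much smaller than the
first, because the stream history already contains it. A servent
relaying one packet to several neighbours would call encode() on each
neighbour's own CompressedLink.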

Because compression algorithms don't necessarily produce output when 
fed input (e.g. if you feed them with "aaaaa", they'll wait for the 
next char, and will do so until it's a "b" for instance), it is 
necessary to periodically direct them to flush output.  Flushing comes
at a cost, however: everything sent so far becomes inflatable by the
decompressor, but only because the compressor also emits the
dictionary information needed at that point, which lowers the
compression ratio.
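
The effect of flushing can be observed with Python's zlib module. The
sketch below is illustrative only: the choice of Z_SYNC_FLUSH, the
stream_size helper and the sample messages are assumptions, not
requirements of this document.

```python
import zlib

deflater = zlib.compressobj()
inflater = zlib.decompressobj()

# Feed a tiny input: zlib keeps it buffered, so nothing decodable comes out yet.
pending = deflater.compress(b"aaaaa")
decoded_before_flush = inflater.decompress(pending)

# Z_SYNC_FLUSH forces out everything buffered so far, on a byte boundary.
flushed = deflater.flush(zlib.Z_SYNC_FLUSH)
decoded_after_flush = inflater.decompress(flushed)


def stream_size(messages, flush_every):
    """Total bytes on the wire when flushing after every `flush_every` messages."""
    compressor = zlib.compressobj()
    total = 0
    for count, msg in enumerate(messages, 1):
        total += len(compressor.compress(msg))
        if count % flush_every == 0:
            total += len(compressor.flush(zlib.Z_SYNC_FLUSH))
    return total


# Hypothetical sample traffic: 100 short, similar messages.
messages = [b"query: linux iso %d" % i for i in range(100)]
size_flush_each = stream_size(messages, 1)     # flush after every message
size_flush_once = stream_size(messages, 100)   # flush once at the end
print(size_flush_each, size_flush_once)
```

Flushing after every message keeps latency low but produces a larger
stream than flushing once. How often to flush is a trade-off left to
the implementation.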

To negotiate compression, we make full use of the 3-way handshake.
The idea is that decompression is fast, so it's OK to be sent
compressed data, but the compressing side must decide, based on the
resources it has available, whether it will compress or not.

Basically, the side supporting decompression will say:

   Accept-Encoding: deflate

Note that although we only specify "deflate" here, the servent MAY
advertise several compression algorithms it knows, separating the
items with a ",".

And to accept compression, the other side acknowledges by sending:

   Content-Encoding: deflate

The servent picks one compression scheme it supports from those
advertised by the remote end in the Accept-Encoding line.
The Content-Encoding MUST contain only one value.
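
The selection rule above could be sketched as follows. The function
name, the SUPPORTED_ENCODINGS tuple and the case-insensitive
comparison are assumptions made for illustration; this document only
requires that a single advertised scheme be chosen.

```python
# Assumption: the schemes this servent is able to compress with,
# in order of preference.
SUPPORTED_ENCODINGS = ("deflate",)


def choose_content_encoding(accept_encoding, supported=SUPPORTED_ENCODINGS):
    """Pick one scheme from the remote Accept-Encoding value, or None."""
    if not accept_encoding:
        return None  # remote end did not advertise any scheme
    # Items are separated by ","; surrounding spaces are ignored here.
    advertised = [item.strip().lower() for item in accept_encoding.split(",")]
    for name in supported:
        if name in advertised:
            return name  # exactly one value goes into Content-Encoding
    return None


print(choose_content_encoding("deflate"))
print(choose_content_encoding("gzip, deflate"))
print(choose_content_encoding("gzip"))
```

A result of None means the servent sends no Content-Encoding line and
transmits uncompressed data.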

This also means that compression is asymmetric: a node can send
compressed data while receiving uncompressed data.

Here's an example where both nodes support compression. Comments
start with "--", and the trailing <cr><lf> of each line is removed
for clarity:

   GNUTELLA CONNECT/0.6
   Accept-Encoding: deflate       -- OK for reception of compressed data

       GNUTELLA/0.6 200 OK
       Accept-Encoding: deflate   -- I can also receive compressed data
       Content-Encoding: deflate  -- And I will send compressed data

   GNUTELLA/0.6 200 OK
   Content-Encoding: deflate      -- OK, will also compress data

   

Here's an example where compression is applied only to the data
transmitted by the first node (A is the node initiating the 
handshake, B is the node replying):

   GNUTELLA CONNECT/0.6
   Accept-Encoding: deflate      -- OK for reception of compressed data

       GNUTELLA/0.6 200 OK
       Accept-Encoding: deflate  -- I can also receive compressed data
                                 -- I refuse to compress data, sorry

   GNUTELLA/0.6 200 OK
   Content-Encoding: deflate     -- OK, I will compress data sent
                                 -- But I will receive uncompressed data

   <flow from A->B is compressed, flow from B->A is not>
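
The outcome of the two handshakes above can be derived mechanically
from the headers exchanged. The following sketch assumes the headers
of each handshake step are already parsed into dictionaries; the
function name and that representation are hypothetical.

```python
def negotiated_encodings(connect_headers, reply_headers, final_headers):
    """Return the encoding used in each direction, or None if uncompressed.

    Each argument is a dict of already-parsed handshake headers
    (hypothetical representation): A's connect request, B's reply,
    and A's final acknowledgement.
    """
    def accepted(headers):
        value = headers.get("Accept-Encoding", "")
        return {item.strip() for item in value.split(",") if item.strip()}

    def chosen(sender_headers, receiver_headers):
        encoding = sender_headers.get("Content-Encoding")
        # A sender may only use a scheme the receiver advertised.
        if encoding is not None and encoding.strip() in accepted(receiver_headers):
            return encoding.strip()
        return None

    return {
        "A->B": chosen(final_headers, reply_headers),
        "B->A": chosen(reply_headers, connect_headers),
    }


# First example: both nodes compress.
both = negotiated_encodings(
    {"Accept-Encoding": "deflate"},
    {"Accept-Encoding": "deflate", "Content-Encoding": "deflate"},
    {"Content-Encoding": "deflate"},
)
# Second example: B refuses to compress.
one_way = negotiated_encodings(
    {"Accept-Encoding": "deflate"},
    {"Accept-Encoding": "deflate"},
    {"Content-Encoding": "deflate"},
)
print(both)
print(one_way)
```

Each direction is decided independently, which is what makes the
settings asymmetric.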

Even though GGEP payloads (see Appendix 1) can be compressed, and 
this information is visible in the GGEP header, it is not advisable 
to decompress those payloads before sending them to the compressing 
layer.  The deflate algorithm does not expand already-compressed data
by a large factor: it emits such data as clearly marked
non-compressible data (the overhead is limited to roughly 0.1%). If
connection compression is widely used on the Gnutella network,
individual GGEP extensions SHOULD NOT be compressed.
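
The small overhead on incompressible data can be checked with a short
experiment. In this sketch, seeded pseudo-random bytes stand in for an
already-compressed GGEP payload; that substitution and the 64 KiB
payload size are assumptions chosen for illustration.

```python
import random
import zlib

# Assumption: seeded pseudo-random bytes stand in for an already-compressed
# GGEP payload, since both look incompressible to deflate.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(65536))

# Pass the payload through a connection-level deflate stream.
link = zlib.compressobj()
wire = link.compress(payload) + link.flush(zlib.Z_SYNC_FLUSH)

# Relative growth caused by the compressing layer.
overhead = (len(wire) - len(payload)) / len(payload)
restored = zlib.decompressobj().decompress(wire)
print(f"{len(payload)} bytes in, {len(wire)} bytes out, overhead {overhead:.3%}")
```

For very small payloads the fixed cost of the stream header and of
each flush weighs more heavily, so the relative overhead is larger
than for the payload size used here.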
