Document Revision Version 0.51
Protocol Version 0.5
February 4, 2002
Jason Thomas (jason@jasonthomas.com)
Document Revision | Protocol Version | Date | Changes |
---|---|---|---|
0.51 | 0.5 | 02/04/2002 | |
0.5 | 0.5 | 01/27/2002 | |
0.4 | N/A | 01/15/2002 | |
0.31 | N/A | 01/08/2002 | |
0.3 | N/A | 01/03/2002 | |
0.2 | N/A | 12/20/2001 | |
0.11 | N/A | 12/17/2001 | |
0.1 | N/A | 12/13/2001 | |
The Gnutella 0.4 protocol is a bare-bones protocol for sharing files. Over time, servent implementers have targeted different sections of the protocol for extensions. One example is the BearShare trailer that was added to the Query Hit.
Today, proposals exist to pack yet more data into the existing protocol. Unfortunately, many of these were designed for a single purpose and will close off future extensions.
This document describes a standardized format for the creation of arbitrary new extensions. This new standard allows for:
Servents that support the forwarding of all packets that contain GGEP extensions (whether or not they can process them) must include a new header in the Gnutella 0.6 connection handshake indicating this support. This lets other servents know what types of packets this servent can accept. The format of this header is:
GGEP : majorversion'.'minorversion
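As a rough illustration, a servent might parse the value of this header as sketched below. The function name `parse_ggep_version` is hypothetical, and the sketch assumes the header name and colon have already been split off by the handshake parser and that the value may carry surrounding whitespace.

```c
#include <stdio.h>

/* Hypothetical helper: parse the value part of a "GGEP" handshake header,
 * e.g. "0.5", into major and minor version numbers.
 * Returns 1 on success, 0 if the value is malformed. */
int parse_ggep_version(const char *value, int *major, int *minor)
{
    char trailing;
    /* %d skips leading whitespace; " %c" only converts if junk follows
     * the minor number, in which case sscanf returns 3 instead of 2. */
    int n = sscanf(value, "%d.%d %c", major, minor, &trailing);
    return n == 2 && *major >= 0 && *minor >= 0;
}
```

A servent that receives an unparsable value would presumably treat the peer as not supporting GGEP, although the text above does not say so explicitly.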
The length field uses an encoding technique that ensures that 0x0 is never the value of any byte. Steps were also taken to ensure that the encoding is compact. In this technique, a length field is the concatenation of length chunks. The format of each length chunk (which contains 6 bits of length info) is described at the bit level below:
    76543210
    MLxxxxxx
M = 1 if there is another length chunk in the sequence, else 0
L = 1 if this is the last length chunk in the sequence, else 0
xxxxxx = 6 bits of data.
01aaaaaa ==> aaaaaa (2^6 values = 0-63)
10bbbbbb 01aaaaaa ==> bbbbbbaaaaaa (2^12 values = 0-4095)
10cccccc 10bbbbbb 01aaaaaa ==> ccccccbbbbbbaaaaaa (2^18 values = 0-262143)
As shown, when the bits are concatenated, the number is in big-endian format (most significant chunk first).
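    int length = 0;
    byte b;  /* assumed to be a typedef for unsigned char */
    do {
        b = *extensionbuf++;
        length = (length << 6) | (b & 0x3f);
    } while (0x40 != (b & 0x40));  /* parentheses needed: != binds tighter than & */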
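The decoding loop above has a natural counterpart on the sending side. The following is a minimal sketch of an encoder, not part of the specification: the name `ggep_encode_length` is hypothetical, and it assumes the sender always emits the shortest form (1 to 3 chunks).

```c
#include <stddef.h>

/* Hypothetical encoder for the length format described above.
 * Writes 1-3 chunk bytes to out (most significant chunk first) and
 * returns the number of bytes written, or 0 if length needs more
 * than 18 bits. */
size_t ggep_encode_length(unsigned long length, unsigned char *out)
{
    size_t n = 0;

    if (length > 0x3FFFFUL)
        return 0;
    if (length > 0xFFFUL)   /* third chunk needed: M=1, L=0 */
        out[n++] = (unsigned char)(0x80 | ((length >> 12) & 0x3F));
    if (length > 0x3FUL)    /* second chunk needed: M=1, L=0 */
        out[n++] = (unsigned char)(0x80 | ((length >> 6) & 0x3F));
    /* final chunk: M=0, L=1 */
    out[n++] = (unsigned char)(0x40 | (length & 0x3F));
    return n;
}
```

Because every chunk has either the M or the L bit set, no emitted byte can be 0x0, even when the length itself is zero (which encodes as the single byte 0x40).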
GGEP Extension Prefix |
---|
GGEP Extension Header 0 |
GGEP Extension Data 0 |
. |
. |
. |
GGEP Extension Header N |
GGEP Extension Data N |
Extension blocks may contain an arbitrary number of GGEP blocks packed one against another. Although this behavior is allowed, developers are encouraged to merge multiple GGEP blocks into a single GGEP block. If a newer extension format is created (either a new version of GGEP or another format altogether), it will appear AFTER the last GGEP block of an earlier version.
GGEP Block 0 |
---|
. |
. |
. |
GGEP Block N |
Byte Positions | Name | Comments |
---|---|---|
0 | Magic | This magic number is used to help distinguish GGEP extensions from legacy data that may exist. It must be set to the value 0xC3. |
GGEP Extension Header
Field Order | Bytes Required | Name | Comments |
---|---|---|---|
0 | 1 | Flags | These are options which describe the encoding of the extension header and data. |
1 | 1-15 | ID | The raw binary data in this field is the extension ID. See Appendix A for suggestions on creating extension IDs. No byte in the extension header may be 0x0. |
2 | 1-3 | Data Length | This is the length of the raw extension data. This field is encoded according to the length encoding rules listed above. |
Bit Positions | Name | Comments |
---|---|---|
7 | Last Extension | When set, this is the last extension in the GGEP block. |
6 | Encoding | The value contained in this field dictates the type of encoding which should be applied to the extension data (after possible compression). |
5 | Compression | The value contained in this field dictates the type of compression that should be applied to the extension data. |
4 | Reserved | This field is currently reserved for future use. It must be set to 0. |
3-0 | ID Len | Value 1-15 can be stored here. Since this will not be zero, it ensures this byte will not be 0x0. |
Values | Types |
---|---|
0 | There is no encoding on the data. |
1 | The data is encoded using the COBS scheme. |
Values | Types |
---|---|
0 | The extension data has not been compressed. |
1 | The extension data should be decompressed using the deflate algorithm. |
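To show how the prefix, header, flags, and length fields fit together, here is a minimal sketch of a routine that walks one GGEP block. All names (`ggep_walk_block`, `ggep_ext_fn`, `ggep_count_ext`, the `GGEP_*` macros) are hypothetical. The sketch makes two assumptions that go beyond the text: it rejects a header whose reserved bit is set (a more lenient parser might ignore it), and it hands the data to the callback as-is, without COBS decoding or inflating.

```c
#include <stddef.h>

#define GGEP_MAGIC         0xC3
#define GGEP_FLAG_LAST     0x80  /* bit 7: last extension in the block */
#define GGEP_FLAG_COBS     0x40  /* bit 6: data is COBS encoded */
#define GGEP_FLAG_DEFLATE  0x20  /* bit 5: data is deflate compressed */
#define GGEP_FLAG_RESERVED 0x10  /* bit 4: must be 0 */
#define GGEP_MASK_ID_LEN   0x0F  /* bits 3-0: ID length, 1-15 */

/* Hypothetical callback: receives one extension, data still encoded. */
typedef void (*ggep_ext_fn)(unsigned char flags,
                            const unsigned char *id, size_t id_len,
                            const unsigned char *data, size_t data_len,
                            void *ctx);

/* Walks one GGEP block that starts at buf[0] (the magic byte).
 * Returns the number of bytes consumed, or 0 if the block is malformed. */
size_t ggep_walk_block(const unsigned char *buf, size_t len,
                       ggep_ext_fn fn, void *ctx)
{
    size_t pos = 0;
    unsigned char flags;

    if (len == 0 || buf[pos++] != GGEP_MAGIC)
        return 0;
    do {
        size_t id_len, id_pos, data_len = 0;
        int chunks = 0;
        unsigned char b;

        if (pos >= len)
            return 0;
        flags = buf[pos++];
        id_len = flags & GGEP_MASK_ID_LEN;
        if (id_len == 0 || (flags & GGEP_FLAG_RESERVED) || id_len > len - pos)
            return 0;
        id_pos = pos;
        pos += id_len;

        do {  /* length chunks: 10xxxxxx ... 01xxxxxx, at most three */
            if (pos >= len || ++chunks > 3)
                return 0;
            b = buf[pos++];
            if (((b & 0x80) != 0) == ((b & 0x40) != 0))
                return 0;  /* exactly one of M and L must be set */
            data_len = (data_len << 6) | (b & 0x3F);
        } while (!(b & 0x40));

        if (data_len > len - pos)
            return 0;
        if (fn)
            fn(flags, buf + id_pos, id_len, buf + pos, data_len, ctx);
        pos += data_len;
    } while (!(flags & GGEP_FLAG_LAST));
    return pos;
}

/* Example callback: counts extensions; ctx must point to an int. */
void ggep_count_ext(unsigned char flags, const unsigned char *id, size_t id_len,
                    const unsigned char *data, size_t data_len, void *ctx)
{
    (void)flags; (void)id; (void)id_len; (void)data; (void)data_len;
    ++*(int *)ctx;
}
```

Returning the number of bytes consumed lets a caller continue at the next byte, which is useful when several GGEP blocks are packed one against another.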
The Clip2 document states that ping messages have no payloads. Given this definition, existing servents drop connections that issue pings containing payloads. As such, developers are advised to allow widespread distribution of GGEP-enabled servents before releasing extensions for the ping message. Similarly, they should only forward ping messages containing GGEP extensions to other servents that have indicated their support of GGEP via the handshake header.
Servents are instructed to forward all ping messages containing GGEP blocks they do not understand regardless of ping/pong reduction schemes.
The payload of the ping message is now defined to be:
Extension Block |
Because there are currently no extensions to the pong message and the last field has a fixed length, it is easy to extend this message to include GGEP. However, since the pong message currently has a fixed length, existing servents may drop connections if they receive pongs containing extensions. To this end, developers are advised to include GGEP blocks only in response to ping messages containing GGEP blocks, as that will guarantee that the pathway is GGEP enabled.
The payload of the pong message is now defined to be:
Port |
IP Address |
Num Files Shared |
Num KBytes Shared |
Extension Block |
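Since the legacy pong fields have a fixed total size, locating the extension block is a matter of skipping them. The sketch below assumes the Gnutella 0.4 field sizes (2-byte port, 4-byte IP address, 4-byte file count, 4-byte kilobyte count, 14 bytes in total), which are not restated in this section; the helper name `pong_extension_block` is hypothetical.

```c
#include <stddef.h>

/* Assumed Gnutella 0.4 sizes: port 2 + IP 4 + files 4 + kbytes 4. */
#define PONG_FIXED_LEN 14

/* Hypothetical helper: returns a pointer to the extension block of a
 * pong payload and stores its length in *ext_len. Returns NULL if the
 * payload is a plain legacy pong or is too short. */
const unsigned char *pong_extension_block(const unsigned char *payload,
                                          size_t payload_len,
                                          size_t *ext_len)
{
    if (payload_len <= PONG_FIXED_LEN)
        return NULL;
    *ext_len = payload_len - PONG_FIXED_LEN;
    return payload + PONG_FIXED_LEN;
}
```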
The Query message is redefined in a way that does not break Gnutella 0.4 compatible servents.
MinSpeed |
Criteria |
0x0 |
Extension Block |
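In this layout the extension block starts right after the 0x0 that terminates the search criteria, which is why older servents that stop reading at the terminator are unaffected. A minimal sketch of locating it follows; it assumes the 2-byte MinSpeed field of Gnutella 0.4, and the helper name `query_extension_block` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the extension block of a
 * query payload (MinSpeed, Criteria, 0x0, Extension Block) and stores
 * its length in *ext_len. Returns NULL if there is no terminator or
 * nothing follows it. MinSpeed is assumed to be 2 bytes. */
const unsigned char *query_extension_block(const unsigned char *payload,
                                           size_t payload_len,
                                           size_t *ext_len)
{
    const unsigned char *nul;

    if (payload_len < 3)
        return NULL;
    /* Search for the criteria terminator after the MinSpeed field. */
    nul = (const unsigned char *)memchr(payload + 2, 0, payload_len - 2);
    if (nul == NULL)
        return NULL;
    *ext_len = payload_len - (size_t)(nul + 1 - payload);
    return *ext_len == 0 ? NULL : nul + 1;
}
```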
Until now, servent vendors have been left to define the format of this opaque field. Those that write to this field only read from it if they recognize their own vendor code. We must therefore first indicate to all clients that the private data section contains a GGEP block, so they know to crack open the field. To do this, we use open data bit 5 of the Flags and Flag2 fields (the format of these fields is defined in the Clip2 document). Remember that the ability to crack the GGEP block does not mean that one is able to understand the extensions contained within.
For compatibility with a couple of existing servents that already use this field, it is necessary to search for the GGEP block by looking for the first occurrence of the GGEP magic byte.
Private Data Format for servents that already use the private area and are trying to retain compatibility with older versions of their code. This format will be phased out over time:
Servent-specific private data, kept for backwards compatibility, that is guaranteed not to contain the GGEP magic byte | 0-XXX bytes (usually 1) |
GGEP Block | YYY bytes |
More servent-specific private data, kept for backwards compatibility, that can be completely ignored | 0-ZZZ bytes |
Private Data Format for new servents:
Extension Block |
Note that the code necessary to find the GGEP block in both formats is identical.
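A minimal sketch of that shared search is shown below. The helper name `find_ggep_block` is hypothetical; it relies only on the rule that any bytes preceding the GGEP block never contain the magic byte 0xC3.

```c
#include <stddef.h>
#include <string.h>

#define GGEP_MAGIC 0xC3

/* Hypothetical helper: returns a pointer to the first GGEP magic byte
 * in the private data area, or NULL if none is present. Works for both
 * formats: in the new format the match is simply at offset 0. */
const unsigned char *find_ggep_block(const unsigned char *data, size_t len)
{
    if (len == 0)
        return NULL;
    return (const unsigned char *)memchr(data, GGEP_MAGIC, len);
}
```

A caller would typically run this search only when the open data bit announcing a GGEP block is set, and then parse the block starting at the returned pointer.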
Servent vendors must be careful to ensure that 0x0 does not appear in any extension data embedded in the Query Hit Result. This can be done by using any of the available encoding options.
File Index |
File Size |
File Name |
0x0 |
Extension Block |
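Before embedding extension data in a result, a sender could check whether the raw bytes are already free of 0x0 and switch on an encoding such as COBS only when they are not. The check itself is small; the name `ggep_is_null_free` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the buffer contains no 0x0 byte
 * and can be embedded without further encoding, else 0. */
int ggep_is_null_free(const unsigned char *data, size_t len)
{
    return len == 0 || memchr(data, 0, len) == NULL;
}
```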
Because there are currently no extensions to the push message and the last field has a fixed length, it is easy to extend this message to include GGEP. However, since the push message currently has a fixed length, old servents may validate against that length and throw out push messages that include GGEP extensions. To this end, servents should only send push messages containing extension blocks to other servents that have indicated GGEP support via the connection handshake.
The payload of the push message is now defined to be:
Servent ID |
File Index |
IP Address |
Port |
Extension Block |
The Extension ID field in the GGEP header is a binary field consisting of between 1 and 15 bytes. It cannot contain the byte 0x0, and one must be able to compare IDs with a simple binary comparison. Aside from those rules, GGEP does not mandate any particular format, but does encourage the creation of short IDs that are free from conflicts. One should also note that Extension IDs are meant to be consumed by machines. To this end, the following techniques are recommended. If one has a strong need to create an alternative format, be sure to avoid conflicts with the following schemes.
Any Extension ID of less than 4 bytes must be stored in the appropriate GDF database. Any Extension ID of less than 3 bytes must also be approved by the GDF. The format of the extension data must also be registered.
This simple technique allows for the creation of ExtensionIDs using the following format:
VendorID|'.'|BinaryData
The VendorID for a Gnutella servent is a 4-byte value that has been registered in the GDF Peer Codes database. In the QueryHit Descriptor, this code is case insensitive. With ExtensionIDs, the case matters, as one must be able to perform a binary comparison on the ID. This means that the ExtensionIDs "SWAP.1" and "swap.1" are different, but both "belong" to the vendor who owns the code "SWAP".
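The vendor-based scheme and the binary comparison rule can be sketched as follows. Both function names are hypothetical, and the requirement that BinaryData be at least one byte long is an assumption of this sketch rather than something the text states.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes VendorID '.' BinaryData into out, which
 * must have room for 15 bytes. Returns the ID length, or 0 if the
 * result would exceed 15 bytes or contain a 0x0 byte. */
size_t ggep_make_vendor_id(const char vendor[4],
                           const unsigned char *data, size_t data_len,
                           unsigned char out[15])
{
    size_t total = 4 + 1 + data_len;

    if (data_len == 0 || total > 15)   /* non-empty data is assumed */
        return 0;
    if (memchr(vendor, 0, 4) != NULL || memchr(data, 0, data_len) != NULL)
        return 0;
    memcpy(out, vendor, 4);
    out[4] = '.';
    memcpy(out + 5, data, data_len);
    return total;
}

/* IDs match only if they have the same length and identical bytes,
 * so the comparison is case sensitive. */
int ggep_id_equal(const unsigned char *a, size_t a_len,
                  const unsigned char *b, size_t b_len)
{
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

With these helpers, IDs built from "SWAP" and "swap" with the same BinaryData compare as different, matching the example in the text.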
It is unfortunately the case that GGEP extensions will co-exist with legacy extensions for quite some time. In some cases, both may exist in the same space allocated for extensions. The algorithm below for interpreting extension fields will help sort out this co-existence.