Gnutella Protocol Development
2.4 Standard Message Architecture
Once a servent has connected successfully to the network, it
communicates with other servents by sending and receiving Gnutella
protocol messages. Each message is preceded by a Message Header with
the byte structure given below.
Note 1: One IP packet may contain several Gnutella messages, and
one Gnutella message may be split across multiple IP packets. This
means one can never assume that a Gnutella message ends where the
chunk of data read from the socket ends.
Note 2: All fields in the following structures are in little-endian
byte order unless otherwise specified.
Note 3: All IP addresses in the following structures are in IPv4
format. For example, the IPv4 byte array
    0xD0     0x11     0x32     0x04
    byte 0   byte 1   byte 2   byte 3
represents the dotted address 208.17.50.4.
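The conversion in Note 3 can be sketched in a few lines of Python. The helper name ipv4_from_wire is hypothetical and not part of the protocol; the standard ipaddress module treats a 4-byte input as being in network (big-endian) order, which matches the example above.

```python
import ipaddress


def ipv4_from_wire(raw: bytes) -> str:
    """Convert 4 bytes, in the order they appear on the wire, to dotted form."""
    if len(raw) != 4:
        raise ValueError("an IPv4 address is exactly 4 bytes")
    # IPv4Address accepts a packed 4-byte value in network byte order.
    return str(ipaddress.IPv4Address(raw))


print(ipv4_from_wire(bytes([0xD0, 0x11, 0x32, 0x04])))  # 208.17.50.4
```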
2.4.1 Message Header
The message header is 23 bytes divided into the following fields.
Bytes:   Description:
0-15     Message ID/GUID (Globally Unique ID)
16       Payload Type
17       TTL (Time To Live)
18       Hops
19-22    Payload Length
Message ID      A 16-byte string (GUID) uniquely identifying the
                message on the network.

                Servents SHOULD store all 1's (0xff) in byte 8 of the
                GUID. (Bytes are numbered 0-15, inclusive.) This
                serves to tag the GUID as being from a modern
                servent.

                Servents SHOULD initially store all 0's in byte 15 of
                the GUID. This is reserved for future use.

                The other bytes SHOULD have random values.

Payload Type    Indicates the type of message:

                    0x00 = Ping
                    0x01 = Pong
                    0x02 = Bye
                    0x40 = Push
                    0x80 = Query
                    0x81 = Query Hit

                Other Gnutella messages can be used, but if so the
                servent MUST first make sure that the remote host
                supports this new message type. This can be done
                using handshaking headers.

TTL             Time To Live. The number of times the message
                will be forwarded by Gnutella servents before it is
                removed from the network. Each servent will decrement
                the TTL before passing it on to another servent. When
                the TTL reaches 0, the message MUST NOT be forwarded
                any further.

Hops            The number of times the message has been forwarded.
                As a message is passed from servent to servent, the
                TTL and Hops fields of the header must satisfy the
                following condition:

                    TTL(0) = TTL(i) + Hops(i)

                where TTL(i) and Hops(i) are the values of the TTL and
                Hops fields of the message after i hops, and TTL(0) is
                the maximum number of hops a message will travel
                (usually 7).

Payload Length  The length of the message immediately following
                this header. The next message header is located
                exactly this number of bytes from the end of this
                header, i.e. there are no gaps or pad bytes in the
                Gnutella data stream. Messages SHOULD NOT be larger
                than 4 kB.
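
As an illustration of the header layout described above, the following Python sketch packs and unpacks the 23-byte header and generates a Message ID according to the GUID recommendations. It is a minimal sketch, not a reference implementation: the names Header, new_guid, pack_header and parse_header are hypothetical, and the multi-byte Payload Length is assumed to be little-endian as stated in Note 2.

```python
import os
import struct
from typing import NamedTuple

# GUID (16 bytes), Payload Type, TTL, Hops (1 byte each), Payload Length (4 bytes).
# "<" selects little-endian with no padding, so the size is exactly 23 bytes.
HEADER_FORMAT = "<16sBBBI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


class Header(NamedTuple):
    guid: bytes
    payload_type: int
    ttl: int
    hops: int
    payload_length: int


def new_guid() -> bytes:
    """Random GUID with byte 8 set to 0xff and byte 15 set to 0x00."""
    guid = bytearray(os.urandom(16))
    guid[8] = 0xFF   # tags the GUID as coming from a modern servent
    guid[15] = 0x00  # reserved for future use
    return bytes(guid)


def pack_header(header: Header) -> bytes:
    return struct.pack(HEADER_FORMAT, *header)


def parse_header(data: bytes) -> Header:
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 23 bytes for a message header")
    return Header(*struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE]))


# Example: a Ping (0x00) with TTL 7, Hops 0 and an empty payload.
ping = Header(new_guid(), 0x00, 7, 0, 0)
assert parse_header(pack_header(ping)) == ping
```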
The Payload Length field is the only reliable way for a servent to
find the beginning of the next message in the input stream.
Therefore, servents SHOULD rigorously validate the Payload Length
field for each message received. If a servent becomes out of sync
with its input stream, it SHOULD close the connection associated with
the stream, since the upstream servent is either generating or
forwarding invalid messages.
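
The framing rule above, together with Note 1, suggests buffering incoming data and cutting messages out of the buffer by Payload Length. The sketch below shows one possible way to do that. MessageFramer, FramingError and the MAX_PAYLOAD constant are hypothetical names; using the 4 kB recommendation as a hard validation limit is an assumption of this example, and a caller would be expected to close the connection when FramingError is raised.

```python
import struct
from typing import List, Tuple

HEADER_SIZE = 23
MAX_PAYLOAD = 4096  # assumption: treat the 4 kB recommendation as a hard limit


class FramingError(Exception):
    """The stream can no longer be trusted; the connection should be closed."""


class MessageFramer:
    """Accumulates socket reads and yields complete (header, payload) pairs."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[Tuple[bytes, bytes]]:
        self._buf += chunk
        messages = []
        while len(self._buf) >= HEADER_SIZE:
            # Payload Length occupies header bytes 19-22, little-endian.
            (length,) = struct.unpack_from("<I", self._buf, 19)
            if length > MAX_PAYLOAD:
                raise FramingError("implausible payload length %d" % length)
            end = HEADER_SIZE + length
            if len(self._buf) < end:
                break  # message is split across reads; wait for more data
            header = bytes(self._buf[:HEADER_SIZE])
            payload = bytes(self._buf[HEADER_SIZE:end])
            messages.append((header, payload))
            del self._buf[:end]  # the next header starts immediately after
        return messages
```

Each call to feed() takes whatever the socket returned and may produce zero, one or several messages, so the caller never has to assume that a read ends on a message boundary.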
Abuse of the TTL field in broadcast messages (Query) will lead to
an unnecessary amount of network traffic and poor network
performance. Therefore, servents SHOULD carefully check the TTL
fields of received Query messages and lower them as necessary.
Assuming the servent's maximum admissible Query message life is 7
hops, then if TTL + Hops > 7, TTL SHOULD be decreased so that TTL +
Hops = 7. Broadcast messages with very high TTL values (>15)
SHOULD be dropped.
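
The TTL checks above can be sketched as two small functions. The names and constants are hypothetical. The forwarding step reflects one reading of the TTL and Hops descriptions, namely that a servent decrements TTL and increments Hops together, and does not forward a message whose TTL would reach 0.

```python
from typing import Optional, Tuple

MAX_QUERY_LIFE = 7    # maximum admissible TTL + Hops, as in the text above
DROP_TTL_ABOVE = 15   # broadcast messages with a higher TTL are dropped


def normalize_ttl(ttl: int, hops: int) -> Optional[Tuple[int, int]]:
    """Check a received broadcast message; return None if it should be dropped."""
    if ttl > DROP_TTL_ABOVE:
        return None
    if ttl + hops > MAX_QUERY_LIFE:
        # Lower the TTL so that TTL + Hops equals the admissible maximum.
        ttl = max(MAX_QUERY_LIFE - hops, 0)
    return ttl, hops


def next_hop(ttl: int, hops: int) -> Optional[Tuple[int, int]]:
    """Values to put in a forwarded copy, or None if it must not be forwarded."""
    if ttl <= 1:
        return None  # TTL would reach 0
    return ttl - 1, hops + 1  # TTL + Hops stays constant


print(normalize_ttl(10, 2))  # (5, 2)
print(next_hop(5, 2))        # (4, 3)
```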
Immediately following the message header is a payload consisting
of one of the following messages.
2.4.2 Ping (0x00)
Ping messages MAY contain a GGEP extension block (see Section 2.3),
but no other payload.
2.4.3 Pong (0x01)
Pong messages contain information about a Gnutella host. The
message has the following fields:
Bytes: Description:
0-1 Port number. The port number on which the responding
host can accept incoming connections.
2-5 IP Address. The IP address of the responding host.
Note: This field is in big-endian format.
6-9 Number of shared files. The number of files that the
servent with the given IP address and port is sharing
on the network.
10-13 Number of kilobytes shared. The number of kilobytes
of data that the servent with the given IP address and
port is sharing on the network.
14- OPTIONAL GGEP extension block. (see Section 2.3)
Pong messages are only sent in response to an incoming Ping
message. It is valid for more than one Pong message to be sent in
response to a single Ping message. This enables host caches to send
cached servent address information in response to a Ping request.
The Message ID of a Pong message MUST be the Message ID of the Ping
message it is sent in reply to.
The fields specifying the number of shared files and the number of
kilobytes shared were intended to allow one to measure the amount of
data available on the network. With a very large Gnutella network,
and minimized Ping and Pong message traffic, this can no longer be
done. Still, these fields SHOULD be filled out correctly.
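
A Pong payload with the fields listed above could be decoded as in the following sketch. The names Pong and parse_pong are hypothetical. The port and the two counters are read as little-endian (Note 2), while the IP address bytes are taken in the order they appear, which is the big-endian form noted for that field. Any trailing bytes are returned unparsed as the optional GGEP block.

```python
import ipaddress
import struct
from typing import NamedTuple

# Port (2), IP address as 4 raw bytes, shared files (4), shared kilobytes (4).
PONG_FORMAT = "<H4sII"
PONG_FIXED_SIZE = struct.calcsize(PONG_FORMAT)  # 14 bytes


class Pong(NamedTuple):
    port: int
    ip: str
    files: int
    kilobytes: int
    ggep: bytes  # raw optional GGEP block, possibly empty


def parse_pong(payload: bytes) -> Pong:
    if len(payload) < PONG_FIXED_SIZE:
        raise ValueError("Pong payload shorter than 14 bytes")
    port, raw_ip, files, kilobytes = struct.unpack_from(PONG_FORMAT, payload)
    ip = str(ipaddress.IPv4Address(raw_ip))
    return Pong(port, ip, files, kilobytes, payload[PONG_FIXED_SIZE:])
```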
2.4.4 Query (0x80)
Since Query messages are broadcast to many nodes, the total size
of the message SHOULD NOT be larger than 256 bytes. Servents MAY drop
Query messages larger than 256 bytes, and SHOULD drop Query messages
larger than 4 kB.
A Query message has the following fields:
Bytes: Description:
0-1 Minimum Speed. The minimum speed (in kb/second) of servents
that should respond to this message. A servent receiving a
Query message with a Minimum Speed field of n kb/s SHOULD
only respond with a Query Hit if it is able to communicate at
a speed >= n kb/s.
2- Search Criteria. This field is terminated by a NUL (0x00).
See section 2.2.7.3 for rules and information on how to
interpret the Search Criteria.
Rest OPTIONAL extensions block. The rest of the query message is
used for extensions to the original query format. The allowed
extension types are GGEP, HUGE and XML (see Section 2.3 and
Appendixes 1 and 2).
If two or more of these extension types exist together,
they are separated by a 0x1C (file separator) byte. Since
GGEP blocks can contain 0x1C bytes, the GGEP block, if
present, MUST be located after any HUGE and XML blocks.
The type of each block can be determined by looking for the
prefixes "urn:" for a HUGE block, "<" or "{" for XML and 0xC3
for GGEP.
The extension block SHOULD NOT be followed by a null (0x00)
byte, but some servents wrongly do that.
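
The Query layout and the extension ordering rule can be illustrated with the sketch below. The function names are hypothetical. The first two bytes are returned raw, because their interpretation depends on the flags semantics of the field. Everything from a leading 0xC3 byte onward is treated as one GGEP block and is not parsed further, and trailing NUL bytes on HUGE or XML blocks are tolerated; both choices are assumptions of this example.

```python
from typing import List, Tuple


def split_extensions(data: bytes) -> List[Tuple[str, bytes]]:
    """Split a Query extensions block into (kind, raw bytes) pairs."""
    blocks = []
    pos = 0
    while pos < len(data):
        if data[pos] == 0xC3:
            # GGEP may contain 0x1C bytes, so it must be last: take the rest.
            blocks.append(("GGEP", data[pos:]))
            break
        end = data.find(b"\x1c", pos)
        if end == -1:
            end = len(data)
        # Some servents wrongly append a NUL; strip it from text blocks.
        block = data[pos:end].rstrip(b"\x00")
        if block.startswith(b"urn:"):
            blocks.append(("HUGE", block))
        elif block[:1] in (b"<", b"{"):
            blocks.append(("XML", block))
        elif block:
            blocks.append(("unknown", block))
        pos = end + 1
    return blocks


def parse_query(payload: bytes) -> Tuple[bytes, bytes, List[Tuple[str, bytes]]]:
    """Return (first two bytes, search criteria, extension blocks)."""
    if len(payload) < 3:
        raise ValueError("Query payload too short")
    nul = payload.find(b"\x00", 2)
    if nul == -1:
        raise ValueError("search criteria is not NUL-terminated")
    return payload[:2], payload[2:nul], split_extensions(payload[nul + 1:])
```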
2.4.4.1 Flags field semantics
The first two bytes of the Query message payload were previously
used to signal the minimum speed required of the sharing host. The
value was in little-endian format. This use has now been deprecated.
The new semantics use big-endian format. The highest bit in
big-endian format (bit 15) is used as a flag to detect queries with
the new semantics. This bit MUST be set. If bit 15 is not set,
then this is a query with the legacy minimum speed semantics, and the
field MAY be ignored, but servents MUST NOT ignore the entire query.
If bit 15 is set, then this is a query with the new semantics.
Note however that bit 15 in the new semantics was bit 7 in the
legacy one (encoding for 128 kbps).
In the new semantics, each bit (except for bit 15) is used as a flag,
mostly to indicate compatibility with new Gnutella extensions. The
assignment of each bit is as follows:
* Bit 14 : Firewalled indicator. The host who sent the query is
unable to accept incoming connections. This flag can
be used by the remote servent to avoid returning
Query Hits if it is itself firewalled, as the
requesting servent will not be able to download any
files.
* Bit 13 : XML Metadata. Set this bit to 1 if you want the
sharing servent to send XML Metadata in the Query Hit.
This flag has been assigned to spare bandwidth,
returning metadata in queryHits only if the requester
asks for it. If this bit is not set, the sharing host
MUST NOT send XML metadata in return Query Hit messages.
* Bit 12 : Leaf Guided Dynamic Query. When the bit is set to 1,
this means that the query is sent by a leaf which
wants to control the dynamic query mechanism. This
is part of the Leaf guidance of dynamic queries
proposal. This information is only used by the
ultrapeers shielding this leaf if they implement leaf
guidance of dynamic queries. If this bit is set in a
Query from a Leaf it indicates that the Leaf will
respond to Vendor Messages from its Ultrapeer about
the status of the search results for the Query.
* Bit 11 : GGEP "H" allowed. If this bit is set to 1, then the
sender is able to parse the GGEP "H" extension
which is a replacement for the legacy HUGE GEM
extension. This is meant to start replacing the GEM
mechanism with GGEP extensions, as GEM extensions are
now deprecated.
* Bit 10 : Out of Band Query. This flag is used to recognize a
Query which was sent using the Out Of Band query
extension.
* Bit 9 : Reserved for future use.
* Bits 0-8 : Indicates the maximum number of query hits expected,
0 if no maximum. This does not mean that no more query
hits may be returned, but that the query should be
propagated in a way that will cause the specified
number of hits.
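
The bit assignments above can be decoded as in the following sketch. The names QueryFlags and decode_query_flags are hypothetical. The two bytes are read as one big-endian 16-bit value; when bit 15 is clear the function returns None to signal the legacy minimum speed semantics, in which case the query itself must still be processed.

```python
import struct
from typing import NamedTuple, Optional


class QueryFlags(NamedTuple):
    firewalled: bool      # bit 14
    want_xml: bool        # bit 13
    leaf_guided: bool     # bit 12
    ggep_h_allowed: bool  # bit 11
    out_of_band: bool     # bit 10
    max_hits: int         # bits 0-8, 0 means no maximum


def decode_query_flags(first_two_bytes: bytes) -> Optional[QueryFlags]:
    (value,) = struct.unpack(">H", first_two_bytes)  # big-endian
    if not value & 0x8000:
        return None  # legacy minimum speed field; MAY be ignored
    return QueryFlags(
        firewalled=bool(value & 0x4000),
        want_xml=bool(value & 0x2000),
        leaf_guided=bool(value & 0x1000),
        ggep_h_allowed=bool(value & 0x0800),
        out_of_band=bool(value & 0x0400),
        max_hits=value & 0x01FF,
    )


print(decode_query_flags(b"\xc0\x0a"))  # firewalled, at most 10 hits expected
```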
2.4.5 Query Hit (0x81)
Query Hit messages have the following fields:
Bytes: Description:
0 Number of Hits. The number of query hits in the result set
(see below).
1-2 Port. The port number on which the responding host can accept
incoming HTTP file requests. This is usually the same port as
is used for Gnutella network traffic, but any port MAY be
used.
3-6 IP Address. The IP address of the responding host.
Note: This field is in big-endian format.
7-10 Speed. The speed (in kb/second) of the responding host.
11- Result Set. A set of responses to the corresponding Query.
This set contains Number_of_Hits elements, each with the
following structure:
Bytes: Description:
0-3 File Index. A number, assigned by the responding
host, which is used to uniquely identify the file
matching the corresponding query.
4-7 File Size. The size (in bytes) of the file whose
index is File_Index.
8- File Name. The name of the file whose index is
File_Index. Terminated by a null (i.e. 0x00).
x Extensions block. Allowed extension types are HUGE,
GGEP and plain text metadata. This field is
terminated by a null (0x00), even if there are no
extensions (resulting in a double null). Also, the
extensions block itself MUST NOT contain any null
bytes.
If two or more of these extension types exist
together, they are separated by a 0x1C (file
separator) byte. Since GGEP blocks can contain 0x1C
bytes, the GGEP block, if present, MUST be located
after any HUGE and plain text blocks.
The type of each block can be determined by looking
for the prefixes "urn:" for a HUGE block, 0xC3 for
GGEP and anything else is probably plain text
metadata.
Plain text metadata is intended to be displayed
directly to the user. It was first invented by
Gnotella (a now discontinued Gnutella servent) to tag
MP3 files. Examples:
"192 kbps 44 kHz 3:23"
"120 kbps(VBR) 44kHz 3:55" (variable bitrate)
Other plain text formats MAY be used.
x RECOMMENDED extra block. This block is not required, but
strongly recommended. It is sometimes called EQHD, or
(incorrectly) just QHD. It has the following format:
Bytes:
0-3 Vendor Code. Four case-insensitive characters
representing a vendor code. For example "LIME" for
LimeWire. See registered codes and register yours at
http://groups.yahoo.com/group/the_gdf/database?
method=reportRows&tbl=6
(Requires GDF membership)
4 Open Data Size. Contains the length (in bytes) of the
Open Data field. Set to 2 in most current
implementations, and 4 in those that support XML
metadata outside GGEP (see Section 2.3 and Appendix 2).
The Open Data area MAY be larger to allow future
extensions.
x Open Data. Contains two 1-byte flags fields with the
following layout and in the specified order:
bit:   Description:
7,6    Reserved for future use
5      flagGGEP
4      flagUploadSpeed
3      flagHaveUploaded
2      flagBusy
1      Reserved for future use
0      flagPush
The first flag byte can be viewed as an "enabler" for
the flags in the second byte, the "setter". Only
those bits that were enabled must be considered by
the servent as being valid. This logic is reversed
for flagPush, which is set in the first byte and
enabled in the second. The enabling byte allows
you to know which flags are supported by a given
servent.
Bits 5,4,3,2 in the first byte MUST be set if and
only if the corresponding flag in the second byte is
meaningful.
Bit 0 in the second byte MUST be set if and only
if the corresponding flag in the first byte is
meaningful. Yes, the order is reversed for this flag.
flagGGEP is set if and only if the private
data block (see below) contains a GGEP block.
flagUploadSpeed is set if and only if the Speed field
of the QueryHit message contains the highest
average transfer rate (in kbps) of the last 10
uploads. Otherwise the Speed field contains the host's
total upload speed as set by the user, and is therefore
less reliable.
flagHaveUploaded is set if and only if the servent
has successfully uploaded at least one file.
flagBusy is set if and only if all of the
servent's upload slots are currently full.
flagPush is set if and only if the servent is
firewalled or cannot accept incoming TCP connections
for any other reason.
The reserved flags MUST NOT be set, unless they are
used for a future extension.
If XML metadata (Appendix 2) is included in the
current Query Hit message, the following 2 bytes of
Open Data area will contain the size of the XML
block. The XML block itself is placed in the private
area (see below).
x Private Data. Undocumented vendor-specific data. This field
continues till the servent Identifier, which uses the last 16
bytes of the message.
If the flagGGEP in the open data block is set, this block
contains a GGEP (see Section 2.3) extension block. The GGEP
block starts with a 0xC3 byte. Any data before or after the
GGEP block is vendor-specific data, and MUST be ignored, if
not recognized.
Servents are NOT RECOMMENDED to use the private data area for
vendor specific data. Servents SHOULD use GGEP extensions
instead.
If the Open Data area indicates an XML block, it will also be
placed in the private area (see Appendix 2). Assuming that
the two bytes in the Open Data area specify an XML block of
m bytes, that block can be found by extracting the last m
bytes of the private area. Both GGEP and XML can exist in the
same Private Data area, but XML SHOULD be implemented inside
GGEP.
[TODO: How about the NUL after the XML block? What is it good for?]
Last 16 Servent Identifier. A 16-byte string uniquely identifying the
responding servent on the network. This SHOULD be constant
for all Query Hit messages emitted by a servent and is
typically some function of the servent's network address. The
servent Identifier is mainly used for routing the Push
Message (see below).
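
To make the Query Hit layout more concrete, the sketch below walks the payload from the fixed fields through the result set and the extra block to the trailing Servent Identifier. All names are hypothetical. It keeps extension and private data as raw bytes, ignores the optional XML size in the Open Data area, and reports only those flags that the enabling bits mark as meaningful. It assumes the extra block is either absent or at least 5 bytes long.

```python
import ipaddress
import struct
from typing import Dict, List, NamedTuple, Optional


class Hit(NamedTuple):
    index: int
    size: int
    name: bytes
    extensions: bytes


class QueryHit(NamedTuple):
    port: int
    ip: str
    speed: int
    hits: List[Hit]
    vendor: Optional[bytes]
    flags: Dict[str, bool]
    private_data: bytes
    servent_id: bytes


# For these flags the first byte enables and the second byte sets.
_FLAG_BITS = {"ggep": 5, "upload_speed": 4, "have_uploaded": 3, "busy": 2}


def decode_open_flags(first: int, second: int) -> Dict[str, bool]:
    flags = {}
    for name, bit in _FLAG_BITS.items():
        if first & (1 << bit):
            flags[name] = bool(second & (1 << bit))
    # flagPush is reversed: enabled in the second byte, set in the first.
    if second & 0x01:
        flags["push"] = bool(first & 0x01)
    return flags


def parse_query_hit(payload: bytes) -> QueryHit:
    if len(payload) < 11 + 16:
        raise ValueError("Query Hit payload too short")
    count, port, raw_ip, speed = struct.unpack_from("<BH4sI", payload, 0)
    body_end = len(payload) - 16  # the last 16 bytes are the Servent Identifier
    pos = 11
    hits = []
    for _ in range(count):
        if pos + 8 > body_end:
            raise ValueError("truncated result set")
        index, size = struct.unpack_from("<II", payload, pos)
        # bytes.index raises ValueError if a terminating NUL is missing.
        name_end = payload.index(b"\x00", pos + 8, body_end)
        ext_end = payload.index(b"\x00", name_end + 1, body_end)
        hits.append(Hit(index, size, payload[pos + 8:name_end],
                        payload[name_end + 1:ext_end]))
        pos = ext_end + 1
    vendor, flags, private = None, {}, b""
    if body_end - pos >= 5:  # extra block present
        vendor = payload[pos:pos + 4]
        open_end = pos + 5 + payload[pos + 4]
        if open_end > body_end:
            raise ValueError("Open Data Size exceeds message")
        open_data = payload[pos + 5:open_end]
        if len(open_data) >= 2:
            flags = decode_open_flags(open_data[0], open_data[1])
        private = payload[open_end:body_end]
    return QueryHit(port, str(ipaddress.IPv4Address(raw_ip)), speed, hits,
                    vendor, flags, private, payload[body_end:])
```

Flags that are absent from the returned dictionary were not enabled by the sender and should be treated as unknown rather than false.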
2.4.6 Push (0x40)
A Push message has the following fields:
Bytes: Description:
0-15 Servent Identifier. The 16-byte string uniquely identifying
the servent on the network who is being requested to push the
file with index File_Index. The servent initiating the push
request MUST set this field to the Servent_Identifier
returned in the corresponding QueryHit message. This is
used to route the Push message to the sender of the Query
Hit message.
16-19 File Index. The index uniquely identifying the file to be
pushed from the target servent. The servent initiating the
push request MUST set this field to the value of one of the
File_Index fields from the Result Set in the corresponding
QueryHit message.
20-23 IP Address. The IP address of the host to which the file with
File_Index should be pushed. This field is in big-endian
format.
24-25 Port. The port number the receiver of this message should
push to.
26- OPTIONAL GGEP extension block. (see Section 4.1)
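
As with the other messages, the Push payload maps onto a fixed-size structure. The sketch below is illustrative only; Push, build_push and parse_push are hypothetical names. The File Index and Port are assumed to be little-endian (Note 2), and the IP address bytes are written in the big-endian order stated for that field.

```python
import ipaddress
import struct
from typing import NamedTuple

# Servent Identifier (16), File Index (4), IP address as raw bytes (4), Port (2).
PUSH_FORMAT = "<16sI4sH"
PUSH_FIXED_SIZE = struct.calcsize(PUSH_FORMAT)  # 26 bytes


class Push(NamedTuple):
    servent_id: bytes
    file_index: int
    ip: str
    port: int
    ggep: bytes  # raw optional GGEP block, possibly empty


def build_push(servent_id: bytes, file_index: int, ip: str, port: int) -> bytes:
    if len(servent_id) != 16:
        raise ValueError("Servent Identifier must be 16 bytes")
    raw_ip = ipaddress.IPv4Address(ip).packed  # network (big-endian) order
    return struct.pack(PUSH_FORMAT, servent_id, file_index, raw_ip, port)


def parse_push(payload: bytes) -> Push:
    if len(payload) < PUSH_FIXED_SIZE:
        raise ValueError("Push payload shorter than 26 bytes")
    servent_id, file_index, raw_ip, port = struct.unpack_from(PUSH_FORMAT, payload)
    return Push(servent_id, file_index, str(ipaddress.IPv4Address(raw_ip)),
                port, payload[PUSH_FIXED_SIZE:])
```

The servent_id and file_index arguments would be copied from the Query Hit that prompted the request, as required by the field descriptions above.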