Gnutella Protocol Development

Our blue logo

Gnutella Protocol Development

Home :: Developer :: Press :: Research :: Servents

2.4 Standard Message Architecture
Source - Latest draft

Once a servent has connected successfully to the network, it 
communicates with other servents by sending and receiving Gnutella 
protocol messages. Each message is preceded by a Message Header with 
the byte structure given below.

Note 1: One IP packet may contain several Gnutella messages, and 
one Gnutella message may be split up on multiple IP-packets. This 
means one can never assume a Gnutella message ends when the chunk of 
data read from the socket ends.

Note 2: All fields in the following structures are in little-endian 
byte order unless otherwise specified.

Note 3: All IP addresses in the following structures are in IPv4 
format. For example, the IPv4 byte array

    0xD0     0x11     0x32     0x04
    byte 0   byte 1   byte 2   byte 3

represents the dotted address 208.17.50.4.


2.4.1 Message Header

The message header is 23 bytes divided into the following fields.

    Bytes:  Description:
    0-15    Message ID/GUID (Globally Unique ID)
    16      Payload Type
    17      TTL (Time To Live)
    18      Hops
    19-22   Payload Length

Message ID      A 16-byte string (GUID) uniquely identifying the
                message on the network. 
                        
                Servents SHOULD store all 1's (0xff) in byte 8 of the
                GUID.  (Bytes are numbered 0-15, inclusive.) This 
                serves to tag the GUID as being from a modern 
                servent.
        
                Servents SHOULD initially store all 0's in byte 15 of
                the GUID. This is reserved for future use.

                The other bytes SHOULD have random values.

Payload         Indicates the type of message
Type            0x00 = Ping
                0x01 = Pong
                0x02 = Bye
                0x40 = Push
                0x80 = Query
                0x81 = Query Hit

                Other Gnutella messages can be used, but if so the
                servent MUST first make sure that the remote host 
                supports this new message type. This can be done 
                using handshaking headers.

TTL             Time To Live. The number of times the message 
                will be forwarded by Gnutella servents before it is 
                removed from the network. Each servent will decrement
                the TTL before passing it on to another servent. When
                the TTL reaches 0, the message will no longer be 
                forwarded (and MUST not).

Hops            The number of times the message has been forwarded.
                As a message is passed from servent to servent, the
                TTL and Hops fields of the header must satisfy the 
                following condition:
                TTL(0) = TTL(i) + Hops(i)
                Where TTL(i) and Hops(i) are the value of the TTL and
                Hops fields of the message, and TTL(0) is maximum 
                number of hops a message will travel (usually 7).

Payload         The length of the message immediately following 
Length          this header. The next message header is located 
                exactly this number of bytes from the end of this 
                header i.e. there are no gaps or pad bytes in the 
                Gnutella data stream. Messages SHOULD NOT be larger
                than 4 kB.

The Payload Length field is the only reliable way for a servent to 
find the beginning of the next message in the input stream. 
Therefore, servents SHOULD rigorously validate the Payload Length 
field for each message received.  If a servent becomes out of synch 
with its input stream, it SHOULD close the connection associated with
the stream since the upstream servent is either generating, or 
forwarding, invalid messages.

Abuse of the TTL field in broadcasted messages (Query) will lead to 
an unnecessary amount of network traffic and poor network 
performance.  Therefore, servents SHOULD carefully check the TTL 
fields of received query messages and lower them as necessary.  
Assuming the servent's maximum admissible Query message life is 7 
hops, then if TTL + Hops > 7, TTL SHOULD be decreased so that TTL + 
Hops = 7.  Broadcasted messages with very high TTL values (>15) 
SHOULD be dropped.

Immediately following the message header, is a payload consisting 
of one of the following messages.


2.4.2 Ping (0x00)

Ping messages MAY contain a GGEP extension block (see Section 2.3),
but no other payload.


2.4.3 Pong (0x01)

Pong messages contains information about a Gnutella host. The 
message has the following fields

    Bytes:  Description:
    0-1     Port number. The port number on which the responding
            host can accept incoming connections.
    2-5     IP Address. The IP address of the responding host.
            Note: This field is in big-endian format.
    6-9     Number of shared files. The number of files that the
            servent with the given IP address and port is sharing
            on the network.
    10-13   Number of kilobytes shared. The number of kilobytes
            of data that the servent with the given IP address and
            port is sharing on the network.
    14-     OPTIONAL GGEP extension block. (see Section 2.3)

Pong messages are only sent in response to an incoming Ping 
message. It is valid for more than one Pong message to be sent in 
response to a single Ping message. This enables host caches to send 
cached servent address information in response to a Ping request.

The Message ID of a Pong message MUST be the Message ID of the Ping 
message it is sent in reply to.

The fields specifying the number of shared files and the number of 
kilobytes shared was intended to allow one to measure the amount of 
data available on the network.  With a very large Gnutella network, 
and minimized Ping and Pong message traffic, this can no longer be 
done.  Still, these fields SHOULD be filled out correctly. 


2.4.4 Query (0x80)

Since Query messages are broadcasted to many nodes, the total size 
of the message SHOULD not be larger than 256 bytes. Servents MAY drop
Query messages larger that 256 bytes, and SHOULD drop Query messages 
larger than 4 kB.

A Query message has the following fields:

Bytes:  Description:
0-1     Minimum Speed. The minimum speed (in kb/second) of servents
        that should respond to this message. A servent receiving a 
        Query message with a Minimum Speed field of n kb/s SHOULD 
        only respond with a Query Hit if it is able to communicate at
        a speed >= n kb/s.
        
2-      Search Criteria. This field is terminated by a NUL (0x00).

        See section 2.2.7.3 for rules and information on how to 
        interpret the Search Criteria
        
Rest    OPTIONAL extensions block. The rest of the query message is
        used for extensions to the original query format. The allowed
        extension types are GGEP, HUGE and XML (see Section 2.3 and 
        Appendixes 1 and 2).
        
        If two or more of these extension types exist together, 
        they are separated by a 0x1C (file separator) byte. Since 
        GGEP blocks can contain 0x1C bytes, the GGEP block, if 
        present, MUST be located after any HUGE and XML blocks.
        
        The type of each block can be determined by looking for the 
        prefixes "urn:" for a HUGE block, "<" or "{" for XML and 0xC3 
        for GGEP.

        The extension block SHOULD NOT be followed by a null (0x00) 
        byte, but some servents wrongly do that.

2.4.4.1 Flags field semantics

The first two bytes of the Query message payload were previously 
used to signal the minumum speed required of the sharing host. The
value was in little-endian format. This use has now been deprecated.

The new semantic is in big-endian format. The higher bit in 
big-endian format (bit 15) is used as a flag to detect queries with
the new semantic. This bit MUST be set. If the bit 15 is not set,
then this is a query with the legacy minspeed semantic, and the
field MAY be ignored, but servents MUST NOT ignore the entire query.
If the bit 15 is set, then this is a query with the new semantic.
Note however that bit 15 in the new semantics was the bit 7 in the 
legacy one (encoding for 128 kbps).

In the new semantic, each bit (except for bit 15) is used as a flag, 
mostly to indicate compatibility with new gnutella extensions. The 
affectation of each bit is as follow : 

   * Bit 14   : Firewalled indicator. The host who sent the query is 
                unable to accept incoming connections. This flag can 
                be used by the remote servent to avoid returning 
                Query Hits if it is itself firewalled, as the 
                requesting servent will not be able to download any 
                files.
   * Bit 13   : XML Metadata. Set this bit to 1 if you want the 
                sharing servent to send XML Metadata in the Query Hit. 
                This flag has been assigned to spare bandwidth, 
                returning metadata in queryHits only if the requester 
                asks for it. If this bit is not set, the sharing host
                MUST NOT send XML metadata in return Query Hit messages.
   * Bit 12   : Leaf Guided Dynamic Query. When the bit is set to 1,
                this means that the query is sent by a leaf which
                wants to control the dynamic query mechanism. This
                is part of the Leaf guidance of dynamic queries 
                proposal. This information is only used by the 
                ultrapeers shileding this leave if they implement leaf
                guidance of dynamic queries. If this bit is set in a 
                Query from a Leaf it indicates that the Leaf will 
                respond to Vendor Messages from its Ultrapeer about
                the status of the search results for the Query.
   * Bit 11   : GGEP "H" allowed. If this bit is set to 1, then the 
                sender is able to parse the GGEP "H" extension
                which is a replacement for the leagacy HUGE GEM 
                extension. This is meant to start replacing the GEM 
                mecanism with GGEP extensions, as GEM extensions are 
                now deprecated. 
   * Bit 10   : Out of Band Query. This flag is used to recognize a
                Query which was sent using the Out Of Band query
                extension.
   * Bit 9    : Reserved for a future use.
   * Bits 0-8 : Indicates the maximum number of query hits expected, 
                0 if no maximum. This does not mean that no more query 
                hits may be returned, but that the query should be 
                propagated in a way that will cause the specified 
                number of hits.

2.4.5 Query Hit

Query Hit messages has the following fields:

Bytes:  Description:
0       Number of Hits. The number of query hits in the result set 
        (see below).
        
1-2     Port. The port number on which the responding host can accept
        incoming HTTP file requests. This is usually the same port as
        is used for Gnutella network traffic, but any port MAY be 
        used.
        
3-6     IP Address. The IP address of the responding host. 
        Note: This field is in big-endian format.

7-10    Speed The speed (in kb/second) of the responding host.
        
11-     Result Set. A set of responses to the corresponding Query. 
        This set contains Number_of_Hits elements, each with the 
        following structure:
        
        Bytes:  Description:
        0-3     File Index. A number, assigned by the responding 
                host, which is used to uniquely identify the file 
                matching the corresponding query.

        4-7     File Size. The size (in bytes) of the file whose 
                index is File_Index.

        8-      File Name. The name of the file whose index is 
                File_Index. Terminated by a null (i.e. 0x00) 

        x       Extensions block. Allowed extension types are HUGE, 
                GGEP and plain text metadata. This field is 
                terminated by a null (0x00), even if there are no 
                extensions (resulting in a double null). Also, the
                extensions block itself MUST NOT contain any null 
                bytes.
        
                If two or more of these extension types exist 
                together, they are separated by a 0x1C (file 
                separator) byte. Since GGEP blocks can contain 0x1C 
                bytes, the GGEP block, if present, MUST be located 
                after any HUGE and plan text blocks.
                
                The type of each block can be determined by looking 
                for the prefixes "urn:" for a HUGE block, 0xC3 for 
                GGEP and anything else is probably plain text 
                metadata.
        
                Plain text metadata is intended to be displayed 
                directly to the user. It was first invented by 
                Gnotella (a now discontinued Gnutella servent) to tag
                MP3 files. Examples:
                "192 kbps 44 kHz 3:23"
                "120 kbps(VBR) 44kHz 3:55" (variable bitrate)
                Other plan text formats MAY be used.
        
x       RECOMMENDED extra block. This block is not required, but 
        strongly recommended. It is sometimes called EQHD, or 
        (incorrectly) just QHD. It has the following format:
        
        Bytes:
        0-3     Vendor Code. Four case-insensitive characters 
                representing a vendor code. For example "LIME" for 
                LimeWire. See registered codes and register yours at
                http://groups.yahoo.com/group/the_gdf/database?
                        method=reportRows&tbl=6 
                        (Requires GDF membership)

        4       Open Data Size. Contains the length (in bytes) of the
                Open Data field. Set to 2 in most current 
                implementations, and 4 in those that support XML 
                metadata outside GGEP (see Section 2.3 and Appendix 2). 
                The Open Data area MAY be larger to allow future 
                extensions.
        
        x       Open Data. Contains two 1-byte flags fields with the 
                following layout and in the specified order:
                
                bit:    Description:
                7,6     Reserved for future use
                5       flagGGEP
                4       flagUploadSpeed
                3       flagHaveUploaded
                2       flagBusy
                1       Reserved for future use
                0       flagPush

                The first flag byte can be viewed as an "enabler" for
                the flags in the second byte, the "setter".  Only
                those bits that were enabled must be considered by
                the servent as being valid.  This logic is reversed
                for flagPush, which is set in the first byte and
                enabled in the second.  The enabling byte allows
                you to know which flags are supported by a given
                servent.

                Bits 5,4,3,2 in the first byte MUST be set if and 
                only if the corresponding flag in the second byte is 
                meaningful.

                Bit 0 in the second byte MUST be set if and only 
                if the corresponding flag in the second byte is 
                meaningful. Yes, the order is reversed for this flag.

                flagGGEP is set is set if and only if the private
                data block (see below) contains a GGEP block.

                flagUploadSpeed is set if and only if the Speed field
                of the QueryHit message contains the highest 
                average transfer rate (in kbps) of the last 10 
                uploads. Otherwise Speed field contains the hosts 
                total upload speed as set by the user, and therefore 
                less reliable.

                flagHaveUploaded is set if and only if the servent 
                has successfully uploaded at least one file.

                flagBusy is set if and only if the all of the 
                servent's upload slots are currently full.

                flagPush is set if and only if the servent is 
                firewalled or cannot accept incoming TCP connections 
                for any other reason.

                The reserved flags MUST not be set, unless they are 
                used for a future extension.

                If XML metadata (Appendix 2) is included in the 
                current Query Hit message, the following 2 bytes of
                Open Data area will contain the size of the XML 
                block. The XML block itself is placed in the private 
                area (see below).

x       Private Data. Undocumented vendor-specific data. This field 
        continues till the servent Identifier, which uses the last 16
        bytes of the message.

        If the flagGGEP in the open data block is set, this block
        contains a GGEP (see Section 2.3) extension block. The GGEP 
        block starts with a 0xC3 byte. Any data before or after the 
        GGEP block is vendor-specific data, and MUST be ignored, if 
        not recognized.

        Servents are NOT RECOMMENDED to use the private data area for
        vendor specific data. Servents SHOULD use GGEP extensions 
        instead.

        If the Open Data area indicates an XML block is will also be 
        placed in the private area (see Appendix 2). Assuming that 
        the two bytes in the Open Data area specifies an XML block of
        m bytes, that block can be found by extracting the last m 
        bytes of the private area. Both GGEP and XML can exist in the
        same Private Data area, but XML SHOULD be implemented inside 
        GGEP.
        [TODO: How about the nul after the XMP block? What is it good for?]

Last 16 Servent Identifier. A 16-byte string uniquely identifying the
        responding servent on the network. This SHOULD be constant 
        for all Query Hit messages emitted by a servent and is 
        typically some function of the servent's network address. The
        servent Identifier is mainly used for routing the Push 
        Message (see below).


2.4.6 Push (0x40)

A Push message has the following fields:
Bytes:  Description:

0-15    Servent Identifier. The 16-byte string uniquely identifying 
        the servent on the network who is being requested to push the
        file with index File_Index. The servent initiating the push 
        request MUST set this field to the Servent_Identifier 
        returned in the corresponding QueryHit message. This is 
        used to route the Push message to the sender of the Query 
        Hit message.

16-19   File Index. The index uniquely identifying the file to be 
        pushed from the target servent. The servent initiating the 
        push request MUST set this field to the value of one of the 
        File_Index fields from the Result Set in the corresponding 
        QueryHit message.

20-23   IP Address. The IP address of the host to which the file with
        File_Index should be pushed. This field is in big-endian 
        format.

24-25   Port. The port number the receiver of this message should 
        push to.

26-     OPTIONAL GGEP extension block. (see Section 4.1)

 

 

 

Home :: Developer :: Press :: Research :: Servents

SourceForge.net Logo