A Modified version of Mike Green's modified version of LimeWire's Ping/Pong
scheme.
Overview
This is a modified version of Mike Green's pong caching scheme,
which was in turn derived from the LimeWire system.
I hope this document
is of use to the Gnutella Developer Community. Please forgive me if this document
is a bit skimpy, but I am soooo very, very tired.
All comments,
corrections or general abuse to the GDF or me directly.
Introduction
The following notation is used in this document:
- nVariable : A variable numeric value
- TStruct : A pseudo-structure
- CONSTANT : A constant value
- gnutella-message : A Gnutella message-type
Predefined constants:
- MAX_PING_FREQ : The minimum time between ping broadcasts
- MIN_CACHE_SIZE : The minimum number of addresses a cache should try to maintain
- CACHE_EXPIRE_TIME : The maximum amount of time an address can live in the cache
- T : The current time value
The following pseudo-structures are used in this document:
- An array of structures (referred to as THostRecord) containing:
  - A host IP/Port combination.
  - The number and size of shared files.
  - A time-stamp value (nTimeStamp)
- A second array (referred to as TPingSource) of structures holding:
  - A host socket descriptor, or equivalent.
  - A 'pongs remaining' field (nPongsRemaining)
- A variable to hold the time of the last ping (nLastPingTime)
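The structures and constants above can be sketched as Python dataclasses. This is a minimal illustration only: the field names follow the document, while the concrete constant values are assumptions (the proposal does not fix them).

```python
import time
from dataclasses import dataclass, field

# Assumed illustrative values -- the proposal does not specify these.
MAX_PING_FREQ = 3.0        # minimum seconds between ping broadcasts
MIN_CACHE_SIZE = 10        # minimum number of addresses the cache should hold
CACHE_EXPIRE_TIME = 600.0  # maximum seconds an address may live in the cache

@dataclass
class THostRecord:
    """One entry in the pong cache."""
    ip: str
    port: int
    num_files: int = 0      # number of shared files
    shared_kb: int = 0      # total size of shared files
    nTimeStamp: float = field(default_factory=time.time)

@dataclass
class TPingSource:
    """A host still owed pongs for a ping it sent us."""
    socket_fd: int          # host socket descriptor, or equivalent
    nPongsRemaining: int

nLastPingTime = 0.0         # time of the last ping broadcast
```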
Handling of incoming PING messages.
First we choose a number,
nPongsRequired, which is the number of pong messages we
wish to route back to the requesting host. The normal value for
nPongsRequired is 10, and it should be no larger than
MIN_CACHE_SIZE.
If the servent is capable of accepting an
incoming host connection, it must then respond with a single pong
containing its own IP address and port, and decrement
nPongsRequired by one.
Next it must consult the pong cache.
It does this by first removing all entries which are older than
CACHE_EXPIRE_TIME [1].
Then it must randomly select no more than nPongsRequired entries
from the cache and return these as pong messages with a hop-count of
zero. The number of cached pongs returned (nCacheHits) may
be less than nPongsRequired if the cache did not contain at least
nPongsRequired entries.
If nCacheHits was
less than nPongsRequired the servent must then insert an entry
containing the socket descriptor into the TPingSource array with a
nPongsRemaining value of (nPongsRequired -
nCacheHits).
If the level of the pong cache has fallen
below MIN_CACHE_SIZE, the servent may then broadcast the ping, provided
the TTL of the ping is larger than one and more than MAX_PING_FREQ
has elapsed since nLastPingTime. If the ping is broadcast,
nLastPingTime should be reset to T.
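The ping-handling steps above can be sketched as a single function. This is a hedged illustration, assuming a Python servent that stores cache entries as (ip, port, timestamp) tuples and ping sources as small dicts; the function name, the tuple layout, and the placeholder own address are inventions of this sketch, not part of the proposal.

```python
import random
import time

MIN_CACHE_SIZE = 10         # assumed illustrative values
CACHE_EXPIRE_TIME = 600.0
MAX_PING_FREQ = 3.0

def handle_ping(cache, ping_sources, source_fd, ttl,
                accepting_connections, last_ping_time, now=None):
    """cache: list of (ip, port, timestamp); ping_sources: list of dicts
    with 'fd' and 'nPongsRemaining'. Returns (pongs_to_send,
    broadcast_ping, new_last_ping_time)."""
    now = time.time() if now is None else now
    pongs = []
    nPongsRequired = min(10, MIN_CACHE_SIZE)

    # If we can accept an incoming connection, answer with our own pong.
    if accepting_connections:
        pongs.append(("<our-ip>", 6346))  # placeholder own address/port
        nPongsRequired -= 1

    # Drop cache entries older than CACHE_EXPIRE_TIME [1].
    cache[:] = [e for e in cache if now - e[2] <= CACHE_EXPIRE_TIME]

    # Randomly select up to nPongsRequired cached entries; these are
    # returned as pongs with a hop-count of zero.
    hits = random.sample(cache, min(nPongsRequired, len(cache)))
    pongs.extend((ip, port) for ip, port, _ in hits)
    nCacheHits = len(hits)

    # Remember the source so later pongs can be routed back to it.
    if nCacheHits < nPongsRequired:
        ping_sources.append({"fd": source_fd,
                             "nPongsRemaining": nPongsRequired - nCacheHits})

    # Re-fill the cache by broadcasting, but no more often than MAX_PING_FREQ.
    broadcast = (len(cache) < MIN_CACHE_SIZE and ttl > 1
                 and now - last_ping_time > MAX_PING_FREQ)
    if broadcast:
        last_ping_time = now
    return pongs, broadcast, last_ping_time
```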
Handling of incoming PONG messages.
One huge problem with pong
caching is that we MUST NOT cache pong messages which were transmitted to
us via another cache. If we let this happen, the Gnutella network would
quickly become a cyber-junkyard of ancient pongs travelling from one
pong cache to another. To combat this, we have to introduce requirements which a
pong message must satisfy in order to be cached.
1. A pong with a hop-count of zero MUST NOT be cached, unless it
satisfies rule 2.
2. A pong whose IP matches the observed IP of the responding
host MUST be cached. [2]
If the pong satisfies the above rules it may be inserted into
the THostRecord array with the nTimeStamp field set to
T.
Regardless of whether or not a pong is inserted into the
cache, a servent must send the pong message to each host in the
TPingSource array. Each structure's nPongsRemaining field
must also be decremented by one, with the servent removing the entire structure
from the array if this value reaches zero.
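The pong rules above can be sketched as follows. This is a minimal illustration, assuming the same (ip, port, timestamp) cache entries and dict-based ping sources as elsewhere in this document's sketches; observed_ip stands for the address the connection actually comes from, which a real servent would read from its socket.

```python
import time

def handle_pong(cache, ping_sources, pong_ip, pong_port, hops,
                observed_ip, now=None):
    """Returns the list of socket descriptors the pong must be routed to."""
    now = time.time() if now is None else now

    # Rule 1: a zero-hop pong is assumed to come from a cache and is
    # not cached -- unless rule 2 applies.
    # Rule 2: a pong whose IP matches the observed IP of the responding
    # host MUST be cached [2].
    if hops != 0 or pong_ip == observed_ip:
        cache.append((pong_ip, pong_port, now))

    # Regardless of caching, route the pong to every waiting ping
    # source, decrementing nPongsRemaining and dropping exhausted entries.
    route_to = [s["fd"] for s in ping_sources]
    for s in ping_sources:
        s["nPongsRemaining"] -= 1
    ping_sources[:] = [s for s in ping_sources if s["nPongsRemaining"] > 0]
    return route_to
```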
Handling of incoming QUERY-HIT messages.
There is another valuable
chance to once again reduce ping/pong traffic by using the host IP
and port located in query-hit messages [3].
Before
caching, a servent must first check whether the query-hit contains
an extended-query-hit-descriptor (EQHD). If it does, the query-hit
MUST NOT be cached if the push flag is set [4].
Otherwise,
the query-hit may be inserted into the cache, using the host IP and port,
and optionally using the number and size of the files in the query-hit as the
sharing values in THostRecord.
If the query-hit message was
eligible for caching, a servent must also send a pong message to each host
in the TPingSource array, using the information in the query-hit
as above. Each structure's nPongsRemaining field must also be
decremented by one, with the servent removing the entire structure from the
array if this value reaches zero.
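The query-hit rule above can be sketched like this. It is an illustration only: has_eqhd and push_flag stand for the presence of the extended-query-hit-descriptor and its push flag, and are assumptions about how a servent parses the message, not fields the proposal defines.

```python
import time

def cache_query_hit(cache, ping_sources, ip, port, num_files, total_kb,
                    has_eqhd, push_flag, now=None):
    """Returns (cached, route_to): whether the host was cached and which
    socket descriptors should receive a pong built from the query-hit."""
    now = time.time() if now is None else now

    # A query-hit whose EQHD push flag is set MUST NOT be cached [4].
    if has_eqhd and push_flag:
        return False, []

    # Cache the host, optionally with its sharing statistics.
    cache.append((ip, port, num_files, total_kb, now))

    # Send a pong built from the query-hit to every waiting ping source.
    route_to = [s["fd"] for s in ping_sources]
    for s in ping_sources:
        s["nPongsRemaining"] -= 1
    ping_sources[:] = [s for s in ping_sources if s["nPongsRemaining"] > 0]
    return True, route_to
```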
1. An implementation may also choose to limit the size of
the pong-cache to some arbitrary value. If this is the case, this can be folded
into one step.
2. Normally a pong with a hop-count of zero is
assumed to be from a cache. This rule alone would cause the servent's
neighbouring nodes never to be cached, since their hop-count is always zero.
3. This assumes the port contained in the
query-hit message is the same port used for incoming host connections. I
have never seen any paper state that this has to be the case, but I assume most
servents use the same port for all incoming connections.
4. I know the push flag doesn't necessarily mean a host
is firewalled, but if it isn't then we will receive a pong from it sooner or
later anyway.