[ba-ohs-talk] Why Gnutella Can't Scale. No, Really.
Got this link from Slashdot.
http://www.darkridge.com/~jpr5/doc/gnutella.html (01)
It's a mathematical analysis of Gnutella. Some of the discussion may or
may not apply to P2P in general; I'm not sure, so I thought I'd post it here
for folks to discuss if interested. (02)
"In the spring of 2000, when Gnutella was a hot topic on everyone's mind, a
concerned few of us in the open-source community just sat back and shook
our heads. Something just wasn't right. Any competent network engineer that
observed a running gnutella application would tell you, through simple
empirical observation alone, that the application was an incredible burden
on modern networks and would probably never scale. I myself was just
stupefied at the gross abuse of my limited bandwidth, and that was just DSL
-- god help the dialup folks! We wondered to ourselves, Is no one paying
attention, was no one bothered?
That summer we all saw a rush of press on Gnutella, and the rumour mill
started churning. Most stories covering Gnutella were grossly and
inappropriately evangelical, praising the not-yet-analyzed Gnutella as a
technology capable of delivering on wildly fantastic promises of fully
distributed, undeterrable, unstoppable, larger-than-life file sharing on
the grandest scale. Many folks were convinced that Gnutella was the next
generation Napster. Gene Kan, the first to spearhead the Gnutella
evangelical movement, claimed in one early interview: "Gnutella is going to
kick Napster in the pants." Later Kan admitted "Gnutella isn't perfect",
but still went on to say that "there's no huge glaring thing missing".
Well, something just wasn't right, and though we couldn't see it, it did
seem pretty glaring.
We all understood the excitement. Herein was a technology that could
potentially prove the true magnitude of Metcalfe's Law. That realization
evoked nothing short of the phrase "holy shit!". But what I couldn't
understand was why no one was questioning the legitimacy of these claims.
For several months the only analyses anyone heard of practical
implementations were generalizations and speculative comments, without much
scientific or mathematical basis.
So I quickly got fed up, and resolved to write a research paper. Sometime
in late March, I had begun analyzing the network structure of the Gnutella
system, trying to find a way to gauge the capacity of a GnutellaNet in
generalized terms, and to predict its realistic limits. What later resulted
was a set of mathematical equations that could describe reachability,
capacity, and bandwidth throughput. I then fed those equations into
Mathematica to produce 3-D plots depicting, much to my own satisfaction,
visual realizations of exactly what didn't make sense.
At about the same time, a fellow colleague in the security industry wrote a
short paper detailing the various and flagrant insecurities inherent in
this particular implementation of a distributed system. Seth McGann's
security advisory titled Self-Replication Using Gnutella centered on the
characteristics that let an Internet worm thrive inside a GnutellaNet,
and also touched on a few other flaws that would be useful to an attacker.
His advisory was posted in May of 2000, and unfortunately went mostly
unnoticed (or misunderstood, because of its technical nature).
" (03)