From: harshitha.menon SpamElide on behalf of Harshitha Menon
Sent: Thursday, April 28, 2011 12:20 PM
To: Gupta, Indranil
Subject: 525 review 04/28
Exploring Complex Networks
This paper explores various forms of networks prevalent around us and studies the effect of their structure on their functionality. For example, the topology of a social network affects the spread of information or diseases. These networks exhibit various complexities that make them difficult to comprehend. The paper reviews what is known about dynamical complexity in regular networks of nonlinear systems and the impact of network structure on collective dynamics. As a next step, it tackles networks that combine dynamical and structural complexity. Researchers have identified two properties of such networks, namely short paths and high clustering. Dynamical systems networked this way display enhanced signal-propagation speed, synchronizability, and computational power.
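The two properties highlighted here, short paths and high clustering, are easy to measure directly. Below is a small Python sketch (my own illustration, not code from the paper) that computes the average clustering coefficient and the mean shortest-path length of a toy graph made of two triangles joined by a bridge:

```python
from collections import deque
from itertools import combinations

def avg_path_length(adj):
    """Mean shortest-path length over all connected node pairs (BFS from each node)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        for v, d in dist.items():
            if v != src:
                total += d
                pairs += 1
    return total / pairs

def clustering(adj):
    """Average local clustering coefficient: the fraction of a node's
    neighbour pairs that are themselves linked."""
    coeffs = []
    for u, nbrs in adj.items():
        nbrs = list(nbrs)
        if len(nbrs) < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(links / (len(nbrs) * (len(nbrs) - 1) / 2))
    return sum(coeffs) / len(coeffs)

# Toy graph: two triangles joined by a bridge (high clustering, short paths).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(round(clustering(adj), 3), round(avg_path_length(adj), 3))  # → 0.778 1.8
```

On a true small-world graph, rewiring a few lattice edges keeps the clustering high while collapsing the average path length.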
Pros:
-The paper explores various networks and is very useful for understanding the nature of a network and how its structure affects its functionality. It also points out the properties that make a network more effective.
-This paper does not restrict itself to P2P networks; rather, it explores various fields and analyses the networks methodically.
-The paper provides models for the networks being analysed.
Cons:
-Other properties could be considered, such as how the network evolves when partitions occur and how it handles node failures. These are important aspects of distributed systems and could be explored in that context.
-The other aspect that could have been analysed is the scalability of the network.
Mapping the Gnutella network
Gnutella builds, at the application level, a virtual network with its own routing mechanisms. The topology of this virtual network and its routing mechanisms have a significant influence on application properties such as performance, reliability, and scalability. The authors of the paper built a crawler to extract the topology of Gnutella’s application-level network, which they analyse along with network traffic information. There are two main findings: Gnutella does not follow a power law, and the Gnutella virtual topology does not match the underlying topology.
A common characteristic of P2P networks is that they build a virtual network over the underlying topology, which decides various factors like communication cost, performance, and reliability. Since the churn rate in a P2P network is high, these characteristics are dynamic and evolve over time; such systems are essentially self-organized networks of independent entities. Since Gnutella’s virtual network doesn’t map onto the underlying topology, it does not make efficient use of the network.
Pros:
-Their approach yielded a lot of information about the P2P system and its behavior.
-Their measurement study could be deployed on other P2P networks.
-They suggested improvements to the current Gnutella network.
From: Jason Croft
Sent: Thursday, April 28, 2011 12:16 PM
To: Gupta, Indranil
Subject: 525 review 04/28
Scaling Properties of the Internet Graph
This paper uses analysis and simulation to argue that congestion in the Internet
scales poorly as its size grows. The authors focus on graphs with power law degree
distribution and analyze both shortest path routing and policy routing (BGP) schemes.
Three different traffic vectors are considered: any-2-any (unit traffic between all
pairs of nodes), leaf-2-leaf (traffic between stub nodes, i.e., those without
customers), and clout (where highly connected nodes send larger amounts of traffic,
especially to other highly connected nodes). Simulations are run on a graph based on real AS
topologies, as well as a synthetically generated graph. The real AS graph, also
called an accurately labeled real graph (ALR), contains at most 13000 nodes; the
synthetically generated graph, or heuristically labeled synthetic graph (HLS) varies
with up to 50000 nodes.
Both the analytical results and simulation results show maximum congestion scales
poorly in power law graphs. For shortest path routing with an any-2-any model,
maximum congestion scales as n^2 for large values of n, or the worst possible rate
of growth of congestion. The distribution of congestion also becomes uneven as the
number of nodes increases. Leaf-2-leaf is about 0.8x any-2-any, and the clout model
scales worse than n^5. BGP scales better than shortest path, as n^5, since policies
may move traffic away from hot-spots.
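These congestion figures come from counting how many shortest-path routes cross each edge. As a hedged illustration (my own sketch, not the authors' code), the Python below routes unit any-2-any traffic over BFS shortest paths on a toy star graph, where every leaf-to-leaf path is forced through the hub:

```python
from collections import deque

def edge_congestion(adj):
    """Route one unit of traffic between every ordered node pair along a
    BFS shortest path; return per-edge load (the any-2-any model)."""
    load = {}
    for src in adj:
        parent = {src: None}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        for dst in adj:
            if dst == src:
                continue
            v = dst  # walk back along the shortest-path tree
            while parent[v] is not None:
                e = frozenset((v, parent[v]))
                load[e] = load.get(e, 0) + 1
                v = parent[v]
    return load

# Star graph: hub 0 with 5 leaves -- every leaf-to-leaf path crosses the hub.
n = 6
adj = {0: set(range(1, n))}
for i in range(1, n):
    adj[i] = {0}
load = edge_congestion(adj)
print(max(load.values()))  # each hub edge carries 2*(n-1) = 10 units
```

On the star, each hub edge carries 2(n-1) units while the hub as a whole handles on the order of n^2 route segments: a small-scale picture of why highly connected nodes become hot-spots in power-law graphs.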
Based on these results, the authors claim the uniform scaling in the link capacities
according to Moore's law as the Internet grows will likely not be enough to handle
increasing congestion. One solution they propose is the use of parallel links between
nodes. While this could be a simple method to handle increasing congestion, it could
be costly for highly connected ISPs that may have many potential hot-spots.
Exploring Complex Networks
In this paper, the difficulties in characterizing and understanding networks in different
fields are discussed. The author argues networks are inherently difficult to
understand because of their structural complexity, evolution, connection diversity,
dynamical complexity, node diversity, and meta-complications where complications
can influence each other. Networks of dynamical systems with regular topologies
(e.g., grids, lattices, fully-connected) and their problems with stability and
synchronization are examined. Biological examples, such as flashing fireflies,
inspired a large amount of research in this area.
The author then discusses more complex network architectures, while leaving aside
the challenges of dynamics. Random graphs, though studied in pure math, have
applications to ecosystems and the spread of diseases and computer viruses.
Small-world networks fall in between the order of regular graphs and the randomness
of random graphs. Graphs associated with search problems tend to have a small-
world topology. Scale-free networks, with skewed distributions, are seen in the
Internet backbone and the Web. Generalized random graphs, such as bipartite
graphs, model boards of directors and their interlocking into one large governance network.
Aside from these, there is no further discussion of the six points/complexities he
outlines earlier in the paper (e.g., node diversity or meta-complications).
From: Long Kai
Sent: Thursday, April 28, 2011 12:04 PM
To: Gupta, Indranil
Subject: CS 525 Reviews 04/28
Exploring complex networks
Summary:
This paper reviews the common properties and features of the many kinds
of network systems that scientists are exploring. Though the
question is critical to applications and development of such systems,
the problem is hard to understand because of structural complexity,
network evolution, connection diversity, dynamical complexity, node
diversity, and meta-complication. The paper first reviews what is
known about dynamical complexity in regular networks of nonlinear
systems. The author offers a few rules of thumb about the impact of
network structure on collective dynamics, especially for arrays of
coupled limit-cycle oscillators. This article also uses graph theory
to explore the structure of complex networks, an approach that has led
to some encouraging progress, especially when combined with the tools
of statistical mechanics and computer simulations.
The article is an inspiring review of the common features that
computer networks share with other network systems. However, there are
many differences between the Internet and other networks. The
fundamental reason is that the Internet is man-made and did not
evolve in a purely natural way: the links are not connected directly,
and the load is not distributed as evenly as in other networks.
Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer
Systems and Implications for System Design
Summary:
This paper examines two questions about Gnutella: its connectivity
structure and how the Gnutella virtual network topology maps to the
physical Internet infrastructure. It then provides
suggestions for improving the Gnutella system based on the features
they found. The authors have built a “crawler” to extract the topology
of Gnutella’s application level network to study the topology and
traffic of the system. They found that Gnutella node connectivity
follows a multi-modal distribution, combining a power law and a
quasi-constant distribution. A power-law structure arises in systems that
organize themselves so that most nodes have few links while a tiny
number of nodes, called hubs, have a large number of links. Networks
following this organizational pattern display an unexpected degree of
robustness. As for the mismatch between the Gnutella virtual topology and
the network infrastructure, the authors make the point that this mismatch
causes an unnecessary burden and stress on the network infrastructure. The
authors further discuss potential improvements for Gnutella.
A problem with this paper, though, is that matching Gnutella's topology
to the infrastructure topology may not improve efficiency much,
because the query distribution of Gnutella is very different from that
of normal websites. The authors assume that the distribution of Gnutella
queries is similar to the distribution of HTTP requests in the
Internet, which may not hold.
--
Larry
From: Michael Ford
Sent: Thursday, April 28, 2011 11:56 AM
To: Gupta, Indranil
Subject: 525 review 04/28
Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design
The authors examine the architecture of the network built by the Gnutella system. Since peer-to-peer systems are essentially virtual overlays with their own routing schemes, their performance is highly dependent on how well the overlay aligns with the underlying topology. The authors build a distributed crawler using a client-server architecture to map Gnutella networks. This greatly reduces the amount of time required to crawl the network, leading to more consistent snapshots and higher scalability.
Analysis shows that Gnutella peers have a high rate of churn, continued to grow in number, and generated a significant amount of traffic. Roughly one third of the traffic was user-generated, while the remainder was overhead; after an update in June 2001, the overhead was reduced to eight percent. The authors report that this is a significant portion of US traffic, but they only give statistics from before the June 2001 update, so it is not known whether the trend will hold in the future. Node connectivity remains constant as the network grows, allowing for good scalability but adding increased latency.
Scaling Properties of the Internet Graph
The Internet is continuously growing, and although link speeds also improve, there may be congestion bottlenecks. The authors consider BGP and shortest-path routing in their analysis and show that BGP routing does not improve congestion on the worst links as opposed to shortest path routing, though there are certainly other motivations for using BGP. They argue that the maximum congestion in the Internet scales poorly with the growing size of the Internet graph.
To this end, the authors combine theoretical analysis with simulation. The simulations are seeded with real Internet AS topologies and synthesized networks. The size of these networks is questionable: the largest AS topology contains only 13,000 nodes, and the largest synthesized network only 50,000. The previous paper, published a year earlier, observed networks as large as 50,000 nodes consisting only of Gnutella clients. It is unclear if their simulations can capture the hierarchical nature of the entire Internet with so few nodes. Regardless, they observe that congestion on the most congested link scales poorly in power-law graphs. They present a solution of doubling the links between some nodes. However, the authors do not show that maximum congestion is a good measure of overall performance, and their solution is specifically designed to counter this one problem (by doubling the capacity of that edge).
From: Nipun Sehrawat
Sent: Thursday, April 28, 2011 11:55 AM
To: Gupta, Indranil
Subject: 525 review 04/28
Scaling Properties of the Internet Graph
This paper presents an interesting study of how congestion in the Internet scales as it grows in size. The worst congestion in the Internet was found to grow as O(n^(1 + const)), which is a matter of concern considering the high value of n (the number of nodes in the Internet). This behavior shows up in both shortest-path and policy-driven (BGP) routing. Such congestion is expected in the ‘core’ links of the Internet, and can lead to performance degradation of large parts of the graph. Supplementing such hot-spot links with redundant links can help alleviate the congestion at these links. Alternatively, the routing policies in place can be changed to take these hot-spots into account.
The authors assume the Internet graph follows a power law, where the number of nodes having degree d is inversely proportional to a power of d; this seems reasonable, but could have been backed by empirical findings. The problem is thus reduced to finding the congestion in a power-law graph under shortest-path and policy-driven routing. Three types of traffic patterns are assumed: any2any (unit traffic between every pair of nodes), leaf2leaf (unit traffic between every pair of nodes that don’t have a customer, called stubs) and clout (which assumes stubs with high degree are likely to send more traffic).
Based on the simulations, it is observed that congestion at an edge grows with the degree of the nodes on either end of that edge. This seems reasonable, as such edges are expected to appear in the core of the Internet, such as between two border routers of ASes. Adding parallel links to supplement such links resulted in the maximum relative congestion growing only linearly with the number of nodes, which is a big improvement.
Pros:
- Provides insights into the growth of congestion in the Internet graph as the number of end nodes grows.
- Proposes a solution that contains congestion to linear growth by adding redundant links at key points in the Internet graph.
Cons:
- The authors could have backed their power-law assumption about the Internet graph with empirical findings.
--
Exploring Complex Networks
This article first motivates the importance of networks and how the study of networks is relevant to diverse fields ranging from computer science to neurobiology. The author also argues that the structure of a network plays an important role, as it affects the functionality of the concerned entity. For example, the topology of a social network affects the spread of an epidemic, and the topology of a power grid determines its robustness. Some of the difficulties involved in studying/understanding networks are:
- Structural complexity: It becomes difficult to represent a complex structured network
- Network evolution: It gets difficult to capture the dynamism of networks like the world wide web, where pages are added and removed continuously.
- Connection diversity: Links between nodes might be diverse in type, direction, weight, etc., e.g., in a computer network.
- Node diversity: A network may feature heterogeneous nodes, such as nodes being routers, bridges, PCs etc in a computer network.
The article describes random graphs, which are a collection of n nodes and m randomly generated links between these nodes. An important characteristic of such graphs is how the connectivity of nodes changes as m grows. For m > n/2, the graph has a connected component with O(n) nodes, with the closest rival to this chunk having only O(log n) nodes. This kind of network is seen in the spread of computer viruses, where keeping m low contains the spread.
Scale-free networks capture the fact that some nodes in a network are more connected than others, with node degrees following a power law. Such networks are resistant to random failures, as the connectivity is mainly dominated by a few hubs. However, the network is then vulnerable to targeted attacks on this limited number of hub nodes. Such graphs represent the Internet topology to a high degree.
Pros:
- The article presents the study of networks in totally different fields; computer science has a lot to borrow from the study of networks in these fields.
From: wzhou10 SpamElide
Sent: Thursday, April 28, 2011 11:49 AM
To: Gupta, Indranil
Subject: CS525 Review 04/28
CS525 Review 04/28 Wenxuan Zhou
Review on “Mapping the Gnutella Network: Properties of Large-Scale Peer-to-
Peer Systems and Implications for System Design”
Core idea
The popularity of P2P systems, such as the Gnutella network, makes
quantitative evaluations of them important. This paper analyzes the results
of two types of crawls, a sequential one and a distributed client/server one,
on the Gnutella network and extracts some properties of it. Gnutella is
designed with the following goals: the ability to operate in a dynamic environment,
performance and scalability, reliability, and anonymity. The following aspects of
Gnutella are analyzed: growth trends and dynamic behavior, estimate of
Gnutella generated traffic, connectivity and reliability in Gnutella, and
mapping between the Internet infrastructure and Gnutella. It draws the following
conclusions about Gnutella: 1) it generates a significant fraction of the total
Internet traffic; 2) its node connectivity follows a multi-modal distribution; 3)
its topology mismatches that of the Internet.
Pros
This paper provides some interesting statistics and results about the
properties of the Gnutella network. In particular, it reveals that Gnutella's
topology is completely independent of the underlying Internet structure.
Cons
1. The crawls may change the dynamics of the Gnutella network. It would be
nice to know how much effect the crawls had on the original Gnutella
network. In the extreme case, they change the very properties of the network
that are the paper's major interest, especially if the crawlers behave very
differently from normal Gnutella clients, which would render the results
meaningless.
2. There is also some confusion. It is found that in Nov. 2000, only 1/3 of the
traffic was useful user traffic, while the rest was for maintenance. Then, after
June 2001, the problem was reportedly solved, with 92% of traffic being user
traffic. However, as said earlier, no crawls were run after June 2001, so this
figure isn't supported by their own experimental results. Also, the fact that the
authors fail to mention any details about the solution other than “newer Gnutella
implementations” makes it even less convincing.
3. The mismatch between the topology of Gnutella and the Internet,
according to the authors, can cause serious problems. However, the paper
makes no effort to mitigate this problem or offer suggestions.
Review on “Scaling Properties of the Internet Graph”
Core idea
The authors discover that the worst congestion in Internet-like graphs scales
poorly with graph size when shortest-path routing is used, based on
both analysis and simulations. Moreover, policy-based routing is found,
surprisingly, to perform no worse in terms of worst congestion. To improve
the Internet graph, the authors suggest introducing parallel edges, which is
shown to be an effective remedy.
Pros
The conclusions are supported both by theoretical proofs and simulation
results. Some interesting discoveries are presented, and an effective way to
improve the Internet graph's scaling properties is proposed.
Cons
1. In Section 3.1, the traffic on each edge is calculated without considering the
capacity of each link. Moreover, load-balancing mechanisms could be
taken into account as well.
2. The assumption motivating the clout model, that “well placed” sources are
likely to send larger amounts of traffic than other sources, seems only somewhat
plausible. A better placement of a source only implies its ability to send more
traffic, not its likelihood of doing so.
3. The paper suggests adding parallel links according to simple functions
of node degrees to reduce congestion, saying that “in this case it might not be
necessary for some links in the graph to grow in capacity at a faster rate than
the others”. But I fail to see the fundamental difference between these two
approaches.
4. The entire paper is about worst-case congestion, but how representative this
worst case is goes unmentioned. Policy-based routing and shortest-path routing
are compared based on worst congestion alone, which may not be fair.
Thanks.
~Wenxuan
From: trowerm SpamElide on behalf of Matt Trower
Sent: Thursday, April 28, 2011 10:53 AM
To: indy SpamElide
Subject: 525 review 04/28
Mapping the Gnutella Network
The authors present in this paper a measurement study of the Gnutella P2P network application. Gnutella manages a virtual network topology through which queries and data are disseminated. This topology can be significantly different from the underlying physical topology, which leads to inefficiencies in routing hops. Later P2P networks such as Pastry have addressed this issue. Gnutella has to maintain peer connectivity in this graph and does so using PING and PONG messages. The study showed that these messages produced the majority of the traffic present in Gnutella networks until a fix was released later. Other topics, such as fault tolerance, are also covered in the paper.
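The crawler's core idea, repeatedly asking known peers for their neighbours until no new peers appear, amounts to a breadth-first traversal of the overlay. The sketch below is a hypothetical, in-memory version: the overlay table and the `ping` function are stand-ins for illustration, not Gnutella protocol code.

```python
from collections import deque

# Hypothetical in-memory overlay: peer -> set of neighbour peers.
overlay = {
    "a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"},
    "d": {"b", "c", "e"}, "e": {"d"},
}

def ping(peer):
    """Stand-in for a Gnutella PING: the PONG replies reveal a peer's
    current neighbours (here, just a table lookup)."""
    return overlay[peer]

def crawl(seed):
    """BFS crawl from a seed peer, recording every discovered overlay
    edge -- the same idea behind the paper's topology snapshots."""
    snapshot, frontier, seen = set(), deque([seed]), {seed}
    while frontier:
        peer = frontier.popleft()
        for nbr in ping(peer):
            snapshot.add(frozenset((peer, nbr)))
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return snapshot

edges = crawl("a")
print(len(edges))  # 5 distinct overlay links discovered
```

The real engineering problem the authors faced is hidden in `ping`: live peers join and leave mid-crawl, which is why they distributed the crawl to shorten each snapshot.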
This paper presents an interesting view into the Gnutella network, which was one of the predominantly used networks at the time. Many of the scalability issues discussed have since forced ISPs to change pricing models, as predicted. Furthermore, new P2P systems such as Pastry have emerged to correct issues in network topology present in Gnutella.
Scaling Properties of the Internet Graph
This paper presents a theoretical study of the congestion in the Internet graph as the number of edge nodes grows. It has long been understood that the Internet graph grows in the number of nodes and capacity in the core at rates similar to those expressed by Moore's Law. This paper shows that while the outer edges of the network indeed grow at this rate, internal nodes in the graph are required to grow much faster to accommodate the new traffic. The second interesting finding is that this holds true whether SPF or policy-based routing is used, which is perhaps not what one would initially expect. The authors suggest that the congestion in the network could be largely relieved by the use of a few parallel links between nodes identified as hotspots. Several strategies are presented for how these links should be selected, each of which provides similar benefits.
This paper provides an important contribution to our view of the Internet's growth. Under current architectures, it cannot be assumed that the end-to-end speeds of the Internet will continue to grow as it has previously. This paper seems to promote the idea of trying to keep as much traffic as possible close to the edges and avoid the hotspots in the core of the network.
From: kevin larson
Sent: Thursday, April 28, 2011 10:35 AM
To: Gupta, Indranil
Subject: 525 review 04/28
Akella et al investigate the scaling properties of the Internet graph. They characterize the Internet and look at the worst congestion, showing that it scales poorly, growing super-linearly with respect to increasing numbers of nodes. They model the scaling and congestion with respect to 3 models: any-2-any, leaf-2-leaf, and clout. Any-2-any assumes unit traffic between every pair of nodes. Leaf-2-leaf assumes traffic between all the leaf (stub) nodes in the graph. Clout attempts to best model the actual Internet graph, leveraging the knowledge that high-degree sources are likely to send larger quantities of traffic than other sources. They show the derivation of their formulas and simulate these three models with Inet-3.0. They then graph the maximum edge congestion with respect to the number of nodes for each of the three models. They show that while any-2-any and leaf-2-leaf show similar scaling, clout shows significantly higher congestion. They present policy-based routing, show the improvements it provides for scaling under the clout model, and suggest the addition of parallel links on these high-congestion links (suspecting that this is done in practice).
The authors strongly motivate their clout model and explain the shortcomings of the other two models in respect to a true model of the Internet. They also did a good job of motivating the parallel link implementation in respect to the actual implementation. Evaluation was strong and motivated, and graphs were well explained.
The authors presented both constant and dynamic values for alpha in the implementation of Inet-3.0, the exponent of the power law graph. They used the constant values for their experimental support and the dynamic values for their simulation results. This was confusing (to me), as it was not clear why the different usages of alpha were necessary. Additionally, they presented a graph of edge congestion vs the average degree, which had a hump at degree values of approximately 500. Insight into the cause of this would have been interesting, but was not provided.
Ripeanu et al presented a mapping of the Gnutella network, and explored its growth and traffic. They presented other attempts to characterize Gnutella, and explained the shortcomings of these works. They summarized Gnutella's goals in terms of dynamic configuration, performance/scalability, reliability, and anonymity, and explained the ping, pong, query, response, get, and push messages. They explained how their crawler captures the state of Gnutella, and presented growth trends and message behavior over time. They also mapped the characteristics of the connections, exploring the shortest paths and the relation between the number of nodes and the number of links (with respect to both the network as a whole and a single network topology). The relation between the number of nodes and links was interesting, as the number of nodes in a topology increased significantly between Nov 2000 and Mar 2001. They explored the infrastructure and showed many of the inefficiencies in how nodes were connected. Finally, they suggested improvements to Gnutella based on their findings.
The authors did a good job of motivating the improvements over previous characterizations that resulted from their techniques. They provided a lot of interesting data about how the Gnutella implementation and network changed and scaled over the period of their observations. They provided deep insight into the trends presented in their graphs, and explained these insights clearly and in detail. Additionally, they found areas that seemed promising for improvements.
The authors provided an evaluation of only a short interval of Gnutella. Although it is hard to argue this years later, it would be very interesting to see how these behaviors have changed and improved since their work.
From: Andrew Harris
Sent: Thursday, April 28, 2011 9:44 AM
To: Gupta, Indranil
Subject: 525 review 04/28
Review of “Mapping the Gnutella Network” and “Exploring Complex Networks”
In the Gnutella mapping project, the main goal was to determine a rough structure for that particular filesharing network. The researchers had in mind a supposed model - a power-law network - but had no empirical evidence to support this supposition. To explore this, they built a network crawler that, over a six month period, recorded various aspects of the network that would hint at its shape. These metrics included network joins and parts (churn), bandwidth utilization by the Gnutella protocol, and overall traffic volume. Their particular findings are curious; a preponderance of nodes enter and leave the network within about 4 hours, while a quarter remain alive for multiple days at a time. Traffic estimates for just the Gnutella protocol approach 330 TB/month, for a 50,000 node network, indicating issues of scalability with large networks. Locality of nodes is an aspect introduced as a method of improving Gnutella. Also introduced is the notion of a better query broadcasting mechanism, with CAN and Tapestry cited as methods of improving upon the Gnutella model.
The Strogatz article provides some insight as to the observed structure of the Gnutella network, as its own self-organization is rather similar to other self-organizing networks in nature and elsewhere. The article makes the point that knowing how certain networks tend to be arranged can provide new perspectives as to their analysis. Complex networks, such as randomly arranged networks and small-world networks, emerge in nonlinear systems, and their study is important. Now, with social networking at an apex of popularity, it seems more critical than ever to better understand networks, and to better understand how to apply the right sort of network model to a given unknown structure.
With social networking in mind, the transfer of information between persons follows similarly the patterns of information transfer observed between filesharing clients. Person A finds some interesting piece of information and makes it known that they have such to share; Persons B, C, etc., contact A for the information, an exchange is (potentially) made, and the data propagates across the network. Person N, connected to A and B through multiple intermediaries may send a query across their network, asking if anyone has heard of a particular topic - this would be a query flood in Gnutella terms. Persons A, etc., respond in kind, and a transfer is again made. Knowledge, like data, is easily replicated and shared between persons, so the same sort of analyses made on networks like Gnutella can be used on networks of live people as well (with caveats, of course).
What of newly emergent network structures, such as BitTorrent swarms and computing clouds? The former, BitTorrent, is much more anarchistic, in that the only common point is the tracker (and even that tends to vary, and is no longer completely necessary). Thus, you end up with a network that is quite random, though highly efficient at passing along information. The latter, computation clouds, tends to be easier to enforce a structure upon, as the machines themselves tend to be human-managed rather than automatically arranged. This isn’t always the case, though, as some allow for arbitrary network configurations, so to their clients the network may appear as complex as is desired for the application at hand.
All of these in mind, though, underscore the overall importance of understanding the overlap of disciplines within the sciences. Networks have been modeled within mathematics, biology, sociology, media studies, computer science, and other disciplines; surely no one discipline can fully claim it as their own. Thus, network analysis is a truly multidisciplinary skill, deserving of consideration from multiple angles when found in literature and in research.
--------------------
Andrew Harris
PhD Student, Social Spaces Group
Dept. of Computer Science
University of Illinois at Urbana-Champaign
harris78 SpamElide
From: Curtis Wang
Sent: Thursday, April 28, 2011 4:00 AM
To: Gupta, Indranil
Subject: 525 review 04/28
Curtis Wang (wang505)
4/28/11
Structure of Networks
Exploring Complex Networks
The review gives a very brief overview of the nonlinear dynamics of systems and goes into deeper detail on three types of systems—small-world, scale-free, and generalized random connectivity. The small-world model lies somewhere between a completely ordered model and a completely random model. It takes a regular lattice and randomly replaces edges with a certain probability. The main observation is that the resulting graph has short paths between any two nodes. The scale-free network’s degree distribution follows a power law. The generalized random connectivity model permits more variation in the degree of nodes. The author briefly speculates about how networks of these three different types may have advantages, such as being more synchronizable or more fault-tolerant.
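The lattice-rewiring construction described above is straightforward to sketch. The following Python code (an illustration in the spirit of the Watts-Strogatz model, not taken from the article; parameters and seed are arbitrary) builds a ring lattice and rewires each edge with probability p:

```python
import random

def small_world(n, k, p, seed=7):
    """Watts-Strogatz style construction: start from a ring lattice where
    each node links to its k nearest neighbours on each side, then rewire
    each lattice edge to a random new endpoint with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):                      # build the regular ring lattice
        for j in range(1, k + 1):
            adj[v].add((v + j) % n)
            adj[(v + j) % n].add(v)
    for v in range(n):                      # rewire each edge with prob. p
        for j in range(1, k + 1):
            w = (v + j) % n
            if rng.random() < p and w in adj[v]:
                candidates = [u for u in range(n)
                              if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj

lattice = small_world(100, 2, 0.0)   # p=0: fully regular ring
rewired = small_world(100, 2, 0.1)   # p=0.1: a few shortcuts appear
print(sum(len(s) for s in lattice.values()) // 2,
      sum(len(s) for s in rewired.values()) // 2)
```

At p = 0 the graph stays a regular ring; even a small p introduces shortcuts that sharply reduce path lengths while the total edge count stays fixed at n*k.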
Pros
- General introduction into an interesting field
- Motivates many relevant examples from several different fields
- Very illustrative diagrams
Cons
- As a general overview, it seemed too short to be so technical. It could have been either longer with more details, or shorter without going into the mathematical details
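As an aside, the small-world construction the review describes (a regular ring lattice whose edges are rewired at random) is simple enough to sketch directly. Below is a minimal plain-Python sketch; the lattice size, neighbor count, and rewiring probability are arbitrary choices for illustration, not values from the paper:

```python
import random

def small_world(n, k, p, seed=0):
    """Ring lattice with k neighbors on each side; every edge is
    rewired to a random endpoint with probability p (Watts-Strogatz
    style, simplified)."""
    rng = random.Random(seed)
    edges = {(i, (i + j) % n) for i in range(n) for j in range(1, k + 1)}
    rewired = set()
    for u, v in edges:
        if rng.random() < p:
            w = rng.randrange(n)
            while w == u or (u, w) in rewired or (w, u) in rewired:
                w = rng.randrange(n)
            rewired.add((u, w))
        else:
            rewired.add((u, v))
    return rewired

def avg_path_length(n, edges):
    """Mean shortest-path length over reachable pairs (BFS per node)."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = pairs = 0
    for s in range(n):
        dist, frontier = {s: 0}, [s]
        while frontier:
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        nxt.append(v)
            frontier = nxt
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

L_lattice = avg_path_length(200, small_world(200, 3, 0.0))
L_rewired = avg_path_length(200, small_world(200, 3, 0.1))
print(L_lattice, L_rewired)  # a little rewiring shortens paths markedly
```

Even a small rewiring probability noticeably shortens the average path length, which is exactly the "short paths between any two nodes" observation the review highlights.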
Scaling Properties of the Internet Graph
The authors discuss the problem of scaling the Internet graph: new end-nodes are being added every day, which puts increased load on the links in the network. In addition, the load may grow faster for certain bottleneck links in the network. Through theoretical analysis, the authors show that the worst congestion of the Internet scales quite poorly with the network size and that policy-based routing is no worse than shortest-path routing. In addition, they offer a potential solution, showing that by adding parallel links in the existing network, the maximum relative congestion in the graph scales linearly instead.
Pros
- Addresses an important (and seemingly easy-to-overlook) problem that could have lots of impact on the future of the Internet
- Theoretical analysis with supporting simulation results
- Offers a potential solution—adding parallel links
Cons
- Could offer more possible solutions, maybe regarding routing policies or other ways to alter the graph structure
Adding parallel links seems like it would be a very slow and costly process. Are there other methods of changing the graph structure that could be used? Is there analysis about which links should be reinforced first (with a parallel link)?
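To make the congestion measure concrete, here is a toy sketch (not the authors' model): route one unit of traffic between every ordered pair of nodes along a single shortest path, and count the load on each edge. On a hypothetical two-hub topology, the inter-hub link carries all cross traffic, and a parallel link doubling its capacity would halve its relative congestion:

```python
from collections import defaultdict, deque

def edge_loads(adj, n):
    """Route one unit of traffic between every ordered node pair along
    a single shortest path (first-found BFS parent) and tally the load
    carried by each undirected edge."""
    load = defaultdict(int)
    for s in range(n):
        parent, q = {s: None}, deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        for t in range(n):
            u = t
            while parent.get(u) is not None:
                load[tuple(sorted((u, parent[u])))] += 1
                u = parent[u]
    return load

# Toy topology: two hubs (0 and 1) joined by one inter-hub link, each
# serving five leaves. All numbers are arbitrary.
n = 12
adj = {i: set() for i in range(n)}
def link(a, b):
    adj[a].add(b)
    adj[b].add(a)
link(0, 1)
for leaf in range(2, 7):
    link(0, leaf)
for leaf in range(7, 12):
    link(1, leaf)

load = edge_loads(adj, n)
worst_edge, worst = max(load.items(), key=lambda kv: kv[1])
print(worst_edge, worst)  # the inter-hub link carries all cross traffic
print(worst / 2)          # relative congestion if a parallel link doubles capacity
```

The graph, uniform traffic matrix, and routing rule are all simplifications chosen for illustration; the paper's actual analysis works with power-law graphs and asymptotic arguments.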
From: Tengfei Mu
Sent: Thursday, April 28, 2011 1:07 AM
To: Gupta, Indranil
Subject: 525 review 04/29
1. Exploring Complex Networks
This paper presents an overview of the modeling of complex networks. The author starts by illustrating how networks are structured in the real world, with examples from biology and sociology. He then points out the issues that make the study of networks complex, such as evolution, dynamics, structure, node diversity, connection diversity, and meta-complication. The author also analyzes the connections between different parts of the network mathematically. In the second part, the author describes networks that combine dynamical and structural complexity.
Pro:
1. The authors have introduced and discussed different examples of complex networks, as well as the individual dynamics of the components and the coupling behavior. It is a good review paper overall.
2. I agree with the author’s conclusion that network thinking will become essential to all branches of science in the long term.
Con:
1. The authors only studied 3 aspects of a complex network.
2. Mapping the Gnutella network: properties of large-scale peer-to-peer systems and implications for system design
This paper focuses on characterizing Gnutella’s connectivity and how it relates to improving the system’s resilience. The paper first finds that the Gnutella network is not exactly a power-law network. Pure power-law networks, despite their advantages, are vulnerable to DoS attacks targeting their high-degree nodes; Gnutella’s flattened degree distribution means it does not fully fall into this vulnerable category. The second finding is that Gnutella’s logical topology does not match the underlying network structure, which means Gnutella wastes network resources when sending messages between nodes.
Pro:
1. Different aspects of the Gnutella network that affect the scalability and performance of the Internet are discussed, which could potentially inspire future work to overcome those limitations.
Con:
1. No possible solutions are offered for the problems discussed in the paper.
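On the "not exactly power-law" question, the standard quick check is to estimate the exponent of the degree distribution by maximum likelihood and inspect how well the tail fits. A minimal sketch of the continuous (Hill) estimator follows; the degree sample is synthetic stand-in data, not Gnutella's, and real degree data are integers, for which the discrete estimator differs slightly:

```python
import math
import random

def powerlaw_alpha(degrees, dmin=1):
    """Continuous maximum-likelihood (Hill) estimate of the power-law
    exponent alpha, fitted to the tail degrees >= dmin."""
    tail = [d for d in degrees if d >= dmin]
    return 1.0 + len(tail) / sum(math.log(d / dmin) for d in tail)

# Synthetic degrees drawn from a Pareto tail with true alpha = 2.5,
# standing in for a crawled snapshot (hypothetical data).
rng = random.Random(42)
degrees = [(1 - rng.random()) ** (-1 / 1.5) for _ in range(50_000)]
alpha_hat = powerlaw_alpha(degrees, dmin=1)
print(round(alpha_hat, 2))  # close to the true exponent 2.5
```

On a real crawl, a flattened head like Gnutella's would show up as a poor fit at low degrees even when the tail estimate looks clean.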
From: david.m.lundgren SpamElide on behalf of David Lundgren
Sent: Thursday, April 28, 2011 12:55 AM
To: Gupta, Indranil
Subject: 525 review 04/28
Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer
Systems and Implications for System Design
Ripeanu et al. explore Gnutella's virtual topology as well as the
correspondence between Gnutella's virtual structure and the underlying
physical Internet infrastructure. The authors characterize Gnutella as a
``semi-scale-free'' network, where the degree distribution seems to
follow more or less a power law, but with a flattened head. The authors
posit that the flattened head is the result of users being early
technology adopters and having a different distribution of high-speed
Internet access than what is typical of most network users. It is then
demonstrated that the virtual topology is independent of the underlying
physical network infrastructure. This independence along with a
power-law-esque distribution make Gnutella robust to network outages and
other more malicious failures. The authors recommend implementing a
mechanism similar to proxy caching for HTTP requests on the assumption
that Web requests and P2P requests follow similar distributions. I
believe that this assumption is slightly incorrect as shown by Krishna
et al. in 2003. More innovative protocols for query dissemination are
suggested to reduce the overhead of the network.
Pros, Cons, Comments, and Questions:
- The authors' measurement and analysis techniques seem rigorous and will
most likely generalize to other P2P networks.
- I thought it was quite interesting that the number of connections per
node remained constant as network size increased and that the average
shortest hop is less than seven nodes, while most nodes are connected
by two steps.
- I think future work could (or may already have, given the paper's age)
examine other salient network characteristics such as edge
weight/frequency of contact, network diameter, clustering coefficient,
etc.
- I did not understand the reason for only sampling traffic along a
single edge. I think more samples would have improved my confidence in
this measurement.
- I am also not convinced by the early-adopter explanation for the flattened
head of Gnutella's degree distribution. I am also not sure the authors
presented enough evidence to claim that all graph statistics are
preserved in their snapshots.
-------------------------------------------------------------------------
Exploring Complex Networks
In his review article published in Nature, Strogatz proceeds to explain
the mechanics and certain unique characteristics of a variety of graph
families. He begins with regular networks and discusses our
understanding of coupled dynamical systems of oscillators. More complex
network structures are then discussed, such as random graphs, scale-free
networks (graphs with power law degree distributions), and generalized
random graphs. Small-world networks are examined as a middle ground
between regular networks and more random architectures. Lattice networks
with links replaced with random ones probabilistically are shown to
accurately model many real world graphs such as biological networks.
Graphs generated in this manner often have a single, massive connected
component and few ``dominant'' central hubs. Scale-free networks or
power law networks form a family of graphs extremely resilient to random
node failures, but brittle to targeted attacks at key hub nodes. Real
examples include Web page link networks. Finally generating functions
are introduced for generalized random graphs of arbitrary degree
distributions. Examples of bipartite corporate, academic, and celebrity
graphs are generated in this manner. The clustering coefficient of
these models is compared to that of empirical networks. The corporate
model is shown to be an ideal fit, while the academic and celebrity
networks are slightly off.
Pros, Cons, Comments, and Questions:
- Overall I thought this was an excellent review article at the nascent
stage of network analysis popularity in computer science. It would be
interesting to see an updated version of this paper discussing more
recent work given the explosion of Twitter, Facebook, and other social
media. ``Computational social science'' is an emerging and fascinating
field.
- The mathematical framework for constructing general random graphs
seems extremely potent and applicable to research in distributed
systems.
- I would like to see an overview of theoretical models of network
structure focusing purely on distributed systems and other networks
related to computer science. I also think that there are many more
advanced metrics for graph analysis that would be well suited for an
additional survey article.
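The clustering coefficient compared above has a short direct definition: for each node, the fraction of its neighbor pairs that are themselves connected, averaged over the graph. A minimal sketch on a hypothetical four-node graph:

```python
def clustering_coefficient(adj):
    """Average local clustering: for each node with degree >= 2, the
    fraction of its neighbor pairs that are themselves linked."""
    coeffs = []
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # clustering is undefined for degree < 2
        nbrs = sorted(nbrs)
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

# Hypothetical graph: a triangle (0, 1, 2) plus a pendant node 3.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
c = clustering_coefficient(adj)
print(round(c, 3))  # (1/3 + 1 + 1) / 3 = 7/9
```

The generating-function models in the article predict this quantity analytically for a given degree distribution, which is what makes the empirical comparison possible.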
From: Agarwal, Rachit
Sent: Wednesday, April 27, 2011 9:25 PM
To: Gupta, Indranil
Subject: 525 review 04/28
Rachit Agarwal
----
Scaling Properties of the Internet graph
The paper argues that the worst-case congestion in the Internet scales poorly with the network size. Using analytical and empirical arguments, the paper also shows that in the realistic case of using policy-induced routing, the worst-case congestion has similar behavior as that with shortest-path routing. My thoughts/comments on the paper:
1. Such results typically depend on traffic patterns in the Internet. While it is a good start to understand the congestion using simple traffic patterns, such patterns could be very unrealistic in practice. It would have been more interesting to see analytical results for a more general class of traffic patterns, which may be a difficult task to achieve.
2. The notion of policy routing used in the paper is very loose. Typically, ASes do not *just* use valley-free routing and have a more sophisticated local preferences over the set of valley-free paths (even if we choose to forget about other policy settings). It would have been interesting to see some discussion on how these results carry over to the more general case.
3. The analytical case seems a bit tricky. In particular, is the analysis modeling an AS-level topology? If not, how is the core of the network defined? (If my understanding is right, each router in the core of the network does not have to have a very large degree.) If it is, then the analysis should have considered the fact that the AS-level topology is already a multigraph with multiple edges between pairs of ASes (pairs of ASes may have different points of connection).
----
Exploring Complex Networks
An interesting read that surveys results on how structural properties of graphs/networks play a role in functionality of these networks. The focus is, in general, on how several key ideas apply to the general understanding of a wide variety of networks. The author argues that among various aspects of these networks, some are easy to model and leaves exploring the other aspects as major open problems in the field. Focusing on the more interesting case of data networks, some comments:
1. Modeling realistic data networks simply based on their structure may lead to very counter-intuitive (and possibly incorrect?) conclusions. For instance, the author cites a work that argues the following - "any node [in scale-free networks] that fails probably has small degree". While this may be true for failures due to device aging, in practice the failures in data networks are often due to the "load" on the device, which is usually higher on nodes with high degree. How would one model such failures based solely on the structure of the underlying graph?
2. Among the various complications discussed by the author, I found connection diversity, dynamical complexity and meta-complication as the most interesting ones. Too bad that it turns out that we do (?) not have good tools to model these complications. Does that mean that we should fall back to simulations and empirical approaches? Can we make realistic assumptions while not completely ignoring these complications in our analysis?
3. I am curious if structural properties can be used to describe the effect of correlated failures and load-induced failures?
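On point 1, the structural side of the resilience claim is easy to probe by simulation: grow a preferential-attachment graph, then compare random node removals against removals targeting the highest-degree nodes. The sketch below uses plain Python with arbitrary sizes; it illustrates the structural argument only and does not model load-induced or correlated failures:

```python
import random

def pref_attach(n, m, seed=1):
    """Preferential-attachment (Barabasi-Albert-style) graph: each new
    node links to m existing nodes chosen proportionally to degree."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    targets, repeated = list(range(m)), []
    for v in range(m, n):
        for t in set(targets):
            adj[v].add(t)
            adj[t].add(v)
        repeated.extend(targets)        # degree-weighted urn of endpoints
        repeated.extend([v] * m)
        targets = [rng.choice(repeated) for _ in range(m)]
    return adj

def giant_frac(adj, removed):
    """Fraction of all nodes left in the largest connected component
    after deleting the given node set."""
    alive = set(adj) - removed
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        stack, size = [s], 0
        seen.add(s)
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w in alive and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(adj)

adj = pref_attach(500, 2)
hubs = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
targeted = giant_frac(adj, set(hubs[:50]))   # remove the 50 biggest hubs
rand_frac = giant_frac(adj, set(random.Random(7).sample(sorted(adj), 50)))
print(targeted, rand_frac)  # targeted attack fragments the graph far more
```

Targeted removal fragments the graph far more than random removal of the same number of nodes, which is exactly the asymmetry the "small degree" quote relies on, and why load-induced failures that preferentially hit hubs would break it.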
----
--
Siebel Center for Computer Science,
University of Illinois at Urbana-Champaign,
Web: http://agarwa16.wikidot.com
----
Sing as if No one is Listening,
Dance as if No one is Watching,
Dream as if You'll Live Forever,
Live as if You'll Die Today !!
From: Ankit Singla
Sent: Wednesday, April 27, 2011 4:12 PM
To: Gupta, Indranil
Subject: 525 Review 04/28
1. Scaling Properties of the Internet Graph
-----------------------------------------------------------------
Summary: The paper attempts to study the impact of increasing number
of nodes in the Internet AS graph on the congestion in the network. It
looks at shortest path routing as well as "valley-free" policy
compliant routing on the AS-graph as well as synthetic graphs. The
authors analytically prove that the maximum congestion scales as
Ω(n^(1+1/α)), where α is the power-law exponent of the degree
distribution. Through simulations, they show that policy-compliant routing
doesn't worsen this significantly. They conclude by noting that adding
parallel edges between high-degree nodes seems to solve the congestion
scaling problem.
Comments: The preferential attachment model may not really be
appropriate for network evolution in this setting - ISPs are likely to
look at the traffic they see to decide the best connectivity
augmentations to make. Likewise, the unit-flow between each node-pair
is far from ideal in this setting - traffic can be expected to be
fairly skewed instead of uniform. The "clout" model seems insufficient
because the destination's degree should also matter, if degrees are
indicators at all. The result that congestion is relatively independent
of the policy restrictions is fairly surprising.
The notions of cause and effect are somewhat mixed-up here: ISP
contracts probably factor congestion into them anyway, augmenting
connectivity as required. Possibly the Internet graph already has
enough edge multiplicity to avoid poor congestion scaling. ISPs
probably look at existing traffic conditions rather than simplistic
heuristics like node degree to decide where to add connectivity.
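To get a feel for how steep the Ω(n^(1+1/α)) lower bound is, a quick numeric check, assuming a degree exponent α of about 2.1 (a commonly cited estimate for the Internet graph; treat the value as an assumption here, not a figure from the paper):

```python
# Growth of the Omega(n^(1 + 1/alpha)) worst-case congestion bound.
alpha = 2.1

def bound(n):
    return n ** (1 + 1 / alpha)

for n in (1_000, 10_000, 100_000):
    print(n, round(bound(n)))

# A 10x increase in n inflates the bound by 10^(1 + 1/alpha), i.e. ~30x,
# which is the super-linear scaling the paper warns about:
print(round(bound(10_000) / bound(1_000), 1))
```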
2. Exploring Complex Networks
--------------------------------------------------
Summary: The paper discusses complex networks - random graphs,
small-world phenomena and scale-free graphs included. It describes
very intuitively the properties of these networks and to some extent,
the rationale for how these properties come about. In some ways, it is
a survey of results as well as applications of graph models and
techniques.
Comments: It's strange how some parts of the article are watered down
to avoid detail, and how in some parts entire formulae and equations
are thrown in. The note towards the end of the article is a nice
observation: theoretical graph models can separate properties that
occur naturally across many different settings, simply because of
the somewhat universal characteristics of connectivity, from
features that are special to a particular setting.
Ankit
From: Qingxi Li
Sent: Wednesday, April 27, 2011 2:15 PM
To: Gupta, Indranil
Subject: 525 review 04/28
Exploring complex networks
This paper introduces two mathematical models for analyzing network structure: regular networks of dynamical systems and random graphs. The author also discusses the reasons why network structure is difficult to analyze, such as structural complexity, network evolution, connection diversity, and node diversity. However, regarding network evolution, the author argues that links are created and lost every minute; in fact, the physical links between and within networks do not change frequently, and the creation or loss of a connection between two overlay nodes does not affect the underlying links. Likewise, node diversity need not affect the structure of the network: the bandwidth of a link is determined by the type of the link, so whether the node using it is a supercomputer or a laptop, the bandwidth assigned to it is the same. Additionally, real networks lie somewhere between a random graph and a regular dynamical system. Unlike a pure random graph, there are still principles that can be used to model their structure, but they are not so regular that they fit neatly into a single model. So, compared with creating a theoretical model to analyze the complex structure of a network, directly evaluating and measuring the structure may be more useful, since the structure does not change so frequently that it cannot be measured.
Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design
This paper analyzes the structure and traffic of Gnutella. It uses a distributed crawler to take a snapshot of the Gnutella topology, but there are two problems. First, although the authors do not mention it in the paper, it seems they used IP addresses to distinguish nodes. Since some nodes may obtain their addresses via DHCP, this may underestimate the average lifetime of nodes; as another paper we read this semester suggests, the underestimation may be around 20% to 25%. The other problem, also mentioned by the authors, is that a crawl takes a couple of hours, which is long compared with node lifetimes (40% of nodes live less than 4 hours and only 25% live more than 24 hours). The results are interesting. After Gnutella changed its protocol, membership messages decreased from 55% to 8% of traffic. And even though the authors think there are too few nodes to form a power-law structure, figures 5 and 6 show that most nodes still have small degree compared with the number of high-degree nodes. The number of nodes with degree less than 10 may appear small simply because the total number of nodes increased, or because of the protocol change. The other result found by this paper is that Gnutella does not use the underlying network structure efficiently, which may simply be because Gnutella is built as an overlay network.