From:	yanenli SpamElide on behalf of Yanen Li [yanenli2 SpamElide]
Sent:	Tuesday, April 05, 2011 12:32 PM
To:	Gupta, Indranil
Subject:	cs 525 review 04/05/2011

Yanen Li


paper 1:
Astrolabe: A robust and scalable technology for distributed system
monitoring, management, and data mining
summary:
This paper presented astrolabe, a distributed, self-organized
information management system, which collects large scale system
state, provides on-the-fly attribute aggregation and performs rapid
updates. The Astrolabe system is essentially a relational database
built using a peer-to-peer protocol running between the applications
or computers on which Astrolabe is installed. Astrolabe monitors the
dynamically changing state of a collection of distributed resources,
reporting summaries of this information to its users. This information
can be retrieved by queries written in SQL syntax.
methods in Astrolabe:
1. Mobile code: Astrolabe uses SQL-like query language to enable
different applications to monitor different kind of information or one
application to query different data at different times.
2. Hierarchy: by using zone hierarchy, Astrolabe achieves scalability.
It first summarizes information in zones and then exchanges
information between zones.
3. Random peer to peer: Astrolabe runs a process in each host and let
them communicate in a epidemically, peer-to-peer fashion. This avoids
vulnerable and heavily loaded central monitor servers.
4. Certification: Astrolabe has its own public key infrastructure over
the zone tree hierarchy. It prevents and controls propagation of bad
data and costly operation. It also digitally signs the information
exchanged between nodes and provided to users.

Pros:
1. Astrolabe is largely scalable. It is reported that it can scale to
thousands of nodes.
2. SQL-like query language provides an easy to use interface to
monitor and manage information in the system
3. peer-to-peer Gossip communication protocol makes Astrolabe low
latency compared to other DNS-like management systems.

Questions:
1. Timestamps Astrolabe are assumed to be global – is this reasonable
assumption?

Paper 2:
Moara: Flexible and Scalable Group-Based Querying System
summary
This paper presents Moara, a distributed querying system addressing
the group-based one-time query problem over a  large-scale distributed
infrastructure. It presents a SQL-like query  interface, and models
queries as a tuple of (query-attribute,  aggregation-function,
predicate), where the predicate is expressed as  a series
conjunct/disjunct terms of the form (attribute op value). Moara knows
how to deliver the query and aggregate the data through the member
nodes. Through this design, bandwidth could be saved and resource with
nodes could be saved also. This paper is very focused on the query
optimization in Moara. Two important optimization techniques include
using per-simple-predicate-based prune variables to help pruning
subtrees and  using a path-compression-like technique to jump over
unnecessary nodes.

pros:
1. The SQL-like query interface is simple and useful, which can
satisfy most of
the needs of administrative queries.
2. Focusing on the most important issues of query optimization.
3. Dynamic maintenance of groups.

cons:
1. Moara is using pooling strategy to monitor the system states.
However, notification strategy might be more suitable in the case for
administrative purposes where continuous monitoring is more common
than one time query.
2. The authors try to find the lowest cost simple group for composite
predicates, but this solution is not theoretically optimal.


-- 
Yanen Li

Ph.D Candidate
Department of  Computer Science
University of Illinois at Urbana-Champaign
Tel: (217)-766-0686
Email: yanenli2 SpamElide
From:	Igor Svecs [isvecs SpamElide]
Sent:	Tuesday, April 05, 2011 12:27 PM
To:	Gupta, Indranil
Subject:	525 review 04/05

Igors Svecs
CS 525
2011-04-05

Reliable On-Demand Management Operations for Large-scale Distributed
Applications

SUMMARY

The paper introduces MON, a gossip-based protocol for building
on-demand overlays for managing distributed applications. Each node
maintains a list of log(N) members. When a query (such as "select
<resource_list> where <condition>") is issued, an overlay tree or a
DAG is constructed by forwarding Session messages between the nodes.
Experiments show that construction takes ~1.5 seconds on average on a
330-node PlanetLab slice. For their implementation, on-demand overlay
construction is more feasible than persistent overlays if a query is
sent at least every 40 seconds. Medium-term overlays are considered to
increase query throughput; DAGs are used to add some redundancy in the
overlay to survive failures.

PROS

- Does not require to have global membership information.

- On-demand overlay construction makes RON simpler compared to
persistent overlays.

- With medium-term overlays, MON is useful for infrequent manual
management tasks.

CONS

- Since MON is designed for a modest scale (several 100's to 1000's
nodes) and assumes infrequent queries (on-demand tree construction),
there appears to be no reason why each instance of the distributed
application could not maintain a connection to a well-known management
server (botnet-style). This should have been discussed in the paper.

- Partial membership lists have to be maintained by each instance, and
they must include at least log(N) entries. There is also an assumtion
that these lists should have uniform random distribution, which may
not work for each application, and may require additional overhead.

- 1.5-2 s additional latency (tree construction) for each query may
limit query rates. 40 second feasibility threshold seems too high for
automated application monitoring, and is more suitable for manual
administrative tasks.


Moara: Flexible and Scalable Group-Based Querying System

SUMMARY

Moira is a system that builds and maintains group trees for subsequent
querying. Each Moira node mainstains a list of (attribute, value)
pairs. This information can be dynamic; e.g., (Current-CPU-Usage, 55).
A Moira query is formatted as (query-attribute, aggregation function,
group-predicate), e.g., (Current-CPU-Usage, Average, Service = Apache
and Application = AppX). Group predicate is a boolean expression of
the type (group-attribute op value). Moira uses Pastry DHT as an
overlay, and maintains a DHT tree for routing. When a query is
generated, group-attribute is hashed to a value, and the query is
forwarded to a node that is the root of the corresponding
group-attribute tree. This root then propagates the query to its
children, does in-network aggregation, and returns the result to the
client. To avoid flooding the query to all nodes, subtrees that do not
satisfy the predicate are pruned - child nodes proactively notify
their parent with their (attribute, value) pairs, and the parent can
use this information to determine whether to forward the query down
the subtree.

PROS

- Design of this system is driven by existing need.

- The system considers both scalability and efficiency. Once the
pruning mechanism is fine-tuned, it can be very efficient.


CONS

- Aggregation function is required to be partially aggregatable.

- Group-attributes need to be more-or-less static.

- Only eventual completeness is guaranteed.

- Relatively complex system that deals with lots of dynamic structures.

- A lot of manual fine-tuning and optimization is required for the
pruning to work effectively.
From:	w SpamElide on behalf of Will Dietz [wdietz2 SpamElide]
Sent:	Tuesday, April 05, 2011 12:27 PM
To:	Gupta, Indranil
Subject:	cs525 Review 04/05

"Reliable On-Demand Management Operations for Large-Scale Distributed
Applications"

This paper presents "MON" (Management Overlay Network) system, that
introduces a number of key ideas.  MONS introduces the idea of
'on-demand' overlay networks,which the authors use to faciliate
commands over a distributed system.
  On-demand overlay networks are probabilistic graph constructions
(trees or DAGS, depending on what makes more sense for a particular
query) that are generated for each query (idea being that there are
relatively few queries, and not needing to preserve the overlay makes
it much cheaper) and used to issue and gather the results of a
particular control command.  The commands they support over this
system are avg/top/histo/grep/run and a more general 'select' to find
resources/nodes that match certain criteria.
  One thought is that it doesn't seem clear why the limited command
set--seems you could support quite a few more operations relatively
easily.  My best guess is that the main contribution is the membership
and overlay construction ideas (and investigation of how
effective/reliable/etc) they are, and that the authors leave command
language construction to other works.
  They added MONs to PlanetLab, claiming it's "available".  Does this
mean it's actively in use on the PlanetLab cluster?
From:	harshitha.menon SpamElide on behalf of Harshitha Menon [gplkrsh2 SpamElide]
Sent:	Tuesday, April 05, 2011 12:17 PM
To:	Gupta, Indranil
Subject:	525 review 04/05

MON:

This paper proposes a mechanism for instant monitoring and management of tasks for large-scale distributed applications running on hundreds of hosts. The approach uses MON (Management Overlay Networks) , on-demand overlays, to support instant queries to be executed on hosts in the system. On-demand overlays are built on-the-fly by using gossip-style membership protocol. Cluster management tools allow querying of resources associated with the infrastructure but there is a scarcity of tools that application developers can use for managing their applications on a cluster. This system built on top of weakly consistent membership information can be used to execute monitoring and software push commands quickly, scalably and reliably. The system is built on two kinds of overlay structures: trees and DAGs. 

Pros:
-Uses gossip-based membership management protocol which aims at eventual consistency. This means that even in case of failures, this system would perform reliably. 
-This system is scalable and latency is low for query execution.
-This system is light-weight in terms of memory, computation and bandwidth
-Provides tuning parameters for application-specific reliability

Cons and Suggestions:
-Construction of overlay tree and DAG structure would incur costs of construction and rebuilding when nodes join and leave
-Experiments showing the performance during high churn rate is not shown.
-max_drop and max_missing specified at overlay creation time doesnt seem to be the right metric as it might have different consequences based on the overlay structure. For eg: in tree failure of some node can lead to a branch being inaccessible but instead for DAG, thats not the case and might trigger false failure alarms.

Moara: Flexible and Scalable Group-Based Querying system

Moara is a querying system which builds aggregation trees for different groups and adaptively maintains the trees to optimize total message cost. Moara supports a query language allowing groups to be specified implicitly. A query in Moara has three components, query attribute, aggregation function and group predicate. Moara uses peer to peer in network aggregation approach and uses a DHT trees to compute results. These trees are used for spreading queries and aggregating answers back towards source node. Moara uses query optimization techniques. It uses adaptive pruning of branches of the DHT trees and only querying nodes that can reply. This causes reduction in bandwidth and reduces response time.

Pros:
-The query language is simple and handles composite query expressions
-Dynamically maintains group trees based on query rates and churn rates which reduces the bandwidth consumption and response time
-The simulations indicate that the churn doesnt affect the system

Cons and Suggestions:
-Moara uses DHT trees for group predicate and this would incur maintenance cost. So how about using DAG


From:	Jason Croft [croft1 SpamElide]
Sent:	Tuesday, April 05, 2011 12:17 PM
To:	Gupta, Indranil
Subject:	525 review 04/05

Reliable On-Demand Management Operations for Large-scale Distributed 
Applications

Management Overlay Networks (MON) arise out of a need for low cost management 
of end-user applications in distributed infrastructures.  The design allows queries to 
be executed for large-scale distributed applications using weakly-consistent 
membership information and building one overlay per query.  On-demand overlays are 
used in place of persistent overlays to reduce bandwidth overhead incurred by 
membership maintenance.  Instead, a gossip-style protocol is used to distributed this 
information so nodes can build partial views, or partial membership lists. 

MON supports two different overlay structures: trees and DAGs, for instant status 
queries (trees) and software push (DAGs).  Trees can be constructed using a random 
tree construction algorithm or a locality-aware two-stage construction.  The two-stage 
method improves on the random tree, which is oblivious to the underlying network.  
DAGs are constructed using levels, where each node receives a level one more than 
that of its first parent. 

However, this design is limited by an upper bound on the rate of injected queries.  
Above a rate of one query every 40 seconds, MON is less effective than a persistent 
overlay in bandwidth usage.  While this rate may be acceptable for 
operators/administrators, it would not work well for automated or continuous 
monitoring.  Therefore, the authors also propose the use of medium-term overlays 
that last for hundreds of seconds.  This can then be optimized for session reliability 
(that at any given time, no more than a specified number of nodes are disconnected) 
or task reliability (that at most a specified number of nodes will be missing from the 
aggregate result).

For task reliability, experiments show perfect query completion 72% of the time, 
which may not be optimal for continuous monitoring of some metrics, but if this is 
constraint is relaxed to allow 3% missing results then queries can be completed 96% 
of the time.  The latency of queries is only several seconds, with 97% completing in 
under 2s.  However, this only considers the random tree and two stage overlays.  It 
would have been interesting to see a similar evaluation with the medium-term overlay 
shell DAG to determine if there is a large change in latency.


Moara: Flexible and Scalable Group-Based Querying System

Moara provides a flexible, efficient, and scalable method to query groups of machines 
by preferring high availability and scalability over consistency.  Moara uses 
aggregation trees to inject queries to a minimal number of nodes and reduce 
message cost and query response time.  The query processor can support composite 
queries that target multiple groups.  A query contains an attribute to be aggregated, 
an aggregation function to be used on the data, and a group predicate that specifies 
the machines on which the aggregation is performed.  An agent running on each 
machine populates (attribute, value) pairs, such as (CPU-MHz, 3000). 

To spread queries and aggregate answers back to the source, a DHT tree is used.  
Moara attempts to prune this tree before spreading the query to nodes that satisfy the 
group predicate.  Additional optimizations are also made, such as short-circuiting the 
trees to reduce the number of internal nodes not satisfying the predicate and rewriting 
predicates targeting multiple groups.  Moara guarantees eventual completeness, that 
is, when the nodes satisfying the predicate and the DHT do not change after some 
period after the query, the group eventually returns answers from all the nodes.  To 
bypass internal nodes that forward responses up or down a tree but to not satisfy the 
predicate, a separate query plane is introduced.  This optimizes the tree by 
maintaining two sets at each node: a list of nodes that it forwards to its parent 
(updateSet) and a list of children where it forwards received queries (called the qset).

However, this design only allows for one-shot queries rather than automatic and 
continuous monitoring like the medium-term overlay in MON.  Continuous monitoring 
seems more applicable to some of the use cases mentioned in section 2, especially 
data centers that may have thousand of nodes that are continuously monitored for 
failures.  Also, latency is not as low as MON, median response time for queries is 1-
2 seconds, and 90% complete in less than 5 seconds.

From:	anjalis2006 SpamElide on behalf of Anjali Sridhar [sridhar3 SpamElide]
Sent:	Tuesday, April 05, 2011 12:16 PM
To:	Gupta, Indranil
Subject:	525 review 4/5


MON: On-demand overlays for distributed system management, J. Liang et al, SIGOPS OSR 2007.


MON attempts to provide an on-demand overlay network for query distribution and software pushes over distributed applications running over hundreds of hosts. It aims to minimize memory, computation and bandwidth requirements for these functions. MON provides a system where system properties like RAM, CPU utilization etc can be queried by the user. Monitoring the running of an application on large scale distributed applications has been described as being the motivation. MON users attempt to query the status of the system as if the status of each node is a row in the database.  MON is divided into three components – the Management Commands, the Gossip system and the On-demand overlay construction. The management commands provide metrics like avg, top k , grep and run. The gossip system enables each node to maintain a partial list of the other nodes in the network. Each gossip message is associated with an age counter in order to remove failed nodes and update the partial view. There are two types of overlay networks that are created – trees and DAG’s. The trees are constructed either randomly or by a two stage locality aware method.  The trees are generally used for queries and the DAG’s for software pushes.  The paper also shows the point at which a persistent overlay network might be more efficient than an on demand overlay network. Reliable on demand overlay networks are further discussed along the lines of either session reliability or task reliability. Session reliability talks about the number of nodes that are disconnected from the on demand overlay network at that time. Task reliability is about the number of nodes missing from the aggregate results. The advantage here is that the user can specify the threshold for acceptable session and task reliability. 

MON is able to execute queries within seconds by constructing the best overlay network for each query. It is able to give the user some control regarding the tradeoffs between query reliability and completion of the query by deciding the threshold for missing nodes. 

The status change of hosts in the middle of a query is not addressed clearly. For example, the closer the node is to the root; greater is the chance for disconnecting a larger part of the network. The paper talks about query results with respect to number of nodes that are currently connected in the overlay. The proposed suggestions at the end of the paper are some of the improvements that came up during the paper reading. One of the aspects about the paper is the lack of comparison graphs with existing tree aggregation methods. 

 
Moara: Flexible and Scalable Group-Based Querying System, S. Ko, Middleware 2008


This paper introduces a system to query large distributed infrastructures by building aggregation trees for different groups and maintaining them. It focuses on achieving flexibility in querying the groups, efficiency in the message overhead and scalability of the number of machines. It provides high availability and scalability of the system and eventual consistency on aggregation results. Information is stored at each node in the form of tuples (attribute, value). Each query consists of three parts – query attribute, aggregation function and the group that is being queried. The group field specifies the nodes that will be taking part in the query based on a condition. Moara uses a structured overlay routing for constructing DHT trees. In order to reduce bandwidth cost, Moara employs certain techniques that remove nodes that do not satisfy the group predicate. Each node maintains its state of being pruned or not based on if it satisfies the group predicate. Using certain binary variables, Moara proposes a state machine which allows the queries to obtain values only from nodes that satisfy the group predicate.  An optimization to the tree construction is having a separate query path for some child nodes which have to traverse many internal nodes that do not contribute to the query.  Moara also talks about how to deal with composite predicates by building multiple trees.  The query is routed to the tree that incurs the least bandwidth cost. 

Moara does propose optimized tree construction and routing for each query. The experimental results comparing it with centralized aggregators show a definite reduction in latency.  By employing existing routing algorithms, Moara is able to concentrate more on the optimization of the tree that is constructed based on each query requirement. 

The paper does not talk about if the user is able to see the number of nodes involved in the query. Does the query fail if some of the nodes that were initially in the tree do not respond? Since Chord or similar DHT routing algorithms are used, each node needs to know about every node in the system. The routing carried out may need to be location based when constructing the tree. Some of these aspects might add to the latency of query requests. 

 
From:	kevin larson [kevinlarson1 SpamElide]
Sent:	Tuesday, April 05, 2011 12:07 PM
To:	Gupta, Indranil
Subject:	525 review 04/05

    Liang et. al present their Management Overlay Networks (MON)
approach, which uses gossip-style on-demand overlays to issue commands
such as queries and software pushes.  On demand overlays are build per
command, and in systems with infrequent queries, MON is very
efficient.  Where as persistent overlays require maintenance and
membership handling, on demand overlays require no persistent state.
Even including the setup times on clusters with thousands of nodes, on
demand overlays complete their queries within a few seconds.  MON is
built around gossip based memberships, with each node maintaining a
fixed size “partial view” of close by nodes.  At a periodic interval,
a node picks a random target and exchanges random membership entries.
Age associations help remove failed nodes.  To execute commands, two
types of structures are used, DAGs and trees.  Trees are useful for
static queries, and DAGs are useful for software pushes.
    The idea of constructing on-demand overlays is very interesting.
However, the ability to ignore most of the management overheads
doesn’t come without a cost.  It would be interesting to see how
frequent queries need to become in order for non-on-demand overlays to
have comparable overhead with MON.
    Ko et al present Moara, a querying system which builds aggregation
trees for different groups and adaptively maintains these trees in
order to optimize the total message cost.  The authors demonstrated
the usage patterns for Planetlab and shows the inefficiencies of the
current monitoring system.  Moira stores data in tuples, and uses
three part queries with query-attributes, aggregation functions, and
group-predicates.  The attribute is the item of interest, and the
aggregation function is the operation, such as enumeration, max, min,
etc.  The group predicate is the set of machines for the aggregation
to be applied over, defaulting to the entire set or machines.  Moira
uses a DHT to maintain the nodes in the system, and depending on the
churn, uses varying levels of group management to prune the branches.
Moara utilizes interesting optimizations for composite queries,
selecting a small “cover”, which is a set of groups which contain all
nodes that satisfy the composite predicate.  Evaluation of bandwidth
shows that Moara performs very well in comparison to global DHT trees
and an aggressively maintained tree as well, basically performing as
well or better in all cases.  Latency shows that although Moara is
slower in achieving the first responses, the relevant latency is to
the slowest responses, where Moara also outperforms a centralized
approach.


.
From:	Anupam Das [anupam009 SpamElide]
Sent:	Tuesday, April 05, 2011 11:53 AM
To:	Gupta, Indranil
Subject:	525 review 04/05

i. MON: ON-demand overlays for distributed system management

This paper talks about developing a management level overlay which
allows application level queries to be executed very quickly in a
large scale distributed system.  MON (Management Overlay Network) uses
a gossip based management protocol as it is lightweight and it imposes
less overhead on network bandwidth. So, every node only maintains a
partial view of its available neighbors. MON uses this membership list
to create the actual overlay network. MON proposes two ways to create
this overlay network namely- random trees or DAGs. The paper then
analyzes the upper bound of query rate at which MON becomes more
feasible than persistent overlay. By default MON builds one on-demand
overlay per query, but the authors also describe a way to reuse the
on-demand overlays to perform multiple queries. However, MON wants to
achieve this without spending any explicit maintenance bandwidth. For
this purpose MON introduces two types of reliability (session and task
reliability) .

Pros:
1.	MON provides a faster way to perform instant querying/monitoring in
large-scale distributed systems.
2.	MON imposes very low overhead on network bandwidth.
3.	Overlays can be reused provided they meet some application level constraints.

Cons:
1.	Comparison between on-demand and persistent overlay is missing. It
would have been nice to see exactly how much faster MON really
performs over persistent overlay.
2.	MON supports only a fixed list of commands. Could it support more
complex set of commands/queries? If so, would that impose any
restriction on the durability of the overlay?

-----------------------------------------------------------------------------------------------------------------------------------------------

ii. Moara: Flexible and Scalable Group-Based Querying System


In large distributed systems users and administrators often desire to
monitor the overall performance of a particular application or gather
statistics about a particular performance metric. To do so one could
query the whole system but that would lead to high bandwidth cost and
high latency. The authors of this paper therefore, present a group
based querying system called   Moara where information is aggregated
in a group fashion (here groups contain only those nodes that are
relevant to the specified queries). Moara maintains aggregation trees
(DHTs) for different groups. It supports a query language of the form-
(query-attribute, aggregation function, group-predicate). Here query
attribute is the feature (e.g. cpu utilization) that we are trying to
gather, aggregate function refers to the type of aggregation we are
aiming for (e.g. max) and this must be partially aggregatable, and
finally group-predicate defines the groups of machines (e.g. machines
running MYSQL) on which this aggregation is performed.  Moara also
supports composite queries that target multiple groups using unions
and intersections. Moara uses a few optimizations techniques to reduce
bandwidth usage and response time. These optimization techniques
include adaptively pruning branches of the DHT tree, bypassing
intermediate nodes that do not satisfy a given query and querying only
those nodes that are able to answer quickly without affecting the
result of the query. The authors finally performed experiments on
FreePastry simulator, Emulab and Planetlab to show the effectiveness
of their proposed system.


Pros:
1.	Reduces bandwidth overhead and latency by directing queries to the
desired target group.
2.	Allows a wide range of queries. Queries can be combined to create
more complex composite queries.
3.	It is scalable as it sends queries to only the target group (i.e.,
independent of the total number of nodes in the system and grows
linearly only with group size).
4.	Dynamic maintenance allows Moara to perform well even in the
presence of high group churn.

Cons:
1.	Maintaining a DHT for each query could be computationally expensive
as a single machine could belong to multiple query-groups. Could DAGs
or random trees be used (as used in MON paper)? If so it could reduce
the maintenance cost of tree.
2.	Moara sacrifices consistency for availability, so some of the
machines might be left out of the group. This could impact the
aggregated output.
3.	The paper does not discuss much about the state maintenance overhead.

-----------------Anupam