Software engineering methods can increase the dependability of software systems, and yet some faults escape even the most rigorous and methodical development process. Therefore, to guarantee high levels of reliability in the presence of faults, software systems must be designed to reduce the impact of the failures caused by such faults, for example by deploying techniques to detect and compensate for erroneous runtime conditions. In this chapter, we focus on software techniques to handle software faults, and we survey several such techniques developed in the area of fault tolerance and more recently in the area of autonomic computing. Since practically all techniques exploit some form of redundancy, we consider the impact of redundancy on the software architecture, and we propose a taxonomy centered on the nature and use of redundancy in software systems. The primary utility of this taxonomy is to classify and compare techniques to handle software faults.
We address the problem that existing publish/subscribe messaging systems, including such commonly used ones as Apache’s ActiveMQ and IBM’s WebSphere MQ, exhibit degraded end-to-end throughput performance in a wide-area network setting. We contend that the cause of this problem is the lack of an appropriate routing protocol. Building on the idea of a content-based network, we introduce a protocol called B-DRP that can demonstrably improve the situation. A content-based network is a content-based publish/subscribe system architected as a datagram network: a message is forwarded hop-by-hop and delivered to any and all hosts that have expressed interest in the message content. This fits well with the character of a wide-area messaging system. B-DRP is based on two main techniques: a message delivery mechanism that utilizes and exploits unicast forwarding state, which can be easily maintained using standard protocols, and a probabilistic data structure to efficiently represent and evaluate receiver interests. We present the design of B-DRP and the results of an experimental evaluation that demonstrates its support for improved throughput in a wide-area setting.
Selecting a random peer with uniform probability across a peer-to-peer (P2P) network is a fundamental function for unstructured search, data replication, and monitoring algorithms. Such uniform sampling is supported by several techniques. However, current techniques suffer from sample bias and limited applicability. In this paper, we present a sampling algorithm that achieves a desired uniformity while making essentially no assumptions about the underlying P2P network. This algorithm, called doubly stochastic converge (DSC), iteratively adjusts the probabilities of crossing each link in the network during a random walk, such that the resulting transition matrix is doubly stochastic. DSC is fully decentralized and is designed to work on both directed and undirected topologies, making it suitable for virtually any P2P network. Our simulations show that DSC converges quickly on a wide variety of topologies, and that the random walks needed for sampling are short for most topologies. In simulation studies with FreePastry, we show that DSC is resilient to high levels of churn, while incurring a minimal sample bias.
Modern societies are pervaded by computerized, heterogeneous devices designed for specific purposes, but also more and more often capable of interacting with other devices for entirely different purposes. For example, a cell phone could be used to purchase a train ticket on-line that could later be printed by a vending machine at the train station. This type of open environment is what we call a society of digital systems. In this paper, we outline the characteristics of societies of digital systems, and argue that they call for a new approach to cope with unforeseen interactions, possible incompatibilities, failures, and emergent behaviors. We argue that designers can not assume a closed or homogeneous world, and must instead naturally accommodate dynamic adaptations. Furthermore, self-adaptability, that is, the ability to adapt autonomically to a changing environment, also poses problems, as different adaptation strategies may interfere negatively, leading to unstable behaviors. As an initial concrete contribution to solve this problem, we propose a method to support the graceful integration of devices and software systems in an open environment. The method uses management information, and is specifically centered on the idea of expressing self-adaptation operations as change sets over the management information base.
Uniformly sampling nodes from deployed peer-to-peer (P2P) networks has proven to be a difficult problem, as current techniques suffer from sample bias and limited applicability. A sampling service which randomly samples nodes from a uniform distribution across all members of a network offers a platform on which it is easy to construct unstructured search, data replication, and monitoring algorithms. We present an algorithm which allows for uniform random sampling, by the use of biased random walks, over the peers of any P2P network. Our algorithm, called doubly stochastic converge (DSC), iteratively adjusts the probabilities of crossing each link in the network during a random walk, such that the resulting transition matrix is doubly stochastic. DSC is fully decentralized and is designed to work on physically directed networks, allowing it to work on any P2P topology. Our simulations show that DSC converges quickly on a wide variety of topologies, and that the random walks needed for sampling are short for most topologies. In simulation studies with FreePastry, we show that DSC is resilient to extremely high levels of churn, while incurring a minimal sample bias.
We develop the notion of automatic workaround in the context of Web applications. A workaround is a sequence of operations, applied to a failing component, that is equivalent to the failing sequence in terms of its intended effect, but that does not result in a failure. We argue that workarounds exist in modular systems because components often offer redundant interfaces and implementations, which in turn admit several equivalent sequences of operations. In this paper, we focus on Web applications because these are good and relevant examples of component-based (or service-oriented) applications. Web applications also have attractive technical properties that make them particularly amenable to the deployment of automatic workarounds. We propose an architecture where a self-healing proxy applies automatic workarounds to a Web application server. We also propose a method to generate equivalent sequences and to represent and select them at run-time as automatic workarounds. We validate the proposed architecture in four case studies in which we deploy automatic workarounds to handle four known failures in to the popular Flickr and Google Maps Web applications.
Test adequacy criteria provide the engineer with guidance on how to populate test suites. While adequacy criteria have long been a focus of research, existing testing methods do not address many of the fundamental characteristics of distributed systems, such as distribution topology, communication failure, and timing. Furthermore, they do not provide the engineer with a means to evaluate the relative effectiveness of different criteria, nor the relative effectiveness of adequate test suites satisfying a given criterion. This paper makes three contributions to the development and use of test adequacy criteria for distributed systems: (1) a testing method based on discrete-event simulations; (2) a fault-based analysis technique for evaluating test suites and adequacy criteria; and (3) a series of case studies that validate the method and technique. The testing method uses a discrete-event simulation as an operational specification of a system, in which the behavioral effects of distribution are explicitly represented. Adequacy criteria and test cases are then defined in terms of this simulation-based specification. The fault-based analysis involves mutation of the simulation-based specification to provide a foil against which test suites and the criteria that formed them can be evaluated. Three distributed systems were used to validate the method and technique, including DNS, the Domain Name System.
Experimentation is an essential tool employed by the developers of software systems, especially distributed systems. In prior work we developed a model-driven framework for automating various experimentation tasks, such as workload generation, and demonstrated that it gives the engineer a cost-effective means to conduct large-scale experiments on distributed testbeds. We have enhanced the methods underlying the framework in four significant ways: (1) increasing the expressiveness of workloads by allowing for conditional and reactive behaviors; (2) supporting the repeatability of experiments through the creation of environment workloads that can control the operational context; (3) enabling the composability of application and environment workloads to obtain a broader class of experiments; and (4) extending the scope of experiment management to include control over multiple runs. We use the enhancements to conduct a series of interesting new experiments. Specifically, the enhancements allow us to manipulate a fixed-wired testbed so that it simulates a mobile-wireless environment, and to selectively and maliciously inject faults into a system.
We propose to use automatic workarounds to achieve self-healing in software systems. We observe that software systems of significant complexity, especially those made of components, are often redundant, in the sense that the same functionality and the same state-transition can be obtained through multiple sequences of operations. This redundancy is the basis to construct effective workarounds for component failures. In particular, we assume that failures can be detected and intercepted together with a trace of the operations that lead to the failure. Given the failing sequence, the system autonomically executes one or more alternative sequences that are known to have an equivalent behavior. We argue that such workarounds can be derived with reasonable effort from many forms of specifications, that they can be effectively prioritized either statically or dynamically, and that they can be deployed at run time in a completely automated way, and therefore that they amount to a valid self-healing mechanism. We develop this notion of self-healing by detailing a method to represent, derive, and deploy workarounds. We validate our method in two case studies.
Network processors provide an economical programmable platform to handle the high throughput and frame rates of modern and next-generation communication systems. However, these platforms have exchanged general-purpose capabilities for performance.
This paper presents an alternative; a software network processor (Soft-NP) framework using commodity generalpurpose platforms capable of high-rate and throughput sequential frame processing compatible with high-level languages and general-purpose operating systems. A cacheoptimized concurrent lock free queue provides the necessary low-overhead core-to-core communication for sustained sequential frame processing beyond the realized 1.41 million frames per second (Gigabit Ethernet) while permitting perframe processing time expansion with pipeline parallelism.
In the mid-nineties, mobile code was on the rise and, in particular, there was a growing interest in autonomously moving code components, called mobile agents. In 1997, we published a paper that introduced the concept of mobile code paradigms, which are design patterns that involve code mobility. The paradigms highlighted the locations of code, resources, and execution as first-class abstractions. This characterization proved useful to frame mobile code designs and technologies, and also as a basis for a quantitative analysis of applications built with them. Ten years later, things have changed considerably. In this paper we present our view of how mobile code evolved and discuss which paradigms succeeded or failed in supporting effectively distributed applications.
Until now, structured and unstructured networks have been considered in absentia of each other. We believe that next-generation P2P services will require both structured and unstructured algorithms, and that it therefore makes sense to consider a unified substrate that provides good service for both. In this paper we argue for the creation of a semi-structured overlay substrate, called Spinneret, which can serve as the base layer for a variety of structured and unstructured search algorithms. In order to validate that this structure forms a good foundation for various services, we present two algorithms simulated on top of the Spinneret substrate: an unstructured k-walker random walk search as well as a logarithmic DHT search. Further, we argue that such a substrate strikes a balance between the resilience and reliability of unstructured networks and the efficiency of structured networks.
The high performance, low cost, and flexibility of commodity hardware systems make them appealing for network processing applications. However, the standard software architecture of such systems imposes significant limitations. At high rates (e.g., gigabit Ethernet) and small frame sizes (64 byte), each frame must be processed in less than 672 ns. System calls, synchronization, and memory latencies can dominate this processing time. Despite significant effort to remove this overhead, we are aware of no general-purpose mechanism that can handle this load on commodity hardware.
This paper describes the frame-shared-memory architecture (FShm), a general-purpose software architecture for processing network frames on commodity multiprocessor hardware. FShm supports kernel- and user-space processing at gigabit Ethernet rates by increasing throughput without reducing the per-frame processing time, pipelining work across multiple processors. FShm can generate, capture, and forward frames at the theoretical maximum rate on a gigabit Ethernet network for all frame sizes greater than 96 bytes, and at 95% of maximum for the 64 byte minimum frame size (the limit of the tested hardware).
A content-based publish/subscribe system is a message-oriented communication facility based on the idea of interest-driven routing. A message, published by the sender without a set destination, is delivered to any and all the receivers that expressed an interest in its content. We refer to this communication style and to the distributed infrastructure that realizes it as content-based communication and content-based networking, respectively. In this paper we review what we consider the foundations of content-based networking, including some of the major advances of the past years. We then present a vision for further research in this area as well as for the practical realization of a content-based network. In particular, we discuss the implications of content-based communication for the network, the middleware, and applications.
An ideal sensor network would minimize communication by routing information only to those nodes requiring the information. We are exploring the use of a content-based network for this purpose, where messages containing sensor readings and associated metadata are relayed from source nodes to destination nodes based solely on the fact that the destination nodes have expressed interest in specific message content. This paper contributes a concrete protocol, called DV/DRP, that implements content-based networking for wireless sensor networks or other similarly constrained network configurations. DV/DRP augments a basic distance vector protocol to construct both primary and alternate routes. DV/DRP also features a new content-based forwarding technique called dynamic receiver partitioning. DV/DRP makes minimal assumptions on the underlying MAC layer. Notably, it uses only a primitive local broadcast, does not require reliability of the link layer nor the use of acknowledgments, and explicitly handles asymmetric links. We present simulations showing that the protocol scales to large networks while minimizing the resource consumption of individual nodes. We also show that the protocol is robust with respect to transient and permanent node failures, as well as asymmetries in wireless links. Finally, to demonstrate that DV/DRP is suitable for memory-constrained sensor networks, we discuss a preliminary implementation.
Content-based networking is a message-oriented communication service in which a message is delivered to all destinations that have declared a selection predicate matching the content of that message. Analogous to that of a traditional address-based network, a routing scheme in a content-based network defines the router-local matching, forwarding, and header functions that collectively realize the delivery function. Several such routing schemes have been proposed in the literature, but they have been evaluated only qualitatively or in simulation. In this paper we abstract from those previous results in an effort to place them in a general theoretical framework. This framework allows us to rigorously define notions of correctness, minimality, and complexity. In particular, we prove the correctness and characterize the complexity of two existing content-based routing schemes, propose a new latency-minimal scheme, and provide the results of a Monte Carlo simulation that serves as the basis for estimating the space requirements of any given scheme.
Developers of distributed systems routinely construct discrete-event simulations to help them understand and evaluate the behavior of inter-component protocols. Typically written using an imperative programming language, these simulations capture basic algorithmic functionality at the same time as they focus attention on properties critical to distribution, including topology, timing, bandwidth, and overall scalability. We ask the following question: Can simulations also be used to help in the testing of distributed-system implementations? Because simulations amount to specifications of intended behavior, the code of a simulation can be viewed as an operational, albeit non-traditional, formal model. We claim that this kind of model, when used within a specification-based testing regime, provides developers with the foundations of a powerful new method for selecting effective test suites. The primary tool used in our method is a fault-based analysis of the simulation code in which a set of mutants are generated using standard code-mutation techniques. The analysis can be used to rate the effectiveness of a test suite, as well as the criterion used to form it. We substantiate our claim through experiments performed on the simulations and implementations of two different distributed systems.
Engineering distributed systems is a challenging activity. This is partly due to the intrinsic complexity of distributed systems, and partly due to the practical obstacles that developers face when evaluating and tuning their design and implementation decisions. This paper addresses the latter aspect, providing techniques for software engineers to automate the experimentation activity. Our approach is founded on a suite of models that characterize the distributed system under experimentation, the testbeds upon which the experiments are to be carried out, and the client behaviors that drive the experiments. The models are used by generative techniques to automate construction of the workloads, as well as construction of the scripts for deploying and executing the experiments on distributed testbeds. The framework is not targeted at a specific system or application model, but rather is a generic, programmable tool. We have validated our approach by performing experiments on a variety of distributed systems. For two of these systems, the experiments were deployed and executed on the PlanetLab wide-area testbed. Our experience shows that this framework can be readily applied to different kinds of distributed system architectures, and that using it for meaningful experimentation, especially in large-scale network environments, is advantageous.
Distributed systems represent an important class of software applications whose testing has received relatively little attention from the research community. As a result, there does not currently exist a general-purpose, disciplined, and effective testing method for distributed systems. While a number of research efforts target this area, there are no broad empirical studies that can be used to ground existing techniques or inform new ones. In this paper we present the results of an empirical study of failures experienced by the users of seven open-source distributed systems. The goal of our study is to understand what patterns exist in failure scenarios to guide the definition of an improved testing method for distributed systems. At a high level, our results indicate that: there is empirical evidence to support the consideration of a new generation of testing methods that address the failures due to distribution; the configurations that cause user-reported failures are reasonably straightforward to construct; and generic failure observations (i.e., those for which reusable techniques can be developed) are strongly correlated to the distributed nature of system failures. The second two results in particular imply that there is a reasonable bound on the effort required to organize the testing activity. Overall, the study gives us some early confidence that it is both necessary and feasible to consider a testing method that targets distributed systems. The results of this study are offered as a basis for future work on the definition of such a method.
Engineering distributed systems is a challenging activity. This is partly due to the intrinsic complexity of distributed systems, and partly due to the practical obstacles that developers face when evaluating and tuning their design and implementation decisions. This paper addresses the latter aspect, providing techniques for software engineers to automate two key elements of the experimentation activity: (1) workload generation and (2) experiment deployment and execution. Our approach is founded on a suite of models that characterize the client behaviors that drive the experiments, the distributed system under experimentation, and the testbeds upon which the experiments are to be carried out. The models are used by simulation-based and generative techniques to automate the construction of the workloads, as well as construction of the scripts for deploying and executing the experiments on distributed testbeds. The framework is not targeted at a specific system or application model, but rather is a generic, programmable tool. We have validated our approach on a variety of distributed systems. Our experience shows that this framework can be readily applied to different kinds of distributed system architectures, and that using it for meaningful experimentation is advantageous.
An ideal sensor network would minimize communication by routing information only to those nodes requiring the information. We are exploring the use of a content-based network for this purpose, where messages containing sensor readings and associated metadata are relayed from source nodes to destination nodes based solely on the fact that the destination nodes have expressed interest in specific message content. Routing uses a distance vector protocol augmented to construct both primary and alternate routes. Forwarding uses content-based matching together with a special process called dynamic receiver partitioning. The protocol, called DV/DRP, is specifically designed for wireless sensor networks or other similarly constrained network configurations. We present simulations showing that the protocol scales to large networks while minimizing the resource consumption of individual nodes. We also show that the protocol is robust with respect to both transient and permanent node and communication failures.
This paper proposes a routing scheme for content-based networking. A content-based network is a communication network that features a new advanced communication model where messages are not given explicit destination addresses, and where the destinations of a message are determined by matching the content of the message against selection predicates declared by nodes. Routing in a content-based network amounts to propagating predicates and the necessary topological information in order to maintain loop-free and possibly minimal forwarding paths for messages. The routing scheme we propose uses a combination of a traditional broadcast protocol and a content-based routing protocol. We present the combined scheme and its requirements over the broadcast protocol. We then detail the content-based routing protocol, highlighting a set of optimization heuristics. We also present the results of our evaluation, showing that this routing scheme is effective and scalable.
This paper presents the design and evaluation of a support service for mobile, wireless clients of a distributed publish/subscribe system. A distributed publish/subscribe system is a networked communication infrastructure where messages are published by senders and then delivered to the receivers whose subscriptions match the messages. Communication therefore does not involve the use of explicit addresses, but rather emerges from the dynamic arrangement of publishers and subscribers. Such a communication mechanism is an ideal platform for a variety of internet applications, including multi-party messaging, personal information management, information sharing, on-line news distribution, service discovery, and electronic auctions. Our goal is to support such applications on mobile, wireless host devices in such a way that the applications can, if they chose, be oblivious to the mobility and intermittent connectivity of their hosts as they move from one publish/subscribe access point to another. In this paper we describe a generic, value-added service that can be used in conjunction with publish/subscribe systems to achieve these goals. We detail the implementation of the service and present the results of our evaluation of the service in both wireline and emulated wireless environments.
This paper presents an algorithm for content-based forwarding, an essential function in content-based networking. Unlike in traditional address-based unicast or multicast networks, where messages are given explicit destination addresses, the movement of messages through a content-based network is driven by predicates applied to the content of the messages. Forwarding in such a network amounts to evaluating the predicates stored in a router's forwarding table in order to decide to which neighbor routers the message should be sent. We are interested in finding a forwarding algorithm that can make this decision as quickly as possible in situations where there are numerous, complex predicates and high volumes of messages. We present such an algorithm and give the results of studies evaluating its performance.
This paper proposes a routing scheme for content-based networking. A content-based network is a communication network that features a new advanced communication model where messages are not given explicit destination addresses, and where the destinations of a message are determined by matching the content of the message against selection predicates declared by nodes. Routing in a content-based network amounts to propagating predicates and the necessary topological information in order to maintain loop-free and possibly minimal forwarding paths for messages. The routing scheme we propose uses a combination of a traditional broadcast protocol and a content-based routing protocol. We present the combined scheme and its requirements over the broadcast protocol. We then detail the content-based routing protocol, highlighting a set of optimization heuristics. We also present the results of our evaluation, showing that this routing scheme is effective and scalable.
We describe Lira, a lightweight infrastructure for managing dynamic recon guration that applies and extends the concepts of network management to component-based, distributed software systems. Lira is designed to perform both component-level recon gurations and scalable application-level recon gurations, the former through agents associated with individual components and the latter through a hierarchy of managers. Agents are programmed on a component-by-component basis to respond to recon guration requests appropriate for that component. Managers embody the logic for monitoring the state of one or more components, and for determining when and how to execute recon guration activities. A simple protocol based on SNMP is used for communication among managers and agents.
Engineering a highly distributed system requires the ability to evaluate and optimize the protocols that control the movement and processing of information throughout the system. Because the design of such protocols is often characterized by conflicting goals and trade-offs, the designer must calibrate the parameters of the protocols, seeking the best balance of performance in the most common usage scenarios. Unfortunately, fully testing these calibrations requires experiments conducted on large, expensive testbeds that are very difficult to deploy and maintain.
In this paper, we propose a new approach for the optimization of a highly distributed system's performance. The approach is based on leveraging data collected from fielded components to fine-tune the behavior of the system and its protocols. Captured data is "replayed" in simulations performed directly in the field during off-peak hours. The results of these simulations are then used to control the system directly in the field, and/or to report aggregate performance and behavior information to the system designer.
This paper presents the design and evaluation of a support service for mobile, wireless clients of a distributed publish/subscribe system. A distributed publish/subscribe system is a networked communication infrastructure where messages are published by senders and then delivered to the receivers whose subscriptions match the messages. Communication therefore does not involve the use of explicit addresses, but rather emerges from the dynamic arrangement of publishers and subscribers. Such a communication mechanism is an ideal platform for a variety of Internet applications, including multi-party messaging, personal information management, information sharing, on-line news distribution, service discovery, and electronic auctions. Our goal is to support such applications on mobile, wireless host devices in such a way that the applications can, if they chose, be oblivious to the mobility and intermittent connectivity of their hosts as they move from one publish/subscribe access point to another. In this paper we describe a generic, value-added service that can be used in conjunction with publish/subscribe systems to achieve these goals. We detail the implementation of the service and present the results of our evaluation of the service in both wireline and emulated wireless environments.
Building a distributed publish/subscribe infrastructure amounts to defining a service model (or interface) and providing an implementation for it. A typical distributed implementation is architected as a network of dispatcher components, each one implementing appropriate protocols and algorithms, that collectively realize the chosen service model. The service model should provide a value-added service for a wide variety of applications, while the implementation should gracefully scale up to handle an intense traffic of publications and subscriptions. We believe that the design of such service models and implementations must be guided by a systematic evaluation method, which in turns must be based on a carefully chosen benchmark suite. In this paper, we lay out a set of requirements for a benchmark suite for distributed publish/subscribe services, and we outline its primary components. The ideas proposed in this paper are based on our own experience in building and studying publish/subscribe infrastructures, and on existing evaluation frameworks.
As wireless technology becomes more available, developers of distributed applications are becoming more interested in how that technology affects the performance of their systems. We have developed a distributed publish/subscribe communication service initially hosted on the standard IP-wired network infrastructure, but would now like to rehost that service onto a GPRS wireless network. This paper reports on our experience in attempting to evaluate the performance of the service using an available emulation environment. Our conclusion from our experience to date is that current tools do not model the wireless network at an appropriate level of abstraction. In particular, they do not allow us to study the integration of individual publish/subscribe service-layer elements with GPRS network-layer elements, nor do they allow us to study multiple GPRS clients interacting over the network. Instead we were limited to results related to the interaction between an individual GPRS client and the GPRS network modeled as a monolith.
Reconfiguration is the process of applying planned changes to the communication, interconnection, componentization, or functionality of a deployed system. It is a powerful tool for achieving a variety of desirable properties of large-scale, distributed systems, including evolvability, adaptability, survivability, and continuous availability. Current approaches to reconfiguration are inadequate: some allow one to describe a system s range of configurations for a relatively broad class of system architectures, but do not provide a mechanism for actually carrying out a reconfiguration; others provide a mechanism for carrying out certain kinds of limited reconfigurations, but assume a specialized system architecture in order to do so. This paper describes our attempt at devising a reconfiguration mechanism for use with the popular and widely available Enterprise JavaBean (EJB) component container model. We describe extensions to the basic services provided by EJB to support the mechanism, a prototype implementation, and a case study of its application to a representative component-based distributed system.
The Willow architecture provides a comprehensive architectural approach to the provision of survivability in critical information networks. It is based on the notion that survivability of a network requires reconfiguration at both the system and the application levels. The Willow notion of reconfiguration is very general, and the architecture provides reconfiguration mechanisms for both automatic and manual network control. In this paper we summarize the Willow concepts and provide an overview of the Willow architecture. Finally we describe a demonstration application system that has been built on top of a prototype Willow implementation.
The Willow architecture is a comprehensive approach to survivability in critical distributed applications. Survivability is achieved in a deployed system using a unique combination of (a) fault avoidance by disabling vulnerable network elements intentionally when a threat is detected or predicted, (b) fault elimination by replacing system software elements when faults are discovered, and (c) fault tolerance by reconfiguring the system if non-maskable damage occurs. The key to the architecture is a powerful reconfiguration mechanism that is combined with a general control structure in which network state is sensed, analyzed, and required changes effected. The architecture can be used to deploy software functionality enhancements as well as survivability. Novel aspects include: node configuration control mechanisms; a workflow system for resolving conflicting configurations; communications based on wide-area event notification; tolerance for wide-area, hierarchic and sequential faults; secure, and secure, scaleable, and delegatable trust models.
Reconfiguration is the process of applying planned changes to the communication, interconnection, componentization, or functionality of a deployed system. It is a powerful tool for achieving a variety of desirable properties of large-scale, distributed systems, including evolvability, adaptability, survivability, and continuous availability. Current approaches to reconfiguration are inadequate: some allow one to describe a system's range of configurations for a relatively broad class of system architectures, but do not provide a mechanism for actually carrying out a reconfiguration; others provide a mechanism for carrying out certain kinds of limited reconfigurations, but assume a specialized system architecture in order to do so. This paper describes our attempt at devising a reconfiguration mechanism for use with the popular and widely available Enterprise JavaBean (EJB) component container model. We describe extensions to the basic services provided by EJB to support the mechanism, a prototype implementation, and a case study of its application to a representative component-based distributed system.
This paper presents a new algorithm for content-based forwarding, an essential function in content-based networking. Unlike in traditional address-based unicast or multicast networks, where messages are given explicit destination addresses, the movement of messages through a content-based network is driven by predicates applied to the content of the messages. Forwarding in such a network amounts to evaluating the predicates stored in a router's forwarding table in order to decide to which neighbor router the message should be sent. We are interested in finding a forwarding algorithm that can make this decision as quickly as possible in situations where there are large numbers of predicates and high volumes of messages. We present such an algorithm and give the results of studies evaluating its performance.
Even though the number and variety of available configuration management systems has grown rapidly in the past few years, the need for new configuration management systems still remains. Driving this need are the emergence of situations requiring highly specialized solutions, the demand for management of artifacts other than traditional source code, and the exploration of wholely new research questions in configuration management. Complicating the picture is the trend toward organizational structures that involve personnel working at physically separate sites. We have developed a testbed to support the rapid development of configuration management systems. The testbed separates configuration management repositories (i.e., the stores for versions of artifacts) from configuration management policies (i.e., the procedures according to which the versions are manipulated) by providing a generic model of a distributed repository and an associated programmatic interface. Specific configuration management policies are programmed as unique extensions to the generic interface, while the underlying distributed repository is reused across different policies. In this paper, we describe the repository model and its interface, and present our experience in using a prototype of the testbed, called NUCM, to implement a variety of configuration management systems.
Publish-subscribe is a communication paradigm that supports dynamic, many-to-many communications in a distributed environment. Content-based pub-sub systems are often implemented on a peer-to-peer infrastructure that enables information dissemination from information producers (publishers) to consumers (subscribers) through a subscription mechanism. In a wide-area pub-sub network, the pub-sub service must handle information dissemination across distinct authoritative domains, heterogeneous platforms and a large, dynamic population of publishers and subscribers. Such an environment raises serious security concerns. In this paper, we investigate the security issues and requirements that arise in an internet-scale content-based pub-sub system. We distinguish among those requirements that can be achieved with current technology and those that require innovative solutions.
We argue that the needs of many classes of modern applications, especially those targeted at mobile or wireless computing, demand the services of content-based publish/subscribe middleware, and that this middleware in turn demands a new kind of communication infrastructure for its proper implementation. We refer to this new communication infrastructure as content-based networking. The service model of this network must directly support the interface of an advanced content-based publish/subscribe middleware service. At the same time, the implementation must be architected as a true distributed network, providing appropriate guarantees of reliability, security, and performance. We do not propose content-based networking as a replacement for IP, nor we advocate an implementation of a publish/subscribe middleware at the network level (i.e., within routers). Instead, we argue that content-based networking must be designed according to established networking principles and techniques. To this end, in this paper, we formulate the foundational concepts of content-based networking, and relate them to the corresponding concepts in traditional networking. We also briefly review our experience with content-based publish/subscribe middleware and suggest some open research problems in the area of content-based networking.
The components of a loosely-coupled system are typically designed to operate by generating and responding to asynchronous events. An event notification service is an application-independent infrastructure that supports the construction of event-based systems, whereby generators of events publish event notifications to the infrastructure and consumers of events subscribe with the infrastructure to receive relevant notifications. The two primary services that should be provided to components by the infrastructure are notification selection (i.e., determining which notifications match which subscriptions) and notification delivery (i.e, routing matching notifications from publishers to subscribers). Numerous event notification services have been developed for local-area networks, generally based on a centralized server to select and deliver event notifications. Therefore, they suffer from an inherent inability to scale to wide-area networks, such as the Internet, where the number and physical distribution of the service's clients can quickly overwhelm a centralized solution. The critical challenge in the setting of a wide-area network is to maximize the expressiveness in the selection mechanism without sacrificing scalability in the delivery mechanism.
This paper presents Siena, an event notification service that we have designed and implemented to exhibit both expressiveness and scalability. We describe the service's interface to applications, the algorithms used by networks of servers to select and deliver event notifications, and the strategies used to optimize performance. We also present results of simulation studies that examine the scalability and performance of the service.
This paper describes the design of Siena, an Internet-scale event notification middleware service for distributed event-based applications deployed over wide-area networks. Siena is responsible for selecting the notifications that are of interest to clients (as expressed in client subscriptions) and then delivering those notifications to the clients via access points. The key design challenge for Siena is maximizing expressiveness in the selection mechanism without sacrificing scalability of the delivery mechanism. This paper focuses on those aspects of the design of Siena that fundamentally impact scalability and expressiveness. In particular, we describe Siena's data model for notifications, the covering relations that formally define the semantics of the data model, the distributed architectures we have studied for Siena's implementation, and the processing strategies we developed to exploit the covering relations for optimizing the routing of notifications.
The designers of communication networks are being challenged by the emergence of a new class of addressing and routing scheme referred to as content-based addressing and routing. This new approach differs from traditional unicast and multicast schemes in that it performs routing based on the data being transported in a message rather than on any specialized addressing and routing information attached to, or otherwise associated with, the message. An example of an application for content-based addressing and routing is an event notification service, which is a general-purpose facility for asynchronously and implicitly conveying information from generators of events to any and all parties expressing interest in those events. In order to implement content-based addressing and routing, we can adopt well-known and successful network architectures and protocols, provided that we understand how to map the core concepts and functionalities of content-based addressing and routing onto this established infrastructure. Toward that end, we have formulated a general, yet powerful model of addressing and routing that allows us to formalize the crucial aspects of content-based addressing and routing in a surprisingly simple manner. Furthermore, it allows us to treat traditional unicast and multicast addressing and routing uniformly as instances of this more general model. This paper presents our model and demonstrates its utility by showing its application to the design of an existing event notification service.
The components of a loosely-coupled system are typically designed to operate by generating and responding to asynchronous events. An event notification service is an application-independent infrastructure that supports the construction of event-based systems. The two primary services that should be provided to components by the infrastructure are notification selection and notification delivery. Numerous event notification services have been developed for local-area networks, generally based on a centralized server to select and deliver event notifications. Therefore, they suffer from an inherent inability to scale to wide-area networks, such as the Internet, where the number and physical distribution of the service's clients can quickly overwhelm a centralized solution. The critical challenge in the setting of a wide-area network is to maximize the expressiveness in the selection mechanism without sacrificing scalability in the delivery mechanism.
This paper presents SIENA, an event notification service that we have designed to exhibit both expressiveness and scalability. We describe the service's interface to applications, the algorithms used by networks of servers to select and deliver event notifications, and the strategies used to optimize performance. We present results of simulation experiments that examine the efficiency of the service. Finally, we describe a prototype implementation of SIENA.
Software applications are no longer stand-alone systems. They are
increasingly the result of the integration of heterogeneous
collections of components, possibly distributed over a computer
network. Di erent components can be provided by di erent producers and
they can be part of di erent systems at the same time. Moreover,
components change and evolve very rapidly, making it di cult to manage
the whole system in a consistent way.
In this scenario, a crucial step of the software life cycle is
deployment|that is, the activities related to the release,
installation, activation, deactivation, update, and removal of
components, as well as whole systems.
This paper presents a characterization of the deployment process
together with a framework for evaluating technologies that are
intended to address the software deployment problem. The framework
highlights four primary factors that characterize the maturity of the
technologies: process coverage; process changeability; interprocess
coordination; and site, product, and deployment policy abstraction. A
variety of existing technologies are surveyed and assessed against the
proposed framework. Finally, we discuss promising research directions
in software deployment.
DVS is a simple versioning system that adopts a check-in/check-out
policy with exclusive locks much like the one implemented by RCS. In
addition to the basic functionalities of RCS, DVS provides extensions
in two main directions: distribution of artifacts and
versioning of collections of artifacts.
DVS has been implemented on top of NUCM 2, a generic distributed
platform aimed at providing a policy-neutral programmable interface
for realizing configuration management systems. NUCM provides support
for storing artifacts and collections of artifacts and their
attributes in a set of distributed servers. DVS implements the locking
policy and all the related higher level services including check-in
and check-out of single artifacts as well as collections, managing
locks, change logs, and recursive check-in and check-out.
This paper describes the main design principles underlying DVS and
NUCM together with the basics issues regarding their
implementation. DVS has been used and is currently being used for
collaborative authoring involving several authors distributed over
five different sites on the Internet. We also discuss this first
experience and the feedback and validation for both DVS and NUCM.
This work is about an infrastructure for supporting event-based applications on a wide-area network.
A wide range of software systems are designed to operate in a reactive manner. In such systems, the high-level control flow is not explicitly programmed, instead it is driven by the occurrence of events. These systems realize their functionality by performing some actions in response to events, possibly using the information associated with the stimulating events. Examples of reactive systems are integrated development environments, work-flow and process analysis systems, graphical user interfaces, network management systems, software deployment systems and security monitors.
There are two major motivations for designing applications as reactive systems. First, some applications are reactive by their nature. Typically, the ones that involve the direct interaction of human agents are characterized by an asynchronous input of relatively small pieces of data. In these cases, and in the general case of "on-line" input, the concept of event is a good modeling and design abstraction. Similarly, the same abstraction is useful for those components that, although not necessarily functioning in an asynchronous way, are integrated by means of some communication mechanisms that introduce asynchronicity in their interactions. The other benefit of adopting an event-based style is that it requires only a loose coupling for the integration of heterogeneous components. Components do not need to export interfaces to be accessed by other components. Components can request some services without addressing a specific server component and, to a certain extent, components can interoperate even if they have been designed and developed separately without any mutual knowledge.
A common logical component of every event-based application is what we call an event observation and notification service, or more concisely an event service. The event service observes the occurrence of events or the occurrence of combinations of events and consequently notifies all the applications or components that have declared their interests in reacting to such events. Because the semantics of event-based applications is substantially independent of the mechanisms that are used to capture and dispatch events of interest, it is convenient to separate the event service from all the applications that make use of it. In accordance to many standardization efforts that have been proposed recently (e.g., the CORBA Event Service), and according to strategic plans and research in network technology, we envision a unified event service implemented as a common "middle-ware" to support the event-based interaction among software components.
The idea of integrating software components by means of a common event service seems to be very promising especially for those distributed applications that are deployed on a wide-area network such as the Internet. For one thing, the vast number of available information sources offers a great deal of opportunities for the development of new applications. New classes of wide-scale event-driven applications can be devised including stock market analysis tools, efficient news and mailing systems, data mining tools, and indexing tools. Also, many existing applications that are already designed to exploit event-based infrastructures can be proficiently integrated at a much higher scale thanks to the "global" connectivity provided by the network. For example, work-flow systems can be federated for companies that have multiple distributed branches or even across corporate boundaries, or else software deployment systems can connect software producers and consumers through the Internet. In general, the asynchronicity, the heterogeneity, and the high degree of loose coupling that characterize wide-area networks suggest that a wide-scale event service would be a good integration infrastructure for existing systems and for new applications.
Numerous technologies that realize an event service have been developed and effectively used for quite a long time, examples are Field, SUN ToolTalk, and Yeast. However, most of them are targeted towards single computers or at most local-area networks, and it is very clear that they can not be simply "adapted" to scale to the Internet. In fact, extending the support of an event service to a wide-area network creates new challenges and trade-offs. Not only does the number of objects and events grow tremendously, but also many of the assumptions made for local-area networks, such as, low latency, abundant bandwidth, homogeneous platforms, continuous reliable connectivity, and centralized control, are no longer valid.
Some technologies address issues related to wide-area services. Among them we can find new technologies such as TIB/Rendezvous that specifically provide an event service, but also more mature technologies such as the USENET news infrastructure, IP multicasting, the Domain Name Service (DNS), that, although not explicitly targeted at this problem domain, represent potential or partial solutions to the problem of scalability. The main shortcoming of all of these technologies is that they are specific to some application domain and not flexible enough to be usable as a generic and open infrastructure for integrating event-based applications.
In summary, we see two main challenges in the area of event-based infrastructures, that Siena proposes to address:
The contributions of this work are a formal definition of an event service that combines expressiveness with scalability together with the design and implementation of the architectures and algorithms that realize this event service as a distributed infrastructure. One obvious issue that we must face in this research is the validation and verification of the solutions that we propose. To this end, we used a simulation environment by which we performed systematic simulations of our architectures and algorithms in several network scenarios. Here we present the framework that we built and the modeling abstraction that we adopted to perform this analysis. We also discuss some initial results that clearly differentiate the Siena event service from traditional ones. Further simulations will help clarify the trade-offs and differentiators between the alternative solutions that we propose. In addition to the simulation environment, we implemented a real prototype of Siena, consisting of a server that realizes one of our distributed architectures plus a client-side interface with a mapping to the Java language. This prototype has been used in distributed settings of limited scale to support an event-based software deployment system.
Distributed configuration management is intended to support the activities of projects that span multiple sites. NUCM is a testbed that we are developing to help explore the issues of distributed configuration management. NUCM separates configuration management repositories (i.e., the stores for versions of artifacts) from configuration management policies (i.e., the procedures according to which the versions are manipulated) by providing a generic model of a distributed repository and an associated programmatic interface. Specific configuration management policies are programmed as unique extensions to the generic interface, but the underlying distributed repository is reused across different policies. In this paper, we describe the repository model and its interface, discuss their implementation in NUCM, and present how NUCM has been used to implement several, rather different, configuration management policies.
Event-based distributed systems are programmed to operate in response
to events. An event notification service is an application-independent
infrastructure that supports the construction of event-based systems.
While numerous technologies have been developed for supporting
event-based interactions over local-area networks, these technologies
do not scale well to wide-area networks such as the Internet.
Wide-area networks pose new challenges that have to be attacked with
solutions that specifically address issues of scalability. This paper
presents Siena, a scalable event notification service that is based on
a distributed architecture of event servers. We first present a
formally defined interface that is based on an extension to the
publish/subscribe protocol. We then describe and compare several
different server topologies and routing algorithms. We conclude by
briefly discussing related work, our experience with an initial
implementation of Siena, and a framework for evaluating the
scalability of event notification services such as Siena.
Software applications are no longer stand-alone systems. They are
increasingly the result of integrating heterogeneous collections of
components, both executable and data, possibly dispersed over a
computer network. Different components can be provided by different
producers and they can be part of different systems at the same time.
Moreover, components can change rapidly and independently, making it
difficult to manage the whole system in a consistent way. Under these
circumstances, a crucial step of the software life cycle is
deployment---that is, the activities related to the release,
installation, activation, deactivation, update, and removal of
components, as well as whole systems.
This paper presents a framework for characterizing technologies that
are intended to support software deployment. The framework highlights
four primary factors concerning the technologies: process coverage;
process changeability; interprocess coordination; and site, product,
and deployment policy abstraction. A variety of existing technologies
are surveyed and assessed against the framework. Finally, we discuss
promising research directions in software deployment.
Traditionally, configuration management has only addressed the needs of the software development process. Once a software system leaves the development environment and enters the field, however, there still is a significant role for configuration management. Activities such as release, installation, activation, update, adaptation, deactivation, and de-release constitute the deployment lifecycle; these activities need careful coordination and planning in their own right. This article discusses the dimensions of software deployment, argues why current solutions are not sufficient, and presents two research systems that specifically address software deployment.
A common architectural style for distributed, loosely-coupled, heterogeneous software systems is a structure based on event generation, observation and notification. A notable characteristic of an architecture based on events is that interaction among architectural components occurs asynchronously, thereby simplifying the composition of autonomous, independently-executing components that may be written in different programming languages and executing on varied hardware platforms.
There is increasing interest in deploying these kinds of distributed systems across wide-area networks such as the Internet. For instance, workflow systems for multi-national corporations, multi-site/multi-organization software development, and real-time investment analysis across world financial markets are just a few of the many applications that lend themselves to deployment on an Internet scale. However, deployment of such systems at the scale of the Internet imposes new challenges that are not met by existing technology.
We have been studying the problem of designing an Internet-scale event observation and notification facility that can serve as a platform for building wide-area distributed systems according to an event-based architectural style. In this paper we briefly outline our achievements to date.
Large scale distributed systems are becoming of paramount importance, due to the evolution of technology and to the interest of market. Their development, however, is not yet supported by a sound technological and methodological background, as the results developed for small size distributed systems often do not scale up. Recently, mobile code languages (MCLs) have been proposed as a technological answer to the problem. In this work, we abstract away from the details of these languages by deriving design paradigms exploiting code mobility that are independent of any particular technology. We present such design paradigms, together with a discussion of their features, their application domain, and some hints about the selection of the correct paradigm for a given distributed application.
One of the requirements for an object-oriented database to support advanced applications is a communication mechanism. The Inter-Client Communication Mechanism (ICCM) is a set of data structures and functions developed for the O2 database, which provides this kind of service. Communication is achieved through shared persistent objects, implementing the basic idea of mailbox. One to one connections are established between different processes accessing the database. Methods and data structure defined in the ICCM support connection set-up, disconnection, and all the basic data transfer facilities. In this paper, we describe the concepts of the ICCM and an overview of its implementation.