A Case Study on Formal Verification of Self-Adaptive Behaviors in a Decentralized System

Overview

Ensuring the required qualities of software systems that have to operate in dynamic environments poses severe engineering challenges. Self-adaptation is a promising approach to manage the com- plexity of modern software systems. A self-adaptive system is able to adapt itself autonomously to internal dynamics and dynamics in the environment to achieve particular quality goals. Our particular interest is in decentralized self-adaptive systems, in which central control of adaptation is not an option. One important challenge in self-adaptive systems, in particular those with decentralized control of adaptation, is to provide guarantees about the intended runtime qualities. In this paper, we present a case study in which we use model checking to verify behavioral properties of a decen- tralized self-adaptive system. Concretely, we contribute with a formalized architecture model of a decentralized traffic monitoring system and prove a number of self-adaptation properties for flexibility and robustness. To model the main processes in the system we use timed automata, and for the specification of the required properties we use timed computation tree logic. We use the Uppaal tool to specify the system and verify the flexibility and robustness properties.

This page provides all the material related to the case study.

Case Study

The traffic monitoring system consists of a set of intelligent cameras, which are distributed evenly along the road. An example of a highway is shown in Fig. 1. Each camera has a limited viewing range and cameras are placed to get an optimal coverage of the highway with a minimum overlap. The task of the cameras is to detect and monitor traffic jams on the highway to assist for example traffic light con- trollers or driver assistance systems. This task has to be performed in a decentralized way, avoiding the bottleneck of a centralized control center. To that end, cameras collaborate in organizations: if a traffic jam spans the viewing range of multiple cameras, they form an organization that provides information to clients that have an interest in traffic jams. Fig. 1 shows two scenarios that require adaption. The first scenario concerns the dynamic adaptation of an organization from T0 to T1, where camera 2 joins the organization of cameras 3 and 4 after it monitors a traffic jam. The second scenario concerns robustness to a silent node failure, i.e., a failure in which a failing camera becomes unresponsive without sending any incorrect data. This scenario is shown from T2 to T3, where camera 2 fails. Since there are dependencies between the software running on different cameras (we explain this in detail below), such failures may bring the system in an inconsistent state and disrupt its services. Therefore, the system should be able to restore its services after a failure, although in degraded mode since the traffic state is no longer monitored in the viewing range of the failed camera.

Figure 1. Self-healing scenario

Dynamic Agent Organizations for Flexibility

Figure 2a shows the primary components of the software deployed on each camera, i.e. the local camera system. The local traffic monitoring system provides the functionality to detect traffic jams and inform clients. The local traffic monitoring system is conceived as an agent-based system consisting of two com- ponents. The agent is responsible for monitoring the traffic and collaborating with other agents to report a possible traffic jam to clients. The organization middleware offers life cycle management services to set up and maintain organizations. We employ dynamic organizations of agents to support flexibility in the system, that is, agent organizations dynamically adapt themselves with changing traffic conditions. To access the hardware and communication facilities on the camera, the local traffic monitoring system can rely on the services provided by the distributed communication and host infrastructure.

In normal traffic conditions, each agent belongs to a single member organization. However, when a traffic jam is detected that spans the viewing range of multiple neighboring cameras, organizations on these cameras will merge into one organization. To simplify the management of organizations and interactions with clients, the organizations have a master/slave structure. The master is responsible for managing the dynamics of that organization (merging and splitting) by synchronizing with its slaves and neighboring organizations. Therefore, the master uses the context information provided by its slaves about their local monitored traffic conditions. At T0, the example in Fig. 1 shows four single member organizations, org1 with agent1, org2 with agent2, and similar for org5, and org6. Furthermore, there is one merged organization, org34 with agent3 as master and agent4 as slave. At T1, the traffic jam spans the viewing range of cameras 2, 3 and 4. As a result, organizations org2 and org34 have merged to form org24 with agent2 as master. When the traffic jam resolves, the organization is split dynamically.

(a) Primary software components deployed on each camera	(b) Design model of camera and environment (communication channels are omitted)
Figure 2. Local Camera System

Self-Healing Subsystem for Robustness

To recover from camera failures, a self-healing subsystem is added to the local traffic monitoring system, as shown in Fig. 2a. The self-healing subsystem maintains a model of the current dependencies of the components of the local traffic monitoring system with other active cameras. Dependencies include master/slave relationships and neighbor relationships. It also contains repair actions for different types of failure scenarios. Examples of repair actions are: halt the communication with the failed camera, elect a new master, and exchange the context with another camera. To detect failures, the self-healing subsystem coordinates with self-healing subsystems on other cameras with a dependency in the dependency model using a ping-echo mechanism.

Camera Dependencies

Each working camera is in one of three distinct roles: master of a single member organization, master of an organization with slaves, or slave in an organization. As these roles come with certain responsibilities, each camera is dependent on a particular set of remote cameras in order to function properly:

A master of a single agent organization is dependent on its neighboring nodes.
A master with slaves is dependent on its slaves and its neighboring nodes.
A slave of an organization is dependent on its master and its neighboring nodes.

The dependencies enable cameras to dynamically adapt organizations when the traffic conditions change in the environment and recover when one of the cameras fails.

Data

Case Study Material

Uppaal Model (zip)
Self-Adaptation Framework (zip)
The Uppaal tool can be downloaded here.