Metrics Service

Metrics Service Overview

The Cougaar Metrics Service records and integrates metrics data captured by local sensors, such as CPU load and message traffic sensors, and advertises these real-time metrics to local components and remote user interfaces. For example servlet view snapshots, see the Operation Using Servlets page. The remainder of this manual covers the design, configuration, and developer use of the Metrics Service.

The Metrics Service constructs and maintains a data model that integrates raw sensor values from sources throughout any given Node’s containment hierarchy, reconciling conflicting sensor views as needed. The resulting metrics are accessible to any client Component in that Node. The data-model perspective is always local to the given Node, though the information may also include data about, and in some cases from, remote Nodes and Agents.

The Metrics Service data model is updated in real time and on demand. As a result, a given data model becomes a more comprehensive view of the society as a whole not only as available data increases but also as demand increases. The service itself does not take any measurements. Its function is only to collect and integrate raw values into a coherent and reliable data model, and to provide integrated best-guess metrics to clients that need them.

Note that the Metrics Service’s role is limited to capturing the current state of the society and its underlying resources. The Metrics Service gives the best guess at what the society is doing now from the local Node’s perspective. The Metrics Service does not support other kinds of statistics gathering, such as events/traps, time history, resource control, or policy management. Other Cougaar services handle these kinds of access mechanisms, even though the underlying statistics may be similar.

A Node’s Metrics Service data model can be accessed in several ways, each tailored to a different type of consumer. When reading, Metrics are named by looking up a formula in the data model. Because the lookup can follow relative relationships between resource contexts in the model, the lookup syntax is called a Path.

  • Components can read metrics using direct calls to the MetricsService interface. The interface supports both querying and subscribing to specific metrics.
  • Operators and debuggers can access tables of metrics using Servlets.
  • Management/Monitoring Agents can subscribe to metrics which are not collected locally, but which are collected on remote nodes. The Gossip Service will automatically transport these metrics to the local node.
  • The Adaptivity Engine has a standard interface and naming convention for accessing metrics in rule-books.
  • ACME and CSMART will be able to access metrics via a Servlet that returns query results in XML or Cougaar Properties format (not available in 10.4.0).
  • The QuO framework for generating QoS-adaptive Aspects and Components has access to metrics through QuO SysConds.
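
The difference between querying and subscribing, the first access route above, can be sketched in a few lines. This is a self-contained illustration and not the Cougaar API: the InMemoryModel class, its publish method, and the path string are hypothetical stand-ins, and only the method names getValue, subscribeToValue, and unsubscribeToValue are borrowed from this manual. The real interface signatures may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoubleConsumer;

// Hypothetical stand-in for a Node-local metrics model; not the Cougaar classes.
final class MetricsReadSketch {

    /** Minimal in-memory model mapping a path string to its latest value. */
    static final class InMemoryModel {
        private final Map<String, Double> values = new HashMap<>();
        private final Map<String, List<DoubleConsumer>> listeners = new HashMap<>();

        /** Query: return the current value once, or NaN if nothing is known. */
        double getValue(String path) {
            return values.getOrDefault(path, Double.NaN);
        }

        /** Subscribe: the callback fires on every later change to the path. */
        DoubleConsumer subscribeToValue(String path, DoubleConsumer callback) {
            listeners.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
            return callback; // doubles as the handle needed to unsubscribe
        }

        void unsubscribeToValue(String path, DoubleConsumer handle) {
            List<DoubleConsumer> registered = listeners.get(path);
            if (registered != null) {
                registered.remove(handle);
            }
        }

        /** Stands in for a sensor writing a new value into the model. */
        void publish(String path, double value) {
            values.put(path, value);
            List<DoubleConsumer> current =
                    listeners.getOrDefault(path, Collections.emptyList());
            // Iterate over a copy so callbacks may unsubscribe safely.
            for (DoubleConsumer callback : new ArrayList<>(current)) {
                callback.accept(value);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryModel model = new InMemoryModel();
        String path = "Host:LoadAverage"; // hypothetical path, for illustration only

        List<Double> seen = new ArrayList<>();
        DoubleConsumer handle = model.subscribeToValue(path, v -> seen.add(v));

        model.publish(path, 0.5);
        model.publish(path, 0.75);
        model.unsubscribeToValue(path, handle);
        model.publish(path, 0.9); // no longer delivered to the subscriber

        System.out.println("query  -> " + model.getValue(path));
        System.out.println("pushed -> " + seen);
    }
}
```

A query suits a client that needs a value once, such as a servlet rendering a table. A subscription suits a client that must react to every change without polling.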

The Metrics Service has several ways for sensors to write metric values into the Node’s local data model. When writing, Metrics are named using a hierarchical naming convention called Keys.

  • Components can write metrics using direct calls to the MetricsUpdateService.
  • Network/System Management Systems can import external information about the network and host configurations. Special DataFeeds can be created to import information in different formats. For example, default configuration information can be imported from a web page or file using a PropertiesDataFeed; see Configuring The Metrics Service.
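
The write path can be sketched in the same spirit. The example below parses default key-value pairs from Java properties text and pushes each numeric entry into a stand-in updater, which is roughly the job this manual assigns to a PropertiesDataFeed. The Updater interface, the loadDefaults helper, and the key names are all hypothetical. The real key syntax and the actual feed configuration are covered in Configuring The Metrics Service.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Illustrative only: a stand-in for key-based metric writes, not the Cougaar API.
final class MetricsWriteSketch {

    /** Hypothetical write-side interface: a key names the slot being updated. */
    interface Updater {
        void updateValue(String key, double value);
    }

    /** Parses properties text and forwards each numeric entry to the updater. */
    static int loadDefaults(String propertiesText, Updater updater) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int accepted = 0;
        for (String key : props.stringPropertyNames()) {
            try {
                double value = Double.parseDouble(props.getProperty(key).trim());
                updater.updateValue(key, value);
                accepted++;
            } catch (NumberFormatException e) {
                // Skip non-numeric entries rather than failing the whole feed.
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Hypothetical hierarchical keys; the real key syntax may differ.
        String defaults = String.join("\n",
                "Host.alpha.CPU.Count = 4",
                "Host.alpha.Memory.MB = 2048",
                "Host.alpha.Label = not-a-number");

        Map<String, Double> model = new TreeMap<>();
        int loaded = loadDefaults(defaults, (key, value) -> model.put(key, value));
        System.out.println(loaded + " defaults loaded: " + model);
    }
}
```

Skipping malformed entries instead of aborting is a design choice made for this sketch; a real feed might log or reject them.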

Data reaches the Metrics Service through several distinct flows: sensor code, network management configuration files, and even host-level probes.

Engineering Assessment

Null Metrics Service

For engineering assessment, the Metrics Service can be configured to add or remove functionality. The idea is to remove as much functionality as possible to obtain a baseline. The functionality is then added back piece by piece, and the system is re-evaluated at each step to determine the overhead.

The first level of testing is to remove the Metrics Service sensors and clients. To remove the clients of the service, do not load the metrics rules into the society XML. Past measurements show that the Metrics Service clients add less than 10% load on a Node.

Another level of baselining is to remove the Metrics Service implementation itself. The Metrics Service is really a wrapper around the QuO Resource Status Service (RSS). If the RSS class is not found, the wrapper acts as a black hole. The MetricsService will accept subscriptions and queries, but the wrapper will always return a NO-VALUE Metric. Likewise, the MetricsUpdateService will accept key-value pairs, but the wrapper will just drop them.
The empty implementation of MetricsService returns a 0-credibility metric with the provenance set to “undefined” (the constant UndefinedMetric) for all getValue calls and does nothing for all subscribeToValue and unsubscribeToValue calls. The empty implementation of MetricsUpdateService does nothing for the updateValue call.
Of course, none of the Metrics clients will work in this mode, so they should not be configured in the society XML.
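
The black-hole behavior described above is an instance of the null-object pattern, and a minimal sketch may make it concrete. The types below are simplified stand-ins written for this illustration, not the Cougaar sources: the Metric fields, the MetricListener interface, the placeholder value 0.0, and the null subscription handle are all assumptions. Only the method names, the 0 credibility, and the “undefined” provenance come from the text.

```java
// Illustrative stand-ins only; the real Cougaar interfaces may differ.
final class NullMetricsSketch {

    /** Minimal metric: a value, a credibility score, and a provenance tag. */
    static final class Metric {
        final double value;
        final double credibility;
        final String provenance;

        Metric(double value, double credibility, String provenance) {
            this.value = value;
            this.credibility = credibility;
            this.provenance = provenance;
        }
    }

    /** Shared constant returned for every query (value 0.0 is an assumed placeholder). */
    static final Metric UNDEFINED_METRIC = new Metric(0.0, 0.0, "undefined");

    /** Hypothetical callback type for subscriptions. */
    interface MetricListener {
        void valueChanged(Metric metric);
    }

    interface MetricsService {
        Metric getValue(String path);
        Object subscribeToValue(String path, MetricListener listener);
        void unsubscribeToValue(Object handle);
    }

    interface MetricsUpdateService {
        void updateValue(String key, Metric value);
    }

    /** Accepts every call, never reports real data. */
    static final class NullMetricsService implements MetricsService {
        @Override
        public Metric getValue(String path) {
            return UNDEFINED_METRIC;
        }

        @Override
        public Object subscribeToValue(String path, MetricListener listener) {
            return null; // nothing is registered, so the listener never fires
        }

        @Override
        public void unsubscribeToValue(Object handle) {
            // nothing to undo
        }
    }

    /** Accepts every update and silently drops it. */
    static final class NullMetricsUpdateService implements MetricsUpdateService {
        @Override
        public void updateValue(String key, Metric value) {
            // intentionally empty
        }
    }

    public static void main(String[] args) {
        MetricsService metrics = new NullMetricsService();
        MetricsUpdateService updater = new NullMetricsUpdateService();

        updater.updateValue("some.key", new Metric(42.0, 0.9, "sensor")); // dropped
        Metric result = metrics.getValue("some:path");
        System.out.println(result.credibility + " " + result.provenance);
    }
}
```

Because callers always receive a well-formed Metric, client code needs no special null checks to run against the baseline configuration. It only needs to treat a 0-credibility result as "no information".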

In 11.0 you can disable the QuO RSS implementation, getting empty implementations of the MetricsService and the MetricsUpdateService, if qos.jar isn’t accessible. Simply remove quoSumo.jar from the class path ($CIP/sys/quoSumo.jar).
In subsequent releases you get empty implementations of the MetricsService and the MetricsUpdateService by loading the component org.cougaar.core.qos.metrics.MetricsServiceProvider. Full Component loading isn’t functional in 11.0, so a workaround is provided using a -D flag:

-Dorg.cougaar.society.xsl.param.metrics=trivial
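
A -D flag on the java command line becomes a JVM system property, so a small program can confirm that the flag was actually passed. The sketch below collects every system property under the org.cougaar.society.xsl.param. prefix. Reading the remainder of the name as a parameter name is an inference from the flag's spelling, not something this manual states, and the class and helper names are hypothetical.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Diagnostic sketch; not part of Cougaar.
final class XslParamFlagSketch {

    static final String PREFIX = "org.cougaar.society.xsl.param.";

    /** Returns the properties under PREFIX, keyed by the name that follows it. */
    static Map<String, String> paramsFrom(Properties props) {
        Map<String, String> params = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                params.put(name.substring(PREFIX.length()), props.getProperty(name));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // Launch with the -D flag above to see it listed here.
        System.out.println(paramsFrom(System.getProperties()));
    }
}
```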