White Pages overview

This page gives an overview of the Cougaar white pages (WP) naming service.

The choice of naming service is controlled by the -Dorg.cougaar.society.xsl.param.template=VALUE
system property, which SimpleAgent.xsl interprets as follows (pseudocode):

if template is embedded, single_node, or single_debug
   wpserver=single_node
else
   wpserver=full

This “wpserver=” value is handled by NodeAgent.xsl as follows (pseudocode):

if wpserver is single_node
   load the LoopbackWhitePages component
else
   load distributed-wp impl components
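
The two decisions above can be sketched in plain Java. This is only an illustration of the branching, not Cougaar code: the class and method names are hypothetical, and the returned strings are labels rather than real component class names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical mirror of the two XSL decisions; not part of Cougaar. */
final class WpTemplateSketch {
    // Templates that select the local-only white pages, per the pseudocode.
    private static final Set<String> LOCAL_TEMPLATES =
        new HashSet<>(Arrays.asList("embedded", "single_node", "single_debug"));

    /** SimpleAgent.xsl step: map the template parameter to a wpserver value. */
    static String wpserverFor(String template) {
        return LOCAL_TEMPLATES.contains(template) ? "single_node" : "full";
    }

    /** NodeAgent.xsl step: name the implementation that would be loaded. */
    static String implementationFor(String wpserver) {
        return "single_node".equals(wpserver)
            ? "LoopbackWhitePages"
            : "distributed-wp";
    }
}
```

For example, a template of single_debug maps to wpserver=single_node and therefore to the loopback implementation; any other value falls through to the distributed one.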

If you want to add a new WP implementation, it’s easiest to add it to the XSL files above.

The minimal naming service is the local-only “loopback” implementation: LoopbackWhitePages.java

All the work is in the “submit” method. Here’s pseudocode:

private Response submit(Request req) {
    // handle request
    Object ret;
    if (req is Bind) {
        add req.ae to table
        ret = ttl if added, else error code
    } else if (req is Unbind) {
        remove req.ae from table
        ret = true if existed, else false
    } else if (req is List) {
        find entries w/ req.suffix in table
        ret = set of string names
    } else if (req is Get or GetAll) {
        ret = look up req's name in table
    } else if (req is Flush) {
        // N/A for loopback; used to flush cache
        ret = null
    }

    Response res = req.createResponse();

    // Set the result now. If this required I/O we'd set the
    // result in the async thread; we never block!
    res.setResult(ret);

    return res;
}
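
As a rough, runnable illustration of the table operations in that pseudocode, here is a minimal in-memory sketch. It does not use the real Request, Response, or AddressEntry classes. The class name, the (name, type, URI) shape of an entry, the TTL value, the -1 error code, and suffix matching via endsWith are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory table in the spirit of the loopback white pages. */
final class LoopbackTableSketch {
    static final long TTL_MILLIS = 60_000L; // assumed lease length
    static final long ERROR = -1L;          // assumed error code

    // agent name -> (entry type -> URI string)
    private final Map<String, Map<String, String>> table = new HashMap<>();

    /** Bind: TTL if added (or already identical), ERROR if the slot is taken. */
    synchronized long bind(String name, String type, String uri) {
        Map<String, String> byType =
            table.computeIfAbsent(name, k -> new HashMap<>());
        String old = byType.putIfAbsent(type, uri);
        return (old == null || old.equals(uri)) ? TTL_MILLIS : ERROR;
    }

    /** Unbind: true if exactly this entry existed and was removed. */
    synchronized boolean unbind(String name, String type, String uri) {
        Map<String, String> byType = table.get(name);
        if (byType == null || !byType.remove(type, uri)) {
            return false;
        }
        if (byType.isEmpty()) {
            table.remove(name);
        }
        return true;
    }

    /** List: sorted set of names that end with the given suffix. */
    synchronized Set<String> list(String suffix) {
        Set<String> names = new TreeSet<>();
        for (String n : table.keySet()) {
            if (n.endsWith(suffix)) {
                names.add(n);
            }
        }
        return names;
    }

    /** Get: the URI bound for (name, type), or null if none. */
    synchronized String get(String name, String type) {
        Map<String, String> byType = table.get(name);
        return byType == null ? null : byType.get(type);
    }

    /** GetAll: a copy of every (type -> URI) entry for the name. */
    synchronized Map<String, String> getAll(String name) {
        Map<String, String> byType = table.get(name);
        return byType == null
            ? new HashMap<String, String>()
            : new HashMap<String, String>(byType);
    }
}
```

Because every operation only touches local memory, each call can return its result immediately, which is why the loopback case never needs an async thread.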

The “ae” objects are AddressEntry instances; see AddressEntry.java.
The “ae.cert” field was only used by the UL security team, so that field could be deprecated.

Requests can carry a CACHE_ONLY flag, which is especially useful for Gets; see Request.java.
A Bind with CACHE_ONLY=True is called a “hint”. It adds a local-only (cache) entry and can be used to bootstrap the WP.
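
A small sketch of how a cache-only Get and a hint might behave. This is not the Request.java API: the class, the boolean cacheOnly parameter, and the remoteLookup function are hypothetical, and the remote call is made inline for brevity, whereas the real WP never blocks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of CACHE_ONLY semantics; not Cougaar code. */
final class CacheOnlySketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> remoteLookup; // stand-in for a WP server
    int remoteCalls = 0; // counts how often the stand-in server was consulted

    CacheOnlySketch(Function<String, String> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    /** A "hint": a cache-only Bind that only touches the local cache. */
    synchronized void hint(String name, String uri) {
        cache.put(name, uri);
    }

    /** Get: with cacheOnly=true, a miss returns null instead of going remote. */
    synchronized String get(String name, boolean cacheOnly) {
        String hit = cache.get(name);
        if (hit != null || cacheOnly) {
            return hit;
        }
        remoteCalls++;
        String uri = remoteLookup.apply(name);
        if (uri != null) {
            cache.put(name, uri);
        }
        return uri;
    }
}
```

A caller that cannot afford to wait can issue the cache-only Get first and fall back to a normal Get only when it returns null.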

To create a new naming service implementation (e.g. JNDI or JLDAP), you’ll need to handle the above “submit” cases: Bind, Unbind, List, Get, GetAll, and Flush.
You might need worker threads/Schedulables to ensure that the WP API is non-blocking.
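
One way to keep a blocking back end (such as a JNDI or LDAP lookup) off the caller's thread is sketched below. It uses plain java.util.concurrent as a stand-in for Cougaar's own threads/Schedulables. The class name and the blockingLookup function are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical wrapper that runs a blocking lookup on worker threads. */
final class AsyncLookupSketch implements AutoCloseable {
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    private final Function<String, String> blockingLookup;

    AsyncLookupSketch(Function<String, String> blockingLookup) {
        this.blockingLookup = blockingLookup;
    }

    /** Returns immediately; the future completes on a worker thread. */
    CompletableFuture<String> get(String name) {
        return CompletableFuture.supplyAsync(
            () -> blockingLookup.apply(name), workers);
    }

    /** Stops accepting new work; already submitted lookups still finish. */
    @Override
    public void close() {
        workers.shutdown();
    }
}
```

The caller would attach a callback with thenAccept rather than waiting on the future, which matches the “we never block” rule in the submit pseudocode.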

Here are some notes from the old Cougaar Arch Document on lessons learned, especially regarding JNDI:

4.1.2 White Pages

The white pages is a distributed table which maps agent names to network
addresses. The primary function of the white pages is to support the Cougaar
message transport and other network-aware components.

For example, a white pages lookup of ‘AgentX’ may return a set of network
entries such as the agent’s RMI message address (rmi://test.com:1234/xyz) and
the agent’s servlet port (http://test.com:8800). This is similar to DNS
name-to-address resolution.

The predecessor of the white pages is the Cougaar Naming Service, which has
existed in different forms over the lifetime of the Cougaar project. The
current white pages implementation is the third full redesign. Many important
lessons were learned over the years, so it’s worthwhile to trace the history
of the naming service:

The initial naming service implementation was developed in 1996. The naming
server ran on a single JVM and communicated over RMI. The implementation was
deeply tangled with the message transport, which was its only client. There
was no caching, change notification, security, or persistence. If the JVM
hosting the naming service was killed, the rest of the agents could no longer
perform their necessary (name -> address) lookups. Despite these limitations,
the initial naming service worked fine in societies with fewer than 30 or so
agents.

The second naming service implementation, developed in 2000, was primarily
focused on separating the naming service from the message transport and
supporting new naming service clients. The front end was changed to use JNDI,
and the back end supported either RMI or LDAP. The RMI implementation added
change notification callbacks to help reduce client-side polling. JNDI allowed
clients to bind arbitrary objects and perform full JNDI attribute-based
queries. Several client-side ad-hoc caching schemes were implemented but never
generalized to benefit all naming service clients. The naming service was still
a single point of failure, even with LDAP replication, and there was no JNDI
federation or persistence support in the RMI back end. This naming service
barely supported societies of 200 agents.

Several scalability lessons were learned from the second naming service
implementation. All of these lessons are self-evident if you envision an agent
society the size of the entire Internet, with millions of individual agents.
The lessons we learned include:

  • It’s very tempting for developers to treat the naming service as a global
    database, which results in severe scalability issues.
  • Simple requests like ‘list all agents’ are a bad idea when there are
    hundreds of entries.
  • A single-point name server is not robust or scalable. Replication alone
    will not fix this problem, due to the quantity of data and cache
    synchronization overhead.
  • Caching and leasing should be built into the design, since simple
    client-side polling can easily overwhelm the naming server.
  • RMI and JNDI are blocking APIs that suffer from socket resource limitations
    and poor I/O handling. An asynchronous design that supported better
    message delivery control would have been beneficial.
  • The real issue is ‘Where does the data live?’ A highly distributed agent
    system must use a highly distributed naming service.
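
The caching-and-leasing lesson can be illustrated with a small client-side cache whose entries expire after a server-granted TTL. This is a sketch under assumptions, not the Cougaar cache: the class name is hypothetical, and the current time is passed in explicitly so the expiry logic is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical client-side cache whose entries expire after a lease. */
final class LeaseCacheSketch {
    private static final class Leased {
        final String uri;
        final long expiresAt;

        Leased(String uri, long expiresAt) {
            this.uri = uri;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Leased> cache = new HashMap<>();

    /** Record a resolved address together with the TTL granted for it. */
    synchronized void put(String name, String uri, long ttlMillis, long now) {
        cache.put(name, new Leased(uri, now + ttlMillis));
    }

    /** Cached address, or null if absent or expired (caller re-resolves). */
    synchronized String get(String name, long now) {
        Leased l = cache.get(name);
        if (l == null) {
            return null;
        }
        if (now >= l.expiresAt) {
            cache.remove(name); // lease ran out; drop the stale entry
            return null;
        }
        return l.uri;
    }
}
```

With a lease, a client contacts the name server at most once per TTL per name, instead of polling on every lookup.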

The latest naming service implementation included several high-level goals:

  • Must be scalable, to support thousands of agents, running on hundreds of
    hosts on a wide area network.
  • Must be robust, with no single point of failure, multiple servers to
    survive overloads, and persistence to support restarts.
  • Must be efficient, utilizing an integrated caching and garbage collection
    scheme.
  • Must be cleanly integrated into Cougaar, leveraging the Cougaar message
    transport for message protocols, quality of service, and security.

Reflection on how the naming service was used prompted us to split the naming
service into two separate services. A phone book analogy was adopted, where
the two services are: A ‘white pages,’ which maps names to network addresses;
and a ‘yellow pages,’ which supports more complex attribute-based searches.

The white pages service has been modeled after DNS. Agent names now support
Internet host name semantics with the ‘.’ separator character. A hierarchical
name space will support better cache control and help distribute the data to
multiple naming server agents.
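
To make the DNS-style hierarchy concrete, the sketch below enumerates the suffixes of a dotted agent name, most specific first. A resolver could walk such a list to pick the name server responsible for a name. The class name is hypothetical, and the use of ‘.’ alone as the root suffix is an assumption borrowed from DNS, not something the text above specifies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for hierarchical, dot-separated agent names. */
final class NameHierarchySketch {
    /** "AgentX.test.com" yields ".test.com", ".com", then the root ".". */
    static List<String> suffixes(String name) {
        List<String> out = new ArrayList<>();
        int dot = name.indexOf('.');
        while (dot >= 0) {
            out.add(name.substring(dot));
            dot = name.indexOf('.', dot + 1);
        }
        // Always finish with the root, unless a trailing '.' already added it.
        if (out.isEmpty() || !".".equals(out.get(out.size() - 1))) {
            out.add(".");
        }
        return out;
    }
}
```

A name without any ‘.’ has only the root suffix, so it would be handled by the root name servers in this scheme.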

In Cougaar 10.0 the initial WhitePagesService API was defined. An entry in the
white pages contains an agent’s network address (URI) and an optional
certificate to validate the entry. The white pages support an asynchronous
callback API with methods to:

  resolve(name) -> set of entries for that name
  list(suffix) -> set of names
  bind(entry)
  rebind(entry)
  unbind(entry)
    

The Cougaar 10.0 implementation was built on top of the prior JNDI-based naming
service.

Cougaar 10.2 will feature a white pages implementation that replaces the JNDI
naming service. The white pages will use the message transport to send and
receive messages, which will allow it to use alternate message transport
protocols and quality of service guarantees. The white pages will be
bootstrapped with pre-resolved message transport addresses for the ‘root’ name
servers. The Cougaar 10.2 implementation will support caching and a resolver
hierarchy.

Cougaar 10.4 and later will support peer-based white pages replication and
state reconciliation for restarted name servers. The white pages will
effectively become an agent-based application that runs within Cougaar, whose
job is to support the message transport and other Cougaar network-aware
components.