[cadynce] How Many Configurations Is Enough?
Gerard M Boudreau
Gerard_M_Boudreau at raytheon.com
Wed May 23 16:17:09 CDT 2007
Joe,
comments embedded.
Gerry
"Cross, Joseph"
<Joseph.Cross at dar
pa.mil> To
"Gerard M Boudreau"
05/23/2007 03:42 <Gerard_M_Boudreau at raytheon.com>,
PM "Raj Rajkumar" <raj at ece.cmu.edu>
cc
<cadynce at list.isis.vanderbilt.edu>,
<cadynce-bounces at list.isis.vanderbi
lt.edu>, "Raj Rajkumar"
<raj at ece.cmu.edu>
Subject
RE: [cadynce] How Many
Configurations Is Enough?
Gerry -
This is great information. Thanks for posting it.
> There was a recent review of the Resource Manager Ensemble Software
> Design Description for release 4 that covered how resources are
> allocated upon service deployment.
This looks like true dynamic resource management. I.e., it will allocate
SCIs to nodes in configurations that have never been tested. I was under
the impression that the cert folks would not tolerate such a thing. Has
this changed?
This is what's in the design now, with gov't reps at the reviews. This
doesn't mean that SW Cert will go along with it. I will make more
inquiries.
> Step 1: Get the list of potential hardware on which to deploy
> the software...
> Step 2: Filter out the nodes that do not have enough resources
> (RAM, CPU utilization and bandwidth) to support the software.
Here "enough" means "enough resources remaining after already allocated
higher priority processes and dataflows have taken their slices," right?
Right.
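A minimal sketch of the Step 2 filter, assuming each node reports what is left over after higher-priority processes and dataflows have taken their share. The names Node, Demand and filter_nodes, and the units, are hypothetical and are not taken from the design description.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_ram_mb: int     # RAM left after higher-priority allocations (assumed unit)
    free_cpu_pct: float  # CPU headroom left, in percent (assumed unit)
    free_bw_mbps: float  # network bandwidth left (assumed unit)


@dataclass
class Demand:
    ram_mb: int
    cpu_pct: float
    bw_mbps: float


def filter_nodes(nodes, demand):
    """Keep only nodes whose remaining resources cover the demand."""
    return [
        n for n in nodes
        if n.free_ram_mb >= demand.ram_mb
        and n.free_cpu_pct >= demand.cpu_pct
        and n.free_bw_mbps >= demand.bw_mbps
    ]


nodes = [Node("a", 512, 40.0, 100.0), Node("b", 128, 80.0, 100.0)]
survivors = filter_nodes(nodes, Demand(256, 20.0, 10.0))
```

Here node "b" is dropped because its remaining RAM is too small, even though it has more CPU headroom than "a".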
> Step 3: Determine where to deploy the software. This is based on:
> Co-location dependency - must be deployed on same
> node as other software
> Replication scheme - Master/slave - deploy master
> replicas on same nodes as dependent software masters
No comprendo. Please elucidate. E.g., what's a dependent software master?
In the case where 2 critical pieces of software each have a master and
slave deployment scheme and are dependent on each other (e.g. software
that calculates info for a particular missile and software that interfaces
to the missile launcher), the plan is to put both masters on the same node
and both slaves on a different node.
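As an illustration of that master/slave co-location rule, here is a small sketch. The function place_dependent_pair and the application and node names are hypothetical, and choosing the two nodes is assumed to happen elsewhere.

```python
def place_dependent_pair(app_a, app_b, master_node, slave_node):
    """Put both masters on one node and both slaves on another node."""
    if master_node == slave_node:
        raise ValueError("masters and slaves must use different nodes")
    return {
        (app_a, "master"): master_node,
        (app_b, "master"): master_node,
        (app_a, "slave"): slave_node,
        (app_b, "slave"): slave_node,
    }


plan = place_dependent_pair("missile_calc", "launcher_if", "node1", "node2")
```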
> Survivability with respect to fire zones...
What's a fire zone? How does it relate to the data centers, EMEs, etc.
that we've heard about?
Ships are segmented into parts that can be segregated from one another
physically, mechanically, electrically, etc. to prevent damage from
proliferating through a whole ship. Physically, there are hatches and
doors that can be closed to prevent flooding and fires from spreading
throughout a ship. These segments are firezones. The data centers are in
separate firezones. EMEs are Electro-Mechanical Enclosures. They house
equipment.
> ... if a replica already
> exists, attempt to deploy a new replica in a
> different firezone.
> Survivability with respect to nodes - if a replica exists
> on one node, attempt to deploy a new replica on a different node.
So if there's no other choice, you will deploy a replica on the same
node as its master?
Unless there is a constraint to the contrary, it sounds like this could be
done.
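One way to read the two survivability preferences is as an ordering over candidate nodes: a different firezone first, then a different node, and only then the node that already hosts a replica. The sketch below assumes that reading. The function rank_for_replica and the (node, firezone) tuples are hypothetical, and the same-node fallback is included only because it "sounds like this could be done".

```python
def rank_for_replica(candidates, existing):
    """Order candidates for a new replica.

    candidates, existing: lists of (node, firezone) tuples.
    """
    used_zones = {zone for _, zone in existing}
    used_nodes = {node for node, _ in existing}

    def tier(candidate):
        node, zone = candidate
        if zone not in used_zones:
            return 0  # best: a firezone with no replica yet
        if node not in used_nodes:
            return 1  # next: same firezone, different node
        return 2      # last resort: a node that already hosts a replica

    # sorted() is stable, so ties keep their incoming order
    return sorted(candidates, key=tier)


existing = [("n1", "fz1")]
candidates = [("n1", "fz1"), ("n2", "fz1"), ("n3", "fz2")]
ranked = rank_for_replica(candidates, existing)
```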
> The resource manager will attempt to evenly distribute
> replicas across different fire zones.
> Bin Packing - if the above does not result in a list of zero
> candidate nodes, then use bin packing to select the node. The
> resource manager will attempt to deploy the software in the node
> with the most RAM first, then CPU utilization then network
> bandwidth that meets the software needs. The attempt is to deploy
> 1 copy of all software needed for an operational string before
> deploying additional replicas.
>
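The node selection in the bin packing item can be sketched as follows. This assumes "most RAM first, then CPU utilization then network bandwidth" means picking the candidate with the most free RAM and breaking ties on free CPU, then on free bandwidth. That reading, the function pick_node and the dictionary keys are assumptions, not something the design description confirms.

```python
def pick_node(candidates):
    """Pick the candidate with the most free RAM, then CPU, then bandwidth.

    candidates: list of dicts that have already passed the resource filter.
    """
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda n: (n["free_ram_mb"], n["free_cpu_pct"], n["free_bw_mbps"]),
    )


candidates = [
    {"name": "a", "free_ram_mb": 512, "free_cpu_pct": 30.0, "free_bw_mbps": 100.0},
    {"name": "b", "free_ram_mb": 512, "free_cpu_pct": 50.0, "free_bw_mbps": 10.0},
    {"name": "c", "free_ram_mb": 256, "free_cpu_pct": 90.0, "free_bw_mbps": 1000.0},
]
chosen = pick_node(candidates)
```

Nodes "a" and "b" tie on RAM, so the CPU headroom decides between them; "c" never wins despite its bandwidth because RAM is compared first.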
> Step 4: If there is not enough room on any node to deploy the
> software, then the resource manager will examine the existing
> software on candidate nodes for an operational string(s) that can
> be preempted. This is based on criticality, resource utilization,
> effect of previous preemptions and replicas.
Very interesting. Do you distinguish between strings that would have to
be killed until more hardware is available and those that could be
immediately restarted?
All strings will have some criticality factor. The first to go would be
those strings that are not critical to the protection and survival of the
ship - for example, training. The plan is to preempt only as many of the
lowest-criticality, smallest executing strings as are needed to get the
more critical strings started.
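A rough sketch of that preemption policy, under simplifying assumptions: a single resource (RAM), a numeric criticality where a higher number means more important, and no accounting for the effect of previous preemptions or replicas, which Step 4 also lists. The function choose_preemptions and the string names are hypothetical.

```python
def choose_preemptions(running, needed_ram_mb, incoming_criticality):
    """Pick the least critical, smallest strings until enough RAM is freed.

    running: list of (name, criticality, ram_mb) tuples.
    Returns the names to preempt, or None if not enough can be freed.
    """
    # Only strings less critical than the incoming one may be preempted.
    eligible = sorted(
        (s for s in running if s[1] < incoming_criticality),
        key=lambda s: (s[1], s[2]),  # lowest criticality first, then smallest
    )
    victims, freed = [], 0
    for name, _, ram_mb in eligible:
        if freed >= needed_ram_mb:
            break  # stop as soon as enough has been freed
        victims.append(name)
        freed += ram_mb
    return victims if freed >= needed_ram_mb else None


running = [
    ("training", 1, 300),
    ("logging", 1, 100),
    ("nav", 5, 400),
    ("air_defense", 9, 800),
]
victims = choose_preemptions(running, 350, 8)
```

In this example the two criticality-1 strings free enough RAM between them, so "nav" keeps running.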
Thanks again for posting this, and thanks in advance for your time and
effort in educating us.
- Joe