<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Dear Gerry:<br>
<br>
I would also like to thank you for the detailed description of PoR
needs.<br>
<br>
Dear Joe and Gautam:<br>
<br>
Another way to view the replication (non-co-location) and co-location
needs is as constraints on the bin-packing problem. A stylized
formulation may be necessary to understand optimal packing under these
constraints and how well efficient heuristics perform against it. They
should do well, I think, but further study is needed.<br>
<br>
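A minimal sketch of this constrained view, assuming a single resource
per node and disjoint co-location sets. The function name pack and all
item and node names are hypothetical and are not taken from the
Resource Manager design; first-fit decreasing is used only as a
familiar example of an efficient heuristic:<br>
<pre wrap="">
```python
def pack(demands, capacities, colocate=(), separate=()):
    """First-fit decreasing with co-location and non-co-location constraints.

    demands:    dict item name -> resource demand (one resource, e.g. RAM)
    capacities: dict node name -> capacity
    colocate:   sets of items that must share a node (assumed disjoint)
    separate:   sets of items that must all land on different nodes
    Returns dict item -> node, or None if the heuristic finds no placement.
    """
    # Merge each co-location set into one super-item; the rest are singletons.
    groups = [frozenset(g) for g in colocate]
    grouped = set().union(*groups) if groups else set()
    groups += [frozenset([name]) for name in demands if name not in grouped]

    # For each item, the items it may not share a node with.
    conflicts = {name: set() for name in demands}
    for group in separate:
        for name in group:
            conflicts[name] |= set(group) - {name}

    free = dict(capacities)
    hosted = {node: set() for node in capacities}
    placement = {}

    def total(group):
        return sum(demands[item] for item in group)

    # Largest groups first: the "decreasing" part of first-fit decreasing.
    for group in sorted(groups, key=total, reverse=True):
        need = total(group)
        banned = set().union(*(conflicts[item] for item in group))
        if not banned.isdisjoint(group):
            return None  # contradictory constraints inside one group
        for node in free:
            if free[node] >= need and hosted[node].isdisjoint(banned):
                free[node] -= need
                hosted[node] |= group
                for item in group:
                    placement[item] = node
                break
        else:
            return None  # no node satisfies capacity and constraints
    return placement


# Hypothetical example: a master with a co-located dependency, plus a
# replica that must not share the master's node.
placement = pack(
    demands={"master": 4, "dep": 2, "replica": 4, "logger": 3},
    capacities={"n1": 8, "n2": 8},
    colocate=[{"master", "dep"}],
    separate=[{"master", "replica"}],
)
```
</pre>
A None result only means this heuristic failed; it does not prove that
no feasible packing exists.<br>
<br>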
The criticality property of application strings tends to turn the
bin-packing problem into a knapsack-like value-maximization problem.
Some results can be produced here as well, using polynomial-time
heuristics that nevertheless get close to the optimum (and thereby
maximize the number of the most critical application strings that are
run, and/or maximize the number of replicas that are deployed).<br>
<br>
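As an illustration of the knapsack-like view, here is a sketch of the
classic greedy heuristic that ranks strings by criticality per unit of
resource. It assumes one pooled capacity rather than per-node bins, and
the names admit, strings and the sample string names are hypothetical:<br>
<pre wrap="">
```python
def admit(strings, capacity):
    """Greedy knapsack heuristic over application strings.

    strings:  list of (name, criticality, demand) tuples, demand > 0
    capacity: total pooled resource available (a simplifying assumption)
    Returns (chosen names, total criticality admitted).
    """
    # Ignore strings that could never run, even alone.
    fits = [s for s in strings if capacity >= s[2]]

    # Rank by criticality per unit of resource, highest first.
    by_density = sorted(fits, key=lambda s: s[1] / s[2], reverse=True)

    chosen, value, remaining = [], 0, capacity
    for name, criticality, demand in by_density:
        if remaining >= demand:
            chosen.append(name)
            value += criticality
            remaining -= demand

    # Standard safeguard: if the single most critical string is worth more
    # than the greedy set, run it alone. With this check the result is at
    # least half of the optimal total criticality.
    if fits:
        best = max(fits, key=lambda s: s[1])
        if best[1] > value:
            return [best[0]], best[1]
    return chosen, value


# Hypothetical strings: (name, criticality, demand)
strings = [
    ("tracking", 10, 5),
    ("display", 6, 2),
    ("logging", 1, 1),
    ("fire_control", 30, 9),
]
chosen, value = admit(strings, capacity=10)
```
</pre>
The sort dominates the cost, so the heuristic runs in O(n log n) time
for n strings.<br>
<br>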
Finally, while some theoretical studies can and perhaps should be
undertaken, heuristics that generate feasible allocations (using
sufficient conditions) can be crafted for quick use by project + PoR.<br>
<br>
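One example of such a sufficient condition, for the simplified case of
a single resource and identical nodes with no placement constraints: if
the total demand is at most node_count * (capacity - largest demand),
then first-fit cannot fail, because a failure would require every node
to be loaded beyond capacity minus the largest demand. The function
names below are hypothetical, and the test is sufficient but not
necessary:<br>
<pre wrap="">
```python
def surely_fits(demands, node_count, capacity):
    """Sufficient (not necessary) feasibility test for identical nodes."""
    if not demands:
        return True
    largest = max(demands)
    return node_count * (capacity - largest) >= sum(demands)


def first_fit(demands, node_count, capacity):
    """Place each demand on the first node with room; None if one fails."""
    free = [capacity] * node_count
    assignment = []
    for demand in demands:
        for index in range(node_count):
            if free[index] >= demand:
                free[index] -= demand
                assignment.append(index)
                break
        else:
            return None
    return assignment


# Hypothetical workload that passes the quick test.
demands = [3, 3, 2, 2, 1]
quick_ok = surely_fits(demands, node_count=3, capacity=8)
assignment = first_fit(demands, node_count=3, capacity=8)

# A workload that fails the quick test can still be feasible.
tight_ok = surely_fits([5, 5], node_count=2, capacity=5)
tight_assignment = first_fit([5, 5], node_count=2, capacity=5)
```
</pre>
When the quick test fails, as in the second workload, the allocation
still has to be attempted to find out whether it is feasible.<br>
<br>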
---<br>
Raj<br>
<br>
Cross, Joseph wrote:
<blockquote
cite="mid7FC402934595684AB4B49D416EF3D578021A2286@sde2k3-mb2.darpa.mil"
type="cite">
<pre wrap="">Gerry -
This is great information. Thanks for posting it.
</pre>
<blockquote type="cite">
<pre wrap="">There was a recent review of the Resource Manager Ensemble Software
Design Description for release 4 that covered how resources are
allocated upon service deployment.
</pre>
</blockquote>
<pre wrap=""><!---->
This looks like true dynamic resource management. I.e., it will allocate
SCIs to nodes in configurations that have never been tested. I was under
the impression that the cert folks would not tolerate such a thing. Has
this changed?
</pre>
<blockquote type="cite">
<pre wrap="">Step 1: Get the list of potential hardware on which to deploy
the software...
</pre>
</blockquote>
<blockquote type="cite">
<pre wrap=""> Step 2: Filter out the nodes that do not have enough resources
(RAM, CPU utilization and bandwidth) to support the software.
</pre>
</blockquote>
<pre wrap=""><!---->
Here "enough" means "enough resources remaining after already allocated
higher priority processes and dataflows have taken their slices," right?
</pre>
<blockquote type="cite">
<pre wrap="">Step 3: Determine where to deploy the software. This is based on:
Co-location dependency - must be deployed on same
node as other software
</pre>
</blockquote>
<blockquote type="cite">
<pre wrap=""> Replication scheme - Master/slave - deploy master
replicas on same nodes as dependent software masters
</pre>
</blockquote>
<pre wrap=""><!---->
No compendo. Please elucidate. E.g., what's a dependent software master?
</pre>
<blockquote type="cite">
<pre wrap=""> Survivability with respect to fire zones...
</pre>
</blockquote>
<pre wrap=""><!---->
What's a fire zone? How does it relate to the data centers, EMEs, etc.
that we've heard about?
</pre>
<blockquote type="cite">
<pre wrap="">... if a replica already exists, attempt to deploy a new replica
in a different fire zone.
Survivability with respect to nodes - if a replica exists on one node,
attempt to deploy a new replica on a different node.
</pre>
</blockquote>
<pre wrap=""><!---->
So if there's no other choice, you will deploy a replica on the same
node as its master?
</pre>
<blockquote type="cite">
<pre wrap="">The resource manager will attempt to evenly distribute
replicas across different fire zones.
Bin Packing - if the above does not result in a list of zero
candidate nodes, then use bin packing to select the node. The
resource manager will attempt to deploy the software in the node
with the most RAM first, then CPU utilization then network
bandwidth that meets the software needs. The attempt is to deploy
1 copy of all software needed for an operational string before
deploying additional replicas.
Step 4: If there is not enough room on any node to deploy the software,
then the resource manager will examine the existing software on candidate
nodes for an operational string(s) that can be preempted. This is based on
criticality, resource utilization, effect of previous preemptions and
replicas.
</pre>
</blockquote>
<pre wrap=""><!---->
Very interesting. Do you distinguish between strings that would have to
be killed until more hardware is available and those that could be
immediately restarted?
Thanks again for posting this, and thanks in advance for your time and
effort in educating us.
- Joe
</pre>
</blockquote>
<br>
</body>
</html>