[tao-users] Notification Service sudden memory leak

Wed May 17 09:56:22 CDT 2017

Hi Markus,

These are most likely proxy and/or admin objects. Whenever you use, for
example, new_for_consumers() to get a new consumer admin, a new instance is
created in the server. It is up to the client code to call destroy() when
done with them. Since a single admin can manage many proxies (as long as
they have the same admin-level properties) you can use the returned ID to
get that admin object later, or put its reference in the name service.
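
To make that ownership rule concrete, here is a small self-contained C++
model. It is not TAO code: ToyChannel and all of its members are
hypothetical stand-ins that only mimic the idea of new_for_consumers(),
looking an admin up by its ID, and destroy(). It contrasts a client that
asks for a fresh admin on every connection with one that keeps the ID and
cleans up when done.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Toy stand-in for the server-side state of one event channel.
// NOT the TAO API: it only models "an admin lives until it is destroyed".
class ToyChannel {
public:
  using AdminID = std::int32_t;

  // Modelled on new_for_consumers(): every call creates a new admin
  // instance and hands back its ID.
  AdminID new_for_consumers() {
    AdminID id = next_id_++;
    admins_.insert(id);
    return id;
  }

  // Modelled on looking an existing admin up by the ID returned earlier.
  bool has_admin(AdminID id) const { return admins_.count(id) != 0; }

  // Modelled on calling destroy() on the admin.
  void destroy_admin(AdminID id) { admins_.erase(id); }

  std::size_t admin_count() const { return admins_.size(); }

private:
  AdminID next_id_ = 1;
  std::set<AdminID> admins_;
};

// Leaky pattern: a fresh admin per connection, never destroyed.
std::size_t leaky_client(ToyChannel& channel, int connections) {
  for (int i = 0; i < connections; ++i) {
    (void)channel.new_for_consumers();
  }
  return channel.admin_count();
}

// Reuse pattern: create one admin, keep its ID, destroy it when done.
// Returns the largest number of admins alive at any point.
std::size_t reusing_client(ToyChannel& channel, int connections) {
  const ToyChannel::AdminID id = channel.new_for_consumers();
  std::size_t peak = channel.admin_count();
  for (int i = 0; i < connections; ++i) {
    assert(channel.has_admin(id));  // same admin serves every connection
    peak = std::max(peak, channel.admin_count());
  }
  channel.destroy_admin(id);
  return peak;
}
```

In this model the leaky client leaves one admin behind per connection,
while the reusing client never holds more than one and leaves none.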

Same deal for the proxies, except those are one-to-one with the
corresponding entity in the clients. As you disconnect from them, you will
want to call destroy() on the proxy object too.
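
One way to avoid forgetting that cleanup call is a scope guard. The sketch
below is generic C++ and does not use any TAO headers: CleanupGuard and
FakeProxy are hypothetical names, and FakeProxy merely counts how often
its destroy() is invoked. The cleanup is passed in as a callable so that
you can substitute whatever operation your proxy or admin reference
actually exposes.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Runs the given cleanup exactly once when the guard goes out of scope,
// including when the scope is left through an exception.
class CleanupGuard {
public:
  explicit CleanupGuard(std::function<void()> cleanup)
    : cleanup_(std::move(cleanup)) {}

  CleanupGuard(const CleanupGuard&) = delete;
  CleanupGuard& operator=(const CleanupGuard&) = delete;

  ~CleanupGuard() {
    if (!cleanup_) return;
    try {
      cleanup_();
    } catch (...) {
      // A remote call may fail; never let that escape a destructor.
    }
  }

private:
  std::function<void()> cleanup_;
};

// Hypothetical stand-in for a proxy reference held by the client.
struct FakeProxy {
  int destroy_calls = 0;
  void destroy() { ++destroy_calls; }
};

// Normal exit from the scope.
int cleanup_calls_after_scope() {
  FakeProxy proxy;
  {
    CleanupGuard guard([&proxy] { proxy.destroy(); });
    // ... connect, exchange events, disconnect ...
  }
  return proxy.destroy_calls;
}

// Exit through an exception: the guard still runs the cleanup.
int cleanup_calls_after_throw() {
  FakeProxy proxy;
  try {
    CleanupGuard guard([&proxy] { proxy.destroy(); });
    throw std::runtime_error("simulated failure");
  } catch (const std::runtime_error&) {
    // swallowed for the demonstration
  }
  return proxy.destroy_calls;
}
```

Both helper functions report exactly one cleanup call, which is the
property you want for every proxy a client obtains.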

There is also a Monitor/Control API on the Notify service that lets you
clean up admin and proxy objects externally. They are not removed when you
destroy the channel.

Regarding timestamps in the log, add "-ORBVerboseLogging 1" to the command
line. Also there should be some examples of configuring log rotation
around. I'm sorry I don't have time now to look that up.
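
Until the log is rotated, a small filter can make a very large debug log
manageable. The following is a sketch only, assuming the refcount lines
look like the ones in the quoted report ("object:<address> incr refcount =
<n>"), possibly with a timestamp or other prefix in front. RefStats and
scan_refcounts are hypothetical names, not part of TAO.

```cpp
#include <cassert>
#include <cstdio>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Summary of the refcount values seen for one object address.
struct RefStats {
  long first = 0;  // first refcount value seen
  long last = 0;   // most recent refcount value seen
  long max = 0;    // highest refcount value seen
  long lines = 0;  // number of matching log lines
};

// Reads log lines and collects refcount statistics per object address.
// Lines that do not contain a recognizable refcount entry are skipped.
std::map<std::string, RefStats> scan_refcounts(std::istream& in) {
  std::map<std::string, RefStats> stats;
  std::string line;
  while (std::getline(in, line)) {
    // Search anywhere in the line so that prefixes are tolerated.
    const std::string::size_type pos = line.find("object:");
    if (pos == std::string::npos) continue;

    char addr[32];
    char op[8];  // "incr" or "decr"
    long count = 0;
    if (std::sscanf(line.c_str() + pos, "object:%31s %7s refcount = %ld",
                    addr, op, &count) != 3) {
      continue;
    }

    RefStats& s = stats[addr];
    if (s.lines == 0) s.first = count;
    s.last = count;
    if (count > s.max) s.max = count;
    ++s.lines;
  }
  return stats;
}
```

Feeding the log through scan_refcounts (for example from std::cin) and
printing the entries whose last value is far above the first points at the
objects worth investigating.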

If you need more support with this, I suggest you contact
sales at objectcomputing.com to talk about opening a support contract with OCI.

Best regards,
Phil

On Wed, May 17, 2017 at 8:11 AM, Markus Gaugusch <markus at gaugusch.at> wrote:

>     TAO VERSION: 2.1.6
>     ACE VERSION: 6.1.6
>
>     HOST MACHINE and OPERATING SYSTEM:
> RHEL 6.3, using standard RPM
>
>     DOES THE PROBLEM AFFECT:
>         COMPILATION?
> no
>         LINKING?
> no
>         EXECUTION?
> yes
>
>     SYNOPSIS:
> At some point in time, the TAO notification service starts to leak massive
> amounts of memory. Later on, it even crashes (probably due to "out of
> memory").
>
>     DESCRIPTION:
> We use the tao-cosnotification service in an environment with mixed C++
> (TAO) and Java applications. Most event channels have only one supplier and
> one or more consumers, some channels have several suppliers (either from
> C++ or Java apps).
>
> After some time, we noticed that the memory usage (shown with top, %MEM)
> starts to increase linearly, even though the load (number of events) is
> unchanged. Our tests took between 12 and 36 hours to cause the problem to
> occur. Several channels are involved and we have no idea yet what the
> trigger is.
>
> I started tao-cosnotification with -ORBDebugLevel 4 to gain some insights
> and it seems that there is an object with increasing refcount:
>
> object:8a650e40 decr refcount = 4
> object:8a650e40 incr refcount = 5
> object:8a650e40 incr refcount = 6
> object:8a650e40 incr refcount = 6
> object:8a650e40 decr refcount = 5
> object:8a650e40 decr refcount = 5
> object:8a650e40 decr refcount = 4 # for many hours it is stable
> [...]
> object:8a650e40 decr refcount = 8125 # after some time a bit increased
> [...]
> object:8a650e40 decr refcount = 8538 # but now growing much faster
> [...]
> object:8b166010 incr refcount = 212842 # and very high
> object:8b166010 incr refcount = 212843
> object:8b166010 decr refcount = 212842
> object:8b166010 incr refcount = 212843
> object:8b166010 incr refcount = 212844
> object:8b166010 decr refcount = 212843
>
> I had the suspicion that consumers that were not properly disconnecting
> from their channels could be the cause, but while there is "some" leak, it
> stabilizes and doesn't leak further. So this doesn't seem to be the cause.
>
>
>     QUESTIONS:
> 1.) Is such a problem known and fixed in newer versions?
>
> 2.) The object refcount shows 2 increments and only 1 decrement in this
> "leaking" state. What could be the cause and how do I find out what this
> object "8b166010" is all about?
>
> 3.) I wrote a test program to destroy all channels using ec->destroy(),
> but the memory was not freed. What could be the cause? Is this expected?
>
> 4.) Is there a way to get timestamps into the debug log??!! :) Is
> ORBDebugLevel 4 enough? The logfile grew to ~30GB during this test, which
> is not easy to handle ... I didn't dare to enable more logging :)
>
> 5.) What is the policy for disconnected consumers? Is there a way to
> "improperly" disconnect and cause objects to be stored until the end?
>
>     SAMPLE FIX/WORKAROUND:
> Restart of tao-cosnotification is necessary :(
>
>
> kind regards,
> Markus

-- 

Phil Mesnier
Principal Engineer & Partner
OCI | WE ARE SOFTWARE ENGINEERS.
tel (314) 579-0066 x225
ObjectComputing.com
<http://objectcomputing.com/>