[tao-bugs] Service after some time working idle (1 night) can't process requests

Phil Mesnier mesnierp at ociweb.com
Mon Jan 16 06:35:47 CST 2017


Hah, I knew it! 

Thank you for the update. Good luck with your project.

Best regards,
Phil

> On Jan 16, 2017, at 4:48 AM, Daniel Suchodolski <Daniel.Suchodolski at pitradwar.com> wrote:
> 
> Hi Phil and Johny,
> 
> 
> Thank you for your response and advice.
> 
> 
> The problem is resolved.
> 
> Using the lsof command was a good idea. My application was regularly creating file descriptors.
> 
> Now the application is working stably.
> 
> Thank you again.
> 
> 
> Best regards,
> Daniel
> 
> From: Phil Mesnier [mesnierp at ociweb.com]
> Sent: 13 January 2017 12:58
> To: Daniel Suchodolski
> Cc: tao-bugs at list.isis.vanderbilt.edu
> Subject: Re: [tao-bugs] Service after some time working idle (1 night) can't process requests
> 
> Hi Daniel,
> 
> In addition to what Johnny said, do you have log output from the server during the period from 11:38:25 to 15:49:37? I ask because that last client connection is assigned handle #1542 which is quite a large number for a supposedly idle server. Do you have other I/O happening on a regular basis, maybe writing to a file, or connecting to a database or something?
> 
> I'm guessing you have a resource leak somewhere, not closing a file or socket. 
> 
> By default, TAO's reactor uses select() which typically has a limit of 1024 handles. Since the handle number is used as an index to a bit array, a high number such as #1542 will be out of bounds even if you only have a handful of sockets you are interested in.  So the server is probably just in its normal run state, but unable to select on the client's connection.
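
To make the bit-array point concrete, here is a minimal Python sketch of the range check that select() effectively imposes. It assumes FD_SETSIZE is 1024, the usual value on Linux with glibc, so check your platform. fits_in_fd_set is a hypothetical helper for illustration, not part of ACE or TAO. The handle numbers are the ones quoted from the log.

```python
FD_SETSIZE = 1024  # assumption: typical Linux/glibc value, not read from the system


def fits_in_fd_set(handle: int, fd_setsize: int = FD_SETSIZE) -> bool:
    """Return True if the handle number is a valid bit index into an fd_set."""
    return 0 <= handle < fd_setsize


# Handle numbers mentioned in the thread.
handles = {"naming service": 9, "first client": 28, "late client": 1542}

for name, handle in handles.items():
    status = "ok" if fits_in_fd_set(handle) else "out of range for select()"
    print(f"{name:>15} handle #{handle}: {status}")
```

The check depends only on the numeric value of the handle, so a single socket numbered 1542 fails it even when nothing else is being watched.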
> 
> You can try lsof -p <server pid>, which, if I'm guessing right, will show you 1500 or so open files or sockets.
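
For a scripted version of the same check on Linux, the open descriptors of a process can also be read from /proc/<pid>/fd. The sketch below assumes a Linux /proc filesystem and permission to read the target process's entry (same user or root). open_fds and summarize are hypothetical helper names, not part of lsof, ACE or TAO.

```python
import os
from collections import Counter


def open_fds(pid="self"):
    """Map each open descriptor number of a process to its target (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result


def summarize(fds):
    """Count descriptors by kind: 'socket', 'pipe', 'anon_inode' or 'file'."""
    kinds = Counter()
    for target in fds.values():
        if target.startswith(("socket:", "pipe:", "anon_inode:")):
            kinds[target.split(":", 1)[0]] += 1
        else:
            kinds["file"] += 1
    return kinds


if __name__ == "__main__" and os.path.isdir("/proc/self/fd"):
    fds = open_fds()
    print(f"{len(fds)} open descriptors, highest is #{max(fds)}")
    print(dict(summarize(fds)))
```

Running it periodically against the server pid, for example open_fds(1234) with 1234 standing in for the real pid, would show whether the count and the highest handle number keep climbing while the server is idle.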
> 
> In fact, I noticed that the first client connection you noted uses handle #28, while previously the server had opened a connection to the naming service using handle #9. So 19 handles were consumed in 2.25 minutes. Later you have a 4 hour 11 minute gap with 1514 handles consumed, so roughly 8 handles per minute were leaked in the first case, and roughly 6 per minute in the latter.
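
The arithmetic behind those two rates can be reproduced with a short Python sketch. It assumes the 11:38:25 and 15:49:37 timestamps mentioned earlier belong to the connections on handles #28 and #1542. minutes_between and leak_rate are hypothetical helper names used only for this illustration.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> float:
    """Minutes between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


def leak_rate(first_handle: int, last_handle: int, minutes: float) -> float:
    """Handles consumed per minute between two observed handle numbers."""
    return (last_handle - first_handle) / minutes


# Startup window: naming service connection (#9) to first client (#28), 2.25 minutes.
early_rate = leak_rate(9, 28, 2.25)

# Idle window: first client (#28) to late client (#1542).
idle_minutes = minutes_between("11:38:25", "15:49:37")
late_rate = leak_rate(28, 1542, idle_minutes)

print(f"early: {early_rate:.1f} handles/min")
print(f"late:  {late_rate:.1f} handles/min over {idle_minutes:.1f} min")
```

Both windows give a rate of the same order, which is what points to a steady periodic leak rather than a one-off burst.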
> 
> Now if you need all these open resources, perhaps you can switch to the dev_poll reactor rather than the default reactor. See docs/Options.html for information on setting reactor type via the advanced resource factory.
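
As a rough sketch of that change, the Advanced_Resource_Factory line from the original report could select the reactor as shown below. The value dev_poll is an assumption about how the option is spelled, and it presumably requires an ACE build with /dev/poll or epoll support, so confirm both against docs/Options.html before relying on it. The other options are copied unchanged from the reported adatp3-services.svc.

```
# Assumption: "dev_poll" is the reactor type name accepted by -ORBReactorType.
static Advanced_Resource_Factory "-ORBReactorMaskSignals 0 -ORBInputCDRAllocator null -ORBReactorType dev_poll -ORBConnectionCacheLock null"
```

This only lifts the handle-number limit. It does not stop descriptors from leaking.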
> 
> Best regards,
> Phil
> 
>> On Jan 13, 2017, at 4:32 AM, Johnny Willemsen <jwillemsen at remedy.nl <mailto:jwillemsen at remedy.nl>> wrote:
>> 
>> Hi,
>> 
>> Thanks for using the PRF form. Can you attach a debugger and see where the server is looping exactly?
>> 
>> Best regards,
>> 
>> Johnny Willemsen
>> Remedy IT
>> Postbus 81 | 6930 AB Westervoort | The Netherlands
>> http://www.remedy.nl <http://www.remedy.nl/>
>> On 01/12/2017 02:32 PM, Daniel Suchodolski wrote:
>>> Hi TAO,
>>> 
>>> 
>>>     TAO VERSION: 2.4.1
>>>     ACE VERSION: 6.4.1
>>> 
>>>     HOST MACHINE and OPERATING SYSTEM:
>>>     Debian 8 (Jessie)
>>> 
>>>     COMPILER: g++ (Debian 4.9.2-10) 4.9.2 
>>> 
>>> 
>>>     DOES THE PROBLEM AFFECT:
>>>         EXECUTION? YES
>>> 
>>> 
>>>     SYNOPSIS:
>>> After sitting idle for some time (1 night), the CORBA service can't process requests (a result of migration).
>>> 
>>>     DESCRIPTION:
>>> First, I want to highlight that we use a version of ACE/TAO compiled with the option "threads 0". The problem started to happen with many services after migrating to the newest version of ACE/TAO and of the operating system. Until now the system worked stably on version 1.2.1/5.2.1 (Debian Lenny).
>>> 
>>> How it works:
>>> The server application registers a CORBA service in the NamingService.
>>> If a client connects shortly after the server starts,
>>> the server works fine: it processes requests properly.
>>> 
>>> After some time (for example 1 night) a client connects to the server, but
>>> when the client tries to use the service, the server hangs and uses 100% of a processor.
>>> The client is blocked by the server until the server is killed. While debugging, we found that the problem is somewhere inside the CORBA invocation. The debug information seen while loading adatp3-services.svc is very unclear, and we are not able to fully interpret it.
>>> 
>>> The service is run with the following ORB parameters:
>>>         -ORBDottedDecimalAddresses 1
>>>             -ORBDebug -ORBDebugLevel 10 -ORBVerboseLogging 2 -ORBInitRef NameService=corbaloc::server:30033/NameService
>>>             -ORBSvcConf adatp3-services.svc
>>> 
>>> and adatp3-services.svc:
>>> static Advanced_Resource_Factory "-ORBReactorMaskSignals 0 -ORBInputCDRAllocator null -ORBReactorType select_st -ORBConnectionCacheLock null"
>>> static Server_Strategy_Factory "-ORBAllowReactivationOfSystemids 0"
>>> static Client_Strategy_Factory "-ORBTransportMuxStrategy EXCLUSIVE -ORBClientConnectionHandler RW"
>>> 
>>> 
>>>     REPEAT BY:
>>> Every Time
>>> 
>>>     TAO LOG:
>>> 
>>> The server log is shown below. It is divided into parts:
>>> 
>>> [Start Server]
>>> [Client connect to Server after short time]
>>> [Client is connecting after some time]
>>> 
>> 
>> _______________________________________________
>> tao-bugs mailing list
>> tao-bugs at list.isis.vanderbilt.edu <mailto:tao-bugs at list.isis.vanderbilt.edu>
>> http://list.isis.vanderbilt.edu/cgi-bin/mailman/listinfo/tao-bugs <http://list.isis.vanderbilt.edu/cgi-bin/mailman/listinfo/tao-bugs>
> --
> Phil Mesnier
> Principal Engineer & Partner
> 
> OCI | WE ARE SOFTWARE ENGINEERS.
> tel  +1.314.579.0066 x225
> ociweb.com <http://ociweb.com/>
> 
