[Ace-users] Re: TAO-ORB: Server crashes intermittently on restarts

mak khan.m.arshad at gmail.com
Tue Jul 17 05:47:00 CDT 2007


On Jul 16, 6:15 pm, mak <khan.m.ars... at gmail.com> wrote:
>     TAO VERSION: 1.5.9
>     ACE VERSION: 5.5.9
>
>     HOST MACHINE and OPERATING SYSTEM:
>
>     Linux 2.6.20-1.2320.fc5smp #1 SMP i686 i686 i386 GNU/Linux
>
>     TARGET MACHINE and OPERATING SYSTEM, if different from HOST:
>
>     COMPILER NAME AND VERSION (AND PATCHLEVEL):
>
>     g++ (GCC) 4.1.1 20070105 (Red Hat 4.1.1-51)
>
>     THE $ACE_ROOT/ace/config.h FILE :
>
>     #include "ace/config-linux.h"
>
>     THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE:
>
>     include $(ACE_ROOT)/include/makeinclude/platform_linux.GNU
>
>     CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features:
>
>     AREA/CLASS/EXAMPLE AFFECTED:
>
>     ORB
>
>     DOES THE PROBLEM AFFECT:
>         COMPILATION?
>
>         "No"
>
>         LINKING?
>
>         "No"
>
>         EXECUTION?
>
>         "Yes"
>
>         OTHER (please specify)?
>
>     SYNOPSIS:
>
>     Server crashes intermittently on restarts.
>
>     DESCRIPTION:
>
>     My application provides server restart functionality when a certain
>     IDL method is called by a remote client. The restart sometimes
>     succeeds and sometimes ends in a segmentation fault.
>
>     The SEGV happens in the ORB::run method. The core file shows the
>     crash at the following location:
>
> Program terminated with signal 11, Segmentation fault.
> #0  0x09e08990 in ?? ()
> (gdb) where
> #0  0x09e08990 in ?? ()
> #1  0x001c5c02 in TAO_ORB_Core::run (this=0x9e08e48, tv=0x0, perform_work=0)
>     at /home/akhan/ecp/software/external/acetao/TAO/tao/LF_Event_Loop_Thread_Helper.inl:24
> #2  0x001bfb7c in CORBA::ORB::run (this=0x9df4540, tv=0x0) at ORB.cpp:202
> #3  0x001bfbe5 in CORBA::ORB::run (this=0x9df4540) at ORB.cpp:188
> #4  0x080501dd in RestartTestTPWorker::svc (this=0x9dfed40) at RestartTestRunner.cpp:250
> #5  0x008d3e56 in ACE_Task_Base::svc_run (args=0x9dfed40) at Task.cpp:271
> #6  0x008d4818 in ACE_Thread_Adapter::invoke_i (this=0x9e153f8) at Thread_Adapter.cpp:146
> #7  0x008d49e6 in ACE_Thread_Adapter::invoke (this=0x9e153f8) at Thread_Adapter.cpp:95
> #8  0x008693d1 in ace_thread_adapter (args=0x9e153f8) at Base_Thread_Adapter.cpp:116
> #9  0x00db3433 in start_thread () from /lib/libpthread.so.0
> #10 0x00c0ba1e in clone () from /lib/libc.so.6
> (gdb)
>
>     It could be something that I'm doing wrong. I suspect the way I
>     reset the reactor event loop on restart. Any help is very much
>     appreciated.
>
>     Thanks
>     Arshad
>
>     REPEAT BY:
>
> Test case follows. Apologies for the rather long test case. :( Please
> let me know if there are any questions about it.
>
> The files are:
>
> 1. RestartTest.idl: Interface definition.
> 2. RestartTestIntf_i.h|cpp : CORBA servant implementing the IDL interface.
> 3. RestartTestRunner.h|cpp : Class containing the ORB initialize and
>                              shutdown methods. Also contains the class
>                              for the ORB thread pool worker.
> 4. server.cpp : Server main. Also contains the signal handler class that
>                 handles SIGINT.
> 5. client.cpp : Client main.
>
> How to build:
> 1. Generate make files.
> $ACE_ROOT/bin/mwc.pl -type gnuace restart.mwc
>
> 2. Do the make
> make
>
> How to run:
> 1. Run the server (make sure port 1234 is available)
> ./server -ORBEndPoint iiop://localhost:1234
>
> 2. Run the client
> ./client -ORBInitRef Test=corbaloc::localhost:1234/Test
>
> Expected Result:
> "The server should restart successfully on multiple invocations of the
> client."
>
> Observed Result:
> "The server restarts but gets a segmentation fault after a few client
>  invocations."
>
> ----- File: RestartTest.idl -----
> module RestartTest {
>
>   /**
>    * Restart Test interface.
>    */
>   interface RestartTestIntf {
>
>    /**
>     * A two way method to exercise the server.
>     */
>     string test_echo_string(in string message);
>
>    /**
>     * Restarts the server.
>     */
>     oneway void test_restart();
>
>    /**
>     * Shuts down the server.
>     */
>     oneway void test_shutdown();
>   };
>
> };
>
> -------------------------------------
>
> ----- File: RestartTestIntf_i.h -----
>
> #ifndef RESTART_TEST_INTF_I_H
> #define RESTART_TEST_INTF_I_H
>
> #include /**/ "ace/pre.h"
>
> #include "RestartTestS.h"
>
> class RestartTestIntf_i :
>       public virtual POA_RestartTest::RestartTestIntf
> {
> public:
>     /**
>      * Constructor.
>      */
>     RestartTestIntf_i() {}
>
>     /**
>      * Destructor
>      */
>     ~RestartTestIntf_i() {}
>
>     /**
>      * A two way method to exercise the server.
>      * @param value Passed in string.
>      * @return Returns back the passed in string.
>      */
>     virtual char * test_echo_string(const char * value);
>
>     /**
>      * Restarts the server.
>      */
>     virtual void test_restart();
>
>     /**
>      * Shuts down the server.
>      */
>     virtual void test_shutdown();
>
> };
>
> #include /**/ "ace/post.h"
> #endif /* RESTART_TEST_INTF_I_H */
>
> -------------------------------------
>
> ----- File: RestartTestIntf_i.cpp -----
>
> #include "RestartTestIntf_i.h"
> #include "RestartTestRunner.h"
>
> #include "ace/Signal.h"
>
> //////////////////////////////////////////////////////////////////////
> //
> // test_echo_string
> //////////////////////////////////////////////////////////////////////
> char *
> RestartTestIntf_i::test_echo_string(const char * value)
> {
>     ACE_DEBUG ((LM_DEBUG,
>               "(%P|%t) RestartTestIntf_i::test_echo_string.\n"));
>
>    char * retVal = CORBA::string_dup(value);
>
>    return retVal;
>
> }
>
> //////////////////////////////////////////////////////////////////////
> //
> // test_restart
> //////////////////////////////////////////////////////////////////////
> void
> RestartTestIntf_i::test_restart()
> {
>     ACE_DEBUG ((LM_DEBUG,
>               "(%P|%t) RestartTestIntf_i::test_restart.\n"));
>
>     // Set the restart flag to true so that main method can
>     // re-init the RestartTestRunner.
>
>     RestartTestRunner::mRestart = true;
>
>     // Send a SIGINT signal to the signal handler for
>     // shutdown (and then a restart)
>
>     ACE_OS::kill(ACE_OS::getpid(), SIGINT);
>
> }
>
> //////////////////////////////////////////////////////////////////////
> //
> // test_shutdown
> //////////////////////////////////////////////////////////////////////
> void
> RestartTestIntf_i::test_shutdown()
> {
>     ACE_DEBUG ((LM_DEBUG,
>               "(%P|%t) RestartTestIntf_i::test_shutdown.\n"));
>
>     // Set the restart flag to false so that the main method
>     // does not restart the server.
>
>     RestartTestRunner::mRestart = false;
>
>     // Send a SIGINT signal to the signal handler for
>     // shutdown.
>
>     ACE_OS::kill(ACE_OS::getpid(), SIGINT);
>
> }
>
> -------------------------------------
>
> ----- File: RestartTestRunner.h -----
>
> #ifndef RESTART_TEST_RUNNER_H
> #define RESTART_TEST_RUNNER_H
>
> #include /**/ "ace/pre.h"
> #include "ace/Task.h"
> #include "tao/ORB.h"
> #include "tao/PortableServer/PortableServer.h"
>
> // Flag to conditionally compile TAO Singleton Manager's Initialize
> // and finalization.
>
> #define DO_SING_MGR_INIT_FINI   1
>
> /**
>  * Thread Pool worker class.
>  *
>  * Objects of this class represent a thread pool worker. Each worker
>  * executes the orb->run method and services incoming requests.
>  */
>
> class RestartTestTPWorker : public ACE_Task_Base {
>
> private:
>
>     /// An initialized ORB instance.
>     CORBA::ORB_var pOrb;
>
> public:
>
>     /**
>      * Constructor
>      */
>
>     RestartTestTPWorker (CORBA::ORB_ptr orb);
>
>     /**
>      * Destructor
>      */
>
>     virtual ~RestartTestTPWorker();
>
>     /**
>      * Thread svc method.
>      */
>
>     virtual int svc(void);
>
> }; /* RestartTestTPWorker */
>
> /**
>  * Class that initializes the ORB and POAs and instantiate
>  * thread pool workers.
>  */
> class RestartTestRunner {
>
> private:
>
>     /// Command line arguments.
>     int mArgc;
>     char **pArgv;
>
>     /// ORB instance.
>     CORBA::ORB_var pOrb;
>
>     /// POA instance for servicing requests.
>     PortableServer::POA_var pTestPOA;
>
>     /// Thread Pool worker instance.
>     RestartTestTPWorker * pWorkers;
>
>     /// Flag indicating if the ORB is initialized successfully.
>     bool mInited;
>
> public:
>
>     /// Flag used for indicating whether a restart is needed.
>     static bool mRestart;
>
>     /**
>      * Constructor
>      */
>
>     RestartTestRunner (int argc, char *argv[]);
>
>     /**
>      * Destructor
>      */
>
>     virtual ~RestartTestRunner();
>
>     /**
>      * Initializes ORB and POAs.
>      */
>
>     virtual int initialize(void);
>
>     /**
>      * Shuts down the ORB.
>      */
>
>     virtual int shutdown();
>
> };
>
> #include /**/ "ace/post.h"
> #endif /* RESTART_TEST_RUNNER_H */
>
> -------------------------------------
>
> ----- File: RestartTestRunner.cpp -----
>
> #include "RestartTestRunner.h"
> #include "RestartTestIntf_i.h"
> #include "ace/ARGV.h"
> #include "tao/Object.h"
> #include "tao/IORTable/IORTable.h"
> #include "tao/TAO_Singleton_Manager.h"
>
> // Public fields and methods for RestartTestRunner.
>
> bool RestartTestRunner::mRestart = false;
>
> //////////////////////////////////////////////////////////////////////
> //
> // Constructor
> //////////////////////////////////////////////////////////////////////
> RestartTestRunner::RestartTestRunner(int argc, char *argv[])
>         : mArgc(argc),
>           pArgv(argv),
>           pOrb(CORBA::ORB::_nil()),
>           pWorkers(0),
>           mInited(false)
>
> {
>
> }
>
> //////////////////////////////////////////////////////////////////////
> //
> // Destructor
> //////////////////////////////////////////////////////////////////////
> RestartTestRunner::~RestartTestRunner()
> {
>     if (mInited)
>     {
>         mInited = false;
>         this->shutdown();
>     }
>
>     if (pWorkers)
>     {
>         delete pWorkers;
>         pWorkers = 0;
>     }
>
> }
>
> //////////////////////////////////////////////////////////////////////
> //
> // initialize
> //////////////////////////////////////////////////////////////////////
>
> int
> RestartTestRunner::initialize(void)
> {
>     try
>     {
> #ifdef DO_SING_MGR_INIT_FINI
>         int register_with_object_manager = 0;
>
>         if (TAO_Singleton_Manager::instance ()->init (
>                                register_with_object_manager) == -1)
>         {
>             ACE_DEBUG ((LM_ERROR,
>                       "(%P|%t) RestartTestRunner - initialize: "
>                       "Failed to initialize TAO Singleton Manager.\n"));
>             return -1;
>         }
> #endif
>
>         // Copy command line arguments and pass them to ORB_init
>         ACE_ARGV newArgs(pArgv);
>
>         pOrb = CORBA::ORB_init(mArgc,newArgs.argv());
>
>         // Get RootPOA
>         CORBA::Object_var obj =
>             pOrb->resolve_initial_references("RootPOA");
>
>         PortableServer::POA_var rootPOA =
>             PortableServer::POA::_narrow(obj.in());
>
>         // Create Persistent Object Lifespan policy
>
>         PortableServer::LifespanPolicy_var persistentLifespan =
>             rootPOA->create_lifespan_policy(PortableServer::PERSISTENT);
>
>         // Create User assigned Id policy
>
>         PortableServer::IdAssignmentPolicy_var idAssign =
>             rootPOA->create_id_assignment_policy(PortableServer::USER_ID);
>
>         CORBA::PolicyList persistentPOAPolicy (2);
>         persistentPOAPolicy.length (2);
>
>         persistentPOAPolicy[0] =
>             PortableServer::LifespanPolicy::_duplicate(persistentLifespan);
>
>         persistentPOAPolicy[1] =
>             PortableServer::IdAssignmentPolicy::_duplicate(idAssign);
>
>         // Create POA with Persistent lifespan and User assigned Id policies
>         // and its own POAManager
>
>         pTestPOA = rootPOA->create_POA("TestPOA",
>                                        PortableServer::POAManager::_nil(),
>                                        persistentPOAPolicy);
>
>         // Create the servant.
>         RestartTestIntf_i * pServant = new RestartTestIntf_i();
>
>         PortableServer::ObjectId_var oid =
>                            PortableServer::string_to_ObjectId("Test");
>
>         /// Activate the object
>         pTestPOA->activate_object_with_id(oid, pServant);
>
>         CORBA::Object_var testObj =
>             pTestPOA->servant_to_reference(pServant);
>
>         CORBA::String_var iorStr =
>             pOrb->object_to_string(testObj.in());
>
>         // Bind this object reference in the IORTable with an alias
>
>         CORBA::Object_var iorTabObj =
>             pOrb->resolve_initial_references("IORTable");
>
>         IORTable::Table_var iorTab =
>             IORTable::Table::_narrow(iorTabObj.in());
>
>         iorTab->bind ("Test", iorStr.in ());
>
>         // Activate POAs
>
>         PortableServer::POAManager_var pmgr1 = rootPOA->the_POAManager();
>
>         pmgr1->activate();
>
>         PortableServer::POAManager_var pmgr2 = pTestPOA->the_POAManager();
>
>         pmgr2->activate();
>
>         // Destroy the policy objects
>
>         persistentLifespan->destroy();
>         idAssign->destroy();
>
>         // Now create ThreadPool workers and activate them
>
>         pWorkers = new RestartTestTPWorker(pOrb.in());
>
>         if (pWorkers->activate(THR_NEW_LWP | THR_JOINABLE,2) == -1)
>         {
>             pOrb->destroy();
>             ACE_ERROR_RETURN ((LM_ERROR,
>                               "(%P|%t) RestartTestRunner - initialize: "
>                               "Failed to activate Pool Workers.\n"),
>                               -1);
>         }
>
>         ACE_DEBUG ((LM_DEBUG,
>                    "(%P|%t) RestartTestRunner::initialize: Success.\n"));
>
>         mInited = true;
>     }
>     catch (const CORBA::Exception & e)
>     {
>         ACE_DEBUG ((LM_ERROR,
>                     "(%P|%t) RestartTestRunner::initialize: "
>                     "CORBA Exception caught while initializing: "));
>
>         e._tao_print_exception("\n");
>
>         return -1;
>     }
>
>     return 0;
>
> }
>
> //////////////////////////////////////////////////////////////////////
> //
> // shutdown
> //////////////////////////////////////////////////////////////////////
>
> int
> RestartTestRunner::shutdown()
> {
>     try
>     {
>         // Do an ORB shutdown.
>         pOrb->shutdown(1);
>
>         // Wait for all pool workers to finish.
>         pWorkers->wait();
>
>         mInited = false;
>
> #ifdef DO_SING_MGR_INIT_FINI
>         if (TAO_Singleton_Manager::instance ()->fini () == -1)
>         {
>             ACE_ERROR_RETURN ((LM_ERROR,
>                               "(%P|%t) RestartTestRunner::shutdown: "
>                               "Failed to finalize TAO Singleton Manager.\n"),
>                               -1);
>         }
> #endif
>
>         ACE_DEBUG ((LM_INFO,
>                    "(%P|%t) RestartTestRunner::shutdown: "
>                    "ORB shutdown successful.\n"));
>
>     }
>     catch (const CORBA::Exception& e)
>     {
>         ACE_DEBUG ((LM_ERROR,
>                    "(%P|%t) RestartTestRunner::shutdown: "
>                    "CORBA Exception caught while shutting down: "));
>
>         e._tao_print_exception("\n");
>
>         return -1;
>     }
>
>     return 0;
>
> }
>
> // Public fields and methods for RestartTestTPWorker.
>
> //////////////////////////////////////////////////////////////////////
> //
> // Constructor
> //////////////////////////////////////////////////////////////////////
> RestartTestTPWorker::RestartTestTPWorker(CORBA::ORB_ptr orb)
>         : pOrb(CORBA::ORB::_duplicate(orb))
> {
>
> }
>
> //////////////////////////////////////////////////////////////////////
> //
> // Destructor
> //////////////////////////////////////////////////////////////////////
> RestartTestTPWorker::~RestartTestTPWorker()
> {
>
> }
>
> //////////////////////////////////////////////////////////////////////
> //
> // svc
> //////////////////////////////////////////////////////////////////////
> int
> RestartTestTPWorker::svc()
> {
>     try
>     {
>         pOrb->run();
>     }
>     catch (const CORBA::Exception& e)
>     {
>         ACE_DEBUG ((LM_ERROR,
>                     "(%P|%t) RestartTestTPWorker - svc: "
>                     "CORBA Exception caught while in orb->run: "));
>
>         e._tao_print_exception("\n");
>
>         return -1;
>     }
>     catch (...)
>     {
>         ACE_ERROR_RETURN((LM_ERROR,
>                           "(%P|%t) RestartTestTPWorker - svc: "
>                           "Exception caught while in orb->run"),
>                           -1);
>     }
>
>     return 0;
>
> }
>
> -------------------------------------
>
> ----- File: server.cpp -----
>
> #include "RestartTestRunner.h"
> #include "ace/Reactor.h"
>
> #include "ace/Signal.h"
>
> /**
>  * Class for doing signal handling.
>  */
> class SigHandler : public ACE_Event_Handler {
>
>     private:
>
>         /// Pointer to the test runner
>         RestartTestRunner * pRunner;
>
>     public:
>         SigHandler(RestartTestRunner * runner) :
>                    pRunner(runner) {}
>
>         ~SigHandler() {}
>
>         int handle_signal(int signum, siginfo_t * =0, ucontext_t * =0)
>         {
>             switch (signum)
>             {
>                 case SIGINT:
>                         try
>                         {
>                             pRunner->shutdown();
>                             ACE_Reactor::instance()->end_reactor_event_loop();
>
>                         }
>                         catch(...)
>                         {
>                             ACE_ERROR_RETURN ((LM_ERROR,
>                                               "(%P|%t) SigHandler::handle_signal"),
>                                               -1);
>                         }
>                         break;
>
>                 default:
>                         break;
>             }
>             return 0;
>         }
>
> };
>
> int main(int argc, char* argv[])
> {
>
>     bool restart = false;
>     do
>     {
>         if (restart)
>         {
>             ACE_DEBUG ((LM_INFO,
>                         "(%P|%t) server::main: "
>                         "Restarting Server.\n"));
>             RestartTestRunner::mRestart = false;
>         }
>
>         RestartTestRunner * pRunner = new RestartTestRunner(argc, argv);
>
>         if (pRunner->initialize() == -1)
>         {
>             ACE_DEBUG ((LM_ERROR,
>                         "(%P|%t) server::main: "
>                         "Error while initializing Test Runner.\n"));
>             return -1;
>         }
>
>         if (restart)
>         {
>             ACE_DEBUG ((LM_INFO,
>                         "(%P|%t) server::main: Server Re-started.\n"));
>         }
>         else
>         {
>             ACE_DEBUG ((LM_INFO,
>                         "(%P|%t) server::main: Server Started.\n"));
>         }
>
>         ACE_Sig_Set signalSet;
>         signalSet.sig_add( SIGINT );
>
>         SigHandler handler(pRunner);
>
>         ACE_Reactor::instance()->register_handler( signalSet,
>                                                    &handler );
>
>         ACE_Reactor::instance()->run_reactor_event_loop();
>
>         ACE_Reactor::instance()->remove_handler(signalSet);
>
>         restart = RestartTestRunner::mRestart;
>
>         if (restart)
>         {
>             ACE_DEBUG ((LM_DEBUG,
>                       "(%P|%t) server::main: "
>                       "Resetting event loop before restart.\n"));
>
>             ACE_Reactor::instance()->reset_reactor_event_loop();
>
>         }
>
>         delete pRunner;
>
>     } while (restart);
>
>     ACE_DEBUG ((LM_INFO,
>                "(%P|%t) server::main: Server stopped.\n"));
>     return 0;
>
> }
>
> -------------------------------------
>
> ----- File: client.cpp -----
>
> #include "tao/ORB.h"
> #include "tao/Object.h"
> #include "RestartTestC.h"
>
> int main(int argc, char * argv[])
> {
>
>   try
>   {
>      CORBA::ORB_var mOrb = CORBA::ORB_init(argc, argv);
>
>      // Resolve to the target object.
>      CORBA::Object_var mObj =
>          mOrb->resolve_initial_references("Test");
>
>      RestartTest::RestartTestIntf_var testObj =
>                   RestartTest::RestartTestIntf::_narrow(mObj.in());
>
>      CORBA::String_var msg = CORBA::string_dup("Hello World!");
>
>      ACE_DEBUG ( (LM_DEBUG,
>                  "(%P|%t) client::main: "
>                  "Calling method test_echo_string....\n"));
>
>      CORBA::String_var retVal = testObj->test_echo_string(msg);
>
>      ACE_DEBUG ((LM_DEBUG,
>                 "(%P|%t) client::main: "
>                 "Server replied with: %s.\n",
>                 retVal.in()));
>
>      ACE_DEBUG ((LM_DEBUG,
>                 "(%P|%t) client::main: "
>                 "Restarting server....\n"));
>
>      testObj->test_restart();
>    }
>    catch (const CORBA::Exception & e)
>    {
>      ACE_DEBUG ((LM_DEBUG,
>                 "(%P|%t) client::main: "
>                 "Caught CORBA Exception while calling server: "));
>
>      e._tao_print_exception("\n");
>
>      return -1;
>
>    }
>
>    return 0;
>
> }
>
> -------------------------------------
>
> ----- File: restart.mpb -----
>
> project: taoidldefaults {
>
>   includes += $(ACE_ROOT) $(ACE_ROOT)/TAO
>
>   libpaths += $(ACE_ROOT)/lib
>
>   libs += ACE TAO
>
>   IDL_Files {
>     RestartTest.idl
>   }
>
> }
>
> -------------------------------------
>
> ----- File: restart.mpc -----
>
> project(restart_test_client): restart {
>
>   exename = client
>
>   Source_Files {
>     client.cpp
>     RestartTestC.cpp
>   }
>
> }
>
> project(restart_test_server): restart {
>
>   exename = server
>
>   libs += TAO_PortableServer TAO_IORTable
>
>   Header_Files {
>     RestartTestIntf_i.h
>     RestartTestRunner.h
>   }
>
>   Source_Files {
>     RestartTestIntf_i.cpp
>     server.cpp
>     RestartTestS.cpp
>     RestartTestC.cpp
>     RestartTestRunner.cpp
>   }
>
> }
>
> -------------------------------------
>
> ----- File: restart.mwc -----
>
> workspace {
>     restart.mpc
>
> }
>
>     SAMPLE FIX/WORKAROUND:

Same issue also happens on the latest SVN snapshot of ACE-TAO.
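
For readers who want the shape of the restart flow without building ACE/TAO, here is a minimal sketch using only the C++ standard library. It is an illustration under assumptions, not the test case itself and not a confirmed fix: `Runner`, `Request`, and `run_restart_loop` are hypothetical names, the spinning worker threads stand in for `orb->run()`, and an atomic flag stands in for the SIGINT round trip. In this sketch a worker only records the request, and the main thread alone performs the shutdown and joins the workers. Whether that ordering has any bearing on the crash above is not established.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical request codes: what the servant asked the server to do.
enum class Request { None, Restart, Shutdown };

// Hypothetical stand-in for RestartTestRunner: owns the worker threads.
class Runner {
public:
    explicit Runner(std::atomic<Request>& request) : request_(request) {}
    ~Runner() { shutdown(); }

    // Start the workers. Worker 0 plays the servant and posts `to_post`;
    // every worker then spins until asked to stop (in place of orb->run()).
    void initialize(int workers, Request to_post) {
        stop_.store(false);
        for (int i = 0; i < workers; ++i) {
            threads_.emplace_back([this, i, to_post] {
                if (i == 0) request_.store(to_post);  // record only, no teardown
                while (!stop_.load()) std::this_thread::yield();
            });
        }
    }

    // Called from the main thread only: stop and join every worker.
    // Safe to call twice; the second call finds no threads to join.
    void shutdown() {
        stop_.store(true);
        for (std::thread& t : threads_) t.join();
        threads_.clear();
    }

private:
    std::atomic<Request>& request_;
    std::atomic<bool> stop_{false};
    std::vector<std::thread> threads_;
};

// Runs the start/restart loop. The simulated servant asks for `restarts`
// restarts and then for a shutdown. Returns how many times the runner
// was started in total.
int run_restart_loop(int restarts) {
    int starts = 0;
    bool restart = false;
    do {
        std::atomic<Request> request{Request::None};
        Runner runner(request);
        runner.initialize(2, starts < restarts ? Request::Restart
                                               : Request::Shutdown);
        ++starts;

        // Main thread waits for a request, then tears everything down itself.
        while (request.load() == Request::None) std::this_thread::yield();
        restart = (request.load() == Request::Restart);
        runner.shutdown();
    } while (restart);
    return starts;
}
```

Calling `run_restart_loop(3)` starts the runner once, restarts it three times, and then stops. Each cycle fully joins the previous workers before the next runner is created.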

Arshad


