[Ace-users] TAO 1.6.1: application with 2 ORBs sometimes gets a SIGSEGV

a.koehler.tux at gmx.de a.koehler.tux at gmx.de
Thu Feb 7 03:48:35 CST 2008


Hi,

First the report form ;-) ...

TAO VERSION: 1.6.1
ACE VERSION: 5.6.1

HOST/TARGET MACHINE and OPERATING SYSTEM:
    PC with Intel based 2x2.0GHz DualCore
    SUSE Linux Enterprise Server 10 (i586)

COMPILER NAME AND VERSION (AND PATCHLEVEL):
    gcc (GCC) 3.4.6 20051222 (prerelease) for GNAT Pro 5.04a

THE $ACE_ROOT/ace/config.h FILE:
    config-linux.h

THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE:
    platform_linux.GNU

CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features
    this file does not exist

AREA/CLASS/EXAMPLE AFFECTED:
    ACE_Message_Block, ACE_Data_Block, ACE_InputCDR, TAO_InputCDR, ...

DOES THE PROBLEM AFFECT:
    COMPILATION?
        No.
    LINKING?
        No.
    EXECUTION?
        Yes.
    OTHER (please specify)?
        No.

SYNOPSIS:
    application with 2 ORBs sometimes gets a SIGSEGV

DESCRIPTION:
    Please see below.

REPEAT BY:
    Not reliably reproducible (it happens only sometimes), which
    suggests a race between threads accessing a shared resource.

SAMPLE FIX/WORKAROUND:
    Unknown.

Now the detailed DESCRIPTION ...

We're using the latest BFO beta, TAO 1.6.1, and there is a serious
stability problem under Linux SLES 10 with gcc 3.4.6.

The trace (as an example):

(gdb) bt
#0  0x4064de0b in ACE_InputCDR::read_4 (this=0xbfa2bf38, x=0xbfa2ba98) at CDR_Stream.cpp:1434
#1  0x4064e004 in ACE_InputCDR::read_ulong (this=0xbfa2bf38, x=@0xbfa2ba98) at /home/akoehan/tmp/ACE_wrappers/ace/CDR_Stream.inl:651
#2  0x4064e119 in operator>> (is=@0xbfa2bf38, x=@0xbfa2ba98) at /home/akoehan/tmp/ACE_wrappers/ace/CDR_Stream.inl:1237
#3  0x40497de4 in operator>> (is=@0xbfa2bf38, x=@0xbfa2ba98) at /home/akoehan/tmp/ACE_wrappers/TAO/tao/CDR.inl:338
#4  0x404ceee0 in TAO::demarshal_sequence<TAO_InputCDR, IOP::ServiceContext> (strm=@0xbfa2bf38, target=@0xbfa2bba4) at /home/akoehan/tmp/ACE_wrappers/TAO/tao/Unbounded_Sequence_CDR_T.h:301
#5  0x404ce557 in operator>> (strm=@0xbfa2bf38, _tao_sequence=@0xbfa2bba0) at IOP_IORC.cpp:611
#6  0x404b0342 in TAO_GIOP_Message_Generator_Parser_12::parse_request_header (this=0x83b1238, request=@0xbfa2bb78) at GIOP_Message_Generator_Parser_12.cpp:291
#7  0x404adb14 in TAO_GIOP_Message_Base::process_request (this=0x83b1228, transport=0x8371f38, cdr=@0xbfa2bf38, output=@0xbfa2bedc, parser=0x83b1238) at GIOP_Message_Base.cpp:851
#8  0x404acf8f in TAO_GIOP_Message_Base::process_request_message (this=0x83b1228, transport=0x8371f38, qd=0xbfa2c4d4) at GIOP_Message_Base.cpp:668
#9  0x4052baaf in TAO_Transport::process_parsed_messages (this=0x8371f38, qd=0xbfa2c4d4, rh=@0xbfa2c58c) at Transport.cpp:2261
#10 0x4052c518 in TAO_Transport::handle_input_parse_data (this=0x8371f38, rh=@0xbfa2c58c, max_wait_time=0x0) at Transport.cpp:2185
#11 0x4052d106 in TAO_Transport::handle_input (this=0x8371f38, rh=@0xbfa2c58c, max_wait_time=0x0) at Transport.cpp:1512
#12 0x4049ab7c in TAO_Connection_Handler::handle_input_internal (this=0x84b286c, h=239, eh=0x84b2800) at Connection_Handler.cpp:275
#13 0x4049ade9 in TAO_Connection_Handler::handle_input_eh (this=0x84b286c, h=239, eh=0x84b2800) at Connection_Handler.cpp:234
#14 0x404bc5fb in TAO_IIOP_Connection_Handler::handle_input (this=0x84b2800, h=239) at IIOP_Connection_Handler.cpp:294
#15 0x406d1136 in ACE_TP_Reactor::dispatch_socket_event (this=0x8149cf0, dispatch_info=@0xbfa2c640) at TP_Reactor.cpp:591
#16 0x406d14a7 in ACE_TP_Reactor::handle_socket_events (this=0x8149cf0, event_count=@0xbfa2c698, guard=@0xbfa2c6e4) at TP_Reactor.cpp:460
#17 0x406d1588 in ACE_TP_Reactor::dispatch_i (this=0x8149cf0, max_wait_time=0x0, guard=@0xbfa2c6e4) at TP_Reactor.cpp:250
#18 0x406d1654 in ACE_TP_Reactor::handle_events (this=0x8149cf0, max_wait_time=0x0) at TP_Reactor.cpp:174
#19 0x406ab313 in ACE_Reactor::handle_events (this=0x813ff48, max_wait_time=0x0) at Reactor.cpp:420
#20 0x404ea8bb in TAO_ORB_Core::run (this=0x813b838, tv=0x0, perform_work=0) at ORB_Core.cpp:2142
#21 0x404e5c21 in CORBA::ORB::run (this=0x814b608, tv=0x0) at ORB.cpp:202
#22 0x404e5c85 in CORBA::ORB::run (this=0x814b608) at ORB.cpp:188
...
(gdb)

The only thing that is a little bit special is that we're using two ORBs.
One is initialized with the arguments

<executable> -ORBSvcConf <svcconf>

... and the second one is initialized with the arguments

<executable> -ORBEndPoint iiop://host:port
             -ORBEndPoint uiop:///tmp/svc_nr
             -ORBSvcConf <svcconf>

... with svcconf:

-- snip --
static Client_Strategy_Factory "-ORBProfileLock null -ORBClientConnectionHandler mt"

static Server_Strategy_Factory "-ORBConcurrency reactive -ORBPOALock thread"

static UIOP_Factory ""

static Advanced_Resource_Factory "-ORBReactorType tp -ORBInputCDRAllocator null -ORBConnectionCacheLock null -ORBProtocolFactory IIOP_Factory -ORBProtocolFactory SHMIOP_Factory -ORBProtocolFactory UIOP_Factory"
-- snap --

The motivation for using two ORBs is the structure of our software:
two independent libraries are linked together.

Is it possible that our problem is caused by -ORBConnectionCacheLock null?
On the other hand, we never had such problems with the same code in the
same configuration under RedHat 8.0 (gcc 3.2) with TAO 1.4 (but
unfortunately this TAO version no longer builds with newer versions
of gcc).
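
For illustration only: a "null" lock option generally replaces a mutex
with a no-op, so whatever it guards is safe only while a single thread
touches it. The sketch below shows that general idea in standard C++. It
is not TAO's connection cache, and NullLock, Cache and fill are
hypothetical names made up for this example. Whether this is what
happens in the crash above is not established.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical no-op lock: same interface as std::mutex, does nothing.
struct NullLock {
    void lock() {}
    void unlock() {}
};

// Hypothetical shared table whose locking strategy is a template
// parameter, so the caller chooses between a real and a null lock.
template <typename Lock>
class Cache {
public:
    void add(int handle) {
        std::lock_guard<Lock> guard(lock_);
        entries_.push_back(handle);
    }
    std::size_t size() {
        std::lock_guard<Lock> guard(lock_);
        return entries_.size();
    }
private:
    Lock lock_;
    std::vector<int> entries_;
};

// Fill one cache from `threads` threads, `per_thread` insertions each,
// and return the number of entries seen after all threads have joined.
template <typename Lock>
std::size_t fill(int threads, int per_thread) {
    Cache<Lock> cache;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&cache, per_thread] {
            for (int i = 0; i < per_thread; ++i) cache.add(i);
        });
    }
    for (auto& th : pool) th.join();
    return cache.size();
}
```

Here fill<std::mutex>(4, 1000) returns 4000. Calling
fill<NullLock>(4, 1000) would be a data race (undefined behaviour, it may
lose entries or crash), which is why the null variant is only safe when
exactly one thread uses the cache.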

Thanks for all answers and best regards,
Andreas

