[Ace-users] A bug about ACE_Logging_Strategy

speng2005 at gmail.com speng2005 at gmail.com
Sun Dec 23 21:24:37 CST 2007


We have found a threads-safety problem in
ACE_Logging_Strategy::handle_timeout() method.
In our application,there are mutiple threads writing log to a shared
ofstream object.Also there is one reatctor thread calling the
ACE_Logging_Strategy::handle_timeout() method.But sometime our program
crashed with this stack backtrace:

(gdb) bt
#0  0xb7c1bb4a in memcpy () from /lib/tls/i686/cmov/libc.so.6
#1  0xb7d9c006 in std::basic_streambuf<char, std::char_traits<char>
>::xsputn () from /usr/lib/libstdc++.so.6
#2  0xb7d66d59 in std::basic_filebuf<char, std::char_traits<char>
>::xsputn () from /usr/lib/libstdc++.so.6
#3  0xb7d911df in std::operator<< <std::char_traits<char> > () from /
usr/lib/libstdc++.so.6
#4  0xb7ed8f15 in ACE_Log_Record::print (this=0x2befe228,
host_name=0x0, verbose_flag=4, s=@0x80521c8) at Log_Record.cpp:302
#5  0xb7ed70a6 in ACE_Log_Msg::log (this=0x8588a58,
log_record=@0x2befe228, suppress_stderr=0) at Log_Msg.cpp:2197
#6  0xb7ed749b in ACE_Log_Msg::log (this=0x8588a58,
format_str=0xb7f9ef3d "", log_priority=LM_TRACE, argp=0xb7f9ef3c "\n")
    at Log_Msg.cpp:2047
#7  0xb7ed8055 in ACE_Log_Msg::log (this=0x8588a58,
log_priority=LM_TRACE, format_str=0xb7f9ef2a "%*s(%t) leaving %s\n")
    at Log_Msg.cpp:961
#8  0xb7f9d9b3 in ~JAWS_Trace (this=0x2befe338) at Trace.cpp:139
#9  0xb70ff7b3 in StorageWriteState::service (this=0x85aaa28,
ec=0x291b4d70, data=0x291b4d70) at StorageWriteState.cpp:63
#10 0xb7f94739 in JAWS_Protocol_Handler::service (this=0x291b4d70) at
Protocol_Handler.cpp:38
#11 0xb7f86e0e in JAWS_Concurrency_Impl::svc (this=0x8057a34) at
Concurrency.cpp:38
#12 0xb7f1c794 in ACE_Task_Base::svc_run (args=0x8057a34) at Task.cpp:
258
#13 0xb7f1d108 in ACE_Thread_Adapter::invoke_i (this=0x8057ca8) at
Thread_Adapter.cpp:151
#14 0xb7f1d2d6 in ACE_Thread_Adapter::invoke (this=0x8057ca8) at
Thread_Adapter.cpp:95
#15 0xb7eb0c51 in ace_thread_adapter (args=0x8057ca8) at
Base_Thread_Adapter.cpp:137
#16 0xb7e04240 in start_thread () from /lib/tls/i686/cmov/
libpthread.so.0
#17 0xb7c7a4ae in clone () from /lib/tls/i686/cmov/libc.so.6

We found the reason is the calling of ostream<>::tellp () will change
the state of the internal basic_filebuf<> object of this ostream<>
object. We catched the call stack of that moment:

(gdb) bt
#0  0xb7cec09e in std::basic_filebuf<char, std::char_traits<char>
>::_M_seek () from /usr/lib/libstdc++.so.6
#1  0xb7cede2a in std::basic_filebuf<char, std::char_traits<char>
>::seekoff () from /usr/lib/libstdc++.so.6
#2  0xb7d14e5d in std::ostream::tellp () from /usr/lib/libstdc++.so.6
#3  0xb7e5f0a4 in ACE_Logging_Strategy::handle_timeout
(this=0xb7f2dac0) at Logging_Strategy.cpp:404
#4  0xb7e2dcb1 in
ACE_Event_Handler_Handle_Timeout_Upcall<ACE_Recursive_Thread_Mutex>::timeout
(this=0x80515d0,
    timer_queue=@0x8051528, event_handler=0xb7f2dac0, act=0x0,
recurring_timer=1, cur_time=@0x2ce852e8)
    at /data/jinwei/svn_root/ireport/common/ace/ACE_wrappers/ace/
Timer_Queue_T.cpp:408
#5  0xb7e4b20d in ACE_Dev_Poll_Reactor::dispatch_timer_handler
(this=0x8051470, guard=@0x2ce85394)
    at /data/jinwei/svn_root/ireport/common/ace/ACE_wrappers/ace/
Timer_Queue_T.inl:170
#6  0xb7e4d92c in ACE_Dev_Poll_Reactor::dispatch (this=0x8051470,
guard=@0x2ce85394) at Dev_Poll_Reactor.cpp:1230
#7  0xb7e4da56 in ACE_Dev_Poll_Reactor::handle_events_i
(this=0x8051470, max_wait_time=0x0, guard=@0x2ce85394)
    at Dev_Poll_Reactor.cpp:1211
#8  0xb7e4db21 in ACE_Dev_Poll_Reactor::handle_events (this=0x8051470,
max_wait_time=0x0) at Dev_Poll_Reactor.cpp:1166
#9  0xb7e8bf02 in ACE_Reactor::run_reactor_event_loop (this=0x80518f0,
eh=0) at Reactor.cpp:233
#10 0xb7f16be7 in
JAWS_Event_Dispatcher::JAWS_Event_Dispatcher_Reactor_Event_Loop () at
Event_Dispatcher.cpp:32
#11 0xb7ea2108 in ACE_Thread_Adapter::invoke_i (this=0x804c720) at
Thread_Adapter.cpp:151
#12 0xb7ea22d6 in ACE_Thread_Adapter::invoke (this=0x804c720) at
Thread_Adapter.cpp:95
#13 0xb7e35c51 in ace_thread_adapter (args=0x804c720) at
Base_Thread_Adapter.cpp:137
#14 0xb7d89240 in start_thread () from /lib/tls/i686/cmov/
libpthread.so.0
#15 0xb7bff4ae in clone () from /lib/tls/i686/cmov/libc.so.6

Sometimes basic_filebuf<>::_M_seek ()  method will set NULL to an
internal buffer pointer of the basic_filebuf<> object, and this
pointer would be pass to memcpy () as an argument when we output some
thing through this basic_filebuf<> object. We have viewed the source
of the ACE_Logging_Strategy::handle_timeout():

int
ACE_Logging_Strategy::handle_timeout (const ACE_Time_Value &,
                                      const void *)
{
#if defined (ACE_LACKS_IOSTREAM_TOTALLY)
  if ((size_t) ACE_OS::ftell (this->log_msg_->msg_ostream ()) > this-
>max_size_)
#else
  if ((size_t) this->log_msg_->msg_ostream ()->tellp () > this-
>max_size_)
#endif /* ACE_LACKS_IOSTREAM_TOTALLY */
    {
      // Lock out any other logging.
      if (this->log_msg_->acquire ())
         .........

The calling of "this->log_msg_->msg_ostream ()->tellp ()" is not thead-
safed! So we modified the code and tested it. Everything is ok now.

We shared this experience to peploes. Good luck to everyone.

ps:
     I will detail this problem in my blog: http://blog.tom.com/blogger2007
     Some one also have a problem about ostream<>::tellp() at url:
http://www.thescripts.com/forum/thread139342.html


More information about the Ace-users mailing list