[Ace-users] [ace-users] ACE not detecting connection closing with keepalive enabled

Yan Burman yan_b at tech-mer.com
Mon Dec 3 01:03:42 CST 2007


ACE VERSION: 5.6.1, 5.5.3, 5.5.4

HOST MACHINE and OPERATING SYSTEM:
Fedora core 5 linux

TARGET MACHINE and OPERATING SYSTEM, if different from HOST:
Pentium M 1.8 GHz, gcc 4.1.1

Official RPMs from http://dist.bonsai.com/ken/ace_tao_rpm/

DOES THE PROBLEM AFFECT:
EXECUTION

AREA/CLASS/EXAMPLE AFFECTED:
ACE_Svc_Handler

SYNOPSIS:
If the peer is disconnected and data is sent to it before the socket "senses" the error, handle_close() is not called for a very long time.

DESCRIPTION:
I am running the same code on Windows and Linux using ACE.
I set keepalive values on both platforms so that I can detect a broken connection quickly. The class I set them in is derived from ACE_Svc_Handler<ACE_SOCK_STREAM, ACE_MT_SYNCH>. Here's the code for setting the keepalive:
#define PROBES_FREQ_SECONDS 1
#define KEEP_ALIVE_PROBES_SECONDS 4

	option = 1;
	peer().set_option(SOL_TCP, TCP_KEEPALIVE, &option, sizeof(option));
#ifdef __linux__
	// set keepalive idle time (wait IDLE seconds)
	option = KEEP_ALIVE_PROBES_SECONDS;
	peer().set_option(SOL_TCP, TCP_KEEPIDLE, &option, sizeof(option));

	// set number of unanswered probes before the connection is dropped (send CNT probes)
	option = 3;
	peer().set_option(SOL_TCP, TCP_KEEPCNT, &option, sizeof(option));

	// set interval between probes (send probes with INTVL seconds apart)
	option = PROBES_FREQ_SECONDS;
	peer().set_option(SOL_TCP, TCP_KEEPINTVL, &option, sizeof(option));
#endif // __linux__

#ifdef _WIN32
	// defined in MSTcpIP.h
	struct tcp_keepalive settings, returned;
	DWORD dwBytes;

	memset(&settings, 0, sizeof(settings));
	memset(&returned, 0, sizeof(returned));

	settings.onoff = 1; // On
	// idle time before the first keepalive probe is sent (milliseconds)
	settings.keepalivetime = KEEP_ALIVE_PROBES_SECONDS * 1000;
	// interval between probes when no reply is received (milliseconds)
	settings.keepaliveinterval = PROBES_FREQ_SECONDS * 1000;

	int res = ACE_OS::ioctl(peer().get_handle(), SIO_KEEPALIVE_VALS,
		reinterpret_cast<void*>(&settings), sizeof(settings),
		reinterpret_cast<void*>(&returned), sizeof(returned), &dwBytes,
		NULL, NULL);
#endif

This works fine on both Windows and Linux: when I disconnect the peer's network cable, handle_close() is called within 4-8 seconds.
However, if I send something through that socket after I have disconnected the peer's network cable but before handle_close() is called, handle_close() is sometimes called only after 20 minutes or so. This behavior is observed only on Linux. On Windows my code works exactly as expected and the connection failure is always detected. The only platform-dependent code I have is shown above; everything else is the same on Windows and Linux. The version it worked with on Windows is 5.5.4. I tried 5.5.4, 5.5.3 and 5.6.1 on Linux and got the same result.
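
For anyone trying to reproduce this outside ACE, here is a minimal sketch of the Linux settings above using plain setsockopt() calls, with each return value checked and the values read back. It is not the code from my handler: the helper names (set_int_option, get_int_option, enable_fast_keepalive) are made up for illustration, and it enables keepalive with SO_KEEPALIVE at the SOL_SOCKET level, which is an assumption that differs from the TCP_KEEPALIVE call shown above.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

// Hypothetical helper: set one integer socket option and report any failure.
static bool set_int_option(int fd, int level, int name, int value)
{
    if (setsockopt(fd, level, name, &value, sizeof(value)) != 0) {
        std::perror("setsockopt");
        return false;
    }
    return true;
}

// Hypothetical helper: read one integer socket option back, or -1 on error.
static int get_int_option(int fd, int level, int name)
{
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, level, name, &value, &len) != 0) {
        return -1;
    }
    return value;
}

// Hypothetical helper: enable keepalive and set idle time, probe interval
// and probe count (Linux-specific TCP options). Stops at the first failure.
static bool enable_fast_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    return set_int_option(fd, SOL_SOCKET, SO_KEEPALIVE, 1)
        && set_int_option(fd, IPPROTO_TCP, TCP_KEEPIDLE, idle_s)
        && set_int_option(fd, IPPROTO_TCP, TCP_KEEPINTVL, intvl_s)
        && set_int_option(fd, IPPROTO_TCP, TCP_KEEPCNT, count);
}
```

With the values from my code this would be called as enable_fast_keepalive(fd, 4, 1, 3), and get_int_option(fd, IPPROTO_TCP, TCP_KEEPIDLE) can then be used to confirm that the kernel actually accepted the setting.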


I hope you can help, either by pointing out something I'm missing or by confirming whether this is a bug in ACE.

Contact me if you need more information on this.
Regards,
Yan Burman




