sockets - Java Solaris NIO OP_CONNECT problem


I have a Java client that connects to a C++ server over TCP sockets using Java NIO. This works under Linux, AIX and HP/UX, but under Solaris the OP_CONNECT event never fires.

Further details:

  • Selector.select() is returning 0, and the 'selected key set' is empty.
  • The issue only occurs when connecting to the local machine (via loopback or ethernet interface), but it works when connecting to a remote machine.
  • I have confirmed the issue under two different Solaris 10 machines: a physical SPARC and a virtual x64 (VMware), using both JDK versions 1.6.0_21 and _26.

Here is some test code that demonstrates the issue:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class NioTest3
{
    public static void main(String[] args)
    {
        int i, tcount = 1, open = 0;
        String[] addr = args[0].split(":");
        int port = Integer.parseInt(addr[1]);
        if (args.length == 2)
            tcount = Integer.parseInt(args[1]);
        InetSocketAddress inetaddr = new InetSocketAddress(addr[0], port);
        try
        {
            Selector selector = Selector.open();
            SocketChannel channel;
            // Open tcount non-blocking connections to the same address.
            for (i = 0; i < tcount; i++)
            {
                channel = SocketChannel.open();
                channel.configureBlocking(false);
                channel.register(selector, SelectionKey.OP_CONNECT);
                channel.connect(inetaddr);
            }
            open = tcount;
            // Each channel goes connect -> write -> read -> close.
            while (open > 0)
            {
                int selected = selector.select();
                System.out.println("Selected=" + selected);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext())
                {
                    SelectionKey key = it.next();
                    it.remove();
                    channel = (SocketChannel)key.channel();
                    if (key.isConnectable())
                    {
                        System.out.println("isConnectable");
                        if (channel.finishConnect())
                        {
                            System.out.println(formatAddr(channel) + " connected");
                            key.interestOps(SelectionKey.OP_WRITE);
                        }
                    }
                    else if (key.isWritable())
                    {
                        System.out.println(formatAddr(channel) + " isWritable");
                        String message = formatAddr(channel) + " the quick brown fox jumps over the lazy dog";
                        ByteBuffer buffer = ByteBuffer.wrap(message.getBytes());
                        channel.write(buffer);
                        key.interestOps(SelectionKey.OP_READ);
                    }
                    else if (key.isReadable())
                    {
                        System.out.println(formatAddr(channel) + " isReadable");
                        ByteBuffer buffer = ByteBuffer.allocate(1024);
                        channel.read(buffer);
                        buffer.flip();
                        byte[] bytes = new byte[buffer.remaining()];
                        buffer.get(bytes);
                        String message = new String(bytes);
                        System.out.println(formatAddr(channel) + " read: '" + message + "'");
                        channel.close();
                        open--;
                    }
                }
            }
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    // Identifies a connection in the output by its local port number.
    static String formatAddr(SocketChannel channel)
    {
        return Integer.toString(channel.socket().getLocalPort());
    }
}

You can run it from the command line:

java -cp . NioTest3 <ipaddr>:<port> <num-connections>

where the port should be 7 if you are running against a real echo service, i.e.:

java -cp . NioTest3 127.0.0.1:7 5

If you cannot get a real echo service running, the source to one is here. Compile the echo server under Solaris with:

$ cc -o echoserver echoserver.c -lsocket -lnsl 

and run it like this:

$ ./echoserver 8007 > out 2>&1 & 

This has been reported to Sun as a bug.

Your bug report has been closed as 'not a bug', with an explanation. You are ignoring the result of connect(), which if true means that OP_CONNECT will never fire, because the channel is already connected. You only need the whole OP_CONNECT/finishConnect() megillah if it returns false. So you shouldn't even register OP_CONNECT unless connect() returns false, let alone register it before you've even called connect().

Further remarks:

Under the hood, OP_CONNECT and OP_WRITE are the same thing, which explains part of it.

As you have a single thread for this, a workaround would be to do the connect in blocking mode, then switch to non-blocking for the I/O.

Are you doing the select() after registering the channel with the Selector?

The correct way of handling a non-blocking connect is as follows:
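A minimal sketch of that workaround, assuming the caller already has a Selector and knows which interest ops it wants once the connection is up. The class and method names here (BlockingConnectSketch, connectThenRegister) are illustrative and not from the original code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

public class BlockingConnectSketch {
    /**
     * Connects in blocking mode, then switches the channel to non-blocking
     * mode and registers it. OP_CONNECT is never needed on this path.
     */
    static SelectionKey connectThenRegister(Selector selector, InetSocketAddress addr, int ops)
            throws IOException {
        SocketChannel channel = SocketChannel.open(); // new channels start in blocking mode
        try {
            channel.connect(addr);             // blocks until connected, or throws
            channel.configureBlocking(false);  // register() requires non-blocking mode
            return channel.register(selector, ops);
        } catch (IOException e) {
            channel.close();                   // do not leak the socket on failure
            throw e;
        }
    }
}
```

The trade-off is that the calling thread stalls for the duration of each connect, which is usually negligible for local connections but may be noticeable for slow or unreachable remote hosts.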

channel.configureBlocking(false);
if (!channel.connect(...))
{
    channel.register(sel, SelectionKey.OP_CONNECT, ...); // ... is the attachment, or absent
}
// else the channel is connected, maybe register for OP_READ ...
// select() loop runs ...
// process the ready keys ...
if (key.isConnectable())
{
    if (channel.finishConnect())
    {
        key.interestOps(0); // or SelectionKey.OP_READ or OP_WRITE, whatever is appropriate
    }
}
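The fragment above can be fleshed out into something runnable. The following is only a sketch under some assumptions: it uses a private Selector just for the connect phase, and the timeout parameter, class name and method name are illustrative additions rather than part of the original answer.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

public class ConnectSketch {
    /** Returns a connected non-blocking channel, or throws IOException. */
    static SocketChannel connect(InetSocketAddress addr, long timeoutMillis) throws IOException {
        SocketChannel channel = SocketChannel.open();
        try (Selector selector = Selector.open()) {
            channel.configureBlocking(false);
            if (channel.connect(addr)) {
                // Connected immediately (possible for local addresses):
                // OP_CONNECT would never fire, so do not register for it.
                return channel;
            }
            // The connection is pending: only now is OP_CONNECT worth registering.
            SelectionKey key = channel.register(selector, SelectionKey.OP_CONNECT);
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (true) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    throw new IOException("connect timed out");
                }
                selector.select(remaining);
                selector.selectedKeys().clear();
                // finishConnect() returning false means "not finished yet";
                // a failed connection is reported by an exception instead.
                if (key.isConnectable() && channel.finishConnect()) {
                    key.interestOps(0);
                    return channel;
                }
            }
        } catch (IOException e) {
            channel.close();
            throw e;
        }
    }
}
```

In a real single-selector design the pending key would live in the main select loop rather than a private Selector. The point of the sketch is the order of operations: connect() first, register OP_CONNECT only if it returned false.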

A few non-exhaustive comments after reviewing your extended code:

  1. Closing a channel cancels the key. You don't need to do both.

  2. The non-static removeInterest() method is incorrectly implemented.

  3. TYPE_DEREGISTER_OBJECT also closes the channel. Not sure if that is what you intended. I would have thought it should just cancel the key, and there should be a separate operation for closing the channel.

  4. You have gone way overboard on the small methods and the exception handling. addInterest() and removeInterest() are good examples. They catch exceptions, log them, then proceed as though the exception hadn't happened, when all they actually do is set or clear a bit: one line of code. And on top of that many of them have both static and non-static versions. The same goes for all the little methods that call key.cancel(), channel.close(), etc. There is no point to all this, it is just clocking up lines of code. It only adds obscurity and makes your code harder to understand. Just do the operation required inline and have a single catcher at the bottom of the select loop.

  5. If finishConnect() returns false it isn't a connection failure, it just hasn't completed yet. If it throws an exception, that is a connection failure.

  6. You are registering for OP_CONNECT and OP_READ at the same time. This doesn't make sense and it may cause problems. There is nothing to read until OP_CONNECT has fired. Just register for OP_CONNECT at first.

  7. You are allocating a ByteBuffer per read. This is very wasteful. Use the same one for the life of the connection.

  8. You are ignoring the result of read(). It can be zero. It can be -1, indicating EOS, on which you must close the channel. You are also assuming you will get an entire application message in a single read. You can't assume that. That's another reason why you should use a single ByteBuffer for the life of the connection.

  9. You are ignoring the result of write(). It could be less than buffer.remaining() was when you called it. It can be zero.

  10. You could simplify this a lot by making the NetSelectable the key attachment. Then you could do away with several things, including for example the channel map, and the assert, because the channel of the key must always be equal to the channel of the attachment of the key.

  11. I would also move the finishConnect() code to NetSelector, and just have connectEvent() be a success/failure notification. You don't want to spread this kind of stuff around. Do the same with readEvent(), i.e. do the read itself in NetSelector, with a buffer supplied by the NetSelectable, and just notify the NetSelectable of the read result: count or -1 or exception. Ditto on write: if the channel is writable, get something to write from the NetSelectable, write it in NetSelector, and notify the result. You could have the notification callbacks return something to indicate what to do next, e.g. close the channel.
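Points 7 to 10 might look roughly like the sketch below. Connection is a hypothetical stand-in for whatever per-connection object is attached to the key, and the class name, method names and 1024-byte buffer size are assumptions for illustration only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

public class IoSketch {
    /** Hypothetical per-connection state, attached to the key at registration. */
    static final class Connection {
        final ByteBuffer in = ByteBuffer.allocate(1024); // one buffer for the connection's life
        ByteBuffer out;                                  // pending outbound data, or null
    }

    /** Returns false once the peer has closed the connection. */
    static boolean handleRead(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        Connection conn = (Connection) key.attachment();
        int n = channel.read(conn.in);
        if (n == -1) {
            channel.close(); // end of stream; closing also cancels the key
            return false;
        }
        // n may be 0 or only part of a message: bytes accumulate in conn.in
        // until the caller sees a complete message and consumes it.
        // (A full buffer also yields 0, so the caller must drain it.)
        return true;
    }

    /** Writes as much pending data as the socket accepts right now. */
    static void handleWrite(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        Connection conn = (Connection) key.attachment();
        if (conn.out == null) {
            return;
        }
        channel.write(conn.out); // may write fewer bytes than remaining, even zero
        if (conn.out.hasRemaining()) {
            // Short write: ask to be told when the socket is writable again.
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
        } else {
            conn.out = null;
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
        }
    }
}
```

Because the state hangs off the key, no separate channel-to-object map is needed, and a single try/catch around the select loop body can handle the IOException from either method.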

But I think this is all about five times as complex as it needs to be, and the fact that you have this bug proves it. I would simplify the head off it.

