JDK在mac和linux上对connection reset的行为不一致?

遇到一个JDK(版本都是1.8.0_51-b16)在mac和linux上行为不一致的问题,这个问题是针对redis服务器端关闭连接时的状况处理;即在redis-server端设置连接的空闲时间,当超过这个空闲时间后server主动把该连接关闭掉。在我的mac上这个行为是符合预期的,对这个超时的连接再发起请求时会抛出 SocketException "Connection Reset",但在linux上却不会。

简单模拟一下,在Redis服务端配置timeout为3秒,客户端建立连接之后 sleep 4秒(或更久),让连接超时

import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ConnRestTest {

    public static void main(String[] args) throws Exception {
        Socket socket = new Socket();
        socket.connect(new InetSocketAddress("localhost", 6379));

        InputStream is = socket.getInputStream();
        OutputStream os = socket.getOutputStream();

        Thread.sleep(4000);

        os.write("*1\r\n$4\r\nPING\r\n".getBytes());
        os.flush();

        for (int i = 0; i < 7; i++) {
            int r = is.read();
            System.out.print( r == -1 ? r : (char) r);
        }

        socket.close();
    }
}

上面的代码在mac上运行后会看到如下异常(如果注释掉 Thread.sleep(4000) 会得到 "+PONG\r\n")

Exception in thread "main" java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:209)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.net.SocketInputStream.read(SocketInputStream.java:223)
    at com.wacai.common.redis.io.Test.main(Test.java:23)    

而在linux上运行后则是打印7个连续的"-1",即并不会抛出"Connection reset"

分别进行tcpdump,在mac上:

 ➜  sudo tcpdump -i lo0 port 6379

23:13:43.398003 IP localhost.63864 > localhost.6379: Flags [S], seq 1658673677, win 65535, options [mss 16344,nop,wscale 5,nop,nop,TS val 1063188198 ecr 0,sackOK,eol], length 0
23:13:43.398064 IP localhost.6379 > localhost.63864: Flags [S.], seq 355134851, ack 1658673678, win 65535, options [mss 16344,nop,wscale 5,nop,nop,TS val 1063188198 ecr 1063188198,sackOK,eol], length 0
23:13:43.398075 IP localhost.63864 > localhost.6379: Flags [.], ack 1, win 12759, options [nop,nop,TS val 1063188198 ecr 1063188198], length 0
23:13:43.398085 IP localhost.6379 > localhost.63864: Flags [.], ack 1, win 12759, options [nop,nop,TS val 1063188198 ecr 1063188198], length 0

23:13:47.063640 IP localhost.6379 > localhost.63864: Flags [F.], seq 1, ack 1, win 12759, options [nop,nop,TS val 1063191852 ecr 1063188198], length 0
23:13:47.063671 IP localhost.63864 > localhost.6379: Flags [.], ack 2, win 12759, options [nop,nop,TS val 1063191852 ecr 1063191852], length 0

23:13:48.403144 IP localhost.63864 > localhost.6379: Flags [P.], seq 1:15, ack 2, win 12759, options [nop,nop,TS val 1063193184 ecr 1063191852], length 14
23:13:48.403255 IP localhost.6379 > localhost.63864: Flags [R], seq 355134853, win 0, length 0  

在linux上:

$ sudo tcpflow -p -c -i lo port 6379

00:26:13.303233 IP localhost.34609 > localhost.6379: Flags [S], seq 1094106697, win 43690, options [mss 65495,sackOK,TS val 8462190 ecr 0,nop,wscale 7], length 0
00:26:13.303272 IP localhost.6379 > localhost.34609: Flags [S.], seq 2755621045, ack 1094106698, win 43690, options [mss 65495,sackOK,TS val 8462190 ecr 8462190,nop,wscale 7], length 0
00:26:13.303298 IP localhost.34609 > localhost.6379: Flags [.], ack 1, win 342, options [nop,nop,TS val 8462190 ecr 8462190], length 0

00:26:17.037992 IP localhost.6379 > localhost.34609: Flags [F.], seq 1, ack 1, win 342, options [nop,nop,TS val 8465925 ecr 8462190], length 0
00:26:17.038450 IP localhost.34609 > localhost.6379: Flags [.], ack 2, win 342, options [nop,nop,TS val 8465926 ecr 8465925], length 0

00:26:18.305591 IP localhost.34609 > localhost.6379: Flags [P.], seq 1:15, ack 2, win 342, options [nop,nop,TS val 8467193 ecr 8465925], length 14
00:26:18.305630 IP localhost.6379 > localhost.34609: Flags [R], seq 2755621047, win 0, length 0     

在mac上比linux上多了一次在连接建立后从server发给client端的ack,在3秒之后,连接因为超过空闲时间server端向client端发起了fin,client回复ack,之后client端程序对socket进行写操作,在tcpdump里看到标记为P的标记,然后server端发送了reset

在linux上看到连接状态的变化也符合预期:

$ netstat -antp | grep 6379
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:6379          127.0.0.1:34607         ESTABLISHED -
tcp6       0      0 127.0.0.1:34607         127.0.0.1:6379          ESTABLISHED 3683/java

tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:6379          127.0.0.1:34607         FIN_WAIT2   -
tcp6       1      0 127.0.0.1:34607         127.0.0.1:6379          CLOSE_WAIT  3683/java

按说server端发送reset标记后,mac上抛出异常的行为更符合预期,可能是JDK网络层面的实现细节不同,或者tcpdump出的数据已经能解释原因只是我看不出来。

3 thoughts on “JDK在mac和linux上对connection reset的行为不一致?

  1. import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class Test {

    public static void main(String[] args) {
    String host = "localhost";
    int port = 6379;
    int timeout = 2000;

    String serInfo = "server=" + host + ":" + port;
    String threadName = Thread.currentThread().getName();
    try {
    Socket socket = new Socket();
    socket.setReuseAddress(true);
    socket.setKeepAlive(true);
    socket.setTcpNoDelay(true);
    socket.setSoLinger(true, 0);
    socket.connect(new InetSocketAddress(host, port), timeout);
    socket.setSoTimeout(timeout);

    InputStream is = socket.getInputStream();
    OutputStream os = socket.getOutputStream();

    os.write("*1\r\n$4\r\nPING\r\n".getBytes());
    os.flush();

    for (int i = 0; i < 7; i++) { int r = is.read(); System.out.print((char) r); } socket.close(); } catch (IOException ex) { System.err.println("redis-client socket.connect error, " + serInfo + ", " + threadName + ": " + ex.getMessage()); } } }

  2. 我同意您的看法,应该是网络层面的实现不一样。。。

    返回-1的话,也表示stream已经到头了
    If no byte is available because the end of the stream has been reached, the value -1 is returned

Leave a Reply

Your email address will not be published. Required fields are marked *