Contents of this section
1. TCP programming
2. Using Redis
3. Homework
1. TCP programming
(1) Introduction
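Go was designed and developed at Google. High concurrency was one of its main design goals from the start, and the language targets large-scale backend services. Network communication is an essential part of any server, and Go's built-in packages such as net and net/http are ultimately wrappers around TCP sockets.
About TCP:
TCP socket programming is the most common form of network programming. Since the POSIX standard appeared, sockets have been well supported on all mainstream operating systems. The best reference on TCP programming is W. Richard Stevens's classic "UNIX Network Programming, Volume 1: The Sockets Networking API", which covers the usage, behavior, and error handling of the TCP socket interface in great detail.
Go is a cross-platform language that ships with its own runtime. The TCP socket API that Go exposes to its users is built on top of the operating system's native TCP socket interface. Because of the needs of the Go runtime scheduler, Go's TCP socket interface differs from the native one in some aspects of behavior and error handling.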
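(2) Model
Since the TCP socket was introduced, network programming architectures have evolved several times, roughly: "one process per connection" -> "one thread per connection" -> "non-blocking + I/O multiplexing (Linux epoll, Windows IOCP, FreeBSD/Darwin kqueue, Solaris Event Ports)". With each step, servers became more capable, supporting more connections and achieving better performance.
Most mainstream web servers today use "non-blocking + I/O multiplexing", sometimes combined with multiple threads or processes. I/O multiplexing, however, brings considerable complexity, which is why high-performance multiplexing frameworks such as libevent, libev, and libuv appeared to simplify development and reduce the mental burden. The designers of Go apparently considered the callback style of I/O multiplexing, which fragments the control flow, still too complex and at odds with "ordinary logic". Go therefore hides this complexity inside the runtime: a Go developer does not need to care whether a socket is non-blocking and does not need to register file descriptor callbacks. Each connection gets its own goroutine, which treats the socket as if it performed blocking I/O. This greatly reduces the mental burden on developers. A typical Go server looks roughly like this: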
    // go-tcpsock/server.go
    package main

    import (
        "fmt"
        "net"
    )

    // HandleConn serves a single connection in its own goroutine.
    func HandleConn(conn net.Conn) {
        defer conn.Close()

        for {
            // read from the connection
            // ... ...
            // write to the connection
            // ... ...
        }
    }

    func main() {
        listen, err := net.Listen("tcp", ":8888")
        if err != nil {
            fmt.Println("listen error: ", err)
            return
        }

        for {
            conn, err := listen.Accept()
            if err != nil {
                fmt.Println("accept error: ", err)
                break
            }

            // start a new goroutine to handle the new connection
            go HandleConn(conn)
        }
    }
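(Key point) The "blocking socket" that user code sees inside a goroutine is actually simulated by the netpoller in the Go runtime, using non-blocking sockets plus I/O multiplexing. The underlying socket really is non-blocking; the runtime intercepts the error codes returned by the socket system calls and uses the netpoller together with goroutine scheduling to make the goroutine "block" on the socket fd it was given. For example, when user code issues a read on a socket fd that has no data yet, the runtime adds the fd to the netpoller's watch set and suspends the goroutine. Only when the runtime is notified that data is ready on that fd does it wake the goroutine that was waiting to read. From the goroutine's point of view, the read simply blocked on that socket fd the whole time.
For more on the netpoller, see this blog post: http://www.opscoder.info/golang_netpoller.html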
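(3) Establishing a TCP connection
As is well known, establishing a TCP connection requires a three-way handshake between client and server. During connection setup the server follows the standard Listen + Accept structure (see the code above), while a Go client uses net.Dial() or net.DialTimeout().
Server workflow: a. listen on a port; b. accept client connections; c. create a goroutine to handle each connection.
Client workflow: a. connect to the server; b. send and receive data; c. close the connection.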
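The two workflows above can be sketched end to end in one program. This is a minimal illustration rather than a production server: the helper names `serveEcho` and `echoOnce` are made up for this example, and the listener binds to `127.0.0.1:0` so the operating system picks any free port.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// serveEcho accepts connections until the listener is closed and echoes
// back whatever each client sends, using one goroutine per connection.
func serveEcho(l net.Listener) {
	for {
		conn, err := l.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			buf := make([]byte, 1024)
			for {
				n, err := c.Read(buf) // looks like blocking I/O to this goroutine
				if n > 0 {
					if _, werr := c.Write(buf[:n]); werr != nil {
						return
					}
				}
				if err != nil {
					return
				}
			}
		}(conn)
	}
}

// echoOnce (hypothetical helper) starts a throwaway server on a free
// loopback port, sends msg, and returns the echoed reply.
func echoOnce(msg string) (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()
	go serveEcho(l)

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, reply); err != nil {
		return "", err
	}
	return string(reply), nil
}

func main() {
	reply, err := echoOnce("ping")
	fmt.Println(reply, err)
}
```

The server side follows listen, accept, goroutine; the client side follows dial, exchange data, close.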
Blocking Dial:
    conn, err := net.Dial("tcp", "www.baidu.com:80")
    if err != nil {
        // handle error
    }
    // read or write on conn
Dial with a timeout:
    conn, err := net.DialTimeout("tcp", "www.baidu.com:80", 2*time.Second)
    if err != nil {
        // handle error
    }
    // read or write on conn
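On the client side, connection setup can run into the following situations:
- Network unreachable or peer service not started
If the address passed to Dial can be determined immediately to be unreachable, or no service is listening on the given port, Dial returns an error almost at once. For example:
    package main

    import (
        "log"
        "net"
    )

    func main() {
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")
    }
If nothing is listening on local port 8888, running the program above makes Dial return an error very quickly (typically a "connection refused" error).
Note: tested on CentOS 6.5; the same applies below.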
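The fast failure can be reproduced without hard-coding a port. The sketch below is only an illustration: `dialUnusedPort` is a name invented here, and it assumes that no other process grabs the port in the brief moment between closing the listener and dialing.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialUnusedPort (hypothetical helper) reserves a free loopback port,
// releases it, and then dials it, so nothing is listening when the
// connection attempt arrives. It returns how long Dial took and the
// error it produced (or the Listen error if setup itself failed).
func dialUnusedPort() (time.Duration, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	addr := l.Addr().String()
	l.Close() // from here on nobody listens on addr

	start := time.Now()
	conn, err := net.Dial("tcp", addr)
	if err == nil {
		conn.Close()
	}
	return time.Since(start), err
}

func main() {
	elapsed, err := dialUnusedPort()
	fmt.Printf("dial returned after %v: %v\n", elapsed, err)
}
```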
- Peer's listen backlog is full
Another scenario is a very busy peer: a burst of clients tries to connect at once, the server's listen backlog fills up, and the server does not accept fast enough. (Even if the server never calls accept, connects succeed as long as the backlog has room, because the new connection has already been placed in the server-side listen queue; accept merely takes a connection out of that queue.) Once the backlog is full, the client's Dial blocks. The following example shows this behavior.
Server code:
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
            log.Println("error listen:", err)
            return
        }
        defer l.Close()
        log.Println("listen ok")

        var i int
        for {
            // accept only once every 10 seconds so the backlog fills up
            time.Sleep(time.Second * 10)
            if _, err := l.Accept(); err != nil {
                log.Println("accept error:", err)
                break
            }
            i++
            log.Printf("%d: accept a new connection\n", i)
        }
    }
Client code:
    package main

    import (
        "log"
        "net"
        "time"
    )

    func establishConn(i int) net.Conn {
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Printf("%d: dial error: %s", i, err)
            return nil
        }
        log.Println(i, ":connect to server ok")
        return conn
    }

    func main() {
        var sl []net.Conn

        for i := 1; i < 1000; i++ {
            conn := establishConn(i)
            if conn != nil {
                sl = append(sl, conn)
            }
        }

        time.Sleep(time.Second * 10000)
    }
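In testing, the client established 131 connections in one go at startup; after that, each additional connection succeeded only after blocking for nearly 1s. In other words, when the server's backlog is full (because it is not accepting in time), the client blocks in Dial until the server performs another accept.
Does the client block forever if the server never accepts? With the accept call removed, the result was: on Darwin, the client blocked for a little over a minute before returning a timeout. With the server running on Ubuntu 14.04, the client seemed to block indefinitely; after more than 10 minutes it still had not returned. Whether Dial blocks evidently depends on the server's network implementation and settings.
Note: on CentOS 6.5, with the server's accept commented out, the client established 131 connections at once and then continued to establish one more connection roughly every 1s.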
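- High network latency: Dial blocks and times out
With high network latency, the TCP handshake becomes a bumpy ride (packets get lost) and naturally takes longer. Dial blocks during this time, and if the connection still cannot be established after a long wait, Dial returns a "getsockopt: operation timed out" error.
During connection setup, Dial is good enough in most cases, even if it blocks for a short while. Some programs, however, need a strict limit on connection time: if the connection is not established within a certain period, the program may need to run some "exception" handling logic. This is what DialTimeout is for. The following example limits the maximum blocking time of Dial to 2s; beyond that, Dial returns a timeout error:
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        log.Println("begin dial...")
        conn, err := net.DialTimeout("tcp", "192.168.30.134:8888", 2*time.Second)
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")
    }
The result is shown below; reproducing it requires an environment with high network latency. Note that the log shows a different target address than the listing above:
    $ go run client_timeout.go
    2015/11/17 09:28:34 begin dial...
    2015/11/17 09:28:36 dial error: dial tcp 104.236.176.96:80: i/o timeout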
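When handling a failed dial it often helps to tell a timeout apart from other errors. The sketch below is one way to do that with `errors.As` and the `net.Error` interface; `dialOutcome` and `demoDialOutcomes` are names invented for this illustration. Since a slow network is hard to reproduce, the demo forces a timeout by giving Dial a budget of one nanosecond, which is an artificial stand-in for real latency.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// dialOutcome (hypothetical helper) dials addr with the given timeout
// and reports "ok", "timeout" or "other". It closes any connection it
// manages to open.
func dialOutcome(addr string, timeout time.Duration) string {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err == nil {
		conn.Close()
		return "ok"
	}
	var ne net.Error
	if errors.As(err, &ne) && ne.Timeout() {
		return "timeout"
	}
	return "other"
}

// demoDialOutcomes dials a local listener twice: once with a generous
// budget and once with a budget that has expired before the dial starts.
func demoDialOutcomes() (generous, tiny string, err error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", "", err
	}
	defer l.Close()
	addr := l.Addr().String()
	return dialOutcome(addr, 2*time.Second), dialOutcome(addr, time.Nanosecond), nil
}

func main() {
	fmt.Println(demoDialOutcomes())
}
```

In a real client the "timeout" branch is where the fallback or retry logic mentioned above would go.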
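(4) Reading from and writing to a socket
Once the connection is established, we read and write on conn to carry out the business logic. As noted earlier, the Go runtime hides the complexity of I/O multiplexing, so the goroutine + blocking I/O pattern covers most use cases. When Dial succeeds, it returns a value of the Conn interface type.
The client establishes a connection with Dial:
    func Dial(network, address string) (Conn, error)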
    type Conn interface {
        // Read reads data from the connection.
        // Read can be made to time out and return an Error with Timeout() == true
        // after a fixed time limit; see SetDeadline and SetReadDeadline.
        Read(b []byte) (n int, err error)

        // Write writes data to the connection.
        // Write can be made to time out and return an Error with Timeout() == true
        // after a fixed time limit; see SetDeadline and SetWriteDeadline.
        Write(b []byte) (n int, err error)

        // Close closes the connection.
        // Any blocked Read or Write operations will be unblocked and return errors.
        Close() error

        // LocalAddr returns the local network address.
        LocalAddr() Addr

        // RemoteAddr returns the remote network address.
        RemoteAddr() Addr

        // SetDeadline sets the read and write deadlines associated
        // with the connection. It is equivalent to calling both
        // SetReadDeadline and SetWriteDeadline.
        //
        // A deadline is an absolute time after which I/O operations
        // fail with a timeout (see type Error) instead of
        // blocking. The deadline applies to all future and pending
        // I/O, not just the immediately following call to Read or
        // Write. After a deadline has been exceeded, the connection
        // can be refreshed by setting a deadline in the future.
        //
        // An idle timeout can be implemented by repeatedly extending
        // the deadline after successful Read or Write calls.
        //
        // A zero value for t means I/O operations will not time out.
        SetDeadline(t time.Time) error

        // SetReadDeadline sets the deadline for future Read calls
        // and any currently-blocked Read call.
        // A zero value for t means Read will not time out.
        SetReadDeadline(t time.Time) error

        // SetWriteDeadline sets the deadline for future Write calls
        // and any currently-blocked Write call.
        // Even if write times out, it may return n > 0, indicating that
        // some of the data was successfully written.
        // A zero value for t means Write will not time out.
        SetWriteDeadline(t time.Time) error
    }
The server listens for client connections with Listen:
    func Listen(network, address string) (Listener, error)

    type Listener interface {
        // Accept waits for and returns the next connection to the listener.
        Accept() (Conn, error)

        // Close closes the listener.
        // Any blocked Accept operations will be unblocked and return errors.
        Close() error

        // Addr returns the listener's network address.
        Addr() Addr
    }
The Conn interface provides Read, Write, Close, and several other methods.
1) Characteristics of conn.Read
- No data in the socket
After the connection is established, if the peer has not sent anything, the receiver (the server) blocks in Read. This matches the "model" described earlier: the goroutine performing the Read is suspended, and the runtime watches the socket and reschedules that goroutine to complete the read only once data arrives. The code for this example is client1.go and server1.go under go-tcpsock/read_write.
Client (client1.go):
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")
        // send nothing; just keep the connection open
        time.Sleep(time.Second * 10000)
    }
Server (server1.go):
    package main

    import (
        "log"
        "net"
    )

    func handleConn(c net.Conn) {
        defer c.Close()
        for {
            // read from the connection
            var buf = make([]byte, 10)
            log.Println("start to read from conn")
            n, err := c.Read(buf)
            if err != nil {
                log.Println("conn read error:", err)
                return
            }
            log.Printf("read %d bytes, content is %s\n", n, string(buf[:n]))
        }
    }

    func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
            log.Println("listen error:", err)
            return
        }

        for {
            c, err := l.Accept()
            if err != nil {
                log.Println("accept error:", err)
                break
            }
            // start a new goroutine to handle
            // the new connection.
            log.Println("accept a new connection")
            go handleConn(c)
        }
    }
- Partial data in the socket
If the socket holds some data, but less than the amount a single Read asks for, Read successfully returns that data right away instead of waiting until everything requested has arrived.
Client:
    // client2.go
    package main

    import (
        "fmt"
        "log"
        "net"
        "os"
        "time"
    )

    func main() {
        if len(os.Args) <= 1 {
            fmt.Println("usage: go run client2.go YOUR_CONTENT")
            return
        }
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")

        time.Sleep(time.Second * 2)
        data := os.Args[1]
        conn.Write([]byte(data))

        time.Sleep(time.Second * 10000)
    }
Server:
    // server2.go
    package main

    import (
        "log"
        "net"
    )

    func handleConn(c net.Conn) {
        defer c.Close()
        for {
            // read from the connection
            var buf = make([]byte, 10)
            log.Println("start to read from conn")
            n, err := c.Read(buf)
            if err != nil {
                log.Println("conn read error:", err)
                return
            }
            log.Printf("read %d bytes, content is %s\n", n, string(buf[:n]))
        }
    }

    func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
            log.Println("listen error:", err)
            return
        }

        for {
            c, err := l.Accept()
            if err != nil {
                log.Println("accept error:", err)
                break
            }
            // start a new goroutine to handle
            // the new connection.
            log.Println("accept a new connection")
            go handleConn(c)
        }
    }
Sending "hi" to the server with client2.go:
    F:\Go\project\src\go_dev\go-tcpsock\read_write>go run client2.go hi
    2019/03/04 22:43:41 begin dial...
    2019/03/04 22:43:41 dial ok

    F:\Go\project\src\go_dev\go-tcpsock\read_write>go run server2.go
    2019/03/04 22:43:41 accept a new connection
    2019/03/04 22:43:41 start to read from conn
    2019/03/04 22:43:43 read 2 bytes, content is hi
    2019/03/04 22:43:43 start to read from conn
- Enough data in the socket
If the socket holds at least as much data as a single Read asks for, Read returns exactly that amount. This is the case that best matches what we expect from Read: it fills the slice we pass in with data from the socket and returns n = 10, err = nil.
Result:
    F:\Go\project\src\go_dev\go-tcpsock\read_write>go run client2.go abcdefghij123
    2019/03/04 22:50:01 begin dial...
    2019/03/04 22:50:01 dial ok

    F:\Go\project\src\go_dev\go-tcpsock\read_write>go run server2.go
    2019/03/04 22:50:01 accept a new connection
    2019/03/04 22:50:01 start to read from conn
    2019/03/04 22:50:03 read 10 bytes, content is abcdefghij
    2019/03/04 22:50:03 start to read from conn
    2019/03/04 22:50:03 read 3 bytes, content is 123
    2019/03/04 22:50:03 start to read from conn
Analysis: the client sent 13 bytes and the server's read buffer is 10 bytes long, so the server's first Read returns only 10 bytes. The remaining 3 bytes stay in the socket, and the server's next Read picks them up (this is the "partial data" case described above).
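Because a single Read may return fewer bytes than requested, code that needs a fixed number of bytes usually loops until it has them. The standard library's `io.ReadFull` does exactly that. The sketch below uses made-up helper names (`readExactly`, `demoReadExactly`) and a loopback listener on an arbitrary free port; the sender deliberately splits the 13-byte message into two writes.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// readExactly (hypothetical helper) blocks until n bytes have arrived
// or an error occurs, absorbing any short reads along the way.
func readExactly(conn net.Conn, n int) ([]byte, error) {
	buf := make([]byte, n)
	if _, err := io.ReadFull(conn, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// demoReadExactly has the peer send 13 bytes in two separate writes
// and reads them back as a single unit.
func demoReadExactly() (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()

	go func() {
		c, err := l.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		c.Write([]byte("abcdefghij"))
		time.Sleep(50 * time.Millisecond) // make a short read likely
		c.Write([]byte("123"))
	}()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	got, err := readExactly(conn, 13)
	return string(got), err
}

func main() {
	fmt.Println(demoReadExactly())
}
```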
- Socket closed
What does the server's Read return if the client closes the socket?
There are two cases: "closed with data" and "closed without data".
"Closed with data" means that when the client closes, the socket still holds data the server has not read. After the client closes the socket and exits, the server still has not started reading. Ten seconds later its first Read successfully returns all the data; on the second Read, because the client's socket is closed, Read returns an EOF error.
Client:
    // client3.go
    package main

    import (
        "fmt"
        "log"
        "net"
        "os"
        "time"
    )

    func main() {
        if len(os.Args) <= 1 {
            fmt.Println("usage: go run client3.go YOUR_CONTENT")
            return
        }
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")

        time.Sleep(time.Second * 2)
        data := os.Args[1]
        conn.Write([]byte(data))
    }
Server:
    // server3.go
    package main

    import (
        "log"
        "net"
        "time"
    )

    func handleConn(c net.Conn) {
        defer c.Close()
        for {
            // wait before each read so the client can close first
            time.Sleep(10 * time.Second)
            var buf = make([]byte, 10)
            log.Println("start to read from conn")
            n, err := c.Read(buf)
            if err != nil {
                log.Println("conn read error:", err)
                return
            }
            log.Printf("read %d bytes, content is %s\n", n, string(buf[:n]))
        }
    }

    func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
            log.Println("listen error:", err)
            return
        }

        for {
            c, err := l.Accept()
            if err != nil {
                log.Println("accept error:", err)
                break
            }
            // start a new goroutine to handle
            // the new connection.
            log.Println("accept a new connection")
            go handleConn(c)
        }
    }
Result:
    F:\Go\project\src\go_dev\go-tcpsock\read_write>go run client3.go hello
    2019/03/04 22:55:49 begin dial...
    2019/03/04 22:55:49 dial ok

    F:\Go\project\src\go_dev\go-tcpsock\read_write>go run server3.go
    2019/03/04 22:55:49 accept a new connection
    2019/03/04 22:55:59 start to read from conn
    2019/03/04 22:55:59 read 5 bytes, content is hello
    2019/03/04 22:56:09 start to read from conn
    2019/03/04 22:56:09 conn read error: EOF
Analysis: the output shows that the first Read, 10s after the client had closed its socket and exited, still returned the 5 bytes of data, and that the second Read returned EOF.
From this example we can also infer the result in the "closed without data" case: Read returns EOF directly.
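In practice, EOF from Read usually signals an orderly shutdown by the peer rather than a failure, so a read loop tends to treat it as its normal exit condition. A minimal sketch, with the hypothetical helpers `readUntilEOF` and `demoReadUntilEOF` and a loopback listener on an arbitrary free port:

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// readUntilEOF (hypothetical helper) collects everything the peer sends
// and treats io.EOF as the normal end of the stream.
func readUntilEOF(conn net.Conn) ([]byte, error) {
	var all []byte
	buf := make([]byte, 10)
	for {
		n, err := conn.Read(buf)
		all = append(all, buf[:n]...) // keep any bytes returned with the error
		if err == io.EOF {
			return all, nil // peer closed its side: done
		}
		if err != nil {
			return all, err
		}
	}
}

// demoReadUntilEOF has the peer write "hello" and close immediately.
func demoReadUntilEOF() (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()

	go func() {
		c, err := l.Accept()
		if err != nil {
			return
		}
		c.Write([]byte("hello"))
		c.Close()
	}()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	data, err := readUntilEOF(conn)
	return string(data), err
}

func main() {
	fmt.Println(demoReadUntilEOF())
}
```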
- Read timeout
Some situations place a strict limit on how long Read may block. How does Read behave then? When it returns a timeout error, has it also read part of the data? This experiment is hard to set up, and the results below may not cover every possible outcome.
Client:
    // client4.go
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")

        data := make([]byte, 65536)
        conn.Write(data)

        time.Sleep(time.Second * 10000)
    }
Server:
    // server4.go
    package main

    import (
        "log"
        "net"
        "time"
    )

    func handleConn(c net.Conn) {
        defer c.Close()
        for {
            // read from the connection
            time.Sleep(10 * time.Second)
            var buf = make([]byte, 65536)
            log.Println("start to read from conn")
            // Observed when this deadline expires:
            // conn read 0 bytes, error: read tcp 127.0.0.1:8888->127.0.0.1:60763: i/o timeout
            c.SetReadDeadline(time.Now().Add(time.Microsecond * 10))
            n, err := c.Read(buf)
            if err != nil {
                log.Printf("conn read %d bytes, error: %s", n, err)
                if nerr, ok := err.(net.Error); ok && nerr.Timeout() {
                    continue
                }
                return
            }

            log.Printf("read %d bytes, content is %s\n", n, string(buf[:n]))
        }
    }

    func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
            log.Println("listen error:", err)
            return
        }

        for {
            c, err := l.Accept()
            if err != nil {
                log.Println("accept error:", err)
                break
            }
            // start a new goroutine to handle
            // the new connection.
            log.Println("accept a new connection")
            go handleConn(c)
        }
    }
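On the server we use Conn's SetReadDeadline method to set a read timeout of 10 microseconds.
Although the timeout is 10 microseconds every time, the results differ: the first Read times out with 0 bytes read, while the second Read returns all the data without timing out. Across many runs, the case of "some data read and a timeout error returned" never showed up.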
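A read deadline is an absolute point in time, so it has to be set again before each Read that should be limited. The sketch below wraps that pattern in a hypothetical helper, `readWithTimeout`, and demonstrates it against a peer that stays silent until told to send. The 50ms and 2s budgets are arbitrary values chosen for the illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readWithTimeout (hypothetical helper) waits at most d for data and
// reports whether the read ended because the deadline passed.
func readWithTimeout(conn net.Conn, buf []byte, d time.Duration) (n int, timedOut bool, err error) {
	if err := conn.SetReadDeadline(time.Now().Add(d)); err != nil {
		return 0, false, err
	}
	defer conn.SetReadDeadline(time.Time{}) // zero value clears the deadline

	n, err = conn.Read(buf)
	var ne net.Error
	if errors.As(err, &ne) && ne.Timeout() {
		return n, true, nil // n may be > 0 even on timeout
	}
	return n, false, err
}

// demoReadTimeout reads twice: once while the peer is silent, and once
// after the peer has been released to send "late".
func demoReadTimeout() (firstTimedOut bool, second string, err error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, "", err
	}
	defer l.Close()

	release := make(chan struct{})
	go func() {
		c, err := l.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		<-release // stay silent until the client has timed out once
		c.Write([]byte("late"))
	}()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return false, "", err
	}
	defer conn.Close()

	buf := make([]byte, 16)
	_, firstTimedOut, err = readWithTimeout(conn, buf, 50*time.Millisecond)
	close(release)
	if err != nil {
		return firstTimedOut, "", err
	}
	n, _, err := readWithTimeout(conn, buf, 2*time.Second)
	return firstTimedOut, string(buf[:n]), err
}

func main() {
	fmt.Println(demoReadTimeout())
}
```

After a timeout the connection remains usable; the second Read succeeds once a fresh deadline is in place.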
2) Characteristics of conn.Write
- Successful write
The earlier examples focused on Read, and the client did not check the return value of Write. A "successful write" means that the n returned by Write equals the length of the data to be written and error = nil. This is by far the most common case, so no example is given here.
- Blocked write
The operating systems at both ends of a TCP connection keep data buffers for that connection. When one end calls Write, the data is actually written into the buffer of the OS protocol stack. TCP is full duplex, so each direction has its own buffers. Once the sender has filled both the peer's receive buffer and its own send buffer, Write blocks.
Client:
    // client5.go
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")

        data := make([]byte, 65536)
        var total int
        for {
            n, err := conn.Write(data)
            if err != nil {
                // count the bytes written before the error
                total += n
                log.Printf("write %d bytes, error:%s\n", n, err)
                break
            }
            total += n
            log.Printf("write %d bytes this time, %d bytes in total\n", n, total)
        }

        log.Printf("write %d bytes in total\n", total)
        time.Sleep(time.Second * 10000)
    }
Server:
    // server5.go
    package main

    import (
        "log"
        "net"
        "time"
    )

    func handleConn(c net.Conn) {
        defer c.Close()
        // do not read at all for the first 10 seconds
        time.Sleep(time.Second * 10)
        for {
            // read from the connection
            time.Sleep(5 * time.Second)
            var buf = make([]byte, 60000)
            log.Println("start to read from conn")
            n, err := c.Read(buf)
            if err != nil {
                log.Printf("conn read %d bytes, error: %s", n, err)
                if nerr, ok := err.(net.Error); ok && nerr.Timeout() {
                    continue
                }
                break
            }

            log.Printf("read %d bytes, content is %s\n", n, string(buf[:n]))
        }
    }

    func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
            log.Println("listen error:", err)
            return
        }

        for {
            c, err := l.Accept()
            if err != nil {
                log.Println("accept error:", err)
                break
            }
            // start a new goroutine to handle
            // the new connection.
            log.Println("accept a new connection")
            go handleConn(c)
        }
    }
Result:
    [root@centos tcp]# go run client5.go
    2019/03/04 23:30:18 begin dial...
    2019/03/04 23:30:18 dial ok
    2019/03/04 23:30:18 write 65536 bytes this time, 65536 bytes in total
    2019/03/04 23:30:18 write 65536 bytes this time, 131072 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 196608 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 262144 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 327680 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 393216 bytes in total
    2019/03/04 23:30:39 write 65536 bytes this time, 458752 bytes in total
    2019/03/04 23:30:39 write 65536 bytes this time, 524288 bytes in total

    [root@centos tcp]# go run server5.go
    2019/03/04 23:30:18 accept a new connection
    2019/03/04 23:30:33 start to read from conn
    2019/03/04 23:30:33 read 60000 bytes, content is
    2019/03/04 23:30:38 start to read from conn
    2019/03/04 23:30:38 read 60000 bytes, content is
    2019/03/04 23:30:43 start to read from conn
    2019/03/04 23:30:43 read 60000 bytes, content is
server5 does not read any data at first, so while client5 keeps trying to write, it blocks once a certain amount has been written.
On CentOS 6.5 this amount was about 393216 bytes. Later, as server5 reads every 5s, space is freed in the OS socket buffers and client5 can write again.
- Partial write
A Write can end up writing only part of the data. In the example above, when the client's log stops at "2019/03/04 23:30:39 write 65536 bytes this time, 524288 bytes in total", we kill server5. client5 then prints the following:
    [root@centos tcp]# go run client5.go
    2019/03/04 23:30:18 begin dial...
    2019/03/04 23:30:18 dial ok
    2019/03/04 23:30:18 write 65536 bytes this time, 65536 bytes in total
    2019/03/04 23:30:18 write 65536 bytes this time, 131072 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 196608 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 262144 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 327680 bytes in total
    2019/03/04 23:30:19 write 65536 bytes this time, 393216 bytes in total
    2019/03/04 23:30:39 write 65536 bytes this time, 458752 bytes in total
    2019/03/04 23:30:39 write 65536 bytes this time, 524288 bytes in total
    2019/03/04 23:30:45 write 49152 bytes, error:write tcp 127.0.0.1:37294->127.0.0.1:8888: write: connection reset by peer
    2019/03/04 23:30:45 write 573440 bytes in total
Clearly Write did not block at 524288 bytes; it blocked after writing another 49152 bytes. After the server's socket was closed, Write returned err != nil together with n = 49152, and the program has to deal with those 49152 bytes that were already written.
- Write timeout
If a Write really needs a time limit, we can call SetWriteDeadline. Copy client5.go to client6.go and add a line that sets the timeout before the Write in client6.go:
    conn.SetWriteDeadline(time.Now().Add(time.Microsecond * 10))
The full listing:
    // client6.go
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        defer conn.Close()
        log.Println("dial ok")

        data := make([]byte, 65536)
        var total int
        for {
            // renew the 10-microsecond write deadline before every Write
            conn.SetWriteDeadline(time.Now().Add(time.Microsecond * 10))
            n, err := conn.Write(data)
            if err != nil {
                total += n
                log.Printf("write %d bytes, error:%s\n", n, err)
                break
            }
            total += n
            log.Printf("write %d bytes this time, %d bytes in total\n", n, total)
        }

        log.Printf("write %d bytes in total\n", total)
        time.Sleep(time.Second * 10000)
    }
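Start server6.go, then client6.go. The output shows what Write returns when it times out:
    [root@centos tcp]# go run client6.go
    2019/03/04 23:46:33 begin dial...
    2019/03/04 23:46:33 dial ok
    2019/03/04 23:46:33 write 65536 bytes this time, 65536 bytes in total
    2019/03/04 23:46:33 write 65536 bytes this time, 131072 bytes in total
    2019/03/04 23:46:33 write 49152 bytes, error:write tcp 127.0.0.1:37295->127.0.0.1:8888: i/o timeout
    2019/03/04 23:46:33 write 180224 bytes in total
As we can see, part of the data may still have been written when a write times out.
To sum up these examples: although Go gives us the convenience of blocking I/O, every call to Read and Write still has to consider both return values, n and err, in order to handle the result correctly. net.Conn implements the io.Reader and io.Writer interfaces, so sockets can be read and written through wrapper packages, such as the Reader and Writer in the bufio package or the functions in io/ioutil.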
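As an illustration of the wrapper approach, the sketch below exchanges newline-terminated messages through `bufio`. The newline framing and the helper names `sendLine`, `recvLine` and `demoLines` are choices made for this example, not part of the original article.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// sendLine (hypothetical helper) writes one newline-terminated message
// through a bufio.Writer and flushes it so the bytes reach the socket.
func sendLine(conn net.Conn, msg string) error {
	w := bufio.NewWriter(conn)
	if _, err := w.WriteString(msg + "\n"); err != nil {
		return err
	}
	return w.Flush()
}

// recvLine reads up to and including the next '\n' and strips it.
func recvLine(r *bufio.Reader) (string, error) {
	line, err := r.ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSuffix(line, "\n"), nil
}

// demoLines sends msg to a throwaway server that replies with the
// upper-cased line.
func demoLines(msg string) (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()

	go func() {
		c, err := l.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		line, err := recvLine(bufio.NewReader(c))
		if err != nil {
			return
		}
		sendLine(c, strings.ToUpper(line))
	}()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if err := sendLine(conn, msg); err != nil {
		return "", err
	}
	return recvLine(bufio.NewReader(conn))
}

func main() {
	fmt.Println(demoLines("hello"))
}
```

Forgetting `Flush` is a common pitfall with `bufio.Writer`: the data stays in the user-space buffer and never reaches the socket.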
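(5) Goroutine safety
In a goroutine-based network architecture, a conn may be shared between goroutines. Are reads and writes on a conn goroutine safe? Before digging in, let us first ask whether goroutine safety of read and write even matters from the application's point of view.
For reads: TCP is a byte stream, and conn.Read cannot tell where one business message ends and the next begins. Having several goroutines read from the same conn is therefore of little use; a goroutine that reads an incomplete business packet only makes processing harder. For writes, on the other hand, several goroutines writing concurrently does occur in practice. Testing whether conn reads and writes are goroutine safe is not easy, so let us first look at the runtime code and settle the question in theory.
net.conn is just a wrapper around *netFD; Write and Read both end up operating on its fd:
    type conn struct {
        fd *netFD
    }
netFD has a different implementation on each platform. Take the netFD in net/fd_unix.go as an example:
    // Network file descriptor.
    type netFD struct {
        // locking/lifetime of sysfd + serialize access to Read and Write methods
        fdmu fdMutex

        // immutable until Close
        sysfd       int
        family      int
        sotype      int
        isConnected bool
        net         string
        laddr       Addr
        raddr       Addr

        // wait server
        pd pollDesc
    }
netFD contains a field of type fdMutex, implemented by the runtime. According to the comment, this fdMutex serializes Write and Read access to the sysfd that belongs to the netFD. In other words, concurrent Reads on a conn are serialized against each other through readLock, and concurrent Writes likewise through writeLock. The implementations of netFD's Read and Write methods confirm this: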
    func (fd *netFD) Read(p []byte) (n int, err error) {
        if err := fd.readLock(); err != nil {
            return 0, err
        }
        defer fd.readUnlock()
        if err := fd.pd.PrepareRead(); err != nil {
            return 0, err
        }
        for {
            n, err = syscall.Read(fd.sysfd, p)
            if err != nil {
                n = 0
                if err == syscall.EAGAIN {
                    if err = fd.pd.WaitRead(); err == nil {
                        continue
                    }
                }
            }
            err = fd.eofError(n, err)
            break
        }
        if _, ok := err.(syscall.Errno); ok {
            err = os.NewSyscallError("read", err)
        }
        return
    }

    func (fd *netFD) Write(p []byte) (nn int, err error) {
        if err := fd.writeLock(); err != nil {
            return 0, err
        }
        defer fd.writeUnlock()
        if err := fd.pd.PrepareWrite(); err != nil {
            return 0, err
        }
        for {
            var n int
            n, err = syscall.Write(fd.sysfd, p[nn:])
            if n > 0 {
                nn += n
            }
            if nn == len(p) {
                break
            }
            if err == syscall.EAGAIN {
                if err = fd.pd.WaitWrite(); err == nil {
                    continue
                }
            }
            if err != nil {
                break
            }
            if n == 0 {
                err = io.ErrUnexpectedEOF
                break
            }
        }
        if _, ok := err.(syscall.Errno); ok {
            err = os.NewSyscallError("write", err)
        }
        return nn, err
    }
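Each Write holds the lock until all the data of that call has been written. At the application level, then, keeping concurrent writes from several goroutines on one conn safe requires writing a complete "business packet" in a single Write call. Once a business packet is split across several Write calls, there is no guarantee that one goroutine's packet is sent contiguously on the conn.
Read is protected by a lock as well. Concurrent reads on the same conn by several goroutines never return overlapping content, but where the content is split depends on runtime scheduling and is effectively random. One business packet may have 1/3 of its content read by goroutine-1 and the other 2/3 by goroutine-2. Take the complete packet "world": if each goroutine's read slice is smaller than 5 bytes, one goroutine may read "worl" and another may read "d".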
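One common way to apply the "one Write per business packet" rule is to build the whole message, including a length prefix, in a single buffer before calling Write. The sketch below does this with a 4-byte big-endian length header; the header format and the helper names (`encodeFrame`, `readFrame`, `countIntactFrames`) are assumptions made for the illustration, and the per-call atomicity it relies on follows from the write lock shown above rather than from a documented guarantee.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"net"
	"sync"
)

// encodeFrame prepends a 4-byte big-endian length so the whole message
// can be handed to conn.Write in a single call.
func encodeFrame(payload []byte) []byte {
	frame := make([]byte, 4+len(payload))
	binary.BigEndian.PutUint32(frame, uint32(len(payload)))
	copy(frame[4:], payload)
	return frame
}

// readFrame reads one length-prefixed message.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	payload := make([]byte, binary.BigEndian.Uint32(hdr[:]))
	if _, err := io.ReadFull(r, payload); err != nil {
		return nil, err
	}
	return payload, nil
}

// countIntactFrames lets several goroutines share one conn, each sending
// a single frame with one Write call, and counts how many frames arrive
// unmangled (1000 identical bytes) at the other end.
func countIntactFrames(writers int) (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return 0, err
	}
	defer conn.Close()
	server, err := l.Accept()
	if err != nil {
		return 0, err
	}
	defer server.Close()

	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			payload := bytes.Repeat([]byte{byte('a' + id%26)}, 1000)
			conn.Write(encodeFrame(payload)) // one Write per message
		}(i)
	}

	intact := 0
	for i := 0; i < writers; i++ {
		payload, err := readFrame(server)
		if err != nil {
			return intact, err
		}
		if len(payload) == 1000 && bytes.Count(payload, payload[:1]) == 1000 {
			intact++
		}
	}
	wg.Wait()
	return intact, nil
}

func main() {
	fmt.Println(countIntactFrames(8))
}
```

If the header and the payload were sent with two separate Write calls instead, frames from different goroutines could interleave; a mutex around the pair of writes would then be needed.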
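(6) Socket options
The native socket API offers a rich set of sockopt settings, but Go has its own network architecture model, and the socket option interface it provides covers only the settings that are necessary for that model, including:
SetKeepAlive
SetKeepAlivePeriod
SetLinger
SetNoDelay (no delay is the default)
SetWriteBuffer
SetReadBuffer
These are methods of TCPConn, not of Conn, so using them requires a type assertion:
    tcpConn, ok := c.(*net.TCPConn)
    if !ok {
        // error handle
    }
    tcpConn.SetNoDelay(true)
For listener sockets, Go sets SO_REUSEADDR by default, so restarting a listener program does not fail with an "address in use" error. The default listen backlog is taken from the system's settings and therefore differs between systems (the original author cites 128 on macOS and 512 on Linux; on Linux, Go reads the value from /proc/sys/net/core/somaxconn).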
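The type assertion can be wrapped in a small helper that also handles the case where the connection is not TCP at all. This is a minimal sketch: `tuneTCP` and `demoTune` are made-up names, the 30-second keep-alive period is an arbitrary value, and `net.Pipe` is used only to obtain a non-TCP `net.Conn` for contrast.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// tuneTCP (hypothetical helper) applies a few TCP-level options and
// fails for connections that are not backed by a TCP socket.
func tuneTCP(c net.Conn) error {
	tc, ok := c.(*net.TCPConn)
	if !ok {
		return fmt.Errorf("not a TCP connection: %T", c)
	}
	if err := tc.SetNoDelay(true); err != nil {
		return err
	}
	if err := tc.SetKeepAlive(true); err != nil {
		return err
	}
	return tc.SetKeepAlivePeriod(30 * time.Second)
}

// demoTune tries tuneTCP on a real TCP connection and on an in-memory pipe.
func demoTune() (tcpErr, pipeErr error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err, nil
	}
	defer l.Close()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return err, nil
	}
	defer conn.Close()

	a, b := net.Pipe() // in-memory Conn, not a *net.TCPConn
	defer a.Close()
	defer b.Close()

	return tuneTCP(conn), tuneTCP(a)
}

func main() {
	tcpErr, pipeErr := demoTune()
	fmt.Println("tcp:", tcpErr)
	fmt.Println("pipe:", pipeErr)
}
```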
(7) Closing a connection
Compared with the methods above, closing a connection is the simplest operation. Because sockets are full duplex, operating on a socket that this side has closed gives different results from operating on a socket that the peer has closed. Consider the following example.
Client:
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        log.Println("begin dial...")
        conn, err := net.Dial("tcp", ":8888")
        if err != nil {
            log.Println("dial error:", err)
            return
        }
        conn.Close()
        log.Println("close ok")

        // both operations below run on an already closed conn
        var buf = make([]byte, 32)
        n, err := conn.Read(buf)
        if err != nil {
            log.Println("read error:", err)
        } else {
            log.Printf("read %d bytes, content is %s\n", n, string(buf[:n]))
        }

        n, err = conn.Write(buf)
        if err != nil {
            log.Println("write error:", err)
        } else {
            log.Printf("write %d bytes, content is %s\n", n, string(buf[:n]))
        }

        time.Sleep(time.Second * 1000)
    }
Server:
    // server.go
    package main

    import (
        "log"
        "net"
    )

    func handleConn(c net.Conn) {
        defer c.Close()

        // read from the connection
        var buf = make([]byte, 10)
        log.Println("start to read from conn")
        n, err := c.Read(buf)
        if err != nil {
            log.Println("conn read error:", err)
        } else {
            log.Printf("read %d bytes, content is %s\n", n, string(buf[:n]))
        }

        n, err = c.Write(buf)
        if err != nil {
            log.Println("conn write error:", err)
        } else {
            log.Printf("write %d bytes, content is %s\n", n, string(buf[:n]))
        }
    }

    func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
            log.Println("listen error:", err)
            return
        }

        for {
            c, err := l.Accept()
            if err != nil {
                log.Println("accept error:", err)
                break
            }
            // start a new goroutine to handle
            // the new connection.
            log.Println("accept a new connection")
            go handleConn(c)
        }
    }
The example produces the following output:
    [root@centos conn_close]# go run client1.go
    2019/03/05 00:00:59 begin dial...
    2019/03/05 00:00:59 close ok
    2019/03/05 00:00:59 read error: read tcp 127.0.0.1:37296->127.0.0.1:8888: use of closed network connection
    2019/03/05 00:00:59 write error: write tcp 127.0.0.1:37296->127.0.0.1:8888: use of closed network connection

    [root@centos conn_close]# go run server1.go
    2019/03/05 00:00:59 accept a new connection
    2019/03/05 00:00:59 start to read from conn
    2019/03/05 00:00:59 conn read error: EOF
    2019/03/05 00:00:59 write 10 bytes, content is
The client's output shows that reading from or writing to a socket that this side has already closed yields a "use of closed network connection" error.
The server's output shows that reading from a socket the peer has closed yields EOF, whereas a write succeeds: because the local socket is still open, the data is written successfully into the local kernel socket buffer, even though it can never reach the peer's buffer. Once a program notices that the peer has closed its socket, it should therefore clean up its own socket properly; continuing to write is pointless.
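Both outcomes can be checked in code instead of by matching error strings. The sketch below assumes Go 1.16 or later, where the "use of closed network connection" error can be detected with `errors.Is(err, net.ErrClosed)`; `demoClose` is a name made up for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// demoClose (hypothetical helper) closes the client side right after
// connecting and reports what each side observes on its next Read.
func demoClose() (localClosed, peerEOF bool, err error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false, err
	}
	defer l.Close()

	peerErr := make(chan error, 1)
	go func() {
		c, err := l.Accept()
		if err != nil {
			peerErr <- err
			return
		}
		defer c.Close()
		_, err = c.Read(make([]byte, 10)) // peer has closed: expect io.EOF
		peerErr <- err
	}()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return false, false, err
	}
	conn.Close()

	// Reading from a conn that this side already closed.
	_, readErr := conn.Read(make([]byte, 10))
	localClosed = errors.Is(readErr, net.ErrClosed)

	peerEOF = (<-peerErr) == io.EOF
	return localClosed, peerEOF, nil
}

func main() {
	fmt.Println(demoClose())
}
```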
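(8) Summary
This material is fairly basic but important. Go targets large-scale backend services, so a solid understanding of the details of the communication layer pays off. In addition, Go's network model of goroutines plus blocking communication reduces the mental burden on developers and simplifies communication code, which matters a great deal.
Note: examples whose output shows a (root@centos) prompt were run on CentOS 6.5; the others were run on Windows with go version go1.8 windows/amd64.
Important notes:
- Apart from a small portion (the run results and the links to other blogs), everything above comes from https://tonybai.com/2015/11/17/tcp-programming-in-golang/ and the right of interpretation belongs to that author.
- The examples used in this section are available in that author's GitHub repository: https://github.com/bigwhite/experiments/tree/master/go-tcpsock
(9) Sending HTTP requests
2. Using Redis
3. Homework
References:
- https://tonybai.com/2015/11/17/tcp-programming-in-golang/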