Detecting connection state in epoll on Linux

Question

There are many threads on how to detect whether a socket is connected, using methods such as getpeername or getsockopt with SO_ERROR. getpeername (https://man7.org/linux/man-pages/man2/getpeername.2.html) would be a good way for me to detect whether a socket is connected. The problem is that the man page says nothing about a connection that is still in progress. If I call connect and the connection is in progress, will a subsequent getpeername call report an error (-1) even though the connection attempt is still under way?

If it does, I can implement a counter-like system that eventually kills the socket if the connection is still in progress after x seconds.

Answer 1

Score: 0

man connect
> If the initiating socket is connection-mode, .... If the connection cannot be established immediately and O_NONBLOCK is not set for the file descriptor for the socket, connect() shall block for up to an unspecified timeout interval until the connection is established. If the timeout interval expires before the connection is established, connect() shall fail and the connection attempt shall be aborted.
>
> If connect() is interrupted by a signal that is caught while blocked waiting to establish a connection, connect() shall fail and set errno to [EINTR], but the connection request shall not be aborted, and the connection shall be established asynchronously.
>
> If the connection cannot be established immediately and O_NONBLOCK is set for the file descriptor for the socket, connect() shall fail and set errno to [EINPROGRESS], but the connection request shall not be aborted, and the connection shall be established asynchronously.
>
> When the connection has been established asynchronously, select() and poll() shall indicate that the file descriptor for the socket is ready for writing.

If the socket is in blocking mode, connect blocks while the connection is in progress. Once connect returns, you know whether the connection was established.

If a caught signal interrupts the blocked call, connect fails with EINTR, but the connection attempt is not aborted; it continues asynchronously.

If the socket is in non-blocking mode (O_NONBLOCK) and the connection cannot be established immediately, connect fails with EINPROGRESS and, as above, the attempt continues asynchronously. You then have to use select or poll (or epoll, as in the question title) to find out when the socket becomes ready for writing. Writability means the attempt has finished; reading SO_ERROR with getsockopt then tells you whether it succeeded (0) or failed (an errno value).
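To tie this back to epoll and the "kill it after x seconds" idea from the question, here is a minimal sketch in C. The helper names connect_with_timeout and fail are made up for illustration, it handles IPv4 only, and it creates a throwaway epoll instance per call, which a real event loop would not do. Treat it as one possible shape under those assumptions rather than the canonical way to do it.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical cleanup helper: close what is open, set errno, return -1. */
static int fail(int fd, int ep, int err)
{
    if (ep >= 0) close(ep);
    if (fd >= 0) close(fd);
    errno = err;
    return -1;
}

/* Hypothetical helper: non-blocking TCP connect to ip:port, waiting at most
 * timeout_ms for the result. Returns a connected (still non-blocking) fd,
 * or -1 with errno set (ETIMEDOUT if the wait expired). */
int connect_with_timeout(const char *ip, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(port) };
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    int fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (fd == -1)
        return -1;

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0)
        return fd;                      /* connected immediately */
    if (errno != EINPROGRESS)
        return fail(fd, -1, errno);     /* immediate failure */

    int ep = epoll_create1(0);
    if (ep == -1)
        return fail(fd, -1, errno);

    /* EPOLLOUT fires when the connection attempt has finished. */
    struct epoll_event ev = { .events = EPOLLOUT, .data.fd = fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == -1)
        return fail(fd, ep, errno);

    struct epoll_event out;
    int n = epoll_wait(ep, &out, 1, timeout_ms);
    if (n == -1)
        return fail(fd, ep, errno);     /* includes EINTR; caller may retry */
    if (n == 0)
        return fail(fd, ep, ETIMEDOUT); /* still in progress: give up */

    /* Finished, but not necessarily successfully: ask SO_ERROR. */
    int err = 0;
    socklen_t len = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return fail(fd, ep, errno);
    if (err != 0)
        return fail(fd, ep, err);       /* e.g. ECONNREFUSED */

    close(ep);
    return fd;
}
```

A call such as connect_with_timeout("127.0.0.1", 8080, 5000) would then either hand back a connected descriptor or fail within roughly five seconds. In an existing epoll loop you would instead register the socket for EPOLLOUT alongside your other descriptors and track the deadline yourself.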

Answer 2

Score: 0

Short Answer

I think that if getpeername() fails with ENOTCONN, that simply means the TCP connection request has not yet succeeded. For it not to fail with ENOTCONN, I think the client end needs to have received the SYN+ACK from the server and sent its own ACK; on the server side, the accepted socket needs to have received the client's ACK.

Thereafter all bets are off. The connection might subsequently be interrupted, but getpeername() has no way of knowing this has happened.
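For reference, the check being discussed can be wrapped up as below. This is only a sketch: has_peer is a name I made up, and the man page does not spell out what happens while a non-blocking connect is still in progress, so treating ENOTCONN as "not connected yet" in that state is my assumption about Linux behaviour rather than a documented guarantee.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: 1 if the socket currently has a peer address,
 * 0 if getpeername() reports ENOTCONN, -1 on any other error
 * (for example EBADF for an invalid descriptor). */
int has_peer(int fd)
{
    struct sockaddr_storage peer;
    socklen_t len = sizeof peer;

    if (getpeername(fd, (struct sockaddr *)&peer, &len) == 0)
        return 1;
    return errno == ENOTCONN ? 0 : -1;
}
```

A result of 1 only says the handshake completed at some point; it says nothing about whether the peer is still reachable now.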

Long Answer

A lot of it depends on how fussy and short-term one wants to be about knowing if the connection is up.

Strictly Speaking...

Strictly speaking, with maximum fussiness, one cannot know. In a packet-switched network there is nothing in the network that knows for sure (at any single point in time) that there is a possible connection between peers. It's a "try it and see" thing.

This contrasts with a circuit-switched network (e.g. a plain old telephone call), where there is a live circuit for exclusive use between peers (telephones); provided current is flowing, you know the circuit is complete even if the person at the other end of the phone call is silent.

Note that if the two computers were connected by a single Ethernet cable (no router, no switches, just a cable between NICs), that is effectively a fixed circuit (not even a circuit-switched network).

Relaxing a Little...

Let's focus on what one can know about a connection in a packet-switched network. As others have already said, the answer is that, really, one has to send and receive packets constantly to know if the network can still connect the two peers.

Such an exchange of packets occurs with a TCP socket connect(): the connecting peer sends a special packet to say "please can I connect to you", the serving peer replies "yes", and the client then says "thank you!" (syn->, <-syn+ack, ack->). But thereafter packets flow between the peers only if the applications send and receive data, or elect to close the connection (fin).

I think calling something like getpeername() is somewhat misleading, depending on your requirements. It's fine if you trust the network infrastructure, the remote computer, and its application not to break or crash.

It's possible for the connect() to succeed, then something breaks somewhere in the network (e.g. the peer's network connection is unplugged, or the peer crashes), and there is no knowledge at your end of the network that that has happened.

The first you can know about it is when you send some traffic and fail to get a response. The response is, initially, the TCP ACKs (which allow your network stack to clear out some of its buffers), and then possibly an actual message back from the peer application. If you keep sending data out into the void, the network will quite happily route packets as far as it can, but your TCP stack's buffers will fill up due to the lack of ACKs coming back from the peer. Eventually, your socket blocks on a call to write() (or, if it is non-blocking, write() fails with EAGAIN), because the local buffers are full.
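The "buffers fill up" symptom is easy to see locally. The sketch below is an illustration under assumptions: fill_send_buffer is a hypothetical name, and in a quick experiment a socketpair() whose other end never reads can stand in for a TCP peer that has gone silent. It keeps queuing data without blocking until the kernel refuses more.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: queue data on fd without blocking until the kernel
 * reports that the send buffer is full. Returns the number of bytes queued,
 * or -1 on an error other than "buffer full". */
long fill_send_buffer(int fd)
{
    char chunk[4096];
    long total = 0;

    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        /* MSG_DONTWAIT: non-blocking for this call only.
         * MSG_NOSIGNAL: report EPIPE instead of raising SIGPIPE. */
        ssize_t n = send(fd, chunk, sizeof chunk, MSG_DONTWAIT | MSG_NOSIGNAL);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;   /* local buffers are full */
        return -1;
    }
}
```

How many bytes fit before EAGAIN depends on the socket type and buffer settings, so the number itself is not meaningful; the point is that a full buffer is the first local sign that nothing is being consumed at the other end.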

Various Options...

  • If you're writing both applications (server and client), you can write the application to "ping pong" the connection periodically; just send a message that means nothing other than "tell me you heard this". Successful ping-ponging means that, at least within the last few seconds, the connection was OK.

  • Use a library like ZeroMQ. This library solves many issues with using network connections, and also includes (in modern versions) socket heartbeats (i.e. a ping pong). It's neat, because ZeroMQ looks after the messy business of making, restoring and monitoring connections with a heartbeat, and can notify the application whenever the connection state changes. Again, you need to be writing both client and server applications, because ZeroMQ has its own protocol on top of TCP that is not compatible with just a plain old socket. If you're interested in this approach, the terms to look for in the API documentation are socket monitor and ZMQ_HEARTBEAT_IVL.

  • If, really, only one end needs to know the connection is still available, that can be accomplished by having the other end just send out "pings". That might fit a situation where you're not writing the software at both ends. For example, a server application might be configured (rather than re-written) to stream out data regardless of whether the client wants it or not, and the client ignores most of it. However, the client knows that as long as it is receiving data, there is a connection. The server does not know (it's just blindly sending out data, up until its write() calls eventually block), but may not need to know.

Ping ponging is also good in that it gives some indication of the performance of the network. If one end is expecting a pong within 5 seconds of sending a ping but doesn't get it, that indicates that all is not as expected (even if packets are eventually turning up).

This allows discrimination between networks that are usefully working, and networks that are delivering packets but too slowly to be useful. The latter is still technically "connected" and is probably represented as connected by other tests (e.g. calling getpeername()), but it may as well not be.
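As a rough illustration of the ping-pong idea with a deadline, here is a sketch in C. The 4-byte "PING"/"PONG" messages and the name ping_peer are invented for this example; a real protocol would need proper message framing and would have to cope with partial reads, which this sketch deliberately ignores.

```c
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send "PING" on fd and wait up to timeout_ms for a
 * "PONG" reply. Returns 1 if the pong arrived in time, 0 on timeout, and
 * -1 on error, on a closed connection, or on an unexpected reply. */
int ping_peer(int fd, int timeout_ms)
{
    char reply[4];

    if (send(fd, "PING", 4, MSG_NOSIGNAL) != 4)
        return -1;

    struct pollfd p = { .fd = fd, .events = POLLIN };
    int ready = poll(&p, 1, timeout_ms);
    if (ready <= 0)
        return ready;   /* 0: no pong within the deadline, -1: poll error */

    /* Simplification: assumes the 4-byte reply arrives in one piece. */
    ssize_t n = recv(fd, reply, sizeof reply, 0);
    if (n != 4 || memcmp(reply, "PONG", 4) != 0)
        return -1;
    return 1;
}
```

With a 5000 ms timeout this matches the "pong within 5 seconds" example: a return of 0 is the "connected but too slow to be useful" case, as opposed to -1 for an outright failure.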

Limited Local Knowledge...

There are limited things one can do locally at a peer. A peer can know whether its connection to the network exists (e.g. the NIC reports a live connection), but that's about it.

My Opinion

Personally speaking, I default to ZeroMQ these days if at all possible. Even if it means a software re-write, that's not as bad as it seems. This is because one is generally replacing code such as connect() with zmq_connect(), and recv() with zmq_recv(), etc. There's often a lot of code removal too. ZeroMQ is message orientated, a TCP socket is stream orientated. Quite a lot of applications have to adapt TCP into a message orientation, and ZeroMQ replaces all the code that does that.

ZeroMQ is also well supported across numerous languages, through bindings and/or re-implementations.

huangapple
  • Published on January 9, 2023, 09:10:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75052404.html