How do I cleanly reconnect a boost::socket after a disconnect?

2021-12-11 00:00:00 sockets c++ boost boost-asio

My client application uses a boost::asio::ip::tcp::socket to connect to a remote server. If the app loses its connection to this server (e.g. because the server crashes or is shut down), I would like it to attempt to reconnect at regular intervals until it succeeds.

What do I need to do on the client side to cleanly handle a disconnect, tidy up, and then repeatedly attempt to reconnect?

Currently the interesting bits of my code look something like this.

The connect goes something like this:

bool MyClient::myconnect()
{
    bool isConnected = false;

    // Attempt connection
    socket.connect(server_endpoint, errorcode);

    if (errorcode)
    {
        cerr << "Connection failed: " << errorcode.message() << endl;
        mydisconnect();
    }
    else
    {
        isConnected = true;

        // Connected so setup async read for an incoming message.
        startReadMessage();

        // And start the io_service_thread
        io_service_thread = new boost::thread(
            boost::bind(&MyClient::runIOService, this, boost::ref(io_service)));
    }
    return isConnected;
}

The runIOService() method is just:

void MyClient::runIOService(boost::asio::io_service& io_service)
{
    size_t executedCount = io_service.run();
    cout << "io_service: " << executedCount << " handlers executed." << endl;
    io_service.reset();
}

And if any of the async read handlers return an error then they just call this disconnect method:

void MyClient::mydisconnect(void)
{
    boost::system::error_code errorcode;

    if (socket.is_open())
    {
        // Boost documentation recommends calling shutdown first
        // for "graceful" closing of socket.
        socket.shutdown(boost::asio::ip::tcp::socket::shutdown_both, errorcode);
        if (errorcode)
        {
            cerr << "socket.shutdown error: " << errorcode.message() << endl;
        }

        socket.close(errorcode);
        if (errorcode)
        {
            cerr << "socket.close error: " << errorcode.message() << endl;
        }

        // Notify the observer we have disconnected
        myObserver->disconnected();
    }
}

...which attempts to disconnect gracefully and then notifies an observer, which will start calling connect() at five-second intervals until it gets reconnected.

Is there anything else I need to do?

Currently this does seem to work. If I kill the server that it is connected to, I get the expected "End of file" error in my read handlers, and mydisconnect() is called without any issues.

But when it then attempts to reconnect and fails, I see it report "socket.shutdown error: Invalid argument". Is this just because I am attempting to shut down a socket that has no reads/writes pending on it? Or is it something more?

Recommended answer

You need to create a new boost::asio::ip::tcp::socket each time you reconnect. The easiest way to do this is probably to just allocate the socket on the heap using a boost::shared_ptr (you could probably also get away with scoped_ptr if your socket is entirely encapsulated within a class). E.g.:

bool MyClient::myconnect()
{
    bool isConnected = false;

    // Attempt connection
    // socket is of type boost::shared_ptr<boost::asio::ip::tcp::socket>
    socket.reset(new boost::asio::ip::tcp::socket(...));
    socket->connect(server_endpoint, errorcode);
    // ...
}

Then, when mydisconnect is called, you could deallocate the socket:

void MyClient::mydisconnect(void)
{
    // ...
    // deallocate socket.  will close any open descriptors
    socket.reset();
}

The error you're seeing is probably a result of the OS cleaning up the file descriptor after you've called close. When you call close and then try to connect on the same socket, you're probably trying to connect an invalid file descriptor. At this point you should see an error message starting with "Connection failed: ..." based on your logic, but you then call mydisconnect which is probably then attempting to call shutdown on an invalid file descriptor. Vicious cycle!
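
To make the "new socket per attempt" idea concrete, here is a minimal sketch of a retry loop. It is not Boost.Asio API: connect_with_retry, make_socket, try_connect, max_attempts and delay are all hypothetical names introduced for illustration, and the helper is templated on the socket type so it does not depend on Boost itself.

```cpp
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical helper, not part of Boost.Asio.
// make_socket() is called before every attempt, so each connect runs on a
// brand-new socket object. Returns the first socket for which try_connect()
// reports success, or nullptr if all max_attempts attempts fail.
template <typename Socket, typename MakeSocket, typename TryConnect>
std::unique_ptr<Socket> connect_with_retry(MakeSocket make_socket,
                                           TryConnect try_connect,
                                           int max_attempts,
                                           std::chrono::milliseconds delay)
{
    for (int attempt = 1; attempt <= max_attempts; ++attempt)
    {
        // Fresh socket for this attempt. The socket from the previous
        // attempt was destroyed at the end of the previous iteration.
        std::unique_ptr<Socket> socket = make_socket();

        if (try_connect(*socket))
        {
            return socket;
        }

        // Wait before the next attempt, but not after the last one.
        if (attempt < max_attempts)
        {
            std::this_thread::sleep_for(delay);
        }
    }
    return nullptr;
}
```

With Boost.Asio, make_socket could return a std::unique_ptr holding a new boost::asio::ip::tcp::socket built on the client's io_service, and try_connect could call socket.connect(server_endpoint, errorcode) and return !errorcode. A blocking sleep is used here only to keep the sketch short; in the question's design the observer's five-second timer would play that role.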
