PHP-MySQLi 连接随机失败,出现“无法分配请求的地址"

2022-01-24 00:00:00 debian php mysql

大约 2 周以来,我一直在处理 LAMP 堆栈中最奇怪的问题之一.长话短说,与 MySQL 服务器的随机连接失败并显示错误消息:

Since about 2 weeks I'm dealing with one of the weirdest problems in LAMP stack. Long story short randomly connection to MySQL server is failing with error message:

Warning:  mysqli::real_connect(): (HY000/2002): Cannot assign requested address in ..

MySQL 位于不同的盒子"上,托管在 Rackspace Cloud今天我们将它的版本降级为

The MySQL is on different "box", hosted at Rackspace Cloud Today we downgraded it's version to

Ver 14.14 Distrib 5.1.42, for debian-linux-gnu (x86_64).

数据库服务器正忙于处理每秒平均查询次数:5327.957,根据它的状态变量.

The DB server is pretty busy dealing with Queries per second avg: 5327.957 according to it's status variable.

MySQL 处于 log-warnings=9 中,但没有记录拒绝连接的警告.假设 1% 的概率,站点和 Gearman 工作人员脚本都因该错误而失败.没有服务器负载似乎不是我们监控的一个因素.(CPU 负载、IO 负载或 MySQL 负载)最大 DB 连接数 (max_connections) 设置为 200,但我们从未处理过超过 100 个与数据库的同时连接

MySQL is in log-warnings=9 but no warring for connection refused are logged. Both site and gearman workers scripts fail with that error at let's say 1% probability. No server load DO NOT seems to be a factor as we monitor. (CPU load, IO load or MySQL load) The maximum DB connections (max_connections) are setted to 200 but we have never dealed with more than 100 simultaneous connections to the database

无论有没有防火墙软件,都会发生这种情况.

It happens with and without the firewall software.

我怀疑是 TCP 网络问题而不是 PHP/MySQL 配置问题.

I suspect TCP Networking problem rather than PHP/MySQL configurationn problem.

谁能告诉我如何找到它?

Can anyone give me clue how to find it?

更新:

连接代码为:

$this->_mysqli = mysqli_init(); 
$this->_mysqli->options(MYSQLI_OPT_CONNECT_TIMEOUT, 120); 
$this->_mysqli->real_connect($dbHost,$dbUserName, $dbPassword, $dbName); 

if (!is_null($this->_mysqli->connect_error)) {
    $ping = $this->_mysqli->ping(); 

    if(!$ping){
        $error = 'HOST: {'.$dbHost.'};MESSAGE: '. $this->_mysqli->connect_error ."
"; 
        DataStoreException::raiseHostUnreachable($error);
    }
} 

推荐答案

MySQL:使用巨量连接

频繁连接有什么危险?
它运作良好,除了一些极端情况.如果您每秒从同一个盒子获得数百个连接,您可能会耗尽本地端口号.修复它的方法可能是 - 在 linux 上减少/proc/sys/net/ipv4/tcp_fin_timeout"(这违反了 TCP/IP 标准,但您可能不在本地网络中),增加/proc/sys/net/客户端上的ipv4/ip_local_port_range".其他操作系统也有类似的设置.您还可以为同一数据库主机使用更多的 Web 框或多个 IP 来解决此问题.我真的在生产中看到了这一点.

What are dangers of frequent connects ?
It works well, with exception of some extreme cases. If you get hundreds of connects per second from the same box you may get into running out of local port numbers. The way to fix it could be - decrease "/proc/sys/net/ipv4/tcp_fin_timeout" on linux (this breaks TCP/IP standard but you might not care in your local network), increase "/proc/sys/net/ipv4/ip_local_port_range" on the client. Other OS have similar settings. You also may use more web boxes or multiple IP for your same database host to work around this problem. I've realy seen this in production.

关于这个问题的一些背景:
TCP/IP 连接由 localip:localport remoteip:remote 端口标识.在这种情况下,我们固定了 MySQL IP 和端口以及客户端 IP,因此我们只能更改具有有限范围的本地端口.注意即使关闭连接后,TCP/IP 堆栈也必须保留端口一段时间,这就是 tcp_fin_timeout 的来源.

Some background about this problem:
TCP/IP connection is identified by localip:localport remoteip:remote port. We have MySQL IP and Port as well as client IP fixed in this case so we can only vary local port which has finite range. Note even after you close connection TCP/IP stack has to keep the port reserved for some time, this is where tcp_fin_timeout comes from.

相关文章