What would cause UDP packets to be dropped when being sent to localhost?
I'm sending very large (64000 bytes) datagrams. I realize that the MTU is much smaller than 64000 bytes (a typical value is around 1500 bytes, from my reading), but I would suspect that one of two things would happen - either no datagrams would make it through (everything greater than 1500 bytes would get silently dropped or cause an error/exception to be thrown) or the 64000 byte datagrams would get chunked into about 43 1500 byte messages and transmitted transparently.
Over a long run (2000+ 64000 byte datagrams), about 1% (which seems abnormally high for even a LAN) of the datagrams get dropped. I might expect this over a network, where datagrams can arrive out of order, get dropped, filtered, and so on. However, I did not expect this when running on localhost.
What is causing the inability to send/receive data locally? I realize UDP is unreliable, but I didn't expect it to be so unreliable on localhost. I'm wondering if it's just a timing issue since both the sending and receiving components are on the same machine.
For completeness, I've included the code to send/receive datagrams.
Sending:
    DatagramSocket socket = new DatagramSocket(senderPort);

    int valueToSend = 0;

    while (valueToSend < valuesToSend || valuesToSend == -1) {
        byte[] intBytes = intToBytes(valueToSend);

        byte[] buffer = new byte[bufferSize - 4];

        // this makes sure that the data is put into an array of the size we want to send
        byte[] bytesToSend = concatAll(intBytes, buffer);

        System.out.println("Sending " + valueToSend + " as " + bytesToSend.length + " bytes");

        DatagramPacket packet = new DatagramPacket(bytesToSend,
                bufferSize, receiverAddress, receiverPort);

        socket.send(packet);

        Thread.sleep(delay);

        valueToSend++;
    }
Receiving:
    DatagramSocket socket = new DatagramSocket(receiverPort);

    while (true) {
        DatagramPacket packet = new DatagramPacket(
                new byte[bufferSize], bufferSize);

        System.out.println("Waiting for datagram...");
        socket.receive(packet);

        int receivedValue = bytesToInt(packet.getData(), 0);

        System.out.println("Received: " + receivedValue
                + ". Expected: " + expectedValue);

        if (receivedValue == expectedValue) {
            receivedDatagrams++;
            totalDatagrams++;
        }
        else {
            droppedDatagrams++;
            totalDatagrams++;
        }

        expectedValue = receivedValue + 1;

        System.out.println("Expected Datagrams: " + totalDatagrams);
        System.out.println("Received Datagrams: " + receivedDatagrams);
        System.out.println("Dropped Datagrams: " + droppedDatagrams);

        System.out.println("Received: "
                + ((double) receivedDatagrams / totalDatagrams));
        System.out.println("Dropped: "
                + ((double) droppedDatagrams / totalDatagrams));
        System.out.println();
    }
Solution
Overview
What is causing the inability to send/receive data locally?
Mostly buffer space. Imagine sending a constant 10MB/second while only able to consume 5MB/second. The operating system and network stack can't keep up, so packets are dropped. (This differs from TCP, which provides flow control and re-transmission to handle such a situation.)
Even when data is consumed without overflowing buffers, there might be small time slices where data cannot be consumed, so the system will drop packets. (Such as during garbage collection, or when the OS task switches to a higher-priority process momentarily, and so forth.)
This applies to all devices in the network stack. A non-local network, an Ethernet switch, router, hub, and other hardware will also drop packets when queues are full. Sending a 10MB/s stream through a 100MB/s Ethernet switch while someone else tries to cram 100MB/s through the same physical line will cause dropped packets.
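To make the buffer argument concrete, here is a minimal sketch of a fixed-size receive buffer, written as a toy model rather than a description of how any real network stack is implemented. The class name, the method name, and the notion of a "tick" are all hypothetical. It covers both cases described above: a reader that is permanently too slow, and a reader that keeps pace on average but stalls briefly (for example during a garbage collection pause).

```java
/** Toy model of a fixed-size receive buffer. All names here are hypothetical. */
class ReceiveBufferModel {

    /**
     * Counts datagrams dropped over `ticks` time steps. In each step,
     * `arrivalsPerTick` datagrams arrive first; then the reader removes up to
     * `readsPerTick` datagrams unless it is stalled. The reader is stalled for
     * `stallLength` steps beginning at step `stallStart`.
     */
    static long countDrops(int capacity, int arrivalsPerTick, int readsPerTick,
                           int ticks, int stallStart, int stallLength) {
        int queued = 0;    // datagrams currently waiting in the buffer
        long dropped = 0;  // arrivals that found the buffer full

        for (int t = 0; t < ticks; t++) {
            for (int i = 0; i < arrivalsPerTick; i++) {
                if (queued < capacity) {
                    queued++;
                } else {
                    dropped++; // no room: the datagram is silently discarded
                }
            }
            boolean stalled = t >= stallStart && t < stallStart + stallLength;
            if (!stalled) {
                queued -= Math.min(queued, readsPerTick);
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        // Sender twice as fast as the reader, no stall.
        System.out.println("sustained overrun drops: " + countDrops(20, 10, 5, 100, 0, 0));
        // Reader keeps pace, but stalls for 5 steps with room for only 2 datagrams.
        System.out.println("short stall drops: " + countDrops(2, 1, 1, 2000, 100, 5));
    }
}
```

The second call is the closer analogy to the question: a buffer that holds only a couple of 64000-byte datagrams loses data during even a short stall, although the reader is otherwise fast enough.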
Increase both the socket's buffer size and the operating system's socket buffer size.
Linux
The default socket buffer size is typically 128k or less, which leaves very little room for pausing the data processing.
sysctl
Use sysctl to increase the transmit (write memory [wmem]) and receive (read memory [rmem]) buffers:
- net.core.wmem_max
- net.core.wmem_default
- net.core.rmem_max
- net.core.rmem_default
For example, to bump the value to 8 megabytes:
sysctl -w net.core.rmem_max=8388608
To make the setting persist, update /etc/sysctl.conf as well, such as:
net.core.rmem_max=8388608
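Raising net.core.rmem_max only lifts the ceiling; unless the default is raised too, each socket still has to request a larger buffer. The following is a rough sketch of how the receiver from the question might do that with the standard DatagramSocket API. The class name and the helper datagramsThatFit are hypothetical, and the helper ignores the kernel's per-packet bookkeeping overhead, so real capacity will be somewhat lower.

```java
import java.net.DatagramSocket;
import java.net.SocketException;

/** Sketch only: requests a larger receive buffer and reports what was granted. */
class ReceiveBufferDemo {

    /** Whole datagrams of the given size that fit in a buffer (ignores kernel overhead). */
    static int datagramsThatFit(int bufferBytes, int datagramBytes) {
        return bufferBytes / datagramBytes;
    }

    public static void main(String[] args) throws SocketException {
        int datagramBytes = 64_000;        // datagram size from the question
        int requested = 8 * 1024 * 1024;   // 8 MiB, matching the sysctl example

        // Binds to an ephemeral port purely for illustration.
        try (DatagramSocket socket = new DatagramSocket()) {
            int before = socket.getReceiveBufferSize();

            // This is a hint: the OS may grant less (on Linux, it is capped by
            // net.core.rmem_max), so always read the value back.
            socket.setReceiveBufferSize(requested);
            int after = socket.getReceiveBufferSize();

            System.out.println("default buffer: " + before + " bytes, holds about "
                    + datagramsThatFit(before, datagramBytes) + " datagrams");
            System.out.println("granted buffer: " + after + " bytes, holds about "
                    + datagramsThatFit(after, datagramBytes) + " datagrams");
        }
    }
}
```

If the granted size is far below the requested size, the operating system limit is probably still in effect and the sysctl change has not been applied.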
An in-depth article on tuning the network stack dives into far more details, touching on multiple levels of how packets are received and processed in Linux from the kernel's network driver through ring buffers all the way to C's recv call. The article describes additional settings and files to monitor when diagnosing network issues. (See below.)
Before making any of the following tweaks, be sure to understand how they affect the network stack. There is a real possibility of rendering your network unusable. Choose numbers appropriate for your system, network configuration, and expected traffic load:
- net.core.rmem_max=8388608
- net.core.rmem_default=8388608
- net.core.wmem_max=8388608
- net.core.wmem_default=8388608
- net.ipv4.udp_mem='262144 327680 434274'
- net.ipv4.udp_rmem_min=16384
- net.ipv4.udp_wmem_min=16384
- net.core.netdev_budget=600
- net.ipv4.ip_early_demux=0
- net.core.netdev_max_backlog=3000
ethtool
Additionally, ethtool is useful to query or change network settings. For example, if ${DEVICE} is eth0 (use ip address or ifconfig to determine your network device name), then it may be possible to increase the RX and TX buffers using:
- ethtool -G ${DEVICE} rx 4096
- ethtool -G ${DEVICE} tx 4096
iptables
When connection tracking is active, iptables keeps state information about packets, which consumes CPU time, albeit minimal. For example, you can exempt UDP packets on port 6004 from connection tracking using:
iptables -t raw -I PREROUTING 1 -p udp --dport 6004 -j NOTRACK
iptables -I INPUT 1 -p udp --dport 6004 -j ACCEPT
Your particular port and protocol will vary.
Monitoring
Several files contain information about what is happening to network packets at various stages of sending and receiving. In the following list, ${IRQ} is the interrupt request number and ${DEVICE} is the network device:
- /proc/cpuinfo - shows number of CPUs available (helpful for IRQ-balancing)
- /proc/irq/${IRQ}/smp_affinity - shows IRQ affinity
- /proc/net/dev - contains general packet statistics
- /sys/class/net/${DEVICE}/queues/QUEUE/rps_cpus - relates to Receive Packet Steering (RPS)
- /proc/softirqs - shows per-CPU softirq counts, including NET_RX and NET_TX
- /proc/net/softnet_stat - for packet statistics, such as drops, time squeezes, CPU collisions, etc.
- /proc/sys/net/core/flow_limit_cpu_bitmap - bitmap of CPUs with flow limiting enabled (can help diagnose drops between large and small flows)
- /proc/net/snmp
- /proc/net/udp
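For UDP specifically, /proc/net/snmp is often the quickest check: its Udp section is a header line of counter names followed by a line of values, and a growing RcvbufErrors count suggests datagrams are being discarded because the socket receive buffer was full. Below is a small sketch that pairs the names with the values. The class and method names are hypothetical, and the parser assumes the two-line layout just described.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: reads the UDP counters from /proc/net/snmp-style text. */
class UdpSnmpCounters {

    /** Pairs the "Udp:" header line with the "Udp:" value line that follows it. */
    static Map<String, Long> parseUdpCounters(List<String> lines) {
        String[] names = null;
        for (String line : lines) {
            if (!line.startsWith("Udp:")) {
                continue; // skips Ip:, Tcp:, UdpLite:, and so on
            }
            String[] fields = line.trim().split("\\s+");
            if (names == null) {
                names = fields; // first Udp: line holds the counter names
                continue;
            }
            Map<String, Long> counters = new LinkedHashMap<>();
            for (int i = 1; i < names.length && i < fields.length; i++) {
                counters.put(names[i], Long.parseLong(fields[i]));
            }
            return counters;
        }
        return new LinkedHashMap<>(); // no Udp section found
    }

    public static void main(String[] args) throws IOException {
        Path snmp = Paths.get("/proc/net/snmp");
        if (!Files.isReadable(snmp)) {
            System.out.println(snmp + " is not available on this system");
            return;
        }
        Map<String, Long> udp = parseUdpCounters(Files.readAllLines(snmp));
        System.out.println("InErrors:     " + udp.get("InErrors"));
        System.out.println("RcvbufErrors: " + udp.get("RcvbufErrors"));
        System.out.println("SndbufErrors: " + udp.get("SndbufErrors"));
    }
}
```

Running it before and after a test run and comparing the counters shows whether the missing datagrams were dropped at the receive buffer.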
Summary
Buffer space is the most likely culprit for dropped packets. There are numerous buffers strewn throughout the network stack, each having its own impact on sending and receiving packets. Network drivers, operating systems, kernel settings, and other factors can affect packet drops. There is no silver bullet.
Further Reading
- https://github.com/leandromoreira/linux-network-performance-parameters
- http://man7.org/linux/man-pages/man7/udp.7.html
- http://www.ethernetresearch.com/geekzone/linux-networking-commands-to-debug-ipudptcp-packet-loss/