What would cause UDP packets to be dropped when being sent to localhost?

2022-01-19 00:00:00 network-programming udp java

Further Reading

https://github.com/leandromoreira/linux-network-performance-parameters
http://man7.org/linux/man-pages/man7/udp.7.html
http://www.ethernetresearch.com/geekzone/linux-networking-commands-to-debug-ipudptcp-packet-loss/

I'm sending very large (64000-byte) datagrams. I realize that the MTU is much smaller than 64000 bytes (a typical value is around 1500 bytes, from my reading), so I suspected that one of two things would happen: either no datagrams would make it through (everything larger than 1500 bytes would be silently dropped or cause an error/exception to be thrown), or each 64000-byte datagram would be chunked into about 43 messages of 1500 bytes each and transmitted transparently.

Over a long run (more than 2000 datagrams of 64000 bytes each), about 1% of the datagrams get dropped, which seems abnormally high even for a LAN. I might expect this over a network, where datagrams can arrive out of order, get dropped, filtered, and so on. However, I did not expect this when running on localhost.

What is causing the inability to send/receive data locally? I realize UDP is unreliable, but I didn't expect it to be so unreliable on localhost. I'm wondering if it's just a timing issue since both the sending and receiving components are on the same machine.

For completeness, I've included the code to send/receive datagrams.

Sending:

DatagramSocket socket = new DatagramSocket(senderPort);

int valueToSend = 0;

while (valueToSend < valuesToSend || valuesToSend == -1) {
    byte[] intBytes = intToBytes(valueToSend);

    byte[] buffer = new byte[bufferSize - 4];

    // This makes sure that the data is put into an array of the size we want to send
    byte[] bytesToSend = concatAll(intBytes, buffer);

    System.out.println("Sending " + valueToSend + " as " + bytesToSend.length + " bytes");

    DatagramPacket packet = new DatagramPacket(bytesToSend,
                        bufferSize, receiverAddress, receiverPort);

    socket.send(packet);

    Thread.sleep(delay);

    valueToSend++;
}

Receiving:

DatagramSocket socket = new DatagramSocket(receiverPort);

while (true) {
    DatagramPacket packet = new DatagramPacket(
            new byte[bufferSize], bufferSize);

    System.out.println("Waiting for datagram...");
    socket.receive(packet);

    int receivedValue = bytesToInt(packet.getData(), 0);

    System.out.println("Received: " + receivedValue
            + ". Expected: " + expectedValue);

    if (receivedValue == expectedValue) {
        receivedDatagrams++;
        totalDatagrams++;
    }
    else {
        droppedDatagrams++;
        totalDatagrams++;
    }

    expectedValue = receivedValue + 1;
    System.out.println("Expected Datagrams: " + totalDatagrams);
    System.out.println("Received Datagrams: " + receivedDatagrams);
    System.out.println("Dropped Datagrams: " + droppedDatagrams);
    System.out.println("Received: "
            + ((double) receivedDatagrams / totalDatagrams));
    System.out.println("Dropped: "
            + ((double) droppedDatagrams / totalDatagrams));
    System.out.println();
}

Solution

Overview

What is causing the inability to send/receive data locally?

Mostly buffer space. Imagine sending a constant 10MB/second while only able to consume 5MB/second. The operating system and network stack can't keep up, so packets are dropped. (This differs from TCP, which provides flow control and re-transmission to handle such a situation.)

Even when data is consumed without overflowing buffers, there might be small time slices where data cannot be consumed, so the system will drop packets. (Such as during garbage collection, or when the OS task switches to a higher-priority process momentarily, and so forth.)
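One common way to keep slow processing from stalling the socket is to dedicate a thread to draining it into an in-memory queue. The sketch below illustrates that idea only; it is not from the original answer, and the names DrainingReceiver and startDraining are hypothetical. It does not help during stop-the-world garbage collection, which pauses the draining thread as well, and the unbounded queue simply moves the buffering into the JVM heap.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class DrainingReceiver {
    /** Starts a daemon thread that only moves datagrams from the socket into the queue. */
    static Thread startDraining(DatagramSocket socket, BlockingQueue<byte[]> queue, int maxDatagramBytes) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[maxDatagramBytes];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            while (!socket.isClosed()) {
                try {
                    packet.setLength(buf.length); // reset the length before reusing the packet
                    socket.receive(packet);
                    // Copy only the bytes that arrived so the buffer can be reused immediately.
                    queue.put(Arrays.copyOf(buf, packet.getLength()));
                } catch (IOException e) {
                    return; // socket closed or failed: stop draining
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "udp-drain");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
        try (DatagramSocket receiver = new DatagramSocket(0, InetAddress.getLoopbackAddress());
             DatagramSocket sender = new DatagramSocket()) {
            startDraining(receiver, queue, 65507); // 65507 bytes: largest UDP payload over IPv4
            byte[] payload = new byte[1024];
            sender.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getLoopbackAddress(), receiver.getLocalPort()));
            // Slow processing would happen here, on a different thread from the one reading the socket.
            byte[] got = queue.poll(2, TimeUnit.SECONDS);
            System.out.println(got == null ? "nothing received" : "received " + got.length + " bytes");
        }
    }
}
```

A bounded queue would be the alternative design choice: it caps memory use, at the cost of the draining thread blocking (and the kernel buffer filling again) once the queue is full.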

This applies to all devices in the network stack. A non-local network, an Ethernet switch, router, hub, and other hardware will also drop packets when queues are full. Sending a 10MB/s stream through a 100MB/s Ethernet switch while someone else tries to cram 100MB/s through the same physical line will cause dropped packets.

Increase both the application's socket buffer size and the operating system's limits on socket buffer size.
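On the application side, Java exposes the receive buffer size through DatagramSocket.setReceiveBufferSize and getReceiveBufferSize. The following is a minimal sketch; the class and method names (ReceiveBufferSetup, requestReceiveBuffer) and the 8 MiB figure are illustrative assumptions. The requested size is only a hint, and the OS may grant a different value, so the sketch reads the size back after setting it.

```java
import java.net.DatagramSocket;
import java.net.SocketException;

final class ReceiveBufferSetup {
    /** Requests a receive buffer size and returns the size the OS reports afterwards. */
    static int requestReceiveBuffer(DatagramSocket socket, int requestedBytes) throws SocketException {
        socket.setReceiveBufferSize(requestedBytes); // a hint; the OS may clamp or adjust it
        return socket.getReceiveBufferSize();
    }

    public static void main(String[] args) throws Exception {
        try (DatagramSocket socket = new DatagramSocket(0)) { // port 0: any free port
            int before = socket.getReceiveBufferSize();
            int granted = requestReceiveBuffer(socket, 8 * 1024 * 1024);
            System.out.println("Receive buffer before: " + before + " bytes, after: " + granted + " bytes");
        }
    }
}
```

If the reported size is far below the request, the operating system's maximum is the likely limit, which is why both sides need to be raised.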

Linux

The default socket buffer size is typically 128k or less, which leaves very little room for pausing the data processing.
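To put that figure in perspective, the sketch below works through the arithmetic for the 64000-byte datagrams from the question. It assumes "128k" means 128 KiB and "10MB/second" means 10,000,000 bytes per second, and it ignores the kernel's per-packet bookkeeping overhead, so real headroom is somewhat smaller. The class and method names are hypothetical.

```java
final class BufferHeadroom {
    /** Whole datagrams of the given size that fit in a buffer (ignores per-packet overhead). */
    static long datagramsThatFit(long bufferBytes, long datagramBytes) {
        return bufferBytes / datagramBytes;
    }

    /** Milliseconds until the buffer overflows when data arrives faster than it is consumed. */
    static double millisUntilOverflow(long bufferBytes, double inBytesPerSec, double outBytesPerSec) {
        if (inBytesPerSec <= outBytesPerSec) {
            return Double.POSITIVE_INFINITY; // the consumer keeps up, so the buffer never fills
        }
        return 1000.0 * bufferBytes / (inBytesPerSec - outBytesPerSec);
    }

    public static void main(String[] args) {
        long small = 128L * 1024;      // 128 KiB
        long large = 8L * 1024 * 1024; // 8 MiB
        System.out.println(datagramsThatFit(small, 64000)); // prints 2
        System.out.println(datagramsThatFit(large, 64000)); // prints 131
        // Sending 10 MB/s while consuming 5 MB/s fills a 128 KiB buffer in roughly 26 ms.
        System.out.println(millisUntilOverflow(small, 10e6, 5e6));
    }
}
```

With room for only two such datagrams, a receiver that pauses for longer than two send intervals will lose data.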

sysctl

Use sysctl to increase the transmit (write memory [wmem]) and receive (read memory [rmem]) buffers:

net.core.wmem_max
net.core.wmem_default
net.core.rmem_max
net.core.rmem_default

For example, to bump the value to 8 megabytes:

sysctl -w net.core.rmem_max=8388608

To make the setting persist, update /etc/sysctl.conf as well, such as:

net.core.rmem_max=8388608
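To check from an application what the current limits are, the values can be read from /proc/sys, where each sysctl key maps to a file (net.core.rmem_max is /proc/sys/net/core/rmem_max). This is a Linux-only sketch; SysctlReader, pathFor and readLong are hypothetical names, and the simple dot-to-slash mapping is an assumption that holds for the keys listed here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SysctlReader {
    /** Maps a sysctl key such as "net.core.rmem_max" to its file under the given root. */
    static Path pathFor(Path procSysRoot, String key) {
        return procSysRoot.resolve(key.replace('.', '/'));
    }

    /** Reads a single-number sysctl value. */
    static long readLong(Path procSysRoot, String key) throws IOException {
        byte[] raw = Files.readAllBytes(pathFor(procSysRoot, key));
        return Long.parseLong(new String(raw, StandardCharsets.US_ASCII).trim());
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get("/proc/sys"); // exists on Linux only
        String[] keys = {
            "net.core.rmem_max", "net.core.rmem_default",
            "net.core.wmem_max", "net.core.wmem_default"
        };
        for (String key : keys) {
            System.out.println(key + " = " + readLong(root, key));
        }
    }
}
```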

An in-depth article on tuning the network stack goes into far more detail, covering the multiple levels at which packets are received and processed in Linux, from the kernel's network driver through ring buffers all the way to C's recv call. The article describes additional settings and files to monitor when diagnosing network issues. (See Further Reading.)

Before making any of the following tweaks, be sure to understand how they affect the network stack. There is a real possibility of rendering your network unusable. Choose numbers appropriate for your system, network configuration, and expected traffic load:

net.core.rmem_max=8388608
net.core.rmem_default=8388608
net.core.wmem_max=8388608
net.core.wmem_default=8388608
net.ipv4.udp_mem='262144 327680 434274'
net.ipv4.udp_rmem_min=16384
net.ipv4.udp_wmem_min=16384
net.core.netdev_budget=600
net.ipv4.ip_early_demux=0
net.core.netdev_max_backlog=3000

ethtool

Additionally, ethtool is useful to query or change network settings. For example, if ${DEVICE} is eth0 (use ip address or ifconfig to determine your network device name), then it may be possible to increase the RX and TX buffers using:

ethtool -G ${DEVICE} rx 4096
ethtool -G ${DEVICE} tx 4096

iptables

iptables rules and connection tracking inspect each packet, which consumes CPU time, albeit minimal. For example, you can exempt UDP packets on port 6004 from connection tracking (NOTRACK) and accept them early using:

iptables -t raw -I PREROUTING 1 -p udp --dport 6004 -j NOTRACK
iptables -I INPUT 1 -p udp --dport 6004 -j ACCEPT

Your particular port and protocol will vary.

Monitoring

Several files contain information about what is happening to network packets at various stages of sending and receiving. In the following list ${IRQ} is the interrupt request number and ${DEVICE} is the network device:

/proc/cpuinfo - shows the number of CPUs available (helpful for IRQ balancing)
/proc/irq/${IRQ}/smp_affinity - shows IRQ affinity
/proc/net/dev - contains general packet statistics
/sys/class/net/${DEVICE}/queues/QUEUE/rps_cpus - relates to Receive Packet Steering (RPS)
/proc/softirqs - shows per-CPU softirq counts, including NET_RX and NET_TX
/proc/net/softnet_stat - for packet statistics, such as drops, time squeezes, CPU collisions, etc.
/proc/sys/net/core/flow_limit_cpu_bitmap - bitmap of CPUs with flow limits enabled (can help diagnose drops between large and small flows)
/proc/net/snmp - per-protocol counters, including UDP InErrors and RcvbufErrors
/proc/net/udp - lists open UDP sockets with their queue sizes and a per-socket drops count
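As an illustration of reading one of these files, /proc/net/snmp stores each protocol as a pair of lines with the same prefix: a header line of counter names followed by a line of values. The sketch below extracts the UDP counters, such as RcvbufErrors (datagrams dropped because the socket's receive buffer was full). The class and method names are hypothetical, and the sample layout in the comments is abbreviated.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UdpCounters {
    /**
     * Parses the "Udp:" header/value line pair, for example:
     *   Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
     *   Udp: 2000 3 20 2020 20 0
     */
    static Map<String, Long> parseUdpCounters(List<String> lines) {
        String[] names = null;
        for (String line : lines) {
            if (!line.startsWith("Udp:")) {
                continue; // also skips "UdpLite:", whose fourth character is 'L'
            }
            String[] fields = line.substring(4).trim().split("\\s+");
            if (names == null) {
                names = fields; // first matching line holds the counter names
                continue;
            }
            Map<String, Long> counters = new LinkedHashMap<>();
            for (int i = 0; i < Math.min(names.length, fields.length); i++) {
                counters.put(names[i], Long.parseLong(fields[i]));
            }
            return counters;
        }
        return Collections.emptyMap();
    }

    public static void main(String[] args) throws IOException {
        // Linux only: this file does not exist on other operating systems.
        Map<String, Long> udp = parseUdpCounters(Files.readAllLines(Paths.get("/proc/net/snmp")));
        System.out.println("InErrors     = " + udp.get("InErrors"));
        System.out.println("RcvbufErrors = " + udp.get("RcvbufErrors"));
    }
}
```

Sampling these counters before and after a test run shows whether the missing datagrams were dropped at the receive buffer rather than elsewhere in the stack.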

Summary

Buffer space is the most likely culprit for dropped packets. There are numerous buffers strewn throughout the network stack, each having its own impact on sending and receiving packets. Network drivers, operating systems, kernel settings, and other factors can affect packet drops. There is no silver bullet.
