Error correcting codes for packet loss (UDP)
I have no real idea what to look for, since all I get with "error correcting codes" is stuff related to cases where you don't know the location of the error. Thus those codes are much more complicated and inefficient than I need them to be.
In the following, note that a bit stands for a packet (only a whole packet can be missing, so the bit analogy fits very well).
Are there ECCs that take into account that you already know WHICH k bits are missing, and only provide a way to reconstruct the datastream in those k places? Additionally, the bits added by the ECC should preferably be independent of one another, so that if packet loss occurs inside the ECC portion of the data, the remaining ECC bits can still reconstruct some of the original data (there will not always be k errors; mostly there will be none, so it is important that the code tolerates loss of its own added ECC bits).
That is a big difference IMO. For one missing bit it's simple, I can just use one XOR bit. But I am not clever enough to generalize it to n bits.
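For illustration, here is a minimal sketch of that one-missing-packet case (in Python; the packet layout and helper names are just my own assumptions for the example): a single XOR parity packet lets you rebuild any one packet whose index you know.

```python
# Minimal sketch: one XOR parity packet repairs exactly one known erasure.

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(packets):
    """Parity packet = XOR of all (equal-length) data packets."""
    parity = bytes(len(packets[0]))
    for p in packets:
        parity = xor_bytes(parity, p)
    return parity

def recover_one(packets, parity):
    """Rebuild the single packet received as None (its index is known)."""
    missing = [i for i, p in enumerate(packets) if p is None]
    assert len(missing) == 1, "plain XOR parity only repairs one erasure"
    rebuilt = parity
    for p in packets:
        if p is not None:
            rebuilt = xor_bytes(rebuilt, p)
    packets[missing[0]] = rebuilt
    return packets

# Example: 4 packets, packet 2 is dropped in transit but we know its index.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = make_parity(data)
received = [b"AAAA", b"BBBB", None, b"DDDD"]
print(recover_one(received, parity))   # packet 2 comes back as b'CCCC'
```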
So again, I have a stream of n bits, and I know that up to k bits are missing (I really do know exactly which ones, and that they are missing; corruption is not possible). Now I need a coding scheme that can reconstruct them with as little overhead added to the datastream as possible. I am dreaming of having (n+k) bits to correct k random erased bits in an n-bit stream :). On top of that, ideally, if some of those k ECC bits added to the n-bit data stream are themselves lost, say c of the k bits, then it should still be able to reconstruct (k-c) erasures in the n-bit stream.
Please note, of course, that I do not know the error positions in advance, only after the loss has occurred.
Example:
One algorithm I can think of is this. Take a datastream of n bits that is to be protected against losses.
Let p be the smallest number that is relatively prime to n. Then iterate through the datastream with i = (p * j) mod n, incrementing j, and XOR together the substream obtained by selecting the bit at position i for every even j. This substream has n/2 elements. After iterating, we have obtained a parity over n/2 of the elements. We can obtain another parity over the other half in the same way (taking the odd j).
This yields a 50% error reduction for 2-bit losses: both lost bits can be repaired whenever they happen to fall into different parity halves, which is roughly half the time.
The bright side is that we can now get arbitrarily better. Just take the next larger number relatively prime to n and do the same again. Now we are at a 25% error chance. Basically, we can cut the residual error chance in half each time we add two additional parity bits.
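Here is a small runnable sketch of the scheme as I have described it (the 0/1-list representation and all helper names are just illustrative assumptions): two parities, each covering one interleaved half of the stream, and a lost bit is repairable whenever it is the only loss inside its half.

```python
from math import gcd

def smallest_coprime(n):
    """Smallest p > 1 with gcd(p, n) = 1."""
    p = 2
    while gcd(p, n) != 1:
        p += 1
    return p

def encode(bits):
    """Two parity bits over the stream, visited with stride p coprime to n."""
    n = len(bits)                      # assume n is even
    p = smallest_coprime(n)
    halves = [[], []]                  # positions covered by each parity
    for j in range(n):
        halves[j % 2].append((p * j) % n)
    parities = [0, 0]
    for h in (0, 1):
        for i in halves[h]:
            parities[h] ^= bits[i]
    return parities, halves

def repair(bits, erased, parities, halves):
    """bits has None at known-erased positions; fix what the parities allow."""
    for h in (0, 1):
        lost = [i for i in halves[h] if i in erased]
        if len(lost) == 1:             # exactly one loss in this half: fixable
            x = parities[h]
            for i in halves[h]:
                if i not in erased:
                    x ^= bits[i]
            bits[lost[0]] = x
    return bits

data = [1, 0, 1, 1, 0, 1, 0, 0]
parities, halves = encode(data)
rx = data.copy()
rx[3] = rx[6] = None                   # two known erasures
print(repair(rx, {3, 6}, parities, halves) == data)   # True
```

With n = 8 the stride is p = 3, so the two halves are the positions {0, 6, 4, 2} and {3, 1, 7, 5}; the erasures at 3 and 6 land in different halves and both come back.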
Answer
You need an erasure code (not an error detection code). Error detection is taken care of by the link and transport layer. Since you are trying to mitigate UDP packet loss, you already know which parts are missing -- the dropped packet is missing.
There is no such thing as erasures or errors on a byte (or bit) level, at least not with any reasonable likelihood (there are at least two underlying layers of protocols, sometimes three, each with a checksum, which makes sure of that). You either receive a complete, intact datagram, or you don't. Never anything in between.
Cauchy Reed-Solomon codes are one class of algorithms you may consider; these transform k blocks of data of some length into k+m blocks, and allow restoration of the original data given at most m erasures. A special case of this kind of algorithm is parity, for which both encoding and decoding are a simple XOR operation and m = 1. This is the very algorithm used in RAID-5, which was mentioned in a comment above.
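To make the k → k+m shape concrete, here is a rough self-contained sketch of a (non-systematic) Vandermonde-style Reed-Solomon erasure code over GF(256). This is only an illustration under my own naming, not the Cauchy construction and not how any particular library implements it: k equal-length blocks go in, k+m coded blocks come out, and any k surviving blocks reproduce the originals.

```python
from functools import reduce
from operator import xor

# --- GF(256) arithmetic, using the common 0x11D reduction polynomial ---
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]

def gmul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gpow(a, e):
    return 1 if e == 0 else EXP[(LOG[a] * e) % 255]

def ginv(a):
    return EXP[255 - LOG[a]]

def mat_inv(m):
    """Gauss-Jordan inversion of a k x k matrix over GF(256)."""
    k = len(m)
    a = [row[:] + [1 if r == c else 0 for c in range(k)] for r, row in enumerate(m)]
    for col in range(k):
        piv = next(r for r in range(col, k) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        inv_p = ginv(a[col][col])
        a[col] = [gmul(v, inv_p) for v in a[col]]
        for r in range(k):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [a[r][c] ^ gmul(f, a[col][c]) for c in range(2 * k)]
    return [row[k:] for row in a]

def vrow(i, k):
    """Row of the Vandermonde generator matrix for evaluation point i + 1."""
    return [gpow(i + 1, j) for j in range(k)]

def encode(blocks, m):
    """k equal-length data blocks -> k + m coded blocks."""
    k, blen = len(blocks), len(blocks[0])
    return [bytes(reduce(xor, (gmul(vrow(i, k)[c], blocks[c][b]) for c in range(k)))
                  for b in range(blen))
            for i in range(k + m)]

def decode(received, k):
    """received: {coded-block index: bytes}; any k entries recover the data."""
    idx = sorted(received)[:k]
    inv = mat_inv([vrow(i, k) for i in idx])
    blen = len(received[idx[0]])
    return [bytes(reduce(xor, (gmul(inv[r][c], received[idx[c]][b]) for c in range(k)))
                  for b in range(blen))
            for r in range(k)]

# Example: k = 4 data packets plus m = 2 coded packets; any 2 losses survive.
data = [b"ABCD", b"EFGH", b"IJKL", b"MNOP"]
coded = encode(data, 2)
survivors = {i: blk for i, blk in enumerate(coded) if i not in (1, 4)}
print(decode(survivors, 4) == data)    # True
```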
In a word, you want longhair.
As an alternative, if you have a lot of data to transmit to several parties and you want to be fancy, you can consider fountain codes. These are much more complicated (and thus slower) and less bit-efficient, but they allow you to create an arbitrary number of packets, of which any k will reconstruct the original k-length message. This can save a lot of bandwidth if you are able to multicast to many clients who all want the same large set of data but don't necessarily start downloading at the same time.