How to divide a set of overlapping ranges into non-overlapping ranges?
Problem description
Let's say you have a set of ranges:
- 0 - 100: 'a'
- 0 - 75: 'b'
- 95 - 150: 'c'
- 120 - 130: 'd'
Obviously, these ranges overlap at certain points. How would you dissect these ranges to produce a list of non-overlapping ranges, while retaining the information associated with their original range (in this case, the letter after the range)?
For example, the result of running the algorithm on the ranges above would be:
- 0 - 75: 'a', 'b'
- 76 - 94: 'a'
- 95 - 100: 'a', 'c'
- 101 - 119: 'c'
- 120 - 130: 'c', 'd'
- 131 - 150: 'c'
Solution
I had the same question when writing a program to mix (partly overlapping) audio samples.
What I did was add a "start event" and a "stop event" (for each item) to a list, sort the list by time point, and then process it in order. You could do the same, except using an integer point instead of a time, and instead of mixing sounds you'd be adding symbols to, and removing them from, the set corresponding to the current range. Each time an event moves you to a new point, the span since the previous point becomes one output range, labelled with the symbols active at that moment. Whether you generate empty ranges or just omit them is optional.
Edit
Perhaps some code...
# input_ranges = list of (start, stop, symbol) tuples, both endpoints inclusive
input_ranges = [(0, 100, 'a'), (0, 75, 'b'), (95, 150, 'c'), (120, 130, 'd')]

points = []  # list of (offset, plus/minus, symbol) tuples
for start, stop, symbol in input_ranges:
    points.append((start, '+', symbol))
    # stop is inclusive, so the symbol stops being active at stop + 1
    points.append((stop + 1, '-', symbol))
points.sort()

ranges = []  # output list of (start, stop, symbol_set) tuples
current_set = set()
last_start = None
for offset, pm, symbol in points:
    # Emit the range collected so far, skipping empty and zero-length ones
    if current_set and offset > last_start:
        # Copy the set: current_set keeps being mutated below
        ranges.append((last_start, offset - 1, set(current_set)))
    if pm == '+':
        current_set.add(symbol)
    else:
        current_set.remove(symbol)
    last_start = offset
Obviously, totally untested.
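For what it's worth, here is a minimal sketch of the start/stop event idea wrapped in a function and run against the example from the question. The function name `split_ranges` and the choice to return sorted symbol lists are my own assumptions for illustration. It also assumes integer ranges with inclusive endpoints, symbols that can be sorted, and that a symbol never overlaps itself.

```python
from collections import defaultdict


def split_ranges(ranges):
    """Split inclusive (start, stop, symbol) ranges into non-overlapping pieces."""
    # Map each boundary offset to the symbols that open or close there.
    opens = defaultdict(set)
    closes = defaultdict(set)
    for start, stop, symbol in ranges:
        opens[start].add(symbol)
        # stop is inclusive, so the symbol stops being active at stop + 1
        closes[stop + 1].add(symbol)

    result = []
    active = set()
    previous = None
    for offset in sorted(set(opens) | set(closes)):
        if active:  # stretches covered by no range are omitted
            result.append((previous, offset - 1, sorted(active)))
        # Close before opening, so a symbol that ends and restarts
        # at the same offset stays active.
        active -= closes[offset]
        active |= opens[offset]
        previous = offset
    return result


example = [(0, 100, 'a'), (0, 75, 'b'), (95, 150, 'c'), (120, 130, 'd')]
for start, stop, symbols in split_ranges(example):
    print(start, '-', stop, symbols)
```

Grouping the events by offset means several ranges starting or ending at the same point produce a single boundary, so no zero-length ranges are emitted.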