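I am looking for a good way to flatten (split) a list of potentially overlapping numeric ranges. The problem is very similar to that of this question: Fastest way to split overlapping date ranges, and to many others.
However, the ranges are not only integers, and I am looking for a decent algorithm that can be easily implemented in JavaScript, Python, or similar.
Example data:
Example solution:
Apologies if this is a duplicate, but I have yet to find a solution.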
Answers:
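Walk from left to right, using a stack to keep track of which color you are on. Instead of a discrete map, use the 10 numbers in your dataset as the break-points.
Start with an empty stack and set start to 0, then loop until you reach the end:
- If the stack is empty: look for the first color that begins at or after start, and push it and all lower-ranked colors onto the stack. In the flattened list, mark the beginning of that color.
- Otherwise: find the next starting point of any higher-ranked color at or after start, and find the end of the current color.
  - If the higher-ranked color starts first, close the current color at that point, push the new color onto the stack, and mark its beginning in the flattened list.
  - If the current color ends first, set start to the end of that color, pop it off the stack, and check the next-highest-ranked color on the stack. If start is within that color's range, add that color to the flattened list, beginning at start.
Given your example data, here is a run-through: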
# Initial data.
flattened = []
stack = []
start = 0
# Stack is empty. Look for the next starting point at 0 or later: "b", 0 - Push it and all lower levels onto stack
flattened = [ (b, 0, ?) ]
stack = [ r, b ]
start = 0
# End of "b" is 5.4, next higher-colored start is "g" at 2 - Delimit and continue
flattened = [ (b, 0, 2), (g, 2, ?) ]
stack = [ r, b, g ]
start = 2
# End of "g" is 12, next higher-colored start is "y" at 3.5 - Delimit and continue
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, ?) ]
stack = [ r, b, g, y ]
start = 3.5
# End of "y" is 6.7, next higher-colored start is "o" at 6.7 - Delimit and continue
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, ?) ]
stack = [ r, b, g, y, o ]
start = 6.7
# End of "o" is 10, and there is nothing starting at 10 or later in a higher color. Next off stack, "y", has already ended. Next off stack, "g", has not ended. Delimit and continue.
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, ?) ]
stack = [ r, b, g ]
start = 10
# End of "g" is 12, there is nothing starting at 12 or later in a higher color. Next off stack, "b", is out of range (already ended). Next off stack, "r", is out of range (not started). Mark end of current color:
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, 12) ]
stack = []
start = 12
# Stack is empty. Look for the next starting point at 12 or later: "r", 12.5 - Push onto stack
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, 12), (r, 12.5, ?) ]
stack = [ r ]
start = 12.5
# End of "r" is 13.8, and there is nothing starting at 12.5 or later in a higher color. Mark end and pop off stack.
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, 12), (r, 12.5, 13.8) ]
stack = []
start = 13.8
# Stack is empty and nothing is past 13.8 - We're done.
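For comparison, here is a minimal Python sketch that reproduces the run-through above. It is not the stack algorithm itself: it swaps in a simpler break-point sweep that, for each interval between two consecutive break-points, picks the highest-ranked color covering it. The function name `flatten_by_breakpoints` and the `(name, start, end)` tuple layout are assumptions made for illustration, and the inner scan makes it quadratic in the number of ranges.

```python
def flatten_by_breakpoints(spans):
    """spans: list of (name, start, end), ordered lowest rank first (assumed layout)."""
    # Every start and end value is a potential boundary in the output.
    points = sorted({p for _, s, e in spans for p in (s, e)})
    flattened = []
    for left, right in zip(points, points[1:]):
        # The last span in the list that covers [left, right] has the highest rank.
        top = None
        for name, s, e in spans:
            if s <= left and right <= e:
                top = name
        if top is None:
            continue  # gap: no span covers this interval
        if flattened and flattened[-1][0] == top and flattened[-1][2] == left:
            # The same color continues, so extend the previous piece.
            flattened[-1] = (top, flattened[-1][1], right)
        else:
            flattened.append((top, left, right))
    return flattened


spans = [("r", 12.5, 13.8), ("b", 0.0, 5.4), ("g", 2.0, 12.0),
         ("y", 3.5, 6.7), ("o", 6.7, 10.0)]
result = flatten_by_breakpoints(spans)
print(result)
```

Because the output boundaries are copied straight from the input values, no floating-point arithmetic is involved and the ranges do not need to be integers.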
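This solution seems to be the simplest (or at least the easiest to grasp).
All that is needed is a function that subtracts one range from another. In other words, something that gives this:
A  ------             A      ------         A     ----
B     -------   and   B  ------       and   B  ---------
=        ----         =  ----               =  ---    --
That is simple enough. Then you can iterate through the ranges, starting from the lowest, and subtract from each one, in turn, all of the ranges above it. And there you have it.
Here is an implementation of the range subtractor in Python: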
def subtractRanges(A, B):
    '''SUBTRACTS A FROM B'''
    # e.g., A =   ------
    #       B = -----------
    #  result = --      ---
    # Returns list of new range(s)
    As, Ae = A  # Python 3 removed tuple parameters, so unpack explicitly
    Bs, Be = B
    if As > Be or Bs > Ae:  # No overlap: all of B visible
        return [[Bs, Be]]
    result = []
    if As > Bs:  # Beginning of B visible
        result.append([Bs, As])
    if Ae < Be:  # End of B visible
        result.append([Ae, Be])
    return result

Using this function, the rest can be done like this ("span" means a range here, because range is the name of a Python built-in):

spans = [["red", [12.5, 13.8]],
         ["blue", [0.0, 5.4]],
         ["green", [2.0, 12.0]],
         ["yellow", [3.5, 6.7]],
         ["orange", [6.7, 10.0]]]

i = 0  # Start at lowest span
while i < len(spans):
    for superior in spans[i+1:]:  # Iterate through all spans above
        result = subtractRanges(superior[1], spans[i][1])
        if not result:      # If span is completely covered
            del spans[i]    # Remove it from list
            i -= 1          # Compensate for list shifting
            break           # Skip to next span
        else:  # If there is at least one resulting span
            spans[i][1] = result[0]
            if len(result) > 1:  # If there are two resulting spans
                # Insert another span with the same name
                spans.insert(i+1, [spans[i][0], result[1]])
    i += 1

print(spans)
This gives [['red', [12.5, 13.8]], ['blue', [0.0, 2.0]], ['green', [2.0, 3.5]], ['green', [10.0, 12.0]], ['yellow', [3.5, 6.7]], ['orange', [6.7, 10.0]]], which is correct.
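As a variation on the same subtraction idea, the following sketch avoids modifying the list while looping over it. It keeps a list of still-visible pieces for each span and returns the result sorted by start position. The names `subtract_range` and `flatten_by_subtraction` and the `(name, (start, end))` layout are assumptions made for illustration, not part of the code above.

```python
def subtract_range(a, b):
    """Return the parts of range b not covered by range a (0, 1 or 2 pieces)."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= b_end or b_start >= a_end:
        return [(b_start, b_end)]  # no overlap: all of b stays visible
    pieces = []
    if a_start > b_start:
        pieces.append((b_start, a_start))  # beginning of b visible
    if a_end < b_end:
        pieces.append((a_end, b_end))      # end of b visible
    return pieces


def flatten_by_subtraction(spans):
    """spans: list of (name, (start, end)), ordered lowest rank first (assumed layout)."""
    flattened = []
    for i, (name, rng) in enumerate(spans):
        visible = [rng]
        # Cut every higher-ranked span out of whatever is still visible.
        for _, superior in spans[i + 1:]:
            visible = [piece for v in visible
                       for piece in subtract_range(superior, v)]
        flattened.extend((name, piece) for piece in visible)
    return sorted(flattened, key=lambda item: item[1][0])


spans = [("red", (12.5, 13.8)), ("blue", (0.0, 5.4)), ("green", (2.0, 12.0)),
         ("yellow", (3.5, 6.7)), ("orange", (6.7, 10.0))]
result = flatten_by_subtraction(spans)
print(result)
```

Tracking a list of pieces per span means a range that gets split several times is handled without inserting new entries into the input list.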
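If the range of the data really is similar to your sample data, you could create a map like this:
map = [0 .. 150]
for each color:
    for loc range start * 10 to range finish * 10:
        map[loc] = color
Then just walk through this map to generate the ranges:
curcolor = none
for loc in map:
    if map[loc] != curcolor:
        if curcolor:
            rangeend = loc / 10
        make new range
        rangecolor = map[loc]
        rangestart = loc / 10
        curcolor = map[loc]
For this to work, the values have to lie in a relatively small range, as they do in your sample data.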
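A rough Python sketch of the map idea follows. It assumes every value is a non-negative multiple of 0.1 (hence the default `scale=10`) and that the spans are listed lowest rank first, so higher-ranked colors overwrite lower ones. The function name `flatten_with_map` and the `(name, start, end)` layout are hypothetical.

```python
def flatten_with_map(spans, scale=10):
    """spans: list of (name, start, end), lowest rank first (assumed layout)."""
    size = max(round(end * scale) for _, _, end in spans)
    cells = [None] * size
    # Paint each span onto the map; later (higher-ranked) spans overwrite earlier ones.
    for name, start, end in spans:
        for loc in range(round(start * scale), round(end * scale)):
            cells[loc] = name
    flattened = []
    current, run_start = None, 0
    # The trailing None acts as a sentinel that closes the last run.
    for loc, color in enumerate(cells + [None]):
        if color != current:
            if current is not None:
                flattened.append((current, run_start / scale, loc / scale))
            current, run_start = color, loc
    return flattened


spans = [("red", 12.5, 13.8), ("blue", 0.0, 5.4), ("green", 2.0, 12.0),
         ("yellow", 3.5, 6.7), ("orange", 6.7, 10.0)]
result = flatten_with_map(spans)
print(result)
```

The memory and time cost grow with the size of the map rather than with the number of ranges, which is why this only suits data in a small range at a coarse resolution.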
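Edit: To work with true floats, use the map to generate a high-level mapping, then refer back to the original data to create the boundaries.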
map = [0 .. 15]
for each color:
    for loc round(range start) to round(range finish):
        map[loc] = color
curcolor = none
for loc in map:
    if map[loc] != curcolor:
        make new range
        if loc = round(range[map[loc]].start):
            rangestart = range[map[loc]].start
        else:
            rangestart = previous rangeend
        rangecolor = map[loc]
        if curcolor:
            if map[loc] == none:
                last rangeend = range[curcolor].end
            else:
                last rangeend = rangestart
        curcolor = rangecolor
Here is a relatively simple solution in Scala. It shouldn't be too hard to port to another language.
case class Range(name: String, left: Double, right: Double) {
  def overlapsLeft(other: Range) =
    other.left < left && left < other.right
  def overlapsRight(other: Range) =
    other.left < right && right < other.right
  def overlapsCompletely(other: Range) =
    left <= other.left && right >= other.right
  def splitLeft(other: Range) =
    Range(other.name, other.left, left)
  def splitRight(other: Range) =
    Range(other.name, right, other.right)
}

def apply(ranges: Set[Range], newRange: Range) = {
  val left = ranges.filter(newRange.overlapsLeft)
  val right = ranges.filter(newRange.overlapsRight)
  val overlaps = ranges.filter(newRange.overlapsCompletely)
  val leftSplit = left.map(newRange.splitLeft)
  val rightSplit = right.map(newRange.splitRight)
  ranges -- left -- right -- overlaps ++ leftSplit ++ rightSplit + newRange
}

val ranges = Vector(
  Range("red", 12.5, 13.8),
  Range("blue", 0.0, 5.4),
  Range("green", 2.0, 12.0),
  Range("yellow", 3.5, 6.7),
  Range("orange", 6.7, 10.0))

val flattened = ranges.foldLeft(Set.empty[Range])(apply)
val sorted = flattened.toSeq.sortBy(_.left)
sorted foreach println
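apply takes a Set of all the ranges that have already been applied, finds the overlaps, and returns a new set with the overlaps removed and the new range and the newly split ranges added. foldLeft repeatedly calls apply with each input range.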
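Since porting is mentioned, here is one possible Python port of the same fold, given as a sketch rather than a line-for-line translation. `Span`, `apply_span` and `flatten_by_fold` are names invented for this example, and `functools.reduce` plays the role of foldLeft.

```python
from functools import reduce
from typing import NamedTuple


class Span(NamedTuple):
    name: str
    left: float
    right: float


def apply_span(applied, new):
    """Return `applied` with the parts hidden by `new` cut away, plus `new` itself."""
    result = {new}
    for old in applied:
        if new.left <= old.left and new.right >= old.right:
            continue  # old is completely covered by the new span
        split = False
        if old.left < new.left < old.right:
            result.add(Span(old.name, old.left, new.left))    # left remainder
            split = True
        if old.left < new.right < old.right:
            result.add(Span(old.name, new.right, old.right))  # right remainder
            split = True
        if not split:
            result.add(old)  # no overlap: keep unchanged
    return frozenset(result)


def flatten_by_fold(spans):
    """spans: iterable of Span, ordered lowest rank first."""
    flattened = reduce(apply_span, spans, frozenset())
    return sorted(flattened, key=lambda s: s.left)


spans = [Span("red", 12.5, 13.8), Span("blue", 0.0, 5.4), Span("green", 2.0, 12.0),
         Span("yellow", 3.5, 6.7), Span("orange", 6.7, 10.0)]
for span in flatten_by_fold(spans):
    print(span)
```

As in the Scala version, each step builds a new immutable set instead of editing the old one, so the order in which already-applied ranges are visited does not matter.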