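Since the existing non-recursive DFS implementation given in this answer seems to be broken, let me provide one that actually works.

I've written this in Python, because I find it pretty readable and uncluttered by implementation details (and because it has the handy `yield` keyword for implementing generators), but it should be fairly easy to port to other languages.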

    # a generator function to find all simple paths between two nodes in a
    # graph, represented as a dictionary that maps nodes to their neighbors
    def find_simple_paths(graph, start, end):
        visited = set()
        visited.add(start)

        nodestack = list()
        indexstack = list()
        current = start
        i = 0

        while True:
            # get a list of the neighbors of the current node
            neighbors = graph[current]

            # find the next unvisited neighbor of this node, if any
            while i < len(neighbors) and neighbors[i] in visited:
                i += 1

            if i >= len(neighbors):
                # we've reached the last neighbor of this node, backtrack
                visited.remove(current)
                if len(nodestack) < 1:
                    break  # can't backtrack, stop!
                current = nodestack.pop()
                i = indexstack.pop()
            elif neighbors[i] == end:
                # yay, we found the target node! let the caller process the path
                yield nodestack + [current, end]
                i += 1
            else:
                # push current node and index onto stacks, switch to neighbor
                nodestack.append(current)
                indexstack.append(i + 1)
                visited.add(neighbors[i])
                current = neighbors[i]
                i = 0

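This code maintains two parallel stacks: one containing the earlier nodes in the current path, and one containing the current neighbor index for each node in the node stack (so that we can resume iterating through the neighbors of a node when we pop it back off the stack). I could equally well have used a single stack of (node, index) pairs, but I figured the two-stack method would be more readable, and perhaps easier to implement for users of other languages.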
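For comparison, here is a rough sketch of what the single-stack alternative mentioned above might look like. The function name `find_simple_paths_pairs` is made up for this illustration, and each stack entry is a mutable `[node, index]` list rather than a tuple so that the index can be updated in place; the top entry takes over the role of `current` and `i`.

```python
def find_simple_paths_pairs(graph, start, end):
    # hypothetical single-stack variant of find_simple_paths
    visited = {start}
    # each entry is [node, index of the next neighbor to try]
    stack = [[start, 0]]
    while stack:
        frame = stack[-1]
        current, i = frame
        neighbors = graph[current]
        # skip neighbors that are already on the current path
        while i < len(neighbors) and neighbors[i] in visited:
            i += 1
        if i >= len(neighbors):
            # every neighbor has been tried: backtrack
            visited.remove(current)
            stack.pop()
            continue
        # remember where to resume once we come back to this node
        frame[1] = i + 1
        if neighbors[i] == end:
            # the stack already includes the current node
            yield [node for node, _ in stack] + [end]
        else:
            visited.add(neighbors[i])
            stack.append([neighbors[i], 0])
```

It should yield the same paths in the same order as the two-stack version; whether it is easier to read is a matter of taste.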
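This code also uses a separate `visited` set, which always contains the current node and all nodes on the stack, to let me efficiently check whether a node is already part of the current path. If your language happens to have an "ordered set" data structure that provides both efficient stack-like push/pop operations and efficient membership queries, you can use that for the node stack and get rid of the separate `visited` set.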
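In Python itself, a plain `dict` can play the part of such an ordered set, since it preserves insertion order, `popitem()` removes the most recently inserted key, and membership tests are fast. The sketch below is only one possible way to do it, assuming Python 3.8 or later (for `reversed()` on a dict); the name `find_simple_paths_ordered` is hypothetical. It goes one step further and stores each node's next neighbor index as the dict value, so the index stack disappears too.

```python
def find_simple_paths_ordered(graph, start, end):
    # hypothetical variant: one insertion-ordered dict replaces the node
    # stack, the index stack and the visited set (assumes Python 3.8+)
    path = {start: 0}  # node on current path -> next neighbor index to try
    while path:
        current = next(reversed(path))  # most recently inserted node
        i = path[current]
        neighbors = graph[current]
        # skip neighbors that are already on the current path
        while i < len(neighbors) and neighbors[i] in path:
            i += 1
        if i >= len(neighbors):
            path.popitem()  # drops the last inserted node: backtrack
            continue
        # assigning to an existing key keeps its position in the dict
        path[current] = i + 1
        if neighbors[i] == end:
            yield list(path) + [end]
        else:
            path[neighbors[i]] = 0
```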
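Alternatively, if you're using a custom mutable class or structure for your nodes, you can just store a boolean flag in each node to indicate whether it has been visited as part of the current search path. Of course, this method won't let you run two searches on the same graph in parallel, should you for some reason wish to do that.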
Here's some test code demonstrating how the function given above works:

    # test graph:
    #     ,---B---.
    #     A   |   D
    #     `---C---'
    graph = {
        "A": ("B", "C"),
        "B": ("A", "C", "D"),
        "C": ("A", "B", "D"),
        "D": ("B", "C"),
    }

    # find paths from A to D
    for path in find_simple_paths(graph, "A", "D"):
        print(" -> ".join(path))

Running this code on the given example graph produces the following output:

    A -> B -> C -> D
    A -> B -> D
    A -> C -> B -> D
    A -> C -> D

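Note that, while this example graph is undirected (i.e. all its edges go both ways), the algorithm also works for arbitrary directed graphs. For example, removing the `C -> B` edge (by removing `B` from the neighbor list of `C`) yields the same output except for the third path (`A -> C -> B -> D`), which is no longer possible.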
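To make the directed case concrete, here is a self-contained sketch that repeats the function in condensed form and runs it on the test graph with the `C -> B` edge removed. The names `directed_graph` and `directed_paths` are introduced just for this example.

```python
def find_simple_paths(graph, start, end):
    # same algorithm as above, repeated so this snippet runs on its own
    visited = {start}
    nodestack, indexstack = [], []
    current, i = start, 0
    while True:
        neighbors = graph[current]
        while i < len(neighbors) and neighbors[i] in visited:
            i += 1
        if i >= len(neighbors):
            visited.remove(current)
            if not nodestack:
                break
            current = nodestack.pop()
            i = indexstack.pop()
        elif neighbors[i] == end:
            yield nodestack + [current, end]
            i += 1
        else:
            nodestack.append(current)
            indexstack.append(i + 1)
            visited.add(neighbors[i])
            current = neighbors[i]
            i = 0

directed_graph = {
    "A": ("B", "C"),
    "B": ("A", "C", "D"),
    "C": ("A", "D"),  # "B" removed here: the C -> B edge is gone
    "D": ("B", "C"),
}

directed_paths = list(find_simple_paths(directed_graph, "A", "D"))
for path in directed_paths:
    print(" -> ".join(path))
```

The `B -> C` edge is still present, so `A -> B -> C -> D` remains a valid path; only the path that needed to go from C to B drops out.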
P.S. It's easy to construct graphs for which simple search algorithms like this one (and the others given in this thread) perform very poorly.
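For example, consider the task of finding all paths from A to B on an undirected graph where the starting node A has two neighbors: the target node B (which has no other neighbors than A) and a node C that is part of a clique of n+1 nodes, like this: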

    graph = {
        "A": ("B", "C"),
        "B": ("A",),  # trailing comma: a one-element tuple, not the string "A"
        "C": ("A", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
        "D": ("C", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
        "E": ("C", "D", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
        "F": ("C", "D", "E", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
        "G": ("C", "D", "E", "F", "H", "I", "J", "K", "L", "M", "N", "O"),
        "H": ("C", "D", "E", "F", "G", "I", "J", "K", "L", "M", "N", "O"),
        "I": ("C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N", "O"),
        "J": ("C", "D", "E", "F", "G", "H", "I", "K", "L", "M", "N", "O"),
        "K": ("C", "D", "E", "F", "G", "H", "I", "J", "L", "M", "N", "O"),
        "L": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "M", "N", "O"),
        "M": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "N", "O"),
        "N": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "O"),
        "O": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"),
    }

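It's easy to see that the only path between A and B is the direct one, but a naive DFS started from node A will waste O(n!) time uselessly exploring paths within the clique, even though it's obvious (to a human) that none of those paths can possibly lead to B.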
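The blow-up is easy to observe on small instances. The following sketch builds the same kind of graph for an arbitrary n and counts how many times a naive DFS steps into a non-target node. Both helpers (`build_clique_trap`, `count_dfs_steps`) and the node names `K0`, `K1`, ... are invented for this illustration, and the counter is a simplified stand-in for the search rather than the exact function above.

```python
def build_clique_trap(n):
    # A - B, A - C, where C belongs to a clique of n + 1 nodes
    clique = ["C"] + ["K%d" % k for k in range(n)]
    graph = {node: [other for other in clique if other != node]
             for node in clique}
    graph["A"] = ["B", "C"]
    graph["B"] = ["A"]
    graph["C"].append("A")
    return graph

def count_dfs_steps(graph, start, end):
    # returns (paths found, steps taken into nodes other than the target)
    found = 0
    steps = 0
    visited = {start}
    # each stack entry keeps an iterator so the neighbor scan can resume
    stack = [(start, iter(graph[start]))]
    while stack:
        node, neighbors = stack[-1]
        for neighbor in neighbors:
            if neighbor in visited:
                continue
            if neighbor == end:
                found += 1
            else:
                steps += 1
                visited.add(neighbor)
                stack.append((neighbor, iter(graph[neighbor])))
                break
        else:
            # neighbor list exhausted: backtrack
            visited.remove(node)
            stack.pop()
    return found, steps

for n in range(1, 8):
    print(n, count_dfs_steps(build_clique_trap(n), "A", "B"))
```

Each run finds exactly one path, while the step count grows roughly like n!; for n = 3 it is already 16, and for n = 4 it is 65.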
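One can also construct DAGs with similar properties, e.g. by having the starting node A connect to the target node B and to two other nodes C1 and C2, both of which connect to the nodes D1 and D2, both of which connect to E1 and E2, and so on. For n layers of nodes arranged like this, a naive search for all paths from A to B will end up wasting O(2^n) time examining all the possible dead ends before giving up.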
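A small sketch of this layered construction, assuming n >= 1 and using made-up node names of the form `L<layer>_<k>` instead of C1, C2, D1, D2 and so on. Since the graph is acyclic, the counter below skips the visited set and simply recurses; it is meant only to show the growth, not to replace the search function.

```python
def build_layered_dag(n):
    # A -> B, A -> both nodes of layer 1, and every node of a layer
    # points to both nodes of the next layer (requires n >= 1)
    graph = {"A": ["B", "L1_1", "L1_2"], "B": []}
    for layer in range(1, n + 1):
        if layer < n:
            successors = ["L%d_1" % (layer + 1), "L%d_2" % (layer + 1)]
        else:
            successors = []  # the last layer is a dead end
        for k in (1, 2):
            graph["L%d_%d" % (layer, k)] = list(successors)
    return graph

def count_non_target_steps(graph, node, end):
    # number of edges a naive DFS follows into nodes other than `end`;
    # in this particular DAG every one of those steps is wasted
    steps = 0
    for neighbor in graph[node]:
        if neighbor != end:
            steps += 1 + count_non_target_steps(graph, neighbor, end)
    return steps

for n in range(1, 11):
    print(n, count_non_target_steps(build_layered_dag(n), "A", "B"))
```

The count works out to 2^(n+1) - 2, so it doubles with every added layer even though the graph itself only gains two nodes.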
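Of course, adding an edge to the target node B from one of the nodes in the clique (other than C), or from the last layer of the DAG, would create a very large number of possible paths from A to B, and a purely local search algorithm can't really tell in advance whether it will find such an edge or not. Thus, in a sense, the poor output sensitivity of such naive searches is due to their lack of awareness of the global structure of the graph.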
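While there are various preprocessing methods (such as iteratively eliminating leaf nodes, searching for single-node vertex separators, etc.) that could be used to avoid some of these "exponential-time dead ends", I don't know of any general preprocessing trick that could eliminate them in all cases. A general solution would be to check at every step of the search whether the target node is still reachable (using a sub-search), and backtrack early if it isn't. Alas, that would significantly slow down the search (at worst, proportionally to the size of the graph) for many graphs that don't contain such pathological dead ends.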
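As a rough illustration of that reachability check, the sketch below runs a sub-search before stepping into each neighbor and skips the neighbor if the target can no longer be reached without crossing the current path. The names `target_reachable` and `find_simple_paths_pruned` are hypothetical, and the main loop uses a stack of neighbor iterators instead of the two-stack layout above to keep the example short.

```python
def target_reachable(graph, source, end, blocked):
    # sub-search: can `end` be reached from `source` without passing
    # through any node in `blocked` (the nodes on the current path)?
    seen = {source}
    todo = [source]
    while todo:
        node = todo.pop()
        for neighbor in graph[node]:
            if neighbor == end:
                return True
            if neighbor not in blocked and neighbor not in seen:
                seen.add(neighbor)
                todo.append(neighbor)
    return False

def find_simple_paths_pruned(graph, start, end):
    visited = {start}
    stack = [(start, iter(graph[start]))]
    while stack:
        node, neighbors = stack[-1]
        for neighbor in neighbors:
            if neighbor in visited:
                continue
            if neighbor == end:
                yield [n for n, _ in stack] + [end]
            elif target_reachable(graph, neighbor, end, visited):
                # only descend if some path to the target still exists
                visited.add(neighbor)
                stack.append((neighbor, iter(graph[neighbor])))
                break
        else:
            # neighbor list exhausted: backtrack
            visited.remove(node)
            stack.pop()
```

On the clique example this never enters the clique at all, since the sub-search started from C cannot reach B without going back through A. The price is one extra traversal of the remaining graph per step, which is the slowdown described above.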