Do any well-known algorithms have running time O(f(n)/log n)?


18

I've never before seen an algorithm with a logarithm in the denominator of its running time, and I'm wondering: are there any actually useful algorithms of this form?

I understand that many things can multiply a log factor into a running time, such as sorting or tree-based algorithms, but what could cause you to divide by a log factor?


24
Merge sort, with f(n) = n log^2 n.
Jeffε

12
@JeffE Snarky McSnarkster
Suresh Venkat

5
Radix sort — indeed, it is O(n log n / log n). What is going on is that, because of random access, you can save the log factor like this.
Sariel Har-Peled

I don't know whether the DTIME hierarchy theorem can be used as an argument for the existence of such algorithms, since one can do similar tricks that save on space costs in the RAM model.
chazisop

Answers:


41

The usual answer to "what could cause you to divide by a log?" is a combination of two things:

  1. A model of computation in which constant-time arithmetic operations on word-sized integers are allowed, but in which you want to be conservative about how long the words are, so you assume O(log n) bits per word (because with fewer bits than that you could not even address all of your memory, and also because algorithms that use table lookups would take too long to set up their tables if the words were much longer), and
  2. An algorithm that compresses the data by packing bits into words and then operates on the words.
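As a toy illustration of point 2 (my own sketch, not from the answer): pack n bits into words of w bits each, and a bitwise operation over the whole vector then costs n/w word operations instead of n bit operations.

```python
W = 64  # assumed word size: Theta(log n) bits per word in the model above

def pack(bits, w=W):
    """Pack a list of 0/1 bits into integers of w bits each."""
    words = []
    for i in range(0, len(bits), w):
        word = 0
        for b in bits[i:i + w]:
            word = (word << 1) | b
        words.append(word)
    return words

def packed_and(xs, ys):
    """AND two packed bit-vectors word by word: n/w word ops, not n bit ops."""
    return [x & y for x, y in zip(xs, ys)]

a = pack([1, 0, 1, 1] * 32)  # 128 bits -> 2 words
b = pack([1, 1, 0, 1] * 32)
c = packed_and(a, b)         # 2 word operations instead of 128 bit operations
```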

I think there are many examples, but the classic one is the Four Russians algorithm for longest common subsequences, etc. It actually ends up being O(n^2/log^2 n), because it uses the bit-packing idea but then saves a second log factor with another idea: replacing blocks of O(log^2 n) bit operations by a single table lookup.
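Here is a minimal sketch (my own illustration, not the actual LCS algorithm) of the table-lookup half of the trick, using bit counting as the stand-in operation: precompute the answers for all 2^b possible b-bit blocks once, and each block then costs one lookup instead of b bit operations; with b = Θ(log n), that is the saved log factor.

```python
BLOCK_BITS = 8  # plays the role of Theta(log n)
# Precompute once: answers for all 2^b possible b-bit blocks.
TABLE = [bin(x).count("1") for x in range(1 << BLOCK_BITS)]

def popcount(words, word_bits=64):
    """Count set bits across packed words by looking up BLOCK_BITS-wide
    chunks in TABLE: one table lookup replaces BLOCK_BITS bit operations."""
    mask = (1 << BLOCK_BITS) - 1
    total = 0
    for w in words:
        for shift in range(0, word_bits, BLOCK_BITS):
            total += TABLE[(w >> shift) & mask]
    return total
```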


35

Rubik's Cube is a very natural (and, to me, unexpected) example. An n×n×n cube requires Θ(n^2/log n) moves to solve. (Note that this is theta notation, so both the upper and lower bounds are tight.)

This was shown in this paper [1].

It's worth noting that the complexity of optimally solving particular instances of the Rubik's Cube is open, but believed to be NP-hard (as discussed, e.g., in [2]). The Θ(n^2/log n) algorithm is guaranteed to produce a solution, and all of its solutions are asymptotically optimal, but it may not solve a particular instance optimally. Your definition of "useful" may or may not apply here, since Rubik's Cubes are not generally solved with this algorithm (Kociemba's algorithm is commonly used for small cubes, since it gives fast, optimal solutions in practice).

[1] Erik D. Demaine, Martin L. Demaine, Sarah Eisenstat, Anna Lubiw, and Andrew Winslow. Algorithms for Solving Rubik's Cubes. Proceedings of the 19th Annual European Symposium on Algorithms (ESA 2011), September 5-9, 2011, pp. 689-700.

[2] Erik D. Demaine, Sarah Eisenstat, and Mikhail Rudoy. Solving the Rubik's Cube Optimally is NP-complete. Proceedings of the 35th International Symposium on Theoretical Aspects of Computer Science (STACS 2018), February 28 - March 3, 2018, pp. 24:1-24:13.


16

An example with log n in the denominator that does not come from the bit-packing trick is the recent paper by Agarwal, Ben Avraham, Kaplan, and Sharir, which computes the discrete Fréchet distance between two polygonal chains in time O(n^2 log log n / log n). While I'm not familiar with the details of the algorithm, one general trick is to partition the input into relatively small parts and then combine the answers cleverly (of course, this sounds like divide and conquer, but you don't get log n in the denominator without some clever tricks).


5
This is a more complicated instance of the "Four Russians" technique described in David's answer.
Jeffε

13

Not exactly what you asked for, but a situation "in the wild" where a log factor appears in the denominator is the paper "Pebbles and Branching Programs for Tree Evaluation" by Stephen Cook, Pierre McKenzie, Dustin Wehr, Mark Braverman, and Rahul Santhanam.

The tree evaluation problem (TEP) is: given a d-ary tree annotated with values in {1, ..., k} on the leaves and functions {1, ..., k}^d → {1, ..., k} on the internal nodes, evaluate the tree. Here each internal node gets the value of its annotated function applied to the values of its children. This is an easy problem, and the point is to show that it cannot be solved in logarithmic space (when the height of the tree is part of the input). To that end, we are interested in the size of branching programs solving TEP.
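To make the problem concrete, here is a small sketch (my own illustration, not code from the paper) that evaluates such a tree recursively, with each internal node's function given as a lookup table:

```python
def eval_tree(node):
    """Evaluate a TEP instance: leaves carry a value in {1, ..., k};
    each internal node carries a function f: {1,...,k}^d -> {1,...,k}
    (as a dict keyed by child-value tuples) applied to its d children."""
    if "value" in node:  # leaf: annotated with a value
        return node["value"]
    child_vals = tuple(eval_tree(c) for c in node["children"])
    return node["func"][child_vals]  # internal node: apply annotated function

# A height-2 binary (d = 2) instance with k = 3: the root computes max.
leaf = lambda v: {"value": v}
root = {
    "func": {(a, b): max(a, b) for a in (1, 2, 3) for b in (1, 2, 3)},
    "children": [leaf(2), leaf(3)],
}
```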

In Section 5, tight bounds are presented for trees of height 3, both for TEP and for the related problem BEP, in which the output is collapsed to {0,1} in some arbitrary way. For TEP the bound is Θ(k^(2d-1)), while for BEP the bound is Θ(k^(2d-1)/log k), i.e. you get a saving of log k.


12

Even though it's not about runtime, I thought it worth mentioning the classical result of Hopcroft, Paul, and Valiant: TIME[t] ⊆ SPACE[t/log t] [1], since it's still in the spirit of "what could cause you to save a log factor."

That gives lots of examples of problems whose best known upper bound on space complexity has a log in the denominator. (Depending on your viewpoint, I would think that either makes this example very interesting - what an amazing theorem! - or very uninteresting - it's probably not "actually useful".)

[1] Hopcroft, Paul, and Valiant. On time versus space. J. ACM 24(2):332-337, 1977.



8

The best known algorithm for computing the edit (a.k.a. Levenshtein) distance between two strings of length n takes O((n/log n)^2) time:

William J. Masek, Mike Paterson: A Faster Algorithm Computing String Edit Distances. J. Comput. Syst. Sci. 20(1): 18-31 (1980).
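For context, the quadratic baseline that Masek and Paterson improve on is the textbook Levenshtein dynamic program below (a standard sketch, not their algorithm); their Four-Russians-style tabulation of small blocks of this table is what yields O((n/log n)^2).

```python
def edit_distance(s, t):
    """Standard O(|s|*|t|) Levenshtein DP with a rolling row.
    Masek-Paterson precompute all O(log n)-sized blocks of this
    table to reach O((n/log n)^2) overall."""
    m, n = len(s), len(t)
    prev = list(range(n + 1))  # distances from "" to prefixes of t
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,        # delete s[i-1]
                         cur[j - 1] + 1,     # insert t[j-1]
                         prev[j - 1] + (s[i - 1] != t[j - 1]))  # substitute
        prev = cur
    return prev[n]
```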


4
Again, this is a variation of the Four Russians algorithm, I think.
David Eppstein

7

Θ(n/log n) appears as the correct bound for a problem considered by Greg and Paul Valiant (no connection to bit tricks):

Gregory Valiant, and Paul Valiant, The power of linear estimators, 2011. In the 52nd Annual IEEE Symposium on the Foundations of Computer Science, FOCS 2011.


7

Here's another example of a tight bound having a log factor. (This is Theorem 6.17 from Boolean Function Complexity: Advances and Frontiers by Stasys Jukna.)

The formula size (over the full binary basis or the De Morgan basis) of the element distinctness problem is Θ(n^2/log n), where n is the number of bits in the input.

The reason the log factor appears in the denominator is that representing m integers, each between 1 and poly(m), requires n := O(m log m) bits in total, since each integer requires O(log m) bits. So an upper bound that looks natural in terms of m, like Θ(m^2 log m), becomes Θ(n^2/log n) when expressed in terms of n, the number of bits in the input.
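The substitution is a one-line computation, spelled out here for concreteness, assuming n = Θ(m log m), hence log n = Θ(log m) and m = Θ(n/log n):

```latex
m^2 \log m
  \;=\; \Theta\!\left( \left(\frac{n}{\log n}\right)^{2} \log n \right)
  \;=\; \Theta\!\left( \frac{n^2}{\log n} \right).
```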


2

Finding the prime factors of n by trial division when the list of primes is already given. There are θ(n/log n) primes less than n, so if these primes are given to you, then trial division of n by each of them takes θ(n/log n) time (assuming division is a constant-time operation).
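A minimal sketch of this procedure (my own illustration), assuming the supplied list contains every prime divisor of n:

```python
def trial_division(n, primes):
    """Factor n by trial division against a supplied list of primes
    (assumed to include every prime divisor of n)."""
    factors = []
    for p in primes:
        while n % p == 0:  # divide out each prime as many times as it occurs
            factors.append(p)
            n //= p
    return factors

# e.g. trial_division(360, [2, 3, 5, 7]) -> [2, 2, 2, 3, 3, 5]
```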


3
In fact, it's enough to look at the roughly 2√n/log n primes below √n. But there are far better algorithms out there.
Yuval Filmus

-2

Somewhat similar to JG's answer and "thinking outside the box", this seems like a related/relevant/apropos fundamental negative result. Based on diagonalization with a universal TM, the time hierarchy theorem gives a language decidable in O(f(n)) DTIME that cannot be decided in O(f(n)/log f(n)) DTIME. So, applied to f(n) = n, there is a problem with a linear-DTIME algorithm that cannot be solved in O(n/log n) DTIME.


2
On a TM, DTIME(n/log n) is trivial, as it doesn't allow the machine to read the whole input. Also, the DTIME notation makes the big-oh notation unnecessary.
Sasho Nikolov

?? there is still theory for sublinear time algorithms...
vzn

3
sublinear algorithms make sense in oracle & random access models. DTIME is standardly defined w.r.t. multitape TM, and that's the definition used in the hierarchy theorem for DTIME.
Sasho Nikolov

1
No, @SashoNikolov, DTIME(n/log n) is not trivial. Compare "Are the first n/lg n bits of the input all zeros?" with "Do the first n/lg n bits of the input encode a satisfiable boolean formula?"
Jeffε

5
@JeffE: You cannot test "Are the first n/lg n bits of the input all zeros?" in O(n/log n) time on a TM, since you do not know what n is without first reading the whole input, which takes time n. It is a standard fact that if f(n) < n, then DTIME(f(n)) contains only languages whose membership can be determined from the first k bits of the input for a constant k (and which are therefore computable in constant time).
Emil Jeřábek supports Monica
Licensed under cc by-sa 3.0 with attribution required.