Base-k representations of the co-domain of a polynomial - is it context-free?


14

In Chapter 4 of Jeffrey Shallit's "A Second Course in Formal Languages and Automata Theory", the following problem is listed as open:

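Let $p(n)$ be a polynomial with rational coefficients such that $p(n) \in \mathbb{N}$ for all $n \in \mathbb{N}$. Prove or disprove that the language of the base-$k$ representations of all integers in $\{p(n) \mid n \geq 0\}$ is context-free if and only if the degree of $p$ is $\leq 1$.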
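To make the object in the question concrete, here is a minimal sketch that lists the first few words of such a language. The helper names (`eval_poly`, `to_base`, `language_prefix`) and the sample polynomial $n(n+1)/2$ are illustrative assumptions, not part of the original problem statement.

```python
from fractions import Fraction

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def eval_poly(coeffs, n):
    """Evaluate a polynomial exactly; coeffs[i] is the coefficient of n**i."""
    return sum(Fraction(c) * n**i for i, c in enumerate(coeffs))


def to_base(m, k):
    """Base-k representation of a non-negative integer, most significant digit first."""
    if not 2 <= k <= len(DIGITS) or m < 0:
        raise ValueError("need 2 <= k <= 36 and m >= 0")
    if m == 0:
        return "0"
    digits = []
    while m:
        m, r = divmod(m, k)
        digits.append(DIGITS[r])
    return "".join(reversed(digits))


def language_prefix(coeffs, k, count):
    """Base-k representations of p(0), ..., p(count - 1)."""
    words = []
    for n in range(count):
        value = eval_poly(coeffs, n)
        if value.denominator != 1 or value < 0:
            raise ValueError("polynomial is not natural-number valued at n = %d" % n)
        words.append(to_base(int(value), k))
    return words


# Hypothetical example: p(n) = n(n+1)/2 has rational coefficients but integer values.
TRIANGULAR = [0, Fraction(1, 2), Fraction(1, 2)]
print(language_prefix(TRIANGULAR, 2, 6))
```

The sample polynomial has degree 2, so it falls on the side of the conjecture where the language is expected not to be context-free.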

What is the current status (as of October 2018)? Has it been proved? What about some special cases?


1
If $k=1$ (unary representation), then even the simplest case $p(n)=n^2$ is not context-free (the well-known non-CF language $L=\{1^{n^2}\}$).
Marzio De Biasi

@MarzioDeBiasi The so-called unary representation is not base $1$. The only integer that actually has a base-$1$ representation would be $0$.
Emil Jeřábek supports Monica

1
@EmilJeřábek: I think that in many contexts "base-1" is used as an alias for "unary representation".
Marzio De Biasi

Answers:


10

Of course, $k \geq 2$ here.

Horváth once had a manuscript claiming to solve this problem, but it was unclear in many places and, as far as I know, it was never published.

As far as I know, the problem is still open. Of course, one direction of the implication is easy.


Has the case $k=2$ already been solved? (I have an idea for a proof for $k=2$, and if it works, the same technique could be applied to other bases.)
Marzio De Biasi

I would be very happy to receive any feedback from you on my answer.
domotorp '18

I'm sorry, I was not able to understand your claimed solution.
Jeffrey Shallit,

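I've posted another, more detailed answer. The full statement is quite complicated, so I've put some simpler lemmas before it that contain the main ideas; hopefully this makes the whole thing more believable.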
domotorp

3

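This is a sketch of a proof for $k=2$ and $L=\{[n^2]_2 \mid n \geq 1\}$, where $[n^2]_2$ is the binary representation of $n^2$. For a clearer exposition we put the least significant bit of a binary string on the left, e.g. $[4^2]_2 = 00001$.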
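The least-significant-bit-first convention is easy to get backwards, so here is a small sketch of it. The function name `lsb_first_binary` is an assumption introduced for illustration.

```python
def lsb_first_binary(m: int) -> str:
    """Binary representation of m with the least significant bit on the left."""
    if m < 0:
        raise ValueError("m must be non-negative")
    # bin() gives the usual MSB-first string with a '0b' prefix; drop it and reverse.
    return bin(m)[2:][::-1]


print(lsb_first_binary(4**2))   # the example used in this answer
print(lsb_first_binary(39**2))  # first row of the table further down
```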

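The core idea is to assume that $L$ is context-free and then try to "simplify" it by intersecting it with a simple regular language $R$. The new language $L \cap R$ is still context-free, and it should still contain binary representations of squares; we then apply the pumping lemma for CF languages to obtain a binary string that is not the representation of a square.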

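Intersecting $L$ with a regular language whose words contain only a bounded number of $1$ digits is hopeless. It turns out that with up to four $1$ digits ($R = \{0^*1\}, \{0^*10^*1\}, \{0^*10^*10^*1\}, \{0^*10^*10^*10^*1\}$) we get CF languages, and with five $1$ digits we get an apparently hard number-theoretic problem.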

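A promising approach is to intersect $L$ with $R = 10^+1^+0^+1$; this is equivalent to restricting $L$ to the squares of the form:

$$n^2 = 2^0 + 2^a\left(2^b - 1 + 2^{b+c}\right), \qquad 1 < a, b, c$$

(informally: the odd squares whose binary representation is all $0$s apart from the first and last bit and a run of $1$s in the middle).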

    n        n^2  n                  n^2
   39       1521  111..1             1...11111.1
  143      20449  1111...1           1....1111111..1
  543     294849  11111....1         1.....111111111...1
 2111    4456321  111111.....1       1......11111111111....1
 8319   69205761  1111111......1     1.......1111111111111.....1
33023 1090518529  11111111.......1   1........111111111111111......1
                  LSB          MSB   LSB                         MSB

With some effort we can prove the following:

Theorem: $2^0 + 2^a\left(2^b - 1 + 2^{b+c}\right)$ with $0 < c$, $3 < a < b$ is a square if and only if

$$b = 2a - 3, \qquad c = a - 3$$

(the proof is quite long; I posted it on my blog)
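As a sanity check on the "if" direction only, the following sketch evaluates $2^0 + 2^a(2^b - 1 + 2^{b+c})$ for $b = 2a-3$, $c = a-3$ and confirms that the values are squares whose LSB-first binary strings match $10^+1^+0^+1$. The function names are illustrative assumptions, and the sketch says nothing about the harder "only if" direction.

```python
import math
import re

PATTERN = re.compile(r"10+1+0+1")  # the regular language R, read LSB first


def candidate(a, b, c):
    """The number 2^0 + 2^a (2^b - 1 + 2^(b+c))."""
    return 1 + 2**a * (2**b - 1 + 2 ** (b + c))


def is_square(m):
    r = math.isqrt(m)
    return r * r == m


def matches_r(m):
    """True if the LSB-first binary string of m lies in R."""
    return PATTERN.fullmatch(bin(m)[2:][::-1]) is not None


def family_roots(max_a):
    """Square roots of the candidates with b = 2a - 3, c = a - 3 for 4 <= a <= max_a."""
    roots = []
    for a in range(4, max_a + 1):
        value = candidate(a, 2 * a - 3, a - 3)
        if not is_square(value):
            raise AssertionError("a = %d does not give a square" % a)
        roots.append(math.isqrt(value))
    return roots


print(family_roots(9))
```

For $4 \le a \le 9$ the roots are exactly the six values of $n$ listed in the table above.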

At this point we can easily prove that $L \cap R$ is not context-free using the pumping lemma (we can pump in at most two "segments" of the string $100..0011...1100..001$). Therefore $L$ is not context-free.
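The effect of pumping can be illustrated on a single word. This is only a sketch for one particular word and two particular single-segment pumps, not the full case analysis that the pumping lemma requires; the helper names are assumptions.

```python
import math


def value_lsb_first(bits):
    """Integer value of a binary string written least significant bit first."""
    return int(bits[::-1], 2)


def is_square(m):
    r = math.isqrt(m)
    return r * r == m


def pump(bits, start, end, times):
    """Repeat the segment bits[start:end] the given number of times."""
    return bits[:start] + bits[start:end] * times + bits[end:]


WORD = "10001111101"  # 39^2 = 1521, LSB first

for start, end in [(1, 2), (4, 5)]:  # one 0 of the first block, one 1 of the run
    pumped = pump(WORD, start, end, 2)
    print(pumped, value_lsb_first(pumped), is_square(value_lsb_first(pumped)))
```

Both pumped strings still match $10^+1^+0^+1$, but neither value is a square, which is the kind of contradiction the argument aims for.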

Probably the same technique can be applied to any base $k$.


3
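The result for $n^2$ in base 2 has been known for a long time. The challenge is to carry out a similar construction for every polynomial that maps integers to integers, and for every base.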
Jeffrey Shallit

1
We are so alike: I have also spent the last few days thinking about this problem, although I took a completely different approach.
domotorp

3

I think I have a proof. The proof follows from this lemma.

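Lemma. For a context-free language $L$, if for infinitely many $n$ there are $n^6$ words of equal length in $L$ whose first $n^2$ letters are the same, while their last $n$ letters are (pairwise) different, then there is a $B$ such that there are infinitely many pairs $u, v \in L$ of equal length that differ only in their last $B$ letters.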

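Thus, if $u$ and $v$ denote binary numbers, their difference would be at most $2^B$ infinitely often, which is impossible for polynomials. On the other hand, with some number theory it can be shown that the condition is satisfied for every integer-valued polynomial $f$: take any $x_1, \ldots, x_{n^6}$ for which $f(x_i) \neq f(x_j)$, and then add some sufficiently large number $N$ to each of them to obtain the desired words $f(x_i + N)$.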
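The "impossible for polynomials" step can be checked numerically for one example. The sketch below, with the assumed helper `close_pairs` and the sample polynomial $f(n) = n^2$, lists all pairs of arguments whose values differ by at most a fixed gap; enlarging the search range adds no new pairs.

```python
def close_pairs(f, search_bound, max_gap):
    """All pairs (x, y) with y < x < search_bound and 0 < |f(x) - f(y)| <= max_gap."""
    pairs = []
    for x in range(search_bound):
        for y in range(x):
            if 0 < abs(f(x) - f(y)) <= max_gap:
                pairs.append((x, y))
    return pairs


def square(n):
    return n * n


# With B = 3 the gap is 2^B = 8; the list is the same for both search ranges.
print(close_pairs(square, 50, 2**3))
print(close_pairs(square, 200, 2**3) == close_pairs(square, 50, 2**3))
```

For $n^2$ this is immediate because consecutive values differ by $2n+1$, which eventually exceeds any fixed $2^B$.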

Proof of the lemma. Take a large enough $n$ such that there are $n^6$ words of equal length, $w_1, \ldots, w_{n^6}$, that satisfy the conditions. For each $w_i$ fix a way in which it can be generated from the context-free grammar. (Warning! I'm not an expert in this field, so I might not use the proper terms.)

Say that the application of a rule $A \to BC$ splits two letters $b$ and $c$ of the final word if $b$ and $c$ are both derived from $A$, but $b$ is derived from $B$, while $c$ is derived from $C$. Each rule splits at most $O(1)$ letters of $w_i$ from each other.

In any $w_i$, there will be $\Omega(n)$ consecutive letters among the first $n^2$ letters that are split from each other by some consecutive rules such that no two letters among the last $n$ letters are split from each other while applying these rules. If we write these rules collectively for the word $w_i$ as $A^i \to B_1^i B_2^i \ldots B_n^i$, then no letter from the last $n$ letters is derived from $B_j^i$ for $j < n$, and $B_1^i B_2^i \ldots B_{n-1}^i$ are all converted into some part of the first $n^2$ letters. We can apply the pumping lemma to the rule $A^i \to B_1^i B_2^i \ldots B_n^i$ if $n$ is large enough.

There are only $\binom{n^2}{2}$ choices for the interval of $\Omega(n)$ letters, and $O(n)$ options for what the pumping lemma gives (as it has $O(1)$ length), so by the pigeonhole principle there will be two words for which these are all the same. But then after pumping we can obtain an arbitrarily long common initial part for these two words, while we know that they'll differ only in their last $n$ bits.


1

Note. This is a much more detailed version of my other answer, as that one didn't seem to be comprehensible enough. I've tried to convert it to resemble more standard pumping lemmas, but the full proof got way too complex. I recommend reading the statements of the first two lemmas to understand the main idea, then the statement of the Corollary, and finally the end, where I prove why the Corollary implies the answer to the question.

The proof is based on a generalization of the pumping lemma. The lemma that we need is quite elaborate, so instead of stating it right away, I start with some easier generalizations, eventually building up to more complicated ones. As I've later learned, this is very similar to the so-called interchange lemma.

Twin Pumping Lemma. For every context-free language $L$ there is a $p$ such that from any $p$ words $s_1, \ldots, s_p \in L$ we can select two, $s$ and $s'$, that can be written as $s = uvwxy$ and $s' = u'v'w'x'y'$ such that $1 \le |vx| \le p$, $1 \le |v'x'| \le p$ and every word $\bar{u}\bar{v}_1 \ldots \bar{v}_n \bar{w} \bar{x}_n \ldots \bar{x}_1 \bar{y} \in L$, where $\bar{w}$ can be either $w$ or $w'$, and similarly, $\bar{v}_i$ can be either $v$ or $v'$ and $\bar{x}_i$ can be either $x$ or $x'$, but only such that $\bar{v}_i = v$ if and only if $\bar{x}_i = x$ (thus $\bar{v}_i = v'$ if and only if $\bar{x}_i = x'$), and $\bar{u} = u$ if and only if $\bar{y} = y$ and $\bar{u} = u'$ if and only if $\bar{y} = y'$. Moreover, if instead of $p$ we are given $p\binom{n+4}{4}$ words of length $n$, we can additionally suppose for the selected two words that $|u| = |u'|$, $|v| = |v'|$, $|w| = |w'|$, $|x| = |x'|$ and $|y| = |y'|$.

This statement can be proved in essentially the same way as the pumping lemma; we just need to pick some $s$ and $s'$ for which the same rule is pumped. This can be done if $p$ is large enough, since there are only a constant number of rules. In fact, we don't even need the same rule to be pumped, only that the non-terminal symbol is the same in the pumped rule. For the moreover part, notice that for a word of length $n$ there are only $\binom{n+4}{4}$ ways to break it into five subwords, thus the statement follows from the pigeonhole principle.
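The count $\binom{n+4}{4}$ can be confirmed by brute force. In this sketch (the generator name `five_part_splits` is an assumption), a split into five possibly empty subwords is identified with four non-decreasing cut positions between $0$ and $n$.

```python
from itertools import combinations_with_replacement
from math import comb


def five_part_splits(word):
    """Yield every way to write word as u v w x y, with empty parts allowed."""
    n = len(word)
    # Four cut positions 0 <= c1 <= c2 <= c3 <= c4 <= n determine the split.
    for c1, c2, c3, c4 in combinations_with_replacement(range(n + 1), 4):
        yield word[:c1], word[c1:c2], word[c2:c3], word[c3:c4], word[c4:]


for n in range(7):
    count = sum(1 for _ in five_part_splits("a" * n))
    print(n, count, comb(n + 4, 4))
```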

Next we give another way of generalizing the pumping lemma (and later we'll combine the two).

Nested Pumping Lemma. For every context-free language $L$ there is a $p$ such that for any $k$ any word $s \in L$ can be written as $s = u v_1 \ldots v_k w x_k \ldots x_1 y$ such that $\forall i$ $1 \le |v_i x_i| \le p$ and for every sequence $(i_j)_{j=1}^m$ the word $u v_{i_1} \ldots v_{i_m} w x_{i_m} \ldots x_{i_1} y \in L$.

Note that the indices $i_j$ can be arbitrary from $1$ to $k$, and the same index can occur multiple times. The proof of the Nested Pumping Lemma is essentially the same as the original pumping lemma's; we just need to use that we obtain the same non-terminal symbol from itself $k$ times - this is true if we do $p(k-1)+1$ steps (instead of the $p$ from the original pumping lemma). We can also strengthen Ogden's lemma in a similar way.

Nested Ogden's Lemma. For every context-free language $L$ there is a $p$ such that for any $k$, marking any at least $p^k$ positions in any word $s \in L$, it can be written as $s = u v_1 \ldots v_k w x_k \ldots x_1 y$ such that $\forall i$ $1 \le$ '# of marks in $v_i x_i$' $\le p^k$ and for every sequence $(i_j)_{j=1}^m$ the word $u v_{i_1} \ldots v_{i_m} w x_{i_m} \ldots x_{i_1} y \in L$.

Unfortunately, in our application $p^k$ would be too large, so we need to weaken the conclusion to allow non-nested $v_i$-$x_i$ pairs. Luckily, using Dilworth, the structure stays simple.

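Dilworth Ogden's Lemma. For every context-free language $L$ there is a $p$ such that for any $k, \ell$, marking any at least $pk\ell$ positions in any word $s \in L$, it can be written either as

case (i): $s = u v_1 \ldots v_k w x_k \ldots x_1 y$, or as

case (ii): $s = u v_1 w_1 x_1 \ldots v_\ell w_\ell x_\ell y$,

such that $\forall i$ $1 \le$ '# of marks in $v_i x_i$' $\le pk\ell$ and for every sequence $(i_j)_{j=1}^m$,

in case (i) the word $u v_{i_1} \ldots v_{i_m} w x_{i_m} \ldots x_{i_1} y \in L$, and

in case (ii) the word $u v_1^{i_1} w_1 x_1^{i_1} \ldots v_\ell^{i_\ell} w_\ell x_\ell^{i_\ell} y \in L$.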

Proof: Take the derivation tree generating $s$. Call a non-terminal recurring if it appears again under itself in the derivation tree. By expanding the rule set, we can suppose that all non-terminal symbols are recurring in the derivation tree. (This is to be understood that we might have eliminated their recurrence; this doesn't matter, the point is that they can be pumped.) There are at least $pk\ell$ leaves that correspond to a marked position. We look at the nodes where two marked letters split. There are at least $\Omega(pk\ell)$ such nodes. By the pigeonhole principle, at least $\Omega(pk\ell)$ correspond to the same non-terminal. Using Dilworth, $\Omega(pk)$ of them are in a chain or $\Omega(p\ell)$ are in an antichain, giving cases (i) and (ii), respectively, if $p$ is large enough.
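The chain/antichain dichotomy used here can be seen on a toy tree ordered by ancestry. This is a minimal sketch with an assumed dictionary encoding of a rooted tree and a made-up example; it only illustrates the inequality (longest chain) x (largest antichain) >= (number of nodes), not the grammar-specific part of the proof.

```python
def size(tree, v):
    """Number of nodes in the subtree rooted at v."""
    return 1 + sum(size(tree, c) for c in tree.get(v, []))


def height(tree, v):
    """Length of the longest ancestor chain starting at v."""
    return 1 + max((height(tree, c) for c in tree.get(v, [])), default=0)


def width(tree, v):
    """Size of the largest antichain (pairwise non-ancestor nodes) below v.

    Either the antichain is {v} alone, or it avoids v and is a union of
    antichains taken independently in the children's subtrees.
    """
    return max(1, sum(width(tree, c) for c in tree.get(v, [])))


# Hypothetical example tree: node -> list of children.
TREE = {0: [1, 2], 1: [3, 4], 2: [5], 5: [6]}

print(size(TREE, 0), height(TREE, 0), width(TREE, 0))
```

So with $N$ nodes, if no chain has length $k$, some antichain has more than $N/k$ elements, which is the form in which the dichotomy is used above.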

Now we are ready to state a big combination lemma.

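Super Lemma. For every context-free language $L$ there is a $p$ such that for any $k, \ell$, marking the same at least $pk\ell$ positions in $N \ge \max(pn^{2k+2}, pn^{3\ell+1})$ words $s_1, \ldots, s_N \in L$, each of length $n$, there are two words, $s$ and $s'$, that can be written as $s = u v_1 \ldots v_k w x_k \ldots x_1 y$ and $s' = u' v_1' \ldots v_k' w' x_k' \ldots x_1' y'$ OR as $s = u v_1 w_1 x_1 \ldots v_\ell w_\ell x_\ell y$ and $s' = u' v_1' w_1' x_1' \ldots v_\ell' w_\ell' x_\ell' y'$ such that the respective lengths of the subwords are all the same, i.e., $|u| = |u'|$, $|v_i| = |v_i'|$, etc., and $\forall i$ $v_i x_i$ contains a mark, and for every sequence $(i_j)_{j=1}^m$ the word $\bar{u}\bar{v}_{i_1} \ldots \bar{v}_{i_m} \bar{w} \bar{x}_{i_m} \ldots \bar{x}_{i_1} \bar{y} \in L$ OR $\bar{u}\bar{v}_1^{i_1} \bar{w}_1 \bar{x}_1^{i_1} \ldots \bar{v}_\ell^{i_\ell} \bar{w}_\ell \bar{x}_\ell^{i_\ell} \bar{y} \in L$, respectively, where $\bar{z}$ stands for $z$ or $z'$, i.e., we can freely mix the intermediate subwords from $s$ and $s'$, but only such that $\bar{u} = u$ if and only if $\bar{y} = y$ etc.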

Proof sketch of Super Lemma: Apply the Dilworth Ogden's Lemma for each $s_i$. There are $\binom{n+2k+2}{2k+2}$ and $\binom{n+3\ell+1}{3\ell+1}$ possible options, respectively, for where the boundaries between the subwords of $s_i$ can be. There are a constant number of non-terminals in the language, thus by the pigeonhole principle, if $p$ is large enough, the same non-terminal is pumped in the $k$/$\ell$ rules for at least two words, $s$ and $s'$, that also have the same subword boundaries.

Unfortunately, the number N that comes from this lemma is too large for our application. We can, however, decrease it by demanding fewer coincidences among the subwords. Now we state the lemma that we'll use.

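Special Lemma. For every context-free language $L$ there is a $p$ such that for any $k$, marking the same at least $pk$ positions in $N = pkn^2$ words $s_1, \ldots, s_N \in L$, each of length $n$, there are two words, $s$ and $s'$, that can be written either as $s = u v_1 \ldots v_k w x_k \ldots x_1 y$ and $s' = u' v_1' \ldots v_k' w' x_k' \ldots x_1' y'$ such that either

case (i): $\exists i < k$ such that $|x_i| = |x_i'| = 0$, $|u v_1 \ldots v_{i-1}| = |u' v_1' \ldots v_{i-1}'|$, and $|v_i| = |v_i'|$ (i.e., the two latter conditions mean that the position of $v_i$ is the same as the position of $v_i'$), OR

case (ii): $\forall i < k$ $|x_i| \ge 1$, $|x_i'| \ge 1$, $|u v_1 \ldots v_{k-1}| = |u' v_1' \ldots v_{k-1}'|$ and $|v_k w x_k| = |v_k' w' x_k'|$ (i.e., these two conditions mean that the position of $v_k w x_k$ is the same as the position of $v_k' w' x_k'$),

and (for both cases) $\forall i$ $v_i x_i$ contains a mark, and for every sequence $(i_j)_{j=1}^m$ the word $\bar{u}\bar{v}_{i_1} \ldots \bar{v}_{i_m} \bar{w} \bar{x}_{i_m} \ldots \bar{x}_{i_1} \bar{y} \in L$, where $\bar{z}$ stands for $z$ or $z'$, i.e., we can freely mix the intermediate subwords from $s$ and $s'$, but only such that $\bar{u} = u$ if and only if $\bar{y} = y$ etc., OR

case (iii): $s$ and $s'$ can be written as $s = u v_1 w_1 x_1 v_2 w_2 x_2 y$ and $s' = u' v_1' w_1' x_1' v_2' w_2' x_2' y'$ such that $|u| = |u'|$ and $|v_1 w_1 x_1| = |v_1' w_1' x_1'|$, and $\forall i$ $v_i x_i$ contains a mark, and $u v_1^h w_1 x_1^h v_2 w_2 x_2 y \in L$ and $u' v_1^h w_1 x_1^h v_2' w_2' x_2' y' \in L$.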

The proof differs from the Super Lemma's only in that there are $k\binom{n}{2}$ possible options for a word in case (i), which leaves $\binom{n}{2}$ options for case (ii), while in case (iii) there are $\binom{n}{2}$ options.

Corollary. If for every $p$ there are $t$ and $n$ with $n \ge p(t+1)+t$ such that there are $N = n^3$ words of length $n$ in a context-free language $L$ whose first $p(t+1)$ letters are the same for each word, and whose last $t$ letters are different for each pair of words (i.e., the words look like $s_i = s_i^{beg} s_i^{mid} s_i^{end}$ such that $|s_i^{beg}| = p(t+1)$, $|s_i^{mid}| = n - p(t+1) - t$, $|s_i^{end}| = t$, and $\forall i \ne j$ $s_i^{beg} = s_j^{beg}$ and $s_i^{end} \ne s_j^{end}$), then there is a $B$ such that there are infinitely many pairs of words $a_h \ne b_h \in L$ of equal length that differ only in their last $B$ letters.

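Proof: Take $k = t+1$ and apply the Special Lemma for our $N$ words using $N = n^3 \ge p(t+1)n^2$, marking the first $p(t+1)$ letters (that are the same in every word) to obtain $s = u v_1 \ldots v_{t+1} w x_{t+1} \ldots x_1 y$ and $s' = u' v_1' \ldots v_{t+1}' w' x_{t+1}' \ldots x_1' y'$ OR $s = u v_1 w_1 x_1 v_2 w_2 x_2 y$ and $s' = u' v_1' w_1' x_1' v_2' w_2' x_2' y'$.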

If we are in case (i) of the Special Lemma, i.e., there is an $i$ such that $|x_i| = |x_i'| = 0$, $|u v_1 \ldots v_{i-1}| = |u' v_1' \ldots v_{i-1}'|$, and $|v_i| = |v_i'|$, then $u v_1 \ldots v_{i-1} = u' v_1' \ldots v_{i-1}'$ and $v_i = v_i'$ also hold, as $v_{i+1} w x_{i+1}$ needs to contain a marked letter, thus the subwords preceding $v_{i+1}$ consist of only marked letters, and these are the same in $s$ and $s'$. We can take the words $a_h = u v_1 \ldots v_i^h \ldots v_t w x_t \ldots x_1 y$ and $b_h = u v_1 \ldots v_i^h v_{i+1}' \ldots v_t' w' x_t' \ldots x_1' y'$ to obtain the desired pairs; since these words end the same way as $s$ and $s'$, $a_h \ne b_h$ and they differ only in their last bounded many letters.

If we are in case (ii) of the Special Lemma, i.e., $\forall i < k$ $|x_i| \ge 1$, $|x_i'| \ge 1$, $|u v_1 \ldots v_t| = |u' v_1' \ldots v_t'|$ and $|v_{t+1} w x_{t+1}| = |v_{t+1}' w' x_{t+1}'|$, then $u v_1 \ldots v_t = u' v_1' \ldots v_t'$ also holds, similarly as in the previous case. Now we can take $a_h = u v_1 \ldots v_t v_{t+1}^h w x_{t+1}^h x_t \ldots x_1 y$ and $b_h = u v_1 \ldots v_t v_{t+1}^h w x_{t+1}^h x_t' \ldots x_1' y'$; since $|x_t \ldots x_1 y| = |x_t' \ldots x_1' y'| \ge t$, these words certainly end differently and can differ only in their last bounded many letters. (Note that this is the only place where we really need that we can pump one word with a subword of the other one.)

If we are in case (iii) of the Special Lemma, i.e., $s = u v_1 w_1 x_1 v_2 w_2 x_2 y$ and $s' = u' v_1' w_1' x_1' v_2' w_2' x_2' y'$ such that $|u| = |u'|$ and $|v_1 w_1 x_1| = |v_1' w_1' x_1'|$, then $u = u'$ and $v_1 w_1 x_1 = v_1' w_1' x_1'$ also hold, similarly as in the previous cases. Now we can take $a_h = u v_1^h w_1 x_1^h v_2 w_2 x_2 y \in L$ and $b_h = u v_1^h w_1 x_1^h v_2' w_2' x_2' y' \in L$; since $v_2$ contains a marked letter, $|v_2 w_2 x_2 y| \ge t$, thus these words certainly end differently and can differ only in their last bounded many letters.

This finishes the proof of the Corollary. Now let's see how to prove the original question from the Corollary.

Final proof. First we show that the condition of the Corollary is satisfied for every integer-valued polynomial $f$. Set $t = p - 1$ and $n = C^p$ for some large enough $C = C(f)$. The plan is to take some numbers $x_1, \ldots, x_{2N}$ (where $N = 2n^3$) for which $f(x_i) \ne f(x_j)$, and then add some sufficiently large number $z$ to each of them to obtain the desired words $s_i = f(x_i + z)$. If the degree of $f$ is $d$, then at most $d$ numbers can take the same value, thus we can select $x_1, \ldots, x_{2N}$ from the first $2dN$ numbers, which means that they have $\log(dn)$ digits. In this case $f(x_i) = O((dN)^d)$, thus each $f(x_i)$ will have at most $d \log N + O(1) = O(\log n)$ digits. If we pick $z$ to be an $n/d$ digit number, then $f(z)$ will have $n$ digits, and for each $f(x_i + z)$ only the last $O(\log n)$ digits can differ. The $f(x_i + z)$ will all have $n$ or $n+1$ digits, thus at least half, i.e., $N$ of them have the same length; these will be the $s_i$.

From the conclusion of the Corollary we obtain infinitely many pairs of numbers $a_h$ and $b_h$ such that $|a_h - b_h| \le 2^B$, which is clearly impossible for polynomials.

Licensed under cc by-sa 3.0 with attribution required.