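Fastest semiprime factorization

Write a program to factorize a semiprime number in the shortest amount of time.

For testing purposes, use this: 38!+1 (523022617466601111760007224100074291200000001)

It is equal to: 14029308060317546154181 × 37280713718589679646221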

fastest-code primes

— Soham Chowdhury

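As much as I like the "fastest" aspect, since it gives languages like C an advantage over typical code-golfing languages, I wonder how you will test the results?

— Mr Lister, 2012

If you mean that 12259243 will be used to test how fast the programs are, the results will be so small that you won't get any statistically significant differences.

— Peter Taylor

I've added a larger number, thanks for the heads-up.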

— Soham Chowdhury

@Mr Lister, I'll test it on my own PC.

— Soham Chowdhury

inb4 someone uses preprocessor abuse to write a 400-exabyte lookup table.

— Wug, 2012

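Python (with PyPy JIT v1.9) ~1.9s

Using a Multiple Polynomial Quadratic Sieve. I took this to be a code challenge, so I opted not to use any external libraries (other than the standard log function, I suppose). When timing, the PyPy JIT should be used, as it results in timings 4-5 times faster than those of cPython.

Update (2013-07-29):
Since originally posting, I've made several minor but significant changes which increase the overall speed by a factor of about 2.5x.

Update (2014-08-27):
As this post is still receiving attention, I've updated my_math.py to correct two errors, for anyone who may be using it:

isqrt was faulty, sometimes producing incorrect output for values very close to a perfect square. This has been corrected, and the performance improved by using a much better seed.
is_prime has been updated. My previous attempt to remove perfect square 2-sprps was half-hearted at best. I've added a 3-sprp check - a technique used by Mathematica - to ensure that the tested value is square-free.

Update (2014-11-24):
If no non-trivial congruences are found at the end of the calculation, the program now sieves additional polynomials. This was previously marked in the code as TODO.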

mpqs.py

from my_math import *
from math import log
from time import clock
from argparse import ArgumentParser

# Multiple Polynomial Quadratic Sieve
def mpqs(n, verbose=False):
  if verbose:
    time1 = clock()

  root_n = isqrt(n)
  root_2n = isqrt(n+n)

  # formula chosen by experimentation
  # seems to be close to optimal for n < 10^50
  bound = int(5 * log(n, 10)**2)

  prime = []
  mod_root = []
  log_p = []
  num_prime = 0

  # find a number of small primes for which n is a quadratic residue
  p = 2
  while p < bound or num_prime < 3:

    # legendre (n|p) is only defined for odd p
    if p > 2:
      leg = legendre(n, p)
    else:
      leg = n & 1

    if leg == 1:
      prime += [p]
      mod_root += [int(mod_sqrt(n, p))]
      log_p += [log(p, 10)]
      num_prime += 1
    elif leg == 0:
      if verbose:
        print 'trial division found factors:'
        print p, 'x', n/p
      return p

    p = next_prime(p)

  # size of the sieve
  x_max = len(prime)*60

  # maximum value on the sieved range
  m_val = (x_max * root_2n) >> 1

  # fudging the threshold down a bit makes it easier to find powers of primes as factors
  # as well as partial-partial relationships, but it also makes the smoothness check slower.
  # there's a happy medium somewhere, depending on how efficient the smoothness check is
  thresh = log(m_val, 10) * 0.735

  # skip small primes. they contribute very little to the log sum
  # and add a lot of unnecessary entries to the table
  # instead, fudge the threshold down a bit, assuming ~1/4 of them pass
  min_prime = int(thresh*3)
  fudge = sum(log_p[i] for i,p in enumerate(prime) if p < min_prime)/4
  thresh -= fudge

  if verbose:
    print 'smoothness bound:', bound
    print 'sieve size:', x_max
    print 'log threshold:', thresh
    print 'skipping primes less than:', min_prime

  smooth = []
  used_prime = set()
  partial = {}
  num_smooth = 0
  num_used_prime = 0
  num_partial = 0
  num_poly = 0
  root_A = isqrt(root_2n / x_max)

  if verbose:
    print 'sieving for smooths...'
  while True:
    # find an integer value A such that:
    # A is =~ sqrt(2*n) / x_max
    # A is a perfect square
    # sqrt(A) is prime, and n is a quadratic residue mod sqrt(A)
    while True:
      root_A = next_prime(root_A)
      leg = legendre(n, root_A)
      if leg == 1:
        break
      elif leg == 0:
        if verbose:
          print 'dumb luck found factors:'
          print root_A, 'x', n/root_A
        return root_A

    A = root_A * root_A

    # solve for an adequate B
    # B*B is a quadratic residue mod n, such that B*B-A*C = n
    # this is unsolvable if n is not a quadratic residue mod sqrt(A)
    b = mod_sqrt(n, root_A)
    B = (b + (n - b*b) * mod_inv(b + b, root_A))%A

    # B*B-A*C = n <=> C = (B*B-n)/A
    C = (B*B - n) / A

    num_poly += 1

    # sieve for prime factors
    sums = [0.0]*(2*x_max)
    i = 0
    for p in prime:
      if p < min_prime:
        i += 1
        continue
      logp = log_p[i]

      inv_A = mod_inv(A, p)
      # modular root of the quadratic
      a = int(((mod_root[i] - B) * inv_A)%p)
      b = int(((p - mod_root[i] - B) * inv_A)%p)

      k = 0
      while k < x_max:
        if k+a < x_max:
          sums[k+a] += logp
        if k+b < x_max:
          sums[k+b] += logp
        if k:
          sums[k-a+x_max] += logp
          sums[k-b+x_max] += logp

        k += p
      i += 1

    # check for smooths
    i = 0
    for v in sums:
      if v > thresh:
        x = x_max-i if i > x_max else i
        vec = set()
        sqr = []
        # because B*B-n = A*C
        # (A*x+B)^2 - n = A*A*x*x+2*A*B*x + B*B - n
        #               = A*(A*x*x+2*B*x+C)
        # gives the congruency
        # (A*x+B)^2 = A*(A*x*x+2*B*x+C) (mod n)
        # because A is chosen to be square, it doesn't need to be sieved
        val = sieve_val = A*x*x + 2*B*x + C

        if sieve_val < 0:
          vec = set([-1])
          sieve_val = -sieve_val

        for p in prime:
          while sieve_val%p == 0:
            if p in vec:
              # keep track of perfect square factors
              # to avoid taking the sqrt of a gigantic number at the end
              sqr += [p]
            vec ^= set([p])
            sieve_val = int(sieve_val / p)

        if sieve_val == 1:
          # smooth
          smooth += [(vec, (sqr, (A*x+B), root_A))]
          used_prime |= vec
        elif sieve_val in partial:
          # combine two partials to make a (xor) smooth
          # that is, every prime factor with an odd power is in our factor base
          pair_vec, pair_vals = partial[sieve_val]
          sqr += list(vec & pair_vec) + [sieve_val]
          vec ^= pair_vec
          smooth += [(vec, (sqr + pair_vals[0], (A*x+B)*pair_vals[1], root_A*pair_vals[2]))]
          used_prime |= vec
          num_partial += 1
        else:
          # save partial for later pairing
          partial[sieve_val] = (vec, (sqr, A*x+B, root_A))
      i += 1

    num_smooth = len(smooth)
    num_used_prime = len(used_prime)

    if verbose:
      print 100 * num_smooth / num_prime, 'percent complete\r',

    if num_smooth > num_used_prime:
      if verbose:
        print '%d polynomials sieved (%d values)'%(num_poly, num_poly*x_max*2)
        print 'found %d smooths (%d from partials) in %f seconds'%(num_smooth, num_partial, clock()-time1)
        print 'solving for non-trivial congruencies...'

      used_prime_list = sorted(list(used_prime))

      # set up bit fields for gaussian elimination
      masks = []
      mask = 1
      bit_fields = [0]*num_used_prime
      for vec, vals in smooth:
        masks += [mask]
        i = 0
        for p in used_prime_list:
          if p in vec: bit_fields[i] |= mask
          i += 1
        mask <<= 1

      # row echelon form
      col_offset = 0
      null_cols = []
      for col in xrange(num_smooth):
        pivot = col-col_offset == num_used_prime or bit_fields[col-col_offset] & masks[col] == 0
        for row in xrange(col+1-col_offset, num_used_prime):
          if bit_fields[row] & masks[col]:
            if pivot:
              bit_fields[col-col_offset], bit_fields[row] = bit_fields[row], bit_fields[col-col_offset]
              pivot = False
            else:
              bit_fields[row] ^= bit_fields[col-col_offset]
        if pivot:
          null_cols += [col]
          col_offset += 1

      # reduced row echelon form
      for row in xrange(num_used_prime):
        # lowest set bit
        mask = bit_fields[row] & -bit_fields[row]
        for up_row in xrange(row):
          if bit_fields[up_row] & mask:
            bit_fields[up_row] ^= bit_fields[row]

      # check for non-trivial congruencies
      for col in null_cols:
        all_vec, (lh, rh, rA) = smooth[col]
        lhs = lh   # sieved values (left hand side)
        rhs = [rh] # sieved values - n (right hand side)
        rAs = [rA] # root_As (cofactor of lhs)
        i = 0
        for field in bit_fields:
          if field & masks[col]:
            vec, (lh, rh, rA) = smooth[i]
            lhs += list(all_vec & vec) + lh
            all_vec ^= vec
            rhs += [rh]
            rAs += [rA]
          i += 1

        factor = gcd(list_prod(rAs)*list_prod(lhs) - list_prod(rhs), n)
        if factor != 1 and factor != n:
          break
      else:
        if verbose:
          print 'none found.'
        continue
      break

  if verbose:
    print 'factors found:'
    print factor, 'x', n/factor
    print 'time elapsed: %f seconds'%(clock()-time1)
  return factor

if __name__ == "__main__":
  parser =ArgumentParser(description='Uses a MPQS to factor a composite number')
  parser.add_argument('composite', metavar='number_to_factor', type=long,
      help='the composite number to factor')
  parser.add_argument('--verbose', dest='verbose', action='store_true',
      help="enable verbose output")
  args = parser.parse_args()

  if args.verbose:
    mpqs(args.composite, args.verbose)
  else:
    time1 = clock()
    print mpqs(args.composite)
    print 'time elapsed: %f seconds'%(clock()-time1)

my_math.py

# divide and conquer list product
def list_prod(a):
  size = len(a)
  if size == 1:
    return a[0]
  return list_prod(a[:size>>1]) * list_prod(a[size>>1:])

# greatest common divisor of a and b
def gcd(a, b):
  while b:
    a, b = b, a%b
  return a

# modular inverse of a mod m
def mod_inv(a, m):
  a = int(a%m)
  x, u = 0, 1
  while a:
    x, u = u, x - (m/a)*u
    m, a = a, m%a
  return x

# legendre symbol (a|m)
# note: returns m-1 if a is a non-residue, instead of -1
def legendre(a, m):
  return pow(a, (m-1) >> 1, m)

# modular sqrt(n) mod p
# p must be prime
def mod_sqrt(n, p):
  a = n%p
  if p%4 == 3:
    return pow(a, (p+1) >> 2, p)
  elif p%8 == 5:
    v = pow(a << 1, (p-5) >> 3, p)
    i = ((a*v*v << 1) % p) - 1
    return (a*v*i)%p
  elif p%8 == 1:
    # Tonelli-Shanks method
    q = p-1
    e = 0
    while q&1 == 0:
      e += 1
      q >>= 1

    n = 2
    while legendre(n, p) != p-1:
      n += 1

    w = pow(a, q, p)
    x = pow(a, (q+1) >> 1, p)
    y = pow(n, q, p)
    r = e
    while True:
      if w == 1:
        return x

      v = w
      k = 0
      while v != 1 and k+1 < r:
        v = (v*v)%p
        k += 1

      if k == 0:
        return x

      d = pow(y, 1 << (r-k-1), p)
      x = (x*d)%p
      y = (d*d)%p
      w = (w*y)%p
      r = k
  else: # p == 2
    return a

#integer sqrt of n
def isqrt(n):
  c = n*4/3
  d = c.bit_length()

  a = d>>1
  if d&1:
    x = 1 << a
    y = (x + (n >> a)) >> 1
  else:
    x = (3 << a) >> 2
    y = (x + (c >> a)) >> 1

  if x != y:
    x = y
    y = (x + n/x) >> 1
    while y < x:
      x = y
      y = (x + n/x) >> 1
  return x

# strong probable prime
def is_sprp(n, b=2):
  if n < 2: return False
  d = n-1
  s = 0
  while d&1 == 0:
    s += 1
    d >>= 1

  x = pow(b, d, n)
  if x == 1 or x == n-1:
    return True

  for r in xrange(1, s):
    x = (x * x)%n
    if x == 1:
      return False
    elif x == n-1:
      return True

  return False

# lucas probable prime
# assumes D = 1 (mod 4), (D|n) = -1
def is_lucas_prp(n, D):
  P = 1
  Q = (1-D) >> 2

  # n+1 = 2**r*s where s is odd
  s = n+1
  r = 0
  while s&1 == 0:
    r += 1
    s >>= 1

  # calculate the bit reversal of (odd) s
  # e.g. 19 (10011) <=> 25 (11001)
  t = 0
  while s:
    if s&1:
      t += 1
      s -= 1
    else:
      t <<= 1
      s >>= 1

  # use the same bit reversal process to calculate the sth Lucas number
  # keep track of q = Q**n as we go
  U = 0
  V = 2
  q = 1
  # mod_inv(2, n)
  inv_2 = (n+1) >> 1
  while t:
    if t&1:
      # U, V of n+1
      U, V = ((U + V) * inv_2)%n, ((D*U + V) * inv_2)%n
      q = (q * Q)%n
      t -= 1
    else:
      # U, V of n*2
      U, V = (U * V)%n, (V * V - 2 * q)%n
      q = (q * q)%n
      t >>= 1

  # double s until we have the 2**r*sth Lucas number
  while r:
    U, V = (U * V)%n, (V * V - 2 * q)%n
    q = (q * q)%n
    r -= 1

  # primality check
  # if n is prime, n divides the n+1st Lucas number, given the assumptions
  return U == 0

# primes less than 212
small_primes = set([
    2,  3,  5,  7, 11, 13, 17, 19, 23, 29,
   31, 37, 41, 43, 47, 53, 59, 61, 67, 71,
   73, 79, 83, 89, 97,101,103,107,109,113,
  127,131,137,139,149,151,157,163,167,173,
  179,181,191,193,197,199,211])

# pre-calced sieve of eratosthenes for n = 2, 3, 5, 7
indices = [
    1, 11, 13, 17, 19, 23, 29, 31, 37, 41,
   43, 47, 53, 59, 61, 67, 71, 73, 79, 83,
   89, 97,101,103,107,109,113,121,127,131,
  137,139,143,149,151,157,163,167,169,173,
  179,181,187,191,193,197,199,209]

# distances between sieve values
offsets = [
  10, 2, 4, 2, 4, 6, 2, 6, 4, 2, 4, 6,
   6, 2, 6, 4, 2, 6, 4, 6, 8, 4, 2, 4,
   2, 4, 8, 6, 4, 6, 2, 4, 6, 2, 6, 6,
   4, 2, 4, 6, 2, 6, 4, 2, 4, 2,10, 2]

max_int = 2147483647

# an 'almost certain' primality check
def is_prime(n):
  if n < 212:
    return n in small_primes

  for p in small_primes:
    if n%p == 0:
      return False

  # if n is a 32-bit integer, perform full trial division
  if n <= max_int:
    i = 211
    while i*i < n:
      for o in offsets:
        i += o
        if n%i == 0:
          return False
    return True

  # Baillie-PSW
  # this is technically a probabilistic test, but there are no known pseudoprimes
  if not is_sprp(n, 2): return False

  # idea shamelessly stolen from Mathematica
  # if n is a 2-sprp and a 3-sprp, n is necessarily square-free
  if not is_sprp(n, 3): return False

  a = 5
  s = 2
  # if n is a perfect square, this will never terminate
  while legendre(a, n) != n-1:
    s = -s
    a = s-a
  return is_lucas_prp(n, a)

# next prime strictly larger than n
def next_prime(n):
  if n < 2:
    return 2
  # first odd larger than n
  n = (n + 1) | 1
  if n < 212:
    while True:
      if n in small_primes:
        return n
      n += 2

  # find our position in the sieve rotation via binary search
  x = int(n%210)
  s = 0
  e = 47
  m = 24
  while m != e:
    if indices[m] < x:
      s = m
      m = (s + e + 1) >> 1
    else:
      e = m
      m = (s + e) >> 1

  i = int(n + (indices[m] - x))
  # adjust offsets
  offs = offsets[m:] + offsets[:m]
  while True:
    for o in offs:
      if is_prime(i):
        return i
      i += o

Sample I/O:

$ pypy mpqs.py --verbose 94968915845307373740134800567566911
smoothness bound: 6117
sieve size: 24360
log threshold: 14.3081031579
skipping primes less than: 47
sieving for smooths...
144 polynomials sieved (7015680 values)
found 405 smooths (168 from partials) in 0.513794 seconds
solving for non-trivial congruencies...
factors found:
216366620575959221 x 438925910071081891
time elapsed: 0.685765 seconds

$ pypy mpqs.py --verbose 523022617466601111760007224100074291200000001
smoothness bound: 9998
sieve size: 37440
log threshold: 15.2376302725
skipping primes less than: 59
sieving for smooths...
428 polynomials sieved (32048640 values)
found 617 smooths (272 from partials) in 1.912131 seconds
solving for non-trivial congruencies...
factors found:
14029308060317546154181 x 37280713718589679646221
time elapsed: 2.064387 seconds

Note: not using the --verbose option will give better timings:

$ pypy mpqs.py 94968915845307373740134800567566911
216366620575959221
time elapsed: 0.630235 seconds

$ pypy mpqs.py 523022617466601111760007224100074291200000001
14029308060317546154181
time elapsed: 1.886068 seconds

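Basic Concept

In general, a quadratic sieve is based on the following observation: any odd composite n may be represented as:

$n=(x+d)(x-d)=x^2-d^2\qquad\Rightarrow\qquad d^2=x^2-n$

This is not very difficult to confirm. Since n is odd, the distance between any two cofactors of n must be an even number 2d, where x is the midpoint between them. Moreover, the same relation holds for any multiple of n:

$abn=(ax+ad)(bx-bd)=abx^2-abd^2\qquad\Rightarrow\qquad abd^2=abx^2-abn$

Note that if any such x and d can be found, it will immediately result in a (not necessarily prime) factor of n, since x + d and x - d both divide n by definition. This relation can be further weakened - at the cost of allowing potentially trivial congruences - to the following form:

$d^2\equiv x^2\pmod{n}$

So in general, if we can find two perfect squares which are equivalent mod n, then it's fairly likely that we can directly produce a factor of n a la gcd(x ± d, n). Seems pretty simple, right?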
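To make the congruence concrete, here is a minimal sketch. The helper name factor_from_squares and the toy modulus 35 are chosen purely for illustration and are not part of mpqs.py. It shows one pair of congruent squares that yields a factor, and one trivial pair (x ≡ -d) that yields nothing through x - d:

```python
from math import gcd

def factor_from_squares(x, d, n):
    """Given x*x == d*d (mod n), return gcd(x - d, n).

    A result of 1 or n means the congruence was trivial for this sign choice.
    """
    if (x * x - d * d) % n != 0:
        raise ValueError("x^2 and d^2 are not congruent mod n")
    return gcd(x - d, n)

# 12^2 = 144 = 4*35 + 4, so 12^2 == 2^2 (mod 35)
print(factor_from_squares(12, 2, 35))   # 5
# 33 == -2 (mod 35), so x - d shares no factor with n
print(factor_from_squares(33, 2, 35))   # 1
```

In the second case only gcd(x + d, n) = 35 is left, which is n itself, so the relation carries no information.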

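Except it's not. If we intended to conduct an exhaustive search over all possible x, we would need to search the entire range [√n, √(2n)], which is marginally smaller than full trial division, but also requires an expensive is_square operation on each iteration to confirm the value of d. Unless it is known beforehand that n has factors very near √n, trial division is likely to be faster.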
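The exhaustive search described here is essentially Fermat's factorization method. A minimal sketch, assuming n is odd and greater than 1 (the name fermat_search and the step counter are illustrative only):

```python
from math import isqrt

def fermat_search(n):
    """Search upward from ceil(sqrt(n)) for x such that x*x - n is a perfect square.

    Returns (x - d, x + d, steps). Assumes n is odd and n > 1.
    """
    x = isqrt(n)
    if x * x < n:
        x += 1                  # start at ceil(sqrt(n))
    steps = 0
    while True:
        d2 = x * x - n
        d = isqrt(d2)           # the is_square check paid on every iteration
        if d * d == d2:
            return x - d, x + d, steps
        x += 1
        steps += 1

print(fermat_search(8051))      # factors close to sqrt(n): found immediately
print(fermat_search(8049))      # 3 * 2683: factors far apart, many iterations
```

When the two factors are close together the search ends almost at once; when they are far apart, the number of iterations grows to the point where plain trial division would have been cheaper.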

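Perhaps we can weaken this relation even more. Suppose we choose an x such that for

$y\equiv x^2\pmod{n}$

a full prime factorization of y is readily known. If we had enough such relations, we should be able to construct an adequate d by choosing a number of y such that their product is a perfect square; that is, all prime factors are used an even number of times. In fact, if we have more such y than the total number of unique prime factors they contain, a solution is guaranteed to exist; it becomes a system of linear equations. The question now becomes: how do we choose such x? That's where sieving comes into play.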

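The Sieve

Consider the polynomial:

$y(x)=x^2-n$

Then for any prime p and integer k, the following is true:

$y(x+kp)=(x+kp)^2-n\\y(x+kp)=x^2+2xkp+(kp)^2-n\\y(x+kp)=y(x)+2xkp+(kp)^2\equiv y(x)\pmod{p}$

This means that after solving for the roots of the polynomial mod p - that is, you've found an x such that y(x) ≡ 0 (mod p), ergo y is divisible by p - you have found an infinite number of such x. In this way, you can sieve over a range of x, identifying small prime factors of y, hopefully finding some for which all prime factors are small. Such numbers are known as k-smooth, where k is the largest prime factor used.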
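The periodicity can be sketched in a few lines. This is an illustration only: the roots are found by brute force, which is fine for tiny p, whereas mpqs.py uses mod_sqrt. The function names are hypothetical.

```python
def roots_mod_p(n, p):
    """All r in [0, p) with r*r == n (mod p), by brute force (fine for tiny p)."""
    return [r for r in range(p) if (r * r - n) % p == 0]

def sieve_divisible(n, p, x_lo, x_hi):
    """Every x in [x_lo, x_hi) for which p divides y(x) = x*x - n.

    Found by stepping in strides of p from each root; y(x) is never divided.
    """
    hits = []
    for r in roots_mod_p(n, p):
        start = x_lo + (r - x_lo) % p     # first x >= x_lo with x == r (mod p)
        hits.extend(range(start, x_hi, p))
    return sorted(hits)

print(roots_mod_p(8051, 5))               # [1, 4]
print(sieve_divisible(8051, 5, 90, 100))  # [91, 94, 96, 99]
```

For n = 8051 the values y(91) = 230, y(94) = 785, y(96) = 1165 and y(99) = 1750 are all divisible by 5, as the stride predicts.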

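There are a few problems with this approach, though. Not all values of x are adequate; in fact, there are only very few of them, centered around √n. Smaller values will become largely negative (due to the -n term), and larger values will become too large, such that it is unlikely that their prime factorization consists only of small primes. There will be a number of such x, but unless the composite you're factoring is very small, it's highly unlikely that you'll find enough smooths to result in a factorization. And so, for larger n, it becomes necessary to sieve over multiple polynomials of a given form.

Multiple Polynomials

So we need more polynomials to sieve? How about this: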

$y(x)=(Ax+B)^2-n$

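That'll work. Note that A and B could literally be any integer values, and the math still holds. All we need to do is choose a few random values, solve for the roots of the polynomial, and sieve the values close to zero. At this point we could just call it good enough: if you throw enough stones in random directions, you're bound to break a window sooner or later.

Except, there's a problem with that too. If the slope of the polynomial is large at the x-intercept, which it will be if it's not relatively flat, there will only be a few suitable values to sieve per polynomial. It'll work, but you'll end up sieving a whole lot of polynomials before you get what you need. Can we do better?

We can do better. An observation due to Montgomery is as follows: if A and B are chosen such that there exists some C satisfying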

$B^2-n=AC$

then the entire polynomial can be rewritten as

$y(x)=(Ax+B)^2-n=(Ax)^2+2ABx+B^2-n=A(Ax^2+2Bx+C)$

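Furthermore, if A is chosen to be a perfect square, the leading A term can be neglected while sieving, resulting in much smaller values and a much flatter curve. For such a solution to exist, n must be a quadratic residue mod √A, which can be checked immediately by computing the Legendre symbol: (n|√A) = 1. Note that in order to solve for B, a complete prime factorization of √A needs to be known (in order to take the modular square root √n (mod √A)), which is why √A is typically chosen to be prime.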
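A minimal sketch of this construction, assuming q = √A is an odd prime. The function name make_polynomial is illustrative, and the square root mod q is found by brute force here, whereas mpqs.py uses mod_sqrt and mod_inv. The lift from a root mod q to a root mod q² follows the same formula as the B computation in mpqs.py:

```python
def make_polynomial(n, q):
    """Build (A, B, C) with A = q*q and B*B - n == A*C, for an odd prime q with (n|q) = 1."""
    if pow(n, (q - 1) // 2, q) != 1:
        raise ValueError("n is not a quadratic residue mod q")
    # square root of n mod q, by brute force (illustration only)
    b = next(r for r in range(1, q) if (r * r - n) % q == 0)
    # lift the root mod q to a root mod q*q
    inv_2b = pow(2 * b, q - 2, q)          # inverse of 2b mod q (Fermat's little theorem)
    A = q * q
    B = (b + (n - b * b) * inv_2b) % A
    C = (B * B - n) // A                   # exact, since B*B == n (mod A)
    return A, B, C

A, B, C = make_polynomial(8051, 13)
print(A, B, C)                             # 169 28 -43
# the rewritten form agrees with the original polynomial
x = 5
print((A * x + B) ** 2 - 8051 == A * (A * x * x + 2 * B * x + C))   # True
```

Only the inner factor Ax² + 2Bx + C needs to be sieved, since A is a known square.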

It can then be shown that if $A\approx\frac{\sqrt{2n}}{M}$, then for all values of $x\in[-M,M]$:

$|y(x)|\le\frac{M\sqrt{2n}}{2}$

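And now, finally, we have all the components necessary to implement our sieve. Or do we?

Powers of Primes as Factors

Our sieve, as described above, has one major flaw. It can identify which values of x will result in a y divisible by p, but it cannot identify whether or not this y is divisible by a power of p. In order to determine that, we would need to perform trial division on the value to be sieved, until it is no longer divisible by p. We seem to have reached an impasse: the whole point of the sieve was so that we didn't have to do that. Time to check the playbook.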

$\ln(abcd\cdots)=\ln(a)+\ln(b)+\ln(c)+\ln(d)+\ln(\cdots)$

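That looks pretty useful. If the sum of the ln of all of the small prime factors of y is close to the expected value of ln(y), then it's almost a given that y has no other factors. Additionally, if we adjust the expected value down a little bit, we can also identify values as smooth which have several powers of primes as factors. In this way, we can use the sieve as a "pre-screening" process, and only factor those values which are likely to be smooth.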
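A toy version of this pre-screening, using the single polynomial y(x) = x² - n rather than the multiple polynomials of mpqs.py. The function names, the brute-force root search, and the slack parameter (the fraction of ln(y) a candidate must reach) are all assumptions made for this sketch:

```python
from math import log

def log_sieve(n, primes, x_lo, x_hi, slack=0.75):
    """Return the x in [x_lo, x_hi) whose accumulated log sum suggests x*x - n is smooth."""
    sums = [0.0] * (x_hi - x_lo)
    for p in primes:
        # roots of x*x == n (mod p), by brute force for this toy example
        for r in (r for r in range(p) if (r * r - n) % p == 0):
            start = x_lo + (r - x_lo) % p          # first x >= x_lo with x == r (mod p)
            for x in range(start, x_hi, p):
                sums[x - x_lo] += log(p)           # p counted once, even if p*p divides y
    return [x for x in range(x_lo, x_hi)
            if x * x > n and sums[x - x_lo] >= slack * log(x * x - n)]

def is_smooth(v, primes):
    """Confirm a candidate by trial division over the factor base."""
    for p in primes:
        while v % p == 0:
            v //= p
    return v == 1

n, base = 8051, [2, 3, 5, 7, 11, 13]
candidates = log_sieve(n, base, 90, 200, slack=0.5)
confirmed = [x for x in candidates if is_smooth(x * x - n, base)]
print(len(candidates), "candidates,", len(confirmed), "confirmed smooth")
```

With slack=0.5, x = 99 is flagged: y(99) = 1750 = 2·5³·7 only collects ln 2 + ln 5 + ln 7 from the sieve, which is short of ln 1750 but above half of it. A lower slack catches more prime powers at the price of more trial divisions.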

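This has a few other advantages as well. Note that small primes contribute very little to the ln sum, yet they require the most sieve time. Sieving the value 3 requires more time than 11, 13, 17, 19, and 23 combined. Instead, we can just skip the first few primes, and adjust the threshold down accordingly, assuming a certain percentage of them would have passed.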
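The claim about the prime 3 can be checked with a rough cost model, assuming each prime p has two roots and therefore touches about 2/p of the sieve positions (the helper name is illustrative):

```python
from fractions import Fraction

def updates_per_position(p):
    """Approximate table updates per sieve position for prime p (two roots, stride p)."""
    return Fraction(2, p)

cost_of_3 = updates_per_position(3)
cost_of_rest = sum(updates_per_position(p) for p in (11, 13, 17, 19, 23))
print(float(cost_of_3), float(cost_of_rest))
print(cost_of_3 > cost_of_rest)   # True
```

Under this model, 3 alone costs slightly more than the five larger primes together, while contributing only ln 3 to each sum.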

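Another result is that a number of values will be allowed to "slip through" which are mostly smooth, but contain a single large cofactor. We could just discard these values, but suppose we found another mostly smooth value with exactly the same cofactor. We can then use these two values to construct a usable y; since their product will contain this large cofactor squared, it no longer needs to be considered.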
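A minimal sketch of pairing such partial relations, assuming they are kept in a dictionary keyed by the large cofactor (as the partial dictionary in mpqs.py is). The function name and the tuple layout are illustrative only:

```python
def add_relation(partials, x, smooth_part, cofactor):
    """Record a partial relation x*x == smooth_part * cofactor (mod n).

    If a partial with the same large cofactor is already stored, return the
    combined relation (X, S, L) with X*X == S * L*L (mod n); otherwise store
    this one and return None.
    """
    if cofactor in partials:
        x0, s0 = partials[cofactor]
        return x * x0, smooth_part * s0, cofactor
    partials[cofactor] = (x, smooth_part)
    return None

n = 8051
partials = {}
# 91^2 - n = 230 = 10 * 23 and 93^2 - n = 598 = 26 * 23 share the cofactor 23
print(add_relation(partials, 91, 10, 23))   # None
X, S, L = add_relation(partials, 93, 26, 23)
print(X, S, L)                              # 8463 260 23
print((X * X - S * L * L) % n == 0)         # True
```

The combined relation behaves like a full smooth: the cofactor 23 appears squared, so only the primes in S = 260 = 2²·5·13 matter for the parity vector.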

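Putting it all together

The last thing we need to do is to use these values of y to construct an adequate x and d. Suppose we only consider the non-square factors of y, that is, the prime factors with an odd power. Then each y can be expressed in the following manner: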

$y_0=p_0^0\cdot p_1^1\cdot p_2^1\cdots p_n^0\\y_1=p_0^1\cdot p_1^0\cdot p_2^1\cdots p_n^1\\y_2=p_0^0\cdot p_1^0\cdot p_2^0\cdots p_n^1\\y_3=p_0^1\cdot p_1^1\cdot p_2^0\cdots p_n^0\\\vdots$

which can be expressed in matrix form:

$M = \begin{bmatrix}0&1&1&\cdots&0\\1&0&1&\cdots&1\\0&0&0&\cdots&1\\1&1&0&\cdots&0\\\vdots\end{bmatrix}$

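The problem then becomes to find a vector v such that $vM\equiv\vec{0}\pmod{2}$, where $\vec{0}$ is the null vector. That is, to solve for the left null space of M. This can be done in a number of ways, the simplest of which is to perform Gaussian elimination on $M^T$, replacing the row addition operation with a row xor. This will result in a number of null space basis vectors, any combination of which will produce a valid solution.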
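One simple way to find such dependencies is sketched below. Note that this is a different arrangement from the bit-field elimination in mpqs.py: each row is reduced against a growing basis while a history mask records which original rows were xor-ed in. The function name is hypothetical, and the example treats the four visible columns of the matrix above as the whole matrix, which is an assumption since the original elides the middle columns.

```python
def find_dependencies(rows):
    """rows[i] is an int whose bit j is set if prime j has an odd exponent in y_i.

    Returns a list of index lists; the rows in each list xor to zero, so the
    product of the corresponding y values is a perfect square.
    """
    basis = {}      # pivot bit -> (reduced vector, history mask)
    deps = []
    for i, vec in enumerate(rows):
        hist = 1 << i
        while vec:
            pivot = vec & -vec              # lowest set bit
            if pivot not in basis:
                basis[pivot] = (vec, hist)  # independent row: keep it as a basis vector
                break
            bvec, bhist = basis[pivot]
            vec ^= bvec                     # row xor replaces row addition mod 2
            hist ^= bhist
        else:
            # vec was reduced to zero: hist names a dependent set of rows
            deps.append([j for j in range(len(rows)) if hist >> j & 1])
    return deps

# rows y_0..y_3 from the matrix above, columns read as bits 0..3
print(find_dependencies([0b0110, 0b1101, 0b1000, 0b0011]))   # [[0, 1, 2, 3]]
```

Here the only dependency uses all four rows: every column of the matrix has an even number of ones.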

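The construction of x is fairly straightforward. It is simply the product of Ax + B for each of the y used. The construction of d is slightly more complicated. If we were to take the product of all y, we would end up with a value with tens of thousands, if not hundreds of thousands, of digits, for which we would need to find the square root. This calculation is impractically expensive. Instead, we can keep track of the even powers of primes during the sieving process, and then use and and xor operations on the vectors of non-square factors to reconstruct the square root.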
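The idea of building d from exponents instead of taking a huge square root can be sketched as follows. This toy version uses the single polynomial y = x² - n, positive y only, and full exponent counts rather than the and/xor bookkeeping of mpqs.py; the function names are illustrative.

```python
from math import gcd

def exponents(y, primes):
    """Exponent of each factor-base prime in y; y must be smooth over the base."""
    exps = []
    for p in primes:
        e = 0
        while y % p == 0:
            y //= p
            e += 1
        exps.append(e)
    if y != 1:
        raise ValueError("value is not smooth over the factor base")
    return exps

def factor_from_relations(n, xs, primes):
    """Combine relations x_i*x_i - n = y_i whose product is a square; return gcd(x - d, n)."""
    x = 1
    total = [0] * len(primes)
    for xi in xs:
        x = x * xi % n
        for j, e in enumerate(exponents(xi * xi - n, primes)):
            total[j] += e
    d = 1
    for p, e in zip(primes, total):
        if e % 2:
            raise ValueError("product of the y values is not a perfect square")
        d = d * pow(p, e // 2, n) % n      # halve the exponents; no big square root needed
    return gcd(x - d, n)

# 41^2 - 1649 = 32 = 2^5 and 43^2 - 1649 = 200 = 2^3 * 5^2; the product is 2^8 * 5^2 = 80^2
print(factor_from_relations(1649, [41, 43], [2, 5]))   # 17
```

Here x = 41·43 ≡ 114 and d = 2⁴·5 = 80, so gcd(114 - 80, 1649) = 17, and 1649 = 17·97.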

I seem to have reached the 30000 character limit. Ahh well, I suppose that's good enough.

— primo

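Well, I never passed algebra in high school (I actually dropped out during the first semester of freshman year), but you make it simple to understand from a programmer's perspective. I won't pretend to fully understand it without putting it into practice, but I applaud you. You should seriously consider expanding this post off-site and publishing it!

— jdstankosky

I agree. Excellent answer with a great explanation. +1

— Soham Chowdhury, 2012

@primo Your answers to multiple questions on here have been incredibly thorough and interesting. Much appreciated!

— Paul Walls

As a final remark, I would like to express my gratitude to Will Ness for awarding the +100 bounty on this question. It was literally his entire reputation.

— primo, 2012

@StepHen It does. Unfortunately, it uses the original version from 2012, without the speed improvements, and with a bug in the Gaussian elimination (it errors when the final column is a pivot column). I attempted to contact the author some time ago, but received no response.

— primo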

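Well, your 38!+1 broke my PHP script, not sure why. In fact, any semiprime over 16 digits long breaks my script.

However, using 8980935344490257 (86028157 * 104395301), my script managed a time of 25.963 seconds on my home computer (2.61GHz AMD Phenom 9950). That's a lot faster than my work computer, which took nearly 31 seconds at 2.93GHz on a Core 2 Duo.

PHP - 757 chars incl. new lines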

<?php
function getTime() {
    $t = explode( ' ', microtime() );
    $t = $t[1] + $t[0];
    return $t;
}
function isDecimal($val){ return is_numeric($val) && floor($val) != $val;}
$start = getTime();
$semi_prime = 8980935344490257;
$slice      = round(strlen($semi_prime)/2);
$max        = (pow(10, ($slice))-1);
$i          = 3;
echo "\nFactoring the semi-prime:\n$semi_prime\n\n";

while ($i < $max) {
    $sec_factor = ($semi_prime/$i);
    if (isDecimal($sec_factor) != 1) {
        $mod_f = bcmod($i, 1);
        $mod_s = bcmod($sec_factor, 1);
        if ($mod_f == 0 && $mod_s == 0) {
            echo "First factor = $i\n";
            echo "Second factor = $sec_factor\n";
            $end=getTime();
            $xtime=round($end-$start,4).' seconds';
            echo "\n$xtime\n";
            exit();
        }
    }
    $i += 2;
}
?>

I'd be interested to see this same algorithm in C or some other compiled language.

— jdstankosky

PHP's numbers have only 53 bits of precision, roughly 16 decimal digits.

— copy

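Implementing the same algorithm in C++ using 64-bit integers only took about 1.8 seconds to run on my computer. There are several problems with this approach, though: 1. It can't handle large enough numbers. 2. Even if it could, and assuming all numbers, no matter the length, used the same amount of time for trial division, every order of magnitude increase would result in an equivalent increase in time. Since your first factor is about 14 orders of magnitude smaller than the given first factor, this algorithm would take over 9 million years to factor the given semiprime.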

— CasaDeRobison
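The 9 million year figure can be reproduced with a back-of-the-envelope calculation, assuming (as the comment does) that the run time of trial division scales linearly with the size of the smaller factor:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

small_factor = 86028157                    # found in about 1.8 s by the C++ port
large_factor = 14029308060317546154181     # smaller factor of 38!+1

# assumption: time is proportional to the smaller factor, with constant cost per division
est_seconds = 1.8 * large_factor / small_factor
est_years = est_seconds / SECONDS_PER_YEAR
print("%.1f million years" % (est_years / 1e6))
```

The ratio of the two factors is about 1.6 × 10¹⁴, which is where the 14 orders of magnitude come from.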

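I'm not the best at math, admittedly, but as far as I know, for very large numbers the standard methods of factoring semiprimes simply won't work (using an ellipse, etc.). With that in mind, how could the algorithm itself be improved?

— jdstankosky, 2012

The Sieve of Eratosthenes starts with a list of numbers, then removes all multiples of 2, then 3, then 5, then 7, etc. What remains after the sieve is complete are only prime numbers. This sieve may be "pre-calculated" for a certain number of factors. Because lcm(2, 3, 5, 7) == 210, the pattern of numbers eliminated by these factors will repeat every 210 numbers, and only 48 remain. In that way, you can eliminate 77% of all numbers from trial division, instead of the 50% eliminated by taking only odds.

— 2012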
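The numbers in this comment are easy to verify. The sketch below computes the residues mod 210 that survive the pre-calculated sieve; they correspond to the indices table in my_math.py (the variable names here are illustrative):

```python
from math import gcd

WHEEL = 2 * 3 * 5 * 7                      # lcm(2, 3, 5, 7) == 210

# residues not divisible by 2, 3, 5 or 7
residues = [r for r in range(WHEEL) if gcd(r, WHEEL) == 1]
# distance from each residue to the next one, wrapping around the wheel
gaps = [(b - a) % WHEEL for a, b in zip(residues, residues[1:] + residues[:1])]

print(len(residues))                       # 48
print(residues[:8])                        # [1, 11, 13, 17, 19, 23, 29, 31]
print(round(1 - len(residues) / WHEEL, 2)) # 0.77
```

Stepping through candidates by these gaps visits only 48 of every 210 integers, which is the 77% reduction mentioned above.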

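@primo Out of curiosity, how much time did you put into this? It took me ages to think of this stuff. At the time I wrote this, I was only thinking about how prime numbers are always odd. I didn't attempt to go beyond that and also eliminate non-prime odds. It seems so simple in retrospect.

— jdstankosky, 2012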