10

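Given:

A natural number S.
A list W of N rational weights that sum to 1.

Return a list L of N non-negative integers such that:

(1) sum(L) = S
(2) sum((S⋅W_i - L_i)^2) is minimal

In other words, approximate the values S⋅W_i with integers as closely as possible.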

Examples:

1 [0.4 0.3 0.3] = [1 0 0]
3 [0 1 0] = [0 3 0]
4 [0.3 0.4 0.3] = [1 2 1]
5 [0.3 0.4 0.3] = [2 2 1] or [1 2 2] but not [1 3 1]
21 [0.3 0.2 0.5] = [6 4 11]
5 [0.1 0.2 0.3 0.4] = [1 1 1 2] or [0 1 2 2]
4 [0.11 0.3 0.59] = [1 1 2]
10 [0.47 0.47 0.06] = [5 5 0]
10 [0.43 0.43 0.14] = [4 4 2]
11 [0.43 0.43 0.14] = [5 5 1]
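
To make condition (2) concrete, here is a minimal sketch of the quantity being minimised. The name squared_error is hypothetical, and the weights are converted to exact fractions (an assumption made here so that ties such as [2 2 1] versus [1 2 2] compare exactly).

```python
from fractions import Fraction

def squared_error(s, weights, counts):
    # Hypothetical helper: sum((S*W_i - L_i)^2), computed exactly.
    # Fraction(str(w)) turns a literal such as 0.3 into exactly 3/10.
    return sum((s * Fraction(str(w)) - c) ** 2
               for w, c in zip(weights, counts))

# For 5 [0.3 0.4 0.3], both accepted answers have the same error,
# and the rejected answer [1 3 1] has a larger one.
print(squared_error(5, [0.3, 0.4, 0.3], [2, 2, 1]))
print(squared_error(5, [0.3, 0.4, 0.3], [1, 2, 2]))
print(squared_error(5, [0.3, 0.4, 0.3], [1, 3, 1]))
```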

Rules:

You can use any input format, or just provide a function that accepts the input as arguments.

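Background:

This problem comes up when displaying S items of different types in proportions W_i with respect to their types.

Another example of this problem is proportional political representation; see the apportionment paradox. The last two test cases illustrate what is known as the Alabama paradox.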
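
As an illustration of the Alabama paradox in the last two test cases, the sketch below applies a largest-remainder rule (take the floor of each S⋅W_i, then give the leftover units to the largest fractional parts). The function name largest_remainder is hypothetical, and exact fractions are an assumption made here to avoid floating-point ties.

```python
import math
from fractions import Fraction

def largest_remainder(s, weights):
    # Hypothetical helper: floor every target, then hand the remaining
    # units to the entries with the largest fractional parts.
    targets = [s * Fraction(str(w)) for w in weights]
    counts = [math.floor(t) for t in targets]
    gap = s - sum(counts)
    order = sorted(range(len(targets)),
                   key=lambda i: targets[i] - counts[i],
                   reverse=True)
    for i in order[:gap]:
        counts[i] += 1
    return counts

# Raising the total from 10 to 11 takes one unit away from the third entry.
print(largest_remainder(10, [0.43, 0.43, 0.14]))  # [4, 4, 2]
print(largest_remainder(11, [0.43, 0.43, 0.14]))  # [5, 5, 1]
```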

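As a statistician, I recognised this problem as equivalent to one encountered when determining sample sizes for a stratified sample. In that situation, we want the proportion of each stratum in the sample to equal the proportion of each stratum in the population. — @tomi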

code-golf number arithmetic

— b

Could you say in words what the task is? I'm having trouble unpacking the expressions into something intuitive.

— xnor

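Both should be ≤; fixed. The task is to present an integer as a sum of integers based on the weights. The remainder should be distributed in favour of the highest weights, though I'm not sure this requirement is encoded correctly. This is interesting because round(A + B) != round(A) + round(B), so a short solution requires insight into what is going on here.

— glebm, 2015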
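
A minimal sketch of the point about rounding: rounding each S⋅W_i independently does not, in general, produce a list that sums to S. The name naive_round is hypothetical and the function is only meant as a counterexample, not as a solution.

```python
def naive_round(s, weights):
    # Hypothetical counterexample: round every target on its own.
    return [round(s * w) for w in weights]

too_many = naive_round(10, [0.47, 0.47, 0.06])
too_few = naive_round(1, [0.4, 0.3, 0.3])
print(too_many, sum(too_many))  # the sum overshoots 10
print(too_few, sum(too_few))    # the sum undershoots 1
```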

1

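Maybe change the rules to minimise the sum of the squared distances L[i] - S*W[i], instead of having rule 2 and rule 3; that would approximate S*W[i].

— 2015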

1

[0 1 2 2] is also a valid solution for 5 [0.1 0.2 0.3 0.4]

— Jakube, 2015

1

Maybe you should add an example for 1 [0.4 0.3 0.3]

— aditsu quit because SE is EVIL, 2015

6

APL, 21

{{⍵+1=⍋⍋⍵-⍺}⍣⍺/⍺0×⊂⍵}

This is a translation of aditsu's 37-byte CJam answer.

Test it online.

Explanation

 {      ⍵-⍺}            ⍝ Right argument - left argument.
 {  1=⍋⍋⍵-⍺}            ⍝ Make one of the smallest number 1, others 0.
 {⍵+1=⍋⍋⍵-⍺}            ⍝ Add the result and the right argument together.
 {⍵+1=⍋⍋⍵-⍺}⍣⍺          ⍝ Repeat that S times. The result of each iteration is the new right argument.
                  ⊂⍵    ⍝ Return enclosed W, which is taken as one unit in APL.
               ⍺0×⊂⍵    ⍝ Return S*W and 0*W.
{{⍵+1=⍋⍋⍵-⍺}⍣⍺/⍺0×⊂⍵}   ⍝ Make S*W the left argument, 0*W the right argument in the first iteration.
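
For readers who do not read APL, the loop described above can be sketched in Python as follows. The name greedy_alloc is hypothetical, and converting the weights to exact fractions is an assumption made here; ties go to the first smallest entry.

```python
from fractions import Fraction

def greedy_alloc(s, weights):
    # Hypothetical sketch of the iteration: start from all zeros and,
    # S times, add 1 at the position where L_i - S*W_i is smallest.
    targets = [s * Fraction(str(w)) for w in weights]
    counts = [0] * len(weights)
    for _ in range(s):
        i = min(range(len(counts)), key=lambda k: counts[k] - targets[k])
        counts[i] += 1
    return counts

print(greedy_alloc(4, [0.3, 0.4, 0.3]))      # [1, 2, 1]
print(greedy_alloc(10, [0.47, 0.47, 0.06]))  # [5, 5, 0]
```

Each of the S iterations scans all N entries, so this sketch takes time proportional to S*N.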

— jimmy23013

7

Python 2, 95 83 132 125 143

My first (and second) (and third) algorithms had problems, so after (another!) rewrite and more testing, here is (I really hope) a correct and fast solution:

def a(b,h):
 g=h;c=[];d=[]
 for w in b:f=int(w*h);d+=[f];c+=[h*w-f];g-=f
 if g:
  for e in sorted(c)[-g:]:i=c.index(e);c[i]=2;d[i]+=1
 return d

The source before minification now looks like:

# minified 143 bytes
def golfalloc(weights, num):
    # Tiny seq alloc for golfing
    gap = num;
    errors = [];
    counts = []
    for w in weights :
        count = int(w*num);
        counts += [count];
        errors += [num*w - count];
        gap -= count
    if gap:
        for e in sorted(errors)[-gap:] :
            i = errors.index(e);
            errors[i] = 2;
            counts[i] += 1
    return counts

The tests return:

Pass                    Shape    N               Result Error                        AbsErrSum
ok            [0.4, 0.3, 0.3]    1            [1, 0, 0] -0.60,+0.30,+0.30                 1.20
ok                  [0, 1, 0]    3            [0, 3, 0] +0.00,+0.00,+0.00                 0.00
ok            [0.3, 0.4, 0.3]    4            [1, 2, 1] +0.20,-0.40,+0.20                 0.80
ok            [0.3, 0.4, 0.3]    5            [2, 2, 1] -0.50,+0.00,+0.50                 1.00
ok            [0.3, 0.2, 0.5]   21           [6, 4, 11] +0.30,+0.20,-0.50                 1.00
ok       [0.1, 0.2, 0.3, 0.4]    5         [1, 1, 1, 2] -0.50,+0.00,+0.50,+0.00           1.00
ok          [0.11, 0.3, 0.59]    4            [1, 1, 2] -0.56,+0.20,+0.36                 1.12
ok         [0.47, 0.47, 0.06]   10            [5, 5, 0] -0.30,-0.30,+0.60                 1.20
ok         [0.43, 0.43, 0.14]   10            [4, 4, 2] +0.30,+0.30,-0.60                 1.20
ok         [0.43, 0.43, 0.14]   11            [5, 5, 1] -0.27,-0.27,+0.54                 1.08

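This algorithm is similar to other answers here. It is O(1) in num, so it has the same run time for the integers 10 and 1000000. In the number of weights it is theoretically O(n log n) (because of the sort). If it survives all the other tricky input cases, it will replace the algorithm below in my programming toolbox.

Please do not use this algorithm for anything other than golf. I made compromises in speed to minimise the source size. The following code uses the same logic but is much faster and more useful: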

def seqalloc(anyweights, num):
    # Distribute integer num depending on weights.
    # weights may be non-negative integers, longs, or floats.
    totalbias = float(sum(anyweights))
    weights = [bias/totalbias for bias in anyweights]
    counts = [int(w*num) for w in weights]
    gap = num - sum(counts)
    if gap:
        errors = [num*w - q for w,q in zip(weights, counts)]
        ordered = sorted(range(len(errors)), key=errors.__getitem__)
        for i in ordered[-gap:]:
            counts[i] += 1
    return counts

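The value of num does not significantly affect the speed. I have tested it with values from 1 to 10^19. Execution time varies linearly with the number of weights. On my computer it takes 0.15 seconds with 10^5 weights and 15 seconds with 10^7 weights. Note that the weights are not restricted to fractions that sum to one. The sorting technique used here is also about twice as fast as the traditional sorted((v,i) for i,v in enumerate...) style.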
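
The two sorting styles mentioned above can be compared side by side in a small sketch. The function names are hypothetical, and the sketch only checks that both return the same index order; it makes no timing measurement.

```python
def argsort_by_key(values):
    # Style used in seqalloc: sort the indices, looking values up by index.
    return sorted(range(len(values)), key=values.__getitem__)

def argsort_by_pairs(values):
    # Traditional style: sort (value, index) pairs, then keep the indices.
    return [i for _, i in sorted((v, i) for i, v in enumerate(values))]

errors = [0.5, 0.0, 0.5, 0.25]
print(argsort_by_key(errors))
print(argsort_by_pairs(errors))
```

Both break ties by ascending index: the first because Python's sort is stable, the second because the index is the second element of each pair.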

Original algorithm

This was a function from my toolbox, modified a little for golf. It originally came from an SO answer. And it is wrong.

def seqalloc(seq, num):
    outseq = []
    totalw = float(sum(seq))
    for weight in seq:
        share = int(round(num * weight / totalw)) if weight else 0
        outseq.append(share)
        totalw -= weight
        num -= share
    return outseq

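Although it guarantees sum(outseq) == num, it only gives an approximation, which is not always correct. Fast, but not recommended.

Thanks to @alephalpha and @user23013 for spotting the errors.

EDIT: Set totalw (d) to 1, as the OP specifies that the sum of the weights will always be 1. Now 83 bytes.

EDIT2: Fixed a bug found for [0.4, 0.3, 0.3], 1.

EDIT3: Abandoned the flawed algorithm. Added a better one.

EDIT4: This is getting ridiculous. Replaced with the correct algorithm (I really hope so).

EDIT5: Added the non-golfed code for others who may like to use this algorithm.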

— Logic Knight

4

a([0.4, 0.3, 0.3], 1) returns [0, 1, 0], while the correct answer is [1, 0, 0].

— alephalpha

1

Still wrong. a([0.11,0.3,0.59],4) returns [0, 1, 3]. It should be [1, 1, 2].

— jimmy23013, 2015

1

f([0.47,0.47,0.06],10) returns [5, 4, 1]. It should be [5, 5, 0].

— jimmy23013, 2015

2

I think it is correct now.

— jimmy23013

2

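@CarpetPython I went through a similar process with this algorithm, and that is how I came up with this problem. If they take away your license, they should take mine too :)

— glebm, 2015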

4

Mathematica, 67 50 46 45 characters

f=(b=⌊1##⌋;b[[#~Ordering~-Tr@#&[b-##]]]++;b)&

Ungolfed:

f[s_, w_] := Module[{a = s*w, b, c, d},
  b = Floor[a];
  c = b - a;
  d = Ordering[c, -Total[c]];
  b[[d]] += 1;
  b]

Example:

f[5,{0.1,0.2,0.3,0.4}]

{1, 1, 1, 2}

— Alephalpha

My goodness, that's short, considering it's Mathematica!

— DavidC

3

CJam - 37

q~:W,0a*\:S{[_SWf*]z::-_:e<#_2$=)t}*p

Try it online

Explanation:

q~             read and evaluate the input
               (pushing the number and the array on the stack)
:W,            save the array in variable W and calculate its length (N)
0a*            make an array of N zeros (the initial "L")
\:S            swap it with the number and save the number in S
{…}*           execute the block S times
    [_SWf*]    make a matrix with 2 rows: "L" and S*W
    z          transpose the matrix, obtaining rows of [L_i S*W_i]
    ::-_       convert to array of L_i-S*W_i and duplicate
    :e<        get the smallest element
    #          find its index in the unsorted array,
               i.e. the "i" with the largest S*W_i-L_i
    _2$=)t     increment L_i
p              print the result nicely

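Notes:

The complexity is about O(S*N), so it gets really slow for large S
CJam is sorely lacking arithmetic operators for 2 arrays, something I plan to implement later

Different idea - 46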

q~:Sf*_:m[_:+S\-@[1f%_,,]z{0=W*}$<{1=_2$=)t}/p

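Try it online

This is much more straightforward and efficient, but alas, quite a bit longer. The idea here is to start with L_i = floor(S*W_i), determine the difference (say D) between S and their sum, find the D indices with the largest fractional part of S*W_i (by sorting and taking the top D), and increment L_i for those indices. Complexity O(N*log(N)).

— aditsu quit because SE is EVIL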

There is an O(N) :e< now.

— jimmy23013, 2015

@user23013 Oh yes, for the first program, thanks

— aditsu quit because SE is EVIL

That was fast! Congratulations 🌟

— glebm

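For those wondering, replacing the sort with a linear-time selection algorithm would yield O(n) instead of the O(n log n) caused by sorting: find the D-th largest element P in O(N), then increment the elements that are ≥ P, D times (O(N), since D <= N).

— glebm, 2015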

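@glebm That's pretty cool, but I think there is a problem if multiple elements have the same value (P). Maybe you can solve it in 2 passes: first increment and count the elements > P, after which you know how many elements = P are needed. Or, if you can get that information from the selection algorithm, even better.

— aditsu quit because SE is EVIL, 2015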
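
The selection idea and the two-pass tie handling discussed in these comments can be sketched as follows. Everything here is an illustration under stated assumptions: kth_largest is a hypothetical randomised quickselect (expected, not worst-case, linear time), alloc_by_selection is a hypothetical name, and the weights are converted to exact fractions so that equal fractional parts compare equal.

```python
import math
import random
from fractions import Fraction

def kth_largest(values, k):
    # Hypothetical quickselect, k is 1-based; expected O(n) time.
    pool = list(values)
    while True:
        pivot = random.choice(pool)
        greater = [v for v in pool if v > pivot]
        equal_count = sum(1 for v in pool if v == pivot)
        if k <= len(greater):
            pool = greater
        elif k <= len(greater) + equal_count:
            return pivot
        else:
            k -= len(greater) + equal_count
            pool = [v for v in pool if v < pivot]

def alloc_by_selection(s, weights):
    targets = [s * Fraction(str(w)) for w in weights]
    counts = [math.floor(t) for t in targets]
    d = s - sum(counts)              # units still to hand out (D)
    if d == 0:
        return counts
    fracs = [t - c for t, c in zip(targets, counts)]
    p = kth_largest(fracs, d)        # D-th largest fractional part (P)
    for i, f in enumerate(fracs):    # pass 1: everything strictly above P
        if f > p:
            counts[i] += 1
            d -= 1
    for i, f in enumerate(fracs):    # pass 2: as many ties at P as needed
        if d == 0:
            break
        if f == p:
            counts[i] += 1
            d -= 1
    return counts

print(alloc_by_selection(10, [0.47, 0.47, 0.06]))  # [5, 5, 0]
```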

3

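JavaScript (ES6) 126 130 104 115 156 162 194

After all the comments and test cases in @CarpetPython's answer, back to my first algorithm. Alas, the smart solution does not work. The implementation is a little shorter; it still tries all the possible solutions, calculates the squared distance and keeps the minimum.

Edit: For each output element of weight w, "all" the possible values are just 2: trunc(w*s) and trunc(w*s)+1, so there are just (2**elements) possible solutions to try.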

Q=(s,w)=>
  (n=>{
    for(i=0;
        r=q=s,(y=i++)<1<<w.length;
        q|r>n||(n=r,o=t))
      t=w.map(w=>(f=w*s,q-=d=0|f+(y&1),y/=2,f-=d,r+=f*f,d));
  })()||o
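
The reduced search described in the edit above (each output is either trunc(w*s) or trunc(w*s)+1) can be sketched in Python like this. The name alloc_two_choice is hypothetical, and exact fractions are an assumption made here to keep the error comparison free of rounding noise.

```python
import math
from fractions import Fraction
from itertools import product

def alloc_two_choice(s, weights):
    # Hypothetical sketch: try floor(S*W_i) or floor(S*W_i)+1 for every
    # entry, keep candidates that sum to S, return the least squared error.
    targets = [s * Fraction(str(w)) for w in weights]
    floors = [math.floor(t) for t in targets]
    best_cost, best = None, None
    for bumps in product((0, 1), repeat=len(weights)):
        cand = [f + b for f, b in zip(floors, bumps)]
        if sum(cand) != s:
            continue
        cost = sum((t - c) ** 2 for t, c in zip(targets, cand))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, cand
    return best

print(alloc_two_choice(4, [0.11, 0.3, 0.59]))   # [1, 1, 2]
print(alloc_two_choice(11, [0.43, 0.43, 0.14])) # [5, 5, 1]
```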

Test in the Firefox / FireBug console

;[[ 1,  [0.4, 0.3, 0.3]      ]
, [ 3,  [0, 1, 0]            ]
, [ 4,  [0.3, 0.4, 0.3]      ]
, [ 5,  [0.3, 0.4, 0.3]      ]
, [ 21, [0.3, 0.2, 0.5]      ]
, [ 5,  [0.1, 0.2, 0.3, 0.4] ]
, [ 4,  [0.11, 0.3, 0.59]    ]
, [ 10, [0.47, 0.47, 0.06]   ]
, [ 10, [0.43, 0.43, 0.14]   ]
, [ 11, [0.43, 0.43, 0.14]   ]]
.forEach(v=>console.log(v[0],v[1],Q(v[0],v[1])))

Output

1 [0.4, 0.3, 0.3] [1, 0, 0]
3 [0, 1, 0] [0, 3, 0]
4 [0.3, 0.4, 0.3] [1, 2, 1]
5 [0.3, 0.4, 0.3] [1, 2, 2]
21 [0.3, 0.2, 0.5] [6, 4, 11]
5 [0.1, 0.2, 0.3, 0.4] [0, 1, 2, 2]
4 [0.11, 0.3, 0.59] [1, 1, 2]
10 [0.47, 0.47, 0.06] [5, 5, 0]
10 [0.43, 0.43, 0.14] [4, 4, 2]
11 [0.43, 0.43, 0.14] [5, 5, 1]

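That was a smarter solution. Single pass over the weight array.
For each pass I find the current max value in w. I change this value in place to the weighted integer value (rounded up), so if s == 21 and w = 0.5, we get 0.5 * 21 -> 10.5 -> 11. I store this value negated, so it cannot be found as the max in the next loop. Then I reduce the total sum accordingly (s = s-11) and also reduce the total of the weights in the variable f.
The loop ends when there is no max above 0 to be found (all the values != 0 have been handled).
At last I change the values back to positive. Warning: this code modifies the weights array in place, so it must be called with a copy of the original array.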

~~F=(s,w)=> (f=>{ for(;j=w.indexOf(z=Math.max(...w)),z>0;f-=z) s+=w[j]=-Math.ceil(z*s/f); })(1)||w.map(x=>0-x)~~

My first attempt

Not such a smart solution. For every possible result, it evaluates the difference and keeps the minimum.

F=(s,w,t=w.map(_=>0),n=NaN)=>
  (p=>{
    for(;p<w.length;)
      ++t[p]>s?t[p++]=0
      :t.map(b=>r+=b,r=p=0)&&r-s||
        t.map((b,i)=>r+=(z=s*w[i]-b)*z)&&r>n||(n=r,o=[...t])
  })(0)||o

Ungolfed and explained

F=(s, w) =>
{
  var t=w.map(_ => 0), // 0 filled array, same size as w
      n=NaN, // initial minimum NaN, as "NaN > value"  is false for any value
      p, r
  // For loop enumerating from [1,0,0,...0] to [s,s,s...s]
  for(p=0; p<w.length;)
  {
    ++t[p]; // increment current cell
    if (t[p] > s)
    {
      // overflow, restart at 0 and point to next cell
      t[p] = 0;
      ++p;
    }
    else
    {
      // increment ok, current cell is the first one
      p = 0;
      r = 0;
      t.map(b => r += b) // evaluate the cells sum (must be s)
      if (r==s)
      {
        // if sum of cells is s
        // evaluate the total squared distance (always offset by s, that does not matter)
        t.map((b,i) => r += (z=s*w[i]-b)*z) 
        if (!(r > n))
        {
          // if less than current minimum, keep this result
          n=r
          o=[...t] // copy of t goes in o
        }
      }
    }
  }
  return o
}

— edc65

2

CJam, 48 bytes

A straightforward approach to the problem.

q~:Sf*:L,S),a*{m*{(+}%}*{1bS=},{L]z::-Yf#:+}$0=p

Input goes like

[0.3 0.4 0.3] 4

Explanation:

q~:S                                 "Read and parse the input, store sum in S";
    f*:L                             "Do S.W, store the dot product in L";
         S),                         "Get array of 0 to S";
        ,   a*                       "Create an array with N copies of the above array";
              {m*{(+}%}*             "Get all possible N length combinations of 0 to S ints";
                        {1bS=},      "Filter to get only those which sum up to S";
{L]z::-Yf#:+}$                       "Sort them based on (S.W_i - L_i)^2 value";
 L                                   "Put the dot product after the sum combination";
  ]z                                 "Wrap in an array and transpose";
    ::-                              "For each row, get difference, i.e. S.W_i - L_i";
       Yf#                           "Square every element";
          :+                         "Take sum";
              0=p                    "After sorting on sum((S.W_i - L_i)^2), take the";
                                     "first element, i.e. smallest sum and print it";

Try it online here
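
The same exhaustive strategy can be sketched in Python: enumerate every list of N integers from 0 to S, keep those that sum to S, and return the one with the smallest squared error. The name brute_force_alloc is hypothetical, exact fractions are an assumption made here, and the search examines (S+1)^N candidates, so it is only usable for tiny inputs.

```python
from fractions import Fraction
from itertools import product

def brute_force_alloc(s, weights):
    # Hypothetical reference implementation for small inputs only.
    targets = [s * Fraction(str(w)) for w in weights]
    best_cost, best = None, None
    for cand in product(range(s + 1), repeat=len(weights)):
        if sum(cand) != s:
            continue
        cost = sum((t - c) ** 2 for t, c in zip(targets, cand))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, list(cand)
    return best

print(brute_force_alloc(4, [0.3, 0.4, 0.3]))  # [1, 2, 1]
```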

— Optimizer

2

Pyth: 40 bytes

Mhosm^-*Ghded2C,HNfqsTGmms+*G@Hb}bklHyUH

This defines a function g with 2 parameters. You can call it like Mhosm^-*Ghded2C,HNfqsTGmms+*G@Hb}bklHyUHg5 [0.1 0.2 0.3 0.4.

Try it online: Pyth Compiler/Executor

Explanation:

mms+*G@Hb}bklHyUH     (G is S, H is the list of weights)
m             yUH    map each subset k of [0, 1, ..., len(H)-1] to:
 m          lH          map each element b of [0, 1, ..., len(H)-1] to: 
    *G@Hb                  G*H[b]
   +     }bk               + b in k
  s                       floor(_)

This creates all possible solutions L where L[i] = floor(S*W[i]) or L[i] = floor(S*W[i]+1). For instance, the input 4 [0.3 0.4 0.3 creates [[1, 1, 1], [2, 1, 1], [1, 2, 1], [1, 1, 2], [2, 2, 1], [2, 1, 2], [1, 2, 2], [2, 2, 2]].

fqsTG...  
f    ... only use the solutions, where
 qsTG       sum(solution) == G

Only [[2, 1, 1], [1, 2, 1], [1, 1, 2]] remain.

Mhosm^-*Ghded2C,HN
  o                  order the solutions by
   s                   the sum of 
    m         C,HN       map each element d of zip(H, solution) to
     ^-*Ghded2           (G*d[0] - d[1])^2
 h                   use the first element (minimum)
M                    define a function g(G,H): return _

— Jakube

2

Mathematica 108

s_~f~w_:=Sort[{Tr[(s*w-#)^2],#}&/@ 
Flatten[Permutations/@IntegerPartitions[s,{Length@w},0~Range~s],1]][[1,2]]

f[3, {0, 1, 0}]
f[4, {0.3, 0.4, 0.3}]
f[5, {0.3, 0.4, 0.3}]
f[21, {0.3, 0.2, 0.5}]
f[5, {0.1, 0.2, 0.3, 0.4}]

{0, 3, 0}
{1, 2, 1}
{1, 2, 2}
{6, 4, 11}
{0, 1, 2, 2}

Explanation

Ungolfed

f[s_,w_]:=
Module[{partitions},
partitions=Flatten[Permutations/@IntegerPartitions[s,{Length[w]},Range[0,s]],1];
Sort[{Tr[(s *w-#)^2],#}&/@partitions][[1,2]]]

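IntegerPartitions[s,{Length@w},0~Range~s] returns all the integer partitions of s, using elements taken from the set {0, 1, 2, ...s}, with the constraint that the output should contain the same number of elements as the set of weights, w.

Permutations gives all of the ordered arrangements of each integer partition.

{Tr[(s *w-#)^2],#} returns a list of ordered pairs, {error, permutation}, one for each permutation.

Sort[...] sorts the list of {{error1, permutation1},{error2, permutation2}...} according to the size of the error.

[[1,2]] or Part[<list>,{1,2}] returns the second item of the first element in the sorted list of {{error, permutation}...}. In other words, it returns the permutation with the smallest error.

— DavidC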

2

R, 85 80 76

Uses the Hare quota method.

Removed a couple of bytes after seeing the spec that W will sum to 1

function(a,b){s=floor(d<-b*a);s[o]=s[o<-rev(order(d%%1))[0:(a-sum(s))]]+1;s}

Test run

> (function(a,b){s=floor(d<-b/(sum(b)/a));s[o]=s[o<-rev(order(d%%1))[0:(a-sum(s))]]+1;s})(3,c(0,1,0))
[1] 0 3 0
> (function(a,b){s=floor(d<-b/(sum(b)/a));s[o]=s[o<-rev(order(d%%1))[0:(a-sum(s))]]+1;s})(1,c(0.4,0.3,0.3))
[1] 1 0 0
> (function(a,b){s=floor(d<-b/(sum(b)/a));s[o]=s[o<-rev(order(d%%1))[0:(a-sum(s))]]+1;s})(4,c(0.3, 0.4, 0.3))
[1] 1 2 1
> (function(a,b){s=floor(d<-b/(sum(b)/a));s[o]=s[o<-rev(order(d%%1))[0:(a-sum(s))]]+1;s})(5,c(0.3, 0.4, 0.3))
[1] 1 2 2
> (function(a,b){s=floor(d<-b/(sum(b)/a));s[o]=s[o<-rev(order(d%%1))[0:(a-sum(s))]]+1;s})(21,c(0.3, 0.2, 0.5))
[1]  6  4 11
> (function(a,b){s=floor(d<-b/(sum(b)/a));s[o]=s[o<-rev(order(d%%1))[0:(a-sum(s))]]+1;s})(5,c(0.1,0.2,0.3,0.4))
[1] 1 1 1 2
>

— MickyT

2

Python, 139 128 117 bytes

def f(S,W):
 L=(S+1,0,[]),
 for n in W:L=[(x-i,y+(S*n-i)**2,z+[i])for x,y,z in L for i in range(x)]
 return min(L)[2]

Previous itertools solution, 139 bytes

from itertools import*
f=lambda S,W:min((sum(x)!=S,sum((S*a-b)**2for a,b in zip(W,x)),list(x))for x in product(*tee(range(S+1),len(W))))[2]

— Sp3000

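I was wondering whether an itertools solution would be possible. Nice work, +1. Am I right in thinking this has O(n^4) time complexity?

— Logic Knight

The itertools solution was actually O(S^len(W)) :P. The new solution is a lot faster, but still slow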

— Sp3000

2

Octave, 87 76

Golfed:

function r=w(s,w)r=0*w;for(i=1:s)[m,x]=max(s*w-r);r(x)+=1;endfor endfunction

Ungolfed:

function r=w(s,w)
  r=0*w;   # will be the output
  for(i=1:s)
    [m,x]=max(s*w-r);
    r(x)+=1;
  endfor
endfunction

(Blasted "endfor" and "endfunction"! I'll never win, but I do enjoy golfing with a "real" language.)

— dcsohl

Nice algorithm. You can replace zeros(size(w)) with 0*w.

— alephalpha

Nice! Why didn't I think of that?

— dcsohl

1

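T-SQL, 167 265

Because I like trying to do these challenges in a query as well.

Turned it into an inline function to better fit the spec, and created a type for the table data. It cost a bit, but this was never going to be a contender. Each statement needs to be run separately.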

CREATE TYPE T AS TABLE(A INT IDENTITY, W NUMERIC(9,8))
CREATE FUNCTION W(@ int,@T T READONLY)RETURNS TABLE RETURN SELECT CASE WHEN i<=@-SUM(g)OVER(ORDER BY(SELECT\))THEN g+1 ELSE g END R,A FROM(SELECT A,ROW_NUMBER()OVER(ORDER BY (W*@)%1 DESC)i,FLOOR(W*@)g FROM @T)a

In use

DECLARE @ INT = 21
DECLARE @T T
INSERT INTO @T(W)VALUES(0.3),(0.2),(0.5)
SELECT R FROM dbo.W(@,@T) ORDER BY A

R
---------------------------------------
6
4
11

— MickyT