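Given a list of floating-point numbers, standardize it.

Details

A list $x_1,x_2,\ldots,x_n$ is standardized if the mean of all values is 0 and the standard deviation is 1. One way to compute this is to first compute the mean $\mu$ and the standard deviation $\sigma$ as $$\mu = \frac1n\sum_{i=1}^n x_i \qquad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^n (x_i -\mu)^2} ,$$ and then compute the standardization by replacing every $x_i$ with $\frac{x_i-\mu}{\sigma}$.
You can assume that the input contains at least two distinct entries (which implies $\sigma \neq 0$).
Note that some implementations use the sample standard deviation, which is not equal to the population standard deviation $\sigma$ that we are using here.
There is a CW answer for all trivial solutions.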
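
As a reference point, the definition above can be written out directly in Python. This is only an ungolfed sketch of the two formulas; the function name `standardize` is chosen here for illustration and is not part of the challenge.

```python
import math


def standardize(xs):
    """Return the z-scores of xs, using the population standard deviation."""
    n = len(xs)
    mu = sum(xs) / n                                        # mean
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)   # divides by n, not n - 1
    return [(x - mu) / sigma for x in xs]


print(standardize([1, 2]))  # prints [-1.0, 1.0]
```

The sketch relies on the stated guarantee that the input has at least two distinct entries, so `sigma` is never zero.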

Examples

[1,2,3] -> [-1.224744871391589,0.0,1.224744871391589]
[1,2] -> [-1,1]
[-3,1,4,1,5] -> [-1.6428571428571428,-0.21428571428571433,0.8571428571428572,-0.21428571428571433,1.2142857142857144]

(These examples were generated with this script.)

— flawr

7

R, 51 45 38 37 bytes

Thanks to Giuseppe and J.Doe!

function(x)scale(x)/(1-1/sum(x|1))^.5

Try it online!
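
The division by (1-1/n)^.5 can be read as a correction from the sample standard deviation to the population one: if s is computed with a denominator of n-1, then σ = s·sqrt(1-1/n). The Python sketch below illustrates that identity, using `statistics.stdev` as a stand-in for the sample standard deviation that `scale` is assumed to divide by. The function name is hypothetical.

```python
import math
import statistics


def standardize_via_sample_sd(xs):
    """Standardize xs by rescaling a sample-sd z-score to a population-sd one."""
    n = len(xs)
    mu = statistics.fmean(xs)
    s = statistics.stdev(xs)               # sample sd: denominator n - 1
    correction = math.sqrt(1 - 1 / n)      # sigma = s * sqrt((n - 1) / n)
    return [(x - mu) / s / correction for x in xs]
```

Dividing by `s` and then by `correction` is the same as dividing by the population standard deviation directly.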

— Robert S.

Beat me by 2 bytes and 1 minute

— Sumner18

5

CW for all trivial entries

Python 3 + scipy, 31 bytes

from scipy.stats import*
zscore

Try it online!

Octave / MATLAB, 15 bytes

@(x)zscore(x,1)

Try it online!

— flawr

5

APL (Dyalog Classic), 21 20 19 bytes

(-÷.5*⍨⊢÷⌹×≢)+/-⊢×≢

Try it online!

⊢÷⌹ is the sum of squares

⊢÷⌹×≢ is the sum of squares divided by the length
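
Why ⊢÷⌹ yields the sum of squares can be sketched in Python. This assumes that ⌹ applied to a vector returns its pseudo-inverse, v/(v·v); that reading of the APL primitive is an assumption here, not something the answer states. Dividing v elementwise by that pseudo-inverse then gives v·v at every nonzero position. The helper names are hypothetical.

```python
def pinv_vector(v):
    """Pseudo-inverse of a nonzero vector treated as a column: v / (v . v)."""
    dot = sum(a * a for a in v)
    return [a / dot for a in v]


def sum_of_squares_via_pinv(v):
    """Recover v . v as v_i / pinv(v)_i from the first nonzero entry."""
    p = pinv_vector(v)
    # Zero entries would give 0/0, so they are skipped in this sketch.
    return next(a / b for a, b in zip(v, p) if a != 0)
```

Entries equal to zero are skipped here because Python raises on 0/0; how the APL code treats those positions is not covered by this sketch.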

— ngn

Wow. I shouldn't be surprised anymore, but I am every time

— Quintec

4

MATL, 10 bytes

tYm-t&1Zs/

Try it online!

Explanation

t       % Implicit input
        % Duplicate
Ym      % Mean
-       % Subtract, element-wise
t       % Duplicate
&1Zs    % Standard deviation using normalization by n
/       % Divide, element-wise
        % Implicit display

— Luis Mendo

4

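APL+WIN, 41 32 30 bytes

9 bytes saved thanks to Erik and 2 more thanks to ngn

x←v-(+/v)÷⍴v←⎕⋄x÷(+/x×x÷⍴v)*.5

Prompts for a vector of numbers, then calculates the mean, the standard deviation and the standardized elements of the input vector

— Graham

Can't you assign x←v-(+/v)÷⍴v←⎕ and then do x÷((+/x*2)÷⍴v)*.5?

— Erik the Outgolfer

I can indeed. Thanks.

— Graham

Does APL+WIN do singleton extension (1 2 3+,4 ←→ 1 2 3+4)? If so, you could rewrite (+/x*2)÷⍴v as +/x×x÷⍴v

— ngn

@ngn That works for another 2 bytes. Thanks.

— Graham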

3

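R + pryr, 53 52 bytes

-1 byte using sum(x|1) instead of length(x), as seen in @Robert S.'s solution

pryr::f((x-(y<-mean(x)))/(sum((x-y)^2)/sum(x|1))^.5)

For a language built for statisticians, I'm amazed that this isn't a built-in function. At least I couldn't find one. Even the function mosaic::zscore doesn't yield the expected results, probably because the challenge uses the population standard deviation rather than the sample standard deviation.

Try it online!

— Sumner18

2

You can change the <- to = to save 1 byte.

— Robert S.

@J.Doe No, I used the method I commented on Robert S.'s solution. scale is neat!

— Giuseppe

2

@J.Doe Since you only use n once, you can use it directly for 38 bytes

— Giuseppe

2

@RobertS. On PPCG we tend to encourage flexible input and output, including outputting more than is required, except in challenges where the precise layout of the output is the whole point.

— ngm

6

Of course R built-ins wouldn't use "population variance". Only confused engineers would use such a thing (hence the Python and Matlab answers ;))

— ngm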

3

Tcl, 126 bytes

proc S L {lmap c $L {expr ($c-[set m ([join $L +])/[set n [llength $L]].])/sqrt(([join [lmap c $L {expr ($c-$m)**2}] +])/$n)}}

Try it online!

— sergiol

2

Jelly, 10 bytes

_ÆmµL½÷ÆḊ×

Try it online!

It isn't any shorter, but Jelly's determinant function ÆḊ also calculates the vector norm.

_Æm             x - mean(x)
   µ            then:
    L½          Square root of the Length
      ÷ÆḊ       divided by the norm
         ×      Multiply by that value
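
The same steps can be sketched in Python to show why the norm is enough: for the centered list c, the norm ‖c‖ equals σ·√n, so multiplying c by √n/‖c‖ is the same as dividing by σ. The function name below is hypothetical and the sketch is not a translation of the Jelly internals.

```python
import math


def standardize_via_norm(xs):
    """Standardize xs by scaling the centered list with sqrt(n) / norm."""
    n = len(xs)
    mu = sum(xs) / n
    centered = [x - mu for x in xs]                    # x - mean(x)
    norm = math.sqrt(sum(c * c for c in centered))     # Euclidean norm
    factor = math.sqrt(n) / norm                       # equals 1 / sigma
    return [factor * c for c in centered]
```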

— 小鱼

Hey, nice one! Unfortunately, I can't see a way to shorten it.

— Erik the Outgolfer

2

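Mathematica, 25 bytes

Mean[(a=#-Mean@#)a]^-.5a&

Pure function. Takes a list of numbers as input and returns a list of machine-precision numbers as output. Note that the built-in Standardize function uses the sample variance by default.

— LegionMammal978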

2

J, 22 bytes

-1 byte thanks to Cows quack!

(-%[:%:1#.-*-%#@[)+/%#

Try it online!

J, 31 23 bytes

(-%[:%:#@[%~1#.-*-)+/%#

Try it online!

                   +/%# - mean (sum (+/) divided (%) by the number of samples (#)) 
(                 )     - the list is a left argument here (we have a hook)
                 -      - the difference between each sample and the mean
                *       - multiplied by 
               -        - the difference between each sample and the mean
            1#.         - sum by base-1 conversion
          %~            - divided by
       #@[              - the length of the samples list
     %:                 - square root
   [:                   - convert to a fork (function composition) 
 -                      - subtract the mean from each sample
  %                     - and divide it by sigma

— Galen Ivanov

1

Rearranging gives 22 bytes: [:(%[:%:1#.*:%#)]-+/%# tio.run/##y/qfVmyrp2CgYKVg8D/… I think one of those caps could be removed, but no luck so far. Edit: a more direct byte save is (-%[:%:1#.-*-%#@[)+/%#, also at 22

— Kritixi Lithos

@Cows quack Thanks!

— Galen Ivanov

2

APL (Dyalog Unicode), 33 29 bytes

{d÷.5*⍨l÷⍨+/×⍨d←⍵-(+/⍵)÷l←≢⍵}

-4 bytes thanks to @ngn

Try it online!

— Quintec

You could assign ⍵-m to a variable and remove m← like this: {d÷.5*⍨l÷⍨+/×⍨d←⍵-(+/⍵)÷l←≢⍵}

— ngn

@ngn Ah, nice, thanks, I somehow didn't see that duplication

— Quintec

2

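Haskell, 80 75 68 bytes

t x=k(/sqrt(f$sum$k(^2)))where k g=g.(-f(sum x)+)<$>x;f=(/sum(1<$x))

Thanks to @flawr for the suggestions to use sum(1<$x) instead of sum[1|_<-x] and to inline the mean, and to @xnor for inlining the standard deviation and other reductions.

Expanded: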

-- Standardize a list of values of any floating-point type.
standardize :: Floating a => [a] -> [a]
standardize input = eachLessMean (/ sqrt (overLength (sum (eachLessMean (^2)))))
  where

    -- Map a function over each element of the input, less the mean.
    eachLessMean f = map (f . subtract (overLength (sum input))) input

    -- Divide a value by the length of the input.
    overLength n = n / sum (map (const 1) input)

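— Jon Purdy

1

You can replace [1|_<-x] with (1<$x) to save a few bytes. That is a great trick for avoiding fromIntegral that I haven't seen so far!

— flawr

By the way: I like using tryitonline; you can run your code there and then copy the preformatted answer over here!

— flawr

And you don't have to define m.

— flawr

You can write (-x+) for (+(-x)) to avoid parens. It also looks like f can be point-free: f=(/sum(1<$x)), and s can be replaced with its definition.

— xnor

@xnor Ooh, (-x+) is handy, I'm sure I'll be using that in the future

— Jon Purdy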

2

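MathGolf, 7 bytes

▓-_²▓√/

Try it online!

Explanation

This is literally a byte-for-byte re-creation of Kevin Cruijssen's 05AB1E answer, but I save some bytes because MathGolf has 1-byte commands for everything this challenge needs. The answer also looks quite good, in my opinion!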

▓         get average of list
 -        pop a, b : push(a-b)
  _       duplicate TOS
   ²      pop a : push(a*a)
    ▓     get average of list
     √    pop a : push(sqrt(a)), split string to list
      /   pop a, b : push(a/b), split strings

— maxb

1

JavaScript (ES7), 80 79 bytes

a=>a.map(x=>(x-g(a))/g(a.map(x=>(x-m)**2))**.5,g=a=>m=eval(a.join`+`)/a.length)

Try it online!

Commented

a =>                      // given the input array a[]
  a.map(x =>              // for each value x in a[]:
    (x - g(a)) /          //   compute (x - mean(a)) divided by
    g(                    //   the standard deviation:
      a.map(x =>          //     for each value x in a[]:
        (x - m) ** 2      //       compute (x - mean(a))²
      )                   //     compute the mean of this array
    ) ** .5,              //   and take the square root
    g = a =>              //   g = helper function taking an array a[],
      m = eval(a.join`+`) //     computing the mean
          / a.length      //     and storing the result in m
  )                       // end of outer map()

— Arnauld

1

Python 3 + numpy, 46 bytes

lambda a:(a-mean(a))/std(a)
from numpy import*

Try it online!

— ovs

1

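Haskell, 59 bytes

(%)i=sum.map(^i)
f l=[(0%l*y-1%l)/sqrt(2%l*0%l-1%l^2)|y<-l]

Try it online!

Doesn't use libraries.

The helper function % computes the sum of the i-th powers of a list, which lets us get three useful values.

0%l is the length of l (call this n)
1%l is the sum of l (call this s)
2%l is the sum of squares of l (call this v)

We can express the z-score of an element y as

(n*y-s)/sqrt(n*v-s^2)

(This is the expression (y-s/n)/sqrt(v/n-(s/n)^2) simplified a bit by multiplying the top and bottom by n.)

We can insert the expressions 0%l, 1%l, 2%l without parentheses because the % we define has higher precedence than the arithmetic operators.

(%)i=sum.map(^i) is the same length as i%l=sum.map(^i)l. Making it more point-free doesn't help. Defining it like g i=... loses bytes when we call it. Although % works for any list and we only ever call it with the problem's input list, passing the argument l every time costs no bytes, because a two-argument call i%l is no longer than a one-argument call g i.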
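
The power-sum formulation can be checked with a short Python sketch. It mirrors the expression (n*y-s)/sqrt(n*v-s^2) from the answer; the names `power_sum` and `standardize_power_sums` are hypothetical stand-ins for % and f.

```python
import math


def power_sum(i, xs):
    """Sum of the i-th powers of xs (the role played by % in the answer)."""
    return sum(x ** i for x in xs)


def standardize_power_sums(xs):
    n = power_sum(0, xs)   # length: every x ** 0 is 1
    s = power_sum(1, xs)   # sum
    v = power_sum(2, xs)   # sum of squares
    denom = math.sqrt(n * v - s * s)
    return [(n * y - s) / denom for y in xs]
```

For [1, 2, 3] this gives n = 3, s = 6, v = 14, so the denominator is sqrt(42 - 36) = sqrt(6) and the first element is (3 - 6)/sqrt(6).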

— xnor

We do have $\LaTeX$ here :)

— flawr

I really like the % idea! It looks like a discrete version of the statistical moments.

— flawr

1

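K (oK), 33 23 bytes

-10 bytes thanks to ngn!

{t%%(+/t*t:x-/x%#x)%#x}

Try it online!

First attempt at coding (I don't dare call it "golfing") in K. I'm pretty sure it can be done much better (too many variable names here...)

— Galen Ivanov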

1

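Nice! You can replace the initial (x-m) with t (tio)

— ngn

1

The inner { } is unnecessary: its implicit parameter name is x, and it has been passed an x as argument (tio)

— ngn

1

Another -1 byte by replacing x-+/x with x-/x. The left argument to -/ serves as the initial value for the reduction (tio)

— ngn

@ngn Thanks! Now I see that the first two golfs are obvious. The last one is beyond my current level :)

— Galen Ivanov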
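
The last tip in the comments, x-/x%#x, can be read as a seeded reduction: the whole list x is the initial value, and each term x_i/n is subtracted from it in turn, leaving x minus the mean. Below is a rough Python analogue using `functools.reduce` with an initial value. It only illustrates the idea and makes no claim about how oK evaluates the expression; the function name is hypothetical.

```python
from functools import reduce


def center_by_seeded_fold(xs):
    """Subtract each x_i / n from the whole list, starting from the list itself."""
    n = len(xs)
    parts = [x / n for x in xs]            # the terms whose sum is the mean
    return reduce(lambda acc, p: [a - p for a in acc], parts, list(xs))
```

Starting from [1, 2, 3], the fold subtracts 1/3, then 2/3, then 1 from every element, which ends at approximately [-1, 0, 1].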

1

MATLAB, 26 bytes

Trivial; std(,1) is used to get the population standard deviation

f=@(x)(x-mean(x))/std(x,1)

— aaaaa says reinstate Monica

1

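TI-Basic (83 series), 14 11 bytes

Ans-mean(Ans
Ans/√(mean(Ans²

Takes input in Ans. For example, if you type the above into prgmSTANDARD, then {1,2,3}:prgmSTANDARD will return {-1.224744871,0.0,1.224744871}.

Previously, I tried using the 1-Var Stats command, which stores the population standard deviation in σx, but it's less trouble to calculate it manually.

— Misha Lavrov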

1

05AB1E, 9 bytes

ÅA-DnÅAt/

Port of @Arnauld's JavaScript answer, so make sure to upvote him!

Try it online or verify all test cases.

Explanation:

ÅA          # Calculate the mean of the (implicit) input
            #  i.e. [-3,1,4,1,5] → 1.6
  -         # Subtract it from each value in the (implicit) input
            #  i.e. [-3,1,4,1,5] and 1.6 → [-4.6,-0.6,2.4,-0.6,3.4]
   D        # Duplicate that list
    n       # Take the square of each
            #  i.e. [-4.6,-0.6,2.4,-0.6,3.4] → [21.16,0.36,5.76,0.36,11.56]
     ÅA     # Pop and calculate the mean of that list
            #  i.e. [21.16,0.36,5.76,0.36,11.56] → 7.84
       t    # Take the square-root of that
            #  i.e. 7.84 → 2.8
        /   # And divide each value in the duplicated list with it (and output implicitly)
            #  i.e. [-4.6,-0.6,2.4,-0.6,3.4] and 2.8 → [-1.6428571428571428,
            #   -0.21428571428571433,0.8571428571428572,-0.21428571428571433,1.2142857142857144]

— Kevin Cruijssen

0

Jelly, 10 bytes

_Æm÷²Æm½Ɗ$

Try it online!

— Erik the Outgolfer

0

Pyth, 21 19 bytes

mc-dJ.OQ@.Om^-Jk2Q2

Try it online here.

mc-dJ.OQ@.Om^-Jk2Q2Q   Implicit: Q=eval(input())
                       Trailing Q inferred
    J.OQ               Take the average of Q, store the result in J
           m     Q     Map the elements of Q, as k, using:
             -Jk         Difference between J and k
            ^   2        Square it
         .O            Find the average of the result of the map
        @         2    Square root it
                       - this is the standard deviation of Q
m                  Q   Map elements of Q, as d, using:
  -dJ                    d - J
 c                       Float division by the standard deviation
                       Implicit print result of map

Edit: after seeing Kevin's answer, changed to use the average builtin for the inner results. Previous answer: mc-dJ.OQ@csm^-Jk2QlQ2

— Sok

0

SNOBOL4 (CSNOBOL4), 229 bytes

	DEFINE('Z(A)')
Z	X =X + 1
	M =M + A<X>	:S(Z)
	N =X - 1.
	M =M / N
D	X =GT(X) X - 1	:F(S)
	A<X> =A<X> - M	:(D)
S	X =LT(X,N) X + 1	:F(Y)
	S =S + A<X> ^ 2 / N	:(S)
Y	S =S ^ 0.5
N	A<X> =A<X> / S
	X =GT(X) X - 1	:S(N)
	Z =A	:(RETURN)

Try it online!

Link is to a functional version of the code which constructs an array from STDIN given its length and then its elements, then runs the function Z on that, and finally prints out the values.

Defines a function Z which returns an array.

The 1. on line 4 is necessary to do the floating point arithmetic properly.

— Giuseppe

0

Julia 0.7, 37 bytes

a->(a-mean(a))/std(a,corrected=false)

Try it online!

— Kirill L.

0

Charcoal, 25 19 bytes

≧⁻∕ΣθＬθθＩ∕θ₂∕ΣＸθ²Ｌθ

Try it online! Link is to verbose version of code. Explanation:

       θ    Input array
≧           Update each element
 ⁻          Subtract
   Σ        Sum of
    θ       Input array
  ∕         Divided by
     Ｌ      Length of
      θ     Input array

Calculate $\mu$ and vectorised subtract it from each $x_i$ .

  θ         Updated array
 ∕          Vectorised divided by
   ₂        Square root of
     Σ      Sum of
       θ    Updated array
      Ｘ     Vectorised to power
        ²   Literal 2
    ∕       Divided by
         Ｌ  Length of
          θ Array
Ｉ           Cast to string
            Implicitly print each element on its own line.

Calculate $\sigma$ , vectorised divide each $x_i$ by it, and output the result.

Edit: Saved 6 bytes thanks to @ASCII-only for a) using SquareRoot() instead of Power(0.5) b) fixing vectorised Divide() (it was doing IntDivide() instead) c) making Power() vectorise.

— Neil

crossed out 25 = no bytes? :P (Also, you haven't updated the TIO link yet)

— ASCII-only

@ASCII-only Oops, thanks!

— Neil
