NumPy - Statistical Functions
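Overview
NumPy provides a number of useful statistical functions for finding the minimum, maximum, percentiles, standard deviation, variance, and related measures of the elements in an array. The functions are explained below.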
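numpy.amin() and numpy.amax()
These functions return the minimum and the maximum of the elements of the given array along the specified axis. If no axis is given, the whole array is reduced to a single value.
Example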
    import numpy as np

    a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

    print('Our array is:')
    print(a)
    print()

    # Minimum of each row (axis 1)
    print('Applying amin() function:')
    print(np.amin(a, 1))
    print()

    # Minimum of each column (axis 0)
    print('Applying amin() function again:')
    print(np.amin(a, 0))
    print()

    # Maximum of the whole array
    print('Applying amax() function:')
    print(np.amax(a))
    print()

    # Maximum of each column (axis 0)
    print('Applying amax() function again:')
    print(np.amax(a, axis=0))

It will produce the following output:

    Our array is:
    [[3 7 5]
     [8 4 3]
     [2 4 9]]

    Applying amin() function:
    [3 3 2]

    Applying amin() function again:
    [2 4 3]

    Applying amax() function:
    9

    Applying amax() function again:
    [8 7 9]
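The axis argument is often the confusing part, so here is a minimal sketch that labels each reduction explicitly. The variable names (col_min, row_max, row_max_2d) are illustrative and do not come from the original text; the keepdims option is a standard NumPy reduction argument that the tutorial does not cover.

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

# axis=0 collapses the rows, leaving one result per column.
col_min = np.amin(a, axis=0)
# axis=1 collapses the columns, leaving one result per row.
row_max = np.amax(a, axis=1)
# keepdims=True keeps the reduced axis with length 1,
# so the result still broadcasts against the original array.
row_max_2d = np.amax(a, axis=1, keepdims=True)

print(col_min)           # [2 4 3]
print(row_max)           # [7 8 9]
print(row_max_2d.shape)  # (3, 1)
```

Keeping the reduced dimension is handy when the result is used in arithmetic with the original array, for example a / row_max_2d to scale each row by its maximum.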
numpy.ptp()
The numpy.ptp() function returns the range (maximum - minimum) of values along an axis. If no axis is given, the range of the whole array is returned.

    import numpy as np

    a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

    print('Our array is:')
    print(a)
    print()

    # Range of the whole array
    print('Applying ptp() function:')
    print(np.ptp(a))
    print()

    # Range of each row
    print('Applying ptp() function along axis 1:')
    print(np.ptp(a, axis=1))
    print()

    # Range of each column
    print('Applying ptp() function along axis 0:')
    print(np.ptp(a, axis=0))

It will produce the following output:

    Our array is:
    [[3 7 5]
     [8 4 3]
     [2 4 9]]

    Applying ptp() function:
    7

    Applying ptp() function along axis 1:
    [4 5 7]

    Applying ptp() function along axis 0:
    [6 3 6]
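Since the peak-to-peak range is defined as maximum minus minimum, the result can be checked against amax() and amin() directly. The sketch below does that for axis 1; the names ptp_rows and manual_rows are illustrative only.

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

# Peak-to-peak range of each row, computed two ways.
ptp_rows = np.ptp(a, axis=1)
manual_rows = np.amax(a, axis=1) - np.amin(a, axis=1)

print(ptp_rows)                               # [4 5 7]
print(np.array_equal(ptp_rows, manual_rows))  # True
```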
numpy.percentile()
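A percentile (or centile) is a measure used in statistics that indicates the value below which a given percentage of the observations in a group falls. The numpy.percentile() function takes the following arguments.

    numpy.percentile(a, q, axis)

Where,

    Sr.No.  Argument  Description
    1       a         Input array
    2       q         The percentile to compute; must be between 0 and 100
    3       axis      The axis along which the percentile is to be calculated

Example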
    import numpy as np

    a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

    print('Our array is:')
    print(a)
    print()

    # 50th percentile of the whole array
    print('Applying percentile() function:')
    print(np.percentile(a, 50))
    print()

    # 50th percentile of each row
    print('Applying percentile() function along axis 1:')
    print(np.percentile(a, 50, axis=1))
    print()

    # 50th percentile of each column
    print('Applying percentile() function along axis 0:')
    print(np.percentile(a, 50, axis=0))

It will produce the following output:

    Our array is:
    [[30 40 70]
     [80 20 10]
     [50 90 60]]

    Applying percentile() function:
    50.0

    Applying percentile() function along axis 1:
    [40. 20. 60.]

    Applying percentile() function along axis 0:
    [50. 40. 60.]
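When the requested percentile falls between two data points, numpy.percentile() interpolates. The following sketch reproduces the result by hand, assuming the default linear interpolation; the sample data and the names pos, lo, hi, and manual are made up for illustration.

```python
import numpy as np

x = np.array([10, 20, 30, 40])
q = 25

xs = np.sort(x)
# Fractional 0-based position of the q-th percentile in the sorted data.
pos = (len(xs) - 1) * q / 100      # 0.75
lo = int(np.floor(pos))            # index of the value just below
hi = int(np.ceil(pos))             # index of the value just above
# Interpolate linearly between the two neighbouring values.
manual = xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

print(manual)               # 17.5
print(np.percentile(x, q))  # 17.5
```

Other interpolation rules exist and give different values for small samples, so the manual formula above should be read as matching the default behaviour only.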
numpy.median()
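The median is defined as the value that separates the higher half of a data sample from the lower half. The numpy.median() function is used as shown in the following program.
Example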
    import numpy as np

    a = np.array([[30, 65, 70], [80, 95, 10], [50, 90, 60]])

    print('Our array is:')
    print(a)
    print()

    # Median of the whole array
    print('Applying median() function:')
    print(np.median(a))
    print()

    # Median of each column
    print('Applying median() function along axis 0:')
    print(np.median(a, axis=0))
    print()

    # Median of each row
    print('Applying median() function along axis 1:')
    print(np.median(a, axis=1))

It will produce the following output:

    Our array is:
    [[30 65 70]
     [80 95 10]
     [50 90 60]]

    Applying median() function:
    65.0

    Applying median() function along axis 0:
    [50. 90. 60.]

    Applying median() function along axis 1:
    [65. 80. 60.]
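The examples above all reduce an odd number of values, so the median is always an actual element. As a small sketch with made-up data, the case of an even number of values looks like this, where the median is the mean of the two middle values and agrees with the 50th percentile:

```python
import numpy as np

even = np.array([7, 1, 5, 3])   # sorted: 1, 3, 5, 7

# With an even count, the median is the mean of the two middle values (3 and 5).
med = np.median(even)
print(med)                      # 4.0

# The median coincides with the 50th percentile.
p50 = np.percentile(even, 50)
print(p50)                      # 4.0
```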
numpy.mean()
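The arithmetic mean is the sum of the elements along an axis divided by the number of elements. The numpy.mean() function returns the arithmetic mean of the elements in the array. If an axis is given, the mean is calculated along it.
Example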
    import numpy as np

    a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

    print('Our array is:')
    print(a)
    print()

    # Mean of the whole array
    print('Applying mean() function:')
    print(np.mean(a))
    print()

    # Mean of each column
    print('Applying mean() function along axis 0:')
    print(np.mean(a, axis=0))
    print()

    # Mean of each row
    print('Applying mean() function along axis 1:')
    print(np.mean(a, axis=1))

It will produce the following output:

    Our array is:
    [[1 2 3]
     [3 4 5]
     [4 5 6]]

    Applying mean() function:
    3.6666666666666665

    Applying mean() function along axis 0:
    [2.66666667 3.66666667 4.66666667]

    Applying mean() function along axis 1:
    [2. 4. 5.]
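The definition "sum divided by number of elements" can be written out directly. This is a minimal sketch for checking the definition, not a replacement for numpy.mean(); the names manual_cols and manual_all are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

# Mean along axis 0: column sums divided by the number of rows.
manual_cols = a.sum(axis=0) / a.shape[0]
print(np.allclose(manual_cols, np.mean(a, axis=0)))  # True

# Overall mean: total sum divided by the total number of elements.
manual_all = a.sum() / a.size
print(np.isclose(manual_all, np.mean(a)))            # True
```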
numpy.average()
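A weighted average is an average in which each component is multiplied by a factor reflecting its importance. The numpy.average() function computes the weighted average of the elements in an array according to the respective weights given in another array. The function accepts an axis parameter. If no axis is specified, the array is flattened.
Consider the array [1,2,3,4] and the corresponding weights [4,3,2,1]. The weighted average is calculated by adding the products of the corresponding elements and dividing that sum by the sum of the weights.

    Weighted average = (1*4+2*3+3*2+4*1)/(4+3+2+1) = 20/10 = 2.0

Example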
    import numpy as np

    a = np.array([1, 2, 3, 4])

    print('Our array is:')
    print(a)
    print()

    # Without weights, average() gives the same result as mean().
    print('Applying average() function:')
    print(np.average(a))
    print()

    wts = np.array([4, 3, 2, 1])

    print('Applying average() function again:')
    print(np.average(a, weights=wts))
    print()

    # With returned=True, the sum of the weights is returned as well.
    print('Weighted average and sum of weights:')
    avg, wt_sum = np.average([1, 2, 3, 4], weights=[4, 3, 2, 1], returned=True)
    print(avg, wt_sum)

It will produce the following output:

    Our array is:
    [1 2 3 4]

    Applying average() function:
    2.5

    Applying average() function again:
    2.0

    Weighted average and sum of weights:
    2.0 10.0
In a multi-dimensional array, the axis along which to compute the average can be specified.
Example
    import numpy as np

    a = np.arange(6).reshape(3, 2)

    print('Our array is:')
    print(a)
    print()

    # One weight per column; each row is averaged with these weights.
    wt = np.array([3, 5])

    print('Weighted average along axis 1:')
    print(np.average(a, axis=1, weights=wt))
    print()

    print('Weighted average along axis 1, with sum of weights:')
    print(np.average(a, axis=1, weights=wt, returned=True))

It will produce the following output:

    Our array is:
    [[0 1]
     [2 3]
     [4 5]]

    Weighted average along axis 1:
    [0.625 2.625 4.625]

    Weighted average along axis 1, with sum of weights:
    (array([0.625, 2.625, 4.625]), array([8., 8., 8.]))
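The weighted-average formula (sum of products divided by sum of weights) can be spelled out with plain array arithmetic. The sketch below does this for the 1-D and 2-D cases used above; manual_1d and manual_2d are illustrative names.

```python
import numpy as np

# 1-D case: (1*4 + 2*3 + 3*2 + 4*1) / (4 + 3 + 2 + 1)
a = np.array([1, 2, 3, 4])
wts = np.array([4, 3, 2, 1])
manual_1d = np.sum(a * wts) / np.sum(wts)
print(manual_1d)                          # 2.0
print(np.average(a, weights=wts))         # 2.0

# 2-D case: the weights broadcast across each row before summing along axis 1.
b = np.arange(6).reshape(3, 2)
wt = np.array([3, 5])
manual_2d = (b * wt).sum(axis=1) / wt.sum()
print(np.allclose(manual_2d, np.average(b, axis=1, weights=wt)))  # True
```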
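Standard Deviation
The standard deviation is the square root of the average of the squared deviations from the mean. The formula for the standard deviation is as follows:

    std = sqrt(mean(abs(x - x.mean())**2))

If the array is [1, 2, 3, 4], its mean is 2.5. The squared deviations are therefore [2.25, 0.25, 0.25, 2.25]. Their sum is 5, their mean is 5/4 = 1.25, and the square root of that mean, sqrt(5/4), is 1.118033988749895.
Example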
    import numpy as np

    print(np.std([1, 2, 3, 4]))

It will produce the following output:

    1.118033988749895
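The formula can be followed step by step in code. This sketch also shows the ddof argument of numpy.std(), which the text above does not mention: with ddof=1 the divisor becomes N - 1, giving the sample standard deviation instead of the population one. The names sq_dev, manual, and sample are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Population standard deviation, following the formula term by term.
sq_dev = np.abs(x - x.mean()) ** 2       # squared deviations from the mean
manual = np.sqrt(sq_dev.mean())          # square root of their average
print(sq_dev.tolist())                   # [2.25, 0.25, 0.25, 2.25]
print(np.isclose(manual, np.std(x)))     # True

# ddof=1 divides the sum of squared deviations by N - 1 instead of N.
sample = np.std(x, ddof=1)
print(np.isclose(sample, np.sqrt(5 / 3)))  # True
```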
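Variance
The variance is the average of the squared deviations, that is, mean(abs(x - x.mean())**2). In other words, the standard deviation is the square root of the variance.
Example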
    import numpy as np

    print(np.var([1, 2, 3, 4]))

It will produce the following output:

    1.25
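The relationship between variance and standard deviation is easy to confirm numerically. The sketch below also applies numpy.var() along an axis, as with the other functions on this page; the 2-D array m is made-up data for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# The variance equals the square of the standard deviation.
var_x = np.var(x)
print(var_x)                                # 1.25
print(np.isclose(np.std(x) ** 2, var_x))    # True

# Variance of each row of a 2-D array.
m = np.array([[1, 2, 3, 4], [2, 4, 6, 8]])
row_var = np.var(m, axis=1)
print(row_var.tolist())                     # [1.25, 5.0]
```

Doubling every value in a row multiplies its variance by four, which is why the second row gives 5.0.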