Creating arrays
import numpy as np
import pandas as pd
import math
value = float('nan')
# Using math.isnan()
if math.isnan(value):
    print("Value is NaN")
# Using numpy.isnan()
if np.isnan(value):
    print("Value is NaN")
Value is NaN
Value is NaN
np.array([1, 2, 3, 4, 5])
array([1, 2, 3, 4, 5])
np.linspace(10, 100, 10)
array([ 10., 20., 30., 40., 50., 60., 70., 80., 90., 100.])
sex = pd.Series(['Male','Male','Female'])
np.array(sex)
array(['Male', 'Male', 'Female'], dtype=object)
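linspace: returns evenly spaced numbers over an interval
start: the first value of the sequence
stop: the last value of the sequence (included by default)
num: the number of samples to generate; defaults to 50.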
np.linspace(10,100,10)
arange: returns values from start up to, but not including, stop; step: the spacing between values
np.arange(5,10,2)
array([5, 7, 9])
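The two functions above differ mainly in how the end value is treated. The following is a small sketch of that difference; the variable names are illustrative and not part of the original notes.

```python
import numpy as np

# linspace takes the number of samples and includes the end value by default.
# retstep=True also returns the spacing it chose.
samples, step = np.linspace(10, 100, 10, retstep=True)

# arange takes the step size and stops before the end value.
stepped = np.arange(10, 100, 10)

print(samples[-1], step)  # last sample and spacing from linspace
print(stepped[-1])        # last value from arange
```

Use linspace when the number of points matters and arange when the step size matters.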
uniform: draws random samples from a uniform distribution between a lower and an upper bound
np.random.uniform(5,10,size = 4)
array([5.35806766, 5.25970119, 5.53573947, 7.04110989])
np.random.uniform(size = 5)
array([0.53490428, 0.07574269, 0.2994071 , 0.10866207, 0.78867775])
np.random.uniform(size = (2,3))
array([[0.23929405, 0.26832237, 0.70498685],
[0.71195525, 0.50311203, 0.99720624]])
random.randint: generates n random integers in a range (lower bound included, upper bound excluded)
np.random.randint(5,10,10)
array([9, 8, 6, 6, 9, 5, 9, 5, 6, 5])
random.random: generates n random floats in the half-open interval [0, 1)
np.random.random(3)
array([0.70025574, 0.50566006, 0.80192119])
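The random outputs above change on every run. If reproducible numbers are needed, one option (not used in the original notes) is NumPy's Generator API with a fixed seed. The seed value 42 below is an arbitrary assumption.

```python
import numpy as np

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.uniform(5, 10, size=4)
second = rng_b.uniform(5, 10, size=4)

# integers() is the Generator counterpart of randint: high is excluded.
ints = rng_a.integers(5, 10, size=10)

# random() returns floats in [0, 1).
floats = rng_a.random(3)

print(np.array_equal(first, second))
```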
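logspace: generates numbers spaced evenly on a logarithmic scale
start: base ** start is the first value of the sequence.
stop: base ** stop is the last value of the sequence.
endpoint: if True, the last sample is included in the sequence.
base: the base of the logarithm; defaults to 10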
np.logspace(0,10,5,base=2)
array([1.00000000e+00, 5.65685425e+00, 3.20000000e+01, 1.81019336e+02,
1.02400000e+03])
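To make the meaning of start and stop concrete, here is a short sketch showing that logspace matches raising the base to evenly spaced exponents. The variable names are illustrative.

```python
import numpy as np

# Evenly spaced exponents between 0 and 10.
exponents = np.linspace(0, 10, 5)

# Raising the base to those exponents by hand...
by_hand = 2.0 ** exponents

# ...gives the same values as logspace with base=2.
built_in = np.logspace(0, 10, 5, base=2)

print(np.allclose(by_hand, built_in))
```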
zeros: np.zeros creates an array filled entirely with zeros
shape: the shape of the array.
dtype: the data type of the generated array, for example 'int'; defaults to 'float'
np.zeros((2,3),dtype="int")
array([[0, 0, 0],
[0, 0, 0]])
np.zeros(5)
array([0., 0., 0., 0., 0.])
ones: np.ones creates an array filled entirely with ones
np.ones((2,3))
array([[1., 1., 1.],
[1., 1., 1.]])
full: creates an n-dimensional array filled with a single value
fill_value: the value to fill with
np.full((2,4),fill_value=4)
array([[4, 4, 4, 4],
[4, 4, 4, 4]])
identity: creates a square identity matrix of the given size
np.identity(4)
array([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]])
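Array operations
max, min: return the maximum and minimum value of an array
axis: the axis along which to operate
out: an array in which to store the output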
arr=np.array([1,1,2,3,3,4,5,6,6,2])
np.min(arr)
arr.min()
1
np.max(arr)
6
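The axis parameter only matters for arrays with more than one dimension. The sketch below uses an illustrative 2-D array named grid to show the effect.

```python
import numpy as np

grid = np.array([[3, 4, 5, 2],
                 [6, 7, 2, 6]])

# axis=0 reduces down the rows, giving one result per column.
col_min = grid.min(axis=0)

# axis=1 reduces across the columns, giving one result per row.
row_max = grid.max(axis=1)

print(col_min)  # [3 4 2 2]
print(row_max)  # [5 7]
```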
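unique: returns the sorted unique elements of an array
return_index: if True, also returns the indices of the first occurrence of each unique value in the input array.
return_inverse: if True, also returns the indices into the unique array that reconstruct the input array.
return_counts: if True, also returns the number of times each unique element occurs.
axis: the axis to operate on. By default the array is treated as flattened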
np.unique(arr,return_counts=True)
(array([1, 2, 3, 4, 5, 6]), array([2, 2, 2, 1, 1, 2]))
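The return_index and return_inverse options are easiest to understand by checking what they index into. A minimal sketch, reusing the same input values:

```python
import numpy as np

arr = np.array([1, 1, 2, 3, 3, 4, 5, 6, 6, 2])

values, first_idx, inverse = np.unique(
    arr, return_index=True, return_inverse=True
)

# first_idx points into the input: arr[first_idx] gives the unique values.
print(arr[first_idx])

# inverse points into the unique array: values[inverse] rebuilds the input.
print(values[inverse])
```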
mean: returns the arithmetic mean of the array
np.mean(arr)
3.3
median: returns the median of the array
arr = np.array([[1,2,3],[5,8,4]])
np.median(arr)
3.5
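digitize: returns the index of the bin that each value of the input array falls into. bins: the array of bin edges, right: whether each interval includes its right edge instead of its left edge
a = np.array([-0.9, 0.5, 0.9, 1, 1.2, 1.4, 3.6, 4.7, 5.3])
bins = np.array([0,1,2,3])
np.digitize(a,bins)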
array([0, 1, 1, 2, 2, 2, 4, 4, 4])
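The right parameter only changes the result for values that sit exactly on a bin edge. A small sketch with an illustrative three-element input:

```python
import numpy as np

values = np.array([0.5, 1.0, 1.4])
edges = np.array([0, 1, 2, 3])

# Default (right=False): each bin is edges[i-1] <= x < edges[i].
left_closed = np.digitize(values, edges)

# right=True: each bin is edges[i-1] < x <= edges[i].
right_closed = np.digitize(values, edges, right=True)

print(left_closed)   # [1 2 2]
print(right_closed)  # [1 1 2]
```

Only the value 1.0, which lies on an edge, moves to a different bin.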
reshape: returns an array containing the same data in a new shape
A = np.random.randint(15,size=(4,3))
A
array([[ 0, 3, 3],
[12, 12, 7],
[ 7, 8, 11],
[10, 14, 9]])
A.reshape(3,4)
array([[ 0, 3, 3, 12],
[12, 7, 7, 8],
[11, 10, 14, 9]])
A.reshape(-1)
array([ 0, 3, 3, 12, 12, 7, 7, 8, 11, 10, 14, 9])
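A dimension given as -1 is inferred from the total number of elements. The sketch below uses a deterministic array in place of the random one so the shapes are easy to check; the names are illustrative.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)

# -1 lets NumPy work out the missing dimension: 12 elements / 2 rows = 6 columns.
wide = A.reshape(2, -1)

# A lone -1 flattens the array.
flat = A.reshape(-1)

# A shape that does not fit the element count raises ValueError.
try:
    A.reshape(5, -1)
    fits = True
except ValueError:
    fits = False

print(wide.shape, flat.shape, fits)
```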
expand_dims: expands the dimensions of an array by inserting a new axis of length 1
arr = np.array([ 8, 14, 1, 8, 11, 4, 9, 4, 1, 13, 13, 11])
np.expand_dims(arr,axis=0)
array([[ 8, 14, 1, 8, 11, 4, 9, 4, 1, 13, 13, 11]])
np.expand_dims(arr,axis=1)
array([[ 8],
[14],
[ 1],
[ 8],
[11],
[ 4],
[ 9],
[ 4],
[ 1],
[13],
[13],
[11]])
arr = np.array([1, 2, 3])  # input array
result = np.expand_dims(arr, axis=1)
result
array([[1],
[2],
[3]])
squeeze: reduces the dimensions of an array by removing axes of length 1
arr = np.array([[ 8],[14],[ 1],[ 8],[11],[ 4],[ 9],[ 4],[ 1],[13],[13],[11]])
np.squeeze(arr)
array([ 8, 14, 1, 8, 11, 4, 9, 4, 1, 13, 13, 11])
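expand_dims and squeeze undo each other. A brief sketch of the round trip, with np.newaxis indexing shown as an equivalent way to add an axis:

```python
import numpy as np

arr = np.array([8, 14, 1])

# Add an axis of length 1 in position 1: shape (3,) becomes (3, 1).
column = np.expand_dims(arr, axis=1)

# Indexing with np.newaxis produces the same result.
same_column = arr[:, np.newaxis]

# squeeze with an explicit axis removes only that length-1 axis.
restored = np.squeeze(column, axis=1)

print(column.shape, restored.shape)
```

Passing axis to squeeze is a safeguard: it raises an error if that axis does not have length 1, instead of silently removing other axes.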
count_nonzero: counts the non-zero elements and returns the count
a = np.array([0,0,1,1,1,0])
np.count_nonzero(a)
3
argwhere: finds and returns the indices of all non-zero elements
a = np.array([0,0,1,1,1,0])
np.argwhere(a)
array([[2],
[3],
[4]])
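argmax & argmin: argmax returns the index of the maximum element of an array. It can be used in multi-class image classification to get the index of the label with the highest predicted probability. argmin returns the index of the minimum element of an array.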
arr = np.array([[0.12,0.64,0.19,0.05]])
np.argmax(arr)
1
np.argmin(arr)
3
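For the classification use case mentioned above, predictions usually come as one row per sample. The sketch below assumes a made-up batch of two probability rows and hypothetical class names to show argmax with axis=1.

```python
import numpy as np

# Hypothetical class names and predicted probabilities (one row per sample).
class_names = ["cat", "dog", "bird", "fish"]
probs = np.array([[0.12, 0.64, 0.19, 0.05],
                  [0.70, 0.10, 0.15, 0.05]])

# axis=1 picks the most probable class within each row.
label_ids = np.argmax(probs, axis=1)
labels = [class_names[i] for i in label_ids]

print(label_ids)  # [1 0]
print(labels)     # ['dog', 'cat']
```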
sort: sorts an array, kind: the sorting algorithm to use, one of {'quicksort', 'mergesort', 'heapsort', 'stable'}
arr = np.array([2,3,1,7,4,5])
np.sort(arr)
array([1, 2, 3, 4, 5, 7])
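Two common follow-ups to sort are getting the sorting order itself and sorting in descending order. A short sketch using the same input:

```python
import numpy as np

arr = np.array([2, 3, 1, 7, 4, 5])

# argsort returns the indices that would sort the array.
order = np.argsort(arr)

# np.sort has no descending flag; reversing the sorted result is one way.
descending = np.sort(arr)[::-1]

print(order)       # [2 0 1 4 5 3]
print(descending)  # [7 5 4 3 2 1]
```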
abs:返回数组中元素的绝对值
A=np.array([[1,-3,4],[-2,-4,3]])
np.abs(A)
array([[1, 3, 4],
[2, 4, 3]])
round: rounds floating-point values to a given number of decimal places, decimals: the number of decimal places to keep
a=np.random.random(size=(3,4))
a
array([[0.16884867, 0.57913567, 0.16815851, 0.20774758],
[0.54647561, 0.9234027 , 0.83512324, 0.06706385],
[0.70642496, 0.02558962, 0.08868171, 0.52844368]])
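The listing above shows only the input array. A minimal sketch of the rounding call itself, using a few of those values hard-coded so the result is repeatable:

```python
import numpy as np

a = np.array([[0.16884867, 0.57913567],
              [0.54647561, 0.9234027]])

# Keep two decimal places.
rounded = np.round(a, decimals=2)

print(rounded)
```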
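clip: limits the values of an array to a range; values below the lower bound are set to it, and values above the upper bound are set to it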
arr = np.array([0,1,-3,-4,5,6,7,2,3])
arr.clip(0,5)
array([0, 1, 0, 0, 5, 5, 5, 2, 3])
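Replacing values in an array
where: returns the indices of the elements that satisfy a condition (index the array with them to get the elements), condition: the condition to match. If x and y are also given, the result takes values from x where the condition is True and from y elsewhere.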
a = np.arange(12).reshape(4,3)
a
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
np.where(a>5)
(array([2, 2, 2, 3, 3, 3]), array([0, 1, 2, 0, 1, 2]))
a[np.where(a>5)]
array([ 6, 7, 8, 9, 10, 11])
It can also be used to replace elements in a pandas DataFrame.
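Replacement relies on the three-argument form of where. The sketch below shows it on a plain array; the replacement value -1 is an arbitrary choice. The same call also accepts a DataFrame column as the condition source, for example np.where(df['col'] > 5, -1, df['col']) with a hypothetical column name.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

# Where the condition holds take -1, elsewhere keep the original value.
replaced = np.where(a > 5, -1, a)

print(replaced)
```

The input array is left unchanged; where returns a new array.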
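put: replaces the specified elements of an array with the given values, modifying the array in place
a: the array
ind: the indices to replace.
v: the replacement values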
arr = np.array([1,2,3,4,5,6])
arr
array([1, 2, 3, 4, 5, 6])
np.put(arr,[1,2],[10,9])
arr
array([ 1, 10, 9, 4, 5, 6])
copyto: copies the contents of one array into another
arr1 = np.array([1,2,3])
arr2 = np.array([4,5,6])
arr1
array([1, 2, 3])
arr2
array([4, 5, 6])
np.copyto(dst=arr1,src=arr2)
arr2
array([4, 5, 6])
arr1
array([4, 5, 6])
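copyto can also copy selectively through its where argument. A small sketch with an illustrative boolean mask:

```python
import numpy as np

dst = np.array([1, 2, 3])
src = np.array([4, 5, 6])

# Only positions where the mask is True are overwritten in dst.
mask = np.array([True, False, True])
np.copyto(dst, src, where=mask)

print(dst)  # [4 2 6]
```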
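Set operations
Finding common elements: intersect1d returns the sorted unique values that appear in both arrays
assume_unique: if True, the input arrays are both assumed to contain unique values.
return_indices: if True, also returns the indices of the common elements in each array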
ar1 = np.array([1,2,3,4,5,6])
ar2 = np.array([3,4,5,8,9,1])
np.intersect1d(ar1,ar2)
array([1, 3, 4, 5])
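With return_indices=True the call returns three arrays. The sketch below reuses the same inputs and checks what the two index arrays refer to.

```python
import numpy as np

ar1 = np.array([1, 2, 3, 4, 5, 6])
ar2 = np.array([3, 4, 5, 8, 9, 1])

common, idx1, idx2 = np.intersect1d(ar1, ar2, return_indices=True)

# idx1 and idx2 locate the common values inside each input array.
print(common)
print(ar1[idx1])
print(ar2[idx2])
```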
Finding differing elements: np.setdiff1d returns the sorted unique elements of the first array that are not present in the second
a = np.array([1, 7, 3, 2, 4, 1])
b = np.array([9, 2, 5, 6, 7, 8])
np.setdiff1d(a,b)
array([1, 3, 4])
Extracting elements unique to one of two arrays: setxor1d returns the sorted unique values that appear in exactly one of the two arrays
a = np.array([1, 2, 3, 4, 6])
b = np.array([1, 4, 9, 4, 36])
np.setxor1d(a,b)
array([ 2, 3, 6, 9, 36])
Union: union1d merges two arrays into one sorted array of unique values
a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 3, 5, 4, 36])
np.union1d(a,b)
array([ 1, 2, 3, 4, 5, 36])
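The four set functions are related: the symmetric difference is the union with the intersection removed. A short sketch that checks this on the setxor1d inputs:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 6])
b = np.array([1, 4, 9, 4, 36])

union = np.union1d(a, b)
common = np.intersect1d(a, b)
exclusive = np.setxor1d(a, b)

# Removing the common values from the union gives the same result as setxor1d.
rebuilt = np.setdiff1d(union, common)

print(np.array_equal(exclusive, rebuilt))
```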
Array splitting:
hsplit splits an array horizontally (column-wise) into n equal parts
A = np.array([[3,4,5,2],[6,7,2,6]])
np.hsplit(A,2)
[array([[3, 4],
[6, 7]]),
array([[5, 2],
[2, 6]])]
np.hsplit(A,4)
[array([[3],
[6]]),
array([[4],
[7]]),
array([[5],
[2]]),
array([[2],
[6]])]
Vertical splitting: vsplit splits an array vertically (row-wise) into n equal parts
A = np.array([[3,4,5,2],[6,7,2,6]])
np.vsplit(A,2)
[array([[3, 4, 5, 2]]), array([[6, 7, 2, 6]])]
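Both functions require the array to divide evenly. When it does not, np.array_split is one alternative that allows unequal parts. A minimal sketch:

```python
import numpy as np

A = np.array([[3, 4, 5, 2],
              [6, 7, 2, 6]])

# Four columns cannot be split into three equal parts, so hsplit raises.
try:
    np.hsplit(A, 3)
    equal_split_ok = True
except ValueError:
    equal_split_ok = False

# array_split accepts the uneven division; earlier parts get the extra columns.
parts = np.array_split(A, 3, axis=1)
shapes = [p.shape for p in parts]

print(equal_split_ok, shapes)
```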
Array stacking
Horizontal stacking: hstack appends one array to the end of another
a = np.array([1,2,3,4,5])
b = np.array([1,4,9,16,25])
np.hstack((a,b))
array([ 1, 2, 3, 4, 5, 1, 4, 9, 16, 25])
Vertical stacking: vstack stacks one array on top of another
np.vstack((a,b))
array([[ 1, 2, 3, 4, 5],
[ 1, 4, 9, 16, 25]])
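allclose: returns True if two arrays of the same (or broadcastable) shape are element-wise equal within a tolerance
a = np.array([0.25,0.4,0.6,0.32])
b = np.array([0.26,0.3,0.7,0.32])
tolerance = 0.1  # relative tolerance: the third positional argument of allclose is rtol
np.allclose(a,b,rtol=tolerance)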
False
tolerance = 0.5
np.allclose(a,b,rtol=tolerance)
True
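The tolerance in allclose has two parts: the check is |a - b| <= atol + rtol * |b|, so a relative tolerance scales with the values in b while an absolute tolerance does not. The sketch below contrasts the two on the same inputs; the atol value of 0.11 is an arbitrary choice.

```python
import numpy as np

a = np.array([0.25, 0.4, 0.6, 0.32])
b = np.array([0.26, 0.3, 0.7, 0.32])

# Relative tolerance: allowed difference is 10% of each value in b.
relative = np.allclose(a, b, rtol=0.1)

# Absolute tolerance: allowed difference is a fixed 0.11 for every element.
absolute = np.allclose(a, b, rtol=0, atol=0.11)

# isclose reports the same check element by element.
per_element = np.isclose(a, b, rtol=0.1)

print(relative, absolute)
print(per_element)
```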
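equal: compares two arrays element by element and returns True wherever the elements match (arr1 and arr2 here are the arrays left over from the copyto example)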
np.equal(arr1,arr2)
array([ True, True, True])
Repeating array elements
repeat: repeats each element of an array n times
a: the elements to repeat
repeats: the number of repetitions
np.repeat('2017',3)
array(['2017', '2017', '2017'], dtype='<U4')
fruits=pd.DataFrame([
['Mango',40],
['Apple',90],
['Banana',130]
],columns=['Product','ContainerSales'])
fruits
| | Product | ContainerSales |
|---|---|---|
| 0 | Mango | 40 |
| 1 | Apple | 90 |
| 2 | Banana | 130 |
shape holds the dimensions of an array or DataFrame; indexing it with [0] gives the number of rows.
fruits['year'] = np.repeat(2020,fruits.shape[0])
fruits
| | Product | ContainerSales | year |
|---|---|---|---|
| 0 | Mango | 40 | 2020 |
| 1 | Apple | 90 | 2020 |
| 2 | Banana | 130 | 2020 |
tile: constructs an array by repeating A the number of times given by reps
np.tile("Ram",5)
array(['Ram', 'Ram', 'Ram', 'Ram', 'Ram'], dtype='<U3')
np.tile(3,(2,3))
array([[3, 3, 3],
[3, 3, 3]])
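repeat and tile look alike on a single value but differ on a sequence. A short sketch of the difference:

```python
import numpy as np

seq = np.array([1, 2, 3])

# repeat duplicates each element in place.
repeated = np.repeat(seq, 2)

# tile duplicates the whole sequence.
tiled = np.tile(seq, 2)

print(repeated)  # [1 1 2 2 3 3]
print(tiled)     # [1 2 3 1 2 3]
```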
Einstein summation
a = np.arange(1,10).reshape(3,3)
b = np.arange(21,30).reshape(3,3)
np.einsum('ii->i',a)
array([1, 5, 9])
np.einsum('ji',a)
array([[1, 4, 7],
[2, 5, 8],
[3, 6, 9]])
np.einsum('ij,jk',a,b)
array([[150, 156, 162],
[366, 381, 396],
[582, 606, 630]])
np.einsum('ii',a)
15
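Each of the four subscript strings above corresponds to a familiar operation. The sketch below checks those correspondences against the dedicated NumPy functions.

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
b = np.arange(21, 30).reshape(3, 3)

# 'ii->i' keeps the repeated index: the main diagonal.
diagonal = np.einsum('ii->i', a)

# 'ji' has no explicit output, so the output indices are sorted to 'ij': a transpose.
transposed = np.einsum('ji', a)

# 'ij,jk' sums over the shared index j: a matrix product.
product = np.einsum('ij,jk', a, b)

# 'ii' sums over the repeated index: the trace.
trace = np.einsum('ii', a)

print(np.array_equal(diagonal, np.diag(a)))
print(np.array_equal(transposed, a.T))
print(np.array_equal(product, a @ b))
print(trace == np.trace(a))
```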
Statistical analysis
A = np.array([[3, 4, 5, 2],
[6, 7, 2, 6]])
np.histogram(A)
(array([2, 0, 1, 0, 1, 0, 1, 0, 2, 1]),
array([2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. ]))
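By default histogram uses 10 equal-width bins over the data range, which is why the result above has 10 counts and 11 edges. Explicit bin edges can be passed instead; the edges below are an illustrative choice.

```python
import numpy as np

A = np.array([[3, 4, 5, 2],
              [6, 7, 2, 6]])

# Three bins: [2, 4), [4, 6) and [6, 8]. Only the last bin includes its right edge.
counts, edges = np.histogram(A, bins=[2, 4, 6, 8])

print(counts)  # [3 2 3]
print(edges)   # [2 4 6 8]
```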
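Percentiles
Computes the q-th percentile of the data along the specified axis
a: the input array
q: the percentile to compute.
overwrite_input: if True, the input array may be modified by intermediate calculations to save memory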
a = np.array([[2, 4, 6], [4, 8, 12]])
np.percentile(a, 50)
5.0
np.percentile(a, 10)
3.0
np.percentile(a,5)
2.5
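The calls above flatten the array because no axis is given. A small sketch of the same function with an axis:

```python
import numpy as np

a = np.array([[2, 4, 6], [4, 8, 12]])

# axis=0: the 50th percentile of each column.
per_column = np.percentile(a, 50, axis=0)

# axis=1: the 50th percentile of each row.
per_row = np.percentile(a, 50, axis=1)

print(per_column)  # [3. 6. 9.]
print(per_row)     # [4. 8.]
```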
Standard deviation and variance
std and var are two NumPy functions that compute the standard deviation and variance along an axis
a = np.array([[2, 4, 6], [4, 8, 12]])
np.std(a,axis=1)
array([1.63299316, 3.26598632])
np.std(a,axis=0)
array([1., 2., 3.])
np.var(a,axis=1)
array([ 2.66666667, 10.66666667])
np.var(a,axis=0)
array([1., 4., 9.])
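The results above are population statistics: by default std and var divide by N. For the sample versions, which divide by N - 1, pass ddof=1. A minimal sketch:

```python
import numpy as np

a = np.array([[2, 4, 6], [4, 8, 12]])

# Population standard deviation (default, ddof=0): divides by N.
population_std = np.std(a, axis=1)

# Sample standard deviation (ddof=1): divides by N - 1.
sample_std = np.std(a, axis=1, ddof=1)

print(population_std)
print(sample_std)  # [2. 4.]
```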
Printing arrays
np.set_printoptions(precision=2)
a = np.array([12.23456,32.34535])
a
array([12.23, 32.35])
Set the number of elements above which a printed array is summarized (np.inf prints every element)
np.set_printoptions(threshold=np.inf)
Increase the number of characters printed per line
np.set_printoptions(linewidth=100)
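set_printoptions changes the settings for the rest of the session. For a temporary change, np.printoptions can be used as a context manager, as in this short sketch:

```python
import numpy as np

a = np.array([12.23456, 32.34535])

# Inside the block the array prints with two decimals.
with np.printoptions(precision=2):
    short = str(a)

# Outside the block the previous settings apply again.
full = str(a)

print(short)
print(full)
```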
Saving and loading data
arr = np.linspace(10,100,500).reshape(25,20)
np.savetxt('array.txt',arr)
Loading
np.loadtxt('array.txt')
array([[ 10. , 10.18, 10.36, 10.54, 10.72, 10.9 , 11.08, 11.26, 11.44, 11.62, 11.8 ,
11.98, 12.16, 12.34, 12.53, 12.71, 12.89, 13.07, 13.25, 13.43],
[ 13.61, 13.79, 13.97, 14.15, 14.33, 14.51, 14.69, 14.87, 15.05, 15.23, 15.41,
15.59, 15.77, 15.95, 16.13, 16.31, 16.49, 16.67, 16.85, 17.03],
[ 17.21, 17.39, 17.58, 17.76, 17.94, 18.12, 18.3 , 18.48, 18.66, 18.84, 19.02,
19.2 , 19.38, 19.56, 19.74, 19.92, 20.1 , 20.28, 20.46, 20.64],
[ 20.82, 21. , 21.18, 21.36, 21.54, 21.72, 21.9 , 22.08, 22.26, 22.44, 22.63,
22.81, 22.99, 23.17, 23.35, 23.53, 23.71, 23.89, 24.07, 24.25],
[ 24.43, 24.61, 24.79, 24.97, 25.15, 25.33, 25.51, 25.69, 25.87, 26.05, 26.23,
26.41, 26.59, 26.77, 26.95, 27.13, 27.31, 27.49, 27.68, 27.86],
[ 28.04, 28.22, 28.4 , 28.58, 28.76, 28.94, 29.12, 29.3 , 29.48, 29.66, 29.84,
30.02, 30.2 , 30.38, 30.56, 30.74, 30.92, 31.1 , 31.28, 31.46],
[ 31.64, 31.82, 32. , 32.18, 32.36, 32.55, 32.73, 32.91, 33.09, 33.27, 33.45,
33.63, 33.81, 33.99, 34.17, 34.35, 34.53, 34.71, 34.89, 35.07],
[ 35.25, 35.43, 35.61, 35.79, 35.97, 36.15, 36.33, 36.51, 36.69, 36.87, 37.05,
37.23, 37.41, 37.6 , 37.78, 37.96, 38.14, 38.32, 38.5 , 38.68],
[ 38.86, 39.04, 39.22, 39.4 , 39.58, 39.76, 39.94, 40.12, 40.3 , 40.48, 40.66,
40.84, 41.02, 41.2 , 41.38, 41.56, 41.74, 41.92, 42.1 , 42.28],
[ 42.46, 42.65, 42.83, 43.01, 43.19, 43.37, 43.55, 43.73, 43.91, 44.09, 44.27,
44.45, 44.63, 44.81, 44.99, 45.17, 45.35, 45.53, 45.71, 45.89],
[ 46.07, 46.25, 46.43, 46.61, 46.79, 46.97, 47.15, 47.33, 47.52, 47.7 , 47.88,
48.06, 48.24, 48.42, 48.6 , 48.78, 48.96, 49.14, 49.32, 49.5 ],
[ 49.68, 49.86, 50.04, 50.22, 50.4 , 50.58, 50.76, 50.94, 51.12, 51.3 , 51.48,
51.66, 51.84, 52.02, 52.2 , 52.38, 52.57, 52.75, 52.93, 53.11],
[ 53.29, 53.47, 53.65, 53.83, 54.01, 54.19, 54.37, 54.55, 54.73, 54.91, 55.09,
55.27, 55.45, 55.63, 55.81, 55.99, 56.17, 56.35, 56.53, 56.71],
[ 56.89, 57.07, 57.25, 57.43, 57.62, 57.8 , 57.98, 58.16, 58.34, 58.52, 58.7 ,
58.88, 59.06, 59.24, 59.42, 59.6 , 59.78, 59.96, 60.14, 60.32],
[ 60.5 , 60.68, 60.86, 61.04, 61.22, 61.4 , 61.58, 61.76, 61.94, 62.12, 62.3 ,
62.48, 62.67, 62.85, 63.03, 63.21, 63.39, 63.57, 63.75, 63.93],
[ 64.11, 64.29, 64.47, 64.65, 64.83, 65.01, 65.19, 65.37, 65.55, 65.73, 65.91,
66.09, 66.27, 66.45, 66.63, 66.81, 66.99, 67.17, 67.35, 67.54],
[ 67.72, 67.9 , 68.08, 68.26, 68.44, 68.62, 68.8 , 68.98, 69.16, 69.34, 69.52,
69.7 , 69.88, 70.06, 70.24, 70.42, 70.6 , 70.78, 70.96, 71.14],
[ 71.32, 71.5 , 71.68, 71.86, 72.04, 72.22, 72.4 , 72.59, 72.77, 72.95, 73.13,
73.31, 73.49, 73.67, 73.85, 74.03, 74.21, 74.39, 74.57, 74.75],
[ 74.93, 75.11, 75.29, 75.47, 75.65, 75.83, 76.01, 76.19, 76.37, 76.55, 76.73,
76.91, 77.09, 77.27, 77.45, 77.64, 77.82, 78. , 78.18, 78.36],
[ 78.54, 78.72, 78.9 , 79.08, 79.26, 79.44, 79.62, 79.8 , 79.98, 80.16, 80.34,
80.52, 80.7 , 80.88, 81.06, 81.24, 81.42, 81.6 , 81.78, 81.96],
[ 82.14, 82.32, 82.51, 82.69, 82.87, 83.05, 83.23, 83.41, 83.59, 83.77, 83.95,
84.13, 84.31, 84.49, 84.67, 84.85, 85.03, 85.21, 85.39, 85.57],
[ 85.75, 85.93, 86.11, 86.29, 86.47, 86.65, 86.83, 87.01, 87.19, 87.37, 87.56,
87.74, 87.92, 88.1 , 88.28, 88.46, 88.64, 88.82, 89. , 89.18],
[ 89.36, 89.54, 89.72, 89.9 , 90.08, 90.26, 90.44, 90.62, 90.8 , 90.98, 91.16,
91.34, 91.52, 91.7 , 91.88, 92.06, 92.24, 92.42, 92.61, 92.79],
[ 92.97, 93.15, 93.33, 93.51, 93.69, 93.87, 94.05, 94.23, 94.41, 94.59, 94.77,
94.95, 95.13, 95.31, 95.49, 95.67, 95.85, 96.03, 96.21, 96.39],
[ 96.57, 96.75, 96.93, 97.11, 97.29, 97.47, 97.66, 97.84, 98.02, 98.2 , 98.38,
98.56, 98.74, 98.92, 99.1 , 99.28, 99.46, 99.64, 99.82, 100. ]])
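savetxt writes human-readable text. The sketch below shows two variations under stated assumptions: a comma-delimited text file with a fixed number format, and the binary .npy format via np.save and np.load. The file names and the temporary directory are illustrative choices, not part of the original notes.

```python
import os
import tempfile

import numpy as np

arr = np.linspace(10, 100, 500).reshape(25, 20)

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "array.csv")
    npy_path = os.path.join(tmp, "array.npy")

    # Text: comma-separated, four decimals. Precision beyond that is lost.
    np.savetxt(txt_path, arr, delimiter=",", fmt="%.4f")
    from_txt = np.loadtxt(txt_path, delimiter=",")

    # Binary: keeps dtype, shape and full precision.
    np.save(npy_path, arr)
    from_npy = np.load(npy_path)

print(from_txt.shape, np.array_equal(from_npy, arr))
```

Text files are easy to inspect and share; the binary format is the safer choice when exact values matter.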