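Series: one of the two main pandas data structures
- A Series is similar to a one-dimensional array: it consists of a sequence of values and an associated sequence of index labels.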
from pandas import Series

obj = Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
print(obj)
'''
d    4
b    7
a   -5
c    3
dtype: int64
'''
tmp = ['a', 'b']
print(obj[tmp])
"""
a   -5
b    7
dtype: int64
"""
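print(obj[obj > 0])  # boolean mask: keep only the positive values
print(obj * 2)       # element-wise multiplication; the index is preserved
"""
d    4
b    7
c    3
dtype: int64
d     8
b    14
a   -10
c     6
dtype: int64
"""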
data = {'a': 1, 'b': 2, 'c': 3, 'd': -1}
obj2 = Series(data)  # build a Series directly from a dict; the keys become the index
print(obj2)
"""
a    1
b    2
c    3
d   -1
dtype: int64
"""
t = 'a' in obj2  # check whether 'a' is an index label of obj2
print(t)
"""
True
"""
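- When a Series is created with an explicit index, any label without a corresponding value is filled with NaN, and the data are aligned to the index labels automatically.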
# In Series(data, index=...), any index label with no matching key in data is filled with NaN
index = ['a', 'e', 'b', 'c', 'd']  # one extra label 'e' and a different order (values are aligned by label on construction)
obj3 = Series(data, index=index)
print(obj3)
"""
a    1.0
e    NaN
b    2.0
c    3.0
d   -1.0
dtype: float64
"""
# Missing data can be detected with isnull
print(obj3.isnull())
"""
a    False
e     True
b    False
c    False
d    False
dtype: bool
"""
- Series support arithmetic, but NaN propagates: adding, subtracting, multiplying, or dividing with NaN yields NaN.
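obj4 = Series({'a': 1, 'b': 2, 'd': -1, 'e': 5})
print(obj4)
print(obj3 + obj4)
"""
a    1
b    2
d   -1
e    5
dtype: int64
# obj4 has no 'c' label, so it is treated as NaN there; 'e' is NaN in obj3, so the sum at 'e' is NaN too
# the result index is the sorted union of the two indexes
a    2.0
b    4.0
c    NaN
d   -2.0
e    NaN
dtype: float64
"""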
obj4.name = 'obj4'
obj4.index.name = 'index'
print(obj4)
"""
index
a    1
b    2
d   -1
e    5
Name: obj4, dtype: int64
"""
obj4.index = [1, 2, 3, 4]  # the new index must have the same length; the index name is cleared by the assignment
print(obj4)
"""
1    1
2    2
3   -1
4    5
Name: obj4, dtype: int64
"""