Python Data Structures

Summary: An introduction to lists, tuples, dictionaries, sets, and strings in Python

Created: 2022.10.02 11:18:43

Updated: 2022.10.02 11:26:27

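List

A list is a data structure that holds an ordered collection of items. Once a list has been created, you can add, remove, or sort its items, which is to say that a list is a mutable data type.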

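A brief note on classes and objects
A class can have methods, that is, functions defined for use with that class only, for example mylist.append('an item')
A class can also have fields, which are variables defined for use with that class only, for example mylist.field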
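To make the terms "method" and "field" concrete, here is a minimal sketch. The class ShoppingCart and its names owner, items, and add are hypothetical and exist only for this illustration; they are not part of Python or of the examples in this article.

```python
class ShoppingCart:
    """Hypothetical class, used only to illustrate fields and methods."""

    def __init__(self, owner):
        self.owner = owner  # a field: data that belongs to this object
        self.items = []     # another field

    def add(self, item):
        """A method: a function defined on the class and called on an object."""
        self.items.append(item)  # list.append is itself a method of list
        return len(self.items)


cart = ShoppingCart('Swaroop')
count = cart.add('apple')  # methods are called with dot notation
print(cart.owner, count, cart.items)
```

Both fields and methods are reached through the dot, which is the same notation used by mylist.append('an item').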

The following example shows the details:

Python
# This is my shopping list
shoplist = ['apple', 'mango', 'carrot', 'banana']  # initialize the list
print('I have', len(shoplist), 'items to purchase.')

print('These items are:', end=' ')

for item in shoplist:
    print(item, end=' ')

print('\nI also have to buy rice.')
shoplist.append('rice')  # add an item to the list
print('My shopping list is now', shoplist)

print('I will sort my list now')
shoplist.sort()  # sort the list in place
print('Sorted shopping list is', shoplist)

print('The first item I will buy is', shoplist[0])
olditem = shoplist[0]
del shoplist[0]  # delete an item from the list
print('I bought the', olditem)
print('My shopping list is now', shoplist)

Running it produces the following output:

Text Only

I have 4 items to purchase.
These items are: apple mango carrot banana 
I also have to buy rice.
My shopping list is now ['apple', 'mango', 'carrot', 'banana', 'rice']
I will sort my list now
Sorted shopping list is ['apple', 'banana', 'carrot', 'mango', 'rice']
The first item I will buy is apple
I bought the apple
My shopping list is now ['banana', 'carrot', 'mango', 'rice']

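Notes

The for...in loop iterates over every item in a list
In the call to print, the end parameter makes the output end with a space instead of the default line break
In the example above, the append method adds an object to the list, the sort method sorts the list in place, and the del statement removes an item; to remove the first item, use del shoplist[0]
Python counts from 0, so shoplist[0] is the first item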
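A few related list operations are worth seeing next to append, sort, and del. The sketch below uses only built-in list methods and the built-in sorted function; the variable names are chosen for this illustration.

```python
shoplist = ['apple', 'mango', 'carrot', 'banana']

# sorted() returns a new sorted list and leaves the original untouched,
# whereas list.sort() sorts in place and returns None.
ordered = sorted(shoplist)
result_of_sort = shoplist.sort()

# pop() removes the item at an index and returns it, which combines the
# "remember the item, then del it" steps of the shopping example.
first = shoplist.pop(0)

# remove() deletes the first item that equals the given value.
shoplist.remove('mango')

print(ordered)         # ['apple', 'banana', 'carrot', 'mango']
print(result_of_sort)  # None
print(first, shoplist) # apple ['banana', 'carrot']
```

Because sort returns None, writing shoplist = shoplist.sort() would lose the list; use sorted when a new list is wanted.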

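Tuple

A tuple holds multiple objects together. Unlike a list, a tuple is immutable: once created, it cannot be edited or changed.
The structure of a tuple resembles a fixed tree, as shown in the figure below

An example makes this concrete: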

Python
zoo = ('python', 'elephant', 'penguin')
print('Number of animals in the zoo is', len(zoo))

new_zoo = 'monkey', 'camel', zoo 
print('Number of cages in the new zoo is', len(new_zoo))
print('All animals in new zoo are', new_zoo)
print('Animals brought from old zoo are', new_zoo[2])
print('Last animal brought from old zoo is', new_zoo[2][2])
print('Number of animals in the new zoo is',
    len(new_zoo)-1+len(new_zoo[2]))

Running it produces the following output:

Text Only

Number of animals in the zoo is 3
Number of cages in the new zoo is 3
All animals in new zoo are ('monkey', 'camel', ('python', 'elephant', 'penguin'))
Animals brought from old zoo are ('python', 'elephant', 'penguin')
Last animal brought from old zoo is penguin
Number of animals in the new zoo is 5

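Notes

The variable zoo refers to a tuple of items, and the len function returns the length of the tuple
new_zoo contains the original tuple zoo as its third item; a tuple placed inside another tuple keeps its identity, and zoo itself is unchanged
The indexing operator accesses the items of a tuple: new_zoo[2] is the third item of new_zoo, and new_zoo[2][2] is the third item within that third item
An empty tuple is written as a pair of parentheses, such as myempty = (), while a tuple with a single item needs a comma after that item, such as singleton = (2,)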
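The points about immutability and the single-item comma can be checked directly. This is a small sketch using only built-in behavior; the variable names are chosen for illustration.

```python
zoo = ('python', 'elephant', 'penguin')

# Assigning to an index of a tuple raises TypeError because tuples are immutable.
try:
    zoo[0] = 'monkey'
    changed = True
except TypeError:
    changed = False

# Parentheses alone do not create a tuple; the trailing comma does.
not_a_tuple = (2)
singleton = (2,)

# Tuple unpacking binds each item to its own name.
first, second, third = zoo

print(changed)                     # False
print(type(not_a_tuple).__name__)  # int
print(type(singleton).__name__)    # tuple
print(third)                       # penguin
```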

Dictionary

A dictionary is like a phone book: it holds keys (the contact names) and values (the contact numbers), and looking up a key gives you the corresponding value.

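Notes

Dictionary keys must be immutable (more precisely, hashable) objects such as strings, while values can be either mutable or immutable objects
Within a dictionary, each key is separated from its value by a colon and the pairs are separated by commas, such as d = {key1: value1, key2: value2}
A dictionary does not sort its key-value pairs; since Python 3.7 it keeps them in insertion order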
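The rule about keys is easiest to see by trying both kinds of object. The following sketch assumes Python 3.7 or later for the ordering check; grid, groups, and the other names are made up for this illustration.

```python
# A tuple of immutable items is hashable, so it can serve as a key.
grid = {(0, 0): 'origin', (1, 2): 'point'}

# A list is mutable and therefore unhashable; using it as a key raises TypeError.
try:
    bad = {[0, 0]: 'origin'}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# Values, on the other hand, may be mutable objects such as lists.
groups = {'fruit': ['apple', 'mango']}
groups['fruit'].append('banana')

# Iterating over a dict yields its keys in insertion order (Python 3.7+).
order = list({'b': 1, 'a': 2, 'c': 3})

print(grid[(1, 2)])     # point
print(list_key_ok)      # False
print(groups['fruit'])  # ['apple', 'mango', 'banana']
print(order)            # ['b', 'a', 'c']
```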

An example follows:

Python
# 'ab' is short for address book
ab = {
    'Swaroop': 'swaroop@swaroopch.com',
    'Larry': 'larry@wall.org',
    'Matsumoto': 'matz@ruby-lang.org',
    'Spammer': 'spammer@hotmail.com'
}

print("Swaroop's address is", ab['Swaroop'])

# delete a key-value pair
del ab['Spammer']

print('\nThere are {} contacts in the address-book\n'.format(len(ab)))

for name, address in ab.items():
    print('Contact {} at {}'.format(name, address))

# add a key-value pair
ab['Guido'] = 'guido@python.org'
if 'Guido' in ab:
    print("\nGuido's address is", ab['Guido'])

Running it produces the following output:

Text Only

Swaroop's address is swaroop@swaroopch.com

There are 3 contacts in the address-book

Contact Swaroop at swaroop@swaroopch.com
Contact Larry at larry@wall.org
Contact Matsumoto at matz@ruby-lang.org

Guido's address is guido@python.org

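Notes

To delete a key-value pair with del, specify the dictionary and the key in the indexing operator; there is no need to know the value that corresponds to the key
The example uses the dictionary's items method to access each key-value pair and a for...in loop to print them one by one
To add a key-value pair, simply assign to a new key, as in ab['Guido'] = 'guido@python.org'
The in operator checks whether a key exists in the dictionary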
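One detail is easy to miss: in looks at keys, not values. The sketch below reuses two entries from the address book and adds the built-in dict methods values, get, and pop, which the example above does not use.

```python
ab = {
    'Swaroop': 'swaroop@swaroopch.com',
    'Larry': 'larry@wall.org',
}

has_key = 'Larry' in ab                        # in tests keys
has_value = 'larry@wall.org' in ab             # a value is not a key
value_found = 'larry@wall.org' in ab.values()  # search the values explicitly

# get() returns a default instead of raising KeyError for a missing key.
fallback = ab.get('Guido', 'unknown')

# pop() removes a key like del does, but also returns the value.
removed = ab.pop('Larry')

print(has_key, has_value, value_found)  # True False True
print(fallback)                         # unknown
print(removed, len(ab))                 # larry@wall.org 1
```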

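Sequences and Slicing

Lists, tuples, and strings are all sequences. The main features of a sequence are membership tests (the in and not in expressions) and indexing operations, which fetch a particular item directly. These three sequence types also support a slicing operation, which retrieves a contiguous part of the sequence.

Here is an example: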

Python
shoplist = ['apple', 'mango', 'carrot', 'banana']
name = 'swaroop'

# indexing with subscripts
print('Item 0 is', shoplist[0])
print('Item 1 is', shoplist[1])
print('Item 2 is', shoplist[2])
print('Item 3 is', shoplist[3])
print('Item -1 is', shoplist[-1])
print('Item -2 is', shoplist[-2])
print('Character 0 is', name[0])

# slicing a list
print('Item 1 to 3 is', shoplist[1:3])
print('Item 2 to end is', shoplist[2:])
print('Item 1 to -1 is', shoplist[1:-1])
print('Item start to end is', shoplist[:])

# slicing a string
print('characters 1 to 3 is', name[1:3])
print('characters 2 to end is', name[2:])
print('characters 1 to -1 is', name[1:-1])
print('characters start to end is', name[:])

The output is:

Text Only

Item 0 is apple
Item 1 is mango
Item 2 is carrot
Item 3 is banana
Item -1 is banana
Item -2 is carrot
Character 0 is s
Item 1 to 3 is ['mango', 'carrot']
Item 2 to end is ['carrot', 'banana']
Item 1 to -1 is ['mango', 'carrot']
Item start to end is ['apple', 'mango', 'carrot', 'banana']
characters 1 to 3 is wa
characters 2 to end is aroop
characters 1 to -1 is waroo
characters start to end is swaroop

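Notes

Fetching individual items of a sequence by index is also called the subscription operation; for example, shoplist[0] is the first item of the sequence shoplist
An index can be negative, in which case the position is counted from the end of the sequence: shoplist[-1] is the last item and shoplist[-2] is the second-to-last
In a slice such as shoplist[1:3] or shoplist[2:], the numbers are optional but the colon is not
In a slice, the first number (before the colon) is the position where the slice starts and the second number (after the colon) is the position where it stops. If the first number is omitted, Python starts at the beginning of the sequence; if the second is omitted, Python stops at the end. The slice includes the start position but excludes the end position
Slices also accept negative positions, which are counted from the end of the sequence; for example, shoplist[:-1] returns everything except the last item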

A slice also accepts a third argument, the step, which defaults to 1. For example:

Python
>>> shoplist = ['apple', 'mango', 'carrot', 'banana']
>>> shoplist[::1]
['apple', 'mango', 'carrot', 'banana']
>>> shoplist[::2]
['apple', 'carrot']
>>> shoplist[::3]
['apple', 'banana']
>>> shoplist[::-1]
['banana', 'carrot', 'mango', 'apple']

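Set

A set is an unordered collection of simple objects. With sets you can test membership, check whether one set is a subset of another, find the intersection of two sets, and so on.

A simple example: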

Python
>>> bri = set(['brazil', 'russia', 'india'])  # create a set
>>> 'india' in bri  # test whether an object is in the set
True
>>> 'usa' in bri
False
>>> bric = bri.copy()  # copy the set
>>> bric.add('china')  # add an object
>>> bric.issuperset(bri)  # test whether bric contains every element of bri
True
>>> bri.remove('russia')  # remove an object from the set
>>> bri & bric  # intersection of the two sets
{'india', 'brazil'}
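Beyond intersection, sets support the other usual operations. The sketch below shows union, difference, the subset test, and de-duplication using built-in set operators and methods; results are passed through sorted only so that the printed order is predictable, since sets themselves are unordered.

```python
bri = {'brazil', 'russia', 'india'}  # set literal, equivalent to set([...])
bric = bri | {'china'}               # union

only_in_bric = bric - bri            # difference
is_subset = bri.issubset(bric)       # the mirror image of issuperset

# A set keeps only one copy of each object.
unique = set(['apple', 'apple', 'mango'])

print(sorted(bric))          # ['brazil', 'china', 'india', 'russia']
print(sorted(only_in_bric))  # ['china']
print(is_subset)             # True
print(len(unique))           # 2
```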

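References

When you create an object and assign it to a variable, the variable only refers to the object (much as a pointer in C/C++ points to an object). This is called binding the name to the object.

Assigning a list to a second name and copying it with a full slice give very different results, much like the difference between sharing a pointer and copying the data in C/C++. Note that a full slice makes a shallow copy: the new list is a separate object, but its items are the same objects as in the original. For example: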

Python
print('Simple Assignment')
shoplist = ['apple', 'mango', 'carrot', 'banana']
# mylist is just another name for shoplist; both refer to the same object
mylist = shoplist

del shoplist[0]
print('shoplist is', shoplist)
print('mylist is', mylist)
# deleting the first item through shoplist also changes what mylist shows

print('Copy by making a full slice')
# make a copy with a full slice (a new list object, i.e. a shallow copy)
mylist = shoplist[:]
del mylist[0]

print('shoplist is', shoplist)
print('mylist is', mylist)
# shoplist and mylist now have different contents

The output is:

Text Only

Simple Assignment
shoplist is ['mango', 'carrot', 'banana']
mylist is ['mango', 'carrot', 'banana']
Copy by making a full slice
shoplist is ['mango', 'carrot', 'banana']
mylist is ['carrot', 'banana']
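A full slice copies only the outer list, which matters as soon as the list contains other mutable objects. The following sketch contrasts a slice copy with copy.deepcopy from the standard library; the nested list is made up for this illustration.

```python
import copy

nested = [['apple', 'mango'], ['carrot']]

shallow = nested[:]           # new outer list, same inner lists
deep = copy.deepcopy(nested)  # new outer list and new inner lists

nested[0].append('banana')  # mutate an inner list
nested.append(['rice'])     # mutate the outer list

print(shallow)  # [['apple', 'mango', 'banana'], ['carrot']]
print(deep)     # [['apple', 'mango'], ['carrot']]
print(shallow is nested, shallow[0] is nested[0])  # False True
```

The slice copy does not see the appended ['rice'] because its outer list is separate, but it does see 'banana' because the inner list is shared. For a flat list of strings, as in the shopping example, the two kinds of copy behave the same.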

More About Strings

Every string used in a program is an object of the class str. The example below demonstrates some useful methods of this class:

Python
name = 'Swaroop'

if name.startswith('Swa'):  # test the prefix of the string
    print('Yes, the string starts with "Swa"')

if 'a' in name:  # test whether the string contains a substring
    print('Yes, it contains the string "a"')

if name.find('war') != -1:  # find returns the index of the substring, or -1 if it is not found
    print('Yes, it contains the string "war"')

delimiter = '_*_'  # join the items of the sequence with _*_
mylist = ['Brazil', 'Russia', 'India', 'China']
print(delimiter.join(mylist))

The output is:

Text Only
Yes, the string starts with "Swa"
Yes, it contains the string "a"
Yes, it contains the string "war"
Brazil_*_Russia_*_India_*_China
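A few more str methods pair naturally with the ones in the example. This sketch uses only built-in string methods (split, find, upper); it shows that split reverses join and that string methods return new strings rather than modifying the original.

```python
delimiter = '_*_'
mylist = ['Brazil', 'Russia', 'India', 'China']

joined = delimiter.join(mylist)
parts = joined.split(delimiter)  # split undoes join for the same delimiter

name = 'Swaroop'
position = name.find('war')  # index where the substring starts
missing = name.find('xyz')   # -1 when the substring is absent

# Strings are immutable, so upper() returns a new string.
shouted = name.upper()

print(joined)             # Brazil_*_Russia_*_India_*_China
print(parts == mylist)    # True
print(position, missing)  # 1 -1
print(name, shouted)      # Swaroop SWAROOP
```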