From: http://interactivepython.org/courselib/static/pythonds/Introduction/GettingStartedwithData.html
Built-in Collection Data Types
Python has a number of very powerful built-in collection classes. Lists, strings, and tuples are ordered collections that are very similar in general structure but have specific differences that must be understood for them to be used properly. Sets and dictionaries are unordered collections.
- List
- A list is an ordered collection of zero or more references to Python data objects.
- Lists are heterogeneous, meaning that the data objects need not all be from the same class. The collection can be assigned to a variable as below.
>>> [1,3,True,6.5]
[1, 3, True, 6.5]
>>> myList = [1,3,True,6.5]
>>> myList
[1, 3, True, 6.5]
- Since lists are considered to be sequentially ordered, they support a number of operations that can be applied to any Python sequence.
Operation Name | Operator | Explanation |
---|---|---|
indexing | [ ] | Access an element of a sequence |
concatenation | + | Combine sequences together |
repetition | * | Concatenate a repeated number of times |
membership | in | Ask whether an item is in a sequence |
length | len | Ask the number of items in the sequence |
slicing | [ : ] | Extract a part of a sequence |
- Note that the indices for lists (sequences) start counting with 0.
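The operations in the table can be tried directly on a list. The following is a minimal sketch; the variable names (letters, combined, and so on) are assumptions chosen for illustration and do not come from the original text.

```python
# Illustrative sketch of the six sequence operations applied to a list.
letters = ['a', 'b', 'c', 'd']

first = letters[0]            # indexing: positions start at 0
combined = letters + ['e']    # concatenation builds a new list
repeated = ['x'] * 3          # repetition
has_c = 'c' in letters        # membership
size = len(letters)           # length
middle = letters[1:3]         # slicing: index 1 up to, but not including, 3

print(first, combined, repeated, has_c, size, middle)
```

The same operators should behave the same way on strings and tuples, since they are sequences as well.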
myList = [1,2,3,4]
A = [myList]*3
print(A)
myList[2]=45
print(A)
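The listing above shows a subtle property of the repetition operator: the result repeats references to the same object rather than making copies. A holds three references to myList, so the change to myList[2] shows up in all three positions. The sketch below checks this and contrasts it with a version that copies the inner list; the names shared and copies are assumptions introduced for illustration.

```python
myList = [1, 2, 3, 4]
shared = [myList] * 3                       # three references to one list object
copies = [list(myList) for _ in range(3)]   # three independent copies

myList[2] = 45

print(shared)   # every entry reflects the change
print(copies)   # the copies keep the original values

# Each entry of shared is the very same object as myList.
same_object = all(item is myList for item in shared)
print(same_object)
```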
Lists support a number of methods that will be used to build data structures.
Method Name | Use | Explanation |
---|---|---|
append | alist.append(item) | Adds a new item to the end of a list |
insert | alist.insert(i,item) | Inserts an item at the ith position in a list |
pop | alist.pop() | Removes and returns the last item in a list |
pop | alist.pop(i) | Removes and returns the ith item in a list |
sort | alist.sort() | Modifies a list to be sorted |
reverse | alist.reverse() | Modifies a list to be in reverse order |
del | del alist[i] | Deletes the item in the ith position |
index | alist.index(item) | Returns the index of the first occurrence of item |
count | alist.count(item) | Returns the number of occurrences of item |
remove | alist.remove(item) | Removes the first occurrence of item |
myList = [1024, 3, True, 6.5]
myList.append(False)
print(myList)
myList.insert(2,4.5)
print(myList)
print(myList.pop())
print(myList)
print(myList.pop(1))
print(myList)
myList.pop(2)
print(myList)
myList.sort()
print(myList)
myList.reverse()
print(myList)
print(myList.count(6.5))
print(myList.index(4.5))
myList.remove(6.5)
print(myList)
del myList[0]
print(myList)
One common Python function that is often discussed in conjunction with lists is the range function. range produces a range object that represents a sequence of values. By using the list function, it is possible to see the value of the range object as a list. This is illustrated below.
>>> range(10)
range(0, 10)
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(5,10)
range(5, 10)
>>> list(range(5,10))
[5, 6, 7, 8, 9]
>>> list(range(5,10,2))
[5, 7, 9]
>>> list(range(10,1,-1))
[10, 9, 8, 7, 6, 5, 4, 3, 2]
>>>
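A range object can also be used directly, without converting it to a list first, for example in a for loop or with len and indexing. A small sketch, with variable names chosen here purely for illustration:

```python
evens = range(0, 10, 2)   # represents 0, 2, 4, 6, 8 without building a list

total = 0
for n in evens:           # iterate directly over the range object
    total += n

print(len(evens))         # number of values in the range: 5
print(evens[1])           # ranges support indexing: 2
print(total)              # 0 + 2 + 4 + 6 + 8 = 20
```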
- String
- Strings are sequential collections of zero or more letters, numbers and other symbols.
>>> "David"
'David'
>>> myName = "David"
>>> myName[3]
'i'
>>> myName*2
'DavidDavid'
>>> len(myName)
5
>>>
Since strings are sequences, all of the sequence operations described above work as you would expect. In addition, strings have a number of methods.
>>> myName
'David'
>>> myName.upper()
'DAVID'
>>> myName.center(10)
'  David   '
>>> myName.find('v')
2
>>> myName.split('v')
['Da', 'id']
Of these, split will be very useful for processing data. split will take a string and return a list of strings using the split character as a division point. In the example, v is the division point. If no division point is specified, the split method looks for whitespace characters such as tab, newline and space.
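The difference between splitting on whitespace and splitting on an explicit character can be seen in a short sketch. The sample strings and variable names here are assumptions made up for the example.

```python
line = "alpha  beta\tgamma\ndelta"

# With no argument, split divides on any run of whitespace
# (spaces, tabs, newlines) and produces no empty strings.
words = line.split()

# With an explicit split character, every occurrence is a division point,
# so two adjacent separators produce an empty string.
fields = "a,b,,c".split(',')

print(words)    # ['alpha', 'beta', 'gamma', 'delta']
print(fields)   # ['a', 'b', '', 'c']
```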
Method Name | Use | Explanation |
---|---|---|
center | astring.center(w) | Returns a string centered in a field of size w |
count | astring.count(item) | Returns the number of occurrences of item in the string |
ljust | astring.ljust(w) | Returns a string left-justified in a field of size w |
lower | astring.lower() | Returns a string in all lowercase |
rjust | astring.rjust(w) | Returns a string right-justified in a field of size w |
find | astring.find(item) | Returns the index of the first occurrence of item |
split | astring.split(schar) | Splits a string into substrings at schar |
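The methods in the table can be exercised as follows. This is only a sketch: it passes an optional fill character to center, ljust, and rjust so that the padding is visible, which goes slightly beyond the one-argument forms listed above, and the variable names are illustrative.

```python
name = "David"

centered = name.center(11, '*')   # fill character makes the padding visible
left = name.ljust(8, '.')
right = name.rjust(8, '.')
lowered = name.lower()
d_count = lowered.count('d')      # counts both 'd' characters after lowercasing
missing = name.find('z')          # find returns -1 when the item is absent

print(centered)   # ***David***
print(left)       # David...
print(right)      # ...David
print(lowered, d_count, missing)
```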
A major difference between lists and strings is that lists can be modified while strings cannot. This is referred to as mutability. Lists are mutable; strings are immutable. For example, you can change an item in a list by using indexing and assignment. With a string that change is not allowed.
>>> myList
[1, 3, True, 6.5]
>>> myList[0]=2**10
>>> myList
[1024, 3, True, 6.5]
>>>
>>> myName
'David'
>>> myName[0]='X'
Traceback (most recent call last):
  File "<pyshell#84>", line 1, in -toplevel-
    myName[0]='X'
TypeError: object doesn't support item assignment
>>>
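Because a string cannot be changed in place, the usual approach is to build a new string instead. The sketch below catches the error from the attempted assignment and then constructs a modified copy with slicing and concatenation; newName and assignment_failed are names assumed for this example.

```python
myName = "David"

assignment_failed = False
try:
    myName[0] = 'X'               # strings do not support item assignment
except TypeError as err:
    assignment_failed = True
    print("TypeError:", err)

# Build a new string from a replacement character and a slice of the old one.
newName = 'X' + myName[1:]

print(newName)   # Xavid
print(myName)    # David (the original string is unchanged)
```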
- Tuples
Tuples are very similar to lists in that they are heterogeneous sequences of data. The difference is that a tuple is immutable, like a string.
>>> myTuple = (2,True,4.96)
>>> myTuple
(2, True, 4.96)
>>> len(myTuple)
3
>>> myTuple[0]
2
>>> myTuple * 3
(2, True, 4.96, 2, True, 4.96, 2, True, 4.96)
>>> myTuple[0:2]
(2, True)
>>>
However, if you try to change an item in a tuple, you will get an error. Note that the error message provides location and reason for the problem.
>>> myTuple[1]=False
Traceback (most recent call last):
  File "<pyshell#137>", line 1, in -toplevel-
    myTuple[1]=False
TypeError: object doesn't support item assignment
>>>
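As with strings, a changed version of a tuple has to be built as a new tuple. One possible way, sketched here with the assumed name changed, is to combine slices of the original with a one-element tuple:

```python
myTuple = (2, True, 4.96)

# (False,) is a one-element tuple; the trailing comma is required.
changed = myTuple[:1] + (False,) + myTuple[2:]

print(changed)   # (2, False, 4.96)
print(myTuple)   # (2, True, 4.96), the original is untouched
```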
- Sets
A set is an unordered collection of zero or more immutable Python data objects. Sets do not allow duplicates and are written as comma-delimited values enclosed in curly braces. The empty set is represented by set(). Sets are heterogeneous, and the collection can be assigned to a variable as below.
>>> {3,6,"cat",4.5,False}
{False, 4.5, 3, 6, 'cat'}
>>> mySet = {3,6,"cat",4.5,False}
>>> mySet
{False, 4.5, 3, 6, 'cat'}
>>>
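The rule that sets do not allow duplicates can be used to strip repeated values from a list. The sketch below also shows why the empty set is written set(): empty curly braces create a dictionary instead. The names values, unique, empty, and not_a_set are assumptions for this example.

```python
values = [3, 6, "cat", 3, 6, 3]

unique = set(values)      # duplicates collapse to a single element
print(len(values))        # 6
print(len(unique))        # 3

empty = set()             # the empty set
not_a_set = {}            # empty braces create a dictionary, not a set
print(type(empty).__name__, type(not_a_set).__name__)
```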
Even though sets are not considered to be sequential, they do support a few of the familiar operations presented earlier.
Operation Name | Operator | Explanation |
---|---|---|
membership | in | Set membership |
length | len | Returns the cardinality of the set |
union | aset \| otherset | Returns a new set with all elements from both sets |
intersection | aset & otherset | Returns a new set with only those elements common to both sets |
difference | aset - otherset | Returns a new set with all items from the first set not in second |
subset | aset <= otherset | Asks whether all elements of the first set are in the second |
>>> mySet
{False, 4.5, 3, 6, 'cat'}
>>> len(mySet)
5
>>> False in mySet
True
>>> "dog" in mySet
False
>>>
Sets support a number of methods that should be familiar to those who have worked with them in a mathematics setting.
Method Name | Use | Explanation |
---|---|---|
union | aset.union(otherset) | Returns a new set with all elements from both sets |
intersection | aset.intersection(otherset) | Returns a new set with only those elements common to both sets |
difference | aset.difference(otherset) | Returns a new set with all items from first set not in second |
issubset | aset.issubset(otherset) | Asks whether all elements of one set are in the other |
add | aset.add(item) | Adds item to the set |
remove | aset.remove(item) | Removes item from the set |
pop | aset.pop() | Removes an arbitrary element from the set |
clear | aset.clear() | Removes all elements from the set |
>>> mySet
{False, 4.5, 3, 6, 'cat'}
>>> yourSet = {99,3,100}
>>> mySet.union(yourSet)
{False, 4.5, 3, 100, 6, 'cat', 99}
>>> mySet | yourSet
{False, 4.5, 3, 100, 6, 'cat', 99}
>>> mySet.intersection(yourSet)
{3}
>>> mySet & yourSet
{3}
>>> mySet.difference(yourSet)
{False, 4.5, 6, 'cat'}
>>> mySet - yourSet
{False, 4.5, 6, 'cat'}
>>> {3,100}.issubset(yourSet)
True
>>> {3,100}<=yourSet
True
>>> mySet.add("house")
>>> mySet
{False, 4.5, 3, 6, 'house', 'cat'}
>>> mySet.remove(4.5)
>>> mySet
{False, 3, 6, 'house', 'cat'}
>>> mySet.pop()
False
>>> mySet
{3, 6, 'house', 'cat'}
>>> mySet.clear()
>>> mySet
set()
>>>
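One detail the session above does not show is what happens when remove is given an item that is not in the set: it raises a KeyError. The sketch below demonstrates this, along with the discard method, which is not covered in the table above but is a standard set method that ignores missing items. The set contents and the flag name are assumptions for illustration.

```python
mySet = {3, 6, "cat"}

remove_raised = False
try:
    mySet.remove("dog")       # "dog" is not in the set
except KeyError:
    remove_raised = True

mySet.discard("dog")          # discard does nothing if the item is missing
mySet.discard(6)              # and removes the item if it is present

print(remove_raised)          # True
print(len(mySet))             # 2
```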
- Dictionaries
Dictionaries are collections of associated pairs of items where each pair consists of a key and a value. This key-value pair is typically written as key:value.
>>> capitals = {'Iowa':'DesMoines','Wisconsin':'Madison'}
>>> capitals
{'Wisconsin': 'Madison', 'Iowa': 'DesMoines'}
>>>
capitals = {'Iowa':'DesMoines','Wisconsin':'Madison'}
print(capitals['Iowa'])
capitals['Utah']='SaltLakeCity'
print(capitals)
capitals['California']='Sacramento'
print(len(capitals))
for k in capitals:
    print(capitals[k]," is the capital of ", k)
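It is important to note that the dictionary is maintained in no particular sorted order with respect to the keys. The placement of a key is dependent on the idea of "hashing". The sessions shown here display entries in arbitrary order; since Python 3.7, a dictionary preserves the order in which its keys were inserted.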
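One practical consequence of hashing is that a key must be a hashable object. Immutable objects such as strings, numbers, and tuples of these qualify, while a list does not. The following sketch uses a hypothetical locations dictionary keyed by coordinate pairs; the coordinates and names are made up for illustration.

```python
locations = {}

# A tuple of numbers is hashable, so it can serve as a key.
locations[(41.6, -93.6)] = 'DesMoines'

list_key_ok = True
try:
    locations[[43.1, -89.4]] = 'Madison'   # a list is mutable and unhashable
except TypeError as err:
    list_key_ok = False
    print("TypeError:", err)

print(locations)
print(list_key_ok)   # False
```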
Dictionaries have both methods and operators. The keys, values, and items methods all return objects that contain the values of interest. You can use the list function to convert them to lists. You will also see that there are two variations on the get method. If the key is not present in the dictionary, get will return None. However, a second, optional parameter can specify a return value instead.
Operator | Use | Explanation |
---|---|---|
[] | adict[k] | Returns the value associated with k, otherwise it's an error |
in | key in adict | Returns True if key is in the dictionary, False otherwise |
del | del adict[key] | Removes the entry from the dictionary |
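The three operators in the table can be seen together in a short sketch. It reuses the capitals dictionary from earlier; the flag names are assumptions added for the example. The error raised by indexing with a missing key is a KeyError.

```python
capitals = {'Iowa': 'DesMoines', 'Wisconsin': 'Madison'}

had_iowa = 'Iowa' in capitals      # membership test on the keys
del capitals['Iowa']               # remove the entry for this key
has_iowa_now = 'Iowa' in capitals

lookup_failed = False
try:
    capitals['Iowa']               # indexing with a missing key is an error
except KeyError:
    lookup_failed = True

print(had_iowa, has_iowa_now, lookup_failed)   # True False True
print(capitals)                                # {'Wisconsin': 'Madison'}
```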
>>> phoneext={'david':1410,'brad':1137}
>>> phoneext
{'brad': 1137, 'david': 1410}
>>> phoneext.keys()
dict_keys(['brad', 'david'])
>>> list(phoneext.keys())
['brad', 'david']
>>> phoneext.values()
dict_values([1137, 1410])
>>> list(phoneext.values())
[1137, 1410]
>>> phoneext.items()
dict_items([('brad', 1137), ('david', 1410)])
>>> list(phoneext.items())
[('brad', 1137), ('david', 1410)]
>>> phoneext.get("kent")
>>> phoneext.get("kent","NO ENTRY")
'NO ENTRY'
>>>
Method Name | Use | Explanation |
---|---|---|
keys | adict.keys() | Returns the keys of the dictionary in a dict_keys object |
values | adict.values() | Returns the values of the dictionary in a dict_values object |
items | adict.items() | Returns the key-value pairs in a dict_items object |
get | adict.get(k) | Returns the value associated with k, None otherwise |
get | adict.get(k,alt) | Returns the value associated with k, alt otherwise |
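A common use of the two-argument form of get is counting, where a missing key should be treated as zero. The sketch below counts words in a made-up sentence and then walks the result with items; the sentence and variable names are assumptions for illustration.

```python
words = "the cat and the hat and the bat".split()

counts = {}
for w in words:
    # get returns 0 the first time a word is seen, so no KeyError occurs.
    counts[w] = counts.get(w, 0) + 1

for word, n in counts.items():     # items yields (key, value) pairs
    print(word, n)

absent = counts.get("dog")         # one-argument form returns None
print(absent)
```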