5. Structured Types, Mutability, and Higer-Order Function

The programs we have looked at thus far have dealt with three types of objects: int, float, and str. The numeric types int and float are scalar types. That is to say, objects of these types have no accessible internal structure. In contrast, str can be thought of as a structured, or non-scalar, type. One can use indexing to extract individual characters from a string and slicing to extract substrings. In this chapter, we introduce four additional structured types. One, tuple, is a rather simple generalization of str. The other three—list, range, and dict—are more interesting. We also return to the topic of functions with some examples that illustrate the utility of being able to treat functions in the same way as other types of objects.

5.1. Tuple

Like strings, tuples are immutable ordered sequences of elements. The difference is that the elements of a tuple need not be characters. The individual elements can be of any type, and need not be of the same type as each other.

Literals of type tuple are written by enclosing a comma-separated list of elements within parentheses. For example, we can write：

Comma-separate:

t1 = ()
t2 = (1, 'two', 3)
print(t1)
print(t2)
Unsurprisingly, the print statements produce the output
()
(1, 'two', 3)

单数项的写法：Looking at this example, you might naturally be led to believe that the tuple containing the single value1 would be written (1). But, to quote Richard Nixon, “that would be wrong.”Since parentheses are used to group expressions, (1)is merely a verbose way to write the integer1. To denote the singleton tuple containing this value, we write==(1,)==. Almost everybody who uses Python has at one time or another accidentally omitted that annoying comma.

Repetition can be used on tuples. For example, the expression 3* ('a', 2) evaluates to ('a', 2, 'a', 2, 'a', 2).Like strings, tuples can be **concatenated, indexed, and sliced. **Consider:

**Concatenated, indexed, and sliced. **

1 = (1, 'two', 3)
t2 = (t1, 3.25)
print(t2)
print(t1 + t2)
print((t1 + t2))
print((t1 + t2)[3])
print((t1 + t2)[2:5])
## will print:
((1, 'two', 3), 3.25)
(1, 'two', 3, (1, 'two', 3), 3.25)
(1, 'two', 3, (1, 'two', 3), 3.25)
(1, 'two', 3)
(3, (1, 'two', 3), 3.25)

5.1.1 Sequences and Multiple Assignment

If you know the length of a sequence (e.g., a tuple or a string), it can be convenient to use Python’s multiple assignment statement to extract the individual elements. For example, after executing the statementx, y = (3, 4), x will be bound to 3 and y to 4. Similarly, the statement a, b, c = 'xyz' will bind a to 'x', b to 'y', and c to 'z'.

This mechanism is particularly convenient when used in conjunction with functions that return fixed-size sequences. Consider, for example the function definition

minDivisor, maxDivisor = findExtrmeDivisors(100, 200)

def findExtrmeDivisors(n1, n2):
    """Assumes that n1 and n2 are positive inters
        Returns a tuple containing the smallest common divisor > 1
        and the largest common divisor of n1 and n2.
        If no common divisor,
        Returns(None,None)"""
    minVal, maxVal = None, None
    for i in range(2, min(n1,n2)+1):
        if n1%i == 0 and n2%i == 0:
            if minVal == None:
                minVal = i
            maxVal = i
    return(minVal, maxVal)
minDivisor, maxDivisor = findExtrmeDivisors(100, 200)
#The multiple assignment statement:
  will bind minDivisor to 2 and maxDivisor to 100.

5.2. Range

Like strings and tuples, ranges are immutable. The range function returns an object of type range. As stated in Section 3.2, the range function takes three integer arguments: start, stop, and step, and returns the progression of integers start, start + step, start + 2*step, etc. If step is positive, the last element is the largest integer start + i*step less than stop. If step is negative, the last element is the smallest integer start + i*step greater than stop. If only two arguments are supplied, a step of 1 is used. If only one argument is supplied, that argument is the stop, start defaults to 0, and step defaults to 1.

All of the operations on tuples are also available for ranges, except for concatenation and repetition. For example, range(10)[2:6][2] evaluates to 4. When the==operator is used to compare objects of type range, it returns True if the two ranges represent the same sequence of integers. For example, range(0, 7, 2) == range(0, 8, 2) evaluates to True. However,range(0, 7, 2) == range(6, -1, -2) evaluates to False because though the two ranges contain the same integers, they occur in a different order.

Unlike objects of type tuple, the amount of space occupied by an object of type range is not proportional to its length. Because a range is fully defined by its start, stop, and step values; it can be stored in a small amount of space.

The most common use of range is in for loops, but objects of type range can be used anywhere a sequence of integers can be used.

5.3. Lists and Mutalility

Like a tuple, a list is an ordered sequence of values, where each value is identified by an index. The syntax for expressing literals of type list is similar to that used for tuples; the difference is that we use square brackets rather than parentheses. The empty list is written as [], and singleton lists are written without that (oh so easy to forget) comma before the closing bracket. So, for example, the code,

singleton listsare writtenwithout that (oh so easy to forget) comm
L = ['I did it all', 4, 'love']
for i in range(len(L)):
  print(L[i])
#produces the output,
I did it all
4
love

Occasionally, the fact that square brackets are used for literals of type list, indexing into lists, and slicing lists can lead to some visual confusion. For example, the expression[1,2,3,4][1:3][1], which evaluates to 3, uses the square brackets in three different ways. This is rarely a problem in practice, because most of the time lists are built incrementally rather than written as literals.

Lists differ from tuples in one hugely important way: lists are mutable. In contrast, tuples and strings are immutable. There are many operators that can be used to create objects of these immutable types, and variables can be bound to objects of these types. But objects of immutable types cannot be modified. On the other hand, objects of type list can be modified after they are created.

The distinction between mutating an object and assigning an object to a variable may, at first, appear subtle. However, if you keep repeating the mantra, “In Python a variable is merely a name, i.e., a label that can be attached to an object,” it will bring you clarity.

When the statements

Techs = ['MIT', 'Caltech']
Ivys = ['Harvard', 'Yale', 'Brown']

are executed, the interpreter creates two new lists and binds the appropriate variables to them, as pictured in Figure 5.1.

Figure 5.1 Two Lists

Figure 5.1 Two Lists

The assignment statements

Univs = [Techs, Ivys]
Univs1 = [['MIT', 'Caltech'], ['Harvard', 'Yale', 'Brown']]

also create new lists and bind variables to them. The elements of these lists are themselves lists. The three print statements

print('Univs =', Univs)
print('Univs1 =', Univs1)
print(Univs == Univs1)

produce the output

Univs = [['MIT', 'Caltech'], ['Harvard', 'Yale', 'Brown']]
Univs1 = [['MIT', 'Caltech'], ['Harvard', 'Yale', 'Brown']]
True

It appears as if Univs and Univs1 are bound to the same value. But appearances can be deceiving. As Figure 5.2 illustrates, Univs and Univs1 are bound to quite different values.

Figure 5.2 Two lists that apprea to have the same value

Figure 5.2 Two lists that apprea to have the same value

That Univs and Univs1 are bound to different objects can be verified using the built-in Python function id, which returns a unique integer identifier for an object. This function allows us to test for object equality. When we run the code

print(Univs == Univs1) #test value equality
print(id(Univs) == id(Univs1)) #test object equality
print('Id of Univs =', id(Univs))
print('Id of Univs1 =', id(Univs1))

it prints

True
False
Id of Univs = 4447805768
Id of Univs1 = 4456134408

(Don’t expect to see the same unique identifiers if you run this code. The semantics of Python says nothing about what identifier is associated with each object; it merely requires that no two objects have the same identifier.) Notice that in Figure 5.2 the elements of Univs are not copies of the lists to which Techs and Ivys are bound, but are rather the lists themselves. The elements of Univs1are lists that contain the same elements as the lists inUnivs, but they are not the same lists. We can see this by running the code

print('Ids of Univs[0] and Univs[1]', id(Univs[0]), id(Univs[1]))
print('Ids of Univs1[0] and Univs1[1]', id(Univs1[0]), id(Univs1[1]))
print('Ids of Univs[0] and Univs[1]', id(Univs[0]), id(Univs[1]))
print('Ids of Univs1[0] and Univs1[1]', id(Univs1[0]), id(Univs1[1]))
#which prints
Ids of Univs[0] and Univs[1] 4447807688 4456134664
Ids of Univs1[0] and Univs1[1] 4447805768 4447806728

Why does this matter? It matters because lists are mutable. Consider the code

Techs.append('RPI')

The append method has a side effect. Rather than create a new list, it mutates the existing list Techs by adding a new element, the string 'RPI', to the end of it. Figure 5.3 depicts the state of the computation after append is executed.

Figure 5.3 Demonstration of mutability

The object to which Univs is bound still contains the same two lists, but the contents of one of those lists has been changed. Consequently, the print statements

print('Univs =', Univs)
print('Univs1 =', Univs1)
now produce the output
Univs = [['MIT', 'Caltech', 'RPI'], ['Harvard', 'Yale', 'Brown']]
Univs1 = [['MIT', 'Caltech'], ['Harvard', 'Yale', 'Brown']]

What we have here is something called aliasing. There are two distinct paths to the same list object.

One path is through the variable Techs
and the other through the first element of the list object to which Univs is bound.

One can mutate the object via either path, and the effect of the mutation will be visible through both paths. This can be convenient, but it can also be treacherous. Unintentional aliasing leads to programming errors that are often enormously hard to track down.

As with tuples, afor statement can be used to iterate over the elements of a list. For example,

for e in Univs:
    print('Univs contains', e)
    print(' which contains')
    for u in e:
        print(' ', u)

will print

Univs contains ['MIT', 'Caltech', 'RPI']
which contains
MIT
Caltech
RPI
Univs contains ['Harvard', 'Yale', 'Brown'] which contains
Harvard
Yale
Brown

When we append one list to another, e.g., Techs.append(Ivys), the original structure is maintained. I.e., the result is a list that contains a list. Suppose we do not want to maintain this structure, but want to add the elements of one list into another list. We can do that by using list concatenation or the extend method, e.g.,

L1 = [1,2,3]
L2 = [4,5,6]
L3 = L1 + L2
print('L3 =', L3)
L1.extend(L2)
print('L1 =', L1)
L1.append(L2)
print('L1 =', L1)
#will print
L3 = [1, 2, 3, 4, 5, 6]
L1 = [1, 2, 3, 4, 5, 6]
L1 = [1, 2, 3, 4, 5, 6, [4, 5, 6]]

Notice that

the operator + does not have a side effect. It creates a new list and returns it.
In contrast, extend and append each mutated L1.

extend扩展，ex交换两者是平等的关系；

append，a加了一个尾巴，后者是前者的附庸。

Figure 5.4 contains short descriptions of some of the methods associated with lists. Note that all of these except count and index mutate the list.

Figure 5.4 Methods associated with lists

Figure 5.4 Methods associated with lists

5.3.1 Cloning

It is usually prudent to avoid mutating a list over which one is iterating. Consider, for example, the code

def removeDups(L1, L2):
    """Assumes that L1 and L2 are lists.
        Removes any element from L1 that also occurs in L2"""
    for e1 in L1:
        if e1 in L2:
            L1.remove(e1)
L1 = [1,2,3,4]
L2 = [1,2,5,6]
removeDups(L1, L2)
print('L1 =', L1)
#You might be surprised to discover that this prints
L1 = [2, 3, 4]

During a for loop, the implementation of Python keeps track of where it is in the list using an internal counter that is incremented at the end of each iteration. When the value of the counter reaches the current length of the list, the loop terminates. This works as one might expect if the list is not mutated within the loop, but can have surprising consequences if the list is mutated. In this case, the hidden counter starts out at 0, discovers that L1[0] is in L2, and removes it—reducing the length ofL1 to3. The counter is then incremented to 1, and the code proceeds to check if the value of L1[1] is inL2. Notice that this is not the original value of L1[1](i.e., 2), but rather the current value of L1[1] (i.e., 3). As you can see, it is possible to figure out what happens when the list is modified within the loop. However, it is not easy. And what happens is likely to be unintentional, as in this example.

One way to avoid this kind of problem is to use slicing to clone (i.e., make a copy of) the list and write for e1 in L1[:]. Notice that writing:

newL1 = L1
for e1 in newL1:

would not solve the problem. It would not create a copy of L1, but would merely introduce a new name for the existing list.

Slicing is not the only way to clone lists in Python. The expressionlist(L) returns a copy of the list L. If the list to be copied contains mutable objects that you want to copy as well, import the standard library module copy and use the function copy.deepcopy.

5.3.2 List Comprehension

List comprehension provides a concise way to apply an operation to the values in a sequence. It creates a new list in which each element is the result of applying a given operation to a value from a sequence (e.g., the elements in another list). For example,

L = [x**2 for x in range(1,7)]
print(L)
will
print the list
[1, 4, 9, 16, 25, 36]

The for clause in a list comprehension can be followed by one or more if and for statements that are applied to the values produced by the for clause. These additional clauses modify the sequence of values generated by the first for clause and produce a new sequence of values, to which the operation associated with the comprehension is applied. For example, the code:

mixed = [1, 2, 'a', 3, 4.0]
print([x**2 for x in mixed if type(x) == int])

squares the integers in mixed, and then prints [1, 4, 9].

Some Python programmers use list comprehensions in marvelous and subtle ways. That is not always a great idea. Remember that somebody else may need to read your code, and “subtle” is not usually a desirable property.

5.4. Function as Objects

In Python, functions are first-class objects. That means that they can be treated like objects of any other type, e.g., int or list. They have types, e.g., the expression type(abs) has the value <type 'built- in_function_or_method'>; they can appear in expressions, e.g., as the right-hand side of an assignment statement or as an argument to a function; they can be elements of lists; etc.

Using functions as arguments allows a style of coding called higher- order programming. It can be particularly convenient in conjunction with lists, as shown in Figure 5.5.

def applyToEach(L, f):
    """Assumes L is a list,f a funtion
        Mutates L by replacing each element, e, of L by f(e)"""
    for i in range(len(L)):
        L[i] = f(L[i]) # 这里没有Return
L = [1, -2, 3.33]
print('L =', L)
print('Apply abs to each element of L ')
applyToEach(L, abs)
print('L =', L)
print('Apply int to each element of L ')
applyToEach(L, int)
print('L =', L)
print('Apply factorial to each element of L ')
applyToEach(L, factR)
print('L =', L)
print('Apply fibonacci to each element of L ')
applyToEach(L, fib)
print('L =', L)

The function applyToEach is called higher-order because it has an argument that is itself a function. The first time it is called, it mutates L by applying the unary built-in function abs to each element. The second time it is called, it applies a type conversion to each element. The third time it is called, it replaces each element by the result of applying the function factR (defined in Figure 4.5) to each element. And the fourth time it is called, it replaces each element by the result of applying the function fib (defined in Figure 4.7) to each element. It prints:

L = [1, -2, 3.33]
Apply abs to each element of L
L = [1, 2, 3.33]
Apply int to each element of L
L = [1, 2, 3]
Apply factorial to each element of L
L = [1, 2, 6]
Apply fibonacci to each element of L
L = [1, 2, 13]

Python has a built-in higher-order function, map, that is similar to, but more general than, the applyToEach function defined in Figure 5.5. it is designed to be used in conjunction with a for loop. In its simplest form the first argument to map is a unary function (i.e., a function that has only one parameter) and the second argument is any ordered collection of values suitable as arguments to the first argument. When used in a for loop, map behaves like the range function in that it returns one value for each iteration of the loop. These values are generated by applying the first argument to each element of the second argument. For example, the code

for i in map(fib, [2, 6, 4]):
print(i)
prints
IndentationError: expected an indented block
>>> for i in map(fib, [2, 6, 4]):
        print(i)
...
2
13
5
>>>

More generally, the first argument to map can be a function of n arguments, in which case it must be followed by n subsequent ordered collections (each of the same length). For example, the code:

L1 = [1, 28, 36]
L2 = [2, 57, 9]
for i in map(min, L1, L2):
    print(i)
#prints
1
28
9

Python supports the creation of anonymous functions (i.e., functions that are not bound to a name), using the reserved word lambda. The general form of a lambda expression is

lambda <sequence of variable names>: <expression>

For example, the lambda expression lambda x, y: x*yreturns a function that returns the product of its two arguments. Lambda expressions are frequently used as arguments to higher-order functions. For example, the code

L = []
for i in map(lambda x, y: x**y, [1 ,2 ,3, 4], [3, 2, 1, 0]): L.append(i)
print(L)
#prints
[1, 4, 3, 1].

5.5 Strings, Tuples, Ranges, and Lists

We have looked at four different sequence types: str, tuple, range, and list. They are similar in that objects of of these types can be operated upon as described in Figure 5.6. Some of their other similarities and differences are summarized in Figure 5.7.

Figure 5.6 Common operations on sequence types

Figure 5.7 Comparison of sequence types

Python programmers tend to use lists far more often than tuples. Since lists are mutable, they can be constructed incrementally during a computation. For example, the following code incrementally builds a list containing all of the even numbers in another list.

evenElems = []
for e in L:
    if e%2 == 0:
        evenElems.append(e)

evenElems = []
for e in L:
    if e%2 == 0:
        evenElems.append(e)

One advantage of tuples is that because they are immutable, aliasing is never a worry. Another advantage of their being immutable is that tuples, unlike lists, can be used as keys in dictionaries, as we will see in the next section.

Since strings can contain only characters, they are considerably less versatile than tuples or lists. On the other hand, when you are working with a string of characters there are many built-in methods that make life easy. Figure 5.8 contains short descriptions of a few of them. Keep in mind that since strings are immutable these all return values and have no side effect.

One of the more useful built-in methods is split, which takes two strings as arguments. The second argument specifies a separator that is used to split the first argument into a sequence of substrings. For example,

print('My favorite professor--John G.--rocks'.split(' '))
print('My favorite professor--John G.-- rocks'.split('-'))
print('My favorite professor--John G.--rocks'.split('--'))

prints

['My', 'favorite', 'professor--John', 'G.--rocks']
['My favorite professor', '', 'John G.', '', ' rocks']
['My favorite professor', 'John G.', 'rocks']

The second argument is optional. If that argument is omitted the first string is split using arbitrary strings of whitespace characters (space, tab, newline, return, and formfeed).

The second argument is optional. If that argument is omitted the first string is split using arbitrary strings of whitespace characters (space, tab, newline, return, and formfeed).

Figure 5.8 Some methods on Strings

5.6. Dictionary

Objects of type dict (short for dictionary) are like lists except that we index them using keys. Think of a dictionary as a set of key/value pairs. Literals of type dict are enclosed in curly braces, and each element is written as a key followed by a colon followed by a value. For example, the code,

monthNumbers = {'Jan':1, 'Feb':2, 'Mar':3, 'Apr':4, 'May':5, 1:'Jan', 2:'Feb', 3:'Mar', 4:'Apr', 5:'May'}
print('The third month is ' + monthNumbers[3])
dist = monthNumbers['Apr'] - monthNumbers['Jan']
print('Apr and Jan are', dist, 'months apart')
#will print
The third month is Mar
Apr and Jan are 3 months apart

The entries in a dict are unordered and cannot be accessed with an index. That’s why monthNumbers[1] unambiguously refers to the entry with the key 1 rather than the second entry.

Like lists, dictionaries are mutable. We can add an entry by writing

monthNumbers['June'] = 6

or change an entry by writing

monthNumbers['May'] = 'V'

Dictionaries are one of the great things about Python. They greatly reduce the difficulty of writing a variety of programs. For example, in Figure 5.9 we use dictionaries to write a (pretty horrible) program to translate between languages. Since one of the lines of code was too long to fit on the page, we used a backslash, , to indicate that the next line of text is a continuation of the previous line.

EtoF = {'bread': 'du pain', 'wine': 'du vin',\
       'eats': 'mange', 'drinks': 'bois',\
       'likes': 'aime', 1: 'un',\
       '6.00':'6.00'}
def translateWord(word, dictionary):
   if word in dictionary:
       return dictionary[word]
   else:
       return word

def translate(sentence):
   translation = ''
   word = ''
   for c in sentence:
       if c != ' ':
           word = word + c
       else:
           translation = translation + ' '\
                         + translateWord(word, EtoF)
           word = ''
   return translation[1:] + ' ' + translateWord(word, EtoF)

print (translate('John eats bread'))
print (translate('Steve drinks wine'))
print (translate('John likes 6.00'))

will print

John mange du pain
Steve bois du vin
John aime 6.00

The code in the figure prints,

Je bois "good" rouge vin, et mange pain.
I drink of wine red.

Remember that dictionaries are mutable. So one must be careful about side effects. For example,

FtoE['bois'] = 'wood'
Print(translate('Je bois du vin rouge.', dicts, 'French to English'))
#will print
I wood of wine red

Most programming languages do not contain a built-in type that provides a mapping from keys to values. Instead, programmers use other types to provide similar functionality. It is, for example, relatively easy to implement a dictionary by using a list in which each element is a key/value pair. One can then write a simple function that does the associative retrieval, e.g.,

def keySearch(L, k):

def keySearch(L, k):
   for elem in L:
       if elem[0] == k: return elem[1]
   return None

The problem with such an implementation is that it is computationally inefficient. In the worst case, a program might have to examine each element in the list to perform a single retrieval. In contrast, the built-in implementation is quite fast. It uses a technique called hashing, described in Chapter 10, to do the lookup in time that is nearly independent of the size of the dictionary.

A for statement can be used to iterate over the entries in a dictionary. However, the value assigned to the iteration variable is a key, not a key/value pair. The order in which the keys are seen in the iteration is not defined. For example, the code

monthNumbers = {'Jan':1, 'Feb':2, 'Mar':3, 'Apr':4, 'May':5, 1:'Jan', 2:'Feb', 3:'Mar', 4:'Apr', 5:'May'}
keys = []
for e in monthNumbers:
    keys.append(str(e))
print(keys)
keys.sort()
print(keys)
#might print
['Jan', 'Mar', '2', '3', '4', '5', '1', 'Feb', 'May', 'Apr']
['1', '2', '3', '4', '5', 'Apr', 'Feb', 'Jan', 'Mar', 'May']

The method keys returns an object of type dict_keys. This is anexample of a view object. The order in which the keys appear in the view is not defined. A view object is dynamic in that if the object with which it is associated changes, the change is visible through the view object. For example,

birthStones = {'Jan':'Garnet', 'Feb':'Amethyst', 'Mar':'Acquamarine', 'Apr':'Diamond', 'May':'Emerald'}
months1 = birthStones.keys()
months2 = birthStones.keys
print(months)
birthStones['June'] = 'Pearl'
print(months)
#might print
dict_keys(['Jan', 'Feb', 'May', 'Apr', 'Mar'])
dict_keys(['Jan', 'Mar', 'June', 'Feb', 'May', 'Apr'])

Objects of type dict_keys can be iterated over using for, and membership can be tested using in. An object of type dict_keys can easily be converted into a list, e.g., list(months).

Not all types of of objects can be used as keys: A key must be an object of a hashable type. A type is hashable if it has

A hash method that maps an object of the type to an int, and for every object the value returned by hash does not change during the lifetime of the object, and
An eq method that is used to compare objects for equality.

All of Python’s built-in immutable types are hashable, and none of Python’s built-in mutable types are hashable. It is often convenient to use tuples as keys. Imagine, for example, using a tuple of the form (flightNumber, day) to represent airline flights. It would then be easy to use such tuples as keys in a dictionary implementing a mapping from flights to arrival times.

As with lists, there are many useful methods associated with dictionaries, including some for removing elements. We do not enumerate all of them here, but will use them as convenient in examples later in the book. Figure 5.10 contains some of the more useful operations on dictionaries.