What Namedtuples Are Good For
Python comes with a specialized “namedtuple” container type that doesn’t seem to get the attention it deserves. It’s one of those amazing features in Python that’s hidden in plain sight.
命名元组
Namedtuples can be a great alternative to defining a class manually, and they have some other interesting features that I want to introduce you to in this chapter.
命名元组在手动定义一个类的时候有很大的机动性,而且还有一些其他有趣的特性。
Now, what’s a namedtuple and what makes it so special? A good way to think about namedtuples is to view them as an extension of the built-in tuple data type.
什么是命名元组?是什么让它如此的与众不同呢?一个思考命名元组的好方式就是将命名元祖看成是内置的元组数据类型的延伸。
Python’s tuples are a simple data structure for grouping arbitrary objects. Tuples are also immutable—they cannot be modified once they’ve been created. Here’s a brief example:
>>> tup = ('hello', object(), 42)
>>> tup
('hello', <object object at 0x105e76b70>, 42)
>>> tup[2]
42
>>> tup[2] = 23
TypeError:
"'tuple' object does not support item assignment"
元组不可修改,可以作为任意对象的集合。
One downside of plain tuples is that the data you store in them can only be pulled out by accessing it through integer indexes. You can’t give names to individual properties stored in a tuple. This can impact code readability.
一个缺点就是我们只能通过整数下标去提取我们存在元组中的数据。我们不能给存在元组中的单个属性命名。(这一点应该是对应字典的比较来说的)
Also, a tuple is always an ad-hoc structure. It’s hard to ensure that two tuples have the same number of fields and the same properties stored on them. This makes it easy to introduce “slip-of-the-mind” bugs by mixing up the field order.
另外,元组总是一种特殊的结构。很难确保两个元组具有相同数量的字段和存储在它们上面的相同属性。这使得通过混合字段顺序很容易引入“思维障碍”错误(emmmmmm......)
Namedtuples to the Rescue
Namedtuples aim to solve these two problems.
命名元组来解决这个问题。
First of all, namedtuples are immutable containers, just like regular tuples. Once you store data in top-level attribute on a namedtuple, you can’t modify it by updating the attribute. All attributes on a namedtuple object follow the “write once, read many” principle.
首先,命名元组也是不可变的。
Besides that, namedtuples are, well…named tuples. Each object stored in them can be accessed through a unique (human-readable) identifier. This frees you from having to remember integer indexes, or resorting to workarounds like defining integer constants as mnemonics for your indexes.
但是我们可以用更加可读的变量名来替换数字索引。
Here’s what a namedtuple looks like:
>>> from collections import namedtuple
>>> Car = namedtuple('Car' , 'color mileage')
上面是一个示例。
Namedtuples were added to the standard library in Python 2.6. To use them, you need to import the collections module. In the above example, I defined a simple Car data type with two fields: color and mileage.
我定义了一个简单的car数据类型,带有两个字段:color和mileage。
You might be wondering why I’m passing the string 'Car' as the first argument to the namedtuple factory function in this example.
你可能会疑惑为什么我会将字符串Car放在命名元组的工厂函数的第一项。
This parameter is referred to as the “typename” in the Python docs. It’s the name of the new class that’s being created by calling the namedtuple function.
这个参数是关于python文档中的类型名称的。这就是通过调用命名元组函数创造出来的新的类的名字。
Since namedtuple has no way of knowing what the name of the variable is we’re assigning the resulting class to, we need to explicitly tell it which class name we want to use. The class name is used in the docstring and the __repr__
implementation that namedtuple automatically generates for us.
因为命名元组不知道我们命名的类的变量名称,我们需要清晰地告诉它我们想使用的类名称。这个类名称被用在文档字符串和命名函数为我们自动生成的repr魔法实现中。
And there’s another syntactic oddity in this example—why are we passing the fields as a string that encodes their names as 'color mileage'?
在这个例子中还有一个其他的语法奇怪的地方。为什么我们要传入成字符串字段并且将其名称编码为color mileage呢?
The answer is that namedtuple’s factory function calls split() on the field names string to parse it into a list of field names. So this is really just a shorthand for the following two steps:
>>> 'color mileage'.split()
['color', 'mileage']
>>> Car = namedtuple('Car', ['color', 'mileage'])
答案就是命名元组的工厂函数调用split()方法作用在字段名字符串上将其分解成字符名称列表。
Of course, you can also pass in a list with string field names directly if you prefer how that looks. The advantage of using a proper list is that it’s easier to reformat this code if you need to split it across multiple lines:
>>> Car = namedtuple('Car', [
... 'color',
... 'mileage',
... ])
当然,如果你更喜欢传入列表的方式那么你可以直接传入列表。
Whatever you decide, you can now create new “car” objects with the Car factory function. It behaves as if you had defined a Car class manually and given it a constructor accepting a “color” and a “mileage” value:
>>> my_car = Car('red', 3812.4)
>>> my_car.color
'red'
>>> my_car.mileage
3812.4
不管你怎么选择,你现在可以利用Car工厂函数创建一个新的car对象。这就像你手动定义了一个Car类而且给他一个接受color和mileage值的构造器。
Besides accessing the values stored in a namedtuple by their identifiers, you can still access them by their index. That way, namedtuples can be used as a drop-in replacement for regular tuples:
>>> my_car[0]
'red'
>>> tuple(my_car)
('red', 3812.4)
除了通过它们的标识符访问存储在namedtuple中的值之外,您还可以通过它们的索引访问它们。这样,namedDuples就可以作为常规元组的替代品。
Tuple unpacking and the *-operator for function argument unpacking
also work as expected:
>>> color, mileage = my_car
>>> print(color, mileage)
red 3812.4
>>> print(*my_car)
red 3812.4
元组解包和单星号运算可以解包。
You’ll even get a nice string representation for your namedtuple object for free, which saves some typing and verbosity:
>>> my_car
Car(color='red' , mileage=3812.4)
Like tuples, namedtuples are immutable. When you try to overwrite
one of their fields, you’ll get an AttributeError exception:
>>> my_car.color = 'blue'
AttributeError: "can't set attribute"
就像元组一样,命名元组是不可变的。
Namedtuple objects are implemented as regular Python classes internally. When it comes to memory usage, they are also “better” than regular classes and just as memory efficient as regular tuples.
NamedDuple对象在内部实现为常规的Python类。当涉及到内存使用时,它们也比常规类“更好”,而且与常规元组的内存效率一样高。
A good way to view them is to think that namedtuples are a memoryefficient shortcut to defining an immutable class in Python manually.
查看它们的一个好方法是认为NamedDuples是在Python中手动定义不可变类的一个内存高效的快捷方式。
Subclassing Namedtuples
Since they are built on top of regular Python classes, you can even add methods to a namedtuple object. For example, you can extend a namedtuple’s class like any other class and add methods and new properties to it that way. Here’s an example:
Car = namedtuple('Car', 'color mileage')
class MyCarWithMethods(Car):
def hexcolor(self):
if self.color == 'red':
return '#ff0000'
else:
return '#000000'
因为他们是在python基础类上构建的,你甚至可以对命名元组使用加方法。
We can now create MyCarWithMethods objects and call their hexcolor() method, just as expected:
>>> c = MyCarWithMethods('red', 1234)
>>> c.hexcolor()
'#ff0000'
However, this might be a little clunky. It might be worth doing if you want a class with immutable properties, but it’s also easy to shoot yourself in the foot here.
不过,这可能有点笨拙。如果你想要一个具有不变属性的类,这可能是值得做的,但是在这里也很容易让你自己陷入困境。
For example, adding a new immutable field is tricky because of how namedtuples are structured internally. The easiest way to create hierarchies of namedtuples is to use the base tuple’s _fields property:
>>> Car = namedtuple('Car', 'color mileage')
>>> ElectricCar = namedtuple(
... 'ElectricCar', Car._fields + ('charge',))
这个方法确实比较有意思。
This gives the desired result:
>>> ElectricCar('red', 1234, 45.0)
ElectricCar(color='red', mileage=1234, charge=45.0)
Built-in Helper Methods
Besides the fields property, each namedtuple instance also provides a few more helper methods you might find useful. Their names all start with a single underscore character () which usually signals that a method or property is “private” and not part of the stable public interface of a class or module.
除了_fields属性,每个命名元组实例也提供一些你可能认为有用的帮助方法。他们的名字都是以单下划线字符开头的,这个表明一个方法或者属性是私有的,而且不是一个类或者模块的稳定公共接口的一部分。
With namedtuples, the underscore naming convention has a different meaning though. These helper methods and properties are part of namedtuple’s public interface. The helpers were named that way to avoid naming collisions with user-defined tuple fields. So go ahead and use them if you need them!
这些帮助方法和属性都是命名元组的公共接口的一部分。
I want to show you a few scenarios where the namedtuple helper methods might come in handy. Let’s start with the _asdict() helper method. It returns the contents of a namedtuple as a dictionary:
>>> my_car._asdict()
OrderedDict([('color', 'red'), ('mileage', 3812.4)])
_asdict()帮助方法可以将一个命名元组以字典的方式返回。
This is great for avoiding typos in the field names when generating JSON-output, for example:
>>> json.dumps(my_car._asdict())
'{"color": "red", "mileage": 3812.4}'
Another useful helper is the _replace() function. It creates a (shallow) copy of a tuple and allows you to selectively replace some of its fields:
>>> my_car._replace(color='blue')
Car(color='blue', mileage=3812.4)
另外一个有用的帮助方法就是_replace(),他创造了一个元组的浅复制,并且允许你有选择性的替代一些字符。
Lastly, the _make() classmethod can be used to create new instances of a namedtuple from a sequence or iterable:
>>> Car._make(['red', 999])
Car(color='red', mileage=999)
最后,_make()类方法可以用于从一个序列或者迭代器创造一个新的命名元祖的实例。
When to Use Namedtuples
Namedtuples can be an easy way to clean up your code and to make it more readable by enforcing a better structure for your data.
命名元祖是一种简单的方法,他可以帮助清晰你的代码,而且可以通过设置数据一个更加结构化使得代码更加可读。
For example, I find that going from ad-hoc data types like dictionaries
with a fixed format to namedtuples helps me express my intentions more clearly. Often when I attempt this refactoring I magically come up with a better solution for the problem I’m facing.
例如,我发现从字典之类的特殊数据类型使用固定的命名格式可以帮助我更清楚地表达我的意图。通常,当我尝试重构时,我会神奇地为我所面临的问题想出一个更好的解决方案。
Using namedtuples over unstructured tuples and dicts can also make my coworkers’ lives easier because they make the data being passed around “self-documenting” (to a degree).
On the other hand, I try not to use namedtuples for their own sake if they don’t help me write “cleaner” and more maintainable code. Like many other techniques shown in this book, sometimes there can be too much of a good thing.
However, if you use them with care, namedtuples can undoubtedly make your Python code better and more expressive.
Key Takeaways
- collection.namedtuple is a memory-efficient shortcut to manually define an immutable class in Python.
- Namedtuples can help clean up your code by enforcing an easier-to-understand structure on your data.
- Namedtuples provide a few useful helper methods that all start with a single underscore, but are part of the public interface. It’s okay to use them.
命名元祖是一种内存高效的方式。使得代码清晰,更好理解。
命名元祖提供了一些有用的助手方法,这些方法都以一个下划线开头,但都是公共接口的一部分。可以使用它们。