The built-in dict type’s setdefault method results in shorter code when handling missing keys in some specific circumstances (see Item 16: “Prefer get Over in and KeyError to Handle Missing Dictionary Keys” for examples). For many of those situations, the better tool for the job is the defaultdict type from the collections built-in module (see Item 17: “Prefer defaultdict Over setdefault to Handle Missing Items in Internal State” for why). However, there are times when neither setdefault nor defaultdict is the right fit.
For example, say that I’m writing a program to manage social network profile pictures on the filesystem. I need a dictionary to map profile picture pathnames to open file handles so I can read and write those images as needed. Here, I do this by using a normal dict instance and checking for the presence of keys using the get method and an assignment expression (introduced in Python 3.8; see Item 10: “Prevent Repetition with Assignment Expressions”):
pictures = {}
path = 'profile_1234.png'

if (handle := pictures.get(path)) is None:
    try:
        handle = open(path, 'a+b')
    except OSError:
        print(f'Failed to open path {path}')
        raise
    else:
        pictures[path] = handle

handle.seek(0)
image_data = handle.read()
When the file handle already exists in the dictionary, this code makes only a single dictionary access. In the case that the file handle doesn’t exist, the dictionary is accessed once by get, and then it is assigned in the else clause of the try/except block. (This approach also works with finally; see Item 65: “Take Advantage of Each Block in try/except/else/finally.”) The call to the read method stands clearly separate from the code that calls open and handles exceptions.
Although it’s possible to use the in expression or KeyError approaches to implement this same logic, those options require more dictionary accesses and levels of nesting. Given that these other options work, you might also assume that the setdefault method would work, too:
try:
    handle = pictures.setdefault(path, open(path, 'a+b'))
except OSError:
    print(f'Failed to open path {path}')
    raise
else:
    handle.seek(0)
    image_data = handle.read()
This code has many problems. The open built-in function to create the file handle is always called, even when the path is already present in the dictionary. This results in an additional file handle that may conflict with existing open handles in the same program. Exceptions may be raised by the open call and need to be handled, but it may not be possible to differentiate them from exceptions that may be raised by the setdefault call on the same line (which is possible for other dictionary-like implementations; see Item 43: “Inherit from collections.abc for Custom Container Types”).
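The always-called behavior follows from ordinary argument evaluation: the default passed to setdefault is computed before the method runs. Here is a minimal sketch of that effect. The make_default function is a hypothetical stand-in for open, used so the example runs without touching the filesystem:

```python
calls = []

def make_default(key):
    # Hypothetical stand-in for open(); records every invocation.
    calls.append(key)
    return f'handle for {key}'

cache = {}
# The second argument is evaluated before setdefault is even called.
first = cache.setdefault('profile_1234.png', make_default('profile_1234.png'))
second = cache.setdefault('profile_1234.png', make_default('profile_1234.png'))

print(len(calls))       # 2: the default was built on both calls
print(first is second)  # True: the second default was simply discarded
```

With a real open call in place of make_default, the discarded value would be a second open file handle rather than a throwaway string.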
If you’re trying to manage internal state, another assumption you might make is that a defaultdict could be used for keeping track of these profile pictures. Here, I attempt to implement the same logic as before but now using a helper function and the defaultdict class:
from collections import defaultdict

def open_picture(profile_path):
    try:
        return open(profile_path, 'a+b')
    except OSError:
        print(f'Failed to open path {profile_path}')
        raise

pictures = defaultdict(open_picture)
handle = pictures[path]
handle.seek(0)
image_data = handle.read()

>>>
Traceback ...
TypeError: open_picture() missing 1 required positional argument: 'profile_path'
The problem is that defaultdict expects that the function passed to its constructor doesn’t require any arguments. This means that the helper function that defaultdict calls doesn’t know which specific key is being accessed, which eliminates my ability to call open. In this situation, both setdefault and defaultdict fall short of what I need.
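To make the zero-argument call visible without triggering the TypeError, here is a small sketch with a hypothetical record_args factory that accepts anything and records what it was given:

```python
from collections import defaultdict

received = []

def record_args(*args, **kwargs):
    # Hypothetical factory: captures whatever defaultdict passes in.
    received.append((args, kwargs))
    return 'default'

lookup = defaultdict(record_args)
value = lookup['profile_1234.png']

print(received)  # [((), {})]: the missing key was never passed along
print(value)     # default
```

The factory receives no positional or keyword arguments, so it has no way to learn that 'profile_1234.png' was the key being looked up.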
Fortunately, this situation is common enough that Python has another built-in solution. You can subclass the dict type and implement the __missing__ special method to add custom logic for handling missing keys. Here, I do this by defining a new class that takes advantage of the same open_picture helper function defined above:
class Pictures(dict):
    def __missing__(self, key):
        value = open_picture(key)
        self[key] = value
        return value

pictures = Pictures()
handle = pictures[path]
handle.seek(0)
image_data = handle.read()
When the pictures[path] dictionary access finds that the path key isn’t present in the dictionary, the __missing__ method is called. This method must create the new default value for the key, insert it into the dictionary, and return it to the caller. Subsequent accesses of the same path will not call __missing__ since the corresponding item is already present (similar to the behavior of __getattr__; see Item 47: “Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes”).
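The call-once behavior can be checked with a small sketch. KeyedDefaults is a hypothetical class, not part of the example above, that derives its default from the key and counts how often __missing__ runs. It also shows that only subscript access triggers __missing__; the get method and the in operator do not:

```python
class KeyedDefaults(dict):
    # Hypothetical example class: the default value is derived from the key.
    def __init__(self):
        super().__init__()
        self.missing_calls = 0

    def __missing__(self, key):
        self.missing_calls += 1
        value = key.upper()
        self[key] = value  # Store it so later lookups skip __missing__
        return value

defaults = KeyedDefaults()
print(defaults['a'])           # A (created by __missing__ and stored)
print(defaults['a'])           # A (served directly from the dictionary)
print(defaults.missing_calls)  # 1
print(defaults.get('b'))       # None: get() does not call __missing__
print('b' in defaults)         # False
```

If the assignment to self[key] were left out, __missing__ would run on every access of that key, which for the Pictures class would mean opening a new file handle each time.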
Things to Remember
✦ The setdefault method of dict is a bad fit when creating the default value has high computational cost or may raise exceptions.
✦ The function passed to defaultdict must not require any arguments, which makes it impossible to have the default value depend on the key being accessed.
✦ You can define your own dict subclass with a __missing__ method in order to construct default values that must know which key was being accessed.