String Conversion (Every Class Needs a __repr__)
When you define a custom class in Python and then try to print one of its instances to the console (or inspect it in an interpreter session), you get a relatively unsatisfying result. The default “to string” conversion behavior is basic and lacks detail:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage
>>> my_car = Car('red', 37281)
>>> print(my_car)
<__console__.Car object at 0x109b73da0>
>>> my_car
<__console__.Car object at 0x109b73da0>
By default, all you get is a string containing the class name and the id of the object instance (which is the object’s memory address in CPython). That’s better than nothing, but it’s also not very useful.
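To see what that default string is made of, here is a small sketch. It assumes CPython, where id() is the memory address, and the class name Plain is made up purely for illustration:

```python
class Plain:
    """A hypothetical class with no custom string conversion."""


obj = Plain()
default_text = repr(obj)  # looks like '<__main__.Plain object at 0x...>'

# The default text contains the class name ...
has_class_name = 'Plain object at' in default_text

# ... and (assumption: CPython) the hexadecimal form of id(obj).
address_text = default_text.rsplit(' at ', 1)[1].rstrip('>')
matches_id = int(address_text, 16) == id(obj)

print(default_text)
print(has_class_name, matches_id)
```

On other Python implementations the number after "at" may not correspond to a memory address, so treat the second check as a CPython detail rather than a language guarantee.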
You might find yourself trying to work around this by printing attributes of the class directly, or even by adding a custom to_string() method to your classes:
>>> print(my_car.color, my_car.mileage)
red 37281
The general idea here is the right one—but it ignores the conventions and built-in mechanisms Python uses to handle how objects are represented as strings.
Instead of building your own to-string conversion machinery, you’ll be better off adding the __str__ and __repr__ “dunder” methods to your class. They are the Pythonic way to control how objects are converted to strings in different situations.
Let’s take a look at how these methods work in practice. To get started, we’re going to add a __str__ method to the Car class we defined earlier:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __str__(self):
        return f'a {self.color} car'
When you try printing or inspecting a Car instance now, you’ll get a different, slightly improved result:
>>> my_car = Car('red', 37281)
>>> print(my_car)
a red car
>>> my_car
<__console__.Car object at 0x109ca24e0>
Inspecting the car object in the console still gives us the previous result containing the object’s id. But printing the object resulted in the string returned by the __str__ method we added.
__str__ is one of Python’s “dunder” (double-underscore) methods and gets called when you try to convert an object into a string through the various means that are available:
>>> print(my_car)
a red car
>>> str(my_car)
'a red car'
>>> '{}'.format(my_car)
'a red car'
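The same holds for a few other conversion paths. As a quick sketch (the f-string and %-formatting lines are my additions and go beyond the session above), each of these ends up calling __str__:

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __str__(self):
        return f'a {self.color} car'


my_car = Car('red', 37281)

# Each of these conversions ends up calling Car.__str__:
via_str = str(my_car)
via_format = '{}'.format(my_car)
via_fstring = f'{my_car}'
via_percent = '%s' % my_car

print(via_str, via_format, via_fstring, via_percent, sep=' | ')
```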
With a proper __str__ implementation, you won’t have to worry about printing object attributes directly or writing a separate to_string() function. It’s the Pythonic way to control string conversion.
By the way, some people refer to Python’s “dunder” methods as “magic methods.” But these methods are not supposed to be magical in any way. The fact that these methods start and end in double underscores is simply a naming convention to flag them as core Python features. It also helps avoid naming collisions with your own methods and attributes. The object constructor __init__ follows the same convention, and there’s nothing magical or arcane about it.
Don’t be afraid to use Python’s dunder methods. They’re meant to help you.
__str__ vs __repr__
Now, our string conversion story doesn’t end there. Did you see how inspecting my_car in an interpreter session still gave that odd <Car object at 0x109ca24e0> result?
This happened because there are actually two dunder methods that control how objects are converted to strings in Python 3. The first one is __str__, and you just learned about it. The second one is __repr__, and the way it works is similar to __str__, but it is used in different situations. (Python 2.x also has a __unicode__ method that I’ll touch on a little later.)
Here’s a simple experiment you can use to get a feel for when __str__ or __repr__ is used. Let’s redefine our Car class so it contains both to-string dunder methods with outputs that are easy to distinguish:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return '__repr__ for Car'

    def __str__(self):
        return '__str__ for Car'
Now, when you play through the previous examples you can see which method controls the string conversion result in each case:
>>> my_car = Car('red', 37281)
>>> print(my_car)
__str__ for Car
>>> '{}'.format(my_car)
'__str__ for Car'
>>> my_car
__repr__ for Car
This experiment confirms that inspecting an object in a Python interpreter session simply prints the result of the object’s __repr__.
Interestingly, containers like lists and dicts always use the result of __repr__ to represent the objects they contain. Even if you call str on the container itself:
>>> str([my_car])
'[__repr__ for Car]'
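Here is a slightly larger sketch of the same behavior. It uses a trimmed-down Car without constructor arguments (my simplification for this example) and also shows one way to get the __str__ text for the items when you want it:

```python
class Car:
    def __repr__(self):
        return '__repr__ for Car'

    def __str__(self):
        return '__str__ for Car'


my_car = Car()

# str() on a container still uses repr() for each contained object.
as_list = str([my_car])
as_tuple = str((my_car,))
as_dict = str({'car': my_car})

print(as_list)   # [__repr__ for Car]
print(as_tuple)  # (__repr__ for Car,)
print(as_dict)   # {'car': __repr__ for Car}

# To get the __str__ text for the items, convert them one by one.
joined = ', '.join(str(car) for car in [my_car])
print(joined)    # __str__ for Car
```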
To manually choose between both string conversion methods, for example, to express your code’s intent more clearly, it’s best to use the built-in str() and repr() functions. Using them is preferable to calling the object’s __str__ or __repr__ directly, as it looks nicer and gives the same result:
>>> str(my_car)
'__str__ for Car'
>>> repr(my_car)
'__repr__ for Car'
Even with this investigation complete, you might be wondering what the “real-world” difference is between __str__ and __repr__. They both seem to serve the same purpose, so it might be unclear when to use each.
With questions like that, it’s usually a good idea to look into what the Python standard library does. Time to devise another experiment. We’ll create a datetime.date object and find out how it uses __repr__ and __str__ to control string conversion:
>>> import datetime
>>> today = datetime.date.today()
The result of the date object’s __str__ function should primarily be readable. It’s meant to return a concise textual representation for human consumption, something you’d feel comfortable displaying to a user. Therefore, we get something that looks like an ISO date format when we call str() on the date object:
>>> str(today)
'2017-02-02'
With __repr__, the idea is that its result should be, above all, unambiguous. The resulting string is intended more as a debugging aid for developers. And for that it needs to be as explicit as possible about what this object is. That’s why you’ll get a more elaborate result calling repr() on the object. It even includes the full module and class name:
>>> repr(today)
'datetime.date(2017, 2, 2)'
We could copy and paste the string returned by __repr__ and execute it as valid Python to recreate the original date object. This is a neat approach and a good goal to keep in mind while writing your own __repr__ methods.
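As a minimal sketch of that round trip, using a fixed date instead of today() so the output is reproducible (eval() appears here only for demonstration and should not be used on untrusted input):

```python
import datetime

some_day = datetime.date(2017, 2, 2)

text = repr(some_day)
print(text)  # datetime.date(2017, 2, 2)

# Evaluating the text as Python code rebuilds an equal object.
# This only works because the datetime module is imported here.
rebuilt = eval(text)
print(rebuilt == some_day)  # True
```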
On the other hand, I find that it is quite difficult to put into practice. Usually it won’t be worth the trouble and it’ll just create extra work for you. My rule of thumb is to make my __repr__ strings unambiguous and helpful for developers, but I don’t expect them to be able to restore an object’s complete state.
Why Every Class Needs a __repr__
If you don’t add a __str__ method, Python falls back on the result of __repr__ when looking for __str__. Therefore, I recommend that you always add at least a __repr__ method to your classes. This will guarantee a useful string conversion result in almost all cases, with a minimum of implementation work.
Here’s how to add basic string conversion support to your classes quickly and efficiently. For our Car class we might start with the following __repr__:
def __repr__(self):
    return f'Car({self.color!r}, {self.mileage!r})'
Please note that I’m using the !r conversion flag to make sure the output string uses repr(self.color) and repr(self.mileage) instead of str(self.color) and str(self.mileage).
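A short sketch of what the flag changes, using plain variables rather than the Car class (the variable names are mine):

```python
color = 'red'
mileage = 37281

# Without !r the values are converted with str().
without_flag = f'Car({color}, {mileage})'
# With !r the values are converted with repr(), so strings keep their quotes.
with_flag = f'Car({color!r}, {mileage!r})'

print(without_flag)  # Car(red, 37281)
print(with_flag)     # Car('red', 37281)

# repr() also keeps an int and a string of digits distinguishable.
plain = f'{37281} {"37281"}'
flagged = f'{37281!r} {"37281"!r}'
print(plain)    # 37281 37281
print(flagged)  # 37281 '37281'
```

The second pair is the reason I prefer !r inside a __repr__: with str() you can’t tell whether the mileage was stored as a number or as text.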
This works nicely, but one downside is that we’ve repeated the class name inside the format string. A trick you can use here to avoid this repetition is to use the object’s __class__.__name__ attribute, which will always reflect the class’s name as a string.
The benefit is you won’t have to modify the __repr__ implementation when the class name changes. This makes it easy to adhere to the Don’t Repeat Yourself (DRY) principle:
def __repr__(self):
    return (f'{self.__class__.__name__}('
            f'{self.color!r}, {self.mileage!r})')
The downside of this implementation is that the format string is quite long and unwieldy. But with careful formatting, you can keep the code nice and PEP 8 compliant.
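One more place where __class__.__name__ pays off is inheritance. In this sketch, ElectricCar is a hypothetical subclass I made up for illustration; it inherits __repr__ unchanged and still reports its own name:

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return (f'{self.__class__.__name__}('
                f'{self.color!r}, {self.mileage!r})')


class ElectricCar(Car):  # hypothetical subclass, for illustration only
    pass


print(repr(Car('red', 37281)))          # Car('red', 37281)
print(repr(ElectricCar('blue', 1200)))  # ElectricCar('blue', 1200)
```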
With the above __repr__ implementation, we get a useful result when we inspect the object or call repr() on it directly:
>>> repr(my_car)
"Car('red', 37281)"
Printing the object or calling str() on it returns the same string because the default __str__ implementation simply calls __repr__:
>>> print(my_car)
Car('red', 37281)
>>> str(my_car)
"Car('red', 37281)"
I believe this approach provides the most value with a modest amount of implementation work. It’s also a fairly cookie-cutter approach that can be applied without much deliberation. For this reason, I always try to add a basic __repr__ implementation to my classes.
Here’s a complete example for Python 3, including an optional __str__ implementation:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return (f'{self.__class__.__name__}('
                f'{self.color!r}, {self.mileage!r})')

    def __str__(self):
        return f'a {self.color} car'
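To round this off, here is a sketch of how that class behaves across the conversion paths discussed so far. The class is repeated so the snippet runs on its own:

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return (f'{self.__class__.__name__}('
                f'{self.color!r}, {self.mileage!r})')

    def __str__(self):
        return f'a {self.color} car'


my_car = Car('red', 37281)

print(my_car)                    # a red car
print(repr(my_car))              # Car('red', 37281)
print([my_car])                  # [Car('red', 37281)]
print(f'{my_car} / {my_car!r}')  # a red car / Car('red', 37281)
```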
Python 2.x Differences: __unicode__
In Python 3 there’s one data type to represent text across the board: str. It holds Unicode characters and can represent most of the world’s writing systems.
Python 2.x uses a different data model for strings. There are two types to represent text: str, a byte string that is treated as ASCII in implicit conversions, and unicode, which is equivalent to Python 3’s str.
Due to this difference, there’s yet another dunder method in the mix for controlling string conversion in Python 2: __unicode__. In Python 2, __str__ returns bytes, whereas __unicode__ returns characters.
For most intents and purposes, __unicode__ is the newer and preferred method to control string conversion. There’s also a built-in unicode() function to go along with it. It calls the respective dunder method, similar to how str() and repr() work.
So far so good. Now, it gets a little more quirky when you look at the rules for when __str__ and __unicode__ are called in Python 2:
The print statement and str() call __str__. The unicode() built-in calls __unicode__ if it exists, and otherwise falls back to __str__ and decodes the result with the system text encoding.
Compared to Python 3, these special cases complicate the text conversion rules somewhat. But there is a way to simplify things again for practical purposes. Unicode is the preferred and future-proof way of handling text in your Python programs.
So generally, what I would recommend you do in Python 2.x is to put all of your string formatting code inside the __unicode__ method and then create a stub __str__ implementation that returns the unicode representation encoded as UTF-8:
def __str__(self):
    return unicode(self).encode('utf-8')
The __str__ stub will be the same for most classes you write, so you can just copy and paste it around as needed (or put it into a base class where it makes sense). All of your string conversion code that is meant for non-developer use then lives in __unicode__.
Here’s a complete example for Python 2.x:
class Car(object):
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return '{}({!r}, {!r})'.format(
            self.__class__.__name__,
            self.color, self.mileage)

    def __unicode__(self):
        return u'a {self.color} car'.format(self=self)

    def __str__(self):
        return unicode(self).encode('utf-8')
Key Takeaways
- You can control to-string conversion in your own classes using the __str__ and __repr__ “dunder” methods.
- The result of __str__ should be readable. The result of __repr__ should be unambiguous.
- Always add a __repr__ to your classes. The default implementation for __str__ just calls __repr__.
- Use __unicode__ instead of __str__ in Python 2.