String Conversion (Every Class Needs a __repr__)
When you define a custom class in Python and then try to print one of its instances to the console (or inspect it in an interpreter session), you get a relatively unsatisfying result. The default “to string” conversion behavior is basic and lacks detail:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage
>>> my_car = Car('red', 37281)
>>> print(my_car)
<__console__.Car object at 0x109b73da0>
>>> my_car
<__console__.Car object at 0x109b73da0>
By default, all you get is a string containing the class name and the id of the object instance (which is the object’s memory address in CPython). That’s better than nothing, but it’s also not very useful.
You might find yourself trying to work around this by printing attributes of the class directly, or even by adding a custom to_string() method to your classes:
>>> print(my_car.color, my_car.mileage)
red 37281
The general idea here is the right one—but it ignores the conventions and built-in mechanisms Python uses to handle how objects are represented as strings.
Instead of building your own to-string conversion machinery, you’ll be better off adding the __str__ and __repr__ “dunder” methods to your class. They are the Pythonic way to control how objects are converted to strings in different situations.
Let’s take a look at how these methods work in practice. To get started, we’re going to add a __str__ method to the Car class we defined earlier:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __str__(self):
        return f'a {self.color} car'
When you try printing or inspecting a Car instance now, you’ll get a different, slightly improved result:
>>> my_car = Car('red', 37281)
>>> print(my_car)
a red car
>>> my_car
<__console__.Car object at 0x109ca24e0>
Inspecting the car object in the console still gives us the previous result containing the object’s id. But printing the object resulted in the string returned by the __str__ method we added.
__str__ is one of Python’s “dunder” (double-underscore) methods and gets called when you try to convert an object into a string through the various means that are available:
>>> print(my_car)
a red car
>>> str(my_car)
'a red car'
>>> '{}'.format(my_car)
'a red car'
With a proper __str__ implementation, you won’t have to worry about printing object attributes directly or writing a separate to_string() function. It’s the Pythonic way to control string conversion.
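The same conversion also applies to a few formatting paths the session above doesn’t show. Here is a minimal sketch, reusing the Car class from this section, that assumes you want to check f-strings, %-formatting, and the format() built-in; the variable names are only illustrative:

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __str__(self):
        return f'a {self.color} car'


my_car = Car('red', 37281)

# Each of these formatting paths ends up calling Car.__str__.
via_fstring = f'{my_car}'
via_percent = '%s' % my_car
via_format = format(my_car)

print(via_fstring)  # a red car
print(via_percent)  # a red car
print(via_format)   # a red car
```

All three produce the same text as print(my_car), so a single __str__ method covers the usual ways of turning an object into display text.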
By the way, some people refer to Python’s “dunder” methods as “magic methods.” But these methods are not supposed to be magical in any way. The fact that these methods start and end in double underscores is simply a naming convention to flag them as core Python features. It also helps avoid naming collisions with your own methods and attributes. The object constructor __init__ follows the same convention, and there’s nothing magical or arcane about it.
Don’t be afraid to use Python’s dunder methods. They’re meant to help you.
__str__ vs __repr__
Now, our string conversion story doesn’t end there. Did you see how inspecting my_car in an interpreter session still gave that odd <Car object at 0x109ca24e0> result?
This happened because there are actually two dunder methods that control how objects are converted to strings in Python 3. The first one is __str__, and you just learned about it. The second one is __repr__, and the way it works is similar to __str__, but it is used in different situations. (Python 2.x also has a __unicode__ method that I’ll touch on a little later.)
Here’s a simple experiment you can use to get a feel for when __str__ or __repr__ is used. Let’s redefine our Car class so it contains both to-string dunder methods with outputs that are easy to distinguish:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return '__repr__ for Car'

    def __str__(self):
        return '__str__ for Car'
Now, when you play through the previous examples you can see which method controls the string conversion result in each case:
>>> my_car = Car('red', 37281)
>>> print(my_car)
__str__ for Car
>>> '{}'.format(my_car)
'__str__ for Car'
>>> my_car
__repr__ for Car
This experiment confirms that inspecting an object in a Python interpreter session simply prints the result of the object’s __repr__.
Interestingly, containers like lists and dicts always use the result of __repr__ to represent the objects they contain, even if you call str() on the container itself:
>>> str([my_car])
'[__repr__ for Car]'
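The container behavior is easy to check for other built-in containers too. The following is a small sketch using the same experimental Car class; the dictionary key 'first' and the variable names are arbitrary choices for illustration:

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return '__repr__ for Car'

    def __str__(self):
        return '__str__ for Car'


my_car = Car('red', 37281)

# Built-in containers format their elements with repr(), even under str().
as_list = str([my_car])
as_tuple = str((my_car,))
as_dict = str({'first': my_car})

print(as_list)   # [__repr__ for Car]
print(as_tuple)  # (__repr__ for Car,)
print(as_dict)   # {'first': __repr__ for Car}

# To get the readable form for each element, convert them yourself.
readable = ', '.join(str(car) for car in [my_car, my_car])
print(readable)  # __str__ for Car, __str__ for Car
```

If you want user-facing output for a collection of objects, one option is to call str() on each element explicitly, as in the last step.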
To manually choose between the two string conversion methods, for example, to express your code’s intent more clearly, it’s best to use the built-in str() and repr() functions. Using them is preferable to calling the object’s __str__ or __repr__ directly, as it looks nicer and gives the same result:
>>> str(my_car)
'__str__ for Car'
>>> repr(my_car)
'__repr__ for Car'
Even with this investigation complete, you might be wondering what the “real-world” difference is between __str__ and __repr__. They both seem to serve the same purpose, so it might be unclear when to use each.
With questions like that, it’s usually a good idea to look into what the Python standard library does. Time to devise another experiment. We’ll create a datetime.date object and find out how it uses __repr__ and __str__ to control string conversion:
>>> import datetime
>>> today = datetime.date.today()
The result of the date object’s __str__ function should primarily be readable. It’s meant to return a concise textual representation for human consumption, something you’d feel comfortable displaying to a user. Therefore, we get something that looks like an ISO date format when we call str() on the date object:
>>> str(today)
'2017-02-02'
With __repr__, the idea is that its result should be, above all, unambiguous. The resulting string is intended more as a debugging aid for developers. And for that it needs to be as explicit as possible about what this object is. That’s why you’ll get a more elaborate result calling repr() on the object. It even includes the full module and class name:
>>> repr(today)
'datetime.date(2017, 2, 2)'
We could copy and paste the string returned by __repr__ and execute it as valid Python to recreate the original date object. This is a neat approach and a good goal to keep in mind while writing your own reprs.
On the other hand, I find that it is quite difficult to put into practice. Usually it won’t be worth the trouble and it’ll just create extra work for you. My rule of thumb is to make my __repr__ strings unambiguous and helpful for developers, but I don’t expect them to be able to restore an object’s complete state.
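To make the “copy and paste” idea concrete, here is a minimal sketch of the round trip for a date object. The fixed date is an arbitrary choice so the output stays stable, and eval() is used only for illustration; it should never be run on strings you don’t trust:

```python
import datetime

# A fixed date (chosen arbitrarily for this sketch) keeps the output stable.
day = datetime.date(2017, 2, 2)

text = repr(day)
print(text)  # datetime.date(2017, 2, 2)

# Execute the repr string as Python code. The explicit namespace gives
# eval() access to the datetime module and nothing else we defined.
clone = eval(text, {'datetime': datetime})
print(clone == day)  # True
```

The round trip works here because the repr spells out the module, the class, and every constructor argument. Many classes carry state that can’t be expressed this compactly, which is why a helpful but non-reversible repr is often the pragmatic choice.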
Why Every Class Needs a __repr__
If you don’t add a __str__ method, Python falls back on the result of __repr__ when looking for __str__. Therefore, I recommend that you always add at least a __repr__ method to your classes. This will guarantee a useful string conversion result in almost all cases, with a minimum of implementation work.
Here’s how to add basic string conversion support to your classes quickly and efficiently. For our Car class we might start with the following __repr__:
def __repr__(self):
    return f'Car({self.color!r}, {self.mileage!r})'
Please note that I’m using the !r conversion flag to make sure the output string uses repr(self.color) and repr(self.mileage) instead of str(self.color) and str(self.mileage).
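A short sketch can show why the !r flag matters. It assumes two hypothetical Car instances whose mileage differs only in type (an int versus a string), and compares the result with plain {} formatting, which uses str():

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return f'Car({self.color!r}, {self.mileage!r})'


# Hypothetical instances: one mileage is an int, the other a string.
with_int = repr(Car('red', 37281))
with_str = repr(Car('red', '37281'))
print(with_int)  # Car('red', 37281)
print(with_str)  # Car('red', '37281')

# Without !r the values are formatted with str() and look identical.
plain_int = 'Car({}, {})'.format('red', 37281)
plain_str = 'Car({}, {})'.format('red', '37281')
print(plain_int)  # Car(red, 37281)
print(plain_str)  # Car(red, 37281)
```

With !r the quotes around string values are preserved, so the output stays unambiguous about each attribute’s type.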
This works nicely, but one downside is that we’ve repeated the class name inside the format string. A trick you can use here to avoid this repetition is to use the object’s __class__.__name__ attribute, which will always reflect the class’s name as a string.
The benefit is you won’t have to modify the __repr__ implementation when the class name changes. This makes it easy to adhere to the Don’t Repeat Yourself (DRY) principle:
def __repr__(self):
    return (f'{self.__class__.__name__}('
            f'{self.color!r}, {self.mileage!r})')
The downside of this implementation is that the format string is quite long and unwieldy. But with careful formatting, you can keep the code nice and PEP 8 compliant.
With the above __repr__ implementation, we get a useful result when we inspect the object or call repr() on it directly:
>>> repr(my_car)
"Car('red', 37281)"
Printing the object or calling str() on it returns the same string because the default __str__ implementation simply calls __repr__:
>>> print(my_car)
Car('red', 37281)
>>> str(my_car)
"Car('red', 37281)"
I believe this approach provides the most value with a modest amount of implementation work. It’s also a fairly cookie-cutter approach that can be applied without much deliberation. For this reason, I always try to add a basic __repr__ implementation to my classes.
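One practical payoff of reading the class name from the instance is that subclasses get a correct repr without any extra code. A minimal sketch, where ElectricCar is a hypothetical subclass invented purely for illustration:

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        # The name is looked up on the instance's class at call time,
        # so subclasses report their own name.
        return (f'{self.__class__.__name__}('
                f'{self.color!r}, {self.mileage!r})')


# Hypothetical subclass; it inherits __repr__ unchanged.
class ElectricCar(Car):
    pass


base_repr = repr(Car('red', 37281))
sub_repr = repr(ElectricCar('blue', 1200))
print(base_repr)  # Car('red', 37281)
print(sub_repr)   # ElectricCar('blue', 1200)
```

Had the class name been hard-coded in the format string, the ElectricCar instance would have been reported as a plain Car.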
Here’s a complete example for Python 3, including an optional __str__ implementation:
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return (f'{self.__class__.__name__}('
                f'{self.color!r}, {self.mileage!r})')

    def __str__(self):
        return f'a {self.color} car'
Python 2.x Differences: __unicode__
In Python 3 there’s one data type to represent text across the board: str. It holds Unicode characters and can represent most of the world’s writing systems.
Python 2.x uses a different data model for strings. There are two types to represent text: str, which is limited to the ASCII character set, and unicode, which is equivalent to Python 3’s str.
Due to this difference, there’s yet another dunder method in the mix for controlling string conversion in Python 2: __unicode__. In Python 2, __str__ returns bytes, whereas __unicode__ returns characters.
For most intents and purposes, __unicode__ is the newer and preferred method to control string conversion. There’s also a built-in unicode() function to go along with it. It calls the respective dunder method, similar to how str() and repr() work.
So far so good. Now, it gets a little more quirky when you look at the rules for when __str__ and __unicode__ are called in Python 2:
The print statement and str() call __str__. The unicode() built-in calls __unicode__ if it exists, and otherwise falls back to __str__ and decodes the result with the system text encoding.
Compared to Python 3, these special cases complicate the text conversion rules somewhat. But there is a way to simplify things again for practical purposes. Unicode is the preferred and future-proof way of handling text in your Python programs.
So generally, what I would recommend you do in Python 2.x is to put all of your string formatting code inside the __unicode__ method and then create a stub __str__ implementation that returns the unicode representation encoded as UTF-8:
def __str__(self):
    return unicode(self).encode('utf-8')
The __str__ stub will be the same for most classes you write, so you can just copy and paste it around as needed (or put it into a base class where it makes sense). All of your string conversion code that is meant for non-developer use then lives in __unicode__.
Here’s a complete example for Python 2.x:
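class Car(object):
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        # Three placeholders need three arguments: class name, color, mileage.
        return '{}({!r}, {!r})'.format(
            self.__class__.__name__,
            self.color, self.mileage)

    def __unicode__(self):
        return u'a {self.color} car'.format(self=self)

    def __str__(self):
        # Python 2 only: encode the unicode representation as UTF-8 bytes.
        return unicode(self).encode('utf-8')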
Key Takeaways
- You can control to-string conversion in your own classes using the __str__ and __repr__ “dunder” methods.
- The result of __str__ should be readable. The result of __repr__ should be unambiguous.
- Always add a __repr__ to your classes. The default implementation for __str__ just calls __repr__.
- Use __unicode__ instead of __str__ in Python 2.