Item 4: Prefer Interpolated F-Strings Over C-style Format Strings and str.format

Strings are present throughout Python codebases. They’re used for rendering messages in user interfaces and command-line utilities. They’re used for writing data to files and sockets. They’re used for specifying what’s gone wrong in Exception details (see Item 27: “Use Comprehensions Instead of map and filter”). They’re used in debugging (see Item 80: “Consider Interactive Debugging with pdb” and Item 75: “Use repr Strings for Debugging Output”).

Formatting is the process of combining predefined text with data values into a single human-readable message that’s stored as a string. Python has four different ways of formatting strings that are built into the language and standard library. All but one of them, which is covered last in this item, have serious shortcomings that you should understand and avoid.

The most common way to format a string in Python is by using the % formatting operator. The predefined text template is provided on the left side of the operator in a format string. The values to insert into the template are provided as a single value or tuple of multiple values on the right side of the format operator. For example, here I use the % operator to convert difficult-to-read binary and hexadecimal values to integer strings:

a = 0b10111011
b = 0xc5f
print('Binary is %d, hex is %d' % (a, b))

>>> 
Binary is 187, hex is 3167

The format string uses format specifiers (like %d) as placeholders that will be replaced by values from the right side of the formatting expression. The syntax for format specifiers comes from C’s printf function, which has been inherited by Python (as well as by other programming languages). Python supports all the usual options you’d expect from printf, such as %s, %x, and %f format specifiers, as well as control over decimal places, padding, fill, and alignment. Many programmers who are new to Python start with C-style format strings because they’re familiar and simple to use.
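As a minimal sketch of the printf-style options mentioned above, the following shows a few common specifiers together with padding and precision control. The variable names and values are illustrative only and do not come from the original text:

```python
# Illustrative values; none of these names come from the original text.
count = 255
ratio = 3.14159
label = 'total'

as_hex = '%x' % count        # lowercase hexadecimal
as_padded = '%05d' % count   # zero-padded to a width of 5
as_float = '%.2f' % ratio    # two digits after the decimal point
as_left = '%-8s|' % label    # left-aligned in a field of width 8
as_right = '%8s|' % label    # right-aligned in a field of width 8

print(as_hex, as_padded, as_float)
print(as_left)
print(as_right)
```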

There are four problems with C-style format strings in Python.

The first problem is that if you change the type or order of data values in the tuple on the right side of a formatting expression, you can get errors due to type conversion incompatibility. For example, this simple formatting expression works:

key = 'my_var'
value = 1.234
formatted = '%-10s = %.2f' % (key, value)
print(formatted)

>>>
my_var     = 1.23

But if you swap key and value, you get an exception at runtime:

reordered_tuple = '%-10s = %.2f' % (value, key)

>>>
Traceback ...
TypeError: must be real number, not str

Similarly, leaving the right side parameters in the original order but changing the format string results in the same error:

reordered_string = '%.2f = %-10s' % (key, value)

>>>
Traceback ...
TypeError: must be real number, not str

To avoid this gotcha, you need to constantly check that the two sides of the % operator are in sync; this process is error prone because it must be done manually for every change.
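Because the mismatch only surfaces when the expression runs, it can help to see the failure being caught. The sketch below wraps the formatting expression from above in a hypothetical render helper (not part of the original text) and calls it with the values in the right and wrong order:

```python
key = 'my_var'
value = 1.234


def render(pair):
    """Format a (name, number) pair; the tuple order must match the template."""
    return '%-10s = %.2f' % pair


print(render((key, value)))  # order matches the format string

try:
    render((value, key))     # order swapped by mistake
except TypeError as exc:
    error_message = str(exc)
    print('Formatting failed:', error_message)
```

Nothing flags the swapped call until it executes, which is why the two sides have to be kept in sync by hand.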

The second problem with C-style formatting expressions is that they become difficult to read when you need to make small modifications to values before formatting them into a string—and this is an extremely common need. Here, I list the contents of my kitchen pantry without making inline changes:

pantry = [
    ('avocados', 1.25),
    ('bananas', 2.5),
    ('cherries', 15),
]
for i, (item, count) in enumerate(pantry):
    print('#%d: %-10s = %.2f' % (i, item, count))

>>>
#0: avocados   = 1.25
#1: bananas    = 2.50
#2: cherries   = 15.00

Now, I make a few modifications to the values that I’m formatting to make the printed message more useful. This causes the tuple in the formatting expression to become so long that it needs to be split across multiple lines, which hurts readability:

for i, (item, count) in enumerate(pantry):
    print('#%d: %-10s = %d' % (
        i + 1,
        item.title(),
        round(count)))

>>>
#1: Avocados   = 1
#2: Bananas    = 2
#3: Cherries   = 15

The third problem with formatting expressions is that if you want to use the same value in a format string multiple times, you have to repeat it in the right side tuple:

template = '%s loves food. See %s cook.'
name = 'Max'
formatted = template % (name, name)
print(formatted)

>>>
Max loves food. See Max cook.

This is especially annoying and error prone if you have to repeat small modifications to the values being formatted. For example, here I remembered to call the title() method multiple times, but I could have easily added the method call to one reference to name and not the other, which would cause mismatched output:

name = 'brad'
formatted = template % (name.title(), name.title())
print(formatted)

>>>
Brad loves food. See Brad cook.
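For contrast, here is a sketch of the failure mode just described, where the method call is applied to only one of the two references. It reuses the same template and is meant only as an illustration of the mistake:

```python
template = '%s loves food. See %s cook.'
name = 'brad'

# The title() call was added to the first reference but not the second.
mismatched = template % (name.title(), name)
print(mismatched)
```

The expression runs without error, so the inconsistent capitalization is only noticeable by reading the output.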

To help solve some of these problems, the % operator in Python has the ability to also do formatting with a dictionary instead of a tuple. The keys from the dictionary are matched with format specifiers with the corresponding name, such as %(key)s. Here, I use this functionality to change the order of values on the right side of the formatting expression with no effect on the output, thus solving problem #1 from above:

key = 'my_var'
value = 1.234
old_way = '%-10s = %.2f' % (key, value)
new_way = '%(key)-10s = %(value).2f' % {'key': key, 'value': value}  # Original
reordered = '%(key)-10s = %(value).2f' % {'value': value, 'key': key}  # Swapped
assert old_way == new_way == reordered

Using dictionaries in formatting expressions also solves problem #3 from above by allowing multiple format specifiers to reference the same value, thus making it unnecessary to supply that value more than once:

name = 'Max'

template = '%s loves food. See %s cook.'
before = template % (name, name)  # Tuple

template = '%(name)s loves food. See %(name)s cook.'
after = template % {'name': name}  # Dictionary

assert before == after

However, dictionary format strings introduce and exacerbate other issues. For problem #2 above, regarding small modifications to values before formatting them, formatting expressions become longer and more visually noisy because of the presence of the dictionary key and colon operator on the right side. Here, I render the same string with and without dictionaries to show this problem:

for i, (item, count) in enumerate(pantry):
    before = '#%d: %-10s = %d' % (
        i + 1,
        item.title(),
        round(count))

    after = '#%(loop)d: %(item)-10s = %(count)d' % {
        'loop': i + 1,
        'item': item.title(),
        'count': round(count),
    }

    assert before == after

Using dictionaries in formatting expressions also increases verbosity, which is problem #4 with C-style formatting expressions in Python. Each key must be specified at least twice—once in the format specifier, once in the dictionary as a key, and potentially once more for the variable name that contains the dictionary value:

soup = 'lentil'
formatted = 'Today\'s soup is %(soup)s.' % {'soup': soup}
print(formatted)

>>>
Today's soup is lentil.

Besides the duplicative characters, this redundancy causes formatting expressions that use dictionaries to be long. These expressions often must span multiple lines, with the format strings being concatenated across multiple lines and the dictionary assignments having one line per value to use in formatting:

menu = {
    'soup': 'lentil',
    'oyster': 'kumamoto',
    'special': 'schnitzel',
}
template = ('Today\'s soup is %(soup)s, '
            'buy one get two %(oyster)s oysters, '
            'and our special entrée is %(special)s.')
formatted = template % menu
print(formatted)

>>>
Today's soup is lentil, buy one get two kumamoto oysters, and our special entrée is schnitzel.

To understand what this formatting expression is going to produce, your eyes have to keep going back and forth between the lines of the format string and the lines of the dictionary. This disconnect makes it hard to spot bugs, and readability gets even worse if you need to make small modifications to any of the values before formatting.

There must be a better way.

The format Built-in and str.format

Python 3 added support for advanced string formatting that is more expressive than the old C-style format strings that use the % operator. For individual Python values, this new functionality can be accessed through the format built-in function. For example, here I use some of the new options (, for thousands separators and ^ for centering) to format values:

a = 1234.5678
formatted = format(a, ',.2f')
print(formatted)

b = 'my string'
formatted = format(b, '^20s')
print('*', formatted, '*')

>>>
1,234.57
*      my string       *

You can use this functionality to format multiple values together by calling the new format method of the str type. Instead of using C-style format specifiers like %d, you can specify placeholders with {}. By default the placeholders in the format string are replaced by the corresponding positional arguments passed to the format method in the order in which they appear:

key = 'my_var'
value = 1.234
formatted = '{} = {}'.format(key, value)
print(formatted)

>>>
my_var = 1.234

Within each placeholder you can optionally provide a colon character followed by format specifiers to customize how values will be converted into strings (see help('FORMATTING') for the full range of options):

formatted = '{:<10} = {:.2f}'.format(key, value)
print(formatted)

>>>
my_var     = 1.23

The way to think about how this works is that the format specifiers will be passed to the format built-in function along with the value (format(value, '.2f') in the example above). The result of that function call is what replaces the placeholder in the overall formatted string. The formatting behavior can be customized per class using the __format__ special method.
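To make the per-class customization concrete, here is a minimal sketch of a class that defines the __format__ special method. The Temperature class and its trailing-F convention are hypothetical, invented purely for illustration; they are not from the original text:

```python
class Temperature:
    """Hypothetical class used only to illustrate __format__."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        # A trailing 'F' is this class's own convention for Fahrenheit;
        # the rest of the specifier is passed on to float formatting.
        if spec.endswith('F'):
            fahrenheit = self.celsius * 9 / 5 + 32
            return format(fahrenheit, spec[:-1] + 'f') + ' F'
        # An empty specifier falls back to one decimal place.
        return format(self.celsius, spec or '.1f') + ' C'


temp = Temperature(21.5)
via_builtin = format(temp, '.2f')        # the built-in calls __format__
via_method = '{:.1F}'.format(temp)       # so does str.format
via_default = '{}'.format(temp)          # empty specifier
print(via_builtin, via_method, via_default, sep=' | ')
```

Both the format built-in and str.format hand the text after the colon to __format__, so the class decides how to interpret it.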

With C-style format strings, you need to escape the % character (by doubling it) so it’s not interpreted as a placeholder accidentally. With the str.format method you need to similarly escape braces:

print('%.2f%%' % 12.5)
print('{} replaces {{}}'.format(1.23))

>>>
12.50%
1.23 replaces {}

Within the braces you may also specify the positional index of an argument passed to the format method to use for replacing the placeholder. This allows the format string to be updated to reorder the output without requiring you to also change the right side of the formatting expression, thus addressing problem #1 from above:

formatted = '{1} = {0}'.format(key, value)
print(formatted)

>>>
1.234 = my_var

The same positional index may also be referenced multiple times in the format string without the need to pass the value to the format method more than once, which solves problem #3 from above:

formatted = '{0} loves food. See {0} cook.'.format(name)
print(formatted)

>>>
Max loves food. See Max cook.

Unfortunately, the new format method does nothing to address problem #2 from above, leaving your code difficult to read when you need to make small modifications to values before formatting them. There’s little difference in readability between the old and new options, which are similarly noisy:

for i, (item, count) in enumerate(pantry):
    old_style = '#%d: %-10s = %d' % (
        i + 1,
        item.title(),
        round(count))

    new_style = '#{}: {:<10s} = {}'.format(
        i + 1,
        item.title(),
        round(count))

    assert old_style == new_style

There are even more advanced options for the specifiers used with the str.format method, such as using combinations of dictionary keys and list indexes in placeholders, and coercing values to Unicode and repr strings:

formatted = 'First letter is {menu[oyster][0]!r}'.format(menu=menu)
print(formatted)

>>>
First letter is 'k'

But these features don’t help reduce the redundancy of repeated keys from problem #4 above. For example, here I compare the verbosity of using dictionaries in C-style formatting expressions to the new style of passing keyword arguments to the format method:

old_template = (
    'Today\'s soup is %(soup)s, '
    'buy one get two %(oyster)s oysters, '
    'and our special entrée is %(special)s.')
old_formatted = old_template % {
    'soup': 'lentil',
    'oyster': 'kumamoto',
    'special': 'schnitzel',
}

new_template = (
    'Today\'s soup is {soup}, '
    'buy one get two {oyster} oysters, '
    'and our special entrée is {special}.')
new_formatted = new_template.format(
    soup='lentil',
    oyster='kumamoto',
    special='schnitzel',
)

assert old_formatted == new_formatted

This style is slightly less noisy because it eliminates some quotes in the dictionary and a few characters in the format specifiers, but it’s hardly compelling.

Further, the advanced features of using dictionary keys and indexes within placeholders provide only a tiny subset of Python’s expression functionality. This lack of expressiveness is so limiting that it undermines the value of the format method from str overall.
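A short sketch of where that boundary lies: key and index lookups work inside a str.format placeholder, while a method call or arithmetic does not. The sample values are illustrative, and the exception types shown are what CPython raises for these inputs:

```python
menu = {'oyster': 'kumamoto'}
name = 'max'

# Key and index lookups are supported inside placeholders.
first = '{menu[oyster][0]}'.format(menu=menu)
print(first)

# A method call is not: the text "title()" is treated as an attribute name.
try:
    '{name.title()}'.format(name=name)
except AttributeError as exc:
    call_error = type(exc).__name__
    print('Method call rejected:', call_error)

# Arithmetic is not either: "i+1" is looked up as a keyword argument name.
try:
    '{i+1}'.format(i=1)
except KeyError as exc:
    math_error = type(exc).__name__
    print('Arithmetic rejected:', math_error)
```

Any such modification has to be computed outside the format string and passed in as an argument.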

Given these shortcomings and the problems from C-style formatting expressions that remain (problems #2 and #4 from above), I suggest that you avoid the str.format method in general.

It’s important to know about the new mini language used in format specifiers (everything after the colon) and how to use the format built-in function. But the rest of the str.format method should be treated as a historical artifact to help you understand how Python’s new f-strings work and why they’re so great.

Interpolated Format Strings

Python 3.6 added interpolated format strings—f-strings for short—to solve these issues once and for all. This new language syntax requires you to prefix format strings with an f character, which is similar to how byte strings are prefixed with a b character and raw (unescaped) strings are prefixed with an r character.

F-strings take the expressiveness of format strings to the extreme, solving problem #4 from above by completely eliminating the redundancy of providing keys and values to be formatted. They achieve this pithiness by allowing you to reference all names in the current Python scope as part of a formatting expression:

key = 'my_var'
value = 1.234
formatted = f'{key} = {value}'
print(formatted)

>>>
my_var = 1.234

All of the same options from the new format built-in mini language are available after the colon in the placeholders within an f-string, as is the ability to coerce values to Unicode and repr strings similar to the str.format method:

formatted = f'{key!r:<10} = {value:.2f}'
print(formatted)

>>>
'my_var'   = 1.23
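As a small sketch of the conversion flags available in f-string placeholders, the following applies !s, !r, and !a to the same value and then combines a conversion with a format specifier. The sample text is illustrative and not from the original:

```python
text = 'café'  # illustrative value containing a non-ASCII character

as_str = f'{text!s}'        # same as str(text)
as_repr = f'{text!r}'       # same as repr(text)
as_ascii = f'{text!a}'      # same as ascii(text); escapes non-ASCII characters
aligned = f'{text!r:>10}'   # conversion first, then the format specifier

print(as_str, as_repr, as_ascii)
print(f'[{aligned}]')
```

The conversion is applied before the specifier, so the width in the last line pads the repr string, quotes included.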

Formatting with f-strings is shorter than using C-style format strings with the % operator and the str.format method in all cases. Here, I show all these options together in order of shortest to longest, and line up the left side of the assignment so you can easily compare them:

f_string = f'{key:<10} = {value:.2f}'
c_tuple  = '%-10s = %.2f' % (key, value)
str_args = '{:<10} = {:.2f}'.format(key, value)
str_kw   = '{key:<10} = {value:.2f}'.format(key=key, value=value)
c_dict   = '%(key)-10s = %(value).2f' % {'key': key, 'value': value}
assert c_tuple == c_dict == f_string
assert str_args == str_kw == f_string

F-strings also enable you to put a full Python expression within the placeholder braces, solving problem #2 from above by allowing small modifications to the values being formatted with concise syntax. What took multiple lines with C-style formatting and the str.format method now easily fits on a single line:

for i, (item, count) in enumerate(pantry):
    old_style = '#%d: %-10s = %d' % (
        i + 1,
        item.title(),
        round(count))

    new_style = '#{}: {:<10s} = {}'.format(
        i + 1,
        item.title(),
        round(count))

    f_string = f'#{i+1}: {item.title():<10s} = {round(count)}'

    assert old_style == new_style == f_string

Or, if it’s clearer, you can split an f-string over multiple lines by relying on adjacent-string concatenation (similar to C). Even though this is longer than the single-line version, it’s still much clearer than any of the other multiline approaches:

for i, (item, count) in enumerate(pantry):
    print(f'#{i+1}: '
          f'{item.title():<10s} = '
          f'{round(count)}')

>>>
#1: Avocados   = 1
#2: Bananas    = 2
#3: Cherries   = 15

Python expressions may also appear within the format specifier options. For example, here I parameterize the number of digits to print by using a variable instead of hard-coding it in the format string:

places = 3
number = 1.23456
print(f'My number is {number:.{places}f}')

>>>
My number is 1.235
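The same nesting works for more than one option at a time. As a sketch, here both the field width and the precision come from variables; the width name and the values are illustrative assumptions, not part of the original example:

```python
width = 10     # illustrative field width
places = 2     # digits after the decimal point
number = 1.23456

# Nested placeholders are evaluated first, producing the specifier '>10.2f'.
padded = f'{number:>{width}.{places}f}'
print(f'[{padded}]')

# The equivalent call through the format built-in.
same = format(number, f'>{width}.{places}f')
```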

The combination of expressiveness, terseness, and clarity provided by f-strings makes them the best built-in option for Python programmers. Any time you find yourself needing to format values into strings, choose f-strings over the alternatives.

Things to Remember

✦ C-style format strings that use the % operator suffer from a variety of gotchas and verbosity problems.
✦ The str.format method introduces some useful concepts in its formatting specifiers mini language, but it otherwise repeats the mistakes of C-style format strings and should be avoided.
✦ F-strings are a new syntax for formatting values into strings that solves the biggest problems with C-style format strings.
✦ F-strings are succinct yet powerful because they allow for arbitrary Python expressions to be directly embedded within format specifiers.
