String Methods 0x08 -- Conditional Tests

Attribution is required when reposting: Jianshu@Orca_J35 | GitHub@orca-j35

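Strings support all the common sequence operations and also implement many additional methods.
Under the title "String Methods", I will introduce these methods one by one across several notes.
I will keep updating the notes in this repository: https://github.com/orca-j35/python_notes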

endswith

🔨 str.endswith(suffix[, start[, end]])

Return True if the string ends with the specified suffix, otherwise return False. suffix can also be a tuple of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.

# Test whether the string ends with suffix
text = 'stop comparing at that position'
assert text.endswith('tion') is True
assert text.endswith(('tom', 'tion')) is True
# With start/end, test whether str_obj[start:end] ends with suffix
assert text.endswith('top', 1, 4) is True
assert text.endswith('top', 1, 3) is False
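As a small usage sketch of the tuple form (the file names and the `IMAGE_SUFFIXES` tuple are made up for illustration and do not come from the documentation), a single call can test several suffixes at once:

```python
# Hypothetical file names, used only for illustration.
IMAGE_SUFFIXES = ('.png', '.jpg', '.gif')
filenames = ['orca.png', 'notes.md', 'whale.JPG', 'fin.gif']

# endswith() is case-sensitive, so normalise the case before testing.
images = [name for name in filenames if name.lower().endswith(IMAGE_SUFFIXES)]
print(images)
```

Note that the suffixes must be passed as a tuple; a list raises TypeError.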

startswith

🔨 str.startswith(prefix[, start[, end]])

Return True if string starts with the prefix, otherwise return False. prefix can also be a tuple of prefixes to look for. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.

# Test whether the string starts with prefix
# With start/end, test whether str_obj[start:end] starts with prefix
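The following assertions are a minimal sketch that mirrors the endswith example, reusing the same sample string:

```python
text = 'stop comparing at that position'
assert text.startswith('stop') is True
# A tuple of prefixes: True if any one of them matches.
assert text.startswith(('start', 'stop')) is True
# With start/end, the test is applied to text[start:end].
assert text.startswith('top', 1) is True      # text[1:] begins with 'top'
assert text.startswith('top', 1, 4) is True   # text[1:4] == 'top'
assert text.startswith('top', 1, 3) is False  # text[1:3] == 'to' is too short
```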

isascii

🔨 str.isascii()

Return true if the string is empty or all characters in the string are ASCII, false otherwise. ASCII characters have code points in the range U+0000-U+007F.

New in version 3.7.

# Test whether the string contains only ASCII characters
from string import printable
assert r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~""".isascii() is True
assert printable.isascii() is True
assert '¡'.isascii() is False
# An empty string also returns True
assert ''.isascii() is True
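Because isascii() only exists from Python 3.7 on, an older interpreter needs a substitute. The helper below is a hypothetical fallback (the name `is_ascii` is my own) that applies the documented rule directly: every code point must lie in U+0000-U+007F.

```python
def is_ascii(s):
    """Hypothetical fallback for Python < 3.7: every code point is below U+0080."""
    return all(ord(ch) < 0x80 for ch in s)

for sample in ('orca_j35', '', '¡', '逆戟鲸'):
    print(repr(sample), is_ascii(sample))
```

Like str.isascii(), the helper returns True for an empty string, because all() over an empty sequence is True.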

isalnum

🔨 str.isalnum()

Return true if all characters in the string are alphanumeric and there is at least one character, false otherwise. A character c is alphanumeric if one of the following returns True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric().

# Test whether the string contains only letters and numbers
assert 'abc123'.isalnum() is True
assert '逆戟鲸'.isalnum() is True
assert 'abc_123'.isalnum() is False
assert 'abc 123'.isalnum() is False
assert '!'.isalnum() is False
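The per-character rule quoted above can be spelled out by hand. This is only an illustrative sketch (the helper name `isalnum_by_definition` is an assumption of mine), compared against the built-in on a few samples:

```python
def isalnum_by_definition(s):
    """Apply the documented rule character by character (illustrative helper)."""
    return len(s) > 0 and all(
        c.isalpha() or c.isdecimal() or c.isdigit() or c.isnumeric() for c in s
    )

for sample in ('abc123', '⅕abc', 'abc_123', ''):
    assert isalnum_by_definition(sample) == sample.isalnum()
```

The sample '⅕abc' shows that "alphanumeric" is wider than letters plus 0-9: a fraction character counts because isnumeric() is true for it.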

isalpha

🔨 str.isalpha()

Return true if all characters in the string are alphabetic and there is at least one character, false otherwise. Alphabetic characters are those characters defined in the Unicode character database as “Letter”, i.e., those with general category property being one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”. Note that this is different from the “Alphabetic” property defined in the Unicode Standard. For “Letter” and “Lm”, “Lt”, “Lu”, “Ll”, “Lo”, see the Letter subsection of this article's appendix.

# Test whether the string contains only letters: Lu|Ll|Lt|Lm|Lo
assert 'abc'.isalpha() is True
assert '逆戟鲸'.isalpha() is True
assert 'abc def'.isalpha() is False
assert '123'.isalpha() is False
assert '!'.isalpha() is False
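Since isalpha() accepts any Unicode letter, code that wants only A-Z and a-z needs an extra check. One possible sketch, assuming Python 3.7+ for isascii() (the helper name `is_ascii_letters` is hypothetical):

```python
def is_ascii_letters(s):
    """Illustrative helper: True only for non-empty strings made of A-Z / a-z."""
    return s.isascii() and s.isalpha()

for sample in ('abc', '逆戟鲸', 'abc1', ''):
    print(repr(sample), sample.isalpha(), is_ascii_letters(sample))
```

The empty string is rejected because isalpha() requires at least one character, even though isascii() accepts it.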

isdecimal

🔨 str.isdecimal()

Return true if all characters in the string are decimal characters and there is at least one character, false otherwise. Decimal characters are those that can be used to form numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Formally a decimal character is a character in the Unicode General Category “Nd”.

# Test whether the string contains only decimal characters: 0,1,2,3,4,5,6,7,8,9
assert '0123456789'.isdecimal() is True
assert '0123456789abcdef'.isdecimal() is False
assert '1+j1'.isdecimal() is False
assert '6.1'.isdecimal() is False
# This includes the characters that represent 0,1,2,3,4,5,6,7,8,9 in other scripts
# U+0660~U+0669 are the Arabic-Indic digits 0~9
assert ''.join([chr(i) for i in range(0x660, 0x66A)]).isdecimal() is True

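As far as I currently know, having the Nd general category and having Numeric_Type=Decimal are equivalent conditions (each is necessary and sufficient for the other). In other words, whenever a character has Numeric_Type=Decimal, isdecimal() returns True. For Nd and Numeric_Type, see the appendix of this article.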
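One way to probe this for individual characters is the standard unicodedata module, which exposes the general category and the decimal, digit, and numeric values of a character. The sketch below only inspects a few samples and proves nothing in general; the helper name `describe` is my own.

```python
import unicodedata

def describe(ch):
    """Return (general category, decimal, digit, numeric) for one character.

    Missing values are reported as None instead of raising ValueError.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

for ch in ('5', '\u0660', '⁸', '⅕'):
    print(repr(ch), describe(ch), ch.isdecimal())
```

In these samples, the characters with category Nd are exactly the ones that have a decimal value and for which isdecimal() is true.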

Because Decimal ⊂ Digit ⊂ Numeric, whenever isdecimal() is true, isdigit() and isnumeric() must also be true:

# Decimal characters
assert '0123456789'.isdecimal() is True
assert '0123456789'.isdigit() is True
assert '0123456789'.isnumeric() is True
# Superscript '⁸'
assert '⁸'.isdecimal() is False
assert '⁸'.isdigit() is True
assert '⁸'.isnumeric() is True
# Fraction
assert '⅕'.isdecimal() is False
assert '⅕'.isdigit() is False
assert '⅕'.isnumeric() is True

isdigit

🔨 str.isdigit()

Return true if all characters in the string are digits and there is at least one character, false otherwise. Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. This covers digits which cannot be used to form numbers in base 10, like the Kharosthi (Kharoshthi) numbers. Formally, a digit is a character that has the property value Numeric_Type=Digit or Numeric_Type=Decimal. See Numeric_Type in this article's appendix.

# Test whether the string contains only decimal characters and digits that need special handling (e.g. compatibility superscript digits)
assert '0123456789'.isdigit() is True
assert '⁸'.isdigit() is True
# This includes digits that cannot form base-10 numbers, e.g. U+10A40, the Kharoshthi digit one
# i.e. '\U00010A40' -> '𐩀'
assert '\U00010A40'.isdecimal() is False
assert '\U00010A40'.isdigit() is True
assert '\U00010A40'.isnumeric() is True

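The Kharosthi (Kharoshthi) numeral system is not decimal; it only has symbols for 1, 2, 3, 4, 10, 20, 100, and 1000. For details, see the Wikipedia article on Kharosthi.

Because Decimal ⊂ Digit ⊂ Numeric, whenever isdigit() is true, isnumeric() must also be true:

# Decimal characters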
assert '0123456789'.isdecimal() is True
assert '0123456789'.isdigit() is True
assert '0123456789'.isnumeric() is True
# Superscript '⁸'
assert '⁸'.isdecimal() is False
assert '⁸'.isdigit() is True
assert '⁸'.isnumeric() is True
# Fraction
assert '⅕'.isdecimal() is False
assert '⅕'.isdigit() is False
assert '⅕'.isnumeric() is True
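A practical consequence: isdigit() being true does not mean int() can parse the string. A hedged sketch of a safer pre-check (the helper name `to_int` is hypothetical) uses isdecimal() instead:

```python
def to_int(s):
    """Illustrative helper: convert only if every character is a decimal digit."""
    return int(s) if s.isdecimal() else None

assert to_int('42') == 42
assert to_int('\u0664\u0662') == 42   # ARABIC-INDIC DIGIT FOUR, then TWO
assert to_int('⁸') is None            # isdigit() is True, but int('⁸') raises ValueError
```

This helper is deliberately strict: it also returns None for strings such as '-5' or ' 5', which int() itself would accept.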

isnumeric

🔨 str.isnumeric()

Return true if all characters in the string are numeric characters, and there is at least one character, false otherwise. Numeric characters include digit characters, and all characters that have the Unicode numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric. See Numeric_Type in this article's appendix.

assert '⅕'.isnumeric() is True
assert 'Ⅵ'.isnumeric() is True
assert '贰'.isnumeric() is True

Because Decimal ⊂ Digit ⊂ Numeric, isnumeric() being true does not imply that isdecimal() or isdigit() is true:

# Decimal characters
assert '0123456789'.isdecimal() is True
assert '0123456789'.isdigit() is True
assert '0123456789'.isnumeric() is True
# Superscript '⁸'
assert '⁸'.isdecimal() is False
assert '⁸'.isdigit() is True
assert '⁸'.isnumeric() is True
# Fraction
assert '⅕'.isdecimal() is False
assert '⅕'.isdigit() is False
assert '⅕'.isnumeric() is True
# Roman numeral
assert 'Ⅵ'.isdecimal() is False
assert 'Ⅵ'.isdigit() is False
assert 'Ⅵ'.isnumeric() is True
# Chinese
assert '贰'.isdecimal() is False
assert '贰'.isdigit() is False
assert '贰'.isnumeric() is True
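To see which value a numeric character actually carries, the standard unicodedata.numeric() function can be used. A small sketch (the helper name `numeric_values` is my own):

```python
import unicodedata

def numeric_values(s):
    """Illustrative helper: map each character to its Unicode numeric value."""
    return [unicodedata.numeric(ch) for ch in s]

print(numeric_values('⅕Ⅵ8'))
```

unicodedata.numeric() raises ValueError for a character without a numeric value unless a default is passed as the second argument, so checking isnumeric() first is a reasonable guard.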

isidentifier

🔨 str.isidentifier()

Return true if the string is a valid identifier according to the language definition, section Identifiers and keywords.

Use keyword.iskeyword() to test for reserved identifiers such as def and class.

# Test whether the string is a valid identifier
assert 'if'.isidentifier() is True
assert '_orca_j35'.isidentifier() is True
assert '123_abc'.isidentifier() is False
# keyword.iskeyword() tests whether a string is a reserved identifier
import keyword
assert keyword.iskeyword('def') is True
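Since 'if'.isidentifier() is True, the two checks are usually combined when validating a name that is meant to be assignable. A minimal sketch (the helper name `is_usable_name` is hypothetical):

```python
import keyword

def is_usable_name(s):
    """Illustrative helper: a valid identifier that is not a reserved keyword."""
    return s.isidentifier() and not keyword.iskeyword(s)

for name in ('if', '_orca_j35', '123_abc'):
    print(repr(name), is_usable_name(name))
```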

isprintable

🔨 str.isprintable()

Return true if all characters in the string are printable or the string is empty, false otherwise. Nonprintable characters are those characters defined in the Unicode character database as “Other” or “Separator”, excepting the ASCII space (0x20) which is considered printable. (Note that printable characters in this context are those which should not be escaped when repr() is invoked on a string. It has no bearing on the handling of strings written to sys.stdout or sys.stderr.) For “Other” and “Separator”, see the Separator&Other subsection of this article's appendix.

Note that a printable character here is one that repr() does not escape; this is unrelated to how strings written to sys.stdout or sys.stderr are handled.

# Test whether the string contains only printable characters
assert 'orca_j35'.isprintable() is True
# Characters in the Unicode Other or Separator categories are nonprintable, except the ASCII space (0x20)
assert '\t'.isprintable() is False
assert ' '.isprintable() is True
# An empty string also returns True
assert ''.isprintable() is True
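The link to repr() can be observed directly. The sketch below checks a handful of hand-picked characters, chosen by me for illustration, and compares the repr() output with the character itself:

```python
samples = {
    'a': "'a'",             # printable: shown as-is
    'é': "'é'",             # printable non-ASCII character
    '\t': "'\\t'",          # Cc (Other): escaped
    '\xa0': "'\\xa0'",      # Zs (Separator): NO-BREAK SPACE is escaped
    '\u200b': "'\\u200b'",  # Cf (Other): ZERO WIDTH SPACE is escaped
}
for ch, expected in samples.items():
    assert repr(ch) == expected
    # For these samples, "printable" coincides with "left unescaped by repr()".
    assert ch.isprintable() == (repr(ch)[1:-1] == ch)
```

The comparison in the loop is only a rough check that happens to work for these samples: the backslash is printable, yet repr() still doubles it.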

isspace

🔨 str.isspace()

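Return true if there are only whitespace characters in the string and there is at least one character, false otherwise. Whitespace characters are those characters defined in the Unicode character database as “Other” or “Separator” and those with bidirectional property being one of “WS”, “B”, or “S”. For “Other” and “Separator”, see the Separator&Other subsection of this article's appendix; for the bidirectional property, see Bidi_Class and Bidirectional Class Values. Note that this wording is broader than the actual behavior: not every “Other” character is whitespace (for example, '\x00'.isspace() is False). More recent versions of the documentation define a whitespace character as one whose general category is Zs or whose bidirectional class is WS, B, or S.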

# Test whether the string contains only whitespace characters
# Whitespace covers space separators (Zs) and characters whose bidirectional class is WS, B, or S
assert ' \t\n\r\v\f'.isspace() is True
assert 'orca_j35'.isspace() is False
# An empty string returns False
assert ''.isspace() is False
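A few non-ASCII and control characters show where the boundary lies. The characters below are my own picks for illustration, with their categories noted in the comments:

```python
assert '\u3000'.isspace() is True    # IDEOGRAPHIC SPACE (Zs)
assert '\xa0'.isspace() is True      # NO-BREAK SPACE (Zs)
assert '\x1c'.isspace() is True      # FILE SEPARATOR (Cc, bidirectional class B)
assert '\x00'.isspace() is False     # NULL (Cc): not whitespace
assert '\u200b'.isspace() is False   # ZERO WIDTH SPACE (Cf): not whitespace

# strip() with no argument removes the same set of whitespace characters.
cleaned = '\u3000orca\xa0'.strip()
print(repr(cleaned))
```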

istitle

🔨 str.istitle()

Return true if the string is a titlecased string and there is at least one character, for example uppercase characters may only follow uncased characters and lowercase characters only cased ones. Return false otherwise.

# Test whether the string is titlecased
# Uppercase characters may only follow uncased characters; lowercase characters may only follow cased ones
assert 'A'.istitle() is True
assert 'Orca 8@Orca 🐳逆戟鲸Orca'.istitle() is True
assert 'Orca ORca'.istitle() is False
assert 'Orca orca'.istitle() is False
assert 'Orca O#rca'.istitle() is False
assert '35orca'.istitle() is False
# The first letter may be a titlecase (Lt) character; see the Letter subsection of the appendix
assert 'ᾯabc'.istitle() is True
# Chinese characters are uncased
assert '逆戟鲸 Orca'.istitle() is True
assert '逆戟鲸orca'.istitle() is False

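Uncased characters are those that are not cased, that is, whose general category is not Lu, Ll, or Lt. This includes letters such as Chinese characters (Lo). See the Letter subsection of this article's appendix.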
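The same rule explains how str.title() behaves, since title() produces strings for which istitle() is true whenever at least one cased character is present. A short sketch with a sample string of my own:

```python
s = 'orca j35 逆戟鲸orca'
titled = s.title()
print(titled)
assert titled.istitle() is True
# An apostrophe is uncased, so the letter after it is capitalised as well.
assert "they're".title() == "They'Re"
```

Because every uncased character (a space, a digit, a Chinese character, an apostrophe) starts a new "word", title() is often not what natural-language text needs.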

islower

🔨 str.islower()

Return true if all cased characters in the string are lowercase and there is at least one cased character, false otherwise.

Cased characters are those with general category property being one of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase). See the Letter subsection of this article's appendix.

# Test whether the string contains only lowercase (Ll) characters and uncased characters
assert 'a'.islower() is True
assert 'ƺ'.islower() is True # Latin Small Letter Ezh with Tail
assert 'orca j35 逆戟鲸 !@\n\t'.islower() is True
# At least one lowercase character is required
assert '逆戟鲸'.islower() is False

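Uncased characters are those that are not cased, that is, whose general category is not Lu, Ll, or Lt. This includes letters such as Chinese characters (Lo). See the Letter subsection of this article's appendix.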
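To see which characters of a string count as cased under the definition quoted above, the general category can be looked up with the standard unicodedata module. This is a sketch that follows the documentation's wording (Lu, Ll, Lt); the helper name `cased_chars` is my own.

```python
import unicodedata

CASED = {'Lu', 'Ll', 'Lt'}  # cased categories as listed in the quoted documentation

def cased_chars(s):
    """Illustrative helper: list (character, category) for the cased characters in s."""
    return [(ch, unicodedata.category(ch)) for ch in s
            if unicodedata.category(ch) in CASED]

print(cased_chars('orca j35 逆戟鲸 !@'))
print(cased_chars('逆戟鲸'))  # no cased characters, so islower() is False
```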

isupper

🔨 str.isupper()

Return true if all cased characters in the string are uppercase and there is at least one cased character, false otherwise.

Cased characters are those with general category property being one of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase). See the Letter subsection of this article's appendix.

# Test whether the string contains only uppercase (Lu) characters and uncased characters
assert 'A'.isupper() is True
assert 'Æ'.isupper() is True  # Latin Capital Letter Ae
assert 'ORCA J35 逆戟鲸 !@\n\t'.isupper() is True
assert 'Orca'.isupper() is False
# At least one uppercase character is required
assert '逆戟鲸'.isupper() is False
assert '_35'.isupper() is False

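Uncased characters are those that are not cased, that is, whose general category is not Lu, Ll, or Lt. This includes letters such as Chinese characters (Lo). See the Letter subsection of this article's appendix.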
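The "at least one cased character" requirement means that isupper() is not the same test as comparing a string with its upper() form. A brief sketch, using sample strings of my own:

```python
for s in ('ORCA J35', '逆戟鲸', '_35'):
    # upper() leaves uncased characters untouched, so the comparison can be
    # True even when isupper() is False.
    print(repr(s), s.upper() == s, s.isupper())
```

For '逆戟鲸' and '_35' the comparison with upper() is True while isupper() is False, because neither string contains a cased character.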
