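1. Context managers

1.1 The with statement

Opening a file with the with statement closes it automatically when the block ends, even if an exception is raised inside the block.
The object that with manages is called a context manager.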

# 1. Open the file for writing
with open('1.txt', 'w') as f:
    # 2. Write content to the file
    f.write('hello world')
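
As a rough sketch of what the with statement does for a file, the snippet below writes the same text once with with and once with an explicit try/finally, then checks that both file objects end up closed. The temporary directory and the two file names are assumptions made only for this illustration.

```python
import os
import tempfile

# Work in a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path_a = os.path.join(tmp_dir, 'with_demo.txt')     # hypothetical file name
    path_b = os.path.join(tmp_dir, 'finally_demo.txt')  # hypothetical file name

    # Version 1: the with statement closes the file when the block ends.
    with open(path_a, 'w') as f_with:
        f_with.write('hello world')

    # Version 2: roughly the same behaviour spelled out by hand.
    f_manual = open(path_b, 'w')
    try:
        f_manual.write('hello world')
    finally:
        f_manual.close()  # runs whether or not write() raised

    closed_after_with = f_with.closed
    closed_after_finally = f_manual.closed

print(closed_after_with, closed_after_finally)
```

The with form is shorter and harder to get wrong, which is why it is the usual choice for file handling.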

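2.1 Implementing a context manager in code

The with statement automatically calls two magic methods on the object it manages:
__enter__() runs on entry to the with block, and its return value is bound to the name after as; __exit__() runs when the block is left, whether normally or through an exception.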

class MyFile(object):
    def __init__(self, file_name, file_model):
        self.file_name = file_name
        self.file_model = file_model  # file open mode, e.g. 'r'
        self.fp = None  # file handle
    def __enter__(self):
        print('enter: runs when the with block starts')
        self.fp = open(self.file_name, self.file_model)
        return self  # the object bound to the name after "as"
    def __exit__(self, exc_type, exc_val, exc_tb):
        print('exit: runs when the with block ends')
        self.fp.close()
        return False  # let exceptions propagate; a truthy value such as self would suppress them
    # Callable on f below only because __enter__ returns self
    def readfile_filter(self, end_words):
        print('reading the file')
if __name__ == '__main__':
    with MyFile('./jaychou_lyrics.txt', 'r') as f:
        print(type(f))
        f.readfile_filter('女人')
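
The return value of __exit__ matters: a truthy value tells Python to suppress an exception raised inside the with block, while a falsy value lets it propagate. The sketch below illustrates this with a hypothetical Tracker class (not part of the notes above) that records the exception type it receives.

```python
class Tracker:
    """Hypothetical context manager that records what __exit__ receives."""

    def __init__(self, suppress):
        self.suppress = suppress
        self.seen = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.seen = exc_type   # None when the block finished without an error
        return self.suppress   # truthy: swallow the exception; falsy: re-raise it


quiet = Tracker(suppress=True)
with quiet:
    raise ValueError('swallowed')  # suppressed because __exit__ returned True

loud = Tracker(suppress=False)
propagated = False
try:
    with loud:
        raise ValueError('re-raised')
except ValueError:
    propagated = True

print(quiet.seen, loud.seen, propagated)
```

In both cases __exit__ still runs, so cleanup code placed there is executed whether or not the exception is suppressed.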

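3. Generators

A generator produces values one at a time according to rules the programmer defines, and stops producing when its condition no longer holds.

3.1 Generator expressions

A generator expression is written with parentheses, as follows: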

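# 1. Generator expression
my_generator = (i*2 for i in range(5))
print(my_generator)
print('='*50)
# 2. Iterate over the generator with a for loop
for i in my_generator:
    print(i)
# 3. Iterate with a while loop (not recommended)
my_generator = (i*2 for i in range(5))  # the for loop above exhausted the generator, so create a new one
while True:
    try:
        result = next(my_generator)
        print(result)
    except StopIteration as e:  # raised once every value has been consumed; used here to end the loop
        break
# 4. Step through one value at a time with next(my_generator), like a cursor
my_generator = (i*2 for i in range(5))  # fresh generator again
next(my_generator)
next(my_generator)
...  # calling next() past the last value raises StopIteration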
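
A generator can be consumed only once, which is easy to overlook when the same generator is iterated several times in a row. The short sketch below (variable names are illustrative) shows that a second pass yields nothing and that next() picks up where the previous call stopped.

```python
squares = (i * i for i in range(4))

first_pass = list(squares)   # consumes every value
second_pass = list(squares)  # the generator is already exhausted

fresh = (i * i for i in range(4))
head = next(fresh)           # takes the first value only
rest = list(fresh)           # continues from where next() stopped

print(first_pass, second_pass, head, rest)
```

To iterate again, build a new generator rather than reusing the old one.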

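3.2 yield generators

When execution reaches yield, the function pauses and hands the value back to the caller; the next time the generator is resumed, it continues from the point where it paused.
Once a generator has produced all of its values, asking for the next one raises StopIteration, which signals the end of iteration.
A while loop does not handle this exception for you, so you must catch it yourself.
A for loop handles StopIteration internally, which makes it more convenient; prefer it.

[Figure: yield execution flow]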
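
The pause-and-resume behaviour can be observed directly by recording when each part of a generator function runs. This is a minimal sketch; the countdown function and the events list are hypothetical names introduced only for the demonstration.

```python
events = []


def countdown(start):
    """Hypothetical generator that logs each time it starts or resumes."""
    events.append('started')
    while start > 0:
        yield start                               # pause here and hand back a value
        events.append(f'resumed after {start}')   # runs on the following next() call
        start -= 1
    events.append('finished')


gen = countdown(2)
ran_before_first_next = list(events)  # still empty: the body has not started yet

first = next(gen)    # runs up to the first yield
second = next(gen)   # resumes after the first yield, runs to the second

try:
    next(gen)        # no more values: the function body ends
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted, events)
```

Note that calling countdown(2) runs none of the function body; execution begins only with the first next().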

import math
def dataset_loader(batch_size):
    # 1. Read the lyrics
    with open('./jaychou_lyrics.txt', 'r') as file:
        lines = file.readlines()
    # 2. Count the lyric lines
    lyrics_number = len(lines)
    # 3. Number of batches: math.ceil rounds up, math.floor rounds down
    batch_number = math.ceil(lyrics_number/batch_size)
    # 4. Yield one batch at a time
    for idx in range(batch_number):
        yield lines[idx*batch_size : idx*batch_size+batch_size]
if __name__ == '__main__':
    dataloader = dataset_loader(8)
    for data in dataloader:
        print(data)
    print('Generator (data loader) created, in preparation for the AI courses. End')
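
The same batching idea can be tried without the lyrics file. The sketch below is an assumption-laden variant: batch_loader and sample_lines are hypothetical names, and the data is an in-memory list rather than a file. It shows that rounding up with math.ceil produces a shorter final batch when the item count is not a multiple of the batch size.

```python
import math


def batch_loader(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    batch_count = math.ceil(len(items) / batch_size)  # round up to keep the remainder
    for idx in range(batch_count):
        start = idx * batch_size
        yield items[start:start + batch_size]


sample_lines = [f'line {n}' for n in range(10)]  # stand-in for lines read from a file
batches = list(batch_loader(sample_lines, 4))
batch_sizes = [len(batch) for batch in batches]

print(batch_sizes)
print(batches[-1])
```

Slicing past the end of a list is safe in Python, so the last batch needs no special handling.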

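4. Properties

A property can be defined in two ways: with decorators, or as a class attribute.
1. Decorator style:
@property decorates the method that gets the value
@<method name>.setter decorates the method that sets the value
2. Class-attribute style:
class_attribute = property(getter_method, setter_method)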

# Style 1: decorators
class Person(object):
    def __init__(self):
        self.__age = 0
    # Getter
    @property
    def age(self):
        return self.__age
    # Setter
    @age.setter
    def age(self, new_age):
        self.__age = new_age
#======================================
# Style 2: class attribute (this definition replaces the Person class above)
class Person(object):
    def __init__(self):
        self.__age = 0
    def get_age(self):
        """Runs when the age attribute is read."""
        return self.__age
    def set_age(self, new_age):
        """Runs when the age attribute is assigned."""
        self.__age = new_age
    # Property defined as a class attribute
    age = property(get_age, set_age)
if __name__ == '__main__':
    p1 = Person()
    print(p1.age)
    p1.age = 100
    print(p1.age)
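
One common reason to route attribute access through a property is to validate values on assignment. The sketch below assumes a hypothetical Account class with a rule that the balance may not be negative; neither the class nor the rule comes from the notes above.

```python
class Account:
    """Hypothetical class whose setter rejects invalid values."""

    def __init__(self):
        self.__balance = 0

    @property
    def balance(self):
        return self.__balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError('balance cannot be negative')  # assumed rule for this demo
        self.__balance = value


acct = Account()
acct.balance = 50        # goes through the setter

try:
    acct.balance = -1    # rejected by the setter
    rejected = False
except ValueError:
    rejected = True

print(acct.balance, rejected)
```

Callers still write plain attribute syntax, so validation can be added later without changing the code that uses the class.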

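5. Regular expressions

A regular expression describes a pattern for matching strings. It can be used to check whether a string contains a certain kind of substring, to replace matched substrings, or to extract substrings that satisfy some condition.

5.1 The re module

1. Using re.match
match tries to match the pattern only at the beginning of the string; it returns a match object on success and None otherwise
Use the group method to extract the matched text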

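# Step 1: import the re module
import re
# Step 2: call match; the signature is re.match(pattern, string, flags=0)
pattern = r"hello"      # the regular expression (illustrative value)
string = "hello world"  # the string to match against (illustrative value)
result = re.match(pattern, string, flags=0)
# Step 3: if the match succeeded, use group to extract the data
if result:
    print(result.group())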
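
Because re.match returns None when the pattern does not match at the start of the string, calling group() without a check can fail with AttributeError. A small sketch, with illustrative strings, of guarding against that:

```python
import re

hit = re.match(r'\d+', '2024 report')   # digits at the very start: matches
miss = re.match(r'\d+', 'report 2024')  # digits appear later: no match at the start

# Guard before calling group(), since miss is None.
hit_text = hit.group() if hit else None
miss_text = miss.group() if miss else None

print(hit_text, miss_text)
```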

2. Using re.search
search scans the whole string and returns the first match found anywhere in it

def dm02_search_scan_string():
    '''Scan the string and return the first successful match: re.search(pattern, string, flags=0)'''
    import re
    result = re.search(r"\d.*", "city:1beijing2.shanghai")  # r"\d.*": one digit followed by any number of characters
    # result = re.search(r".\d.", "cityp.1.beijing2.shanghai")
    if result:
        print(result.group())
    else:
        print('no match found')
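
The match object returned by re.search also reports where the match was found. The sketch below reuses the sample string from this section and inspects the position with span(), which is part of the standard match object API.

```python
import re

text = 'city:1beijing2.shanghai'

found = re.search(r'\d.*', text)
matched_text = found.group()   # text from the first digit to the end of the line
position = found.span()        # (start, end) indexes of the match within text

first_digit = re.search(r'\d', text).group()  # only the first of the two digits

print(matched_text, position, first_digit)
```

Even though the string contains two digits, search stops at the first position where the pattern can match.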

3. Using re.compile
Removing unwanted characters from a string

import re
def demo_03():
    sentence = "车主说:你的刹车片应该更换了啊,嘿嘿"
    # Regex: remove filler words
    p = r"呢|吧|哈|啊|啦|嘿|嘿嘿"
    r = re.compile(pattern=p)
    mystr = r.sub('', sentence)
    print('mystr-->', mystr)
    # Regex: remove every character except Chinese characters, digits, letters, underscore and ，！？。.-
    # \u4e00-\u9fa5 is the range used to test whether a character is Chinese
    p = r"[^，！？。.\-\u4e00-\u9fa5_a-zA-Z0-9]"
    r = re.compile(pattern=p)
    mystr = r.sub('', sentence)
    print('mystr-->', mystr)
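
A compiled pattern is convenient when the same expression is applied to many strings. The sketch below is illustrative only: the sample sentences are made up, and it uses a character class in place of the alternation shown above, which is an equivalent choice when every alternative is a single character.

```python
import re

# Character class listing the filler characters to strip (assumed list for this demo).
filler = re.compile(r'[呢吧哈啊啦嘿]')

sentences = ['好的啊', '走吧,嘿嘿', '没问题']  # made-up sample inputs
cleaned = [filler.sub('', s) for s in sentences]

print(cleaned)
```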

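5.2 Character matching rules

Character matching reference and demonstration

1 . matches any single character (except \n)
2 [ ] matches any one of the characters listed inside [ ]
3 \d matches a digit, i.e. 0-9 => [0123456789] => [0-9]
4 \D matches a non-digit # the uppercase form generally means "not"
5 \s matches whitespace, e.g. space, tab, newline
6 \S matches non-whitespace
7 \w matches a word character, i.e. a-z, A-Z, 0-9, _, Chinese characters
8 \W matches a non-word character, i.e. not a letter, digit, _, or Chinese character
9 * matches the preceding element 0 or more times, i.e. it is optional
10 + matches the preceding element 1 or more times, i.e. at least once
11 ? matches the preceding element 0 or 1 times, i.e. once or not at all
12 {m} matches the preceding element exactly m times
13 {m,n} matches the preceding element from m to n times
14 ^ matches the start of the string
15 $ matches the end of the string
16 [^chars] matches any character except those listed
17 | matches either the expression on its left or the one on its right
18 (ab) treats the characters in parentheses as a group
19 \ escape character
20 (?P<name>) gives a group a name
21 (?P=name) refers back to the text matched by the group named name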

import re
# Character classes: [a-z]  [A-Z]  [0-9]  [a-zA-Z0-9]
result = re.match(r"itcast[123abc]", "itcast376")
# Match a string that starts with one digit followed by itcast
result = re.match(r"^\ditcast", "2itcast")
# Match a string that starts with a digit
result = re.match(r'^\d.*', '22itcast')   # "^\d": starts with a digit, ".*": followed by any characters
# 15 $ matches the end of the string
result = re.match(r".*\d$", "itcast66")   # ".*": zero or more characters, "\d$": ends with a digit
# Match a string that both starts and ends with a digit
result = re.match(r"^\d.*\d$", "11itcast22")
# 16 [^chars] matches any character except those listed
result = re.match(r"^\d.*[^4]$", "11itcast@")
# 17-18-19 alternation, grouping and escaping
result = re.match(r"[a-zA-Z0-9_]{4,20}@163|126|qq\.com", "hello@163.com")  # without a group, | splits the whole pattern, so only "hello@163" is matched
result = re.match(r"[a-zA-Z0-9_]{4,20}@(163|126|qq)\.com", "hello@163.com")  # grouping the alternatives matches "hello@163.com"
# 18 groups
result = re.match(r"(qq):([1-9]\d{4,11})", "qq:10567")
if result:
    info = result.group(0)  # qq:10567
    num = result.group(2)   # 10567
    kind = result.group(1)  # qq (renamed from type to avoid shadowing the builtin)
# Match <html>hh</html>
result = re.match(r"<([a-zA-Z1-6]{4})>.*</([a-zA-Z1-6]{4})>", "<html>hh</html>")
result = re.match("<([a-zA-Z1-6]{4})>.*</\\1>", "<html>hh</html>")  # \\1 refers back to group 1
result = re.match(r"<([a-zA-Z1-6]{4})>.*</\1>", "<html>hh</html>")  # with the r prefix, the backslash needs no escaping
# Requirement: match <html><h1>www.itcast.cn</h1></html>
result = re.match("<([a-zA-Z1-6]{4})><([a-zA-Z1-6]{2})>.*</\\2></\\1>", "<html><h1>www.itcast.cn</h1></html>")
result = re.match(r"<(?P<html>[a-zA-Z1-6]{4})><(?P<h1>[a-zA-Z1-6]{2})>.*</(?P=h1)></(?P=html)>", "<html><h1>www.itcast.cn</h1></html>")
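
Named groups can also be read back by name, which tends to be easier to follow than numeric indexes. The sketch below reworks the qq example with two group names, kind and number, chosen here purely for illustration.

```python
import re

# (?P<name>...) labels each group; the names are assumptions for this demo.
pattern = re.compile(r'(?P<kind>qq):(?P<number>[1-9]\d{4,11})')
m = pattern.match('qq:10567')

kind = m.group('kind')      # same text as m.group(1)
number = m.group('number')  # same text as m.group(2)
as_dict = m.groupdict()     # every named group in one dictionary

print(kind, number, as_dict)
```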

14. Advanced Python syntax