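Files

What is a file

A file is a virtual concept/interface that the operating system provides to users and applications for working with the hard disk.

Why use files

Users and applications can persist data to the hard disk through files, so operating on a file amounts to operating on the disk.
The user or application works with the file directly. Every file operation is a system call sent to the operating system, which then translates it into concrete disk operations.

How to work with files

Three steps: open, operate, close.

Basic workflow

Open the file: the application calls open(...), which issues a system call to the operating system. The operating system opens the file, which corresponds to a region of disk space, and a file object is returned and assigned to a variable f.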

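f=open('a.txt','r',encoding='utf-8') # the default mode is r

Call the read/write methods of the file object; the operating system translates them into disk reads and writes.

data=f.read()

Send the operating system a request to close the file so that system resources are reclaimed.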

f.close()
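
The open / operate / close steps can also be written with a with block, which calls close() automatically. The sketch below compares the two forms; the scratch file created with tempfile is an assumption made only so the example does not touch real data.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'a.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('hello')

# Manual open / operate / close; finally guarantees close() even on error
f = open(path, 'r', encoding='utf-8')
try:
    data_manual = f.read()
finally:
    f.close()

# The same three steps with a with block; close() runs when the block exits
with open(path, 'r', encoding='utf-8') as g:
    data_with = g.read()

print(data_manual == data_with, f.closed, g.closed)
```

Both forms release the file; the with form is shorter and harder to get wrong, which is why the rest of these notes use it.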

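File modes

Modes that control reading and writing

r (the default): read-only
w: write-only
a: append-only

Using r mode

r (read-only) mode: raises an error if the file does not exist; if the file exists, the file pointer starts at the beginning of the file.

with open('a.txt',mode='r',encoding='utf-8') as f:
    res=f.read() # reads the entire file from disk into memory and assigns it to res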

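Implementing user authentication

inp_name=input('Enter your name: ').strip()
inp_pwd=input('Enter your password: ').strip()
with open(r'db.txt',mode='r',encoding='utf-8') as f:
    for line in f:
        # compare the entered name and password with each name:password record
        u,p=line.strip('\n').split(':')
        if inp_name == u and inp_pwd == p:
            print('Login successful')
            break
    else:
        # the for loop finished without break, so no record matched
        print('Wrong username or password')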

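Using w mode

w (write-only) mode: creates an empty file if the file does not exist; if the file exists, it is truncated. Either way the file pointer starts at the beginning of the file.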

with open('b.txt',mode='w',encoding='utf-8') as f:
    f.write('你好\n')
    f.write('我好\n') 
    f.write('大家好\n')
    f.write('111\n222\n333\n')

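Note:
1. While the file stays open, consecutive writes always follow one another: later content is placed right after earlier content.
2. Reopening the file in w mode clears its content.

Using a mode

a (append-only) mode: creates an empty file if the file does not exist; if the file exists, the file pointer moves straight to the end of the file.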

with open('c.txt',mode='a',encoding='utf-8') as f:
     f.write('44444\n')
     f.write('55555\n')

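Similarities and differences between w mode and a mode:
1. Same: while the opened file is not closed, consecutive writes always place new content after previously written content.
2. Different: reopening a file in a mode does not clear the existing content; the file pointer moves straight to the end, so new content is always written last.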
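
The difference between w and a on reopening can be checked with a small experiment. This is a minimal sketch; the helper write_twice and the scratch file are hypothetical names introduced only for illustration.

```python
import os
import tempfile

# Hypothetical scratch file for the experiment
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

def write_twice(mode):
    """Open the file twice in the given mode, writing one line each time."""
    if os.path.exists(path):
        os.remove(path)
    for text in ('first\n', 'second\n'):
        with open(path, mode=mode, encoding='utf-8') as f:
            f.write(text)
    with open(path, mode='r', encoding='utf-8') as f:
        return f.read()

w_result = write_twice('w')  # the second open truncates, only 'second' survives
a_result = write_twice('a')  # the second open appends, both lines survive
print(repr(w_result), repr(a_result))
```
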
Implementing registration:

name=input('username>>>: ').strip()
pwd=input('password>>>: ').strip()
with open('db1.txt',mode='a',encoding='utf-8') as f:
    info='%s:%s\n' %(name,pwd)
    f.write(info)

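Using + modes (for reference)

r+ w+ a+: readable and writable
In everyday work we mostly use plain r/w/a, either reading only or writing only, and rarely use the read-write modes.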

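x mode

Write-only mode: not readable; creates the file if it does not exist, and raises an error if it already exists.

with open('a.txt',mode='x',encoding='utf-8') as f:
    f.write('accx') # x mode cannot read; f.read() would raise io.UnsupportedOperation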
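
The two properties of x mode (refusing to open an existing file, and being write-only) can be observed as follows. This is a sketch on a hypothetical scratch file; the variable names are illustrative.

```python
import io
import os
import tempfile

# Hypothetical path that does not exist yet
path = os.path.join(tempfile.mkdtemp(), 'new.txt')

# First open: the file does not exist, so x mode creates it
with open(path, mode='x', encoding='utf-8') as f:
    f.write('created\n')
    try:
        f.read()
        readable = True
    except io.UnsupportedOperation:
        readable = False  # x mode is write-only

# Second open: the file now exists, so x mode refuses to open it
try:
    with open(path, mode='x', encoding='utf-8') as f:
        f.write('never written\n')
    second_attempt = 'opened'
except FileExistsError:
    second_attempt = 'refused'

print(readable, second_attempt)
```

This makes x mode a reasonable choice when accidentally overwriting an existing file would be a problem.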

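Modes that control the content being read or written

t: text mode (the default)
1. Reads and writes are in units of strings (unicode)
2. Works only for text files
3. A character encoding should be specified via the encoding parameter; if it is omitted, Python falls back to a platform-dependent default
b: binary mode
1. Reads and writes are in units of bytes
2. Works for all files
3. A character encoding must not be specified; passing the encoding parameter raises an error
Summary:
1. For plain text files, t mode handles encoding and decoding for us, whereas b mode requires encoding and decoding by hand, so t mode is more convenient here
2. For non-text files (images, video, audio, and so on), only b mode can be used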
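
To make the t/b distinction concrete, the sketch below writes a string in t mode and reads the same file back in b mode, decoding by hand. The scratch file and variable names are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file
path = os.path.join(tempfile.mkdtemp(), 'text.txt')

# t mode: we hand over a str and the encoding is applied for us
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('你好')

# b mode: we get raw bytes back and must decode them ourselves
with open(path, mode='rb') as f:
    raw = f.read()
decoded = raw.decode('utf-8')

# b mode rejects the encoding parameter outright
try:
    open(path, mode='rb', encoding='utf-8')
    encoding_allowed = True
except ValueError:
    encoding_allowed = False

print(raw, decoded, encoding_allowed)
```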

File copy tool

src_file=input('Source file path>>: ').strip()
dst_file=input('Destination file path>>: ').strip()
with open(src_file,mode='rb') as f1,\
    open(dst_file,mode='wb') as f2:
    for line in f1:
        f2.write(line)
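
Iterating a binary file by line works, but a file with no newline bytes would be read as one huge "line". A possible alternative, sketched here, is to copy in fixed-size chunks. The function name copy_file and the 1 MiB chunk size are assumptions, not part of the original tool.

```python
import os
import tempfile

def copy_file(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in fixed-size chunks so memory use stays bounded."""
    with open(src, mode='rb') as f1, open(dst, mode='wb') as f2:
        while True:
            chunk = f1.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            f2.write(chunk)

# Usage on hypothetical scratch files
folder = tempfile.mkdtemp()
src_path = os.path.join(folder, 'src.bin')
dst_path = os.path.join(folder, 'dst.bin')
payload = os.urandom(3 * 1024 * 1024 + 17)  # deliberately not a multiple of the chunk size
with open(src_path, 'wb') as f:
    f.write(payload)

copy_file(src_path, dst_path)
```

The standard library's shutil.copyfileobj follows the same idea and is usually the more convenient choice in real code.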

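Reading a file in a loop

Approach 1: control the amount of data read each time yourself

with open(r'test.jpg',mode='rb') as f:
    while True:
        res=f.read(1024) # read at most 1024 bytes per call
        if len(res) == 0: # an empty result means end of file
            break
        print(len(res))

Approach 2: read line by line; when a single line is very long, one read can still pull in too much data at once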

with open(r'g.txt',mode='rt',encoding='utf-8') as f:
    for line in f:
        print(len(line),line)
with open(r'g.txt',mode='rb') as f:
    for line in f:
        print(line)
with open(r'test.jpg',mode='rb') as f:
    for line in f:
        print(line)

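Other file methods

Read operations

1. readline: reads one line at a time

with open(r'g.txt',mode='rt',encoding='utf-8') as f:
    res1=f.readline()
    res2=f.readline()
    print(res2)

    # keep reading the remaining lines until readline returns an empty string
    while True:
        line=f.readline()
        if len(line) == 0:
            break
        print(line)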

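2. readlines: reads all lines into a list

with open(r'g.txt',mode='rt',encoding='utf-8') as f:
    res=f.readlines()
    print(res)

Note: f.read() and f.readlines() both load the entire content into memory at once, which can exhaust memory when the content is very large. If you still need to process all of the content, read it in several passes, as in the two loop-based approaches shown earlier.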

Write operations

1. f.writelines(): writes each string in a sequence; no newlines are added

with open('h.txt',mode='wt',encoding='utf-8') as f:
    l=['11111\n','2222','3333']
    f.writelines(l)

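Supplement 1: for pure ASCII characters, adding the prefix b to a string literal gives a bytes object directly
Supplement 2: '上'.encode('utf-8') is equivalent to bytes('上',encoding='utf-8')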
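
Both supplements can be verified directly. The sketch below also shows why the b prefix is limited to ASCII: a bytes literal containing a non-ASCII character does not compile. The variable names are illustrative.

```python
# Supplement 1: the b prefix works for ASCII text
ascii_bytes = b'hello'
same_as_encode = (ascii_bytes == 'hello'.encode('utf-8'))

# A non-ASCII character inside a bytes literal is a syntax error,
# so we compile the source text at run time to observe it safely
try:
    compile("b'上'", '<demo>', 'eval')
    literal_ok = True
except SyntaxError:
    literal_ok = False

# Supplement 2: the two spellings produce the same bytes
via_encode = '上'.encode('utf-8')
via_bytes = bytes('上', encoding='utf-8')

print(same_as_encode, literal_ok, via_encode, via_encode == via_bytes)
```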
2. flush: immediately pushes buffered content from memory to the file

with open('h.txt', mode='wt',encoding='utf-8') as f:
    f.write('哈')
    f.flush()
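
One way to see what flush does is to compare the file size on disk before and after the call. This is a sketch that assumes the default buffering of CPython for regular files; the scratch file and variable names are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file
path = os.path.join(tempfile.mkdtemp(), 'h.txt')

with open(path, mode='wt', encoding='utf-8') as f:
    f.write('hello')                      # small write, held in the buffer
    size_before = os.path.getsize(path)   # nothing has reached the file yet
    f.flush()                             # push the buffer to the operating system
    size_after = os.path.getsize(path)

print(size_before, size_after)
```

Flushing on every write defeats the purpose of buffering, so it is usually reserved for cases such as logs that another process reads while they are being written.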

3. For reference

with open('h.txt', mode='wt',encoding='utf-8') as f:
    print(f.readable())
    print(f.writable())
    print(f.encoding)
    print(f.name)
    print(f.closed)

Controlling file pointer movement

1. The pointer always moves in units of bytes
2. There is one exception: in t mode, the n in read(n) is a number of characters

with open('aaa.txt',mode='rt',encoding='utf-8') as f:
    res=f.read(4)
    print(res)
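
The characters-versus-bytes exception is easiest to see by reading the same file both ways. A minimal sketch, using a hypothetical scratch file whose content mixes three-byte and one-byte UTF-8 characters:

```python
import os
import tempfile

# Hypothetical scratch file: two 3-byte characters followed by three 1-byte ones
path = os.path.join(tempfile.mkdtemp(), 'aaa.txt')
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('你好abc')

with open(path, mode='rt', encoding='utf-8') as f:
    chars = f.read(4)   # t mode: four characters

with open(path, mode='rb') as f:
    raw = f.read(4)     # b mode: four bytes, which ends in the middle of '好'

print(repr(chars), raw)
```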

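f.seek(n, mode): n is the number of bytes to move

Mode 0: relative to the start of the file

with open('aaa.txt',mode='rt',encoding='utf-8') as f:
    f.seek(9,0)

Mode 1: relative to the current pointer position

with open('aaa.txt',mode='rb') as f:
    f.seek(9,1)

Mode 2: relative to the end of the file, so the offset should move backwards

with open('aaa.txt',mode='rb') as f:
    f.seek(-9,2)

Note: only mode 0 can be used in t mode; modes 1 and 2 (with a nonzero offset) must be used in b mode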
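
The three seek modes, and the t-mode restriction, can be checked on a small file with known content. This sketch uses a hypothetical ten-byte scratch file so that every position is easy to verify by eye.

```python
import io
import os
import tempfile

# Hypothetical scratch file holding exactly ten ASCII bytes
path = os.path.join(tempfile.mkdtemp(), 'digits.txt')
with open(path, mode='wb') as f:
    f.write(b'0123456789')

with open(path, mode='rb') as f:
    f.seek(3, 0)        # 3 bytes from the start
    pos_start = f.tell()
    f.seek(2, 1)        # 2 bytes forward from the current position
    pos_current = f.tell()
    f.seek(-4, 2)       # 4 bytes back from the end
    pos_end = f.tell()
    tail = f.read()

# In t mode a nonzero end-relative seek is rejected
with open(path, mode='rt', encoding='utf-8') as f:
    try:
        f.seek(-4, 2)
        text_seek_ok = True
    except io.UnsupportedOperation:
        text_seek_ok = False

print(pos_start, pos_current, pos_end, tail, text_seek_ok)
```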

f.tell() # returns the current position of the file pointer

Example:

with open('aaa.txt',mode='rb') as f:
    f.seek(9,0)
    f.seek(3,0) # 3
    # print(f.tell())
    f.seek(4,0)
    res=f.read()
    print(res.decode('utf-8'))

with open('aaa.txt',mode='rb') as f:
    f.seek(9,1)
    f.seek(3,1) # 12
    print(f.tell())

with open('aaa.txt',mode='rb') as f:
    f.seek(-9,2)
    # print(f.tell())
    f.seek(-3,2)
    # print(f.tell())
    print(f.read().decode('utf-8'))

Modifying a file

The content of a.txt is as follows

张一蛋     山东    179    49    12344234523
李二蛋     河北    163    57    13913453521
王全蛋     山西    153    62    18651433422

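Modify operation

with open('a.txt',mode='r+t',encoding='utf-8') as f:
    f.seek(9) # skip the 9 bytes of the first three UTF-8 characters
    f.write('<妇女主任>') # overwrites the bytes that follow instead of inserting

Note:
1. Data on the hard disk cannot be edited in place the way text is in an editor; updating data on disk always means overwriting old content with new content
2. Data in memory can be modified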
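
The overwrite behaviour is easier to see with plain ASCII content. In this sketch (hypothetical scratch file, illustrative names) two characters are written in the middle of the file; they replace what was there and the length stays the same.

```python
import os
import tempfile

# Hypothetical scratch file with eight ASCII characters
path = os.path.join(tempfile.mkdtemp(), 'a.txt')
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('abcdefgh')

# Seek past the first two bytes and write two characters
with open(path, mode='r+t', encoding='utf-8') as f:
    f.seek(2)
    f.write('XY')

with open(path, mode='rt', encoding='utf-8') as f:
    result = f.read()

print(result)  # 'cd' was overwritten; nothing was pushed to the right
```

Because nothing shifts, inserting or deleting content needs one of the two approaches described next.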

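Approach 1: this is how text editors work
Idea: read the whole file into memory at once, modify it in memory, then overwrite the original file with the result
Pros: only one copy of the data exists while the file is being modified
Cons: uses a lot of memory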

with open('c.txt',mode='rt',encoding='utf-8') as f:
    res=f.read()
    data=res.replace('alex','dsb')
    print(data)

with open('c.txt',mode='wt',encoding='utf-8') as f1:
    f1.write(data)

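Approach 2:

Idea: open the original file for reading and a temporary file for writing, read the original line by line, write each modified line to the temporary file, then delete the original and rename the temporary file to the original name
Pros: does not use much memory
Cons: two copies of the data exist while the file is being modified

import os
with open('c.txt', mode='rt', encoding='utf-8') as f, \
        open('.c.txt.swap', mode='wt', encoding='utf-8') as f1:
    for line in f:
        f1.write(line.replace('alex', 'dsb'))
os.remove('c.txt')
os.rename('.c.txt.swap', 'c.txt')

# check the result
with open('c.txt', mode='rt', encoding='utf-8') as f:
    res = f.read()
print(res)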
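
A possible variation on approach 2 is to use os.replace, which renames the temporary file over the original in one step, so there is no moment at which the original has been removed but the new file is not yet in place. The helper name replace_in_file and the '.swap' suffix are assumptions made for this sketch.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite path line by line via a temporary file, replacing old with new."""
    swap = path + '.swap'  # hypothetical name for the temporary file
    with open(path, mode='rt', encoding='utf-8') as src, \
            open(swap, mode='wt', encoding='utf-8') as dst:
        for line in src:
            dst.write(line.replace(old, new))
    os.replace(swap, path)  # rename over the original, overwriting it

# Usage on a hypothetical scratch file
path = os.path.join(tempfile.mkdtemp(), 'c.txt')
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('alex is here\nnothing to change\nalex again\n')

replace_in_file(path, 'alex', 'dsb')

with open(path, mode='rt', encoding='utf-8') as f:
    result = f.read()
print(result)
```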

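Homework

tag=True
with open('db.txt','r',encoding='utf-8') as f:
    while tag:
        name=input("name:")
        pwd=input('pwd:')
        f.seek(0) # rewind, otherwise the second attempt would find the file already exhausted
        for line in f:
            a,b=line.strip("\n").split(':')
            if name== a and pwd ==b:
                print('Login successful')
                tag=False
                break
        else:
            print('Wrong account or password')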

Day 11 homework

1. A general-purpose file copy tool

src_file = input('Enter the source file path: ').strip()
ds_file = input('Enter the destination file path: ').strip()
with open(src_file, 'rb') as src_f, open(ds_file, 'wb') as ds_f:
    # iterate over the file object, not the path string
    for line in src_f:
        ds_f.write(line)

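2. Using seek to move the pointer, test reading and writing in r+, w+ and a+ modes

# r+ mode: writing
with open(r'b.txt','r+',encoding='utf-8') as f:
    f.seek(3,0)
    f.write('123')
# w+ mode: reading (w+ truncates the file on open, so this reads back an empty string)
with open(r'b.txt','w+',encoding='utf-8') as f:
    f.seek(0,0)
    data=f.read()
print(data)
# a+ mode: reading
with open(r'b.txt','a+',encoding='utf-8') as f:
    f.seek(2,0)
    data=f.read()
print(data)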

3. Implementing tail -f access.log

import time
for x in range(111111111):
    with open('access.log','at',encoding='utf-8') as f:
        t=time.strftime("%Y-%m-%d %H:%M:%S")
        count='egon is giving lecture number %s'%x
        msg = f'{t} {count}\n'
        f.write(msg)
    time.sleep(2) # sleep after the file is closed so each line reaches disk right away

Monitoring the log

import time
with open('access.log','rb') as f2:
    f2.seek(0,2) # start from the end of the file
    while True:
        line=f2.readline()
        if len(line)==0: # nothing new yet
            time.sleep(0.5)
        else:
            print(line.decode('utf-8'),end='')
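
The monitoring loop can be split into a small helper that returns whatever complete lines have been appended since the last call, which makes the behaviour testable without an endless loop. This is a sketch; the function name read_new_lines and the handling of half-written lines are assumptions beyond the original exercise.

```python
import os
import tempfile

def read_new_lines(f):
    """Return the complete lines appended to binary file f since the last call."""
    lines = []
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:                    # nothing more to read for now
            break
        if not line.endswith(b'\n'):    # the writer has not finished this line yet
            f.seek(pos)                 # go back and retry on the next call
            break
        lines.append(line.decode('utf-8'))
    return lines

# Usage on a hypothetical log file
path = os.path.join(tempfile.mkdtemp(), 'access.log')
with open(path, mode='wb') as w:
    w.write(b'old line\n')

with open(path, mode='rb') as f:
    f.seek(0, 2)                        # skip what is already in the log
    before = read_new_lines(f)          # nothing has been appended yet
    with open(path, mode='ab') as w:
        w.write(b'new 1\nnew 2\npart')  # 'part' has no newline yet
    after = read_new_lines(f)

print(before, after)
```

A real tail -f would call such a helper in a loop with a short time.sleep between calls, as in the monitoring code above.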

4.
4.1: Write a user login interface
4.2: Write a program in which a user can log in after registering (registration is saved to a file, and login data comes from that file)

tag=True
dic={}
while tag:
    # reload the accounts on every round so a fresh registration is visible
    with open('db.txt', 'rt', encoding='utf-8') as f:
        for line in f:
            db_name, db_pwd = line.strip('\n').split(':')
            dic[db_name] = db_pwd
    name=input('Account: ')
    pwd=input('Password: ')
    if name not in dic:
        print('Account does not exist')
        cmd = input('Register an account? Enter y for yes, n for no: ')
        cmd = cmd.lower()
        if cmd == 'y':
            new_name = input('Account: ')
            new_pwd = input('Password: ')
            with open('db.txt', 'a', encoding='utf-8') as f1:
                info = '{}:{}\n'.format(new_name, new_pwd)
                f1.write(info)
                print('Registered, please log in again')

    elif pwd==dic[name]:
        print('Login successful!')
        tag = False
    else:
        print('Wrong password')

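5. Complete the upgrade requirements for the two examples below
Example 1: registration
name = input("your name: ").strip()
# Validation:
# 1. If the username contains special characters such as ^$&..., ask the user to enter it again
# 2. If the username already exists, ask again as well
pwd = input("your password: ").strip()
# Validation:
# 1. Password length
# 2. If the password contains special characters, ask again
f = open('user.txt',mode='at',encoding='utf-8')
f.write('%s:%s\n' %(name,pwd))
f.close()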

tag=True
dic={}
while tag:
    with open('db.txt', 'rt', encoding='utf-8') as f:
        for line in f:
            db_name, db_pwd = line.strip('\n').split(':')
            dic[db_name] = db_pwd
    name=input('Account: ')
    pwd=input('Password: ')
    if name not in dic:
        print('Account does not exist')
        cmd = input('Register an account? Enter y for yes, n for no: ')
        cmd = cmd.lower()
        if cmd == 'y':
            while True:
                new_name = input('Account: ').strip()
                if new_name in dic:
                    print('Account already exists, please enter another one!')
                    continue
                elif not new_name.isalpha():
                    print('Only letters are allowed')
                    continue
                while True:
                    new_pwd = input('Password: ').strip()
                    if len(new_pwd)<8:
                        print('The password must be at least 8 characters')
                        continue
                    elif not new_pwd.isalnum():
                        print('The password may contain only letters and digits')
                        continue
                    break
                with open('db.txt', 'a', encoding='utf-8') as f1:
                    info = '{}:{}\n'.format(new_name, new_pwd)
                    f1.write(info)
                print('Registered, please log in again')
                break
    elif pwd==dic[name]:
        print('Login successful!')
        tag = False
    else:
        print('Wrong password')

Example 2: login
inp_name = input("your name: ").strip()
inp_pwd = input("your pwd: ").strip()
f = open('user.txt',mode='rt',encoding='utf-8')
for line in f:
    user,pwd=line.strip('\n').split(':')
    if inp_name == user and inp_pwd == pwd:
        print('login successful')
        break
else:
    print('user or password error')
f.close()
Upgrade requirement 1: exit after three wrong attempts on the same account
Upgrade requirement 2: after three wrong attempts on the same account, lock that account for 10 seconds; the timer keeps running even if the program is terminated

import time

# Each line of info.txt: name:password:failed_count:locked_until (a Unix timestamp)
info_dic = {}
with open('info.txt', 'rt', encoding='utf-8') as f:
    for line in f:
        info_name, info_pwd, info_count, info_lock = line.strip('\n').split(':')
        info_dic[info_name] = [info_pwd, int(info_count), float(info_lock)]

logged_in = False
while not logged_in:
    name = input('Account: ').strip()
    pwd = input('Password: ').strip()
    if name not in info_dic:
        print('Account does not exist')
        continue
    user = info_dic[name]
    if user[2] > time.time():
        # the unlock time is stored in the file, so the lock survives a restart
        print('Account locked, try again in %d s' % (user[2] - time.time() + 1))
        continue
    if pwd == user[0]:
        print('Login successful')
        user[1] = 0
        logged_in = True
    else:
        print('Wrong password')
        user[1] += 1  # count failures per account, not per file line
        if user[1] >= 3:
            print('Account locked, please try again in 10 s')
            user[1] = 0
            user[2] = time.time() + 10
    # persist the counters and lock times after every attempt
    with open('info.txt', 'w', encoding='utf-8') as f1:
        for n, (p, c, l) in info_dic.items():
            f1.write('{}:{}:{}:{}\n'.format(n, p, c, l))
