Chapter_3_excel_files

3.1 内省Excel工作表

.XLS

import sys
from xlrd import open_workbook

input_file = sys.argv[1]
workbook = open_workbook(input_file)
print('Number of worksheets:', workbook.nsheets)
for worksheet in workbook.sheets():
    print("Worksheet name:", worksheet.name, "\tRow:", worksheet.nrows, \
          "\tColumns:", worksheet.ncols)
  • xldr模块不再支持.xlsx格式,xlwt模块不支持.xlsx格式。
  • workbook.nsheetsworksheet.nameworksheet.nrowsworksheet.ncols属性分别储存了文件内工作表个数、工作表名、工作表行数、工作表列数。

.XLSX

尝试使用openpyxl模块读取.xlsx格式,但其并不支持.xls格式。

import sys
from openpyxl import load_workbook

input_file = sys.argv[1]
wb = load_workbook(input_file, read_only=True, data_only=True)
print(f"Number of worksheets: {len(wb.sheetnames)}")
for worksheet in wb.worksheets:
    print("Worksheet name:", worksheet.title , "\tRow:", worksheet.max_row, \
          "\tColumns:", worksheet.max_column)
wb.close()

There are several flags that can be used in load_workbook.
data_only controls whether cells with formulae have either the formula (default) or the value stored the last time Excel read the sheet.
keep_vba controls whether any Visual Basic elements are preserved or not (default). If they are preserved they are still not editable.

  • load_workbook一定要添加read_only=True的参数,否则会报错>>>UserWarning: Unknown extension is not supported and will be removed。与普通工作簿不同,只读工作簿将使用延迟加载,结尾一定要有wb.close()

  • 相应的,wb.sheetnames属性储存了文件内所有工作表名,wb.active属性,会获取当前活动工作表,worksheet.titleworksheet.max_rowworksheet.max_column属性则分别储存了工作表名、工作表行数、工作表列数。

  • 只写工作簿wb = Workbook(write_only=True)与普通工作簿不同,新创建的只写工作簿不包含任何工作表;必须使用该create_sheet()方法专门创建工作表。

  • 常规openpyxl.worksheet.worksheet.Worksheet已被更快的替代品openpyxl.worksheet._write_only.WriteOnlyWorksheet. 当您想转储大量数据时,请确保已安装lxml。

  • 在只写工作簿中,写入行只能用 append(),写单元格只能用from openpyxl.cell import WriteOnlyCell 。在任意位置读写不能使用cell()iter_rows()。例如:ws.append([‘Hello world!’, 3.14, None]),使用列来写入。

  • 只写文档能够导出无限量的数据(甚至超过 Excel 实际可以处理的数据量),同时将内存使用量保持在 10Mb 以下。

  • 只写工作簿只能保存一次。之后每次尝试储存工作簿或 append() 到现有工作表都会引发openpyxl.utils.exceptions.WorkbookAlreadySaved 异常。

  • 在实际单元格数据之前出现在文件中的所有内容都必须在添加单元格之前创建,因为它必须在此之前写入文件。例如,应在添加单元格之前设置freeze_panes(冻结窗格)。

3.2 处理单个工作表

读写Excel文件

基础读写

.XLS

import sys
from xlrd import open_workbook
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('jan_2013_output')

workbook = open_workbook(input_file)
worksheet = workbook.sheet_by_name('january_2013')
for row_index in range(worksheet.nrows):
    for column_index in range(worksheet.ncols):
        output_worksheet.write(row_index, column_index, worksheet.cell_value(row_index, column_index))
output_workbook.save(output_file)
  • output_workbook = Workbook()实例化一个xlwt的Workbook对象,可以将结果写入用于输出的excel文件。
  • output_worksheet = output_workbook.add_sheet('jan_2013_output')为输出工作簿添加一个工作表,新建文件则为当前活动表。
  • worksheet.cell_value()接受两个参数,分别为行、列值的索引(从0开始)。
  • output_worksheet.write()接收三个参数,分别为行、列值的索引(从0开始)和值。

.XLSX

import sys
from openpyxl import load_workbook
from openpyxl import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
worksheet1 = output_workbook.active
worksheet1.title = "jan_2013_output"
input_workbook = load_workbook(input_file, read_only=True)
input_worksheet = input_workbook['january_2013']
for i in range(1, input_worksheet.max_row+1):
    for j in range(1, input_worksheet.max_column+1):
        worksheet1.cell(row = i, column = j, value = input_worksheet.cell(row = i, column = j).value)
output_workbook.save(output_file)
input_workbook.close()
  • 使用worksheet1 = output_workbook.active获取当前活动表属性并利用worksheet1.title()修改名称。
  • 如果直接使用worksheet1 = output_workbook.create_sheet("jan_2013_output")则会在“Sheet 1”后新建名为"jan_2013_output"的工作表。
  • Cell 对象的 value, row, column 和 coordinate 属性,存放的是该单元格的所保存的值,行列索引(从1开始)和位置信息(如 'A1')。也可以通过 sheet 的 cell 方法直接指定行(row)与列(column),来获取单元格 Cell 对象。

Pandas

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, sheet_name='january_2013')

writer = pd.ExcelWriter(output_file)
data_frame.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()

pandas.read_excel(io, sheet_name=0, header=0, names=None, index_col=None, usecols=None, squeeze=False, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, parse_dates=False, date_parser=None, thousands=None, comment=None, skipfooter=0, convert_float=None, mangle_dupe_cols=True, storage_options=None)

  • pd.read_excel()方法读取excel文件中的某工作表。
  • 通过writer变量储存pd.ExcelWriter(output_file),用于将 DataFrame 对象写入 Excel 工作表的类,使用data_frame.to_excel()将对象写入 Excel 工作表。
  • 由于pandas1.3.0以上的版本使用openpyxl模块读取.xlsx文件,因此同样会报错UserWarning: Unknown extension is not supported and will be removed,并且不能添加read_only=True参数来防止报错,但对输出结果没有影响。

格式化日期数据

.XLS

import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('jan_2013_output')
with open_workbook(input_file) as workbook:
    worksheet = workbook.sheet_by_name('january_2013')
    for row_index in range(worksheet.nrows):
        row_list_output = []
        for col_index in range(worksheet.ncols):
            if worksheet.cell_type(row_index, col_index) == 3:
                date_cell = xldate_as_tuple(worksheet.cell_value(row_index, col_index), workbook.datemode)
                date_cell = date(*date_cell[0: 3]).strftime('%m/%d/%y')
                row_list_output.append(date_cell)
                output_worksheet.write(row_index, col_index, date_cell)
            else:
                non_date_cell = worksheet.cell_value(row_index, col_index)
                row_list_output.append(non_date_cell)
                output_worksheet.write(row_index, col_index, non_date_cell)
output_workbook.save(output_file)
  • xldate_as_tuple函数可以将Excel中代表日期时间的数值转化为元组,只要成为元组便可以提取出具体时间元素,并格式化为不同格式。
  • worksheet.cell_type(i, j)储存了单元格类型,具体类型数字为下表:
Type symbol Type number Python value
XL_CELL_EMPTY 0 empty string ''
XL_CELL_TEXT 1 a Unicode string
XL_CELL_NUMBER 2 float
XL_CELL_DATE 3 float
XL_CELL_BOOLEAN 4 int; 1 means TRUE, 0 means FALSE
XL_CELL_ERROR 5 int representing internal Excel codes; for a text representation, refer to the supplied dictionary error_text_from_code
XL_CELL_BLANK 6 empty string ''. Note: this type will appear only when open_workbook(..., formatting_info=True) is used.
  • xldate_as_tuple(worksheet.cell_value(row_index, col_index), workbook.datemode)workbook.datemode参数是必须的,用以确定日期基于1900还是1904年并转化为元组(year, month, day, hour, minute, nearest_second)。
  • 元组或者列表前面加*,就可以将列表或元组中的元素拆分成独立元素,不用进行遍历。

.XLSX

import sys
from openpyxl import load_workbook
from openpyxl import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.active
output_worksheet.title = 'jan_2013_output'
input_workbook = load_workbook(input_file, read_only=True)
input_worksheet = input_workbook['january_2013']
for i in range(1, input_worksheet.max_row+1):
    for j in range(1, input_worksheet.max_column+1):
        if input_worksheet.cell(row=i, column=j).is_date:
            date_cell = input_worksheet.cell(row=i, column=j).value.strftime('%m/%d/%Y')
            output_worksheet.cell(i, j, date_cell)
        else:
            output_worksheet.cell(row = i, column = j, value = input_worksheet.cell(row = i, column = j).value)
output_workbook.save(output_file)
input_workbook.close()

人麻了=。=,查了好久到底怎么才能让openpyxl自动识别单元格的时间格式,查来查去全是怎么处理时间的问题,这些老外不知道strptime()和strftime()吗怎么比我还菜,最后查看模块文档中cell的属性发现有一个is_date的布尔值属性,绕了半天其实人都给你做好了,气死。

  • 使用cell().is_date的布尔值来区分日期与其他类型的单元格,使用strftime()格式化日期,最后添加到输出表格的对应位置。
  • cell.get_type()函数可以获取单元格类型,数字返回'n',字符串则返回's',时间则返回'd'

查到可以使用NamedStyle模块定义一种格式并赋予某些单元格这个格式,当然excel会识别到你用的格式化后的格式(DD/MM/YYYY)但不会识别出这是时间。

date_style = NamedStyle(name='datetime', number_format='DD/MM/YYYY')
ws.cell(i, j).style = date_style

筛选特定行

行中的值满足某个条件

.XLS

import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('jan_2013_output')

sale_amount_value_index = 3
with open_workbook(input_file) as workbook:
    worksheet = workbook.sheet_by_name('january_2013')
    data = []
    header = worksheet.row_values(0)
    data.append(header)
    for row_index in range(1, worksheet.nrows):
        sale = worksheet.cell_value(row_index, sale_amount_value_index)
        sale_amount = float(str(sale).strip('$').replace(',', ''))
        if sale_amount > 1_400.0:
            row_list = []
            for column_index in range(worksheet.ncols):
                cell_value = worksheet.cell_value(row_index, column_index)
                cell_type = worksheet.cell_type(row_index, column_index)
                if cell_type == 3:
                    date_cell = xldate_as_tuple(cell_value, workbook.datemode)
                    date_cell = date(*date_cell[0:3]).strftime('%m%d%Y')
                    row_list.append(date_cell)
                else:
                    row_list.append(cell_value)
            data.append(row_list)
            
    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)
output_workbook.save(output_file)
  • 先通过定义sale_amount_value_index=3储存'sale amount'列所在的索引值。在遍历表格中每一行时,新建一个row_list列表用于储存每一列的值,同时取出每个'sale amount'列中单元格的值,转化为浮点数储存在sale_amount中,因为想筛选出大于$1400.00的行,使用这个变量作为if语句的验证条件。

  • 在if语句中,使用for遍历满足条件的行的每一项(列),并且获取该列单元格的值和类型,检查单元格类型是否为日期,是则将其格式化为'%m%d%Y'格式添加至row_list,不是则直接添加。

  • 而后通过if语句判断row_list是否为空,并将不为空的列添加至data列表中。(但我发现作者搞复杂了,可以直接在if sale_amount > 1400.0:后新建row_list,在for循环结束后添加至data列表,省去了判断列表是否为空的步骤。因此我直接修改代码简化。)

  • 最后再次使用fordata列表本身和内部各个列表进行迭代,根据列表内的索引依次写入表格文件,如果直接在前面的for遍历里写入excel的话,会使用原有表格中的行索引值导致出现缺口。

.XLSX

import sys
from openpyxl import WorkBook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_sheet = output_workbook.active
output_sheet.title = 'jan_2013_output'
input_workbook = load_workbook(input_file, read_only=True)
input_sheet = input_workbook['january_2013']

sale_amount_value_index = 4
data = []
header = [cell.value for cell in input_sheet[1]]
data.append(header)
for i in range(2, input_sheet.max_row+1):
    sale = float(str(input_sheet.cell(i, sale_amount_value_index).value).strip('$').replace(',', ''))
    if sale > 1400.00:
        row_list = []
        for j in range(1, input_sheet.max_column+1):
            if input_sheet.cell(i, j).is_date:
                date_cell = input_sheet.cell(i, j).value.strftime('%m/%d/%Y')
                row_list.append(date_cell)
            else:
                cell_value = input_sheet.cell(i, j).value
                row_list.append(cell_value)
        data.append(row_list)

for i, row_list in enumerate(data):
    for j, cell_value in enumerate(row_list):
        output_sheet.cell(i+1, j+1, cell_value)
output_workbook.save(output_file)
input_workbook.close()
  • 事实上,在提取'Sale Amount‘值的时候可以直接用float()函数将字符串中的货币符号,逗号等非数字内容化掉。因此,第18行可以简化为:sale = float(input_sheet.cell(i, sale_amount_value_index).value)

  • 使用openpyxl模块时,不能像xlrd模块那样,直接用worksheet.row_values(0)函数获取首行值,应当使用列表解析形式将首行值存储在header中。

  • 注意xlrd和xlwt行列索引从0开始,openpyxl行列索引从1开始。

Pandas

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013')
data_frame_value_meets_condition = data_frame[data_frame['Sale Amount'].astype(float) > 1400.0]

writer = pd.ExcelWriter(output_file)
data_frame_value_meets_condition.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()

行中的值属于某个集合

.XLS

import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('jan_2013_output')

important_dates = ['01/24/2013', '01/31/2013']

purchase_date_column_index = 4
with open_workbook(input_file) as workbook:
    worksheet = workbook.sheet_by_name('january_2013')
    data = []
    header = worksheet.row_values(0)
    data.append(header)
    for row_index in range(1, worksheet.nrows):        
        purchase_datetime = xldate_as_tuple(worksheet.cell_value(row_index, purchase_date_column_index),workbook.datemode)
        purchase_date = date(*purchase_datetime[0:3]).strftime('%m/%d/%Y')
        row_list = []
        if purchase_date in important_dates:
            for column_index in range(worksheet.ncols):
                cell_value = worksheet.cell_value(row_index,column_index)
                cell_type = worksheet.cell_type(row_index, column_index)
                if cell_type == 3:
                    date_cell = xldate_as_tuple(cell_value,workbook.datemode)
                    date_cell = date(*date_cell[0:3]).strftime('%m/%d/%Y')
                    row_list.append(date_cell)
                else:
                    row_list.append(cell_value)
        if row_list:
            data.append(row_list)

    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)

.XLSX

import sys
from openpyxl import Workbook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook()
ows = owb.active
ows.title = 'jan_2013_output'
iwb = load_workbook(input_file, read_only=True)
iws = iwb['january_2013']

important_dates = ['01/24/2013', '01/31/2013']

purchase_date_column_index = 5
data = []
header = [cell.value for cell in iws[1]]
data.append(header)
for i in range(2, iws.max_row+1):
    pdate = iws.cell(i, purchase_date_column_index).value.strftime('%m/%d/%Y')
    if pdate in important_dates:
        r_list = []
        for j in range(1, iws.max_column+1):
            if iws.cell(i, j).is_date:
                r_list.append(pdate)
            else:
                r_list.append(iws.cell(i, j).value)
        data.append(r_list)
        
for i, row in enumerate(data):
    for j, value in enumerate(row):
        ows.cell(i+1, j+1, value)
owb.save(output_file)
iwb.close()

Pandas

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

important_dates = ['01/24/2013', '01/31/2013']
data_frame = pd.read_excel(input_file, 'january_2013')
data_frame_value_in_set = data_frame[data_frame['Purchase Date'].isin(important_dates)]
writer = pd.ExcelWriter(output_file)
data_frame_value_in_set.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()
  • 使用Pandas处理表格基本流程为读取工作表并储存至data_frame变量,然后根据条件筛选,data_frame['Purchase Date'].isin(important_dates)返回值则为除了列标题以外其他各行的索引值和条件筛选后的布尔值。再创建一个wirter变量用于储存输出内容与路径pd.ExcelWriter(output_file),最后使用to_excel()函数输出文件,writer, sheet_name='jan_13_output', index=False参数为输出内容,工作表名和是否保留行索引。结尾一定要有writer.save()否则python无法保存并关闭文件。
  • 条件筛选依旧可以使用data_frame.loc[(data_frame['Purchase Date'].isin(important_dates)), :]选中满足条件的行。

行中值匹配特定模式

.XLS件

import re
import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('jan_2013_output')

pattern = re.compile(r'(?P<my_pattern>^J.*)')

customer_name_column_index = 1
with open_workbook(input_file) as workbook:
    worksheet = workbook.sheet_by_name('january_2013')
    data = []
    header = worksheet.row_values(0)
    data.append(header)
    for row_index in range(1, worksheet.nrows):
        if pattern.search(worksheet.cell_value(row_index, customer_name_column_index)):
            row_list = []
            for column_index in range(worksheet.ncols):
                cell_value = worksheet.cell_value(row_index,column_index)
                cell_type = worksheet.cell_type(row_index, column_index)
                if cell_type == 3:
                    date_cell = xldate_as_tuple(cell_value, workbook.datemode)
                    date_cell = date(*date_cell[0:3]).strftime('%m/%d/%Y')
                    row_list.append(date_cell)
                else:
                    row_list.append(cell_value)
            data.append(row_list)

    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)

.XLSX件

import re
import sys
from openpyxl import Workbook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook()
ows = owb.active
ows.title = 'jan_2013_output'
iwb = load_workbook(input_file, read_only=True)
iws = iwb['january_2013']
pattern = re.compile(r'(?P<my_pattern>^J.*)')
data = []
header = [cell.value for cell in iws[1]]
data.append(header)

for i in range(2, iws.max_row+1):
    if pattern.search(iws.cell(i, 2).value):
        r_list = []
        for j in range(1, iws.max_column+1):
            cell_value = iws.cell(i, j).value
            if iws.cell(i, j).is_date:
                date_value = cell_value.strftime('%m/%d/%Y')
                r_list.append(date_value)
            else:
                r_list.append(cell_value)
        data.append(r_list)
        
for i, row in enumerate(data):
    for j, cell_value in enumerate(row):
        ows.cell(i+1, j+1, cell_value)
owb.save(output_file)
iwb.close()

Pandas

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013')
data_frame_value_matches_pattern1 = data_frame[data_frame['Customer Name'].str.startswith('J')]
data_frame_value_matches_pattern = data_frame_value_matches_pattern1.copy()
data_frame_value_matches_pattern['Purchase Date'] = data_frame_value_matches_pattern1['Purchase Date'].dt.strftime('%m/%d/%Y')
writer = pd.ExcelWriter(output_file)
data_frame_value_matches_pattern.to_excel(writer, sheet_name='jan_2013_output', index=False)
writer.save()
  • 匹配模式本身不难,格式化时间花费了我很多时间来完成。值得注意的是,如果将某一列处理后重新赋值给本身,会出现SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.错误,==目前理解为,pandas会重复引用这个值对本身赋值==,因此创建一个副本引用原始dataframe便可以解决问题,此外,使用pd.options.mode.chained_assignment = None将链式赋值关闭也可以解决问题。

选取特定列

列索引值

.XLS
import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('jan_2013_output')

my_columns = [1, 4]

with open_workbook(input_file) as workbook:
    worksheet = workbook.sheet_by_name('january_2013')
    data = []
    for row_index in range(worksheet.nrows):
        row_list = []
        for column_index in my_columns:
            cell_value = worksheet.cell_value(row_index,column_index)
            cell_type = worksheet.cell_type(row_index, column_index)
            if cell_type == 3:
                date_cell = xldate_as_tuple(cell_value,workbook.datemode)
                date_cell = date(*date_cell[0:3]).strftime('%m/%d/%Y')
                row_list.append(date_cell)
            else:
                row_list.append(cell_value)
        data.append(row_list)

    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)
.XLSX
import sys
from openpyxl import Workbook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook()
ows = owb.active
ows.title = 'jan_2013_output'
iwb = load_workbook(input_file, read_only=True)
iws = iwb['january_2013']
my_columns = [2, 5]
data = []
for i in range(1, iws.max_row+1):
    r_list = []
    for j in my_columns:
        cell_value = iws.cell(i, j).value
        if iws.cell(i, j).is_date:
            date_value = cell_value.strftime('%m/%d/%Y')
            r_list.append(date_value)
        else:
            r_list.append(cell_value)
    data.append(r_list)
for i, row in enumerate(data):
    for j, cell_value in enumerate(row):
        ows.cell(i+1, j+1, cell_value)
owb.save(output_file)
iwb.close()
Pandas
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
my_columns = [1, 4]

data_frame = pd.read_excel(input_file, 'january_2013')
data_frame_column_by_index = data_frame.iloc[:, my_columns]
writer = pd.ExcelWriter(output_file)
data_frame_column_by_index.to_excel(writer, sheet_name='jan_2013_output', index=False)
writer.save()

列标题

.XLS
import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('jan_2013_output')

my_columns = ['Customer ID', 'Purchase Date']

with open_workbook(input_file) as workbook:
    worksheet = workbook.sheet_by_name('january_2013')
    data = [my_columns]
    header_list = worksheet.row_values(0)
    header_index_list = []
    for header_index in range(len(header_list)):
        if header_list[header_index] in my_columns:
            header_index_list.append(header_index)
    for row_index in range(1,worksheet.nrows):
        row_list = []
        for column_index in header_index_list:
            cell_value = worksheet.cell_value(row_index,column_index)
            cell_type = worksheet.cell_type(row_index, column_index)
            if cell_type == 3:
                date_cell = xldate_as_tuple(cell_value,workbook.datemode)
                date_cell = date(*date_cell[0:3]).strftime('%m/%d/%Y')
                row_list.append(date_cell)
            else:
                row_list.append(cell_value)
        data.append(row_list)

    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)
.XLSX
import sys
from openpyxl import Workbook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook()
ows = owb.active
ows.title = 'jan_2013_output'
iwb = load_workbook(input_file, read_only=True)
iws = iwb['january_2013']
data = []
my_columns = ['Customer ID', 'Purchase Date']
data.append(my_columns)
header = [cell.value for cell in iws[1]]
header_indexes = []
for header_index in range(1, iws.max_column+1):
    if header[header_index-1] in my_columns:
        header_indexes.append(header_index)
for i in range(2, iws.max_row+1):
    r_list = []
    for j in header_indexes:
        cell_value = iws.cell(i, j).value
        if iws.cell(i, j).is_date:
            date_value = cell_value.strftime("%m/%d/%Y")
            r_list.append(date_value)
        else:
            r_list.append(cell_value)
    data.append(r_list)
for i, row in enumerate(data):
    for j, cell_value in enumerate(row):
        ows.cell(i+1, j+1, cell_value)
owb.save(output_file)
iwb.close()
  • openpyxl的难点在于行列索引值与excel一致,从1开始,因此获取列号需要注意是否正确。
Pandas
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013', index_col=None)
data_frame_column_by_name = data_frame.loc[:, ['Customer ID', 'Purchase Date']]
writer = pd.ExcelWriter(output_file)
data_frame_column_by_name.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()

3.3 读取工作簿中的所有工作表

在所有工作表中筛选行

.XLS

import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('filtered_rows_all_worksheets')

sales_column_index = 3
threshold = 2000.0

first_worksheet = True
with open_workbook(input_file) as workbook:
    data = []
    for worksheet in workbook.sheets():
        if first_worksheet:
            header_row = worksheet.row_values(0)
            data.append(header_row)
            first_worksheet = False
        for row_index in range(1,worksheet.nrows):
            row_list = []
            sale_amount = worksheet.cell_value(row_index, sales_column_index)
            sale_amount = float(sale_amount)
            if sale_amount > threshold:
                for column_index in range(worksheet.ncols):
                    cell_value = worksheet.cell_value(row_index,column_index)
                    cell_type = worksheet.cell_type(row_index, column_index)
                    if cell_type == 3:
                        date_cell = xldate_as_tuple(cell_value,workbook.datemode)
                        date_cell = date(*date_cell[0:3]).strftime('%m/%d/%Y')
                        row_list.append(date_cell)
                    else:
                        row_list.append(cell_value)
            if row_list:
                data.append(row_list)

    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)

.XLSX

import sys
from openpyxl import Workbook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook(write_only=True)
ows = owb.create_sheet(title='filtered_rows_all_worksheets')
iwb = load_workbook(input_file, read_only=True)
first_sheet = True
sc_column_index = 4
threshold = 2000.0

for sheet in iwb.worksheets:
    if first_sheet:
        header = [cell.value for cell in sheet[1]]
        ows.append(header)
        first_sheet = False
    for i in range(2, sheet.max_row+1):
        sale_amount = float(sheet.cell(i, sc_column_index).value)
        if sale_amount > threshold:
            r_list = []
            for j in range(1, sheet.max_column+1):
                cell_value = sheet.cell(i, j).value
                if sheet.cell(i, j).is_date:
                    date_value = cell_value.strftime('%m/%d/%Y')
                    r_list.append(date_value)
                else:
                    r_list.append(cell_value)
            ows.append(r_list)
owb.save(output_file)
iwb.close()
  • 在这个部分使用了只写型工作簿,直接使用append()函数将各列添加到输出工作表内。值得注意的是,小心if语句结束时for语句的位置,否则可能只遍历了第一个工作表就结束了。

Pandas

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, sheet_name=None, index_col=None)
row_output = []
for worksheet_name, data in data_frame.items():
    row_output.append(data[data['Sale Amount'].astype(float) > 2000.0])
filtered_rows = pd.concat(row_output, axis=0, ignore_index=True)

writer = pd.ExcelWriter(output_file)
filtered_rows.to_excel(writer, sheet_name='sale_amount_gt2000', index=False)
writer.save()
  • data_frame=pd.read_excel(input_file, sheet_name=None, index_col=None)参数sheet_name=None可以读取所有工作表。
  • concat()方法用于连接两个或多个数组。该方法不会改变现有的数组,而仅仅会返回被连接数组的一个副本。代码中将所有数据框连接成为一个数据框,axis参数用于设置方法,0表示垂直堆叠,1表示并排平行堆叠。
  • 返回data_frame如下(为了看清格式,在输出内容的每个字典键值对后加了换行符),pandas将每个工作表和所有内容作为键值对储存在了一个字典里:

{'january_2013': Customer ID Customer Name Invoice Number Sale Amount Purchase Date
0 1234 John Smith 100-0002 1200 2013-01-01
1 2345 Mary Harrison 100-0003 1425 2013-01-06
2 3456 Lucy Gomez 100-0004 1390 2013-01-11
3 4567 Rupert Jones 100-0005 1257 2013-01-18
4 5678 Jenny Walters 100-0006 1725 2013-01-24
5 6789 Samantha Donaldson 100-0007 1995 2013-01-31,

'february_2013': Customer ID Customer Name Invoice Number Sale Amount Purchase Date
0 9876 Daniel Farber 100-0008 1115 2013-02-02
1 8765 Laney Stone 100-0009 1367 2013-02-08
2 7654 Roger Lipney 100-0010 2135 2013-02-15
3 6543 Thomas Haines 100-0011 1346 2013-02-17
4 5432 Anushka Vaz 100-0012 1560 2013-02-21
5 4321 Harriet Cooper 100-0013 1852 2013-02-25,

'march_2013': Customer ID Customer Name Invoice Number Sale Amount Purchase Date
0 1234 John Smith 100-0014 1350 2013-03-04
1 8765 Tony Song 100-0015 1167 2013-03-08
2 2345 Mary Harrison 100-0016 1789 2013-03-17
3 6543 Rachel Paz 100-0017 2042 2013-03-22
4 3456 Lucy Gomez 100-0018 1511 2013-03-28
5 4321 Susan Wallace 100-0019 2280 2013-03-30}

在所有工作表中选取特定列

.XLS

import sys
from xlrd import open_workbook
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('selected_columns_all_worksheets')

my_columns = ['Customer Name', 'Sale Amount']

first_worksheet = True
with open_workbook(input_file) as workbook:
    data = [my_columns]
    index_of_cols_to_keep = []
    for worksheet in workbook.sheets():
        if first_worksheet:
            header = worksheet.row_values(0)
            for column_index in range(len(header)):
                if header[column_index] in my_columns:
                    index_of_cols_to_keep.append(column_index)
            first_worksheet = False
        for row_index in range(1, worksheet.nrows):
            row_list = []
            for column_index in index_of_cols_to_keep:
                cell_value = worksheet.cell_value(row_index, column_index)
                row_list.append(cell_value)
            data.append(row_list)

    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)

.XLSX

import sys
from openpyxl import Workbook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook(write_only=True)
ows = owb.create_sheet(title='selected_columns_all_worksheets')
iwb = load_workbook(input_file, read_only=True)
my_columns = ['Customer Name', 'Sale Amount']
header_indexes = []
ows.append(my_columns)
first_sheet = True
for sheet in iwb.worksheets:
    if first_sheet:
        header = [cell.value for cell in sheet[1]]
        for col_index in range(1, sheet.max_column+1):
            if header[col_index-1] in my_columns:
                header_indexes.append(col_index)
        first_sheet = False
    for i in range(2, sheet.max_row+1):
        r_list = []
        for j in header_indexes:
            r_list.append(sheet.cell(i, j).value)
        ows.append(r_list)
owb.save(output_file)
iwb.close()
  • 通过设置first_sheet=True存储布尔值从而仅获取第一个表格的列标题,然后获取所需列对应的列索引值,最后根据索引值将表格内容添加到输出表格。

Pandas

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, sheet_name=None, index_col=None)

column_output = []
for data in data_frame.values():
    column_output.append(data.loc[:, ['Customer Name', 'Sale Amount']])
all_column_output = pd.concat(column_output, axis=0, ignore_index=True)

writer = pd.ExcelWriter(output_file)
all_column_output.to_excel(writer, sheet_name='selected_columns_all_worksheets', index=False)
writer.save()

3.4 在EXCEL工作簿中读取一组工作表

在一组工作表中筛选特定行

.XLS

import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('set_of_worksheets')

my_sheets = [0,1]
threshold = 1900.0
sales_column_index = 3

first_worksheet = True
with open_workbook(input_file) as workbook:
    data = []
    for sheet_index in range(workbook.nsheets):
        if sheet_index in my_sheets:
            worksheet = workbook.sheet_by_index(sheet_index)
            if first_worksheet:
                header_row = worksheet.row_values(0)
                data.append(header_row)
                first_worksheet = False
            for row_index in range(1, worksheet.nrows):
                sale_amount = worksheet.cell_value(row_index, sales_column_index)
                if sale_amount > threshold:
                    row_list = []
                    for column_index in range(worksheet.ncols):
                        cell_value = worksheet.cell_value(row_index,column_index)
                        cell_type = worksheet.cell_type(row_index, column_index)
                        if cell_type == 3:
                            date_cell = xldate_as_tuple(cell_value,workbook.datemode)
                            date_cell = date(*date_cell[0:3]).strftime('%m/%d/%Y')
                            row_list.append(date_cell)
                        else:
                            row_list.append(cell_value)
                    data.append(row_list)

    for list_index, output_list in enumerate(data):
        for element_index, element in enumerate(output_list):
            output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)
  • .xls如果不处理时间格式,输出文件中日期将会被转化为数字。

.XLSX

import sys
from openpyxl import Workbook, load_workbook

input_file = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook(write_only=True)
ows = owb.create_sheet(title='set_of_worksheets', index=None)
iwb = load_workbook(input_file, read_only=True)
my_sheets = [0, 1]
threshold = 1900.0
sales_column_index = 4
first_worksheet = True
for sheet_index, stname in enumerate(iwb.iwb.worksheets):
    if sheet_index in my_sheets:
        my_sheet = iwb.worksheets[sheet_index]
        if first_worksheet:
            header = [cell.value for cell in my_sheet[1]]
            ows.append(header)
            first_worksheet = False
        for i in range(2, my_sheet.max_row+1):
            sales_value = float(my_sheet.cell(i, sales_column_index).value)
            if sales_value > threshold:
                r_list = []
                for j in range(1, my_sheet.max_column+1):
                    cell_value = my_sheet.cell(i, j).value
                    r_list.append(cell_value)
                ows.append(r_list)
owb.save(output_file)
iwb.close()
  • openpyxl中,wb.worksheets[]可以通过从0开始的索引值读取工作表,而wb[sheet_name]通过工作表名读取。
  • 主要是卡在了第14-16行,遍历和储存工作表部分一直有误。

Pandas

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

my_sheets = [0,1]
threshold = 1900.0

data_frame = pd.read_excel(input_file, sheet_name=my_sheets, index_col=None)

row_list = []
for worksheet_name, data in data_frame.items():
    row_list.append(data[data['Sale Amount'].astype(float) > threshold])
filtered_rows = pd.concat(row_list, axis=0, ignore_index=True)

writer = pd.ExcelWriter(output_file)
filtered_rows.to_excel(writer, sheet_name='set_of_worksheets', index=False)
writer.save()
  • Pandas读取excel时,sheet_name既可以使用字符串形式的工作表名,也可以使用从0开始的索引值,None则表示所有工作表。

3.5 处理多个工作簿

工作表计数以及每个工作表中的行列计算

.XLS

import glob
import os
import sys
from xlrd import open_workbook

input_directory = sys.argv[1]

workbook_counter = 0
for input_file in glob.glob(os.path.join(input_directory, '*.xls*')):
    workbook = open_workbook(input_file)
    print('Workbook: {}'.format(os.path.basename(input_file)))
    print('Number of worksheets: {}'.format(workbook.nsheets))
    for worksheet in workbook.sheets():
        print('Worksheet name:', worksheet.name, '\tRows:',\
                worksheet.nrows, '\tColumns:', worksheet.ncols)
    workbook_counter += 1
print('Number of Excel workbooks: {}'.format(workbook_counter))

.XLSX

import glob
import os
import sys
from openpyxl import load_workbook

input_directory = sys.argv[1]
workbook_counter = 0
for input_file in glob.glob(os.path.join(input_directory, r"*.xlsx")):
    iwb = load_workbook(input_file, read_only=True)
    print(f"Workbook: {os.path.basename(input_file)}")
    print(f"Number of worksheets: {len(iwb.worksheets)}")
    for worksheet in iwb.worksheets:
        details = f"Worksheet name: {worksheet.title}, \t"
        details += f"Rows: {worksheet.max_row}, \t"
        details += f"Columns: {worksheet.max_column}"
        print(details)
    workbook_counter += 1
print(f"Number of Excel workbooks: {workbook_counter}.")

从多个工作簿中连接数据

.XLS

import glob
import os
import sys
from datetime import date
from xlrd import open_workbook, xldate_as_tuple
from xlwt import Workbook

input_folder = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('all_data_all_workbooks')

data = []
first_worksheet = True
for input_file in glob.glob(os.path.join(input_folder, '*.xls*')):
    print (os.path.basename(input_file))
    with open_workbook(input_file) as workbook:
        for worksheet in workbook.sheets():
            if first_worksheet:
                header_row = worksheet.row_values(0)
                data.append(header_row)
                first_worksheet = False
            for row_index in range(1,worksheet.nrows):
                row_list = []
                for column_index in range(worksheet.ncols):
                    cell_value = worksheet.cell_value(row_index,column_index)
                    cell_type = worksheet.cell_type(row_index, column_index)
                    if cell_type == 3:
                        date_cell = xldate_as_tuple(cell_value, workbook.datemode)
                        date_cell = date(*date_cell[0:3]).strftime('%m/%d/%Y')
                        row_list.append(date_cell)
                    else:
                        row_list.append(cell_value)
                data.append(row_list)

for list_index, output_list in enumerate(data):
    for element_index, element in enumerate(output_list):
        output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)

.XLSX

import glob
import os
import sys
from openpyxl import Workbook, load_workbook

input_folder = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook(write_only=True)
ows = owb.create_sheet(title='all_data_all_workbooks', index=None)
input_files = glob.glob(os.path.join(input_folder, r"*.xlsx*"))

first_sheet = True
for iwb_path in input_files:
    print(os.path.basename(iwb_path))
    iwb = load_workbook(iwb_path, read_only=True)
    for iws in iwb.worksheets:
        if first_sheet:
            header = [cell.value for cell in iws[1]]
            ows.append(header)
            first_sheet = False
        for i in range(2, iws.max_row+1):
            r_list = []
            for j in range(1, iws.max_column+1):
                cell_value = iws.cell(i, j).value
                r_list.append(cell_value)
            ows.append(r_list)
    iwb.close()
owb.save(output_file)
  • 一定注意iwb.close()的位置,防止显式关闭提前导致无法继续遍历,靠后会导致打开多个表格,无法正确关闭。

Pandas

import pandas as pd
import glob
import os
import sys

input_path = sys.argv[1]
output_file = sys.argv[2]

all_workbooks = glob.glob(os.path.join(input_path,'*.xls*'))
data_frames = []
for workbook in all_workbooks:
    all_worksheets = pd.read_excel(workbook, sheet_name=None, index_col=None)
    for worksheet_name, data in all_worksheets.items():
        data_frames.append(data)
all_data_concatenated = pd.concat(data_frames, axis=0, ignore_index=True)

writer = pd.ExcelWriter(output_file)
all_data_concatenated.to_excel(writer, sheet_name='all_data_all_workbooks', index=False)
writer.save()

为每个工作簿和工作表计算总数和均值

.XLS

import glob
import os
import sys
from xlrd import open_workbook
from xlwt import Workbook

input_folder = sys.argv[1]
output_file = sys.argv[2]

output_workbook = Workbook()
output_worksheet = output_workbook.add_sheet('sums_and_averages')

all_data = []
sales_column_index = 3

header = ['workbook', 'worksheet', 'worksheet_total', 'worksheet_average',\
    'workbook_total', 'workbook_average']
all_data.append(header)

for input_file in glob.glob(os.path.join(input_folder, '*.xls*')):
    with open_workbook(input_file) as workbook:
        list_of_totals = []
        list_of_numbers = []
        workbook_output = []
        for worksheet in workbook.sheets():
            total_sales = 0
            number_of_sales = 0
            worksheet_list = []
            worksheet_list.append(os.path.basename(input_file))
            worksheet_list.append(worksheet.name)
            for row_index in range(1,worksheet.nrows):
                try:
                    total_sales += float(str(worksheet.cell_value(row_index, sales_column_index)).strip('$').replace(',',''))
                    number_of_sales += 1.
                except:
                    total_sales += 0.
                    number_of_sales += 0.
            average_sales = '%.2f' % (total_sales / number_of_sales)
            worksheet_list.append(total_sales)
            worksheet_list.append(float(average_sales))
            list_of_totals.append(total_sales)
            list_of_numbers.append(float(number_of_sales))
            workbook_output.append(worksheet_list)
        workbook_total = sum(list_of_totals)
        workbook_average = sum(list_of_totals)/sum(list_of_numbers)
        for list_element in workbook_output:
            list_element.append(workbook_total)
            list_element.append(workbook_average)
        all_data.extend(workbook_output)
        
for list_index, output_list in enumerate(all_data):
    for element_index, element in enumerate(output_list):
        output_worksheet.write(list_index, element_index, element)

output_workbook.save(output_file)

Pandas

import glob
import os
import sys
from openpyxl import Workbook, load_workbook

input_folder = sys.argv[1]
output_file = sys.argv[2]

owb = Workbook(write_only=True)
ows = owb.create_sheet(title='sums_and_averages', index=None)
sales_column_index = 4
header = ['workbook', 'worksheet', 'worksheet_total', 'worksheet_average',\
    'workbook_total', 'workbook_average']
ows.append(header)
input_files = glob.glob(os.path.join(input_folder, "*.xlsx*"))
for file_path in input_files:
    iwb = load_workbook(file_path, read_only=True)
    list_of_totals = []
    list_of_numbers = []
    workbook_output = []
    for iws in iwb.worksheets:
        total_sales = 0
        number_of_sales = 0
        worksheet_list = []
        worksheet_list.append(os.path.basename(file_path))
        worksheet_list.append(iws.title)
        for i in range(2, iws.max_row+1):
            try:
                total_sales += float(iws.cell(i, sales_column_index).value)
                number_of_sales += 1
            except:
                total_sales += 0
                number_of_sales += 0
        average_sales = total_sales / number_of_sales
        worksheet_list.append(total_sales)
        worksheet_list.append(average_sales)
        list_of_totals.append(total_sales)
        list_of_numbers.append(float(number_of_sales))
        workbook_output.append(worksheet_list)
    workbook_total = sum(list_of_totals)
    workbook_average = workbook_total / sum(list_of_numbers)
    for list_element in workbook_output:
        list_element.append(workbook_total)
        list_element.append(workbook_average)
        ows.append(list_element)
    iwb.close()
owb.save(output_file)
  • 搞清楚要添加的元素,一个一个添加进工作表的列表,然后添加进工作表所在的大列表,最后算出总值和平均值之后再遍历大列表将值添加进去并存入ows。

Pandas

import pandas as pd
import glob
import os
import sys

input_path = sys.argv[1]
output_file = sys.argv[2]

all_wbs = glob.glob(os.path.join(input_path, "*.xlsx*"))
data_frames = []
for wb in all_wbs:
    all_wss = pd.read_excel(wb, sheet_name=None, index_col=None)
    wb_total_sales = []
    wb_number_of_sales = []
    ws_data_frame = []
    wss_data_frame = None
    wb_data_frame = None
    for ws_name, data in all_wss.items():
        total_sales = pd.DataFrame(float(value) for value in data.loc[:, "Sale Amount"]).sum()
        number_of_sales = len(data.loc[:, "Sale Amount"])
        average_sales = total_sales / number_of_sales
        
        wb_total_sales.append(total_sales)
        wb_number_of_sales.append(number_of_sales)
        
        data = {'Workbook': os.path.basename(wb),
                'Worksheet': ws_name,
                'Worksheet_total': total_sales,
                'Worksheet_average': average_sales}
        ws_data_frame.append(pd.DataFrame(data, columns=['Workbook', 'Worksheet', 'Worksheet_total', 'Worksheet_average']))
    wss_data_frame = pd.concat(ws_data_frame, axis=0, ignore_index=True)
    wb_total = pd.DataFrame(wb_total_sales).sum()
    wb_total_number_of_sales = pd.DataFrame(wb_number_of_sales).sum()
    wb_average = wb_total / wb_total_number_of_sales
    
    wb_stats = {'Workbook': os.path.basename(wb),
                'Workbook_total': wb_total,
                'Workbook_average': wb_average}

    wb_stats = pd.DataFrame(wb_stats, columns=['Workbook', 'Workbook_total', 'Workbook_average'])
    wb_data_frame = pd.merge(wss_data_frame, wb_stats, on='Workbook', how='left')
    data_frames.append(wb_data_frame)

all_data_concatenated = pd.concat(data_frames, axis=0, ignore_index=True)
writer = pd.ExcelWriter(output_file)
all_data_concatenated.to_excel(writer, sheet_name='sums_and_averages', index=False)
writer.save()
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 212,686评论 6 492
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 90,668评论 3 385
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 158,160评论 0 348
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 56,736评论 1 284
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 65,847评论 6 386
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,043评论 1 291
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,129评论 3 410
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 37,872评论 0 268
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,318评论 1 303
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 36,645评论 2 327
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,777评论 1 341
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,470评论 4 333
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,126评论 3 317
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,861评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,095评论 1 267
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 46,589评论 2 362
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 43,687评论 2 351

推荐阅读更多精彩内容