【python读取文件】

发布时间:2024年01月02日

系列文章目录


前言

一、python读取文件的几种方式

pip install xlwd
pip install xlwt
pip install openpyxl
pip install pandas

1.用xlrd和wlwt进行读写

file_path = os.path.dirname(os.path.abspath(__file__))
base_path = os.path.join(file_path,'data.xlsx')
book = xlrd.open_workbook(path)
sheet1 = book.sheets()[0]
nrows = sheet1.nrows
ncols = sheet1.ncols
row3_values = sheet1.row_values(2)  #第三行值
col3_values = sheet1.col3_values(2) #第三列值
cell_3_3 = sheet1.cell(2, 2).value  #单元格值

workbook = xlwt.Workbook(encoding = 'utf-8')
worksheet = workbook.add_sheet('Worksheet')
worksheet.write(0, 0, label = '测试')  #写入excel对应行、列、值
worksheet.col(0).width = 3333  #单元格宽度
tall_style = xlwt.easyxf('font:height 520')
worksheet.row(0).set_style(tall_style)

2.用openpyxl进行读写

openpyxl是一个python库,用于读写excel 2010 xlsx/xlsm/xltx/xltm文件

file_path = os.path.dirname(os.path.abspath(__file__))
base_path = os.path.join(file_path,'data.xlsx')
workbook = openpyxl.load_workbook(base_path)
worksheet = workbook.get_sheet_by_name('Sheet1')
row3 = [item.value for item in list(worksheet.rows)[2]]
col3 = [item.value for item in list(worksheet.columns)[2]]
cell_2_3 = worksheet.cell(row = 2, column = 3).value
max_row = worksheet.max_row

workbook = openpyxl.Workbook()  #新建工作薄
sheet = workbook.active
sheet['A1'] = 'python'
sheet['A2'] = datetime.datetime.now().strftime("%Y-%m-%d")
sheet.row_dimensions[2].height = 40
sheet.column_dimensions['B'].width = 30
sheet['A1'].alignment = Alignment(horizontal = 'center', vertical = 'center')
sheet.merge_cell('A2:B2') #合并单元格
sheet.unmerge_cell('A2:B2') #拆分单元格
workbook.save('new.xlsx')  #保存文件

2.1 openpyxl对excel的操作
wb = openpyxl.Workbook()  #创建一个工作薄
wb.create_sheet('test_case') #新增一个sheet表单
wb.save('cases.xlsx')  #保存文件
wb = openpyxl.load_workbook('cases.xlsx') #打开工作薄
sh = wb['sheet1'] #选取表单
ce = sh.cell(row = 1, column = 1)  #读取第一行,第一列数据
wb.close()  #关闭工作薄
columns_data = list(sh.columns) #按列读取数据

value = 'result':sh.cell(row = 1,column = 4,value = 'result')
sh.max_row 
sh.max_column  #获取最大行数/最大列数
del wb['sheet_name'] #删除表单
sh = wb['sheet_name']  wb.remove(sh) #remove删除表单

3.用pandas进行读写

pandas支持xls、xlsx、xlsm、xlsb、odf、ods、odt文件扩展名,支持读取单个工作表或工作表列表选项
语法:
pd.read_excel(io, sheet_name = 0, header = 0, names = None, index_col = None, usecols = None,
squeeze = False, dtype = None, engine = None, converters = None, true_values = None,
false_values = None, skiprows = None, nrows = None, na_values = None, parse_dates = False,
date_parser = None, thousands = None, comment = None, skipfooter = 0, convert_float = True, **kwds)

file_path = os.path.dirname(os.path.abspath)
base_path = os.join(file_path,'data.xlsx')
df = pd.read_excel(base_path)

DataFrame.to_excel(excel_writer, sheet_name = 'Sheet1', na_rep = '', float_format = None,
columns = None, header = True, index = True, index_label = None, startrow = 0, startcol = 0,
engine = None, merge_cells = True, encoding = None, inf_rep = 'inf', verbose = True, freeze_panes = None)

from pandas import DataFrame
data = {'name':['python','java','C++'],'age':[2021,2022,2023],'rank':['1','2','3']}
df = DataFrame(data)
df.to_excel('file.xlsx')

row_num = len(df.index.values)
col_num = len(df.columns.values)
row_indexs = df.index.values
col_indexs = df.columns.values

df.loc[0].value
df.loc[:,"列"].values
df.iloc[0].values
df.iloc[:,1].values
df.iloc[[1,2],[2,4]]
df3 = pd.merge(df,df2.iloc[])

二、excel文件

1、 xls为excel早期表格格式
xls格式是excel2003版本及其之前的文件格式。其最大的特点就是只有65536行,256列,规模较小;

2、xlsx为excel2007及其以后格式
可存储1048576行,16384列数据,存储相同的数据,xlsx格式要比xls格式文件小得多;

3、csv文件为逗号分隔值文件
csv逗号分隔值文件格式,其以纯文本形式存储表格数据;

备注:xlrd模块既可以读取xls文件也可读取xlsx文件;xlwt只可以写xlsx文件;openpyxl可以读写xlsx文件;pandas可以同时读写xls、xlsx文件;


总结

分享:

种一棵树最好的时间是10年前,其次是现在!!

文章来源:https://blog.csdn.net/qq_45365214/article/details/124474954
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。