Python, Pandas: removing rows from an Excel file
I have a spreadsheet from which certain rows need to be removed.
Every row whose first-column value starts with “36” should be removed before the data is saved to a new spreadsheet.
I currently use the code below (and then need to split the columns afterwards in Excel):
import xlwt
from xlrd import open_workbook
old_file = open_workbook('C:\\original.xlsx')
old_sheet = old_file.sheet_by_index(0)
new_file = xlwt.Workbook(encoding='utf-8', style_compression = 0)
new_sheet = new_file.add_sheet('Sheet1', cell_overwrite_ok = True)
contents = []
for row in range(old_sheet.nrows):
    a = str(old_sheet.cell(row,0).value)
    b = str(old_sheet.cell(row,1).value)
    if not a.startswith("36"):
        contents.append(a + "," + b)
for c, content in enumerate(contents):
    new_sheet.write(c, 0, content)
new_file.save('C:\\result.xls')
This approach is not really adequate, so I want to learn the Pandas way of doing it.
I tried something like df.drop(["3649"]) but it doesn’t work.
What’s the proper Pandas way to remove the rows? Thank you.
1 Answer
I think you first need read_excel, then filter by boolean indexing, inverting the mask with ~. Build the mask with startswith or contains (^ is the regex anchor for the start of the string):
import pandas as pd
df = pd.read_excel('C:\\original.xlsx')
df = df[~df['Model'].astype(str).str.startswith('36')]
Alternative:
df = df[~df['Model'].astype(str).str.contains('^36')]
print(df)
   Model Country
0   1021  France
1   9644   India
2   9656   India
4   9687   China
6   9630   Spain
7   9666  Brasil
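To try the filter without an Excel file, here is a minimal sketch on an in-memory frame. The column names follow the code above. The two rows starting with 36 and their countries are made up for illustration, since the original data was not shown. It also selects the first column by position, in case your column is not actually called Model.

```python
import pandas as pd

# Sample data: rows 3 and 5 are hypothetical stand-ins for the "36..." rows
# that the filter removes; the remaining rows match the printed output above.
df = pd.DataFrame({
    "Model": [1021, 9644, 9656, 3649, 9687, 3601, 9630, 9666],
    "Country": ["France", "India", "India", "Italy",
                "China", "Peru", "Spain", "Brasil"],
})

# Select the first column by position, so the code does not depend on its name.
first_col = df.iloc[:, 0].astype(str)

# True for every row whose first-column value starts with "36".
mask = first_col.str.startswith("36")

# ~ inverts the mask: keep only the rows that do NOT start with "36".
filtered = df[~mask]

print(filtered)
```

Note that the original index labels (0, 1, 2, 4, 6, 7) are kept; call reset_index(drop=True) on the result if you want them renumbered.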
and finally write the result with to_excel:
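# Note: pandas 2.0 and later can no longer write the legacy .xls format; use a .xlsx file name there.
df.to_excel('C:\\result.xls', index=False)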
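As for df.drop(["3649"]) from the question: drop removes rows by index label, not by cell value. With the default 0, 1, 2, ... index there is no label "3649", so pandas raises a KeyError. If you prefer drop, you can pass it the index labels of the matching rows. A small sketch, where the sample values are made up apart from 3649:

```python
import pandas as pd

# Hypothetical sample data; only the value 3649 comes from the question.
df = pd.DataFrame({
    "Model": [1021, 3649, 9644],
    "Country": ["France", "Italy", "India"],
})

# drop() looks up index labels, and "3649" is not one of 0, 1, 2.
try:
    df.drop(["3649"])
    drop_failed = False
except KeyError:
    drop_failed = True

# Collect the index labels of the rows to remove, then drop those labels.
mask = df["Model"].astype(str).str.startswith("36")
to_remove = df[mask].index
result = df.drop(to_remove)

print(drop_failed)
print(result)
```

This gives the same rows as the boolean-indexing version; which one to use is mostly a matter of taste.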