Python, Pandas: deleting rows from an Excel file

I have a spreadsheet from which I need to remove certain rows.

Every row whose first-column value starts with “36” should be removed before the data is saved to a new spreadsheet.

I currently use the code below, which joins both columns into one cell, so I have to split the columns again in Excel afterwards:

import xlwt
from xlrd import open_workbook

# Open the source workbook and take its first sheet.
old_file = open_workbook('C:\\original.xlsx')
old_sheet = old_file.sheet_by_index(0)

# Create the output workbook with a single sheet.
new_file = xlwt.Workbook(encoding='utf-8', style_compression=0)
new_sheet = new_file.add_sheet('Sheet1', cell_overwrite_ok=True)

contents = []

# Keep only the rows whose first cell does not start with "36".
for row in range(old_sheet.nrows):
    a = str(old_sheet.cell(row, 0).value)
    b = str(old_sheet.cell(row, 1).value)

    if not a.startswith("36"):
        contents.append(a + "," + b)

# Write each kept row as one comma-joined string in column 0.
for c, content in enumerate(contents):
    new_sheet.write(c, 0, content)

new_file.save('C:\\result.xls')

This approach isn’t really adequate, so I want to learn how to do it with Pandas.

I tried something like df.drop(["3649"]) but it doesn’t work.

What’s the proper Pandas way to remove the rows? Thank you.

1 Solution

#1

I think you first need read_excel, then filter with boolean indexing, inverting the mask with ~. The mask can be built with startswith or with contains (^ is the regex anchor for the start of the string):

import pandas as pd

df = pd.read_excel('C:\\original.xlsx')

df = df[~df['Model'].astype(str).str.startswith('36')]

Alternative:

df = df[~df['Model'].astype(str).str.contains('^36')]

print (df)
   Model Country
0   1021  France
1   9644   India
2   9656   India
4   9687   China
6   9630   Spain
7   9666  Brasil
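
To see what the mask does without touching a file, here is a minimal sketch on an in-memory DataFrame. Only the 'Model' and 'Country' column names come from the answer; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the spreadsheet.
df = pd.DataFrame({
    "Model": [1021, 3649, 9644, 3601, 1360],
    "Country": ["France", "Japan", "India", "Korea", "Spain"],
})

# Model holds integers, so convert to strings before using the .str methods.
starts_with_36 = df["Model"].astype(str).str.startswith("36")

# ~ inverts the boolean mask: keep every row that does NOT start with "36".
kept = df[~starts_with_36]

# Same filter written as a regex anchored at the start of the string.
kept_regex = df[~df["Model"].astype(str).str.contains("^36")]

print(kept)
```

Note that 1360 is kept: it contains "36" but does not start with it, which is why the regex needs the ^ anchor. The surviving rows keep their original index labels, which explains the gaps in the printed index above.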

Finally, write the result with to_excel:

df.to_excel('C:\\result.xls', index=False)
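
Putting the three steps together, a rough sketch might look like the following. The function names are hypothetical, and it selects the first column by position (as the question describes) rather than by the 'Model' name. It also shows why df.drop(["3649"]) failed: drop expects index labels, not cell values, so the labels of the matching rows have to be passed in. The example call uses an .xlsx target on the assumption that your pandas version may no longer write the legacy .xls format.

```python
import pandas as pd


def drop_rows_starting_with(df, prefix):
    """Return a copy of df without rows whose first-column value starts with prefix."""
    first_col = df.iloc[:, 0].astype(str)
    mask = first_col.str.startswith(prefix)
    # DataFrame.drop works on index labels, so pass the labels of the matching rows.
    return df.drop(index=df[mask].index)


def filter_workbook(src, dst, prefix="36"):
    """Read src, drop the matching rows, and write the result to dst."""
    df = pd.read_excel(src)
    result = drop_rows_starting_with(df, prefix)
    # index=False keeps the row labels out of the output file.
    result.to_excel(dst, index=False)


# Example call (source path from the question, .xlsx target assumed):
# filter_workbook('C:\\original.xlsx', 'C:\\result.xlsx')
```

Because the columns stay separate in the DataFrame, there is no need to split them in Excel afterwards.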
