Find and replace strings in Excel (.xlsx) using Python
Question
I am trying to replace a bunch of strings in an .xlsx sheet (~70k rows, 38 columns). I have a list of the strings to search for and replace in a file, formatted as follows:
bird produk - bird product
pig - pork
ayam - chicken
...
kuda - horse
The word to search for is on the left, and the replacement is on the right (find 'bird produk', replace with 'bird product'). My .xlsx sheet looks something like this:
name      type of animal   ID
ali       pig              3483
abu       kuda             3940
ahmad     bird produk      0399
...
ahchong   pig              2311
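For reference, a lookup list in the "search - replacement" format described above could be turned into a Python dictionary along these lines. This is only a minimal sketch: the function name `parse_lookup` and the assumption that every entry is separated by exactly " - " (space, hyphen, space) are mine, not part of the original list file.

```python
def parse_lookup(lines, sep=" - "):
    """Build a {search: replacement} dict from 'search - replacement' lines.

    Assumption: each entry uses ' - ' as the separator; lines without it
    (blank lines, a literal '...') are skipped.
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank or malformed lines
        # Split only on the first separator so the replacement may contain ' - '
        search, replacement = line.split(sep, 1)
        mapping[search.strip()] = replacement.strip()
    return mapping


sample = [
    "bird produk - bird product",
    "pig - pork",
    "ayam - chicken",
    "...",
    "kuda - horse",
]
lookup = parse_lookup(sample)
print(lookup)
```

With a real file, the same function could be fed the open file object directly, since iterating over a text file yields its lines.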
I am looking for the fastest solution, since the list contains around 200 words to search for and the .xlsx file is quite large. I need to use Python for this, but I am open to any other, faster solutions.
Edit: added a worksheet example.
Edit 2: I tried some Python code to read the cells, but it took quite a long time to read. Any pointers?
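from xlrd import open_workbook

# Note: xlrd 2.0 and later read only .xls files; opening an .xlsx
# workbook this way requires an older xlrd release (1.2.0 or earlier).
wb = open_workbook('test.xlsx')

for s in wb.sheets():
    print('Sheet:', s.name)
    for row in range(s.nrows):
        for col in range(s.ncols):
            # Printing every cell individually is slow for ~70k x 38 cells
            print(s.cell(row, col).value)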
Thanks!
Edit 3: I finally figured it out. Both the VBA module and the Python code work. I switched to .csv to make things easier. Thanks! Here is my version of the Python code:
# Dictionary of search strings (keys) and their replacements (values)
reps = {
    'JUALAN (PRODUK SHJ)': 'SALE( PRODUCT)',
    'PAMERAN': 'EXHIBITION',
    'PEMBIAKAN': 'BREEDING',
    'UNGGAS': 'POULTRY',
}

def replace_all(text, dic):
    # Apply every replacement in turn to the whole text
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

# Read the whole CSV as plain text, replace, and write the result out
with open('test.csv', 'r') as f:
    text = f.read()

text = replace_all(text, reps)

with open('file2.csv', 'w') as w:
    w.write(text)
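The loop above scans the text once per dictionary entry. With around 200 entries, one possible alternative is to compile all search strings into a single regular expression and replace everything in one pass. The following is a sketch of that idea rather than a measured improvement; `build_replacer` is a name I made up. Note one behavioural difference: a single pass never re-replaces text produced by an earlier replacement, whereas the sequential loop can.

```python
import re


def build_replacer(mapping):
    """Return a function that applies all replacements in a single pass."""
    if not mapping:
        return lambda text: text  # nothing to replace

    # Longest keys first, so a longer phrase wins over a shorter one it contains
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape is needed because keys such as 'JUALAN (PRODUK SHJ)'
    # contain regex metacharacters
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def replace(text):
        # Look up each matched search string in the mapping
        return pattern.sub(lambda m: mapping[m.group(0)], text)

    return replace


reps = {
    'JUALAN (PRODUK SHJ)': 'SALE( PRODUCT)',
    'PAMERAN': 'EXHIBITION',
    'PEMBIAKAN': 'BREEDING',
    'UNGGAS': 'POULTRY',
}

replace = build_replacer(reps)
result = replace("ali,PAMERAN,3483\nabu,UNGGAS,3940")
print(result)
```

The returned function can be applied to the full file contents in place of `replace_all(text, reps)`.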
Solution
I would copy the contents of your text file into a new worksheet in the Excel file and name that sheet "Lookup". Then use Text to Columns to split the data into the first two columns of this new sheet, starting in the first row.
Paste the following code into a module in Excel and run it:
Sub Replacer()
    Dim w1 As Worksheet
    Dim w2 As Worksheet
    Dim i As Long

    'The sheet with the words from the text file:
    Set w1 = ThisWorkbook.Sheets("Lookup")

    'The sheet with all of the data:
    Set w2 = ThisWorkbook.Sheets("Data")

    'Replace each word in column A of "Lookup" with the word next to it in column B
    For i = 1 To w1.Range("A1").CurrentRegion.Rows.Count
        w2.Cells.Replace What:=w1.Cells(i, 1), Replacement:=w1.Cells(i, 2), LookAt:=xlPart, _
            SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
            ReplaceFormat:=False
    Next i
End Sub
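For anyone who wants similar behaviour in Python, here is a rough sketch that mimics the macro's settings on CSV data: partial matches (`LookAt:=xlPart`), case-insensitive (`MatchCase:=False`), and one lookup pair applied after another. The function name `replace_cells` and the in-memory sample data are illustrative assumptions; a real script would pass file objects opened with `newline=''` to `csv.reader` and `csv.writer`.

```python
import csv
import io
import re


def replace_cells(rows, pairs):
    """Apply each (what, replacement) pair to every cell, ignoring case."""
    compiled = [
        (re.compile(re.escape(what), re.IGNORECASE), repl)
        for what, repl in pairs
        if what  # skip empty search strings
    ]
    for row in rows:
        new_row = []
        for cell in row:
            for pattern, repl in compiled:
                # A function replacement avoids treating backslashes in repl specially
                cell = pattern.sub(lambda m, r=repl: r, cell)
            new_row.append(cell)
        yield new_row


# In-memory stand-ins for the input and output CSV files (hypothetical data)
source = io.StringIO("name,type of animal,ID\nali,Pig,3483\nahmad,bird produk,0399\n")
pairs = [("bird produk", "bird product"), ("pig", "pork")]

output = io.StringIO()
writer = csv.writer(output, lineterminator="\n")
writer.writerows(replace_cells(csv.reader(source), pairs))
print(output.getvalue())
```

Because the `csv` module reads every cell as a string, IDs with leading zeros such as 0399 are left intact.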