How do I split one column into two in the correct way?

2022-01-24 00:00:00 python pandas dataframe split debian

Problem description

I am scraping tables from a website and writing them to an Excel file. My goal is to split one column into two columns in the correct way.

The column I want to split: "FLIGHT"

This is the result I want:

First example: KL744 --> KL and 0744

Second example: BE1013 --> BE and 1013

So I need to separate the first 2 characters (into the first column), and then the remaining characters, which can be 1, 2, 3, or 4 characters long. If there are 4, that is fine and I keep them as they are; if there are 3, I want to put a 0 in front; if there are 2, I want to put 00 in front (so my goal is to always get 4 characters/digits in the second column).

How can I do this?

Here is my relevant code, which already contains some formatting code.

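    import os
    from datetime import datetime

    import numpy as np
    import pandas as pd

    # datatable and cols come from the scraping step (not shown)
    df2 = pd.DataFrame(datatable, columns=cols)
    df2["UPLOAD_TIME"] = datetime.now()

    # drop every row in which any column contains "Scheduled"
    mask = np.column_stack([df2[col].astype(str).str.contains(r"Scheduled", na=True) for col in df2])
    df3 = df2.loc[~mask.any(axis=1)]

    if os.path.isfile("output.csv"):
        # append the new rows to the existing file
        df1 = pd.read_csv("output.csv", sep=";")
        df4 = pd.concat([df1, df3])
        df4.to_csv("output.csv", index=False, sep=";")
    else:
        df3.to_csv("output.csv", index=False, sep=";")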

[Screenshot: the table as it appears in Excel]

Solution

You can use str indexing together with zfill:

    import pandas as pd

    df = pd.DataFrame({'FLIGHT': ['KL744', 'BE1013']})
    df['a'] = df['FLIGHT'].str[:2]
    df['b'] = df['FLIGHT'].str[2:].str.zfill(4)
    print(df)

       FLIGHT   a     b
    0   KL744  KL  0744
    1  BE1013  BE  1013
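The question also mentions flight numbers with fewer than three digits. As a small sketch of how the same two lines behave in that case, the example below adds two made-up values (`LH12` and `AF7` are illustrative and do not come from the scraped table). `str.zfill(4)` pads each string on the left with zeros until it is 4 characters long and leaves longer strings unchanged.

```python
import pandas as pd

# LH12 and AF7 are hypothetical values added only to show shorter numbers
flights = pd.DataFrame({"FLIGHT": ["KL744", "BE1013", "LH12", "AF7"]})

# first two characters: the carrier code
flights["a"] = flights["FLIGHT"].str[:2]

# everything after the first two characters, left-padded with zeros to width 4
flights["b"] = flights["FLIGHT"].str[2:].str.zfill(4)

print(flights)
```

The result stays a string column, which is what keeps the leading zeros; converting it to an integer type would discard them.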

I believe your code needs this:

    df2 = pd.DataFrame(datatable, columns=cols)
    df2['a'] = df2['FLIGHT'].str[:2]
    df2['b'] = df2['FLIGHT'].str[2:].str.zfill(4)
    df2["UPLOAD_TIME"] = datetime.now()
    ...
    ...
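One thing to watch for with the code in the question: it reads output.csv back with `pd.read_csv` before appending. By default `read_csv` infers numeric types, so a value such as 0744 would come back as the integer 744 and lose its padding. A minimal sketch of the effect, and of one possible way around it by passing `dtype` for that column, is shown below. It uses an in-memory buffer instead of the real file, and the column names `a` and `b` follow the answer above.

```python
import io

import pandas as pd

df = pd.DataFrame({"FLIGHT": ["KL744", "BE1013"]})
df["a"] = df["FLIGHT"].str[:2]
df["b"] = df["FLIGHT"].str[2:].str.zfill(4)

# write to an in-memory buffer that stands in for output.csv
buffer = io.StringIO()
df.to_csv(buffer, index=False, sep=";")

# default read: column b is inferred as integers, so the zero padding is lost
buffer.seek(0)
default_read = pd.read_csv(buffer, sep=";")

# reading column b explicitly as strings keeps the padding
buffer.seek(0)
string_read = pd.read_csv(buffer, sep=";", dtype={"b": str})

print(default_read["b"].tolist())
print(string_read["b"].tolist())
```

In the original script this would mean adding `dtype={"b": str}` to the existing `pd.read_csv("output.csv", sep=";")` call, assuming the padded column is named `b`.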
