How do I read and write CSV files with Python?

2022-01-30 00:00:00 python csv

Problem description

I have a file example.csv with the contents:

1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3

How do I read this example.csv with Python?

Similarly, if I have

data = [(1, "A towel,", 1.0),
        (42, " it says, ", 2.0),
        (1337, "is about the most ", -1),
        (0, "massively useful thing ", 123),
        (-2, "an interstellar hitchhiker can have.", 3)]

How do I write data to a CSV file with Python?


Solution

Here are some minimal, complete examples of how to read and write CSV files with Python.

Pure Python

import csv

# Define data
data = [
    (1, "A towel,", 1.0),
    (42, " it says, ", 2.0),
    (1337, "is about the most ", -1),
    (0, "massively useful thing ", 123),
    (-2, "an interstellar hitchhiker can have.", 3),
]

# Write CSV file (newline="" lets the csv module control line endings)
with open("test.csv", "w", newline="") as fp:
    writer = csv.writer(fp, delimiter=",")
    # writer.writerow(["your", "header", "foo"])  # write header
    writer.writerows(data)

# Read CSV file
with open("test.csv", newline="") as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    # next(reader, None)  # skip the headers
    data_read = list(reader)

print(data_read)

After that, the contents of data_read are

[['1', 'A towel,', '1.0'],
 ['42', ' it says, ', '2.0'],
 ['1337', 'is about the most ', '-1'],
 ['0', 'massively useful thing ', '123'],
 ['-2', 'an interstellar hitchhiker can have.', '3']]

Please note that the csv module reads every field as a string. You need to convert the values to the column types manually.
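As a minimal sketch of that manual conversion, the rows can be passed through one converter per column. The column types (int, str, float) are an assumption read off the example data, and the rows are inlined through io.StringIO only so the snippet runs without a file on disk.

```python
import csv
import io

# First rows of example.csv, inlined so the sketch needs no file on disk
raw = '1,"A towel,",1.0\n42," it says, ",2.0\n1337,is about the most ,-1\n'

# Assumed column types, in column order: int, str, float
converters = (int, str, float)

reader = csv.reader(io.StringIO(raw))
typed_rows = [
    # Apply the matching converter to each field of the row
    tuple(convert(value) for convert, value in zip(converters, row))
    for row in reader
]

print(typed_rows)
```

A converter raises ValueError on a field it cannot parse, so real data with missing or malformed values would need extra handling.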

A Python 2+3 version was here before (link), but Python 2 support has been dropped. Removing the Python 2 parts massively simplified this answer.

  • How do I write data into csv format as string (not file)?
  • How can I use io.StringIO() with the csv module?: This is useful if you want to serve a CSV on the fly with Flask, without actually storing the CSV on the server.
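A minimal sketch of the in-memory variant both items refer to: csv.writer accepts any object with a write() method, so an io.StringIO buffer can stand in for the file. The two sample rows are taken from the question; how the resulting string is returned from a web framework is left out here.

```python
import csv
import io

data = [(1, "A towel,", 1.0), (42, " it says, ", 2.0)]

# Write into an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(data)

# The complete CSV document as a single string
csv_text = buffer.getvalue()
print(csv_text)
```

The csv module terminates rows with "\r\n" by default, so the string contains those line endings regardless of platform.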

Have a look at my utility package mpu for a super simple, easy-to-remember alternative:

import mpu.io
data = mpu.io.read('example.csv', delimiter=',', quotechar='"', skiprows=None)
mpu.io.write('example.csv', data)

Pandas

import pandas as pd

# Read the CSV into a pandas data frame (df)
#   With a df you can do many things
#   most important: visualize data with Seaborn
df = pd.read_csv('myfile.csv', sep=',')
print(df)

# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]

# or export it as a list of dicts (one dict per row)
dicts = df.to_dict(orient="records")

See the read_csv docs for more information. Please note that by default pandas takes the column names from the first line of the file; if your file has no header line, set the header parameter manually (header=None).
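The example file from the question has no header line, so this is a case where the header setting matters. The sketch below reads two of its rows both ways; the column names "id", "text" and "value" are hypothetical and chosen only for illustration.

```python
import io

import pandas as pd

# First two rows of example.csv, inlined so the sketch needs no file on disk
raw = '1,"A towel,",1.0\n42," it says, ",2.0\n'

# Default settings: the first line is consumed as column names
df_default = pd.read_csv(io.StringIO(raw))

# header=None keeps the first line as data; names are hypothetical labels
df = pd.read_csv(io.StringIO(raw), header=None, names=["id", "text", "value"])

print(df_default.shape)
print(df.shape)
```

With the default settings one data row is silently lost to the header, which is easy to miss on larger files.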

If you haven't heard of Seaborn, I recommend having a look at it.

Reading CSV files is supported by a number of other libraries, for example:

  • dask.dataframe.read_csv
  • spark.read.csv

Created CSV file

1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3

Common file endings

.csv

After reading the CSV file into a list of tuples / dicts or a pandas DataFrame, you are simply working with that kind of data. Nothing about it is CSV-specific.
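To illustrate, here is a small sketch that loads rows into a list of dicts with csv.DictReader and then uses them like any other Python data. The field names "id", "text" and "value" are hypothetical, since the example file has no header line.

```python
import csv
import io

# First two rows of example.csv, inlined so the sketch needs no file on disk
raw = '1,"A towel,",1.0\n42," it says, ",2.0\n'

# fieldnames supplies the keys because the file has no header line
reader = csv.DictReader(io.StringIO(raw), fieldnames=["id", "text", "value"])
rows = [dict(row) for row in reader]

# From here on it is ordinary list/dict handling
total = sum(float(row["value"]) for row in rows)
print(rows)
print(total)
```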

  • JSON: Nice for writing human-readable data; VERY commonly used (read & write)
  • CSV: Super simple format (read & write)
  • YAML: Nice to read, similar to JSON (read & write)
  • pickle: A Python serialization format (read & write)
  • MessagePack (Python package): More compact representation (read & write)
  • HDF5 (Python package): Nice for matrices (read & write)
  • XML: exists too *sigh* (read & write)

For your application, the following might be important:

  • Support by other programming languages
  • Reading / writing performance
  • Compactness (file size)

See also: Comparison of data serialization formats

If you are instead looking for a way to make configuration files, you might want to read my short article Configuration files in Python.
