Power BI 中 Python 可视化中时间序列的最佳数据格式是什么?
问题描述
截至 2018 年 8 月 9 日,Power BI 支持 Python 可视化.他们以前支持 R 可视化,但我仍然觉得这些集成有点尴尬.让我告诉你我的意思:
<小时>假设您有一个包含时间序列数据的表,其中第一行包含名称日期"和值",内容分别是格式为 yyyy-mm-dd 的日期和一个数字:
日期、数值2017-01-12,12017-01-13,42017-01-14,22017-01-15,42017-01-16,22017-01-17,22017-01-18,22017-01-19,52017-01-20,52017-01-21,52017-01-22,52017-01-23,62017-01-24,32017-01-25,62017-01-26,62017-01-27,52017-01-28,82017-01-29,42017-01-30,2
如果您将该数据集存储为像 timerseries.csv
这样的文本文件并使用 Get Data | 导入它.文本/CSV,你会得到一个比 VISUALIZATIONS |字段,像这样:
您可以使用 VISUALIZATIONS | 检查您的表表 并获取:
有了这个设置,你应该认为你已经准备好使用这个漂亮的新功能来释放 Py VISUALIZATION 的力量了:
如果你点击它,你会得到这个:
你被告知
<块引用>将字段拖入可视化"窗格中的值"区域以开始脚本
如果你从 Value
开始,你会在编辑器中得到这个默认设置:
如果您按照 Power BI 团队在
但这就是我目前的结局.
如果编辑器中的默认数据框共享标准数据框的功能,您应该能够引用该数据框中的列并使用此代码段轻松绘制图表:
import matplotlib.pyplot as pltplt.plot(数据集['值'])plt.show()
但是当你运行它时,它只会返回一个错误:
至少可以说细节很详细.
我也尝试过同时导入 Dates
和 Values
,并尝试使用 dataset.plot()
,但似乎没有任何效果.我还尝试通过这种方式将日期层次结构分解为简单的日期:
那么,对数据格式、导入方法和/或代码片段有什么想法吗?
感谢您的任何建议!
编辑 1 - 按照 Foxan Ng 的回答:
在值"字段中添加两列:
这仍然返回一个错误
<块引用>TypeError: from_bounds() 接受 4 个位置参数,但给出了 6 个
解决方案我没有遇到你提到的错误.您是否将两列都放入 Values
?
import matplotlib.pyplot as pltplt.plot(数据集['日期'],数据集['值'])plt.show()
<小时>使用 M 查询更新:
让Source = Csv.Document(File.Contents("C:your-directory.. imerseries.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),#"PromoteHeaders" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type date}, {"Value", Int64.Type}})在#改变类型"
As of today, August 9 2018, Power BI supports Python Visualizations. They've had support for R Visualizations before, but I still find these integrations to be a bit awkward. Let me show you what I mean:
Let's say that you have a table with time series data, where the top row containts the names 'Date' and 'Value', and the contents are dates of the form yyyy-mm-dd and a number, respectively:
Date,Value
2017-01-12,1
2017-01-13,4
2017-01-14,2
2017-01-15,4
2017-01-16,2
2017-01-17,2
2017-01-18,2
2017-01-19,5
2017-01-20,5
2017-01-21,5
2017-01-22,5
2017-01-23,6
2017-01-24,3
2017-01-25,6
2017-01-26,6
2017-01-27,5
2017-01-28,8
2017-01-29,4
2017-01-30,2
If you store that dataset as a textfile like timerseries.csv
and import it using Get Data | Text/CSV, you get a table uner VISUALIZATIONS | FIELDS, like this:
You can inspect your table using VISUALIZATIONS | Table and get:
With this setup, one should think that you were all set for unleashing the power of a Py VISUALIZATION using this beautiful new feature:
If you click that, you get this:
And you're told to
Drag fields into the Values area in the Visualization pane to start scripting
If you start with Value
, you get this default setup in the editor:
And if you follow the instructions given by the Power BI team in the August 2018 feature summary you should be able to make a matplotlib plot quite easily.
But this is where it ends for me at the time being.
If the default dataframe in the editor shares the features of a standard dataframe, you should be able to reference a column in that dataframe and easily make a plot with this snippet:
import matplotlib.pyplot as plt
plt.plot(dataset['Value'])
plt.show()
But when you run it, it onlu returns an error:
And the details are elaborate to say the least.
I've also tried to import both Dates
and Values
, and I've tried plotting the dataframe directly with dataset.plot()
, but nothing seems to be working. I've also tried stripping the date hierarchy down to simple dates this way:
So, any ideas on the dataformat, import method and/or the snippet?
Thank you for any suggestions!
EDIT 1 - Following the answer from Foxan Ng:
Add both columns in the Value field:
This still returns an error edning with:
TypeError: from_bounds() takes 4 positional arguments but 6 were given
解决方案
I didn't encounter errors that you've mentioned. Have you dropped in both columns into Values
?
import matplotlib.pyplot as plt
plt.plot(dataset['Date'], dataset['Value'])
plt.show()
UPDATED with M query:
let
Source = Csv.Document(File.Contents("C:your-directory.. imerseries.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type date}, {"Value", Int64.Type}})
in
#"Changed Type"
相关文章