Updating pandas to version 0.19 in Azure ML Studio
Problem description
I would really like to get access to some of the updated functions in pandas 0.19, but Azure ML studio uses pandas 0.18 as part of the Anaconda 4.0 bundle. Is there a way to update the version that is used within the "Execute Python Script" components?
Solution
The steps below show how to update the version of the pandas library used in Execute Python Script.
Step 1: Use virtualenv to create an independent Python runtime environment on your system. If you don't have it yet, install it first with the command pip install virtualenv.
If the installation succeeded, you will find it in your python/Scripts folder.
Step 2: Run the command to create the independent Python runtime environment.
Step 3: Go into the Scripts folder of the created directory and activate the environment (this step is important, don't skip it).
Keep this command window open and run pip install pandas==0.19 in it to download the external libraries.
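Since Step 3 depends on the environment actually being active, a quick check from inside Python may help before running pip. This is a minimal sketch and not part of the original answer; the helper name in_virtual_env is hypothetical. It assumes that older virtualenv releases set sys.real_prefix, while venv and newer virtualenv releases make sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtual_env():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases expose the original installation as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep sys.base_prefix pointing at the
    # base installation, while sys.prefix points at the environment itself.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_env())
    print("interpreter prefix:", sys.prefix)
```

If this reports False in the command window from Step 3, pip would install pandas into the system-wide site-packages rather than into the new environment.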
Step 4: Compress all of the files in the Lib/site-packages folder into a zip package (I'm calling it pandas - package here).
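Step 4 can also be scripted instead of done by hand. The following is a rough sketch under assumptions: the function name zip_site_packages is hypothetical, and it stores each file relative to the site-packages folder so that the pandas folder ends up at the root of the archive, which is what the sys.path.insert call in Step 6 appears to expect.

```python
import os
import zipfile


def zip_site_packages(site_packages_dir, zip_path):
    """Zip the contents of site_packages_dir so packages sit at the archive root.

    zip_path should live outside site_packages_dir, otherwise the archive
    would be picked up while it is being written.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(site_packages_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to site-packages, e.g. "pandas/__init__.py".
                arcname = os.path.relpath(full_path, site_packages_dir)
                archive.write(full_path, arcname.replace(os.sep, "/"))
    return zip_path
```

A call might look like zip_site_packages(r"myenv\Lib\site-packages", "pandas-package.zip"), where both the environment folder and the archive name are placeholders for your own paths.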
Step 5: Upload the zip package to the Azure Machine Learning workspace as a DataSet.
For the specific steps, please refer to the Technical Notes.
After the upload succeeds, you will see the package in the DataSet list.
Step 6: Before the definition of the azureml_main method in the Execute Python Script module, remove the old pandas modules and their dependencies, then import pandas again, as in the code below.
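import sys
import pandas as pd
print(pd.__version__)  # version bundled with Azure ML Studio
# Drop the preloaded pandas and its dependencies from the module cache.
del sys.modules['pandas']
del sys.modules['numpy']
del sys.modules['pytz']
del sys.modules['six']
del sys.modules['dateutil']
# Look in the Script Bundle folder (the uploaded zip package) first.
sys.path.insert(0, r'.\Script Bundle')
# Drop the cached submodules as well.
for td in [m for m in sys.modules if m.startswith(('pandas.', 'numpy.', 'pytz.', 'dateutil.', 'six.'))]:
    del sys.modules[td]
import pandas as pd
print(pd.__version__)  # version loaded from the uploaded zip package
# The entry point function can contain up to two input arguments:
#   Param<dataframe1>: a pandas.DataFrame
#   Param<dataframe2>: a pandas.DataFrame
def azureml_main(dataframe1 = None, dataframe2 = None):
    # The body is not shown in the original answer; returning the first
    # input unchanged is only a placeholder.
    return dataframe1,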
The logs then show the result below: the script first prints the old version, 0.14.0, and then the new version, 0.19.0, loaded from the uploaded zip file.
[Information] 0.14.0
[Information] 0.19.0
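The module-purging part of Step 6 could also be wrapped in a small helper. This is only a sketch, not the code from the answer: the name purge_modules is hypothetical, and unlike the del statements above it uses pop with a default, so it does not raise a KeyError when one of the packages was never imported.

```python
import sys


def purge_modules(package_names):
    """Remove the named packages and all of their submodules from sys.modules."""
    removed = []
    # Iterate over a copy of the keys because sys.modules is modified in the loop.
    for name in list(sys.modules):
        for package in package_names:
            if name == package or name.startswith(package + "."):
                # pop() with a default tolerates entries that are already gone.
                sys.modules.pop(name, None)
                removed.append(name)
                break
    return sorted(removed)
```

In the Execute Python Script module, one would presumably call purge_modules(['pandas', 'numpy', 'pytz', 'six', 'dateutil']) after the first version print and before importing pandas again.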
You could also refer to these threads: "Access blob file using time stamp in Azure" and "reload with reset".
Hope it helps you.