Python code to access Azure Data Lake Store

Problem description

I am looking at the Microsoft documentation here and here. I have created a Web App in Azure Active Directory to access the Data Lake Store.

From the Web App I have the Object ID, Application ID and Key.

Looking at the documentation, I see this:

adlCreds = lib.auth(tenant_id = 'FILL-IN-HERE', client_secret = 'FILL-IN-HERE', client_id = 'FILL-IN-HERE', resource = 'https://datalake.azure.net/')

How do I use it to authenticate my code and run operations on the Data Lake Store?

Here is my full test code:

## Use this for Azure AD authentication
from msrestazure.azure_active_directory import AADTokenCredentials

## Required for Azure Data Lake Store account management
from azure.mgmt.datalake.store import DataLakeStoreAccountManagementClient
from azure.mgmt.datalake.store.models import DataLakeStoreAccount

## Required for Azure Data Lake Store filesystem management
from azure.datalake.store import core, lib, multithread

# Common Azure imports
import adal
from azure.mgmt.resource.resources import ResourceManagementClient
from azure.mgmt.resource.resources.models import ResourceGroup

## Use these as needed for your application
import logging, getpass, pprint, uuid, time


## Declare variables
subscriptionId = 'FILL-IN-HERE'
adlsAccountName = 'FILL-IN-HERE'

tenant_id = 'FILL-IN-HERE'
client_secret = 'FILL-IN-HERE'
client_id = 'FILL-IN-HERE'


## adlCreds = lib.auth(tenant_id = 'FILL-IN-HERE', client_secret = 'FILL-IN-HERE', client_id = 'FILL-IN-HERE', resource = 'https://datalake.azure.net/')
from azure.common.credentials import ServicePrincipalCredentials
adlCreds = lib.auth(tenant_id, client_secret, client_id, resource = 'https://datalake.azure.net/')


## Create a filesystem client object
adlsFileSystemClient = core.AzureDLFileSystem(adlCreds, store_name=adlsAccountName)

## Create a directory
adlsFileSystemClient.mkdir('/mysampledirectory')

When I try to run the code I get this error:

[Running] python "c:....dls.py"
Traceback (most recent call last):
  File "c:....dls.py", line 38, in <module>
    adlCreds = lib.auth(tenant_id, client_secret, client_id, resource = 'https://datalake.azure.net/')
  File "C:\Python36\lib\site-packages\azure\datalake\store\lib.py", line 130, in auth
    password, client_id)
  File "C:\Python36\lib\site-packages\adal\authentication_context.py", line 145, in acquire_token_with_username_password
    return self._acquire_token(token_func)
  File "C:\Python36\lib\site-packages\adal\authentication_context.py", line 109, in _acquire_token
    return token_func(self)
  File "C:\Python36\lib\site-packages\adal\authentication_context.py", line 143, in token_func
    return token_request.get_token_with_username_password(username, password)
  File "C:\Python36\lib\site-packages\adal\token_request.py", line 280, in get_token_with_username_password
    self._user_realm.discover()
  File "C:\Python36\lib\site-packages\adal\user_realm.py", line 152, in discover
    raise AdalError(return_error_string, error_response)
adal.adal_error.AdalError: User Realm Discovery request returned http error: 404 and server response:

404 - File or directory not found.

[Done] exited with code=1 in 1.216 seconds


Solution

There are two different ways of authenticating. The first is interactive, which suits end users and works even with multi-factor authentication. You need an interactive session in order to log on. Here is how you do it:

from azure.datalake.store import core, lib, multithread
token = lib.auth()

The second method is to use a service principal identity (SPI) in Azure Active Directory. A step-by-step tutorial for setting up an Azure AD application, retrieving the client ID and secret, and configuring access using the SPI is available here: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory#create-an-active-directory-application

from azure.datalake.store import lib
# Pass every value by keyword so the secret and client ID reach the right parameters
token = lib.auth(tenant_id = '<your azure tenant id>', client_secret = '<your client secret>', client_id = '<your client id>')
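
The traceback in the question passes through acquire_token_with_username_password, which suggests that the positional arguments were bound to a username and a password rather than to client_secret and client_id. The sketch below uses a stand-in function to show the difference between the positional call and the keyword call. The stand-in is not the real lib.auth: its parameter order and its branching are assumptions inferred from that traceback.

```python
# Stand-in for lib.auth, NOT the real function: it only mirrors the parameter
# order that the traceback implies (tenant_id, username, password, client_id, ...).
def auth(tenant_id=None, username=None, password=None,
         client_id=None, client_secret=None,
         resource='https://datalake.azure.net/'):
    """Report which login flow the given arguments would select."""
    if username is not None and password is not None:
        return 'username/password'
    if client_id is not None and client_secret is not None:
        return 'service principal'
    return 'interactive'


# Positional call, as in the question: the secret and the client ID land in
# the username and password slots.
positional = auth('my-tenant', 'my-secret', 'my-client-id',
                  resource='https://datalake.azure.net/')

# Keyword call, as in the answer: each value reaches the intended parameter.
keyword = auth(tenant_id='my-tenant', client_secret='my-secret',
               client_id='my-client-id')

print(positional)  # username/password
print(keyword)     # service principal
```

If that reading is right, switching the call in the question to keyword arguments should select the service principal flow instead of user realm discovery.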

Here is a blog post that shows how to access the store through pandas and Jupyter. It also includes a step-by-step guide to getting the authentication token: https://medium.com/azure-data-lake/using-jupyter-notebooks-and-pandas-with-azure-data-lake-store-48737fbad305
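
Putting the service principal method together with the question's script, a minimal sketch might look like the following. The helper names connect and read_csv_rows are hypothetical, the placeholder values and the sample path are made up, and the file is parsed with the standard csv module rather than pandas to keep the example dependency-free. It assumes the file object returned by the filesystem client's open() works as a context manager and has a read() method.

```python
import csv
import io


def connect(tenant_id, client_id, client_secret, store_name):
    """Authenticate as a service principal and return a filesystem client."""
    # Imported here so read_csv_rows can be used without the SDK installed.
    from azure.datalake.store import core, lib

    # Keyword arguments keep the secret out of the username/password slots.
    token = lib.auth(tenant_id=tenant_id,
                     client_id=client_id,
                     client_secret=client_secret)
    return core.AzureDLFileSystem(token, store_name=store_name)


def read_csv_rows(fs, path, encoding='utf-8'):
    """Read a CSV file through fs.open() and return its rows as lists."""
    with fs.open(path, 'rb') as handle:
        text = handle.read().decode(encoding)
    return list(csv.reader(io.StringIO(text, newline='')))


# Example usage (placeholders, not real values):
# fs = connect('<tenant id>', '<client id>', '<client secret>', '<store name>')
# fs.mkdir('/mysampledirectory')
# rows = read_csv_rows(fs, '/mysampledirectory/sample.csv')
```

The service principal also needs access permissions on the Data Lake Store itself, as described in the tutorial linked above, before calls such as mkdir will succeed.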
