How can I join multiple Azure databases without permission to configure external tables?
In my current setup I connect to Azure SQL using Authentication=Active Directory - Integrated. This method of access only allows access to a single database at a time. The architecture was migrated from an on-premises SQL Server environment, with changes made to make cloud development feasible, but analytics and debugging still have to happen across databases. Typically one would simply do a cross-database join with a legacy SQL Server configuration, possibly involving linked servers if the databases were on separate servers. In the cloud, or Azure specifically, you are supposed to use Elastic Query (external tables) or avoid the architecture designs that make this necessary. On the source data I only have read access, which is a problem since you need the ALTER ANY EXTERNAL DATA SOURCE permission, so I am thinking I may have to resort to an ETL tool to perform this type of join, but I am wondering whether there are other options or tools that would let one essentially perform cross-database or cross-server joins in Azure with only read access.
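For context, this is roughly what the Azure SQL elastic query setup for a cross-database join looks like (server, credential, table, and column names below are made up for illustration); it is exactly the CREATE EXTERNAL DATA SOURCE / CREATE EXTERNAL TABLE step that the missing ALTER ANY EXTERNAL DATA SOURCE permission blocks:

```sql
-- Requires ALTER ANY EXTERNAL DATA SOURCE (plus rights to create the external
-- table), which is the permission that is missing in this scenario.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL RemoteDbCred
    WITH IDENTITY = 'remote_reader', SECRET = '<password>';

CREATE EXTERNAL DATA SOURCE RemoteDb
    WITH (
        TYPE = RDBMS,
        LOCATION = 'myserver.database.windows.net',  -- hypothetical server
        DATABASE_NAME = 'SourceDb',                  -- hypothetical database
        CREDENTIAL = RemoteDbCred
    );

-- Local "shadow" of the remote table; columns are hypothetical.
CREATE EXTERNAL TABLE dbo.RemoteOrders (
    OrderId    INT,
    CustomerId INT,
    Amount     DECIMAL(18, 2)
)
WITH (DATA_SOURCE = RemoteDb);

-- Once the external table exists, a normal cross-database join works:
SELECT c.CustomerId, c.Name, o.Amount
FROM dbo.Customers AS c
JOIN dbo.RemoteOrders AS o ON o.CustomerId = c.CustomerId;
```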
What are some ideas for performing these types of joins given only read access?
Accepted Answer
I was hoping a savant would answer this one and tell me I was dumb for asking, but here are my current thoughts; maybe this will bring her out... The path forward as I see it is below.
Given only read access, it is best to pull all of the data at the onset from the different sources into an environment where I have full development rights (i.e. a PostgreSQL or MySQL server on localhost if the data is small... or potentially a single development database in Azure that the enterprise allows full development rights on). Build this in an ETL tool, i.e. SSIS, and run it whenever updated data is needed for any delta in the source. Subsequently, perform the joins on this consolidated data.
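As a minimal sketch of that idea, assuming the ETL tool has already landed the two read-only sources into hypothetical staging schemas in the development database, the delta check and the consolidated join might look roughly like this (T-SQL flavored; all names are invented):

```sql
-- Hypothetical staging schemas (stg_src1, stg_src2) populated by the ETL pull
-- from the read-only sources.

-- Delta refresh: find the current watermark in the staging copy. The actual
-- pull of rows WHERE ModifiedDate > @watermark happens in the ETL tool
-- (e.g. an SSIS data flow parameterized with this value).
DECLARE @watermark DATETIME2 =
    (SELECT COALESCE(MAX(ModifiedDate), '1900-01-01') FROM stg_src2.Orders);

-- Once both sources are landed locally, the "cross-database" join is just a join:
SELECT c.CustomerId, c.Name, o.OrderId, o.Amount
FROM   stg_src1.Customers AS c
JOIN   stg_src2.Orders    AS o ON o.CustomerId = c.CustomerId;
```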
This makes sense because I need to experiment with the joins, I am dealing with a small enough or transferable amount of data that duplicating it is not a resource hog, and I can perform the experimental joins directly in a SQL editor instead of an ETL tool.
If I had a much larger dataset, I would attempt to sample it down, perform the same exercise, and then, once the cross-server join was understood, execute it at scale in an ETL tool, probably in some sort of batch fashion that ideally could eventually be executed in parallel.
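As a rough, hypothetical illustration of that batch idea (reusing the invented staging names from above), the scaled-up join could be cut into independent key ranges, each of which the ETL tool could eventually run as its own, potentially parallel, task:

```sql
-- Hypothetical key-range batching of the large join so each range is an
-- independent unit of work.
DECLARE @lo BIGINT = 1, @step BIGINT = 100000, @max BIGINT;
SELECT @max = MAX(CustomerId) FROM stg_src1.Customers;

WHILE @lo <= @max
BEGIN
    INSERT INTO dbo.JoinedResults (CustomerId, Name, OrderId, Amount)
    SELECT c.CustomerId, c.Name, o.OrderId, o.Amount
    FROM   stg_src1.Customers AS c
    JOIN   stg_src2.Orders    AS o ON o.CustomerId = c.CustomerId
    WHERE  c.CustomerId >= @lo AND c.CustomerId < @lo + @step;

    SET @lo = @lo + @step;
END;
```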