How do I sync MySQL to BigQuery in real time?

2021-12-30 00:00:00 google-bigquery mysql

Currently I have a script that first deletes the table and then uploads it from MySQL to BigQuery. It has failed many times, and it only runs once a day. I am looking for a scalable, real-time solution. Your help will be much appreciated :)

Recommended answer

Read this series of posts from WePay, where they detail how they sync their MySQL databases to BigQuery using Airflow:

  • https://wecode.wepay.com/posts/wepays-data-warehouse-bigquery-airflow
  • https://wecode.wepay.com/posts/airflow-wepay
  • (the third one is about BigQuery)

As a summary (quoting):

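  • Set up authentication, connections, and the DAG.
  • Define which columns to pull from MySQL and load into BigQuery.
  • Choose how to load the data: incrementally or fully.
  • De-duplicate.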
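
To make the last three steps a bit more concrete, here is a minimal Python sketch of the queries involved. It is not WePay's actual code. The table `payments`, the key column `id`, and the timestamp column `modify_time` are hypothetical names chosen for illustration, and the sketch assumes every row carries a last-modified timestamp. It only builds the SQL strings. Running them against MySQL and BigQuery (for example from an Airflow task) is left out.

```python
from typing import Optional, Sequence, Tuple


def build_extract_query(
    table: str,
    columns: Sequence[str],
    watermark_column: Optional[str] = None,
    last_watermark: Optional[str] = None,
) -> Tuple[str, tuple]:
    """Build a parameterized MySQL query that pulls only the chosen columns.

    Without a watermark this is a full load. With one, only rows modified
    after the previous run are selected (incremental load).
    """
    col_list = ", ".join(f"`{c}`" for c in columns)
    sql = f"SELECT {col_list} FROM `{table}`"
    if watermark_column is None or last_watermark is None:
        return sql, ()
    # %s is the placeholder style used by common MySQL drivers for Python.
    sql += f" WHERE `{watermark_column}` > %s ORDER BY `{watermark_column}`"
    return sql, (last_watermark,)


def build_dedup_query(bq_table: str, key_column: str, watermark_column: str) -> str:
    """Build a BigQuery standard SQL query keeping the newest row per key.

    Incremental loads append a new copy of a row each time it changes, so
    the loaded table can hold several versions of the same key.
    """
    return (
        "SELECT * EXCEPT(row_num) FROM ("
        "SELECT *, ROW_NUMBER() OVER ("
        f"PARTITION BY {key_column} ORDER BY {watermark_column} DESC) AS row_num "
        f"FROM `{bq_table}`) WHERE row_num = 1"
    )


if __name__ == "__main__":
    # Hypothetical table, columns, and watermark value.
    sql, params = build_extract_query(
        "payments", ["id", "amount", "modify_time"],
        watermark_column="modify_time", last_watermark="2021-12-29 00:00:00",
    )
    print(sql, params)
    print(build_dedup_query("warehouse.payments", "id", "modify_time"))
```

The incremental variant avoids the delete-and-reupload pattern from the question, and it can be scheduled far more often than once a day because each run only moves the rows that changed. Note that it does not see rows deleted in MySQL, so an occasional full load may still be needed.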
