BigQuery use_avro_logical_types ignored in a Python script
Problem
I am trying to load Avro files into BigQuery with a Python script. The load itself succeeds, but I am having trouble getting BigQuery to use Avro's logical data types when it creates the table.
Google documents the use of these logical types here, and support for them was added to the google-cloud-python library here.
I am not a programmer by trade, but I hope the snippet below is correct... However, the use_avro_logical_types property seems to be ignored, and the timestamps are loaded as INTEGER instead of TIMESTAMP.
...
with open(full_name, 'rb') as source_file:
    var_job_config = google.cloud.bigquery.job.LoadJobConfig()
    var_job_config.source_format = 'AVRO'
    var_job_config.use_avro_logical_types = True
    job = client.load_table_from_file(
        source_file, table_ref, job_config=var_job_config)
    job.result()  # Waits for the job to complete
...
The Avro schema is as follows:
{
  "type": "record",
  "name": "table_test",
  "fields": [
    {
      "name": "id_",
      "type": {
        "type": "bytes",
        "logicalType": "decimal",
        "precision": 29,
        "scale": 0
      }
    },
    {
      "name": "datetime_",
      "type": ["null", {
        "type": "long",
        "logicalType": "timestamp-micros"
      }]
    },
    {
      "name": "integer_",
      "type": ["null", {
        "type": "bytes",
        "logicalType": "decimal",
        "precision": 29,
        "scale": 0
      }]
    },
    {
      "name": "varchar_",
      "type": ["null", {
        "type": "string",
        "logicalType": "varchar",
        "maxLength": 60
      }]
    },
    {
      "name": "capture_time",
      "type": {
        "type": "long",
        "logicalType": "timestamp-millis"
      }
    },
    {
      "name": "op_type",
      "type": "int"
    },
    {
      "name": "seq_no",
      "type": {
        "type": "string",
        "logicalType": "varchar",
        "maxLength": 16
      }
    }
  ]
}
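The fields that matter for the symptom above are the two timestamp columns. As a small stdlib-only sketch (abridged to those two fields, and not part of the load job itself), this pulls each field's logicalType out of the schema — the annotation that BigQuery only honors when use_avro_logical_types is set:

```python
import json

# Abridged copy of the schema above, keeping only the timestamp fields.
schema = json.loads("""
{
  "type": "record",
  "name": "table_test",
  "fields": [
    {"name": "datetime_",
     "type": ["null", {"type": "long", "logicalType": "timestamp-micros"}]},
    {"name": "capture_time",
     "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
""")

def logical_type(field_type):
    """Return the logicalType of an Avro field's type, unwrapping a
    ["null", {...}] nullable union if present."""
    if isinstance(field_type, list):  # nullable union
        field_type = next(t for t in field_type if t != "null")
    if isinstance(field_type, dict):
        return field_type.get("logicalType")
    return None  # plain primitive such as "int"

for field in schema["fields"]:
    print(field["name"], "->", logical_type(field["type"]))
# datetime_ -> timestamp-micros
# capture_time -> timestamp-millis
```

Without the flag, both columns fall back to the underlying Avro type (long) and land as INTEGER, which is exactly the behavior described above.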
Can anyone shed some light on this? Thanks!
Solution
Apparently my Python libraries were not as up to date as I thought. Updating my Google Cloud libraries resolved the issue. Thanks for the input, shollyman.
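For what it's worth, the failure is silent because of how such config objects typically build their request body: only properties the class knows about are serialized, so on a release that predates the feature, assigning use_avro_logical_types just creates an ordinary instance attribute that never reaches the API. A minimal stdlib-only sketch of that pitfall (OldLoadJobConfig is a stand-in for illustration, not the real library class):

```python
class OldLoadJobConfig:
    """Stand-in (a sketch, not the real class) for a client-library
    config object from before use_avro_logical_types existed."""

    def __init__(self):
        self._properties = {}

    @property
    def source_format(self):
        return self._properties.get('sourceFormat')

    @source_format.setter
    def source_format(self, value):
        self._properties['sourceFormat'] = value

    def to_api_repr(self):
        # Only the known-properties dict is sent to the API.
        return dict(self._properties)


cfg = OldLoadJobConfig()
cfg.source_format = 'AVRO'
cfg.use_avro_logical_types = True  # no such property: stored as a plain
                                   # attribute, never serialized
print(cfg.to_api_repr())  # {'sourceFormat': 'AVRO'} -- the flag is dropped
```

On a version where LoadJobConfig actually defines the property, the same assignment goes through a setter into the serialized config, which is why upgrading the library fixed it.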