Create table operator does not use the correct mode for table columns

Question

As part of a DAG in Airflow, I have a task that uses BigQueryCreateEmptyTableOperator to create a new, empty table. I pass the table's schema in the 'schema_fields' argument as a Python variable that holds the schema as a list. Some columns in the schema have mode 'REQUIRED'.
The DAG runs OK and the table is indeed created, but when I check it in BigQuery, every column has its mode set to 'NULLABLE': the 'REQUIRED' mode has been ignored for the columns that should have it.

Here is my code:

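    from airflow.providers.google.cloud.operators.bigquery import BigQueryCreateEmptyTableOperator
    from airflow.utils.trigger_rule import TriggerRule

    # my_table_schema is the list of field definitions shown further down
    create_table = BigQueryCreateEmptyTableOperator(
        trigger_rule=TriggerRule.NONE_FAILED,
        task_id="create_table",
        dataset_id="my_dataset",
        table_id="my_table",
        project_id="my_project",
        schema_fields=my_table_schema,
        exists_ok=True,
    )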

My schema looks like this:

    my_table_schema = [
        {
            "name": "ID_NO",
            "type": "INTEGER",
            "mode": "REQUIRED"
        },
        {
            "name": "PROD_NAME",
            "mode": "NULLABLE",
            "type": "STRING"
        },
        {
            "name": "DESC",
            "mode": "NULLABLE",
            "type": "STRING"
        }
    ]

So the table gets created with the columns from the schema above, but they've all got NULLABLE mode, even the ID_NO column.

Why is the mode in the schema being replaced by NULLABLE?
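BigQuery treats a column with no mode as NULLABLE, so one thing worth ruling out is that the list the task actually receives differs from the one shown above, for example a field with a missing or misspelled mode. The following is only a minimal sketch in plain Python; `find_mode_problems` and `VALID_MODES` are hypothetical names, not part of Airflow or the BigQuery client.

```python
# Hypothetical helper, not part of Airflow or the BigQuery client library.
# It only inspects the list of dicts that would be passed as schema_fields.
VALID_MODES = {"NULLABLE", "REQUIRED", "REPEATED"}


def find_mode_problems(schema_fields):
    """Return (column name, problem) pairs for fields with a missing or unexpected mode."""
    problems = []
    for field in schema_fields:
        name = field.get("name", "<unnamed>")
        if "mode" not in field:
            # With no mode given, BigQuery falls back to NULLABLE.
            problems.append((name, "no 'mode' key"))
        elif field["mode"] not in VALID_MODES:
            # Flags anything other than the three upper-case spellings;
            # whether BigQuery accepts other spellings is not checked here.
            problems.append((name, f"unexpected mode {field['mode']!r}"))
    return problems


my_table_schema = [
    {"name": "ID_NO", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "PROD_NAME", "mode": "NULLABLE", "type": "STRING"},
    {"name": "DESC", "mode": "NULLABLE", "type": "STRING"},
]

# Columns the schema declares as REQUIRED.
required_columns = [f["name"] for f in my_table_schema if f.get("mode") == "REQUIRED"]

print(find_mode_problems(my_table_schema))  # [] for the schema in the question
print(required_columns)                     # ['ID_NO']
```

For the schema as posted the check finds nothing, which suggests the schema literal itself is fine and the cause lies elsewhere.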

Answer 1

Score: 1

I tried your code setup and it works fine for me. This might be a transient issue; do check that the Airflow environment is using the correct DAG file.

Here is what I tried:

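from datetime import datetime

from airflow import models
from airflow.providers.google.cloud.operators.bigquery import BigQueryCreateEmptyTableOperator

DAG_ID = "example_bigquery_create_table"  # placeholder; the original DAG id was not shown

with models.DAG(
    DAG_ID,
    schedule="@once",
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=["example", "bigquery"],
) as dag:
    createtable1 = BigQueryCreateEmptyTableOperator(
        task_id="createtable",
        dataset_id="my-dataset",
        table_id="mytablecomposer",
        project_id="my-project",
        schema_fields=[
            {"name": "ID_NO", "type": "INTEGER", "mode": "REQUIRED"},
            {"name": "PROD_NAME", "mode": "NULLABLE", "type": "STRING"},
            {"name": "DESC", "mode": "NULLABLE", "type": "STRING"},
        ],
        exists_ok=True,
    )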

BigQuery Table:

[Screenshot of the resulting BigQuery table; image not preserved]
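To check the result without relying on the console, the declared schema can be compared with what BigQuery reports, for example the JSON printed by `bq show --schema --format=prettyjson my_project:my_dataset.my_table`. The sketch below is plain Python under stated assumptions: `diff_modes` is a hypothetical helper, and the `reported_json` string is an illustrative stand-in that mirrors the symptom described in the question, not real output. A table left over from an earlier run is also worth ruling out: with `exists_ok=True` an existing table is not treated as an error, and it may simply be kept with its old schema.

```python
import json


def diff_modes(expected_fields, actual_fields):
    """Return {column: (expected_mode, actual_mode)} for columns whose modes differ."""
    # A field with no mode is treated as NULLABLE, which is BigQuery's default.
    actual = {f["name"]: f.get("mode", "NULLABLE") for f in actual_fields}
    mismatches = {}
    for field in expected_fields:
        want = field.get("mode", "NULLABLE")
        got = actual.get(field["name"])  # None if the column is absent
        if got != want:
            mismatches[field["name"]] = (want, got)
    return mismatches


# Schema passed to the operator, as in the question.
expected = [
    {"name": "ID_NO", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "PROD_NAME", "mode": "NULLABLE", "type": "STRING"},
    {"name": "DESC", "mode": "NULLABLE", "type": "STRING"},
]

# Illustrative stand-in for the schema JSON reported by BigQuery (not real output).
reported_json = """
[
  {"name": "ID_NO", "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "PROD_NAME", "type": "STRING", "mode": "NULLABLE"},
  {"name": "DESC", "type": "STRING", "mode": "NULLABLE"}
]
"""

print(diff_modes(expected, json.loads(reported_json)))
# {'ID_NO': ('REQUIRED', 'NULLABLE')}
```

An empty dictionary means the table's modes match the declared schema; any entry names a column whose mode differs.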

huangapple
  • Posted on April 4, 2023, 11:08:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75925222.html