dbt-expectations列类型timestamptz测试错误地失败

huangapple go评论61阅读模式
英文:

dbt-expectations column_type timestamptz test improperly fails

问题

The timestamp and timestamp with time zone columns are failing the dbt-expectations tests because the expected data types specified in your YAML file do not match the actual data types of the columns in your table.

In your YAML file, you have specified that the "updated_at" column should have the data type "timestamptz," but in your table, it appears to have the data type "TIMESTAMP WITH TIME ZONE." Similarly, you have specified that the "observed" column should have the data type "date," but in your table, it might have a different data type.

To resolve this issue, you should update your dbt-expectations YAML file to match the actual data types of the columns in your table. You can do this by specifying the correct data types in the YAML file. For example:

- name: stg_holidays
    description: "Victorian State Holidays"
    columns:
      - name: holiday
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: text
      - name: observed
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: timestamp with time zone  # Update the data type here
      - name: updated_at
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: timestamp with time zone  # Update the data type here

By specifying the correct data types in your YAML file, the dbt-expectations tests should pass when they match the actual data types of your columns in the table.

英文:

Using a sql file

SELECT 
    holiday::text
    , observed::date
    , NOW()::timestamptz updated_at
FROM {{ ref('seed_holidays') }}

Upon dbt run -s stage (where stage is the model/schema) I get as expected a timestamptz column in the corresponding table (stg_holidays) with value like '2023-07-23 14:37:29.060 +1000', and inspecting the data_type via DBeaver confirms it's type timestamptz.

Next, running dbt test -s stage, with a spec in the model stg_schema.yml:

- name: stg_holidays
    description: "Victorian State Holidays"
    columns:
      - name: holiday
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: text
      - name: observed
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: date
      - name: updated_at
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: timestamptz

I get an error:

> Failure in test
> dbt_expectations_expect_column_values_to_be_of_type_stg_holidays_updated_at__timestamptz
> (models/stage/stg_schema.yml)
> Got 1 result, configured to fail if != 0
>
> compiled Code at target/compiled/[proj_name]/models/stage/stg_schema.yml/dbt_expectations_expect_column_[hash].sql

Running the test query at the path given is pretty useless:

with relation_columns as (

        
        select
            cast('HOLIDAY' as TEXT) as relation_column,
            cast('TEXT' as TEXT) as relation_column_type
        union all
        
        select
            cast('OBSERVED' as TEXT) as relation_column,
            cast('DATE' as TEXT) as relation_column_type
        union all
        
        select
            cast('UPDATED_AT' as TEXT) as relation_column,
            cast('TIMESTAMP WITH TIME ZONE' as TEXT) as relation_column_type
        
        
    ),
    test_data as (

        select
            *
        from
            relation_columns
        where
            relation_column = 'UPDATED_AT'
            and
            relation_column_type not in ('TIMESTAMPTZ')

    )
    select *
    from test_data

Obviously this returns

relation_column relation_column_type
UPDATED_AT TIMESTAMP WITH TIME ZONE

The same thing happens with just timestamp type.

Why do these timestamp and timestamp columns fail dbt-expectations tests?

答案1

得分: 0

翻译后的内容如下:

我只有在为这个问题撰写文档时才想到答案,但我认为我应该发布问题和答案,以免其他人感到烦恼。

对于 timestamptz(在 PostgreSQL 下),dbt-expectations 测试应该是:

  - name: stg_holidays
    columns:
      ...
      - name: updated_at
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: timestamp with time zone

或者对于 timestamp 使用 column_type: timestamp without time zone

在查看 dbt-expectations 文档时,这个问题没有立刻显现出来,但它确实解决了问题。

英文:

The answer only occurred to me while documenting for the question, but I thought I'd post question and answer to save others the irritation.

for timestamptz (under postgresql) the dbt-expectations test has to be

  - name: stg_holidays
    columns:
      ...
      - name: updated_at
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: timestamp with time zone

or column_type: timestamp without time zone for timestamp.

This hadn't jumped out at me in combing the dbt-expectations docs, but it does solve the problem.

huangapple
  • 本文由 发表于 2023年7月23日 16:32:14
  • 转载请务必保留本文链接:https://go.coder-hub.com/76747303.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定