Terraform: Create AWS Glue tables using for_each

Question

I need to create multiple Glue databases, one per `name`, and for each database I need to create the Glue tables listed in its `tablenames` list.

    variable "my_map" {
      type = map(object({
        name       = string
        tablenames = list(string)
      }))

      default = {
        obj1 = {
          name       = "NameA1",
          tablenames = ["TableA1", "TableA2", "TableA3"]
        },
        obj2 = {
          name       = "NameB1",
          tablenames = ["TableB1"]
        },
      }
    }

    resource "aws_s3_bucket" "my_bucket" {
      for_each = var.my_map
      bucket   = "${lower(each.value.name)}"
    }

    resource "aws_glue_catalog_database" "my_glue_database" {
      for_each = var.my_map
      name     = "${each.value.name}"
    }

Now I need to create the Glue tables. How do I use `for_each` so that `database_name` and `location` point to the resources created above?
The `suffix` part of `location` should also be populated from the corresponding table name.

    resource "aws_glue_catalog_table" "my_glue_table" {
      for_each      = var.my_map    
      name          = "xxxxxxxxxxxxx" #--> here i needed the name to be from the tablename of my_map
      database_name = aws_glue_catalog_database.my_glue_database[each.key].name
      table_type    = "EXTERNAL_TABLE"
    
      partition_keys {
        name = "date"
        type = "date"
      }
    
      storage_descriptor {
        columns {
          name = "file_name"
          type = "string"
        }
        .
        .
        .
    
        compressed    = false
        location      = "s3://${aws_s3_bucket.my_bucket[each.key].id}/suffix/" ##### here suffix would be from the maps tablenames list
        input_format  = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
        output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
    
        ser_de_info {
          name                  = "ParquetHiveSerDe"
          serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
        }
      }
    }

I need to create 2 Glue databases and S3 buckets, and 4 Glue tables that depend on these two resources.


# Answer 1
**Score**: 1

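> Is it possible to use for_each inside a for_each

Yes. The following produces a **flattened** version of your `my_map`, which you can then iterate over easily: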

```hcl
locals {
  # One entry per (map key, table name) pair, keyed as "<map key>-<table name>".
  flat_my_map = merge([
    for k, v in var.my_map : {
      for tname in v.tablenames :
      "${k}-${tname}" => {
        key1      = k
        name      = v.name
        tablename = tname
      }
    }
  ]...) # do NOT remove the dots
}
```

Then:

```hcl
resource "aws_glue_catalog_table" "my_glue_table" {
  # Iterate over the local value defined above (local., not var.).
  for_each = local.flat_my_map

  name          = each.value.tablename
  database_name = aws_glue_catalog_database.my_glue_database[each.value.key1].name
  table_type    = "EXTERNAL_TABLE"

  partition_keys {
    name = "date"
    type = "date"
  }

  storage_descriptor {
    columns {
      name = "file_name"
      type = "string"
    }
    # ...

    compressed    = false
    location      = "s3://${aws_s3_bucket.my_bucket[each.value.key1].id}/${each.value.tablename}/" # suffix taken from the tablenames list
    input_format  = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
    output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"

    ser_de_info {
      name                  = "ParquetHiveSerDe"
      serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
    }
  }
}
```
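
Before wiring the flattened map into resources, it can help to check what it evaluates to. The sketch below is self-contained: it repeats the variable from the question and needs no AWS provider. It builds the same kind of map with `flatten` and a `for` expression, which is an alternative spelling of the idea above rather than the code from the answer. The names `table_pairs`, `flat_my_map_alt` and `flat_my_map_keys` are made up for illustration.

```hcl
# Same variable as in the question.
variable "my_map" {
  type = map(object({
    name       = string
    tablenames = list(string)
  }))

  default = {
    obj1 = {
      name       = "NameA1"
      tablenames = ["TableA1", "TableA2", "TableA3"]
    }
    obj2 = {
      name       = "NameB1"
      tablenames = ["TableB1"]
    }
  }
}

locals {
  # One list element per (map key, table name) pair.
  table_pairs = flatten([
    for k, v in var.my_map : [
      for tname in v.tablenames : {
        key1      = k
        name      = v.name
        tablename = tname
      }
    ]
  ])

  # Re-key the list so that for_each receives a map with unique, stable keys.
  flat_my_map_alt = {
    for p in local.table_pairs : "${p.key1}-${p.tablename}" => p
  }
}

# Hypothetical output, used only to inspect the result.
output "flat_my_map_keys" {
  value = keys(local.flat_my_map_alt)
}
```

With the defaults from the question, this should list four keys (`obj1-TableA1`, `obj1-TableA2`, `obj1-TableA3`, `obj2-TableB1`), one per Glue table. Because each key combines the map key and the table name, removing one table from a list should not affect the resource addresses of the others.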

huangapple
  • Posted on March 8, 2023, 17:29:07
  • Please keep the link to this article when reposting: https://go.coder-hub.com/75671328.html