How to parse Amazon's RDS instance information using jq with streams?

Question


While parsing Amazon's RDS instance data to build an object that maps each instanceType to its memory in bytes, jq used almost 1 GB of memory. I tried jq's --stream flag to reduce that, but I can only get it to output the key on one line and the value on the next.

Here is the jq expression from my current attempt. Note that (. | tostream) is used only because the playground does not support the --stream flag.

def isMemory: .[0][3] == "memory";
def isInstanceType: .[0][3] == "instanceType";

(. | tostream) | select(isMemory or isInstanceType) | .[1]

This outputs:

"db.r5.24xlarge"
"768 GiB"
"db.r4.large"
"15.25 GiB"

When run against a smaller version of that JSON:

{
  "formatVersion" : "v1.0",
  "disclaimer" : "This pricing list is for informational purposes only. All prices are subject to the additional terms included in the pricing pages on http://aws.amazon.com. All Free Tier prices are also subject to the terms included at https://aws.amazon.com/free/",
  "offerCode" : "AmazonRDS",
  "version" : "20230328234721",
  "publicationDate" : "2023-03-28T23:47:21Z",
  "products" : {
    "BHYABS232JP4AGQY" : {
      "sku" : "BHYABS232JP4AGQY",
      "productFamily" : "Database Instance",
      "attributes" : {
        "servicecode" : "AmazonRDS",
        "location" : "US East (Ohio)",
        "locationType" : "AWS Region",
        "instanceType" : "db.r5.24xlarge",
        "currentGeneration" : "Yes",
        "instanceFamily" : "Memory optimized",
        "vcpu" : "96",
        "physicalProcessor" : "Intel Xeon Platinum 8175",
        "clockSpeed" : "Up to 3.1 GHz",
        "memory" : "768 GiB",
        "storage" : "EBS Only",
        "networkPerformance" : "25 Gigabit",
        "processorArchitecture" : "64-bit",
        "engineCode" : "18",
        "databaseEngine" : "MariaDB",
        "licenseModel" : "No license required",
        "deploymentOption" : "Single-AZ",
        "usagetype" : "USE2-InstanceUsage:db.r5.24xl",
        "operation" : "CreateDBInstance:0018",
        "dedicatedEbsThroughput" : "14000 Mbps",
        "enhancedNetworkingSupported" : "Yes",
        "instanceTypeFamily" : "R5",
        "normalizationSizeFactor" : "192",
        "regionCode" : "us-east-2",
        "servicename" : "Amazon Relational Database Service"
      }
    },
    "D8GBHQEK73G5ADCK" : {
      "sku" : "D8GBHQEK73G5ADCK",
      "productFamily" : "Database Instance",
      "attributes" : {
        "servicecode" : "AmazonRDS",
        "location" : "Asia Pacific (Tokyo)",
        "locationType" : "AWS Region",
        "instanceType" : "db.r4.large",
        "currentGeneration" : "No",
        "instanceFamily" : "Memory optimized",
        "vcpu" : "2",
        "physicalProcessor" : "Intel Xeon E5-2686 v4 (Broadwell)",
        "clockSpeed" : "2.3 GHz",
        "memory" : "15.25 GiB",
        "storage" : "EBS Only",
        "networkPerformance" : "Up to 10 Gigabit",
        "processorArchitecture" : "64-bit",
        "engineCode" : "2",
        "databaseEngine" : "MySQL",
        "licenseModel" : "No license required",
        "deploymentOption" : "Multi-AZ",
        "usagetype" : "APN1-Multi-AZUsage:db.r4.large",
        "operation" : "CreateDBInstance:0002",
        "dedicatedEbsThroughput" : "400 Mbps",
        "enhancedNetworkingSupported" : "Yes",
        "instanceTypeFamily" : "R4",
        "normalizationSizeFactor" : "8",
        "processorFeatures" : "Intel AVX, Intel AVX2, Intel Turbo",
        "regionCode" : "ap-northeast-1",
        "servicename" : "Amazon Relational Database Service"
      }
    }
  }
}

You can also try it in the jq playground.

The initial expression I had, which used too much memory, was the following:

.products | to_entries | map(.value.attributes | select(.instanceType != null) | {(.instanceType): ((.memory | split(" ") | .[0] | tonumber) * 1024 * 1024 * 1024)}) | add

It works, but it exhausts memory every time I run it.

Answer 1

Score: 2

Your implementation using tostream still loads the whole input first, and only then breaks it down into its streamed form. Instead, use the --stream flag to read the input in streamed form right away.
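
To make the streamed form concrete, here is a minimal sketch of the events that --stream emits. The one-product sample and the key SKU1 are made up for illustration and are not the real pricing file.

```shell
# Hypothetical one-product sample with only the two attributes of interest.
sample='{"products":{"SKU1":{"attributes":{"instanceType":"db.r4.large","memory":"15.25 GiB"}}}}'

# --stream makes jq emit [path, leaf] events while it parses,
# plus [path] closing events whenever an object or array ends.
# -c prints each event on a single line.
events=$(printf '%s' "$sample" | jq -c --stream '.')
printf '%s\n' "$events"
```

The first two events are [["products","SKU1","attributes","instanceType"],"db.r4.large"] and [["products","SKU1","attributes","memory"],"15.25 GiB"], followed by four closing events that carry only a path. This is why the question's filters test .[0][3] for the key name and read the value from .[1].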

Here's one approach that loads only the individual .attributes objects into memory in full. Using reduce lets you build up the final object iteratively.

jq --stream -n '
  reduce fromstream(3|truncate_stream(
    inputs | select(.[0][2] == "attributes")
  )) as {$instanceType, $memory} ({};
    if $instanceType then .[$instanceType] = $memory else . end
  )
'

Output:

{
  "db.r5.24xlarge": "768 GiB",
  "db.r4.large": "15.25 GiB"
}
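
If the values should be in bytes, as in the question, the conversion from the original expression can probably be folded into the same reduce. This is only a sketch: the sample input is a trimmed, made-up stand-in for the pricing file (keys A and B are hypothetical), and it assumes every memory string has the form "<number> GiB". Values in any other form would make tonumber fail.

```shell
# Hypothetical trimmed sample with two products.
sample='{"products":{"A":{"attributes":{"instanceType":"db.r5.24xlarge","memory":"768 GiB"}},"B":{"attributes":{"instanceType":"db.r4.large","memory":"15.25 GiB"}}}}'

# Same streaming reduce as above; the only change is converting "N GiB"
# to bytes with the arithmetic from the question's original expression.
# Objects lacking either field are skipped.
result=$(printf '%s' "$sample" | jq --stream -n -c '
  reduce fromstream(3|truncate_stream(
    inputs | select(.[0][2] == "attributes")
  )) as {$instanceType, $memory} ({};
    if $instanceType and $memory then
      .[$instanceType] = (($memory | split(" ") | .[0] | tonumber) * 1024 * 1024 * 1024)
    else . end
  )
')
printf '%s\n' "$result"
```

For this sample, db.r5.24xlarge maps to 824633720832 and db.r4.large maps to 16374562816. Only one .attributes object plus the accumulated result is held at a time, so memory use should grow with the number of distinct instance types rather than with the size of the file.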

huangapple
  • Posted on 2023-03-31 02:56:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75892008.html