人口按美国地区划分

huangapple go评论61阅读模式
英文:

Population by United States Region

问题

--现在我卡在这一步。我无法将数据放入region_pop列。我如何获得按美国地区划分的人口总和?

英文:

Using SQL Let’s say we have a dataset of population by state (e.g., Vermont, 623,251, and so on), but we want to know the population by United States region (e.g., Midwest, 68,985,454). Could you describe how you would go about doing that?

Dataset from census.gov

Where I'm stuck at

--First I created a table with a state and population column.

CREATE TABLE states (
	state VARCHAR(20),
	population INT
);

--Then I uploaded a CSV file from census.gov that I cleaned up.

SELECT * FROM states;

--Created a temporary table to add in the region column.

DROP TABLE IF EXISTS temp_Regions;

CREATE TEMP TABLE temp_Regions (
	state VARCHAR(20),
	state_pop INT,
	region VARCHAR(20),
	region_pop INT
);

INSERT INTO temp_Regions
SELECT state, population
FROM states;

--Used CASE WHEN statements to put states in to their respective regions.

SELECT state,
	CASE WHEN state IN ('Connecticut', 'Maine', 'Massachusetts', 'New Hampshire', 'Rhode Island', 'Vermont', 'New Jersey', 'New York', 'Pennsylvania') THEN 'Northeast'
		 WHEN state IN ('Illinois', 'Indiana', 'Michigan', 'Ohio', 'Wisconsin', 'Iowa', 'Kansas', 'Minnesota', 'Missouri', 'Nebraska', 'North Dakota', 'South Dakota') THEN 'Midwest'
	     WHEN state IN ('Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Maryland', 'North Carolina', 'South Carolina', 'Virginia', 'West Virginia', 'Alabama', 'Kentucky', 'Mississippi', 'Tennessee', 'Arkansas', 'Louisiana', 'Oklahoma', 'Texas') THEN 'South'
		 WHEN state IN ('Arizona', 'Colorado', 'Idaho', 'Montana', 'Nevada', 'New Mexico', 'Utah', 'Wyoming', 'Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') THEN 'West'
	END AS region, state_pop, region_pop
	FROM temp_Regions;

--Now I'm stuck at this point. I'm unable to get data into the region_pop column. How do I get the sum of the populations by U.S. Region?

Let me know if you need further clarification on things. Thanks for your help y'all!

答案1

得分: 0

您可以使用分析函数sum() over(partition by)来实现这个目标。

with data
as (
    SELECT state
          ,CASE WHEN state IN ('Connecticut', 'Maine', 'Massachusetts', 'New Hampshire', 'Rhode Island', 'Vermont', 'New Jersey', 'New York', 'Pennsylvania') THEN 'Northeast'
             WHEN state IN ('Illinois', 'Indiana', 'Michigan', 'Ohio', 'Wisconsin', 'Iowa', 'Kansas', 'Minnesota', 'Missouri', 'Nebraska', 'North Dakota', 'South Dakota') THEN 'Midwest'
             WHEN state IN ('Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Maryland', 'North Carolina', 'South Carolina', 'Virginia', 'West Virginia', 'Alabama', 'Kentucky', 'Mississippi', 'Tennessee', 'Arkansas', 'Louisiana', 'Oklahoma', 'Texas') THEN 'South'
             WHEN state IN ('Arizona', 'Colorado', 'Idaho', 'Montana', 'Nevada', 'New Mexico', 'Utah', 'Wyoming', 'Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') THEN 'West'
            END AS region
            , state_pop
        FROM temp_Regions
      )
select state
       ,region
       ,state_pop
       ,sum(state_pop) over(partition by region) as region_population
  from data
英文:

You can make use of analytical function sum() over(partition by) to achieve this

with data
 as (
    SELECT state
          ,CASE WHEN state IN ('Connecticut', 'Maine', 'Massachusetts', 'New Hampshire', 'Rhode Island', 'Vermont', 'New Jersey', 'New York', 'Pennsylvania') THEN 'Northeast'
             WHEN state IN ('Illinois', 'Indiana', 'Michigan', 'Ohio', 'Wisconsin', 'Iowa', 'Kansas', 'Minnesota', 'Missouri', 'Nebraska', 'North Dakota', 'South Dakota') THEN 'Midwest'
             WHEN state IN ('Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Maryland', 'North Carolina', 'South Carolina', 'Virginia', 'West Virginia', 'Alabama', 'Kentucky', 'Mississippi', 'Tennessee', 'Arkansas', 'Louisiana', 'Oklahoma', 'Texas') THEN 'South'
             WHEN state IN ('Arizona', 'Colorado', 'Idaho', 'Montana', 'Nevada', 'New Mexico', 'Utah', 'Wyoming', 'Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') THEN 'West'
            END AS region
            , state_pop
        FROM temp_Regions
      )
select state
       ,region
       ,state_pop
       ,sum(state_pop) over(partition by region) as region_population
  from data

huangapple
  • 本文由 发表于 2023年1月6日 13:04:40
  • 转载请务必保留本文链接:https://go.coder-hub.com/75027114.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定