英文:
Duplicate row N times while incrementing integer field
问题
### 目的
我有一个数据集,其中包含物体被获取的年份以及物体的寿命。我想为每一行创建副本,以便为物体的每一年都有一行数据。我还想对重复的行逐年递增年份列。
### 示例
以下是数据的简单示例。
```sql
CREATE TEMP TABLE "data"("year" int, lifespan int);
INSERT INTO "data"("year", lifespan) VALUES (2000,3), (2012,2), (2021,1);
SELECT * FROM "data";
结果如下:
year | lifespan
-------+----------
2000 | 3
2012 | 2
2021 | 1
我在PostgreSQL中通过使用 generate_series()
函数成功实现了复制,但似乎在Redshift中,该函数实现了完全不同的目标。以下是我在PostgreSQL中实现此目的的方式:
select *, generate_series(1, (lifespan)::integer) i from "data"
然后我可以使用 row_number()
, over
, 和 partition
来递增年份字段。
期望输出
我希望得到如下所示的数据:
year | lifespan
-------+----------
2000 | 3
2001 | 3
2002 | 3
2012 | 2
2013 | 2
2021 | 1
<details>
<summary>英文:</summary>
### Objective
I have a dataset which has a year in which an object was acquired and the lifespan of the object. I would like to create copies of each row, so that there is a row for each year of the object's lifespan. I would also like to increment the year column by one for each duplicate row.
### Example
Here is a simple example of the data.
CREATE TEMP TABLE "data"("year" int, lifespan int);
INSERT INTO "data"("year", lifespan) VALUES (2000,3), (2012,2), (2021,1);
SELECT * FROM "data";
which results in:
year | lifespan
-------+----------
2000 | 3
2012 | 2
2021 | 1
I was able to accomplish the duplication in PostgreSQL by using the `generate_series()` function, but it seems the function in Redshift accomplishes a quite different goal. Here is how I would accomplish this is PostgreSQL:
select *, generate_series(1, (lifespan)::integer) i from "data"
I could then use `row_number()`, `over`, and `partition` to increment the year field.
### Desired Output
I would like to have data that looks like:
year | lifespan
-------+----------
2000 | 3
2001 | 3
2002 | 3
2012 | 2
2013 | 2
2021 | 1
</details>
# 答案1
**得分**: 1
下面是已翻译的内容:
可以使用递归查询来实现这个目的(我在db-fiddle上使用了PostgreSQL,但我在实际的Redshift数据库上也试过了):
**模式(PostgreSQL v15)**
```sql
CREATE TABLE "data"("year" int, lifespan int);
INSERT INTO "data"("year", lifespan) VALUES (2000,3), (2012,2), (2021,1);
查询 #1
WITH RECURSIVE rec(year, lifespan, remaining) AS (
SELECT year, lifespan, lifespan
FROM data
UNION ALL
SELECT year + 1, lifespan, remaining - 1
FROM rec
WHERE remaining > 1
)
SELECT year, lifespan
FROM rec
ORDER BY year;
year | lifespan |
---|---|
2000 | 3 |
2001 | 3 |
2002 | 3 |
2012 | 2 |
2013 | 2 |
2021 | 1 |
英文:
You can use a recursive query for this (I used postgresql on db-fiddle but I tried it out on an actual redshift database):
Schema (PostgreSQL v15)
CREATE TABLE "data"("year" int, lifespan int);
INSERT INTO "data"("year", lifespan) VALUES (2000,3), (2012,2), (2021,1);
Query #1
WITH RECURSIVE rec(year, lifespan, remaining) AS (
SELECT year, lifespan, lifespan
FROM data
UNION ALL
SELECT year + 1, lifespan, remaining - 1
FROM rec
WHERE remaining > 1
)
SELECT year, lifespan
FROM rec
ORDER BY year;
year | lifespan |
---|---|
2000 | 3 |
2001 | 3 |
2002 | 3 |
2012 | 2 |
2013 | 2 |
2021 | 1 |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论