复制行N次,同时递增整数字段。

huangapple go评论49阅读模式
英文:

Duplicate row N times while incrementing integer field

问题

### 目的
我有一个数据集,其中包含物体被获取的年份以及物体的寿命。我想为每一行创建副本,以便为物体的每一年都有一行数据。我还想对重复的行逐年递增年份列。

### 示例
以下是数据的简单示例。

```sql
CREATE TEMP TABLE "data"("year" int, lifespan int);
INSERT INTO "data"("year", lifespan) VALUES (2000,3), (2012,2), (2021,1);
SELECT * FROM "data";

结果如下:

year   | lifespan
-------+----------
 2000  |        3
 2012  |        2
 2021  |        1 

我在PostgreSQL中通过使用 generate_series() 函数成功实现了复制,但似乎在Redshift中,该函数实现了完全不同的目标。以下是我在PostgreSQL中实现此目的的方式:

select *, generate_series(1, (lifespan)::integer) i from "data"

然后我可以使用 row_number(), over, 和 partition 来递增年份字段。

期望输出

我希望得到如下所示的数据:

year   | lifespan
-------+----------
 2000  |        3
 2001  |        3
 2002  |        3
 2012  |        2
 2013  |        2
 2021  |        1

<details>
<summary>英文:</summary>

### Objective
I have a dataset which has a year in which an object was acquired and the lifespan of the object. I would like to create copies of each row, so that there is a row for each year of the object&#39;s lifespan. I would also like to increment the year column by one for each duplicate row.

### Example
Here is a simple example of the data.

CREATE TEMP TABLE "data"("year" int, lifespan int);
INSERT INTO "data"("year", lifespan) VALUES (2000,3), (2012,2), (2021,1);
SELECT * FROM "data";


which results in:

year | lifespan
-------+----------
2000 | 3
2012 | 2
2021 | 1


I was able to accomplish the duplication in PostgreSQL by using the `generate_series()` function, but it seems the function in Redshift accomplishes a quite different goal. Here is how I would accomplish this is PostgreSQL:

select *, generate_series(1, (lifespan)::integer) i from "data"


I could then use `row_number()`, `over`, and `partition` to increment the year field.

### Desired Output
I would like to have data that looks like:

year | lifespan
-------+----------
2000 | 3
2001 | 3
2002 | 3
2012 | 2
2013 | 2
2021 | 1



</details>


# 答案1
**得分**: 1

下面是已翻译的内容:

可以使用递归查询来实现这个目的(我在db-fiddle上使用了PostgreSQL,但我在实际的Redshift数据库上也试过了):

**模式(PostgreSQL v15)**

```sql
CREATE TABLE "data"("year" int, lifespan int);
INSERT INTO "data"("year", lifespan) VALUES (2000,3), (2012,2), (2021,1);

查询 #1

WITH RECURSIVE rec(year, lifespan, remaining) AS (
    SELECT year, lifespan, lifespan
    FROM data

    UNION ALL

    SELECT year + 1, lifespan, remaining - 1
    FROM rec
    WHERE remaining > 1
)
SELECT year, lifespan
FROM rec
ORDER BY year;
year lifespan
2000 3
2001 3
2002 3
2012 2
2013 2
2021 1

在DB Fiddle上查看

英文:

You can use a recursive query for this (I used postgresql on db-fiddle but I tried it out on an actual redshift database):

Schema (PostgreSQL v15)

CREATE TABLE &quot;data&quot;(&quot;year&quot; int, lifespan int);
INSERT INTO &quot;data&quot;(&quot;year&quot;, lifespan) VALUES (2000,3), (2012,2), (2021,1);

Query #1

WITH RECURSIVE rec(year, lifespan, remaining) AS (
    SELECT year, lifespan, lifespan
    FROM data

    UNION ALL

    SELECT year + 1, lifespan, remaining - 1
    FROM rec
    WHERE remaining &gt; 1
)
SELECT year, lifespan
FROM rec
ORDER BY year;
year lifespan
2000 3
2001 3
2002 3
2012 2
2013 2
2021 1

View on DB Fiddle

huangapple
  • 本文由 发表于 2023年4月19日 23:36:22
  • 转载请务必保留本文链接:https://go.coder-hub.com/76056388.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定