June 6, 2023, 02:26:36

Generating grouping column for records based on row order

Question

I have Postgres data imported from a spreadsheet, where section headers have been imported along with the data. Is there a way to move the section headers to their own column? I have looked at window functions and I have tried a self join.

I am certain that the category rows will always have a null "responsibility". The solution would have to rely on the order in which the rows were inserted, since there is no sort column. Is that reasonable?

I need to transform this:

need	responsibility
Grounds Keeping
Tree/shrub maintenance	Parks and Recreation
Tree Removal	Public Works
Mowing	Parks and Recreation
Fire Station Mowing	Fire
Irrigation Systems	Parks and Recreation
Retaining Walls/Slope Maint.	Capital Projects
Building Maintenance
Cleaning/Resupply	Building Occupant
Electrical	Facility Maintenance
Carpet Maintenance	Facility Operations
Generators	Facility Maintenance
Appliances	Facility Maintenance

into:

category	need	responsibility
Grounds Keeping	Tree/shrub maintenance	Parks and Recreation
Grounds Keeping	Tree Removal	Public Works
Grounds Keeping	Mowing	Parks and Recreation
Grounds Keeping	Fire Station Mowing	Fire
Grounds Keeping	Irrigation Systems	Parks and Recreation
Grounds Keeping	Retaining Walls/Slope Maint.	Capital Projects
Building Maintenance	Cleaning/Resupply	Building Occupant
Building Maintenance	Electrical	Facility Maintenance
Building Maintenance	Carpet Maintenance	Facility Operations
Building Maintenance	Generators	Facility Maintenance
Building Maintenance	Appliances	Facility Maintenance

Answer 1

Score: 3

As @eshirvana already points out, there's no reliable way to do this task. If you really have no incremental value that uniquely identifies the order of your rows, you can fall back on the "ctid" system column. It holds the physical location of each row version inside the table, which for a freshly loaded table that has not been updated usually follows the order of insertion, but that is not guaranteed (and if you inserted in bulk, this approach is still unreliable).

Here is how to solve the problem:

regenerate your partitions with a running sum over "responsibility IS NULL", ordered by "ctid"
generate your category field as the max "need" value among the rows where "responsibility" is null
remove the header records by selecting only the records where "responsibility" is not null

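WITH cte AS (
    SELECT tab.*,
           -- running count of header rows: each header starts a new partition
           SUM((responsibility IS NULL)::INT) OVER(ORDER BY ctid) AS parts
    FROM tab
), cte2 AS (
    SELECT -- the header is the only row in its partition with a null responsibility
           MAX(need) FILTER(WHERE responsibility IS NULL) OVER(PARTITION BY parts) AS category,
           cte.*
    FROM cte
)
-- drop the header rows themselves
SELECT * FROM cte2 WHERE responsibility IS NOT NULL;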

Check the demo here.
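
If the data can be re-imported with an explicit ordering column, the same query no longer depends on "ctid". Below is a minimal sketch of that variant, not a tested Postgres solution. It assumes a hypothetical integer column id that records the spreadsheet row order, and a small made-up sample of the question's data. It runs against an in-memory SQLite database through Python's sqlite3 module purely to stay self-contained (window functions need SQLite 3.25 or later), and it uses CASE expressions in place of the ::INT cast and the FILTER clause so the SQL stays portable.

```python
import sqlite3

# Sample rows in spreadsheet order. The leading `id` is a hypothetical explicit
# ordering column (e.g. an identity column filled in at import time).
rows = [
    (1, "Grounds Keeping", None),
    (2, "Tree/shrub maintenance", "Parks and Recreation"),
    (3, "Tree Removal", "Public Works"),
    (4, "Building Maintenance", None),
    (5, "Cleaning/Resupply", "Building Occupant"),
    (6, "Electrical", "Facility Maintenance"),
]

QUERY = """
WITH cte AS (
    SELECT id, need, responsibility,
           -- running count of header rows: every header starts a new group
           SUM(CASE WHEN responsibility IS NULL THEN 1 ELSE 0 END)
               OVER (ORDER BY id) AS parts
    FROM tab
), cte2 AS (
    SELECT id, need, responsibility,
           -- the header is the only row in its group with a NULL responsibility
           MAX(CASE WHEN responsibility IS NULL THEN need END)
               OVER (PARTITION BY parts) AS category
    FROM cte
)
SELECT category, need, responsibility
FROM cte2
WHERE responsibility IS NOT NULL
ORDER BY id
"""


def fill_categories(data):
    """Load `data` into an in-memory table and return the rows with a category column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tab (id INTEGER PRIMARY KEY, need TEXT, responsibility TEXT)"
    )
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in fill_categories(rows):
        print(row)
```

The only structural change from the query above is ORDER BY id in place of ORDER BY ctid, which makes the grouping depend on a value you control rather than on physical storage.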

Answer 2

Score: 1

Putting this here since it won't fit in the comments section:

As the PostgreSQL documentation on sorting rows states:
> If sorting is not chosen, the rows will be returned in an unspecified order. The actual order in that case will depend on the scan and join plan types and the order on disk, but it must not be relied on.

To add a new column for the insert date, you have to alter the table, add the column, and fill it with proper values, either when you insert the rows or manually when you update them:

alter table [table_name] add column InsertDate timestamp

See the documentation above for more information.

Keep in mind this is only one approach, not necessarily the best approach.

Answer 3

Score: 1

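This looks like an example of the "last non-null" puzzle. The strategy is to derive a row_number() column, use it to group each header row with the rows below it, and then fill the header value down within each group, which gives you a derived last-non-null column for your category. Note that row_number() over ORDER BY NULL numbers the rows in whatever order the scan returns them, so this still relies on an order that is not guaranteed.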

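SELECT *
FROM (
	WITH cte AS (
		SELECT *
			-- grp: row number of the most recent header row at or before this row
			,MAX(CASE 
					WHEN category IS NOT NULL
						THEN rn
					END) OVER (
				ORDER BY rn ROWS UNBOUNDED PRECEDING
				) AS grp
		FROM (
			-- number the rows in scan order (arbitrary, not guaranteed)
			SELECT row_number() OVER (
					ORDER BY NULL
					) AS rn
				,responsibility
				-- data rows keep their need; header rows get NULL
				,CASE 
					WHEN responsibility IS NULL
						THEN NULL
					ELSE need
					END AS _need
				-- header rows carry the category name; data rows get NULL
				,CASE 
					WHEN responsibility IS NULL
						THEN need
					END AS category
			FROM test
			) AS t2
		)
	-- fill the header's category down to every row in its group
	SELECT MAX(category) OVER (
			PARTITION BY grp ORDER BY rn ROWS UNBOUNDED PRECEDING
			) AS _category
		,_need
		,responsibility
	FROM cte
	) AS t3
-- drop the header rows themselves
WHERE responsibility IS NOT NULL;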

The demo is here.
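
For comparison, the "last non-null" fill-down is a single pass when the rows are still in a reliable order, for example while reading the spreadsheet export before it is loaded into Postgres. Here is a minimal sketch of that idea, assuming the rows arrive as (need, responsibility) pairs in spreadsheet order; the function name fill_down and the sample data are hypothetical.

```python
def fill_down(rows):
    """Attach the most recent header to each data row.

    `rows` is an iterable of (need, responsibility) pairs in spreadsheet order.
    A row whose responsibility is None is treated as a category header.
    """
    category = None
    filled = []
    for need, responsibility in rows:
        if responsibility is None:
            # Header row: remember its name and emit nothing for it.
            category = need
        else:
            # Data row: prefix it with the last header seen so far.
            filled.append((category, need, responsibility))
    return filled


if __name__ == "__main__":
    sample = [
        ("Grounds Keeping", None),
        ("Mowing", "Parks and Recreation"),
        ("Fire Station Mowing", "Fire"),
        ("Building Maintenance", None),
        ("Generators", "Facility Maintenance"),
    ]
    for row in fill_down(sample):
        print(row)
```

This is the same logic as the SQL above: the variable category plays the role of the derived last-non-null column, and the loop order replaces the row_number() column.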
