February 16, 2023 04:54:17

Athena looking for records with different start dates

Question

I have a lot of customer files with customer data that includes a customer id, and each customer can have multiple service points. A service point can have a meter, and a meter can have a meter install date:

Cust	Service Point	Meter ID	Meter Install Date
1	A1	AM1	20201005
1	A1	AM1	20201005
1	A1	AM1	20201005
1	A1	AM1	20150101
1	A1	AM1	20150101
1	A1	AM1	20150101
1	A2	AM2	20220110
1	A2	AM2	20220110
1	A2	AM2	20220110
1	A2	AM21	20230215
1	A3	AM3	20200509
1	A3	AM3	20200509
1	A3	AM3	20200509
1	A3	AM3	20221013

I'm trying to find the number of meters that have multiple install dates. It is not uncommon to have multiple rows where these fields' information is duplicated. As I try different strategies I get different answers, so I'm doing something wrong.

I've tried:


select customer_id, service_point_id, secondary_sp_id
from customer
where secondary_sp_id in (
  select secondary_sp_id
  from customer
  group by secondary_sp_id
  having length(secondary_sp_id) > 1 and count(distinct meter_install_date) > 1
)


select customer_id, service_point_id, secondary_sp_id, meter_install_date
from customer
where secondary_sp_id in (
  select secondary_sp_id
  from customer
  group by secondary_sp_id
  having count(distinct meter_install_date) > 1
)


select a.service_point_id, a.secondary_sp_id, a.meter_install_date
from customer a, customer b
where a.service_point_id = b.service_point_id
and a.secondary_sp_id = b.secondary_sp_id
and a.meter_install_date != b.meter_install_date
group by a.service_point_id, a.secondary_sp_id, a.meter_install_date

I would expect to get back:

Cust	Service Point	Meter ID	Meter Install Date
1	A1	AM1	20201005
1	A1	AM1	20150101
1	A3	AM3	20200509
1	A3	AM3	20221013

I don't think I'm handling the case where a service point has multiple meters and one of those meters has multiple start dates. Thanks for your help!

Answer 1

Score: 2

I'm not sure we have enough information about your data or schema, such as how "secondary_sp_id" fits into this. No details were provided on that column nor on the prod_peco_customer table.

If we assume your data appears like your first formatted section in the question, then the following CTE would work as-is.

create table customer (
  cust integer,
  service_point varchar(5),
  meter_id varchar(5),
  meter_install_date date
);

insert into customer values
(1, 'A1', 'AM1', '20201005'),
(1, 'A1', 'AM1', '20150101'),
(1, 'A2', 'AM2', '20230110');

with target_meters as (
  select meter_id
  from customer
  group by meter_id
  having count(distinct meter_install_date) > 1
)
select c.*
from customer c
join target_meters t
  on c.meter_id = t.meter_id;

cust	service_point	meter_id	meter_install_date
1	A1	AM1	2020-10-05T00:00:00.000Z
1	A1	AM1	2015-01-01T00:00:00.000Z

But I kinda doubt your data looks like this, even though you formatted it that way in the question. Adjust accordingly; the main point is that you can use a subquery or CTE to identify the meters with multiple install dates.
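For comparison, here is a minimal sketch of the subquery form of the same idea. It runs the SQL through Python's built-in sqlite3 module purely so it can be executed locally. SQLite is a stand-in here, not Athena, and the install date is stored as text because SQLite has no date type. The table and column names follow the sample table above.

```python
import sqlite3

# In-memory SQLite database as a local stand-in for Athena (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table customer (
      cust integer,
      service_point text,
      meter_id text,
      meter_install_date text  -- text stand-in for a date column
    )
""")
conn.executemany(
    "insert into customer values (?, ?, ?, ?)",
    [
        (1, "A1", "AM1", "20201005"),
        (1, "A1", "AM1", "20150101"),
        (1, "A2", "AM2", "20230110"),
    ],
)

# Same filter as the CTE, written as an IN subquery.
rows = conn.execute("""
    select *
    from customer
    where meter_id in (
      select meter_id
      from customer
      group by meter_id
      having count(distinct meter_install_date) > 1
    )
    order by meter_install_date
""").fetchall()

for row in rows:
    print(row)
```

Both forms should return the same rows; which one reads better is mostly a matter of taste.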

----------Update-----------

Based on the updated sample data, you would simply need to change select c.* to select distinct c.*, like this:

with target_meters as (
  select meter_id
  from customer
  group by meter_id
  having count(distinct meter_install_date) > 1
)
select distinct c.*
from customer c
join target_meters t
  on c.meter_id = t.meter_id
order by 1,2,3,4

cust	service_point	meter_id	meter_install_date
1	A1	AM1	2015-01-01T00:00:00.000Z
1	A1	AM1	2020-10-05T00:00:00.000Z
1	A3	AM3	2020-05-09T00:00:00.000Z
1	A3	AM3	2022-10-13T00:00:00.000Z
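The question also asks for the number of such meters and worries about service points that have more than one meter. Below is a rough sketch of one way to cover both, assuming a meter is identified by the combination of cust, service_point and meter_id rather than by meter_id alone. That is an assumption about the data, not something the question confirms. It loads the sample rows from the question into SQLite through Python's sqlite3 module as a local stand-in for Athena, with the install date stored as text.

```python
import sqlite3

# Sample rows from the question, including the duplicated rows.
sample = (
    [(1, "A1", "AM1", "20201005")] * 3
    + [(1, "A1", "AM1", "20150101")] * 3
    + [(1, "A2", "AM2", "20220110")] * 3
    + [(1, "A2", "AM21", "20230215")]
    + [(1, "A3", "AM3", "20200509")] * 3
    + [(1, "A3", "AM3", "20221013")]
)

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Athena
conn.execute(
    "create table customer ("
    "cust integer, service_point text, meter_id text, meter_install_date text)"
)
conn.executemany("insert into customer values (?, ?, ?, ?)", sample)

# Group on the full key so two meters on one service point stay separate.
target_meters = """
    select cust, service_point, meter_id
    from customer
    group by cust, service_point, meter_id
    having count(distinct meter_install_date) > 1
"""

# Distinct rows for the meters that have more than one install date.
distinct_rows = conn.execute(f"""
    with target_meters as ({target_meters})
    select distinct c.*
    from customer c
    join target_meters t
      on c.cust = t.cust
     and c.service_point = t.service_point
     and c.meter_id = t.meter_id
    order by 1, 2, 3, 4
""").fetchall()

# Number of meters with more than one distinct install date.
meter_count = conn.execute(
    f"select count(*) from ({target_meters}) as t"
).fetchone()[0]

for row in distinct_rows:
    print(row)
print("meters with multiple install dates:", meter_count)
```

If meter_id is already unique across customers and service points in your data, grouping by meter_id alone gives the same result.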
