2023-03-01 16:26:38

MySQL: adding a new column to an already existing table, which also involves a join

Question

Q. Create a new column DaysTakenForDelivery that contains the date difference between Order_Date and Ship_Date.

Tables available are: orders and shipping

CREATE TABLE orders (
    Order_ID int DEFAULT NULL,
    Order_Date text,
    Order_Priority text,
    Ord_id text
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
CREATE TABLE shipping (
    Order_ID int DEFAULT NULL,
    Ship_Mode text,
    Ship_Date text,
    Ship_id text,
    DaysTakenForDelivery int DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Row counts:

shipping: 7701
orders: 5506

Please note I changed the datatypes for the date columns properly.

Things I did:

I added the required column to the 'shipping' table, since the task did not specify which table it should be added to. The query:

ALTER TABLE shipping ADD DaysTakenForDelivery INT;

Next, I tried to update the column using various queries but nothing worked. A few of them are listed below:

FAILED ATTEMPTS:

UPDATE shipping SET DaysTakenForDelivery = (
    select datediff(b.ship_date, a.order_date) AS DaysTakenForDelivery 
    from orders a
    JOIN shipping b ON a.Order_ID = b.Order_ID
);

NOTE: this query led to the following error:

> Error Code: 1093. You can't specify target table 'shipping_dimen' for update in FROM clause

Next query I tried:

UPDATE shipping b SET DaysTakenForDelivery = (
    select datediff(b.ship_date, a.order_date) AS DaysTakenForDelivery 
    from orders a  
    WHERE a.Order_ID = b.Order_ID
);

NOTE: this query led to the following error:

> Error Code: 1242. Subquery returns more than 1 row

How am I supposed to achieve the desired result?

Please note I am using MySQL and answers for the same RDBMS would be appreciated for better understanding.

Version I am using: 8.0.31

Answer 1

Score: 0

It would be better to not store the redundant data, as there is always the risk that it becomes inconsistent, and it is just unnecessary use of storage.
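One way to avoid storing the value is to compute it when it is read. The following is only a minimal sketch, assuming Ship_Date and Order_Date hold values MySQL can interpret as dates; the view name shipping_with_delivery_days is hypothetical and not part of the original schema.

```sql
-- Hypothetical view: derives the delivery time on read instead of storing it.
CREATE VIEW shipping_with_delivery_days AS
SELECT s.Order_ID,
       s.Ship_Mode,
       s.Ship_Date,
       s.Ship_id,
       DATEDIFF(s.Ship_Date, o.Order_Date) AS DaysTakenForDelivery
FROM shipping s
JOIN orders o ON o.Order_ID = s.Order_ID;
```

Querying the view always reflects the current dates, so nothing can drift out of sync. Note that while Order_ID is not unique in orders, the view returns one row per matching pair rather than one row per shipment.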

It is just a normal multi-table update:

UPDATE shipping s
JOIN orders o ON s.Order_ID = o.Order_ID
SET s.DaysTakenForDelivery = DATEDIFF(s.Ship_Date, o.Order_Date);
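After running the update, a quick sanity check along these lines may help. This is a sketch; the column aliases are illustrative only.

```sql
-- Summarise the result of the update.
SELECT COUNT(*)                      AS total_rows,     -- all shipments
       COUNT(DaysTakenForDelivery)   AS filled_rows,    -- rows that received a value
       SUM(DaysTakenForDelivery < 0) AS negative_rows   -- shipped before ordered?
FROM shipping;
```

Rows that are still NULL either have no matching row in orders or have a date value that could not be interpreted; a non-zero negative_rows suggests the two dates are swapped or mis-entered in some rows.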

As you appear to be having performance issues while trying to update the table, you could try updating in batches:

UPDATE shipping s
JOIN (
    SELECT s.ship_id, DATEDIFF(s.ship_date, o.order_date) AS diff
    FROM orders o
    JOIN shipping s ON o.order_id = s.order_id
    WHERE s.DaysTakenForDelivery IS NULL
    ORDER BY o.order_id, s.ship_id
    LIMIT 1000 -- batch size
) sd ON s.ship_id = sd.ship_id
SET s.DaysTakenForDelivery = sd.diff;

If that still does not work you can try reducing the batch size further.
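Rather than re-running the batch statement by hand, it could be wrapped in a loop. The following is a rough sketch, not a tested implementation: the procedure name fill_days_taken_in_batches, the parameter p_batch_size and the variable v_changed are hypothetical, and the extra IS NOT NULL condition is an addition of this sketch so that the loop cannot get stuck on rows whose difference evaluates to NULL. DELIMITER is a command of the mysql command-line client, not a server statement.

```sql
DELIMITER //

-- Hypothetical helper: repeats the batch update until no row changes.
CREATE PROCEDURE fill_days_taken_in_batches(IN p_batch_size INT)
BEGIN
    DECLARE v_changed INT DEFAULT 1;

    WHILE v_changed > 0 DO
        UPDATE shipping s
        JOIN (
            SELECT s2.Ship_id,
                   DATEDIFF(s2.Ship_Date, o.Order_Date) AS diff
            FROM orders o
            JOIN shipping s2 ON o.Order_ID = s2.Order_ID
            WHERE s2.DaysTakenForDelivery IS NULL
              -- Skip rows that would stay NULL, so every batch makes progress.
              AND DATEDIFF(s2.Ship_Date, o.Order_Date) IS NOT NULL
            ORDER BY o.Order_ID, s2.Ship_id
            LIMIT p_batch_size
        ) sd ON s.Ship_id = sd.Ship_id
        SET s.DaysTakenForDelivery = sd.diff;

        -- ROW_COUNT() reports how many rows the UPDATE just changed.
        SET v_changed = ROW_COUNT();
    END WHILE;
END //

DELIMITER ;

CALL fill_days_taken_in_batches(1000);
```

With autocommit enabled, each batch commits on its own, which keeps individual transactions small. The batch size passed to CALL can be lowered in the same way as the LIMIT value above.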

Now you have added the DDL for your tables, we can see where some of your issues are coming from. The lack of a keyed relationship between the two tables is an issue and the number of rows returned by the join tells us that the Order_ID is not unique in your orders table. You need to deal with the duplicate Order_IDs before you can move forward.
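A quick way to confirm whether Order_ID repeats is to compare the total row count with the number of distinct values. This is a small sketch; the aliases are illustrative.

```sql
-- If the two numbers differ, Order_ID is not a usable key yet.
SELECT COUNT(*)                 AS total_rows,
       COUNT(DISTINCT Order_ID) AS distinct_order_ids
FROM orders;
```

COUNT(DISTINCT ...) ignores NULLs, so a difference can come from repeated values, from NULL values, or from both.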

To find the duplicates you can use the following queries:

SELECT *
FROM (
    SELECT *, COUNT(*) OVER (PARTITION BY Order_ID) AS num
    FROM orders
) t
WHERE num > 1;

/* And assuming Ship_id is intended to be the PK (unique identifier) */
SELECT *
FROM (
    SELECT *, COUNT(*) OVER (PARTITION BY Ship_id) AS num
    FROM shipping
) t
WHERE num > 1;

After you have dealt with the duplicates and made sure the two date fields contain valid date strings (yyyy-mm-dd), you can run something like the following to add the primary keys, add the foreign key for Order_ID in shipping, and change the datatypes of the date columns:

ALTER TABLE orders
MODIFY COLUMN Order_ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
MODIFY COLUMN Order_Date DATE NOT NULL;

ALTER TABLE shipping
CHANGE COLUMN Ship_id Ship_ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST,
MODIFY COLUMN Ship_Date DATE,
MODIFY COLUMN Order_ID INT UNSIGNED NOT NULL,
ADD FOREIGN KEY (Order_ID) REFERENCES orders (Order_ID);
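Before running the ALTER statements above, it may be worth listing the rows whose date strings would not convert cleanly. The sketch below assumes the intended format is exactly yyyy-mm-dd; the pattern only checks the shape of the string, not whether the date exists on the calendar (2023-02-30 would pass).

```sql
-- Rows in orders whose Order_Date is missing or not shaped like yyyy-mm-dd.
SELECT Order_ID, Order_Date
FROM orders
WHERE Order_Date IS NULL
   OR Order_Date NOT REGEXP '^[0-9]{4}-[0-9]{2}-[0-9]{2}$';

-- Same check for shipping; NULLs are not listed because Ship_Date stays nullable.
SELECT Ship_id, Ship_Date
FROM shipping
WHERE Ship_Date NOT REGEXP '^[0-9]{4}-[0-9]{2}-[0-9]{2}$';
```

If both queries return no rows, the text values at least have the expected shape.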

You will still need to address the other datatype issues but without seeing sample data it is impossible to say what the other TEXT columns should be changed to.

You might want to do some reading about:

Normalization
- Wikipedia
- Old question on SO
Data Types
- MySQL docs

