Does this SQL rewrite preserve sort order in MySQL?

Question


The book High Performance MySQL (3rd edition) says:

> A frequent problem is having a high value for the offset. If your query looks like LIMIT 10000, 20, it is generating 10,020 rows and throwing away the first 10,000 of them, which is very expensive.
>
> One simple technique to improve efficiency is to do the offset on a covering index, rather than the full rows.
>
> Consider the following query:
>
>     SELECT film_id, description FROM sakila.film ORDER BY title LIMIT 50, 5;
>
> If the table is very large, this query is better written as follows:
>
>     mysql> SELECT film.film_id, film.description
>         -> FROM sakila.film
>         -> INNER JOIN (
>         -> SELECT film_id FROM sakila.film
>         -> ORDER BY title LIMIT 50, 5
>         -> ) AS lim USING(film_id);
>
> This “deferred join” works because it lets the server examine as little data as possible in an index without accessing rows, and then, once the desired rows are found, join them against the full table to retrieve the other columns from the row.

My question: does the second query return its rows in the same order as the first, given that it has no ORDER BY clause on the outer query?

Here is the table DDL for quick reference:

CREATE TABLE film (
  film_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  title VARCHAR(128) NOT NULL,
  description TEXT DEFAULT NULL,
  release_year YEAR DEFAULT NULL,
  language_id TINYINT UNSIGNED NOT NULL,
  original_language_id TINYINT UNSIGNED DEFAULT NULL,
  rental_duration TINYINT UNSIGNED NOT NULL DEFAULT 3,
  rental_rate DECIMAL(4,2) NOT NULL DEFAULT 4.99,
  length SMALLINT UNSIGNED DEFAULT NULL,
  replacement_cost DECIMAL(5,2) NOT NULL DEFAULT 19.99,
  rating ENUM('G','PG','PG-13','R','NC-17') DEFAULT 'G',
  special_features SET('Trailers','Commentaries','Deleted Scenes','Behind the Scenes') DEFAULT NULL,
  last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY  (film_id),
  KEY idx_title (title),
  KEY idx_fk_language_id (language_id),
  KEY idx_fk_original_language_id (original_language_id),
  CONSTRAINT fk_film_language FOREIGN KEY (language_id) REFERENCES language (language_id) ON DELETE RESTRICT ON UPDATE CASCADE,
  CONSTRAINT fk_film_language_original FOREIGN KEY (original_language_id) REFERENCES language (language_id) ON DELETE RESTRICT ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Answer 1

Score: 1


In my experience the result keeps the subquery's order, but that is not guaranteed: without an ORDER BY on the outer query, the server is free to return the joined rows in any order. So repeat the ORDER BY clause in the outer query.

SELECT film.film_id, film.description
FROM sakila.film
INNER JOIN (
    SELECT film_id
    FROM sakila.film
    ORDER BY title
    LIMIT 50, 5
) AS lim USING(film_id)
-- Repeat the sort here; only the outermost ORDER BY fixes the result order.
ORDER BY film.title;
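
If you want to sanity-check that the rewrite with the repeated ORDER BY returns the same page as the original query, a small script along these lines may help. This is only a sketch under loud assumptions: it uses Python's built-in sqlite3 module as a stand-in so it runs without a server, the table is a cut-down, made-up version of film, and the names build_demo_db and compare are hypothetical. It says nothing about how MySQL's optimizer behaves; to test MySQL itself you would run the same two statements against your own server.

```python
import sqlite3

# The original query and the deferred-join rewrite with the outer ORDER BY.
ORIGINAL = """
SELECT film_id, description
FROM film
ORDER BY title
LIMIT 50, 5
"""

DEFERRED_JOIN = """
SELECT film.film_id, film.description
FROM film
INNER JOIN (
    SELECT film_id
    FROM film
    ORDER BY title
    LIMIT 50, 5
) AS lim USING (film_id)
ORDER BY film.title
"""


def build_demo_db():
    """Create an in-memory, cut-down film table filled with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE film ("
        " film_id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " description TEXT)"
    )
    conn.execute("CREATE INDEX idx_title ON film (title)")
    # Titles run opposite to film_id, so primary-key order differs from title order.
    rows = [(i, f"TITLE {200 - i:03d}", f"description {i}") for i in range(1, 201)]
    conn.executemany("INSERT INTO film VALUES (?, ?, ?)", rows)
    return conn


def compare(conn):
    """Run both queries and return their result lists for comparison."""
    original = conn.execute(ORIGINAL).fetchall()
    rewritten = conn.execute(DEFERRED_JOIN).fetchall()
    return original, rewritten


if __name__ == "__main__":
    original, rewritten = compare(build_demo_db())
    print("same rows, same order:", original == rewritten)
```

The demo titles are unique, so the page is well defined. In the real table title has no unique constraint, so if titles can repeat you may also want a tie-breaker such as ORDER BY title, film_id in both the subquery and the outer query.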

huangapple
  • Posted on February 10, 2023, 09:48:36
  • If you repost this article, please keep its link: https://go.coder-hub.com/75406218.html