如何加速具有varchar列上的WHERE条件的查询?

huangapple go评论82阅读模式
英文:

How to speed up a query that has a WHERE on a varchar column?

问题

I have two columns in the users table:

  • id: 它是 bitintprimaryuniqueauto_incrementunsignednot null
  • cellphone: 它是 varchar(15),具有 index(btree),uniquenot null

注意:

  1. users 表格有大约150万行数据。
  2. cellphone 用作用户名
  3. 我在 cellphone 列上的 btree 索引是一个复合索引,类似于 users(cellphone, type)

奇怪的是,以下查询的执行时间差异很大。

对于 id

SELECT 1 FROM users WHERE id = x
//=> 执行时间:0.001 

如何加速具有varchar列上的WHERE条件的查询?

而对于 cellphone

SELECT 1 FROM users WHERE cellphone = x
//=> 执行时间:1.5 

如何加速具有varchar列上的WHERE条件的查询?

有什么办法可以让 cellphone 的查询速度与 id 的一样快吗?


此外,这是 SHOW CREATE TABLE users 的结果:

CREATE TABLE `users` (
 `id` bigint unsigned NOT NULL AUTO_INCREMENT,
 `f_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
 `l_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
 ...
 --(表结构定义的其余部分省略)
) ENGINE=InnoDB AUTO_INCREMENT=2110840 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
英文:

I have two columns in the users table:

  • id: It's bitint, primary, unique, auto_increment, unsigned and not null
  • cellphone: It's varchar(15), has index (btree), unique, and not null

Noted:

  1. The users table has about 1.5 milion rows.
  2. The cellphone is used as the username.
  3. The btree index that I have on the cellphone column is an composite index like users(cellphone, type)

The strange part is, the time execution for the following queries is much different.

For the id:

SELECT 1 FROM users WHERE id = x
//=> Time execution: 0.001 sec

如何加速具有varchar列上的WHERE条件的查询?

And for the cellphone:

SELECT 1 FROM users WHERE cellphone = x
//=> Time execution: 1.5 sec

如何加速具有varchar列上的WHERE条件的查询?

Any idea how can I make the cellphone one fast as much as the id one?


Also, here is the result of SHOW CREATE TABLE users:

CREATE TABLE `users` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`f_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`l_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`national_id_real` varchar(15) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Registered by the user',
`national_id` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'For customers, it''s for beh pardakht psp (generated randomly)',
`national_code_verification` tinyint NOT NULL DEFAULT '0',
`birth_certificate_number` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`cellphone` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`subscription_expire_time` timestamp NULL DEFAULT NULL COMMENT 'Null means a free user ',
`email` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`address` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`type` enum('marketer','business_owner','customer','daapapp_member','legal') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`belongs_to` enum('daapapp','baadraan','fartak','ibshop','gooya','acap','ezpay','zarindax') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'daapapp',
`belongs_to_old` enum('daapapp','baadraan') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'daapapp' COMMENT 'This column created because we changed all baadraan''s users to daapapp. Just keeping the old belongingness here for probabilistic needs in future',
`came_from` enum('organic','baadraan_api','wizard_landing','sms','api','web_site','organizational_import','myirancell','guarantor') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'organic',
`role_id` int unsigned DEFAULT NULL,
`province_id` int DEFAULT NULL COMMENT 'Based on the residence province selected inside the user''s profile page',
`chosen_province_id` int DEFAULT NULL COMMENT 'Based on the either pins-arount-me-button or select-province-you-want-to-see-pins-for selectOption on the map',
`sex` enum('male','female') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`birthday` date DEFAULT NULL,
`city_id` int DEFAULT NULL,
`area_id` int unsigned DEFAULT NULL,
`referral_code` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`has_parent` tinyint NOT NULL DEFAULT '0' COMMENT '1 means, his/her referrer would own 20% of his/her cashback share.',
`referrer_id` bigint unsigned DEFAULT NULL,
`referrer_id_legal` bigint unsigned DEFAULT NULL,
`tracker_name` varchar(191) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`referrer_gift_given` tinyint(1) NOT NULL DEFAULT '0',
`father_name` varchar(25) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`birth_date` date DEFAULT NULL,
`from` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`postal_code` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`sheba_id` bigint unsigned DEFAULT NULL,
`notification_num` mediumint unsigned NOT NULL DEFAULT '0',
`reputation_num` decimal(14,2) NOT NULL DEFAULT '0.00' COMMENT 'This should never get a negative number if the logic is all correct. Also this makes sense only for "business_owners".',
`transaction_num` mediumint unsigned NOT NULL DEFAULT '0',
`wallet_credit` decimal(14,2) NOT NULL DEFAULT '0.00',
`ewallet_credit` decimal(14,2) NOT NULL DEFAULT '0.00' COMMENT 'Only for Daapapp ewallet',
`password` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`level` enum('default','bronze','silver','golden') COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'default' COMMENT 'For future gamifications based on the club or "belongs_to" column',
`avatar` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`remember_token` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`charity_percentage` decimal(5,2) NOT NULL DEFAULT '0.00',
`otp_code` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`has_open_process` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'This column created for handling a semaphore in buying Simcard charge to avoid submitting several requests by clicking quickly on the button. It can be used for other requests/apis too in the future',
`show_referral` tinyint(1) DEFAULT '0',
`app_installed` enum('installed','not_installed') COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'installed' COMMENT 'At least once - login by PWA is counted',
`client_id` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`fcm_token` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`device_id` varchar(64) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`user_agent` enum('android','ios','web') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`status` enum('pending','active','suspend','inactive') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'pending' COMMENT '"pending" means the user tried to signup (login for the first time) and still has not entered the right OTP. So, the cellphone number is not verified yet',
`purchase_experience` tinyint(1) NOT NULL DEFAULT '1',
`temp` int DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `id_unique` (`cellphone`,`type`),
UNIQUE KEY `referral_code` (`referral_code`),
KEY `role_id` (`role_id`),
KEY `area_id` (`area_id`),
KEY `users_ibfk_4` (`sheba_id`),
KEY `client_id` (`client_id`),
KEY `referrer_id` (`referrer_id`,`created_at`) USING BTREE,
KEY `l_name` (`l_name`),
KEY `users_ibfk_5` (`city_id`),
KEY `province_id` (`province_id`),
KEY `created_at` (`created_at`,`type`) USING BTREE,
KEY `users_ibfk_7` (`chosen_province_id`),
KEY `referrer_id_legal` (`referrer_id_legal`),
KEY `came_from` (`id`,`came_from`) USING BTREE,
KEY `belongs_to` (`belongs_to`),
KEY `has_open_process` (`has_open_process`),
KEY `fcm_token` (`fcm_token`),
CONSTRAINT `users_ibfk_1` FOREIGN KEY (`role_id`) REFERENCES `roles` (`id`),
CONSTRAINT `users_ibfk_2` FOREIGN KEY (`area_id`) REFERENCES `location_areas` (`id`),
CONSTRAINT `users_ibfk_3` FOREIGN KEY (`referrer_id`) REFERENCES `users` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_4` FOREIGN KEY (`sheba_id`) REFERENCES `shebas` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_5` FOREIGN KEY (`city_id`) REFERENCES `location_cities` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_6` FOREIGN KEY (`province_id`) REFERENCES `location_provinces` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_7` FOREIGN KEY (`chosen_province_id`) REFERENCES `location_provinces` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_8` FOREIGN KEY (`referrer_id_legal`) REFERENCES `users` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT
) ENGINE=InnoDB AUTO_INCREMENT=2110840 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

答案1

得分: 0

由于cellphoneVARCHAR,在查询中必须使用字符串而不是整数来使用INDEX。这是性能问题。

在底层潜藏着一个棘手的问题——字符串中是否包含破折号、括号、前导零等字符?如果有的话,考虑将它们去除。在两个地方执行这个操作——数据录入和构建查询以进行查找时。您甚至可能需要添加国家代码(以保持一致性)。毕竟,几乎没有美国人能理解"+1"。

如果您去掉了所有的标点符号,DECIMAL(15,0)就足够了,只需占用7个字节,而VARCHAR(15)平均可能需要10个字节。这不是一个很大的差异。

关于号码末尾的"extension"怎么处理?

我假设您现在对所有可能的sex值都使用NULL了吗?

那个表格包含了很多个人信息;希望您的网站有足够的安全措施。

某些地方还有座机电话吗?特别是企业?您可能需要通过"操作员"联系许多不同的人。

英文:

Since cellphone is VARCHAR, you must use a string, not an integer in the query to use the INDEX. This is the performance problem.

There is a nasty problem lurking under the hood -- are there dashes, parentheses, leading 0, etc, in the string? If so consider removing them. Do this in two places -- data entry and when building the query to look up. You may even need to add the country code (for consistency). After all, hardly any Americans would understand "+1".

If you do get rid of all punctuation, DECIMAL(15,0) would suffice and fit in 7 bytes (versus an average of perhaps 10 for VARCHAR(15). That's not a big difference.

What about "extension" on the end of the number?

I assume you are using NULL for all the other possible sex values these days?

That table has a lot of personal info; I hope your site is adequately secured.

Aren't there still landlines in some places? Especially businesses? Might you need to reach many different people through an "Operator".

huangapple
  • 本文由 发表于 2023年7月31日 21:23:53
  • 转载请务必保留本文链接:https://go.coder-hub.com/76804077.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定