英文:
How to speed up a query that has a WHERE on a varchar column?
问题
I have two columns in the users
table:
id
: 它是bitint
,primary
,unique
,auto_increment
,unsigned
和not null
cellphone
: 它是varchar(15)
,具有index
(btree),unique
和not null
注意:
users
表格有大约150万行数据。cellphone
用作用户名。- 我在
cellphone
列上的 btree 索引是一个复合索引,类似于users(cellphone, type)
。
奇怪的是,以下查询的执行时间差异很大。
对于 id
:
SELECT 1 FROM users WHERE id = x
//=> 执行时间:0.001 秒
而对于 cellphone
:
SELECT 1 FROM users WHERE cellphone = x
//=> 执行时间:1.5 秒
有什么办法可以让 cellphone
的查询速度与 id
的一样快吗?
此外,这是 SHOW CREATE TABLE users
的结果:
CREATE TABLE `users` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`f_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`l_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
...
--(表结构定义的其余部分省略)
) ENGINE=InnoDB AUTO_INCREMENT=2110840 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
英文:
I have two columns in the users
table:
id
: It'sbitint
,primary
,unique
,auto_increment
,unsigned
andnot null
cellphone
: It'svarchar(15)
, hasindex
(btree),unique
, andnot null
Noted:
- The
users
table has about 1.5 milion rows. - The
cellphone
is used as the username. - The btree index that I have on the
cellphone
column is an composite index likeusers(cellphone, type)
The strange part is, the time execution for the following queries is much different.
For the id
:
SELECT 1 FROM users WHERE id = x
//=> Time execution: 0.001 sec
And for the cellphone
:
SELECT 1 FROM users WHERE cellphone = x
//=> Time execution: 1.5 sec
Any idea how can I make the cellphone
one fast as much as the id
one?
Also, here is the result of SHOW CREATE TABLE users
:
CREATE TABLE `users` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`f_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`l_name` varchar(29) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`national_id_real` varchar(15) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Registered by the user',
`national_id` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'For customers, it''s for beh pardakht psp (generated randomly)',
`national_code_verification` tinyint NOT NULL DEFAULT '0',
`birth_certificate_number` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`cellphone` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`subscription_expire_time` timestamp NULL DEFAULT NULL COMMENT 'Null means a free user ',
`email` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`address` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`type` enum('marketer','business_owner','customer','daapapp_member','legal') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`belongs_to` enum('daapapp','baadraan','fartak','ibshop','gooya','acap','ezpay','zarindax') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'daapapp',
`belongs_to_old` enum('daapapp','baadraan') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'daapapp' COMMENT 'This column created because we changed all baadraan''s users to daapapp. Just keeping the old belongingness here for probabilistic needs in future',
`came_from` enum('organic','baadraan_api','wizard_landing','sms','api','web_site','organizational_import','myirancell','guarantor') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'organic',
`role_id` int unsigned DEFAULT NULL,
`province_id` int DEFAULT NULL COMMENT 'Based on the residence province selected inside the user''s profile page',
`chosen_province_id` int DEFAULT NULL COMMENT 'Based on the either pins-arount-me-button or select-province-you-want-to-see-pins-for selectOption on the map',
`sex` enum('male','female') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`birthday` date DEFAULT NULL,
`city_id` int DEFAULT NULL,
`area_id` int unsigned DEFAULT NULL,
`referral_code` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`has_parent` tinyint NOT NULL DEFAULT '0' COMMENT '1 means, his/her referrer would own 20% of his/her cashback share.',
`referrer_id` bigint unsigned DEFAULT NULL,
`referrer_id_legal` bigint unsigned DEFAULT NULL,
`tracker_name` varchar(191) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`referrer_gift_given` tinyint(1) NOT NULL DEFAULT '0',
`father_name` varchar(25) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`birth_date` date DEFAULT NULL,
`from` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`postal_code` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`sheba_id` bigint unsigned DEFAULT NULL,
`notification_num` mediumint unsigned NOT NULL DEFAULT '0',
`reputation_num` decimal(14,2) NOT NULL DEFAULT '0.00' COMMENT 'This should never get a negative number if the logic is all correct. Also this makes sense only for "business_owners".',
`transaction_num` mediumint unsigned NOT NULL DEFAULT '0',
`wallet_credit` decimal(14,2) NOT NULL DEFAULT '0.00',
`ewallet_credit` decimal(14,2) NOT NULL DEFAULT '0.00' COMMENT 'Only for Daapapp ewallet',
`password` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`level` enum('default','bronze','silver','golden') COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'default' COMMENT 'For future gamifications based on the club or "belongs_to" column',
`avatar` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`remember_token` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`charity_percentage` decimal(5,2) NOT NULL DEFAULT '0.00',
`otp_code` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`has_open_process` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'This column created for handling a semaphore in buying Simcard charge to avoid submitting several requests by clicking quickly on the button. It can be used for other requests/apis too in the future',
`show_referral` tinyint(1) DEFAULT '0',
`app_installed` enum('installed','not_installed') COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'installed' COMMENT 'At least once - login by PWA is counted',
`client_id` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`fcm_token` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`device_id` varchar(64) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`user_agent` enum('android','ios','web') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`status` enum('pending','active','suspend','inactive') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'pending' COMMENT '"pending" means the user tried to signup (login for the first time) and still has not entered the right OTP. So, the cellphone number is not verified yet',
`purchase_experience` tinyint(1) NOT NULL DEFAULT '1',
`temp` int DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `id_unique` (`cellphone`,`type`),
UNIQUE KEY `referral_code` (`referral_code`),
KEY `role_id` (`role_id`),
KEY `area_id` (`area_id`),
KEY `users_ibfk_4` (`sheba_id`),
KEY `client_id` (`client_id`),
KEY `referrer_id` (`referrer_id`,`created_at`) USING BTREE,
KEY `l_name` (`l_name`),
KEY `users_ibfk_5` (`city_id`),
KEY `province_id` (`province_id`),
KEY `created_at` (`created_at`,`type`) USING BTREE,
KEY `users_ibfk_7` (`chosen_province_id`),
KEY `referrer_id_legal` (`referrer_id_legal`),
KEY `came_from` (`id`,`came_from`) USING BTREE,
KEY `belongs_to` (`belongs_to`),
KEY `has_open_process` (`has_open_process`),
KEY `fcm_token` (`fcm_token`),
CONSTRAINT `users_ibfk_1` FOREIGN KEY (`role_id`) REFERENCES `roles` (`id`),
CONSTRAINT `users_ibfk_2` FOREIGN KEY (`area_id`) REFERENCES `location_areas` (`id`),
CONSTRAINT `users_ibfk_3` FOREIGN KEY (`referrer_id`) REFERENCES `users` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_4` FOREIGN KEY (`sheba_id`) REFERENCES `shebas` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_5` FOREIGN KEY (`city_id`) REFERENCES `location_cities` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_6` FOREIGN KEY (`province_id`) REFERENCES `location_provinces` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_7` FOREIGN KEY (`chosen_province_id`) REFERENCES `location_provinces` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `users_ibfk_8` FOREIGN KEY (`referrer_id_legal`) REFERENCES `users` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT
) ENGINE=InnoDB AUTO_INCREMENT=2110840 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
答案1
得分: 0
由于cellphone
是VARCHAR
,在查询中必须使用字符串而不是整数来使用INDEX
。这是性能问题。
在底层潜藏着一个棘手的问题——字符串中是否包含破折号、括号、前导零等字符?如果有的话,考虑将它们去除。在两个地方执行这个操作——数据录入和构建查询以进行查找时。您甚至可能需要添加国家代码(以保持一致性)。毕竟,几乎没有美国人能理解"+1"。
如果您去掉了所有的标点符号,DECIMAL(15,0)
就足够了,只需占用7个字节,而VARCHAR(15)
平均可能需要10个字节。这不是一个很大的差异。
关于号码末尾的"extension"怎么处理?
我假设您现在对所有可能的sex
值都使用NULL
了吗?
那个表格包含了很多个人信息;希望您的网站有足够的安全措施。
某些地方还有座机电话吗?特别是企业?您可能需要通过"操作员"联系许多不同的人。
英文:
Since cellphone
is VARCHAR
, you must use a string, not an integer in the query to use the INDEX
. This is the performance problem.
There is a nasty problem lurking under the hood -- are there dashes, parentheses, leading 0, etc, in the string? If so consider removing them. Do this in two places -- data entry and when building the query to look up. You may even need to add the country code (for consistency). After all, hardly any Americans would understand "+1".
If you do get rid of all punctuation, DECIMAL(15,0)
would suffice and fit in 7 bytes (versus an average of perhaps 10 for VARCHAR(15)
. That's not a big difference.
What about "extension" on the end of the number?
I assume you are using NULL
for all the other possible sex
values these days?
That table has a lot of personal info; I hope your site is adequately secured.
Aren't there still landlines in some places? Especially businesses? Might you need to reach many different people through an "Operator".
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论