MySQL COUNT performance on very large tables
I have an InnoDB table with more than 100 million rows.

I need to know whether there are more than 5000 rows where the foreign key = 1. I don't need the exact number.
I ran some tests:
SELECT COUNT(*) FROM table WHERE fk = 1
=> 16 seconds

SELECT COUNT(*) FROM table WHERE fk = 1 LIMIT 5000
=> 16 seconds

SELECT primary FROM table WHERE fk = 1
=> 0.6 seconds
Fetching the rows means more network traffic and client-side processing, but it could save up to 15.4 seconds!
Do you have a better idea?

Thanks
[Relevant comments from the OP added below]
I tried SELECT SQL_NO_CACHE COUNT(fk) FROM table WHERE fk = 1, but it took 25 seconds.
MySQL was tuned for InnoDB with MySQL Tuner.
CREATE TABLE table (
  pk bigint(20) NOT NULL AUTO_INCREMENT,
  fk tinyint(3) unsigned DEFAULT '0',
  PRIMARY KEY (pk),
  KEY idx_fk (fk) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=100380914 DEFAULT CHARSET=latin1
Server variables:
have_innodb = YES
ignore_builtin_innodb = OFF
innodb_adaptive_hash_index = ON
innodb_additional_mem_pool_size = 20971520
innodb_autoextend_increment = 8
innodb_autoinc_lock_mode = 1
innodb_buffer_pool_size = 25769803776
innodb_checksums = ON
innodb_commit_concurrency = 0
innodb_concurrency_tickets = 500
innodb_data_file_path = ibdata1:10M:autoextend
innodb_data_home_dir =
innodb_doublewrite = ON
innodb_fast_shutdown = 1
innodb_file_io_threads = 4
innodb_file_per_table = OFF
innodb_flush_log_at_trx_commit = 1
innodb_flush_method =
innodb_force_recovery = 0
innodb_lock_wait_timeout = 50
innodb_locks_unsafe_for_binlog = OFF
innodb_log_buffer_size = 8388608
innodb_log_file_size = 26214400
innodb_log_files_in_group = 2
innodb_log_group_home_dir = ./
innodb_max_dirty_pages_pct = 90
innodb_max_purge_lag = 0
innodb_mirrored_log_groups = 1
innodb_open_files = 300
innodb_rollback_on_timeout = OFF
innodb_stats_on_metadata = ON
innodb_support_xa = ON
innodb_sync_spin_loops = 20
innodb_table_locks = ON
innodb_thread_concurrency = 8
innodb_thread_sleep_delay = 10000
innodb_use_legacy_cardinality_algorithm = ON
Update '15: I have used the same method up to now, with 600 million rows and 640,000 new rows per day. It's still working fine.
Accepted answer
In the end, the fastest approach was to query the first X rows using C# and count them client-side.
My application processes the data in batches. The delay between two batches depends on the number of rows that still need to be processed:
SELECT pk FROM table WHERE fk = 1 LIMIT X
I get the result in 0.9 seconds.
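The accepted approach can be sketched as follows. This is a minimal, hypothetical illustration using Python's built-in sqlite3 module as a stand-in for MySQL (the question uses C# against MySQL); the table and column names mirror the question, and the data is synthetic:

```python
import sqlite3

THRESHOLD = 5000

# Synthetic stand-in for the 100M-row table: 8000 rows with fk = 1
# and 3000 rows with fk = 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY AUTOINCREMENT, fk INTEGER)")
conn.execute("CREATE INDEX idx_fk ON t (fk)")
conn.executemany("INSERT INTO t (fk) VALUES (?)",
                 [(1,)] * 8000 + [(2,)] * 3000)

def has_at_least(conn, fk_value, threshold=THRESHOLD):
    # LIMIT lets the index scan stop as soon as `threshold` matching
    # rows have been found, so we never pay for an exact count.
    rows = conn.execute(
        "SELECT pk FROM t WHERE fk = ? LIMIT ?", (fk_value, threshold)
    ).fetchall()
    return len(rows) >= threshold

print(has_at_least(conn, 1))  # True  (8000 >= 5000)
print(has_at_least(conn, 2))  # False (3000 <  5000)
```

A commonly suggested variant keeps the counting on the server side and avoids transferring the rows, by wrapping the LIMIT in a derived table: `SELECT COUNT(*) FROM (SELECT pk FROM table WHERE fk = 1 LIMIT 5000) AS x`. Whether that beats the client-side count depends on network cost and row width.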
Thanks everyone for your ideas!