
60 million entries, select entries from a certain month. How to optimize the database?

Date: 2023-05-23



Problem description


I have a database with 60 million entries.

Each entry contains:

• ID
• DataSourceID
• Some data
• DateTime

1. I need to select entries from a certain month. Each month contains approximately 2 million entries.

                   select * 
                     from Entries 
                    where time between "2010-04-01 00:00:00" and "2010-05-01 00:00:00"
                  

(The query takes approximately 1.5 minutes.)

I'd also like to select data from a certain month for a given DataSourceID. (This takes approximately 20 seconds.)

There are about 50-100 different DataSourceIDs.

Is there a way to make this faster? What are my options? How can I optimize this database/query?

There are approximately 60-100 inserts per second!

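Recommended answer

Take advantage of InnoDB clustered primary key indexes. InnoDB stores rows physically in primary key order, so with a key that starts with the year and month, all rows for a given month sit next to each other and can be read with a single range scan.

http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html

This will be very performant: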

                  create table datasources
                  (
                  year_id smallint unsigned not null,
                  month_id tinyint unsigned not null,
                  datasource_id tinyint unsigned not null,
                  id int unsigned not null, -- needed for uniqueness
                  data int unsigned not null default 0,
                  primary key (year_id, month_id, datasource_id, id)
                  )
                  engine=innodb;
                  
                  select * from datasources where year_id = 2011 and month_id between 1 and 3;
                  
                  select * from datasources where year_id = 2011 and month_id = 4 and datasource_id = 100;
                  
                  -- etc..
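
As a rough sketch of how existing rows could be copied into this schema: the table name Entries and the column time come from the question's query, while the column names id, DataSourceID and data are guesses based on the question's description, so adjust them to the real table.

```sql
-- Hypothetical backfill of one month from the original table.
-- Assumed source columns: id, DataSourceID, data, time.
insert into datasources (year_id, month_id, datasource_id, id, data)
select
 year(e.time),   -- e.g. 2010
 month(e.time),  -- 1..12
 e.DataSourceID,
 e.id,
 e.data
from
 Entries e
where
 e.time >= '2010-04-01 00:00:00' and e.time < '2010-05-01 00:00:00';

-- The question's "one month" query then becomes a primary key range scan:
select * from datasources where year_id = 2010 and month_id = 4;
```

The half-open range (>= start, < start of next month) avoids picking up rows stamped exactly 2010-05-01 00:00:00, which between would include because it is inclusive at both ends. Note also that datasource_id is declared as tinyint unsigned, which only holds values up to 255; a wider type would be needed if the real DataSourceID values are larger.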
                  

EDIT 2

I forgot that I was running the first test script with 3 months of data. Here are the results for a single month: 0.34 and 0.69 seconds.

                  select d.* from datasources d where d.year_id = 2010 and d.month_id = 3 and datasource_id = 100 order by d.id desc limit 10;
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           100 | 3290330 | 38434 |
                  |    2010 |        3 |           100 | 3290329 |  9988 |
                  |    2010 |        3 |           100 | 3290328 | 25680 |
                  |    2010 |        3 |           100 | 3290327 | 17627 |
                  |    2010 |        3 |           100 | 3290326 | 64508 |
                  |    2010 |        3 |           100 | 3290325 | 14257 |
                  |    2010 |        3 |           100 | 3290324 | 45950 |
                  |    2010 |        3 |           100 | 3290323 | 49986 |
                  |    2010 |        3 |           100 | 3290322 |  2459 |
                  |    2010 |        3 |           100 | 3290321 | 52971 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (0.34 sec)
                  
                  select d.* from datasources d where d.year_id = 2010 and d.month_id = 3 order by d.id desc limit 10;
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           116 | 3450346 | 42455 |
                  |    2010 |        3 |           116 | 3450345 | 64039 |
                  |    2010 |        3 |           116 | 3450344 | 27046 |
                  |    2010 |        3 |           116 | 3450343 | 23730 |
                  |    2010 |        3 |           116 | 3450342 | 52380 |
                  |    2010 |        3 |           116 | 3450341 | 35700 |
                  |    2010 |        3 |           116 | 3450340 | 20195 |
                  |    2010 |        3 |           116 | 3450339 | 21758 |
                  |    2010 |        3 |           116 | 3450338 | 51378 |
                  |    2010 |        3 |           116 | 3450337 | 34687 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (0.69 sec)
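
To check whether the single-month query can read rows in primary key order instead of sorting them, one could prefix it with explain. With equality conditions on year_id, month_id and datasource_id, the order by on d.id matches the one remaining key column, so MySQL may be able to walk the primary key backwards and skip the sort. This is a sketch only; the original test did not include it, so no output is shown.

```sql
-- Same query as the first single-month test, with explain in front.
explain
select
 d.*
from
 datasources d
where
 d.year_id = 2010 and d.month_id = 3 and d.datasource_id = 100
order by
 d.id desc limit 10;
```

If the Extra column does not list Using filesort, the rows are being returned straight from the index order.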
                  

EDIT 1

I decided to test the above schema with approx. 60 million rows spread over 3 years. Each query is run cold, i.e. each is run separately, after which MySQL is restarted to clear any buffers, and with no query caching.

The full test script can be found here: http://pastie.org/1723506 or below...

As you can see, it's a pretty performant schema even on my humble desktop :)

                  select count(*) from datasources;
                  +----------+
                  | count(*) |
                  +----------+
                  | 60306030 |
                  +----------+
                  
                  select count(*) from datasources where year_id = 2010;
                  +----------+
                  | count(*) |
                  +----------+
                  | 16691669 |
                  +----------+
                  
                  select
                   year_id, month_id, count(*) as counter
                  from
                   datasources
                  where 
                   year_id = 2010
                  group by
                   year_id, month_id;
                  +---------+----------+---------+
                  | year_id | month_id | counter |
                  +---------+----------+---------+
                  |    2010 |        1 | 1080108 |
                  |    2010 |        2 | 1210121 |
                  |    2010 |        3 | 1160116 |
                  |    2010 |        4 | 1300130 |
                  |    2010 |        5 | 1860186 |
                  |    2010 |        6 | 1220122 |
                  |    2010 |        7 | 1250125 |
                  |    2010 |        8 | 1460146 |
                  |    2010 |        9 | 1730173 |
                  |    2010 |       10 | 1490149 |
                  |    2010 |       11 | 1570157 |
                  |    2010 |       12 | 1360136 |
                  +---------+----------+---------+
                  12 rows in set (5.92 sec)
                  
                  
                  select 
                   count(*) as counter
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100;
                  
                  +---------+
                  | counter |
                  +---------+
                  |   30003 |
                  +---------+
                  1 row in set (1.04 sec)
                  
                  explain
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100
                  order by
                   d.id desc limit 10;
                  
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  | id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows    | Extra                       |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  |  1 | SIMPLE      | d     | range | PRIMARY       | PRIMARY | 4       | NULL | 4451372 | Using where; Using filesort |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  1 row in set (0.00 sec)
                  
                  
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100
                  order by
                   d.id desc limit 10;
                  
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           100 | 3290330 | 38434 |
                  |    2010 |        3 |           100 | 3290329 |  9988 |
                  |    2010 |        3 |           100 | 3290328 | 25680 |
                  |    2010 |        3 |           100 | 3290327 | 17627 |
                  |    2010 |        3 |           100 | 3290326 | 64508 |
                  |    2010 |        3 |           100 | 3290325 | 14257 |
                  |    2010 |        3 |           100 | 3290324 | 45950 |
                  |    2010 |        3 |           100 | 3290323 | 49986 |
                  |    2010 |        3 |           100 | 3290322 |  2459 |
                  |    2010 |        3 |           100 | 3290321 | 52971 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (0.98 sec)
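
The explain output for this query estimates about 4.4 million rows to examine, although only 30003 rows match, presumably because the range on month_id comes before datasource_id in the key. One variation that might narrow the scan, which was not tested in the original answer, is to list the months with in (...) so that the optimizer can treat each month as an equality and also use the datasource_id part of the key:

```sql
-- Untested variation: in (...) instead of between on month_id.
explain
select
 count(*) as counter
from
 datasources d
where
 d.year_id = 2010 and d.month_id in (1, 2, 3) and d.datasource_id = 100;
```

Comparing the rows column of explain for the between and in forms would show whether this helps on a given MySQL version.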
                  
                  
                  select 
                   count(*) as counter
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3;
                  
                  +---------+
                  | counter |
                  +---------+
                  | 3450345 |
                  +---------+
                  1 row in set (1.64 sec)
                  
                  explain
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3
                  order by
                   d.id desc limit 10;
                  
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  | id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows    | Extra                       |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  |  1 | SIMPLE      | d     | range | PRIMARY       | PRIMARY | 3       | NULL | 6566916 | Using where; Using filesort |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  1 row in set (0.00 sec)
                  
                  
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3
                  order by
                   d.id desc limit 10;
                  
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           116 | 3450346 | 42455 |
                  |    2010 |        3 |           116 | 3450345 | 64039 |
                  |    2010 |        3 |           116 | 3450344 | 27046 |
                  |    2010 |        3 |           116 | 3450343 | 23730 |
                  |    2010 |        3 |           116 | 3450342 | 52380 |
                  |    2010 |        3 |           116 | 3450341 | 35700 |
                  |    2010 |        3 |           116 | 3450340 | 20195 |
                  |    2010 |        3 |           116 | 3450339 | 21758 |
                  |    2010 |        3 |           116 | 3450338 | 51378 |
                  |    2010 |        3 |           116 | 3450337 | 34687 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (1.98 sec)
                  

Hope this helps :)
