优化现有数据库时首先要检查的问题是什么?

在优化(性能调优、故障排除)现有(但您不知道)的数据库时,最重要的问题是什么?按重要性排序是什么?
您之前优化中的哪些操作/措施产生了最大的效果(可能工作最少)?

What are the top issues and in which order of importance to look into while optimizing (performance tuning, troubleshooting) an existing (but unknown to you) database?
Which actions/measures in your previous optimizations gave the most effect (with possibly the minimum of work) ?

我想将这个问题分为以下几类(按照我感兴趣的顺序):

I'd like to partition this question into following categories (in order of interest to me):

  1. 需要在最短的时间内展示性能提升(改进).即最具成本效益的方法/行动;
  2. 非侵入式或最少麻烦的最有效方法(无需更改现有架构等)
  3. 侵入式方法

更新:
假设我在开发机器上有一个数据库副本,无法访问生产环境来观察实际使用中的统计信息、最常用的查询、性能计数器等.
这是与开发相关但与 DBA 无关的问题.
更新 2:
假设数据库是别人开发的,交给我优化(审核)后才交付生产.
将外包开发与最终用户分离是很常见的.

Update:
Suppose I have a copy of a database on dev machine without access to production environment to observe stats, most used queries, performance counters, etc. in real use.
This is development-related but not DBA-related question.
Update2:
Suppose the database was developed by others and was given to me for optimization (review) before it was delivered to production.
It is quite usual to have outsourced development detached from end-users.

此外,还有一种数据库设计范式,与应用程序数据存储相比,数据库本身应该是一个独立于使用它的特定应用程序或其使用上下文的值.

Besides, there is a database design paradigm that a database, in contrast to application data storage, should be a value in itself independently on specific applications that use it or on context of its use.

Update3:感谢所有回答者!你们都推我打开子问题
你如何在本地加载开发数据库(服务器)?

Update3: Thanks to all answerers! You all pushed me to open subquestion
How do you stress load dev database (server) locally?

推荐答案

如果您对数据库的运行时行为不感兴趣,例如什么是最常执行的查询和那些消耗最多时间的查询,您只能对数据库结构本身进行静态"分析.这实际上的价值要小得多,因为您只能检查一些不良设计的关键指标 - 但您无法真正了解所使用系统的动态".

If you're not interested in the runtime behavior of the database, e.g. what are the most frequently executed queries and those that consume the most time, you can only do a "static" analysis of the database structure itself. That has a lot less value, really, since you can only check for a number of key indicators of bad design - but you cannot really tell much about the "dynamics" of the system being used.

我将在作为 .bak 文件获取的数据库中检查的内容 - 无法收集实时和实际运行时性能统计信息 - 将是:

Things I would check for in a database that I get as a .bak file - without the ability to collect live and actual runtime performance statistics - would be:

  1. 归一化 - 表结构是否归一化为第三范式?(至少大部分时间 - 可能有一些例外)

  1. normalization - is the table structure normalized to third normal form? (at least most of the time - there might be some exceptions)

所有表都有主键吗?(如果它没有主键,那就不是表",毕竟)

do all tables have a primary key? ("if it doesn't have a primary key, it's not a table", after all)

对于 SQL Server:是否所有的表都有好的聚集索引?一个独特的、狭窄的、静态的、最好是不断增加的聚集键——理想情况下是一个 INT IDENTITY,而且绝对不是许多字段的大型复合索引,没有 GUID 和大型 VARCHAR 字段(请参阅 Kimberly Tripp 的 优秀博文关于主题的详细信息)

For SQL Server: do all the tables have a good clustering index? A unique, narrow, static, and preferably ever-increasing clustered key - ideally an INT IDENTITY, and most definitely not a large compound index of many fields, no GUID's and no large VARCHAR fields (see Kimberly Tripp's excellent blog posts on the topics for details)

对数据库表是否有任何检查和默认约束?

are there any check and default constraints on the database tables?

是否所有外键字段都由非聚集索引备份以加速 JOIN 查询?

are all the foreign key fields backed up by a non-clustered index to speed up JOIN queries?

数据库中是否还有其他明显的致命罪行",例如过于复杂的视图,或者设计非常糟糕的表格等.

are there any other, obvious "deadly sins" in the database, e.g. overly complicated views, or really badly designed tables etc.

但同样:如果没有实际的运行时统计信息,从静态分析"的角度来看,您可以做的事情非常有限.真正的优化只有在您拥有日常操作中的工作负载时才能真正发生,以查看哪些查询被频繁使用并且对您的数据库施加的压力最大 --> 使用 Mitch 的清单来检查这些点.

But again: without actual runtime statistics, you're quite limited in what you can do from a "static analysis" point of view. The real optimization can only really happen when you have a workload from a regular day of operation, to see what queries are used frequently and put the most stress on your database --> use Mitch's checklist to check those points.

相关文章