SQL Server performance review

This is for SQL Server performance problems that are already real enough to hurt, but still messy enough that nobody trusts the diagnosis.

Usually that means some mix of blocking, waits, deadlocks, recurring slowness, tempdb side effects, and a queue of partial fixes that never made the workload feel stable. The point of the review is not to produce a generic tuning list. The point is to narrow the problem into a fix order that actually matches production behavior.

Related

Use the SQL Server blocking guide for contention chains, the SQL Server waits guide for signal reading, the SQL Server deadlocks guide for victim patterns, and the slow performance problem page when the symptom is still broad.

Good fit

  • Blocking, deadlocks, or slowness are already hurting real work.
  • The team has symptoms and theories, but no stable diagnosis yet.
  • Quick fixes keep changing the shape of the problem without removing it.
  • The next change in production feels risky because nobody is sure what the real bottleneck is.

What you get

  • A clearer triage order tied to the actual blockers instead of a broad tuning wishlist.
  • Workload-specific findings around waits, concurrency, plans, indexing, and timing.
  • A realistic split between fixes that help now and bigger changes that need separate planning.

What the problem usually looks like

Performance review work usually starts after the easy explanations have already failed

Most teams do not ask for outside performance review on day one. They ask after blocking keeps returning, after waits were read but not understood well enough to act, after deadlocks were patched around instead of solved, or after “just tune the indexes” failed to make the estate feel predictable again.

That is why a good performance review is narrower than a broad health audit but wider than a single-query exercise. The workload needs reading in context. What is slow. When it is slow. Which waits grew. Whether the pain is really concurrency. Whether tempdb or maintenance overlap is making the problem look random. Whether one statement is the issue or whether the environment keeps manufacturing contention.

The useful outcome is not one universal explanation. It is a cleaner path from production symptoms to the smallest set of fixes that reliably changes how the workload behaves.

What we review

The review has to connect evidence back to workload behavior, not just collect metrics

Performance work gets noisy fast when people collect every metric available and still cannot explain why users are waiting. The useful review path is to connect the complaint to evidence that matters: waits that changed, blocking that persisted, plans that misfit the workload, transactions that stayed open too long, jobs that collided badly, or indexing that is forcing too much work.

That means reading more than one surface. Waits alone are not enough. Query text alone is not enough. One slow request log is not enough. The point is to connect what the database is doing with when it hurts, how it hurts, and which design or operational choices are making it repeat.
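
As one illustration of "waits that changed", here is a minimal sketch of comparing two wait snapshots. It assumes each snapshot is a plain mapping of wait type to cumulative wait time in milliseconds, for example captured from `sys.dm_os_wait_stats` at the start and end of a bad period. The function name `wait_deltas` and the sample numbers are hypothetical, not part of the review's tooling.

```python
def wait_deltas(before, after):
    """Return (wait_type, delta_ms) pairs that grew between two snapshots.

    Both arguments map wait type -> cumulative wait_time_ms. Cumulative
    totals say little on their own; the growth during the complaint
    window is what can be tied back to what users felt.
    """
    deltas = {}
    for wait_type, ms_after in after.items():
        delta = ms_after - before.get(wait_type, 0)
        # Zero or negative deltas are skipped. A negative value usually
        # means the counters were reset between the two snapshots.
        if delta > 0:
            deltas[wait_type] = delta
    # Largest growth first.
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


# Hypothetical snapshots taken before and after a slow period.
before = {"LCK_M_X": 1000, "PAGEIOLATCH_SH": 5000, "CXPACKET": 200}
after = {"LCK_M_X": 9000, "PAGEIOLATCH_SH": 5200, "CXPACKET": 200, "WRITELOG": 300}

print(wait_deltas(before, after))
# [('LCK_M_X', 8000), ('WRITELOG', 300), ('PAGEIOLATCH_SH', 200)]
```

In this made-up case the largest cumulative wait (`PAGEIOLATCH_SH`) barely moved during the window, while lock waits grew sharply. A delta like that is still only one surface; it needs to line up with blocking and timing evidence before it drives a fix.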

Typical review areas

  • Current symptom shape: blocking, deadlocks, broad slowness, unstable job overlap, or mixed signals.
  • Wait patterns, session evidence, and whether the observed waits match the production complaint.
  • Blocking chains, transaction scope, and the statements actually holding things open.
  • Access paths, plan shape, and indexing decisions that expand lock footprint or waste work.
  • Timing, background jobs, tempdb pressure, and other environmental factors that make the workload look random.
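
To make "the statements actually holding things open" concrete, here is a rough sketch of finding head blockers from session evidence. It assumes rows of `(session_id, blocking_session_id)`, such as a capture from `sys.dm_exec_requests`, where a blocking value of 0 or NULL means the request is not blocked. The function name `head_blockers` and the sample rows are hypothetical.

```python
from collections import defaultdict


def head_blockers(rows):
    """Map each head blocker to the number of sessions stuck behind it.

    rows: iterable of (session_id, blocking_session_id). A head blocker
    blocks at least one session and is not itself blocked. It may appear
    only as a blocking_session_id, for example an idle session that left
    a transaction open and so has no active request of its own.
    """
    # Keep only blocked sessions; 0, None, and negative ids are ignored.
    blocked_by = {sid: blocker for sid, blocker in rows if blocker and blocker > 0}

    children = defaultdict(list)
    for sid, blocker in blocked_by.items():
        children[blocker].append(sid)

    result = {}
    for head in [b for b in children if b not in blocked_by]:
        # Walk the chain below this head, counting every waiting session.
        seen = set()
        stack = [head]
        while stack:
            current = stack.pop()
            for child in children.get(current, []):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        result[head] = len(seen)
    return result


# Hypothetical capture: 51 blocks 52 and 54, 52 blocks 53, 77 blocks 60.
rows = [(51, 0), (52, 51), (53, 52), (54, 51), (60, 77), (61, None)]
print(head_blockers(rows))
# {51: 3, 77: 1}
```

Killing session 52 here would release one waiter and leave the chain intact. The session to understand is 51, along with whatever transaction scope is keeping it open.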

Deliverables

Good output should reduce guesswork, not increase the list of possible theories

Teams usually need three things from this work: a cleaner diagnosis, a triage order that is safe to act on, and an honest answer on what still needs deeper change. Some fixes are operational. Some are design-level. Some are just the cost of workload shape. If the review does not separate those clearly, performance work becomes one more noisy document in the pile.

The useful result is a report and discussion that says what is slowing the system, what evidence supports that view, what to fix first, and what should not be “tuned” blindly in production. Sometimes that leads to narrower concurrency work. Sometimes it points back to health-review issues like visibility or maintenance overlap. Sometimes it shows the real bottleneck is architectural and not something a query hint should be hiding.

Output | What it should answer | Why it matters
Triage order | What to inspect or fix first instead of touching everything at once. | This keeps the team from making the workload noisier while trying to help.
Evidence-backed findings | Which waits, chains, plans, or timing patterns actually support the diagnosis. | This gives the next action a reason instead of a hunch.
Boundary of the problem | What is a near-term fix, what needs wider redesign, and what belongs to another service lane. | This stops performance work from pretending every issue is a small tune-up.

When this is not the right first step

  • The main problem is operational drift across an inherited estate, which calls for a wider review.
  • The work is purely an upgrade or migration readiness engagement.
  • An outage is so active that the first job is incident command rather than structured review.

When outside review makes sense

Outside review usually makes sense when the team already has evidence, but not enough agreement or deep SQL experience to turn that evidence into the next safe move. It also helps when production pressure is making everyone overreact: killing blockers without understanding them, blaming SQL Server generally, or making indexing changes that only move the pain somewhere else.

If the real need is sharper diagnosis under production reality, that is the point of this service. If the estate mainly needs broader operational control first, then a health audit is usually the better opening move.

Next step

If blocking, waits, deadlocks, or broad slowness are already hurting work, use the contact page and describe the symptoms, when they show up, and what evidence you already have.

If you want the technical framing first, read the SQL Server blocking guide, the SQL Server waits guide, and the SQL Server indexing guide.

If the wider estate feels unknown beyond the performance symptoms, the better first page is the SQL Server health audit.