SQL Server performance consulting.

This page is for environments where the performance pain is already real, but the explanation still keeps changing faster than the system improves.

Blocking keeps coming back

People know blocking exists, but still do not have a stable explanation for why the same contention pattern returns.

The story keeps changing

One week the blame is indexing, then hardware, then tempdb, then a deployment, then reporting. The team has theories, not a diagnosis.

Quick fixes changed the shape, not the result

Something improved for a while, but the estate still feels unstable, and no one trusts the next tuning change enough to roll it out with confidence.

What helps in the first message

  • What feels slow and who feels it.
  • Whether the problem is constant, intermittent, or recently worse.
  • Any evidence already available: waits, plans, deadlocks, blocking samples, or monitoring trends.
  • What has already been tried.
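If wait evidence has not been collected yet, a quick snapshot is easy to attach. The query below is a minimal, illustrative example of sampling accumulated waits from sys.dm_os_wait_stats; the list of wait types filtered out is a common but not exhaustive set of benign background waits.

```sql
-- Top accumulated waits since the last restart, excluding
-- common benign/idle wait types (illustrative filter list).
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    wait_time_ms - signal_wait_time_ms AS resource_wait_ms,
    signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'BROKER_TASK_STOP',
    N'XE_TIMER_EVENT', N'XE_DISPATCHER_WAIT', N'BROKER_TO_FLUSH',
    N'SQLTRACE_BUFFER_FLUSH', N'CHECKPOINT_QUEUE', N'WAITFOR',
    N'REQUEST_FOR_DEADLOCK_SEARCH', N'LOGMGR_QUEUE',
    N'DIRTY_PAGE_POLL'
)
ORDER BY wait_time_ms DESC;
```

These numbers are cumulative since startup, so two snapshots taken some minutes apart tell a clearer story than one.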

When not to start here

If the bigger problem is inherited uncertainty, upgrade risk, or recovery posture rather than live workload pain, the broader consulting page is usually the better first stop.

When this is the right kind of outside help

Performance consulting is for diagnosis-led work. It is not just a longer name for generic tuning.

Situation | Better fit
Live performance pain with unclear diagnosis | SQL Server performance consulting
Broad inherited-estate uncertainty with many non-performance risks | Broader SQL Server consulting
One known query issue with a trusted root cause | Targeted remediation rather than broad consulting

What performance consulting is actually for

The job is to connect production pain back to the workload behavior causing it. That means reading waits, plans, blocking, deadlocks, tempdb pressure, workload timing, and change history together instead of collecting a folder full of screenshots and hoping the answer becomes obvious.

SQL Server performance consulting is useful when the team has already spent enough time living inside partial explanations. Maybe the database is slow only during overlap windows. Maybe deadlocks come in bursts. Maybe one release made things worse, but not in a way that cleanly points to one cause. The work exists to reduce that uncertainty.

A good result is not a heroic tuning anecdote. It is a diagnosis the team can trust enough to act on safely.

  • Connect pain back to workload behavior.
  • Reduce theories instead of adding more.
  • Produce a diagnosis strong enough for production decisions.
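When deadlocks come in bursts, the evidence is often already captured without anyone turning anything on. As one example, the built-in system_health Extended Events session records deadlock graphs, and they can be pulled from its ring buffer target like this:

```sql
-- Recent deadlock graphs from the built-in system_health
-- Extended Events session (ring buffer target).
SELECT
    xed.value('@timestamp', 'datetime2') AS deadlock_time,
    xed.query('.')                       AS deadlock_graph
FROM (
    SELECT CAST(st.target_data AS XML) AS target_data
    FROM sys.dm_xe_session_targets AS st
    JOIN sys.dm_xe_sessions AS s
      ON s.address = st.event_session_address
    WHERE s.name = N'system_health'
      AND st.target_name = N'ring_buffer'
) AS tab
CROSS APPLY target_data.nodes(
    'RingBufferTarget/event[@name="xml_deadlock_report"]'
) AS x(xed)
ORDER BY deadlock_time DESC;
```

The ring buffer only holds recent history, so burst patterns are easier to see when the graphs are saved off as they occur.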

Why performance problems stay vague for so long

Different teams usually see different fragments of the same issue. Developers may focus on query design. Infrastructure teams may focus on storage or memory. Operations may focus on jobs and timing. Support or product teams only see the user symptom. Each slice can be partly right while none of them explains the whole picture.

Intermittent symptoms make it worse. Systems that only slow under overlap conditions, month-end load, certain plan choices, or occasional parameter patterns are much harder to reason about than systems that are constantly slow. That is why performance work so often drifts into guesswork.

Outside consulting is useful here because it forces those fragments back into one technical story.

  • Different teams own different fragments of the pain.
  • Intermittent symptoms distort judgment.
  • Without a single story, fixes become random.

What a useful review usually looks at

Symptom shape comes first. What is slow, who feels it, when does it happen, how long has it been like this, and what changed around it? Those questions matter because they stop the work from collapsing into generic tuning before the problem model is even clear.

From there the review usually moves through waits, blocking, deadlocks, plans, indexing quality, tempdb behavior, resource pressure, workload overlap, and recent change history. None of those areas mean much in isolation. Their value is in how they explain the same pain together.

The purpose of the review is not to admire every technical detail in the estate. It is to review enough of the production picture that the next fix order stops being random.

  • Symptom shape first.
  • Then evidence across waits, plans, blocking, tempdb, and workload timing.
  • Review enough to explain the pain, not everything for its own sake.
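Blocking evidence in particular is cheap to sample while the pain is happening. A minimal point-in-time snapshot, using the standard request and SQL-text DMVs, looks something like this:

```sql
-- Point-in-time blocking snapshot: who is blocked, by whom,
-- on what wait type, and for how long.
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time AS wait_time_ms,
    r.status,
    t.text     AS current_batch
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;
```

A handful of these snapshots taken across a slow window usually says more about the contention pattern than any single capture.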

Where teams usually lose time

They lose time when performance work turns into a generic wishlist. Update statistics. Rebuild indexes. Tune queries. Add memory. Move storage. Review tempdb. None of those ideas are always wrong, but without a strong diagnosis they become random acts of database maintenance.

They also lose time when they treat the loudest complaint as the only complaint. A timeout may be the most visible symptom while blocking, transaction scope, or workload overlap is the real source. A CPU spike may be downstream of bad plan stability. A tempdb issue may be a workload-pattern issue wearing a tempdb mask.

Performance consulting should keep the team from paying for the wrong fix order.

  • Generic wishlists consume time without reducing uncertainty.
  • The loudest symptom is not always the real bottleneck.
  • The fix order matters as much as the fixes.
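Unmasking a tempdb symptom starts with knowing what kind of space is actually being consumed. As a sketch, the standard file-space DMV breaks tempdb usage into user objects, internal objects (sorts, spools, hash work), and the version store:

```sql
-- Where tempdb space is going: user objects, internal objects
-- (sorts/spools/hashes), or the version store. Pages are 8 KB.
SELECT
    SUM(user_object_reserved_page_count)     * 8 AS user_objects_kb,
    SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,
    SUM(version_store_reserved_page_count)   * 8 AS version_store_kb,
    SUM(unallocated_extent_page_count)       * 8 AS free_space_kb
FROM tempdb.sys.dm_db_file_space_usage;
```

If internal objects dominate during the slow window, the problem is often workload shape (large sorts and spills) rather than tempdb configuration itself.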

What a good performance review should leave behind

The team should leave with a clearer diagnosis and a smaller set of changes that actually fit the evidence. That may mean query and indexing work, transaction-scope changes, workload separation, tempdb changes, configuration review, or capacity movement. The point is not which category wins. The point is that the decision is finally supported.

The review should also leave behind better judgment. Future symptoms should be easier to interpret because the team better understands which signals matter, which monitoring gaps hurt, and which kinds of drift create the same confusion again.

That is why good performance work is not only about this week’s incident. It should leave the estate easier to reason about after the consultant is gone.

  • Clearer diagnosis.
  • Smaller action list.
  • Better future judgment around performance drift.

Why this work is often worth doing before the next incident gets worse

Teams sometimes wait too long because the system is still usable most of the time. That can make the issue feel survivable even while it is slowly getting more expensive. What changes the math is usually not one dramatic outage. It is the accumulation of time lost to repeated slow periods, unstable reporting windows, and half-successful fixes that keep dragging the same people back into the same problem.

A proper performance review can interrupt that loop. It does not need the estate to be catastrophic first. It only needs the production pain to be real enough that the team wants a better explanation than the one it has now.

That is often why outside SQL Server performance consulting gets approved. Not because the environment is beyond repair, but because the current pace of guesswork is already too costly.

  • The system does not need to be on fire before the review is worth doing.
  • Repeated partial fixes carry their own cost.
  • The real value is often stopping the cycle of expensive guesswork.

Useful next reads

If the main pain is performance, describe the symptoms plainly.

Say what is slow, when it happens, and what evidence already exists. That is usually enough to tell whether the work needs broader consulting or a narrower performance review.