sql server hub / production readiness guide

SQL Serverproduction readiness guide

Production readiness means the SQL Server can be operated when backups, failover, changes, alerts, and capacity all matter at the same time.

Use this guide before launch, audit, handover, migration, upgrade, or a bigger production workload. For a wider review, start with a SQL Server health audit.

backupsmonitoringaccessHA/DRrollback

Guide

Operational guide~18 min readUpdated 23 May 2026

Share

LinkedInXEmail

In this guide

  1. 1. What a SQL Server production readiness review should prove
  2. 2. SQL Server production readiness baseline
  3. 3. SQL Server backup and restore readiness
  4. 4. SQL Server job and maintenance readiness
  5. 5. SQL Server monitoring and alert readiness
  6. 6. SQL Server access and ownership readiness
  7. 7. SQL Server HA, DR, and failover readiness
  8. 8. SQL Server deployment, patch, and rollback readiness
  9. 9. SQL Server capacity and workload readiness
  10. 10. SQL Server production readiness scorecard
  11. 11. Common SQL Server production readiness mistakes
  12. 12. When a SQL Server production readiness review helps

First checks

1

Check whether backups and restores are proven, not just scheduled.

2

Confirm jobs, alerts, access, and HA/DR have real owners.

3

Review rollback rules before the next launch, patch, migration, or config change.

4

Find capacity and workload pressure before the system carries more production load.

1 / Output

What a SQL Server production readiness review should prove

A readiness review should say whether the SQL Server is ready, ready with fixes, not ready, or still unknown because proof is missing. Quiet production is not enough.

The review should name what would fail first during an outage, deployment, patch, restore, failover, or growth spike, then turn that into a fix order.

AreaWhat it provesCaution
BackupsRecent full, differential, and log backups match the recovery model.Does not prove restore success.
Restore historyRecent restore tests or test runs prove the path and timing.No restore history usually means recovery is still theoretical.
JobsCritical SQL Agent jobs succeed, finish on time, and alert when they fail.Green final status can hide retried or failed steps.
MonitoringFailures reach someone who can act early enough.A dashboard is not readiness if nobody watches the right signals.
AccessPrivileged access, job owners, and disabled logins are understood.Do not change access blindly during readiness review.
HA/DRReplica health, sync state, failover path, and restore path are clear.HA does not replace backups or restore testing.
RollbackPatch, deploy, migration, and config rollback rules are written down.A rollback plan must include stop points and validation.
CapacityWaits, active workload, file latency, file growth, and tempdb have headroom.Current availability is not the same as production headroom.

2 / Baseline

SQL Server production readiness baseline

Start with version, uptime, edition, CPU, memory, virtualization, database state, recovery model, compatibility level, and Query Store state. Uptime matters because many counters and waits reset after restart.

Instance build, uptime, CPU, and memory

Reads the current SQL Server build, uptime, CPU, memory, and virtualization context.

SELECT
    @@SERVERNAME AS server_name,
    SERVERPROPERTY(N'Edition') AS edition,
    SERVERPROPERTY(N'ProductVersion') AS product_version,
    SERVERPROPERTY(N'ProductLevel') AS product_level,
    SERVERPROPERTY(N'ProductUpdateLevel') AS product_update_level,
    SERVERPROPERTY(N'ProductUpdateReference') AS product_update_reference,
    si.sqlserver_start_time,
    DATEDIFF(hour, si.sqlserver_start_time, SYSDATETIME()) AS uptime_hours,
    si.cpu_count,
    si.scheduler_count,
    si.physical_memory_kb / 1024 AS physical_memory_mb,
    si.virtual_machine_type_desc
FROM sys.dm_os_sys_info AS si;

Recent uptime limits how much history waits and runtime counters can show. Build and update level matter before patching, upgrading, or opening a vendor case. CPU and memory shape help interpret capacity readiness. This is context, not a readiness decision by itself.

Database readiness baseline

Lists user database state, compatibility level, recovery model, and Query Store flag.

SELECT
    name AS database_name,
    state_desc,
    compatibility_level,
    recovery_model_desc,
    is_read_only,
    is_auto_close_on,
    is_auto_shrink_on,
    is_query_store_on
FROM sys.databases
WHERE name NOT IN (N'master', N'model', N'msdb', N'tempdb')
ORDER BY name;

Unexpected database states need review before launch or change. Recovery model must match backup and restore expectations. Auto close and auto shrink are warning flags on most production SQL Servers. Query Store helps validate workload behavior after changes.

Query Store state

Shows Query Store actual state and storage usage for user databases.

SELECT
    d.name AS database_name,
    d.is_query_store_on,
    qso.actual_state_desc,
    qso.desired_state_desc,
    qso.readonly_reason,
    qso.current_storage_size_mb,
    qso.max_storage_size_mb
FROM sys.databases AS d
LEFT JOIN sys.database_query_store_options AS qso
    ON qso.database_id = d.database_id
WHERE d.name NOT IN (N'master', N'model', N'msdb', N'tempdb')
ORDER BY d.name;

Actual state can differ from desired state. A full or read-only Query Store may not capture validation data. No Query Store means you need another way to compare workload behavior. Do not change Query Store settings blindly during a production incident.

3 / Backup and restore

SQL Server backup and restore readiness

Backups are not production readiness unless restore path, timing, and dependencies are known. Use the backup guide and recovery guide when this is the weakest area.

Backup and restore history

Shows latest backup by type plus restore history recorded in msdb.

WITH LastBackups AS (
    SELECT
        bs.database_name,
        bs.type,
        bs.backup_start_date,
        bs.backup_finish_date,
        bs.backup_size,
        bs.compressed_backup_size,
        ROW_NUMBER() OVER (
            PARTITION BY bs.database_name, bs.type
            ORDER BY bs.backup_finish_date DESC
        ) AS rn
    FROM msdb.dbo.backupset AS bs
    WHERE bs.backup_finish_date >= DATEADD(day, -60, GETDATE())
)
SELECT
    d.name AS database_name,
    d.recovery_model_desc,
    CASE lb.type
        WHEN 'D' THEN 'Full'
        WHEN 'I' THEN 'Differential'
        WHEN 'L' THEN 'Log'
        ELSE lb.type
    END AS backup_type,
    lb.backup_finish_date,
    DATEDIFF(hour, lb.backup_finish_date, GETDATE()) AS hours_since_backup,
    DATEDIFF(minute, lb.backup_start_date, lb.backup_finish_date) AS duration_minutes,
    CONVERT(decimal(18,2), lb.backup_size / 1048576.0) AS backup_size_mb,
    CONVERT(decimal(18,2), lb.compressed_backup_size / 1048576.0) AS compressed_size_mb
FROM sys.databases AS d
LEFT JOIN LastBackups AS lb
    ON lb.database_name = d.name
   AND lb.rn = 1
WHERE d.name <> N'tempdb'
ORDER BY d.name, backup_type;

SELECT
    destination_database_name,
    MAX(restore_date) AS last_restore_date,
    COUNT(*) AS restore_records
FROM msdb.dbo.restorehistory
GROUP BY destination_database_name
ORDER BY last_restore_date DESC;

Backup history proves SQL Server recorded backups, not that files are restorable. Restore history is strongest when it reflects recent controlled tests. Backup duration affects maintenance windows and rollback planning. Compressed size changes can reveal workload or data-growth changes.

Backup gap readiness review

Flags missing full backups, missing log backups, and older full backups.

SELECT
    d.name AS database_name,
    d.recovery_model_desc,
    MAX(CASE WHEN bs.type = 'D' THEN bs.backup_finish_date END) AS last_full_backup,
    MAX(CASE WHEN bs.type = 'I' THEN bs.backup_finish_date END) AS last_diff_backup,
    MAX(CASE WHEN bs.type = 'L' THEN bs.backup_finish_date END) AS last_log_backup,
    CASE
        WHEN MAX(CASE WHEN bs.type = 'D' THEN bs.backup_finish_date END) IS NULL
            THEN 'No full backup found'
        WHEN d.recovery_model_desc IN ('FULL', 'BULK_LOGGED')
             AND MAX(CASE WHEN bs.type = 'L' THEN bs.backup_finish_date END) IS NULL
            THEN 'Full/Bulk-logged with no log backup found'
        WHEN MAX(CASE WHEN bs.type = 'D' THEN bs.backup_finish_date END) < DATEADD(day, -7, GETDATE())
            THEN 'Full backup older than 7 days'
        ELSE 'Review against RPO'
    END AS review_note
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS bs
    ON bs.database_name = d.name
   AND bs.backup_finish_date >= DATEADD(day, -60, GETDATE())
WHERE d.name <> N'tempdb'
GROUP BY d.name, d.recovery_model_desc
ORDER BY d.name;

Thresholds are starting points, not universal policy. Full or bulk-logged recovery without log backups is usually a serious readiness problem. Compare backup age to RPO and restore expectations. Use recovery readiness when restore/failover confidence is the main risk.

4 / Jobs and maintenance

SQL Server job and maintenance readiness

Production readiness fails when critical jobs fail quietly, run too long, depend on old owners, or hide the real maintenance work. Use the maintenance plan guide for a deeper job review.

Failed, long, disabled, and owner-risk jobs

Shows disabled jobs plus recent failed, retried, canceled, or long-running SQL Agent job steps.

SELECT TOP (150)
    j.name AS job_name,
    CASE WHEN j.enabled = 1 THEN 'Enabled' ELSE 'Disabled' END AS job_status,
    SUSER_SNAME(j.owner_sid) AS job_owner,
    h.step_id,
    h.step_name,
    CASE h.run_status
        WHEN 0 THEN 'Failed'
        WHEN 1 THEN 'Succeeded'
        WHEN 2 THEN 'Retry'
        WHEN 3 THEN 'Canceled'
        WHEN 4 THEN 'In progress'
        ELSE CONVERT(varchar(20), h.run_status)
    END AS run_status,
    msdb.dbo.agent_datetime(h.run_date, h.run_time) AS run_start_time,
    ((h.run_duration / 10000) * 3600)
      + (((h.run_duration % 10000) / 100) * 60)
      + (h.run_duration % 100) AS duration_seconds,
    h.message
FROM msdb.dbo.sysjobs AS j
LEFT JOIN msdb.dbo.sysjobhistory AS h
    ON h.job_id = j.job_id
   AND h.run_date >= CONVERT(int, CONVERT(char(8), DATEADD(day, -30, GETDATE()), 112))
WHERE j.enabled = 0
   OR h.run_status <> 1
   OR h.run_duration >= 3000
ORDER BY job_status, run_start_time DESC, duration_seconds DESC;

Disabled jobs should be intentional and documented. A final success can hide failed or retried job steps. Job owners can break after account changes. Long jobs can collide with backup, maintenance, or business windows.

Maintenance command readiness

Finds job steps that appear to run CHECKDB, backups, index/statistics work, or cleanup.

SELECT
    j.name AS job_name,
    CASE WHEN j.enabled = 1 THEN 'Enabled' ELSE 'Disabled' END AS job_status,
    SUSER_SNAME(j.owner_sid) AS job_owner,
    s.step_id,
    s.step_name,
    CASE
        WHEN s.command LIKE '%DBCC CHECKDB%' THEN 'CHECKDB'
        WHEN s.command LIKE '%BACKUP DATABASE%' THEN 'Database backup'
        WHEN s.command LIKE '%BACKUP LOG%' THEN 'Log backup'
        WHEN s.command LIKE '%ALTER INDEX%' THEN 'Index maintenance'
        WHEN s.command LIKE '%UPDATE STATISTICS%' THEN 'Statistics update'
        WHEN s.command LIKE '%sp_delete_backuphistory%' THEN 'Backup history cleanup'
        WHEN s.command LIKE '%sp_purge_jobhistory%' THEN 'Job history cleanup'
        WHEN s.command LIKE '%xp_delete_file%' THEN 'File cleanup'
        ELSE 'Other possible maintenance'
    END AS maintenance_type,
    LEFT(REPLACE(REPLACE(s.command, CHAR(13), ' '), CHAR(10), ' '), 700) AS command_preview
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobsteps AS s
    ON s.job_id = j.job_id
WHERE s.command LIKE '%DBCC CHECKDB%'
   OR s.command LIKE '%BACKUP DATABASE%'
   OR s.command LIKE '%BACKUP LOG%'
   OR s.command LIKE '%ALTER INDEX%'
   OR s.command LIKE '%UPDATE STATISTICS%'
   OR s.command LIKE '%sp_delete_backuphistory%'
   OR s.command LIKE '%sp_purge_jobhistory%'
   OR s.command LIKE '%xp_delete_file%'
ORDER BY j.name, s.step_id;

This helps reveal maintenance hidden behind generic job names. Command previews are for review only; do not edit jobs from this output alone. CHECKDB, backup, index, statistics, and cleanup work need separate review. No rows may mean the logic is in external scripts or stored procedures.

5 / Monitoring

SQL Server monitoring and alert readiness

Monitoring is ready when failures reach someone who can act. SQL Agent failures, backup failures, blocking, file growth, tempdb pressure, and error log patterns should not depend on someone manually looking later.

Use the SQL Server monitoring guide when alert coverage is unclear.

SQL Agent notification and Database Mail readiness

Shows job notification settings, email operators, and Database Mail profile/account setup.

SELECT
    j.name AS job_name,
    CASE WHEN j.enabled = 1 THEN 'Enabled' ELSE 'Disabled' END AS job_status,
    SUSER_SNAME(j.owner_sid) AS job_owner,
    j.notify_level_email,
    o.name AS email_operator,
    o.enabled AS operator_enabled
FROM msdb.dbo.sysjobs AS j
LEFT JOIN msdb.dbo.sysoperators AS o
    ON o.id = j.notify_email_operator_id
ORDER BY j.notify_level_email, j.name;

SELECT
    p.name AS profile_name,
    pp.is_default,
    a.name AS account_name,
    a.email_address
FROM msdb.dbo.sysmail_profile AS p
LEFT JOIN msdb.dbo.sysmail_principalprofile AS pp
    ON pp.profile_id = p.profile_id
LEFT JOIN msdb.dbo.sysmail_profileaccount AS pa
    ON pa.profile_id = p.profile_id
LEFT JOIN msdb.dbo.sysmail_account AS a
    ON a.account_id = pa.account_id
ORDER BY p.name, a.name;

No notification may be acceptable only if external monitoring covers the same risk. A configured mail profile does not prove email delivery works. A personal mailbox is fragile if the person leaves or stops watching it. Test alert delivery through approved procedures before relying on it.

6 / Access

SQL Server access and ownership readiness

Production readiness depends on knowing who has privileged access and which accounts own jobs. Review access before changing it. Use the ownership gap guide when responsibility is unclear.

Server role and sysadmin readiness

Lists server principals, disabled state, password policy flags, and server role membership.

SELECT
    sp.name AS login_name,
    sp.type_desc,
    sp.is_disabled,
    sl.is_policy_checked,
    sl.is_expiration_checked,
    STRING_AGG(rp.name, ', ') WITHIN GROUP (ORDER BY rp.name) AS server_roles
FROM sys.server_principals AS sp
LEFT JOIN sys.sql_logins AS sl
    ON sl.principal_id = sp.principal_id
LEFT JOIN sys.server_role_members AS srm
    ON srm.member_principal_id = sp.principal_id
LEFT JOIN sys.server_principals AS rp
    ON rp.principal_id = srm.role_principal_id
WHERE sp.type IN ('S', 'U', 'G')
  AND sp.name NOT LIKE '##%'
GROUP BY sp.name, sp.type_desc, sp.is_disabled, sl.is_policy_checked, sl.is_expiration_checked
ORDER BY CASE WHEN STRING_AGG(rp.name, ', ') LIKE '%sysadmin%' THEN 0 ELSE 1 END, sp.name;

Review sysadmin membership before launch, audit, or handover. Disabled logins may still own jobs or database users. Do not remove access based only on this query. Older SQL Server versions without STRING_AGG need an adjusted version of this query.

Orphaned database users starting point

Run inside a user database to find database users without matching server principals.

SELECT
    DB_NAME() AS database_name,
    dp.name AS database_user,
    dp.type_desc,
    dp.authentication_type_desc,
    dp.create_date,
    dp.modify_date
FROM sys.database_principals AS dp
LEFT JOIN sys.server_principals AS sp
    ON sp.sid = dp.sid
WHERE dp.type IN ('S', 'U', 'G')
  AND dp.authentication_type_desc = 'INSTANCE'
  AND sp.sid IS NULL
  AND dp.name NOT IN (N'dbo', N'guest', N'INFORMATION_SCHEMA', N'sys')
ORDER BY dp.name;

Run this in each important user database. Contained users and external authentication need separate interpretation. Orphaned users can matter during restore, migration, or handover. Review before changing users or mappings.

7 / HA and DR

SQL Server HA, DR, and failover readiness

HA can reduce downtime, but it does not replace restore readiness. Availability Groups still need healthy replicas, known failover behavior, valid backups, and application validation.

Use recovery readiness when restore or failover confidence is the main concern.

Availability Group readiness

Checks AG replica role, replica health, database sync state, send queue, and redo queue.

SELECT
    ag.name AS availability_group_name,
    ar.replica_server_name,
    ars.role_desc,
    ars.synchronization_health_desc AS replica_health,
    DB_NAME(drs.database_id) AS database_name,
    drs.synchronization_state_desc,
    drs.synchronization_health_desc AS database_health,
    drs.log_send_queue_size,
    drs.redo_queue_size
FROM sys.availability_groups AS ag
JOIN sys.availability_replicas AS ar
    ON ar.group_id = ag.group_id
LEFT JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ars.replica_id = ar.replica_id
LEFT JOIN sys.dm_hadr_database_replica_states AS drs
    ON drs.replica_id = ar.replica_id
ORDER BY ag.name, ar.replica_server_name, database_name;

No rows may mean the instance does not host Availability Groups. Synchronization health matters before failover or planned maintenance. Send and redo queues can show lag that affects failover readiness. AG health does not prove restore readiness.

8 / Change and rollback

SQL Server deployment, patch, and rollback readiness

A production SQL Server is not ready for change if nobody knows when to stop, how to validate, or how to roll back. Use the SQL Server update guide before patch windows and the stability review guide when repeated issues follow changes.

Read-only error log searches before change

Reads the current SQL Server error log for common readiness and rollback concerns.

EXEC sys.xp_readerrorlog 0, 1, N'error';
EXEC sys.xp_readerrorlog 0, 1, N'severity';
EXEC sys.xp_readerrorlog 0, 1, N'failed';
EXEC sys.xp_readerrorlog 0, 1, N'I/O';
EXEC sys.xp_readerrorlog 0, 1, N'deadlock';
EXEC sys.xp_readerrorlog 0, 1, N'timeout';
EXEC sys.xp_readerrorlog 0, 1, N'recovery';

xp_readerrorlog reads log content; it does not change SQL Server state. Review timing against patches, deployments, restarts, and failovers. Repeated I/O, recovery, severity, or timeout messages should be understood before new change. Search terms are only a starting point.

Change typeReady meansWarning
PatchTarget build, backup point, rollback trigger, validation owner, and post-patch checks.Use the update guide before the window.
DeploymentRelease owner, database changes, rollback script/path, app validation, and stop time.Do not discover rollback steps during the outage.
MigrationCutover order, fallback path, data validation, DNS/app switch, and time limit.Migration rollback often differs from patch rollback.
Config changeCurrent value captured, target value known, restart impact understood, revert command ready.Some changes are not instantly reversible.
FailoverReplica health, sync state, app connection path, failback decision, and validation steps.Failover is not a substitute for restore readiness.

9 / Capacity

SQL Server capacity and workload readiness

Production readiness includes headroom. Current availability is not enough if waits, active workload, file latency, file growth, or tempdb already show pressure. Use the sizing and tempdb guides when this section raises concerns.

Wait stats readiness snapshot

Shows the largest wait categories since the last SQL Server restart, excluding common benign waits.

SELECT TOP (40)
    wait_type,
    wait_time_ms,
    waiting_tasks_count,
    signal_wait_time_ms,
    wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms,
    CASE
        WHEN waiting_tasks_count = 0 THEN 0
        ELSE wait_time_ms / waiting_tasks_count
    END AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR', N'BROKER_TASK_STOP',
    N'BROKER_TO_FLUSH', N'BROKER_TRANSMITTER', N'CHECKPOINT_QUEUE',
    N'CHKPT', N'CLR_AUTO_EVENT', N'CLR_MANUAL_EVENT', N'CLR_SEMAPHORE',
    N'DBMIRROR_DBM_EVENT', N'DBMIRROR_EVENTS_QUEUE', N'DBMIRROR_WORKER_QUEUE',
    N'DBMIRRORING_CMD', N'DIRTY_PAGE_POLL', N'DISPATCHER_QUEUE_SEMAPHORE',
    N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'HADR_CLUSAPI_CALL',
    N'HADR_FILESTREAM_IOMGR_IOCOMPLETION', N'HADR_LOGCAPTURE_WAIT',
    N'HADR_NOTIFICATION_DEQUEUE', N'HADR_TIMER_TASK', N'HADR_WORK_QUEUE',
    N'LAZYWRITER_SLEEP', N'LOGMGR_QUEUE', N'ONDEMAND_TASK_QUEUE',
    N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP', N'REQUEST_FOR_DEADLOCK_SEARCH',
    N'SERVER_IDLE_CHECK', N'SLEEP_TASK', N'SLEEP_SYSTEMTASK',
    N'SQLTRACE_BUFFER_FLUSH', N'WAITFOR', N'XE_DISPATCHER_WAIT', N'XE_TIMER_EVENT'
)
ORDER BY wait_time_ms DESC;

Wait stats are cumulative since startup unless cleared. One snapshot does not prove a cause. Use deltas around important workload windows when possible. High waits should be compared with active workload and incident timing.

Active workload readiness snapshot

Shows currently running requests, waits, blocking, resource usage, and SQL text.

SELECT
    r.session_id,
    s.login_name,
    s.host_name,
    DB_NAME(r.database_id) AS database_name,
    r.status,
    r.command,
    r.wait_type,
    r.wait_time,
    r.blocking_session_id,
    r.cpu_time,
    r.logical_reads,
    r.reads,
    r.writes,
    r.total_elapsed_time,
    SUBSTRING(
        st.text,
        (r.statement_start_offset / 2) + 1,
        CASE
            WHEN r.statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(max), st.text))
            ELSE (r.statement_end_offset - r.statement_start_offset) / 2 + 1
        END
    ) AS running_statement
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.session_id <> @@SPID
ORDER BY r.total_elapsed_time DESC;

Run during a representative workload, not only during quiet periods. Blocking, high elapsed time, high reads, or repeated waits deserve follow-up. Do not kill sessions from this output without understanding the transaction. This is a point-in-time view.

File latency readiness

Shows average read and write latency by database file from virtual file stats.

SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.type_desc,
    mf.physical_name,
    vfs.num_of_reads,
    vfs.num_of_writes,
    CASE WHEN vfs.num_of_reads = 0 THEN 0 ELSE vfs.io_stall_read_ms / vfs.num_of_reads END AS avg_read_latency_ms,
    CASE WHEN vfs.num_of_writes = 0 THEN 0 ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id = vfs.file_id
ORDER BY avg_read_latency_ms + avg_write_latency_ms DESC;

Counters are cumulative since the file was opened. Separate data, log, and tempdb latency patterns. High latency can affect backups, CHECKDB, workload, and maintenance windows. Confirm with storage/platform metrics before blaming SQL alone.

File growth readiness

Lists database file sizes, growth settings, and max size.

SELECT
    DB_NAME(database_id) AS database_name,
    type_desc,
    name AS file_name,
    physical_name,
    size * 8 / 1024 AS size_mb,
    growth,
    is_percent_growth,
    max_size
FROM sys.master_files
WHERE database_id > 4
ORDER BY database_name, type_desc, file_name;

Percent growth and tiny growth increments can both cause readiness issues. Autogrowth during busy windows can look like random slowness. Review growth with storage, backup, and maintenance plans. Do not shrink files as a readiness fix.

Tempdb readiness

Shows tempdb file layout and current reserved space by usage category.

SELECT
    name AS file_name,
    type_desc,
    physical_name,
    size * 8 / 1024 AS size_mb,
    growth,
    is_percent_growth
FROM tempdb.sys.database_files
ORDER BY type_desc, file_id;

SELECT
    SUM(user_object_reserved_page_count) * 8 / 1024 AS user_object_mb,
    SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_object_mb,
    SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb,
    SUM(unallocated_extent_page_count) * 8 / 1024 AS free_space_mb
FROM tempdb.sys.dm_db_file_space_usage;

Tempdb pressure can affect workload, jobs, and deployments. Run during realistic workload where possible. Version store, internal objects, and free space need separate interpretation. Use the tempdb guide for deeper diagnosis.

10 / Scorecard

SQL Server production readiness scorecard

The scorecard should be blunt. Ready means there is proof. Unknown means proof is missing. High risk means production should not carry more load or change until the issue is fixed.

AreaReadyReady with fixesHigh riskProof missing
RecoveryReadyRecent backups and restore test prove recovery path.Backups exist but restore timing needs work.Missing backups, broken log chain, or no restore path.No one can show backup or restore history.
JobsReadyCritical jobs succeed, alert, and finish in window.Some noisy or long jobs need cleanup.Critical jobs fail or are disabled.Job history is missing or too short.
MonitoringReadyFailures reach a watched destination.Coverage exists but has blind spots.No alert owner or repeated missed failures.No one knows what is monitored.
AccessReadyPrivileged access and owners are reviewed.Some old owners or logins need follow-up.Unknown sysadmin access or broken job owners.No access inventory exists.
HA/DRReadyFailover and restore paths are both understood.AG/DR exists but needs a drill.Replica unhealthy or failover path unclear.HA/DR state has not been checked.
RollbackReadyStop points, revert path, and validation are known.Rollback exists but timing is untested.No rollback trigger for risky change.Rollback is assumed, not documented.
CapacityReadyWaits, latency, tempdb, and growth have headroom.Growth or latency needs scheduled review.Capacity pressure is already visible.No baseline exists.
OwnershipReadyA named owner can act on findings.Ownership exists but escalation is soft.No one owns cross-area fixes.Responsibility is unclear.
False confidenceWhy it is weak
Quiet serverDemand may simply have been kind so far.
Green jobsFinal success can hide retries, skipped steps, or weak coverage.
Backup files existFiles do not prove restore timing, dependency order, or recovery success.
Availability Group existsHA does not prove restore readiness, app failover, or backup quality.
Dashboard is greenThe dashboard may not cover jobs, backups, blocking, file growth, or tempdb.
Old access model still worksIt may be over-privileged, ownerless, or dependent on stale accounts.

11 / Mistakes

Common SQL Server production readiness mistakes

MistakeResult
Assuming quiet means readyReadiness only matters when production gets less forgiving.
Backups without restore proofRecovery timing and restore order remain unknown.
HA without failover testingThe system may fail in the exact moment HA is needed.
No rollback triggerA bad change can consume the whole window before anyone stops.
No monitoring ownerThe alert can be correct and still useless.
Job failures treated as background noiseRoutine failures become production risk.
Access nobody reviewedOld sysadmin access and job owners can break change control.
Unsupported build/version ignoredPatching and support risk sit under every future change.
Capacity reviewed only after launchHeadroom problems become user-facing surprises.

12 / Review help

When a SQL Server production readiness review helps

Need a second pair of eyes before SQL Server carries more production risk?

If the SQL Server is about to carry a launch, audit, handover, migration, upgrade, or heavier workload, a readiness review can check what is actually safe before the pressure arrives.

Use health audit for the wider readiness picture. Use recovery readiness when the main concern is restore, failover, or runbook confidence.

Good fit when

  • The SQL Server is going live or taking more production load.
  • Backups exist, but restore or failover confidence is weak.
  • Jobs, alerts, access, or rollback plans have not been reviewed recently.
  • A patch, migration, audit, or handover is coming.

Next step

Use the SQL Server health audit when production readiness needs one findings list and a fix order.

Use recovery readiness when the main risk is restore, failover, or runbook confidence, or read the SQL Server recovery guide.