1433), and credentials. SQL Auth and Windows Auth are both supported.VIEW SERVER STATE and VIEW DATABASE STATE on the monitored instance. It never writes to your SQL Server.| Tier | Instances | Duration | Sentinel AI | All Modules | Best For |
|---|---|---|---|---|---|
| Monitor | 3 | 15 days | Text only | ✗ | Small teams, single SQL Server |
| Intelligence | 10 | 15 days | Voice + all tabs | ✓ | Mid-size DBA teams, multiple instances |
| Enterprise | Unlimited | 30 days | Voice + priority | ✓ | Large estates, multi-team rollout |
dpsmf.ai/trial.html. No credit card is required.DPSMF-XXXXXX-XXXXXX-TRIAL license key. Copy it immediately — it is also emailed to you, but the email may take a few minutes to arrive. You will need the key on first login.you+monitor@company.com). Trial accounts cannot be upgraded mid-trial — you can convert to a paid subscription at any time.dpsmf.ai/dpsmf-login.html or click Log In from any DPSMF page. Enter the email and password you registered with.DPSMF-XXXXXX-XXXXXX-TRIAL key and click Activate. The key is validated server-side and tied to your account — it cannot be used again./app/. The dashboard is initially empty — no instances are connected yet. The trial countdown begins at this point.SELECT @@VERSION) to confirm network and credentials. Common failures: firewall blocking port 1433, incorrect hostname, login not mapped to a database. See Requirements for the full permission list.master, tempdb, msdb) are always included for wait-stat and config analysis.VIEW SERVER STATE (for DMVs), VIEW DATABASE STATE (per-DB), VIEW ANY DEFINITION (execution plans). Read the Requirements panel for the exact GRANT script.DPSMF does not generate recommendations immediately. The Intelligence Engine needs to observe a representative sample of workload patterns — typically one full business day — before the Knowledge Graph can distinguish normal from anomalous behaviour.
During the baseline period you can still use Sentinel in Query Mode — ask questions like "What are my top wait types right now?" or "Show me execution plans for queries over 5 seconds." These are answered directly from live DMV data without requiring a trained baseline.
After the baseline window closes, Anomaly Detection activates and the alert system comes fully online. Here is what to expect on Day 2:
alerts@dpsmf.ai. Add this address to your allowlist to avoid spam filtering.sys.dm_os_wait_stats and classifies waits into CPU, I/O, lock, network, and memory categories. Deltas are computed every 10 seconds.avg CPU ms, avg reads, or execution count to find the culprit fast.cpu_pct, batch_requests_sec, or page_life_expectancy. Set the operator and threshold, then choose a notification channel. The baseline-aware option fires only when the value deviates from learned normal — not just crosses a static number.| Signal | What It Means | Action |
|---|---|---|
cpu_pct > 85% sustained | CPU-bound workload or runaway query | Open Top Queries, sort by avg CPU ms |
| PLE dropping > 20% in 10 min | Memory pressure from large sort/hash | Check memory grants in Top Queries |
| Batch rate drops 30%+ vs baseline | Blocking or connectivity issue | Open Blocking Chain view in Sentinel |
| PAGEIOLATCH waits spike | Missing index or cold buffer pool | See Index & Query Health module |
| RESOURCE_SEMAPHORE dominates | Insufficient memory for query grants | Review max server memory setting in Configuration Intelligence |
CONTROL SERVER) and cross-database permission chains.VIEW SERVER STATE and VIEW ANY DEFINITION permissions. If your monitoring account is missing VIEW ANY DEFINITION, TDE and permission sprawl scans will return partial results. Check Settings → Instances → Permission Check to validate.CONTROL SERVER, ALTER ANY LOGIN, or db_owner on a sensitive database is highlighted in amber.| Dimension | Weight | What Drops It |
|---|---|---|
| Identity hygiene | 25% | Orphaned logins, stale accounts, disabled users with active grants |
| Privilege minimisation | 30% | Non-SA sysadmin members, CONTROL SERVER grants, nested role escalation |
| Encryption coverage | 20% | Unencrypted databases, expired TDE certificates, missing backup encryption |
| Authentication hygiene | 15% | Mixed-mode auth on domain instances, SQL logins without password policy |
| Audit trail | 10% | SQL Audit not configured, C2 audit disabled, no login failure logging |
slow queries after index rebuild or blocking on tempdb. Results are ranked by semantic similarity to your query, not keyword match.| Scenario | What to Search | What You’ll Find |
|---|---|---|
| Query suddenly slow after deploy | plan regression after schema change | Parameter sniffing nodes, plan cache flush triggers, UPDATE STATISTICS guidance |
| Tempdb contention alerts firing | tempdb allocation contention | GAM/SGAM page contention nodes, trace flag 1118/1117 analysis, file count recommendations |
| Memory pressure on busy OLTP | memory grant timeout OLTP | Resource semaphore nodes, max server memory tuning, memory-optimised table guidance |
| AG failover investigation | availability group unexpected failover | Health check timeout nodes, lease timeout chain, network jitter patterns |
sys.dm_db_missing_index_details continuously, but de-duplicates and scores candidates by projected impact (avg user seeks × avg total cost) rather than just listing them raw.sys.dm_db_index_physical_stats across all user databases. Results are stored historically so you can see fragmentation growth over time, not just a point-in-time snapshot.REORGANIZE candidates; larger indexes are marked REBUILD. Use the maintenance window filter to schedule rebuilds outside peak hours.| Fragmentation % | Page Count | Recommended Action |
|---|---|---|
| < 5% | Any | No action needed |
| 5–30% | < 1,000 pages | ALTER INDEX ... REORGANIZE (online, low impact) |
| 5–30% | ≥ 1,000 pages | ALTER INDEX ... REBUILD WITH (ONLINE=ON) if Enterprise |
| > 30% | Any | ALTER INDEX ... REBUILD — schedule in maintenance window |
| Any | < 100 pages | Ignore — small tables don’t benefit from fragmentation fixes |
EXEC sp_configure or ALTER DATABASE statements ready to review and run.max degree of parallelism = 0 on a 32-core NUMA system). These should be addressed before any other tuning.sp_configure change, any ALTER DATABASE to a monitored setting, or any tempdb file count change.| Setting | Why It Matters | Common Misconfiguration |
|---|---|---|
max degree of parallelism | Controls CPU parallelism per query | 0 (unlimited) on NUMA systems causes CXPACKET storms |
cost threshold for parallelism | Minimum plan cost before parallel plan is used | Default value of 5 is too low for most OLTP workloads |
max server memory (MB) | Caps SQL Server buffer pool | Default 2,147,483,647 starves OS and other services |
tempdb file count | Reduces allocation page contention | Single file on multi-core servers causes GAM/SGAM waits |
optimize for ad hoc workloads | Reduces plan cache bloat from single-use plans | Disabled by default; should be ON for most OLTP systems |
backup compression default | Reduces backup I/O and storage | Disabled by default on Standard Edition |
msdb.dbo.backupset and suspect_pages across all monitored instances. Alerts when a database has not been backed up within its defined SLA window, or when a backup fails.01:00–05:00). Select days of week. During this window: no new alerts fire, anomaly scores are frozen, and performance metrics are excluded from baseline calculations.| Report | Cadence | Sections Included |
|---|---|---|
| Daily Health | Daily at chosen time | Alert count, top 3 performance issues, backup status, job failures, scorecard delta |
| Weekly Executive | Weekly (Mon 07:00 default) | Trend charts, scorecard history, top query changes, capacity forecast, action items |
| Monthly Trend | 1st of month | 30-day performance summary, index health evolution, security audit delta, growth projections |
| Security Audit | On-demand or scheduled | Full compliance scorecard, shadow identity list, permission map, TDE status, remediation checklist |
| Index Rebuild Plan | Weekly or on-demand | Fragmentation scan, rebuild/reorganise recommendations, estimated duration, maintenance window fit |
dpsmf.ai/legal/dpa.sysadmin or any write permission. Create a dedicated low-privilege monitoring account.| Platform | Minimum Version | Notes |
|---|---|---|
| SQL Server | 2016 (13.x) | Full DMV support from 2016 onwards |
| Azure SQL Database | Any | Some DMVs restricted; DPSMF adapts automatically |
| Azure SQL Managed Instance | Any | Full support |
| Amazon RDS for SQL Server | SQL Server 2016+ | Windows Auth not available on RDS |
Sentinel is the primary monitoring view in DPSMF. It shows a live feed of all connected instances, current health scores, active alerts, and real-time metric sparklines — all updated every 10 seconds.
The SN1 live feed delivers a sub-second stream of metric events, alert state changes, and anomaly detections via server-sent events. It powers the real-time indicators on Sentinel tiles.
Individual Sentinel features (anomaly overlay, AG health panel, capacity widgets) can be enabled or disabled per-user by an admin under Settings → Sentinel Features. This allows feature rollout to be controlled without a code deployment.
| Category | Metrics | Type |
|---|---|---|
| CPU & Schedulers | SQL CPU %, OS CPU %, scheduler queue length, context switches | CPU |
| Memory | Buffer pool pages, page life expectancy, memory grants pending, stolen server memory | MEM |
| Wait Statistics | All 900+ wait types tracked; top 20 surfaced with baseline deviation | WAIT |
| I/O | Read/write latency per file, stall rate, pending I/O count | I/O |
| Blocking & Deadlocks | Active blocking chains, deadlock events, lock wait time, lead blocker query | LOCK |
| Query Performance | Top queries by CPU, duration, reads, writes, executions; plan regressions | QUERY |
| Index Health | Fragmentation %, missing index recommendations, unused indexes, duplicate indexes | INDEX |
| Tempdb | Version store size, allocation contention, active tasks per session | MEM |
| Connections | Active connections, blocked sessions, sleeping sessions, orphaned transactions | CONN |
| Data Type | Default Interval | Configurable |
|---|---|---|
| DMV snapshot (waits, CPU, blocking) | 10 seconds | 10s – 60s |
| Query stats (sys.dm_exec_query_stats) | 60 seconds | 30s – 300s |
| Index fragmentation scan | Daily (2 AM) | Schedule via Settings |
| Anomaly model refresh | Every 6 hours | Not configurable |
| Baseline recalculation | Daily (midnight) | Not configurable |
DPSMF collects instance-level metrics by default. Per-database monitoring extends this to track transaction log usage, index health, active query load, and alert thresholds individually per database — so a single busy reporting database does not mask the health of other databases on the same instance.
The AG Monitor tracks all Availability Groups on connected instances. It polls sys.dm_hadr_availability_group_states, sys.dm_hadr_database_replica_states, and sys.dm_hadr_availability_replica_states every 10 seconds.
Each alert opens a detail view showing the metric timeline, the anomaly score at the time of firing, correlated events within ±30 minutes, and AI-generated root cause analysis with a specific recommended action.
DPSMF enforces license-level policy limits. Monitor tier allows up to 10 active policies per instance. Intelligence and Enterprise tiers are unlimited. Policy checks are evaluated before every alert fires — a suppressed policy will not alert even if the condition is met.
A runbook is a structured set of steps that appears automatically when a specific alert type fires. Instead of DBA tribal knowledge living in someone's head, runbooks encode the correct response procedure and make it instantly available to any engineer on call.
DPSMF’s anomaly engine builds a rolling baseline for every tracked metric using a 7-day sliding window. It computes the mean and standard deviation for each metric at each time-of-day and day-of-week slot, then scores incoming values against the expected distribution.
| Score Range | Meaning | Action |
|---|---|---|
| 0 – 1σ | Normal variation | No alert |
| 1σ – 2σ | Elevated — watch | Logged only |
| 2σ – 3σ | Significant deviation | Info alert (Aggressive mode) |
| > 3σ | Strong anomaly | Warning or Critical alert |
| Mode | Alert Threshold | Best For |
|---|---|---|
| Conservative | 5σ | High-noise environments; batch/ETL servers |
| Default | 3σ | Most OLTP and mixed-workload environments |
| Aggressive | 2σ | Critical OLTP servers where early warning is paramount |
When an anomaly fires, DPSMF searches the previous 30 minutes for other anomalies on the same instance. Correlated events are ranked by temporal proximity and knowledge graph relationship strength, then presented in the alert detail view as likely contributing causes.
The knowledge graph is DPSMF’s long-term memory. Every query hash, wait type, metric event, and alert is stored as a node. Observed relationships between them — temporal co-occurrence, causal linkage, query-to-wait attribution — are stored as edges. A taxonomy layer classifies nodes into categories (workload, resource, maintenance, replication, etc.).
Open Knowledge Graph → Explorer (/kg-explorer.html) to browse nodes and edges interactively. Search by query hash, wait type name, metric name, or alert ID. Click any node to see its connected edges and the evidence strength for each relationship.
DPSMF fits a linear trend model to each capacity metric (CPU, buffer pool, data file size, log file size, connection count) using the last 30 days of collected data. It extrapolates forward and identifies the projected date when each metric will breach its defined capacity threshold.
| Dimension | Weight | What It Measures |
|---|---|---|
| Query Health | 25% | Top query baseline deviation, plan regression count, missing index severity |
| Wait Profile | 20% | Dominant wait types vs. baseline, unusual wait pattern frequency |
| Blocking | 15% | Active blocking chain count, max block duration, deadlock rate |
| I/O Performance | 15% | Read/write latency vs. baseline, stall rate |
| Memory Pressure | 10% | Page life expectancy trend, memory grants pending, buffer pool pressure |
| CPU Utilisation | 5% | Average and peak CPU vs. baseline, scheduler queue |
| Index Health | 5% | Fragmentation level, unused index count, missing index impact score |
| Connectivity | 5% | Connection count trend, orphaned transactions, sleeping session count |
1433.| Setting | Default | Description |
|---|---|---|
| Poll Interval | 10s | How often DMVs are polled. Lower = more granular but higher overhead. |
| Alert Sensitivity | Default (3σ) | Anomaly threshold. Conservative / Default / Aggressive. |
| Baseline Window | 7 days | Rolling window used to compute the statistical baseline. |
| Maintenance Window | None | Time range when alerts are suppressed (e.g., nightly index rebuilds). |
| Tags | — | Free-text tags for grouping instances (e.g., Production, Staging, Region). |
| Role | Capabilities |
|---|---|
| dba | Full access: acknowledge/resolve alerts, manage instances, runbooks, policies, settings |
| viewer | Read-only: view dashboards, alerts, metrics, and scorecard — no write actions |
| readonly | Demo-level: GET requests only — suitable for stakeholder dashboards |
Go to Settings → Users, find the user, and click Reset Password. Enter a new temporary password (min 8 characters). The user will be required to change it on next login.
| Tier | Instances | Anomaly | AG Monitor | Capacity | Policies |
|---|---|---|---|---|---|
| Monitor | 3 | ✓ | ✗ | ✗ | 10/instance |
| Intelligence | 10 | ✓ | ✓ | ✓ | Unlimited |
| Enterprise | Unlimited | ✓ | ✓ | ✓ | Unlimited |
Go to Settings → License → Activate and enter your license key. Keys take the format DPSMF-XXXXXX-XXXXXX. After activation, the license manager shows the tier, expiry date, instance count, and days remaining.
| Section | Key Settings |
|---|---|
| General | Default alert sensitivity, timezone, date format, dashboard refresh rate |
| Instances | Add / edit / remove monitored instances; per-instance poll interval and maintenance windows |
| Notifications | Alert email recipients, escalation contacts (L1/L2), email server configuration |
| Policies | Create and manage alert policies; set suppression windows |
| Runbooks | Create and manage remediation runbooks attached to alert types |
| Users | Add users, assign roles, reset passwords |
| License | View current license, activate new keys, check usage vs. limits |
| Reports | Configure scheduled daily/weekly email reports |
| Sentinel Features | Enable/disable individual dashboard feature panels per user |