Process: Missing Domain Controller Heartbeat
Heartbeats are logs that servers send to the Sentinel SIEM log ingestion. The Missing Domain Controller Heartbeat alert fires when a DC stops sending heartbeats to Sentinel. This can happen for a number of reasons, as explained in the guidance article: https://securityservices.mariecurie.org.uk/a/solutions/articles/52000098273?language=en
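As an illustration of the condition the alert describes, the query below lists computers whose most recent heartbeat is older than a threshold. It is only a sketch and not the actual Sentinel analytics rule: the 15-minute threshold, the 1-day lookback, and the "-DC-" hostname filter (inferred from hostnames such as SWI-DC-01) are all assumptions.

```kql
// Sketch only: not the real alert rule. Threshold, lookback, and name filter are assumed.
Heartbeat
| where TimeGenerated > ago(1d)                             // assumed lookback window
| where Computer contains "-DC-"                            // assumed DC naming pattern
| summarize LastHeartbeat = max(TimeGenerated) by Computer  // latest heartbeat per computer
| where LastHeartbeat < ago(15m)                            // assumed "missing" threshold
| sort by LastHeartbeat asc
```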
Process:
1. Open the Alert URL from the incident ticket's description.
2. Click on the Sentinel Incident. Take note of the Entities shown, as these are the DCs affected by the alert. Then click on the System Alert ID.
3. Click the expand button on the alert result returned by the KQL query. Locate StartTime, which gives the exact time of the alert in UTC (Zulu). Take note of this.
(PLEASE NOTE: DURING BST YOU NEED TO ADD AN HOUR TO CONVERT FROM UTC TO BST)
4. Check whether heartbeats have resumed for that DC. In Sentinel on the 365 tenant, go to Log Analytics and run the following KQL query (replace SWI-DC-01 with the affected DC):
Heartbeat
| where Computer contains "SWI-DC-01"
| sort by TimeGenerated desc
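5. Check the results to see whether we are receiving logs from the DC. The query sorts newest first, so the top row is the most recent heartbeat. Logs should be ingested once every minute.
6. Edit the KQL query, reducing the DC hostname to just the location code so that the query returns every device at that site.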
Example:
Heartbeat
| where Computer contains "SWI"
| sort by TimeGenerated desc
7. Check the results and ensure that both the HYPERV and the SERV logs are being ingested.
8. If logs are not being ingested from the DC or from the other devices, the HYPERV is most likely the one having issues and needs to be rebooted from the iLO. This is currently done by ITSD/ITTS.
9. If the DC logs from step 5 are back up, the incident can be closed and root cause analysis performed. Please use Solution Article https://securityservices.mariecurie.org.uk/a/solutions/articles/52000098273?language=en for guidelines on root cause analysis.
10. If the DC logs are not back up, ping the DC from CMD to check whether the DC is online.
11. If the device is online, try to RDP to the server:
i. Go to Azure -> Virtual Machines -> Jumpbox-prd01
ii. Using Bastion, remote onto Jumpbox-prd01 with your admin credentials: https://portal.azure.com/#@mariecurieazure.onmicrosoft.com/resource/subscriptions/df36146a-4192-40db-9b12-c4084283847b/resourceGroups/RG-IT-PRD/providers/Microsoft.Compute/virtualMachines/jumpbox-prd01/bastionHost
iii. Open the Run dialog on Jumpbox-prd01 (WIN+R)
iv. Run mstsc
v. Enter the DC name, connect, and enter your admin credentials.
12. If you are unable to RDP to the server, the DC should be restarted. Follow Solution Article: https://securityservices.mariecurie.org.uk/a/solutions/articles/52000099199?language=en
13. After it has restarted successfully, wait 10 minutes and then check for heartbeats again (step 4). If heartbeats have not resumed, repeat step 11 and continue.
14. If you are able to RDP to the server, open Run and then run “services.msc” to open Services.
15. Locate the “Microsoft Monitoring Agent Service” and the “Microsoft Monitoring Agent Azure VM Extension Heartbeat Service”, then right-click each one and restart it. Wait 10 minutes and then check for heartbeats again (step 4). If heartbeats have not resumed, continue.
16. As a last resort, remove and reinstall the Sentinel Workspace on the Microsoft Monitoring Agent on the DC itself. Follow Solution Article: https://securityservices.mariecurie.org.uk/a/solutions/articles/52000099200?language=en
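The heartbeat checks in steps 4 to 7 can also be made in one pass. The sketch below summarises the most recent heartbeat for every device at a site, how many minutes ago it arrived, and the same time in UK local time, so it can be compared with the StartTime from step 3 without converting to BST by hand. The "SWI" location code is taken from the example in step 6. The 1-day lookback and the column names LastHeartbeat, MinutesSince, and LastHeartbeatUK are assumptions chosen for illustration, and the query assumes the datetime_utc_to_local function is available in your workspace.

```kql
// Sketch only: lookback window and column names are illustrative assumptions.
Heartbeat
| where TimeGenerated > ago(1d)                                   // assumed lookback window
| where Computer contains "SWI"                                   // replace with the affected location code
| summarize LastHeartbeat = max(TimeGenerated) by Computer        // latest heartbeat per device (UTC)
| extend MinutesSince = datetime_diff('minute', now(), LastHeartbeat)
| extend LastHeartbeatUK = datetime_utc_to_local(LastHeartbeat, 'Europe/London')
| sort by MinutesSince desc                                       // quietest devices first
```

A device that stopped reporting within the lookback window appears at the top with a large MinutesSince. A device with no heartbeat at all in that window does not appear in the results, so compare the list against the devices you expect at that site (the DC, the HYPERV, and the SERV).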