SCOM 2007 R2: Event ID 1220 on Domain controllers


I have been noticing two event IDs (1220 and/or 7022) in the Operations Manager event log on my domain controllers. It turned out to be an easy one to fix. I followed Method 2 in this MSKB article: The Health Service does not process configuration files, and events 7022 and 1220 are logged every 30 minutes on a domain controller on which you installed the Operations Manager 2007 agent

Even though we specify a domain admin as the action account, the SCOM agent uses the Local System account to collect privileged information. I preferred to run the HSLockdown tool to allow the Local System account on domain controllers.

The only downside is that you have to run this on every domain controller in your enterprise.

Method 2: Run HSLockdown.exe to configure permissions

Run HSLockdown.exe on the affected domain controllers to remove NT Authority\SYSTEM from the Denied list. To do this, follow these steps:

  1. On the domain controller, open a command prompt, and then open the folder where the agent software is installed. By default, the agent is installed in the following folder:

    C:\Program Files\System Center Operations Manager 2007

  2. Type the following command, and then press ENTER:

    hslockdown "Management_Group_Name" /R "NT AUTHORITY\SYSTEM"

    In this command, Management_Group_Name is the name of the Operations Manager 2007 management group of which the agent is a member. Use quotation marks if the name contains spaces.

  3. Restart the OpsMgr Health Service.
  4. Repeat step 1 through step 3 on each domain controller that is affected.
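
If you would rather not type the steps by hand on each domain controller, they can be wrapped in a small script. This is only a rough Python sketch, assuming Python is available on the domain controller and the agent sits in the default folder; the function names are made up for this example, and the service is restarted with plain "net stop" / "net start".

```python
import subprocess

# Default agent folder from step 1; change it if your agent lives elsewhere.
AGENT_DIR = r"C:\Program Files\System Center Operations Manager 2007"


def build_commands(management_group, account=r"NT AUTHORITY\SYSTEM"):
    """Return steps 2 and 3 as argument lists (nothing is executed here)."""
    return [
        # Step 2: remove the account from the Denied list.
        [AGENT_DIR + r"\HSLockdown.exe", management_group, "/R", account],
        # Step 3: restart the OpsMgr Health Service.
        ["net", "stop", "HealthService"],
        ["net", "start", "HealthService"],
    ]


def apply_fix(management_group):
    """Run the commands in order on the local machine; stop at the first failure."""
    for command in build_commands(management_group):
        subprocess.run(command, check=True)
```

Calling apply_fix("Your Management Group") from an elevated prompt would run all three commands. Because the arguments are passed as a list, a management group name with spaces needs no extra quoting.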

SCOM 2007 R2: Disconnected Agents (Event Id 21034)


One day I noticed that some of my servers (agents) were greyed out in the Computers node in the Monitoring section. You know what’s strange? Those greyed-out computers were not even showing up in the Agent Managed node in the Administration section.

On all of these greyed-out servers, I saw event ID 21034.

Event Type:        Warning
Event Source:        OpsMgr Connector
Event Category:        None
Event ID:        21034
Date:                8/12/2011
Time:                10:03:10 AM
User:                N/A
Computer:        SERVER-NAME
Description:
The Management Group Watch-Men has no configured parents and most monitoring tasks cannot be performed. This can happen if a management group in Active Directory does not have any server SCPs or if the agent does not have access to any server SCPs.

I usually try two things when a SCOM agent is not talking to the SCOM management server.

1. Restart the System Center Management (HealthService) service on the affected server, and watch the Operations Manager event log.

2. If step 1 fails, I do the following:

a. Stop the System Center Management (HealthService) service.

b. Open Explorer window and go to “C:\Program Files\System Center Operations Manager 2007”. Rename (or Delete) the folder named “Health Service State”.

c. Start the System Center Management Service (HealthService).
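
Steps a to c can also be scripted for a single server. Here is a minimal Python sketch, assuming it runs locally on the affected server from an elevated prompt and that the agent is in the default folder; the function names and the timestamp suffix on the renamed folder are my own choices for illustration.

```python
import ntpath
import os
import subprocess
import time

AGENT_DIR = r"C:\Program Files\System Center Operations Manager 2007"
STATE_FOLDER = "Health Service State"


def backup_name(stamp):
    """Path the state folder is renamed to, with a timestamp suffix."""
    return ntpath.join(AGENT_DIR, STATE_FOLDER + "." + stamp)


def reset_agent_cache():
    """Stop the agent, rename its state folder, and start the agent again."""
    subprocess.run(["net", "stop", "HealthService"], check=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    os.rename(ntpath.join(AGENT_DIR, STATE_FOLDER), backup_name(stamp))
    subprocess.run(["net", "start", "HealthService"], check=True)
```

Renaming instead of deleting keeps the old folder around in case you want to look at it later; the agent builds a fresh "Health Service State" folder when it starts.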

Well, well, well. That didn’t work. I called Microsoft support and got help on how to reconnect the disconnected agents.

On the SQL server that hosts the Operations Manager database, open SQL Server Management Studio. Browse to the OperationsManager database and open a new query window for it. If you need help, ask your DBA.

SCRIPT A: Execute the following query to list all disconnected agents in the database.

declare @DiscoverySourceId uniqueidentifier;
set @DiscoverySourceId = dbo.fn_DiscoverySourceId_User();
SELECT TME.[TypedManagedEntityid], HS.PrincipalName
FROM MTV_HealthService HS
INNER JOIN dbo.[BaseManagedEntity] BHS with(nolock)
ON BHS.[BaseManagedEntityId] = HS.[BaseManagedEntityId]
-- get host managed computer instances
INNER JOIN dbo.[TypedManagedEntity] TME with(nolock)
ON TME.[BaseManagedEntityId] = BHS.[TopLevelHostEntityId]
AND TME.[IsDeleted] = 0
INNER JOIN dbo.[DerivedManagedTypes] DMT with(nolock)
ON DMT.[DerivedTypeId] = TME.[ManagedTypeId]
INNER JOIN dbo.[ManagedType] BT with(nolock)
ON DMT.[BaseTypeId] = BT.[ManagedTypeId]
AND BT.[TypeName] = N'Microsoft.Windows.Computer'
-- only with missing primary
LEFT OUTER JOIN dbo.Relationship HSC with(nolock)
ON HSC.[SourceEntityId] = HS.[BaseManagedEntityId]
AND HSC.[RelationshipTypeId] = dbo.fn_RelationshipTypeId_HealthServiceCommunication()
AND HSC.[IsDeleted] = 0
INNER JOIN DiscoverySourceToTypedManagedEntity DSTME with(nolock)
ON DSTME.[TypedManagedEntityId] = TME.[TypedManagedEntityId]
AND DSTME.[DiscoverySourceId] = @DiscoverySourceId
WHERE HS.[IsAgent] = 1
AND HSC.[RelationshipId] IS NULL

If you get any results, copy and paste the list of disconnected agents into Notepad so you have a record of them.
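
If the list is long, the pasted results can be tidied up with a few lines of Python. This is just a sketch, assuming each pasted row is a GUID followed by the PrincipalName, separated by whitespace (which is how a copy from the results grid usually comes out); the function names and the sample host name in the usage note are made up.

```python
import uuid


def parse_script_a_output(pasted):
    """Turn pasted 'GUID  PrincipalName' rows into (UUID, name) pairs."""
    agents = []
    for line in pasted.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # blank or unexpected line
        try:
            entity_id = uuid.UUID(parts[0])
        except ValueError:
            continue  # header row or anything else that is not a GUID
        agents.append((entity_id, parts[1]))
    return agents


def short_names(agents):
    """Host name without the domain suffix, one per agent."""
    return [name.split(".")[0] for _, name in agents]
```

For a row such as "3B2F8221-9F7B-5FFD-B80D-DEEAFFB6E342 server1.example.com", parse_script_a_output keeps the GUID for the database scripts, and short_names gives you "server1", which is handy for any server list you build later.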

Now we need to delete all the disconnected agents. Make a backup of the Operations Manager database first.

Execute this script to delete all disconnected agents. Note: you are on your own. I am NOT responsible for your actions.

declare @TypedManagedEntityId uniqueidentifier;
declare @DiscoverySourceId uniqueidentifier;
declare @LastErr int;
declare @TimeGenerated datetime;

set @TimeGenerated = GETUTCDATE();
set @DiscoverySourceId = dbo.fn_DiscoverySourceId_User();

DECLARE EntitiesToBeRemovedCursor CURSOR LOCAL FORWARD_ONLY READ_ONLY FOR
SELECT TME.[TypedManagedEntityid]
FROM MTV_HealthService HS
INNER JOIN dbo.[BaseManagedEntity] BHS
ON BHS.[BaseManagedEntityId] = HS.[BaseManagedEntityId]
-- get host managed computer instances
INNER JOIN dbo.[TypedManagedEntity] TME
ON TME.[BaseManagedEntityId] = BHS.[TopLevelHostEntityId]
AND TME.[IsDeleted] = 0
INNER JOIN dbo.[DerivedManagedTypes] DMT
ON DMT.[DerivedTypeId] = TME.[ManagedTypeId]
INNER JOIN dbo.[ManagedType] BT
ON DMT.[BaseTypeId] = BT.[ManagedTypeId]
AND BT.[TypeName] = N'Microsoft.Windows.Computer'
-- only with missing primary
LEFT OUTER JOIN dbo.Relationship HSC
ON HSC.[SourceEntityId] = HS.[BaseManagedEntityId]
AND HSC.[RelationshipTypeId] = dbo.fn_RelationshipTypeId_HealthServiceCommunication()
AND HSC.[IsDeleted] = 0
INNER JOIN DiscoverySourceToTypedManagedEntity DSTME
ON DSTME.[TypedManagedEntityId] = TME.[TypedManagedEntityId]
AND DSTME.[DiscoverySourceId] = @DiscoverySourceId
WHERE HS.[IsAgent] = 1
AND HSC.[RelationshipId] IS NULL;

OPEN EntitiesToBeRemovedCursor

FETCH NEXT FROM EntitiesToBeRemovedCursor INTO @TypedManagedEntityId

WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRAN

-- Delete entity
EXEC @LastErr = [p_RemoveEntityFromDiscoverySourceScope] @TypedManagedEntityId, @DiscoverySourceId, @TimeGenerated;
IF @LastErr <> 0 GOTO Err

COMMIT TRAN

-- Get the next typedmanagedentity to delete.
FETCH NEXT FROM EntitiesToBeRemovedCursor INTO @TypedManagedEntityId
END

CLOSE EntitiesToBeRemovedCursor
DEALLOCATE EntitiesToBeRemovedCursor

GOTO Done

Err:
ROLLBACK TRAN
GOTO Done

Done:

Execute SCRIPT A again to see whether any disconnected agents are still listed. Hopefully not. If some are, you need to execute the following script. Replace the value of @EntityId with the TypedManagedEntityid from the SCRIPT A results, and run the script once for each disconnected server with its corresponding ID.

DECLARE @EntityId uniqueidentifier;
DECLARE @TimeGenerated datetime;

-- replace this GUID with the ID of the invalid entity
SET @EntityId = '3B2F8221-9F7B-5FFD-B80D-DEEAFFB6E342';
SET @TimeGenerated = getutcdate();

BEGIN TRANSACTION
EXEC dbo.p_TypedManagedEntityDelete @EntityId, @TimeGenerated;
COMMIT TRANSACTION

Execute SCRIPT A again to check that the server is no longer listed as disconnected.
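
If several agents are left over, editing the GUID by hand each time gets tedious and is easy to get wrong. A small Python sketch like the one below can print the per-entity script for each ID so you can paste it into the query window. The template is the same script as above; the function name is made up, and the GUID check is an extra safety step that was not part of the original procedure.

```python
import uuid

# Same statements as the per-entity script, with a placeholder for the GUID.
TEMPLATE = """DECLARE @EntityId uniqueidentifier;
DECLARE @TimeGenerated datetime;
SET @EntityId = '{entity_id}';
SET @TimeGenerated = getutcdate();
BEGIN TRANSACTION
EXEC dbo.p_TypedManagedEntityDelete @EntityId, @TimeGenerated;
COMMIT TRANSACTION
"""


def render_delete_script(entity_id):
    """Return the script text for one entity; raises ValueError if it is not a GUID."""
    guid = uuid.UUID(str(entity_id))
    return TEMPLATE.format(entity_id=str(guid).upper())
```

The sketch only produces text. You still review it and run it yourself in SQL Server Management Studio, with the same database backup warning as before.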

Check the SCOM console to confirm that these servers have disappeared from the Computers node in the Monitoring section.

After fixing things in the database, you have to do the following on every originally disconnected server:

a. Stop the System Center Management (HealthService) service.

b. Open Explorer window and go to “C:\Program Files\System Center Operations Manager 2007”. Rename (or Delete) the folder named “Health Service State”.

c. Start the System Center Management Service (HealthService).

I made a little VBScript to do the above on a whole list of servers. Copy and paste the following script into Notepad and save it as "FixSCOMAgent.vbs". Create a new text file called Servers.txt in the same folder where you saved the VBScript. Type the disconnected server names in Servers.txt, each server name on its own line, e.g.,

servername1
servername2
servername3

 

' #######              #####   #####  ####### #     #
' #       # #    #    #     # #     # #     # ##   ##
' #       #  #  #     #       #       #     # # # # #
' #####   #   ##       #####  #       #     # #  #  #
' #       #   ##            # #       #     # #     #
' #       #  #  #     #     # #     # #     # #     #
' #       # #    #     #####   #####  ####### #     #
'

'    #
'   # #    ####  ###### #    # #####
'  #   #  #    # #      ##   #   #
' #     # #      #####  # #  #   #
' ####### #  ### #      #  # #   #
' #     # #    # #      #   ##   #
' #     #  ####  ###### #    #   #

' Script Name: FixSCOMAgent.vbs
' Description: This script will stop the SCOM agent service, delete the
' Health Service State folder and start the SCOM agent service on
' all servers listed in servers.txt.

' Written by Anand Venkatachalpathy

On Error Resume Next

'how long (in milliseconds) to wait after initiating the service stop
intSleep = 18000

'agent health service state folder to be deleted
HSFolder = "\c$\Program Files\System Center Operations Manager 2007\Health Service State"

' creating a File system object
Set objFSO = CreateObject("Scripting.FileSystemObject")

' Open Servers.txt file, should be located in the same folder as this script
Set f = objFSO.OpenTextFile("servers.txt", 1)

' Read the file one line at a time
Do While f.AtEndOfStream <> True
  strComputer = f.ReadLine
  WScript.Echo "Fixing " & strComputer & " ..."

  'call the sub routine to fix the server
  RestartHealthService strComputer
Loop

' Close the server list
f.Close

'-*-*-*-*-* End of Script -*-*-*-*-*-*

'Sub Routine: RestartHealthService
'Parameter: server name
'Description: This sub routine stops the SCOM agent service,
'deletes the Health Service State folder and starts the
'agent service.

Sub RestartHealthService(strComputer)

  'Service Name (straight single quotes are required by the WMI query)
  strService = " 'HealthService' "

  'Get WMI object on the given server
  Set objWMIService = GetObject("winmgmts:" _
  & "{impersonationLevel=impersonate}!\\" _
  & strComputer & "\root\cimv2")

  'Get the services WMI object
  Set colListOfServices = objWMIService.ExecQuery _
  ("Select * from Win32_Service Where Name =" _
  & strService & " ")

  'Folder to delete
  strSource = "\\" & strComputer & HSFolder
  'Get the folder object to delete
  Set fTarget = objFSO.GetFolder(strSource)

  For Each objService in colListOfServices
    WScript.Echo vbTab & "Stopping SCOM Agent Service"
    objService.StopService()
    WScript.Sleep intSleep

    WScript.Echo vbTab & "Deleting the folder: " & strSource
    fTarget.Delete

    WScript.Echo vbTab & "Starting SCOM Agent Service"
    objService.StartService()
  Next
End Sub
'-*-*-*- End of Sub Routine -*-*-*-*-*-

 

Now open a command prompt as administrator, go to the location where you saved the script, and run it (CScript FixSCOMAgent.vbs).

After you run the script, watch the SCOM console. The servers will start showing up correctly; sometimes it takes about 15 minutes. Just in case, check each fixed server’s event log for any errors.

Whew! Hope this blog helped you.

SCOM: “Critical hotfixes required for reliable operation of the Exchange Server 2010” alerts on wrong servers


Whenever I install the SCOM agent on a new server, I get this alert.

Alert description: Critical hotfixes required for reliable operation of the Exchange Server 2010 and other management packs are not installed on this server. Please see the appropriate KB article for more information, and to download the required hotfix

Why is SCOM barking up the wrong tree? None of them are Exchange servers.

After staring at the alert for a minute and checking the SCOM monitor properties, I found that this monitor targets Health Service, which means all monitored servers. A-ha! A bug in the Exchange 2010 management pack!

Now, how to fix it? Disable this monitor for all servers and target the alert only at Exchange servers. Follow the steps below to do that.

  1. Open the SCOM console and go to the Authoring section.
  2. Select the Monitors node and set the scope to “Health Service”.
  3. Search for “required SCOM hotfixes”.
  4. Right-click the “The required SCOM hotfixes for Exchange MP are not installed” monitor and select Overrides -> Override this Monitor -> For all objects of class: Health Service.
  5. Check the box for Enabled and select False in the Override Value column. Optional: select a custom management pack of your own at the bottom. Click OK to disable the monitor.
  6. Right-click the same monitor again and select Overrides -> Override this Monitor -> For a Group...
  7. Type Exchange in the search box, select “Microsoft Exchange 2010 All Servers Computers”, and click OK.
  8. The monitor may already be enabled, since we selected only the Exchange servers. You may also change the alert priority or severity. Click OK.

That’s all. No more false alarms. Enjoy.

SCOM 2007: Scheduled reports have no graphs or pretty pictures


I am running SCOM 2007 R2 with CU4 and the ‘frigging’ reports don’t show pretty pictures. Well, they do show in ad-hoc reports, but not in scheduled reports. After bitching for a minute or two, I found the solution in a blog post from the Taiwan CSS Platform Team.

The solution is simple enough. Follow the instructions below to fix it.

  • Where is your SQL Reporting Services running? Remote Desktop to that server.
  • Open Windows Explorer and go to this location:

<SQL Reporting Services install location>\Reporting Services\ReportServer\bin.

e.g., C:\Program Files\SQL Server\SCOMReporting\Reporting Services\ReportServer\bin.

  • Back up the ReportingServicesService.exe.config file to a safe location. (I would make a copy of the file in the same location.)
  • Open the ReportingServicesService.exe.config file in Notepad.
  • Find <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
  • Paste the following <dependentAssembly> block directly below the above line.

<dependentAssembly>

<assemblyIdentity name="Microsoft.ReportingServices.ProcessingCore" publicKeyToken="89845dcd8080cc91" culture="neutral" />

<bindingRedirect oldVersion="9.0.242.0" newVersion="10.0.0.0" />

</dependentAssembly>

  • Save the file and close Notepad.
  • Restart the corresponding SQL Reporting Services service.
  • That’s all. Generate a scheduled report and do a silly dance.
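
If you have to make this change on more than one reporting server, the Notepad edit can be scripted. Below is a rough Python sketch using the standard xml.dom.minidom module, assuming the config file is well-formed XML and contains an <assemblyBinding> element; the function name is made up. It works on the file contents as a string, so reading the file, backing it up and writing it back are left to you.

```python
from xml.dom import minidom

ASSEMBLY = "Microsoft.ReportingServices.ProcessingCore"


def add_binding_redirect(config_xml):
    """Return config_xml with the ProcessingCore redirect added exactly once."""
    doc = minidom.parseString(config_xml)
    bindings = doc.getElementsByTagName("assemblyBinding")
    if not bindings:
        raise ValueError("no <assemblyBinding> element found")
    binding = bindings[0]

    # Leave the text untouched if the redirect is already present.
    for identity in binding.getElementsByTagName("assemblyIdentity"):
        if identity.getAttribute("name") == ASSEMBLY:
            return config_xml

    identity = doc.createElement("assemblyIdentity")
    identity.setAttribute("name", ASSEMBLY)
    identity.setAttribute("publicKeyToken", "89845dcd8080cc91")
    identity.setAttribute("culture", "neutral")

    redirect = doc.createElement("bindingRedirect")
    redirect.setAttribute("oldVersion", "9.0.242.0")
    redirect.setAttribute("newVersion", "10.0.0.0")

    dependent = doc.createElement("dependentAssembly")
    dependent.appendChild(identity)
    dependent.appendChild(redirect)

    # Insert as the first child, i.e. directly below the opening tag.
    binding.insertBefore(dependent, binding.firstChild)
    return doc.toxml()
```

The check at the top makes the function safe to run twice. minidom may change the file's formatting slightly when it writes the XML back out, which is one more reason to keep the backup copy.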

SCOM Web Console: “Server Error in ‘/’ Application” Runtime Error


This error occurs when:

  1. You installed the full SCOM management console first
  2. You then installed the Web Console at a later date
  3. Now you get this error when you visit the SCOM web console

The issue is that SCOM setup doesn’t correctly install the following four DLL files to the web console install location. The solution is really easy.

Find the following DLL files in %ProgramFiles%\System Center Operations Manager 2007.

  • Corgent.Diagramming.CommandResources.dll
  • Corgent.Diagramming.CustomElements.dll
  • Microsoft.ReportViewer.Common.dll
  • Microsoft.ReportViewer.Webforms.dll

Copy the above DLLs to %ProgramFiles%\System Center Operations Manager 2007\Web Console\Bin.

I would then run the IISReset command and try accessing the web console again.
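
The copy itself is easy to script if you want to repeat it on another server. A minimal Python sketch follows, assuming the four DLLs sit directly in the install folder as described above; the function name is made up, and the install folder is passed in as a parameter rather than hard-coded.

```python
import os
import shutil

DLLS = [
    "Corgent.Diagramming.CommandResources.dll",
    "Corgent.Diagramming.CustomElements.dll",
    "Microsoft.ReportViewer.Common.dll",
    "Microsoft.ReportViewer.Webforms.dll",
]


def copy_web_console_dlls(install_dir):
    """Copy the four DLLs from install_dir into its Web Console\\Bin folder."""
    bin_dir = os.path.join(install_dir, "Web Console", "Bin")
    copied = []
    for name in DLLS:
        source = os.path.join(install_dir, name)
        if not os.path.isfile(source):
            raise FileNotFoundError(source)  # stop rather than copy a partial set
        shutil.copy2(source, os.path.join(bin_dir, name))
        copied.append(name)
    return copied
```

On the web console server you would call it with the folder that %ProgramFiles%\System Center Operations Manager 2007 points to, from an elevated prompt, and then run IISReset as before.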

SCOM 2007 R2 – RMS server not seeing the Agents or Agents are not talking to the server


My SCOM 2007 R2 agents stopped talking to the RMS server one fine day. I don’t remember doing anything on the SCOM servers, such as importing new management packs. The following event IDs were getting on my nerves. Servers showed up as “Not monitored” with a blank circle icon.

Event ID 21016

OpsMgr was unable to set up a communications channel to (RMS FQDN) and there are no failover hosts.  Communication will resume when (RMS FQDN) is both available and allows communication from this computer.

Event ID 20070

The OpsMgr Connector connected to (RMS FQDN), but the connection was closed immediately after authentication occurred.  The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration.  Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.

Event ID 21023

OpsMgr has no configuration for management group (managementgroupname) and is requesting new configuration from the Configuration Service.

Event ID 21042

Operations Manager has discarded 1 items in management group (managementgroupname) , which came from $$ROOT$$.  These items have been discarded because no valid route exists at this time.  This can happen when new devices are added to the topology but the complete topology has not been distributed yet.  The discarded items will be regenerated.

Event ID 29106

The request to synchronize state for OpsMgr Health Service identified by "f9bc56f5-d69b-fb52-0788-792a86aec09d" failed due to the following exception "Microsoft.EnterpriseManagement.Common.DataAccessLayerException: Invalid column name SizeNumeric_486ADDDB_2EB8_819A_FA24_8F6AB3E29543 for query MTV_SelectProperty_5de7b548-657d-7794-52b4-2a828da0cfd1.
   at Microsoft.EnterpriseManagement.Mom.DataAccess.QueryDefinition.GetColumnDefinitionBySourceColumnName(String sourceColumnName, Int32 resultSetIndex)
   at Microsoft.EnterpriseManagement.Mom.DataAccess.QueryDefinition.GetColumnDefinitionBySourceColumnName(String sourceColumnName)
   at Microsoft.Mom.ConfigService.DataAccess.DatabaseAccessor.QueryInstanceProperties(ReadOnlyCollection`1 instances)
   at Microsoft.Mom.ConfigService.Engine.ConfigurationEngine.CommunicationHelper.StateSyncRequestTask.ConfigurationItems.Instances.CollectPublicProperties(ReadOnlyCollection`1 identities, IConfigurationDataAccessor dataAccessor)
   at Microsoft.Mom.ConfigService.Engine.ConfigurationEngine.CommunicationHelper.StateSyncRequestTask.ConfigurationItems.ConfigurationItemCollection`2.CollectPublicProperties(IConfigurationDataAccessor dataAccessor)
   at Microsoft.Mom.ConfigService.Engine.ConfigurationEngine.CommunicationHelper.StateSyncRequestTask.ConfigurationItems..ctor(StateContext stateContext, IConfigurationDataAccessor dataAccessor)
   at Microsoft.Mom.ConfigService.Engine.ConfigurationEngine.CommunicationHelper.StateSyncRequestTask.CreateResponse(Managers managers)
   at Microsoft.Mom.ConfigService.Engine.ConfigurationEngine.Managers.Synchronize(OnDoSynchronizedWork onDoSynchronizedWork)
   at Microsoft.Mom.ConfigService.Engine.ConfigurationEngine.CommunicationHelper.StateSyncRequestTask.Execute(Managers managers)
   at Microsoft.Mom.ConfigService.Engine.ConfigurationEngine.CommunicationHelper.StateSyncRequestTask.Run(Guid source, String cookie, Managers managers, IConfigurationDataAccessor dataAccessor, Stream stream, IConnection connection)".

I searched the internet. It seems everyone had the same event IDs as above for different reasons, and none of their solutions applied to my situation. I have seen solutions like:

– Restore the SQL server database

– Restore the Key for SQL server database

– Some complicated SQL query to find out some incompatible management packs (supposedly given by Microsoft PSS)

– Stop the SCOM Agent service (System Center Management), delete all folders under C:\Program Files\System Center Operations Manager 2007\Health Service State, and start the agent service

– Enable Read permission for “Authenticated Users” on all OUs in Active Directory

– Set the SCOM database to “Unrestricted Growth”

– Check Free Disk space on the SCOM server and Database Server

 

Solution that worked for me: Update the management packs.

1. Open the SCOM console

2. Open the “Administration” section

3. Right-click “Management Packs” and select “Import Management Packs”

4. Click Add and select “Add from Catalog…”

5. In the “View” drop-down box, select “Updates available for installed management packs” and click the “Search” button

6. Click OK to download and apply the updated management packs.

 

And voilà! Suddenly all my agents started talking. I see a lot more alerts than I am supposed to see.