Unable to add a second SCOM 2016 MS: Event ID 1008

Recently I came across a strange situation where I was unable to add a second management server to an existing SCOM management group.

I had taken care of all the prerequisites:

  • The SCOM action and SDK accounts were admins on the second management server as well as on the DB servers
  • Even the account and computer accounts were admins on the RMS as well as on the DB and DW servers
  • Firewall ports were open and traffic was allowed
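If you want to double-check the firewall prerequisite from the new management server, a quick TCP reachability test is enough. The following is only a minimal sketch: the host names are hypothetical placeholders, and the ports (5723 for the management server channel, 1433 for a default SQL Server instance) are the usual defaults that I am assuming here, so adjust them to your environment.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Hypothetical host names and assumed default ports; replace with your own.
CHECKS = [
    ("rms01.example.local", 5723),    # existing management server
    ("sqldb01.example.local", 1433),  # operational DB / DW SQL instance
]


def run_checks(checks):
    """Map each (host, port) pair to True (reachable) or False."""
    return {(host, port): tcp_port_open(host, port) for host, port in checks}
```

Calling `run_checks(CHECKS)` on the server where setup fails returns a dictionary of reachable and unreachable endpoints. A successful connection only shows that the port is open, not that the account permissions are correct.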

Still, the second SCOM MS would not install.

[Screenshot: SCOM secondary MS setup error]

Setup fails at the Data Warehouse configuration stage and rolls everything back.

The only error in the Application event log is:

The Open Procedure for service “MOMConnector” in DLL “D:\Program Files\Microsoft System Center 2016\Operations Manager\Server\MOMConnectorPerformance.dll” failed. Performance data for this service will not be available. The first four bytes (DWORD) of the Data section contains the error code.

Event ID 1008 Source: Perflib
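The event text says that the first four bytes (DWORD) of the Data section contain the error code. As a small illustration of how to read that value, here is a sketch that decodes those four bytes as a little-endian unsigned integer. The sample bytes are hypothetical and are not taken from my event.

```python
import struct


def perflib_error_code(event_data: bytes) -> int:
    """Decode the first four bytes of the event's Data section as a little-endian DWORD."""
    if len(event_data) < 4:
        raise ValueError("need at least four bytes of event data")
    (code,) = struct.unpack_from("<I", event_data, 0)
    return code


# Hypothetical sample: the bytes 05 00 00 00 as shown in the event's binary data view.
sample = bytes.fromhex("05000000")
print(perflib_error_code(sample))  # 5
```

A value of 5, for example, is the Win32 code for "access denied". The code in your own event may be different.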

I researched online a lot, but nothing solved it, until I found another error in the SCOM setup log file.

Below is part of the error, so you can understand the context:

[15:38:43]: Info: :Finished evaluation of rule ‘NewDBForConfigureDataWarehouseForAllServersRules’

[15:38:43]: Info: :Finished evaluation of rule ‘NewDBForConfigureDataWarehouseForAllServersRules’

[15:38:43]: Debug: :Action ConfigureDataWarehouseForAllServers will not be needed.

[15:38:43]: Always: :Done validating action list; now running individual actions.

[15:38:43]: Always: :Current Action: GetCommonProperties

[15:38:43]: Info: :Info:Getting Common Values for Server Postprocessor

[15:38:43]: Info: :GetCommonProperties completed.

[15:38:43]: Always: :Current Action: StartServices

[15:38:43]: Always: :Starting OM Services.

[15:38:43]: Debug: :StartService: attempting to start service OMSDK

[15:38:43]: Debug: :StartService: Able to start the service OMSDK after 0 minutes.

[15:38:43]: Debug: :StartService: attempting to start service healthservice

[15:38:43]: Debug: :StartService: Able to start the service healthservice after 0 minutes.

[15:38:43]: Debug: :StartService: attempting to start service cshost

[15:38:43]: Debug: :StartService: Able to start the service cshost after 0 minutes.

[15:38:43]: Info: :StartServices completed.

[15:38:43]: Always: :Current Action: GetDataReaderWriterAccounts

[15:38:47]: Error: :GetAccountForAProfileFromManagementGroup error: Threw Exception.Type: System.InvalidOperationException, Exception Error Code: 0x80131509, Exception.Message: Sequence contains no matching element

[15:38:47]: Error: :StackTrace:   at System.Linq.Enumerable.First[TSource](IEnumerable`1 source, Func`2 predicate)   at Microsoft.EnterpriseManagement.OperationsManager.Setup.ReportingComponent.GetAccountForAProfileFromManagementGroup(ManagementGroup managementGroup, String profileGuid, Guid managementTypeId, String& userName, String& userDomain)

[15:38:47]: Error: :GetDataReaderWriterAccounts failed with the following exception: : Threw Exception.Type: System.Reflection.TargetInvocationException, Exception Error Code: 0x80131604, Exception.Message: Exception has been thrown by the target of an invocation.

[15:38:47]: Error: :StackTrace:   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)   at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)   at System.Delegate.DynamicInvokeImpl(Object[] args)   at Microsoft.EnterpriseManagement.SetupFramework.ActionEngine.Action.Run(String displayStringNamespace, ProgressData progressData, Func`2 progressDelegate)   at Microsoft.EnterpriseManagement.SetupFramework.ActionEngine.InstallStep.Run(String displayStringNamespace, ProgressData progressData, Func`2 progressDelegate)

[15:38:47]: Error: :Inner Exception.Type: System.InvalidOperationException, Exception Error Code: 0x80131604, Exception.Message: Sequence contains no matching element

[15:38:47]: Error: :InnerException.StackTrace:   at System.Linq.Enumerable.First[TSource](IEnumerable`1 source, Func`2 predicate)   at Microsoft.EnterpriseManagement.OperationsManager.Setup.ReportingComponent.GetAccountForAProfileFromManagementGroup(ManagementGroup managementGroup, String profileGuid, Guid managementTypeId, String& userName, String& userDomain)   at Microsoft.EnterpriseManagement.OperationsManager.Setup.ReportingComponent.GetDWWriterAccountFromManagementGroup(String managementServerName, String& userName, String& userDomain)   at Microsoft.SystemCenter.Essentials.SetupFramework.InstallItemsDelegates.OMDataWarehouseProcessor.GetDataReaderWriterAccounts()

[15:38:47]: Error: :FATAL ACTION: GetDataReaderWriterAccounts

[15:38:47]: Error: :FATAL ACTION: DWInstallActionsPostProcessor

[15:38:47]: Error: :ProcessInstalls: Running the PostProcessDelegate returned false.

[15:38:47]: Always: :SetErrorType: Setting VitalFailure. currentInstallItem: Data Warehouse Configuration

[15:38:47]: Error: :ProcessInstalls: Running the PostProcessDelegate for OMDATAWAREHOUSE failed…. This is a fatal item.  Setting rollback.

[15:38:47]: Info: :SetProgressScreen: FinishMinorStep.

[15:38:47]: Always: :!***** Installing: POSTINSTALL ***

[15:38:47]: Info: :ProcessInstalls: Rollback is set and we are not doing an uninstall so we will stop processing installs

Even after a lot of searching on Bing, I still did not find a solution or any blog post describing something similar.

My usual mantra is to go over the error logs again and again; in most cases they direct you to the hidden cause. In this case, they did for me too.
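Reading a long setup log by eye is tedious, so here is a rough sketch of how the interesting entries could be pulled out with a script. It assumes the entry layout seen in the excerpt above (`[hh:mm:ss]: Level: :message`, sometimes with several entries fused on one line). The function names are my own and are not part of any SCOM tooling. The leading bracket is treated as optional in case it gets lost in copy/paste.

```python
import re
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    time: str
    level: str
    message: str


# Header of one entry, e.g. "[15:38:47]: Error: :"; several entries may share a line.
ENTRY_RE = re.compile(r"\[?(\d{2}:\d{2}:\d{2})\]: (\w+): :")


def parse_setup_log(text: str) -> List[LogEntry]:
    """Split raw log text into entries, even when entries are fused on one line."""
    matches = list(ENTRY_RE.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs from the end of this header to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append(LogEntry(m.group(1), m.group(2), text[m.end():end].strip()))
    return entries


def fatal_actions(entries: List[LogEntry]) -> List[str]:
    """Return the action names reported as 'FATAL ACTION' in Error entries."""
    prefix = "FATAL ACTION:"
    return [e.message[len(prefix):].strip()
            for e in entries
            if e.level == "Error" and e.message.startswith(prefix)]


# Sample text taken from the excerpt above.
sample = (
    "[15:38:43]: Info: :StartServices completed."
    "[15:38:43]: Always: :Current Action: GetDataReaderWriterAccounts\n"
    "[15:38:47]: Error: :FATAL ACTION: GetDataReaderWriterAccounts\n"
    "[15:38:47]: Error: :FATAL ACTION: DWInstallActionsPostProcessor\n"
)
print(fatal_actions(parse_setup_log(sample)))
# ['GetDataReaderWriterAccounts', 'DWInstallActionsPostProcessor']
```

The first fatal action in the list is usually the place to start reading; the Error entries just before it carry the exception message.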

Actual Issue:

When I read the setup log, I understood that SCOM setup was unable to get the Data Warehouse reader and writer account details (the GetDataReaderWriterAccounts action failed), and it needs them to complete the setup.

Now comes the resolution:

Go to SCOM console -> Administration pane -> Run As Profiles and check the accounts associated with the two profiles below:

[Screenshot: SCOM DW Run As account]

In my case, the Run As profile associations for the SCOM DW account were not targeted properly. I added all the necessary targets, as shown in the screenshots below.

[Screenshot: SCOM Run As profile association target 2]

[Screenshot: SCOM DW Run As profile associated target 1]

Once the targets were added, I rebooted the server.

Now, before running the setup again, you will need to delete the management server entry manually from the RMS.

Note: There will be no trace of SCOM on the new server, but if you log in to the RMS you will see an entry for the MS in a grayed-out state.

This happens because the server successfully registered with the root management server during setup and then failed at the Data Warehouse step; the setup rollback does not unregister it from the management group.

Now rerun the setup and it will be successful.

Voila! Issue solved.


Troubleshooting SCOM 2016 MS Grayed out State

While working at a customer site, I suddenly came across a situation where one of the MS was in a grayed-out state.

Initially I followed the main troubleshooting steps that we all take in this situation:

  • Flushed the Health Service state cache from the SCOM console. For an MS you can do it from the Operations Manager folder -> Management Group Health

Click on the MS and, in the right-hand pane under Tasks, select ‘Flush Health service state’. It was fine for 2 minutes or so and then went back to the grayed-out state.

So, time for the next move:

  • Deleted the ‘Health service state’ folder under Operations Manager -> Server on the MS

Still the same: the server went back to the grayed-out state.

I then went through each and every error in Event Viewer, and that’s when I came across the error below, related to the health service:

[Screenshot: SCOM_Config_Override_2]

Now it makes sense, so the main culprit here is this workflow.

This workflow is linked with OMS, but the client is neither using OMS nor have they ever configured it.

So it is quite odd to have this pop up here, especially since nothing is configured in the OMS configuration console.

Obviously, you can’t remove the OMS, System Center Advisor or Intelligence MPs from SCOM, because to do that you would have to remove other dependent MPs, which is not possible. So what do we do?

Now comes the resolution:

  • Go to Authoring pane -> Rules and type intel in the search box
  • You will find the 3 rules below in the search results
  • For the first of the 2 rules to disable, create an override on the Enabled parameter, set the override value to False, and make sure to select the Enforced check box too

[Screenshot: SCOM_Override_3]

  • Repeat the same for the 2nd rule

[Screenshot: SCOM_Override_1]

  • Save these overrides in a separate MP and not in the Default MP
  • Once done, flush the health service state cache again and check

Voila! The MS is now healthy, and when I check the Event Viewer logs, even the errors are gone.

Note: Do this only if you do NOT want to connect your on-premises SCOM to OMS.