SCOM DW Report Deployment Errors with SharePoint 2013 MP

Today I am going to talk about one of the annoying errors that has been flooding my SCOM environment lately.

Error:

Data Warehouse failed to deploy reports for a management pack to SQL Reporting Services Server. Failed to deploy reporting component to the SQL Server Reporting Services server. The operation will be retried.
Exception ‘DeploymentException’: Failed to deploy reports for management pack with version dependent id ‘edf9e0b9-65aa-df29-6729-d16f0005e820’. Failed to deploy linked report ‘Microsoft.SharePoint.Server_Performance_Report’. Failed to convert management pack element reference ‘$MPElement[Name=”Microsoft.SharePoint.Foundation.2013.Responsetime”]$’ to guid. Check if MP element referenced exists in the MP. An object of class ManagementPackElement with ID 75668869-f88c-31f3-d081-409da1f06f0f was not found.
One or more workflows were affected by this.
Workflow name: Microsoft.SystemCenter.DataWarehouse.Deployment.Report

In short, the error was telling me that SCOM was unable to deploy the SharePoint server performance reports to SQL Reporting Services, which should mean that the SharePoint reports are unavailable. However, that was not the case for me: I was able to see all the SP 2013 reports listed in my Reporting pane.

So I assumed it was a false positive, and spent almost 3 days racking my brain trying to resolve the alert.

  • I deleted the SharePoint 2013 MP and added it again
  • Reconfigured it
  • Rechecked all Run As accounts

But still nothing seemed to fix it.

Then, after a lot of searching, I came across Kevin Holman's blog, which states that this is a known issue with SharePoint 2013 MP version 15.0.4425.1000.

https://blogs.technet.microsoft.com/kevinholman/2013/05/13/configuring-the-sharepoint-2013-management-pack/

Follow the above link for more information on this.

 


Non Standard CCMSetup Error Codes

While troubleshooting SCCM site server roles or clients, you will come across a lot of errors. Most of the error codes are easy to understand thanks to the CMTrace tool.

You will find this tool on your SCCM media under the tools folder. Open the log with CMTrace, copy the error code that is highlighted in red, press CTRL+L and paste the error code; a pop-up will immediately show what the error means.

However, there are a few cases where the SCCM team decided to use non-standard error codes, which means that whatever result CMTrace gives you is not correct. If you are in that kind of situation, the table below is for you.

If you know more non-standard codes, share them in comments below.

 

Error Code    Meaning
0             Success
6             Error
7             Reboot Required
8             Setup already running
9             Prerequisite evaluation failure
10            Setup manifest hash validation failure
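
If you call ccmsetup from a deployment script, the table above can be turned into a small lookup. This is only a minimal Python sketch: the helper name describe_ccmsetup_exit_code and the fallback message are my own illustrative choices, and the codes are just the ones listed in the table.

```python
# Non-standard ccmsetup exit codes, copied from the table above.
CCMSETUP_EXIT_CODES = {
    0: "Success",
    6: "Error",
    7: "Reboot Required",
    8: "Setup already running",
    9: "Prerequisite evaluation failure",
    10: "Setup manifest hash validation failure",
}


def describe_ccmsetup_exit_code(code: int) -> str:
    """Return the meaning of a ccmsetup exit code (hypothetical helper)."""
    # Codes that are not in the table fall back to a hint instead of failing.
    return CCMSETUP_EXIT_CODES.get(
        code, f"Unknown code {code}: look it up with CMTrace (CTRL+L)"
    )


if __name__ == "__main__":
    for code in (0, 7, 1603):
        print(code, "->", describe_ccmsetup_exit_code(code))
```

Anything that is not in the table is passed back with a hint to use the normal CMTrace lookup, so the sketch does not pretend to know codes that the table does not list.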

 

Troubleshooting SCOM 2016 MS Grayed out State

While working at a customer site, I suddenly came across a situation where one of the MS was in a grayed-out state.

Initially I followed the main troubleshooting steps that we all do in this situation:

  • Flush the Health Service state cache from the SCOM console; for an MS you can do it from the Operations Manager folder -> Management Group Health

Click on the MS and, on the right-hand side under Tasks, select ‘Flush Health service state’. It was fine for 2 minutes or so and then went back to the grayed-out state.

So, time for the next move:

  • Deleted the ‘Health service state’ folder under the Operations Manager -> Server folder on the MS

Still the same thing: the server went back to the grayed-out state.

I then went through each and every error in Event Viewer, and that’s when I came across the below error related to the HealthService:

[Screenshot: SCOM_Config_Override_2]

Now it makes sense: the main culprit here is this workflow.

Actually, this workflow is linked with OMS, but the client is neither using OMS nor have they ever configured it.

So it is quite odd to have this pop up here.

In the OMS configuration console, nothing is configured. So very odd.

Obviously, you can’t remove the OMS, System Center Advisor or Intelligence MPs from SCOM, because to do that you would have to remove other dependent MPs, which is not possible. So what do we do?

Now comes the resolution:

  • Go to the Authoring pane -> Rules and type intel in the search box
  • You will find the below 3 rules show up in the search results.
  • Click on the first of the first 2 rules and override it: select Enabled, set the override value to False, and make sure to select the Enforced check box too

[Screenshot: SCOM_Override_3]

  • Repeat the same for the 2nd rule

[Screenshot: SCOM_Override_1]

  • Save these overrides in a separate MP and not in the Default MP
  • Once done, flush the Health Service state cache again and check

Voila! The MS is now healthy, and when I checked the Event Viewer logs, even the errors were gone.

Note: Do this only when you do NOT want to connect your on-premises SCOM to OMS.

Unix / Linux SCOM Commandlets

Cmdlet                   Description
Get-SCXAgent             Returns the list of managed UNIX / Linux computers.
Get-SCXSSHCredential     Creates an SSH credential.
Install-SCXAgent         Installs the SCOM agent on discovered UNIX / Linux computers.
Invoke-SCXDiscovery      Invokes the discovery operation for the specified configuration of UNIX / Linux computers.
Remove-SCXAgent          Removes a UNIX or Linux computer from a management group.
Set-SCXResourcePool      Changes the managing resource pool for the targeted UNIX or Linux computer.
Uninstall-SCXAgent       Uninstalls the UNIX / Linux agent.
Update-SCXAgent          Updates the UNIX / Linux agent.
scxcertconfig -list      Lists the Xplat certificates installed in the management group.
scxcertconfig -remove    Removes the Xplat certificates installed in the management group.

Example 1:

Input: Get-SCXAgent

Output: Returns the list of all managed Unix / Linux agents.

Example 2:

Input: Get-SCXAgent | where {$_.Name -match "X01C-XPSCOM"} | Remove-SCXAgent

Output: No output is displayed; however, the agent whose name matches X01C-XPSCOM is removed from the management group.

Example 3:

Input: scxcertconfig -list

Output: Will display all Xplat certificates installed in management group.

Example 4:

Input: scxcertconfig -remove-all

Output: No output is displayed; however, all Xplat certificates installed in the management group are removed.

MS SCSM vs Atlassian JIRA Service Desk

Overview
  • Jira SDM: Built on Atlassian's JIRA workflow engine, JIRA Service Desk offers a collaborative, agile platform and knowledge base that is low cost and easy to set up. OOTB the tool is not fully ITIL compliant.
  • SCSM: A product from Microsoft that comes with the System Center suite. It is currently in its 3rd generation. OOTB the tool is MOF and ITIL compliant. Offers greater flexibility in terms of integration with other System Center and MS products.

Ease of use
  • Jira SDM: Easy to set up and use. However, it has limited OOTB capabilities and needs a lot of customization.
  • SCSM: Easy to set up and use; the 2016 edition has a better UI experience than previous versions. Third-party applications like the ItnetX SCSM Portal and the Cireson SCSM Portal also improve the UI experience.

ITIL Process
  • Jira SDM: OOTB only Incident and Service Request Management; the remaining modules can be built, but this requires effort.
  • SCSM: OOTB has Incident, Service Catalogue, Change, Problem, Request Fulfillment and Release Management processes.

Scalability
  • Jira SDM: At the very beginning you need to decide which features you will use, as this affects the design of the solution.
  • SCSM: Scalability is the biggest advantage of SCSM; you can easily scale horizontally or vertically based on your needs.

Automation Capabilities
  • Jira SDM: Limited when it comes to anything outside Atlassian solutions.
  • SCSM: Enhanced automation capabilities with System Center and other MS products like SharePoint, Exchange and AD, and with other third-party products. You need to use System Center Orchestrator and third-party adapters from Kelverion.

Licensing Costs
  • Jira SDM: < US$ 40,000 (approx., will vary based on region)
  • SCSM: Licensing is based on cores, so it depends upon your infrastructure sizing.

Hardware costs
  • Jira SDM: Runs on most VM technologies like VMware, Hyper-V and Oracle VM.
  • SCSM: Runs on most VM technologies like VMware, Hyper-V and Oracle VM.

Resource costs
  • Jira SDM: Medium
  • SCSM: Low. The tool is very easy to use; however, development costs are high.

Company Size
  • Jira SDM: SME
  • SCSM: SME

Support Costs
  • Jira SDM: Low
  • SCSM: Low

Patching by Orchestrator Part -2

In this blog I will share with you a script and other resources that you can use to automate patching.

Restart-Computer -ComputerName REMOTE_COMPUTER_NAME -Force -Wait -For WinRM

This forces the server to restart and returns only after the server has restarted and the WinRM service is up and running. This ensures that the steps you have arranged to run after a server reboot execute correctly and do not fail.
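
The same idea of blocking until WinRM answers again can be sketched outside PowerShell. The Python snippet below is only an illustration and not part of my runbooks: the helper name wait_for_port and the timeout values are assumptions, and it only covers the "wait until the port is reachable" half. It does not detect that the server actually went down first, which the PowerShell cmdlet takes care of for you. Port 5985 is the default WinRM HTTP listener port.

```python
import socket
import time


def wait_for_port(host: str, port: int = 5985, timeout: float = 600.0,
                  interval: float = 5.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Hypothetical helper: returns True as soon as the port accepts a
    connection, False if the overall timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable or timed out: the service is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # REMOTE_COMPUTER_NAME is the same placeholder used in the command above.
    print(wait_for_port("REMOTE_COMPUTER_NAME", 5985, timeout=30, interval=5))
```

A reachable port only tells you that a listener is accepting connections, so later steps should still handle failures on their first remote call.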

As I said in my earlier Patching by Orchestrator Part -1 blog, I am unable to share the complete runbooks or scripts that I have used, due to an NDA with my employer.

If you know of more scripts or other innovative ways of automating patching, please post them in the comments section.

Note: Credit and risk for all the scripts and links that I have mentioned here go to their respective authors.

SCCM Primary sites design considerations

Today I will discuss scenarios under which you might require multiple primary sites.

As a rule of thumb, use a stand-alone primary site to support management of all of your systems and users. This topology also works when your company’s different geographic locations can be served by a single primary site. To help manage network traffic, you can use multiple management points and distribution points across your infrastructure.

A stand-alone primary site supports:

  • 175,000 total clients and devices, not to exceed:
    • 150,000 desktops (computers that run Windows, Linux, and UNIX)
    • 25,000 devices that run Mac and Windows CE 7.0

For mobile device management:

  • 50,000 devices by using on-premises MDM
  • 150,000 cloud-based devices

For example, a stand-alone primary site that supports 150,000 desktops and 10,000 Mac or Windows CE 7.0 can support only an additional 15,000 devices. Those devices can be either cloud-based or managed by using on-premises MDM.
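
The arithmetic behind that example can be checked with a few lines of Python. This is only a sketch using the limits quoted above; the helper name remaining_device_capacity is my own, and it deliberately ignores the separate on-premises MDM and cloud-based limits.

```python
# Limits for a stand-alone primary site, taken from the figures quoted above.
MAX_TOTAL = 175_000      # total clients and devices
MAX_DESKTOPS = 150_000   # Windows, Linux and UNIX computers
MAX_MAC_CE = 25_000      # Mac and Windows CE 7.0 devices


def remaining_device_capacity(desktops: int, mac_ce: int) -> int:
    """Return how many more devices the site can take (hypothetical helper)."""
    if desktops > MAX_DESKTOPS:
        raise ValueError("desktop count exceeds the 150,000 limit")
    if mac_ce > MAX_MAC_CE:
        raise ValueError("Mac / Windows CE count exceeds the 25,000 limit")
    # Whatever is left of the overall total is available for other devices.
    return MAX_TOTAL - desktops - mac_ce


if __name__ == "__main__":
    print(remaining_device_capacity(150_000, 10_000))  # 15000
```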

For more information on sizing check https://docs.microsoft.com/en-us/sccm/core/plan-design/configs/size-and-scale-numbers

Now let’s get into the scenarios for considering more than one primary site.

  1. Load balancing across two Primary Sites

This scenario comes into play when you have a Central Administration Site (CAS) and 2 or more primary sites, with the idea of splitting the clients across the primary sites. In this scenario, if you lose one primary site, you can still support half of your environment until the other primary is recovered.

Below are pros and cons of this design:

Pros

  • If you lose the CAS or one primary, then at least one primary is still functional, as are its secondary sites, until the CAS or the other primary is brought back online.

The deciding factor is your SLA: if you have a tight SLA for bringing SCCM sites back up, this is your best bet.

Typically, it takes around 3 hours to bring back SCCM sites if you have the SCCM DB and SCCM site backups available.

  • Removes the single point of failure from the design, as clients assigned to the other primaries would still be able to report in and be managed.

If need be, you can also manually switch clients to report to the available primary sites and continue to manage them.

Cons

  • Increased Licensing costs
  • Increased hardware costs
  • Increased SQL Replication
  • Change latency across the Infrastructure as well as Locking due to replication latency
  2. Redundancy and High Availability

The data from Primary Sites and the CAS replicates among sites in the hierarchy. The CAS also provides centralized Administration and reporting.

Note that automatic Client Re-assignment does not occur when a Primary Site fails.

The result of a primary site failure is that communication between the primary site and its secondary sites is broken, and the secondary sites cannot be re-parented. This, coupled with the fact that clients cannot easily be reassigned in the time it would take to recover the failed primary site, means there is really no valid reason to do this unless the time it will take you to recover the primary site is greater than the time it would take to reassign and reinstall all of the secondary sites the failed primary had.

However, this becomes valid when precautions for natural-disaster or war-type scenarios are being considered, where the other location won’t be coming back online for quite some time.

  3. Geographic Boundaries

In some scenarios, companies spread across different countries require that each continent or country can share data, but also that each country’s or continent’s clients remain manageable on their own. In this case, which is a business-continuity case, it would be feasible to have more than one primary site. The choice to use another primary site here should be based on connectivity and client count, because just using a secondary site or a remote distribution point should be good enough for geographic separation.

  4. Political or just that your clients want it

In some scenarios, your client wants multiple primary sites and wants to segregate clients between them just because they are managed by different departments or heads.

There can also be situations where they want to segregate client data and do not want everybody in the organization to have access to all information.

Practically, this is not a good reason to have multiple primary sites, as SCCM role-based permissions can take care of it, and the CAS by default has access to all the information across primary sites.

However, there are situations that I have come across where this is required for client satisfaction.