System Center and DevOps

In this post, I will share how a DevOps approach can be followed and maintained while using the System Center suite of products.

Before going into details, let's talk about DevOps first.

The definition – DevOps (development and operations) is an enterprise software development phrase used to mean a type of agile relationship between development and IT operations. The goal of DevOps is to change and improve the relationship by advocating better communication and collaboration between these two business units.

Ref: http://www.webopedia.com/TERM/D/devops_development_operations.html

In today's fast-changing enterprise world, every business leader's ultimate goal is to be more collaborative and interconnected across business functions, and to do that you need an IT team and technology that enable it.

There are various tools and product suites on the market that help enterprises. However, Microsoft's System Center suite is a winner in many areas.

In a DevOps approach, from SMEs to large enterprises, we follow the model below:

System Center DevOps Model

 

Monitor:

For monitoring your entire infrastructure, SCOM (System Center Operations Manager) is a great tool to start with. Why, you ask? Out of the box (OOTB) it has management packs to monitor your Microsoft technologies, and plenty of third-party solutions and adapters make it easier to integrate with other technologies.

Service:

When it comes to IT service management, you can rely on SCSM (System Center Service Manager).

It is a great tool to manage all your incidents, service requests, changes, and problems. Of course, you can also use it for release and business relationship management, but those features are not as strong. Combined with other solutions, it is a wonderful ITSM product.

Manage:

For managing your infrastructure you have SCCM (System Center Configuration Manager). Out of the box, it can manage all Windows software, including the OS, on desktops, laptops, and servers, and with third-party solutions you can extend it to manage and patch third-party software.

Automation:

For successful DevOps you need to automate and combine all these functions. This is where SCORC (System Center Orchestrator), along with PowerShell, comes in handy. You can automate almost anything across your infrastructure.

Example Scenarios:

SCOM detects that one of your critical web services is down
-> It automatically creates an incident and assigns it to the L1 Wintel team
-> The Wintel engineer validates the alert and runs a runbook in SCORC from the SCSM console (which restarts the IIS service)
-> Now that the service is started, the SCOM alert is auto-resolved and closed
-> The incident in the SCSM console is resolved, with all the actions that were executed in the background captured in the incident logs
-> The Wintel engineer notices, from past incidents and from his own experience, that high RAM usage is the root cause of this issue
-> He goes to the SCSM console and raises a change request for increasing RAM on the server
-> He goes to the SCCM console and checks whether the server is compliant with all the latest security patches and critical updates
-> Once the change is approved in SCSM, he uses SCORC runbooks, which integrate with SCVMM, to increase the RAM on the server
-> Weeks later he pulls up a SCOM performance report and verifies that the IIS service has not gone down since the memory was increased on the server.
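As a rough illustration, the hand-offs in the scenario above can be written down as an ordered list of steps, each tagged with the System Center product that performs it. This is only a sketch in Python: the STEPS list and the products_involved helper are hypothetical names made up for this post, and nothing here calls a real System Center API.

```python
# Hypothetical model of the scenario above: (product, action) pairs in order.
# Nothing here talks to System Center; it only records who does what.
STEPS = [
    ("SCOM", "detect that a critical web service is down"),
    ("SCSM", "create an incident and assign it to the L1 Wintel team"),
    ("SCORC", "run the runbook that restarts the IIS service"),
    ("SCOM", "auto-resolve and close the alert"),
    ("SCSM", "resolve the incident with the executed actions logged"),
    ("SCSM", "raise a change request to increase RAM"),
    ("SCCM", "check patch and update compliance of the server"),
    ("SCSM", "approve the change"),
    ("SCORC", "run the runbook that increases RAM through SCVMM"),
    ("SCOM", "report on performance weeks later"),
]


def products_involved(steps):
    """Return each product once, in order of first appearance."""
    return list(dict.fromkeys(product for product, _ in steps))


if __name__ == "__main__":
    for number, (product, action) in enumerate(STEPS, start=1):
        print(f"{number:2d}. [{product}] {action}")
    print("Products involved:", ", ".join(products_involved(STEPS)))
```

Writing the flow out this way makes it easy to see how often control passes between monitoring, service management, and automation in a single incident.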

The above was just a high-level example of how the System Center products work hand in hand. This makes it much easier to manage enterprise-level IT infrastructure.


Non Standard CCMSetup Error Codes

While troubleshooting SCCM site server roles or clients, you will come across a lot of errors. Some of the error codes are easy to understand thanks to the CMTrace tool.

You will also find this tool in your SCCM media under the tools folder. Open the log with CMTrace, copy the error code highlighted in red, press Ctrl+L, and paste the code; a pop-up immediately shows what the error means.

However, in a few cases the SCCM team decided to use non-standard error codes, which means the result CMTrace gives you is not correct. If you are in that situation, the table below is for you.

If you know more non-standard codes, share them in comments below.

 

Error Code    Meaning
0             Success
6             Error
7             Reboot Required
8             Setup already running
9             Prerequisite evaluation failure
10            Setup manifest hash validation failure
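If you check CCMSetup exit codes from a script, the table above fits in a small lookup. Here is a minimal sketch in Python; the dictionary and function names are my own and are not part of any SCCM tooling, and the meanings are copied straight from the table.

```python
# Non-standard CCMSetup exit codes, copied from the table above.
CCMSETUP_EXIT_CODES = {
    0: "Success",
    6: "Error",
    7: "Reboot Required",
    8: "Setup already running",
    9: "Prerequisite evaluation failure",
    10: "Setup manifest hash validation failure",
}


def describe_ccmsetup_exit_code(code: int) -> str:
    """Return the meaning of a CCMSetup exit code from the table.

    Codes not in the table get a hint to use the CMTrace lookup instead.
    """
    return CCMSETUP_EXIT_CODES.get(
        code, f"Unknown code {code}: try the CMTrace error lookup (Ctrl+L)"
    )


if __name__ == "__main__":
    for code in (0, 7, 9, 1603):
        print(code, "->", describe_ccmsetup_exit_code(code))
```

Anything outside the table falls through to the hint, so standard codes still go through the CMTrace lookup.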

 

SCCM Primary sites design considerations

Today I will discuss scenarios under which you might require multiple primary sites.

As a rule of thumb, use a stand-alone primary site to support management of all of your systems and users. This topology also works when your company's different geographic locations can be served by a single primary site. To optimize network traffic, you can use multiple management points and distribution points across your infrastructure.

A stand-alone primary site supports:

  • 175,000 total clients and devices, not to exceed:
    • 150,000 desktops (computers that run Windows, Linux, and UNIX)
    • 25,000 devices that run Mac and Windows CE 7.0

For mobile device management:

  • 50,000 devices by using on-premises MDM
  • 150,000 cloud-based devices

For example, a stand-alone primary site that supports 150,000 desktops and 10,000 Mac or Windows CE 7.0 devices can support only an additional 15,000 devices. Those devices can be either cloud-based or managed by using on-premises MDM.
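The arithmetic in that example can be checked with a few lines of Python. This is only a sketch using the limits quoted above; the constant and function names are made up for illustration, and the linked sizing page remains the reference for current numbers.

```python
# Limits for a stand-alone primary site, as quoted in this post.
MAX_TOTAL = 175_000      # total clients and devices
MAX_DESKTOPS = 150_000   # Windows, Linux, and UNIX computers
MAX_MAC_WINCE = 25_000   # Mac and Windows CE 7.0 devices


def remaining_device_capacity(desktops: int, mac_wince: int) -> int:
    """Return how many more devices the site can take under the total cap."""
    if not 0 <= desktops <= MAX_DESKTOPS:
        raise ValueError(f"desktops must be between 0 and {MAX_DESKTOPS}")
    if not 0 <= mac_wince <= MAX_MAC_WINCE:
        raise ValueError(f"Mac/Windows CE devices must be between 0 and {MAX_MAC_WINCE}")
    return MAX_TOTAL - (desktops + mac_wince)


if __name__ == "__main__":
    # The example from the text: 150,000 desktops and 10,000 Mac/Windows CE.
    print(remaining_device_capacity(150_000, 10_000))  # 15000
```

Note that the remaining capacity is still subject to the separate mobile device limits listed above (50,000 on-premises MDM or 150,000 cloud-based).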

For more information on sizing check https://docs.microsoft.com/en-us/sccm/core/plan-design/configs/size-and-scale-numbers

Now let's get into the scenarios for considering more than one primary site.

  1. Load balancing across two Primary Sites

This scenario comes into play when you have a Central Administration Site (CAS) and two or more primary sites, with the idea of splitting the clients across the primary sites. In this scenario, if you lose one primary site, you can still support half of your environment until the other primary is recovered.

Below are pros and cons of this design:

Pros

  • If you lose the CAS or one primary, at least one primary is still functional, as are its secondary sites, until the CAS or the other primary is brought back online.

The deciding factor here: if you have a tight SLA for bringing SCCM sites back up, this is your best bet.

Typically, it takes around 3 hours to bring an SCCM site back if you have an SCCM database backup available as your site backup.

  • Removes the single point of failure from the design, as clients assigned to the other primaries are still able to report in and be managed.

If need be, you can also manually switch clients to report to the available primary sites and continue to manage them.

Cons

  • Increased Licensing costs
  • Increased hardware costs
  • Increased SQL Replication
  • Change latency across the infrastructure, as well as locking due to replication latency

  2. Redundancy and High Availability

The data from Primary Sites and the CAS replicates among sites in the hierarchy. The CAS also provides centralized Administration and reporting.

Note that automatic Client Re-assignment does not occur when a Primary Site fails.

The result of a primary site failure is that communication between the primary site and its secondary sites is broken, and the secondary sites cannot be re-parented. Combined with the fact that clients cannot easily be reassigned in the time it would take to recover the failed primary site, there is really no valid reason to do this unless recovering the primary site would take longer than reassigning the clients and reinstalling all of the secondary sites the failed primary had.

However, this becomes valid when precautions for natural disasters or war-type scenarios are being considered, where the other location won't come back online for quite some time.
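The rule in the paragraph above boils down to comparing two time estimates. A minimal sketch in Python, with a hypothetical function name and made-up hour values (apart from the 3-hour recovery figure mentioned earlier in this post):

```python
def extra_primary_is_justified(recover_hours: float, reassign_hours: float) -> bool:
    """Apply the rule of thumb from the text.

    recover_hours:  estimated time to recover the failed primary site.
    reassign_hours: estimated time to reassign its clients and reinstall
                    all of its secondary sites under another primary.
    Returns True only when recovery would take longer than reassignment.
    """
    if recover_hours < 0 or reassign_hours < 0:
        raise ValueError("durations must be non-negative")
    return recover_hours > reassign_hours


if __name__ == "__main__":
    # 3 hours is the typical recovery time quoted earlier in this post;
    # 40 hours is a made-up estimate for reassigning and reinstalling.
    print(extra_primary_is_justified(3, 40))
    # A long outage (a made-up 200 hours) flips the decision.
    print(extra_primary_is_justified(200, 40))
```

With a typical recovery window the extra primary does not pay off for redundancy alone; only an outage that outlasts the reassignment effort changes that.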

  3. Geographic Boundaries

In some scenarios, companies spanning different countries require that each continent or country can share data, while the clients in each country or continent must remain manageable on their own. In this case, which is a business continuity case, it is feasible to have more than one primary site. The choice to use another primary site should be based on connectivity and client count, because a secondary site or a remote distribution point should usually be good enough for geographic separation.

  4. Political reasons, or simply that your clients want it

In some scenarios, your client wants multiple primary sites, with managed devices segregated between them, simply because they are managed by different departments or heads.

There can also be situations where they want to segregate client data and do not want everybody in the organization to have access to all information.

In practice this is not a good reason to have multiple primary sites, as SCCM user role permissions can take care of it, and the CAS by default has access to all information across primary sites.

However, there are situations that I have come across where this is required for client satisfaction.