Strategy
Define business justification and expected outcomes.
Plan
Align actionable adoption plans to business outcomes.
Ready
Prepare the cloud environment for the planned changes.
Migrate
Migrate and modernize existing workloads.
Innovate
Develop new cloud-native or hybrid solutions.
Govern
Govern the environment and workloads.
Manage
Operations management for cloud and hybrid solutions.
Intent
The cloud fundamentally changes how enterprises procure, use, and secure technology resources. Traditionally, enterprises
assumed ownership of and responsibility for all aspects of technology, from infrastructure to software. By moving to the cloud,
enterprises can provision and consume resources only when they're needed. Although the cloud offers tremendous flexibility in
design choices, enterprises need a proven and consistent methodology for adopting cloud technologies. The Microsoft Cloud
Adoption Framework for Azure meets that need, helping guide decisions throughout cloud adoption.
However, cloud adoption is only a means to an end. Successful cloud adoption starts well before a cloud platform vendor is
selected. It begins when business and IT decision makers realize that the cloud can accelerate a specific business transformation
objective. The Cloud Adoption Framework can help them align strategies for business, culture, and technical change to achieve
their desired business outcomes.
The Cloud Adoption Framework provides technical guidance for Microsoft Azure. Because enterprise customers might still be in
the process of choosing a cloud vendor or may have an intentional multi-cloud strategy, the framework provides cloud-agnostic
guidance for strategic decisions whenever possible.
Intended audience
This guidance affects the business, technology, and culture of enterprises. The affected roles include line-of-business leaders,
business decision makers, IT decision makers, finance, enterprise administrators, IT operations, IT security and compliance, IT
governance, workload development owners, and workload operations owners. Each role uses its own vocabulary, and each has
different objectives and key performance indicators. A single set of content can't address all audiences effectively.
Enter the Cloud Architect. The Cloud Architect serves as the thought leader and facilitator to bring these audiences together.
We've designed this collection of guides to help Cloud Architects facilitate the right conversations with the right audiences and
drive decision-making. Business transformation that's empowered by the cloud depends on the Cloud Architect role to help guide
decisions throughout the business and IT.
Each section of the Cloud Adoption Framework represents a different specialization or variant of the Cloud Architect role. These
sections also create opportunities to share cloud architecture responsibilities across a team of Cloud Architects. For example, the
governance section is designed for Cloud Architects who have a passion for mitigating technical risks. Some cloud providers refer
to these specialists as cloud custodians; we prefer the term cloud guardian or, collectively, the cloud governance team.
Use the Microsoft Cloud Adoption Framework for Azure to begin a cloud migration journey. This framework
provides comprehensive guidance for transitioning legacy application workloads to the cloud using innovative
cloud-based technologies.
Executive summary
The Cloud Adoption Framework helps customers undertake a simplified cloud adoption journey. This framework
contains detailed information about an end-to-end cloud adoption journey, starting with targeted business
outcomes, and then aligning cloud readiness and assessments with clearly defined business objectives. Those
outcomes are achieved through a defined path for cloud adoption. With migration-based adoption, the defined
path focuses largely on migrating on-premises workloads to the cloud. Sometimes this journey includes
modernization of workloads to increase the return on investment from the migration effort.
This framework is designed primarily for cloud architects and the cloud strategy teams leading cloud adoption
efforts. However, many topics in this framework are relevant to other roles across the business and IT. Cloud
architects frequently serve as facilitators to engage each of the relevant roles. This executive summary is
designed to prepare the various roles before facilitating conversations.
NOTE
This guidance is currently a public preview. Terminology, approaches, and guidance are being thoroughly tested with
customers, partners, and Microsoft teams during this preview. As such, the TOC and guidance may change slightly over
time.
Motivations
Cloud migrations can help companies achieve their desired business outcomes. Clear communication of
motivations, business drivers, and measurements of success are important foundations for making wise decisions
throughout cloud migration efforts. The following table classifies motivations to facilitate this conversation. It is
assumed that most companies will have motivations across each classification. The objective of this table is not to
limit outcomes, but instead to make it easier to prioritize overall objectives and motivations:
CRITICAL BUSINESS EVENTS: Mergers, acquisitions, or divestitures; Reductions in capital expenses; End of support for mission-critical technologies; Response to regulatory compliance changes; Meet new data sovereignty requirements; Reduce disruptions and improve IT stability
MIGRATION MOTIVATIONS: Reduction in vendor or technical complexity; Optimization of internal operations; Increase business agility; Prepare for new technical capabilities; Scale to meet market demands; Scale to meet geographic demands
INNOVATION MOTIVATIONS: Build new technical capabilities; Modernize security posture and controls; Scale to meet geographic or market demands; Improve customer experiences and engagements; Transform products or services; Disrupt the market with new products or services
When a response to critical business events is the highest priority, it is important to engage in cloud
implementation early, often in parallel with strategy and planning efforts. Taking such an approach requires a
growth mindset and a willingness to iteratively improve processes, based on direct lessons learned.
When migration motivations are a priority, strategy and planning will play a vital role early in the process.
However, it is highly suggested that implementation of the first workload is conducted in parallel with planning,
to help the team understand and plan for any learning curves associated with the cloud.
When innovation motivations are the highest priority, strategy and planning require additional investments early in the process to ensure balance in the portfolio and wise alignment of the investments made during cloud adoption. For more information about realizing innovation motivations, see Understand the innovation journey.
Preparing all participants across the migration effort with an awareness of these motivations leads to wiser decisions. The following migration approach outlines how Microsoft suggests customers guide those decisions in a consistent way.
Migration approach
The Cloud Adoption Framework establishes a high-level construct of Plan, Ready, Adopt to group the types of
effort required across any cloud adoption. This executive summary builds on that high-level flow to establish
iterative processes that can facilitate lift-shift-optimize efforts and modernization efforts in a single approach
across all cloud migration activities.
This approach consists of two methodologies or areas of focus: Cloud Strategy & Planning and Cloud
Implementation. The motivation or desired business outcome for a cloud migration often determines how much
a team should invest in strategy and planning and implementation. Those motivations can also influence
decisions to execute each sequentially or in parallel.
Cloud implementation
Cloud implementation is an iterative process for migrating and modernizing the digital estate in alignment with
targeted business outcomes and change management controls. During each iteration, workloads are migrated or
modernized in alignment with the strategy and plan. Decisions regarding IaaS, PaaS, or hybrid are made during the assess phase to optimize control and execution. Those decisions drive the tools used during the Migrate
phase. This model can be used with minimal strategy and planning. However, to ensure the greatest business
returns, it is highly suggested that both IT and the business align on a clear strategy and plan to guide
implementation activities.
The focus of this effort is the migration or modernization of workloads. A workload is a collection of
infrastructure, applications, and data that collectively supports a common business goal, or the execution of a
common business process. Examples of workloads could include things like a line-of-business application, an HR
payroll solution, a CRM solution, a financial document approval workflow, or a business intelligence solution.
Workloads may also include shared technical resources like a data warehouse that supports several other
solutions. In some cases, a workload could be represented by a single asset like a self-contained server,
application, or data platform.
Cloud migrations are often considered a single project within a broader program to streamline IT operations,
costs, or complexity. The cloud implementation methodology helps align the technical efforts within a series of
workload migrations to higher-level business values outlined in the cloud strategy and plan.
Getting started: To get started with a cloud implementation, the Azure migration guide and Azure setup guide
outline the tools and high-level processes needed to be successful in the execution of a cloud implementation.
Migrating your first workload using those guides will help the team overcome initial learning curves early in the
planning process. Afterwards, additional considerations should be given to the expanded scope checklist,
migration best practices, and migration considerations, to align the baseline guidance with your effort's unique
constraints, processes, team structures, and objectives.
Cloud migration is an excellent option for your existing workloads. But creating new products and services
requires a different approach. The innovate methodology in the Cloud Adoption Framework establishes an
approach that guides the development of new products and services.
Next steps
Begin your innovation journey using the innovate methodology.
Begin your innovation journey
The Cloud Adoption Framework is a free self-service tool that guides readers through various cloud adoption efforts. The
framework helps customers succeed at realizing business objectives that can be enabled by Microsoft Azure. However, this
content also recognizes that the reader may be addressing broad business, culture, or technical challenges that sometimes require a cloud-neutral position. Therefore, each section of this guidance begins with an Azure-first approach, and then
follows with cloud-neutral theory that can scale across many business and technical decisions.
Throughout this framework, enablement is a core theme. The following checklist itemizes fundamental cloud adoption principles
that ensure an adoption journey is considered successful by both IT and the business:
Plan: Establish clear business outcomes, a clearly defined digital estate plan, and well-understood adoption backlogs.
Ready: Ensure the readiness of staff through skills and learning plans.
Operate: Define a manageable operating model to guide activities during and long after adoption.
Organize: Align people and teams to deliver proper cloud operations and adoption.
Govern: Align proper governance disciplines to consistently apply cost management, risk mitigation, compliance, and
security baselines across all cloud adoption.
Manage: Maintain ongoing operational management of the IT portfolio to minimize interruptions to business processes
and ensure its stability.
Support: Align proper partnership and support options.
Another core theme is security, which is a critical quality attribute for a successful cloud adoption. Security is integrated
throughout this framework to provide integrated guidance on maintaining confidentiality, integrity, and availability assurances for
your cloud workloads.
Additional tools
In addition to the Cloud Adoption Framework, Microsoft covers additional topics that can enable success. This article highlights a
few common tools that can significantly improve success beyond the scope of the Cloud Adoption Framework. Establishing cloud
governance, resilient architectures, technical skills, and a DevOps approach are each important to the success of any cloud
adoption effort. Bookmark this page as a resource to revisit throughout any cloud adoption journey.
Cloud Governance
Understand business risks, and map those risks to proper policies and processes. Using cloud governance tools and the Five
Disciplines of Cloud Governance minimizes risks and improves the likelihood of success. Cloud governance helps control
costs, create consistency, improve security, and accelerate deployment.
DevOps Approach
Microsoft's historic transformation is rooted firmly in a Growth Mindset approach to culture and a DevOps approach to
technical execution. The Cloud Adoption Framework embeds both throughout the framework. To accelerate DevOps
adoption, review the DevOps learning content.
Next steps
Armed with an understanding of the top enabling aspects of the Cloud Adoption Framework, the likelihood of success in a
Migrate or Innovate effort will be that much higher.
Migrate
Innovate
The cloud delivers fundamental technology benefits that can help your enterprise execute multiple business strategies. By using
cloud-based approaches, you can improve business agility, reduce costs, accelerate time to market, and enable expansion into
new markets. To take advantage of this great potential, start by documenting your business strategy in a way that's both
understandable to cloud technicians and palatable to your business stakeholders.
Motivations
Meet with key stakeholders and executives to document the motivations behind cloud adoption.
Business outcomes
Engage motivated stakeholders and executives to document specific business outcomes.
Business justification
Develop a business case to validate the financial model that supports your motivations and outcomes.
To help build out your cloud adoption strategy, download the Microsoft Cloud Adoption Plan template, and then track the output
of each exercise.
Next steps
Start building your cloud adoption strategy by documenting the motivations behind cloud adoption.
Document motivations
Motivations: Why are we moving to the cloud?
"Why are we moving to the cloud?" is a common question for business and technical stakeholders alike. If the
answer is, "Our board (or CIO, or C-level executives) told us to move to the cloud," it's unlikely that the business
will achieve the desired outcomes.
This article discusses a few motivations behind cloud migration that can help produce more successful business
outcomes. These options help facilitate a conversation about motivations and, ultimately, business outcomes.
Motivations
Business transformations that are supported by cloud adoption can be driven by various motivations. It's likely
that several motivations apply at the same time. The goal of the lists in the following table is to help spark ideas
about which motivations are relevant. From there, you can prioritize and assess the potential impacts of the
motivations. In this article, we recommend that your cloud adoption team meet with various executives and
business leaders using the list below to understand which of these motivations are affected by the cloud
adoption effort.
CRITICAL BUSINESS EVENTS: Mergers, acquisitions, or divestitures; Reductions in capital expenses; End of support for mission-critical technologies; Response to regulatory compliance changes; New data sovereignty requirements; Reduction of disruptions and improvement of IT stability
MIGRATION MOTIVATIONS: Reduction in vendor or technical complexity; Optimization of internal operations; Increased business agility; Preparation for new technical capabilities; Scaling to meet market demands; Scaling to meet geographic demands
INNOVATION MOTIVATIONS: Building new technical capabilities; Modernization of security posture and controls; Scaling to meet geographic or market demands; Improved customer experiences and engagements; Transformation of products or services; Market disruption with new products or services
Motivation-driven strategies
This section highlights the Migration and Innovation motivations and their corresponding strategies.
Migration
The Migration motivations listed near the top of the Motivations table are the most common, but not necessarily
the most significant, reasons for adopting the cloud. These outcomes are important to achieve, but they're most
effectively used to transition to other, more useful worldviews. This important first step to cloud adoption is often
called a cloud migration. The framework refers to the strategy for executing a cloud migration by using the term
Migrate.
Some motivations align well with a migrate strategy. The motives at the top of this list will likely have
significantly less business impact than those toward the bottom of the list.
Cost savings.
Reduction in vendor or technical complexity.
Optimization of internal operations.
Increasing business agility.
Preparing for new technical capabilities.
Scaling to meet market demands.
Scaling to meet geographic demands.
Innovation
Data is the new commodity. Modern apps are the supply chain that drives that data into various experiences. In
today's business market, it's hard to find a transformative product or service that isn't built on top of data,
insights, and customer experiences. The motivations that appear lower in the Innovation list align to a
technology strategy referred to in this framework as Innovate.
The following list includes motivations that cause an IT organization to focus more on an innovate strategy than
a migrate strategy.
Increasing business agility.
Preparing for new technical capabilities.
Building new technical capabilities.
Scaling to meet market demands.
Scaling to meet geographic demands.
Improving customer experiences and engagements.
Transforming products or services.
Next steps
Understanding projected business outcomes helps facilitate the conversations that you need to have as you
document your motivations and supporting metrics, in alignment with your business strategy. Next, read an
overview of business outcomes that are commonly associated with a move to the cloud.
Overview of business outcomes
What business outcomes are associated with
transformation journeys?
The most successful transformation journeys start with a business outcome in mind. Cloud adoption can be a
costly and time-consuming effort. Fostering the right level of support from IT and other areas of the business is
crucial to success. The Microsoft business outcome framework is designed to help customers identify business
outcomes that are concise, defined, and drive observable results or change in business performance, supported
by a specific measure.
During any cloud transformation, the ability to speak in terms of business outcomes supports transparency and
cross-functional partnerships. The business outcome framework starts with a simple template to help
technically minded individuals document and gain consensus. This template can be used with several business
stakeholders to collect a variety of business outcomes, which could each be influenced by a company's
transformation journey. Feel free to use this template electronically or, better still, draw it on a whiteboard to
engage business leaders and stakeholders in outcome-focused discussions.
To learn more about business outcomes and the business outcome template, see documenting business
outcomes, or download the business outcome template.
Next steps
Learn more about fiscal outcomes.
Fiscal outcomes
Examples of fiscal outcomes
NOTE
The following examples are hypothetical and should not be considered a guarantee of returns when adopting any cloud
strategy.
Revenue outcomes
New revenue streams
The cloud can help create opportunities to deliver new products to customers or deliver existing products in a new
way. New revenue streams are innovative, entrepreneurial, and exciting for many people in the business world.
New revenue streams are also prone to failure and are considered by many companies to be high risk. When
revenue-related outcomes are proposed by IT, there will likely be resistance. To add credibility to these outcomes,
partner with a business leader who's a proven innovator. Validation of the revenue stream early in the process
helps avoid roadblocks from the business.
Example: A company has been selling books for over a hundred years. An employee of the company realizes
that the content can be delivered electronically. The employee creates a device that can be sold in the
bookstore, which allows the same books to be downloaded directly, driving $X in new book sales.
Revenue increases
With global scale and digital reach, the cloud can help businesses to increase revenues from existing revenue
streams. Often, this type of outcome comes from an alignment with sales or marketing leadership.
Example: A company that sells widgets could sell more widgets, if the salespeople could securely access the
company's digital catalog and stock levels. Unfortunately, that data is only in the company's ERP system, which
can be accessed only via a network-connected device. Creating a service façade to interface with the ERP and
exposing the catalog list and nonsensitive stock levels to an application in the cloud would allow the
salespeople to access the data they need while onsite with a customer. Extending on-premises Active Directory
using Azure Active Directory (Azure AD) and integrating role-based access into the application would allow
the company to help ensure that the data stays safe. This simple project could affect revenue from an existing
product line by x%.
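To make the façade idea in this example concrete, here is a minimal, hypothetical sketch in Python. The class name, role strings, and field names are all illustrative assumptions, not a real ERP or Azure API; the point is that only nonsensitive fields leave the ERP boundary, and only for allowed roles.

```python
# Fields safe to expose outside the ERP boundary, and roles allowed to read them.
# Both sets are hypothetical examples, not real ERP configuration.
PUBLIC_FIELDS = {"sku", "name", "in_stock"}
ALLOWED_ROLES = {"salesperson", "sales_manager"}

class CatalogFacade:
    """A thin service facade over full ERP records."""

    def __init__(self, erp_records):
        # Full ERP rows, including sensitive fields such as cost price.
        self._records = erp_records

    def list_catalog(self, role):
        """Return catalog entries with sensitive fields stripped out."""
        if role not in ALLOWED_ROLES:
            raise PermissionError(f"role {role!r} may not read the catalog")
        return [{k: v for k, v in rec.items() if k in PUBLIC_FIELDS}
                for rec in self._records]

erp = [{"sku": "W-1", "name": "Widget", "in_stock": 42, "cost_price": 1.10}]
facade = CatalogFacade(erp)
print(facade.list_catalog("salesperson"))
```

In a production version, the role check would be backed by Azure AD role claims rather than a plain string, and the ERP data would come from the network-connected ERP system rather than an in-memory list.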
Profit increases
Seldom does a single effort simultaneously increase revenue and decrease costs. However, when it does, align the
outcome statements from one or more of the revenue outcomes with one or more of the cost outcomes to
communicate the desired outcome.
Cost outcomes
Cost reduction
Cloud computing can reduce capital expenses for hardware and software, setting up datacenters, running on-site
datacenters, and so on. The costs of racks of servers, round-the-clock electricity for power and cooling, and IT
experts for managing the infrastructure add up fast. Shutting down a datacenter can reduce capital expense
commitments. This is commonly referred to as "getting out of the datacenter business." Cost reduction is typically
measured in dollars in the current budget, which could span one to five years depending on how the CFO
manages finances.
Example #1: A company's datacenter consumes a large percentage of the annual IT budget. IT chooses to
conduct a cloud migration and transitions the assets in that datacenter to infrastructure as a service (IaaS)
solutions, creating a three-year cost reduction.
Example #2: A holding company recently acquired a new company. In the acquisition, the terms dictate that
the new entity should be removed from the current datacenters within six months. Failure to do so will result in
a fine of 1 million USD per month to the holding company. Moving the digital assets to the cloud in a cloud
migration could allow for a quick decommission of the old assets.
Example #3: An income tax company that caters to consumers experiences 70 percent of its annual revenue
during the first three months of the year. The remainder of the year, its large IT investment sits relatively
dormant. A cloud migration could allow IT to deploy the compute/hosting capacity required for those three
months. During the remaining nine months, the IaaS costs could be significantly reduced by shrinking the
compute footprint.
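The arithmetic behind Example #3 can be sketched with hypothetical figures (the monthly costs below are invented for illustration): a fixed on-premises footprint is paid for all twelve months, while an elastic IaaS footprint pays peak rates for only the three busy months.

```python
def fixed_capacity_cost(monthly_cost, months=12):
    """Cost of keeping peak capacity running all year (on-premises model)."""
    return monthly_cost * months

def elastic_cost(peak_monthly, off_peak_monthly, peak_months, months=12):
    """Cost of scaling IaaS up for peak months and shrinking it the rest of the year."""
    return peak_monthly * peak_months + off_peak_monthly * (months - peak_months)

# Hypothetical figures: 40,000 USD/month at peak, 8,000 USD/month when shrunk.
fixed = fixed_capacity_cost(40_000)       # 480,000 USD per year
elastic = elastic_cost(40_000, 8_000, 3)  # 192,000 USD per year
savings = fixed - elastic                 # 288,000 USD per year
```

The exact savings depend entirely on how far the compute footprint can shrink in the off season; the sketch only shows the shape of the comparison.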
Example: Coverdell
Coverdell modernizes their infrastructure to drive record cost savings with Azure. Coverdell's decision to invest in
Azure, and to unite their network of websites, applications, data, and infrastructure within this environment, led to
more cost savings than the company could have ever expected. The migration to an Azure-only environment
eliminated 54,000 USD in monthly costs for colocation services. With the company's new, united infrastructure
alone, Coverdell expects to save an estimated 1 million USD over the next two to three years.
"Having access to the Azure technology stack opens the door for some scalable, easy-to-implement, and
highly available solutions that are cost effective. This allows our architects to be much more creative with the
solutions they provide."
Ryan Sorensen
Director of Application Development and Enterprise Architecture
Coverdell
Cost avoidance
Terminating a datacenter can also provide cost avoidance, by preventing future refresh cycles. A refresh cycle is
the process of buying new hardware and software to replace aging on-premises systems. In Azure, hardware and
OS are routinely maintained, patched, and refreshed at no additional cost to customers. This allows a CFO to
remove planned future spend from long-term financial forecasts. Cost avoidance is measured in dollars. It differs
from cost reduction, generally focusing on a future budget that has not been fully approved yet.
Example: A company's datacenter is up for a lease renewal in six months. The datacenter has been in service
for eight years. Four years ago, all servers were refreshed and virtualized, costing the company millions of
dollars. Next year, the company plans to refresh the hardware and software again. Migrating the assets in that
datacenter as part of a cloud migration would allow cost avoidance by removing the planned refresh from next
year's forecasted budget. It could also produce cost reduction by decreasing or eliminating the real estate lease
costs.
Capital expenses vs. operating expenses
Before you discuss cost outcomes, it's important to understand the two primary cost options: capital expenses and
operating expenses.
The following terms will help you understand the differences between capital expenses and operating expenses
during business discussions about a transformation journey.
Capital is the money and assets owned by a business to contribute to a particular purpose, such as increasing
server capacity or building an application.
Capital expenditures generate benefits over a long period. These expenditures are generally nonrecurring
and result in the acquisition of permanent assets. Building an application could qualify as a capital expenditure.
Operating expenditures are ongoing costs of doing business. Consuming cloud services in a pay-as-you-go
model could qualify as an operating expenditure.
Assets are economic resources that can be owned or controlled to produce value. Servers, data lakes, and
applications can all be considered assets.
Depreciation is a decrease in the value of an asset over time. More relevant to the capital expense versus
operating expense conversation, depreciation is how the costs of an asset are allocated across the periods in
which they are used. For instance, if you build an application this year but it's expected to have an average shelf
life of five years (like most commercial apps), the cost of the dev team and necessary tools required to create
and deploy the code base would be depreciated evenly over five years.
Valuation is the process of estimating how much a company is worth. In most industries, valuation is based
on the company's ability to generate revenue and profit, while respecting the operating costs required to create
the goods that provide that revenue. In some industries, such as retail, or in some transaction types, such as
private equity, assets and depreciation can play a large part in the company's valuation.
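The depreciation definition above can be illustrated with a simple straight-line schedule (hypothetical figures; real depreciation rules vary by jurisdiction and accounting policy):

```python
def straight_line_depreciation(asset_cost, useful_life_years):
    """Allocate an asset's cost evenly across the years it is used."""
    annual_expense = asset_cost / useful_life_years
    return [annual_expense] * useful_life_years

# Hypothetical: a 500,000 USD application with a five-year shelf life.
schedule = straight_line_depreciation(500_000, 5)  # 100,000 USD per year
```

Each year's 100,000 USD expense is what appears in that year's budget, even though the cash was spent up front; that is the distinction the capital versus operating expense conversation turns on.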
It is often a safe bet that various executives, including the chief information officer (CIO), debate the best use of
capital to grow the company in the desired direction. Giving the CIO a means of converting contentious capital
expense conversations into clear accountability for operating expenses could be an attractive outcome by itself. In
many industries, chief financial officers (CFOs) are actively seeking ways of better associating fiscal accountability
to the cost of goods being sold.
However, before you associate any transformation journey with this type of capital versus operating expense
conversion, it's wise to meet with members of the CFO or CIO teams to see which cost structure the business
prefers. In some organizations, reducing capital expenses in favor of operating expenses is a highly undesirable
outcome. As previously mentioned, this approach is sometimes seen in retail, holding, and private equity
companies that place higher value on traditional asset accounting models, which place little value on IP. It's also
seen in organizations that had negative experiences when they outsourced IT staff or other functions in the past.
If an operating expense model is desirable, the following example could be a viable business outcome:
Example: The company's datacenter is currently depreciating at x USD per year for the next three years. It is
expected to require an additional y USD to refresh the hardware next year. We can convert the capital expenses
to an operating expense model at an even rate of z USD per month, allowing for better management of and
accountability for the operating costs of technology.
Next steps
Learn more about agility outcomes.
Agility outcomes
Examples of agility outcomes
As discussed in the business outcomes overview, several potential business outcomes can serve as the foundation
for any transformation journey conversation with the business. This article focuses on the timeliest business
measure: business agility. Understanding your company's market position and competitive landscape can help you
articulate the business outcomes that are the target of the business's transformation journey.
Traditionally, chief information officers (CIOs) and IT teams were considered a source of stability in core mission-critical processes. This is still true. Few businesses can function well when their IT platform is unstable. However,
in today's business world, much more is expected. IT can expand beyond a simple cost center by partnering with
the business to provide market advantages. Many CIOs and executives assume that stability is simply a baseline
for IT. For these leaders, business agility is the measure of IT's contribution to the business.
Time-to-market outcome
During cloud-enabled innovation efforts, time to market is a key measure of IT's ability to address market change.
In many cases, a business leader might have existing budget for the creation of an application or the launch of a
new product. Clearly communicating a time-to-market benefit can motivate that leader to redirect budget to IT's
transformation journey.
Example 1: The European division of a US-based company needs to comply with GDPR regulations by
protecting customer data in a database that supports UK operations. The existing version of SQL doesn't
support the necessary row-level security. An in-place upgrade would be too disruptive. Using Azure SQL to
replicate and upgrade the database, the customer adds the necessary compliance measure in a matter of
weeks.
Example 2: A logistics company has discovered an untapped segment of the market, but it needs a new
version of their flagship application to capture this market share. Their larger competitor has made the
same discovery. Through the execution of a cloud-enabled application innovation effort, the company
embraces customer obsession and a DevOps-driven development approach to beat their slower, legacy
competitor by x months. This jump on market entrance secured the customer base.
Aurora Health Care
Healthcare system transforms online services into a friendly digital experience. To transform its digital services,
Aurora Health Care migrated its websites to the Microsoft Azure platform and adopted a strategy of continuous
innovation.
"As a team, we're focused on high-quality solutions and speed. Choosing Azure was a very transformative
decision for us."
Jamey Shiels
Vice President of Digital Experience
Aurora Health Care
Provision time
When the business demands new IT services or additional scale for existing services, acquiring and provisioning
new hardware or virtual resources can take weeks. After cloud migration, IT can more easily enable self-service
provisioning, allowing the business to scale in hours.
Example: A consumer packaged goods company requires the creation and tear-down of hundreds of database
clusters per year to fulfill operational demands of the business. The on-premises virtual hosts can provision
quickly, but the process of recovering virtual assets is slow and requires significant time from the team. As
such, the legacy on-premises environment suffers from bloat and can seldom keep up with demand. After
cloud migration, IT can more easily provide scripted self-provisioning of resources, with a chargeback
approach to billing. Together, these practices allow the business to move as quickly as it needs while remaining
accountable for the cost of the resources it demands, constrained only by its own budget.
Next steps
Learn more about reach outcomes.
Reach outcomes
Examples of global reach outcomes
As discussed in business outcomes, several potential business outcomes can serve as the foundation for any
transformation journey conversation with the business. This article focuses on a common business measure:
reach. Understanding the company's globalization strategy will help to better articulate the business outcomes
that are the target of a business's transformation journey.
Across the Fortune 500 and smaller enterprises, globalization of services and customer base has been a focus for
over three decades. As the world shrinks, it is increasingly likely for any business to engage in global commerce.
Supporting global operations is challenging and costly. Hosting datacenters around the world can consume more
than 80 percent of an annual IT budget. By themselves, wide area networks using private lines to connect those
datacenters can cost millions of dollars per year.
Cloud solutions move the cost of globalization to the cloud provider. In Azure, customers can quickly deploy
resources in the same region as customers or operations without having to buy and provision a datacenter.
Microsoft owns one of the largest wide area networks in the world, connecting datacenters around the globe.
Connectivity and global operating capacity are available to global customers on demand.
Global access
Expanding into a new market can be one of the most valuable business outcomes during a transformation. The
ability to quickly deploy resources in market without a longer-term commitment allows sales and operations
leaders to explore options that wouldn't have been considered in the past.
Example: A cosmetics manufacturer has identified a trend. Some products are being shipped to the Asia
Pacific region even though no sales teams are operating in that region. The minimum systems required by a
remote sales force are small, but latency prevents a remote access solution. To capitalize on this trend, the VP
of sales would like to experiment with sales teams in Japan and Korea. Because the company has undergone a
cloud migration, it was able to deploy the necessary systems in both Japan and Korea within days. This allowed
the VP of sales to grow revenue in the region by x percent within three months. Those two markets continue to
outperform other parts of the world, leading to sales operations throughout the region.
Data sovereignty
Operating in new markets introduces additional governance constraints. GDPR is one example of governance
criteria that can carry significant financial penalties. Azure provides compliance offerings that help customers
meet compliance obligations across regulated industries and global markets. For more information, see the
overview of Microsoft Azure compliance.
Example: A US-based utilities provider was awarded a contract to provide utilities in Canada. Canadian data
sovereignty law requires that Canadian data stay in Canada. This company had been working its way
through a cloud-enabled application innovation effort for years. As a result, its software could be
deployed through fully scripted DevOps processes. With a few minor changes to the code base, the company was able
to deploy a working copy of the code to an Azure datacenter in Canada, meeting data sovereignty compliance
and keeping the customer.
Next steps
Learn more about customer engagement outcomes.
Customer engagement outcomes
Examples of customer engagement outcomes
As discussed in the business outcomes overview, several potential business outcomes can serve as the foundation
for any transformation journey conversation with the business. This article focuses on a common business
measure: customer engagement. Understanding the needs of customers and the ecosystem around customers
helps with articulating the business outcomes that are the target of a business's transformation journey.
During cloud-enabled data innovation efforts, customer engagement is assumed. Aggregating data, testing
theories, advancing insights, and informing cultural change: each of these disruptive functions requires a high
degree of customer engagement. During a cloud-enabled application innovation effort, this type of customer
engagement is a maturation goal.
Customer engagement outcomes are all about meeting and exceeding customer expectations. As a baseline for
customer engagements, customers assume that products and services are performant and reliable. When they are
not, it's easy for an executive to understand the business value of performance and reliability outcomes. For more
advanced companies, speed of integrating learnings and observations is a fundamental business outcome.
The following are examples and outcomes related to customer engagement:
Cycle time
During customer-obsessed transformations, like a cloud-enabled application innovation effort, customers respond
to direct engagement and the ability to see their needs met quickly by the development team. Cycle time is a
Six Sigma term that refers to the duration from the start to finish of a function. For business leaders who are
customer-obsessed and investing heavily in improving customer engagement, cycle time can be a strong business
outcome.
Example: A services company that provides business-to-business (B2B) services is attempting to hold on to
market share in a competitive market. Customers who've left for a competing service provider have stated that
their overly complex technical solution interferes with their business processes and is the primary reason for
leaving. In this case, cycle time is imperative. Today, it takes 12 months for a feature to go from request to
release. If it's prioritized by the executive team, that cycle can be reduced to six to nine months. Through a
cloud-enabled application innovation effort, cloud-native application models, and Azure DevOps integration,
the team cut cycle time down to one month, allowing the business and application development
teams to interact more directly with customers.
ExakTime
Labor management breaks free of on-premises constraints with cloud technology. With Microsoft Azure,
ExakTime is moving toward streamlined agile product development, while the company's clients enjoy a more
robust and easier-to-use product, full of new features.
"Now, a developer can sit down at his machine, have an idea, spin up a web service or an Azure instance, test
out his idea, point it at test data, and get the concept going. In the time that it would have taken to provision
just the stuff to do a test, we can actually write the functionality."
Wayne Wise
Vice President of Software Development
ExakTime
Next steps
Learn more about performance outcomes.
Performance outcomes
Examples of performance outcomes
As discussed in business outcomes, several potential business outcomes can serve as the foundation for any
transformation journey conversation with the business. This article focuses on a common business measure:
performance.
In today's technological society, customers assume that applications will perform well and always be available.
When this expectation isn't met, it causes reputation damage that can be costly and long-lasting.
Performance
The biggest cloud computing services run on a worldwide network of secure datacenters, which are regularly
upgraded to the latest generation of fast and efficient computing hardware. This provides several benefits over a
single corporate datacenter, such as reduced network latency for applications and greater economies of scale.
Transform your business and reduce costs with an energy-efficient infrastructure that spans more than 100 highly
secure facilities worldwide, linked by one of the largest networks on earth. Azure has more global regions than
any other cloud provider. This translates into the scale that's required to bring applications closer to users around
the world, preserve data residency, and provide comprehensive compliance and resiliency options for customers.
Example 1: A services company was working with a hosting provider that hosted multiple operational
infrastructure assets. Those systems suffered from frequent outages and poor performance. The company
migrated its assets to Azure to take advantage of the SLA and performance controls of the cloud. The
downtime that it suffered cost it approximately 15,000 USD per minute of outage. With four to eight hours
of outage per month, it was easy to justify this organizational transformation.
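The downtime arithmetic in this example can be checked directly. The sketch below uses only the figures quoted above (15,000 USD per minute of outage, four to eight hours of outage per month):

```python
# Estimate the monthly cost of downtime from the figures in example 1.
COST_PER_MINUTE_USD = 15_000       # quoted cost of each minute of outage
OUTAGE_HOURS_PER_MONTH = (4, 8)    # observed monthly outage range

for hours in OUTAGE_HOURS_PER_MONTH:
    monthly_cost = hours * 60 * COST_PER_MINUTE_USD
    print(f"{hours} hours/month -> {monthly_cost:,} USD")
```

At 3.6 to 7.2 million USD of downtime cost per month, the migration is easy to justify.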
Example 2: A consumer investment company was in the early stages of a cloud-enabled application
innovation effort. Agile processes and DevOps were maturing well, but application performance was spiky.
As the transformation matured, the company started a program to monitor and automate sizing based
on usage demands. The company was able to eliminate sizing issues by using Azure performance
management tools, resulting in a surprising 5 percent increase in transactions.
Reliability
Cloud computing makes data backup, disaster recovery, and business continuity easier and less expensive,
because data can be mirrored at multiple redundant sites on the cloud provider's network.
One of IT's crucial functions is ensuring that corporate data is never lost and applications stay available despite
server crashes, power outages, or natural disasters. You can keep your data safe and recoverable by backing it up
to Azure.
Azure Backup is a simple solution that decreases your infrastructure costs while providing enhanced security
mechanisms to protect your data against ransomware. With one solution, you can protect workloads that are
running in Azure and on-premises across Linux, Windows, VMware, and Hyper-V. You can ensure business
continuity by keeping your applications running in Azure.
Azure Site Recovery makes it simple to test disaster recovery by replicating applications between Azure regions.
You can also replicate on-premises VMware and Hyper-V virtual machines and physical servers to Azure to stay
available if the primary site goes down. And you can recover workloads to the primary site when it's up and
running again.
Example: An oil and gas company used Azure technologies to implement a full site recovery. The company
chose not to fully embrace the cloud for day-to-day operations, but the cloud's disaster recovery and business
continuity (DRBC) features still protected their datacenter. As a hurricane formed hundreds of miles away, their
implementation partner started recovering the site to Azure. Before the storm touched down, all mission-
critical assets were running in Azure, preventing any downtime.
Next steps
Learn how to use the business outcome template.
Use the business outcome template
How to use the business outcome template
As discussed in the business outcomes overview, it can be difficult to bridge the gap between business and
technical conversations. This simple template is designed to help teams uniformly capture business outcomes to
be used later in the development of customer transformation journey strategies.
Download the business outcome template spreadsheet to begin brainstorming and tracking business outcomes.
Continue reading to learn how to use the template. Review the business outcomes section for ideas on potential
business outcomes that could come up in executive conversations.
Figure 1 - Business outcomes visualized as a house with stakeholders, over business outcomes, over technical
capabilities.
The business outcome template focuses on simplified conversations that can quickly engage stakeholders without
getting too deep into the technical solution. By rapidly understanding and aligning the key performance indicators
(KPIs) and business drivers that are important to stakeholders, your team can think about high-level approaches
and transformations before diving into the implementation details.
An example can be found on the "Example Outcome" tab of the spreadsheet, as shown below. To track multiple
outcomes, add them to the "Collective Outcomes" tab.
Figure 2 - Example of a business outcome template.
Figure 3 - Five areas of focus in discovery: stakeholders, outcomes, drivers, KPIs, and capabilities.
Stakeholders: Who in the organization is likely to see the greatest value in a specific business outcome? Who is
most likely to support this transformation, especially when things get tough or time consuming? Who has the
greatest stake in the success of this transformation? This person is a potential stakeholder.
Business outcomes: A business outcome is a concise, defined, and observable result or change in business
performance, supported by a specific measure. How does the stakeholder want to change the business? How will
the business be affected? What is the value of this transformation?
Business drivers: Business drivers capture the current challenge that's preventing the company from achieving
desired outcomes. They can also capture new opportunities that the business can capitalize on with the right
solution. How would you describe the current challenges or future state of the business? What business functions
would be changing to meet the desired outcomes?
KPIs: How will this change be measured? How does the business know if they are successful? How frequently will
this KPI be observed? Understanding each KPI helps enable incremental change and experimentation.
Capabilities: When you define any transformation journey, how will technical capabilities accelerate realization of
the business outcome? What applications must be included in the transformation to achieve business objectives?
How do various applications or workloads get prioritized to deliver on capabilities? How do parts of the solution
need to be expanded or rearchitected to meet each of the outcomes? Can execution approaches (or timelines) be
rearranged to prioritize high-impact business outcomes?
Next steps
Learn about aligning your technical efforts to meaningful learning metrics.
Align your technical efforts
How can we align efforts to meaningful learning
metrics?
The business outcomes overview discussed ways to measure and communicate the impact a transformation will
have on the business. Unfortunately, it can take years for some of those outcomes to produce measurable results.
The board and C-suite are unhappy with reports that show a 0% delta for long periods of time.
Learning metrics are interim, shorter-term metrics that can be tied back to longer-term business outcomes. These
metrics align well with a growth mindset and help position the culture to become more resilient. Rather than
highlighting the anticipated lack of progress toward a long-term business goal, learning metrics highlight early
indicators of success. The metrics also highlight early indicators of failure, which are likely to produce the greatest
opportunity for you to learn and adjust the plan.
As with much of the material in this framework, we assume you're familiar with the transformation journey that
best aligns with your desired business outcomes. This article will outline a few learning metrics for each
transformation journey to illustrate the concept.
Cloud migration
This transformation focuses on cost, complexity, and efficiency, with an emphasis on IT operations. The most easily
measured data behind this transformation is the movement of assets to the cloud. In this kind of transformation,
the digital estate is measured by virtual machines (VMs), racks or clusters that host those VMs, datacenter
operational costs, required capital expenses to maintain systems, and depreciation of those assets over time.
As VMs are moved to the cloud, dependence on on-premises legacy assets is reduced. The cost of asset
maintenance is also reduced. Unfortunately, businesses can't realize the cost reduction until clusters are
deprovisioned and datacenter leases expire. In many cases, the full value of the effort isn't realized until the
depreciation cycles are complete.
Always align with the CFO or finance office before making financial statements. However, IT teams can generally
estimate current monetary cost and future monetary cost values for each VM based on CPU, memory, and
storage consumed. You can then apply that value to each migrated VM to estimate the immediate cost savings
and future monetary value of the effort.
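As a sketch of that estimate, the following sums per-VM savings from unit rates for CPU, memory, and storage. Every rate and VM size here is an illustrative placeholder; real figures must be validated with the finance office before being used in any financial statement:

```python
# Rough per-VM savings estimate: current on-premises cost vs. projected
# cloud cost, based on CPU, memory, and storage consumed.
# All rates below are hypothetical placeholders, not real price data.

def monthly_vm_cost(vcpus, memory_gb, storage_gb, rates):
    """Estimate a VM's monthly cost (USD) from unit rates."""
    return (vcpus * rates["per_vcpu"]
            + memory_gb * rates["per_gb_memory"]
            + storage_gb * rates["per_gb_storage"])

on_prem_rates = {"per_vcpu": 30.0, "per_gb_memory": 5.0, "per_gb_storage": 0.25}
cloud_rates = {"per_vcpu": 20.0, "per_gb_memory": 3.0, "per_gb_storage": 0.10}

migrated_vms = [  # (vcpus, memory_gb, storage_gb)
    (4, 16, 500),
    (8, 32, 1000),
]

savings = sum(
    monthly_vm_cost(*vm, on_prem_rates) - monthly_vm_cost(*vm, cloud_rates)
    for vm in migrated_vms
)
print(f"Estimated monthly savings: {savings:,.2f} USD")
```

Applying the same per-VM delta across the migration backlog yields the immediate cost savings and future monetary value described above.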
Application innovation
Cloud-enabled application innovation focuses largely on the customer experience and the customer's willingness
to consume products and services provided by the company. It takes time for increments of change to affect
consumer or customer buying behaviors. But application innovation cycles tend to be much shorter than they are
in the other forms of transformation. The traditional advice is that you should start with an understanding of the
specific behaviors that you want to influence and use those behaviors as the learning metrics. For example, in an
e-commerce application, total purchases or add-on purchases could be the target behavior. For a video company,
time watching video streams could be the target.
The challenge with customer behavior metrics is that they can easily be influenced by outside variables. So it's
often important to include related statistics with the learning metrics. These related statistics can include release
cadence, bugs resolved per release, code coverage of unit tests, number of page views, page throughput, page
load time, and other app performance metrics. Each can show different activities and changes to the code base and
the customer experience to correlate with higher-level customer behavior patterns.
Data innovation
Changing an industry, disrupting markets, or transforming products and services can take years. In a cloud-
enabled data innovation effort, experimentation is key to measuring success. Be transparent by sharing prediction
metrics like percent probability, number of failed experiments, and number of models trained. Failures will
accumulate faster than successes. These metrics can be discouraging, and the executive team must understand the
time and investment needed to use these metrics properly.
On the other hand, some positive indicators are often associated with data-driven learning: centralization of
heterogeneous data sets, data ingress, and democratization of data. While the team is learning about the customer
of tomorrow, real results can be produced today. Supporting learning metrics could include:
Number of models available
Number of partner data sources consumed
Devices producing ingress data
Volume of ingress data
Types of data
An even more valuable metric is the number of dashboards created from combined data sources. This number
reflects the current-state business processes that are affected by new data sources. By sharing new data sources
openly, your business can take advantage of the data by using reporting tools like Power BI to produce
incremental insights and drive business change.
Next steps
After learning metrics are aligned, you're ready to start assessing the digital estate against those metrics. The
result will be a transformation backlog or migration backlog.
Assess the digital estate
Build a business justification for cloud migration
Cloud migrations can generate early return on investment (ROI) from cloud transformation efforts. But
developing a clear business justification with tangible, relevant costs and returns can be a complex process. This
article will help you think about what data you need to create a financial model that aligns with cloud migration
outcomes. First, let's dispel a few myths about cloud migration, so your organization can avoid some common
mistakes.
We can unpack the ROI equation to get a migration-specific view of the formulas for the input variables on the right
side of the equation. The remaining sections of this article offer some considerations to take into account.
Next steps
Create a financial model for cloud transformation
Create a financial model for cloud transformation
Creating a financial model that accurately represents the full business value of any cloud transformation can be
complicated. Financial models and business justifications tend to vary for different organizations. This article
establishes some formulas and points out a few things that are commonly missed when strategists create
financial models.
Return on investment
Return on investment (ROI) is often an important criterion for the C-suite or the board. ROI is used to compare
different ways to invest limited capital resources. The formula for ROI is fairly simple. The details you'll need to
create each input to the formula might not be as simple. Essentially, ROI is the amount of return produced from
an initial investment, usually represented as a percentage: the gain from the investment, minus the initial
investment, divided by the initial investment.
In the next sections, we'll walk through the data you'll need to calculate the initial investment and the gain from
investment (earnings).
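A minimal sketch of the ROI calculation, using illustrative figures rather than benchmarks:

```python
def roi_percent(gain_from_investment, initial_investment):
    """ROI = (gain from investment - initial investment) / initial investment,
    expressed as a percentage."""
    return (gain_from_investment - initial_investment) / initial_investment * 100

# Hypothetical example: a 500,000 USD migration investment that
# produces 650,000 USD in returns yields a 30% ROI.
print(f"ROI: {roi_percent(650_000, 500_000):.1f}%")
```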
Revenue deltas
Revenue deltas should be forecast in partnership with business stakeholders. After the business stakeholders
agree on a revenue impact, it can be used to improve the earning position.
Cost deltas
Cost deltas are the amounts by which costs will increase or decrease as a result of the transformation. Several
independent variables can affect cost deltas. Earnings are largely based on hard costs like capital expense reductions, cost
avoidance, operational cost reductions, and depreciation reductions. The following sections describe some cost
deltas to consider.
Depreciation reduction or acceleration
For guidance on depreciation, speak with the CFO or finance team. The following information is meant to serve
as a general reference on the topic of depreciation.
When capital is invested in the acquisition of an asset, that investment could be used for financial or tax purposes
to produce ongoing benefits over the expected lifespan of the asset. Some companies see depreciation as a
positive tax advantage. Others see it as a committed, ongoing expense similar to other recurring expenses
attributed to the annual IT budget.
Speak with the finance office to find out if elimination of depreciation is possible and if it would make a positive
contribution to cost deltas.
Physical asset recovery
In some cases, retired assets can be sold as a source of revenue. This revenue is often lumped into cost reduction
for simplicity. But it's truly an increase in revenue and can be taxed as such. Speak with the finance office to
understand the viability of this option and how to account for the resulting revenue.
Operational cost reductions
Recurring expenses required to operate a business are often called operating expenses. This is a broad category.
In most accounting models, it includes:
Software licensing.
Hosting expenses.
Electric bills.
Real estate rentals.
Cooling expenses.
Temporary staff required for operations.
Equipment rentals.
Replacement parts.
Maintenance contracts.
Repair services.
Business continuity and disaster recovery (BCDR) services.
Other expenses that don't require capital expense approvals.
This category provides one of the highest earning deltas. When you're considering a cloud migration, time
invested in making this list exhaustive is rarely wasted. Ask the CIO and finance team questions to ensure all
operational costs are accounted for.
Cost avoidance
When an operating expenditure is expected but not yet in an approved budget, it might not fit into a cost
reduction category. For example, if VMware and Microsoft licenses need to be renegotiated and paid next year,
they aren't fully qualified costs yet. Reductions in those expected costs are treated like operational costs for the
sake of cost-delta calculations. Informally, however, they should be referred to as "cost avoidance" until
negotiation and budget approval is complete.
Soft-cost reductions
At some companies, soft costs like reductions in operational complexity or reductions in full-time staff for
operating a datacenter could also be included in cost deltas. But including soft costs might not be a good idea.
When you include soft-cost reductions, you insert an undocumented assumption that the reduction will create
tangible cost savings. Technology projects rarely result in actual soft-cost recovery.
Headcount reductions
Time savings for staff are often included under soft-cost reduction. When those time savings map to actual
reduction of IT salary or staffing, they could be calculated separately as headcount reductions.
That said, the skills needed on-premises generally map to a similar (or higher-level) set of skills needed in the
cloud. So people aren't generally laid off after a cloud migration.
An exception occurs when operational capacity is provided by a third party or managed services provider (MSP).
If IT systems are managed by a third party, the operating costs could be replaced by a cloud-native solution or
cloud-native MSP. A cloud-native MSP is likely to operate more efficiently and potentially at a lower cost. If that's
the case, operational cost reductions belong in the hard-cost calculations.
Capital expense reductions or avoidance
Capital expenses are slightly different from operating expenses. Generally, this category is driven by refresh
cycles or datacenter expansion. An example of a datacenter expansion would be a new high-performance cluster
to host a big data solution or data warehouse. This expense would generally fit into a capital expense category.
More common are the basic refresh cycles. Some companies have rigid hardware refresh cycles, meaning assets
are retired and replaced on a regular cycle (usually every three, five, or eight years). These cycles often coincide
with asset lease cycles or the forecasted life span of equipment. When a refresh cycle hits, IT draws capital
expense to acquire new equipment.
If a refresh cycle is approved and budgeted, the cloud transformation could help eliminate that cost. If a refresh
cycle is planned but not yet approved, the cloud transformation could avoid a capital expenditure. Both
reductions would be added to the cost delta.
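The hard-cost categories discussed above can be rolled into a single cost delta. The sketch below uses hypothetical amounts; each category should be validated with the finance office before it appears in a business justification:

```python
# Sum the hard-cost categories discussed above into a single cost delta.
# All figures are placeholders -- validate each one with the finance office.
cost_deltas_usd = {
    "operational_cost_reduction": 250_000,  # hosting, licensing, maintenance
    "cost_avoidance": 120_000,              # expected renewals not yet budgeted
    "capital_expense_avoidance": 400_000,   # refresh cycle eliminated
    "depreciation_reduction": 80_000,       # confirm treatment with finance
}

total_cost_delta = sum(cost_deltas_usd.values())
print(f"Total annual cost delta: {total_cost_delta:,} USD")
```

The resulting total, combined with any revenue deltas, feeds the earnings side of the ROI formula.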
Next steps
Learn more about cloud accounting models.
Cloud accounting
What is cloud accounting?
The cloud changes how IT accounts for costs, as is described in Creating a financial model for cloud
transformation. Various IT accounting models are much easier to support because of how the cloud allocates
costs. So it's important to understand how to account for cloud costs before you begin a cloud transformation
journey. This article outlines the most common cloud accounting models for IT.
Chargeback
One of the common first steps in changing IT's reputation as a cost center is implementing a chargeback model of
accounting. This model is especially common in smaller enterprises or highly efficient IT organizations. In the
chargeback model, any IT costs that are associated with a specific business unit are treated like an operating
expense in that business unit's budget. This practice reduces the cumulative cost effects on IT, allowing the
business value of IT to show more clearly.
In a legacy on-premises model, chargeback is difficult to realize because someone still has to carry the large
capital expenses and depreciation. The ongoing conversion from capital expenditures to operating expenses
associated with usage is a difficult accounting exercise. This difficulty is a major reason for the creation of the
traditional IT accounting model and the central IT accounting model. The operating expenses model of cloud cost
accounting is almost required if you want to efficiently deliver a chargeback model.
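As an illustration of the mechanics, the sketch below attributes tagged cloud costs to business-unit budgets. The resource names, cost-center tags, and amounts are all hypothetical; in practice, this data would come from the cloud provider's billing export:

```python
# Sketch of chargeback accounting: attribute each tagged cloud cost to the
# owning business unit's operating-expense budget. All records below are
# hypothetical stand-ins for a real billing export.
from collections import defaultdict

billing_records = [
    {"resource": "sql-orders-db", "cost_usd": 1_200.0, "cost_center": "sales"},
    {"resource": "web-frontend", "cost_usd": 800.0, "cost_center": "marketing"},
    {"resource": "etl-cluster", "cost_usd": 2_500.0, "cost_center": "sales"},
]

# Accumulate costs per business unit.
chargeback = defaultdict(float)
for record in billing_records:
    chargeback[record["cost_center"]] += record["cost_usd"]

for unit, amount in sorted(chargeback.items()):
    print(f"{unit}: {amount:,.2f} USD charged back")
```

Because cloud costs arrive pre-itemized as operating expenses, this attribution is far simpler than apportioning on-premises capital expense and depreciation.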
But you shouldn't implement this model without considering the implications. Here are a few consequences that
are unique to a chargeback model:
Chargeback results in a massive reduction of the overall IT budget. For IT organizations that are inefficient or
require extensive complex technical skills in operations or maintenance, this model can expose those expenses
in an unhealthy way.
Loss of control is a common consequence. In highly political environments, chargeback can result in loss of
control and staff being reallocated to the business. This could create significant inefficiencies and reduce IT's
ability to consistently meet SLAs or project requirements.
Difficulty accounting for shared services is another common consequence. If the organization has grown
through acquisition and is carrying technical debt as a result, it's likely that a high percentage of shared services
must be maintained to keep all systems working together effectively.
Cloud transformations include solutions to these and other consequences associated with a chargeback model.
But each of those solutions includes implementation and operating expenses. The CIO and CFO should carefully
weigh the pros and cons of a chargeback model before considering one.
Showback or awareness-back
For larger enterprises, a showback or awareness-back model is a safer first step in the transition from cost center
to value center. This model doesn't affect financial accounting. In fact, the P&Ls of each organization don't change.
The biggest shift is in mindset and awareness. In a showback or awareness-back model, IT manages the
centralized, consolidated buying power as an agent for the business. In reports back to the business, IT attributes
any direct costs to the relevant business unit, which reduces the perceived budget directly consumed by IT. IT also
plans budgets based on the needs of the associated business units, which allows IT to more accurately account for
costs associated to purely IT initiatives.
This model provides a balance between a true chargeback model and more traditional models of IT accounting.
There's a learning curve and a time commitment associated with cloud adoption planning. Even for experienced
teams, proper planning takes time: time to align stakeholders, time to collect and analyze data, time to validate
long-term decisions, and time to align people, processes, and technology. In the most productive adoption efforts,
planning grows in parallel with adoption, improving with each release and with each workload migration to the
cloud. It's important to understand the difference between a cloud adoption plan and a cloud adoption strategy.
You need a well-defined strategy to facilitate and guide the implementation of a cloud adoption plan.
The Cloud Adoption Framework for Azure outlines the processes for cloud adoption and the operation of
workloads hosted in the cloud. Each of the processes across the Define strategy, Plan, Ready, Adopt, and Operate
phases requires slight expansions of technical, business, and operational skills. Some of those skills can come from
directed learning. But many of them are most effectively acquired through hands-on experience.
Starting a first adoption process in parallel with the development of the plan provides some benefits:
Establish a growth mindset to encourage learning and exploration
Provide an opportunity for the team to develop necessary skills
Create situations that encourage new approaches to collaboration
Identify skill gaps and potential partnership needs
Provide tangible inputs to the plan
Next steps
After the first cloud adoption project has begun, the cloud strategy team can turn their attention to the longer-term
cloud adoption plan.
Build your cloud adoption plan
Skills readiness path during the Plan phase of a
migration journey
During the Plan phase of a migration journey, the objective is to develop the plans necessary to guide migration
implementation. This phase requires a few critical skills, including:
Establishing the vision.
Building the business justification.
Rationalizing the digital estate.
Creating a migration backlog (technical plan).
The following sections provide learning paths to develop each of these skills.
Organizational skills
Depending on the motivations and desired business outcomes of a cloud adoption effort, leaders might need to
establish new organizational structures or virtual teams (v-teams) to facilitate various functions. These articles will
help you develop the skills necessary to structure those teams to meet desired outcomes:
Initial organizational alignment. Overview of organizational alignment and various team structures to facilitate
specific goals.
Breaking down silos and fiefdoms. Understanding two common organizational antipatterns and ways to guide
a team to productive collaboration.
Microsoft Learn
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with
cloud adoption doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning
that helps you achieve your goals faster. Earn points and levels, and achieve more!
Here is an example of a tailored learning path that aligns with the Strategy portion of the Cloud Adoption
Framework.
Learn the business value of Microsoft Azure: This learning experience begins by showing how digital
transformation and the power of the cloud can transform your business. It covers how Microsoft Azure cloud
services can power your organization on a trusted cloud platform, and wraps up by illustrating how to make this
journey real for your organization.
Learn more
To discover additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning
paths with your role.
Cloud adoption plans convert the aspirational goals of a cloud adoption strategy into an actionable plan. The collective cloud
teams can use the cloud adoption plan to guide their technical efforts and align them with the business strategy.
Digital estate
Inventory and rationalize your digital estate based on assumptions that align with motivations and business outcomes.
Download the Cloud Adoption Framework strategy and planning template to track the outputs of each exercise as you build out
your cloud adoption strategy.
Next steps
Start building the cloud adoption plan with a focus on the digital estate.
Digital estate
Cloud rationalization
Cloud rationalization is the process of evaluating assets to determine the best way to migrate or modernize each
asset in the cloud. For more information about the process of rationalization, see What is a digital estate?.
Rationalization context
The "five Rs of rationalization" listed in this article are a great way to label a potential future state for any
workload that's being considered as a cloud candidate. However, this labeling process should be put into the
correct context before you attempt to rationalize an environment. Review the following myths to provide that
context:
Myth: It's easy to make rationalization decisions early in the process. Accurate rationalization
requires a deep knowledge of the workload and associated assets (apps, VMs, and data). Most importantly,
accurate rationalization decisions take time. We recommend using an incremental rationalization process.
Myth: Cloud adoption has to wait for all workloads to be rationalized. Rationalizing an entire IT
portfolio or even a single datacenter can delay the realization of business value by months or even years.
Full rationalization should be avoided when possible. Instead, use the power of 10 approach to release
planning to make wise decisions about the next 10 workloads that are slated for cloud adoption.
Myth: Business justification has to wait for all workloads to be rationalized. To develop a business
justification for a cloud adoption effort, make a few basic assumptions at the portfolio level. When
motivations are aligned to innovation, assume rearchitecture. When motivations are aligned to migration,
assume rehost. These assumptions can accelerate the business justification process. Assumptions are then
challenged and budgets refined during the assessment phase of each workload's adoption cycles.
Now review the following five Rs of rationalization to familiarize yourself with the long-term process. While
developing your cloud adoption plan, choose the option that best aligns with your motivations, business
outcomes, and current state environment. The goal in digital estate rationalization is to set a baseline, not to
rationalize every workload.
Rehost
Also known as a lift and shift migration, a rehost effort moves a current state asset to the chosen cloud provider,
with minimal change to overall architecture.
Common drivers might include:
Reducing capital expense
Freeing up datacenter space
Achieving rapid return on investment in the cloud
Quantitative analysis factors:
VM size (CPU, memory, storage)
Dependencies (network traffic)
Asset compatibility
Qualitative analysis factors:
Tolerance for change
Business priorities
Critical business events
Process dependencies
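The quantitative factors above might be captured per asset along the following lines. This is a minimal sketch: the rates are placeholder numbers for illustration, not Azure prices, and real estimates should come from a pricing calculator.

```python
from dataclasses import dataclass

@dataclass
class VmAsset:
    """Quantitative factors collected per asset for a rehost assessment."""
    name: str
    cpu_cores: int
    memory_gb: int
    storage_gb: int
    dependencies: int  # count of network dependencies discovered

    def monthly_estimate(self, core_rate=15.0, ram_rate=4.0, disk_rate=0.05):
        # Rough sizing arithmetic only; placeholder rates, not real prices.
        return (self.cpu_cores * core_rate
                + self.memory_gb * ram_rate
                + self.storage_gb * disk_rate)

vm = VmAsset("app-server-01", cpu_cores=4, memory_gb=16,
             storage_gb=500, dependencies=3)
print(vm.monthly_estimate())  # 149.0
```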
Refactor
Platform as a service (PaaS) options can reduce the operational costs that are associated with many applications.
It's a good idea to slightly refactor an application to fit a PaaS-based model.
"Refactor" also refers to the application development process of refactoring code to enable an application to
deliver on new business opportunities.
Common drivers might include:
Faster and shorter updates
Code portability
Greater cloud efficiency (resources, speed, cost, managed operations)
Quantitative analysis factors:
Application asset size (CPU, memory, storage)
Dependencies (network traffic)
User traffic (page views, time on page, load time)
Development platform (languages, data platform, middle-tier services)
Database (CPU, memory, storage, version)
Qualitative analysis factors:
Continued business investments
Bursting options/timelines
Business process dependencies
Rearchitect
Some aging applications aren't compatible with cloud providers because of the architectural decisions that were
made when the application was built. In these cases, the application might need to be rearchitected before
transformation.
In other cases, applications that are cloud-compatible, but not cloud-native, might create cost efficiencies and
operational efficiencies by rearchitecting the solution into a cloud-native application.
Common drivers might include:
Application scale and agility
Easier adoption of new cloud capabilities
Mix of technology stacks
Quantitative analysis factors:
Application asset size (CPU, memory, storage)
Dependencies (network traffic)
User traffic (page views, time on page, load time)
Development platform (languages, data platform, middle-tier services)
Database (CPU, memory, storage, version)
Qualitative analysis factors:
Growing business investments
Operational costs
Potential feedback loops and DevOps investments
Rebuild
In some scenarios, the delta that must be overcome to carry an application forward can be too large to justify
further investment. This is especially true for applications that previously met the needs of a business but are now
unsupported or misaligned with the current business processes. In this case, a new code base is created to align
with a cloud-native approach.
Common drivers might include:
Accelerate innovation
Build apps faster
Reduce operational cost
Quantitative analysis factors:
Application asset size (CPU, memory, storage)
Dependencies (network traffic)
User traffic (page views, time on page, load time)
Development platform (languages, data platform, middle-tier services)
Database (CPU, memory, storage, version)
Qualitative analysis factors:
Declining end-user satisfaction
Business processes limited by functionality
Potential cost, experience, or revenue gains
Replace
Solutions are typically implemented by using the best technology and approach available at the time. Sometimes
software as a service (SaaS) applications can provide all the necessary functionality for the hosted application. In
these scenarios, a workload can be scheduled for future replacement, effectively removing it from the
transformation effort.
Common drivers might include:
Standardizing on industry best practices
Accelerating adoption of business process-driven approaches
Reallocating development investments into applications that create competitive differentiation or advantages
Quantitative analysis factors:
General operating cost reductions
VM size (CPU, memory, storage)
Dependencies (network traffic)
Assets to be retired
Database (CPU, memory, storage, version)
Qualitative analysis factors:
Cost benefit analysis of the current architecture versus a SaaS solution
Business process maps
Data schemas
Custom or automated processes
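As a rough sketch, a first-pass labeling with the five Rs could be driven by a handful of portfolio-level flags. The rules and flag names below are illustrative assumptions, not part of the framework; real decisions require the quantitative and qualitative analysis described above.

```python
# Illustrative first-pass labeling only. Flag names are hypothetical.
def rationalize(workload):
    """Assign a candidate R from simplified, assumed decision rules."""
    if workload.get("saas_alternative_exists"):
        return "replace"
    if workload.get("unsupported_or_misaligned"):
        return "rebuild"
    if workload.get("cloud_incompatible"):
        return "rearchitect"
    if workload.get("paas_fit"):
        return "refactor"
    return "rehost"  # default assumption for a migration-motivated effort

print(rationalize({"paas_fit": True}))  # refactor
print(rationalize({}))                  # rehost
```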
Next steps
Collectively, you can apply these five Rs of rationalization to a digital estate to help you make rationalization
decisions about the future state of each application.
What is a digital estate?
What is a digital estate?
Every modern company has some form of digital estate. Much like a physical estate, a digital estate is an
abstract reference to a collection of tangible owned assets. In a digital estate, those assets include virtual
machines (VMs), servers, applications, data, and so on. Essentially, a digital estate is the collection of IT assets
that power business processes and supporting operations.
The importance of a digital estate is most obvious during the planning and execution of digital transformation
efforts. During transformation journeys, the cloud strategy teams use the digital estate to map the business
outcomes to release plans and technical efforts. That all starts with an inventory and measurement of the
digital assets that the organization owns today.
TIP
Each type of transformation can be measured with any of the three views. Companies commonly complete all three
transformations in parallel. We strongly recommend that company leadership and the cloud strategy team agree
regarding the transformation that is most important for business success. That understanding serves as the basis for
common language and metrics across multiple initiatives.
Digital estate planning can take several forms depending on the desired outcomes and size of the existing estate.
There are various approaches that you can take. It's important to set expectations regarding the approach early in
planning cycles. Unclear expectations often lead to delays associated with additional inventory-gathering exercises.
This article outlines three approaches to analysis.
Workload-driven approach
The top-down assessment approach evaluates security aspects. Security includes the categorization of data (high,
medium, or low business impact), compliance, sovereignty, and security risk requirements. This approach assesses
high-level architectural complexity. It evaluates aspects such as authentication, data structure, latency
requirements, dependencies, and application life expectancy.
The top-down approach also measures the operational requirements of the application, such as service levels,
integration, maintenance windows, monitoring, and insight. When all of these aspects have been analyzed and
taken into consideration, the resulting score reflects the relative difficulty of migrating this application to each
of the cloud platforms: IaaS, PaaS, and SaaS.
In addition, the top-down assessment evaluates the financial benefits of the application, such as operational
efficiencies, TCO, return on investment, and other appropriate financial metrics. The assessment also examines the
seasonality of the application (for example, are there times of the year when demand spikes?) and overall compute
load.
It also looks at the types of users it supports (casual/expert, always/occasionally logged on), and the required
scalability and elasticity. Finally, the assessment concludes by examining business continuity and resiliency
requirements, as well as dependencies for running the application if a disruption of service should occur.
TIP
This approach requires interviews and anecdotal feedback from business and technical stakeholders. Availability of key
individuals is the biggest risk to timing. The anecdotal nature of the data sources makes it more difficult to produce accurate
cost or timing estimates. Plan schedules in advance and validate any data that's collected.
Asset-driven approach
The asset-driven approach provides a plan based on the assets that support an application for migration. In this
approach, you pull statistical usage data from a configuration management database (CMDB) or other
infrastructure assessment tools.
This approach usually assumes an IaaS model of deployment as a baseline. In this process, the analysis evaluates
the attributes of each asset: memory, number of processors (CPU cores), operating system storage space, data
drives, network interface cards (NICs), IPv6, network load balancing, clustering, operating system version,
database version (if necessary), supported domains, and third-party components or software packages, among
others. The assets that you inventory in this approach are then aligned with workloads or applications for
grouping and dependency mapping purposes.
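As a minimal sketch, normalizing a CMDB export into inventory records might look like the following. The column names and sample rows are hypothetical:

```python
import csv
import io

# Hypothetical CMDB export; column names are assumptions for illustration.
cmdb_export = """name,cpu_cores,memory_gb,os_version,db_version
web-01,2,8,Windows Server 2016,
db-01,8,64,Windows Server 2019,SQL Server 2017
"""

def load_inventory(export_text):
    """Parse and normalize asset attributes for asset-driven rationalization."""
    rows = list(csv.DictReader(io.StringIO(export_text)))
    for row in rows:
        row["cpu_cores"] = int(row["cpu_cores"])
        row["memory_gb"] = int(row["memory_gb"])
    return rows

inventory = load_inventory(cmdb_export)
print(inventory[1]["db_version"])  # SQL Server 2017
```

Records like these can then be grouped by workload for dependency mapping.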
TIP
This approach requires a rich source of statistical usage data. The time that's needed to scan the inventory and collect data is
the biggest risk to timing. The low-level data sources can miss dependencies between assets or applications. Plan for at least
one month to scan the inventory. Validate dependencies before deployment.
Incremental approach
We strongly suggest an incremental approach, as we do for many processes in the Cloud Adoption Framework. In
the case of digital estate planning, that equates to a multiphase process:
Initial cost analysis: If financial validation is required, start with an asset-driven approach, described
earlier, to get an initial cost calculation for the entire digital estate, with no rationalization. This establishes a
worst-case scenario benchmark.
Migration planning: After you have assembled a cloud strategy team, build an initial migration backlog
using a workload-driven approach that's based on their collective knowledge and limited stakeholder
interviews. This approach quickly builds a lightweight workload assessment to foster collaboration.
Release planning: At each release, the migration backlog is pruned and reprioritized to focus on the most
relevant business impact. During this process, the next five to ten workloads are selected as prioritized
releases. At this point, the cloud strategy team invests the time in completing an exhaustive workload-driven
approach. Delaying this assessment until a workload is aligned to a release better respects the time of
stakeholders. It also delays the investment in full analysis until the business starts to see results from earlier
efforts.
Execution analysis: Before migrating, modernizing, or replicating any asset, assess it both individually and
as part of a collective release. At this point, the data from the initial asset-driven approach can be
scrutinized to ensure accurate sizing and operational constraints.
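The initial cost analysis in the first phase can be as simple as summing per-asset estimates across the entire estate with no rationalization applied. This sketch uses placeholder rates, not Azure prices:

```python
# Phase 1 sketch: worst-case baseline assuming every asset is rehosted as-is,
# with no rationalization. Rates are placeholders for illustration only.
def worst_case_baseline(assets, core_rate=15.0, ram_rate=4.0):
    """Sum illustrative per-asset monthly costs across the whole estate."""
    return sum(asset["cpu_cores"] * core_rate + asset["memory_gb"] * ram_rate
               for asset in assets)

assets = [
    {"name": "web-01", "cpu_cores": 4, "memory_gb": 16},
    {"name": "db-01", "cpu_cores": 8, "memory_gb": 32},
]
print(worst_case_baseline(assets))  # 372.0
```

Because nothing is retired, replaced, or right-sized, this total serves as the worst-case benchmark that later rationalization should improve on.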
TIP
This incremental approach enables streamlined planning and accelerated results. It's important that all parties involved
understand the approach to delayed decision making. It's equally important that assumptions made at each stage be
documented to avoid loss of details.
Next steps
After an approach is selected, the inventory can be collected.
Gather inventory data
Gather inventory data for a digital estate
Developing an inventory is the first step in digital estate planning. In this process, a list of IT assets that support
specific business functions is collected for later analysis and rationalization. This article assumes that a bottom-
up approach to analysis is most appropriate for planning. For more information, see Approaches to digital estate
planning.
Next steps
After an inventory is compiled and validated, it can be rationalized. Inventory rationalization is the next step to
digital estate planning.
Rationalize the digital estate
Rationalize the digital estate
Cloud rationalization is the process of evaluating assets to determine the best approach to hosting them in the
cloud. After you've determined an approach and aggregated an inventory, cloud rationalization can begin. The
Cloud rationalization article discusses the most common rationalization options.
Incremental rationalization
The complete rationalization of a large digital estate is prone to risk and can suffer delays because of its
complexity. The assumption behind the incremental approach is that delayed decisions stagger the load on the
business to reduce the risk of roadblocks. Over time, this approach creates an organic model for developing the
processes and experience required to make qualified rationalization decisions more efficiently.
Inventory: Reduce discovery data points
Few organizations invest the time, energy, and expense in maintaining an accurate, real-time inventory of the full
digital estate. Loss, theft, refresh cycles, and employee onboarding often justify detailed asset tracking of end-
user devices. However, the ROI of maintaining an accurate server and application inventory in a traditional, on-
premises datacenter is often low. Most IT organizations have other more pressing issues to address than
tracking the usage of fixed assets in a datacenter.
In a cloud transformation, inventory directly correlates to operating costs. Accurate inventory data is required for
proper planning. Unfortunately, current environmental scanning options can delay decisions by weeks or
months. Fortunately, a few tricks can accelerate data collection.
Agent-based scanning is the most frequently cited delay. The robust data that's required for a traditional
rationalization can often only be collected with an agent running on each asset. This dependency on agents often
slows progress, because it can require feedback from security, operations, and administration functions.
In an incremental rationalization process, an agent-less solution could be used for an initial discovery to
accelerate early decisions. Depending on the level of complexity in the environment, an agent-based solution
might still be required. However, it can be removed from the critical path to business change.
Quantitative analysis: Streamline decisions
Regardless of the approach to inventory discovery, quantitative analysis can drive initial decisions and
assumptions. This is especially true when trying to identify the first workload or when the goal of rationalization
is a high-level cost comparison. In an incremental rationalization process, the cloud strategy team and the cloud
adoption teams limit the five Rs of rationalization to two concise decisions and apply only the relevant quantitative
factors. This streamlines the analysis and reduces the amount of initial data that's required to drive change.
For example, if an organization is in the midst of an IaaS migration to the cloud, you can assume that most
workloads will either be retired or rehosted.
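That two-outcome pass can be expressed as a trivial rule. The usage field and the 90-day window below are assumptions that qualitative analysis must confirm with the business:

```python
# Two-outcome first pass: assume retire when telemetry shows no recent users,
# otherwise rehost. The 90-day window is an illustrative assumption.
def initial_decision(asset):
    if asset.get("active_users_90d", 0) == 0:
        return "retire"
    return "rehost"

print(initial_decision({"name": "legacy-app", "active_users_90d": 0}))  # retire
print(initial_decision({"name": "erp", "active_users_90d": 1200}))      # rehost
```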
Qualitative analysis: Temporary assumptions
By reducing the number of potential outcomes, it's easier to reach an initial decision about the future state of an
asset. When you reduce the options, you also reduce the number of questions asked of the business at this early
stage.
For example, if the options are limited to rehosting or retiring, the business needs to answer only one question
during initial rationalization, which is whether to retire the asset.
"Analysis suggests that no users are actively using this asset. Is that accurate, or have we overlooked
something?" Such a binary question is typically much easier to run through qualitative analysis.
This streamlined approach produces baselines, financial plans, strategy, and direction. In later activities, each
asset goes through further rationalization and qualitative analysis to evaluate other options. All assumptions that
you make in this initial rationalization are tested before the migration of each workload.
Challenge assumptions
The outcome of the prior section is a rough rationalization that's full of assumptions. Next, it's time to challenge
some of those assumptions.
Retire assets
In a traditional on-premises environment, hosting small, unused assets seldom causes a significant impact on
annual costs. With a few exceptions, the FTE effort required to analyze and retire the actual asset outweighs
the cost savings from pruning and retiring those assets.
However, when you move to a cloud accounting model, retiring assets can produce significant savings in annual
operating costs and up-front migration efforts.
It's not uncommon for organizations to retire 20% or more of their digital estate after completing a quantitative
analysis. We recommend doing further qualitative analysis before deciding on such an action. After it's
confirmed, the retirement of those assets can produce the first ROI victory in the cloud migration. In many cases,
this is one of the biggest cost-saving factors. As such, we recommend that the cloud strategy team oversee the
validation and retirement of assets, in parallel with the build phase of the migration process, to allow for an early
financial win.
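The potential savings are easy to estimate. This sketch is illustrative arithmetic only, using the 20% figure above and a hypothetical monthly baseline:

```python
# Illustrative arithmetic: annual operating cost avoided if a fraction of the
# estate's monthly spend is retired before migration. Inputs are hypothetical.
def retirement_savings(monthly_baseline, retire_fraction=0.20):
    """Annualized savings from retiring retire_fraction of monthly spend."""
    return monthly_baseline * retire_fraction * 12

print(retirement_savings(100_000))  # 240000.0
```

Retired assets also avoid up-front migration effort, so the real benefit exceeds the operating-cost figure alone.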
Program adjustments
A company seldom embarks on just one transformation journey. The choice between cost reduction, market
growth, and new revenue streams is rarely a binary decision. As such, we recommend that the cloud strategy
team work with IT to identify assets on parallel transformation efforts that are outside of the scope of the
primary transformation journey.
In the IaaS migration example given in this article:
Ask the DevOps team to identify assets that are already part of a deployment automation and remove
those assets from the core migration plan.
Ask the Data and R&D teams to identify assets that are powering new revenue streams and remove them
from the core migration plan.
This program-focused qualitative analysis can be executed quickly and creates alignment across multiple
migration backlogs.
You might still need to consider some assets as rehost assets for a while. You can phase in later rationalization
after the initial migration.
Release planning
While the cloud adoption team is executing the migration or implementation of the first workload, the cloud
strategy team can begin prioritizing the remaining applications and workloads.
Power of 10
The traditional approach to rationalization attempts to meet all foreseeable needs. Fortunately, a plan for every
application is often not required to start a transformation journey. In an incremental model, the Power of 10
provides a good starting point. In this model, the cloud strategy team selects the first 10 applications to be
migrated. Those ten workloads should contain a mixture of simple and complex workloads.
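A power of 10 selection from a scored backlog might be sketched as follows. The complexity scores and the five-and-five split are assumptions for illustration; teams should weigh business priority as well:

```python
# Sketch of a power of 10 selection: mix the simplest and most complex
# workloads from a scored backlog. Complexity scores are assumed inputs.
def power_of_10(backlog):
    """Pick the 5 simplest and 5 most complex workloads by an assumed score."""
    if len(backlog) <= 10:
        return list(backlog)
    ranked = sorted(backlog, key=lambda w: w["complexity"])
    return ranked[:5] + ranked[-5:]  # five simple, five complex

backlog = [{"name": f"workload-{i}", "complexity": i} for i in range(12)]
selected = power_of_10(backlog)
print(len(selected))  # 10
```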
Build the first backlogs
The cloud adoption teams and the cloud strategy team can work together on the qualitative analysis for the first
10 workloads. This effort creates the first prioritized migration backlog and the first prioritized release backlog.
This method enables the teams to iterate on the approach and provides sufficient time to create an adequate
process for qualitative analysis.
Mature the process
After the two teams agree on the qualitative analysis criteria, assessment can become a task within each
iteration. Reaching consensus on assessment criteria usually requires two to three releases.
After the assessment has moved into the incremental execution process of migration, the cloud adoption team
can iterate faster on assessment and architecture. At this stage, the cloud strategy team is also abstracted, which
reduces the drain on their time. This also enables the cloud strategy team to focus on prioritizing the applications
that are not yet in a specific release, which ensures tight alignment with changing market conditions.
Not all of the prioritized applications will be ready for migration. Sequencing is likely to change as the team does
deeper qualitative analysis and discovers business events and dependencies that might prompt reprioritization
of the backlog. Some releases might group together a small number of workloads. Others might just contain a
single workload.
The cloud adoption team is likely to run iterations that don't produce a complete workload migration. The
smaller the workload, and the fewer dependencies, the more likely a workload is to fit into a single sprint or
iteration. For this reason, we recommend that the first few applications in the release backlog be small and
contain few external dependencies.
End state
Over time, the combination of the cloud adoption team and the cloud strategy team will complete a full
rationalization of the inventory. However, this incremental approach enables the teams to get continually faster
at the rationalization process. It also helps the transformation journey to yield tangible business results sooner,
without as much upfront analysis effort.
In some cases, the financial model might be too tight to make a decision without additional rationalization. In
such cases, you might need a more traditional approach to rationalization.
Next steps
The output of a rationalization effort is a prioritized backlog of all assets that are affected by the chosen
transformation. This backlog is now ready to serve as the foundation for costing models of cloud services.
Align cost models with the digital estate
Align cost models with the digital estate to forecast
cloud costs
After you've rationalized a digital estate, you can align it to equivalent costing models with the chosen cloud
provider. Discussing cost models is difficult without focusing on a specific cloud provider. To provide tangible
examples in this article, Azure is the assumed cloud provider.
Azure pricing tools help you manage cloud spend with transparency and accuracy, so you can make the most of
Azure and other clouds. Providing tools to monitor, allocate, and optimize cloud costs empowers customers to
accelerate future investments with confidence.
Azure Migrate: Azure Migrate is perhaps the most cost-effective approach to cost model alignment. It
combines digital estate inventory, limited rationalization, and cost calculations in one tool.
Total cost of ownership (TCO) calculator: Lower the total cost of ownership of your on-premises
infrastructure with the Azure cloud platform. Use the Azure TCO calculator to estimate the cost savings you
can realize by migrating your application workloads to Azure. Provide a brief description of your on-
premises environment to get an instant report.
Azure pricing calculator: Estimate your expected monthly bill by using our pricing calculator. Track your
actual account usage and bill at any time using the billing portal. Set up automatic email billing alerts to
notify you if your spend goes above an amount you configure.
Azure Cost Management: Azure Cost Management, licensed by Microsoft subsidiary Cloudyn, is a
multicloud cost management solution that helps you use and manage Azure and other cloud resources
effectively. Collect cloud usage and billing data through application programming interfaces (APIs) from Azure,
Amazon Web Services, and Google Cloud Platform. With that data, gain full visibility into resource
consumption and costs across cloud platforms in a single, unified view. Continuously monitor cloud
consumption and cost trends. Track actual cloud spending against your budget to avoid overspending.
Detect spending anomalies and usage inefficiencies. Use historical data to improve your forecasting
accuracy for cloud usage and expenditures.
Initial organization alignment
The most important aspect of any cloud adoption plan is the alignment of people who will make the plan a reality.
No plan is complete until you understand its people-related aspects.
True organizational alignment takes time. It will become important to establish long-term organizational
alignment, especially as cloud adoption scales across the business and IT culture. Alignment is so important that
an entire section has been dedicated to it in the Operate section of the Cloud Adoption Framework.
Full organization alignment is not a required component of the cloud adoption plan. However, some initial
organization alignment is needed. This article outlines a best-practice starting point for organizational alignment.
The guidance here can help complete your plan and get your teams ready for cloud adoption. When you're ready,
you can use the organization alignment section to customize this guidance to fit your organization.
It's fairly intuitive that cloud adoption tasks require people to execute those tasks. So, few people are surprised that
a cloud adoption team is a requirement. However, those who are new to the cloud may not fully appreciate the
importance of a cloud governance team. This challenge often occurs early in adoption cycles. The cloud
governance team provides the necessary checks and balances to ensure that cloud adoption doesn't expose the
business to any new risks. When risks must be taken, this team ensures that proper processes and controls are
implemented to mitigate or govern those risks.
To learn more about cloud adoption, cloud governance, and other such capabilities, see the brief section on
understanding required cloud capabilities.
Next steps
Learn how to plan for cloud adoption.
Plan for cloud adoption
Plan for cloud adoption
A plan is an essential requirement for a successful cloud adoption. A cloud adoption plan is an iterative project
plan that helps a company transition from traditional IT approaches to modern, agile approaches. This article
series outlines how a cloud adoption plan helps companies balance their IT portfolio and
manage transitions over time. Through this process, business objectives can be clearly translated into tangible
technical efforts. Those efforts can then be managed and communicated in ways that make sense to business
stakeholders. However, adopting such a process may require some changes to traditional project-management
approaches.
Next steps
Before building your cloud adoption plan, ensure that all necessary prerequisites are in place.
Review prerequisites
Prerequisites for an effective cloud adoption plan
A plan is only as effective as the data that's put into it. For a cloud adoption plan to be effective, there are two
categories of input: strategic and tactical. The following sections outline the minimum data points required in each
category.
Strategic inputs
Accurate strategic inputs ensure that the work being done contributes to achievement of business outcomes. The
strategy section of the Cloud Adoption Framework provides a series of exercises to develop a clear strategy. The
outputs of those exercises feed the cloud adoption plan. Before developing the plan, ensure that the following
items are well defined as a result of those exercises:
Clear motivations: Why are we adopting the cloud?
Defined business outcomes: What results do we expect to see from adopting the cloud?
Business justification: How will the business measure success?
Every member of the team that implements the cloud adoption plan should be able to answer these three strategic
questions. Managers and leaders who are accountable for implementation of the plan should understand the
metrics behind each question and any progress toward realizing those metrics.
Tactical inputs
Accurate tactical inputs ensure that the work can be planned accurately and managed effectively. The plan section
of the Cloud Adoption Framework provides a series of exercises to develop planning artifacts before you develop
your plan. These artifacts provide answers to the following questions:
Digital estate rationalization: What are the top 10 priority workloads in the adoption plan? How many
additional workloads are likely to be in the plan? How many assets are being considered as candidates for
cloud adoption? Are the initial efforts focused more on migration or innovation activities?
Organization alignment: Who will do the technical work in the adoption plan? Who is accountable for
adherence to governance and compliance requirements?
Skills readiness: How many people are allocated to perform the required tasks? How well are their skills
aligned to cloud adoption efforts? Are partners aligned to support the technical implementation?
These questions are essential to the accuracy of the cloud adoption plan. At a minimum, the questions about
digital estate rationalization must be answered to create a plan. To provide accurate timelines, the questions about
organization and skills are also important.
Next steps
After the team is comfortable with the strategic inputs and the inputs for digital estate rationalization, the next step
of workload prioritization can begin.
Prioritize and define workloads
Cloud adoption plan and Azure DevOps
Azure DevOps is a set of cloud-based tools for Azure customers who manage iterative projects. It also includes
tools for managing deployment pipelines and other important aspects of DevOps.
In this article, you'll learn how to quickly deploy a backlog to Azure DevOps by using a cloud adoption plan
template. This template aligns cloud adoption efforts to a standardized process based on the guidance in the Cloud
Adoption Framework.
Next steps
Start aligning your plan project by defining and prioritizing workloads.
Define and prioritize workloads
Prioritize and define workloads for a cloud adoption
plan
Establishing clear, actionable priorities is one of the secrets to successful cloud adoption. The natural temptation
is to invest time in defining all workloads that could potentially be affected during cloud adoption. But that's
counterproductive, especially early in the adoption process.
Instead, we recommend that your team focus on thoroughly prioritizing and documenting the first 10 workloads.
After implementation of the adoption plan begins, the team can maintain a list of the next 10 highest-priority
workloads. This approach provides enough information to plan for the next few iterations.
Limiting the plan to 10 workloads encourages agility and alignment of priorities as business criteria change. This
approach also makes room for the cloud adoption team to learn and to refine estimates. Most important, it
removes extensive planning as a barrier to effective business change.
What is a workload?
In the context of cloud adoption, a workload is a collection of IT assets (servers, VMs, applications, data, or
appliances) that collectively support a defined process. Workloads can support more than one process.
Workloads can also depend on other shared assets or larger platforms. However, a workload should have
defined boundaries regarding the dependent assets and the processes that depend upon the workload. Often,
workloads can be visualized by monitoring network traffic among IT assets.
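As a concrete illustration, a workload's boundary can be captured as a simple record. This is a minimal sketch with hypothetical asset names, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Workload:
    """A named collection of IT assets that supports a defined process."""
    name: str
    applications: list = field(default_factory=list)
    servers: list = field(default_factory=list)
    data_sources: list = field(default_factory=list)
    # Assets the workload depends on but doesn't own (shared platforms, etc.)
    external_dependencies: list = field(default_factory=list)

# Hypothetical example workload.
payroll = Workload(
    name="payroll",
    applications=["payroll-web"],
    servers=["vm-pay-01", "vm-pay-02"],
    data_sources=["payroll-db"],
    external_dependencies=["shared-identity-platform"],
)
```

Keeping external dependencies separate from owned assets makes the workload boundary explicit, which simplifies the asset-alignment step later in the plan.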
Prerequisites
The strategic inputs from the prerequisites list make the following tasks much easier to accomplish. For help with
gathering the data discussed in this article, review the prerequisites.
NOTE
The Power of 10 serves as an initial boundary for planning, to focus the energy and investment in early-stage analysis.
However, the act of analyzing and defining workloads is likely to cause changes in the list of priority workloads.
Define workloads
After initial priorities have been defined and workloads have been added to the plan, each of the workloads can
be defined via deeper qualitative analysis. Before including any workload in the cloud adoption plan, try to
provide the following data points for each workload.
Business inputs
Business freeze periods: Are there any times during which the business will not permit change?
Technical inputs
Confirm priorities
Based on the assembled data, the cloud strategy team and the cloud adoption team should meet to reevaluate
priorities. Clarification of business data points might prompt changes in priorities. Technical complexity or
dependencies might result in changes related to staffing allocations, timelines, or sequencing of technical efforts.
After a review, both teams should be comfortable with confirming the resulting priorities. This set of
documented, validated, and confirmed priorities is the prioritized cloud adoption backlog.
Next steps
For any workload in the prioritized cloud adoption backlog, the team is now ready to align assets.
Align assets for prioritized workloads
Align assets to prioritized workloads
A workload is a conceptual description of a collection of assets: VMs, applications, and data sources. The previous
article, Prioritize and define workloads, gave guidance for collecting the data that will define the workload. Before
migration, a few of the technical inputs in that list require additional validation. This article helps with validation of
the following inputs:
Applications: List any applications included in this workload.
VMs and servers: List any VMs or servers included in the workload.
Data sources: List any data sources included in the workload.
Dependencies: List any asset dependencies not included in the workload.
There are several options for assembling this data. The following are a few of the most common approaches.
Azure Migrate
Azure Migrate provides a set of grouping functions that can speed up the aggregation of applications, VMs, data
sources, and dependencies. After workloads have been defined conceptually, they can be used as the basis for
grouping assets based on dependency mapping.
The Azure Migrate documentation provides guidance on how to group machines based on dependencies.
Configuration-management database
Some organizations have a well-maintained configuration-management database (CMDB) within their existing
operations-management tooling. They can use the CMDB instead to provide the input data points
discussed earlier.
Next steps
Review rationalization decisions based on asset alignment and workload definitions.
Review rationalization decisions
Review rationalization decisions
During initial strategy and planning phases, we suggest you apply an incremental rationalization approach to the
digital estate. But this approach embeds some assumptions into the resulting decisions. We advise the cloud
strategy team and the cloud adoption teams to review those decisions in light of expanded-workload
documentation. This review is also a good time to involve business stakeholders and the executive sponsor in
future state decisions.
IMPORTANT
Further validation of the rationalization decisions will occur during the assessment phase of migration. This validation focuses
on business review of the rationalization to align resources appropriately.
To validate rationalization decisions, use the following questions to facilitate a conversation with the business. The
questions are grouped by the likely rationalization alignment.
Innovation indicators
If the joint review of the following questions results in a "Yes" answer, a workload might be a better candidate for
innovation. Such a workload wouldn't be migrated via a lift-and-shift or modernize model. Instead, the business
logic or data structures would be re-created as a new or rearchitected application. This approach can be more
labor-intensive and time-consuming. But for a workload that represents significant business returns, the
investment is justified.
Do the applications in this workload create market differentiation?
Is there a proposed or approved investment aimed at improving the experiences associated with the
applications in this workload?
Does the data in this workload make new product or service offerings available?
Is there a proposed or approved investment aimed at taking advantage of the data associated with this
workload?
Can the effect of the market differentiation or new offerings be quantified? If so, does that return justify the
increased cost of innovation during cloud adoption?
The following two questions can help you include high-level technical scenarios in the rationalization review.
Answering "Yes" to either could identify ways of accounting for or reducing the cost associated with innovation.
Will the data structures or business logic change during the course of cloud adoption?
Is an existing deployment pipeline used to deploy this workload to production?
If the answer to either question is "Yes," the team should consider including this workload as an innovation
candidate. At a minimum, the team should flag this workload for architecture review to identify modernization
opportunities.
Migration indicators
Migration is a faster and cheaper way of adopting the cloud. But it doesn't take advantage of opportunities to
innovate. Before you invest in innovation, answer the following questions. They can help you determine if a
migration model is more applicable for a workload.
Is the source code supporting this application stable? Do you expect it to remain stable and unchanged during
the time frame of this release cycle?
Does this workload support production business processes today? Will it do so throughout the course of this
release cycle?
Is it a priority that this cloud adoption effort improves the stability and performance of this workload?
Is cost reduction associated with this workload an objective during this effort?
Is reducing operational complexity for this workload a goal during this effort?
Is innovation limited by the current architecture or IT operation processes?
If the answer to any of these questions is "Yes," you should consider a migration model for this workload. This
recommendation is true even if the workload is a candidate for innovation.
Challenges in operational complexity, costs, performance, or stability can hinder business returns. You can use the
cloud to quickly produce improvements related to those challenges. Where it's applicable, we suggest you use the
migration approach to first stabilize the workload. Then expand on innovation opportunities in the stable, agile
cloud environment. This approach provides short-term returns and reduces the cost required to drive long-term
change.
IMPORTANT
Migration models include incremental modernization. Using platform as a service (PaaS) architectures is a common aspect of
migration activities. So too are minor configuration changes that use those platform services. The boundary for migration is
defined as a material change to the business logic or supporting business structures. Such change is considered an
innovation effort.
Next steps
Define iterations and releases to begin planning work.
Define iterations and releases to begin planning work.
Establish iterations and release plans
Agile and other iterative methodologies are built on the concepts of iterations and releases. This article outlines
the assignment of iterations and releases during planning. Those assignments drive timeline visibility to make
conversations easier among members of the cloud strategy team. The assignments also align technical tasks in a
way that the cloud adoption team can manage during implementation.
Establish iterations
In an iterative approach to technical implementation, you plan technical efforts around recurring time blocks.
Iterations tend to be one-week to six-week time blocks. Consensus suggests that two weeks is the average
iteration duration for most cloud adoption teams. But the choice of iteration duration depends on the type of
technical effort, the administrative overhead, and the team's preference.
To begin aligning efforts to a timeline, we suggest that you define a set of iterations that last 6 to 12 months.
Understand velocity
Aligning efforts to iterations and releases requires an understanding of velocity. Velocity is the amount of work
that can be completed in any given iteration. During early planning, velocity is an estimate. After several iterations,
velocity becomes a highly valuable indicator of the commitments that the team can make confidently.
You can measure velocity in abstract terms like story points. You can also measure it in more tangible terms like
hours. For most iterative frameworks, we recommend using abstract measurements to avoid challenges in
precision and perception. Examples in this article represent velocity in hours per sprint. This representation makes
the topic more universally understood.
Example: A five-person cloud adoption team has committed to two-week sprints. Given current obligations like
meetings and support of other processes, each team member can consistently contribute 10 hours per week to
the adoption effort. For this team, the initial velocity estimate is 100 hours per sprint (5 people × 10 hours × 2 weeks).
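The velocity arithmetic can be sketched as follows; the 10 hours per week per person is the contribution level consistent with a 100-hour-per-sprint estimate:

```python
team_size = 5
hours_per_week_per_person = 10   # sustained contribution after other obligations
weeks_per_sprint = 2

# Velocity: hours of adoption work the team can commit to per sprint.
velocity = team_size * hours_per_week_per_person * weeks_per_sprint
print(velocity)  # 100
```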
Iteration planning
Initially, you plan iterations by evaluating the technical tasks based on the prioritized backlog. Cloud adoption
teams estimate the effort required to complete various tasks. Those tasks are then assigned to the first available
iteration.
During iteration planning, the cloud adoption teams validate and refine estimates. They do so until they have
aligned all available velocity to specific tasks. This process continues for each prioritized workload until all efforts
align to a forecasted iteration.
In this process, the team validates the tasks assigned to the next sprint. The team updates its estimates based on
the team's conversation about each task. The team then adds each estimated task to the next sprint until the
available velocity is met. Finally, the team estimates additional tasks and adds them to the next iteration. The team
performs these steps until the velocity of that iteration is also exhausted.
The preceding process continues until all tasks are assigned to an iteration.
Example: Let's build on the previous example. Assume each workload migration requires 40 tasks. Also assume
you estimate each task to take an average of one hour. The combined estimation is approximately 40 hours per
workload migration. If these estimates remain consistent for all 10 of the prioritized workloads, those workloads
will take 400 hours.
The velocity defined in the previous example suggests that the migration of the first 10 workloads will take four
iterations, which is two months of calendar time. The first iteration will consist of 100 tasks that result in the
migration of two workloads. In the next iteration, a similar collection of 100 tasks will result in the migration of
three workloads.
WARNING
The preceding numbers of tasks and estimates are strictly used as an example. Technical tasks are seldom that consistent.
You shouldn't see this example as a reflection of the amount of time required to migrate a workload.
Release planning
Within cloud adoption, a release is defined as a collection of deliverables that produce enough business value to
justify the risk of disruption to business processes.
Releasing any workload-related changes into a production environment creates some changes to business
processes. Ideally, these changes are seamless, and the business sees the value of the changes with no significant
disruptions to service. But the risk of business disruption is present with any change and shouldn't be taken lightly.
To ensure a change is justified by its potential return, the cloud strategy team should participate in release
planning. Once tasks are aligned to sprints, the team can determine a rough timeline of when each workload will
be ready for production release. The cloud strategy team would review the timing of each release. The team would
then identify the inflection point between risk and business value.
Example: Continuing the previous example, the cloud strategy team has reviewed the iteration plan. The review
identified two release points. During the second iteration, a total of five workloads will be ready for migration.
Those five workloads will provide significant business value and will trigger the first release. The next release will
come two iterations later, when the next five workloads are ready for release.
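Continuing with the example numbers (which are illustrative only), the iteration and release math can be sketched as:

```python
import math

velocity_hours_per_sprint = 100
tasks_per_workload = 40
hours_per_task = 1
workload_count = 10

total_hours = workload_count * tasks_per_workload * hours_per_task      # 400
sprints_needed = math.ceil(total_hours / velocity_hours_per_sprint)     # 4
workloads_per_sprint = velocity_hours_per_sprint / (tasks_per_workload * hours_per_task)

# A release triggers each time another batch of five workloads is ready.
release_sprints = []
done_prev = 0
for sprint in range(1, sprints_needed + 1):
    done = math.floor(sprint * workloads_per_sprint)
    if done // 5 > done_prev // 5:
        release_sprints.append(sprint)
    done_prev = done

print(sprints_needed, release_sprints)  # 4 [2, 4]
```

With two-week sprints, the two release points land at the end of month one and month two, matching the timeline described above.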
Next steps
Estimate timelines to properly communicate expectations.
Estimate timelines
Timelines in a cloud adoption plan
In the previous article in this series, workloads and tasks were assigned to releases and iterations. Those
assignments feed the timeline estimates in this article.
Work breakdown structures (WBS) are commonly used in sequential project-management tools. They represent
how dependent tasks will be completed over time. Such structures work well when tasks are sequential in nature.
But the interdependencies among tasks in cloud adoption make such structures difficult to manage. To fill this gap,
you can estimate timelines based on iteration-path assignments, which hide much of that complexity.
Estimate timelines
To develop a timeline, start with releases. Those release objectives create a target date for any business impact.
Iterations aid in aligning those releases with specific time durations.
If more granular milestones are required in the timeline, use iteration assignment to indicate milestones. To do this
assignment, assume that the last instance of a workload-related task can serve as the final milestone. Teams also
commonly tag the final task as a milestone.
For any level of granularity, use the last day of the iteration as the date for each milestone. This ties completion of
workload adoption to a specific date. You can track the date in a spreadsheet or a sequential project-management
tool like Microsoft Project.
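A minimal sketch of deriving milestone dates from iteration assignments; the start date and sprint length here are hypothetical:

```python
from datetime import date, timedelta

sprint_length = timedelta(weeks=2)
first_sprint_start = date(2026, 1, 5)   # hypothetical program start

def milestone_date(iteration_number: int) -> date:
    """Last day of the given 1-based iteration."""
    return first_sprint_start + iteration_number * sprint_length - timedelta(days=1)

# Milestone for a workload whose final task lands in iteration 2.
print(milestone_date(2))  # 2026-02-01
```

The resulting dates can be tracked in a spreadsheet or imported into a sequential project-management tool as milestones.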
IT staff members might feel anxious about their roles and positions as they realize a different set of skills is needed
to support cloud solutions. Agile employees who explore and learn new cloud technologies don't need to have that
fear. They can lead the adoption of cloud services by helping the organization understand and embrace the
associated changes.
Learn more
To discover additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning
paths with your role.
Adapt existing roles, skills, and processes for the
cloud
At each phase of the IT industry's history, the most notable changes have often been marked by changes in staff
roles. One example is the transition from mainframe computing to client/server computing. The role of the
computer operator during this transition has largely disappeared, replaced by the system administrator role. When
virtualization arrived, the requirement for individuals working with physical servers was replaced with a need for
virtualization specialists.
Roles will likely change as institutions similarly shift to cloud computing. For example, datacenter specialists might
be replaced with cloud administrators or cloud architects. In some cases, though IT job titles haven't changed, the
daily work of these roles has changed significantly.
IT staff members might feel anxious about their roles and positions because they realize that they need a different
set of skills to support cloud solutions. But agile employees who explore and learn new cloud technologies
shouldn't fear. They can lead the adoption of cloud services and help the organization learn and embrace the
associated changes.
For guidance on building a new skill set, see the Skills readiness path.
Capture concerns
As the organization prepares for a cloud adoption effort, each team should document staff concerns as they arise
by identifying:
The type of concern. For example, workers might be resistant to the changes in job duties that come with the
adoption effort.
The impact if the concern isn't addressed. For example, resistance to adoption might result in workers being
slow to execute the required changes.
The area equipped to address the concern. For example, if workers in the IT department are reluctant to acquire
new skills, the IT stakeholder's area is best equipped to address this concern. Identifying the area is clear
for some concerns; for others, you might need to escalate to executive leadership.
IT staff members commonly have concerns about acquiring the training needed to support expanded functions
and new duties. Learning the training preferences of the team helps you prepare a plan. It also allows you to
address these concerns.
Identify gaps
Identifying gaps is another important aspect of organization readiness. A gap is a role, skill, or process that is
required for your digital transformation but doesn't currently exist in your enterprise.
1. Enumerate the responsibilities that come with the digital transformation. Emphasize new responsibilities and
existing responsibilities to be retired.
2. Identify the area that aligns with each responsibility. For each new responsibility, check how closely it aligns
with the area. Some responsibilities might span several areas. This crossover represents an opportunity for
better alignment that you should document as a concern. In the case where no area is identified as being
responsible, document this gap.
3. Identify the skills necessary to support each responsibility, and check if your enterprise has existing resources
with those skills. Where there are no existing resources, determine the training programs or talent acquisition
necessary to fill the gaps. Also determine the deadline by which you must support each responsibility to keep
your digital transformation on schedule.
4. Identify the roles that will execute these skills. Some of your existing workforce will assume parts of the roles.
In other cases, entirely new roles might be necessary.
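Steps 1 through 3 can be sketched as a simple set comparison; the responsibilities and skills below are hypothetical placeholders:

```python
# Hypothetical inventory: each new responsibility and the skills it requires.
required = {
    "cloud cost reporting": {"azure cost management", "powershell"},
    "landing zone operations": {"networking", "azure policy"},
}

# Skills currently present in the enterprise.
existing_skills = {"networking", "powershell"}

# A gap is any required skill with no existing resource behind it.
gaps = {
    responsibility: skills - existing_skills
    for responsibility, skills in required.items()
    if skills - existing_skills
}
print(gaps)
```

Each entry in the resulting gap list becomes a candidate for a training program or a talent-acquisition effort, with a deadline tied to when the responsibility must be supported.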
Next steps
Ensuring proper support for the translated roles is a team effort. To act on this guidance, review the organizational
readiness introduction to identify the right team structures and participants.
Identify the right team structures
Before adoption can begin, you must create a landing zone to host the workloads that you plan to build in the cloud or migrate
to the cloud. This section of the framework guides you through the creation of a landing zone.
Best practices
Validate landing zone modifications against the best practices sections to ensure the proper configuration of your current
and future landing zones.
Next steps
To get ready for cloud adoption, review the Azure setup guide.
Azure setup guide
TIP
For an interactive experience, view this guide in the Azure portal. Go to the Azure Quickstart Center in the Azure portal,
select Introduction to Azure Setup, and then follow the step-by-step instructions.
Next steps: Organize your resources to simplify how you apply settings
This guide provides interactive steps that let you try features as they're introduced. To come back to where you
left off, use the breadcrumb for navigation.
Organize your Azure resources
Organizing your cloud-based resources is critical to securing, managing, and tracking the costs related to your
workloads. To organize your resources, use the management hierarchies within the Azure platform, implement
well-thought-out naming conventions, and apply resource tagging.
Azure management groups and hierarchy
Naming standards
Resource tags
Azure provides four levels of management scope: management groups, subscriptions, resource groups, and
resources. The following image shows the relationship of these levels.
Management groups: These groups are containers that help you manage access, policy, and compliance for
multiple subscriptions. All subscriptions in a management group automatically inherit the conditions applied to
the management group.
Subscriptions: A subscription groups together user accounts and the resources that were created by those
user accounts. Each subscription has limits or quotas on the number of resources you can create and use.
Organizations can use subscriptions to manage costs and the resources that are created by users, teams, or
projects.
Resource groups: A resource group is a logical container into which Azure resources like web apps, databases,
and storage accounts are deployed and managed.
Resources: Resources are instances of services that you create, like virtual machines, storage, or SQL
databases.
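The inheritance behavior can be sketched as a walk up the hierarchy; the scope names and condition names here are hypothetical:

```python
# parent[scope] maps each scope to the scope it inherits from (None at the root).
parent = {
    "root-mg": None,
    "corp-mg": "root-mg",
    "sub-finance": "corp-mg",
    "rg-payroll": "sub-finance",
}

# Conditions (for example, policy assignments) applied directly at each scope.
assigned = {
    "root-mg": ["allowed-locations"],
    "corp-mg": ["require-tags"],
}

def effective_conditions(scope):
    """Walk up the hierarchy: a scope inherits everything applied above it."""
    conditions = []
    while scope is not None:
        conditions = assigned.get(scope, []) + conditions
        scope = parent[scope]
    return conditions

print(effective_conditions("rg-payroll"))  # ['allowed-locations', 'require-tags']
```

This is why conditions applied at a management group automatically govern every subscription, resource group, and resource beneath it.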
Learn more
To learn more, see:
Azure fundamentals
Scaling with multiple Azure subscriptions
Understand resource access management in Azure
Organize your resources with Azure management groups
Subscription service limits
Actions
Create a management group:
Create a management group to help you manage access, policy, and compliance for multiple subscriptions.
1. Go to Management groups.
2. Select Add management group.
Create an additional subscription:
Use subscriptions to manage costs and resources that are created by users, teams, or projects.
1. Go to Subscriptions.
2. Select Add.
Managing who can access your Azure resources and subscriptions is an important part of your Azure governance
strategy, and assigning group-based access rights and privileges is a good practice. Dealing with groups rather
than individual users simplifies maintenance of access policies, provides consistent access management across
teams, and reduces configuration errors. Azure role-based access control (RBAC) is the primary method of
managing access in Azure.
RBAC provides detailed access management of resources in Azure. It helps you manage who has access to Azure
resources, what they can do with those resources, and what scopes they can access.
When you plan your access control strategy, grant users the least privilege required to get their work done. The
following image shows a suggested pattern for assigning RBAC.
When you plan your access control methodology, we recommend that you work with people in your organization
in the following roles: security and compliance, IT administration, and enterprise architect.
The Cloud Adoption Framework offers additional guidance on how to use role-based access control as part of your
cloud adoption efforts.
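The scope model behind RBAC can be sketched with simple path-prefix matching; the groups, roles, and scope paths below are hypothetical, and real Azure evaluation involves more (such as deny assignments):

```python
# Hypothetical role assignments: (group, role, scope). Scopes nest by path.
assignments = [
    ("ops-team", "Contributor", "/subscriptions/sub1/resourceGroups/rg-app"),
    ("auditors", "Reader", "/subscriptions/sub1"),
]

def roles_at(group, scope):
    """Roles a group holds at a scope: assignments at that scope or any
    parent scope apply, because access flows down the hierarchy."""
    return [role for g, role, s in assignments
            if g == group and (scope == s or scope.startswith(s + "/"))]

print(roles_at("auditors", "/subscriptions/sub1/resourceGroups/rg-app"))  # ['Reader']
```

Assigning roles at the narrowest scope that still covers the work, as in the `ops-team` example, is one way to apply the least-privilege principle described above.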
Actions
Grant resource group access:
To grant a user access to a resource group:
1. Go to Resource groups.
2. Select a resource group.
3. Select Access control (IAM).
4. Select + Add > Add role assignment.
5. Select a role, and then assign access to a user, group, or service principal.
Learn more
To learn more, see:
What is role-based access control (RBAC)?
Cloud Adoption Framework: Use role-based access control
Manage costs and billing for your Azure resources
Cost management is the process of effectively planning and controlling costs involved in your business. Cost
management tasks are typically performed by finance, management, and app teams. Azure Cost Management can
help you plan with cost in mind. It can also help you to analyze costs effectively and take action to optimize cloud
spending.
For more information on how to integrate cloud cost management processes throughout your organization, see
the Cloud Adoption Framework article on how to track costs across business units, environments, or projects.
Learn more
To learn more, see:
Azure billing and cost management documentation
Cloud Adoption Framework: Track costs across business units, environments, or projects
Cloud Adoption Framework: Cost management governance discipline
Actions
Predict and manage costs:
1. Go to Cost Management + Billing.
2. Select Cost Management.
Manage invoices and payment methods:
1. Go to Cost Management + Billing.
2. Select Invoices or Payment methods from the Billing section in the left pane.
As you establish corporate policy and plan your governance strategies, you can use tools and services like Azure
Policy, Azure Blueprints, and Azure Security Center to enforce and automate your organization's governance
decisions. Before you start your governance planning, use the Governance Benchmark tool to identify potential
gaps in your organization's cloud governance approach. For more information on how to develop governance
processes, see the Cloud Adoption Framework for Azure's governance guidance.
Azure Blueprints
Azure Policy
Azure Security Center
Azure Blueprints enables cloud architects and central information technology groups to define a repeatable set of
Azure resources that implements and adheres to an organization's standards, patterns, and requirements. With
Azure Blueprints, development teams can rapidly build and stand up new environments, confident that they're
building within organizational compliance, by using a set of built-in components (such as networking) to
speed up development and delivery.
Blueprints are a declarative way to orchestrate the deployment of various resource templates and other artifacts
like:
Role assignments.
Policy assignments.
Azure Resource Manager templates.
Resource groups.
Create a blueprint
To create a blueprint:
1. Go to Blueprints - Getting started.
2. In the Create a Blueprint section, select Create.
3. Filter the list of blueprints to select the appropriate blueprint.
4. Enter the Blueprint name, and select the appropriate Definition location.
5. Click Next: Artifacts and review the artifacts included in the blueprint.
6. Click Save Draft.
Publish a blueprint
To publish blueprint artifacts to your subscription:
1. Go to Blueprints - Blueprint definitions.
2. Select the blueprint you created in the previous steps.
3. Review the blueprint definition and select Publish blueprint.
4. Provide a Version (such as 1.0) and any Change notes, then select Publish.
Learn more
To learn more, see:
Azure Blueprints
Cloud Adoption Framework: Resource consistency decision guide
Standards-based blueprints samples
Monitoring and reporting in Azure
Azure offers many services that together provide a comprehensive solution for collecting, analyzing, and acting on
telemetry from your applications and the Azure resources that support them. In addition, these services can extend
to monitoring critical on-premises resources to provide a hybrid monitoring environment.
Azure Monitor
Azure Service Health
Azure Advisor
Azure Security Center
Azure Monitor provides a single unified hub for all monitoring and diagnostics data in Azure. You can use it to get
visibility across your resources. With Azure Monitor, you can find and fix problems, optimize performance, and
understand customer behavior.
Monitor and visualize metrics. Metrics are numerical values available from Azure resources that help you
understand the health of your systems. Customize charts for your dashboards, and use workbooks for
reporting.
Query and analyze logs. Logs include activity logs and diagnostic logs from Azure. Collect additional logs
from other monitoring and management solutions for your cloud or on-premises resources. Log Analytics
provides a central repository to aggregate all this data. From there, you can run queries to help troubleshoot
issues or to visualize data.
Set up alerts and actions. Alerts proactively notify you of critical conditions. Corrective actions can be
taken based on triggers from metrics, logs, or service health issues. You can set up different notifications and
actions and send data to your IT service management tools.
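Conceptually, a metric alert is a threshold evaluated over a window of recent samples. The sketch below illustrates that idea only; the aggregation, threshold, and breach count are invented examples, not Azure Monitor defaults:

```python
def should_alert(samples, threshold, min_breaches=3):
    """Fire when at least `min_breaches` samples in the window exceed
    `threshold` - a simplified stand-in for an Azure Monitor metric
    alert rule, not the service's actual evaluation logic."""
    return sum(1 for s in samples if s > threshold) >= min_breaches

# Hypothetical CPU % samples collected over an evaluation window.
cpu_window = [72.0, 91.5, 88.0, 95.2, 60.1]
print(should_alert(cpu_window, threshold=85.0))  # → True (three samples above 85)
```

In Azure Monitor the equivalent rule would trigger an action group, which can send notifications or forward the event to your IT service management tools.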
Start monitoring your:
Applications
Containers
Virtual machines
Networks
To monitor other resources, find additional solutions in the Azure Marketplace.
To explore Azure Monitor, go to the Azure portal.
Learn more
To learn more, see Azure Monitor documentation.
Stay current with Microsoft Azure
Cloud platforms like Microsoft Azure change faster than many organizations are accustomed to. This pace of
change means that organizations have to adapt people and processes to a new cadence. If you're responsible for
helping your organization keep up with change, you might feel overwhelmed at times. The resources listed in this
section can help you stay up to date.
Top resources
Additional resources
The following resources can help you stay current with Azure:
Azure Service Health
Service Health and alerts provide timely notifications about ongoing service issues, planned
maintenance, and health advisories. This resource also includes information about features being
removed from Azure.
Azure Updates
Subscribe to Azure Updates to receive announcements about product updates. Brief summaries link to
further details, which makes the updates easy to follow.
Subscribe via RSS.
Azure Blog
The Azure Blog communicates the most important announcements for the Azure platform. Follow this
blog to stay up to date on critical information.
Subscribe via RSS.
Service-specific blogs
Individual Azure services publish blogs that you might want to follow if you rely on those services.
Many Azure service blogs are available. Find the ones you're interested in through a web search.
Azure Info Hub
This site is an unofficial resource that pulls together most of the resources listed here. Follow links to
individual services to get detailed information and find service-specific blogs.
Subscribe via RSS.
Deploy a migration landing zone
Migration landing zone is a term used to describe an environment that has been provisioned and prepared to host
workloads being migrated from an on-premises environment into Azure. A migration landing zone is the final
deliverable of the Azure setup guide. This article ties together all of the readiness subjects discussed in this guide
and applies the decisions made to the deployment of your first migration landing zone.
The following sections outline a landing zone commonly used to establish an environment that's suitable for use
during a migration. The environment or landing zone described in this article is also captured in an Azure
blueprint. You can use the Cloud Adoption Framework migrate landing zone blueprint to deploy the defined
environment with a single click.
Blueprint alignment
The following image shows the Cloud Adoption Framework migrate landing zone blueprint in relation to
architectural complexity and compliance requirements.
The letter A sits inside a curved line that marks the scope of this blueprint. That scope conveys that this
blueprint covers limited architectural complexity but is built on relatively mid-line compliance requirements.
Customers who have a high degree of complexity and stringent compliance requirements might be better
served by using a partner's extended blueprint or one of the standards-based blueprint samples.
Most customers' needs will fall somewhere between these two extremes. The letter B represents the process
outlined in the landing zone considerations articles. Customers in this space can use the decision guides found
in those articles to identify nodes to add to the Cloud Adoption Framework migrate landing zone blueprint. This
approach allows you to customize the blueprint to fit your needs.
Assumptions
The following assumptions or constraints were used when this initial landing zone was defined. If these
assumptions align with your constraints, you can use the blueprint to create your first landing zone. The blueprint
also can be extended to create a landing zone blueprint that meets your unique constraints.
Subscription limits: This adoption effort isn't expected to exceed subscription limits. Two common indicators
are an excess of 25,000 VMs or 10,000 vCPUs.
Compliance: No third-party compliance requirements are needed in this landing zone.
Architectural complexity: Architectural complexity doesn't require additional production subscriptions.
Shared services: There are no existing shared services in Azure that require this subscription to be treated like
a spoke in a hub and spoke architecture.
If these assumptions seem aligned with your current environment, then this blueprint might be a good place to
start building your landing zone.
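The subscription-limit assumption above can be turned into a quick pre-flight check. The thresholds are the indicative figures quoted in this article (common indicators, not hard Azure service limits):

```python
# Indicative scale thresholds quoted in this article, not hard service limits.
VM_THRESHOLD = 25_000
VCPU_THRESHOLD = 10_000

def fits_single_subscription(planned_vms: int, planned_vcpus: int) -> bool:
    """Quick pre-flight check: does the planned adoption effort stay
    under the article's indicative single-subscription thresholds?"""
    return planned_vms < VM_THRESHOLD and planned_vcpus < VCPU_THRESHOLD

print(fits_single_subscription(1_200, 4_800))    # → True
print(fits_single_subscription(30_000, 9_000))   # → False (VM count too high)
```

If the check fails, the other assumptions (shared services, architectural complexity) are also worth revisiting, since they usually break at the same scale.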
Decisions
The following decisions are represented in the landing zone blueprint.
Migration tools: Azure Site Recovery will be deployed, and an Azure Migrate project will be created. For more information, see the migration tools decision guide.
Identity: It's assumed that the subscription is already associated with an Azure Active Directory instance. For more information, see the identity management best practices.
Naming and tagging standards: N/A. For more information, see the naming and tagging best practices.
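Naming and tagging standards, once chosen, are often enforced with lightweight validation before (or alongside) Azure Policy. A hypothetical illustration; the naming convention and required tags below are invented for this sketch, not a standard Azure recommends:

```python
import re

# Hypothetical convention for illustration:
# <resource-type>-<workload>-<environment>, for example "vm-payroll-prod".
NAME_PATTERN = re.compile(r"^[a-z]+-[a-z0-9]+-(dev|test|prod)$")
REQUIRED_TAGS = {"owner", "cost-center", "environment"}

def check_resource(name: str, tags: dict) -> list:
    """Return a list of problems; an empty list means compliant."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append(f"name '{name}' violates the naming convention")
    missing = REQUIRED_TAGS - tags.keys()
    if missing:
        problems.append(f"missing tags: {sorted(missing)}")
    return problems

print(check_resource(
    "vm-payroll-prod",
    {"owner": "ops", "cost-center": "cc42", "environment": "prod"},
))  # → []
```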
Next steps
After a migration landing zone is deployed, you're ready to migrate workloads to Azure. For guidance on the tools
and processes that are required to migrate your first workload, see the Azure migration guide.
Migrate your first workload with the Azure migration guide
Landing zone considerations
A landing zone is the basic building block of any cloud adoption environment. The term landing zone refers to an
environment that's been provisioned and prepared to host workloads in a cloud environment like Azure. A fully
functioning landing zone is the final deliverable of any iteration of the Cloud Adoption Framework's Ready
methodology.
This image shows the major considerations for implementing any landing zone deployment. The considerations
can be broken into three categories: hosting, Azure fundamentals, and governance.
Hosting considerations
All landing zones provide structure for hosting options. The structure is created explicitly through governance
controls or organically through the adoption of services within the landing zone. The following articles can help
you make decisions that will be reflected in the blueprint or other automation scripts that create your landing
zone:
Compute decisions. To minimize operational complexity, align compute options with the purpose of the
landing zone. This decision can be enforced by using automation toolchains, like Azure Policy initiatives and
landing zone blueprints.
Storage decisions. Choose the right Azure Storage solution to support your workload requirements.
Networking decisions. Choose the networking services, tools, and architectures to support your
organization's workload, governance, and connectivity requirements.
Database decisions. Determine which database technology is best suited for your workload requirements.
Azure fundamentals
Each landing zone is part of a broader solution for organizing resources across a cloud environment. Azure
fundamentals are the foundational building blocks for organization.
Azure fundamental concepts. Learn fundamental concepts and terms that are used to organize resources in
Azure, and how the concepts relate to one another.
Resource consistency decision guide. When you understand each of the fundamentals, the resource
organization decision guide can help you make decisions that shape the landing zone.
Governance considerations
The Cloud Adoption Framework's Govern methodologies establish a process for governing the environment as a
whole. However, there are many use cases that might require you to make governance decisions on a per-landing
zone basis. In many scenarios, governance baselines are enforced on a per-landing zone basis, even though the
baselines are established holistically. This is especially true for the first few landing zones that an organization deploys.
The following articles can help you make governance-related decisions about your landing zone. You can factor
each decision into your governance baselines.
Cost requirements. Based on an organization's motivation for cloud adoption and operational commitments
made about its environment, various cost management configurations might need to be changed for the
landing zone.
Monitoring decisions. Depending on the operational requirements for a landing zone, various monitoring
tools can be deployed. The monitoring decisions article can help you determine the most appropriate tools to
deploy.
Using role-based access control. Azure role-based access control (RBAC) offers fine-grained, group-based
access management for resources that are organized around user roles.
Policy decisions. Azure Blueprints samples provide premade compliance blueprints, each with predefined
policy initiatives. Policy decisions help inform a selection of the best blueprint or policy initiative based on your
requirements and constraints.
Create hybrid cloud consistency. Create hybrid cloud solutions that give your organization the benefits of
cloud innovation while maintaining many of the conveniences of on-premises management.
Azure fundamental concepts
Learn fundamental concepts and terms that are used in Azure, and how the concepts relate to one another.
Azure terminology
It's helpful to know the following definitions as you begin your Azure cloud adoption efforts:
Resource: An entity that's managed by Azure. Examples include Azure virtual machines, virtual networks, and
storage accounts.
Subscription: A logical container for your resources. Each Azure resource is associated with only one
subscription. Creating a subscription is the first step in adopting Azure.
Azure account: The email address that you provide when you create an Azure subscription is the Azure
account for the subscription. The party that's associated with the email account is responsible for the monthly
costs that are incurred by the resources in the subscription. When you create an Azure account, you provide
contact information and billing details, like a credit card. You can use the same Azure account (email address)
for multiple subscriptions. Each subscription is associated with only one Azure account.
Account administrator: The party associated with the email address that's used to create an Azure
subscription. The account administrator is responsible for paying for all costs that are incurred by the
subscription's resources.
Azure Active Directory (Azure AD): The Microsoft cloud-based identity and access management service.
Azure AD allows your employees to sign in and access resources.
Azure AD tenant: A dedicated and trusted instance of Azure AD. An Azure AD tenant is automatically created
when your organization first signs up for a Microsoft cloud service subscription like Microsoft Azure, Microsoft
Intune, or Office 365. An Azure tenant represents a single organization.
Azure AD directory: Each Azure AD tenant has a single, dedicated, and trusted directory. The directory
includes the tenant's users, groups, and apps. The directory is used to perform identity and access management
functions for tenant resources. A directory can be associated with multiple subscriptions, but each subscription
is associated with only one directory.
Resource groups: Logical containers that you use to group related resources in a subscription. Each resource
can exist in only one resource group. Resource groups allow for more granular grouping within a subscription
and are commonly used to represent a collection of assets required to support a workload, application, or
specific function within a subscription.
Management groups: Logical containers that you use for one or more subscriptions. You can define a
hierarchy of management groups, subscriptions, resource groups, and resources to efficiently manage access,
policies, and compliance through inheritance.
Region: A set of Azure datacenters that are deployed inside a latency-defined perimeter. The datacenters are
connected through a dedicated, regional, low-latency network. Most Azure resources run in a specific Azure
region.
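The hierarchy of management groups, subscriptions, and resource groups matters mostly because access and policy assignments inherit down through it. A toy model of that inheritance (the scope names are hypothetical):

```python
class Scope:
    """Toy model of the management group / subscription / resource group
    hierarchy: policy assignments flow down through inheritance, as
    described above. Illustration only, not an Azure SDK type."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.policies = set()  # policies assigned directly at this scope

    def effective_policies(self):
        """Direct assignments plus everything inherited from ancestors."""
        inherited = self.parent.effective_policies() if self.parent else set()
        return inherited | self.policies

root = Scope("contoso-root")                  # management group
root.policies.add("allowed-regions")
sub = Scope("landing-zone-sub", parent=root)  # subscription under the group
rg = Scope("rg-workload", parent=sub)         # resource group in the subscription
rg.policies.add("require-tags")

print(sorted(rg.effective_policies()))  # → ['allowed-regions', 'require-tags']
```

This is why assigning a policy once at a management group is usually preferable to repeating it on every subscription.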
NOTE
When you sign up for Azure, you might see the phrase create an Azure account. You create an Azure account when you
create an Azure subscription and associate the subscription with an email account.
NOTE
Most Azure resources are deployed to a specific region. However, certain resource types are considered global resources,
such as policies that you set by using the Azure Policy services.
Related resources
The following resources provide detailed information about the concepts discussed in this article:
How does Azure work?
Resource access management in Azure
Azure Resource Manager overview
Role-based access control (RBAC) for Azure resources
What is Azure Active Directory?
Associate or add an Azure subscription to your Azure Active Directory tenant
Topologies for Azure AD Connect
Subscriptions, licenses, accounts, and tenants for Microsoft's cloud offerings
Next steps
Now that you understand fundamental Azure concepts, learn how to scale with multiple Azure subscriptions.
Scale with multiple Azure subscriptions
Review your compute options
Determining the compute requirements for hosting your workloads is a key consideration as you prepare for your
cloud adoption. Azure compute products and services support a wide variety of workload computing scenarios
and capabilities. How you configure your landing zone environment to support your compute requirements
depends on your workload's governance, technical, and business requirements.
NOTE
Learn more about how to assess compute options for each of your applications or services in the Azure application
architecture guide.
Key questions
Answer the following questions about your workloads to help you make decisions based on the Azure compute
services decision tree:
Are you building net new applications and services or migrating from existing on-premises
workloads? Developing new applications as part of your cloud adoption efforts allows you to take full
advantage of modern cloud-based hosting technologies from the design phase on.
If you're migrating existing workloads, can they take advantage of modern cloud technologies?
Migrating on-premises workloads requires analysis: Can you easily optimize existing applications and services
to take advantage of modern cloud technologies, or will a lift-and-shift approach work better for your
workloads?
Can your applications or services take advantage of containers? If your applications are good candidates
for containerized hosting, you can take advantage of the resource efficiency, scalability, and orchestration
capabilities provided by Azure container services. Both Azure Disk Storage and Azure Files services can be
used for persistent storage for containerized applications.
Are your applications web-based or API-based, and do they use PHP, ASP.NET, Node.js, or similar
technologies? Web apps can be deployed to managed Azure App Service instances, so you don't have to
maintain virtual machines for hosting purposes.
Will you require full control over the OS and hosting environment of your workload? If you need to
control the hosting environment, including OS, disks, locally running software, and other configurations, you
can use Azure Virtual Machines to host your applications and services. In addition to choosing your virtual
machine sizes and performance tiers, your decisions regarding virtual disk storage will affect performance and
SLAs related to your infrastructure as a service (IaaS)-based workloads. For more information, see the Azure
Disk Storage documentation.
Will your workload involve high-performance computing (HPC) capabilities? Azure Batch provides job
scheduling and autoscaling of compute resources as a platform service, so it's easy to run large-scale parallel
and HPC applications in the cloud.
Will your applications use a microservices architecture? Applications that use a microservices-based
architecture can take advantage of several optimized compute technologies. Self-contained, event-driven
workloads can use Azure Functions to build scalable, serverless applications that don't need an infrastructure.
For applications that require more control over the environment where microservices run, you can use
container services like Azure Container Instances, Azure Kubernetes Service, and Azure Service Fabric.
NOTE
Most Azure compute services are used in combination with Azure Storage. Consult the storage decisions guidance for
related storage decisions.
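The questions above can be collapsed into a very simplified decision sketch. Real choices depend on many more factors (the Azure compute services decision tree covers them), so treat this as an illustration of the flow, not a complete selector:

```python
def suggest_compute(workload: dict) -> str:
    """Simplified rendering of the decision questions above.
    The keys are invented flags for this sketch."""
    if workload.get("hpc"):
        return "Azure Batch"
    if workload.get("microservices") and workload.get("event_driven"):
        return "Azure Functions"
    if workload.get("containerized"):
        return "Azure Kubernetes Service"
    if workload.get("web_or_api"):
        return "Azure App Service"
    if workload.get("needs_os_control"):
        return "Azure Virtual Machines"
    return "review the Azure compute services decision tree"

print(suggest_compute({"web_or_api": True}))  # → Azure App Service
print(suggest_compute({"hpc": True}))         # → Azure Batch
```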
Scenario: I need to provision Linux and Windows virtual machines in seconds with the configurations of my choice.
Compute service: Azure Virtual Machines.

Scenario: I need to achieve high availability by autoscaling to create thousands of VMs in minutes.
Compute service: Virtual machine scale sets.

Scenario: I want to simplify the deployment, management, and operations of Kubernetes.
Compute service: Azure Kubernetes Service (AKS).

Scenario: I want to quickly create cloud apps for web and mobile by using a fully managed platform.
Compute service: Azure App Service.

Scenario: I want to containerize apps and easily run containers by using a single command.
Compute service: Azure Container Instances.

Scenario: I need to create highly available, scalable cloud applications and APIs that can help me focus on apps instead of hardware.
Compute service: Azure Cloud Services.
Regional availability
Azure lets you deliver services at the scale you need to reach your customers and partners wherever they are. A
key factor in planning your cloud deployment is to determine which Azure region will host your workload
resources.
Some compute options, such as Azure App Service, are generally available in most Azure regions. However, some
compute services are supported only in select regions. Some virtual machine types and their associated storage
types have limited regional availability. Before you decide which regions you will deploy your compute resources
to, we recommend that you refer to the regions page to check the latest status of regional availability.
To learn more about the Azure global infrastructure, see the Azure regions page. You can also view products
available by region for specific details about the overall services that are available in each Azure region.
Designing and implementing Azure networking capabilities is a critical part of your cloud adoption efforts. You'll
need to make networking design decisions to properly support the workloads and services that will be hosted in
the cloud. Azure networking products and services support a wide variety of networking capabilities. How you
structure these services and the networking architectures you choose depends on your organization's workload,
governance, and connectivity requirements.
Scenario: I need to balance inbound and outbound connections and requests to my applications or services.
Networking service: Azure Load Balancer.

Scenario: I want to optimize delivery from application server farms while increasing application security with a web application firewall.
Networking services: Azure Application Gateway; Azure Front Door Service.

Scenario: I need to securely use the internet to access Azure Virtual Network through high-performance VPN gateways.
Networking service: Azure VPN Gateway.

Scenario: I need to accelerate the delivery of high-bandwidth content to customers worldwide, from applications and stored content to streaming video.
Networking service: Azure Content Delivery Network.

Scenario: I need to protect my Azure applications from DDoS attacks.
Networking service: Azure DDoS Protection.

Scenario: I need to distribute traffic optimally to services across global Azure regions, while providing high availability and responsiveness.
Networking services: Azure Traffic Manager; Azure Front Door Service.

Scenario: I need native firewall capabilities, with built-in high availability, unrestricted cloud scalability, and zero maintenance.
Networking service: Azure Firewall.

Scenario: I need to connect business offices, retail locations, and sites securely.
Networking service: Azure Virtual WAN.

Scenario: I need a scalable, security-enhanced delivery point for global microservices-based web applications.
Networking service: Azure Front Door Service.

Scenario: You need to deploy and manage a large number of VMs and workloads, potentially exceeding Azure subscription limits; you need to share services across subscriptions; or you need a more segmented structure for role, application, or permission segregation.
Networking architecture: Hub and spoke.

Scenario: You have many branch offices that need to connect to each other and to Azure.
Networking service: Azure Virtual WAN.
Storage capabilities are critical for supporting workloads and services that are hosted in the cloud. As part of your
cloud adoption readiness preparations, review this article to help you plan for and address your storage needs.
Scenario: I have bare-metal servers or VMs (Hyper-V or VMware) with direct attached storage running LOB applications.
Suggested Azure services: Azure Disk Storage (Premium SSD).
Considerations: For production services, the Premium SSD option provides consistent low latency coupled with high IOPS and throughput.

Scenario: I have servers that will host web and mobile apps.
Suggested Azure services: Azure Disk Storage (Standard SSD).
Considerations: Standard SSD IOPS and throughput might be sufficient (at a lower cost than Premium SSD) for CPU-bound web and app servers in production.

Scenario: I have an enterprise SAN or all-flash array (AFA).
Suggested Azure services: Azure Disk Storage (Premium or Ultra SSD); Azure NetApp Files.
Considerations: Ultra SSD is NVMe-based and offers submillisecond latency with high IOPS and bandwidth. Ultra SSD is scalable up to 64 TiB. The choice of Premium SSD and Ultra SSD depends on peak latency, IOPS, and scalability requirements.

Scenario: I have high-availability (HA) clustered servers (such as SQL Server FCI or Windows Server failover clustering).
Suggested Azure services: Azure Files (Premium); Azure Disk Storage (Premium or Ultra SSD).
Considerations: Clustered workloads require multiple nodes to mount the same underlying shared storage for failover or HA. Premium file shares offer shared storage that's mountable via SMB. Shared block storage also can be configured on Premium SSD or Ultra SSD by using partner solutions.

Scenario: I have a relational database or data warehouse workload (such as SQL Server or Oracle).
Suggested Azure services: Azure Disk Storage (Premium or Ultra SSD).
Considerations: The choice of Premium SSD versus Ultra SSD depends on peak latency, IOPS, and scalability requirements. Ultra SSD also reduces complexity by removing the need for storage pool configuration for scalability (see details).

Scenario: I have a NoSQL cluster (such as Cassandra or MongoDB).
Suggested Azure services: Azure Disk Storage (Premium SSD).
Considerations: The Azure Disk Storage Premium SSD offering provides consistent low latency coupled with high IOPS and throughput.

Scenario: I am running containers with persistent volumes.
Suggested Azure services: Azure Files (Standard or Premium); Azure Disk Storage (Standard, Premium, or Ultra SSD).
Considerations: File (RWX) and block (RWO) volume driver options are available for both Azure Kubernetes Service (AKS) and custom Kubernetes deployments. Persistent volumes can map to either an Azure Disk Storage disk or a managed Azure Files share. Choose premium versus standard options based on workload requirements for persistent volumes.

Scenario: I have a data lake (such as a Hadoop cluster for HDFS data).
Suggested Azure services: Azure Data Lake Storage Gen 2; Azure Disk Storage (Standard or Premium SSD).
Considerations: The Data Lake Storage Gen 2 feature of Azure Blob storage provides server-side HDFS compatibility and petabyte scale for parallel analytics. It also offers HA and reliability. Software like Cloudera can use Premium or Standard SSD on master/worker nodes, if needed.

Scenario: I have an SAP or SAP HANA deployment.
Suggested Azure services: Azure Disk Storage (Premium or Ultra SSD).
Considerations: Ultra SSD is optimized to offer submillisecond latency for tier-1 SAP workloads. Ultra SSD is now in preview. Premium SSD coupled with M-Series offers a general availability (GA) option.

Scenario: I have a disaster recovery site with strict RPO/RTO that syncs from my primary servers.
Suggested Azure services: Azure page blobs.
Considerations: Azure page blobs are used by replication software to enable low-cost replication to Azure without the need for compute VMs until failover occurs. For more information, see the Azure Disk Storage documentation. Note: Page blobs support a maximum of 8 TB.

Scenario: I use Windows File Server.
Suggested Azure services: Azure Files; Azure File Sync.
Considerations: With Azure File Sync, you can store rarely used data on cloud-based Azure file shares while caching your most frequently used files on-premises for fast, local access. You can also use multisite sync to keep files in sync across multiple servers. If you plan to migrate your workloads to a cloud-only deployment, Azure Files might be sufficient.

Scenario: I have an enterprise NAS (such as NetApp Filers or Dell-EMC Isilon).
Suggested Azure services: Azure NetApp Files; Azure Files (Premium).
Considerations: If you have an on-premises deployment of NetApp, consider using Azure NetApp Files to migrate your deployment to Azure. If you use or will migrate to Windows Server or a Linux server, or you have basic functionality needs from a file share, consider using Azure Files. For continued on-premises access, use Azure File Sync to sync Azure file shares with on-premises file shares by using a cloud tiering mechanism.

Scenario: I have a file share (SMB or NFS).
Suggested Azure services: Azure Files (Standard or Premium); Azure NetApp Files.
Considerations: The choice of Premium versus Standard Azure Files tiers depends on IOPS, throughput, and your need for latency consistency. If you have an on-premises deployment of NetApp, consider using Azure NetApp Files. If you need to migrate your access control lists (ACLs) and timestamps to the cloud, Azure File Sync can bring all these settings to your Azure file shares as a convenient migration path.

Scenario: I have an on-premises object storage system for petabytes of data (such as Dell-EMC ECS).
Suggested Azure services: Azure Blob storage.
Considerations: Azure Blob storage provides premium, hot, cool, and archive tiers to match your workload performance and cost needs.

Scenario: I have a DFSR deployment or another way of handling branch offices.
Suggested Azure services: Azure Files; Azure File Sync.
Considerations: Azure File Sync offers multisite sync to keep files in sync across multiple servers and native Azure file shares in the cloud. Move to a fixed storage footprint on-premises by using cloud tiering. Cloud tiering transforms your server into a cache for the relevant files while scaling cold data in Azure file shares.

Scenario: I have a tape library (either on-premises or offsite) for backup and disaster recovery or long-term data retention.
Suggested Azure services: Azure Blob storage (cool or archive tiers).
Considerations: An Azure Blob storage archive tier will have the lowest possible cost, but it might require hours to copy the offline data to a cool, hot, or premium tier of storage to allow access. Cool tiers provide instantaneous access at low cost.

Scenario: I have file or object storage configured to receive my backups.
Suggested Azure services: Azure Blob storage (cool or archive tiers); Azure File Sync.
Considerations: To back up data for long-term retention with lowest-cost storage, move data to Azure Blob storage and use cool and archive tiers. To enable fast disaster recovery for file data on a server (on-premises or on an Azure VM), sync shares to individual Azure file shares by using Azure File Sync. With Azure file share snapshots, you can restore earlier versions and sync them back to connected servers or access them natively in the Azure file share.

Scenario: I run data replication to a disaster recovery site.
Suggested Azure services: Azure Files; Azure File Sync.
Considerations: Azure File Sync removes the need for a disaster recovery server and stores files in native Azure SMB shares. Fast disaster recovery rebuilds any data on a failed on-premises server quickly. You can even keep multiple server locations in sync or use cloud tiering to store only relevant data on-premises.

Scenario: I manage data transfer in disconnected scenarios.
Suggested Azure services: Azure Data Box Edge or Azure Data Box Gateway.
Considerations: Using Data Box Edge or Data Box Gateway, you can copy data in disconnected scenarios. When the gateway is offline, it saves all files you copy in the cache, then uploads them when you're connected.

Scenario: I manage an ongoing data pipeline to the cloud.
Suggested Azure services: Azure Data Box Edge or Azure Data Box Gateway.
Considerations: Move data to the cloud from systems that are constantly generating data just by having them copy that data straight to the storage gateway. If they need to access that data later, it's right there where they put it.

Scenario: I have bursts of quantities of data that arrive at the same time.
Suggested Azure services: Azure Data Box Edge or Azure Data Box Gateway.
Considerations: Manage large quantities of data that arrive at the same time, like when an autonomous car pulls back into the garage, or a gene sequencing machine finishes its analysis. Copy all that data to Data Box Gateway at fast local speeds, and then let the gateway upload it as your network allows.

Scenario: I need to support "burst compute": NFS/SMB read-heavy, file-based workloads with data assets that reside on-premises while computation runs in the cloud.
Suggested Azure services: Avere vFXT for Azure.
Considerations: IaaS scale-out NFS/SMB file caching.

Scenario: I need to move file shares that aren't Windows Server or NetApp to the cloud.
Suggested Azure services: Azure Files; Azure NetApp Files.
Considerations: Protocol support, regional availability, performance requirements, snapshot and clone capabilities, and price sensitivity.
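Several of the scenarios above come down to matching latency and IOPS requirements to a disk tier. The sketch below captures that trade-off in code; the cutoff values are illustrative assumptions for this example, not published service limits:

```python
def suggest_disk_tier(p99_latency_ms: float, iops: int) -> str:
    """Rough disk-tier picker based on the latency/IOPS trade-offs
    discussed above. The numeric cutoffs are invented for illustration,
    not Azure's published limits."""
    if p99_latency_ms < 1.0 or iops > 100_000:
        return "Ultra SSD"      # submillisecond latency, highest IOPS
    if p99_latency_ms < 10.0 or iops > 5_000:
        return "Premium SSD"    # consistent low latency for production
    return "Standard SSD"       # cost-effective for CPU-bound servers

print(suggest_disk_tier(p99_latency_ms=0.5, iops=150_000))  # → Ultra SSD
print(suggest_disk_tier(p99_latency_ms=5.0, iops=8_000))    # → Premium SSD
```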
Azure Blob storage: Microsoft's object storage solution for the cloud. Blob storage is optimized for storing massive amounts of unstructured data, which is data that doesn't adhere to a specific data model or definition, such as text or binary data.

Azure Data Lake Storage Gen 2: Blob storage supports Azure Data Lake Storage Gen2, Microsoft's enterprise big data analytics solution for the cloud. Azure Data Lake Storage Gen2 offers a hierarchical file system as well as the advantages of Blob storage, including low-cost, tiered storage; high availability; strong consistency; and disaster recovery capabilities.

Azure Disk Storage: Persistent, high-performance block storage to power Azure virtual machines. Azure disks are highly durable and secure, and they offer the industry's only single-instance SLA for VMs that use premium or ultra SSDs (learn more about disk types). Azure disks provide high availability with Availability Sets and Availability Zones that map to your Azure virtual machine fault domains. In addition, Azure disks are managed as a top-level resource in Azure, and Azure Resource Manager capabilities like role-based access control (RBAC), policy, and tagging are provided by default.

Azure Files: Fully managed, native SMB file shares as a service, without the need to run a VM. You can mount an Azure Files share as a network drive to any Azure VM or on-premises machine.

Azure File Sync: Centralizes your organization's file shares in Azure Files, while keeping the flexibility, performance, and compatibility of an on-premises file server. Azure File Sync transforms Windows Server into a quick cache of your Azure file share.

Azure NetApp Files: An enterprise-class, high-performance, metered file storage service. Azure NetApp Files supports any workload type and is highly available by default. You can select service and performance levels and set up snapshots through the service.

Azure Data Box Gateway: A storage solution that enables you to seamlessly send data to Azure. Data Box Gateway is a virtual device based on a virtual machine provisioned in your virtualized environment or hypervisor. The virtual device resides on-premises, and you write data to it by using the NFS and SMB protocols. The device then transfers your data to Azure block blobs, Azure page blobs, or Azure Files.

Avere vFXT for Azure: A filesystem caching solution for data-intensive high-performance computing (HPC) tasks. Take advantage of cloud computing's scalability to make your data accessible when and where it's needed, even for data that's stored in your own on-premises hardware.
Security
To help you protect your data in the cloud, Azure Storage offers several best practices for data security and
encryption for data at rest and in transit. You can:
Secure the storage account by using RBAC and Azure AD.
Secure data in transit between an application and Azure by using client-side encryption, HTTPS, or SMB 3.0.
Set data to be automatically encrypted when it's written to Azure Storage by using storage service encryption.
Grant delegated access to the data objects in Azure Storage by using shared access signatures.
Use analytics to track the authentication method that someone is using when they access storage in Azure.
These security features apply to Azure Blob storage (block and page) and to Azure Files. Get detailed storage
security guidance in the Azure Storage security guide.
Storage service encryption provides encryption at rest and safeguards your data to meet your organization's
security and compliance commitments. Storage service encryption is enabled by default for all managed disks,
snapshots, and images in all the Azure regions. Starting June 10, 2017, all new managed disks, snapshots, images,
and new data written to existing managed disks are automatically encrypted at rest with keys managed by
Microsoft. Visit the FAQ for managed disks for more details.
Azure Disk Encryption allows you to encrypt managed disks that are attached to IaaS VMs as OS and data disks
at rest and in transit by using your keys stored in Azure Key Vault. For Windows, the drives are encrypted by using
industry-standard BitLocker encryption technology. For Linux, the disks are encrypted by using the dm-crypt
subsystem. The encryption process is integrated with Azure Key Vault to allow you to control and manage the disk
encryption keys. For more information, see Azure Disk Encryption for Windows and Linux IaaS VMs.
Regional availability
You can use Azure to deliver services at the scale that you need to reach your customers and partners wherever
they are. The managed disks and Azure Storage regional availability pages show the regions where these services
are available. Checking the regional availability of a service beforehand can help you make the right decision for
your workload and customer needs.
Managed disks are available in all Azure regions that have Premium SSD and Standard SSD offerings. Although Ultra SSD is currently in public preview, it's offered in only a single availability zone in the East US 2 region. Verify regional availability when you plan mission-critical, top-tier workloads that require Ultra SSD.
Hot and cool blob storage, Data Lake Storage Gen2, and Azure Files storage are available in all Azure regions. Archive blob storage, premium file shares, and premium block blob storage are limited to certain regions. We recommend that you refer to the regions page to check the latest status of regional availability.
To learn more about Azure global infrastructure, see the Azure regions page. You can also consult the products
available by region page for specific details about what's available in each Azure region.
When you prepare your landing zone environment for your cloud adoption, you need to determine the data
requirements for hosting your workloads. Azure database products and services support a wide variety of data
storage scenarios and capabilities. How you configure your landing zone environment to support your data
requirements depends on your workload governance, technical, and business requirements.
NOTE
Learn more about how to assess database options for each of your applications or services in the Azure application
architecture guide.
SCENARIO DATA SERVICE
I need a fully managed relational database that provisions quickly, scales on the fly, and includes built-in intelligence and security: Azure SQL Database.
I need a fully managed, scalable MySQL relational database that has high availability and security built in at no extra cost: Azure Database for MySQL.
I need a fully managed, scalable PostgreSQL relational database that has high availability and security built in at no extra cost: Azure Database for PostgreSQL.
I plan to host enterprise SQL Server apps in the cloud and have full control over the server OS: SQL Server on Virtual Machines.
I need a fully managed elastic data warehouse that has security at every level of scale at no extra cost: Azure SQL Data Warehouse.
I need data lake storage resources that are capable of supporting Hadoop clusters or HDFS data: Azure Data Lake.
I need high throughput and consistent, low-latency access for my data to support fast, scalable applications: Azure Cache for Redis.
I need a fully managed, scalable MariaDB relational database that has high availability and security built in at no extra cost: Azure Database for MariaDB.
Regional availability
Azure lets you deliver services at the scale you need to reach your customers and partners, wherever they are. A
key factor in planning your cloud deployment is to determine what Azure region will host your workload
resources.
Most database services are generally available in most Azure regions. However, there are a few regions, mostly
targeting governmental customers, that support only a subset of these products. Before you decide which regions
you will deploy your database resources to, we recommend that you refer to the regions page to check the latest
status of regional availability.
To learn more about Azure global infrastructure, see the Azure regions page. You can also view products available
by region for specific details about the overall services that are available in each Azure region.
Group-based access rights and privileges are a good practice. Dealing with groups rather than individual users simplifies maintenance of access policies, provides consistent access management across teams, and reduces configuration errors. Assigning users to and removing users from appropriate groups helps keep a specific user's privileges up to date. Azure role-based access control (RBAC) offers fine-grained access management for resources organized around user roles.
For an overview of recommended RBAC practices as part of an identity and security strategy, see Azure identity
management and access control security best practices.
For detailed instructions for assigning users and groups to specific roles and assigning roles to scopes, see
Manage access to Azure resources using RBAC.
When planning your access control strategy, use a least-privilege access model that grants users only the
permissions required to perform their work. The following diagram shows a suggested pattern for using RBAC
through this approach.
NOTE
The more specific and detailed the permissions you define, the more likely it is that your access controls will become complex and difficult to manage. This is especially true as your cloud estate grows in size. Avoid resource-specific permissions. Instead, use management groups for enterprise-wide access control and resource groups for access control within subscriptions. Also avoid user-specific permissions. Instead, assign access to groups in Azure AD.
Development, test, and operations (DevOps): builds and deploys workload features and applications.
The breakdown of actions and permissions in these standard roles are often the same across your applications,
subscriptions, or entire cloud estate, even if these roles are performed by different people at different levels.
Accordingly, you can create a common set of RBAC role definitions to apply across different scopes within your
environment. Users and groups can then be assigned a common role, but only for the scope of resources, resource
groups, subscriptions, or management groups that they're responsible for managing.
For example, in a hub and spoke networking topology with multiple subscriptions, you might have a common set
of role definitions for the hub and all workload spokes. A hub subscription's NetOps role can be assigned to
members of the organization's central IT staff, who are responsible for maintaining networking for shared services
used by all workloads. A workload spoke subscription's NetOps role can then be assigned to members of that
specific workload team, allowing them to configure networking within that subscription to best support their
workload requirements. The same role definition is used for both, but scope-based assignments ensure that users
have only the access that they need to perform their job.
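The scope-based pattern above can be sketched in a few lines. This is an illustrative model, not the Azure RBAC API: the role names, principals, and scope paths are hypothetical, and the prefix check stands in for Azure's scope inheritance (permissions granted at a scope apply to everything beneath it).

```python
class RoleAssignment:
    def __init__(self, principal, role, scope):
        self.principal = principal  # user or group name
        self.role = role            # shared role definition, e.g. "NetOps"
        self.scope = scope          # e.g. "/subscriptions/hub"

def has_access(assignments, principal, role, resource_scope):
    """A principal holds a role at a resource if some assignment's scope
    is a prefix of the resource's scope (scopes inherit downward)."""
    return any(
        a.principal == principal and a.role == role
        and resource_scope.startswith(a.scope)
        for a in assignments
    )

assignments = [
    # Central IT manages networking in the hub subscription only.
    RoleAssignment("central-it", "NetOps", "/subscriptions/hub"),
    # The workload team holds the same role definition, scoped to its spoke.
    RoleAssignment("workload-team", "NetOps", "/subscriptions/spoke-a"),
]

# The same role definition, but scope-based assignment limits reach:
print(has_access(assignments, "central-it", "NetOps", "/subscriptions/hub/vnet1"))       # True
print(has_access(assignments, "workload-team", "NetOps", "/subscriptions/hub/vnet1"))    # False
print(has_access(assignments, "workload-team", "NetOps", "/subscriptions/spoke-a/vnet")) # True
```

Both teams share one role definition; only the scope of each assignment differs, which is what keeps the number of role definitions manageable as the estate grows.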
Create hybrid cloud consistency
This article guides you through the high-level approaches for creating hybrid cloud consistency.
Hybrid deployment models during migration can reduce risk and contribute to a smooth infrastructure transition.
Cloud platforms offer the greatest level of flexibility when it comes to business processes. Many organizations are
hesitant to make the move to the cloud. Instead, they prefer to keep full control over their most sensitive data.
Unfortunately, on-premises servers don't allow for the same rate of innovation as the cloud. A hybrid cloud
solution offers the speed of cloud innovation and the control of on-premises management.
Figure 1 - Creating hybrid cloud consistency across identity, management, security, data, development, and
DevOps.
A true hybrid cloud solution must provide four components, each of which brings significant benefits:
Common identity for on-premises and cloud applications: This component improves user productivity by giving users single sign-on (SSO) to all their applications. It also ensures consistency as applications and users cross network or cloud boundaries.
Integrated management and security across your hybrid cloud: This component provides you with a
cohesive way to monitor, manage, and secure the environment, which enables increased visibility and control.
A consistent data platform for the datacenter and the cloud: This component creates data portability,
combined with seamless access to on-premises and cloud data services for deep insight into all data sources.
Unified development and DevOps across the cloud and on-premises datacenters: This component
allows you to move applications between the two environments as needed. Developer productivity improves
because both locations now have the same development environment.
Here are some examples of these components from an Azure perspective:
Azure Active Directory (Azure AD) works with on-premises Active Directory to provide common identity for all users. SSO across on-premises and cloud environments makes it simple for users to safely access the applications and assets they need. Admins can manage security and governance controls and also have the flexibility to adjust permissions without affecting the user experience.
Azure provides integrated management and security services for both cloud and on-premises infrastructure.
These services include an integrated set of tools that are used to monitor, configure, and protect hybrid clouds.
This end-to-end approach to management specifically addresses real-world challenges that face organizations
considering a hybrid cloud solution.
Azure hybrid cloud provides common tools that ensure secure access to all data, seamlessly and efficiently.
Azure data services combine with Microsoft SQL Server to create a consistent data platform. A consistent
hybrid cloud model allows users to work with both operational and analytical data. The same services are
provided on-premises and in the cloud for data warehousing, data analysis, and data visualization.
Azure cloud services, combined with Azure Stack on-premises, provide unified development and DevOps.
Consistency across the cloud and on-premises means that your DevOps team can build applications that run in
either environment and can easily deploy to the right location. You also can reuse templates across the hybrid
solution, which can further simplify DevOps processes.
Azure provides native services for deploying your landing zones. Third-party tools can also help with this effort. One such tool that customers and partners often use to deploy landing zones is HashiCorp's Terraform. This section shows how to use a prototype landing zone to deploy fundamental logging, accounting, and security capabilities for an Azure subscription.
Architecture diagram
The first landing zone deploys the following components in your subscription:
Capabilities
The deployed components and their purposes are as follows:
COMPONENT RESPONSIBILITY
Diagnostics logging: All operation logs are kept for a specific number of days in a storage account and in Event Hubs.
Azure Security Center: Security hygiene metrics and alerts are sent to an email address and phone number.
Assumptions
The following assumptions or constraints were considered when this initial landing zone was defined. If these
assumptions align with your constraints, you can use the blueprint to create your first landing zone. The blueprint
also can be extended to create a landing zone blueprint that meets your unique constraints.
Subscription limits: This adoption effort is unlikely to exceed subscription limits. Two common indicators are
an excess of 25,000 VMs or 10,000 vCPUs.
Compliance: No third-party compliance requirements are needed for this landing zone.
Architectural complexity: Architectural complexity doesn't require additional production subscriptions.
Shared services: There are no existing shared services in Azure that require this subscription to be treated like
a spoke in a hub and spoke architecture.
If these assumptions match your current environment, this blueprint might be a good way to start building your
landing zone.
Design decisions
The following decisions are represented in the Terraform landing zone:
Identity: It's assumed that the subscription is already associated with an Azure Active Directory instance. See the identity management best practices.
Naming standards: When the environment is created, a unique prefix is also created. Resources that require a globally unique name (such as storage accounts) use this prefix. The custom name is appended with a random suffix. Tag usage is mandated as described in the following table. See the naming and tagging best practices.
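The prefix-plus-random-suffix convention can be sketched as follows. This is a hypothetical helper, not part of the Terraform landing zone; the suffix length is illustrative, and the 24-character lowercase-alphanumeric constraint reflects Azure storage account naming rules.

```python
import random
import string

def unique_name(prefix, base, suffix_len=4, max_len=24):
    """Build a globally unique resource name: environment prefix + base
    name, truncated so a random suffix still fits within max_len."""
    suffix = "".join(random.choices(string.ascii_lowercase + string.digits, k=suffix_len))
    name = (prefix + base)[: max_len - suffix_len] + suffix
    return name.lower()

name = unique_name("dev", "hubcoresec")
print(name)  # e.g. "devhubcoresec7k2q"
```

Truncating before appending the suffix guarantees the result never exceeds the platform's length limit, no matter how long the base name is.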
Tagging standards
The following set of minimum tags must be present on all resources and resource groups:
resource_groups_hub = {
HUB-CORE-SEC = {
name = "-hub-core-sec"
location = "southeastasia"
}
HUB-OPERATIONS = {
name = "-hub-operations"
location = "southeastasia"
}
}
Next, we specify the regions where the foundations can be deployed. Here, southeastasia is used to deploy all the resources.
location_map = {
region1 = "southeastasia"
region2 = "eastasia"
}
Then, we specify the retention period for the operations logs and the Azure subscription logs. This data is stored in
separate storage accounts and an event hub, whose names are randomly generated because they must be unique.
azure_activity_logs_retention = 365
azure_diagnostics_logs_retention = 60
In tags_hub, we specify the minimum set of tags that are applied to all resources created.
tags_hub = {
environment = "DEV"
owner = "Arnaud"
deploymentType = "Terraform"
costCenter = "65182"
BusinessUnit = "SHARED"
DR = "NON-DR-ENABLED"
}
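A tag policy like this is straightforward to enforce in code. The following sketch mirrors the required keys from the tags_hub block above; the validation helper itself is hypothetical and not part of the Terraform configuration.

```python
# Required keys mirror the tags_hub block shown above.
REQUIRED_TAGS = {"environment", "owner", "deploymentType", "costCenter", "BusinessUnit", "DR"}

def missing_tags(resource_tags):
    """Return the set of required tags that a resource is missing."""
    return REQUIRED_TAGS - set(resource_tags)

tags = {"environment": "DEV", "owner": "Arnaud", "deploymentType": "Terraform"}
print(sorted(missing_tags(tags)))  # ['BusinessUnit', 'DR', 'costCenter']
```

In practice, Azure Policy can enforce the same minimum tag set at deployment time; a check like this is useful in CI before the configuration is applied.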
Then, we specify the Log Analytics workspace name and a set of solutions that analyze the deployment. Here, we retained Network Monitoring, Active Directory (AD) Assessment and Replication, Agent Health Assessment, DNS Analytics, and Key Vault Analytics.
analytics_workspace_name = "lalogs"
solution_plan_map = {
NetworkMonitoring = {
"publisher" = "Microsoft"
"product" = "OMSGallery/NetworkMonitoring"
},
ADAssessment = {
"publisher" = "Microsoft"
"product" = "OMSGallery/ADAssessment"
},
ADReplication = {
"publisher" = "Microsoft"
"product" = "OMSGallery/ADReplication"
},
AgentHealthAssessment = {
"publisher" = "Microsoft"
"product" = "OMSGallery/AgentHealthAssessment"
},
DnsAnalytics = {
"publisher" = "Microsoft"
"product" = "OMSGallery/DnsAnalytics"
},
KeyVaultAnalytics = {
"publisher" = "Microsoft"
"product" = "OMSGallery/KeyVaultAnalytics"
}
}
Get started
After you've reviewed the configuration, you can deploy it as you would deploy any Terraform environment. We recommend that you use the rover, a Docker container that allows deployment from Windows, Linux, or macOS. You can get started with the rover GitHub repository.
Next steps
The foundation landing zone lays the groundwork for a complex environment in a decomposed manner. This
edition provides a set of simple capabilities that can be extended by:
Adding other modules to the blueprint.
Layering additional landing zones on top of it.
Layering landing zones is a good practice for decoupling systems, versioning each component that you're using,
and allowing fast innovation and stability for your infrastructure as code deployment.
Future reference architectures will demonstrate this concept for a hub and spoke topology.
Review the foundation Terraform landing zone sample
The virtual datacenter: A network perspective
Overview
Migrating on-premises applications to Azure provides organizations the benefits of a secured and cost-efficient
infrastructure, even if the applications are migrated with minimal changes. However, to make the most of the
agility possible with cloud computing, enterprises should evolve their architectures to take advantage of Azure
capabilities.
Microsoft Azure delivers hyper-scale services and infrastructure with enterprise-grade capabilities and reliability.
These services and infrastructure offer many choices in hybrid connectivity so customers can choose to access
them over the public internet or over a private network connection. Microsoft partners can also provide enhanced
capabilities by offering security services and virtual appliances that are optimized to run in Azure.
With the Microsoft Azure platform, customers can seamlessly extend their infrastructure into the cloud and build
multi-tier architectures.
NOTE
It's important to understand that the VDC is NOT a discrete Azure product, but the combination of various features and
capabilities to meet your exact requirements. The VDC is a way of thinking about your workloads and Azure usage to
maximize your resources and abilities in the cloud. It's a modular approach to building up IT services in Azure while
respecting the enterprise's organizational roles and responsibilities.
A VDC implementation can help enterprises get workloads and applications into Azure for the following scenarios:
Host multiple related workloads.
Migrate workloads from an on-premises environment to Azure.
Implement shared or centralized security and access requirements across workloads.
Mix DevOps and centralized IT appropriately for a large enterprise.
Mesh: a model that uses VNet peering to connect all virtual networks directly to each other.
Hub and spoke (VNet peering): a model for designing a network topology for distributed applications/teams and delegation.
Azure Virtual WAN: a model for large-scale branch offices and global WAN services.
As shown above, two of the design types are hub and spoke (VNet peering hub and spoke, and Azure Virtual WAN). Hub and spoke designs are optimal for communication, shared resources, and centralized security policy. Hubs are built either by using a VNet peering hub (Hub Virtual Network in the diagram) or a Virtual WAN hub (Azure Virtual WAN in the diagram). Virtual WAN is good for large-scale branch-to-branch and branch-to-Azure communications, or when you want to avoid the complexity of building all the components individually in a VNet peering hub. In some cases, a VNet peering hub design is dictated by your requirements, such as the need to run a network virtual appliance in the hub.
In both hub and spoke topologies, the hub is the central network zone that controls and inspects ingress or egress
traffic between different zones: internet, on-premises, and the spokes. The hub and spoke topology gives the IT
department an effective way to enforce security policies in a central location. It also reduces the potential for
misconfiguration and exposure.
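The scaling difference between the mesh and hub and spoke models above can be quantified with simple graph arithmetic; this back-of-the-envelope sketch is illustrative and independent of any Azure API.

```python
def mesh_peerings(n_vnets):
    """A full mesh needs a peering link between every pair of VNets."""
    return n_vnets * (n_vnets - 1) // 2

def hub_spoke_peerings(n_spokes):
    """Hub and spoke needs only one hub-to-spoke peering per spoke."""
    return n_spokes

print(mesh_peerings(10))      # 45 peering links to mesh 10 VNets
print(hub_spoke_peerings(9))  # 9 links for a hub plus 9 spokes (10 VNets total)
```

The mesh count grows quadratically while the hub and spoke count grows linearly, which is one reason hub and spoke topologies stay manageable, and stay within VNet peering limits, as the number of workloads grows.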
The hub often contains the common service components consumed by the spokes. The following examples are
common central services:
The Windows Active Directory infrastructure, required for user authentication of third parties that access from untrusted networks before they get access to the workloads in the spoke. It includes the related Active Directory Federation Services (AD FS).
A Domain Name System (DNS) service to resolve naming for the workloads in the spokes, and to access resources on-premises and on the internet if Azure DNS isn't used.
A public key infrastructure (PKI), to implement single sign-on on workloads.
Flow control of TCP and UDP traffic between the spoke network zones and the internet.
Flow control between the spokes and on-premises.
If needed, flow control between one spoke and another.
The VDC reduces overall cost by using the shared hub infrastructure between multiple spokes.
The role of each spoke can be to host different types of workloads. The spokes also provide a modular approach
for repeatable deployments of the same workloads. Examples are dev and test, user acceptance testing, pre-
production, and production. The spokes can also segregate and enable different groups within your organization.
An example is DevOps groups. Inside a spoke, it's possible to deploy a basic workload or complex multi-tier
workloads with traffic control between the tiers.
Subscription limits and multiple hubs
IMPORTANT
Based on the size of your Azure deployments, you might need a multiple-hub strategy. When designing your hub and spoke strategy, ask "Can this design scale to use another hub VNet in this region?" and "Can this design scale to accommodate multiple regions?" It's far better to plan for a design that scales but is never needed than to fail to plan and need it.
When to scale to a secondary (or additional) hub depends on myriad factors, usually based on inherent limits on scale. Be sure to review the subscription, VNet, and VM limits when designing for scale.
In Azure, every component, whatever the type, is deployed in an Azure subscription. The isolation of Azure components in different Azure subscriptions can satisfy the requirements of different lines of business (LOBs), such as setting up differentiated levels of access and authorization.
A single VDC implementation can scale up to a large number of spokes, although, as with every IT system, there are platform limits. The hub deployment is bound to a specific Azure subscription, which has restrictions and limits (for example, a maximum number of VNet peerings; see Azure subscription and service limits, quotas, and constraints for details). In cases where limits might be an issue, the architecture can scale up further by extending the model from a single hub and its spokes to a cluster of hubs and spokes. Multiple hubs in one or more Azure regions can be connected by using VNet peering, ExpressRoute, Virtual WAN, or site-to-site VPN.
The introduction of multiple hubs increases the cost and management effort of the system, so it's justified only by scalability, system limits, redundancy, regional replication for end-user performance, or disaster recovery. In scenarios that require multiple hubs, all the hubs should strive to offer the same set of services for operational ease.
Interconnection between spokes
Inside a single spoke, or a flat network design, it's possible to implement complex multi-tier workloads. Multi-tier
configurations can be implemented using subnets, one for every tier or application, in the same VNet. Traffic
control and filtering are done using network security groups and user-defined routes.
An architect might want to deploy a multi-tier workload across multiple virtual networks. With virtual network
peering, spokes can connect to other spokes in the same hub or different hubs. A typical example of this scenario is
the case where application processing servers are in one spoke, or virtual network. The database deploys in a
different spoke, or virtual network. In this case, it's easy to interconnect the spokes with virtual network peering
and, by doing that, avoid transiting through the hub. A careful architecture and security review should be done to
ensure that bypassing the hub doesn’t bypass important security or auditing points that might exist only in the
hub.
Spokes can also be interconnected to a spoke that acts as a hub. This approach creates a two-level hierarchy: the
spoke in the higher level (level 0) becomes the hub of lower spokes (level 1) of the hierarchy. The spokes of a VDC
implementation are required to forward the traffic to the central hub so that the traffic can transit to its destination
in either the on-premises network or the public internet. An architecture with two levels of hubs introduces
complex routing that removes the benefits of a simple hub-spoke relationship.
Although Azure allows complex topologies, one of the core principles of the VDC concept is repeatability and
simplicity. To minimize management effort, the simple hub-spoke design is the VDC reference architecture that we
recommend.
Components
The virtual datacenter is made up of four basic component types: Infrastructure, Perimeter Networks,
Workloads, and Monitoring.
Each component type consists of various Azure features and resources. Your VDC implementation is made up of instances of multiple component types and multiple variations of the same component type. For instance, you may have many different, logically separated workload instances that represent different applications. You use these different component types and instances to ultimately build the VDC.
The preceding high-level conceptual architecture of the VDC shows different component types used in different
zones of the hub-spokes topology. The diagram shows infrastructure components in various parts of the
architecture.
As a general good practice, access rights and privileges should be group-based. Dealing with groups rather than individual users eases maintenance of access policies, provides a consistent way to manage access across teams, and helps minimize configuration errors. Assigning users to and removing them from appropriate groups helps keep a specific user's privileges up to date.
Each role group should have a unique prefix on its name. This prefix makes it easy to identify which group is associated with which workload. For example, a workload hosting an authentication service might have groups named AuthServiceNetOps, AuthServiceSecOps, AuthServiceDevOps, and AuthServiceInfraOps.
Centralized roles, or roles not related to a specific service, might be prefaced with Corp. An example is
CorpNetOps.
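The naming convention above is easy to generate programmatically. This is a hypothetical sketch; the role list mirrors the examples in the text but is not prescribed by Azure.

```python
# Role suffixes drawn from the examples above (illustrative, not exhaustive).
ROLES = ["NetOps", "SecOps", "DevOps", "InfraOps"]

def role_groups(workload_prefix):
    """Derive role group names from a workload prefix, or "Corp" for
    centralized roles not tied to a specific service."""
    return [workload_prefix + role for role in ROLES]

print(role_groups("AuthService"))
# ['AuthServiceNetOps', 'AuthServiceSecOps', 'AuthServiceDevOps', 'AuthServiceInfraOps']
print(role_groups("Corp")[0])  # 'CorpNetOps'
```

Generating the names from one list keeps the prefix convention consistent across every workload as new teams are onboarded.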
Many organizations use a variation of the following groups to provide a major breakdown of roles:
The central IT group, Corp, has the ownership rights to control infrastructure components. Examples are
networking and security. The group needs to have the role of contributor on the subscription, control of the hub,
and network contributor rights in the spokes. Large organizations frequently split up these management
responsibilities between multiple teams. Examples are a network operations CorpNetOps group with exclusive
focus on networking and a security operations CorpSecOps group responsible for the firewall and security
policy. In this specific case, two different groups need to be created for assignment of these custom roles.
The dev-test group, AppDevOps, has the responsibility to deploy app or service workloads. This group takes the role of virtual machine contributor for IaaS deployments or one or more PaaS contributor roles. See Built-in roles for Azure resources. Optionally, the dev/test team might need visibility on security policies (network security groups) and routing policies (user-defined routes) inside the hub or a specific spoke. In addition to the role of contributor for workloads, this group would also need the role of network reader.
The operation and maintenance group, CorpInfraOps or AppInfraOps, has the responsibility of managing
workloads in production. This group needs to be a subscription contributor on workloads in any production
subscriptions. Some organizations might also evaluate if they need an additional escalation support team group
with the role of subscription contributor in production and the central hub subscription. The additional group
fixes potential configuration issues in the production environment.
The VDC is designed so that groups created for the central IT group, managing the hub, have corresponding
groups at the workload level. In addition to managing hub resources only, the central IT group is able to control
external access and top-level permissions on the subscription. Workload groups are also able to control resources
and permissions of their VNet independently from central IT.
The VDC is partitioned to securely host multiple projects across different Lines-of-Business (LOBs). All projects
require different isolated environments (Dev, UAT, production). Separate Azure subscriptions for each of these
environments can provide natural isolation.
The preceding diagram shows the relationship between an organization's projects, users, and groups and the
environments where the Azure components are deployed.
Typically in IT, an environment (or tier) is a system in which multiple applications are deployed and executed. Large
enterprises use a development environment (where changes are made and tested) and a production environment
(what end-users use). Those environments are separated, often with several staging environments in between
them to allow phased deployment (rollout), testing, and rollback if problems arise. Deployment architectures vary significantly, but usually the basic process of starting at development (DEV) and ending at production (PROD) is still followed.
A common architecture for these types of multi-tier environments consists of DevOps for development and
testing, UAT for staging, and production environments. Organizations can leverage single or multiple Azure AD
tenants to define access and rights to these environments. The previous diagram shows a case where two different
Azure AD tenants are used: one for DevOps and UAT, and the other exclusively for production.
The presence of different Azure AD tenants enforces the separation between environments. The same group of users, such as central IT, needs to authenticate by using a different URI to access a different Azure AD tenant to modify the roles or permissions of either the DevOps or production environments of a project. The presence of different user authentications to access different environments reduces possible outages and other issues caused by human errors.
Component type: Infrastructure
This component type is where most of the supporting infrastructure resides. It's also where your centralized IT,
security, and compliance teams spend most of their time.
Infrastructure components provide an interconnection for the different components of a VDC implementation, and
are present in both the hub and the spokes. The responsibility for managing and maintaining the infrastructure
components is typically assigned to the central IT and/or security team.
One of the primary tasks of the IT infrastructure team is to guarantee the consistency of IP address schemas
across the enterprise. The private IP address space assigned to a VDC implementation must be consistent and
must not overlap with private IP addresses assigned on your on-premises networks.
While NAT on the on-premises edge routers or in Azure environments can avoid IP address conflicts, it adds
complications to your infrastructure components. Simplicity of management is one of the key goals of the VDC, so
using NAT to handle IP concerns, while a valid solution, isn't recommended.
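As a rough illustration of the overlap check the infrastructure team performs, the following Python sketch flags conflicting CIDR prefixes in an address plan using only the standard library. The address ranges are hypothetical examples, not a recommended plan:

```python
import ipaddress

def find_overlaps(prefixes):
    """Return pairs of CIDR prefixes that overlap."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    overlaps = []
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.version == b.version and a.overlaps(b):
                overlaps.append((str(a), str(b)))
    return overlaps

# Hypothetical plan: on-premises ranges plus proposed VDC hub/spoke ranges.
plan = ["10.0.0.0/16", "10.1.0.0/16", "10.0.128.0/17", "172.16.0.0/24"]
print(find_overlaps(plan))  # → [('10.0.0.0/16', '10.0.128.0/17')]
```

Running a check like this against the combined on-premises and Azure address plan before assigning a spoke's prefix avoids the NAT workarounds described above.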
Infrastructure components have the following functionality:
Identity and directory services. Access to every resource type in Azure is controlled by an identity stored in a
directory service. The directory service stores not only the list of users, but also the access rights to resources in
a specific Azure subscription. These services can exist cloud-only, or they can be synchronized with on-premises
identity stored in Active Directory.
Virtual Network. Virtual Networks are one of the main components of the VDC, and enable you to create a traffic
isolation boundary on the Azure platform. A Virtual Network is composed of one or more virtual
network segments, each with a specific IP network prefix (a subnet, either IPv4 or dual-stack IPv4/IPv6). The
Virtual Network defines an internal perimeter area where IaaS virtual machines and PaaS services can establish
private communications. VMs (and PaaS services) in one virtual network can't communicate directly to VMs
(and PaaS services) in a different virtual network, even if both virtual networks are created by the same
customer, under the same subscription. Isolation is a critical property that ensures customer VMs and
communication remains private within a virtual network. Where cross-VNet connectivity is desired, the
following features describe how that can be accomplished.
VNet Peering. The fundamental feature used to create the infrastructure of the VDC is VNet Peering, a
mechanism that connects two virtual networks (VNets) in the same region through the Azure datacenter
network, or using the Azure world-wide backbone across regions.
Service Endpoints. Virtual Network (VNet) service endpoints extend your virtual network private address
space to include your PaaS space. The endpoints also extend the identity of your VNet to the Azure services
over a direct connection. Endpoints allow you to secure your critical Azure service resources to only your virtual
networks.
Private Link. Azure Private Link enables you to access Azure PaaS Services (for example, Azure Storage, Azure
Cosmos DB, and Azure SQL Database) and Azure hosted customer/partner services over a Private Endpoint in
your virtual network. Traffic between your virtual network and the service traverses the Microsoft
backbone network, eliminating exposure to the public internet. You can also create your own Private Link
Service in your virtual network (VNet) and deliver it privately to your customers. The setup and consumption
experience using Azure Private Link is consistent across Azure PaaS, customer-owned, and shared partner
services.
User-defined routes. Traffic in a virtual network is routed by default based on the system routing table. A
user-defined route is a custom routing table that network administrators can associate to one or more subnets
to override the behavior of the system routing table and define a communication path within a virtual network.
The presence of user-defined routes guarantees that egress traffic from the spoke transits through specific
custom VMs or network virtual appliances and load balancers present in both the hub and the spokes.
Network security groups. A network security group is a list of security rules that filter traffic based on IP
source, IP destination, protocol, IP source port, and IP destination port (also called a layer-4 five-tuple).
The network security group can be applied to a subnet, a Virtual NIC associated with an Azure VM, or both. The
network security groups are essential to implement a correct flow control in the hub and in the spokes. The
level of security afforded by the network security group is a function of which ports you open, and for what
purpose. Customers should apply additional per-VM filters with host-based firewalls such as IPtables or the
Windows Firewall.
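The first-match-by-priority evaluation that NSGs apply can be sketched in Python. The rule set, the field names, and the omission of source-port matching are simplifications for illustration, not the Azure API:

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass
class Rule:
    priority: int            # lower number is evaluated first, as in NSGs
    src: str                 # source CIDR
    dst: str                 # destination CIDR
    protocol: str            # "Tcp", "Udp", or "*"
    dst_port: Optional[int]  # None matches any destination port
    allow: bool

def evaluate(rules, src_ip, dst_ip, protocol, dst_port):
    """First matching rule (by ascending priority) decides; default is deny."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if (ip_address(src_ip) in ip_network(rule.src)
                and ip_address(dst_ip) in ip_network(rule.dst)
                and rule.protocol in ("*", protocol)
                and rule.dst_port in (None, dst_port)):
            return rule.allow
    return False  # implicit deny, like the NSG DenyAll default rule

rules = [
    Rule(100, "10.1.0.0/16", "10.0.0.0/24", "Tcp", 443, True),  # spoke-to-hub HTTPS
    Rule(4096, "0.0.0.0/0", "0.0.0.0/0", "*", None, False),     # deny everything else
]
print(evaluate(rules, "10.1.2.3", "10.0.0.10", "Tcp", 443))  # True
print(evaluate(rules, "10.1.2.3", "10.0.0.10", "Tcp", 22))   # False
```

The "level of security is a function of which ports you open" point shows up directly here: only the five-tuples matched by an explicit allow rule get through.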
DNS. The name resolution of resources in the VNets of a VDC implementation is provided through DNS.
Azure provides DNS services for both Public and Private name resolution. Private zones provide name
resolution both within a virtual network and across virtual networks. You can have private zones not only span
across virtual networks in the same region, but also across regions and subscriptions. For public resolution,
Azure DNS provides a hosting service for DNS domains, providing name resolution using Microsoft Azure
infrastructure. By hosting your domains in Azure, you can manage your DNS records using the same
credentials, APIs, tools, and billing as your other Azure services.
Management group, Subscription, and Resource Group management. A subscription defines a natural
boundary to create multiple groups of resources in Azure. This separation can be for function, role segregation,
or billing. Resources in a subscription are assembled together in logical containers known as resource groups.
The resource group represents a logical group to organize the resources of a VDC implementation. If your
organization has many subscriptions, you may need a way to efficiently manage access, policies, and
compliance for those subscriptions. Azure management groups provide a level of scope above subscriptions.
You organize subscriptions into containers called "management groups" and apply your governance conditions
to the management groups. All subscriptions within a management group automatically inherit the conditions
applied to the management group. To see these three features in a hierarchy view, read the Cloud Adoption
Framework page, Organizing your resources.
Role-based access control (RBAC). Through RBAC, it's possible to map organizational roles to
rights to access specific Azure resources, allowing you to restrict users to only a certain subset of actions. If
you're using Azure Active Directory synchronized with an on-premises Active Directory, you can use the same
AD groups in Azure that you use on-premises. With RBAC, you can grant access by assigning the appropriate role
to users, groups, and applications within the relevant scope. The scope of a role assignment can be an Azure
subscription, a resource group, or a single resource. RBAC allows inheritance of permissions. A role assigned at
a parent scope also grants access to the children contained within it. Using RBAC, you can segregate duties and
grant only the amount of access to users that they need to perform their jobs. For example, use RBAC to let one
employee manage virtual machines in a subscription, while another can manage SQL DBs within the same
subscription.
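The downward inheritance of role assignments can be illustrated with a small sketch that models scopes as resource-ID prefixes; the IDs below are hypothetical:

```python
def applies(assignment_scope: str, resource_scope: str) -> bool:
    """A role assignment covers its own scope and every child scope below it."""
    return (resource_scope == assignment_scope
            or resource_scope.startswith(assignment_scope + "/"))

sub = "/subscriptions/1111"                     # hypothetical subscription ID
rg = sub + "/resourceGroups/vdc-hub"
vm = rg + "/providers/Microsoft.Compute/virtualMachines/fw01"

print(applies(sub, vm))   # True: a subscription-level assignment reaches the VM
print(applies(rg, sub))   # False: inheritance flows down, never up
```

This is why assigning roles at the narrowest workable scope (a resource group rather than the subscription) is the usual way to grant only the access a user needs.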
Component type: Perimeter networks
Perimeter network (sometimes called DMZ) components enable network connectivity between your cloud
networks and your on-premises or physical datacenter networks, along with any connectivity to and from the Internet. It's also where
your network and security teams likely spend most of their time.
Incoming packets should flow through the security appliances in the hub before reaching the back-end servers and
services in the spokes. Examples are the firewall, IDS, and IPS. Before they leave the network, internet-bound
packets from the workloads should also flow through the security appliances in the perimeter network. The
purposes of this flow are policy enforcement, inspection, and auditing.
Perimeter network components include the following features:
Virtual networks, user-defined routes, and network security groups
Network virtual appliances
Azure Load Balancer
Azure Application Gateway with web application firewall (WAF)
Public IPs
Azure Front Door with web application firewall (WAF)
Azure Firewall and Azure Firewall Manager
Standard DDoS Protection
Usually, the central IT and security teams have responsibility for requirement definition and operation of the
perimeter networks.
The preceding diagram shows the enforcement of two perimeters with access to the internet and an on-premises
network, both resident in the DMZ hub. In the DMZ hub, the perimeter network to internet can scale up to support
large numbers of LOBs, using multiple farms of Web Application Firewalls (WAFs) and/or Azure Firewalls. The
hub also allows for on-premises connectivity via VPN or ExpressRoute as needed.
NOTE
In the preceding diagram, in the "DMZ Hub", many of the following features can be bundled together in an Azure Virtual
WAN hub (for instance, VNet, UDR, NSG, VPN gateway, ExpressRoute gateway, Azure Load Balancer, Azure Firewall,
Firewall Manager, and DDoS Protection). Using Virtual WAN hubs can make the creation of the hub VNet, and thus the VDC,
much easier, since most of the engineering complexity is handled for you by Azure when you deploy an Azure Virtual WAN hub.
Virtual networks. The hub is typically built on a virtual network with multiple subnets to host the different types
of services that filter and inspect traffic to or from the internet via Azure Firewall, NVAs, WAF, and Azure
Application Gateway instances.
User-defined routes. Using user-defined routes, customers can deploy firewalls, IDS/IPS, and other virtual
appliances, and route network traffic through these security appliances for security boundary policy enforcement,
auditing, and inspection. User-defined routes can be created in both the hub and the spokes to guarantee that
traffic transits through the specific custom VMs, Network Virtual Appliances, and load balancers used by a VDC
implementation. To guarantee that traffic generated from virtual machines residing in the spoke transits to the
correct virtual appliances, a user-defined route needs to be set in the subnets of the spoke by setting the front-end
IP address of the internal load balancer as the next-hop. The internal load balancer distributes the internal traffic to
the virtual appliances (load balancer back-end pool).
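Azure's effective-route selection (the longest matching prefix wins, and a user-defined route overrides a system route of equal length) can be sketched as follows. The route tables and the internal load balancer front-end address 10.0.0.4 are illustrative assumptions:

```python
from ipaddress import ip_address, ip_network

def next_hop(dst, system_routes, user_routes):
    """Pick the effective next hop for dst: longest prefix wins; on a
    prefix-length tie, a user-defined route beats the system route."""
    best = None
    for origin, table in ((0, system_routes), (1, user_routes)):
        for prefix, hop in table:
            net = ip_network(prefix)
            if ip_address(dst) in net:
                key = (net.prefixlen, origin)
                if best is None or key > best[0]:
                    best = (key, hop)
    return best[1] if best else None

system_routes = [("10.1.0.0/16", "VnetLocal"), ("0.0.0.0/0", "Internet")]
# Hypothetical UDR in a spoke subnet: send internet-bound traffic to the
# hub's internal load balancer fronting the firewall farm.
user_routes = [("0.0.0.0/0", "VirtualAppliance 10.0.0.4")]

print(next_hop("93.184.216.34", system_routes, user_routes))  # VirtualAppliance 10.0.0.4
print(next_hop("10.1.2.3", system_routes, user_routes))       # VnetLocal
```

The sketch shows why the 0.0.0.0/0 UDR captures internet-bound egress without disturbing intra-VNet traffic: the more specific system route still wins for local destinations.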
Azure Firewall is a managed, cloud-based network security service that protects your Azure Virtual Network
resources. It's a stateful firewall as a service with built-in high availability and cloud scalability. You can centrally
create, enforce, and log application and network connectivity policies across subscriptions and virtual networks.
Azure Firewall uses a static public IP address for your virtual network resources. It allows outside firewalls to
identify traffic that originates from your virtual network. The service is fully integrated with Azure Monitor for
logging and analytics.
If you use the Virtual WAN topology, Azure Firewall Manager is a security management service that provides
central security policy and route management for cloud-based security perimeters. It works with Azure Virtual
WAN Hub, a Microsoft-managed resource that lets you easily create hub and spoke architectures. When security
and routing policies are associated with such a hub, it's referred to as a secured virtual hub.
Network virtual appliances. In the hub, the perimeter network with access to the internet is normally managed
through an Azure Firewall instance, a farm of firewalls, or a web application firewall (WAF).
Different LOBs commonly use many web applications. These applications tend to suffer from various
vulnerabilities and potential exploits. Web application firewalls are a special type of product used to detect attacks
against web applications (HTTP/HTTPS) in more depth than a generic firewall. Compared with traditional firewall
technology, WAFs have a set of specific features to protect internal web servers from threats.
Azure Firewall and NVA firewalls both use a common administration plane, with a set of security rules to protect
the workloads hosted in the spokes, and control access to on-premises networks. Azure Firewall has scalability
built in, whereas NVA firewalls can be manually scaled behind a load balancer. Generally, a firewall farm has less
specialized software compared with a WAF, but has a broader application scope to filter and inspect any type of
traffic in egress and ingress. If an NVA approach is used, the appliances can be found and deployed from Azure
Marketplace.
We recommend that you use one set of Azure Firewall instances, or NVAs, for traffic originating on the internet.
Use another for traffic originating on-premises. Using only one set of firewalls for both is a security risk as it
provides no security perimeter between the two sets of network traffic. Using separate firewall layers reduces the
complexity of checking security rules and makes it clear which rules correspond to which incoming network
request.
Azure Load Balancer offers a high-availability layer 4 (TCP, UDP) service, which can distribute incoming traffic
among service instances defined in a load-balanced set. Traffic sent to the load balancer from front-end endpoints
(public IP endpoints or private IP endpoints) can be redistributed, with or without address translation, to a set of
back-end IP address pools (such as network virtual appliances or VMs).
Azure Load Balancer can probe the health of the various server instances as well, and when an instance fails to
respond to a probe, the load balancer stops sending traffic to the unhealthy instance. In the VDC, an external load
balancer is deployed to the hub and the spokes. In the hub, the load balancer is used to efficiently route traffic
across firewall instances, and in the spokes, load balancers are used to manage application traffic.
Azure Front Door (AFD) is Microsoft's highly available and scalable Web Application Acceleration Platform,
Global HTTP Load Balancer, Application Protection, and Content Delivery Network. Running in more than 100
locations at the edge of Microsoft's Global Network, AFD enables you to build, operate, and scale out your
dynamic web application and static content. AFD provides your application with world-class end-user
performance, unified regional/stamp maintenance automation, BCDR automation, unified client/user information,
caching, and service insights. The platform offers performance, reliability, and support SLAs, compliance
certifications, and auditable security practices developed, operated, and supported natively by Azure. A web
application firewall (WAF) is also provided as part of the Front Door WAF SKU, which provides protection for
web applications from common web vulnerabilities and exploits.
Application Gateway. Microsoft Azure Application Gateway is a dedicated virtual appliance providing application
delivery controller (ADC) functionality as a service, offering various layer 7 load-balancing capabilities for your
application. It allows you to optimize web farm productivity by offloading CPU-intensive SSL termination to the
application gateway. It also provides other layer 7 routing capabilities, including round-robin distribution of
incoming traffic, cookie-based session affinity, URL path-based routing, and the ability to host multiple websites
behind a single Application Gateway. A web application firewall (WAF) is also provided as part of the Application
Gateway WAF SKU, which provides protection for web applications from common web vulnerabilities and exploits.
Application Gateway can be configured as an internet-facing gateway, an internal-only gateway, or a combination of both.
Public IPs. With some Azure features, you can associate service endpoints to a public IP address so that your
resource is accessible from the internet. This endpoint uses network address translation (NAT) to route traffic to
the internal address and port on the Azure virtual network. This path is the primary way for external traffic to pass
into the virtual network. You can configure public IP addresses to determine which traffic is passed in and how and
where it's translated onto the virtual network.
Azure DDoS Protection Standard provides additional mitigation capabilities over the Basic service tier that are
tuned specifically to Azure Virtual Network resources. DDoS Protection Standard is simple to enable and requires
no application changes. Protection policies are tuned through dedicated traffic monitoring and machine learning
algorithms. Policies are applied to public IP addresses associated to resources deployed in virtual networks.
Examples are Azure Load Balancer, Azure Application Gateway, and Azure Service Fabric instances. Near real-time,
system-generated logs are available through Azure Monitor views during an attack and for history. Application
layer protection can be added through the Azure Application Gateway web application firewall. Protection is
provided for IPv4 and IPv6 Azure public IP addresses.
The hub and spoke topology at a detailed level uses VNet Peering and UDRs to route traffic properly.
In the diagram, the UDR ensures traffic flows from the spoke to the firewall before transiting to on-premises
through the ExpressRoute gateway (assuming the firewall policy allows that flow).
Component type: Monitoring
Monitoring components provide visibility and alerting from all the other component types. All teams should have
access to monitoring for the components and services they have access to. If you have centralized help desk or
operations teams, they require integrated access to the data provided by these components.
Azure offers different types of logging and monitoring services to track the behavior of Azure-hosted resources.
Governance and control of workloads in Azure is based not just on collecting log data but also on the ability to
trigger actions based on specific reported events.
Azure Monitor. Azure includes multiple services that individually perform a specific role or task in the monitoring
space. Together, these services deliver a comprehensive solution for collecting, analyzing, and acting on system-
generated logs from your applications and the Azure resources that support them. They can also work to monitor
critical on-premises resources in order to provide a hybrid monitoring environment. Understanding the tools and
data that are available is the first step in developing a complete monitoring strategy for your applications.
There are two fundamental types of logs in Azure Monitor:
Metrics are numerical values that describe some aspect of a system at a particular point in time. They are
lightweight and capable of supporting near real-time scenarios. For many Azure resources, you'll see data
collected by Azure Monitor right in their Overview page in the Azure portal. As an example, look at any
virtual machine and you'll see several charts displaying performance metrics. Click on any of the graphs to
open the data in metrics explorer in the Azure portal, which allows you to chart the values of multiple
metrics over time. You can view the charts interactively or pin them to a dashboard to view them with other
visualizations.
Logs contain different kinds of data organized into records with different sets of properties for each type.
Telemetry such as events and traces are stored as logs in addition to performance data so that it can all be
combined for analysis. Log data collected by Azure Monitor can be analyzed with queries to quickly retrieve,
consolidate, and analyze collected data. Logs are stored and queried from Log Analytics. You can create and
test queries using Log Analytics in the Azure portal and then either directly analyze the data using these
tools or save queries for use with visualizations or alert rules.
Azure Monitor can collect data from a variety of sources. You can think of monitoring data for your applications in
tiers ranging from your application, any operating system, and the services it relies on, down to the Azure platform
itself. Azure Monitor collects data from each of the following tiers:
Application monitoring data: Data about the performance and functionality of the code you have written,
regardless of its platform.
Guest OS monitoring data: Data about the operating system on which your application is running. This OS
could be running in Azure, another cloud, or on-premises.
Azure resource monitoring data: Data about the operation of an Azure resource.
Azure subscription monitoring data: Data about the operation and management of an Azure subscription, as
well as data about the health and operation of Azure itself.
Azure tenant monitoring data: Data about the operation of tenant-level Azure services, such as Azure Active
Directory.
Custom sources: Logs sent from on-premises sources can be included as well; examples include on-premises
server events and network device syslog output.
Monitoring data is only useful if it can increase your visibility into the operation of your computing environment.
Azure Monitor includes several features and tools that provide valuable insights into your applications and other
resources that they depend on. Monitoring solutions and features such as Application Insights and Azure Monitor
for containers provide deep insights into different aspects of your application and specific Azure services.
Monitoring solutions in Azure Monitor are packaged sets of logic that provide insights for a particular application
or service. They include logic for collecting monitoring data for the application or service, queries to analyze that
data, and views for visualization. Monitoring solutions are available from Microsoft and partners to provide
monitoring for various Azure services and other applications.
With all of this rich data collected, it's important to take proactive action on events happening in your environment
where manual queries alone won't suffice. Alerts in Azure Monitor proactively notify you of critical conditions and
potentially attempt to take corrective action. Alert rules based on metrics provide near real-time alerting based on
numeric values, while rules based on logs allow for complex logic across data from multiple sources. Alert rules in
Azure Monitor use action groups, which contain unique sets of recipients and actions that can be shared across
multiple rules. Based on your requirements, action groups can perform such actions as using webhooks to have
alerts start external actions or to integrate with your ITSM tools.
Azure Monitor also allows the creation of custom dashboards. Azure dashboards allow you to combine different
kinds of data, including both metrics and logs, into a single pane in the Azure portal. You can optionally share the
dashboard with other Azure users. Elements throughout Azure Monitor can be added to an Azure dashboard in
addition to the output of any log query or metrics chart. For example, you could create a dashboard that combines
tiles that show a graph of metrics, a table of activity logs, a usage chart from Application Insights, and the output of
a log query.
Finally, Azure Monitor data is a native source for Power BI. Power BI is a business analytics service that provides
interactive visualizations across a variety of data sources and is an effective means of making data available to
others within and outside your organization. You can configure Power BI to automatically import log data from
Azure Monitor to take advantage of these additional visualizations.
Azure Network Watcher provides tools to monitor, diagnose, and view metrics and enable or disable logs for
resources in an Azure virtual network. It's a multifaceted service that allows the following functionalities and more:
Monitor communication between a virtual machine and an endpoint.
View resources in a virtual network and their relationships.
Diagnose network traffic filtering problems to or from a VM.
Diagnose network routing problems from a VM.
Diagnose outbound connections from a VM.
Capture packets to and from a VM.
Diagnose problems with an Azure virtual network gateway and connections.
Determine relative latencies between Azure regions and internet service providers.
View security rules for a network interface.
View network metrics.
Analyze traffic to or from a network security group.
View diagnostic logs for network resources.
Component type: Workloads
Workload components are where your actual applications and services reside. It's where your application
development teams spend most of their time.
The workload possibilities are endless. The following are just a few of the possible workload types:
Internal LOB Applications: Line-of-business applications are computer applications critical to the ongoing
operation of an enterprise. LOB applications have some common characteristics:
Interactive by nature. Data is entered, and results or reports are returned.
Data driven - data intensive with frequent access to databases or other storage.
Integrated - offer integration with other systems within or outside the organization.
Customer-facing web sites (internet or internal facing): Most applications that interact with the internet are
web sites. Azure offers the capability to run a web site on an IaaS VM or from an Azure Web Apps site (PaaS).
Azure Web Apps supports integration with VNets, allowing the deployment of Web Apps in a spoke network
zone. Internal-facing web sites don't need to expose a public internet endpoint, because the resources are accessible
via private, non-internet-routable addresses from the private VNet.
Big data/analytics: When data needs to scale up to larger volumes, relational databases may not perform well
under the extreme load or unstructured nature of the data. Azure HDInsight is a managed, full-spectrum, open-
source analytics service in the cloud for enterprises. You can use open-source frameworks such as Hadoop, Apache
Spark, Apache Hive, LLAP, Apache Kafka, Apache Storm, R, and more. HDInsight supports deploying into a
location-based VNet and can be deployed to a cluster in a spoke of the VDC.
Events and Messaging: Azure Event Hubs is a hyperscale telemetry ingestion service that collects, transforms,
and stores millions of events. As a distributed streaming platform, it offers low latency and configurable time
retention, enabling you to ingest massive amounts of telemetry into Azure and read that data from multiple
applications. With Event Hubs, a single stream can support both real-time and batch-based pipelines.
You can implement a highly reliable cloud messaging service between applications and services through Azure
Service Bus. It offers asynchronous brokered messaging between client and server, structured first-in-first-out
(FIFO) messaging, and publish/subscribe capabilities.
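To illustrate the difference between queue (FIFO) and publish/subscribe semantics described above, here is a minimal in-memory sketch; it is not the Service Bus API, and the queue and topic names are made up:

```python
from collections import defaultdict, deque

class TinyBroker:
    """In-memory sketch of two brokered-messaging patterns: a FIFO queue
    (one consumer pulls messages in order) and topic-based publish/subscribe
    (every subscription receives its own copy)."""
    def __init__(self):
        self.queues = defaultdict(deque)
        self.topics = defaultdict(dict)   # topic -> {subscription: deque}

    def send(self, queue, msg):
        self.queues[queue].append(msg)

    def receive(self, queue):
        return self.queues[queue].popleft()   # strict FIFO order

    def subscribe(self, topic, name):
        self.topics[topic][name] = deque()

    def publish(self, topic, msg):
        for sub in self.topics[topic].values():
            sub.append(msg)                   # fan-out: one copy per subscription

b = TinyBroker()
b.send("orders", "order-1"); b.send("orders", "order-2")
print(b.receive("orders"))  # order-1 (first in, first out)

b.subscribe("billing", "audit"); b.subscribe("billing", "invoicing")
b.publish("billing", "invoice-42")
print(b.topics["billing"]["audit"][0], b.topics["billing"]["invoicing"][0])
```

A queue delivers each message to exactly one receiver; a topic delivers it to every subscription, which is the distinction to keep in mind when choosing between the two.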
These examples barely scratch the surface of the types of workloads you can create in Azure; everything from a
basic Web and SQL app to the latest in IoT, Big Data, Machine Learning, AI, and so much more.
Making the VDC highly available: multiple VDCs
So far, this article has focused on the design of a single VDC, describing the basic components and architectures
that contribute to resiliency. Azure features such as Azure Load Balancer, NVAs, availability zones, availability sets,
and scale sets, along with other mechanisms, contribute to a system that enables you to build solid SLA levels into your
production services.
However, because a single VDC is typically implemented within a single region, it may be vulnerable to any major
outage that affects that entire region. Customers that require high availability must protect the services through
deployments of the same project in two (or more) VDC implementations placed in different regions.
In addition to SLA concerns, there are several common scenarios where deploying multiple VDC implementations
makes sense:
Regional or global presence of your end users or partners.
Disaster recovery requirements.
A mechanism to divert traffic between datacenters for load or performance.
Regional/global presence
Azure datacenters are present in numerous regions worldwide. When selecting multiple Azure datacenters,
customers need to consider two related factors: geographical distances and latency. To offer the best user
experience, evaluate the geographical distance between each VDC implementation as well as the distance between
each VDC implementation and the end users.
The region in which VDC implementations are hosted must conform with regulatory requirements established by
any legal jurisdiction under which your organization operates.
Disaster recovery
The design of a disaster recovery plan depends on the types of workloads and the ability to synchronize the state of
those workloads between different VDC implementations. Ideally, most customers desire a fast failover
mechanism, and this requirement may call for application data synchronization between deployments running in
multiple VDC implementations. However, when designing disaster recovery plans, it's important to consider that
most applications are sensitive to the latency that can be caused by this data synchronization.
Synchronization and heartbeat monitoring of applications in different VDC implementations requires them to
communicate over the network. Multiple VDC implementations in different regions can be connected through:
Hub-to-Hub communication automatically built into Azure Virtual WAN Hubs across regions in the same
virtual WAN.
VNet Peering - VNet Peering can connect hubs across regions.
ExpressRoute private peering when the hubs in each VDC implementation are connected to the same
ExpressRoute circuit.
Multiple ExpressRoute circuits connected via your corporate backbone and your multiple VDC implementations
connected to the ExpressRoute circuits.
Site-to-Site VPN connections between the hub zone of your VDC implementations in each Azure Region.
Typically, vWAN Hubs, VNet Peering, or ExpressRoute connections are the preferred type of network connectivity
due to the higher bandwidth and consistent latency levels when transiting through the Microsoft backbone.
We recommend that customers run network qualification tests to verify the latency and bandwidth of these
connections, and decide whether synchronous or asynchronous data replication is appropriate based on the result.
It's also important to weigh these results in view of the optimal recovery time objective (RTO).
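As a back-of-the-envelope aid for that decision, the sketch below compares a measured inter-region round-trip time against a per-write latency budget. The thresholds and numbers are illustrative assumptions for the sketch, not Azure guidance:

```python
def replication_mode(rtt_ms: float, write_budget_ms: float) -> str:
    """Synchronous replication adds at least one inter-region round trip to
    every committed write; fall back to async when that exceeds the budget
    (at the cost of a non-zero recovery point objective)."""
    if rtt_ms <= write_budget_ms:
        return "synchronous"
    return "asynchronous"

print(replication_mode(rtt_ms=4, write_budget_ms=10))   # nearby regions, tight budget
print(replication_mode(rtt_ms=90, write_budget_ms=10))  # cross-continent pair
```

In practice the measured latency and bandwidth from the network qualification tests feed this trade-off, alongside the RTO and data-loss tolerance of each workload.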
Disaster recovery: diverting traffic from one region to another
Both Azure Traffic Manager and Azure Front Door periodically check the service health of listening endpoints in
different VDC implementations and, if those endpoints fail, route automatically to the next closest VDC. Traffic
Manager uses real-time user measurements and DNS to route users to the closest endpoint (or the next closest during a failure).
Azure Front Door is a reverse proxy at over 100 Microsoft backbone edge sites, using anycast to route users to the
closest listening endpoint.
Summary
The virtual datacenter is an approach to datacenter migration that creates a scalable architecture in Azure, one that
maximizes cloud resource use, reduces costs, and simplifies system governance. The VDC is most often based on
hub and spoke network topologies (either using VNet Peering or Virtual WAN Hubs), providing common shared
services in the hub and allowing specific applications and workloads in the spokes. The VDC also matches the
structure of company roles, where different departments such as Central IT, DevOps, and operations and
maintenance all work together while performing their specific roles. The VDC satisfies the requirements for a "lift
and shift" migration, but also provides many advantages to native cloud deployments.
References
The following features were discussed in this document. Follow the links to learn more.
The referenced features fall into three groups: network features, load balancing, and connectivity.
Next Steps
Explore VNet Peering, the underpinning technology for VDC hub and spoke designs
Implement Azure AD to get started with RBAC exploration
Develop a subscription and resource management model and an RBAC model to meet the structure,
requirements, and policies of your organization. The most important activity is planning. As much as practical,
analyze how reorganizations, mergers, new product lines, and other considerations will affect your initial
models to ensure you can scale to meet future needs and growth.
Best practices for Azure readiness
A large part of cloud readiness is equipping staff with the technical skills needed to begin a cloud adoption effort
and prepare your migration target environment for the assets and workloads you'll move to the cloud. The
following topics provide best practices and additional guidance to help your team establish and prepare your
Azure environment.
Azure fundamentals
Use the following guidance when organizing and deploying your assets in the Azure environment:
Azure fundamental concepts. Learn fundamental concepts and terms used in Azure. Also learn how these
concepts relate to one another.
Recommended naming and tagging conventions. Review detailed recommendations for naming and tagging
your resources. These recommendations support enterprise cloud adoption efforts.
Scaling with multiple Azure subscriptions. Understand strategies for scaling with multiple Azure subscriptions.
Organize your resources with Azure management groups. Learn how Azure management groups can manage
resources, roles, policies, and deployment across multiple subscriptions.
Create hybrid cloud consistency. Create hybrid cloud solutions that provide the benefits of cloud innovation
while maintaining many of the conveniences of on-premises management.
Networking
Use the following guidance to prepare your cloud networking infrastructure to support your workloads:
Networking decisions. Choose the networking services, tools, and architectures that will support your
organization's workload, governance, and connectivity requirements.
Virtual network planning. Learn to plan virtual networks based on your isolation, connectivity, and location
requirements.
Best practices for network security. Learn best practices for addressing common network security issues by
using built-in Azure capabilities.
Perimeter networks. Also known as demilitarized zones (DMZs), perimeter networks enable secure connectivity
between your cloud networks and your on-premises or physical datacenter networks, along with any
connectivity to and from the internet.
Hub and spoke network topology. Hub and spoke is a networking model for efficient management of common
communication or security requirements for complicated workloads. It also addresses potential Azure
subscription limitations.
Storage
Azure Storage guidance. Select the right Azure Storage solution to support your usage scenarios.
Azure Storage security guide. Learn about security features in Azure Storage.
Databases
Choose the correct SQL Server option in Azure. Choose the PaaS or IaaS solution that best supports your SQL
Server workloads.
Database security best practices. Learn best practices for database security on the Azure platform.
Choose the right data store. Selecting the right data store for your requirements is a key design decision. There
are hundreds of implementations to choose from among SQL and NoSQL databases. Data stores are
often categorized by how they structure data and the types of operations they support. This article describes
several of the most common storage models.
Cost management
Tracking costs across business units, environments, and projects. Learn best practices for creating proper cost-
tracking mechanisms.
How to optimize your cloud investment with Azure Cost Management. Implement a strategy for cost
management and learn about the tools available for addressing cost challenges.
Create and manage budgets. Learn to create and manage budgets by using Azure Cost Management.
Export cost data. Learn to create and manage exported data in Azure Cost Management.
Optimize costs based on recommendations. Learn to identify underutilized resources and take action to reduce
costs by using Azure Cost Management and Azure Advisor.
Use cost alerts to monitor usage and spending. Learn to use Cost Management alerts to monitor your Azure
usage and spending.
Scale with multiple Azure subscriptions
Organizations often need more than one Azure subscription as a result of resource limits and other governance
considerations. Having a strategy for scaling your subscriptions is important.
NOTE
Tag inheritance isn't currently available, but it's expected to become available soon.
By relying on this inheritance model, you can arrange the subscriptions in your hierarchy so that each
subscription follows appropriate policies and security controls.
Any access or policy assignment on the root management group applies to all resources in the directory.
Carefully consider which items you define at this scope. Include only the assignments you must have.
When you initially define your management-group hierarchy, you first create the root management group. You
then move all existing subscriptions in the directory into the root management group. New subscriptions are
always created in the root management group. You can later move them to another management group.
When you move a subscription to an existing management group, it inherits the policies and role assignments
from the management-group hierarchy above it. Once you have established multiple subscriptions for your
Azure workloads, you should create additional subscriptions to contain Azure services that other subscriptions
share.
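The inheritance behavior described above can be sketched in a few lines. This is a minimal illustration of the concept, not an Azure API; the hierarchy, node names, and assignment names below are all hypothetical.

```python
# Sketch: how policy and role assignments inherit down a management-group
# hierarchy. An assignment at the root applies to every subscription below it.

def effective_assignments(hierarchy, assignments, node):
    """Collect assignments from `node` and every ancestor up to the root.

    hierarchy: dict mapping child -> parent (the root maps to None)
    assignments: dict mapping node -> list of assignment names
    """
    result = []
    while node is not None:
        # Ancestors' assignments come first, mirroring top-down inheritance.
        result = assignments.get(node, []) + result
        node = hierarchy.get(node)
    return result

# Hypothetical hierarchy: root -> corp-mg -> sub-prod.
hierarchy = {"root": None, "corp-mg": "root", "sub-prod": "corp-mg"}
assignments = {
    "root": ["require-resource-tags"],
    "corp-mg": ["allowed-regions"],
    "sub-prod": ["prod-owner-role"],
}

# A subscription inherits every assignment above it in the hierarchy.
print(effective_assignments(hierarchy, assignments, "sub-prod"))
```

Moving `sub-prod` under a different management group would change the middle of that list, which is why the root scope should carry only the assignments you must have everywhere.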
For more information, see Organizing your resources with Azure management groups.
Related resources
Azure fundamental concepts.
Organize your resources with Azure management groups.
Elevate access to manage all Azure subscriptions and management groups.
Move Azure resources to another resource group or subscription.
Next steps
Review recommended naming and tagging conventions to follow when deploying your Azure resources.
Recommended naming and tagging conventions
Organizing cloud-based assets in ways that aid operational management and support accounting requirements
is a common challenge in large cloud adoption efforts. By applying well-defined naming and metadata tagging
conventions to cloud-hosted resources, IT staff can quickly find and manage resources. Well-defined names and
tags also help to align cloud usage costs with business teams by using chargeback and showback accounting
mechanisms.
The Azure Architecture Center's guidance for naming rules and restrictions for Azure resources provides general
recommendations and platform limitations. The following discussion extends that guidance with more detailed
recommendations aimed specifically at supporting enterprise cloud adoption efforts.
Resource names can be difficult to change. Prioritize establishing a comprehensive naming convention before
you begin any large cloud deployment.
NOTE
Every business has different organizational and management requirements. These recommendations provide a starting
point for discussions within your cloud adoption teams.
As these discussions proceed, use the following template to capture the naming and tagging decisions you make when
you align these recommendations to your specific business needs.
Download the naming and tagging convention tracking template.
The recommended name components include:
Business unit: Top-level division of your company that owns the subscription or workload the resource belongs to. In smaller organizations, this component might represent a single corporate top-level organizational element. Examples: fin, mktg, product, it, corp.
Application or service name: Name of the application, workload, or service that the resource is a part of. Examples: navigator, emissions, sharepoint, hadoop.
Deployment environment: The stage of the development lifecycle for the workload that the resource supports. Examples: prod, dev, qa, stage, test.
Region: The Azure region where the resource is deployed. Examples: westus, eastus2, westeurope, usgovia.
Example resource-type prefixes include:
Subnet: snet-
Virtual machine: vm
Public IP: pip-
NIC: nic-
Storage account: st
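Conventions like the ones above are easiest to enforce when names are generated rather than typed by hand. The following sketch composes a name from the components listed earlier; the helper function, separator choice, and casing rules are illustrative assumptions, not a prescribed implementation.

```python
# Sketch: compose a resource name from the convention's components
# (resource-type prefix, workload, environment, region).

PREFIXES = {
    "subnet": "snet",
    "virtual_machine": "vm",
    "public_ip": "pip",
    "nic": "nic",
    "storage_account": "st",
}

def build_name(resource_type, workload, environment, region):
    prefix = PREFIXES[resource_type]
    if resource_type == "storage_account":
        # Storage account names disallow hyphens and must be lowercase.
        return f"{prefix}{workload}{environment}{region}".lower()
    return f"{prefix}-{workload}-{environment}-{region}".lower()

print(build_name("virtual_machine", "navigator", "prod", "eastus2"))
print(build_name("storage_account", "emissions", "dev", "westus"))
```

A real helper would also validate per-resource length limits and allowed characters, which differ by Azure resource type.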
Metadata tags
When you apply metadata tags to your cloud resources, you can include information about those assets that
couldn't be included in the resource name. You can use that information to perform more sophisticated filtering
and reporting on resources. You want these tags to include context about the resource's associated workload or
application, operational requirements, and ownership information. This information can be used by IT or
business teams to find resources or generate reports about resource usage and billing.
What tags you apply to resources and what tags are required or optional differs among organizations. The
following list provides examples of common tags that capture important context and information about a
resource. Use this list as a starting point to establish your own tagging conventions.
End date of the project (tag: EndDate): Date when the application, workload, or service is scheduled for retirement. Example value: {date}.
Service class (tag: ServiceClass): Service-level agreement level of the application, workload, or service. Example values: Dev, Bronze, Silver, Gold.
Start date of the project (tag: StartDate): Date when the application, workload, or service was first deployed. Example value: {date}.
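Once your team agrees on which tags are required, checking resources against that policy is straightforward. The sketch below is a hypothetical illustration; the required set and tag names come from the examples above, and a real deployment would typically enforce this with Azure Policy instead.

```python
# Sketch: flag resources that are missing required tags before deployment.

REQUIRED_TAGS = {"ServiceClass", "StartDate"}

def missing_tags(resource_tags):
    """Return the required tag keys absent from a resource's tag set."""
    return sorted(REQUIRED_TAGS - resource_tags.keys())

# A resource tagged with a service class and an end date, but no start date.
tags = {
    "ServiceClass": "Silver",
    "EndDate": "2024-06-30",
}
print(missing_tags(tags))
```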
Naming tables covering asset type, scope, format, and examples are provided for the following categories: resource groups, virtual networking, PaaS services, databases, storage, and analytics.
As you plan and design for migration, in addition to the migration itself, one of the most critical steps is the design
and implementation of Azure networking. This article describes best practices for networking when migrating to
IaaS and PaaS implementations in Azure.
IMPORTANT
The best practices and opinions described in this article are based on the Azure platform and service features available at the
time of writing. Features and capabilities change over time. Not all recommendations might be applicable for your
deployment, so select those that work for you.
Learn more:
Learn about designing subnets.
Learn how a fictional company (Contoso) prepared their networking infrastructure for migration.
Secure VNets
The responsibility for securing VNets is shared between Microsoft and you. Microsoft provides many networking
features, as well as services that help keep resources secure. When designing security for VNets, best practices you
should follow include implementing a perimeter network, using filtering and security groups, securing access to
resources and IP addresses, and implementing attack protection.
Learn more:
Get an overview of best practices for network security.
Learn how to design for secure networks.
NIC1: AsgWeb
NIC2: AsgWeb
NIC3: AsgLogic
NIC4: AsgDb
In our example, each network interface belongs to only one application security group, but in fact an interface
can belong to multiple groups, in accordance with Azure limits.
None of the network interfaces have an associated NSG. NSG1 is associated to both subnets and contains the
following rules.
Rule 1:
Destination port: 80
Protocol: TCP
Access: Allow
Rule 2:
Destination: AsgDb
Protocol: All
Access: Deny
Rule 3:
Protocol: TCP
Access: Allow
The rules that specify an application security group as the source or destination are only applied to the network
interfaces that are members of the application security group. If the network interface is not a member of an
application security group, the rule is not applied to the network interface, even though the network security
group is associated to the subnet.
Learn more:
Learn about application security groups.
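The membership behavior described above can be illustrated with a short sketch. This is a simplified model for the example's deny rule, not the real NSG evaluation engine; the NIC-to-ASG mapping mirrors the example, and the rule fields are illustrative.

```python
# Sketch: a rule that names an application security group as its destination
# applies only to network interfaces that are members of that ASG.

# NIC membership from the example above (one ASG per NIC for simplicity;
# a NIC can belong to multiple ASGs within Azure limits).
nic_asg = {"NIC1": "AsgWeb", "NIC2": "AsgWeb", "NIC3": "AsgLogic", "NIC4": "AsgDb"}

def rule_applies(rule, nic):
    """True when the NIC is a member of the rule's destination ASG."""
    return nic_asg.get(nic) == rule["destination_asg"]

# The example's deny rule: block all traffic destined for AsgDb members.
deny_db = {"destination_asg": "AsgDb", "protocol": "All", "access": "Deny"}

# Only NIC4 is affected, even though the NSG is associated to both subnets.
print([nic for nic in sorted(nic_asg) if rule_applies(deny_db, nic)])
```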
Best practice: Secure access to PaaS using VNet service endpoints
VNet service endpoints extend your VNet private address space and identity to Azure services over a direct
connection.
Endpoints allow you to secure critical Azure service resources to your VNets only. Traffic from your VNet to the
Azure service always remains on the Microsoft Azure backbone network.
VNet private address spaces can overlap, so they can't be used to uniquely identify traffic originating
from a VNet.
After service endpoints are enabled in your VNet, you can secure Azure service resources by adding a VNet rule
to the service resources. This provides improved security by fully removing public internet access to resources,
and allowing traffic only from your VNet.
Service endpoints
Learn more:
Learn about VNet service endpoints.
Azure Firewall
Azure Firewall can centrally create, enforce, and log application and network connectivity policies across
subscriptions and VNets.
Azure Firewall uses a static public IP address for your VNet resources, allowing outside firewalls to identify
traffic originating from your VNet.
Azure Firewall is fully integrated with Azure Monitor for logging and analytics.
As a best practice when creating Azure Firewall rules, use the FQDN tags to create rules.
An FQDN tag represents a group of FQDNs associated with well-known Microsoft services.
You can use an FQDN tag to allow the required outbound network traffic through the firewall.
For example, to manually allow Windows Update network traffic through your firewall, you would need to
create multiple application rules. Using FQDN tags, you create an application rule, and include the Windows
Updates tag. With this rule in place, network traffic to Microsoft Windows Update endpoints can flow through
your firewall.
Learn more:
Get an overview of Azure Firewall.
Learn about FQDN tags.
Network Watcher
With Network Watcher you can monitor and diagnose networking issues without logging into VMs.
You can trigger packet capture by setting alerts, and gain access to real-time performance information at the
packet level. When you see an issue, you can investigate it in detail.
As a best practice, use Network Watcher to review NSG flow logs.
NSG flow logs in Network Watcher allow you to view information about ingress and egress IP traffic
through an NSG.
Flow logs are written in JSON format.
Flow logs show outbound and inbound flows on a per-rule basis, the network interface (NIC) to which
the flow applies, 5-tuple information about the flow (source/destination IP, source/destination port, and
protocol), and whether the traffic was allowed or denied.
Learn more:
Get an overview of Network Watcher.
Learn more about NSG flow logs.
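To make the 5-tuple description concrete, the sketch below unpacks a single flow-log tuple into named fields. The sample tuple and field order follow the commonly documented version 1 flow-log format; confirm against the schema of the log version you actually collect.

```python
# Sketch: parse one NSG flow-log tuple (version 1 format assumed):
# timestamp, source IP, dest IP, source port, dest port,
# protocol (T/U), direction (I/O), decision (A/D).

def parse_flow_tuple(raw):
    ts, src_ip, dst_ip, src_port, dst_port, proto, direction, decision = raw.split(",")
    return {
        "timestamp": int(ts),
        "src": f"{src_ip}:{src_port}",
        "dst": f"{dst_ip}:{dst_port}",
        "protocol": {"T": "TCP", "U": "UDP"}[proto],
        "direction": {"I": "inbound", "O": "outbound"}[direction],
        "allowed": decision == "A",
    }

# An allowed outbound TCP flow to port 443 (sample data).
flow = parse_flow_tuple("1542110377,10.0.0.4,13.67.143.118,44931,443,T,O,A")
print(flow["dst"], flow["protocol"], flow["allowed"])
```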
Azure Firewall: Like NVA firewall farms, Azure Firewall uses a common administration mechanism and a set of security rules to protect workloads hosted in spoke networks, and to control access to on-premises networks.
NVA firewalls: Like Azure Firewall, NVA firewall farms have a common administration mechanism and a set of security rules to protect workloads hosted in spoke networks, and to control access to on-premises networks. If you want to use NVAs, you can find them in the Azure Marketplace.
We recommend using one set of Azure Firewalls (or NVAs) for traffic originating on the internet, and another for
traffic originating on-premises.
Using only one set of firewalls for both is a security risk, as it provides no security perimeter between the two
sets of network traffic.
Using separate firewall layers reduces the complexity of checking security rules, and it's clear which rules
correspond to which incoming network request.
Learn more:
Learn about using NVAs in an Azure VNet.
Next steps
Review other best practices:
Best practices for security and management after migration.
Best practices for cost management after migration.
Perimeter networks
Perimeter networks enable secure connectivity between your cloud networks and your on-premises or physical
datacenter networks, along with any connectivity to and from the internet. They're also known as demilitarized
zones (DMZs).
For perimeter networks to be effective, incoming packets must flow through security appliances hosted in secure
subnets before reaching back-end servers. Examples include firewalls, intrusion detection systems (IDS), and
intrusion prevention systems (IPS). Before they leave the network, internet-bound packets from workloads should
also flow through the security appliances in the perimeter network. The purposes of this flow are policy
enforcement, inspection, and auditing.
Perimeter networks make use of the following Azure features and services:
Virtual networks, user-defined routes, and network security groups
Network virtual appliances (NVAs)
Azure Load Balancer
Azure Application Gateway and web application firewall (WAF)
Public IPs
Azure Front Door with web application firewall
Azure Firewall
NOTE
Azure reference architectures provide example templates that you can use to implement your own perimeter networks:
Implement a DMZ between Azure and your on-premises datacenter
Implement a DMZ between Azure and the internet
Usually, your central IT and security teams are responsible for defining requirements for operating your perimeter
networks.
The preceding diagram shows an example hub and spoke network topology that implements enforcement of two
perimeters with access to the internet and an on-premises network. Both perimeters reside in the DMZ hub. In the
DMZ hub, the perimeter network to the internet can scale up to support many lines of business (LOBs), by using
multiple farms of WAFs and Azure Firewall instances that help protect the spoke virtual networks. The hub also
allows for connectivity via VPN or Azure ExpressRoute as needed.
Virtual networks
Perimeter networks are typically built using a virtual network with multiple subnets to host the different types of
services that filter and inspect traffic to or from the internet via NVAs, WAFs, and Azure Application Gateway
instances.
User-defined routes
By using user-defined routes, customers can deploy firewalls, IDS/IPS, and other virtual appliances. Customers
can then route network traffic through these security appliances for security boundary policy enforcement,
auditing, and inspection. User-defined routes can be created to guarantee that traffic passes through the specified
custom VMs, NVAs, and load balancers.
In a hub and spoke network example, guaranteeing that traffic generated by virtual machines that reside in the
spoke passes through the correct virtual appliances in the hub requires a user-defined route defined in the subnets
of the spoke. This route sets the front-end IP address of the internal load balancer as the next hop. The internal
load balancer distributes the internal traffic to the virtual appliances (load balancer back-end pool).
Azure Firewall
Azure Firewall is a managed cloud-based service that helps protect your Azure virtual network resources. It's a
fully stateful managed firewall with built-in high availability and unrestricted cloud scalability. You can centrally
create, enforce, and log application and network connectivity policies across subscriptions and virtual networks.
Azure Firewall uses a static public IP address for your virtual network resources. It allows outside firewalls to
identify traffic that originates from your virtual network. The service interoperates with Azure Monitor for logging
and analytics.
Application Gateway
Azure Application Gateway is a dedicated virtual appliance that provides a managed application delivery controller
(ADC). It offers various layer 7 load-balancing capabilities for your application.
Application Gateway allows you to optimize web farm productivity by offloading CPU-intensive SSL termination
to the application gateway. It also provides other layer 7 routing capabilities, including round-robin distribution of
incoming traffic, cookie-based session affinity, URL path-based routing, and the ability to host multiple websites
behind a single application gateway.
The application gateway WAF SKU includes a web application firewall. This SKU provides protection to web
applications from common web vulnerabilities and exploits. You can configure Application Gateway as an internet-
facing gateway, an internal-only gateway, or a combination of both.
Public IPs
With some Azure features, you can associate service endpoints to a public IP address so that your resource can be
accessed from the internet. This endpoint uses network address translation (NAT) to route traffic to the internal
address and port on the Azure virtual network. This path is the primary way for external traffic to pass into the
virtual network. You can configure public IP addresses to determine what traffic is passed in, and how and where
it's translated onto the virtual network.
Hub and spoke is a networking model for more efficient management of common communication or security
requirements. It also helps avoid Azure subscription limitations. This model addresses the following concerns:
Cost savings and management efficiency. Centralizing services that can be shared by multiple workloads,
such as network virtual appliances (NVAs) and DNS servers, in a single location allows IT to minimize
redundant resources and management effort.
Overcoming subscription limits. Large cloud-based workloads might require the use of more resources
than are allowed in a single Azure subscription. Peering workload virtual networks from different subscriptions
to a central hub can overcome these limits. For more information, see subscription limits.
Separation of concerns. You can deploy individual workloads between central IT teams and workload teams.
Smaller cloud estates might not benefit from the added structure and capabilities that this model offers. But larger
cloud adoption efforts should consider implementing a hub and spoke networking architecture if they have any of
the concerns listed previously.
NOTE
The Azure Reference Architectures site contains example templates that you can use as the basis for implementing your own
hub and spoke networks:
Implement a hub and spoke network topology in Azure
Implement a hub and spoke network topology with shared services in Azure
Overview
As shown in the diagram, Azure supports two types of hub and spoke design: a VNet-based hub for communication, shared
resources, and centralized security policy ("VNet Hub" in the diagram), and a Virtual WAN hub ("Virtual WAN" in
the diagram) for large-scale branch-to-branch and branch-to-Azure communications.
A hub is a central network zone that controls and inspects ingress or egress traffic between zones: internet, on-
premises, and spokes. The hub and spoke topology gives your IT department an effective way to enforce security
policies in a central location. It also reduces the potential for misconfiguration and exposure.
The hub often contains the common service components that the spokes consume. The following examples are
common central services:
The Windows Server Active Directory infrastructure, required for user authentication of third parties that gain
access from untrusted networks before they get access to the workloads in the spoke. It includes the related
Active Directory Federation Services (AD FS).
A DNS service to resolve naming for the workload in the spokes, to access resources on-premises and on the
internet if Azure DNS isn't used.
A public key infrastructure (PKI), to implement single sign-on on workloads.
Flow control of TCP and UDP traffic between the spoke network zones and the internet.
Flow control between the spokes and on-premises.
If needed, flow control between one spoke and another.
You can minimize redundancy, simplify management, and reduce overall cost by using the shared hub
infrastructure to support multiple spokes.
The role of each spoke can be to host different types of workloads. The spokes also provide a modular approach
for repeatable deployments of the same workloads. Examples are dev and test, user acceptance testing, staging,
and production.
The spokes can also segregate and enable different groups within your organization. An example is Azure DevOps
groups. Inside a spoke, it's possible to deploy a basic workload or complex multitier workloads with traffic control
between the tiers.
The introduction of multiple hubs increases the cost and management overhead of the system. Multiple hubs are
justified only by scalability, system limits, or redundancy and regional replication for user performance or disaster
recovery. In scenarios that require multiple hubs, all the hubs should strive to offer the same set of services for
operational ease.
Spokes can also be interconnected to a spoke that acts as a hub. This approach creates a two-level hierarchy: the
spoke in the higher level (level 0) becomes the hub of lower spokes (level 1) of the hierarchy. The spokes of a hub
and spoke implementation are required to forward the traffic to the central hub so that the traffic can transit to its
destination in either the on-premises network or the public internet. An architecture with two levels of hubs
introduces complex routing that removes the benefits of a simple hub and spoke relationship.
Track costs across business units, environments, or
projects
Building a cost-conscious organization requires visibility and properly defined access (or scope) to cost-related
data. This best-practice article outlines decisions and implementation approaches to creating tracking
mechanisms.
In the preceding diagram, the root of the management group hierarchy contains a node for each business unit. In
this example, the multinational company needs visibility into the regional business units, so it creates a node for
geography under each business unit in the hierarchy.
Within each geography, there's a separate node for production and nonproduction environments to isolate cost,
access, and governance controls. To allow for more efficient operations and wiser operations investments, the
company uses subscriptions to further isolate production environments with varying degrees of operational
performance commitments. Finally, the company uses resource groups to capture deployable units of a function,
called applications.
The diagram shows best practices but doesn't include these options:
Many companies limit operations to a single geopolitical region. That approach reduces the need to diversify
governance disciplines or cost data based on local data-sovereignty requirements. In those cases, a geography
node is unnecessary.
Some companies prefer to further segregate development, testing, and quality control environments into
separate subscriptions.
When a company integrates a cloud center of excellence (CCoE) team, shared services subscriptions in each
geography node can reduce duplicated assets.
Smaller adoption efforts might have a much smaller management hierarchy. It's common to see a single root
node for corporate IT, with a single level of subordinate nodes in the hierarchy for various environments. This
isn't a violation of best practices for a well-managed environment. But it does make it more difficult to provide
a least-rights access model for cost control and other important functions.
The rest of this article assumes the use of the best-practice approach in the preceding diagram. However, the
following articles can help you apply the approach to a resource organization that best fits your company:
Scaling with multiple Azure subscriptions
Deploying a Governance MVP to govern well-managed environment standards
During the Ready phase of a migration journey, the objective is to prepare for the journey ahead. This phase is
accomplished in two primary areas: organizational readiness and environmental (technical) readiness. Each area
might require new skills for both technical and nontechnical contributors. The following sections describe a few
options to help build the necessary skills.
Microsoft Learn
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with
cloud adoption doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning
that helps you achieve your goals faster. Earn points and levels and achieve more.
The following examples are a few tailored learning paths on Microsoft Learn that align with the Ready portion of
the Cloud Adoption Framework:
Azure fundamentals: Learn cloud concepts such as high availability, scalability, elasticity, agility, fault tolerance,
and disaster recovery. Understand the benefits of cloud computing in Azure and how it can save you time and
money. Compare and contrast basic strategies for transitioning to the Azure cloud. Explore the breadth of services
available in Azure, including compute, network, storage, and security.
Manage resources in Azure: Learn how to work with the Azure command line and web portal to create, manage,
and control cloud-based resources.
Administer infrastructure resources in Azure: Learn how to create, manage, secure and scale virtual machine
resources.
Store data in Azure: Azure provides a variety of ways to store data: unstructured, archival, relational, and more.
Learn the basics of storage management in Azure, how to create a Storage Account, and how to choose the right
model for the data you want to store in the cloud.
Architect great solutions in Azure: Learn how to design and build secure, scalable, high-performing solutions in
Azure by examining the core principles found in every good architecture.
Learn more
For additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning paths with
your role.
Any enterprise-scale cloud adoption plan will include workloads that don't warrant significant investments in the creation of
new business logic. Those workloads could be moved to the cloud through any number of approaches: lift and shift, lift and
optimize, or modernize. Each of these approaches is considered a migration. The following exercises will help establish the
iterative processes to assess, migrate, optimize, secure, and manage those workloads.
Getting started
To prepare you for this phase of the cloud adoption lifecycle, the framework suggests the following exercises:
Migration prerequisite
Validate that a landing zone has been deployed and is ready to host the first few workloads that will be migrated to Azure. If a
cloud adoption strategy and cloud adoption plan have not been created, validate that both efforts are in progress.
Best practices
Validate any modifications against the best practices section to ensure proper implementation of expanded scope or
workload/architecture specific migration approaches.
Process improvements
Migration is a process heavy activity. As migration efforts scale, use the migration considerations section to evaluate and
mature various aspects of your processes.
Iterative migration process
At its core, migration to the cloud consists of four simple phases: Assess, Migrate, Optimize, and Secure & Manage. This section
of the Cloud Adoption Framework teaches readers to maximize the return from each phase of the process and align those
phases with your cloud adoption plan. The following graphic illustrates those phases in an iterative approach:
Migration implementation
These articles outline two journeys, each with a similar goal: to migrate a large percentage of existing assets to Azure.
However, the business outcomes and current state will significantly influence the processes required to get there. Those subtle
deviations result in two radically different approaches to reaching a similar end state.
To guide incremental execution during the transition to the end state, this model separates migration into two areas of focus.
Migration preparation: Establish a rough migration backlog based largely on the current state and desired outcomes.
Business outcomes: The key business objectives driving this migration.
Digital estate estimate: A rough estimate of the number and condition of workloads to be migrated.
Roles and responsibilities: A clear definition of the team structure, separation of responsibilities, and access requirements.
Change management requirements: The cadence, processes, and documentation required to review and approve changes.
These initial inputs shape the migration backlog. The output of the migration backlog is a prioritized list of applications to
migrate to the cloud. That list shapes the execution of the cloud migration process. Over time, it will also grow to include much of
the documentation needed to manage change.
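The backlog-shaping inputs described above can be sketched as a simple prioritization pass. This is an illustrative example only, not part of the framework; the field names and ordering criteria are assumptions:

```python
# Illustrative sketch: turning migration-backlog inputs into a prioritized
# list of applications. Field names and ranking criteria are hypothetical.

def prioritize_backlog(workloads):
    """Order workloads by business criticality (1 = highest), then by
    estimated migration effort, so quick high-value wins surface first."""
    return sorted(workloads, key=lambda w: (w["criticality"], w["effort_weeks"]))

backlog = prioritize_backlog([
    {"name": "HR portal", "criticality": 2, "effort_weeks": 6},
    {"name": "Payments API", "criticality": 1, "effort_weeks": 10},
    {"name": "Intranet wiki", "criticality": 3, "effort_weeks": 2},
])
```

In practice the ranking criteria would come from the business outcomes and change management requirements listed above, not from two hard-coded fields.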
Migration process: Each cloud migration activity is contained in one of the following processes, as it relates to the migration
backlog.
Assess: Evaluate an existing asset and establish a plan for migration of the asset.
Migrate: Replicate the functionality of an asset in the cloud.
Optimize: Balance the performance, cost, access, and operational capacity of a cloud asset.
Secure and manage: Ensure a cloud asset is ready for ongoing operations.
The information gathered during development of a migration backlog determines the complexity and level of effort required
within the cloud migration process during each iteration and for each release of functionality.
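The four-process loop above can be sketched in Python; the phase names follow the text, but the loop body is a placeholder, since the real activities live in the processes described in this framework:

```python
# Minimal sketch of the iterative migration process: each iteration pulls a
# slice of the backlog through assess -> migrate -> optimize -> secure & manage.
# The phase work itself is a placeholder; real activity happens in Azure tooling.

PHASES = ["assess", "migrate", "optimize", "secure-and-manage"]

def run_iteration(assets):
    """Record which phase each asset passes through in one iteration."""
    log = []
    for asset in assets:
        for phase in PHASES:
            log.append((asset, phase))
    return log

history = run_iteration(["vm-web-01", "sql-db-01"])
```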
Next steps
Choose one of these journeys:
Azure migration guide
Expanded scope guide
TIP
For an interactive experience, view this guide in the Azure portal. Go to the Azure Quickstart Center in the Azure portal,
select Migrate your environment to Azure, and then follow the step-by-step instructions.
Overview
When to use this guide
Migration options
This guide walks you through the basics of migrating applications and resources from your on-premises
environment to Azure. It is designed for migration scopes with minimal complexity. To determine the suitability
of this guide for your migration, see the When to use this guide tab.
When you migrate to Azure, you may migrate your applications as-is using IaaS-based virtual machine
solutions (known as a rehost or lift and shift migration), or you may have the flexibility to use managed services
and other cloud-native features to modernize your applications. See the Migration options tab for more
information on these choices. As you develop your migration strategy, you might consider:
Will my migrating applications work in the cloud?
What is the best strategy (with regard to technology, tools, and migrations) for my application? See the
Microsoft Cloud Adoption Framework's Migration tools decision guide for more information.
How do I minimize downtime during the migration?
How do I control costs?
How do I track resource costs and bill them accurately?
How do I ensure we remain compliant and meet regulations?
How do I meet legal requirements for data sovereignty in certain countries?
This guide helps answer these questions. It suggests the tasks and features to consider as you prepare to deploy
resources in Azure, including:
Configure prerequisites. Plan and prepare for migration.
Assess your technical fit. Validate the technical readiness and suitability for migration.
Manage costs and billing. Look at the costs of your resources.
Migrate your services. Perform the actual migration.
Organize your resources. Lock resources critical to your system and tag resources to track them.
Optimize and transform. Use the post-migration opportunity to review your resources.
Secure and manage. Ensure that your environment is secure and monitored properly.
Get assistance. Get help and support during your migration or post-migration activities.
To learn more about organizing and structuring your subscriptions, managing your deployed resources, and
complying with your corporate policy requirements, see Governance in Azure.
Prerequisites
Prerequisites for migrating to Azure
The resources in this section will help prepare your current environment for migration to Azure.
Overview
Understand migration approaches
Planning checklist
Reasons for migrating to Azure include removing risks associated with legacy hardware, reducing capital expense,
freeing up datacenter space, and quickly realizing return on investment (ROI).
Eliminate legacy hardware. You may have applications hosted on infrastructure that is nearing end of life or
support, whether on-premises or at a hosting provider. Migration to the cloud offers an attractive solution to
the challenge as the ability to migrate "as-is" allows the team to quickly resolve the current infrastructure
lifecycle challenge and then turn its attention to long-term planning for application lifecycle and optimization in
the cloud.
Address end-of-support for software. You may have applications that depend on other software or operating
systems that are nearing end of support. Moving to Azure may provide extended support options for these
dependencies or other migration options that minimize refactoring requirements to support your applications
going forward. For example, see extended support options for Windows Server 2008 and SQL Server 2008.
Reduce capital expense. Hosting your own server infrastructure requires considerable investment in
hardware, software, electricity, and personnel. Migrating to a cloud solution can provide significant reductions in
capital expense. To achieve the best capital expense reductions, a redesign of the solution may be required.
However, an "as-is" migration is a great first step.
Free up datacenter space. You may choose Azure in order to expand your datacenter capacity. One way to do
this is using the cloud as an extension of your on-premises capabilities.
Quickly realize return on investment. Making a return on investment (ROI) is much easier with cloud
solutions, as the cloud payment model provides great utilization insight and promotes a culture for realizing
ROI.
Each of the above scenarios may be entry points for extending your cloud footprint using another methodology
(rehost, refactor, rearchitect, rebuild, or replace).
Migration characteristics
The guide assumes that prior to this migration, your digital estate consists mostly of on-premises hosted
infrastructure and may include hosted business-critical applications. After a successful migration, your data estate
may look much as it did on-premises, but with the infrastructure hosted in cloud resources. Alternatively, the ideal
data estate is a variation of your current one: it retains aspects of your on-premises infrastructure, with
components refactored to optimize and take advantage of the cloud platform.
The focus of this migration journey is to achieve:
Remediation of legacy hardware end-of-life.
Reduction of capital expense.
Return on investment.
NOTE
An additional benefit of this migration journey is extended software support for Windows Server 2008, Windows
Server 2008 R2, SQL Server 2008, and SQL Server 2008 R2. For more information, see:
Windows Server 2008 and Windows Server 2008 R2.
SQL Server 2008 and SQL Server 2008 R2.
Assess the digital estate
In an ideal migration, every asset (infrastructure, app, or data) would be compatible with a cloud platform and
ready for migration. In reality, not everything should be migrated to the cloud. Furthermore, not every asset is
compatible with cloud platforms. Before migrating a workload to the cloud, it is important to assess the workload
and each related asset (infrastructure, apps, and data).
The resources in this section will help you assess your environment to determine its suitability for migration and
which methods to consider.
Tools
Scenarios and Stakeholders
Timelines
Cost Management
The following tools help you assess your environment to determine the suitability of migration and best approach
to use. For helpful information on choosing the right tools to support your migration efforts, see the Cloud
Adoption Framework's migration tools decision guide.
Azure Migrate
The Azure Migrate service assesses on-premises infrastructure, applications and data for migration to Azure. The
service assesses the migration suitability of on-premises assets, performs performance-based sizing, and provides
cost estimates for running on-premises assets in Azure. If you're considering lift and shift migrations, or are in the
early assessment stages of migration, this service is for you. After completing the assessment, Azure Migrate can
be used to execute the migration.
Learn more
Azure Migrate overview
Migrate physical or virtualized servers to Azure
Azure Migrate in the Azure portal
Service Map
Service Map automatically discovers application components on Windows and Linux systems and maps the
communication between services. With Service Map, you can view your servers in the way that you think of them:
as interconnected systems that deliver critical services. Service Map shows connections between servers,
processes, inbound and outbound connection latency, and ports across any TCP-connected architecture, with no
configuration required other than the installation of an agent.
Azure Migrate uses Service Map to enhance reporting and dependency mapping across the environment.
Full details of this integration are outlined in Dependency visualization. If you use the Azure Migrate service,
no additional steps are required to configure and obtain the benefits of Service Map. The following
instructions are provided for your reference should you wish to use Service Map for other purposes or projects.
Enable dependency visualization using Service Map
To use dependency visualization, you need to download and install agents on each on-premises machine that you
want to analyze.
Microsoft Monitoring agent (MMA) needs to be installed on each machine.
The Microsoft Dependency agent needs to be installed on each machine.
In addition, if you have machines with no internet connectivity, you need to download and install Log Analytics
gateway on them.
Learn more
Using Service Map solution in Azure
Azure Migrate and Service Map: Dependency visualization
Migrate assets (infrastructure, apps, and data)
In this phase of the journey, you use the output of the assess phase to initiate the migration of the environment.
This guide helps identify the appropriate tools to reach a "done state", including native tools, third-party tools, and
project management tools.
Native migration tools
Third-party migration tools
Project management tools
Cost management
The following sections describe the native Azure tools available to perform or assist with migration. For
information on choosing the right tools to support your migration efforts, see the Cloud Adoption Framework's
Migration tools decision guide.
Azure Migrate
Azure Migrate delivers a unified and extensible migration experience. It provides a one-stop, dedicated
experience to track your migration journey across the assessment and migration phases, and it lets you use the
tools of your choice while tracking progress across those tools.
Azure Migrate provides the following functionality:
1. Enhanced assessment and migration capabilities:
Hyper-V assessments.
Improved VMware assessment.
Agentless migration of VMware virtual machines to Azure.
2. Unified assessment, migration, and progress tracking.
3. Extensible approach with ISV integration (such as Cloudamize).
To perform a migration using Azure Migrate, follow these steps:
1. Search for Azure Migrate under All services. Select Azure Migrate to continue.
2. Select Add a tool to start your migration project.
3. Select the subscription, resource group, and geography to host the migration.
4. Select Select assessment tool > Azure Migrate: Server Assessment > Next.
5. Select Review + add tools, and verify the configuration. Click Add tools to initiate the job to create the
migration project and register the selected solutions.
Learn more
Azure Migrate tutorial - Migrate physical or virtualized servers to Azure
After you register the resource provider, you can create an instance of Azure Database Migration Service.
1. Select +Create a resource and search the marketplace for Azure Database Migration Service.
2. Complete the Create Migration Service wizard, and select Create.
The service is now ready to migrate the supported source databases (for example, SQL Server, MySQL,
PostgreSQL, or MongoDB).
NOTE
For large migrations (in terms of number and size of databases), we recommend that you use the Azure Database Migration
Service, which can migrate databases at scale.
To get started with the Data Migration Assistant, follow these steps.
1. Download and install the Data Migration Assistant from the Microsoft Download Center.
2. Create an assessment by selecting the New (+) icon, and then select the Assessment project type.
3. Set the source and target server type, and then click Create.
4. Configure the assessment options as required (recommend all defaults).
5. Add the databases to assess.
6. Click Next to start the assessment.
7. View results within the Data Migration Assistant tool set.
For an enterprise, we recommend following the approach outlined in Assess an enterprise and consolidate
assessment reports with DMA to assess multiple servers, combine the reports, and then use the provided Power BI
reports to analyze the results.
For more information, including detailed usage steps, see:
Data Migration Assistant overview
Assess an enterprise and consolidate assessment reports with DMA
Analyze consolidated assessment reports created by Data Migration Assistant with Power BI
The cloud introduces a few shifts in how we work, regardless of our role on the technology team. Cost is a great
example of this shift. In the past, only finance and IT leadership were concerned with the cost of IT assets
(infrastructure, apps, and data). The cloud empowers every member of IT to make and act on decisions that better
support the end user. However, with that power comes the responsibility to be cost conscious when making those
decisions.
This article introduces the tools that can help make wise cost decisions before, during, and after a migration to
Azure.
The tools in this article include:
Azure Migrate
Azure pricing calculator
Azure TCO calculator
Azure Cost Management
Azure Advisor
The processes described in this article may also require a partnership with IT managers, finance, or line-of-
business application owners.
Estimate VM costs prior to migration
Estimate and optimize VM costs during and after migration
Tips and tricks to optimize costs
Prior to migration of any asset (infrastructure, app, or data), there is an opportunity to estimate costs and refine
sizing based on observed performance criteria for those assets. Estimating costs serves two purposes: it allows for
cost control, and it provides a checkpoint to ensure that current budgets account for necessary performance
requirements.
Cost calculators
For manual cost calculations, two handy calculators can provide a quick cost estimate based on the
architecture of the workload to be migrated.
The Azure pricing calculator provides cost estimates based on manually entered Azure products.
Sometimes decisions require a comparison of future cloud costs and current on-premises costs. The
Total Cost of Ownership (TCO) calculator can provide such a comparison.
These manual cost calculators can be used on their own to forecast potential spend and savings. They can also be
used in conjunction with Azure Migrate's cost forecasting tools to adjust the cost expectations to fit alternative
architectures or performance constraints.
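The kind of comparison the TCO calculator performs can be illustrated with back-of-envelope arithmetic: amortize the on-premises capital expense over its lifetime, add operating costs, and compare with a cloud monthly estimate. All figures below are invented for illustration:

```python
# Hedged example: a rough on-premises vs. cloud monthly cost comparison.
# Numbers are invented; a real TCO analysis covers far more cost categories
# (electricity, datacenter space, personnel, software licensing, and so on).

def monthly_on_prem_cost(hardware_capex, amortization_months, monthly_opex):
    """Amortized hardware cost per month plus recurring operating expense."""
    return hardware_capex / amortization_months + monthly_opex

def monthly_savings(on_prem_monthly, cloud_monthly):
    return on_prem_monthly - cloud_monthly

on_prem = monthly_on_prem_cost(hardware_capex=120_000,
                               amortization_months=36,
                               monthly_opex=1_800)
savings = monthly_savings(on_prem, cloud_monthly=3_900)
```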
Additional resources
Set up and review an assessment with Azure Migrate
For a more comprehensive plan on cost management across larger numbers of assets (infrastructure, apps, and
data), see the Cloud Adoption Framework governance model. In particular, see the guidance on the Cost
Management discipline and the Cost Management improvement in the governance guide for complex enterprises.
Optimize and transform
Now that you have migrated your services to Azure, the next phase includes reviewing the solution for possible
areas of optimization. This could include reviewing the design of the solution, right-sizing the services, and
analyzing costs.
This phase is also an opportunity to optimize your environment and perform possible transformations. For
example, you may have performed a "rehost" migration, and now that your services are running on Azure you can
revisit the solution's configuration or consumed services, and possibly perform some "refactoring" to modernize
and increase the functionality of your solution.
Right-size assets
Cost Management
All Azure services that provide a consumption-based cost model can be resized through the Azure portal, CLI, or
PowerShell. The first step in correctly sizing a service is to review its usage metrics. The Azure Monitor service
provides access to these metrics. You may need to configure the collection of the metrics for the service you are
analyzing, and allow an appropriate time to collect meaningful data based on your workload patterns.
1. Go to Monitor.
2. Select Metrics and configure the chart to show the metrics for the service to analyze.
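A common rule of thumb for acting on those metrics can be sketched as follows. The 40 percent threshold and the SKU names are assumptions for illustration, not Azure guidance; real right-sizing should weigh memory, disk, and network alongside CPU:

```python
# Illustrative right-sizing rule of thumb: if observed peak CPU over the
# collection window stays well under capacity, suggest stepping down one size.
# The threshold and SKU names are hypothetical.

def suggest_resize(peak_cpu_percent, current_size, smaller_size):
    """Suggest a smaller SKU when peak CPU stayed under 40 percent."""
    if peak_cpu_percent < 40:
        return smaller_size
    return current_size

suggestion = suggest_resize(peak_cpu_percent=22,
                            current_size="Standard_D4s_v3",
                            smaller_size="Standard_D2s_v3")
```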
The following are some common services that you can resize.
After migrating your environment to Azure, it's important to consider the security and methods used to manage
the environment. Azure provides many features and capabilities to meet these needs in your solution.
Azure Monitor
Azure Service Health
Azure Advisor
Azure Security Center
Azure Backup
Azure Site Recovery
Azure Monitor maximizes the availability and performance of your applications by delivering a comprehensive
solution for collecting, analyzing, and acting on telemetry from your cloud and on-premises environments. It helps
you understand how your applications are performing and proactively identifies issues affecting them and the
resources they depend on.
Learn more
Azure Monitor overview.
Assistance
Obtain assistance during your journey to Azure
We know that getting the right support at the right time will accelerate your migration efforts. Review the
assistance avenues below to meet your needs.
Support Plans
Partners
Microsoft Support
Microsoft offers a basic support plan to all Azure customers. You have 24x7 access to billing and subscription
support, online self-help, documentation, whitepapers, and support forums.
If you need help from Microsoft Support while using Azure, follow these steps to create a support request:
1. Select Help + support in the Azure portal.
2. Select New support request to enter details about your issue and contact support.
Online communities
The following online communities provide community-based support:
MSDN forums
Stack Overflow
Expanded scope for cloud migration
The Azure migration guide in the Cloud Adoption Framework is the suggested starting point for readers who
are interested in a rehost migration to Azure, also known as a "lift and shift" migration. That guide walks you
through a series of prerequisites, tools, and approaches to migrating virtual machines to the cloud.
While this guide is an effective baseline to familiarize you with this type of migration, it makes several
assumptions. Those assumptions align the guide with many of the Cloud Adoption Framework's readers by
providing a simplified approach to migrations. This section of the Cloud Adoption Framework addresses some
expanded scope migration scenarios, which help guide efforts when those assumptions don't apply.
Next steps
Browse the table of contents on the left to address specific needs or scope changes. Alternatively, the first scope
enhancement on the list, Balance the portfolio, is a good starting point when reviewing these scenarios.
Balance the portfolio
Balance the portfolio
Cloud adoption is a portfolio management effort, cleverly disguised as technical implementation. Like any
portfolio management exercise, balancing the portfolio is critical. At a strategic level, this means balancing
migration, innovation, and experimentation to get the most out of the cloud. When the cloud adoption effort leans
too far in one direction or another, complexity finds its way into the migration effort. This article will guide the
reader through approaches to achieve balance in the portfolio.
IMPORTANT
The above table is a fictional example and should not be used to set priorities. In many cases, this table could be
considered an antipattern, because it places cost savings above customer experiences.
The above table could accurately represent the priorities of the cloud strategy team and the cloud adoption team
overseeing a cloud migration. Due to short-term constraints, this team is placing a higher emphasis on IT cost
reduction and prioritizing a datacenter exit as a means to achieve the desired IT cost reductions. However, by
documenting the competing priorities in this table, the cloud adoption team is empowered to help the cloud
strategy team identify opportunities to better align implementation of the overarching portfolio strategy.
Move fast while maintaining balance
The guidance regarding incremental rationalization of the digital estate suggests an approach in which the
rationalization starts with an unbalanced position. The cloud strategy team should evaluate every workload for
compatibility with a rehost approach. Such an approach is suggested because it allows for the rapid evaluation of
a complex digital estate based on quantitative data. Making such an initial assumption allows the cloud adoption
team to engage quickly, reducing time to business outcomes. However, as stated in that article, qualitative
questions will provide the necessary balance in the portfolio. This article documents the process for creating the
promised balance.
Importance of sunset and retire decisions
The table in the documenting business outcomes section above omits a key outcome that would support the
number one objective of reducing IT costs. When IT cost reductions rank anywhere in the list of business
outcomes, it is important to consider the potential to sunset or retire workloads. In some scenarios, cost savings
can come from NOT migrating workloads that don't warrant a short-term investment. Some customers have
reported total cost reductions in excess of 20 percent by retiring underutilized workloads.
To balance the portfolio, better reflecting sunset and retire decisions, the cloud strategy team and the cloud
adoption team are encouraged to ask the following questions of each workload within assess and migrate
processes:
Has the workload been used by end users in the past six months?
Is end-user traffic consistent or growing?
Will this workload be required by the business 12 months from now?
If the answer to any of these questions is "No", then the workload could be a candidate for retirement. If
retirement potential is confirmed with the application owner, it may not make sense to migrate the workload. This
prompts a few qualifying questions:
Can a retirement plan or sunset plan be established for this workload?
Can this workload be retired prior to the datacenter exit?
If the answer to both of these questions is "Yes", then it would be wise to consider not migrating the workload.
This approach would help meet the objectives of reducing costs and exiting the datacenter.
If the answer to either question is "No", it may be wise to establish a plan for hosting the workload until it can be
retired. This plan could include moving the assets to a lower-cost datacenter or alternative datacenter, which
would also accomplish the objectives of reducing costs and exiting one datacenter.
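The retirement questions above can be encoded as a small decision helper. This is a sketch for illustration only; real retirement decisions still need the application owner's sign-off and business review:

```python
# The sunset/retire decision flow described above, encoded as a helper.
# Parameter names mirror the questions in the text; this is illustrative,
# not a substitute for review with the application owner.

def retirement_decision(used_last_6_months, traffic_growing, needed_in_12_months,
                        plan_possible, before_dc_exit):
    """Return one of: 'migrate', 'retire', 'hold-until-retirement'."""
    # A "No" to any of the first three questions makes the workload a
    # candidate for retirement.
    candidate = not (used_last_6_months and traffic_growing and needed_in_12_months)
    if not candidate:
        return "migrate"
    # Both qualifying questions must be "Yes" to skip migration entirely.
    if plan_possible and before_dc_exit:
        return "retire"
    # Otherwise, host the workload somewhere cheaper until it can be retired.
    return "hold-until-retirement"
```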
Suggested prerequisites
The prerequisites specified in the baseline guide should still be sufficient for addressing this complexity topic.
However, the asset inventory and digital estate should be highlighted and bolded among those prerequisites, as
that data will drive the following activities.
Next steps
Return to the expanded scope checklist to ensure your migration method is fully aligned.
Expanded scope checklist
Skills readiness for cloud migration
During a cloud migration, it is likely that employees, as well as some incumbent systems integration partners or
managed services partners, will need to develop new skills to be effective during migration efforts.
There are four distinct processes that are completed iteratively during the "Migrate" phase of any migration
journey. The following sections align the necessary skills for each of those processes with references to two
prerequisites for skilling resources.
Next steps
Return to the expanded scope checklist to ensure your migration method is fully aligned.
Expanded scope checklist
Accelerate migration with VMware hosts
Migrating entire VMware hosts can move multiple workloads and several assets in a single migration effort. The
following guidance expands the scope of the Azure migration guide through a VMware host migration. Most of
the effort required in this scope expansion occurs during the prerequisites and migration processes of a migration
effort.
Suggested prerequisites
When migrating your first VMware host to Azure, you must meet a number of prerequisites to prepare identity,
network, and management requirements. After these prerequisites are met, each additional host should require
significantly less effort to migrate. The following sections provide more detail about the prerequisites.
Secure your Azure environment
Implement the appropriate cloud solution for role-based access control and network connectivity in your Azure
environment. The secure your environment guide can help with this implementation.
Private cloud management
There are two required tasks and one optional task to establish private cloud management. Escalate private
cloud privileges and Workload DNS and DHCP setup are the required best practices.
If the objective is to migrate workloads by using Layer 2 stretched networks, a third best practice is also
required.
Private cloud networking
After the management requirements are established, you can establish private cloud networking by using the
following best practices:
VPN connection to Private Cloud
On-premises network connection with ExpressRoute
Azure virtual network connection with ExpressRoute
Configure DNS name resolution
Integration with the cloud adoption plan
After you've met the other prerequisites, you should include each VMware host in the cloud adoption plan. Within
the cloud adoption plan, add each host to be migrated as a distinct workload. Within each workload, add the VMs
to be migrated as assets. To add workloads and assets to the adoption plan in bulk, see adding/editing work items
with Excel.
Next steps
Return to the expanded scope checklist to ensure your migration method is fully aligned.
Expanded scope checklist
Accelerate migration by migrating multiple databases
or entire SQL Servers
Migrating entire SQL Server instances can accelerate workload migration efforts. The following guidance expands
the scope of the Azure migration guide by migrating an instance of SQL Server outside of a workload-focused
migration effort. This approach can seed the migration of multiple workloads with a single data-platform
migration. Most of the effort required in this scope expansion occurs during the prerequisites, assessment,
migration, and optimization processes of a migration effort.
Suggested prerequisites
Before performing a SQL Server migration, start with an expansion of the digital estate by including a data estate.
The data estate records an inventory of the data assets you're considering for migration. The following tables
outline an approach to recording the data estate.
Server inventory
The following is an example of a server inventory, which typically captures these columns: SQL server, purpose,
version, criticality, sensitivity, database count, SSIS, SSRS, SSAS, cluster, and number of nodes.
Database inventory
The following is an example of a database inventory for one of the servers above, with these columns: server,
database, criticality, sensitivity, Data Migration Assistant (DMA), DMA results, remediation, and target platform.
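The database-inventory rows could be captured as a structure that migration scripts consume. The field names follow the table columns; the sample values are invented for illustration:

```python
# Sketch of a database-inventory record matching the columns described above.
# Values are hypothetical; a real inventory would be populated from DMA output.

from dataclasses import dataclass

@dataclass
class DatabaseRecord:
    server: str
    database: str
    criticality: str
    sensitivity: str
    dma_results: str
    remediation: str
    target_platform: str

inventory = [
    DatabaseRecord("sqldb-01", "payroll", "high", "PII",
                   "compatible", "none", "Azure SQL Database managed instance"),
]

# Filter out databases that still need remediation work before migration.
needs_remediation = [r for r in inventory if r.remediation != "none"]
```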
The following migration options apply, by purpose:
Azure Database Migration Service: Supports online (minimal downtime) and offline (one-time) migrations at
scale to an Azure SQL Database managed instance. Supports migration from SQL Server 2005, SQL Server 2008
and SQL Server 2008 R2, SQL Server 2012, SQL Server 2014, SQL Server 2016, and SQL Server 2017.
Bulk load: Use bulk load to an Azure SQL Database managed instance for data stored in SQL Server 2005, SQL
Server 2008 and SQL Server 2008 R2, SQL Server 2012, SQL Server 2014, SQL Server 2016, and SQL Server 2017.
Other source-to-target paths:
RDS SQL Server to Azure SQL Database (or managed instance), using the Database Migration Service (online;
see the tutorial).
SQL Server Integration Services to Azure Data Factory, using the Azure Data Factory integration runtime
(offline; see the tutorial).
SQL Server Analysis Services tabular model to Azure Analysis Services, using SQL Server Data Tools (offline;
see the tutorial).
Guidance and tutorials for migration from SQL Server to an IaaS instance of SQL Server
After migrating databases and services to PaaS instances, you might still have data structures and services that are
not PaaS-compatible. When existing constraints prevent migrating data structures or services, the following
tutorial can help with migrating various assets in the data portfolio to Azure IaaS solutions.
Use this approach to migrate databases or other services on the instance of SQL Server.
Next steps
Return to the expanded scope checklist to ensure your migration method is fully aligned.
Expanded scope checklist
Multiple datacenters
Often the scope of a migration involves the transition of multiple datacenters. The following guidance will expand
the scope of the Azure migration guide to address multiple datacenters.
Suggested prerequisites
Before beginning the migration, you should create epics within the project management tool to represent each
datacenter to be migrated. It is then important to understand the business outcomes and motivations that justify
this migration. Those motivations can be used to prioritize the list of epics (or datacenters). For instance, if
migration is driven by a desire to exit datacenters before leases must be renewed, then each epic would be
prioritized based on its lease renewal date.
Within each epic, the workloads to be assessed and migrated would be managed as features. Each asset within that
workload would be managed as a user story. The work required to assess, migrate, optimize, promote, secure, and
manage each asset would be represented as tasks for each asset.
Sprints or iterations would then consist of a series of tasks required to migrate the assets and user stories
committed to by the cloud adoption team. Releases would then consist of one or more workloads or features to be
promoted to production.
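The work-item hierarchy described above (datacenter as epic, workload as feature, asset as user story, activities as tasks) can be sketched as nested structures. The names and shape are illustrative assumptions, not a project-tool schema:

```python
# Sketch of the epic -> feature -> user story -> task hierarchy described
# above. Datacenter, workload, and asset names are invented for illustration.

TASKS = ["assess", "migrate", "optimize", "promote", "secure", "manage"]

def build_epic(datacenter, workloads):
    """workloads: mapping of workload name -> list of asset names."""
    return {
        "epic": datacenter,
        "features": [
            {
                "feature": workload,
                "stories": [{"story": asset, "tasks": list(TASKS)}
                            for asset in assets],
            }
            for workload, assets in workloads.items()
        ],
    }

epic = build_epic("dc-east", {"ERP": ["erp-vm-01", "erp-sql-01"]})
```

In a real tool such as a project-tracking backlog, this structure would be created as linked work items rather than nested dictionaries.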
IMPORTANT
Two important notes: First, a subject matter expert with an understanding of asset placement and IP address schemas is
required to identify assets that reside in a secondary datacenter. Second, it is important to evaluate both downstream
dependencies and clients in the visual to understand bidirectional dependencies.
Next steps
Return to the Expanded Scope Checklist to ensure your migration method is fully aligned.
Expanded scope checklist
Data requirements exceed network capacity during a
migration effort
In a cloud migration, assets are replicated and synchronized over the network between the existing datacenter and
the cloud. It is not uncommon for the existing data size requirements of various workloads to exceed network
capacity. In such a scenario, the process of migration can be radically slowed, or in some cases, stopped entirely.
The following guidance will expand the scope of the Azure migration guide to provide a solution that works
around network limitations.
Suggested prerequisites
Validate network capacity risks: Digital estate rationalization is a highly recommended prerequisite, especially if
there are concerns about overburdening the available network capacity. During digital estate rationalization, an
inventory of digital assets is collected. That inventory should include existing storage requirements across the
digital estate. As outlined in Replication risks: Physics of replication, that inventory can be used to estimate the total
migration data size, which can be compared to the total available migration bandwidth. If that comparison
doesn't align with the required time to business change, this article can help accelerate migration velocity,
reducing the time required to migrate the datacenter.
Offline transfer of independent data stores: Azure Data Box supports both online and offline data transfers.
These approaches can be used to ship large volumes of data to the cloud prior to workload migration. In an offline
data transfer, source data is copied to Azure Data Box, which is then physically shipped to Microsoft for transfer
into an Azure storage account as files or blobs. This process can be used to ship data that isn't directly tied to a
specific workload, prior to other migration efforts. Doing so reduces the amount of data that needs to be sent over
the network, helping to complete a migration within network constraints.
This approach could be used to transfer data from HDFS, backups, archives, file servers, applications, and so on.
Existing technical guidance explains how to use this approach to transfer data from an HDFS store or from disks
by using SMB, NFS, REST, or the data copy service to Data Box.
There are also third-party partner solutions that use Azure Data Box for a "Seed and Feed" migration, where a
large volume of data is moved via an offline transfer but is later synchronized at a lower scale over the network.
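The bandwidth comparison described under "Validate network capacity risks" comes down to simple arithmetic. The sketch below estimates transfer time; the 50 percent utilization factor and the example figures are illustrative assumptions, not measured values:

```python
def migration_transfer_days(total_tb: float, bandwidth_mbps: float,
                            utilization: float = 0.5) -> float:
    """Rough estimate of the days needed to replicate total_tb of data
    over a bandwidth_mbps link, assuming only a fraction (utilization)
    of the link is available for migration traffic."""
    total_bits = total_tb * 8 * 1000**4            # decimal TB -> bits
    effective_bps = bandwidth_mbps * 1_000_000 * utilization
    return total_bits / effective_bps / 86_400     # seconds -> days

# Hypothetical: 50 TB of estate data over a 500 Mbps link, half usable.
print(round(migration_transfer_days(50, 500), 1))  # 18.5
```

If the resulting duration doesn't fit the required time to business change, offline transfer becomes a candidate.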
Assess process changes
If the storage requirements of a workload (or workloads) exceed network capacity, then Azure Data Box can still be
used in an offline data transfer.
Network transmission is the recommended approach unless the network is unavailable. The speed of transferring
data over the network, even when bandwidth is constrained, is typically faster than physically shipping the same
amount of data using an offline transfer mechanism such as Data Box.
If connectivity to Azure is available, an analysis should be conducted before using Data Box, especially if migration
of the workload is time sensitive. Data Box is only advisable when the time to transfer the necessary data exceeds
the time to populate, ship, and restore data using Data Box.
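That rule of thumb (use Data Box only when network transfer would take longer than the full offline cycle) can be expressed as a small helper; the day counts below are hypothetical:

```python
def use_data_box(network_days: float, populate_days: float,
                 ship_days: float, restore_days: float) -> bool:
    """Return True when an offline Data Box transfer is advisable,
    that is, when network replication would take longer than the time
    to populate, ship, and restore the device."""
    return network_days > populate_days + ship_days + restore_days

# Hypothetical: 45 days over the wire vs. 2 + 7 + 2 days via Data Box.
print(use_data_box(45, 2, 7, 2))   # True
print(use_data_box(8, 2, 7, 2))    # False
```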
Suggested action during the assess process
Network capacity analysis: When workload-related data transfer requirements are at risk of exceeding network
capacity, the cloud adoption team would add an analysis task to the assess process, called network
capacity analysis. During this analysis, a member of the team with subject matter expertise regarding the local
network and network connectivity would estimate the amount of available network capacity and the required data
transfer time. That available capacity would be compared to the storage requirements of all assets to be migrated
during the current release. If the storage requirements exceed the available bandwidth, assets supporting the
workload would be selected for offline transfer.
IMPORTANT
At the conclusion of the analysis, the release plan may need to be updated to reflect the time required to ship, restore, and
synchronize the assets to be transferred offline.
Drift analysis: Each asset to be transferred offline should be analyzed for storage and configuration drift. Storage
drift is the amount of change in the underlying storage over time. Configuration drift is change in the configuration
of the asset over time. From the time the storage is copied to the time the asset is promoted to production, any
drift could be lost. If that drift needs to be reflected in the migrated asset, some form of synchronization between
the local asset and the migrated asset would be required. This should be flagged for consideration during
migration execution.
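As an illustrative sketch, the storage drift accumulated during the offline window can be estimated from a measured daily rate of change; the figures here are assumptions, not measurements:

```python
def storage_drift_gb(daily_change_gb: float, offline_days: float) -> float:
    """Estimate how much storage drift (in GB) accumulates between
    taking the offline copy and promoting the asset to production."""
    return daily_change_gb * offline_days

# Hypothetical: 5 GB/day of change over a 14-day ship-and-restore window.
drift = storage_drift_gb(5, 14)
print(drift)            # 70
needs_sync = drift > 0  # any meaningful drift implies a synchronization task
```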
Migrate process changes
When using offline transfer mechanisms, replication processes are likely not required. However, synchronization
processes may still be a requirement. Understanding the results of the drift analysis completed during the assess
process will inform the tasks required during migration, if an asset is being transferred offline.
Suggested action during the migrate process
Copy storage: This approach could be used to transfer data from HDFS, backups, archives, file servers, applications,
and so on. Existing technical guidance explains how to use this approach to transfer data from an HDFS store or from
disks by using SMB, NFS, REST, or the data copy service to Data Box.
There are also third-party partner solutions that use Azure Data Box for a "seed and sync" migration, where a large
volume of data is moved via an offline transfer but is later synchronized at a lower scale over the network.
Ship the device: Once the data is copied, the device can be shipped to Microsoft. Once received and imported,
the data is available in an Azure storage account.
Restore the asset: Verify that the data is available in the storage account. Once verified, the data can be used as a blob
or in Azure Files. If the data is a VHD/VHDX file, the file can be converted to managed disks. Those managed disks
can then be used to instantiate a virtual machine, which creates a replica of the original on-premises asset.
Synchronization: If synchronization of drift is a requirement for a migrated asset, one of the third-party partner
solutions could be used to synchronize the files until the asset is restored.
Next steps
Return to the expanded scope checklist to ensure your migration method is fully aligned.
Expanded scope checklist
Governance or compliance strategy
When governance or compliance is required throughout a migration effort, additional scope is needed. The
following guidance expands the scope of the Azure migration guide to address different approaches to meeting
governance or compliance requirements.
Suggested prerequisites
Configuration of the base Azure environment could change significantly when integrating governance or
compliance requirements. To understand how prerequisites change, it's important to understand the nature of the
requirements. Prior to beginning any migration that requires governance or compliance, an approach should be
chosen and implemented in the cloud environment. The following are a few high-level approaches commonly seen
during migrations:
Common governance approach: For most organizations, the Cloud Adoption Framework governance model is
a sufficient approach that consists of a minimum viable product (MVP) implementation, followed by targeted
iterations of governance maturity to address tangible risks identified in the adoption plan. This approach provides
the minimum tooling needed to establish consistent governance, so the team can understand the tools. It then
expands on those tools to address common governance concerns.
ISO 27001 Compliance blueprints: For customers who are required to adhere to ISO compliance standards, the
ISO 27001 Shared Services blueprint samples can serve as a more effective MVP to produce richer governance
constraints earlier in the iterative process. The ISO 27001 App Service Environment/SQL Database sample
expands on the blueprint to map controls and deploy a common architecture for an application environment. As
additional compliance blueprints are released, they will be referenced here as well.
Virtual Datacenter: A more robust governance starting point may be required. In such cases, consider the Azure
Virtual Datacenter (VDC). This approach is commonly suggested during enterprise-scale adoption efforts, and
especially for efforts that exceed 10,000 assets. It is also the de facto choice for complex governance scenarios
when any of the following are required: extensive third-party compliance requirements, deep domain expertise, or
parity with mature IT governance policies and compliance requirements.
Partnership option to complete prerequisites
Microsoft Services: Microsoft Services provides solution offerings that can align to the Cloud Adoption
Framework governance model, compliance blueprints, or Virtual Datacenter options to ensure the most
appropriate governance or compliance model. Use the Secure Cloud Insights (SCI) solution offering to establish a
data-driven picture of a customer deployment in Azure and to validate the customer's Azure implementation
maturity, while identifying optimizations for existing deployment architectures and removing governance,
security, and availability risks. Based on customer insights, you should lead with the following approaches:
Cloud Foundation: Establish the customer's core Azure designs, patterns, and governance architecture with
the Hybrid Cloud Foundation (HCF) solution offering. Map the customer's requirements to the most
appropriate reference architecture. Implement a minimum viable product consisting of Shared Services and
IaaS workloads.
Cloud Modernization: Use the Cloud Modernization solution offering as a comprehensive approach to move
applications, data, and infrastructure to an enterprise-ready cloud, as well as to optimize and modernize after
cloud deployment.
Innovate with Cloud: Engage customers through an innovative and unique cloud center of excellence (CCoE)
solution approach that builds a modern IT organization to enable agility at scale with DevOps while staying in
control. Implement an agile approach to capture business requirements, reuse deployment packages aligned
with security, compliance, and service management policies, and maintain the Azure platform aligned with
operational procedures.
Next steps
As the final item on the expanded scope checklist, return to the checklist and reevaluate any additional scope
requirements for the migration effort.
Expanded scope checklist
Azure migration best practices
Azure provides several tools to help execute a migration effort. This section of the Cloud Adoption
Framework is designed to help readers implement those tools in alignment with best practices for migration.
These best practices are aligned to one of the processes within the Cloud Adoption Framework migration model
pictured below.
Expand any process in the table of contents on the left to see best practices typically required during that process.
NOTE
Digital estate planning and asset assessment represent two different levels of migration planning and assessment:
Digital estate planning: You plan or rationalize the digital estate during planning, to establish an overall migration
backlog. However, this plan is based on some assumptions and details that need to be validated before a workload can
be migrated.
Asset assessment: You assess a workload's individual assets before migration of the workload, to evaluate cloud
compatibility and understand architecture and sizing constraints. This process validates initial assumptions and provides
the details needed to migrate an individual asset.
Assess on-premises workloads for migration to Azure
This article shows how the fictional company Contoso assesses an on-premises app for migration to Azure. In the
example scenario, Contoso's on-premises SmartHotel360 app currently runs on VMware. Contoso assesses the
app's VMs using the Azure Migrate service, and the app's SQL Server database using Data Migration Assistant.
Overview
As Contoso considers migrating to Azure, the company needs a technical and financial assessment to determine
whether its on-premises workloads are good candidates for cloud migration. In particular, the Contoso team wants
to assess machine and database compatibility for migration. It wants to estimate capacity and costs for running
Contoso's resources in Azure.
To get started and to better understand the technologies involved, Contoso assesses two of its on-premises apps,
summarized in the following table. The company assesses for migration scenarios that rehost and refactor apps for
migration. Learn more about rehosting and refactoring in the migration examples overview.
SmartHotel360 (manages Contoso travel requirements): Runs on Windows with a SQL Server database. Two-tiered
app: the front-end ASP.NET website runs on one VM (WEBVM) and SQL Server runs on another VM (SQLVM). The
VMs are VMware, running on an ESXi host managed by vCenter Server. You can download the sample app from
GitHub.
osTicket (Contoso service desk app): Runs on Linux/Apache with MySQL PHP (LAMP). Two-tiered app: a front-end
PHP website runs on one VM (OSTICKETWEB) and the MySQL database runs on another VM (OSTICKETMYSQL).
The app is used by customer service apps to track issues for internal employees and external customers. You can
download the sample from GitHub.
Current architecture
This diagram shows the current Contoso on-premises infrastructure:
Contoso has one main datacenter. The datacenter is located in the city of New York in the Eastern United States.
Contoso has three additional local branches across the United States.
The main datacenter is connected to the internet with a fiber Metro Ethernet connection (500 Mbps).
Each branch is connected locally to the internet by using business-class connections with IPsec VPN tunnels
back to the main datacenter. The setup allows Contoso's entire network to be permanently connected and
optimizes internet connectivity.
The main datacenter is fully virtualized with VMware. Contoso has two ESXi 6.5 virtualization hosts that are
managed by vCenter Server 6.5.
Contoso uses Active Directory for identity management. Contoso uses DNS servers on the internal network.
The domain controllers in the datacenter run on VMware VMs. The domain controllers at local branches run on
physical servers.
Business drivers
Contoso's IT leadership team has worked closely with the company's business partners to understand what the
business wants to achieve with this migration:
Address business growth. Contoso is growing. As a result, pressure has increased on the company's on-
premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for its
developers and users. The business needs IT to be fast and to not waste time or money, so the company can
deliver faster on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react
faster than the changes that occur in the marketplace for the company to be successful in a global economy. IT
at Contoso must not get in the way or become a business blocker.
Scale. As the company's business grows successfully, Contoso IT must provide systems that can grow at the
same pace.
Assessment goals
The Contoso cloud team has identified goals for its migration assessments:
After migration, apps in Azure should have the same performance capabilities that apps have today in
Contoso's on-premises VMware environment. Moving to the cloud doesn't mean that app performance is less
critical.
Contoso needs to understand the compatibility of its applications and databases with Azure requirements.
Contoso also needs to understand its hosting options in Azure.
Contoso's database administration should be minimized after apps move to the cloud.
Contoso wants to understand not only its migration options, but also the costs associated with the
infrastructure after it moves to the cloud.
Assessment tools
Contoso uses Microsoft tools for its migration assessment. The tools align with the company's goals and should
provide Contoso with all the information it needs.
Data Migration Assistant: Contoso uses Data Migration Assistant to assess and detect compatibility issues that
might affect its database functionality in Azure. Data Migration Assistant assesses feature parity between SQL
sources and targets. It recommends performance and reliability improvements. Data Migration Assistant is a free,
downloadable tool.
Azure Migrate: Contoso uses the Azure Migrate service to assess its VMware VMs. Azure Migrate assesses the
migration suitability of the machines. It provides sizing and cost estimates for running in Azure. As of May 2018,
Azure Migrate is a free service.
Service Map: Azure Migrate uses Service Map to show dependencies between machines that the company wants to
migrate. Service Map is part of Azure Monitor logs. Currently, Contoso can use Service Map for 180 days without
incurring charges.
In this scenario, Contoso downloads and runs Data Migration Assistant to assess the on-premises SQL Server
database for its travel app. Contoso uses Azure Migrate with dependency mapping to assess the app VMs before
migration to Azure.
Assessment architecture
Contoso is a fictional name that represents a typical enterprise organization.
Contoso has an on-premises datacenter (contoso-datacenter) and on-premises domain controllers
(CONTOSODC1, CONTOSODC2).
VMware VMs are located on VMware ESXi hosts running version 6.5 (contosohost1, contosohost2).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com, running on a VM).
The SmartHotel360 travel app has these characteristics:
The app is tiered across two VMware VMs (WEBVM and SQLVM).
The VMs are located on VMware ESXi host contosohost1.contoso.com.
The VMs are running Windows Server 2008 R2 Datacenter with SP1.
The VMware environment is managed by vCenter Server (vcenter.contoso.com) running on a VM.
The osTicket service desk app:
The app is tiered across two VMs (OSTICKETWEB and OSTICKETMYSQL).
The VMs are running Ubuntu Linux Server 16.04-LTS.
OSTICKETWEB is running Apache 2 and PHP 7.0.
OSTICKETMYSQL is running MySQL 5.7.22.
Prerequisites
Contoso and other users must meet the following prerequisites for the assessment:
Owner or Contributor permissions for the Azure subscription, or for a resource group in the Azure subscription.
An on-premises vCenter Server instance running version 6.5, 6.0, or 5.5.
A read-only account in vCenter Server, or permissions to create one.
Permissions to create a VM on the vCenter Server instance by using an .ova template.
At least one ESXi host running version 5.5 or later.
At least two on-premises VMware VMs, one running a SQL Server database.
Permissions to install Azure Migrate agents on each VM.
The VMs should have direct internet connectivity.
You can restrict internet access to the required URLs.
If your VMs don't have internet connectivity, the Azure Log Analytics Gateway must be installed on them,
and agent traffic directed through it.
The FQDN of the VM running the SQL Server instance, for database assessment.
Windows Firewall running on the SQL Server VM should allow external connections on TCP port 1433
(default). This setup allows Data Migration Assistant to connect.
Assessment overview
Here's how Contoso performs its assessment:
Step 1: Download and install Data Migration Assistant. Contoso prepares Data Migration Assistant for
assessment of the on-premises SQL Server database.
Step 2: Assess the database by using Data Migration Assistant. Contoso runs and analyzes the database
assessment.
Step 3: Prepare for VM assessment by using Azure Migrate. Contoso sets up on-premises accounts and
adjusts VMware settings.
Step 4: Discover on-premises VMs by using Azure Migrate. Contoso creates an Azure Migrate collector
VM. Then, Contoso runs the collector to discover VMs for assessment.
Step 5: Prepare for dependency analysis by using Azure Migrate. Contoso installs Azure Migrate agents
on the VMs, so the company can see dependency mapping between VMs.
Step 6: Assess the VMs by using Azure Migrate. Contoso checks dependencies, groups the VMs, and runs
the assessment. When the assessment is ready, Contoso analyzes the assessment in preparation for migration.
> [!NOTE]
> Assessments shouldn't be limited to using tooling to discover information about your environment. You should
also schedule time to speak to business owners, end users, and other members of the IT department to get a
full picture of what is happening in the environment and to understand things that tooling can't tell you.
3. In Select Target Version, Contoso selects SQL Server 2017 as the target version. Contoso needs to select
this version because it's the version that's used by the SQL Database Managed Instance.
4. Contoso selects reports to help it discover information about compatibility and new features:
Compatibility issues note changes that might break migration or that require a minor adjustment
before migration. This report keeps Contoso informed about any features currently in use that are
deprecated. Issues are organized by compatibility level.
New feature recommendation notes new features in the target SQL Server platform that can be used
for the database after migration. New feature recommendations are organized under the headings
Performance, Security, and Storage.
5. In Connect to a server, Contoso enters the name of the VM that's running the database and credentials to
access it. Contoso selects Trust server certificate to make sure the VM can access SQL Server. Then,
Contoso selects Connect.
6. In Add source, Contoso adds the database it wants to assess, and then selects Next to start the
assessment.
7. The assessment is created.
2. In the Feature recommendations report, Contoso views performance, security, and storage features that
the assessment recommends after migration. A variety of features are recommended, including In-Memory
OLTP, columnstore indexes, Stretch Database, Always Encrypted, dynamic data masking, and transparent
data encryption.
NOTE
Contoso should enable transparent data encryption for all SQL Server databases. This is even more critical when a
database is in the cloud than when it's hosted on-premises. Transparent data encryption should be enabled only after
migration. If transparent data encryption is already enabled, Contoso must move the certificate or asymmetric key to
the master database of the target server. Learn how to move a transparent data encryption-protected database to
another SQL Server instance.
NOTE
For large-scale assessments:
Run multiple assessments concurrently and view the state of the assessments on the All assessments page.
Consolidate assessments into a SQL Server database.
Consolidate assessments into a Power BI report.
9. In Select migration tool, select Skip adding a migration tool for now > Next.
10. In Review + add tools, review the settings, and click Add tools.
11. Wait a few minutes for the Azure Migrate project to deploy. You'll be taken to the project page. If you don't
see the project, you can access it from Servers in the Azure Migrate dashboard.
Download the collector appliance
1. In Migration Goals > Servers > Azure Migrate: Server Assessment, click Discover.
2. In Discover machines > Are your machines virtualized?, click Yes, with VMware vSphere
hypervisor.
3. Click Download to download the .OVA template file.
Example:
C:\>CertUtil -HashFile C:\AzureMigrate\AzureMigrate.ova SHA256
3. The generated hash should match the hash values listed in the Verify security section of the Assess VMware
VMs for migration tutorial.
Create the collector appliance
Now, Contoso can import the downloaded file to the vCenter Server instance and provision the collector appliance
VM:
1. In the vSphere Client console, Contoso selects File > Deploy OVF Template.
2. In the Deploy OVF Template Wizard, Contoso selects Source, and then specifies the location of the OVA file.
3. In Name and Location, Contoso specifies a display name for the collector VM. Then, it selects the
inventory location in which to host the VM. Contoso also specifies the host or cluster on which to run the
collector appliance.
4. In Storage, Contoso specifies the storage location. In Disk Format, Contoso selects how it wants to
provision the storage.
5. In Network Mapping, Contoso specifies the network in which to connect the collector VM. The network
needs internet connectivity to send metadata to Azure.
6. Contoso reviews the settings, and then selects Power on after deployment > Finish. A message that
confirms successful completion appears when the appliance is created.
Run the collector to discover VMs
Now, Contoso runs the collector to discover VMs. The collector currently supports only English (United
States) as the operating system language and collector interface language.
1. In the vSphere Client console, Contoso selects Open Console. Contoso accepts the licensing terms and
specifies password preferences for the collector VM.
2. On the desktop, Contoso selects the Microsoft Azure Appliance Configuration Manager shortcut.
3. In Azure Migrate Collector, Contoso selects Set up prerequisites. Contoso accepts the license terms and
reads the third-party information.
4. The collector checks that the VM has internet access, that the time is synced, and that the collector service is
running. (The collector service is installed by default on the VM.) Contoso also installs the VMware vSphere
Virtual Disk Development Kit.
NOTE
It's assumed that the VM has direct access to the internet without using a proxy.
5. Log in to your Azure account and select the subscription and the Azure Migrate project you created earlier.
Also enter a name for the appliance so you can identify it in the Azure portal.
6. In Specify vCenter Server details, Contoso enters the name (FQDN ) or IP address of the vCenter Server
instance and the read-only credentials used for discovery.
7. Contoso selects a scope for VM discovery. The collector can discover only VMs that are within the specified
scope. The scope can be set to a specific folder, datacenter, or cluster.
8. The collector now starts to discover and collect information about the Contoso environment.
4. In Azure Log Analytics, Contoso pastes the workspace ID and key that it copied from the portal.
2. Contoso must run the command to install the MMA agent as root. To become root, Contoso runs the
following command, and then enters the root password:
sudo -i
wget https://raw.githubusercontent.com/Microsoft/OMS-Agent-for-Linux/master/installer/scripts/onboard_agent.sh && sh onboard_agent.sh -w 6b7fcaff-7efb-4356-ae06-516cacf5e25d -s k7gAMAw5Bk8pFVUTZKmk2lG4eUciswzWfYLDTxGcD8pcyc4oT8c6ZRgsMy3MmsQSHuSOcmBUsCjoRiG2x9A8Mg==
NOTE
To view more granular dependencies, you can expand the time range. You can select a specific duration or select start
and end dates.
Run an assessment
1. In Groups, Contoso opens the group (smarthotelapp), and then selects Create assessment.
An assessment has a confidence rating from 1 star to 5 stars (1 star is the lowest and 5 stars is the highest).
The confidence rating is assigned to an assessment based on the availability of data points that are needed
to compute the assessment.
The rating helps you estimate the reliability of the size recommendations that are provided by Azure
Migrate.
The confidence rating is useful when you are doing performance-based sizing. Azure Migrate might not
have enough data points for utilization-based sizing. For as-on-premises sizing, the confidence rating is
always 5 stars because Azure Migrate has all the data points it needs to size the VM.
Depending on the percentage of data points available, the confidence rating for the assessment is provided:
0%-20% of data points: 1 star
21%-40% of data points: 2 stars
41%-60% of data points: 3 stars
61%-80% of data points: 4 stars
81%-100% of data points: 5 stars
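The banding above maps directly to a small lookup; this sketch simply mirrors the published bands:

```python
def confidence_rating(data_point_pct: float) -> int:
    """Map the percentage of available data points to a 1-5 star
    confidence rating, following the bands in the table above."""
    if not 0 <= data_point_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    for upper_bound, stars in [(20, 1), (40, 2), (60, 3), (80, 4), (100, 5)]:
        if data_point_pct <= upper_bound:
            return stars

print(confidence_rating(55))   # 3
print(confidence_rating(85))   # 5
```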
The assessment report shows the information that's summarized in the table. To show performance-based sizing,
Azure Migrate needs the following information. If the information can't be collected, sizing assessment might not
be accurate.
Utilization data for CPU and memory.
Read/write IOPS and throughput for each disk attached to the VM.
Network in/out information for each network adapter attached to the VM.
Azure VM size: For ready VMs, Azure Migrate provides an Azure VM size recommendation. The sizing
recommendation depends on assessment properties.
Cost estimates are calculated by using the size recommendations for a machine.
Estimated monthly costs for compute and storage are aggregated for all VMs in the group.
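That aggregation is just a sum over per-VM estimates. In this sketch the VM names echo the scenario, but the dollar figures are hypothetical, not Azure pricing:

```python
def estimated_monthly_cost(vms: list) -> float:
    """Aggregate per-VM compute and storage cost estimates into a
    single monthly total for the assessment group."""
    return sum(vm["compute"] + vm["storage"] for vm in vms)

# Hypothetical per-VM monthly estimates for the SmartHotel360 group:
group = [
    {"name": "WEBVM", "compute": 120.0, "storage": 25.0},
    {"name": "SQLVM", "compute": 340.0, "storage": 80.0},
]
print(estimated_monthly_cost(group))  # 565.0
```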
Conclusion
In this scenario, Contoso assesses its SmartHotel360 app database by using the Data Migration Assistant tool. It
assesses the on-premises VMs by using the Azure Migrate service. Contoso reviews the assessments to make sure
that on-premises resources are ready for migration to Azure.
Next steps
After Contoso assesses this workload as a potential migration candidate, it can begin preparing its on-premises
infrastructure and its Azure infrastructure for migration. See the deploy Azure infrastructure article in the Cloud
Adoption Framework migrate best practices section for an example of how Contoso performs these processes.
Best practices to set up networking for workloads
migrated to Azure
As you plan and design for migration, in addition to the migration itself, one of the most critical steps is the design
and implementation of Azure networking. This article describes best practices for networking when migrating to
IaaS and PaaS implementations in Azure.
IMPORTANT
The best practices and opinions described in this article are based on the Azure platform and service features available at the
time of writing. Features and capabilities change over time. Not all recommendations might be applicable for your
deployment, so select those that work for you.
Learn more:
Learn about designing subnets.
Learn how a fictional company (Contoso) prepared their networking infrastructure for migration.
Secure VNets
The responsibility for securing VNets is shared between Microsoft and you. Microsoft provides many networking
features, as well as services that help keep resources secure. When designing security for VNets, best practices
you should follow include implementing a perimeter network, using filtering and security groups, securing access
to resources and IP addresses, and implementing attack protection.
Learn more:
Get an overview of best practices for network security.
Learn how to design for secure networks.
NIC1: AsgWeb
NIC2: AsgWeb
NIC3: AsgLogic
NIC4: AsgDb
In our example, each network interface belongs to only one application security group, but in fact an interface
can belong to multiple groups, in accordance with Azure limits.
None of the network interfaces have an associated NSG. NSG1 is associated to both subnets and contains the
following rules.
Rule 1: Destination port: 80; Protocol: TCP; Access: Allow.
Rule 2: Destination: AsgDb; Protocol: All; Access: Deny.
Rule 3: Protocol: TCP; Access: Allow.
The rules that specify an application security group as the source or destination are only applied to the network
interfaces that are members of the application security group. If the network interface is not a member of an
application security group, the rule is not applied to the network interface, even though the network security
group is associated to the subnet.
Learn more:
Learn about application security groups.
Best practice: Secure access to PaaS using VNet service endpoints
VNet service endpoints extend your VNet private address space and identity to Azure services over a direct
connection.
Endpoints allow you to secure critical Azure service resources to your VNets only. Traffic from your VNet to the
Azure service always remains on the Microsoft Azure backbone network.
VNet private address spaces can overlap, so they can't be used to uniquely identify traffic originating
from a VNet.
After service endpoints are enabled in your VNet, you can secure Azure service resources by adding a VNet
rule to the service resources. This provides improved security by fully removing public internet access to
resources, and allowing traffic only from your VNet.
Figure: VNet service endpoints.
Learn more:
Learn about VNet service endpoints.
Azure Firewall
Azure Firewall can centrally create, enforce, and log application and network connectivity policies across
subscriptions and VNets.
Azure Firewall uses a static public IP address for your VNet resources, allowing outside firewalls to identify
traffic originating from your VNet.
Azure Firewall is fully integrated with Azure Monitor for logging and analytics.
As a best practice when creating Azure Firewall rules, use FQDN tags.
An FQDN tag represents a group of FQDNs associated with well-known Microsoft services.
You can use an FQDN tag to allow the required outbound network traffic through the firewall.
For example, to manually allow Windows Update network traffic through your firewall, you would need to
create multiple application rules. Using FQDN tags, you create an application rule, and include the Windows
Updates tag. With this rule in place, network traffic to Microsoft Windows Update endpoints can flow through
your firewall.
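The way a single tag stands in for many manually maintained rules can be sketched as follows (illustrative Python; the tag name WindowsUpdate is real, but the FQDN entries below are placeholders, not the actual list Microsoft maintains):

```python
# Conceptual sketch: an FQDN tag expands to a maintained group of FQDNs,
# so one application rule with the tag replaces many per-FQDN rules.

FQDN_TAGS = {
    "WindowsUpdate": [
        "*.update.microsoft.com",   # illustrative entries only
        "*.windowsupdate.com",
    ],
}

def expand_rule(rule):
    """Expand any fqdn_tags in an application rule into target FQDNs."""
    fqdns = list(rule.get("target_fqdns", []))
    for tag in rule.get("fqdn_tags", []):
        fqdns.extend(FQDN_TAGS.get(tag, []))
    return fqdns

# One rule with the tag covers everything the tag's FQDN list covers.
rule = {"name": "allow-windows-update", "fqdn_tags": ["WindowsUpdate"]}
print(expand_rule(rule))
```

The benefit is that the platform, not the firewall administrator, keeps the FQDN list behind the tag up to date.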
Learn more:
Get an overview of Azure Firewall.
Learn about FQDN tags.
Network Watcher
With Network Watcher you can monitor and diagnose networking issues without logging into VMs.
You can trigger packet capture by setting alerts, and gain access to real-time performance information at the
packet level. When you see an issue, you can investigate it in detail.
As a best practice, use Network Watcher to review NSG flow logs.
NSG flow logs in Network Watcher allow you to view information about ingress and egress IP traffic
through an NSG.
Flow logs are written in JSON format.
Flow logs show outbound and inbound flows on a per-rule basis, the network interface (NIC) to which
the flow applies, 5-tuple information about the flow (source/destination IP, source/destination port, and
protocol), and whether the traffic was allowed or denied.
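A minimal parser for those per-flow entries can be sketched like this (the comma-separated tuple layout shown follows the version 1 flow-log schema; treat the sample record as illustrative):

```python
# Sketch of parsing one flowTuples entry from an NSG flow log record:
# epoch time, source/destination IP, source/destination port,
# protocol (T/U), direction (I/O), and decision (A/D).

def parse_flow_tuple(tuple_str):
    ts, src_ip, dst_ip, src_port, dst_port, proto, direction, decision = tuple_str.split(",")
    return {
        "time": int(ts),
        "src": (src_ip, int(src_port)),
        "dst": (dst_ip, int(dst_port)),
        "protocol": {"T": "TCP", "U": "UDP"}[proto],
        "direction": {"I": "inbound", "O": "outbound"}[direction],
        "decision": {"A": "allowed", "D": "denied"}[decision],
    }

sample = "1542110377,10.0.0.4,13.67.143.118,44931,443,T,O,A"
flow = parse_flow_tuple(sample)
print(flow["protocol"], flow["direction"], flow["decision"])  # TCP outbound allowed
```

In practice you would read these tuples out of the JSON records that Network Watcher writes to storage, then aggregate them for review.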
Learn more:
Get an overview of Network Watcher.
Learn more about NSG flow logs.
Azure Firewall: Like NVA firewall farms, Azure Firewall uses a common administration mechanism and a set of
security rules to protect workloads hosted in spoke networks, and to control access to on-premises networks.
NVA firewalls: Like Azure Firewall, NVA firewall farms use a common administration mechanism and a set of
security rules to protect workloads hosted in spoke networks, and to control access to on-premises networks. If
you want to use an NVA, you can find them in the Azure Marketplace.
We recommend using one set of Azure Firewalls (or NVAs) for traffic originating on the internet, and another for
traffic originating on-premises.
Using only one set of firewalls for both is a security risk, as it provides no security perimeter between the two
sets of network traffic.
Using separate firewall layers reduces the complexity of checking security rules, and it's clear which rules
correspond to which incoming network request.
Learn more:
Learn about using NVAs in an Azure VNet.
Next steps
Review other best practices:
Best practices for security and management after migration.
Best practices for cost management after migration.
Application migration patterns and examples
This section of the Cloud Adoption Framework provides examples of several common migration scenarios,
demonstrating how you can migrate on-premises infrastructure to the Microsoft Azure cloud.
Introduction
Azure provides access to a comprehensive set of cloud services. As developers and IT professionals, you can use
these services to build, deploy, and manage applications on a range of tools and frameworks, through a global
network of datacenters. As your business faces challenges associated with the digital shift, the Azure cloud helps
you to figure out how to optimize resources and operations, engage with your customers and employees, and
transform your products.
However, Azure recognizes that even with all the advantages that the cloud provides in terms of speed and
flexibility, minimized costs, performance, and reliability, many organizations are going to need to run on-premises
datacenters for some time to come. In response to cloud adoption barriers, Azure provides a hybrid cloud strategy
that builds bridges between your on-premises datacenters, and the Azure public cloud. For example, using Azure
cloud resources like Azure Backup to protect on-premises resources, or using Azure analytics to gain insights into
on-premises workloads.
As part of the hybrid cloud strategy, Azure provides growing solutions for migrating on-premises apps and
workloads to the cloud. With simple steps, you can comprehensively assess your on-premises resources to figure
out how they'll run in the Azure cloud. Then, with a deep assessment in hand, you can confidently migrate
resources to Azure. When resources are up and running in Azure, you can optimize them to retain and improve
access, flexibility, security, and reliability.
Migration patterns
Strategies for migration to the cloud fall into four broad patterns: rehost, refactor, rearchitect, or rebuild. The
strategy you adopt depends on your business drivers and migration goals. You might adopt multiple patterns. For
example, you could choose to rehost simple apps, or apps that aren't critical to your business, but rearchitect those
that are more complex and business-critical. Let's look at these patterns.
Rehost: Often referred to as a lift and shift migration, this option doesn't require code changes, and allows you
to migrate your existing apps to Azure quickly. Each app is migrated as is, to reap the benefits of the cloud,
without the risk and cost associated with code changes. Use this pattern when you need to move apps quickly to
the cloud, when you want to move an app without modifying it, or when your apps are architected so that they
can take advantage of Azure IaaS scalability after migration.

Rearchitect: Rearchitecting for migration focuses on modifying and extending app functionality and the code
base to optimize the app architecture for cloud scalability. For example, you could break down a monolithic
application into a group of microservices that work together and scale easily. Use this pattern when your apps
need major revisions to incorporate new capabilities or to work effectively on a cloud platform, or when you
want to use existing application investments, meet scalability requirements, apply innovative Azure DevOps
practices, and minimize use of virtual machines.

Rebuild: Rebuild takes things a step further by rebuilding an app from scratch using Azure cloud technologies.
For example, you could build greenfield apps with cloud-native technologies like Azure Functions, Azure AI,
Azure SQL Database Managed Instance, and Azure Cosmos DB. Use this pattern when you want rapid
development and existing apps have limited functionality and lifespan, or when you're ready to expedite
business innovation (including DevOps practices provided by Azure), build new applications using cloud-native
technologies, and take advantage of advancements in AI, Blockchain, and IoT.
Assess on-premises resources for migration to Azure: This article shows how to run an assessment of an
on-premises app running on VMware. In the example, an organization assesses the app VMs using the Azure
Migrate service, and the app SQL Server database using Data Migration Assistant.
Infrastructure
Deploy Azure infrastructure: This article shows how an organization can prepare its on-premises infrastructure
and its Azure infrastructure for migration. The infrastructure example established in this article is referenced in
the other samples provided in this section.

Rehost an app on Azure VMs: This article provides an example of migrating on-premises app VMs to Azure
VMs using the Site Recovery service.

Rearchitect an app in Azure containers and Azure SQL Database: This article provides an example of migrating
an app while rearchitecting the app web tier as a Windows container running in Azure Service Fabric, and the
database with Azure SQL Database.
Linux workloads
Rehost a Linux app on Azure VMs and Azure Database for MySQL: This article provides an example of
migrating a Linux-hosted app to Azure VMs by using Site Recovery. It migrates the app database to Azure
Database for MySQL by using MySQL Workbench.

Rehost a Linux app on Azure VMs: This example shows how to complete a lift and shift migration of a
Linux-based app to Azure VMs, using the Site Recovery service.

Rehost an app on an Azure VM and SQL Database Managed Instance: This article provides an example of a
lift and shift migration to Azure for an on-premises app. This involves migrating the app front-end VM using
Azure Site Recovery, and the app database to an Azure SQL Database Managed Instance using the Azure
Database Migration Service.

Rehost an app on Azure VMs and in a SQL Server Always On availability group: This example shows how to
migrate an app and data using Azure hosted SQL Server VMs. It uses Site Recovery to migrate the app VMs,
and the Azure Database Migration Service to migrate the app database to a SQL Server cluster that's protected
by an Always On availability group.

Refactor an app in an Azure web app and Azure SQL Database: This example shows how to migrate an
on-premises Windows-based app to an Azure web app and migrates the app database to an Azure SQL Server
instance with the Data Migration Assistant.

Refactor a Linux app to multiple regions using Azure App Service, Azure Traffic Manager, and Azure Database
for MySQL: This example shows how to migrate an on-premises Linux-based app to an Azure web app on
multiple Azure regions using Azure Traffic Manager, integrated with GitHub for continuous delivery. The app
database is migrated to an Azure Database for MySQL instance.

Refactor Team Foundation Server on Azure DevOps Services: This article shows an example migration of an
on-premises Team Foundation Server deployment to Azure DevOps Services in Azure.
Migration scaling
Scale a migration to Azure: This article shows how an example organization prepares to scale to a full
migration to Azure.
Demo apps
The example articles provided in this section use two demo apps: SmartHotel360 and osTicket.
SmartHotel360: This app was developed by Microsoft as a test app that you can use when working with
Azure. It's provided as open source and you can download it from GitHub. It's an ASP.NET app connected to a
SQL Server database. In the scenarios discussed in these articles, the current version of this app is deployed to
two VMware VMs running Windows Server 2008 R2, and SQL Server 2008 R2. These app VMs are hosted on-
premises and managed by vCenter Server.
osTicket: An open-source service desk ticketing app that runs on Linux. You can download it from GitHub. In
the scenarios discussed in these articles, the current version of this app is deployed on-premises to two
VMware VMs running Ubuntu 16.04 LTS, using Apache 2, PHP 7.0, and MySQL 5.7.
Deploy a migration infrastructure
This article shows how the fictional company Contoso prepares its on-premises infrastructure for migration,
sets up an Azure infrastructure in preparation for migration, and runs the business in a hybrid environment.
When you use this example to help plan your own infrastructure migration efforts, keep the following in mind:
The provided sample architecture is specific to Contoso. Review your own organization's business needs,
structure, and technical requirements when making important infrastructure decisions about subscription
design or networking architecture.
Whether you need all the elements described in this article depends on your migration strategy. For example,
if you're building only cloud-native apps in Azure, you might need a less complex networking structure.
Overview
Before Contoso can migrate to Azure, it's critical to prepare an Azure infrastructure. Generally, there are six
broad areas Contoso needs to think about:
Step 1: Azure subscriptions. How will Contoso purchase Azure, and interact with the Azure platform and
services?
Step 2: Hybrid identity. How will it manage and control access to on-premises and Azure resources after
migration? How does Contoso extend or move identity management to the cloud?
Step 3: Disaster recovery and resilience. How will Contoso ensure that its apps and infrastructure are
resilient if outages and disasters occur?
Step 4: Networking. How should Contoso design a networking infrastructure, and establish connectivity
between its on-premises datacenter and Azure?
Step 5: Security. How will it secure the hybrid/Azure deployment?
Step 6: Governance. How will Contoso keep the deployment aligned with security and governance
requirements?
On-premises architecture
Here's a diagram showing the current Contoso on-premises infrastructure.
Contoso has one main datacenter located in the city of New York in the Eastern United States.
There are three additional local branches across the United States.
The main datacenter is connected to the internet with a fiber Metro Ethernet connection (500 Mbps).
Each branch is connected locally to the internet using business class connections, with IPSec VPN tunnels
back to the main datacenter. This allows the entire network to be permanently connected, and optimizes
internet connectivity.
The main datacenter is fully virtualized with VMware. Contoso has two ESXi 6.5 virtualization hosts,
managed by vCenter Server 6.5.
Contoso uses Active Directory for identity management, and DNS servers on the internal network.
The domain controllers in the datacenter run on VMware VMs. The domain controllers at local branches run
on physical servers.
Examine licensing
With subscriptions configured, Contoso can look at Microsoft licensing. The licensing strategy will depend on
the resources that Contoso wants to migrate into Azure and how Azure VMs and services are selected and
deployed.
Azure Hybrid Benefit
When deploying VMs in Azure, standard images include a license that will charge Contoso by the minute for the
software being used. However, Contoso has been a long-term Microsoft customer, and has maintained EAs and
open licenses with Software Assurance (SA).
Azure Hybrid Benefit provides a cost-effective method for Contoso migration, by allowing it to save on Azure
VMs and SQL Server workloads by converting or reusing Windows Server Datacenter and Standard edition
licenses covered with Software Assurance. This will enable Contoso to pay a lower base compute rate for VMs
and SQL Server. Learn more.
License Mobility
License Mobility through SA gives Microsoft Volume Licensing customers like Contoso the flexibility to deploy
eligible server apps with active SA on Azure. This eliminates the need to purchase new licenses. With no
associated mobility fees, existing licenses can easily be deployed in Azure. Learn more.
Reserve instances for predictable workloads
Predictable workloads are those that always need to be available with VMs running. For example, line-of-
business apps such as an SAP ERP system. On the other hand, unpredictable workloads are those that are
variable, such as VMs that are on during high demand and off when demand is low.
In exchange for committing to reserved instances, in which specific VM instances must be maintained for long
durations, Contoso can get both a discount and prioritized capacity. Using Azure Reserved Instances together
with Azure Hybrid Benefit, Contoso can save up to 82% off regular pay-as-you-go pricing (April 2018).
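As a worked example of that arithmetic (the hourly rate below is a made-up illustrative number; only the up-to-82% combined figure comes from the text):

```python
# Worked example: effect of the combined reserved-instance plus
# Azure Hybrid Benefit discount on a hypothetical pay-as-you-go rate.

payg_hourly = 0.20         # hypothetical pay-as-you-go rate, $/hour
combined_discount = 0.82   # up to 82% off with RI + Azure Hybrid Benefit

effective_hourly = payg_hourly * (1 - combined_discount)
monthly_hours = 730        # approximate hours in a month
savings_per_month = (payg_hourly - effective_hourly) * monthly_hours

print(f"effective rate: ${effective_hourly:.3f}/h, "
      f"monthly savings: ${savings_per_month:.2f}")
```

The point of the calculation is that an always-on, predictable workload pays the discounted effective rate for every hour of the month, which is why reserved instances suit predictable workloads and not intermittent ones.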
ContosoFailoverRG: This group serves as a landing zone for failed over resources.
Scale resource groups
In the future, Contoso will add other resource groups based on needs. For example, they could define a resource
group for each app or service, so that each can be managed and secured independently.
Create matching security groups on-premises
1. In the on-premises Active Directory, Contoso admins set up security groups with names that match the
names of the Azure resource groups.
2. For management purposes, they create an additional group that will be added to all of the other groups.
This group will have rights to all resource groups in Azure. A limited number of Global Admins will be
added to this group.
Synchronize Active Directory
Contoso wants to provide a common identity for accessing resources on-premises and in the cloud. To do this, it
will integrate the on-premises Active Directory with Azure AD. With this model:
Users and organizations can take advantage of a single identity to access on-premises applications and cloud
services such as Office 365, or thousands of other sites on the internet.
Admins can use the groups in Active Directory to implement role-based access control (RBAC) in Azure.
To facilitate integration, Contoso uses the Azure AD Connect tool. When you install and configure the tool on a
domain controller, it synchronizes the local on-premises Active Directory identities to Azure AD.
Download the tool
1. In the Azure portal, Contoso admins go to Azure Active Directory > Azure AD Connect, and
download the latest version of the tool to the server they're using for synchronization.
2. They start the AzureADConnect.msi installation, with Use express settings. This is the most common
installation, and can be used for a single-forest topology, with password hash synchronization for
authentication.
3. In Connect to Azure AD, they specify the credentials for connecting to the Azure AD (in the form
admin@contoso.com or admin@contoso.onmicrosoft.com).
4. In Connect to AD DS, they specify credentials for the on-premises Active Directory (in the form
CONTOSO\admin or contoso.com\admin).
5. In Ready to configure, they select Start the synchronization process when configuration
completes to start the sync immediately. Then they install.
Note that:
Contoso has a direct connection to Azure. If your on-premises Active Directory is behind a proxy, read
this article.
After the first synchronization, on-premises Active Directory objects are visible in the Azure AD directory.
The Contoso IT team is represented in each group, based on its role.
Set up RBAC
Azure role-based access control (RBAC) enables fine-grained access management for Azure. Using RBAC, you
can grant only the amount of access that users need to perform tasks. You assign the appropriate RBAC role to
users, groups, and applications at a scope level. The scope of a role assignment can be a subscription, a resource
group, or a single resource.
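The scope hierarchy can be sketched as a small model (illustrative Python; the scope strings mimic Azure resource IDs but are simplified, and the principal names are hypothetical):

```python
# Conceptual sketch of RBAC scope inheritance: a role assigned at a
# subscription or resource-group scope also covers resources beneath
# that scope.

def has_role(assignments, principal, role, resource_scope):
    """True if the principal holds the role at the resource's scope or
    at any ancestor scope (subscription > resource group > resource)."""
    return any(
        p == principal and r == role and resource_scope.startswith(scope)
        for p, r, scope in assignments
    )

assignments = [
    # (principal, role, scope) - names are illustrative
    ("ContosoCobRG-group", "Contributor", "/subscriptions/sub1/resourceGroups/ContosoCobRG"),
    ("ContosoAzureAdmins", "Owner", "/subscriptions/sub1"),
]

vm = ("/subscriptions/sub1/resourceGroups/ContosoCobRG"
      "/providers/Microsoft.Compute/virtualMachines/vm1")
print(has_role(assignments, "ContosoCobRG-group", "Contributor", vm))  # True
print(has_role(assignments, "ContosoAzureAdmins", "Owner", vm))        # True
```

A Contributor assignment on the resource group reaches the VM inside it, while the Owner assignment at subscription level reaches everything, which is why Contoso reserves the Owner role for a single admin group.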
Contoso admins now assign roles to the Active Directory groups that they synchronized from on-premises.
1. In the ContosoCobRG resource group, they select Access control (IAM) > Add role assignment.
2. In Add role assignment > Role > Contributor, they select the ContosoCobRG group from the list.
The group then appears in the Selected members list.
3. They repeat this with the same permissions for the other resource groups (except for
ContosoAzureAdmins), by adding the Contributor permissions to the group that matches the
resource group.
4. For the ContosoAzureAdmins group, they assign the Owner role.
For the domain controllers in the VNET-PROD-EUS2 network, Contoso wants traffic to flow both between the
EUS2 hub/production network, and over the VPN connection to on-premises. To do this, Contoso admins
must allow the following:
1. Allow forwarded traffic and Allow gateway transit configurations on the peered connection. In our
example this would be the VNET-HUB-EUS2 to VNET-PROD-EUS2 connection.
2. Allow forwarded traffic and Use remote gateways on the other side of the peering, on the
VNET-PROD-EUS2 to VNET-HUB-EUS2 connection.
3. On-premises, they'll set up a static route that directs the local traffic to route across the VPN tunnel to the
VNet. The configuration would be completed on the gateway that provides the VPN tunnel from Contoso
to Azure. They use RRAS for this.
Production networks
A spoked peer network can't see a spoked peer network in another region via a hub.
For Contoso's production networks in both regions to see each other, Contoso admins need to create a direct
peered connection for VNET-PROD-EUS2 and VNET-PROD-CUS.
Set up DNS
When you deploy resources in virtual networks, you have a couple of choices for domain name resolution. You
can use name resolution provided by Azure, or provide DNS servers for resolution. The type of name resolution
you use depends on how your resources need to communicate with each other. Get more information about the
Azure DNS service.
Contoso admins have decided that the Azure DNS service isn't a good choice in the hybrid environment.
Instead, they will use the on-premises DNS servers.
Since this is a hybrid network, all the VMs on-premises and in Azure need to be able to resolve names to
function properly. This means that custom DNS settings must be applied to all the VNets.
Contoso currently has DCs deployed in the Contoso datacenter and at the branch offices. The primary
DNS servers are CONTOSODC1 (172.16.0.10) and CONTOSODC2 (172.16.0.1).
When the VNets are deployed, the on-premises domain controllers will be set to be used as DNS servers
in the networks.
To configure this, when using custom DNS on the VNet, Azure's recursive resolver IP address (such as
168.63.129.16) must be added to the DNS list. To do this, Contoso configures DNS server settings on
each VNet. For example, the custom DNS settings for the VNET-HUB-EUS2 network would be as
follows:
In addition to the on-premises domain controllers, Contoso is going to implement four more to support the
Azure networks, two for each region. Here's what Contoso will deploy in Azure.
After deploying the new domain controllers, Contoso needs to update the DNS settings on networks in
either region to include the new domain controllers in the DNS server list.
Set up domain controllers in Azure
After updating network settings, Contoso admins are ready to build out the domain controllers in Azure.
1. In the Azure portal, they deploy a new Windows Server VM to the appropriate VNet.
2. They create availability sets in each location for the VMs. Availability sets do the following:
Ensure that the Azure fabric separates the VMs into different infrastructures in the Azure region.
Allow Contoso to be eligible for the 99.95% SLA for VMs in Azure. Learn more.
3. After the VM is deployed, they open the network interface for the VM. They set the private IP address to
static, and specify a valid address.
4. Now, they attach a new data disk to the VM. This disk contains the Active Directory database, and the
sysvol share.
The size of the disk will determine the number of IOPS that it supports.
Over time the disk size might need to increase as the environment grows.
The drive shouldn't be set to Read/Write for host caching. Active Directory databases don't support
this.
5. After the disk is added, they connect to the VM over Remote Desktop, and open Server Manager.
6. Then in File and Storage Services, they run the New Volume Wizard, ensuring that the drive is given
the letter F: or above on the local VM.
7. In Server Manager, they add the Active Directory Domain Services role. Then, they configure the VM
as a domain controller.
8. After the VM is configured as a DC and rebooted, they open DNS Manager and configure the Azure
DNS resolver as a forwarder. This allows the DC to forward DNS queries it can't resolve in the Azure
DNS.
9. Now, they update the custom DNS settings for each VNet with the appropriate domain controller for the
VNet region. They include on-premises DCs in the list.
Set up Active Directory
Active Directory is a critical service in networking, and must be configured correctly. Contoso admins will build
Active Directory sites for the Contoso datacenter, and for the EUS2 and CUS regions.
1. They create two new sites (AZURE -EUS2, and AZURE -CUS ) along with the datacenter site
(ContosoDatacenter).
2. After creating the sites, they create subnets in the sites, to match the VNets and datacenter.
3. Then, they create two site links to connect everything. The domain controllers should then be moved to
their location.
4. With everything complete, a list of the domain controllers and sites is shown in the on-premises Active
Directory Administrative Center.
Step 5: Plan for governance
Azure provides a range of governance controls across services and the Azure platform. For more information,
see the Azure governance options.
As they configure identity and access control, Contoso has already begun to put some aspects of governance
and security in place. Broadly, there are three areas it needs to consider:
Policy: Azure Policy applies and enforces rules and effects over your resources, so that resources stay
compliant with corporate requirements and SLAs.
Locks: Azure allows you to lock subscriptions, resource groups, and other resources, so that they can be
modified only by those with authority to do so.
Tags: Resources can be controlled, audited, and managed with tags. Tags attach metadata to resources,
providing information about resources or owners.
Set up policies
The Azure Policy service evaluates your resources, scanning for those not compliant with the policy definitions
you have in place. For example, you might have a policy that only allows certain types of VMs, or requires
resources to have a specific tag.
Policies specify a policy definition, and a policy assignment specifies the scope in which a policy should be
applied. The scope can range from a management group to a resource group. Learn about creating and
managing policies.
Contoso wants to get started with a couple of policies:
It wants a policy to ensure that resources can be deployed in the EUS2 and CUS regions only.
It wants to limit VM SKUs to approved SKUs only. The intention is to ensure that expensive VM SKUs aren't
used.
Limit resources to regions
Contoso uses the built-in policy definition Allowed locations to limit resource regions.
1. In the Azure portal, select All services, and search for Policy.
2. Select Assignments > Assign policy.
3. In the policy list, select Allowed locations.
4. Set Scope to the name of the Azure subscription, and select the two regions in the allowed list.
5. By default the policy is set with Deny, meaning that if someone starts a deployment in the subscription
that isn't in EUS2 or CUS, the deployment will fail. Here's what happens if someone in the Contoso
subscription tries to set up a deployment in West US.
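The Deny behavior can be sketched as a simplified evaluation (illustrative Python; the short location codes for EUS2 and CUS are assumed to be eastus2 and centralus, and real Azure Policy evaluation happens server-side at deployment time):

```python
# Sketch of the Deny effect of an "Allowed locations" style policy:
# a deployment outside the allowed list fails.

ALLOWED_LOCATIONS = {"eastus2", "centralus"}   # EUS2 and CUS

def evaluate_deployment(location):
    if location.lower() not in ALLOWED_LOCATIONS:
        return {"effect": "deny",
                "reason": f"location '{location}' is not in the allowed list"}
    return {"effect": "allow"}

print(evaluate_deployment("westus"))   # denied, as in the Contoso example
print(evaluate_deployment("eastus2"))  # allowed
```

This mirrors what the Contoso admins see: the West US deployment is rejected before any resources are created.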
Set up locks
Contoso has long been using the ITIL framework for the management of its systems. One of the most
important aspects of the framework is change control, and Contoso wants to make sure that change control is
implemented in the Azure deployment.
Contoso is going to implement locks as follows:
Any production or failover component must be in a resource group that has a ReadOnly lock. This means
that to modify or delete production items, the lock must be removed.
Nonproduction resource groups will have CanNotDelete locks. This means that authorized users can read or
modify a resource, but cannot delete it.
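The two lock levels Contoso uses can be summarized as a small conceptual table of permitted actions (illustrative Python; the action names are a simplification of Azure's control-plane operations):

```python
# Conceptual model of Azure resource lock semantics as Contoso uses them:
# ReadOnly blocks modify and delete; CanNotDelete blocks only delete.

LOCK_RULES = {
    "ReadOnly":     {"read"},                   # modify/delete blocked
    "CanNotDelete": {"read", "modify"},         # delete blocked
    None:           {"read", "modify", "delete"},  # no lock
}

def allowed(lock, action):
    return action in LOCK_RULES[lock]

print(allowed("ReadOnly", "modify"))      # False: production change requires removing the lock
print(allowed("CanNotDelete", "modify"))  # True: nonproduction resources stay editable
print(allowed("CanNotDelete", "delete"))  # False
```

This matches the change-control intent: touching production means deliberately removing a lock first, which leaves an auditable step in the process.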
Learn more about locks.
Set up tagging
To track resources as they're added, it will be increasingly important for Contoso to associate resources with an
appropriate department, customer, and environment.
In addition to providing information about resources and owners, tags will enable Contoso to aggregate and
group resources, and to use that data for chargeback purposes.
Contoso needs to visualize its Azure assets in a way that makes sense for the business, for example, by role or
department. Note that resources don't need to reside in the same resource group to share a tag. Contoso will
create a simple tag taxonomy so that everyone uses the same tags.
ApplicationTeam: Email alias of the team that owns support for the app.
ServiceManager: Email alias of the ITIL Service Manager for the resource.
For example:
After creating the tags, Contoso will go back and create new policy definitions and assignments, to enforce the
use of the required tags across the organization.
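A minimal check against that taxonomy can be sketched as follows (illustrative Python; the required tag names come from the taxonomy above, while the email value is hypothetical, and in Azure this enforcement would be done with a policy definition rather than client-side code):

```python
# Sketch of validating Contoso's required tags on a resource's tag set.

REQUIRED_TAGS = {"ApplicationTeam", "ServiceManager"}

def missing_tags(resource_tags):
    """Return the required tags that a resource's tag dict lacks."""
    return sorted(REQUIRED_TAGS - resource_tags.keys())

# Hypothetical resource that has only one of the two required tags.
resource = {"ApplicationTeam": "smarthotel-team@contoso.com"}
print(missing_tags(resource))  # ['ServiceManager']
```

A policy with a Deny or Append effect on these tag names would give the same result at deployment time, which is the approach the text describes.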
Encrypt data
Azure Disk Encryption integrates with Azure Key Vault to help control and manage the disk-encryption keys and
secrets in a Key Vault subscription. It ensures that all data on VM disks is encrypted at rest in Azure Storage.
Contoso has determined that specific VMs require encryption.
Contoso will apply encryption to VMs with customer, confidential, or PII data.
Conclusion
In this article, Contoso set up an Azure infrastructure and policy for Azure subscriptions, hybrid identity, disaster
recovery, networking, governance, and security.
Not all of the steps that Contoso completed here are required for a migration to the cloud. In this case, it wanted
to plan a network infrastructure that can be used for all types of migrations, and is secure, resilient, and scalable.
With this infrastructure in place, Contoso is ready to move on and try out migration.
Next steps
After setting up their Azure infrastructure, Contoso is ready to begin migrating workloads to the cloud. See the
migration patterns and examples overview section for a selection of scenarios using this sample infrastructure
as a migration target.
Rehost an on-premises app on Azure VMs
This article demonstrates how the fictional company Contoso rehosts a two-tier Windows .NET front-end app
running on VMware VMs, by migrating the app VMs to Azure VMs.
The SmartHotel360 app used in this example is provided as open source. If you'd like to use it for your own testing
purposes, you can download it from GitHub.
Business drivers
The IT Leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, and as a result there is pressure on their on-premises systems
and infrastructure.
Limit risk. The SmartHotel360 app is critical for the Contoso business. It wants to move the app to Azure with
zero risk.
Extend. Contoso doesn't want to modify the app, but does want to ensure that it's stable.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals are used to determine the best
migration method:
After migration, the app in Azure should have the same performance capabilities as it does today in VMware.
The app will remain as critical in the cloud as it is on-premises.
Contoso doesn't want to invest in this app. It is important to the business, but in its current form Contoso
simply wants to move it safely to the cloud.
Contoso doesn't want to change the ops model for this app. Contoso does want to interact with it in the cloud in
the same way that it does now.
Contoso doesn't want to change any app functionality. Only the app location will change.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution, and identifies
the migration process, including the Azure services that Contoso will use for the migration.
Current app
The app is tiered across two VMs (WEBVM and SQLVM ).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com ), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
Proposed architecture
Since the app is a production workload, the app VMs in Azure will reside in the production resource group
ContosoRG.
The app VMs will be migrated to the primary Azure region (East US 2) and placed in the production network
(VNET-PROD-EUS2).
The web front-end VM will reside in the front-end subnet (PROD-FE-EUS2) in the production network.
The database VM will reside in the database subnet (PROD-DB-EUS2) in the production network.
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Database considerations
As part of the solution design process, Contoso did a feature comparison between Azure SQL Database and SQL
Server. The following considerations helped them to decide to go with SQL Server running on an Azure IaaS VM:
Using an Azure VM running SQL Server seems to be an optimal solution if Contoso needs to customize the
operating system or the database server, or if it might want to colocate and run third-party apps on the same
VM.
With Software Assurance, Contoso can in the future exchange existing licenses for discounted rates on a SQL
Database Managed Instance using the Azure Hybrid Benefit for SQL Server. This can save up to 30% on
Managed Instance.
Solution review
Contoso evaluates the proposed design by putting together a pros and cons list.
Pros: Both the app VMs will be moved to Azure without changes, making the migration simple. Since Contoso is
using a lift and shift approach for both app VMs, no special configuration or migration tools are needed for the
app database.
Cons: WEBVM and SQLVM are running Windows Server 2008 R2. The operating system is supported by Azure
for specific roles (July 2018). Learn more. The web and data tiers of the app will remain a single point of failure.
Migration process
Contoso will migrate the app front-end and database VMs to Azure VMs with the Azure Migrate Server Migration
tool agentless method.
As a first step, Contoso prepares and sets up Azure components for Azure Migrate Server Migration, and
prepares the on-premises VMware infrastructure.
They already have the Azure infrastructure in place, so Contoso just needs to configure the replication of
the VMs through the Azure Migrate Server Migration tool.
With everything prepared, Contoso can start replicating the VMs.
After replication is enabled and working, Contoso will migrate the VM by failing it over to Azure.
Azure services
Azure Migrate Server Migration: The service orchestrates and manages migration of your on-premises apps and
workloads, and AWS/GCP VM instances. Cost: During replication to Azure, Azure Storage charges are incurred.
Azure VMs are created, and incur charges, when failover occurs. Learn more about charges and pricing.
Prerequisites
Here's what Contoso needs to run this scenario.
REQUIREMENTS DETAILS
Scenario steps
Here's how Contoso admins will run the migration:
Step 1: Prepare Azure for Azure Migrate Server Migration. They add the Server Migration tool to their
Azure Migrate project.
Step 2: Prepare on-premises VMware for Azure Migrate Server Migration. They prepare accounts for
VM discovery, and prepare to connect to Azure VMs after failover.
Step 3: Replicate VMs. They set up replication, and start replicating VMs to Azure storage.
Step 4: Migrate the VMs with Azure Migrate Server Migration. They run a test failover to make sure
everything's working, and then run a full failover to migrate the VMs to Azure.
Step 1: Prepare Azure for the Azure Migrate Server Migration tool
Here are the Azure components Contoso needs to migrate the VMs to Azure:
A VNet in which Azure VMs will be located when they're created during failover.
The Azure Migrate Server Migration tool provisioned.
They set these up as follows:
1. Set up a network: Contoso already set up a network that can be used for Azure Migrate Server Migration when
they deployed the Azure infrastructure.
The SmartHotel360 app is a production app, and the VMs will be migrated to the Azure production
network (VNET-PROD-EUS2) in the primary East US 2 region.
Both VMs will be placed in the ContosoRG resource group, which is used for production resources.
The app front-end VM (WEBVM) will migrate to the front-end subnet (PROD-FE-EUS2), in the
production network.
The app database VM (SQLVM) will migrate to the database subnet (PROD-DB-EUS2), in the
production network.
2. Provision the Azure Migrate Server Migration tool: With the network and storage account in place, Contoso
now creates a Recovery Services vault (ContosoMigrationVault), and places it in the ContosoFailoverRG
resource group in the primary East US 2 region.
NOTE
You can update replication settings any time before replication starts, in Manage > Replicating machines. Settings can't be
changed after replication starts.
3. In Test Migration, select the Azure VNet in which the Azure VM will be located after the migration. We
recommend you use a nonproduction VNet.
4. The Test migration job starts. Monitor the job in the portal notifications.
5. After the migration finishes, view the migrated Azure VM in Virtual Machines in the Azure portal. The
machine name has a suffix -Test.
6. After the test is done, right-click the Azure VM in Replicating machines, and click Clean up test
migration.
BCDR
For business continuity and disaster recovery (BCDR ), Contoso takes the following actions:
Keep data safe: Contoso backs up the data on the VMs using the Azure Backup service. Learn more.
Keep apps up and running: Contoso replicates the app VMs in Azure to a secondary region using Site Recovery.
Learn more.
Licensing and cost optimization
1. Contoso has existing licensing for its VMs, and will take advantage of the Azure Hybrid Benefit. Contoso will
convert the existing Azure VMs to take advantage of this pricing.
2. Contoso will enable Azure Cost Management licensed by Cloudyn, a Microsoft subsidiary. It's a multicloud cost
management solution that helps organizations use and manage Azure and other cloud resources. Learn more
about Azure Cost Management.
Conclusion
In this article, Contoso rehosted the SmartHotel360 app in Azure by migrating the app VMs to Azure VMs using
the Azure Migrate Server Migration tool.
Rearchitect an on-premises app to an Azure
container and Azure SQL Database
This article demonstrates how the fictional company Contoso rearchitects a two-tier Windows .NET app running
on VMware VMs as part of a migration to Azure. Contoso migrates the app front-end VM to an Azure Windows
container, and the app database to an Azure SQL database.
The SmartHotel360 app used in this example is provided as open source. If you'd like to use it for your own testing
purposes, you can download it from GitHub.
Business drivers
The Contoso IT leadership team has worked closely with business partners to understand what they want to
achieve with this migration:
Address business growth. Contoso is growing, and as a result there is pressure on its on-premises systems
and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures, and streamline processes for
developers and users. The business needs IT to be fast and not waste time or money, so that it can deliver faster
on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react
faster to changes in the marketplace, to enable success in a global economy. IT mustn't get in the way, or become
a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that are able to grow at the same
pace.
Reduce costs. Contoso wants to minimize licensing costs.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals were used to determine the best
migration method.
Azure reqs: Contoso wants to move the app to Azure, and run it in a container to extend app life. It doesn't want
to start completely from scratch to implement the app in Azure.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution, and identifies the
migration process, including the Azure services that Contoso will use for the migration.
Current app
The SmartHotel360 on-premises app is tiered across two VMs (WEBVM and SQLVM).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Proposed architecture
For the database tier of the app, Contoso compared Azure SQL Database with SQL Server using this article.
It decided to go with Azure SQL Database for a few reasons:
Azure SQL Database is a relational-database managed service. It delivers predictable performance at
multiple service levels, with near-zero administration. Advantages include dynamic scalability with no
downtime, built-in intelligent optimization, and global scalability and availability.
Contoso uses the lightweight Data Migration Assistant (DMA) to assess and migrate the on-premises
database to Azure SQL.
With Software Assurance, Contoso can exchange its existing licenses for discounted rates on a SQL
Database, using the Azure Hybrid Benefit for SQL Server. This could provide savings of up to 30%.
SQL Database provides several security features including Always Encrypted, dynamic data masking, and
row-level security/threat detection.
For the app web tier, Contoso has decided to convert it to a Windows container using Azure DevOps
services.
Contoso will deploy the app using Azure Service Fabric, and pull the Windows container image from the
Azure Container Registry (ACR ).
A prototype for extending the app to include sentiment analysis will be implemented as another service
in Service Fabric, connected to Cosmos DB. This will read information from tweets, and display it in the
app.
To implement a DevOps pipeline, Contoso will use Azure DevOps for source code management (SCM ), with
Git repos. Automated builds and releases will be used to build code, and deploy it to the Azure Container
Registry and Azure Service Fabric.
Solution review
Contoso evaluates the proposed design by putting together a pros and cons list.
Pros: Contoso can configure the web tier of the app with multiple instances, so that it's no longer a single point
of failure.
Migration process
1. Contoso provisions the Azure Service Fabric cluster for Windows.
2. It provisions an Azure SQL instance, and migrates the SmartHotel360 database to it.
3. Contoso converts the web tier VM to a Docker container using the Service Fabric SDK tools.
4. It connects the Service Fabric cluster and the ACR, and deploys the app using Azure Service Fabric.
Azure services
Data Migration Assistant (DMA): Assesses and detects compatibility issues that might affect database
functionality in Azure. DMA assesses feature parity between SQL sources and targets, and recommends
performance and reliability improvements. Cost: It's a downloadable tool, free of charge.
Azure SQL Database: Provides an intelligent, fully managed relational cloud database service. Cost: Based on
features, throughput, and size. Learn more.
Azure Container Registry: Stores images for all types of container deployments. Cost: Based on features, storage,
and usage duration. Learn more.
Azure Service Fabric: Builds and operates always-on, scalable, and distributed apps. Cost: Based on size, location,
and duration of the compute nodes. Learn more.
Prerequisites
Here's what Contoso needs to run this scenario:
REQUIREMENTS DETAILS
- Git
Scenario steps
Here's how Contoso runs the migration:
Step 1: Provision a SQL Database instance in Azure. Contoso provisions a SQL instance in Azure. After the
front-end web VM is migrated to an Azure container, the container instance with the app web front-end will
point to this database.
Step 2: Create an Azure Container Registry (ACR). Contoso provisions an enterprise container registry for
the docker container images.
Step 3: Provision Azure Service Fabric. It provisions a Service Fabric Cluster.
Step 4: Manage service fabric certificates. Contoso sets up certificates for Azure DevOps Services access to
the cluster.
Step 5: Migrate the database with DMA. It migrates the app database with the Data Migration Assistant.
Step 6: Set up Azure DevOps Services. Contoso sets up a new project in Azure DevOps Services, and
imports the code into the Git Repo.
Step 7: Convert the app. Contoso converts the app to a container using Azure DevOps and SDK tools.
Step 8: Set up build and release. Contoso sets up the build and release pipelines to create and publish the
app to the ACR and Service Fabric Cluster.
Step 9: Extend the app. After the app is public, Contoso extends it to take advantage of Azure capabilities, and
republishes it to Azure using the pipeline.
3. They set up a new SQL Server instance (sql-smarthotel-eus2) in the primary region.
4. They set the pricing tier to match server and database needs, and select the Azure Hybrid Benefit option to
save money, because they already have a SQL Server license.
5. For sizing, they use vCore-based purchasing, and set the limits for the expected requirements.
2. They provide a name for the registry ( contosoacreus2), and place it in the primary region, in the resource
group they use for their infrastructure resources. They enable access for admin users, and set it as a
premium SKU so that they can use geo-replication.
Step 3: Provision Azure Service Fabric
The SmartHotel360 container will run in the Azure Service Fabric Cluster. Contoso admins create the Service
Fabric Cluster as follows:
1. Create a Service Fabric resource from the Azure Marketplace.
2. In Basics, they provide a unique DNS name for the cluster, and credentials for accessing the on-premises VM.
They place the resource in the production resource group (ContosoRG) in the primary East US 2 region.
3. In Node type configuration, they input a node type name, durability settings, VM size, and app endpoints.
4. In Create Key Vault, they create a new Key Vault in their infrastructure resource group, to house the
certificate.
5. In Access policies, they enable access to virtual machines to deploy the key vault.
10. After the cluster is provisioned, they connect to the Service Fabric Cluster Explorer.
11. They need to select the correct certificate.
12. The Service Fabric Explorer loads, and the Contoso Admin can manage the cluster.
5. They enter the name of the certificate, and provide an X.509 distinguished name in Subject.
7. Now, they go back to the certificates list in the Key Vault, and copy the thumbprint of the client certificate
that's just been created. They save it in the text file.
8. For Azure DevOps Services deployment, they need to determine the Base64 value of the certificate. They do
this on the local developer workstation using PowerShell. They paste the output into a text file for later use.
[System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes("C:\path\to\certificate.pfx"))
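The same conversion can also be done cross-platform. A minimal Python sketch of the Base64 encoding step (the certificate path is illustrative):

```python
import base64

def pfx_to_base64(path: str) -> str:
    """Read a certificate file and return its Base64-encoded contents,
    equivalent to the PowerShell ToBase64String call above."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

# Example (illustrative path):
# encoded = pfx_to_base64(r"C:\path\to\certificate.pfx")
```

The output string can then be pasted into a text file for later use, just as with the PowerShell approach.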
9. Finally, they add the new certificate to the Service Fabric cluster. To do this, in the portal they open the
cluster, and select Security.
10. They select Add > Admin Client, and paste in the thumbprint of the new client certificate. Then they select
Add. This can take up to 15 minutes.
3. In the migration details, they add SQLVM as the source server, and the SmartHotel.Registration database.
4. They receive an error that seems to be associated with authentication. However, after investigating, the
issue is the period (.) in the database name. As a workaround, they decide to provision a new SQL database
using the name SmartHotel-Registration to resolve the issue. When they run DMA again, they're able to
select SmartHotel-Registration, and continue with the wizard.
5. In Select Objects, they select the database tables, and generate a SQL script.
10. They delete the extra SQL database SmartHotel.Registration in the Azure portal.
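The renaming workaround above can be expressed as a small helper. A sketch (the function is ours, not part of DMA):

```python
def safe_db_name(name: str) -> str:
    """Replace the character that caused the DMA issue (a period in the
    database name) with a hyphen, mirroring Contoso's workaround of
    using SmartHotel-Registration instead of SmartHotel.Registration."""
    return name.replace(".", "-")

print(safe_db_name("SmartHotel.Registration"))  # SmartHotel-Registration
```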
2. They import the Git Repo that currently holds their app code. It's in a public repo and you can download it.
3. After the code is imported, they connect Visual Studio to the repo, and clone the code using Team Explorer.
4. After the repository is cloned to the developer machine, they open the solution file for the app. The web app
and WCF service each have a separate project within the file.
2. They right-click the web app > Add > Container Orchestrator Support.
3. In Add Container Orchestrator Support, they select Service Fabric.
6. Visual Studio creates the Dockerfile, and pulls down the required images locally to the developer machine.
7. A manifest file (ServiceManifest.xml) is created and opened by Visual Studio. This file tells Service Fabric
how to configure the container when it's deployed to Azure.
8. Another manifest file (ApplicationManifest.xml) contains the application-level configuration for the containers.
9. They open the ApplicationParameters/Cloud.xml file, and update the connection string to connect the
app to the Azure SQL database. The connection string can be located in the database in the Azure portal.
10. They commit the updated code and push to Azure DevOps Services.
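The connection-string update in step 9 can also be scripted. A minimal Python sketch, assuming the standard Service Fabric ApplicationParameters file layout (the parameter name in any given project may differ, so check your own Cloud.xml):

```python
import xml.etree.ElementTree as ET

SF_NS = "http://schemas.microsoft.com/2011/01/fabric"

def set_parameter(xml_path: str, name: str, value: str) -> None:
    """Set the Value of a named Parameter element (e.g., a connection
    string) in a Service Fabric ApplicationParameters file."""
    ET.register_namespace("", SF_NS)  # keep the default namespace on write
    tree = ET.parse(xml_path)
    for param in tree.getroot().iter(f"{{{SF_NS}}}Parameter"):
        if param.get("Name") == name:
            param.set("Value", value)
    tree.write(xml_path, xml_declaration=True, encoding="utf-8")
```

The connection string value itself can be copied from the database blade in the Azure portal, as described above.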
Step 8: Build and release pipelines in Azure DevOps Services
Contoso admins now configure Azure DevOps Services to perform the build and release process that puts
DevOps practices into action.
1. In Azure DevOps Services, they select Build and release > New pipeline.
2. They select Azure DevOps Services Git and the relevant repo.
3. In Select a template, they select the Service Fabric with Docker support template.
4. They change the Tag images action to Build an image, and configure the task to use the provisioned ACR.
5. In the Push images task, they configure the image to be pushed to the ACR, and select to include the latest
tag.
6. In Triggers, they enable continuous integration, and add the master branch.
9. They select the Azure Service Fabric deployment template, and name the Stage (SmartHotelSF).
10. They provide a pipeline name (ContosoSmartHotel360Rearchitect). For the stage, they select 1 job, 1
task to configure the Service Fabric deployment.
15. They select the project and build pipeline, using the latest version.
16. Note that the lightning bolt on the artifact is checked.
20. To connect to the app, they direct traffic to the public IP address of the Azure load balancer in front of the
Service Fabric nodes.
3. In Getting Started, they select Data Explorer, and add a new collection.
4. In Add Collection they provide IDs and set storage capacity and throughput.
5. In the portal, they open the new database > Collection > Documents and select New Document.
6. They paste the following JSON code into the document window. This is sample data in the form of a single
tweet.
{
  "id": "2ed5e734-8034-bf3a-ac85-705b7713d911",
  "tweetId": 927750234331580911,
  "tweetUrl": "https://twitter.com/status/927750237331580911",
  "userName": "CoreySandersWA",
  "userAlias": "@CoreySandersWA",
  "userPictureUrl": "",
  "text": "This is a tweet about #SmartHotel360",
  "language": "en",
  "sentiment": 0.5,
  "retweet_count": 1,
  "followers": 500,
  "hashtags": [
    ""
  ]
}
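Before uploading, a document like the sample above can be sanity-checked. A small Python sketch (the required field list and the 0-1 sentiment range are assumptions based on the sample, not a documented Cosmos DB schema):

```python
import json

# Fields the sentiment prototype is assumed to read from each document.
REQUIRED_FIELDS = {"id", "tweetId", "tweetUrl", "userName", "text", "sentiment"}

def validate_tweet_doc(raw: str) -> dict:
    """Parse a tweet document and check it has the expected shape."""
    doc = json.loads(raw)
    missing = REQUIRED_FIELDS - doc.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not 0.0 <= doc["sentiment"] <= 1.0:
        raise ValueError("sentiment must be between 0 and 1")
    return doc
```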
7. They locate the Cosmos DB endpoint, and the authentication key. These are used in the app to connect to
the collection. In the database, they select Keys, and copy the URI and primary key to Notepad.
3. They can now click through the services to see that the SentimentIntegration app is up and running.
Clean up after migration
After migration, Contoso needs to complete these cleanup steps:
Remove the on-premises VMs from the vCenter inventory.
Remove the VMs from local backup jobs.
Update internal documentation to show the new locations for the SmartHotel360 app. Show the database as
running in Azure SQL database, and the front end as running in Service Fabric.
Review any resources that interact with the decommissioned VMs, and update any relevant settings or
documentation to reflect the new configuration.
Conclusion
In this article, Contoso refactored the SmartHotel360 app in Azure by migrating the app front-end VM to Service
Fabric. The app database was migrated to an Azure SQL database.
Rehost an on-premises Linux app to Azure VMs
This article shows how the fictional company Contoso rehosts a two-tier Linux-based Apache/MySQL/PHP
(LAMP) app, using Azure IaaS VMs.
osTicket, the service desk app used in this example, is provided as open source. If you'd like to use it for your own
testing purposes, you can download it from GitHub.
Business drivers
The IT Leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, and as a result there's pressure on the on-premises systems
and infrastructure.
Limit risk. The service desk app is critical for the Contoso business. Contoso wants to move it to Azure with
zero risk.
Extend. Contoso doesn't want to change the app right now. It simply wants to ensure that the app is stable.
Migration goals
The Contoso cloud team has pinned down goals for this migration, to determine the best migration method:
After migration, the app in Azure should have the same performance capabilities as it does today in their on-
premises VMware environment. The app will remain as critical in the cloud as it is on-premises.
Contoso doesn't want to invest in this app. It is important to the business, but in its current form Contoso
simply wants to move it safely to the cloud.
Contoso doesn't want to change the ops model for this app. It wants to interact with the app in the cloud in the
same way that it does now.
Contoso doesn't want to change app functionality. Only the app location will change.
Having completed a couple of Windows app migrations, Contoso wants to learn how to use a Linux-based
infrastructure in Azure.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution, and identifies the
migration process, including the Azure services that Contoso will use for the migration.
Current app
The osTicket app is tiered across two VMs (OSTICKETWEB and OSTICKETMYSQL).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
Proposed architecture
Since the app is a production workload, the VMs in Azure will reside in the production resource group
ContosoRG.
The VMs will be migrated to the primary region (East US 2) and placed in the production network
(VNET-PROD-EUS2):
The web VM will reside in the front-end subnet (PROD-FE-EUS2).
The database VM will reside in the database subnet (PROD-DB-EUS2).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Solution review
Contoso evaluates the proposed design by putting together a pros and cons list.
Pros: Both the app VMs will be moved to Azure without changes, making the migration simple. Since Contoso is
using a lift and shift approach for both app VMs, no special configuration or migration tools are needed for the
app database.
Cons: The web and data tiers of the app will remain a single point of failure.
Azure services
Azure Migrate Server Migration: The service orchestrates and manages migration of your on-premises apps and
workloads, and AWS/GCP VM instances. Cost: During replication to Azure, Azure Storage charges are incurred.
Azure VMs are created, and incur charges, when failover occurs. Learn more about charges and pricing.
Prerequisites
Here's what Contoso needs for this scenario.
REQUIREMENTS DETAILS
On-premises VMs Review Linux machines that are endorsed to run on Azure.
Scenario steps
Here's how Contoso will complete the migration:
Step 1: Prepare Azure for Azure Migrate Server Migration. They add the Server Migration tool to their
Azure Migrate project.
Step 2: Prepare on-premises VMware for Azure Migrate Server Migration. They prepare accounts for
VM discovery, and prepare to connect to Azure VMs after failover.
Step 3: Replicate VMs. They set up replication, and start replicating VMs to Azure storage.
Step 4: Migrate the VMs with Azure Migrate Server Migration. They run a test failover to make sure
everything's working, and then run a full failover to migrate the VMs to Azure.
Step 1: Prepare Azure for the Azure Migrate Server Migration tool
Here are the Azure components Contoso needs to migrate the VMs to Azure:
A VNet in which Azure VMs will be located when they're created during failover.
The Azure Migrate Server Migration tool provisioned.
They set these up as follows:
1. Set up a network: Contoso already set up a network that can be used for Azure Migrate Server Migration when
they deployed the Azure infrastructure.
The osTicket app is a production app, and the VMs will be migrated to the Azure production
network (VNET-PROD-EUS2) in the primary East US 2 region.
Both VMs will be placed in the ContosoRG resource group, which is used for production resources.
The web VM (OSTICKETWEB) will migrate to the front-end subnet (PROD-FE-EUS2), in the
production network.
The database VM (OSTICKETMYSQL) will migrate to the database subnet (PROD-DB-EUS2), in the
production network.
2. Provision the Azure Migrate Server Migration tool: With the network and storage account in place,
Contoso now creates a Recovery Services vault (ContosoMigrationVault), and places it in the
ContosoFailoverRG resource group in the primary East US 2 region.
2. In Replicate > Source settings > Are your machines virtualized?, select Yes, with VMware vSphere.
3. In On-premises appliance, select the name of the Azure Migrate appliance that you set up > OK.
5. In Virtual machines, search for VMs as needed, and check each VM you want to migrate. Then click Next:
Target settings.
6. In Target settings, select the subscription, and target region to which you'll migrate, and specify the
resource group in which the Azure VMs will reside after migration. In Virtual Network, select the Azure
VNet/subnet to which the Azure VMs will be joined after migration.
7. In Azure Hybrid Benefit, select the following:
Select No if you don't want to apply Azure Hybrid Benefit. Then click Next.
Select Yes if you have Windows Server machines that are covered with active Software Assurance or
Windows Server subscriptions, and you want to apply the benefit to the machines you're migrating. Then
click Next.
8. In Compute, review the VM name, size, OS disk type, and availability set. VMs must conform with Azure
requirements.
VM size: If you're using assessment recommendations, the VM size dropdown will contain the
recommended size. Otherwise Azure Migrate picks a size based on the closest match in the Azure
subscription. Alternatively, pick a manual size in Azure VM size.
OS disk: Specify the OS (boot) disk for the VM. The OS disk is the disk that has the operating system
bootloader and installer.
Availability set: If the VM should be in an Azure availability set after migration, specify the set. The set
must be in the target resource group you specify for the migration.
9. In Disks, specify whether the VM disks should be replicated to Azure, and select the disk type (standard
SSD/HDD or premium-managed disks) in Azure. Then click Next.
You can exclude disks from replication.
If you exclude disks, they won't be present on the Azure VM after migration.
10. In Review and start replication, review the settings, and click Replicate to start the initial replication for
the servers.
NOTE
You can update replication settings any time before replication starts, in Manage > Replicating machines. Settings can't be
changed after replication starts.
3. In Test Migration, select the Azure VNet in which the Azure VM will be located after the migration. We
recommend you use a nonproduction VNet.
4. The Test migration job starts. Monitor the job in the portal notifications.
5. After the migration finishes, view the migrated Azure VM in Virtual Machines in the Azure portal. The
machine name has a suffix -Test.
6. After the test is done, right-click the Azure VM in Replicating machines, and click Clean up test
migration.
2. They need to make sure that the OSTICKETWEB VM can communicate with the OSTICKETMYSQL VM.
Currently the configuration is hardcoded with the on-premises IP address 172.16.0.43.
(Screenshots show the configuration file before and after the IP address update.)
4. Finally, they update the DNS records for OSTICKETWEB and OSTICKETMYSQL, on one of the Contoso
domain controllers.
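The hardcoded IP update in step 2 could be scripted rather than edited by hand. A Python sketch (the config path and the new address are illustrative, not taken from the scenario):

```python
from pathlib import Path

def update_db_host(config_path: str, old_ip: str, new_host: str) -> int:
    """Replace every occurrence of the hardcoded on-premises database IP
    in a config file with the new address, returning how many were found."""
    p = Path(config_path)
    text = p.read_text()
    count = text.count(old_ip)
    p.write_text(text.replace(old_ip, new_host))
    return count

# Example (illustrative path and target address):
# update_db_host("/var/www/osticket/include/ost-config.php",
#                "172.16.0.43", "10.0.0.5")
```

Returning the occurrence count gives a quick check that the expected number of references was actually rewritten.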
Need more help?
Learn about running a test failover.
Learn about migrating VMs to Azure.
This article shows how the fictional company Contoso rehosts a two-tier Linux-based Apache/MySQL/PHP
(LAMP) app, migrating it from on-premises to Azure using Azure VMs and Azure Database for MySQL.
osTicket, the service desk app used in this example, is provided as open source. If you'd like to use it for your own
testing, you can download it from GitHub.
Business drivers
The IT Leadership team has worked closely with business partners to understand what they want to achieve:
Address business growth. Contoso is growing, and as a result there's pressure on the on-premises systems
and infrastructure.
Limit risk. The service desk app is critical for the business. Contoso wants to move it to Azure with zero risk.
Extend. Contoso doesn't want to change the app right now. It simply wants to keep the app stable.
Migration goals
The Contoso cloud team has pinned down goals for this migration, in order to determine the best migration
method:
After migration, the app in Azure should have the same performance capabilities as it does today in their on-
premises VMware environment. The app will remain as critical in the cloud as it is on-premises.
Contoso doesn't want to invest in this app. It's important to the business, but in its current form Contoso simply
wants to move it safely to the cloud.
Having completed a couple of Windows app migrations, Contoso wants to learn how to use a Linux-based
infrastructure in Azure.
Contoso wants to minimize database admin tasks after the application is moved to the cloud.
Proposed architecture
In this scenario:
The app is tiered across two VMs (OSTICKETWEB and OSTICKETMYSQL).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
The web tier app on OSTICKETWEB will be migrated to an Azure IaaS VM.
The app database will be migrated to the Azure Database for MySQL PaaS service.
Since Contoso is migrating a production workload, the resources will reside in the production resource group
ContosoRG.
The resources will be replicated to the primary region (East US 2), and placed in the production network
(VNET-PROD-EUS2):
The web VM will reside in the front-end subnet (PROD-FE-EUS2).
The database instance will reside in the database subnet (PROD-DB-EUS2).
The app database will be migrated to Azure Database for MySQL using MySQL tools.
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Migration process
Contoso will complete the migration process as follows:
To migrate the web VM:
1. As a first step, Contoso sets up the Azure and on-premises infrastructure needed to deploy Site Recovery.
2. After preparing the Azure and on-premises components, Contoso sets up and enables replication for the web
VM.
3. After replication is up-and-running, Contoso migrates the VM by failing it over to Azure.
To migrate the database:
1. Contoso provisions a MySQL instance in Azure.
2. Contoso sets up MySQL workbench, and backs up the database locally.
3. Contoso then restores the database from the local backup to Azure.
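The backup-and-restore step can be sketched from the command line as an alternative to the MySQL Workbench UI. The on-premises host address, schema name, Azure server name, and admin user shown here are assumptions for illustration:

```shell
# Back up the osticket database locally from the on-premises server
# (hypothetical host 172.16.0.43 and schema name).
mysqldump --host=172.16.0.43 --user=root -p \
  --single-transaction --routines --triggers \
  osticket > osticket-backup.sql

# Restore the backup into the Azure Database for MySQL instance.
# Single-server instances use the user@servername login format;
# the server and admin names here are hypothetical.
mysql --host=contosoosticket.mysql.database.azure.com \
  --user=dbadmin@contosoosticket -p \
  osticket < osticket-backup.sql
```

The `--single-transaction` flag keeps the export consistent without locking the production tables during the dump.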
Azure services
| SERVICE | DESCRIPTION | COST |
| --- | --- | --- |
| Azure Site Recovery | The service orchestrates and manages migration and disaster recovery for Azure VMs, and on-premises VMs and physical servers. | During replication to Azure, Azure Storage charges are incurred. Azure VMs are created, and incur charges, when failover occurs. Learn more about charges and pricing. |
Prerequisites
Here's what Contoso needs for this scenario.
REQUIREMENTS DETAILS
Scenario steps
Here's how Contoso admins will complete the migration:
Step 1: Prepare Azure for Site Recovery. They create an Azure storage account to hold replicated data, and
create a Recovery Services vault.
Step 2: Prepare on-premises VMware for Site Recovery. They prepare accounts for VM discovery and
agent installation, and prepare to connect to Azure VMs after failover.
Step 3: Provision the database. In Azure, they provision an instance of Azure Database for MySQL.
Step 4: Replicate VMs. They configure the Site Recovery source and target environment, set up a replication
policy, and start replicating VMs to Azure storage.
Step 5: Migrate the database. They set up migration with MySQL tools.
Step 6: Migrate the VMs with Site Recovery. Lastly, they run a test failover to make sure everything's
working, and then run a full failover to migrate the VMs to Azure.
2. With the network and storage account in place, they create a vault (ContosoMigrationVault), and place it in
the ContosoFailoverRG resource group, in the primary East US 2 region.
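Step 1 can be sketched with the Azure CLI. The resource group, vault name, and region come from the walkthrough; the storage account name is hypothetical:

```shell
# Resource group for the migration resources (name from the walkthrough).
az group create --name ContosoFailoverRG --location eastus2

# Storage account that holds replicated data during migration
# (account name is hypothetical; must be globally unique).
az storage account create \
  --name contosomigrationsa \
  --resource-group ContosoFailoverRG \
  --location eastus2 \
  --sku Standard_LRS \
  --kind StorageV2

# Recovery Services vault used by Site Recovery.
az backup vault create \
  --name ContosoMigrationVault \
  --resource-group ContosoFailoverRG \
  --location eastus2
```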
Need more help?
Learn about setting up Azure for Site Recovery.
2. They add the name contosoosticket for the Azure database. They add the database to the production
resource group ContosoRG, and specify credentials for it.
3. The on-premises MySQL database is version 5.7, so they select this version for compatibility. They use the
default sizes, which match their database requirements.
4. For Backup Redundancy Options, they select Geo-Redundant. This option allows them to restore
the database in their secondary Central US region if an outage occurs. They can only configure this option
when they provision the database.
5. In the VNET-PROD-EUS2 network > Service endpoints, they add a service endpoint for the SQL service
on the database subnet.
6. After adding the subnet, they create a virtual network rule that allows access from the database subnet in
the production network.
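The provisioning and network steps above might look like the following with the Azure CLI. The server name, MySQL version, redundancy option, network, and subnet come from the walkthrough; the admin credentials, SKU, and networking resource group are assumptions:

```shell
# Provision the Azure Database for MySQL server. Geo-redundant backup
# can only be enabled at provisioning time.
az mysql server create \
  --name contosoosticket \
  --resource-group ContosoRG \
  --location eastus2 \
  --admin-user dbadmin \
  --admin-password '<strong-password>' \
  --version 5.7 \
  --sku-name GP_Gen5_2 \
  --geo-redundant-backup Enabled

# Add a Microsoft.Sql service endpoint on the database subnet
# (networking resource group name is an assumption).
az network vnet subnet update \
  --resource-group ContosoNetworkingRG \
  --vnet-name VNET-PROD-EUS2 \
  --name PROD-DB-EUS2 \
  --service-endpoints Microsoft.Sql

# Allow access from the database subnet with a virtual network rule.
az mysql server vnet-rule create \
  --name AllowProdDbSubnet \
  --resource-group ContosoRG \
  --server-name contosoosticket \
  --vnet-name VNET-PROD-EUS2 \
  --subnet PROD-DB-EUS2
```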
12. Now, they download and install MySQL Server and VMware PowerCLI.
13. After validation, they specify the FQDN or IP address of the vCenter server or vSphere host. They leave the
default port, and specify a friendly name for the vCenter server.
14. They input the account that they created for automatic discovery, and the credentials that Site Recovery will
use to automatically install the Mobility Service.
15. After registration finishes, in the Azure portal, they check that the configuration server and VMware server
are listed on the Source page in the vault. Discovery can take 15 minutes or more.
16. With everything in place, Site Recovery connects to VMware servers, and discovers VMs.
Set up the target
Now Contoso admins input target replication settings.
1. In Prepare infrastructure > Target, they select the target settings.
2. Site Recovery checks that there's an Azure storage account and network in the specified target.
Create a replication policy
With the source and target set up, Contoso admins are ready to create a replication policy.
1. In Prepare infrastructure > Replication Settings > Replication Policy > Create and Associate, they
create a policy ContosoMigrationPolicy.
2. They use the default settings:
RPO threshold: Default of 60 minutes. This value defines how often recovery points are created. An
alert is generated if continuous replication exceeds this limit.
Recovery point retention: Default of 24 hours. This value specifies how long the retention window
is for each recovery point. Replicated VMs can be recovered to any point in a window.
App-consistent snapshot frequency: Default of one hour. This value specifies the frequency at
which application-consistent snapshots are created.
3. Now they specify the target settings. These include the resource group and network in which the Azure VM
will be located after failover, and the storage account in which replicated data will be stored.
4. They select OSTICKETWEB for replication.
5. In the VM properties, they select the account that should be used to automatically install the Mobility
Service on the VM.
6. In Replication settings > Configure replication settings, they check that the correct replication policy is
applied, and select Enable Replication. The Mobility service will be automatically installed.
7. They track replication progress in Jobs. After the Finalize Protection job runs, the machine is ready for
failover.
Need more help?
You can read a full walkthrough of all these steps in Enable replication.
6. Now, they can import (restore) the database in the Azure Database for MySQL instance, from the self-
contained file. A new schema (osticket) is created for the instance.
Step 6: Migrate the VMs with Site Recovery
Finally, Contoso admins run a quick test failover, and then migrate the VM.
Run a test failover
Running a test failover helps verify that everything's working as expected, before the migration.
1. They run a test failover to the latest available point in time (Latest processed).
2. They select Shut down machine before beginning failover, so that Site Recovery attempts to shut down
the source VM before triggering the failover. Failover continues even if shutdown fails.
3. Test failover runs:
A prerequisites check runs to make sure all of the conditions required for migration are in place.
Failover processes the data, so that an Azure VM can be created. If they selected the latest recovery point, a
recovery point is created from the data.
An Azure VM is created using the data processed in the previous step.
4. After the failover finishes, the replica Azure VM appears in the Azure portal. They check that the VM is the
appropriate size, that it's connected to the right network, and that it's running.
5. After verifying, they clean up the failover, and record and save any observations.
Migrate the VM
To migrate the VM, Contoso admins create a recovery plan that includes the VM, and fail the plan over to Azure.
1. They create a plan, and add OSTICKETWEB to it.
2. They run a failover on the plan. They select the latest recovery point, and specify that Site Recovery should
try to shut down the on-premises VM before triggering the failover. They can follow the failover progress on
the Jobs page.
3. During the failover, vCenter Server issues commands to stop the two VMs running on the ESXi host.
4. After the failover, they verify that the Azure VM appears as expected in the Azure portal.
5. After checking the VM, they complete the migration. This stops replication for the VM, and stops Site
Recovery billing for the VM.
Need more help?
Learn about running a test failover.
Learn how to create a recovery plan.
Learn about failing over to Azure.
Connect the VM to the database
As the final step in the migration process, Contoso admins update the connection string of the app to point to the
Azure Database for MySQL.
1. They make an SSH connection to the OSTICKETWEB VM using PuTTY or another SSH client. The VM is
private, so they connect using the private IP address.
2. They update settings so that the OSTICKETWEB VM can communicate with the OSTICKETMYSQL
database. Currently the configuration is hardcoded with the on-premises IP address 172.16.0.43.
Before the update:
After the update:
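The configuration change can be sketched as follows. In production the file would be osTicket's configuration file (for example `include/ost-config.php` in a standard install), and the Azure server name is hypothetical; here a sample file stands in for the real configuration so the before/after edit can be shown end to end:

```shell
# Sample stand-in for osTicket's configuration file; in production CONFIG
# would point at the real ost-config.php.
CONFIG=${CONFIG:-ost-config.php}
printf "define('DBHOST','172.16.0.43');\n" > "$CONFIG"   # hardcoded on-premises IP (before)

# Point DBHOST at the Azure Database for MySQL endpoint (after);
# the server name is hypothetical.
sed -i "s/172\.16\.0\.43/contosoosticket.mysql.database.azure.com/" "$CONFIG"

cat "$CONFIG"   # define('DBHOST','contosoosticket.mysql.database.azure.com');
```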
4. Finally, they update the DNS records for OSTICKETWEB, on one of the Contoso domain controllers.
Clean up after migration
With migration complete, the osTicket app tiers are running on Azure VMs.
Now, Contoso needs to do the following:
Remove the VMware VMs from the vCenter inventory.
Remove the on-premises VMs from local backup jobs.
Update internal documentation to show new locations and IP addresses.
Review any resources that interact with the on-premises VMs, and update any relevant settings or
documentation to reflect the new configuration.
Contoso used the Azure Migrate service with dependency mapping to assess the OSTICKETWEB VM for
migration. They should now remove the agents they installed for this purpose (the Microsoft Monitoring
Agent and the Microsoft Dependency agent) from the VM.
This article shows how the fictional company Contoso migrates a two-tier Windows .NET front-end app running
on VMware VMs to an Azure VM using the Azure Site Recovery service. It also shows how Contoso migrates the
app database to Azure SQL Database Managed Instance.
The SmartHotel360 app used in this example is provided as open source. If you'd like to use it for your own testing
purposes, you can download it from GitHub.
Business drivers
Contoso's IT leadership team has worked closely with the company's business partners to understand what the
business wants to achieve with this migration:
Address business growth. Contoso is growing. As a result, pressure has increased on the company's on-
premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures, and to streamline processes for its
developers and users. The business needs IT to be fast and to not waste time or money, so the company can
deliver faster on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react
faster than the changes that occur in the marketplace for the company to be successful in a global economy. IT
at Contoso must not get in the way or become a business blocker.
Scale. As the company's business grows successfully, Contoso IT must provide systems that can grow at the
same pace.
Migration goals
The Contoso cloud team has identified goals for this migration. The company uses migration goals to determine
the best migration method.
After migration, the app in Azure should have the same performance capabilities that the app has today in
Contoso's on-premises VMware environment. Moving to the cloud doesn't mean that app performance is less
critical.
Contoso doesn't want to invest in the app. The app is critical and important to the business, but Contoso simply
wants to move the app in its current form to the cloud.
Database administration tasks should be minimized after the app is migrated.
Contoso doesn't want to use an Azure SQL Database for this app. It's looking for alternatives.
Solution design
After pinning down their goals and requirements, Contoso designs and reviews a deployment solution, and
identifies the migration process, including the Azure services that it will use for the migration.
Current architecture
Contoso has one main datacenter (contoso-datacenter). The datacenter is located in the city of New York in
the Eastern United States.
Contoso has three additional local branches across the United States.
The main datacenter is connected to the internet with a fiber Metro Ethernet connection (500 Mbps).
Each branch is connected locally to the internet by using business-class connections with IPsec VPN tunnels
back to the main datacenter. The setup allows Contoso's entire network to be permanently connected and
optimizes internet connectivity.
The main datacenter is fully virtualized with VMware. Contoso has two ESXi 6.5 virtualization hosts that are
managed by vCenter Server 6.5.
Contoso uses Active Directory for identity management. Contoso uses DNS servers on the internal network.
Contoso has an on-premises domain controller (contosodc1).
The domain controllers run on VMware VMs. The domain controllers at local branches run on physical servers.
The SmartHotel360 app is tiered across two VMs (WEBVM and SQLVM) that are located on a VMware ESXi
version 6.5 host (contosohost1.contoso.com).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com) running on a VM.
Proposed architecture
In this scenario, Contoso wants to migrate its two-tier on-premises travel app as follows:
Migrate the app database (SmartHotelDB) to an Azure SQL Database Managed Instance.
Migrate the front-end WebVM to an Azure VM.
The on-premises VMs in the Contoso datacenter will be decommissioned when the migration is finished.
Database considerations
As part of the solution design process, Contoso did a feature comparison between Azure SQL Database and SQL
Server Managed Instance. The following considerations helped them to decide to go with Managed Instance.
Managed Instance aims to deliver almost 100% compatibility with the latest on-premises SQL Server version.
Microsoft recommends Managed Instance for customers running SQL Server on-premises or on IaaS VMs who
want to migrate their apps to a fully managed service with minimal design changes.
Contoso is planning to migrate a large number of apps from on-premises to IaaS. Many of these are
ISV-provided. Contoso realizes that using Managed Instance will help ensure database compatibility for these
apps, rather than using SQL Database, which the ISVs might not support.
Contoso can simply do a lift and shift migration to Managed Instance using the fully automated Azure Database
Migration Service. With this service in place, Contoso can reuse it for future database migrations.
SQL Managed Instance supports SQL Server Agent, which is an important requirement for the SmartHotel360 app.
Contoso needs this compatibility; otherwise, it will have to redesign the maintenance plans required by the app.
With Software Assurance, Contoso can exchange their existing licenses for discounted rates on a SQL Database
Managed Instance using the Azure Hybrid Benefit for SQL Server. This can allow Contoso to save up to 30% on
Managed Instance.
SQL Managed Instance is fully contained in the virtual network, so it provides greater isolation and security for
Contoso's data. Contoso can get the benefits of the public cloud, while keeping the environment isolated from
the public Internet.
Managed Instance supports many security features, including Always Encrypted, dynamic data masking,
row-level security, and threat detection.
Solution review
Contoso evaluates the proposed design by putting together a pros and cons list.
| CONSIDERATION | DETAILS |
| --- | --- |
| Cons | For the data tier, Managed Instance might not be the best solution if Contoso wants to customize the operating system or the database server, or if they want to run third-party apps along with SQL Server. Running SQL Server on an IaaS VM could provide this flexibility. |
Migration process
Contoso will migrate the web and data tiers of its SmartHotel360 app to Azure by completing these steps:
1. Contoso already has its Azure infrastructure in place, so it just needs to add a couple of specific Azure
components for this scenario.
2. The data tier will be migrated by using the Azure Database Migration Service. This service connects to the
on-premises SQL Server VM across a site-to-site VPN connection between the Contoso datacenter and
Azure. The service then migrates the database.
3. The web tier will be migrated by using a lift and shift migration by using Site Recovery. The process entails
preparing the on-premises VMware environment, setting up and enabling replication, and migrating the
VMs by failing them over to Azure.
Azure services
| SERVICE | DESCRIPTION | COST |
| --- | --- | --- |
| Azure Database Migration Service | The Azure Database Migration Service enables seamless migration from multiple database sources to Azure data platforms with minimal downtime. | Learn about supported regions and Database Migration Service pricing. |
| Azure SQL Database Managed Instance | Managed Instance is a managed database service that represents a fully managed SQL Server instance in the Azure cloud. It uses the same code as the latest version of SQL Server Database Engine, and has the latest features, performance improvements, and security patches. | Using a SQL Database Managed Instance running in Azure incurs charges based on capacity. Learn more about Managed Instance pricing. |
| Azure Site Recovery | The Site Recovery service orchestrates and manages migration and disaster recovery for Azure VMs and on-premises VMs and physical servers. | During replication to Azure, Azure Storage charges are incurred. Azure VMs are created and incur charges when failover occurs. Learn more about Site Recovery charges and pricing. |
Prerequisites
Contoso and other users must meet the following prerequisites for this scenario:
| REQUIREMENTS | DETAILS |
| --- | --- |
| Azure subscription | You should have already created a subscription when you performed the assessment in the first article in this series. If you don't have an Azure subscription, create a free account. |
| Site Recovery (on-premises) | Your on-premises vCenter Server instance should be running version 5.5, 6.0, or 6.5. |
| Database Migration Service | For the Azure Database Migration Service, you need a compatible on-premises VPN device. Make sure that the service account running the source SQL Server instance has write permissions on the network share. |
Scenario steps
Here's how Contoso plans to set up the deployment:
Step 1: Set up a SQL Database Managed Instance. Contoso needs an existing managed instance to which
the on-premises SQL Server database will migrate.
Step 2: Prepare the Azure Database Migration Service. Contoso must register the database migration
provider, create an instance, and then create an Azure Database Migration Service project. Contoso also must
set up a shared access signature (SAS) uniform resource identifier (URI) for the Azure Database Migration
Service. An SAS URI provides delegated access to resources in Contoso's storage account, so Contoso can
grant limited permissions to storage objects. Contoso sets up an SAS URI, so the Azure Database Migration
Service can access the storage account container to which the service uploads the SQL Server backup files.
Step 3: Prepare Azure for Site Recovery. Contoso must create a storage account to hold replicated data for
Site Recovery. It also must create an Azure Recovery Services vault.
Step 4: Prepare on-premises VMware for Site Recovery. Contoso will prepare accounts for VM discovery
and agent installation to connect to Azure VMs after failover.
Step 5: Replicate VMs. To set up replication, Contoso configures the Site Recovery source and target
environments, sets up a replication policy, and starts replicating VMs to Azure Storage.
Step 6: Migrate the database using the Azure Database Migration Service. Contoso migrates the
database.
Step 7: Migrate the VMs by using Site Recovery. Contoso runs a test failover to make sure everything's
working. Then, Contoso runs a full failover to migrate the VMs to Azure.
5. They set custom DNS settings. DNS points first to Contoso's Azure domain controllers. Azure DNS is
secondary. The Contoso Azure domain controllers are located as follows:
Located in the PROD-DC-EUS2 subnet, in the East US 2 production network (VNET-PROD-EUS2)
CONTOSODC3 address: 10.245.42.4
CONTOSODC4 address: 10.245.42.5
Azure DNS resolver: 168.63.129.16
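The custom DNS settings above can be sketched with the Azure CLI. The addresses come from the list above; the virtual network and resource group names are assumptions based on the walkthrough's naming convention:

```shell
# Point the virtual network at the Contoso Azure domain controllers first,
# with the Azure DNS resolver as a fallback. VNet and resource group names
# are assumptions.
az network vnet update \
  --resource-group ContosoNetworkingRG \
  --name VNET-SQLMI-EUS2 \
  --dns-servers 10.245.42.4 10.245.42.5 168.63.129.16
```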
Need more help?
Get an overview of SQL Database Managed Instance.
Learn how to create a virtual network for a SQL Database Managed Instance.
Learn how to set up peering.
Learn how to update Azure Active Directory DNS settings.
Set up routing
The Managed Instance is placed in a private virtual network. Contoso needs a route table for the virtual network to
communicate with the Azure Management Service. If the virtual network can't communicate with the service that
manages it, the virtual network becomes inaccessible.
Contoso considers these factors:
The route table contains a set of rules (routes) that specify how packets sent from the Managed Instance should
be routed in the virtual network.
The route table is associated with subnets in which Managed Instances are deployed. Each packet that leaves a
subnet is handled based on the associated route table.
A subnet can be associated with only one route table.
There are no additional charges for creating route tables in Microsoft Azure.
To set up routing, Contoso admins do the following:
1. They create a user-defined route table in the ContosoNetworkingRG resource group.
2. To comply with Managed Instance requirements, after the route table (MIRouteTable) is deployed, they add
a route that has an address prefix of 0.0.0.0/0. The Next hop type option is set to Internet.
3. They associate the route table with the SQLMI-DB-EUS2 subnet (in the VNET-SQLMI-EUS2 network).
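The three routing steps above can be sketched with the Azure CLI (resource names come from the walkthrough; the route name is hypothetical):

```shell
# 1. User-defined route table in the networking resource group.
az network route-table create \
  --name MIRouteTable \
  --resource-group ContosoNetworkingRG \
  --location eastus2

# 2. Mandatory Managed Instance route: 0.0.0.0/0 with next hop Internet.
az network route-table route create \
  --resource-group ContosoNetworkingRG \
  --route-table-name MIRouteTable \
  --name MI-Default-Internet \
  --address-prefix 0.0.0.0/0 \
  --next-hop-type Internet

# 3. Associate the route table with the Managed Instance subnet.
az network vnet subnet update \
  --resource-group ContosoNetworkingRG \
  --vnet-name VNET-SQLMI-EUS2 \
  --name SQLMI-DB-EUS2 \
  --route-table MIRouteTable
```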
2. They create a Blob storage container. Contoso generates an SAS URI so that the Azure Database Migration
Service can access it.
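The container and SAS URI setup can be sketched with the Azure CLI; the storage account name, container name, and expiry date are hypothetical:

```shell
# Blob container that will hold the SQL Server backup files
# (account and container names are hypothetical).
az storage container create \
  --account-name contosodmsstorage \
  --name sqlbackups

# Generate a SAS token with read/write/list permissions on the container,
# valid until a hypothetical expiry date.
SAS=$(az storage container generate-sas \
  --account-name contosodmsstorage \
  --name sqlbackups \
  --permissions rwl \
  --expiry 2024-12-31T23:59Z \
  --output tsv)

# The SAS URI handed to the Azure Database Migration Service.
echo "https://contosodmsstorage.blob.core.windows.net/sqlbackups?${SAS}"
```

Scoping the SAS to a single container with only the permissions the service needs is what lets Contoso grant limited, delegated access to the storage account.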
4. They place the Azure Database Migration Service instance in the PROD-DC-EUS2 subnet of the
VNET-PROD-DC-EUS2 virtual network.
The Azure Database Migration Service is placed here because the service must be in a virtual
network that can access the on-premises SQL Server VM via a VPN gateway.
The VNET-PROD-EUS2 is peered to VNET-HUB-EUS2 and is allowed to use remote gateways.
The Use remote gateways option ensures that the Azure Database Migration Service can
communicate as required.
14. When registration is finished, in the Azure portal, they verify again that the configuration server and
VMware server are listed on the Source page in the vault. Discovery can take 15 minutes or more.
15. Site Recovery connects to VMware servers by using the specified settings, and discovers VMs.
Set up the target
Now, Contoso admins configure the target replication environment:
1. In Prepare infrastructure > Target, they select the target settings.
2. Site Recovery checks that there's a storage account and network in the specified target.
Create a replication policy
When the source and target are set up, Contoso admins create a replication policy and associate it with
the configuration server:
1. In Prepare infrastructure > Replication Settings > Replication Policy > Create and Associate, they
create the ContosoMigrationPolicy policy.
2. They use the default settings:
RPO threshold: Default of 60 minutes. This value defines how often recovery points are created. An
alert is generated if continuous replication exceeds this limit.
Recovery point retention: Default of 24 hours. This value specifies how long the retention window is
for each recovery point. Replicated VMs can be recovered to any point in a window.
App-consistent snapshot frequency: Default of 1 hour. This value specifies the frequency at which
application-consistent snapshots are created.
3. They specify the target settings, including the resource group and network in which the Azure VM will be
located after failover. They specify the storage account in which replicated data will be stored.
4. They select WebVM for replication. Site Recovery installs the Mobility Service on each VM when replication
is enabled.
5. They check that the correct replication policy is selected, and enable replication for WEBVM. They track
replication progress in Jobs. After the Finalize Protection job runs, the machine is ready for failover.
6. In Essentials in the Azure portal, they can see status for the VMs that are replicating to Azure:
3. For the target, they enter the name of the Managed Instance in Azure, and the access credentials.
4. In New Activity > Run Migration, they specify settings to run migration:
Source and target credentials.
The database to migrate.
The network share created on the on-premises VM. The Azure Database Migration Service takes
source backups to this share.
The service account that runs the source SQL Server instance must have write permissions on this
share.
The FQDN path to the share must be used.
The SAS URI that provides the Azure Database Migration Service with access to the storage account
container to which the service uploads the backup files for migration.
5. They save the migration settings, and then run the migration.
6. In Overview, they monitor the migration status.
7. When migration is finished, they verify that the target databases exist on the Managed Instance.
2. They run a failover on the plan, selecting the latest recovery point. They specify that Site Recovery should try
to shut down the on-premises VM before it triggers the failover.
3. After the failover, they verify that the Azure VM appears as expected in the Azure portal.
4. After verifying, they complete the migration to finish the migration process, stop replication for the VM, and
stop Site Recovery billing for the VM.
2. They update the string with the user name and password of the SQL Database Managed Instance.
3. After the string is configured, they replace the current connection string in the web.config file of the
application.
4. After updating the file and saving it, they restart IIS on WEBVM by running IISRESET /RESTART in a
Command Prompt window.
5. After IIS is restarted, the app uses the database that's running on the SQL Database Managed Instance.
6. At this point, they can shut down the on-premises SQLVM machine. The migration is complete.
Need more help?
Learn how to run a test failover.
Learn how to create a recovery plan.
Learn how to fail over to Azure.
Conclusion
In this article, Contoso rehosts the SmartHotel360 app in Azure by migrating the app front-end VM to Azure by
using the Site Recovery service. Contoso migrates the on-premises database to an Azure SQL Database Managed
Instance by using the Azure Database Migration Service.
Rehost an on-premises app on Azure VMs and SQL
Server Always On availability groups
This article demonstrates how the fictional company Contoso rehosts a two-tier Windows .NET app running on
VMware VMs as part of a migration to Azure. Contoso migrates the app front-end VM to an Azure VM, and the
app database to an Azure SQL Server VM, running in a Windows Server failover cluster with SQL Server Always
On availability groups.
The SmartHotel360 app used in this example is provided as open source. If you'd like to use it for your own testing
purposes, you can download it from GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, and as a result there is pressure on on-premises systems and
infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures, and streamline processes for
developers and users. The business needs IT to be fast and not waste time or money, thus delivering faster on
customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react
faster than the changes in the marketplace to enable success in a global economy. IT mustn't get in the way,
or become a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that are able to grow at the same
pace.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals were used to determine the best
migration method:
After migration, the app in Azure should have the same performance capabilities as it does today in VMware.
The app will remain as critical in the cloud as it is on-premises.
Contoso doesn't want to invest in this app. It is important to the business, but in its current form Contoso
simply wants to move it safely to the cloud.
The on-premises database for the app has had availability issues. Contoso would like to deploy it in Azure as a
high-availability cluster, with failover capabilities.
Contoso wants to upgrade from their current SQL Server 2008 R2 platform, to SQL Server 2017.
Contoso doesn't want to use an Azure SQL Database for this app, and is looking for alternatives.
Solution design
After pinning down their goals and requirements, Contoso designs and reviews a deployment solution, and
identifies the migration process, including the Azure services that it will use for the migration.
Current architecture
The app is tiered across two VMs (WEBVM and SQLVM).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
Proposed architecture
In this scenario:
Contoso will migrate the app front-end WEBVM to an Azure IaaS VM.
The front-end VM in Azure will be deployed in the ContosoRG resource group (used for production
resources).
It will be located in the Azure production network (VNET-PROD-EUS2) in the primary East US 2 region.
The app database will be migrated to an Azure SQL Server VM.
It will be located in Contoso's Azure database network (PROD-DB-EUS2) in the primary East US 2
region.
It will be placed in a Windows Server failover cluster with two nodes, that uses SQL Server Always On
availability groups.
In Azure the two SQL Server VM nodes in the cluster will be deployed in the ContosoRG resource
group.
The VM nodes will be located in the Azure production network (VNET-PROD-EUS2) in the primary East
US 2 region.
VMs will run Windows Server 2016 with SQL Server 2017 Enterprise Edition. Contoso doesn't have
licenses for this operating system, so it will use an image in the Azure Marketplace that provides the
license as a charge to their Azure EA commitment.
Apart from unique names, both VMs use the same settings.
Contoso will deploy an internal load balancer which listens for traffic on the cluster, and directs it to the
appropriate cluster node.
The internal load balancer will be deployed in the ContosoNetworkingRG (used for networking
resources).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
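The SQL Server nodes and internal load balancer described above might be provisioned as follows with the Azure CLI. The VM names, size, availability set, load balancer name, and credentials are hypothetical; the marketplace image, networks, and resource groups follow the walkthrough:

```shell
# Availability set so the two cluster nodes land in separate fault domains
# (name is hypothetical).
az vm availability-set create \
  --name SQLAOGAVSET \
  --resource-group ContosoRG \
  --location eastus2

# Two identical nodes, apart from unique names. The marketplace image
# bundles the SQL Server 2017 Enterprise license on Windows Server 2016.
for NODE in SQLAOG1 SQLAOG2; do
  az vm create \
    --name "$NODE" \
    --resource-group ContosoRG \
    --availability-set SQLAOGAVSET \
    --image MicrosoftSQLServer:SQL2017-WS2016:Enterprise:latest \
    --size Standard_DS3_v2 \
    --vnet-name VNET-PROD-EUS2 \
    --subnet PROD-DB-EUS2 \
    --admin-username contosoadmin \
    --admin-password '<strong-password>'
done

# Internal load balancer (created in the networking resource group) that
# listens for cluster traffic and directs it to the appropriate node.
az network lb create \
  --name ILB-PROD-DB-EUS2 \
  --resource-group ContosoNetworkingRG \
  --sku Standard \
  --vnet-name VNET-PROD-EUS2 \
  --subnet PROD-DB-EUS2
```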
Database considerations
As part of the solution design process, Contoso did a feature comparison between Azure SQL Database and SQL
Server. The following considerations helped them to decide to go with an Azure IaaS VM running SQL Server:
Using an Azure VM running SQL Server seems to be an optimal solution if Contoso needs to customize the
operating system or the database server, or if it might want to colocate and run third-party apps on the same
VM.
Using the Data Migration Assistant, Contoso can easily assess the database and migrate it to SQL Server running on an Azure VM.
Solution review
Contoso evaluates their proposed design by putting together a pros and cons list.
| CONSIDERATION | DETAILS |
| --- | --- |
| Pros | The SQL Server tier will run on SQL Server 2017 and Windows Server 2016. This retires the current Windows Server 2008 R2 operating system, and running SQL Server 2017 supports Contoso's technical requirements and goals. It provides 100% compatibility while moving away from SQL Server 2008 R2. |
| Cons | The web tier of the app will remain a single point of failure. |
Azure services
| SERVICE | DESCRIPTION | COST |
| --- | --- | --- |
| Data Migration Assistant | DMA runs locally from the on-premises SQL Server machine, and migrates the database across a site-to-site VPN to Azure. | DMA is a free, downloadable tool. |
| Azure Site Recovery | Site Recovery orchestrates and manages migration and disaster recovery for Azure VMs, and on-premises VMs and physical servers. | During replication to Azure, Azure Storage charges are incurred. Azure VMs are created, and incur charges, when failover occurs. Learn more about charges and pricing. |
Migration process
Contoso admins will migrate the app VMs to Azure.
They'll migrate the front-end VM to Azure VM using Site Recovery:
As a first step, they'll prepare and set up Azure components, and prepare the on-premises VMware
infrastructure.
With everything prepared, they can start replicating the VM.
After replication is enabled and working, they migrate the VM by failing it over to Azure.
They'll migrate the database to a SQL Server cluster in Azure, using the Data Migration Assistant (DMA).
As a first step they'll need to provision SQL Server VMs in Azure, set up the cluster and an internal load
balancer, and configure Always On availability groups.
With this in place, they can migrate the database.
After the migration, they'll enable Always On protection for the database.
Prerequisites
Here's what Contoso needs to do for this scenario.
REQUIREMENTS DETAILS
Site Recovery (on-premises): The on-premises vCenter server should be running version 5.5, 6.0, or 6.5.
Scenario steps
Here's how Contoso will run the migration:
Step 1: Prepare a cluster. Create a cluster for deploying two SQL Server VM nodes in Azure.
Step 2: Deploy and set up the cluster. Prepare an Azure SQL Server cluster. Databases are migrated into this
existing cluster.
Step 3: Deploy the load balancer. Deploy a load balancer to balance traffic to the SQL Server nodes.
Step 4: Prepare Azure for Site Recovery. Create an Azure storage account to hold replicated data, and a
Recovery Services vault.
Step 5: Prepare on-premises VMware for Site Recovery. Prepare accounts for VM discovery and agent
installation. Prepare on-premises VMs so that users can connect to Azure VMs after migration.
Step 6: Replicate VMs. Enable VM replication to Azure.
Step 7: Install DMA. Download and install the Data Migration Assistant.
Step 8: Migrate the database with DMA. Migrate the database to Azure.
Step 9: Protect the database. Create an Always On availability group for the cluster.
Step 10: Migrate the web app VM. Run a test failover to make sure everything's working as expected. Then
run a full failover to Azure.
5. In SQL Server settings, they limit SQL connectivity to the virtual network (private), on default port 1433.
For authentication they use the same credentials as they use onsite (contosoadmin).
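Because SQL connectivity is limited to the virtual network, a quick probe of port 1433 from machines inside and outside the VNet confirms that the restriction behaves as intended. A minimal sketch in Python (the host passed in would be the SQL VM's address; this is an illustration, not part of Contoso's tooling):

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run with port 1433 from a VM inside the virtual network the check should succeed; from outside it should fail, confirming that the private-connectivity setting is in effect.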
5. When they create the storage account, primary and secondary access keys are generated for it. They need
the primary access key to create the cloud witness. The key appears under the storage account name >
Access Keys.
Add SQL Server VMs to Contoso domain
1. Contoso adds SQLAOG1 and SQLAOG2 to contoso.com domain.
2. Then, on each VM they install the Windows Failover Cluster Feature and Tools.
Set up the cluster
Before setting up the cluster, Contoso admins take a snapshot of the OS disk on each machine.
1. Then, they run a script they've put together to create the Windows Failover Cluster.
2. After they've created the cluster, they verify that the VMs appear as cluster nodes.
11. They then download and install MySQL Server, and VMware PowerCLI.
12. After validation, they specify the FQDN or IP address of the vCenter server or vSphere host. They leave the
default port, and specify a friendly name for the vCenter server.
13. They specify the account that they created for automatic discovery, and the credentials that are used to
automatically install the Mobility Service. For Windows machines, the account needs local administrator
privileges on the VMs.
14. After registration finishes, in the Azure portal, they double check that the configuration server and VMware
server are listed on the Source page in the vault. Discovery can take 15 minutes or more.
15. Site Recovery then connects to VMware servers using the specified settings, and discovers VMs.
Set up the target
Now Contoso admins specify target replication settings.
1. In Prepare infrastructure > Target, they select the target settings.
2. Site Recovery checks that there's an Azure storage account and network in the specified target.
Create a replication policy
Now, Contoso admins can create a replication policy.
1. In Prepare infrastructure > Replication Settings > Replication Policy > Create and Associate, they
create a policy ContosoMigrationPolicy.
2. They use the default settings:
RPO threshold: Default of 60 minutes. This value defines how often recovery points are created. An
alert is generated if continuous replication exceeds this limit.
Recovery point retention: Default of 24 hours. This value specifies how long the retention window
is for each recovery point. Replicated VMs can be recovered to any point in a window.
App-consistent snapshot frequency: Default of one hour. This value specifies the frequency at
which application-consistent snapshots are created.
3. Now, they specify the target settings, including the resource group and VNet, and the storage account in
which replicated data will be stored.
4. They select the WEBVM for replication, check the replication policy, and enable replication. Site Recovery installs the Mobility Service on the VM when replication is enabled.
5. They track replication progress in Jobs. After the Finalize Protection job runs, the machine is ready for
failover.
6. In Essentials in the Azure portal, they can see the structure for the VMs replicating to Azure.
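As a rough sanity check on those defaults, the retention window and snapshot frequency determine how many app-consistent recovery points are available at any time. A small illustrative calculation (the constants mirror the defaults above; the function name is illustrative, not a Site Recovery API):

```python
# Illustrative arithmetic only; these names are not Site Recovery APIs.
RETENTION_HOURS = 24           # recovery point retention window
APP_SNAPSHOT_HOURS = 1         # app-consistent snapshot frequency
RPO_THRESHOLD_MINUTES = 60     # alert if no recovery point within this limit

def app_consistent_points(retention_hours: int, frequency_hours: int) -> int:
    """App-consistent recovery points available across the retention window."""
    return retention_hours // frequency_hours

# With the defaults, up to 24 app-consistent points cover the 24-hour window.
points = app_consistent_points(RETENTION_HOURS, APP_SNAPSHOT_HOURS)
```

Tightening the snapshot frequency increases the number of recovery points but also the load on the replicated VM, which is why Contoso keeps the defaults for a short-lived migration.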
3. In the migration details, they add SQLVM as the source server, and SQLAOG1 as the target. They specify
credentials for each machine.
4. They create a local share for the database and configuration information. It must be accessible with write
access by the SQL Service account on SQLVM and SQLAOG1.
5. Contoso selects the logins that should be migrated, and starts the migration. After it finishes, DMA shows
the migration as successful.
6. They verify that the database is running on SQLAOG1.
DMA connects to the on-premises SQL Server VM across a site-to-site VPN connection between the Contoso
datacenter and Azure, and then migrates the database.
4. They configure a listener for the group (SHAOG) and port. The IP address of the internal load balancer is
added as a static IP address (10.245.40.100).
5. In Select Data Synchronization, they enable automatic seeding. With this option, SQL Server automatically creates the secondary replicas for every database in the group, so Contoso doesn't have to manually back up and restore them. After validation, the availability group is created.
6. Contoso ran into an issue when creating the group. They aren't using Active Directory Windows Integrated
security, and thus need to grant permissions to the SQL login to create the Windows Failover Cluster roles.
7. After the group is created, Contoso can see it in SQL Management Studio.
Configure a listener on the cluster
As a last step in setting up the SQL deployment, Contoso admins configure the internal load balancer as the listener on the cluster, and bring the listener online. They use a script to do this.
3. After the failover, they verify that the Azure VM appears as expected in the Azure portal.
4. After verifying the VM in Azure, they complete the migration. This finishes the migration process, stops replication for the VM, and stops Site Recovery billing for the VM.
2. After updating the file and saving it, they restart IIS on WEBVM by running IISRESET /RESTART from a command prompt.
3. After IIS restarts, the application uses the database running on the SQL Server cluster.
Need more help?
Learn about running a test failover.
Learn how to create a recovery plan.
Learn about failing over to Azure.
Clean up after migration
After migration, the SmartHotel360 app is running on an Azure VM, and the SmartHotel360 database is located in
the Azure SQL cluster.
Now, Contoso needs to complete these cleanup steps:
Remove the on-premises VMs from the vCenter inventory.
Remove the VMs from local backup jobs.
Update internal documentation to show the new locations and IP addresses for VMs.
Review any resources that interact with the decommissioned VMs, and update any relevant settings or
documentation to reflect the new configuration.
The two new VMs (SQLAOG1 and SQLAOG2) should be added to production monitoring systems.
Review the deployment
With the migrated resources in Azure, Contoso needs to fully operationalize and secure their new infrastructure.
Security
The Contoso security team reviews the Azure VMs WEBVM, SQLAOG1 and SQLAOG2 to determine any security
issues.
The team reviews the network security groups (NSGs) for the VM to control access. NSGs are used to ensure
that only traffic allowed to the application can pass.
The team considers securing the data on the disk using Azure Disk Encryption and Key Vault.
The team should evaluate transparent data encryption (TDE), and then enable it on the SmartHotel360 database running on the new SQL AOG. Learn more.
For more information, see Security best practices for IaaS workloads in Azure.
BCDR
For business continuity and disaster recovery (BCDR), Contoso takes the following actions:
To keep data safe, Contoso backs up the data on the WEBVM, SQLAOG1 and SQLAOG2 VMs using the Azure
Backup service. Learn more.
Contoso will also learn about how to use Azure Storage to back up SQL Server directly to blob storage. Learn
more.
To keep apps up and running, Contoso replicates the app VMs in Azure to a secondary region using Site
Recovery. Learn more.
Licensing and cost optimization
1. Contoso has existing licensing for their WEBVM and will take advantage of the Azure Hybrid Benefit. Contoso
will convert the existing Azure VMs to take advantage of this pricing.
2. Contoso will enable Azure Cost Management licensed by Cloudyn, a Microsoft subsidiary. It's a multicloud cost
management solution that helps you to use and manage Azure and other cloud resources. Learn more about
Azure Cost Management.
Conclusion
In this article, Contoso rehosted the SmartHotel360 app in Azure by migrating the app front-end VM to Azure
using the Site Recovery service. Contoso migrated the app database to a SQL Server cluster provisioned in Azure,
and protected it in a SQL Server Always On availability group.
Refactor an on-premises app to an Azure App
Service web app and Azure SQL database
This article demonstrates how the fictional company Contoso refactors a two-tier Windows .NET app running on
VMware VMs as part of a migration to Azure. They migrate the app front-end VM to an Azure App Service web
app, and the app database to an Azure SQL database.
The SmartHotel360 app used in this example is provided as open source. If you'd like to use it for your own testing
purposes, you can download it from GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, and there is pressure on on-premises systems and
infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures, and streamline processes for
developers and users. The business needs IT to be fast and not waste time or money, thus delivering faster on
customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react faster than changes in the marketplace, to enable success in a global economy. It mustn't get in the way, or become a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that are able to grow at the same
pace.
Reduce costs. Contoso wants to minimize licensing costs.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals were used to determine the best
migration method.
REQUIREMENTS DETAILS
The team doesn't want to invest in the app. For now, admins will simply move the app safely to the cloud.
The team also wants to move away from SQL Server 2008 R2 to a modern PaaS database platform, which will minimize the need for management.
Azure: Contoso wants to move the app to Azure, but doesn't want to run it on VMs. Contoso wants to use Azure PaaS services for both the web and data tiers.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution, and identifies the migration process, including the Azure services that will be used for migration.
Current app
The SmartHotel360 on-premises app is tiered across two VMs (WEBVM and SQLVM).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Proposed solution
For the database tier of the app, Contoso compared Azure SQL Database with SQL Server using this article.
Contoso decided to go with Azure SQL Database for a few reasons:
Azure SQL Database is a relational-database managed service. It delivers predictable performance at
multiple service levels, with near-zero administration. Advantages include dynamic scalability with no
downtime, built-in intelligent optimization, and global scalability and availability.
Contoso can use the lightweight Data Migration Assistant (DMA) to assess and migrate the on-premises
database to Azure SQL.
With Software Assurance, Contoso can exchange existing licenses for discounted rates on a SQL
Database, using the Azure Hybrid Benefit for SQL Server. This could provide savings of up to 30%.
SQL Database provides security features such as Always Encrypted, dynamic data masking, and row-level security/threat detection.
For the app web tier, Contoso has decided to use Azure App Service. This PaaS service enables Contoso to deploy the app with just a few configuration changes. Contoso will use Visual Studio to make the changes, and deploy two web apps: one for the website, and one for the WCF service.
To meet requirements for a DevOps pipeline, Contoso has selected to use Azure DevOps for source code management (SCM) with Git repos. Automated builds and releases will be used to build the code, and deploy it to the Azure App Service.
Solution review
Contoso evaluates their proposed design by putting together a pros and cons list.
CONSIDERATION DETAILS
Pros: Contoso can configure the web tier of the app with multiple instances, so that it's no longer a single point of failure.
Cons: Azure App Service supports only one app deployment for each web app. This means that two web apps must be provisioned (one for the website and one for the WCF service).
Proposed architecture
Migration process
1. Contoso provisions an Azure SQL instance, and migrates the SmartHotel360 database to it.
2. Contoso provisions and configures web apps, and deploys the SmartHotel360 app to them.
Azure services
SERVICE DESCRIPTION COST
Data Migration Assistant (DMA): Contoso will use DMA to assess and detect compatibility issues that might affect database functionality in Azure. DMA assesses feature parity between SQL sources and targets, and recommends performance and reliability improvements. It's a free, downloadable tool.
Azure SQL Database: An intelligent, fully managed relational cloud database service. Cost is based on features, throughput, and size. Learn more.
Azure App Service: Create powerful cloud apps using a fully managed platform. Cost is based on size, location, and usage duration. Learn more.
Prerequisites
Here's what Contoso needs to run this scenario:
REQUIREMENTS DETAILS
Scenario steps
Here's how Contoso will run the migration:
Step 1: Provision a SQL Database instance in Azure. Contoso provisions a SQL instance in Azure. After the app website is migrated to Azure, the WCF service web app will point to this instance.
Step 2: Migrate the database with DMA. Contoso migrates the app database with the Data Migration
Assistant.
Step 3: Provision web apps. Contoso provisions the two web apps.
Step 4: Set up Azure DevOps. Contoso creates a new Azure DevOps project, and imports the Git repo.
Step 5: Configure connection strings. Contoso configures connection strings so that the web tier web app,
the WCF service web app, and the SQL instance can communicate.
Step 6: Set up build and release pipelines. As a final step, Contoso sets up build and release pipelines to create the app, and deploys them to two separate web apps.
3. They set up a new SQL Server instance (sql-smarthotel-eus2) in the primary region.
4. They set the pricing tier to match their server and database needs, and select to save money with the Azure Hybrid Benefit because they already have a SQL Server license.
5. For sizing, they use vCore-based purchasing, and set the limits for their expected requirements.
3. In the migration details, they add SQLVM as the source server, and the SmartHotel.Registration database.
4. They receive an error that appears to be associated with authentication. After investigating, however, they find that the issue is the period (.) in the database name. As a workaround, they decide to provision a new SQL database named SmartHotel-Registration to resolve the issue. When they run DMA again, they're able to select SmartHotel-Registration, and continue with the wizard.
5. In Select Objects, they select the database tables, and generate a SQL script.
10. They delete the extra SQL database SmartHotel.Registration in the Azure portal.
2. They provide an app name (SHWEB-EUS2), run it on Windows, and place it in the production resource group ContosoRG. They create a new web app and Azure App Service plan.
3. After the web app is provisioned, they repeat the process to create a web app for the WCF service (SHWCF-EUS2).
4. After they're done, they browse to the address of the apps to check they've been created successfully.
3. After the code is imported, they connect Visual Studio to the repo, and clone the code using Team Explorer.
4. After the repository is cloned to the developer machine, they open the solution file for the app. The web app and WCF service each have a separate project within the file.
4. The client section of the web.config file for the SmartHotel.Registration.Web should be changed to point to
the new location of the WCF service. This is the URL of the WCF web app hosting the service endpoint.
5. After the changes are in the code, admins need to commit the changes. Using Team Explorer in Visual
Studio, they commit and sync.
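The client-section change in step 4 can also be scripted rather than edited by hand. A sketch using Python's standard XML tools, with a hypothetical web.config fragment and endpoint URL (the real file and WCF web app address will differ):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of SmartHotel.Registration.Web's web.config;
# only the client endpoint address matters for this change.
CONFIG = """<configuration>
  <system.serviceModel>
    <client>
      <endpoint address="http://oldhost/RegistrationService.svc"
                name="RegistrationServiceEndpoint" />
    </client>
  </system.serviceModel>
</configuration>"""

def repoint_endpoint(config_xml: str, new_address: str) -> str:
    """Rewrite every WCF client endpoint address to the new WCF web app URL."""
    root = ET.fromstring(config_xml)
    for endpoint in root.iter("endpoint"):
        endpoint.set("address", new_address)
    return ET.tostring(root, encoding="unicode")

updated = repoint_endpoint(
    CONFIG, "https://shwcf-eus2.azurewebsites.net/RegistrationService.svc")
```

Whether scripted or manual, the updated file still needs to be committed and synced so the change flows into the build pipeline.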
3. In Select a template, they select the ASP.NET template for their build.
4. The name ContosoSmartHotelRefactor-ASP.NET-CI is used for the build. They select Save & Queue.
5. This kicks off the first build. They select the build number to watch the process. After it's finished they can
see the process feedback, and select Artifacts to review the build results.
10. Under the stages, they select 1 job, 1 task to configure deployment of the WCF service.
11. They verify the subscription is selected and authorized, and select the App service name.
12. On the pipeline > Artifacts, they select +Add an artifact, and select to build with the
ContosoSmarthotel360Refactor pipeline.
13. They select the lightning bolt on the artifact to enable the continuous deployment trigger.
16. In Select a file or folder, they locate the SmartHotel.Registration.Wcf.zip file that was created during the build, and select Save.
17. They select Pipeline > Stages > +Add, to add an environment for SHWEB-EUS2. They select another Azure App Service deployment.
18. They repeat the process to publish the web app (SmartHotel.Registration.Web.zip) file to the correct web
app.
19. After it's saved, the release pipeline will show as follows.
20. They move back to Build, and select Triggers > Enable continuous integration. This enables the pipeline so that when changes are committed to the code, a full build and release occurs.
21. They select Save & Queue to run the full pipeline. A new build is triggered that in turn creates the first
release of the app to the Azure App Service.
22. Contoso admins can follow the build and release pipeline process from Azure DevOps. After the build
completes, the release will start.
23. After the pipeline finishes, both sites have been deployed and the app is up and running online.
At this point, the app is successfully migrated to Azure.
Conclusion
In this article, Contoso refactored the SmartHotel360 app in Azure by migrating the app front-end VM to two
Azure App Service web apps. The app database was migrated to an Azure SQL database.
Refactor a Linux app to multiple regions using Azure
App Service, Traffic Manager, and Azure Database for
MySQL
This article shows how the fictional company Contoso refactors a two-tier Linux-based Apache MySQL PHP (LAMP) app, migrating it from on-premises to Azure using Azure App Service with GitHub integration and Azure Database for MySQL.
osTicket, the service desk app used in this example, is provided as open source. If you'd like to use it for your own testing purposes, you can download it from GitHub.
Business drivers
The IT Leadership team has worked closely with business partners to understand what they want to achieve:
Address business growth. Contoso is growing and moving into new markets. It needs additional customer
service agents.
Scale. The solution should be built so that Contoso can add more customer service agents as the business
scales.
Improve resiliency. In the past, issues with the system affected internal users only. With the new business model, external users will be affected, and Contoso needs the app up and running at all times.
Migration goals
The Contoso cloud team has pinned down goals for this migration, in order to determine the best migration
method:
The application should scale beyond current on-premises capacity and performance. Contoso is moving the
application to take advantage of Azure's on-demand scaling.
Contoso wants to move the app code base to a continuous delivery pipeline. As app changes are pushed to
GitHub, Contoso wants to deploy those changes without tasks for operations staff.
The application must be resilient with capabilities for growth and failover. Contoso wants to deploy the app in
two different Azure regions, and set it up to scale automatically.
Contoso wants to minimize database admin tasks after the app is moved to the cloud.
Solution design
After pinning down their goals and requirements, Contoso designs and reviews a deployment solution, and
identifies the migration process, including the Azure services that will be used for the migration.
Current architecture
The app is tiered across two VMs (OSTICKETWEB and OSTICKETMYSQL).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
Proposed architecture
Here's the proposed architecture:
The web tier app on OSTICKETWEB will be migrated by building an Azure App Service in two Azure regions.
Azure App Service for Linux will be implemented using the PHP 7.0 Docker container.
The app code will be moved to GitHub, and the Azure App Service web app will be configured for continuous
delivery with GitHub.
The Azure App Service web apps will be deployed in both the primary (East US 2) and secondary (Central US) regions.
Traffic Manager will be set up in front of the two web apps in both regions.
Traffic Manager will be configured in priority mode to force the traffic through East US 2.
If the Azure App Server in East US 2 goes offline, users can access the failed over app in Central US.
The app database will be migrated to the Azure Database for MySQL service using MySQL Workbench tools.
The on-premises database will be backed up locally, and restored directly to Azure Database for MySQL.
The database will reside in the primary East US 2 region, in the database subnet (PROD-DB-EUS2) in the production network (VNET-PROD-EUS2).
Since they're migrating a production workload, Azure resources for the app will reside in the production
resource group ContosoRG.
The Traffic Manager resource will be deployed in Contoso's infrastructure resource group ContosoInfraRG.
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
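The priority-mode behavior described above can be sketched as a simple selection over ordered endpoints. The names and health flags below are illustrative; in practice Traffic Manager's own health probes decide which endpoints are eligible:

```python
def pick_endpoint(endpoints):
    """Return the healthy endpoint with the lowest priority value (priority mode)."""
    healthy = [e for e in endpoints if e["healthy"]]
    if not healthy:
        return None
    return min(healthy, key=lambda e: e["priority"])["name"]

# Illustrative endpoint set: traffic is forced through East US 2 while it is
# healthy, and fails over to Central US only when East US 2 goes offline.
endpoints = [
    {"name": "osticket-eus2", "priority": 1, "healthy": True},   # East US 2
    {"name": "osticket-cus",  "priority": 2, "healthy": True},   # Central US
]
```

With both regions healthy, all traffic resolves to the East US 2 app; marking it unhealthy shifts resolution to Central US, which is exactly the failover behavior Contoso wants.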
Migration process
Contoso will complete the migration process as follows:
1. As a first step, Contoso admins set up the Azure infrastructure, including provisioning Azure App Service, setting up Traffic Manager, and provisioning an Azure Database for MySQL instance.
2. After preparing the Azure infrastructure, they migrate the database using MySQL Workbench.
3. After the database is running in Azure, they set up a private GitHub repository for Azure App Service with continuous delivery, and load it with the osTicket app.
4. In the Azure portal, they load the app from GitHub to the Docker container running Azure App Service.
5. They tweak DNS settings, and configure autoscaling for the app.
Azure services
SERVICE DESCRIPTION COST
Azure App Service: The service runs and scales applications using the Azure PaaS service for websites. Pricing is based on the size of the instances, and the features required. Learn more.
Traffic Manager: A load balancer that uses DNS to direct users to Azure, or external websites and services. Pricing is based on the number of DNS queries received, and the number of monitored endpoints. Learn more.
Azure Database for MySQL: The database is based on the open-source MySQL Server engine. It provides a fully managed, enterprise-ready community MySQL database, as a service for app development and deployment. Pricing is based on compute, storage, and backup requirements. Learn more.
Prerequisites
Here's what Contoso needs to run this scenario.
REQUIREMENTS DETAILS
3. They create a new App Service plan in the primary region (APP-SVP-EUS2), using the standard size.
4. They select a Linux OS with the PHP 7.0 runtime stack, which is a Docker container.
5. They create a second web app (osticket-cus), and an Azure App Service plan for the Central US region.
Need more help?
Learn about Azure App Service web apps.
Learn about Azure App Service on Linux.
2. They add the name contosoosticket for the Azure database. They add the database to the production
resource group ContosoRG, and specify credentials for it.
3. The on-premises MySQL database is version 5.7, so they select this version for compatibility. They use the
default sizes, which match their database requirements.
4. For Backup Redundancy Options, they select to use Geo-Redundant. This option allows them to restore
the database in their secondary Central US region if an outage occurs. They can only configure this option
when they provision the database.
5. They set up connection security. In the database > Connection Security, they set up Firewall rules to allow
the database to access Azure services.
6. They add the local workstation client IP address to the start and end IP addresses. This allows the web apps
to access the MySQL database, along with the database client that's performing the migration.
Step 4: Migrate the database
Contoso admins migrate the database using backup and restore, with MySQL tools. They install MySQL
Workbench, back up the database from OSTICKETMYSQL, and then restore it to Azure Database for MySQL
Server.
Install MySQL Workbench
1. They check the prerequisites and download MySQL Workbench.
2. They install MySQL Workbench for Windows in accordance with the installation instructions. The machine
on which they install must be accessible to the OSTICKETMYSQL VM, and Azure via the internet.
3. In MySQL Workbench, they create a MySQL connection to OSTICKETMYSQL.
6. Now, they can import (restore) the database in the Azure Database for MySQL instance, from the self-
contained file. A new schema (osticket) is created for the instance.
7. After data is restored, it can be queried using Workbench, and appears in the Azure portal.
8. Finally, they need to update the database information on the web apps. On the MySQL instance, they open
Connection Strings.
9. In the strings list, they locate the web app settings, and select to copy them.
10. They open a Notepad window and paste the string into a new file, and update it to match the osticket
database, MySQL instance, and credentials settings.
11. They can verify the server name and login from Overview in the MySQL instance in the Azure portal.
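Editing the copied string by hand is error-prone, so the substitution in step 10 can be sketched as below. The template shape and all values are placeholders, not Contoso's actual server name or credentials:

```python
# Typical shape of an Azure Database for MySQL connection string as copied
# from the portal's Connection Strings blade. All placeholder values below
# are hypothetical.
TEMPLATE = ("Database={your_database}; Data Source={server}.mysql.database.azure.com; "
            "User Id={user}@{server}; Password={password}")

def fill_connection_string(database, server, user, password):
    """Substitute the osTicket database, MySQL server, and credential settings."""
    return TEMPLATE.format(your_database=database, server=server,
                           user=user, password=password)

conn = fill_connection_string("osticket", "contosoosticket",
                              "osticketadmin", "example-password")
```

The filled-in string is what gets pasted into each web app's Application settings in the next steps.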
2. After forking, they navigate to the include folder, and find the ost-config.php file.
3. The file opens in the browser and they edit it.
4. In the editor, they update the database details, specifically DBHOST and DBUSER.
6. For each web app (osticket-eus2 and osticket-cus), they modify the Application settings in the Azure
portal.
7. They enter the connection string with the name osticket, and copy the string from notepad into the value
area. They select MySQL in the dropdown list next to the string, and save the settings.
4. After the configuration is updated and the osTicket web app is loaded from GitHub to the Docker container running the Azure App Service, the site shows as Active.
5. They repeat the above steps for the secondary web app (osticket-cus).
6. After the site is configured, it's accessible via the Traffic Manager profile. The DNS name is the new location
of the osTicket app. Learn more.
7. Contoso wants a DNS name that's easy to remember. They create an alias record (CNAME) osticket.contoso.com which points to the Traffic Manager name, in the DNS on their domain controllers.
8. They configure both the osticket-eus2 and osticket-cus web apps to allow the custom hostnames.
Set up autoscaling
Finally, they set up automatic scaling for the app. This ensures that as agents use the app, the app instances
increase and decrease according to business needs.
1. In App Service APP-SRV-EUS2, they open Scale Unit.
2. They configure a new autoscale setting with a single rule that increases the instance count by one when the
CPU percentage for the current instance is above 70% for 10 minutes.
3. They configure the same setting on APP-SRV-CUS to ensure that the same behavior applies if the app fails over to the secondary region. The only difference is that they set the default instance count to 1, since this is for failovers only.
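The scale-out rule can be modeled as a check over a trailing window of CPU samples. This sketch assumes one sample per minute and uses the window average, which approximates how an autoscale rule evaluates its metric; the function name and sampling interval are illustrative:

```python
def should_scale_out(cpu_samples, threshold=70.0, window=10):
    """True when average CPU over the trailing window (one sample per minute,
    so 10 samples = 10 minutes) exceeds the threshold, triggering +1 instance."""
    if len(cpu_samples) < window:
        return False  # not enough history yet to evaluate the rule
    recent = cpu_samples[-window:]
    return sum(recent) / window > threshold
```

A matching scale-in rule (for example, average CPU below a lower threshold) is usually paired with this to let the instance count fall back when load drops.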
This article demonstrates how the fictional company Contoso rebuilds a two-tier Windows .NET app running on
VMware VMs as part of a migration to Azure. Contoso migrates the app's front-end VM to an Azure App Service
web app. The app back end is built using microservices deployed to containers managed by Azure Kubernetes Service (AKS). The site interacts with Azure Functions to provide pet photo functionality.
The SmartHotel360 app used in this example is provided as open source. If you'd like to use it for your own testing
purposes, you can download it from GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, and wants to provide differentiated experiences for customers
on Contoso websites.
Be agile. Contoso must be able to react faster than changes in the marketplace, to enable success in a global economy.
Scale. As the business grows successfully, the Contoso IT team must provide systems that are able to grow at
the same pace.
Reduce costs. Contoso wants to minimize licensing costs.
Migration goals
The Contoso cloud team has pinned down app requirements for this migration. These requirements were used to
determine the best migration method:
The app in Azure is still as critical as it is today. It should perform well and scale easily.
The app shouldn't use IaaS components. Everything should be built to use PaaS or serverless services.
The app builds should run in cloud services, and containers should reside in a private Enterprise-wide container
registry in the cloud.
The API service used for pet photos should be accurate and reliable in the real world, since decisions made by
the app must be honored in their hotels. Any pet granted access is allowed to stay at the hotels.
To meet requirements for a DevOps pipeline, Contoso will use Azure DevOps for source code management
(SCM), with Git repos. Automated builds and releases will be used to build code and deploy to Azure App
Service, Azure Functions, and AKS.
Different CI/CD pipelines are needed for microservices on the back end, and for the website on the front end.
The back-end services have a different release cycle from the front-end web app. To meet this requirement, they
will deploy two different pipelines.
Contoso needs management approval for all front-end website deployment, and the CI/CD pipeline must
provide this.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution, and identifies the
migration process, including the Azure services that will be used for the migration.
Current app
The SmartHotel360 on-premises app is tiered across two VMs (WEBVM and SQLVM).
The VMs are located on the VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Proposed architecture
The front end of the app is deployed as an Azure App Service web app in the primary Azure region.
An Azure function provides uploads of pet photos, and the site interacts with this functionality.
The pet photo function uses the Azure Cognitive Services Vision API and Cosmos DB.
The back end of the site is built using microservices. These will be deployed to containers managed on the
Azure Kubernetes Service (AKS).
Containers will be built using Azure DevOps, and pushed to Azure Container Registry (ACR).
For now, Contoso will manually deploy the web app and function code using Visual Studio.
Microservices will be deployed using a PowerShell script that calls Kubernetes command-line tools.
Solution review
Contoso evaluates the proposed design by putting together a pros and cons list.
Migration process
1. Contoso provisions the ACR, AKS, and Cosmos DB.
2. They provision the infrastructure for the deployment, including Azure App Service web app, storage
account, function, and API.
3. After the infrastructure is in place, they'll build their microservices container images using Azure DevOps,
which pushes them to the ACR.
4. Contoso will deploy these microservices to AKS using a PowerShell script.
5. Finally, they'll deploy the function and web app.
Azure services
AKS: Simplifies Kubernetes deployment, management, and operations, providing a fully managed Kubernetes
container orchestration service. AKS is a free service; pay only for the virtual machines, and associated storage
and networking resources consumed. Learn more.
Azure Functions: Accelerates development with an event-driven, serverless compute experience that scales on
demand. Pay only for consumed resources; the plan is billed based on per-second resource consumption and
executions. Learn more.
Azure Container Registry: Stores images for all types of container deployments. Cost is based on features,
storage, and usage duration. Learn more.
Azure App Service: Quickly build, deploy, and scale enterprise-grade web, mobile, and API apps running on any
platform. App Service plans are billed on a per-second basis. Learn more.
Prerequisites
Here's what Contoso needs for this scenario:
Git
Azure PowerShell
Azure CLI
Scenario steps
Here's how Contoso will run the migration:
Step 1: Provision AKS and ACR. Contoso provisions the managed AKS cluster and Azure container registry
using PowerShell.
Step 2: Build Docker containers. They set up CI for Docker containers using Azure DevOps, and push them
to the ACR.
Step 3: Deploy back-end microservices. They deploy the rest of the infrastructure that will be used by back-
end microservices.
Step 4: Deploy front-end infrastructure. They deploy the front-end infrastructure, including blob storage for
the pet photos, the Cosmos DB instance, and the Computer Vision API.
Step 5: Migrate the back end. They deploy the microservices and run them on AKS, to migrate the back end.
Step 6: Publish the front end. They publish the SmartHotel360 app to the App Service, and the function app
that will be called by the pet service.
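As a rough illustration of step 1, provisioning the registry and cluster with the Azure CLI might look like the following sketch. The resource names come from the scenario; the resource group location, SKU, and node count are assumptions, and the commands require an authenticated Azure CLI session.

```shell
# Illustrative sketch only -- assumes an authenticated Azure CLI session.
# Resource names come from the scenario; location, SKU, and node count are assumptions.
az group create --name ContosoRG --location eastus2

# Create the private container registry (smarthotelacreus2)
az acr create --resource-group ContosoRG --name smarthotelacreus2 --sku Standard

# Create the managed AKS cluster (smarthotel-aks-eus2)
az aks create --resource-group ContosoRG --name smarthotel-aks-eus2 \
  --node-count 1 --generate-ssh-keys
```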
4. They select View > Integrated Terminal to open the integrated terminal in Visual Studio Code.
5. In the PowerShell integrated terminal, they sign in to Azure using the Connect-AzureRmAccount command.
Learn more about getting started with PowerShell.
6. They authenticate Azure CLI by running the az login command, and following the instructions to
authenticate using their web browser. Learn more about logging in with Azure CLI.
7. They run the following command, passing the resource group name of ContosoRG, the name of the AKS
cluster smarthotel-aks-eus2, and the new registry name.
8. Azure creates another resource group, containing the resources for the AKS cluster.
9. After the deployment is finished, they install the kubectl command-line tool. The tool is already installed on
the Azure Cloud Shell.
az aks install-cli
10. They verify the connection to the cluster by running the kubectl get nodes command. The node name matches
the name of the VM in the automatically created resource group.
11. They run the following command to start the Kubernetes Dashboard:
12. A browser tab opens to the Dashboard. This is a tunneled connection using the Azure CLI.
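The dashboard command referenced in step 11 isn't shown in this extract; at the time, the tunneled dashboard connection was typically opened with a command along these lines (resource names taken from the scenario, command assumed):

```shell
# Illustrative: opens a tunneled browser connection to the Kubernetes Dashboard.
az aks browse --resource-group ContosoRG --name smarthotel-aks-eus2
```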
4. In Pipelines, they select Build, and create a new pipeline using Azure Repos Git as a source, from the
repository.
5. They select to start with an empty job.
7. In Phase 1, they add a Docker Compose task. This task builds the container images defined in the Docker Compose file.
8. They repeat and add another Docker Compose task. This one pushes the containers to ACR.
9. They select the first task (to build), and configure the build with the Azure subscription, authorization, and
the ACR.
10. They specify the path of the docker-compose.yaml file, in the src folder of the repo. They select to build
service images and include the latest tag. When the action changes to Build service images, the name of
the Azure DevOps task changes to Build services automatically.
11. Now, they configure the second Docker task (to push). They select the subscription and the
smarthotelacreus2 ACR.
12. Again, they enter the path to the docker-compose.yaml file, and select Push service images and include the
latest tag. When the action changes to Push service images, the name of the Azure DevOps task changes
to Push services automatically.
13. With the Azure DevOps tasks configured, Contoso saves the build pipeline, and starts the build process.
14. They select the build job to check progress.
15. After the build finishes, the ACR shows the new repos, which are populated with the containers used by the
microservices.
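Expressed as a YAML pipeline rather than the classic designer used above, the two Docker Compose tasks might look roughly like this. The task input names and registry value are assumptions based on the scenario, not Contoso's actual pipeline definition.

```yaml
# Illustrative sketch of the build-and-push steps; input values are assumptions.
steps:
- task: DockerCompose@0
  displayName: Build services
  inputs:
    containerregistrytype: Azure Container Registry
    azureContainerRegistry: smarthotelacreus2.azurecr.io
    dockerComposeFile: src/docker-compose.yaml
    action: Build services
    additionalImageTags: latest
- task: DockerCompose@0
  displayName: Push services
  inputs:
    containerregistrytype: Azure Container Registry
    azureContainerRegistry: smarthotelacreus2.azurecr.io
    dockerComposeFile: src/docker-compose.yaml
    action: Push services
    additionalImageTags: latest
```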
8. They add a new Azure PowerShell task so that they can run a PowerShell script in an Azure environment.
9. They select the Azure subscription for the task, and select the deploy.ps1 script from the Git repo.
10. They add arguments to the script. The script will delete all cluster content (except ingress and ingress
controller), and deploy the microservices.
11. They set the preferred Azure PowerShell version to the latest, and save the pipeline.
12. They move back to the Release page, and manually create a new release.
13. They select the release after creating it, and in Actions, they select Deploy.
14. When the deployment is complete, they run the following command to check the status of services, using
the Azure Cloud Shell: kubectl get services.
3. They create a second new container named settings. A file with all the front-end app settings will be placed
in this container.
4. They capture the access details for the storage account in a text file, for future reference.
Provision a Cosmos database
Contoso admins provision a Cosmos database to be used for pet information.
1. They create an Azure Cosmos DB in the Azure Marketplace.
2. They specify a name (contosomarthotel), select the SQL API, and place it in the production resource group
ContosoRG, in the main East US 2 region.
3. They add a new collection to the database, with default capacity and throughput.
4. They note the connection information for the database, for future reference.
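As a sketch, the equivalent Azure CLI provisioning for this Cosmos DB account might look like the following; the account name comes from the scenario, while the command form and kind are assumptions, and an authenticated session is required.

```shell
# Illustrative only -- assumes an authenticated Azure CLI session.
# Creates the Cosmos DB account with the SQL (DocumentDB) API in the production group.
az cosmosdb create --name contosomarthotel --resource-group ContosoRG --kind GlobalDocumentDB
```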
Provision Computer Vision
Contoso admins provision the Computer Vision API. The API will be called by the function, to evaluate pictures
uploaded by users.
1. They create a Computer Vision instance in the Azure Marketplace.
2. They provision the API (smarthotelpets) in the production resource group ContosoRG, in the main East US
2 region.
3. They save the connection settings for the API to a text file for later reference.
2. They provide an app name (smarthotelpetchecker). They place the app in the production resource group
ContosoRG. They set the hosting plan to Consumption Plan, and place the app in the East US 2 region. A
new storage account is created, along with an Application Insights instance for monitoring.
3. After the app is deployed, they browse to the app address to check that it's been created successfully.
5. After the file is updated, they rename it smarthotelsettingsurl, and upload it to the blob storage they
created earlier.
6. They select the file to get the URL. The URL is used by the app when it pulls down the configuration files.
7. In the appsettings.Production.json file, they update the SettingsURL to the URL of the new file.
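The resulting entry in appsettings.Production.json would look something like the fragment below. The blob URL is a placeholder, and the exact key casing is an assumption based on the scenario text.

```json
{
  "SettingsUrl": "https://<storage-account>.blob.core.windows.net/settings/smarthotelsettingsurl"
}
```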
Deploy the website to Azure App Service
Contoso admins can now publish the website.
1. They open Azure DevOps, and in the SmartHotelFrontend project, in Builds and Releases, they select
+New Pipeline.
2. They select Azure DevOps Git as a source.
3. They select the ASP.NET Core template.
4. They review the pipeline, and check that Publish Web Projects and Zip Published Projects are selected.
5. In Triggers, they enable continuous integration, and add the master branch. This ensures that each time the
solution has new code committed to the master branch, the build pipeline starts.
6. They select Save & Queue to start a build.
7. After the build completes, they configure a release pipeline using Azure App Service Deployment.
8. They provide a stage name of Staging.
9. They add an artifact and select the build they just configured.
10. They select the lightning bolt icon on the artifact, and enable continuous deployment.
11. In Environment, they select 1 job, 1 task under Staging.
12. After selecting the subscription, and app name, they open the Deploy Azure App Service task. The
deployment is configured to use the staging deployment slot. This automatically builds code for review and
approval in this slot.
13. In the Pipeline, they add a new stage.
14. They select Azure App Service deployment with slot, and name the environment Prod.
15. They select 1 job, 2 tasks, and select the subscription, app service name, and the staging slot.
16. They remove the Deploy Azure App Service to Slot task from the pipeline. It was placed there by the previous
steps.
17. They save the pipeline. On the pipeline, they select Post-deployment conditions.
18. They enable Post-deployment approvals, and add a dev lead as the approver.
19. In the Build pipeline, they manually kick off a build. This triggers the new release pipeline, which deploys the
site to the staging slot. For Contoso, the URL for the slot is
https://smarthotelcontoso-staging.azurewebsites.net/ .
20. After the build finishes, and the release deploys to the slot, Azure DevOps emails the dev lead for approval.
21. The dev lead selects View approval, and can approve or reject the request in the Azure DevOps portal.
22. The lead makes a comment and approves. This starts the swap of the staging and prod slots, and moves
the build into production.
23. The pipeline completes the swap.
24. The team checks the prod slot to verify that the web app is in production at
https://smarthotelcontoso.azurewebsites.net/ .
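The swap carried out by the pipeline in steps 22-23 is equivalent to what the following Azure CLI command does. It's shown only for illustration; the pipeline performs the swap automatically, and the resource group name is an assumption.

```shell
# Illustrative equivalent of the pipeline's staging-to-production slot swap.
az webapp deployment slot swap --resource-group ContosoRG --name smarthotelcontoso \
  --slot staging --target-slot production
```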
4. They commit the code, and sync it back to Azure DevOps, pushing their changes.
5. They add a new Build pipeline, and select Azure DevOps Git for the source.
6. They select the ASP.NET Core (.NET Framework) template.
7. They accept the defaults for the template.
8. In Triggers, they enable continuous integration, and select Save & Queue to start a build.
9. After the build succeeds, they build a Release pipeline, adding Azure App Service deployment with slot.
10. They name the environment Prod, and select the subscription. They set the App type to Function App,
and the app service name as smarthotelpetchecker.
11. They add an artifact Build.
12. They enable Continuous deployment trigger, and select Save.
13. They select Queue new build to run the full CI/CD pipeline.
14. After the function is deployed, it appears in the Azure portal, with the Running status.
15. They browse to the app to test that the Pet Checker app is working as expected, at
http://smarthotel360public.azurewebsites.net/Pets.
16. They select the avatar to upload a picture.
Conclusion
In this article, Contoso rebuilds the SmartHotel360 app in Azure. The on-premises app front-end VM is rebuilt as
an Azure App Service web app. The application back end is built using microservices deployed to containers
managed by Azure Kubernetes Service (AKS). Contoso enhanced app functionality with a pet photo app.
Suggested skills
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with
cloud adoption doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning
that helps you achieve your goals faster. Earn points and levels, and achieve more!
Here are a couple of examples of tailored learning paths on Microsoft Learn that align with the Contoso
SmartHotel360 app in Azure.
Deploy a website to Azure with Azure App Service: Web apps in Azure allow you to publish and manage your
website easily without having to work with the underlying servers, storage, or network assets. Instead, you can
focus on your website features and rely on the robust Azure platform to provide secure access to your site.
Process and classify images with the Azure Cognitive Vision Services: Azure Cognitive Services offers pre-built
functionality to enable computer vision functionality in your applications. Learn how to use the Cognitive Vision
Services to detect faces, tag and classify images, and identify objects.
Refactor a Team Foundation Server deployment to
Azure DevOps Services
This article shows how the fictional company Contoso refactors their on-premises Team Foundation Server (TFS)
deployment by migrating it to Azure DevOps Services in Azure. Contoso's development team has used TFS for
team collaboration and source control for the past five years. Now, they want to move to a cloud-based solution for
dev and test work, and for source control. Azure DevOps Services will play a role as they move to an Azure
DevOps model, and develop new cloud-native apps.
Business drivers
The IT Leadership team has worked closely with business partners to identify future goals. Partners aren't overly
concerned with dev tools and technologies, but they have captured these points:
Software: Regardless of the core business, all companies are now software companies, including Contoso.
Business leadership is interested in how IT can help lead the company with new working practices for users, and
experiences for their customers.
Efficiency: Contoso needs to streamline process and remove unnecessary procedures for developers and
users. This will allow the company to deliver on customer requirements more efficiently. The business needs IT
to be fast, without wasting time or money.
Agility: Contoso IT needs to respond to business needs, and react more quickly than the marketplace to enable
success in a global economy. IT mustn't be a blocker for the business.
Migration goals
The Contoso cloud team has pinned down goals for the migration to Azure DevOps Services:
The team needs a tool to migrate the data to the cloud. Few manual processes should be needed.
Work item data and history for the last year must be migrated.
They don't want to set up new user names and passwords. All current system assignments must be maintained.
They want to move away from Team Foundation Version Control (TFVC) to Git for source control.
The cutover to Git will be a "tip migration" that imports only the latest version of the source code. It will happen
during a downtime when all work will be halted as the codebase shifts. They understand that only the current
master branch history will be available after the move.
They're concerned about the change and want to test it before doing a full move. They want to retain access to
TFS even after the move to Azure DevOps Services.
They have multiple collections, and want to start with one that has only a few projects to better understand the
process.
They understand that TFS collections are a one-to-one relationship with Azure DevOps Services organizations,
so they'll have multiple URLs. However, this matches their current model of separation for code bases and
projects.
Proposed architecture
Contoso will move their TFS projects to the cloud, and no longer host their projects or source control on-
premises.
TFS will be migrated to Azure DevOps Services.
Currently Contoso has one TFS collection named ContosoDev , which will be migrated to an Azure DevOps
Services organization called contosodevmigration.visualstudio.com .
The projects, work items, bugs and iterations from the last year will be migrated to Azure DevOps Services.
Contoso will use their Azure Active Directory, which they set up when they deployed their Azure infrastructure
at the beginning of their migration planning.
Migration process
Contoso will complete the migration process as follows:
1. There's a lot of preparation involved. As a first step, Contoso needs to upgrade their TFS implementation to a
supported level. Contoso is currently running TFS 2017 Update 3, but to use database migration it needs to run
a supported 2018 version with the latest updates.
2. After upgrading, Contoso will run the TFS migration tool, and validate their collection.
3. Contoso will build a set of preparation files, and perform a migration dry run for testing.
4. Contoso will then run another migration, this time a full migration that includes work items, bugs, sprints, and
code.
5. After the migration, Contoso will move their code from TFVC to Git.
Prerequisites
Here's what Contoso needs to run this scenario.
REQUIREMENTS DETAILS
On-premises TFS server The on-premises server needs to either be running TFS 2018 Upgrade 2,
or be upgraded to it as part of this process.
Scenario steps
Here's how Contoso will complete the migration:
Step 1: Create an Azure storage account. This storage account will be used during the migration process.
Step 2: Upgrade TFS. Contoso will upgrade their deployment to TFS 2018 Upgrade 2.
Step 3: Validate collection. Contoso will validate the TFS collection in preparation for migration.
Step 4: Build preparation file. Contoso will create the migration files using the TFS Migration Tool.
5. They verify the TFS installation by reviewing projects, work items, and code.
NOTE
Some TFS upgrades need to run the Configure Features Wizard after the upgrade completes. Learn more.
2. They run the tool to perform the validation, by specifying the URL of the project collection:
6. They run TfsMigrator validate /help at the command line, and see that the command /tenantDomainName
seems to be required to validate identities.
7. They run the validation command again, and include this value, along with their Azure AD name:
TfsMigrator validate /collection:http://contosotfs:8080/tfs/ContosoDev /tenantDomainName:contosomigration.onmicrosoft.com
8. An Azure AD sign-in screen appears, and they enter the credentials of a Global Admin user.
9. The validation passes, and is confirmed by the tool.
3. Prepare completes, and the tool reports that the import files have been generated successfully.
4. They can now see that both the IdentityMapLog.csv and the import.json file have been created in a new
folder.
5. The import.json file provides import settings. It includes information such as the desired organization name,
and storage account information. Most of the fields are populated automatically. Some fields require user
input. Contoso opens the file, and adds the Azure DevOps Services organization name to be created:
contosodevmigration. With this name, their Azure DevOps Services URL will be
contosodevmigration.visualstudio.com.
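An illustrative fragment of the resulting import.json might look like the following. Apart from the organization name, which comes from the scenario, the field names and structure here are assumptions rather than the tool's exact schema.

```json
{
  "Source": {
    "Location": "<SAS URL, added later in the process>",
    "Dacpac": "Tfs_ContosoDev.dacpac"
  },
  "Target": {
    "Name": "contosodevmigration"
  },
  "Properties": {
    "ImportType": "DryRun"
  }
}
```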
NOTE
The organization must be created before the migration. It can be changed after the migration is done.
6. They review the identity log map file that shows the accounts that will be brought into Azure DevOps
Services during the import.
Active identities refer to identities that will become users in Azure DevOps Services after the import.
On Azure DevOps Services, these identities will be licensed, and show up as a user in the organization
after migration.
These identities are marked as Active in the Expected Import Status column in the file.
Step 5: Migrate to Azure DevOps Services
With preparation in place, Contoso admins can now focus on the migration. After running the migration, they'll
switch from using TFVC to Git for version control.
Before they start, the admins schedule downtime with the dev team, to take the collection offline for migration.
These are the steps for the migration process:
1. Detach the collection. Identity data for the collection resides in the TFS server configuration database while
the collection is attached and online. When a collection is detached from the TFS server, it takes a copy of that
identity data, and packages it with the collection for transport. Without this data, the identity portion of the
import cannot be executed. It's recommended that the collection stay detached until the import has been
completed, as there's no way to import the changes which occurred during the import.
2. Generate a backup. The next step of the migration process is to generate a backup that can be imported into
Azure DevOps Services. Data-tier Application Component Packages (DACPAC) is a SQL Server feature that
allows database changes to be packaged into a single file, and deployed to other instances of SQL. It can also be
restored directly to Azure DevOps Services, and is therefore used as the packaging method for getting
collection data into the cloud. Contoso will use the SqlPackage.exe tool to generate the DACPAC. This tool is
included in SQL Server Data Tools.
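A hedged sketch of the SqlPackage.exe invocation for generating the DACPAC follows; the collection database name, connection string, and output path are assumptions for illustration.

```cmd
REM Illustrative only -- database name and paths are assumptions.
SqlPackage.exe /sourceconnectionstring:"Data Source=localhost;Initial Catalog=Tfs_ContosoDev;Integrated Security=True" ^
  /targetFile:C:\TFSMigrator\Tfs_ContosoDev.dacpac /action:extract ^
  /p:ExtractAllTableData=true /p:IgnoreUserLoginMappings=true /p:IgnorePermissions=true /p:Storage=Memory
```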
3. Upload to storage. After the DACPAC is created, they upload it to Azure Storage. After it's uploaded, they get a
shared access signature (SAS), to allow the TFS Migration Tool access to the storage.
4. Fill out the import. Contoso can then fill out missing fields in the import file, including the DACPAC setting. To
start with, they'll specify that they want to do a dry run import, to check that everything's working properly
before the full migration.
5. Do a dry run. Dry run imports help test collection migration. Dry runs have limited life, and are deleted before
a production migration runs. They're deleted automatically after a set duration. A note about when the dry run
will be deleted is included in the success email received after the import finishes. Take note and plan accordingly.
6. Complete the production migration. With the dry run migration completed, Contoso admins do the final
migration by updating the import.json file, and running import again.
Detach the collection
Before starting, Contoso admins take a local SQL Server backup, and VMware snapshot of the TFS server, before
detaching.
1. In the TFS Admin console, they select the collection they want to detach (ContosoDev).
4. In Detach Progress, they monitor progress and select Next when the process finishes.
2. They connect to their subscription and locate the storage account they created for the migration
(contosodevmigration). They create a new blob container, azuredevopsmigration.
5. They accept the defaults and select Create. This enables access for 24 hours.
6. They copy the Shared Access Signature URL, so that it can be used by the TFS Migration Tool.
NOTE
The migration must happen within the allowed time window, or permissions will expire. Don't generate an SAS key
from the Azure portal. Keys generated like this are account-scoped, and won't work with the import.
3. The validation returns an error that the SAS key needs a longer expiry time.
4. They use Azure Storage Explorer to create a new SAS key with expiry set to seven days.
5. They update the import.json file and run the validation again. This time it completes successfully.
TfsMigrator import /importFile:C:\TFSMigrator\import.json /validateonly
7. A message is issued to confirm the migration. Note the length of time for which the staged data will be
maintained after the dry run.
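After validation succeeds, the dry-run import itself is presumably started by dropping the /validateonly switch from the command shown earlier, along these lines:

```cmd
TfsMigrator import /importFile:C:\TFSMigrator\import.json
```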
8. The Azure AD sign-in screen appears, and they sign in with the Contoso admin credentials.
9. A message shows information about the import.
10. After 15 minutes or so, they browse to the URL, and see the following information:
11. After the migration finishes, a Contoso dev lead signs in to Azure DevOps Services to check that the dry
run worked properly. After authentication, Azure DevOps Services needs a few details to confirm the
organization.
12. In Azure DevOps Services, the Dev Lead can see that the projects have been migrated to Azure DevOps
Services. There's a notice that the organization will be deleted in 15 days.
13. The Dev Lead opens one of the projects and opens Work Items > Assigned to me. This shows that work
item data has been migrated, along with identity.
14. The Dev Lead also checks other projects and code, to confirm that the source code and history has been
migrated.
7. After around 15 minutes, they browse to the URL, and see the following information:
8. After the migration finishes, a Contoso dev lead signs in to Azure DevOps Services to check that the
migration worked properly. After signing in, they can see that projects have been migrated.
9. The Dev Lead opens one of the projects and opens Work Items > Assigned to me. This shows that work
item data has been migrated, along with identity.
10. The Dev Lead checks other work item data to confirm.
11. The Dev Lead also checks other projects and code, to confirm that the source code and history has been
migrated.
Move source control from TFVC to Git
With migration complete, Contoso wants to move from TFVC to Git for source code management. They need to
import the source code currently in their Azure DevOps Services organization as Git repos in the same
organization.
1. In the Azure DevOps Services portal, they open one of the TFVC repos ( $/PolicyConnect) and review it.
NOTE
Due to differences in how TFVC and Git store version control information, we recommend that Contoso doesn't migrate
history. This is the approach that Microsoft took when it migrated Windows and other products from centralized
version control to Git.
4. After the import, admins review the code.
6. After reviewing the source, the Dev Leads agree that the migration to Azure DevOps Services is done. Azure
DevOps Services now becomes the source for all development within teams involved in the migration.
Need more help?
Learn more about importing from TFVC.
Post-migration training
Contoso will need to provide Azure DevOps Services and Git training for relevant team members.
Scale a migration to Azure
This article demonstrates how the fictional company Contoso performs a migration at scale to Azure. They
consider how to plan and perform a migration of more than 3000 workloads, 8000 databases, and over 10,000
VMs.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, causing pressure on on-premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures, and streamline processes for
developers and users. The business needs IT to be fast and not waste time or money, thus delivering faster on
customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react
more quickly to changes in the marketplace, to enable success in a global economy. It mustn't get in the way,
or become a business blocker.
Scale. As the business grows successfully, the Contoso IT team must provide systems that are able to grow at
the same pace.
Improve cost models. Contoso wants to lessen capital requirements in the IT budget. Contoso wants to use
cloud abilities to scale and reduce the need for expensive hardware.
Lower licensing costs. Contoso wants to minimize cloud costs.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals were used to determine the best
migration method.
REQUIREMENTS DETAILS
Move to Azure quickly Contoso wants to start moving apps and VMs to Azure as
quickly as possible.
Compile a full inventory Contoso wants a complete inventory of all apps, databases,
and VMs in the organization.
Assess and classify apps Contoso wants to take full advantage of the cloud. By default,
Contoso assumes that all services will run as PaaS. IaaS will be
used where PaaS isn't appropriate.
Train and move to DevOps Contoso wants to move to a DevOps model. Contoso will
provide Azure and DevOps training, and reorganize teams as
necessary.
After pinning down goals and requirements, Contoso reviews the IT footprint, and identifies the migration process.
Current deployment
After planning and setting up an Azure infrastructure and trying out different proof-of-concept (POC ) migration
combinations as detailed in the table above, Contoso is ready to embark on a full migration to Azure at scale.
Here's what Contoso wants to migrate.
Migration process
Now that Contoso has pinned down business drivers and migration goals, it determines a four-pronged
approach for the migration process:
Phase 1: Assess. Discover the current assets, and figure out whether they're suitable for migration to Azure.
Phase 2: Migrate. Move the assets to Azure. How they move apps and objects to Azure will depend on the app
and what they want to achieve.
Phase 3: Optimize. After moving resources to Azure, Contoso needs to improve and streamline them for
maximum performance and efficiency.
Phase 4: Secure and manage. With everything in place, Contoso now uses Azure security and management
resources and services to govern, secure, and monitor its cloud apps in Azure.
These phases aren't serial across the organization. Each piece of Contoso's migration project will be at a different
stage of the assessment and migration process. Optimization, security, and management will be ongoing over time.
Phase 1: Assess
Contoso kicks off the process by discovering and assessing on-premises apps, data, and infrastructure. Here's what
Contoso will do:
Contoso needs to discover apps, map dependencies across apps, and decide on migration order and priority.
As Contoso assesses, it will build out a comprehensive inventory of apps and resources. Along with the new
inventory, Contoso will use and update the existing Configuration Management Database (CMDB) and Service
Catalog.
The CMDB holds technical configurations for Contoso apps.
The Service Catalog documents the operational details of apps, including associated business partners,
and Service Level Agreements (SLAs).
Discover apps
Contoso runs thousands of apps across a range of servers. In addition to the CMDB and Service Catalog, Contoso
needs discovery and assessment tools.
The tools must provide a mechanism that can feed assessment data into the migration process.
Assessment tools must provide data that helps build up an intelligent inventory of Contoso's physical and
virtual resources. Data should include profile information, and performance metrics.
When discovery is complete, Contoso should have a complete inventory of assets, and metadata associated
with them. This inventory will be used to define the migration plan.
Identify classifications
Contoso identifies some common categories to classify assets in the inventory. These classifications are critical to
Contoso's decision making for migration. The classification list helps to establish migration priorities, and identify
complex issues.
| Classification | Values | Details |
| --- | --- | --- |
| Business group | List of business group names | Which group is responsible for the inventory item? |
| Migration priority | 1/2/3 | What's the migration priority for the app? |
| Migration risk | 1-5 | What's the risk level for migrating the app? This value should be agreed on by Contoso DevOps and relevant partners. |
Contoso needs to use Azure Migrate correctly given the scale of this migration.
Contoso will do an app-by-app assessment with Azure Migrate. This ensures that Azure Migrate returns timely
data to the Azure portal.
Contoso admins read about deploying Azure Migrate at scale.
Contoso also notes the documented Azure Migrate limits.
| Strategy | Details | Usage |
| --- | --- | --- |
| Rehost | Often referred to as a lift and shift migration, this is a no-code option for migrating existing apps to Azure quickly. | Contoso can rehost less-strategic apps, requiring no code changes. |
| Refactor | Also referred to as "repackaging", this strategy requires minimal app code or configuration changes to connect the app to Azure PaaS and take better advantage of cloud capabilities. | Contoso can refactor strategic apps to retain the same basic functionality, but move them to run on an Azure platform such as Azure App Service. This requires minimum code changes. |
| Rebuild | This strategy rebuilds an app from scratch using cloud-native technologies. Azure platform as a service (PaaS) provides a complete development and deployment environment in the cloud. It eliminates some expense and complexity of software licenses, and removes the need for an underlying app infrastructure, middleware, and other resources. | Contoso can rewrite critical apps from the ground up, to take advantage of cloud technologies such as serverless compute or microservices. Contoso will manage the app and services it develops, and Azure manages everything else. |
Data must also be considered, especially with the volume of databases that Contoso has. Contoso's default
approach is to use PaaS services such as Azure SQL Database to take full advantage of cloud features. By moving
to a PaaS service for databases, Contoso will only have to maintain data, leaving the underlying platform to
Microsoft.
Evaluate migration tools
Contoso is primarily using a couple of Azure services and tools for the migration:
Azure Site Recovery: Orchestrates disaster recovery, and migrates on-premises VMs to Azure.
Azure Database Migration Service: Migrates on-premises databases such as SQL Server, MySQL, and Oracle
to Azure.
Azure Site Recovery
Azure Site Recovery is the primary Azure service for orchestrating disaster recovery and migration from within
Azure, and from on-premises sites to Azure.
1. Site Recovery enables and orchestrates replication from your on-premises sites to Azure.
2. When replication is set up and running, on-premises machines can be failed over to Azure, completing the
migration.
Contoso already completed a POC to see how Site Recovery can help them to migrate to the cloud.
Use Site Recovery at scale
Contoso plans to perform multiple lift and shift migrations. To ensure this works, Site Recovery will be replicating
batches of around 100 VMs at a time. To figure out how this will work, Contoso needs to perform capacity
planning for the proposed Site Recovery migration.
Contoso needs to gather information about their traffic volumes. In particular:
Contoso needs to determine the rate of change for VMs it wants to replicate.
Contoso also needs to take network connectivity from the on-premises site to Azure into account.
In response to capacity and volume requirements, Contoso will need to allocate sufficient bandwidth, based on
the daily data change rate for the required VMs, to meet its recovery point objective (RPO).
Lastly, they need to figure out how many servers are needed to run the Site Recovery components that are
needed for the deployment.
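As a back-of-envelope sketch of that bandwidth calculation (the Site Recovery Deployment Planner, described below, is the tool to use for real numbers), assume the daily churn must replicate within a chosen window:

```python
def required_bandwidth_mbps(daily_churn_gb: float, hours_available: float = 24.0) -> float:
    """Minimum sustained bandwidth (Mbps) needed to push one day's
    data churn through in the given window. Rough estimate only."""
    bits = daily_churn_gb * 1000**3 * 8       # decimal GB -> bits
    seconds = hours_available * 3600
    return bits / seconds / 1e6               # bits/sec -> Mbps

# Example: a batch of 100 VMs with ~5 GB of churn each per day,
# replicating continuously over 24 hours.
print(round(required_bandwidth_mbps(100 * 5), 1))  # 46.3
```

A shorter replication window raises the requirement proportionally, which is why the daily change rate and connectivity must be measured together.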
Gather on-premises information
Contoso can use the Site Recovery Deployment Planner tool to complete these steps:
Contoso can use the tool to remotely profile VMs without an impact on the production environment. This helps
pinpoint bandwidth and storage requirements for replication and failover.
Contoso can run the tool without installing any Site Recovery components on-premises.
The tool gathers information about compatible and incompatible VMs, disks per VM, and data churn per disk. It
also identifies network bandwidth requirements, and the Azure infrastructure needed for successful replication
and failover.
Contoso needs to ensure that it runs the planner tool on a Windows Server machine that matches the
minimum requirements for the Site Recovery configuration server. The configuration server is a Site Recovery
machine that's needed in order to replicate on-premises VMware VMs.
Identify Site Recovery requirements
In addition to the VMs being replicated, Site Recovery requires several components for VMware migration.
Contoso needs to figure out how to deploy these components, based on capacity considerations.
| Component | Capacity requirements |
| --- | --- |
| Maximum daily change rate | A single process server can handle a daily change rate up to 2 TB. Because a VM can only use one process server, the maximum daily data change rate that's supported for a replicated VM is 2 TB. |
| Maximum throughput | A standard Azure storage account can handle a maximum of 20,000 requests per second, and input/output operations per second (IOPS) across a replicating VM should be within this limit. For example, if a VM has 5 disks, and each disk generates 120 IOPS (8K size) on the VM, then it will be within the Azure per-disk IOPS limit of 500. Note that the number of storage accounts needed is equal to the total source machine IOPS, divided by 20,000. A replicated machine can belong to only a single storage account in Azure. |
| Configuration server | Based on Contoso's estimate of replicating 100-200 VMs together, and the configuration server sizing requirements, Contoso estimates it needs a configuration server machine as follows: Memory: 32 GB; Cache disk: 1 TB. In addition to sizing requirements, Contoso will need to make sure that the configuration server is optimally located, on the same network and LAN segment as the VMs that will be migrated. |
| Process server | Contoso will deploy a standalone dedicated process server with the ability to replicate 100-200 VMs: Memory: 32 GB; Cache disk: 1 TB. The process server will be working hard, and as such should be located on an ESXi host that can handle the disk I/O, network traffic, and CPU required for the replication. Contoso will consider a dedicated host for this purpose. |
| Networking | Contoso has reviewed the current site-to-site VPN infrastructure, and decided to implement Azure ExpressRoute. The implementation is critical because it will lower latency, and improve bandwidth to Contoso's primary East US 2 Azure region. Monitoring: Contoso will need to carefully monitor data flowing from the process server. If the data overloads the network bandwidth, Contoso will consider throttling the process server bandwidth. |
| Azure storage | For migration, Contoso must identify the right type and number of target Azure storage accounts. Site Recovery replicates VM data to Azure Storage. To decide about storage, Contoso must review storage limits, and factor in expected growth and increased usage over time. Given the speed and priority of migrations, Contoso has decided to use premium SSDs. Contoso has decided to use managed disks for all VMs deployed to Azure. The IOPS required will determine whether the disks will be Standard HDD, Standard SSD, or Premium SSD. |
Another scaling tactic is for Contoso to temporarily scale up the Azure SQL Database or Azure Database for MySQL target
instance to the Premium tier SKU during the data migration. This minimizes database throttling that could
affect data transfer activities when using lower-level SKUs.
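The storage-account arithmetic above (total source IOPS divided by the 20,000-per-account limit, rounded up) can be sketched as:

```python
import math

ACCOUNT_IOPS_LIMIT = 20_000  # standard storage account limit from the table

def storage_accounts_needed(total_source_iops: int) -> int:
    """Number of standard storage accounts needed for replication.
    Rounded up, because a replicated machine can belong to only
    one storage account."""
    return math.ceil(total_source_iops / ACCOUNT_IOPS_LIMIT)

# Example: 150 VMs, each with 5 disks generating 120 IOPS (600 IOPS per VM).
print(storage_accounts_needed(150 * 600))  # 90,000 IOPS -> 5 accounts
```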
Use other tools
In addition to the Azure Database Migration Service (DMS), Contoso can use other tools and services to identify VM information.
They have scripts to help with manual migrations. These are available in the GitHub repo.
Various partner tools can also be used for migration.
Phase 3: Optimize
After Contoso moves resources to Azure, they need to streamline them to improve performance, and maximize
ROI with cost management tools. Given that Azure is a pay-for-use service, it's critical for Contoso to understand
how systems are performing, and to ensure they're sized properly.
Azure Cost Management
To make the most of their cloud investment, Contoso will take advantage of the free Azure Cost Management tool.
This licensed solution built by Cloudyn, a Microsoft subsidiary, allows Contoso to manage cloud spending with
transparency and accuracy. It provides tools to monitor, allocate, and trim cloud costs.
Azure Cost Management provides simple dashboard reports to help with cost allocation, showbacks and
chargebacks.
Cost Management can optimize cloud spending by identifying underutilized resources that Contoso can then
manage and adjust.
Learn more about Azure Cost Management.
Native tools
Contoso will also use scripts to locate unused resources.
During large migrations, there are often leftover pieces of data such as virtual hard drives (VHDs), which incur a
charge, but provide no value to the company. Scripts are available in the GitHub repo.
Contoso will take advantage of work done by Microsoft's IT department, and consider implementing the Azure
Resource Optimization (ARO) Toolkit.
Contoso can deploy an Azure Automation account with preconfigured runbooks and schedules to its
subscription, and start saving money. Azure resource optimization happens automatically on a subscription
after a schedule is enabled or created, including optimization on new resources.
This provides decentralized automation capabilities to reduce costs. Features include:
Autosnooze Azure VMs based on low CPU.
Schedule Azure VMs to snooze and unsnooze.
Schedule Azure VMs to snooze or unsnooze in ascending and descending order using Azure tags.
Bulk deletion of resource groups on-demand.
Get started with the ARO toolkit in this GitHub repo.
Partner optimization tools
Partner tools such as Hanu and Scalr can be used.
Migration of an entire VMware host to Azure may accelerate the standard migration methodology outlined in the
Cloud Adoption Framework and pictured below.
Migration processes
The expanded scope article on VMware host migration outlines the approach to integrate VMware host migrations
with other Azure migration efforts to reduce complexity and standardize the process.
Migration of an entire SQL Server to Azure may accelerate the standard migration methodology outlined in the
Cloud Adoption Framework and pictured below.
Migration processes
The expanded scope article on SQL Server migration outlines the approach to integrate SQL Server migrations
with other Azure migration efforts to reduce complexity and standardize the process.
Many companies and organizations benefit from moving some or all their mainframe workloads, applications, and
databases to the cloud. Azure provides mainframe-like features at cloud scale without many of the drawbacks
associated with mainframes.
The term mainframe generally refers to a large computer system, but the vast majority of mainframes currently
deployed are IBM System Z servers or IBM plug-compatible systems running MVS, DOS, VSE, OS/390, or z/OS.
Mainframe systems continue to be used in many industries to run vital information systems, and they have a place
in highly specific scenarios, such as large, high-volume, transaction-intensive IT environments.
Migrating to the cloud enables companies to modernize their infrastructure. With cloud services you can make
mainframe applications, and the value that they provide, available as a workload whenever your organization needs
it. Many workloads can be transferred to Azure with only minor code changes, such as updating the names of
databases. You can migrate more complex workloads using a phased approach.
Most Fortune 500 companies are already running Azure for their critical workloads. Azure's significant bottom-line
incentives motivate many migration projects. Companies typically move development and test workloads to Azure
first, followed by DevOps, email, and disaster recovery as a service.
Intended audience
If you're considering a migration or the addition of cloud services as an option for your IT environment, this guide
is for you.
This guidance helps IT organizations start the migration conversation. You may be more familiar with Azure and
cloud-based infrastructures than you are with mainframes, so this guide starts with an overview of how
mainframes work, and continues with various strategies for determining what and how to migrate.
Mainframe architecture
In the late 1950s, mainframes were designed as scale-up servers to run high-volume online transactions and batch
processing. Because of this, mainframes have software for online transaction forms (sometimes called green
screens) and high-performance I/O systems for processing batch runs.
Mainframes have a reputation for high reliability and availability, and are known for their ability to run huge online
transactions and batch jobs. A transaction results from a piece of processing initiated by a single request, typically
from a user at a terminal. Transactions can also come from multiple other sources, including web pages, remote
workstations, and applications from other information systems. A transaction can also be triggered automatically at
a predefined time as the following figure shows.
A typical IBM mainframe architecture includes these common components:
Front-end systems: Users can initiate transactions from terminals, web pages, or remote workstations.
Mainframe applications often have custom user interfaces that can be preserved after migration to Azure.
Terminal emulators are still used to access mainframe applications, and are also called green-screen
terminals.
Application tier: Mainframes typically include a customer information control system (CICS ), a leading
transaction management suite for the IBM z/OS mainframe that is often used with IBM Information
Management System (IMS ), a message-based transaction manager. Batch systems handle high-throughput
data updates for large volumes of account records.
Code: Programming languages used by mainframes include COBOL, Fortran, PL/I, and Natural. Job control
language (JCL ) is used to work with z/OS.
Database tier: A common relational database management system (DBMS) for z/OS is IBM DB2. It
manages data structures called dbspaces that contain one or more tables and are assigned to storage pools
of physical data sets called dbextents. Two important database components are the directory that identifies
data locations in the storage pools, and the log that contains a record of operations performed on the
database. Various flat-file data formats are supported. DB2 for z/OS typically uses virtual storage access
method (VSAM ) datasets to store the data.
Management tier: IBM mainframes include scheduling software such as TWS -OPC, tools for print and
output management such as CA-SAR and SPOOL, and a source control system for code. Secure access
control for z/OS is handled by resource access control facility (RACF ). A database manager provides access
to data in the database and runs in its own partition in a z/OS environment.
LPAR: Logical partitions, or LPARs, are used to divide compute resources. A physical mainframe is
partitioned into multiple LPARs.
z/OS: A 64-bit operating system that is most commonly used for IBM mainframes.
IBM systems use a transaction monitor such as CICS to track and manage all aspects of a business transaction.
CICS manages the sharing of resources, the integrity of data, and prioritization of execution. CICS authorizes users,
allocates resources, and passes database requests by the application to a database manager, such as IBM DB2.
For more precise tuning, CICS is commonly used with IMS/TM (formerly IMS/Data Communications or IMS/DC ).
IMS was designed to reduce data redundancy by maintaining a single copy of the data. It complements CICS as a
transaction monitor by maintaining state throughout the process and recording business functions in a data store.
Mainframe operations
The following are typical mainframe operations:
Online: Workloads include transaction processing, database management, and connections. They are often
implemented using IBM DB2, CICS, and z/OS connectors.
Batch: Jobs run without user interaction, typically on a regular schedule such as every weekday morning.
Batch jobs can be run on systems based on Windows or Linux by using a JCL emulator such as Micro Focus
Enterprise Server or BMC Control-M software.
Job control language (JCL): Specifies the resources needed to process batch jobs. JCL conveys this
information to z/OS through a set of job control statements. Basic JCL contains six types of statements: JOB,
ASSGN, DLBL, EXTENT, LIBDEF, and EXEC. A job can contain several EXEC statements (steps), and each
step could have several LIBDEF, ASSGN, DLBL, and EXTENT statements.
Initial program load (IPL ): Refers to loading a copy of the operating system from disk into a processor's
real storage and running it. IPLs are used to recover from downtime. An IPL is like booting the operating
system on Windows or Linux VMs.
Next steps
Myths and facts
Mainframe myths and facts
Mainframes figure prominently in the history of computing and remain viable for highly specific workloads. Most
agree that mainframes are a proven platform with long-established operating procedures that make them reliable,
robust environments. Software runs based on usage, measured in million instructions per second (MIPS ), and
extensive usage reports are available for chargebacks.
The reliability, availability, and processing power of mainframes have taken on almost mythical proportions. To
evaluate the mainframe workloads that are most suitable for Azure, you first want to distinguish the myths from
the reality.
Summary
By comparison, Azure offers an alternative platform that is capable of delivering equivalent mainframe
functionality and features, and at a much lower cost. In addition, the total cost of ownership (TCO ) of the cloud's
subscription-based, usage-driven cost model is far less expensive than mainframe computers.
Next steps
Make the switch from mainframes to Azure
Make the switch from mainframes to Azure
As an alternative platform for running traditional mainframe applications, Azure offers hyperscale compute and
storage in a high availability environment. You get the value and agility of a modern, cloud-based platform without
the costs associated with a mainframe environment.
This section provides technical guidance for making the switch from a mainframe platform to Azure.
NOTE
These estimates are subject to change as new virtual machine (VM) series become available in Azure.
Scalability
Mainframes typically scale up, while cloud environments scale out. Mainframes can scale out with the use of a
coupling facility (CF ), but the high cost of hardware and storage makes mainframes expensive to scale out.
A CF also offers tightly coupled compute, whereas the scale-out features of Azure are loosely coupled. The cloud
can scale up or down to match exact user specifications, with compute power, storage, and services scaling on
demand under a usage-based billing model.
Storage
Part of understanding how mainframes work involves decoding various overlapping terms. For example, central
storage, real memory, real storage, and main storage all generally refer to storage attached directly to the
mainframe processor.
Mainframe hardware includes processors and many other devices, such as direct-access storage devices (DASDs),
magnetic tape drives, and several types of user consoles. Tapes and DASDs are used for system functions and by
user programs.
Types of physical storage for mainframes include:
Central storage: Located directly on the mainframe processor, this is also known as processor or real storage.
Auxiliary storage: Located separately from the mainframe, this type includes storage on DASDs and is also
known as paging storage.
The cloud offers a range of flexible, scalable options, and you will pay only for those options that you need. Azure
Storage offers a massively scalable object store for data objects, a file system service for the cloud, a reliable
messaging store, and a NoSQL store. For VMs, managed and unmanaged disks provide persistent, secure disk
storage.
Next steps
Mainframe application migration
Mainframe application migration
When migrating applications from mainframe environments to Azure, most teams follow a pragmatic approach:
reuse wherever and whenever possible, and then start a phased deployment where applications are rewritten or
replaced.
Application migration typically involves one or more of the following strategies:
Rehost: You can move existing code, programs, and applications from the mainframe, and then recompile
the code to run in a mainframe emulator hosted in a cloud instance. This approach typically starts with
moving applications to a cloud-based emulator, and then migrating the database to a cloud-based database.
Some engineering and refactoring are required along with data and file conversions.
Alternatively, you can rehost using a traditional hosting provider. One of the principal benefits of the cloud is
outsourcing infrastructure management. You can find a datacenter provider that will host your mainframe
workloads for you. This model may buy time, reduce vendor lock in, and produce interim cost savings.
Retire: All applications that are no longer needed should be retired before migration.
Rebuild: Some organizations choose to completely rewrite programs using modern techniques. Given the
added cost and complexity of this approach, it's not as common as a lift and shift approach. Often after this
type of migration, it makes sense to begin replacing modules and code using code transformation engines.
Replace: This approach replaces mainframe functionality with equivalent features in the cloud. Software as
a service (SaaS) is one option: using a solution created specifically for an enterprise concern, such
as finance, human resources, manufacturing, or enterprise resource planning. In addition, many industry-
specific apps are now available to solve problems that custom mainframe solutions previously solved.
Consider starting by planning which workloads you want to migrate first, and then determining the
requirements for moving the associated applications, legacy codebases, and databases.
On Azure, emulation environments are used to run the TP manager and the batch jobs that use JCL. In the data
tier, DB2 is replaced by Azure SQL Database, although Microsoft SQL Server, DB2 LUW, or Oracle Database can
also be used. An emulator supports IMS, VSAM, and SEQ. The mainframe's system management tools are
replaced by Azure services, and software from other vendors, that run in VMs.
The screen handling and form entry functionality is commonly implemented using web servers, which can be
combined with database APIs, such as ADO, ODBC, and JDBC for data access and transactions. The exact line-up
of Azure IaaS components to use depends on the operating system you prefer. For example:
Windows–based VMs: Internet Information Server (IIS ) along with ASP.NET for the screen handling and
business logic. Use ADO.NET for data access and transactions.
Linux–based VMs: Use one of the available Java-based application servers, such as Apache Tomcat, for screen
handling and Java-based business functionality. Use JDBC for data access and transactions.
Partner solutions
If you are considering a mainframe migration, the partner ecosystem is available to assist you.
Azure provides a proven, highly available, and scalable infrastructure for systems that currently run on
mainframes. Some workloads can be migrated with relative ease. Other workloads that depend on legacy system
software, such as CICS and IMS, can be rehosted using partner solutions and migrated to Azure over time.
Regardless of the choice you make, Microsoft and our partners are available to assist you in optimizing for Azure
while maintaining mainframe system software functionality.
Learn more
For more information, see the following resources:
Get started with Azure
Deploy IBM DB2 pureScale on Azure
Host Integration Server documentation
Best practices for costing and sizing workloads
migrated to Azure
As you plan and design for migration, focusing on costs ensures the long-term success of your Azure migration.
During a migration project, it's critical that all teams (such as finance, management, and application development
teams) understand associated costs.
Before migration, estimating your migration spend, with a baseline for monthly, quarterly, and yearly budget
targets, is critical to success.
After migration, you should optimize costs, continually monitor workloads, and plan for future usage patterns.
Migrated resources might start out as one type of workload, but shift to another type over time, based on
usage, costs, and shifting business requirements.
This article describes best practices for costing and sizing before and after migration.
IMPORTANT
The best practices and opinions described in this article are based on Azure platform and service features available at the
time of writing. Features and capabilities change over time. Not all recommendations might be applicable for your
deployment, so select what works for you.
Before migration
Before you move your workloads to the cloud, estimate the monthly cost of running them in Azure. Proactively
managing cloud costs helps you adhere to your operating expense budget. If budget is limited, take this into
account before migration. Consider converting workloads to Azure serverless technologies, where appropriate, to
reduce costs.
The best practices in this section help you to estimate costs, perform right-sizing for VMs and storage, use Azure
Hybrid benefits, use reserved VMs, and estimate cloud spending across subscriptions.
| Type | Details | Usage |
| --- | --- | --- |
| Storage optimized | High disk throughput and IO. | Good for big data, SQL and NoSQL databases. |
| GPU optimized | Specialized VMs. Single or multiple GPUs. | Heavy graphics and video editing. |
| High performance | Fastest and most powerful CPU. VMs with optional high-throughput network interfaces (RDMA). | Critical high-performance apps. |
It's important to understand the pricing differences between these VMs, and the long-term budget effects.
Each type has several VM series within it.
Additionally, when you select a VM within a series, you can only scale the VM up and down within that series.
For example, a DSv2_2 can scale up to DSv2_4, but it can't be changed to a different series such as Fsv2_2.
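That constraint can be expressed as a quick sanity check; the prefix-matching here is a simplification of Azure's size-naming scheme, used only to illustrate the rule:

```python
def can_resize_within_series(current: str, target: str) -> bool:
    """A VM can scale up or down only within its own series, so the
    series prefix (the part before the underscore) must match."""
    series = lambda size: size.split("_")[0]
    return series(current) == series(target)

print(can_resize_within_series("DSv2_2", "DSv2_4"))  # True: same DSv2 series
print(can_resize_within_series("DSv2_2", "Fsv2_2"))  # False: DSv2 -> Fsv2
```

Moving across series (for example, DSv2 to Fsv2) means deploying a new VM rather than resizing in place, which is why series choice matters for long-term budgeting.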
Learn more:
Learn more about VM types and sizing, and map sizes to types.
Plan VM sizing.
Review a sample assessment for the fictional Contoso company.
| Data type | Details | Usage |
| --- | --- | --- |
| Blobs | Optimized to store massive amounts of unstructured objects, such as text or binary data. Access data from everywhere over HTTP/HTTPS. | Use for streaming and random access scenarios. For example, to serve images and documents directly to a browser, stream video and audio, and store backup and disaster recovery data. |

Disk management: Unmanaged (you manage disk settings and storage) or Managed (you select the disk type and Azure manages the disk for you).
Access tiers
Azure storage provides different options for accessing block blob data. Selecting the right access tier helps ensure
that you store block blob data in the most cost-effective manner.
| Access tier | Details | Usage |
| --- | --- | --- |
| Hot | Higher storage cost than Cool. Lower access charges than Cool. | Use for data in active use that's accessed frequently. |
| Cool | Lower storage cost than Hot. Higher access charges than Hot. | Use for short-term storage, where data is available but accessed infrequently. |
| Archive | Used for individual block blobs. Most cost-effective option for storage. Data access is more expensive than Hot and Cool. | Use for data that can tolerate several hours of retrieval latency and will remain in the tier for at least 180 days. |
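The Hot/Cool trade-off can be made concrete with a toy cost model. The per-GB prices below are placeholders chosen for illustration, not current Azure pricing:

```python
def monthly_cost(gb_stored: float, gb_read: float,
                 storage_price: float, access_price: float) -> float:
    """Storage charge plus access charge for one month."""
    return gb_stored * storage_price + gb_read * access_price

# Placeholder prices: Cool stores at half the rate but charges for reads.
hot_cost  = lambda stored, read: monthly_cost(stored, read, 0.020, 0.000)
cool_cost = lambda stored, read: monthly_cost(stored, read, 0.010, 0.010)

stored_gb = 1000
for read_gb in (100, 2000):
    cheaper = "Hot" if hot_cost(stored_gb, read_gb) < cool_cost(stored_gb, read_gb) else "Cool"
    print(f"{read_gb} GB read/month -> {cheaper} is cheaper")
```

With these placeholder rates, infrequently read data favors Cool and heavily read data favors Hot, which matches the guidance in the table.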
| Account type | Details | Usage |
| --- | --- | --- |
| General Purpose v2 Standard | Supports blobs (block, page, append), files, disks, queues, and tables. Supports Hot, Cool, and Archive access tiers. ZRS is supported. | Use for most scenarios and most types of data. Standard storage accounts can be HDD or SSD based. |
| General Purpose v2 Premium | Supports Blob storage data (page blobs). Supports Hot, Cool, and Archive access tiers. ZRS is supported. Stored on SSD. | Microsoft recommends using for all VMs. |
| General Purpose v1 | Access tiering isn't supported. Doesn't support ZRS. | Use if apps need the Azure classic deployment model. |
| Blob | Specialized storage account for storing unstructured objects. Provides block blobs and append blobs only (no File, Queue, Table, or Disk storage services). Provides the same durability, availability, scalability, and performance as General Purpose v2. | You can't store page blobs in these accounts, and therefore can't store VHD files. You can set an access tier to Hot or Cool. |
| Redundancy option | Details | Usage |
| --- | --- | --- |
| Locally redundant storage (LRS) | Protects against a local outage by replicating within a single storage unit to a separate fault domain and update domain. Keeps multiple copies of your data in one datacenter. Provides at least 99.999999999% (11 9's) durability of objects over a given year. | Consider if your app stores data that can be easily reconstructed. |
| Zone-redundant storage (ZRS) | Protects against a datacenter outage by replicating across three storage clusters in a single region. Each storage cluster is physically separated and located in its own availability zone. Provides at least 99.9999999999% (12 9's) durability of objects over a given year by keeping multiple copies of your data across multiple datacenters or regions. | Consider if you need consistency, durability, and high availability. Might not protect against a regional disaster when multiple zones are permanently affected. |
| Geographically redundant storage (GRS) | Protects against an entire region outage by replicating data to a secondary region hundreds of miles away from the primary. Provides at least 99.99999999999999% (16 9's) durability of objects over a given year. | Replica data isn't available unless Microsoft initiates a failover to the secondary region. If failover occurs, read and write access is available. |
| Read-access geographically redundant storage (RA-GRS) | Similar to GRS. Provides at least 99.99999999999999% (16 9's) durability of objects over a given year. | Provides 99.99% read availability by allowing read access from the second region used for GRS. |
Learn more:
Review Azure Storage pricing.
Learn about Azure Import/Export for migrating large amounts of data to Azure blobs and files.
Compare blobs, files, and disk storage data types.
Learn more about access tiers.
Review different types of storage accounts.
Learn about storage redundancy, LRS, ZRS, GRS, and Read-access GRS.
Learn more about Azure Files.
After migration
After a successful migration of your workloads, and a few weeks of collecting consumption data, you'll have a clear
idea of resource costs.
As you analyze data, you can start to generate a budget baseline for Azure resource groups and resources.
Then, as you understand where your cloud budget is being spent, you can analyze how to further reduce your
costs.
Best practices in this section include using Azure Cost Management for cost budgeting and analysis, monitoring
resources and implementing resource group budgets, and optimizing monitoring, storage, and VMs.
Best practice: Use Logic Apps and runbooks with the Budgets API
Azure provides a REST API that has access to your tenant billing information.
You can use the Budgets API to integrate external systems and workflows that are triggered by metrics that
you build from the API data.
You can pull usage and resource data into your preferred data analysis tools.
The Azure Resource Usage and RateCard APIs can help you accurately predict and manage your costs.
The APIs are implemented as a Resource Provider and are included in the APIs exposed by the Azure Resource
Manager.
The Budgets API can be integrated with Azure Logic Apps and Runbooks.
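The metric-driven triggers described above reduce to a simple threshold check once spend data has been pulled from the APIs. A hedged sketch; the function name and threshold values are illustrative assumptions, not part of the Budgets API:

```python
def crossed_thresholds(budget: float, spend: float, thresholds=(0.5, 0.8, 1.0)):
    """Return the budget fractions that current spend has crossed, so a
    Logic App or runbook can be triggered per threshold."""
    used = spend / budget
    return [t for t in thresholds if used >= t]

# Spend figures pulled from the Budgets and usage APIs would feed a check like:
alerts = crossed_thresholds(budget=10_000.0, spend=8_500.0)  # -> [0.5, 0.8]
```

In a real workflow, each crossed threshold would map to a different action, for example notify at 50%, require approval at 80%, and deallocate non-production resources at 100%.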
Learn more:
Learn more about the Budgets API.
Get insights into Azure usage with the Billing API.
Next steps
Review other best practices:
Best practices for security and management after migration.
Best practices for networking after migration.
Best practices for securing and managing workloads
migrated to Azure
As you plan and design for migration, in addition to thinking about the migration itself, you need to consider your
security and management model in Azure after migration. This article describes planning and best practices for
securing your Azure deployment after migrating, and for ongoing tasks to keep your deployment running at an
optimal level.
IMPORTANT
The best practices and opinions described in this article are based on the Azure platform and service features available at the
time of writing. Features and capabilities change over time.
Delete locks
Learn more:
Learn about locking resources to prevent unexpected changes.
Tagging
Learn more:
Learn about tagging and tag limitations.
Review PowerShell and CLI examples to set up tagging, and to apply tags from a resource group to its
resources.
Read Azure tagging best practices.
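The linked samples apply tags from a resource group to its resources; the merge rule itself can be sketched in plain Python, standing in for the PowerShell/CLI examples (letting the resource's own tags win on conflict is an assumption you might invert):

```python
def inherit_tags(group_tags: dict, resource_tags: dict) -> dict:
    """Combine a resource group's tags with a resource's own tags.
    The resource's existing tags win on key conflicts."""
    merged = dict(group_tags)       # start from the group-level tags
    merged.update(resource_tags)    # resource-level tags override
    return merged

tags = inherit_tags({"costcenter": "5499", "env": "prod"},
                    {"env": "dev", "owner": "ops"})
# tags -> {"costcenter": "5499", "env": "dev", "owner": "ops"}
```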
Best practice: Implement blueprints
Just as a blueprint allows engineers and architects to sketch a project's design parameters, Azure Blueprints enables
cloud architects and central IT groups to define a repeatable set of Azure resources that implements and adheres
to an organization's standards, patterns, and requirements. Using Azure Blueprints, development teams can
rapidly build and create new environments that meet organizational compliance requirements, and that have a set
of built-in components, such as networking, to speed up development and delivery.
Use blueprints to orchestrate the deployment of resource groups, Azure Resource Manager templates, and
policy and role assignments.
Blueprints are stored in a globally distributed Azure Cosmos DB. Blueprint objects are replicated to multiple
Azure regions. Replication provides low latency, high availability, and consistent access to blueprints, regardless
of the region to which a blueprint deploys resources.
Learn more:
Read about blueprints.
Review a blueprint example used to accelerate AI in healthcare.
Management groups
Learn more:
Learn more about organizing resources into management groups.
Azure Policy
Learn more:
Get an overview of Azure Policy.
Learn about creating and managing policies to enforce compliance.
Site Recovery
Learn more:
Review disaster recovery scenarios for Azure VMs.
Learn how to set up disaster recovery for an Azure VM after migration.
Alerts
Learn more:
Learn about alerts.
Learn about security playbooks that respond to Security Center alerts.
Azure dashboard
Learn more:
Learn how to create a dashboard.
Learn about dashboard structure.
Next steps
Review other best practices:
Best practices for networking after migration.
Best practices for cost management after migration.
Cloud Adoption Framework migration model
This section of the Cloud Adoption Framework explains the principles behind its migration model. Wherever
possible, this content attempts to maintain a vendor-neutral position while guiding you through the processes and
activities that can be applied to any cloud migration, regardless of your chosen cloud vendor.
NOTE
While business planning is important, a growth mindset is equally important. In parallel with broader business planning
efforts by the cloud strategy team, it's suggested that the cloud adoption team begin migrating a first workload as a
precursor to wider scale migration efforts. This initial migration will allow the team to gain practical experience with the
business and technical issues involved in a migration.
Migration and modernization of workloads range from simple rehost (also called lift and shift) migrations using
infrastructure as a service (IaaS) capabilities that don't require code and app changes, through refactoring with
minimal changes, to rearchitecting to modify and extend code and app functionality to take advantage of cloud
technologies.
Cloud-native strategies and platform as a service (PaaS) strategies rebuild on-premises workloads using Azure
platform offerings and managed services. Workloads that have equivalent fully managed software as a service
(SaaS) cloud-based offerings can often be fully replaced by these services as part of the migration process.
NOTE
During the public preview of the Cloud Adoption Framework, this section of the framework emphasizes a rehost migration
strategy. Although PaaS and SaaS solutions are discussed as alternatives when appropriate, the migration of virtual machine-
based workloads using IaaS capabilities is the primary focus.
Other sections and future iterations of this content will expand on other approaches. For a high-level discussion on
expanding the scope of your migration to include more complicated migration strategies, see the article balancing the
portfolio.
Incremental migration
The Cloud Adoption Framework migration model is based on an incremental cloud transformation process. It
assumes that your organization will start with an initial, limited-scope, cloud migration effort, which we refer to
commonly as the first workload. This effort will expand iteratively to include more workloads as your Operations
teams refine and improve your migration processes.
Cloud migration tools like Azure Site Recovery can migrate entire datacenters consisting of tens of thousands of
VMs. However, the business and existing IT operations can seldom handle such a high pace of change. As such,
many organizations break up a migration effort into multiple iterations, moving one workload (or a collection of
workloads) per iteration.
The principles behind this incremental model are based on the execution of processes and prerequisites referenced
in the following infographic.
The consistent application of these principles represents an end goal for your cloud migration processes and
should not be viewed as a required starting point. As your migration efforts mature, refer to the guidance in this
section to help define the best process to support your organizational needs.
Next steps
Begin learning about this model by investigating the prerequisites to migration.
Prerequisites to migration
Prerequisites for migration
Prior to beginning any migrations, your migration target environment must be prepared for the coming changes.
In this case, environment refers to the technical foundation in the cloud. Environment also means the business
environment and mindset driving the migration. Likewise, the environment includes the culture of the teams
executing the changes and those receiving the output. Lack of preparation for these changes is the most common
reason for failure of migrations. This series of articles walks you through suggested prerequisites to prepare the
environment.
Objective
Ensure business, culture, and technical readiness prior to beginning an iterative migration plan.
Definition of done
Prerequisites are completed when the following are true:
Business readiness. The cloud strategy team has defined and prioritized a high-level migration backlog
representing the portion of the digital estate to be migrated in the next two or three releases. The cloud
strategy team and the cloud adoption team have agreed to an initial strategy for managing change.
Culture readiness. The roles, responsibilities, and expectations of the cloud adoption team, cloud strategy
team, and affected users have been agreed on regarding the workloads to be migrated in the next two or three
releases.
Technical readiness. The landing zone (or allocated hosting space in the cloud) that will receive the migrated
assets meets minimum requirements to host the first migrated workload.
Caution
Preparation is key to the success of a migration. However, too much preparation can lead to analysis paralysis,
where too much time spent on planning can seriously delay a migration effort. The processes and prerequisites
defined in this section are meant to help you make decisions, but don't let them block you from making
meaningful progress.
Choose a relatively simple workload for your initial migration. Use the processes discussed in this section as you
plan and implement this first migration. This first migration effort will quickly demonstrate cloud principles to
your team and force them to learn about how the cloud works. As your team gains experience, integrate these
learnings as you take on larger and more complex migrations.
Next steps
With a general understanding of the prerequisites, you are ready to address the first prerequisite early migration
decisions.
Early migration decisions
Decisions that affect migration
During migration, several factors affect decisions and execution activities. This article explains the central theme of
those decisions and explores a few questions that carry through the discussions of migration principles in this
section of the Cloud Adoption Framework guidance.
Business outcomes
The objective or goal of any adoption effort can have a significant impact on the suggested approach to execution.
Migration. Urgent business drivers, speed of adoption, or cost savings are examples of operational outcomes.
These outcomes are central to efforts that drive business value from transitive change in IT or operations
models. The Migrate section of the Cloud Adoption Framework focuses heavily on migration-focused business
outcomes.
Application innovation. Improving customer experience and growing market share are examples of
incremental outcomes. The outcomes result from a collection of incremental changes focused on the needs and
desires of current customers.
Data-driven innovation. New products or services, especially those that come from the power of data, are
examples of disruptive outcomes. These outcomes are the result of experimentation and predictions that use
data to disrupt status quo in the market.
No business would pursue just one of these outcomes. Without operations, there are no customers, and vice versa.
Cloud adoption is no different. Companies commonly work to achieve each of these outcomes, but trying to focus
on all of them simultaneously can spread your efforts too thin and slow progress on work that could most benefit
your business needs.
This prerequisite isn't a demand for you to pick one of these three goals, but instead to help your cloud strategy
team and your cloud adoption team establish a set of operational priorities that will guide execution for the next
three to six months. These priorities are set by ranking each of the three itemized options from most significant to
least significant, as they relate to the efforts this team can contribute to in the next one or two quarters.
Act on migration outcomes
If operational outcomes rank highest in the list, this section of the Cloud Adoption Framework will work well for
your team. In this section, it is assumed that you need to prioritize speed and cost savings as primary key
performance indicators (KPIs), in which case a migration-focused adoption model aligns well with those
outcomes. Such a model is heavily predicated on lift and shift migration of infrastructure as a service
(IaaS) assets to deplete a datacenter and to produce cost savings. In such a model, modernization may occur but is
a secondary focus until the primary migration mission is realized.
Act on application innovations
If market share and customer experience are your primary drivers, this may not be the best section of the Cloud
Adoption Framework to guide your teams' efforts. Application innovation requires a plan that focuses on the
modernization and transition of workloads, regardless of the underlying infrastructure. In such a case, the
guidance in this section can be informative but may not be the best approach to guide core decisions.
Act on data innovations
If data, experimentation, research and development (R&D), or new products are your priority for the next six
months or so, this may not be the best section of the Cloud Adoption Framework to guide your teams' efforts. Any
data innovation effort could benefit from guidance regarding the migration of existing source data. However, the
broader focus of that effort would be on the ingress and integration of additional data sources. Extending that
guidance with predictions and new experiences is much more important than the migration of IaaS assets.
Effort
Migration effort can vary widely depending on the size and complexities of the workloads involved. A smaller
workload migration involving a few hundred virtual machines (VMs) is a tactical process, potentially being
implemented using automated tools such as Azure Migrate. Conversely, a large enterprise migration of tens of
thousands of workloads requires a highly strategic process and can involve extensive refactoring, rebuilding, and
replacing of existing applications integrating platform as a service (PaaS) and software as a service (SaaS)
capabilities. Identifying and balancing the scope of your planned migrations is critical.
Before making any decisions that could have a long-term impact on the current migration program, it is vital that
you create consensus on the following decisions.
Effort type
In any migration of significant scale (>250 VMs), assets are migrated using a variety of transition options,
discussed in the five Rs of rationalization: Rehost, Refactor, Rearchitect, Rebuild, and Replace.
Some workloads are modernized through a rebuild or rearchitect process, creating more modern applications with
new features and technical capabilities. Other assets go through a refactor process, for instance a move to
containers or other more modern hosting and operational approaches that don't necessarily affect the solutions
codebase. Commonly, virtual machines and other assets that are more well-established go through a rehost
process, transitioning those assets from the datacenter to the cloud. Some workloads could potentially be
migrated to the cloud but should instead be replaced with SaaS-based cloud services that meet the same business
need, for example by using Office 365 as an alternative to migrating Exchange Server instances.
In the majority of scenarios, some business event creates a forcing function that causes a high percentage of assets
to temporarily migrate using the rehost process, followed by a more significant secondary transition using one of
the other migration strategies after they are in the cloud. This process is commonly known as a cloud transition.
During the process of rationalizing the digital estate, these types of decisions are applied to each asset to migrate.
However, the prerequisite needed at this time is to make a baseline assumption. Of the five migration strategies,
which best aligns with the business objectives or business outcomes driving this migration effort? This decision
serves as a guiding assumption throughout the migration effort.
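The baseline assumption this section asks for can be recorded as a default strategy plus per-asset exceptions. A hypothetical sketch; the asset names and the rehost default are illustrative assumptions:

```python
FIVE_RS = {"rehost", "refactor", "rearchitect", "rebuild", "replace"}

def plan_strategies(assets, exceptions=None, baseline="rehost"):
    """Assign each asset the baseline strategy unless an exception applies."""
    exceptions = exceptions or {}
    assert baseline in FIVE_RS and set(exceptions.values()) <= FIVE_RS
    return {asset: exceptions.get(asset, baseline) for asset in assets}

plan = plan_strategies(
    ["web-vm-01", "sql-vm-02", "exchange-01"],
    exceptions={"exchange-01": "replace"},  # e.g. replaced by Office 365
)
# plan["exchange-01"] -> "replace"; the other assets default to "rehost"
```

Recording the baseline this way makes the guiding assumption explicit and reviewable as the backlog is rationalized asset by asset.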
Effort scale
Scale of the migration is the next important prerequisite decision. The process required to migrate 1,000 assets
is different from the process required to move 10,000 assets. Before beginning any migration effort, it is important
to answer the following questions:
How many assets support the migrating workloads today? Assets would include data structures,
applications, VMs, and necessary IT appliances. It's recommended that you choose a relatively small workload
for your first migration candidate.
Of those assets, how many are planned for migration? It is common for a percentage of assets to be
terminated during a migration process, due to lack of sustained end-user dependency.
What are the top-down estimates of the migratable assets scale? For the workloads included for
migration, estimate the number of supporting assets such as applications, virtual machines, data sources, and
IT appliances. See the digital estate section of the Cloud Adoption Framework for guidance on identifying
relevant assets.
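A top-down estimate of the kind these questions ask for is simple arithmetic. The ratios below are illustrative assumptions, not benchmarks:

```python
def top_down_estimate(total_assets: int, migrate_ratio: float,
                      assets_per_workload: float):
    """Rough scale estimate: assets in scope for migration, and the
    approximate number of workloads they represent."""
    in_scope = round(total_assets * migrate_ratio)       # assets planned for migration
    workloads = round(in_scope / assets_per_workload)    # supporting-asset ratio
    return in_scope, workloads

# 2,000 discovered assets, ~75% planned for migration (the rest terminated),
# ~10 supporting assets per workload:
in_scope, workloads = top_down_estimate(2000, 0.75, 10)  # -> (1500, 150)
```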
Effort timing
Often, migrations are driven by a compelling business event that is time sensitive. For instance, one common
driver is the termination or renewal of a third-party hosting contract. Although there are many potential business
events necessitating a migration, they all share one commonality: an end date. It is important to understand the
timing of any approaching business events, so activities and velocity can be planned and validated properly.
Recap
Before proceeding, document the following assumptions and share them with the cloud strategy team and the
cloud adoption teams:
Business outcomes.
Roles, documented and refined for the Assess, Migrate, Optimize, and Secure and Manage migration processes.
Definition of done, documented and refined separately for the Assess, Migrate, Optimize, and Secure and
Manage migration processes.
Effort type.
Effort scale.
Effort timing.
Next steps
After the process is understood among the team, it's time to review technical prerequisites. The migration
environment planning checklist helps to ensure that the technical foundation is ready for migration.
Review the migration planning checklist
Migration environment planning checklist: validate
environmental readiness prior to migration
As an initial step in the migration process, you need to create the right environment in the cloud to receive, host,
and support migrating assets. This article provides a list of things to validate in the current environment prior to
migration.
The following checklist aligns with the guidance found in the Ready section of the Cloud Adoption Framework.
Review that section for guidance regarding execution of any of the following.
Governance alignment
The first and most important decision regarding any migration-ready environment is the choice of governance
alignment. Has a consensus been achieved regarding alignment of governance with the migration foundation? At
a minimum, the cloud adoption team should understand whether this migration is landing in a single environment
with limited governance, a fully governed environment factory, or some variant in between. For more options and
guidance on governance alignment, see the article on Governance and compliance alignment.
We highly recommend that you develop a governance strategy for anything beyond your initial workload
migration.
Regardless of your level of governance alignment, you will need to make decisions related to the following topics.
Resource organization
Based on the governance alignment decision, an approach to the organization and deployment of resources
should be established prior to migration.
Nomenclature
A consistent approach for naming resources, along with consistent naming schemas, should be established prior
to migration.
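A naming schema of the kind described here can be enforced before migration with a small helper. The segment order and separators below are assumptions for illustration, not an Azure requirement:

```python
import re

def resource_name(rtype: str, workload: str, env: str, region: str,
                  instance: int) -> str:
    """Build a name like 'vm-payroll-prod-westus-01' from a fixed schema,
    rejecting anything outside lowercase alphanumerics and hyphens."""
    name = f"{rtype}-{workload}-{env}-{region}-{instance:02d}"
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        raise ValueError(f"name violates schema: {name}")
    return name

resource_name("vm", "payroll", "prod", "westus", 1)  # -> 'vm-payroll-prod-westus-01'
```

Generating names instead of hand-writing them keeps the schema consistent across teams, which pays off later when tagging, budgeting, and governance tooling filter on name segments.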
Resource governance
A decision regarding the tools to govern resources should be made prior to migration. The tools do not need to be
fully implemented, but a direction should be selected and tested. The cloud governance team should define and
require the implementation of a minimum viable product (MVP) for governance tooling prior to migration.
Network
Your cloud-based workloads will require the provisioning of virtual networks to support end-user and
administrative access. Based on resource organization and resource governance decisions, you should select a
network approach and align it to IT security requirements. Further, your networking decisions should be aligned with
any hybrid network constraints required to operate the workloads in the migration backlog and support any
access to resources hosted on-premises.
Identity
Cloud-based identity services are a prerequisite for offering identity and access management (IAM) for your cloud
resources. Align your identity management strategy with your cloud adoption plans before proceeding. For
example, when migrating existing on-premises assets, consider supporting a hybrid identity approach using
directory synchronization to allow a consistent set of user credentials across your on-premises and cloud
environments during and after the migration.
Next steps
If the environment meets the minimum requirements, it may be deemed approved for migration readiness.
Cultural complexity and change management helps to align roles and responsibilities to ensure proper
expectations during execution of the plan.
Cultural complexity and change management
Prepare for cultural complexity: aligning roles and
responsibilities
An understanding of the culture required to operate the existing datacenters is important to the success of any
migration. In some organizations, datacenter management is contained within centralized IT operations teams. In
these centralized teams, roles and responsibilities tend to be well defined and well understood throughout the
team. For larger enterprises, especially those bound by third-party compliance requirements, the culture tends to
be more nuanced and complex. Cultural complexity can lead to roadblocks that are difficult to understand and time
consuming to overcome.
In either scenario, it's wise to invest in the documentation of roles and responsibilities required to complete a
migration. This article outlines some of the roles and responsibilities seen in a datacenter migration, to serve as a
template for documentation that can drive clarity throughout execution.
Business functions
In any migration, there are a few key functions that are best executed by the business, whenever possible. Often, IT
is capable of completing the following tasks. However, engaging members of the business could aid in reducing
barriers later in the adoption process. It also ensures mutual investment from key stakeholders throughout the
migration process.
Secure and manage: Interruption impact. Aid the cloud adoption team in quantifying the impact of a business
process interruption.
Secure and manage: Service-level agreement (SLA) validation. Aid the cloud adoption team in defining
service-level agreements and acceptable tolerances for business outages.
Ultimately, the cloud adoption team is accountable for each of these activities. However, establishing
responsibilities and a regular cadence with the business for the completion of these activities on an established
rhythm can improve stakeholder alignment and cohesiveness with the business.
NOTE
In the following table, an accountable party should start the alignment of roles. That column should be customized to fit
existing processes for efficient execution. Ideally a single person should be named as the accountable party.
Prerequisite: Digital estate. Align the existing inventory to basic assumptions, based on business outcomes.
Accountable party: cloud strategy team.
Secure and manage: Ops transition. Document production systems prior to production operations.
Accountable party: cloud adoption team.
Caution
For these activities, permissions and authorization heavily influence the accountable party, who must have direct
access to production systems in the existing environment or must have means of securing access through other
responsible actors. Determining this accountable party directly affects the promotion strategy during the migrate
and optimize processes.
Next steps
When the team has a general understanding of roles and responsibilities, it's time to begin preparing the technical
details of the migration. Understanding technical complexity and change management can help prepare the cloud
adoption team for the technical complexity of migration by aligning to an incremental change management
process.
Technical complexity and change management
Prepare for technical complexity: agile change
management
When an entire datacenter can be deprovisioned and re-created with a single line of code, traditional processes
struggle to keep up. The guidance throughout the Cloud Adoption Framework is built on practices like IT service
management (ITSM), The Open Group Architecture Framework (TOGAF), and others. However, to ensure agility
and responsiveness to business change, this framework molds those practices to fit agile methodologies and
DevOps approaches.
When shifting to an agile model where flexibility and iteration are emphasized, technical complexity and change
management are handled differently than they are in a traditional waterfall model focusing on a linear series of
migration steps. This article outlines a high-level approach to change management in an agile-based migration
effort. At the end of this article, you should have a general understanding of the levels of change management
and documentation involved in an incremental migration approach. Additional training and decisions are required
to select and implement agile practices based on that understanding. The intention of this article is to prepare
cloud architects for a facilitated conversation with project management to explain the general concept of change
management in this approach.
INVEST in workloads
The term workload appears throughout the Cloud Adoption Framework. A workload is a unit of application
functionality that can be migrated to the cloud. It could be a single application, a layer of an application, or a
collection of applications. The definition is flexible and may change at various phases of migration. The Cloud
Adoption Framework uses the INVEST acronym to help define a workload.
INVEST is a common acronym in many agile methodologies for writing user stories or product backlog items,
both of which are units of output in agile project management tools. The measurable unit of output in a migration
is a migrated workload. The Cloud Adoption Framework modifies the INVEST acronym a bit to create a construct
for defining workloads:
Independent: A workload should not have any inaccessible dependencies. For a workload to be considered
migrated, all dependencies should be accessible and included in the migration effort.
Negotiable: As additional discovery is performed, the definition of a workload changes. The architects
planning the migration could negotiate factors regarding dependencies. Examples of negotiation points could
include prerelease of features, making features accessible over a hybrid network, or packaging all
dependencies in a single release.
Valuable: Value in a workload is measured by the ability to provide users with access to a production
workload.
Estimable: Dependencies, assets, migration time, performance, and cloud costs should all be estimable and
should be estimated prior to migration.
Small: The goal is to package workloads in a single sprint. However, this may not always be feasible. Instead,
teams are encouraged to plan sprints and releases to minimize the time required to move a workload to
production.
Testable: There should always be a defined means of testing or validating completion of the migration of a
workload.
This acronym is not intended as a basis for rigid adherence but should help guide the definition of the term
workload.
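One way to make these criteria actionable is a per-workload checklist. The field names and thresholds below are illustrative, not a Cloud Adoption Framework schema:

```python
from dataclasses import dataclass, field

@dataclass
class WorkloadCandidate:
    name: str
    inaccessible_dependencies: list = field(default_factory=list)  # Independent
    estimated: bool = False          # Estimable: time, cost, performance estimated
    test_plan: str = ""              # Testable: how migration completion is validated
    sprints_to_production: int = 1   # Small: ideally a single sprint

    def meets_invest(self) -> bool:
        """True when the workload satisfies the checkable INVEST criteria."""
        return (not self.inaccessible_dependencies
                and self.estimated
                and bool(self.test_plan)
                and self.sprints_to_production <= 2)

candidate = WorkloadCandidate("payroll", estimated=True, test_plan="smoke tests + UAT")
# candidate.meets_invest() -> True
```

Negotiable and Valuable resist automation and stay as conversations; the other criteria can gate whether a workload is ready to enter a release.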
The migration, release, and iteration backlogs track different levels of activity during migration processes.
In any migration backlog, the change management team should strive to obtain the following information for any
workload in the plan. At a minimum, this data should be available for any workloads prioritized for migration in
the next two or three releases.
Migration backlog data points
Business impact. Understanding of the impact to the business of missing the expected timeline or reducing
functionality during freeze windows.
Relative business priority. A ranked list of workloads based on business priorities.
Business owner. Document the one individual responsible for making business decisions regarding this
workload.
Technical owner. Document the one individual responsible for technical decisions related to this workload.
Expected timelines. When the migration is scheduled for completion.
Workload freezes. Time frames in which the workload should be ineligible for change.
Workload name.
Initial inventory. Any assets required to provide the functionality of the workload, including VMs, IT
appliances, data, applications, deployment pipelines, and others. This information is likely to be inaccurate.
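These data points map naturally onto a backlog record that can be sorted by relative business priority. The fields and sample values below are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class BacklogItem:
    workload: str
    business_priority: int          # 1 = highest relative business priority
    business_owner: str             # the one business decision maker
    technical_owner: str            # the one technical decision maker
    expected_timeline: str          # scheduled completion, e.g. a quarter
    freeze_windows: tuple = ()      # time frames ineligible for change
    initial_inventory: tuple = ()   # VMs, data, apps; likely inaccurate at first

backlog = sorted(
    [BacklogItem("hr-portal", 2, "J. Lee", "A. Kim", "Q3"),
     BacklogItem("payroll", 1, "J. Lee", "R. Diaz", "Q2")],
    key=lambda item: item.business_priority,
)
# backlog[0].workload -> "payroll"
```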
Next steps
After change management approaches have been established, it's time to address the final prerequisite: the
migration backlog review.
Migration backlog review
Migration backlog review
The actionable output of the plan phase is a migration backlog, which influences all of the prerequisites discussed
so far. Development of the migration backlog should be completed as a first prerequisite. This article serves as a
milestone to complete prerequisite activities. The cloud strategy team is accountable for the care and maintenance
of the digital estate. However, the realization of the resultant backlog is the responsibility of every member of the
migration effort. As a final prerequisite, the cloud strategy team and the cloud adoption team should review and
understand the migration backlog. During that review, the members of both teams must gain sufficient knowledge
to articulate the following key points in the migration backlog.
Business priorities
Sometimes, prioritizing one workload over another may seem illogical to the cloud adoption team. Understanding
the business priorities that drove those decisions can help maintain the team's motivation. It also allows the team
to make a stronger contribution to the prioritization process.
Core assumptions
The article on digital estate rationalization discusses the agility and time-saving impact of basic assumptions when
evaluating a digital estate. To fully realize those values, the cloud adoption team needs to understand the
assumptions and the reasons that they were established. That knowledge better equips the cloud adoption team to
challenge those assumptions.
Next steps
With a general understanding of the digital estate and migration backlog, the team is ready to move beyond
prerequisites and to begin assessing workloads.
Assess workloads
Assess assets prior to migration
Many of your existing workloads are ideal candidates for cloud migration, but not every asset is compatible with
cloud platforms and not all workloads can benefit from hosting in the cloud. Digital estate planning allows you to
generate an overall migration backlog of potential workloads to migrate. However, this planning effort is high-
level. It relies on assumptions made by the cloud strategy team and does not dig deeply into technical
considerations.
As a result, before migrating a workload to the cloud it's critical to assess the individual assets associated with that
workload for their migration suitability. During this assessment, your cloud adoption team should evaluate
technical compatibility, required architecture, performance/sizing expectations, and dependencies to ensure that
the migrated workload can be deployed to the cloud effectively.
The Assess process is the first of four incremental activities that occur within an iteration. As discussed in the
prerequisite article regarding technical complexity and change management, a decision should be made in
advance to determine how this phase is executed. In particular, will assessments be completed by the cloud
adoption team during the same sprint as the actual migration effort? Alternatively, will a wave or factory model be
used to complete assessments in a separate iteration? If every member of the team can't answer this basic
process question, it may be wise to revisit the prerequisites section.
Objective
Assess a migration candidate, evaluating the workload, associated assets, and dependencies prior to migration.
Definition of done
This process is complete when the following are known about a single migration candidate:
The path from on-premises to cloud, including production promotion approach decision, has been defined.
Any required approvals, changes, cost estimates, or validation processes have been completed to allow the
cloud adoption team to execute the migration.
This full list of responsibilities and actions can support large and complex migrations involving multiple roles with
varying levels of responsibility, and requiring a detailed approval process. Smaller and simpler migration efforts
may not require all of the roles and actions described here. To determine which of these activities add value and which
are unnecessary, your cloud adoption team and the cloud strategy team should use this complete process as part
of your first workload migration. After the workload has been verified and tested, the team can evaluate this
process and choose which actions to use moving forward.
Next steps
With a general understanding of the assessment process, you are ready to begin the process by aligning business
priorities.
Align business priorities
Business priorities: Maintaining alignment
Transformation is often defined as a dramatic or spontaneous change. At the board level, change can look like a
dramatic transformation. However, for those who work through the process of change in an organization, the
term transformation is a bit misleading. Under the surface, transformation is better described as a series of
properly executed transitions from one state to another.
The amount of time required to rationalize or transition a workload will vary, depending on the technical
complexity involved. However, even when this process can be applied to a single workload or group of
applications quickly, it takes time to produce substantial changes among a user base. It takes longer for changes to
propagate through various layers of existing business processes. If transformation is expected to shape consumer
behavior patterns, it can take even longer to produce significant results.
Unfortunately, the market doesn't wait for businesses to transition. Consumer behavior patterns change on their
own, often unexpectedly. The market's perception of a company and its products can be swayed by social media or
a competitor's positioning. Fast and unexpected market changes require companies to be nimble and responsive.
The ability to execute processes and technical transitions requires a consistent, stable effort. Quick decisions and
nimble actions are needed to respond to market conditions. These two are at odds, making it easy for priorities to
fall out of alignment. This article describes approaches to maintaining transitional alignment during migration
efforts.
Next steps
With properly aligned business priorities, the cloud adoption team can confidently begin to evaluate workloads to
develop architecture and migration plans.
Evaluate workloads
Evaluate workload readiness
This activity focuses on evaluating readiness of a workload to migrate to the cloud. During this activity, the cloud
adoption team validates that all assets and associated dependencies are compatible with the chosen deployment
model and cloud provider. During the process, the team documents any efforts required to remediate
compatibility issues.
Evaluation assumptions
Most of the content discussing principles in the Cloud Adoption Framework is cloud agnostic. However, the
readiness evaluation process must be largely specific to each cloud platform. The following guidance assumes an
intention to migrate to Azure. It also assumes use of Azure Migrate or Azure Site Recovery for replication
activities. For alternative tools, see replication options.
This article doesn't capture all possible evaluation activities. It is assumed that each environment and business
outcome will dictate specific requirements. To help accelerate the creation of those requirements, the remainder of
this article shares a few common evaluation activities related to infrastructure, database, and network evaluation.
NOTE
Total storage directly affects bandwidth requirements during initial replication. However, storage drift continues from the
point of replication until release. This means that drift has a cumulative effect on available bandwidth.
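The effect described in this note can be approximated with a back-of-envelope calculation. The sketch below is illustrative only; the storage, bandwidth, and drift figures are invented placeholders, not sizing guidance.

```python
# Rough estimate of initial replication time, plus the ongoing drift that
# must be resynchronized until release. All figures are hypothetical.

total_storage_gb = 5_000   # total storage of assets selected for replication
bandwidth_mbps = 500       # bandwidth available for replication traffic
daily_drift_gb = 50        # data that changes per day after the snapshot

# Convert GB to megabits: 1 GB = 8,000 megabits (decimal units).
seconds = (total_storage_gb * 8_000) / bandwidth_mbps
days = seconds / 86_400
print(f"initial seeding: ~{days:.1f} days")

# Drift consumes a share of the same link until the asset is released,
# which is why its effect on available bandwidth is cumulative.
drift_mbps = (daily_drift_gb * 8_000) / 86_400
print(f"bandwidth consumed by drift: ~{drift_mbps:.1f} Mbps")
```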
Next steps
After the evaluation of a system is complete, the outputs feed the development of a new cloud architecture.
Architect workloads prior to migration
Architect workloads prior to migration
This article expands on the assessment process by reviewing activities associated with defining the architecture of
a workload within a given iteration. As discussed in the article on incremental rationalization, some architectural
assumptions are made during any business transformation that requires a migration. This article clarifies those
assumptions, shares a few roadblocks that can be avoided, and identifies opportunities to accelerate business
value by challenging those assumptions. This incremental model for architecture allows teams to move faster and
to obtain business outcomes sooner.
Next steps
After the new architecture is defined, accurate cost estimations can be calculated.
Estimate cloud costs
Estimate cloud costs
During migration, there are several factors that can affect decisions and execution activities. To help understand
which of those options are best for different situations, this article discusses various options for estimating cloud
costs.
Accounting models
If you are familiar with traditional IT procurement processes, estimation in the cloud may seem foreign. When
adopting cloud technologies, acquisition shifts from a rigid, structured capital expense model to a fluid operating
expense model. In the traditional capital expense model, the IT team would attempt to consolidate buying power
for multiple workloads across various programs to centralize a pool of shared IT assets that could support each of
those solutions. In the operating expenses cloud model, costs can be directly attributed to the support needs of
individual workloads, teams, or business units. This approach allows for a more direct attribution of costs to the
supported internal customer. When estimating costs, it's important to first understand how much of this new
accounting capability will be used by the IT team.
For those wanting to replicate the legacy capital expense approach to accounting, use the outputs of either
approach suggested in the "Digital estate size" section above to get an annual cost basis. Next, multiply that
annual cost by the company's typical hardware refresh cycle. Hardware refresh cycle is the rate at which a
company replaces aging hardware, typically measured in years. Annual run rate multiplied by hardware refresh
cycle creates a cost structure similar to a capital expense investment pattern.
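The arithmetic above can be expressed as a short calculation. This is a minimal sketch; the run rate and refresh cycle values are hypothetical placeholders.

```python
# Sketch: translate a cloud run rate into a capital-expense-style cost basis
# by multiplying the annual run rate by the hardware refresh cycle.

def capex_equivalent(monthly_run_rate, refresh_cycle_years):
    """Annual run rate multiplied by the hardware refresh cycle (in years)."""
    annual_run_rate = monthly_run_rate * 12
    return annual_run_rate * refresh_cycle_years

# Example: $40,000/month estimated cloud spend with a five-year refresh cycle.
print(capex_equivalent(40_000, 5))  # prints 2400000
```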
Next steps
After estimating costs, migration can begin. However, it would be wise to review partnership and support options
before beginning any migration.
Understanding partnership options
Understand partnership options
During migration, the cloud adoption team performs the actual migration of workloads to the cloud. Unlike the
collaborative and problem-solving tasks when defining the digital estate or building the core cloud infrastructure,
migration tends to be a series of repetitive execution tasks. Beyond the repetitive aspects, there are likely testing
and tuning efforts that require deep knowledge of the chosen cloud provider. The repetitive nature of this process
can sometimes be best addressed by a partner, reducing strain on full-time staff. Additionally, partners may be
able to better align deep technical expertise when the repetitive processes encounter execution anomalies.
Partners tend to be closely aligned with a single cloud vendor or a small number of cloud vendors. To better
illustrate partnership options, the remainder of this article assumes that Microsoft Azure is the chosen cloud
provider.
During the plan, build, or migrate phases, a company generally has four execution partnership options:
Guided self-service. The existing technical team executes the migration, with help from Microsoft.
FastTrack for Azure. Use the Microsoft FastTrack for Azure program to accelerate migration.
Solutions Partner. Get connected with Azure Solutions Partners or Cloud Solutions Partners (CSPs) to
accelerate migration.
Supported self-service. Execution is completed by the existing technical staff with support from Microsoft.
Guided self-service
If an organization is planning an Azure migration on its own, Microsoft is always there to assist throughout the
journey. To help fast-track migration to Azure, Microsoft and its partners have developed an extensive set of
architectures, guides, tools, and services to reduce risk and to speed migration of virtual machines, applications,
and databases. These tools and services support a broad selection of operating systems, programming languages,
frameworks, and databases.
Assessment and migration tools. Azure provides a wide range of tools to be used in different phases for
your cloud transformation, including assessing your existing infrastructure. For more information, refer to the
"Assess" section in the "Migration" chapter that follows.
Microsoft Cloud Adoption Framework. This framework presents a structured approach to cloud adoption
and migration. It is based on best practices across many Microsoft-supported customer engagements and is
organized as a series of steps, from architecture and design to implementation. For each step, supporting
guidance helps you with the design of your application architecture.
Cloud design patterns. Azure provides some useful cloud design patterns for building reliable, scalable,
secure workloads in the cloud. Each pattern describes the problem that the pattern addresses, considerations
for applying the pattern, and an example based on Azure. Most of the patterns include code samples or
snippets that show how to implement the pattern on Azure. However, they are relevant to any distributed
system, whether hosted on Azure or on other cloud platforms.
Cloud fundamentals. Fundamentals help teach the basic approaches to implementation of core concepts.
This guide helps technicians think about solutions that go beyond a single Azure service.
Example scenarios. The guide provides references from real customer implementations, outlining the tools,
approaches, and processes that past customers have followed to accomplish specific business goals.
Reference architectures. Reference architectures are arranged by scenario, with related architectures
grouped together. Each architecture includes best practices, along with considerations for scalability, availability,
manageability, and security. Most also include a deployable solution.
FastTrack for Azure
FastTrack for Azure provides direct assistance from Azure engineers, working hand in hand with partners, to help
customers build Azure solutions quickly and confidently. FastTrack brings best practices and tools from real
customer experiences to guide customers from setup, configuration, and development to production of Azure
solutions, including:
Datacenter migration
Windows Server on Azure
Linux on Azure
SAP on Azure
Business continuity and disaster recovery (BCDR)
High-performance computing*
Cloud-native apps
DevOps
App modernization
Cloud-scale analytics**
Intelligent apps
Intelligent agents**
Data modernization to Azure
Security and management
Globally distributed data
IoT***
*Limited preview in United States, Canada, United Kingdom, and Western Europe
**Limited preview in United Kingdom and Western Europe
***Available in H2 2019
During a typical FastTrack for Azure engagement, Microsoft helps to define the business vision to plan and
develop Azure solutions successfully. The team assesses architectural needs and provides guidance, design
principles, tools, and resources to help build, deploy, and manage Azure solutions. The team matches skilled
partners for deployment services on request and periodically checks in to ensure that deployment is on track and
to help remove blockers.
The main phases of a typical FastTrack for Azure engagement are:
Discovery. Identify key stakeholders, understand the goal or vision for problems to be solved, and then assess
architectural needs.
Solution enablement. Learn design principles for building applications, review architecture of applications
and solutions, and receive guidance and tools to drive proof of concept (PoC) work through to production.
Continuous partnership. Azure engineers and program managers check in every so often to ensure that
deployment is on track and to help remove blockers.
Azure Support
If you have questions or need help, create a support request. If your support request requires deep technical
guidance, visit Azure Support Plans to align the best plan for your needs.
Next steps
After a partner and support strategy is selected, the release and iteration backlogs can be updated to reflect
planned efforts and assignments.
Manage change using release and iteration backlogs
Manage change in an incremental migration effort
This article assumes that migration processes are incremental in nature, running parallel to the govern process.
However, the same guidance could be used to populate initial tasks in a work breakdown structure for traditional
waterfall change management approaches.
Release backlog
A release backlog consists of a series of assets (VMs, databases, files, and applications, among others) that must
be migrated before a workload can be released for production usage in the cloud. During each iteration, the cloud
adoption team documents and estimates the efforts required to move each asset to the cloud. See the "Iteration
backlog" section that follows.
Iteration backlog
An iteration backlog is a list of the detailed work required to migrate a specific number of assets from the existing
digital estate to the cloud. The entries on this list are often stored in an agile management tool, like Azure
DevOps, as work items.
Prior to starting the first iteration, the cloud adoption team specifies an iteration duration, usually two to four
weeks. This time box is important to create a start and finish time period for each set of committed activities.
Maintaining consistent execution windows makes it easy to gauge velocity (pace of migration) and alignment to
changing business needs.
Prior to each iteration, the team reviews the release backlog, estimating the effort and priorities of assets to be
migrated. It then commits to deliver a specific number of agreed-on migrations. After this is agreed to by the
cloud adoption team, the list of activities becomes the current iteration backlog.
During each iteration, team members work as a self-organizing team to fulfill commitments in the current
iteration backlog.
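The value of a consistent time box can be sketched as a velocity calculation. The iteration counts below are invented; a real team would pull completed work items from its tracking tool, such as Azure DevOps.

```python
# Sketch: derive velocity (pace of migration) from completed iterations and
# use it to forecast the remaining effort. All numbers are hypothetical.

completed_per_iteration = [8, 11, 14, 13]  # assets migrated in each time box

velocity = sum(completed_per_iteration) / len(completed_per_iteration)
print(f"average velocity: {velocity:.1f} assets per iteration")

remaining_assets = 120  # assets still in the release backlog
iterations_left = remaining_assets / velocity
print(f"estimated iterations remaining: {iterations_left:.1f}")
```

Because every iteration covers the same window, the average is directly comparable across the effort, which is what makes velocity a meaningful signal to the business.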
Next steps
After an iteration backlog is defined and accepted by the cloud adoption team, change management approvals
can be finalized.
Approve architecture changes prior to migration
Approve architecture changes before migration
During the assess process of migration, each workload is evaluated, architected, and estimated to develop a future
state plan for the workload. Some workloads can be migrated to the cloud with no change to the architecture.
Maintaining on-premises configuration and architecture can reduce risk and streamline the migration process.
Unfortunately, not every application can run in the cloud without changes to the architecture. When architecture
changes are required, this article can help classify the change and can provide some guidance on the proper
approval activities.
Existing culture
Your IT teams likely have existing mechanisms for managing change involving your on-premises assets. Typically
these mechanisms are governed by traditional Information Technology Infrastructure Library-based (ITIL-based)
change management processes. In many enterprise migrations, these processes involve a change advisory board
(CAB) that is responsible for reviewing, documenting, and approving all IT-related requests for change (RFC).
The CAB generally includes experts from multiple IT and business teams, offering a variety of perspectives and
detailed review for all IT-related changes. A CAB approval process is a proven way to reduce risk and minimize the
business impact of changes involving stable workloads managed by IT operations.
Technical approval
Lack of organizational readiness to approve technical change is among the most common reasons for cloud
migration failure. More projects are stalled by a series of technical approvals than by any deficit in a cloud
platform.
Preparing the organization for technical change approval is an important requirement for migration success. The
following are a few best practices to ensure that the organization is ready for technical approval.
ITIL Change Advisory Board challenges
Every change management approach has its own set of controls and approval processes. Migration is a series of
continuous changes that start with a high degree of ambiguity and develop additional clarity through the course of
execution. As such, migration is best governed by agile-based change management approaches, with the cloud
strategy team serving as a product owner.
However, the scale and frequency of change during a cloud migration doesn't fit well with the nature of ITIL
processes. The requirements of a CAB approval can risk the success of a migration, slowing or stopping the effort.
Further, in the early stages of migration, ambiguity is high and subject matter expertise tends to be low. For the
first several workload migrations or releases, the cloud adoption team is often in a learning mode. As such, it could
be difficult for the team to provide the types of data needed to pass a CAB approval.
The following best practices can help the CAB maintain a degree of comfort during migration without becoming
a painful blocker.
Standardize change
It is tempting for a cloud adoption team to consider detailed architectural decisions for each workload being
migrated to the cloud. It is equally tempting to use cloud migration as a catalyst to refactor past architectural
decisions. For organizations that are migrating a few hundred VMs or a few dozen workloads, either approach can
be properly managed. When migrating a datacenter consisting of 1,000 or more assets, each of these approaches
is considered a high-risk antipattern that significantly reduces the likelihood of success. Modernizing, refactoring,
and rearchitecting every application require diverse skill sets and a significant variety of changes, and these tasks
create dependencies on human efforts at scale. Each of these dependencies injects risk into the migration effort.
The article on digital estate rationalization discusses the agility and time-saving impact of basic assumptions when
rationalizing a digital estate. There is an additional benefit of standardized change. By choosing a default
rationalization approach to govern the migration effort, the change advisory board or product owner can review
and approve the application of one change to a long list of workloads. This reduces technical approval of each
workload to those that require a significant architecture change to be cloud compatible.
Clarify expectations and roles of approvers
Before the first workload is assessed, the cloud strategy team should document and communicate the expectations
of anyone involved in the approval of change. This simple activity can avoid costly delays when the cloud adoption
team is fully engaged.
Seek approval early
When possible, technical change should be detected and documented during the assessment process. Regardless
of approval processes, the cloud adoption team should engage approvers early. The sooner that change approval
can begin, the less likely an approval process is to block migration activities.
Next steps
With the help of these best practices, it should be easier to integrate proper, low-risk approval into migration
efforts. After workload changes are approved, the cloud adoption team is ready to migrate workloads.
Migrate workloads
Execute a migration
After a workload has been assessed, it can be migrated to the cloud. This series of articles explains the various
activities that may be involved in the execution of a migration.
Objective
The objective of a migration is to migrate a single workload to the cloud.
Definition of done
The migration phase is complete when a workload is staged and ready for testing in the cloud, including all
dependent assets required for the workload to function. During the optimize process, the workload is prepared for
production usage.
This definition of done can vary, depending on your testing and release processes. The next article in this series
covers deciding on a promotion model and can help you understand when it would be best to promote a migrated
workload to production.
Next steps
With a general understanding of the migration process, you are ready to decide on a promotion model.
Decide on a promotion model
Promotion models: single-step, staged, or flight
Workload migration is often discussed as a single activity. In reality, it is a collection of smaller activities that
facilitate the movement of a digital asset to the cloud. One of the last activities in a migration is the promotion of
an asset to production. Promotion is the point at which the production system changes for end users. It can often
be as simple as changing the network routing, redirecting end users to the new production asset. Promotion is
also the point at which IT operations or cloud operations change the focus of operational management processes
from the previous production system to the new production systems.
There are several promotion models. This article outlines three of the most common ones used in cloud
migrations. The choice of a promotion model changes the activities seen within the migrate and optimize
processes. As such, the promotion model should be decided early in a release.
NOTE
The table of contents for this site lists the promotion activity as part of the optimize process. In a single-step model,
promotion occurs during the migrate process. When using this model, roles and responsibilities should be updated
to reflect this.
Single-step. In a single-step promotion model, assets are replicated, staged, and promoted to production in a
single automated process, immediately after staging is complete.
Staged. In a staged promotion model, the workload is considered migrated after it is staged, but it is not yet
promoted. Prior to promotion, the migrated workload undergoes a series of performance tests, business tests,
and optimization changes. It is then promoted at a future date in conjunction with a business test plan. This
approach improves the balance between cost and performance, while making it easier to obtain business
validation.
Flight. The flight promotion model combines single-step and staged models. In a flight model, the assets in
the workload are treated like production after landing in staging. After a condensed period of automated
testing, production traffic is routed to the workload. However, it is a subset of the traffic. That traffic serves as
the first flight of production and testing. Assuming the workload performs as expected from a feature and performance
perspective, additional traffic is migrated. After all production traffic has been moved onto the new assets, the
workload is considered fully promoted.
The chosen promotion model affects the sequence of activities to be performed. It also affects the roles and
responsibilities of the cloud adoption team. It may even impact the composition of a sprint or multiple sprints.
Single-step promotion
This model uses migration automation tools to replicate, stage, and promote assets. The assets are replicated into
a contained staging environment controlled by the migration tool. After all assets have been replicated, the tool
can execute an automated process to promote the assets into the chosen subscription in a single step. While in
staging, the tool continues to replicate the asset, minimizing loss of data between the two environments. After an
asset is promoted, the linkage between the source system and the replicated system is severed. In this approach, if
additional changes occur in the initial source systems, the changes are lost.
Pros. Positive benefits of this approach include:
This model introduces less change to the target systems.
Continuous replication minimizes data loss.
If a staging process fails, it can quickly be deleted and repeated.
Replication and repeated staging tests enable an incremental scripting and testing process.
Cons. Negative aspects of this approach include:
Assets staged within the tool's isolated sandbox don't allow for complex testing models.
During replication, the migration tool consumes bandwidth in the local datacenter. Staging a large volume of
assets over an extended duration has an exponential impact on available bandwidth, hurting the migration
process and potentially affecting performance of production workloads in the on-premises environment.
Staged promotion
In this model, the staging sandbox managed by the migration tool is used for limited testing purposes. The
replicated assets are then deployed into the cloud environment, which serves as an extended staging environment.
The migrated assets run in the cloud, while additional assets are replicated, staged, and migrated. When full
workloads become available, richer testing is initiated. When all assets associated with a subscription have been
migrated, the subscription and all hosted workloads are promoted to production. In this scenario, there is no
change to the workloads during the promotion process. Instead, the changes tend to be at the network and
identity layers, routing users to the new environment and revoking access of the cloud adoption team.
Pros. Positive benefits of this approach include:
This model provides more accurate business testing opportunities.
The workload can be studied more closely to better optimize performance and cost of the assets.
A larger number of assets can be replicated within similar time and bandwidth constraints.
Cons. Negative aspects of this approach include:
The chosen migration tool can't facilitate ongoing replication after migration.
A secondary means of data replication is required to synchronize data platforms during the staged time frame.
Flight promotion
This model is similar to the staged promotion model. However, there is one fundamental difference. When the
subscription is ready for promotion, end-user routing happens in stages or flights. At each flight, additional users
are rerouted to the production systems.
Pros. Positive benefits of this approach include:
This model mitigates the risks associated with a big migration or promotion activity. Errors in the migrated
solution can be identified with less impact to business processes.
It allows for monitoring of workload performance demands in the cloud environment for an extended duration,
increasing accuracy of asset-sizing decisions.
Larger numbers of assets can be replicated within similar time and bandwidth constraints.
Cons. Negative aspects of this approach include:
The chosen migration tool can't facilitate ongoing replication after migration.
A secondary means of data replication is required to synchronize data platforms during the staged time frame.
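The flight model's staged routing can be sketched as a deterministic bucketing function. The flight percentages and bucketing scheme below are illustrative only; a real implementation would use the traffic-management features of the network or load-balancing layer.

```python
# Sketch: route a growing percentage of users to the migrated environment
# in successive flights. The bucketing multiplier is an arbitrary choice.

def route_user(user_id, percent_on_new):
    """Deterministically assign a user to the old or new environment."""
    bucket = (user_id * 61) % 100  # spread users across 100 stable buckets
    return "new" if bucket < percent_on_new else "old"

flights = [5, 25, 50, 100]  # percent of traffic promoted at each flight

for percent in flights:
    on_new = sum(route_user(u, percent) == "new" for u in range(10_000))
    print(f"flight at {percent}%: {on_new} of 10,000 users on new assets")
```

Because the assignment is deterministic, a user who lands on the new environment in one flight stays there in every later flight, rather than bouncing between environments.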
Next steps
After a promotion model is defined and accepted by the cloud adoption team, remediation of assets can begin.
Remediating assets prior to migration
Remediate assets prior to migration
During the assessment process of migration, the team seeks to identify any configurations that would make an
asset incompatible with the chosen cloud provider. Remediate is a checkpoint in the migration process to ensure
that those incompatibilities have been resolved. This article discusses a few common remediation tasks for
reference. It also establishes a skeleton process for deciding whether remediation is a wise investment.
NOTE
This isn't production routing to the new assets, but rather configuration to allow for proper routing to the assets in
general.
Decision framework
Remediation for smaller workloads can be straightforward, which is one of the reasons it's recommended that you
choose a smaller workload for your initial migration. However, as your migration efforts mature and you begin
to tackle larger workloads, remediation can be a time-consuming and costly process. For example, remediation
efforts for a Windows Server 2003 migration involving a 5,000+ VM pool of assets can delay a migration by
months. When such large-scale remediation is required, the following questions can help guide decisions:
Have all workloads affected by the remediation been identified and notated in the migration backlog?
For workloads that are not affected, will a migration produce a similar return on investment (ROI)?
Can the affected assets be remediated in alignment with the original migration timeline? What impact would
timeline changes have on ROI?
Is it economically feasible to remediate the assets in parallel with migration efforts?
Is there sufficient bandwidth on staff to remediate and migrate? Should a partner be engaged to execute one
or both tasks?
If these questions don't yield favorable answers, a few alternative approaches that move beyond a basic IaaS
rehosting strategy may be worth considering:
Containerization. Some assets can be hosted in a containerized environment without remediation. This could
produce less-than-favorable performance and doesn't resolve security or compliance issues.
Automation. Depending on the workload and remediation requirements, it may be more profitable to script
the deployment to new assets using a DevOps approach.
Rebuild. When remediation costs are very high and business value is equally high, a workload may be a good
fit as a candidate for rebuilding or rearchitecting.
Next steps
After remediation is complete, replication activities are ready.
Replicate assets
What role does replication play in the migration
process?
On-premises datacenters are filled with physical assets like servers, appliances, and network devices. However,
each server is only a physical shell. The real value comes from the binary running on the server. The applications
and data are the purpose for the datacenter. Those are the primary binaries to migrate. Powering these
applications and data stores are other digital assets and binary sources, like operating systems, network routes,
files, and security protocols.
Replication is the workhorse of migration efforts. It is the process of copying a point-in-time version of various
binaries. The binary snapshots are then copied to a new platform and deployed onto new hardware, in a process
referred to as seeding. When executed properly, the seeded copy of the binary should behave identically to the
original binary on the old hardware. However, that snapshot of the binary is immediately out of date and
misaligned with the original source. To keep the new binary and the old binary aligned, a process referred to as
synchronization continuously updates the copy stored in the new platform. Synchronization continues until the
asset is promoted in alignment with the chosen promotion model. At that point, the synchronization is severed.
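The seed-synchronize-promote lifecycle described above can be sketched in a few lines of code. This is a purely illustrative simulation; the class and method names are hypothetical and don't correspond to any Azure replication API.

```python
# Illustrative sketch of the replication lifecycle: seed a point-in-time
# copy, synchronize deltas, then sever synchronization at promotion.

class SourceAsset:
    """An on-premises asset whose binary state keeps changing."""
    def __init__(self, state):
        self.state = dict(state)

    def write(self, key, value):
        self.state[key] = value


class Replica:
    """A cloud-side copy that stays in lockstep until promotion."""
    def __init__(self, source):
        self.source = source
        self.state = dict(source.state)  # seeding: point-in-time snapshot
        self.synchronizing = True

    def synchronize(self):
        # Continuous sync keeps the copy aligned with the source.
        if self.synchronizing:
            self.state = dict(self.source.state)

    def promote(self):
        # Promotion severs synchronization; the replica now stands alone.
        self.synchronizing = False


source = SourceAsset({"app": "v1", "data": "day-1"})
replica = Replica(source)       # seeded snapshot

source.write("data", "day-2")   # source drifts after the snapshot
replica.synchronize()           # sync realigns the copy
replica.promote()               # promotion: sync is severed

source.write("data", "day-3")   # further source changes...
replica.synchronize()           # ...no longer reach the replica
print(replica.state["data"])    # day-2
```

Note how changes made to the source after promotion never reach the replica, which is why the production-readiness checks described later must complete before synchronization is severed.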
Next steps
After replication is complete, staging activities can begin.
Staging activities during a migration
Replication options
Before any migration, you should ensure that primary systems are safe and will continue to run without issues.
Any downtime disrupts users or customers, and it costs time and money. Migration is not as simple as turning off
the virtual machines on-premises and copying them across to Azure. Migration tools must take into account
asynchronous or synchronous replication to ensure that live systems can be copied to Azure with no downtime.
Most of all, systems must be kept in lockstep with on-premises counterparts. You might want to test migrated
resources in isolated partitions in Azure, to ensure that workloads work as expected.
The content within the Cloud Adoption Framework assumes that Azure Migrate (or Azure Site Recovery) is the
most appropriate tool for replicating assets to the cloud. However, there are other options available. This article
discusses those options to help enable decision-making.
Next steps
After replication is complete, staging activities can begin.
Staging activities during a migration
Understand staging activities during a migration
As described in the article on promotion models, staging is the point at which assets have been migrated to the
cloud. However, they are not yet ready to be promoted to production. This is often the last step in the Migrate
phase of a migration. After staging, the workload is managed by an IT operations or cloud operations team to
prepare it for production usage.
Deliverables
Staged assets may not be ready for use in production. There are several production readiness checks that should
be finalized before this stage is considered complete. The following is a list of deliverables often associated with
completion of asset staging.
Automated testing. Any automated tests available to validate workload performance should be run before
concluding the staging process. After the asset leaves staging, synchronization with the original source system
is terminated, which makes it harder to redeploy the replicated assets once they are staged for optimization.
Migration documentation. Most migration tools can produce an automated report of the assets being
migrated. Before concluding the staging activity, all migrated assets should be documented for clarity.
Configuration documentation. Any changes made to an asset (during remediation, replication, or staging)
should be documented for operational readiness.
Backlog documentation. The migration backlog should be updated to reflect the workload and assets
staged.
Next steps
After staged assets are tested and documented, you can proceed to optimization activities.
Optimize migrated workloads
Optimize migrated workloads
After a workload and its supporting assets have been migrated to the cloud, the workload must be prepared before it
can be promoted to production. In this process, activities ready the workload, size the dependent assets, and prepare the
business for when the migrated cloud-based workload enters production usage.
The objective of optimization is to prepare a migrated workload for promotion to production usage.
Definition of done
The optimization process is complete when a workload has been properly configured and sized and is being used in
production.
Next steps
With a general understanding of the optimization process, you are ready to begin the process by establishing a
business change plan for the candidate workload.
Business change plan
Business change plan
Traditionally, IT has overseen the release of new workloads. During a major transformation, like a datacenter
migration or a cloud migration, a similar pattern of IT-led adoption could be applied. However, the traditional
approach might miss opportunities to realize additional business value. For this reason, before a migrated
workload is promoted to production, implementing a broader approach to user adoption is suggested. This article
outlines the ways in which a business change plan adds to a standard user adoption plan.
Next steps
After business change is documented and planned, business testing can begin.
Guidance for business testing (UAT) during migration
References
Eason, K. (1988) Information technology and organizational change, New York: Taylor and Francis.
Guidance for business testing (UAT) during migration
Traditionally seen as an IT function, user acceptance testing during a business transformation can be orchestrated
solely by IT. However, this function is often most effectively executed as a business function. IT then supports this
business activity by facilitating the testing, developing testing plans, and automating tests when possible. Although
IT can often serve as a surrogate for testing, there is no replacement for firsthand observation of real users
attempting to take advantage of a new solution in the context of a real or replicated business process.
NOTE
When available, automated testing is a much more effective and efficient means of testing any system. However, cloud
migrations often focus most heavily on legacy systems or at least stable production systems. Often, those systems aren't
managed by thorough and well-maintained automated tests. This article assumes that no such tests are available at the
time of migration.
Second to automated testing is testing of the process and technology changes by power users. Power users are
the people who commonly execute a real-world process that requires interactions with a technology tool or set of
tools. They could be represented by an external customer using an e-commerce site to acquire goods or services.
Power users could also be represented by a group of employees executing a business process, such as a call center
servicing customers and recording their experiences.
The goal of business testing is to solicit validation from power users to certify that the new solution performs in
line with expectations and does not impede business processes. If that goal isn't met, the business testing serves
as a feedback loop that can help define why and how the workload isn't meeting expectations.
Next steps
In conjunction with business testing, optimization of migrated assets can refine cost and workload performance.
Benchmark and resize cloud assets
Benchmark and resize cloud assets
Monitoring usage and spending is critically important for cloud infrastructures. Organizations pay for the
resources they consume over time. When usage exceeds agreement thresholds, unexpected cost overages can
quickly accumulate. Cost Management reports monitor spending to analyze and track cloud usage, costs, and
trends. Use over-time reports to detect anomalies that differ from normal trends, and use optimization and
cost-analysis reports to surface inefficiencies in your cloud deployment.
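The anomaly detection idea can be reduced to a simple baseline comparison: flag any day whose spending deviates sharply from the recent trend. The following sketch is illustrative only; the threshold, window, and data are assumptions, not Cost Management output.

```python
# Minimal cost-anomaly sketch: compare each day's spending against the
# trailing average and flag large deviations from the normal trend.

def flag_cost_anomalies(daily_costs, window=7, tolerance=0.5):
    """Return indices of days whose cost deviates from the trailing
    average of the previous `window` days by more than `tolerance`."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        baseline = sum(daily_costs[i - window:i]) / window
        if abs(daily_costs[i] - baseline) > tolerance * baseline:
            anomalies.append(i)
    return anomalies

# Steady ~100/day spending with a spike on day 9.
costs = [100, 102, 98, 101, 99, 103, 100, 104, 97, 180, 101]
print(flag_cost_anomalies(costs))  # [9]
```

A production cost-management service uses far more sophisticated trend models, but the principle is the same: establish a baseline from normal usage, then alert on deviations.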
In the traditional on-premises models of IT, requisition of IT systems is costly and time consuming. The processes
often require lengthy capital expenditure review cycles and may even require an annual planning process. As such,
it is common practice to buy more than is needed. It is equally common for IT administrators to then
overprovision assets in preparation for anticipated future demands.
In the cloud, the accounting and provisioning models eliminate the time delays that lead to overbuying. When an
asset needs additional resources, it can be scaled up or out almost instantly. This means that assets can safely be
reduced in size to minimize resources and costs consumed. During benchmarking and optimization, the cloud
adoption team seeks to find the balance between performance and costs, provisioning assets to be no larger and
no smaller than necessary to meet production demands.
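The balance described above can be framed as a simple sizing rule: choose the cheapest instance size whose capacity covers observed peak demand plus a headroom margin. The SKU names, prices, and headroom figure below are illustrative assumptions, not Azure pricing.

```python
# Hypothetical right-sizing sketch: pick the smallest size that covers
# observed peak demand plus headroom, so assets are provisioned to be
# no larger and no smaller than necessary.

SIZES = [  # (name, vCPUs, monthly cost) - illustrative values only
    ("small", 2, 70),
    ("medium", 4, 140),
    ("large", 8, 280),
]

def recommend_size(peak_vcpus_used, headroom=0.2):
    """Choose the cheapest size that covers peak usage plus headroom."""
    required = peak_vcpus_used * (1 + headroom)
    for name, vcpus, cost in SIZES:  # ordered smallest to largest
        if vcpus >= required:
            return name
    return SIZES[-1][0]  # nothing fits; fall back to the largest size

# An 8-vCPU on-premises server that peaked at 3 vCPUs can safely shrink.
print(recommend_size(3))  # medium
```

In practice, the peak figure would come from benchmarking data collected during staging, and tools like Azure Advisor automate this class of recommendation.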
Next steps
After a workload has been tested and optimized, it is time to ready the workload for promotion.
Getting a migrated workload ready for production promotion
Prepare a migrated application for production
promotion
After a workload is promoted, production user traffic is routed to the migrated assets. Readiness activities provide
an opportunity to prepare the workload for that traffic. The following are a few business and technology
considerations to help guide readiness activities.
Next steps
After all readiness activities have been completed, it's time to promote the workload.
What is required to promote a migrated resource to production?
What is required to promote a migrated resource to
production?
Promotion to production marks the completion of a workload's migration to the cloud. After the asset and all of its
dependencies are promoted, production traffic is rerouted. The rerouting of traffic makes the on-premises assets
obsolete, allowing them to be decommissioned.
The process of promotion varies according to the workload's architecture. However, there are several consistent
prerequisites and a few common tasks. This article describes each and serves as a kind of prepromotion checklist.
Prerequisite processes
Each of the following processes should be executed, documented, and validated prior to production deployment:
Assess: The workload has been assessed for cloud compatibility.
Architect: The structure of the workload has been properly architected to align with the chosen cloud provider.
Replicate: The assets have been replicated to the cloud environment.
Stage: The replicated assets have been restored in a staged instance of the cloud environment.
Business testing: The workload has been fully tested and validated by business users.
Business change plan: The business has shared a plan for the changes to be made in accordance with the
production promotion; this should include a user adoption plan, changes to business processes, users that
require training, and timelines for various activities.
Ready: Generally, a series of technical changes must be made before promotion.
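The prerequisite list above can be treated as a promotion gate: every process must be validated before production traffic is rerouted. The following sketch is illustrative, not part of any migration tool.

```python
# A simple promotion gate over the prerequisite processes: promotion
# proceeds only when every prerequisite has been executed, documented,
# and validated.

PREREQUISITES = [
    "assess", "architect", "replicate", "stage",
    "business testing", "business change plan", "ready",
]

def blocking_prerequisites(validated):
    """Return the prerequisites still blocking promotion (empty = go)."""
    return [p for p in PREREQUISITES if p not in validated]

done = {"assess", "architect", "replicate", "stage", "business testing"}
print(blocking_prerequisites(done))  # ['business change plan', 'ready']
```

A gate like this is most useful when wired into the migration backlog, so the remaining blockers for each workload are visible to the whole cloud adoption team.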
Next steps
Promotion of a workload signals the completion of a release. However, in parallel with migration, retired assets
need to be decommissioned, taking them out of service.
Decommission retired assets
Decommission retired assets
After a workload is promoted to production, the assets that previously hosted the production workload are no
longer required to support business operations. At that point, the older assets are considered retired. Retired
assets can then be decommissioned, reducing operational costs. Decommissioning a resource can be as simple as
turning off the power to the asset and disposing of the asset responsibly. Unfortunately, decommissioning
resources can sometimes have undesired consequences. The following guidance can aid in properly
decommissioning retired resources, with minimal business interruptions.
Continued monitoring
After a migrated workload is promoted, the assets to be retired should continue to be monitored to validate that
no additional production traffic is being routed to the wrong assets.
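One way to operationalize that check is to require a sustained quiet period: the retired asset is cleared for decommissioning only after its inbound traffic has stayed at zero for a full observation window. The window length and traffic data below are illustrative assumptions.

```python
# Continued-monitoring sketch: clear a retired asset for decommissioning
# only after no production traffic has reached it for `quiet_days` days.

def safe_to_decommission(daily_requests, quiet_days=14):
    """True when the asset saw no traffic for the last `quiet_days`."""
    recent = daily_requests[-quiet_days:]
    return len(recent) >= quiet_days and all(r == 0 for r in recent)

traffic = [500, 120, 3, 0] + [0] * 14   # tapering off after promotion
print(safe_to_decommission(traffic))    # True
print(safe_to_decommission([0, 0, 7]))  # False
```

The daily request counts would come from monitoring tooling watching the retired assets; any nonzero reading indicates traffic still being routed to the wrong assets and resets the clock.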
Next steps
After retired assets are decommissioned, the migration is completed. This creates a good opportunity to improve
the migration process, and a retrospective engages the cloud adoption team in a review of the release in an effort
to learn and improve.
Retrospective
How do retrospectives help build a growth mindset?
"Culture eats strategy for breakfast." The best migration plan can easily be undone if it doesn't have executive
support and encouragement from leadership. Learning, growing, and even failure are at the heart of a growth
mindset. They are also at the heart of any transformation effort.
Humility and curiosity have never been more important than they are during a business transformation.
Embracing digital transformation requires both in ample supply. These traits are strengthened by regular
introspection and an environment of encouragement. When employees are encouraged to take risks, they find
better solutions. When employees are allowed to fail and learn, they succeed. Retrospectives are an opportunity
for such investigation and growth.
Retrospectives reinforce the principles of a growth mindset: experimentation, testing, learning, sharing, growing,
and empowering. They provide a safe place for team members to share the challenges faced in the current sprint.
And they allow the team to discuss and collaborate on ways to overcome those challenges. Retrospectives
empower the team to create sustainable growth.
Retrospective structure
A quick search on any search engine will offer many different approaches and tools for running a retrospective.
Depending on the maturity of the culture and experience level of the team, these could prove useful. However, the
general structure of a retrospective remains roughly the same. During these meetings, each member of the team is
expected to contribute a thought regarding three basic questions:
What went well?
What could have been better?
What did we learn?
Although these questions are simple in nature, they require employees to pause and reflect on their work over the
last iteration. This small pause for introspection is the primary building block of a growth mindset. The humility
and honesty produced when sharing the answers can become infectious beyond the time set aside for the
retrospective meeting.
Lessons learned
Highly effective teams don't just run retrospective meetings. They live retrospective processes. The lessons learned
and shared in these meetings can influence process, shape future work, and help the team execute more
effectively. Lessons learned in a retrospective should help the team grow organically. The primary byproducts of a
retrospective are an increase in experimentation and a refinement of the lessons learned by the team.
That new growth is most tangibly represented in changes to the release or iteration backlog.
The retrospective marks the end of a release or iteration. As teams gain experience and learn lessons, they
adjust the release and iteration backlog to reflect new processes and experiments to be tested. This
starts the next iteration through the migration processes.
Next steps
The Secure and Manage section of this content can help prepare the reader for the transition from migration to
operations.
Secure monitoring and management tools
Secure monitoring and management tools
After a migration is complete, migrated assets should be managed by controlled IT operations. This article does
not represent a deviation from operational best practices. Instead, the following should be considered a minimum
viable product for securing and managing migrated assets, either from IT operations or independently as IT
operations come online.
Monitoring
Monitoring is the act of collecting and analyzing data to determine the performance, health, and availability of your
business workload and the resources that it depends on. Azure includes multiple services that individually perform
a specific role or task in the monitoring space. Together, these services deliver a comprehensive solution for
collecting, analyzing, and acting on telemetry from your workload applications and the Azure resources that
support them. Gain visibility into the health and performance of your apps, infrastructure, and data in Azure with
cloud monitoring tools, such as Azure Monitor, Log Analytics, and Application Insights. Use these cloud
monitoring tools to take action and integrate with your service management solutions:
Core monitoring. Core monitoring provides fundamental, required monitoring across Azure resources. These
services require minimal configuration and collect core telemetry that the premium monitoring services use.
Deep application and infrastructure monitoring. Azure services provide rich capabilities for collecting and
analyzing monitoring data at a deeper level. These services build on core monitoring and take advantage of
common functionality in Azure. They provide powerful analytics with collected data to give you unique insights
into your applications and infrastructure.
Learn more about Azure Monitor for monitoring migrated assets.
Security monitoring
Rely on the Azure Security Center for unified security monitoring and advanced threat notification across your
hybrid cloud workloads. The Security Center gives full visibility into and control over the security of cloud
applications in Azure. Quickly detect and take action to respond to threats and reduce exposure by enabling
adaptive threat protection. The built-in dashboard provides instant insights into security alerts and vulnerabilities
that require attention. Azure Security Center can help with many functions, including:
Centralized policy monitoring. Ensure compliance with company or regulatory security requirements by
centrally managing security policies across hybrid cloud workloads.
Continuous security assessment. Monitor the security of machines, networks, storage and data services, and
applications to discover potential security issues.
Actionable recommendations. Remediate security vulnerabilities before they can be exploited by attackers.
Include prioritized and actionable security recommendations.
Advanced cloud defenses. Reduce threats with just-in-time access to management ports and safe lists to
control applications running on your VMs.
Prioritized alerts and incidents. Focus on the most critical threats first, with prioritized security alerts and
incidents.
Integrated security solutions. Collect, search, and analyze security data from a variety of sources, including
connected partner solutions.
Learn more about Azure Security Center for securing migrated assets.
Service health monitoring
Azure Service Health provides personalized alerts and guidance when Azure service issues affect you. It can notify
you, help you understand the impact of issues, and keep you updated as the issue is resolved. It can also help you
prepare for planned maintenance and changes that could affect the availability of your resources.
Service health dashboard. Check the overall health of your Azure services and regions, with detailed updates
on any current service issues, upcoming planned maintenance, and service transitions.
Service health alerts. Configure alerts that will notify you and your teams in the event of a service issue like
an outage or upcoming planned maintenance.
Service health history. Review past service issues and download official summaries and reports from
Microsoft.
Learn more about Azure Service Health for staying informed about the health of your migrated resources.
Optimize resources
Azure Advisor is your personalized guide to Azure best practices. It analyzes your configuration and usage
telemetry and offers recommendations to help you optimize your Azure resources for high availability, security,
performance, and cost. Advisor’s inline actions help you quickly and easily remediate your recommendations and
optimize your deployments.
Azure best practices. Optimize migrated resources for high availability, security, performance, and cost.
Step-by-step guidance. Remediate recommendations efficiently with guided quick links.
New recommendations alerts. Stay informed about new recommendations, such as additional opportunities
to rightsize VMs and save money.
Learn more about Azure Advisor for optimizing your migrated resources.
Suggested skills
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with
cloud adoption doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning
that helps you achieve your goals faster. Earn points and levels, and achieve more!
Here is an example of a tailored learning path on Microsoft Learn that's aligned with the Secure and Manage
portion of the Cloud Adoption Framework:
Secure your cloud data: Azure was designed for security and compliance. Learn how to leverage the built-in
services to store your app data securely to ensure that only authorized services and clients have access to it.
All IT portfolios contain a few workloads and ideas that could significantly improve a company's position in the market. Most
cloud adoption efforts focus on the migration and modernization of existing workloads. It's innovation, however, that can provide
the greatest business value. Cloud adoption-related innovation can unlock new technical skills and expanded business
capabilities.
This section of the Cloud Adoption Framework focuses on the elements of your portfolio that drive the greatest return on
investment.
Get started
To prepare you for this phase of the cloud adoption lifecycle, the framework suggests the following exercises:
Best practices
Your architectural decisions should follow best practices for each tool in the toolchain. By adhering to such guidance, you can
better accelerate solution development and provide a reference for solid architectural designs.
Feedback loops
During each iteration, the solutions under development offer a way for your teams to learn alongside customers. Fast and
accurate feedback loops with your customers can help you better test, measure, learn, and ultimately reduce the time to
market impact. Learn how Azure and GitHub accelerate feedback loops.
Methodology summary
The considerations overview establishes a common language for innovation across application development, DevOps, IT, and
business teams.
The exercises in the Get started section help make the methodology actionable during the development of innovative
solutions.
This approach builds on existing lean methodologies. It's designed to help you create a cloud-focused conversation about
customer adoption and a scientific model for creating business value. The approach also maps existing Azure services to
manageable decision processes. This alignment can help you find the right technical options to address specific customer needs
or hypotheses.
Suggested skills
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with cloud adoption
doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning that helps you achieve your
goals faster. Earn points and levels, and achieve more!
Here are a couple of examples of role-specific learning paths on Microsoft Learn that align with the Innovate portion of the
Cloud Adoption Framework.
Administer containers in Azure: Azure Container Instances (ACI) is the quickest and easiest way to run containers in Azure. This
learning path will teach you how to create and manage your containers, and how you can use ACI to provide elastic scale for
Kubernetes.
Create serverless applications: Azure Functions enable the creation of event-driven, compute-on-demand systems that can be
triggered by various external events. Learn how to leverage functions to execute server-side logic and build serverless
architectures.
To discover additional learning paths, browse the Learn catalog. Use the Roles filter to align learning paths with your role.
Next steps
The first exercise for cloud innovation is to:
Build consensus for business value of innovation
TIP
For an interactive experience, view this guide in the Azure portal. Go to the Azure Quickstart Center in the Azure portal,
select Azure innovation guide, and then follow the step-by-step instructions.
Next steps: Prepare for innovation with a shared repository and ideation management tools
This guide provides interactive steps that let you try features as they're introduced. To come back to where you left
off, use the breadcrumb for navigation.
Azure DevOps
During your innovation journey, you'll eventually find yourself on the path to DevOps. Microsoft has long had an
on-premises product known as Team Foundation Server (TFS). During our own innovation journey, Microsoft
developed Azure DevOps, a cloud-based service that provides build and release tools supporting many languages
and destinations for your releases. For more information, see Azure DevOps.
Generate value
In every industry, every organization is trying to do one thing: drive constant value generation.
The focus on innovation is essentially a process to help your organization find new ways to generate value.
Perhaps the biggest mistake organizations make is trying to create new value by introducing new technologies.
Sometimes the attitude is "if we just use more technology, we'll see things improve." But innovation is first and
foremost a people story.
Innovation is about the combination of people and technology.
Organizations that successfully innovate see vision, strategy, culture, unique potential, and capabilities as the
foundational elements. They then turn to technology with a specific purpose in mind. Every company is becoming a
software company. The hiring of software engineers is growing at a faster rate outside the tech industry than
inside, according to LinkedIn data.
Innovation is accomplished when organizations support their people to create the value they seek. One group of
those people, developers, is a catalyst for innovation. They play an increasingly vital role in value creation and
growth across every industry. They're the builders of our era, writing the world's code and sitting at the heart of
innovation. Innovative organizations build a culture that empowers developers to achieve more.
Developer productivity
Innovate collaboratively
Innovation characteristics
LiveOps innovation
Developer velocity
Empowering developers to invent means accelerating developer velocity, enabling them to create more, innovate
more, and solve more problems. Developer velocity is the underpinning of each organization's tech intensity.
Developer velocity isn't just about speed. It's also about unleashing developer ingenuity, turning your developers'
ideas into software with speed and agility so that innovative solutions can be built. The differentiated Azure
solution is uniquely positioned to unleash innovation in your organization.
Build productively
There are several areas of opportunity where Azure can help you build productively:
Ensure developers become and stay proficient in their domain by helping them advance their knowledge.
Hone the right skills by giving them the right tools.
One of the best ways to improve your developers' skills is by giving them tools they know and love. Azure tools
meet developers where they are today and introduce them to new technologies in the context of the code they're
writing. With the Azure commitment to open-source software and support for all languages and frameworks in
Azure tools, your developers can build how they want and deploy where they want.
Azure DevOps provides best-in-class tools for every developer. Azure developer services infuse modern
development practices and emerging trends into our tools. With the Azure platform, developers have access to the
latest technologies and a cutting-edge toolchain that supports the way they work.
AI-assisted development tools
Integrated tools and cloud
Remote development and pair programming
Go to the Get started documentation for Azure DevOps
Action
To create a DevOps project:
1. Go to Azure DevOps Projects.
2. Select Create DevOps project.
3. Select Runtime, Framework, and Service.
The IoT Hub Device Provisioning Service is a helper service for IoT Hub that enables zero-touch, just-in-time
provisioning.
Action
To create IoT Hub Device Provisioning Services:
1. Go to IoT Hub Device Provisioning Services.
2. Select Create Device Provisioning Services.
The digital economy is an undeniable force in almost every industry. During the Industrial Revolution, gasoline,
conveyor belts, and human ingenuity were key resources for promoting market innovation. Product quality, price,
and logistics drove markets as companies sought to deliver better products to their customers more quickly.
Today's digital economy shifts the way in which customers interact with corporations. The primary forms of
capital and market differentiators have all shifted as a result. In the digital economy, customers are less concerned
with logistics and more concerned with their overall experience of using a product. This shift arises from direct
interaction with technology in our daily lives and from a realization of the value associated with those interactions.
In the Innovate phase of the Cloud Adoption Framework, we'll focus on understanding customer needs and
rapidly building innovations that shape how your customers interact with your products. We'll also illustrate an
approach to delivering on the value of a minimum viable product (MVP). Finally, we'll map decisions common to
innovation cycles to help you understand how the cloud can unlock innovation and create partnerships with your
customers.
Innovate methodology
The simple methodology for cloud innovation within the Cloud Adoption Framework is illustrated in the
following image. Subsequent articles in this section will show how to establish core processes, approaches, and
mechanisms for finding and driving innovation within your company.
Cultural commitments
Adopting the Innovate methodology requires some cultural commitments to effectively use the metrics outlined
in this article. Before you change your approach to driving innovation, make sure the adoption and leadership
teams are ready to make these important commitments.
Commitment to transparency
To understand measurement in an innovation approach, you must first understand the commitment to
transparency. Innovation can only thrive in an environment that adheres to a growth mindset. At the root of a
growth mindset is a cultural imperative to learn from experiences. Successful innovation and continuous learning
start with a commitment to transparency in measurement. This is a brave commitment for the cloud adoption
team. However, that commitment is meaningless if it's not matched by a commitment to preserve transparency
within the leadership and cloud strategy teams.
Transparency is important because measuring customer impact doesn't address the question of right or wrong.
Nor are impact measurements indicative of the quality of work or the performance of the adoption team. Instead,
they represent an opportunity to learn and better meet your customers' needs. Misuse of innovation metrics can
stifle that culture. Eventually, such misuse will lead to manipulation of metrics, which in turn causes long-term
failure of the invention, the supporting staff, and ultimately the management structure who misused the data.
Leaders and contributors alike should avoid using measurements for anything other than an opportunity to learn
and improve the MVP solution.
Commitment to iteration
Only one promise rings true across all innovation cycles: you won't get it right on the first try. Measurement
helps you understand what adjustments you should make to achieve the desired results. Changes that lead to
favorable outcomes stem from iterations of the build-measure-learn process. The cloud adoption team and the
cloud strategy team must commit to an iterative mindset before adopting a growth mindset or a build-measure-
learn approach.
Next steps
Before building the next great invention, get started with customer adoption by understanding the build-
measure-learn feedback loop.
Customer adoption with the build-measure-learn feedback loop
Build consensus on the business value of innovation
The first step to developing any new innovation is to identify how that innovation can drive business value. In this
exercise, you answer a series of questions that highlight the importance of investing ample time when your
organization defines business value.
Qualifying questions
Before you develop any solution (in the cloud or on-premises), validate your business value criteria by answering
the following questions:
1. What is the defined customer need that you seek to address with this solution?
2. What opportunities would this solution create for your business?
3. Which business outcomes would be achieved with this solution?
4. Which of your company's motivations would be served with this solution?
If the answers to all four questions are well documented, you might not need to complete the rest of this exercise.
Fortunately, you can easily test any documentation. Set up two short meetings to test both the documentation and
your organization's internal alignment. Invite committed business stakeholders to one meeting and set up a
separate meeting with the engaged development team. Ask the four questions above to each group, and then
compare the results.
NOTE
The existing documentation should not be shared with either team before the meeting. If true alignment exists, the guiding
hypotheses should be referenced or even recited by members of each group.
WARNING
Don't facilitate the meeting. This test is to determine alignment; it's not an alignment creation exercise. When you start the
meeting, remind the attendees that the objective is to test directional alignment to existing agreements within the team.
Establish a five-minute time limit for each question. Set a timer and close each question after five minutes even if the
attendees haven't agreed upon an answer.
Account for the different languages and interests of each group. If the test results in answers that are directionally
aligned, consider this exercise a victory. You're ready to move on to solution development.
If one or two of the answers are directionally aligned, recognize that your hard work is paying off. You're already
better aligned than most organizations. Future success is likely with minor continuing investment in alignment.
Review each of the following sections for ideas that may help you build further alignment.
If either team fails to answer all four questions in 30 minutes, then alignment and the considerations in the
following sections are likely to have a significant impact on this effort and others. Pay careful attention to each of
the following sections.
Next steps
After you've aligned your business value proposition and communicated it, you're ready to start building your
solution.
Return to the innovate exercises for next steps
Create customer partnerships through the build-
measure-learn feedback loop
True innovation comes from the hard work of building solutions that demonstrate customer empathy, from
measuring the impact of those changes on the customer, and from learning with the customer. Most importantly, it
comes from feedback over multiple iterations.
If the past decade has taught us anything about innovation, it's that the old rules of business have changed. Large,
wealthy incumbents no longer have an unbreakable hold on the market. The first or best players to market aren't
always the winners. Having the best idea doesn't lead to market dominance. In a rapidly changing business
climate, market leaders are the most agile. Those who can adapt to changing conditions lead.
Large or small, the companies that thrive in the digital economy as innovative leaders are those with the greatest
ability to listen to their customer base. That skill can be cultivated and managed. At the core of all good
partnerships is a clear feedback loop. The process for building customer partnerships within the Cloud Adoption
Framework is the build-measure-learn feedback loop.
Next steps
Learn how to Build with customer empathy to begin your build-measure-learn cycle.
Build with customer empathy
Build with customer empathy
"Necessity is the mother of invention." This proverb captures the indomitable nature of the human spirit and our natural
drive to invent. As explained in the Oxford English Dictionary, "When the need for something becomes
imperative, you are forced to find ways of getting or achieving it." Few would deny these universal truths about
invention. However, as described in Innovation in the digital economy, innovation requires a balance of
invention and adoption.
Continuing with the analogy, innovation comes from a more extended family. Customer empathy is the proud
parent of innovation. Creating a solution that drives innovation requires a legitimate customer need—one that
keeps the customer coming back to solve critical challenges. These solutions are based on what a customer
needs rather than on their wants or whims. To find customers' true needs, we start with empathy—a deep
understanding of the customer's experience. Empathy is an underdeveloped skill for many engineers, product
managers, and even business leaders. Fortunately, the diverse interactions and rapid pace of the cloud architect
role have already started fostering this skill.
Why is empathy so important? From the first release of a minimum viable product (MVP) to the general
availability of a market-grade solution, customer empathy helps us understand and share in the experience of
the customer. Empathy helps us build a better solution. More importantly, it better positions us to invent
solutions that will encourage adoption. In a digital economy, those who can most readily empathize with
customer needs can build a brighter future that redefines and leads the market.
Properly defining what to build can be tricky and requires some practice. If you build something too quickly, it
might not reflect customer needs. If you spend too much time trying to understand initial customer needs and
solution requirements, the market may meet them before you have a chance to build anything at all. In either
scenario, the opportunity to learn can be significantly delayed or reduced. Sometimes the data can even be
corrupted.
The most innovative solutions in history began with an intuitive belief. That gut feeling comes from both existing
expertise and firsthand observation. We start with the build phase because it allows for a rapid test of that
intuition. From there, we can cultivate deeper understanding and clearer degrees of empathy. At every iteration
or release of a solution, balance comes from building MVPs that demonstrate customer empathy.
To steady this balancing act, the following two sections discuss the concepts of building with empathy and
defining an MVP.
Define a customer-focused hypothesis
Building with empathy means creating a solution based on defined hypotheses that illustrate a specific customer
need. The following steps aim to formulate a hypothesis that will encourage building with empathy.
1. When you build with empathy, the customer is always the focus. This intention can take many shapes. You
could reference a customer archetype, a specific persona, or even a picture of a customer in the midst of the
problem you want to solve. And keep in mind that customers can be internal (employees or partners) or
external (consumers or business customers). This definition is the first hypothesis to be tested: Can we help
this specific customer?
2. Understand the customer experience. Building with empathy means you can relate to the customer's
experience and understand their challenges. This mindset indicates the next hypothesis to be tested: Can we
help this specific customer with this manageable challenge?
3. Define a simple solution to a single challenge. Relying on expertise across people, processes, and subject
matter experts will lead to a potential solution. This is the full hypothesis to be tested: Can we help this
specific customer with this manageable challenge through the proposed solution?
4. Arrive at a value statement. What long-term value do you hope to provide to these customers? The answer to
this question creates your full hypothesis: How will these customers' lives be improved by using the
proposed solution to address this manageable challenge?
This last step is the culmination of an empathy-driven hypothesis. It defines the audience, the problem, the
solution, and the metric by which improvement is to be made, all of which center on the customer. During the
measure and learn phases, each hypothesis should be tested. Changes in the customer, problem statement, or
solution are anticipated as the team develops greater empathy for the addressable customer base.
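The four-step hypothesis above can be captured in a simple template so that each element stays individually testable. The following sketch is illustrative only; the class, field names, and example values are assumptions, not part of the framework:

```python
from dataclasses import dataclass

@dataclass
class EmpathyHypothesis:
    """One empathy-driven hypothesis, built up through the four steps."""
    customer: str    # step 1: the specific customer (archetype or persona)
    challenge: str   # step 2: the manageable challenge they face
    solution: str    # step 3: the simple proposed solution
    value: str       # step 4: the long-term value you hope to provide

    def statement(self) -> str:
        """Render the full hypothesis for review and later testing."""
        return (f"{self.customer}'s lives will be improved by using "
                f"{self.solution} to address {self.challenge}, "
                f"delivering {self.value}.")

# Hypothetical example values:
h = EmpathyHypothesis(
    customer="field technicians",
    challenge="slow access to repair manuals on site",
    solution="a mobile lookup app",
    value="faster first-visit repairs",
)
print(h.statement())
```

Recording the hypothesis this way makes it easy to see, after each iteration, which single element (customer, challenge, solution, or value) the measurements invalidated.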
Caution
The goal is to build with customer empathy, not to plan with it. It's all too easy to get stuck in endless cycles of
planning and tweaking to hit upon the perfect customer empathy statement. Before you try to develop such a
statement, review the following sections on defining and building an MVP.
After core assumptions are proven, later iterations will focus on growth tests in addition to empathy tests. After
empathy is built, tested, and validated, you can begin to understand the addressable market at scale. This can be
done through an expansion of the standard hypothesis formula described earlier. Based on available data,
estimate the size of the total market—the number of potential customers.
From there, estimate the percentage of that total market that experiences a similar challenge and that might
therefore be interested in this solution. This is your addressable market. The next hypothesis to be tested is: how
will x% of customers' lives be improved by using the proposed solution to address this manageable challenge? A
small sampling of customers will reveal leading indicators that suggest a percentage impact on the pool of
customers engaged.
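The estimation described above is simple arithmetic. As a sketch, with entirely hypothetical figures:

```python
def addressable_market(total_market: int, pct_with_challenge: float) -> int:
    """Estimate the addressable market: the share of the total market
    that experiences a similar challenge."""
    return round(total_market * pct_with_challenge)

# Hypothetical figures: 200,000 potential customers, 15% share the challenge.
total = 200_000
addressable = addressable_market(total, 0.15)
print(addressable)  # 30000

# A small sampling of engaged customers suggests a leading indicator:
# if 20 of a 50-customer sample adopt the solution, project that rate
# across the addressable market.
sample_size, sample_adopters = 50, 20
projected_impact = round(addressable * (sample_adopters / sample_size))
print(projected_impact)  # 12000
```

The projected figure is only a leading indicator; later iterations of the measure and learn phases test whether the sampled adoption rate holds at scale.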
Define a solution to test the hypothesis
During each iteration of a build-measure-learn feedback loop, your attempt to build with empathy is defined by
an MVP.
An MVP is the smallest unit of effort (invention, engineering, application development, or data architecture)
required to create enough of a solution to learn with the customer. The goal of every MVP is to test some or all
of the prior hypotheses and to receive feedback directly from the customer. The output is not a beautiful
application with all the features required to change your industry. The desired output of each iteration is a
learning opportunity—a chance to more deeply test a hypothesis.
Timeboxing is a standard way to make sure a product remains lean. For example, make sure your development
team thinks the solution can be created in a single iteration to allow for rapid testing. To better understand using
velocity, iterations, and releases to define what minimal means, see Planning velocity, iterations, release, and
iteration paths.
Reduce complexity and delay technical spikes
The disciplines of invention found in the Innovate methodology describe the functionality that's often required
to deliver a mature innovation or scale-ready MVP solution. Use these disciplines as a long-term guide for
feature inclusion. Likewise, use them as a cautionary guide during early testing of customer value and empathy
in your solution.
Feature breadth and the different disciplines of invention can't all be created in a single iteration. It might take
several releases for an MVP solution to include the complexity of multiple disciplines. Depending on the
investment in development, there might be multiple parallel teams working within different disciplines to test
multiple hypotheses. Although it's smart to maintain architectural alignment between those teams, it's unwise to
try to build complex, integrated solutions until value hypotheses can be validated.
Complexity is best detected in the frequency or volume of technical spikes. Technical spikes are efforts to create
technical solutions that can't be easily tested with customers. When customer value and customer empathy are
untested, technical spikes represent a risk to innovation and should be minimized. For the types of mature tested
solutions found in a migration effort, technical spikes can be common throughout adoption. However, they
delay the testing of hypotheses in innovation efforts and should be postponed whenever possible.
A relentless simplification approach is suggested for any MVP definition. This approach means removing
anything that doesn't add to your ability to validate the hypothesis. To minimize complexity, reduce the number
of integrations and features that aren't required to test the hypothesis.
Build an MVP
At each iteration, an MVP solution can take many different shapes. The common requirement is only that the
output allows for measurement and testing of the hypothesis. This simple requirement initiates the scientific
process and allows the team to build with empathy. To deliver this customer-first focus, an initial MVP might rely
on only one of the disciplines of invention.
In some cases, the fastest path to innovation means temporarily avoiding these disciplines entirely, until the
cloud adoption team is confident that the hypothesis has been accurately validated. Coming from a technology
company like Microsoft, this guidance might sound counterintuitive. However, this simply emphasizes that
customer needs, not a specific technology decision, are the highest priority in an MVP solution.
Typically, an MVP solution consists of a simple web app or data solution with minimal features and limited
polish. For organizations that have professional development expertise, this path is often the fastest one to
learning and iteration. The following list includes several other approaches a team might take to build an MVP:
A predictive algorithm that's wrong 99% of the time but that demonstrates specific desired outcomes.
An IoT device that doesn't communicate securely at production scale but that demonstrates the value of
nearly real-time data within a process.
An application built by a citizen developer to test a hypothesis or meet smaller-scale needs.
A manual process that re-creates the benefits of the application to follow.
A wireframe or video that's detailed enough to allow the customer to interact.
Developing an MVP shouldn't require massive amounts of development investment. Preferably, investment
should be as constrained as possible to minimize the number of hypotheses being tested at one time. Then, in
each iteration and with each release, the solution is intentionally improved toward a scale-ready solution that
represents multiple disciplines of invention.
Accelerate MVP development
Time to market is crucial to the success of any innovation. Faster releases lead to faster learning. Faster learning
leads to products that can scale more quickly. At times, traditional application development cycles can slow this
process. More frequently, innovation is constrained by limits on available expertise. Budgets, headcount, and
availability of staff can all create limits to the number of new innovations a team can handle.
Staffing constraints and the desire to build with empathy have spawned a rapidly growing trend toward citizen
developers. These developers reduce risk and provide scale within an organization's professional development
community. Citizen developers are subject matter experts where the customer experience is concerned, but
they're not trained as engineers. These individuals use prototyping tools or lighter-weight development tools
that might be frowned upon by professional developers. These business-aligned developers create MVP
solutions and test theories. When aligned well, this process can create production solutions that provide value
but don't pass a sufficiently effective scale hypothesis. They can also be used to validate a prototype before scale
efforts begin.
Within any innovate plan, cloud adoption teams should diversify their portfolios to include citizen developer
efforts. By scaling development efforts, more hypotheses can be formed and tested at a reduced investment.
When a hypothesis is validated and an addressable market is identified, professional developers can harden and
scale the solution by using modern development tools.
Final build gate: Customer pain
When customer empathy is strong, a clearly existing problem should be easy to identify. The customer's pain
should be obvious. During build, the cloud adoption team is building a solution to test a hypothesis based on a
customer pain point. If the hypothesis is well-defined but the pain point is not, the solution is not truly based on
customer empathy. In this scenario, build is not the right starting point. Instead, invest first in building empathy
and learning from real customers. The best approach for building empathy and validating pain is simple: listen
to your customers. Invest time in meeting with and observing them until you can identify a pain point that
occurs frequently. After the pain point is well-understood, you're ready to test a hypothesized solution for
addressing that pain.
References
Some of the concepts in this article build on topics discussed in The Lean Startup (Eric Ries, Crown Business,
2011).
Next steps
After you've built an MVP solution, you can measure the empathy value and scale value. Learn how to measure
for customer impact.
Measure for customer impact
Measure for customer impact
There are several ways to measure for customer impact. This article will help you define metrics to validate
hypotheses that arise out of an effort to build with customer empathy.
Strategic metrics
During the strategy phase of the cloud adoption lifecycle, we examine motivations and business outcomes. These
practices provide a set of metrics by which to test customer impact. When innovation is successful, you tend to
see results that are aligned with your strategic objectives.
Before establishing learning metrics, define a small number of strategic metrics that you want this innovation to
affect. Generally, those strategic metrics align with one or more of the following outcome areas: business agility,
customer engagement, customer reach, financial impact, or, in the case of operational innovation, solution
performance.
Document the agreed-upon metrics and track their impact frequently. But don't expect results in any of these
metrics to emerge for several iterations. For more information about setting and aligning expectations across the
parties involved, see Commitment to iteration.
Aside from motivation and business outcome metrics, the remainder of this article focuses on learning metrics
designed to guide transparent discovery and customer-focused iterations. For more information about these
aspects, see Commitment to transparency.
Learning metrics
When the first version of any minimum viable product (MVP) is shared with customers, preferably at the end of
the first development iteration, there will be no impact on strategic metrics. Several iterations later, the team may
still be struggling to change behaviors enough to materially affect strategic metrics. During learning processes,
such as build-measure-learn cycles, we advise the team to adopt learning metrics. These metrics provide tracking and
learning opportunities.
Customer flow and learning metrics
If an MVP solution validates a customer-focused hypothesis, the solution will drive some change in customer
behaviors. Those behavior changes across customer cohorts should improve business outcomes. Keep in mind
that changing customer behavior is typically a multistep process. Because each step provides an opportunity to
measure impact, the adoption team can keep learning along the way and build a better solution.
Learning about changes to customer behavior starts by mapping the flow that you hope to see from an MVP
solution.
In most cases, a customer flow will have an easily defined starting point and no more than two end points.
Between the start and end points are a variety of learning metrics to be used as measures in the feedback loop:
1. Starting point—initial trigger: The starting point is the scenario that triggers the need for this solution.
When the solution is built with customer empathy, that initial trigger should inspire a customer to try the MVP
solution.
2. Customer need met: The hypothesis is validated when a customer need has been met by using the solution.
3. Solution steps: This term refers to the steps that are required to move the customer from the initial trigger to
a successful outcome. Each step produces a learning metric based on a customer decision to move on to the
next step.
4. Individual adoption achieved: The next time the trigger is encountered, if the customer returns to the
solution to get their need met, individual adoption has been achieved.
5. Business outcome indicator: When a customer behaves in a way that contributes to the defined business
outcome, a business outcome indicator is observed.
6. True Innovation: When business outcome indicators and individual adoption both occur at the desired scale,
you've realized true innovation.
Each step of the customer flow generates learning metrics. After each iteration (or release), a new version of the
hypothesis is tested. At the same time, tweaks to the solution are tested to reflect adjustments in the hypothesis.
When customers follow the prescribed path in any given step, a positive metric is recorded. When customers
deviate from the prescribed path, a negative metric is recorded.
These alignment and deviation counters create learning metrics. Each should be recorded and tracked as the
cloud adoption team progresses toward business outcomes and true innovation. In Learn with customers, we'll
discuss ways to apply these metrics to learn and build better solutions.
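The alignment and deviation counters described above could be recorded per solution step. The following sketch assumes a simple in-memory tracker; the class and step names are illustrative, not part of the framework:

```python
from collections import defaultdict

class LearningMetrics:
    """Track, per solution step, how often customers follow the
    prescribed path (alignment) versus leave it (deviation)."""
    def __init__(self):
        self.aligned = defaultdict(int)
        self.deviated = defaultdict(int)

    def record(self, step: str, followed_path: bool) -> None:
        """Record one customer decision at a given solution step."""
        if followed_path:
            self.aligned[step] += 1
        else:
            self.deviated[step] += 1

    def deviation_rate(self, step: str) -> float:
        """Fraction of customers who left the prescribed path at this step."""
        total = self.aligned[step] + self.deviated[step]
        return self.deviated[step] / total if total else 0.0

metrics = LearningMetrics()
metrics.record("initial trigger", True)
metrics.record("sign-up", True)
metrics.record("sign-up", False)   # one customer abandoned this step
print(metrics.deviation_rate("sign-up"))  # 0.5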
Grouping and observing customer partners
The first measurement in defining learning metrics is the customer partner definition. Any customer who
participates in innovation cycles qualifies as a customer partner. To accurately measure behavior, you should use a
cohort model to define customer partners. In this model, customers are grouped to sharpen your understanding
of their responses to changes in the MVP. These groups typically resemble the following:
Experiment or focus group: Grouping customers based on their participation in a specific experiment
designed to test changes over time.
Segment: Grouping customers by the size of the company.
Vertical: Grouping customers by the industry vertical they represent.
Individual demographics: Grouping based on personal demographics like age and physical location.
These types of groupings help you validate learning metrics across various cross-sections of those customers
who choose to partner with you during your innovation efforts. All subsequent metrics should be derived from
a definable customer grouping.
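The cohort model above can be sketched in code. This example assumes hypothetical customer-partner records and grouping dimensions; none of the field names come from the framework itself:

```python
from collections import defaultdict

# Hypothetical customer-partner records; the keys mirror the
# grouping dimensions above (segment, vertical, demographics).
partners = [
    {"id": 1, "segment": "enterprise", "vertical": "retail",  "adopted": True},
    {"id": 2, "segment": "smb",        "vertical": "retail",  "adopted": False},
    {"id": 3, "segment": "smb",        "vertical": "finance", "adopted": True},
]

def adoption_by_cohort(records, dimension):
    """Group customer partners by one cohort dimension and compute
    the adoption rate within each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[dimension]].append(record["adopted"])
    return {cohort: sum(flags) / len(flags) for cohort, flags in groups.items()}

print(adoption_by_cohort(partners, "segment"))
# e.g. {'enterprise': 1.0, 'smb': 0.5}
```

Deriving every learning metric from such a grouping makes it possible to see whether a change in the MVP lands differently across cohorts.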
Next steps
As learning metrics accumulate, the team can begin to learn with customers.
Learn with customers
Some of the concepts in this article build on topics first described in The Lean Startup, written by Eric Ries.
Learn with customers
Our current customers represent our best resource for learning. By partnering with us, they help us build with
customer empathy to find the best solution to their needs. They also help create a minimum viable product (MVP)
solution by generating metrics from which we measure customer impact. In this article, we'll describe how to
learn with and from our customer-partners.
Continuous learning
At the end of every iteration, we have an opportunity to learn from the build and measure cycles. This process of
continuous learning is quite simple. The following image offers an overview of the process flow.
At its most basic, continuous learning is a method for responding to learning metrics and assessing their impact
on customer needs. This process consists of three primary decisions to be made at the end of each iteration:
Did the hypothesis prove true? When the answer is yes, celebrate for a moment and then move on. There
are always more things to learn, more hypotheses to test, and more ways to help the customer in your next
iteration. When a hypothesis proves true, it's often a good time for teams to decide on a new feature that will
enhance the solution's utility for the customer.
Can you get closer to a validated hypothesis by iterating on the current solution? The answer is
usually yes. Learning metrics typically suggest points in the process that lead to customer deviation. Use these
data points to find the root of a failed hypothesis. At times, the metrics may also suggest a solution.
Is a reset of the hypothesis required? The scariest thing to learn in any iteration is that the hypothesis or
underlying need was flawed. When this happens, an iteration alone isn't necessarily the right answer. When a
reset is required, the hypothesis should be rewritten and the solution reviewed in light of the new hypothesis.
The sooner this type of learning occurs, the easier it will be to pivot. Early hypotheses should focus on testing
the riskiest aspects of the solution in service of avoiding pivots later in development.
Unsure? The second most common response after "iterate" is "we're not sure." Embrace this response. It
represents an opportunity to engage the customer and to look beyond the data.
The answers to these questions will shape the iteration to follow. Companies that demonstrate an ability to apply
continuous learning and boldly make the right decisions for their customers are more likely to emerge as leaders
in their markets.
For better or worse, the practice of continuous learning is an art that requires a great deal of trial and error. It also
requires some science and data-driven decision-making. Perhaps the most difficult part of adopting continuous
learning concerns the cultural requirements. To effectively adopt continuous learning, your business culture must
be open to a fail first, customer-focused approach. The following section provides more details about this
approach.
Growth mindset
Few could deny the radical transformation within Microsoft culture that's occurred over the last several years. This
multifaceted transformation, led by Satya Nadella, has been hailed as a surprising business success story. At the
heart of this story is the simple belief we call the growth mindset. An entire section of this framework could be
dedicated to the adoption of a growth mindset. But to simplify this guidance, we'll focus on a few key points that
inform the process of learning with customers:
Customer first: If a hypothesis is designed to improve the experience of real customers, you have to meet real
customers where they are. Don't just rely on metrics. Compare and analyze metrics based on firsthand
observation of customer experiences.
Continuous learning: Customer focus and customer empathy stem from a learn-it-all mindset. The Innovate
method strives to be learn-it-all, not know-it-all.
Beginner's mindset: Demonstrate empathy by approaching every conversation with a beginner's mindset.
Whether you're new to your field or a 30-year veteran, assume you know little, and you'll learn a lot.
Listen more: Customers want to partner with you. Unfortunately, an ego-driven need to be right blocks that
partnership. To learn beyond the metrics, speak less and listen more.
Encourage others: Don't just listen; use the things you do say to encourage others. In every meeting, find
ways to pull in diverse perspectives from those who may not be quick to share.
Share the code: When we feel our obligation is to the ownership of a code base, we lose sight of the true
power of innovation. Focus on owning and driving outcomes for your customers. Share your code (publicly
with the world or privately within your company) to invite diverse perspectives into the solution and the code
base.
Challenge what works: Success doesn't necessarily mean you're demonstrating true customer empathy.
Avoid having a fixed mindset and a bias toward doing what's worked before. Look for learning in positive and
negative metrics by engaging your customers.
Be inclusive: Work hard to invite diverse perspectives into the mix. There are many variables that can divide
humans into segregated groups. Cultural norms, past behaviors, gender, religion, sexual preference, even
physical abilities. True innovation comes when we challenge ourselves to see past our differences and
consciously strive to include all customers, partners, and coworkers.
Next steps
As a next step to understanding this methodology, Common blockers and challenges to innovation can prepare
you for the changes ahead.
Understanding common blockers and challenges
Some of the concepts in this article build on topics first described in The Lean Startup, written by Eric Ries.
Common blockers and challenges to innovation
As described in Innovation in the digital economy, innovation requires a balance of invention and adoption. This
article expands on the common challenges and blockers to innovation, and it aims to help you understand how this
approach can add value during your innovation cycles.
Formula for innovation: Innovation = Invention + Adoption
Adoption challenges
Cloud technology advances have reduced some of the friction related to adoption. However, adoption is more
people-centric than technology-centric. And unfortunately, the cloud can't fix people.
The following list elaborates on some of the most common adoption challenges related to innovation. As you
progress through the Innovate methodology, each of the challenges in the following sections will be identified and
addressed. Before you apply this methodology, evaluate your current innovation cycles to determine which are the
most important challenges or blockers for you. Then, use the methodology to address or remove those blockers.
External challenges
Time to market: In a digital economy, time to market is one of the most crucial indicators of market
domination. Surprisingly, time to market impact has little to do with positioning or early market share. Both of
those factors are fickle and temporary. The time to market advantage comes from the simple truth that the more
time your solution has on the market, the more time you have to learn, iterate, and improve. Focus heavily on
quick definition and rapid build of an effective minimum viable product to shorten time to market and
accelerate learning opportunities.
Competitive challenges: Dominant incumbents reduce opportunities to engage and learn from customers.
Competitors also create external pressure to deliver more quickly. Build fast but invest heavily in understanding
the proper measures. Well-defined niches produce more actionable feedback measures and enhance your
ability to partner and learn, resulting in better overall solutions.
Understand your customer: Customer empathy starts with an understanding of the customer and customer
base. One of the biggest challenges for innovators is the ability to rapidly categorize measurements and
learning within the build-measure-learn cycle. It's important to understand your customer through the lenses
of market segmentation, channels, and types of relationships. Throughout the build-measure-learn cycle, these
data points help create empathy and shape the lessons learned.
Internal challenges
Choosing innovation candidates: When investing in innovation, healthy companies spawn an endless
supply of potential inventions. Many of these create compelling business cases that suggest high returns and
generate enticing business justification spreadsheets. As described in the build article, building with customer
empathy should be prioritized over invention that's based only on gain projections. If customer empathy isn't
visible in the proposal, long-term adoption is unlikely.
Balancing the portfolio: Most technology implementations don't focus on changing the market or improving
the lives of customers. In the average IT department, more than 80% of workloads are maintained for basic
process automation. With the ease of innovation, it's tempting to innovate and rearchitect those solutions. In
most cases, those workloads can achieve similar or better returns by migrating or modernizing the solution,
with no change to core business logic or data processes. Balance your portfolio to favor innovation strategies
that can be built with clear empathy for the customer (internal or external). For all other workloads, follow a
migrate path to financial returns.
Maintaining focus and protecting priorities: When you've made a commitment to innovation, it's
important to maintain your team's focus. During the first iteration of a build phase, it's relatively easy to keep a
team excited about the possibilities of changing the future for your customers. However, that first MVP release
is just the beginning. True innovation comes with each build-measure-learn cycle, by learning from the
feedback loops to produce a better solution. As a leader in any innovation process, you should concentrate on
keeping the team focused and on maintaining your innovation priorities through the subsequent, less-
glamorous build iterations.
Invention challenges
Before the widespread adoption of the cloud, invention cycles that depended on information technology were
laborious and time-consuming. Procurement and provisioning cycles frequently delayed the crucial first steps
toward any new solutions. The cost of DevOps solutions and feedback loops delayed teams' abilities to collaborate
on early stage ideation and invention. Costs related to developer environments and data platforms prevented
anyone but highly trained professional developers from participating in the creation of new solutions.
The cloud has overcome many of these invention challenges by providing self-service automated provisioning,
lightweight development and deployment tools, and opportunities for professional developers and citizen
developers to cooperate in creating rapid solutions. Leveraging the cloud for innovation dramatically reduces
customer challenges and blockers to the invention side of the innovation equation.
Invention challenges in a digital economy
The invention challenges of today are different. The endless potential of cloud technologies also produces more
implementation options and deeper considerations about how those implementations might be used.
The Innovate methodology uses the following innovation disciplines to help align your implementation decisions
with your invention and adoption goals:
Data platforms: New sources and variations on data are available. Many of these couldn't be integrated into
legacy or on-premises applications to create cost-effective solutions. Understanding the change you hope to
drive in customers will inform your data platform decisions. Those decisions will be an extension of selected
approaches to ingest, integrate, categorize, and share data. Microsoft refers to this decision-making process as
the democratization of data.
Device interactions: IoT, mobile, and augmented reality blur the lines between digital and physical,
accelerating the digital economy. Understanding the real-world interactions surrounding customer behavior
will drive decisions about device integration.
Applications: Applications are no longer the exclusive domain of professional developers. Nor do they require
traditional server-based approaches. Empowering professional developers, enabling business specialists to
become citizen developers, and expanding compute options for API, microservices, and PaaS solutions expand
application interface options. Understanding the digital experience required to shape customer behavior will
improve your decision-making about application options.
Source code and deployment: Collaboration between developers of all walks improves both quality and
speed to market. Integration of feedback and a rapid response to learning shape market leaders. Commitments
to the build, measure, and learn processes help accelerate tool adoption decisions.
Predictive solutions: In a digital economy, it's seldom sufficient to simply meet the current needs of your
customers. Customers expect businesses to anticipate their next steps and predict their future needs.
Continuous learning often evolves into prediction tooling. The complexity of customer needs and the
availability of data will help define the best tools and approaches to predict and influence.
In a digital economy, the greatest challenge architects face is to clearly understand their customers' invention and
adoption needs and to then determine the best cloud-based toolchain to deliver on those needs.
Next steps
Based on the knowledge gained regarding the build-measure-learn model and growth mindset, you are now ready
to develop digital inventions within the Innovate methodology.
Develop digital inventions
As described in Innovation in the digital economy, innovation requires a balance of invention and adoption.
Customer feedback and partnership are required to drive adoption. The disciplines described in the next section
define a series of approaches to developing digital inventions while keeping adoption and customer empathy in
mind. Each of the disciplines is briefly described, along with deeper links into each process.
Next steps
Democratization of data is the first discipline of innovation to consider and evaluate.
Democratize data
Coal, oil, and human potential were the three most consequential assets during the Industrial Revolution. These
assets built companies, shifted markets, and ultimately changed nations. In the digital economy, there are three
equally important assets: data, devices, and human potential. Each of these assets holds great innovation
potential. For any innovation effort in the modern era, data is the new oil.
Across every company today, there are pockets of data that could be used to find and meet customer needs more
effectively. Unfortunately, the process of mining that data to drive innovation has long been costly and time-
consuming. Many of the most valuable solutions to customer needs go unmet because the right people can't
access the data they need.
Democratization of data is the process of getting this data into the right hands to drive innovation. This process
can take several forms, but it generally includes solutions for ingesting or integrating raw data, centralizing
data, sharing data, and securing data. When these methods are successful, experts around the company can use
the data to test hypotheses. In many cases, cloud adoption teams can build with customer empathy using only
data, rapidly addressing existing customer needs.
Share data
When you build with customer empathy, all processes elevate customer need over a technical solution.
Democratizing data is no exception, so we start by sharing data. Any approach to democratized data must
include a solution that shares data with a data consumer. The data consumer could be a direct customer or a proxy who makes decisions
for customers. Approved data consumers can analyze, interrogate, and report on centralized data, with no
support from IT staff.
Many successful innovations have been launched as a minimum viable product (MVP) that delivers manual, data-
driven processes on behalf of the customer. In this concierge model, an employee is the data consumer. That
employee uses data to aid the customer. Each time the customer engages manual support, a hypothesis can be
tested and validated. This approach is often a cost-effective means of testing a customer-focused hypothesis
before you invest heavily in integrated solutions.
The primary tools for sharing data directly with data consumers include self-service reporting or data embedded
within other experiences, using tools like Power BI.
NOTE
Before you share data, make sure you've read the following sections. Sharing data might require governance to provide
protection for the shared data. Also, that data might be spread across multiple clouds and could require centralization.
Much of the data might even reside within applications, which will require data collection before you can share it.
Govern data
Sharing data can quickly produce an MVP that you can use in customer conversations. However, to turn that
shared data into useful and actionable knowledge, a bit more is generally required. After a hypothesis has been
validated through data sharing, the next phase of development is typically data governance.
Data governance is a broad topic that could require its own dedicated framework. That degree of granularity is
outside the scope of the Cloud Adoption Framework. However, there are several aspects of data governance that
you should consider as soon as the customer hypothesis is validated. For example:
Is the shared data sensitive? Data should be classified before being shared publicly to protect the interests
of customers and the company.
If the data is sensitive, has it been secured? Protection of sensitive data should be a requirement for any
democratized data. The example workload focused on securing data solutions provides a few references for
securing data.
Is the data catalogued? Capturing details about the data being shared will aid in long-term data
management. Tools for documenting data, like Azure Data Catalog, can make this process much easier in the
cloud. Guidance regarding the annotation of data and documentation of data sources can help accelerate the
process.
When democratization of data is important to a customer-focused hypothesis, make sure the governance of
shared data is somewhere in the release plan. This will help protect customers, data consumers, and the company.
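The classification question above can be made concrete with a small sketch. The following Python snippet is illustrative only: it flags columns whose names suggest sensitive content so they can be reviewed before the data is shared. The keyword list and the "Confidential"/"General" labels are assumptions for the example, not an official taxonomy; a real implementation would use a governance tool such as Azure Data Catalog.

```python
# Illustrative only: a minimal classification pass that flags potentially
# sensitive columns before a dataset is democratized. Keywords and labels
# are assumptions, not an official classification scheme.
SENSITIVE_KEYWORDS = {"ssn", "email", "phone", "birth", "salary", "address"}

def classify_columns(columns):
    """Return a {column: label} map marking columns to review before sharing."""
    labels = {}
    for name in columns:
        lowered = name.lower()
        if any(key in lowered for key in SENSITIVE_KEYWORDS):
            labels[name] = "Confidential"   # secure before sharing
        else:
            labels[name] = "General"        # candidate for broad sharing
    return labels

print(classify_columns(["CustomerEmail", "OrderTotal", "DateOfBirth"]))
```

Even a simple pass like this makes the "is the shared data sensitive?" question an explicit, repeatable step in the release plan rather than an afterthought.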
Centralize data
When data is dispersed across an IT environment, opportunities to innovate can be extremely constrained,
expensive, and time-consuming. The cloud provides new opportunities to centralize data across data silos. When
centralization of multiple data sources is required to build with customer empathy, the cloud can accelerate the
testing of hypotheses.
Caution
Centralization of data represents a risk point in any innovation process. When data centralization is a technical
spike (as opposed to a source of customer value), we suggest that you delay centralization until the customer
hypotheses have been validated.
If centralization of data is required, you should first define the appropriate data store for the centralized data. It's a
good practice to establish a data warehouse in the cloud. This scalable option provides a central location for all
your data. This type of solution is available in Online Analytical Processing (OLAP) or Big Data options.
The reference architectures for OLAP and Big Data solutions can help you choose the most relevant solution in
Azure. If a hybrid solution is required, the reference architecture for extending on-premises data can also help
accelerate solution development.
IMPORTANT
Depending on the customer need and the aligned solution, a simpler approach may be sufficient. The cloud architect should
challenge the team to consider lower cost solutions that could result in faster validation of the customer hypothesis,
especially during early development. The following section on collecting data covers some scenarios that might suggest a
different solution for your situation.
Collect data
When you need data to be centralized to address a customer need, it's very likely that you'll also have to collect
the data from various sources and move it into the centralized data store. There are two primary forms of data
collection: integration and ingestion.
Integration: Data that resides in an existing data store can be integrated into the centralized data store by using
traditional data movement techniques. This is especially common for scenarios that involve multicloud data
storage. These techniques involve extracting the data from the existing data store and then loading it into the
central data store. At some point in this process, the data is typically transformed to be more usable and relevant
in the central store.
Cloud-based tools have turned these techniques into pay-per-use tools, reducing the barrier to entry for data
collection and centralization. Tools like Azure Database Migration Service and Azure Data Factory are two
examples. The reference architecture for data factory with an OLAP data store is an example of one such solution.
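The extract-transform-load pattern described above can be sketched in a few lines. This is a hedged illustration, not a Data Factory pipeline: the in-memory SQLite databases stand in for the existing and central data stores, and the table and column names are invented for the example.

```python
import sqlite3

# Sketch of integration via extract-transform-load (ETL). sqlite3 stands in
# for both the existing store and the central store; names are illustrative.
source = sqlite3.connect(":memory:")
central = sqlite3.connect(":memory:")

source.execute("CREATE TABLE sales (region TEXT, amount_cents INTEGER)")
source.executemany("INSERT INTO sales VALUES (?, ?)",
                   [("west", 1250), ("east", 3000)])

central.execute("CREATE TABLE sales_central (region TEXT, amount_usd REAL)")

# Extract from the existing store, transform (cents -> dollars), then load
# into the central store where it is more usable and relevant.
rows = source.execute("SELECT region, amount_cents FROM sales").fetchall()
central.executemany("INSERT INTO sales_central VALUES (?, ?)",
                    [(region, cents / 100.0) for region, cents in rows])
central.commit()

print(central.execute("SELECT * FROM sales_central").fetchall())
```

Cloud services like Azure Data Factory apply the same extract-transform-load shape at scale, on a pay-per-use basis.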
Ingestion: Some data doesn't reside in an existing data store. When this transient data is a primary source of
innovation, you'll want to consider alternative approaches. Transient data can be found in a variety of existing
sources like applications, APIs, data streams, IoT devices, a blockchain, an application cache, in media content, or
even in flat files.
You can integrate these various forms of data into a central data store on an OLAP or Big Data solution. However,
for early iterations of the build–measure–learn cycle, an Online Transactional Processing (OLTP) solution might
be more than sufficient to validate a customer hypothesis. OLTP solutions aren't the highest-quality solution for
any reporting scenario. However, when you're building with customer empathy, it's more important to focus on
customer needs than on technical tooling decisions. After the customer hypothesis is validated at scale, a more
suitable platform might be required. The reference architecture on OLTP data stores can help you determine
which data store is most appropriate for your solution.
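For early hypothesis validation, ingestion can be as simple as the following sketch: transient JSON events (as might arrive from an API or a device stream) written into a lightweight OLTP-style store. The event shape, table, and use of SQLite are assumptions for illustration only.

```python
import json
import sqlite3

# Illustrative ingestion of transient data: events that don't live in any
# existing data store are parsed and written into a simple OLTP-style
# store, sufficient for early hypothesis testing.
events = [
    '{"device": "sensor-1", "reading": 21.5}',
    '{"device": "sensor-2", "reading": 19.8}',
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (device TEXT, reading REAL)")
for raw in events:
    event = json.loads(raw)  # parse the transient payload
    db.execute("INSERT INTO readings VALUES (?, ?)",
               (event["device"], event["reading"]))
db.commit()
print(db.execute("SELECT COUNT(*) FROM readings").fetchone()[0])  # prints 2
```

Once the hypothesis is validated at scale, the same events would typically flow into a more suitable OLAP or Big Data platform.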
Virtualize: Integration and ingestion of data can sometimes slow innovation. When a solution for data
virtualization is already available, it might represent a more reasonable approach. Ingestion and integration can
both duplicate storage and development requirements, add data latency, increase attack surface area, trigger
quality issues, and increase governance efforts. Data virtualization is a more contemporary alternative that leaves
the original data in a single location and creates pass-through or cached queries of the source data.
SQL Server 2017 and Azure SQL Data Warehouse both support PolyBase, which is the approach to data
virtualization most commonly used in Azure.
Next steps
With a strategy for democratizing data in place, you'll next want to evaluate approaches to engaging customers
through apps.
Engaging customers through apps
Engage through applications
As discussed in Democratize data, data is the new oil. It fuels most innovations across the digital economy.
Building on that analogy, applications are the fueling stations and infrastructure required to get that fuel into the
right hands.
In some cases, data alone is enough to drive change and meet customer needs. More commonly, though, solutions
to customer needs require applications to shape the data and create an experience. Applications are the way we
engage the user. They are the home for the processes required to respond to customer triggers. They are
customers' means of providing data and receiving guidance. This article summarizes several principles that can
help align you with the right application solution, based on the hypotheses to be validated.
Shared code
Teams that more quickly and accurately respond to customer feedback, market changes, and opportunities to
innovate typically lead their respective markets in innovation. The first principle of innovative applications is
summed up in the growth mindset overview: "Share the code." Over time, innovation emerges from a cultural
focus. To sustain innovation, diverse perspectives and contributions are required.
To be ready for innovation, all application development should start with a shared code repository. The most
widely adopted tool for managing code repositories is GitHub, which allows you to create a shared code
repository quickly. Alternatively, Azure Repos is a set of version control tools in Azure DevOps Services that you
can use to manage your code. Azure Repos provides two types of version control:
Git: distributed version control
Team Foundation Version Control (TFVC): centralized version control
Citizen developers
Professional developers are a vital component of innovation. When a hypothesis proves accurate at scale,
professional developers are required to stabilize and prepare the solution for scale. Most of the principles
referenced in this article require support from professional developers. Unfortunately, current trends suggest
that demand for professional developers outstrips the available supply. Moreover, the cost and pace of
innovation can be less favorable when professional development is deemed necessary. In response to these
challenges, citizen developers provide a way to scale development efforts and accelerate early hypothesis testing.
The use of citizen developers can be viable and effective when early hypotheses can be validated through tools like
PowerApps for app interfaces, AI Builder for processes and predictions, Microsoft Flow for workflows, and Power
BI for data consumption.
NOTE
When you rely on citizen developers to test hypotheses, it's advisable to have some professional developers on hand to
provide support, review, and guidance. After a hypothesis is validated at scale, a process for transitioning the application into
a more robust programming model will accelerate returns on the innovation. By involving professional developers in process
definitions early on, you can realize cleaner transitions later.
Intelligent experiences
Intelligent experiences combine the speed and scale of modern web applications with the intelligence of cognitive
services and bots. Alone, each of these technologies might be sufficient to meet your customers' needs. When
smartly combined, they broaden the spectrum of needs that can be met through a digital experience, while helping
to contain development costs.
Modern web apps
When an application or experience is required to meet a customer need, modern web applications can be the
fastest way to go. Modern web experiences can engage internal or external customers quickly and allow for rapid
iteration on the solution.
Infusing intelligence
Machine learning and artificial intelligence are increasingly available to developers. The widespread availability of
common APIs with predictive capabilities allows developers to better meet the needs of the customer through
expanded access to data and predictions.
Adding intelligence to a solution can enable speech to text, text translation, computer vision, and even visual
search. With these expanded capabilities, it's easier for developers to build solutions that take advantage of
intelligence to create an interactive and modern experience.
Bots
Bots provide an experience that feels less like using a computer and more like dealing with a person — at least
with an intelligent robot. They can be used to shift simple, repetitive tasks (such as making a dinner reservation or
gathering profile information) onto automated systems that might no longer require direct human intervention.
Users converse with a bot through text, interactive cards, and speech. A bot interaction can range from a quick
question-and-answer to a sophisticated conversation that intelligently provides access to services.
Bots are a lot like modern web applications: they live on the internet and use APIs to send and receive messages.
What's in a bot varies widely depending on what kind of bot it is. Modern bot software relies on a stack of
technology and tools to deliver increasingly complex experiences on a variety of platforms. However, a simple bot
could just receive a message and echo it back to the user with very little code involved.
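The "receive a message and echo it back" bot mentioned above really is only a few lines. This sketch shows just that core receive-and-respond mechanic; real bot frameworks (such as the Azure Bot Framework) wrap this loop in channels, conversation state, and APIs.

```python
# Minimal sketch of the simplest possible bot: it receives a message and
# echoes it back. Everything beyond this (channels, state, intelligence)
# is framework plumbing layered on top of the same mechanic.
def echo_bot(message: str) -> str:
    """Return a reply that echoes the user's message."""
    return f"You said: {message}"

print(echo_bot("Book a table for two"))  # prints: You said: Book a table for two
```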
Bots can do the same things as other types of software: read and write files, use databases and APIs, and handle
regular computational tasks. What makes bots unique is their use of mechanisms generally reserved for human-
to-human communication.
Cloud-native solutions
Cloud-native applications are built from the ground up, and they're optimized for cloud scale and performance.
Cloud-native applications are typically built using microservices, serverless, event-based, or container-based
approaches. Most commonly, cloud-native solutions use a combination of microservices architectures, managed
services, and continuous delivery to achieve reliability and faster time to market.
A cloud-native solution allows centralized development teams to maintain control of the business logic without
the need for monolithic, centralized solutions. This type of solution also creates an anchor to drive consistency
across the input of citizen developers and modern experiences. Finally, cloud-native solutions provide an
innovation accelerator by freeing citizen and professional developers to innovate safely and with a minimum of
blockers.
Refactoring or rearchitecting solutions or centralizing business logic can quickly become a time-consuming
technical spike instead of a source of customer value. This is a risk to innovation, especially early in hypothesis
validation. With a bit of creativity in the design of a solution, there should be a path to MVP that doesn't require
refactoring of existing solutions. It's wise to delay refactoring until the initial hypothesis can be validated at scale.
Next steps
Depending on the hypothesis and solution, the principles in this article can aid in designing apps that meet MVP
definitions and engage users. Up next are the principles for empowering adoption, which offer ways to get the
application and data into the hands of customers more quickly and efficiently.
Empower adoption
The ultimate test of innovation is customer reaction to your invention. Did the hypothesis prove true? Do
customers use the solution? Does it scale to meet the needs of the desired percentage of users? Most importantly,
do they keep coming back? None of these questions can be asked until the minimum viable product (MVP)
solution has been deployed. In this article, we'll focus on the discipline of empowering adoption.
Shared solution: Establish a centralized repository for all aspects of the solution.
Feedback loops: Make sure that feedback loops can be managed consistently through iterations.
Continuous integration: Regularly build and consolidate the solution.
Reliable testing: Validate solution quality and expected changes to ensure the reliability of your testing metrics.
Solution deployment: Deploy solutions so that the team can quickly share changes with customers.
Integrated measurement: Add learning metrics to the feedback loop for clear analysis by the full team.
To minimize technical spikes, assume that maturity will initially be low across each of these principles. But
definitely plan ahead by aligning to tools and processes that can scale as hypotheses become more fine-grained.
In Azure, GitHub and Azure DevOps allow small teams to get started with little friction. These teams might
grow to include thousands of developers who collaborate on scale solutions and test hundreds of customer
hypotheses. The remainder of this article illustrates the plan big/start small approach to empowering adoption
across each of these principles.
Shared solution
As described in Measure for customer impact, positive validation of any hypothesis requires iteration and
determination. You'll experience far more failures than wins during any innovation cycle. This is expected.
However, when a customer need, hypothesis, and solution align at scale, the world changes quickly.
When you're scaling innovation, there's no more valuable tool than a shared codebase for the solution.
Unfortunately, there's no reliable way of predicting which iteration or which MVP will yield the winning
combination. That's why it's never too early to establish a shared codebase or repository. This is the one technical
spike that should never be delayed. As the team iterates through various MVP solutions, a shared repo enables
easy collaboration and accelerated development. When changes to the solution drag down learning metrics,
version control lets you roll back to an earlier, more effective version of the solution.
The most widely adopted tool for managing code repositories is GitHub, which lets you create a shared code
repository with just a few clicks. Additionally, the Azure Repos feature of Azure DevOps can be used to create a
Git or Team Foundation repository.
Feedback loops
Making the customer part of the solution is the key to building customer partnerships during innovation cycles.
That's accomplished, in part, by measuring customer impact. It requires conversations and direct testing with the
customer. Both generate feedback that must be managed effectively.
Every point of feedback is a potential solution to the customer need. More importantly, every bit of direct
customer feedback represents an opportunity to improve the partnership. If feedback makes it into an MVP
solution, celebrate that with the customer. Even if some feedback isn't actionable, simply being transparent with
the decision to deprioritize the feedback demonstrates a growth mindset and a focus on continuous learning.
Azure DevOps includes ways to request, provide, and manage feedback. Each of these tools centralizes feedback
so that the team can take action and provide follow-up in service of a transparent feedback loop.
Continuous integration
As adoption scales and a hypothesis gets closer to true innovation at scale, the number of smaller hypotheses to
be tested tends to grow rapidly. For accurate feedback loops and smooth adoption processes, it's important that
each of those hypotheses is integrated and supportive of the primary hypothesis behind the innovation. This
means that you also have to move quickly to innovate and grow, which requires multiple developers for testing
variations of the core hypothesis. For later stage development efforts, you might even need multiple teams of
developers, each building toward a shared solution. Continuous integration is the first step toward management
of all the moving parts.
In continuous integration, code changes are frequently merged into the main branch. Automated build and test
processes make sure that code in the main branch is always production quality. This ensures that developers are
working together to develop shared solutions that provide accurate and reliable feedback loops.
Azure DevOps and Azure Pipelines provide continuous integration capabilities with just a few clicks in GitHub or a
variety of other repositories. Learn more about continuous integration, or check out the hands-on lab. There are
also solution architectures to accelerate creation of your CI/CD pipelines through Azure
DevOps.
Reliable testing
Defects in any solution can create false positives or false negatives. Unexpected errors can easily lead to
misinterpretation of user adoption metrics. They can also generate negative feedback from customers that doesn't
accurately represent the test of your hypothesis.
During early iterations of an MVP solution, defects are expected; early adopters might even find them endearing.
In early releases, acceptance testing is typically nonexistent. However, one aspect of building with empathy
concerns the validation of the need and hypothesis. Both can be completed through unit tests at a code level and
manual acceptance tests before deployment. Together, these provide some means of reliability in testing. You
should strive to automate a well-defined series of build, unit, and acceptance tests. These will ensure reliable
metrics related to more granular tweaks to the hypothesis and the resulting solution.
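The unit-level checks described above can be small and still valuable. The following sketch uses a hypothetical piece of MVP business logic (a discount calculation invented for the example); the point is that even a few automated assertions, run before every deployment, keep adoption metrics and customer feedback from being polluted by defects.

```python
# Hypothetical MVP logic plus the minimal automated checks that guard it.
# The discount function is an assumption for illustration, not part of any
# real solution described in this article.
def apply_discount(price: float, percent: float) -> float:
    """Apply a percentage discount, rejecting out-of-range percentages."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Unit-level checks, runnable on every build:
assert apply_discount(100.0, 20) == 80.0
try:
    apply_discount(100.0, 150)
except ValueError:
    pass  # invalid input correctly rejected
else:
    raise AssertionError("invalid percent should be rejected")

print("all checks passed")  # prints: all checks passed
```

A well-defined series of checks like this is what makes the resulting learning metrics trustworthy.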
The Azure Test Plans feature provides tooling to develop and operate test plans during manual or automated test
execution.
Solution deployment
Perhaps the most meaningful aspect of empowering adoption concerns your ability to control the release of a
solution to customers. By providing a self-service or automated pipeline for releasing a solution to customers,
you'll accelerate the feedback loop. By allowing customers to quickly interact with changes in the solution, you
invite them into the process. This approach also triggers quicker testing of hypotheses, thereby reducing
assumptions and potential rework.
There are several methods for solution deployment. The following are the three most common:
Continuous deployment is the most advanced method, as it automatically deploys code changes into
production. For mature teams that are testing mature hypotheses, continuous deployment can be extremely
valuable.
During early stages of development, continuous delivery might be more appropriate. In continuous delivery,
any code changes are automatically deployed to a production-like environment. Developers, business decision-
makers, and others on the team can use this environment to verify that their work is production-ready. You can
also use this method to test a hypothesis with customers without affecting ongoing business activities.
Manual deployment is the least sophisticated approach to release management. As the name suggests,
someone on the team manually deploys the most recent code changes. This approach is error prone,
unreliable, and considered an antipattern by most seasoned engineers.
During the first iteration of an MVP solution, manual deployment is common, despite the preceding assessment.
When the solution is extremely fluid and customer feedback is unknown, there's a significant risk in resetting the
entire solution (or even the core hypothesis). Here's the general rule for manual deployment: no customer proof,
no deployment automation.
Investing in deployment automation too early can lead to lost time. More importantly, it can create dependencies on the release pipeline that
make the team more resistant to an early pivot. After the first few iterations or when customer feedback suggests
potential success, a more advanced model of deployment should be quickly adopted.
At any stage of hypothesis validation, Azure DevOps and Azure Pipelines provide continuous delivery and
continuous deployment capabilities. Learn more about continuous delivery, or check out the hands-on lab.
Solution architecture can also accelerate creation of your CI/CD pipelines through Azure DevOps.
Integrated measurements
When you measure for customer impact, it's important to understand how customers react to changes in the
solution. This data, known as telemetry, provides insights into the actions a user (or cohort of users) took when
working with the solution. From this data, it's easy to get a quantitative validation of the hypothesis. Those metrics
can then be used to adjust the solution and generate more fine-grained hypotheses. Those subtler changes help
mature the initial solution in subsequent iterations, ultimately driving to repeat adoption at scale.
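The step from raw telemetry to quantitative validation can be sketched as a simple aggregation. The event shape below (a "user" and an "action" per event) is an assumption for illustration; services like Azure Monitor and Application Insights collect and query this kind of data at production scale.

```python
from collections import Counter

# Illustrative telemetry aggregation: raw events become per-action counts
# and a unique-user figure, the kind of metrics used to validate a
# hypothesis quantitatively. The event shape is assumed for the sketch.
telemetry = [
    {"user": "u1", "action": "opened_report"},
    {"user": "u2", "action": "opened_report"},
    {"user": "u1", "action": "shared_report"},
]

action_counts = Counter(event["action"] for event in telemetry)
unique_users = len({event["user"] for event in telemetry})

print(action_counts["opened_report"], unique_users)  # prints: 2 2
```

Metrics of this shape feed directly back into the backlog, producing the more fine-grained hypotheses the paragraph above describes.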
In Azure, Azure Monitor provides the tools and interface to collect and review data from customer experiences.
You can apply those observations and insights to refine the backlog by using Azure Boards.
Next steps
After you've gained an understanding of the tools and processes needed to empower adoption, it's time to
examine a more advanced innovation discipline: interact with devices. This discipline can help reduce the barriers
between physical and digital experiences, making your solution even easier to adopt.
Interact with devices
Ambient experiences: Interact with devices
In Build with customer empathy, we discussed the three tests of true innovation: Solve a customer need, keep the
customer coming back, and scale across a base of customer cohorts. Each test of your hypothesis requires effort
and iterations on the approach to adoption. This article offers insights on some advanced approaches to reduce
that effort through ambient experiences. By interacting with devices, instead of an application, the customer may
be more likely to turn to your solution first.
Ambient experiences
An ambient experience is a digital experience that relates to the immediate surroundings. A solution that features
ambient experiences strives to meet the customer in their moment of need. When possible, the solution meets the
customer need without leaving the flow of activity that triggered it.
Life in the digital economy is full of distractions. We're all bombarded with social, email, web, visual, and verbal
messaging, each of which is a risk of distraction. This risk increases with every second that elapses between the
customer's point of need and the moment they encounter a solution. Countless customers are lost in that brief
time gap. To foster an increase in repeat adoption, you have to reduce the number of distractions by reducing time
to solution (TTS).
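Time to solution can be measured directly once the trigger and resolution moments are timestamped. The sketch below (timestamps in seconds, names hypothetical) computes the median gap across customer encounters:

```python
from statistics import median

# (trigger_time, solution_time) pairs, one per customer encounter (synthetic data)
encounters = [(0.0, 4.0), (10.0, 11.5), (30.0, 42.0)]

def median_tts(pairs):
    """Median seconds between the customer's point of need and the solution."""
    return median(solved - triggered for triggered, solved in pairs)
```

Tracking this metric across iterations shows whether a more ambient experience is actually closing the gap between need and solution.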
Today, ambient experiences typically require more than a web app. Through measurement and learning with the
customer, the behavior that triggers the customer's need can be observed, tracked, and used to build a more
ambient experience. The following list summarizes a few approaches to integrating ambient solutions into your
hypotheses, with more details about each in the following paragraphs.
Mobile experience: As with laptops, mobile apps are ubiquitous in customer environments. In some
situations, this might provide a sufficient level of interactivity to make a solution ambient.
Mixed reality: Sometimes a customer's typical surroundings must be altered to make an interaction ambient.
This approach creates something of a false reality in which the customer interacts with the solution and has a need
met. In this case, the solution is ambient within the false reality.
Integrated reality: Moving closer to true ambience, integrated reality solutions focus on the use of a device
that exists within the customer's reality to integrate the solution into their natural behaviors. A virtual assistant
is a great example of integrating reality into the surrounding environment. A lesser known option concerns
Internet of Things (IoT) technologies, which integrate devices that already exist in the customer's surroundings.
Adjusted reality: When any of these ambient solutions use predictive analysis in the cloud to define and
provide an interaction with the customer through the natural surroundings, the solution has adjusted reality.
Understanding the customer need and measuring customer impact both help you determine whether a device
interaction or ambient experience is necessary to validate your hypothesis. With each of those data points, the
following sections will help you find the best solution.
Mobile experience
In the first stage of ambient experience, the user moves away from the computer. Today's consumers and business
professionals move fluidly between mobile and PC devices. Each of the platforms or devices used by your
customer creates a new potential experience. Adding a mobile experience that extends the primary solution is the
fastest way to improve integration into the customer's immediate surroundings. While a mobile device is far from
ambient, it might edge closer to the customer's point of need.
When customers are mobile and change locations frequently, a mobile experience may represent the most
relevant form of ambient experience for a particular solution. Over the past decade, innovation has frequently
been triggered by the integration of existing solutions with a mobile experience.
Azure App Service is a great example of this approach. During early iterations, the web app feature of Azure App
Service can be used to test the hypothesis. As the hypotheses become more complex, the mobile app feature of
Azure App Service can extend the web app to run on a variety of mobile platforms.
Mixed reality
Mixed reality solutions represent the next level of maturity for ambient experiences. This approach augments or
replicates the customer's surroundings; it creates an extension of reality for the customer to operate within.
IMPORTANT
If a virtual reality (VR) device is required and is not already part of a customer's immediate surroundings or natural
behaviors, augmented or virtual reality is more of an alternative experience and less of an ambient experience.
Mixed reality experiences are increasingly common among remote workforces. Their use is growing even faster in
industries that require collaboration or specialty skills that aren't readily available in the local market. Situations
that require centralized implementation support of a complex product for a remote labor force are particularly
fertile ground for augmented reality. In these scenarios, the central support team and remote employees might
use augmented reality to work on, troubleshoot, and install the product.
For example, consider the case of spatial anchors. Spatial anchors allow you to create mixed reality experiences
with objects that persist their respective locations across devices over time. Through spatial anchors, a specific
behavior can be captured, recorded, and persisted, thereby providing an ambient experience the next time the user
operates within that augmented environment. Azure Spatial Anchors is a service that moves this logic to the
cloud, allowing experiences to be shared across devices and even across solutions.
Integrated reality
Beyond mobile reality or even mixed reality lies integrated reality. Integrated reality aims to remove the digital
experience entirely. All around us are devices with compute and connectivity capabilities. These devices can be
used to collect data from the immediate surroundings without the customer having to ever touch a phone, laptop,
or VR device.
This experience is ideal when some form of device is consistently within the same surroundings in which the
customer need occurs. Common scenarios include factory floors, elevators, and even your car. These types of large
devices already contain compute power. You can also use data from the device itself to detect customer behaviors
and send those behaviors to the cloud. This automatic capture of customer behavior data dramatically reduces the
need for a customer to input data. Additionally, the web, mobile, or VR experience can function as a feedback loop
to share what's been learned from the integrated reality solution.
Examples of integrated reality in Azure could include:
Azure Internet of Things (IoT) solutions, a collection of services in Azure that each aid in managing devices and
the flow of data from those devices into the cloud and back out to end users.
Azure Sphere, a combination of hardware and software. Azure Sphere is an innately secure way to enable an
existing device to securely transmit data between the device and Azure IoT solutions.
Azure Kinect Developer Kit (DK), AI sensors with advanced computer vision and speech models. These sensors can
collect visual and audio data from the immediate surroundings and feed those inputs into your solution.
You can use all three of these tools to collect data from the natural surroundings and at the point of customer
need. From there, your solution can respond to those data inputs to solve the need, sometimes before the
customer is even aware that a trigger for that need has occurred.
Adjusted reality
The highest form of ambient experience is adjusted reality, often referred to as ambient intelligence. Adjusted
reality is an approach to using information from your solution to change the customer's reality without requiring
them to interact directly with an application. In this approach, the application you initially built to prove your
hypothesis might no longer be relevant at all. Instead, devices in the environment help modulate the inputs and
outputs to meet customer needs.
Virtual assistants and smart speakers offer great examples of adjusted reality. Alone, a smart speaker is an
example of simple integrated reality. But add a smart light and motion sensor to a smart speaker solution and it's
easy to create a basic solution that turns on the lights when you enter a room.
Factory floors around the world provide additional examples of adjusted reality. During early stages of integrated
reality, sensors on devices detected conditions like overheating, and then alerted a human being through an
application. In adjusted reality, the customer might still be involved, but the feedback loop is tighter. On an
adjusted reality factory floor, one device might detect overheating in a vital machine somewhere along the
assembly line. Somewhere else on the floor, a second device then slows production slightly to allow the machine
to cool and then resume full pace when the condition is resolved. In this situation, the customer is a second-hand
participant. The customer uses your application to set the rules and understand how those rules have affected
production, but they're not necessary to the feedback loop.
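The overheating feedback loop above can be sketched as a simple control rule. The threshold and reduced pace below are illustrative values; a real solution would run this logic in device or cloud code fed by actual sensor readings.

```python
FULL_PACE = 1.0  # fraction of normal line speed

def adjust_line_speed(temperature_c, limit_c=90.0, reduced_pace=0.7):
    """One step of the feedback loop: an overheating reading slows the line;
    a resolved condition restores full pace. All values are illustrative."""
    return reduced_pace if temperature_c >= limit_c else FULL_PACE

readings = [75.0, 92.0, 95.0, 88.0]               # upstream machine temperatures
paces = [adjust_line_speed(t) for t in readings]  # [1.0, 0.7, 0.7, 1.0]
```

The customer stays a second-hand participant: the application they use would set `limit_c` and `reduced_pace` and report how often the rule fired, but the loop itself runs without them.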
The Azure services described in Azure Internet of Things (IoT) solutions, Azure Sphere, and Azure Kinect
Developer Kit could each be components of an adjusted reality solution. Your original application and business
logic would then serve as the intermediary between the environmental input and the change that should be made
in the physical environment.
A digital twin is another example of adjusted reality. This term refers to a digital representation of a physical
device, presented through computer, mobile, or mixed-reality formats. Unlike less sophisticated 3D
models, a digital twin reflects data collected from an actual device in the physical environment. This solution
allows the user to interact with the digital representation in ways that could never be done in the real world. In this
approach, physical devices adjust a mixed reality environment. However, the solution still gathers data from an
integrated reality solution and uses that data to shape the reality of the customer's current surroundings.
In Azure, digital twins are created and accessed through a service called Azure Digital Twins.
Next steps
Now that you have a deeper understanding of device interactions and the ambient experience that's right for your
solution, you're ready to explore the final discipline of innovation, Predict and influence.
Predict and influence
Predict and influence
There are two classes of applications in the digital economy: historical and predictive. Many customer needs can
be met solely by using historical data, including nearly real-time data. Most solutions focus primarily on
aggregating data in the moment. They then process and share that data back to the customer in the form of a
digital or ambient experience.
As predictive modeling becomes more cost-effective and readily available, customers demand forward-thinking
experiences that lead to better decisions and actions. However, that demand doesn't always suggest a predictive
solution. In most cases, a historical view can provide enough data to empower the customer to make a decision on
their own.
Unfortunately, customers often take a myopic view that leads to decisions based on their immediate surroundings
and sphere of influence. As options and decisions grow in number and impact, that myopic view may not serve
the customer's needs. At the same time, as a hypothesis is proven at scale, the company providing the solution can
see across thousands or millions of customer decisions. This big-picture approach makes it possible to see broad
patterns and the impacts of those patterns. Predictive capability is a wise investment when an understanding of
those patterns is necessary to make decisions that best serve the customer.
If the customer hypothesis developed in Build with customer empathy includes predictive capabilities, the
principles described there might well apply. However, predictive capabilities require significant investment of time
and energy. When predictive capabilities are technical spikes, as opposed to a source of real customer value, we
suggest that you delay predictions until the customer hypotheses have been validated at scale.
Data
Data is the most elemental of the characteristics mentioned earlier. Each of the disciplines for developing digital
inventions generates data. That data, of course, contributes to the development of predictions. For more guidance
on ways to get data into a predictive solution, see Democratizing data and Interacting with devices.
A variety of data sources can be used to deliver predictive capabilities:
Insights
Subject matter experts use data about customer needs and behaviors to develop basic business insights from a
study of raw data. Those insights can pinpoint occurrences of the desired customer behaviors (or, alternatively,
undesirable results). During iterations on the predictions, these insights can aid in identifying potential correlations
that could ultimately generate positive outcomes. For guidance on enabling subject matter experts to develop
insights, see Democratizing data.
Patterns
People have always tried to detect patterns in large volumes of data, and computers were designed for that
purpose. Machine learning accelerates that quest by detecting precisely such patterns; the detected patterns
constitute the machine learning model. Those patterns are then applied through machine learning algorithms to
predict outcomes when a new set of data is entered into the algorithms.
Using insights as a starting point, machine learning develops and applies predictive models to capitalize on the
patterns in data. Through multiple iterations of training, testing, and adoption, those models and algorithms can
accurately predict future outcomes.
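As a toy stand-in for the train/test iterations described above, the sketch below "learns" a single cut point that best separates labeled one-dimensional samples. Real models built in Azure Machine Learning are far richer, but the loop is the same: fit to observed patterns, then predict on new data.

```python
def train_threshold(samples):
    """Pick the cut point that best separates labeled 1-D samples."""
    best_t, best_acc = None, -1.0
    for t in sorted(x for x, _ in samples):
        # accuracy of the rule "predict True when x >= t" on the training set
        acc = sum((x >= t) == label for x, label in samples) / len(samples)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def predict(threshold, x):
    return x >= threshold

# synthetic training data: the outcome flips once the measurement reaches 5
data = [(1.0, False), (2.0, False), (4.0, False), (5.0, True), (7.0, True)]
t = train_threshold(data)  # learns the cut point 5.0
```

Iterating on this loop with fresh data, as the surrounding text describes, is what gradually turns a crude rule into a model that predicts future outcomes accurately.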
Azure Machine Learning is the cloud-native service in Azure for building and training models based on your data.
This tool also includes a workflow for accelerating the development of machine learning algorithms. This
workflow can be used to develop algorithms through a visual interface or Python.
For more robust machine learning models, ML services in Azure HDInsight provides a machine learning platform
built on Apache Hadoop clusters. This approach enables more granular control of the underlying clusters, storage,
and compute nodes. Azure HDInsight also offers more advanced integration through tools like ScaleR and
SparkR to create predictions based on integrated and ingested data, even working with data from a stream. The
flight delay prediction solution demonstrates each of these advanced capabilities when used to predict flight
delays based on weather conditions. The HDInsight solution also allows for enterprise controls, such as data
security, network access, and performance monitoring to operationalize patterns.
Predictions
After a pattern is built and trained, you can apply it through APIs, which can make predictions during the delivery
of a digital experience. Most of these APIs are built from a well-trained model based on a pattern in your data. As
more customers deploy everyday workloads to the cloud, the prediction APIs used by cloud providers lead to
ever-faster adoption.
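A prediction API of the kind described above amounts to a trained model behind a request/response interface. The sketch below mimics that shape with plain JSON; the payload fields and the stand-in "model" are hypothetical, not a real cloud API.

```python
import json

def make_prediction_api(model):
    """Wrap a trained model behind a JSON request/response interface."""
    def handle(request_body):
        features = json.loads(request_body)["features"]
        return json.dumps({"prediction": model(features)})
    return handle

def delay_model(features):
    # stand-in "trained model": flag a likely flight delay on a high bad-weather score
    return features["weather_score"] > 0.8

api = make_prediction_api(delay_model)
response = api(json.dumps({"features": {"weather_score": 0.9}}))  # '{"prediction": true}'
```

The digital experience calls the API during delivery, and the response flows straight back into what the customer sees, which is the interaction pattern described in the next section.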
Azure Cognitive Services is an example of a predictive API built by a cloud vendor. This service includes predictive
APIs for content moderation, anomaly detection, and suggestions to personalize content. These APIs are ready to
use and are based on well-known content patterns, which Microsoft has used to train models. Each of those APIs
makes predictions based on the data you feed into the API.
Azure Machine Learning lets you deploy custom-built algorithms, which you can create and train based solely on
your own data. Learn more about deploying predictions with Azure Machine Learning.
Set up HDInsight clusters discusses the processes for exposing predictions developed for ML Services on Azure
HDInsight.
Interactions
After a prediction is made available through an API, you can use it to influence customer behavior. That influence
takes the form of interactions. An interaction with a machine learning algorithm happens within your other digital
or ambient experiences. As data is collected through the application or experience, it's run through the machine
learning algorithms. When the algorithm predicts an outcome, that prediction can be shared back with the
customer through the existing experience.
Learn more about how to create an ambient experience through an adjusted reality solution.
Next steps
Having acquainted yourself with the Disciplines of invention and the Innovate methodology, you're now ready to
learn how to build with customer empathy.
Build with empathy
Develop digital inventions in Azure
Azure can help accelerate the development of each area of digital invention. This section of the Cloud Adoption
Framework builds on the Innovate methodology. This section shows how you can combine Azure services to
create a toolchain for digital invention.
Toolchain
Start with the overview page that relates to the type of digital invention you require to test your hypothesis. Each
overview page provides actionable guidance so that you can build with customer empathy.
Here are the types of digital invention in this article series:
Democratize data: Tools for sharing data to solve information-related customer needs
Engage via apps: Tools to create apps that engage customers beyond raw data
Empower adoption: Tools to accelerate customer adoption through digital support for your build-measure-
learn cycles
Interact with devices: Tools to create different levels of ambient experiences for your customers
Predict and influence: Tools for predictive analysis and integration of their output into applications
Tools to democratize data in Azure
As described in the conceptual article on democratizing data, you can deliver many innovations with little technical
investment. Many major innovations require little more than raw data. Democratizing data is about investing as
few resources as needed to engage your customers, who use data to take advantage of their existing knowledge.
Starting with data is a quick way to test a hypothesis before expanding into broader, more costly digital inventions.
As you refine more of the hypothesis and begin to adopt the inventions at scale, the following processes will help
you prepare for operational support of the innovation.
Toolchain
In Azure, the following tools are commonly used to accelerate digital invention across the preceding phases:
Power BI
Azure Data Catalog
Azure SQL Data Warehouse
Azure Cosmos DB
Azure Database for PostgreSQL
Azure Database for MySQL
Azure Database for MariaDB
Azure Database for PostgreSQL Hyperscale
Azure Data Lake
Azure Database Migration Service
Azure SQL Database, with or without managed instances
Azure Data Factory
Azure Stream Analytics
SQL Server Integration Services
Azure Stack
SQL Server Stretch Database
Microsoft Azure StorSimple
Azure Files
Azure File Sync
PolyBase
As the invention approaches adoption at scale, aspects of each solution will require refinement and technical
maturity. As that happens, more of these services are likely to be required. Use the table of contents on the left side
of this page for Azure tools guidance relevant to your hypothesis-testing process.
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
What is data classification?
Data classification allows you to determine and assign value to your organization's data, and is a common starting
point for governance. The data classification process categorizes data by sensitivity and business impact in order to
identify risks. When data is classified, you can manage it in ways that protect sensitive or important data from theft
or loss.
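The categorization step can be sketched as a simple name-based rule set. The rule names and sensitivity levels below are illustrative; a real classification scheme follows your organization's governance policy and compliance requirements.

```python
# hypothetical rules mapping column names to sensitivity classifications
RULES = {
    "ssn": "restricted",
    "salary": "confidential",
    "email": "internal",
}

def classify(column_name, default="public"):
    """Assign a sensitivity label to a data column by name."""
    return RULES.get(column_name.lower(), default)

labels = {c: classify(c) for c in ["SSN", "salary", "product_name"]}
# {'SSN': 'restricted', 'salary': 'confidential', 'product_name': 'public'}
```

Once data carries labels like these, downstream controls (access, encryption, retention) can key off the label rather than off each individual dataset.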
Next steps
Apply data classifications during one of the actionable governance guides.
Choose an actionable governance guide
Collect data through the migration and
modernization of existing data sources
Companies often have different kinds of existing data that they can democratize. When a customer hypothesis
requires the use of existing data to build modern solutions, a first step might be the migration and modernization
of data to prepare for inventions and innovations. To align with existing migration efforts within a cloud adoption
plan, you can more easily do the migration and modernization within the Migrate methodology.
Primary toolset
When you migrate and modernize on-premises data, the most common Azure tool choice is Azure Database
Migration Service. This service is part of the broader Azure Migrate toolchain. For existing SQL Server data
sources, Data Migration Assistant can help you assess and migrate a small number of data structures.
To support Oracle and NoSQL migrations, you can also use Database Migration Service for certain types of
source-to-target databases. Examples include Oracle to PostgreSQL and MongoDB to Cosmos DB. More
commonly, adoption teams use partner tools or custom scripts to migrate to Azure Cosmos DB, Azure HDInsight,
or virtual machine options based on infrastructure as a service (IaaS).
Source | Target | Tool | Mode | Guidance
RDS SQL Server | Azure SQL Database or Azure SQL Database managed instance | Database Migration Service | Online | Tutorial
Various NoSQL database options | Azure Cosmos DB or IaaS options | Procedural migrations or Azure Migrate | Offline or online | Decision tree
Tools to engage via apps in Azure
As described in Engage via apps, applications can be an important aspect of an MVP solution. Applications are
often required for testing a hypothesis. This article helps you learn the tools Azure provides to accelerate
development of those applications.
Toolchain
Depending on the path that the cloud adoption team takes, Azure provides tools to accelerate the team's ability to
build with customer empathy. The following list of Azure offerings is grouped based on the preceding decision
paths. These offerings include:
Azure App Service
Azure Kubernetes Service (AKS)
Azure Migrate
Azure Stack
PowerApps
Microsoft Flow
Power BI
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
Tools to empower adoption in Azure
As described in Empower adoption, building true innovation at scale requires an investment in removing friction
that could slow adoption. In the early stages of testing a hypothesis, a solution is small. The investment in
removing friction is likely small as well. As hypotheses prove true, the solution and the investment in empowering
adoption grows. This article provides key links to help you get started with each stage of maturity.
Toolchain
For adoption teams that are mature professional development teams with many contributors, the Azure toolchain
starts with GitHub and Azure DevOps.
As your need grows, you can expand this foundation to use other tool features. The expanded foundation might
involve tools like:
Azure Blueprints
Azure Policy
Azure Resource Manager templates
Azure Monitor
The table of contents on the left side of this page lists guidance for each tool and aligns with the previously
described maturity model.
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
Tools to interact with devices in Azure
As described in the conceptual article on interacting with devices, the devices used to interact with a customer
depend on the amount of ambient experience required to deliver the customer's need and empower adoption.
The speed from the trigger that prompts the customer's need to your solution's response is a determining factor
in repeat usage. Ambient experiences help accelerate that response time and create a better experience for your
customers by embedding your solution in the customers' immediate surroundings.
Toolchain
In Azure, you commonly use the following tools to accelerate digital invention across each of the preceding levels
of ambient solutions. These tools are grouped by the level of ambient experience required, to reduce the
complexity of aligning tools with those experiences.
Mobile Experience: Azure App Service, PowerApps, Microsoft Flow, Intune
Mixed Reality: Unity, Azure Spatial Anchors, HoloLens
Integrated Reality: Azure IoT Hub, Azure Sphere, Azure Kinect DK
Adjusted Reality: IoT cloud to device, Azure Digital Twins + HoloLens
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
Tools to predict and influence data in Azure
As described in the conceptual article on predict and influence, computers and AI are much better than we are at
seeing patterns. By using cloud-based analytics tools, you can easily detect patterns and apply them to your
customers' needs. Use of these tools results in predictions of the best outcomes. When those predictions are
integrated back into customer experiences, they can influence your customers' behavior patterns through
interactions.
Toolchain
In Azure, the following tools are commonly used to accelerate digital invention across each of the preceding
phases:
Azure Machine Learning
Azure HDInsight
Hadoop, R, and ScaleR
Azure SQL Data Warehouse
How each tool helps with each phase of predict and influence is reflected in the guidance in the table of contents
on the left side of this page.
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
The cloud creates new paradigms for the technologies that support the business. These new paradigms also change how those
technologies are adopted, managed, and governed. When entire datacenters can be virtually torn down and rebuilt with one
line of code executed by an unattended process, we have to rethink traditional approaches. This is especially true for
governance.
Methodology
Establish a basic understanding of the methodology that drives cloud governance in the Cloud Adoption Framework to
begin thinking through the end state solution.
Benchmark
Assess your current state and future state to establish a vision for applying the framework.
Intended audience
The content in the Cloud Adoption Framework affects the business, technology, and culture of enterprises. This section of the
Cloud Adoption Framework interacts heavily with IT security, IT governance, finance, line-of-business leaders, networking,
identity, and cloud adoption teams. Various dependencies on these personnel require a facilitative approach by the cloud
architects using this guidance. Facilitation with these teams might be a one-time effort. In some cases, interactions with these
other personnel will be ongoing.
The cloud architect serves as the thought leader and facilitator to bring these audiences together. The content in this collection
of guides is designed to help the cloud architect facilitate the right conversation, with the right audience, to drive necessary
decisions. Business transformation that's empowered by the cloud depends on the cloud architect to help guide decisions
throughout the business and IT.
Cloud architect specialization in this section: Each section of the Cloud Adoption Framework represents a different
specialization or variant of the cloud architect role. This section of the Cloud Adoption Framework is designed for cloud
architects with a passion for mitigating or reducing technical risks. Some cloud providers refer to these specialists as cloud
custodians, but we prefer cloud guardians or, collectively, the cloud governance team. In each actionable governance guide, the
articles show how the composition and role of the cloud governance team might change over time.
Next steps
Establish a basic understanding of the methodology that drives cloud governance in the Cloud Adoption Framework.
Understand the methodology
Cloud governance methodology
Adopting the cloud is a journey, not a destination. Along the way, there are clear milestones and tangible business
benefits. However, the final state of cloud adoption is unknown when a company begins the journey. Cloud
governance creates guardrails that keep the company on a safe path throughout the journey.
The Cloud Adoption Framework provides governance guides that describe the experiences of fictional companies,
which are based on the experiences of real customers. Each guide follows the customer through the governance
aspects of their cloud adoption.
The Cloud Adoption Framework governance model identifies key areas of importance during the journey. Each
area relates to different types of risks the company must address as it adopts more cloud services. Within this
framework, the governance guide identifies required actions for the cloud governance team. Along the way, each
principle of the Cloud Adoption Framework governance model is described further. Broadly, these include:
Corporate policies: Corporate policies drive cloud governance. The governance guide focuses on specific aspects
of corporate policy:
Business risks: Identifying and understanding corporate risks.
Policy and compliance: Converting risks into policy statements that support any compliance requirements.
Processes: Ensuring adherence to the stated policies.
Five Disciplines of Cloud Governance: These disciplines support the corporate policies. Each discipline protects
the company from potential pitfalls:
Cost Management
Security Baseline
Resource Consistency
Identity Baseline
Deployment Acceleration
Essentially, corporate policies serve as the early warning system to detect potential problems. The disciplines help
the company manage risks and create guardrails.
NOTE
Governance is not a replacement for key functions such as security, networking, identity, finance, DevOps, or operations.
Along the way, there will be interactions with and dependencies on members from each function. Those members should be
included on the cloud governance team to accelerate decisions and actions.
Next steps
Use the Cloud Adoption Framework governance benchmark tool to assess your transformation journey and help
you identify gaps in your organization across six key domains as defined in the framework.
Assess your transformation journey
The Cloud Adoption Framework provides a governance benchmark tool to help you identify gaps in your organization across six
key domains as defined in the framework.
Next steps
Begin your governance journey with a small, easily implemented set of governance tools. This initial governance foundation is
called a minimum viable product (MVP).
Establish an initial governance foundation
Establishing cloud governance is a broad iterative effort. It is challenging to strike an effective balance between speed and control,
especially during early phases of cloud adoption. The governance guidance in the Cloud Adoption Framework helps provide that
balance via an agile approach to adoption.
This article provides two options for establishing an initial foundation for governance. Either option ensures that governance
constraints can be scaled and expanded as the adoption plan is implemented and requirements become more clearly defined. By
default, the initial foundation assumes an isolate-and-control position. It also focuses more on resource organization than on
resource governance. This lightweight starting point is called a minimum viable product (MVP) for governance. The objective of
the MVP is reducing barriers to establishing an initial governance position, and then enabling rapid maturation of the solution to
address a variety of tangible risks.
Next steps
Once a governance foundation is in place, apply suitable recommendations to improve the solution and protect against tangible
risks.
Improve the initial governance foundation
This article assumes that you have established an initial cloud governance foundation. As your cloud adoption plan is
implemented, tangible risks will emerge from the proposed approaches by which teams want to adopt the cloud. As these risks
surface in release planning conversations, use the following grid to quickly identify a few best practices for getting ahead of the
adoption plan to prevent risks from becoming real threats.
Maturity vectors
At any time, the following best practices can be applied to the initial governance foundation to address the risk or need mentioned
in the table below.
IMPORTANT
Resource organization can affect how these best practices are applied. It is important to start with the recommendations that best
align with the initial cloud governance foundation you implemented in the previous step.
Next steps
In addition to the application of best practices, the governance methodology in the Cloud Adoption Framework can be
customized to fit unique business constraints. After following the applicable recommendations, evaluate corporate policy to
understand additional customization requirements.
Evaluate corporate policy
The actionable governance guides in this section illustrate the incremental approach of the Cloud Adoption Framework
governance model, based on the governance methodology previously described. You can establish an agile approach to cloud
governance that will grow to meet the needs of any cloud governance scenario.
A more robust governance starting point may be required. In such cases, consider the Azure Virtual Datacenter approach
briefly described below. This approach is commonly suggested during enterprise-scale adoption efforts, and especially for
efforts which exceed 10,000 assets. It is also the de facto choice for complex governance scenarios when any of the following
are required: extensive third-party compliance requirements, deep domain expertise, or parity with mature IT governance
policies and compliance requirements.
NOTE
It's unlikely that either guide aligns completely to your situation. Choose whichever guide is closest and use it as a starting
point. Throughout the guide, additional information is provided to help you customize decisions to meet specific criteria.
Business characteristics
Geography (country or geopolitical region):
- Standard organization: Customers or staff reside largely in one geography.
- Complex enterprise: Customers or staff reside in multiple geographies or require sovereign clouds.
Business units affected:
- Standard organization: Business units that share a common IT infrastructure.
- Complex enterprise: Multiple business units that do not share a common IT infrastructure.
Datacenter or third-party hosting providers:
- Standard organization: Fewer than five datacenters.
- Complex enterprise: More than five datacenters.
Identity:
- Standard organization: Single forest, single domain.
- Complex enterprise: Complex, multiple forests, multiple domains.
Cost Management – cloud accounting:
- Standard organization: Showback model. Billing is centralized through IT.
- Complex enterprise: Chargeback model. Billing could be distributed through IT procurement.
Security Baseline – protected data:
- Standard organization: Company financial data and IP. Limited customer data. No third-party compliance requirements.
- Complex enterprise: Multiple collections of customers' financial and personal data. May need to consider third-party compliance.
Next steps
Choose one of these guides:
Standard enterprise governance guide
Governance guide for complex enterprises
Standard enterprise governance guide
WARNING
This MVP is a baseline starting point, based on a set of assumptions. Even this minimal set of best practices is based on
corporate policies driven by unique business risks and risk tolerances. To see if these assumptions apply to you, read the
longer narrative that follows this article.
Every application should be deployed in the proper area of the management group, subscription, and resource
group hierarchy. During deployment planning, the cloud governance team will create the necessary nodes in the
hierarchy to empower the cloud adoption teams.
1. One management group for each type of environment (such as production, development, and test).
2. Two subscriptions, one for production workloads and another for nonproduction workloads.
3. Consistent nomenclature should be applied at each level of this grouping hierarchy.
4. Resource groups should be deployed in a manner that considers the lifecycle of their contents: everything that is
developed together, managed together, and retired together belongs in the same resource group. For more
information, see the Azure resource group best practices guidance.
5. Region selection is critically important and must be planned so that networking, monitoring, and auditing can
be in place for failover and failback, and to confirm that the needed SKUs are available in the preferred
regions.
These patterns provide room for growth without complicating the hierarchy unnecessarily.
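As a rough illustration, the grouping pattern above can be sketched in code. This is a minimal, hypothetical model of the hierarchy and naming convention, not a definitive implementation; all names (`mg-production`, `retail`, `webstore`, and so on) are invented examples.

```python
# Hypothetical sketch of the management group / subscription / resource group
# hierarchy described above. Names and structure are illustrative only.

def resource_group_name(business_unit: str, app: str, environment: str) -> str:
    """Apply one consistent nomenclature at every level of the hierarchy."""
    return f"rg-{business_unit}-{app}-{environment}".lower()

# One management group per environment type, with a production subscription
# and a nonproduction subscription beneath them.
hierarchy = {
    "mg-production": {
        "sub-production": [resource_group_name("retail", "webstore", "prod")],
    },
    "mg-nonproduction": {
        "sub-nonproduction": [
            resource_group_name("retail", "webstore", "dev"),
            resource_group_name("retail", "webstore", "test"),
        ],
    },
}
```

In practice the same structure would be created with Azure management groups and subscriptions; the sketch only shows how a consistent naming scheme keeps the hierarchy predictable as it grows.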
NOTE
In the event of changes to your business requirements, Azure management groups allow you to easily reorganize your
management hierarchy and subscription group assignments. However, keep in mind that policy and role assignments
applied to a management group are inherited by all subscriptions underneath that group in the hierarchy. If you plan to
reassign subscriptions between management groups, make sure that you are aware of any policy and role assignment
changes that may result. See the Azure management groups documentation for more information.
Governance of resources
A set of global policies and RBAC roles will provide a baseline level of governance enforcement. To meet the
cloud governance team's policy requirements, implementing the governance MVP requires completing the
following tasks:
1. Identify the Azure Policy definitions needed to enforce business requirements. This can include using built-in
definitions and creating new custom definitions.
2. Create a blueprint definition using these built-in and custom policy definitions and the role assignments
required by the governance MVP.
3. Apply policies and configuration globally by assigning the blueprint definition to all subscriptions.
Identify policy definitions
Azure provides several built-in policies and role definitions that you can assign to any management group,
subscription, or resource group. Many common governance requirements can be handled using built-in
definitions. However, it's likely that you will also need to create custom policy definitions to handle your specific
requirements.
Custom policy definitions are saved to either a management group or a subscription and are inherited through
the management group hierarchy. If a policy definition's save location is a management group, that policy
definition is available to assign to any of that group's child management groups or subscriptions.
Since the policies required to support the governance MVP are meant to apply to all current subscriptions, the
following business requirements will be implemented using a combination of built-in definitions and custom
definitions created in the root management group:
1. Restrict the list of available role assignments to a set of built-in Azure roles authorized by your cloud
governance team. This requires a custom policy definition.
2. Require the following tags on all resources: Department/Billing Unit, Geography, Data Classification,
Criticality, SLA, Environment, Application Archetype, Application, and Application Owner. This can be handled
using the Require specified tag built-in definition.
3. Require that the Application tag for resources should match the name of the relevant resource group. This
can be handled using the "Require tag and its value" built-in definition.
For information on defining custom policies see the Azure Policy documentation. For guidance and examples of
custom policies, consult the Azure Policy samples site and the associated GitHub repository.
Assign Azure Policy and RBAC roles using Azure Blueprints
Azure policies can be assigned at the resource group, subscription, and management group level, and can be
included in Azure Blueprints definitions. Although the policy requirements defined in this governance MVP apply
to all current subscriptions, it's very likely that future deployments will require exceptions or alternative policies.
As a result, assigning policy using management groups, with all child subscriptions inheriting these assignments,
may not be flexible enough to support these scenarios.
Azure Blueprints allow the consistent assignment of policy and roles, application of Resource Manager templates,
and deployment of resource groups across multiple subscriptions. As with policy definitions, blueprint definitions
are saved to management groups or subscriptions, and are available through inheritance to any children in the
management group hierarchy.
The cloud governance team has decided that enforcement of required Azure Policy and RBAC assignments
across subscriptions will be implemented through Azure Blueprints and associated artifacts:
1. In the root management group, create a blueprint definition named governance-baseline .
2. Add the following blueprint artifacts to the blueprint definition:
a. Policy assignments for the custom Azure Policy definitions defined at the management group root.
b. Resource group definitions for any groups required in subscriptions created or governed by the
Governance MVP.
c. Standard role assignments required in subscriptions created or governed by the Governance MVP.
3. Publish the blueprint definition.
4. Assign the governance-baseline blueprint definition to all subscriptions.
See the Azure Blueprints documentation for more information on creating and using blueprint definitions.
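Conceptually, a blueprint definition bundles policy assignments, resource groups, and role assignments into one named unit that is applied uniformly across subscriptions. The sketch below models that idea only; artifact names are invented and the real mechanism is Azure Blueprints itself.

```python
# Conceptual model of the governance-baseline blueprint described above:
# a named bundle of artifacts applied consistently to many subscriptions.
# Artifact names are hypothetical; this is not the Azure Blueprints API.

blueprint = {
    "name": "governance-baseline",
    "artifacts": {
        "policy_assignments": ["restrict-role-assignments", "require-specified-tags"],
        "resource_groups": ["rg-network", "rg-shared-services"],
        "role_assignments": [("Cloud Governance Team", "Contributor")],
    },
}

def assign_blueprint(blueprint: dict, subscriptions: list) -> dict:
    """Record which blueprint governs each target subscription."""
    return {sub: blueprint["name"] for sub in subscriptions}
```

The design point this illustrates: because the bundle is assigned per subscription rather than inherited from a management group, future subscriptions can receive exceptions or alternative bundles without restructuring the hierarchy.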
Secure hybrid VNet
Specific subscriptions often require some level of access to on-premises resources. This is common in migration
scenarios or dev scenarios where dependent resources reside in the on-premises datacenter.
Until trust in the cloud environment is fully established, it's important to tightly control and monitor any allowed
communication between the on-premises environment and cloud workloads, and to ensure that the on-premises
network is secured against potential unauthorized access from cloud-based resources. To support these scenarios, the
governance MVP adds the following best practices:
1. Establish a cloud secure hybrid VNet.
a. The VPN reference architecture establishes a pattern and deployment model for creating a VPN
Gateway in Azure.
b. Validate that on-premises security and traffic management mechanisms treat connected cloud
networks as untrusted. Resources and services hosted in the cloud should only have access to
authorized on-premises services.
c. Validate that the local edge device in the on-premises datacenter is compatible with Azure VPN
Gateway requirements and is configured to access the public internet.
d. Note that VPN tunnels should not be considered production-ready circuits for anything but the
simplest workloads. Anything beyond a few simple workloads requiring on-premises connectivity should
use Azure ExpressRoute.
2. In the root management group, create a second blueprint definition named secure-hybrid-vnet .
a. Add the Resource Manager template for the VPN Gateway as an artifact to the blueprint definition.
b. Add the Resource Manager template for the virtual network as an artifact to the blueprint definition.
c. Publish the blueprint definition.
3. Assign the secure-hybrid-vnet blueprint definition to any subscriptions requiring on-premises connectivity.
This definition should be assigned in addition to the governance-baseline blueprint definition.
One of the biggest concerns raised by IT security and traditional governance teams is the risk that early stage
cloud adoption will compromise existing assets. The above approach allows cloud adoption teams to build and
migrate hybrid solutions, with reduced risk to on-premises assets. As trust in the cloud environment increases,
later evolutions may remove this temporary solution.
NOTE
The above is a starting point to quickly create a baseline governance MVP. This is only the beginning of the governance
journey. Further evolution will be needed as the company continues to adopt the cloud and takes on more risk in the
following areas:
Mission-critical workloads
Protected data
Cost management
Multicloud scenarios
Moreover, the specific details of this MVP are based on the example journey of a fictional company, described in the articles
that follow. We highly recommend becoming familiar with the other articles in this series before implementing this best
practice.
Next steps
Now that you're familiar with the governance MVP and have an idea of the governance improvements to follow,
read the supporting narrative for additional context.
Read the supporting narrative
Standard enterprise governance guide: The narrative
behind the governance strategy
The following narrative describes the use case for governance during a standard enterprise's cloud adoption
journey. Before implementing the journey, it's important to understand the assumptions and rationale that are
reflected in this narrative. Then you can better align the governance strategy to your own organization's journey.
Back story
The board of directors started the year with plans to energize the business in several ways. They are pushing
leadership to improve customer experiences to gain market share. They are also pushing for new products and
services that will position the company as a thought leader in the industry. They also initiated a parallel effort to
reduce waste and cut unnecessary costs. Though intimidating, the actions of the board and leadership show that
this effort is focusing as much capital as possible on future growth.
In the past, the company's CIO has been excluded from these strategic conversations. However, because the future
vision is intrinsically linked to technical growth, IT has a seat at the table to help guide these big plans. IT is now
expected to deliver in new ways. The team isn't prepared for these changes and is likely to struggle with the
learning curve.
Business characteristics
The company has the following business profile:
All sales and operations reside in a single country, with a low percentage of global customers.
The business operates as a single business unit, with budget aligned to functions, including Sales, Marketing,
Operations, and IT.
The business views most of IT as a capital drain or a cost center.
Current state
Here is the current state of the company's IT and cloud operations:
IT operates two hosted infrastructure environments. One environment contains production assets. The second
environment contains disaster recovery and some dev/test assets. These environments are hosted by two
different providers. IT refers to these two datacenters as Prod and DR respectively.
IT entered the cloud by migrating all end-user email accounts to Office 365. This migration was completed six
months ago. Few other IT assets have been deployed to the cloud.
The application development teams are working in a dev/test capacity to learn about cloud-native capabilities.
The business intelligence (BI) team is experimenting with big data in the cloud and curation of data on new
platforms.
The company has a loosely defined policy stating that personal customer data and financial data cannot be
hosted in the cloud, which limits mission-critical applications in the current deployments.
IT investments are controlled largely by capital expense. Those investments are planned yearly. In the past
several years, investments have included little more than basic maintenance requirements.
Future state
The following changes are anticipated over the next several years:
The CIO is reviewing the policy on personal data and financial data to allow for the future state goals.
The application development and BI teams want to release cloud-based solutions to production over the next
24 months based on the vision for customer engagement and new products.
This year, the IT team will finish retiring the disaster recovery workloads of the DR datacenter by migrating
2,000 VMs to the cloud. This is expected to produce an estimated $25M USD cost savings over the next five
years.
The company plans to change how it makes IT investments by repositioning the committed capital expense as
an operating expense within IT. This change will provide greater cost control and enable IT to accelerate other
planned efforts.
Next steps
The company has developed a corporate policy to shape the governance implementation. The corporate policy
drives many of the technical decisions.
Review the initial corporate policy
Standard enterprise governance guide: Initial
corporate policy behind the governance strategy
The following corporate policy defines an initial governance position, which is the starting point for this guide. This
article defines early-stage risks, initial policy statements, and early processes to enforce policy statements.
NOTE
The corporate policy is not a technical document, but it drives many technical decisions. The governance MVP described in
the overview ultimately derives from this policy. Before implementing a governance MVP, your organization should develop a
corporate policy based on your own objectives and business risks.
Objective
The initial objective is to establish a foundation for governance agility. An effective Governance MVP allows the
governance team to stay ahead of cloud adoption and implement guardrails as the adoption plan changes.
Business risks
The company is at an early stage of cloud adoption, experimenting and building proofs of concept. Risks are now
relatively low, but future risks are likely to have a significant impact. There is little definition around the final state
of the technical solutions to be deployed to the cloud. In addition, the cloud readiness of IT employees is low. A
foundation for cloud adoption will help the team safely learn and grow.
Future-proofing: There is a risk of not empowering growth, but also a risk of not providing the right protections
against future risks.
An agile yet robust governance approach is needed to support the board's vision for corporate and technical
growth. Failure to implement such a strategy will slow technical growth, potentially risking current and future
market share growth. The impact of such a business risk is unquestionably high. However, the role IT will play in
those potential future states is unknown, making the risk associated with current IT efforts relatively high. That
said, until more concrete plans are aligned, the business has a high tolerance for risk.
This business risk can be broken down tactically into several technical risks:
Well-intended corporate policies could slow transformation efforts or break critical business processes, if not
considered within a structured approval flow.
The application of governance to deployed assets could be difficult and costly.
Governance may not be properly applied across an application or workload, creating gaps in security.
With so many teams working in the cloud, there is a risk of inconsistency.
Costs may not properly align to business units, teams, or other budgetary management units.
The use of multiple identities to manage various deployments could lead to security issues.
Despite current policies, there is a risk that protected data could be mistakenly deployed to the cloud.
Tolerance indicators
The current tolerance for risk is high and the appetite for investing in cloud governance is low. As such, the
tolerance indicators act as an early warning system to trigger more investment of time and energy. If and when the
following indicators are observed, you should iteratively improve the governance strategy.
Cost Management: The scale of deployment exceeds predetermined limits on number of resources or
monthly cost.
Security Baseline: Inclusion of protected data in defined cloud adoption plans.
Resource Consistency: Inclusion of any mission-critical applications in defined cloud adoption plans.
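The tolerance indicators above can be read as a simple early-warning check. The following sketch assumes the thresholds the narrative uses later (100 assets or $1,000 of monthly spending as the Cost Management trigger); treat the thresholds and function as illustrative, since each organization sets its own limits.

```python
# Hedged sketch of the tolerance indicators as an early-warning check.
# Thresholds are the example values from this guide's narrative, not a
# prescribed standard.

def triggered_indicators(resource_count: int, monthly_cost_usd: float,
                         protected_data_planned: bool,
                         mission_critical_planned: bool) -> list:
    """Return the governance disciplines whose indicators have tripped."""
    triggered = []
    if resource_count > 100 or monthly_cost_usd > 1000:
        triggered.append("Cost Management")
    if protected_data_planned:
        triggered.append("Security Baseline")
    if mission_critical_planned:
        triggered.append("Resource Consistency")
    return triggered
```

When any discipline appears in the result, that is the signal to invest further governance effort in it, as described in the improvement articles later in this guide.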
Policy statements
The following policy statements establish the requirements needed to remediate the defined risks. These policies
define the functional requirements for the governance MVP. Each will be represented in the implementation of the
governance MVP.
Cost Management:
For tracking purposes, all assets must be assigned to an application owner within one of the core business
functions.
When cost concerns arise, additional governance requirements will be established with the finance team.
Security Baseline:
Any asset deployed to the cloud must have an approved data classification.
No assets identified with a protected level of data may be deployed to the cloud, until sufficient requirements
for security and governance can be approved and implemented.
Until minimum network security requirements can be validated and governed, cloud environments are seen as
a demilitarized zone and should meet similar connection requirements to other datacenters or internal
networks.
Resource Consistency:
Because no mission-critical workloads are deployed at this stage, there are no SLA, performance, or BCDR
requirements to be governed.
When mission-critical workloads are deployed, additional governance requirements will be established with IT
operations.
Identity Baseline:
All assets deployed to the cloud should be controlled using identities and roles approved by current governance
policies.
All groups in the on-premises Active Directory infrastructure that have elevated privileges should be mapped to
an approved RBAC role.
Deployment Acceleration:
All assets must be grouped and tagged according to defined grouping and tagging strategies.
All assets must use an approved deployment model.
Once a governance foundation has been established for a cloud provider, any deployment tooling must be
compatible with the tools defined by the governance team.
Processes
No budget has been allocated for ongoing monitoring and enforcement of these governance policies. Because of
that, the cloud governance team has some ad hoc ways to monitor adherence to policy statements.
Education: The cloud governance team is investing time to educate the cloud adoption teams on the
governance guides that support these policies.
Deployment reviews: Before deploying any asset, the cloud governance team will review the governance
guide with the cloud adoption teams.
Next steps
This corporate policy prepares the cloud governance team to implement the governance MVP, which will be the
foundation for adoption. The next step is to implement this MVP.
Best practices explained
Standard enterprise governance guide: Best practices
explained
The governance guide starts with a set of initial corporate policies. These policies are used to establish a
governance MVP that reflects best practices.
In this article, we discuss the high-level strategies that are required to create a governance MVP. The core of the
governance MVP is the Deployment Acceleration discipline. The tools and patterns applied at this stage will enable
the incremental improvements needed to expand governance in the future.
Implementation process
The implementation of the governance MVP has dependencies on Identity, Security, and Networking. Once the
dependencies are resolved, the cloud governance team will decide on a few aspects of governance. The decisions from
the cloud governance team and from supporting teams will be implemented through a single package of
enforcement assets.
This implementation can also be described using a simple checklist:
1. Solicit decisions regarding core dependencies: Identity, Networking, Monitoring, and Encryption.
2. Determine the pattern to be used during corporate policy enforcement.
3. Determine the appropriate governance patterns for the Resource Consistency, Resource Tagging, and Logging
and Reporting disciplines.
4. Implement the governance tools aligned to the chosen policy enforcement pattern to apply the dependent
decisions and governance decisions.
Dependent decisions
The following decisions come from teams outside of the cloud governance team. The implementation of each will
come from those same teams. However, the cloud governance team is responsible for implementing a solution to
validate that those implementations are consistently applied.
Identity Baseline
Identity Baseline is the fundamental starting point for all governance. Before attempting to apply governance,
identity must be established. The established identity strategy will then be enforced by the governance solutions. In
this governance guide, the Identity Management team implements the Directory Synchronization pattern:
RBAC will be provided by Azure Active Directory (Azure AD), using the directory synchronization or "Same
Sign-On" approach that was implemented during the company's migration to Office 365. For implementation
guidance, see Reference Architecture for Azure AD Integration.
The Azure AD tenant will also govern authentication and access for assets deployed to Azure.
In the governance MVP, the governance team will enforce application of the replicated tenant through subscription
governance tooling, discussed later in this article. In future iterations, the governance team could also enforce rich
tooling in Azure AD to extend this capability.
Security Baseline: Networking
Software Defined Networking is an important initial aspect of the Security Baseline. Establishing the governance
MVP depends on early decisions from the Security Management team to define how networks can be safely
configured.
Given the lack of requirements, IT security is playing it safe and requires a Cloud DMZ Pattern. That means
governance of the Azure deployments themselves will be very light.
Azure subscriptions may connect to an existing datacenter via VPN, but must follow all existing on-premises IT
governance policies regarding connection of a demilitarized zone to protected resources. For implementation
guidance regarding VPN connectivity, see VPN Reference Architecture.
Decisions regarding subnet, firewall, and routing are currently being deferred to each application/workload
lead.
Additional analysis is required before releasing of any protected data or mission-critical workloads.
In this pattern, cloud networks can only connect to on-premises resources over an existing VPN that is compatible
with Azure. Traffic over that connection will be treated like any traffic coming from a demilitarized zone. Additional
considerations may be required on the on-premises edge device to securely handle traffic from Azure.
The cloud governance team has proactively invited members of the networking and IT security teams to regular
meetings, in order to stay ahead of networking demands and risks.
Security Baseline: Encryption
Encryption is another fundamental decision within the Security Baseline discipline. Because the company currently
does not yet store any protected data in the cloud, the Security Team has decided on a less aggressive pattern for
encryption. At this point, a cloud-native pattern for encryption is suggested but not required of any
development team.
No governance requirements have been set regarding the use of encryption, because the current corporate
policy does not permit mission-critical or protected data in the cloud.
Additional analysis will be required before releasing any protected data or mission-critical workloads.
Policy enforcement
The first decision to make regarding Deployment Acceleration is the pattern for enforcement. In this narrative, the
governance team decided to implement the Automated Enforcement pattern.
Azure Security Center will be made available to the security and identity teams to monitor security risks. Both
teams are also likely to use Security Center to identify new risks and improve corporate policy.
RBAC is required in all subscriptions to govern authentication enforcement.
Azure Policy will be published to each management group and applied to all subscriptions. However, the level
of policies being enforced will be very limited in this initial Governance MVP.
Although Azure management groups are being used, a relatively simple hierarchy is expected.
Azure Blueprints will be used to deploy and update subscriptions by applying RBAC requirements, Resource
Manager Templates, and Azure Policy across management groups.
IMPORTANT
Any time a resource in a resource group no longer shares the same lifecycle, it should be moved to another resource group.
Examples include common databases and networking components. While they may serve the application being developed,
they may also serve other purposes and should therefore exist in other resource groups.
Resource tagging
Resource tagging decisions determine how metadata is applied to Azure resources within a subscription to
support operations, management, and accounting purposes. In this narrative, the Classification pattern has been
chosen as the default model for resource tagging.
Deployed assets should be tagged with:
Data Classification
Criticality
SLA
Environment
These four values will drive governance, operations, and security decisions.
If this governance guide is being implemented for a business unit or team within a larger corporation, tagging
should also include metadata for the billing unit.
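A light validation of the Classification pattern can be sketched as follows. The allowed values here are hypothetical examples rather than a prescribed taxonomy; each organization defines its own classification levels.

```python
# Illustrative check for the Classification tagging pattern: the four tag
# values that drive governance, operations, and security decisions.
# Allowed values are hypothetical examples, not a prescribed taxonomy.

ALLOWED_VALUES = {
    "DataClassification": {"Public", "General", "Confidential", "Highly Confidential"},
    "Criticality": {"Low", "Medium", "High", "Mission-critical"},
    "SLA": {"99.9", "99.99"},
    "Environment": {"Production", "Dev", "Test"},
}

def classification_violations(tags: dict) -> list:
    """List classification tags that are missing or outside the allowed values."""
    return [name for name, allowed in ALLOWED_VALUES.items()
            if tags.get(name) not in allowed]
```

An asset whose tags produce an empty violation list carries all four classification values and can be routed into the correct operational and security processes.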
Logging and reporting
Logging and reporting decisions determine how you store log data and how the monitoring and reporting tools
that keep IT staff informed on operational health are structured. In this narrative, a cloud-native pattern for
logging and reporting is suggested.
Alternative patterns
If any of the patterns selected in this governance guide don't align with the reader's requirements, alternatives to
each pattern are available:
Encryption patterns
Identity patterns
Logging and Reporting patterns
Policy Enforcement patterns
Resource Consistency patterns
Resource Tagging patterns
Software Defined Networking patterns
Subscription Design patterns
Next steps
Once this guide is implemented, each cloud adoption team can go forth with a sound governance foundation. At
the same time, the cloud governance team will work to continuously update the corporate policies and governance
disciplines.
The two teams will use the tolerance indicators to identify the next set of improvements needed to continue
supporting cloud adoption. For the fictional company in this guide, the next step is improving the Security Baseline
to support moving protected data to the cloud.
Improve the Security Baseline discipline
Standard enterprise governance guide: Improve the
Security Baseline discipline
This article advances the narrative by adding security controls that support moving protected data to the cloud.
Conclusion
Adding the above processes and changes to the governance MVP will help to remediate many of the risks
associated with security governance. Together, they add the network, identity, and security monitoring tools
needed to protect data.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs also
change. For the fictional company in this guide, the next step is to support mission-critical workloads. This is the
point when Resource Consistency controls are needed.
Improving Resource Consistency
Standard enterprise governance guide: Improving
Resource Consistency
This article advances the narrative by adding Resource Consistency controls to support mission-critical apps.
Conclusion
These additional processes and changes to the governance MVP help remediate many of the risks associated with
resource governance. Together they add recovery, sizing, and monitoring controls that empower cloud-aware
operations.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs will also
change. For the fictional company in this guide, the next trigger occurs when the scale of deployment exceeds 100
assets in the cloud or monthly spending exceeds $1,000 USD. At this point, the cloud governance team adds
Cost Management controls.
Improving Cost Management
Standard enterprise governance guide: Improve the Cost
Management discipline
This article advances the narrative by adding cost controls to the governance MVP.
Conclusion
Adding these processes and changes to the governance MVP helps remediate many of the risks associated with
cost governance. Together, they create the visibility, accountability, and optimization needed to control costs.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs will also
change. For the fictional company in this guide, the next step is using this governance investment to manage
multiple clouds.
Multicloud evolution
Standard enterprise governance guide: Multicloud
improvement
This article advances the narrative by adding controls for multicloud adoption.
Conclusion
This series of articles described the incremental development of governance best practices, aligned with the
experiences of this fictional company. By starting small, but with the right foundation, the company could move
quickly and yet still apply the right amount of governance at the right time. The MVP by itself did not protect the
customer. Instead, it created the foundation to manage risks and add protections. From there, layers of governance
were applied to remediate tangible risks. The exact journey presented here won't align 100% with the experiences
of any reader. Rather, it serves as a pattern for incremental governance. You should mold these best practices to fit
your own unique constraints and governance requirements.
Governance guide for complex enterprises
WARNING
This MVP is a baseline starting point, based on a set of assumptions. Even this minimal set of best practices is based on
corporate policies driven by unique business risks and risk tolerances. To see if these assumptions apply to you, read the
longer narrative that follows this article.
Every application should be deployed in the proper area of the management group, subscription, and resource
group hierarchy. During deployment planning, the cloud governance team will create the necessary nodes in the
hierarchy to empower the cloud adoption teams.
1. Define a management group for each business unit with a detailed hierarchy that reflects geography first,
then environment type (for example, production or nonproduction environments).
2. Create a production subscription and a nonproduction subscription for each unique combination of discrete
business unit or geography. Creating multiple subscriptions requires careful consideration. For more
information, see the Subscription decision guide.
3. Apply consistent nomenclature at each level of this grouping hierarchy.
4. Resource groups should be deployed in a manner that considers the lifecycle of their contents. Resources that are
developed together, managed together, and retired together belong in the same resource group. For more
information on best practices for using resource groups, see here.
5. Region selection is critically important and must be planned so that networking, monitoring, and auditing are in
place for failover and failback, and so that the needed SKUs are confirmed to be available in the preferred
regions.
These patterns provide room for growth without making the hierarchy needlessly complicated.
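As a sketch of steps 1 through 3, the following helpers derive hierarchy node names and subscription names for a business unit and geography pair. The hyphen separator, lowercase convention, and prod/nonprod suffixes are assumptions for illustration; the point is only that nomenclature stays consistent at every level:

```python
def management_group_path(business_unit, geography, environment):
    """Build the hierarchy nodes for one branch: business unit first,
    then geography, then environment type, per steps 1 and 3 above."""
    return [
        business_unit,
        f"{business_unit}-{geography}",
        f"{business_unit}-{geography}-{environment}",
    ]

def subscription_names(business_unit, geography):
    """One production and one nonproduction subscription per unique
    business unit and geography combination, per step 2 above."""
    return [
        f"{business_unit}-{geography}-prod",
        f"{business_unit}-{geography}-nonprod",
    ]

print(management_group_path("retail", "emea", "prod"))
print(subscription_names("retail", "emea"))
```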
NOTE
In the event of changes to your business requirements, Azure management groups allow you to easily reorganize your
management hierarchy and subscription group assignments. However, keep in mind that policy and role assignments
applied to a management group are inherited by all subscriptions underneath that group in the hierarchy. If you plan to
reassign subscriptions between management groups, make sure that you are aware of any policy and role assignment
changes that may result. See the Azure management groups documentation for more information.
Governance of resources
A set of global policies and RBAC roles will provide a baseline level of governance enforcement. To meet the
cloud governance team's policy requirements, implementing the governance MVP requires completing the
following tasks:
1. Identify the Azure Policy definitions needed to enforce business requirements. This can include using built-in
definitions and creating new custom definitions.
2. Create a blueprint definition using these built-in and custom policy definitions, along with the role assignments
required by the governance MVP.
3. Apply policies and configuration globally by assigning the blueprint definition to all subscriptions.
Identify policy definitions
Azure provides several built-in policies and role definitions that you can assign to any management group,
subscription, or resource group. Many common governance requirements can be handled using built-in
definitions. However, it's likely that you will also need to create custom policy definitions to handle your specific
requirements.
Custom policy definitions are saved to either a management group or a subscription and are inherited through
the management group hierarchy. If a policy definition's save location is a management group, that policy
definition is available to assign to any of that group's child management groups or subscriptions.
Since the policies required to support the governance MVP are meant to apply to all current subscriptions, the
following business requirements will be implemented using a combination of built-in definitions and custom
definitions created in the root management group:
1. Restrict the list of available role assignments to a set of built-in Azure roles authorized by your cloud
governance team. This requires a custom policy definition.
2. Require the following tags on all resources: Department/Billing Unit, Geography, Data Classification,
Criticality, SLA, Environment, Application Archetype, Application, and Application Owner. This can be
handled using the Require specified tag built-in definition.
3. Require that the Application tag for resources should match the name of the relevant resource group. This
can be handled using the "Require tag and its value" built-in definition.
For information on defining custom policies see the Azure Policy documentation. For guidance and examples of
custom policies, consult the Azure Policy samples site and the associated GitHub repository.
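As an illustration of what a custom definition's policy rule looks like, the following builds the common deny-if-tag-missing rule in the Azure Policy if/then shape. Treat this as a hedged sketch rather than the built-in definition itself: the built-in Require specified tag definition takes the tag name as a parameter instead of hard-coding it as shown here.

```python
import json

def require_tag_policy(tag_name):
    """Return a policy rule that denies deployment when tag_name is absent.

    Illustrative sketch of the deny-if-missing-tag pattern; field names
    follow the Azure Policy if/then rule structure.
    """
    return {
        "if": {"field": f"tags['{tag_name}']", "exists": "false"},
        "then": {"effect": "deny"},
    }

# One rule per tag required by the governance MVP (abbreviated list).
rules = [require_tag_policy(t) for t in ("Department", "Geography", "DataClassification")]
print(json.dumps(rules[0], indent=2))
```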
Assign Azure Policy and RBAC roles using Azure Blueprints
Azure policies can be assigned at the resource group, subscription, and management group level, and can be
included in Azure Blueprints definitions. Although the policy requirements defined in this governance MVP apply
to all current subscriptions, it's very likely that future deployments will require exceptions or alternative policies.
As a result, assigning policy using management groups, with all child subscriptions inheriting these assignments,
may not be flexible enough to support these scenarios.
Azure Blueprints allow the consistent assignment of policy and roles, application of Resource Manager
templates, and deployment of resource groups across multiple subscriptions. As with policy definitions, blueprint
definitions are saved to management groups or subscriptions, and are available through inheritance to any
children in the management group hierarchy.
The cloud governance team has decided that enforcement of required Azure Policy and RBAC assignments
across subscriptions will be implemented through Azure Blueprints and associated artifacts:
1. In the root management group, create a blueprint definition named governance-baseline .
2. Add the following blueprint artifacts to the blueprint definition:
a. Policy assignments for the custom Azure Policy definitions defined at the management group root.
b. Resource group definitions for any groups required in subscriptions created or governed by the
Governance MVP.
c. Standard role assignments required in subscriptions created or governed by the Governance MVP.
3. Publish the blueprint definition.
4. Assign the governance-baseline blueprint definition to all subscriptions.
See the Azure Blueprints documentation for more information on creating and using blueprint definitions.
Secure hybrid VNet
Specific subscriptions often require some level of access to on-premises resources. This is common in migration
scenarios or dev scenarios where dependent resources reside in the on-premises datacenter.
Until trust in the cloud environment is fully established, it's important to tightly control and monitor any allowed
communication between the on-premises environment and cloud workloads, and to ensure that the on-premises
network is secured against potential unauthorized access from cloud-based resources. To support these scenarios, the
governance MVP adds the following best practices:
1. Establish a cloud secure hybrid VNet.
a. The VPN reference architecture establishes a pattern and deployment model for creating a VPN
Gateway in Azure.
b. Validate that on-premises security and traffic management mechanisms treat connected cloud
networks as untrusted. Resources and services hosted in the cloud should only have access to
authorized on-premises services.
c. Validate that the local edge device in the on-premises datacenter is compatible with Azure VPN
Gateway requirements and is configured to access the public internet.
d. Note that VPN tunnels should not be considered production-ready circuits for anything but the
simplest workloads. Anything beyond a few simple workloads requiring on-premises connectivity should
use Azure ExpressRoute.
2. In the root management group, create a second blueprint definition named secure-hybrid-vnet .
a. Add the Resource Manager template for the VPN Gateway as an artifact to the blueprint definition.
b. Add the Resource Manager template for the virtual network as an artifact to the blueprint definition.
c. Publish the blueprint definition.
3. Assign the secure-hybrid-vnet blueprint definition to any subscriptions requiring on-premises connectivity.
This definition should be assigned in addition to the governance-baseline blueprint definition.
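The assignment rule above (governance-baseline for every subscription, secure-hybrid-vnet only where on-premises connectivity is needed) can be sketched as a small helper. The function name is an assumption for the example; the blueprint names come from this guide:

```python
def blueprints_for(needs_onprem_connectivity):
    """Return the blueprint definitions to assign to a subscription.

    governance-baseline applies to all subscriptions; secure-hybrid-vnet
    is added only where on-premises connectivity is required.
    """
    assigned = ["governance-baseline"]
    if needs_onprem_connectivity:
        assigned.append("secure-hybrid-vnet")
    return assigned

print(blueprints_for(True))   # ['governance-baseline', 'secure-hybrid-vnet']
print(blueprints_for(False))  # ['governance-baseline']
```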
One of the biggest concerns raised by IT security and traditional governance teams is the risk that early stage
cloud adoption will compromise existing assets. The above approach allows cloud adoption teams to build and
migrate hybrid solutions, with reduced risk to on-premises assets. As trust in the cloud environment increases,
later evolutions may remove this temporary solution.
NOTE
The above is a starting point to quickly create a baseline governance MVP. This is only the beginning of the governance
journey. Further evolution will be needed as the company continues to adopt the cloud and takes on more risk in the
following areas:
Mission-critical workloads
Protected data
Cost management
Multicloud scenarios
Moreover, the specific details of this MVP are based on the example journey of a fictional company, described in the articles
that follow. We highly recommend becoming familiar with the other articles in this series before implementing this best
practice.
Next steps
Now that you're familiar with the governance MVP and the forthcoming governance changes, read the
supporting narrative for additional context.
Read the supporting narrative
Governance guide for complex enterprises: The
supporting narrative
The following narrative establishes a use case for governance during a complex enterprise's cloud adoption journey.
Before acting on the recommendations in the guide, it's important to understand the assumptions and reasoning
that are reflected in this narrative. Then you can better align the governance strategy to your own organization's
cloud adoption journey.
Back story
Customers are demanding a better experience when interacting with this company. The current experience caused
market erosion and led the board to hire a Chief Digital Officer (CDO). The CDO is working with marketing and
sales to drive a digital transformation that will power improved experiences. Additionally, several business units
recently hired data scientists to farm data and improve many of the manual experiences through learning and
prediction. IT is supporting these efforts where it can. However, there are "shadow IT" activities occurring that fall
outside of needed governance and security controls.
The IT organization is also facing its own challenges. Finance is planning continued reductions in the IT budget
over the next five years, leading to some necessary spending cuts starting this year. Conversely, GDPR and other
data sovereignty requirements are forcing IT to invest in assets in additional countries to localize data. Two of the
existing datacenters are overdue for hardware refreshes, causing further problems with employee and customer
satisfaction. Three more datacenters require hardware refreshes during the execution of the five-year plan. The
CFO is pushing the CIO to consider the cloud as an alternative for those datacenters, to free up capital expenses.
The CIO has innovative ideas that could help the company, but she and her teams are limited to fighting fires and
controlling costs. At a luncheon with the CDO and one of the business unit leaders, the cloud migration
conversation generated interest from the CIO's peers. The three leaders aim to support each other using the cloud
to achieve their business objectives, and they have begun the exploration and planning phases of cloud adoption.
Business characteristics
The company has the following business profile:
Sales and operations span multiple geographic areas with global customers in multiple markets.
The business grew through acquisition and operates across three business units based on the target customer
base. Budgeting is a complex matrix across business units and functions.
The business views most of IT as a capital drain or a cost center.
Current state
Here is the current state of the company's IT and cloud operations:
IT operates more than 20 privately owned datacenters around the globe.
Due to organic growth and multiple geographies, there are a few IT teams that have unique data sovereignty
and compliance requirements that impact a single business unit operating within a specific geography.
Each datacenter is connected by a series of regional leased lines, creating a loosely coupled global WAN.
IT entered the cloud by migrating all end-user email accounts to Office 365. This migration was completed
more than six months ago. Since then, only a few IT assets have been deployed to the cloud.
The CDO's primary development team is working in a dev/test capacity to learn about cloud-native capabilities.
One business unit is experimenting with big data in the cloud. The BI team inside of IT is participating in that
effort.
The existing IT governance policy states that personal customer data and financial data must be hosted on
assets owned directly by the company. This policy blocks cloud adoption for any mission-critical apps or
protected data.
IT investments are controlled largely by capital expense. Those investments are planned yearly and often
include plans for ongoing maintenance, as well as established refresh cycles of three to five years depending on
the datacenter.
Most investments in technology that don't align to the annual plan are addressed by shadow IT efforts. Those
efforts are usually managed by business units and funded through the business unit's operating expenses.
Future state
The following changes are anticipated over the next several years:
The CIO is leading an effort to modernize the policy on personal and financial data to support future goals.
Two members of the IT Governance team have visibility into this effort.
The CIO wants to use the cloud migration as a forcing function to improve consistency and stability across
business units and geographies. However, the future state must respect any external compliance requirements
which would require deviation from standard approaches by specific IT teams.
If the early experiments in App Dev and BI show leading indicators of success, they would each like to release
small-scale production solutions to the cloud in the next 24 months.
The CIO and CFO have assigned an architect and the Vice President of Infrastructure to create a cost analysis
and feasibility study. These efforts will determine if the company can and should move 5,000 assets to the
cloud over the next 36 months. A successful migration would allow the CIO to eliminate two datacenters,
reducing costs by over $100M USD during the five-year plan. If three to four datacenters can experience
similar results, the budget will be back in the black, giving the CIO budget to support more innovative
initiatives.
Along with this cost savings, the company plans to change the management of some IT investments by
repositioning the committed capital expense as an operating expense within IT. This change will provide greater
cost control, which IT can use to accelerate other planned efforts.
Next steps
The company has developed a corporate policy to shape the governance implementation. The corporate policy
drives many of the technical decisions.
Review the initial corporate policy
Governance guide for complex enterprises: Initial
corporate policy behind the governance strategy
The following corporate policy defines the initial governance position, which is the starting point for this guide.
This article defines early-stage risks, initial policy statements, and early processes to enforce policy statements.
NOTE
The corporate policy is not a technical document, but it drives many technical decisions. The governance MVP described in
the overview ultimately derives from this policy. Before implementing a governance MVP, your organization should develop a
corporate policy based on your own objectives and business risks.
Objective
The initial objective is to establish a foundation for governance agility. An effective Governance MVP allows the
governance team to stay ahead of cloud adoption and implement guardrails as the adoption plan changes.
Business risks
The company is at an early stage of cloud adoption, experimenting and building proofs of concept. Risks are now
relatively low, but future risks are likely to have a significant impact. There is little definition around the final state
of the technical solutions to be deployed to the cloud. In addition, the cloud readiness of IT employees is low. A
foundation for cloud adoption will help the team safely learn and grow.
Future-proofing: There is a risk of not empowering growth, but also a risk of not providing the right protections
against future risks.
An agile yet robust governance approach is needed to support the board's vision for corporate and technical
growth. Failure to implement such a strategy will slow technical growth, potentially risking current and future
market share growth. The impact of such a business risk is unquestionably high. However, the role IT will play in
those potential future states is unknown, making the risk associated with current IT efforts relatively high. That
said, until more concrete plans are aligned, the business has a high tolerance for risk.
This business risk can be broken down tactically into several technical risks:
Well-intended corporate policies could slow transformation efforts or break critical business processes, if not
considered within a structured approval flow.
The application of governance to deployed assets could be difficult and costly.
Governance may not be properly applied across an application or workload, creating gaps in security.
With so many teams working in the cloud, there is a risk of inconsistency.
Costs may not properly align to business units, teams, or other budgetary management units.
The use of multiple identities to manage various deployments could lead to security issues.
Despite current policies, there is a risk that protected data could be mistakenly deployed to the cloud.
Tolerance indicators
The current risk tolerance is high and the appetite for investing in cloud governance is low. As such, the tolerance
indicators act as an early warning system to trigger the investment of time and energy. If the following indicators
are observed, it would be wise to advance the governance strategy.
Cost Management: Scale of deployment exceeds 1,000 assets in the cloud, or monthly spending exceeds
$10,000 USD.
Identity Baseline: Inclusion of applications with legacy or third-party multi-factor authentication
requirements.
Security Baseline: Inclusion of protected data in defined cloud adoption plans.
Resource Consistency: Inclusion of any mission-critical applications in defined cloud adoption plans.
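These four indicators can be expressed as a simple early-warning check. The thresholds are the ones stated in this narrative; the function and parameter names are illustrative assumptions:

```python
def governance_triggers(asset_count, monthly_spend_usd,
                        has_legacy_mfa_apps, plans_protected_data,
                        plans_mission_critical):
    """Return the disciplines whose tolerance indicators have been tripped."""
    triggers = []
    # Cost Management: >1,000 cloud assets or >$10,000 USD monthly spend.
    if asset_count > 1000 or monthly_spend_usd > 10000:
        triggers.append("Cost Management")
    # Identity Baseline: legacy or third-party multi-factor authentication.
    if has_legacy_mfa_apps:
        triggers.append("Identity Baseline")
    # Security Baseline: protected data in defined adoption plans.
    if plans_protected_data:
        triggers.append("Security Baseline")
    # Resource Consistency: mission-critical applications in adoption plans.
    if plans_mission_critical:
        triggers.append("Resource Consistency")
    return triggers

print(governance_triggers(1500, 500, False, True, False))
```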
Policy statements
The following policy statements establish the requirements needed to remediate the defined risks. These policies
define the functional requirements for the governance MVP. Each will be represented in the implementation of the
governance MVP.
Cost Management:
For tracking purposes, all assets must be assigned to an application owner within one of the core business
functions.
When cost concerns arise, additional governance requirements will be established with the finance team.
Security Baseline:
Any asset deployed to the cloud must have an approved data classification.
No assets identified with a protected level of data may be deployed to the cloud, until sufficient requirements
for security and governance can be approved and implemented.
Until minimum network security requirements can be validated and governed, cloud environments are seen as
a demilitarized zone and should meet similar connection requirements to other datacenters or internal
networks.
Resource Consistency:
Because no mission-critical workloads are deployed at this stage, there are no SLA, performance, or BCDR
requirements to be governed.
When mission-critical workloads are deployed, additional governance requirements will be established with IT
operations.
Identity Baseline:
All assets deployed to the cloud should be controlled using identities and roles approved by current governance
policies.
All groups in the on-premises Active Directory infrastructure that have elevated privileges should be mapped to
an approved RBAC role.
Deployment Acceleration:
All assets must be grouped and tagged according to defined grouping and tagging strategies.
All assets must use an approved deployment model.
Once a governance foundation has been established for a cloud provider, any deployment tooling must be
compatible with the tools defined by the governance team.
Processes
No budget has been allocated for ongoing monitoring and enforcement of these governance policies. Because of
that, the cloud governance team relies on ad hoc methods to monitor adherence to policy statements.
Education: The cloud governance team is investing time to educate the cloud adoption teams on the
governance guides that support these policies.
Deployment reviews: Before deploying any asset, the cloud governance team will review the governance
guide with the cloud adoption teams.
Next steps
This corporate policy prepares the cloud governance team to implement the governance MVP, which will be the
foundation for adoption. The next step is to implement this MVP.
Best practices explained
Governance guide for complex enterprises: Best
practices explained
The governance guide begins with a set of initial corporate policies. These policies are used to establish a minimum
viable product (MVP) for governance that reflects best practices.
In this article, we discuss the high-level strategies that are required to create a governance MVP. The core of the
governance MVP is the Deployment Acceleration discipline. The tools and patterns applied at this stage will enable
the incremental improvements needed to expand governance in the future.
Implementation process
The implementation of the governance MVP has dependencies on Identity, Security, and Networking. Once the
dependencies are resolved, the cloud governance team will decide a few aspects of governance. The decisions from
the cloud governance team and from supporting teams will be implemented through a single package of
enforcement assets.
This implementation can also be described using a simple checklist:
1. Solicit decisions regarding core dependencies: Identity, Network, and Encryption.
2. Determine the pattern to be used during corporate policy enforcement.
3. Determine the appropriate governance patterns for the Resource Consistency, Resource Tagging, and Logging
and Reporting disciplines.
4. Implement the governance tools aligned to the chosen policy enforcement pattern to apply the dependent
decisions and governance decisions.
Dependent decisions
The following decisions come from teams outside of the cloud governance team. The implementation of each will
come from those same teams. However, the cloud governance team is responsible for implementing a solution to
validate that those implementations are consistently applied.
Identity Baseline
Identity Baseline is the fundamental starting point for all governance. Before attempting to apply governance,
identity must be established. The established identity strategy will then be enforced by the governance solutions. In
this governance guide, the Identity Management team implements the Directory Synchronization pattern:
RBAC will be provided by Azure Active Directory (Azure AD), using the directory synchronization or "Same
Sign-On" that was implemented during the company's migration to Office 365. For implementation guidance, see
Reference Architecture for Azure AD Integration.
The Azure AD tenant will also govern authentication and access for assets deployed to Azure.
In the governance MVP, the governance team will enforce application of the replicated tenant through subscription
governance tooling, discussed later in this article. In future iterations, the governance team could also enforce rich
tooling in Azure AD to extend this capability.
Security Baseline: Networking
Software Defined Network is an important initial aspect of the Security Baseline. Establishing the governance
MVP depends on early decisions from the Security Management team to define how networks can be safely
configured.
Given the lack of requirements, IT security is playing it safe and requires a Cloud DMZ Pattern. That means
governance of the Azure deployments themselves will be very light.
Azure subscriptions may connect to an existing datacenter via VPN, but must follow all existing on-premises IT
governance policies regarding connection of a demilitarized zone to protected resources. For implementation
guidance regarding VPN connectivity, see VPN Reference Architecture.
Decisions regarding subnet, firewall, and routing are currently being deferred to each application/workload
lead.
Additional analysis is required before the release of any protected data or mission-critical workloads.
In this pattern, cloud networks can only connect to on-premises resources over an existing VPN that is compatible
with Azure. Traffic over that connection will be treated like any traffic coming from a demilitarized zone. Additional
considerations may be required on the on-premises edge device to securely handle traffic from Azure.
The cloud governance team has proactively invited members of the networking and IT security teams to regular
meetings, in order to stay ahead of networking demands and risks.
Security Baseline: Encryption
Encryption is another fundamental decision within the Security Baseline discipline. Because the company currently
does not yet store any protected data in the cloud, the Security Team has decided on a less aggressive pattern for
encryption. At this point, a cloud-native pattern for encryption is suggested but not required of any
development team.
No governance requirements have been set regarding the use of encryption, because the current corporate
policy does not permit mission-critical or protected data in the cloud.
Additional analysis will be required before releasing any protected data or mission-critical workloads.
Policy enforcement
The first decision to make regarding Deployment Acceleration is the pattern for enforcement. In this narrative, the
governance team decided to implement the Automated Enforcement pattern.
Azure Security Center will be made available to the security and identity teams to monitor security risks. Both
teams are also likely to use Security Center to identify new risks and improve corporate policy.
RBAC is required in all subscriptions to govern authentication enforcement.
Azure Policy will be published to each management group and applied to all subscriptions. However, the level
of policies being enforced will be very limited in this initial Governance MVP.
Although Azure management groups are being used, a relatively simple hierarchy is expected.
Azure Blueprints will be used to deploy and update subscriptions by applying RBAC requirements, Resource
Manager Templates, and Azure Policy across management groups.
IMPORTANT
Any time a resource in a resource group no longer shares the same lifecycle, it should be moved to another resource group.
Examples include common databases and networking components. While they may serve the application being developed,
they may also serve other purposes and should therefore exist in other resource groups.
Resource tagging
Resource tagging decisions determine how metadata is applied to Azure resources within a subscription to
support operations, management, and accounting purposes. In this narrative, the Accounting pattern has been
chosen as the default model for resource tagging.
Deployed assets should be tagged with values for:
Department/Billing Unit
Geography
Data Classification
Criticality
SLA
Environment
Application Archetype
Application
Application Owner
These values along with the Azure management group and subscription associated with a deployed asset will
drive governance, operations, and security decisions.
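The Accounting pattern above can be sketched as a tag template that rejects incomplete tag sets. The normalized tag keys (for example, Department standing in for Department/Billing Unit) and the sample values are illustrative assumptions:

```python
# Tags required by the Accounting pattern, with keys normalized for the example.
ACCOUNTING_TAGS = (
    "Department", "Geography", "DataClassification", "Criticality",
    "SLA", "Environment", "ApplicationArchetype", "Application", "ApplicationOwner",
)

def tag_asset(**values):
    """Build a complete Accounting-pattern tag dict, rejecting missing keys."""
    missing = set(ACCOUNTING_TAGS) - set(values)
    if missing:
        raise ValueError(f"missing required tags: {sorted(missing)}")
    return {key: values[key] for key in ACCOUNTING_TAGS}

tags = tag_asset(
    Department="Finance", Geography="EMEA", DataClassification="General",
    Criticality="Low", SLA="99.9", Environment="Dev",
    ApplicationArchetype="Web", Application="payroll", ApplicationOwner="jane@contoso.com",
)
print(sorted(tags))
```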
Logging and reporting
Logging and reporting decisions determine how you store log data and how the monitoring and reporting tools
that keep IT staff informed on operational health are structured. In this narrative, a Hybrid monitoring pattern for
logging and reporting is suggested, but not required of any development team at this point.
No governance requirements are currently set regarding the specific data points to be collected for logging or
reporting purposes. This is specific to this fictional narrative and should be considered an antipattern. Logging
standards should be determined and enforced as soon as possible.
Additional analysis is required before the release of any protected data or mission-critical workloads.
Before supporting protected data or mission-critical workloads, the existing on-premises operational
monitoring solution must be granted access to the workspace used for logging. Applications are required to
meet security and logging requirements associated with the use of that tenant, if the application is to be
supported with a defined SLA.
Alternative patterns
If any of the patterns chosen in this governance guide don't align with the reader's requirements, alternatives to
each pattern are available:
Encryption patterns
Identity patterns
Logging and Reporting patterns
Policy Enforcement patterns
Resource Consistency patterns
Resource Tagging patterns
Software Defined Networking patterns
Subscription Design patterns
Next steps
Once this guidance is implemented, each cloud adoption team can proceed with a solid governance foundation. At
the same time, the cloud governance team will work to continually update the corporate policies and governance
disciplines.
Both teams will use the tolerance indicators to identify the next set of improvements needed to continue
supporting cloud adoption. The next step for this company is incremental improvement of their governance
baseline to support applications with legacy or third-party multi-factor authentication requirements.
Improve the Identity Baseline discipline
Governance guide for complex enterprises: Improve the Identity Baseline discipline
This article advances the narrative by adding Identity Baseline controls to the governance MVP.
Conclusion
Adding these changes to the governance MVP helps remediate many of the risks in this article, allowing each
cloud adoption team to quickly move past this roadblock.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs will also
change. The following are a few changes that may occur. For this fictional company, the next trigger is the inclusion
of protected data in the cloud adoption plan. This change requires additional security controls.
Improve the Security Baseline discipline
Governance guide for complex enterprises: Improve the Security Baseline discipline
This article advances the narrative by adding security controls that support moving protected data to the cloud.
Conclusion
Adding these processes and changes to the governance MVP helps remediate many of the risks associated with
security governance. Together, they add the network, identity, and security monitoring tools needed to protect
data.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs also
change. For the fictional company in this guide, the next step is to support mission-critical workloads. This is the
point when Resource Consistency controls are needed.
Improve the Resource Consistency discipline
Governance guide for complex enterprises: Improve the Resource Consistency discipline
This article advances the narrative by adding Resource Consistency controls to the governance MVP to support
mission-critical applications.
Conclusion
Adding these processes and changes to the governance MVP helps remediate many of the risks associated with
resource governance. Together, they add the recovery, sizing, and monitoring controls necessary to empower
cloud-aware operations.
Next steps
As cloud adoption grows and delivers additional business value, the risks and cloud governance needs will also
change. For the fictional company in this guide, the next trigger is when the number of deployed cloud assets
exceeds 1,000 or monthly spending exceeds $10,000 USD. At that point, the cloud governance team adds Cost
Management controls.
Improve the Cost Management discipline
Governance guide for complex enterprises: Improve the Cost Management discipline
This article advances the narrative by adding cost controls to the minimum viable product (MVP) governance.
Changes in risk
Budget control: There is an inherent risk that self-service capabilities will result in excessive and unexpected
costs on the new platform. Governance processes for monitoring costs and mitigating ongoing cost risks must be
in place to ensure continued alignment with the planned budget.
This business risk can be expanded into a few technical risks:
There is a risk of actual costs exceeding the plan.
Business conditions change. When they do, there will be cases when a business function needs to consume
more cloud services than expected, leading to spending anomalies. There is a risk that these additional costs
will be considered overages as opposed to a required adjustment to the plan. If successful, the Canadian
experiment should help remediate this risk.
There is a risk of systems being overprovisioned, resulting in excess spending.
Conclusion
Adding the above processes and changes to the governance MVP helps remediate many of the risks associated
with cost governance. Together, they create the visibility, accountability, and optimization needed to control costs.
Next steps
As cloud adoption grows and delivers additional business value, risks and cloud governance needs will also
change. For this fictional company, the next step is using this governance investment to manage multiple clouds.
Multicloud improvement
Governance guide for complex enterprises: Multicloud improvement
Next steps
In many large enterprises, the Five Disciplines of Cloud Governance can be blockers to adoption. The next article
has some additional thoughts on making governance a team sport to help ensure long-term success in the cloud.
Multiple layers of governance
Governance guide for complex enterprises: Multiple layers of governance
When large enterprises require multiple layers of governance, there are greater levels of complexity that must be
factored into the governance MVP and later governance improvements.
A few common examples of such complexities include:
Distributed governance functions.
Corporate IT supporting Business unit IT organizations.
Corporate IT supporting geographically distributed IT organizations.
This article explores some ways to navigate this type of complexity.
However, cloud governance requires more than technical implementation. Subtle changes in the corporate narrative or
corporate policies can affect adoption efforts significantly. Before implementation, it's important to look beyond IT while
defining corporate policy.
Figure 1 - Visual of corporate policy and the Five Disciplines of Cloud Governance.
Business risk
Investigate current cloud adoption plans and data classification to identify risks to the business. Work with the business to
balance risk tolerance and mitigation costs.
Processes
The pace of adoption and innovation activities will naturally create policy violations. Executing relevant processes will aid
in monitoring and enforcing adherence to policies.
Next steps
Learn how to make your corporate policy ready for the cloud.
Prepare corporate policy for the cloud
Prepare corporate IT policy for the cloud
Cloud governance is the product of an ongoing adoption effort over time, as a true lasting transformation doesn't
happen overnight. Attempting to deliver complete cloud governance before addressing key corporate policy
changes through a fast, aggressive push seldom produces the desired results. Instead, we recommend an
incremental approach.
What is different about the Cloud Adoption Framework is the purchasing cycle and how it can enable authentic
transformation. Because there is no large capital expenditure requirement, engineers can begin
experimentation and adoption sooner. In most corporate cultures, elimination of the capital expense barrier to
adoption can lead to tighter feedback loops, organic growth, and incremental execution.
The shift to cloud adoption requires a shift in governance. In many organizations, corporate policy transformation
allows for improved governance and higher rates of adherence through incremental policy changes and
automated enforcement of those changes, powered by newly defined capabilities that you configure with your
cloud service provider.
This article outlines key activities that can help you shape your corporate policies to enable an expanded
governance model.
TIP
If your organization is governed by third-party compliance, one of the biggest business risks to consider may be a lack of
adherence to regulatory compliance. This risk often cannot be remediated and may instead require strict adherence. Be
sure to understand your third-party compliance requirements before beginning a policy review.
Next steps
Effective cloud governance strategy begins with understanding business risk.
Understand business risk
Understand business risk during cloud migration
An understanding of business risk is one of the most important elements of any cloud transformation. Risk drives
policy, and it influences monitoring and enforcement requirements. Risk heavily influences how we manage the
digital estate, on-premises or in the cloud.
Relativity of risk
Risk is relative. A small company with a few IT assets in a closed building has little risk. Add users and an internet
connection with access to those assets, and the risk intensifies. When that small company grows to Fortune 500
status, the risks are exponentially greater. As revenue, business processes, employee counts, and IT assets
accumulate, risks increase and coalesce. IT assets that aid in generating revenue are at tangible risk of stopping
that revenue stream in the event of an outage. Every moment of downtime equates to losses. Likewise, as data
accumulates, the risk of harming customers grows.
In the traditional on-premises world, IT governance teams focus on assessing risks, creating processes to manage
those risks, and deploying systems to ensure remediation measures are successfully implemented. These efforts
work to balance risks required to operate in a connected, modern business environment.
Next steps
Learn how to evaluate risk tolerance during cloud adoption.
Evaluate risk tolerance
Evaluate risk tolerance
Every business decision creates new risks. Making an investment in anything creates risk of losses. New products
or services create risks of market failure. Changes to current products or services could reduce market share.
Cloud transformation does not provide a magical solution to everyday business risk. To the contrary, connected
solutions (cloud or on-premises) introduce new risks. Deploying assets to any network connected facility also
expands the potential threat profile by exposing security weaknesses to a much broader, global community.
Fortunately, cloud providers are aware of these new and increased risks. They invest heavily to reduce and
manage those risks on behalf of their customers.
This article is not focused on cloud risks. Instead it discusses the business risks associated with various forms of
cloud transformation. Later in the article, the discussion shifts focus to discuss ways of understanding the
business' tolerance for risk.
IMPORTANT
Before reading the following, be aware that each of these risks can be managed. The goal of this article is to inform and
prepare readers for more productive risk management discussions.
Data breach: The top risk associated with any transformation is a data breach. Data leaks can cause
significant damage to your company, leading to loss of customers, decrease in business, or even legal
liability. Any change to the way data is stored, processed, or used creates risk. Cloud transformations
create a high degree of change in data management, so the risk should not be taken lightly.
Security Baseline, Data Classification, and Incremental Rationalization can each help manage this risk.
Service disruption: Business operations and customer experiences rely heavily on technical operations.
Cloud transformations will create change in IT operations. In some organizations, that change is small and
easily adjusted. In other organizations, these changes could require retooling, retraining, or new
approaches to support cloud operations. The bigger the change, the bigger the potential impact on
business operations and customer experience. Managing this risk will require the involvement of the
business in transformation planning. Release planning and first workload selection in the incremental
rationalization article discuss ways to choose workloads for transformation projects. The business's role in
that activity is to communicate the business operations risk of changing prioritized workloads. Helping IT
choose workloads that have a lower impact on operations will reduce the overall risk.
Budget control: Cost models change in the cloud. This change can create risks associated with cost
overruns or increases in the cost of goods sold (COGS), especially directly attributed operating expenses.
When business works closely with IT, it is feasible to create transparency regarding costs and services
consumed by various business units, programs, or projects. Cost Management provides examples of ways
business and IT can partner on this topic.
The above are a few of the most common risks mentioned by customers. The cloud governance team and the
cloud adoption teams can begin to develop a risk profile, as workloads are migrated and readied for production
release. Be prepared for conversations to define, refine, and manage risks based on the desired business
outcomes and transformation effort.
Next steps
This type of conversation can help the business and IT evaluate tolerance more effectively. These conversations
can be used during the creation of MVP policies and during incremental policy reviews.
Define corporate policy
Define corporate policy for cloud governance
Once you've analyzed the known risks and related risk tolerances for your organization's cloud transformation
journey, your next step is to establish policy that will explicitly address those risks and define the steps needed to
remediate them where possible.
TIP
If your organization uses vendors or other trusted business partners, one of the biggest business risks to consider may be a
lack of adherence to regulatory compliance by these external organizations. This risk often cannot be remediated and
may instead require strict adherence to requirements by all parties. Make sure you've identified and understood any
third-party compliance requirements before beginning a policy review.
Next steps
After defining your policies, draft an architecture design guide to provide IT staff and developers with actionable
guidance.
Align your governance design guide with corporate policy
Align your cloud governance design guide with corporate policy
After you've defined cloud policies based on your identified risks, you'll need to generate actionable guidance that
aligns with these policies for your IT staff and developers to refer to. Drafting a cloud governance design guide
allows you to specify specific structural, technological, and process choices based on the policy statements you
generated for each of the five governance disciplines.
A cloud governance design guide should establish the architecture choices and design patterns for each of the core
infrastructure components of cloud deployments that best meet your policy requirements. Alongside these you
should provide a high-level explanation of the technology, tools, and processes that will support each of these
design decisions.
Although your risk analysis and policy statements may, to some degree, be cloud platform agnostic, your design
guide should provide platform-specific implementation details that your IT and dev teams can use when creating
and deploying cloud-based workloads. Focus on the architecture, tools, and features of your chosen platform when
making design decisions and providing guidance.
While cloud design guides should take into account some of the technical details associated with each
infrastructure component, they are not meant to be extensive technical documents or specifications. Make sure
your guides address your policy statements and clearly state design decisions in a format easy for staff to
understand and reference.
Next steps
With design guidance in place, establish policy adherence processes to ensure policy compliance.
Establish policy adherence processes
Establish policy adherence processes
After establishing your cloud policy statements and drafting a design guide, you'll need to create a strategy for
ensuring your cloud deployment stays in compliance with your policy requirements. This strategy will need to
encompass your cloud governance team's ongoing review and communication processes, establish criteria for
when policy violations require action, and define the requirements for automated monitoring and compliance
systems that will detect violations and trigger remediation actions.
See the corporate policy sections of the actionable governance guides for examples of how policy adherence
processes fit into a cloud governance plan.
For example, a violation trigger and enforcement action for each discipline might look like this:
Cost Management: monthly cloud spending is more than 20% higher than expected. Action: notify the billing unit leader, who will begin a review of resource usage.
Security Baseline: suspicious user activity is detected. Action: notify the IT security team and disable the suspect user account.
Resource Consistency: CPU utilization for a workload is greater than 90%. Action: notify the IT operations team and scale out additional resources to handle the load.
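The trigger/action pairs above can be encoded as simple threshold rules. A hedged sketch in Python (the function names, thresholds, and returned messages are illustrative; real enforcement would typically flow through your monitoring platform's alerting):

```python
def cost_trigger(actual_monthly, expected_monthly):
    """Cost Management trigger: spending more than 20% over the expected amount."""
    if actual_monthly > expected_monthly * 1.20:
        return "Notify the billing unit leader to begin a review of resource usage."
    return None  # within tolerance, no action

def cpu_trigger(utilization_pct):
    """Resource Consistency trigger: workload CPU utilization above 90%."""
    if utilization_pct > 90:
        return "Notify IT operations to scale out additional resources."
    return None
```

Keeping each trigger as a small, testable function makes it easy to review the criteria alongside the written policy.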
Next steps
Learn more about regulatory compliance in the cloud.
Regulatory compliance
Introduction to regulatory compliance
This is an introductory article about regulatory compliance; it isn't intended to guide the implementation of a
compliance strategy. More detailed information about Azure compliance offerings is available at the Microsoft
Trust Center. Moreover, all downloadable documentation is available to certain Azure customers from the
Microsoft Service Trust Portal.
Regulatory compliance refers to the discipline and process of ensuring that a company follows the laws enforced
by governing bodies in their geography or rules required by voluntarily adopted industry standards. For IT
regulatory compliance, people and processes monitor corporate systems in an effort to detect and prevent
violations of policies and procedures established by these governing laws, regulations, and standards. This in turn
applies to a wide array of monitoring and enforcement processes. Depending on the industry and geography,
these processes can become lengthy and complex.
Compliance is challenging for multinational organizations, especially in heavily regulated industries like healthcare
and financial services. Standards and regulations abound, and in certain cases may change frequently, making it
difficult for businesses to keep up with changing international electronic data handling laws.
As with security controls, organizations should understand the division of responsibilities regarding regulatory
compliance in the cloud. Cloud providers strive to ensure that their platforms and services are compliant. But
organizations also need to confirm that their applications, the infrastructure those applications depend on, and
services supplied by third parties are also certified as compliant.
The following are descriptions of compliance regulations in various industries and geographies:
HIPAA
A healthcare application that processes protected health information (PHI) is subject to both the Privacy Rule and
the Security Rule encompassed within the Health Insurance Portability and Accountability Act (HIPAA). At a
minimum, HIPAA could likely require that a healthcare business must receive written assurances from the cloud
provider that it will safeguard any PHI received or created.
PCI
Payment Card Industry Data Security Standard (PCI DSS) is a proprietary information security standard for
organizations that handle branded credit cards from the major card schemes, including Visa, MasterCard,
American Express, Discover, and JCB. The PCI standard is mandated by the card brands and administered by the
Payment Card Industry Security Standards Council. The standard was created to increase controls around
cardholder data to reduce credit-card fraud. Validation of compliance is performed annually, either by an external
Qualified Security Assessor (QSA) or by a firm-specific Internal Security Assessor (ISA) who creates a Report on
Compliance (ROC) for organizations handling large volumes of transactions, or by a Self-Assessment
Questionnaire (SAQ) for companies handling smaller volumes.
Personal data
Personal data is information that could be used to identify a consumer, employee, partner, or any other living or
legal entity. Many emerging laws, particularly those dealing with privacy and personal data, require that
businesses themselves comply and report on compliance and any breaches that might occur.
GDPR
One of the most important developments in this area is the General Data Protection Regulation (GDPR), designed
to strengthen data protection for individuals within the European Union. GDPR governs how data about
individuals (such as "a name, a home address, a photo, an email address, bank details, posts on social networking
websites, medical information, or a computer's IP address") is collected, stored, and processed, and it restricts
transfers of that data outside the EU unless adequate safeguards are in place. It also requires that companies notify
individuals of data breaches, and in many cases it mandates that companies appoint a data protection officer
(DPO). Other countries have, or are developing, similar types of regulations.
Next steps
Learn more about cloud security readiness.
Cloud security readiness
CISO cloud readiness guide
Microsoft guidance like the Cloud Adoption Framework is not positioned to determine or guide the unique
security constraints of the thousands of enterprises supported by this documentation. When moving to the cloud,
the role of the chief information security officer (CISO) isn't supplanted by cloud technologies. Quite the
contrary: the CISO and the office of the CISO become more ingrained and integrated. This guide assumes the
reader is familiar with CISO processes and is seeking to modernize those processes to enable cloud
transformation.
Cloud adoption enables services that weren't often considered in traditional IT environments. Self-service or
automated deployments are commonly executed by application development or other IT teams not traditionally
aligned to production deployment. In some organizations, business constituents similarly have self-service
capabilities. This can trigger new security requirements that weren't needed in the on-premises world. Centralized
security becomes more challenging, and security often becomes a shared responsibility across the business and IT culture.
This article can help a CISO prepare for that approach and engage in incremental governance.
Next steps
The first step to taking action in any governance strategy is a policy review. Policy and compliance could be a
useful guide during your policy review.
Prepare for a policy review
Conduct a cloud policy review
A cloud policy review is the first step toward governance maturity in the cloud. The objective of this process is to
modernize existing corporate IT policies. When completed, the updated policies provide an equivalent level of
risk management for cloud-based resources. This article explains the cloud policy review process and its
importance.
Next steps
Learn more about including data classification in your cloud governance strategy.
Data classification
What is data classification?
Data classification allows you to determine and assign value to your organization's data, and is a common
starting point for governance. The data classification process categorizes data by sensitivity and business impact
in order to identify risks. When data is classified, you can manage it in ways that protect sensitive or important
data from theft or loss.
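A classification scheme is often captured as a simple mapping from sensitivity tier to handling controls. A minimal illustration, with hypothetical tier names and controls rather than a prescribed taxonomy:

```python
# Illustrative sensitivity tiers and handling controls. The tier names and
# the controls attached to them are examples only, not a mandated taxonomy.
HANDLING = {
    "Public":       {"encryption_required": False, "access": "anyone"},
    "Internal":     {"encryption_required": False, "access": "employees"},
    "Confidential": {"encryption_required": True,  "access": "need-to-know"},
    "Restricted":   {"encryption_required": True,  "access": "named individuals"},
}

def controls_for(classification):
    """Look up the handling controls assigned to a data classification tier."""
    return HANDLING[classification]
```

Once classifications are expressed this way, governance tooling can apply the matching controls automatically rather than relying on manual review.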
Next steps
Apply data classifications during one of the actionable governance guides.
Choose an actionable governance guide
Any change to business processes or technology platforms introduces risk. Cloud governance teams, whose members are
sometimes known as cloud custodians, are tasked with mitigating these risks and ensuring minimal interruption to adoption
or innovation efforts.
The Cloud Adoption Framework governance model guides these decisions (regardless of the chosen cloud platform) by
focusing on development of corporate policy and the Five Disciplines of Cloud Governance. Actionable design guides
demonstrate this model using Azure services. Learn about the disciplines of the Cloud Adoption Framework governance
model below.
Figure 1 - Diagram of corporate policy and the Five Disciplines of Cloud Governance.
Cost Management
Cost is a primary concern for cloud users. Develop policies for cost control for all cloud platforms.
Security Baseline
Security is a complex topic, unique to each company. Once security requirements are established, cloud governance
policies and enforcement apply those requirements across network, data, and asset configurations.
Identity Baseline
Inconsistencies in the application of identity requirements can increase the risk of breach. The Identity Baseline discipline
focuses on ensuring that identity is consistently applied across cloud adoption efforts.
Resource Consistency
Cloud operations depend on consistent resource configuration. Through governance tooling, resources can be
configured consistently to manage risks related to onboarding, drift, discoverability, and recovery.
Deployment Acceleration
Centralization, standardization, and consistency in approaches to deployment and configuration improve governance
practices. When provided through cloud-based governance tooling, they create a cloud factor that can accelerate
deployment activities.
Cost Management is one of the Five Disciplines of Cloud Governance within the Cloud Adoption Framework governance
model. For many customers, governing cost is a major concern when adopting cloud technologies. Balancing performance
demands, adoption pacing, and cloud services costs can be challenging. This is especially relevant during major business
transformations that implement cloud technologies. This section outlines the approach to developing a Cost Management
discipline as part of a cloud governance strategy.
NOTE
Cost Management governance does not replace the existing business teams, accounting practices, and procedures that are
involved in your organization's financial management of IT-related costs. The primary purpose of this discipline is to identify
potential cloud-related risks related to IT spending, and provide risk-mitigation guidance to the business and IT teams
responsible for deploying and managing cloud resources.
The primary audience for this guidance is your organization's cloud architects and other members of your cloud governance
team. However, the decisions, policies, and processes that emerge from this discipline should involve engagement and
discussions with relevant members of your business and IT teams, especially those leaders responsible for owning, managing,
and paying for cloud-based workloads.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a Cost Management
discipline. To see policy statement samples, see the article on Cost Management Policy Statements. These samples can serve as
a starting point for your organization's governance policies.
CAUTION
The sample policies come from common customer experiences. To better align these policies to specific cloud governance
needs, execute the following steps to create policy statements that meet your unique business needs.
Business Risks
Understand the motives and risks commonly associated with the Cost Management discipline.
Maturity
Aligning Cost Management maturity with phases of cloud adoption.
Toolchain
Azure services that can be implemented to support the Cost Management discipline.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Cost Management template
The first step to implementing change is communicating the desired change. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern Cost Management issues in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Cost Management policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Cost Management discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Cost Management motivations and business risks
This article discusses the reasons that customers typically adopt a Cost Management discipline within a cloud
governance strategy. It also provides a few examples of business risks that drive policy statements.
Business risk
The Cost Management discipline attempts to address core business risks related to expenses incurred when
hosting cloud-based workloads. Work with your business to identify these risks and monitor each of them for
relevance as you plan for and implement your cloud deployments.
Risks differ between organizations, but the following are common cost-related risks that you can use as a
starting point for discussions within your cloud governance team:
Budget control: Not controlling budget can lead to excessive spending with a cloud vendor.
Utilization loss: Prepurchases or precommitments that go unused can result in lost investments.
Spending anomalies: Unexpected spikes in either direction can be indicators of improper usage.
Overprovisioned assets: When assets are deployed in a configuration that exceeds the needs of an application
or virtual machine (VM), they can create waste.
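The overprovisioning risk above lends itself to a simple utilization check. A sketch, assuming per-asset average CPU data is available; the 20% threshold, field names, and sample fleet are illustrative:

```python
def flag_overprovisioned(assets, cpu_threshold_pct=20.0):
    """Return names of assets whose average CPU utilization suggests the
    deployed size exceeds the application's needs (threshold is illustrative)."""
    return [a["name"] for a in assets if a["avg_cpu_pct"] < cpu_threshold_pct]

# Hypothetical fleet data gathered from a monitoring tool.
fleet = [
    {"name": "vm-web-01",   "avg_cpu_pct": 55.0},
    {"name": "vm-batch-02", "avg_cpu_pct": 4.5},   # likely oversized
]
```

A report like this gives the governance team a concrete starting list for right-sizing conversations.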
Next steps
Using the Cost Management template, document business risks that are likely to be introduced by the current
cloud adoption plan.
After you've gained an understanding of realistic business risks, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Cost Management metrics, indicators, and risk tolerance
This article will help you quantify business risk tolerance as it relates to Cost Management. Defining metrics and
indicators helps you create a business case for making an investment in the maturity of the Cost Management
discipline.
Metrics
Cost Management generally focuses on metrics related to costs. As part of your risk analysis, you'll want to gather
data related to your current and planned spending on cloud-based workloads to determine how much risk you
face, and how important investment in cost governance is to your cloud adoption strategy.
The following are examples of useful metrics that you should gather to help evaluate risk tolerance within the Cost
Management discipline:
Annual spending: The total annual cost for services provided by a cloud provider.
Monthly spending: The total monthly cost for services provided by a cloud provider.
Forecasted versus actual ratio: The ratio comparing forecasted and actual spending (monthly or annual).
Pace of adoption (MoM) ratio: The month-over-month percentage change in cloud costs.
Accumulated cost: Total accrued daily spending, starting from the beginning of the month.
Spending trends: Spending trend against the budget.
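Several of these metrics are simple ratios that can be computed directly from billing data. The sketch below illustrates three of them with plain numbers; the function names and figures are hypothetical, and real inputs would come from your cloud provider's billing exports.

```python
# Illustrative computation of three cost metrics described above.

def forecast_vs_actual_ratio(forecasted, actual):
    """Ratio comparing forecasted and actual spending."""
    return forecasted / actual

def pace_of_adoption(previous_month, current_month):
    """Month-over-month percentage change in cloud costs."""
    return (current_month - previous_month) / previous_month * 100

def accumulated_cost(daily_spending):
    """Running total of daily spending from the start of the month."""
    totals, running = [], 0.0
    for day in daily_spending:
        running += day
        totals.append(running)
    return totals

if __name__ == "__main__":
    print(forecast_vs_actual_ratio(10_000, 12_500))  # 0.8
    print(pace_of_adoption(8_000, 10_000))           # 25.0
    print(accumulated_cost([100, 150, 125]))         # [100.0, 250.0, 375.0]
```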
Next steps
Using the Cost Management template, document metrics and tolerance indicators that align to the current cloud
adoption plan.
Review sample Cost Management policies as a starting point to develop policies that address specific business
risks that align with your cloud adoption plans.
Review sample policies
Cost Management sample policy statements
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Business risk: A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common cost-related business risks. These statements are
examples you can reference when drafting policy statements to address your organization's needs. These
examples are not meant to be prescriptive, and there are potentially several policy options for dealing with each
identified risk. Work closely with business and IT teams to identify the best policies for your unique set of risks.
Future-proofing
Business risk: Current conditions don't warrant an investment in a Cost Management discipline by the
governance team, but you anticipate such an investment in the future.
Policy statement: You should associate all assets deployed to the cloud with a billing unit and
application/workload. This policy will ensure that future Cost Management efforts will be effective.
Design options: For information on establishing a future-proof foundation, see the discussions related to
creating a governance MVP in the actionable design guides included as part of the Cloud Adoption Framework
guidance.
Budget overruns
Business risk: Self-service deployment creates a risk of overspending.
Policy statement: Any cloud deployment must be allocated to a billing unit with an approved budget and a
mechanism for budgetary limits.
Design options: In Azure, budget can be controlled with Azure Cost Management.
Underutilization
Business risk: The company has prepaid for cloud services or has made an annual commitment to spend a
specific amount. There is a risk that the agreed-on amount won't be used, resulting in a lost investment.
Policy statement: Each billing unit with an allocated cloud budget will meet annually to set budgets, quarterly to
adjust budgets, and monthly to allocate time for reviewing planned versus actual spending. Discuss any deviations
greater than 20% with the billing unit leader monthly. For tracking purposes, assign all assets to a billing unit.
Design options:
In Azure, planned versus actual spending can be managed via Azure Cost Management.
There are several options for grouping resources by billing unit. In Azure, a resource consistency model should
be chosen in conjunction with the governance team and applied to all assets.
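The monthly 20% deviation review in the policy statement above could be scripted rather than performed by hand. The following is a hypothetical helper over plain dictionaries, not an Azure Cost Management API; the billing units and figures are invented for illustration.

```python
# Hypothetical review helper: list billing units whose actual spending
# deviates from plan by more than the agreed threshold (20% per the policy).

def units_needing_review(planned, actual, threshold=0.20):
    """Return billing units whose actual spending deviates from planned
    spending by more than `threshold` in either direction."""
    flagged = []
    for unit, plan in planned.items():
        spend = actual.get(unit, 0.0)
        if plan and abs(spend - plan) / plan > threshold:
            flagged.append(unit)
    return sorted(flagged)

if __name__ == "__main__":
    planned = {"retail": 5000, "logistics": 3000, "hr": 1000}
    actual = {"retail": 6500, "logistics": 3100, "hr": 700}
    print(units_needing_review(planned, actual))  # ['hr', 'retail']
```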
Overprovisioned assets
Business risk: In traditional on-premises datacenters, it is common practice to deploy assets with extra capacity
to allow for growth in the distant future. The cloud can scale far more quickly than traditional equipment, and
cloud assets are priced based on their technical capacity. There is a risk that carrying this old on-premises
practice into the cloud will artificially inflate spending.
Policy statement: Any asset deployed to the cloud must be enrolled in a program that can monitor utilization
and report any capacity in excess of 50% of utilization. Any asset deployed to the cloud must be grouped or
tagged in a logical manner, so governance team members can engage the workload owner regarding any
optimization of overprovisioned assets.
Design options:
In Azure, Azure Advisor can provide optimization recommendations.
There are several options for grouping resources by billing unit. In Azure, a resource consistency model should
be chosen in conjunction with the governance team and applied to all assets.
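As a rough illustration of the monitoring requirement in this policy, the sketch below flags assets whose peak utilization leaves more than half of their provisioned capacity idle. The asset records and thresholds are assumptions for the example; real utilization data would come from a monitoring tool such as Azure Monitor or Azure Advisor.

```python
# Illustrative overprovisioning check: flag assets where observed peak
# utilization is at or below half of provisioned capacity.

def overprovisioned_assets(assets, idle_threshold=0.50):
    """Return names of assets whose peak utilization is at or below
    (1 - idle_threshold) of provisioned capacity."""
    flagged = []
    for asset in assets:
        usage_ratio = asset["peak_utilization"] / asset["capacity"]
        if usage_ratio <= 1 - idle_threshold:
            flagged.append(asset["name"])
    return flagged

if __name__ == "__main__":
    fleet = [
        {"name": "vm-web-01", "capacity": 16, "peak_utilization": 4},
        {"name": "vm-db-01",  "capacity": 8,  "peak_utilization": 7},
    ]
    print(overprovisioned_assets(fleet))  # ['vm-web-01']
```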
Overoptimization
Business risk: Effective cost management itself creates new risks. Optimizing spending works against system
performance, so when reducing costs there is a risk of overtightening spending and producing poor user
experiences.
Policy statement: Any asset that directly affects customer experiences must be identified through grouping or
tagging. Before optimizing any asset that affects customer experience, the cloud governance team must adjust
optimization based on at least 90 days of utilization trends. Document any seasonal or event-driven bursts
considered when optimizing assets.
Design options:
In Azure, Azure Monitor's insights features can help with analysis of system utilization.
There are several options for grouping and tagging resources based on roles. In Azure, you should choose a
resource consistency model in conjunction with the governance team and apply this to all assets.
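One way to honor the 90-day trend requirement above is to size customer-facing assets against the highest sustained demand in the trailing window rather than the average, so seasonal bursts survive optimization. The sketch below is illustrative only; the 25% headroom buffer and the data are assumptions, not recommendations.

```python
# Hypothetical sizing rule: base capacity on the max daily peak across the
# trailing 90-day window, plus headroom for event-driven bursts.

def recommended_capacity(daily_peaks, headroom=0.25, window=90):
    """Suggest capacity from the highest daily peak over the trailing
    window, plus a headroom buffer."""
    recent = daily_peaks[-window:]
    return max(recent) * (1 + headroom)

if __name__ == "__main__":
    peaks = [40] * 80 + [95] * 10        # a ten-day seasonal burst
    print(recommended_capacity(peaks))   # 118.75 -- sized to the burst,
                                         # not to the 40-unit baseline
```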
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific business
risks that align with your cloud adoption plans.
To begin developing your own custom policy statements related to Cost Management, download the Cost
Management template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with
your environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Cost Management policy
adherence.
Establish policy compliance processes
Cost Management policy compliance processes
This article discusses an approach to creating processes that support a Cost Management governance discipline.
Effective governance of cloud costs starts with recurring manual processes designed to support policy compliance.
This requires regular involvement of the cloud governance team and interested business stakeholders to review
and update policy and ensure policy compliance. In addition, many ongoing monitoring and enforcement
processes can be automated or supplemented with tooling to reduce the overhead of governance and allow for
faster response to policy deviation.
Next steps
Using the Cost Management template, document the processes and triggers that align to the current cloud
adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on Cost
Management discipline improvement.
Cost Management discipline improvement
Cost Management discipline improvement
The Cost Management discipline attempts to address core business risks related to expenses incurred when
hosting cloud-based workloads. Within the Five Disciplines of Cloud Governance, Cost Management is involved in
controlling cost and usage of cloud resources with the goal of creating and maintaining a planned cost cycle.
This article outlines potential tasks your company can perform to develop and mature its Cost Management
discipline. These tasks can be broken down into the planning, building, adopting, and operating phases of
implementing a cloud solution, which are then iterated on, allowing the development of an incremental approach
to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud cost governance, examine the Cost Management toolchain to
identify Azure tools and features that you'll need when developing the Cost Management governance discipline on
the Azure platform.
Cost Management toolchain for Azure
Cost Management tools in Azure
Cost Management is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways of
establishing cloud spending plans, allocating cloud budgets, monitoring and enforcement of cloud budgets,
detecting costly anomalies, and adjusting the cloud governance plan when actual spending is misaligned.
The following is a list of Azure native tools that can help mature the policies and processes that support this
governance discipline.
Security Baseline governance does not replace the existing IT teams, processes, and procedures that your organization uses to
secure cloud-deployed resources. The primary purpose of this discipline is to identify security-related business risks and provide
risk-mitigation guidance to the IT staff responsible for security infrastructure. As you develop governance policies and processes,
make sure to involve relevant IT teams in your planning and review processes.
This article outlines the approach to developing a Security Baseline discipline as part of your cloud governance strategy. The
primary audience for this guidance is your organization's cloud architects and other members of your cloud governance team.
However, the decisions, policies, and processes that emerge from this discipline should involve engagement and discussions with
relevant members of your IT and security teams, especially those technical leaders responsible for implementing networking,
encryption, and identity services.
Making the correct security decisions is critical to the success of your cloud deployments and wider business success. If your
organization lacks in-house expertise in cybersecurity, consider engaging external security consultants as a component of this
discipline. Also consider engaging Microsoft Consulting Services, the Microsoft FastTrack cloud adoption service, or other
external cloud adoption experts to discuss concerns related to this discipline.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a Security Baseline
discipline. To see policy statement samples, see the article on Security Baseline Policy Statements. These samples can serve as a
starting point for your organization's governance policies.
CAUTION
The sample policies come from common customer experiences. To better align these policies to specific cloud governance needs,
execute the following steps to create policy statements that meet your unique business needs.
Business Risks
Understand the motives and risks commonly associated with the Security Baseline discipline.
Indicators and Metrics
Indicators to help you understand whether it is the right time to invest in the Security Baseline discipline.
Maturity
Aligning Security Baseline maturity with phases of cloud adoption.
Toolchain
Azure services that can be implemented to support the Security Baseline discipline.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Security Baseline template
The first step to implementing change is communicating what is desired. The same is true when changing
governance practices. The template below provides a starting point for documenting and communicating policy
statements that govern security-related issues in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Security Baseline policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Security Baseline discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Security Baseline motivations and business risks
This article discusses the reasons that customers typically adopt a Security Baseline discipline within a cloud
governance strategy. It also provides a few examples of potential business risks that can drive policy statements.
NOTE
While it is important to understand Identity Baseline in the context of Security Baseline and how that relates to Access
Control, the Five Disciplines of Cloud Governance calls out Identity Baseline as its own discipline, separate from Security
Baseline.
Business risk
The Security Baseline discipline attempts to address core security-related business risks. Work with your business
to identify these risks and monitor each of them for relevance as you plan for and implement your cloud
deployments.
Risks differ between organizations, but the following common security-related risks can serve as a starting point
for discussions within your cloud governance team:
Data breach: Inadvertent exposure or loss of sensitive cloud-hosted data can lead to losing customers,
contractual issues, or legal consequences.
Service disruption: Outages and other performance issues caused by insecure infrastructure interrupt normal
operations and can result in lost productivity or lost business.
Next steps
Using the Security Baseline template, document business risks that are likely to be introduced by the current
cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Security Baseline metrics, indicators, and risk
tolerance
This article will help you quantify business risk tolerance as it relates to Security Baseline. Defining metrics and
indicators helps you create a business case for making an investment in maturing the Security Baseline discipline.
Metrics
Security Baseline generally focuses on identifying potential vulnerabilities in your cloud deployments. As part of
your risk analysis, you'll want to gather data related to your security environment to determine how much risk you
face, and how important investment in Security Baseline governance is to your planned cloud deployments.
Every organization has different security environments and requirements and different potential sources of
security data. The following are examples of useful metrics that you should gather to help evaluate risk tolerance
within the Security Baseline discipline:
Data classification: Number of cloud-stored data sources and services that are unclassified according to your
organization's privacy, compliance, or business impact standards.
Number of sensitive data stores: Number of storage endpoints or databases that contain sensitive data and
should be protected.
Number of unencrypted data stores: Number of sensitive data stores that are not encrypted.
Attack surface: How many total data sources, services, and applications will be cloud-hosted. What percentage
of these data sources are classified as sensitive? What percentage of these applications and services are
mission-critical?
Covered standards: Number of security standards defined by the security team.
Covered resources: Deployed assets that are covered by security standards.
Overall standards compliance: Ratio of compliance adherence to security standards.
Attacks by severity: How many coordinated attempts to disrupt your cloud-hosted services, such as
Distributed Denial of Service (DDoS) attacks, does your infrastructure experience? What is the size and severity
of these attacks?
Malware protection: Percentage of deployed virtual machines (VMs) that have all required anti-malware,
firewall, or other security software installed.
Patch latency: How long it has been since VMs last had OS and software patches applied.
Security health recommendations: Number of security software recommendations for resolving health
standards for deployed resources, organized by severity.
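Two of the metrics above, overall standards compliance and malware protection, reduce to simple ratios over inventory data. The sketch below computes them from hypothetical records; it is not tied to any real security tooling, and the field names are assumptions.

```python
# Illustrative computation of two Security Baseline metrics from
# hypothetical inventory records.

def compliance_ratio(covered_resources, total_resources):
    """Ratio of deployed assets covered by security standards."""
    return covered_resources / total_resources

def malware_protection_pct(vms):
    """Percentage of VMs with all required security software installed."""
    protected = sum(1 for vm in vms if vm["antimalware"] and vm["firewall"])
    return protected / len(vms) * 100

if __name__ == "__main__":
    print(compliance_ratio(45, 50))  # 0.9
    fleet = [
        {"name": "vm-01", "antimalware": True,  "firewall": True},
        {"name": "vm-02", "antimalware": False, "firewall": True},
    ]
    print(malware_protection_pct(fleet))  # 50.0
```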
Next steps
Using the Security Baseline template, document metrics and tolerance indicators that align to the current cloud
adoption plan.
Review sample Security Baseline policies as a starting point to develop policies that address specific business risks
that align with your cloud adoption plans.
Review sample policies
Security Baseline sample policy statements
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk: A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Technical options: Actionable recommendations, specifications, or other guidance that IT teams and
developers can use when implementing the policy.
The following sample policy statements address common security-related business risks. These statements are
examples you can reference when drafting policy statements to address your organization's needs. These examples
are not meant to be prescriptive, and there are potentially several policy options for dealing with each identified
risk. Work closely with business, security, and IT teams to identify the best policies for your unique set of risks.
Asset classification
Technical risk: Assets that are not correctly identified as mission-critical or involving sensitive data may not
receive sufficient protections, leading to potential data leaks or business disruptions.
Policy statement: All deployed assets must be categorized by criticality and data classification. Classifications
must be reviewed by the cloud governance team and the application owner before deployment to the cloud.
Potential design option: Establish resource tagging standards and ensure IT staff apply them consistently to any
deployed resources using Azure resource tags.
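A pre-deployment classification check like the one this policy requires could be sketched as follows. The tag names are hypothetical stand-ins, not an established Azure tagging standard; in practice you would read tags from Azure resource metadata.

```python
# Hypothetical gate for the classification policy: list assets missing any
# required classification tag. Tag names are illustrative.

REQUIRED_TAGS = {"criticality", "data-classification"}

def untagged_assets(assets):
    """Return names of assets missing any required classification tag."""
    return [a["name"] for a in assets
            if not REQUIRED_TAGS <= set(a.get("tags", {}))]

if __name__ == "__main__":
    inventory = [
        {"name": "sqldb-orders",
         "tags": {"criticality": "mission-critical",
                  "data-classification": "confidential"}},
        {"name": "vm-test-07", "tags": {"criticality": "low"}},
    ]
    print(untagged_assets(inventory))  # ['vm-test-07']
```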
Data encryption
Technical risk: There is a risk of protected data being exposed during storage.
Policy statement: All protected data must be encrypted when at rest.
Potential design option: See the Azure encryption overview article for a discussion of how data-at-rest
encryption is performed on the Azure platform. Additional controls, such as in-account data encryption and control
over how storage account settings can be changed, should also be considered.
Network isolation
Technical risk: Connectivity between networks and subnets within networks introduces potential vulnerabilities
that can result in data leaks or disruption of mission-critical services.
Policy statement: Network subnets containing protected data must be isolated from any other subnets. Network
traffic between protected data subnets is to be audited regularly.
Potential design option: In Azure, network and subnet isolation is managed through Azure Virtual Networks.
DDoS protection
Technical risk: Distributed denial of service (DDoS) attacks can result in a business interruption.
Policy statement: Deploy automated DDoS mitigation mechanisms to all publicly accessible network endpoints.
No public-facing website backed by IaaS should be exposed to the internet without DDoS protection.
Potential design option: Use Azure DDoS Protection Standard to minimize disruptions caused by DDoS attacks.
Security review
Technical risk: Over time, new security threats and attack types emerge, increasing the risk of exposure or
disruption of your cloud resources.
Policy statement: Trends and potential exploits that could affect cloud deployments should be reviewed regularly
by the security team to provide updates to Security Baseline tooling used in the cloud.
Potential design option: Establish a regular security review meeting that includes relevant IT and governance
team members. Review existing security data and metrics to establish gaps in current policy and Security Baseline
tooling, and update policy to remediate any new risks. Leverage Azure Advisor and Azure Security Center to gain
actionable insights on emerging threats specific to your deployments.
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific security risks
that align with your cloud adoption plans.
To begin developing your own custom policy statements related to Security Baseline, download the Security
Baseline template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Security Baseline policy
adherence.
Establish policy compliance processes
Security Baseline policy compliance processes
This article discusses an approach to policy adherence processes that govern Security Baseline. Effective
governance of cloud security starts with recurring manual processes designed to detect vulnerabilities and impose
policies to remediate those security risks. This requires regular involvement of the cloud governance team and
interested business and IT stakeholders to review and update policy and ensure policy compliance. In addition,
many ongoing monitoring and enforcement processes can be automated or supplemented with tooling to reduce
the overhead of governance and allow for faster response to policy deviation.
Next steps
Using the Security Baseline template, document the processes and triggers that align to the current cloud
adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on
discipline improvement.
Security Baseline discipline improvement
Security Baseline discipline improvement
The Security Baseline discipline focuses on ways of establishing policies that protect the network, assets, and most
importantly the data that will reside on a cloud provider's solution. Within the Five Disciplines of Cloud
Governance, Security Baseline includes classification of the digital estate and data. It also includes documentation
of risks, business tolerance, and mitigation strategies associated with the security of the data, assets, and network.
From a technical perspective, this also includes involvement in decisions regarding encryption, network
requirements, hybrid identity strategies, and the processes used to develop cloud Security Baseline policies.
This article outlines some potential tasks your company can engage in to better develop and mature the Security
Baseline discipline. These tasks can be broken down into the planning, building, adopting, and operating phases of
implementing a cloud solution, which are then iterated on, allowing the development of an incremental approach
to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud security governance, move on to learn more about what security
and best practices guidance Microsoft provides for Azure.
Learn about security guidance for Azure
Introduction to Azure security
Learn about logging, reporting, and monitoring
Cloud-native Security Baseline policy
Security Baseline is one of the Five Disciplines of Cloud Governance. This discipline focuses on general security
topics including protection of the network, digital assets, data, etc. As outlined in the policy review guide, the Cloud
Adoption Framework includes three levels of sample policy for each discipline: cloud-native, enterprise, and
cloud-design-principle-compliant. This article discusses the cloud-native sample policy for the Security Baseline
discipline.
NOTE
Microsoft is in no position to dictate corporate or IT policy. This article will help you prepare for an internal policy review. It is
assumed that this sample policy will be extended, validated, and tested against your corporate policy before attempting to
use it. Any use of this sample policy as-is is discouraged.
Policy alignment
This sample policy synthesizes a cloud-native scenario, meaning that the tools and platforms provided by Azure are
sufficient to manage business risks involved in a deployment. In this scenario, it is assumed that a simple
configuration of the default Azure services provides sufficient asset protection.
Next steps
Now that you've reviewed the sample Security Baseline policy for cloud-native solutions, return to the policy
review guide to start building on this sample to create your own policies for cloud adoption.
Build your own policies using the policy review guide
Microsoft Security Guidance
Tools
Microsoft introduced the Service Trust Platform and Compliance Manager to help with the following:
Overcome compliance management challenges.
Fulfill responsibilities of meeting regulatory requirements.
Conduct self-service audits and risk assessments of enterprise cloud service utilization.
These tools are designed to help organizations meet complex compliance obligations and improve data protection
capabilities when choosing and using Microsoft Cloud services.
Service Trust Platform (STP) provides in-depth information and tools to help meet your needs for using
Microsoft Cloud services, including Azure, Office 365, Dynamics 365, and Windows. STP is a one-stop shop for
security, regulatory, compliance, and privacy information related to the Microsoft Cloud. It is where we publish the
information and resources needed to perform self-service risk assessments of cloud services and tools. STP was
created to help track regulatory compliance activities within Azure, including:
Compliance Manager: Compliance Manager, a workflow-based risk assessment tool in the Microsoft Service
Trust Platform, enables you to track, assign, and verify your organization's regulatory compliance activities
related to Microsoft Cloud services, such as Office 365, Dynamics 365 and Azure. You can find more details in
the next section.
Trust documents: Currently there are three categories of guides that provide you with abundant resources to
assess Microsoft Cloud; learn about Microsoft operations in security, compliance, and privacy; and help you act
on improving your data protection capabilities. These include:
Audit reports: Audit reports allow you to stay current on the latest privacy, security, and compliance-related
information for Microsoft Cloud services. This includes ISO, SOC, FedRAMP and other audit reports, bridge
letters, and materials related to independent third-party audits of Microsoft Cloud services such as Azure,
Office 365, Dynamics 365, and others.
Data protection guides: Data protection guides provide information about how Microsoft Cloud services
protect your data, and how you can manage cloud data security and compliance for your organization. This
includes deep-dive white papers that provide details on how Microsoft designs and operates cloud services,
FAQs, reports of end-of-year security assessments, penetration test results, and guidance to help you conduct
risk assessment and improve your data protection capabilities.
Azure security and compliance blueprint: Blueprints provide resources to assist you in building and
launching cloud-powered applications that help you comply with stringent regulations and standards. With
more certifications than any other cloud provider, you can have confidence deploying your critical workloads to
Azure, with blueprints that include:
Industry-specific overview and guidance.
Customer responsibilities matrix.
Reference architectures with threat models.
Control implementation matrices.
Automation to deploy reference architectures.
Privacy resources: Documentation for Data Protection Impact Assessments, Data Subject Requests
(DSRs), and Data Breach Notification is provided to incorporate into your own accountability program in
support of the General Data Protection Regulation (GDPR).
Get started with GDPR: Microsoft products and services help organizations meet GDPR requirements while
collecting or processing personal data. STP is designed to give you information about the capabilities in
Microsoft services that you can use to address specific requirements of the GDPR. The documentation can help
your GDPR accountability and your understanding of technical and organizational measures. Documentation
for Data Protection Impact Assessments, Data Subject Requests (DSRs), and Data Breach Notification is
provided to incorporate into your own accountability program in support of the GDPR.
Data subject requests: The GDPR grants individuals (or data subjects) certain rights in connection with
the processing of their personal data. This includes the right to correct inaccurate data, erase data, or
restrict its processing, as well as receive their data and fulfill a request to transmit their data to another
controller.
Data breach: The GDPR mandates notification requirements for data controllers and processors in the
event of a breach of personal data. STP provides you with information about how Microsoft tries to
prevent breaches in the first place, how Microsoft detects a breach, and how Microsoft will respond in
the event of a breach and notify you as a data controller.
Data protection impact assessment: Microsoft helps controllers complete GDPR Data Protection
Impact Assessments. The GDPR provides a non-exhaustive list of cases in which DPIAs must be carried
out, such as automated processing for the purposes of profiling and similar activities; large-scale
processing of special categories of personal data; and systematic monitoring of a publicly accessible area
on a large scale.
Other resources: In addition to the tools and guidance discussed in the sections above, STP also provides other
resources including regional compliance, additional resources for the Security and Compliance Center,
and frequently asked questions about the Service Trust Platform, Compliance Manager, and
privacy/GDPR.
Regional compliance: STP provides numerous compliance documents and guidance for Microsoft online
services to meet compliance requirements for different regions including Czech Republic, Poland, and Romania.
Next-generation detection
Attackers are increasingly automated and sophisticated. They use data science too. They reverse-engineer
protections and build systems that support mutations in behavior. They disguise their activities as noise, and
learn quickly from mistakes. Machine learning helps us respond to these developments.
Behavioral analytics
Behavioral analytics is a technique that analyzes and compares data to a collection of known patterns. However,
these patterns are not simple signatures. They are determined through complex machine learning algorithms that
are applied to massive data sets. They are also determined through careful analysis of malicious behaviors by
expert analysts. Azure Security Center can use behavioral analytics to identify compromised resources based on
analysis of virtual machine logs, virtual network device logs, fabric logs, crash dumps, and other sources.
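As a toy illustration of behavioral baselining (not Security Center's actual algorithm, which uses far richer models, expert-curated patterns, and much larger data sets), the following sketch learns a simple statistical baseline from historical event counts and flags strong deviations:

```python
# Toy illustration of behavioral baselining: learn a baseline from
# historical event counts, then flag observations that deviate strongly
# from the learned pattern. Data values are hypothetical.
from statistics import mean, stdev

def fit_baseline(history):
    """Learn the mean and standard deviation of a per-hour event count."""
    return mean(history), stdev(history)

def is_anomalous(observation, baseline, threshold=3.0):
    """Flag an observation more than `threshold` std devs from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return observation != mu
    return abs(observation - mu) / sigma > threshold

# Hourly failed-logon counts for a VM over a normal period
history = [2, 3, 1, 2, 4, 3, 2, 1, 3, 2, 2, 3]
baseline = fit_baseline(history)

print(is_anomalous(120, baseline))  # burst of failures -> True
print(is_anomalous(3, baseline))    # normal activity   -> False
```

Real detection systems replace the single mean/deviation pair with models over many behavioral dimensions, but the comparison-against-a-learned-pattern idea is the same.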
Security Baseline tools in Azure
Security Baseline is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways of
establishing policies that protect the network, assets, and most importantly the data that will reside on a cloud
provider's solution. Within the Five Disciplines of Cloud Governance, the Security Baseline discipline involves
classification of the digital estate and data. It also involves documentation of risks, business tolerance, and
mitigation strategies associated with the security of data, assets, and networks. From a technical perspective, this
discipline also includes involvement in decisions regarding encryption, network requirements, hybrid identity
strategies, and tools to automate enforcement of security policies across resource groups.
The following list of Azure tools can help mature the policies and processes that support Security Baseline.
| Tool | Azure portal and Azure Resource Manager | Azure Key Vault | Azure AD | Azure Policy | Azure Security Center | Azure Monitor |
|------|------|------|------|------|------|------|
| Encrypt virtual drives | No | Yes | No | No | No | No |
| Manage hybrid identity services | No | No | Yes | No | No | No |
| Restrict allowed types of resource | No | No | No | Yes | No | No |
| Preemptively detect vulnerabilities | No | No | No | No | Yes | No |
| Configure backup and disaster recovery | Yes | No | No | No | No | No |
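As an illustration of the "Restrict allowed types of resource" capability above, an Azure Policy rule can deny any resource whose type is not in an approved list. The following sketch is modeled on the shape of the built-in "Allowed resource types" definition; the parameter name is illustrative:

```json
{
  "policyRule": {
    "if": {
      "not": {
        "field": "type",
        "in": "[parameters('listOfAllowedResourceTypes')]"
      }
    },
    "then": {
      "effect": "deny"
    }
  },
  "parameters": {
    "listOfAllowedResourceTypes": {
      "type": "Array",
      "metadata": {
        "displayName": "Allowed resource types",
        "strongType": "resourceTypes"
      }
    }
  }
}
```

Assigned at a management group or subscription scope, a rule like this blocks deployments of unapproved resource types before they occur.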
For a complete list of Azure security tools and services, see Security services and technologies available on Azure.
It is also common for customers to use third-party tools for facilitating Security Baseline activities. For more
information, see the article Integrate security solutions in Azure Security Center.
In addition to security tools, the Microsoft Trust Center contains extensive guidance, reports, and related
documentation that can help you perform risk assessments as part of your migration planning process.
Identity Baseline is one of the Five Disciplines of Cloud Governance within the Cloud Adoption Framework governance model.
Identity is increasingly considered the primary security perimeter in the cloud, which is a shift from the traditional focus on
network security. Identity services provide the core mechanisms supporting access control and organization within IT
environments, and the Identity Baseline discipline complements the Security Baseline discipline by consistently applying
authentication and authorization requirements across cloud adoption efforts.
NOTE
Identity Baseline governance does not replace the existing IT teams, processes, and procedures that allow your organization to
manage and secure identity services. The primary purpose of this discipline is to identify potential identity-related business risks
and provide risk-mitigation guidance to IT staff that are responsible for implementing, maintaining, and operating your identity
management infrastructure. As you develop governance policies and processes, make sure to involve relevant IT teams in your
planning and review processes.
This section of the Cloud Adoption Framework outlines the approach to developing an Identity Baseline discipline as part of
your cloud governance strategy. The primary audience for this guidance is your organization's cloud architects and other
members of your cloud governance team. However, the decisions, policies, and processes that emerge from this discipline
should involve engagement and discussions with relevant members of the IT teams responsible for implementing and managing
your organization's identity management solutions.
If your organization lacks in-house expertise in Identity Baseline and security, consider engaging external consultants as a part of
this discipline. Also consider engaging Microsoft Consulting Services, the Microsoft FastTrack cloud adoption service, or other
external cloud adoption partners to discuss concerns related to this discipline.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of an Identity Baseline
discipline. To see policy statement samples, see the article on Identity Baseline Policy Statements. These samples can serve as a
starting point for your organization's governance policies.
CAUTION
The sample policies come from common customer experiences. To better align these policies to specific cloud governance needs,
execute the following steps to create policy statements that meet your unique business needs.
Business Risks
Understand the motives and risks commonly associated with the Identity Baseline discipline.
Indicators and Metrics
Indicators to understand whether it is the right time to invest in the Identity Baseline discipline.
Maturity
Aligning Cloud Management maturity with phases of cloud adoption.
Toolchain
Azure services that can be implemented to support the Identity Baseline discipline.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Identity Baseline template
The first step to implementing change is communicating the desired change. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern identity services in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Identity Baseline policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Identity Baseline discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Identity Baseline motivations and business risks
This article discusses the reasons that customers typically adopt an Identity Baseline discipline within a cloud
governance strategy. It also provides a few examples of business risks that drive policy statements.
Business risk
The Identity Baseline discipline attempts to address core business risks related to identity services and access
control. Work with your business to identify these risks and monitor each of them for relevance as you plan for
and implement your cloud deployments.
Risks will differ between organizations, but the following serve as common identity-related risks that you can use
as a starting point for discussions within your cloud governance team:
Unauthorized access. Sensitive data and resources that can be accessed by unauthorized users can lead to
data leaks or service disruptions, violating your organization's security perimeter and risking business or legal
liabilities.
Inefficiency due to multiple identity solutions. Organizations with multiple identity services tenants can
require multiple accounts for users. This leads to inefficiency for users who need to remember multiple sets
of credentials and for IT teams managing accounts across multiple systems. If user access assignments are not
updated across identity solutions as staff, teams, and business goals change, your cloud resources may be
vulnerable to unauthorized access, or users may be unable to access required resources.
Inability to share resources with external partners. Difficulty adding external business partners to your
existing identity solutions can prevent efficient resource sharing and business communication.
On-premises identity dependencies. Legacy authentication mechanisms or third-party multi-factor
authentication might not be available in the cloud, requiring either that migrating workloads be retooled or
that additional identity services be deployed to the cloud. Either requirement could delay or prevent migration
and increase costs.
Next steps
Using the Cloud Management template, document business risks that are likely to be introduced by the current
cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Identity Baseline metrics, indicators, and risk
tolerance
This article will help you quantify business risk tolerance as it relates to Identity Baseline. Defining metrics and
indicators helps you create a business case for making an investment in maturing the Identity Baseline discipline.
Metrics
Identity Baseline focuses on identifying, authenticating, and authorizing individuals, groups of users, or automated
processes, and providing them appropriate access to resources in your cloud deployments. As part of your risk
analysis you'll want to gather data related to your identity services to determine how much risk you face, and how
important investment in Identity Baseline governance is to your planned cloud deployments.
The following are examples of useful metrics that you should gather to help evaluate risk tolerance within the
Identity Baseline discipline:
Identity systems size. Total number of users, groups, or other objects managed through your identity
systems.
Overall size of directory services infrastructure. Number of directory forests, domains, and tenants used
by your organization.
Dependency on legacy or on-premises authentication mechanisms. Number of workloads that depend
on legacy authentication mechanisms or third-party multi-factor authentication.
Extent of cloud-deployed directory services. Number of directory forests, domains, and tenants you've
deployed to the cloud.
Cloud-deployed Active Directory servers. Number of Active Directory servers deployed to the cloud.
Cloud-deployed organizational units. Number of Active Directory organizational units (OUs) deployed to
the cloud.
Extent of federation. Number of Identity Baseline systems federated with your organization's systems.
Elevated users. Number of user accounts with elevated access to resources or management tools.
Use of role-based access control. Number of subscriptions, resource groups, or individual resources not
managed through role-based access control (RBAC) via groups.
Authentication claims. Number of successful and failed user authentication attempts.
Authorization claims. Number of successful and failed attempts by users to access resources.
Compromised accounts. Number of user accounts that have been compromised.
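Several of these metrics reduce to simple ratios over directory and sign-in data. The following sketch shows the arithmetic with hypothetical records; in practice the inputs would come from your identity systems (for example, Azure AD reporting):

```python
# Sketch of turning raw identity data into risk metrics.
# All records below are hypothetical placeholders.

def elevated_user_ratio(accounts):
    """Share of accounts holding elevated access to resources or tools."""
    elevated = sum(1 for a in accounts if a["elevated"])
    return elevated / len(accounts)

def auth_failure_rate(attempts):
    """Share of authentication attempts that failed."""
    failures = sum(1 for a in attempts if not a["success"])
    return failures / len(attempts)

accounts = [
    {"user": "alice", "elevated": True},
    {"user": "bob", "elevated": False},
    {"user": "carol", "elevated": False},
    {"user": "dave", "elevated": False},
]
attempts = [{"success": True}] * 95 + [{"success": False}] * 5

print(f"Elevated users: {elevated_user_ratio(accounts):.0%}")   # 25%
print(f"Auth failure rate: {auth_failure_rate(attempts):.0%}")  # 5%
```

Tracking ratios like these over time, rather than raw counts, makes it easier to spot when risk exposure is growing faster than the estate itself.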
Next steps
Using the Cloud Management template, document metrics and tolerance indicators that align to the current cloud
adoption plan.
Review sample Identity Baseline policies as a starting point to develop policies that address specific business risks
that align with your cloud adoption plans.
Review sample policies
Identity Baseline sample policy statements
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk: A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common identity-related business risks. These statements are
examples you can reference when drafting policy statements to address your organization's needs. These examples
are not meant to be prescriptive, and there are potentially several policy options for dealing with each identified
risk. Work closely with business and IT teams to identify the best policies for your unique set of risks.
Overprovisioned access
Technical risk: Users and groups with control over resources beyond their area of responsibility can result in
unauthorized modifications leading to outages or security vulnerabilities.
Policy statement: The following policies will be implemented:
A least-privilege access model will be applied to any resources involved in mission-critical applications or
protected data.
Elevated permissions should be an exception, and any such exceptions must be recorded with the cloud
governance team. Exceptions will be audited regularly.
Potential design options: Consult the Azure Identity Management best practices to implement a role-based
access control (RBAC) strategy that restricts access based on the need-to-know and least-privilege security
principles.
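One way to operationalize the exception-auditing policy above is to periodically compare current elevated role assignments against the exceptions recorded with the cloud governance team. The sketch below uses illustrative data shapes, principal names, and scopes; real assignments would come from your RBAC tooling:

```python
# Hedged sketch of auditing the "elevated permissions are exceptions"
# policy. The assignment records and exception register are illustrative.

ELEVATED_ROLES = {"Owner", "Contributor", "User Access Administrator"}

def unapproved_elevations(assignments, approved_exceptions):
    """Return elevated assignments not recorded as approved exceptions."""
    return [
        a for a in assignments
        if a["role"] in ELEVATED_ROLES
        and (a["principal"], a["scope"]) not in approved_exceptions
    ]

assignments = [
    {"principal": "ops-team", "role": "Contributor", "scope": "/subscriptions/prod"},
    {"principal": "dev-app", "role": "Reader", "scope": "/subscriptions/dev"},
    {"principal": "jsmith", "role": "Owner", "scope": "/subscriptions/prod"},
]
approved_exceptions = {("ops-team", "/subscriptions/prod")}

for finding in unapproved_elevations(assignments, approved_exceptions):
    print(f"Audit finding: {finding['principal']} has {finding['role']} "
          f"at {finding['scope']} without a recorded exception")
```

Running a check like this on the regular audit cadence the policy requires turns the "exceptions will be audited regularly" statement into a repeatable process.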
Identity reviews
Technical risk: As business changes over time, the addition of new cloud deployments or other security concerns
can increase the risks of unauthorized access to secure resources.
Policy statement: Cloud Governance processes must include quarterly review with identity management teams
to identify malicious actors or usage patterns that should be prevented by cloud asset configuration.
Potential design options: Establish a quarterly security review meeting that includes both governance team
members and IT staff responsible for managing identity services. Review existing security data and metrics to
establish gaps in current identity management policy and tooling, and update policy to remediate any new risks.
Next steps
Use the samples mentioned in this article as a starting point for developing policies to address specific business
risks that align with your cloud adoption plans.
To begin developing your own custom policy statements related to Identity Baseline, download the Identity
Baseline template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Identity Baseline policy
adherence.
Establish policy compliance processes
Identity Baseline policy compliance processes
This article discusses an approach to policy adherence processes that govern Identity Baseline. Effective
governance of identity starts with recurring manual processes that guide identity policy adoption and revisions.
This requires regular involvement of the cloud governance team and interested business and IT stakeholders to
review and update policy and ensure policy compliance. In addition, many ongoing monitoring and enforcement
processes can be automated or supplemented with tooling to reduce the overhead of governance and allow for
faster response to policy deviation.
Next steps
Using the Cloud Management template, document the processes and triggers that align to the current cloud
adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on
discipline improvement.
Identity Baseline discipline improvement
Identity Baseline discipline improvement
The Identity Baseline discipline focuses on ways of establishing policies that ensure consistency and continuity of
user identities regardless of the cloud provider that hosts the application or workload. Within the Five Disciplines
of Cloud Governance, Identity Baseline includes decisions regarding the Hybrid Identity Strategy, evaluation and
extension of identity repositories, implementation of single sign-on (same sign-on), auditing and monitoring for
unauthorized use or malicious actors. In some cases, it may also involve decisions to modernize, consolidate, or
integrate multiple identity providers.
This article outlines some potential tasks your company can engage in to better develop and mature the Identity
Baseline discipline. These tasks can be broken down into the planning, building, adopting, and operating phases of
implementing a cloud solution, which are then iterated on, allowing an incremental approach to cloud governance
to develop.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud identity governance, examine the Identity Baseline toolchain to
identify Azure tools and features that you'll need when developing the Identity Baseline governance discipline on
the Azure platform.
Identity Baseline toolchain for Azure
Identity Baseline tools in Azure
Identity Baseline is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways of
establishing policies that ensure consistency and continuity of user identities regardless of the cloud provider
that hosts the application or workload.
The following tools are included in the discovery guide on Hybrid Identity.
Active Directory (on-premises): Active Directory is the identity provider most frequently used in the
enterprise to store and validate user credentials.
Azure Active Directory: A software as a service (SaaS) equivalent to Active Directory, capable of federating
with an on-premises Active Directory.
Active Directory (IaaS): An instance of the Active Directory application running in a virtual machine in Azure.
Identity is the control plane for IT security, and authentication is an organization's access guard to the cloud.
Organizations need an identity control plane that strengthens their security and keeps their cloud apps safe from
intruders.
Cloud authentication
Choosing the correct authentication method is the first concern for organizations wanting to move their apps to
the cloud.
When you choose cloud authentication, Azure AD handles users' sign-in process. Coupled with seamless single sign-on
(SSO), users can sign in to cloud apps without having to reenter their credentials. With cloud authentication, you
can choose from two options:
Azure AD password hash synchronization: The simplest way to enable authentication for on-premises
directory objects in Azure AD. It can also serve as a backup failover authentication method for any other
method, in case your on-premises servers go down.
Azure AD Pass-through Authentication: Provides a simple password validation for Azure AD
authentication services by using a software agent that runs on one or more on-premises servers.
NOTE
Companies with a security requirement to immediately enforce on-premises user account states, password policies, and
sign-in hours should consider the Pass-through Authentication method.
Federated authentication:
When you choose this method, Azure AD passes the authentication process to a separate trusted authentication
system, such as on-premises Active Directory Federation Services (AD FS) or a trusted third-party federation
provider, to validate the user's password.
The article Choosing the right authentication method for Azure Active Directory contains a decision tree to help
you choose the best solution for your organization.
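As a greatly simplified sketch of that decision logic (the real decision tree weighs many more factors, and the requirement flags here are illustrative, not an official checklist), the core choice looks something like:

```python
# Simplified sketch of the hybrid authentication-method decision logic.
# Requirement flags are illustrative placeholders.

def choose_auth_method(requirements):
    """Pick a hybrid authentication method from coarse requirement flags."""
    if requirements.get("needs_federation_only_features"):
        # e.g. smart card sign-in or an existing third-party MFA investment
        return "Federation with AD FS"
    if requirements.get("enforce_on_prem_account_states_at_sign_in"):
        # immediately honor on-premises account states, password policies,
        # and sign-in hours (see the note above)
        return "Pass-through Authentication + Seamless SSO"
    # otherwise prefer the simplest option
    return "Password hash synchronization + Seamless SSO"

print(choose_auth_method({}))
print(choose_auth_method({"enforce_on_prem_account_states_at_sign_in": True}))
```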
The following table lists the native tools that can help mature the policies and processes that support this
governance discipline.
| Consideration | Password hash synchronization + Seamless SSO | Pass-through Authentication + Seamless SSO | Federation with AD FS |
|------|------|------|------|
| Where does authentication happen? | In the cloud | In the cloud, after a secure password verification exchange with the on-premises authentication agent | On-premises |
| What are the on-premises server requirements beyond the provisioning system, Azure AD Connect? | None | One server for each additional authentication agent | Two or more AD FS servers; two or more WAP servers in the perimeter/DMZ network |
| What are the requirements for on-premises internet and networking beyond the provisioning system? | None | Outbound internet access from the servers running authentication agents | Inbound internet access to WAP servers in the perimeter; inbound network access to AD FS servers from WAP servers in the perimeter |
| Is there a health monitoring solution? | Not required | Agent status provided by the Azure Active Directory admin center | Azure AD Connect Health |
| Do users get single sign-on to cloud resources from domain-joined devices within the company network? | Yes, with Seamless SSO | Yes, with Seamless SSO | Yes |
| Is Windows Hello for Business supported? | Key trust model; certificate trust model with Intune | Key trust model; certificate trust model with Intune | Key trust model; certificate trust model |
| What are the multi-factor authentication options? | Azure Multi-Factor Authentication | Azure Multi-Factor Authentication | Azure Multi-Factor Authentication; third-party multi-factor authentication |
| What user account states are supported? | Disabled accounts (up to 30-minute delay) | Disabled accounts; account locked out | Disabled accounts; account locked out |
| What are the conditional access options? | Azure AD conditional access | Azure AD conditional access | Azure AD conditional access; AD FS claim rules |
| Can you customize the logo, image, and description on the sign-in pages? | Yes, with Azure AD Premium | Yes, with Azure AD Premium | Yes |
| What advanced scenarios are supported? | Smart password lockout; leaked credentials reports | Smart password lockout | Multisite low-latency authentication system; AD FS extranet lockout |
NOTE
Custom controls in Azure AD conditional access do not currently support device registration.
Next steps
The Hybrid Identity Digital Transformation Framework whitepaper outlines combinations and solutions for
choosing and integrating each of these components.
The Azure AD Connect tool helps you to integrate your on-premises directories with Azure AD.
Resource Consistency is one of the Five Disciplines of Cloud Governance within the Cloud Adoption Framework governance
model. This discipline focuses on ways of establishing policies related to the operational management of an environment,
application, or workload. IT Operations teams often provide monitoring of applications, workload, and asset performance. They
also commonly execute the tasks required to meet scale demands, remediate performance Service Level Agreement (SLA)
violations, and proactively avoid performance SLA violations through automated remediation. Within the Five Disciplines of
Cloud Governance, Resource Consistency is a discipline that ensures resources are consistently configured in such a way that
they can be discoverable by IT operations, are included in recovery solutions, and can be onboarded into repeatable operations
processes.
NOTE
Resource Consistency governance does not replace the existing IT teams, processes, and procedures that allow your organization
to effectively manage cloud-based resources. The primary purpose of this discipline is to identify potential business risks and
provide risk-mitigation guidance to the IT staff that are responsible for managing your resources in the cloud. As you develop
governance policies and processes, make sure to involve relevant IT teams in your planning and review processes.
This section of the Cloud Adoption Framework outlines how to develop a Resource Consistency discipline as part of your cloud
governance strategy. The primary audience for this guidance is your organization's cloud architects and other members of your
cloud governance team. However, the decisions, policies, and processes that emerge from this discipline should involve
engagement and discussions with relevant members of the IT teams responsible for implementing and managing your
organization's Resource Consistency solutions.
If your organization lacks in-house expertise in Resource Consistency strategies, consider engaging external consultants as a part
of this discipline. Also consider engaging Microsoft Consulting Services, the Microsoft FastTrack cloud adoption service, or other
external cloud adoption experts to discuss how best to organize, track, and optimize your cloud-based assets.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a Resource Consistency
discipline. To see policy statement samples, see the article on Resource Consistency Policy Statements. These samples can serve
as a starting point for your organization's governance policies.
CAUTION
The sample policies come from common customer experiences. To better align these policies to specific cloud governance needs,
execute the following steps to create policy statements that meet your unique business needs.
Business Risks
Understand the motives and risks commonly associated with the Resource Consistency discipline.
Indicators and Metrics
Indicators to understand whether it is the right time to invest in the Resource Consistency discipline.
Maturity
Aligning Cloud Management maturity with phases of cloud adoption.
Toolchain
Azure services that can be implemented to support the Resource Consistency discipline.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Resource Consistency template
The first step to implementing change is communicating what is desired. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern IT operations and management in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Resource Consistency policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Resource Consistency discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Resource Consistency motivations and business risks
This article discusses the reasons that customers typically adopt a Resource Consistency discipline within a cloud
governance strategy. It also provides a few examples of potential business risks that can drive policy statements.
Business risk
The Resource Consistency discipline attempts to address core operational business risks. Work with your business
and IT teams to identify these risks and monitor each of them for relevance as you plan for and implement your
cloud deployments.
Risks will differ between organizations, but the following serve as common risks that you can use as a starting
point for discussions within your cloud governance team:
Unnecessary operational cost. Obsolete or unused resources, or resources that are overprovisioned during
times of low demand, add unnecessary operational costs.
Underprovisioned resources. Resources that experience higher than anticipated demand can result in
business disruption as cloud resources are overwhelmed by demand.
Management inefficiencies. Lack of consistent naming and tagging metadata associated with resources can
lead to IT staff having difficulty finding resources for management tasks or identifying ownership and
accounting information related to assets. This results in management inefficiencies that can increase cost and
slow IT responsiveness to service disruption or other operational issues.
Business interruption. Service disruptions that result in violations of your organization's established Service
Level Agreements (SLAs) can result in loss of business or other financial impacts to your company.
Next steps
Using the Cloud Management template, document business risks that are likely to be introduced by the current
cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Resource Consistency metrics, indicators, and risk
tolerance
This article will help you quantify business risk tolerance as it relates to Resource Consistency. Defining metrics
and indicators helps you create a business case for making an investment in maturing the Resource Consistency
discipline.
Metrics
The Resource Consistency discipline focuses on addressing risks related to the operational management of your
cloud deployments. As part of your risk analysis you'll want to gather data related to your IT operations to
determine how much risk you face, and how important investment in Resource Consistency governance is to your
planned cloud deployments.
Every organization has different operational scenarios, but the following items represent useful examples of the
metrics you should gather when evaluating risk tolerance within the Resource Consistency discipline:
Cloud assets. Total number of cloud-deployed resources.
Untagged resources. Number of resources without required accounting, business impact, or organizational
tags.
Underused assets. Number of resources where memory, CPU, or network capabilities are all consistently
underutilized.
Resource depletion. Number of resources where memory, CPU, or network capabilities are exhausted by
load.
Resource age. Time since resource was last deployed or modified.
VMs in critical condition. Number of deployed VMs where one or more critical issues are detected that
must be addressed to restore normal functionality.
Alerts by severity. Total number of alerts on a deployed asset, broken down by severity.
Unhealthy network links. Number of resources with network connectivity issues.
Unhealthy service endpoints. Number of issues with external network endpoints.
Cloud provider service health incidents. Number of disruptions or performance incidents caused by the
cloud provider.
Service level agreements. This can include both Microsoft's commitments for uptime and connectivity of
Azure services and commitments made by the business to its external and internal customers.
Service availability. Percentage of actual uptime of cloud-hosted workloads compared to the expected uptime.
Recovery time objective (RTO). The maximum acceptable time that an application can be unavailable after
an incident.
Recovery point objective (RPO). The maximum duration of data loss that is acceptable during a disaster. For
example, if you store data in a single database, with no replication to other databases, and perform hourly
backups, you could lose up to an hour of data.
Mean time to recover (MTTR). The average time required to restore a component after a failure.
Mean time between failures (MTBF). The duration that a component can reasonably expect to run between
outages. This metric can help you calculate how often a service will become unavailable.
Backup health. Number of backups actively being synchronized.
Recovery health. Number of recovery operations successfully performed.
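The SLA-related metrics above follow standard formulas. The following sketch shows the arithmetic with hypothetical numbers, including the hourly-backup RPO example:

```python
# Worked examples for the SLA-related metrics above.
# Formulas are the standard definitions; input numbers are hypothetical.

def availability(mtbf_hours, mttr_hours):
    """Expected availability given mean time between failures (MTBF)
    and mean time to recover (MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def worst_case_data_loss_hours(backup_interval_hours):
    """With periodic backups and no replication, worst-case data loss
    (the RPO exposure) equals the backup interval."""
    return backup_interval_hours

# MTBF of 30 days (720 hours) with a 2-hour MTTR:
print(f"Availability: {availability(720, 2):.2%}")        # about 99.72%
# Hourly backups, single database, no replication:
print(f"Worst-case loss: {worst_case_data_loss_hours(1)} hour(s)")
```

Comparing computed availability against the SLA your business has committed to is one concrete way to test whether your current MTBF and MTTR are tolerable.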
Risk tolerance indicators
Cloud platforms offer a baseline set of features that allow deployment teams to effectively manage small
deployments without extensive additional planning or processes. As a result, small dev/test or experimental first
workloads that include a relatively small number of cloud-based assets represent a low level of risk, and will likely
not need much in the way of a formal Resource Consistency policy.
However, as the size of your cloud estate grows, the complexity of managing your assets increases significantly.
With more assets in the cloud, the ability to identify ownership of resources and control resource
usage becomes critical to minimizing risks. As more mission-critical workloads are deployed to the cloud, service
uptime becomes more critical, and tolerance for service disruption and potential cost overruns diminishes rapidly.
In the early stages of cloud adoption, work with your IT operations team and business stakeholders to identify
business risks related to Resource Consistency, then determine an acceptable baseline for risk tolerance. This
section of the Cloud Adoption Framework provides examples, but the detailed risks and baselines for your
company or deployments may be different.
Once you have a baseline, establish minimum benchmarks representing an unacceptable increase in your
identified risks. These benchmarks act as triggers for when you need to take action to remediate these risks. The
following are a few examples of how operational metrics, such as those discussed above, can justify an increased
investment in the Resource Consistency discipline.
Tagging and naming trigger. A company with more than x resources lacking required tagging information or
not following naming standards should consider investing in the Resource Consistency discipline to help refine
these standards and ensure they are consistently applied to cloud-deployed assets.
Overprovisioned resources trigger. If a company has more than x% of assets regularly using small amounts
of their available memory, CPU, or network capabilities, investment in the Resource Consistency discipline is
suggested to help optimize resource usage for these items.
Underprovisioned resources trigger. If a company has more than x% of assets regularly exhausting most of
their available memory, CPU, or network capabilities, investment in the Resource Consistency discipline is
suggested to help ensure these assets have the resources necessary to prevent service interruptions.
Resource age trigger. A company with more than x resources that have not been updated in over y months
could benefit from investment in the Resource Consistency discipline aimed at ensuring active resources are
patched and healthy, while retiring obsolete or otherwise unused assets.
Service-level agreement trigger. A company that cannot meet its service-level agreements to its external
customers or internal partners should invest in the Deployment Acceleration discipline to reduce system
downtime.
Recovery time triggers. If a company exceeds the required thresholds for recovery time following a system
failure, it should invest in improving its Deployment Acceleration discipline and systems design to reduce or
eliminate failures or the effect of individual component downtime.
VM health trigger. A company that has more than x% of VMs experiencing a critical health issue should
invest in the Resource Consistency discipline to identify issues and improve VM stability.
Network health trigger. A company that has more than x% of network subnets or endpoints experiencing
connectivity issues should invest in the Resource Consistency discipline to identify and resolve network issues.
Backup coverage trigger. A company with x% of mission-critical assets without up-to-date backups in place
would benefit from an increased investment in the Resource Consistency discipline to ensure a consistent
backup strategy.
Backup health trigger. A company experiencing more than x% failure of restore operations should invest in
the Resource Consistency discipline to identify problems with backup and ensure important resources are
protected.
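The trigger pattern above is uniform: compare a collected metric against the benchmark agreed with your governance team, and flag any trigger whose threshold is met or exceeded. A minimal sketch of that evaluation follows; the metric names and threshold values are illustrative placeholders, not values defined by the Cloud Adoption Framework.

```python
def fired_triggers(metrics: dict, thresholds: dict) -> list:
    """Return the names of triggers whose metric meets or exceeds its threshold."""
    return [name for name, limit in thresholds.items()
            if metrics.get(name, 0) >= limit]

# Hypothetical metrics collected from a monitoring solution:
metrics = {
    "untagged_resources": 120,    # resources missing required tags
    "underprovisioned_pct": 3,    # % of assets exhausting CPU/memory/network
    "critical_vm_pct": 7,         # % of VMs with a critical health issue
}
# Benchmarks agreed with the cloud governance team (the "x" values above):
thresholds = {
    "untagged_resources": 100,
    "underprovisioned_pct": 5,
    "critical_vm_pct": 5,
}

print(fired_triggers(metrics, thresholds))
# → ['untagged_resources', 'critical_vm_pct']
```

Each fired trigger then becomes a prompt to increase investment in the Resource Consistency discipline, as described in the examples above.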
The exact metrics and triggers you use to gauge risk tolerance and the level of investment in the Resource
Consistency discipline will be specific to your organization, but the examples above should serve as a useful base
for discussion within your cloud governance team.
Next steps
Using the Cloud Management template, document metrics and tolerance indicators that align to the current cloud
adoption plan.
Review sample Resource Consistency policies as a starting point to develop policies that address specific business
risks that align with your cloud adoption plans.
Review sample policies
Resource Consistency sample policy statements
4 minutes to read • Edit Online
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk: A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common business risks related to resource consistency. These
statements are examples you can reference when drafting policy statements to address your organization's needs.
These examples are not meant to be prescriptive, and there are potentially several policy options for dealing with
each identified risk. Work closely with business and IT teams to identify the best policies for your unique set of
risks.
Tagging
Technical risk: Without proper metadata tagging associated with deployed resources, IT Operations cannot
prioritize support or optimization of resources based on required SLA, importance to business operations, or
operational cost. This can result in misallocation of IT resources and potential delays in incident resolution.
Policy statement: The following policies will be implemented:
Deployed assets should be tagged with the following values:
Cost
Criticality
SLA
Application
Environment
Governance tooling must validate tagging related to cost, criticality, SLA, application, and environment. All
values must align to predefined values managed by the governance team.
Potential design options: In Azure, standard name-value metadata tags are supported on most resource types.
Azure Policy can be used to enforce specific tags as part of resource creation.
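The validation half of this policy can be sketched as a simple check: every required tag must be present, and every value must come from a governance-approved list. The tag names and allowed values below are assumptions for illustration; your governance team defines the real ones.

```python
# Governance-approved tags and values (illustrative, not CAF-defined):
REQUIRED_TAGS = {
    "criticality": {"low", "medium", "high", "mission-critical"},
    "sla": {"99.9", "99.95", "99.99"},
    "environment": {"dev", "test", "prod"},
}

def tag_violations(asset_tags: dict) -> list:
    """Return human-readable violations of the tagging policy for one asset."""
    problems = []
    for tag, allowed in REQUIRED_TAGS.items():
        if tag not in asset_tags:
            problems.append(f"missing tag: {tag}")
        elif asset_tags[tag] not in allowed:
            problems.append(f"invalid value for {tag}: {asset_tags[tag]}")
    return problems

# An asset missing its SLA tag and carrying an unapproved environment value:
print(tag_violations({"criticality": "high", "environment": "staging"}))
# → ['missing tag: sla', 'invalid value for environment: staging']
```

In practice this logic lives in governance tooling such as Azure Policy rather than custom code, but the check it performs is the same.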
Ungoverned subscriptions
Technical risk: Arbitrary creation of subscriptions and management groups can lead to isolated sections of your
cloud estate that are not properly subject to your governance policies.
Policy statement: Creation of new subscriptions or management groups for any mission-critical applications or
protected data will require a review from the cloud governance team. Approved changes will be integrated into a
proper blueprint assignment.
Potential design options: Lock down administrative access to your organization's Azure management groups to
only approved governance team members who will control the subscription creation and access control process.
Deployment compliance
Technical risk: Deployment scripts and automation tooling that are not fully vetted by the cloud governance team
can result in resource deployments that violate policy.
Policy statement: The following policies will be implemented:
Deployment tooling must be approved by the cloud governance team to ensure ongoing governance of
deployed assets.
Deployment scripts must be maintained in a central repository accessible by the cloud governance team for
periodic review and auditing.
Potential design options: Consistent use of Azure Blueprints to manage automated deployments allows
consistent deployments of Azure resources that adhere to your organization's governance standards and policies.
Monitoring
Technical risk: Improperly implemented or inconsistently instrumented monitoring can prevent the detection of
workload health issues or other policy compliance violations.
Policy statement: The following policies will be implemented:
Governance tooling must validate that all assets are included in monitoring for resource depletion, security,
compliance, and optimization.
Governance tooling must validate that the appropriate level of logging data is being collected for all
applications and data.
Potential design options: Azure Monitor is the default monitoring service in Azure, and consistent monitoring
can be enforced via Azure Blueprints when deploying resources.
Disaster recovery
Technical risk: Resource failure, deletions, or corruption can result in disruption of mission-critical applications or
services and the loss of sensitive data.
Policy statement: All mission-critical applications and protected data must have backup and recovery solutions
implemented to minimize business impact of outages or system failures.
Potential design options: The Azure Site Recovery service provides backup, recovery, and replication
capabilities that minimize outage duration in business continuity and disaster recovery (BCDR) scenarios.
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific business risks
that align with your cloud adoption plans.
To begin developing your own custom policy statements related to Resource Consistency, download the Resource
Consistency template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Resource Consistency policy
adherence.
Establish policy compliance processes
Resource Consistency policy compliance processes
5 minutes to read • Edit Online
This article discusses an approach to policy adherence processes that govern Resource Consistency. Effective
cloud Resource Consistency governance starts with recurring manual processes designed to identify operational
inefficiency, improve management of deployed resources, and ensure mission-critical workloads have minimal
disruptions. These manual processes are supplemented with monitoring, automation, and tooling to help reduce
the overhead of governance and allow for faster response to policy deviation.
Next steps
Using the Cloud Management template, document the processes and triggers that align to the current cloud
adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on
discipline improvement.
Resource Consistency discipline improvement
Resource Consistency discipline improvement
6 minutes to read • Edit Online
The Resource Consistency discipline focuses on ways of establishing policies related to the operational
management of an environment, application, or workload. Within the Five Disciplines of Cloud Governance,
Resource Consistency includes the monitoring of application, workload, and asset performance. It also includes the
tasks required to meet scale demands, remediate performance Service Level Agreement (SLA) violations, and
proactively avoid SLA violations through automated remediation.
This article outlines some potential tasks your company can engage in to better develop and mature the Resource
Consistency discipline. These tasks can be broken down into the planning, building, adopting, and operating phases
of implementing a cloud solution, which are then iterated on, allowing the development of an incremental approach
to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of those requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud resource governance, move on to learn more about how resource
access is managed in Azure in preparation for learning how to design a governance model for a simple workload
or for multiple teams.
Learn about resource access management in Azure Learn about service-level agreements for Azure Learn about
logging, reporting, and monitoring
Resource Consistency tools in Azure
2 minutes to read • Edit Online
Resource Consistency is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways of
establishing policies related to the operational management of an environment, application, or workload. Within
the Five Disciplines of Cloud Governance, the Resource Consistency discipline involves monitoring of application,
workload, and asset performance. It also involves the tasks required to meet scale demands, remediate
performance SLA violations, and proactively avoid performance SLA violations through automated remediation.
The following is a list of Azure tools that can help mature the policies and processes that support this governance
discipline.
Of the tools compared (Azure portal, Azure Resource Manager, Azure Blueprints, Azure Automation, Azure AD,
Azure Backup, and Azure Site Recovery):
Orchestrated environment deployment: supported by Azure Blueprints only.
Assess availability and scalability: supported by Azure Automation only.
Apply automated remediation: supported by Azure Automation only.
Manage billing: supported by the Azure portal only.
Along with these Resource Consistency tools and features, you will need to monitor your deployed resources for
performance and health issues. Azure Monitor is the default monitoring and reporting solution in Azure, and
provides features for monitoring your cloud resources, including the ability to schedule regular reports or custom
analysis.
When planning your deployment, you will need to consider where logging data is stored and how you integrate
cloud-based reporting and monitoring services with your existing processes and tools.
NOTE
Organizations also use third-party DevOps tools to monitor workloads and resources. For more information, see DevOps
tool integrations.
Next steps
Learn how to create, assign, and manage policy definitions in Azure.
Resource access management in Azure
4 minutes to read • Edit Online
Cloud Governance outlines the Five Disciplines of Cloud Governance, which include Resource Management.
What is resource access governance further explains how resource access management fits into the resource
management discipline. Before you move on to learn how to design a governance model, it's important to
understand the resource access management controls in Azure. The configuration of these resource access
management controls forms the basis of your governance model.
Begin by taking a closer look at how resources are deployed in Azure.
Figure 1 - A resource.
Summary
In this article, you learned about how resource access is managed in Azure using Azure Resource Manager.
Next steps
Now that you understand how to manage resource access in Azure, move on to learn how to design a governance
model for a simple workload or for multiple teams using these services.
An overview of governance
Governance design for a simple workload
6 minutes to read • Edit Online
The goal of this guidance is to help you learn the process for designing a resource governance model in Azure to
support a single team and a simple workload. You'll look at a set of hypothetical governance requirements, then
go through several example implementations that satisfy those requirements.
In the foundational adoption stage, our goal is to deploy a simple workload to Azure. This results in the following
requirements:
Identity management for a single workload owner who is responsible for deploying and maintaining the
simple workload. The workload owner requires permission to create, read, update, and delete resources as well
as permission to delegate these rights to other users in the identity management system.
Manage all resources for the simple workload as a single management unit.
Azure licensing
Before you begin designing your governance model, it's important to understand how Azure is licensed. This is
because the administrative accounts associated with your Azure license have the highest level of access to your
Azure resources. These administrative accounts form the basis of your governance model.
NOTE
If your organization has an existing Microsoft Enterprise Agreement that does not include Azure, Azure can be added by
making an upfront monetary commitment. For more information, see licensing Azure for the enterprise.
When Azure was added to your organization's Enterprise Agreement, your organization was prompted to create
an Azure account. During the account creation process, an Azure account owner was created, as well as an
Azure Active Directory (Azure AD) tenant with a global administrator account. An Azure AD tenant is a logical
construct that represents a secure, dedicated instance of Azure AD.
Figure 1 - An Azure account with an Account Manager and Azure AD Global Administrator.
Identity management
Azure only trusts Azure AD to authenticate users and authorize user access to resources, so Azure AD is our
identity management system. The Azure AD global administrator has the highest level of permissions and can
perform all actions related to identity, including creating users and assigning permissions.
Our requirement is identity management for a single workload owner who is responsible for deploying and
maintaining the simple workload. The workload owner requires permission to create, read, update, and delete
resources as well as permission to delegate these rights to other users in the identity management system.
Our Azure AD global administrator will create the workload owner account for the workload owner:
Figure 2 - The Azure AD global administrator creates the workload owner user account.
You aren't able to assign resource access permission until this user is added to a subscription, so you'll do that in
the next two sections.
Figure 4 - The Azure account owner associates the Azure AD tenant with the subscription.
You may have noticed that there is currently no user associated with the subscription, which means that no one
has permission to manage resources. In reality, the account owner is the owner of the subscription and has
permission to take any action on a resource in the subscription. However, in practical terms the account owner is
more than likely a finance person in your organization and is not responsible for creating, reading, updating, and
deleting resources - those tasks will be performed by the workload owner. Therefore, you need to add the
workload owner to the subscription and assign permissions.
Since the account owner is currently the only user with permission to add the workload owner to the
subscription, they add the workload owner to the subscription:
Figure 5 - The Azure account owner adds the workload owner to the subscription.
The Azure account owner grants permissions to the workload owner by assigning a role-based access control
(RBAC) role. The RBAC role specifies a set of permissions that the workload owner has for an individual
resource type or a set of resource types.
Notice that in this example, the account owner has assigned the built-in owner role:
Figure 6 - The workload owner was assigned the built-in owner role.
The built-in owner role grants all permissions to the workload owner at the subscription scope.
IMPORTANT
The Azure account owner is responsible for the financial commitment associated with the subscription, but the workload
owner has the same permissions. The account owner must trust the workload owner to deploy resources that are within
the subscription budget.
The next level of management scope is the resource group level. A resource group is a logical container for
resources. Operations applied at the resource group level apply to all resources in a group. Also, it's important to
note that permissions for each user are inherited from the next level up unless they are explicitly changed at that
scope.
To illustrate this, let's look at what happens when the workload owner creates a resource group:
Figure 7 - The workload owner creates a resource group and inherits the built-in owner role at the resource group
scope.
Again, the built-in owner role grants all permissions to the workload owner at the resource group scope. As
discussed earlier, this role is inherited from the subscription level. If a different role is assigned to this user at this
scope, it applies to this scope only.
The lowest level of management scope is at the resource level. Operations applied at the resource level apply only
to the resource itself. Again, permissions at the resource level are inherited from resource group scope. For
example, let's look at what happens if the workload owner deploys a virtual network into the resource group:
Figure 8 - The workload owner creates a resource and inherits the built-in owner role at the resource scope.
The workload owner inherits the owner role at the resource scope, which means the workload owner has all
permissions for the virtual network.
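The inheritance behavior described above, where a role assigned at subscription scope flows down to resource groups and resources unless a more specific assignment exists at a lower scope, can be sketched as a walk up the scope hierarchy. This is a simplified model of the text's description, not the Azure API, and the scope names are made up for illustration.

```python
def effective_role(assignments: dict, scope: str):
    """Walk from the given scope up toward the root and return the first role
    assignment found, so the most specific assignment wins. Returns None if no
    assignment exists anywhere on the path."""
    parts = scope.split("/")
    while parts:
        candidate = "/".join(parts)
        if candidate in assignments:
            return assignments[candidate]
        parts.pop()  # move up one scope level
    return None

# The workload owner holds the built-in owner role at subscription scope:
assignments = {"sub1": "Owner"}
# A virtual network in resource group A inherits that role from the subscription:
print(effective_role(assignments, "sub1/rgA/vnet1"))  # → Owner

# An explicit assignment at the resource group scope applies at that scope:
assignments["sub1/rgA"] = "Reader"
print(effective_role(assignments, "sub1/rgA/vnet1"))  # → Reader
```

Note that real Azure RBAC assignments are additive across scopes; this sketch models only the narrower "inherited unless explicitly changed" reading used in the walkthrough above.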
Next steps
Deploy a basic workload to Azure
Learn about resource access for multiple teams
Governance design for multiple teams
24 minutes to read • Edit Online
The goal of this guidance is to help you learn the process for designing a resource governance model in Azure to
support multiple teams, multiple workloads, and multiple environments. First you'll look at a set of hypothetical
governance requirements, then go through several example implementations that satisfy those requirements.
The requirements are:
The enterprise plans to transition new cloud roles and responsibilities to a set of users and therefore requires
identity management for multiple teams with different resource access needs in Azure. This identity
management system is required to store the identity of the following users:
The individual in your organization responsible for ownership of subscriptions.
The individual in your organization responsible for the shared infrastructure resources used to
connect your on-premises network to an Azure virtual network.
Two individuals in your organization responsible for managing a workload.
Support for multiple environments. An environment is a logical grouping of resources, such as virtual
machines, virtual networking, and network traffic routing services. These groups of resources have similar
management and security requirements and are typically used for a specific purpose such as testing or
production. In this example, the requirement is for four environments:
A shared infrastructure environment that includes resources shared by workloads in other
environments. For example, a virtual network with a gateway subnet that provides connectivity to your
on-premises network.
A production environment with the most restrictive security policies. May include internal or external
facing workloads.
A preproduction environment for development and testing work. This environment has security,
compliance, and cost policies that differ from those in the production environment. In Azure, this takes
the form of an Enterprise Dev/Test subscription.
A sandbox environment for proof of concept and education purposes. This environment is typically
assigned per employee participating in development activities and has strict procedural and operational
security controls in place to prevent corporate data from landing here. In Azure, these take the form of
Visual Studio subscriptions. These subscriptions should also not be tied to the enterprise Azure Active
Directory.
A permissions model of least privilege in which users have no permissions by default. The model must
support the following:
A single trusted user (treated like a service account) at the subscription scope with permission to assign
resource access rights.
Each workload owner is denied access to resources by default. Resource access rights are granted
explicitly by the single trusted user at the resource group scope.
Management access for the shared infrastructure resources limited to the shared infrastructure owners.
Management access for each workload restricted to the workload owner (in production) and increasing
levels of control as development increases from Dev to Test to Stage to Prod.
The enterprise does not want to have to manage roles independently in each of the three main
environments, and therefore requires the use of only built-in roles available in Azure's role-based access
control (RBAC). If the enterprise absolutely requires custom RBAC roles, additional processes would be
needed to synchronize custom roles across the three environments.
Cost tracking by workload owner name, environment, or both.
Identity management
Before you can design identity management for your governance model, it's important to understand the four
major areas it encompasses:
Administration: The processes and tools for creating, editing, and deleting user identity.
Authentication: Verifying user identity by validating credentials, such as a user name and password.
Authorization: Determining which resources an authenticated user is allowed to access or what operations
they have permission to perform.
Auditing: Periodically reviewing logs and other information to discover security issues related to user identity.
This includes reviewing suspicious usage patterns, periodically reviewing user permissions to verify they are
accurate, and other functions.
There is only one service trusted by Azure for identity, and that is Azure Active Directory (Azure AD). You'll be
adding users to Azure AD and using it for all of the functions listed above. But before looking at how to configure
Azure AD, it's important to understand the privileged accounts that are used to manage access to these services.
When your organization signed up for an Azure account, at least one Azure account owner was assigned. Also,
an Azure AD tenant was created, unless an existing tenant was already associated with your organization's use of
other Microsoft services such as Office 365. A global administrator with full permissions on the Azure AD
tenant was associated when it was created.
The user identities for both the Azure Account Owner and the Azure AD global administrator are stored in a
highly secure identity system that is managed by Microsoft. The Azure Account Owner is authorized to create,
update, and delete subscriptions. The Azure AD global administrator is authorized to perform many actions in
Azure AD, but for this design guide you'll focus on the creation and deletion of user identity.
NOTE
Your organization may already have an existing Azure AD tenant if there's an existing Office 365, Intune, or Dynamics license
associated with your account.
The Azure Account Owner has permission to create, update, and delete subscriptions:
Figure 1 - An Azure account with an Account Manager and Azure AD Global Administrator.
The Azure AD global administrator has permission to create user accounts:
Figure 2 - The Azure AD Global Administrator creates the required user accounts in the tenant.
The first two accounts, App1 Workload Owner and App2 Workload Owner, are each associated with an
individual in your organization responsible for managing a workload. The network operations account is owned
by the individual that is responsible for the shared infrastructure resources. Finally, the subscription owner
account is associated with the individual responsible for ownership of subscriptions.
2. The service administrator reviews their request and creates resource group A. At this point, workload
owner A still doesn't have permission to do anything.
3. The service administrator adds workload owner A to resource group A and assigns the built-in
contributor role. The contributor role grants all permissions on resource group A except managing access
permission.
4. Let's assume that workload owner A has a requirement for a pair of team members to view the CPU and
network traffic monitoring data as part of capacity planning for the workload. Because workload owner A is
assigned the contributor role, they do not have permission to add a user to resource group A. They must send
this request to the service administrator.
5. The service administrator reviews the request, and adds the two workload contributor users to resource
group A. Neither of these two users require permission to manage resources, so they are assigned the built-in
reader role.
6. Next, workload owner B also requires a resource group to contain the resources for their workload. As with
workload owner A, workload owner B initially does not have permission to take any action at the
subscription scope so they must send a request to the service administrator.
7. The service administrator reviews the request and creates resource group B.
8. The service administrator then adds workload owner B to resource group B and assigns the built-in
contributor role.
At this point, each of the workload owners is isolated in their own resource group. None of the workload owners
or their team members have management access to the resources in any other resource group.
Figure 4 - A subscription with two workload owners isolated with their own resource group.
This is a least-privilege model: each user is assigned the correct permission at the correct resource
management scope.
However, consider that every task in this example was performed by the service administrator. While this is a
simple example and may not appear to be an issue because there were only two workload owners, it's easy to
imagine the types of issues that would result for a large organization. For example, the service administrator can
become a bottleneck with a large backlog of requests that result in delays.
Let's take a look at a second example that reduces the number of tasks performed by the service administrator.
1. In this model, workload owner A is assigned the built-in owner role at the subscription scope, enabling them
to create their own resource group: resource group A.
2. When resource group A is created, workload owner A is added by default and inherits the built-in owner
role from the subscription scope.
3. The built-in owner role grants workload owner A permission to manage access to the resource group.
Workload owner A adds two workload contributors and assigns the built-in reader role to each of them.
4. The service administrator now adds workload owner B to the subscription with the built-in owner role.
5. Workload owner B creates resource group B and is added by default. Again, workload owner B inherits
the built-in owner role from the subscription scope.
Note that in this model, the service administrator performed fewer actions than they did in the first example due
to the delegation of management access to each of the individual workload owners.
Figure 5 - A subscription with a service administrator and two workload owners, all assigned the built-in owner
role.
However, because both workload owner A and workload owner B are assigned the built-in owner role at the
subscription scope, they have each inherited the built-in owner role for each other's resource group. This means
that not only do they have full access to one another's resources, they are also able to delegate management
access to each other's resource groups. For example, workload owner B has rights to add any other user to
resource group A and can assign any role to them, including the built-in owner role.
If you compare each example to the requirements, you'll see that both examples support a single trusted user at
the subscription scope with permission to grant resource access rights to the two workload owners. Each of the
two workload owners did not have access to resource management by default and required the service
administrator to explicitly assign permissions to them. However, only the first example supports the requirement
that the resources associated with each workload are isolated from one another such that no workload owner has
access to the resources of any other workload.
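The inheritance behavior that breaks isolation in the second example can be sketched with a small model: a role assigned at the subscription scope flows down to every resource group beneath it. This is an illustrative simulation, not the Azure RBAC implementation; the scope and user names are hypothetical.

```python
# Minimal sketch of Azure RBAC scope inheritance (illustrative only).
# A role assigned at a scope applies to that scope and every child scope.

assignments = {
    ("subscription", "workload-owner-a"): "Owner",
    ("subscription", "workload-owner-b"): "Owner",
}

# Child scope -> parent scope
hierarchy = {
    "resource-group-a": "subscription",
    "resource-group-b": "subscription",
}

def effective_role(scope, user):
    """Walk up from a scope through its ancestors, returning the first role found."""
    while scope is not None:
        role = assignments.get((scope, user))
        if role:
            return role
        scope = hierarchy.get(scope)
    return None

# Because both owners are assigned at the subscription scope, each inherits
# Owner on the other's resource group -- the isolation requirement fails.
print(effective_role("resource-group-a", "workload-owner-b"))  # Owner
```

In the first example, by contrast, each owner's assignment would live at the resource group scope, so the walk up from the other owner's resource group finds nothing.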
3. The network operations user creates a VPN gateway and configures it to connect to the on-premises VPN
appliance. The network operations user also applies a pair of tags to each of the resources:
environment:shared and managedBy:netOps. When the subscription service administrator exports a cost
report, costs will be aligned with each of these tags. This allows the subscription service administrator to
pivot costs using the environment tag and the managedBy tag. Notice the resource limits counter at the top
right-hand side of the figure. Each Azure subscription has service limits, and to help you understand the effect
of these limits you'll follow the virtual network limit for each subscription. There is a limit of 1000 virtual
networks per subscription, and after the first virtual network is deployed there are now 999 available.
4. Two more resource groups are deployed. The first is named prod-rg . This resource group is aligned with the
production environment. The second is named dev-rg and is aligned with the development environment. All
resources associated with production workloads are deployed to the production environment and all resources
associated with development workloads are deployed to the development environment. In this example, you'll
only deploy two workloads to each of these two environments, so you won't encounter any Azure subscription
service limits. However, consider that each resource group has a limit of 800 resources per resource group. If
you continue to add workloads to each resource group, eventually this limit will be reached.
5. The first workload owner sends a request to the subscription service administrator and is added to each
of the development and production environment resource groups with the contributor role. As you learned
earlier, the contributor role allows the user to perform any operation other than assigning a role to another
user. The first workload owner can now create the resources associated with their workload.
6. The first workload owner creates a virtual network in each of the two resource groups with a pair of virtual
machines in each. The first workload owner applies the environment and managedBy tags to all resources.
Note that the Azure service limit counter is now at 997 virtual networks remaining.
7. Neither of the virtual networks has connectivity to on-premises when it is created. In this type of
architecture, each virtual network must be peered to the hub-vnet in the shared infrastructure environment.
Virtual network peering creates a connection between two separate virtual networks and allows network traffic
to travel between them. Note that virtual network peering is not inherently transitive. A peering must be
specified in each of the two virtual networks that are connected, and if only one of the virtual networks
specifies a peering the connection is incomplete. To illustrate the effect of this, the first workload owner
specifies a peering between prod-vnet and hub-vnet. The first peering is created, but no traffic flows because
the complementary peering from hub-vnet to prod-vnet has not yet been specified. The first workload
owner contacts the network operations user and requests this complementary peering connection.
8. The network operations user reviews the request, approves it, then specifies the peering in the settings for
the hub-vnet. The peering connection is now complete and network traffic flows between the two virtual
networks.
9. Now, a second workload owner sends a request to the subscription service administrator and is added to
the existing production and development environment resource groups with the contributor role. The
second workload owner has the same permissions on all resources as the first workload owner in each
resource group.
10. The second workload owner creates a subnet in the prod-vnet virtual network, then adds two virtual
machines. The second workload owner applies the environment and managedBy tags to each resource.
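Steps 7 and 8 hinge on the fact that a peering carries traffic only when both virtual networks declare it, and that peering is not transitive. That check can be sketched as follows; this is an illustrative model, not the Azure networking API, though the network names match the example:

```python
# Illustrative model of virtual network peering (not the Azure API).
# A peering carries traffic only when declared on BOTH virtual networks,
# and peering is not transitive: A<->hub and B<->hub do not connect A and B.

declared = set()

def declare_peering(source, target):
    declared.add((source, target))

def traffic_flows(a, b):
    """Traffic flows only when each side has declared the peering."""
    return (a, b) in declared and (b, a) in declared

# Step 7: the workload owner declares one side only -- connection incomplete.
declare_peering("prod-vnet", "hub-vnet")
print(traffic_flows("prod-vnet", "hub-vnet"))  # False

# Step 8: network operations declares the complementary peering.
declare_peering("hub-vnet", "prod-vnet")
print(traffic_flows("prod-vnet", "hub-vnet"))  # True
```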
This example resource management model enables us to manage resources in the three required environments.
The shared infrastructure resources are protected because there's only a single user in the subscription with
permission to access those resources. Each of the workload owners is able to use the shared infrastructure
resources without having any permissions on the actual shared resources themselves. However, this
management model fails the requirement for workload isolation: each of the two workload owners is able to
access the resources of the other's workload.
There's another important consideration with this model that may not be immediately obvious. In the example, it
was app1 workload owner that requested the network peering connection with the hub-vnet to provide
connectivity to on-premises. The network operations user evaluated that request based on the resources
deployed with that workload. When the subscription owner added app2 workload owner with the
contributor role, that user had management access rights to all resources in the prod-rg resource group.
This means app2 workload owner had permission to deploy their own subnet with virtual machines in the
prod-vnet virtual network. By default, those virtual machines now have access to the on-premises network. The
network operations user is not aware of those machines and did not approve their connectivity to on-premises.
Next, let's look at a single subscription with multiple resource groups for different environments and workloads.
Note that in the previous example, the resources for each environment were easily identifiable because they were
in the same resource group. Now that you no longer have that grouping, you will have to rely on a resource group
naming convention to provide that functionality.
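When grouping is carried by names instead of resource group membership, a strict naming convention becomes the only way to recover workload and environment programmatically. A minimal parser for the `app1-prod-rg` style convention used below might look like this; the convention itself is the only assumption:

```python
# Parse resource group names of the form <workload>-<environment>-rg,
# e.g. "app1-prod-rg" -> workload "app1", environment "prod".

def parse_resource_group(name):
    parts = name.split("-")
    if len(parts) != 3 or parts[2] != "rg":
        raise ValueError(f"name does not follow <workload>-<env>-rg: {name}")
    workload, environment, _ = parts
    return {"workload": workload, "environment": environment}

# Group a subscription's resource groups by environment.
groups = ["app1-prod-rg", "app1-dev-rg", "app2-prod-rg", "app2-dev-rg"]
by_env = {}
for g in groups:
    info = parse_resource_group(g)
    by_env.setdefault(info["environment"], []).append(info["workload"])

print(by_env)  # {'prod': ['app1', 'app2'], 'dev': ['app1', 'app2']}
```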
1. The shared infrastructure resources will still have a separate resource group in this model, so that remains
the same. Each workload requires two resource groups - one for each of the development and production
environments. For the first workload, the subscription owner creates two resource groups. The first is named
app1-prod-rg and the second is named app1-dev-rg. As discussed earlier, this naming convention identifies
the resources as being associated with the first workload, app1, and either the dev or prod environment.
Again, the subscription owner adds the app1 workload owner to the resource group with the contributor
role.
2. Similar to the first example, app1 workload owner deploys a virtual network named app1-prod-vnet to the
production environment, and another named app1-dev-vnet to the development environment. Again,
app1 workload owner sends a request to the network operations user to create a peering connection. Note
that app1 workload owner adds the same tags as in the first example, and the limit counter has been
decremented to 997 virtual networks remaining in the subscription.
3. The subscription owner now creates two resource groups for app2 workload owner. Following the same
conventions as for app1 workload owner, the resource groups are named app2-prod-rg and app2-dev-rg.
The subscription owner adds app2 workload owner to each of the resource groups with the contributor
role.
4. App2 workload owner deploys virtual networks and virtual machines to the resource groups with the same
naming conventions. Tags are added and the limit counter has been decremented to 995 virtual networks
remaining in the subscription.
5. App2 workload owner sends a request to the network operations user to peer the app2-prod-vnet with the
hub-vnet. The network operations user creates the peering connection.
The resulting management model is similar to the first example, with several key differences:
Each of the two workloads is isolated by workload and by environment.
This model required two more virtual networks than the first example model. While this is not an important
distinction with only two workloads, the theoretical limit on the number of workloads for this model is 24.
Resources are no longer grouped in a single resource group for each environment. Grouping resources
requires an understanding of the naming conventions used for each environment.
Each of the peered virtual network connections was reviewed and approved by the network operations user.
Now let's look at a resource management model using multiple subscriptions. In this model, you'll align each of
the three environments to a separate subscription: a shared services subscription, production subscription, and
finally a development subscription. The considerations for this model are similar to a model using a single
subscription in that you have to decide how to align resource groups to workloads. You've already determined that
creating a resource group for each workload satisfies the workload isolation requirement, so you'll stick with that
model in this example.
1. In this model, there are three subscriptions: shared infrastructure, production, and development. Each of these
three subscriptions requires a subscription owner, and in the simple example you'll use the same user account
for all three. The shared infrastructure resources are managed similarly to the first two examples above, and
the first workload is associated with the app1-rg resource group in the production environment and the same-named resource
group in the development environment. The app1 workload owner is added to each of the resource groups with
the contributor role.
2. As with the earlier examples, app1 workload owner creates the resources and requests the peering connection
with the shared infrastructure virtual network. App1 workload owner adds only the managedBy tag because
there is no longer a need for the environment tag. That is, resources for each environment are now
grouped in the same subscription and the environment tag is redundant. The limit counter is decremented to
999 virtual networks remaining.
3. Finally, the subscription owner repeats the process for the second workload, adding the resource groups with
the app2 workload owner in the contributor role. The limit counter for each of the environment subscriptions
is decremented to 998 virtual networks remaining.
This management model has the benefits of the second example above. However, the key difference is that service limits
are less of an issue because the resources are spread across separate subscriptions. The drawback is that the cost data
tracked by tags must be aggregated across all three subscriptions.
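Aggregating tag-tracked costs across the subscriptions amounts to a group-by over the exported cost records. A hedged sketch follows; the record shape and values are illustrative, not the Azure Cost Management export format:

```python
# Illustrative cost aggregation across subscriptions by the managedBy tag.
# The record layout is hypothetical; real data comes from each subscription's
# cost export.

cost_records = [
    {"subscription": "shared", "tags": {"managedBy": "netOps"}, "cost": 120.0},
    {"subscription": "production", "tags": {"managedBy": "app1"}, "cost": 300.0},
    {"subscription": "development", "tags": {"managedBy": "app1"}, "cost": 80.0},
    {"subscription": "production", "tags": {"managedBy": "app2"}, "cost": 250.0},
]

def cost_by_tag(records, tag):
    """Sum cost per tag value across every subscription's export."""
    totals = {}
    for record in records:
        value = record["tags"].get(tag, "(untagged)")
        totals[value] = totals.get(value, 0.0) + record["cost"]
    return totals

print(cost_by_tag(cost_records, "managedBy"))
# {'netOps': 120.0, 'app1': 380.0, 'app2': 250.0}
```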
Therefore, you can select either of these two example resource management models depending on the priority of
your requirements. If you anticipate that your organization will not reach the service limits for a single
subscription, you can use a single subscription with multiple resource groups. Conversely, if your organization
anticipates many workloads, multiple subscriptions for each environment may be better.
Related resources
Built-in roles for Azure resources
Next steps
Learn about deploying a basic infrastructure
Deployment Acceleration is one of the Five Disciplines of Cloud Governance within the Cloud Adoption Framework governance
model. This discipline focuses on ways of establishing policies to govern asset configuration or deployment. Within the Five
Disciplines of Cloud Governance, Deployment Acceleration includes deployment, configuration alignment, and script reusability.
This could be through manual activities or fully automated DevOps activities. In either case, the policies would remain largely
the same. As this discipline matures, the cloud governance team can serve as a partner in DevOps and deployment strategies by
accelerating deployments and removing barriers to cloud adoption, through the application of reusable assets.
This article outlines the Deployment Acceleration process that a company experiences during the planning, building, adopting,
and operating phases of implementing a cloud solution. It's impossible for any one document to account for all of the
requirements of any business. As such, each section of this article outlines suggested minimum and potential activities. The
objective of these activities is to help you build a policy MVP and establish a framework for incremental policy improvement.
The cloud governance team should decide how much to invest in these activities to improve the Deployment Acceleration
position.
NOTE
The Deployment Acceleration discipline does not replace the existing IT teams, processes, and procedures that allow your
organization to effectively deploy and configure cloud-based resources. The primary purpose of this discipline is to identify
potential business risks and provide risk-mitigation guidance to the IT staff that are responsible for managing your resources in
the cloud. As you develop governance policies and processes, make sure to involve relevant IT teams in your planning and
review processes.
The primary audience for this guidance is your organization's cloud architects and other members of your cloud governance
team. However, the decisions, policies, and processes that emerge from this discipline should involve engagement and
discussions with relevant members of your business and IT teams, especially those leaders responsible for deploying and
configuring cloud-based workloads.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a Deployment Acceleration
discipline. To see policy statement samples, see the article on Deployment Acceleration Policy Statements. These samples can
serve as a starting point for your organization's governance policies.
CAUTION
The sample policies come from common customer experiences. To better align these policies to specific cloud governance needs,
execute the following steps to create policy statements that meet your unique business needs.
Business Risks
Understand the motives and risks commonly associated with the Deployment Acceleration discipline.
Indicators and Metrics
Indicators to understand if it is the right time to invest in the Deployment Acceleration discipline.
Maturity
Aligning Cloud Management maturity with phases of cloud adoption.
Toolchain
Azure services that can be implemented to support the Deployment Acceleration discipline.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Deployment Acceleration template
The first step to implementing change is communicating the desired change. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern configuration and deployment issues in the cloud. The template also outlines the business
criteria that may have led you to create the documented policy statements.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Deployment Acceleration
policy statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Deployment Acceleration discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Deployment Acceleration motivations and business
risks
This article discusses the reasons that customers typically adopt a Deployment Acceleration discipline within a
cloud governance strategy. It also provides a few examples of business risks that drive policy statements.
Business risk
The Deployment Acceleration discipline attempts to address the following business risks. During cloud adoption,
monitor each of the following for relevance:
Service disruption: Lack of predictable, repeatable deployment processes or unmanaged changes to system
configurations can disrupt normal operations and can result in lost productivity or lost business.
Cost overruns: Unexpected changes in configuration of system resources can make identifying root cause of
issues more difficult, raising the costs of development, operations, and maintenance.
Organizational inefficiencies: Barriers between development, operations, and security teams can cause
numerous challenges to effective adoption of cloud technologies and the development of a unified cloud
governance model.
Next steps
Using the Deployment Acceleration template, document business risks that are likely to be introduced by the current
cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Metrics, indicators, and risk tolerance
Deployment Acceleration metrics, indicators, and risk
tolerance
This article will help you quantify business risk tolerance as it relates to Deployment Acceleration. Defining metrics
and indicators helps you create a business case for making an investment in the maturity of the Deployment
Acceleration discipline.
Metrics
The Deployment Acceleration discipline focuses on risks related to how cloud resources are configured, deployed,
updated, and maintained. The following information is useful when adopting this discipline of cloud governance:
Deployment failures: Percentage of deployments that fail or result in misconfigured resources.
Time to deployment: The amount of time needed to deploy updates to an existing system.
Assets out-of-compliance: The number or percentage of resources that are out of compliance with defined
policies.
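The metrics above are simple ratios over deployment and compliance records, so they are straightforward to compute once that data is collected. As a sketch, with a hypothetical record format standing in for whatever your pipeline and policy tooling export:

```python
# Illustrative computation of Deployment Acceleration metrics.
# Input shapes are invented examples, not a specific Azure export format.

deployments = [
    {"name": "release-1", "succeeded": True},
    {"name": "release-2", "succeeded": False},
    {"name": "release-3", "succeeded": True},
    {"name": "release-4", "succeeded": True},
]
resources = [
    {"name": "vm-1", "compliant": True},
    {"name": "vm-2", "compliant": False},
    {"name": "vm-3", "compliant": True},
]

def deployment_failure_rate(records):
    """Percentage of deployments that failed."""
    failed = sum(1 for r in records if not r["succeeded"])
    return 100.0 * failed / len(records)

def out_of_compliance(records):
    """Count of resources out of compliance with defined policies."""
    return sum(1 for r in records if not r["compliant"])

print(deployment_failure_rate(deployments))  # 25.0
print(out_of_compliance(resources))          # 1
```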
Next steps
Using the Cloud Management template, document metrics and tolerance indicators that align to the current cloud
adoption plan.
Review sample Deployment Acceleration policies as a starting point to develop policies that address specific
business risks that align with your cloud adoption plans.
Review sample policies
Deployment Acceleration sample policy statements
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk: A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common configuration-related business risks. These statements
are examples you can reference when drafting policy statements to address your organization's needs. These
examples are not meant to be prescriptive, and there are potentially several policy options for dealing with each
identified risk. Work closely with business and IT teams to identify the best policies for your unique set of risks.
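Since each statement carries the same three pieces of information, it can help to capture them in a uniform record while drafting. A sketch follows; the field names mirror the list above, and the example risk text is invented purely for illustration:

```python
# A uniform record for drafting policy statements; the fields mirror the three
# pieces of information each statement definition should include.
from dataclasses import dataclass, field

@dataclass
class PolicyStatement:
    technical_risk: str    # summary of the risk this policy addresses
    policy_statement: str  # clear summary of the policy requirements
    design_options: list = field(default_factory=list)  # actionable guidance

# Example content is illustrative, not a recommended policy.
drift = PolicyStatement(
    technical_risk="Configuration drift causes outages and slows root-cause analysis.",
    policy_statement="Production resources must be deployed from source-controlled templates.",
    design_options=[
        "Use Azure Resource Manager templates held in source control.",
        "Block manual changes to production resource groups.",
    ],
)

print(len(drift.design_options))  # 2
```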
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific business risks
that align with your cloud adoption plans.
To begin developing your own custom policy statements related to Deployment Acceleration, download the
Deployment Acceleration template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Deployment Acceleration
policy adherence.
Establish policy compliance processes
Deployment Acceleration policy compliance
processes
This article discusses an approach to policy adherence processes that govern Deployment Acceleration. Effective
governance of cloud configuration starts with recurring manual processes designed to detect issues and impose
policies to remediate those risks. However, you can automate these processes and supplement them with tooling to
reduce the overhead of governance and allow for faster response to deviation.
Next steps
Using the Deployment Acceleration template, document the processes and triggers that align to the current cloud
adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on
discipline improvement.
Deployment Acceleration discipline improvement
Deployment Acceleration discipline improvement
The Deployment Acceleration discipline focuses on establishing policies that ensure that resources are deployed
and configured consistently and repeatably, and remain in compliance throughout their lifecycle. Within the Five
Disciplines of Cloud Governance, Deployment Acceleration includes decisions regarding automating deployments,
source-controlling deployment artifacts, monitoring deployed resources to maintain desired state, and auditing
any compliance issues.
This article outlines some potential tasks your company can engage in to better develop and mature the
Deployment Acceleration discipline. These tasks can be broken down into planning, building, adopting, and
operating phases of implementing a cloud solution, which are then iterated on allowing the development of an
incremental approach to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of Deployment Acceleration governance, examine the Deployment Acceleration
toolchain to identify Azure tools and features that you'll need when developing the Deployment Acceleration
governance discipline on the Azure platform.
Deployment Acceleration toolchain for Azure
Deployment Acceleration tools in Azure
Deployment Acceleration is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways of
establishing policies to govern asset configuration or deployment. Within the Five Disciplines of Cloud
Governance, the Deployment Acceleration discipline involves deployment and configuration alignment. This
could be through manual activities or fully automated DevOps activities. In either case, the policies involved
would remain largely the same.
Cloud custodians, cloud guardians, and cloud architects with an interest in governance are each likely to invest a
lot of time in the Deployment Acceleration discipline, which codifies policies and requirements across multiple
cloud adoption efforts. The tools in this toolchain are important to the cloud governance team and should be a
high priority on the learning path for the team.
The following is a list of Azure tools that can help mature the policies and processes that support this governance
discipline.
| Tool capability | Azure Policy | Azure Management Groups | Azure Resource Manager | Azure Blueprints | Azure Resource Graph | Azure Cost Management |
| --- | --- | --- | --- | --- | --- | --- |
| Implement corporate policies | Yes | No | No | No | No | No |
| Deploy defined resources | No | No | Yes | No | No | No |
| Report on cost of resources | No | No | No | No | No | Yes |
The following are additional tools that may be required to accomplish specific Deployment Acceleration
objectives. Often these tools are used outside of the governance team, but are still considered an aspect of
Deployment Acceleration as a discipline.
| Task | Azure portal | Azure Resource Manager | Azure Policy | Azure DevOps | Azure Backup | Azure Site Recovery |
| --- | --- | --- | --- | --- | --- | --- |
| Create an automated pipeline to deploy code and configure assets (DevOps) | No | No | No | Yes | No | No |
Aside from the Azure native tools mentioned above, it is common for customers to use third-party tools to
facilitate Deployment Acceleration and DevOps deployments.
Delivering on a cloud strategy requires solid planning, readiness, and adoption. But it's the ongoing operation of the digital
assets that delivers tangible business outcomes. Without a plan for reliable, well-managed operations of the cloud solutions,
those efforts will yield little value. The following exercises help develop the business and technical approaches needed to provide
cloud management that powers ongoing operations.
Getting started
To prepare you for this phase of the cloud adoption lifecycle, the framework suggests the following exercises:
Intended audience
The content in the Cloud Adoption Framework affects the business, technology, and culture of enterprises. This section of the
Cloud Adoption Framework interacts heavily with IT operations, IT governance, finance, line-of-business leaders, networking,
identity, and cloud adoption teams. Various dependencies on these personnel require a facilitative approach by the cloud
architects who are using this guidance. Facilitation with these teams is seldom a one-time effort.
The cloud architect serves as the thought leader and facilitator to bring these audiences together. The content in this collection of
guides is designed to help the cloud architect facilitate the right conversation, with the right audience, to drive necessary
decisions. Business transformation that's empowered by the cloud depends on the cloud architect to help guide decisions
throughout the business and IT.
Cloud architect specialization in this section: Each section of the Cloud Adoption Framework represents a different
specialization or variant of the cloud architect role. This section of the Cloud Adoption Framework is designed for cloud
architects with a passion for operations and management of deployment solutions. Within this framework, these specialists are
referred to frequently as cloud operations, or collectively as the cloud operations team.
Management baseline
A management baseline is the minimum set of tools and processes that should be applied to every asset in an
environment. Several additional options can be included in the management baseline. The next few articles
accelerate cloud management capabilities by focusing on the minimum options necessary instead of on all of the
available options.
TIP
For an interactive experience, view this guide in the Azure portal. Go to Azure Quickstart Center in the Azure portal and
select Azure Management Guide. Then follow the step-by-step instructions.
Inventory and visibility is the first of three disciplines in a cloud management baseline.
This discipline comes first because collecting proper operational data is vital when you make decisions about
operations. Cloud management teams must understand what is managed and how well those assets are operated.
This article describes the different tools that provide both an inventory and visibility into the inventory's run state.
For any enterprise-grade environment, the following table outlines the suggested minimum for a management
baseline.
| Process | Tool | Purpose |
| --- | --- | --- |
| Monitor health of Azure services | Azure Service Health | Health, performance, and diagnostics for services running in Azure |
| Log centralization | Log Analytics | Central logging for all visibility purposes |
| Virtual machine inventory and change tracking | Azure Change Tracking and Inventory | Inventory VMs and monitor changes at the guest OS level |
| Guest OS monitoring | Azure Monitor for VMs | Monitoring changes and performance of VMs |
Log Analytics
A Log Analytics workspace is a unique environment for storing Azure Monitor log data. Each workspace has its
own data repository and configuration. Data sources and solutions are configured to store their data in particular
workspaces. Azure monitoring solutions require all servers to be connected to a workspace, so that their log data
can be stored and accessed.
Learn more
To learn more, see the Log Analytics workspace creation documentation.
Azure Monitor
Azure Monitor provides a single unified hub for all monitoring and diagnostics data in Azure and gives you
visibility across your resources. With Azure Monitor, you can find and fix problems and optimize performance. You
can also understand customer behavior.
Monitor and visualize metrics. Metrics are numerical values available from Azure resources. They help
you understand the health of your systems. Customize charts for your dashboards, and use workbooks for
reporting.
Query and analyze logs. Logs include activity logs and diagnostic logs from Azure. Collect additional logs
from other monitoring and management solutions for your cloud or on-premises resources. Log Analytics
provides a central repository to aggregate all of this data. From there, you can run queries to help
troubleshoot issues or to visualize data.
Set up alerts and actions. Alerts notify you of critical conditions. Corrective actions can be taken based on
triggers from metrics, logs, or service-health issues. You can set up different notifications and actions and
can also send data to your IT service management tools.
Onboard solutions
To enable solutions, you need to configure the Log Analytics workspace. Onboarded Azure VMs and on-premises
servers get the solutions from the Log Analytics workspaces they're connected to.
There are two approaches to onboarding:
Single VM
Entire subscription
Each article guides you through a series of steps to onboard these solutions:
Update Management
Change Tracking and Inventory
Azure Activity Log
Azure Log Analytics Agent Health
Antimalware Assessment
Azure Monitor for VMs
Azure Security Center
Each of the previous steps helps establish inventory and visibility.
Operational compliance in Azure
Improving operational compliance reduces the likelihood of an outage related to configuration drift or
vulnerabilities related to systems being improperly patched.
For any enterprise-grade environment, this table outlines the suggested minimum for a management baseline.
Update Management
Computers that are managed by Update Management use the following configurations to do assessment and
update deployments:
Microsoft Monitoring Agent (MMA) for Windows or Linux
PowerShell Desired State Configuration (DSC) for Linux
Azure Automation Hybrid Runbook Worker
Microsoft Update or Windows Server Update Services (WSUS) for Windows computers
For more information, see Update Management solution.
WARNING
Before using Update Management, you must onboard virtual machines or an entire subscription into Log Analytics and
Azure Automation.
There are two approaches to onboarding:
Single VM
Entire subscription
You should follow one of these approaches before proceeding with Update Management.
Manage updates
To manage updates for your VMs:
1. Go to Azure Automation.
2. Select Automation accounts, and choose one of the listed accounts.
3. Go to Configuration Management.
4. Inventory, Change Management, and State Configuration can be used to control the state and operational
compliance of the managed VMs.
Azure Policy
Azure Policy is used throughout governance processes. It's also highly valuable within cloud management
processes. Azure Policy can audit and remediate Azure resources and can also audit settings inside a machine. The
validation is performed by the Guest Configuration extension and client. The extension, through the client, validates
settings like:
Operating system configuration.
Application configuration or presence.
Environment settings.
Azure Policy Guest Configuration currently only audits settings inside the machine. It doesn't apply configurations.
Action
Assign a built-in policy to a management group, subscription, or resource group.
Apply a policy
To apply a policy to a resource group:
1. Go to Azure Policy.
2. Select Assign a policy.
Learn more
To learn more, see:
Azure Policy
Azure Policy - Guest configuration
Cloud Adoption Framework: Policy enforcement decision guide
Azure Blueprints
With Azure Blueprints, cloud architects and central information-technology groups can define a repeatable set of
Azure resources. These resources implement and adhere to an organization's standards, patterns, and
requirements.
With Azure Blueprints, development teams can rapidly build and stand up new environments with confidence
that they're building within organizational compliance. Teams use a set of built-in components, such as
networking, to speed up development and delivery.
Blueprints are a declarative way to orchestrate the deployment of different resource templates and other artifacts
like:
Role assignments.
Policy assignments.
Azure Resource Manager templates.
Resource groups.
Applying a blueprint can enforce operational compliance in an environment if this enforcement isn't done by the
cloud governance team.
Create a blueprint
To create a blueprint:
1. Go to Blueprints - Getting started.
2. On the Create a Blueprint pane, select Create.
3. Filter the list of blueprints to select the appropriate blueprint.
4. In the Blueprint name box, enter the blueprint name.
5. Select Definition location, and choose the appropriate location.
6. Select Next: Artifacts, and review the artifacts included in the blueprint.
7. Select Save Draft.
Protect and recover is the third and final discipline in any cloud-management baseline.
The objective of Operational compliance in Azure is to reduce the likelihood of a business interruption. This
article aims to reduce the duration and impact of outages that can't be prevented.
For any enterprise-grade environment, this table outlines the suggested minimum for any management baseline:
| Process | Tool | Purpose |
|---|---|---|
| Protect the environment | Azure Security Center | Strengthen security and provide advanced threat protection across your hybrid workloads. |
Azure Backup
With Azure Backup, you can back up, protect, and recover your data in the Microsoft cloud. Azure Backup replaces
your existing on-premises or offsite backup solution with a cloud-based solution that's reliable, secure, and
cost-competitive. Azure Backup can also help protect and recover on-premises assets through one consistent
solution.
Enable backup for an Azure VM
1. In the Azure portal, select Virtual machines, and select the VM you want to replicate.
2. On the Operations pane, select Backup.
3. Create or select an existing Azure Recovery Services vault.
4. Select Create (or edit) a new policy.
5. Configure the schedule and retention period.
6. Select OK.
7. Select Enable Backup.
TIP
Depending on your scenario, the exact steps might differ slightly.
Verify settings
After the replication job has finished, you can check the replication status, verify replication health, and test the
deployment.
1. In the VM menu, select Disaster recovery.
2. Verify replication health, the recovery points that have been created, and source and target regions on the map.
Learn more
Azure Site Recovery overview
Replicate an Azure VM to another region
Enhanced management baseline in Azure
The first three cloud management disciplines describe a management baseline. The preceding articles in this guide
outline a minimum viable product (MVP) for cloud management services, which is referred to as a management
baseline. This article outlines a few common improvements to the baseline.
The purpose of a management baseline is to create a consistent offering that provides a minimum level of business
commitment for all supported workloads. With this baseline of common, repeatable management offerings, the
team can deliver highly optimized operational management with minimal deviation.
However, you might need a greater commitment to the business beyond the standard offering. The following
image and list show three ways to go beyond the management baseline.
Workload operations:
Largest per-workload operations investment.
Highest degree of resiliency.
Suggested for the approximately 20% of workloads that drive business value.
Typically reserved for high-criticality or mission-critical workloads.
Platform operations:
Operations investment is spread across many workloads.
Resiliency improvements affect all workloads that use the defined platform.
Suggested for the approximately 20% of platforms that have highest criticality.
Typically reserved for medium-criticality to high-criticality workloads.
Enhanced management baseline:
Lowest relative operations investment.
Slightly improved business commitments using additional cloud-native operations tools and processes.
Both workload operations and platform operations require changes to design and architecture principles. Those
changes can take time and might result in increased operating expenses. To reduce the number of workloads that
require such investments, an enhanced management baseline can provide enough of an improvement to the
business commitment.
This table outlines a few processes, tools, and potential effects common in customers' enhanced management
baselines:
| Discipline | Process | Tool | Potential impact | Learn more |
|---|---|---|---|---|
| Inventory and visibility | Service change tracking | Azure Resource Graph | Greater visibility into changes to Azure services might help detect negative effects sooner or remediate faster. | Overview of Azure Resource Graph |
| Protect and recover | Breach notification | Azure Security Center | Extend protection to include security-breach recovery triggers. | See the following sections |
Azure Automation
Azure Automation provides a centralized system for the management of automated controls. In Azure Automation,
you can run simple remediation, scale, and optimization processes in response to environmental metrics. These
processes reduce the overhead associated with manual incident processing.
Most importantly, automated remediation can be delivered in near real time, significantly reducing interruptions to
business processes. A study of your most common business interruptions identifies activities in your
environment that could be automated.
Runbooks
The basic unit of code for delivering automated remediation is a runbook. Runbooks contain the instructions for
remediating or recovering from an incident.
To create or manage runbooks:
1. Go to Azure Automation.
2. Select Automation accounts and choose one of the listed accounts.
3. Go to Process automation.
4. With the options presented, you can create or manage runbooks, schedules, and other automated remediation
functionality.
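To illustrate the kind of remediation logic a runbook might contain, the following sketch shows a Python 3 runbook body (Azure Automation also supports Python runbooks) that routes an alert payload to a remediation action. The payload shape is modeled loosely on the Azure Monitor common alert schema, and the action names are illustrative assumptions, not a prescribed API.

```python
import json

# Hypothetical alert payload, shaped loosely like the Azure Monitor
# common alert schema; field values here are illustrative.
SAMPLE_ALERT = json.dumps({
    "data": {
        "essentials": {
            "severity": "Sev2",
            "alertTargetIDs": ["/subscriptions/xxx/.../virtualMachines/vm-web-01"],
            "monitorCondition": "Fired",
        }
    }
})

def choose_remediation(alert_json: str) -> str:
    """Map an incoming alert to a remediation action name."""
    essentials = json.loads(alert_json)["data"]["essentials"]
    if essentials["monitorCondition"] != "Fired":
        return "no-op"            # alert resolved; nothing to remediate
    # Severity-driven routing: high severities page a human,
    # lower severities get automated remediation.
    if essentials["severity"] in ("Sev0", "Sev1"):
        return "notify-oncall"
    return "restart-service"      # routine automated fix

print(choose_remediation(SAMPLE_ALERT))  # restart-service
```

A production runbook would replace the returned action names with calls to the relevant Azure operations, but the routing pattern stays the same.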
Much like the enhanced management baseline, platform specialization is an extension of the standard
management baseline. The following image and list show the ways to expand the management baseline.
This article addresses the platform specialization options.
Workload operations: The largest per-workload operations investment and the highest degree of resiliency.
We suggest workload operations for the approximately 20% of workloads that drive business value. This
specialization is usually reserved for high criticality or mission-critical workloads.
Platform operations: Operations investment is spread across many workloads. Resiliency improvements
affect all workloads that use the defined platform. We suggest platform operations for the approximately 20%
of platforms that have the highest criticality. This specialization is usually reserved for medium to high criticality
workloads.
Enhanced management baseline: The relatively lowest operations investment. This specialization slightly
improves business commitments by using additional cloud-native operations tools and processes.
Both workload and platform operations require changes to design and architecture principles. Those changes can
take time and might result in increased operating expenses. To reduce the number of workloads requiring such
investments, an enhanced management baseline might provide enough of an improvement to the business
commitment.
This table outlines a few processes, tools, and potential effects that are common in customers' platform
operations:
| Process | Tool | Purpose | Suggested management level |
|---|---|---|---|
| Container performance | Azure Monitor for containers | Monitoring and diagnostics of containers | Platform operations |
| Platform as a service (PaaS) data performance | Azure SQL Analytics | Monitoring and diagnostics for PaaS databases | Platform operations |
| Infrastructure as a service (IaaS) data performance | SQL Server Health Check | Monitoring and diagnostics for IaaS databases | Platform operations |
High-level process
Platform specialization consists of a disciplined execution of the following four processes in an iterative approach.
Each process is explained in more detail in later sections of this article.
Improve system design: Improve the design of common systems or platforms to effectively minimize
interruptions.
Automate remediation: Some improvements aren't cost effective. In such cases, it might make more sense to
automate remediation and reduce the effect of interruptions.
Scale the solution: As systems design and automated remediation are improved, those changes can be scaled
across the environment through the service catalog.
Continuous improvement: Different monitoring tools can be used to discover incremental improvements.
These improvements can be addressed in the next pass of system design, automation, and scale.
Automated remediation
Some technical debt can't be addressed. The resolution might be too expensive, or it might be planned but
have a long project duration. The business interruption might not have a significant business effect, or the
business priority might be to recover quickly instead of investing in resiliency.
When resolution of technical debt isn't the desired approach, automated remediation is commonly the next step.
The most common approach is to use Azure Monitor to detect trends and Azure Automation to provide the
remediation.
For guidance on automated remediation, see Azure Automation and alerts.
Continuous improvement
Platform specialization and platform operations both depend on strong feedback loops among adoption, platform,
automation, and management teams. Grounding those feedback loops in data helps each team make wise
decisions. For platform operations to achieve long-term business commitments, it's important to use insights
specific to the centralized platform.
Containers and SQL Server are the two most common centrally managed platforms. These articles can help you
get started with continuous-improvement data collection on those platforms:
Container performance
PaaS database performance
IaaS database performance
Workload specialization for cloud management
Workload operations: The largest per-workload operations investment and highest degree of resiliency. We
suggest workload operations for the approximately 20% of workloads that drive business value. This
specialization is usually reserved for high criticality or mission-critical workloads.
Platform operations: Operations investment is spread across many workloads. Resiliency improvements
affect all workloads that use the defined platform. We suggest platform operations for the approximately 20% of
platforms that have the highest criticality. This specialization is usually reserved for medium to high criticality
workloads.
Enhanced management baseline: The relatively lowest operations investment. This specialization slightly
improves business commitments by using additional cloud-native operations tools and processes.
High-level process
Workload specialization consists of a disciplined execution of the following four processes in an iterative approach.
Each process is explained in more detail in Platform Specialization.
Improve system design: Improve the design of a specific workload to effectively minimize interruptions.
Automate remediation: Some improvements aren't cost effective. In such cases, it might make more sense to
automate remediation and reduce the effect of interruptions.
Scale the solution: As you improve systems design and automated remediation, you can scale those changes
across the environment through the service catalog.
Continuous improvement: You can use different monitoring tools to discover incremental improvements.
These improvements can be addressed in the next pass of system design, automation, and scale.
Cultural change
Workload specialization often triggers a cultural change in traditional IT build processes that focus on delivering a
management baseline, enhanced baselines, and platform operations. Those types of offerings can be scaled across
the environment. Workload specialization is similar in execution to platform specialization. But unlike common
platforms, the specialization required by individual workloads often doesn't scale.
When workload specialization is required, operational management commonly evolves beyond a central IT
perspective. The approach suggested in the Cloud Adoption Framework is a distribution of cloud management
functionality.
In this model, operational tasks like monitoring, deployment, DevOps, and other innovation-focused functions shift
to an application-development or business-unit organization. The Cloud Platform and core Cloud Monitoring
teams still deliver on the management baseline across the environment.
Those centralized teams also guide and instruct workload-specialized teams on operations of their workloads. But
the day-to-day operational responsibility falls on a cloud management team that is managed outside of IT. This
type of distributed control is one of the primary indicators of maturity in a cloud center of excellence.
| Process | Tool | Purpose |
|---|---|---|
| Performance, availability, and usage | Application Insights | Advanced application monitoring with the application dashboard, composite maps, usage, and tracing |
Azure server management services provide a consistent experience for managing servers at scale. These services
cover both Linux and Windows operating systems. They can be used in production, development, and test
environments. The server management services can support Azure IaaS virtual machines, physical servers, and
virtual machines that are hosted on-premises or in other hosting environments.
The Azure server management services suite includes the services in the following diagram:
This section of the Microsoft Cloud Adoption Framework provides an actionable and prescriptive plan for
deploying server management services in your environment. This plan helps orient you quickly to these services,
guiding you through an incremental set of management stages for all environment sizes.
For simplicity, we've categorized this guidance into three stages:
Prerequisite planning
Onboarding
Ongoing management and security
Next steps
Familiarize yourself with the tools, services, and planning involved with adopting the Azure server management
suite.
Prerequisite tools and planning
Phase 1: Prerequisite planning for Azure server
management services
In this phase, you'll become familiar with the Azure server management suite of services, and plan how to deploy
the resources needed to implement these management solutions.
Planning considerations
When preparing the workspaces and accounts that you need for onboarding management services, consider the
following issues:
Azure geographies and regulatory compliance: Azure regions are organized into geographies. An Azure
geography ensures that data residency, sovereignty, compliance, and resiliency requirements are honored
within geographical boundaries. If your workloads are subject to data-sovereignty or other compliance
requirements, workspace and Automation accounts must be deployed to regions within the same Azure
geography as the workload resources they support.
Number of workspaces: As a guiding principle, create the minimum number of workspaces required per
Azure geography. We recommend at least one workspace for each Azure geography where your compute or
storage resources are located. This initial alignment helps avoid future regulatory issues when you migrate
data to different geographies.
Data retention and capping: You might also need to take data-retention policies or data-capping requirements
into consideration when creating workspaces or Automation accounts. For more information about these
principles, and for additional considerations when planning your workspaces, see Manage log data and
workspaces in Azure Monitor.
Region mapping: Linking a Log Analytics workspace and an Azure Automation account is supported only
between certain Azure regions. For example, if the Log Analytics workspace is hosted in the EastUS region, the
linked Automation account must be created in the EastUS2 region to be used with management services. If you
have an Automation account that was created in another region, it can't link to a workspace in EastUS. The
choice of deployment region can significantly affect Azure geography requirements. Consult the region
mapping table to decide which region should host your workspaces and Automation accounts.
Workspace multihoming: The Azure Log Analytics agent supports multihoming in some scenarios, but the
agent faces several limitations and challenges when running in this configuration. Unless Microsoft has
recommended it for your specific scenario, we don't recommend that you configure multihoming on the Log
Analytics agent.
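Two of the planning rules above can be sketched in code: create one workspace per Azure geography that hosts resources, and place the linked Automation account in the region the mapping table requires. In this sketch the region-to-geography data is a partial, illustrative subset, and only the EastUS-to-EastUS2 pair comes from the text; confirm everything else against the official region mapping table.

```python
# Partial, illustrative region-to-geography data (not the full Azure list).
REGION_GEOGRAPHY = {
    "EastUS": "United States",
    "EastUS2": "United States",
    "WestUS": "United States",
    "WestEurope": "Europe",
    "NorthEurope": "Europe",
}

# Only the EastUS -> EastUS2 pair is stated above; other pairs must come
# from the official region mapping table.
LINKED_AUTOMATION_REGION = {
    "EastUS": "EastUS2",
}

def plan_workspaces(resource_regions):
    """One workspace per distinct geography that hosts resources."""
    return sorted({REGION_GEOGRAPHY[r] for r in resource_regions})

def automation_region_for(workspace_region):
    """Region where the linked Automation account must be created."""
    if workspace_region not in LINKED_AUTOMATION_REGION:
        raise ValueError(
            f"No known mapping for {workspace_region}; "
            "check the region mapping table before deploying."
        )
    return LINKED_AUTOMATION_REGION[workspace_region]

print(plan_workspaces(["EastUS", "WestUS", "WestEurope"]))  # ['Europe', 'United States']
print(automation_region_for("EastUS"))                      # EastUS2
```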
NOTE
When you create an Automation account by using the Azure portal, the portal attempts by default to create Run As
accounts for both Azure Resource Manager and the classic deployment model resources. If you don't have classic virtual
machines in your environment and you're not the co-administrator on the subscription, the portal creates a Run As account
for Resource Manager, but it generates an error when deploying the classic Run As account. If you don't intend to support
classic resources, you can ignore this error.
You can also create Run As accounts by using PowerShell.
Next steps
Learn how to onboard your servers to Azure server management services.
Onboard to Azure server management services
Phase 2: Onboarding Azure server management
services
After you're familiar with the tools and planning involved in Azure management services, you're ready for the
second phase. Phase 2 provides step-by-step guidance for onboarding these services for use with your Azure
resources. Start by evaluating this onboarding process before adopting it broadly in your environment.
NOTE
The automation approaches discussed in later sections of this guidance are meant for deployments that don't already have
servers deployed to the cloud. They require that you have the Owner role on a subscription to create all the required
resources and policies. If you've already created Log Analytics workspaces and Automation accounts, we recommend that
you pass these resources in the appropriate parameters when you start the example automation scripts.
Onboarding processes
This section of the guidance covers the following onboarding processes for both Azure virtual machines and on-
premises servers:
Enable management services on a single VM for evaluation by using the portal. Use this process to
familiarize yourself with the Azure server management services.
Configure management services for a subscription by using the portal. This process helps you configure
the Azure environment so that any new VMs that are provisioned will automatically use management services.
Use this approach if you prefer the Azure portal experience to scripts and command lines.
Configure management services for a subscription by using Azure Automation. This process is fully
automated. Just create a subscription, and the scripts will configure the environment to use management
services for any newly provisioned VM. Use this approach if you're familiar with PowerShell scripts and Azure
Resource Manager templates, or if you want to learn to use them.
The procedures for each of these approaches are different.
NOTE
When you use the Azure portal, the sequence of onboarding steps differs from the automated onboarding steps. The portal
offers a simpler onboarding experience.
The following diagram shows the recommended deployment model for management services:
As shown in the preceding diagram, the Log Analytics agent has both an auto-enroll and an opt-in configuration
for on-premises servers:
Auto-enroll: When the Log Analytics agent is installed on a server and configured to connect to a workspace,
the solutions that are enabled on that workspace are applied to the server automatically.
Opt-in: Even if the agent is installed and connected to the workspace, the solution isn't applied unless it's added
to the server's scope configuration in the workspace.
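The auto-enroll and opt-in behaviors above can be modeled with a small sketch. The solution and server names here are hypothetical; the point is the decision rule, not a real workspace API.

```python
# Minimal model of auto-enroll vs. opt-in solution application.
def solutions_applied(server, workspace_solutions, opt_in_scope):
    """Return the solutions a connected server receives from its workspace.

    opt_in_scope maps an opt-in solution name to the set of servers in its
    scope configuration; solutions absent from the map auto-enroll.
    """
    applied = []
    for solution in workspace_solutions:
        if solution in opt_in_scope:
            # Opt-in: applied only if the server is in the scope configuration.
            if server in opt_in_scope[solution]:
                applied.append(solution)
        else:
            # Auto-enroll: applied to every server connected to the workspace.
            applied.append(solution)
    return applied

workspace = ["Updates", "ChangeTracking"]
scope = {"ChangeTracking": {"web-01"}}   # ChangeTracking is opt-in here
print(solutions_applied("web-01", workspace, scope))  # ['Updates', 'ChangeTracking']
print(solutions_applied("db-01", workspace, scope))   # ['Updates']
```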
Next steps
Learn how to onboard a single VM by using the portal to evaluate the onboarding process.
Onboard a single Azure VM for evaluation
Enable server management services on a single VM
for evaluation
NOTE
Create the required Log Analytics workspace and Azure Automation account before you implement Azure management
services on a VM.
It's simple to onboard Azure server management services to individual virtual machines in the Azure portal. You
can familiarize yourself with these services before you onboard them. When you select a VM instance, all the
solutions on the list of management tools and services appear on the Operations or Monitoring menu. You
select a solution and follow the wizard to onboard it.
Related resources
For more information about how to onboard these solutions to individual VMs, see:
Onboard Update Management, Change Tracking, and Inventory solutions from Azure virtual machine
Onboard Azure Monitoring for VMs
Next steps
Learn how to use Azure Policy to onboard Azure VMs at scale.
Configure Azure management services for a subscription
Configure Azure server management services at scale
You must complete these two tasks to onboard Azure server management services to your servers:
Deploy service agents to your servers
Enable the management solutions
This article covers the three processes that are necessary to complete these tasks:
1. Deploy the required agents to Azure VMs by using Azure Policy
2. Deploy the required agents to on-premises servers
3. Enable and configure the solutions
NOTE
Create the required Log Analytics workspace and Azure Automation account before you onboard virtual machines to Azure
server management services.
NOTE
For more information about various agents for Azure monitoring, see Overview of the Azure monitoring agents.
Assign policies
To assign the policies described in the previous section:
1. In the Azure portal, go to Azure Policy > Assignments > Assign initiative.
2. On the Assign Policy page, set the Scope by selecting the ellipsis (…) and then selecting either a
management group or subscription. Optionally, select a resource group. Then choose Select at the bottom
of the Scope page. The scope determines which resources or group of resources the policy is assigned to.
3. Select the ellipsis (…) next to Policy definition to open the list of available definitions. To filter the initiative
definitions, enter Azure Monitor in the Search box:
4. The Assignment name is automatically populated with the policy name that you selected, but you can
change it. You can also add an optional description to provide more information about this policy
assignment. The Assigned by field is automatically filled based on who is signed in. This field is optional,
and it supports custom values.
5. For this policy, select the Log Analytics workspace to associate with the Log Analytics agent.
6. Select the Managed Identity location check box. If this policy is of the type DeployIfNotExists, a managed
identity will be required to deploy the policy. In the portal, the account will be created as indicated by the
check box selection.
7. Select Assign.
After you complete the wizard, the policy assignment will be deployed to the environment. It can take up to 30
minutes for the policy to take effect. To test it, create new VMs after 30 minutes, and check if the Microsoft
Monitoring Agent is enabled on the VM by default.
For on-premises servers, you need to download and install the Log Analytics agent and the Microsoft Dependency
agent manually and configure them to connect to the correct workspace. You must specify the workspace ID and
key information. To get that information, go to your Log Analytics workspace in the Azure portal and select
Settings > Advanced settings.
Enable and configure solutions
To enable solutions, you need to configure the Log Analytics workspace. Onboarded Azure VMs and on-premises
servers will get the solutions from the Log Analytics workspaces that they're connected to.
Update Management
The Update Management, Change Tracking, and Inventory solutions require both a Log Analytics workspace and
an Automation account. To ensure that these resources are properly configured, we recommend that you onboard
through your Automation account. For more information, see Onboard Update Management, Change Tracking,
and Inventory solutions.
We recommend that you enable the Update Management solution for all servers. Update Management is free for
Azure VMs and on-premises servers. If you enable Update Management through your Automation account, a
scope configuration is created in the workspace. Manually update the scope to include machines that are covered
by the Update Management service.
To cover your existing servers as well as future servers, you need to remove the scope configuration. To do this,
view your Automation account in the Azure portal. Select Update Management > Manage machine > Enable
on all available and future machines. This setting allows all Azure VMs that are connected to the workspace to
use Update Management.
The scope configuration relies on a saved search based on a Heartbeat query like the following, which limits the
computer group to Azure VMs plus a named list of on-premises servers:
Heartbeat
| where AzureEnvironment=~"Azure" or Computer in~ ("list of the on-premises server names", "server1")
| distinct Computer
NOTE
The server name must exactly match the value in the expression, and it shouldn't contain a domain name suffix.
5. Select Save. By default, the scope configuration is linked to the MicrosoftDefaultComputerGroup saved
search. It will be automatically updated.
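The exact-match requirement in the note above can be enforced by normalizing names before building the in~ list. A minimal sketch, assuming the Heartbeat Computer values are recorded without a domain suffix:

```python
def to_heartbeat_name(name: str) -> str:
    """Strip any domain suffix so the name exactly matches a Computer value."""
    return name.split(".", 1)[0]

servers = ["server1.contoso.local", "server2", "SERVER3.corp.example.com"]
print([to_heartbeat_name(s) for s in servers])
# ['server1', 'server2', 'SERVER3']
```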
Azure Activity Log
Azure Activity Log is also part of Azure Monitor. It provides insight into subscription-level events that occur in
Azure.
To implement this solution:
1. In the Azure portal, open All services and select Management + Governance > Solutions.
2. In the Solutions view, select Add.
3. Search for Activity Log Analytics and select it.
4. Select Create.
Specify the Workspace name of the workspace that you created earlier. The solution is enabled in that
workspace.
Azure Log Analytics Agent Health
The Azure Log Analytics Agent Health solution reports on the health, performance, and availability of your
Windows and Linux servers.
To implement this solution:
1. In the Azure portal, open All services and select Management + Governance > Solutions.
2. In the Solutions view, select Add.
3. Search for Azure Log Analytics agent health and select it.
4. Select Create.
Specify the Workspace name of the workspace that you created earlier. The solution is enabled in that
workspace.
After creation is complete, the workspace resource instance displays AgentHealthAssessment when you select
View > Solutions.
Antimalware Assessment
The Antimalware Assessment solution helps you identify servers that are infected or at increased risk of infection
by malware.
To implement this solution:
1. In the Azure portal, open All services and select Management + Governance > Solutions.
2. In the Solutions view, select Add.
3. Search for and select Antimalware Assessment.
4. Select Create.
Specify the Workspace name of the workspace that you created earlier. The solution is enabled in that
workspace.
After creation is complete, the workspace resource instance displays AntiMalware when you select View >
Solutions.
Azure Monitor for VMs
You can enable Azure Monitor for VMs through the view page for the VM instance, as described in Enable
management services on a single VM for evaluation. You shouldn't enable solutions directly from the Solutions
page as you do for the other solutions that are described in this article. For large-scale deployments, it may be
easier to use automation to enable the correct solutions in the workspace.
Azure Security Center
We recommend that you onboard all your servers at least to the Azure Security Center Free tier. This option
provides a basic level of security assessments and actionable security recommendations for your environment. If
you upgrade to the Standard tier, you get additional benefits, which are discussed in detail on the Security Center
pricing page.
To enable the Azure Security Center Free tier, follow these steps:
1. Go to the Security Center portal page.
2. Under POLICY & COMPLIANCE, select Security policy.
3. Find the Log Analytics workspace resource that you created in the pane on the right side.
4. Select Edit settings for that workspace.
5. Select Pricing tier.
6. Choose the Free option.
7. Select Save.
Next steps
Learn how to use automation to onboard servers and create alerts.
Automate onboarding and alert configuration
Automate onboarding
To improve the efficiency of deploying Azure server management services, consider automating deployment as
discussed in previous sections of this guidance. The script and the example templates provided in the following
sections are starting points for developing your own automation of onboarding processes.
This guidance has a supporting GitHub repository of sample code, CloudAdoptionFramework. The repository
provides example scripts and Azure Resource Manager templates to help you automate the deployment of Azure
server management services.
The sample files illustrate how to use Azure PowerShell cmdlets to automate the following tasks:
Create a Log Analytics workspace. (Or use an existing workspace if it meets the requirements. For details,
see Workspace planning.)
Create an Automation account. (Or use an existing account if it meets the requirements. For details, see
Workspace planning.)
Link the Automation account and the Log Analytics workspace. This step isn't required if you're onboarding
by using the Azure portal.
Enable Update Management, and Change Tracking and Inventory, for the workspace.
Onboard Azure VMs by using Azure Policy. A policy installs the Log Analytics agent and the Microsoft
Dependency Agent on the Azure VMs.
Onboard on-premises servers by installing the Log Analytics agent on them.
The files described in the following table are used in this sample. You can customize them to support your own
deployment scenarios.
| File | Description |
|---|---|
| ScopeConfig.json | A Resource Manager template that uses the opt-in model for on-premises servers with the Change Tracking solution. Using the opt-in model is optional. |
| ChangeTracking-Filelist.json | A Resource Manager template that defines the list of files that will be monitored by Change Tracking. |
Use the following command to run New-AMSDeployment.ps1:
Next steps
Learn how to set up basic alerts to notify your team of key management events and issues.
Set up basic alerts
Set up basic alerts
A key part of managing resources is getting notified when problems occur. Alerts proactively notify you of critical
conditions, based on triggers from metrics, logs, or service-health issues. As part of onboarding the Azure server
management services, you can set up alerts and notifications that help keep your IT teams aware of any problems.
Next steps
Learn about operations and security mechanisms that support your ongoing operations.
Ongoing management and security
Phase 3: Ongoing management and security
After you've onboarded Azure server management services, you'll need to focus on the operations and security
configurations that will support your ongoing operations. We'll start by securing your environment through a
review of Azure Security Center. We'll then configure policies to keep your servers in compliance and automate
common tasks. This section covers the following topics:
Address security recommendations. Azure Security Center provides suggestions to improve the security of
your environment. When you implement these recommendations, you see the impact reflected in a security
score.
Enable the Guest Configuration policy. Use the Azure Policy Guest Configuration feature to audit the
settings in a virtual machine. For example, you can check whether any certificates are about to expire.
Track and alert on critical changes. When you're troubleshooting, the first question to consider is, "What's
changed?" In this article, you'll learn how to track changes and create alerts to proactively monitor critical
components.
Create update schedules. Schedule the installation of updates to ensure that all your servers have the latest
ones.
Common Azure Policy examples. This article provides examples of common management policies.
Next steps
Learn how to enable the Azure Policy Guest Configuration feature.
Guest Configuration policy
Guest Configuration policy
You can use the Azure Policy Guest Configuration extension to audit the configuration settings in a virtual
machine. Guest Configuration is currently supported only on Azure VMs.
To find the list of Guest Configuration policies, search for "Guest Configuration" on the Azure Policy portal page.
Or run this cmdlet in a PowerShell window to find the list:
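One way to list them (a sketch; assumes the Az.Resources module and a prior Connect-AzAccount sign-in, and that built-in initiatives carry the "Guest Configuration" metadata category):

```powershell
# List built-in policy initiatives in the Guest Configuration category
Get-AzPolicySetDefinition |
    Where-Object { $_.Properties.metadata.category -eq "Guest Configuration" } |
    ForEach-Object { $_.Properties.displayName }
```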
NOTE
Guest Configuration functionality is regularly updated to support additional policy sets. Check for new supported policies
periodically and evaluate whether they'll be useful.
Deployment
Use the following example PowerShell script to deploy these policies to:
Verify that password security settings in Windows and Linux computers are set correctly.
Verify that certificates aren't close to expiration on Windows VMs.
Before you run this script, use the Connect-AzAccount cmdlet to sign in. When you run the script, you must
provide the name of the subscription that you want to apply the policies to.
New-AzPolicyAssignment -Name "CertExpirePolicy" -DisplayName "[Preview]: Audit that certificates are not
expiring on Windows VMs" -Scope $scope -PolicySetDefinition $CertExpirePolicy -AssignIdentity -Location eastus
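The assignment above assumes that $scope and $CertExpirePolicy are already defined. A hypothetical setup for those variables might look like the following; the display-name filter is an assumption and may need adjusting to match the exact initiative name:

```powershell
# Hypothetical setup for the assignment shown above.
$subscriptionName = Read-Host -Prompt "Name of the target subscription"
$subscription = Get-AzSubscription -SubscriptionName $subscriptionName
$scope = "/subscriptions/$($subscription.Id)"
$CertExpirePolicy = Get-AzPolicySetDefinition | Where-Object {
    $_.Properties.displayName -like "*certificates*expiring*"
}
```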
Next steps
Learn how to enable change tracking and alerting for critical file, service, software, and registry changes.
Enable tracking and alerting for critical changes
Enable tracking and alerting for critical changes
Azure Change Tracking and Inventory provides alerts on the configuration state of your hybrid environment and
on changes to that environment. It can report critical file, service, software, and registry changes that might affect
your deployed servers.
By default, the Azure Automation inventory service doesn't monitor files or registry settings. The solution does
provide a list of registry keys that we recommend for monitoring. To see this list, go to your Automation account
in the Azure portal and select Inventory > Edit Settings.
For more information about each registry key, see Registry key change tracking. Select any key to evaluate and
then enable it. The setting is applied to all VMs that are enabled in the current workspace.
You can also use the service to track critical file changes. For example, you might want to track the
C:\windows\system32\drivers\etc\hosts file because the OS uses it to map host names to IP addresses. Changes
to this file could cause connectivity problems or redirect traffic to dangerous websites.
To enable file-content tracking for the hosts file, follow the steps in Enable file content tracking.
You can also add an alert for changes to files that you're tracking. For example, say you want to set an alert for
changes to the hosts file. Select Log Analytics on the command bar or Log Search for the linked Log Analytics
workspace. In Log Analytics, use the following query to search for changes to the hosts file:
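A Log Analytics query along these lines finds hosts-file content changes. This is a sketch; the exact Change Tracking table schema and column names may differ in your workspace:

```kusto
// Content changes to any file whose path contains "hosts"
ConfigurationChange
| where ConfigChangeType == "Files" and FileSystemPath contains "hosts"
| order by TimeGenerated desc
```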
This query searches for changes to the contents of files whose path contains the word "hosts." You can
also search for a specific file by changing the path parameter (for example,
FileSystemPath == "c:\\windows\\system32\\drivers\\etc\\hosts" ).
After the query returns the results, select New alert rule to open the alert-rule editor. You can also get to this
editor via Azure Monitor in the Azure portal.
In the alert-rule editor, review the query and change the alert logic if you need to. In this case, we want the alert to
be raised if any changes are detected on any machine in the environment.
After you set the condition logic, you can assign action groups to perform actions in response to the alert. In this
example, when the alert is raised, emails are sent and an ITSM ticket is created. You can take many other useful
actions, like triggering an Azure function, an Azure Automation runbook, a webhook, or a logic app.
After you've set all the parameters and logic, apply the alert to the environment.
Next steps
Learn how to use Azure Automation to create update schedules to manage updates to your servers.
Create update schedules
Create update schedules
You can manage update schedules by using the Azure portal or the new PowerShell cmdlet modules.
To create an update schedule via the Azure portal, see Schedule an update deployment.
The Az.Automation module now supports configuring Update Management by using Azure PowerShell. Version
1.7.0 of the module adds support for the New-AzAutomationUpdateManagementAzureQuery cmdlet. This cmdlet
lets you use tags, location, and saved searches to configure update schedules for a flexible group of machines.
Example script
The example script in this section illustrates the use of tagging and querying to create dynamic groups of
machines that you can apply update schedules to. It performs the following actions. You can refer to the
implementations of the specific actions when you create your own scripts.
Creates an Azure Automation update schedule that runs every Saturday at 8:00 AM.
Creates a query for machines that match these criteria:
Deployed in the westus , eastus , or eastus2 Azure location
Have an Owner tag applied to them with a value set to JaneSmith
Have a Production tag applied to them with a value set to true
Applies the update schedule to the queried machines and sets a two-hour update window.
Before you run the example script, you'll need to sign in by using the Connect-AzAccount cmdlet. When you start
the script, provide the following information:
The target subscription ID
The target resource group
Your Log Analytics workspace name
Your Azure Automation account name
<#
.SYNOPSIS
Creates an update schedule and applies it to a dynamic group of machines.
.Parameter SubscriptionId
.Parameter ResourceGroupName
.Parameter WorkspaceName
.Parameter AutomationAccountName
#>
param (
    [Parameter(Mandatory=$true)]
    [string]$SubscriptionId,
    [Parameter(Mandatory=$true)]
    [string]$ResourceGroupName,
    [Parameter(Mandatory=$true)]
    [string]$WorkspaceName,
    [Parameter(Mandatory=$true)]
    [string]$AutomationAccountName,
    [Parameter(Mandatory=$false)]
    [string]$scheduleName = "SaturdayCriticalSecurity"
)

Import-Module Az.Automation

# Create a weekly schedule that starts 10 minutes from now and runs every Saturday.
$startTime = ([DateTime]::Now).AddMinutes(10)
$schedule = New-AzAutomationSchedule -ResourceGroupName $ResourceGroupName `
    -AutomationAccountName $AutomationAccountName `
    -StartTime $startTime `
    -Name $scheduleName `
    -Description "Saturday patches" `
    -DaysOfWeek Saturday `
    -WeekInterval 1 `
    -ForUpdateConfiguration

# Build the dynamic group query: machines in the listed locations that carry
# the Owner=JaneSmith and Production=true tags.
$queryScope = @("/subscriptions/$SubscriptionId/resourceGroups/")
$DGQuery = New-AzAutomationUpdateManagementAzureQuery -ResourceGroupName $ResourceGroupName `
    -AutomationAccountName $AutomationAccountName `
    -Scope $queryScope `
    -Location @("westus", "eastus", "eastus2") `
    -Tag @{ "Owner" = "JaneSmith"; "Production" = "true" }
$AzureQueries = @($DGQuery)

# Apply the schedule to the queried machines with a two-hour update window.
# (Sketch: the Security update classification is shown as an example.)
New-AzAutomationSoftwareUpdateConfiguration -ResourceGroupName $ResourceGroupName `
    -AutomationAccountName $AutomationAccountName `
    -Schedule $schedule `
    -Windows `
    -AzureQuery $AzureQueries `
    -IncludedUpdateClassification Security `
    -Duration (New-TimeSpan -Hours 2)
Next steps
See examples of how to implement common policies in Azure that can help manage your servers.
Common policies in Azure
Common Azure Policy examples
Azure Policy can help you apply governance to your cloud resources. The service can help you create guardrails
that ensure company-wide compliance with governance policy requirements. To create policies, use either the Azure
portal or PowerShell cmdlets. This article provides PowerShell cmdlet examples.
NOTE
With Azure Policy, enforcement policies (deployIfNotExists) aren't automatically deployed to existing VMs. Remediation is
required to keep VMs in compliance. For more information, see Remediate noncompliant resources with Azure Policy.
The following script shows how to assign the policy. Change the $SubscriptionID value to point to the
subscription that you want to assign the policy to. Before you run the script, use the Connect-AzAccount cmdlet to
sign in.
#Replace the -Name GUID with the policy GUID you want to assign.
$AllowedLocationPolicy = Get-AzPolicyDefinition -Name "e56962a6-4747-49cd-b67b-bf8b01975c4c"
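A hypothetical completion of the assignment might look like this; the subscription ID and the list of allowed locations are placeholders, and the parameter name follows the built-in allowed-locations definition:

```powershell
# Hypothetical completion: assign the allowed-locations definition at subscription scope.
$SubscriptionID = "<subscription-id>"   # replace with the target subscription ID
$scope = "/subscriptions/$SubscriptionID"
New-AzPolicyAssignment -Name "Allowed Locations" `
    -DisplayName "Limit allowed locations" `
    -Scope $scope `
    -PolicyDefinition $AllowedLocationPolicy `
    -PolicyParameterObject @{ listOfAllowedLocations = @("eastus", "eastus2") }
```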
You can also use this script to apply the other policies that are discussed in this article. Just replace the GUID in
the line that sets $AllowedLocationPolicy with the GUID of the policy that you want to apply.
Block certain resource types
Another common built-in policy helps control costs by blocking the deployment of certain resource types.
To find this policy in the portal, search for "allowed resource types" on the policy definition page. Or run this
cmdlet to find the policy:
Get-AzPolicyDefinition | Where-Object { ($_.Properties.policyType -eq "BuiltIn") -and
($_.Properties.displayName -like "*allowed resource types") }
After you identify the policy that you want to use, you can modify the PowerShell sample in the Restrict resource
regions section to assign the policy.
Restrict VM size
Azure offers a wide range of VM sizes to support various workloads. To control your budget, you could create a
policy that allows only a subset of VM sizes in your subscriptions.
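For example, you might locate the built-in definition that restricts VM sizes by display name and then assign it the same way as the other policies in this article. The filter string below is an assumption and may need adjusting:

```powershell
# Sketch: find the built-in policy that restricts allowed VM sizes (SKUs).
$vmSizePolicy = Get-AzPolicyDefinition | Where-Object {
    ($_.Properties.policyType -eq "BuiltIn") -and
    ($_.Properties.displayName -like "*virtual machine*SKUs*")
}
```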
Deploy antimalware
You can use this policy to deploy a Microsoft IaaSAntimalware extension with a default configuration to VMs that
aren't protected by antimalware.
The policy GUID is 2835b622-407b-4114-9198-6f7064cbe0dc .
The following script shows how to assign the policy. To use the script, change the $SubscriptionID value to point
to the subscription that you want to assign the policy to. Before you run the script, use the Connect-AzAccount
cmdlet to sign in.
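The assignment command references $scope and $AntimalwarePolicy. A hypothetical setup for those variables (the subscription ID is a placeholder; the GUID is the policy GUID given above) might be:

```powershell
# Hypothetical setup: subscription scope and the antimalware policy definition.
$SubscriptionID = "<subscription-id>"   # replace with your subscription ID
$scope = "/subscriptions/$SubscriptionID"
$AntimalwarePolicy = Get-AzPolicyDefinition -Name "2835b622-407b-4114-9198-6f7064cbe0dc"
```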
#Replace location "eastus" with the value that you want to use.
New-AzPolicyAssignment -Name "Deploy Antimalware" -DisplayName "Deploy default Microsoft IaaSAntimalware
extension for Windows Server" -Scope $scope -PolicyDefinition $AntimalwarePolicy -Location eastus -AssignIdentity
Next steps
Learn about other server-management tools and services that are available.
Azure server management tools and services
Azure server management tools and services
As is discussed in the overview of this guidance, the suite of Azure server management services covers these
areas:
Migrate
Secure
Protect
Monitor
Configure
Govern
The following sections briefly describe these management areas and provide links to detailed content about the
main Azure services that support them.
Migrate
Migration services can help you migrate your workloads into Azure. To provide the best guidance, the Azure
Migrate service starts by measuring on-premises server performance and assessing suitability for migration.
After Azure Migrate completes the assessment, you can use Azure Site Recovery and Azure Database Migration
Service to migrate your on-premises machines to Azure.
Secure
Azure Security Center is a comprehensive security management application. By onboarding to Security Center,
you can quickly get an assessment of the security and regulatory compliance status of your environment. For
instructions on onboarding your servers to Azure Security Center, see Configure Azure management services for
a subscription.
Protect
To protect your data, you need to plan for backup, high availability, encryption, authorization, and related
operational issues. These topics are covered extensively online, so here we'll focus on building a business
continuity and disaster recovery (BCDR) plan. We'll include references to documentation that describes in detail
how to implement and deploy this type of plan.
When you build data-protection strategies, first consider breaking down your workload applications into their
different tiers. This approach helps because each tier typically requires its own unique protection plan. To learn
more about designing applications to be resilient, see Designing resilient applications for Azure.
The most basic data protection is backup. To speed up the recovery process if servers are lost, back up not just
data but also server configurations. Backup is an effective mechanism to handle accidental data deletion and
ransomware attacks. Azure Backup can help you protect your data on Azure and on-premises servers running
Windows or Linux. For details about what Backup can do and for how-to guides, see the Azure Backup
documentation.
Recovery via backup can take a long time; the industry standard is usually one day. If a workload requires
business continuity during hardware failures or a datacenter outage, consider using data replication. Azure Site
Recovery provides continuous replication of your VMs, keeping data loss to a minimum. Site Recovery also
supports several replication scenarios, such as replication:
Of Azure VMs between two Azure regions.
Between servers on-premises.
Between on-premises servers and Azure.
For more information, see the complete Azure Site Recovery replication matrix.
For your file-server data, another service to consider is Azure File Sync. This service helps you centralize your
organization's file shares in Azure Files, while preserving the flexibility, performance, and compatibility of an on-
premises file server. To use this service, follow the instructions for deploying Azure File Sync.
Monitor
Azure Monitor provides a view into various resources, like applications, containers, and virtual machines. It also
collects data from several sources:
Azure Monitor for VMs (insights) provides an in-depth view of virtual-machine health, performance trends,
and dependencies. The service monitors the health of the operating systems of your Azure virtual machines,
virtual-machine scale sets, and machines in your on-premises environment.
Log Analytics (logs) is a feature of Azure Monitor. Its role is central to the overall Azure management story. It
serves as the data store for log analysis and for many other Azure services. It offers a rich query language and
an analytics engine that provides insights into the operation of your applications and resources.
Azure Activity Log is also a feature of Azure Monitor. It provides insight into subscription-level events that
occur in Azure.
Configure
Several services fit into this category. They can help you to:
Automate operational tasks.
Manage server configurations.
Measure update compliance.
Schedule updates.
Detect changes to your servers.
These services are essential to supporting ongoing operations:
Update Management automates the deployment of patches across your environment, including deployment to
operating-system instances running outside of Azure. It supports both Windows and Linux operating systems,
and tracks key OS vulnerabilities and nonconformance caused by missing patches.
Change Tracking and Inventory provides insight into the software that's running in your environment, and
highlights any changes that have occurred.
Azure Automation lets you run Python and PowerShell scripts or runbooks to automate tasks across your
environment. When you use Automation with the Hybrid Runbook Worker, you can extend your runbooks to
your on-premises resources as well.
Azure Automation State Configuration enables you to push PowerShell Desired State Configuration (DSC)
configurations directly from Azure. DSC also lets you monitor and preserve configurations for guest operating
systems and workloads.
Govern
Adopting and moving to the cloud creates new management challenges. It requires a different mindset as you
shift from an operational management burden to monitoring and governance. The Cloud Adoption Framework
for Azure starts with governance. The framework explains how to migrate to the cloud, what the journey will look
like, and who should be involved.
The governance design for standard organizations often differs from governance design for complex enterprises.
To learn more about governance best practices for a standard organization, see the standard enterprise
governance guide. To learn more about governance best practices for a complex enterprise, see the governance
guide for complex enterprises.
Billing information
To learn about pricing for Azure management services, go to these pages:
Azure Site Recovery
Azure Backup
Azure Monitor
Azure Security Center
Azure Automation, including:
Desired State Configuration
Azure Update Management service
Azure Change Tracking and Inventory services
Azure Policy
Azure File Sync service
NOTE
The Azure Update Management solution is free, but there's a small cost related to data ingestion. As a rule of thumb, the
first 5 gigabytes (GB) per month of data ingestion are free. We generally observe that each machine uses about 25 MB per
month. So, about 200 machines per month are covered for free. For more servers, multiply the number of additional servers
by 25 MB per month. Then, multiply the result by the storage price for the additional storage that you need. For
information about costs, see Azure Storage Overview pricing. Each additional server typically has a nominal impact on cost.
Cloud monitoring guide: Introduction
The cloud fundamentally changes how enterprises procure and use technology resources. In the past, enterprises
assumed ownership of and responsibility for all levels of technology, from infrastructure to software. Now, the
cloud offers the potential for enterprises to provision and consume resources as needed.
Although the cloud offers nearly unlimited flexibility in terms of design choices, enterprises seek proven and
consistent methodology for the adoption of cloud technologies. Each enterprise has different goals and timelines
for cloud adoption, making a one-size-fits-all approach to adoption nearly impossible.
This digital transformation also enables an opportunity to modernize your infrastructure, workloads, and
applications. Depending on business strategy and objectives, adopting a hybrid cloud model is likely part of the
migration journey from on-premises to operating fully in the cloud. During this journey, IT teams are challenged to
adopt and realize rapid value from the cloud. IT must also understand how to effectively monitor the application or
service that's migrating to Azure, and continue to deliver effective IT operations and DevOps.
Stakeholders want to use cloud-based, software as a service (SaaS) monitoring and management tools. They need
to understand what those services and solutions deliver to achieve end-to-end visibility, reduce costs, and focus
less on the infrastructure and maintenance of traditional software-based IT operations tools.
However, IT often prefers to use the tools it has already made a significant investment in. This approach supports
established service operations processes for monitoring both cloud models, with the eventual goal of transitioning
to a SaaS-based offering. IT prefers this approach not only because switching takes time, planning, resources, and
funding, but also because of confusion about which products or Azure services are appropriate or applicable to
achieve the transition.
The goal of this guide is to provide a detailed reference to help enterprise IT managers, business decision makers,
application architects, and application developers understand:
Azure monitoring platforms, with an overview and comparison of their capabilities.
The best-fit solution for monitoring hybrid, private, and Azure native workloads.
The recommended end-to-end monitoring approach for both infrastructure and applications. This approach
includes deployable solutions for migrating these common workloads to Azure.
This guide isn't a how-to article for using or configuring individual Azure services and solutions, but it does
reference those sources when they're applicable or available. After you've read it, you'll understand how to
successfully operate a workload by following best practices and patterns.
If you're unfamiliar with Azure Monitor and System Center Operations Manager, and you want to get a better
understanding of what makes them unique and how they compare to each other, review the Overview of our
monitoring platforms.
Audience
This guide is useful primarily for enterprise administrators, IT operations, IT security and compliance, application
architects, workload development owners, and workload operations owners.
Next steps
Monitoring strategy for cloud deployment models
Cloud monitoring guide: Monitoring strategy for
cloud deployment models
This article includes our recommended monitoring strategy for each of the cloud deployment models, based on
the following criteria:
You must maintain your commitment to Operations Manager or another enterprise monitoring platform,
because it's integrated with your IT operations processes, knowledge, and expertise, or certain functionality
isn't available yet in Azure Monitor.
You must monitor workloads both on-premises and in the public cloud, or just in the cloud.
Your cloud migration strategy includes modernizing IT operations and moving to our cloud monitoring
services and solutions.
You might have critical systems that are air-gapped or physically isolated, or are hosted in a private cloud or on
physical hardware, and these systems need to be monitored.
Our strategy includes support for monitoring infrastructure (compute, storage, and server workloads), application
(end-user, exceptions, and client), and network resources. It delivers a complete, service-oriented monitoring
perspective.
Azure resources - platform as a service (PaaS)
Resources: Azure Database services (for example, SQL or MySQL).
What to monitor: Azure Database for SQL performance metrics.
How to monitor: Enable diagnostics logging to stream SQL data to Azure Monitor logs.

Azure resources - infrastructure as a service (IaaS)
Resources: Azure Storage; Azure Application Gateway; network security groups; Azure Traffic Manager; Azure Virtual Machines; Azure Kubernetes Service/Azure Container Instances.
What to monitor: Capacity, availability, and performance of Azure Storage. Performance and diagnostics logs (activity, access, performance, and firewall) for Application Gateway. Events when network security group rules are applied, and the rule counter for how many times a rule is applied to deny or allow. Traffic Manager endpoint status availability. Capacity, availability, and performance in a guest VM operating system (OS), including a map of app dependencies hosted on each VM, with visibility of active network connections between servers, inbound and outbound connection latency, and ports across any TCP-connected architecture. Capacity, availability, and performance of workloads running on containers and container instances.
How to monitor: Storage metrics for Blob storage. Enable diagnostics logging for Application Gateway and configure streaming to Azure Monitor logs. Enable diagnostics logging of network security groups, and configure streaming to Azure Monitor logs. Enable diagnostics logging of Traffic Manager endpoints, and configure streaming to Azure Monitor logs. Enable Azure Monitor for VMs. Enable Azure Monitor for containers.

Azure subscription
Resources: Azure Service Health and basic resource health.
What to monitor: Administrative actions performed on a service or resource; service health when an Azure service is in a degraded or unavailable state; health issues detected with an Azure resource from the Azure service perspective; operations performed with Azure Autoscale indicating a failure or exception; operations performed with Azure Policy indicating that an allowed or denied action occurred; records of alerts generated by Azure Security Center.
How to monitor: Delivered in the Activity Log for monitoring and alerting by using Azure Resource Manager.
Legacy web application monitoring (older versions of .NET and Java web applications): supported on both platforms, with limitations that vary by SDK.
Next steps
Collect the right data
Cloud monitoring guide: Collect the right data
This article describes some considerations for collecting monitoring data in a cloud application.
To observe the health and availability of your cloud solution, you must configure the monitoring tools to collect
signals that are based on predictable failure states. These signals are the symptoms of the failure, not the cause.
The monitoring tools use metrics and, for advanced diagnostics and root cause analysis, logs.
Plan for monitoring and migration carefully. Start by including the monitoring service owner, the manager of
operations, and other related personnel during the planning phase, and continue engaging them throughout the
development and release cycle. Their focus will be to develop a monitoring configuration that's based on the
following criteria:
What's the composition of the service, and are those dependencies monitored today? If so, are there multiple
tools involved? Is there an opportunity to consolidate, without introducing risks?
What is the SLA of the service, and how will I measure and report it?
What should the service dashboard look like when an incident is raised? What should the dashboard look like
for the service owner, and for the team that supports the service?
What metrics does the resource produce that I need to monitor?
How will the service owner, support teams, and other personnel be searching the logs?
How you answer those questions, and the criteria for alerting, determines how you'll use the monitoring platform.
If you're migrating from an existing monitoring platform or set of monitoring tools, use the migration as an
opportunity to reevaluate the signals you collect. This is especially true now that there are several cost factors to
consider when you migrate or integrate with a cloud-based monitoring platform like Azure Monitor. Remember,
monitoring data must be actionable. Collect optimized data that gives you a "10,000-foot view" of the overall
health of the service. The instrumentation that's defined to identify real incidents should be as simple,
predictable, and reliable as possible.
Next steps
Alerting strategy
Cloud monitoring guide: Alerting
For years, IT organizations have struggled to combat the alert fatigue that's created by the monitoring tools
deployed in the enterprise. Many systems generate a high volume of alerts often considered meaningless, while
other alerts are relevant but are either overlooked or ignored. As a result, IT and developer operations have
struggled to meet the service-level quality promised to internal or external customers. To ensure reliability, it's
essential to understand the state of your infrastructure and applications. To minimize service degradation and
disruption, or to decrease the effect of or reduce the number of incidents, you need to identify causes quickly.
Azure Monitor for containers
Monitoring data: Calculated average performance data from nodes and pods is written to the metrics store. Calculated performance data that uses percentiles from nodes, controllers, containers, and pods is written to the logs store, along with container logs and inventory information.
Alerting: Create metric alerts if you want to be alerted based on variation of measured utilization performance, aggregated over time. Create log query alerts if you want to be alerted based on variation of measured utilization from clusters and containers; log query alerts can also be configured based on pod-phase counts and node status counts.

Azure Monitor for VMs
Monitoring data: Health criteria are metrics written to the metrics store.
Alerting: Alerts are generated when the health state changes from healthy to unhealthy. This alert supports only action groups that are configured to send SMS or email notifications.
NOTE
These features apply only to metric alerts, alerts based on data that's being sent to the Azure Monitor metric database. The
features don't apply to the other types of alerts. As mentioned previously, the primary objective of metric alerts is speed. If
getting an alert in less than five minutes isn't of primary concern, you can use a log query alert instead.
Dynamic thresholds: Dynamic thresholds look at the activity of the resource over a time period, and create
upper and lower "normal behavior" thresholds. When the metric being monitored falls outside of these
thresholds, you get an alert.
Multisignal alerts: You can create a metric alert that uses the combination of two different inputs from two
different resource types. For example, if you want to fire an alert when the CPU utilization of a VM is over
90 percent, and the number of messages in a certain Azure Service Bus queue feeding that VM exceeds a
certain amount, you can do so without creating a log query. This feature works for only two signals. If you
have a more complex query, feed your metric data into the Azure Monitor log store, and use a log query.
Multiresource alerts: Azure Monitor allows a single metric alert rule that applies to all VM resources. This
feature can save you time because you don't need to create individual alerts for each VM. Pricing for this
type of alert is the same. Whether you create 50 alerts for monitoring CPU utilization for 50 VMs, or one
alert that monitors CPU utilization for all 50 VMs, it costs you the same amount. You can use these types of
alerts in combination with dynamic thresholds as well.
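As an illustrative sketch of a multiresource metric alert created with the Az.Monitor cmdlets, the following rule covers every VM in a subscription scope. All names, IDs, regions, and thresholds are hypothetical placeholders:

```powershell
# Sketch: one metric alert rule covering all VMs under a subscription scope.
$criteria = New-AzMetricAlertRuleV2Criteria -MetricName "Percentage CPU" `
    -TimeAggregation Average -Operator GreaterThan -Threshold 90

Add-AzMetricAlertRuleV2 -Name "cpu-over-90" -ResourceGroupName "mgmt-rg" `
    -TargetResourceScope "/subscriptions/<subscription-id>" `
    -TargetResourceType "Microsoft.Compute/virtualMachines" `
    -TargetResourceRegion "eastus" `
    -WindowSize (New-TimeSpan -Minutes 5) -Frequency (New-TimeSpan -Minutes 1) `
    -Condition $criteria -Severity 2
```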
Used together, these features can save time by minimizing alert notifications and the management of the
underlying alerts.
Alerts limitations
Be sure to note the limitations on the number of alerts you can create. Some limits (but not all of them) can be
increased by calling support.
Best query experience
If you're looking for trends across all your data, it makes sense to import all your data into Azure Logs, unless it's
already in Application Insights. You can create queries across both workspaces, so there's no need to move data
between them. You can also import activity log and Service Health data into your Log Analytics workspace. You
pay for this ingestion and storage, but you get all your data in one place for analysis and querying. This approach
also gives you the ability to create complex query conditions and alert on them.
Cloud monitoring guide: Monitoring platforms
overview
Microsoft provides a range of monitoring capabilities from two products: System Center Operations Manager,
which was designed for on-premises and then extended to the cloud, and Azure Monitor, which was designed for
the cloud but can also monitor on-premises systems. These two offerings deliver core monitoring services, such as
alerting, service uptime tracking, application and infrastructure health monitoring, diagnostics, and analytics.
Many organizations are embracing the latest practices for DevOps agility and cloud innovations to manage their
heterogeneous environments. Yet they're also concerned about their ability to make appropriate and responsible
decisions about how to monitor those workloads.
This article provides a high-level overview of our monitoring platforms to help you understand how each delivers
core monitoring functionality.
Infrastructure requirements
Operations Manager
Operations Manager requires significant infrastructure and maintenance to support a management group, which
is a basic unit of functionality. At a minimum, a management group consists of one or more management servers,
a SQL Server instance hosting the operational and reporting data warehouse databases, and agents. The
complexity of a management group design depends on a number of factors, such as the scope of workloads to
monitor, and the number of devices or computers supporting the workloads. If you require high availability and
site resiliency, as is commonly the case with enterprise monitoring platforms, the infrastructure requirements and
associated maintenance can increase dramatically.
[Diagram: an Operations Manager management group, comprising management servers, the Operations and web consoles, the operational, data warehouse, and reporting databases, and agent-managed, agentless-managed, UNIX/Linux, and network devices.]
Azure Monitor
Azure Monitor is a software as a service (SaaS) offering: all the infrastructure that supports it runs in Azure and is managed by Microsoft. It's designed to perform monitoring, analytics, and diagnostics at scale, and is
available in all national clouds. Core parts of the infrastructure (collectors, metrics and logs store, and analytics)
that are necessary to support Azure Monitor are maintained by Microsoft.
[Diagram: Azure Monitor, showing data flowing from sources through collectors into metric and log storage, and being consumed by insights (Application, Container, VM), monitoring solutions, and integrations such as export APIs and Logic Apps. Grey items are not part of Azure Monitor, but are part of the Azure Monitor story.]
Data collection
Operations Manager
Agents
Operations Manager collects data directly only from agents that are installed on Windows computers. It can accept
data from the Operations Manager SDK, but this approach is typically used for partners that extend the product
with custom applications, not for collecting monitoring data. It can collect data from other sources, such as Linux computers and network devices, by using special modules that run on the Windows agent and remotely access those devices.
[Diagram: an Operations Manager management server collecting data through an agent-managed Windows computer, which remotely accesses a Linux server and a network device.]
The Operations Manager agent can collect from multiple data sources on the local computer, such as the event log,
custom logs, and performance counters. It can also run scripts, which can collect data from the local computer or
from external sources. You can write custom scripts to collect data that can't be collected by other means, or to
collect data from a variety of remote devices that can't otherwise be monitored.
Management packs
Operations Manager performs all monitoring with workflows (rules, monitors, and object discoveries). These
workflows are packaged together in a management pack and deployed to agents. Management packs are available
for a variety of products and services, which include predefined rules and monitors. You can also author your own
management pack for your own applications and custom scenarios.
Monitoring configuration
Management packs can contain hundreds of rules, monitors, and object discovery rules. An agent runs all these
monitoring settings from all the management packs that apply, which are determined by discovery rules. Each
instance of each monitoring setting runs independently and acts immediately on the data that it collects. This is
how Operations Manager can achieve near-real-time alerting and the current health state of monitored resources.
For example, a monitor might sample a performance counter every few minutes. If that counter exceeds a
threshold, it immediately sets the health state of its target object, which immediately triggers an alert in the
management group. A scheduled rule might watch for a particular event to be created and immediately fire an
alert when that event is created in the local event log.
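The sample-compare-react behavior described above can be sketched in a few lines. This is a simplified illustration, not Operations Manager's actual API; the class, counter, and threshold are assumptions.

```python
# Minimal sketch of a unit monitor: sample a performance counter, compare
# it to a threshold, and immediately update the target's health state,
# raising an alert only on the transition to an unhealthy state.
# All names here are illustrative, not Operations Manager APIs.

HEALTHY, CRITICAL = "Healthy", "Critical"

class CpuMonitor:
    def __init__(self, threshold_percent):
        self.threshold = threshold_percent
        self.health_state = HEALTHY

    def sample(self, cpu_percent):
        """Evaluate one sample; return an alert string when state degrades."""
        new_state = CRITICAL if cpu_percent > self.threshold else HEALTHY
        alert = None
        if new_state == CRITICAL and self.health_state != CRITICAL:
            alert = f"CPU at {cpu_percent}% exceeded {self.threshold}% threshold"
        self.health_state = new_state
        return alert

monitor = CpuMonitor(threshold_percent=90)
print(monitor.sample(45))   # None: still healthy
print(monitor.sample(97))   # alert fires as the state flips to Critical
```

Because each sample acts on its data immediately, the health state and alert reflect the latest observation with no batch delay, which is the property the article attributes to Operations Manager's workflows.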
Because these monitoring settings are isolated from each other and work from the individual sources of data,
Operations Manager has challenges correlating data between multiple sources. It's also difficult to react to data
after it's been collected. You can run workflows that access the Operations Manager database, but this scenario
isn't common, and it's typically used for a limited number of special purpose workflows.
Azure Monitor
Data sources
Azure Monitor collects data from a variety of sources, including Azure infrastructure and platform resources,
agents on Windows and Linux computers, and monitoring data collected in Azure storage. Any REST client can
write log data to Azure Monitor by using an API, and you can define custom metrics for your web applications.
Some metric data can be routed to different locations, depending on its usage. For example, you might use the
data for "fast-as-possible" alerting or for long-term trend analysis searches in conjunction with other log data.
Monitoring solutions and insights
Monitoring solutions use the logs platform in Azure Monitor to provide monitoring for a particular application or
service. They typically define data collection from agents or from Azure services, and provide log queries and
views to analyze that data. They typically don't provide alert rules, which means that you must define your own
alert criteria based on collected data.
Insights, such as Azure Monitor for containers and Azure Monitor for VMs, use the logs and metrics platform of
Azure Monitor to provide a customized monitoring experience for an application or service in the Azure portal.
They might provide health monitoring and alerting conditions, in addition to customized analysis of collected data.
Monitoring configuration
Azure Monitor separates data collection from actions taken against that data, which supports distributed
microservices in a cloud environment. It consolidates data from multiple sources into a common data platform,
and provides analysis, visualization, and alerting capabilities based on the collected data.
All data that's collected by Azure Monitor is stored as either logs or metrics, and different features of Azure Monitor rely on one or the other. Metrics contain numerical values in time series that are well suited for near-real-time alerting and fast detection of issues. Logs contain text or numerical data and are supported by a powerful query language that makes them especially useful for performing complex analysis.
Because Monitor separates data collection from actions against that data, it might not be able to provide near-real-
time alerting in many cases. To alert on log data, queries are run on a recurring schedule defined in the alert. This
behavior allows Azure Monitor to easily correlate data from all monitored sources, and you can interactively
analyze data in a variety of ways. This is especially helpful for doing root cause analysis and identifying where else
an issue might occur.
Health monitoring
Operations Manager
Management Packs in Operations Manager include a service model that describes the components of the
application being monitored and their relationship. Monitors identify the current health state of each component
based on data and scripts on the agent. Health states roll up so that you can quickly view the summarized health
state of monitored computers and applications.
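The roll-up described above can be modeled as a simple "worst state wins" policy: a parent's summarized health is the worst state among its children. The state names below mirror Operations Manager's conventions, but the code itself is an illustrative sketch, not part of the product.

```python
# Hedged sketch of health-state roll-up: the summarized state of a
# monitored application is the worst state among its components.

SEVERITY = {"Healthy": 0, "Warning": 1, "Critical": 2}

def roll_up(child_states):
    """Return the summarized health state for a set of child components."""
    return max(child_states, key=lambda state: SEVERITY[state])

app_components = {"web tier": "Healthy", "database": "Warning", "cache": "Healthy"}
print(roll_up(app_components.values()))  # Warning
```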
Azure Monitor
Azure Monitor doesn't provide a user-definable method of implementing a service model or monitors that indicate
the current health state of any service components. Because monitoring solutions are based on standard features
of Azure Monitor, they don't provide state-level monitoring. The following features of Azure Monitor can be
helpful:
Application Insights: Builds a composite map of your web application, and provides a health state for
each application component or dependency. This includes alerts status and drill-down to more detailed
diagnostics of your application.
Azure Monitor for VMs: Delivers a health-monitoring experience for guest Azure VMs, similar to that of Operations Manager when it monitors Windows and Linux virtual machines. It evaluates the health of
key operating system components from the perspective of availability and performance to determine the
current health state. When it determines that the guest VM has an issue with sustained resource utilization, disk-space capacity, or core operating system functionality, it generates an alert to bring this state to your attention.
Azure Monitor for containers: Monitors the performance and health of Azure Kubernetes Service or
Azure Container Instances. It collects memory and processor metrics from controllers, nodes, and
containers that are available in Kubernetes through the Metrics API. It also collects container logs and
inventory data about containers and their images. Predefined health criteria that are based on the collected
performance data help you identify whether a resource bottleneck or capacity issue exists. You can also
understand the overall performance, or the performance from a specific Kubernetes object type (pod, node,
controller, or container).
Analyze data
Operations Manager
Operations Manager provides four basic ways to analyze data after it has been collected:
Health Explorer: Helps you discover which monitors are identifying a health state issue and review
knowledge about the monitor and possible causes for actions related to it.
Views: Offers predefined visualizations of collected data, such as a graph of performance data or a list of
monitored components and their current health state. Diagram views visually present the service model of
an application.
Reports: Allow you to summarize historical data that's stored in the Operations Manager data warehouse.
You can customize the data that views and reports are based on. However, there is no feature to allow for
complex or interactive analysis of collected data.
Operations Manager Command Shell: Extends Windows PowerShell with an additional set of cmdlets,
and can query and visualize collected data. This includes graphs and other visualizations, natively with
PowerShell, or with the Operations Manager HTML-based web console.
Azure Monitor
With the powerful Azure Monitor analytics engine, you can interactively work with log data and combine it with other monitoring data for trending and other data analysis. Views and dashboards allow you to visualize
query data in a variety of ways from the Azure portal, and import it into Power BI. Monitoring solutions include
queries and views to present the data they collect. Insights such as Application Insights, Azure Monitor for VMs,
and Azure Monitor for containers include customized visualizations to support interactive monitoring scenarios.
Alerting
Operations Manager
Operations Manager creates alerts in response to predefined events, when a performance threshold is met, and
when the health state of a monitored component changes. It includes the complete management of alerts, allowing
you to set their resolution and assign them to various operators or system engineers. You can set notification rules
that specify which alerts will send proactive notifications.
Management packs include various predefined alerting rules for different critical conditions in the application
being monitored. You can tune these rules or create custom rules to the particular requirements of your
environment.
Azure Monitor
With Azure Monitor, you can create alerts based on a metric crossing a threshold, or based on a scheduled query
result. Although alerts based on metrics can achieve near-real-time results, scheduled queries have a longer
response time, depending on the speed of data ingestion and indexing. Instead of being limited to a specific agent,
log query alerts in Azure Monitor let you analyze data across all data stored in multiple workspaces. These alerts
also include data from a specific Application Insights app by using a cross-workspace query.
Although monitoring solutions can include alert rules, you ordinarily create them based on your own
requirements.
Workflows
Operations Manager
Management packs in Operations Manager contain hundreds of individual workflows, and they determine both
what data to collect and what action to perform with that data. For example, a rule might sample a performance
counter every few minutes, storing its results for analysis. A monitor might sample the same performance counter
and compare its value to a threshold to determine the health state of a monitored object. Another rule might run a
script to collect and analyze some data on an agent computer, and then fire an alert if it returns a particular value.
Workflows in Operations Manager are independent of each other, which makes analysis across multiple monitored
objects difficult. These monitoring scenarios must be based on data after it's collected, which is possible but can be
difficult, and it isn't common.
Azure Monitor
Azure Monitor separates data collection from actions and analysis taken from that data. Agents and other data
sources write log data to a Log Analytics workspace and write metric data to the metric database, without any
analysis of that data or knowledge of how it might be used. Monitor performs alerting and other actions from the
stored data, which allows you to perform analysis across data from all sources.
Next steps
Monitoring the cloud deployment models
Centralize management operations
For most organizations, using a single Azure Active Directory (Azure AD) tenant for all users simplifies management operations and reduces maintenance costs, because all management tasks can be performed by designated users, user groups, or service principals within that tenant.
We recommend that you use only one Azure AD tenant for your organization, if possible. However, some
situations might require an organization to maintain multiple Azure AD tenants for the following reasons:
They are wholly independent subsidiaries.
They're operating independently in multiple geographies.
Certain legal or compliance requirements apply.
There are acquisitions of other organizations (sometimes temporary until a long-term tenant consolidation
strategy is defined).
When a multiple-tenant architecture is required, Azure Lighthouse provides a way to centralize and streamline
management operations. Subscriptions from multiple tenants can be onboarded for Azure delegated resource
management. This option allows specified users in the managing tenant to perform cross-tenant management
functions in a centralized and scalable manner.
For example, let's say your organization has a single tenant, Tenant A. The organization then acquires two
additional tenants, Tenant B and Tenant C, and you have business reasons that require you to maintain them as
separate tenants.
Your organization wants to use the same policy definitions, backup practices, and security processes across all
tenants. Because you already have users (including user groups and service principals) that are responsible for
performing these tasks within Tenant A, you can onboard all of the subscriptions within Tenant B and Tenant C so
that those same users in Tenant A can perform those tasks. Tenant A then becomes the managing tenant for Tenant
B and Tenant C.
As your enterprise begins to operate workloads in Azure, the next step is to establish a process for operational
fitness review. This process enumerates, implements, and iteratively reviews the nonfunctional requirements for
these workloads. Nonfunctional requirements are related to the expected operational behavior of the service.
There are five essential categories of nonfunctional requirements, which are called the pillars of software quality:
Scalability
Availability
Resiliency, including business continuity and disaster recovery
Management
Security
A process for operational fitness review ensures that your mission-critical workloads meet the expectations of your
business with respect to the quality pillars.
Create a process for operational fitness review to fully understand the problems that result from running
workloads in a production environment, and how to remediate and resolve those problems. This article outlines a
high-level process for operational fitness review that your enterprise can use to achieve this goal.
At a high level, the process has two phases. In the prerequisites phase, the requirements are established and
mapped to supporting services. This phase occurs infrequently: perhaps annually or when new operations are
introduced. The output of the prerequisites phase is used in the flow phase. The flow phase occurs more frequently,
such as monthly.
Prerequisites phase
The steps in this phase capture the requirements for conducting a regular review of the important services.
1. Identify critical business operations. Identify the enterprise's mission-critical business operations.
Business operations are independent from any supporting service functionality. In other words, business
operations represent the actual activities that the business needs to perform and that are supported by a set
of IT services.
The term mission-critical (or business critical) reflects a severe impact on the business if the operation is
impeded. For example, an online retailer might have a business operation, such as "enable a customer to add
an item to a shopping cart" or "process a credit card payment." If either of these operations fails, a customer
can't complete the transaction and the enterprise fails to realize sales.
2. Map operations to services. Map the critical business operations to the services that support them. In the
shopping-cart example, several services might be involved, including an inventory stock-management
service and a shopping-cart service. To process a credit-card payment, an on-premises payment service
might interact with a third-party, payment-processing service.
3. Analyze service dependencies. Most business operations require orchestration among multiple
supporting services. It's important to understand the dependencies between the services, and the flow of
mission-critical transactions through these services.
Also consider the dependencies between on-premises services and Azure services. In the shopping-cart
example, the inventory stock-management service might be hosted on-premises and ingest data entered by
employees from a physical warehouse. However, it might store data off-premises in an Azure service, such
as Azure Storage, or a database, such as Azure Cosmos DB.
An output from these activities is a set of scorecard metrics for service operations. The scorecard measures criteria
such as availability, scalability, and disaster recovery. Scorecard metrics express the operational criteria that you
expect the service to meet. These metrics can be expressed at any level of granularity that's appropriate for the
service operation.
The scorecard should be expressed in simple terms to facilitate meaningful discussion between the business
owners and engineering. For example, a scorecard metric for scalability might be color-coded in a simple way.
Green means meeting the defined criteria, yellow means failing to meet the defined criteria but actively
implementing a planned remediation, and red means failing to meet the defined criteria with no plan or action.
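The green/yellow/red scheme just described is easy to make mechanical. The function name and inputs below are illustrative, not part of the framework itself.

```python
# A minimal sketch of the color-coded scorecard scheme: green when the
# metric meets its criteria, yellow when it fails but remediation is
# planned, red when it fails with no plan or action.

def scorecard_color(meets_criteria, remediation_planned):
    """Map a scorecard metric's status to a simple color code."""
    if meets_criteria:
        return "green"      # meeting the defined criteria
    if remediation_planned:
        return "yellow"     # failing, but remediation is under way
    return "red"            # failing, with no plan or action

print(scorecard_color(meets_criteria=False, remediation_planned=True))  # yellow
```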
It's important to emphasize that these metrics should directly reflect business needs.
Service-review phase
The service-review phase is the core of the operational fitness review. It involves these steps:
1. Measure service metrics. Use the scorecard metrics to monitor the services, to ensure that the services
meet the business expectations. Service monitoring is essential. If you can't monitor a set of services with
respect to the nonfunctional requirements, consider the corresponding scorecard metrics to be red. In this
case, the first step for remediation is to implement the appropriate service monitoring. For example, if the
business expects a service to operate with 99.99% availability, but there is no production telemetry in place
to measure availability, assume that you're not meeting the requirement.
2. Plan remediation. For each service operation for which metrics fall below an acceptable threshold,
determine the cost of remediating the service to bring operation to an acceptable level. If the cost of
remediating the service is greater than the expected revenue generation of the service, move on to consider
the intangible costs, such as customer experience. For example, if customers have difficulty placing a
successful order by using the service, they might choose a competitor instead.
3. Implement remediation. After the business owners and engineering team agree on a plan, implement it.
Report the status of the implementation whenever you review scorecard metrics.
This process is iterative, and ideally your enterprise has a team dedicated to it. This team should meet regularly to
review existing remediation projects, kick off the fundamental review of new workloads, and track the enterprise's
overall scorecard. The team should also have the authority to hold remediation teams accountable if they're behind
schedule or fail to meet metrics.
Recommended resources
Pillars of software quality. This section of the Azure Application Architecture Guide describes the five pillars of
software quality: scalability, availability, resiliency, management, and security.
Ten design principles for Azure applications. This section of the Azure Application Architecture Guide discusses
a set of design principles to make your application more scalable, resilient, and manageable.
Designing resilient applications for Azure. This guide starts with a definition of the term resiliency and related
concepts. Then, it describes a process for achieving resiliency by using a structured approach over the lifetime of
an application, from design and implementation to deployment and operations.
Cloud design patterns. These design patterns are useful for engineering teams when building applications on
the pillars of software quality.
Azure Advisor. Advisor provides recommendations that are personalized based on your usage and
configurations to help you optimize your resources for high availability, security, performance, and cost.
IT management and operations in the cloud
As a business moves to a cloud-based model, the importance of proper management and operations can't be
overstated. Unfortunately, few organizations are prepared for the IT management shift that's required for success
in building a cloud-first operating model. This section of the Cloud Adoption Framework outlines the operating
model, processes, and tooling that have proven successful in the cloud. Each of these areas represents a minor but
fundamental change in the way the business should view IT operations and management as it begins to adopt the
cloud.
Cloud management
The historical IT operating model was sufficient for over 20 years. But that model is now outdated and is less
desirable than cloud-first alternatives. When IT management teams move to the cloud, they have an opportunity to
rethink this model and drive greater value for the business. This article series outlines a modernized model of IT
management.
Next steps
For a deeper understanding of the new cloud management model, start with Understand business alignment.
Understand business alignment
Create business alignment in cloud management
In on-premises environments, IT assets (applications, virtual machines, VM hosts, disk, servers, devices, and data
sources) are managed by IT to support workload operations. In IT terms, a workload is a collection of IT assets
that support a specific business operation. To help support business operations, IT management delivers
processes that are designed to minimize disruptions to those assets. When an organization moves to the cloud,
management and operations shift a bit, creating an opportunity to develop tighter business alignment.
Business vernacular
The first step in creating business alignment is to ensure term alignment. IT management, like most engineering
professions, has amassed a collection of jargon, or highly technical terms. Such terms can lead to confusion for
business stakeholders and make it difficult to map management services to business value.
Fortunately, the process of developing a cloud adoption strategy and cloud adoption plan creates an ideal
opportunity to remap these terms. The process also creates opportunities to rethink commitments to operational
management, in partnership with the business. The following article series walks you through this new approach
across three specific terms that can help improve conversations among business stakeholders:
Criticality: Mapping workloads to business processes. Ranking criticality to focus investments.
Impact: Understanding the impact of potential outages to aid in evaluating return on investment for cloud
management.
Commitment: Developing true partnerships, by creating and documenting agreements with the business.
NOTE
Underlying these terms are classic IT terms such as SLA, RTO, and RPO. Mapping specific business and IT terms is covered in
more detail in the Commitment article.
Next steps
Start creating business alignment by defining workload criticality.
Define workload criticality
Business criticality in cloud management
Across every business, there exist a small number of workloads that are too important to fail. These workloads
are considered mission critical. When those workloads experience outages or performance degradation, the
adverse impact on revenue and profitability can be felt across the entire company.
At the other end of the spectrum, some workloads can go months at a time without being used. Poor performance
or outages for those workloads is not desirable, but the impact is isolated and limited.
Understanding the criticality of each workload in the IT portfolio is the first step toward establishing mutual
commitments to cloud management. The following diagram illustrates a common alignment between the
criticality scale to follow and the standard commitments made by the business.
Criticality scale
The first step in any business criticality alignment effort is to create a criticality scale. The following table presents
a sample scale to be used as a reference, or template, for creating your own scale.
Unit-critical: Affects the mission of a specific business unit and its profit-and-loss statements.
It's common for businesses to include additional criticality classifications that are specific to their industry, vertical,
or specific business processes. Examples of additional classifications include:
Compliance-critical: In heavily regulated industries, some workloads might be critical as part of an effort to
maintain compliance requirements.
Security-critical: Some workloads might not be mission critical, but outages could result in loss of data or
unintended access to protected information.
Safety-critical: When lives or the physical safety of employees and customers is at risk during an outage, it
can be wise to classify workloads as safety-critical.
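One way to make such a scale concrete is to record it as a lookup table that workload classifications are validated against. The classifications below are the ones named in this article; the descriptions of the industry-specific entries are paraphrased, and the structure is an assumption.

```python
# Hedged sketch of a criticality scale as a lookup table. The level names
# come from the article; the exact descriptions and this representation
# are illustrative.

criticality_scale = {
    "Mission-critical": "An outage's adverse impact on revenue and "
                        "profitability is felt across the entire company.",
    "Unit-critical": "Affects the mission of a specific business unit and "
                     "its profit-and-loss statements.",
    # Industry-specific additions named in the article:
    "Compliance-critical": "An outage threatens regulatory compliance.",
    "Security-critical": "An outage could expose or lose protected data.",
    "Safety-critical": "An outage puts lives or physical safety at risk.",
}

def classify(workload, classification):
    """Tag a workload with a classification, rejecting unknown levels."""
    if classification not in criticality_scale:
        raise ValueError(f"unknown criticality level: {classification}")
    return {"workload": workload, "criticality": classification}

print(classify("payroll", "Unit-critical"))
```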
Next steps
After your team has defined business criticality, you can calculate and record business impact.
Calculate and record business impact
Business impact in cloud management
Assume the best, prepare for the worst. In IT management, it's safe to assume that the workloads required to
support business operations will be available and will perform within agreed-upon constraints, based on the
selected criticality. However, to manage investments wisely, it's important to understand the impact on the
business when an outage or performance degradation occurs. This importance is illustrated in the following
graph, which maps potential business interruptions of specific workloads to the business impact of outages across
a relative value scale.
To create a fair basis of comparison for the impact on various workloads across a portfolio, a time/value metric is
suggested. The time/value metric captures the adverse impact of a workload outage. Generally, this impact is
recorded as a direct loss of revenue or operating revenue during a typical outage period. More specifically, it
calculates the amount of lost revenue for a unit of time. The most common time/value metric is Impact per hour,
which measures operating revenue losses per hour of outage.
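The Impact per hour calculation reduces to normalizing lost revenue by outage duration. The figures in this sketch are made-up examples.

```python
# Hedged sketch of the "Impact per hour" time/value metric: operating
# revenue lost, normalized to one hour of outage.

def impact_per_hour(lost_revenue, outage_hours):
    """Return revenue lost per hour for a given outage."""
    if outage_hours <= 0:
        raise ValueError("outage duration must be positive")
    return lost_revenue / outage_hours

# A 30-minute outage that cost $25,000 in lost sales:
print(impact_per_hour(25_000, 0.5))  # 50000.0
```

Normalizing to a common unit of time is what lets you compare a high-paced transactional system against an infrequently used one on the same scale.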
A few approaches can be used to calculate impact. You can apply any of the options in the following sections to
achieve similar outcomes. It's important to use the same approach for each workload when you calculate projected losses across a portfolio.
Calculate time
Depending on the nature of the workload, you could calculate losses differently. For high-paced transactional
systems such as a real-time trading platform, losses per millisecond might be significant. Less frequently used
systems, such as payroll, might not be used every hour. Whether the frequency of usage is high or low, it's
important to normalize the time variable when you calculate financial impact.
Next steps
After the business has defined impact, you can align commitments.
Align management commitments with the business
Business commitment in cloud management
Defining business commitment is an exercise in balancing priorities. The objective is to align the proper level of
operational management at an acceptable operating cost. Finding that balance requires a few data points and
calculations, which we've outlined in this article.
Commitments to business stability, via technical resiliency or other service-level agreement (SLA) impacts, are a
business justification decision. For most workloads in an environment, a baseline level of cloud management is
sufficient. For others, a 2x to 4x cost increase is easily justified because of the potential impact of any business
interruptions.
The previous articles in this series can help you understand the classification and impact of interruptions to
various workloads. This article helps you calculate the returns. As illustrated in the preceding image, each level of
cloud management has inflection points where cost can rise faster than increases in resiliency. Those inflection
points will prompt detailed business decisions and business commitments.
IT operations prerequisites
The Azure Management Guide outlines the management tools that are available in Azure. Before reaching a
commitment with the business, IT should determine an acceptable standard-level management baseline to be
applied to all managed workloads. IT would then calculate a standard management cost for each of the managed
workloads in the IT portfolio, based on counts of CPU cores, disk space, and other asset-related variables. IT
would also estimate a composite SLA for each workload, based on the architecture.
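A composite SLA is commonly estimated by multiplying the availabilities of serially dependent components, since every component must be up for the workload to be up. This sketch assumes that standard calculation; the component figures are invented examples, not Azure SLA values.

```python
# Hedged sketch of a composite SLA for serially dependent components:
# multiply the individual availabilities together.

def composite_sla(component_slas):
    """Return the combined availability of components that all must be up."""
    result = 1.0
    for sla in component_slas:
        result *= sla
    return result

# For example, an app tier at 99.95% in front of a database at 99.99%:
print(round(composite_sla([0.9995, 0.9999]), 6))  # 0.9994
```

Note that the composite figure is always lower than the weakest component's SLA, which is why adding dependencies tends to push a workload below an initial 99.9 percent baseline.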
TIP
IT operations teams often use a default minimum of 99.9 percent uptime for the initial composite SLA. They might also
choose to normalize management costs based on the average workload, especially for solutions with minimal logging and
storage needs. Averaging the costs of a few medium criticality workloads can provide a starting point for initial
conversations.
TIP
If you're using the Ops Management workbook to plan for cloud management, the Ops management fields should be
updated to reflect these prerequisites. Those fields include Commitment level, Composite SLA, and Monthly cost. Monthly
cost should represent the cost of the added operational management tools on a monthly basis.
The operations management baseline serves as an initial starting point to be validated in each of the following
sections.
Management responsibility
In a traditional on-premises environment, the cost of managing the environment is commonly assumed to be a
sunk cost that's owned by IT operations. In the cloud, management is a purposeful decision with direct budgetary
impact. The costs of each management function can be more directly attributed to each workload that's deployed
to the cloud. This approach allows for greater control, but it does create a requirement for cloud operations teams
and cloud strategy teams to first commit to an agreement about responsibilities.
Organizations might also choose to outsource some of their ongoing management functions to a service provider.
These service providers can use Azure Lighthouse to give organizations more precise control in granting access to
their resources, along with greater visibility into the actions performed by the service providers.
Delegated responsibility: Because there's no need to centralize and assume operational management
overhead, many IT operations teams are considering new approaches. One common approach
is referred to as delegated responsibility. In a cloud center of excellence model, platform operations and
platform automation provide self-service management tools that can be used by business-led operations
teams, independent of a central IT operations team. This approach gives business stakeholders complete
control over management-related budgets. It also allows the cloud center of excellence (CCoE) team to
ensure that a minimum set of guardrails has been properly implemented. In this model, IT acts as a broker
and a guide to help the business make wise decisions. Business operations oversee the day-to-day operations
of dependent workloads.
Centralized responsibility: Compliance requirements, technical complexity, and some shared service
models might require a central IT model. In this model, IT continues to exercise its operations management
responsibilities. Environmental design, management controls, and governance tooling might be centrally
managed and controlled, which restricts the role of business stakeholders in making management
commitments. But the visibility into the cost and architecture of cloud approaches makes it much easier for
centralized IT to communicate the cost and level of management for each workload.
Mixed model: Classification is at the heart of a mixed model of management responsibilities. Companies
that are in the midst of a transformation from on-premises to cloud might require an on-premises-first
operating model for a while. Companies with strict compliance requirements, or that depend on long-term
contracts with IT outsourcing vendors, might require a centralized operating model.
Regardless of their constraints, today's businesses must innovate. When rapid innovation must flourish in
the midst of a central-IT, centralized-responsibility model, a mixed-model approach might provide balance.
In this approach, central IT provides a centralized operating model for all workloads that are
mission-critical or contain sensitive information. At the same time, all other workload classifications might be placed
in a cloud environment that's designed for delegated responsibilities. The centralized responsibility
approach serves as the general operating model. The business then has flexibility to adopt a specialized
operating model, based on its required level of support and sensitivity.
The first step is committing to a responsibility approach, which then shapes the following commitments.
Which organization will be responsible for day-to-day operations management for this workload?
Cloud tenancy
For most businesses, management is easier when all assets reside in a single tenant. However, some organizations
might need to maintain multiple tenants. To learn why a business might require a multitenant Azure environment,
see Centralize management operations with Azure Lighthouse.
Will this workload reside in a single Azure tenant, alongside all other workloads?
Soft-cost factors
The next section outlines an approach to comparative returns that are associated with levels of management
processes and tooling. At the end of that section, each analyzed workload measures the cost of management
relative to the forecast impact of business disruptions. That approach provides a relatively easy way to understand
whether an investment in richer management approaches is warranted.
Before you run the numbers, it's important to look at the soft-cost factors. Soft-cost factors produce a return, but
that return is difficult to measure through direct hard-cost savings that would be visible in a profit-and-loss
statement. Soft-cost factors are important because they can indicate a need to invest in a higher level of
management than is fiscally prudent.
A few examples of soft-cost factors would include:
Daily workload usage by the board or CEO.
Workload usage by a top x percent of customers that leads to a greater revenue impact elsewhere.
Impact on employee satisfaction.
The next data point that's required to make a commitment is a list of soft-cost factors. These factors don't need to
be documented at this stage, but business stakeholders should be aware of the importance of these factors and
their exclusion from the following calculations.
TIP
If you're using the Ops Management workbook to plan for cloud management, update the Ops management fields to
reflect each conversation. Those fields include Commitment level, Composite SLA, and Monthly cost. Monthly cost
should represent the monthly cost of the added operational management tools. After they're updated, the fields will update
the ROI formulas and each of the following fields.
The workbook uses the default value of 8,760 hours per year.
Standard loss impact
Standard loss impact (labeled Standard impact in the workbook) forecasts the financial impact of any outage,
assuming that the Estimated outage prediction proves accurate. To calculate this forecast without using the
workbook, apply the following formula:
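A sketch of this calculation, assuming that Standard loss impact is defined as the time-value impact (the cost per hour of outage) multiplied by the estimated outage hours per year; the dollar figures below are illustrative only:

```python
def standard_loss_impact(time_value_impact, estimated_outage_hours):
    """Forecast annual losses: hourly outage cost times predicted outage hours.

    Assumes the workbook's definitions: time_value_impact in dollars per hour,
    estimated_outage_hours as the predicted hours of outage per year.
    """
    return time_value_impact * estimated_outage_hours

# Example: $5,000/hour time-value impact and 8.76 outage hours per year
# (0.1% of the workbook's 8,760-hour year, i.e. a 99.9% uptime baseline).
print(round(standard_loss_impact(5000, 8.76)))  # → 43800
```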
This serves as a baseline for cost, should the business stakeholders choose to invest in a higher level of
management.
Composite SLA impact
Composite SLA impact (labeled Commitment level impact in the workbook) provides updated fiscal impact, based
on the changes to the uptime SLA. This calculation allows you to compare the projected financial impact of both
options. To calculate this forecast impact without the spreadsheet, apply the following formula:
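A sketch of this calculation, assuming that Composite SLA impact is the downtime permitted by the new composite SLA (hours per year times one minus the SLA) multiplied by the time-value impact, using the workbook's 8,760-hour year:

```python
def composite_sla_impact(time_value_impact, composite_sla, hours_per_year=8760):
    """Forecast annual losses at a new commitment level: the downtime hours
    permitted by the composite SLA times the hourly time-value impact."""
    downtime_hours = hours_per_year * (1 - composite_sla)
    return downtime_hours * time_value_impact

# Example: $5,000/hour impact at a 99.99% composite SLA.
print(round(composite_sla_impact(5000, 0.9999)))  # → 4380
```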
The value represents the potential losses to be avoided by the changed commitment level and new composite
SLA.
Comparison basis
Comparison basis evaluates standard impact and composite SLA impact to determine which is most appropriate
in the return column.
Return on loss avoidance
If the cost of managing a workload exceeds the potential losses, the proposed investment in cloud management
might not be fruitful. To compare the return on loss avoidance, see the column labeled Annual ROI. To
calculate this column on your own, use the following formula:
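One plausible reading of this calculation (the exact workbook formula isn't reproduced here) compares the losses avoided by the new commitment level against the annualized cost of the added management tooling:

```python
def annual_roi(standard_impact, commitment_level_impact, monthly_mgmt_cost):
    """Return on loss avoidance: net losses avoided relative to the
    annualized cost of the added operational management tooling.

    A negative result suggests the management cost exceeds the losses avoided.
    """
    annual_cost = monthly_mgmt_cost * 12
    losses_avoided = standard_impact - commitment_level_impact
    return (losses_avoided - annual_cost) / annual_cost

# Example: $50,000 baseline losses, $5,000 at the new SLA, $1,000/month tooling.
print(annual_roi(50000, 5000, 1000))  # → 2.75
```

In this illustrative case the added tooling returns 2.75 times its cost in avoided losses, which would support a deeper investment; a result below zero would argue against it unless soft-cost factors apply.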
Unless there are other soft-cost factors to consider, this comparison can quickly suggest whether there should be a
deeper investment in cloud operations, resiliency, reliability, or other areas.
Next steps
After the commitments are made, the responsible operations teams can begin configuring the workload in
question. To get started, evaluate various approaches to inventory and visibility.
Inventory and visibility options
Management leveling across cloud management
disciplines
The keys to proper management in any environment are consistency and repeatable processes. There are endless
options for what can be done in Azure, and countless approaches to cloud management.
To provide consistency and repeatability, it's important to narrow those options to a consistent set of management
processes and tools that will be offered for workloads hosted in the cloud.
As a starting point, consider establishing the management levels that are shown in the preceding diagram and
suggested in the following list:
Management baseline: A cloud management baseline (or management baseline) is a defined set of tools,
processes, and consistent pricing that serve as the foundation for all cloud management in Azure. To establish a
cloud management baseline and determine which tools to include in the baseline offering to your business,
review the list in the "Cloud management disciplines" section.
Enhanced baseline: A number of workloads might require enhancements to the baseline that aren't
necessarily specific to a single platform or workload. Although these enhancements aren't cost effective for
every workload, there should be common processes, tools, and solutions for any workload that can justify the
cost of the extra management support.
Platform specialization: In any given environment, some common platforms are used by a variety of
workloads. This general architectural commonality doesn't change when businesses adopt the cloud. Platform
specialization is an elevated level of management that applies data and architectural subject matter expertise to
provide a higher level of operational management. Examples of platform specialization would include
management functions specific to SQL Server, Containers, Active Directory, or other services that can be better
managed through consistent, repeatable processes, tools, and architectures.
Workload specialization: For workloads that are truly mission critical, there might be a cost justification to go
much deeper into the management of that workload. Workload specialization applies workload telemetry to
determine more advanced approaches to daily management. That same data often identifies automation,
deployment, and design improvements that would lead to greater stability, reliability, and resiliency beyond
what's possible with operational management alone.
Unsupported: It's equally important to communicate common management processes that won't be delivered
through cloud management disciplines for workloads that are classified as not supported or not critical.
Organizations might also choose to outsource functions related to one or more of these management levels to a
service provider. These service providers can use Azure Lighthouse to provide greater precision and transparency.
The remaining articles in this series outline a number of processes that are commonly found within each of these
disciplines. In parallel, the Azure Management Guide demonstrates the tools that can support each of those
processes. For assistance with building your management baseline, start with the Azure Management Guide. After
you've established the baseline, this article series and the accompanying best practices can help expand that
baseline to define other levels of management support.
Next steps
The next step toward defining each level of cloud management is an understanding of inventory and visibility.
Inventory and visibility options
Inventory and visibility in cloud management
Operational management has a clear dependency on data. Consistent management requires an understanding
about what is managed (inventory) and how those managed workloads and assets change over time (visibility).
Clear insights about inventory and visibility help empower the team to manage the environment effectively. All
other operational management activities and processes build on these two areas.
A few classic phrases about the importance of measurements set the tone for this article:
Manage what matters.
You can only manage what you can measure.
If you can't measure it, it might not matter.
The inventory and visibility discipline builds on these timeless phrases. Before you can effectively establish
operational management processes, it's important to gather data and create the right level of visibility for the
right teams.
Processes
Perhaps more important than the features of the cloud management platform are the cloud management processes
that realize operations commitments with the business. Any cloud management methodology should include, at a
minimum, the following processes:
Reactive monitoring: When deviations adversely affect business operations, who addresses those
deviations? What actions do they take to remediate the deviations?
Proactive monitoring: When deviations are detected but business operations are not affected, how are those
deviations addressed, and by whom?
Commitment reporting: How is adherence to the business commitment communicated to business
stakeholders?
Budgetary reviews: What is the process for reviewing those commitments against budgeted costs? What is
the process for adjusting the deployed solution or the commitments to create alignment?
Escalation paths: What escalation paths are available when any of the preceding processes fail to meet the
needs of the business?
There are several more processes related to inventory and visibility. The preceding list is designed to provoke
thought within the operations team. Answering these questions will help develop some of the necessary
processes, as well as likely trigger new, deeper questions.
Responsibilities
When you're developing processes for operational monitoring, it's equally important to determine responsibilities
for daily operation and regular support of each process.
In a central IT organization, IT would provide the operational expertise. The business would be consultative in
nature when issues require remediation.
In a cloud center of excellence organization, business operations would provide the expertise and hold
responsibility for management of these processes. IT would focus on the automation and support of teams, as
they operate the environment.
But these are the common responsibilities. Organizations often require a mixture of responsibilities to meet
business commitments.
Next steps
Operational compliance builds on inventory capabilities by applying management automation and controls. See
how operational compliance maps to your processes.
Plan for operational compliance
Operational compliance in cloud management
Operational compliance builds on the discipline of inventory and visibility. As the first actionable step of cloud
management, this discipline focuses on regular telemetry reviews and remediation efforts (both proactive and
reactive remediation). This discipline is the cornerstone for maintaining balance between security, governance,
performance, and cost.
Next steps
Protection and recovery are the next areas to consider in a cloud management baseline.
Protect and recover
Protect and recover in cloud management
After they've met the requirements for inventory and visibility and operational compliance, cloud management
teams can anticipate and prepare for a potential workload outage. As they're planning for cloud management, the
teams must start with an assumption that something will fail.
No technical solution can consistently offer a 100 percent uptime SLA. Solutions with the most redundant
architectures claim to deliver on "six 9s" or 99.9999 percent uptime. But even a "six 9s" solution goes down for
31.6 seconds in any given year. Sadly, it's rare for a solution to warrant the large, ongoing operational investment
required to reach "six 9s" of uptime.
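These downtime figures follow directly from an uptime SLA. A small sketch using the workbook's 8,760-hour year (the 31.6-second figure above reflects a slightly longer 365.25-day year):

```python
def annual_downtime_seconds(uptime_sla, hours_per_year=8760):
    """Seconds of permitted downtime per year at a given uptime SLA."""
    return hours_per_year * 3600 * (1 - uptime_sla)

# "three 9s" allows about 8.76 hours/year; "six 9s" only about 31.5 seconds.
for label, sla in [("three 9s", 0.999), ("four 9s", 0.9999), ("six 9s", 0.999999)]:
    print(f"{label} ({sla:.4%}): {annual_downtime_seconds(sla):,.1f} seconds/year")
```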
Preparation for an outage allows the team to detect failures sooner and recover more quickly. The focus of this
discipline is on the steps that come immediately after a system fails. How do you protect workloads, so that they
can be recovered quickly when an outage occurs?
Next steps
After this management baseline component is met, the team can look ahead to avoid outages in platform
operations and workload operations.
Platform operations
Workload operations
Platform operations in cloud management
A cloud management baseline that spans inventory and visibility, operational compliance, and protection and
recovery might provide a sufficient level of cloud management for most workloads in the IT portfolio. However,
that baseline is seldom enough to support the full portfolio. This article builds on the most common next step in
cloud management, platform operations.
A quick study of the assets in the IT portfolio highlights patterns across the workloads that are being supported.
Within those workloads, there will be a number of common platforms. Depending on the past technical decisions
within the company, those platforms could vary widely.
For some organizations, there will be a heavy dependence on SQL Server, Oracle, or other open-source data
platforms. In other organizations, the commonalities might be rooted in the hosting platforms for virtual
machines (VMs) or containers. Still others might have a common dependency on applications or Enterprise
Resource Planning (ERP) systems, such as SAP, Oracle, or others.
By understanding these commonalities, the cloud management team can specialize in higher levels of support for
those prioritized platforms.
NOTE
Building a service catalog requires a great deal of effort and time from multiple teams. Using the service catalog or
approved list as a gating mechanism will slow innovation. When innovation is a priority, service catalogs should be
developed in parallel with other adoption efforts.
Next steps
In parallel with improvements to platform operations, cloud management teams also focus on improving
workload operations for the top 20 percent or less of production workloads.
Improve workload operations
Workload operations in cloud management
Some workloads are critical to the success of the business. For those workloads, a management baseline is
insufficient to meet the required business commitments to cloud management. Platform operations might not
even be sufficient to meet business commitments. This highly important subset of workloads requires a
specialized focus on the way the workload functions and how it is supported.
In return, the investment in workload operations can lead to improved performance, decreased risk of business
interruption, and faster recovery when system failures occur. This article discusses an approach to investing in the
continued operations of these high priority workloads to drive improved business commitments.
Continued observation
Initial data and ongoing telemetry can help formulate and test theories about the performance of a workload. But
ongoing workload operations are rooted in a continued and expanded observation of workload performance,
with a heavy focus on application and data performance.
Test the automation
At the application level, the first requirement of workload operations is an investment in deep testing. For any
application that's supported through workload operations, a test plan should be established and regularly
executed to deliver functional and scale testing across the applications.
Regular test telemetry can provide immediate validation of various hypotheses about the operation of the
workload. Improving operational and architectural patterns can be executed and tested. The resulting deltas
provide a clear impact analysis to guide continued investments.
Understand releases
A clear understanding of release cycles and release pipelines is an important element of workload operations.
An understanding of cycles can prepare for potential interruptions and allow the team to proactively address any
releases that might produce an adverse effect on operations. This understanding also allows the cloud
management team to partner with adoption teams to continuously improve the quality of the product and
address any bugs that might affect stability.
More importantly, an understanding of release pipelines can significantly improve the recovery time objective
(RTO) of a workload. In many scenarios, the fastest and most accurate path to the recovery of an application is a
release pipeline. For application layers that change only when a new release happens, it might be wise to invest
more heavily in pipeline optimization than in the recovery of the application from traditional backup processes.
Although a deployment pipeline can be the fastest path to recovery, it can also be the fastest path to remediation.
When an application has a fast, efficient, and reliable release pipeline, the cloud management team has an option
to automate deployment to a new host as a form of automated remediation.
There might be many other faster, more effective mechanisms for remediation and recovery. However, when the
use of an existing pipeline can meet business commitments and capitalize on existing DevOps investments, the
existing pipeline might be a viable alternative.
Clearly communicate changes to the workload
Change to any workload is among the biggest risks to workload operations. For any workload in the workload
operations level of cloud management, the cloud management team should closely align with the cloud adoption
teams to understand the changes coming from each release. This investment in proactive understanding will have
a direct, positive impact on operational stability.
Improve outcomes
The data and communication investments in a workload will yield suggestions for improvements to ongoing
operations in one of three areas:
Technical debt resolution
Automated remediation
Improved system design
Technical debt resolution
The best workload operations plans still require remediation. As your cloud management team seeks to stay
connected to understand adoption efforts and releases, the team likewise should regularly share remediation
requirements to ensure that technical debt and bugs are a continued priority for your development teams.
Automated remediation
By applying the Pareto Principle, we can say that 80 percent of negative business impact likely comes from 20
percent of the service incidents. When those incidents can't be addressed in normal development cycles,
investments in remediation automation can significantly reduce business interruptions.
Improved system design
As the discussions of technical debt resolution and automated remediation suggest, system flaws are the common
cause of most system outages. You can have the greatest impact on overall workload operations by adhering to a
few design principles:
Scalability: The ability of a system to handle increased load.
Availability: The percentage of time that a system is functional and working.
Resiliency: The ability of a system to recover from failures and continue to function.
Management: Operations processes that keep a system running in production.
Security: Protecting applications and data from threats.
To help improve overall operations, the Azure Architecture Framework provides an approach to evaluating
specific workloads for adherence to these pillars. The pillars can be applied to both platform
operations and workload operations.
Next steps
With a full understanding of the Manage methodology within the Cloud Adoption Framework, you are now
equipped to implement cloud management principles. For guidance on making this methodology actionable within
your operations environment, see Cloud management in the Cloud Adoption Framework of the adoption
lifecycle.
Apply this methodology
Apply design principles and advanced operations
The first three cloud management disciplines describe a management baseline. At a minimum, a management
baseline should include a standard business commitment to minimize business interruptions and accelerate
recovery if service is interrupted. Most management baselines include a disciplined focus on maintaining
"inventory and visibility," "operational compliance," and "protection and recovery."
The purpose of a management baseline is to create a consistent offering that provides a minimum level of business
commitment for all supported workloads. This baseline of common, repeatable management offerings allows the
team to deliver a highly optimized degree of operational management, with minimal deviation. But that standard
offering might not provide a rich enough commitment to the business.
The diagram in the next section illustrates three ways to go beyond the management baseline.
The management baseline should meet the minimum commitment required by 80 percent of the lowest criticality
workloads in the portfolio. The baseline should not be applied to mission-critical workloads. Nor should it be
applied to common platforms that are shared across workloads. Those workloads require a focus on design
principles and advanced operations.
Management specialization
Aspects of workload and platform operations might require changes to design and architecture principles. Those
changes could take time and might result in increased operating expenses. To reduce the number of workloads
requiring such investments, an enhanced management baseline could provide enough of an improvement to the
business commitment.
For workloads that warrant a higher investment to meet a business commitment, specialization of operations is
key.
Structure type
Define the type of organizational structure that best fits your operating model.
Cloud capabilities
Understand the cloud capabilities required to adopt and operate the cloud.
Establish teams
Define the teams that will be providing various cloud capabilities. A number of best practice options are listed for reference.
RACI matrix
Clearly defined roles are an important aspect of any operating model. Use the provided RACI matrix to map
responsible, accountable, consulted, and informed roles to each of the teams for various functions of the cloud operating
model.
Structure type
The following organizational structures do not necessarily have to map to an organizational chart (org chart). Org charts
generally reflect command and control management structures. Conversely, the following organizational structures are
designed to capture alignment of roles and responsibilities. In an agile, matrix organization, these structures may be best
represented as virtual teams (v-teams). These structures can certainly be represented in an org chart, but doing so is
not necessary to produce an effective operating model.
The first step of managing organizational alignment is to determine how the following organizational structures will be
fulfilled:
Org chart alignment: Management hierarchies, manager responsibilities, and staff alignment will align to organizational
structures.
Virtual teams (v-teams): Management structures and org charts remain unchanged. Instead, virtual teams will be created
and tasked with the required capabilities.
Mixed model: More commonly, a mixture of org chart and v-team alignment will be required to deliver on transformation
goals.
The article on determining organizational structure maturity provides additional detail regarding each level of maturity.
Next steps
To track organization structure decisions over time, download and modify the RACI spreadsheet template.
Download the RACI spreadsheet template
Cloud strategy capabilities
Successful cloud adoption should align to defined motivations and business outcomes. When those outcomes
impact business functions, it's a good idea to establish a team made up of business leaders from across the
organization. To unite various leaders, we recommend that the executive leadership create a cloud strategy
team. The goal of the cloud strategy team is to produce tangible business results that are enabled by cloud
technologies. This team ensures that cloud adoption efforts progress in alignment with business outcomes.
In the absence of a defined cloud strategy team, someone must still provide the capability to align technical
activities to business outcomes. That same person or group should also manage change across the project. This
section defines this capability in more detail.
NOTE
The organization's CEO and CIO often assign the team. Assignments are typically based on empowering this team to drive
change that cuts across various organizations within the enterprise. The cloud strategy team members should be
assigned based on the motivations for cloud adoption, business outcomes, and relevant financial models.
Key responsibilities
The primary focus of the cloud strategy team is to validate and maintain alignment between business priorities and
cloud adoption efforts. Secondarily, the team should focus on change management across the adoption efforts. The
following tasks assist in achieving this capability.
Early planning tasks
Review and provide feedback on business outcomes and financial models.
Aid in establishing clear motivations for cloud adoption that align with corporate objectives.
Define relevant learning metrics that clearly communicate progress toward business outcomes.
Understand business risks introduced by the plan, and represent the business's tolerance for risk.
Review and approve the rationalization of the digital estate.
Ongoing monthly tasks
Support the cloud governance capability during risk/tolerance conversations.
Review release plans to understand timelines and business impact of technical change.
Define business change plans associated with planned releases.
Ensure business teams are ready to execute business testing and the business change plan.
Meeting cadence
The tasks listed in the preceding section can be time-consuming in the early planning phases. Here are some
recommendations for the allocation of time for cloud strategy team members:
During early planning efforts, allocate an hour each week to meet with the team. After the adoption plan is
solidified (usually within 4–6 weeks), the time requirements can be reduced.
Throughout the adoption efforts, allocate 1–2 hours each month to review progress and validate continued
priorities.
Additional time is likely required from delegated members of the executive's team on an as-needed basis.
Each member of the cloud strategy team should appoint a delegate who can allocate 5–10 hours per week to
support ongoing prioritization questions and report on any urgent needs.
Next steps
Strategy and planning are important. However, nothing is actionable without cloud adoption capabilities.
Understand the role of this important capability before beginning adoption efforts.
Align cloud adoption capabilities
Cloud adoption capabilities
Cloud adoption capabilities allow for the implementation of technical solutions in the cloud. Like any IT project,
the people delivering the actual work will determine success. The teams providing the necessary cloud adoption
capabilities can be staffed from multiple subject matter experts or implementation partners.
Key responsibilities
The primary need from any cloud adoption capability is the timely, high-quality implementation of the technical
solutions outlined in the adoption plan, in alignment with governance requirements and business outcomes,
taking advantage of technology, tools, and automation solutions made available to the team.
Early planning tasks:
Execute the rationalization of the digital estate
Review, validate, and advance the prioritized migration backlog
Begin execution of the first workload as a learning opportunity
Ongoing monthly tasks:
Oversee change management processes
Manage the release and sprint backlogs
Build and maintain the adoption landing zone in conjunction with governance requirements
Execute the technical tasks outlined in the sprint backlogs
Team cadence
We recommend that teams providing cloud adoption capability be dedicated to the effort full-time.
It's best if these teams meet daily in a self-organizing way. The goal of daily meetings is to quickly update the
backlog, and to communicate what has been completed, what is to be done today, and what things are blocked,
requiring additional external support.
Release schedules and iteration durations are unique to each company. However, a range of one to four weeks
per iteration seems to be the average duration. Regardless of iteration or release cadence, we recommend that
the team meet with all supporting teams at the end of each release to communicate the outcome of the release and
to reprioritize upcoming efforts. Likewise, it's valuable to meet with the cloud center of excellence or cloud
governance team at the end of each sprint to stay aligned on common efforts and any needs for support.
Some of the technical tasks associated with cloud adoption can become repetitive. Team members should rotate
every 3–6 months to avoid employee satisfaction issues and maintain relevant skills. A rotating seat on cloud
center of excellence or cloud governance team can provide an excellent opportunity to keep employees fresh and
harness new innovations.
Next steps
Adoption is great, but ungoverned adoption can produce unexpected results. Aligning cloud governance
capabilities accelerates adoption and best practices, while reducing business and technical risks.
Align cloud governance capabilities
Cloud governance capabilities
Any kind of change generates new risks. Cloud governance capabilities ensure that risks and risk tolerance are
properly evaluated and managed. This capability ensures the proper identification of risks that can't be tolerated
by the business. The people providing this capability can then convert risks into governing corporate policies.
Governing policies are then enforced through defined disciplines executed by the staff members who provide
cloud governance capabilities.
Key responsibilities
The primary duty of any cloud governance capability is to balance competing forces of transformation and risk
mitigation. Additionally, cloud governance ensures that cloud adoption is aware of data and asset classification
and architecture guidelines that govern all adoption approaches. The team will also work with the cloud center
of excellence to apply automated approaches to governing cloud environments.
These tasks are usually executed by the cloud governance capability on a monthly basis.
Early planning tasks:
Understand business risks introduced by the plan
Represent the business' tolerance for risk
Aid in the creation of a Governance MVP
Ongoing monthly tasks:
Understand business risks introduced during each release
Represent the business' tolerance for risk
Aid in the incremental improvement of Policy and Compliance requirements
Meeting cadence
Cloud governance capability is usually delivered by a working team. The time commitment from each team
member will represent a large percentage of their daily schedules. Contributions will not be limited to meetings
and feedback cycles.
Additional participants
The following participants frequently contribute to cloud governance activities:
Leaders from middle management and direct contributors in key roles who have been appointed to
represent the business will help evaluate risk tolerances.
The cloud governance capabilities are delivered by an extension of the cloud strategy capability. Just as the
CIO and business leaders are expected to participate in cloud strategy capabilities, their direct reports are
expected to participate in cloud governance activities.
Business unit employees who work closely with line-of-business leadership should be empowered to make
decisions regarding corporate and technical risk.
Information technology (IT) and information security (IS) employees who understand the technical aspects
of the cloud transformation may serve in a rotating capacity instead of being consistent providers of cloud
governance capabilities.
Next steps
As cloud governance matures, teams will be empowered to adopt the cloud at ever faster paces. Continued
cloud adoption efforts tend to trigger maturity in IT operations. This maturation may also necessitate the
development of cloud operations capabilities.
Develop cloud operations capabilities
Central IT capabilities
As cloud adoption scales, cloud governance capabilities alone may not be sufficient to govern adoption efforts.
When adoption is gradual, teams tend to organically develop the skills and processes needed to be ready for the
cloud over time.
However, when one cloud adoption team uses the cloud to achieve a high-profile business outcome, gradual
adoption is seldom the case. Success follows success. This is also true for cloud adoption, but it happens at cloud
scale. When cloud adoption expands from one team to multiple teams relatively quickly, additional support from
existing IT staff is needed. However, those staff members may lack the training and experience required to support
the cloud using cloud-native IT tools. This often drives the formation of a central IT team governing the cloud.
Caution
While this is a common maturity step, it can present a high risk to adoption, potentially blocking innovation and
migration efforts if not managed effectively. See the risk section below to learn how to mitigate the risk of
centralization becoming a cultural antipattern.
WARNING
Central IT should only be applied in the cloud when existing delivery on-premises is based on a Central IT model. If the
current on-premises model is based on delegated control, consider a cloud center of excellence (CCoE) approach for a more
compatible alternative.
Key responsibilities
Adapt existing IT practices to ensure adoption efforts result in well-governed, well-managed environments in the
cloud.
The following tasks are typically executed regularly:
Strategic tasks
Review:
business outcomes
financial models
motivations for cloud adoption
business risks
rationalization of the digital estate
Monitor adoption plans and progress against the prioritized migration backlog.
Identify and prioritize platform changes that are required to support the migration backlog.
Act as an intermediary or translation layer between cloud adoption needs and existing IT teams.
Leverage existing IT teams to accelerate platform capabilities and enable adoption.
Technical tasks
Build and maintain the cloud platform to support solutions.
Define and implement the platform architecture.
Operate and manage the cloud platform.
Continuously improve the platform.
Keep up with new innovations in the cloud platform.
Deliver new cloud capabilities to support business value creation.
Suggest self-service solutions.
Ensure that solutions meet existing governance and compliance requirements.
Create and validate deployment of platform architecture.
Review release plans for sources of new platform requirements.
Meeting cadence
Central IT expertise usually comes from a working team. Expect participants to commit much of their daily
schedules to alignment efforts. Contributions aren't limited to meetings and feedback cycles.
Central IT risks
Each of the cloud capabilities and phases of organizational maturity are prefixed with the word "cloud". Central IT
is the only exception. Central IT became prevalent when all IT assets could be housed in a few locations, managed
by a small number of teams, and controlled through a single operations management platform. Global business
practices and the digital economy have largely reduced the instances of those centrally managed environments.
In the modern view of IT, assets are globally distributed. Responsibilities are delegated. Operations management is
delivered by a mixture of internal staff, managed service providers, and cloud providers. In the digital economy, IT
management practices are transitioning to a model of self-service and delegated control with clear guardrails to
enforce governance. Central IT can be a valuable contributor to cloud adoption by becoming a cloud broker and a
partner for innovation and business agility.
Central IT as a function is well positioned to take valuable knowledge and practices from existing on-premises
models and apply those practices to cloud delivery. However, this process will require change. New processes, new
skills, and new tools are required to support cloud adoption at scale. When Central IT adapts, it becomes an
important partner in cloud adoption efforts. However, if Central IT doesn't adapt to the cloud, or attempts to use
the cloud as a catalyst for tighter-grained controls, Central IT quickly becomes a blocker to adoption, innovation, and
migration.
The measures of this risk are speed and flexibility. The cloud simplifies adopting new technologies quickly. When
new cloud capabilities can be deployed within minutes, but Central IT reviews add weeks or months to the
deployment process, then these centralized processes become a major impediment to business success. When this
indicator is encountered, consider alternative strategies to IT delivery.
Exceptions
Many industries require rigid adherence to third-party compliance. Some compliance requirements still demand
centralized IT control. Delivering on these compliance measures can add time to deployment processes, especially
for new technologies that haven't been used broadly. In these scenarios, expect delays in deployment during the
early stages of adoption. Similar situations may exist for companies that deal with sensitive customer data but
aren't governed by a third-party compliance requirement.
Operate within the exceptions
When centralized IT processes are required and those processes create appropriate checkpoints in adoption of
new technologies, these innovation checkpoints can still be addressed quickly. Governance and compliance
requirements are designed to protect those things that are sensitive, not to protect everything. The cloud provides
simple mechanisms for acquiring and deploying isolated resources while maintaining proper guardrails.
A mature Central IT team maintains necessary protections but negotiates practices that still enable innovation.
Demonstrating this level of maturity depends on proper classification and isolation of resources.
Example narrative of operating within exceptions to empower adoption
This example narrative illustrates the approach taken by a mature Central IT team to empower adoption.
Contoso, LLC has adopted a Central IT model for the support of the business's cloud resources. To deliver this
model, they have implemented tight controls for various shared services such as ingress network connections. This
wise move reduced the exposure of their cloud environment and provided a single "break-glass" device to block all
traffic in case of a breach. Their security baseline policies state that all ingress traffic must come through a shared
device managed by the Central IT team.
However, one of their cloud adoption teams now requires an environment with a dedicated and specially
configured ingress network connection to use a specific cloud technology. An immature Central IT team would
simply refuse the request and prioritize its existing processes over adoption needs. Contoso's Central IT team is
different. They quickly identified a simple four-part solution to this dilemma: Classification, Negotiation, Isolation,
and Automation.
Classification: Since the cloud adoption team was in the early stages of building a new solution and didn't have
any sensitive data or mission-critical support needs, the assets in the environment were classified as low risk and
noncritical. Effective classification is a sign of maturity in Central IT. Classifying all assets and environments allows
for clearer policies.
Negotiation: Classification alone isn't sufficient. Shared services were implemented to consistently operate
sensitive and mission-critical assets. Changing the rules would compromise governance and compliance policies
designed for the assets that need more protection. Empowering adoption can't happen at the cost of stability,
security, or governance. This led to a negotiation with the adoption team to answer specific questions. Would a
business-led DevOps team be able to provide operations management for this environment? Would this solution
require direct access to other internal resources? If the cloud adoption team is comfortable with those tradeoffs,
then the ingress traffic might be possible.
Isolation: Since the business can provide its own ongoing operations management, and since the solution doesn't
rely on direct traffic to other internal assets, it can be cordoned off in a new subscription. That subscription is also
added to a separate node of the new management group hierarchy.
Automation: Another sign of maturity in this team is their automation principles. The team uses Azure Policy to
automate policy enforcement. They also use Azure Blueprints to automate deployment of common platform
components and enforce adherence to the defined identity baseline. For this subscription and any others in the
new management group, the policies and templates are slightly different. Policies blocking ingress bandwidth have
been lifted. They have been replaced by requirements to route traffic through the shared services subscription, like
any ingress traffic, to enforce traffic isolation. Since the on-premises operations management tooling can't access
this subscription, agents for that tool are no longer required either. All other governance guardrails required by
other subscriptions in the management group hierarchy are still enforced, ensuring sufficient guardrails.
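The policy-inheritance pattern in this narrative can be sketched in code. The following is a minimal, hypothetical model (the group and policy names are illustrative, not part of the Contoso example or any Azure API): guardrails assigned at the root flow down the management group hierarchy, and the isolated group replaces one policy rather than dropping all of them.

```python
# Hypothetical model of policy inheritance across a management group
# hierarchy: guardrails inherit from ancestors, and a child group can
# exempt one policy while adding a replacement guardrail.

def effective_policies(hierarchy, assignments, exemptions, group):
    """Collect policies assigned at `group` and all of its ancestors,
    minus any policies explicitly exempted for that group."""
    policies = set()
    node = group
    while node is not None:
        policies |= assignments.get(node, set())
        node = hierarchy.get(node)  # walk up to the parent group
    return policies - exemptions.get(group, set())

# Parent links: the isolated group sits under the corporate root.
hierarchy = {"corp-root": None, "isolated-workloads": "corp-root"}

assignments = {
    "corp-root": {"identity-baseline", "block-direct-ingress", "tag-required"},
    # The child group adds a replacement guardrail instead of open ingress.
    "isolated-workloads": {"route-ingress-via-shared-services"},
}

# Only the blocking policy is lifted for the isolated group.
exemptions = {"isolated-workloads": {"block-direct-ingress"}}

print(sorted(effective_policies(hierarchy, assignments, exemptions,
                                "isolated-workloads")))
# → ['identity-baseline', 'route-ingress-via-shared-services', 'tag-required']
```

The key property mirrored here is that the exemption is scoped to one group and one policy; every other guardrail in the hierarchy still applies to the isolated subscription.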
The mature, creative approach of Contoso's Central IT team provided a solution that didn't compromise
governance or compliance, but still encouraged adoption. This approach of brokering rather than owning
cloud-native approaches to centralized IT is the first step toward building a true cloud center of excellence (CCoE).
Adopting this approach to quickly evolve existing policies will allow for centralized control when required and
governance guardrails when more flexibility is acceptable. Balancing these two considerations mitigates the risks
associated with Central IT in the cloud.
Next steps
As Central IT matures in the cloud, the next maturity step is typically looser coupling of cloud operations. The
availability of cloud-native operations management tooling and lower operating costs for PaaS-first solutions
often lead to business teams (or more specifically, DevOps teams within the business) assuming responsibility for
cloud operations.
Cloud operations capability
Cloud operations capabilities
Business transformation may be enabled by cloud adoption. However, returns are only realized when the
workloads deployed to the cloud are operating in alignment with performance expectations. As additional
workloads adopt cloud technologies, additional operations capacity will be required.
Traditional IT operations focused on maintaining current-state operations for a wide variety of low-level
technical assets. Things like storage, CPU, memory, network equipment, servers, and virtual machine hosts
require continuous upkeep to maintain peak operations. Capital budgets often include large expenses related to
annual or periodic updates to these low-level assets.
Human capital within operations would also focus heavily on the monitoring, repair, and remediation of issues
related to these assets. In the cloud, many of these capital costs and operations activities are transferred to the
cloud provider. This provides an opportunity for IT operations to improve and provide significant additional
value.
Key responsibilities
The duty of the people providing cloud operations capability is to deliver maximum workload performance and
minimum business interruptions within agreed-upon operations budgets.
Strategic tasks
Review business outcomes, financial models, motivations for cloud adoption, business risks, and
rationalization of the digital estate.
Determine workload criticality and the impact of disruptions or performance degradation.
Establish business-approved cost and performance commitments.
Monitor and operate cloud workloads.
Technical tasks
Maintain asset and workload inventory.
Monitor performance of workloads.
Maintain operational compliance.
Protect workloads and associated assets.
Recover assets in the case of performance degradation or business interruption.
Mature capabilities of core platforms.
Continuously improve workload performance.
Improve budgetary and design requirements of workloads to fit commitments to the business.
Meeting cadence
Those performing cloud operations capabilities should be involved in release planning and cloud center of
excellence planning to provide feedback and prepare for operational requirements.
Next steps
As adoption and operations scale, it's important to define and automate governance best practices that extend
existing IT requirements. Forming a cloud center of excellence is an important step to scaling cloud adoption,
cloud operations, and cloud governance efforts.
Establish a cloud center of excellence
Cloud center of excellence
Business and technical agility are core objectives of most IT organizations. A cloud center of excellence (CCoE)
is a function that creates a balance between speed and stability.
Function structure
A CCoE model requires collaboration between each of the following capabilities:
Cloud adoption (specifically solution architects)
Cloud strategy (specifically the program and project managers)
Cloud governance
Cloud platform
Cloud automation
Key responsibilities
The primary duty of the CCoE team is to accelerate cloud adoption through cloud-native or hybrid solutions.
The objective of the CCoE is to:
Help build a modern IT organization through agile approaches to capture and implement business
requirements.
Use reusable deployment packages that align with security, compliance, and service management policies.
Maintain a functional Azure platform in alignment with operational procedures.
Review and approve the use of cloud-native tools.
Over time, standardize and automate commonly needed platform components and solutions.
Meeting cadence
The CCoE is a function staffed by four high-demand teams. It's important to allow for organic collaboration
and to track growth through a common repository or solution catalog. Maximize natural interactions, but
minimize meetings. When this function matures, the teams should try to limit dedicated meetings. Attendance at
recurring meetings, like release meetings hosted by the cloud adoption team, will provide data inputs. In
parallel, a meeting after each release plan is shared can provide a minimum touch point for this team.
The following scenarios contrast the pre-CCoE and post-CCoE approaches:

Scenario: Provision a production SQL Server.
Pre-CCoE result: Network, IT, and data platform teams provision various components over the course of days or
even weeks.
Post-CCoE result: The team requiring the server deploys a PaaS instance of Azure SQL Database. Alternatively, a
preapproved template could be used to deploy all of the IaaS assets to the cloud in hours.

Scenario: Provision a development environment.
Pre-CCoE result: Network, IT, development, and DevOps teams agree to specs and deploy an environment.
Post-CCoE result: The development team defines their own specs and deploys an environment based on
allocated budget.

Scenario: Update security requirements to improve data protection.
Pre-CCoE result: Networking, IT, and security teams update various networking devices and VMs across multiple
environments to add protections.
Post-CCoE result: Cloud governance tools are used to update policies that can be applied immediately to all
assets in all cloud environments.
Negotiations
At the root of any CCoE effort is an ongoing negotiation process. The CCoE team negotiates with existing IT
functions to reduce central control. The trade-offs for the business in this negotiation are freedom, agility, and
speed. The value of the trade-off for existing IT teams is delivered as new solutions. The new solutions provide
the existing IT team with one or more of the following benefits:
Ability to automate common issues.
Improvements in consistency (reduction in day-to-day frustrations).
Opportunity to learn and deploy new technical solutions.
Reductions in high severity incidents (fewer quick fixes or late-night pager-duty responses).
Ability to broaden their technical scope, addressing broader topics.
Participation in higher-level business solutions, addressing the impact of technology.
Reduction in menial maintenance tasks.
Increase in technology strategy and automation.
In exchange for these benefits, the existing IT function may be trading the following values, whether real or
perceived:
Sense of control from manual approval processes.
Sense of stability from change control.
Sense of job security from completion of necessary yet repetitive tasks.
Sense of consistency that comes from adherence to existing IT solution vendors.
In healthy cloud-forward companies, this negotiation process is a dynamic conversation between peers and
partnering IT teams. The technical details may be complex, but are manageable when IT understands the
objective and is supportive of the CCoE efforts. When IT is less than supportive, the following section on
enabling CCoE success can help overcome cultural blockers.
Next steps
A CCoE model requires both cloud platform capabilities and cloud automation capabilities. The next step is
to align cloud platform capabilities.
Align cloud platform capabilities
Cloud platform capabilities
The cloud introduces many technical changes as well as opportunities to streamline technical solutions. However,
general IT principles and business needs stay the same. You still need to protect sensitive business data. If your IT
platform depends on a local area network, there's a good chance that you'll need network definitions in the cloud.
Users who need to access applications and data will want their current identities to access relevant cloud
resources.
While the cloud presents the opportunity to learn new skills, your current architects should be able to directly
apply their experiences and subject matter expertise. Cloud platform capabilities are usually provided by a select
group of architects who focus on learning about the cloud platform. These architects then aid others in decision
making and the proper application of controls to cloud environments.
Key responsibilities
Cloud platform duties center around the creation and support of your cloud platform or landing zones.
The following tasks are typically executed on a regular basis:
Strategic tasks
Review:
business outcomes
financial models
motivations for cloud adoption
business risks
rationalization of the digital estate
Monitor adoption plans and progress against the prioritized migration backlog.
Identify and prioritize platform changes that are required to support the migration backlog.
Technical tasks
Build and maintain the cloud platform to support solutions.
Define and implement the platform architecture.
Operate and manage the cloud platform.
Continuously improve the platform.
Keep up with new innovations in the cloud platform.
Bring new cloud capabilities to support business value creation.
Suggest self-service solutions.
Ensure solutions meet existing governance/compliance requirements.
Create and validate deployment of platform architecture.
Review release plans for sources of new platform requirements.
Meeting cadence
Cloud platform expertise usually comes from a working team. Expect participants to commit a large portion of
their daily schedules to cloud platform work. Contributions aren't limited to meetings and feedback cycles.
Next steps
As your cloud platform becomes better defined, aligning cloud automation capabilities can accelerate adoption. It
can also help establish best practices while reducing business and technical risks.
Align cloud automation capabilities
Cloud automation capabilities
During cloud adoption efforts, cloud automation capabilities will unlock the potential of DevOps and a cloud-
native approach. Expertise in each of these areas can accelerate adoption and innovation.
Mindset
Before you admit a team member to this group, they should demonstrate three key characteristics:
Expertise in any cloud platform with a special emphasis on DevOps and automation.
A growth mindset or openness to changing the way IT operates today.
A desire to accelerate business change and remove traditional IT roadblocks.
Key responsibilities
The primary duty of cloud automation is to own and advance the solution catalog. The solution catalog is a
collection of prebuilt solutions or automation templates. These solutions can rapidly deploy various platforms as
required to support needed workloads. These solutions are building blocks that accelerate cloud adoption and
reduce the time to market during migration or innovation efforts.
Examples of solutions in the catalog include:
A script to deploy a containerized application
A Resource Manager template to deploy a SQL Server high availability AlwaysOn (HA AO) cluster
Sample code to build a deployment pipeline using Azure DevOps
An Azure DevTest Labs instance of the corporate ERP for development purposes
Automated deployment of a self-service environment commonly requested by business users
The solutions in the solution catalog aren't deployment pipelines for a workload. Instead, you might use
automation scripts in the catalog to quickly create a deployment pipeline. You might also use a solution in the
catalog to quickly provision platform components to support workload tasks like automated deployment, manual
deployment, or migration.
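The solution catalog described above can be thought of as a registry of validated, versioned templates. The following is a minimal sketch of that idea; the class, entry names, and template path are hypothetical illustrations, not part of any Azure tooling:

```python
# Hypothetical sketch of a solution catalog: prebuilt, versioned
# templates that adoption teams can look up and deploy without
# rebuilding them from scratch.

class SolutionCatalog:
    def __init__(self):
        self._solutions = {}

    def register(self, name, version, template, approved_by):
        # Each entry records who validated it, supporting governance review.
        self._solutions[name] = {
            "version": version,
            "template": template,
            "approved_by": approved_by,
        }

    def lookup(self, name):
        # Returns None when no prebuilt solution exists for the request.
        return self._solutions.get(name)

catalog = SolutionCatalog()
catalog.register(
    "sql-ha-cluster",
    version="1.2.0",
    template="templates/sql-ha-cluster.json",  # illustrative path
    approved_by="cloud-governance",
)

entry = catalog.lookup("sql-ha-cluster")
print(entry["version"], entry["approved_by"])  # → 1.2.0 cloud-governance
```

The design point mirrored here is that the catalog stores reviewed building blocks, not workload deployment pipelines; a team consuming an entry still owns its own release process.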
The following tasks are typically executed by cloud automation on a regular basis:
Strategic tasks
Review:
business outcomes
financial models
motivations for cloud adoption
business risks
rationalization of the digital estate
Monitor adoption plans and progress against the prioritized migration backlog.
Identify opportunities to accelerate cloud adoption, reduce effort through automation, and improve security,
stability, and consistency.
Prioritize a backlog of solutions for the solution catalog that delivers the most value given other strategic
inputs.
Technical tasks
Curate or develop solutions based on the prioritized backlog.
Ensure solutions align to platform requirements.
Ensure solutions are consistently applied and meet existing governance/compliance requirements.
Create and validate solutions in the catalog.
Review release plans for sources of new automation opportunities.
Meeting cadence
Cloud automation is a working team. Expect participants to commit a large portion of their daily schedules to
cloud automation work. Contributions aren't limited to meetings and feedback cycles.
The cloud automation team should align activities with other areas of capability. This alignment might result in
meeting fatigue. To ensure cloud automation has sufficient time to manage the solution catalog, you should
review meeting cadences to maximize collaboration and minimize disruptions to development activities.
Next steps
As the essential cloud capabilities align, the collective teams can help develop needed technical skills.
Building technical skills
Establish team structures
Every cloud capability is provided by someone during every cloud adoption effort. These assignments and team
structures can develop organically, or they can be intentionally designed to match a defined team structure.
As adoption needs grow, so does the need for balance and structure. This article provides examples of common
team structures at various stages of organizational maturity. The following graphic and list outline those structures
based on typical maturation stages. Use these examples to find the organizational structure that best aligns with
your operational needs.
Organizational structures tend to move through the common maturity model that's outlined here:
1. Cloud adoption team only
2. MVP best practice
3. Central IT
4. Strategic alignment
5. Operational alignment
6. Cloud center of excellence (CCoE)
Most companies start with little more than a cloud adoption team. However, we recommend that you establish an
organizational structure that more closely resembles the MVP best practice structure.
WARNING
Operating with only a cloud adoption team (or multiple cloud adoption teams) is considered an antipattern and should be
avoided. At a minimum, consider the MVP best practice.
The MVP best practice is proven, but it's considered a minimum viable product because it may not be
sustainable. Each team wears many hats, as outlined in the responsible, accountable, consulted, informed
(RACI) charts.
The following sections describe a fully staffed, proven organizational structure along with approaches to aligning
the appropriate structure to your organization.
Central IT
As adoption scales, the cloud governance team may struggle to keep pace with the flow of innovation from
multiple cloud adoption teams. This is especially true in environments that have heavy compliance, operations,
or security requirements. At this stage, it's common for companies to shift cloud responsibilities to an existing
central IT team. If that team is able to reassess tools, processes, and people to better support cloud adoption at
scale, then including the central IT team can add significant value. Bringing in subject matter experts from
operations, automation, security, and administration to modernize Central IT can drive effective operational
innovations.
Unfortunately, the central IT phase can be one of the riskiest phases of organizational maturity. The central IT team
must come to the table with a strong growth mindset. If the team views the cloud as an opportunity to grow and
adapt their capabilities, then it can provide great value throughout the process. However, if the central IT team
views cloud adoption primarily as a threat to their existing model, then the central IT team becomes an obstacle to
the cloud adoption teams and the business objectives they support. Some central IT teams have spent months or
even years attempting to force the cloud into alignment with on-premises approaches, with only negative results.
The cloud doesn't require that everything change within Central IT, but it does require change. If resistance to
change is prevalent within the central IT team, this phase of maturity can quickly become a cultural antipattern.
Cloud adoption plans heavily focused on platform as a service (PaaS), DevOps, or other solutions that require less
operations support are less likely to see value during this phase of maturity. On the contrary, these types of
solutions are the most likely to be hindered or blocked by attempts to centralize IT. A higher level of maturity, like
a cloud center of excellence (CCoE), is more likely to yield positive results for those types of transformational
efforts. To understand the differences between Central IT in the cloud and a CCoE, see Cloud center of excellence.
Strategic alignment
As the investment in cloud adoption grows and business values are realized, business stakeholders often become
more engaged. A defined cloud strategy team, as the following image illustrates, aligns those business
stakeholders to maximize the value realized by cloud adoption investments.
When maturity happens organically, as a result of IT-led cloud adoption efforts, strategic alignment is usually
preceded by a governance or central IT team. When cloud adoption efforts are led by the business, the focus on
operating model and organization tends to happen earlier. Whenever possible, business outcomes and the cloud
strategy team should both be defined early in the process.
Operational alignment
Realizing business value from cloud adoption efforts requires stable operations. Operations in the cloud may
require new tools, processes, or skills. When stable IT operations are required to achieve business outcomes, it's
important to add a defined cloud operations team, as shown here.
Cloud operations can be delivered by the existing IT operations roles. But it's not uncommon for cloud operations
to be delegated to other parties outside of IT operations. Managed service providers, DevOps teams, and business
unit IT often assume the responsibilities associated with cloud operations, with support and guardrails provided
by IT operations. This is increasingly common for cloud adoption efforts that focus heavily on DevOps or PaaS
deployments.
The primary difference between this structure and the Central IT structure above is a focus on self-service. The
teams in this structure organize with the intent of delegating control as much as possible. Aligning governance
and compliance practices to cloud-native solutions creates guardrails and protection mechanisms. Unlike the
Central IT model, the cloud-native approach maximizes innovation and minimizes operational overhead. For this
model to be adopted, mutual agreement to modernize IT processes will be required from business and IT
leadership. This model is unlikely to occur organically and often requires executive support.
Next steps
After aligning to a certain stage of organizational structure maturity, you can use RACI charts to align
accountability and responsibility across each team.
Align the appropriate RACI chart
Align responsibilities across teams
Learn to align responsibilities across teams by developing a cross-team matrix that identifies responsible,
accountable, consulted, and informed (RACI) parties. This article provides an example RACI matrix for the
organizational structures described in Establish team structures:
Cloud adoption team only
MVP best practice
Central IT
Strategic alignment
Operational alignment
Cloud center of excellence (CCoE)
To track organizational structure decisions over time, download and modify the RACI spreadsheet template.
The examples in this article specify these RACI constructs:
The one team that is accountable for a function.
The teams that are responsible for the outcomes.
The teams that should be consulted during planning.
The teams that should be informed when work is completed.
The last row of each table (except the first) contains a link to the most-aligned cloud capability for additional
information.
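The single-accountability rule described above lends itself to a quick consistency check. The following sketch models a RACI matrix as a plain dictionary and validates that every function has exactly one accountable team. Team and function names here are illustrative examples, not values prescribed by the framework.

```python
# Illustrative sketch: a RACI matrix as {function: {team: role}}, where
# role is one of "R", "A", "C", or "I". Names are hypothetical examples.

RACI = {
    "Solution delivery": {
        "Cloud adoption team": "A",
        "Cloud strategy team": "C",
        "Central IT": "I",
    },
    "Governance": {
        "Cloud governance team": "A",
        "Cloud adoption team": "R",
        "Cloud strategy team": "C",
    },
}

def validate(matrix):
    """Return the functions that do not have exactly one accountable team."""
    problems = []
    for function, assignments in matrix.items():
        accountable = [t for t, role in assignments.items() if role == "A"]
        if len(accountable) != 1:
            problems.append(function)
    return problems

print(validate(RACI))  # An empty list means every function has one owner.
```

Running the check as part of reviewing the RACI spreadsheet can catch the common mistake of assigning two accountable teams to the same function.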
Central IT
(Table: RACI assignments across the functions Solution delivery, Business alignment, Change management, Solution operations, Governance, Platform maturity, Platform operations, and Platform automation.)
Strategic alignment
(Table: RACI assignments across the functions Solution delivery, Business alignment, Change management, Solution operations, Governance, Platform maturity, Platform operations, and Platform automation.)
Operational alignment
(Table: RACI assignments across the functions Solution delivery, Business alignment, Change management, Solution operations, Governance, Platform maturity, Platform operations, and Platform automation.)
Next steps
To track decisions about organization structure over time, download and modify the RACI spreadsheet template.
Copy and modify the most closely aligned sample from the RACI matrices in this article.
Download the RACI spreadsheet template
Skills readiness path during the Ready phase of a
migration
During the Ready phase of a migration, the objective is to prepare for the journey ahead. This preparation spans
two primary areas: organizational and environmental (technical) readiness. Both may require new
skills for technical and nontechnical contributors. The following information can help your organization build the
necessary skills.
Learn more
For additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning paths with
your role.
Build a cost-conscious organization
As outlined in Motivations: Why are we moving to the cloud?, there are many sound reasons for a company to
adopt the cloud. When cost reduction is a primary driver, it's important to create a cost-conscious organization.
Ensuring cost consciousness is not a one-time activity. Like other cloud-adoption topics, it's iterative. The following
diagram outlines this process, which focuses on three interdependent activities: visibility, accountability, and optimization.
These processes play out at macro and micro levels, which we describe in detail in this article.
Next steps
Practicing these responsibilities at each level of the business helps drive a cost-conscious organization. To begin
acting on this guidance, review the organizational readiness introduction to help identify the right team structures.
Identify the right team structures
Organizational antipatterns: Silos and fiefdoms
Success in any major change to business practices, culture, or technology operations requires a growth mindset.
At the heart of the growth mindset is an acceptance of change and the ability to lead in spite of ambiguity.
Some antipatterns can block a growth mindset in organizations that want to grow and transform, including
micromanagement, biased thinking, and exclusionary practices. Many of these blockers are personal challenges
that create personal growth opportunities for everyone. But two common antipatterns in IT require more than
individual growth or maturity: silos and fiefdoms.
These antipatterns arise from organic changes within various teams that lead to unhealthy organizational
behaviors. To address the resistance caused by each antipattern, it's important to understand the root cause of its
formation.
Antipatterns
The organic and responsive growth within IT that creates healthy IT teams can also result in antipatterns that
block transformation and cloud adoption. IT silos and fiefdoms are different from the natural microcultures within
healthy IT teams. In either pattern, the team focus tends to be directed toward protecting their "turf". When team
members are confronted with an opportunity to drive change and improve operations, they will invest more time
and energy into blocking the change than finding a positive solution.
As mentioned earlier, healthy IT teams can create natural resistance and positive friction. Silos and fiefdoms are a
different challenge. There is no documented leading indicator for either antipattern. These antipatterns tend to be
identified after months of cloud center of excellence and cloud governance team efforts. They're discovered as the
result of ongoing resistance.
Even in toxic cultures, the efforts of the CCoE and the cloud governance team should help drive cultural growth
and technical progress. After months of effort, a few teams might still show no signs of inclusive behaviors and
stand firm in their resistance to change. These teams are likely operating in one of the following antipattern
models: silos and fiefdoms. Although these models have similar symptoms, the root causes and approaches to
addressing resistance are radically different between them.
IT silos
Team members in an IT silo are likely to define themselves through their alignment to a small number of IT
vendors or an area of technical specialization. However, don't confuse IT silos with IT fiefdoms. IT silos tend to be
driven by comfort and passion, and are generally easier to overcome than the fear-driven motives behind
fiefdoms.
This antipattern often emerges from a common passion for a specific solution. IT silos are then reinforced by
the team's advanced skills as a result of the investment in that specific solution. This superior skill can be an
accelerator to cloud adoption efforts if the resistance to change can be overcome. It can also become a major
blocker if the silos are broken down or if the team members can't accurately evaluate options. Fortunately, IT silos
can often be overcome without any significant changes to the organizational chart.
Address resistance from IT silos
IT silos can be addressed through the following approaches. The best approach will depend on the root cause of
the resistance.
Create virtual teams: The organizational readiness section of the Cloud Adoption Framework describes a
multilayered structure for integrating and defining four virtual teams (v-teams). One benefit of this structure is
cross-organization visibility and inclusion. Introducing a cloud center of excellence creates a high-profile
aspirational team that top engineers will want to participate in. This helps create new cross-solution alignments
that aren't bound by organizational-chart constraints, and will drive inclusion of top engineers who have been
sheltered by IT silos.
Introduction of a cloud strategy team will create immediate visibility to IT contributions regarding cloud adoption
efforts. When IT silos fight for separation, this visibility can help motivate IT and business leaders to properly
support those resistant team members. This process is a quick path to stakeholder engagement and support.
Consider experimentation and exposure: Team members in an IT silo have likely been constrained to think a
certain way for some time. Breaking the one-track mind is a first step to addressing resistance.
Experimentation and exposure are powerful tools for breaking down barriers in silos. The team members might be
resistant to competing solutions, so it's not wise to put them in charge of an experiment that competes with their
existing solution. However, as part of a first workload test of the cloud, the organization should implement
competing solutions. The siloed team should be invited to participate as an input and review source, but not as a
decision maker. This should be clearly communicated to the team, along with a commitment to engage the team
more deeply as a decision maker before moving into production solutions.
During review of the competing solution, use the practices outlined in Define corporate policy to document
tangible risks of the experiment and establish policies that help the siloed team become more comfortable with
the future state. This will expose the team to new solutions and harden the future solution.
Be "boundaryless": The teams that drive cloud adoption find it easy to push boundaries by exploring exciting,
new cloud-native solutions. This is one half of the approach to removing boundaries. However, that thinking can
further reinforce IT silos. Pushing for change too quickly and without respect to existing cultures can create
unhealthy friction and lead to natural resistance.
When IT silos start to resist, it's important to be boundaryless in your own solutions. Be mindful of one simple
truth: cloud-native isn't always the best solution. Consider hybrid solutions that might provide an opportunity to
extend the existing investments of the IT silo into the future.
Also consider cloud-based versions of the solution that the IT silo team uses now. Experiment with those solutions
and expose yourself to the viewpoint of those living in the IT silo. At a minimum, you will gain a fresh perspective.
In many situations, you might earn enough of the IT silo's respect to lessen resistance.
Invest in education: Many people living in an IT silo became passionate about the current solution as a result of
expanding their own education. Investing in the education of these teams is seldom misplaced. Allocate time for
these individuals to engage in self-learning, classes, or even conferences to break the day-to-day focus on the
current solution.
For education to be an investment, some return must come as a result of the expense. In exchange for the
investment, the team might demonstrate the proposed solution to the rest of the teams involved in cloud
adoption. They might also provide documentation of the tangible risks, risk management approaches, and desired
policies in adopting the proposed solution. Each of these activities will engage the teams in the solution and help
take advantage of their tribal knowledge.
Turn roadblocks into speed bumps: IT silos can slow or stop any transformation. Experimentation and iteration
will find a way, but only if the project keeps moving. Focus on turning roadblocks into merely speed bumps.
Define policies that everyone can be temporarily comfortable with in exchange for continued progression.
For instance, if IT security is the roadblock because its security solution can't monitor compromises of protected
data in the cloud, establish data classification policies. Prevent deployment of classified data into the cloud until an
agreeable solution can be found. Invite IT security into experimentation with hybrid or cloud-native solutions to
monitor protected data.
If the network team operates as a silo, identify workloads that are self-contained and don't have network
dependencies. In parallel, experiment, expose, and educate the network team while working on hybrid or
alternative solutions.
Be patient and be inclusive: It's tempting to move on without support of an IT silo. But this decision will cause
disruptions and roadblocks down the road. Changing the minds of IT silo members can take time. Be patient with
their natural resistance and convert it into value. Be inclusive and invite healthy friction to improve the future solution.
Never compete: The IT silo exists for a reason. It persists for a reason. There is an investment in maintaining the
solution that the team members are passionate about. Directly competing with the solution or the IT silo will
distract from the real goal of achieving business outcomes. This trap has blocked many transformation projects.
Stay focused on the goal, as opposed to a single component of the goal. Help accentuate the positive aspects of
the IT silo's solution and help the team members make wise decisions about the best solutions for the future.
Don't insult or degrade the current solution, because that would be counterproductive.
Partner with the business: If the IT silo isn't blocking business outcomes, why do you care? There is no perfect
solution or perfect IT vendor. Competition exists for a reason; each has its own benefits.
Embrace diversity and include the business by supporting and aligning to a strong cloud strategy team. When an
IT silo supports a solution that blocks business outcomes, it will be easier to communicate that roadblock without
the noise of technical squabbles. Supporting nonblocking IT silos will show an ability to partner for the desired
business outcomes. These efforts will earn more respect and greater support from the business when an IT silo
presents a legitimate blocker.
IT fiefdoms
Team members in an IT fiefdom are likely to define themselves through their alignment to a specific process or
area of responsibility. The team operates under an assumption that external influence on its area of responsibility
will lead to problems. Fiefdoms tend to be a fear-driven antipattern, which will require significant leadership
support to overcome.
Fiefdoms are especially common in organizations that have experienced IT downsizing, frequent turbulence in IT
staff, or poor IT leadership. When the business sees IT purely as a cost center, fiefdoms are much more likely to
arise.
Generally, fiefdoms are the result of a line manager who fears loss of the team and the associated power base.
These leaders often have a sense of duty to their team and feel a need to protect their subordinates from negative
consequences. Phrases like "shelter the team from change" and "protect the team from process disruption" can be
indicators of an overly guarded manager who might need more support from leadership.
Address resistance from IT fiefdoms
IT fiefdoms can demonstrate some growth by following the approaches to addressing IT silo resistance. Before
you try to address resistance from an IT fiefdom, we recommend that you treat the team like an IT silo first. If
those types of approaches fail to yield any significant change, the resistant team might be suffering from an IT
fiefdom antipattern. The root cause of IT fiefdoms is a little more complex to address, because that resistance
tends to come from the direct line manager (or a leader higher up the organizational chart). Challenges that are IT
silo-driven are typically simpler to overcome.
When continued resistance from IT fiefdoms blocks cloud adoption efforts, it might be wise for a combined effort
to evaluate the situation with existing IT leaders. IT leaders should carefully consider insights from the cloud
strategy team, cloud center of excellence, and cloud governance team before making decisions.
NOTE
IT leaders should never take changes to the organizational chart lightly. They should also validate and analyze feedback from
each of the supporting teams. However, transformational efforts like cloud adoption tend to magnify underlying issues that
have gone unnoticed or unaddressed long before this effort. When fiefdoms are preventing the company's success,
leadership changes are a likely necessity.
Fortunately, removing the leader of a fiefdom doesn't often end in termination. These strong, passionate leaders can often
move into a management role after a brief period of reflection. With the right support, this change can be healthy for the
leader of the fiefdom and the current team.
Caution
For managers of IT fiefdoms, protecting the team from risk is a clear leadership value. However, there's a fine line
between protection and isolation. When the team is blocked from participating in driving changes, it can have
psychological and professional consequences on the team. The urge to resist change might be strong, especially
during times of visible change.
The manager of any isolated team can best demonstrate a growth mindset by experimenting with the guidance
associated with healthy IT teams in the preceding sections. Active and optimistic participation in governance and
CCoE activities can lead to personal growth. Managers of IT fiefdoms are best positioned to change stifling
mindsets and help the team develop new ideas.
IT fiefdoms can be a sign of systemic leadership issues. To overcome an IT fiefdom, IT leaders need the ability to
make changes to operations, responsibilities, and occasionally even the people who provide line management of
specific teams. When those changes are required, it's wise to approach those changes with clear and defensible
data points.
Alignment with business stakeholders, business motivations, and business outcomes might be required to drive
the necessary change. Partnership with the cloud strategy team, cloud center of excellence, and cloud governance
team can provide the data points needed for a defensible position. When necessary, these teams should be
involved in a group escalation to address challenges that can't be addressed with IT leadership alone.
Next steps
Disrupting organizational antipatterns is a team effort. To act on this guidance, review the organizational readiness
introduction to identify the right team structures and participants:
Identify the right team structures and participants
The cloud fundamentally changes how enterprises procure and use technology resources. Traditionally, enterprises assumed
ownership and responsibility of all aspects of technology, from infrastructure to software. The cloud allows enterprises to provision
and to consume resources only as needed. However, cloud adoption is a means to an end. Businesses adopt the cloud when they
realize it can address any of these business opportunities:
Businesses are motivated to migrate to the cloud to:
Optimize operations
Simplify technology
Increase business agility
Reduce costs
Prepare for new technical capabilities
Scale to market demands or new geographical regions
Businesses are motivated to innovate using the cloud to:
Improve customer experiences
Increase customer engagements
Transform products
Prepare for and build new technical capabilities
Scale to market demands or new geographical regions
Cloud adoption is an iterative effort focusing on what you do in the cloud. The cloud strategy outlines the digital
transformation to guide business programs, as various teams execute adoption projects. Planning and readiness
help ensure the success of each of those important elements. All steps of cloud adoption equate to tangible
projects with manageable objectives, timelines, and budgets.
These adoption efforts are relatively easy to track and measure, even when they involve multiple projected
iterations and releases. Each phase of the adoption lifecycle is important. Each phase is prone to potential
roadblocks across business, culture, and technology constraints. But each phase depends heavily on the underlying
operating model.
If adoption describes what you are doing, the operating model defines the underlying who and how
that enable adoption.
Satya Nadella has said that "culture eats strategy for breakfast." The operating model is the embodiment of the IT
culture, captured in a number of measurable processes. When the cloud is powered by a strong operating model,
culture drives the strategy, accelerating adoption and the realization of business value. Conversely, when
adoption succeeds without an operating model, the returns can be impressive but short-lived. For long-term
success, it's vital that adoption and the operating model advance in parallel.
Next steps
Governance is a common first step toward establishing an operating model for the cloud.
Learn about cloud governance
Operating model terminology
The term operating model has many definitions. This introductory article establishes terminology associated with
operating models. To understand an operating model as it relates to the cloud, we first have to understand how an
operating model fits into the bigger theme of corporate planning.
Terms
Business model: Business models tend to define corporate value ("what" the business does to provide value) and
mission/vision statements ("why" the business has chosen to add value in that way). At a minimum, business
models should be able to represent the "what" and "why" in the form of financial projections. There are many
different schools of thought regarding how far a business model goes beyond these basic leadership principles.
However, to create a sound operating model, the business models should include high-level statements to establish
directional goals. It's even more effective if those goals can be represented in metrics or KPIs to track progress.
Customer experience: All good business models ground the "why" side of a business's strategy in the experience
of their customers. This process could involve a customer acquiring a product or service. It could include
interactions between a company and its business customers. Another example could center around the long-term
management of a customer's financial or health needs, as opposed to a single transaction or process. Regardless of
the type of experience, the majority of successful companies realize that they exist to operate and improve the
experiences that drive their "why" statements.
Digital transformation: Digital transformation has become an industry buzzword. However, it is a vital
component in the fulfillment of modern business models. Since the advent of the smartphone and other portable
computing form factors, customer experiences have become increasingly digital. This shift is painfully obvious in
some industries like DVD rentals, print media, automotive, or retail. In each case, digitized experiences have had a
significant impact on the customer experience. In some cases, physical media have been entirely replaced with
digital media, upsetting the entire industry vertical. In others, digital experiences are seen as a standard
augmentation of the experience. To deliver business value ("what" statements), the customer experience ("why"
statements) must factor in the impact of digital experiences on the customers' experiences. This process is digital
transformation. Digital transformation is seldom the entire "why" statement in a business strategy, but it is an
important aspect.
Operating model: If the business model represents the "what" and "why", then an operating model represents the
"how" and "who" for operationalizing the business strategy. The operating model defines the ways in which people
work together to accomplish the large goals outlined in the business strategy. Operating models are often
described as the people, process, and technology behind the business strategy. In the article on the Cloud Adoption
Framework operating model, this concept is explained in detail.
Cloud adoption: As stated above, digital transformation is an important aspect of the customer experience and
the business model. Likewise, cloud adoption is an important aspect of any operating model. Cloud adoption is a
strong enabler to deliver the right technologies and processes required to successfully deliver on the modern
operating model.
Cloud adoption is "what we do" to realize the business value. The operating model represents "who we are and
how we function on a daily basis" while cloud adoption is being delivered.
Next steps
Leverage the operating model provided by the Cloud Adoption Framework to develop operational maturity.
Leverage the operating model
Architectural decision guides
The architectural decision guides in the Cloud Adoption Framework describe patterns and models that help
when creating cloud governance design guidance. Each decision guide focuses on one core infrastructure
component of cloud deployments and lists patterns and models that can support specific cloud deployment
scenarios.
When you begin to establish cloud governance for your organization, actionable governance journeys provide a
baseline roadmap. However, these journeys make assumptions about requirements and priorities that might not
reflect those of your organization.
These decision guides supplement the sample governance journeys by providing alternative patterns and
models that help you align the architectural design choices made in the example design guidance with your own
requirements.
Effective subscription design helps organizations establish a structure to organize assets in Azure during a cloud
adoption.
Each resource in Azure, such as a virtual machine or a database, is associated with a subscription. Adopting
Azure begins by creating an Azure subscription, associating it with an account, and deploying resources to the
subscription. For an overview of these concepts, see Azure fundamental concepts.
As your digital estate in Azure grows, you will likely need to create additional subscriptions to meet your
requirements. Azure allows you to define a hierarchy of management groups to organize your subscriptions and
easily apply the right policy to the right resources. For more information, see Scaling with multiple Azure
subscriptions.
Some basic examples of using management groups to separate different workloads include:
Production vs. nonproduction workloads: Some enterprises create management groups to separate their
production and nonproduction subscriptions. Management groups allow these customers to more easily
manage roles and policies. For example, a nonproduction subscription might grant developers contributor
access, while in production they have only reader access.
Internal services vs. external services: Much like production versus nonproduction workloads, enterprises
often have different requirements, policies, and roles for internal services versus external customer-facing
services.
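The value of separating workloads this way comes from policy inheritance: a subscription picks up every policy assigned at the management groups above it. The following sketch models that inheritance conceptually; it is not the Azure API, and the group and policy names are hypothetical.

```python
# Conceptual sketch of management-group policy inheritance (not the Azure
# API): effective policies are the union of the policies assigned at every
# level above a node. Group and policy names are illustrative.

class ManagementGroup:
    def __init__(self, name, parent=None, policies=None):
        self.name = name
        self.parent = parent
        self.policies = set(policies or [])

    def effective_policies(self):
        inherited = self.parent.effective_policies() if self.parent else set()
        return inherited | self.policies

root = ManagementGroup("Tenant root", policies={"require-tags"})
prod = ManagementGroup("Production", parent=root,
                       policies={"deny-public-ip", "reader-only-devs"})
nonprod = ManagementGroup("Nonproduction", parent=root,
                          policies={"allow-dev-contributor"})

print(sorted(prod.effective_policies()))
# Production inherits "require-tags" from the root in addition to its own.
```

Because a policy assigned once at the root applies everywhere below it, the hierarchy lets you express "everything in production denies public IPs" without repeating the assignment per subscription.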
This decision guide helps you consider different approaches to organizing your management group hierarchy.
NOTE
Azure Enterprise Agreements (EAs) allow you to define another organizational hierarchy for billing purposes. This
hierarchy is distinct from your management group hierarchy, which focuses on providing an inheritance model for easily
applying suitable policies and access control to your resources.
The following subscription patterns reflect an initial increase in subscription design sophistication, followed by
several more advanced hierarchies that may align well to your organization:
Single subscription
A single subscription per account may suffice for organizations that need to deploy a small number of
cloud-hosted assets. This is the first subscription pattern you'll implement when beginning your cloud adoption
process, allowing small-scale experimental or proof-of-concept deployments to explore the capabilities of the
cloud.
Production-and-nonproduction pattern
When you're ready to deploy a workload to a production environment, you should add an additional
subscription. This helps you keep your production data and other assets out of your dev/test environments. You
can also easily apply two different sets of policies across the resources in the two subscriptions.
Mixed patterns
Management group hierarchies can be up to six levels deep. This provides you with the flexibility to create a
hierarchy that combines several of these patterns to meet your organizational needs. For example, the diagram
below shows an organizational hierarchy that combines a business unit pattern with a geographic pattern.
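When combining patterns, it's worth verifying that the proposed tree stays within the six-level depth limit mentioned above. The sketch below checks that with a simple recursive depth calculation; the hierarchy shown (business units over geographies) is a hypothetical example.

```python
# Sketch: validate that a proposed management-group hierarchy, modeled as
# nested dictionaries, stays within the six-level depth limit.

def depth(tree):
    """Depth of a nested-dict hierarchy; an empty dict has depth 0."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())

hierarchy = {
    "Contoso": {
        "Business unit A": {"EMEA": {}, "Americas": {}},
        "Business unit B": {"EMEA": {}},
    },
}

print(depth(hierarchy))  # must be <= 6 to be deployable
```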
Related resources
Resource access management in Azure
Multiple layers of governance in large enterprises
Multiple geographic regions
Next steps
Subscription design is just one of the core infrastructure components requiring architectural decisions during a
cloud adoption process. Visit the decision guides overview to learn about alternative patterns or models used
when making design decisions for other types of infrastructure.
Architectural decision guides
Identity decision guide
In any environment, whether on-premises, hybrid, or cloud-only, IT needs to control which administrators,
users, and groups have access to resources. Identity and access management (IAM) services enable you to
manage access control in the cloud.
Several options are available for managing identity in a cloud environment. These options vary in cost and
complexity. A key factor in structuring your cloud-based identity services is the level of integration required with
your existing on-premises identity infrastructure.
In Azure, Azure Active Directory (Azure AD) provides a base level of access control and identity management
for cloud resources. However, if your organization's on-premises Active Directory infrastructure has a complex
forest structure or customized organizational units (OUs), your cloud-based workloads might require directory
synchronization with Azure AD for a consistent set of identities, groups, and roles between your on-premises
and cloud environments. Additionally, support for applications that depend on legacy authentication
mechanisms might require the deployment of Active Directory Domain Services (AD DS) in the cloud.
Cloud-based identity management is an iterative process. You could start with a cloud-native solution with a
small set of users and corresponding roles for an initial deployment. As your migration matures, you might
need to integrate your identity solution using directory synchronization or add domains services as part of your
cloud deployments. Revisit your identity strategy in every iteration of your migration process.
As part of planning your migration to Azure, you will need to determine how best to integrate your existing
identity management and cloud identity services. The following are common integration scenarios.
Cloud baseline
Azure AD is the native identity and access management (IAM) system for granting users and groups access to
management features on the Azure platform. If your organization lacks a significant on-premises identity
solution, and you plan on migrating workloads to be compatible with cloud-based authentication mechanisms,
you should begin developing your identity infrastructure using Azure AD as a base.
Cloud baseline assumptions: Using a purely cloud-native identity infrastructure assumes the following:
Your cloud-based resources will not have dependencies on on-premises directory services or Active
Directory servers, or workloads can be modified to remove those dependencies.
The application or service workloads being migrated either support authentication mechanisms compatible
with Azure AD or can be modified easily to support them. Azure AD relies on internet-ready authentication
mechanisms such as SAML, OAuth, and OpenID Connect. Existing workloads that depend on legacy
authentication methods using protocols such as Kerberos or NTLM might need to be refactored before
migrating to the cloud using the cloud baseline pattern.
TIP
Completely migrating your identity services to Azure AD eliminates the need to maintain your own identity infrastructure,
significantly simplifying your IT management.
However, Azure AD is not a full replacement for a traditional on-premises Active Directory infrastructure. Directory
features such as legacy authentication methods, computer management, or group policy might not be available without
deploying additional tools or services to the cloud.
For scenarios where you need to integrate your on-premises identities or domain services with your cloud deployments,
see the directory synchronization and cloud-hosted domain services patterns discussed below.
Directory synchronization
For organizations with existing on-premises Active Directory infrastructure, directory synchronization is often
the best solution for preserving existing user and access management while providing the required IAM
capabilities for managing cloud resources. This process continuously replicates directory information between
Azure AD and on-premises directory services, allowing common credentials for users and a consistent identity,
role, and permission system across your entire organization.
Note: Organizations that have adopted Office 365 might have already implemented directory synchronization
between their on-premises Active Directory infrastructure and Azure Active Directory.
Directory synchronization assumptions: Using a synchronized identity solution assumes the following:
You need to maintain a common set of user accounts and groups across your cloud and on-premises IT
infrastructure.
Your on-premises identity services support replication with Azure AD.
TIP
Any cloud-based workloads that depend on legacy authentication mechanisms provided by on-premises Active Directory
servers and that are not supported by Azure AD will still require either connectivity to on-premises domain services or
virtual servers in the cloud environment providing these services. Using on-premises identity services also introduces
dependencies on connectivity between the cloud and on-premises networks.
TIP
While a directory migration coupled with cloud-hosted domain services provides great flexibility when migrating existing
workloads, hosting virtual machines within your cloud virtual network to provide these services does increase the
complexity of your IT management tasks. As your cloud migration experience matures, examine the long-term
maintenance requirements of hosting these servers. Consider whether refactoring existing workloads for compatibility
with cloud identity providers such as Azure Active Directory can reduce the need for these cloud-hosted servers.
Learn more
For more information about identity services in Azure, see:
Azure AD. Azure AD provides cloud-based identity services. It allows you to manage access to your Azure
resources and control identity management, device registration, user provisioning, application access control,
and data protection.
Azure AD Connect. The Azure AD Connect tool allows you to connect Azure AD instances with your existing
identity management solutions, allowing synchronization of your existing directory in the cloud.
Role-based access control (RBAC). Azure AD provides RBAC to efficiently and securely manage access to
resources in the management plane. Jobs and responsibilities are organized into roles, and users are
assigned to these roles. RBAC allows you to control who has access to a resource along with which actions a
user can perform on that resource.
Azure AD Privileged Identity Management (PIM). PIM lowers the exposure time of resource access
privileges and increases your visibility into their use through reports and alerts. It limits users to taking on
their privileges "just in time" (JIT), or by assigning privileges for a shorter duration, after which privileges are
revoked automatically.
Integrate on-premises Active Directory domains with Azure Active Directory. This reference architecture
provides an example of directory synchronization between on-premises Active Directory domains and Azure
AD.
Extend Active Directory Domain Services (AD DS) to Azure. This reference architecture provides an example
of deploying AD DS servers to extend domain services to cloud-based resources.
Extend Active Directory Federation Services (AD FS) to Azure. This reference architecture configures Active
Directory Federation Services (AD FS) to perform federated authentication and authorization with your
Azure AD directory.
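The role-based access control model described in the RBAC entry above can be sketched in miniature. The roles, users, and scopes below are hypothetical illustrations, not Azure built-in roles or an actual Azure API:

```python
# Conceptual sketch of role-based access control (RBAC): permissions are
# grouped into roles, users are assigned a role at a scope, and an action
# is allowed only if the assigned role grants it. Role names, users, and
# scopes here are illustrative, not Azure built-in roles.

ROLES = {
    "reader": {"read"},
    "contributor": {"read", "write"},
    "owner": {"read", "write", "assign_roles"},
}

# Role assignments: (user, scope) -> role name.
ASSIGNMENTS = {
    ("alice", "rg-web"): "owner",
    ("bob", "rg-web"): "reader",
}

def is_authorized(user, scope, action):
    """Check whether the user's role at this scope permits the action."""
    role = ASSIGNMENTS.get((user, scope))
    return role is not None and action in ROLES[role]
```

In Azure, the platform performs this evaluation in the management plane; the sketch only shows the underlying role-to-permission mapping idea.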
Next steps
Identity is just one of the core infrastructure components requiring architectural decisions during a cloud
adoption process. Visit the decision guides overview to learn about alternative patterns or models used when
making design decisions for other types of infrastructure.
Architectural decision guides
Policy enforcement decision guide
Defining organizational policy is not effective unless it can be enforced across your organization. A key aspect of
planning any cloud migration is determining how best to combine tools provided by the cloud platform with
your existing IT processes to maximize policy compliance across your entire cloud estate.
Jump to: Baseline best practices | Policy compliance monitoring | Policy enforcement | Cross-organization policy |
Automated enforcement
As your cloud estate grows, you will be faced with a corresponding need to maintain and enforce policy across a
larger array of resources and subscriptions. As your estate gets larger and your organization's policy
requirements increase, the scope of your policy enforcement processes needs to expand to ensure consistent
policy adherence and fast violation detection.
Platform-provided policy enforcement mechanisms at the resource or subscription level are usually sufficient for
smaller cloud estates. Larger deployments justify a larger enforcement scope and may need to take advantage of
more sophisticated enforcement mechanisms involving deployment standards, resource grouping and
organization, and integrating policy enforcement with your logging and reporting systems.
The primary factors in determining the scope of your policy enforcement processes are your organization's cloud
governance requirements, the size and nature of your cloud estate, and how your organization is reflected in your
subscription design. An increase in size of your estate or a greater need to centrally manage policy enforcement
can both justify an increase in enforcement scope.
Policy enforcement
In Azure, you can apply configuration settings and resource creation rules at the management group,
subscription, or resource group level to help ensure policy alignment.
Azure Policy is an Azure service for creating, assigning, and managing policies. These policies enforce different
rules and effects over your resources, so those resources stay compliant with your corporate standards and
service level agreements. Azure Policy evaluates your resources for noncompliance with assigned policies. For
example, you might want to limit the SKU size of virtual machines in your environment. After implementing a
corresponding policy, new and existing resources are evaluated for compliance. With the right policy, existing
resources can be brought into compliance.
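The SKU-limiting example above can be illustrated with a small evaluation sketch. The SKU names and policy shape are simplified stand-ins for Azure Policy's actual JSON definition format:

```python
# Simplified sketch of policy evaluation: a policy allows only certain
# VM SKU sizes, and every resource is checked for compliance. This
# mirrors the idea of Azure Policy, not its actual JSON definitions.

allowed_skus = {"Standard_B2s", "Standard_D2s_v3"}

resources = [
    {"name": "vm-web-01", "sku": "Standard_B2s"},
    {"name": "vm-batch-01", "sku": "Standard_E64s_v3"},  # oversized
]

def evaluate(resources, allowed):
    """Return the names of resources that violate the SKU policy."""
    return [r["name"] for r in resources if r["sku"] not in allowed]

# Both new and existing resources are evaluated against the policy.
noncompliant = evaluate(resources, allowed_skus)
```
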
Cross-organization policy
As your cloud estate grows to span many subscriptions that require enforcement, you will need to focus on a
cloud-estate-wide enforcement strategy to ensure policy consistency.
Your subscription design must account for policy in relation to your organizational structure. In addition to
helping support complex organization within your subscription design, Azure management groups can be used
to assign Azure Policy rules across multiple subscriptions.
Automated enforcement
While standardized deployment templates are effective at a smaller scale, Azure Blueprints allows large-scale
standardized provisioning and deployment orchestration of Azure solutions. Workloads across multiple
subscriptions can be deployed with consistent policy settings for any resources created.
For IT environments integrating cloud and on-premises resources, you may need to use logging and reporting
systems to provide hybrid monitoring capabilities. Your third-party or custom operational monitoring systems
may offer additional policy enforcement capabilities. For larger or more mature cloud estates, consider how best
to integrate these systems with your cloud assets.
Next steps
Policy enforcement is just one of the core infrastructure components requiring architectural decisions during a
cloud adoption process. Visit the decision guides overview to learn about alternative patterns or models used
when making design decisions for other types of infrastructure.
Architectural decision guides
Resource consistency decision guide
Azure subscription design defines how you organize your cloud assets in relation to your organization's
structure, accounting practices, and workload requirements. In addition to this level of structure, addressing
your organizational governance policy requirements across your cloud estate requires the ability to
consistently organize, deploy, and manage resources within a subscription.
Jump to: Basic grouping | Deployment consistency | Policy consistency | Hierarchical consistency | Automated
consistency
Decisions regarding the level of your cloud estate's resource consistency requirements are primarily driven by
these factors: post-migration digital estate size, business or environmental requirements that don't fit neatly
within your existing subscription design approaches, or the need to enforce governance over time after
resources have been deployed.
As these factors increase in importance, so do the benefits of ensuring consistent deployment, grouping, and
management of cloud-based resources. Achieving more advanced levels of resource consistency to meet
increasing requirements takes more effort spent on automation, tooling, and consistency enforcement, which in
turn results in additional time spent on change management and tracking.
Basic grouping
In Azure, resource groups are a core resource organization mechanism to logically group resources within a
subscription.
Resource groups act as containers for resources with a common lifecycle as well as shared management
constraints such as policy or role-based access control (RBAC ) requirements. Resource groups can't be nested,
and resources can only belong to one resource group. All control plane actions act on all resources in a
resource group. For example, deleting a resource group also deletes all resources within that group. The
preferred pattern for resource group management is to consider:
1. Are the contents of the resource group developed together?
2. Are the contents of the resource group managed, updated, and monitored together and done so by the
same people or teams?
3. Are the contents of the resource group retired together?
If you answered no to any of these questions, the resource in question is best placed in another resource group.
IMPORTANT
Resource groups are also region specific; however, it is common for resources to be in different regions within the same
resource group because they are managed together as described above. For more information on region selection, see
the Regions decision guide.
Deployment consistency
Building on top of the base resource grouping mechanism, the Azure platform provides a system for using
templates to deploy your resources to the cloud environment. You can use templates to create consistent
organization and naming conventions when deploying workloads, enforcing those aspects of your resource
deployment and management design.
Azure Resource Manager templates allow you to repeatedly deploy your resources in a consistent state using a
predetermined configuration and resource group structure. Resource Manager templates help you define a set
of standards as a basis for your deployments.
For example, you can have a standard template for deploying a web server workload that contains two virtual
machines as web servers combined with a load balancer to distribute traffic between the servers. You can then
reuse this template to create a structurally identical set of virtual machines and a load balancer whenever this type
of workload is needed, only changing the deployment name and IP addresses involved.
You can also programmatically deploy these templates and integrate them with your CI/CD systems.
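The reusable web server template described above can be sketched as a parameterized generator. The structure below is a simplified stand-in, not the actual Azure Resource Manager template schema:

```python
# Sketch of template-driven deployment consistency: one parameterized
# function emits the same resource structure (two web servers plus a
# load balancer) for every deployment, varying only the name and
# addressing. This is a stand-in for an ARM template, not its schema.

def web_workload_template(deployment_name, subnet_prefix):
    return {
        "resourceGroup": f"rg-{deployment_name}",
        "resources": [
            {"type": "virtualMachine", "name": f"{deployment_name}-web-1"},
            {"type": "virtualMachine", "name": f"{deployment_name}-web-2"},
            {"type": "loadBalancer", "name": f"{deployment_name}-lb",
             "frontendSubnet": subnet_prefix},
        ],
    }

# Two deployments are structurally identical; only names and
# addresses differ.
shop = web_workload_template("shop", "10.1.0.0/24")
blog = web_workload_template("blog", "10.2.0.0/24")
```
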
Policy consistency
To ensure that governance policies are applied when resources are created, part of resource grouping design
involves using a common configuration when deploying resources.
By combining resource groups and standardized Resource Manager templates, you can enforce standards for
what settings are required in a deployment and what Azure Policy rules are applied to each resource group or
resource.
For example, you may have a requirement that all virtual machines deployed within your subscription connect
to a common subnet managed by your central IT team. You can create a standard template for deploying
workload VMs to create a separate resource group for the workload and deploy the required VMs there. This
resource group would have a policy rule to only allow network interfaces within the resource group to be
joined to the shared subnet.
For a more in-depth discussion of enforcing your policy decisions within a cloud deployment, see Policy
enforcement.
Hierarchical consistency
Resource groups allow you to support additional levels of hierarchy within a subscription, applying Azure
Policy rules and access controls at the resource group level. However, as the size of
your cloud estate grows, you may need to support more complicated cross-subscription governance
requirements than can be supported using the Azure Enterprise Agreement's
Enterprise/Department/Account/Subscription hierarchy.
Azure management groups allow you to organize subscriptions into more sophisticated organizational
structures by grouping subscriptions in a hierarchy distinct from your enterprise agreement's hierarchy. This
alternate hierarchy allows you to apply access control and policy enforcement mechanisms across multiple
subscriptions and the resources they contain. Management group hierarchies can be used to match your cloud
estate's subscriptions with operations or business governance requirements. For more information, see the
subscription decision guide.
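The inheritance behavior described above, where a policy assigned to a management group applies to every subscription beneath it, can be sketched as follows; the group and subscription names are hypothetical:

```python
# Sketch of management-group policy inheritance: a policy assigned to a
# management group applies to every subscription beneath it in the
# hierarchy. Group and subscription names are hypothetical.

parent = {
    "sub-hr": "mg-corp",
    "sub-finance": "mg-corp",
    "mg-corp": "mg-root",
    "sub-sandbox": "mg-root",
}

policies = {
    "mg-root": ["require-tags"],
    "mg-corp": ["allowed-regions"],
}

def effective_policies(node):
    """Collect policies assigned at the node and at every ancestor."""
    result = []
    while node is not None:
        result.extend(policies.get(node, []))
        node = parent.get(node)
    return result
```

A subscription deep in the hierarchy accumulates every policy on the path back to the root, which is what makes cross-subscription enforcement practical at scale.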
Automated consistency
For large cloud deployments, global governance becomes both more important and more complex. It is crucial
to automatically apply and enforce governance requirements when deploying resources, as well as meet
updated requirements for existing deployments.
Azure Blueprints enable organizations to support global governance of large cloud estates in Azure. Blueprints
move beyond the capabilities provided by standard Azure Resource Manager templates to create complete
deployment orchestrations capable of deploying resources and applying policy rules. Blueprints support
versioning, the ability to update all subscriptions where the blueprint was used, and the ability to lock down
deployed subscriptions to avoid the unauthorized creation and modification of resources.
These deployment packages allow IT and development teams to rapidly deploy new workloads and
networking assets that comply with changing organizational policy requirements. Blueprints can also be
integrated into CI/CD pipelines to apply revised governance standards to deployments as they are updated.
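Conceptually, a blueprint is a versioned bundle of artifacts assigned across subscriptions. This sketch is illustrative only and does not reflect the Azure Blueprints API:

```python
# Conceptual sketch of blueprint-style orchestration: a versioned bundle
# of artifacts (templates and policy assignments) applied uniformly
# across subscriptions. Names and structure are illustrative only.

blueprint = {
    "name": "corp-baseline",
    "version": 2,
    "artifacts": ["network-template", "require-tags-policy"],
}

def assign(blueprint, subscriptions):
    """Record which blueprint name and version each subscription
    received, so outdated subscriptions can be found and updated."""
    return {sub: (blueprint["name"], blueprint["version"])
            for sub in subscriptions}

assignments = assign(blueprint, ["sub-hr", "sub-finance"])
```
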
Next steps
Resource consistency is just one of the core infrastructure components requiring architectural decisions during
a cloud adoption process. Visit the decision guides overview to learn about alternative patterns or models used
when making design decisions for other types of infrastructure.
Architectural decision guides
Resource naming and tagging decision guide
Organizing cloud-based resources is one of the most important tasks for IT, unless you only have simple
deployments. Organizing your resources serves three primary purposes:
Resource Management: Your IT teams will need to quickly find resources associated with specific workloads,
environments, ownership groups, or other important information. Organizing resources is critical to assigning
organizational roles and access permissions for resource management.
Automation: In addition to making resources easier for IT to manage, a proper organizational scheme allows
you to take advantage of automation as part of resource creation, operational monitoring, and the creation of
DevOps processes.
Accounting: Making business groups aware of cloud resource consumption requires IT to understand what
workloads and teams are using which resources. To support approaches such as chargeback and showback
accounting, cloud resources need to be organized to reflect ownership and usage.
Jump to: Baseline naming conventions | Resource tagging patterns | Learn more
Your tagging approach can be simple or complex, with the emphasis ranging from supporting IT teams managing
cloud workloads to integrating information relating to all aspects of the business.
An IT-aligned tagging focus, such as tagging based on workload, function, or environment, will reduce the
complexity of monitoring assets and make management decisions based on operational requirements much
easier.
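An IT-aligned tagging standard of this kind can be enforced with a simple validation step at deployment time. The required tag set below is a hypothetical example:

```python
# Sketch of enforcing a tagging standard at deployment time: every
# resource must carry a required set of tags before it is created.
# The required tag names here are a hypothetical example standard.

REQUIRED_TAGS = {"environment", "workload", "cost-center"}

def missing_tags(resource_tags):
    """Return the set of required tags the resource does not carry."""
    return REQUIRED_TAGS - set(resource_tags)

# A compliant resource has no missing tags; a noncompliant one is
# reported with exactly which tags it lacks.
compliant = missing_tags({"environment": "prod", "workload": "web",
                          "cost-center": "sales"})
noncompliant = missing_tags({"environment": "dev"})
```
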
Tagging schemes that include a business-aligned focus, such as accounting, business ownership, or business
criticality, may require a larger time investment to create tagging standards that reflect business interests and
maintain those standards over time. However, the result of this process is a tagging system providing an
improved ability to account for costs and value of IT assets to the overall business. This association of an asset's
business value to its operational cost is one of the first steps in changing the cost center perception of IT within
your wider organization.
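The showback accounting described above amounts to aggregating resource costs by an ownership tag. The tag values and cost figures below are hypothetical:

```python
# Sketch of tag-based showback: aggregate monthly resource costs by a
# "cost-center" tag so business groups can see their cloud consumption.
# Tag values and cost figures are hypothetical.
from collections import defaultdict

resources = [
    {"name": "vm-web-01", "tags": {"cost-center": "marketing"}, "monthly_cost": 120.0},
    {"name": "sql-crm", "tags": {"cost-center": "sales"}, "monthly_cost": 300.0},
    {"name": "vm-web-02", "tags": {"cost-center": "marketing"}, "monthly_cost": 80.0},
]

def showback(resources, tag="cost-center"):
    """Sum monthly cost per value of the given tag; untagged costs are
    grouped separately so gaps in the tagging standard stay visible."""
    totals = defaultdict(float)
    for resource in resources:
        owner = resource["tags"].get(tag, "untagged")
        totals[owner] += resource["monthly_cost"]
    return dict(totals)
```
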
Learn more
For more information about naming and tagging in Azure, see:
Naming conventions for Azure resources. Refer to this guidance for recommended naming conventions for
Azure resources.
Use tags to organize your Azure resources. You can apply tags in Azure at both the resource group and
individual resource level, giving you flexibility in the granularity of any accounting reports based on applied
tags.
Next steps
Resource tagging is just one of the core infrastructure components requiring architectural decisions during a
cloud adoption process. Visit the decision guides overview to learn about alternative patterns or models used
when making design decisions for other types of infrastructure.
Architectural decision guides
Encryption decision guide
Encrypting data protects it against unauthorized access. Properly implemented encryption policy provides
additional layers of security for your cloud-based workloads and guards against attackers and other unauthorized
users from both inside and outside your organization and networks.
Jump to: Key management | Data encryption | Learn more
Cloud encryption strategy focuses on corporate policy and compliance mandates. Encrypting resources is
desirable, and many Azure services such as Azure Storage and Azure SQL Database enable encryption by
default. However, encryption has costs that can increase latency and overall resource usage.
For demanding workloads, striking the correct balance between encryption and performance, and determining
how data and traffic is encrypted can be essential. Encryption mechanisms can vary in cost and complexity, and
both technical and policy requirements can influence your decisions on how encryption is applied and how you
store and manage critical secrets and keys.
Corporate policy and third-party compliance are the biggest drivers when planning an encryption strategy. Azure
provides multiple standard mechanisms that can meet common requirements for encrypting data, whether at rest
or in transit. However, for policies and compliance requirements that demand tighter controls, such as
standardized secrets and key management, encryption in-use, or data-specific encryption, you will need to
develop a more sophisticated encryption strategy to support these requirements.
Key management
Encryption of data in the cloud depends on the secure storage, management, and operational use of encryption
keys. A key management system is critical to your organization's ability to create, store, and manage
cryptographic keys, as well important passwords, connection strings, and other IT confidential information.
Modern key management systems such as Azure Key Vault support storage and management of software
protected keys for dev and test usage and hardware security module (HSM) protected keys for maximum
protection of production workloads or sensitive data.
When planning a cloud migration, decide how you will store and manage encryption keys, certificates, and
secrets, which are critical for creating secure and manageable cloud deployments. The following are common
key management patterns:
Cloud-native
With cloud-native key management, all keys and secrets are generated, managed, and stored in a cloud-based
vault such as Azure Key Vault. This approach simplifies many IT tasks related to key management, such as key
backup, storage, and renewal.
Cloud-native assumptions: Using a cloud-native key management system includes these assumptions:
You trust the cloud key management solution with creating, managing, and hosting your organization's secrets
and keys.
You enable all on-premises applications and services that rely on accessing encryption services or secrets to
access the cloud key management system.
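The core operations of a key management system, creating, versioning, and rotating secrets, can be sketched with an in-memory stand-in. This is a conceptual illustration only, not the Azure Key Vault API:

```python
# Conceptual in-memory stand-in for a key vault: secrets are stored by
# name with version history, and rotation appends a new version while
# older versions stay retrievable. This is not the Azure Key Vault API.
import secrets

class MiniVault:
    def __init__(self):
        self._store = {}  # secret name -> list of versions, latest last

    def set_secret(self, name, value=None):
        """Store a new version; generate a random value if none given."""
        if value is None:
            value = secrets.token_bytes(32)
        self._store.setdefault(name, []).append(value)
        return len(self._store[name])  # version number, starting at 1

    def get_secret(self, name, version=None):
        """Return the latest version, or a specific older version."""
        versions = self._store[name]
        return versions[-1] if version is None else versions[version - 1]

vault = MiniVault()
vault.set_secret("db-connection", b"Server=old;...")
vault.set_secret("db-connection", b"Server=new;...")  # rotation
```
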
Bring your own key
With a bring your own key approach, you generate keys on dedicated HSM hardware within your on-premises
environment, then securely transfer these keys to a cloud-based management system such as Azure Key Vault
for use with your cloud-hosted resources.
Bring your own key assumptions: Generating keys on-premises and using them with a cloud-based key
management system includes these assumptions:
You trust the underlying security and access control infrastructure of the cloud platform for hosting and using
your keys and secrets.
Your cloud-hosted applications or services are able to access and use keys and secrets in a robust and secure
way.
You are required by regulatory or organizational policy to keep the creation and management of your
organization's secrets and keys on-premises.
On-premises (hold your own key)
Certain scenarios might have regulatory, policy, or technical reasons prohibiting the storage of keys on a cloud-
based key management system. If so, you must generate keys using on-premises hardware, store and manage
them using an on-premises key management system, and establish a way for cloud-based resources to access
these keys for encryption purposes. Note that holding your own key might not be compatible with all Azure-
based services.
On-premises key management assumptions: Using an on-premises key management system includes these
assumptions:
You are required by regulatory or organizational policy to keep the creation, management, and hosting of your
organization's secrets and keys on-premises.
Any cloud-based applications or services that rely on accessing encryption services or secrets can access the
on-premises key management system.
Data encryption
When planning your encryption policy, consider the following states of data, each with different encryption needs:
Data in transit
Data in transit is data moving between resources on your internal networks, between your datacenters and
external networks, or over the internet.
Data in transit is usually encrypted by requiring SSL/TLS protocols for network traffic. Always encrypt traffic
between your cloud-hosted resources and external networks or the public internet. PaaS resources typically
enforce SSL/TLS encryption by default. Your cloud adoption teams and workload owners should consider
enforcing encryption for traffic between IaaS resources hosted inside your virtual networks.
Assumptions about encrypting data in transit: Implementing proper encryption policy for data in transit
assumes the following:
All publicly accessible endpoints in your cloud environment will communicate with the public internet using
SSL/TLS protocols.
When connecting cloud networks with on-premises or other external networks over the public internet, you will
use encrypted VPN protocols.
When connecting cloud networks with on-premises or other external networks using a dedicated WAN
connection such as ExpressRoute, you will use a VPN or other encryption appliance on-premises paired with a
corresponding virtual VPN or encryption appliance deployed to your cloud network.
If you have sensitive data that shouldn't be included in traffic logs or other diagnostics reports visible to IT
staff, you will encrypt all traffic between resources in your virtual network.
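On the client side, enforcing encrypted transport can be illustrated with Python's standard ssl module; requiring a minimum protocol version is one concrete way to rule out legacy protocols:

```python
# Illustration of enforcing transport encryption on the client side:
# create a TLS context that validates certificates and hostnames and
# refuses anything older than TLS 1.2, ruling out legacy protocols.
import ssl

context = ssl.create_default_context()  # verifies certificates by default
context.minimum_version = ssl.TLSVersion.TLSv1_2

# The context would be passed to a client, for example:
# http.client.HTTPSConnection(host, context=context)
```
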
Data at rest
Data at rest represents any data not being actively moved or processed, including files, databases, virtual machine
drives, PaaS storage accounts, or similar assets. Encrypting stored data protects virtual devices or files against
unauthorized access either from external network penetration, rogue internal users, or accidental releases.
PaaS storage and database resources generally enforce encryption by default. IaaS resources can be secured by
encrypting data at the virtual disk level or by encrypting the entire storage account hosting your virtual drives. All
of these assets can make use of either Microsoft-managed or customer-managed keys stored in Azure Key Vault.
Encryption for data at rest also encompasses more advanced database encryption techniques, such as column-
level and row-level encryption, which provide much more control over exactly what data is being secured.
Your overall policy and compliance requirements, the sensitivity of the data being stored, and the performance
requirements of your workloads should determine which assets require encryption.
Assumptions about encrypting data at rest
Encrypting data at rest assumes the following:
You are storing data that is not meant for public consumption.
Your workloads can accept the added latency cost of disk encryption.
Data in use
Encryption for data in use involves securing data in nonpersistent storage, such as RAM or CPU caches. It relies
on technologies such as full memory encryption and enclave technologies like Intel's Software Guard Extensions
(SGX). It also includes cryptographic techniques, such as homomorphic encryption, that can be used to create
secure, trusted execution environments.
Assumptions about encrypting data in use: Encrypting data in use assumes the following:
You are required to maintain data ownership separate from the underlying cloud platform at all times, even at
the RAM and CPU level.
Learn more
For more information about encryption and key management in Azure, see:
Azure encryption overview. A detailed description of how Azure uses encryption to secure both data at rest
and data in transit.
Azure Key Vault. Key Vault is the primary key management system for storing and managing cryptographic
keys, secrets, and certificates within Azure.
Azure Data Security and Encryption Best Practices. A discussion of Azure data security and encryption best
practices.
Confidential computing in Azure. Azure's confidential computing initiative provides tools and technology to
create trusted execution environments or other encryption mechanisms to secure data in use.
Next steps
Encryption is just one of the core infrastructure components requiring architectural decisions during a cloud
adoption process. Visit the decision guides overview to learn about alternative patterns or models used when
making design decisions for other types of infrastructure.
Architectural decision guides
Software Defined Networking decision guide
Software Defined Networking (SDN) is a network architecture designed to allow virtualized networking
functionality that can be centrally managed, configured, and modified through software. SDN enables the
creation of cloud-based networks using the virtualized equivalents to physical routers, firewalls, and other
networking devices used in on-premises networks. SDN is critical to creating secure virtual networks on public
cloud platforms such as Azure.
Jump to: PaaS-only | Cloud-native | Cloud DMZ | Hybrid | Hub and spoke model | Learn more
SDN provides several options with varying degrees of pricing and complexity. The above discovery guide
provides a reference to quickly personalize these options to best align with specific business and technology
strategies.
The inflection point in this guide depends on several key decisions that your cloud strategy team has made
before making decisions about networking architecture. Most important among these are decisions involving
your digital estate definition and subscription design (which may also require inputs from decisions made
related to your cloud accounting and global markets strategies).
Small single-region deployments of fewer than 1,000 VMs are less likely to be significantly affected by this
inflection point. Conversely, large adoption efforts with more than 1,000 VMs, multiple business units, or
multiple geopolitical markets, could be substantially affected by your SDN decision and this key inflection point.
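The basic SDN building block, an address space carved into non-overlapping subnets, can be sketched with Python's standard ipaddress module; the address ranges below are hypothetical:

```python
# Sketch of a software-defined network layout: a virtual network's
# address space is carved into subnets, which are checked for
# containment and overlap. Address ranges are hypothetical.
import ipaddress

vnet = ipaddress.ip_network("10.0.0.0/16")
subnets = {
    "frontend": ipaddress.ip_network("10.0.1.0/24"),
    "backend": ipaddress.ip_network("10.0.2.0/24"),
}

# Every subnet must fall inside the virtual network's address space,
# and no two subnets may overlap.
all_contained = all(s.subnet_of(vnet) for s in subnets.values())
no_overlap = not subnets["frontend"].overlaps(subnets["backend"])
```
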
Learn more
For more information about Software Defined Networking in Azure, see:
Azure Virtual Network. On Azure, the core SDN capability is provided by Azure Virtual Network, which acts
as a cloud analog to physical on-premises networks. Virtual networks also act as a default isolation boundary
between resources on the platform.
Azure best practices for network security. Recommendations from the Azure Security team on how to
configure your virtual networks to minimize security vulnerabilities.
Next steps
Software defined networking is just one of the core infrastructure components requiring architectural decisions
during a cloud adoption process. Visit the decision guides overview to learn about alternative patterns or models
used when making design decisions for other types of infrastructure.
Architectural decision guides
Software Defined Networking: PaaS-only
When you implement a platform as a service (PaaS) resource, the deployment process automatically creates an
assumed underlying network with a limited number of controls over that network, including load balancing, port
blocking, and connections to other PaaS services.
In Azure, several PaaS resource types can be deployed into or connected to a virtual network, allowing these
resources to integrate with your existing virtual networking infrastructure. Other services, such as App Service
Environments, Azure Kubernetes Service (AKS), and Service Fabric, must be deployed within a virtual network.
However, in many cases a PaaS-only networking architecture, relying only on the default native networking
capabilities provided by PaaS resources, is sufficient to meet a workload's connectivity and traffic management
requirements.
If you are considering a PaaS-only networking architecture, be sure you validate that the required assumptions
align with your requirements.
PaaS-only assumptions
Deploying a PaaS-only networking architecture assumes the following:
The application being deployed is a standalone application or depends only on other PaaS resources that do
not require a virtual network.
Your IT operations teams can update their tools, training, and processes to support management,
configuration, and deployment of standalone PaaS applications.
The PaaS application is not part of a broader cloud migration effort that will include IaaS resources.
These assumptions are minimum qualifiers aligned to deploying a PaaS-only network. While this approach may
align with the requirements of a single application deployment, each cloud adoption team should consider these
long-term questions:
Will this deployment expand in scope or scale to require access to other non-PaaS resources?
Are other PaaS deployments planned beyond the current solution?
Does the organization have plans for other future cloud migrations?
The answers to these questions would not preclude a team from choosing a PaaS-only option, but they should be
considered before making a final decision.
Software Defined Networking: Cloud-native
A cloud-native virtual network is required when deploying IaaS resources such as virtual machines to a cloud
platform. Access to virtual networks from external sources, such as the web, needs to be explicitly provisioned.
These types of virtual networks support the creation of subnets, routing rules, and virtual firewall and traffic
management devices.
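As a concrete illustration of the subnet planning such a virtual network supports, the following sketch uses Python's standard `ipaddress` module to divide a private address space into workload subnets. The address range and subnet names are illustrative assumptions, not required values, and the usable-host arithmetic is a rough approximation (cloud platforms typically reserve a few additional addresses per subnet).

```python
import ipaddress

# Hypothetical private address space for a cloud-native virtual network.
vnet = ipaddress.ip_network("10.1.0.0/16")

# Carve the space into /24 subnets for workload tiers (names are illustrative).
subnets = dict(zip(["frontend", "backend", "management"],
                   vnet.subnets(new_prefix=24)))

for name, subnet in subnets.items():
    # Subtract network and broadcast addresses for a rough usable-host count.
    print(f"{name}: {subnet} ({subnet.num_addresses - 2} usable hosts)")
```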
A cloud-native virtual network has no dependencies on your organization's on-premises or other noncloud
resources to support the cloud-hosted workloads. All required resources are provisioned either in the virtual
network itself or by using managed PaaS offerings.
Cloud-native assumptions
Deploying a cloud-native virtual network assumes the following:
The workloads you deploy to the virtual network have no dependencies on applications or services that are
accessible only from inside your on-premises network. Unless they provide endpoints accessible over the
public internet, applications and services hosted internally on-premises are not usable by resources hosted on
a cloud platform.
Your workload's identity management and access control depends on the cloud platform's identity services or
IaaS servers hosted in your cloud environment. You will not need to directly connect to identity services hosted
on-premises or other external locations.
Your identity services do not need to support single sign-on (SSO) with on-premises directories.
Cloud-native virtual networks have no external dependencies. This makes them simple to deploy and configure,
and as a result this architecture is often the best choice for experiments or other smaller self-contained or rapidly
iterating deployments.
Additional issues your cloud adoption teams should consider when discussing a cloud-native virtual networking
architecture include:
Existing workloads designed to run in an on-premises datacenter may need extensive modification to take
advantage of cloud-based functionality, such as storage or authentication services.
Cloud-native networks are managed solely through the cloud platform management tools, and therefore may
lead to management and policy divergence from your existing IT standards as time goes on.
Next steps
For more information about cloud-native virtual networking in Azure, see:
Azure Virtual Network: How-to guides. Newly created Azure Virtual Networks are cloud-native by default. Use
these guides to help plan the design and deployment of your virtual networks.
Subscription limits: Networking. Any single virtual network and connected resources can only exist within a
single subscription, and are bound by subscription limits.
Software Defined Networking: Cloud DMZ
The Cloud DMZ network architecture allows limited access between your on-premises and cloud-based networks,
using a virtual private network (VPN) to connect the networks. Although a DMZ model is commonly used when
you want to secure external access to a network, the Cloud DMZ architecture discussed here is intended
specifically to secure access to the on-premises network from cloud-based resources and vice versa.
This architecture is designed to support scenarios where your organization wants to start integrating cloud-based
workloads with on-premises workloads but may not have fully matured cloud security policies or acquired a
secure dedicated WAN connection between the two environments. As a result, cloud networks should be treated
like a demilitarized zone to ensure on-premises services are secure.
The DMZ deploys network virtual appliances (NVAs) to implement security functionality such as firewalls and
packet inspection. Traffic passing between on-premises and cloud-based applications or services must pass
through the DMZ where it can be audited. VPN connections and the rules determining what traffic is allowed
through the DMZ network are strictly controlled by IT security teams.
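The strictly controlled, default-deny rule evaluation described above can be sketched in a few lines. The zones, ports, and rule format below are hypothetical simplifications of what an NVA or firewall actually enforces, intended only to show the pattern.

```python
# Explicit allow rules controlled by IT security; anything else is denied.
# Zone labels and ports are illustrative assumptions.
RULES = [
    ("cloud", "on-premises", 443, "allow"),   # HTTPS from cloud to on-premises
    ("on-premises", "cloud", 1433, "allow"),  # SQL from on-premises to cloud
]

def is_allowed(src_zone, dst_zone, port):
    # Default-deny: traffic passes the DMZ only if an explicit rule allows it.
    return any(rule == (src_zone, dst_zone, port, "allow") for rule in RULES)

print(is_allowed("cloud", "on-premises", 443))  # allowed by the first rule
print(is_allowed("cloud", "on-premises", 22))   # no matching rule: denied
```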
Learn more
For more information about implementing a Cloud DMZ in Azure, see:
Implement a DMZ between Azure and your on-premises datacenter. This article discusses how to implement a
secure hybrid network architecture in Azure.
Software Defined Networking: Hybrid network
The hybrid cloud network architecture allows virtual networks to access your on-premises resources and services
and vice versa, using a dedicated WAN connection such as ExpressRoute or another connection method to directly
connect the networks.
Building on the cloud-native virtual network architecture, a hybrid virtual network is isolated when initially
created. Adding connectivity to the on-premises environment grants access to and from the on-premises network,
although all other inbound traffic targeting resources in the virtual network needs to be explicitly allowed. You can
secure the connection using virtual firewall devices and routing rules to limit access, or you can specify exactly
what services can be accessed between the two networks using cloud-native routing features or by deploying
network virtual appliances (NVAs) to manage traffic.
Although the hybrid networking architecture supports VPN connections, dedicated WAN connections like
ExpressRoute are preferred due to higher performance and increased security.
Hybrid assumptions
Deploying a hybrid virtual network includes the following assumptions:
Your IT security teams have aligned on-premises and cloud-based network security policies to ensure cloud-
based virtual networks can be trusted to communicate directly with on-premises systems.
Your cloud-based workloads require access to storage, applications, and services hosted on your on-premises
or third-party networks, or your users or applications in your on-premises environment need access to cloud-
hosted resources.
You need to migrate existing applications and services that depend on on-premises resources, but don't want
to expend the resources on redevelopment to remove those dependencies.
Connecting your on-premises networks to cloud resources over VPN or dedicated WAN is not prevented by
corporate policy, data sovereignty requirements, or other regulatory compliance issues.
Your workloads either do not require multiple subscriptions to bypass subscription resource limits, or your
workloads involve multiple subscriptions but do not require central management of connectivity or shared
services used by resources spread across multiple subscriptions.
Your cloud adoption teams should consider the following issues when looking at implementing a hybrid virtual
networking architecture:
Connecting on-premises networks with cloud networks increases the complexity of your security requirements.
Both networks must be secured against external vulnerabilities and unauthorized access from both sides of the
hybrid environment.
Scaling the number and size of workloads within a hybrid cloud environment can add significant complexity to
routing and traffic management.
You will need to develop compatible management and access control policies to maintain consistent
governance throughout your organization.
Learn more
For more information about hybrid networking in Azure, see:
Hybrid network reference architecture. Azure hybrid virtual networks use either an ExpressRoute circuit or
Azure VPN to connect your virtual network with your organization's existing IT assets not hosted in Azure. This
article discusses the options for creating a hybrid network in Azure.
Software Defined Networking: Hub and spoke
The hub and spoke networking model organizes your Azure-based cloud network infrastructure into multiple
connected virtual networks. This model allows you to more efficiently manage common communication or
security requirements and deal with potential subscription limitations.
In the hub and spoke model, the hub is a virtual network that acts as a central location for managing external
connectivity and hosting services used by multiple workloads. The spokes are virtual networks that host
workloads and connect to the central hub through virtual network peering.
All traffic passing in or out of the workload spoke networks is routed through the hub network where it can be
routed, inspected, or otherwise managed by centrally managed IT rules or processes.
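The routing behavior this implies can be sketched as a tiny peering model: spokes are peered only with the hub, so any spoke-to-spoke path must traverse the hub. Network names and the simplified path logic below are illustrative assumptions, not an Azure routing implementation.

```python
# Peering relationships: each spoke (and the on-premises link) connects only
# to the hub. Names are hypothetical.
peerings = {
    "hub": {"spoke-a", "spoke-b", "on-premises"},
    "spoke-a": {"hub"},
    "spoke-b": {"hub"},
    "on-premises": {"hub"},
}

def route(src, dst):
    """Return the path between two networks under hub-and-spoke peering."""
    if dst in peerings[src]:
        return [src, dst]
    # Spokes reach each other (and on-premises) only through the hub,
    # where traffic can be inspected or managed centrally.
    return [src, "hub", dst] if dst in peerings["hub"] else None

print(route("spoke-a", "spoke-b"))  # ['spoke-a', 'hub', 'spoke-b']
```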
This model aims to address each of the following concerns:
Cost savings and management efficiency. Centralizing services that can be shared by multiple workloads,
such as network virtual appliances (NVAs) and DNS servers, in a single location allows IT to minimize
redundant resources and management effort across multiple workloads.
Overcoming subscription limits. Large cloud-based workloads may require the use of more resources than
are allowed within a single Azure subscription (see subscription limits). Peering workload virtual networks
from different subscriptions to a central hub can overcome these limits.
Separation of concerns. The ability to divide responsibility for individual workloads between central IT teams
and workload teams.
The following diagram shows an example hub and spoke architecture including centrally managed hybrid
connectivity.
The hub and spoke architecture is often used alongside the hybrid networking architecture, providing a centrally
managed connection to your on-premises environment shared between multiple workloads. In this scenario, all
traffic traveling between the workloads and on-premises passes through the hub where it can be managed and
secured.
Learn more
For examples of how to implement hub and spoke networks on Azure, see the following examples on the Azure
Reference Architectures site:
Implement a hub and spoke network topology in Azure
Implement a hub and spoke network topology with shared services in Azure
Logging and reporting decision guide
All organizations need mechanisms for notifying IT teams of performance, uptime, and security issues before
they become serious problems. A successful monitoring strategy allows you to understand how the individual
components that make up your workloads and networking infrastructure are performing. Within the context of a
public cloud migration, integrating logging and reporting with any of your existing monitoring systems, while
surfacing important events and metrics to the appropriate IT staff, is critical in ensuring your organization is
meeting uptime, security, and policy compliance goals.
Jump to: Planning your monitoring infrastructure | Cloud-native | On-premises extension | Gateway aggregation
| Hybrid monitoring (on-premises) | Hybrid monitoring (cloud-based) | Multicloud | Learn more
The inflection point when determining a cloud logging and reporting strategy is based primarily on existing
investments your organization has made in operational processes, and to some degree any requirements you
have to support a multicloud strategy.
Activities in the cloud can be logged and reported in multiple ways. Cloud-native and centralized logging are two
common managed service options that are driven by the subscription design and the number of subscriptions.
Cloud-native
If your organization currently lacks established logging and reporting systems, or if your planned deployment
does not need to be integrated with existing on-premises or other external monitoring systems, a cloud-native
SaaS solution, such as Azure Monitor, is the simplest choice.
In this scenario, all log data is recorded and stored in the cloud, while the logging and reporting tools that
process and surface information to IT staff are provided by the Azure platform and Azure Monitor.
Custom Azure Monitor-based logging solutions can be implemented ad hoc for each subscription or workload
in smaller or experimental deployments, or organized in a centralized manner to monitor log data across
your entire cloud estate.
Cloud-native assumptions: Using a cloud-native logging and reporting system assumes the following:
You do not need to integrate the log data from your cloud workloads into existing on-premises systems.
You will not be using your cloud-based reporting systems to monitor on-premises systems.
On-premises extension
It might require substantial redevelopment effort for applications and services migrating to the cloud to use
cloud-based logging and reporting solutions such as Azure Monitor. In these cases, consider allowing these
workloads to continue sending telemetry data to existing on-premises systems.
To support this approach, your cloud resources will need to be able to communicate directly with your on-
premises systems through a combination of hybrid networking and cloud hosted domain services. With this in
place, the cloud virtual network functions as a network extension of the on-premises environment. Therefore,
cloud hosted workloads can communicate directly with your on-premises logging and reporting system.
This approach capitalizes on your existing investment in monitoring tooling with limited modification to any
cloud-deployed applications or services. This is often the fastest approach to support monitoring during a lift
and shift migration. However, it won't capture log data produced by cloud-based PaaS and SaaS resources, and
it will omit any VM-related logs generated by the cloud platform itself, such as VM status. As a result, this pattern
should be a temporary solution until a more comprehensive hybrid monitoring solution is implemented.
On-premises extension assumptions:
You need to maintain log data only in your on-premises environment, either in support of technical
requirements or due to regulatory or policy requirements.
Your on-premises systems do not support hybrid logging and reporting or gateway aggregation solutions.
Your cloud-based applications can submit telemetry directly to your on-premises logging systems or
monitoring agents that submit to on-premises can be deployed to workload VMs.
Your workloads don't depend on PaaS or SaaS services that require cloud-based logging and reporting.
Gateway aggregation
For scenarios where the amount of cloud-based telemetry data is large or existing on-premises monitoring
systems need log data modified before it can be processed, a log data gateway aggregation service might be
required.
A gateway service is deployed to your cloud provider. Then, relevant applications and services are configured to
submit telemetry data to the gateway instead of a default logging system. The gateway can then process the
data: aggregating, combining, or otherwise formatting it before submitting it to your monitoring service for
ingestion and analysis.
Also, a gateway can be used to aggregate and preprocess telemetry data bound for cloud-native or hybrid
systems.
Gateway aggregation assumptions:
You expect large volumes of telemetry data from your cloud-based applications or services.
You need to format or otherwise optimize telemetry data before submitting it to your monitoring systems.
Your monitoring systems have APIs or other mechanisms available to ingest log data after processing by the
gateway.
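The aggregation step a gateway performs can be sketched as a small preprocessing function that collapses raw telemetry events into compact summary records before forwarding. The event field names ("source", "level") and the count-based summary are illustrative assumptions, not a specific gateway product's format.

```python
from collections import defaultdict

def aggregate(events):
    """Collapse raw telemetry events into one record per (source, level) pair."""
    summary = defaultdict(int)
    for event in events:
        summary[(event["source"], event["level"])] += 1
    # Forward one compact record per pair instead of one record per event.
    return [{"source": s, "level": l, "count": c}
            for (s, l), c in summary.items()]

raw = [
    {"source": "web-01", "level": "error"},
    {"source": "web-01", "level": "error"},
    {"source": "web-02", "level": "info"},
]
print(aggregate(raw))
```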
Hybrid monitoring (on-premises)
A hybrid monitoring solution combines log data from both your on-premises and cloud resources to provide an
integrated view into your IT estate's operational status.
If you have an existing investment in on-premises monitoring systems that would be difficult or costly to replace,
you might need to integrate the telemetry from your cloud workloads into preexisting on-premises monitoring
solutions. In a hybrid on-premises monitoring system, on-premises telemetry data continues to use the existing
on-premises monitoring system. Cloud-based telemetry data is either sent to the on-premises monitoring
system directly, or the data is sent to Azure Monitor, then compiled and ingested into the on-premises system at
regular intervals.
On-premises hybrid monitoring assumptions: Using an on-premises logging and reporting system for
hybrid monitoring assumes the following:
You need to use existing on-premises reporting systems to monitor cloud workloads.
You need to maintain ownership of log data on-premises.
Your on-premises management systems have APIs or other mechanisms available to ingest log data from
cloud-based systems.
TIP
As part of the iterative nature of cloud migration, transitioning from distinct cloud-native and on-premises monitoring to a
partial hybrid approach is likely as the integration of cloud-based resources and services into your overall IT estate
matures.
Learn more
Azure Monitor is the default reporting and monitoring service for Azure. It provides:
A unified platform for collecting app telemetry, host telemetry (such as VMs), container metrics, Azure
platform metrics, and event logs.
Visualization, queries, alerts, and analytical tools. It can provide insights into virtual machines, guest
operating systems, virtual networks, and workload application events.
REST APIs for integration with external services and automation of monitoring and alerting services.
Integration with many popular third-party vendors.
Next steps
Logging and reporting is just one of the core infrastructure components requiring architectural decisions during
a cloud adoption process. Visit the decision guides overview to learn about alternative patterns or models used
when making design decisions for other types of infrastructure.
Architectural decision guides
Migration tools decision guide
The strategy and tools you use to migrate an application to Azure will largely depend on your business
motivations, technology strategies, and timelines, as well as a deep understanding of the actual workload and
assets (infrastructure, apps, and data) being migrated. The following decision tree serves as high-level guidance
for selecting the best tools to use based on migration decisions. Treat this decision tree as a starting point.
The choice to migrate using platform as a service (PaaS) or infrastructure as a service (IaaS) technologies is
driven by the balance between cost, time, existing technical debt, and long-term returns. IaaS is often the fastest
path to the cloud with the least amount of required change to the workload. PaaS could require modifications to
data structures or source code, but produces substantial long-term returns in the form of reduced operating costs
and greater technical flexibility. In the following diagram, the term modernize is used to reflect a decision to
modernize an asset during migration and migrate the modernized asset to a PaaS platform.
Key questions
Answering the following questions will allow you to make decisions based on the above tree.
Would modernization of the application platform during migration prove to be a wise investment
of time, energy, and budget? PaaS technologies such as Azure App Service or Azure Functions can increase
deployment flexibility and reduce the complexity of managing virtual machines to host applications. However,
applications may require refactoring before they can take advantage of these cloud-native capabilities,
potentially adding significant time and cost to a migration effort. If your application can migrate to PaaS
technologies with a minimum of modifications, it is likely a good candidate for modernization. If extensive
refactoring would be required, a migration using IaaS-based virtual machines may be a better choice.
Would modernization of the data platform during migration prove to be a wise investment of time,
energy, and budget? As with application migration, Azure PaaS managed storage options, such as Azure
SQL Database, Cosmos DB, and Azure Storage, offer significant management and flexibility benefits, but
migrating to these services may require refactoring of existing data and the applications that use that data.
Data platforms often require significantly less refactoring than the application platform would. As such, it is
very common for the data platform to be modernized, even though the application platform remains the
same. If your data can be migrated to a managed data service with minimal changes, it is a good candidate for
modernization. Data that would require extensive time or cost to be refactored to use these PaaS services may
be better migrated using IaaS-based virtual machines to better match existing hosting capabilities.
Is your application currently running on dedicated virtual machines or sharing hosting with other
applications? Applications running on dedicated virtual machines may be more easily migrated to PaaS
hosting options than applications running on shared servers.
Will your data migration exceed your network bandwidth? Network capacity between your on-premises
data sources and Azure can be a bottleneck on data migration. If the data you need to transfer faces
bandwidth limitations that prevent efficient or timely migration, you may need to look into alternative or
offline transfer mechanisms. The Cloud Adoption Framework's article on migration replication discusses how
replication limits can affect migration efforts. As part of your migration assessment, consult your IT teams to
verify your local and WAN bandwidth is capable of handling your migration requirements. Also see the
expanded scope migration scenario for when storage requirements exceed network capacity during a
migration.
Does your application make use of an existing DevOps pipeline? In many cases Azure Pipelines can be
easily refactored to deploy applications to cloud-based hosting environments.
Does your data have complex data storage requirements? Production applications usually require data
storage that is highly available and offers always-on functionality and similar service uptime and continuity
features. Azure PaaS-based managed database options, such as Azure SQL Database, Azure Database for
MySQL, and Azure Cosmos DB, offer 99.99% uptime service-level agreements. Conversely, IaaS-based
SQL Server on Azure VMs offers a single-instance service-level agreement of 99.95%. If your data cannot be
modernized to use PaaS storage options, guaranteeing higher IaaS uptime will involve more complex data
storage scenarios such as running SQL Server Always-on clusters and continuously syncing data between
instances. This can involve significant hosting and maintenance costs, so balancing uptime requirements,
modernization effort, and overall budgetary impact is important when considering your data migration
options.
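Two of the questions above, network bandwidth and uptime, lend themselves to quick arithmetic. The sketch below estimates online transfer time and the downtime each SLA tier allows per month; the data size, link speed, and 80% utilization figure are assumptions for illustration, and real transfers also depend on protocol overhead and throttling.

```python
def transfer_days(data_tb, link_mbps, utilization=0.8):
    """Rough online-transfer estimate: how long to move data_tb terabytes."""
    bits = data_tb * 1e12 * 8                       # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400

def monthly_downtime_minutes(sla_percent):
    """Downtime allowed per 30-day month under a given SLA percentage."""
    return (1 - sla_percent / 100) * 30 * 24 * 60

# 50 TB over a 500 Mbps link at 80% utilization takes roughly 11.6 days,
# which may argue for an offline transfer mechanism instead.
print(f"{transfer_days(50, 500):.1f} days")

# 99.99% allows ~4.3 minutes/month of downtime; 99.95% allows ~21.6.
print(f"{monthly_downtime_minutes(99.99):.1f} vs "
      f"{monthly_downtime_minutes(99.95):.1f} minutes")
```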
Learn more
Cloud fundamentals: Overview of Azure compute options: Provides information on the capabilities of
Azure IaaS and PaaS compute options.
Cloud fundamentals: Choose the right data store: Discusses PaaS storage options available on the Azure
platform.
Expanded scope migration: Data requirements exceed network capacity during a migration effort:
Discusses alternative data migration mechanisms for scenarios where data migration is hindered by available
network bandwidth.
SQL Database: Choose the right SQL Server option in Azure: Discussion of the options and business
justifications for choosing to host your SQL Server workloads in a hosted infrastructure (IaaS) or a hosted
service (PaaS) environment.
Deploy a basic workload in Azure
The term workload is typically defined as an arbitrary unit of functionality, such as an application or service. It
helps to think about a workload in terms of the code artifacts that are deployed to a server, and also other services
specific to an application. This may be a useful definition for an on-premises application or service, but for cloud
applications it needs to be expanded.
In the cloud, a workload encompasses not only the code artifacts but also the cloud resources required to run
them. Cloud resources are part of the definition because of the concept known as infrastructure as code. As
you learned in how does Azure work?, resources in Azure are deployed by an orchestrator service. This
orchestrator service exposes functionality through a web API, and you can call the web API using several tools
such as PowerShell, the Azure CLI, and the Azure portal. This means that you can specify Azure resources in a
machine-readable file that can be stored along with the code artifacts associated with the application.
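To make the machine-readable file idea concrete, a minimal sketch follows: a workload's cloud resources described as data that can live in source control next to the application code. The resource type strings loosely mirror ARM template resource names, but the simplified schema and names shown are illustrative assumptions, not an exact template format.

```python
import json

# A workload described as data: the cloud resources the application needs,
# versioned alongside its code artifacts. Names and schema are illustrative.
workload = {
    "name": "example-workload",
    "resources": [
        {"type": "Microsoft.Storage/storageAccounts", "name": "examplestore"},
        {"type": "Microsoft.Web/sites", "name": "example-app"},
    ],
}

# Serialized, this definition can be stored, reviewed, and deployed like code.
print(json.dumps(workload, indent=2))
```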
This enables you to define a workload in terms of code artifacts and the necessary cloud resources, thus further
enabling you to isolate workloads. You can isolate workloads by the way resources are organized, by network
topology, or by other attributes. The goal of workload isolation is to associate a workload's specific resources to a
team, so that the team can independently manage all aspects of those resources. This enables multiple teams to
share resource management services in Azure while preventing the unintentional deletion or modification of each
other's resources.
This isolation also enables another concept, known as DevOps. DevOps combines the software development and
IT operations practices described above and adds the use of automation as much as possible. One of the
principles of DevOps is known as continuous integration and continuous delivery (CI/CD). Continuous
integration refers to the automated build processes that are run every time a developer
commits a code change. Continuous delivery refers to the automated processes that deploy this code to various
environments such as a development environment for testing or a production environment for final deployment.
Basic workload
A basic workload is typically defined as a single web application, or a virtual network (VNet) with a virtual
machine (VM).
NOTE
This guide does not cover application development. For more information about developing applications on Azure, see the
Azure Application Architecture Guide.
Regardless of whether the workload is a web application or a VM, each of these deployments requires a resource
group. A user with permission to create a resource group must do this before following the steps below.
Once you deploy a simple workload, you can learn more about the best practices for deploying a basic web
application to Azure.
Next steps
See Architectural decision guides for how to use core infrastructure components in the Azure cloud.
In early 2018, Microsoft released the Cloud Operating Model (COM). The COM was a guide that helped customers understand the
what and the why of digital transformation. This helped customers get a sense of all the areas that needed to be addressed:
business strategy, culture strategy, and technology strategy. What was not included in the COM were the specific how-to's, which
left customers wondering, "Where do we go from here?"
In October 2018, we began a review of all the models that had proliferated across the Microsoft community and found roughly 60
different cloud adoption models. A cross-Microsoft team was established to bring everything together as a dedicated engineering
"product" with defined implementations across services, sales, and marketing. This effort culminated in the creation of a single
model, the Microsoft Cloud Adoption Framework for Azure, designed to help customers understand the what and why and
provide unified guidance on the how to help them accelerate their cloud adoption. The goal of this project is to create a One
Microsoft approach to cloud adoption.
Using Cloud Operating Model practices within the Cloud Adoption Framework
For a similar approach to COM, readers should begin with one of the following:
Begin a cloud migration journey
Innovate through cloud adoption
Enable successful cloud adoption
The guidance previously provided in COM is still relevant to the Cloud Adoption Framework. The experience is different, but the
structure of the Cloud Adoption Framework is simply an expansion of that guidance. To transition from COM to the Cloud
Adoption Framework, an understanding of scope and structure is important. The following two sections describe that transition.
Scope
COM established a scope composed of the following components:
Business strategy: Establish clear business objectives and outcomes that are to be supported by cloud adoption.
Technology strategy: Align the overarching strategy to guide adoption of the cloud in alignment with the business strategy.
People strategy: Develop a strategy for training the people and changing the culture to enable business success.
The high-level scopes of the Cloud Operating Model and the Cloud Adoption Framework are similar. Business, culture, and
technology are reflected throughout the guidance and each methodology within the Cloud Adoption Framework.
NOTE
The Cloud Adoption Framework's scope has two significant points of clarity. In the Cloud Adoption Framework, business strategy
goes beyond the documentation of cloud costs—it is about understanding motivations, desired outcomes, returns, and cloud costs
to create actionable plans and clear business justifications. In the Cloud Adoption Framework, people strategy goes beyond
training to include approaches that create demonstrable cultural maturity. A few areas on the roadmap include demonstrations of
the impact of Agile management, DevOps integration, customer empathy and obsession, and lean product development
approaches.
Structure
COM included an infographic that outlined the various decisions and actions needed during a cloud adoption effort. That graphic
provided a clear means of communicating next steps and dependent decisions.
The Cloud Adoption Framework follows a similar model. However, as the actions and decisions expanded into multiple decision
trees, complexity quickly made a single graphical view appear overwhelming. To simplify the guidance and make it more
immediately actionable, the single graphic has been decomposed into the following structures.
At the executive level, the Cloud Adoption Framework has been simplified into the following three phases of adoption and two
primary governance guides.
The Azure enterprise scaffold has been integrated into the Microsoft Cloud Adoption Framework for Azure. The
goals of the enterprise scaffold are now addressed in the Ready section of the Cloud Adoption Framework. The
enterprise scaffold content has been deprecated.
To begin using the Cloud Adoption Framework, see:
Ready overview
Creating your first landing zone
Landing zone considerations.
If you need to review the deprecated content, see the Azure enterprise scaffold.
WARNING
Azure Virtual Datacenter has been integrated into the Microsoft Cloud Adoption Framework for Azure. This guidance serves as a
significant part of the foundation for the Ready and Governance methodologies within the Cloud Adoption Framework. To
support customers making this transition, the following resources have been archived and will be maintained in a separate
GitHub repository.
Archived resources
Azure Virtual Datacenter: Concepts
This e-book shows you how to deploy enterprise workloads to the Azure cloud platform, while respecting your existing security and
networking policies.
Azure is Microsoft's public cloud platform. Azure offers a large collection of services including platform as a
service (PaaS), infrastructure as a service (IaaS), and managed database service capabilities. But what exactly is
Azure, and how does it work?
Azure, like other cloud platforms, relies on a technology known as virtualization. Most computer hardware can be
emulated in software, because most computer hardware is simply a set of instructions permanently or semi-
permanently encoded in silicon. Using an emulation layer that maps software instructions to hardware
instructions, virtualized hardware can execute in software as if it were the actual hardware itself.
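To make the idea of an emulation layer concrete, here is a toy sketch in Python. It is not Azure's actual hypervisor; the instruction set and register names are invented for illustration. The point is that the "guest" program never touches physical hardware: the host interprets each instruction in software.

```python
# Toy emulation layer: interpret a tiny "guest" instruction set in software,
# mapping each guest instruction to work done by the host program.

def run_guest(program, registers=None):
    """Interpret a list of instruction tuples and return the final registers."""
    regs = dict(registers or {})
    for op, *args in program:
        if op == "LOAD":            # LOAD reg, value
            reg, value = args
            regs[reg] = value
        elif op == "ADD":           # ADD dst, src  ->  dst = dst + src
            dst, src = args
            regs[dst] = regs[dst] + regs[src]
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs

# The guest program executes entirely in software on the host.
program = [
    ("LOAD", "r0", 40),
    ("LOAD", "r1", 2),
    ("ADD", "r0", "r1"),
    ("HALT",),
]
print(run_guest(program))  # {'r0': 42, 'r1': 2}
```

A real hypervisor works at a far lower level (CPU virtualization extensions, memory mapping, device emulation), but the principle is the same: software stands in for hardware.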
Essentially, the cloud is a set of physical servers in one or more datacenters that execute virtualized hardware on
behalf of customers. So how does the cloud create, start, stop, and delete millions of instances of virtualized
hardware for millions of customers simultaneously?
To understand this, let's look at the architecture of the hardware in the datacenter. Inside each datacenter is a
collection of servers sitting in server racks. Each server rack contains many server blades as well as a network
switch providing network connectivity and a power distribution unit (PDU) providing power. Racks are sometimes
grouped together in larger units known as clusters.
Within each rack or cluster, most of the servers are designated to run virtualized hardware instances on
behalf of customers. However, some of the servers run cloud management software known as a fabric controller. The
fabric controller is a distributed application with many responsibilities. It allocates services, monitors the health of
the server and the services running on it, and heals servers when they fail.
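The monitor-and-heal responsibility can be sketched as follows. This is a highly simplified illustration with assumed data structures, not Azure's internal design: the fabric controller checks which servers are healthy and moves workloads off any server that has failed.

```python
# Simplified sketch of a fabric controller's heal step (illustrative only):
# given server health and current workload placement, move workloads
# off failed servers onto a healthy one.

def heal(servers, workloads):
    """servers: {server_name: is_healthy}; workloads: {workload: server_name}.
    Returns a new placement with no workload on an unhealthy server."""
    healthy = [name for name, ok in servers.items() if ok]
    if not healthy:
        raise RuntimeError("no healthy servers available")
    placement = {}
    for workload, server in workloads.items():
        if servers.get(server, False):
            placement[workload] = server      # server healthy: leave in place
        else:
            placement[workload] = healthy[0]  # server failed: relocate workload
    return placement
```

A production fabric controller also balances load, respects capacity, and coordinates across racks; this sketch shows only the health-check-and-relocate loop described above.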
Each instance of the fabric controller is connected to another set of servers running cloud orchestration software,
typically known as a front end. The front end hosts the web services, RESTful APIs, and internal Azure databases
used for all functions the cloud performs.
For example, the front end hosts the services that handle customer requests to allocate Azure resources such as
virtual machines, and services like Cosmos DB. First, the front end validates the user and verifies the user is
authorized to allocate the requested resources. If so, the front end checks a database to locate a server rack with
sufficient capacity and then instructs the fabric controller on that rack to allocate the resource.
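The request flow just described (validate the user, verify authorization, locate a rack with capacity, then allocate) can be sketched in a few lines. The function and data structures here are hypothetical stand-ins, not an actual Azure API.

```python
# Illustrative sketch of the front end's allocation flow described above
# (assumed names and structures; not Azure's real front-end service).

def allocate(user, resource_size, authorized_users, racks):
    """racks: {rack_name: free_capacity}. Returns the rack that receives
    the resource, mirroring validate -> authorize -> locate -> allocate."""
    if user not in authorized_users:          # validate and authorize the user
        raise PermissionError(f"{user} is not authorized")
    for rack, free in racks.items():          # locate a rack with capacity
        if free >= resource_size:
            racks[rack] -= resource_size      # fabric controller allocates here
            return rack
    raise RuntimeError("no rack with sufficient capacity")
```

For example, a request for 4 units against racks with 2 and 8 free units would be placed on the second rack, whose free capacity drops accordingly.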
So fundamentally, Azure is a huge collection of servers and networking hardware running a complex set of
distributed applications to orchestrate the configuration and operation of the virtualized hardware and software
on those servers. It is this orchestration that makes Azure so powerful—users are no longer responsible for
maintaining and upgrading hardware because Azure does all this behind the scenes.
Next steps
Now that you understand Azure internals, learn about cloud resource governance.
Learn about resource governance