
THE COMPLETE GUIDE TO

LOW-RISK
CONTINUOUS DELIVERY

Published: Dec 2017


CONTENTS

Author’s Note
A Recap
Making the Change
    Culture
    Architecture
    Tooling
Post-Deploy Tooling
    Key Attributes
    The Role of Automation
Final Thoughts
    Real World Examples


AUTHOR’S NOTE

We’re not here to introduce you to Continuous Delivery or
explain its benefits - much has been written on the topic, and
you already know all that.

We are writing about the culture, architecture, and tooling
required to successfully transition toward Continuous Delivery.
Of particular interest is post-deploy tooling - a topic that has
not been discussed nearly enough.

Our views are colored by our own experience and the
experience of our customers, which include CI/CD software
vendors such as CircleCI, Codeship, and Puppet, and leading
companies such as Salesforce, Twilio, Instacart, and more.

Rollbar on Continuous Delivery 3


A RECAP

Continuous Delivery at its core is about getting useful
software from development to users quickly.

Why is speed important?

Because the faster you get your software out there, the faster
your users benefit from the improvements made.

You also get feedback quickly, helping your team be agile
in responding to the needs of your users.

When you develop software at a high speed, the chances of
your business or organization gaining a first-mover advantage,
keeping competitors at bay, winning new customers,
or satisfying existing ones go up significantly.

Done right, Continuous Delivery turns your software development
cycle from single lines to many circles:

Figure 1: Software development cycles before and after adopting CD



MAKING THE CHANGE

Clearly, going from deploying software, say, once a quarter
to doing production deploys dozens of times per day is a
big change. The question one should be asking is,

“In making this change, how do I lower risk?”

Notice we use the word “useful” when describing Continuous
Delivery. Usefulness implies a level of quality that’s required
so you can get meaningful feedback.

So, the answer to the above question must address not just
how to optimize for speed, but also how to ensure quality at
the same time.

In our view, the answer involves three parts: culture,
architecture, and tooling.

CULTURE

The kind of culture we’re talking about is one where Dev and
Ops teams share the responsibility for, and the success of, a
software release - hence the term “DevOps”.

In the past, the two teams worked relatively independently. Dev
would write the software, and when they were done, they passed it
along to Ops for deployment. To say this act of “throwing the
code over the wall” hasn’t worked well is an understatement.



A lot about enabling a DevOps culture has been covered
elsewhere, so we’re not going to repeat it here. What’s
worth remembering is that the right cultural incentives can go a
long way toward ensuring your team is successful in making
the change.

ARCHITECTURE

Naturally, the less time there is between deploys, the less
code gets written for each one.

This doesn’t imply the code is of lower quality or functionally
incomplete - in fact it should be the opposite, because
the code still needs to be useful - but it does mean your
software is now deployed in smaller increments.

In terms of reducing risks, this is a good thing because the
smaller the increment, the more manageable it becomes if
and when issues arise.

With a microservices architecture, your application is already
composed of very small chunks, so in that sense adopting
a microservices architecture directly helps your team reduce
the risk associated with frequent deploys.

If you apply the ideas of the twelve-factor app methodology when
architecting your software, such as making the development and
production environments as similar as possible, the transition to
this new way of delivering software is also made easier.
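
One twelve-factor idea that supports keeping development and production similar is reading configuration from the environment, so the same code runs unchanged in both. The following is a minimal sketch of that idea only; the variable names (APP_ENV, DATABASE_URL, ERROR_REPORTING_ENABLED) and the defaults are hypothetical and do not come from this guide.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Runtime settings; the field names are illustrative assumptions."""
    app_env: str
    database_url: str
    error_reporting_enabled: bool


def load_settings(environ=os.environ) -> Settings:
    """Build settings from environment variables rather than per-environment code."""
    enabled = environ.get("ERROR_REPORTING_ENABLED", "false").lower() == "true"
    return Settings(
        app_env=environ.get("APP_ENV", "development"),
        database_url=environ.get("DATABASE_URL", "sqlite:///local.db"),
        error_reporting_enabled=enabled,
    )


if __name__ == "__main__":
    # The same code path runs everywhere; only the environment differs.
    print(load_settings())
```

Because the only thing that varies between environments is the set of variables, a deploy to production exercises the same code that ran in development.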

TOOLING

Most of us are familiar with, or have already invested in, new
tools to get us moving from code to deploy faster.



Examples of such tools include Jenkins, Terraform, Sauce
Labs, Chef, Puppet, CircleCI, Codeship, and more.

Generally speaking, these tools are designed to enable an
automated deployment pipeline, where any changes - new
features or bug fixes alike - can propagate through all the
intermediate build, test, and integrate steps instantly.
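
The idea of a change propagating through ordered stages can be sketched in a few lines. This is a toy illustration under our own assumptions, not how Jenkins, CircleCI, or any of the tools named above are configured: the stage names and the run_pipeline function are hypothetical, and each stage is just a callable that reports success or failure.

```python
from typing import Callable, List, Tuple

# A stage is a (name, step) pair; the step returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed, so a failing
    test stage means the deploy stage is never reached.
    """
    completed = []
    for name, step in stages:
        if not step():
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    stages = [
        ("build", lambda: True),
        ("test", lambda: True),
        ("integrate", lambda: True),
        ("deploy", lambda: True),
    ]
    print(run_pipeline(stages))
```

The point of the gating is that every change takes the same automated path, and a change that fails an early stage never reaches production.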

Little has been said, however, about the importance of tooling
post deploy to complete the feedback loop we want to build.

Figure 2: Continuous Delivery loop with unmapped post deploy stages
(stages shown: Code, Build, Automated Test, Release, Deploy)

We know that despite all the right intentions and efforts,
errors can still creep into the software. The code itself
can be buggy, the configurations can be poorly managed,
third-party services can experience downtime, and so on.
Inevitably, issues occur in production as a result.

When you deploy to production as frequently as you would
in this case, what you need is fast reaction times, so you can
quickly write and deploy fixes and ensure sufficient quality
in the software.



To enable such fast reaction times, post-deploy tools must
meet the following requirements:

• Monitor: Any and all issues affecting the production
application shall be monitored, and new issues shall
be detected immediately - ideally before any user
notices them or reports them to your Support team

• Triage: When an issue occurs, the person or team
doing the monitoring shall be alerted instantly so they
can triage the issue accordingly

• Analyze: The person assigned to investigate the issue
shall be given sufficient context and detailed data so
they can analyze and debug the issue quickly

• Workflow: Because post-deploy monitoring, triaging,
and resolving issues is a shared responsibility of Dev
and Ops, the tools shall give both teams shared
visibility into and understanding of the issues

Figure 3: Continuous Delivery loop with post deploy stages mapped
(stages shown: Code / Debug, Build, Automated Test, Release, Deploy,
Monitor, Triage, Analyze)



POST-DEPLOY TOOLING

As is often said, the tools you choose should reinforce the
behaviors you want to see.

The difference between using a tool that only Dev or only
Ops understands or is naturally familiar with, vs. one that
works well for both, can be significant. The latter can go a long
way in fostering the DevOps culture critical to your success.

KEY ATTRIBUTES

We believe a post-deploy tool for enabling fast reaction times
needs to be application-centric, error-centric, and
resolution-centric by design:

Application-centric

Application-centric means the tool is primarily focused on
the application itself, not the infrastructure hosting the app.
Your tool should help ensure the software is useful, not just
check if it is up or slow. In other words, an APM tool alone is
not sufficient. For software to be useful, it must be functional
and reliable, so having a tool that’s purpose-built for dealing
with application errors is a must.

Error-centric

Error-centric means the data the tool collects should be
presented in a format that is useful to developers, who are
responsible for debugging the errors. For example, an error
rate graph showing that more errors are being recorded than
usual isn’t as useful to them as a live feed of the specific
errors happening in real time post deploy.

Resolution-centric

Lastly and perhaps most importantly, the tool needs to be
resolution-centric. What’s the point of monitoring for issues
and triaging them, if your developers can’t resolve them
fast enough? Having fast reaction times means the
time-to-resolution needs to be in minutes - not hours, days, or weeks.

If that sounds ambitious, consider that modern deployment
tools allow you to roll back a deployment in minutes
with just a few commands.

“Why shouldn’t you expect to be able to
resolve errors as quickly as doing a rollback?”

The only way you can achieve time-to-resolution in minutes
is if an important signal like a critical error can be discovered
easily despite the noise, and if all the information a developer
needs to debug the error is available right then and there.

This is where automation comes in.

THE ROLE OF AUTOMATION

If you look more closely at the aforementioned tools designed for
getting to the deploy stage faster, they all share a common
focus on enabling automation.



Applying the concept of automation to post deploy stages is
arguably a lot more challenging, however. You are still going
to have to fix the bugs yourself. No software is going to do
that for you (yet).

Having said that, some level of automation is still valuable
to help you:

1. Manage the monitored data to increase signal-to-noise ratio

2. Collect all relevant data to expedite the debugging process

Noise Reduction

One of the biggest problems when it comes to monitoring is
the large amount of noise. Often, for fear of missing important
signals, we set up alerts for everything. As a result, those
on the receiving end eventually begin to ignore the alerts.

When it comes to errors, a lot of the noise we typically see
comes from not knowing whether “this error is the same as or
related to that error”.

For example, if the tool we use is intelligent enough to tell
that certain errors occurring on different browsers are all the
same error, and to group them automatically, the noise has
just been reduced.
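
One simple way to picture this kind of grouping is a fingerprint computed only from the fields that identify the bug, leaving out fields such as the browser. The sketch below is a deliberately naive illustration and not a description of how Rollbar or any other product groups errors; the dictionary keys (type, file, function, browser) and both function names are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def fingerprint(error: dict) -> str:
    """Hash only the fields that identify the bug, ignoring the browser."""
    key = "|".join([error["type"], error["file"], error["function"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def group_errors(errors: list) -> dict:
    """Map each fingerprint to the list of occurrences that share it."""
    groups = defaultdict(list)
    for error in errors:
        groups[fingerprint(error)].append(error)
    return dict(groups)


if __name__ == "__main__":
    occurrences = [
        {"type": "TypeError", "file": "cart.js", "function": "addItem", "browser": "Chrome"},
        {"type": "TypeError", "file": "cart.js", "function": "addItem", "browser": "Firefox"},
        {"type": "RangeError", "file": "search.js", "function": "paginate", "browser": "Safari"},
    ]
    for key, items in group_errors(occurrences).items():
        print(key, len(items), sorted(e["browser"] for e in items))
```

Here the two TypeError occurrences collapse into one group with two occurrences, so a person triaging sees two distinct issues instead of three alerts.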

Automation can also be applied to your triaging workflow to
ensure important signals get sent to the right person or team
with the appropriate level of priority.

For example, you may want to specify that certain errors require
immediate reactions by your on-call developers, while others
get automatically logged as tickets in your issue tracking tool
for later resolution.
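
A routing rule of this kind can be expressed as a small function. The following is a sketch under stated assumptions: the severity levels, environment names, and destination labels (page_on_call, create_ticket, log_only) are hypothetical and do not correspond to any specific product's configuration.

```python
from dataclasses import dataclass


@dataclass
class ErrorEvent:
    level: str        # assumed values: "critical", "error", "warning"
    environment: str  # assumed values: "production", "staging"
    message: str


def route(event: ErrorEvent) -> str:
    """Return a hypothetical destination label for an error event."""
    # Critical production errors need an immediate human reaction.
    if event.environment == "production" and event.level == "critical":
        return "page_on_call"
    # Other real errors are tracked as tickets for later resolution.
    if event.level in ("critical", "error"):
        return "create_ticket"
    # Everything else is recorded without interrupting anyone.
    return "log_only"


if __name__ == "__main__":
    print(route(ErrorEvent("critical", "production", "Checkout failed")))
    print(route(ErrorEvent("error", "production", "Avatar upload failed")))
    print(route(ErrorEvent("warning", "staging", "Slow query")))
```

Keeping the rules explicit like this makes it easy for Dev and Ops to review together which errors interrupt someone and which simply wait in the tracker.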



Data Collection

For a developer dealing with an error, the data collection
effort is a necessary but time-consuming activity. Relative to
writing code to fix the error, the activity is of lower value and
should be automated.

All kinds of data can be useful for identifying the root cause
of an error, and they should be automatically collected and
made easily digestible by the tool you use.

Examples of data to collect include details of an error such as
the stack trace, parameters, local variable values, and
telemetry events showing you everything that happened leading
up to the error.

Equally useful are contextual data such as how often the error
has occurred and when, which browsers, OSes, IP addresses,
and users are affected, which deploys are associated with
the error, whether a similar error has been resolved before,
and even whether a solution is already found and published
elsewhere.
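
To make the first group of data concrete, here is a minimal sketch of capturing a stack trace and the local variables of the failing frame using only the Python standard library. It illustrates the idea and is not how Rollbar's SDK works; the function name collect_error_context and the shape of the returned dictionary are assumptions for the example.

```python
import traceback
from datetime import datetime, timezone


def collect_error_context(exc: BaseException) -> dict:
    """Gather the stack trace and the locals of the frame that raised."""
    tb = exc.__traceback__
    # Walk to the innermost traceback entry, where the exception was raised.
    while tb is not None and tb.tb_next is not None:
        tb = tb.tb_next
    local_vars = {}
    if tb is not None:
        # repr() keeps the values printable and safe to serialize.
        local_vars = {name: repr(value) for name, value in tb.tb_frame.f_locals.items()}
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "stack_trace": traceback.format_exception(type(exc), exc, exc.__traceback__),
        "locals": local_vars,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
    }


def divide(numerator, denominator):
    return numerator / denominator


if __name__ == "__main__":
    try:
        divide(1, 0)
    except ZeroDivisionError as exc:
        report = collect_error_context(exc)
        print(report["type"], report["locals"])
```

A real tool would add the contextual data described above (affected users, browsers, associated deploys) and would need to scrub sensitive values before sending anything off the machine.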

FINAL THOUGHTS

As post-deploy tools mature as a category, you can expect
to see them become just as sophisticated and automated as
tools that exist today for pre-deploy stages.

Tools like Rollbar are invaluable in helping you practice
low-risk Continuous Delivery.

A common and perhaps best practice among our customers
when initially adopting Continuous Delivery is to start with
newer apps or greenfield projects. This way, you don’t have
to contend with a monolithic legacy code base that could be
quite noisy to monitor.

Having said that, tools like Rollbar are very useful in detecting
issues in your existing apps before your users report or even
notice them, and fixing them fast.

It makes little sense that many of us still rely on users to find
bugs in our software - and logs to fix them - when a tool built
for that purpose is available.

Real World Examples

Instacart, a leading consumer delivery company with over
1000 employees, does around 30 production deploys per day.
To enable such a cadence, they use Rollbar for error resolution,
dramatically reducing time-to-resolution to minutes
(see case study).

Many other customers including Salesforce, Twilio, and CircleCI
(see case study) use our product the same way, so we know
using Rollbar is a battle-tested way to help drive Continuous
Delivery adoption and successfully extract value from it.

“One of the things I talk about with our customers is
that continuous deployment, though it scares a lot
of people, is actually a mechanism for reducing risk,
because the change that you deploy is very, very small.

But the key to that working is great instrumentation.
You can’t just blindly throw things into production, and
assume that everything is great. You really need to feel
confident that you understand what’s happening.
Without Rollbar giving us visibility into
exceptions in production, we just wouldn’t be
confident and we’d ship more slowly.”

Rob Zuber
CircleCI CTO



The Real Value of Rollbar

Rollbar is a visibility and remediation tool in one. Dev teams
are familiar with it. Ops can perform triage with it. It meets
all the requirements for a post-deploy tool that enables fast
reaction times.

But what Rollbar really does for you is act as a kind of “safety
net” that reduces and mitigates risks from frequent deploys.

It gives you the confidence to deploy often, and helps you
design out risk as you begin making the big change toward
a faster way of developing software.

TAKE ACTION TODAY

Talk to us if you want more tips on using Rollbar to accelerate
the transition to Continuous Delivery in your organization, or
if you are interested in learning more about the product.

We also offer a fully featured free trial for 14 days.

The best way to start is to go to Rollbar.com and sign up for
a trial or schedule a demo.

Should you have any questions or feedback, please contact
our team at sales@rollbar.com.
