Five Simple Steps to Agile Risk Management
This report, by its very length, defends itself against the risk of being read. (Winston Churchill)
In my post on Agile Project Charters I outlined the embarrassingly high failure rate of software projects. Success rates today are only marginally better than they were when the Standish Group released its first Chaos report in 1995. In response to the tremendous misalignment between project expectations and project results, a variety of tools and methods have evolved to help improve the odds of success. Chief among them are Project Management methodologies. Even with fifteen years of experience, improved software development tools and better methods, software project success rates have eked out only marginal gains. This is not a vilification of project management methodologies. Rather, it is a statement that software development is an inherently and increasingly complex undertaking with many uncertainties. With Risk Management, we attempt to identify the things we don’t know (the uncertainties) and quantify them so that they can be managed. This sounds like a paradox (how can you quantify what you don’t know?), but it is a paradox we can manage.
Agile Methods such as Scrum are a relatively new entrant into the field of project management. A basic tenet of Agile Methods is that teams produce a continuous series of useable software builds in very short cycles called Sprints. Each build is assessed and its issues are identified. The backlog of tasks is then reviewed and prioritized, and the most important tasks are scheduled for the next sprint. It sounds like an ideal approach, and for many teams it works extremely well: Agile teams tend to claim higher project success rates than do teams using more traditional methods. There is not a lot of empirical data available that effectively compares Agile project success rates to those of other methodologies, but what data does exist tends to support those claims.
Most methodologies place a fairly high importance on Risk Management. Agile approaches tend to implicitly manage Risk. That might not be a bad approach if the only things that affected the outcome of the project were the decisions that the developers made to implement the solution, but as we shall shortly see, there exist a multitude of factors that can have a significant impact on the success of a project. Further, I maintain the position that explicit Risk identification and management can further improve on the success rate of Agile projects. In this article I will outline a Risk Management methodology I use that is quick, simple, pretty comprehensive and very Agile friendly. As the title of the article implies, I have broken the process down into five steps:
Oops I lied… there are six steps. Actually there are only five steps but it is worth stating Repeat as a sixth step to emphasize that our Agile Risk Management Process defines a virtuous circle of continuous improvement.
Yes, risk taking is inherently failure-prone. Otherwise, it would be called sure-thing-taking. (Tim McMahon)
- Risks are influencing factors that might adversely affect the outcome of a project.
- Risk is the direct result of uncertainty. If there is no uncertainty, it is not a risk – it is a certainty.
- Risk analysis is used to help a team understand uncertainty that could affect the outcome of the project.
- Risk management (sometimes called Risk Mitigation) is the plan that the team puts into place to pre-empt, contain or mitigate the effects of risk to a project.
The important thing to remember is that even in simple projects, things can and will go wrong, and that you need to make plans to minimize the impact of those events when they occur.
The Five Steps
The Dimensions of Risk
Risk has two dimensions of influence. The first, Helpful/Harmful, is a simple assessment of factors that have a potentially positive or negative influence on the success of our project:
- Helpful: Factors that advance the objectives of the project
- Harmful: Factors that hinder or imperil the outcome of the project.
The second dimension of Risk is the identification of the source of the Risk:
- Internal: Factors originating inside the organization or within the sphere of influence of the project.
- External: Factors originating outside of the organization or project that cannot usually be influenced by the project.
Combining these factors into a two dimensional assessment provides us with the classic SWOT Analysis view of our project: Strengths, Weaknesses, Opportunities, and Threats. In the diagram below we see the two dimensions (four factor categories) arranged in a matrix with Helpful/Harmful dimension represented as columns, and Internal/External dimension represented as rows.
Risk Management is primarily interested in the Harmful column and that is what we will focus on in this article.
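The two dimensions can be sketched in a few lines of Python. This is only an illustration of the SWOT matrix described above; the names `Quadrant`, `swot_quadrant` and `is_risk` are my own for this example and do not come from any tool.

```python
from enum import Enum


class Quadrant(Enum):
    STRENGTH = "Strength"        # helpful, internal
    WEAKNESS = "Weakness"        # harmful, internal
    OPPORTUNITY = "Opportunity"  # helpful, external
    THREAT = "Threat"            # harmful, external


def swot_quadrant(helpful: bool, internal: bool) -> Quadrant:
    """Place a factor in the SWOT matrix from its two dimensions."""
    if internal:
        return Quadrant.STRENGTH if helpful else Quadrant.WEAKNESS
    return Quadrant.OPPORTUNITY if helpful else Quadrant.THREAT


def is_risk(quadrant: Quadrant) -> bool:
    """Risk Management looks at the Harmful column only."""
    return quadrant in (Quadrant.WEAKNESS, Quadrant.THREAT)


# Example: an aggressive timeline is harmful and internal to the project.
timeline = swot_quadrant(helpful=False, internal=True)
print(timeline.value, is_risk(timeline))
```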
Examples of Weaknesses
- Insufficient resources
- Limited budget
- Aggressive timeline
- Important skills lacking in the team
- Technological uncertainties
- Lack of stakeholder consensus
- Lack of a disaster recovery plan
Examples of Threats
- Rapid and significant changes in the economy
- Geopolitical tensions
- Economic uncertainty
- Changing legislation
- Changing competitive landscape
- Trade tariffs
Weaknesses are factors over which we tend to have some degree of control. Threats, however, are factors over which we tend to have little or no control. It is important to understand that even though we may have no control over a factor such as a pandemic, there are usually things we can do to manage or minimize the Risk effects on our project.
Each of the Risks needs to be categorized as to the affected area, likelihood and level of Impact it may have on the project. Risk Classes are used primarily for organizing, summarizing and reporting of Risks to management and stakeholders. Some Risks you identify may impact more than one Class, and if they do, they should be reflected in the summaries of those Classes.
The next chart lists the Risk Classes I typically use. These categories are not prescriptive and you may wish to add others, such as Reputation or Environmental Impact, to suit your project or company needs. Solution, Timeline, Budget, Privacy and Security should be of interest to everyone with a stake in the outcome of the project. Resources and Scope are primarily relevant to the development team, but they can have a significant impact on the other categories and are therefore included in the set. Some Risks may affect multiple Risk Classes and that effect should be reflected in your Risk Classification. I will show how the Risk Classes are summarized later in this article.
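As a rough sketch of how a Risk that affects several Classes shows up in each Class summary, the Python below groups Risks by Class. The function name `group_by_class` and the assignment of the sample Risks to Classes are assumptions made for this illustration, not a prescribed format.

```python
# The Risk Classes named in this article; extend the list to suit your needs.
RISK_CLASSES = ["Solution", "Timeline", "Budget", "Privacy",
                "Security", "Resources", "Scope"]


def group_by_class(risks):
    """Map each Risk Class to the Risks that affect it.

    `risks` maps a Risk description to the Classes it affects. A Risk
    that touches several Classes is listed under every one of them.
    """
    summary = {risk_class: [] for risk_class in RISK_CLASSES}
    for description, classes in risks.items():
        for risk_class in classes:
            # setdefault allows extra Classes such as "Reputation".
            summary.setdefault(risk_class, []).append(description)
    return summary


# Hypothetical classification, for illustration only.
sample = {
    "Aggressive timeline": ["Timeline", "Scope"],
    "Important skills lacking in the team": ["Resources", "Solution"],
    "Limited budget": ["Budget", "Scope"],
}

for risk_class, descriptions in group_by_class(sample).items():
    print(f"{risk_class}: {len(descriptions)} risk(s)")
```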
What Do You Assess?
I maintain the Risk ratings of each Story or Defect directly on the Cards. I use the same pattern of recording the three numbers Probability, Impact and Risk Rating and use a highlighter to colour code the risks. At a story level, most risk will likely be pretty benign, so don’t obsess and spend a lot of time on the low risk items. Focus on the ones that are genuine threats. Defects are areas that may require more attention if only because as Defects they likely have higher visibility in the organization. In both cases, write a few details about the Risk directly on the card. An added benefit of having developers assess Risk associated with the Stories and Defects is that it encourages a new dimension for their thinking about the work they are doing and helps them to be cognizant of the effects their work has on the overall success of the project.
Tracking Risk associated with Stories and Defects alone is insufficient, especially for Threats (factors external to the project). For Threats, and for any identified Risk that is not a Story or a Defect, I use a Risk Register (more on that later). The Stories and Defects that receive a high Risk Rating are also tracked in the Risk Register.
Great – so now we know what to measure, but how do we go about doing that? If you’ve read my three previous blog posts, you’ve likely already guessed that we will use a matrix based on two vectors. The two vectors we will use in this case will be Probability and Impact. The Risks you identify must each be assessed according to these two vectors.
The assessment of each risk must be performed by the respective SME (Subject Matter Expert). A project manager is not qualified to perform an assessment of system security unless he/she is also a security SME. The same is likely true for assessing Risks relative to system performance, quality and privacy. Scrum is about teamwork so depend on the team to bring their expertise to the table. Another reason for SMEs to do the assessment is that I have in some organizations witnessed political pressure applied to PMs to produce Risk Reports to reflect a particular or desired Risk profile. This may force the PM to game the numbers to produce the desired results. The ethics of such practices are highly questionable. If you experience a situation like this, you’ve got much bigger issues on your hands than managing the Risk in the project and should perhaps consider looking for a new job. Having the SMEs do the assessment does help insulate the PM from such pressures. Once the SMEs have performed their assessments, it is useful to discuss the assessments as a team to ensure that there is a consistent approach and weighting applied across all assessments. This also allows the thinking and assumptions behind the assessments to be shared amongst the team and brings the team’s collective wisdom to bear on evolving potential solutions. It may even uncover additional Risks due to Risk interdependencies.
The Impact of a Risk is a measure of its effect on the project. It ranges from Minimal (1) at the low end, where the consequences would be very small, up to Extreme (5) at the high end. You and your team should devise wording to describe each Impact level to suit the realities of your organization. Whatever you decide upon should be consistent throughout the entire organization so as to minimize confusion. The wording should not be viewed as a set of rules; instead, it is a set of guidelines. Here is some suggested wording:
If there is a very high probability that a Risk may be realized, then it is clear that it should have the attention of the team. Conversely, if there is a very low probability of the risk being realized, then it is likely that it should receive less attention from the team. We thus need to ensure that the greatest attention is focused on the Risks with the highest occurrence probability. The following chart provides a suggested scale for assessing the probability of Risk manifestation.
| Rating | Probability |
|---|---|
| 5 | 91 – 100% or Very likely to occur |
| 4 | 61 – 90% or Likely to occur |
| 3 | 41 – 60% or May occur about half of the time |
| 2 | 11 – 40% or Unlikely to occur |
| 1 | 0 – 10% or Very unlikely to occur |
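If you keep your estimates as percentages, the scale above can be expressed as a small lookup. This is a minimal sketch: the function name `probability_rating` is mine, and because the chart lists whole-number bands, I have assumed that a fractional value such as 10.5% belongs to the next band up.

```python
def probability_rating(percent: float) -> int:
    """Convert an estimated probability (0 to 100) into a 1 to 5 rating."""
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be between 0 and 100, got {percent}")
    if percent <= 10:
        return 1  # Very unlikely to occur
    if percent <= 40:
        return 2  # Unlikely to occur
    if percent <= 60:
        return 3  # May occur about half of the time
    if percent <= 90:
        return 4  # Likely to occur
    return 5      # Very likely to occur


print(probability_rating(75))
```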
Enter the Matrix
We now have two Risk Vectors and as we did in the prioritization of Stories and Defects (see my previous blogs), we take the two vectors and multiply them together to obtain the simple product which is the Risk value. Using the same thresholds for Stories and Defects as well as the corresponding colour system we end up with a Risk Matrix that looks like this:
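For readers who prefer code to charts, here is a minimal Python sketch of the calculation. The Risk value is simply Probability multiplied by Impact, as described above. The colour cut-offs in `PLACEHOLDER_BANDS` are NOT the thresholds from my earlier posts; they are placeholder assumptions so that the example runs, and you should substitute your own.

```python
def risk_rating(probability: int, impact: int) -> int:
    """Risk value = Probability x Impact, each rated from 1 to 5."""
    for name, value in (("probability", probability), ("impact", impact)):
        if value not in range(1, 6):
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return probability * impact


# ASSUMPTION: placeholder cut-offs (upper bound, colour) for illustration.
# Replace them with the thresholds your organization actually uses.
PLACEHOLDER_BANDS = ((5, "Green"), (14, "Orange"), (25, "Red"))


def colour(rating: int, bands=PLACEHOLDER_BANDS) -> str:
    """Return the colour of the first band whose upper bound covers the rating."""
    for upper_bound, name in bands:
        if rating <= upper_bound:
            return name
    raise ValueError(f"rating {rating} is above every band")


def risk_matrix():
    """5 x 5 grid of Risk values: rows are Impact 5 down to 1,
    columns are Probability 1 up to 5."""
    return [[risk_rating(p, i) for p in range(1, 6)] for i in range(5, 0, -1)]


for row in risk_matrix():
    print(" ".join(f"{value:2d}" for value in row))
```

The highest value is in the top right corner (Probability 5, Impact 5) and the lowest is in the bottom left, which is the same shape as the colour-coded matrix.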
Now that you’ve identified the important Risks that threaten the success of your project, what should you do about them? You can make your Risk Planning as comprehensive as you wish, but like most things in life, the simplest approach is often the best approach. Unlike Impact and Probability Assessment, your wording should not be considered a guideline. For each of the various Risk Ratings, we want specific things to occur because the risk thresholds are triggers to mobilize the team or stakeholder to take action to mitigate the Risk. Here is some suggested wording for your Risk Planning. The wording you use in your company should be different than mine and reflect the realities of your organization, but it is important that the wording be focused on Actions to manage the Risk:
To track and manage Risk on a project I use a Risk Register. To do this, I use a spreadsheet. Each time I do a Risk Assessment (ideally each sprint planning session) I add a new page to the spreadsheet and each page is a Risk Assessment corresponding to a particular Sprint. This way I can track how a Risk has changed over the course of a project. I can also monitor how Risks are added and removed from the Register. As you near the end of the project, you should see all of your Risks gradually move into the green or minimal range. If this does not happen, you are definitely doing something wrong because if you still have Orange or Red risk in the late stages of your project, you have not been managing the Risk and you are rolling the dice on project success. All of the time, effort and money invested up to that point is at Risk of being lost.
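A spreadsheet is all you need, but the structure of the Register is easy to see in code. The following Python sketch keeps one sheet per Sprint and answers the questions above: how has a Risk changed, and which Risks were added or removed? The class names, method names and sample Risks are assumptions for this illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class RiskEntry:
    description: str
    probability: int  # 1 to 5
    impact: int       # 1 to 5

    @property
    def rating(self) -> int:
        return self.probability * self.impact


@dataclass
class RiskRegister:
    # One "page" per Sprint: sprint number -> {description: RiskEntry}
    sheets: dict = field(default_factory=dict)

    def record(self, sprint: int, entries) -> None:
        """Store the Risk Assessment performed for one Sprint."""
        self.sheets[sprint] = {e.description: e for e in entries}

    def history(self, description: str):
        """(sprint, rating) pairs showing how one Risk changed over time."""
        return [(sprint, sheet[description].rating)
                for sprint, sheet in sorted(self.sheets.items())
                if description in sheet]

    def added(self, sprint: int, previous: int) -> set:
        """Risks present in `sprint` that were not in `previous`."""
        return set(self.sheets[sprint]) - set(self.sheets[previous])

    def removed(self, sprint: int, previous: int) -> set:
        """Risks present in `previous` that no longer appear in `sprint`."""
        return set(self.sheets[previous]) - set(self.sheets[sprint])


# Hypothetical data for two Sprints.
register = RiskRegister()
register.record(1, [RiskEntry("Throughput target unproven", 4, 5),
                    RiskEntry("Lack of stakeholder consensus", 2, 4)])
register.record(2, [RiskEntry("Throughput target unproven", 2, 5),
                    RiskEntry("Changing legislation", 3, 3)])

print(register.history("Throughput target unproven"))
print(register.added(2, previous=1), register.removed(2, previous=1))
```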
Insanity: doing the same thing over and over again and expecting different results.
First Things First
Act is simply that. It is the implementation of the defined Risk Mitigation Strategies. Well it’s actually not that simple. Human nature is such that we tend to put off the things that aren’t fun, interesting or that might be just plain hard work. This is project suicide when it comes to Risk. It is imperative that you deal with the high Risk items first. Deferring performance testing and finding out a week before implementation that you can’t possibly achieve the requisite transactional throughput may be the death of your project. At a minimum, someone is going to have to do a lot of explaining – don’t let that be you.
This is important
The “Fail Early” phrase is becoming very popular in the world of venture capital. In essence it means figuring out as early as possible whether or not what you are doing will succeed. These findings are essentially the go/no-go decision for your project. If success is not possible, either stop (kill the project) and move on to something else, or rethink the project and come at it from a different angle. Either way, do the difficult, gnarly, risky stuff up front. An added benefit is that it helps you define the boundaries of your system and sets expectations as to what is possible/realistic and what is impossible/unrealistic. It could even bring to light unrealistic success criteria, in which case the definition of project success may need to change. When this happens, the project may still live, but under a revised and possibly more realistic set of stakeholder expectations. It may also stimulate commitments like a larger budget or access to key people.
As simple and as obvious as this may sound, it is amazing how often such critical, high Risk items are left until the final stages of the project. From my own observations over the years, this is one of the biggest reasons for project failure. Do the Risky stuff first and Fail Early.
This process is very lightweight and very quick to perform. Identifying Risks early, and implementing appropriate Risk Mitigation Strategies for each is essential to the success of projects. Done properly, it is a continuous virtuous circle of Assessment and Action to constantly identify, manage and minimize Risk.
Your Risk Plan should be reviewed at a minimum quarterly. Better still, your review should coincide with your sprint planning sessions. At these sessions, you have access to your team, and everyone is already looking at the stories, reviewing effort estimates and so on. You don’t need to do an exhaustive review each time, but pay particular attention to the Risks you are tracking in your Risk Register. Also look for any new Risks that might start appearing as the team progresses through the project and learns more about the challenges. As always, if you discover Risks that are high, deal with them early.
In this article I have presented a simple, easy five step process for assessing and managing Risk in an Agile process. My next post will approach how you can aggregate the Risks of multiple Projects into a Program view of Risk.
As always, I look forward to your comments.
I am an independent consultant who has been leading software teams, designing, building and delivering software for nearly three decades. It’s still as exciting and enjoyable for me today as it was when I wrote my very first Hello World program and saw it spring to life in front of me.