The life-changing magic of a tidy kanban board
I’ve gone from actively hating our documentation issue backlog to feeling proud of our issue triage process. I collaborated with my documentation teammates and the Salt project manager to refine and improve our system for triaging documentation issues over the last six months. Because I feel pretty excited about what we accomplished, I’m sharing some of the key insights and best practices that I’ve learned about issue triage along the way.
Issue triage basics: tools and terminology
Just in case you have no idea what I mean when I say issue triage, let me take a moment to define some terms and talk about the tools involved. If this is stuff you already know, feel free to skip to the next section.
The main project management method that most software projects use for tracking work that needs to be done is the issue board. An issue (also often called a ticket) represents a discrete task that needs to be done or a problem that needs to be solved.
For example, if I know that a new feature needs to be documented as part of a release, I might create an issue to track that work throughout the whole process. In the body of the issue, I typically include a brief description of the feature and add a checklist of different tasks I need to complete in order for that issue to be considered completely resolved. See Issues in the official GitLab documentation for an example of how a ticket looks in GitLab (which is our tool of choice).
Issues can have multiple labels. A label is a flag you can add to a ticket that provides some metadata or identifying information about the content of the ticket. Effective labels are usually a single word or simple phrase and follow a consistent color-coding pattern. A label can include information like the issue’s general category, status, or priority. Labels are crucial to an effective issue triage workflow, so I’ll discuss them at more length later in this post.
Not all projects use this feature, but an issue can also be an epic, which means it’s a big ticket that contains a lot of related sub-tasks in the form of smaller issues. Epics are used for tracking major initiatives that require cross-communication or that might take longer than a sprint to complete.
An issue board (also sometimes called a kanban board) lists all the open issues and visualizes where each issue is in the current workflow. A typical issue board has several different columns, each representing a different phase of the workflow. See Issue boards in the GitLab documentation for an example of how an issue board looks in GitLab.
My documentation team uses the same columns as the development team and our columns have the following status headings:
- Open – Contains all open issues.
- To Do – Contains issues we intend to work on in the current or upcoming sprint.
- Doing – For issues we are actively working on.
- Reviewing – For issues that are complete but not yet merged. Usually these issues are in the process of being reviewed by subject matter experts for accuracy or by other technical writers for style.
- In QA – Used more by development than by our team. I generally keep this column collapsed so that it takes less space. Every now and then we do need QA to verify that something is working correctly in our documentation (such as context-sensitive links).
- Blocked – For issues that we can’t work on because of a blocking problem or dependency on a team outside of ours.
- Closed – Contains all closed issues.
An issue list is a flat list of all the issues in that repository. In GitLab, the issue list is where you can also go to open a new ticket. See Issue List in the GitLab documentation for an example.
As you have likely already inferred, we use GitLab at SaltStack (recently acquired by VMware). Other commonly used tools are Jira and GitHub (especially in open source). If you’re using GitHub, I recommend using the ZenHub kanban board extension (free for open source) instead of GitHub’s native project board functionality. You can also use Trello as a kanban board for cross-team communication if you prefer.
Opening new issues
The crucial point to note is that anyone can open a new documentation issue. Issues can be opened by documentation team members, product managers, development, support, or in response to a customer request.
We’ve trained people inside our organization to open issues as a way of requesting work from the documentation team. If someone inside our organization tries to give us a task by sending us an email or Slack message, we might open an issue on their behalf and tag them on it (so that they will receive notifications on the issue). However, after doing it for them a few times, we’ll start to encourage them to open issues on their own.
Prior to the acquisition, we were in the process of setting up a system so that customers could send an email with a documentation request or make a comment on a documentation topic and it would open a ticket for the request. However, that process is on hold while we are working on integrating our documentation into VMware’s documentation.
Weekly triage meeting agenda
The documentation team regularly reviews issues in our weekly triage meeting, which is currently held Monday afternoons for an hour. It’s important to hold this meeting consistently as part of an informal service level agreement (SLA). When you hold consistent triage meetings, that becomes a way to let people who open new issues know when you plan to review them.
It’s important to note that we only commit to reviewing the issue once a week. We are not committing to work on it immediately. We can’t promise to work on the issue without knowing what the impact and time estimate of the issue is. We have to balance new issues against our existing priorities and plans.
Our triage meetings have the following agenda:
- Find all newly opened tickets across all of our repositories. In GitLab, we look at the issue list at the docs organization level, which shows issues from all our repositories, and we sort by created date with the newest first. We usually recognize which issues are new, but we can also tell because new issues typically don’t have the normal triage labels we use.
- Review all new tickets, assign appropriate labels, and ask for more clarification from the individual who opened the ticket as needed. These tasks are the essence of issue triage. I’ll explain more about this process in the rest of the blog entry.
- After reviewing new tickets, search for any issues with the “Triage needed (docs)” label. These tickets were usually assigned this label as a reminder to follow up on them in a future triage meeting.
- If time allows, groom the backlog. Grooming means reviewing the oldest tickets in our backlog to determine if they need to be archived. Our policy is to mark issues that are older than a year as stale. We typically give a one-week notice before closing to give the person who opened the ticket a chance to respond.
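To make the agenda concrete, here is a minimal sketch of how open issues could be sorted into the three buckets we walk through in the meeting. It is not how GitLab works internally and it doesn’t call the GitLab API. The `Issue` class, the `build_agenda` function, and the idea of treating "no severity or time estimate label" as "new" are my own assumptions for illustration. Only the “Triage needed (docs)” label and the one-year stale policy come from our actual process.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Label names taken from the severity and time estimate tables later in this post.
TRIAGE_LABELS = {
    "Critical", "High", "Medium", "Low",
    "Long-term", "Sprint", "Single-day", "Quick fix",
}
FOLLOW_UP_LABEL = "Triage needed (docs)"
STALE_AFTER = timedelta(days=365)  # our policy: older than a year is stale


@dataclass
class Issue:
    """Hypothetical stand-in for an issue exported from the tracker."""
    title: str
    created: date
    labels: set[str] = field(default_factory=set)


def build_agenda(issues, today):
    """Split open issues into the agenda buckets, newest issues first."""
    agenda = {"new": [], "follow_up": [], "stale": []}
    for issue in sorted(issues, key=lambda i: i.created, reverse=True):
        if FOLLOW_UP_LABEL in issue.labels:
            agenda["follow_up"].append(issue.title)
        elif not issue.labels & TRIAGE_LABELS:
            # Assumption: an issue with no triage labels has not been reviewed yet.
            agenda["new"].append(issue.title)
        elif today - issue.created > STALE_AFTER:
            agenda["stale"].append(issue.title)
    return agenda


issues = [
    Issue("Document new API endpoint", date(2021, 2, 26)),
    Issue("Clarify LDAP steps", date(2021, 2, 10), {FOLLOW_UP_LABEL}),
    Issue("Fix typo on Downloads page", date(2019, 11, 5), {"Low", "Quick fix"}),
    Issue("Rework install guide", date(2021, 1, 4), {"High", "Sprint"}),
]
agenda = build_agenda(issues, today=date(2021, 3, 1))
for bucket, titles in agenda.items():
    print(bucket, titles)
```

Issues that are already triaged and less than a year old land in none of the buckets, which matches the meeting: we don’t revisit them unless something changes.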
The goal of issue triage
The goal of triage is to regularly review newly-opened documentation issues and prepare them for future work. The purpose of triage is not necessarily to resolve tickets. The purpose is to get tickets into an actionable state and to add necessary metadata that will help the documentation team make important decisions about when to work on a given issue.
When the documentation team evaluates a ticket, we assess the ticket to determine:
- Severity level – How urgent or important is this issue? Is it a high risk issue? (You could also call this the priority level.)
- Time estimate – How long will it take to resolve the issue and how much work will be involved?
- General docs element, topic, or type (optional) – Does the issue impact a specific docs element, such as the API docs, Sphinx, or our theme? Does it impact a specific topic, such as the Downloads page or LDAP? Is the issue related to a larger issue category, such as style guide issues or search/navigation?
- Team or individual (optional) – Who should work on this issue? Does the issue need the entire documentation team’s attention? Does the issue need support from additional teams like QA, UX, or SRE? Note that it’s okay to leave this unassigned.
- Correct repository – Was the issue opened in the same repository in which the fix for the issue will be submitted? (If not, the issue needs to be moved.)
- Clarity needed – Is it clear from the issue’s description and comments what work needs to be done to resolve this issue? Is more clarification needed from the individual who opened the ticket?
The documentation team will then assign metadata to the issue in the form of labels that capture this information. If the issue needs further clarity, the team will follow up with the individual who opened the issue for more information.
Severity level
Our team currently uses the following severity levels to designate a ticket’s impact and importance (based on what our dev teams use):
Label | Description | Examples |
---|---|---|
Critical | The product has a critical problem that needs immediate attention and supporting documentation is needed. | Security vulnerabilities, documentation or website functionality failures |
High | The documentation is in a state that is actively impacting a large number of users and affects their ability to use the product effectively or access the documentation. In a word: urgent. | Inaccurate documentation, missing documentation, known issues in the product |
Medium | The documentation is still in a usable state but could be misleading or confusing to users. | Formatting issues, grammar or punctuation errors, improper terminology |
Low | The product or documentation is in a usable state but could be improved. | Style guide inconsistencies, cosmetic fixes |
Time estimate
Our team currently uses the following time estimates to signify our best guess about how long an issue will take to resolve:
Label | Description | Examples |
---|---|---|
Long-term | A major task that could take a month or more and is usually reserved for full documentation initiatives. It will involve collaborating with cross-functional teams and may involve planning sessions. The task might require approval from upper management because it will involve using a lot of resources. | Epics, major back-end or front-end changes, creating full new doc sets comprising several topics, migrations, major reworks of key content |
Sprint | A fairly major task that could take 1-3 weeks to complete. These tasks might involve interviewing subject matter experts or conducting independent research. | Documentation for a feature release or that satisfies a user story, improvements to existing documentation |
Single-day | A task that could likely be completed in 1-2 days of dedicated time. (It might take up to a week after it gets through the review process.) | Reworking a single topic, updating a style guide issue in a set of topics or a small doc set |
Quick fix | A task that could take a few minutes, an hour, or up to half a day to complete. No subject matter reviews needed. Low-hanging fruit. | Fixing a typo, fixing a minor formatting error, adding a few lines of text |
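Because severity and time estimate are the two labels every triaged issue should end up with, a small check like the following can flag issues that still need attention. This is only a sketch: the function name `missing_triage_labels` is hypothetical, and it works on a plain collection of label names rather than on any real tracker API.

```python
# Label names copied from the severity and time estimate tables above.
SEVERITY_LABELS = ("Critical", "High", "Medium", "Low")
TIME_ESTIMATE_LABELS = ("Long-term", "Sprint", "Single-day", "Quick fix")


def missing_triage_labels(labels):
    """Return the required label groups that an issue still lacks."""
    missing = []
    if not any(label in SEVERITY_LABELS for label in labels):
        missing.append("severity")
    if not any(label in TIME_ESTIMATE_LABELS for label in labels):
        missing.append("time estimate")
    return missing


# An issue labeled with a severity and a topic, but no time estimate yet.
print(missing_triage_labels({"High", "API docs"}))
```

An empty result means the issue has both required labels. The optional labels (docs element, team) are deliberately ignored here.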
General docs element, topic, or type (optional)
If needed, we assign the issue a label that describes the specific docs element it impacts. For example, it might impact elements such as:
- API docs
- KB (for Knowledge base)
- Front-end (for issues that impact our documentation theme)
- Back-end (for static site generator or pipeline issues)
If the issue is related to a specific topic that generally needs heavy work, we assign it a label for that topic. For example, we sometimes assign labels for:
- Downloads page
- Authentication docs
If the issue is related to a larger type of issue or docs initiative, we assign it those kinds of labels. For example, we sometimes assign labels for:
- Best practices
- Continuous integration
- Images and visuals
- Style guide issues
- Search and navigation
- Testing
- Troubleshooting
Additional labels might be needed in certain contexts. For example, in Open Salt:
- If it’s module documentation, it gets assigned the docstring label.
- If it’s in the topical documentation, it gets assigned the rst label.
Team or individual (optional)
If the issue requires a specific member of the docs team or if someone strongly wants to work on it, we might go ahead and assign it to that individual.
If the issue needs support from an additional team, such as QA, Dev, UX, or SRE, we add those labels to the issue. If the issue will require support from the full documentation team or needs to be labeled as a docs ticket for visibility in other repositories, we add the Documentation team label.
Correct repository
Ideally, issues should be opened in the repository where the merge request and/or pull request for that work will be done. If the issue was not opened in the correct repo, it should be moved to that repo. We do have some exceptions to this rule, such as for API docs and knowledge base articles.
Clarity needed
In some cases, it may not be clear from the issue’s description and comments what work needs to be done to resolve this issue. The team might also not be able to assign a severity level or time estimate until more information is gathered.
When that occurs, the team should:
- Add a comment requesting clarification from the individual who opened the issue.
- Assign the Triage needed (docs) label, which indicates that this ticket must be reviewed in the next triage meeting after more information has been gathered.
If the individual who opened the issue doesn’t respond by the next week, I usually add a comment with a stronger warning that we will close the issue unless we get more clarification. Occasionally I might reach out to the person on Slack or through email. After another week passes, I close the ticket. Sometimes that’s the only way to get attention from the original person who opened the issue.
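The follow-up timeline above boils down to a simple rule based on how many full weeks have passed since we asked for clarification. Here is a rough sketch of that rule. The function name and the action names ("wait", "warn", "close", "re-triage") are made up for this example, and in practice I apply this by hand rather than with a script.

```python
from datetime import date


def follow_up_action(requested_on, today, has_reply):
    """Suggest the next step for an issue that is waiting on clarification."""
    if has_reply:
        # The reporter answered, so the issue goes back into the triage meeting.
        return "re-triage"
    weeks_waiting = (today - requested_on).days // 7
    if weeks_waiting >= 2:
        return "close"  # a week has passed since the warning
    if weeks_waiting == 1:
        return "warn"   # comment that the issue will be closed without a reply
    return "wait"


print(follow_up_action(date(2021, 3, 1), date(2021, 3, 8), has_reply=False))
```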
Good triage sets you up for success
So, what’s the point of all this triage work? A good triage process helps you make decisions about what to work on by helping you realize what’s most important and how long it might take you to complete the work. The labels (metadata) you add to issues during triage make it possible to quickly filter issues and see what needs to be worked on. This data helps you:
- Plan your sprints
- Prioritize your work
- Measure your progress
- Determine if you have enough resources to complete the work that is being requested from you
For example, my team operates on 2-week sprint cycles. On the last day of the sprint, we assign issues and add the issues we’re working on to each sprint for tracking. (In GitLab, you assign issues to a milestone to track a sprint. Other project management systems have similar functionality but might use different naming conventions.)
When I’m planning my sprint with my teammates, I filter our kanban board using the triage labels to see if there are any high severity issues that I can fix in the time I have available.
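In GitLab I do this filtering by clicking labels on the board, but the logic is roughly the following. Treat this as a sketch under assumptions: the `Issue` class and `sprint_candidates` function are hypothetical, the day counts in `MAX_DAYS` are my loose reading of the time estimate table (for example, three weeks as 15 working days), and skipping Long-term issues is a simplification because that work is usually planned separately.

```python
from dataclasses import dataclass

# Highest severity first, as in the severity table.
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]
# Assumed upper bounds in working days for each time estimate label.
MAX_DAYS = {"Quick fix": 0.5, "Single-day": 2, "Sprint": 15}


@dataclass
class Issue:
    """Hypothetical stand-in for a triaged issue."""
    title: str
    severity: str
    estimate: str


def sprint_candidates(issues, days_available, min_severity="High"):
    """Pick the most severe issues that fit into the days I have available."""
    cutoff = SEVERITY_ORDER.index(min_severity)
    remaining = days_available
    picked = []
    ranked = sorted(issues, key=lambda i: SEVERITY_ORDER.index(i.severity))
    for issue in ranked:
        cost = MAX_DAYS.get(issue.estimate)
        if cost is None:
            continue  # Long-term work is not scheduled by this sketch
        if SEVERITY_ORDER.index(issue.severity) > cutoff:
            continue  # below the severity I am filtering for
        if cost <= remaining:
            picked.append(issue.title)
            remaining -= cost
    return picked


backlog = [
    Issue("Fix broken link in install guide", "High", "Quick fix"),
    Issue("Document security advisory", "Critical", "Single-day"),
    Issue("Rewrite authentication docs", "High", "Sprint"),
    Issue("Tidy heading capitalization", "Low", "Quick fix"),
]
print(sprint_candidates(backlog, days_available=3))
```

With three days available, the sketch picks the critical single-day issue and the high-severity quick fix, and leaves the sprint-sized rewrite for a sprint where I have more room.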
I also sometimes look at issues that cover a similar topic or documentation element so that I can group similar issues together and work on them at the same time. I like to call these my sprint themes. In the sprint description where I add my issues, I describe my themes and connect them to the larger goal or initiative that unifies the tickets I’m working on. These themes give me a big picture of what I’m going to work on.
When I log in to work every day, I check my current sprint milestones and I look at a custom kanban board that shows me which issues I’m specifically working on. As I work on an issue, I make sure to update its status by moving it into the correct column.
Concluding thoughts
Our triage process evolved over about six months and was refined by working closely with our project manager, who loves implementing process improvements. We met frequently to talk as a team about what our severity levels and time estimates needed to be. These conversations ensured that we all had a common baseline of understanding.
If you are like us, you might need to implement this system on top of an existing one. For that reason, we had to go into our backlog and triage old issues that had been created before we implemented this process. I volunteered to take this task on and went through our backlog to assign my best severity and time estimates to the old issues. We had under 100 tickets, so this process took me about a day to do. For any issues that I didn’t feel confident assessing, I assigned the Triage needed (docs) label to ensure that we would review and triage these issues in future triage meetings. It took a few meetings to get through all of the issues I had assigned this label to, but now we’re all caught up. Now it’s just a matter of maintaining new tickets as they come in and reviewing the backlog for stale tickets.
I mentioned at the beginning that I used to feel overwhelmed and discouraged about our backlog. Now I feel like we have a regular process for reviewing, scheduling, and prioritizing incoming work. Our kanban board and our issue list feel organized and manageable. It’s nice to have this peace of mind.