Meindert Niemeijer edited this page Oct 29, 2023 · 14 revisions

RoboClerk Docs

For the folks that are impatient and want to jump straight into configuring the software for use, please jump to the "Additional Details" document. There is also auto generated HTML documentation for the source code available here. For those of you that enjoy reading lengthy prose on medical device software documentation and the philosophy behind RoboClerk, keep reading! 😃

Introduction

I have a long background in the development of a variety of advanced medical device software. Often this software was developed in start-up companies or in companies where resources were limited for various reasons. One of the most challenging aspects of developing medical device software in a resource-constrained situation is the creation of high-quality documentation. Most software developers do not enjoy working on documentation, especially not on documentation that has limited perceived value for the development team. The RoboClerk software is an attempt to take the learnings from my many years in the field and create a documentation-as-code tool that is easy to understand for developers and empowers them to apply best practices using the tools they are familiar with and like to use. In other words, RoboClerk enables the tools to be the source of truth, not the documentation. When information about the development process is changed (e.g. a requirement ticket is created) this information is automatically picked up by RoboClerk and integrated into the documentation. The RoboClerk philosophy is that documentation should be created as part of the build and source controlled in human-readable ASCII files that are stored with the product source code (hence, documentation as code).

Why and How Does Medical Device Software Need to be Documented

In a typical medical device software project, as much as 50% of the final deliverable is documentation. Documentation exists for a number of reasons, but for companies developing medical device software one important reason is to show the regulators of their industry that best practices were applied in the software's development and that the software meets a certain standard of quality. In the United States this regulator is the FDA. The FDA has long been one of the strictest regulators, but others (for example, the European Union) are adopting many of the same requirements and practices.

One major question newcomers to the field may have is: "what about the development process should be documented?". Luckily, there exists an internationally harmonized ISO standard for the development of Medical Device Software, ISO62304. This standard describes the elements that need to be present in the software documentation. If the development process meets this standard, all required elements will be present. There are other ISO standards that are potentially important. ISO62366 describes usability engineering and ISO14971 describes an approach for risk management for medical device software. RoboClerk can help with the implementation of these standards as well.

I think the easiest way for a software engineer to think about an ISO standard is as a requirements document. The standard describes what the development process should contain, for example documented system requirements, but not how this is accomplished. This is analogous to good requirements not spelling out how some piece of functionality is supposed to be implemented in the software. Each company implementing the standard has to write its own specification document (i.e. a Standard Operating Procedure or SOP) that describes how the company will be implementing the standard. This means that every company building medical device software has a different set of SOPs but once you have seen one set, other sets of SOPs will be easier to interpret because all of the SOPs are trying to implement the same standard.

A secondary reason for documentation, which is also important, is to help the team. If it is possible for a document to be created in such a way that the development team can benefit from its existence then that is a bonus. This also ties into when certain activities take place in the development cycle. The best practices as required by ISO62304 are best applied before and during development. If software already exists, documenting it after the fact provides limited to no benefit (other than satisfying a regulatory requirement).

After medical device software is cleared and reaches the market, changes will inevitably need to be made. Maintaining the documentation is a major hassle and slows down the process of iterating on software changes. The ability to quickly generate updated documentation and iterate is a major motivator for the creation of RoboClerk.

RoboClerk

RoboClerk is a software tool that enables users to collect information from a variety of sources and integrate that information into text-based documentation in the asciidoc format. It also features a traceability function that can be used to analyze references between trace entities and documents and determine the completeness of these trace links. It is highly configurable and supports a plugin interface to allow interoperability with various software lifecycle management systems (SLMS) like Azure DevOps and Redmine.

Using RoboClerk it is possible to automate the generation of large segments of the typical set of development artifacts that the standard expects to be created. A lot of the information that goes into the documentation is stored electronically in an SLMS or in other systems (including the version control system). In this document these data sources are referred to as a "source of truth". Creating the documentation without linking that information back to the source of truth wastes time and risks the documentation being out of date once changes are made (usually almost immediately).

Any time a software engineer needs to switch to a different toolset (e.g. from their IDE to Microsoft Word) that context switch costs productivity. A design goal of RoboClerk is to minimize context switching by doing two things:

  • Using a text-based format for the raw documentation allows the use of standard version control (e.g. Git) to control documentation. This has the advantage that all the standard tools developers are already familiar with for managing text files can be used on the docs. It also means documentation can be stored and versioned with the code. Many IDEs and text editors come with asciidoc preview functionality so that the person editing the asciidoc templates can see the effect of the edits in real time.
  • Using the SLMS/source code as the source of truth. These are tools that developers use every day and they tend to be well integrated into their workflow already.

I hear you asking, "But what about the non-developer stakeholders that can only work with Word?". RoboClerk is completely agnostic to your documentation pipeline. It generates asciidoc documents based on a set of templates. How you choose to process those documents is up to you. RoboClerk has the ability to run tool chains after it finishes generating documents. Microsoft Word is still a very popular tool for document creation in business. A tool chain that I have set up uses the asciidoctor and pandoc tools, which allow an almost seamless conversion of asciidoc-based text documents to the Microsoft Word format. Additionally, RoboClerk supports linking all elements taken from an SLMS back to the SLMS using hyperlinks in the documents, so finding the information in the SLMS during a document review is simple.
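To make the asciidoc-to-Word tool chain a bit more concrete, here is a minimal Python sketch of one possible two-step conversion. It is an assumption on my part that DocBook is used as the intermediate format (a commonly used route between asciidoctor and pandoc); the function name `build_toolchain` and the file names are hypothetical and not part of RoboClerk.

```python
from pathlib import Path


def build_toolchain(adoc_file):
    """Return the two commands that turn one asciidoc file into a .docx file.

    Assumption: DocBook XML is used as the intermediate format. Nothing is
    executed here; the commands are only assembled.
    """
    source = Path(adoc_file)
    docbook = source.with_suffix(".xml")
    docx = source.with_suffix(".docx")
    return [
        # Step 1: asciidoctor renders the asciidoc source to DocBook XML.
        ["asciidoctor", "-b", "docbook", "-o", str(docbook), str(source)],
        # Step 2: pandoc converts the DocBook XML to Microsoft Word.
        ["pandoc", "-f", "docbook", "-t", "docx", "-o", str(docx), str(docbook)],
    ]


# To actually run the conversion (requires asciidoctor and pandoc on the PATH):
#   import subprocess
#   for command in build_toolchain("output/SoftwareRequirements.adoc"):
#       subprocess.run(command, check=True)
```

Keeping the commands as plain lists makes it easy to inspect them before running anything, and to swap in a different converter if your pipeline does not target Word.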

We are in an exciting era of AI advancement, particularly in document creation and text interpretation. Tools like ChatGPT offer vast potential, hinting at a future where human input in documentation generation and interpretation might be minimal. RoboClerk supports the use of Large Language Models (LLMs) to assist in the creation of better documentation as well as to support the development team with their development and compliance activities. At this time RoboClerk supports AI plugins that can be used to review generated documentation and provide comments. In the future this will be expanded to include generation of parts of the documentation, direct assistance with requirements gathering, risk management and other activities.

The RoboClerk software is written in .NET 6 and is completely cross platform. This documentation is focussed on creating a mental model of the software first while keeping the technical details for later sections. If you are interested in jumping straight in, try the "Getting Started" section.

The High Level Architecture

Documentation Data Architecture

The ISO62304 standard is not prescriptive in how it expects best development practices to be documented. However, it does presume a certain high level structure with two levels of requirements and a number of other entities that can be related to each other and should, in that case, be traceable through the documentation.

Truth Entity Overview

This image shows the basic traceable entities (aka truth entities) and, for most of the items, the kind of document where they could be documented:

  • Risk, an identified hazardous situation that can occur. Not every requirement will have a risk associated with it (hence the dotted line). If a requirement has a Risk attached to it, then that requirement describes a risk control measure.
  • System Requirement, the highest level of requirements.
  • Software Requirement, the more detailed level of requirements.
  • Documentation Requirement, requirements for the documentation (e.g. the user manual or the IFU).
  • DocContent, the text that satisfies the documentation requirement. RoboClerk can insert this into the documentation in a traceable way.
  • Anomaly, also known as a bug. RoboClerk searches for open bugs.
  • SOUP, Software Of Unknown Provenance. In the US this is typically known as Off-the-shelf (OTS) software.

Note that when requirements are discussed, these could also be user stories or some other method of defining what the system/software/documentation is supposed to do or contain. The standard uses the term requirement but does not specify the form of the requirement.

RoboClerk supports tracing each of the entities in the list above to each other as well as to any document created by RoboClerk. In this way a requirement can be traced all the way down to its tests. In order to trace an entity to a document, that document must contain a reference to that entity. If RoboClerk inserts any entities into a document, it will automatically trace those entities to the document. It is also possible for a user to explicitly insert a trace link in a document. For example, to establish trace from a software requirement to its design, one can simply refer to the software requirement at the location in the design specification where the design for that requirement is outlined. RoboClerk supports a particular kind of tag, a trace tag, for this exact purpose.

The Risk Control Measure entity is shown in the image because it is an outcome of the risk management process, and each risk control measure is either a system or a software level requirement. System and software level requirements that are not risk control measures are generally created by first determining the system level requirement (which can come from a wide variety of sources, e.g. product management, regulatory affairs, etc.) and then breaking the system level requirement down into more specific software requirements. Software Requirements trace directly to Software System Tests that test the software requirements. When required, software requirements can also trace down to unit tests.
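The idea that a document must contain a reference to an entity before the entity traces to it can be illustrated with a small Python sketch. This is not how RoboClerk implements its trace analysis; the ID format (`SWR-12`) and the function names are assumptions made purely for illustration.

```python
import re

# Assumption: entity IDs look like "SWR-12" (uppercase prefix, dash, number).
ID_PATTERN = re.compile(r"\b[A-Z]+-\d+\b")


def referenced_ids(document_text):
    """Collect every entity ID that is mentioned somewhere in the document."""
    return set(ID_PATTERN.findall(document_text))


def trace_report(expected_ids, document_text):
    """Split the expected IDs into those the document references and those it misses."""
    found = referenced_ids(document_text)
    return {
        "traced": sorted(expected_ids & found),
        "missing": sorted(expected_ids - found),
    }


design_spec = "The login flow implements SWR-1. Session expiry (SWR-12) is in section 3."
report = trace_report({"SWR-1", "SWR-2", "SWR-12"}, design_spec)
# report["missing"] lists the software requirements the design never refers to.
```

A complete trace is then simply a report whose "missing" list is empty for every document that the entities are supposed to trace to.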

Note that there is a lot of flexibility in the naming and content of these documents and that different organizations tend to have different names for documents and even for the different trace entities. An organization may also work with combined versions of documents. In addition to the documents shown in the above image, there exists the need to further document other aspects of the development project. Here is a list of a number of example documents one might find in a medical device documentation project and what they typically contain:

  • System Requirements Specification - lists the system requirements. Usually the requirements are grouped in some way to make interpretation easier. For example, all risk control measures might be grouped together and so will the regulatory requirements.
  • Software Requirements Specification - lists the software requirements. Usually, these requirements are grouped based on the group their parent system requirements belonged to. Each software requirement should list its parent system requirement.
  • Software Development Plan - This is a top level document that generally does not contain traceable items. Depending on how extensive the organization's SOPs are, this can be a shorter (lots of SOPs covering many aspects of software development) or longer (SOPs cover less) document. It is the highest level planning document and will typically contain many elements, from SDLC definition to Configuration Management. If an organization typically creates a certain type of software, the Software Development Plans of different projects tend to be very similar.
  • Software Design Specification - The software design specification contains the high level design which typically includes a breakdown of the software into items and units. Other things that can typically be found in this document are the data design and for SaMD the manual could even be part of this document. Software requirements typically trace to this document using trace tags.
  • Risk Assessment Record - The risk assessment record is the record that is created to document the risk assessment process. Describing a risk assessment process is outside of the scope of this document but typically a standard approach for risk assessment is used (e.g. Failure Mode Effect Analysis, FMEA). This record contains the identified primary hazards associated with the software as well as the identified risks and their risk control measures. If these RCMs are design changes, they are either translated into a system or a software requirement. The advantage of this is that by following the normal development process, one can be sure that the risk control measures will be implemented, verified and validated. Trace to and from the Risk Assessment Record is focused on those requirements that are in the risk assessment group. All of those should trace to risk control measures and vice versa.
  • System Level Test Plan - Lists all the Software System Tests that test the software requirements. Each Software System Test should refer to the software requirement that it tests and list all the test steps with expected results.
  • Integration Level Test Plan - The integration level test plan contains tests that are focused on testing the integration of software systems, items and units together. It is possible to cover these tests with the System Level Test Plan instead and then no separate Integration Level Test Plan is necessary according to ISO 62304.
  • Unit Level Test Plan - The Unit Level Test Plan is the lowest level test plan. This plan contains a description for each individual unit test. These unit tests can refer back to the software requirement that covers the unit they are testing.
  • Detailed Software Design Specification - This contains the detailed software design which describes the precise manner in which software units are integrated and designed. Only required for the highest safety classification in ISO 62304.
  • Transfer to Production Plan - This plan describes how the software will be put into production. Classically, this would have the process for creating the installation files and then the process for duplication of media and packaging of the software for shipping. For web applications there should be a one-time preparatory process to create the files for installation. Given these files, there is then an install process that is repeated with every installation. In some organizations, the risk assessment record traces here if the transfer to production process was subjected to risk assessment.
  • Code Coverage Record - A record that describes what parts of the code have been covered with tests.
  • Residual Anomalies Record - Contains a list of all outstanding "anomalies", that is, bugs and issues. For each item there is typically a justification of why it is OK not to fix the anomaly and an indication of the software release in which a fix is expected.
  • Runtime Error Detection Record - In some cases this document will have both the plan and the record for the static code analysis. This is not required by ISO62304 but it is considered a best practice.
  • Work Order - if the process for transferring the software into production is sufficiently complex, a work order can serve as a record for successful transfer. For a web application, the work order could contain installation instructions, a number of verification steps to verify the install was successful as well as a final sign-off. Whether a work order is an appropriate document to have depends on how much human intervention is needed in the installation process. With modern DevOps practices, the infrastructure management can be written as code and managed as such (by including software requirements and tests for the infrastructure code).
  • Revision Level History - Contains a list of the versions of the software and what changed in each version.
  • System Baseline - The baseline contains all the versions of the software environment in which the final production software was produced and tested.
  • Product Validation Plan - The product validation plan provides for the high level validation of the product. It is intended to answer the question: "Can the user use the software successfully to accomplish their goals?". Typically, a set of usability system requirements are used to specify beforehand what the user must be able to do with the software. The validation plan then validates the software through user testing. All usability system requirements (that typically do not have child software requirements) should be traced to this document.
  • Traceability Analysis Record - A record of the overall traceability analysis. It allows a reader to determine whether the trace is complete or whether any trace items are missing.

Note that some documents only apply to certain ISO 62304 safety classifications (A, B or C). The standard contains more details about what artifacts are required for each of the three different safety classifications. The set of documents that RoboClerk can support is limitless as any documents can be added to the project configuration file. Even documents that are not managed or generated by RoboClerk can be added to facilitate referencing those documents.

Data Flow

The following diagram shows how data enters and leaves the RoboClerk system:

Data flow diagram

RoboClerk takes in information from various datasources like Software Lifecycle Management Systems (SLMS), the source code, and Dependency Management Systems. Examples are Microsoft Azure DevOps (https://azure.microsoft.com/en-us/services/devops/) for an SLMS, pre-specified test documentation in the source code for unit tests and NuGet for a dependency management system. Below we will discuss each of the elements of the diagram to familiarize you with them.

SLMS

The typical information that can be retrieved from an SLMS includes requirements, bugs, issues, test cases and their relations. RoboClerk encourages teams to add more information to the SLMS, for example SOUP documentation. One powerful feature of an SLMS is its ability to keep track of the relationships between truth entities. For example, when a software requirement is retrieved, the relationship information can be used to determine what its parent system requirement is and what test cases test the requirement. Each of these items typically has a unique identifier that can be used to trace a product requirement through a software requirement all the way through implementation and testing. This information is critical to document in a traceability analysis to comply with ISO62304.
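As an illustration of how unique identifiers and relationship information let you walk from a system requirement down to its tests, consider the following Python sketch. The `Item` class, its fields and the example IDs are hypothetical; a real SLMS and the RoboClerk plugins will model this differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    item_id: str                      # unique identifier assigned by the SLMS
    kind: str                         # "system", "software" or "test"
    parent_id: Optional[str] = None   # software requirement -> parent system requirement
    tests_id: Optional[str] = None    # test case -> software requirement it tests


def trace_down(items, system_id):
    """Map each child software requirement of system_id to the IDs of its test cases."""
    result = {}
    for req in items:
        if req.kind == "software" and req.parent_id == system_id:
            result[req.item_id] = [
                test.item_id
                for test in items
                if test.kind == "test" and test.tests_id == req.item_id
            ]
    return result


slms_items = [
    Item("SYS-1", "system"),
    Item("SWR-1", "software", parent_id="SYS-1"),
    Item("SWR-2", "software", parent_id="SYS-1"),
    Item("TC-1", "test", tests_id="SWR-1"),
]
chain = trace_down(slms_items, "SYS-1")
# A software requirement mapped to an empty list has no test case yet.
```

An empty list in the result is exactly the kind of gap a traceability analysis is meant to expose.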

RoboClerk DataSource Plugin

Each source of data that is to be used in the RoboClerk system (with the exception of the project configuration file) requires a DataSource Plugin. This plugin contains the logic needed to extract the information from the data source in a format that is usable for RoboClerk. In this way RoboClerk can extract information from a variety of systems and sources. It also makes customization of RoboClerk straightforward. All information from all datasources is combined and made available to the RoboClerk Core.

One special case that warrants further explanation is the checkpoint file datasource. Every time RoboClerk runs it produces a JSON file that contains all the information that was pulled from the various datasources. This file is put in the output directory with the documents that were produced. Using this checkpoint file it is possible to re-generate the documentation even if truth entities in the SLMS have changed. This can be handy if a major version update of the software was released, changes to the SLMS state were made afterwards and now a limited bug fix needs to be put in place. By indicating the location of the checkpoint file in the project configuration, RoboClerk will use the state from that file to generate the documentation. If particular items have changed, the ID of those items can be indicated as well and RoboClerk will pull an updated version of just that item from the SLMS. In essence, this functions as a way to freeze the state of the external data sources at release. A demo checkpoint file is included in the RoboClerk repo.
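The checkpoint behavior can be sketched in a few lines of Python. The JSON layout used here (an `items` list whose entries have an `id`) and the `fetch_live` callback are assumptions for illustration only; the demo checkpoint file in the RoboClerk repo shows the actual structure.

```python
import json


def load_frozen_state(checkpoint_json, refresh_ids, fetch_live):
    """Start from the frozen checkpoint state and refresh only the listed item IDs.

    checkpoint_json: text of a (hypothetical) checkpoint file
    refresh_ids:     IDs whose current version should be pulled from the SLMS
    fetch_live:      callable that returns the live item for a given ID
    """
    items = {item["id"]: item for item in json.loads(checkpoint_json)["items"]}
    for item_id in refresh_ids:
        items[item_id] = fetch_live(item_id)
    return items


checkpoint = json.dumps({
    "items": [
        {"id": "SWR-1", "title": "Log in with password"},
        {"id": "SWR-2", "title": "Export report as PDF"},
    ]
})


def fake_slms_lookup(item_id):
    # Stand-in for a real SLMS query; returns the post-release version of an item.
    return {"id": item_id, "title": "Export report as PDF/A"}


state = load_frozen_state(checkpoint, ["SWR-2"], fake_slms_lookup)
# SWR-1 keeps its frozen release state; only SWR-2 reflects the bug fix.
```

Everything not explicitly refreshed stays exactly as it was at release, which is what makes a limited bug fix release possible without dragging in unrelated SLMS changes.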

RoboClerk Configuration File

RoboClerk has a configuration file to configure various pieces of the system. All configuration files use the TOML format to make editing and interpretation by humans easier. The configuration file contains comments that explain the different options. At a high level the following software features can be configured there:

  • The list of data source plugins to load and use
  • The list of filesystem locations where to search for plugins
  • The output directory where the final, generated documents should be placed
  • The logging level (log file is placed in output directory)

Project Configuration File

RoboClerk is intended to be part of the build process. In addition to the overall configuration file, each set of documents for a particular software project has its own project configuration file. This TOML file contains settings that are specific to this particular project such as:

  • All the locations of the document templates for this project. RoboClerk has a standard set of documents that it supports. If no information is provided, RoboClerk assumes that that document is not relevant and will ignore it.
  • Entity name mapping allows you to customize the name of the base entities in the software to match the names used inside your organization
  • Configured traceability allows you to specify what items should trace to and from what documents. This allows you to customize the traceability analysis to match the standard operating procedure in your organization.
  • Custom project based values. RoboClerk supports inline tags in the document templates that are replaced with the values from this file in the final documentation. One application is to customize boilerplate document content for a particular project. For example, the name and version of the software in the templates should be a tag in this file so that templates can be easily reused between projects.

Asciidoc based templates

For each project, RoboClerk assumes there is a template for each of the documents that RoboClerk is supposed to produce. The templates are configured in the project configuration file. RoboClerk will load the templates, extract any RoboClerk tags, retrieve any information associated with the tags and then replace the tags with that information in the output files. For specific details on how to use the RoboClerk tags, see the "Additional Details" document.

RoboClerk Tags

At a high level there are two types of tags that are supported, container tags and inline tags. The delimiter for container tags is @@@ while inline tags are @@. Container tags start on a line and end on a different line while inline tags have to be on a single line (no line breaks in the tag definition). The tags take parameters that configure what and how output is generated in the output documents.
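To give a feel for how the two tag types differ, here is a simplified Python sketch that finds both kinds of tags in a template and replaces them. Only the delimiters and the single-line versus multi-line rule come from the description above; the tag contents (`Config:SoftwareName()`, `SLMS:SoftwareRequirement()`) and the lookup dictionaries are assumptions, so consult the "Additional Details" document for the real tag syntax.

```python
import re

# Container tag: "@@@" opens on one line and a later line closes it with "@@@".
CONTAINER = re.compile(
    r"^@@@(?P<header>[^\n]*)\n(?P<body>.*?)^@@@[ \t]*$",
    re.MULTILINE | re.DOTALL,
)
# Inline tag: "@@...@@" on a single line, not part of a "@@@" delimiter.
INLINE = re.compile(r"(?<!@)@@(?!@)(?P<content>[^@\n]+)@@(?!@)")


def render(template, inline_values, container_content):
    """Replace known tags with their content; unknown tags are left untouched."""

    def fill_container(match):
        return container_content.get(match.group("header").strip(), match.group(0))

    def fill_inline(match):
        return inline_values.get(match.group("content").strip(), match.group(0))

    # Containers are handled first so their delimiters are gone before inline matching.
    text = CONTAINER.sub(fill_container, template)
    return INLINE.sub(fill_inline, text)


template = (
    "= @@Config:SoftwareName()@@ Requirements\n"
    "\n"
    "@@@SLMS:SoftwareRequirement()\n"
    "placeholder text\n"
    "@@@\n"
)
output = render(
    template,
    {"Config:SoftwareName()": "RoboClerk"},
    {"SLMS:SoftwareRequirement()": "* SWR-1: The software shall log in users."},
)
```

Leaving unknown tags in place rather than deleting them is a deliberate choice in this sketch: a leftover tag in the output is easy to spot during review.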

Output Asciidoc Documents

The output documents are exactly like the templates but every RoboClerk tag has been replaced with information gathered from the data sources. The documents can be stored with the other build artifacts. Note that RoboClerk can be run stand-alone as well. This is often useful when working on documents locally and is typically faster than checking in the changed documents and having the build process create the documents.