A Few Billion Lines of Code Later


Summary

The paper documents the technical and social difficulties the authors encountered in promoting the use of their Coverity static analysis tool. Much of the article analyses developers' social responses to the different kinds of output the tool produces (such as false positives), as well as to genuine bugs it finds (e.g. misunderstood diagnostics being dismissed as false positives).

Pros

  1. The paper discusses the difficulty of obtaining an analysable representation of client codebases. This includes the decision to move from capturing make commands to intercepting compiler invocations in order to identify files and their relations, as well as the use of a modified EDG parser and source transformations for C++, all to support a huge number of compilers, compiler versions, and build systems (a sketch of the interception idea follows this list).
  2. Strategies for handling the social problems developers pose are discussed. As Coverity is a commercial tool, ensuring developers and managers feel the tool is useful is just as important as technical details such as false-positive rates. Reactions ranging from "Shrug" to "No, your tool is broken; that is not a bug" are reported from demos given by on-site sales staff.
  3. The Coverity tool has been used to find bugs in compilers themselves. These range from earning the "dubious honor of being the single largest source of EDG bug reports after only three years of use" to finding a use-after-free bug in the Visual Studio C++ compiler, triggered by a Microsoft-specific extension in debug mode.
  4. The tool allows diagnostics to be deliberately ignored: suppressions persist for diagnostics that recur across runs, and errors already present when the tool is first adopted can be ignored wholesale (developers want to leave old code unchanged and silence its errors). A sketch of this suppression idea also follows the list.
  5. Coverity is deterministic: despite the tool's tight time constraints, timeouts are discouraged and randomised algorithms are disallowed, even "elegant solutions to many of the exponential problems [they] encounter". This is important for the aforementioned social aspect: developers need to trust the tool, and they trust deterministic results more.
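
To make the compiler-invocation capture in point 1 concrete, here is a minimal sketch of the interception idea, not Coverity's actual code: a wrapper binary installed ahead of the real compiler on the PATH logs every invocation (and therefore every translation unit and its flags) before delegating, so the build proceeds unchanged. The log path and the real compiler's location are assumptions made for the sketch.

```cpp
// Hypothetical compiler wrapper: records each invocation, then delegates.
#include <cstdlib>
#include <fstream>
#include <vector>
#include <unistd.h>

int main(int argc, char* argv[]) {
    // Append the full command line to a capture log; the analyser can later
    // replay it to discover every source file and the flags it was built with.
    std::ofstream log("/tmp/build-capture.log", std::ios::app);
    for (int i = 0; i < argc; ++i)
        log << argv[i] << (i + 1 < argc ? " " : "\n");
    log.close();

    // Hand the invocation off to the real compiler (path is an assumption).
    std::vector<char*> args(argv, argv + argc);
    args[0] = const_cast<char*>("/usr/bin/gcc-real");
    args.push_back(nullptr);
    execv(args[0], args.data());
    return EXIT_FAILURE;  // execv only returns on failure
}
```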
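
The suppression described in point 4 typically amounts to a structured comment next to the flagged line, which the tool stores so the same finding is not re-triaged on later runs. The `suppress[...]` directive below is a hypothetical stand-in, since the paper does not give Coverity's exact syntax.

```cpp
#include <cstdlib>

// Hypothetical suppression directive (not verbatim Coverity syntax): a
// reviewed finding is dismissed in place, and the dismissal persists even
// as the surrounding code is re-scanned.
int parse_port(const char* s) {
    // suppress[tainted_scalar] -- input is range-checked by the caller
    return std::atoi(s);
}

int main(int, char* argv[]) {
    return parse_port(argv[1] ? argv[1] : "0") == 80 ? 0 : 1;
}
```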

Cons

  1. Complex analyses are avoided due to their propensity for misdiagnosis and hard-to-explain errors. The paper gives very little detail on which complex analyses were investigated and rejected, leaving only the justification that "errors found with little analysis are often better than errors found with deeper tricks". Given that tools released since, such as Infer (which competes directly with Coverity as a bug finder requiring no annotations), perform much more complex analysis (e.g. finding data races, deadlocks, and even basic performance problems) across billion-line codebases at Facebook, Uber, Mozilla, and elsewhere, this justification is unsubstantiated. An illustrative example of the bug class at stake appears after this list.
  2. The tool is inherently conservative about reporting errors, due to the distrust that false positives create. From the article it appears the developers opted to drop functionality entirely rather than downgrade suspect or high-false-positive checks to warnings.
  3. Performance is impacted by the commitment to determinism. While this can be a positive (as mentioned in the pros section), it is a negative if determinism is not needed, and the lack of an option to enable faster non-deterministic scanners is a weakness in this respect.
  4. Significant development resources are consumed supporting a wide variety of build systems, compilers, and standards, taking time away from improving bug-finding. By comparison, for C/C++ Infer only supports code compilable with Clang, plus a limited number of build systems (CMake, Make, Gradle, Buck).
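
As a concrete instance of the deeper analysis dismissed in point 1, here is a minimal data race of the kind Infer's race detection targets. It illustrates the bug class only; it is not code from the paper or from either tool.

```cpp
#include <cstdio>
#include <thread>

// Two threads perform unsynchronized read-modify-write on a shared counter:
// a textbook data race that simple, shallow checkers do not report.
int counter = 0;

void bump() {
    for (int i = 0; i < 100000; ++i)
        ++counter;  // BUG: no lock or atomic protects this increment
}

int main() {
    std::thread a(bump), b(bump);
    a.join();
    b.join();
    std::printf("%d\n", counter);  // frequently less than 200000 under the race
    return 0;
}
```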

Improvements

  1. Better justification of the technical decision to avoid more complex analyses, given that Infer has since found great success with them.
  2. Better justification of the choice of determinism, including how much performance is lost in a representative case.
  3. An overview of the whole system (from parsing, through the structure of the checkers, to the kinds of static analysis used) to better explain how it fits together.
  4. Replacement of the unnecessarily large graphics that contain no useful content (they are essentially abstract art) with useful diagrams.
  5. A comparison between Coverity and competing tools (e.g. Infer, now that it is available); competitors are only briefly acknowledged with "the set of bugs found by tool A is rarely a superset of another tool B, even if A is much better than B", with no further explanation.