Handling missing covariate data in clinical studies in haematology

Authors: Edouard F. Bonneville, Johannes Schetelig, Hein Putter, and Liesbeth C. de Wreede

Abstract

Missing data are frequently encountered across studies in clinical haematology. Failure to handle these missing values in an appropriate manner can complicate the interpretation of a study’s findings, as estimates presented may be biased and/or imprecise. In the present work, we first provide an overview of current methods for handling missing covariate data, along with their advantages and disadvantages. Furthermore, a systematic review is presented, exploring both contemporary reporting of missing values in major haematological journals, and the methods used for handling them. A principal finding was that the method of handling missing data was explicitly specified in a minority of articles (in 76 out of 195 articles reporting missing values, 39%). Among these, complete case analysis and the missing indicator method were the most common approaches to dealing with missing values, with more complex methods such as multiple imputation being extremely rare (in 7 out of 195 articles). An example analysis (with associated code) is also provided using haematopoietic stem cell transplant data, illustrating the different approaches to handling missing values. We conclude with various recommendations regarding the reporting and handling of missing values for future studies in clinical haematology.
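To make the two most commonly reported approaches concrete, here is a minimal base R sketch on simulated data. It is not the code from the manuscript: the variable names (`x`, `z`, `y`), the linear model, and the missingness mechanism are all assumptions chosen for illustration, whereas the manuscript's example uses a survival outcome.

```r
# Simulated data; all names and parameter values here are hypothetical.
set.seed(2022)
n <- 500
x <- rnorm(n)                          # covariate that will be partly missing
z <- rbinom(n, 1, 0.5)                 # fully observed covariate
y <- 1 + 0.5 * x + 0.3 * z + rnorm(n)  # outcome

x_obs <- x
x_obs[runif(n) < 0.3] <- NA            # roughly 30% missing completely at random
dat <- data.frame(y = y, x = x_obs, z = z)

# 1) Complete case analysis: with the default na.action (na.omit),
#    lm() drops every row that has a missing value in a model variable.
fit_cc <- lm(y ~ x + z, data = dat)

# 2) Missing indicator method: replace missing x by a constant (here 0)
#    and add a 0/1 indicator of missingness as an extra covariate.
dat_ind <- transform(
  dat,
  x_missing = as.integer(is.na(x)),
  x_filled  = ifelse(is.na(x), 0, x)
)
fit_ind <- lm(y ~ x_filled + x_missing + z, data = dat_ind)

# Number of rows each analysis actually uses
n_cc  <- nobs(fit_cc)
n_ind <- nobs(fit_ind)
```

Comparing `n_cc` with `n_ind` shows the practical difference: the complete case fit discards the rows with missing `x`, while the missing indicator fit keeps all of them. Multiple imputation, the third approach mentioned in the abstract, generally relies on a dedicated package and is not sketched here.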

Usage

The data-raw/2022-09-06_ris.ris file is the raw corpus export from the OVID platform. This .ris file was then imported into Zotero and re-exported in a cleaner format, yielding data-raw/literature-database-raw.csv. That CSV file formed the basis for the extraction sheet provided with the manuscript.
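For readers unfamiliar with the .ris format, the following base R sketch shows roughly what such an export contains and how it could be flattened into a table. It is an illustration only: the authors used Zotero for this step, the function `ris_to_table()` is hypothetical, the sample records are invented, and the tags present in an actual OVID export may differ from the ones assumed here.

```r
# Hypothetical helper: turn RIS lines into one row per reference.
# Assumes the usual "XX  - value" layout, with TY opening each record.
ris_to_table <- function(lines) {
  tagged <- grepl("^[A-Z][A-Z0-9]  - ", lines) | grepl("^ER  -", lines)
  lines  <- lines[tagged]
  tag    <- substr(lines, 1, 2)
  value  <- trimws(substring(lines, 7))

  # Ignore anything that appears before the first TY tag
  in_record <- cumsum(tag == "TY") > 0
  tag    <- tag[in_record]
  value  <- value[in_record]
  record <- cumsum(tag == "TY")

  # First value per record among a set of alternative tags, NA if absent
  first_value <- function(wanted) {
    unname(vapply(split(seq_along(tag), record), function(idx) {
      hit <- idx[tag[idx] %in% wanted]
      if (length(hit) == 0) NA_character_ else value[hit[1]]
    }, character(1)))
  }

  data.frame(
    title   = first_value(c("TI", "T1")),
    journal = first_value(c("JO", "JF", "T2")),
    year    = first_value(c("PY", "Y1")),
    stringsAsFactors = FALSE
  )
}

# Invented sample records, for illustration only
sample_ris <- c(
  "TY  - JOUR",
  "TI  - Hypothetical article one",
  "JO  - Hypothetical Journal A",
  "PY  - 2021",
  "ER  - ",
  "",
  "TY  - JOUR",
  "T1  - Hypothetical article two",
  "JF  - Hypothetical Journal B",
  "Y1  - 2022",
  "ER  - "
)
refs <- ris_to_table(sample_ris)
```

On the actual export, the same helper could be tried with `ris_to_table(readLines("data-raw/2022-09-06_ris.ris"))`, although the result is not guaranteed to match the Zotero-generated CSV.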

Two main files are of interest:

analysis/illustrative-example.R - the code for the illustrative example in the manuscript (comparison of imputation methods for an event-free survival outcome).
analysis/review-analysis.R - produces the numbers reported in the review (requires the extraction sheet uploaded with the manuscript).
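One possible way to run the two scripts from an R session is sketched below. The helper `run_if_present()` is hypothetical, and two assumptions are made that the README does not confirm: that the working directory is the project root (for example after opening hema-missing-review.Rproj), and that the review script expects the extraction sheet at data-raw/extraction-sheet.xlsx.

```r
# Hypothetical helper: source a script only if it and its inputs exist.
run_if_present <- function(script, required = character(0)) {
  needed <- c(script, required)
  absent <- needed[!file.exists(needed)]
  if (length(absent) > 0) {
    message("Skipping ", script, "; not found: ",
            paste(absent, collapse = ", "))
    return(invisible(FALSE))
  }
  # Run the script in its own environment to keep the workspace clean
  source(script, local = new.env())
  invisible(TRUE)
}

# Illustrative example from the manuscript
ran_example <- run_if_present("analysis/illustrative-example.R")

# Review numbers; the location of the extraction sheet is an assumption
ran_review <- run_if_present(
  "analysis/review-analysis.R",
  required = "data-raw/extraction-sheet.xlsx"
)
```

Each call returns `TRUE` if the script was sourced and `FALSE` if a file was missing, so a failed prerequisite shows up as a message rather than an error partway through the analysis.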

.
├── analysis
│   ├── illustrative-example.R
│   ├── review-analysis.R
│   └── zotero-to-extaction-sheet.R
├── data
│   └── imps_all.rds
├── data-raw
│   ├── 2022-09-06_ris.ris
│   ├── dat-mds_admin-cens.fst
│   ├── data_dictionary.rda
│   ├── extraction-sheet.xlsx
│   └── literature-database-raw.csv
├── figures
│   └── journals-overview.svg
├── hema-missing-review.Rproj
├── R
│   └── forest-helper.R
├── README.md
└── README.Rmd

survival-lumc/ReviewHaemaMissing
