
Date: December 16, 2020

Length: 90 min

Instructor: Karl Broman, PhD

Learning Level: Fundamental

Primary Audience: Research Team

Skills Domain: Study and Site Management

Interested in learning how to make your data analysis and other scientific computations reproducible?

The December seminar in the Center for Quantitative Methods and Data Sciences (QM&DS), in partnership with the Biostatistics, Epidemiology and Research Design (BERD) Center at Tufts CTSI and the Data-Intensive Studies Center (DISC) at Tufts University, was held on Wednesday, December 16, 2020, via Zoom. Click Enroll to view the archived recording of this webinar.

A minimal standard for data analysis and other scientific computations is that they be reproducible: the code and data are assembled so that another group can re-create all of the results (e.g., the figures and tables in a paper). Adopting a workflow that makes your results reproducible will ultimately make your life easier; if a problem or question arises somewhere down the line, it will be much easier to correct or explain.

But organizing analyses so that they are reproducible is not easy. It requires diligence and a considerable investment of time: to learn new computational tools, and to organize and document analyses as you go. Nevertheless, partially reproducible is better than not at all reproducible. Just try to make your next paper or project better organized than the last. There are many paths toward reproducible research, and you shouldn't try to change every aspect of your current practice at once. Identify one weakness, adopt an improved approach, refine it a bit, and then move on to the next thing. Dr. Karl Broman will offer some suggestions for the initial steps to take toward making your work reproducible.

Faculty: Dr. Karl Broman is a Professor in the Department of Biostatistics & Medical Informatics at the University of Wisconsin-Madison. Dr. Broman is an applied statistician working on the genetics of complex diseases in experimental organisms. He develops the R package R/qtl, has written a number of short tutorials useful for data scientists, and is very keen to develop tools for interactive data visualization (to view an example, click here).

This course is free.