
Date: December 16, 2020

Length: 90 min

Course Type: Archived Event

Instructor: Karl Broman, PhD

Learning Level: Fundamental

Primary Audience: All Research Team Members

Prerequisite: None

Course Collection(s): Data Science and Informatics

Interested in learning how to make your data analysis and other scientific computations reproducible?

This seminar from the Center for Quantitative Methods and Data Sciences (QM&DS), in partnership with the Biostatistics, Epidemiology and Research Design (BERD) Center at Tufts CTSI and the Data-Intensive Studies Center (DISC) at Tufts University, was held via Zoom on Wednesday, December 16, 2020.

A minimal standard for data analysis and other scientific computations is that they be reproducible: the code and data are assembled so that another group can re-create all of the results (e.g., the figures and tables in a paper). Adopting a workflow that makes your results reproducible will ultimately make your life easier; if a problem or question arises somewhere down the line, it will be much easier to correct or explain.

But organizing analyses so that they are reproducible is not easy. It requires diligence and a considerable investment of time to learn new computational tools and to organize and document analyses as you go. Nevertheless, partially reproducible is better than not at all reproducible. Just try to make your next paper or project better organized than the last. There are many paths toward reproducible research, and you shouldn't try to change all aspects of your current practices all at once. Identify one weakness, adopt an improved approach, refine that a bit, and then move on to the next thing. Dr. Karl Broman offers some suggestions for the initial steps to take towards making your work reproducible.

Featured Speaker

Dr. Karl Broman is a Professor in the Department of Biostatistics & Medical Informatics at the University of Wisconsin-Madison. Dr. Broman is an applied statistician working on the genetics of complex diseases in experimental organisms. He develops the R package R/qtl, has written a number of short tutorials useful for data scientists, and is very keen to develop tools for interactive data visualization (to view an example, click here).

Available courses

Date: 2024 Course: Open January 1 through December 31

Location: Online

Type: Archived Event

Price: This Course is Free