The Researcher Toolkit – Bioinformatics Edition

Research Data Services at UW-Madison has created a Researcher Toolkit that details the main steps, considerations, and options when planning a research project. This page supplements that toolkit with some bioinformatics-related food for thought.

Researcher Toolkit

Plan:

  • What is your biological research question?
  • Do you work with sensitive data?
  • What is your budget to generate new omics data? Or will you use existing publicly available data?
  • What are the data policies for the data you wish to use?
  • What are your current computational skills, and where would you like them to be?

Gather Data:

  • Has your project started? Did you “inherit” data from your lab? Or will you be generating new data?
  • What will your experimental design look like?
  • What type of sequencing will you be performing? How many samples? Where will you sequence the data?

Analyze and Visualize Data:

  • What data size are you working with (GBs, TBs, etc.)?
  • What is the optimal way to run your analyses (locally, web interface, command-line-only server, distributed system)?
  • Do you anticipate having to redo this analysis in the future? Over hundreds of samples?
  • How will you keep track of your bioinformatics code?
  • Will you be using version control (git)?
  • How will you organize your data?
  • How will you identify the most appropriate bioinformatic tools or pipelines to answer your question?
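As one way to make the data-size and data-organization questions above concrete, here is a minimal Python sketch that creates a project skeleton and reports how much disk space a folder occupies. The folder names in `SUBDIRS` and the function names are hypothetical choices for illustration, not a layout prescribed by the Researcher Toolkit; adapt them to your own lab's conventions.

```python
from pathlib import Path

# Hypothetical layout (an assumption); rename the folders to suit your project.
SUBDIRS = ["data/raw", "data/processed", "scripts", "results", "docs"]


def create_project(root):
    """Create the project skeleton under `root` and return the created paths."""
    root = Path(root)
    created = []
    for sub in SUBDIRS:
        path = root / sub
        # exist_ok=True makes the function safe to re-run on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def total_size_bytes(folder):
    """Sum the sizes of all regular files below `folder`."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def human_readable(n_bytes):
    """Format a byte count using binary units (1 KB = 1024 bytes here)."""
    size = float(n_bytes)
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        # Stop at the first unit that fits, or at TB as the largest unit.
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

For example, calling `create_project("my_project")` and later `human_readable(total_size_bytes("my_project/data/raw"))` gives a quick estimate of whether the raw data fits on a laptop or calls for a server. Keeping raw data in its own folder, separate from processed files and scripts, also makes it easier to put only the code under version control.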

Publish Research Artifacts:

  • Most sequencing data must be made publicly available alongside the associated publication.
  • Will you be uploading your data to NCBI (GenBank) or another biological data repository?
  • Do you have metadata related to your experiment that you need to share?
  • It is increasingly common to publish the code used to analyze the data along with the results.
  • Do you have bioinformatics software to publish? Do you know which license to use?

Closeout