How to Create a Data Analysis Plan: A Detailed Guide

by | Aug 12, 2020 | Writing


If a good research question is the story, then a roadmap is vital for telling it well. We advise every student and researcher to write their own data analysis plan before seeking any advice. In this blog article, we explore how to create a data analysis plan: its content and structure.

A data analysis plan serves as a roadmap for how the collected data will be organised and analysed. A good plan:

  • Clearly states the research objectives and hypotheses
  • Identifies the dataset to be used
  • Specifies the inclusion and exclusion criteria
  • Clearly states the research variables
  • States the statistical tests for the hypotheses and the software for statistical analysis
  • Includes shell tables

1. Stating research question(s), objectives and hypotheses:

All research objectives or goals must be clearly stated. They must be Specific, Measurable, Attainable, Realistic and Time-bound (SMART). Hypotheses are tentative statements derived from personal experience or previous literature; they lay the foundation for the statistical methods that will be applied to extrapolate results to the entire population.

2. The dataset:

The dataset that will be used for statistical analysis must be described and its important aspects outlined. These include the owner of the dataset, how to get access to it, how it was checked for quality control and the program in which it is stored (Excel, Epi Info, SQL, Microsoft Access etc.).

3. The inclusion and exclusion criteria:

The inclusion and exclusion criteria determine which records in the dataset will be used for data analysis. These criteria also guide the choice of variables included in the main analysis.
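One way to make the criteria unambiguous is to write them as explicit rules that can be applied to each record. The sketch below is only an illustration: the field names (`age`, `consented`, `pregnant`) and the thresholds are hypothetical and are not taken from any particular study.

```python
# Hypothetical records; in practice these would be read from the study dataset.
records = [
    {"id": 1, "age": 34, "consented": True, "pregnant": False},
    {"id": 2, "age": 16, "consented": True, "pregnant": False},
    {"id": 3, "age": 52, "consented": False, "pregnant": False},
    {"id": 4, "age": 29, "consented": True, "pregnant": True},
]


def is_included(record):
    """Inclusion criteria (assumed): adults aged 18 or older who gave consent."""
    return record["age"] >= 18 and record["consented"]


def is_excluded(record):
    """Exclusion criterion (assumed): pregnancy at enrolment."""
    return record["pregnant"]


def apply_criteria(rows):
    """Keep rows that meet every inclusion criterion and no exclusion criterion."""
    return [r for r in rows if is_included(r) and not is_excluded(r)]


analysis_set = apply_criteria(records)
print([r["id"] for r in analysis_set])
```

Writing the criteria as two separate functions keeps the plan readable: each rule in the written plan maps to one line of code, and the number of records dropped by each rule can be reported.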

4. Variables:

Every variable collected in the study should be clearly stated. Variables should be presented according to their level of measurement (nominal, ordinal, interval or ratio) or the role they play in the study (independent/predictor or dependent/outcome variables). The variable types should also be outlined, because the variable type, together with the research hypothesis, forms the basis for selecting the appropriate statistical tests for inferential statistics. A good data analysis plan should summarise the variables as demonstrated in Figure 1 below.

Figure 1. Presentation of variables in a data analysis plan
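A variable summary of this kind can also be kept as a small data structure alongside the plan. The sketch below is a minimal illustration, not a reproduction of Figure 1; the variable names and the `Variable` class are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str   # variable name as it appears in the dataset
    level: str  # level of measurement: nominal, ordinal, interval or ratio
    role: str   # "independent" (predictor) or "dependent" (outcome)


# Hypothetical variables for a study of a disease "X".
variables = [
    Variable("age_years", "ratio", "independent"),
    Variable("sex", "nominal", "independent"),
    Variable("education_level", "ordinal", "independent"),
    Variable("disease_x", "nominal", "dependent"),
]


def summarise(variable_list):
    """Return the variable list as aligned text rows, header first."""
    rows = [("Variable", "Level of measurement", "Role")]
    rows += [(v.name, v.level, v.role) for v in variable_list]
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    return [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]


for line in summarise(variables):
    print(line)
```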

5. Statistical software

Many software packages are available for data analysis; common examples are SPSS, Epi Info, SAS, Stata and Microsoft Excel. State the version number, year of release and author or manufacturer of the software you use. Beginners tend to try several packages and end up mastering none. It is better to select one and master it, because almost all statistical packages perform equally well for basic analyses and for most of the advanced analyses needed for a student thesis. This is what we recommend to all our students at CRENC before they begin writing their results section.

6. Selecting the appropriate statistical method to test hypotheses

Depending on the research question, hypothesis and type of variable, several statistical methods can be used to answer the research question appropriately. This aspect of the data analysis plan states clearly why each statistical method will be used to test the hypotheses. The level of statistical significance (the threshold against which p-values are compared), which is often but not always 0.05, should also be stated. Figures 2a and 2b present decision trees for some common statistical tests based on the variable type and research question.

A good analysis plan should also clearly describe how missing data will be handled.
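As a minimal sketch of what this can look like in practice, the code below counts missing values per variable and then applies complete-case analysis. The records are hypothetical, and complete-case analysis is only one possible approach (imputation is another); the plan should state which approach is used and why.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "sex": "F", "bmi": 22.5},
    {"age": None, "sex": "M", "bmi": 27.1},
    {"age": 41, "sex": "F", "bmi": None},
    {"age": 29, "sex": None, "bmi": None},
]


def missing_summary(rows):
    """Return {variable: (number missing, percentage missing)}."""
    summary = {}
    for name in rows[0]:
        n_missing = sum(1 for r in rows if r[name] is None)
        summary[name] = (n_missing, 100 * n_missing / len(rows))
    return summary


def complete_cases(rows):
    """Complete-case analysis: keep only rows with no missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


summary = missing_summary(records)
for name, (count, percent) in summary.items():
    print(f"{name}: {count} missing ({percent:.0f}%)")
print("Complete cases:", len(complete_cases(records)))
```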

Figure 2a.  How to choose a statistical method to determine association between variables
Figure 2b. How to choose a statistical method to compare differences between variables
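A decision tree of this kind can be written down as a small function. The sketch below encodes a few widely taught textbook rules for comparing groups; it is not a transcription of Figures 2a or 2b, and the function name `suggest_test` and its parameters are hypothetical. It is a starting point for the plan, not a substitute for checking each test's assumptions.

```python
def suggest_test(outcome, n_groups, paired=False, normal=True):
    """Suggest a conventional test for comparing groups.

    outcome  -- "categorical" or "continuous" (type of the outcome variable)
    n_groups -- number of groups being compared (2 or more)
    paired   -- True if the groups are the same subjects measured repeatedly
    normal   -- True if a continuous outcome is approximately normally distributed
    """
    if n_groups < 2:
        raise ValueError("at least two groups are needed for a comparison")
    if outcome == "categorical":
        if paired:
            # Both paired options below apply to binary outcomes.
            if n_groups == 2:
                return "McNemar test (binary outcome)"
            return "Cochran's Q test (binary outcome)"
        return "Chi-square test (Fisher's exact test if expected counts are small)"
    if outcome == "continuous":
        if n_groups == 2:
            if paired:
                return "Paired t-test" if normal else "Wilcoxon signed-rank test"
            return "Independent-samples t-test" if normal else "Mann-Whitney U test"
        if paired:
            return "Repeated-measures ANOVA" if normal else "Friedman test"
        return "One-way ANOVA" if normal else "Kruskal-Wallis test"
    raise ValueError("outcome must be 'categorical' or 'continuous'")


# Example: a continuous outcome compared across three independent groups,
# where the outcome is not normally distributed.
print(suggest_test("continuous", 3, normal=False))
```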

7. Creating shell tables

Data analysis involves three levels of analysis, in increasing order of complexity: univariable, bivariable and multivariable analysis. Shell tables should be created in anticipation of the results that will be obtained from these levels of analysis. Read our blog article on how to present tables and figures for more details. Suppose you carry out a study to investigate the prevalence of, and factors associated with, a certain disease “X” in a population; the shell tables can then be presented as in Tables 1, 2 and 3 below.

Table 1: Example of a shell table from univariate analysis


Table 2: Example of a shell table from bivariate analysis


Table 3: Example of a shell table from multivariate analysis


aOR = adjusted odds ratio
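Because a shell table is just a table with its labels filled in and its cells left empty, it can be drafted programmatically before any data are analysed. The sketch below is one possible way to do this; the `shell_table` function and the row and column labels are hypothetical and do not reproduce the tables above.

```python
def shell_table(title, columns, rows, placeholder="--"):
    """Build a shell table: labelled rows and columns with empty cells."""
    header = ["Variable"] + columns
    body = [[label] + [placeholder] * len(columns) for label in rows]
    table = [header] + body
    widths = [max(len(r[i]) for r in table) for i in range(len(header))]
    lines = [title]
    for r in table:
        lines.append(" | ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip())
    return "\n".join(lines)


# Hypothetical layout for a multivariable table; the labels are illustrative.
print(shell_table(
    "Table 3 (shell): Factors associated with disease X",
    ["aOR", "95% CI", "p-value"],
    ["Age (years)", "Sex: female", "Sex: male"],
))
```

Once the analysis is run, each placeholder is replaced by the corresponding estimate, which makes it easy to see whether the planned analyses have all been carried out.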

Summary

Now that you have learned how to create a data analysis plan, here are the takeaway points. A data analysis plan should clearly state the:

  • Research question, objectives, and hypotheses
  • Dataset to be used
  • Inclusion and exclusion criteria
  • Variable types and their role
  • Statistical software and statistical methods
  • Shell tables for univariate, bivariate and multivariate analysis

Further readings

Creating a Data Analysis Plan: What to Consider When Choosing Statistics for a Study: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4552232/pdf/cjhp-68-311.pdf

Creating an Analysis Plan: https://www.cdc.gov/globalhealth/healthprotection/fetp/training_modules/9/creating-analysis-plan_pw_final_09242013.pdf

Data Analysis Plan: https://www.statisticssolutions.com/dissertation-consulting-services/data-analysis-plan-2/


Author

  • Dr Barche is a physician and holds a Masters in Public Health. He is a senior fellow at CRENC with interests in Data Science and Data Analysis.


