In this notebook we will look for relationships between disease stage, mouse sex, and group. We begin by loading the necessary libraries.
# required libraries
library(psych)
library(dplyr)
library(ggplot2)
library(knitr)
library(pander)
panderOptions('knitr.auto.asis', FALSE)

After importing libraries we will import the data.
# read the csv data and split into test and control groups
data <- read.csv("FinalAnalysisClientDataset.csv", header=TRUE, stringsAsFactors=TRUE)

# BMP_2 is read in as a factor, so convert it to numeric via character
# (calling as.numeric directly on a factor would return the level codes)
data$BMP_2 <- as.numeric(as.character(data$BMP_2))

test <- subset(data, Category == "Test")
control <- subset(data, Category == "Control")

Let's compute some descriptive statistics on BMP_1 and BMP_2 levels so we understand what our data looks like.
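As a minimal sketch of such a summary, the block below computes per-group counts, means, standard deviations, and medians using base R only. The data frame `toy` and the helper `describe_by_group` are hypothetical names introduced for illustration, and the values are synthetic stand-ins, not the client dataset. Only the column names `Category`, `BMP_1`, and `BMP_2` are taken from the code above.

```r
# Synthetic stand-in for the client dataset (made-up values, illustration only)
set.seed(1)
toy <- data.frame(
  Category = factor(rep(c("Test", "Control"), each = 8)),
  BMP_1 = rnorm(16, mean = 10, sd = 2),
  BMP_2 = rnorm(16, mean = 5, sd = 1)
)

# Hypothetical helper: summary statistics of one column, split by Category.
# Missing values are ignored, since the BMP_2 conversion may produce NAs.
describe_by_group <- function(df, column) {
  groups <- split(df[[column]], df$Category)
  t(sapply(groups, function(x) {
    c(n = sum(!is.na(x)),
      mean = mean(x, na.rm = TRUE),
      sd = sd(x, na.rm = TRUE),
      median = median(x, na.rm = TRUE),
      min = min(x, na.rm = TRUE),
      max = max(x, na.rm = TRUE))
  }))
}

bmp1_summary <- describe_by_group(toy, "BMP_1")
bmp2_summary <- describe_by_group(toy, "BMP_2")

# One row per group, one column per statistic
print(round(bmp1_summary, 2))
print(round(bmp2_summary, 2))
```

With the real data, the same helper could be called as `describe_by_group(data, "BMP_1")`. The `describe` function from the `psych` package loaded above is an alternative that reports a similar set of statistics.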
This project focuses on determining whether there is a statistically significant difference in biomarker protein (BMP) levels in patients with a genetic disease. There are too few participants to rely on the t-test's robustness to non-normality, so we will first assess normality using Q-Q plots and the Shapiro-Wilk test. We will then compare the sample means to determine whether the difference in the population means is statistically different from 0.
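The workflow described above can be sketched roughly as follows, using base R functions (`qqnorm`, `qqline`, `shapiro.test`, `t.test`). The vectors `test_bmp` and `control_bmp` are hypothetical names holding synthetic values, not the client data. The use of Welch's unequal-variance t-test (the default of `t.test`) is an assumption, since the text does not say which variant is intended.

```r
# Synthetic values standing in for one biomarker in each group (illustration only)
set.seed(42)
test_bmp <- rnorm(10, mean = 12, sd = 2)
control_bmp <- rnorm(10, mean = 10, sd = 2)

# Step 1: Q-Q plots. Points lying close to the reference line suggest normality.
op <- par(mfrow = c(1, 2))
qqnorm(test_bmp, main = "Test group")
qqline(test_bmp)
qqnorm(control_bmp, main = "Control group")
qqline(control_bmp)
par(op)

# Step 2: Shapiro-Wilk test. The null hypothesis is that the sample is normal,
# so a small p-value is evidence against normality.
sw_test <- shapiro.test(test_bmp)
sw_control <- shapiro.test(control_bmp)
print(sw_test$p.value)
print(sw_control$p.value)

# Step 3: two-sample t-test of H0: difference in population means equals 0.
# t.test() defaults to var.equal = FALSE, i.e. Welch's t-test (an assumption here).
tt <- t.test(test_bmp, control_bmp, mu = 0)
print(tt)
```

With the notebook's own objects, the same calls would take columns such as `test$BMP_1` and `control$BMP_1` in place of the synthetic vectors.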