################
### EXERCISE 7.1
################
###For the production of a new bread product, you need to select one type of wheat
###with appropriate protein content. Based on n = 10 sample measurements (per type),
###you want to compare d = 3 different types of wheat by their protein content.
###You learned that the so-called one-way ANOVA is a suitable way to do so.
###Formulate the null and alternative hypotheses for the given case.
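###A possible formulation (a sketch; mu_A, mu_B, mu_C are notation introduced here
###for the mean protein content of wheat types A, B and C):
###H0: mu_A = mu_B = mu_C
###H1: at least one of the means differs from the others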
###DATA IMPORT
###Read the xlsx file 'WheatProteinContent_V2.xlsx' into your R workspace.
###These data contain the protein content (%) for three types of wheat (A, B, C).
###Familiarize yourself with the dataset (e.g. n, max, min, ....)
library(xlsx)
setwd("G:\\tierzucht\\AG_bioinf\\teaching\\Master FPPE\\DataExamples")
X = read.xlsx("WheatProteinContent_V2.xlsx", 1)
head(X)
dim(X)
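###A minimal sketch for a first look at the data. It assumes the columns are named
###'protein_content' and 'wheat_type', as in the model calls further down.
str(X)                                            #column names and types
table(X$wheat_type)                               #number of measurements per wheat type
tapply(X$protein_content, X$wheat_type, summary)  #min, quartiles, mean, max per wheat type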
###FIRST ASSESSMENTS
###Check whether the protein content of each wheat type is normally distributed,
###e.g. using quantile-quantile plots.
###What are other ways to check the normality assumption?
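###One possible way to do this (a sketch, assuming the columns 'protein_content'
###and 'wheat_type'): one Q-Q plot per wheat type, followed by a Shapiro-Wilk test
###as an example of a formal alternative.
par(mfrow = c(1, 3))                              #three plots side by side
for (type in as.character(unique(X$wheat_type))) {
  y = X$protein_content[X$wheat_type == type]     #measurements of this wheat type
  qqnorm(y, main = paste("Wheat type", type))
  qqline(y)                                       #reference line for a normal distribution
}
par(mfrow = c(1, 1))                              #reset the plot layout
tapply(X$protein_content, X$wheat_type, shapiro.test)  #Shapiro-Wilk test per wheat type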
###VISUALIZATION
###Visualize the wheat data in a boxplot.
###Add title and labels.
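###A minimal sketch with base R graphics, assuming the columns 'protein_content'
###and 'wheat_type'. Title and axis labels are only suggestions.
boxplot(protein_content ~ wheat_type, data = X,
        main = "Protein content by wheat type",
        xlab = "Wheat type",
        ylab = "Protein content (%)")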
###ANOVA
###One-Way ANOVA
###Comparison of variance within the groups with variance between the groups.
###Generalization of the t-test with more than two groups.
#Fit the one-way ANOVA model
#The general syntax is aov(response_variable ~ predictor_variable, data = dataset)
#With data = X the columns are referenced by name; no X$ prefix is needed
model1 = aov(protein_content ~ wheat_type, data = X)
#view the model output
summary(model1)
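#summary() returns one row for the wheat type and one for the residuals, with:
#Df: degrees of freedom
#Sum Sq: sum of squares (variation between the group means and the overall mean for the wheat type row, variation within the groups for the residuals row)
#Mean Sq: mean square, i.e. the sum of squares divided by its degrees of freedom
#F value: ratio of the between-group mean square to the residual mean square
#Pr(>F): p-value of the F statistic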
#What is your interpretation of the results?
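###To support the interpretation, the p-value can be read from the ANOVA table.
###This is a sketch; the significance level alpha = 0.05 is an assumption, not part
###of the exercise text.
anova_table = summary(model1)[[1]]     #ANOVA table as a data frame
p_value = anova_table[["Pr(>F)"]][1]   #first row belongs to the wheat type
alpha = 0.05                           #assumed significance level
p_value < alpha                        #TRUE means H0 (equal means) is rejected at level alpha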
###Kruskal-Wallis test
###Non-parametric alternative to the one-way ANOVA
kruskal.test(protein_content ~ wheat_type, data = X)
################
### EXERCISE 7.2
################
###You read in the newspaper that freezing duration and temperature
###have an effect on the vitamin C content.
###You are interested if this applies to beans as well.
###Online you find the data set 'FrozenBeans.xlsx'.
###These data come from an experiment in which beans were frozen and
###the vitamin C concentration was studied at different temperatures and for different durations.
###DATA IMPORT
###Read the xlsx file 'FrozenBeans.xlsx' into your R workspace.
###Familiarize yourself with the dataset.
beans = read.xlsx(file = "FrozenBeans.xlsx", 1)
head(beans)
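###A short sketch for getting to know the data. It assumes the column names used in
###the models further down ('vitaminC', 'freezing_duration', 'temperature').
str(beans)                                          #column names and types
summary(beans)                                      #range and quartiles of each column
table(beans$freezing_duration, beans$temperature)   #observations per factor combination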
###VISUALIZATION
###Visualize the bean data set.
###This time with a different package: ggplot2
library(ggplot2)
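#The data frame is called 'beans' (see read.xlsx above). The column names are assumed
#to match the models below; factor() makes ggplot2 draw one box per group.
ggplot(beans, aes(x = factor(freezing_duration), y = vitaminC, color = factor(temperature))) +
  geom_boxplot()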
###ANOVA
###Two-Way ANOVA
###used to evaluate simultaneously the effect of two grouping variables (A and B)
###on a response variable.
###What are the two grouping factors in this case?
#Fit the two-way anova model
model2 = aov(vitaminC ~ freezing_duration + temperature, data = beans)
summary(model2)
#Two-way ANOVA with interaction effect
model3 = aov(vitaminC ~ freezing_duration * temperature, data = beans)
summary(model3)
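###An interaction plot can help to read the interaction term (a sketch, assuming the
###column names used in the models above): roughly parallel lines suggest that there is
###little interaction between freezing duration and temperature.
with(beans, interaction.plot(x.factor = freezing_duration,
                             trace.factor = temperature,
                             response = vitaminC,
                             xlab = "Freezing duration",
                             ylab = "Mean vitamin C content",
                             trace.label = "Temperature"))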
###Which step was missed here?
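###One possible answer (a sketch, not necessarily the intended solution): the model
###assumptions were not checked before the ANOVA tables were interpreted.
###The residuals of the fitted model can be inspected as follows.
res = residuals(model3)
qqnorm(res, main = "Residuals of model3")   #normality of the residuals
qqline(res)
shapiro.test(res)                           #formal test of normality
plot(fitted(model3), res,                   #spread should be similar across fitted values
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
#Also worth checking: aov() treats numeric columns as covariates, not as groups
sapply(beans[, c("freezing_duration", "temperature")], class)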