An R script is simply a text file containing (almost) the same commands that you would enter on the R command line. "Almost" refers to the fact that if you use sink() to send the output to a file, you will have to wrap some commands in print() to get the same output as on the command line.
An R script therefore lets you reproduce the key commands you entered to manipulate or analyze the data. It is also crucial because it lets you check for possible errors.
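As a minimal sketch of the sink() point: inside a loop, a bare expression is not printed automatically, so only values wrapped in print() reach the file. The name out_file is a temporary file chosen for illustration.

```r
# Send console output to a temporary file (out_file is an illustrative name)
out_file <- tempfile(fileext = ".txt")
sink(out_file)
for (i in 1:3) {
  i * 2         # evaluated, but not printed inside a loop
  print(i * 2)  # explicitly printed, so it is written to the file
}
sink()          # stop redirecting output

# Read back what was written to the file
captured <- readLines(out_file)
captured
```

Only the three printed values end up in the file; the bare `i * 2` lines leave no trace.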
The command to create a new numerical vector is the following:
Name of the variable <- c(val1, val2, valx)
#TUT1
#Exercise 1: Create a numerical vector
#Identification = c(item1, item2, ...)
#Identification <- c(item1, item2, ...)
ID <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Attention: the name cannot include a space!
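To illustrate the naming rule, here is a small sketch. The names patient_id and "patient id" are made up for the example; make.names() is a base R function that turns an arbitrary string into a syntactically valid name.

```r
# A valid name: letters, digits, "." and "_" are allowed, spaces are not
patient_id <- c(1, 2, 3)

# make.names() shows what R would accept in place of a name containing a space
valid_name <- make.names("patient id")
valid_name
```

Here the space is replaced by a dot, giving patient.id. An underscore, as in patient_id, is an equally valid choice.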
For a character vector (categorical data), the command is slightly different:
Name of the variable <- c("Name1", "Name2", "Name3")
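A short sketch of the difference between the two kinds of vector. The names scores, colours and mixed, and their values, are invented for illustration; class() reports the type R has assigned.

```r
# Numerical vector: values are written without quotes
scores <- c(12, 15, 9)

# Character vector: every value is wrapped in straight double quotes
colours <- c("red", "green", "blue")

scores_class <- class(scores)    # "numeric"
colours_class <- class(colours)  # "character"

# Mixing numbers and text in one vector turns everything into text
mixed <- c(1, "two", 3)
mixed_class <- class(mixed)      # "character"
```

Note that R only accepts straight quotes ("). Curly quotes pasted from a word processor cause a syntax error.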
What should I do if a vector is not written properly? Remove it using the command:
rm(object)
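As a sketch of how rm() behaves, the example below creates a vector with a deliberate typo, removes it, and creates it again. The name age_typo and its values are hypothetical.

```r
# A vector entered with a mistake (330 instead of 30)
age_typo <- c(25, 330, 35)
before_rm <- exists("age_typo")  # TRUE: the object is in the workspace

# Remove the faulty object
rm(age_typo)
after_rm <- exists("age_typo")   # FALSE: the object is gone

# Recreate it with the corrected values
age_typo <- c(25, 30, 35)
```

Assigning to the same name again would also overwrite the old values. rm() is mainly useful for objects you no longer want in the workspace at all.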
# Step 1: Create a vector for ID ranging from 1 to 10
ID <- 1:10
# Step 2: Create vectors for age, height, weight, diabetes type, sex, and social class
age <- c(25, 30, 35, 55, 67, 79, 17, 19, 26, 45)
height <- c(1.67, 1.72, 1.73, 1.77, 1.85, 1.54, 1.56, 1.68, 1.59, 1.82)
weight <- c(68, 76, 77, 81, 50, 48, 80, 64, 56, 94)
diabetes <- c("Type1", "Type2", "Type1", "Type2", "Type2", "Type2", "Type1", "Type1", "Type2", "Type1")
sex <- c("Male", "Female", "Male", "Male", "Male", "Female", "Female", "Male", "Male", "Male")
# Unquoted NA marks a missing value; the quoted string "NA" would count as a category
social_class <- c("Routine or semi-routine", "Lower service", "Routine or semi-routine",
                  "Lower service", "Lower service", NA, NA,
                  "Routine or semi-routine", "Higher service", "Higher service")
# Step 3: Calculate BMI (Body Mass Index) and create a vector for it
BMI <- weight / (height * height)
# Step 4: Install (only if it is missing) and load the tidyverse package for data manipulation
if (!requireNamespace("tidyverse", quietly = TRUE)) install.packages("tidyverse")
library(tidyverse)
# Extra Exercises:
# Step 5: Create a dataframe named BMI_data with ID, age, BMI, diabetes, sex, social_class, and weight
BMI_data <- data.frame(ID, age, BMI, diabetes, sex, social_class, weight)
# Step 6: Subset data for individuals with Type1 and Type2 diabetes using the subset function
BMI_data_diabetes1 <- subset(BMI_data, diabetes == "Type1")
BMI_data_diabetes2 <- subset(BMI_data, diabetes == "Type2")
# Step 7: Basic analysis of BMI distribution using the psych package
# Note: install.packages("psych") if you haven't installed it yet
library(psych)
# Describe BMI distribution for Type1 diabetes individuals
describe(BMI_data_diabetes1$BMI)
# Describe BMI distribution for Type2 diabetes individuals
describe(BMI_data_diabetes2$BMI)
Explanation:
Create a vector ID with values ranging from 1 to 10.
Calculate BMI as weight / (height * height).
Install and load the tidyverse package for data manipulation.
Create a data frame BMI_data combining all vectors.
Use the psych package to describe the BMI distribution for Type1 and Type2 diabetes individuals.
> describe(BMI_data_diabetes1$BMI)
   vars n  mean   sd median trimmed  mad   min   max range skew kurtosis   se
X1    1 5 26.81 3.98  25.73   26.81 3.93 22.68 32.87  10.2 0.45    -1.66 1.78
> describe(BMI_data_diabetes2$BMI)
   vars n  mean   sd median trimmed  mad   min   max range  skew kurtosis   se
X1    1 5 21.71 4.63  22.15   21.71 5.25 14.61 25.85 11.25 -0.46    -1.66 2.07
Make sure to install the psych package using install.packages("psych") before running the analysis of the BMI distribution.
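If psych is not available, the key figures can be roughly reproduced with base R alone. This is a sketch, not a replacement for describe(): it reports only the mean and median per diabetes type. It repeats the height, weight and diabetes vectors from the script so that it runs on its own; the names bmi_mean and bmi_median are chosen for illustration.

```r
# Vectors repeated from the script above
height <- c(1.67, 1.72, 1.73, 1.77, 1.85, 1.54, 1.56, 1.68, 1.59, 1.82)
weight <- c(68, 76, 77, 81, 50, 48, 80, 64, 56, 94)
diabetes <- c("Type1", "Type2", "Type1", "Type2", "Type2",
              "Type2", "Type1", "Type1", "Type2", "Type1")

# BMI = weight (kg) divided by height (m) squared, computed element by element
BMI <- weight / height^2

# tapply() applies a function to BMI separately for each diabetes type
bmi_mean <- tapply(BMI, diabetes, mean)
bmi_median <- tapply(BMI, diabetes, median)

round(bmi_mean, 2)
round(bmi_median, 2)
```

The rounded means and medians should agree with the mean and median columns that describe() reports for each group.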
Clear the Console by pressing Ctrl + L.
It is important to clear from the history every command that contains a typo.
On the right side, click on “History”. The list of the commands you wrote should appear.
Select a line you just wrote and click on “To Source”.
The line will then be copied to the source on the left.
Some packages are already installed; in RStudio you just need to select them in the Packages section to load them.
Otherwise, to install a package you need the following command:
install.packages("name of the package")
To load a package you have installed:
library("name of the package")
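A common pattern, sketched here, is to install a package only when it is missing and then load it. The helper name ensure_package is hypothetical and not part of R; requireNamespace(), install.packages() and library() are the standard functions. The example call uses stats, which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing, then load it
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # runs only when the package is not yet installed
  }
  # character.only = TRUE tells library() that pkg holds the name as a string
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

ensure_package("stats")
```

Installing is needed once per computer, whereas library() has to be called in every new R session.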
Subset the data for individuals with Type2 diabetes using the subset function:
# General form: new_data <- subset(data_name, conditions)
BMI_data_diabetes2 <- subset(BMI_data, diabetes == "Type2")
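Conditions can also be combined. The sketch below keeps only the male individuals with Type2 diabetes and shows the equivalent selection with square brackets. The data frame name patients is hypothetical; the ID, diabetes and sex values are repeated from the script so that the example runs on its own.

```r
# Vectors repeated from the script above
ID <- 1:10
diabetes <- c("Type1", "Type2", "Type1", "Type2", "Type2",
              "Type2", "Type1", "Type1", "Type2", "Type1")
sex <- c("Male", "Female", "Male", "Male", "Male",
         "Female", "Female", "Male", "Male", "Male")
patients <- data.frame(ID, diabetes, sex)

# & means "and": both conditions must hold for a row to be kept
type2_male <- subset(patients, diabetes == "Type2" & sex == "Male")

# The same selection written with square brackets: rows before the comma
same_rows <- patients[patients$diabetes == "Type2" & patients$sex == "Male", ]

type2_male$ID
```

With these data, the rows kept are those with ID 4, 5 and 9. Using | instead of & would keep rows where at least one of the conditions holds.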