Data should be tidy with mise en place
last update: 2021-08-04
The data should be in a standard, accessible, reproducible, non-haphazard format
One column per variable
One row per observational unit
Thought given to sympathetic data “codification”
explanation of variables
Variable naming conventions
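As a minimal sketch of the "one column per variable, one row per observational unit" idea, the example below rebuilds a small wide table into a tidy one. The data, the object names (`wide`, `tidy`) and the variable names are hypothetical and chosen only for illustration.

```r
# Hypothetical example data: plant heights measured in two different weeks.
# Untidy ("wide"): the variable "week" is hidden in the column names.
wide <- data.frame(
  plant_id   = c("p1", "p2", "p3"),
  height_wk1 = c(10.2, 9.8, 11.0),
  height_wk2 = c(12.5, 11.9, 13.1)
)

# Tidy ("long"): one column per variable, one row per observation.
# Names are lower case, use underscores, and carry the unit.
tidy <- data.frame(
  plant_id  = rep(wide$plant_id, times = 2),
  week      = rep(c(1, 2), each = nrow(wide)),
  height_cm = c(wide$height_wk1, wide$height_wk2)
)

tidy
```

Each row of `tidy` is now a single measurement of one plant in one week.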
Delimited text files
.csv, .txt, tab-delimited, others
first row with variable names
problem: no built-in place for an explanation of the variables
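One possible way to handle the explanation of variables is to keep a small data dictionary in its own delimited file next to the data. This is only a sketch: the file names, the dictionary layout and the use of `tempdir()` as the output folder are assumptions made for the example.

```r
# Hypothetical data and data dictionary
plants <- data.frame(
  plant_id  = c("p1", "p2"),
  height_cm = c(10.2, 9.8)
)
dictionary <- data.frame(
  variable    = c("plant_id", "height_cm"),
  description = c("Unique plant identifier", "Plant height in centimetres")
)

# tempdir() stands in for a real project folder
data_file <- file.path(tempdir(), "plants.csv")
dict_file <- file.path(tempdir(), "plants_dictionary.csv")

# row.names = FALSE keeps the first row as pure variable names
write.csv(plants, data_file, row.names = FALSE)
write.csv(dictionary, dict_file, row.names = FALSE)

# Inspect the header row of the data file
readLines(data_file, n = 1)
```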
Excel is okay, but the data must be tidy
Avoid proprietary formats (like SPSS, Genstat, etc.)
Step 1: Set your working directory with code or the RStudio menus
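Setting the working directory with code might look like the sketch below. Here `tempdir()` is only a stand-in for your own project folder, and `old_wd` is a hypothetical name used to restore the previous directory afterwards.

```r
# Where is R currently looking for files?
getwd()

# Set the working directory; setwd() invisibly returns the previous one.
# tempdir() is a placeholder: replace it with the path to your data folder.
old_wd <- setwd(tempdir())

# Confirm the change and see which files are available
getwd()
list.files()

# Restore the original working directory
setwd(old_wd)
```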
Different data file types require different functions
for .csv: read.csv()
for other delimited files: read.table()
for Excel: library(openxlsx); read.xlsx()
many others exist
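The functions above can be tried out with a self-contained sketch that first writes small example files and then reads them back. The data and file names are hypothetical, and the Excel step only runs if the openxlsx package happens to be installed.

```r
# Hypothetical example data, written to temporary files first
df <- data.frame(
  plant_id  = c("p1", "p2", "p3"),
  height_cm = c(10.2, 9.8, 11.0)
)
csv_file <- file.path(tempdir(), "plants.csv")
tab_file <- file.path(tempdir(), "plants.txt")
write.csv(df, csv_file, row.names = FALSE)
write.table(df, tab_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Comma-separated file: read.csv() assumes a header row by default
from_csv <- read.csv(csv_file)

# Tab-delimited file: read.table() needs the header and separator stated
from_tab <- read.table(tab_file, header = TRUE, sep = "\t")

# Excel file: requires the openxlsx package
if (requireNamespace("openxlsx", quietly = TRUE)) {
  xlsx_file <- file.path(tempdir(), "plants.xlsx")
  openxlsx::write.xlsx(df, xlsx_file)
  from_xlsx <- openxlsx::read.xlsx(xlsx_file)
}
```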
The use of the names() function
The use of the $ operator for data frames
The use of the str() function for data frames
The use of the index operator [ , ]
The use of the attach() function
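A short sketch of these tools on a small, hypothetical data frame (`plants` and its variables are invented for illustration):

```r
# Hypothetical tidy data frame
plants <- data.frame(
  plant_id  = c("p1", "p2", "p3"),
  week      = c(1, 1, 2),
  height_cm = c(10.2, 9.8, 12.5)
)

# names(): the variable names
names(plants)

# $: extract one variable as a vector
plants$height_cm

# str(): compact summary of the structure and the variable types
str(plants)

# [ , ]: index by [rows, columns]
plants[2, ]                                            # second row, all columns
plants[, "height_cm"]                                  # all rows, one column
plants[plants$week == 1, c("plant_id", "height_cm")]   # rows where week is 1

# attach(): makes the variables visible by name without the $
attach(plants)
mean_height <- mean(height_cm)
detach(plants)   # detach afterwards so the names do not mask other objects
```

Calling `detach()` as soon as you are done is a cautious habit, since attached variables can be confused with other objects of the same name.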