Chapter 3 RStudio Projects
What is this?
- Projects make it easy to keep your work organized.
- All files, scripts, and documentation related to a specific project are bound together with a .Rproj file.
- Projects encourage reproducibility and easy sharing.
3.1 Create a new project
Use the Create project command (available in the Projects menu and the global toolbar)
3.2 Keep your files organized
One project = one folder
Place similar files inside of their own folders
Keep track of versions
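As a rough sketch of the "one project = one folder" idea, the snippet below creates a project folder with a few sub-folders. The folder names (`Data`, `Scripts`, `Figures`) and the project name `myProject` are assumptions chosen for illustration, and a temporary directory is used so nothing is written to your real files.

```r
# Hypothetical layout: the folder names below are illustrative, not a required convention.
project_dir <- file.path(tempdir(), "myProject")
sub_folders <- c("Data", "Scripts", "Figures")

for (folder in sub_folders) {
  # recursive = TRUE also creates project_dir itself if it does not exist yet
  dir.create(file.path(project_dir, folder), recursive = TRUE, showWarnings = FALSE)
}

# List what was created inside the project folder
list.files(project_dir)
```

In a real project you would create these folders inside the folder that holds the .Rproj file.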
3.3 Preparing data for R
- Datasets should be stored as comma-separated files (.csv) in a Data folder.
- Comma-separated files (.csv) can be created from almost all applications (Excel, LibreOffice, Google Docs).
- In most applications: File -> Save As, then choose the .csv format.
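Once a dataset is saved as a .csv file, it can be read into R with `read.csv()`. The sketch below writes a small made-up data frame to a `Data` folder and reads it back. The column names and values are hypothetical, and the folder is created in a temporary directory for illustration only.

```r
# Hypothetical example data; the column names and values are made up.
my_data <- data.frame(
  Site = c("A", "B", "C"),
  Measurements = c(1.2, 3.4, NA),
  stringsAsFactors = FALSE
)

# Create a Data folder in a temporary location and write the file there.
data_dir <- file.path(tempdir(), "Data")
dir.create(data_dir, showWarnings = FALSE)
csv_path <- file.path(data_dir, "co2_concentrations_QB.csv")
write.csv(my_data, csv_path, row.names = FALSE)

# Read the file back and inspect its structure.
reloaded <- read.csv(csv_path, stringsAsFactors = FALSE)
str(reloaded)
```

Inside an RStudio project, a relative path such as `"Data/co2_concentrations_QB.csv"` would normally be enough, because the working directory is set to the project folder when the project is opened.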
3.3.1 Naming files
- Good:
- rawDatasetAgo2017.csv
- co2_concentrations_QB.csv
- 01_figIntro.R
- Bad:
- final.csv (Uninformative!)
- safnnejs.csv (Random!)
- 1-4.csv (Avoid names made up only of numbers!)
- Dont.separate.names.with.dots.csv (Can lead to errors when reading the file!)
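Some of these naming problems can be spotted automatically. The function below is a hypothetical helper, not part of R or RStudio. It only flags names that contain dots or spaces, or that have no letters at all, and it cannot tell whether a name is informative.

```r
# Hypothetical helper: returns the problems found in a file name (empty if none).
check_file_name <- function(file_name) {
  # Drop the final extension (e.g. ".csv") before checking the name itself
  base <- tools::file_path_sans_ext(file_name)
  problems <- character(0)
  if (grepl(".", base, fixed = TRUE)) problems <- c(problems, "dots in name")
  if (grepl(" ", base, fixed = TRUE)) problems <- c(problems, "spaces in name")
  if (!grepl("[A-Za-z]", base)) problems <- c(problems, "no letters")
  problems
}

check_file_name("01_figIntro.R")
check_file_name("1-4.csv")
check_file_name("Dont.separate.names.with.dots.csv")
```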
3.3.2 Naming variables
- Use short informative titles (e.g. “Time_1” not “First time measurement”)
- Good: “Measurements”, “SpeciesNames”, “Site”
- Bad: “a”, “3”, “supercomplicatedverylongname”
- Column values must match their intended use
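If a dataset arrives with poor variable names, they can be changed after import. The sketch below uses a made-up data frame: it renames two badly named columns and then checks that each column has the type its use requires.

```r
# Hypothetical data with an uninformative name and an overly long name.
dat <- data.frame(
  a = c(2.1, 3.5),
  supercomplicatedverylongname = c("QB", "ON"),
  stringsAsFactors = FALSE
)

# Replace the column names with short, informative ones.
names(dat) <- c("Measurements", "Site")

# Check that each column has the type you expect (numeric vs. text).
sapply(dat, class)
```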
3.3.3 Common data preparation mistakes
- No text in numeric columns
- Do not include spaces in variable names or category labels!
- NA (not available) can be used for missing values; blank entries in numeric columns are automatically read as NA
- Name your variables informatively
- Look for typos!
- Avoid numeric values for data that do not have a numeric meaning (e.g. subject, replicate, treatment)
- For example, if subjects are “1,2,3” change to “A,B,C” or “S1,S2,S3”
- Use CONSISTENT formats for dates, numbers, metrics, etc.
- Do not include notes, additional headings, or merged cells!
- One variable per column!
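Several of these mistakes can be detected, and sometimes repaired, once the data are in R. The sketch below uses a small made-up dataset, typed inline so the example is self-contained, in which a numeric column contains text and the subjects are coded as numbers. The column names and values are assumptions for illustration.

```r
# Hypothetical raw file contents, written inline instead of read from disk.
raw_text <- "Subject,Treatment,Mass
1,control,12.5
2,control,
3,heated,unknown"

dat <- read.csv(text = raw_text, stringsAsFactors = FALSE)

# The word "unknown" forces the whole Mass column to be read as text.
class(dat$Mass)

# Converting to numeric turns non-numeric entries into NA
# (R warns about this; the warning is suppressed here).
dat$Mass <- suppressWarnings(as.numeric(dat$Mass))

# Subject numbers are labels, not quantities: recode them as "S1", "S2", ...
dat$Subject <- factor(paste0("S", dat$Subject))

str(dat)
```

Fixing the source file is usually preferable, but a check like `class()` or `str()` right after import is a quick way to see whether a column was read as intended.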
3.3.4 Bad data examples
It is possible to do all your data preparation work within R. This has several benefits:
- Saves time for large datasets
- Keeps original data intact
- Keeps track of the manipulation and transformation you did
- Can switch between long and wide format data very easily (more on this later and in workshop 4)
- For a useful resource, see https://www.zoology.ubc.ca/~schluter/R/data/
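As one illustration of preparing data within R, the sketch below converts a small made-up dataset from wide to long format using base R's `reshape()`. The data and column names are hypothetical, and the later workshop may use a different tool for this step. The original object is left untouched and the result is stored under a new name.

```r
# Hypothetical wide-format data: one row per site, one column per time point.
wide <- data.frame(
  Site = c("A", "B"),
  Time_1 = c(1.0, 2.0),
  Time_2 = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

# Convert to long format: one row per site and time point.
# The result goes into a new object, so `wide` stays intact.
long <- reshape(
  wide,
  direction = "long",
  varying = list(c("Time_1", "Time_2")),  # columns holding the repeated measurement
  v.names = "Measurement",                # name of the new value column
  timevar = "Time",                       # name of the new time column
  times = c(1, 2),
  idvar = "Site"
)
rownames(long) <- NULL

long
```

Because every step is written in a script, the transformation is recorded and can be rerun whenever the raw file changes.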