Introduction

Who am I?

Mr. (almost Dr.!) Richard E.W. Berl

(but “Ricky” is fine)

I am an evolutionary social (data) scientist with a background in behavior and cultural change and a passion for conserving biocultural diversity and improving social good and environmental sustainability.


B.A. Biological Sciences & B.A. Anthropology from University of Delaware (2009)



Field Assistant, Lomas Barbudal Monkey Project (2009-2010)


  • Social learning and behavioral traditions




M.S. Zoology from Washington State University (2015)

  • Social behavior and learning in captive gray wolves (Canis lupus) at Wolf Park in Battle Ground, IN





  • Cultural and genetic variation of the Chabu hunter-gatherers of Southwestern Ethiopia

     

    • Gopalan, S., Berl, R. E. W., Belbin, G., Gignoux, C., Feldman, M. W., Hewlett, B. S., & Henn, B. M. (2019). Hunter-gatherer genomes reveal diverse demographic trajectories following the rise of farming in East Africa [preprint]. bioRxiv, 517730. Available: https://www.biorxiv.org/node/152746.abstract

      Global ancestry proportions of northeast African individuals.

       

      Effective migration surfaces depicted as contour lines over A) satellite imagery, B) elevation and water features, and C) the geographic distribution of major language families in Eastern Africa.



Ph.D. Human Dimensions of Natural Resources from Colorado State University (2019)

  • Ph.D. Candidate in Human Dimensions of Natural Resources (defending on May 15th!)

  • Graduate Certificate in Applied Statistics

  • Influence of prestige in determining what people learn and from whom they choose to learn

     

    Prestige domain item loadings from exploratory factor analysis of attitudinal data.

     

    Determinants of prestige by level of social stratification across 16 societies.

     

    Mean proportion of propositions recalled from artificial creation stories by type of content bias and by speaker prestige.

     

    Color matrices of propositions recalled from artificial creation stories.


  • Volunteer data scientist for Trees, Water & People

     

    Random forest prediction of Pinus ponderosa var. scopulorum habitat suitability under present conditions on Pine Ridge Reservation and Trust Land.

     

    Correlation matrix heatmap of climatic and soil variables.

     

    Logistic regression of Pinus ponderosa var. scopulorum occurrence on burn area.



What we will cover in this course

  • See the Syllabus and Course Schedule

  • Objectives (from Syllabus)

    • Set up a convenient computing workflow

    • Write clean, thoroughly commented R code

    • Recognize different types of data, how they are measured, and how they are handled in R

    • Use the principle of ‘tidy data’ to effectively clean and format messy data sets

    • Creatively explore data sets with descriptive statistics and rough visualizations prior to confirmatory analyses

    • Clearly communicate results by visualizing data simply and effectively and by telling a compelling story with data

    • Conduct basic statistical tests and linear regression modeling

    • Explore advanced topics in data analysis, including dimensionality reduction and structural equation modeling

    • Utilize R for your own research by developing a research question, collecting and wrangling data, and conducting the appropriate analyses

    • Support reproducible research by documenting and embedding analyses in a written report

    • Use the skills you have learned to communicate your process and results to a general audience


What we will not cover

  • R Markdown (kind of) and R Notebook

  • LaTeX

  • Version control
    • Git / GitHub

  • Tibbles (tibble package) and piping (magrittr package)

  • Statistical theory (except when necessary)


Setting up a computing workflow

Good organization will save you time and frustration. Future you will thank present you. Trust me. You will have analyses that you’ll come back to years later and wonder what in the world you were thinking. It also makes collaborative work a whole lot easier when your work is organized.

See the required reading by FitzJohn and their recommendations for organizing a project. Do it for this class and use it to help organize your own research projects, as well. The earlier, the better.

Create a folder structure for this course. I’d recommend something like the following:

nr592/
├── data/
├── docs/
├── figs/
├── output/
└── R/
    ├── assignment1.R
    └── lecture01.R

You may want to create another folder, like lectures/, to keep my R Markdown lecture files in (the lecture01.R in the structure above refers to your own notes). Or you could put them in docs/.

All of the above are just suggestions. Do what makes sense to you and stick with it.
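If you would rather build the folders from R than by hand, here is a minimal sketch. It assumes the folder names suggested above and, to stay harmless, builds everything inside a temporary directory; course_dir and subfolders are just names I made up for this illustration.

```r
# Assumption: a throwaway location, so nothing on your machine is touched.
# Replace tempdir() with the folder where you keep your coursework.
course_dir = file.path(tempdir(), "nr592")

# The five subfolders from the suggested structure
subfolders = c("data", "docs", "figs", "output", "R")

# Create each subfolder (and the course folder itself, if needed)
for (folder in subfolders) {
  dir.create(file.path(course_dir, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# Add empty placeholder scripts to the R/ folder
file.create(file.path(course_dir, "R", c("assignment1.R", "lecture01.R")))

# List everything that was created
list.files(course_dir, recursive = TRUE, include.dirs = TRUE)
```

Running it a second time is safe: existing folders are left alone. Note that file.create() will empty a file that already exists, so drop that line once you have real scripts in place.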

Now we’re going to create an R project file for the course. You should put it in your main class folder, like this:

nr592/
├── data/
├── docs/
├── figs/
├── output/
├── R/
└── nr592.Rproj

In RStudio, use the button in the top right that says “Project: (None)” and select “New Project…” Choose “Existing Directory” if you’ve already made the class folder, as above. Navigate to the class folder and hit “Create Project.”

You’re done! Any time you need to work on something for this class, you can open this project and any scripts you had open last time will open up for you.

Importantly, your working directory (the directory where R will look for any files you’re loading or saving) is set to your class directory while you are working on scripts in your R project.

You can check your working directory at any time by running:

getwd()
## [1] "S:/MEGAsync/CSU/Courses/NR 592 (R Seminar)/Site/lectures"

(This is my current working directory.)

And if you need to change it, you can do so using setwd() with the full folder path inside quotes, inside the parentheses, like so:

setwd("C:/My Research/My Big Project/")

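(Note the direction of the slashes: R expects forward slashes in paths, even on Windows. A single backslash is treated as an escape character, so a Windows-style path needs either forward slashes or doubled backslashes.)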
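Once the working directory is set, you can refer to files relative to it instead of typing full paths. The sketch below is only an illustration: it uses a temporary folder as a stand-in for your class folder, and data/survey.csv is a made-up file name that does not need to exist for the code to run.

```r
# Remember where we started so we can go back afterwards
old_wd = getwd()

# Stand-in for your class folder (assumption: any existing folder works here)
setwd(tempdir())
getwd()

# Build a path relative to the working directory.
# file.path() joins the pieces with forward slashes on every operating system.
data_file = file.path("data", "survey.csv")   # hypothetical file name
data_file
## [1] "data/survey.csv"

# Return to the original working directory
setwd(old_wd)
```

If you always open the course through its R project file, you should rarely need setwd() at all; relative paths like the one above will simply work.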


Basic concepts in R

DON’T BE AFRAID TO FAIL!

Type something and run it! It doesn’t matter if you get an error; you won’t break anything.

Run current line/selection of code:

  • Ctrl+Enter (Windows)
  • Command+Enter (Mac)

Clear console:

  • Ctrl+L (Windows & Mac)

Source: RStudio Keyboard Shortcuts

Objects

Variables

Assignment

a = 1
b = 2
c = 42
a
## [1] 1
b
## [1] 2
c
## [1] 42

Assignment can also be done using the <- operator (e.g. c <- 42). I like using = because it takes fewer keystrokes and is more similar to other programming languages, like Python. You can use either; just pick one and stick with it.

If you use =, keep in mind that = is used for assignment and == is used for logical comparisons, e.g. “Does a == 7? FALSE”.

If you use <-, the direction of the arrow matters: c <- 42 is the same as 42 -> c, but c -> 42 fails with an error because it tries to assign c to the number 42. You can also assign a value to multiple variables at once with this operator, e.g. g <- h <- i <- 6.
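Here is a short sketch of those points, reusing the variable names from above. Try changing the values and running it again.

```r
# Assignment with =, then comparison with ==
a = 1
a == 7
## [1] FALSE
a == 1
## [1] TRUE

# The arrow can point right, as long as it points at the variable name
42 -> c
c
## [1] 42

# Chained assignment: g, h, and i all receive the value 6
g <- h <- i <- 6
g + h + i
## [1] 18
```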

Operations

a + b
## [1] 3
a^2 + b^2
## [1] 5
c / (a + b)
## [1] 14

Extra spaces between values, operators, and parentheses don’t matter.

c / (a                                                                     +b )
## [1] 14

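You can even hit ‘Enter’/‘Return’ in a script (see below) and carry on to the next line, as long as the line you are leaving ends with an operator or has an unclosed parenthesis, so that R knows the command isn’t finished.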

(a - c) +
  (a * b) +
  (c / b)
## [1] -18

Or leave empty lines between commands (though you wouldn’t want to do this in the middle of a command: it would be confusing and bad stuff could happen).

a + b + c - b^2




c * 2

(This code block wasn’t evaluated so that you could actually see the spacing, but it works! Try it.)

Use spacing to your advantage to make your code more easily readable. If you have a line of code that goes over 80 characters (the thin line on the right side of your script pane), insert returns after operators (+, -, etc.) for clean breaks in your code. R will auto-indent the next line for you.
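Why break after the operator rather than before it? A quick sketch, using the same a, b, and c as before (total_end and total_start are names I made up for this illustration): if a line already looks complete, R runs it as is and treats the next line as a separate command.

```r
a = 1
b = 2
c = 42

# Operators at the END of each line: R keeps reading until the command is complete
total_end = (a - c) +
  (a * b) +
  (c / b)
total_end
## [1] -18

# Operator at the START of the next line: the first line is already a
# complete command, so the second line is run on its own and never added
total_start = (a - c)
  + (a * b)
total_start
## [1] -41
```

No error appears in the second case, which is exactly what makes it easy to miss. The stray line just prints its own value, 2, and moves on.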

Scripts

Always work in scripts!

Open a new R script:

  • Ctrl+Shift+N (Windows)
  • Command+Shift+N (Mac)

Source: RStudio Keyboard Shortcuts


