CSSS 508, Lecture 1

Syllabus and Introduction to R, RStudio, and RMarkdown

Michael Pearce
(based on slides from Chuck Lanfear)

March 29, 2022

1 / 54

Welcome to CSSS 508: Introduction to R for Social Scientists

Today, we will do:

  • Introductions

  • Syllabus

  • Lecture 1: Introduction to R, RStudio, and RMarkdown

2 / 54

Introductions

We'll go around the room and each share our:

  • Name and preferred pronouns

  • Program and year

  • Experience with programming (in R or generally)

  • Something fun you did over Spring Break

3 / 54

Syllabus

The syllabus (as well as lots of other information) can be found on our course website:


https://pearce790.github.io/CSSS508


Feel free to follow along online as I run through the syllabus!

4 / 54

Course Goals

This course is intended to give students a foundational understanding of programming in the statistical language R. General topics include:

  • Exploring data with graphics and summaries

  • Cleaning, preparing, and linking data for analyses

  • Foundational programming skills such as functions and loops

  • Organizing projects and creating reproducible research

We will cover almost no statistics here, but I hope you'll leave being able to focus on statistics instead of coding in future CSSS or STAT courses!

5 / 54

Logistics

Sessions:

  • Lecture: Wednesdays, 3:30-5:20 (Savery 117) -- Interactive sessions in which we'll learn key skills, concepts, and principles
  • Lab: Mondays, 3:30-5:20 (Savery 117) -- Optional and mostly unstructured sessions to work on homework and review
  • Office Hours: Tuesdays, 9-10am and 3-4pm (on Zoom; link on Canvas)

Course Website: https://pearce790.github.io/CSSS508

Contact: Feel free to email me at [mpp790 at uw dot edu]

6 / 54

Schedule

  • Week 1: Introduction to R, RStudio, and RMarkdown
  • Week 2: Visualizing Data
  • Week 3: Manipulating and Summarizing Data
  • Week 4: Understanding R Data Structures
  • Week 5: Importing, Exporting, and Cleaning Data
  • Week 6: Using Loops
  • Week 7: Writing Functions
  • Week 8: Working with Text Data
  • Week 9: Working with Geographical Data
  • Week 10: Reproducibility and Model Results

This course will have no meeting during final exam week.

7 / 54

Prereqs, Materials, and Texts

Prerequisites: None

Materials: All course materials are provided on the course website. This includes:

  • These slides and the code used to generate them.
  • An R script for the slides to follow along in class.
  • Homework instructions and/or templates
  • Useful links to other resources.

Laptops: It's helpful to bring a laptop to class. If you don't have one, you can use the lab computers or borrow one for free from the UW Student Technology Loan Program.

Textbooks: This course has no textbook. However, the website has links to a few texts which I have found useful!

8 / 54

Grading

Final grade: C/NC, 60% to get Credit.

  • Homework (75%; assessed by peers): 8 homework assignments in total, each assessed on a 0-3 point rubric. Assigned after lectures and due before the following lecture.
  • Peer Grading (25%; assessed by the instructor): One per homework, assessed on a binary "good"/"not good" scale. Due before the following lab.

Assignment/peer grading instructions and deadlines can be found on the Homework page of the course website. All homework will be turned in on Canvas.

9 / 54

Ugh, peer grading?

Yes, because:

  • You will write your reports better knowing others will see them

  • You learn alternate approaches to the same problem

  • You will have more opportunities to practice and have the material sink in

How to peer review:

  • Leave constructive comments: You'll only get the point if you write at least one full paragraph that
    • Identifies any key issues with the assignment, and
    • Points out something positive in your peer's work.
  • Email me if you would like your assignment regraded, or if you received no peer review and would like feedback.
10 / 54

Academic Integrity

Academic integrity is essential to this course and to your learning. Violations of the academic integrity policy include but are not limited to:

  • Copying from a peer
  • Copying from an online resource
  • Using resources from a previous iteration of the course.

I hope you will collaborate with peers on assignments and use Internet resources when questions arise to help solve issues. The key is that you ultimately submit your own work.

Anything found in violation of this policy will be automatically given a score of 0 with no exceptions. If the situation merits, it will also be reported to the UW Student Conduct Office, at which point it is out of my hands. If you have any questions about this policy, please do not hesitate to reach out and ask.

11 / 54

Classroom Environment

I am absolutely committed to fostering a friendly and inclusive classroom environment in which all students have an equal opportunity to learn and succeed.

  • Names & Pronouns: Everyone should be addressed respectfully and correctly. Feel free to send me your preferred name/pronouns anytime.

  • Covid: Covid creates unique circumstances for each of us, which may limit your ability to fully participate in this course. You never need to apologize to me for anything pandemic related. Let me know how I can help!

  • Accessibility & Accommodations: See the course website for information on health, disability, and religious accommodations.

  • Feedback: I encourage feedback at any point in the quarter. I will also send out a mid-quarter evaluation around Week 5.

  • Getting Help: If you ever find yourself struggling, know I'm here to help! Try chatting after class, email, or office hours.

12 / 54

Asking Questions

Don't ask like this:

tried lm(y~x) but it iddn't work wat do

Instead, ask like this:

y <- seq(1:10) + rnorm(10)
x <- seq(0:10)
model <- lm(y ~ x)
Running the block above gives me the following error, anyone know why?
Error in model.frame.default(formula = y ~ x, 
drop.unused.levels = TRUE) : variable lengths differ 
(found for 'x')

I may send out your question (anonymously) and my answer to the course mailing list!
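
For the example question above, here is one likely explanation, sketched under the assumption that ten paired observations were intended: seq(0:10) returns the positions 1 through 11, so x has 11 values while y has only 10. Giving both vectors the same length lets the model fit. The set.seed() call is an addition so the random noise is repeatable.

```r
set.seed(508)  # added so the random noise is the same on every run

y <- 1:10 + rnorm(10)  # 10 values
x <- 1:10              # also 10 values, so lengths now match

length(x) == length(y)

model <- lm(y ~ x)
coef(model)  # intercept and slope of the fitted line
```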

13 / 54

Questions?

14 / 54

Lecture 1: Introduction to R, RStudio, and RMarkdown

15 / 54

A Note on Slide Formatting

Bold and Italics indicate important terms!

Code font represents R code or keyboard shortcuts you could use to perform actions. For example: "Press Ctrl-P to open the print dialogue."

Code chunks that span the page represent actual R code embedded in the slides.

# Sometimes important stuff is highlighted!
7 * 49
## [1] 343
16 / 54

Why R?

R is a programming language built for statistical computing.

If one already knows Stata or similar software, why use R?

  • R is free.

  • R has a very large community.

  • R can handle virtually any data format.

  • R makes replication easy.

  • R is a full programming language, so it can do almost anything.

  • R skills transfer to other languages like Python and Julia.

17 / 54

RStudio

RStudio is a "front-end" or integrated development environment (IDE) for R that can make your life easier.

We'll show that RStudio can...

  • Organize your code, output, and plots

  • Auto-complete code and highlight syntax

  • Help you view data and objects

  • Enable easy integration of R code into documents with R Markdown

It can also...

  • Manage git repositories

  • Run interactive tutorials

  • Handle other languages like C++, Python, SQL, HTML, and shell scripting

18 / 54

Selling You on R Markdown

The ability to create R Markdown files is a powerful advantage of R:

  • Document analyses by combining text, code, and output

    • No copying and pasting into Word
    • Easy for collaborators to understand
    • Show as little or as much code as you want
  • Produce many different document types as output

    • PDF documents
    • HTML webpages and reports
    • Word and PowerPoint documents
    • Presentations (like these slides)
  • Works with LaTeX and HTML for math and more formatting control

We'll get back to this shortly!

19 / 54

Downloading R and RStudio

If you don't already have R and RStudio on your machine, now is the time to install them!

  1. Go to the course homepage, https://pearce790.github.io/CSSS508
  2. Click the Download R link, then download and install R on your machine.
  3. Afterwards, click the Download RStudio link, then download and install RStudio on your machine.

We'll take a ~10 minute break now to stretch and solve any software issues!

20 / 54

Getting Started

Open up RStudio now and choose File > New File > R Script.

Then, let's get oriented with the interface:

  • Top Left: Code editor pane, data viewer (browse with tabs)

  • Bottom Left: Console for running code (> prompt)

  • Top Right: List of objects in environment, code history tab.

  • Bottom Right: Tabs for browsing files, viewing plots, managing packages, and viewing help files.

21 / 54

Editing and Running Code

There are several ways to run R code in RStudio:

  • Highlight lines in the editor window and click Run at the top or hit Ctrl+Enter or ⌘+Enter to run them all.

  • With your caret (the blinking | text cursor) on a line you want to run, hit Ctrl+Enter or ⌘+Enter. Note that your caret moves to the next line, so you can run code sequentially with repeated presses.

  • Type individual lines in the console and press Enter.

  • In R Markdown documents, click within a code chunk and click the green arrow to run the chunk. The button beside that runs all prior chunks.

The console will show the lines you ran followed by any printed output.

22 / 54

Incomplete Code

If you mess up (e.g. leave off a parenthesis), R might show a + sign prompting you to finish the command:

> (11-2
+

Finish the command or hit Esc to get out of this.

23 / 54

R as a Calculator

In the console, type 123 + 456 + 789 and hit Enter.

123 + 456 + 789
## [1] 1368

The [1] in the output indicates the numeric index of the first element on that line.
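
To see why that index is useful, here is a minimal sketch with a longer vector. When it prints, the output wraps across several lines and each line starts with the index of its first element. The exact wrapping depends on your console width, so no output is shown. The name long_vector is made up for illustration.

```r
# A vector of 30 values; when printed, each output line begins with
# the index of the first element shown on that line.
long_vector <- 101:130
long_vector

# Square brackets retrieve an element by its index.
long_vector[1]   # first element
long_vector[30]  # last element
```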

Now in your blank R document in the editor, try typing the line sqrt(400) and either clicking Run or hitting Ctrl+Enter or ⌘+Enter.

sqrt(400)
## [1] 20
24 / 54

Functions and Help

sqrt() is an example of a function in R.

If we didn't have a good guess as to what sqrt() will do, we can type ?sqrt in the console and look at the Help panel on the right.

?sqrt

Arguments are the inputs to a function. In this case, the only argument to sqrt() is x which can be a number or a vector of numbers.

Help files provide documentation on how to use functions and what functions produce.
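
A quick sketch of exploring a function from the console. args() is a base R function that shows a function's arguments, and help("sqrt") is the long form of ?sqrt (commented out here so the snippet runs without opening the Help panel).

```r
# Show the arguments of sqrt(): it takes a single argument, x.
args(sqrt)

# Same help page as ?sqrt:
# help("sqrt")

# The argument x can be one number or a vector of numbers.
sqrt(16)
sqrt(c(1, 4, 9))
```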

25 / 54

Objects

R stores everything as an object, including data, functions, models, and output.

Creating an object can be done using the assignment operator: <-

new.object <- 144

Operators like <- are functions that look like symbols but typically sit between their arguments (e.g. numbers or objects) instead of having them inside () like in sqrt(x).

We do math with operators, e.g., x + y. + is the addition operator!
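
If you want to convince yourself that operators really are functions, here is a small sketch. Wrapping an operator in backticks lets you call it with its arguments inside (), just like sqrt(x). The object names x, y, and z are made up for illustration, and you would not normally write code this way.

```r
x <- 7
y <- 5

# The usual form: the operator sits between its arguments.
infix_sum <- x + y

# The same operator called like any other function, using backticks.
prefix_sum <- `+`(x, y)

identical(infix_sum, prefix_sum)

# Assignment can be written the same way.
`<-`(z, 144)
z
```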

26 / 54

Calling Objects

You can display or "call" an object simply by using its name.

new.object
## [1] 144

Object names can contain _ and . in them but cannot begin with numbers. Try to be consistent in naming objects. RStudio auto-complete means long names are better than vague ones!

Good names save confusion later!

Object names are CaSe SeNsItIvE!!
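
A short sketch of these naming rules in action. All of the object names below are made up for illustration.

```r
# Valid names: letters, numbers, _ and . (but not starting with a number).
survey_year <- 2022
survey.year.2 <- 2023

# Names are case sensitive, so these are two different objects.
sample_size <- 100
Sample_Size <- 250
sample_size == Sample_Size

# A name that begins with a number is a syntax error, so this line
# is commented out:
# 2nd_survey <- 2021
```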

27 / 54

Using Objects

An object's name represents the information stored in that object, so you can treat the object's name as if it were the values stored inside.

new.object + 10
## [1] 154
new.object + new.object
## [1] 288
sqrt(new.object)
## [1] 12
28 / 54

Vectors

A vector is a series of elements, such as numbers.

You can create a vector using the function c() which stands for "combine" or "concatenate".

new.object <- c(4, 9, 16, 25, 36)
new.object
## [1]  4  9 16 25 36

If you name an object the same name as an existing object, it will overwrite it.

You can provide a vector as an argument for many functions.

sqrt(new.object)
## [1] 2 3 4 5 6
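
A few more things you can try with the same vector, as a minimal sketch. Arithmetic operators work on every element at once, while functions like length(), sum(), and mean() boil the whole vector down to one value.

```r
new.object <- c(4, 9, 16, 25, 36)

# Arithmetic is applied to each element in turn.
new.object + 10
new.object / 2

# Some functions summarize a whole vector as a single value.
length(new.object)
sum(new.object)
mean(new.object)
```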
29 / 54

More Complex Objects

There are other, more complex data types in R which we will discuss later in the quarter! These include matrices, arrays, lists, and dataframes.

Most data sets you will work with will be read into R and stored as a dataframe, so this course will mainly focus on manipulating and visualizing these objects.
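
As a small preview, here is a sketch of a dataframe built by hand. The students object and its columns are made up for illustration; in practice you will usually read data in from a file instead.

```r
# A small, made-up dataframe: each column is a vector of the same length.
students <- data.frame(
  name = c("Ana", "Ben", "Chi"),
  year = c(1, 2, 1),
  uses_r = c(TRUE, FALSE, TRUE)
)

students         # display the whole dataframe
nrow(students)   # number of rows
ncol(students)   # number of columns
students$year    # pull out a single column as a vector
```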

30 / 54

R Markdown

31 / 54

R Markdown Documents

Let's try making an R Markdown file:

  1. Choose File > New File > R Markdown...
  2. Make sure HTML Output is selected and click OK
  3. Save the file somewhere, call it my_first_rmd.Rmd
  4. Click the Knit HTML button
  5. Watch the progress in the R Markdown pane, then gaze upon your result!

If you ever have trouble knitting your file (especially if creating a PDF), try running the following code in the console:

install.packages('rmarkdown')
install.packages('tinytex')
tinytex::install_tinytex()
32 / 54

R Markdown Headers

The header of an .Rmd file is a YAML code block, and everything else is part of the main document.

---
title: "Untitled"
author: "Michael Pearce"
date: "March 29, 2023"
output: html_document
---

To mess with global formatting, you can modify the header. Indentation matters here, so nest each level with spaces:

output:
  html_document:
    theme: readable
33 / 54

R Markdown Syntax

Output

bold/strong emphasis

italic/normal emphasis

Header

Subheader

Subsubheader

Block quote from famous person

Syntax

**bold/strong emphasis**

*italic/normal emphasis*


# Header


## Subheader

### Subsubheader

> Block quote from
> famous person
34 / 54

More R Markdown Syntax

Output

  1. Ordered lists
  2. Are real easy
    1. Even with sublists
    2. Or when lazy with numbering
  • Unordered lists
  • Are also real easy
    • Also even with sublists

Syntax

1. Ordered lists
1. Are real easy
  1. Even with sublists
  1. Or when lazy with numbering


* Unordered lists
* Are also real easy
  + Also even with sublists

35 / 54

Formulae and Syntax

Output

Include math $y = \left(\frac{2}{3}\right)^2$.

$$\frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x}_n$$

Or write code-looking font.

Or a block of code:

y <- 1:5
z <- y^2

Syntax

Include math $y= \left(
\frac{2}{3} \right)^2$.

$$\frac{1}{n} \sum_{i=1}^{n}
x_i = \bar{x}_n$$

Or write
`code-looking font`.

Or a block of code:

    ```
    y <- 1:5
    z <- y^2
    ```
36 / 54

R Markdown Tinkering

R Markdown docs can be modified in many ways. Visit these links for more information.

37 / 54

R Code in R Markdown

Inside RMarkdown, lines of R code are called chunks. Code is sandwiched between sets of three backticks and {r}. This chunk of code...

```{r}
summary(cars)
```

Produces this output in your document:

summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
38 / 54

Chunk Options

Chunks have options that control what happens with their code, such as:

  • echo=FALSE: Keeps R code from being shown in the document

  • eval=FALSE: Shows R code in the document without running it

  • include=FALSE: Hides all output but still runs code (good for setup chunks where you load packages!)

  • results='hide': Hides R's (non-plot) output from the document

  • cache=TRUE: Saves results of running that chunk so if it takes a while, you won't have to re-run it each time you re-knit the document

  • fig.height=5, fig.width=5: modify the dimensions of any plots that are generated in the chunk (units are in inches)

  • fig.cap="Text": add a caption to your figure in the chunk
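
The options above are set one chunk at a time. As a sketch of one common pattern, and assuming the knitr package is installed (it comes along with rmarkdown), you can also set defaults for all later chunks at once with knitr::opts_chunk$set(). The particular values below are only examples.

```r
# Usually placed in a first "setup" chunk that itself uses include=FALSE.
knitr::opts_chunk$set(
  echo = FALSE,     # hide code by default
  fig.height = 5,   # default plot height in inches
  fig.width = 5     # default plot width in inches
)

# Check the defaults now in effect.
knitr::opts_chunk$get("echo")
```

Options written in an individual chunk header still override these defaults for that chunk.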

39 / 54

Playing with Chunk Options

Try adding or changing the chunk options (separated by commas) for the two chunks in my_first_rmd.Rmd and re-knitting to check what happens.

```{r echo=FALSE}
summary(cars)
```
40 / 54

In-Line R code

Sometimes we want to insert a value directly into our text. We do that using code in single backticks starting off with r.

Four score and seven years ago is the same as `r 4*20 + 7` years.

Four score and seven years ago is the same as 87 years.

Maybe we've saved a variable in a chunk we want to reference in the text:

x <- sqrt(77) # <- is how we assign objects

The value of `x` rounded to two decimal places is `r round(x, 2)`.

The value of x rounded to two decimal places is 8.77.
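
To check what an in-line expression will render before knitting, the same code can be run in the console. A minimal sketch (variable names other than x are illustrative):

```r
# The expression inside the backticks is ordinary R code
score_years <- 4 * 20 + 7
print(score_years)

# Assign, then round to two decimal places for reporting
x <- sqrt(77)
x_rounded <- round(x, 2)
print(x_rounded)
```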

41 / 54

This is Amazing!

Having R dump values directly into your document protects you from silly mistakes:

  • Never wonder "how did I come up with this quantity?" ever again: Just look at your formula in your .Rmd file!

  • Consistency! No "find/replace" mishaps; reference a variable in-line throughout your document without manually updating if the calculation changes (e.g. reporting sample sizes).

  • You are more likely to make a typo in a "hard-coded" number than you are to write R code that somehow runs but gives you the wrong thing.
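
As an illustration of the sample-size case, the sketch below stores each quantity once and reuses it. The object names n_cars, mean_dist, and report are hypothetical, and the built-in cars data stands in for your own. In an .Rmd file the sentence would be written with in-line code rather than paste0():

```r
# Compute each reported quantity once, in a chunk
n_cars <- nrow(cars)                      # sample size
mean_dist <- round(mean(cars$dist), 2)    # mean stopping distance

# In the .Rmd text you would reference n_cars and mean_dist with in-line code;
# here paste0() builds the same sentence so it can be checked in the console.
report <- paste0("We observed ", n_cars,
                 " cars with a mean stopping distance of ",
                 mean_dist, " ft.")
print(report)
```

If the data change, re-knitting updates every place these objects are referenced.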

42 / 54

Example: Keeping Dates

In your YAML header, make the date come from R's Sys.time() function by changing:

date: "March 29, 2023"

to:

date: "`r Sys.time()`"
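
Sys.time() returns the full date and time. If only the date is wanted, one option (an alternative to the approach above, not a replacement) is Sys.Date() with format(). The format string below is an illustrative choice:

```r
# Sys.time() includes hours, minutes, and seconds
now <- Sys.time()
print(now)

# Sys.Date() gives just the date; format() controls how it is written out
today <- Sys.Date()
pretty_date <- format(today, "%B %d, %Y")  # month name, day, four-digit year
print(pretty_date)
```

In the YAML header this could be written as date: "`r format(Sys.Date(), '%B %d, %Y')`".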
43 / 54

Data Frames

44 / 54

What's Up with cars?

The sample R Markdown document you are working on uses the built-in cars data, which loads as a data frame, a type of object mentioned earlier. We can look at it in a couple of different ways.

data(cars) loads this dataframe into the Global Environment.

View(cars) opens a spreadsheet-style data viewer ("interactive" use only; don't put it in an R Markdown document!), or...

head(cars, 5) # prints first 5 rows, see tail() too
##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
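
tail() works the same way from the other end of the data frame, and nrow() and ncol() report its dimensions. A brief sketch using the same built-in data (last_rows is an illustrative name):

```r
# Last 3 rows of the data frame
last_rows <- tail(cars, 3)
print(last_rows)

# Number of rows and columns
print(nrow(cars))
print(ncol(cars))
```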
45 / 54

Tell Me More About cars

str() displays the structure of an object:

str(cars) # str[ucture]
## 'data.frame': 50 obs. of 2 variables:
## $ speed: num 4 4 7 7 8 9 10 10 10 11 ...
## $ dist : num 2 10 4 22 16 10 18 26 34 17 ...

summary() displays summary information:

summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
46 / 54

Ugly Pictures of cars

hist() generates a histogram of a vector. Note you can access a vector that is a column of a dataframe using $, the extract operator.

hist(cars$speed) # Histogram

hist(cars$dist)
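
To see what $ returns, the sketch below pulls out one column and inspects it. The names speed_vec and same_vec are illustrative, and the [[ ]] form is shown only as an equivalent alternative:

```r
# Extract the speed column as a plain numeric vector
speed_vec <- cars$speed
print(class(speed_vec))    # the type of the extracted column
print(length(speed_vec))   # one value per row of cars

# Double brackets with the column name give the same vector
same_vec <- cars[["speed"]]
print(identical(speed_vec, same_vec))
```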

47 / 54

Drawing Slightly Less Ugly Pictures

hist(cars$dist,
     xlab = "Distance (ft)", # X axis label
     main = "Observed stopping distances of cars") # Title

48 / 54

Math with cars

dist_mean <- mean(cars$dist)
print(dist_mean)
## [1] 42.98
speed_mean <- mean(cars$speed)
print(speed_mean)
## [1] 15.4
49 / 54

Drawing Still Ugly Pictures

plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)",
     main = "Speeds and stopping distances of cars",
     pch = 16) # Point shape (16 = filled circle)

50 / 54

swiss Time

Let's switch gears to the swiss data frame built into R.

First, use ?swiss to see what things mean.

Then, load it using data(swiss)

Add chunks to your R Markdown document that inspect swiss, define variables, and make some exploratory plots using hist() or plot().

You might experiment with colors and shapes.
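
One possible starting point, as a sketch only: the column choices, labels, colors, and point shape below are arbitrary illustrations, and fertility_mean is a hypothetical variable name. The column names come from the swiss help page.

```r
# Load and inspect the data
data(swiss)
str(swiss)

# Define a variable for later use (e.g., in in-line text)
fertility_mean <- mean(swiss$Fertility)
print(fertility_mean)

# Histogram with a fill color
hist(swiss$Fertility,
     xlab = "Fertility measure",
     main = "Fertility in Swiss provinces",
     col = "lightblue")

# Scatterplot with a different point shape (pch) and color (col)
plot(Fertility ~ Education, data = swiss,
     xlab = "Education",
     ylab = "Fertility",
     main = "Fertility versus education",
     pch = 17,          # 17 = filled triangle
     col = "darkgreen")
```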

51 / 54

Looking at swiss

pairs(swiss, pch = 8, col = "violet",
main = "Pairwise comparisons of Swiss variables")

pairs() is a pairwise scatterplot function. Good for a quick look at small datasets, but mostly useless for larger data.

52 / 54

Homework

Visit the Homework page for details.

Advice: Start with the provided template, then modify!

53 / 54

See you on Monday for lab!

(I'll stay for a bit after class to chat!)

54 / 54
