Spreading and gathering [wrangle]

(Builds on: Tidy data)

The two most common ways for data to be messy are to have:

  1. One variable spread across multiple columns.
  2. One observation scattered across multiple rows.

To fix these problems, you need gather() and spread() from the tidyr package: gather() handles the first problem, and spread() handles the second.
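As a minimal sketch of both fixes, the example below uses two small made-up data frames (the names `wide`, `scattered`, and all values are hypothetical, chosen only for illustration). It assumes the tidyr package is installed.

```r
library(tidyr)

# Hypothetical data: the variable "year" is spread across two columns.
wide <- data.frame(
  country = c("A", "B"),
  `1999` = c(10, 20),
  `2000` = c(15, 25),
  check.names = FALSE  # keep the numeric column names as-is
)

# gather() collects the year columns into a key column and a value column,
# giving one row per country-year.
long <- gather(wide, key = "year", value = "cases", `1999`, `2000`)

# Hypothetical data: each observation (a country in a year) is scattered
# across two rows, one for cases and one for population.
scattered <- data.frame(
  country = c("A", "A", "B", "B"),
  year = c(1999, 1999, 1999, 1999),
  type = c("cases", "population", "cases", "population"),
  count = c(10, 1000, 20, 2000)
)

# spread() turns the values of `type` into columns,
# giving one row per observation.
tidy <- spread(scattered, key = type, value = count)
```

Here `long` has one row per country and year, and `tidy` has one row per country and year with separate `cases` and `population` columns.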

spread() and gather() also illustrate a new type of missingness. So far we’ve discussed explicit missing values (NA), but values can also be implicitly missing: simply absent from the data.
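The difference between the two kinds of missingness can be sketched with a small made-up data frame (the name `stocks` and its values are hypothetical). One value is recorded as NA, while another is missing only because its row does not exist. The sketch assumes tidyr is installed.

```r
library(tidyr)

# Hypothetical data: the return for 2015 Q4 is explicitly missing (NA);
# the return for 2016 Q1 is implicitly missing (there is no such row).
stocks <- data.frame(
  year = c(2015, 2015, 2015, 2015, 2016, 2016, 2016),
  qtr = c(1, 2, 3, 4, 2, 3, 4),
  return = c(1.88, 0.59, 0.35, NA, 0.92, 0.17, 2.66)
)

# Spreading the years into columns forces a cell to exist for 2016 Q1,
# so the implicit missing value becomes an explicit NA.
wide <- spread(stocks, key = year, value = return)

# Gathering with na.rm = TRUE drops rows whose value is NA,
# so both missing values become implicit.
long <- gather(wide, key = "year", value = "return", `2015`, `2016`, na.rm = TRUE)
```

After spreading, `wide` contains two NA cells even though `stocks` contained only one; after gathering with `na.rm = TRUE`, `long` contains none.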

Readings