Course details

Data in a data science project is very rarely available as part of a package. More typically, it sits in a file or a database, or has to be extracted from documents such as web pages, tweets, or PDFs. In these cases, the first step is to import the data into R and tidy it using the tidyverse package. This usually involves several, often complicated, steps to convert the data from its raw form to the tidy form that greatly facilitates the rest of the analysis. We refer to this process as data wrangling.

In this course, we will cover several common steps of the data wrangling process, including importing data into R from files, tidying data, string processing, HTML parsing, working with dates and times, and text mining. Rarely are all these wrangling steps necessary in a single analysis, but a data scientist will likely face them all at some point.

HarvardX has partnered with DataCamp for all assignments. This allows students to program directly in a browser-based interface. You will not need to download any special software, but an up-to-date browser is recommended.

Updated on 17 September, 2019
