Course details

If you are looking for one course that covers everything about data analysis with R, this is it. Let's start this data analysis journey together.

This course is a blend of text, videos, code examples, and assessments, which together make your learning journey all the more exciting and truly rewarding. Its sections follow a sequential flow of concepts, forming a focused learning path presented in a modular manner. This helps you learn a range of topics at your own speed while moving towards your goal of solving data analysis problems with R.

The R language is a powerful open source programming language with strong functional programming features. R is becoming the go-to tool for data scientists and analysts, and its growing popularity is due to its open source nature and extensive development community. R is increasingly being used by experienced data science professionals instead of Python, and it will remain the top choice for data scientists in 2017. Big companies continue to use R for their data science needs, and this course will prepare you for these opportunities when they come your way.

This course has been prepared through extensive research and careful curation. Each section builds on the skills learned earlier and helps you achieve mastery of data analysis. Every section is modular and can also be used as a standalone resource.

This course has been designed to cover every topic a data scientist is likely to need, and it does so in a step-by-step and practical manner. It also adds an introduction to machine learning.

We will start by learning how to prepare and process data and how to perform sophisticated ETL on heterogeneous data sources with R packages. A data manipulation example illustrates how to use the "dplyr" and "data.table" packages to process larger data structures efficiently. We will then see how readily R handles probability and statistics problems, and look at R instructions for quickly organizing and manipulating large datasets. Next, we will learn to predict user purchase behavior with a classification approach and implement data mining techniques to discover items that are frequently purchased together. Finally, we will offer insight into time series analysis on financial data, followed by detailed coverage of machine learning, including data classification, regression, clustering, association rule mining, and dimension reduction.

This course has been authored by some of the best in their fields:

Yu-Wei, Chiu (David Chiu)

Yu-Wei, Chiu (David Chiu) is the founder of LargitData, a start-up company that mainly focuses on providing big data and machine learning products. He specializes in using Spark and Hadoop to process big data and in applying data mining techniques for data analysis. Yu-Wei is also a professional lecturer who has delivered lectures on big data and machine learning in R and Python and has given tech talks at a variety of conferences.

Selva Prabhakaran

Selva Prabhakaran is a data scientist with a large e-commerce organization. In his seven years of experience in data science, he has tackled complex real-world data science problems and delivered production-grade solutions for top multinational companies.

Tony Fischetti

Tony Fischetti is a data scientist at College Factual, where he gets to use R every day to build personalized rankings and recommender systems.

Viswa Viswanathan

Viswa Viswanathan is an associate professor of Computing and Decision Sciences at the Stillman School of Business at Seton Hall University. In addition to teaching at the university, Viswa has conducted training programs for industry professionals. He has published several peer-reviewed research papers in journals such as Operations Research, IEEE Software, Computers and Industrial Engineering, and International Journal of Artificial Intelligence in Education.

Shanthi Viswanathan

Shanthi Viswanathan is an experienced technologist who, as a consultant, has helped several large organizations, including Canon, Cisco, Celgene, Amway, Time Warner Cable, and GE, in areas such as data architecture and analytics, master data management, service-oriented architecture, business process management, and modeling.

Romeo Kienzler

Romeo Kienzler is the Chief Data Scientist of the IBM Watson IoT Division and works as an Advisory Architect, helping clients worldwide solve their data analysis problems. His current research focus is on cloud-scale data mining using open source technologies including R, Apache Spark, SystemML, Apache Flink, and DeepLearning4J.


This course is a blend of text, videos, and assessments, all packaged together with your learning journey in mind. It combines some of the best that Packt has to offer in one complete package. It includes content from the following Packt products:

  • R for Data Science Cookbook by Yu-Wei, Chiu (David Chiu)
  • R for Data Science Solutions [video] by Yu-Wei, Chiu (David Chiu)
  • Mastering R Programming [video] by Selva Prabhakaran
  • Data Analysis with R by Tony Fischetti
  • R Data Analysis Cookbook by Viswa Viswanathan and Shanthi Viswanathan
  • Learning Data Mining with R [video] by Romeo Kienzler


Updated on 14 February, 2018