Data Science Specialization
Launch Your Career in Data Science. A ten-course introduction to data science, developed and taught by leading professors.
What you will learn
Use R to clean, analyze, and visualize data.
Navigate the entire data science pipeline from data acquisition to publication.
Use GitHub to manage data science projects.
Perform regression analysis, least squares and inference using regression models.
About this Specialization
You should have beginner-level experience in Python. Familiarity with regression is recommended.
There are 10 Courses in this Specialization
The Data Scientist’s Toolbox
In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.
R Programming
In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure the software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.
Getting and Cleaning Data
Before you can work with data, you have to get some. This course covers the basic ways that data can be obtained: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course also covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data, along with the basics needed for collecting, cleaning, and sharing data.
Exploratory Data Analysis
This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.
Offered by

Johns Hopkins University
The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.

