Getting Started In Data Science
April 23, 2015

This post was written by Lillian Pierson, author of the recently published Data Science for Dummies, featuring a foreword by our own Jake Porway! Learn more about Lillian here and read below for some of her favorite resources for those just starting to learn about data science.

DataKind has been getting lots of requests for information on good places to learn how to do data science, and that makes sense: data science rocks! Especially when it’s being done for the good of humanity. So, by popular demand, today’s post offers an overview of some programs that can help you get started. I’ve tried to present something for everyone, from free programs to more expensive ones.

The Open Source Data Science Masters is a website that’s dedicated to teaching people how to do data science. Their main priority is to help people learn to work with data and make data useful. They achieve this mostly by directing visitors to books they can buy or free online courses they can take to learn about the different topics that comprise data science. A benefit to choosing to learn via the Open Source Data Science Masters curriculum recommendations is that you get to learn at your own pace, and are offered a variety of resources to help you learn each subject. A drawback is that Open Source Data Science Masters doesn’t really offer any structured, engaging learning plan or a forum through which you can interact with fellow students or course teachers.

Coursera is another good place to go to learn to do data science. Through Coursera, you can take classes in topics like “Data Visualization,” “The Data Scientist’s Toolbox,” and “Practical Machine Learning.” The two best things about Coursera are that (a) you can learn for free and (b) you get to virtually attend classes that are offered by some of the best universities in the world (including Columbia, Johns Hopkins, and Vanderbilt). Another benefit is that Coursera courses are self-paced and, if you’re willing to pay $49, you can get a certificate of completion when you’re done. A drawback of Coursera courses is that they can be excessively difficult… think of a weed-out course in freshman year. Some have argued that you might as well just go back to college and get a piece of paper for all the time and trouble these courses require.


Data-Mania’s courses in data science are offered at a mid-level price range. They are designed to deliver a structured and engaging curriculum that helps career professionals and recent graduates from outside data science learn the fundamental skills they need to begin applying data science methods to problems in their respective fields. The courses offer a dynamic, well-structured curriculum, fun instructor-led videos, long-form outlines, and short-form cheat sheets. Although self-paced, the program is designed to get you trained in statistics, R, and Python for data science in about 12 weeks at an average of about 5 hours per week. The drawback of Data-Mania’s program is that it doesn’t offer a formal certification, but for a fraction of the cost of more formal programs, it’s a great place to get the job done and have fun while you’re doing it.


EMC2 offers a Data Science Associate Program for people who are looking for a piece of paper proving that they have taken classes in data science. The EMC2 program is rather formal and offers the following delivery modules: streaming, USB flash drive, and instructor-led. The instructor-led course “provides grounding in basic and advanced analytic methods and an introduction to big data analytics technology and tools, including MapReduce and Hadoop.” The course provides students 40 hours of training for $5,000, so you’re looking at paying about $125/hour, but for this price you’ll also get the education you need to take a test and get “Proven” by the EMCDSA certificate. This allows you to add a formalized credential to the “Education” section of your resume. Other delivery modules are less expensive. The chief drawback to the EMC2 program is that it’s rather formal and costly. This program is really designed for people who need or want an additional certification.

I know there are way more resources out there! Which ones have you found most helpful? Let us know in the comments below or tweet 'em out @DataKind!
