General advice

Keep in mind as you do online research (including here) that there is no one ordained way to get into the field. There is a plethora of sources out there, and as you scour them, keep this principle in mind (borrowing from Tolstoy):

“All [successful approaches to getting into data science] are alike; each [unsuccessful approach to getting into data science] is [unsuccessful] in its own way.”

Good advice tends to share similar themes, like taking the initiative and reading a book, completing a challenge, participating in local meetups, etc.

Here are a few good resources to get you started:

Getting set up

Often just getting set up is the biggest hurdle, but once you’ve got an environment to start coding in, you’re off to the races. Still, it can be scary diving into something new like this. If you want to get set up but need help, I recommend going to one of the meetups listed [here]() or reaching out to me, and I’ll help you (for real).


To start coding in Python on your machine, I recommend installing either Anaconda or Miniconda. Both give you Python and the conda package manager; the difference is that Miniconda is a much smaller download that ships with only a minimal set of packages, and you install the rest as you need them. This has been my go-to tool for keeping a separate Python working environment on my machine. It makes it easy to create virtual environments and work within them. I strongly recommend installing only the Python 3.x version.

If you have a Windows machine, this tutorial will help you install Anaconda/Miniconda. It has screenshots that will walk you through each step of the way.

Then install Jupyter Notebook or JupyterLab. Either one is a fantastic tool for prototyping code and for getting used to working in Python, especially if you’re new to it.

With these installed, you’ll be ready to start developing on your machine.
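A quick way to confirm the setup worked is to run a small check in a fresh notebook cell (or in a plain Python session). The sketch below is only one way to do it, and the helper name `describe_environment` is my own, not something Anaconda or Jupyter provides. It reports which Python version is running and which environment folder it is running from, so you can tell whether you are inside the environment you meant to use.

```python
import sys


def describe_environment():
    """Return basic facts about the Python interpreter that is running."""
    return {
        # Should be 3 if you installed the Python 3.x version
        "major_version": sys.version_info.major,
        # Full version number, e.g. "3.x.y"
        "version": sys.version.split()[0],
        # Folder of the active environment
        "prefix": sys.prefix,
        # Path to the Python program actually being used
        "executable": sys.executable,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the printed prefix points at a folder you don't recognize, the notebook is probably running from a different environment than the one you created.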


Installing R and RStudio on your machine is the first step in learning how to work in R. Here are some links to help you set up both.

Check out this page in the edX Foundations of Data course. It has step-by-step instructions for downloading and installing R and RStudio for Windows and Mac users.

If you already have R installed, you can download RStudio from RStudio’s products download page. Download the free version.

With both installed, you’ll be ready to start developing on your machine.


SQL is a great language to learn because it introduces you to the world of databases. It might not sound as “sexy” as learning deep learning or Spark, but learning the universal language of SQL is your ticket to understanding how to work with databases anywhere.

To get started, I recommend checking out these online resources, which will get you familiar with the language and its concepts.

  • SQLZoo. This is a very simple but effective online SQL tutorial tool. It will teach you the concepts and help you along, and the best thing is that if you get an answer right, a creepy emoji face pops up!
  • End-to-End Machine Learning’s SQL resources. Brandon Rohrer’s new collection of resources for learning SQL looks pretty awesome!
  • Select Star SQL. This is an interactive book that teaches you SQL chapter by chapter, using some interesting data.
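
If you’d like to try a query before opening any of these, one option is Python’s built-in `sqlite3` module, which gives you a small database without installing anything extra. The sketch below is just an illustration: the `meetups` table and every row in it are made up, and none of it comes from the resources above. It creates a table in memory, adds a few rows, and then asks SQL to total the attendees per city.

```python
import sqlite3

# An in-memory database: nothing is written to disk
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, made up purely for illustration
conn.execute("CREATE TABLE meetups (name TEXT, city TEXT, attendees INTEGER)")
conn.executemany(
    "INSERT INTO meetups VALUES (?, ?, ?)",
    [
        ("Python Beginners", "Austin", 40),
        ("R Users", "Austin", 25),
        ("SQL Night", "Denver", 30),
    ],
)

# The SQL itself: group rows by city and add up the attendees in each group
query = """
    SELECT city, SUM(attendees) AS total
    FROM meetups
    GROUP BY city
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()

for city, total in rows:
    print(city, total)

conn.close()
```

The `SELECT ... FROM ... GROUP BY ... ORDER BY` pattern here is standard SQL, so the same query shape carries over to other databases.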