datavaluepeople logo

Compiled advice on learning Data Science, Machine Learning, and Data Engineering

Written by Daniel Burkhardt Cerigo

We often speak to people who are looking to get into the data science, machine learning, and data engineering professions. The following is a compiled set of that advice. We will extend this article as we give further unique advice. Find changes and history in github.com/datavaluepeople/dvp-landing-static.

Advice for moving from PhD to industry data science job - 2023-03-01

So here’s a bunch of stuff, hopefully there some usefulness in there:

Advice for going from scratch to an industry data (engineering) job - 2023-02-01

Note; parts of the following are also relevant for going from scratch to any data/ML job, and some are relevant for any dev job.

= Fundamental things definitely to learn =

  • Using the command line (also called the “terminal”). This is Linux (Unix on Mac). Probably try not to learn the Windows command line, as it is not used as much in the industry as Linux is. Starting point: https://ubuntu.com/tutorials/command-line-for-beginners#1-overview You can play with real Linux computers to learn and practice on using these free online ones: https://linuxcontainers.org/lxd/try-it/
  • Git - this is the version control tool used in 99% of places - a required tool for writing software. Think of it as like a programmer’s chef’s knife - so you should get real familiar with it. Starting point: https://git-scm.com/doc
  • SQL - get solid at this, you’re definitely going to be using it. Any of the common flavours will do, but if you’re not sure stick with PostgreSQL (they are all super similar though so). Starting point: https://www.postgresql.org/docs/online-resources/

= Things you then might want to learn =

  • Python - don’t worry about making it too data orientated. Just get familiar with Python and programming like basic toy stuff to begin with. You don’t have to know python to be a data engineer, just there’s a lot of companies that use python in data engineering and just generally, so it is valuable skill to have. Staring point: https//www.python.org/about/gettingstarted/ (There’s more but this is enough to start with.)

Learning how to search for stuff/Google well, and what references are good to follow and what are not is a really important skill to develop. So see that as part of the learning process. Googling “How to learn python effectively?” and stuff like that, and reading about where, and how, best to learn it, is really good usage of time. So try stuff out, read articles, checkout youtube vids, skim through some free online courses - explore until you find what works for you. For me Wikipedia is always a good concise place to read up on a subject first, for example for “command line”, checkout this https://en.wikipedia.org/wiki/Command-line_interface. Then the other things is trying to find and read official documentation for stuff - this is often better than all the tutorial stuff that is just trying to rank high on Google for clicks or sell you a course or something but is often not the most high quality - I added the links to official sites as starting points in the above.

The other thing to really really be conscious of is, there will be moments, especially nearer the beginning, where you’re gonna be like “what the actual fuck is all this random shit, I have no idea what is going on” - that is totally normal. Getting a grasp of it all is going to be a very non-linear process. Reading stuff, and just being curious about it all, even if you’re a bit lost, but continuing and going down rabbit holes and reading random stuff and slowly slowly building up all the connections and piecing the bits all together is how it is going to be, so keep that in mind so as not to get demotivated.

datavaluepeople’s technical interview questions for client hiring - 2021-01-01

We often support clients with their hiring of perm data scientists/engineers etc, but carrying out technical interviews with their candidates. The questions can serve a good guide on what areas one might need to improve their knowledge and experience in. Drop us an email and we’ll send you them. (Not posting them publicly as we still want them to be useful for interviews!)

Daniel Burkhardt Cerigo

Written by Daniel Burkhardt Cerigo

April 03, 2023

datavaluepeople is a group of artificial intelligence experts. Through applied machine learning, building automated systems, advising, and education, we create value for businesses, organizations, and humans. Drop us an email to speak to us about how we could work with your organisation, or if you are interested in joining our team.

linkedIn icongithub icon
Continue reading