Machine Learning Zoomcamp

Becoming a machine learning engineer means acquiring a long list of skills, and that difficulty is one reason jobs in data science and machine learning pay so well. You need a solid foundation in mathematics (linear algebra, calculus, and statistics) and a strong understanding of computer science fundamentals, including algorithms and data structures. You also need to know the languages commonly used in machine learning, such as Python or R, and be familiar with popular machine learning libraries and frameworks like TensorFlow or PyTorch. Then there are the core machine learning concepts of supervised and unsupervised learning, feature engineering, and model evaluation, along with the skills of manipulating, cleaning, and preprocessing structured and unstructured data using tools like Pandas, NumPy, and SciPy. On the engineering side, you need to be comfortable with version control systems like Git and with software development concepts like CI/CD. Although not strictly necessary for every role in machine learning, you may also be required to be knowledgeable in deep learning, which means delving into neural networks, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and other deep learning architectures.

Even though I possess most of the skills listed above, I have always been daunted by the requirements for becoming proficient as a machine learning engineer. When I found out about the new iteration of Alexey Grigorev’s Machine Learning Zoomcamp, I decided to take the opportunity to have a paced, weekly progression into the field. I was mostly interested in the deep learning and deployment side, since that is where most of my background was lacking. The course focuses on delivering enough theory to reach a working understanding of machine learning, and it takes a very hands-on approach to the topic. It also has the benefit of a large community on Slack that anyone can sign up to, a Google Docs FAQ, Q&A sessions on YouTube (for 2023) where you can interact with the instructors and get immediate answers to your questions, and a Telegram channel with Machine Learning Zoomcamp course announcements.

There are many other courses out there that do a much better job of explaining the theory, such as Andrew Ng’s great Machine Learning Specialization on Coursera, but I often find that the nitty-gritty side of machine learning is just as important, in my opinion at least. That said, one of the first lessons of the Machine Learning Zoomcamp, all of which are available on the DataTalks.Club YouTube channel, included an explanation that I found very interesting and surprising. It showed a different use of regularization in linear regression: handling very similar data inputs, for example near-duplicate features produced by a faulty sensor during data collection. In other words, regularization is useful beyond the case of model overfitting.
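To make that point concrete, here is a minimal sketch of my own (not the course's code) using NumPy on synthetic data. It assumes a hypothetical "faulty sensor" that records the same feature twice with a tiny amount of jitter, which makes the Gram matrix X^T X nearly singular. Adding a small value r to its diagonal, which is the ridge-style regularization, brings the condition number down and gives stable weights. The function names, the value of r, and the data are all illustrative assumptions, and for simplicity the model has no bias term.

```python
import numpy as np


def make_near_duplicate_data(n=200, seed=0):
    """Synthetic data: two almost identical feature columns (hypothetical faulty sensor)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    x_dup = x + 1e-8 * rng.normal(size=n)  # same reading plus tiny jitter
    X = np.column_stack([x, x_dup])
    y = 3.0 * x + rng.normal(scale=0.1, size=n)  # true relationship: y is about 3 * x
    return X, y


def gram_condition_number(X, r=0.0):
    """Condition number of X^T X + r * I; very large values mean a nearly singular matrix."""
    gram = X.T @ X + r * np.eye(X.shape[1])
    return np.linalg.cond(gram)


def fit_regularized(X, y, r):
    """Solve the regularized normal equation (X^T X + r * I) w = X^T y."""
    gram = X.T @ X + r * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ y)


X, y = make_near_duplicate_data()
print("condition number without regularization:", gram_condition_number(X))
print("condition number with r = 0.01:         ", gram_condition_number(X, r=0.01))
print("regularized weights:", fit_regularized(X, y, r=0.01))
```

With the regularization term, the two near-identical columns end up sharing the true coefficient between them instead of producing huge weights that cancel each other out. Without it, the solve step is numerically unreliable on data like this, which is why the sketch only compares condition numbers for the unregularized case.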

Participants are required to complete two projects for the course. Although they are strongly encouraged to do the weekly homework, it is not required to obtain a certificate of course completion. The part I found the most challenging was the second part of the course, on deep learning and deployment. In particular, the deep learning homework, which involved creating a new convolutional neural network (CNN) by modifying a previous neural network, took me quite a long time to complete. The second part of the Machine Learning Zoomcamp takes inspiration from the Stanford course on computer vision (CS231n: Deep Learning for Computer Vision), and Alexey suggests that it is an excellent course to learn from if you are interested in developing deep learning computer vision skills. There are other courses on the internet that take a very accommodating approach to teaching deep learning, such as Jeremy Howard’s Practical Deep Learning for Coders course at fast.ai. I think I may take on that course once I’m done with the Machine Learning Zoomcamp.

Overall, I have to say that the course, even though it is an introduction to the field, has given me more confidence to take on other similar courses and to tackle some projects that employ deep learning. Specifically, there is already a Kaggle machine learning project in natural language processing, which challenges participants to accurately match questions with their correct answers, that I would be very interested in doing. My homework and projects for the Machine Learning Zoomcamp are on my GitHub repository. Enjoy!