New Course: Amateur Data Science with Python

steve@thoughtsociety.org Course, General

Course Strategy

All of us have engaged in online learning in some form in the last several years. There is an e-learning explosion right now, and it has a not-so-obvious shortcoming: courses are either too basic or way over our heads. Where is the middle ground?

We are the middle

What we mean is that there is a middle stratum of working pros who have some experience and want to expand their expertise in a practical way, or pursue something a little beyond their experience. We want to learn in smaller bites, go faster and reach a level of proficiency without a complete review of the fundamentals every time. There are some prerequisites, of course, but there are already so many basic, intermediate and advanced Python courses that we will not be making another. The same goes for statistics, linear algebra and machine-learning.

Fundamentals

What we think works is that you go through the tracks in the ‘Amateur Data Science with Python’ course, figure out where your knowledge gaps are, peruse the resources, and then go out and acquire the requisite know-how yourself. The course will note what level of knowledge of the basics is required to get through it.

What I can tell you is that I had none of these fundamentals before I embarked on a personal quest to learn Python, A.I. and Machine-Learning. I rapidly taught myself Python 2 and then 3, added data wrangling and visualization, and then Machine-Learning. This does require an understanding of some higher math and statistics, but my lack of it didn’t stop me from actually doing cool things with analytics and machine-learning. I was pretty surprised at how many libraries for performing the analytics existed and how easy they were to incorporate into pretty basic Python scripts. Without Jupyter notebooks, this would have been much harder.

Learn Visualization Techniques

With some basics under my belt, I set out to learn visualization techniques. When you get comfortable with Python and Pandas, plotting becomes one of the most valuable arrows in your quiver. There is a hurdle to getting comfortable with the plotting libraries, but once you choose one, you can master it quickly. I found that practice makes perfect, and with the great communities associated with these libraries, you can easily get help solving just about any problem you come across. Stack Overflow helps a lot in this quest as well.

Statistics

Too many years passed between learning basic statistics and using them in my career, and that knowledge eroded. Since then, I have been exposed to the usefulness of statistics and the logic behind them. In engineering and product management, there are lots of numbers flying around, and as we go forward in our careers, this data gets very dense and ubiquitous. In very short order, I picked up a lot more intuition by using Pandas and Seaborn to analyze data. Tableau and other visualization apps with freemium versions made this much more accessible. I strongly recommend getting Tableau Public installed so you too can experiment with statistics without having to take a course on it beforehand.
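As a small illustration of how quickly Pandas surfaces basic statistics, here is a minimal sketch. The data, the column names (`team`, `build_minutes`) and the grouping are all invented for this example and are not from the course material.

```python
import pandas as pd

# Hypothetical measurements, invented purely for illustration.
df = pd.DataFrame({
    "team": ["A", "A", "A", "B", "B", "B"],
    "build_minutes": [12.0, 15.0, 11.0, 20.0, 22.0, 30.0],
})

# Count, mean, standard deviation, min, quartiles and max in one call.
summary = df["build_minutes"].describe()
print(summary)

# Mean and median per group, to compare the two teams side by side.
per_team = df.groupby("team")["build_minutes"].agg(["mean", "median"])
print(per_team)
```

From here, the same DataFrame could be handed to a plotting library such as Seaborn to look at the distributions visually.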

Machine-Learning

We have all probably heard of Machine-Learning by now. It is an integral part of Data Science and has gained momentum in the last several years. What drove its rise is the increasing power of computing, a result of distributed cloud computing and low-cost processors and memory.

Machine-Learning makes it possible to glean useful insights from large amounts of data. How? By using different methods. In ‘supervised learning’, a model is ‘trained’ on data that includes the answers we want to predict, and is then used to predict those answers for new data that lacks them. In ‘unsupervised learning’, we train the model on data without any answers and it finds structure, such as groupings, on its own.
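To make the distinction concrete, here is a minimal sketch using scikit-learn. The library choice, the synthetic data and the two algorithms (k-nearest neighbors and k-means) are assumptions picked for illustration; many other models would serve equally well.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data: 150 points in 3 well-separated groups.
# X holds the features, y holds the "answers" (group 0, 1 or 2).
X, y = make_blobs(
    n_samples=150,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.5,
    random_state=0,
)

# Supervised: the model is trained on features AND answers,
# then asked to predict answers for new, unlabeled points.
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
new_points = [[-5.0, -5.0], [5.0, -5.0]]
predicted = clf.predict(new_points)
print(predicted)

# Unsupervised: the model sees only the features and
# assigns each point to one of 3 clusters on its own.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_
print(cluster_ids[:10])
```

Note that the cluster numbers k-means assigns are arbitrary: it discovers the groups but has no way of knowing what we would have called them.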

Sound cool yet? ML arguably didn’t take off until much more data started to become available. Makes sense, right? At about the same time, more ways to make use of predictive analytics became practical, driving businesses to fund new R&D at a faster pace. Just look at social media, Facebook especially. Its crazy ad model uses ML to help advertisers target potential customers on its captive network. That is as close to a money-printing facility as you will find.

Financial analysts and ‘quants’, or quantitative analysts, have been using statistical analysis to predict stock and bond prices for years. Only in the last several years have these geniuses picked up ML techniques to drive accuracy way up and to actually estimate that accuracy ahead of time. That is valuable. You will find that much of ML involves estimating the accuracy of models in order to get predictions as close to 100% correct as possible. Knowing the accuracy that your model attains is half the battle of building the model in the first place.
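One common way to estimate a model's accuracy before trusting it on new data is to hold back part of the labeled data and score the model on it. The sketch below assumes scikit-learn, its bundled iris dataset and a logistic regression model. These are stand-ins chosen for illustration, not a description of what financial analysts actually use.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

# Small labeled dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Fraction of held-out rows the model labels correctly.
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {test_accuracy:.2f}")

# Cross-validation repeats the split 5 ways for a steadier estimate.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(f"5-fold mean accuracy: {cv_scores.mean():.2f}")
```

The held-out score is only an estimate: it tells you how the model did on data it had not seen, which is the best available proxy for how it will do on truly new data.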
