Democratize Data Science

Every once in a while, I would come across an article that decries online data science courses and boot camps as pathways towards getting a data science job. Most of the articles aim not to discourage but serve as a reminder to take a hard look in the mirror first and realize what we’re up against. However, a few detractors have proclaimed that the proliferation of these online courses and boot camps have caused the degradation of the profession.

To the latter, I vehemently disagree.

Bridging the Skill Gap

Data science have captured popular imagination ever since Harvard Business Review dubbed data scientist as the sexiest job of the 21st century. More than seven years later, data science remains one of the most highly sought-after job markets today. In fact, due to the dynamics of supply and demand, “the United States alone is projected to face a shortfall of some 250,000 data scientists by 2024¹.”

As a result, capitalism and entrepreneurship answered the call and companies like Codeup have vowed to “help bridge the gap between companies and people wanting to enter the field.”²

In addition, AutoML libraries like PyCaret are “democratizing machine learning and the use of advanced analytics by providing free, open-source, and low-code machine learning solution for business analysts, domain experts, citizen data scientists, and experienced data scientists”³.

The availability of online courses, boot camps, and AutoML libraries has led a lot of data scientists to raise their brows. They fear that boot camp alumni and self-taught candidates would somehow lower the overall caliber of data scientists and disgrace the field. Furthermore, they are afraid that the availability of tools like AutoML would allow anyone to be a data scientist.

I mean, God forbid if anyone thinks that they too can be data scientists! Right?

Wrong.

The Street Smart Data Scientist

Alumni of boot camps and self-taught learners, like myself, have one thing going for them: our rookie smarts. To quote Liz Wiseman, author of the book Rookie Smarts:

In a rapidly changing world, experience can be a curse. Being new, naïve, and even clueless can be an asset. — Liz Wiseman

Rookies are unencumbered. We are alert and constantly seeking like hunter-gatherers, cautious but quick like firewalkers, and hungry and relentless like frontiersmen⁴. In other words, we’re street smart.

Many are so bogged down by “you’ve got to learn this” and “you’ve got learn that” that they forget to stress the fact that data science is so vast that you can’t possibly know everything about anything. And that’s okay.

We learn fast and adapt quickly.

At the end of the day, it’s all about the value that we bring to our organizations. They are, after all, the ones paying our bills. We don’t get paid to memorize formulas or by knowing how to code an algorithm from scratch.

We get paid to solve problems.

And this is where the street smart data scientist excels. We don’t suffer from analysis paralysis or be bothered with theories, at least not while on the clock. Our center of focus is based on pragmatic solutions to problems, not on academic debate.

This is not to say we’re not interested in the latest research. In fact, it’s quite the contrary. We are voracious consumers of the latest development in machine learning and AI. We drool over the latest development in natural language processing. And we’re always on the lookout for the latest tool that will make our jobs easier and less boring.

And AutoML

So what if we have to use AutoML? If it gets us to an automatic pipeline where analysts can get the results of machine learning without manual intervention by a data scientist, the better. We’re not threatened by automation, we’re exhilarated by it!

Do not let perfection be the enemy of progress. — Winston Churchill

By building an automatic pipeline, there’s bound to be some tradeoffs. But building it this way will free our brain cells and gives us more time to focus on solving other higher-level problems and produce more impactful solutions.

We’re not concerned about job security, because we know that it doesn’t exist. What we do know is that the more value we bring to a business, the better we will be in the long run.

Maybe They’re Right?

After all this, I will concede a bit. For the sake of argument, maybe they’re right. Maybe online courses, boot camps, and low-code machine learning libraries really do produce low-caliber data scientists.

Big maybe.

But still, I argue, this doesn’t mean we don’t have value. Data science skills lie on a spectrum and so does companies’ maturity when it comes to data. Why hire a six-figure employee when your organization barely has a recognizable machine learning infrastructure?

Again, maybe.

The Unicorn

Maybe, to be labeled as a data scientist, one must be a unicorn first. A unicorn data scientist is a data scientist who excels at all facets of data science.

Image for post
Hckum / CC BY-SA (https://creativecommons.org/licenses/by-sa/4.0)

Data science has long been described as the intersection between computer science, applied statistics, and business or domain knowledge. To this, they ask, how can one person possibly accumulate all those knowledge into just a few months? To this, we also ask the same question, how can a college grad?

Unicorns do exist I believe, but they also have had to start from somewhere.

So why can’t we?

Conclusion

A whole slew of online courses and tools promise to democratize data science, and this is a good thing.

Thank you for reading. If you want to learn more about my journey from slacker to data scientist, check out the article From Slacker to Data Scientist: My journey into data science without a degree.

And if you’re thinking about switching gears and venture into data science, start thinking about rebranding now The Slacker’s Guide to Rebranding Yourself as a Data ScientistOpinionated advice for the rest of us. Love of math, optional.

Stay tuned!

You can reach me on Twitter or LinkedIn.

[1] Harvard Business Review. (June 3, 2020). Democratizing Data Science in Your Organization. https://hbr.org/sponsored/2019/04/democratizing-data-science-in-your-organization

[2] San Antonio Express-News. (June 3, 2020). Software development bootcamp Codeup launching new data science program. https://www.mysanantonio.com/business/technology/article/Software-development-bootcamp-Codeup-launching-13271597.php

[3] Towards Data Science. (June 4, 2020). Machine Learning in Power BI Using PyCaret. https://towardsdatascience.com/machine-learning-in-power-bi-using-pycaret-34307f09394a

[4] The Wiseman Group. (June 4, 2020). Rookie Smarts
Why Learning Beats Knowing in the New Game of Work. 
https://thewisemangroup.com/books/rookie-smarts/

This article was first published in Towards Data Science’ Medium publication.

Published by

Ednalyn C. De Dios

I’ve always been enamored with code and I love data science because of its inherent power to solve real problems. Having grown up in the Philippines, served in the United States Navy, and worked in the nonprofit sector, I am driven to make the world a better place. I have started and participated in numerous campaigns that aim to reduce domestic violence and child abuse in the community.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.