Data Science Explained with a Venn Diagram (Data Science for Dummies: Part 1)
What the hell is data science, you might ask? You've come to the right article, my online friend!
This is the first part of my Data Science for Dummies series. If you like the content, make sure to follow and give a clap!
Data science is so happening right now. Many companies have job vacancies in data science, and coding bootcamps have started offering data science courses. Data science is considered one of the hottest jobs right now, and more and more industries need data scientists.
This raises the question: why is data science in such high demand?
The world is constantly changing and moving, and the internet is at the center of it all. The internet consists of apps, websites, and many other things. All of these websites and apps collect data from their users. The more they change and grow, the bigger the data gets. That's why there's been so much buzz about big data. This data is a gold mine, or as many people say, data is the new oil. To use a machine analogy: AI is the generator, and data is the oil that keeps it running. The bigger the data, the bigger the chance of finding an insight in it. An insight can be a prediction or a piece of domain knowledge (knowledge about the data). These insights are very valuable for a company that wants to stay ahead of the competition in this era. The job of data science is to deliver those insights. That's why data science is in such high demand.
The simplest way to understand what data science is, is to look at a Venn diagram. This Venn diagram was created by Drew Conway. So, thanks, Drew Conway!
Data Science Venn Diagram
1. Coding/Hacking Skills
The first chunk of data science is coding. Coding is required to gather the data. Data is everywhere: you can find it in many websites and apps. Sometimes this data comes in different formats or has to be extracted from somewhere. This is where coding comes in. Coding is required to extract the data and prepare it. Coding is also used to build a prediction model from the data. Data scientists often build prediction models to find insights and to predict what the next data will look like. A common tool for building prediction models in Python is Scikit-learn. The most common programming languages for data science are Python and R. SQL is also needed for working with databases. So, coding skills are a major part of data science: you need them to gather and manipulate the data and to build a prediction model that gives you insight.
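To make the "extract and prepare" part a bit more concrete, here is a minimal sketch that uses only the Python standard library. The messy CSV text, the column names, and the `app_usage` table are all made up for illustration. The idea is just to show raw data being cleaned with Python and then summarized with SQL.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: inconsistent casing, stray spaces, one missing value.
RAW_CSV = """user_id,country,minutes_in_app
1, ID ,34
2,us,12
3,Id,
4,US,57
"""


def extract_rows(raw_text):
    """Parse the CSV text and clean each row; rows with no minutes are skipped."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        minutes = record["minutes_in_app"].strip()
        if not minutes:
            continue  # drop incomplete rows instead of guessing a value
        country = record["country"].strip().upper()
        rows.append((int(record["user_id"]), country, int(minutes)))
    return rows


def average_minutes_by_country(rows):
    """Load the cleaned rows into an in-memory SQLite table and aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE app_usage (user_id INTEGER, country TEXT, minutes INTEGER)"
    )
    conn.executemany("INSERT INTO app_usage VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT country, AVG(minutes) FROM app_usage "
        "GROUP BY country ORDER BY country"
    ).fetchall()
    conn.close()
    return result


rows = extract_rows(RAW_CSV)
averages = average_minutes_by_country(rows)
print(averages)
```

In a real project the data would come from files, APIs, or a production database rather than a string, but the steps are the same: extract, clean, then query.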
2. Math and Statistics Knowledge
Math is basically the foundation of everything tech related. It is also the foundation of statistics, and statistics plays an important role in processing data. Math and statistics give you probability, distributions, regression, and so on. These are useful for getting insight into the data before you build a prediction model. Math and statistics are very helpful for understanding what type of data you have, for data pre-processing, and for feature engineering (data pre-processing and feature engineering will be explained in the next article). They can also be used to find problems in the data. That's why math and statistical knowledge is crucial for data science.
3. Domain Knowledge or Substantive Expertise
Domain knowledge is knowledge about a certain field. It is very useful to data science because sometimes a data science approach can't be applied the way you'd like because of how the field works. Using data science without domain knowledge is a problem because we can draw the wrong conclusions from the data. With domain knowledge, you know the goals, the methods, and the constraints of the field before you implement a data science model or machine learning. The objective of domain knowledge in data science is to make sure the model or the insight can actually be applied well in the field. So, domain knowledge is very valuable to data science when it comes to implementation.
That's the big 3 of data science. Often, one person can't cover all 3 skills. Finding someone who can is like finding a unicorn. It's very hard to find a person who is an expert in all 3. That's why data science is often done by a team.
That's it! That's all you need to know about what data science is.
If you like this kind of content, please let me know! You can find me on Instagram or Twitter @jeffsabarman.