Data Science Full Course for Beginners
Hi all, I welcome you to this data science session.
Data science is one of the most revolutionary technologies of the era. It's all about deriving useful insights from data in order to solve complex real-world problems. Let's take a look at the agenda:
1. Introduction to Data Science.
2. Statistics and Probability.
3. Basics of Machine Learning.
4. Linear Regression.
5. Logistic Regression.
6. Decision Trees.
7. Random Forest.
8. K-Nearest Neighbours.
9. Naive Bayes.
10. Support Vector Machine.
11. K-Means Clustering.
12. Association Rule Mining.
13. Reinforcement Learning.
14. Deep Learning.
15. Interview Questions.
1. Introduction to data science:
Now, data science matters this much probably because we're generating data at an unstoppable pace, and obviously we need to process and make sense of this much data. This is exactly where data science comes in. In today's session, we'll be talking about data science in depth, so let's move ahead and take a look at today's agenda. We're going to begin by discussing the various sources of data and how the evolution of technology and the introduction of IoT and social media have led to the need for data science.
Next, we'll discuss how Walmart is using insightful patterns from their database to increase the potential of their business. After that, we will see what exactly data science is. Then we'll move on and discuss who a data scientist is, where we will also discuss the various skill sets needed to become a data scientist.
Next, we will move on to the various data science job roles, such as data analyst, data architect, data engineer, and so on. After this, we will cover the data life cycle, where we will discuss how data is extracted, processed, and finally used as a solution. Once we're done with that, we'll cover the basics of machine learning, where we'll see what exactly machine learning is and the different types of machine learning.
Next, we will move on to the k-means algorithm and discuss a use case of k-means clustering, after which we'll discuss the various steps involved in the k-means algorithm. Then we will finally move on to the hands-on part, where we use the k-means algorithm to cluster movies based on their popularity on social media platforms like Facebook. At the end of today's session, we'll also discuss what a data science certification is and why you should take it up.
So, guys, there's a lot to cover in today's session. Let's jump into the first topic. Do you guys remember the times when we had telephones and had to go to PCO booths in order to make a phone call? Now, those times were very simple because we didn't generate a lot of data. We didn't even store contacts on our phones or telephones.
We used to memorize phone numbers back then or, you know, keep a diary of all our contacts. But these days we have smartphones, which store a lot of data. So there's everything about us on our mobile phones. We have images, we have contacts, we have various apps, we have games. Everything is stored on mobile phones these days. Similarly, the PCs that we used in earlier times used to process very little data.
All right, there wasn't a lot of data processing needed because the technology hadn't evolved that much. So if you guys remember, we used floppy disks back then, and floppy disks were used to store small amounts of data. Later on, hard disks were created, and those could store gigabytes of data. But now if you look around, there's data everywhere around us.
All right, we have data stored in the cloud. We have data in each and every appliance in our houses. Similarly, if you look at smart cars these days, they're connected to the internet and to mobile phones, and this also generates a lot of data. What we don't realize is that the evolution of technology has generated a lot of data.
Now, initially there was very little data, and most of it was structured. Only a small part of the data was unstructured or semi-structured. In those days, you could use simple BI tools to process all of this data and make sense of it.
But now we have way too much data, and in order to process this much data we need more complex algorithms. We need a better process. And this is where data science comes in. Now guys, I'm not going to get into the depth of data science yet. I'm sure all of you have heard of IoT, or the Internet of Things.
Now, did you guys know that we produce 2.5 quintillion bytes of data each day? And this is only accelerating with the growth of IoT. IoT, or the Internet of Things, is just a fancy term that we use for a network of tools or devices that communicate and transfer data through the internet. So various devices are connected to each other through the internet, and they communicate with each other. This communication happens by the exchange of data or by the generation of data.
Now, these devices include the vehicles we drive, our TVs, coffee machines, refrigerators, washing machines, and almost everything else that we use on a daily basis. These interconnected devices produce an unimaginable amount of data. Guys, IoT data is measured in zettabytes, and one zettabyte is equal to a trillion gigabytes.
So according to a recent survey by Cisco, it's estimated that by the end of 2019, which is almost here, the IoT will generate more than five hundred zettabytes of data per year. And this number will only increase over time. It's hard to imagine data in that much volume. Imagine processing, analyzing, and managing this much data.
It's only going to cause us a migraine. So guys, having to deal with this much data is not something that traditional BI tools can do. We can no longer rely on traditional data processing methods. That's exactly why we need data science. It's our only hope right now. Let's not get into the details here.
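To put the units mentioned above in perspective, here is a minimal Python sketch that converts the quoted figures using decimal (SI) byte units. The constant and variable names are made up for illustration, and the input numbers are simply the ones quoted in this session.

```python
# Decimal (SI) byte units; names here are illustrative only
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21

# Figures quoted in the session
daily_bytes = 2_500_000_000_000_000_000  # "2.5 quintillion bytes" per day
iot_zettabytes_per_year = 500            # estimate attributed to Cisco

gb_per_zettabyte = ZETTABYTE // GIGABYTE            # a trillion gigabytes
daily_exabytes = daily_bytes / EXABYTE              # same daily figure in exabytes
iot_gb_per_year = iot_zettabytes_per_year * gb_per_zettabyte

print(f"1 zettabyte = {gb_per_zettabyte:,} GB")
print(f"Daily data = {daily_exabytes} exabytes")
print(f"IoT data per year = {iot_gb_per_year:,} GB")
```

So 2.5 quintillion bytes is the same as 2.5 exabytes, and a single zettabyte is a thousand exabytes.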
How social media is adding to the generation of data:
Now, the fact that we are all in love with social media is actually generating a lot of data for us. Okay, it's certainly one of the fuels for data creation. All these numbers that you see on the screen are generated every minute of the day.
Okay, and these numbers are just going to increase. So for Instagram, it says that approximately 1.7 million pictures are uploaded in a minute, and similarly on Twitter, approximately a hundred and forty-eight thousand tweets are published every minute of the day.
So guys, imagine how much that would be in one hour, and then imagine 24 hours. This is the amount of data that is generated through social media. It's unimaginable. Imagine processing this much data, analyzing it, and then trying to figure out, you know, the important insights from it. Analyzing this much data is going to be very hard with traditional tools or traditional methods.
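As a quick illustration of the "imagine one hour, then 24 hours" point, the per-minute figures quoted above can be scaled up with a few lines of Python. This is only a sketch; the function name `scale` and the dictionary layout are assumptions made for the example, and the rates are the approximate numbers quoted in this session.

```python
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24

# Approximate per-minute figures quoted in the session
per_minute = {
    "Instagram pictures": 1_700_000,
    "Tweets": 148_000,
}

def scale(rate_per_minute):
    """Return (per_hour, per_day) totals for a constant per-minute rate."""
    per_hour = rate_per_minute * MINUTES_PER_HOUR
    per_day = per_hour * HOURS_PER_DAY
    return per_hour, per_day

for name, rate in per_minute.items():
    per_hour, per_day = scale(rate)
    print(f"{name}: {per_hour:,} per hour, {per_day:,} per day")
```

At these rates, Instagram alone would see roughly 102 million pictures an hour and about 2.4 billion a day.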
That's why data science was introduced. Data science is a simple process that extracts useful information from data. All right, it processes and analyzes the entire data, and then it extracts what is needed. Now guys, apart from social media and IoT, there are other factors as well which contribute to data generation. These days all our transactions are done online, right?
We pay bills online. We shop online. We even buy homes online. These days you can even sell your pets on OLX. Not only that, when we stream music and watch videos on YouTube, all of this is generating a lot of data. Not to forget, we've also brought healthcare into the internet world. Now there are various watches like Fitbit, which basically track our heart rate and generate data about our health condition. Education is also an online thing right now. That's exactly what you are doing right now. So with the emergence of the internet, we now perform all our activities online.
Okay, obviously this is helping us, but we are unaware of how much data we are generating and what can be done with all of this data. What if we could use the data that we generate to our benefit? Well, that's exactly what data science does. Data science is all about extracting useful insights from data and using them to grow your business.
Details of Data Science:
Let's see how Walmart uses data science to grow their business. So, guys, Walmart is the world's biggest retailer, with over 20,000 stores in 28 countries. Okay, now, it's currently building the world's biggest private cloud, which will be able to process 2.5 petabytes of data every hour. The reason behind Walmart's success is how they use customer data to get useful insights about customers' shopping patterns.
Now, the data analysts and the data scientists at Walmart know every detail about their customers.
They know that if a customer buys Pop-Tarts, they might also buy cookies. How do they know all of this?
Like, how do they generate information like this? Well, they use the data that they get from their customers and analyze it to see what a particular customer is looking for.
Let's look at a few cases where Walmart actually analyzed the data and figured out the customers' needs. So let's consider the Halloween cookie sales example. During Halloween, a sales analyst at Walmart took a look at the data.
Okay, and he found out that a specific cookie was popular across all Walmart stores. So every Walmart store was selling these cookies very well, but he found out that there were two stores that were not selling them at all. So the situation was immediately investigated, and it was found that there was a simple stocking oversight. Okay, because of this the cookies had not been put on the shelves for sale.
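The session does not say how the analyst spotted the two stores, but one simple way to catch this kind of gap is to compare each store's sales with the typical store. Here is a hedged Python sketch: the function name `flag_underperformers`, the 20% threshold, and the store names and sales figures are all made up for illustration and are not Walmart's data or method.

```python
from statistics import median

def flag_underperformers(sales_by_store, threshold=0.2):
    """Return store names whose unit sales fall below threshold * median sales."""
    typical = median(sales_by_store.values())
    return sorted(
        store
        for store, units in sales_by_store.items()
        if units < threshold * typical
    )

# Hypothetical unit sales of one cookie per store (toy data)
sales = {
    "store_A": 480,
    "store_B": 510,
    "store_C": 0,
    "store_D": 495,
    "store_E": 3,
    "store_F": 505,
}

print(flag_underperformers(sales))
```

The median is used instead of the mean so that the few stores with almost no sales do not drag down the baseline they are compared against.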
So because this issue was immediately identified, they prevented any further loss of sales. Now, another such example is that through association rule mining, Walmart found out that strawberry Pop-Tart sales increased by seven times before a hurricane.
So a data analyst at Walmart identified the association between hurricanes and strawberry Pop-Tarts through data mining. Now guys, don't ask me the relationship between Pop-Tarts and hurricanes, but for some reason, whenever there was a hurricane approaching, people really wanted to eat strawberry Pop-Tarts.
So what Walmart did was place all the strawberry Pop-Tarts at the checkout before a hurricane would occur. This way they increased the sales of their Pop-Tarts. Now guys, this is an actual thing. I'm not making it up. You can look it up on the internet. Not only that, Walmart is analyzing the data generated by social media to find out all the trending products.
Through social media, you can find out the likes and dislikes of a person, right?
So Walmart is quite smart. They use the data generated by social media to find out what products are trending or what products are liked by customers. Okay, an example of this is that Walmart analyzed social media data and found out that Facebook users were crazy about cake pops. So Walmart immediately took a decision and introduced cake pops into their stores. So guys, the main reason Walmart is so successful is the huge amount of data that they get. They don't see it as a burden. Instead, they process this data, analyze it, and then try to draw useful insights from it.
Okay, so they invest a lot of money, a lot of effort, and a lot of time in data analysis. They spend a lot of time analyzing data in order to find any hidden patterns. So as soon as they find hidden patterns or associations between any two products, they start giving out offers or discounts or something along those lines.
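An association between two products, like the Pop-Tarts and cookies example mentioned earlier, is usually measured with support, confidence, and lift. Below is a minimal Python sketch of those three measures for a single rule. The function name `rule_metrics` and the baskets are invented for illustration; this is not Walmart's data or their actual pipeline.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute support, confidence and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    has_a = sum(1 for t in transactions if antecedent in t)
    has_c = sum(1 for t in transactions if consequent in t)
    has_both = sum(1 for t in transactions if antecedent in t and consequent in t)

    support = has_both / n                               # share of baskets with both items
    confidence = has_both / has_a if has_a else 0.0      # P(consequent | antecedent)
    lift = confidence / (has_c / n) if has_c else 0.0    # confidence vs. baseline P(consequent)
    return support, confidence, lift

# Hypothetical shopping baskets (toy data)
baskets = [
    {"pop_tarts", "cookies", "milk"},
    {"pop_tarts", "cookies"},
    {"pop_tarts", "bread"},
    {"cookies", "milk"},
    {"bread", "milk"},
    {"pop_tarts", "cookies", "bread"},
    {"milk"},
    {"bread"},
]

support, confidence, lift = rule_metrics(baskets, "pop_tarts", "cookies")
print(f"support={support}, confidence={confidence}, lift={lift}")
```

A lift above 1 suggests the two products are bought together more often than chance alone would explain, which is the kind of signal that can justify a joint offer or discount.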
So basically, Walmart uses data in a very effective manner. They analyze it very well, they process it very well, and they find the useful insights that they need in order to get more customers or to improve their business. So, guys, this was all about how Walmart uses data science. Now, let's move ahead and look at what data science is. Data science is all about uncovering findings from data. It's all about surfacing the hidden insights that can help companies make smart business decisions.
So all these hidden insights or hidden patterns can be used to make better decisions in business. Another example of this is Netflix.