Statistics/ Statistics AND Probability/Data Science

Statistics And Probability:

So as we know, data can be collected, measured, analyzed, and visualized by using statistical models and graphs. Data is divided into two major subcategories: qualitative data and quantitative data. Under qualitative data, we have nominal and ordinal data, and under quantitative data, we have discrete and continuous data.

Qualitative data:


Now, this type of data deals with characteristics and descriptors that can't be easily measured but can be observed subjectively. Qualitative data is further divided into nominal and ordinal data. Nominal data is any sort of data that doesn't have any order or ranking.

Example: gender is nominal data. There is no ranking in gender; there's only male, female, or other. There is no one, two, three, four or any sort of ordering in gender. Race is another example of nominal data.

Now, ordinal data is basically an ordered series of information. Let's say that you went to a restaurant. Your information is stored in the form of a customer ID, so basically you are represented by a customer ID. Now, you would have rated their service as either good or average. That's what ordinal data is, and similarly, they'll have a record of other customers who visit the restaurant along with their ratings.
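To make the nominal versus ordinal distinction concrete, here is a minimal Python sketch. The customer IDs, the individual ratings, the extra "poor" level, and the numeric ranks are all made up for illustration; only the "good" and "average" ratings come from the restaurant example above.

```python
from collections import Counter

# Nominal data: labels with no inherent order, so counting them is
# meaningful but ranking them is not.
genders = ["female", "male", "other", "female", "male"]
gender_counts = Counter(genders)

# Ordinal data: labels with a meaningful order. The rank numbers are an
# assumption used only to encode that order.
RATING_ORDER = {"poor": 0, "average": 1, "good": 2}

# Hypothetical restaurant records: customer ID -> service rating.
ratings = {"C001": "good", "C002": "average", "C003": "poor", "C004": "good"}

# Because the ratings are ordered, customers can be sorted from the
# lowest rating to the highest.
sorted_customers = sorted(ratings, key=lambda cid: RATING_ORDER[ratings[cid]])

print(gender_counts)
print(sorted_customers)
```

Sorting only makes sense for the ratings; for the gender labels, the only sensible summary is a count per label.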

All right, so any data which has some sort of sequence or order to it is known as ordinal data. So, guys, this is pretty simple to understand. Now, let's move on and look at quantitative data.

Quantitative data:

Quantitative data basically deals with numbers. You can understand that from the word quantitative itself: quantitative is basically quantity. So it deals with numbers, with anything that you can measure objectively.

Discrete data:

Now, discrete data (sometimes loosely called categorical data) can hold a finite number of possible values. The number of students in a class is a finite number. You can't have an infinite number of students in a class. Let's say in your fifth grade there were a hundred students in your class. There wasn't an infinite number, but there was a definite, finite number of students in your class. Okay, that's discrete data.

Continuous data:


Now, this type of data can hold an infinite number of possible values. So when I say the weight of a person is an example of continuous data, what I mean to say is that my weight can be 50 kg, or it can be 50.1 kg, or 50.001 kg, or 50.0001 kg, or 50.023 kg, and so on. There is an infinite number of possible values, right?
So this is what I mean by continuous data. This is the difference between discrete and continuous data. I'd also like to mention a few other things over here. There are a couple of types of variables as well: we have discrete variables and continuous variables.

A discrete variable is also known as a categorical variable, and it can hold values of different categories. Let's say that you have a variable called message, and there are two types of values that this variable can hold: your message can either be a spam message or a non-spam message. That's when you call a variable a discrete or categorical variable, because it can hold values that represent different categories of data.

Continuous variables are basically variables that can store an infinite number of values. So the weight of a person can be denoted as a continuous variable. Let's say there is a variable called weight, and it can store an infinite number of possible values. That's why we call it a continuous variable.

So guys, basically a variable is anything that can store a value. If you associate any sort of data with a variable, then it will become either a discrete variable or a continuous variable. There are also dependent and independent types of variables. We won't discuss those in depth because that's pretty understandable; I'm sure all of you know what independent and dependent variables are. A dependent variable is any variable whose value depends on some other, independent variable.
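As a rough illustration of the two kinds of variables described above, the following Python sketch treats a message label as a categorical variable and a weight as a continuous one. The sample values are invented for the example, and the variable names `messages` and `weights_kg` are assumptions, not part of the original lecture.

```python
from collections import Counter
from statistics import mean

# Categorical (discrete) variable: every value is one of a fixed set of
# categories, so the natural summary is a count per category.
messages = ["spam", "non-spam", "non-spam", "spam", "non-spam"]
message_counts = Counter(messages)

# Continuous variable: any real value in a range is possible, so numeric
# summaries such as the mean make sense. Weights are in kilograms.
weights_kg = [50.0, 50.1, 50.001, 50.023]
average_weight = mean(weights_kg)

print(message_counts["spam"], message_counts["non-spam"])
print(round(average_weight, 3))
```

Averaging the message labels would be meaningless, while counting distinct weights would say little; which summaries apply is the practical difference between the two variable types.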
So guys, that much knowledge I expect all of you to have. All right, so now let's move on and look at our next topic.

What is statistics?

Now, coming to the formal definition: statistics is an area of applied mathematics which is concerned with data collection, analysis, interpretation, and presentation. Usually when I speak about statistics, people think statistics is all about analysis, but statistics has other parts to it: data collection is also a part of statistics, and so are data interpretation and presentation.

All of this comes under statistics. You are going to use statistical methods to visualize data, to collect data, and to interpret data. So this area of mathematics deals with understanding how data can be used to solve complex problems.

Now I'll give you a couple of examples of problems that can be solved by using statistics. Let's say that your company has created a new drug that may cure cancer. How would you conduct a test to confirm its effectiveness? Even though this sounds like a biology problem, it can be solved with statistics. You will have to create a test which can confirm the effectiveness of the drug. This is a common problem that can be solved using statistics. Let me give you another example: you and a friend are at a baseball game, and out of the blue he offers you a bet that neither team will hit a home run in that game. Should you take the bet? Here you discuss the probability of whether you'll win or lose. This is another problem that comes under statistics.

Terminologies In Statistics:

Now, before you dive deep into statistics, it is important that you understand the basic terminologies used in statistics. The two most important terminologies in statistics are population and sample. So throughout the course, or throughout any problem that you're trying to solve with statistics, you will come across these two words.

Now, the population is a collection or set of individuals, objects, or events whose properties are to be analyzed. So basically you can refer to the population as the subject that you're trying to analyze. Now, a sample, just like the word suggests, is a subset of the population. So you have to make sure that you choose the sample in such a way that it represents the entire population. It shouldn't focus on one part of the population; instead, it should represent the entire population. That's how your sample should be chosen. A well-chosen sample will contain most of the information about a particular population parameter.

Now, you must be wondering how one can choose a sample that best represents the entire population. Sampling is a statistical method that deals with the selection of individual observations within a population. So sampling is performed in order to infer statistical knowledge about a population. If you want to understand the different statistics of a population, like the mean, the median, the mode, the standard deviation, or the variance, then you're going to perform sampling, because it's not reasonable for you to study a large population and find out the mean, median, and everything else.
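The idea of estimating population statistics from a sample can be sketched in Python as follows. The population here is synthetic (random ages between 13 and 19), and the sample size of 200 and the helper name `describe` are assumptions made purely for illustration. The sketch relies on the standard `random` and `statistics` modules and assumes Python 3.8 or later, where `statistics.mode` returns the first mode if there are several.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 synthetic ages between 13 and 19.
population = [random.randint(13, 19) for _ in range(10_000)]

# Study only a small random sample instead of the whole population.
sample = random.sample(population, 200)

def describe(values, is_sample):
    """Return the summary statistics mentioned in the text.

    Sample data uses the sample variance and standard deviation (n - 1
    denominator); population data uses the population versions.
    """
    variance = statistics.variance if is_sample else statistics.pvariance
    std_dev = statistics.stdev if is_sample else statistics.pstdev
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": variance(values),
        "std_dev": std_dev(values),
    }

population_stats = describe(population, is_sample=False)
sample_stats = describe(sample, is_sample=True)

for name in population_stats:
    print(f"{name}: population={population_stats[name]:.3f} "
          f"sample={sample_stats[name]:.3f}")
```

With a representative sample, the sample values should land close to the population values even though only 2% of the population was examined.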

Q. Why is sampling performed, you might ask?
Q. What is the point of sampling? Can't we just study the entire population?

Now guys, think of a scenario wherein you're asked to perform a survey about the eating habits of teenagers in the US. At present, there are over 42 million teens in the US, and this number is growing as we speak. Is it possible to survey each of these 42 million individuals about their health? Well, it might be possible, but this will take forever to do. Obviously, it's not reasonable to go around knocking on each door and asking what your teenage son eats and all of that.

That's why sampling is used. It's a method wherein a sample of the population is studied in order to draw inferences about the entire population. So it's basically a shortcut to studying the entire population. Instead of taking the entire population and finding out all the solutions, you're just going to take a part of the population that represents the entire population, and you're going to perform all your statistical analysis, your inferential statistics, on that small sample. That sample basically represents the entire population. All right, so I hope I've made clear to y'all what a sample is and what a population is. Now, there are two main types of sampling techniques that are discussed today.

We have probability sampling and non-probability sampling. Now, in this video we'll only be focusing on probability sampling techniques because non-probability sampling is not within the scope of this video. We'll only discuss the probability part because we're focusing on statistics and probability. Now, under probability sampling we have three different types: random sampling, systematic sampling, and stratified sampling. And just to mention the different types of non-probability sampling, we have snowball, quota, judgment, and convenience sampling. In this session, I'll only be focusing on probability sampling, so let's move on and look at its different types. So what is probability sampling? It is a sampling technique in which samples from a large population are chosen by using the theory of probability.
So there are three types of probability sampling. First, we have random sampling. In this method, each member of the population has an equal chance of being selected in the sample. So each and every individual or object in the population has an equal chance of being a part of the sample. That's what random sampling is all about. You are randomly going to select any individual or any object, so this way each individual has an equal chance of being selected.
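Random sampling as described above can be sketched in a few lines of Python. The population of member IDs 1 to 100, the sample size of 10, and the function name `simple_random_sample` are assumptions for illustration; `random.Random.sample` from the standard library draws without replacement, giving every member the same chance of being picked.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members; each member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, k)

# Hypothetical population: member IDs 1..100.
population = list(range(1, 101))
sample = simple_random_sample(population, k=10, seed=0)

print(sample)
```

Passing a seed is optional; it is used here only so that repeated runs of the sketch select the same members.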

Next, we have systematic sampling. In systematic sampling, every nth record is chosen from the population to be a part of the sample. For example, out of six groups, every second group is chosen as a sample. So every second record is chosen here, and this is how systematic sampling works: you select every nth record and add it to your sample.

Next, we have stratified sampling. In this type of technique, a stratum is used to form samples from a large population.
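Going back to systematic sampling for a moment, the "every nth record" rule can be sketched in Python as below. The group labels `G1` to `G6` and the function name `systematic_sample` are hypothetical. Choosing a random starting point when none is given is a common convention and an assumption here; the lecture itself does not specify how the first record is picked.

```python
import random

def systematic_sample(records, step, start=None, seed=None):
    """Pick every `step`-th record, beginning at index `start`.

    If `start` is None, a random starting index in [0, step) is used
    (an assumption, not something the text specifies).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if start is None:
        start = random.Random(seed).randrange(step)
    return records[start::step]

# Six hypothetical groups, as in the example above.
groups = ["G1", "G2", "G3", "G4", "G5", "G6"]

# Start at index 1 so that every second group (G2, G4, G6) is chosen.
chosen = systematic_sample(groups, step=2, start=1)

print(chosen)
```

Only the starting point can be random; once it is fixed, the rest of the sample is fully determined by the step size.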

What is a stratum?

A stratum is basically a subset of the population that shares at least one common characteristic. So let's say that your population has a mix of both males and females. You can create two strata out of this: one will have only the male subset and the other will have the female subset.

This is what a stratum is: a subset of the population that shares at least one common characteristic. In our example, it is gender. So after you've created the strata, you're going to use random sampling on these strata and choose the final sample. Random sampling means that all of the individuals in each stratum will have an equal chance of being selected in the sample. So, guys, these were the three different types of sampling techniques.
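The two steps of stratified sampling described above (split the population into strata, then sample randomly within each stratum) can be sketched in Python like this. The population of 30 people, the function name `stratified_sample`, and the choice to take the same fraction from every stratum (proportional allocation) are assumptions for illustration; the lecture does not say how many members to draw from each stratum.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, fraction, seed=None):
    """Group members into strata by `key`, then randomly sample each stratum.

    The same `fraction` is taken from every stratum (proportional
    allocation, an assumption), with at least one member per stratum.
    """
    rng = random.Random(seed)

    # Step 1: build the strata from the shared characteristic.
    strata = defaultdict(list)
    for member in population:
        strata[key(member)].append(member)

    # Step 2: simple random sampling inside each stratum.
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 10 females and 20 males.
people = [{"id": i, "gender": "male" if i % 3 else "female"} for i in range(30)]

sample = stratified_sample(people, key=lambda p: p["gender"], fraction=0.2, seed=7)

print(sorted(p["id"] for p in sample))
```

Because each stratum is sampled separately, both genders are guaranteed to appear in the final sample, which plain random sampling does not guarantee for small samples.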

1.Introduction to data science.
2.Statistics And Probability.
3.Basic Of Machine Learning.
4.Linear Regression.
5.Logistic Regression.
6.Decision Trees.
7.Random Forest.
8.K Nearest Neighbour.
9.Naive Bayes.
10.Support Vector Machine.
11.K Means Clustering.
12.Association Rule Mining.
13.Reinforcement Learning.
14.Deep Learning.
15.Interview Questions.

