# PPT 1 – Introduction to Statistics and Data Analysis

1. A large collection, or set, of individuals or objects or events whose properties are
to be analyzed. POPULATION!!!!!!
2. is another name for a categorical variable. NOMINAL!!!!!
3. Uses graphical representation of data DESCRIPTIVE!!!!!
4. have been collected, summarized, reported and stored DATA!!!
5. data which are numerical in nature. These data can be ordered or ranked.
QUANTITATIVE!!!!
6. assumes values that can be counted and their values are represented by
counting numbers only. DISCRETE!!!!
7. to determine whether there is a relationship between two variables and to
describe the relationship. CORRELATIONAL!!!!
8. discrepancy between a sample statistic and its population parameter SAMPLING
ERROR!!!!
9. one variable is manipulated to create treatment conditions. A second variable is
observed and measured to obtain scores for a group of individuals in each of the
treatment conditions. The measurements are then compared to see if there are
differences between treatment conditions. EXPERIMENT!!!!
10. All other variables are controlled to prevent them from influencing the results.
EXPERIMENT!!!!
11. observed variable is_______ DEPENDENT!!!!!
12. ∑ SIGMA!!!!
13. upper limit of the summation is represented by_____ n!!!!
14. one simply names or categorizes responses. NOMINAL!!!!!
15. branch of science that deals with studies involving collection, organization,
analysis, interpretation, and drawing conclusions from the data. STATISTICS!!!!!
16. allow comparisons of the degree to which two subjects possess the dependent
variable. ORDINAL SCALE (!!!!)
17. numerical scales in which intervals have the same interpretation throughout
INTERVAL SCALE (!!!!)
18. the most informative scale RATIO SCALE (!!!)
19. is the formula used to generate each term of the sum a(summation index)
20. research is the label given to a study when a researcher cannot control,
manipulate or alter the predictor variable or subjects, but instead, relies on
interpretation, observation or interactions to come to a conclusion. QUASI
EXPERIMENTAL OR NON-EXPERIMENTAL!!!!
21. A numerical value summarizing the sample data. STATISTIC (!!!)
22. are variables that can be placed into categories according to their characteristics
or attributes. QUALITATIVE!!!!
23. the manipulated variable is called the INDEPENDENT!!!
24. Shorthand way of expressing a sum SUMMATION NOTATION!!!!
25. Any letter can be used as the _____ in summation notation SUMMATION INDEX!!!!!
26. If the variable cannot be further subdivided, it is a clue that you are probably
dealing with a DISCRETE!!!
27. study simply observes the two variables as they exist naturally.
CORRELATIONAL!!!!
28. demonstrate a cause-and-effect relationship between two variables; that is, to
show that changing the value of one variable causes changes to occur in a
second variable. EXPERIMENT!!!!
29. It is an interval scale with the additional property that its zero position indicates
the absence of the quantity being measured. RATIO SCALE!!!!
30. they do not imply any ordering among the responses. NOMINAL SCALE (!!!)
31. Arithmetic operations, such as addition and averaging, are not meaningful for
data resulting from______ QUALITATIVE!!!!
32. are usually associated with counting DISCRETE!!!!!
33. helps in making scientific judgments in the face of uncertainty and variation
INFERENTIAL!!!!
34. a decision-making procedure to find out whether there is a significant difference
between a claim about a population and other information about the said
population. HYPOTHESIS TESTING!!!!!
35. have two or more categories without having any kind of natural order. NOMINAL
VARIABLE (!!!)
36. The value of the variable associated with one element of a population or sample.
This value may be a number, a word, or a symbol. DATA SINGULAR!!!!
37. Arithmetic operations such as addition and averaging, are meaningful for data
resulting from QUANTITATIVE!!!!
38.  is one which can assume all values between any two specific values or intervals.
The values are obtained through measurement. CONTINUOUS VARIABLE (!!!)
39. are variables with no numeric value, such as occupation or political party
affiliation NOMINAL VARIABLE (!!!)
40. is similar to a categorical variable. The difference between the two is that there is
a clear ordering of the variables. ORDINAL VARIABLE (!!!)
41. A numerical value summarizing all the data of an entire population.
PARAMETER!!!
42. are usually associated with measurements CONTINUOUS!!!
43. A subset of the population or a part of population that has the same
characteristics of the given population. SAMPLE!!!!
44. consists of higher degree of analysis, interpretation and inferences.
INFERENTIAL!!!!
45. the use of _____________in many areas involves the gathering of information or
scientific data. STATISTICAL METHOD
46. Provide simple summaries about the sample and the measures.
DESCRIPTIVE!!!
47. can be classified as qualitative or quantitative VARIABLE!!!!
48. Trying to reach conclusions that extend beyond the immediate data alone
INFERENTIAL!!!!
49. The set of values collected for the variable from each of the elements belonging
to the sample. DATA PLURAL!!!!!
50. To establish relationships between variables, researchers must observe the
variables and record their observations. This requires that the variables be
MEASURED!!!!
51. used to describe the basic features of the data in a study. They provide simple
summaries about the sample and the measures. Together with simple graphics
analysis, they form the basis of virtually every quantitative analysis of data.
DESCRIPTIVE!!!!
52. A planned activity whose results yield a set of data. EXPERIMENT!!!!
53. process that classifies each individual into one category. SCALE OF
MEASUREMENT
54. A characteristic about each individual element of a population or sample.
VARIABLE!!!!
55. The process of measuring a variable requires a set of categories called a SCALE
OF MEASUREMENT!!!
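
The summation items above (∑, the upper limit n, the summation index, and the formula for each term) can be sketched in a few lines of Python. The scores and the helper `a` are made up for illustration only; they do not come from the slides.

```python
# Summation notation: the sum of a(i) for i = 1 up to n.
scores = [4, 7, 9, 10]      # hypothetical data x_1 .. x_n
n = len(scores)             # upper limit of the summation

def a(i):
    """Formula that generates the i-th term (1-based summation index)."""
    return scores[i - 1]

# The index letter is arbitrary: i and k below play the same role.
total = sum(a(i) for i in range(1, n + 1))
sum_of_squares = sum(a(k) ** 2 for k in range(1, n + 1))

print(total, sum_of_squares)
```

Swapping the index letter changes nothing about the result, which is the point of item 25.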

1. allow us to systematically collect information about our objects of study (people,
objects, phenomena) and about the settings in which they occur. DATA
COLLECTION TECHNIQUES
2. we have to be systematic in COLLECTION OF DATA
3. are data that are not organized, or if arranged, could only be from highest to lowest
or lowest to highest. UNGROUPED DATA
4. are the largest numbers that can actually belong to different classes UPPER CLASS
LIMIT
5. Also known as raffle LOTTERY SAMPLING TECHNIQUE
6. are the numbers used to separate classes, but without the gaps created by class
limits CLASS BOUNDARIES
7. The value of median and other partition values can be located from the OGIVES
8. captions above the columns HEAD BOX
9. the only way to recruit the members of rare or much sought after groups
PURPOSIVE SAMPLING TECHNIQUE
10. refers to sampling plans where the sampling is carried out in stages using smaller
and smaller sampling units at each stage. MULTI-STAGE SAMPLING TECHNIQUE
11. is used in exploratory research where the researcher is interested in getting an
inexpensive approximation of the truth. CONVENIENCE SAMPLING TECHNIQUE
12. labels that classify values of a variable STUBS
13. used to draw the numbers for the sample TABLE OF RANDOM NUMBERS
14. collect data on each sampling unit that was randomly sampled from each group
STRATUM
15. obtain a simple random sample from each group STRATUM(!!!!)
16. are the smallest numbers that can actually belong to different classes LOWER
CLASS LIMIT
17. a line graph drawn by joining all the midpoints of the tops of the bars of a
histogram. FREQUENCY POLYGON
18. a bar graph that shows how frequently data occur within certain ranges or intervals.
The height of each bar gives the frequency in the respective interval. HISTOGRAM
19. is a vertical bar graph in which values are plotted in decreasing order of relative
frequency from left to right. PARETO DISTRIBUTION DIAGRAM
20. Sample has a known probability of being selected PROBABILITY SAMPLING
21. researcher may ask or invite individuals to send text opinions on certain issues or
send in their choices on their brand preferences on a particular product using their
cellphones. TEXTING METHOD
22. Sample Size: $n = \dfrac{N}{1 + Ne^{2}}$ SLOVIN’S FORMULA
23. partition the population into groups STRATA
24. is the average of the lower and upper limits of each class CLASS MARK OR CLASS
MIDPOINT
25. the difference between two consecutive lower class limits or two consecutive class
boundaries CLASS WIDTH
26. method of collecting data is used to find out the cause and effect relationship of
certain phenomena under controlled conditions. EXPERIMENTAL METHOD
27. can be a complex form of cluster sampling MULTI-STAGE SAMPLING TECHNIQUE
28. Each of the N population members is assigned a unique number. The numbers are
placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n
numbers. Population members having the selected numbers are included in
the sample. LOTTERY SAMPLING TECHNIQUE
29. An individual group is called a STRATUM
30. divide the population into groups CLUSTERED SAMPLING TECHNIQUE
31. Sample does not have known probability of being selected as in convenience or
voluntary response surveys NON-PROBABILITY SAMPLING
32. nonprobability method is often used during preliminary research efforts to get a
gross estimate of the results, without incurring the cost or time required to select a
random sample. CONVENIENCE SAMPLING TECHNIQUE
33. is characterized by a deliberate effort to gain representative samples by including
groups or typical areas in a sample. PURPOSIVE SAMPLING TECHNIQUE
34. Less time consuming compared to many other sampling methods because only
suitable candidates are targeted PURPOSIVE SAMPLING TECHNIQUE
35. researcher makes direct and personal contact with the interviewee. The researcher
gathers data by asking the interviewee series of questions. INTERVIEW METHOD
36. also referred to as self-administered questionnaire WRITTEN QUESTIONNAIRE
37. is a data collection tool in which written questions are presented that are to be
answered by the respondents in written form. QUESTIONNAIRE METHOD
38. a type of probability sampling method in which sample members from a larger
population are selected according to a random starting point and a fixed, periodic
interval. SYSTEMATIC SAMPLING TECHNIQUE
39. The less than cumulative frequencies are in ascending order. The cumulative
frequency of each class is plotted against the upper limit of the class interval in this
type of ogive and then various points are joined by straight lines. LESS THAN OGIVE
40. the nonprobability equivalent of stratified sampling. QUOTA SAMPLING
TECHNIQUE
41. is calculated by dividing the population size by the desired sample size. SAMPLING
INTERVAL
42. researcher may observe subjects individually or group of individuals to obtain data
and information related to the objectives of the investigation. OBSERVATION
METHOD
43. used to show relationship between and among variables. SCATTER DIAGRAM
44. is possible when it makes sense to partition the population into groups based on a
factor that may influence the variable that is being measured.  STRATIFIED
SAMPLING TECHNIQUE
45. usually more representative of target population compared to other sampling
methods PURPOSIVE SAMPLING TECHNIQUE
46. The cumulative frequencies in this type are in the descending order. The cumulative
frequency of each class is plotted against the lower limit of the class interval. MORE
THAN OGIVE
47. is a circular statistical graphic, which is divided into slices to illustrate numerical
proportion. In a PIE CHART the arc length of each slice (and consequently its
central angle and area), is proportional to the quantity it represents.
48. used to show relationship between two sets of quantities. LINE GRAPH
49. is a table which shows the data arranged into different classes (or categories) and
the number of cases (or frequencies) which fall into each class. FREQUENCY
DISTRIBUTION TABLE
50. used to show vivid pictorial of data. PICTOGRAPH
51. extremely useful for analyzing what problems need attention first because the taller
bars on the chart, which represent frequency, clearly illustrate which variables have
the greatest cumulative effect on a given system. PARETO DIAGRAM CHART
52. effective devices of presenting both qualitative and quantitative data. TABULAR
FORM DATA (STATISTICAL TABLE)
53. Data can be presented using paragraphs or sentences. It involves enumerating
important characteristics, emphasizing significant figures and identifying important
features of data. TEXTUAL PRESENTATION DATA
54. method of collecting data is governed by our existing laws. The researcher gathers
data from offices concerned REGISTRATION METHOD
55. referred to as judgment, selective or subjective sampling; it is a non-probability
sampling method PURPOSIVE SAMPLING TECHNIQUE
56. this chart divides or breaks down total quantities into their component parts.
COMPONENT BAR CHART
57. used to describe or classify quantitative data by geographical areas STATISTICAL
MAP
58. are data that are organized and arranged into different classes or categories.
GROUPED DATA
59. a table which sorts data according to a certain pattern. STEM AND LEAF PLOT
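
Slovin's formula (item 22), the sampling interval (item 41) and systematic sampling (item 38) fit together naturally, so here is a minimal Python sketch of the three. The population of 1000 numbered members, the 5% margin of error, the function names, and the choice to round the sample size up are all assumptions made for illustration.

```python
import math
import random

def slovin_sample_size(N, e):
    """Slovin's formula: n = N / (1 + N * e^2).

    Rounding up to a whole respondent is an assumption of this sketch.
    """
    return math.ceil(N / (1 + N * e ** 2))

def systematic_sample(population, n, rng):
    """Take every k-th member after a random start, with k = N // n."""
    k = len(population) // n        # sampling interval
    start = rng.randrange(k)        # random starting point in 0 .. k-1
    return population[start::k][:n]

population = list(range(1, 1001))   # hypothetical members numbered 1..1000
n = slovin_sample_size(len(population), 0.05)
k = len(population) // n
sample = systematic_sample(population, n, random.Random(0))

print(n, k, sample[:5])
```

The seeded `random.Random(0)` only makes the sketch repeatable; a real draw would not fix the seed.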

1. The most commonly used measure of central tendency MEAN
2. the midpoint or class mark of each of the class intervals shall be multiplied to
their corresponding frequencies. The sum of the products is then divided by the
total number of frequencies. MIDPOINT METHOD
3. we arrange the observations in ascending or descending order.
The observation found in the middle is the MEDIAN
4. and many (2 or more) modes MULTI-MODAL
5. difference between the frequency of the modal class and the frequency of the
class interval preceding it fm1
6. two methods in computing for the mean of grouped data UNIT DEVIATION
METHOD AND MIDPOINT METHOD
7. It is best to compute the measures of central tendency for grouped data using
FREQUENCY DISTRIBUTION TABLE
8. Data which are arranged in a frequency distribution are called GROUPED DATA
9. associated with ordinal data. MEDIAN
10. associated with nominal data. MODE
11. a positional measure MEDIAN
12. It can be easily identified by inspection of an ungrouped set of data by getting the
score or item which occurs most frequently MODE
13. a set with two modes BI MODAL
14. A set of scores or data with only one mode UNI MODAL
15. the sum of all n values divided by the total frequency. MEAN
16. most reliable measure of central tendency MEAN
17. the midpoint of the class interval having the highest frequency ASSUMED
MEAN
18. denoted by x̅ MEAN (X BAR)
19. not affected by the extreme values. MEDIAN
20. only a function of the middle values (even or odd) or the average of the two
middle values (when n is even) when the data are arranged from the highest
value to the lowest value or vice versa. MEDIAN
21. difference between the frequency of the modal class and the frequency of the
class interval following the modal class fm2
22. associated with the interval/ ratio data. MEAN
23. strongly influenced by the extreme values in a set of data MEAN
24. is the center most observation that divides the data, arranged in either ascending
or descending order, into halves MEDIAN
25. There are some cases when values are given more importance than the others
WEIGHTED ARITHMETIC MEAN
26. is the simplest measure of central tendency. MEAN
27. Three modes TRI MODAL
28. may not even exist at all MODE
29. the value within the class interval having the highest frequency. MODE
30. always a unique value in any set of data. MEAN
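
As a quick check on the items above, the mean, median and mode of ungrouped data, plus the midpoint method for grouped data (item 2), can be sketched with Python's standard `statistics` module. The scores and the small frequency table are invented for illustration.

```python
from statistics import mean, median, multimode

# Ungrouped data (hypothetical scores).
scores = [2, 3, 3, 5, 7, 10]
ungrouped_mean = mean(scores)    # sum of all values divided by n
middle = median(scores)          # average of the two middle values, n is even
modes = multimode(scores)        # list of the most frequent value(s)

# Grouped data: ((lower limit, upper limit), frequency), hypothetical table.
classes = [((10, 14), 2), ((15, 19), 5), ((20, 24), 3)]

def grouped_mean(classes):
    """Midpoint method: sum of (class mark * frequency) / total frequency."""
    total_fx = sum(f * (lo + hi) / 2 for (lo, hi), f in classes)
    total_f = sum(f for _, f in classes)
    return total_fx / total_f

print(ungrouped_mean, middle, modes, grouped_mean(classes))
```

`multimode` returns a list, so a bimodal or trimodal set simply yields two or three entries.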

## PPT 4 – MEASURE OF POSITION

1. define the value below which given proportion of the scores lie. QUANTILES
2. Quantity QUANTILE
3. the most utilized measures for location or classification purposes (in case of people
when the characteristics are weight, height, etc.) PERCENTILE
4. the value of the trajectory of a variable, that encompasses a specific proportion of
the population. PERCENTILE
5. is the median of the values that are below Q2 (Second Quartile). Q1
6. divide the succession of ordered data set into ten equal parts or into nine
divisions. DECILE
7. divide the succession of ordered data set into one hundred equal parts or into 99
divisions PERCENTILE
8. is greater than one percent of the values and lower than the remaining ninety-nine.
P1
9. also called “Base Case” D5
10. 50th percentile MEDIAN
11. 75% or three quarters of the way up an ascending list of sorted samples UPPER
QUARTILE
12. 25% or one quarter of the way up an ascending list of sorted samples LOWER
QUARTILE
13. divide a set into four equal parts or into three divisions QUARTILE
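
Quartiles, deciles and percentiles are all quantiles with a different number of equal parts, which a short Python sketch can show. The data are made up, and `statistics.quantiles` uses its default "exclusive" interpolation method here; textbooks and other software may use slightly different rules and give slightly different cut points.

```python
from statistics import median, quantiles

data = [11, 3, 7, 1, 9, 5, 2, 10, 4, 8, 6]   # hypothetical scores, unsorted

quartiles = quantiles(data, n=4)       # 3 cut points -> 4 equal parts
deciles = quantiles(data, n=10)        # 9 cut points -> 10 equal parts
percentiles = quantiles(data, n=100)   # 99 cut points -> 100 equal parts

q1, q2, q3 = quartiles
d5 = deciles[4]          # 5th decile
p50 = percentiles[49]    # 50th percentile

print(q1, q2, q3, d5, p50, median(data))
```

For this data set Q2, D5, P50 and the median all land on the same value, matching items 9 and 10.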

## PPT 5 - MEASURE OF VARIATION

1. indicate the degree or extent to which numerical values are dispersed or spread out
about the average value in a distribution. MEASURE OF VARIATION
2. These are the range, the semi-interquartile range, the quartile range, the mean
deviation or average deviation, the variance, and the standard deviation. MEASURE
OF VARIATION
3. The difference between the largest and the lowest values in the set of numerical
data. RANGE
4. The most important measure of variation. STANDARD DEVIATION
5. Through this, we will be able to determine the position of the scores in a frequency
distribution in relation to the mean. !!!!!!!!!!!STANDARD DEVIATION
6. The mean of the distances of each value from their mean. MEAN DEVIATION
7. The average of the squared differences from the Mean. VARIANCE
8. The square root of this variance is known as the STANDARD DEVIATION
9. indicates the variation or dispersion of the values covering the middle 50% of the
distribution of the data. SEMI INTER QUARTILE RANGE
10. It is found by getting the half of the value or distance between the third quartile or
upper quartile and the first quartile or the lower quartile. SEMI INTER QUARTILE
RANGE
11. takes into account the deviations of the individual values from the mean. MEAN
DEVIATION
12. Simplest to compute in measure of variation RANGE
13. considers only the extreme values RANGE
14. Defined as the average of the absolute deviations of the individual set of numerical
data from either the mean, the median, or the mode MEAN DEVIATION
15. Poor and Unstable measure of variation RANGE
16. It does not consider and tell anything about all the other values between these
extreme values RANGE
17. is found by finding the difference between the values of the third quartile (Q3) or
upper quartile and the first quartile (Q1) or the lower quartile. INTERQUARTILE
RANGE
18. Other term for quartile deviation SEMI INTER QUARTILE DEVIATION
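
The measures of variation listed above can be computed side by side in a short Python sketch. The data are invented, the variance is the population form (the plain average of squared differences from the mean, as in item 7), and the quartiles come from `statistics.quantiles` with its default method, which is only one of several conventions.

```python
from statistics import mean, pstdev, pvariance, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores

data_range = max(data) - min(data)                     # largest minus lowest
m = mean(data)
mean_dev = sum(abs(x - m) for x in data) / len(data)   # mean absolute deviation
variance = pvariance(data)     # average of squared differences from the mean
std_dev = pstdev(data)         # square root of the variance

q1, _, q3 = quantiles(data, n=4)
iqr = q3 - q1                  # difference between Q3 and Q1
semi_iqr = iqr / 2             # half of that distance (quartile deviation)

print(data_range, mean_dev, variance, std_dev, iqr, semi_iqr)
```

For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.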

1. The measure that is expressed as the quotient of the absolute dispersion or
amount of variability, and the average. RELATIVE DISPERSION
2. Two most commonly used measures of relative dispersion COEFFICIENT OF
VARIATION AND THE COEFFICIENT OF QUARTILE DEVIATION
3. This is a type of measure of relative dispersion that expresses the standard
deviation as a percentage of the mean COEFFICIENT OF VARIATION
4. Another measure of relative dispersion that can be used when the quartiles are
known. THE COEFFICIENT OF QUARTILE DEVIATION
5. provide us additional data and information for a more accurate description of a
numerical data. MEASURE OF SHAPE
6. defined as the degree of departure from symmetry. SKEWNESS
7. A frequency curve that has a longer tail to the right than to the left is said to be
POSITIVELY SKEWED
8. if it has a tail which is longer to the left than to the right. NEGATIVELY SKEWED
9. If the mean is higher than the median, the curve is POSITIVELY SKEWED
10. The degree of peakedness of a frequency curve of a distribution in relation to a
normal distribution is known as KURTOSIS
11. A frequency distribution with a relatively high curve or peak is LEPTOKURTIC
12. A flat-topped distribution, where the values are relatively even in distribution
about the center is known as PLATYKURTIC
13. normal distribution curve which does not have a relatively high curve or peak or
is not too flat is called MESOKURTIC
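
The two measures of relative dispersion and the mean-versus-median rule for skewness can be sketched in Python as follows. The data and function names are made up for illustration, the coefficient of quartile deviation is written in its usual form (Q3 - Q1) / (Q3 + Q1), which the notes do not spell out, and comparing the mean with the median is only a rule of thumb rather than a full skewness coefficient.

```python
from statistics import mean, median, pstdev, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * pstdev(values) / mean(values)

def coefficient_of_quartile_deviation(values):
    """(Q3 - Q1) / (Q3 + Q1), using the default quantile method."""
    q1, _, q3 = quantiles(values, n=4)
    return (q3 - q1) / (q3 + q1)

def skew_direction(values):
    """Rule of thumb: mean above the median suggests a longer right tail."""
    m, med = mean(values), median(values)
    if m > med:
        return "positively skewed"
    if m < med:
        return "negatively skewed"
    return "symmetric"

cv = coefficient_of_variation(data)
cqd = coefficient_of_quartile_deviation(data)
direction = skew_direction(data)

print(cv, cqd, direction)
```

Because both coefficients are ratios, they have no units, so two data sets measured on different scales can be compared directly.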