Measurement and Scaling


What is measurement?
Measurement is the process of observing and recording the observations that are collected as part of a research effort. There are two major issues that will be considered here.

What is measurement error?
Measurement error is the error that results from the difference between the information sought by the researcher and the information actually obtained by the measurement process.

Random Error

All chance factors that confound the measurement of any phenomenon. They tend to cancel each other out in the long run in direction and magnitude.
Types of Random Errors
o Transient qualities of the individual (mood, motivation, degree of alertness, boredom, or fatigue).
o Situational factors, which involve the physical setting (noise level, lighting, ventilation, etc.) as well as anonymity and the presence of peers.
o Administrative factors, which involve the actual administration of the instrument or the amount of subjectivity influencing the measurement process.
Systematic Error
Refers to those factors that consistently or systematically affect the variable being measured. The two most common sources are demographic characteristics (education, SES, etc.) and personal style (response set).
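The contrast between the two kinds of error can be sketched with a small simulation. The true value, the size of the bias and the spread of the noise below are hypothetical numbers chosen only for illustration. Averaged over many measurements, the random error cancels out, while the systematic error stays in the mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

TRUE_VALUE = 50.0   # hypothetical true score of the variable
BIAS = 3.0          # hypothetical systematic error (e.g. a response set)
NOISE_SD = 5.0      # hypothetical spread of the random error
N = 10_000          # number of repeated measurements

# Random error only: each reading is the true value plus chance noise.
random_only = [TRUE_VALUE + random.gauss(0, NOISE_SD) for _ in range(N)]

# Random plus systematic error: every reading is also shifted by the same bias.
with_bias = [TRUE_VALUE + BIAS + random.gauss(0, NOISE_SD) for _ in range(N)]

mean_random_only = statistics.mean(random_only)
mean_with_bias = statistics.mean(with_bias)

print(f"mean with random error only:      {mean_random_only:.2f}")
print(f"mean with added systematic error: {mean_with_bias:.2f}")
```

The first mean should land close to the true value and the second close to the true value plus the bias, which is why more observations help against random error but not against systematic error.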
Business Research (8510)

Experimental Error
Experiments are designed to measure the impact of one or more independent variables on a dependent variable. Experimental error occurs when the effect of the experimental situation itself is measured rather than the effect of the independent variable. For example, a retail chain may increase the price of selected items in four outlets and hold the price of the same items constant in four similar outlets, in an attempt to discover the best pricing strategy. However, unique weather patterns, traffic conditions, or competitors' activities may affect the sales at one set of stores and not the other. Thus, the experimental result will reflect the impact of variables other than price.

Population Specification Error
Population specification error is caused by selecting an inappropriate universe or population from which to collect data. This is a potentially serious problem in both industrial and consumer research. A firm wishing to learn the criteria that are considered most important in the purchase of certain machine tools might conduct a survey among purchasing agents. Yet, in many firms the purchasing agents do not determine, or necessarily even know, the criteria behind brand selections. These decisions may be made by the machine operators, by a committee or by high-level executives. A study that focuses on the purchasing agent as the person who decides which brands to order may be subject to population specification error.

Frame Error
The sampling frame is the list of population members from which the sample units are selected. An ideal frame identifies each member of the population once and only once. Frame error is caused by using an inaccurate or incomplete sampling frame. For example, using the telephone directory as the sampling frame for the population of a community contains a potential for frame error. Those families who do not have listed numbers, whether voluntarily or involuntarily, are likely to differ from those with listed numbers in such respects as income, gender and mobility.

Sampling Error
Sampling error is caused by the generation of a non-representative sample by means of a probability sampling method. For example, a random sample of 100 university students could produce a sample made up entirely of females. Such a sample would not be representative of the overall student body. Yet it could occur with classical sampling techniques. Sampling error is the focal point of concern in classical statistics.
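How likely such an extreme sample is can be worked out directly. The sketch below assumes, purely for illustration, that one subgroup (say, women) makes up half of the student body and that students are drawn independently. It computes the chance that all 100 sampled students come from that subgroup, and then draws repeated samples to show how the sample share varies by chance alone.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

P_SUBGROUP = 0.5     # hypothetical share of the subgroup in the student body
SAMPLE_SIZE = 100

# Chance that every one of the 100 independently drawn students is in the subgroup.
p_all_subgroup = P_SUBGROUP ** SAMPLE_SIZE

def sample_share(n: int, p: float) -> float:
    """Share of subgroup members in one random sample of size n."""
    return sum(random.random() < p for _ in range(n)) / n

# Repeat the sampling many times to see the chance variation between samples.
shares = [sample_share(SAMPLE_SIZE, P_SUBGROUP) for _ in range(1_000)]

print(f"P(sample is entirely one subgroup) = {p_all_subgroup:.2e}")
print(f"smallest and largest sample share: {min(shares):.2f}, {max(shares):.2f}")
```

The extreme case is possible but vanishingly rare under these assumptions. Smaller deviations from the true share are the everyday form of sampling error.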

Selection Error
Selection error occurs when a non-representative sample is obtained by non-probability sampling methods. For example, one of the authors talked with an interviewer who is afraid of dogs. In surveys that allowed any freedom of choice, this interviewer avoided homes with dogs present. Obviously such a practice may introduce error into the survey results. Selection error is a major problem in non-probability samples.

Non-response Error
Non-response error is caused by (1) failure to contact all members of a sample, and/or (2) the failure of some contacted members of the sample to respond to all or specific parts of the measurement instrument. Individuals who are difficult to contact or who are reluctant to cooperate will differ, on at least some characteristics, from those who are relatively easy to contact or who readily cooperate. If these differences include the variable of interest, non-response error has occurred. For example, people are more likely to respond to a survey on a topic that interests them. If a firm were to conduct a mail survey to estimate the incidence of athlete's foot among adults, non-response error would be of major concern. Why? Those most likely to be interested in athlete's foot, and thus more likely to respond to the survey, are current or recent sufferers of the problem. If the firm were to take the percentage of respondents who report having athlete's foot as an estimate of the share of the total population having athlete's foot, it would probably overestimate the extent of the problem.
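The overestimate in the athlete's foot example can be made concrete with a little arithmetic. The incidence and the two response rates below are hypothetical figures, not data from any survey. They only show how unequal response rates distort the estimate.

```python
# All figures below are hypothetical, chosen only to show the mechanism.
TRUE_INCIDENCE = 0.10        # share of adults who have athlete's foot
RESPONSE_SUFFERERS = 0.60    # response rate among sufferers
RESPONSE_OTHERS = 0.20       # response rate among everyone else

def estimated_incidence(incidence: float, rate_yes: float, rate_no: float) -> float:
    """Share of survey respondents who report the condition."""
    responding_yes = incidence * rate_yes
    responding_no = (1 - incidence) * rate_no
    return responding_yes / (responding_yes + responding_no)

estimate = estimated_incidence(TRUE_INCIDENCE, RESPONSE_SUFFERERS, RESPONSE_OTHERS)
print(f"true incidence: {TRUE_INCIDENCE:.0%}, survey estimate: {estimate:.0%}")
```

If both groups responded at the same rate, the function would return the true incidence. The bias comes entirely from the difference in response rates.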
Scales for measurement of variables

To measure is to assess, quantify, analyze or appraise. It is to discover the extent, dimensions, capacity and quantity of any physical object. Business research deals with physical objects as well as ideas. Judging how sound an idea is resembles assessing how well you like a song, a painting or the personality of your boss. While physical objects are measured directly, ideas or concepts are measured with the help of an operational definition. Salesmanship, for instance, cannot be measured directly, but it is easy to set a benchmark: a good salesman is one who has sold 200 cars in a year without any complaint. Four scales are used to measure any object or to quantify any concept, idea or property. These are discussed as follows:

NOMINAL SCALE
It is just a label with no essential value or quality. It cannot be used for grading or ranking. Nominal categories are mutually exclusive, with no overlaps: each item must be placed in one and only one class, so a person can be either Muslim or non-Muslim, not both at the same time. It is used for counting or cross-tabulation. Hair could be black or grey; blood group can be A, B, O or AB. In cricket, there are left-arm or right-arm spinners. It is used for obtaining personal data and is usually exhaustive, covering all categories or segments.

ORDINAL SCALE
It is used for ranking, rating or grading. It can show best-to-worst status or first-to-last preference, but the distance between adjacent ranks is not the same. The income levels of the poor, middle and rich classes might be less than Rs.10,000, between Rs.11,000 and Rs.50,000, and Rs.51,000 and above. The widths of these classes are 10,000, 39,000 and infinite respectively. An ordinal scale can therefore rank items in an order such as less than or more than, but it cannot say how much more.

INTERVAL SCALE
It is more powerful than the nominal and ordinal scales, as it not only orders, ranks or rates but also shows the exact distances in between. But it does not start from a true zero. If there is a zero, like zero temperature, it is not natural but arbitrary: 0 degrees does not mean the absence of temperature. Likewise, year 0 in a forecast is simply the year in which construction ends. Scale values can be added or subtracted, so this scale is used to calculate the mean, range, variance, standard deviation, correlation and regression.

Difference between interval and ordinal scale: An ordinal scale only ranks and does not measure the difference between two ranks, such as satisfactory and not satisfactory. An interval scale not only ranks but also gives the exact distance between them by assigning a value. The difference between temperatures of 20 degrees and 40 degrees is 20 degrees, but 40 degrees is not twice as hot as 20 degrees.

RATIO SCALE
This scale can perform all functions. It supports all mathematical operations, including multiplication and division. It is useful when exact figures are required in objective matters. If one person draws a salary of Rs.20,000 and another Rs.40,000, it can be said that the latter is getting double the salary of the former.

FOUR SCALES COMPARED: NOMINAL, ORDINAL, INTERVAL, RATIO
NOMINAL
  Characteristics: classification, but no order, distance or unique origin
  Determines: equality
  Typical use: label only
  Examples: gender (male, female); black and white; religion
  Operations: counting, frequency distribution

ORDINAL
  Characteristics: classification and order, but no distance or unique origin
  Determines: greater or lesser value
  Typical use: ranks, rating and grade
  Examples: doneness of meat (well, medium well, medium rare, rare); grades such as AAA, BBB, CCC; star levels (one-star, 4-star)

INTERVAL
  Characteristics: classification, order and distance, but no unique origin
  Determines: equality of intervals or differences
  Typical use: equal grouping
  Examples: temperature in degrees; personality measure
  Operations: addition and subtraction, but no multiplication or division; mean, range, variance, standard deviation

RATIO
  Characteristics: classification, order, distance and unique origin
  Determines: equality of ratios
  Examples: weight, height; age in years; annual income
  Operations: all functions; zero means no measurable value, as in zero sales
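The difference between an interval and a ratio scale shows up when the unit of measurement changes. The sketch below converts the 20 and 40 degree readings from Celsius to Fahrenheit (treating the original readings as Celsius is an assumption, since the text names no unit) and re-expresses the two salaries in thousands of rupees. The temperature ratio changes with the unit, while the salary ratio does not.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Standard Celsius to Fahrenheit conversion."""
    return c * 9 / 5 + 32

low_c, high_c = 20.0, 40.0   # assumed to be degrees Celsius
low_f, high_f = celsius_to_fahrenheit(low_c), celsius_to_fahrenheit(high_c)

# Interval scale: the ratio depends on the arbitrary zero point of the unit.
ratio_c = high_c / low_c
ratio_f = high_f / low_f

# Ratio scale: salary has a true zero, so the ratio survives a change of unit.
RUPEES_PER_UNIT = 1_000      # hypothetical change of unit (thousands of rupees)
salary_a, salary_b = 20_000, 40_000
ratio_rupees = salary_b / salary_a
ratio_thousands = (salary_b / RUPEES_PER_UNIT) / (salary_a / RUPEES_PER_UNIT)

print(f"temperature ratio: {ratio_c:.2f} in Celsius, {ratio_f:.2f} in Fahrenheit")
print(f"salary ratio: {ratio_rupees:.2f} in rupees, {ratio_thousands:.2f} in thousands")
```

Because the "twice as much" statement only holds in one temperature unit, it carries no real meaning on an interval scale. For salaries it holds in every unit.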
TYPES OF SCALES

Rating and Ranking Scales
RATING SCALES
Require the respondent to estimate the magnitude of a quality that an object possesses, scoring the object without making a direct comparison to another object.
o Likert scale
o Semantic differential scale
o Graphic scale
o Stapel scale

RANKING SCALES
Require the respondents to rank-order a small number of activities, events or objects on the basis of overall preference or some characteristic of the stimulus.
o Paired comparison
o Forced choice
o Comparative scale
Validity and Reliability

In business research, all measurements should be both valid and reliable. Validity is the relevance and appropriateness of a measuring instrument or scale. While we can judge the health of a child by using a weighing machine, we cannot use the same machine to check his or her intelligence. The weighing machine is only valid for weighing. When we ask whether this machine is reliable, we are asking whether it would give a true measure whenever it is used.

Evaluating Measures: Reliability
Reliability refers to the consistency, repeatability, and reproducibility of empirical measurements. It refers to whether a particular technique, applied repeatedly to the same object, would yield consistent results each time.
Types of Reliability: Test-Retest

This is where the same test is given to the same people after a period of time. After the retest, we have two scores on the same measure for each person. The correlation between the two sets of scores is then computed.
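As a minimal sketch of this calculation, the example below computes the Pearson correlation between two administrations of the same test. The six pairs of scores are hypothetical, and Pearson's r is assumed here as the measure of correlation, since the text does not name one.

```python
from math import sqrt
from typing import Sequence

def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of six respondents on the same test, some time apart.
first_administration = [12, 15, 9, 20, 17, 11]
retest = [13, 14, 10, 19, 18, 10]

test_retest_r = pearson_r(first_administration, retest)
print(f"test-retest correlation: {test_retest_r:.2f}")
```

A coefficient close to 1 indicates that respondents kept roughly the same relative standing across the two administrations.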
Problems in Estimating Test-Retest Reliability

o The memory effect
o The reactivity effect
Alternate Form
Here, two separate but equivalent (designed to be as similar as possible) versions of an instrument are constructed and administered successively to the same subjects.

Problems with Alternate Form
The problem of being able to construct an alternative form parallel to the original one.
Split-Half
o Here, the total number of indicants is divided into two halves by separating the odd-numbered items from the even-numbered ones.
o The two halves are correlated by using an appropriate measure of association.
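The odd/even split can be sketched as follows. The response matrix is hypothetical (five respondents, six items scored 1 to 5), and Pearson's r is assumed as the measure of association.

```python
from math import sqrt
from typing import Sequence

def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical answers: one row per respondent, one column per item.
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 5, 4],
    [1, 2, 1, 2, 2, 1],
]

# Items 1, 3, 5 (indexes 0, 2, 4) form the odd half; items 2, 4, 6 the even half.
odd_totals = [sum(row[0::2]) for row in responses]
even_totals = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(odd_totals, even_totals)
print(f"odd-half totals:  {odd_totals}")
print(f"even-half totals: {even_totals}")
print(f"split-half correlation: {split_half_r:.2f}")
```

Splitting by odd and even item numbers, rather than first half against second half, keeps effects such as fatigue toward the end of the instrument from falling into only one half.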
Internal Consistency
Used to determine the homogeneity of items. That is, do the items measure the same property? Computing Cronbach's alpha (α) is a common way to assess internal consistency. Conceptually, it reflects how strongly scores on different subsets of the items correlate with each other. If it proves to be very low, then the items have very little in common. For example, if alpha is below 0.5, the items correlate only weakly with one another.
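A minimal sketch of the calculation is given below. It uses the usual variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), which the text above does not spell out, and a hypothetical response matrix of five respondents and four items.

```python
import statistics
from typing import Sequence

def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_variances = [statistics.variance(item) for item in items]
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical data in which the four items rise and fall together.
responses = [
    [4, 5, 4, 4],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

When the items move together, the variance of the total score is much larger than the sum of the item variances, which pushes alpha toward 1. Items with little in common give a total variance close to that sum and an alpha near 0.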
Intercoder Reliability (The Level of Agreement)
It examines the extent to which different interviewers, observers, or coders using the same instrument or measure get equivalent results.
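One simple way to express the level of agreement is the share of units that two coders classify identically. The sketch below also computes Cohen's kappa, a chance-corrected index that is not mentioned in the text above and is included only as a common companion statistic. The ten codes per coder are hypothetical.

```python
from collections import Counter
from typing import Sequence

def percent_agreement(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Share of units to which both coders assigned the same category."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of the two coders' category shares, summed.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same ten open-ended answers.
coder_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "pos"]
coder_b = ["pos", "pos", "neg", "pos", "pos", "neg", "pos", "neg", "neg", "pos"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"percent agreement: {agreement:.0%}, Cohen's kappa: {kappa:.2f}")
```

Kappa is lower than raw agreement because two coders who use the same categories at similar rates will agree on some units by chance alone.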
Improving Reliability
o Exploratory studies, preliminary interviews, or pretests of a measure with a small sample of persons with similar characteristics to the target group.
o Adding items of the same type to a scale. A composite measure containing more items will normally be more reliable than a composite measure having fewer.
o An item-by-item analysis to reveal which items discriminate well between units with different values on a particular variable.
o Instructions to respondents.
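The effect of adding items can be illustrated with the Spearman-Brown prophecy formula, which is not part of the text above and is brought in here as an assumption. It predicts the reliability of a lengthened scale, provided the added items are comparable to the existing ones. The starting reliability of 0.60 is a hypothetical figure.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a scale is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

current = 0.60  # hypothetical reliability of the existing scale

# Predicted reliability if the scale had 1x, 2x, 3x or 4x as many items.
predictions = {factor: spearman_brown(current, factor) for factor in (1, 2, 3, 4)}

for factor, value in predictions.items():
    print(f"{factor}x as many items -> predicted reliability {value:.2f}")
```

The gain from each further lengthening becomes smaller, so adding items helps most when the scale is short to begin with.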

Validity-Definition

Validity refers to the accuracy of a measure. It is the extent to which a measuring instrument actually measures the underlying concept it is supposed to measure. It refers to the extent of matching, congruence, or goodness of fit between an operational definition and the concept it is purported to measure. An instrument is said to be valid if it taps the concept it is supposed to measure. It is designed to answer the question: is it true?
Assessing the Validity of a Measure

o Content validity / face validity
o Criterion-related validity
o Construct validity
Content validity (also called face validity)
This is the extent to which a measuring instrument reflects a specific domain of content. It can also be viewed as the sampling adequacy of the content of a phenomenon being measured. This type of validity is often used in the assessment of various educational and psychological tests. Content validation, then, is essentially judgmental.
Problem with Content Validity
o Specifying the full domain of content relevant to a particular measurement situation.
o No agreed-upon criterion for determining content validity.
Criterion-Related Validity
This is at issue when the purpose is to use an instrument to estimate some important form of behavior that is external to the measuring instrument itself, the latter being referred to as the criterion. For example, a test used to select students for special programs of study in high school is valid only to the extent that it actually predicts performance in those programs.
Two Types of Criterion-Related Validity

o Concurrent validity
o Predictive validity
Concurrent Validity
Refers to the ability of a measure to accurately predict the current situation or status of an individual. Here the instrument being assessed is compared to some already existing criterion, such as the results of another measuring device.
Predictive Validity

This is where an instrument is used to predict some future state of affairs. An example is the various educational tests used for selection purposes in different occupations and schools: the SAT, the GRE, etc. If people who score high on the SAT or GRE do better in college than low scorers, then the SAT or GRE is presumably a valid measure of scholastic aptitude (in the case of the SAT). The prison system uses this approach to assess which offenders are less likely to reoffend, using factors such as age, type of crime, family background, etc.
Problems with Criterion-Related Validity
From the definition of criterion-related validity, it can be inferred that the degree of criterion-related validity depends on the extent of the correspondence between the test and the criterion. Most measures in the social sciences, however, have no well-delimited, relevant criterion variables against which they can be reasonably evaluated.
Construct validity
This is evaluated by examining the degree to which certain explanatory concepts (constructs), derived from theory, account for performance on a measure. It is a type of validity that depicts how a particular measure relates to other measures, consistent with theoretically derived hypotheses concerning the concepts or constructs that are being measured. The process of construct validation is theory-laden.
Type of Construct Validity

o Convergent validity
o Discriminant validity
Convergent Validity
This is based on the idea that two instruments that are valid measures of the same concept should correlate rather highly with one another or yield similar results even though they are different instruments.
Discriminant Validity
This is based on the idea that two instruments, although similar to one another, should not correlate highly if they measure different concepts. This approach thus involves the simultaneous assessment of numerous instruments (multimethod) and numerous concepts (multitrait) through the computation of intercorrelations.
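The two ideas can be sketched with three hypothetical instruments: two different measures of the same concept (a satisfaction survey and a satisfaction interview rating) and one measure of a different concept (a numeracy test). The instrument names and scores are invented for illustration, and this is a much simplified stand-in for a full multitrait-multimethod matrix.

```python
from math import sqrt
from typing import Sequence

def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of the same six respondents on three instruments.
satisfaction_survey = [4, 2, 5, 3, 1, 3]
satisfaction_interview = [5, 2, 4, 3, 1, 3]
numeracy_test = [3, 3, 2, 5, 2, 3]

# Same concept, different instruments: expected to correlate highly.
convergent_r = pearson_r(satisfaction_survey, satisfaction_interview)

# Different concepts: expected to correlate weakly.
discriminant_r = pearson_r(satisfaction_survey, numeracy_test)

print(f"convergent correlation:   {convergent_r:.2f}")
print(f"discriminant correlation: {discriminant_r:.2f}")
```

A high first coefficient together with a low second one is the pattern that supports construct validity. A high correlation with the numeracy test would suggest the survey is picking up something other than satisfaction.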
Problems with Construct validity
The process of validation is theory-laden. It is thus almost impossible to 'validate' a measure of a concept unless there is in existence a theoretical network that surrounds it.
The Reliability-Validity Relationship

o An instrument that is valid is always reliable.
o An instrument that is not valid may or may not be reliable.
o An instrument that is reliable may or may not be valid.
o An instrument that is not reliable is never valid.

Reliability is a necessary, but not sufficient, condition for good measurement.