
# Introduction

### 1.1 Origin of the report

In the Statistics course (part 2), our course teacher, Dr. Swapan Kumar Dhar, assigned us to prepare a report on “Bachelor’s Degrees Earned by Field”. The data set is secondary data, collected from the website of the U.S. Census Bureau as suggested by the course teacher. It records the number of bachelor's degrees earned in different fields of education. The task was to tabulate and plot this data set and to obtain the required results with statistical tools.

### 1.2 Problem Statement

The problem statement was as follows. From the table “Bachelor’s Degrees Earned by Field”, I took a random sample of 28 fields, keeping ‘Business’ and ‘Mathematics and statistics’ fixed, and tabulated the data for each field. I then described the apparent trends and obtained the trend values by the least squares method. For ‘Business’ and ‘Mathematics and statistics’, the proportions were obtained individually, because the confidence interval (CI) was required for these two fields. The purpose of this report is to compare the different data sets using various statistical tools.

### 1.3 Methodology

The data of this report is secondary data and has been collected from the website of the U.S. Census Bureau.

# Discussion

As mentioned in the introduction, our data set is “Bachelor’s Degrees Earned by Field”. I selected a sample of 28 fields from the population, keeping ‘Business’ and ‘Mathematics and Statistics’ fixed. The data are shown below in Table-1.

From this table the mean of the data has been obtained. The mean indicates the average number of students who earned degrees across the different fields of education from 1980 to 2006. The mean is computed separately for each year and is shown in Table-2.

Table-1: Bachelor's Degrees Earned by Fields

| Field of Study | 1980 | 1990 | 2000 | 2003 | 2004 | 2005 | 2006 |
|---|---|---|---|---|---|---|---|
| Agriculture and natural resources | 22,802 | 12,900 | 24,238 | 23,294 | 22,835 | 23,002 | 23,053 |
| Architecture and related services | 9,132 | 9,364 | 8,462 | 9,054 | 8,838 | 9,237 | 9,515 |
| Area, ethnic, cultural, and gender studies | 2,840 | 4,447 | 6,212 | 6,629 | 7,181 | 7,569 | 7,879 |
| Biological and biomedical sciences | 46,190 | 37,204 | 63,005 | 60,072 | 61,509 | 64,611 | 69,178 |
| Business | 186,264 | 248,568 | 256,070 | 293,545 | 307,149 | 311,574 | 318,042 |
| Communications technologies | 1,689 | 1,458 | 1,298 | 1,933 | 2,034 | 2,523 | 2,981 |
| Computer and information sciences | 11,154 | 27,347 | 37,788 | 57,439 | 59,488 | 54,111 | 47,480 |
| Education | 118,038 | 105,112 | 108,034 | 105,790 | 106,278 | 105,451 | 107,238 |
| Engineering and engineering technologies | 69,387 | 82,480 | 73,419 | 77,267 | 78,227 | 79,743 | 81,223 |
| Engineering | 58,896 | 64,509 | 58,822 | 62,611 | 63,558 | 64,906 | 67,045 |
| Engineering technologies | 10,491 | 17,971 | 14,597 | 14,656 | 14,669 | 14,837 | 14,178 |
| English language and literature/letters | 32,187 | 46,803 | 50,106 | 53,670 | 53,984 | 54,379 | 55,096 |
| Family and consumer sciences/human sciences | 18,411 | 13,514 | 16,321 | 18,166 | 19,172 | 20,074 | 20,775 |
| Foreign languages, literatures, and linguistics | 12,480 | 13,133 | 15,886 | 16,901 | 17,754 | 18,386 | 19,410 |
| Health professions and related clinical sciences | 63,848 | 58,983 | 80,863 | 71,223 | 73,934 | 80,685 | 91,973 |
| Legal professions and studies | 683 | 1,632 | 1,969 | 2,466 | 2,841 | 3,161 | 3,302 |
| Liberal arts and sciences, general studies | 23,196 | 27,985 | 36,104 | 40,221 | 42,106 | 43,751 | 44,898 |
| Library science | 398 | 77 | 154 | 99 | 72 | 76 | 76 |
| Mathematics and statistics | 11,378 | 14,276 | 11,418 | 12,493 | 13,327 | 14,351 | 14,770 |
| Military technologies | 38 | 196 | 7 | 6 | 10 | 40 | 33 |
| Multi/interdisciplinary studies | 11,457 | 16,557 | 28,561 | 28,757 | 29,162 | 30,243 | 32,012 |
| Parks, recreation, leisure and fitness studies | 5,753 | 4,582 | 17,571 | 21,428 | 22,164 | 22,888 | 25,490 |
| Philosophy and religious studies | 7,069 | 7,034 | 8,535 | 10,344 | 11,152 | 11,584 | 11,985 |
| Physical sciences and science technologies | 23,407 | 16,056 | 18,331 | 17,940 | 17,983 | 18,905 | 20,318 |
| Public administration and social services | 16,644 | 13,908 | 20,185 | 19,878 | 20,552 | 21,769 | 21,986 |
| Security and protective services | 15,015 | 15,354 | 24,877 | 26,189 | 28,175 | 30,723 | 35,319 |
| Social sciences and history | 103,662 | 118,083 | 127,101 | 143,218 | 150,357 | 156,892 | 161,485 |
| Theology and religious vocations | 6,170 | 5,185 | 6,789 | 7,926 | 8,126 | 9,284 | 8,548 |

Table-2: Mean values of the 28 randomly selected fields

| Field of Study | 1980 | 1990 | 2000 | 2003 | 2004 | 2005 | 2006 |
|---|---|---|---|---|---|---|---|
| Agriculture and natural resources | 22,802 | 12,900 | 24,238 | 23,294 | 22,835 | 23,002 | 23,053 |
| Architecture and related services | 9,132 | 9,364 | 8,462 | 9,054 | 8,838 | 9,237 | 9,515 |
| Area, ethnic, cultural, and gender studies | 2,840 | 4,447 | 6,212 | 6,629 | 7,181 | 7,569 | 7,879 |
| Biological and biomedical sciences | 46,190 | 37,204 | 63,005 | 60,072 | 61,509 | 64,611 | 69,178 |
| Business | 186,264 | 248,568 | 256,070 | 293,545 | 307,149 | 311,574 | 318,042 |
| Communications technologies | 1,689 | 1,458 | 1,298 | 1,933 | 2,034 | 2,523 | 2,981 |
| Computer and information sciences | 11,154 | 27,347 | 37,788 | 57,439 | 59,488 | 54,111 | 47,480 |
| Education | 118,038 | 105,112 | 108,034 | 105,790 | 106,278 | 105,451 | 107,238 |
| Engineering and engineering technologies | 69,387 | 82,480 | 73,419 | 77,267 | 78,227 | 79,743 | 81,223 |
| Engineering | 58,896 | 64,509 | 58,822 | 62,611 | 63,558 | 64,906 | 67,045 |
| Engineering technologies | 10,491 | 17,971 | 14,597 | 14,656 | 14,669 | 14,837 | 14,178 |
| English language and literature/letters | 32,187 | 46,803 | 50,106 | 53,670 | 53,984 | 54,379 | 55,096 |
| Family and consumer sciences/human sciences | 18,411 | 13,514 | 16,321 | 18,166 | 19,172 | 20,074 | 20,775 |
| Foreign languages, literatures, and linguistics | 12,480 | 13,133 | 15,886 | 16,901 | 17,754 | 18,386 | 19,410 |
| Health professions and related clinical sciences | 63,848 | 58,983 | 80,863 | 71,223 | 73,934 | 80,685 | 91,973 |
| Legal professions and studies | 683 | 1,632 | 1,969 | 2,466 | 2,841 | 3,161 | 3,302 |
| Liberal arts and sciences, general studies | 23,196 | 27,985 | 36,104 | 40,221 | 42,106 | 43,751 | 44,898 |
| Library science | 398 | 77 | 154 | 99 | 72 | 76 | 76 |
| Mathematics and statistics | 11,378 | 14,276 | 11,418 | 12,493 | 13,327 | 14,351 | 14,770 |
| Military technologies | 38 | 196 | 7 | 6 | 10 | 40 | 33 |
| Multi/interdisciplinary studies | 11,457 | 16,557 | 28,561 | 28,757 | 29,162 | 30,243 | 32,012 |
| Parks, recreation, leisure and fitness studies | 5,753 | 4,582 | 17,571 | 21,428 | 22,164 | 22,888 | 25,490 |
| Philosophy and religious studies | 7,069 | 7,034 | 8,535 | 10,344 | 11,152 | 11,584 | 11,985 |
| Physical sciences and science technologies | 23,407 | 16,056 | 18,331 | 17,940 | 17,983 | 18,905 | 20,318 |
| Public administration and social services | 16,644 | 13,908 | 20,185 | 19,878 | 20,552 | 21,769 | 21,986 |
| Security and protective services | 15,015 | 15,354 | 24,877 | 26,189 | 28,175 | 30,723 | 35,319 |
| Social sciences and history | 103,662 | 118,083 | 127,101 | 143,218 | 150,357 | 156,892 | 161,485 |
| Theology and religious vocations | 6,170 | 5,185 | 6,789 | 7,926 | 8,126 | 9,284 | 8,548 |
| Mean | 31,739 | 35,169 | 39,883 | 42,972 | 44,380 | 45,527 | 46,975 |
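The yearly mean can be reproduced directly from one column of the table. The following is a minimal sketch for the 1980 column only; the list `degrees_1980` is transcribed from the 28 rows above, and the helper name `column_mean` is an illustrative assumption, not part of the original report.

```python
# 1980 column for the 28 sampled fields, in the row order of the table.
degrees_1980 = [
    22802, 9132, 2840, 46190, 186264, 1689, 11154, 118038, 69387, 58896,
    10491, 32187, 18411, 12480, 63848, 683, 23196, 398, 11378, 38,
    11457, 5753, 7069, 23407, 16644, 15015, 103662, 6170,
]


def column_mean(values):
    """Arithmetic mean of one year's column (hypothetical helper)."""
    return sum(values) / len(values)


mean_1980 = column_mean(degrees_1980)
# Compare the rounded result with the 1980 entry of the Mean row.
print(round(mean_1980))
```

The same function can be applied to any other year's column to check the remaining entries of the Mean row.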

Looking closely at the table, we see that all the data are arranged with respect to time, so we can refer to them as a time series: a set of measurements of a particular quantity of interest, ordered over time. The data have changed over a long period, and the yearly means indicate an increasing tendency in this data set. This general tendency of a time series over a fairly long period of time is termed the trend, or secular trend. There are several methods for measuring the secular trend of this data set:

1. Graphical Method
2. Semi-average Method
3. Moving average Method
4. Least Squares Method

To find the trend values for the given data, I have followed the Least Squares Method. The calculation for fitting the straight line is shown here.

The trend is assumed to be linear, so the trend equation has the form $y_c = a + bx$, where $a$ and $b$ are the two parameters. Applying the least squares method, the values of $a$ and $b$ are estimated from the normal equations

$$\sum y = na + b\sum x$$

and

$$\sum xy = a\sum x + b\sum x^2.$$

Here the number of years (n) is 7, which is an odd number. We therefore take the middle year, 2003, as the origin of x, with 1 year as the unit of x.

Table-3: Calculation of fitting a straight line

| Year (t) | x = t - 2003 | No. of people who earned a Bachelor's degree (y) | x² | xy | Trend value (y = a + bx) |
|---|---|---|---|---|---|
| 1980 | -23 | 926,731 | 529 | -21,314,813 | 2,159,065 |
| 1990 | -13 | 1,020,205 | 169 | -13,262,665 | 1,744,302 |
| 2000 | -3 | 1,169,302 | 9 | -3,507,906 | 1,329,540 |
| 2003 | 0 | 1,268,060 | 0 | 0 | 1,205,112 |
| 2004 | 1 | 1,312,637 | 1 | 1,312,637 | 1,163,636 |
| 2005 | 2 | 1,348,141 | 4 | 2,696,282 | 1,122,159 |
| 2006 | 3 | 1,390,706 | 9 | 4,172,118 | 1,080,683 |
| Total | -33 | 8,435,782 | 721 | -29,904,347 | |

Now from the equation, a = 1146573.6 and b = -40018.18169.

I then obtained the trend value for each year and listed it in Table-3. The fitted trend line is shown in the following graphical presentation.

Figure 1: Fitting straight line trend
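As a cross-check of the straight-line fit, the sketch below solves the general least squares normal equations for the yearly totals listed in Table-3. Because the years are unevenly spaced, the coded values x = t - 2003 sum to -33 rather than 0, so the shortcut a = Σy/n, b = Σxy/Σx² is not exact here and the general closed form is used instead. The function name `fit_line` is an illustrative assumption, and the results need not coincide with the values reported above; under these assumptions the fitted slope comes out positive, in line with the rising yearly totals.

```python
def fit_line(xs, ys):
    """Least squares estimates (a, b) for y = a + b*x via the normal equations."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # General solution; reduces to b = sum_xy / sum_xx only when sum_x == 0.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


years = [1980, 1990, 2000, 2003, 2004, 2005, 2006]
totals = [926731, 1020205, 1169302, 1268060, 1312637, 1348141, 1390706]
xs = [t - 2003 for t in years]  # origin at the middle year, unit = 1 year

a, b = fit_line(xs, totals)
trend = [a + b * x for x in xs]  # fitted trend value for each year

for year, value in zip(years, trend):
    print(year, round(value))
```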

The problem statement also required the proportion and the confidence interval (CI) of ‘Business’ and ‘Mathematics and Statistics’ individually for each year. To find the CI, the standard error (SE) of the sample proportion is also needed.

Table-4 shows the proportion and the CI for the ‘Business’ field. Here n = 28; X is the number of students who earned a degree in ‘Business’ each year; N is the total number of students who earned degrees each year; and P is the population proportion.

T-Distribution Table Analysis

Using the t distribution for estimation is required whenever the sample size is 30 or less and the population standard deviation is not known. The t table differs in construction from the z table: it is more compact and shows areas and t values for only a few percentages (10, 5, 2, and 1). Because there is a different t distribution for each number of degrees of freedom, a complete table would be considerably longer.

In using the t table we must specify the degrees of freedom with which we are dealing. In this report on bachelor's degrees earned, we take a sample of 28 fields from a population of 36 fields. That is,

Sample size, n = 28; degrees of freedom, df = n - 1 = 28 - 1 = 27; acceptable error, α = 0.05; confidence level, 1 - α = 1 - 0.05 = 0.95 = 95%.

Now we look down the 0.05 column of the t table until we reach the row for 27 degrees of freedom. There we find that the t value is 2.052, which we use to set the confidence limits of the proportions for BUSINESS and for MATHEMATICS AND STATISTICS.
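The table lookup can be checked numerically. The sketch below is one possible way to do so using only the Python standard library: it integrates the t density with Simpson's rule and bisects for the value whose central area is 1 - α. The function names `t_pdf`, `central_area`, and `t_critical` are illustrative assumptions, not part of the original report.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def central_area(t, df, steps=2000):
    """Area under the density between -t and +t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # density is symmetric about 0


def t_critical(alpha, df):
    """Two-sided critical value: central area equals 1 - alpha (bisection)."""
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_area(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_value = t_critical(0.05, 27)
print(round(t_value, 3))  # compare with the tabulated value for df = 27
```

If SciPy is available, `scipy.stats.t.ppf(0.975, 27)` gives the same quantity in one call.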

In order to find the PROPORTION of BUSINESS and of MATHEMATICS AND STATISTICS we require the following equations.

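Sample proportion:

$$\hat{p} = \frac{\text{number of successes}}{\text{total number of outcomes}} = \frac{X}{N}$$

Standard error of the sample proportion:

$$SE(\hat{p}) = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}$$

Confidence interval of the proportion:

$$\hat{p} \pm t_{\alpha/2}\, SE(\hat{p}), \quad \text{that is,} \quad \hat{p} - t_{\alpha/2}\, SE(\hat{p}) < P < \hat{p} + t_{\alpha/2}\, SE(\hat{p})$$

where the two bounds are the lower and upper confidence limits.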
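The equations above can be sketched as a small function. The example below is illustrative only: it takes X for ‘Business’ in 2004 from Table-1 and N for 2004 from Table-3, uses n = 28 and t = 2.052 as stated in the report, and the name `proportion_ci` is an assumption introduced here.

```python
import math


def proportion_ci(x, total, n, t_value):
    """Return (p_hat, se, lower, upper) using p_hat = X/N and SE = sqrt(p(1-p)/n)."""
    p_hat = x / total
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = t_value * se
    return p_hat, se, p_hat - margin, p_hat + margin


# Business, 2004: X from Table-1, N from Table-3; n and t as stated in the report.
p_hat, se, lower, upper = proportion_ci(x=307149, total=1312637, n=28, t_value=2.052)
print(f"p_hat={p_hat:.4f}  SE={se:.4f}  CI=({lower:.4f}, {upper:.4f})")
```

Calling the same function with the ‘Mathematics and statistics’ counts gives the corresponding interval for that field.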

Table-4: The proportion and CI of ‘Business’ field

| Year | X | N | Proportion p̂ = X/N | SE(p̂) = √(p̂(1 - p̂)/n) | p̂ - t α/2 SE(p̂) | p̂ + t α/2 SE(p̂) | Range of CI |
|---|---|---|---|---|---|---|---|

[Table body not recoverable; surviving values: 926,73…, 0.0042…]

From the table above, we have found the upper and lower limits of the population proportion at a confidence level of 100(1 - 0.05)% = 95%, where the error is α = 0.05. These limits bound the proportion of the total population who earned a degree in the ‘Business’ field.

From the above figure, we can report with 95% confidence that the proportion for the BUSINESS field in a given year lies within our estimated confidence range, with α = 0.05. The figure is based on the sample proportion of the BUSINESS field in 2004.

Table-5 shows the proportion and the CI for ‘Mathematics and Statistics’. Here n = 28; X is the number of students who earned a degree in ‘Mathematics and Statistics’ each year; N is the total number of students who earned degrees each year; and P is the population proportion.

Table-5: The proportion and CI of ‘Mathematics and Statistics’ field

| Year | X | N | Proportion p̂ = X/N | SE(p̂) = √(p̂(1 - p̂)/n) | p̂ - t α/2 SE(p̂) | p̂ + t α/2 SE(p̂) | Range of CI |
|---|---|---|---|---|---|---|---|

[Table body not recoverable; surviving values: P(198…, 186,2…, 926,73…, 0.1716…; P(200…, 307,1…, 1,312,6…, 0.2030…]

From the table above, we have found the upper and lower limits of the population proportion at a confidence level of 100(1 - 0.05)% = 95%, where the error is α = 0.05. These limits bound the proportion of the total population who earned a degree in the ‘Mathematics and Statistics’ field.

From the above figure, we can report with 95% confidence that the proportion for the MATHEMATICS & STATISTICS field in a given year lies within our estimated confidence range, with α = 0.05. The figure is based on the sample proportion of the MATHEMATICS & STATISTICS field in 2004.

# Conclusion

### 3.1 Ending Summary

This report presents a statistical analysis of bachelor's degrees earned by field, carried out with suitable statistical tools.

After interpreting all the data, I have found the following characteristic of the given data set.

With 95% confidence, the proportion for a field in a given year lies within our estimated confidence range, with α = 0.05. This result is based on the sample proportions of the BUSINESS and MATHEMATICS & STATISTICS fields.