[Figure: x-y plot with labeled points including CA, FL, PA, and NY, and the mean point (xm, ym).]
1. Summary
The gun violence death statistics (for 2010) for different states of the US are reviewed here. It is shown that as the population x increases, the number of murders due to guns, y, also increases, following a simple linear relation of the type y = hx + c. For the US, we deduce the linear law (based on 2010 data) y = 3.37x - 27.32, where x is the population in 100,000s. Hence the gun deaths per 100,000 of population, the ratio y/x = h + (c/x), is not a constant and is biased against the larger states like California. There is nothing "odd" about the fact that California has the highest gun deaths per 100,000 of population while also having the toughest gun laws. The entire debate on gun deaths (on both sides of the fence) is based on an erroneous understanding of the meaning of the y/x ratio (gun deaths per 100,000 population) when the x-y relation is a straight line. Differences in the gun violence deaths in different US states can be understood, as shown here, using the idea of a work function, akin to Einstein's work function in the photoelectric law.
Table 1: Population, total murders, and gun murders by state (2010)

Howerton's CA figures: population 37,553,956; x = 375.54; gun murders y = 1220.
The following figures were compiled from the Wikipedia article cited.

No.  State            Population    x (in 100,000)  Total murders  Gun murders, y
1    California       37,253,956    372.54          1811           1257
2    Texas            25,145,561    251.46          1246           805
3    Florida          19,687,653    196.88          987            669
4    New York         19,378,102    193.78          860            517
5    Pennsylvania     12,702,379    127.02          646            457
6    Ohio             11,536,504    115.37          460            310
7    Georgia          9,920,000     99.20           527            376
8    Michigan         9,883,640     98.84           558            413
9    North Carolina   9,535,483     95.35           445            286
10   New Jersey       8,791,894     87.92           363            246
11   Virginia         8,001,024     80.01           369            250
12   Washington       6,724,540     67.25           151            93
13   Massachusetts    6,547,629     65.48           209            118
14   Arizona          6,392,017     63.92           352            232
15   Maryland         5,773,552     57.74           424            293
16   Colorado         5,029,196     50.29           117            65
17   Alabama          4,799,736     48.00           199            135
18   Oregon           3,831,074     38.31           78             36
19   Connecticut      3,574,097     35.74           131            97
20   Utah             2,763,885     27.64           52             22
21   West Virginia    1,852,994     18.53           55             27
22   DC               601,723       6.02            131            99

Average (for 21 states plus the District of Columbia, Washington DC): xm = 99.876, ym = 309.23
Data source: Gun violence in United States by state (2010), http://en.wikipedia.org/wiki/Gun_violence_in_the_United_States_by_state

My main purpose here is to discuss the significance of this x-y relation, without any political posturing, pro or con. According to Howerton (the following is an exact quote), "California had the highest number of gun murders in 2011 with 1,220 which makes up 68 percent of all murders in the state that year and equates to 3.25 murders per 100,000 people. The irony of such a grisly distinction is evident when you look at which state was named the state with the strongest gun control laws in 2011 by the Brady Campaign to Prevent Gun Violence. You guessed it - it was California."

The figure being used here for discussion is a y/x ratio, 3.25 murders per 100,000 population. If x is the population (expressed in units of 100,000) and y is the number of gun-related deaths, the ratio y/x = 3.25 is the figure that is being prominently highlighted. Notice that y = 1220, the absolute number of gun murders, is overlooked. The population x is not mentioned. Of course, we all know that California is a highly populous state. Since we have the ratio y/x = 3.25 and the number of gun murders, y = 1220, we can deduce the population x that was used to compute this ratio. It turns out to be 37,553,956, or x = 375.54 if the population is expressed in units of 100,000 (about 37.55 million).

Since the raw data was not available in this news article, I obtained the x and y values for various states from the Wikipedia article, the easiest source that I could find, although I would have liked to use the same source Howerton had used. (I have since been able to find the FBI Uniform Crime Reports, the source used by Howerton; see Ref. [19]. This 2011 data is discussed briefly in Appendix 1, in this updated version.) However, as you can see from Table 1, the number of gun deaths and the gun deaths per 100,000 population based on the numbers that I have used are in good agreement, although the data here are for 2010. Hence, we can take Figure 1 as a perfectly legitimate consequence of the y/x ratio, the gun deaths per 100,000 population, that has dominated the discussion on this contentious topic.

What can we learn from the x-y graph in Figure 1? First, we learn that the number of gun deaths increases as the population increases, following a remarkably simple linear law. This is simply common sense. No explanation is needed.
As long as murders are committed by the criminal elements in society (and here it just happens to be murders committed using guns), we expect the number of murders to increase as the
population increases. This is also confirmed by the positive correlation between total murders and gun murders in a population, see Figure 2.
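The linear law quoted in the Summary can be checked against the Table 1 figures. The following is a minimal Python sketch, not the author's own worksheet. It assumes the deviation form of the least-squares formulas described under Ref. [3] and takes y to be the gun murders column; the names TABLE_1 and fit_line are hypothetical.

```python
# Population x (in 100,000s) and gun murders y for the 22 entries of Table 1.
TABLE_1 = [
    ("California", 372.54, 1257), ("Texas", 251.46, 805),
    ("Florida", 196.88, 669), ("New York", 193.78, 517),
    ("Pennsylvania", 127.02, 457), ("Ohio", 115.37, 310),
    ("Georgia", 99.20, 376), ("Michigan", 98.84, 413),
    ("North Carolina", 95.35, 286), ("New Jersey", 87.92, 246),
    ("Virginia", 80.01, 250), ("Washington", 67.25, 93),
    ("Massachusetts", 65.48, 118), ("Arizona", 63.92, 232),
    ("Maryland", 57.74, 293), ("Colorado", 50.29, 65),
    ("Alabama", 48.00, 135), ("Oregon", 38.31, 36),
    ("Connecticut", 35.74, 97), ("Utah", 27.64, 22),
    ("West Virginia", 18.53, 27), ("DC", 6.02, 99),
]


def fit_line(points):
    """Least-squares slope h, intercept c and r^2, using deviations from the mean."""
    n = len(points)
    xm = sum(x for x, _ in points) / n
    ym = sum(y for _, y in points) / n
    sxy = sum((x - xm) * (y - ym) for x, y in points)
    sxx = sum((x - xm) ** 2 for x, _ in points)
    h = sxy / sxx
    c = ym - h * xm  # the best-fit line passes through (xm, ym)
    ss_res = sum((y - (h * x + c)) ** 2 for x, y in points)
    ss_tot = sum((y - ym) ** 2 for _, y in points)
    r2 = 1 - ss_res / ss_tot
    return h, c, r2, xm, ym


h, c, r2, xm, ym = fit_line([(x, y) for _, x, y in TABLE_1])
print(f"xm = {xm:.3f}, ym = {ym:.2f}")
print(f"y = {h:.2f}x + ({c:.2f}), r^2 = {r2:.4f}")
```

Run on these figures, the sketch should give values close to the slope, intercept, and r² quoted in the text, which suggests that the quoted law was fitted to the gun murders column of Table 1.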
[Figure 2: gun murders plotted against total murders.]
allow us to draw more meaningful conclusions about the relative gun violence related deaths for the various states in our data set. In other words, a single y/x ratio cannot be used to make comparisons, or to make predictions (here, about gun deaths, and likewise for other problems of interest), because of the nonzero intercept c, which can be positive or negative. (This depends on how the data is aggregated; see also Appendix 1.)

The simple linear law, y = hx + c, relating the murders y and population x is also confirmed if we consider the 2011 data for just four states, obtained from the FBI Uniform Crime Report; see Table 2, plotted in Figure 3. The firearm murder rate, i.e., the ratio y/x = h + (c/x), is not a constant and varies with increasing population x. How it varies depends on the numerical value of the nonzero intercept c in this relation.

Table 2: Gun Crime Rates by US States (FBI Uniform Crime Report 2011)

State             Population, x    Total firearm    Firearm murder rate, y/x
                  (in 100,000)     murders, y       (murders per 100,000)
California (CA)   375.38           1220             3.25
Illinois (IL)     128.67           377              2.93
Virginia (VA)     80.62            208              2.58
Utah (UT)         26.8             26               0.97

The line joining the CA-IL pair has the equation y = 3.417x - 62.66. Hence, the death rate y/x = 3.417 - (62.66/x) increases as we move up this line to larger populations. The VA and UT data fall virtually on this line and are consistent with CA-IL, since there is some error in the population estimates.

The linear law should hardly come as a surprise. What is surprising is the conclusions that we draw based only on the y/x ratio while overlooking the x and y values. (The population x values are not even listed in the crime data summary table given in Ref. [19].) The ratio y/x is clearly biased against the states with the larger populations x. The reason for this bias is a mathematical one and is easily understood. It has to do with the nonzero intercept c in the mathematical relationship between x and y. If c < 0, the ratio y/x will increase with increasing x and is therefore biased against the states with the larger populations. This is the situation with the four states considered in Figure 3.
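The effect of a negative intercept on the ratio y/x can be illustrated with the four states of Table 2. The sketch below is only an illustration under the assumptions stated in its comments; the helper name line_through is hypothetical. It passes a straight line through the CA and IL points and then evaluates y/x = h + (c/x) at a few populations.

```python
# (x, y) pairs from Table 2: population in 100,000s, firearm murders (2011).
ca = (375.38, 1220)
il = (128.67, 377)
others = {"VA": (80.62, 208), "UT": (26.8, 26)}


def line_through(p1, p2):
    """Slope h and intercept c of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    h = (y2 - y1) / (x2 - x1)
    c = y1 - h * x1
    return h, c


h, c = line_through(il, ca)
print(f"CA-IL line: y = {h:.3f}x + ({c:.2f})")

# With c < 0 the ratio y/x = h + c/x rises toward h as x grows.
for x in (25, 50, 100, 200, 400):
    print(f"x = {x:>3}: y/x = {h + c / x:.2f}")

# How far the VA and UT data sit from the CA-IL line.
for name, (x, y) in others.items():
    print(f"{name}: observed y = {y}, line gives {h * x + c:.1f}")
```

The printed ratios rise steadily with x even though every point lies on one and the same straight line, which is the bias against the larger states described above.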
[Figure 3: total firearm murders y versus population x (in 100,000s) for the four states in Table 2.]
Did you know (or have you ever consciously thought about the fact) that, if a straight line does not pass through the origin, the ratio y/x is not a constant and can either increase or decrease as one moves up or down the line? Further discussion of this point can be found in the references cited and in other articles listed in the bibliography, Ref. [4].
If we consider the game-by-game batting stats of a great baseball player like Babe Ruth, we will find some games with (x, y) scores such as (0, 0), (1, 1), (2, 2), (3, 3), (4, 4), and (5, 5), where the first number x is the At Bats (AB) and the second number y is the number of Hits. In other words, y = x. Thus, for these games, Babe Ruth had the PERFECT batting average of BA = y/x = 1/1 = 2/2 = 3/3 = 4/4 = 5/5 = 1.000.

In other games, the scores were (1, 0), (2, 1), (3, 2), (4, 3) and even (6, 5). In other words, y = x + c = x - 1, where the nonzero intercept c = -1 is the number of missing hits. The BA = y/x = 1 - (1/x) falls short of the PERFECT value, and the shortfall 1/x shrinks as the AB increases. We will also find games with scores like (2, 0), (3, 1), (4, 2), (5, 3), or y = x + c = x - 2 and the BA = y/x = 1 - (2/x).

If we continue the analysis and consider the aggregated stats for several months in a season, or several seasons of a career, we find that the number of Hits y increases following a simple linear law y = hx + c with h < 1 and a nonzero c. The ratio y/x = BA = h + (c/x) can either increase or decrease as the number of AB increases, depending on the numerical values of the constants h and c in this relationship. The number of missing hits determines the numerical value of the constant c, the baseball work function.

We find exactly similar linear relations when we analyze the GDP (x) and the Debt (y) data for several countries. Although we cannot understand the genesis of the linear law here, as we can with baseball batting stats, the GDP-Debt data also reveal the same linear law. Hence, the Debt/GDP ratio can either increase or decrease as the GDP increases. With a slope h > 0 (as usually observed), if the intercept c > 0, the ratio y/x (Debt/GDP or BA) decreases as x (GDP or AB) increases. The reverse is true if the nonzero intercept c < 0.
This nonzero intercept can be compared to the work function, first conceived by Einstein in 1905 to explain certain puzzling empirical observations on photoelectricity. When light shines on the surface of a metal, a stream of photons, each with energy ε, bombards the surface of the metal and ejects electrons. The electrons can be collected and made to flow as an electric current in an external circuit. (Modern photocells work on this principle.) The maximum (kinetic) energy of the electron, K, must be less than ε because some energy must be given up to do the work needed to overcome the forces that bind the electron to the metal. Hence, Einstein proposed a simple linear law K = ε - W = hf - W = h(f - f0), where W = hf0 is the work function. This is the minimum energy needed to produce the electron. Here h is the universal constant, now called the Planck constant, and f is the frequency of light.

Einstein was thus able to conclude that the experiments with various metals will reveal a linear K-f graph with a slope equal to the Planck constant h. This fundamental constant can therefore be determined directly from carefully planned experiments which (if confirmed) would also reveal the quantum nature of light radiation. These points have been discussed in detail in the references cited. Einstein's law also implies that the K-f graphs for various metals will be a series of parallels, each having a slope equal to h. We will see the implications of this very shortly for the gun death statistics of interest to us here.

Einstein's work function W is just like the missing hits in baseball statistics. Babe Ruth's batting stats were discussed to provide the non-physicist with a simple explanation for the significance of the nonzero c in the law y = hx + c that is often observed when we analyze our empirical observations on many complex problems of interest to us.

The discussion of Airline Quality Ratings, or the AQR scores, see Refs. [11-13], is another example where we find various y/x ratios (relating On-time arrivals, Missed Baggage, Denied Boardings, and Customer Complaints) being used to produce a rating, or a score, for various airlines, similar to the scores being used in the gun violence debate. However, as shown by a detailed consideration of the On-time (OT) arrivals ratio, this ratio is biased in favor of the airlines that operate the least number of flights. The nonzero constant c in the law y = hx + c relating the number of flights x and the number of On-time arrivals y has a positive value. Hence, the OT arrival ratio y/x favors the smaller airlines. Not surprisingly, Virgin America, the smallest of the 14 airlines considered in the annual ratings, was the Best American Airline.
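The idea of parallel K-f lines for different metals can be sketched numerically. The two work functions below are hypothetical values chosen only for illustration (they are not measured data for any particular metal), and the function name max_kinetic_energy is likewise an assumption. The sketch evaluates K = hf - W at two frequencies and recovers the slope of each line.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J s (exact SI value)
EV = 1.602176634e-19       # one electron-volt in joules (exact SI value)

# Hypothetical work functions in eV, for illustration only (not measured data).
WORK_FUNCTIONS_EV = {"metal A": 2.0, "metal B": 4.5}


def max_kinetic_energy(frequency_hz, work_function_ev):
    """Einstein's law K = hf - W in eV; zero below the cut-off frequency f0 = W/h."""
    k = PLANCK_H * frequency_hz / EV - work_function_ev
    return max(k, 0.0)


f1, f2 = 1.2e15, 1.5e15  # two frequencies above both cut-off frequencies
for metal, w in WORK_FUNCTIONS_EV.items():
    k1 = max_kinetic_energy(f1, w)
    k2 = max_kinetic_energy(f2, w)
    slope = (k2 - k1) * EV / (f2 - f1)  # converted back to J s
    print(f"{metal}: K(f1) = {k1:.2f} eV, K(f2) = {k2:.2f} eV, slope = {slope:.3e} J s")
```

Both lines have the same slope, the Planck constant, and differ only in their intercepts; this is the picture of parallels that is carried over to the x-y data in this article.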
Finally, to appreciate the significance of the work function, or the nonzero c, take a look again at the x-y graph in Figure 1. The (x, y) pair for DC (District of Columbia), with a very high y/x ratio, falls above the best-fit line, as does the data for Michigan, which also has a rather high y/x ratio. DC has a very low population and so its high y/x ratio is not too surprising. Michigan is not as big as California but has a sizable population of nearly 10 million, see Table 3.
With the higher population, Δx = 92.82, the number of gun-related deaths also increased by the amount Δy = 314. Amazingly, we find that the slope h = Δy/Δx = 314/92.82 = 3.38, deduced for the straight line joining the DC-Michigan data, is nearly identical to the slope h = 3.37 for the best-fit line through all the data points. The same pattern is observed with the data for New York, on the east coast, and the state of Washington, on the west coast of the nation, see Table 4. These (x, y) pairs fall below the best-fit line.
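The two slopes can be recomputed directly from the Table 1 figures. The following is a small sketch under that assumption; the names PAIRS and results are hypothetical.

```python
# (x, y) pairs from Table 1: population in 100,000s, gun murders (2010).
PAIRS = {
    "DC-Michigan": ((6.02, 99), (98.84, 413)),
    "Washington-New York": ((67.25, 93), (193.78, 517)),
}
BEST_FIT_SLOPE = 3.37  # slope of the best-fit line quoted in the text

results = {}
for label, ((x1, y1), (x2, y2)) in PAIRS.items():
    dx, dy = x2 - x1, y2 - y1
    h = dy / dx       # slope of the line joining the two points
    c = y1 - h * x1   # intercept of that line
    results[label] = (h, c)
    print(f"{label}: dx = {dx:.2f}, dy = {dy}, h = {h:.2f}, c = {c:.2f}, "
          f"h - best fit = {h - BEST_FIT_SLOPE:+.2f}")
```

Both slopes agree with the best-fit slope to within a few hundredths, while the intercepts differ in sign: positive for DC-Michigan and negative for Washington-New York.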
In other words, even the small scatter that we see in the data (recall that the regression coefficient r² = 0.9492 is very high, indicating a very strong positive correlation) can be understood in terms of a work function that describes this complex problem, see Figure 4. One can essentially envision a series of parallels, just like the K-f parallels for different metals in the Nobel Prize winning photoelectric experiments of Millikan (see discussion in Refs. [5-10]). The work function in the gun violence problem is thus quite similar to that in the photoelectricity experiments. Just like the complex environment in which an electron is located in various metals, gun deaths occur in a complex environment in various states of the nation. The small differences in the constant c, which give rise to the parallels envisioned here, are due to these differences in the local environment in which gun violence is experienced. Nonetheless, the remarkable constancy of the slope h that is observed here also points to a more fundamental constant, the significance of which has been overlooked in the gun death debate.
Figure 4: The best-fit line, y = 3.37x - 27.32, together with two lines that are almost perfectly parallel to it (the numerical values of the slopes differ only in the second decimal place): the line joining the DC-Michigan data (upper line, y = 3.38x + 78.65) and the line joining the Washington-New York data (y = 3.35x - 132.36).
Is the constant h the same for all countries, for example? Sadly, the US is, perhaps, unique in this regard, with the highest gun ownership in the whole world, see Ref. [19].

I am not here as a crime expert. All I have done is to offer an analysis of the (x, y) data based on my understanding of the mathematical principles that must be applied to make sense of data when we are confronted with tables upon tables of (x, y) values. It is clear that the current understanding of economic, financial, business, social, political, and other cultural observations using simple y/x ratios cannot be justified without paying attention to the nature of the underlying x-y relation.

Thus far, I had deliberately stayed away from this topic of gun death statistics because of its inherent controversies. However, it is now clear that this too is another example of how one (ab)uses ratios and percentages simply to score political points. Unfortunately, besides simply scoring political points, the widespread use of ratios in economics, financial analysis, and in the social and political sciences has important and far-reaching societal consequences. The fortunes of companies (their stock prices) are related to how their profit margins (the ratio of profits y to revenues x) and earnings per share (the ratio of earnings y to number of outstanding shares x) are viewed by Wall Street analysts. Unemployment rates (the ratio of unemployed y to labor force x) dictate political policy. The Debt/GDP ratio dictates austerity programs, and so on. The gun death statistics highlighted here are another example.

It is to be hoped that the simple x-y diagram prepared here will provide those on both sides of this important national debate something to think about. The bottom line: The entire gun debate, based on the use of gun death statistics (essentially misconceived y/x ratios), is meaningless. We have to approach this issue not based on these misleading ratios (or statistics) but based on values that we cherish as a nation. "There are three kinds of lies: lies, damned lies, and statistics." This has been proved to be eminently true in the gun debates.

In summary, the entire debate on gun deaths (on both sides of the fence) is based on an erroneous understanding of the meaning of the y/x ratio (gun deaths per 100,000 population) when the x-y relation is a straight line. Instead of throwing these numbers (y/x ratios) at each other, with all the underlying difficulties of interpretation, let us devote our energies to a reasonable solution that does the most public good while also being mindful of the desire of law abiding citizens to bear arms, as permitted by the Constitution of the United States.
(y1 - hx1) = (y2 - hx2) and draw some meaningful conclusions about the relative numbers for the various states in our data set. In other words, a single y/x ratio cannot be used to make comparisons, or to make predictions (here, about the number of murders, and likewise for other problems of interest), because of the nonzero intercept c, which can be positive or negative. This depends on how the data is aggregated; see Figures 5 and 6.
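The comparison of the quantity y - hx can be sketched as follows. This is only an illustration: it assumes the common slope h = 3.37 of the 2010 best-fit line and uses a handful of (x, y) pairs from Table 1 rather than the 2011 data plotted in the figures; the names STATES and intercepts are hypothetical.

```python
H = 3.37  # common slope, assumed equal to that of the 2010 best-fit line

# (x, y) pairs from Table 1: population in 100,000s, gun murders (2010).
STATES = {
    "DC": (6.02, 99),
    "Michigan": (98.84, 413),
    "California": (372.54, 1257),
    "Washington": (67.25, 93),
    "New York": (193.78, 517),
}

# Intercept (the "work function") of the parallel through each point.
intercepts = {name: y - H * x for name, (x, y) in STATES.items()}

for name, c in sorted(intercepts.items(), key=lambda item: -item[1]):
    x, y = STATES[name]
    print(f"{name:<11} y/x = {y / x:5.2f}   c = y - hx = {c:8.2f}")
```

States whose values of y - hx nearly coincide lie on the same parallel, so DC and Michigan group together, as do Washington and New York, even though their y/x ratios are very different.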
[Figure: x-y plot showing the GA-DC line, y = 3.331x + 56.41, and the CA-IL line, y = 3.417x - 62.65, with labeled points LA, NY, IL, TX, and CA.]
is higher for New York (very close to the GA-DC line). The abbreviation LA stands for the state of Louisiana where the environment seems to promote a rather high positive value for the work function c (relative to other states of the US). Note that the term environment is used here, without prejudice, in the same sense that it is used to describe the complex nature of the environment of an electron in a metal when we discuss the photoelectric law.
[Figure: x-y plot with labeled points CA and TX.]
the best-fit line, along a line that is roughly parallel to the best-fit line. Many other states fall below the best-fit line. New York falls below the best-fit line in Figure 1 but falls above the best-fit line based on the 2011 data. Remember that the data for two states (Florida and Alabama) are missing, but the slope of the best-fit line has not changed significantly. The year-on-year differences are like the changing batting average of a baseball player from one season to the next.
Using the baseball analogy, the batting stats for Babe Ruth reveal a negative intercept c, whereas the batting stats for his Yankee teammate, Lou Gehrig, reveal a positive intercept c; see Ref. [10]. The conclusion being reported here, about both negative and positive values of the intercept c, is based on an analysis of the batting stats for Ruth and Gehrig in the same season, when they were both in a race to set the single season home run record. As we know, Ruth won the race with a single season record of 60 home runs. Gehrig could not catch up with Ruth, and the reason lies in the difference in their work functions.

The same considerations apply when we analyze other complex systems, such as the gun violence of interest to us here. This work function, and the factors that promote a reduction of the work function (i.e., a movement to a parallel with a lower and lower, or more negative, intercept), must be more fully understood so that we can formulate policies that benefit society at large and minimize the pain inflicted by gun violence. A work function also appears when we analyze other types of empirical observations. This has been discussed in other articles listed under Ref. [4].
Reference List
1. The Firearm Statistics that Gun Control Advocates Don't Want to See, by Jason Howerton, Published in The Blaze, May 6, 2013, see http://www.theblaze.com/stories/2013/05/06/the-firearms-statisticsthat-gun-control-advocates-dont-want-to-see/ and also http://news.yahoo.com/firearms-statistics-gun-control-advocates-don-twant-194040384.html The problem here, as with many other controversial problems, such as the Debt-GDP problem, is that only the ratios are quoted, without an analysis of the relationship between the x and y that enter into the ratio. Baseball statistics offer one way to examine the roots of these misconceptions created by ratios. But then we get carried away even with baseball ratios, such as the BA, in our enthusiasm as baseball fans.
2. Legendre, On Least Squares, English Translation of the original paper, http://www.york.ac.uk/depts/maths/histstat/legendre.pdf
3. Line of Best-Fit, Least Squares Method, see worked example given at http://hotmath.com/hotmath_help/topics/line-of-best-fit.html The formula for h used in this example is actually an approximate one and was used, before the advent of modern computers, since it only involves the determination of x² and xy and the sum of all the values of x, y, x² and xy. The exact formula is given below, with xm and ym denoting the mean or average values of x and y in the data set, and ym = hxm + c since the best-fit line always passes through the point (xm, ym).
   h = Σ(x - xm)(y - ym) / Σ(x - xm)²
   Determine the deviations of the individual x and y values from the mean, or average, (x - xm) and (y - ym). Determine the products (x - xm)(y - ym) and their sum. This gives the numerator in the expression for h. Determine the squares (x - xm)² and their sum. This gives the denominator in the expression for h. This also fixes the intercept c via ym = hxm + c. Then, using the regression equation, determine the predicted value yb on the best-fit line, the vertical deviations (y - yb), and the squares (y - yb)². The sum of these squares is a minimum. This can be checked by assigning other values for h (using any two points) and allowing the graph to pivot around (xm, ym).
   The regression coefficient r² = 1 - { Σ(y - yb)² / Σ(y - ym)² } is a measure of the strength of the correlation between x and y (or y/x versus x). For a perfect correlation, when all points lie exactly on the graph, r² = +1.000.
4. Bibliography, Articles on Extension of Planck's Ideas and Einstein's Ideas beyond physics, Compiled on April 16, 2013, http://www.scribd.com/doc/136492067/Bibliography-Articles-on-theExtension-of-Planck-s-Ideas-and-Einstein-s-Ideas-on-Energy-Quantum-totopics-Outside-Physics-by-V-Laxmanan
5. The Method of Least Squares: The Debt-GDP Relation for the Trillionaire Club of Nations, Published May 4, 2013, http://www.scribd.com/doc/139348541/The-Method-of-Least-SquaresThe-GDP-Debt-Relation-for-the-Trillionaires-Club-of-Nations
6. An MIT Non-Economist's View of the Harvard-UMass Debt/GDP Ratio and Economic Growth Debate, Published April 26, 2013, http://www.scribd.com/doc/138076426/An-MIT-Non-Economist-s-Viewof-the-Harvard-UMass-Debt-GDP-Ratio-and-the-Economic-Growth-Debate
7. Iceland Votes Against Austerity: Analysis of Iceland's Debt-GDP, Published April 28, 2013, http://www.scribd.com/doc/138345921/IcelandVotes-Against-Austerity-Analysis-of-Iceland-s-Debt-GDP-Data-2002-2012
8. A Brief Survey of the Debt-GDP Relations for Some Modern 21st Century Economies, Published May 1, 2013, http://www.scribd.com/doc/138912093/A-Brief-Survey-of-the-DebtGDP-Relationship-for-Some-Modern-21st-Century-Economies
9. Babe Ruth's 1923 Batting Statistics and Einstein's Work Function, Published April 17, 2013, http://www.scribd.com/doc/136489156/BabeRuth-s-1923-Batting-Statistics-and-Einstein-s-Work-Function
10. Babe Ruth Batting Statistics and Einstein's Work Function, To be Published April 17, 2013, http://www.scribd.com/doc/136556738/BabeRuth-Batting-Statistics-and-Einstein-s-Work-Function
11. Airline Quality Report: An Analysis of On-Time Percentages, Published April 18, 2013, http://www.scribd.com/doc/136760664/Airline-QualityReport-2013-Analysis-of-the-On-Time-Percentages
12. Airline Quality Rating 2013, Purdue University, e-Pubs, April 8, 2013, by Dr. Brent D. Bowen (Purdue University, College of Technology) and Dr. Dean E. Headley (Wichita State University, W. Frank Barton School of Business), http://docs.lib.purdue.edu/aqrr/23/
13. Airline Quality Report 2013: An Analysis of On-Time Percentages, Published April 18, 2013, http://www.scribd.com/doc/136760664/Airline-Quality-Report-2013Analysis-of-the-On-Time-Percentages
14. The Method of Least Squares: Predicting the Batting Average of a Baseball Player (Hamilton in 2013), Published May 7, 2013, http://www.scribd.com/doc/139924317/The-Method-of-Least-SquaresPredicting-the-Batting-Average-of-a-Baseball-Player-Hamilton-in-2013
15. Hamilton at the center of Angels' first month woes, by Alden Gonzalez, May 6, 2013, http://mlb.mlb.com/news/article.jsp?ymd=20130506&content_id=46768790&vkey=news_mlb&c_id=mlb
16. Struggling Hamilton is held out of Angels' starting lineup, by Kevin Baxter, May 5, 2013, http://articles.latimes.com/2013/may/05/sports/lasp-0505-angels-notes-20130505

The following references were added during updates made after the first publication of this document (May 8, 2013), see link below.

17. Gun Death Statistics and the Method of Least Squares and the Forgotten Property of a Straight line, Published May 8, 2013, http://www.scribd.com/doc/140152581/Gun-Death-Statistics-and-theMethod-of-Least-Squares-and-the-Forgotten-Property-of-a-Straight-line
18. Gun Control 2013: Suicide Stats Irrelevant to Gun Control Policy, Matt MacBradaigh, in Politics, May 6, 2013, http://www.policymic.com/articles/38391/gun-control-2013-suicidestats-are-irrelevant-to-gun-control-policy
19. Gun crime statistics by US state: latest data, Datablog, Posted by Simon Rogers, December 17, 2012, http://www.guardian.co.uk/news/datablog/2011/jan/10/gun-crime-usstate Total firearm murders and the firearm murder rates (per 100,000 population) for all states are given here.
actually have many applications far beyond blackbody radiation studies, where it was first conceived. Einstein's photoelectric law is a simple linear law and was deduced from Planck's non-linear law for describing blackbody radiation. It appears that financial and economic systems can be modeled using a similar approach. Finance, business, economics and management sciences now essentially seem to operate like astronomy and physics before the advent of Kepler and Newton.

Finally, during my professional career, I twice had the opportunity and great honor to make presentations to Nobel laureates: first at NASA to Prof. Robert Schrieffer (1972 Physics Nobel Prize), who was the Chairman of the Schrieffer Committee appointed to review NASA's space flight experiments (following the loss of the space shuttle Challenger on January 28, 1986), and second at GM Research Labs to Prof. Robert Solow (1987 Nobel Prize in economics), who was Chairman of the Corporate Research Review Committee appointed by GM corporate management.