Boston10 Stata Conference, 15-16 July 2010
Choonjoo Lee, Kyoung-Rok Lee
sarang90@kndu.ac.kr, bloom.rampike@gmail.com
Korea National Defense University
Contents
Part I. A Large Data Set in Stata/DEA
Large Data Set in DEA?
Computational Aspects of Large Data Set
The Scope of this Study
Efficiency Matters in Stata/DEA/Linear Programming
Tasks to be covered
Matrix Density
Density: the number of nonzero elements of the matrix relative to its total number of elements.
How many zero elements does the matrix contain?
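As a minimal sketch of the density measure, the Python snippet below counts nonzeros in a small matrix. The slides use Stata/Mata; Python is used here only for illustration, and the function name `density` and the 3 x 5 example matrix are hypothetical, not taken from the presentation.

```python
def density(matrix):
    """Return the share of nonzero entries in a rectangular list-of-lists matrix."""
    total = sum(len(row) for row in matrix)
    nonzeros = sum(1 for row in matrix for value in row if value != 0)
    return nonzeros / total


# Hypothetical constraint block: two dense data columns followed by an
# identity block of slack columns, which is where most of the zeros come from.
example = [
    [10, 15, 1, 0, 0],
    [20, 15, 0, 1, 0],
    [70, 100, 0, 0, 1],
]

# 9 of the 15 entries are nonzero.
print(density(example))
```

In a DEA problem the slack and artificial columns form identity-like blocks, so the share of zeros depends on how many such columns sit next to the dense data columns.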
Numerical Difficulties
Inaccuracy and inefficiency caused by floating-point arithmetic with finite precision.
Limited numerical precision caused by the binary representation of numbers.
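A small illustration of finite-precision arithmetic, written in Python rather than Stata/Mata: adding 0.1 ten times with ordinary floating-point addition does not give exactly 1, while the standard-library routine `math.fsum`, which tracks the lost low-order bits, does.

```python
import math

terms = [0.1] * 10

naive = sum(terms)          # plain left-to-right floating-point addition
accurate = math.fsum(terms) # compensated summation from the standard library

print(naive == 1.0)     # False: rounding errors accumulate
print(accurate == 1.0)  # True
```

The same kind of accumulated rounding error appears in the many additions and multiplications of a simplex iteration.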
Stata SE
What happens if the number of observations (n) becomes significantly larger than the number of variables (m)?
Output Oriented
DATA
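Standard form

\[
\begin{aligned}
\min\ & \theta \\
\text{s.t. } & 10\theta - 10A - 15B - 20C - 25D - 12E - S_1^- + x_1 = 0 \\
& 20\theta - 20A - 15B - 30C - 15D - 9E - S_2^- + x_2 = 0 \\
& 70A + 100B + 80C + 100D + 90E - S_1^+ + x_3 = 70 \\
& 6A + 3B + 5C + 2D + 8E - S_2^+ + x_4 = 6
\end{aligned}
\]

Here \theta is the efficiency score of the evaluated unit A, the variables A to E are the weights on the five units, S_1^-, S_2^-, S_1^+, S_2^+ are slacks, and x_1 to x_4 are artificial variables.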
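The constraint rows above can be generated from the data. The sketch below is written in Python rather than Stata/Mata, and the input and output values are read off the coefficients of the standard form, since the DATA slide itself did not survive. The names `inputs`, `outputs`, `standard_form_rows` and the label `theta` for the efficiency score are assumptions made for illustration.

```python
# Two inputs and two outputs per unit, read off the standard-form coefficients.
inputs = {"A": (10, 20), "B": (15, 15), "C": (20, 30), "D": (25, 15), "E": (12, 9)}
outputs = {"A": (70, 6), "B": (100, 3), "C": (80, 5), "D": (100, 2), "E": (90, 8)}


def standard_form_rows(unit):
    """Build the equality rows for one evaluated unit.

    Column order: theta, weights A..E, S1-, S2-, S1+, S2+, x1..x4, RHS.
    """
    names = list(inputs)
    n_in, n_out = 2, 2
    n_rows = n_in + n_out
    rows = []
    for i in range(n_in):
        # theta * x_io - sum_j w_j * x_ij - slack + artificial = 0
        slack = [0] * n_rows
        slack[i] = -1
        artificial = [0] * n_rows
        artificial[i] = 1
        data = [inputs[unit][i]] + [-inputs[j][i] for j in names]
        rows.append(data + slack + artificial + [0])
    for r in range(n_out):
        # sum_j w_j * y_rj - slack + artificial = y_ro
        slack = [0] * n_rows
        slack[n_in + r] = -1
        artificial = [0] * n_rows
        artificial[n_in + r] = 1
        data = [0] + [outputs[j][r] for j in names]
        rows.append(data + slack + artificial + [outputs[unit][r]])
    return rows


for row in standard_form_rows("A"):
    print(row)
```

Calling `standard_form_rows` with another unit name changes only the theta column and the right-hand side, which is why one problem per unit has to be solved.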
x1 x2 x3 x4 x1 x2 x3 x4 x1 x2 x3
-53/4 -93/8 -195/8 -51/4 5/2 265/4 95/4 155/2 6/8 3/8 5/8 2/8
-1/6 -14/15 -7/6 5/3 10/3 -55/3 -1/6 -14/15 -7/6 53/9 19/9 62/9 451/72 177/72 257/36
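Cost row: cN over the nonbasic columns (S1- to S2+), cB over the basic columns (x1 to x4)

        S1-  S2-  S1+  S2+ |  x1   x2   x3   x4 | RHS
cost      0    0    0    0 |  -1   -1   -1   -1 |   0
row 1    -1    0    0    0 |   1    0    0    0 |   0
row 2     0   -1    0    0 |   0    1    0    0 |   0
row 3     0    0   -1    0 |   0    0    1    0 |  70
row 4     0    0    0   -1 |   0    0    0    1 |   6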
Step 1: Set up the initial tableau factors.
Step 2: Find the entering variable.
Step 3: Find the leaving variable.
Reduced costs of the nonbasic variables (the largest value, 77 for E, is marked Max):

   θ    A    B    C    D    E   S1-  S2-  S1+  S2+
  30   46   73   35   62   77   -1   -1   -1   -1
                          Max
N: nonbasic columns, B: basis columns, b: right-hand side

      |                        N                         |       B        |  b
      |   θ    A    B    C    D    E  S1-  S2-  S1+  S2+ |  x1  x2  x3  x4 | RHS
cost  |   0    0    0    0    0    0    0    0    0    0 |  -1  -1  -1  -1 |   0
x1    |  10  -10  -15  -20  -25  -12   -1    0    0    0 |   1   0   0   0 |   0
x2    |  20  -20  -15  -30  -15   -9    0   -1    0    0 |   0   1   0   0 |   0
x3    |   0   70  100   80  100   90    0    0   -1    0 |   0   0   1   0 |  70
x4    |   0    6    3    5    2    8    0    0    0   -1 |   0   0   0   1 |   6
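Steps 2 and 3 can be checked on the initial tableau with a few lines of code. This is a sketch in Python rather than Stata/Mata, assuming the tableau values shown above and an initial basis of the four artificial variables (so the basis matrix is the identity). The names `columns`, `reduced_costs`, `entering_variable` and `leaving_variable`, and the label `theta`, are hypothetical.

```python
from fractions import Fraction

columns = ["theta", "A", "B", "C", "D", "E",
           "S1-", "S2-", "S1+", "S2+", "x1", "x2", "x3", "x4"]
cost = [0] * 10 + [-1] * 4  # artificial variables carry cost -1
rows = [
    [10, -10, -15, -20, -25, -12, -1, 0, 0, 0, 1, 0, 0, 0],
    [20, -20, -15, -30, -15, -9, 0, -1, 0, 0, 0, 1, 0, 0],
    [0, 70, 100, 80, 100, 90, 0, 0, -1, 0, 0, 0, 1, 0],
    [0, 6, 3, 5, 2, 8, 0, 0, 0, -1, 0, 0, 0, 1],
]
rhs = [0, 0, 70, 6]
basis = ["x1", "x2", "x3", "x4"]  # one basic variable per row


def reduced_costs():
    """c_j - c_B' a_j for every column; valid here because the basis is the identity."""
    c_b = [cost[columns.index(name)] for name in basis]
    return {
        name: cost[j] - sum(c_b[i] * rows[i][j] for i in range(len(rows)))
        for j, name in enumerate(columns)
    }


def entering_variable(rc):
    """Step 2: pick the column with the largest reduced cost."""
    return max(rc, key=rc.get)


def leaving_variable(entering):
    """Step 3: minimum-ratio test over rows with a positive entry in the entering column."""
    j = columns.index(entering)
    ratios = {
        basis[i]: Fraction(rhs[i], rows[i][j])
        for i in range(len(rows))
        if rows[i][j] > 0
    }
    return min(ratios, key=ratios.get), ratios


rc = reduced_costs()
enter = entering_variable(rc)
leave, ratios = leaving_variable(enter)
print(rc)
print(enter, leave, ratios)
```

Exact fractions are used for the ratios so that the comparison in the ratio test is not affected by rounding.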
Tasks to be covered
Computational Accuracy
Example: Obtaining Inverse Matrix
Matrix D
1 1.341099143-61.13394928 0.4455321 1.883781314 2.58794665 3 0 0 0 0.0588235 0 0 0 0.116421975-6.672515869 -0.110761 0.495342732 0.09713860 6 0-0.172319263-19.71403694 -0.262333 - 1.54739666 0.074690066 0-0.046367686-4.060891628 -0.082268 - 0.25169459 0.009800959 0 0.105886854 4.651313305 0.1136269 - 0.03722914 0.015884314 3
Inverse of matrix D by Stata/Mata luinv(D)

  1    162470623.2   -4.022811871   -81235305.55    487411816.6    81235289.98
  0   -147760451.4   -0.0871622935   73880208.39   -443281245.5   -73880196.74
  0    3410527.559    0.0078730725  -1705263.586    10231581.38    1705263.517
  0    16.99999999   -2.95716e-17   -2.76977e-08    1.66186e-07    2.76977e-08
  0    86785601.44    2.18378179    -43392791.54    260356746.7    43392788.04
  0    31184842.39    0.1960047586  -15592418.13    93554511.28    15592419.02
. mata
------------------------------------------------- mata (type end to exit) ---
: st_view(X=., ., ("a1", "a2", "a3", "a4", "a5", "a6"))
: b = luinv(X)
: b
                  1              2              3              4              5
  1               1    162470623.2   -4.022811871   -81235305.55    487411816.6
  2               0   -147760451.4   -.0871622935    73880208.39   -443281245.5
  3               0    3410527.559    .0078730725   -1705263.586    10231581.38
  4               0    16.99999999   -2.95716e-17   -2.76977e-08    1.66186e-07
  5               0    86785601.44     2.18378179   -43392791.54    260356746.7
  6               0    31184842.39    .1960047586   -15592418.13    93554511.28

                  6
  1     81235289.98
  2    -73880196.74
  3     1705263.517
  4     2.76977e-08
  5     43392788.04
  6     15592419.02
D*D^-1 in Stata/Mata (default tolerance)
1 5.96E-08 2.36E-08 -3.73E-08 -1.74E-18 -1.63E-09 1 -1.63E-09 1.81E-09 1 5.96E-08 9.78E-09 -2.98E-08 0 -7.45E-08 1.63E-09 -3.96E-09 -7.45E-09 0 1.000000003 0 0 0 0 4.66E-10 -1.49E-08 -2.79E-09 4.66E-09
D*D^-1 in Excel
1 5.96046E-08 -7.77156E-16 7.45058E-09 -5.96046E-08 -1.49012E-08 0 0.999999999 2.72414E-17 0 4.19095E-09 0 1.49012E-08 0 7.31257E-09 0
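The effect shown in these slides, a product D*D^-1 that is close to but not exactly the identity, can be reproduced on any ill-conditioned matrix. The sketch below is in Python, not Stata/Mata, and uses Gauss-Jordan elimination with partial pivoting rather than the LU-based `luinv()` of the slides. The 6 x 6 Hilbert matrix stands in for D, whose exact entries are not recoverable here; all function names are hypothetical.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting; works for floats or Fractions."""
    n = len(matrix)
    zero = matrix[0][0] * 0
    one = zero + 1
    # Augment the matrix with the identity: [A | I].
    aug = [list(row) + [one if i == j else zero for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def max_deviation_from_identity(a, b):
    """Largest absolute entry of a*b - I."""
    n = len(a)
    worst = 0
    for i in range(n):
        for j in range(n):
            entry = sum(a[i][k] * b[k][j] for k in range(n))
            worst = max(worst, abs(entry - (1 if i == j else 0)))
    return worst


def hilbert(n, exact):
    """Hilbert matrix with entries 1/(i+j+1), as exact fractions or as floats."""
    return [[Fraction(1, i + j + 1) if exact else 1.0 / (i + j + 1)
             for j in range(n)] for i in range(n)]


exact_dev = max_deviation_from_identity(hilbert(6, True), invert(hilbert(6, True)))
float_dev = max_deviation_from_identity(hilbert(6, False), invert(hilbert(6, False)))
print(exact_dev)  # exact rational arithmetic: no rounding
print(float_dev)  # floating point: typically a small nonzero residual
```

With exact fractions the product is the identity; with floating point, small off-diagonal residues of the kind listed above are to be expected, and their size grows with the condition number of the matrix.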
One of the possible reasons: Decimal and Binary numbers
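17 (decimal number)
17 / 2 = 8 remainder 1
8 / 2 = 4 remainder 0
4 / 2 = 2 remainder 0
2 / 2 = 1 remainder 0
1 / 2 = 0 remainder 1
Reading the remainders from bottom to top gives 10001 (binary number).
How does a computer store a = 0.75, b = 0.7 + 0.05, and c = 0.6 + 0.1 + 0.05?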
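The repeated-division procedure and the storage question can both be explored in a few lines. This is a Python sketch (not Stata/Mata); `to_binary` is a hypothetical helper, and `decimal.Decimal` is used only to print the exact value that a binary double actually holds. Whether the three sums compare equal depends on the storage precision, so no result is asserted for them here.

```python
from decimal import Decimal


def to_binary(n):
    """Convert a positive integer to binary by repeated division by 2."""
    digits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))
    # Remainders come out lowest bit first, so reverse them.
    return "".join(reversed(digits))


print(to_binary(17))  # 10001

# 0.75 = 1/2 + 1/4 has a finite binary expansion and is stored exactly.
print(Decimal(0.75))
# 0.7 has no finite binary expansion, so the stored double is only close to 0.7.
print(Decimal(0.7))

a = 0.75
b = 0.7 + 0.05
c = 0.6 + 0.1 + 0.05
print(a == b, a == c)  # outcome depends on how the rounding errors combine
```

Values such as 0.7, 0.6, 0.1, and 0.05 are each rounded when stored, so sums built from them can differ from a directly stored 0.75 in lower-precision storage types.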
Accuracy
Tolerance
Sets an upper or lower limit on the number of iterations.
Stops an unattended run if the algorithm falls into a cycle.
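A minimal sketch of an iteration limit, in Python rather than Stata/Mata: the loop stops either when a stopping test succeeds or when the cap is reached, so a cycling run cannot continue indefinitely. The function `run_with_limit` and both toy step functions are hypothetical and do not come from the presentation.

```python
def run_with_limit(step, state, is_done, max_iter=100):
    """Apply step repeatedly until is_done(state) holds or max_iter steps were taken.

    Returns (final state, number of steps taken, whether the stopping test succeeded).
    """
    for iteration in range(max_iter):
        if is_done(state):
            return state, iteration, True
        state = step(state)
    return state, max_iter, is_done(state)


# A run that terminates on its own: halve an integer until it reaches 0.
print(run_with_limit(lambda n: n // 2, 8, lambda n: n == 0))

# A run that cycles between two states and never satisfies the stopping test.
print(run_with_limit(lambda s: 1 - s, 0, lambda s: False, max_iter=10))
```

In a simplex code the state would be the current basis and the step one pivot; the cap only guarantees termination and does not by itself prevent cycling.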
Preprocessing: Scaling
To reduce the gap in magnitude between matrix entries and obtain a numerically safe solution.
Example: Rank(D)
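One simple form of scaling divides each row, and then each column, by its largest absolute entry. The sketch below is in Python rather than Stata/Mata; it is one possible scaling scheme chosen for illustration, not necessarily the one used in the presentation, and the 2 x 2 matrix is hypothetical. It assumes the matrix has no all-zero row or column.

```python
def scale_rows_and_columns(matrix):
    """Divide each row, then each column, by its largest absolute entry."""
    rows = [[v / max(abs(x) for x in row) for v in row] for row in matrix]
    col_max = [max(abs(rows[i][j]) for i in range(len(rows)))
               for j in range(len(rows[0]))]
    return [[v / col_max[j] for j, v in enumerate(row)] for row in rows]


def magnitude_spread(matrix):
    """Ratio of the largest to the smallest nonzero absolute entry."""
    nonzero = [abs(v) for row in matrix for v in row if v != 0]
    return max(nonzero) / min(nonzero)


# Hypothetical badly scaled matrix: one row in units of 1e8, one in units of 1e-4.
raw = [[1.0e8, 2.0e8], [3.0e-4, 1.0e-4]]
scaled = scale_rows_and_columns(raw)

print(magnitude_spread(raw))
print(magnitude_spread(scaled))
```

After scaling, every entry lies in [-1, 1], which makes a fixed zero tolerance (for example in a rank computation) meaningful across rows.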
The Malmquist Productivity Index (MPI) measures productivity change over time and can be decomposed into changes in efficiency and changes in technology.
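The decomposition can be written out with the usual geometric-mean form of the index. The Python sketch below (not Stata/Mata) assumes four distance-function values have already been computed by DEA; the argument names and the numbers in the example call are hypothetical.

```python
import math


def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Return (MPI, efficiency change, technical change).

    d_a_b is the distance (efficiency) value of the period-b observation
    measured against the period-a frontier, with t1 standing for t + 1.
    """
    efficiency_change = d_t1_t1 / d_t_t
    technical_change = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    mpi = math.sqrt((d_t_t1 / d_t_t) * (d_t1_t1 / d_t1_t))
    return mpi, efficiency_change, technical_change


# Hypothetical distance values for one unit.
mpi, ec, tc = malmquist(d_t_t=0.8, d_t_t1=1.1, d_t1_t=0.7, d_t1_t1=0.9)
print(mpi, ec, tc)
print(abs(mpi - ec * tc) < 1e-12)  # the index equals the product of its two parts
```

Each of the four distance values requires its own DEA problem per unit, which is one reason the number of linear programs grows quickly with panel data.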
Notes
The data and code related to the presentation will be available from the Conference website.
References
Cooper, W. W., Seiford, L. M., & Tone, K. (2006). Introduction to Data Envelopment Analysis and Its Uses. Springer Science+Business Media.
Ji, Y., & Lee, C. (2010). Data Envelopment Analysis. The Stata Journal, 10(2), 267-280.
Lee, C., & Ji, Y. (2009). Data Envelopment Analysis in Stata. DC09 Stata Conference.
Maros, I. (2003). Computational Techniques of the Simplex Method. Kluwer Academic Publishers.