00766967

Performance
Profit-effective parallel computing
Yong Yan, HAL Computer Systems Inc.
Xiaodong Zhang, College of William & Mary
Researchers widely use speedup, efficiency, and scalability¹⁻³ to assess parallel-computing performance. These metrics encourage researchers to use any novel technique to design or improve a parallel system, without paying enough attention to the cost increase that such a technique incurs. However, as national-defense applications are downsizing, commercial applications are the dominant users of parallel systems. Customers and vendors are particularly concerned with whether a parallel system can make a profit.

Customers and vendors frequently use the performance/cost ratio to compare systems.⁴ Based on this metric, David Wood and Mark Hill showed that parallel computing is more cost-effective whenever the speedup is larger than the costup, a ratio of the parallel-computing cost to the sequential-computing cost.⁵ They also indicated that a cost-effective parallel computation does not necessarily require a linear speedup.

Here, we extend Wood and Hill’s work, from the profit point of view. Our major goal is to investigate financially justified parallel computing. To evaluate parallel computing’s effectiveness, we use a simple profitup metric to measure how performance, cost, and business production affect profit. We focus on investigating the relationship between cost-effective parallel computing and profit-effective parallel computing.

THE COST-EFFECTIVENESS METRIC AND ITS POTENTIAL LIMITS

Computer system S1 is more cost-effective than system S2 if it has a higher performance/cost ratio. That is, if P1/C1 > P2/C2, then S1 is more cost-effective than S2, where P1 and C1 and P2 and C2 are the quantified performance and cost of systems S1 and S2. If S1 is more cost-effective than S2, the ratio of S1’s performance to S2’s performance is higher than that of S1’s cost to S2’s cost, which we express as

\[ \frac{P_1}{P_2} > \frac{C_1}{C_2}. \tag{1} \]

We quantify a sequential or parallel system’s performance as the reciprocal of a program’s execution time on that system. Let T(m) be an application’s execution time on a parallel system with m processors, and let C(m) be the cost, where m = 1 refers to a sequential system. Substituting T(m), T(1), C(m), and C(1) for the variables in Equation 1, the speedup of a computation using m processors is

\[ \mathrm{speedup}(m) = \frac{T(1)}{T(m)} > \frac{C(m)}{C(1)}. \]

So, if the speedup is larger than the cost ratio, C(m)/C(1), defined as the costup, parallel computing on the system is financially justifiable. In practice, costup(m) > 1, for m > 1. This is a representative performance model used for parallel computing.⁶,⁷

However, the cost-effective model has limitations when it is used to justify the cost of parallel computing. For example, a company is deciding whether to buy a parallel system mainly to increase their profit. The cost-effective model does not reflect the profit made by the parallel system, so it might not be suitable for making this decision. A non-cost-effective parallel system might be acceptable, as long as it makes a higher profit. On the other hand, a cost-effective system that cannot make enough profit to offset its cost would be unacceptable. So, the performance model must include profit-effectiveness.

PROFITUP: REFLECTING PROFIT IN THE PERFORMANCE MODEL

Our metric considers profit as a major objective of using a parallel system. A computation on a parallel system is financially justifiable only if it makes more profit than the same computation on a sequential system. To study the profit-effectiveness of parallel computing, we use five common model parameters: performance, cost, the production function, lifetime, and profit.

As with cost-effectiveness analysis, we measure the performance of a system with m processors, P(m), as the reciprocal of a program’s execution time, where m = 1 refers to a sequential system.
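The definitions introduced so far (performance as reciprocal execution time, speedup, costup, and the cost-effectiveness test that follows from Equation 1) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample execution times and prices are assumptions, not values from the article.

```python
def performance(execution_time: float) -> float:
    """Performance P(m): the reciprocal of a program's execution time T(m)."""
    return 1.0 / execution_time


def speedup(t_seq: float, t_par: float) -> float:
    """speedup(m) = T(1) / T(m), which equals P(m) / P(1)."""
    return t_seq / t_par


def costup(c_seq: float, c_par: float) -> float:
    """costup(m) = C(m) / C(1)."""
    return c_par / c_seq


def is_cost_effective(t_seq: float, t_par: float,
                      c_seq: float, c_par: float) -> bool:
    """Cost-effectiveness test: speedup(m) > costup(m)."""
    return speedup(t_seq, t_par) > costup(c_seq, c_par)


# Hypothetical example: an 8-processor system that runs the program
# 5 times faster and costs 4 times as much as the sequential system.
T1, T8 = 100.0, 20.0          # assumed execution times in seconds
C1, C8 = 50_000.0, 200_000.0  # assumed system costs in dollars

print(speedup(T1, T8), costup(C1, C8), is_cost_effective(T1, T8, C1, C8))
```

In this assumed case the speedup on eight processors is well below linear, yet the computation still passes the cost-effectiveness test because the speedup exceeds the costup.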
April–June 1999 65
Cost, cost(m, t), represents the development and maintenance cost of a system with m processors in a t time period.

The production function, Pro(P(m), t), measures the profit (dollars) gained from P(m) during t, for m > 1. In practice, a production function might exhibit different relationships to a computer’s performance, which we examine in the context of economics.⁸ We use three common production functions (linear, superlinear, and sublinear), which measure the parallel system’s production increase, compared to a sequential system. The relationship of Pro(P(m), t) to P(m) is

• linearly proportional if Pro(P(m), t) = Pro(P(1), t) × P(m),
• superlinearly proportional if Pro(P(m), t) > Pro(P(1), t) × P(m), and
• sublinearly proportional if Pro(P(m), t) < Pro(P(1), t) × P(m).

Lifetime, Lf, is the interval between when someone starts using a parallel system and when a sequential system with the same order of performance goes on the market. This assumes that a new state-of-the-art parallel system always outperforms the fastest existing sequential system. This is also consistent with the fact that sequential systems have been continuously improved to perform as well as existing parallel systems. Also, if a parallel system can be upgraded timely, its lifetime (theoretically) is infinite. In this case, parallel computing only needs to be evaluated in unitary time (that is, Lf = 1), and the cost is amortized over the lifetime. In the cost-effective model, the evaluation time is also unitary time.

Profit is a parallel system’s net profit:

Profit(m) = Pro(P(m), Lf) − cost(m, Lf).

Similarly, a sequential system’s profit during Lf is

Profit(1) = Pro(P(1), Lf) − cost(1, Lf).

In practice, the profit might be negative. To simplify discussions, we assume the profit is positive, which means that Pro(P(i), Lf) > cost(i, Lf), for i = 1, ..., m.

We characterize a parallel system’s profit-effectiveness for applications by profitup, a ratio of the parallel system’s profit to the sequential system’s profit:

\[ \mathrm{profitup}(m) = \frac{\mathrm{Profit}(m)}{\mathrm{Profit}(1)} = \frac{\mathrm{Pro}(P(m), L_f) - \mathrm{cost}(m, L_f)}{\mathrm{Pro}(P(1), L_f) - \mathrm{cost}(1, L_f)}. \tag{2} \]

So, we reach this conclusion:

Conclusion 1. A computation on a parallel system is financially justifiable if and only if profitup(m) > 1.

THE RELATIONSHIP BETWEEN COST-EFFECTIVE AND PROFIT-EFFECTIVE PARALLEL COMPUTING

In our analysis, we normalize the sequential system’s performance as unitary performance, P(1) = 1, and the system’s cost in unitary time as unitary cost, cost(1, Lf) = 1. Under such conditions, P(m) is equivalent to the speedup, speedup(m); the cost of parallel computing, cost(m, Lf), is the costup of parallel computing, costup(m). Furthermore, Pro(1, 1) and Pro(P(m), 1) are the production function of sequential computing and parallel computing, in unitary time.

Substituting P(1) = 1, cost(1, Lf) = 1, P(m) = speedup(m), and cost(m, Lf) = costup(m) into Equation 2, we obtain this relationship among profitup, speedup, and costup:

\[ \mathrm{profitup}(m) = \frac{\mathrm{Pro}(\mathrm{speedup}(m), 1) - \mathrm{costup}(m)}{\mathrm{Pro}(1,1) - 1}. \tag{3} \]

If Pro(speedup(m), 1) is a linear function of the speedup; that is,

Pro(speedup(m), 1) = Pro(1, 1) × speedup(m),

the profitup is

\[ \mathrm{profitup}(m) = \frac{\mathrm{Pro}(1,1) \times \mathrm{speedup}(m) - \mathrm{costup}(m)}{\mathrm{Pro}(1,1) - 1}. \tag{4} \]

With speedup(m) > costup(m) for cost-effectiveness in Equation 4, we have

\[ \mathrm{profitup}(m) > \frac{\mathrm{Pro}(1,1) \times \mathrm{costup}(m) - \mathrm{costup}(m)}{\mathrm{Pro}(1,1) - 1} = \mathrm{costup}(m) > 1. \]

So, we reach this conclusion:

Conclusion 2. If the production from parallel computing is linearly proportional to the computation’s speedup, and the computation is cost-effective, then the computation is also profit-effective.

For the same production function, is a profit-effective parallel computation (profitup(m) > 1) also cost-effective? To answer this question, we substitute profitup(m) > 1 into Equation 4 to obtain

\[ \frac{\mathrm{Pro}(1,1) \times \mathrm{speedup}(m) - \mathrm{costup}(m)}{\mathrm{Pro}(1,1) - 1} > 1 \]

or

\[ \mathrm{speedup}(m) > \frac{\mathrm{Pro}(1,1) + \mathrm{costup}(m) - 1}{\mathrm{Pro}(1,1)}. \tag{5} \]

From Equation 5, we derive

\[ \mathrm{speedup}(m) > \mathrm{costup}(m) - \frac{(\mathrm{Pro}(1,1) - 1)(\mathrm{costup}(m) - 1)}{\mathrm{Pro}(1,1)}. \tag{6} \]
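Equations 3 through 6 can be checked numerically. The following is a minimal sketch under the article’s normalization (P(1) = 1, cost(1, Lf) = 1) and a linear production function; the function names and the sample values for Pro(1, 1), costup, and speedup are illustrative assumptions.

```python
def profitup(production_parallel: float, costup: float,
             production_sequential: float) -> float:
    """Equation 3: (Pro(speedup(m), 1) - costup(m)) / (Pro(1, 1) - 1)."""
    return (production_parallel - costup) / (production_sequential - 1.0)


def min_speedup_eq5(costup: float, production_sequential: float) -> float:
    """Equation 5: speedup threshold for profitup > 1 (linear production)."""
    return (production_sequential + costup - 1.0) / production_sequential


def min_speedup_eq6(costup: float, production_sequential: float) -> float:
    """Equation 6: the same threshold written as costup minus a correction."""
    correction = ((production_sequential - 1.0) * (costup - 1.0)
                  / production_sequential)
    return costup - correction


# Assumed sample values, not taken from the article's tables.
pro_seq = 4.0      # Pro(1, 1)
costup_m = 5.0     # costup(m)
speedup_m = 3.0    # speedup(m), deliberately smaller than the costup

linear_production = pro_seq * speedup_m          # Pro(speedup(m), 1)
threshold = min_speedup_eq5(costup_m, pro_seq)   # same value as Equation 6
ratio = profitup(linear_production, costup_m, pro_seq)

print(threshold, min_speedup_eq6(costup_m, pro_seq), ratio)
```

With these assumed values the speedup threshold is 2, which is below the costup of 5, so a speedup of 3 gives a profitup above 1 even though the speedup is smaller than the costup.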
66 IEEE Concurrency
Because Pro(1, 1) > 1 and costup(m) > 1, Equation 6 does not necessarily satisfy the condition of cost-effectiveness, speedup(m) > costup(m). So, we reach this conclusion:

Conclusion 3. If the production from parallel computing is linearly proportional to the computation’s speedup, a profit-effective parallel computation might not be cost-effective.

Conclusions 2 and 3 indicate that the cost-effective metric is too pessimistic in judging if a parallel computation is financially justifiable for a linear production function.

By Equation 3, if parallel computing is profit-effective (profitup(m) > 1), this condition should be satisfied:

Pro(speedup(m), 1) > costup(m) + Pro(1, 1) − 1. (7)

If the production function is superlinear to the parallel computing’s speedup, and the computation is cost-effective (speedup(m) > costup(m) > 1), we have

\[
\begin{aligned}
\mathrm{Pro}(\mathrm{speedup}(m), 1) &> \mathrm{Pro}(1,1) \times \mathrm{speedup}(m) > \mathrm{Pro}(1,1) \times \mathrm{costup}(m) \\
&= \mathrm{costup}(m) + (\mathrm{Pro}(1,1) - 1)\,\mathrm{costup}(m) \\
&> \mathrm{costup}(m) + \mathrm{Pro}(1,1) - 1.
\end{aligned}
\]

Thus, Equation 7 is valid from the cost-effective point of view. So, we reach this conclusion:

Conclusion 4. If the production function from parallel computing is superlinearly proportional to the computation’s speedup, and the computation is cost-effective, then the computation is also profit-effective.

If the production function is sublinear to the performance, the following function choice is valid for speedup(m) > costup(m) > 1 if we simply reverse the comparison sign in Equation 7:

Pro(speedup(m), 1) < costup(m) + Pro(1, 1) − 1.

Substituting this sublinear production function into Equation 3, we obtain profitup(m) < 1. So, we reach this conclusion:

Conclusion 5. If the production function from parallel computing is sublinearly proportional to the computation’s speedup, a cost-effective computation might not be profit-effective.

We have shown that the production function is a major factor in determining if a parallel computation is financially justifiable from a cost-effective or profit-effective point of view.

CASE STUDIES

We now demonstrate the difference between cost-effective computing and profit-effective computing for varying processor and memory costs. For consistency and fairness, we use the same vendor data on 1994 Silicon Graphics product prices that Wood and Hill used.⁴ Current market prices are different, but that does not affect the comparison’s validity.

Like Wood and Hill, we use the prices of Silicon Graphics’ server products without including other costs, such as for software and application development. Computer system prices are based on the basic prices of Challenge series products. In 1994, a uniprocessor Challenge DM cost approximately $38,400 plus approximately $100 per Mbyte of memory. Thus, the uniprocessor cost is

cost(1, s) = $38,400 + $100 × s,

where s represents memory size in Mbytes. The price of a parallel Challenge XL of m processors with s′ Mbytes of shared memory is

cost(m, s′) = $81,600 + $20,000 × m + $100 × s′.

Assuming that the shared-memory size is identical to that of a uniprocessor workstation, we have s = s′. Therefore, the costup is

\[ \mathrm{costup}(m, s) = \frac{2.125 + 0.521 \times m + 0.0026 \times s}{1 + 0.0026 \times s}. \tag{8} \]

To calculate the profitup, we use Equation 3, where Pro(1, 1) represents the production gained by a uniprocessor. Here, the production is normalized to the sequential computing cost, and lifetime is normalized to the unit time. Pro(1, 1) is a value that closely depends on applications. In practice, Pro(1, 1) is a constant for a class of applications. This constant is updated to a different value when the system is upgraded. In our study, we assume that Pro(1, 1) is a constant.

Regarding the parallel-system production, we consider the production function

Pro(speedup(m), 1) = fp × speedup(m) × Pro(1, 1),

where fp < 1, fp = 1, or fp > 1 represents a sublinear, a linear, or a superlinear production function of the speedup. Again, the relationship between speedup(m) and m, the number of processors, is speedup(m) = fs × m, where fs ≤ 1. So, we characterize the cost effect by this profitup formula:

\[ \mathrm{profitup}(m, s) = \frac{f_p \times \mathrm{Pro}(1,1) \times m \times f_s - \mathrm{costup}(m, s)}{\mathrm{Pro}(1,1) - 1}. \tag{9} \]

Memory is an important component of a system’s hardware. So, Table 1 shows the effect of memory cost as the memory size increases, for an eight-processor system where the speedup factor fs is 0.5 and Pro(1, 1) is 4. As the memory size increases, costup decreases and profitup increases.

The table’s top section shows that when memory is more than 500 Mbytes, the profitup is larger than 1 and the speedup is larger than the costup. However, for 300 or 400 Mbytes, the profitup is smaller than 1 even though the speedup is larger than the costup. This shows that cost-effective parallel computing is not necessarily profit-effective. In this case, parallel computing is profit-effective only when the speedup is sufficiently larger than the costup.

The table’s middle section shows that for 100 or 200 Mbytes, the profitup is larger than 1, but the speedup is smaller than the costup. This shows that profit-effective parallel computing is not necessarily cost-effective. The table’s bottom section also shows this result. In this case, parallel computing is profit-effective when the speedup is smaller than the costup.

Processors are another important component of hardware cost because many scientific applications require a large number of them to exploit parallelism. Table 2 shows the effect of processor cost where the memory size is 512 Mbytes, the speedup factor fs is 0.25, and Pro(1, 1) is 4. As the number of processors increases, the costup, speedup, and profitup increase. The table’s top section, where production is a sublinear function of speedup, shows that parallel computing is not profit-effective when the speedup is larger than the costup for 64 processors and 128 processors, respectively. This shows that only a sufficiently large parallel system is likely to be profit-effective for a sublinear production function. The table’s bottom section, where production is a superlinear function of the speedup, shows that parallel computing is profit-effective for eight to 1,024 processors and is both cost-effective and profit-effective for 64 to 1,024 processors.

Because high performance has strongly motivated parallel-computing research and development for advanced applications, profit has not been a real concern. However, with rapid advances in commodity processors and networking technology, and with rapidly changing global political and economic structures, mainstream parallel computing platforms are shifting from expensive, custom-designed massively parallel processing machines to cheap, commodity-designed symmetric multiprocessors and networks of SMPs, workstations, and PCs. Therefore, more and more users have been serious about profit gain from parallel computing.

These trends and our work indicate that profit analysis is necessary for evaluating parallel computing’s effectiveness. Two major cost components that our case study did not quantitatively consider are the lifetime software and hardware maintenance costs for a system, and the human cost to develop efficient parallel programs. In practice, the profit model should include these two application-dependent components. Also,
the development of production functions requires specific knowledge in a particular application area.

Table 1. The effect of memory cost. The number of processors (m) = 8, speedup(m) = 0.5m, and Pro(1, 1) = 4. Costup and profitup are calculated by Equations 8 and 9.

PRODUCTION FUNCTION               MEMORY (MBYTES)
OF SPEEDUP                            100    200    300    400    500    600    700    800    900  1,000
Sublinear, fp = 0.4     Costup        5.2    4.5    4.0    3.6    3.3    3.1    2.9    2.7    2.6    2.5
                        Speedup       4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0
                        Profitup      0.4    0.6    0.8    0.9    1.0    1.1    1.1    1.2    1.2    1.3
Linear, fp = 1.0        Costup        5.2    4.5    4.0    3.6    3.3    3.1    2.9    2.7    2.6    2.5
                        Speedup       4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0
                        Profitup      3.6    3.8    4.0    4.1    4.2    4.3    4.4    4.4    4.5    4.5
Superlinear, fp = 1.2   Costup        5.2    4.5    3.9    3.6    3.3    3.1    2.9    2.7    2.6    2.5
                        Speedup       4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0    4.0
                        Profitup      4.7    4.9    5.1    5.2    5.3    5.4    5.4    5.5    5.5    5.6

Table 2. The effect of processor cost. The memory size is 512 Mbytes, speedup(m) = 0.25m, and Pro(1, 1) = 4. Costup and profitup are calculated by Equations 8 and 9.

PRODUCTION FUNCTION               PROCESSORS
OF SPEEDUP                              2      4      8     16     32     64    128    256    512  1,024
Sublinear, fp = 0.25    Costup        1.9    2.4    3.3    5.1    8.6   15.8   30.0   58.6  115.8  230.1
                        Speedup       0.5    1.0    2.0    4.0    8.0   16.0   32.0   64.0  128.0  256.0
                        Profitup     -0.5   -0.5   -0.4   -0.4   -0.2    0.1    0.6    1.8    4.1    8.6
Superlinear, fp = 1.2   Costup        1.9    2.4    3.3    5.1    8.6   15.8   30.0   58.6  115.7  229.8
                        Speedup       0.5    1.0    2.0    4.0    8.0   16.0   32.0   64.0  128.0  256.0
                        Profitup      0.2    0.8    2.1    4.7    9.9   20.3   41.2   82.9  166.2  332.9
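The entries of Tables 1 and 2 can be recomputed from the 1994 prices quoted in the case study together with Equations 8 and 9. The sketch below is an illustration, not the authors’ code: the function names are assumptions, and the costup is computed directly from the dollar prices rather than from the rounded coefficients printed in Equation 8, so results may differ from the tables in the last printed digit.

```python
def costup(m: int, s: float) -> float:
    """Costup of an m-processor Challenge XL over a uniprocessor Challenge DM,
    both with s Mbytes of memory (the unrounded form of Equation 8)."""
    parallel_cost = 81_600 + 20_000 * m + 100 * s
    sequential_cost = 38_400 + 100 * s
    return parallel_cost / sequential_cost


def profitup(m: int, s: float, fp: float, fs: float,
             pro_seq: float = 4.0) -> float:
    """Equation 9, with speedup(m) = fs * m and production factor fp."""
    production = fp * pro_seq * fs * m
    return (production - costup(m, s)) / (pro_seq - 1.0)


# Part of the sublinear section of Table 1: m = 8, fs = 0.5, fp = 0.4.
for memory in (100, 500, 1000):
    print(memory,
          round(costup(8, memory), 1),
          round(profitup(8, memory, fp=0.4, fs=0.5), 1))

# One column of the sublinear section of Table 2: s = 512, fs = 0.25, fp = 0.25.
print(64, round(costup(64, 512), 1),
      round(profitup(64, 512, fp=0.25, fs=0.25), 1))
```

Changing `fp`, `fs`, or `pro_seq` shows how sensitive the profit-effectiveness verdict is to the assumed production function, which is the point the case study makes.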
ACKNOWLEDGMENTS

We appreciate the constructive comments from the anonymous referees. Fred Preston at the NASA Langley Research Center and our colleague Zhao Zhang read the paper and made helpful comments. This work has been supported in part by the National Science Foundation under grants CCR-9400719 and CCR-9812187, by the Sun Microsystems Computer Corporation under grant EDUE-NAFO-980405, and by the Air Force Office of Scientific Research under grant AFOSR-95-1-0215.

REFERENCES

1. G.M. Amdahl, “Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities,” Proc. American Federation of Information Processing Societies Conf., Thompson Books, Washington, D.C., 1967, pp. 438–485.
2. J.E. Smith, “Characterizing Computer Performance with a Single Number,” Comm. ACM, Vol. 31, No. 10, Oct. 1988, pp. 1202–1206.
3. X. Zhang, Y. Yan, and K. He, “Latency Metric: An Experimental Method for Measuring and Evaluating Program and Architecture Scalability,” J. Parallel and Distributed Computing, Vol. 22, No. 3, Sept. 1994, pp. 392–410.
4. D.A. Wood and M.D. Hill, “Cost-Effective Parallel Computing,” Computer, Vol. 28, No. 2, Feb. 1995, pp. 69–72.
5. J.L. Hennessy and D.A. Patterson, Computer Architecture: A Quantitative Approach, 2nd ed., Morgan Kaufmann, San Francisco, 1996.
6. B. Falsafi and D.A. Wood, “Cost/Performance of a Parallel Computer Simulator,” Proc. Eighth Workshop on Parallel and Distributed Simulation, IEEE Computer Society Press, Los Alamitos, Calif., 1994, pp. 173–182.
7. S.H. Fuller, “Price/Performance Comparison of C.mmp and the PDP-10,” Proc. Third Int’l Symp. Computer Architecture, ACM Press, New York, 1976, pp. 195–202.
8. J. Stiglitz, Principles of Microeconomics, W.W. Norton & Company, New York, 1993.

Yong Yan is a performance analyst responsible for the design and evaluation of multiprocessor systems at HAL Computer Systems Inc. He has extensively published in the areas of parallel and distributed computing, computer architecture, performance evaluation, operating systems, and algorithm analysis. He received his BS and MS in computer science from Huazhong University of Science and Technology, China, and his PhD in computer science from the College of William & Mary. He is a member of the IEEE and ACM. Contact him at the Multiprocessor Server Division, HAL Computer Systems Inc., Campbell, CA 95008; yyan@hal.com.

Xiaodong Zhang is a professor of computer science at the College of William & Mary. His research interests are parallel and distributed systems, computer-system performance evaluation, and scientific computing. He is an associate editor of IEEE Transactions on Parallel and Distributed Systems, and chairs the IEEE Computer Society Technical Committee on Supercomputing Applications. He received his BS in electrical engineering from Beijing Polytechnic University, China, and his MS and PhD in computer science from the University of Colorado at Boulder. Contact him at the Dept. of Computer Science, College of William & Mary, Williamsburg, VA 23187-8795; zhang@cs.wm.edu.
