Sie sind auf Seite 1von 624

Springer Proceedings in Mathematics & Statistics

RonaldCools
DirkNuyens Editors

Monte Carlo
and QuasiMonte Carlo
Methods
MCQMC, Leuven, Belgium, April 2014

Springer Proceedings in Mathematics & Statistics


Volume 163

Springer Proceedings in Mathematics & Statistics


This book series features volumes composed of selected contributions from
workshops and conferences in all areas of current research in mathematics and
statistics, including operation research and optimization. In addition to an overall
evaluation of the interest, scientic quality, and timeliness of each proposal at the
hands of the publisher, individual contributions are all refereed to the high quality
standards of leading journals in the eld. Thus, this series provides the research
community with well-edited, authoritative reports on developments in the most
exciting areas of mathematical and statistical research today.

More information about this series at http://www.springer.com/series/10533

Ronald Cools Dirk Nuyens

Editors

Monte Carlo and


Quasi-Monte Carlo Methods
MCQMC, Leuven, Belgium, April 2014

123

Editors
Ronald Cools
Department of Computer Science
KU Leuven
Heverlee
Belgium

Dirk Nuyens
Department of Computer Science
KU Leuven
Heverlee
Belgium

ISSN 2194-1009
ISSN 2194-1017 (electronic)
Springer Proceedings in Mathematics & Statistics
ISBN 978-3-319-33505-6
ISBN 978-3-319-33507-0 (eBook)
DOI 10.1007/978-3-319-33507-0
Library of Congress Control Number: 2016937963
Mathematics Subject Classication (2010): 11K45, 11K38, 65-06, 65C05, 65D30, 65D18, 65C30,
65C35, 65C40, 91G60
Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microlms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specic statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Preface

This volume represents the refereed proceedings of the Eleventh International


Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientic
Computing which was held at the KU Leuven in Belgium from 6 to 11 April 2014.
It contains a limited selection of articles based on presentations given at the conference. The conference program was arranged with the help of an international
committee consisting of the following members:

Ronald Cools (Belgium, KU Leuven)Chair


Luc Devroye (Canada, McGill University)
Josef Dick (Australia, University of New South Wales)
Alain Dubus (Belgium, Universit libre de Bruxelles)
Philip Dutr (Belgium, KU Leuven)
Henri Faure (France, Aix-Marseille Universit)
Alan Genz (USA, Washington State University)
Mike Giles (UK, Oxford University)
Paul Glasserman (USA, Columbia University)
Michael Gnewuch (Germany, Universitt Kaiserslautern)
Stefan Heinrich (Germany, Universitt Kaiserslautern)
Fred Hickernell (USA, Illinois Institute of Technology)
Aicke Hinrichs (Germany, Universitt Rostock)
Stephen Joe (New Zealand, University of Waikato)
Aneta Karaivanova (Bulgaria, Bulgarian Academy of Sciences)
Alexander Keller (Germany, NVIDIA)
Dirk Kroese (Australia, The University of Queensland)
Frances Kuo (Australia, University of New South Wales)
Pierre LEcuyer (Canada, Universit de Montral)
Gerhard Larcher (Austria, Johannes Kepler Universitt Linz)
Christiane Lemieux (Canada, University of Waterloo)
Christian Lcot (France, Universit de Savoie)
Makoto Matsumoto (Japan, Hiroshima University)
Thomas Mller-Gronbach (Germany, Universitt Passau)

vi

Preface

Harald Niederreiter (Austria, Austrian Academy of Sciences)


Erich Novak (Germany, Friedrich-Schiller-Universitt Jena)
Dirk Nuyens (Belgium, KU Leuven)
Art Owen (USA, Stanford University)
Gareth Peters (UK, University College London)
Friedrich Pillichshammer (Austria, Johannes Kepler Universitt Linz)
Leszek Plaskota (Poland, University of Warsaw)
Eckhard Platen (Australia, University of Technology Sydney)
Klaus Ritter (Germany, Universitt Kaiserslautern)
Giovanni Samaey (Belgium, KU Leuven)
Wolfgang Schmid (Austria, Universitt Salzburg)
Nikolai Simonov (Russia, Russian Academy of Sciences)
Ian Sloan (Australia, University of New South Wales)
Shu Tezuka (Japan, Kyushu University)
Xiaoqun Wang (China, Tsinghua University)
Grzegorz Wasilkowski (USA, University of Kentucky)
Henryk Woniakowski (Poland, University of Warsaw)

This conference continued the tradition of biennial MCQMC conferences initiated by Harald Niederreiter, held previously at the following places:
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.

Las Vegas, USA (1994)


Salzburg, Austria (1996)
Claremont, USA (1998)
Hong Kong (2000)
Singapore (2002)
Juan-Les-Pins, France (2004)
Ulm, Germany (2006)
Montreal, Canada (2008)
Warsaw, Poland (2010)
Sydney, Australia (2012)

The next conference will be held at Stanford University, USA, in August 2016.
The proceedings of these previous conferences were all published by
Springer-Verlag, under the following titles:
Monte Carlo and Quasi-Monte Carlo Methods in Scientic Computing
(H. Niederreiter and P.J.-S. Shiue, eds.)
Monte Carlo and Quasi-Monte Carlo Methods 1996 (H. Niederreiter,
P. Hellekalek, G. Larcher and P. Zinterhof, eds.)
Monte Carlo and Quasi-Monte Carlo Methods 1998 (H. Niederreiter and
J. Spanier, eds.)
Monte Carlo and Quasi-Monte Carlo Methods 2000 (K.-T. Fang,
F.J. Hickernell and H. Niederreiter, eds.)
Monte Carlo and Quasi-Monte Carlo Methods 2002 (H. Niederreiter, ed.)
Monte Carlo and Quasi-Monte Carlo Methods 2004 (H. Niederreiter and
D. Talay, eds.)

Preface

vii

Monte Carlo and Quasi-Monte Carlo Methods 2006 (A. Keller, S. Heinrich and
H. Niederreiter, eds.)
Monte Carlo and Quasi-Monte Carlo Methods 2008 (P. LEcuyer and A. Owen,
eds.)
Monte Carlo and Quasi-Monte Carlo Methods 2010 (L. Plaskota and
H. Woniakowski, eds.)
Monte Carlo and Quasi-Monte Carlo Methods 2012 (J. Dick, F.Y. Kuo,
G.W. Peters and I.H. Sloan, eds.)
The program of the conference was rich and varied with 207 talks. Highlights
were the invited plenary talks, the tutorials and a public lecture. The plenary talks
were given by Steffen Dereich (Germany, Westflische Wilhelms-Universitt
Mnster), Peter Glynn (USA, Stanford University), Wenzel Jakob (Switzerland,
ETH Zrich), Makoto Matsumoto (Japan, Hiroshima University), Harald
Niederreiter (Austria, Austrian Academy of Sciences), Erich Novak (Germany,
Friedrich-Schiller-Universitt Jena), Christian Robert (France, Universit
Paris-Dauphine and UK, University of Warwick) and Raul Tempone (Saudi Arabia,
King Abdullah University of Science and Technology). The tutorials were given by
Mike Giles (UK, Oxford University) and Art Owen (USA, Stanford University),
and the public lecture was by Jos Leys.
The papers in this volume were carefully refereed and cover both theory and
applications of Monte Carlo and quasi-Monte Carlo methods. We thank the
reviewers for their extensive reports.
We gratefully acknowledge nancial support from the KU Leuven, the city of
Leuven, the US National Science Foundation and the FWO Scientic Research
Community Stochastic Modelling with Applications in Financial Markets.
Leuven
December 2015

Ronald Cools
Dirk Nuyens

Contents

Part I

Invited Papers

Multilevel Monte Carlo Implementation for SDEs Driven


by Truncated Stable Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Steffen Dereich and Sangmeng Li

Construction of a Mean Square Error Adaptive EulerMaruyama


Method With Applications in Multilevel Monte Carlo . . . . . . . . . . . . .
Hkon Hoel, Juho Hppl and Ral Tempone

29

Vandermonde Nets and Vandermonde Sequences . . . . . . . . . . . . . . . .


Roswitha Hofer and Harald Niederreiter

87

Path Space Markov Chain Monte Carlo Methods in Computer


Graphics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Wenzel Jakob
Walsh Figure of Merit for Digital Nets: An Easy Measure
for Higher Order Convergent QMC . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Makoto Matsumoto and Ryuichi Ohori
Some Results on the Complexity of Numerical Integration . . . . . . . . . . 161
Erich Novak
Approximate Bayesian Computation: A Survey on Recent Results . . . . 185
Christian P. Robert
Part II

Contributed Papers

Multilevel Monte Carlo Simulation of Statistical Solutions


to the NavierStokes Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Andrea Barth, Christoph Schwab and Jonas ukys

ix

Contents

Unbiased Simulation of Distributions with Explicitly


Known Integral Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Denis Belomestny, Nan Chen and Yiwei Wang
Central Limit Theorem for Adaptive Multilevel Splitting
Estimators in an Idealized Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Charles-Edouard Brhier, Ludovic Goudenge and Loc Tudela
Comparison Between LS-Sequences and b-Adic van der
Corput Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Ingrid Carbone
Computational Higher Order Quasi-Monte Carlo Integration . . . . . . . 271
Robert N. Gantner and Christoph Schwab
Numerical Computation of Multivariate Normal Probabilities
Using Bivariate Conditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Alan Genz and Giang Trinh
Non-nested Adaptive Timesteps in Multilevel Monte
Carlo Computations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Michael B. Giles, Christopher Lester and James Whittle
On ANOVA Decompositions of Kernels and Gaussian
Random Field Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
David Ginsbourger, Olivier Roustant, Dominic Schuhmacher,
Nicolas Durrande and Nicolas Lenz
The Mean Square Quasi-Monte Carlo Error for Digitally
Shifted Digital Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Takashi Goda, Ryuichi Ohori, Kosuke Suzuki and Takehito Yoshiki
Uncertainty and Robustness in Weather Derivative Models . . . . . . . . . 351
Ahmet Gnc, Yaning Liu, Giray kten and M. Yousuff Hussaini
Reliable Adaptive Cubature Using Digital Sequences . . . . . . . . . . . . . . 367
Fred J. Hickernell and Llus Antoni Jimnez Rugama
Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate
Periodic Functions with Bounded Mixed Derivatives . . . . . . . . . . . . . . 385
Aicke Hinrichs and Jens Oettershagen
Adaptive Multidimensional Integration Based on Rank-1 Lattices . . . . . 407
Llus Antoni Jimnez Rugama and Fred J. Hickernell
Path Space Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
Alexander Keller, Ken Dahm and Nikolaus Binder
Tractability of Multivariate Integration in Hybrid Function
Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
Peter Kritzer and Friedrich Pillichshammer

Contents

xi

Derivative-Based Global Sensitivity Measures and Their Link


with Sobol Sensitivity Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Sergei Kucherenko and Shugfang Song
Bernstein Numbers and Lower Bounds for the Monte Carlo Error . . . . 471
Robert J. Kunsch
A Note on the Importance of Weak Convergence Rates for SPDE
Approximations in Multilevel Monte Carlo Schemes . . . . . . . . . . . . . . 489
Annika Lang
A Strategy for Parallel Implementations of Stochastic
Lagrangian Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
Lionel Lentre
A New Rejection Sampling Method for Truncated Multivariate
Gaussian Random Variables Restricted to Convex Sets . . . . . . . . . . . . 521
Hassan Maatouk and Xavier Bay
Van der Corput and Golden Ratio Sequences Along the Hilbert
Space-Filling Curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
Colas Schretter, Zhijian He, Mathieu Gerber, Nicolas Chopin
and Harald Niederreiter
Uniform Weak Tractability of Weighted Integration . . . . . . . . . . . . . . 545
Pawe Siedlecki
Incremental Greedy Algorithm and Its Applications
in Numerical Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
Vladimir Temlyakov
On Upper Error Bounds for Quadrature Formulas
on Function Classes by K.K. Frolov . . . . . . . . . . . . . . . . . . . . . . . . . . 571
Mario Ullrich
Tractability of Function Approximation with Product Kernels . . . . . . . 583
Xuan Zhou and Fred J. Hickernell
Discrepancy Estimates For Acceptance-Rejection Samplers
Using Stratied Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599
Houying Zhu and Josef Dick
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 621

List of Participants

Nico Achtsis, KU Leuven, Belgium


Sergios Agapiou, University of Warwick, UK
Giacomo Albi, University of Ferrara, Italy
Martin Altmayer, Universitt Mannheim, Germany
Anton Antonov, Saint Petersburg State University, Russia
Emanouil Atanassov, Bulgarian Academy of Sciences, Bulgaria
Yves Atchad, University of Michigan, USA
Serge Barbeau, Montreal University, Canada
Andrea Barth, ETH Zrich, Switzerland
Kinjal Basu, Stanford University, USA
Tobias Baumann, University of Mainz, Germany
Christian Bayer, Weierstrass Institute, Germany
Benot Beck, Arxios sprl, Belgium
Denis Belomestny, Duisburg-Essen University, Germany
Francisco Bernal, Instituto Superior Tcnico, Portugal
Debarati Bhaumik, CWI Amsterdam, The Netherlands
Dmitriy Bilyk, University of Minnesota, USA
Jose Blanchet, Columbia University, USA
Bastian Bohn, University of Bonn, Germany
Luke Bornn, Harvard University, USA
Bruno Bouchard, ENSAE-ParisTech, France
Luca Brandolini, University of Bergamo, Italy
Johann Brauchart, The University of New South Wales, Australia
Charles-Edouard Brhier, Ecoles des Ponts, France
Tim Brereton, Universitt Ulm, Germany
Glenn Byrenheid, University of Bonn, Germany
Ingrid Carbone, University of Calabria, Italy
Biagio Ciuffo, Joint Research Centre European Commission, Italy
Leonardo Colzani, Universit di Milano-Bicocca, Italy
Ronald Cools, KU Leuven, Belgium
Simon Cotter, University of Manchester, UK
xiii

xiv

List of Participants

Radu Craiu, University of Toronto, Canada


Antonio Dalessandro, University College London, UK
Fred Daum, Raytheon, USA
Thomas Daun, Technische Universitt Kaiserslautern, Germany
Lucia Del Chicca, Johannes Kepler University Linz, Austria
Steffen Dereich, Westflische Wilhelms-Universitt Mnster, Germany
Josef Dick, The University of New South Wales, Australia
Giacomo Dimarco, University of Toulouse III, France
Ivan Dimov, Bulgarian Academy of Sciences, Bulgaria
Dng Dinh, Vietnam National University, Vietnam
Benjamin Doerr, Ecole Polytechnique, France
Gonalo dos Reis, Technical University Berlin, Germany
Alain Dubus, Universit Libre de Bruxelles, Belgium
Pnar H. Durak, Yeditepe University, Turkey
Pierre tor, Grenoble University, France
Henri Faure, Aix-Marseille Universit, France
Robert Gantner, ETH Zrich, Switzerland
Christel Geiss, University of Innsbruck, Austria
Stefan Geiss, University of Innsbruck, Austria
Alan Genz, Washington State University, USA
Iliyan Georgiev, Solid Angle Ltd., UK
Mathieu Gerber, University of Lausanne, Switzerland
Giacomo Gigante, University of Bergamo, Italy
Mike Giles, University of Oxford, UK
David Ginsbourger, University of Bern, Switzerland
Peter W. Glynn, Stanford University, USA
Michael Gnewuch, Technische Universitt Kaiserslautern, Germany
Maciej Gowin, AGH University of Science and Technology, Poland
Takashi Goda, The University of Tokyo, Japan
Ahmet Gnc, Xian Jiaotong Liverpool University, China
Peter Grabner, Graz University of Technology, Austria
Mathilde Grandjacques, Grenoble University, France
Andreas Griewank, Humboldt-University Berlin, Germany
Adrien Gruson, Rennes 1 University, France
Arnaud Guyader, University of Rennes, France
Toshiya Hachisuka, Aarhus University, Denmark
Georg Hahn, Imperial College London, UK
Abdul-Lateef Haji-Ali, King Abdullah University of Science and Technology,
Saudi Arabia
Hiroshi Haramoto, Ehime University, Japan
Shin Harase, Tokyo Institute of Technology, Japan
Carsten Hartmann, Freie Universitt Berlin, Germany
Mario Hefter, Technische Universitt Kaiserslautern, Germany
Stefan Heinrich, Technische Universitt Kaiserslautern, Germany
Clemens Heitzinger, Arizona State University, USA

List of Participants

xv

Peter Hellekalek, University of Salzburg, Austria


Fred J. Hickernell, Illinois Institute of Technology, USA
Aicke Hinrichs, University of Rostock, Germany
Hkon Hoel, King Abdullah University of Science and Technology, Saudi Arabia
Wanwan Huang, Roosevelt University, USA
Martin Hutzenthaler, University of Frankfurt, Germany
Mac Hyman, Tulane University, USA
Christian Irrgeher, Johannes Kepler University Linz, Austria
Pierre Jacob, University of Oxford, UK
Wenzel Jakob, ETH Zrich, Switzerland
Alexandre Janon, Universit Paris Sud, France
Karl Jansen, Deutsches Elektronen Synchroton, Germany
Wojciech Jarosz, The Walt Disney Company, Switzerland
Arnulf Jentzen, ETH Zrich, Switzerland
Lan Jiang, Illinois Institute of Technology, USA
Llus Antoni Jimnez Rugama, Illinois Institute of Technology, USA
Stephen Joe, The University of Waikato, New Zealand
Charles Joseph, Case Western Reserve University, USA
Lutz Kmmerer, TU Chemnitz, Germany
Anton S. Kaplanyan, Karlsruhe Institute of Technology, Germany
Alexander Keller, NVIDIA, Germany
Amirreza Khodadadian, TU Vienna, Austria
Anton Kostiuk, Technische Universitt Kaiserslautern, Germany
Alexander Kreinin, IBM, Canada
Peter Kritzer, Johannes Kepler University Linz, Austria
Jaroslav Kivnek, Charles University in Prague, Czech Republic
Sergei Kucherenko, Imperial College London, UK
Thomas Khn, Universitt Leipzig, Germany
Arno Kuijlaars, KU Leuven, Belgium
Robert J. Kunsch, Friedrich Schiller University Jena, Germany
Frances Kuo, The University of New South Wales, Australia
Pierre LEcuyer, University of Montreal and INRIA Rennes, Canada
Cline Labart, Universit de Savoie, France
William Lair, EDF R&D, France
Annika Lang, Chalmers University of Technology, Sweden
Gerhard Larcher, Johannes Kepler University Linz, Austria
Kody Law, King Abdullah University of Science and Technology, Saudi Arabia
Christian Lcot, Universit de Savoie, France
Fabrizio Leisen, University of Kent, UK
Tony Lelivre, Ecole des Ponts, France
Jrme Lelong, Grenoble University, France
Lionel Lentre, INRIA Rennes Bretagne Atlantique and Rennes 1, France
Gunther Leobacher, Johannes Kepler University Linz, Austria
Paul Leopardi, The Australian National University, Australia
Hernan Levey, Humboldt-University Berlin, Germany

xvi

List of Participants

Chris Lester, University of Oxford, UK


Josef Leydold, Vienna University of Economics and Business, Austria
Sangmeng Li, Westflische Wilhelms-Universitt Mnster, Germany
Binghuan Lin, Techila Technologies Ltd., Finland
Jingchen Liu, Columbia University, USA
Kai Liu, University of Waterloo, Canada
Yanchu Liu, The Chinese University of Hong Kong, China
Hassan Maatouk, Ecole des Mines de St-Etienne, France
Sylvain Maire, Universit de Toulon, France
Lev Markhasin, University of Stuttgart, Germany
Luca Martino, University of Helsinki, Finland
Makoto Matsumoto, Hiroshima University, Japan
Charles Matthews, University of Edinburgh, UK
Roel Matthysen, KU Leuven, Belgium
Sebastian Mayer, University of Bonn, Germany
Baej Miasojedow, University of Warsaw, Poland
Alvaro Moraes, King Abdullah University of Science and Technology, Saudi Arabia
Pawe Morkisz, AGH University of Science and Technology, Poland
Hozumi Morohosi, National Graduate Institute for Policy Studies, Japan
Eric Moulines, Tlcom ParisTech, France
Chiranjit Mukhopadhyay, Indian Institute of Science, India
Thomas Mller-Gronbach, University of Passau, Germany
Tigran Nagapetyan, Fraunhofer ITWM, Germany
Andreas Neuenkirch, Universitt Mannheim, Germany
Duy Nguyen, University of Wisconsin-Madison, USA
Nguyet Nguyen, Florida State University, USA
Thi Phuong Dong Nguyen, KU Leuven, Belgium
Harald Niederreiter, Austrian Academy of Sciences, Austria
Wojciech Niemiro, University of Warsaw, Poland
Takuji Nishimura, Yamagata University, Japan
Erich Novak, Friedrich Schiller University Jena, Germany
Dirk Nuyens, KU Leuven, Belgium
Jens Oettershagen, University of Bonn, Germany
Ryuichi Ohori, The University of Tokyo, Japan
Giray kten, Florida State University, USA
Steffen Omland, Technische Universitt Kaiserslautern, Germany
Michela Ottobre, Imperial College London, UK
Daoud Ounaissi, Universit Lille 1, France
Art Owen, Stanford University, USA
Angeliki Papana, University of Macedonia, Greece
Peter Parczewski, Universitt Mannheim, Germany
Robert Patterson, Weierstrass Institute, Germany
Stefan Pauli, ETH Zrich, Switzerland
Jean-Philippe Praud, Massachusetts Institute of Technology, USA

List of Participants

Magnus Perninge, Lund University, Sweden


Gareth William Peters, University College London, UK
Friedrich Pillichshammer, Johannes Kepler University Linz, Austria
sabel Piri, Johannes Kepler University Linz, Austria
Leszek Plaskota, University of Warsaw, Poland
Jan Pospil, University of West Bohemia, Czech Republic
Clmentine Prieur, Grenoble University, France
Antonija Prlja, Arctur d.o.o., Slovenia
Pawe Przybyowicz, AGH University of Science and Technology, Poland
Mykhailo Pupashenko, Technische Universitt Kaiserslautern, Germany
Vilda Purutuolu, Middle East Technical University, Turkey
Shaan Qamar, Duke University, USA
Christoph Reisinger, University of Oxford, UK
Lee Ricketson, Univerisity of California, Los Angeles, USA
Klaus Ritter, Technische Universitt Kaiserslautern, Germany
Christian Robert, Universit Paris-Dauphine, France
Werner Roemisch, Humboldt-University Berlin, Germany
Mathias Rousset, INRIA Paris, Rocquencourt, France
Raphal Roux, Universit Pierre et Marie Curie, France
Daniel Rudolf, Friedrich Schiller University Jena, Germany
Halis Sak, Yeditepe University, Turkey
Andrea Saltelli, Joint Research Centre European Commission, Italy
Giovanni Samaey, KU Leuven, Belgium
Wolfgang Ch. Schmid, University of Salzburg, Austria
Scott Schmidler, Duke University, USA
Colas Schretter, Vrije Universiteit Brussel, Belgium
Nikolaus Schweizer, Saarland University, Germany
Jean Michel Sellier, Bulgarian Academy of Sciences, Bulgaria
John Shortle, George Mason University, USA
Winfried Sickel, Friedrich Schiller University Jena, Germany
Pawe Siedlecki, University of Warsaw, Poland
Martin Simon, University of Mainz, Germany
Ian Sloan, The University of New South Wales, Australia
Alexey Stankovskiy, SCK-CEN, Belgium
iva Stepani, Arctur d.o.o., Slovenia
Jonas ukys, ETH Zrich, Switzerland
Gowri Suryanarayana, KU Leuven, Belgium
Kosuke Suzuki, The University of Tokyo, Japan
David Swenson, Universiteit van Amsterdam, The Netherlands
Michaela Szlgyenyi, Johannes Kepler University Linz, Austria
Lukasz Szpruch, University of Edinburgh, UK
Tor Srevik, University of Bergen, Norway
Stefano Tarantola, Joint Research Centre European Commission, Italy
Rodrigo Targino, University College London, UK
Aretha Teckentrup, Florida State University, USA

xvii

xviii

List of Participants

Vladimir Temlyakov, University of South Carolina, USA


Ral Tempone, King Abdullah University of Science and Technology, Saudi Arabia
Tom Tich, VSB-TU Ostrava, Czech Republic
Giancarlo Travaglini, Universit di Milano-Bicocca, Italy
Benjamin Trendelkamp-Schroer, Freie Universitt Berlin, Germany
Bruno Tufn, INRIA Rennes Bretagne, Atlantique, France
Gerhard Tulzer, TU Vienna, Austria
Plamen Turkedjiev, Ecole Polytechnique, France
Mario Ullrich, Friedrich Schiller University Jena, Germany
Tino Ullrich, University of Bonn, Germany
Manolis Vavalis, Univeristy of Thessaly, Greece
Matti Vihola, University of Jyvskyl, Finland
Pedro Vilanova, King Abdullah University of Science and Technology, Saudi Arabia
Toni Volkmer, TU Chemnitz, Germany
Sebastian Vollmer, University of Oxford, UK
Jan Vybral, Technical University Berlin, Germany
Wander Wadman, CWI Amsterdam, The Netherlands
Clment Walter, Universit Paris DiderotParis 7, France
Xiaoqun Wang, Tsinghua University, China
Yiwei Wang, The Chinese University of Hong Kong, China
Markus Weimar, Philipps-University Marburg, Germany
Jakub Wojdya, AGH University of Science and Technology, Poland
Kasia Wolny, University of Warwick, UK
Yijun Xiao, Universit Paris-Ouest Nanterre-La Dfense, France
Yuanwei Xu, University of Warwick, UK
Larisa Yaroslavtseva, University of Passau, Germany
Takehito Yoshiki, The University of Tokyo, Japan
Xuan Zhou, Illinois Institute of Technology, USA
Houying Zhu, The University of New South Wales, Australia

Part I

Invited Papers

Multilevel Monte Carlo Implementation


for SDEs Driven by Truncated Stable
Processes
Steffen Dereich and Sangmeng Li

Abstract In this article we present an implementation of a multilevel Monte Carlo


scheme for Lvy-driven SDEs introduced and analysed in (Dereich and Li, Multilevel Monte Carlo for Lvy-driven SDEs: central limit theorems for adaptive Euler
schemes, Ann. Appl. Probab. 26, No. 1, 136185, 2016 [12]). The scheme is based
on direct simulation of Lvy increments. We give an efficient implementation of the
algorithm. In particular, we explain direct simulation techniques for Lvy increments.
Further, we optimise over the involved parameters and, in particular, the refinement
multiplier. This article complements the theoretical considerations of the above reference. We stress that we focus on the case where the frequency of small jumps is
particularly high, meaning that the BlumenthalGetoor index is larger than one.
Keywords Multilevel Monte Carlo Lvy-driven stochastic differential equation
Truncated stable distributions Computation of expectations

1 Introduction
The numerical computation of expectations E[G(X )] for solutions X = (X t )t[0,T ]
of stochastic differential equations (SDE) is a classical problem in stochastic analysis and numerous numerical schemes were developed and analysed within the last
twenty to thirty years, see for instance the textbooks by Kloeden and Platen [19]
and Glasserman [15]. Recently, a new very efficient class of Monte Carlo algorithms
was introduced by Giles [14], see also Heinrich [17] for an earlier variant of the
computational concept. Central to these multilevel Monte Carlo algorithms is the
use of whole hierarchies of approximations in numerical simulations.
S. Dereich (B) S. Li
Institut Fr Mathematische Statistik, Westflische Wilhelms-Universitt Mnster,
Orlans-Ring 10, 48149 Mnster, Germany
e-mail: steffen.dereich@wwu.de
S. Li
e-mail: li.sangmeng@wwu.de
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_1

S. Dereich and S. Li

In this article, we focus on stochastic differential equations that are driven by


Lvy processes. That means the driving process is a sum of a Brownian motion and
a discontinuous process typically featuring infinitely many jumps in compact intervals. Numerical methods for Lvy-driven SDEs have been introduced and analysed
by various authors, see e.g. [18, 27]. A common approach in the simulation of Lvy
processes is to simulate all discontinuities of the Lvy process that are larger than a
threshold and to ignore the remainder or to approximate the remainder by a Brownian
motion (Gaussian approximation), see [2]. Corresponding multilevel Monte Carlo
schemes are analysed in [10, 11]. In general the efficiency of such schemes depends
on the frequency of small jumps that is measured in terms of the Blumenthal-Getoor
index (BG index), a number in [0, 2] with a higher number referring to a higher frequency. If the BG index is less than one, then the quadrature error of simple schemes
based on shot noise representations is of the same order as the one obtained for continuous diffusions. However, when the BG index is larger than one, schemes that are
based on the simulation of individual discontinuities slow down significantly and the
simulation of the Lvy process is the main bottleneck in the numerics. Introducing a
Gaussian approximation improves the order of convergence, but still such schemes
show worse orders of convergence as obtained for diffusions. A remedy to obtain the
same order of convergence as for diffusions is to directly sample from the distribution
of Lvy increments. In this article, we consider an adaptive scheme introduced in
[12] that applies direct sampling techniques. Our focus lies on the implementation
of such algorithms with a particular emphasize on SDEs driven by truncated stable processes. We conduct numerical tests concerning the accuracy of the sampling
algorithm and of the multilevel scheme.
In the following, (, F , P) denotes a probability space that is sufficiently rich
to ensure existence of all random variables used in the exposition. We let Y =
(Yt )t[0,T ] be a square integrable Lvy-process and note that there exist
 b R (drift),
2 [0, ) (diffusion coefficient) and a measure on R\{0} with x 2 ( dx) <
(Lvy measure) such that



1 2
i zYt
E[e ] = exp t ibz z + (ei zx 1 i zx) ( dx)
2
for t [0, T ] and z R. We call the unique triplet (b, 2 , ) Lvy triplet, although
this notion slightly deviates from its original use. We refer the reader to the textbooks
by Applebaum [1], Bertoin [6] and Sato [28] for a concise treatment of Lvy processes.
The process X = (X t )t[0,T ] denotes the solution to the stochastic integral equation
 t
X t = x0 +
a(X s ) dYs ,
t [0, T ],
(1)
0

where a : R R is a continuously differentiable Lipschitz function and x0 R. Both


processes Y and X attain values in the space of cdlg functions, i.e. the space of right
continuous functions with left limits, on [0, T ] which we denote by D(R) and endow
with the Skorokhod topology. We will analyse multilevel algorithms for the computation of expectations E[G(X )], where G : D(R) R is a measurable functional such

Multilevel Monte Carlo Implementation for SDEs Driven

that G(x) depends on the marginals, integrals and/or supremum of the path x D(R).
Before we state the results we introduce the underlying numerical schemes.

1.1 Jump-Adapted Euler Scheme


In the context of Lvy-driven stochastic differential equations there are various Eulertype schemes analysed in the literature. We consider jump-adapted Euler schemes.
For finite Lvy measures these were introduced by Platen [25] and analysed by
various authors, see, e.g., [7, 22]. For infinite Lvy measures an error analysis is
conducted in [9, 11] for two multilevel Monte Carlo schemes. Further, weak approximation is analysed in [20, 24].
For the definition of the scheme we use the simple Poisson point process on
the Borel sets of (0, T ] (R\{0}) that is associated to Y , that is


(s,Ys ) ,

s(0,T ]:Ys =0

where denotes the Dirac delta function and xt = xt xt for x D(R) and
t (0, T ]. It has intensity (0,T ] , where (0,T ] denotes Lebesgue measure on
(0, T ]. Further, let be the compensated variant of that is the random signed
measure on (0, T ] (R\{0}) given by
= (0,T ] .
The process (Yt )t[0,T ] admits the representation

Yt = bt + Wt + lim
0

(0,t]B(0,)c

x d (s, x),

(2)

where (Wt )t[0,T ] is an appropriate (of independent) standard Brownian motion


and the limit is to be understood uniformly in L2 .
We introduce the numerical scheme from [12] that is based on direct simulation
of Lvy increments. We use a family of approximations indexed by three strictly
positive parameters h, and  satisfying
T N and  N.
We represent (Yt ) as a sum of two independent processes (Yth ) and (Yth ). The former
one is constituted by the drift, the diffusive part and the jumps bigger than h, that is

Yth

= bt + Wt +

(0,t]B(0,h)c

x d (s, x),

(3)

S. Dereich and S. Li

and the latter one by the (compensated) small jumps only, that is
Yth = lim
0


(0,t](B(0,h)\B(0,))

x d (s, x).

(4)

We apply an Euler scheme with two sets of update times for the coefficient. We
enumerate the times

Z [0, T ] {t (0, T ] : |Yt | h} = {T0 , T1 , . . . },




in increasing order and consider the Euler approximation X h,, = ( X th,, )t[0,T ]

given as the unique process with X 0h,, = x0 that is piecewise constant on [Tn1 , Tn )
and satisfies


h
h
= X Th,,
+ a( X Th,,
) (YThn YThn1 ) + 1 Z (Tn ) a( X Th,,
X Th,,
 ) (Y T Y T  ),
n
n
n
n1
n1
n
(5)
for n = 1, 2, . . . . Note that the coefficient in front of (Yth ) is updated at all times in
{T0 , T1 , . . . } and the coefficient in front of (Yth ) at all times in {0,  , 2 , . . . , T }
{T0 , T1 , . . . }. Hence two kinds of updates are used and we will consider schemes
where in the limit the second kind is in number negligible to the first kind. The parameter h serves as a threshold for jumps being considered large that entail immediate
updates on the fine scale. The parameters and  control the regular updates on the
fine and coarse scale.

We call X h,, piecewise constant approximation with parameter (h, ,  ). We


will also work with the continuous approximation X h,, = (X th,, )t[0,T ] defined
for n = 1, 2, . . . and t [Tn1 , Tn ) by



+ a( X Th,,
)(Yth YThn1 ).
X th,, = X Th,,
n1
n1

Note that for this approximation the evolution Y h takes effect continuously.

1.2 Multilevel Monte-Carlo


In general a multilevel scheme makes use of a whole hierarchy of approximate
solutions and we choose decreasing sequences (h k )kN , (k )kN and (k )kN and

denote for each k N by X k := X h k ,k ,k the corresponding Euler approximation as
introduced above, the so-called kth level.
Once this hierarchy of approximations has been fixed, a multilevel scheme
S is
parametrised by a N-valued vector (n 1 , . . . , n L ) of arbitrary finite length L: for a
measurable function G : D(R) R we approximate E[G(X )] by
E[G(X 1 )] + E[G(X 2 ) G(X 1 )] + . . . + E[G(X L ) G(X L1 )]

Multilevel Monte Carlo Implementation for SDEs Driven

and denote by
S(G) the random output that is obtained when estimating the individual
expectations E[G(X 1 )], E[G(X 2 ) G(X 1 )], . . . , E[G(X L ) G(X L1 )] independently by classical Monte-Carlo with n 1 , . . . , n L iterations and summing up the
individual estimates. More explicitly, a multilevel scheme
S associates to each measurable G a random variable
nk
n1
L


1 
1 

G(X k,i, f ) G(X k1,i,c ) ,
G(X 1,i ) +
S(G) =
n 1 i=1
n
k=2 k i=1

(6)

where the pairs of random variables (X k,i, f , X k1,i,c ), resp. the random variables X 1,i ,
appearing in the sums are all independent with identical distribution as (X k , X k1 ),
resp. X 1 . Note that the entries of the pairs are not independent of each other and the
superscript f and c refer to the fine and coarse simulation, respectively!

1.3 Error Analysis


In this section, we provide error estimates for multilevel Monte Carlo algorithms
based on the adaptive Euler scheme introduced before. We consider the quadrature
problem for functionals G : D(R) R of the form
G(x) = g(Ax)
with g : Rd R and linear functional A : D(R) Rd both satisfying regularity
assumptions to be specified below. Further we will consider the case where d = 1
and Ax = supt[0,T ] xt .
The hierarchical scheme of approximations: The hierarchical scheme of approximate solutions is described by a sequence of parameters ((h k , k , k ) : k N) each
triple describing an approximation as before. We assume that all three parameters
tend to zero and satisfy
(ML1)
(ML2)
(ML3)
(ML4)

k = M k T , where M N\{1} is fixed,


limk (B(0, h k )c ) k = 0,
k B(0,h k ) x 2 ( dx) log2 (1 + 1/k ) = o(k ),
h 2k log2 (1 + 1/k ) = o(k ).

We note that (ML3) and (ML4) are conditions that entail that our approximations
have the same quality as the ones that one obtains when doing adapted Euler with
update times {T0 , T1 , . . . }. Condition (ML2) implies that the number of updates
caused by large jumps is negligible in comparison to the regular updates at times in
N0 [0, T ]. This will be in line with our examples and entails that the error process
is of a particularly simple form.
Let (X k : k N) be a family of path approximation for X depending on ((h k , k ,

k ) : k N) and assume that is a parameter greater or equal to 1/2 such that

S. Dereich and S. Li



lim n E G(X n ) G(X ) = 0.

(7)

The maximal level and iteration numbers: We specify the family of multilevel
schemes. For each (0, 1) we denote by
S the multilevel scheme which has
maximal level
 log 1 
L() =
log M
and iteration numbers



n k () = 2 L() k1

for k = 1, 2, . . . , L().
The error process: The error estimate will make use of an additional process
which can be interpreted as idealised description of the difference between two
consecutive levels, the so called error process. We equip the points (s, Ys ) of the
Poisson point process with two independent marks s2 and s , the former one being
} and the latter one being standard normal.
uniformly distributed on {0, M1 , . . . , M1
M
The error process U = (Ut )t[0,T ] is defined as the solution of the integral equation

Ut =




1
1
2
M

a (X s )Us dYs +
0

+
s s (aa  )(X s ) Ys ,
2

(aa  )(X s ) dBs

(8)

s(0,t]:Ys =0

where B = (Bt )t[0,T ] is an additional independent standard Brownian motion.


Note that the above infinite sum has to be understood as an appropriate martingale
limit. More explicitly, denoting by Z = (Z t )t[0,T ] the Lvy process

Zt =


1

1
1
Bt + lim
s s Ys
0
2
M
s(0,t]:|Y |
s

we can rewrite (8) as



Ut =

a  (X s )Us dYs +

(aa  )(X s ) dZ s .

Central limit theorem: We cite an error estimate from [12]. We assume as before
that the driving process Y is a square integrable Lvy process and that the coefficient a
is a continuously differentiable Lipschitz function. Additionally we assume that 2
is strictly positive.
Suppose that G : D(R) R is of the form
G(x) = g(Ax)

Multilevel Monte Carlo Implementation for SDEs Driven

with A : D(R) Rd and g : Rd R satisfying the following assumptions:


1. A is a Lipschitz continuous functional A : D(R) Rd (w.r.t. supremum norm)
that is continuous w.r.t. the Skorokhod topology in PU -almost every path (Case 1)
or
2. A is given by Ax = supt[0,T ] xt and in particular d = 1 (Case 2),
and g is Lipschitz continuous and differentiable in P AX -almost every point.
Further we assume that we are given a hierarchical scheme of approximations
as described above. In particular, we assume that assumptions (ML1)-(ML4) and
Eq. (7) are satisfied for a fixed parameter [ 21 , ).
Theorem 1 Assume that Y is as introduced in this subsection and additionally
assume that the coefficient a : R R does not attain zero in Case 2.
The multilevel schemes (
S : (0, 1)) as introduced above satisfy
S (G) E[G(X )]) N (0, 2 ) as 0,
1 (
where N (0, 2 ) is the normal distribution with mean zero and

1. variance 2 = Var f (AX )


AU in Case 1 and
2. variance 2 = Var f  (AX )U S with S denoting the random time when X reaches
its supremum in Case 2.
Further,



lim 2 E (
S (G) E[G(X )])2 = 2 .
0

Remark 1 1. The theorem is a combination of Theorem 1.6, 1.8, 1.9, 1.10 of [12].
2. One of the assumptions requires a control on the bias, see (7). We note that the
assumptions imposed on G in the theorem imply validity of (7) for = 21 . In
general, research on weak approximation of SDEs suggests that (7) is typically
valid for < 1, see [3] for a corresponding result concerning diffusions.
3. If
 T


xs ds ,
Ax = x T ,
0

then the statement of the previous theorem remains true for the multilevel scheme
based on piecewise constant approximations with the same terms appearing in
the limit.
4. For k = 1, 2, . . . the expected number of Euler steps to generate X k (at the
discrete time skeleton of update times) is T (k1 + (B(0, h k )c )). Taking as cost
for a joint simulation of G(X k ) G(X k1 ) the expected number of Euler steps
we assign one simulation of
S (G) the cost

10

S. Dereich and S. Li

T 11 + T (B(0, h 1 )c + T

L()


1
n k ()(k1 + k1
+ (B(0, h k )c + (B(0, h k1 )c )

k=2

T (M + 1) 2
2
(log 1/)2 ,
(log M)2

5.

6.

7.

8.

as 0. In general we write for two functions g, h, h g to indicate that


lim hg = 1.
The supremum of the continuous approximation is simulatable. Between the
update times of the coefficient, the continuous approximation is a Brownian
motion plus drift and joint simulation of increments and suprema are feasible,
see [4].
In the original work the results are proved for one dimensional SDEs only to
keep the notation and presentation simple. However the proofs do not make use
of that fact and a generalisation to the multidimensional setting does not require
new techniques.
Error estimates as in the previous theorem that give only an upper bound
(of the same order) are known to hold under weaker assumptions. In particular, the differentiability of f and a is not needed for such a result, see [21].
In the diffusion setting a similar result as Theorem 1 can be found in Ben Alaya
and Kebaier [5] for diffusions and a smaller class of functionals. The main effort
in the analysis is the representation of the variance in terms of the error process.
In general, the validity of a central limit theorem without control on the variance can be often easily deduced with the Lindeberg condition. This approach
has appeared at other places in the literature and we mention [24] as an early
reference.

In Theorem 1 the effect of the multiplier M on the variance 2 is not completely


obvious. We cite Theorem 1.11 of [12].
Theorem 2 We assume the setting of Theorem 1. Further assume in Case 1 that A is
of integral type meaning that there exist finite signed measures 1 , . . . , d on [0, T ]
such that A = (A1 , . . . , Ad ) with


Ajx =

xs d j (s), for x D(R) and j = 1, . . . , d,

and generally suppose that a  (X s )Ys = 1 for all s [0, T ], almost surely. Then
there exists a constant depending on G and the underlying SDE, but not on M such
that the variance 2 satisfies

=

1
1
.
2
M

Multilevel Monte Carlo Implementation for SDEs Driven

11

2 Direct Simulation of Lvy Increments


In this section, we explain how we achieve sampling of Lvy increments. In the
following, we denote by F the cumulative distribution function of the real infinitely
divisible distribution with characteristic function


(ei zx 1 i zx) ( dx) , for z R,
(9)
(z) = exp
R\{0}


where is a measure on R\{0} with x 2 ( dx) < . In practise, the measure is
given and we need an effective algorithm for sampling from F.

2.1 Fourier Inversion


In a first precomputation we approximately invert the characteristic function with
the Hilbert transform method analysed in [8].
We consider a family of approximate cumulative distribution functions (cdf) that
is parametrised by two parameters > 0 and K N. We set
K
i  i x(k 1 ) ((k 21 ))
1
2
F,K (x) = +
e
, for x R.
2 2 k=K
(k 21 )

(10)

This approximation converges fast to the cdf, provided that satisfies certain assumptions. We cite an error estimate from [8].
Theorem 3 Suppose there exist positive reals d , d+ such that
is analytic in the space {z C; im(z) (d , d+ )},
d
d+ |(u + i y)| dy 0, as u ,

 := lim0 R |(u i(d ))| du < +.
If there exist constants , c, > 0 such that,
|(z)| exp(c|z| ), for z R,
then
|G(x) F,K (x)|

for x R.

e2d /xd
e2d+ /+xd+
 +
+
2d
/

2 d (1 e
)
2 d+ (1 e2d+ / )

4
1

+
ec(K )
+

2 K
c(K )

12

S. Dereich and S. Li

2.2 Sampling Algorithm


We aim at using a second order spline approximation to do sampling via an inverse
cdf method. We describe separately the precomputation and sampling algorithm.
Precomputation: In a precomputation, we compute second order approximations
for F,K on N consecutive intervals of equal length. More explicitly, we fix an interval
[xmin , xmax ], store for each k = 0, . . . , N the values
xk = xmin + k

xmax xmin
and yk = re(F,K (xk ))
N

and, for each k = 1, . . . , N , the unique parabola pk that coincides with re F,K in
the points xk1 , (xk1 + xk )/2, xk . We suppose that F is strictly increasing and note
that by choosing a sufficiently accurate approximation F,K we can guarantee that
each parabola pk is strictly increasing on [xk1 , xk ] and thus has a unique inverse
pk1 when restricted to the domain [xk1 , xk ].
We assume that N is of the form 2d+1 1 with d N and arrange the N entries
y0 , . . . , y N in a binary search tree of depth d.
Sampling: Sampling is achieved by carrying out the following steps:
generation of an on [y0 , y N ] uniformly distributed random number u,
identification of an index k {1, . . . , N } with u [yk1 , yk ] based on the binary
search tree,
output of pk1 (u).

3 Truncated Stable Processes


In this section we focus on truncated stable processes. Let c+ , c , h > 0 and
(0, 2). A Lvy process Y = (Yt )t0 is called truncated stable process with parameters
c+ ,c
) with
, h, c+ , c , if it has Lvy triplet (0, 0, h,
c ,c

+
h,
( dx) =

c+ 1(0,h] (x) + c 1[h,0) (x)


dx.
|x|1+

(11)

This class of processes has a scaling property similar to stable processes which is
particularly useful in simulations. It will allow us to do the precomputation for one
infinitely divisible distribution only and use the scaling property to do simulations
of different levels. For applications of truncated stable processes, we refer the reader
to [23, 26].

Multilevel Monte Carlo Implementation for SDEs Driven

13

3.1 Preliminaries
Proposition 1 Let > 0 and (Yt ) be a truncated stable process with parameters
, h, c+ , c . The process (Yt/ ) is a truncated stable process with parameters
, h, c+ , c .
Proof The process (Yt/ ) is a Lvy process with
E[e

i zxYt/

 t 

c+
c  
(eizx izx 1) 1(0,h] (x) 1+ + 1[h,0) (x) 1+ dx

|x|
|x|
 

c+
c  
= exp t (ei zy i zy 1) 1(0,h] (y) 1+ + 1[h,0) (x) 1+ dy
|y|
|y|

] = exp

for t 0 and z R.

In order to do a Fourier inversion via (10) we need the characteristic function of


a truncated stable distribution.
Proposition 2 Let Y be a truncated stable process with parameters , h, c+ , c .
Then for t 0 and z R
E[ei zYt ] = exp


t  i zh
i zh
c
e
+
c
e

(c
+
c
)

(c

c
)i
zh
+

h


t

i z c+ ei zh c ei zh (c+ c )
1
( 1)h

th 2
z 2 c+ 1 F1 (2 , 3 , i zh)

( 1)(2 )

+ c 1 F1 (2 , 3 , i zh) ,

where 1 F1 denotes the hypergeometric function. In the symmetric case where c :=


c+ = c , we have
 ct
ct
(ei zh + ei zh 2)
i z(ei zh ei zh )
h
( 1)h 1

ct
2

z
F
(2

,
3

,
i
zh)
+
F
(2

,
3

,
i
zh))
(
1
1
1
1
( 1)(2 )h 2

E[ei zYt ] = exp

Proof It suffices to prove the statement for c+ = 1 and c = 0. All other cases can
be deduced from this case via scaling, reflection and superposition. Recall that
  h

1
(ei zx 1 i zx) 1+ dx .
E[ei zYt ] = exp t
x
0

14

S. Dereich and S. Li

Applying partial integration we get




(ei zx 1 i zx)

1
x 1+


 1
1 h
i z h i zx
1
dx = (ei zx 1 i zx)
+
(e 1) dx

x 0+
0
x

and de lHpitals rule implies that lim x0


integration we get


(e

i zx

ei zx 1i zx
x

(12)
= 0. Doing an additional partial

 h

1
1
1
1 h
iz
i zx
(e 1) 1
1) dx =
+
ei zx 1 dx
0+
x
1
x
1 0
x
 h
1
cos(zx)
1
i
z
(ei zh 1) 1 +
=
dx
(13)
1
h
1 0 x 1
 h
z
sin(zx)

dx.
1 0 x 1

Using the integral tables of [16, Sect. 3.761] we conclude that




h
0

 1
cos(zx)
cos(zhx)
2
dx = h
dx
1
x
x 1
0
h 2
=
(1 F1 (2 , 3 , i zh) + 1 F1 (2 , 3 , i zh))
2(2 )

and


h
0

 1
sin(zx)
sin(zhx)
2
dx
=
h
dx
1
x
x 1
0
i h 2
=
(1 F1 (2 , 3 , i zh) 1 F1 (2 , 3 , i zh)).
2(2 )

Inserting this into (13) and then inserting the result into (12) finishes the proof.

Next we show that Theorem 3 is applicable for increments of truncated stable


processes. This implies that the distribution function of the increment can be efficiently approximated with the techniques of the previous section.
Proposition 3 Let h, c+ , c 0 and (1, 2) and let F be the distribution function with characteristic function
(z) = exp


R\{0}


(ei zx i zx 1) ( dx) .

Then the assumptions of Theorem 3 are satisfied for arbitrary d = d+ = d > 0 and
one has

( 1)

c+ + c 
h
|z| + 2
|(z)| exp (c+ + c ) () sin
2

Multilevel Monte Carlo Implementation for SDEs Driven

15

for z R. Here () denotes the -function evaluated at . Furthermore,



2

2 
,
 1 e 2 |d| 2 1 +
2
where
1 := exp

1
2

ehd d 2

c+ + c 2
c+ + c hd 
h
h e
and 2 := edh > 0.
+2
2

Proof Fix d > 0 and take z = u + i y with u R and y [d, d]. Using that
ei(u+i y)x 1 i(u + i y)x = eiuxyx 1 iux + yx
= eyx (eiux 1 iux) + eyx 1 + yx + iux(eyx 1)

we write (z) as product


(z) = exp






(eyx 1 + yx)( dx)
e (e 1 iux)( dx) exp


exp
iux(eyx 1)( dx) =: 1 (z) 2 (z) 3 (z).
yx

iux

We will analyse 1 , 2 and 3 separately.


Since y R, the integral ux(eyx 1) ( dx) is real and hence |3 (z)| = 1. To
estimate 2 we note that as a consequence of the Taylor approximation one has
|e 1 |

1 | | 2
e , for R.
2

Together with |y| d and |x| h we get that


|2 (z)| exp

1
2


ehd d 2


1
c+ + c 2 
h
.
x 2 ( dx) = exp ehd d 2
2
2

Finally we estimate 1 (z). Note that Re(eiux 1 iux) 0 and eyx edh if
|x| h. Hence,


Re eyx (eiux 1 iux) ( dx) edh Re(eiux 1 iux) ( dx)
so that




dh
|1 (z)| exp e
Re(eiux 1 iux) ( dx) .

16

S. Dereich and S. Li

In terms of the measure with ( dx) =




c+ +c
1[h,h] (x) |x|(1+)
2

dx we have


Re(eiux 1 iux) ( dx) = (eiux 1 iux) ( dx)


c+ + c (1+)
c+ + c (1+)
|x|
|x|
dx
(eiux 1 iux)
dx
= (eiux 1 iux)
c
2
2


 [h,h]
=|u| (symm. stable)

|u| + 2

c+ + c
h ,

where := (c+ + c ) () sin( (1)


) > 0. Combining the estimates yields for
2
z = u + i y with u R, y [d, d] the estimate
|(z)| 1 e2 |u|
where
1 := exp
and

1
2

ehd d 2

(14)

c+ + c 2
c+ + c hd 
h
h e
+2
2

2 := edh > 0.

Equation (14) implies that all assumptions of Theorem 3 are satisfied. If additionally
the imaginary part y of z is zero, then 2 (z) = 3 (z) = 1 and using the estimate for
1 gives that

c+ + c 
h
.
|(z)| exp |z| + 2

It remains to estimate  . One has






|u|+|d|
2
2
|(u + i(u ))| du 1 e2 (u +|d| ) 2 du 1 e2 ( 2 ) du
R
R
R
2

1 e 2 (|u| +|d| ) du
R




2 |u|
2 |u|
2 |d|
1 e 2
e 2 du +
e 2 du
B(0,1)


2

2+1 
,
1 e 2 |d| 2 +
2

B(0,1)c

and letting 0 we get that indeed + satisfies the inequality of the statement. A

similar computation shows that also  satisfies the same inequality.

Multilevel Monte Carlo Implementation for SDEs Driven

17

3.2 Multilevel Monte Carlo for Truncated Stable Processes


In this section we introduce a particular simple multilevel scheme for truncated
stable processes. We suppose that (Yt ) is a L 2 -integrable Lvy process with triplet
(b, 2 , ), where is of the form
c ,c

+
( dx) =
( dx) = H,

c+ 1(0,H ] (x) + c 1[H,0) (x)


dx
|x|1+

(15)

with c+ , c 0, (1, 2) and H 1 (in order to simplify notation we assume that


H 1, although we could equally well allow any H > 0).
We choose the hierarchical scheme of approximations as follows. We fix M
N\{1} and a parameter M  (M /2 , M) and let
1. k = M k T

N and k M  k as k
2. k k N (0, 1] with k k+1
 1/
3. h k = k .
In general, we write for two functions g and h, g h, if 0 < lim inf
. For instance we may define (k ) iteratively via 1 = T and

= min{k /m : m N} k N [M 
k+1

g
h

lim sup hg <

T, )

(16)

for k N.
Proposition 4 If the parameters ((h k , k , k ) : k N) are chosen as above, then
properties (ML1)(ML4) are satisfied.
Proof One has for k
(B(0, h k )c ) k

 M  k
c+ + c

k h k
0

since M  < M and


 M k
h 2k log2 (1 + 1/k )
M  2k/ log2 k

=
log2 k 0,
k
M k
M  2/
since M  2/ > M. Hence, (ML2) and (ML4) are satisfied. Property (ML3) follows
analogously



k
1  c+ + c 2 k
1
hk
x 2 ( dx) log2 1 +  =
log2 1 + 
k B(0,h k )
k
2
k
k

and the proof is finished.

M  2k/
log2 k
M k


18

S. Dereich and S. Li

As we show in the next proposition the fact that h k /k 1/ is constant in k allows
us to do the sampling of the process constituted by the small increments with the
help of only one additional distribution for which we have to do a precomputation
in advance.
Proposition 5 Suppose that is a real random variable with


c+ 1(0,1] (x) + c 1[1,0) (x)
(ei zx 1 i zx)
dx
.
E[ei z ] = exp
x 1+
(0,1]
For every k N the increments of (Yth k )t0 over intervals of length k are independent
and distributed as h k .
Proof We note that the increments of (Yt1 ) over intervals of length one are equally

distributed as . Furthermore, using that h k = k 1/ we get with Proposition 1 that


1
hk
the processes (h k Yt/
 )t0 and (Yt )t0 are equally distributed. Hence, the increments
k
of Y h k over intervals of length k are distributed as h k .

Next we describe how we implement one joint simulation of two consecutive
levels ( X tk ) and ( X tk+1 ). We will assume that we can sample from the distribution
of with as in the previous proposition. In practise we use the approximate
sampling algorithm introduced before.
One joint simulation of two levels: First we discuss how we simulate the fine
level ( X tk+1 ). Once we know the values of
h k+1

1. (Yt

) on the random set of times


(k+1 Z [0, T ]) {s (0, T ] : |Ys | h k+1 } = {T0 , T1 , . . . }

2. (Yt

h k+1


) on the set of times k+1
Z [0, T ]

we can compute ( X tk+1 ) via the Euler update rule (5). The increments of the process
in (2) are independent and distributed as h k+1 so that the simulation is straightforward. To simulate the process in (1) we first simulate the random set of discontinuities
{(s, Ys ) : s (0, T ], |Ys | h k+1 } =: {(S1 , D1 ), (S2 , D2 ), . . . }.
Here the points are ordered in such a way that S1 , S2 , . . . is increasing. These points
constitute a Poisson point process with intensity |(0,T ] | B(0,h k+1 )c . When considering an infinite time horizon the random variables ((Sk Sk1 , Dk ) : k N) (with
S0 = 0) are independent and identically distributed with both components being
independent of each other, the first being exponentially distributed with parameter
({x : |x| h k+1 }) and the second having distribution
1{|x|h k+1 }
( dx).
({|x| h k+1 })

Multilevel Monte Carlo Implementation for SDEs Driven

19

Hence, the sampling of the large discontinuities is achieved by drawing iid


samples of the previous distribution, adding the time increments and stopping once
one exceeds T . Once the discontinuities have been sampled we build the set of update
times via
{T0 , T1 , . . . } = (k+1 Z [0, T ]) {S1 , S2 , . . . }
and simulate standard Brownian motion (Bt ) on this set of times (using that the
increments are independent and conditionally N (0, Tk Tk1 )-distributed). Then
h

YTkk+1 = BTk +



Dk + Tk b

i:Si Tk

|x|h k+1

x ( dx)

for the times T0 , T1 , . . . .


To generate the coarse level (X tk ) we do not need further random samples. It only
depends on the values of
1. (Yth k ) on the random set of times
(k Z [0, T ]) {s (0, T ] : |Ys | h k } = {T0 , T1 , . . . }
2. (Yth k ) on the set of times k Z [0, T ].


Note that since k k+1
N we have k N k+1
N so that the updates times for the
coarse level are also update times for the fine level. We use that
h
Yth k = Yt k+1 +


Di t

i:Si t

|x|(h k+1 ,h k ]

x ( dx)

to generate (Yth k ) on k Z [0, T ]. To generate the former process we note that since
k k+1 N the set {T0 , T1 , . . . } is a subset of {T0 , T1 , . . . } so that we can use that
YThk
k

= B +

Tk

i:Si Tk ,|Di |h k



Dk + Tk b

|x|h k


x ( dx) .

We stress that all integrals can be made explicit due to the particular choice of .

3.3 Numerical Tests


In this section we do numerical tests for SDEs driven by truncated Lvy processes.
In Sect. 3.3.1 we analyse the error of the approximate direct simulation algorithm
used. Here we discuss when the error is of the order of real machine precision.

20

S. Dereich and S. Li

In Sect. 3.3.2 we optimise over the multiplier M appearing in the multilevel scheme.
The answer depends on a parameter that depends in a subtle way on the underlying
problem and the implementation. We conduct tests for a volatility model.
In Sect. 3.3.3 we numerically analyse the error and runtime of the multilevel
schemes for the volatility model introduced there.

3.3.1

Error Analysis of the Sampling Algorithm

As error criterion on the space of real probability distributions we use the $L^p$-Wasserstein metric. For two real distributions $\mu$ and $\nu$ we call a distribution $\pi$ on the Borel sets of $\mathbb R^2$ a coupling of $\mu$ and $\nu$ if the first, resp. second, marginal distribution of $\pi$ is $\mu$, resp. $\nu$. For $p \ge 1$ we denote by $W_p$ the Wasserstein metric on the space of real distributions defined by
$$W_p(\mu,\nu) = \inf\Bigl\{\Bigl(\int |x-y|^p\,\mathrm d\pi(x,y)\Bigr)^{1/p} : \pi \text{ is a coupling of } \mu \text{ and } \nu\Bigr\}.$$
For further details concerning the Wasserstein metric we refer the reader to [13, 29]. For real distributions, optimal couplings can be given in terms of quantiles, which leads to an alternative representation of the Wasserstein metric that is particularly suited for explicit computations. We denote by $F_\mu^{-1} : (0,1) \to \mathbb R$ the generalised right continuous inverse of the cdf $F_\mu$ of $\mu$, that is,
$$F_\mu^{-1}(u) = \inf\{t \in \mathbb R : F_\mu(t) \ge u\},$$
and we use the analogous notation with $\mu$ replaced by $\nu$. One has
$$W_p(\mu,\nu) = \Bigl(\int_0^1 \bigl|F_\mu^{-1}(u) - F_\nu^{-1}(u)\bigr|^p\,\mathrm du\Bigr)^{1/p}. \qquad (17)$$
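A minimal Monte Carlo estimate of (17) can be obtained by sorting two equally sized samples, so that the order statistics play the role of the quantile functions; the samplers and the sample size below are only illustrative and not part of the procedure in the text.

import numpy as np

def wasserstein_p(sample_mu, sample_nu, p=2.0, n=10**5, rng=None):
    """Monte Carlo estimate of W_p via the quantile representation (17):
    sorting two equal-size samples couples them through their order statistics."""
    rng = rng or np.random.default_rng()
    x = np.sort(sample_mu(n, rng))
    y = np.sort(sample_nu(n, rng))
    return np.mean(np.abs(x - y) ** p) ** (1.0 / p)

# toy check: equal-variance Gaussians, for which W_2 equals the distance of the means
d = wasserstein_p(lambda n, r: r.normal(0.0, 1.0, n),
                  lambda n, r: r.normal(0.5, 1.0, n))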

We do a numerical test for fixed $c_+ = c_- = H = 1$ and $\alpha \in \{1.2, 1.5, 1.8\}$. Our sampling algorithm makes use of the following parameters:
- $\delta$: window width used in approximation (10),
- $K$: $2K+1$ summands used in approximation (10),
- $x_{\min}$ and $x_{\max}$: the minimal and maximal point for which we precompute the distribution function, see Sect. 2.2,
- $N = 2^{d+1}-1$: the number of intervals used for the interpolation of the distribution function, see Sect. 2.2.

To assess the quality of an approximation we numerically compute the Wasserstein metric between the sampling scheme with the given parameters and the one with significantly refined parameters, namely $\delta/2$, $4K$, $x_{\min}-2$, $x_{\max}+2$ and $d+2$. The Wasserstein metric between the two sampling distributions is estimated by a Monte Carlo simulation of the Wasserstein distance (17) for the second moment



Table 1  Dependence of the $W_2$-Wasserstein metric on the choice of $d$, computed with double precision arithmetic

         alpha = 1.2                                alpha = 1.5                                alpha = 1.8
d = 11   8.1505e-14 +/- 5.2284e-14                  1.2348e-14 +/- 4.9172e-15                  3.7942e-15 +/- 1.5269e-15
d = 9    1.6056e-12 +/- 1.2793e-12                  4.1492e-14 +/- 2.1661e-14                  5.2974e-14 +/- 1.5046e-14
d = 7    4.7843e-12 +/- 2.9185e-13                  9.4183e-12 +/- 3.8266e-12                  1.0557e-11 +/- 4.0751e-13
d = 5    2.0309e-08 +/- 7.7718e-11                  2.2023e-08 +/- 9.2660e-11                  4.7766e-08 +/- 2.2151e-10

with $10^6$ iterations. Preliminary numerical tests showed that for the following parameters the approximate distribution function has an error of about machine precision for reals on the supporting points of the distribution function: $K = 400$, $\delta = 0.02$ and
$$-x_{\min} = x_{\max} = \begin{cases} 11, & \text{if } \alpha = 1.2,\\ 13, & \text{if } \alpha = 1.5,\\ 20, & \text{if } \alpha = 1.8.\end{cases}$$
Since these parameters only affect the precomputation we choose them as above and only vary $d$ (and $N$) in the following test, which is depicted in Table 1. There the term following $\pm$ is twice the estimated standard deviation. The results show that one achieves machine precision for about $d = 9$.

3.3.2 Optimising the Multiplier M

When generating a pair of levels $(X^k, X^{k+1})$ we need to carry out an expected number of $T/\varepsilon_{k+1} + T\,\nu(B(0,h_{k+1})^c)$ Euler steps for the simulation of $X^{k+1}$ and an expected number of $T/\varepsilon_k + T\,\nu(B(0,h_k)^c)$ Euler steps for the simulation of $X^k$. By assumption $\nu(B(0,h_{k+1})^c) = o(\varepsilon_k^{-1})$ is asymptotically negligible and hence it is natural to assign one simulation of $G(X^{k+1}) - G(X^k)$ the cost
$$T/\varepsilon_{k+1} + T/\varepsilon_k = (M+1)\,T/\varepsilon_k.$$


A corresponding minimisation of the parameter $M$ is carried out in [14, Sect. 4.1] for diffusions. The number of Euler steps is only an approximation to the real runtime of the algorithm. In general, the dominating cost is caused by computations of order $\varepsilon_k^{-1}$ and $\varepsilon_{k+1}^{-1}$, and we make the Ansatz that the computational cost of one simulation of $G(X^{k+1}) - G(X^k)$ is
$$C_k = (1 + o(1))\,c_{\mathrm{cost}}\,(M + \kappa)/\varepsilon_k, \qquad (18)$$
where $c_{\mathrm{cost}}$ and $\kappa$ are positive constants that do not depend on the choice of $M$. The case where one restricts attention to the number of Euler steps is the one with $\kappa = 1$.
We note that for the numerical schemes as in Theorem 2 one has, for $F$ as in the latter theorem, a central limit theorem of the form
$$\varepsilon^{-1}\bigl(\hat S_\varepsilon(G) - \mathbb E[G(X)]\bigr) \Rightarrow \mathcal N\bigl(0,\ \sigma_{\mathrm{err}}^2\,(1 - 1/M)^{-1}\bigr),$$
where $\sigma_{\mathrm{err}}$ does not depend on the choice of $M$. Taking $\delta := \delta(\varepsilon) := \varepsilon\,\sigma_{\mathrm{err}}/\sqrt{1 - 1/M}$ we get
$$\delta^{-1}\bigl(\hat S_\varepsilon(G) - \mathbb E[G(X)]\bigr) \Rightarrow \mathcal N(0, 1).$$
Assigning computational cost (18) to one joint simulation of $G(X^{k+1}) - G(X^k)$, we end up with a cost for one simulation of $\hat S_\varepsilon(F)$ of
$$(1 + o(1))\,c_{\mathrm{cost}}\,\sigma_{\mathrm{err}}^2\,\frac{(M-1)(M+\kappa)}{M(\log M)^2}\,\frac{(\log \delta^{-1})^2}{\delta^2}.$$
For $\kappa = 1$ this equals the result of [14] for diffusions.


To determine the optimal M we need to estimate the parameter . For > 0 and
M N\{1} we denote by R,M the expected time to jointly simulate two levels, one
with step size coarse = and the other one with fine = coarse /M, and to evaluate
G at the respective realisations. We note that for M1 , M2 {2, 3, . . . } our Ansatz
implies that
R,M1
M1 +
= (1 + o(1))
R,M2
M2 +
as 0. We estimate the runtimes R,M1 and R,M2 for two distinct M1 and M2
and for small > 0 and we conclude back on the parameter by using the latter
equation.
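A rough sketch of this estimation step: measure the two runtimes and solve $R_{\varepsilon,M_1}/R_{\varepsilon,M_2} \approx (M_1+\kappa)/(M_2+\kappa)$ for $\kappa$. The function run_pair below is a placeholder for one joint two-level simulation including the evaluation of $G$; in practice one would average over many repetitions before forming the ratio.

import time

def estimate_kappa(run_pair, eps, M1=2, M2=4):
    """Estimate kappa from the measured runtimes of two coupled-level simulations.
    run_pair(eps, M) is assumed to simulate one level pair with coarse step eps
    and fine step eps / M and to evaluate G on both realisations."""
    t0 = time.perf_counter(); run_pair(eps, M1); R1 = time.perf_counter() - t0
    t0 = time.perf_counter(); run_pair(eps, M2); R2 = time.perf_counter() - t0
    r = R1 / R2                      # approximately (M1 + kappa) / (M2 + kappa)
    return (M1 - r * M2) / (r - 1.0)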
We test our theoretical findings in a four-dimensional problem which formally does not satisfy some of the assumptions of our theorems. Still, we remark that the results are believed to hold in greater generality, and we chose a higher-dimensional example in order to decrease the relative computational cost of the direct simulation of Lévy increments. We let $(X_t)$ be a three-dimensional and $(\sigma_t)$ a one-dimensional process solving the SDE


$$\mathrm dX_t = \tfrac{1}{10}\,\Sigma(X_t)\,\sigma_t\,\mathrm dW_t + \tfrac{1}{10}\,\mathrm dt, \qquad \mathrm d\sigma_t = \tfrac{1}{10}\,\mathrm dt + \tfrac{1}{10}\,\mathrm dY_t, \qquad (19)$$

where $(\sigma_t)$ is conceived as a random volatility. As starting values we choose $X_0 = (0.8, 0.8, 0.8)$ and $\sigma_0 = 0.2$. Further, $(Y_t)$ is a Lévy process with Lévy triplet $(0, 0, \nu)$, $(W_t)$ an independent three-dimensional standard Brownian motion and
$$\Sigma\bigl((x_1, x_2, x_3)\bigr) = \begin{pmatrix} 4x_1 & 0.1x_1 & 0.1x_1 \\ 0.1x_2 & 3x_2 & 0.1x_2 \\ 0.1x_3 & 0.1x_3 & 2x_3 \end{pmatrix}, \qquad (x_1, x_2, x_3) \in \mathbb R^3.$$
We aim at computing the expectation $\mathbb E[G(X_T)]$ for $G(x) = \max(x_1 - 1, x_2 - 1, x_3 - 1, 0)$, $x \in \mathbb R^3$.
We estimate $\kappa$ in the case where $\nu$ is as in (15) with $H = 10$, $c_+ = c_- = 1$ and $\alpha = 1.2$. In the Fourier-based simulation of the increments we choose as parameters $x_{\max} = 11$, $d = 11$, $K = 400$, $\delta = 0.02$. In order to verify that the computational time used for the direct simulation of Lévy increments is indeed of minor relevance for our choices of the step size, we also estimate $\kappa$ in the classical setting where $(Y_t)$ is replaced by a Brownian motion.
For various choices of $\varepsilon$ and pairs of parameters $(M_1, M_2)$ we estimate $\kappa$ twice. The results are depicted in Table 2 for the genuine SDE and in Table 3 for the simplified diffusion model. One notices that $\kappa$ lies around 0.3.
Table 2  Estimates for $\kappa$ in the volatility model with adapted Euler scheme

1/eps    M1 = 2, M2 = 4        M1 = 2, M2 = 8
2^14     0.2941    0.3215      0.3094    0.3087
2^15     0.3102    0.3039      0.3238    0.3386
2^16     0.3401    0.3286      0.3153    0.3217
2^17     0.3030    0.3029      0.3002    0.3110
2^18     0.3049    0.3169      0.3187    0.3220
2^19     0.3053    0.3162      0.3169    0.3169

Table 3  Estimates for $\kappa$ in the simplified classical diffusion setting

1/eps    M1 = 2, M2 = 4        M1 = 2, M2 = 8
2^14     0.3574    0.3582      0.3581    0.3588
2^15     0.3590    0.3576      0.3595    0.3604
2^16     0.3478    0.3545      0.3591    0.3656
2^17     0.3568    0.3573      0.3481    0.3610
2^18     0.3573    0.3562      0.3581    0.3563
2^19     0.3594    0.3592      0.3599    0.3600

Fig. 1  Estimates for bias and variance for $\alpha = 1.2$

Fig. 2  Error versus runtime in the volatility model for $\alpha = 1.2$

Fig. 3  Error versus runtime in the volatility model for $\alpha = 1.5$

Fig. 4  Error versus runtime in the volatility model for $\alpha = 1.8$

In various other tests we noticed that $\kappa$ varies strongly with the implementation and the choice of the stochastic differential equation. In most tests we observed $\kappa$ to be between 0.2 and 0.6.

3.3.3 Numerical Tests of Error and Runtime

In this section we numerically test the error of our multilevel schemes in the volatility model (19). We adopt the same setting as described in the lines following (19). Further, we choose $M = 4$ and $M' = M^{1/3} + 3$ in the calibration of the scheme.


Using Monte Carlo we estimate $\mathbb E[G(X^k) - G(X^{k-1})]$ and $\operatorname{Var}[G(X^k) - G(X^{k-1})]$ for $k = 3, \dots, 7$. The results for $\alpha = 1.2$ are depicted in Fig. 1; they are based on 1000 samples. Using interpolation we estimate that $\mathbb E[G(X^k) - G(X^{k-1})]$ is of order $\varepsilon_k^{0.7812}$, and accordingly we use the rate 0.8 in the implementation of the algorithm. We depict a log-log plot of error versus runtime in Fig. 2. For comparison we also treated the cases $\alpha = 1.5$ and $\alpha = 1.8$ similarly. The corresponding plots of error versus runtime are depicted below, see Figs. 3 and 4.

References
1. Applebaum, D.: Lévy Processes and Stochastic Calculus. Cambridge Studies in Advanced Mathematics, vol. 116. Cambridge University Press, Cambridge (2009)
2. Asmussen, S., Rosiński, J.: Approximations of small jumps of Lévy processes with a view towards simulation. J. Appl. Probab. 38(2), 482-493 (2001)
3. Bally, V., Talay, D.: The law of the Euler scheme for stochastic differential equations. I. Convergence rate of the distribution function. Probab. Theory Relat. Fields 104(1), 43-60 (1996)
4. Becker, M.: Exact simulation of final, minimal and maximal values of Brownian motion and jump-diffusions with applications to option pricing. Comput. Manag. Sci. 7(1), 1-17 (2010)
5. Ben Alaya, M., Kebaier, A.: Central limit theorem for the multilevel Monte Carlo Euler method. Ann. Appl. Probab. 25(1), 211-234 (2015)
6. Bertoin, J.: Lévy Processes. Cambridge University Press, Cambridge (1996)
7. Bruti-Liberati, N., Nikitopoulos-Sklibosios, C., Platen, E.: First order strong approximations of jump diffusions. Monte Carlo Methods Appl. 12(3-4), 191-209 (2006)
8. Chen, Z.S., Feng, L.M., Lin, X.: Simulating Lévy processes from their characteristic functions and financial applications. ACM Trans. Model. Comput. Simul. 22(3), 14 (2012)
9. Dereich, S.: The coding complexity of diffusion processes under supremum norm distortion. Stoch. Process. Appl. 118(6), 917-937 (2008)
10. Dereich, S.: Multilevel Monte Carlo algorithms for Lévy-driven SDEs with Gaussian correction. Ann. Appl. Probab. 21(1), 283-311 (2011)
11. Dereich, S., Heidenreich, F.: A multilevel Monte Carlo algorithm for Lévy-driven stochastic differential equations. Stoch. Process. Appl. 121(7), 1565-1587 (2011)
12. Dereich, S., Li, S.: Multilevel Monte Carlo for Lévy-driven SDEs: central limit theorems for adaptive Euler schemes. Ann. Appl. Probab. 26(1), 136-185 (2016)
13. Dobrushin, R.L.: Prescribing a system of random variables by conditional distributions. Theory Probab. Appl. 15(3), 458-486 (1970)
14. Giles, M.B.: Multilevel Monte Carlo path simulation. Oper. Res. 56(3), 607-617 (2008)
15. Glasserman, P.: Monte Carlo Methods in Financial Engineering. Applications of Mathematics (New York). Stochastic Modelling and Applied Probability, vol. 53. Springer, New York (2004)
16. Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products. Academic Press, New York (1980)
17. Heinrich, S.: Multilevel Monte Carlo methods. Lect. Notes Comput. Sci. 2179, 58-67 (2001)
18. Jacod, J., Kurtz, T.G., Méléard, S., Protter, P.: The approximate Euler method for Lévy driven stochastic differential equations. Ann. Inst. H. Poincaré Probab. Statist. 41(3), 523-558 (2005). doi:10.1016/j.anihpb.2004.01.007
19. Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations. Applications of Mathematics (New York), vol. 23. Springer, Berlin (1992)
20. Kohatsu-Higa, A., Tankov, P.: Jump-adapted discretization schemes for Lévy-driven SDEs. Stoch. Process. Appl. 120(11), 2258-2285 (2010)
21. Li, S.: Multilevel Monte Carlo simulation for stochastic differential equations driven by Lévy processes. Ph.D. dissertation, Westfälische Wilhelms-Universität (2015)
22. Maghsoodi, Y.: Mean square efficient numerical solution of jump-diffusion stochastic differential equations. Sankhya Ser. A 58(1), 25-47 (1996)
23. Menn, C., Rachev, S.T.: Smoothly truncated stable distributions, GARCH-models, and option pricing. Math. Methods Oper. Res. 69(3), 411-438 (2009)
24. Mordecki, E., Szepessy, A., Tempone, R., Zouraris, G.E.: Adaptive weak approximation of diffusions with jumps. SIAM J. Numer. Anal. 46(4), 1732-1768 (2008)
25. Platen, E.: An approximation method for a class of Itô processes with jump component. Litovsk. Mat. Sb. 22(2), 124-136 (1982)
26. Quek, T., De La Roche, G., Güvenç, I., Kountouris, M.: Small Cell Networks: Deployment, PHY Techniques, and Resource Management. Cambridge University Press, Cambridge (2013)
27. Rubenthaler, S.: Numerical simulation of the solution of a stochastic differential equation driven by a Lévy process. Stoch. Process. Appl. 103(2), 311-349 (2003)
28. Sato, K.: Lévy Processes and Infinitely Divisible Distributions. Cambridge Studies in Advanced Mathematics, vol. 68. Cambridge University Press, Cambridge (1999)
29. Vasershtein, L.N.: Markov processes over denumerable products of spaces describing large systems of automata. Problemy Peredachi Informatsii 5(3), 64-72 (1969)

Construction of a Mean Square Error Adaptive Euler-Maruyama Method With Applications in Multilevel Monte Carlo

Håkon Hoel, Juho Häppölä and Raúl Tempone

Abstract  A formal mean square error expansion (MSE) is derived for Euler-Maruyama numerical solutions of stochastic differential equations (SDE). The error expansion is used to construct a pathwise, a posteriori, adaptive time-stepping Euler-Maruyama algorithm for numerical solutions of SDE, and the resulting algorithm is incorporated into a multilevel Monte Carlo (MLMC) algorithm for weak approximations of SDE. This gives an efficient MSE adaptive MLMC algorithm for handling a number of low-regularity approximation problems. In low-regularity numerical example problems, the developed adaptive MLMC algorithm is shown to outperform the uniform time-stepping MLMC algorithm by orders of magnitude, producing output whose error with high probability is bounded by TOL > 0 at the near-optimal MLMC cost rate $O(\mathrm{TOL}^{-2}\log(\mathrm{TOL})^4)$ that is achieved when the cost of sample generation is $O(1)$.

Keywords  Multilevel Monte Carlo, Stochastic differential equations, Euler-Maruyama method, Adaptive methods, A posteriori error estimation, Adjoints

1 Introduction
SDE models are frequently applied in mathematical finance [12, 28, 29], where an
observable may, for example, represent the payoff of an option. SDE are also used
to model the dynamics of multiscale physical, chemical or biochemical systems

H. Hoel
Department of Mathematics, University of Oslo, P.O. Box 1053, 0316 Blindern, Oslo, Norway
e-mail: haakonah@math.uio.no
H. Hoel, J. Häppölä, R. Tempone
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division,
King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
e-mail: juho.happola@kaust.edu.sa
R. Tempone
e-mail: raul.tempone@kaust.edu.sa


[11, 25, 30, 32], where, for instance, concentrations, temperature and energy may
be sought observables.
Given a filtered, complete probability space $(\Omega, \mathcal F, (\mathcal F_t)_{0\le t\le T}, P)$, we consider the Itô SDE
$$\mathrm dX_t = a(t, X_t)\,\mathrm dt + b(t, X_t)\,\mathrm dW_t, \quad t \in (0, T], \qquad X_0 = x_0, \qquad (1)$$
where $X : [0, T] \times \Omega \to \mathbb R^{d_1}$ is a stochastic process with randomness generated by a $d_2$-dimensional Wiener process, $W : [0, T] \times \Omega \to \mathbb R^{d_2}$, with independent components, $W = (W^{(1)}, W^{(2)}, \dots, W^{(d_2)})$, and $a : [0, T] \times \mathbb R^{d_1} \to \mathbb R^{d_1}$ and $b : [0, T] \times \mathbb R^{d_1} \to \mathbb R^{d_1\times d_2}$ are the drift and diffusion coefficients, respectively. The initial condition $x_0$ is a random variable on $(\Omega, P, \mathcal F)$ independent of $W$. The considered filtration $\mathcal F_t$ is generated from the history of the Wiener process $W$ up to time $t$ and the possible outcomes of the initial data $X_0$, and subsequently completed with all $P$-outer measure zero sets of the sample space $\Omega$. That is,
$$\mathcal F_t := \overline{\sigma(\{W_s\}_{0\le s\le t}) \vee \sigma(X_0)},$$
where the operation $\mathcal A \vee \mathcal B$ denotes the $\sigma$-algebra generated by the pair of $\sigma$-algebras $\mathcal A$ and $\mathcal B$, i.e., $\mathcal A \vee \mathcal B := \sigma(\mathcal A, \mathcal B)$, and $\overline{\mathcal A}$ denotes the $P$-outer measure null-set completion of $\mathcal A$,
$$\overline{\mathcal A} := \sigma\Bigl(\mathcal A \cup \Bigl\{A \subset \Omega \Bigm| \inf_{\hat A \in \mathcal A,\ A \subset \hat A} P(\hat A) = 0\Bigr\}\Bigr).$$

The contributions of this work are twofold. First, an a posteriori adaptive time-stepping algorithm for computing numerical realizations of SDE using the Euler-Maruyama method is developed. And second, for a given observable $g : \mathbb R^{d_1} \to \mathbb R$, we construct a mean square error (MSE) adaptive time-stepping multilevel Monte Carlo (MLMC) algorithm for approximating the expected value, $\mathrm E[g(X_T)]$, under the following constraint:
$$P\bigl(\bigl|\mathrm E[g(X_T)] - \mathcal A\bigr| \le \mathrm{TOL}\bigr) \ge 1 - \delta. \qquad (2)$$
Here, $\mathcal A$ denotes the algorithm's approximation of $\mathrm E[g(X_T)]$ (examples of which are given in Item (A.2) and Eq. (6)), and $\mathrm{TOL}$ and $\delta > 0$ are accuracy and confidence constraints, respectively.
The rest of this paper is organized as follows: First, in Sect. 1.1, we review Monte Carlo methods and their use with the Euler-Maruyama integrator. This is followed by a discussion of multilevel Monte Carlo methods and adaptivity for SDE. The theory, framework and numerical examples for the MSE adaptive algorithm are presented in Sect. 2. In Sect. 3, we develop the framework for the MSE adaptive MLMC algorithm and present implementational details in algorithms with pseudocode. In Sect. 4, we compare the performance of the MSE adaptive and uniform MLMC algorithms in a couple of numerical examples, one of which is a low-regularity SDE problem. Finally, we present brief conclusions followed by technical proofs and the extension of the main result to higher-dimensional problems in the appendices.

1.1 Monte Carlo Methods and the Euler-Maruyama Scheme

Monte Carlo (MC) methods provide a robust and typically non-intrusive way to compute weak approximations of SDE. The convergence rate of MC methods does not depend on the dimension of the problem; for that reason, MC is particularly effective on multi-dimensional problems. In its simplest form, an approximation by the MC method consists of the following two steps:
(A.1) Make $M$ independent and identically distributed numerical approximations, $\{\bar X_{m,T}\}_{m=1,2,\dots,M}$, of the numerical solution of the SDE (1).
(A.2) Approximate $\mathrm E[g(X_T)]$ by a realization of the sample average
$$\mathcal A := \sum_{m=1}^{M} \frac{g(\bar X_{m,T})}{M}. \qquad (3)$$
As for ordinary differential equations (ODE), the theory for numerical integrators of different orders for scalar SDE is vast. Provided sufficient regularity, higher order integrators generally yield higher convergence rates [22]. With MC methods it is straightforward to determine that the goal (2) is fulfilled at the computational cost $O(\mathrm{TOL}^{-2-1/\alpha})$, where $\alpha$ denotes the weak convergence rate of the numerical method, as defined in Eq. (5).
As a method of temporal discretization, the Euler-Maruyama scheme is given by
$$\bar X_{t_{n+1}} = \bar X_{t_n} + a(t_n, \bar X_{t_n})\,\Delta t_n + b(t_n, \bar X_{t_n})\,\Delta W_n, \qquad \bar X_0 = x_0, \qquad (4)$$
using time steps $\Delta t_n = t_{n+1} - t_n$ and Wiener increments $\Delta W_n = W_{t_{n+1}} - W_{t_n} \sim N(0, \Delta t_n I_{d_2})$, where $I_{d_2}$ denotes the $d_2 \times d_2$ identity matrix. In this work, we will focus exclusively on Euler-Maruyama time-stepping. The Euler-Maruyama scheme, which may be considered the SDE-equivalent of the forward-Euler method for ODE, has, under sufficient regularity, first-order weak convergence rate
$$\bigl|\mathrm E[g(X_T) - g(\bar X_T)]\bigr| = O\bigl(\max_n \Delta t_n\bigr), \qquad (5)$$
and also first-order MSE convergence rate
$$\mathrm E\bigl[(g(X_T) - g(\bar X_T))^2\bigr] = O\bigl(\max_n \Delta t_n\bigr), \qquad (6)$$
cf. [22]. For multi-dimensional SDE problems, higher order schemes are generally less applicable, as either the diffusion coefficient matrix has to fulfill a rigid commutativity condition, or Lévy areas, required in higher order numerical schemes, have to be accurately approximated to achieve better convergence rates than those obtained with the Euler-Maruyama method [22].
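For concreteness, here is a minimal Python sketch of one Euler-Maruyama path (4) on a uniform mesh. The coefficient functions and the geometric Brownian motion usage lines are illustrative placeholders, not tied to any particular problem in this paper.

import numpy as np

def euler_maruyama(a, b, x0, T, N, d2, rng):
    """One Euler-Maruyama path (4) on a uniform mesh with N steps.
    a(t, x) returns a vector in R^{d1}, b(t, x) a d1 x d2 matrix."""
    dt = T / N
    x = np.array(x0, dtype=float)
    t = 0.0
    for _ in range(N):
        dW = rng.normal(0.0, np.sqrt(dt), size=d2)
        x = x + a(t, x) * dt + b(t, x) @ dW
        t += dt
    return x

# illustration: geometric Brownian motion dX = X dt + X dW with X_0 = 1
rng = np.random.default_rng(1)
xT = euler_maruyama(lambda t, x: x, lambda t, x: x.reshape(1, 1),
                    x0=[1.0], T=1.0, N=2**8, d2=1, rng=rng)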

1.2 Uniform and Adaptive Time-Stepping MLMC


MLMC is a class of MC methods that uses a hierarchy of subtly correlated and
increasingly refined realization ensembles to reduce the variance of the sample estimator. In comparison with single-level MC, MLMC may yield orders of magnitude
reductions in the computational cost of moment approximations. MLMC was first
introduced by Heinrich [14, 15] for approximating integrals that depend on random
parameters. For applications in SDE problems, Kebaier [21] introduced a two-level
MC method and demonstrated its potential efficiency gains over single-level MC.
Giles [8] thereafter developed an MLMC algorithm for SDE, exhibiting even higher
potential efficiency gains. Presently, MLMC is a vibrant and growing research topic,
(cf. [3, 4, 9, 10, 13, 26, 34], and references therein).

1.2.1 MLMC Notation

We define the multilevel estimator by
$$\mathcal A_{\mathrm{ML}} := \sum_{\ell=0}^{L}\sum_{m=1}^{M_\ell}\frac{\Delta_\ell g_m}{M_\ell}, \qquad (7)$$
where
$$\Delta_\ell g_m := \begin{cases} g\bigl(\bar X^{\{0\}}_{m,T}\bigr), & \text{if } \ell = 0,\\ g\bigl(\bar X^{\{\ell\}}_{m,T}\bigr) - g\bigl(\bar X^{\{\ell-1\}}_{m,T}\bigr), & \text{otherwise.}\end{cases}$$
Here, the positive integer $L$ denotes the final level of the estimator, $M_\ell$ is the number of sample realizations on the $\ell$th level, and the realization pair, $\bar X^{\{\ell\}}_{m,T}$ and $\bar X^{\{\ell-1\}}_{m,T}$, are copies of the Euler-Maruyama (4) approximations of the SDE using the same Wiener path, $W_m$, sampled on the respective meshes, $\Delta t^{\{\ell\}}$ and $\Delta t^{\{\ell-1\}}$ (cf. Fig. 1).


Fig. 1  (Left) A sample Wiener path, $W$, generated on the coarse mesh, $\Delta t^{\{0\}}$, with uniform step size 1/10 (blue line). The path is thereafter Brownian bridge interpolated onto a finer mesh, $\Delta t^{\{1\}}$, which has uniform step size 1/20 (green line). (Right) Euler-Maruyama numerical solutions of the Ornstein-Uhlenbeck SDE problem, $\mathrm dX_t = 2(1 - X_t)\,\mathrm dt + 0.2\,\mathrm dW_t$, with initial condition $X_0 = 3/2$, computed on the meshes $\Delta t^{\{0\}}$ (blue line) and $\Delta t^{\{1\}}$ (green line) using Wiener increments from the respective path resolutions

1.2.2 Uniform Time-Stepping MLMC

In the uniform time-stepping MLMC introduced in [8], the respective SDE realizations $\{\bar X^{\{\ell\}}_T\}$ are constructed on a hierarchy of uniform meshes with geometrically decaying step size, $\min \Delta t^{\{\ell\}} = \max \Delta t^{\{\ell\}} = T/N_\ell$, and $N_\ell = c^\ell N_0$ with $c \in \mathbb N\setminus\{1\}$ and $N_0$ an integer. For simplicity, we consider the uniform time-stepping MLMC method with $c = 2$.

1.2.3 Uniform Time-Stepping MLMC Error and Computational Complexity

By construction, the multilevel estimator is telescoping in expectation, i.e., $\mathrm E[\mathcal A_{\mathrm{ML}}] = \mathrm E[g(\bar X^{\{L\}}_T)]$. Using this property, we may conveniently bound the multilevel approximation error:
$$\bigl|\mathrm E[g(X_T)] - \mathcal A_{\mathrm{ML}}\bigr| \le \underbrace{\bigl|\mathrm E[g(X_T) - g(\bar X^{\{L\}}_T)]\bigr|}_{=:\mathcal E_T} + \underbrace{\bigl|\mathrm E[g(\bar X^{\{L\}}_T)] - \mathcal A_{\mathrm{ML}}\bigr|}_{=:\mathcal E_S}.$$
The approximation goal (2) is then reached by ensuring that the sum of the bias, $\mathcal E_T$, and the statistical error, $\mathcal E_S$, is bounded from above by TOL, e.g., by the constraints $\mathcal E_T \le \mathrm{TOL}/2$ and $\mathcal E_S \le \mathrm{TOL}/2$ (see Sect. 3.2 for more details on the MLMC error control). For the MSE error goal,
$$\mathrm E\bigl[(\mathcal A_{\mathrm{ML}} - \mathrm E[g(X_T)])^2\bigr] \le \mathrm{TOL}^2,$$
the following theorem states the optimal computational cost for MLMC:


Theorem 1 (Computational cost of deterministic MLMC; Cliffe et al. [4]) Suppose there are constants $\alpha, \beta, \gamma > 0$ such that $\alpha \ge \min(\beta,\gamma)/2$ and
(i) $\bigl|\mathrm E[g(\bar X^{\{\ell\}}_T) - g(X_T)]\bigr| = O(N_\ell^{-\alpha})$,
(ii) $\operatorname{Var}(\Delta_\ell g) = O(N_\ell^{-\beta})$,
(iii) $\operatorname{Cost}(\Delta_\ell g) = O(N_\ell^{\gamma})$.
Then, for any $\mathrm{TOL} < e^{-1}$, there exists an $L$ and a sequence $\{M_\ell\}_{\ell=0}^{L}$ such that
$$\mathrm E\bigl[(\mathcal A_{\mathrm{ML}} - \mathrm E[g(X_T)])^2\bigr] \le \mathrm{TOL}^2, \qquad (8)$$
and
$$\operatorname{Cost}(\mathcal A_{\mathrm{ML}}) = \begin{cases} O\bigl(\mathrm{TOL}^{-2}\bigr), & \text{if } \beta > \gamma,\\ O\bigl(\mathrm{TOL}^{-2}\log(\mathrm{TOL})^2\bigr), & \text{if } \beta = \gamma,\\ O\bigl(\mathrm{TOL}^{-(2+(\gamma-\beta)/\alpha)}\bigr), & \text{if } \beta < \gamma. \end{cases} \qquad (9)$$

In comparison, the computational cost of achieving the goal (8) with single-level MC is $O(\mathrm{TOL}^{-2-\gamma/\alpha})$. Theorem 1 thus shows that for any problem with $\beta > 0$, MLMC will asymptotically be more efficient than single-level MC. Furthermore, the performance gain of MLMC over MC is particularly apparent in settings where $\beta \ge \gamma$. The latter property is linked to the contributions of this work. In low-regularity SDE problems, e.g., Example 6 below and [1, 35], uniform time-stepping Euler-Maruyama results in convergence rates for which $\beta < \gamma$. More sophisticated integrators can preserve rates such that $\beta \ge \gamma$.

Remark 1 Similar accuracy versus complexity results to Theorem 1, requiring slightly stronger moment bounds, have also been derived for the approximation goal (2) in the asymptotic setting when $\mathrm{TOL} \downarrow 0$, cf. [5, 16].

1.2.4 MSE A Posteriori Adaptive Time-Stepping

In general, adaptive time-stepping algorithms seek to fulfill one of two equivalent goals [2]:
(B.1) Provided a computational budget $N$ and a norm $\|\cdot\|$, determine the possibly non-uniform mesh which minimizes the error $\|g(X_T) - g(\bar X_T)\|$.
(B.2) Provided an error constraint $\|g(X_T) - g(\bar X_T)\| \le \mathrm{TOL}$, determine the possibly non-uniform mesh which achieves the constraint at the minimum computational cost.
Evidently, the refinement criterion of an adaptive algorithm depends on the error one seeks to minimize. In this work, we consider adaptivity goal (B.1) with the error measured in terms of the MSE. This error measure is suitable for MLMC algorithms



as it often will lead to improved convergence rates, (since Var( g) E  g2 ),
which by Theorem 1 may reduce the computational cost of MLMC. In Theorem 2,
we derive the following error expansion for the MSE of EulerMaruyama numerical
solutions of the SDE (1):
N1



 2
 2
2
=E
n tn + o tn ,
E g(XT ) g X T

(10)

n=0

where the error density, n , is a function of the local error and sensitivities from the
dual solution of the SDE problem, as defined in (24). The error expansion (10) is an
a posteriori error estimate for the MSE, and in our adaptive algorithm, the mesh is
refined by equilibration of the expansions error indicators
r n := n tn2 , for n = 0, 1, . . . , N 1.
1.2.5

(11)

An MSE Adaptive MLMC Algorithm

Using the described MSE adaptive algorithm, we construct an MSE adaptive MLMC algorithm in Sect. 3. The MLMC algorithm generates SDE realizations, $\{\bar X^{\{\ell\}}_T\}_\ell$, on a hierarchy of pathwise adaptively refined meshes, $\{\Delta t^{\{\ell\}}\}_\ell$. The meshes are nested, i.e., for all realizations $\omega$,
$$\Delta t^{\{0\}}(\omega) \subset \Delta t^{\{1\}}(\omega) \subset \dots \subset \Delta t^{\{\ell\}}(\omega) \subset \dots,$$
with the constraint that the number of time steps in $\Delta t^{\{\ell\}}$, $|\Delta t^{\{\ell\}}|$, is bounded by $2N_\ell$:
$$\bigl|\Delta t^{\{\ell\}}\bigr| < 2N_\ell = 2^{\ell+2} N_{-1}.$$
Here, $N_{-1}$ denotes the pre-initial number of time steps; it is an integer set in advance of the computations. This corresponds to the hierarchy setup for the uniform time-stepping MLMC algorithm in Sect. 1.2.2.
The potential efficiency gain of adaptive MLMC is experimentally illustrated in this work using the drift blow-up problem
$$\mathrm dX_t = \frac{r X_t}{|t - \xi|^p}\,\mathrm dt + X_t\,\mathrm dW_t, \qquad X_0 = 1.$$
This problem is addressed in Example 6 for the three different singularity exponents $p = 1/2$, $2/3$ and $3/4$, with a pathwise, random singularity point $\xi \sim U(1/4, 3/4)$, an observable $g(x) = x$, and a final time $T = 1$. For the given singularity exponents, we observe experimentally deteriorating convergence rates, $\alpha = (1-p)$ and $\beta = 2(1-p)$, for the uniform time-stepping Euler-Maruyama integrator, while for


the adaptive time-step Euler-Maruyama we observe $\alpha \approx 1$ and $\beta \approx 1$. Then, as predicted by Theorem 1, we also observe an order of magnitude difference in computational cost between the two algorithms (cf. Table 1).

Table 1  Observed computational cost (disregarding log(TOL) multiplicative factors of finite order) for the drift blow-up study in Example 6

Singularity exponent p    Adaptive MLMC    Uniform MLMC
1/2                       TOL^-2           TOL^-2
2/3                       TOL^-2           TOL^-3
3/4                       TOL^-2           TOL^-4

1.2.6 Earlier Works on Adaptivity for SDE

Gaines and Lyons' work [7] is one of the seminal contributions on adaptive algorithms for SDE. They present an algorithm that seeks to minimize the pathwise error of the mean and variation of the local error, conditioned on the $\sigma$-algebra generated by $\{W_{t_n}\}_{n=1}^{N}$ (i.e., the values at which the Wiener path has been evaluated in order to numerically integrate the SDE realization). The method may be used in combination with different numerical integration methods, and an approach to approximations of potentially needed Lévy areas is proposed, facilitated by a binary tree representation of the Wiener path realization at its evaluation points. As in a posteriori adaptive algorithms, the error indicators in Gaines and Lyons' algorithm are given by products of local errors and weight terms, but, unlike in a posteriori methods, the weight terms are computed from a priori estimates, making their approach a hybrid one.
Szepessy et al. [31] introduced a posteriori weak error based adaptivity for the Euler-Maruyama algorithm with numerically computable error indicator terms. Their development of weak error adaptivity took inspiration from Talay and Tubaro's seminal work [33], where an error expansion for the weak error was derived for the Euler-Maruyama algorithm with uniform time steps. In [16], Szepessy et al.'s weak error adaptive algorithm was used in the construction of a weak error adaptive MLMC algorithm. To the best of our knowledge, the present work is the first on MSE a posteriori adaptive algorithms for SDE, both in the MC and the MLMC setting.
Among other adaptive algorithms for SDE, many have refinement criteria based only or primarily on estimates of the local error. For example, in [17] the step size depends on the size of the diffusion coefficient in an MSE Euler-Maruyama adaptive algorithm; in [23], the step size is controlled by the variation in the size of the drift coefficient in the constructed Euler-Maruyama adaptive algorithm, which preserves the long-term ergodic behavior of the true solution for many SDE problems; and in [19], a local error based adaptive Milstein algorithm is developed for solving multi-dimensional chemical Langevin equations.


2 Derivation of the MSE A Posteriori Adaptive Algorithm


In this section, we construct an MSE a posteriori adaptive algorithm for SDE whose realizations are numerically integrated by the Euler-Maruyama algorithm (4). Our goal is, in rough terms, to obtain an algorithm for solving the SDE problem (1) that, for a fixed number of intervals $N$, determines the time-stepping, $\Delta t_0, \Delta t_1, \dots, \Delta t_{N-1}$, such that the MSE, $\mathrm E[(g(\bar X_T) - g(X_T))^2]$, is minimized. That is,
$$\mathrm E\bigl[(g(\bar X_T) - g(X_T))^2\bigr] = \min!, \qquad N \text{ given}. \qquad (12)$$
The derivation of our adaptive algorithm consists of two steps. First, an error expansion for the MSE is presented in Theorem 2. Based on the error expansion, we thereafter construct a mesh refinement algorithm. At the end of the section, we apply the adaptive algorithm to a few example problems.

2.1 The Error Expansion


Let us now present a leading-order error expansion for the MSE (12) of the SDE problem (1) in the one-dimensional (1D) setting, i.e., when $X_t$ attains values in $\mathbb R$ and the drift and diffusion coefficients are respectively of the form $a : [0,T]\times\mathbb R \to \mathbb R$ and $b : [0,T]\times\mathbb R \to \mathbb R$. An extension of the MSE error expansion to multiple dimensions is given in Appendix "Error Expansion for the MSE in Multiple Dimensions". To state the error expansion theorem, some notation is needed. Let $X_s^{x,t}$ denote the solution of the SDE (1) at time $s \ge t$ when the initial condition is $X_t = x$ at time $t$, i.e.,
$$X_s^{x,t} := x + \int_t^s a(u, X_u)\,\mathrm du + \int_t^s b(u, X_u)\,\mathrm dW_u, \qquad s \in [t, T], \qquad (13)$$
and in light of this notation, $X_t$ is shorthand for $X_t^{x_0,0}$. For a given observable $g$, the payoff-of-flow map function is defined by $\varphi(t, x) = g(X_T^{x,t})$. We also make use of the following function space notation:
$$C(U) := \{f : U \to \mathbb R \mid f \text{ is continuous}\},$$
$$C_b(U) := \{f : U \to \mathbb R \mid f \text{ is continuous and bounded}\},$$
$$C_b^k(\mathbb R) := \bigl\{f : \mathbb R \to \mathbb R \bigm| f \in C(\mathbb R) \text{ and } \tfrac{\mathrm d^j}{\mathrm dx^j}f \in C_b(\mathbb R) \text{ for all integers } 1 \le j \le k\bigr\},$$
$$C_b^{k_1,k_2}([0,T]\times\mathbb R) := \bigl\{f : [0,T]\times\mathbb R \to \mathbb R \bigm| f \in C([0,T]\times\mathbb R) \text{ and } \partial_t^{j_1}\partial_x^{j_2} f \in C_b([0,T]\times\mathbb R) \text{ for all integers } j_1 \le k_1 \text{ and } 1 \le j_1 + j_2 \le k_2\bigr\}.$$


We are now ready to present our mean square expansion result, namely,

Theorem 2 (1D MSE leading-order error expansion) Assume that the drift and diffusion coefficients and the input data of the SDE (1) fulfill
(R.1) $a, b \in C_b^{2,4}([0,T]\times\mathbb R)$,
(R.2) there exists a constant $C > 0$ such that
$$|a(t,x)|^2 + |b(t,x)|^2 \le C(1 + |x|^2), \qquad \forall x \in \mathbb R \text{ and } t \in [0,T],$$
(R.3) the gradient of $g$, $g' : \mathbb R \to \mathbb R$, satisfies $g' \in C_b^3(\mathbb R)$,
(R.4) for the initial data, $X_0$ is $\mathcal F_0$-measurable and $\mathrm E[|X_0|^p] < \infty$ for all $p \ge 1$.
Assume further that the mesh points $0 = t_0 < t_1 < \dots < t_N = T$
(M.1) are stopping times for which $t_n$ is $\mathcal F_{t_{n-1}}$-measurable for $n = 1, 2, \dots, N$,
(M.2) there exist $\check N \in \mathbb N$ and $c_1 > 0$ such that $c_1\check N \le \inf_\omega N(\omega)$ and $\sup_\omega N(\omega) \le \check N$ hold; furthermore, there exists $c_2 > 0$ such that $\sup_\omega \max_{n\in\{0,\dots,N-1\}}\Delta t_n(\omega) < c_2\check N^{-1}$,
(M.3) and there exists $c_3 > 0$ such that for all $p \in [1,8]$ and $n \in \{0, 1, \dots, \check N - 1\}$,
$$\mathrm E\bigl[\Delta t_n^{2p}\bigr] \le c_3\,\mathrm E\bigl[\Delta t_n^2\bigr]^p.$$
Then, as $\check N$ increases,
$$\mathrm E\bigl[(g(X_T) - g(\bar X_T))^2\bigr] = \sum_{n=0}^{\check N-1}\mathrm E\Bigl[\bigl(\varphi_x(t_n, \bar X_{t_n})\bigr)^2\,\frac{(b_x b)^2}{2}(t_n, \bar X_{t_n})\,\Delta t_n^2\Bigr] + o\bigl(\Delta t_n^2\bigr), \qquad (14)$$
where we have defined $t_n = T$ and $\Delta t_n = 0$ for all $n \in \{N, N+1, \dots, \check N\}$. Replacing the first variation, $\varphi_x(t_n, \bar X_{t_n})$, by the numerical approximation, $\bar\varphi_{x,n}$, as defined in (23), yields the following, to leading order all-terms-computable, error expansion:
$$\mathrm E\bigl[(g(X_T) - g(\bar X_T))^2\bigr] = \sum_{n=0}^{\check N-1}\mathrm E\Bigl[\bar\varphi_{x,n}^2\,\frac{(b_x b)^2}{2}(t_n, \bar X_{t_n})\,\Delta t_n^2\Bigr] + o\bigl(\Delta t_n^2\bigr). \qquad (15)$$

We present the proof of the theorem in Appendix "Error Expansion for the MSE in 1D".

Remark 2 In condition (M.2) of the above theorem we have introduced $\check N$ to denote the deterministic upper bound for the number of time steps in all mesh realizations. Moreover, from this point on the mesh points $\{t_n\}_n$ and time steps $\{\Delta t_n\}_n$ are defined for all indices $n \in \{0, 1, \dots, \check N\}$, with the natural extension $t_n = T$ and $\Delta t_n = 0$ for all $n \in \{N+1, \dots, \check N\}$. In addition to ensuring an upper bound on the complexity of a numerical realization and that $\max_n \Delta t_n \to 0$ as $\check N \to \infty$, replacing the random $N$ (the smallest integer value for which $t_N = T$ in a given mesh) with the deterministic $\check N$ in the MSE error expansion (15) simplifies our proof of Theorem 2.

Remark 3 For most SDE problems on which it is relevant to apply a posteriori adaptive integrators, at least one of the regularity conditions (R.1), (R.2), and (R.3) and the mesh adaptedness assumption (M.1) in Theorem 2 will not be fulfilled. In our adaptive algorithm, the error expansion (15) is interpreted in a formal sense and only used to facilitate the systematic construction of a mesh refinement criterion. When applied to low-regularity SDE problems where some of the conditions (R.1), (R.2), or (R.3) do not hold, the actual leading-order term of the error expansion (15) may contain other or additional terms besides $\bar\varphi_{x,n}^2\frac{(b_x b)^2}{2}(t_n, \bar X_{t_n})$ in the error density. Example 6 presents a problem where ad hoc additional terms are added to the error density.

2.1.1 Numerical Approximation of the First Variation

The first variation of the flow map, $\varphi(t, x)$, is defined by
$$\varphi_x(t, x) = \partial_x g(X_T^{x,t}) = g'(X_T^{x,t})\,\partial_x X_T^{x,t},$$
and the first variation of the path itself, $\partial_x X_s^{x,t}$, is the solution of the linear SDE
$$\mathrm d(\partial_x X_s^{x,t}) = a_x(s, X_s^{x,t})\,\partial_x X_s^{x,t}\,\mathrm ds + b_x(s, X_s^{x,t})\,\partial_x X_s^{x,t}\,\mathrm dW_s, \quad s \in (t, T], \qquad \partial_x X_t^{x,t} = 1, \qquad (16)$$
where $a_x$ denotes the partial derivative of $a$ with respect to its spatial argument. To describe conditions under which the terms $g'(X_s^{x,t})$ and $\partial_x X_s^{x,t}$ are well defined, let us first recall that if $X_s^{x,t}$ solves the SDE (13) and
$$\mathrm E\Bigl[\int_t^T |X_s^{x,t}|^2\,\mathrm ds\Bigr] < \infty,$$
then we say that there exists a solution to the SDE. If a solution $X_s^{x,t}$ exists and all solutions $\tilde X_s^{x,t}$ satisfy
$$P\Bigl(\sup_{s\in[t,T]}\bigl|X_s^{x,t} - \tilde X_s^{x,t}\bigr| > 0\Bigr) = 0,$$
we say the solution $X_s^{x,t}$ is pathwise unique.


Lemma 1 Assume the regularity assumptions (R.1), (R.2), (R.3), and (R.4) in Theorem 2 hold, and that for any fixed $t \in [0, T]$, $x$ is $\mathcal F_t$-measurable and $\mathrm E[|x|^{2p}] < \infty$ for all $p \in \mathbb N$. Then there exist pathwise unique solutions $X_s^{x,t}$ and $\partial_x X_s^{x,t}$ to the respective SDE (13) and (16), for which
$$\max\Bigl\{\mathrm E\Bigl[\sup_{s\in[t,T]}\bigl|X_s^{x,t}\bigr|^{2p}\Bigr],\ \mathrm E\Bigl[\sup_{s\in[t,T]}\bigl|\partial_x X_s^{x,t}\bigr|^{2p}\Bigr]\Bigr\} < \infty, \qquad p \in \mathbb N.$$
Furthermore, $\varphi_x(t, x)$ is $\mathcal F_T$-measurable and
$$\mathrm E\bigl[|\varphi_x(t, x)|^{2p}\bigr] < \infty, \qquad p \in \mathbb N.$$
We leave the proof of the lemma to Appendix "Variations of the flow map".
To obtain an all-terms-computable error expansion in Theorem 2, which will be needed to construct an a posteriori adaptive algorithm, the first variation of the flow map, $\varphi_x$, is approximated by the first variation of the Euler-Maruyama numerical solution,
$$\bar\varphi_{x,n} := g'(\bar X_T)\,\partial_{\bar X_{t_n}}\bar X_T.$$
Here, for $k > n$, $(\partial_x\bar X^{\bar X_{t_n},t_n})_{t_k}$ is the solution of the Euler-Maruyama scheme
$$(\partial_x\bar X^{\bar X_{t_n},t_n})_{t_{j+1}} = (\partial_x\bar X^{\bar X_{t_n},t_n})_{t_j} + a_x(t_j, \bar X_{t_j})\,(\partial_x\bar X^{\bar X_{t_n},t_n})_{t_j}\,\Delta t_j + b_x(t_j, \bar X_{t_j})\,(\partial_x\bar X^{\bar X_{t_n},t_n})_{t_j}\,\Delta W_j, \qquad (17)$$
for $j = n, n+1, \dots, k-1$, with the initial condition $(\partial_x\bar X^{\bar X_{t_n},t_n})_{t_n} = 1$, which is coupled to the numerical solution of the SDE, $\bar X_{t_j}$.

Lemma 2 If the assumptions (R.1), (R.2), (R.3), (R.4), (M.1) and (M.2) in Theorem 2 hold, then the numerical solution $\bar X$ of (4) converges in the mean square sense to the solution of the SDE (1),
$$\max_{1\le n\le N}\mathrm E\bigl[|\bar X_{t_n} - X_{t_n}|^{2p}\bigr]^{1/2p} \le C\,\check N^{-1/2}, \qquad (18)$$
and
$$\max_{1\le n\le N}\mathrm E\bigl[|\bar X_{t_n}|^{2p}\bigr] < \infty, \qquad p \in \mathbb N. \qquad (19)$$
For any fixed instant of time $t_n$ in the mesh, $1 \le n \le N$, the numerical solution $\partial_x\bar X^{\bar X_{t_n},t_n}$ of (17) converges in the mean square sense to $\partial_x X^{X_{t_n},t_n}$,
$$\max_{n\le k\le N}\mathrm E\Bigl[\bigl|(\partial_x\bar X^{\bar X_{t_n},t_n})_{t_k} - \partial_x X_{t_k}^{X_{t_n},t_n}\bigr|^{2p}\Bigr]^{1/2p} \le C\,\check N^{-1/2}, \qquad (20)$$
and
$$\max_{n\le k\le N}\mathrm E\Bigl[\bigl|(\partial_x\bar X^{\bar X_{t_n},t_n})_{t_k}\bigr|^{2p}\Bigr] < \infty, \qquad p \in \mathbb N. \qquad (21)$$
Furthermore, $\bar\varphi_{x,n}$ is $\mathcal F_T$-measurable and
$$\mathrm E\bigl[|\bar\varphi_{x,n}|^{2p}\bigr] < \infty, \qquad p \in \mathbb N. \qquad (22)$$
(22)

From the SDE (16), it is clear that
$$\partial_x\bar X_T^{\bar X_{t_n},t_n} = \prod_{k=n}^{N-1}\bigl(1 + a_x(t_k, \bar X_{t_k})\,\Delta t_k + b_x(t_k, \bar X_{t_k})\,\Delta W_k\bigr),$$
and this implies that $\bar\varphi_{x,n}$ solves the backward scheme
$$\bar\varphi_{x,n} = c_x(t_n, \bar X_{t_n})\,\bar\varphi_{x,n+1}, \qquad n = N-1, N-2, \dots, 0, \qquad (23)$$
with the initial condition $\bar\varphi_{x,N} = g'(\bar X_T)$ and the shorthand notation
$$c(t_n, \bar X_{t_n}) := \bar X_{t_n} + a(t_n, \bar X_{t_n})\,\Delta t_n + b(t_n, \bar X_{t_n})\,\Delta W_n.$$
The backward scheme (23) is convenient from a computational perspective since it implies that the set of points, $\{\bar\varphi_{x,n}\}_{n=0}^{N}$, can be computed at the same cost as that of one path realization, $\{\bar X_{t_n}\}_{n=0}^{N}$, which can be verified as follows:
$$\bar\varphi_{x,n} = g'(\bar X_T)\prod_{k=n}^{N-1}c_x(t_k, \bar X_{t_k}) = c_x(t_n, \bar X_{t_n})\,g'(\bar X_T)\prod_{k=n+1}^{N-1}c_x(t_k, \bar X_{t_k}) = c_x(t_n, \bar X_{t_n})\,g'(\bar X_T)\,\partial_{\bar X_{t_{n+1}}}\bar X_T = c_x(t_n, \bar X_{t_n})\,\bar\varphi_{x,n+1}.$$
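A small sketch of the backward recursion (23) for a scalar SDE, assuming the Euler path, the Wiener increments and the coefficient derivatives a_x and b_x are supplied by the user; this mirrors the observation above that $\{\bar\varphi_{x,n}\}_n$ costs no more than one forward path.

import numpy as np

def first_variation_backward(dg_xT, t, x, dW, ax, bx):
    """Backward recursion (23): phi_N = g'(X_T) and
    phi_n = (1 + a_x dt_n + b_x dW_n) * phi_{n+1}."""
    N = len(dW)
    phi = np.empty(N + 1)
    phi[N] = dg_xT
    for n in range(N - 1, -1, -1):
        dt = t[n + 1] - t[n]
        cx = 1.0 + ax(t[n], x[n]) * dt + bx(t[n], x[n]) * dW[n]
        phi[n] = cx * phi[n + 1]
    return phi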

2.2 The Adaptive Algorithm


Having derived computable expressions for all terms in the error expansion, we next introduce the error density, using a heuristic leading-order expansion,
$$\rho_n := \bar\varphi_{x,n}^2\,\frac{(b_x b)^2}{2}(t_n, \bar X_{t_n}), \qquad n = 0, 1, \dots, N-1, \qquad (24)$$
and, for representing the numerical solution's error contribution from the time interval $(t_n, t_{n+1})$, the error indicators
$$r_n := \rho_n\,\Delta t_n^2, \qquad n = 0, 1, \dots, N-1. \qquad (25)$$

The error expansion (15) may then be written as
$$\mathrm E\bigl[(g(X_T) - g(\bar X_T))^2\bigr] = \sum_{n=0}^{N-1}\mathrm E[r_n] + o\bigl(\Delta t_n^2\bigr). \qquad (26)$$

The final goal of the adaptive algorithm is minimization of the leading order of the MSE in (26), namely, $\mathrm E[\sum_{n=0}^{N-1}r_n]$, which (for each realization) is approached by minimization of the error expansion realization $\sum_{n=0}^{N-1}r_n$. An approximately optimal choice for the refinement procedure can be derived by introducing the Lagrangian
$$\mathcal L(\Delta t, \lambda) = \int_0^T\rho(s)\,\Delta t(s)\,\mathrm ds + \lambda\Bigl(\int_0^T\frac{\mathrm ds}{\Delta t(s)} - N\Bigr), \qquad (27)$$
for which we seek to minimize the pathwise squared error
$$(g(X_T) - g(\bar X_T))^2 = \int_0^T\rho(s)\,\Delta t(s)\,\mathrm ds$$
under the constraint that
$$\int_0^T\frac{\mathrm ds}{\Delta t(s)} = N,$$
for a fixed number of time steps, $N$, and the implicit constraint that the error indicators are equilibrated,
$$r_n = \rho_n\,\Delta t_n^2 = \frac{(g(X_T) - g(\bar X_T))^2}{N}, \qquad n = 0, 1, \dots, N-1. \qquad (28)$$

Minimizing (27) yields
$$\Delta t_n = \frac{\int_0^T\sqrt{\rho(s)}\,\mathrm ds}{N\,\sqrt{\rho(t_n)}} \qquad\text{and}\qquad \mathrm{MSE}_{\mathrm{adaptive}} \lessapprox \frac{1}{N}\,\mathrm E\Bigl[\Bigl(\int_0^T\sqrt{\rho(s)}\,\mathrm ds\Bigr)^2\Bigr], \qquad (29)$$
where the above inequality follows from using Hölder's inequality,
$$\mathrm E\bigl[(g(X_T)-g(\bar X_T))^2\bigr] = \mathrm E\Bigl[\bigl|g(X_T)-g(\bar X_T)\bigr|\,\frac{1}{\sqrt N}\int_0^T\sqrt{\rho(s)}\,\mathrm ds\Bigr] \le \sqrt{\mathrm E\bigl[(g(X_T)-g(\bar X_T))^2\bigr]}\,\sqrt{\mathrm E\Bigl[\Bigl(\frac{1}{\sqrt N}\int_0^T\sqrt{\rho(s)}\,\mathrm ds\Bigr)^2\Bigr]}\,.$$
In comparison, we notice that if a uniform mesh is used, the MSE becomes
$$\mathrm{MSE}_{\mathrm{uniform}} = \frac{T}{N}\,\mathrm E\Bigl[\int_0^T\rho(s)\,\mathrm ds\Bigr]. \qquad (30)$$
A consequence of observations (29) and (30) is that for many low-regularity problems, for instance if $\rho(s) = s^{-p}$ with $p \in [1, 2)$, adaptive time-stepping Euler-Maruyama methods may produce more accurate solutions (measured in the MSE) than are obtained using the uniform time-stepping Euler-Maruyama method under the same computational budget constraints.

2.2.1 Mesh Refinement Strategy

To equilibrate the error indicators (28), we propose an iterative mesh refinement strategy: identify the largest error indicator and refine the corresponding time step by halving it.
To compute the error indicators prior to refinement, the algorithm first computes the numerical SDE solution, $\bar X_{t_n}$, and the corresponding first variation, $\bar\varphi_{x,n}$ (using Eqs. (4) and (23), respectively), on the initial mesh, $\Delta t^{\{0\}}$. Thereafter, the error indicators $r_n$ are computed by Eq. (25) and the mesh is refined a prescribed number of times, $N_{\mathrm{refine}}$, as follows:
(C.1) Find the largest error indicator
$$n^* := \arg\max_n r_n, \qquad (31)$$
and refine the corresponding time step by halving,
$$(t_{n^*}, t_{n^*+1}) \to \bigl(t_{n^*},\ t^{\mathrm{new}}_{n^*+1},\ t^{\mathrm{new}}_{n^*+2}\bigr), \qquad t^{\mathrm{new}}_{n^*+1} := \frac{t_{n^*}+t_{n^*+1}}{2}, \qquad t^{\mathrm{new}}_{n^*+2} := t_{n^*+1}, \qquad (32)$$
and increment the number of refinements by one.
(C.2) Update the values of the error indicators, either by recomputing the whole problem or locally by interpolation, cf. Sect. 2.2.3.
(C.3) Go to step (C.4) if $N_{\mathrm{refine}}$ mesh refinements have been made; otherwise, return to step (C.1).
(C.4) (Postconditioning) Do a last sweep over the mesh and refine by halving every time step that is strictly larger than $\Delta t_{\max}$, where $\Delta t_{\max} = O(N^{-1})$ denotes the maximum allowed step size.
The postconditioning step (C.4) ensures that all time steps become infinitesimally small as the number of time steps $N \to \infty$, with such a rate of decay that condition (M.2) in Theorem 2 holds; it thereby provides one of the conditions needed in Lemma 2 to ensure strong convergence of the numerical solutions of the MSE adaptive Euler-Maruyama algorithm. However, the strong convergence result should primarily be interpreted as a motivation for introducing the postconditioning step (C.4), since Theorem 2's assumption (M.1), namely that the mesh points are stopping times $t_n$ measurable with respect to $\mathcal F_{t_{n-1}}$, will not hold in general for our adaptive algorithm.

2.2.2 Wiener Path Refinements

When a time step is refined, as described in (32), the Wiener path must be refined correspondingly. The value of the Wiener path at the midpoint between $W_{t_{n^*}}$ and $W_{t_{n^*+1}}$ can be generated by Brownian bridge interpolation,
$$W_{t^{\mathrm{new}}_{n^*+1}} = \frac{W_{t_{n^*}} + W_{t_{n^*+1}}}{2} + \xi\,\frac{\sqrt{\Delta t_{n^*}}}{2}, \qquad (33)$$
where $\xi \sim N(0, 1)$, cf. [27]. See Fig. 1 for an illustration of Brownian bridge interpolation applied to numerical solutions of an Ornstein-Uhlenbeck SDE.

2.2.3 Updating the Error Indicators

After the refinement of an interval, $(t_{n^*}, t_{n^*+1})$, and of its Wiener path, the error indicators must also be updated before moving on to determine which interval is next in line for refinement. There are different ways of updating the error indicators. One expensive but more accurate option is to recompute the error indicators completely by first solving the forward problem (4) and the backward problem (23). A less costly but also less accurate alternative is to update only the error indicators locally at the refined time step, by one forward and one backward numerical solution step, respectively:
$$\bar X_{t^{\mathrm{new}}_{n^*+1}} = \bar X_{t_{n^*}} + a(t_{n^*}, \bar X_{t_{n^*}})\,\Delta t^{\mathrm{new}}_{n^*} + b(t_{n^*}, \bar X_{t_{n^*}})\,\Delta W^{\mathrm{new}}_{n^*}, \qquad \bar\varphi^{\mathrm{new}}_{x,n^*+1} = c_x\bigl(t^{\mathrm{new}}_{n^*+1}, \bar X_{t^{\mathrm{new}}_{n^*+1}}\bigr)\,\bar\varphi_{x,n^*+1}. \qquad (34)$$
Thereafter, we compute the resulting error density, $\rho^{\mathrm{new}}_{n^*+1}$, by Eq. (24), and finally update the error indicators locally by
$$r_{n^*} = \rho_{n^*}\bigl(\Delta t^{\mathrm{new}}_{n^*}\bigr)^2, \qquad r_{n^*+1} = \rho^{\mathrm{new}}_{n^*+1}\bigl(\Delta t^{\mathrm{new}}_{n^*+1}\bigr)^2. \qquad (35)$$
As a compromise between cost and accuracy, we here propose the following mixed approach to updating error indicators after refinement: with $N_{\mathrm{refine}}$ denoting the prescribed number of refinement iterations of the input mesh, let all error indicators be completely recomputed every $\tilde N = O(\log(N_{\mathrm{refine}}))$-th iteration, whereas for the remaining $N_{\mathrm{refine}} - \tilde N$ iterations, only local updates of the error indicators are computed. Following this approach, the computational cost of refining a mesh holding $N$ time steps into a mesh of $2N$ time steps becomes $O(N\log(N)^2)$. Observe that the asymptotically dominating cost is to sort the mesh's error indicators $O(\log(N))$ times. To anticipate the computational cost for the MSE adaptive MLMC algorithm, this implies that the cost of generating an MSE adaptive realization pair is $\operatorname{Cost}(\Delta_\ell g) = O(2^\ell\,\ell^2)$.

2.2.4 Pseudocode

The mesh refinement and the computation of error indicators are presented in Algorithms 1 and 2, respectively.

Algorithm 1 meshRefinement
Input: Mesh $\Delta t$, Wiener path $W$, number of refinements $N_{\mathrm{refine}}$, maximum time step $\Delta t_{\max}$.
Output: Refined mesh $\Delta t$ and Wiener path $W$.
Set the number of re-computations of all error indicators to a number $\tilde N = O(\log(N_{\mathrm{refine}}))$ and compute the refinement batch size $\check N = N_{\mathrm{refine}}/\tilde N$.
for $i = 1$ to $\tilde N$ do
  Completely update the error density by applying $[r, \bar X, \bar\varphi_x, \rho] = \texttt{computeErrorIndicators}(\Delta t, W)$.
  if $N_{\mathrm{refine}} > 2\check N$ then
    Set the below for-loop limit to $J = \check N$.
  else
    Set $J = N_{\mathrm{refine}}$.
  end if
  for $j = 1$ to $J$ do
    Locate the largest error indicator $r_{n^*}$ using Eq. (31).
    Refine the interval $(t_{n^*}, t_{n^*+1})$ by the halving (32), add a midpoint value $W_{t^{\mathrm{new}}_{n^*+1}}$ to the Wiener path by the Brownian bridge interpolation (33), and set $N_{\mathrm{refine}} = N_{\mathrm{refine}} - 1$.
    Locally update the error indicators $r^{\mathrm{new}}_{n^*}$ and $r^{\mathrm{new}}_{n^*+1}$ by the steps (34) and (35).
  end for
end for
Do a final sweep over the mesh and refine all time steps of the input mesh which are strictly larger than $\Delta t_{\max}$.


Algorithm 2 computeErrorIndicators
Input: Mesh $\Delta t$, Wiener path $W$.
Output: Error indicators $r$, path solutions $\bar X$ and $\bar\varphi_x$, error density $\rho$.
Compute the SDE path $\bar X$ using the Euler-Maruyama algorithm (4).
Compute the first variation $\bar\varphi_x$ using the backward algorithm (23).
Compute the error density $\rho$ and error indicators $r$ by the formulas (24) and (25), respectively.
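The following Python sketch combines simplified versions of Algorithms 1 and 2 for a scalar SDE. It deviates from Algorithm 1 in that all error indicators are recomputed after every refinement (no local updates and no final sweep with respect to a maximum step size), and the coefficient functions a, b, ax, bx, the payoff derivative dg and the initial value x0 are assumed to be supplied by the user. The greedy step follows (31)-(33): locate the largest indicator, halve the interval, and interpolate the Wiener path at the new midpoint.

import numpy as np

def compute_error_indicators(t, W, x0, a, b, ax, bx, dg):
    """Simplified Algorithm 2 for a scalar SDE: forward Euler path (4), backward
    first variation (23), error density (24) and indicators (25)."""
    dt, dW = np.diff(t), np.diff(W)
    N = len(dt)
    x = np.empty(N + 1)
    x[0] = x0
    for n in range(N):                                  # forward Euler path
        x[n + 1] = x[n] + a(t[n], x[n]) * dt[n] + b(t[n], x[n]) * dW[n]
    phi = np.empty(N + 1)
    phi[N] = dg(x[N])
    for n in range(N - 1, -1, -1):                      # backward first variation
        cx = 1.0 + ax(t[n], x[n]) * dt[n] + bx(t[n], x[n]) * dW[n]
        phi[n] = cx * phi[n + 1]
    rho = np.array([phi[n] ** 2 * (bx(t[n], x[n]) * b(t[n], x[n])) ** 2 / 2.0
                    for n in range(N)])
    return rho * dt ** 2                                # error indicators r_n

def mesh_refinement(t, W, x0, n_refine, a, b, ax, bx, dg, rng):
    """Simplified Algorithm 1: repeatedly halve the interval with the largest
    indicator and refine the Wiener path by Brownian bridge interpolation (33).
    All indicators are recomputed in every iteration (no local updates)."""
    for _ in range(n_refine):
        r = compute_error_indicators(t, W, x0, a, b, ax, bx, dg)
        k = int(np.argmax(r))
        tm = 0.5 * (t[k] + t[k + 1])
        wm = 0.5 * (W[k] + W[k + 1]) + np.sqrt((t[k + 1] - t[k]) / 4.0) * rng.normal()
        t, W = np.insert(t, k + 1, tm), np.insert(W, k + 1, wm)
    return t, W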

2.3 Numerical Examples


To illustrate the procedure for computing error indicators and the performance of the adaptive algorithm, we now present four SDE example problems. To keep matters relatively elementary, the dual solutions, $\varphi_x(t)$, for these examples are derived not from a posteriori but from a priori analysis. This approach results in adaptively generated meshes whose mesh points, for all problems in this section, are stopping times for which $t_n$ is $\mathcal F_{t_{n-1}}$-measurable for all $n \in \{1, 2, \dots, N\}$. In Examples 1-3, it is straightforward to verify that the other assumptions of the respective single- and multi-dimensional MSE error expansions of Theorems 2 and 3 hold, meaning that the adaptive approach produces numerical solutions whose MSE to leading order is bounded by the respective error expansions (14) and (67).

Example 1 We consider the classical geometric Brownian motion problem
$$\mathrm dX_t = X_t\,\mathrm dt + X_t\,\mathrm dW_t, \qquad X_0 = 1,$$
for which we seek to minimize the MSE
$$\mathrm E\bigl[(X_T - \bar X_T)^2\bigr] = \min!, \qquad N \text{ given}, \qquad (36)$$
at the final time $T = 1$ (cf. the goal (B.1)). One may derive that the dual solution of this problem is of the form
$$\varphi_x(X_t, t) = \partial_{X_t}X_T^{X_t,t} = \frac{X_T}{X_t},$$
which leads to the error density
$$\rho(t) = \frac{(b_x b)^2(X_t, t)\,(\varphi_x(X_t, t))^2}{2} = \frac{X_T^2}{2}.$$
We conclude that uniform time-stepping is optimal. A further reduction of the MSE could be achieved by allowing the number of time steps to depend on the magnitude of $X_T^2$ for each realization. This is, however, outside the scope of the considered refinement goal (B.1), where we assume that the number of time steps, $N$, is fixed for all realizations, and it would be possible only to a very weak degree under the slight generalization of (B.1) given in assumption (M.2) of Theorem 2.


Example 2 Our second example is the two-dimensional (2D) SDE problem
$$\mathrm dW_t = 1\,\mathrm dW_t, \qquad W_0 = 0,$$
$$\mathrm dX_t = W_t\,\mathrm dW_t, \qquad X_0 = 0.$$
Here, we seek to minimize the MSE $\mathrm E[(X_T - \bar X_T)^2]$ for the observable
$$X_T = \int_0^T W_t\,\mathrm dW_t$$
at the final time $T = 1$. With the diffusion matrix represented by
$$b\bigl((W_t, X_t), t\bigr) = \begin{pmatrix} 1 \\ W_t \end{pmatrix},$$
and observing that
$$\partial_{X_t}X_T^{X_t,t} = \partial_{X_t}\Bigl(X_t + \int_t^T W_s\,\mathrm dW_s\Bigr) = 1,$$
it follows from the error density in multiple dimensions in Eq. (65) that $\rho(t) = \tfrac12$. We conclude that uniform time-stepping is optimal for this problem as well.
Example 3 Next, we consider the three-dimensional (3D) SDE problem
$$\mathrm dW^{(1)}_t = 1\,\mathrm dW^{(1)}_t, \qquad W^{(1)}_0 = 0,$$
$$\mathrm dW^{(2)}_t = 1\,\mathrm dW^{(2)}_t, \qquad W^{(2)}_0 = 0,$$
$$\mathrm dX_t = W^{(1)}_t\,\mathrm dW^{(2)}_t - W^{(2)}_t\,\mathrm dW^{(1)}_t, \qquad X_0 = 0,$$
where $W^{(1)}_t$ and $W^{(2)}_t$ are independent Wiener processes. Here, we seek to minimize the MSE $\mathrm E[(X_T - \bar X_T)^2]$ for the Lévy area observable
$$X_T = \int_0^T\bigl(W^{(1)}_t\,\mathrm dW^{(2)}_t - W^{(2)}_t\,\mathrm dW^{(1)}_t\bigr)$$
at the final time $T = 1$. Representing the diffusion matrix by
$$b\bigl((W_t, X_t), t\bigr) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -W^{(2)}_t & W^{(1)}_t \end{pmatrix},$$
and observing that
$$\partial_{X_t}X_T^{X_t,t} = \partial_{X_t}\Bigl(X_t + \int_t^T\bigl(W^{(1)}_s\,\mathrm dW^{(2)}_s - W^{(2)}_s\,\mathrm dW^{(1)}_s\bigr)\Bigr) = 1,$$
it follows from Eq. (65) that $\rho(t) = 1$. We conclude that uniform time-stepping is optimal for computing Lévy areas.
Example 4 As the last example, we consider the 2D SDE
$$\mathrm dW_t = 1\,\mathrm dW_t, \qquad W_0 = 0,$$
$$\mathrm dX_t = 3(W_t^2 - t)\,\mathrm dW_t, \qquad X_0 = 0.$$
We seek to minimize the MSE (36) at the final time $T = 1$. For this problem, it may be shown by Itô calculus that the pathwise exact solution is $X_T = W_T^3 - 3W_T T$. Representing the diffusion matrix by
$$b\bigl((W_t, X_t), t\bigr) = \begin{pmatrix} 1 \\ 3(W_t^2 - t) \end{pmatrix},$$


Equation (65) implies that $\rho(t) = 18W_t^2$. This motivates the use of the discrete error indicators, $r_n = 18W_{t_n}^2\,\Delta t_n^2$, in the mesh refinement criterion. For this problem, we may not directly conclude that the error expansion (67) holds, since the diffusion coefficient does not fulfill the assumption in Theorem 3. Although we will not include the details here, it is easy to derive that $\partial_x^j X_T^{x,t} = 0$ for all $j > 1$ and to prove that the MSE leading-order error expansion also holds for this particular problem by following the steps of the proof of Theorem 2. In Fig. 2, we compare the uniform and adaptive time-stepping Euler-Maruyama algorithms in terms of MSE versus the number of time steps, $N$.
Fig. 2  Comparison of the performance of uniform and adaptive time-stepping Euler-Maruyama numerical integration for Example 4 in terms of MSE versus number of time steps


Estimates of the MSE for both algorithms are computed by MC sampling using $M = 10^6$ samples. This is a sufficient sample size to render the MC estimates' statistical error negligible. For the adaptive algorithm, we have used the following input parameters in Algorithm 1: a uniform input mesh, $\Delta t$, with step size $2/N$ (and $\Delta t_{\max} = 2/N$). The number of refinements is set to $N_{\mathrm{refine}} = N/2$. We observe that the algorithms have approximately equal convergence rates, but, as expected, the adaptive algorithm is slightly more accurate than the uniform time-stepping algorithm.

3 Extension of the Adaptive Algorithm to the Multilevel Setting

In this section, we incorporate the MSE adaptive time-stepping algorithm presented in the preceding section into an MSE adaptive MLMC algorithm for weak approximations. First, we briefly recall the approximation goal and important concepts for the MSE adaptive MLMC algorithm, such as the structure of the adaptive mesh hierarchy and MLMC error control. Thereafter, the MLMC algorithm is presented in pseudocode form.

3.1 Notation and Objective


For a tolerance, $\mathrm{TOL} > 0$, and confidence, $0 < 1 - \delta < 1$, we recall that our objective is to construct an adaptive time-stepping MLMC estimator, $\mathcal A_{\mathrm{ML}}$, which meets the approximation constraint
$$P\bigl(\bigl|\mathrm E[g(X_T)] - \mathcal A_{\mathrm{ML}}\bigr| \le \mathrm{TOL}\bigr) \ge 1 - \delta. \qquad (37)$$
We denote the multilevel estimator by
$$\mathcal A_{\mathrm{ML}} := \sum_{\ell=0}^{L}\underbrace{\sum_{m=1}^{M_\ell}\frac{\Delta_\ell g_m}{M_\ell}}_{=:\mathcal A(\Delta_\ell g; M_\ell)},$$
where
$$\Delta_\ell g_m := \begin{cases} g\bigl(\bar X^{\{0\}}_{m,T}\bigr), & \text{if } \ell = 0,\\ g\bigl(\bar X^{\{\ell\}}_{m,T}\bigr) - g\bigl(\bar X^{\{\ell-1\}}_{m,T}\bigr), & \text{else.}\end{cases}$$
Section 1.2.5 presents further details on MLMC notation and parameters.

3.1.1 The Mesh Hierarchy

 
A realization, $\Delta_\ell g(\omega_{i,\ell})$, is generated on a nested pair of mesh realizations
$$\dots \subset \Delta t^{\{\ell-1\}}(\omega_{i,\ell}) \subset \Delta t^{\{\ell\}}(\omega_{i,\ell}).$$
Mesh realizations are generated step by step from a prescribed and deterministic input mesh, $\Delta t^{\{-1\}}$, holding $N_{-1}$ uniform time steps. First, $\Delta t^{\{-1\}}$ is refined into a mesh, $\Delta t^{\{0\}}$, by applying Algorithm 1, namely
$$[\Delta t^{\{0\}}, W^{\{0\}}] = \texttt{meshRefinement}\bigl(\Delta t^{\{-1\}}, W^{\{-1\}}, N_{\mathrm{refine}} = N_{-1}, \Delta t_{\max} = N_0^{-1}\bigr).$$
The mesh refinement process is iterated until the meshes $\Delta t^{\{\ell-1\}}$ and $\Delta t^{\{\ell\}}$ are produced, with the last couple of iterations being
$$[\Delta t^{\{\ell-1\}}, W^{\{\ell-1\}}] = \texttt{meshRefinement}\bigl(\Delta t^{\{\ell-2\}}, W^{\{\ell-2\}}, N_{\mathrm{refine}} = N_{\ell-2}, \Delta t_{\max} = N_{\ell-1}^{-1}\bigr)$$
and
$$[\Delta t^{\{\ell\}}, W^{\{\ell\}}] = \texttt{meshRefinement}\bigl(\Delta t^{\{\ell-1\}}, W^{\{\ell-1\}}, N_{\mathrm{refine}} = N_{\ell-1}, \Delta t_{\max} = N_\ell^{-1}\bigr).$$
The output realization for the difference $\Delta_\ell g_i = g\bigl(\bar X^{\{\ell\}}_i\bigr) - g\bigl(\bar X^{\{\ell-1\}}_i\bigr)$ is thereafter generated on the output temporal mesh and Wiener path pairs, $(\Delta t^{\{\ell-1\}}, W^{\{\ell-1\}})$ and $(\Delta t^{\{\ell\}}, W^{\{\ell\}})$.
For later estimates of the computational cost of the MSE adaptive MLMC algorithm, it is useful to have upper bounds on the growth of the number of time steps in the mesh hierarchy, $\{\Delta t^{\{\ell\}}\}_\ell$, as $\ell$ increases. Letting $|\Delta t|$ denote the number of time steps in a mesh $\Delta t$ (i.e., the cardinality of the set $\Delta t = \{\Delta t_0, \Delta t_1, \dots\}$), the following bounds hold:
$$N_\ell \le \bigl|\Delta t^{\{\ell\}}\bigr| < 2N_\ell, \qquad \ell \in \mathbb N_0.$$
The lower bound follows straightforwardly from the mesh hierarchy refinement procedure described above. To show the upper bound, notice that the maximum number of mesh refinements going from a level $\ell-1$ mesh, $\Delta t^{\{\ell-1\}}$, to a level $\ell$ mesh, $\Delta t^{\{\ell\}}$, is $2N_{\ell-1} - 1$. Consequently,
$$\bigl|\Delta t^{\{\ell\}}\bigr| \le \bigl|\Delta t^{\{-1\}}\bigr| + \sum_{j=0}^{\ell}\bigl(\text{maximum number of refinements going from }\Delta t^{\{j-1\}}\text{ to }\Delta t^{\{j\}}\bigr) \le N_{-1} + 2\sum_{j=0}^{\ell}N_{j-1} - (\ell+1) < 2N_\ell.$$

Remark 4 For the telescoping property $\mathrm E[\mathcal A_{\mathrm{ML}}] = \mathrm E[g(\bar X^{\{L\}}_T)]$ to hold, it is not required that the adaptive mesh hierarchy is nested, but non-nested meshes make it more complicated to compute the Wiener path pairs $(W^{\{\ell-1\}}, W^{\{\ell\}})$. In the numerical tests leading to this work, we tested both nested and non-nested adaptive meshes and found both options performing satisfactorily.

3.2 Error Control


The error control for the adaptive MLMC algorithm follows the general framework of uniform time-stepping MLMC, but for the sake of completeness, we recall the error control framework for the setting of weak approximations. By splitting
$$\bigl|\mathrm E[g(X_T)] - \mathcal A_{\mathrm{ML}}\bigr| \le \underbrace{\bigl|\mathrm E[g(X_T) - g(\bar X^{\{L\}}_T)]\bigr|}_{=:\mathcal E_T} + \underbrace{\bigl|\mathrm E[g(\bar X^{\{L\}}_T)] - \mathcal A_{\mathrm{ML}}\bigr|}_{=:\mathcal E_S}$$
and
$$\mathrm{TOL} = \mathrm{TOL}_T + \mathrm{TOL}_S, \qquad (38)$$
we seek to implicitly fulfill (37) by imposing the stricter constraints
$$\mathcal E_T \le \mathrm{TOL}_T, \qquad \text{the time discretization error}, \qquad (39)$$
$$P\bigl(\mathcal E_S \le \mathrm{TOL}_S\bigr) \ge 1 - \delta, \qquad \text{the statistical error}. \qquad (40)$$

3.2.1 The Statistical Error

Under the moment assumptions stated in [6], Lindeberg's version of the Central Limit Theorem yields that, as $\mathrm{TOL} \downarrow 0$,
$$\frac{\mathcal A_{\mathrm{ML}} - \mathrm E\bigl[g(\bar X^{\{L\}}_T)\bigr]}{\sqrt{\operatorname{Var}(\mathcal A_{\mathrm{ML}})}} \xrightarrow{\ \mathrm D\ } N(0, 1).$$
Here, $\xrightarrow{\mathrm D}$ denotes convergence in distribution. By construction, we have
$$\operatorname{Var}(\mathcal A_{\mathrm{ML}}) = \sum_{\ell=0}^{L}\frac{\operatorname{Var}(\Delta_\ell g)}{M_\ell}.$$


This asymptotic result motivates the statistical error constraint
$$\operatorname{Var}(\mathcal A_{\mathrm{ML}}) \le \Bigl(\frac{\mathrm{TOL}_S}{C_C(\delta)}\Bigr)^2, \qquad (41)$$
where $C_C(\delta)$ is the confidence parameter chosen such that
$$\frac{1}{\sqrt{2\pi}}\int_{-C_C(\delta)}^{C_C(\delta)} e^{-x^2/2}\,\mathrm dx = 1 - \delta, \qquad (42)$$
for a prescribed confidence $(1 - \delta)$.
Another important question is how to distribute the number of samples, $\{M_\ell\}_\ell$, over the level hierarchy such that both the computational cost of the MLMC estimator is minimized and the constraint (41) is met. Letting $C_\ell$ denote the expected cost of generating a numerical realization $\Delta_\ell g(\omega_{i,\ell})$, the approximate total cost of generating the multilevel estimator becomes
$$C_{\mathrm{ML}} := \sum_{\ell=0}^{L} C_\ell\,M_\ell.$$
An optimization of the number of samples at each level can then be found through minimization of the Lagrangian
$$\mathcal L(M_0, M_1, \dots, M_L, \lambda) = \lambda\Bigl(\sum_{\ell=0}^{L}\frac{\operatorname{Var}(\Delta_\ell g)}{M_\ell} - \frac{\mathrm{TOL}_S^2}{C_C^2(\delta)}\Bigr) + \sum_{\ell=0}^{L} C_\ell\,M_\ell,$$
yielding
$$M_\ell = \frac{C_C^2(\delta)}{\mathrm{TOL}_S^2}\,\sqrt{\frac{\operatorname{Var}(\Delta_\ell g)}{C_\ell}}\,\sum_{\upsilon=0}^{L}\sqrt{C_\upsilon\operatorname{Var}(\Delta_\upsilon g)}, \qquad \ell = 0, 1, \dots, L.$$
Since the cost of adaptively refining a mesh, $\Delta t^{\{\ell\}}$, is $O(N_\ell\log(N_\ell)^2)$, as noted in Sect. 2.2.3, the cost of generating an SDE realization is of the same order: $C_\ell = O(N_\ell\log(N_\ell)^2)$. Representing the cost by its leading-order term and disregarding the logarithmic factor, an approximation to the level-wise optimal number of samples becomes
$$M_\ell = \Biggl\lceil\frac{C_C^2(\delta)}{\mathrm{TOL}_S^2}\,\sqrt{\frac{\operatorname{Var}(\Delta_\ell g)}{N_\ell}}\,\sum_{\upsilon=0}^{L}\sqrt{N_\upsilon\operatorname{Var}(\Delta_\upsilon g)}\Biggr\rceil, \qquad \ell = 0, 1, \dots, L. \qquad (43)$$
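A small Python sketch of the allocation rule (43), with illustrative variance and step-count inputs that are not taken from the paper.

import numpy as np

def optimal_samples(variances, steps, tol_s, c_c):
    """Level-wise sample sizes (43): M_l proportional to sqrt(V_l / N_l), scaled
    so that the estimator variance sum_l V_l / M_l stays below (tol_s / c_c)^2."""
    V, N = np.asarray(variances, float), np.asarray(steps, float)
    total = np.sum(np.sqrt(V * N))
    M = (c_c / tol_s) ** 2 * np.sqrt(V / N) * total
    return np.ceil(M).astype(int)

# illustrative inputs (not taken from the paper): V_l ~ 2^-l, N_l = 10 * 2^l
M = optimal_samples([2.0 ** -l for l in range(5)],
                    [10 * 2 ** l for l in range(5)], tol_s=0.01, c_c=1.96)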


Remark 5 In our MLMC implementations, the variances, $\operatorname{Var}(\Delta_\ell g)$, in Eq. (43) are approximated by sample variances. To save memory in our parallel computer implementation, the maximum permitted batch size for a set of realizations, $\{\Delta_\ell g(\omega_{i,\ell})\}_i$, is set to 100,000. For the initial batch, consisting of $M_\ell = \hat M$ samples, the sample variance is computed by the standard approach,
$$\mathcal V(\Delta_\ell g; M_\ell) = \frac{1}{M_\ell - 1}\sum_{i=1}^{M_\ell}\bigl(\Delta_\ell g(\omega_{i,\ell}) - \mathcal A(\Delta_\ell g; M_\ell)\bigr)^2.$$
Thereafter, for every new batch of realizations, $\{\Delta_\ell g(\omega_{i,\ell})\}_{i=M_\ell+1}^{M_\ell+\hat M}$ ($\hat M$ here denotes an arbitrary natural number smaller than or equal to 100,000), we incrementally update the sample variance,
$$\mathcal V(\Delta_\ell g; M_\ell + \hat M) = \frac{M_\ell}{M_\ell + \hat M}\,\mathcal V(\Delta_\ell g; M_\ell) + \frac{1}{M_\ell + \hat M - 1}\sum_{i=M_\ell+1}^{M_\ell+\hat M}\bigl(\Delta_\ell g(\omega_{i,\ell}) - \mathcal A(\Delta_\ell g; M_\ell + \hat M)\bigr)^2,$$
and update the total number of samples on level $\ell$ accordingly, $M_\ell = M_\ell + \hat M$.
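A minimal accumulator implementing the batched update rule stated in Remark 5 (it requires at least two samples in total before a variance is defined); the class and variable names are ours.

import numpy as np

class RunningVariance:
    """Batched sample mean/variance accumulator following Remark 5: the previous
    variance is weighted by M / (M + dM) and the new batch contributes squared
    deviations from the updated mean, divided by (M + dM - 1)."""

    def __init__(self):
        self.M, self.mean, self.var = 0, 0.0, 0.0

    def add_batch(self, batch):
        batch = np.asarray(batch, dtype=float)
        dM = batch.size
        new_M = self.M + dM
        new_mean = (self.M * self.mean + batch.sum()) / new_M
        self.var = (self.M / new_M) * self.var \
            + np.sum((batch - new_mean) ** 2) / (new_M - 1)
        self.M, self.mean = new_M, new_mean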

3.2.2 The Time Discretization Error

To control the time discretization error, we assume that a weak order convergence rate, $\alpha > 0$, holds for the given SDE problem when solved with the Euler-Maruyama method, i.e.,
$$\bigl|\mathrm E[g(X_T) - g(\bar X^{\{L\}}_T)]\bigr| = O\bigl(N_L^{-\alpha}\bigr),$$
and we assume that the asymptotic rate is reached at level $L - 1$. Then
$$\bigl|\mathrm E[g(X_T) - g(\bar X^{\{L\}}_T)]\bigr| = \Bigl|\sum_{\ell=L+1}^{\infty}\mathrm E[\Delta_\ell g]\Bigr| \le \sum_{\ell=1}^{\infty}2^{-\alpha\ell}\bigl|\mathrm E[\Delta_L g]\bigr| = \frac{\bigl|\mathrm E[\Delta_L g]\bigr|}{2^{\alpha} - 1}.$$
In our implementation, we assume the weak convergence rate, $\alpha$, is known prior to sampling and, replacing $\mathrm E[\Delta_L g]$ with a sample average approximation in the above inequality, we determine $L$ by the following stopping criterion:

54

H. Hoel et al.



max 2 |A (L1 g; ML1 )| , |A (L g; ML )|
TOLT ,
2 1

(44)

(cf. Algorithm 3). Here we implicitly assume that the statistical error in estimating
the bias condition is not prohibitively large.
A final level L of order log(TOLT 1 ) will thus control the discretization error.
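For completeness, here is a minimal sketch of the stopping test (44); the sample averages of Δ_{L−1} g and Δ_L g and the input rate α are assumed given, and the function name is ours.

```python
# Sketch of the bias stopping test (44): stop adding levels once the
# extrapolated weak-error estimate falls below TOL_T.
def bias_below_tolerance(A_prev, A_last, alpha, tol_t):
    """A_prev, A_last: sample averages of Delta_{L-1} g and Delta_L g."""
    weak_error_estimate = max(2.0 ** (-alpha) * abs(A_prev),
                              abs(A_last)) / (2.0 ** alpha - 1.0)
    return weak_error_estimate <= tol_t
```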
3.2.3 Computational Cost

Under the convergence rate assumptions stated in Theorem 1, it follows that the cost of generating an adaptive MLMC estimator A_ML fulfilling the MSE approximation goal E[(A_ML − E[g(X_T)])²] ≤ TOL² is bounded by

    C_ML = Σ_{ℓ=0}^{L} M_ℓ C_ℓ =  O(TOL^{−2}),                               if β > 1,
                                  O(TOL^{−2} log(TOL)^4),                    if β = 1,        (45)
                                  O(TOL^{−(2+(1−β)/α)} log(TOL)²),           if β < 1.

Moreover, under the additional higher moment approximation rate assumption

    E[ |g(X̄_T^{ℓ}) − g(X_T)|^{2+δ} ] = O( N_ℓ^{−(2+δ)/2} ),

the complexity bound (45) also holds for fulfilling criterion (2) asymptotically as TOL ↓ 0 (cf. [5]).

3.3 MLMC Pseudocode


In this section, we present pseudocode for the implementation of the MSE adaptive
MLMC algorithm. In addition to Algorithms 1 and 2, presented in Sect. 2.2.4, the
implementation consists of Algorithms 3 and 4. Algorithm 3 describes how the stopping criterion for the final level L is implemented and how the multilevel estimator
is generated, and Algorithm 4 describes the steps for generating a realization Δ_ℓ g.


Algorithm 3 mlmcEstimator
Input: TOL_T, TOL_S, confidence δ, initial mesh Δt^{−1}, initial number of mesh steps N_{−1}, input weak rate α, initial number of samples M̂.
Output: Multilevel estimator A_ML.
Compute the confidence parameter C_C(δ) by (42).
Set L = −1.
while L < 2 or (44), using the input for the weak rate α, is violated do
    Set L = L + 1.
    Set M_L = M̂, and generate a set of realizations {Δ_L g_{i,L}}_{i=1}^{M_L} by applying adaptiveRealization(Δt^{−1}).
    for ℓ = 0 to L do
        Compute the sample variance V(Δ_ℓ g; M_ℓ).
    end for
    for ℓ = 0 to L do
        Determine the number of samples M_ℓ by (43).
        if the new value of M_ℓ is larger than the old value then
            Compute additional realizations {Δ_ℓ g_{i,ℓ}}_{i=M_ℓ+1}^{M_ℓ,new} by applying adaptiveRealization(Δt^{−1}).
        end if
    end for
end while
Compute A_ML from the generated samples by using formula (7).
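To make the control flow of Algorithm 3 concrete, here is a compact Python sketch. It reuses the helper sketches given earlier in this section (confidence_parameter, optimal_samples, bias_below_tolerance), assumes a user-supplied adaptive_realization(level) returning one sample of Δ_ℓ g in the spirit of Algorithm 4, and uses N_ℓ as a rough cost proxy; none of this is the authors' Java code.

```python
# Sketch of the Algorithm 3 loop structure (illustrative, not the paper's code).
import numpy as np

def mlmc_estimator(tol_t, tol_s, delta, alpha, M_hat, adaptive_realization):
    C_c = confidence_parameter(delta)            # cf. Eq. (42)
    samples = []                                 # samples[l]: list of Delta_l g realizations
    L = -1
    while L < 2 or not bias_below_tolerance(
            np.mean(samples[L - 1]), np.mean(samples[L]), alpha, tol_t):   # cf. Eq. (44)
        L += 1
        samples.append([adaptive_realization(L) for _ in range(M_hat)])
        V = [np.var(s, ddof=1) for s in samples]
        N = [2 ** (l + 2) for l in range(L + 1)]         # placeholder cost proxy ~ N_l
        M = optimal_samples(V, N, tol_s, C_c)            # cf. Eq. (43)
        for l in range(L + 1):
            while len(samples[l]) < M[l]:
                samples[l].append(adaptive_realization(l))
    # Multilevel estimator, formula (7): sum of level-wise sample averages.
    return sum(np.mean(s) for s in samples)
```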

Remark 6 For each increment of L in Algorithm 3, all realizations Δ_ℓ g that have been generated up to that point are reused in later computations of the multilevel estimator. This approach, which is common in MLMC (cf. [8]), seems to work fine in practice although the independence between samples is then lost. Accounting for the lack of independence complicates the convergence analysis.

4 Numerical Examples for the MLMC Algorithms


To illustrate the implementation of the MSE adaptive MLMC algorithm and to show
its robustness and potential efficiency gain over the uniform MLMC algorithm, we
present two numerical examples in this section. The first example considers a geometric Brownian motion SDE problem with sufficient regularity, such that there is
very little (probably nothing) to gain by introducing adaptive mesh refinement. The
example is included to show that in settings where adaptivity is not required, the
MSE adaptive MLMC algorithm is not excessively more expensive than the uniform
MLMC algorithm. In the second example, we consider an SDE with a random-time drift coefficient blow-up of order |t − ξ|^{−p}, with p ∈ [0.5, 1). The MSE adaptive MLMC algorithm performs progressively more efficiently than the uniform MLMC algorithm as the value of the blow-up exponent p increases.


Algorithm 4 adaptiveRealization
Input: Mesh Δt^{−1}.
Output: One realization Δ_ℓ g(ω).
Generate a Wiener path W^{−1} on the initial mesh Δt^{−1}.
for j = 0 to ℓ do
    Refine the mesh by applying
    [Δt^{j}, W^{j}] = meshRefinement(Δt^{j−1}, W^{j−1}, N_refine = N_{j−1}, Δt_max = N_{j−1}^{−1}).
end for
Compute Euler–Maruyama realizations (X̄_T^{ℓ−1}, X̄_T^{ℓ})(ω) using the mesh pair (Δt^{ℓ−1}, Δt^{ℓ})(ω) and the Wiener path pair (W^{ℓ−1}, W^{ℓ})(ω), cf. (4), and return the output

    Δ_ℓ g(ω) = g( X̄_T^{ℓ}(ω) ) − g( X̄_T^{ℓ−1}(ω) ).

We should add, however, that although we observe numerical evidence of the numerical solutions converging for both examples, not all of the assumptions in Theorem 2 are fulfilled for our adaptive algorithm when applied to either of the two examples. We are therefore not able to prove theoretically that our adaptive algorithm converges in these examples.
For reference, the implemented MSE adaptive MLMC algorithm is described in
Algorithms 14, the standard form of the uniform time-stepping MLMC algorithm
that we use in these numerical comparisons is presented in Algorithm 5, Appendix A
Uniform Time Step MLMC Algorithm, and a summary of the parameter values used
in the examples is given in Table 2. Furthermore, all average properties derived from
the MLMC algorithms that we plot for the considered examples in Figs. 3, 4, 5, 6, 7,
8, 9, 10, 11 and 12 below are computed from 100 multilevel estimator realizations,
and, when plotted, error bars are scaled to one sample standard deviation.
Example 5 We consider the geometric Brownian motion

    dX_t = μ X_t dt + σ X_t dW_t,   X_0 = 1,

where we seek to fulfill the weak approximation goal (2) for the observable g(x) = x at the final time T = 1. The reference solution is E[g(X_T)] = e^{μT}. From Example 1, we recall that the MSE is minimized in this problem by using uniform time steps. However, our a posteriori MSE adaptive MLMC algorithm computes error indicators from numerical solutions of the path and the dual solution, which may lead to slightly non-uniform output meshes. In Fig. 3, we study how close to uniform the MSE adaptive meshes are by plotting the level-wise ratio E[|Δt^{ℓ}|]/N_ℓ, where we recall that |Δt^{ℓ}| denotes the number of time steps in the mesh Δt^{ℓ} and that a uniform mesh on level ℓ has N_ℓ time steps. As the level ℓ increases, E[|Δt^{ℓ}|]/N_ℓ converges to 1.


Table 2 List of parameter values used by the MSE adaptive MLMC algorithm and (when required) the uniform MLMC algorithm for the numerical examples in Sect. 4

Parameter   | Description of parameter                                                        | Example 5           | Example 6
δ           | Confidence parameter, cf. (37)                                                  | 0.1                 | 0.1
TOL         | Accuracy parameter, cf. (37)                                                    | [10^{−3}, 10^{−1}]  | [10^{−3}, 10^{−1}]
TOL_S       | Statistical error tolerance, cf. (38)                                           | TOL/2               | TOL/2
TOL_T       | Bias error tolerance, cf. (38)                                                  | TOL/2               | TOL/2
Δt^{−1}     | Pre-initial input uniform mesh having the following step size                   | 1/2                 | 1/2
N_0         | Number of time steps in the initial mesh Δt^{0}                                 |                     |
N(ℓ)        | Number of complete updates of the error indicators in the MSE adaptive          | log(ℓ+2)/log(2)     | log(ℓ+2)/log(2)
            | algorithm, cf. Algorithm 1                                                      |                     |
Δt_max(ℓ)   | Maximum permitted time step size                                                | N_ℓ^{−1}            | N_ℓ^{−1}
Δt_min      | Minimum permitted time step size (due to the used double-precision              | 2^{−51}             | 2^{−51}
            | binary floating-point format)                                                   |                     |
M̂           | Number of first-batch samples for a (first) estimate of the variance Var(Δ_ℓ g) | 100                 | 20
α           | Input weak convergence rate used in the stopping rule (44) for uniform          | 1                   | 1 − p
            | time step Euler–Maruyama numerical integration                                  |                     |
α           | Input weak convergence rate used in the stopping rule (44) for the MSE          | 1                   | 1
            | adaptive time step Euler–Maruyama numerical integration                         |                     |

To interpret this result, we recall from the construction of the adaptive mesh hierarchy in Sect. 3 that if |Δt^{ℓ}| = N_ℓ, then the mesh Δt^{ℓ} is uniform. We thus conclude that for this problem, the higher the level, the more uniform the MSE adaptive mesh realizations generally become.
Since adaptive mesh refinement is costly and since this problem has sufficient
regularity for the first-order weak and MSE convergence rates (5) and (6) to hold,
respectively, one might expect that MSE adaptive MLMC will be less efficient than
the uniform MLMC. This is verified in Fig. 5, which shows that the runtime of the
MSE adaptive MLMC algorithm grows slightly faster than the uniform MLMC algorithm and that the cost ratio is at most roughly 3.5, in favor of uniform MLMC. In
Fig. 4, the accuracy of the MLMC algorithms is compared, showing that both algorithms fulfill the goal (2) reliably. Figure 6 further shows that

 algorithms have
both
roughly first-order convergence rates for the weak error E  g  and the variance
Var( g), and that the decay rates for Ml are close to identical. We conclude that

Fig. 3 The ratio of the level-wise mean number of time steps, E[|Δt^{ℓ}|]/N_ℓ, of MSE adaptive mesh realizations to uniform mesh realizations for Example 6

Fig. 4 For a set of TOL values, 100 realizations of the multilevel estimator are computed using both MLMC algorithms for Example 5. The errors |A_ML(ω_i; TOL, δ) − E[g(X_T)]| are plotted as circles (adaptive MLMC) and triangles (uniform MLMC), respectively, and the number of multilevel estimator realizations failing the constraint |A_ML(ω_i; TOL, δ) − E[g(X_T)]| < TOL is written above the (TOL^{−1}, TOL) line. Since the confidence parameter is set to δ = 0.1 and fewer than 10 realizations fail for any of the tested TOL values, both algorithms meet the approximation goal (37)

We conclude that although MSE adaptive MLMC is slightly more costly than uniform MLMC, the two algorithms perform comparably in terms of runtime for this example.
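As a quick numerical sanity check of the Example 5 setting, a plain single-level Euler–Maruyama simulation should reproduce the reference value E[g(X_T)] = e^{μT}. The parameter values below are illustrative only and are not taken from the paper.

```python
# Sketch: uniform-step Euler-Maruyama check of E[X_T] = exp(mu*T) for GBM.
import numpy as np

def euler_gbm(mu=0.5, sigma=0.5, T=1.0, N=64, M=200_000, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / N
    X = np.ones(M)
    for _ in range(N):
        dW = rng.normal(0.0, np.sqrt(dt), size=M)
        X += mu * X * dt + sigma * X * dW
    return X.mean(), np.exp(mu * T)

print(euler_gbm())  # sample mean approaches exp(mu*T) as N and M grow
```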
Remark 7 The reason why we are unable to prove theoretically that the numerical
solution of this problem computed with our adaptive algorithm asymptotically converges to the true solution is slightly subtle. The required smoothness conditions in
Theorem 2 are obviously fulfilled, but due to the local update of the error indicators
in our mesh refinement procedure, (cf. Sect. 2.2.3), we cannot prove that the mesh
points will asymptotically be stopping times for which t_n is F_{t_{n−1}}-measurable for all n ∈ {1, 2, . . . , N}. If we instead were to use the version of our adaptive algorithm
that recomputes all error indicators for each mesh refinement, the definition of the
error density (24) implies that, for this particular problem, it would take the same


Fig. 5 Average runtime versus TOL^{−1} for the two MLMC algorithms solving Example 5 (with a c·TOL^{−2} log(TOL)² reference line)

Fig. 6 Output for Example 5 solved with the MSE adaptive and uniform time-stepping MLMC algorithms. (Top) Weak error |E[Δ_ℓ g]| for solutions at TOL = 10^{−3}. (Middle) Variance Var(Δ_ℓ g) for solutions at TOL = 10^{−3}. (Bottom) Average number of samples E[M_ℓ]


value, ρ̄_n = Σ_{k=0}^{N−1} c_x(t_k, X̄_{t_k})²/2, for all indices n ∈ {0, 1, . . . , N}. The resulting adaptively refined mesh would then become uniform, and we could verify convergence, for instance, by using Theorem 2. Connecting this to the numerical results for the adaptive algorithm that we have implemented here, we notice that the level-wise mean number of time steps ratio E[|Δt^{ℓ}|]/N_ℓ presented in Fig. 3 seems to tend towards 1 as ℓ increases, a limit ratio that is achieved only if Δt^{ℓ} is indeed a uniform mesh.
Example 6 We next consider the two-dimensional SDE driven by a one-dimensional Wiener process

    dX_t = a(t, X_t; ξ) dt + b(t, X_t; ξ) dW_t,   X_0 = [1, ξ]^T,                     (46)

with the low-regularity drift coefficient a(t, x) = [ r |t − x^{(2)}|^{−p} x^{(1)}, 0 ]^T, interest rate r = 1/5, volatility b(t, x) = [ σ x^{(1)}, 0 ]^T with σ = 0.5, and observable g(x) = x^{(1)}, at the final time T = 1. The ξ in the initial condition is distributed as ξ ~ U(1/4, 3/4) and is independent of the Wiener process W. Three different blow-up exponent test cases are considered, p ∈ {1/2, 2/3, 3/4}, and to avoid blow-ups in the numerical integration of the drift function component f(t; ξ) := |t − ξ|^{−p}, we replace the fully explicit Euler–Maruyama integration scheme with the following semi-implicit scheme:

    X̄_{t_{n+1}} = X̄_{t_n} + r f(t_n; ξ) X̄_{t_n} Δt_n + σ X̄_{t_n} ΔW_n,       if f(t_n; ξ) < 2 f(t_{n+1}; ξ),
    X̄_{t_{n+1}} = X̄_{t_n} + r f(t_{n+1}; ξ) X̄_{t_n} Δt_n + σ X̄_{t_n} ΔW_n,   else,                              (47)

where we have dropped the superscript for the first component of the SDE, writing out only the first component, since the evolution of the second component is trivial. For p ∈ [1/2, 3/4] it may be shown that, for any singularity point ξ, any path integrated by the scheme (47) will have at most one drift-implicit integration step. The reference mean for the exact solution is given by

    E[X_T^{(1)}] = 2 ∫_{1/4}^{3/4} exp( r ( x^{1−p} + (1 − x)^{1−p} ) / (1 − p) ) dx,

and in the numerical experiments, we approximate this integral value by quadrature to the needed accuracy.
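A minimal sketch of such a quadrature evaluation is given below, using the reference-mean integral as reconstructed above with r = 1/5; the choice of scipy.integrate.quad is ours and is not prescribed by the paper.

```python
# Sketch: quadrature evaluation of the Example 6 reference mean,
#   E[X_T] = 2 * int_{1/4}^{3/4} exp( r*(x^{1-p} + (1-x)^{1-p}) / (1-p) ) dx.
from scipy.integrate import quad
import numpy as np

def reference_mean(p, r=0.2):
    integrand = lambda x: np.exp(r * (x ** (1 - p) + (1 - x) ** (1 - p)) / (1 - p))
    value, _ = quad(integrand, 0.25, 0.75)
    return 2.0 * value

for p in (0.5, 2.0 / 3.0, 0.75):
    print(p, reference_mean(p))
```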
The MSE Expansion for the Adaptive Algorithm
Due to the low-regularity drift present in this problem, the resulting MSE expansion
will also contain drift-related terms that formally are of higher order. From the proof
of Theorem 2, Eq. (59), we conclude that, to leading order the MSE is bounded by

Fig. 7 (Top) One MSE adaptive numerical realization of the SDE problem (46) at different mesh hierarchy levels. The blow-up singularity point is located at ξ ≈ 0.288473 and the realizations are computed for three singularity exponent values. We observe that as the exponent p increases, the jump at t = ξ becomes more pronounced. (Bottom) Corresponding MSE adaptive mesh realizations for the different test cases

    E[ |X̄_T − X_T|² ] ⪅ E[ Σ_{n=0}^{N−1} ( N (a_t + a_x a)²(t_n, X̄_{t_n}; ξ) Δt_n² + (b_x b)²(t_n, X̄_{t_n}; ξ) ) φ̄_{x,n}² Δt_n² / 2 ].
This is the error expansion we use for the adaptive mesh refinement (in Algorithm 1)
in this example. In Fig. 7, we illustrate the effect that the singularity exponent, p, has
on SDE and adaptive mesh realizations.
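To make the use of this expansion concrete, the following sketch evaluates per-interval error indicators of the form r_n = ( N (a_t + a_x a)² Δt_n² + (b_x b)² ) φ̄_{x,n}² Δt_n²/2, as in the reconstructed bound above. The numerical path, the dual weights φ̄_{x,n}, and the coefficient derivatives are assumed to be supplied by the surrounding adaptive algorithm; all names are ours.

```python
# Sketch: error indicators for adaptive refinement from the MSE expansion above.
import numpy as np

def error_indicators(t, phi_x, a_t, a_x, a, b_x, b):
    """t has length N+1; all other arrays have length N (values at t_0,...,t_{N-1})."""
    dt = np.diff(t)                              # dt_n = t_{n+1} - t_n
    N = dt.size
    drift_term = N * (a_t + a_x * a) ** 2 * dt ** 2   # formally higher-order drift part
    diff_term = (b_x * b) ** 2                         # leading-order diffusion part
    return (drift_term + diff_term) * phi_x ** 2 * dt ** 2 / 2.0

# Conceptual refinement rule: repeatedly split the interval with the largest
# indicator until the indicators meet their share of the MSE tolerance.
```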
Implementation Details and Observations
Computational tests for the uniform and MSE adaptive MLMC algorithms are implemented with the input parameters summarized in Table 2. The weak convergence
rate, , which is needed in the MLMC implementations stopping criterion (44), is
estimated experimentally as (p) = (1 p) when using the EulerMaruyama integrator with uniform time steps, and roughly = 1 when using the EulerMaruyama
integrator with adaptive time steps, (cf. Fig. 8). We further estimate the variance convergence rate to (p) = 2(1 p), when using uniform time-stepping, and roughly

Fig. 8 (Top) Average errors |E[Δ_ℓ g]| for Example 6 solved with the MSE adaptive MLMC algorithm for three singularity exponent values. (Bottom) Corresponding average errors for the uniform MLMC algorithm

We further estimate the variance convergence rate to β(p) = 2(1 − p) when using uniform time-stepping, and roughly β = 1 when using MSE adaptive time-stepping (cf. Fig. 9). The low weak convergence rate for uniform MLMC implies that the number of levels L in the MLMC estimator will become very large, even with fairly high tolerances. Since computations of realizations on high levels are extremely costly, we have, for the sake of computational feasibility, chosen a very low value, M̂ = 20, for the initial number of samples in both MLMC algorithms. The respective estimators' use of samples M_ℓ (cf. Fig. 10) shows that the low number of initial samples is not strictly needed for the adaptive MLMC algorithm, but for the sake of fair comparisons, we have chosen to use the same parameter values in both algorithms.
From the rate estimates of α and β, we predict the computational cost of reaching the approximation goal (37) for the respective MLMC algorithms to be

    Cost_adp(A_ML) = O( log(TOL)^4 TOL^{−2} )   and   Cost_unf(A_ML) = O( TOL^{−1/(1−p)} ),

obtained by using the estimate (45) and Theorem 1, respectively.

Fig. 9 (Top) Variances Var(Δ_ℓ g) for Example 6 solved with the MSE adaptive MLMC algorithm for three singularity exponent values. (Bottom) Corresponding variances for the uniform MLMC algorithm. The more noisy data on the highest levels is due to the low number used for the initial samples, M̂ = 20, and only a subset of the generated 100 multilevel estimator realizations reached the last levels

These predictions fit well with the observed computational runtime for the respective MLMC algorithms (cf. Fig. 11). Lastly, we observe in Fig. 12 that the numerical results are consistent with both algorithms fulfilling the goal (37).
Computer Implementation
The computer code for all algorithms was written in Java and used the Stochastic Simulation in Java (SSJ) library to sample the random variables in parallel from thread-independent MRG32k3a pseudo-random number generators [24]. The experiments were run on multiple threads on Intel Xeon(R) CPU X5650, 2.67 GHz processors, and the computer graphics were made using the open source plotting library Matplotlib [18].
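The paper's parallel sampling relies on SSJ's MRG32k3a substreams in Java. As a rough analogue only, the sketch below spawns independent NumPy random streams, one per worker process; this is our substitute setup, not the authors' implementation.

```python
# Sketch: one independent random stream per worker, analogous in spirit to the
# thread-independent MRG32k3a substreams used in the paper's Java code.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def batch_of_samples(seed_seq, n):
    rng = np.random.default_rng(seed_seq)    # independent stream for this worker
    return rng.standard_normal(n).mean()     # placeholder for Delta_l g sampling

if __name__ == "__main__":
    children = np.random.SeedSequence(2014).spawn(8)   # 8 independent streams
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(batch_of_samples, children, [10_000] * 8))
    print(results)
```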

Fig. 10 (Top) Average number of samples M_ℓ for Example 6 solved with the MSE adaptive MLMC algorithm for three singularity exponent values. (Bottom) Corresponding average number of samples for the uniform MLMC algorithm. The plotted decay rate reference lines, c·2^{−((β(p)+1)/2)ℓ}, for M_ℓ follow implicitly from Eq. (43) (assuming that β(p) = 2(1 − p) is the correct variance decay rate)

Fig. 11 Average runtime versus TOL^{−1} for the two MLMC algorithms for three singularity exponent values in Example 6

Fig. 12 Approximation errors for both of the MLMC algorithms solving Example 6. At every TOL value, circles and triangles represent the errors from 100 independent multilevel estimator realizations of the respective algorithms

5 Conclusion
We have developed an a posteriori MSE adaptive Euler–Maruyama time-stepping
algorithm and incorporated it into an MSE adaptive MLMC algorithm. The MSE
error expansion presented in Theorem 2 is fundamental to the adaptive algorithm.
Numerical tests have shown that MSE adaptive time-stepping may outperform uniform time-stepping, both in the single-level MC setting and in the MLMC setting,
(Examples 4 and 6). Due to the complexities of implementing adaptive time-stepping,
the numerical examples in this work were restricted to quite simple, low-regularity
SDE problems with singularities in the temporal coordinate. In the future, we aim
to study SDE problems with low-regularity in the state coordinate (preliminary tests
and analysis do however indicate that then some ad hoc molding of the adaptive
algorithm is required).
Although a posteriori adaptivity has proven to be a very effective method for
deterministic differential equations, the use of information from the future of the
numerical solution of the dual problem makes it a somewhat unnatural method to extend to Itô SDEs: it can result in numerical solutions that are not F_t-adapted, which consequently may introduce a bias in the numerical solutions. Reference [7] provides an example of a failing adaptive algorithm for SDEs. A rigorous analysis of the
convergence properties of our developed MSE adaptive algorithm would strengthen
the theoretical basis of the algorithm further. We leave this for future work.


Acknowledgments This work was supported by King Abdullah University of Science and Technology (KAUST); by Norges Forskningsrd, research project 214495 LIQCRY; and by the University
of Texas, Austin Subcontract (Project Number 024550, Center for Predictive Computational Science). The first author was and the third author is a member of the Strategic Research Initiative on
Uncertainty Quantification in Computational Science and Engineering at KAUST (SRI-UQ). The
authors would like to thank Arturo Kohatsu-Higa for his helpful suggestions for improvements in
the proof of Theorem 2.

Theoretical Results

Error Expansion for the MSE in 1D

In this section, we derive a leading-order error expansion for the MSE (12) in the 1D setting, when the drift and diffusion coefficients are mappings of the form a : [0, T] × R → R and b : [0, T] × R → R. We begin by deriving a representation of the MSE in terms of products of local errors and weights.
Recalling the definition of the flow map, φ(t, x) := g(X_T^{x,t}), and the first variation of the flow map and the path itself given in Sect. 2.1.1, we use the Mean Value Theorem to deduce that

    g(X_T) − g(X̄_T) = φ(0, x_0) − φ(t_N, X̄_{t_N})
                    = Σ_{n=0}^{N−1} ( φ(t_n, X̄_{t_n}) − φ(t_{n+1}, X̄_{t_{n+1}}) )
                    = Σ_{n=0}^{N−1} ( φ(t_{n+1}, X_{t_{n+1}}^{X̄_{t_n}, t_n}) − φ(t_{n+1}, X̄_{t_{n+1}}) )            (48)
                    = Σ_{n=0}^{N−1} φ_x(t_{n+1}, X̄_{t_{n+1}} + s_n e_n) e_n,

where the local error is given by e_n := X_{t_{n+1}}^{X̄_{t_n}, t_n} − X̄_{t_{n+1}} and s_n ∈ [0, 1]. Itô expansion of the local error gives the following representation:


    e_n = ∫_{t_n}^{t_{n+1}} ( a(t, X_t^{X̄_{t_n}, t_n}) − a(t_n, X̄_{t_n}) ) dt  +  ∫_{t_n}^{t_{n+1}} ( b(t, X_t^{X̄_{t_n}, t_n}) − b(t_n, X̄_{t_n}) ) dW_t
              (=: Δa_n)                                                            (=: Δb_n)

        = ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{t} ( a_t + a_x a + (a_xx/2) b² )(s, X_s^{X̄_{t_n}, t_n}) ds dt      (=: ā_n)
        + ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{t} ( a_x b )(s, X_s^{X̄_{t_n}, t_n}) dW_s dt                        (=: â_n)
        + ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{t} ( b_t + b_x a + (b_xx/2) b² )(s, X_s^{X̄_{t_n}, t_n}) ds dW_t    (=: b̄_n)
        + ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{t} ( b_x b )(s, X_s^{X̄_{t_n}, t_n}) dW_s dW_t.                     (=: b̂_n)      (49)


By Eq. (48) we may express the MSE as the following squared sum:

    E[ ( g(X_T) − g(X̄_T) )² ] = E[ ( Σ_{n=0}^{N−1} φ_x(t_{n+1}, X̄_{t_{n+1}} + s_n e_n) e_n )² ]
        = Σ_{n,k=0}^{N−1} E[ φ_x(t_{k+1}, X̄_{t_{k+1}} + s_k e_k) φ_x(t_{n+1}, X̄_{t_{n+1}} + s_n e_n) e_k e_n ].

This is the first step in deriving the error expansion in Theorem 2. The remaining steps follow in the proof below.

Proof of Theorem 2. The main tools used in proving this theorem are Taylor and Itô–Taylor expansions, Itô isometry, and truncation of higher order terms. For errors attributed to the leading-order local error term b̂_n (cf. Eq. (49)), we do detailed calculations, and the remainder is bounded by the stated higher order terms.
We begin by noting that under the assumptions in Theorem 2, Lemmas 1 and 2 respectively verify the existence and uniqueness of the solution of the SDE, X, and of the numerical solution, X̄, and provide higher order moment bounds for both. Furthermore, due to the assumption of the mesh points being stopping times for which t_n is F_{t_{n−1}}-measurable for all n, it follows also that the numerical solution is adapted to the filtration, i.e., X̄_{t_n} is F_{t_n}-measurable for all n.
We further need to extend the flow map and the first variation notation from Sect. 2.1.1. Let X̄_{t_n}^{x,t_k}, for n ≥ k, denote the numerical solution of the Euler–Maruyama scheme

    X̄_{t_{j+1}}^{x,t_k} = X̄_{t_j}^{x,t_k} + a(t_j, X̄_{t_j}^{x,t_k}) Δt_j + b(t_j, X̄_{t_j}^{x,t_k}) ΔW_j,   j ≥ k,         (50)

with initial condition X̄_{t_k}^{x,t_k} = x. The first variation of X̄_{t_n}^{x,t_k} is denoted by ∂_x X̄_{t_n}^{x,t_k}. Provided that E[|x|^{2p}] < ∞ for all p ∈ N, that x is F_{t_k}-measurable, and that the assumptions of Lemma 2 hold, it is straightforward to extend the proof of the lemma to verify that (X̄^{x,t_k}, ∂_x X̄^{x,t_k}) converges strongly to (X^{x,t_k}, ∂_x X^{x,t_k}) for t ∈ [t_k, T],

    max_{k≤n≤N} E[ |X̄_{t_n}^{x,t_k} − X_{t_n}^{x,t_k}|^{2p} ]^{1/(2p)} ≤ C N^{−1/2},   p ∈ N,
    max_{k≤n≤N} E[ |∂_x X̄_{t_n}^{x,t_k} − ∂_x X_{t_n}^{x,t_k}|^{2p} ]^{1/(2p)} ≤ C N^{−1/2},   p ∈ N,

and

    max_{k≤n≤N} max( E[ |X̄_{t_n}^{x,t_k}|^{2p} ], E[ |∂_x X̄_{t_n}^{x,t_k}|^{2p} ] ) < ∞,   p ∈ N.          (51)


In addition to this, we will also make use of moment bounds for the second and third variations of the flow map in the proof, i.e., φ_xx(t, x) and φ_xxx(t, x). The second variation is described in the section "Variations of the Flow Map", where it is shown in Lemma 3 that, provided that x is F_t-measurable and E[|x|^{2p}] < ∞ for all p ∈ N, then

    max( E[|φ_xx(t, x)|^{2p}], E[|φ_xxx(t, x)|^{2p}], E[|φ_xxxx(t, x)|^{2p}] ) < ∞,   for all p ∈ N.
Considering the MSE error contribution from the leading-order local error terms b̂_n, i.e.,

    E[ φ_x(t_{k+1}, X̄_{t_{k+1}} + s_k e_k) φ_x(t_{n+1}, X̄_{t_{n+1}} + s_n e_n) b̂_k b̂_n ],          (52)

we have, for k = n,

    E[ ( φ_x(t_{n+1}, X̄_{t_{n+1}}) + φ_xx(t_{n+1}, X̄_{t_{n+1}} + s_n e_n) s_n e_n )² b̂_n² ] = E[ φ_x(t_{n+1}, X̄_{t_{n+1}})² b̂_n² ] + o(Δt_n²).

The above o(Δt_n²) follows from Young's and Hölder's inequalities,


 

8 2n
E 2x tn+1 , X tn+1 xx tn+1 , X tn+1 + sn en sn en b

#
$
 
8 4n
 
2 3
e2n b
C E x tn+1 , X tn+1 xx tn+1 , X tn+1 + sn en tn + E
tn3

 
 
2  3
C E E x tn+1 , X tn+1 xx tn+1 , X n+1 + sn en
Ftn tn



 2 4
 2 4
 6
| b
| 2n b
8n
8 4n
8n
8
8 n b
a
a
b
b
n
n
+E
+
E
+
E
+
E
tn3
tn3
tn3
tn3
3
3

!




1 

| 4n |Ftn 1 + E E a
8 4n |Ftn
C E tn3 +
E E a
tn
tn
3
3 
3




1 
1 "
4
4
8
| |Ft 1 + E E b
8
8
+ E E b
|F
E
E
|F
b
t
t
n
n
n
n
n
n
tn
tn
tn5


= E o(tn2 )
(53)
where the last inequality is derived by applying the moment bounds for multiple
It integrals described in [22, Lemma 5.7.5] and under the assumptions (R.1), (R.2),
(M.1), (M.2) and (M.3). This yields





axx 2 4

X tn ,tn 
b  (s, Xs
sup at + ax a +
)  Ftn tn8 ,
2
s[tn ,tn+1 )





4

4
X
,t
8 n |Ftn CE
sup |ax b| (s, Xs tn n )  Ftn tn6 ,
E a


| 4n |Ftn CE
E a

s[tn ,tn+1 )






bxx 2 4
X tn ,tn 

b (s, Xs
sup bt + bx a +
)  Ftn tn6 ,
2 
s[tn ,tn+1 )




4

4
X
,t
8 n |Ftn CE
sup |bx b| (s, Xs tn n )  Ftn tn4 ,
E b


4
| |Ft CE
E b
n
n


8
8 n |Ftn CE
E b

s[tn ,tn+1 )

sup

s[tn ,tn+1 )

|bx b|


(s, XsX tn ,tn )  Ftn

(54)


tn8 .

And by similar reasoning,





2

8 2n CE tn4 .
E xx X tn+1 + sn en , tn+1 sn2 e2n b
For achieving independence between forward paths and dual solutions in the expectations, an ItTaylor expansion of x leads to the equality



2



8 2n = E x tn+1 , X tn 2 b
8 2n + o tn2 .
E x tn+1 , X tn+1 b
Introducing the null set completed -algebra


,n = ({Ws }0stn ) ({Ws Wtn+1 }tn+1 sT ) (X0 ),
F

2
,n measurable by construction, (cf. [27, Appenwe observe that x tn+1 , X tn is F
dix B]). Moreover, by conditional expectation,



2
 2 n
,
8 2n = E x tn+1 , X tn 2 E b
8 n |F
E x tn+1 , X tn b


 2

2
tn2
2
+ o tn ,
= E x tn+1 , X tn (bx b) (tn , X tn )
2
where the last equality follows from using Its formula,
$
 t #
b2 2 
X ,t
2
t + ax + x (bx b) (s, Xs tn n ) ds
2
tn
 t

X ,t
bx (bx b)2 (s, Xs tn n ) dWs , t [tn , tn+1 ),
+

X ,t
(bx b)2 (t, Xt tn n ) = (bx b)2 (tn , X tn ) +

tn


to derive that


2
n
,
8
E bn |F = E

tn+1


tn

tn

(bx b)(s, XsX tn ,tn )dWs

dWt



X tn



(bx b)2 (tn , X tn ) 2
tn + o tn2 .
2
 2
Here, the higher order o tn terms are bounded in a similar fashion as the terms in
inequality (53), by using [22, Lemma 5.7.5].
For the terms in (52) for which k < n, we will show that
=

N1



 
N1



8n =
8 k b
E x tk+1 , X tk+1 + sk ek x tn+1 , X tn+1 + sn en b
E o tn2 ,
n=0

k,n=0

(55)
which means that the contribution to the MSE from these terms is negligible to
leading order. For the use in later expansions, let us first observe by use of the chain
rule that for any Ftn -measurable y with bounded second moment,
x (tk+1 , y) = g (XT

y,tk+1
Xt

y,tk+1

)x XT

y,tk+1

+sm ek ,tk+1

Xt

= g (XT k+1
)x XT n+1


y,t
y,t
= x tn+1 , Xtn+1k+1 x Xtn+1k+1 ,

,tn+1

y,t

x Xtn+1k+1

and that
Xt

x Xtn+1k+1

+sk ek ,tk+1


+

Xt

= x Xtn k+1
tn+1

+sk ek ,tk+1
X tk+1 +sk ek ,tk+1

ax (s, Xs

tn

tn+1

X tk+1 +sk ek ,tk+1

)x Xs

X tk+1 +sk ek ,tk+1

bx (s, Xs

ds

X tk+1 +sk ek ,tk+1

)x Xs

dWs .

tn

We next introduce the -algebra


,k,n := ({Ws }0st ) ({Ws Wt }t st ) ({Ws Wt }t sT ) (X0 ),
F
n
n+1 n+1
k+1 k+1
k

,k,n and ItTaylor expand the x functions in (55) about center points that are F
measurable:



X t +sk ek ,tk+1
X t +sk ek ,tk+1
x Xtn+1k+1
x tk+1 , X tk+1 + sk ek = x tn+1 , Xtn+1k+1


 
X t +sk ek ,tk+1
X tk ,tk+1
X tk ,tk+1
X t ,tk+1
+ xx tn+1 , Xtn
Xtn+1k+1
= x tn+1 , Xtn
Xtn k



X t +sk ek ,tk+1
X t ,tk+1
Xtn+1k+1
Xtn k


X t ,tk+1
+ xxx tn+1 , Xtn k
2

X t +sk ek ,tk+1
X t ,tk+1
+ xxxx tn+1 , (1 sn )Xtn k
+ sn Xtn+1k+1
Xt

(Xtn+1k+1

+sk ek ,tk+1

X t ,tk+1 2 

Xtn k


X t ,tk+1
X t ,tk+1
x Xtn k
+ xx Xtn k
(a(tk , X tk )tk + b(tk , X tk )Wk + sk ek )
X t +`sk (a(tk ,X tk )tk +(b(tk ,X tk )Wk +sk ek ),tk+1

+ xxx Xtn k


(a(tk , X tk )tk + b(tk , X tk )Wk + sk ek )2


2

tn+1

X tk+1 +sk ek ,tk+1

ax (s, Xs

X tk+1 +sk ek ,tk+1

)x Xs

ds

tn

tn+1

X t +sk ek ,tk+1
X t +sk ek ,tk+1
bx (s, Xs k+1
)x Xs k+1
dWs


,

(56)

tn

where
Xt

k+1
Xtn+1

+sk ek ,tk+1

 t
n+1
tn

X t ,tk+1

Xtn k

X tk+1 +sk ek ,tk+1

a(s, Xs

)ds +

 t
n+1
tn

X t +sk (a(tk ,X tk )tk +b(tk ,X tk )Wk +sk ek ),tk+1

+ x Xtn k

X tk+1 +sk ek ,tk+1

b(s, Xs

)dWs

(a(tk , X tk )tk + b(tk , X tk )Wk + sk ek ),

and



X t ,tk+1
x tn+1 , X tn+1 + sn en = x tn+1 , X tn k

+ xx

X t ,tk+1
tn+1 , X tn k


X k ,tk+1
k,n + xxx tn+1 , X n

2
k,n

2

3
k,n
X t ,tk+1
,
+ xxxx tn+1 , (1 sn )X tn k
+ sn (X tn+1 + sn en )
6

(57)

with
k,n := a(tn , X tn )tn + b(tn , X tn )Wn + sn en
X t +sk (a(tk ,X tk )tk +b(tk ,X tk )Wk ),tk+1

+ x X tn k

(a(tk , X tk )tk + b(tk , X tk )Wk + sk ek ).

Plugging the expansions (56) and (57) into the expectation


 



8n ,
8 k b
E x tk+1 , X k+1 + sk ek x tn+1 , X n+1 + sn en b
the summands in the resulting expression that only contain products of the first
variations vanishes,


 
X t ,tk+1
X t ,tk+1
X t ,tk+1
8 k b
8n
x Xtn k
x tn+1 , X tn k+1
E x tn+1 , Xtn k
b





,k,n x tn+1 , XtX tk ,tk+1 x XtX tk ,tk+1 x tn+1 , X X tk ,tk+1


8 n b
8 k |F
= 0.
= E E b
tn
n
n
One can further deduce that all of the the summands in which the product of multiple
8 k and b
8 n are multiplied only with one additional It integral of
It integrals b
first-order vanish by using the fact that the inner product of the resulting multiple
It integrals is zero, cf. [22, Lemma 5.7.2], and by separating the first and second
variations from the It integrals by taking a conditional expectation with respect to
the suitable filtration. We illustrate this with a couple of examples,


 
X t ,tk+1
X t ,tk+1
X t ,tk+1
8n
8 k b
xx Xtn k
b(tk , X tk )Wk x tn+1 , X tn k
E x tn+1 , Xtn k
b

 
X tk ,tk+1
X tk ,tk+1
X t ,tk+1
8k
xx Xtn
= E x tn+1 , Xtn
b(tk , X tk )Wk x tn+1 , X tn k
b



,n = 0,
8 n |F
E b
and
 


X tk ,tk+1
X tk ,tk+1
X tk ,tk+1
8
8
E x tn+1 , Xtn
x Xtn
b(tn , X tn )Wn x tn+1 , X tn
bk bn


 


X tk ,tk+1
X tk ,tk+1
n
,
8
8
x tn+1 , X tn
= 0.
= E x tn+1 , Xtn+1
bk b(tn , X tn )E bn Wn |F
From these observations, assumption (M.3), inequality (54), and, when necessary,
,k additional expansions of integrands to render the leading order integrand either F
n
,
or F -measurable and thereby sharpen the bounds (an example of such an expansion is
 tn+1  t
8n =
(bx b)(s, XsX tn ,tn )dWs dWt
b


tn

tn+1

=
tn

tn

t
tn

X t ,tk+1
k

(bx b) s, Xs tn

,tn

dWs dWt + h.o.t.).


We derive after a laborious computation which we will not include here that
 
E x tk+1 , X t

k+1

 



8n 
8 k b
+ sk ek x tn+1 , X tn+1 + sn en b

C N 3/2 E tk2 E tn2 .

This further implies that

N1

 



8 k b
8n
E x tk+1 , X tk+1 + sk ek x tn+1 , X tn+1 + sn en b

k,n=0,k=n

E tk2 E tn2

N1

C N 3/2

k,n=0,k=n

N1
1

C N 3/2

E tn2

n=0

C N 1/2

N1


E tn2 ,

n=0

such that inequality (55) holds.


So far, we have shown that
#
$2
N1


8n
E
x tn+1 , X tn+1 + sn en b
n=0

N1


2 (bx b)2


=E
(tn , X tn )tn2 + o tn2 . (58)
x tn+1 , X tn
2
n=0

| n , can also be
| n , a
8 n and b
The MSE contribution from the other local error terms, a
m,n
,
bounded using the above approach with ItTaylor expansions, F -conditioning
and It isometries. This yields that


 

| k a
|n
E x tk+1 , X tk+1 + sk ek x tn+1 , X tn+1 + sn en a


 
  at + ax a + axx b2 /2 
(tk , X tk )
= E x X tk , tk x tn , X tn
2

 a + a a + a b2 /2 


t
x
xx
(tn , X tn )tk2 tn2 + o tk2 tn2 ,
2

(59)


 



8n
8 k a
E x tk+1 , X tk+1 + sk ek x tn+1 , X tn+1 + sn en a




E x tn , X t 2 (ax b)2 (tn , X t )t 3 + o t 3 , if k = n,
n
n
n
n
2

=


 
O N 3/2 E t 3 E t 3 1/2 ,
if k = n,
n
k
and


 

| k b
|n
E x tk+1 , X tk+1 + sk ek x tn+1 , X tn+1 + sn en b




E x tn , X t 2 (bt +bx a+bxx b2 /2)2 (tn , X t )t 3 + o t 3 , if k = n,
n
n
n
n
3

=


 
O N 3/2 E t 3 E t 3 1/2 ,
if k = n.
n
k
Moreover, conservative bounds for error contributions involving products of different
| k b
8 n , can be induced from the above bounds and Hlders
local error terms, e.g., a
inequality. For example,





N1








E
|
8

a
b
t

,
X
+
s
e

,
X
+
s
e
t
x
t
x
t
n
n
n
n+1
k+1
k
k
k
n+1


k+1

 k,n=0






N1
N1








| k
8 n 
= E
x tk+1 , X tk+1 + sk ek a
x tn+1 , X tn+1 + sn en b



k=0
k=0
'
(
2
(



( N1

(
| k
)E
x tk+1 , X t
+ sk ek a

k+1

k=0

'
(
2
(

( N1



8 n
(
x tn+1 , X tn+1 + sn en b

)E
n=0

= O N 1/2

N1

E tn2 .

n=0

The proof is completed in two replacement


steps
to x on the right-hand

 applied

side of equality (58). First, we replace x tn+1 , X tn by x tn , X tn . Under the regularity
assumed in this theorem, the replacement is possible without introducing additional
leading order error terms as








X ,t
X ,t
X ,t
X ,t 
E |x tn+1 , X tn x tn , X tn | = E g (XT tn n+1 )x XT tn n+1 g (XT tn n )x XT tn n 




X ,t
X ,t
X ,t
E (g (XT tn n+1 ) g (XT tn n ))x XT tn n+1 



X ,t
X ,t
X ,t 
+ E g (XT tn n )(x XT tn n+1 x XT tn n )


= O N 1/2 .


Here, the last equality follows from the assumptions (M.2), (M.3), (R.2), and (R.3),
and Lemmas 1 and 2,



X ,t
X ,t
X ,t 

E  g (XT tn n+1 ) g (XT tn n ) x XT tn n+1 
'
( 
2 
( 

2 

X tn ,tn
Xtn+1
,tn+1 
(  X tn ,tn+1
X tn ,tn+1 


)
C E XT
XT
 E x XT




4 1/4


X tn ,tn
(1sn )X tn +sn Xtn+1 ,tn+1 

C E x XT



# 

E 

tn+1
tn

a(s, XsX tn ,tn )ds


+

tn+1

tn

 
C E sup |a(s, XsX tn ,tn )|4 tn4 +
tn stn+1



= O N 1/2 ,

4 $1/4


b(s, XsX tn ,tn )dWs 

sup |b(s, XsX tn ,tn )|4 tn2

1/4

tn stn+1

and that
' 
( 

2 

( 
  X tn ,tn
X tn ,tn+1
X tn ,tn 
X ,t
X ,t 

E g (XT
)(x XT
x XT
) C )E x XT tn n+1 x XT tn n 
'
( 
2
X ,tn
( 

Xt tn ,tn+1
( 
X ,t
X tn ,tn 
= C )E x XT tn n+1 x XT n+1
x Xtn+1



' 

(
X ,tn
( 
Xt tn ,tn+1 
X ,t
C )E x XT tn n+1 x XT n+1



'
( 
 tn+1
 tn+1
X ,tn
( 
Xt tn ,tn+1
( 
X ,t
X ,t
+ )E x XT n+1
ax (s, Xs tn n )ds +
bx (s, Xs tn n )dWs

tn
tn

2





'
( 
 tn+1
 tn+1
X tn ,tn
( 
(1sn )X tn +sn Xtn+1
,tn+1
( 
X ,t
X ,t
C )E xx XT
ax (s, Xs tn n )ds +
bx (s, Xs tn n )dWs

tn
tn

2







+ O N 1/2


= O N 1/2 .



The last step is to replace the first variation of the exact path x tn , X tn with the
X t ,tn

first variation of the numerical solution x,n = g (X T )x X T n . This is also possible


without introducing additional leading order error terms by the same assumptions
and similar bounding arguments as in the two preceding bounds as









X t ,tn
X ,t
X ,t 
E x,n x tn , X tn  = E g (X T )x X T n g (XT tn n )x XT tn n 







X t ,tn
X ,t 
X ,t  
X ,t 

E |g (X T )| x X T n x XT tn n  + E g (X T ) g (XT tn n ) x XT tn n 


= O N 1/2 .


Variations of the Flow Map

The proof of Theorem 2 relies on bounded moments of variations of order up to four of the flow map φ. Furthermore, the error density depends explicitly on the first variation. In this section, we will verify that these variations are indeed well-defined random variables with all required moments bounded. First, we present the proof of Lemma 1. Having proven Lemma 1, we proceed to present how essentially the same technique can be used in an iterative fashion to prove the existence, pathwise uniqueness and bounded moments of the higher order variations. The essentials of this procedure are presented in Lemma 3.
First, let us define the following set of coupled SDEs
(1)

dYu

(2)

(1)

(1)

=a(u, Yu )du + b(u, Yu )dWu ,


(1)

(2)

(1)

(2)

=ax (u, Yu )Yu du + bx (u, Yu )Yu dWu ,





(3)
(1)
(2) 2
(1) (3)
dYu = axx (u, Yu ) Yu
du
+ ax (u, Yu )Yu



(1)
(2) 2
(1) (3)
+ bxx (u, Yu ) Yu
dWu ,
+ bx (u, Yu )Yu



(4)
(1)
(2) 3
(1) (2) (3)
(1) (4)
du
dYu = axxx (u, Yu ) Yu
+ 3axx (u, Yu )Yu Yu + ax (u, Yu )Yu



(1)
(2) 3
(1) (2) (3)
(1) (4)
+ bxxx (u, Yu ) Yu
dWu ,
+ 3bxx (u, Yu )Yu Yu + bx (u, Yu )Yu





(5)
(1)
(2) 4
(1)
(2) 2 (3)
du
dYu = axxxx (u, Yu ) Yu
+ 6axxx (u, Yu ) Yu
Yu

 

(1)
(3) 2
(2) (4)
(1) (5)
+ axx (u, Yu ) 3 Yu
+ ax (u, Yu )Yu
du
+ 4Yu Yu





(1)
(2) 4
(1)
(2) 2 (3)
+ bxxxx (u, Yu ) Yu
dWu
+ 6bxxx (u, Yu ) Yu
Yu

 

(1)
(3) 2
(2) (4)
(1) (5)
+ bx (u, Yu )Yu
dWu ,
+ bxx (u, Yu ) 3 Yu
+ 4Yu Yu

dYu

(60)


defined for u (t, T ] with the initial condition Yt = (x, 1, 0, 0, 0). The first component of the vector coincides with Eq. (13), whereas the second one is the first variation
of the path from Eq. (16). The last three components can be understood as the second,
third and fourth variations of the path, respectively.
Making use of the solution of SDE (60), we also define the second, third and
fourth variations as
xx (t, x) = g (XTx,t )xx XTx,t + g (XTx,t )(x XTx,t )2 ,
xxx (t, x) = g (XTx,t )xxx XTx,t + + g (XTx,t )(x XTx,t )3 ,
xxxx (t, x) = g

(XTx,t )xxxx XTx,t

+ + g



(61)

(XTx,t )(x XTx,t )4 .

In the sequel, we prove that the solution to Eq. (60) when understood in the integral
sense that extends (13) is a well defined random variable with bounded moments.
Given sufficient differentiability of the payoff g, this results in the boundedness of
the higher order variations as required in Theorem 2.
Proof of Lemma 1. By writing (Ys(1) , Ys(2) ) := (Xsx,t , x Xsx,t ), (13) and (16) together
form an SDE:
dYs(1) = a(s, Ys(1) )ds + b(s, Ys(1) )dWs
(62)
dYs(2) = ax (s, Ys(1) )Ys(2) ds + bx (s, Ys(1) )Ys(2) dWs
for s (t, T ] and with initial condition Yt = (x, 1). As before, ax stands for the
partial derivative of the drift function with respect to its spatial argument. We note
that (62) has such a structure that dynamics of Ys(2) depends on Ys(1) , that, in turn, is
independent of Ys(2) . By the Lipschitz continuity of a(s, Ys(1) ) and the linear growth
bound of the drift and diffusion coefficients a(s, Ys(1) ) and b(s, Ys(1) ), respectively,
there exists a pathwise unique solution of Ys(1) that satisfies

E sup

s[t,T ]

|Ys(1) |2p


< , p N,

(cf. [22, Theorems 4.5.3 and 4.5.4 and Exercise 4.5.5]). As a solution of an It SDE,
XTx,t is measurable with respect to FT it generates.
Note that Theorem [20, Theorem 5.2.5] establishes that the solutions of (62) are
pathwise unique. Kloeden and Platen [22, Theorems 4.5.3 and 4.5.4] note that the
existence and uniqueness theorems for SDEs they present can be modified in order
to account for looser regularity conditions, and the proof below is a case in point.
Our approach below follows closely presentation of Kloeden and Platen, in order to
prove the existence and moment bounds for Ys(2) .
(2)
, n N by
Let us define Yu,n
(2)
Yu,n+1


=
t

(2)
ax (s, Ys(2) )Ys,n
ds


+
t

(2)
bx (s, Ys(2) )Ys,n
dWs ,


(2)
with Yu,1
= 1, for all u [t, T ]. We then have, using Youngs inequality, that

 
2 
 u

(1)
(2)
bx (s, Ys )Ys,n dWs 
+ 2E 
t
t
 u 
 u 
2 
2 


(2) 
(2) 
2(u t)E
ax (s, Ys(1) )Ys,n
 ds + 2E
bx (s, Ys(1) )Ys,n
 ds .

 

 

 (2) 2
E Yu,n+1  2E 

2

(2) 
ax (s, Ys(1) )Ys,n
ds

Boundedness of the partial derivatives of the drift and diffusion terms in (62) gives


 
 (2) 2
E Yu,n+1
 C(u t + 1)E


 (2) 2 
 ds .
1 + Ys,n

By induction, we consequently obtain that




(2) 2
< ,
sup E Yu,n

n N.

tuT

(2)
(2)
(2)
Now, set Yu,n
= Yu,n+1
Yu,n
. Then



 

 (2) 2
E Yu,n
 2E 

u
t


2 


(2)
ax (s, Ys(1) )Ys,n1 ds + 2E 

u
t

2 

(2)
bx (s, Ys(1) )Ys,n1 dWs 


 u 
 
 


(2) 2
(2) 2
2(u t)
E ax (s, Ys(1) )Ys,n1  ds + 2
E bx (s, Ys(1) )Ys,n1  ds
t
t
 u 
 
 (2) 2
C1
E Ys,n1  ds.


Thus, by Grnwalls inequality,



2
E Y (2) 
u,n

C1n1
(n 1)!

(u s)

n1


 
 (2) 2
E Ys,1  ds.


 
 (2) 2
Next, let us show that E Ys,1  is bounded. First,
 

 

 (2) 2
E Yu,1  = E 

(2)
ax (s, Ys(1) )Ys,2
ds

+
t

2 


(3)
bx (s, Ys(1) )Yu,2
dWs 


 
 (2) 2
C(u t + 1) sup E Ys,2
 .
s[t,u]

Consequently, there exists a C R such that



 C n (u t)n
(2) 2

E Yu,n
,
n!


 C n (T t)n
(2) 2

sup E Yu,n
.
n!
u[t,T ]


Define
 (2) 
,
Zn = sup Yu,n
tuT

and note that







(2)
(2) 
ax (s, Ys(1) )Ys,n+1 ax (s, Ys(1) )Ys,n
 ds
t
 u



(2)
(1)
(1)
(2)

+ sup 
bx (s, Ys )Ys,n+1 bx (s, Ys )Ys,n dWs  .

Zn

tuT

Using Doobs and Schwartzs inequalities, as well as the boundedness of ax and bx ,



2 

(2)
(2) 
E ax (s, Ys(1) )Ys,n+1
ax (s, Ys(1) )Ys,n
 ds
t
 T 
2 

(2)
(2) 
+8
E bx (s, Ys(1) )Ys,n+1
bx (s, Ys(1) )Ys,n
 ds


E |Zn |2 2(T t)

C n (T t)n

,
n!
for some C R. Using the Markov inequality, we get




n4 C n (T t)n
.
P Zn > n2
n!
n=1
n=1

The right-hand side of the equation above converges by the ratio test, whereas the
BorelCantelli Lemma guarantees the (almost sure) existence of K N, such that
(2)
Zk < k 2 , k > K . We conclude that Yu,n
converges uniformly in L 2 (P) to the limit
&
(2)
(2)
(2)
Yu = n=1 Yu,n and that since {Yu,n }n is a sequence of continuous and Fu -adapted
processes, Yu(2) is also continuous and Fu -adapted. Furthermore, as n ,
 u

 u
 u


 (3)
(1) (3)
(1) (3) 
(3) 


C
a
(s,
Y
)Y
ds

a
(s,
Y
)Y
ds
Ys,n Ys  ds 0, a.s.,
x
x
s
s,n
s
s


t

and, similarly,





(3)
bx (s, Ys(1) )Ys,n
dWs




bx (s, Ys(1) )Ys(3) dWs 

This implies that (Yu(1) , Yu(2) ) is a solution to the SDE (62).

0, a.s.


Having established that Yu(2) solves the relevant SDE and that it has a finite second
moment, we may follow the principles laid out in [22, Theorem 4.5.4] and show that
all even moments of
 u
 u
ax (t, Ys(1) )Ys(2) ds +
bx (t, Ys(1) )Ys(2) dWs
Yu(2) = +
t

are finite. By Its Lemma, we get that for any even integer l,
 (3) l
Y  =
u

 (2) l2 (2)


Y  Y ax (s, Y (1) )Y (2) ds
s
s
s
s

2
l(l 1)  (2) l2 
Ys
bx (s, Ys(1) )Ys(2) ds
2
t u
 (2) l2 (2) 

Y  Y
+
bx (s, Ys(1) )Ys(2) dWs .
s
s
u

Taking expectations, the It integral vanishes,



l
E Ys(2)  = E



 (2) l2 (2) 



Y  Y
ax (s, Y (1) )Y (2) ds
s



+E
t



l2

l(l 1) Ys(2)  
(1)
(2) 2
bx (s, Ys )Ys
ds .
2

Using Youngs inequality and exploiting the boundedness of ax , we have that





(2) l

C
E Y
u


E |Y2,u |l ds



+E
t



l2
 (2) 2
l(l 1) Ys(2)   
(1)
bx s, Ys Ys
ds .
2

By the same treatment for the latter integral, using that bx is bounded,



(2) l

C
E Y
u


l
E Yu(2)  ds.


l
Thus, by Grnwalls inequality, E Y (2)  < .

Lemma 3 Assume that (R.1), (R.2), and (R.3) in Theorem

2 hold and that for any



fixed t [0, T ] and x is Ft -measurable such that E |x|2p < for all p N. Then,
Eq. (60) has pathwise unique solutions with finite moments. That is,

max

i{1,2,...,5}

 2p
sup E Yu(i) 
< ,

u[t,T ]

p N.


Furthermore, the higher variations as defined by Eq. (61) satisfy are FT -measurable
and for all p N,

@
?
max E |x (t, x)|2p , E |xx (t, x)|2p , E |xxx (t, x)|2p , E |xxxx (t, x)|2p < .
Proof We note that (60) shares with (62) the triangular dependence structure. That
(j) 1
for d1 < 5 has drift and diffusion functions a :
is, the truncated SDE for {Yu }dj=1
(j)
d1
d1
[0, T ] R R and b : [0, T ] Rd1 Rd1 d2 that do not depend on Yu for
j d1 .
This enables verifying existence of solutions for the SDE in stages: first for
(Y (1) , Y (2) ), thereafter for (Y (1) , Y (2) , Y (3) ), and so forth, proceeding iteratively to
add the next component Y (d1 +1) of the SDE. We shall also exploit this structure
for proving the result of bounded moments for each component. The starting point
for our proof is Lemma 1, which guarantees existence, uniqueness and the needed
moment bounds for the first two components Y (1) , and Y (2) . As one proceeds to Y (i) ,
i > 2, the relevant terms in (64) feature derivatives of a and b of increasingly high
order. The boundedness of these derivatives is guaranteed by assumption (R.1).
(3)
, n N by
Defining a successive set of approximations Yu,n
(3)
Yu,n+1


=
t

2

(3)
axx (s, Ys(1) ) Ys(2) + ax (s, Ys(2) )Ys,n
ds
 u

2
(3)
+
bxx (s, Ys(1) ) Ys(2) + bx (s, Ys(2) )Ys,n
dWs ,
t

(3)
= 0, for all u [t, T ]. Let us denote by
with the initial approximation defined by Yu,1


Q=
t

2

axx (s, Ys(1) ) Ys(1) ds +

u
t

2

bxx (s, Ys(1) ) Ys(2) dWs

(63)

(3)
. We then have, using
the terms that do not depend on the, highest order variation Yu,n
Youngs inequality, that



 

2

 (3) 2
E Yu,n+1  3E |Q| + 3E 


2 

 u
(1) (3)

bx (s, Ys )Ys,n dWs 
+ 3E 
t
t
 u 
 u 
2 
2 




(3) 
(3) 
3E |Q|2 + 3(u t)E
ax (s, Ys(1) )Ys,n
 ds + 3E
bx (s, Ys(1) )Ys,n
 ds .
u

2

(3) 
ax (s, Ys(1) )Ys,n
ds

The term Q is bounded by Lemma 1 and the remaining terms can be bounded by
the same methods as in the proof of 1. Using the same essential tools: Youngs
and Doobs inequalities, Grnwalls lemma, Markov inequality and BorelCantelli
(3)
converges. This limit
Lemma, we can establish the existence of a limit to which Yu,n
(3)
is the solution of of Yu , and has bounded even moments through arguments that are
straightforward generalisations of those already presented in the proof of Lemma 1.


Exploiting the moment bounds of Yu(3) and the boundedness of derivatives of g,


we can establish the measurability of the second order variation x (t, x). Repeating
the same arguments in an iterative fashion, we can establish the same properties for

Yu(4) and Yu(5) as well as xx (t, x), xxx (t, x), xxxx (t, x).

Error Expansion for the MSE in Multiple Dimensions

In this section, we extend the 1D MSE error expansion presented in Theorem 2 to the multi-dimensional setting. Consider the SDE

    dX_t = a(t, X_t) dt + b(t, X_t) dW_t,   t ∈ (0, T],   X_0 = x_0,

(64)

where X : [0, T] → R^{d_1}, W : [0, T] → R^{d_2}, a : [0, T] × R^{d_1} → R^{d_1}, and b : [0, T] × R^{d_1} → R^{d_1 × d_2}. Let further x_i denote the ith component of x ∈ R^{d_1}, a^{(i)} the ith component of the drift coefficient, and b^{(i,j)} and b^T denote the (i, j)th element and the transpose of the diffusion matrix b, respectively. (To avoid confusion, this derivation does not make use of any MLMC notation, particularly not the multilevel superscript {ℓ}.)
Using the Einstein summation convention to sum over repeated indices, but not
over the time index n, the 1D local error terms in Eq. (49) generalize into
| (i)
a
n =
8 (i)
a
n =
| (i) =
b
n
8 (i)
b
n =

tn+1

tn

tn

tn+1

tn

tn

 t

tn+1

1
at(i) + ax(i)j a(j) + ax(i)j xk (bbT )(j,k)
2

ax(i)j b(j,k) dWs(k) dt,


(i,j)

bt
tn

tn

tn

tn+1

tn

ds dt,

1
(j)
+ bx(i,j)
a(k) + bx(i,j)
(bbT )(k,) ds dWt ,
k
2 k x
(j)

bx(i,j)
b(k,) dWs() dWt ,
k

where all the above integrand functions in all equations implicitly depend on the
X ,t
X ,t
state argument Xs tn n . In flow notation, at(i) is shorthand for at(i) (s, Xs tn n ).
Under sufficient regularity, a tedious calculation similar to the proof of Theorem 2
verifies that, for a given smooth payoff, g : Rd1 R,
N1



 2
 2
2
E g(XT) g X T
E
n tn + o tn ,
n=0

Construction of a Mean Square Error Adaptive

where
n :=


(i,j)
1
xi ,n (bbT )(k,) (bxk bxT )
(tn , X tn )xj ,n .
2

83

(65)

In the multi-dimensional setting, the ith component of first variation of the flow map,
x = (x1 , x2 , . . . , xd1 ), is given by
 y,t (j)
y,t
xi (t, y) = gxj (XT )xi XT
.
The first variation is defined as the second component to the solution of the SDE,




dYs(1,i) = a(i) s, Ys(1) ds + b(i,j) s, Ys(1) dWs(j)




s, Ys(1) Ys(2,k,j) dWs() ,
dYs(2,i,j) = ax(i)k s, Ys(1) Ys(2,k,j) ds + bx(i,)
k
where s (t, T ] and the initial conditions are given by Yt(1) = x Rd1 , Yt(2) = Id1 ,
with Id1 denoting the d1 d1 identity matrix. Moreover, the extension of the numerical method for solving the first variation of the 1D flow map (23) reads
xi ,n = cx(j)i (tn , X tn ) xj ,n+1 , n = N 1, N 2, . . . 0.

(66)

xi ,N = gxi (X T ),
with the jth component of c : [0, T ] Rd1 Rd1 defined by


(j)
c(j) tn , X tn = X tn + a(j) (tn , X tn )tn + b(j,k) (tn , X tn )Wn(k) .
Let U and V denote subsets of Euclidean spaces and let us introduce the
multi-index
7 partial derivatives of order
& = (1 , 2 , . . . , d ) to represent spatial
|| := dj=1 j on the following short form x := dj=1 xj . We further introduce
the following function spaces.
C(U; V ) := {f : U V | f is continuous},
Cb (U; V ) := {f : U V | f is continuous and bounded},

dj
Cbk (U; V ) := f : U V | f C(U; V ) and j f Cb (U; V )
dx

for all integers 1 j k ,

Cbk1 ,k2 ([0, T ] U; V ) := f : [0, T ] U V | f C([0, T ] U; V ), and

j
t f Cb ([0, T ] U; V ) for all integers j k1 and 1 j + || k2 .
Theorem 3 (MSE leading order error expansion in the multi-dimensional setting)
Assume that drift and diffusion coefficients and input data of the SDE (64) fulfill


(R.1) a Cb2,4 ([0, T ] Rd1 ; Rd1 ) and b Cb2,4 ([0, T ] Rd1 ; Rd1 d2 ),
(R.2) there exists a constant C > 0 such that
|a(t, x)|2 + |b(t, x)|2 C(1 + |x|2 ),

x Rd1 and t [0, T ],

(R.3) g Cb4 (Rd1 ),


(R.4) for the initial data, X0 is F0 -measurable and E[|X0 |p ] < for all p 1.
Assume further the mesh points 0 = t0 < t1 < < tN = T
(M.1) are stopping times such that tn is Ftn1 -measurable for n = 1, 2, . . . , N,
(M.2) there exists N N, and a c1 > 0 such that c1 N inf N() and sup
N() N holds for each realization. Furthermore, there exists a c2 > 0 such
that sup maxn{0,1,...,N1} tn () < c2 N 1 ,
(M.3) and there exists a c3 > 0 such that for all p[1, 8] and n{0, 1, . . . , N 1},

p

E tn2p c3 E tn2 .
Then, as N increases,

 2
E g(XT ) g X T
 

(i,j) 
N1
xj (tn , X tn )
xi (bbT )(k,) (bxk bxT )
tn2 + o(tn2 ),
= E
2
n=0
where we have dropped the arguments of the first variation as well as the diffusion
matrices for clarity.


Replacing the first variation xi tn , X n by the numerical approximation xi ,n ,
as defined in (66) and using the error density notation from (65), we obtain the
following to leading order all-terms-computable error expansion:
N1


 2
2
2
=E
n tn + o(tn ) .
E g(XT ) g X T


(67)

n=0

A Uniform Time Step MLMC Algorithm


The uniform time step MLMC algorithm for MSE approximations of SDE was
proposed in [8]. Below, we present the version of that method that we use in the
numerical tests in this work for reaching the approximation goal (2).


Algorithm 5 mlmcEstimator
Input: TOL_T, TOL_S, confidence δ, input mesh Δt^{−1}, input mesh intervals N_{−1}, initial number of samples M̂, weak convergence rate α, SDE problem.
Output: Multilevel estimator A_ML.
Compute the confidence parameter C_C(δ) by (42).
Set L = −1.
while L < 3 or (44), using the input rate α, is violated do
    Set L = L + 1.
    Set M_L = M̂, and generate a set of (Euler–Maruyama) realizations {Δ_L g_{i,L}}_{i=1}^{M_L} on the mesh and Wiener path pairs (Δt^{L−1}, Δt^{L}) and (W^{L−1}, W^{L}), where the uniform mesh pair has step sizes Δt^{L−1} = T/N_{L−1} and Δt^{L} = T/N_L, respectively.
    for ℓ = 0 to L do
        Compute the sample variance V(Δ_ℓ g; M_ℓ).
    end for
    for ℓ = 0 to L do
        Determine the number of samples by

            M_ℓ = ⌈ (C_C²(δ)/TOL_S²) sqrt( Var(Δ_ℓ g)/N_ℓ ) Σ_{ℓ'=0}^{L} sqrt( N_{ℓ'} Var(Δ_{ℓ'} g) ) ⌉.

        (The equation for M_ℓ is derived by Lagrangian optimization, cf. Sect. 3.2.1.)
        if the new value of M_ℓ is larger than the old value then
            Compute additional (Euler–Maruyama) realizations {Δ_ℓ g_{i,ℓ}}_{i=M_ℓ+1}^{M_ℓ,new} on the mesh and Wiener path pairs (Δt^{ℓ−1}, Δt^{ℓ}) and (W^{ℓ−1}, W^{ℓ}), where the uniform mesh pairs have step sizes Δt^{ℓ−1} = T/(2^ℓ N_{−1}) and Δt^{ℓ} = T/(2^{ℓ+1} N_{−1}), respectively.
        end if
    end for
end while
Compute A_ML using the generated samples by the formula (7).
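The essential coupling in Algorithm 5 is that the coarse and fine Euler–Maruyama paths on a level share the same Wiener path, with coarse increments obtained by summing consecutive fine increments. The following is a minimal sketch of one such realization Δ_ℓ g on a nested uniform mesh pair; drift, diffusion and payoff are user-supplied, and the names are ours.

```python
# Sketch: one realization of Delta_l g on the uniform mesh pair of Algorithm 5.
import numpy as np

def delta_g_uniform(a, b, g, x0, T, N_coarse, rng):
    N_fine = 2 * N_coarse
    dt_f = T / N_fine
    dW_f = rng.normal(0.0, np.sqrt(dt_f), size=N_fine)
    dW_c = dW_f.reshape(N_coarse, 2).sum(axis=1)      # shared Wiener path

    def euler(N, dt, dW):
        x, t = x0, 0.0
        for n in range(N):
            x += a(t, x) * dt + b(t, x) * dW[n]
            t += dt
        return x

    fine = euler(N_fine, dt_f, dW_f)
    coarse = euler(N_coarse, T / N_coarse, dW_c)
    return g(fine) - g(coarse)

# Example usage with the geometric Brownian motion of Example 5 and g(x) = x:
rng = np.random.default_rng(1)
sample = delta_g_uniform(lambda t, x: 0.5 * x, lambda t, x: 0.5 * x,
                         lambda x: x, 1.0, 1.0, 16, rng)
```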

References
1. Avikainen, R.: On irregular functionals of SDEs and the Euler scheme. Financ. Stoch. 13(3), 381–401 (2009)
2. Bangerth, W., Rannacher, R.: Adaptive Finite Element Methods for Differential Equations. Lectures in Mathematics ETH Zürich. Birkhäuser, Basel (2003)
3. Barth, A., Lang, A.: Multilevel Monte Carlo method with applications to stochastic partial differential equations. Int. J. Comput. Math. 89(18), 2479–2498 (2012)
4. Cliffe, K.A., Giles, M.B., Scheichl, R., Teckentrup, A.L.: Multilevel Monte Carlo methods and applications to elliptic PDEs with random coefficients. Comput. Vis. Sci. 14(1), 3–15 (2011)
5. Collier, N., Haji-Ali, A.-L., Nobile, F., von Schwerin, E., Tempone, R.: A continuation multilevel Monte Carlo algorithm. BIT Numer. Math. 55(2), 399–432 (2014)
6. Durrett, R.: Probability: Theory and Examples, 2nd edn. Duxbury Press, Belmont (1996)
7. Gaines, J.G., Lyons, T.J.: Variable step size control in the numerical solution of stochastic differential equations. SIAM J. Appl. Math. 57, 1455–1484 (1997)
8. Giles, M.B.: Multilevel Monte Carlo path simulation. Oper. Res. 56(3), 607–617 (2008)
9. Giles, M.B.: Multilevel Monte Carlo methods. Acta Numerica 24, 259–328 (2015)
10. Giles, M.B., Szpruch, L.: Antithetic multilevel Monte Carlo estimation for multi-dimensional SDEs without Lévy area simulation. Ann. Appl. Probab. 24(4), 1585–1620 (2014)
11. Gillespie, D.T.: The chemical Langevin equation. J. Chem. Phys. 113(1), 297–306 (2000)
12. Glasserman, P.: Monte Carlo Methods in Financial Engineering. Applications of Mathematics (New York), vol. 53. Springer, New York (2004). Stochastic Modelling and Applied Probability
13. Haji-Ali, A.-L., Nobile, F., von Schwerin, E., Tempone, R.: Optimization of mesh hierarchies in multilevel Monte Carlo samplers. Stoch. Partial Differ. Equ. Anal. Comput. 1–37 (2015)
14. Heinrich, S.: Monte Carlo complexity of global solution of integral equations. J. Complex. 14(2), 151–175 (1998)
15. Heinrich, S., Sindambiwe, E.: Monte Carlo complexity of parametric integration. J. Complex. 15(3), 317–341 (1999)
16. Hoel, H., von Schwerin, E., Szepessy, A., Tempone, R.: Implementation and analysis of an adaptive multilevel Monte Carlo algorithm. Monte Carlo Methods Appl. 20(1), 1–41 (2014)
17. Hofmann, N., Müller-Gronbach, T., Ritter, K.: Optimal approximation of stochastic differential equations by adaptive step-size control. Math. Comp. 69(231), 1017–1034 (2000)
18. Hunter, J.D.: Matplotlib: a 2d graphics environment. Comput. Sci. Eng. 9(3), 90–95 (2007)
19. Ilie, S.: Variable time-stepping in the pathwise numerical solution of the chemical Langevin equation. J. Chem. Phys. 137(23), 234110 (2012)
20. Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics, vol. 113, 2nd edn. Springer, New York (1991)
21. Kebaier, A.: Statistical Romberg extrapolation: a new variance reduction method and applications to option pricing. Ann. Appl. Probab. 15(4), 2681–2705 (2005)
22. Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations. Applications of Mathematics (New York). Springer, Berlin (1992)
23. Lamba, H., Mattingly, J.C., Stuart, A.M.: An adaptive Euler–Maruyama scheme for SDEs: convergence and stability. IMA J. Numer. Anal. 27(3), 479–506 (2007)
24. L'Ecuyer, P., Buist, E.: Simulation in Java with SSJ. In: Proceedings of the 37th Conference on Winter Simulation, WSC '05, pp. 611–620. Winter Simulation Conference (2005)
25. Milstein, G.N., Tretyakov, M.V.: Quasi-symplectic methods for Langevin-type equations. IMA J. Numer. Anal. 23(4), 593–626 (2003)
26. Mishra, S., Schwab, C.: Sparse tensor multi-level Monte Carlo finite volume methods for hyperbolic conservation laws with random initial data. Math. Comp. 81(280), 1979–2018 (2012)
27. Øksendal, B.: Stochastic Differential Equations. Universitext, 5th edn. Springer, Berlin (1998)
28. Platen, E., Heath, D.: A Benchmark Approach to Quantitative Finance. Springer Finance. Springer, Berlin (2006)
29. Shreve, S.E.: Stochastic Calculus for Finance II. Springer Finance. Springer, New York (2004). Continuous-time models
30. Skeel, R.D., Izaguirre, J.A.: An impulse integrator for Langevin dynamics. Mol. Phys. 100(24), 3885–3891 (2002)
31. Szepessy, A., Tempone, R., Zouraris, G.E.: Adaptive weak approximation of stochastic differential equations. Comm. Pure Appl. Math. 54(10), 1169–1214 (2001)
32. Talay, D.: Stochastic Hamiltonian systems: exponential convergence to the invariant measure, and discretization by the implicit Euler scheme. Markov Process. Relat. Fields 8(2), 163–198 (2002). Inhomogeneous random systems (Cergy-Pontoise, 2001)
33. Talay, D., Tubaro, L.: Expansion of the global error for numerical schemes solving stochastic differential equations. Stoch. Anal. Appl. 8(4), 483–509 (1990)
34. Teckentrup, A.L., Scheichl, R., Giles, M.B., Ullmann, E.: Further analysis of multilevel Monte Carlo methods for elliptic PDEs with random coefficients. Numer. Math. 125(3), 569–600 (2013)
35. Yan, L.: The Euler scheme with irregular coefficients. Ann. Probab. 30(3), 1172–1194 (2002)

Vandermonde Nets and Vandermonde


Sequences
Roswitha Hofer and Harald Niederreiter

Abstract A new family of digital nets called Vandermonde nets was recently
introduced by the authors. We generalize the construction of Vandermonde nets
with a view to obtain digital nets that serve as stepping stones for new constructions
of digital sequences called Vandermonde sequences. Another new family of Vandermonde sequences is built from global function fields, and this family of digital
sequences has asymptotically optimal quality parameters for a fixed prime-power
base and increasing dimension.
Keywords Low-discrepancy point sets and sequences
sequences Digital point sets and sequences

(t, m, s)-nets

(t, s)-

1 Introduction and Basic Definitions


Low-discrepancy point sets and sequences are basic ingredients of quasi-Monte Carlo
methods for numerical integration. The most powerful known methods for the construction of low-discrepancy point sets and sequences are based on the theory of
(t, m, s)-nets and (t, s)-sequences, which are point sets, respectively sequences,
satisfying strong uniformity properties with regard to their distribution in the sdimensional unit cube [0, 1]s . Various methods for the construction of (t, m, s)-nets
and (t, s)-sequences have been developed, and we refer to the monograph [1] for an
excellent survey of these methods. We follow the recent handbook article [9] in the
notation and terminology. First we recall the definition of a (t, m, s)-net.
R. Hofer
Institute of Financial Mathematics and Applied Number Theory,
Johannes Kepler University Linz, Altenbergerstr. 69, 4040 Linz, Austria
e-mail: roswitha.hofer@jku.at
H. Niederreiter (B)
Johann Radon Institute for Computational and Applied Mathematics,
Austrian Academy of Sciences, Altenbergerstr. 69, 4040 Linz, Austria
e-mail: ghnied@gmail.com
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_3

87

88

R. Hofer and H. Niederreiter

Definition 1 Let b 2 and s 1 be integers and let t and m be integers with 0


t m. A (t, m, s)-net in base b is a point set P consisting of bm points in the sdimensional half-open unit cube [0, 1)s such that every subinterval J of [0, 1)s of
the form
s

[ai bdi , (ai + 1)bdi )
J=
i=1

with integers di 0 and 0 ai < bdi for 1 i s and with volume btm contains
exactly bt points of P.
The number t is called the quality parameter of a (t, m, s)-net in base b and it
should be as small as possible in order to get strong uniformity properties of the net.
It was shown in [7] (see also [8, Theorem 4.10]) that in the nontrivial case m 1,
the star discrepancy D N (P) of a (t, m, s)-net P in base b with N = bm satisfies


D N (P) B(b, s)bt N 1 (log N )s1 + O bt N 1 (log N )s2 ,

(1)

where B(b, s) and the implied constant in the Landau symbol depend only on b and
s. The currently best values of B(b, s) are due to Kritzer [6] for odd b and to Faure
and Kritzer [3] for even b.
Most of the known constructions of (t, m, s)-nets are based on the digital method
which was introduced in [7]. Although the digital method works for any base b 2,
we focus in the present paper on the case where b is a prime power. In line with
standard notation, we write q for a prime-power base. The construction of a digital
net over Fq proceeds as follows. Given a prime power q, a dimension s 1, and an
integer m 1, we let Fq be the finite field of order q and we choose m m matrices
C (1) , . . . , C (s) over Fq . We write Z q = {0, 1, . . . , q 1} Z for the set of digits in
base q. Then we define the map m : Fqm [0, 1) by


m (b ) =

m


(b j )q j

j=1

for any column vector b = (b1 , . . . , bm ) Fqm , where : Fq Z q is a chosen


bijection. With a fixed column vector b Fqm , we associate the point


m (C (1) b ), . . . , m (C (s) b ) [0, 1)s .

(2)

By letting b range over all q m column vectors in Fqm , we arrive at a digital net
consisting of q m points in [0, 1)s .
Definition 2 If the digital net P over Fq consisting of the q m points in (2) with b
ranging over Fqm is a (t, m, s)-net in base q for some value of t, then P is called a
digital (t, m, s)-net over Fq . The matrices C (1) , . . . , C (s) are the generating matrices
of P.

Vandermonde Nets and Vandermonde Sequences

89

This construction of digital nets can be generalized somewhat by employing further bijections between Fq and Z q (see [8, p. 63]), but this is not needed for our
purposes since our results depend only on the generating matrices of a given digital
net. Note that a digital net over Fq consisting of q m points in [0, 1)s is always a digital
(t, m, s)-net over Fq with t = m.
A new family of digital nets called Vandermonde nets was recently introduced
by the authors in [5]. In the present paper, we extend the results in [5] in several
directions. Most importantly, we show how to obtain not only new (t, m, s)-nets, but
also new (t, s)-sequences from our approach. It seems reasonable to give the name
Vandermonde sequences to these (t, s)-sequences.
The rest of the paper is organized as follows. In Sect. 2, we briefly review the construction of digital nets in [5]. We generalize this construction in Sect. 3, as a preparation for the construction of Vandermonde sequences. Finally, the constructions of
new (t, s)-sequences and more generally of (T, s)-sequences called Vandermonde
sequences are presented in Sects. 4 and 5.

2 Vandermonde Nets via Extension Fields


We recall that the construction of an s-dimensional digital net over Fq with q m
points requires m m generating matrices C (1) , . . . , C (s) over Fq . The row vectors
of these generating matrices belong to the vector space Fqm over Fq , and according
to a suggestion in [10, Remark 6.3] we can view these row vectors as elements of
the extension field Fq m . Instead of choosing generating matrices C (1) , . . . , C (s) , we
may thus set up a single s m matrix C = ( j(i) )1is, 1 jm over Fq m . By taking a
m
vector space isomorphism : Fq m Fqm , we obtain the jth row vector c(i)
j Fq of
C (i) as
(i)
for 1 i s, 1 j m.
(3)
c(i)
j = ( j )
The crucial idea of the paper [5] is to consider a matrix C with a Vandermondetype structure. Concretely, we choose an s-tuple a = (1 , . . . , s ) Fqs m and we
j1
then set up the s m matrix C = ( j(i) )1is, 1 jm over Fq m defined by j(1) = 1
for 1 j m and (if s 2) j(i) = i for 2 i s and 1 j m. We use the
standard convention 00 = 1 Fq . The digital net over Fq whose generating matrices
are obtained from C and (3) is called a Vandermonde net over Fq .
We need some notation in order to state, in Proposition 1 below, the formula for
the quality parameter of a Vandermonde net given in [5]. Let Fq [X ] be the ring of
polynomials over Fq in the indeterminate X . For any integer m 1, we put
j

G q,m = {h Fq [X ] : deg(h) < m},


Hq,m = {h Fq [X ] : deg(h) m, h(0) = 0}.

90

R. Hofer and H. Niederreiter

For the zero polynomial 0 Fq [X ] we use the convention deg(0) = 0. We define a


second degree function on Fq [X ] by deg (h) = deg(h) for h Fq [X ] with h = 0 and
deg (0) = 1. We write h = (h 1 , . . . , h s ) Fq [X ]s for a given dimension s 1.
For every s-tuple a = (1 , . . . , s ) Fqs m , we put


L(a) = h G q,m

s1
Hq,m

s


h i (i ) = 0

i=1

and L
(a) = L(a)\{0}. The following figure of merit was defined in [5, Definition 2.1]. We use the standard convention that an empty sum is equal to 0.
Definition 3 If L
(a) is nonempty, we define the figure of merit
(a) = min

hL (a)

deg (h 1 ) +

s



deg(h i ) .

i=2

Otherwise, we define (a) = m.


Proposition 1 Let q be a prime power, let s, m N, and let a Fqs m . Then the
Vandermonde net determined by a is a digital (t, m, s)-net over Fq with
t = m (a).
A nonconstructive existence theorem for large figures of merit was shown in
[5, Corollary 2.7] and is stated in the proposition below. The subsequent corollary
follows from this proposition and from Proposition 1.
Proposition 2 Let q be a prime power and let s, m N. Then there exists an a Fqs m
with


(a) m s logq m 3 ,
where logq denotes the logarithm to the base q.
Corollary 1 For any prime power q and any s, m N, there exists a Vandermonde
net over Fq which is a digital (t, m, s)-net over Fq with


t m m s logq m 3 .
If we combine Corollary 1 with the discrepancy bound in (1), then we see that the
Vandermonde net P over Fq in Corollary 1 satisfies


D N (P) = O N 1 (log N )2s1 ,
where N = q m and where the implied constant depends only on q and s. If q is
a prime and s 3, then the exponent 2s 1 of log N can be improved to s + 1

Vandermonde Nets and Vandermonde Sequences

91

by an averaging argument (see [5, Sect. 3]). Again for a prime q, suitable s-tuples
a Fqs m yielding this improved discrepancy bound can be obtained by a componentby-component algorithm (see [5, Sect. 5]).
We comment on the relationship between Vandermonde nets and other known
families of digital nets. A broad class of digital nets, namely that of hyperplane
nets, was introduced in [17] (see also [1, Chap. 11]). Choose 1 , . . . , s Fq m not
all 0. Then for the corresponding hyperplane net relative to a fixed ordered basis
1 , . . . , m of Fq m over Fq , the matrix C = ( j(i) )1is, 1 jm at the beginning of this
section is given by j(i) = i j for 1 i s and 1 j m (see [1, Theorem 11.5]
and [10, Remark 6.4]). Thus, this matrix C is also a structured matrix, but the structure
is in general not a Vandermonde structure. Consequently, Vandermonde nets are in
general not hyperplane nets relative to a fixed ordered basis of Fq m over Fq . The wellknown family of polynomial lattice point sets (see [1, Chap. 10] and [15]) belongs
to the family of hyperplane nets by [16, Theorem 2] (see also [1, Theorem 11.7]),
and so Vandermonde nets are in general not polynomial lattice point sets.

3 Vandermonde Nets with General Moduli


It was already pointed out in [5, Remark 2.3] that the construction of Vandermonde
nets over Fq in [5], which is described also in Sect. 2 of the present paper, can
be presented in the language of polynomials over Fq . There is then an analogy
with polynomial lattice point sets with irreducible moduli. This analogy was carried
further in [5, Remark 2.4] where a construction of Vandermonde nets with general
moduli was sketched. Such a generalization of the theory of Vandermonde nets is
needed for the construction of Vandermonde sequences in Sect. 4.
For a prime power q and an integer m 1, we choose a polynomial f Fq [X ]
with deg( f ) = m which serves as the modulus. We consider the residue class ring
Fq [X ]/( f ) which can be viewed as a vector space over Fq isomorphic to Fqm . Let
B be an ordered basis of the vector space Fq [X ]/( f ) over Fq . We set up the map
f : Fq [X ] Fqm as follows: for every h Fq [X ], let h Fq [X ]/( f ) be the residue
class of h modulo f and let f (h) Fqm be the coordinate vector of h relative to the
ordered basis B. It is obvious that f is an Fq -linear transformation.
Now we construct an s-dimensional digital net over Fq with m m generating matrices C (1) , . . . , C (s) over Fq in the following way. We choose an s-tuple
s
. The first generating matrix C (1) has the row vectors
g = (g1 , . . . , gs ) G q,m
j1
c1(1) , . . . , cm(1) with c(1)
j = f (g1 ) for 1 j m. If s 2, then for 2 i s the jth
(i)
row vector c(i)
is given by c(i)
j of C
j = f (gi ) for 1 j m. The digital net over
Fq with generating matrices C (1) , . . . , C (s) is called the Vandermonde net V (g, f ).
If the modulus f Fq [X ] is irreducible over Fq , then Fq [X ]/( f ) and Fq m are isomorphic as fields, and so it is clear that the present construction of Vandermonde
nets reduces to that in Sect. 2.
j

92

R. Hofer and H. Niederreiter

In order to determine the quality parameter of V (g, f ), we have to generalize


Definition 3. We write h g for the composition of two polynomials h, g Fq [X ],
s
and f Fq [X ]
that is, (h g)(X ) = h(g(X )). Then for g = (g1 , . . . , gs ) G q,m
with deg( f ) = m 1, we put
s


s1
:
(h i gi ) 0 (mod f )
L(g, f ) = h G q,m Hq,m
i=1

and L
(g, f ) = L(g, f )\{0}.
Definition 4 Let q be a prime power and let s, m N. Let f Fq [X ] with deg( f ) =
s
. If L
(g, f ) is nonempty, we define the figure of merit
m and let g G q,m
(g, f ) =


min

hL (g, f )

deg (h 1 ) +

s



deg(h i ) .

i=2

Otherwise, we define (g, f ) = m.


Remark 1 It is trivial that we always have (g, f ) 0. For s = 1 it is clear that
(g, f ) m. For s 2 the m + 1 vectors f (1), f (g1 ), f (g12 ),
. . . , f (g1m1 ), f (g2 ) in Fqm must be linearly dependent over Fq . Hence for some
b0 , b1 , . . . , bm Fq , not all 0, we have
m1


b j f (g1 ) + bm f (g2 ) = 0 Fqm .

j=0

Since f is an Fq -linear transformation, this can also be written as


f

m1



j
b j g1 + bm g2 = 0 Fqm .

j=0

The definition of f implies that


h 1 (X ) =

m1


m1
j=0

b j g1 + bm g2 0 (mod f ). If we put

b j X j , h 2 (X ) = bm X, h i (X ) = 0 for 3 i s,

j=0
s1
is a nonzero s-tuple belonging to L(g, f ).
then h = (h 1 , . . . , h s ) G q,m Hq,m

Hence L (g, f ) is nonempty and (g, f ) m by Definition 4.

Theorem 1 Let q be a prime power and let s, m N. Let f Fq [X ] with deg( f ) =


s
. Then the Vandermonde net V (g, f ) is a digital (t, m, s)-net
m and let g G q,m
over Fq with

Vandermonde Nets and Vandermonde Sequences

93

t = m (g, f ).
Proof The case (g, f ) = 0 is trivial, and so in view of Remark 1 we can assume that
1 (g, f ) m. According to a well-known result for digital nets (see [1, Theorem 4.52]),
for any nonnegative integers d1 , . . . , ds
s it suffices to show the following:(i)
with i=1
di = (g, f ), the row vectors c j Fqm , 1 j di , 1 i s, of the
generating matrices of V (g, f ) are linearly independent over Fq . Suppose, on the
contrary, that we had a linear dependence relation
di
s 


m
bi, j c(i)
j = 0 Fq ,

i=1 j=1

where all bi, j Fq and not all of them are 0. By the definition of the c(i)
j and the
Fq -linearity of f we obtain
f

d1


j1

b1, j g1

j=1

This means that


h 1 (X ) =

s

d1


i=1 (h i

di
s 


bi, j gi

= 0 Fqm .

i=2 j=1

gi ) 0 (mod f ), where

b1, j X j1 G q,m , h i (X ) =

j=1

di


bi, j X j Hq,m for 2 i s,

j=1

and so h = (h 1 , . . . , h s ) L
(g, f ). Furthermore, by the definitions of the degree
functions deg and deg in Sect. 2, we have deg (h 1 ) < d1 and deg(h i ) di for 2
i s. It follows that
deg (h 1 ) +

s

i=2

deg(h i ) <

s


di = (g, f ),

i=1

which is a contradiction to the definition of (g, f ).

Now we generalize the explicit construction of Vandermonde nets in [5, Sect. 4].
Let q be a prime power and let s and m be integers with 1 s q + 1 and m 2. Put
g1 (X ) = X G q,m . If s 2, then we choose s 1 distinct elements c2 , . . . , cs of Fq ;
this is possible since s 1 q. Furthermore, let f Fq [X ] be such that deg( f ) =
m. If s 2, then suppose that f (ci ) = 0 for 2 i s (for instance, this condition
is automatically satisfied if f is a power of a nonlinear irreducible polynomial over
Fq ). For each i = 2, . . . , s, we have gcd(X ci , f (X )) = 1, and so there exists a
uniquely determined gi G q,m with
gi (X )(X ci ) 1 (mod f (X )).

(4)

94

R. Hofer and H. Niederreiter

In this way, we arrive at the Vandermonde net V (g, f ) with g = (g1 , . . . , gs )


s
.
G q,m
Theorem 2 Let q be a prime power and let s and m be integers with 1 s q + 1
and m 2. Let f Fq [X ] be such that deg( f ) = m. If s 2, then let c2 , . . . , cs
Fq be distinct and suppose that f (ci ) = 0 for 2 i s. Then the Vandermonde net
V (g, f ) constructed above is a digital (t, m, s)-net over Fq with t = 0.
Proof According to Theorem 1, it suffices to show that (g, f ) = m. This is trivial
for s = 1 since then L
(g, f ) is empty. Therefore we can assume that s 2.
We proceed by contradiction and assume that (g, f ) m 
1. Then by Defins
di m 1,
ition 4, there exists an s-tuple h = (h 1 , . . . , h s ) L
(g, f ) with i=1

=
deg
(h
)
and
d
=
deg(h
)
for
2

s.
Now
h

L
(g,
f
)
where
d
1
1
i
i
s implies that
s
dk
i=1 (h i gi ) 0 (mod f ), and multiplying this congruence by
k=2 (X ck )
we get
h 1 (X )

s


(X ck ) +
dk

k=2

(h i gi )(X )

i=2

If we write h i (X ) =
(h i gi )(X )

s


s


di
j=1

s


(X ck )dk 0 (mod f (X )).

k=2

h i, j X j for 2 i s with all h i, j Fq , then

(X ck )dk =

di


k=2

h i, j gi (X )

s


j=1

di


(X ck )dk

k=2
j
h i, j gi (X )

(X ci )

j=1

di


s


(X ck )dk

k=2
k =i

h i, j (X ci )di j

j=1

di

s

(X ck )dk (mod f (X ))
k=2
k =i

by (4), and so
h 1 (X )

s

k=2

(X ck ) +
dk

di
s 

i=2

j=1

h i, j (X ci )di j

s


(X ck )dk 0 (mod f (X )).

k=2
k =i

Let f 0 Fq [X ] denote
sthe left-hand side of the preceding
s congruence. The first term
di m 1. In the sum i=2
in
expression
f0 , a
of f 0 has degree i=1
the
for
s
s
di 1 i=1
di
term appears only if di 1 and such a term has degree i=2
m 1 since d1 = deg (h 1 ) 1. Altogether we have deg( f 0 ) m 1 < deg( f ).
But f divides f 0 according to the congruence above, and so f 0 = 0 Fq [X ]. If we
assume that dr 1 for some r {2, . . . , s}, then substituting X = cr in f 0 (X ) we
obtain

Vandermonde Nets and Vandermonde Sequences

0 = f 0 (cr ) =

dr

j=1

h r, j (cr cr )dr j

s


95

(cr ck )dk = h r,dr

k=2
k =r

s


(cr ck )dk .

k=2
k =r

Since the last product is nonzero, we deduce that h r,dr = 0. This is a contradiction to
deg(h r ) = dr . Thus we have shown that di = 0 for 2 i s, and so h i = 0 Fq [X ]
for 2 i s. Since f 0 = 0 Fq [X ], it follows that also h 1 = 0 Fq [X ]. This is
the final contradiction since h L
(g, f ) means in particular that h is a nonzero
s-tuple.

In the case where f Fq [X ] with deg( f ) = m 2 is irreducible over Fq , this
construction of Vandermonde (0, m, s)-nets over Fq is equivalent to that in [5,
Sect. 4]. The construction is best possible in terms of the condition on s since it is well
known that if m 2, then s q + 1 is a necessary condition for the existence of a
(0, m, s)-net in base q (see [8, Corollary 4.21]). The fact that we can explicitly construct Vandermonde (0, m, s)-nets over Fq for all dimensions s q + 1 represents
an advantage over polynomial lattice point sets since explicit constructions of good
polynomial lattice point sets are known only for s = 1 and s = 2 (see [8, Sect. 4.4]
and also [1, p. 305]).

4 Vandermonde Sequences from Polynomials


We now extend the work in the previous sections from (finite) point sets to (infinite)
sequences, and thus we arrive at new digital (t, s)-sequences and more generally
digital (T, s)-sequences which we call Vandermonde sequences. We first provide
the necessary background (see [1, Chap. 4] and [9]). For integers b 2, s 1, and
m 1, let [x]b,m denote the coordinatewise m-digit truncation in base b of a point
x [0, 1]s (compare with [13, p. 194]). We write N0 for the set of nonnegative
integers.
Definition 5 Let b 2 and s 1 be integers and let T : N N0 be a function
with T(m) m for all m N. Then a sequence x0 , x1 , . . . of points in [0, 1]s is a
(T, s)-sequence in base b if for all integers k 0 and m 1, the points [xn ]b,m with
kbm n < (k + 1)bm form a (T(m), m, s)-net in base b. If for some integer t 0 we
have T(m) = t for m > t, then a (T, s)-sequence in base b is called a (t, s)-sequence
in base b.
Every (t, s)-sequence
S in
 base b is a low-discrepancy sequence, in the sense that

D N (S ) = O N 1 (log N )s for all N 2, where D N (S ) is the star discrepancy
of the first N terms of S (see [8, Theorem 4.17]). The currently best values of the
implied constant can be found in [6] for odd b and in [3] for even b.
The digital method for the construction of (t, m, s)-nets (see Sect. 1) can be
extended to the digital method for the construction of (T, s)-sequences. As in Sect. 1,

96

R. Hofer and H. Niederreiter

we restrict the attention to the case of a prime-power base b = q. For a given dimension s 1, the generating matrices are now matrices C (1) , . . . , C (s) over
Fq , where by an matrix we mean a matrix with denumerably many rows
and columns. Let Fq be the sequence space over Fq , viewed as a vector space of
column vectors over Fq of infinite length. We define the map : Fq [0, 1] by
(e) =

(e j )q j

j=1

for all e = (e1 , e2 , . . .) Fq , where : Fq Z q is a chosen bijection. For n =


0, 1, . . ., let


a j (n)q j1 ,
n=
j=1

with all a j (n) Z q and a j (n) = 0 for all sufficiently large j, be the unique digit
expansion of n in base q. With n we associate the column vector
n = ((a1 (n)), (a2 (n)), . . .) Fq ,
where : Z q Fq is a given bijection with (0) = 0. Now we define the sequence
S by


xn = (C (1) n), . . . , (C (s) n) [0, 1]s

for n = 0, 1, . . . .

Note that the matrix-vector products C (i) n for i = 1, . . . , s are meaningful since
n has only finitely many nonzero coordinates. The sequence S is called a digital
sequence over Fq .
Definition 6 If the digital sequence S over Fq is a (T, s)-sequence in base q for
some function T : N N0 with T(m) m for all m N, then S is called a digital
(T, s)-sequence over Fq . Similarly, if S is a (t, s)-sequence in base q for some
integer t 0, then S is called a digital (t, s)-sequence over Fq .
For i = 1, . . . , s and any integer m 1, we write Cm(i) for the left upper m m
submatrix of the generating matrix C (i) of a digital sequence over Fq . The following well-known result serves to determine a suitable function T for a given digital
sequence over Fq (see [1, Theorem 4.84]).
Lemma 1 Let S be a digital sequence over Fq with generating matrices C (1) , . . . ,
C (s) and let T : N N0 with T(m) m for all m N. Then S is a digital (T, s)sequence over Fq if the following property holds: for any integer m 1 and any
s
m
di = m T(m), the vectors c(i)
integers d1 , . . . , ds 0 with i=1
j,m Fq , 1 j
(i)
di , 1 i s, are linearly independent over Fq , where c j,m denotes the jth row
vector of Cm(i) .

Vandermonde Nets and Vandermonde Sequences

97

In our construction of digital (T, s)-sequences over Fq in this section, we will


initially determine the values of T(m) for m from a proper subset of N. The values
of T(m) for any m N can then be derived from the following general principle.
Lemma 2 Let S be a digital (T0 , s)-sequence over Fq for some function T0 : N
N0 with T0 (m) m for all m N. Then S is also a digital (T, s)-sequence over
Fq for a suitably defined function T : N N0 which satisfies T(m) T0 (m) for all
m N and
T(m + r ) T(m) + r
for all m, r N.
Proof Let T : N N0 be such that T(m) is the least possible value for any m N to
make S a digital (T, s)-sequence over Fq or, in the language of [1, Definition 4.31],
such that S is a strict (T, s)-sequence in base q. Then it is trivial that T(m)
T0 (m) for all m N. For given m N, the fact that S is a digital sequence over Fq
and a strict (T, s)-sequence in base q implies, according to [1, Theorem 4.84], the
following
property with the notation in Lemma 1: for any integers d1 , . . . , ds 0 with
s
m
d
=
m T(m), the vectors c(i)
j,m Fq , 1 j di , 1 i s, are linearly
i=1 i
independent over Fq . In order to verify that T(m + r ) T(m) + r for all r N, it
suffices to show by Lemma 1 that for any integers d1 , . . . , ds 0 with
s


di = (m + r ) (T(m) + r ) = m T(m),

i=1
m+r
the vectors c(i)
, 1 j di , 1 i s, are linearly independent over Fq .
j,m+r Fq
But this is obvious since any nontrivial linear dependence relation between the latter
vectors would yield, by projection onto the first m coordinates of these vectors, a
m
nontrivial linear dependence relation between the vectors c(i)
j,m Fq , 1 j di ,
1 i s.


Now we show how to obtain digital (T, s)-sequences over Fq from the Vandermonde nets in Theorem 2. Let k and s be integers with k 2 and 1 s q + 1. Let
f Fq [X ] be such that deg( f ) = k. If s 2, then let c2 , . . . , cs Fq be distinct
and suppose that f (ci ) = 0 for 2 i s. For any integer e 1, we consider the
modulus f e Fq [X ]. We have again f e (ci ) = 0 for 2 i s, and so Theorem 2
yields a Vandermonde net V (g e , f e ) which is a digital (0, ek, s)-net over Fq . We
write
s
for all e N.
g e = (g1,e , . . . , gs,e ) G q,ek
Then we have the compatibility property
g e+1 g e (mod f e )

for all e N,

(5)

where a congruence between s-tuples of polynomials is meant coordinatewise. The


congruence for the first coordinates is trivial since g1,e (X ) = X for all e N. For the

98

R. Hofer and H. Niederreiter

other coordinates, the congruence follows from the fact that gi G q,m is uniquely
determined by (4).
Recall that V (g e , f e ) depends also on the choice of an ordered basis Be of the
vector space Fq [X ]/( f e ) over Fq (see Sect. 3). We make these ordered bases Be
for e N compatible by choosing them as follows. Let B1 consist of the residue
classes of 1, X, . . . , X k1 modulo f (X ), let B2 consist of the residue classes of
1, X, . . . , X k1 , f (X ), X f (X ), . . . , X k1 f (X ) modulo f 2 (X ), and so on in an obvious manner. For the maps f , f 2 , . . . in Sect. 3, this has the pleasant effect that for
any e N and any h Fq [X ] we have
f e (h) = (e+1)k,ek ( f e+1 (h)),

(6)

where (e+1)k,ek : Fq(e+1)k Fqek is the projection onto the first ek coordinates of a
vector in Fq(e+1)k .
Finally, we construct the generating matrices C (1) , . . . , C (s) of an sdimensional digital sequence over Fq . We do this by defining certain left upper square
submatrices of each C (i) and by showing that these submatrices are compatible.
(i)
Concretely, for i = 1, . . . , s and any e N, the left upper (ek) (ek) submatrix Cek
(i)
e
of C is defined as the ith generating matrix of the Vandermonde net V (g e , f ).
For this to make sense, we have to verify the compatibility condition that for each
(i)
is equal to
i = 1, . . . , s and e N, the left upper (ek) (ek) submatrix of C(e+1)k
(i)
Cek . In the notation of Lemma 1, this means that we have to show that
(i)
c(i)
j,ek = (e+1)k,ek (c j,(e+1)k )

for e N, 1 i s, and 1 j ek. For 2 i s, we have


(i)
(e+1)k,ek (c(i)
j,(e+1)k ) = (e+1)k,ek ( f e+1 (gi,e+1 )) = f e (gi,e+1 ) = f e (gi,e ) = c j,ek
j

by (6) and (5), and obvious modifications show the analogous identity for i = 1. This
completes the construction of the Vandermonde digital sequence S over Fq with
generating matrices C (1) , . . . , C (s) .
Theorem 3 Let q be a prime power and let k and s be integers with k 2 and 1
s q + 1. Let f Fq [X ] be such that deg( f ) = k. If s 2, then let c2 , . . . , cs
Fq be distinct and suppose that f (ci ) = 0 for 2 i s. Then the Vandermonde
sequence S constructed above is a digital (T, s)-sequence over Fq with T(m) =
rk (m) for all m N, where rk (m) is the least residue of m modulo k.
Proof It suffices to show that S is a digital (T0 , s)-sequence over Fq with T0 (m) = 0
if m 0 (mod k) and T0 (m) = m otherwise. The rest follows from Lemma 2.
Now let m 0 (mod k), say m = ek with e N. Then for m = ek, we have to
verify the linear independence property in Lemma 1 for the left upper (ek) (ek)
(1)
(s)
(1)
(s)
of S , with the
submatrices
ek , . . . , C ek of the generating matrices C , . . . , C
C
s
condition i=1 di = ek in Lemma 1. By the construction of the latter generating

Vandermonde Nets and Vandermonde Sequences

99

(1)
(s)
matrices, the submatrices Cek
, . . . , Cek
are the generating matrices of the Vandere
e
monde net V (g e , f ). Now V (g e , f ) is a digital (0, ek, s)-net over Fq by Theorem 2, and this implies the desired linear independence property in Lemma 1 for
(1)
(s)
, . . . , Cek
.

Cek

Example 1 Let q be a prime power and let s = q + 1. Let c2 , . . . , cq+1 be the q


distinct elements of Fq and let f be an irreducible quadratic polynomial over Fq .
Then Theorem 3 provides a digital (T, q + 1)-sequence over Fq with T(m) = 0 for
even m N and T(m) = 1 for odd m N. A digital sequence with these parameters
was also constructed in [11], but the present construction is substantially simpler
than that in [11]. Note that there cannot exist a digital (U, q + 1)-sequence over Fq
with U(m) = 0 for all m N, because of the well-known necessary condition s q
for (0, s)-sequences in base q (see [8, Corollary 4.24]).

5 Vandermonde Sequences from Global Function Fields


The construction of Vandermonde sequences in Sect. 4 can be described also in the
language of valuations and Riemann-Roch spaces of the rational function field Fq (X )
over Fq (see Example 2 below and [4, Sect. 2]). This description is the starting point
for a generalization of the construction by using arbitrary global function fields.
The generalized construction allows us to overcome the restriction s q + 1 in the
construction in Sect. 4. It is well known that global function fields are powerful tools
for constructing (t, m, s)-nets and (t, s)-sequences; see [1, Chap. 8], [13, Chap. 8],
and [14, Sect. 5.7] for expository accounts of constructions based on global function
fields. The construction in the present section serves as another illustration for the
power of global function fields in this area.
Concerning global function fields, we follow the notation and terminology in
the book [14]. Another good reference for global function fields is the book of
Stichtenoth [19]. We briefly review some basic notions from the theory of global
function fields and we refer to [14] and [19] for more detailed information. For a
finite field Fq , a global function field F over Fq is an algebraic function field of one
variable with constant field Fq , i.e., F is a finite extension (in the sense of field theory)
of the rational function field Fq (X ) over Fq . We assume without loss of generality
that Fq is the full constant field of F, which means that Fq is algebraically closed in F.
An important concept is that of a valuation of F, which is a map : F R {}
satisfying the following four axioms: (i) ( f ) = if and only if f = 0; (ii)
( f 1 f 2 ) = ( f 1 ) + ( f 2 ) for all f 1 , f 2 F; (iii) ( f 1 + f 2 ) min (( f 1 ), ( f 2 ))
for all f 1 , f 2 F; (iv) (F ) = {0} for F := F\{0}. Two valuations of F are equivalent if one is a constant multiple of the other. An equivalence class of valuations of
F is called a place of F. Each place P of F contains a unique normalized valuation
P for which P (F ) = Z. The residue class field of P is a finite extension of Fq
and the degree of this extension is the degree deg(P) of P. A place P of F with
deg(P) = 1 is called a rational place of F. Let P F denote the set of all places of F.

100

R. Hofer and H. Niederreiter

A divisor D of F is a formal sum


D=

nP P

PP F

with n P Z for all P P F and all but finitely many n P = 0. We write also n P =
P (D). The finite set of all places P of F with P (D) = 0 is called the support of
D. The degree deg(D) of a divisor D is defined by
deg(D) =

n P deg(P) =

PP F

P (D) deg(P).

PP F

Divisors are added and subtracted term by term. We say that a divisor D of F is
positive if P (D) 0 for all P P F . The principal divisor div( f ) of f F is
defined by

div( f ) =
P ( f ) P.
PP F

For any divisor D of F, the Riemann-Roch space


L (D) = { f F : div( f ) + D 0} {0}
associated with D is a finite-dimensional vector space over Fq . We write (D) for
the dimension of this vector space. If the integer g 0 is the genus of F, then the
celebrated Riemann-Roch theorem [14, Theorem 3.6.14] says that (D) deg(D) +
1 g, with equality whenever deg(D) 2g 1. We quote the following result from
[14, Corollary 3.4.4].
Lemma 3 If the divisor D of the global function field F satisfies deg(D) < 0, then
L (D) = {0}.
We are now ready to describe a construction of s-dimensional Vandermonde
sequences based on the global function field F of genus g. We avoid trivial cases by
assuming that s 2 and g 1. We suppose that F has at least s + 1 rational places.
Let P1 , . . . , Ps , P be distinct rational places of F and let D be a positive divisor
of F with deg(D) = 2g such that P2 , . . . , Ps , P are not in the support of D (for
instance D = 2g P1 ).
Lemma 4 For every integer j 1, there exist
(1)
j L (D + ( j 1)P1 ( j 1)P2 ) \L (D + ( j 2)P1 ( j 1)P2 ) ,
(i)

j L (D + j Pi j P1 ) \ (L (D + j Pi ( j + 1)P1 ) L (D + ( j 1)Pi j P1 ))

for 2 i s. Furthermore, we have:


(i) P1 ( (1)
j ) = ( j 1) P1 (D),

Vandermonde Nets and Vandermonde Sequences

101

(ii) P1 ( (i)
j ) = j P1 (D),
(iii) Pi ( (i)
j ) = j,
(l)
(iv) Ph ( j ) 0
for j 1, for 2 i s, and for 2 h s and 1 l s with h = l.
Proof We first observe that obviously
deg (D + ( j 1)P1 ( j 1)P2 ) = 2g,

(7)

deg (D + ( j 2)P1 ( j 1)P2 ) = 2g 1,


deg (D + j Pi j P1 ) = 2g,

(8)
(9)

deg (D + j Pi ( j + 1)P1 ) = 2g 1,
deg (D + ( j 1)Pi j P1 ) = 2g 1,
(L (D + j Pi ( j + 1)P1 ) L (D + ( j 1)Pi j P1 )) {0}.

(10)
(11)
(12)

The existence of the (1)


j for j 1 follows directly from the Riemann-Roch theorem
together with (7) and (8). The existence of the (i)
j for 2 i s and j 1 follows
from
|L (D + j Pi j P1 )| |L (D + j Pi ( j + 1)P1 )| |L (D + ( j 1)Pi j P1 )|
+ |L (D + j Pi ( j + 1)P1 ) L (D + ( j 1)Pi j P1 )| q g+1 q g q g + 1 1,

where we used the Riemann-Roch theorem together with (9), (10), (11), and (12).
The results (i), (ii), (iii), and (iv) are now obtained from the choice of the (i)
j for
1 i s and j 1 and from the given properties of the divisor D.

Example 2 If F is the rational function field Fq (X ), then the elements (i)
j F in
Lemma 4 can be given explicitly. For this F we have the so-called infinite place
(which is a rational place of F), and the remaining places of F are in one-to-one
correspondence with the monic irreducible polynomials over Fq (see [14, Sect. 1.5]).
For an integer s with 2 s q + 1, let P1 be the infinite place of F and for i =
2, . . . , s let Pi be the rational place of F corresponding to the polynomial X
ci Fq [X ], where c2 = 0, c3 , . . . , cs are distinct elements of Fq . Let D be the zero
j1
j
for j 1 and (i)
for
divisor of F. Then the elements (1)
j = X
j = (X ci )
2 i s and j 1 satisfy all properties in Lemma 4 (note that no choice of P is
needed for Lemma 4). There is an obvious relationship between these elements (i)
j
and the construction of Vandermonde sequences in Sect. 4 (compare also with the
construction leading to Theorem 2).
A trick that was used in [20] for the construction of good digital sequences comes
in handy now. We first determine a basis {w1 , . . . , wg } of the vector space L (D
P1 ) with dimension (D P1 ) = g as follows. By the Riemann-Roch theorem and
Lemma 3, we know the dimensions (D P1 ) = g and (D P1 2g P ) = 0.
Hence there exist integers 0 n 1 < < n g < 2g such that

102

R. Hofer and H. Niederreiter

(D P1 n r P ) = (D P1 (n r + 1)P ) + 1 for 1 r g.
Now we choose wr L (D P1 n r P )\L (D P1 (n r + 1)P ) to obtain the
basis {w1 , . . . , wg } of L (D P1 ). Note that P (wr ) = n r , P1 (wr ) 1 P1 (D),
and Pi (wr ) 0 for all 2 i s, 1 r g.
Lemma 5 With the notation above, the system {w1 , . . . , wg } { (i)
j }1is, j1 is linearly independent over Fq .
Proof The linear independence of { (i)
j } j1 for every fixed i = 1, . . . , s is obvious
from the known values of valuations in Lemma 4. Suppose that
g


ar wr +

r =1

s 
v


(i)
b(i)
j j = 0

i=1 j=1

for some integer v 1 and ar , b(i)


j Fq . For a fixed h = 2, . . . , s, we consider
v


(h)
b(h)
j j

g


ar wr

s 
v


r =1

j=1

i=1
i =h

(i)
b(i)
j j .

j=1

We abbreviate the left-hand side by . Now if we had = 0, then we know from


Lemma 4 that Ph () < 0. But the right-hand side satisfies Ph () 0. Hence = 0
and all coefficients b(h)
j on the left-hand side have to be 0. We arrive at the identity
v


(1)
b(1)
j j =

g


ar wr .

r =1

j=1

We abbreviate the left-hand side by . If there were a b(1)


j = 0 for at least one j 1,
then by Lemma 4 the left-hand side yields P1 ( ) P1 (D), but the right-hand
side shows that P1 ( ) P1 (D) + 1. Hence all b(1)
j , and therefore also all ar by

the basis property of {w1 , . . . , wg }, have to be 0.
Now we construct the generating matrices C (1) , . . . , C (s) of a digital sequence over
Fq . For 1 i s and j 1, the jth row of C (i) is determined by the coefficients of
(i)
the local expansion of (i)
j at the rational place P . Since P ( j ) 0 by Lemma 4,
this local expansion has the form
(i)
j =

a (i)
j,k z k

k=0

with coefficients a (i)


j,k Fq for k 0, j 1, 1 i s. The sequence (z k )k0 of
elements of F satisfies P (z k ) = k for k N0 \{n 1 , . . . , n g }, and for k = n r with

Vandermonde Nets and Vandermonde Sequences

103

r {1, . . . , g} we put z k = wr , so that P (z k ) = n r . This preliminary construction


yields the sequence


(i)
(i)
(i)
(i)
(a (i)
j,0 , . . . , a j,n 1 , a j,n 1 +1 , . . . , a j,n g , a j,n g +1 , . . .)
of elements of Fq for any j 1 and 1 i s. After deleting the terms with the hat,
(i)
we arrive at a sequence of elements of Fq which serves as the jth row c(i)
j of C ,
and we write
(i)
(i)
c(i)
j = (c j,0 , c j,1 , . . .).
Theorem 4 Let q be a prime power and let s 2 be an integer. Let F be a global
function field with full constant field Fq and with genus g 1. Suppose that F has at
least s + 1 rational places. Then the digital sequence with the generating matrices
C (1) , . . . , C (s) constructed above is a digital (t, s)-sequence over Fq with t = g.
Proof We proceed by
Lemma 1 and prove that for any integer m > g and any integers
s
di = m g, the vectors
d1 , . . . , ds 0 with i=1
(i)
(i)
m
c(i)
j,m := (c j,0 , . . . , c j,m1 ) Fq

with 1 j di , 1 i s, are linearly independent over Fq . Choose b(i)


j Fq for
1 j di , 1 i s, satisfying
di
s 


(i)
m
b(i)
j c j,m = 0 Fq .

(13)

i=1 j=1

We can assume without loss of generality that d1 , . . . , ds 1. The linearity of the


local expansion implies that
:=

di
s 

i=1 j=1

(i)
b(i)
j j

di
s 


b(i)
j

a (i)
j,nr wr

r =1

i=1 j=1

g




=:


k=0
k =n 1 ,...,n g

di
s 


(i)
b(i)
zk .
j a j,k

i=1 j=1

In view of the construction algorithm and (13), we obtain


=


k=m+g

di
s 

(i)

b(i)
zk .
j a j,k
i=1 j=1

Therefore P () m + g. We observe that


L (D P1 ) L (D + (d1 1)P1 + d2 P2 + + ds Ps ),

104

R. Hofer and H. Niederreiter


(1)

L (D + ( j 1)P1 ( j 1)P2 ) L (D + (d1 1)P1 + d2 P2 + + ds Ps )

for 1 j d1 , and
(i)

j L (D + j Pi j P1 ) L (D + di Pi P1 ) L (D + (d1 1)P1 + d2 P2 + + ds Ps )

for 1 j di , 2 i s. This together with P () m + g implies that


L (D + (d1 1)P1 + d2 P2 + + ds Ps (m + g)P ) =: L (Dm,d1 ,...,ds ).
We compute the degree of Dm,d1 ,...,ds and obtain
deg(Dm,d1 ,...,ds ) = 2g + m g 1 (m + g) = 1,
which entails = 0 by Lemma 3. Finally, the linear independence over Fq of the
(i)
system {w1 , . . . , wg } { (i)
j }1is, j1 shown in Lemma 5 yields b j = 0 for 1

j di , 1 i s.
For the missing case g = 0 in Theorem 4, we have the Faure-Niederreiter
sequences obtained from the rational function field Fq (X ) which yield digital (0, s)sequences over Fq for every dimension s q (see [2, 7], and [8, Remark 4.52]). It
follows from this and Theorem 4 that for every prime power q and every integer s 1,
there exists a digital (Vq (s), s)-sequence over Fq , where Vq (s) is the least value of
g 0 for which there is a global function field with full constant field Fq and genus
g containing at least s + 1 rational places. For fixed q, we have Vq (s) = O(s) as
s with an absolute implied constant by [12, Theorem 4]. The (t, s)-sequences
obtained from Theorem 4 are asymptotically optimal in terms of the quality parameter since it is known that for any fixed base b 2, the values of t for (t, s)-sequences
in base b must grow at least linearly as a function of s as s . The currently best
version of the latter result can be found in [18].
There is also a construction of Vandermonde sequences for the case where
P1 , . . . , Ps are again s 2 distinct rational places of the global function field F,
but where P is a place of F with degree k 2 (see [4, Sect. 3]). This construction
yields a digital (T, s)-sequence over Fq with
T(m) = min (m, 2g + rk (m))

for all m N,

where g is the genus of F and rk (m) is as in Theorem 3.


Acknowledgments The first author was supported by the Austrian Science Fund (FWF), Project
F5505-N26, which is a part of the Special Research Program Quasi-Monte Carlo Methods: Theory
and Applications.

Vandermonde Nets and Vandermonde Sequences

105

References
1. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte
Carlo Integration. Cambridge University Press, Cambridge (2010)
2. Faure, H.: Discrpance de suites associes un systme de numration (en dimension s). Acta
Arith. 41, 337351 (1982)
3. Faure, H., Kritzer, P.: New star discrepancy bounds for (t, m, s)-nets and (t, s)-sequences.
Monatsh. Math. 172, 5575 (2013)
4. Hofer, R., Niederreiter, H.: Explicit constructions of Vandermonde sequences using global
function fields, preprint available at http://arxiv.org/abs/1311.5739
5. Hofer, R., Niederreiter, H.: Vandermonde nets. Acta Arith. 163, 145160 (2014)
6. Kritzer, P.: Improved upper bounds on the star discrepancy of (t, m, s)-nets and (t, s)sequences. J. Complex. 22, 336347 (2006)
7. Niederreiter, H.: Point sets and sequences with small discrepancy. Monatsh. Math. 104, 273
337 (1987)
8. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM,
Philadelphia (1992)
9. Niederreiter, H.: (t, m, s)-nets and (t, s)-sequences. In: Mullen, G.L., Panario, D. (eds.) Handbook of Finite Fields, pp. 619630. CRC Press, Boca Raton (2013)
10. Niederreiter, H.: Finite fields and quasirandom points. In: Charpin, P., Pott, A., Winterhof, A.
(eds.) Finite Fields and Their Applications: Character Sums and Polynomials, pp. 169196. de
Gruyter, Berlin (2013)
11. Niederreiter, H., zbudak, F.: Low-discrepancy sequences using duality and global function
fields. Acta Arith. 130, 7997 (2007)
12. Niederreiter, H., Xing, C.P.: Low-discrepancy sequences and global function fields with many
rational places. Finite Fields Appl. 2, 241273 (1996)
13. Niederreiter, H., Xing, C.P.: Rational Points on Curves over Finite Fields: Theory and Applications. Cambridge University Press, Cambridge (2001)
14. Niederreiter, H., Xing, C.P.: Algebraic Geometry in Coding Theory and Cryptography. Princeton University Press, Princeton (2009)
15. Pillichshammer, F.: Polynomial lattice point sets. In: Plaskota, L., Wozniakowski, H. (eds.)
Monte Carlo and Quasi-Monte Carlo Methods 2010, pp. 189210. Springer, Berlin (2012)
16. Pirsic, G.: A small taxonomy of integration node sets. sterreich. Akad. Wiss. Math.-Naturw.
Kl. Sitzungsber. II(214), 133140 (2005)
17. Pirsic, G., Dick, J., Pillichshammer, F.: Cyclic digital nets, hyperplane nets, and multivariate
integration in Sobolev spaces. SIAM J. Numer. Anal. 44, 385411 (2006)
18. Schrer, R.: A new lower bound on the t-parameter of (t, s)-sequences. In: Keller, A., Heinrich,
S., Niederreiter, H. (eds.)Monte Carlo and Quasi-Monte Carlo Methods 2006, pp. 623632.
Springer, Berlin (2008)
19. Stichtenoth, H.: Algebraic Function Fields and Codes, 2nd edn. Springer, Berlin (2009)
20. Xing, C.P., Niederreiter, H.: A construction of low-discrepancy sequences using global function
fields. Acta Arith. 73, 87102 (1995)

Path Space Markov Chain Monte Carlo


Methods in Computer Graphics
Wenzel Jakob

Abstract The objective of a rendering algorithm is to compute a photograph of a


simulated reality, which entails finding all the paths along which light can flow from a
set of light sources to the camera. The purpose of this article is to present a high-level
overview of the underlying physics and analyze how this leads to a high-dimensional
integration problem that is typically handled using Monte Carlo methods. Following
this, we survey recent work on path space Markov Chain Monte Carlo (MCMC)
methods that compute the resulting integrals using proposal distributions defined on
sets of light paths.
Keywords Rendering Path space Specular manifold MCMC

1 Introduction
The central goal of light transport algorithms in computer graphics is the generation
of renderings, two-dimensional images that depict a simulated environment as if
photographed by a virtual camera. Driven by the increasing demand for photorealism,
computer graphics is currently undergoing a substantial transition to physics-based
rendering techniques that compute such images while accurately accounting for the
interaction of light and matter.
These methods require a detailed model of the scene including the shape and
optical properties of all objects including light sources; the final rendering is then
generated by a simulation of the relevant physical laws, specifically transport and
scattering, i.e., the propagation of light and its interaction with the materials that
comprise the objects. In this article, we present a high-level overview of the underlying physics and analyze how this leads to a high-dimensional integration problem
that is typically handled using Monte Carlo methods.

W. Jakob (B)
Realistic Graphics Lab, EPFL, Lausanne, Switzerland
e-mail: wenzel.jakob@epfl.ch
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_4

107

108

W. Jakob

Section 2 begins with a discussion of the geometric optics framework used in computer graphics. After defining the necessary notation and physical units, we state
the energy balance equation that characterizes the interaction of light and matter.
Section 3 presents a simple recursive Monte Carlo estimator that solves this equation, though computation time can be prohibitive if accurate solutions are desired.
Section 4 introduces path space integration, which offers a clearer view of the underlying light transport problem. This leads to a large class of different estimators that
can be combined to improve convergence. Section 5 introduces MCMC methods in
rendering. Section 6 covers an MCMC method that explores a lower-dimensional
manifold of light paths, and Sect. 7 discusses extensions to cases involving interreflection between glossy objects. Section 8 concludes with a discussion of limitations and unsolved problems.
This article is by no means a comprehensive treatment of rendering; the selection
of topics is entirely due to the authors personal preference. It is intended that the
discussion will be helpful to readers who are interested in obtaining an understanding of recent work on path-space methods and applications of MCMC methods in
rendering.

2 Geometric Optics and Light Transport on Surfaces


Light transport simulations in computer graphics are generally conducted using a
simplified variant of geometric optics. In this framework, light moves along a straight
line until an interaction (i.e., a scattering event) occurs, which involves a change of
direction and potentially some absorption. The wave-like nature of light is not simulated, which leads to a simpler computation and is an excellent approximation in
general (the wavelength of visible light is minuscule compared to the sizes of everyday objects). Light is also assumed to be incoherent and unpolarized, and although it
moves at a finite speed, this motion is not modeled explicitly. More complex theories
without these assumptions are available but ultimately not needed since the phenomena described by them are in most cases too subtle to be observed by humans. For
the sake of simplicity, we only discuss monochromatic rendering in this article; the
generalization to the full color spectrum poses no fundamental difficulties.
In the following sections, we review relevant background material, starting with
the standard light transport model used in computer graphics and leading up to the
path space framework proposed by Veach [28].
In geometric optics, light is usually quantified using radiance, which has units of
W sr 1 m2 . Given a point x R3 and a direction S 2 , the radiance L(x, ) is a
density function that describes how much illumination flows through the point, in this
direction. Radiance can be measured by registering the amount of energy arriving
on a small surface patch dA at x that is perpendicular to and sensitive to a small
cone of directions d around , and then letting the surface and solid angle tend to
zero. For a thorough review of radiance and many related radiometric quantities, we
refer the reader to Preisendorfer [25].

Path Space Markov Chain Monte Carlo Methods

109

An important property of radiance is that it remains invariant along rays when


there are no obstructions (e.g., in vacuum),
L(x, ) = L(x + t, ), t [0, tobstr ).
Due to this property, a complete model of a virtual environment can be obtained
simply by specifying how L behaves in places where an obstruction interacts with
the illumination, i.e., at the boundaries of objects or inside turbid substances like fog
or milk. In this article, we only focus on the boundary case for simplicity. For a more
detailed discussion including volumetric scattering, we refer to [10].
We assume that the scene to be rendered is constructed from a set of surfaces that
all lie inside a bounded domain R3 . The union of these surfaces is denoted
M and assumed to be a differentiable manifold, i.e. is parameterized by a set
of charts with differentiable transition maps.
Furthermore, let N : M S 2 denote the Gauss map, which maps surface
positions to normal directions on the unit sphere.
Because boundaries of objects introduce discontinuities in the radiance function
L, we must take one-sided limits to distinguish between the exterior radiance function
L + (x, ) and the interior radiance function L (x, ) at surface locations x M as
determined by the normal N(x) (Fig. 1). Based on these limits, intuitive incident and
outgoing radiance functions can then be defined as

L + (x, ),
Li (x, ) :=
L (x, ),

L + (x, ),
Lo (x, ) :=
L (x, ),

N(x) > 0
N(x) < 0

and

N(x) > 0
.
N(x) < 0

With the help of these definitions, we can introduce the surface energy balance
equation that describes the relation between the incident and outgoing radiance based
on the material properties at x:

Lo (x, ) =

S2



Li (x,  ) f (x,  )  N(x) d + Le (x, ), x M . (1)

The integration domain S 2 is the unit sphere and f is the bidirectional scattering distribution function (BSDF) of the surface, which characterizes the surfaces response

Fig. 1 Limits of the


radiance function L from
above and below

110

W. Jakob
Incident
radiance

Reflectance
Forefunction
shortening

Emitted
radiance

Final
pixel color

Fig. 2 Illustration of the energy balance Eq. (1) on surfaces. Here, it is used to compute the pixel
color of the surface location highlighted in white (only the top hemisphere is shown in the figure)

to illumination from different directions. Given illumination reaching a point x from


a direction  , the BSDF expresses how much of this illumination is scattered into
the direction . For a detailed definition of the concept of a BSDF as well as other
types of scattering functions, we refer the reader to Nicodemus [22]. The function
Le (x, ) is the source term which specifies how much light is emitted from position
x into direction ; it is zero when the position x is not located on a light source.
Figure 2 visualizes the different terms in Eq. (1) over the top hemisphere. The
example shows a computation of the radiance traveling from the surface location
marked with a white dot towards the camera. The first term is an integral over the
incident radiance as seen from the surface location. The integral also contains the
BSDF and a cosine foreshortening term which models the effect that a beam of light
arriving at a grazing angle spreads out over a larger region on the receiving surface
and thus deposits less energy per unit area. The ceiling of the scene is made of
rough metal; its reflectance function effectively singles out a small portion of the
incident illumination, which leads to a fairly concentrated reflection compared to
the other visible surfaces. The emission term is zero, since the highlighted surface
position is not located on a light source.
Considerable research has been conducted on characterizing the reflectance properties of different materials, and these works have proposed a wide range of BSDF
functions f that reproduce their appearance in renderings. Figure 3 shows several
commonly used BSDF models, along with the resulting material appearance. The
illustrations left of the renderings show polar plots of the BSDF f ( ) where
the surface receives illumination from a fixed incident direction  highlighted in
red. The primary set of reflected directions is shown in blue, and the transmitted
directions (if any) are shown in green.
Specular materials shown in the top row are characterized by having a degenerate BSDF f that is described by a Dirac delta distribution. For instance, a mirror
reflects light arriving from into only a single direction  = 2N(x)( N(x)) .
In comparison, rough materials usually have a smooth function f . BSDFs based on

Path Space Markov Chain Monte Carlo Methods

Smooth conducting material

Smooth dielectric material

Rough conducting material

Rough dielectric material

111

Smooth diffuse material

Fig. 3 An overview of common material types. The left side of each example shows a 2D illustration
of the underlying scattering process for light arriving from the direction highlighted in red. The
right side shows a corresponding rendering of a material test object

microfacet theory [4, 27, 32] are a popular choice in particularthey model the interaction of light with random surfaces composed of tiny microscopic facets that are
oriented according to a statistical distribution. Integration over this distribution then
leads to simple analytic expressions that describe the expected reflection and refraction properties at a macroscopic scale. In this article, we assume that the BSDFs are
provided as part of the input scene description and will not discuss their definitions
in detail.

3 Path Tracing
We first discuss how Eq. (1) can be solved using Monte Carlo integration, which leads
to a simple method known as Path Tracing [12]. For this, it will be convenient to
establish some further notation: we define the distance to the next surface encountered
by the ray (x, ) R3 S 2 as
dM (x, ) := inf {d > 0 | x + d M }
where inf = . Based on this distance, we can define a ray-casting function r:
r(x, ) := x + dM (x, ).

(2)

112

W. Jakob

Due to the preservation of radiance along unoccluded rays, the ray-casting function
can be used to relate the quantities Li and Lo :
Li (x, ) = Lo (r(x, ), ).
In other words, to find the incident radiance along a ray (x, ), we must only determine the nearest surface visible in this direction and evaluate its outgoing radiance
into the opposite direction. Using this relation, we can eliminate Li from the energy
balance Eq. (1):

Lo (x, ) =

S2



Lo (r(x,  ),  ) f (x,  )  N(x) d + Le (x, )

(3)

Although the answer is still not given explicitly, the equation is now in a form
that is suitable for standard integral equation solution techniques. However, this is
made difficult by the ill-behaved nature of the integrand, which is generally riddled
with singularities and discontinuities caused by visibility changes in the ray-casting
function r. Practical solution methods often rely on a Neumann series expansion of the
underlying integral operator, in which case the resulting high number of dimensions
rules out standard deterministic integration rules requiring an exponential number
of function evaluations. Monte Carlo methods are resilient to these issues and hence
see significant use in rendering.
To obtain an unbiased MC estimator based on Eq. (3), we replace the integral with a single sample of the integrand at a random direction ω' and divide by its probability density p(ω'), i.e.

L̂_o(x, ω) = [ L_o(r(x, ω'), −ω') f(x, ω' → ω) |ω' · N(x)| ] / p(ω') + L_e(x, ω)          (4)

In this case, E_p[L̂_o] = L_o, and by averaging many estimates L̂_o, we obtain an approximation of the original integral. Typically, some form of importance sampling is employed, e.g. by choosing a sampling density function p(ω') ∝ f(x, ω' → ω).
Algorithm 1 shows the pseudo-code of the resulting recursive method. Based on the underlying sequence of spherical sampling steps, path tracing can also be interpreted as a method that generates trajectories along which light is carried from the light source to the camera; we refer to these trajectories as light paths and will revisit this concept in more detail later. In practice, the path tracing algorithm is combined with additional optimizations that lead to better convergence, but this is beyond the scope of this article.
Due to their simplicity and ability to produce photorealistic images, optimized path tracing methods have seen increased use in research and industrial applications. The downside of these methods is that they converge very slowly given challenging input, sometimes requiring days or even weeks to compute a single image on state-of-the-art computers. Problems arise whenever complete light paths are found with too low a probability; a typical example is shown in Fig. 5a.


Algorithm 1 Pseudocode of a simple Path Tracer

function L̂_o(x, ω)
    Return zero with probability q ∈ (0, 1).
    Sample a direction ω' proportional to f(x, ω' → ω); let the factor of proportionality be denoted as f_prop.
    Set x' = r(x, ω').
    Return 1/(1 − q) · [ L_e(x, ω) + f_prop · L̂_o(x', −ω') ].
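To make the recursion concrete, the following Python sketch mirrors Algorithm 1 under the assumption that the scene exposes the black-box queries used above (emission, BSDF sampling, and ray casting). The object `scene` and its method names are hypothetical stand-ins rather than the interface of any particular renderer.

import random

def estimate_Lo(x, w, scene, q=0.1):
    """One-sample estimate of the outgoing radiance L_o(x, w), as in Algorithm 1.

    Hypothetical black-box scene queries (assumed, not a real API):
      scene.emitted(x, w)        -> emitted radiance L_e(x, w)
      scene.sample_bsdf(x, w)    -> (w_prime, f_prop): a direction sampled with
                                    density proportional to the BSDF, plus the
                                    factor of proportionality
      scene.ray_cast(x, w_prime) -> nearest surface point r(x, w_prime), or None
    """
    # Russian roulette termination with probability q; dividing the surviving
    # estimate by (1 - q) keeps the estimator unbiased.
    if random.random() < q:
        return 0.0

    w_prime, f_prop = scene.sample_bsdf(x, w)
    x_prime = scene.ray_cast(x, w_prime)

    incident = 0.0
    if x_prime is not None:
        # Incident radiance along (x, w') equals the outgoing radiance of the
        # nearest visible surface, evaluated in the opposite direction.
        incident = estimate_Lo(x_prime, tuple(-c for c in w_prime), scene, q)

    return (scene.emitted(x, w) + f_prop * incident) / (1.0 - q)

Averaging many such estimates per camera ray yields the pixel values discussed in the remainder of this section.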

4 The Path Space Formulation of Light Transport


In this section, we discuss the path space formulation of light transport, which provides a clearer view of the sampling operations performed by Algorithm 1. This
framework can be used to develop other types of integration methods, including
ones based on MCMC proposals that we discuss afterwards.
The main motivation for using path space is that it provides an explicit expression
for the value of the radiance function as an integral over light paths, as opposed to
the unwieldy recursive estimations on spherical domains in Algorithm 1. This allows
for considerable freedom in developing and comparing sampling strategies. The path
space framework was originally developed by Veach [28] and builds on a theoretical
analysis of light transport operators by Arvo [1]. Here, we only present a high-level
sketch.
Let us define an integral operator T,

(T h)(x, ω) := ∫_{S^2} h(r(x, ω'), −ω') f(x, ω' → ω) |ω' · N(x)| dω',          (5)

and use it to rewrite Eq. (3) as

L_o = T L_o + L_e.

An explicit solution for L_o can be found by inverting the operator so that

L_o = (1 − T)^{−1} L_e.
Let ‖·‖_L be a norm on the space of radiance functions,

‖h‖_L := ∫_M ∫_{S^2} |h(x, ω)| |ω · N(x)| dω dA(x),

which induces a corresponding operator norm ‖T‖_op = sup_{‖h‖_L ≤ 1} ‖T h‖_L. Veach proved that physically realizable scenes satisfy ‖T^l‖_op < 1 for some fixed l ∈ N. Given this property, it is not only guaranteed that the inverse operator (1 − T)^{−1} exists, but it can also be computed using a Neumann series expansion:

(1 − T)^{−1} = I + T + T^2 + ⋯,


which intuitively expresses the property that the outgoing radiance is equal to the emitted radiance plus radiance that has scattered one or more times (the sum converges since the energy of the multiply scattered illumination tends to zero):

L_o = L_e + T L_e + T^2 L_e + ⋯.          (6)
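As a minimal numerical illustration of this expansion (not part of the original derivation), the following Python sketch uses a small random matrix as a stand-in for the operator T, normalized so that its operator norm is below one, and compares the truncated Neumann series against a direct solve:

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5))
T = A / (2.0 * np.linalg.norm(A, 2))   # stand-in operator with norm <= 0.5
Le = rng.random(5)

Lo_direct = np.linalg.solve(np.eye(5) - T, Le)   # (1 - T)^{-1} Le
Lo_series = np.zeros(5)
term = Le.copy()
for _ in range(60):                              # Le + T Le + T^2 Le + ...
    Lo_series += term
    term = T @ term

print(np.max(np.abs(Lo_direct - Lo_series)))     # agreement up to roundoff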

Rather than explicitly computing the radiance function L_o, the objective of rendering is usually to determine the response of a simulated camera to illumination that reaches its aperture. Suppose that the sensitivity of pixel j in the camera is given by a sensitivity profile function W_e^(j) : M × S^2 → R defined on ray space. The intensity I_j of the pixel is given by

I_j = ∫_M ∫_{S^2} W_e^(j)(x, ω) L_o(r(x, ω), −ω) |ω · N(x)| dω dA(x),          (7)

which integrates over its sensitivity function weighted by the outgoing radiance on surfaces that are observed by the camera. The spherical integral in the above expression involves an integrand that is evaluated at the closest surface position as seen from the ray (x, ω). It is convenient to switch to a different domain involving only area integrals. We can transform the above integral into this form using the identity

∫_{S^2} q(r(x, ω)) |ω · N(x)| dω = ∫_M q(y) G(x ↔ y) dA(y),          (8)

where x, y ∈ M, and q : M → R is any integrable function defined on surfaces, and G is the geometric term [24] defined as

G(x ↔ y) := V(x ↔ y) · |N(x) · ω_{x→y}| |ω_{x→y} · N(y)| / ‖x − y‖^2          (9)

The double arrows emphasize the symmetric nature of this function, ω_{x→y} is the normalized direction from x to y, and V is a visibility function defined as

V(x ↔ y) := 1, if { λ x + (1 − λ) y | 0 < λ < 1 } ∩ M = ∅,
            0, otherwise.          (10)

Applying the change of variables (8) to Eq. (7) yields

I_j = ∫_{M×M} W_e^(j)(x, ω_{x→y}) L_o(y, ω_{y→x}) G(x ↔ y) dA(x, y).          (11)

We can now substitute Lo given by Eq. (6) into the above integral, which is a power
series of the T operator (i.e. increasingly nested spherical integrals). Afterwards, we
apply the change of variables once more to convert all nested spherical integrals into


nested surface integrals. This is tedious but straightforward and leads to an explicit expression of I_j in terms of an infinite series of integrals over increasing Cartesian powers of M.
These nested integrals over surfaces are due to the propagation of light along straight lines and changes of direction at surfaces, which leads to the concept of a light path. This can be thought of as the trajectory of a particle carrying an infinitesimal portion of the illumination. It is a piecewise linear curve x̄ = x_1 ⋯ x_n with endpoints x_1 and x_n and intermediate scattering vertices x_2, . . . , x_{n−1}. The space of all possible light paths is a union consisting of paths with just the endpoints, paths that have one intermediate scattering event, and so on. More formally, we define path space as

P := ⋃_{n=2}^∞ P_n,   and   P_n := { x_1 ⋯ x_n | x_1, . . . , x_n ∈ M }.          (12)

The nested integrals which arose from our manipulation of Eq. (11) are simply integrals over light paths of different lengths, i.e.

I_j = ∫_{P_2} f_j(x_1 x_2) dA(x_1, x_2) + ∫_{P_3} f_j(x_1 x_2 x_3) dA(x_1, x_2, x_3) + ⋯.          (13)

Because some paths carry more illumination from the light source to the camera than others, the integrand f_j : P → R is needed to quantify their light-carrying capacity; its definition varies based on the number of input arguments and is given by Eq. (15). The total illumination I_j arriving at the camera is often written more compactly as an integral of f_j over the entire path space, i.e.:

I_j = ∫_P f_j(x̄) dA(x̄).          (14)

The definition of the weighting function f_j consists of a product of terms, one for each vertex and edge of the path:

f_j(x_1 ⋯ x_n) = L_e(x_1 → x_2) [ ∏_{k=2}^{n−1} G(x_{k−1} ↔ x_k) f(x_{k−1} → x_k → x_{k+1}) ] · G(x_{n−1} ↔ x_n) W_e^(j)(x_{n−1} → x_n).          (15)

The arrows in the above expression symbolize the symmetry of the geometric terms as well as the flow of light at vertices: x_i → x_{i+1} can also be read as a spatial argument x_i followed by a directional argument ω_{x_i → x_{i+1}}. Figure 4 shows an example light path and the different weighting terms. We summarize their meaning once more:


Fig. 4 Illustration of a simple light path with four vertices and its corresponding weighting function

- L_e(x_1 → x_2) is the emission profile of the light source. This term expresses the amount of radiance emitted from position x_1 traveling towards x_2. It is equal to zero when x_1 is not located on a light source.
- W_e^(j)(x_{n−1} → x_n) is the sensitivity profile of pixel j of the camera; we can think of the pixel grid as an array of sensors, each with its own profile function.
- G(x ↔ y) is the geometric term (Eq. 9), which specifies the differential amount of illumination carried along segments of the light path. Among other things, it accounts for visibility: when there is no unobstructed line of sight between x and y, G evaluates to zero.
- f(x_{k−1} → x_k → x_{k+1}) is the BSDF, which specifies how much of the light that travels from x_{k−1} to x_k is then scattered towards position x_{k+1}. This function essentially characterizes the material appearance of an object (e.g., whether it is made of wood, plastic, concrete, etc.).
Over the last 40 years, considerable research has investigated realistic expressions for the L_e, W_e, and f terms. In this article, we do not discuss their definitions and prefer
to think of them as black box functions that can be queried by the rendering algorithm. This is similar to how rendering software is implemented in practice: a scene
description might reference a particular material (e.g., car paint) whose corresponding function f is provided by a library of material implementations. The algorithm
accesses it through a high-level interface shared by all materials, but without specific
knowledge about its internal characteristics.
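Treating these four terms as black boxes, evaluating Eq. (15) for a given path is a short loop. The following Python sketch assumes the terms are supplied as callables; the argument names are illustrative rather than a fixed interface.

def path_contribution(path, Le, f, G, We):
    """Evaluate the measurement contribution f_j(x_1 ... x_n) of Eq. (15).

    `path` is a list of vertices [x_1, ..., x_n] (n >= 2); the terms are
    black-box callables, mirroring how a renderer queries scene data:
      Le(x1, x2)       emission profile of the light source
      We(x_prev, x_n)  pixel sensitivity profile of the camera
      G(x, y)          geometric term, including the visibility test
      f(xa, xb, xc)    BSDF at xb for light flowing xa -> xb -> xc
    """
    n = len(path)
    value = Le(path[0], path[1])
    # Interior vertices x_2 ... x_{n-1}: one geometric term and one BSDF each.
    for k in range(1, n - 1):
        value *= G(path[k - 1], path[k]) * f(path[k - 1], path[k], path[k + 1])
    # Final segment towards the camera vertex x_n.
    value *= G(path[n - 2], path[n - 1]) * We(path[n - 2], path[n - 1])
    return value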


4.1 Regular Expressions to Select Sets of Light Paths


Different materials can interact with light in fundamentally different ways, which
has important implications on the design of rendering algorithms. It is helpful to
distinguish between interactions using a 1-letter classification for each vertex type:
S (ideal specular): specular surfaces indicate boundaries between materials with
different indices of refraction (e.g., air and water). Ideal specular boundaries have
no roughness and cause an incident ray of light to be scattered into a discrete set
of outgoing directions (Fig. 3). Examples of specular materials include polished
glass and metal surfaces and smooth coatings.
G (glossy): glossy surfaces also mark an index of refraction transition, but in this
case the surface is affected by small-scale roughness. This causes the same ray
to scatter into a continuous distribution of directions which concentrates around
the same directions as the ideally smooth case.
D (diffuse): diffuse surfaces reflect light into a directional distribution that is either
uniform or close to uniform; examples include clay and plaster.
We additionally assign the labels L and E to light source and camera (eye) vertices,
respectively, allowing for the classification of entire light paths using a sequence of
symbols (e.g., LSDSE). Larger classes of paths can be described using Heckbert's path regular expressions [8], which add convenient regular expression rules such as the Kleene star '*' and plus '+' operators. For instance, LD+E refers to light that
has been scattered only by diffuse surfaces before reaching the camera. We will use
this formalism shortly.
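Because Heckbert's path expressions are ordinary regular expressions over the vertex alphabet, classifying a path is a one-liner with any regular expression engine. The following Python sketch matches two of the classes mentioned above against a few hypothetical path classification strings:

import re

# Heckbert-style path expressions over the vertex alphabet {L, S, G, D, E}.
CAUSTIC = re.compile(r"^LS+DS*E$")    # caustic paths discussed in Sect. 4.3
DIFFUSE_ONLY = re.compile(r"^LD+E$")  # light scattered only by diffuse surfaces

for path in ["LSDSE", "LDDE", "LSDE", "LGDE"]:
    print(path,
          "caustic" if CAUSTIC.match(path) else "-",
          "diffuse-only" if DIFFUSE_ONLY.match(path) else "-")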

4.2 Path Tracing Variants


The path tracing algorithm discussed in Sect. 3 constructs complete light paths by randomly sampling them one vertex at a time (we refer to this as sequential sampling). In each iteration, it randomly chooses an additional light path vertex x_{i−1} using a probability density that is proportional to the (partial) weighting function f_j(⋯ x_{i−1} x_i x_{i+1} ⋯), involving only factors that depend on the previous two vertices, i.e. x_i and x_{i+1} (this is a variant of the Markov property). The indices decrease
because the algorithm constructs paths in reverse; intuitively, it searches for the trajectory of an idealized light particle that moves backwards in time until its emission
point on the light source is found.
Path tracing performs poorly when the emission point of a light path is challenging
to find, so that complete light paths are constructed with low probability. This occurs
in a wide range of situations; Fig. 5 shows an example where the light sources are
encased, making it hard to reach them by chance. The path tracing rendering has
unacceptably high variance at 32 samples per pixel.
The path space view makes it possible to construct other path tracing variants with
better behavior. For instance, we can reverse the direction of the random walk and


Fig. 5 A bidirectional path tracer finds light paths by generating partial paths starting at the camera
and light sources and connecting them in every possible way. The resulting statistical estimators
tend to have lower variance than unidirectional techniques. Modeled after a scene by Eric Veach. a
Path tracer, 32 samples/pixel. b Bidirectional path tracer, 32 samples/pixel

generate vertex x_{i+1} from x_i and x_{i−1}, which leads to a method referred to as light
tracing or particle tracing. This method sends out particles from the light source
(thus avoiding problems with the enclosure) and records the contribution to rendered
pixels when they hit the aperture of the camera.

4.2.1 Bidirectional Path Tracing (BDPT)

The bidirectional path tracing method (BDPT) [17, 29] computes radiance estimates
via two separate random walks from the light sources and the camera. The resulting
two partial paths are connected for every possible vertex pair, creating many complete
paths of different lengths, which supplies this method with an entire family of path
sampling strategies. A path with n vertices can be created in n + 1 different ways,
which is illustrated by Fig. 6 for a simple path with 3 vertices (2 endpoints and 1
scattering event). The captions s and t indicate the number of sampling steps from
the camera and light source. In practice, each of the strategies is usually successful at
dealing with certain types of light paths, while being a poor choice for others (Fig. 7).

4.2.2 Multiple Importance Sampling (MIS)

Because all strategies are defined on the same space (i.e. path space), and because
each has a well-defined density function on this space, it is possible to evaluate
and compare these densities to determine the most suitable strategy for sampling
particular types of light paths. This is the key insight of multiple importance sampling


(a) s=0, t=3   (b) s=1, t=2   (c) s=2, t=1   (d) s=3, t=0
Fig. 6 The four different ways in which bidirectional path tracing can create a path with one
scattering event: a Standard path tracing, b Path tracing variant: connect to sampled light source
positions, c Standard light tracing, d Light tracing variant: connect to sampled camera positions.
Solid lines indicate sampled rays which are intersected with the geometry, whereas dashed lines
indicate deterministic connection attempts which must be validated by a visibility test

(MIS) [30] which BDPT uses to combine multiple sampling strategies in a provably
good way to minimize variance in the resulting rendering (bottom of Fig. 7).
Suppose two statistical estimators of the pixel intensity I_j are available. These estimators can be used to generate two light paths x̄_1 and x̄_2, which have path space probability densities p_1(x̄_1) and p_2(x̄_2), respectively. The corresponding MC estimates are given by

⟨I_j^(1)⟩ = f_j(x̄_1) / p_1(x̄_1)   and   ⟨I_j^(2)⟩ = f_j(x̄_2) / p_2(x̄_2).

To obtain a combined estimator, we could simply average these estimators, i.e.:

⟨I_j^(3)⟩ := 1/2 ( ⟨I_j^(1)⟩ + ⟨I_j^(2)⟩ ).

However, this is not a good idea, since the combination is affected by the variance of the worst ingredient estimator (BDPT generally uses many estimators, including ones that have very high variance). Instead, MIS combines estimators using weights that are related to the underlying sample density functions:

⟨I_j^(4)⟩ := w_1(x̄_1) ⟨I_j^(1)⟩ + w_2(x̄_2) ⟨I_j^(2)⟩,   where   w_i(x̄) := p_i(x̄) / ( p_1(x̄) + p_2(x̄) ).          (16)
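The following Python sketch applies the weighting of Eq. (16) to a toy one-dimensional integral rather than to path space; the integrand and the two sampling densities are made up purely for illustration, but the combination rule is exactly the balance heuristic described above.

import random

def balance_weight(i, x, densities):
    """Balance heuristic w_i(x) = p_i(x) / sum_k p_k(x), as in Eq. (16)."""
    return densities[i](x) / sum(p(x) for p in densities)

def mis_estimate(f, samples, densities):
    """One MIS estimate: strategy i contributes w_i(x_i) f(x_i) / p_i(x_i)."""
    return sum(balance_weight(i, x, densities) * f(x) / densities[i](x)
               for i, x in enumerate(samples))

# Toy 1D check: integrate f over [0,1] with a uniform and a linear density.
f = lambda x: 3.0 * x * x
p_uniform = lambda x: 1.0
p_linear = lambda x: 2.0 * x
estimates = []
for _ in range(20000):
    x_uni = random.random()
    x_lin = random.random() ** 0.5       # inversion method for p(x) = 2x
    estimates.append(mis_estimate(f, [x_uni, x_lin], [p_uniform, p_linear]))
print(sum(estimates) / len(estimates))   # should be close to 1.0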



Fig. 7 The individual sampling strategies that comprise the previous BDPT rendering, both a
without and b with multiple importance sampling. Each row corresponds to light paths of a certain
length, and the top row matches the four strategies from Fig. 6. Almost every strategy has deficiencies
of some kind; multiple importance sampling re-weights samples to use strategies where they perform
well


While not optimal, Veach proves that no other choice of weighting functions can
significantly improve on Eq. (16). He goes on to propose a set of weighting heuristics that combine many estimators (i.e., more than two), and which yield perceptually
better results. The combination of BDPT and MIS often yields an effective method
that addresses many of the flaws of the path tracing algorithm. Yet, even this combination can fail in simple cases, as we will discuss next.

4.3 Limitations of Monte Carlo Path Sampling


Ultimately, all Monte Carlo path sampling techniques can be seen to compute integrals of the weighting function f_j using a variety of importance sampling techniques that evaluate it at many randomly chosen points throughout the integration domain,
i.e., path space P.
Certain input, particularly scenes containing metal, glass, or other shiny surfaces,
can lead to integrals that are difficult to evaluate. Depending on the roughness of the
surfaces, the integrand can take on large values over small regions of the integration
domain. Surfaces of lower roughness lead to smaller and higher-valued regions,
which eventually collapse to lower-dimensional sets with singular integrands as the
surface roughness tends to zero. This case where certain paths cannot be sampled at
all is known as the problem of insufficient techniques [16].
Convergence problems arise whenever high-valued regions receive too few samples. Depending on the method used, this manifests as objectionable noise or other
visual artifacts in the output image that gradually disappear as the sample count N
tends to infinity. However, due to the slow convergence rate of MC integration (the typical error is O(N^{−0.5})), it may not be an option to wait for the error to average out.
Such situations can force users of rendering software to make unrealistic scene modifications (e.g., disabling certain light interactions), thereby compromising realism in
exchange for obtaining converged-looking results within a reasonable time. Biased
estimators can achieve lower errors in some situations; however, these methods are beyond the scope of this article, and we refer the reader to Pharr et al. [24] for an overview.
Figure 8 illustrates the behavior of several path sampling methods when rendering
caustics, which we define as light paths matching the regular expression LS+DS*E.
They form interesting light patterns at the bottom of the swimming pool due to
the focusing effect of ripples in the water surface.
In Fig. 8a, light tracing is used to emit particles proportional to the light source
emission profile Le . The highlighted path is the trajectory of a particle that encounters the water surface and refracts into the pool. The refraction is an ideal specular
interaction described by Snell's law and the Fresnel equations. The diffuse concrete
surface at the pool bottom then reflects the particle upwards into a direction drawn
from a uniform distribution, where it is refracted once more by the water surface.
Ultimately, the particle never hits the camera aperture and thus cannot contribute to
the output image.

(a) Path tracing from the light source   (b) Path tracing from the camera   (c) Bidirectional path tracing

Fig. 8 Illustration of the difficulties of sequential path sampling methods when rendering LSDSE
caustic patterns at the bottom of a swimming pool. a, b Unidirectional techniques sample light paths
by executing a random walk consisting of alternating transport and scattering steps. The only way to
successfully complete a path in this manner is to randomly hit the light source or camera, which
happens with exceedingly low probability. c Bidirectional techniques trace paths from both sides,
but in this case they cannot create a common vertex at the bottom of the pool to join the partial light
paths

Figure 8b shows the behavior of the path tracing method, which generates paths in the reverse direction but remains extremely inefficient: in order to construct a complete light path x̄ with f_j(x̄) > 0, the path must reach the other end by chance, which happens with exceedingly low probability. Assuming for simplicity that rays leave the pool with a uniform distribution in Fig. 8b, the probability of hitting the sun with an angular diameter of 0.5° is on the order of 10^{−5}.
BDPT traces paths from both sides, but even this approach is impractical here:
vertices on the water surface cannot be used to join two partial paths, since the
resulting pair of incident and outgoing directions would not satisfy Snell's law. It is
possible to generate two vertices at the bottom of the pool as shown in the figure,
but these cannot be connected: the resulting path edge would be fully contained in a
surface rather than representing transport between surfaces.
In this situation, biased techniques would connect the two vertices at the bottom
of the pool based on a proximity criterion, which introduces systematic errors into
the solution. We will only focus on unbiased techniques that do not rely on such
approximations.
The main difficulty in scenes like this is that caustic paths are tightly constrained:
they must start on the light source, end on the aperture, and satisfy Snells law in two
places. Sequential sampling approaches are able to satisfy all but one constraint and
run into issues when there is no way to complete the majority of paths.
Paths like the one examined in Fig. 8 lead to poor convergence in other settings
as well; they are collectively referred to as specular-diffuse-specular (SDS) paths
due to the occurrence of this sequence of interactions in their path classification.
SDS paths occur in common situations such as a tabletop seen through a drinking
glass standing on it, a bottle containing shampoo or other translucent liquid, a shop
window viewed and illuminated from outside, as well as scattering inside the eye of
a virtual character. Even in scenes where these paths do not cause dramatic effects,
their presence can lead to excessively slow convergence in rendering algorithms that
attempt to account for all transport paths. It is important to note that while the SDS
class of paths is a well-studied example case, other classes (e.g., involving glossy interactions) can lead to many similar issues. It is desirable that rendering methods are robust to such situations. Correlated path sampling techniques based on MCMC offer an attractive way to approach such challenges. We review these methods in the remainder of this article.


Algorithm 2 Pseudocode of an MCMC-based rendering algorithm

function Metropolis-Light-Transport
    x̄_0 ← an initial light path
    for i = 1 to N do
        x̄'_i ← Mutate(x̄_{i−1})
        x̄_i ← x̄'_i with probability min{ 1, [ f_j(x̄'_i) T(x̄'_i, x̄_{i−1}) ] / [ f_j(x̄_{i−1}) T(x̄_{i−1}, x̄'_i) ] },
        x̄_i ← x̄_{i−1} otherwise
        Record(x̄_i)
    end for


5 Markov Chain Monte Carlo (MCMC) Rendering Techniques

In 1997, Veach and Guibas proposed an unusual rendering technique named Metropolis Light Transport [31], which applies the Metropolis-Hastings algorithm to the path
space integral in Eq. (14). Using correlated samples and highly specialized mutation
rules, their approach enables more systematic exploration of the integration domain,
avoiding many of the problems encountered by methods based on standard Monte
Carlo and sequential path sampling.
Later, Kelemen et al. [14] showed that a much simpler approach can be used
to combine MCMC sampling with existing MC rendering algorithms, making it
possible to side-step the difficulties of the former method. The downside of their
approach is the reduced flexibility in designing custom mutation rules. An extension
by Hachisuka et al. [7] further improves the efficiency of this method.
Considerable research has built on these two approaches, including extensions
to participating media [23], combinations of MCMC and BDPT [7], specialized
techniques for specular [11] and glossy [13] materials, gradient-domain rendering
[18, 19], and MCMC variants which perform a localized non-ergodic exploration of
path space [3].
In this section, we provide an overview of the initial three methods, starting with the Primary Sample Space approach by Kelemen et al., followed by the extension by Hachisuka et al., and finally the Metropolis Light Transport algorithm by Veach and
Guibas. All variants are based on a regular MCMC iteration shown in Algorithm 2.
Starting with an initial light path x̄_0, the methods simulate N steps of a Markov Chain. In each step, a mutation is applied to the path x̄_{i−1} to obtain a proposal path x̄'_i, where it is assumed that the proposal density is known and given by T(x̄_{i−1}, x̄'_i). After a standard Metropolis-Hastings acceptance/rejection step, the algorithm invokes the


function Record(x̄_i), which first determines the pixel associated with the current iteration's light path x̄_i and then increases its brightness by a fixed amount.
These MCMC methods all sample light paths proportional to the amount they contribute to the pixels of the final rendering; by increasing the pixel brightness in this way during each iteration, these methods effectively compute a 2D histogram of the marginal distribution of f_j over pixel coordinates. This is exactly the image to be
rendered up to a global scale factor, which can be recovered using a traditional MC
sampling technique such as BDPT. The main difference among these algorithms is
the underlying state space, as well as the employed set of mutation rules.
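A compact Python rendition of this shared iteration is given below. Mutate, the proposal density T, the contribution function, and the path-to-pixel lookup are all passed in as hypothetical callables, since their definitions are precisely what distinguishes the individual algorithms.

import random

def mcmc_render(x0, n_steps, mutate, T, f, pixel_of, width, height):
    """Sketch of Algorithm 2: a Metropolis-Hastings chain over light paths.

    mutate(x)     proposes a new path from the current one
    T(a, b)       proposal density of moving from path a to path b
    f(x)          path contribution (target function, up to normalization);
                  the seed path x0 is assumed to satisfy f(x0) > 0
    pixel_of(x)   pixel coordinates (px, py) associated with a path
    Returns an unnormalized 2D histogram of the image.
    """
    image = [[0.0] * width for _ in range(height)]
    x = x0
    for _ in range(n_steps):
        y = mutate(x)
        # Standard Metropolis-Hastings acceptance test.
        a = min(1.0, (f(y) * T(y, x)) / (f(x) * T(x, y)))
        if random.random() < a:
            x = y
        px, py = pixel_of(x)
        image[py][px] += 1.0   # Record(): increase pixel brightness by a fixed amount
    return image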

5.1 Primary Sample Space Metropolis Light Transport (PSSMLT)

Primary Sample Space Metropolis Light Transport (PSSMLT) [14] combines traditional MC sampling techniques with an MCMC iteration. The approach is very flexible and can also be applied to integration problems outside of computer graphics. PSSMLT always operates on top of an existing MC sampling technique; we assume for simplicity that path tracing is used, but many other techniques are also admissible. The details of this method are easiest to explain from an implementation-centric viewpoint.
Recall the path tracing pseudo-code shown earlier in Algorithm 1. The first two steps perform random sampling, but the rest of the procedure is fully deterministic. In practice, these random sampling steps are often realized using a pseudorandom number generator such as the Mersenne Twister [20] or a suitable quasi-Monte Carlo scheme [6], potentially using the inversion method or a similar technique to warp uniform variates to desired distributions as needed. For more details, we refer the reader to a tutorial by Keller [15].
Let us consider a small adjustment to the implementation of this method: instead of generating univariate samples during the recursive sampling steps, we can also generate them ahead of time and supply them to the implementation as an additional argument, in which case the algorithm can be interpreted as a fully deterministic function of its (random or pseudorandom) arguments. Suppose that we knew (by some means) that the maximum number of required random variates was equal to n, and that the main computation was thus implemented by a function with signature Ψ : [0, 1]^n → R, which maps a vector of univariate samples to a pixel intensity estimate. By taking many estimates and averaging them to obtain a converged pixel intensity, path tracing is effectively integrating the estimator over an n-dimensional unit hypercube of random numbers denoted as primary sample space:

I_j = ∫_{[0,1]^n} Ψ(u) du.          (17)


Fig. 9 Primary sample space MLT performs mutations in an abstract random number space. A deterministic mapping Ψ induces corresponding mutations in path space. a Primary sample space view. b Path space view

The key idea of PSSMLT is to compute Eq. (17) using MCMC integration on primary sample space, which leads to a trivial implementation, as all complications involving light paths and other rendering-specific details are encapsulated in the black box mapping Ψ (Fig. 9).
One missing detail is that the primary sample space dimension n is unknown ahead of time. This can be solved by starting with a low-dimensional integral and extending the dimension on demand when additional samples are requested by Ψ.
PSSMLT uses two types of Mutate functions. The first is an independence sampler, i.e., it forgets the current state and switches to a new set of pseudorandom variates. This is needed to ensure that the Markov Chain is ergodic. The second is a local (e.g. Gaussian or similar) proposal centered around the current state u ∈ [0, 1]^n. Both are symmetric, so that the proposal density T cancels in the acceptance ratio (the acceptance step in Algorithm 2).
PSSMLT uses independent proposals to find important light paths that cannot be
reached using local proposals. When it finds one, local proposals are used to explore
neighboring light paths which amortizes the cost of the search. This can significantly
improve convergence in many challenging situations and is an important advantage
of MCMC methods in general when compared to MC integration.
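The two proposal types can be sketched in a few lines of Python. The wrapped Gaussian used for the local move and the numeric parameter values below are common choices but should be read as assumptions rather than the specific kernels of [14].

import random

def large_step(u):
    """Independence sampler: forget the current state entirely."""
    return [random.random() for _ in u]

def small_step(u, sigma=0.02):
    """Local proposal: perturb each primary sample with a small Gaussian step,
    wrapped around so the result stays in [0, 1). The kernel is symmetric, so
    the proposal densities cancel in the acceptance ratio."""
    return [(ui + random.gauss(0.0, sigma)) % 1.0 for ui in u]

def pssmlt_step(u, Psi, p_large=0.3):
    """One PSSMLT iteration for a target Psi : [0,1]^n -> R (the black-box
    pixel-estimate mapping of Eq. (17)); returns the next chain state."""
    v = large_step(u) if random.random() < p_large else small_step(u)
    accept = min(1.0, Psi(v) / Psi(u)) if Psi(u) > 0 else 1.0
    return v if random.random() < accept else u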
Another advantage of PSSMLT is that it explores light paths through a black box mapping Ψ that already makes internal use of sophisticated importance sampling techniques for light paths, which in turn leads to an easier integration problem in primary sample space. The main disadvantage of this method is that its interaction with Ψ is limited to a stream of pseudorandom numbers. It has no direct knowledge of the generated light paths, which prevents the design of more efficient mutation rules based on the underlying physics.


5.2 Multiplexed Metropolis Light Transport (MMLT)


PSSMLT is commonly implemented in conjunction with the BDPT technique: in this setting, the rendering algorithm generates paths using a large set of BDPT connection
strategies and then re-weights them using MIS. In most cases, only a subset of the
strategies is truly effective, and MIS will consequently assign a large weight to this
subset. One issue with the combination of BDPT and PSSMLT is that the algorithm
still spends a considerable portion of its time generating connections with strategies
that have low weights and thus contribute very little to the rendered image. Hachisuka
et al. [7] recently presented an extension of PSSMLT named Multiplexed Metropolis
Light Transport (MMLT) to address this problem.
They propose a simple but effective modification to the inner BDPT sampler; the outer Metropolis-Hastings iteration remains unchanged: instead of generating
a sample from all BDPT connection strategies, the algorithm (pseudo-)randomly
chooses a single strategy and returns its contribution scaled by the inverse discrete
probability of the choice. This (pseudo-)random sample is treated in the same way as
other sampling operations in PSSMLT and exposed as an additional state dimension
that can be mutated using small or large steps. The practical consequence is that the
Markov Chain will tend to spend more computation on effective strategies, which
further improves the statistical efficiency of the underlying estimator (Fig. 10).
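The essence of the modification can be sketched as follows; the way a strategy is encoded and invoked here is purely illustrative (a uniform choice over the available strategies), whereas the actual method also treats this extra dimension like any other primary sample and mutates it accordingly.

def multiplexed_sample(u, strategies):
    """Sketch of an MMLT-style inner sampler: the first primary sample selects
    a single BDPT strategy, the remaining samples drive that strategy, and the
    contribution is scaled by the inverse discrete probability of the choice."""
    k = min(int(u[0] * len(strategies)), len(strategies) - 1)
    contribution, path = strategies[k](u[1:])
    return len(strategies) * contribution, path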

(t, s) = (2, 5)   (3, 4)   (4, 3)   (5, 2)   (6, 1)   Visualization

Fig. 10 Analysis of the Multiplexed MLT (MMLT) technique [7] (used with permission): the
top row shows weighted contributions from different BDPT strategies in a scene with challenging
indirect illumination [18, 28]. The intensities in the middle row visualize the time spent on each
strategy using the MMLT technique: they are roughly proportional to the weighted contribution
in the first row. The rightmost column visualizes the dominant strategies (3,4), (4, 3), and (5, 2)
using RGB colors. PSSMLT (third row) cannot target samples in this way and thus produces almost
uniform coverage


5.3 Path Space Metropolis Light Transport (MLT)


Path Space Metropolis Light Transport, or simply Metropolis Light Transport (MLT) [31], was the first application of MCMC to the problem of light transport. Doucet
[31] was the first application of MCMC to the problem of light transport. Doucet
et al. [5] proposed a related method in applied mathematics, which focuses on a more
general class of integral equations.
The main difference as compared to PSSMLT is that MLT operates directly on path
space and does not use a black-box mapping Ψ. Its mutation rules are considerably
more involved than those of PSSMLT, but this also provides substantial freedom
to design custom rules that are well-suited for rendering specific physical effects.
MLT distinguishes between mutations that change the structure of the path and
perturbations that move the vertices by small distances while preserving the path
structure, both using the building blocks of bidirectional path tracing to sample paths.
One of the following operations is randomly selected in each iteration:
1. Bidirectional mutation: This mutation replaces a segment of an existing path
with a new segment (possibly of different length) generated by a BDPT-like
sampling strategy. This rule generally has a low acceptance ratio but it is essential
to guarantee ergodicity of the resulting Markov Chain.
2. Lens subpath mutation: The lens subpath mutation is similar to the previous
mutation but only replaces the lens subpath, which is defined as the trailing portion
of the light path matching the regular expression [^S]S*E.
3. Lens perturbation: This transition rule shown in Fig. 11a only perturbs the lens
subpath rather than regenerating it from scratch. In the example, it slightly rotates
the outgoing ray at the camera and propagates it until the first non-specular material is encountered. It then attempts to create a connection (dashed line) to the
unchanged remainder of the path.
4. Caustic perturbation: The caustic perturbation (Fig. 11b) works just like the
lens perturbation, except that it proceeds in reverse starting at the light source. It
is well-suited for rendering caustics that are directly observed by the camera.
5. Multi-chain perturbation: This transition rule (Fig. 11c) is used when there
are multiple separated specular interactions, e.g., in the swimming pool example
encountered before. After an initial lens perturbation, a cascade of additional
perturbations follows until a connection to the remainder of the path can finally
be established.
The main downside of MLT is the severe effort needed to implement this method:
several of the mutation and perturbation rules (including their associated proposal
densities) are challenging to reproduce. Another problem is that a wide range of
different light paths generally contribute to the output image. The MLT perturbations
are designed to deal with specific types of light paths, but it can be difficult to foresee
every kind in order to craft a suitable set of perturbation rules. In practice, the included
set is insufficient.



Fig. 11 MLT operates on top of path space, which permits the use of a variety of mutation rules that
are motivated by important physical scattering effects. The top row illustrates ones that are useful
when rendering a scene involving a glass object on top of a diffuse table. The bottom row is the
swimming pool example from Fig. 8. In each example, the original path is black, and the proposal
is highlighted in blue. a Lens perturbation. b Caustic perturbation. c Multi-chain perturbation. d
Manifold perturbation

6 Specular Manifolds and Manifold Exploration (ME)


In this section, we discuss the principles of Manifold Exploration (ME) [11], which
leads to the manifold perturbation (Fig. 11d). This perturbation provides local exploration for large classes of different path types and subsumes MLT's original set of
perturbations. We begin with a discussion of the concept of a specular manifold.
When a scene contains ideal specular materials, these materials require certain physical laws to be satisfied (e.g. Snell's law or the law of reflection). Mathematically,
these act like constraint equations that remove some dimensions of the space of light
paths, leaving behind a lower-dimensional manifold embedded in path space.
We illustrate this using a simple example in two dimensions, in which a camera observes a planar light source through an opposing mirror (Fig. 12). We will refer to a light path joining two endpoints through a sequence of k ideal specular scattering events as a specular chain of length k. A specular chain of length 1 from the light source to the camera is shown in the figure.
Fig. 12 A motivating example in two dimensions: specular reflection in a mirror


Reflections in the mirror must satisfy the law of specular reflection. Assuming that the space of all specular chains in this simple scene can be parameterized using the horizontal coordinates x_1, x_2, and x_3, it states that

x_2 = (x_1 + x_3) / 2,          (18)

i.e., the x coordinate of the second vertex must be exactly half-way between the endpoints. Note that this equation can also be understood as the implicit definition of a plane in R^3 (x_1 − 2x_2 + x_3 = 0).
When interpreting the set of all candidate light paths as a three-dimensional space
P_3 of coordinate tuples (x_1, x_2, x_3), this constraint then states that the subset of relevant paths has one dimension less and is given by the intersection of P_3 and the plane Eq. (18). With this extra knowledge, it is now easy to sample valid specular chains, e.g. by generating x_1 and x_3 and solving for x_2.
Given general non-planar shapes, the problem becomes considerably harder, since
the equations that have to be satisfied are nonlinear and may admit many solutions.
Prior work has led to algorithms that can find solutions even in such cases [21, 33]
but these methods are closely tied to the representation of the underlying geometry,
and they become infeasible for specular chains with lengths greater than one. Like
these works, ME finds valid specular chains, but because it does so within the
neighborhood of a given path, it avoids the complexities of a full global search and
does not share these limitations.
ME is also related to the analysis of reflection geometry presented by Chen and
Arvo [2], who derived a second-order expansion of the neighborhood of a path. The
main difference is that ME solves for paths exactly and is used as part of an unbiased
MCMC rendering algorithm.

6.1 Integrals Over Specular Manifolds


Let us return to our previous example of the swimming pool involving the family
of light paths LSDSE. These paths belong to the P_5 component of the path space P (Eq. 12), which is a 10-dimensional space with two dimensions for each surface position. As we will see shortly, the paths that contribute have to satisfy two constraint equations involving unit directions in R^3 (which each have 2 degrees of freedom). This constrains a total of four dimensions of the path, meaning that all contributing paths lie on a manifold S of dimension 6 embedded in P_5.
The corresponding integral Eq. (13) is more naturally expressed as an integral over this specular manifold S, rather than as an integral over the entire path space:

∫_S f_j(x_1 ⋯ x_5) dA(x_1, x_3, x_5).


Note the absence of the specular vertices x_2 and x_4 in the integral's area product measure. The contribution function still has the same form: a product of terms corresponding to vertices and edges of the path. However, singular reflection functions at specular vertices are replaced with (unitless) specular reflectance values, and the geometric terms are replaced by generalized geometric terms over specular chains that we will denote G(x_1 ↔ x_2 ↔ x_3) and G(x_3 ↔ x_4 ↔ x_5).
The standard geometric term G(x ↔ y) for a non-specular edge computes the area
ratio of an (infinitesimally) small surface patch at one vertex and its projection onto
projected solid angles as seen from the other vertex. The generalized geometry factor
is defined analogously: the ratio of solid angle at one end of the specular chain with
respect to area at the other end of the chain, considering the path as a function of the
positions of the endpoints.

6.2 Constraints for Reflection and Refraction


Equation (18) introduced a simple specular reflection constraint for axis-aligned
geometry in two dimensions. This constraint easily generalizes to arbitrary geometry
in three dimensions and to both specular reflection and refraction.
Recall the law of specular reflection, which states that incident and outgoing directions make the same angle with the surface normal. Furthermore, all three vectors
must be coplanar (Fig. 13). We use an equivalent reformulation of this law, which
states that the half direction vector of the incident and outgoing directions ω_i and ω_o, defined as

h(ω_i, ω_o) := (ω_i + ω_o) / ‖ω_i + ω_o‖,          (19)

is equal to the surface normal, i.e., h(ω_i, ω_o) = n. In the case of refraction, the relationship of these directions is explained by Snell's law. Using a generalized definition of the half direction vector which includes weighting by the incident and outgoing indices of refraction [32], i.e.,

Specular reflection   Specular refraction

Fig. 13 In-plane view of the surface normal n and incident and outgoing directions ω_i and ω_o at a surface marking a transition between indices of refraction η_i and η_o

h(ω_i, ω_o) := (η_i ω_i + η_o ω_o) / ‖η_i ω_i + η_o ω_o‖,          (20)

we are able to use a single constraint h(ω_i, ω_o) = n which subsumes both Snell's law and the law of specular reflection (in which case η_i equals η_o). Each specular vertex x_i of a path x̄ must satisfy this generalized constraint involving its own position and the positions of the preceding and following vertices. Note that this constraint involves unit vectors with only two degrees of freedom. We can project (20) onto a two-dimensional subspace to reflect its dimensionality:

c_i(x̄) := T(x_i)^T h(ω_{x_i → x_{i−1}}, ω_{x_i → x_{i+1}}).          (21)

The functions c_i : P → R^2 compute the generalized half-vector at vertex x_i and project it onto the tangent space of the underlying scene geometry at this position, which is spanned by the columns of the matrix T(x_i) ∈ R^{3×2}; the resulting 2-vector is zero when h(ω_i, ω_o) is parallel to the normal. Then the specular manifold is simply the set

S = { x̄ ∈ P | c_i(x̄) = 0 if vertex x_i is specular }.          (22)
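A direct transcription of Eqs. (20) and (21) is shown below; the code uses NumPy, treats the 3×2 tangent matrix as given, and ignores the sign conventions needed to orient the half vector consistently for refraction, so it should be read as a sketch rather than a robust implementation.

import numpy as np

def generalized_half_vector(wi, wo, eta_i, eta_o):
    """Generalized half direction of Eq. (20); with eta_i == eta_o this reduces
    to the ordinary half vector used for specular reflection."""
    h = eta_i * wi + eta_o * wo
    return h / np.linalg.norm(h)

def specular_constraint(x_prev, x, x_next, tangent, eta_i=1.0, eta_o=1.0):
    """Projected constraint c_i of Eq. (21) at a specular vertex x.

    `tangent` is a 3x2 matrix whose columns span the surface tangent plane at x;
    the returned 2-vector is zero exactly when the generalized half vector is
    parallel to the surface normal."""
    wi = (x_prev - x) / np.linalg.norm(x_prev - x)
    wo = (x_next - x) / np.linalg.norm(x_next - x)
    h = generalized_half_vector(wi, wo, eta_i, eta_o)
    return tangent.T @ h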

6.3 Local Manifold Geometry


The complex nonlinear behavior of S severely limits our ability to reason about its
geometric structure globally. In this section, we therefore focus on local properties,
leading to an explicit expression for the tangent space at any point on the manifold. This constitutes the key geometric information needed to construct a numerical
procedure that is able to move between points on the manifold.
For simplicity, let us restrict ourselves to the case of a single specular chain x̄ = x_1 ⋯ x_k with k − 2 specular vertices and non-specular endpoints x_1 and x_k,
matching the path regular expression DS+D. This suffices to cover most cases by
separate application to each specular chain along a path. To analyze the geometry
locally, we require a point in S, i.e., a light path x̄ satisfying all specular constraints,
to be given.
We assume that local parameterizations of the surfaces in the scene on small neighborhoods around every vertex are provided via functions x̂_i(u_i, v_i) : R^2 → M, where x̂_i(0, 0) = x_i. We can then express the constraints c_i in terms of these local coordinates and stack them on top of each other to create a new function ĉ with signature ĉ : R^{2k} → R^{2k−4}, which maps 2k local coordinate values to 2k − 4 = 2(k − 2) projected half direction vector coordinates, two for each of the specular vertices of the chain. The set

S_loc = { (u_1, v_1, . . . , u_k, v_k) ∈ R^{2k} | ĉ(u_1, v_1, . . . , u_k, v_k) = 0 }          (23)


then describes the (four-dimensional) specular manifold in terms of local coordinates around the path x̄, which is identified with the origin. Under the assumption that the Jacobian of ĉ has full rank (more on this shortly), the Implicit Function Theorem [26] states that the implicitly defined manifold (23) can be converted into the (explicit) graph of a function q : R^4 → R^{2k−4} on an epsilon ball B_4(ε) around the origin. Different functions q are possible; in our case, the most useful variant determines the positions of all the specular vertices from the positions of the non-specular endpoints, i.e.

S'_loc = { (u_1, v_1, q(u_1, v_1, u_k, v_k), u_k, v_k) | (u_1, v_1, u_k, v_k) ∈ B_4(ε) }.          (24)

Unfortunately, the theorem does not specify how to compute q; it only guarantees the existence of such a function. It does, however, provide an explicit expression for the derivative of q, which contains all the information we need to compute a basis for the tangent space at the path x̄, which corresponds to the origin in local coordinates. This involves the Jacobian ∇ĉ(0) of the constraint function, which is a matrix of (k − 2) × k blocks of size 2 × 2 with a block tridiagonal structure (Fig. 14).


Fig. 14 The linear system used to compute the tangent space and its interpretation as a derivative
of a specular chain. a An example path. b Associated constraints. c Constraint Jacobian. d Tangent
space


If we block the derivative ∇ĉ, as shown in the figure, into 2-column matrices B_1 and B_k for the first and last vertices and a square matrix A for the specular chain, the tangent space to the manifold in local coordinates is

T_S(x̄) = −A^{−1} [ B_1  B_k ].          (25)

This matrix is k − 2 by 2 blocks in size, and each block represents the derivative of one vertex with respect to one endpoint.
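In code, Eq. (25) amounts to slicing the constraint Jacobian into the three blocks and performing one linear solve. The sketch below uses a dense NumPy solve for clarity; an actual implementation would exploit the block tridiagonal structure mentioned above.

import numpy as np

def manifold_tangent(C_jacobian, k):
    """Tangent space of Eq. (25) for a chain with k vertices (two non-specular
    endpoints and k-2 specular vertices).

    C_jacobian is the (2k-4) x (2k) Jacobian of the stacked constraints c,
    blocked as [B1 | A | Bk] with 2-column blocks B1, Bk for the endpoints and
    a square block A for the specular vertices. The result expresses the motion
    of the specular vertices as a function of small endpoint motions."""
    B1 = C_jacobian[:, 0:2]
    A = C_jacobian[:, 2:2 * k - 2]
    Bk = C_jacobian[:, 2 * k - 2:2 * k]
    # Implicit function theorem: dq = -A^{-1} (B1 du1 + Bk duk).
    return -np.linalg.solve(A, np.hstack([B1, Bk]))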
This construction computes tangents with respect to a graph parameterization
of the manifold, which is guaranteed to exist for a suitable choice of independent
variables. Because we always use the endpoint vertices for this purpose, difficulties
arise when one of the endpoints is located exactly at the fold of a caustic wavefront,
in which case ∇ĉ becomes rank-deficient and A fails to be invertible. This happens
rarely in practice and is not a problem for our method, which allows for occasional
parameterization failures. In other contexts where this is not acceptable, the chain
could be parameterized by a different pair of vertices when a non-invertible matrix
is detected.
These theoretical results about the structure of the specular manifold can be used
in an algorithm to solve for specular paths, which we discuss next.

6.4 Walking on the Specular Manifold


In practice, we always keep one endpoint fixed (e.g., x_1), while parameterizing the remaining two-dimensional set. Figure 15 shows a conceptual sketch of the manifold of a specular chain that is parameterized by the last vertex x_k. This vertex is initially located at x_k^start, and we search for a valid configuration where it is at position x_k^target. The derivation in Sect. 6.3 provides a way of extrapolating the necessary change of x_2, . . . , x_{k−1} to first order, but this is not enough: an expansion, no matter to what order, will generally not be able to find a valid path that is located on S.
To address this issue, we combine the extrapolation with a simple projection operation, which maps approximate paths back onto S by intersecting the extrapolated ray x_1 → x_2 with the scene geometry and using the appropriate laws of reflection and refraction to compute the remaining vertex locations. The combination of extrapolation and projection behaves like Newton's method, exhibiting quadratic convergence near the solution; details on this iteration can be found in the original paper [11].
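The overall iteration can be summarized by the following Python sketch, in which the two operations just described (first-order extrapolation using the manifold tangents, and projection back onto the specular manifold by ray tracing) are passed in as hypothetical callables; the step-size control and failure handling of the actual method are omitted, and vertices are assumed to be NumPy arrays.

import numpy as np

def manifold_walk(chain, target_endpoint, tangent_step, project,
                  max_iter=20, tol=1e-6):
    """Newton-like manifold walk (sketch): move the free endpoint of a specular
    chain towards `target_endpoint` while keeping the specular constraints
    satisfied.

    tangent_step(chain, delta) extrapolates the chain to first order for an
    endpoint displacement `delta` (using the tangents of Eq. (25));
    project(chain) maps an approximate chain back onto the specular manifold
    (ray tracing through the chain with the laws of reflection/refraction),
    returning None if the projection fails."""
    for _ in range(max_iter):
        delta = target_endpoint - chain[-1]
        if np.linalg.norm(delta) < tol:
            return chain                      # converged: endpoint reached
        proposal = project(tangent_step(chain, delta))
        if proposal is None:
            return None                       # projection failed; give up
        chain = proposal
    return None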
Figure 16 shows a sketch of how manifold walks can be used in an MLT-like iteration: a proposal begins to modify a light path by perturbing the outgoing direction at vertex x_a. Propagating this direction through a specular reflection leads to a modified position x_b on a diffuse surface. To complete the partial path, it is necessary to find a specular chain connecting x_b to the light source. Here, we can simply apply a manifold walk to the existing specular chain x_b ⋯ x_c to solve for an updated configuration



Fig. 15 Manifold walks use a Newton-like iteration to locally parameterize the specular manifold.
The extrapolation operation takes first-order steps based on the local manifold tangents, which are
subsequently projected back onto the manifold

Fig. 16 Example of a manifold-based path perturbation

x'_b ⋯ x'_c. The key observation is that MCMC explores the space of light paths using localized steps, which is a perfect match for the local parameterization of the path manifold provided by Manifold Exploration.

6.5 Results
Figures 17 and 18 show comparisons of several MCMC rendering techniques for
an interior scene containing approximately 2 million triangles with shading normals
and a mixture of glossy, diffuse, and specular surfaces and some scattering volumes.
One hour of rendering time was allotted to each technique; the results are intentionally
unconverged to permit a visual analysis of the convergence behavior. By reasoning
about the geometry of the specular and offset specular manifolds for the paths it
encounters, the ME perturbation strategy is more successful at rendering certain
paths, such as illumination that refracts from the bulbs into the butter dish, then to the camera (6 specular vertices), that the other methods struggle with.



Fig. 17 This interior scene shows chinaware, a teapot containing an absorbing medium, and a butter
dish on a glossy silver tray. Illumination comes from a complex chandelier with glass-enclosed bulbs.
Prior methods have difficulty in finding and exploring relevant light paths, which causes noise and
other convergence artifacts. Equal-time renderings on an eight-core Intel Xeon X5570 machine at
1280 × 720 pixels in 1 h. a MLT [28]. b ERPT [3]. c PSSMLT [14]. d ME [11]


Fig. 18 This view of a different part of the room, now lit through windows using a spherical environment map surrounding the scene, contains a scattering medium inside the glass egg.
Equal-time renderings at 720 × 1280 pixels in 1 h. a MLT [28]. b ERPT [3]. c PSSMLT [14].
d ME [11]


7 Perturbation Rules for Glossy Transport


Realistic scenes contain a diverse set of materials and are usually not restricted to
specular or diffuse BSDFs. It is important for the rendering method in use to generalize to such cases. All derivations thus far focused on ideal specular materials,
but it is possible to extend manifold walks to glossy materials as well. Jakob and
Marschner proposed a simple generalization of ME, which works for moderately
rough materials, and Kaplanyan et al. [13] recently developed a natural constraint
representation of light paths. They proposed a novel half vector-based perturbation
rule as well as numerous enhancements including better tolerance to non-smooth
geometry and sample stratification in image space based on a frequency analysis of
the scattering operator. We provide a high-level overview of both approaches here.

7.1 Glossy Materials in the Manifold Perturbation


Figure 19 shows a sketch of this generalization. In the ideal specular case, there is a
single specular chain (or discrete set) connecting xb and xc (top left), and all energy
is concentrated on a lower-dimensional specular manifold defined by c(x) = 0 (top
right). In the glossy case, there is a continuous family of chains connecting xb and xc
(bottom left), and the space of light paths has its energy concentrated in a thin band
near the specular manifold. The key idea of how ME handles glossy materials is to
take steps along a family of parallel offset manifolds c(x) = k (bottom right) so that

Specular   Glossy   Valid path configurations   Schematic path space view

Fig. 19 Sketch of the generalization of Manifold Exploration to glossy materials


path space near the specular manifold can be explored without stepping out of this
thin band of near-specular light transport.

7.2 The Natural Constraint Formulation


The method by Kaplanyan et al. [13] takes a different approach to explore glossy
transport paths (Fig. 20): instead of parameterizing a glossy chain by fixing its half
vectors and moving the chain endpoints, their method parameterizes complete paths
starting at the light source and ending at the camera. The underlying manifold walk
keeps the path endpoints fixed and computes a nearby light path as a function of its
half vectors. The set of all half vectors along a path can be interpreted as a type of
generalized coordinate system for light paths: its dimension equals the path's degrees
of freedom, while capturing the relevant constraints (reflection and refraction) in a
convenient explicit form. For this reason, the resulting parameterization is referred to
as the natural constraint representation, and the method is called half vector space
light transport (HSLT); loosely speaking, its perturbation can be seen to explore
orthogonal directions as compared to the parallel manifold walks of ME.


Fig. 20 In the above example, ME (top) constrains the half vectors of two glossy chains x_1 . . . x_4 and x_4 . . . x_6 and solves for an updated configuration after perturbing the position of x_4. HSLT
(bottom) instead adjusts all half vectors at once and solves for suitable vertex positions with this
configuration. This proposal is effective for importance sampling the material terms and leads to
superior convergence when dealing with transport between glossy surfaces. Based on a figure by
Kaplanyan et al. [13] (used with permission)


The underlying approach is motivated by the following interesting observation: when parameterizing light paths in terms of their half vectors, the influence of the material terms on the integrand approximately decouples (Fig. 21). The reason for this effect is that the dominant terms in glossy reflectance models (which are factors of f_j) depend on the angle between the half vector and the surface normal. The change of variables from path space to the half vector domain furthermore cancels out the geometry terms G, leading to additional simplifications. As a consequence, this parameterization turns f_j into a much simpler function resembling a separate Gaussian in each half vector dimension, which is related to the roughness of the associated
surface. Kaplanyan et al. also demonstrate how frequency-space information about
the scattering operator can be used to better spread out samples in image space,
which is important to accelerate convergence of the histogram generation method
that creates the final rendering.
Figure 22 shows a comparison of a kitchen scene rendered by ME and HSLT, where most of the illumination is due to caustic paths involving a reflection
by the glossy floor. After 30 min, the ME rendering is noticeably less converged and
suffers from stripe artifacts, which are not present in the HSLT result.

Material roughness coefficient   Difference

Fig. 21 The natural constraint formulation [13] is a parameterization of path space in the half vector domain. It has the interesting property of approximately decoupling the influence of the individual scattering events on f_j. The figure shows a complex path where the half vector h_3 is perturbed at vertex x_3. The first column shows a false-color plot of f_j over the resulting paths for different values of h_3 and two roughness values. The second column shows a plot of the BSDF value at this vertex, which is approximately proportional to f_j. Based on a figure by Kaplanyan et al. [13] (used with permission)


MEMLT (30m)   HSLT+MLT (30m)

Fig. 22 Equal-time rendering of an interior kitchen scene with many glossy reflections. Based on
a figure by Kaplanyan et al. [13] (used with permission)

8 Conclusion
This article presented an overview of the physics underlying light transport simulations in computer graphics. After introducing relevant physical quantities and the
main energy balance equation, we showed how to compute approximate solutions
using a simple Monte Carlo estimator. Following this, we introduced the concept of
path space and examined the relation of path tracing, light tracing, and bidirectional
path tracing, including their behavior given challenging input that causes these
methods to become impracticably slow. The second part of this article reviewed several MCMC methods that compute path space integrals using proposal distributions
defined on sets of light paths. To efficiently explore light paths involving specular
materials, we showed how to implicitly define and locally parameterize the associated paths using a root-finding iteration. Finally, we reviewed recent work that
aims to generalize this approach to glossy scattering interactions. Most of the methods that were discussed are implemented in the Mitsuba renderer [9], which is a
research-oriented open source rendering framework.
MCMC methods in rendering still suffer from issues that limit their usefulness
in certain situations. Most importantly, they require an initialization or mutation
rule that provides well distributed seed paths to the perturbations, as they can only
explore connected components of path space. Bidirectional Path Tracing and the
Bidirectional Mutation are reasonably effective but run into issues when there are
many disconnected components of path space. This becomes increasingly problematic as their number increases. Ultimately, as the number of disconnected components
exceeds the number of samples that can be generated, local exploration of path space
becomes ineffective; future algorithms could be designed to attempt exploration only
in sufficiently large path space components.
Furthermore, all of the perturbation rules made assumptions about specific path
configurations or material properties, which limits their benefits when rendering
scenes that contain a wide range of material types. To efficiently deal with light paths


involving arbitrary materials, camera models, and light sources, a fundamentally different construction will be needed.
Acknowledgments This research was conducted in conjunction with the Intel Science and Technology Center for Visual Computing. Additional funding was provided by the National Science
Foundation under grant IIS-1011919 and an ETH/Marie Curie fellowship. The author is indebted
to Olesya Jakob, who crafted several of the example scenes in this article.

References
1. Arvo, J.R.: Analytic methods for simulated light transport. Ph.D. thesis, Yale University (1995)
2. Chen, M., Arvo, J.: Theory and application of specular path perturbation. ACM Trans. Graph. 19(4), 246–278 (2000)
3. Cline, D., Talbot, J., Egbert, P.: Energy redistribution path tracing. ACM Trans. Graph. 24(3), 1186–1195 (2005)
4. Cook, R.L., Torrance, K.E.: A reflectance model for computer graphics. ACM Trans. Graph. 1(1), 7–24 (1982)
5. Doucet, A., Johansen, A., Tadic, V.: On solving integral equations using Markov Chain Monte Carlo methods. Appl. Math. Comput. 216(10), 2869–2880 (2010)
6. Grünschloß, L., Raab, M., Keller, A.: Enumerating quasi-Monte Carlo point sequences in elementary intervals. In: Plaskota, L., Wozniakowski, H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2010. Springer Proceedings in Mathematics and Statistics, vol. 23, pp. 399–408. Springer, Berlin (2012)
7. Hachisuka, T., Kaplanyan, A.S., Dachsbacher, C.: Multiplexed metropolis light transport. ACM Trans. Graph. 33(4), 100:1–100:10 (2014)
8. Heckbert, P.S.: Adaptive radiosity textures for bidirectional ray tracing. In: Proceedings of SIGGRAPH '90 on Computer Graphics (1990)
9. Jakob, W.: Mitsuba renderer. http://www.mitsuba-renderer.org (2010)
10. Jakob, W.: Light transport on path-space manifolds. Ph.D. thesis, Cornell University (2013)
11. Jakob, W., Marschner, S.: Manifold exploration: a Markov Chain Monte Carlo technique for rendering scenes with difficult specular transport. ACM Trans. Graph. 31(4), 58:1–58:13 (2012)
12. Kajiya, J.T.: The rendering equation. In: Proceedings of SIGGRAPH '86 on Computer Graphics, pp. 143–150 (1986)
13. Kaplanyan, A.S., Hanika, J., Dachsbacher, C.: The natural-constraint representation of the path space for efficient light transport simulation. ACM Trans. Graph. (Proc. SIGGRAPH) 33(4), 1–13 (2014)
14. Kelemen, C., Szirmay-Kalos, L., Antal, G., Csonka, F.: A simple and robust mutation strategy for the Metropolis light transport algorithm. Comput. Graph. Forum 21(3), 531–540 (2002)
15. Keller, A.: Quasi-Monte Carlo Image Synthesis in a Nutshell. Springer, Heidelberg (2014)
16. Kollig, T., Keller, A.: Efficient Bidirectional Path Tracing by Randomized Quasi-Monte Carlo Integration. Springer, Heidelberg (2002)
17. Lafortune, E.P., Willems, Y.D.: Bi-directional path tracing. In: Proceedings of Compugraphics '93. Alvor, Portugal (1993)
18. Lehtinen, J., Karras, T., Laine, S., Aittala, M., Durand, F., Aila, T.: Gradient-domain Metropolis light transport. ACM Trans. Graph. 32(4), 1 (2013)
19. Manzi, M., Rousselle, F., Kettunen, M., Lehtinen, J., Zwicker, M.: Improved sampling for gradient-domain Metropolis light transport. ACM Trans. Graph. 33(6), 1–12 (2014)
20. Matsumoto, M., Nishimura, T.: Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul. 8(1), 3–30 (1998)
21. Mitchell, D.P., Hanrahan, P.: Illumination from curved reflectors. In: Proceedings of SIGGRAPH '92 on Computer Graphics, pp. 283–291 (1992)


22. Nicodemus, F.E.: Geometrical Considerations and Nomenclature for Reflectance, vol. 160. US Department of Commerce, National Bureau of Standards, Washington (1977)
23. Pauly, M., Kollig, T., Keller, A.: Metropolis light transport for participating media. In: Rendering Techniques 2000: 11th Eurographics Workshop on Rendering, pp. 11–22 (2000)
24. Pharr, M., Humphreys, G., Jakob, W.: Physically Based Rendering: From Theory to Implementation, 3rd edn. Morgan Kaufmann Publishers Inc., San Francisco (2016)
25. Preisendorfer, R.: Hydrologic Optics. US Department of Commerce, Washington (1976)
26. Spivak, M.: Calculus on Manifolds. Addison-Wesley, Boston (1965)
27. Torrance, K.E., Sparrow, E.M.: Theory for off-specular reflection from roughened surfaces. JOSA 57(9), 1105–1112 (1967)
28. Veach, E.: Robust Monte Carlo methods for light transport simulation. Ph.D. thesis, Stanford University (1997)
29. Veach, E., Guibas, L.: Bidirectional estimators for light transport. In: Proceedings of the Fifth Eurographics Workshop on Rendering (1994)
30. Veach, E., Guibas, L.J.: Optimally combining sampling techniques for Monte Carlo rendering. In: Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '95, pp. 419–428. ACM (1995)
31. Veach, E., Guibas, L.J.: Metropolis light transport. In: Proceedings of SIGGRAPH '97 on Computer Graphics, pp. 65–76 (1997)
32. Walter, B., Marschner, S.R., Li, H., Torrance, K.E.: Microfacet models for refraction through rough surfaces. In: Rendering Techniques 2007: 18th Eurographics Workshop on Rendering, pp. 195–206 (2007)
33. Walter, B., Zhao, S., Holzschuch, N., Bala, K.: Single scattering in refractive media with triangle mesh boundaries. ACM Trans. Graph. 28(3), 92 (2009)

Walsh Figure of Merit for Digital Nets: An Easy Measure for Higher Order Convergent QMC

Makoto Matsumoto and Ryuichi Ohori

Abstract Fix an integer $s$. Let $f : [0,1)^s \to \mathbb{R}$ be an integrable function. Let $P \subset [0,1]^s$ be a finite point set. Quasi-Monte Carlo integration of $f$ by $P$ is the average value of $f$ over $P$ that approximates the integration of $f$ over the $s$-dimensional cube. The Koksma–Hlawka inequality tells that, by a smart choice of $P$, one may expect that the error decreases roughly as $O(N^{-1}(\log N)^s)$. For any $\alpha \ge 1$, J. Dick gave a construction of point sets such that for $\alpha$-smooth $f$, the convergence rate $O(N^{-\alpha}(\log N)^{s\alpha})$ is assured. As a coarse version of his theory, M-Saito-Matoba introduced the Walsh figure of merit (WAFOM), which gives the convergence rate $O(N^{-C\log N/s})$. WAFOM is efficiently computable. By a brute-force search of low WAFOM point sets, we observe a convergence rate of order $N^{-\alpha}$ with $\alpha > 1$, for several test integrands for $s = 4$ and $8$.
Keywords Quasi-Monte Carlo · Digital nets · Walsh figure of merit · Numerical integration

1 Quasi-Monte Carlo and Higher Order Convergence


Fix an integer $s$. Let $f : [0,1)^s \to \mathbb{R}$ be an integrable function. Our goal is to have a good approximation of the value
$$I(f) := \int_{[0,1)^s} f(\mathbf{x})\,d\mathbf{x}.$$

M. Matsumoto (B)
Graduate School of Sciences, Hiroshima University, Hiroshima 739-8526, Japan
e-mail: m-mat@math.sci.hiroshima-u.ac.jp
R. Ohori
Fujitsu Laboratories Ltd., Kanagawa 211-8588, Japan
e-mail: ohori.ryuichi@jp.fujitsu.com


We choose a finite point set $P \subset [0,1)^s$, whose cardinality is called the sample size and denoted by $N$. The quasi-Monte Carlo (QMC) integration of $f$ by $P$ is the value
$$I(f;P) := \frac{1}{N}\sum_{\mathbf{x}\in P} f(\mathbf{x}),$$
i.e., the average of $f$ over the finite points $P$ that approximates $I(f)$. The QMC integration error is defined by
$$\mathrm{Error}(f;P) := |I(f) - I(f;P)|.$$
If $P$ consists of $N$ independently, uniformly and randomly chosen points, the QMC integration is nothing but the classical Monte Carlo (MC) integration, where the integration error is expected to decrease with the order of $N^{-1/2}$ when $N$ increases, if $f$ has a finite variance.
The main purpose of QMC integration is to choose good point sets so that the integration error decreases faster than for MC. There are enormous studies in diverse directions, see for example [7, 19].

In applications, often we know little about the integrand $f$, so we want point sets which work well for a wide class of $f$. An inequality of the form
$$\mathrm{Error}(f;P) \le V(f)\,D(P), \qquad (1)$$
called of Koksma–Hlawka type, is often useful. Here, $V(f)$ is a value independent of $P$ which measures some kind of variance of $f$, and $D(P)$ is a value independent of $f$ which measures some kind of discrepancy of $P$ from an ideal uniform distribution. Under such an inequality, we may prepare point sets with small values of $D(P)$, and use them for QMC integration if $V(f)$ is expected to be not too large.

In the case of the original Koksma–Hlawka inequality [19, Chaps. 2 and 3], $V(f)$ is the total variation of $f$ in the sense of Hardy and Krause, and $D(P)$ is the star discrepancy $D^*(P)$ of the point set. In this case the inequality is known to be sharp. It is a conjecture that there is a constant $c_s$ depending only on $s$ such that $D^*(P) > c_s(\log N)^{s-1}/N$, and there are constructions of point sets with $D^*(P) < C_s(\log N)^{s}/N$. Thus, to obtain a better convergence rate, one needs to assume some restriction on $f$. If for a function class $F$ there are $V(f)$ ($f \in F$) and $D(P)$ satisfying the inequality (1) with a sequence of point sets $P_1, P_2, \ldots$ such that $D(P_i)$ decreases faster than the order $1/N_i$, then it is natural to call the point sets higher order QMC point sets for the function class $F$.
It is known that this is possible if we assume some smoothness on $f$. Dick [2, 4, 7] showed that for any positive integer $\alpha$, there is a function class named $\alpha$-smooth such that the inequality
$$\mathrm{Error}(f;P) \le C(\alpha,s)\,\|f\|_{\alpha}\, W_{\alpha}(P)$$
holds, where point sets with $W_\alpha(P) = O(N^{-\alpha}(\log N)^{s\alpha})$ are constructible from $(t,m,s)$-nets (named higher order digital nets). The definition of $W_\alpha(P)$ is given later in Sect. 5.3. We omit the definition of $\|f\|_\alpha$, which depends on all partial mixed derivatives up to the $\alpha$th order in each variable; when $s = 1$, it is defined by
$$\|f\|_\alpha^2 := \sum_{i=0}^{\alpha-1}\left|\int_0^1 f^{(i)}(x)\,dx\right|^2 + \int_0^1 \left|f^{(\alpha)}(x)\right|^2 dx.$$

2 Digital Net, Discretization and WAFOM


In [16], Saito, Matoba and the first author introduced Walsh figure of merit (WAFOM)
WF(P) of a digital net1 P. This may be regarded as a simplified special case of Dicks
W with some discretization. WAFOM satisfies a KoksmaHlawka type inequality,
and the value WF(P) decreases in the order O(N C(log2 N )/s+D ) for some constant
C, D > 0 independent of s, N . Thus, the order of the convergence is faster than
O(N ) for any > 0.

2.1 Discretization
Although the following notions are naturally extended to $\mathbb{Z}/b$ or even to any finite abelian group [29], we treat only the case of base $b = 2$ for simplicity.

Let $\mathbf{F}_2 := \{0,1\} = \mathbb{Z}/2$ be the two-element field. Take $n$ large enough, and approximate the unit interval $I = [0,1)$ by the set of $n$-bit integers $I_n := \mathbf{F}_2^{\,n}$ through the inclusion $I_n \to I$, $x$ (considered as an $n$-bit integer) $\mapsto x/2^n + 1/2^{n+1}$. More precisely, we identify the finite set $I_n$ with the set of half-open intervals obtained by partitioning $[0,1)$ into $2^n$ pieces; namely
$$I_n := \{[i\,2^{-n},\ (i+1)\,2^{-n}) \mid 0 \le i \le 2^n - 1\}.$$

Example 1 In the case $n = 3$ and $I_3 = \{0,1\}^3$, $I_3$ is the set of 8 intervals in Fig. 1.

The $s$-dimensional hypercube $I^s$ is approximated by the set $I_n^s$ of $2^{ns}$ hypercubes, which is identified with $I_n^s = (\mathbf{F}_2^{\,n})^s = M_{s,n}(\mathbf{F}_2) =: V$. In sum,
Fig. 1 $\{0,1\}^3$ is identified with the set of 8 segments $I_3$

¹ See Sect. 2.3 for a definition of digital nets; there we use the italic $P$ for the discrete object, to stress that it is actually a subspace of a discrete space, while the corresponding point set lies in the continuous space $I^s$.


Definition 1 Let $V := M_{s,n}(\mathbf{F}_2)$ be the set of $(s \times n)$-matrices with coefficients in $\mathbf{F}_2 = \{0,1\}$. An element $B = (b_{ij}) \in V$ is identified with an $s$-dimensional hypercube in $I_n^s$, consisting of the elements $(x_1,\ldots,x_s) \in \mathbb{R}^s$ where, for each $i$, the binary expansion of $x_i$ coincides with $0.b_{i1}b_{i2}\cdots b_{in}$ up to the $n$th digit below the decimal point. By abuse of the language, the notation $B$ is used for the corresponding hypercube.

Example 2 In the case $n = 3$ and $s = 2$, for example,
$$B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$
corresponds to $[0.100, 0.101) \times [0.011, 0.100)$.

As an approximation of $f : I^s \to \mathbb{R}$, define
$$f_n : I_n^s = V \to \mathbb{R}, \qquad B \mapsto f_n(B) := \frac{1}{\mathrm{Vol}(B)}\int_B f\,d\mathbf{x}$$
by mapping a small hypercube $B$ of edge length $2^{-n}$ to the average of $f$ over this small hypercube. Thus, $f_n$ is the discretization (with $n$-bit precision) of $f$ by taking the average over each small hypercube.

In the following, we do not compute $f_n$, but consider as if we are given $f_n$. More precisely, let $x_B$ denote the mid point of the hypercube $B$, and we approximate $f_n(B)$ by $f(x_B)$. For sufficiently large $n$, say, $n = 32$, the approximation error $|f_n(B) - f(x_B)|$ (which we call the discretization error of $f$ at $B$) would be small enough: if $f$ is Lipschitz continuous, then the error² has order $\sqrt{s}\,2^{-n}$.

From now on, we assume that $n$ is taken large enough, so that this discretization error is negligible in practice for the QMC integration considered. A justification is that we have only finite precision computation in digital computers, so a function $f$ has a discretized domain with some finite precision. This assumption is somewhat cheating, but seems to work well in many practical uses.

By definition of the above discretization, we have the equality
$$\int_{[0,1)^s} f(\mathbf{x})\,d\mathbf{x} = \frac{1}{|V|}\sum_{B\in V} f_n(B).$$

2.2 Discrete Fourier Transform


For $A, B \in V$, we define their inner product by
$$(A, B) := \mathrm{trace}({}^t\!A\,B) = \sum_{1\le i\le s,\ 1\le j\le n} a_{ij}\, b_{ij} \in \mathbf{F}_2 \pmod 2.$$

² If $f$ has Lipschitz constant $C$, namely, satisfies $|f(x) - f(y)| < C|x-y|$, then the error is bounded by $C\sqrt{s}\,2^{-n}$ [16, Lemma 2.1].


For a function $g : V \to \mathbb{R}$, its discrete Fourier transform $\hat g : V \to \mathbb{R}$ is defined by
$$\hat g(A) := \frac{1}{|V|}\sum_{B\in V} g(B)(-1)^{(B,A)}.$$
Thus
$$\hat f_n(0) = \frac{1}{|V|}\sum_{B\in V} f_n(B) = I(f).$$

Remark 1 The value $\hat f_n(A)$ coincides with the $A$th Walsh coefficient of the function $f$ defined as follows. Let $A = (a_{ij})$. Define an integer $c_i := \sum_{j=1}^{n} a_{ij}\,2^{j-1}$ for each $i = 1,\ldots,s$. Then the $A$th Walsh coefficient of $f$ is defined as the standard multi-indexed Walsh coefficient $\hat f_{c_1,\ldots,c_s}$.

2.3 Digital Nets, and QMC-Error in Terms of Walsh Coefficients

Definition 2 Let $P \subset V$ be an $\mathbf{F}_2$-linear subspace (namely, $P$ is closed under componentwise addition modulo 2). Then, $P$ can be regarded as a set of small hypercubes in $I_n^s$, or, a finite point set $\mathcal{P} \subset I^s$ by taking the mid point of each hypercube. Such a point set $\mathcal{P}$ (or even $P$) is called a digital net with base 2.

This notion goes back to Sobol' and Niederreiter; see for example [7, Definition 4.47]. For such an $\mathbf{F}_2$-subspace $P$, let us define its perpendicular space³ by
$$P^\perp := \{A \in V \mid (B,A) = 0 \ \text{for all}\ B \in P\}.$$
QMC integration of $f_n$ by $P$ is by definition
$$I(f_n; P) := \frac{1}{|P|}\sum_{B\in P} f_n(B) = \sum_{A\in P^\perp} \hat f_n(A), \qquad (2)$$
where the right equality (called the Poisson summation formula) follows from
$$\sum_{A\in P^\perp}\hat f_n(A) = \sum_{A\in P^\perp}\frac{1}{|V|}\Big(\sum_{B\in V} f_n(B)(-1)^{(B,A)}\Big) = \frac{1}{|V|}\sum_{B\in V} f_n(B)\sum_{A\in P^\perp}(-1)^{(B,A)} = \frac{1}{|V|}\sum_{B\in P} f_n(B)\,|P^\perp| = \frac{1}{|P|}\sum_{B\in P} f_n(B).$$

³ The perpendicular space is called the dual space in most of the literature on QMC and coding theory. However, in pure algebra, the dual space to a vector space $V$ over a field $k$ means $V^\vee := \mathrm{Hom}_k(V,k)$, which is defined without using an inner product. In this paper, we use the term perpendicular, going against the tradition in this area.


2.4 Koksma–Hlawka Type Inequality by Dick


From (2), we have a QMC integration error bound by Walsh coefficients:
$$\mathrm{Error}(f_n; P) = |I(f_n; P) - \hat f_n(0)| = \Big|\sum_{A\in P^\perp\setminus\{0\}}\hat f_n(A)\Big| \le \sum_{A\in P^\perp\setminus\{0\}} |\hat f_n(A)|. \qquad (3)$$
Thus, to bound the error, it suffices to bound $|\hat f_n(A)|$.

Theorem 1 (Decay of Walsh coefficients, [3]) For an $n$-smooth function $f$, there is a notion of $n$-norm $\|f\|_n$ and a constant $C(s,n)$ independent of $f$ and $A$ with
$$|\hat f_n(A)| \le C(s,n)\,\|f\|_n\, 2^{-\mu(A)}.$$
(See [7, Theorem 14.23] for a general statement.) Here, $\mu(A)$ is defined as follows:

Definition 3 For $A = (a_{ij})_{1\le i\le s,\,1\le j\le n} \in V$, its Dick weight $\mu(A)$ is defined by
$$\mu(A) := \sum_{1\le i\le s,\ 1\le j\le n} j\,a_{ij},$$
where the $a_{ij} \in \{0,1\}$ are considered as integers (without modulo 2).


Example 3 In the case of $s = 3$, $n = 4$, for example,
$$A = \begin{pmatrix} 1&0&0&1\\ 0&1&1&1\\ 0&0&1&0\end{pmatrix} \ \mapsto\ (j\,a_{ij}) = \begin{pmatrix} 1&0&0&4\\ 0&2&3&4\\ 0&0&3&0\end{pmatrix} \ \mapsto\ \mu(A) = (1+0+0+4) + (0+2+3+4) + (0+0+3+0) = 17.$$
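As a quick illustration (our sketch, not part of the paper), the Dick weight can be computed directly from the bit matrix; the snippet below reproduces the value 17 of Example 3.

    # Minimal sketch (ours): Dick weight of a binary s x n matrix,
    # mu(A) = sum of j * a_ij over all entries, columns indexed j = 1, ..., n.
    def dick_weight(A):
        return sum(j * a for row in A for j, a in enumerate(row, start=1))

    # Matrix from Example 3 (s = 3, n = 4); weight = (1+4) + (2+3+4) + 3 = 17.
    A = [[1, 0, 0, 1],
         [0, 1, 1, 1],
         [0, 0, 1, 0]]
    print(dick_weight(A))  # 17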
The Walsh figure of merit of $P$ is defined as follows [16]:

Definition 4 (WAFOM) Let $P \subset V$. The WAFOM of $P$ is defined by
$$\mathrm{WF}(P) := \sum_{A\in P^\perp\setminus\{0\}} 2^{-\mu(A)}.$$
By plugging this definition and Dick's Theorem 1 into (3), we have an inequality of Koksma–Hlawka type:
$$\mathrm{Error}(f_n; P) \le C(s,n)\,\|f\|_n\,\mathrm{WF}(P). \qquad (4)$$


2.5 A Toy Experiment on WF(P)

We shall see how WAFOM works in a toy case of $n = 3$-digit precision and $s = 1$ dimension. In Fig. 1, the unit interval $I$ is divided into 8 intervals, each of which corresponds to a $(1\times 3)$-matrix in $\mathbf{F}_2^{\,3} = V$. Table 1 lists the seven subspaces of dimension 2, a selection of four of them, and their WAFOM and QMC errors for the integrands $f(x) = x$, $x^2$ and $x^3$. The first line in Table 1 shows the 8-element set $V = \mathbf{F}_2^{\,3}$, corresponding to the 8 intervals in Fig. 1. The next line, $(100)^\perp$, denotes the 2-dimensional subspace of $V$ consisting of the elements perpendicular to $(100)$, that is, the four vectors whose first digit is 0. In the same manner, all 2-dimensional subspaces of $V$ are listed. The last one is $(111)^\perp$, consisting of the four vectors $(x_1,x_2,x_3)$ with $x_1 + x_2 + x_3 = 0 \pmod 2$.

Our aim is to decide which is the best (or most uniform) among the seven 2-dimensional sub-vector spaces for QMC integration. Intuitively, $(100)^\perp$ is not a good choice since all its four intervals cluster in $[0,1/2]$. Similarly, we exclude $(010)^\perp$ and $(110)^\perp$. We compare the remaining four candidates by two methods: computing WAFOM, and computing QMC integration errors with the test integrand functions $x$, $x^2$ and $x^3$.

The results are shown in the latter part of Table 1. The first line corresponds to the case of $P = V$. Since $P^\perp\setminus\{0\}$ is empty, $\mathrm{WF}(P) = 0$. For the remaining four cases $P = (x_1,x_2,x_3)^\perp$, note that $\{(x_1,x_2,x_3)^\perp\}^\perp = \{(000), (x_1,x_2,x_3)\}$ and $P^\perp\setminus\{0\} = \{(x_1,x_2,x_3)\}$, thus we have $\mathrm{WF}(P) = 2^{-\mu((x_1,x_2,x_3))}$. The third column in the latter table shows WAFOM for the five different choices of $P$. The three columns "Error for $x^i$" with $i = 1, 2, 3$ show the signed QMC integration error by $P$ for integrating $x^i$ over $[0,1]$. We used the mid point of each segment (of length 1/8) to evaluate $f$.

Table 1 Toy examples for WAFOM for 3-digit discretization, for the integrands x, x^2 and x^3

  V        = {000, 001, 010, 011, 100, 101, 110, 111}
  (100)^⊥ = {000, 001, 010, 011}
  (010)^⊥ = {000, 001, 100, 101}
  (110)^⊥ = {000, 001, 110, 111}
  (001)^⊥ = {000, 010, 100, 110}
  (101)^⊥ = {000, 010, 101, 111}
  (011)^⊥ = {000, 011, 100, 111}
  (111)^⊥ = {000, 011, 101, 110}

  P         μ(A) for A ∈ P^⊥\{0}   WF(P)     Error for x   Error for x^2   Error for x^3
  V         —                       0          0            −0.0013         −0.0020
  (001)^⊥  0+0+3                   2^{-3}    −0.0625      −0.0638         −0.0637
  (101)^⊥  1+0+3                   2^{-4}     0            +0.0299         +0.0449
  (011)^⊥  0+2+3                   2^{-5}     0            +0.0143         +0.0215
  (111)^⊥  1+2+3                   2^{-6}     0            −0.0013         −0.0137


Thus, the listed errors include both the discretization errors and the QMC-integration errors for $f_n$. For the first line, $P = V$ implies no QMC integration error for $f_n$ ($n = 3$), so the values show the discretization error exactly. The error bound (4) is proportional to $\mathrm{WF}(P)$ for a fixed integrand. The table shows that, for these test functions, the actual errors are well reflected in the WAFOM values.
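The entries of Table 1 are easy to reproduce. The following Python sketch (ours, not the authors' code) evaluates each monomial at the interval midpoints of the selected subspaces and prints the signed errors; the convention $I(f;P) - I(f)$ is an assumption consistent with the signed entries of the table, and the values agree with it to the printed precision.

    # Sketch (ours) reproducing the Table 1 entries. Each 3-bit vector b1 b2 b3 is
    # identified with the interval midpoint 0.b1b2b3 (binary) + 1/16; the printed
    # number is the midpoint average minus the exact integral of x^k.
    from fractions import Fraction

    def midpoint(bits):                      # bits = (b1, b2, b3)
        return sum(Fraction(b, 2 ** (j + 1)) for j, b in enumerate(bits)) + Fraction(1, 16)

    subspaces = {                            # key (v) stands for the subspace v^perp
        "V":     [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)],
        "(001)": [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)],
        "(101)": [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)],
        "(011)": [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)],
        "(111)": [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)],
    }
    exact = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 4)}   # integrals of x, x^2, x^3

    for name, P in subspaces.items():
        errs = [float(sum(midpoint(b) ** k for b in P) / len(P) - exact[k]) for k in (1, 2, 3)]
        print(name, [round(e, 4) for e in errs])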
Here is a loose interpretation of $\mathrm{WF}(P)$. For an $\mathbf{F}_2$-linear $P$:
- $A \in P^\perp\setminus\{0\}$ is a linear relation satisfied by $P$.
- $\mu(A)$ measures the complexity of $A$.
- $\mathrm{WF}(P) = \sum_{A\in P^\perp\setminus\{0\}} 2^{-\mu(A)}$ is small if all relations have high complexity, and hence $P$ is close to uniform.
- The weight $j$ in the sum $\sum j\,a_{ij}$ in the definition of $\mu(A)$ means that the $j$th digit below the decimal point is counted with complexity $2^{j}$.

3 Point Sets with Low WAFOM Values


3.1 Existence and Non-existence of Low WAFOM Point Sets
Theorem 2 There are absolute (i.e. independent of $s$, $n$ and $d$) positive constants $C, D, E$ such that for any positive integers $s$, $n$ and $d \ge 9s$, there exists a $P \subset V$ of $\mathbf{F}_2$-dimension $d$ (hence of cardinality $N = 2^d$) satisfying
$$\mathrm{WF}(P) \le E\,2^{-Cd^2/s + Dd} = E\,N^{-C\log_2 N/s + D}.$$

Since the exponent $-C\log_2 N/s + D$ goes to $-\infty$ when $N \to \infty$, this shows that there exist point sets with higher order convergence having this order of WAFOM. There are two independent proofs: M-Yoshiki [17] shows the positivity of the probability of obtaining low-WAFOM point sets under a random choice of the basis (hence non-constructive), and K. Suzuki [28] gives a construction using Dick's interleaving method [7, Sect. 15] for the Niederreiter-Xing sequence [21]. Suzuki [29] generalizes [17] and [31] to an arbitrary base $b$. Theorem 2 is similar to Dick's construction of point sets with $W_\alpha(P) = O(N^{-\alpha}(\log N)^{s\alpha})$ for arbitrarily high $\alpha \ge 1$, but there seems to be no implication between his result and this theorem.

On the other side, Yoshiki [31] proved the following theorem showing that the order of the exponent $d^2/s$ is sharp, namely, WAFOM cannot be too small:

Theorem 3 Let $C > 1/2$ be any constant. For any positive integers $s$, $n$ and $d \ge s(\sqrt{C + 1/16} + 3/4)/(C - 1/2)$, any linear subspace $P \subset V$ of $\mathbf{F}_2$-dimension $d$ satisfies
$$\mathrm{WF}(P) \ge 2^{-Cd^2/s}.$$


3.2 An Efficient Computation Method of WAFOM


Since $P$ is intended for a QMC integration where the enumeration of $P$ is necessary, $|P| = 2^{\dim_{\mathbf{F}_2} P}$ cannot be huge. On the other hand, $|V| = 2^{ns}$ would be huge, say, for $n = 32$ and $s > 2$. Since $\dim_{\mathbf{F}_2} P + \dim_{\mathbf{F}_2} P^\perp = \dim_{\mathbf{F}_2} V$, $|P^\perp|$ must be huge. Thus, a direct computation of $\mathrm{WF}(P)$ using Definition 4 would be too costly. In [16], the following formula is given by a Fourier inversion. Put $B = (b_{i,j})$; then we have
$$\mathrm{WF}(P) = \frac{1}{|P|}\sum_{B\in P}\Big[\prod_{1\le i\le s,\ 1\le j\le n}\big(1 + (-1)^{b_{i,j}}\,2^{-j}\big)\Big] - 1.$$
This is computable in $O(nsN)$ steps of arithmetic operations on real numbers, where $N = |P|$. Compared with most other discrepancies, this is relatively easily computable. This allows us to do a random search for low-WAFOM point sets.
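The formula above translates directly into code. The following Python sketch (ours, not the authors' implementation) computes WF(P) in O(nsN) operations from a list of s × n bit matrices and checks the value 2^{-6} for the subspace (111)^⊥ of Table 1.

    # Minimal sketch (ours) of the WAFOM product formula. P is a list of |P| points,
    # each given as an s x n binary matrix (list of s rows of n bits).
    def wafom(P):
        acc = 0.0
        for B in P:
            prod = 1.0
            for row in B:
                for j, b in enumerate(row, start=1):
                    prod *= 1.0 + (-1.0 if b else 1.0) * 2.0 ** (-j)
            acc += prod
        return acc / len(P) - 1.0

    # Toy check against Table 1: P = (111)^perp with s = 1, n = 3.
    P = [[[0, 0, 0]], [[0, 1, 1]], [[1, 0, 1]], [[1, 1, 0]]]
    print(wafom(P))   # 0.015625 = 2^{-6}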
Remark 2 1. The above equality holds only for an $\mathbf{F}_2$-linear $P$. Since the left hand side is non-negative, so is the right-hand sum in this case. It seems impossible to define WAFOM for a general point set by using this formula, since for a general (i.e. non-linear) $P$, the sum on the right hand side is sometimes negative and thus will never give a bound on the integration error.
2. The right-hand sum may be interpreted as the QMC integration of a function (whose definition is given in the right hand side of the equality) by $P$. The integration of that function over the total space $V$ is zero. Hence, the above equality indicates that, to have a best $\mathbf{F}_2$-linear $P$ from the viewpoint of WAFOM, it suffices to have a best $P$ for QMC integration of a single specified function. This is in contrast to the definition of the star-discrepancy, where all the rectangle characteristic functions are used as test functions, and the supremum of their QMC integration errors is taken.
3. Harase-Ohori [11] give a method to accelerate this computation by a factor of 30, using a look-up table. Ohori-Yoshiki [25] give a faster and simpler method to compute a good approximation of WAFOM, using the fact that Walsh coefficients of an exponential function approximate the Dick weight $\mu$. More precisely, $\mathrm{WF}(P)$ is well-approximated by the QMC-error of the function $\exp(-2\sum_{i=1}^{s} x_i)$, whose value is easy to evaluate on modern CPUs.

4 Experimental Results
4.1 Random Search for Low WAFOM Point Sets
We fix the precision $n = 30$. We consider two cases of the dimension, $s = 4$ and $s = 8$. For each $d = 8, 9, 10, \ldots, 16$, we generate a $d$-dimensional subspace $P \subset V = (\mathbf{F}_2^{\,30})^s$ 10000 times, by the uniformly random choice of $d$ elements as its basis. Let $P_{d,s}$ be the point set with the lowest WAFOM among them. For comparison, let $Q_{d,s}$ be the point set with the 100th lowest WAFOM.

Fig. 2 WAFOM values for: (1) best WAFOM among 10000, (2) the 100th best WAFOM, (3) Niederreiter-Xing, (4) Sobol', of size $2^d$ with $d = 8, 9, \ldots, 16$. The vertical axis is for $\log_2$ of their WAFOM, and the horizontal for $\log_2$ of the size of the point sets. The left figure is for dimension $s = 4$, the right for $s = 8$
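A brute-force random search of this kind can be sketched as follows (our illustration, not the authors' code). Points are encoded as sn-bit integers; the number of random subspaces is reduced from the paper's 10000 so that the pure-Python version stays fast, and linear dependence among the random basis vectors (which would only shrink the subspace) is ignored since it is overwhelmingly unlikely.

    # Sketch (ours) of the random search of Sect. 4.1: draw d random F_2-basis vectors,
    # enumerate the 2^d points of the spanned subspace by XOR, and keep the subspace
    # with the smallest WAFOM (product formula of Sect. 3.2).
    import random

    def span(basis):                       # all XOR-combinations of the basis vectors
        pts = [0]
        for v in basis:
            pts += [p ^ v for p in pts]
        return pts

    def wafom_bits(points, s, n):          # point = s*n-bit integer, bit i*n+j-1 = digit j of coord i
        acc = 0.0
        for x in points:
            prod = 1.0
            for i in range(s):
                for j in range(1, n + 1):
                    b = (x >> (i * n + j - 1)) & 1
                    prod *= 1.0 + (-1.0 if b else 1.0) * 2.0 ** (-j)
            acc += prod
        return acc / len(points) - 1.0

    def random_search(s=4, n=30, d=8, trials=20, seed=1):
        rng, best = random.Random(seed), None
        for _ in range(trials):
            basis = [rng.getrandbits(s * n) for _ in range(d)]
            w = wafom_bits(span(basis), s, n)
            if best is None or w < best[0]:
                best = (w, basis)
        return best

    print(random_search()[0])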

4.2 Comparison of QMC Rules by WAFOM


For a comparison, we use two other QMC quadrature rules, namely, the Sobol' sequence improved by Joe and Kuo [13], and the Niederreiter-Xing sequence (NX) implemented by Pirsic [27] and by Dirk Nuyens [23, item nxmats] (downloaded from the latter). Figure 2 shows the WAFOM values for these four kinds of point sets, with sizes $2^8$ to $2^{16}$. For $s = 4$, Sobol' has the largest WAFOM value, while NX has small WAFOM comparable to the 100th best $Q_{d,s}$ selected by WAFOM. For $d = 14$, NX has much larger WAFOM than that of $Q_{14,s}$, while for $d = 15$ the converse occurs. Note that this seems to be reflected in the following experiments. For $s = 8$, the four kinds of point sets show small differences in the values of their WAFOM. Indeed, NX has a smaller WAFOM value than the best point set among the randomly generated 10000 for each $d$, while Sobol' has larger WAFOM values. A mathematical analysis of this good grade of NX would be interesting.

4.3 Comparison by Numerical Integration


In addition to the above four kinds of QMC rules, the Monte Carlo method is used for comparison (using the Mersenne Twister [15] pseudorandom number generator). For the test functions, we use the 6 Genz functions [8]:

Oscillatory: $f_1(\mathbf{x}) = \cos\big(2\pi u_1 + \sum_{i=1}^{s} a_i x_i\big)$
Product Peak: $f_2(\mathbf{x}) = \prod_{i=1}^{s} \big[1/(a_i^{-2} + (x_i - u_i)^2)\big]$
Corner Peak: $f_3(\mathbf{x}) = \big(1 + \sum_{i=1}^{s} a_i x_i\big)^{-(s+1)}$
Gaussian: $f_4(\mathbf{x}) = \exp\big(-\sum_{i=1}^{s} a_i^2 (x_i - u_i)^2\big)$
Continuous: $f_5(\mathbf{x}) = \exp\big(-\sum_{i=1}^{s} a_i |x_i - u_i|\big)$
Discontinuous: $f_6(\mathbf{x}) = 0$ if $x_1 > u_1$ or $x_2 > u_2$, and $f_6(\mathbf{x}) = \exp\big(\sum_{i=1}^{s} a_i x_i\big)$ otherwise.

Fig. 3 QMC integration errors for (1) best WAFOM among 10000, (2) the 100th best WAFOM, (3) Niederreiter-Xing, (4) Sobol', (5) Monte Carlo, using the six Genz functions on the 4-dimensional unit cube. The vertical axis is for $\log_2$ of the errors, and the horizontal for $\log_2$ of the size of point sets. The error is the mean square error over 100 randomly digitally shifted point sets
This selection is copied from [22, p. 91] [11]. The parameters $a_1, \ldots, a_s$ are selected so that (1) they are in an arithmetic progression, (2) $a_s = 2a_1$, and (3) the average of $a_1, \ldots, a_s$ coincides with the average of $c_1, \ldots, c_{10}$ in [22, Eq. (10)] for each test function. The parameters $u_i$ are generated randomly by [15].
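For concreteness, the six Genz families can be coded as below (our sketch, following the standard Genz definitions as reconstructed above; the specific parameters a_i, u_i are chosen by the rules just quoted).

    # Sketch (ours) of the six Genz test families; x, a, u are sequences of length s.
    import math

    def oscillatory(x, a, u):
        return math.cos(2 * math.pi * u[0] + sum(ai * xi for ai, xi in zip(a, x)))

    def product_peak(x, a, u):
        return math.prod(1.0 / (ai ** -2 + (xi - ui) ** 2) for ai, xi, ui in zip(a, x, u))

    def corner_peak(x, a, u):
        return (1.0 + sum(ai * xi for ai, xi in zip(a, x))) ** (-(len(x) + 1))

    def gaussian(x, a, u):
        return math.exp(-sum((ai * (xi - ui)) ** 2 for ai, xi, ui in zip(a, x, u)))

    def continuous(x, a, u):
        return math.exp(-sum(ai * abs(xi - ui) for ai, xi, ui in zip(a, x, u)))

    def discontinuous(x, a, u):
        if x[0] > u[0] or x[1] > u[1]:
            return 0.0
        return math.exp(sum(ai * xi for ai, xi in zip(a, x)))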
Figure 3 shows the QMC integration errors for the six test functions with the five methods, for dimension $s = 4$. The error for Monte Carlo is of order $N^{-1/2}$. The best WAFOM point sets (WAFOM) and Niederreiter-Xing (NX) are comparable. For the function Oscillatory, whose higher derivatives grow relatively slowly, WAFOM point sets perform better than NX and Sobol', and the convergence rate seems of order $N^{-2}$. For Product Peak and Gaussian, WAFOM and NX are comparable; this coincides with the fact that the higher derivatives of these test functions grow rapidly, but still we observe a convergence rate $N^{-1.6}$. For Corner Peak, WAFOM performs better than NX. It is somewhat surprising that the convergence rate is almost $N^{-1.8}$ for WAFOM point sets. For Continuous, NX performs better than WAFOM. Since this test function is not differentiable, $\|f\|_n$ is unbounded and hence the inequality (4) has no meaning. Still, for Continuous, the convergence rate of WAFOM is almost $N^{-1.2}$. For Discontinuous, NX and Sobol' perform better than WAFOM. Note that except for Discontinuous, the large/small value of WAFOM of NX for $d = 14, 15$ observed in the left of Fig. 2 seems to be reflected in the five graphs.
We conducted similar experiments for dimension $s = 8$, but we omit the results, since the differences in WAFOM are small and the QMC rules do not show much difference. We report that we still observe a convergence rate $N^{-\alpha}$ with $\alpha > 1.05$ for the five test functions other than Discontinuous, for the WAFOM-selected points and NX.

Remark 3 The convergence rate for the integration error is even faster than that of the WAFOM values, for the WAFOM-selected point sets and NX for $s = 4$, while the Sobol' sequence converges with rate $N^{-1}$. We feel that these results go against our intuition, so we checked the code and compared with MC. We do not know why NX and WAFOM work so well.

5 WAFOM Versus Other Figures of Merit


Niederreiter's $t$-value [19] is the most established figure of merit of a digital net. Using test functions, we compare the effect of the $t$-value and of WAFOM for QMC integration.

5.1 t-Value
Let $P \subset I^s = [0,1)^s$ be a finite set of cardinality $2^m$. Let $n_1, n_2, \ldots, n_s \ge 0$ be integers. Recall that $I_{n_i}$ is the set of $2^{n_i}$ intervals partitioning $I$. Then, $\prod_{i=1}^{s} I_{n_i}$ is a set of $2^{n_1+n_2+\cdots+n_s}$ intervals. We want to make the QMC integration error 0 in computing the volume of every such interval. A trivial bound is $n_1 + n_2 + \cdots + n_s \le m$, since at least one point must fall in each interval. The point set $P$ is called a $(t,m,s)$-net if the QMC integration error for each interval is zero, for any tuple $(n_1,\ldots,n_s)$ with
$$n_1 + n_2 + \cdots + n_s \le m - t.$$
Thus, a smaller $t$-value is preferable.


Fig. 4 Left: Hellekalek's function $f_1(\mathbf{x}) = (x_1^{1.1} - \frac{1}{1+1.1})(x_2^{1.7} - \frac{1}{1+1.7})(x_3^{2.3} - \frac{1}{1+2.3})(x_4^{2.9} - \frac{1}{1+2.9})$; right: Hamukazu's function $f_2(\mathbf{x}) = \{5x_1\}\{7x_2\}\{11x_3\}\{13x_4\}$, where $\{x\} := x - [x]$. Horizontal axis for category, vertical for the $\log_2$ of the error. •: WAFOM, +: $t$-value

5.2 Experiments on WAFOM Versus t-Value


We fix the dimension $s = 4$ and the precision $n = 32$, and generate $10^6$ ($\mathbf{F}_2$-linear) point sets of cardinality $2^{12}$ by uniform random choices of their $\mathbf{F}_2$-basis consisting of 12 vectors. We sort these $10^6$ point sets according to their $t$-values. It turns out that $3 \le t \le 12$, and the frequency of the point sets for a given $t$-value is as follows.

  t      3    4     5      6      7      8     9     10    11   12
  freq.  63   6589  29594  32403  18632  8203  2994  1059  365  98

Then, we sort the same $10^6$ point sets by WAFOM. We categorize them into 10 classes from the smallest WAFOM, so that the $i$th class has the same frequency as the $i$th class by $t$-value. Thus, the same $10^6$ point sets are categorized in two ways. For a given test integrand function, we compute the mean square error of the QMC integral in each category, for those graded by $t$-value and those graded by WAFOM.

Figure 4 shows $\log_2$ of the mean square integration error, for each category corresponding to $3 \le t \le 12$ for the $t$-value (+), and for the categories sorted by WAFOM value (•). The smooth test function on the left hand side comes from Hellekalek [12], and the non-continuous function on the right hand side was communicated by Kimikazu Kato (referred to as Hamukazu according to his established twitter handle). From the left figure, for $t = 3$, the average error for the best 63 point sets with the smallest $t$-value 3 is much larger than the average over the best 63 point sets selected by WAFOM. Thus, the experiments show that for this test function, WAFOM seems to work better than the $t$-value in selecting good point sets. We have no explanation why the error decreases for $t \ge 9$. In the right figure, for Hamukazu's non-continuous test function, the $t$-value works better in selecting good points.

Thus, it is expected that digital nets that have small $t$-value and small WAFOM would work well for smooth functions and be robust for non-smooth functions. Harase [10] noticed that Owen's linear scrambling [7, Sect. 13] [26] preserves the $t$-value, but


changes WAFOM. Starting from a Niederreiter-Xing sequence with small $t$, he applied Owen's linear scrambling to find a point set with low WAFOM and small $t$-value. He obtained good results for a wide range of integrands.

5.3 Dick's $\mu_\alpha$, and Non-discretized Case


Let $\alpha > 0$ be an integer. For $A \in M_{s,n}(\mathbf{F}_2)$, Dick's $\alpha$-weight $\mu_\alpha(A)$ is defined as follows. It is a part of the summation appearing in Definition 3 of $\mu(A)$: the sum is taken over up to $\alpha$ nonzero entries from the right in each row.

Example 4 Suppose $\alpha = 2$.
$$A = \begin{pmatrix} 1&0&0&1\\ 0&1&1&1\\ 0&0&1&0\end{pmatrix} \ \mapsto\ (j\,a_{ij}) = \begin{pmatrix} 1&0&0&4\\ 0&2&3&4\\ 0&0&3&0\end{pmatrix} \ \mapsto\ \mu_2(A) = (1+0+0+4) + (0+0+3+4) + (0+0+3+0) = 15.$$
For an $\mathbf{F}_2$-linear $P \subset M_{s,n}(\mathbf{F}_2)$,
$$W_\alpha(P) := \sum_{A\in P^\perp\setminus\{0\}} 2^{-\mu_\alpha(A)}. \qquad (5)$$

To be precise, we need to take $n \to \infty$, as follows. We identify $I = [0,1]$ with the product $W := \mathbf{F}_2^{\,\mathbb{N}}$ via the binary fractional expansion (neglecting a measure-zero set). Let $K := \mathbf{F}_2^{\,\oplus\mathbb{N}} \subset W$ be the subspace consisting of vectors with a finite number of nonzero components (this is usually identified with $\mathbb{N}\cup\{0\}$ via binary expansion and reversing the digits). We define the inner product $W \times K \to \mathbf{F}_2$ as usual. Then, for a finite subgroup $P \subset W^s$, its perpendicular space $P^\perp \subset K^s$ is defined and is countable. For $A \in K^s$, $\mu_\alpha(A)$ is analogously defined, and the right hand side of (5) is absolutely convergent. Dick [3] proved
$$\mathrm{Error}(f;P) \le C(s,\alpha)\,\|f\|_\alpha\, W_\alpha(P),$$
and constructed a sequence of $P$ with $W_\alpha(P) = O(N^{-\alpha}(\log N)^{s\alpha})$, called higher order digital nets. (See [7] for a comprehensive explanation.) Existence results and search algorithms for higher order polynomial lattice rules are studied in [1, 5].

WAFOM is an $n$-digit discretized version of $W_\alpha$ where $\alpha = n$. WAFOM loses the freedom to choose $\alpha$, but this might be a merit since we do not need to choose $\alpha$.

Remark 4 In Dick's theory, $\alpha$ is fixed. In fact, setting $\alpha = \log N$ does not yield a useful bound, since $C(s,\log N)\,W_{\log N}(P) \to \infty$ ($N \to \infty$).

The above experiments show that, to have a small QMC-error by low WAFOM point sets, the integrand should have high order partial derivatives with small norms (see also the preceding research [11]). However, WAFOM seems to work with some non-differentiable functions (such as Continuous in the previous section).


5.4 t-Value Again


Niederreiter-Pirsic [20] showed that for a digital net $P$, the strict $t$-value of $P$ as a $(t,m,s)$-net is expressed as
$$m - t + 1 = \min_{A\in P^\perp\setminus\{0\}} \mu_1(A). \qquad (6)$$
Here $\mu_1$ is Dick's $\mu_\alpha$-weight for $\alpha = 1$, which is known as the Niederreiter-Rosenbloom-Tsfasman weight.


There is a strong resemblance between (6) and Definition 4. Again in (6), high complexity of all elements in $P^\perp\setminus\{0\}$ gives strong uniformity (i.e., a small $t$-value). The right hand side of (6) is efficiently computable by a MacWilliams-type identity in $O(sN\log N)$ steps of integer operations [6].
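For tiny parameters, formula (6) can also be checked by brute force. The following sketch (ours) enumerates $P^\perp$ over all of $V$ and reads off the strict $t$-value from the minimal NRT weight; it assumes the same $sn$-bit integer encoding of points as in the earlier sketches, that $P$ contains all $2^m$ points of the subspace, and that $P \ne V$.

    # Sketch (ours): strict t-value via (6), feasible only for very small s and n.
    def nrt_weight(a, s, n):               # mu_1: per row, position of the rightmost nonzero digit
        w = 0
        for i in range(s):
            row = (a >> (i * n)) & ((1 << n) - 1)
            w += row.bit_length()          # 0 for an all-zero row
        return w

    def strict_t(points, s, n, m):
        perp = [a for a in range(1, 1 << (s * n))
                if all(bin(a & b).count("1") % 2 == 0 for b in points)]
        return m + 1 - min(nrt_weight(a, s, n) for a in perp)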
Remark 5 The formula (6) for the $t$-value uses the minimum over $P^\perp$, while Definition 4 of WAFOM and (5) use the summation over $P^\perp$. Can we connect the $t$-value in (6) with WAFOM in Definition 4? It may perhaps relate to ultra-discretization [14].

6 Randomization by Digital Shift


Let $P \subset M_{s,n}(\mathbf{F}_2)$ be a linear subspace. Choose $\sigma \in M_{s,n}(\mathbf{F}_2)$. The point set $P + \sigma := \{B + \sigma \mid B \in P\}$ is called the digital shift of $P$ by $\sigma$. Since $P + \sigma$ is not an $\mathbf{F}_2$-linear subspace, one cannot define $\mathrm{WF}(P+\sigma)$. Nevertheless, the same error bound holds as for $P$. Under a uniform random choice of $\sigma$, $P + \sigma$ becomes unbiased. Moreover, the mean square error is bounded as follows:

Theorem 4 (Goda-Ohori-Suzuki-Yoshiki [9])
$$\mathrm{Error}(f_n; P+\sigma) \le C(s,n)\,\|f\|_n\,\mathrm{WF}(P), \quad\text{and}\quad \sqrt{\mathbb{E}\big(\mathrm{Error}(f_n; P+\sigma)^2\big)} \le C(s,n)\,\|f\|_n\,\mathrm{WF}_{\mathrm{r.m.s.}}(P),$$
where
$$\mathrm{WF}_{\mathrm{r.m.s.}}(P) := \sqrt{\sum_{A\in P^\perp\setminus\{0\}} 2^{-2\mu(A)}}.$$
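In code, a digital shift is a single XOR with a fixed random matrix. The following minimal sketch (ours) uses the $sn$-bit integer encoding of points from the earlier sketches.

    # Sketch (ours): digitally shift every point of a net by one random sigma;
    # XOR is componentwise addition in F_2. The shifted set is no longer linear.
    import random

    def digital_shift(points, s, n, seed=0):
        sigma = random.Random(seed).getrandbits(s * n)
        return [x ^ sigma for x in points]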

7 Variants of WAFOM
As mentioned in the previous section, [9] defined $\mathrm{WF}_{\mathrm{r.m.s.}}(P)$. As another direction, the following generalization of WAFOM is proposed by Yoshiki [30] and Ohori [24]: in Definition 3, the function $\mu(A)$ may be generalized to
$$\mu_c(A) := \sum_{1\le i\le s,\ 1\le j\le n} (j + c)\,a_{ij}$$
for any (even negative) real number $c$ (note that this definition is different from that of $\mu_\alpha$, but we could not find a better notation). Then Definition 4 gives $\mathrm{WF}_c(P)$. The case $c = 1$ is dealt with in [30]. A weak point of the original WAFOM is that the WAFOM value does not vary enough and consequently it is not useful in grading point sets for large $s$; see Fig. 2, the $s = 8$ case. By choosing a suitable $c$, we obtain a $\mathrm{WF}_c(P)$ that varies for large $s$ (even for $s = 16$) and is useful in choosing a good point set [24]. A table of bases of such point sets is available from Ohori's GitHub Pages: http://majiang.github.io/qmc/index.html. These point sets are obtained by Ohori, using Harase's method based on linear scrambling, from NX sequences. Thus, they have small $t$-values and small WAFOM values. Experiments show their good performance [18].

8 Conclusion
The Walsh figure of merit (WAFOM) [16] for $\mathbf{F}_2$-linear point sets as a quality measure for a QMC rule is discussed. Since WAFOM satisfies a Koksma–Hlawka type inequality (4), its effectiveness for very smooth functions is assured. Through the experiments on QMC integration, we observed that the low WAFOM point sets show higher order convergence such as $O(N^{-1.2})$ for several test functions (including a non-smooth one) in dimension four, and $O(N^{-1.05})$ for dimension eight.

Acknowledgments The authors are deeply indebted to Josef Dick, who patiently and generously informed us of beautiful researches in this area, and to Harald Niederreiter for leading us to this research. They thank the members of the Komaba-Applied-Algebra Seminar (KAPALS) for their indispensable help: Takashi Goda, Shin Harase, Shinsuke Mori, Syoiti Ninomiya, Mutsuo Saito, Kosuke Suzuki, and Takehito Yoshiki. We are thankful to the referees, who informed us of numerous improvements of the manuscript. The first author is partially supported by JST CREST, JSPS/MEXT Grant-in-Aid for Scientific Research No.21654017, No.23244002, No.24654019, and No.15K13460. The second author is partially supported by the Program for Leading Graduate Schools, MEXT, Japan.

References
1. Baldeaux, J., Dick, J., Leobacher, G., Nuyens, D., Pillichshammer, F.: Efficient calculation of the worst-case error and (fast) component-by-component construction of higher order polynomial lattice rules. Numer. Algorithms 59, 403–431 (2012)
2. Dick, J.: Walsh spaces containing smooth functions and quasi-Monte Carlo rules of arbitrary high order. SIAM J. Numer. Anal. 46, 1519–1553 (2008)
3. Dick, J.: The decay of the Walsh coefficients of smooth functions. Bull. Austral. Math. Soc. 80, 430–453 (2009)
4. Dick, J.: On quasi-Monte Carlo rules achieving higher order convergence. In: Monte Carlo and Quasi-Monte Carlo Methods 2008, pp. 73–96. Springer, Berlin (2009)
5. Dick, J., Kritzer, P., Pillichshammer, F., Schmid, W.: On the existence of higher order polynomial lattices based on a generalized figure of merit. J. Complex. 23, 581–593 (2007)
6. Dick, J., Matsumoto, M.: On the fast computation of the weight enumerator polynomial and the t value of digital nets over finite abelian groups. SIAM J. Discret. Math. 27, 1335–1359 (2013)
7. Dick, J., Pillichshammer, F.: Digital Nets and Sequences. Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
8. Genz, A.: A package for testing multiple integration subroutines. In: Numerical Integration: Recent Developments, Software and Applications, pp. 337–340. Springer, Berlin (1987)
9. Goda, T., Ohori, R., Suzuki, K., Yoshiki, T.: The mean square quasi-Monte Carlo error for digitally shifted digital nets. In: Cools, R., Nuyens, D. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2014, vol. 163, pp. 331–350. Springer, Heidelberg (2016)
10. Harase, S.: Quasi-Monte Carlo point sets with small t-values and WAFOM. Appl. Math. Comput. 254, 318–326 (2015)
11. Harase, S., Ohori, R.: A search for extensible low-WAFOM point sets. arXiv:1309.7828
12. Hellekalek, P.: On the assessment of random and quasi-random point sets. In: Random and Quasi-Random Point Sets, pp. 49–108. Springer, Berlin (1998)
13. Joe, S., Kuo, F.: Constructing Sobol' sequences with better two-dimensional projections. SIAM J. Sci. Comput. 30, 2635–2654 (2008). http://web.maths.unsw.edu.au/~fkuo/sobol/new-joe-kuo-6.21201
14. Kakei, S.: Development in Discrete Integrable Systems - Ultra-discretization, Quantization. RIMS, Kyoto (2001)
15. Matsumoto, M., Nishimura, T.: Mersenne twister: a 623-dimensionally equidistributed uniform pseudorandom number generator. ACM Trans. Model. Comput. Simul. 8(1), 3–30 (1998). http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
16. Matsumoto, M., Saito, M., Matoba, K.: A computable figure of merit for quasi-Monte Carlo point sets. Math. Comput. 83, 1233–1250 (2014)
17. Matsumoto, M., Yoshiki, T.: Existence of higher order convergent quasi-Monte Carlo rules via Walsh figure of merit. In: Monte Carlo and Quasi-Monte Carlo Methods 2012, pp. 569–579. Springer, Berlin (2013)
18. Mori, S.: A fast QMC computation by low-WAFOM point sets. In preparation
19. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. CBMS-NSF, Philadelphia (1992)
20. Niederreiter, H., Pirsic, G.: Duality for digital nets and its applications. Acta Arith. 97, 173–182 (2001)
21. Niederreiter, H., Xing, C.P.: Low-discrepancy sequences and global function fields with many rational places. Finite Fields Appl. 2, 241–273 (1996)
22. Novak, E., Ritter, K.: High-dimensional integration of smooth functions over cubes. Numer. Math. 75, 79–97 (1996)
23. Nuyens, D.: The magic point shop of QMC point generators and generating vectors. http://people.cs.kuleuven.be/~dirk.nuyens/qmc-generators/. Home page
24. Ohori, R.: Efficient quasi-Monte Carlo integration by adjusting the derivation-sensitivity parameter of Walsh figure of merit. Master's Thesis (2015)
25. Ohori, R., Yoshiki, T.: Walsh figure of merit is efficiently approximable. In preparation
26. Owen, A.B.: Randomly permuted (t, m, s)-nets and (t, s)-sequences. In: Monte Carlo and Quasi-Monte Carlo Methods 1994, pp. 299–317. Springer, Berlin (1995)
27. Pirsic, G.: A software implementation of Niederreiter-Xing sequences. In: Monte Carlo and Quasi-Monte Carlo Methods 2000 (Hong Kong), pp. 434–445 (2002)
28. Suzuki, K.: An explicit construction of point sets with large minimum Dick weight. J. Complex. 30, 347–354 (2014)


29. Suzuki, K.: WAFOM on abelian groups for quasi-Monte Carlo point sets. Hiroshima Math. J. 45, 341–364 (2015)
30. Yoshiki, T.: Bounds on Walsh coefficients by dyadic difference and a new Koksma-Hlawka type inequality for quasi-Monte Carlo integration. arXiv:1504.03175
31. Yoshiki, T.: A lower bound on WAFOM. Hiroshima Math. J. 44, 261–266 (2014)

Some Results on the Complexity of Numerical Integration

Erich Novak

Abstract We present some results on the complexity of numerical integration. We start with the seminal paper of Bakhvalov (1959) and end with new results on the curse of dimensionality and on the complexity of oscillatory integrals. This survey paper consists of four parts:
1. Classical results till 1971
2. Randomized algorithms
3. Tensor product problems, tractability and weighted norms
4. Some recent results: $C^k$ functions and oscillatory integrals

Keywords Complexity of integration · Curse of dimensionality · Randomized algorithms · Tractability

1 Classical Results Till 1971


I start with a warning: We do not discuss the complexity of path integration and infinite-dimensional integration on $\mathbb{R}^{\mathbb{N}}$ or other domains, although there are exciting new results in that area, see [7, 15, 21, 22, 29, 41, 43, 44, 56, 70, 76, 88, 95, 120, 122]. For parametric integrals see [16, 17]; for quantum computers, see [48, 49, 79, 114].

We mainly study the problem of numerical integration, i.e., of approximating the integral
$$S_d(f) = \int_{D_d} f(x)\,dx \qquad (1)$$
over an open subset $D_d \subset \mathbb{R}^d$ of Lebesgue measure $\lambda_d(D_d) = 1$ for integrable functions $f : D_d \to \mathbb{R}$. The main interest is in the behavior of the minimal number of function values that are needed in the worst case setting to achieve an error at most $\varepsilon > 0$.
E. Novak (B)
Mathematisches Institut, University Jena, Ernst-Abbe-Platz 2, 07743 Jena, Germany
e-mail: erich.novak@uni-jena.de

Note that classical examples of domains $D_d$ are the unit cube $[0,1]^d$ and the normalized Euclidean ball (with volume 1), which are closed. However, we work with their interiors for definiteness of certain derivatives.

We state our problem. Let $F_d$ be a class of integrable functions $f : D_d \to \mathbb{R}$. For $f \in F_d$, we approximate the integral $S_d(f)$, see (1), by algorithms of the form
$$A_n(f) = \phi_n(f(x_1), f(x_2), \ldots, f(x_n)),$$
where $x_j \in D_d$ can be chosen adaptively and $\phi_n : \mathbb{R}^n \to \mathbb{R}$ is an arbitrary mapping. Adaption means that the selection of $x_j$ may depend on the already computed values $f(x_1), f(x_2), \ldots, f(x_{j-1})$. We define $N : F_d \to \mathbb{R}^n$ by $N(f) = (f(x_1), \ldots, f(x_n))$. The (worst case) error of the algorithm $A_n$ is defined by
$$e(A_n) = \sup_{f\in F_d} |S_d(f) - A_n(f)|,$$
the optimal error bounds are given by
$$e(n, F_d) = \inf_{A_n} e(A_n).$$
The information complexity $n(\varepsilon, F_d)$ is the minimal number of function values which is needed to guarantee that the error is at most $\varepsilon$, i.e.,
$$n(\varepsilon, F_d) = \min\{n \mid \exists\, A_n \text{ such that } e(A_n) \le \varepsilon\}.$$
We minimize $n$ over all choices of adaptive sample points $x_j$ and mappings $\phi_n$.
In this paper we give an overview of some of the basic results that are known about the numbers $e(n, F_d)$ and $n(\varepsilon, F_d)$. Hence we concentrate on complexity issues and leave aside other important questions such as implementation issues.

It was proved by Smolyak and Bakhvalov that as long as the class $F_d$ is convex and balanced we may restrict the minimization of $e(A_n)$ by considering only nonadaptive choices of $x_j$ and linear mappings $\phi_n$, i.e., it is enough to consider $A_n$ of the form
$$A_n(f) = \sum_{i=1}^{n} a_i f(x_i). \qquad (2)$$

Theorem 0 (Bakhvalov [6]) Assume that the class $F_d$ is convex and balanced. Then
$$e(n, F_d) = \inf_{x_1,\ldots,x_n}\ \sup_{\substack{f\in F_d \\ N(f)=0}} S_d(f) \qquad (3)$$
and for the infimum in the definition of $e(n, F_d)$ it is enough to consider linear and nonadaptive algorithms $A_n$ of the form (2).


In this paper we only consider convex and balanced $F_d$, and then we can use the last formula for $e(n, F_d)$.

Remark 0 (a) For a proof of Theorem 0 see, for example, [87, Theorem 4.7]. This result is not really about complexity (hence it got its number), but it helps to prove complexity results.
(b) A linear algorithm $A_n$ is called a quasi-Monte Carlo (QMC) algorithm if $a_i = 1/n$ for all $i$, and is called a positive quadrature formula if $a_i > 0$ for all $i$. In general it may happen that optimal quadrature formulas have some negative weights and, in addition, we cannot say much about the position of good points $x_i$.
(c) More on the optimality of linear algorithms and on the power of adaption can be found in [14, 77, 87, 112, 113]. There are important classes of functions that are not balanced and convex, and where Theorem 0 cannot be applied, see also [13, 94].

The optimal order of convergence plays an important role in numerical analysis. We start with a classical result of Bakhvalov (1959) for the class
$$F_d^k = \{f : [0,1]^d \to \mathbb{R} \mid \|D^\alpha f\|_\infty \le 1,\ |\alpha| \le k\},$$
where $k \in \mathbb{N}$ and $|\alpha| = \sum_{i=1}^{d} \alpha_i$ for $\alpha \in \mathbb{N}_0^d$, and $D^\alpha f$ denotes the respective partial derivative. For two sequences $a_n$ and $b_n$ of positive numbers we write $a_n \asymp b_n$ if there are positive numbers $c$ and $C$ such that $c < a_n/b_n < C$ for all $n \in \mathbb{N}$.
Theorem 1 (Bakhvalov [5])
$$e(n, F_d^k) \asymp n^{-k/d}. \qquad (4)$$

Remark 1 (a) For such a complexity result one needs to prove an upper bound (for a particular algorithm) and a lower bound (for all algorithms). For the upper bound one can use tensor product methods based on a regular grid, i.e., one can use the $n = m^d$ points $x_i$ with coordinates from the set $\{1/(2m), 3/(2m), \ldots, (2m-1)/(2m)\}$. The lower bound can be proved with the technique of bump functions: One can construct $2n$ functions $f_1, \ldots, f_{2n}$ with disjoint supports such that all $2^{2n}$ functions of the form $\sum_{i=1}^{2n} \varepsilon_i f_i$ are contained in $F_d^k$, where $\varepsilon_i = \pm 1$ and $S_d(f_i) \ge c_{d,k}\, n^{-k/d-1}$. Since an algorithm $A_n$ can only compute $n$ function values, there are two functions $f^+ = \sum_{i=1}^{2n} f_i$ and $f^- = f^+ - 2\sum_{k=1}^{n} f_{i_k}$ such that $f^+, f^- \in F_d^k$ and $A_n(f^+) = A_n(f^-)$ but $|S_d(f^+) - S_d(f^-)| \ge 2n\, c_{d,k}\, n^{-k/d-1}$. Hence the error of $A_n$ must be at least $c_{d,k}\, n^{-k/d}$. For the details see, for example, [78].
(b) Observe that we cannot conclude much on $n(\varepsilon, F_d^k)$ if $\varepsilon$ is fixed and $d$ is large, since Theorem 1 contains hidden factors that depend on $k$ and $d$. Actually the lower bound is of the form
$$e(n, F_d^k) \ge c_{d,k}\, n^{-k/d},$$
where the $c_{d,k}$ decrease with $d$ and tend to zero.
(c) The proof of the upper bound (using tensor product algorithms) is easy since we assumed that the domain is $D_d = [0,1]^d$. The optimal order of convergence is


known for much more general spaces (such as Besov and Triebel–Lizorkin spaces) and arbitrary bounded Lipschitz domains, see [85, 115, 118]. Then the proof of the upper bounds is more difficult, however.
(d) Integration on fractals was recently studied by Dereich and Müller-Gronbach [18]. These authors also obtain an optimal order of convergence, now with the dimension in the exponent replaced by a parameter that coincides, under suitable conditions, with the Hausdorff dimension of the fractal; the definition of $S_d$ must be modified accordingly.

By the curse of dimensionality we mean that $n(\varepsilon, F_d)$ is exponentially large in $d$. That is, there are positive numbers $c$, $\varepsilon_0$ and $\gamma$ such that
$$n(\varepsilon, F_d) \ge c\,(1+\gamma)^d \qquad \text{for all } \varepsilon \le \varepsilon_0 \text{ and infinitely many } d \in \mathbb{N}. \qquad (5)$$
If, on the other hand, $n(\varepsilon, F_d)$ is bounded by a polynomial in $d$ and $\varepsilon^{-1}$, then we say that the problem is polynomially tractable. If $n(\varepsilon, F_d)$ is bounded by a polynomial in $\varepsilon^{-1}$ alone, i.e., $n(\varepsilon, F_d) \le C\varepsilon^{-p}$ for $\varepsilon < 1$, then we say that the problem is strongly polynomially tractable.

From the proof of Theorem 1 we cannot conclude whether the curse of dimensionality holds for the classes $F_d^k$ or not; see Theorem 11. Possibly Maung Zho Newn and Sharygin [124] were the first who published (in 1971) a complexity result for arbitrary $d$ with explicit constants and so proved the curse of dimensionality for Lipschitz functions.
Theorem 2 (Maung Zho Newn and Sharygin [124]) Consider the class
$$F_d = \{f : [0,1]^d \to \mathbb{R} \mid |f(x) - f(y)| \le \max_i |x_i - y_i|\}.$$
Then
$$e(n, F_d) = \frac{d}{2d+2}\, n^{-1/d}$$
for $n = m^d$ with $m \in \mathbb{N}$.

Remark 2 One can show that for $n = m^d$ the regular grid (points $x_i$ with coordinates from the set $\{1/(2m), 3/(2m), \ldots, (2m-1)/(2m)\}$) and the midpoint rule $A_n(f) = n^{-1}\sum_{i=1}^{n} f(x_i)$ are optimal. See also [3, 4, 12, 107] for this result and for generalizations to similar function spaces.
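The optimal rule of Remark 2 is easy to implement. The following sketch (ours, not from the paper) evaluates the midpoint rule on the regular grid of $n = m^d$ points.

    # Sketch (ours): midpoint rule on the regular grid with coordinates
    # {1/(2m), 3/(2m), ..., (2m-1)/(2m)} in each of the d dimensions.
    from itertools import product

    def midpoint_rule(f, d, m):
        coords = [(2 * i + 1) / (2 * m) for i in range(m)]
        return sum(f(x) for x in product(coords, repeat=d)) / m ** d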

2 Randomized Algorithms
The integration problem is difficult for all deterministic algorithms if the classes Fd
of inputs are too large, see Theorem 2. One may hope that randomized algorithms
make this problem much easier.


Randomized algorithms can be formalized in various ways leading to slightly


different models. We do not explain the technical details and only give a reason why
it makes sense to study different models for upper and lower bounds, respectively;
see [87] for more details.
Assume that we want to construct and to analyze concrete algorithms that yield
upper bounds for the (total) complexity of given problems including the arithmetic
cost and the cost of generating random numbers. Then it is reasonable to consider
a rather restrictive model of computation where, for example, only the standard
arithmetic operations are allowed. One may also restrict the use of random numbers
and study so-called restricted Monte Carlo methods, where only random bits are
allowed; see [52].
For the proof of lower bounds we take the opposite view and allow general randomized mappings and a very general kind of randomness. This makes the lower
bounds stronger.
It turns out that the results are often very robust with respect to changes of the computational model. For the purpose of this paper, it might be enough that a randomized
algorithm $A$ is a random variable $(A_\omega)_\omega$ with a random element $\omega$ where, for each fixed $\omega$, the algorithm $A_\omega$ is a (deterministic) algorithm as before. We denote by $\mu$ the distribution of the $\omega$. In addition one needs rather weak measurability assumptions, see also the textbook [73]. Let $n(f,\omega)$ be the number of function values used for fixed $\omega$ and $f$.
The number
$$\bar n(A) = \sup_{f\in F}\int n(f,\omega)\,d\mu(\omega)$$
is called the cardinality of the randomized algorithm $A$, and
$$e^{\mathrm{ran}}(A) = \sup_{f\in F}\left(\int^{*} \big(S(f) - A_\omega(f)\big)^2\, d\mu(\omega)\right)^{1/2}$$
is the error of $A$. By $\int^{*}$ we denote the upper integral. For $n \in \mathbb{N}$, define
$$e^{\mathrm{ran}}(n, F_d) = \inf\{e^{\mathrm{ran}}(A) : \bar n(A) \le n\}.$$
If $A : F \to G$ is a (measurable) deterministic algorithm then $A$ can also be treated as a randomized algorithm with respect to a Dirac (atomic) measure. In this sense we can say that deterministic algorithms are special randomized algorithms. Hence the inequality
$$e^{\mathrm{ran}}(n, F_d) \le e(n, F_d) \qquad (6)$$
is trivial.
The number $e^{\mathrm{ran}}(0, F_d)$ is called the initial error in the randomized setting. For $n = 0$, we do not sample $f$, and $A_\omega(f)$ is independent of $f$, but may depend on $\omega$.


It is easy to check that for a linear $S$ and a balanced and convex set $F$, the best we can do is to take $A = 0$ and then
$$e^{\mathrm{ran}}(0, F_d) = e(0, F_d).$$
This means that for linear problems the initial errors are the same in the worst case and randomized settings.
The main advantage of randomized algorithms is that the curse of dimensionality is not present even for certain large classes of functions. With the standard Monte Carlo method we obtain
$$e^{\mathrm{ran}}(n, F_d) \le \frac{1}{\sqrt n},$$
when $F_d$ is the unit ball of $L_p([0,1]^d)$ and $2 \le p \le \infty$. Mathé [72] proved that this is almost optimal and that the optimal algorithm is
$$A_n(f) = \frac{1}{n + \sqrt n}\sum_{i=1}^{n} f(X_i)$$
with i.i.d. random variables $X_i$ that are uniformly distributed on $[0,1]^d$. It also follows that
$$e^{\mathrm{ran}}(n, F_d) = \frac{1}{1 + \sqrt n},$$
when $F_d$ is the unit ball of $L_p([0,1]^d)$ and $2 \le p \le \infty$. In the case $1 \le p < 2$ one can only achieve the rate $n^{-1+1/p}$; for a discussion see [50].
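The two estimators can be compared directly; the sketch below (ours, not from the paper) implements the standard Monte Carlo mean and the slightly shrunk mean with denominator n + sqrt(n) discussed above.

    # Sketch (ours): standard MC mean vs. the shrunk estimator 1/(n + sqrt(n)) * sum f(X_i).
    import math, random

    def mc_mean(f, n, d, seed=0):
        rng = random.Random(seed)
        return sum(f([rng.random() for _ in range(d)]) for _ in range(n)) / n

    def shrunk_mean(f, n, d, seed=0):
        rng = random.Random(seed)
        total = sum(f([rng.random() for _ in range(d)]) for _ in range(n))
        return total / (n + math.sqrt(n))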
Bakhvalov [5] found the optimal order of convergence already in 1959 for the class
$$F_d^k = \{f : [0,1]^d \to \mathbb{R} \mid \|D^\alpha f\|_\infty \le 1,\ |\alpha| \le k\},$$
where $k \in \mathbb{N}$ and $|\alpha| = \sum_{i=1}^{d} \alpha_i$ for $\alpha \in \mathbb{N}_0^d$.

Theorem 3 (Bakhvalov [5])
$$e^{\mathrm{ran}}(n, F_d^k) \asymp n^{-k/d - 1/2}. \qquad (7)$$

Remark 3 A proof of the upper bound can be given with a technique that is often called separation of the main part or also control variates. For $n = 2m$ use $m$ function values to construct a good $L_2$ approximation $f_m$ of $f \in F_d^k$ by a deterministic algorithm. The optimal order of convergence is
$$\|f - f_m\|_2 \asymp m^{-k/d}.$$

Then use the unbiased estimator
$$A_n(f) = S_d(f_m) + \frac{1}{m}\sum_{i=1}^{m} (f - f_m)(X_i)$$
with i.i.d. random variables $X_i$ that are uniformly distributed on $[0,1]^d$. See, for example, [73, 78] for more details. We add in passing that the optimal order of convergence can be obtained for many function spaces (Besov spaces, Triebel–Lizorkin spaces) and for arbitrary bounded Lipschitz domains $D_d \subset \mathbb{R}^d$; see [85], where the approximation problem is studied. To obtain an explicit randomized algorithm with the optimal rate of convergence one needs a random number generator for the set $D_d$. If it is not possible to obtain efficiently random samples from the uniform distribution on $D_d$ one can work with Markov chain Monte Carlo (MCMC) methods, see Theorem 5.

All known proofs of lower bounds use the idea of Bakhvalov (also called Yao's Minimax Principle): study the average case setting with respect to a probability measure on $F$ and use the theorem of Fubini. For details see [45–47, 73, 78, 88].
We describe a problem that was studied by several colleagues and solved by Hinrichs [58] using deep results from functional analysis. Let $H(K_d)$ be a reproducing kernel Hilbert space of real functions defined on a Borel measurable set $D_d \subset \mathbb{R}^d$. Its reproducing kernel $K_d : D_d \times D_d \to \mathbb{R}$ is assumed to be integrable,
$$C_d^{\mathrm{init}} := \left(\int_{D_d}\int_{D_d} K_d(x,y)\, \varrho_d(x)\, \varrho_d(y)\, dx\, dy\right)^{1/2} < \infty.$$
Here, $\varrho_d$ is a probability density function on $D_d$. Without loss of generality we assume that $D_d$ and $\varrho_d$ are chosen such that there is no subset of $D_d$ with positive measure such that all functions from $H(K_d)$ vanish on it.

The inner product and the norm of $H(K_d)$ are denoted by $\langle\cdot,\cdot\rangle_{H(K_d)}$ and $\|\cdot\|_{H(K_d)}$. Consider multivariate integration
$$S_d(f) = \int_{D_d} f(x)\, \varrho_d(x)\, dx \qquad \text{for all } f \in H(K_d),$$
where it is assumed that $S_d : H(K_d) \to \mathbb{R}$ is continuous.


We approximate $S_d(f)$ in the randomized setting using importance sampling. That is, for a positive probability density function $\omega_d$ on $D_d$ we choose n random sample points $x_1, x_2, \ldots, x_n$ which are independent and distributed according to $\omega_d$ and take the algorithm
$$ A_{n,d,\omega_d}(f) = \frac{1}{n} \sum_{j=1}^{n} \frac{f(x_j)\, \varrho_d(x_j)}{\omega_d(x_j)}. $$

The error of $A_{n,d,\omega_d}$ is then
$$ e^{\mathrm{ran}}(A_{n,d,\omega_d}) = \sup_{\|f\|_{H(K_d)} \le 1} \left( \mathbb{E}_{\omega_d} \bigl( S_d(f) - A_{n,d,\omega_d}(f) \bigr)^{2} \right)^{1/2}, $$
where the expectation is with respect to the random choice of the sample points $x_j$. For n = 0 we formally take $A_{0,d,\omega_d} = 0$ and then
$$ e^{\mathrm{ran}}(0, H(K_d)) = C_d^{\mathrm{init}}. $$
Theorem 4 (Hinrichs [58]) Assume additionally that $K_d(x, y) \ge 0$ for all $x, y \in D_d$. Then there exists a positive density function $\omega_d$ such that
$$ e^{\mathrm{ran}}(A_{n,d,\omega_d}) \le \left( \frac{\pi}{2} \right)^{1/2} \frac{1}{\sqrt{n}} \; e^{\mathrm{ran}}(0, H(K_d)). $$
Hence, if we want to achieve $e^{\mathrm{ran}}(A_{n,d,\omega_d}) \le \varepsilon \, e^{\mathrm{ran}}(0, H(K_d))$ then it is enough to take
$$ n = \left\lceil \frac{\pi}{2} \, \varepsilon^{-2} \right\rceil. $$
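A minimal Python sketch of the importance sampling estimator $A_{n,d,\omega_d}$ from above follows; the densities, the sampler and the integrand are illustrative placeholders for a one-dimensional toy problem, not the optimal density from Theorem 4.

```python
# Illustrative importance sampling: A(f) = (1/n) sum f(x_j) rho(x_j) / omega(x_j).
import numpy as np

rng = np.random.default_rng(0)

def importance_sampling(f, rho, omega, draw_omega, n):
    x = draw_omega(n)                           # n samples distributed according to omega
    return np.mean(f(x) * rho(x) / omega(x))

# toy problem on D = [0,1]: rho is the uniform density, omega(x) = 2x
f          = lambda x: np.sin(np.pi * x)
rho        = lambda x: np.ones_like(x)
omega      = lambda x: 2.0 * x
draw_omega = lambda n: np.sqrt(rng.random(n))   # inverse-CDF sampling for omega
print(importance_sampling(f, rho, omega, draw_omega, n=10_000))   # close to 2/pi
```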
Remark 4 In particular, such problems are strongly polynomially tractable (for the normalized error) if the reproducing kernels are pointwise nonnegative and integrable. In [89] we prove that the exponent 2 of $\varepsilon^{-1}$ is sharp for tensor product Hilbert spaces whose univariate reproducing kernel is decomposable and univariate integration is not trivial for the two parts of the decomposition. More specifically we have
 
$$ n^{\mathrm{ran}}(\varepsilon, H(K_d)) \ge \frac{1}{8\,\varepsilon^{2}} \cdot \frac{2\alpha \ln \varepsilon^{-1} - \ln 2}{\ln \varepsilon^{-1}} \qquad \text{for all } \varepsilon \in (0,1) \text{ and } d \in \mathbb{N}, $$
where $\alpha \in [1/2, 1)$ depends on the particular space.
We stress that these estimates hold independently of the smoothness of functions
in a Hilbert space. Hence, even for spaces of very smooth functions the exponent of
strong polynomial tractability is 2.
Sometimes one cannot sample easily from the target distribution π if one wants to compute an integral
$$ S(f) = \int_{D} f(x) \, \pi(dx). $$
Then Markov chain Monte Carlo (MCMC) methods are a very versatile and widely used tool.
We use an average of a finite Markov chain sample as approximation of the mean, i.e., we approximate S(f) by
$$ S_{n,n_0}(f) = \frac{1}{n} \sum_{j=1}^{n} f(X_{j+n_0}), $$
where $(X_i)_{i \in \mathbb{N}_0}$ is a Markov chain with stationary distribution π. The number n determines the number of function evaluations of f. The number $n_0$ is the burn-in or warm up time. Intuitively, it is the number of steps of the Markov chain to get close to the stationary distribution π.
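A minimal Python sketch of the estimator $S_{n,n_0}$ follows, using a simple Metropolis random walk as the Markov chain; the target (a standard normal known only up to a constant), the proposal scale and the test function are illustrative assumptions, not part of the results below.

```python
# Illustrative MCMC estimator S_{n,n0}(f) = (1/n) sum_{j=1}^n f(X_{j+n0}).
import numpy as np

rng = np.random.default_rng(0)

def metropolis_mean(f, log_target, x0, n, n0, step=1.0):
    x, logp, total = x0, log_target(x0), 0.0
    for j in range(n + n0):
        y = x + step * rng.standard_normal()          # random walk proposal
        logp_y = log_target(y)
        if np.log(rng.random()) < logp_y - logp:      # Metropolis acceptance
            x, logp = y, logp_y
        if j >= n0:                                   # discard the burn-in
            total += f(x)
    return total / n

log_target = lambda x: -0.5 * x**2                    # unnormalised standard normal
print(metropolis_mean(lambda x: x**2, log_target, x0=3.0, n=50_000, n0=1_000))  # close to 1
```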
We study the mean square error of $S_{n,n_0}$, given by
$$ e_\nu(S_{n,n_0}, f) = \left( \mathbb{E}_{\nu,K} \, |S_{n,n_0}(f) - S(f)|^{2} \right)^{1/2}, $$
where ν and K indicate the initial distribution and the transition kernel of the chain; we work with the spaces $L_p = L_p(\pi)$. For the proof of the following error bound we refer to [98, Theorem 3.34 and Theorem 3.41].
Theorem 5 (Rudolf [98]) Let $(X_n)_{n \in \mathbb{N}}$ be a Markov chain with reversible transition kernel K, initial distribution ν, and transition operator P. Further, let
$$ \Lambda = \sup\{ |\lambda| : \lambda \in \mathrm{spec}(P - S) \}, $$
where $\mathrm{spec}(P - S)$ denotes the spectrum of the operator $(P - S) : L_2 \to L_2$, and assume that $\Lambda < 1$. Then
$$ \sup_{\|f\|_p \le 1} e_\nu(S_{n,n_0}, f)^{2} \le \frac{2}{n(1 - \Lambda)} + \frac{2\, C\, \gamma^{n_0}}{n^{2} (1 - \gamma)^{2}} \tag{8} $$
holds for p = 2 and for p = 4 under the following conditions:
• for p = 2, $d\nu/d\pi \in L_\infty$ and a transition kernel K which is $L_1$-exponentially convergent with $(\gamma, M)$ where $\gamma < 1$, i.e.,
$$ \| P^{n} - S \|_{L_1 \to L_1} \le M \gamma^{n} $$
for all $n \in \mathbb{N}$, and $C = M \, \| d\nu/d\pi - 1 \|_\infty$;
• for p = 4, $d\nu/d\pi \in L_2$ and $\gamma = \Lambda = \| P - S \|_{L_2 \to L_2} < 1$, where $C = 64 \, \| d\nu/d\pi - 1 \|_2^{2}$.
Remark 5 Let us discuss the results. First observe that we assume that the so-called spectral gap $1 - \Lambda$ is positive; in general we only know that $|\lambda| \le 1$. If the transition kernel is $L_1$-exponentially convergent, then we have an explicit error bound for integrands $f \in L_2$ whenever the initial distribution ν has a density $d\nu/d\pi \in L_\infty$. However, in general it is difficult to provide explicit values γ and M such that the transition kernel is $L_1$-exponentially convergent with $(\gamma, M)$. This motivates the consideration of transition kernels which satisfy a weaker convergence property, such as the existence of an $L_2$-spectral gap, i.e., $\| P - S \|_{L_2 \to L_2} < 1$. In this case we have an explicit error bound for integrands $f \in L_4$ whenever the initial distribution ν has a density $d\nu/d\pi \in L_2$.

Thus, by assuming a weaker convergence property of the transition kernel we obtain a weaker result in the sense that f must be in $L_4$ rather than $L_2$.
If we want to have an error of $\varepsilon \in (0,1)$ it is still not clear how to choose n and $n_0$ to minimize the total number of steps $n + n_0$. How should we choose the burn-in $n_0$? One can prove in this setting, see [98], that the choice $n_0 = \lceil \log C / (1 - \gamma) \rceil$ is a reasonable and almost optimal choice for the burn-in.
More details can be found in [83]. For a full discussion with all the proofs see [98].

3 Tensor Product Problems and Weights


We know from the work of Bakhvalov already done in 1959 that the optimal order of
convergence is n k/d for functions from the class C k ([0, 1]d ). To obtain an order of
convergence of roughly n k for every dimension d, one needs stronger smoothness
conditions. This is a major reason for the study of functions with bounded mixed
derivatives, or dominating mixed smoothness, such as the classes
$$ W_p^{k,\mathrm{mix}}([0,1]^d) = \{ f : [0,1]^d \to \mathbb{R} \mid \| D^{\alpha} f \|_p \le 1 \ \text{for} \ \|\alpha\|_\infty \le k \}. $$
Observe that functions from this class have, in particular, the high order derivative $D^{(k,k,\ldots,k)} f \in L_p$ and one may hope that the curse of dimensionality can be avoided
or at least moderated by this assumption. For k = 1 these spaces are closely related
to various notions of discrepancy, see, for example, [23, 29, 71, 88, 111].
The optimal order of convergence is known for all $k \in \mathbb{N}$ and $1 < p < \infty$ due
to the work of Roth [96, 97], Frolov [39, 40], Bykovskii [10], Temlyakov [109]
and Skriganov [101], see the survey Temlyakov [111]. The cases p {1, } are
still unsolved. The case p = 1 is strongly related to the star discrepancy, see also
Theorem 10.
Theorem 6 Assume that $k \in \mathbb{N}$ and $1 < p < \infty$. Then
$$ e(n, W_p^{k,\mathrm{mix}}([0,1]^d)) \asymp n^{-k} (\log n)^{(d-1)/2}. $$
Remark 6 The upper bound was proved by Frolov [39] for p = 2 and by
Skriganov [101] for all p > 1. The lower bound was proved by Roth [96] and
Bykovskii [10] for p = 2 and by Temlyakov [109] for all $p < \infty$. Hence it took
more than 30 years to prove Theorem 6 completely.
For functions in $W_p^{k,\mathrm{mix}}([0,1]^d)$ with compact support in $(0,1)^d$ one can take algorithms of the form
$$ A_n(f) = \frac{|\det A|}{a^{d}} \sum_{m \in \mathbb{Z}^{d}} f\!\left( \frac{A m}{a} \right), $$
where A is a suitable matrix that does not depend on k or n, and a > 0. Of course the sum is finite since we use only the points $Am/a$ in $(0,1)^d$.
This algorithm is similar to a lattice rule but is not quite a lattice rule since the points do not build an integration lattice. The sum of the weights is roughly 1, but not quite. Therefore this algorithm is not really a quasi-Monte Carlo algorithm. The algorithm $A_n$ can be modified to obtain the optimal order of convergence for the whole space $W_p^{k,\mathrm{mix}}([0,1]^d)$. The modified algorithm uses different points $x_i$ but still positive weights $a_i$. For a tutorial on this algorithm see [116]. Error bounds for Besov spaces are studied in [35]. Triebel–Lizorkin spaces and the case of small smoothness are studied in [117] and [74].
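The following minimal Python sketch only illustrates the structure of such a rule: it enumerates the finitely many points $Am/a$ that fall into $(0,1)^d$ and sums the corresponding function values. The matrix A is left as an input (choosing a proper Frolov matrix is the mathematically delicate part and is not attempted here), and the brute-force search box is an assumption of this illustration.

```python
# Illustrative evaluation of A_n(f) = (|det A| / a^d) * sum_{m in Z^d} f(A m / a)
# for f supported in (0,1)^d; A and a are inputs, the enumeration is brute force.
import itertools
import numpy as np

def frolov_type_rule(f, A, a, d):
    Ainv = np.linalg.inv(A)
    # if A m / a lies in (0,1)^d then |m_i| <= a * (max absolute row sum of A^{-1})
    M = int(np.ceil(a * np.abs(Ainv).sum(axis=1).max()))
    scale = abs(np.linalg.det(A)) / a**d
    total = 0.0
    for m in itertools.product(range(-M, M + 1), repeat=d):
        x = A @ np.array(m, dtype=float) / a
        if np.all(x > 0.0) and np.all(x < 1.0):
            total += f(x)
    return scale * total
```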
For the Besov–Nikolskii classes $S^{r}_{p,q}B(T^d)$ with $1 \le p, q \le \infty$ and $1/p < r < 2$, the optimal rate is
$$ n^{-r} (\log n)^{(d-1)(1-1/q)} $$
and can be obtained constructively with QMC algorithms, see [63]. The lower bound
was proved by Triebel [115].
The Frolov algorithm can be used as a building block for a randomized algorithm
that is universal in the sense that it has the optimal order of convergence (in the
randomized setting as well as in the worst case setting) for many different function
spaces, see [65].
A famous algorithm for tensor product problems is the Smolyak algorithm, also
called sparse grids algorithm. We can mention just a few papers and books that deal
with this topic: The algorithm was invented by Smolyak [106] and, independently,
by several other colleagues and research groups. Several error bounds were proved
by Temlyakov [108, 110]; explicit error bounds (without unknown constants) were
obtained by Wasilkowski and Wozniakowski [121, 123]. Novak and Ritter [8082]
studied the particular Clenshaw-Curtis Smolyak algorithm. A survey is Bungartz
and Griebel [9] and another one is [88, Chap. 15]. For recent results on the order of
convergence see Sickel and T. Ullrich [99, 100] and Dinh Dung and T. Ullrich [36].
The recent paper [62] contains a tractability result for the Smolyak algorithm applied
to very smooth functions. We display only one recent result on the Smolyak algorithm.
Theorem 7 (Sickel and T. Ullrich [100]) For the classes $W_2^{k,\mathrm{mix}}([0,1]^d)$ one can construct a Smolyak algorithm with the order of the error
$$ n^{-k} (\log n)^{(d-1)(k+1/2)}. \tag{9} $$

Remark 7 (a) The bound (9) is valid even for $L_2$ approximation instead of integration, but it is not known whether this upper bound is optimal for the approximation problem. Using the technique of control variates one can obtain the order
$$ n^{-k-1/2} (\log n)^{(d-1)(k+1/2)} $$
for the integration problem in the randomized setting. This algorithm is not often used since it is not easy to implement and its arithmetic cost is rather high. In addition, the rate can be improved by the algorithm of [65] to $n^{-k-1/2} (\log n)^{(d-1)/2}$.
(b) It is shown in Dinh Dũng and T. Ullrich [36] that the order (9) cannot be improved when restricting to Smolyak grids.
(c) We give a short description of the Clenshaw–Curtis Smolyak algorithm for the computation of integrals $\int_{[-1,1]^d} f(x) \, dx$ that often leads to almost optimal error bounds, see [81].
We assume that for d = 1 a sequence of formulas
$$ U^{i}(f) = \sum_{j=1}^{m_i} a_j^{i} \, f(x_j^{i}) $$
is given. In the case of numerical integration the $a_j^{i}$ are just numbers. The method $U^{i}$ uses $m_i$ function values and we assume that $U^{i+1}$ has smaller error than $U^{i}$ and $m_{i+1} > m_i$. Define then, for d > 1, the tensor product formulas
$$ (U^{i_1} \otimes \cdots \otimes U^{i_d})(f) = \sum_{j_1=1}^{m_{i_1}} \cdots \sum_{j_d=1}^{m_{i_d}} a_{j_1}^{i_1} \cdots a_{j_d}^{i_d} \, f(x_{j_1}^{i_1}, \ldots, x_{j_d}^{i_d}). $$
A tensor product formula clearly needs $m_{i_1} \cdot m_{i_2} \cdots m_{i_d}$ function values, sampled on a regular grid. The Smolyak formulas A(q, d) are clever linear combinations of tensor product formulas such that
• only tensor products with a relatively small number of knots are used;
• the linear combination is chosen in such a way that an interpolation property for d = 1 is preserved for d > 1.
The Smolyak formulas are defined by
$$ A(q, d) = \sum_{q-d+1 \le |i| \le q} (-1)^{q-|i|} \binom{d-1}{q-|i|} \, (U^{i_1} \otimes \cdots \otimes U^{i_d}), $$
where $q \ge d$. Specifically, we use, for d > 1, the Smolyak construction and start, for d = 1, with the classical Clenshaw–Curtis formula with
$$ m_1 = 1 \qquad \text{and} \qquad m_i = 2^{i-1} + 1 \ \text{for} \ i > 1. $$
The Clenshaw–Curtis formulas
$$ U^{i}(f) = \sum_{j=1}^{m_i} a_j^{i} \, f(x_j^{i}) $$
use the knots
$$ x_j^{i} = \cos\!\left( \frac{\pi (j-1)}{m_i - 1} \right), \qquad j = 1, \ldots, m_i $$
(and $x_1^1 = 0$). Hence we use nonequidistant knots. The weights $a_j^{i}$ are defined in such a way that $U^{i}$ is exact for all (univariate) polynomials of degree at most $m_i$.
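The following self-contained Python sketch implements this construction for integration over $[-1,1]^d$; it is an illustration of the formulas above, not code from the cited papers, and the univariate weights are obtained by solving a small Vandermonde system for polynomial exactness.

```python
# Illustrative Clenshaw-Curtis Smolyak cubature A(q,d) on [-1,1]^d.
import itertools
from math import comb, cos, pi
import numpy as np

def cc_rule(i):
    # Nodes and weights of the i-th Clenshaw-Curtis rule on [-1,1],
    # with m_1 = 1 and m_i = 2^(i-1) + 1 for i > 1.
    m = 1 if i == 1 else 2 ** (i - 1) + 1
    if m == 1:
        return np.array([0.0]), np.array([2.0])          # single node x = 0
    x = np.array([cos(pi * (j - 1) / (m - 1)) for j in range(1, m + 1)])
    # weights: solve sum_j a_j x_j^k = int_{-1}^1 x^k dx for k = 0, ..., m-1
    V = np.vander(x, m, increasing=True).T
    rhs = np.array([0.0 if k % 2 else 2.0 / (k + 1) for k in range(m)])
    return x, np.linalg.solve(V, rhs)

def smolyak(f, q, d):
    # Smolyak combination A(q,d) applied to f : [-1,1]^d -> R.
    total = 0.0
    for idx in itertools.product(range(1, q - d + 2), repeat=d):
        s = sum(idx)
        if not (q - d + 1 <= s <= q):
            continue
        coeff = (-1) ** (q - s) * comb(d - 1, q - s)
        rules = [cc_rule(i) for i in idx]
        for combo in itertools.product(*[list(zip(x, a)) for x, a in rules]):
            point = np.array([node for node, _ in combo])
            weight = np.prod([w for _, w in combo])
            total += coeff * weight * f(point)
    return total

# usage: integrate exp(x_1 + x_2 + x_3) over [-1,1]^3 (exact value (e - 1/e)^3)
print(smolyak(lambda x: np.exp(x.sum()), q=7, d=3))
```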
It turns out that many tensor product problems are still intractable and suffer from the curse of dimensionality, for a rather exhaustive presentation see
[87, 88, 90]. Sloan and Woźniakowski [103] describe a very interesting idea that was further developed in hundreds of papers; the paper [103] is most important and
influential. We can describe here only the very beginnings of a long ongoing story;
we present just one example instead of the whole theory.
The rough idea is that f : [0, 1]d R may depend on many variables, d is large,
but some variables or groups of variables are more important than others. Consider,
for d = 1, the inner product


 

 f, g1, =

f dx
0

1
g dx +

f  (x) g  (x) dx,

where > 0. If is small then f must be almost constant if it has small norm.
A large means that f may have a large variation and still the norm is relatively
small. Now we take tensor products of such spaces and weights 1 2 . . . and
consider the complexity of the integration problem for the unit ball Fd with respect
to this weighted norm. The kernel K of the tensor product space H (K ) is of the form
K (x, y) =

d


K i (xi , yi ),

i=1

where K is the kernel of the respective space H of univariate functions.



Theorem 8 (Sloan and Woźniakowski [103]) Assume that $\sum_{i=1}^{\infty} \gamma_i < \infty$. Then the problem is strongly polynomially tractable.
Remark 8 The paper [103] also contains a lower bound which is valid for all quasi-Monte Carlo methods. The proof of the upper bound is very interesting and an excellent example for the probabilistic method. Compute the mean of the quadratic worst case error of QMC algorithms over all $(x_1, \ldots, x_n) \in [0,1]^{nd}$ and obtain
$$ \frac{1}{n} \left( \int_{[0,1]^d} K(x, x) \, dx - \int_{[0,1]^{2d}} K(x, y) \, dx \, dy \right). $$
This expectation is of the form $C_d \, n^{-1}$ and the sequence $C_d$ is bounded if and only if $\sum_i \gamma_i < \infty$. The lower bound in [103] is based on the fact that the kernel K is always non-negative; this leads to lower bounds for QMC algorithms or, more generally, for algorithms with positive weights.
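The averaging identity above is easy to check numerically. The following Python sketch does so for an illustrative univariate reproducing kernel (the anchored Sobolev kernel $K(x,y) = 1 + \min(x,y)$, not the weighted kernel from [103]), using the standard worst-case error formula for QMC rules in an RKHS.

```python
# Numerical check of E[e^2] = (1/n) (int K(x,x) dx - int int K(x,y) dx dy) on [0,1].
import numpy as np

rng = np.random.default_rng(0)
K = lambda x, y: 1.0 + np.minimum(x, y)          # illustrative reproducing kernel

def wce_squared(points):
    # e^2 = iint K - (2/n) sum_i int K(x_i, y) dy + (1/n^2) sum_{i,j} K(x_i, x_j)
    n = len(points)
    iint_K = 1.0 + 1.0 / 3.0                               # iint (1 + min(x,y)) dx dy
    int_K = 1.0 + points - points**2 / 2.0                 # int K(x_i, y) dy
    return iint_K - 2.0 * int_K.mean() + K(points[:, None], points[None, :]).sum() / n**2

n, reps = 16, 20_000
mean_e2 = np.mean([wce_squared(rng.random(n)) for _ in range(reps)])
predicted = (1.5 - (1.0 + 1.0 / 3.0)) / n                  # (int K(x,x) dx - iint K) / n
print(mean_e2, predicted)                                  # the two values agree closely
```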
As already indicated, the work of Sloan and Woźniakowski [103] was continued in many directions. Much more general weights and many different Hilbert spaces were studied. By the probabilistic method one only obtains the existence of good QMC algorithms but, in the meantime, there exist many results about the construction of good algorithms. In this paper the focus is on the basic complexity results and therefore we simply list a few of the most relevant papers: [8, 11, 26–28, 53–55, 66–69, 92, 93, 102, 104, 105]. See also the books [23, 71, 75, 88] and the excellent
survey paper [29].

In complexity theory we want to study optimal algorithms and it is not clear whether QMC algorithms or quadrature formulas with positive coefficients $a_i$ are optimal. Observe that the Smolyak algorithm also uses negative $a_i$ and it is known
that in certain cases positive quadrature formulas are far from optimal; for examples
see [84] or [88, Sects. 10.6 and 11.3]. Therefore it is not clear whether the conditions
on the weights in Theorem 8 can be relaxed if we allow arbitrary algorithms. The
next result shows that this is not the case.
Theorem 9 ([86]) The integration problem from Theorem 8 is strongly polynomially tractable if and only if $\sum_{i=1}^{\infty} \gamma_i < \infty$.
Remark 9 Due to the known upper bound of Theorem 8, to prove Theorem 9 it is
enough to prove a lower bound for arbitrary algorithms. This is done via the technique
of decomposable kernels that was developed in [86], see also [88, Chap. 11].
We do not describe this technique here and only remark that we need for this
technique many non-zero functions f i in the Hilbert space Fd with disjoint supports.
Therefore this technique usually works for functions with finite smoothness, but not
for analytic functions.
Tractability of integration can be proved for many weighted spaces and one may ask whether there are also unweighted spaces where tractability holds as well. A famous example is given by integration problems that are related to the star discrepancy.
For $x_1, \ldots, x_n \in [0,1]^d$ define the star discrepancy by
$$ D^{*}_\infty(x_1, \ldots, x_n) = \sup_{t \in [0,1]^d} \left| t_1 \cdots t_d - \frac{1}{n} \sum_{i=1}^{n} 1_{[0,t)}(x_i) \right|; $$
the respective QMC quadrature formula is $Q_n(f) = \frac{1}{n} \sum_{i=1}^{n} f(x_i)$.
Consider the Sobolev space
$$ F_d = \{ f \in W_1^{1,\mathrm{mix}} \mid \| f \| \le 1, \ f(x) = 0 \text{ if there exists an } i \text{ with } x_i = 1 \} $$
with the norm
$$ \| f \| := \left\| \frac{\partial^{d} f}{\partial x_1 \, \partial x_2 \cdots \partial x_d} \right\|_1. $$
Then the Hlawka–Zaremba equality yields
$$ D^{*}_\infty(x_1, \ldots, x_n) = \sup_{f \in F_d} | S_d(f) - Q_n(f) |, $$
hence the star discrepancy is a worst case error bound for integration. We define
$$ n(\varepsilon, F_d) = \min\{ n \mid \exists \, x_1, \ldots, x_n \ \text{with} \ D^{*}_\infty(x_1, \ldots, x_n) \le \varepsilon \}. $$
The following result shows that this integration problem is polynomially tractable and the complexity is linear in the dimension.

Theorem 10 ([51])
$$ n(\varepsilon, F_d) \le C \, d \, \varepsilon^{-2} \tag{10} $$
and
$$ n(1/64, F_d) \ge 0.18 \, d. $$
Remark 10 This result was modified and improved in various ways and we mention some important results. Hinrichs [57] proved the lower bound
$$ n(\varepsilon, F_d) \ge c \, d \, \varepsilon^{-1} \qquad \text{for} \quad \varepsilon \le \varepsilon_0. $$
Aistleitner [1] proved that the constant C in (10) can be taken as 100. Aistleitner and Hofer [2] proved more on upper bounds.
Already the proof in [51] showed that an upper bound $D^{*}_\infty(x_1, \ldots, x_n) \le C \sqrt{d/n}$ holds with high probability if the points $x_1, \ldots, x_n$ are taken independently and uniformly distributed. Doerr [30] proved the respective lower bound, hence
$$ \mathbb{E}\bigl( D^{*}_\infty(x_1, \ldots, x_n) \bigr) \asymp \sqrt{\frac{d}{n}} \qquad \text{for} \quad n \ge d. $$
Since the upper bounds are proved with the probabilistic method, we only know the existence of points with small star discrepancy. The existence results can be transformed into (more or less explicit) constructions and the problem is, of course, to minimize the computing time as well as the discrepancy. One of the obstacles is that already the computation of the star discrepancy of given points $x_1, x_2, \ldots, x_n$ is very difficult. We refer the reader to [19, 24, 25, 31–34, 42, 59].
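As a small illustration of why the star discrepancy is awkward to compute, the following Python sketch evaluates the local discrepancy only on the grid spanned by the point coordinates (plus 1), which yields a lower bound on $D^{*}_\infty$; the point set in the usage line is an arbitrary example, and the cost is exponential in d.

```python
# Brute-force lower bound for the star discrepancy of a small point set in [0,1]^d.
import itertools
import numpy as np

def star_discrepancy_lower_bound(points):
    points = np.asarray(points)
    n, d = points.shape
    # candidate box corners t: all combinations of observed coordinates and 1.0
    grids = [np.unique(np.concatenate([points[:, j], [1.0]])) for j in range(d)]
    best = 0.0
    for t in itertools.product(*grids):
        t = np.array(t)
        volume = np.prod(t)
        fraction = np.all(points < t, axis=1).mean()    # points in the box [0, t)
        best = max(best, abs(volume - fraction))
    return best

pts = np.array([[i / 8 + 1 / 16, ((3 * i) % 8) / 8 + 1 / 16] for i in range(8)])
print(star_discrepancy_lower_bound(pts))
```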
Recently Dick [20] proved a tractability result for another unweighted space that is defined via an $L_1$-norm and consists of periodic functions; we denote Fourier coefficients by $\hat{f}(k)$, where $k \in \mathbb{Z}^d$. Let $0 < \alpha \le 1$ and $1 \le p \le \infty$ and
$$ F_{\alpha,p,d} = \Bigl\{ f : [0,1]^d \to \mathbb{R} \ \Big| \ \sum_{k \in \mathbb{Z}^d} |\hat{f}(k)| + \sup_{x,h} \frac{| f(x+h) - f(x) |}{\| h \|_p^{\alpha}} \le 1 \Bigr\}. $$
Dick proved the upper bound
$$ e(n, F_{\alpha,p,d}) \le \max\left\{ \frac{d-1}{\sqrt{n}}, \ \left( \frac{d}{n} \right)^{\alpha/p} \right\} $$
for any prime number n. Hence the complexity is at most quadratic in d.


The proof is constructive; a suitable algorithm is the following. Use the points $x_k = \bigl( \{ k/n \}, \{ k^{2}/n \}, \ldots, \{ k^{d}/n \} \bigr)$, where $k = 0, 1, \ldots, n-1$, and take the respective QMC algorithm.
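A minimal Python sketch of this QMC rule follows; the integrand in the usage line is an arbitrary test function with integral 1, and n is chosen prime as required above.

```python
# QMC rule on the points x_k = ({k/n}, {k^2/n}, ..., {k^d/n}), k = 0, ..., n-1.
import numpy as np

def korobov_pset_qmc(f, n, d):
    points = np.array([[pow(k, j, n) / n for j in range(1, d + 1)]
                       for k in range(n)])                 # {k^j/n} = (k^j mod n)/n
    return np.mean([f(x) for x in points])

f = lambda x: np.prod(1.0 + 0.5 * np.cos(2 * np.pi * x))   # integral over [0,1]^d is 1
print(korobov_pset_qmc(f, n=1009, d=4))
```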

4 Some Recent Results


We end this survey with two results that were still unpublished at the time of the conference, April 2014. First we return to the classes $C^k([0,1]^d)$, see Theorem 1. We want to be a little more general and consider the computation of
$$ S_d(f) = \int_{D_d} f(x) \, dx \tag{11} $$
up to some error $\varepsilon > 0$, where $D_d \subset \mathbb{R}^d$ has Lebesgue measure 1. The results hold for arbitrary sets $D_d$; the standard example of course is $D_d = [0,1]^d$. For convenience we consider functions $f : \mathbb{R}^d \to \mathbb{R}$. This makes the function class a bit smaller and the result a bit stronger, since our emphasis is on lower bounds.
It had not been known whether the curse of dimensionality is present for probably the most natural class, which is the unit ball of k times continuously differentiable functions,
$$ F_d^k = \{ f \in C^{k}(\mathbb{R}^d) \mid \| D^{\alpha} f \|_\infty \le 1 \ \text{for all} \ |\alpha| \le k \}, $$
where $k \in \mathbb{N}$.

Theorem 11 ([60]) The curse of dimensionality holds for the classes $F_d^k$ with the super-exponential lower bound
$$ n(\varepsilon, F_d^k) \ge c_k (1 - \varepsilon) \, d^{\,d/(2k+3)} \qquad \text{for all} \ d \in \mathbb{N} \ \text{and} \ \varepsilon \in (0,1), $$
where $c_k > 0$ depends only on k.

Remark 11 In [60, 61] we also prove that the curse of dimensionality holds for even smaller classes of functions $F_d$ for which the norms of arbitrary directional derivatives are bounded proportionally to $1/\sqrt{d}$.
We start with the fooling function
$$ f_0(x) = \min\Bigl\{ 1, \ \frac{1}{\sqrt{d}} \, \mathrm{dist}(x, P) \Bigr\} \qquad \text{for all} \ x \in \mathbb{R}^d, \qquad \text{where} \quad P = \bigcup_{i=1}^{n} B_d(x_i) $$
and $B_d(x_i)$ is the ball with center $x_i$ and radius $\sqrt{d}$. The function $f_0$ is Lipschitz. By a suitable smoothing via convolution we construct a smooth fooling function $f_k \in F_d^k$ with $f_k|_{P} = 0$.
Important elements of the proof are volume estimates (in the spirit of Elekes [38] and Dyer, Füredi and McDiarmid [37]), since we need that the volume of a neighborhood of the convex hull of n arbitrary points is exponentially small in d.
Also classes of $C^{\infty}$-functions were studied recently. We still do not know whether the integration problem suffers from the curse of dimensionality for the classes
$$ F_d = \{ f : [0,1]^d \to \mathbb{R} \mid \| D^{\alpha} f \|_\infty \le 1 \ \text{for all} \ \alpha \in \mathbb{N}_0^d \}; $$
this is Open Problem 2 from [87]. We know from Vybíral [119] and [61] that the curse is present for somewhat larger spaces and that weak tractability holds for smaller classes; this can be proved with the Smolyak algorithm, see [62].
We now consider univariate oscillatory integrals for the standard Sobolev spaces $H^s$ of periodic and non-periodic functions with an arbitrary integer $s \ge 1$. We study the approximate computation of Fourier coefficients
$$ I_k(f) = \int_0^1 f(x) \, e^{2 \pi i k x} \, dx, \qquad i = \sqrt{-1}, $$
where $k \in \mathbb{Z}$ and $f \in H^s$.
There are several recent papers about the approximate computation of highly oscillatory univariate integrals with the weight $\exp(2 \pi i k x)$, where $x \in [0,1]$ and k is an integer (or $k \in \mathbb{R}$) which is assumed to be large in absolute value, see Huybrechs and Olver [64] for a survey.
We study the Sobolev space $H^s$ for a finite $s \in \mathbb{N}$, i.e.,
$$ H^s = \{ f : [0,1] \to \mathbb{C} \mid f^{(s-1)} \ \text{is abs. cont.}, \ f^{(s)} \in L_2 \} \tag{12} $$
178

E. Novak

with the inner product
$$ \langle f, g \rangle_s = \sum_{\ell=0}^{s-1} \int_0^1 f^{(\ell)}(x) \, dx \, \overline{\int_0^1 g^{(\ell)}(x) \, dx} + \int_0^1 f^{(s)}(x) \, \overline{g^{(s)}(x)} \, dx = \sum_{\ell=0}^{s-1} \langle f^{(\ell)}, 1 \rangle_0 \, \overline{\langle g^{(\ell)}, 1 \rangle_0} + \langle f^{(s)}, g^{(s)} \rangle_0, \tag{13} $$
where $\langle f, g \rangle_0 = \int_0^1 f(x) \, \overline{g(x)} \, dx$, and norm $\| f \|_{H^s} = \langle f, f \rangle_s^{1/2}$.
For the periodic case, an algorithm that uses n function values at equally spaced points is nearly optimal, and its worst case error is bounded by $C_s (n + |k|)^{-s}$ with $C_s$ exponentially small in s. For the non-periodic case, we first compute successive derivatives up to order $s - 1$ at the end-points $x = 0$ and $x = 1$. These derivative values are used to periodize the function, which allows us to obtain error bounds similar to those for the periodic case. Asymptotically in n, the worst case error of the algorithm is of order $n^{-s}$ independently of k for both periodic and non-periodic cases.
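For the periodic case, the equally spaced rule mentioned above is simply the rectangle rule applied to $f(x) e^{2\pi i k x}$; a minimal Python sketch (with an arbitrary smooth periodic test function) follows.

```python
# Rectangle-rule approximation of the Fourier coefficient I_k(f) for periodic f.
import numpy as np

def fourier_coefficient(f, k, n):
    x = np.arange(n) / n                        # n equally spaced points in [0,1)
    return np.mean(f(x) * np.exp(2j * np.pi * k * x))

f = lambda x: np.exp(np.cos(2 * np.pi * x))     # smooth 1-periodic test function
print(fourier_coefficient(f, k=3, n=64))
```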
Theorem 12 ([91]) Consider the integration problem $I_k$ defined over the space $H^s$ of non-periodic functions with $s \in \mathbb{N}$. Then
$$ \frac{c_s}{(n + |k|)^{s}} \le e(n, k, H^s) \le \left( \frac{3}{2} \right)^{s} \frac{2}{(n + |k| - 2s + 1)^{s}} $$
for all $k \in \mathbb{Z}$ and $n \ge 2s$.


Remark 12 The minimal errors $e(n, k, H^s)$ for the non-periodic case have a peculiar property for $s \ge 2$ and large k. Namely, for $n = 0$ we obtain the initial error which is of order $|k|^{-1}$, whereas for $n \ge 2s$ it becomes of order $|k|^{-s}$. Hence, the dependence on $|k|^{-1}$ is short-lived and disappears quite quickly. For instance, take $s = 2$. Then $e(n, k, H^s)$ is of order $|k|^{-1}$ only for $n = 0$ and maybe for $n = 1, 2, 3$, and then becomes of order $|k|^{-2}$.
Acknowledgments I thank the following colleagues and friends for valuable remarks: Michael Gnewuch, Aicke Hinrichs, Robert Kunsch, Thomas Müller-Gronbach, Daniel Rudolf, Tino Ullrich, and Henryk Woźniakowski. I also thank two referees for carefully reading my manuscript.

References
1. Aistleitner, Ch.: Covering numbers, dyadic chaining and discrepancy. J. Complex. 27, 531
540 (2011)
2. Aistleitner, Ch., Hofer, M.: Probabilistic discrepancy bounds for Monte Carlo point sets. Math.
Comput. 83, 13731381 (2014)
3. Babenko, V.F.: Asymptotically sharp bounds for the remainder for the best quadrature formulas for several classes of functions. Mathematical Notes 19(3), 187–193 (1976)

Some Results on the Complexity of Numerical Integration

179

4. Babenko, V.F.: Exact asymptotics of the error of weighted cubature formulas optimal for
certain classes of functions. English Translation Mathematics Notes 20(4), 887890 (1976)
5. Bakhvalov, N.S.: On the approximate calculation of multiple integrals. Vestnik MGU, Ser. Math. Mech. Astron. Phys. Chem. 4, 3–18 (1959, in Russian). English translation: Journal of Complexity 31, 502–516 (2015)
6. Bakhvalov, N.S.: On the optimality of linear methods for operator approximation in convex
classes of functions. USSR Comput. Math. Math. Phys. 11, 244249 (1971)
7. Baldeaux, J., Gnewuch, M.: Optimal randomized multilevel algorithms for infinitedimensional integration on function spaces with ANOVA-type decomposition. SIAM J.
Numer. Anal. 52, 11281155 (2014)
8. Baldeaux, J., Dick, J., Leobacher, G., Nuyens, D., Pillichshammer, F.: Efficient calculation
of the worst-case error and (fast) component-by-component construction of higher order
polynomial lattice rules. Numer. Algorithms 59, 403431 (2012)
9. Bungartz, H.-J., Griebel, M.: Sparse grids. Acta Numer. 13, 147269 (2004)
10. Bykovskii, V.A.: On the correct order of the error of optimal cubature formulas in spaces with
dominant derivative, and on quadratic deviations of grids. Akad. Sci. USSR, Vladivostok,
Computing Center Far-Eastern Scientific Center (preprint, 1985)
11. Chen, W.W.L., Skriganov, M.M.: Explicit constructions in the classical mean squares problem
in irregularities of point distribution. J. für Reine und Angewandte Mathematik (Crelle) 545,
6795 (2002)
12. Chernaya, E.V.: Asymptotically exact estimation of the error of weighted cubature formulas
optimal in some classes of continuous functions. Ukr. Math. J. 47(10), 16061618 (1995)
13. Clancy, N., Ding, Y., Hamilton, C., Hickernell, F.J., Zhang, Y.: The cost of deterministic,
adaptive, automatic algorithms: cones, not balls. J. Complex. 30, 2145 (2014)
14. Creutzig, J., Wojtaszczyk, P.: Linear vs. nonlinear algorithms for linear problems. J. Complex.
20, 807820 (2004)
15. Creutzig, J., Dereich, S., Mller-Gronbach, Th, Ritter, K.: Infinite-dimensional quadrature
and approximation of distributions. Found. Comput. Math. 9, 391429 (2009)
16. Daun, T., Heinrich, S.: Complexity of Banach space valued and parametric integration. In:
Dick, J., Kuo, F.Y., Peters, G.W., Sloan, I.H. (eds.) Monte Carlo and Quasi-Monte Carlo
Methods 2012, pp. 297316. Springer (2013)
17. Daun, T., Heinrich, S.: Complexity of parametric integration in various smoothness classes.
J. Complex. 30, 750766, (2014)
18. Dereich, S., Mller-Gronbach, Th.: Quadrature for self-affine distributions on Rd . Found.
Comput. Math. 15, 14651500, (2015)
19. Dick, J.: A note on the existence of sequences with small star discrepancy. J. Complex. 23,
649652 (2007)
20. Dick, J.: Numerical integration of Hölder continuous, absolutely convergent Fourier-, Fourier
cosine-, and Walsh series. J. Approx. Theory 183, 1430 (2014)
21. Dick, J., Gnewuch, M.: Optimal randomized changing dimension algorithms for infinitedimensional integration on function spaces with ANOVA-type decomposition. J. Approx.
Theory 184, 111145 (2014)
22. Dick, J., Gnewuch, M.: Infinite-dimensional integration in weighted Hilbert spaces: anchored
decompositions, optimal deterministic algorithms, and higher order convergence. Found.
Comput. Math. 14, 10271077 (2014)
23. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and QuasiMonte Carlo Integration. Cambridge University Press, Cambridge (2010)
24. Dick, J., Pillichshammer, F.: Discrepancy theory and quasi-Monte Carlo integration. In: Chen,
W., Srivastav, A., Travaglini, G., (eds) Panorama in Discrepancy Theory. Lecture Notes in
Mathematics 2107, pp. 539619. Springer (2014)
25. Dick, J., Pillichshammer, F.: The weighted star discrepancy of Korobovs p-sets. Proc. Am.
Math. Soc. 143, 50435057, (2015)
26. Dick, J., Sloan, I.H., Wang, X., Wozniakowski, H.: Liberating the weights. J. Complex. 20,
593623 (2004)

180

E. Novak

27. Dick, J., Sloan, I.H., Wang, X., Wozniakowski, H.: Good lattice rules in weighted Korobov
spaces with general weights. Numer. Math. 103, 6397 (2006)
28. Dick, J., Larcher, G., Pillichshammer, F., Wozniakowski, H.: Exponential convergence and
tractability of multivariate integration for Korobov spaces. Math. Comput. 80, 905930 (2011)
29. Dick, J., Kuo, F.Y., Sloan, I.H.: High-dimensional integration: the quasi-Monte Carlo way.
Acta Numer. 22, 133288 (2013)
30. Doerr, B.: A lower bound for the discrepancy of a random point set. J. Complex. 30, 1620
(2014)
31. Doerr, B., Gnewuch, M.: Construction of low-discrepancy point sets of small size by bracketing covers and dependent randomized rounding. In: Keller, A., Heinrich, S., Niederreiter, H.,
(eds.), Monte Carlo and Quasi-Monte Carlo Methods 2006, pp. 299312. Springer (2008)
32. Doerr, B., Gnewuch, M., Kritzer, P., Pillichshammer, F.: Component-by-component construction of low-discrepancy point sets of small size. Monte Carlo Methods Appl. 14, 129149
(2008)
33. Doerr, B., Gnewuch, M., Wahlstrm, M.: Algorithmic construction of low-discrepancy point
sets via dependent randomized rounding. J. Complex. 26, 490507 (2010)
34. Doerr, C., Gnewuch, M., Wahlstrm, M.: Calculation of discrepancy measures and applications. In: Chen, W.W.L., Srivastav, A., Travaglini, G., (eds.) Panorama of Discrepancy Theory.
Lecture Notes in Mathematics 2107, pp. 621678. Springer (2014)
35. Dubinin, V.V.: Cubature formulas for Besov classes. Izvestija Mathematics 61(2), 259283
(1997)
36. Dung, D., Ullrich, T.: Lower bounds for the integration error for multivariate functions with
mixed smoothness and optimal Fibonacci cubature for functions on the square. Mathematische
Nachrichten 288, 743762 (2015)
37. Dyer, M.E., Füredi, Z., McDiarmid, C.: Random volumes in the n-cube. DIMACS Ser. Discret.
Math. Theor. Comput. Sci. 1, 3338 (1990)
38. Elekes, G.: A geometric inequality and the complexity of computing volume. Discret. Comput.
Geom. 1, 289292 (1986)
39. Frolov, K.K.: Upper bounds on the error of quadrature formulas on classes of functions. Doklady Akademii Nauk USSR 231, 818–821 (1976). English translation: Soviet Mathematics Doklady 17, 1665–1669 (1976)
40. Frolov, K.K.: Upper bounds on the discrepancy in $L_p$, $2 \le p < \infty$. Doklady Akademii Nauk USSR 252, 805–807 (1980). English translation: Soviet Mathematics Doklady 18(1), 37–41 (1977)
41. Gnewuch, M.: Infinite-dimensional integration on weighted Hilbert spaces. Math. Comput.
81, 21752205 (2012)
42. Gnewuch. M.: Entropy, randomization, derandomization, and discrepancy. In: Plaskota, L.,
Wozniakowski, H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2010, pp. 4378.
Springer (2012)
43. Gnewuch, M.: Lower error bounds for randomized multilevel and changing dimension algorithms. In: Dick, J., Kuo, F.Y., Peters, G.W., Sloan, I.H. (eds.) Monte Carlo and Quasi-Monte
Carlo Methods 2012, pp. 399415. Springer (2013)
44. Gnewuch, M., Mayer, S., Ritter, K.: On weighted Hilbert spaces and integration of functions
of infinitely many variables. J. Complex. 30, 2947 (2014)
45. Heinrich, S.: Lower bounds for the complexity of Monte Carlo function approximation. J.
Complex. 8, 277300 (1992)
46. Heinrich, S.: Random approximation in numerical analysis. In: Bierstedt, K.D., et al. (eds.)
Functional Analysis, pp. 123171. Dekker (1994)
47. Heinrich, S.: Complexity of Monte Carlo algorithms. In: The Mathematics of Numerical
Analysis, Lectures in Applied Mathematics 32, AMS-SIAM Summer Seminar, pp. 405419.
Park City, American Mathematical Society (1996)
48. Heinrich, S.: Quantum Summation with an Application to Integration. J. Complex. 18, 150
(2001)
49. Heinrich, S.: Quantum integration in Sobolev spaces. J. Complex. 19, 1942 (2003)

Some Results on the Complexity of Numerical Integration

181

50. Heinrich, S., Novak, E.: Optimal summation and integration by deterministic, randomized,
and quantum algorithms. In: Fang, K.-T., Hickernell, F.J., Niederreiter, H. (eds.) Monte Carlo
and Quasi-Monte Carlo Methods 2000, pp. 5062. Springer (2002)
51. Heinrich, S., Novak, E., Wasilkowski, G.W., Wozniakowski, H.: The inverse of the stardiscrepancy depends linearly on the dimension. Acta Arithmetica 96, 279302 (2001)
52. Heinrich, S., Novak, E., Pfeiffer, H.: How many random bits do we need for Monte Carlo
integration? In: Niederreiter, H. (ed.) Monte Carlo and Quasi-Monte Carlo Methods 2002,
pp. 2749. Springer (2004)
53. Hickernell, F.J., Wozniakowski, H.: Integration and approximation in arbitrary dimension.
Adv. Comput. Math. 12, 2558 (2000)
54. Hickernell, F.J., Wozniakowski, H.: Tractability of multivariate integration for periodic functions. J. Complex. 17, 660682 (2001)
55. Hickernell, F.J., Sloan, I.H., Wasilkowski, G.W.: On strong tractability of weighted multivariate integration. Math. Comput. 73, 19031911 (2004)
56. Hickernell, F.J., Mller-Gronbach, Th, Niu, B., Ritter, K.: Multi-level Monte Carlo algorithms
for infinite-dimensional integration on RN . J. Complex. 26, 229254 (2010)
57. Hinrichs, A.: Covering numbers, Vapnik-Cervonenkis classes and bounds for the star discrepancy. J. Complex. 20, 477483 (2004)
58. Hinrichs, A.: Optimal importance sampling for the approximation of integrals. J. Complex.
26, 125134 (2010)
59. Hinrichs, A.: Discrepancy, integration and tractability. In: Dick, J., Kuo, F.Y., Peters, G.W.,
Sloan, I.H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2012, pp. 129172. Springer
(2013)
60. Hinrichs, A., Novak, E., Ullrich, M., Wozniakowski, H.: The curse of dimensionality for
numerical integration of smooth functions. Math. Comput. 83, 28532863 (2014)
61. Hinrichs, A., Novak, E., Ullrich, M., Wozniakowski, H.: The curse of dimensionality for
numerical integration of smooth functions II. J. Complex. 30, 117143 (2014)
62. Hinrichs, A., Novak, E., Ullrich, M.: On weak tractability of the Clenshaw Curtis Smolyak
algorithm. J. Approx. Theory 183, 3144 (2014)
63. Hinrichs, A., Markhasin, L., Oettershagen, J., Ullrich, T.: Optimal quasi-Monte Carlo rules
on order 2 digital nets for the numerical integration of multivariate periodic functions, Numer.
Math. 134, (2015)
64. Huybrechs, D., Olver, S.: Highly oscillatory quadrature. Lond. Math. Soc. Lect. Note Ser.
366, 2550 (2009)
65. Krieg, D., Novak, E.: A universal algorithm for multivariate integration. Found. Comput.
Math. available at arXiv:1507.06853 [math.NA]; arXiv:1507.06853v2 [math.NA]
66. Kritzer, P., Pillichshammer, F., Wozniakowski, H.: Multivariate integration of infinitely many
times differentiable functions in weighted Korobov spaces. Math. Comput. 83, 11891206
(2014)
67. Kritzer, P., Pillichshammer, F., Wozniakowski, H.: Tractability of multivariate analytic problems. In: Uniform distribution and quasi-Monte Carlo methods, pp. 147170. De Gruyter
(2014)
68. Kuo, F.Y.: Component-by-component constructions achieve the optimal rate of convergence
for multivariate integration in weighted Korobov and Sobolev spaces. J. Complex. 19, 301
320 (2003)
69. Kuo, F.Y., Wasilkowski, G.W., Waterhouse, B.J.: Randomly shifted lattice rules for unbounded
integrands. J. Complex. 22, 630651 (2006)
70. Kuo, F.Y., Sloan, I.H., Wasilkowski, G.W., Wozniakowski, H.: Liberating the dimension. J.
Complex. 26, 422454 (2010)
71. Leobacher, G., Pillichshammer, F.: Introduction to Quasi-Monte Carlo Integration and Applications. Springer, Berlin (2014)
72. Mathé, P.: The optimal error of Monte Carlo integration. J. Complex. 11, 394–415 (1995)
73. Müller-Gronbach, Th., Novak, E., Ritter, K.: Monte-Carlo-Algorithmen. Springer, Berlin
(2012)

182

E. Novak

74. Nguyen, V.K., Ullrich, M., Ullrich, T.: Change of variable in spaces of mixed smoothness and
numerical integration of multivariate functions on the unit cube (In preparation)
75. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM (1992)
76. Niu, B., Hickernell, F., Mller-Gronbach, Th, Ritter, K.: Deterministic multi-level algorithms
for infinite-dimensional integration on RN . J. Complex. 27, 331351 (2011)
77. Novak, E.: On the power of adaption. J. Complex. 12, 199237 (1996)
78. Novak, E.: Deterministic and Stochastic Error Bounds in Numerical Analysis. Lecture Notes
in Mathematics 1349. Springer, Berlin (1988)
79. Novak, E.: Quantum complexity of integration. J. Complex. 17, 216 (2001)
80. Novak, E., Ritter, K.: High dimensional integration of smooth functions over cubes. Numer.
Math. 75, 7997 (1996)
81. Novak, E., Ritter, K.: The curse of dimension and a universal method for numerical integration.
In: Nürnberger, G., Schmidt, J.W., Walz, G. (eds.) Multivariate Approximation and Splines, vol. 125, pp. 177–188. ISNM, Birkhäuser (1997)
82. Novak, E., Ritter, K.: Simple cubature formulas with high polynomial exactness. Constr.
Approx. 15, 499522 (1999)
83. Novak, E., Rudolf, D.: Computation of expectations by Markov chain Monte Carlo methods.
In: Dahlke, S., et al. (ed.) Extraction of quantifiable information from complex systems.
Springer, Berlin (2014)
84. Novak, E., Sloan, I.H., Wozniakowski, H.: Tractability of tensor product linear operators. J.
Complex. 13, 387418 (1997)
85. Novak, E., Triebel, H.: Function spaces in Lipschitz domains and optimal rates of convergence
for sampling. Constr. Approx. 23, 325350 (2006)
86. Novak, E., Wozniakowski, H.: Intractability results for integration and discrepancy. J. Complex. 17, 388441 (2001)
87. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems, vol. I, Linear Information. European Mathematical Society (2008)
88. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems, vol. II, Standard Information for Functionals. European Mathematical Society (2010)
89. Novak, E., Wozniakowski, H.: Lower bounds on the complexity for linear functionals in the
randomized setting. J. Complex. 27, 122 (2011)
90. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems, vol. III, Standard Information for Operators. European Mathematical Society (2012)
91. Novak, E., Ullrich, M., Wozniakowski, H.: Complexity of oscillatory integration for univariate
Sobolev spaces. J. Complex. 31, 1541 (2015)
92. Nuyens, D., Cools, R.: Fast algorithms for component-by-component construction of rank-1
lattice rules in shift invariant reproducing kernel Hilbert spaces. Math. Comput. 75, 903920
(2006)
93. Nuyens, D., Cools, R.: Fast algorithms for component-by-component construction of rank-1
lattice rules with a non-prime number of points. J. Complex. 22, 428 (2006)
94. Plaskota, L., Wasilkowski, G.W.: The power of adaptive algorithms for functions with singularities. J. Fixed Point Theory Appl. 6, 227248 (2009)
95. Plaskota, L., Wasilkowski, G.W.: Tractability of infinite-dimensional integration in the worst
case and randomized settings. J. Complex. 27, 505518 (2011)
96. Roth, K.F.: On irregularities of distributions. Mathematika 1, 7379 (1954)
97. Roth, K.F.: On irregularities of distributions IV. Acta Arithmetica 37, 6775 (1980)
98. Rudolf, D.: Explicit error bounds for Markov chain Monte Carlo. Dissertationes Mathematicae
485, (2012)
99. Sickel, W., Ullrich, T.: Smolyaks algorithm, sampling on sparse grids and function spaces of
dominating mixed smoothness. East J. Approx. 13, 387425 (2007)
100. Sickel, W., Ullrich, T.: Spline interpolation on sparse grids. Appl. Anal. 90, 337383 (2011)
101. Skriganov, M.M.: Constructions of uniform distributions in terms of geometry of numbers.
St. Petersburg Math. J. 6, 635664 (1995)

Some Results on the Complexity of Numerical Integration

183

102. Sloan, I.H., Reztsov, A.V.: Component-by-component construction of good lattice rules. Math.
Comput. 71, 263273 (2002)
103. Sloan, I.H., Wozniakowski, H.: When are quasi-Monte Carlo algorithms efficient for high
dimensional integrals? J. Complex. 14, 133 (1998)
104. Sloan, I.H., Kuo, F.Y., Joe, S.: On the step-by-step construction of quasi-Monte Carlo integration rules that achieves strong tractability error bounds in weighted Sobolev spaces. Math.
Comput. 71, 16091640 (2002)
105. Sloan, I.H., Wang, X., Wozniakowski, H.: Finite-order weights imply tractability of multivariate integration. J. Complex. 20, 4674 (2004)
106. Smolyak, S.A.: Quadrature and interpolation formulas for tensor products of certain classes
of functions. Doklady Akademy Nauk SSSR 4, 240243 (1963)
107. Sukharev, A.G.: Optimal numerical integration formulas for some classes of functions. Sov.
Math. Dokl. 20, 472475 (1979)
108. Temlyakov, V.N.: Approximate recovery of periodic functions of several variables. Mathematics USSR Sbornik 56, 249261 (1987)
109. Temlyakov, V.N.: On a way of obtaining lower estimates for the error of quadrature formulas. Mat. Sb. 181, 1403–1413 (1990, in Russian). English translation: Mathematics of the USSR-Sbornik 71, 247–257 (1992)
110. Temlyakov, V.N.: On approximate recovery of functions with bounded mixed derivative. J.
Complex. 9, 4159 (1993)
111. Temlyakov, V.N.: Cubature formulas, discrepancy, and nonlinear approximation. J. Complex.
19, 352391 (2003)
112. Traub, J.F., Wozniakowski, H.: A General Theory of Optimal Algorithms. Academic Press,
Cambridge (1980)
113. Traub, J.F., Wasilkowski, G.W., Wozniakowski, H.: Information-Based Complexity. Academic Press, Cambridge (1988)
114. Traub, J.F., Wozniakowski, H.: Path integration on a quantum computer. Q. Inf. Process. 1,
365388 (2003)
115. Triebel, H.: Bases in Function Spaces, Sampling, Discrepancy, Numerical Integration. European Mathematical Society (2010)
116. Ullrich, M.: On Upper error bounds for quadrature formulas on function classes by K.K.
Frolov. In: Cools, R., Nuyens, D., (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2014,
vol. 163, pp. 571582. Springer, Heidelberg (2016)
117. Ullrich, M., Ullrich, T.: The role of Frolovs cubature formula for functions with bounded
mixed derivative. SIAM J. Numer. Anal. 54(2), 969993 (2016)
118. Vybíral, J.: Sampling numbers and function spaces. J. Complex. 23, 773–792 (2007)
119. Vybíral, J.: Weak and quasi-polynomial tractability of approximation of infinitely differentiable functions. J. Complex. 30, 48–55 (2014)
120. Wasilkowski, G.W.: Average case tractability of approximating -variate functions. Math.
Comput. 83, 13191336 (2014)
121. Wasilkowski, G.W., Wozniakowski, H.: Explicit cost bounds of algorithms for multivariate
tensor product problems. J. Complex. 11, 156 (1995)
122. Wasilkowski, G.W., Wozniakowski, H.: On tractability of path integration. J. Math. Phys. 37,
20712088 (1996)
123. Wasilkowski, G.W., Wozniakowski, H.: Weighted tensor-product algorithms for linear multivariate problems. J. Complex. 15, 402447 (1999)
124. Zho Newn, M., Sharygin, I.F.: Optimal cubature formulas in the classes D21,c and D21,l1 . In:
Problems of Numerical and Applied Mathematics, pp. 2227. Institute of Cybernetics, Uzbek
Academy of Sciences (1991, in Russian)

Approximate Bayesian
Computation: A Survey on Recent Results
Christian P. Robert

Abstract Approximate Bayesian Computation (ABC) methods have become a


mainstream statistical technique in the past decade, following the realisation by
statisticians that they are a special type of non-parametric inference. In this survey
of ABC methods, we focus on the recent literature, building on the previous survey of Marin et al. (Stat Comput 21(2):279–291, 2011) [39]. Given the importance
of model choice in the applications of ABC, and the associated difficulties in its
implementation, we also give emphasis to this aspect of ABC techniques.
Keywords Approximate Bayesian computation · Likelihood-free methods · Bayesian model choice · Sufficiency · Monte Carlo methods · Summary statistics

1 ABC Basics
Bayesian statistics and Monte Carlo methods are ideally suited to the task of passing many
models over one dataset. D. Rubin 1984

Although it now covers a wide range of application domains, approximate Bayesian


computation (ABC) was first introduced in population genetics [48, 62] to handle
models with intractable likelihoods [3]. By intractable, we mean models where the
likelihood function $\ell(\theta | y)$
• is completely defined by the probabilistic model, $y \sim f(y | \theta)$;
• is available neither in closed form, nor by numerical derivation;
• cannot easily be completed or demarginalised by the introduction of latent or auxiliary variables [53, 61];
• cannot be estimated by an unbiased estimator [2].
C.P. Robert (B)
CEREMADE, Université Paris-Dauphine, Paris, France
e-mail: xian@ceremade.dauphine.fr
C.P. Robert
Department of Statistics, University of Warwick, Coventry, UK
This intractability prohibits the direct implementation of a generic MCMC algorithm like a Gibbs or a Metropolis–Hastings scheme. Examples of intractable models associated with latent variable structures of high dimension abound, primarily in population genetics, but more generally models including combinatorial structures (e.g., trees, graphs), intractable normalising constants as in $f(y|\theta) = g(y|\theta)/Z(\theta)$ (e.g. Markov random fields, exponential graphs), and missing (or latent) variables, i.e. when
$$ f(y|\theta) = \int_{G} f(y|G, \theta) \, f(G|\theta) \, dG $$
cannot produce a likelihood function in a manageable way (while $f(y|G, \theta)$ and $f(G|\theta)$ are easily available).
The idea of the approximation behind ABC is both surprisingly simple and fundamentally related to the very nature of statistics, namely the resolution of an inverse
problem. Indeed, ABC relies on the feasibility of producing simulated data from the
inferred model or models, as it evaluates the unavailable likelihood by the proximity
of this simulated data to the observed data. In other words, it relies on the natural
assumption that the forward step (from model to data) is reasonably easy, in contrast with the backward step (from data to model). ABC involves three levels of approximation of the original Bayesian inference problem: if $y^0$ denotes the actual observation,
• ABC degrades the data precision down to a tolerance level ε, replacing the event $Y = y^0$ with the event $d(Y, y^0) \le \varepsilon$, where $d(\cdot, \cdot)$ is a distance (or deviance) measure;
• ABC substitutes for the likelihood $\ell(\theta | y^0)$ a non-parametric approximation, for instance $\mathbb{I}\{ d(z(\theta), y^0) \le \varepsilon \}$, where $z(\theta) \sim f(z | \theta)$;
• ABC most often summarises the original data $y^0$ by an (almost always) insufficient statistic, $S(y^0)$, and aims to approximate the posterior $\pi(\theta | S(y^0))$, instead of the original $\pi(\theta | y^0)$.
Not so coincidentally, [56], quoted above, used this representation in a non-algorithmic perspective as a motivation for conducting Bayesian analysis (as opposed to other forms of statistical inference). Rubin indeed details the accept–reject algorithm [53] at the core of the ABC algorithm. Namely, the following algorithm

Algorithm 1 Accept–reject for Bayesian analysis
Given an observation x^0
for t = 1 to N do
  repeat
    Generate θ' from the prior π(·)
    Generate x from the model f(·|θ')
    Accept θ' if x = x^0
  until acceptance
end for
return the N accepted values of θ'

returns as an accepted value a draw generated exactly from the posterior distribution, π(θ|x^0).
When compared with Rubin's representation, ABC produces an approximate solution, replacing the above acceptance step with the tolerance condition
$$ d(x, x^0) < \varepsilon $$
in order to handle both continuous and large finite sampling spaces,¹ X, but this early occurrence in [56] is definitely worth signalling. It is also relevant that Rubin does not promote this simulation method in situations where the likelihood is not available but rather as an intuitive way to understand posterior distributions from a frequentist perspective, because θ's from the posterior are those that could have generated the observed data. (The issue of the zero probability of the exact equality between simulated and observed data is not addressed in Rubin's paper, maybe because the notion of a match between simulated and observed data is not clearly defined.) Another (just as) early occurrence of an ABC-like algorithm was proposed by [19].
Algorithm 2 ABC (basic version)
Given an observation x^0
for t = 1 to N do
  repeat
    Generate θ' from the prior π(·)
    Generate x from the model f(·|θ')
    Compute the distance ρ(x^0, x)
    Accept θ' if ρ(x^0, x) < ε
  until acceptance
end for
return the N accepted values of θ'

The ABC method is formally implemented as in Algorithm 2, which requires calibrating the objects ρ(·,·), called the distance or divergence measure, N, the number of accepted simulations, and ε, called the tolerance. Algorithm 2 is exact (in the sense of Algorithm 1) when ε = 0. This algorithm can be easily implemented to test the performances of the ABC methods on toy examples where the exact posterior distribution is known, in order to visualise the impact of the algorithm parameters like the tolerance level ε or the choice of the distance function ρ. However, in realistic settings, it is almost never used as such, due to the curse of dimensionality. Indeed, the data x^0 is generally complex enough for the proximity ρ(x^0, x) to be large, even when both x^0 and x are generated from the same distribution.
¹ As detailed below, the distance may depend solely on an insufficient statistic S(x) and hence not be a distance from a formal perspective, while introducing a second level of approximation to the ABC scheme.
As illustrated on the

time series (toy) example of [39], the signal-to-noise² ratio produced by selecting θ's such that ρ(x^0, x) < ε falls dramatically as the dimension of x^0 increases for a fixed value of ε. This means a corresponding increase in either the total number of simulations N or in the tolerance ε is required to preserve a positive acceptance rate. In practice, it is thus paramount to first summarise the data in a so-called summary statistic before computing a proximity index. Thus enters the notion of summary statistics that is central to operational ABC algorithms, as well as the subject of much debate, as discussed in [12, 39] and below. A more realistic version of the ABC algorithm is produced in Algorithm 3, where S(·) denotes the summary statistic.
Algorithm 3 ABC (version with summary)
Given an observation x^0
for t = 1 to N do
  Generate θ^(t) from the prior π(·)
  Generate x^(t) from the model f(·|θ^(t))
  Compute d_t = ρ(S(x^0), S(x^(t)))
end for
Order the distances d_(1) ≤ d_(2) ≤ ... ≤ d_(N)
return the values θ^(t) associated with the k smallest distances
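A minimal Python sketch of Algorithm 3 on a toy problem follows; the normal model, the prior, the sample-mean summary and all numerical settings are illustrative assumptions, not choices advocated in the text.

```python
# Illustrative ABC rejection sampler with a summary statistic (Algorithm 3).
import numpy as np

rng = np.random.default_rng(0)

def abc_with_summary(x0, n_sim, k, prior_draw, simulate, summary):
    s0 = summary(x0)
    thetas = np.array([prior_draw() for _ in range(n_sim)])
    dists = np.array([abs(summary(simulate(t)) - s0) for t in thetas])
    keep = np.argsort(dists)[:k]           # keep the k smallest distances
    return thetas[keep]

# toy model: x ~ N(theta, 1), theta ~ N(0, 10^2), summary = sample mean
x0 = rng.normal(1.7, 1.0, size=50)         # pseudo-observed data
prior_draw = lambda: rng.normal(0.0, 10.0)
simulate = lambda theta: rng.normal(theta, 1.0, size=50)
sample = abc_with_summary(x0, 100_000, 1_000, prior_draw, simulate, np.mean)
print(sample.mean(), sample.std())         # approximate ABC posterior mean and spread
```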

For a general introduction to ABC methods, I refer the reader to our earlier survey
[39] and to [60], the latter constituting the original version of the Wikipedia page on
ABC [69], first published in PLoS One. The presentation made in that page is comprehensive and correct, rightly putting stress on the most important aspects of the method.
The authors also include the proper level of warning about the need to assess assumptions behind and calibrations of the method. For concision's sake, I will not cover here recent computational advances, like those linked with sequential Monte Carlo [4, 65]
and the introduction of Gaussian processes in the approximation [72].
An important question that arises in the wake of defining this approximate algorithm is whether or not it constitutes a valid approximation to the posterior distribution $\pi(\theta|S(y^0))$, if not of the original $\pi(\theta|y^0)$. (This is what we will call consistency of the ABC algorithm in the following section, meaning that the Monte Carlo approximation provided by the algorithm converges to the posterior when the number of simulations grows to infinity. The more standard notion of statistical consistency will also be invoked to justify the approximation.) In case it does not converge to the posterior, a subsequent question is whether or not the ABC output constitutes a proper form of Bayesian inference. Answers to the latter vary according to one's perspective:
• asymptotically, an infinite computing power allows for a zero tolerance, hence for a proper posterior conditioning on $S(y^0)$;
• the outcome of Algorithm 3 is an exact posterior distribution when assuming an error-in-variable model with scale ε [70];
² Or, more accurately, posterior-to-prior.

• it is also an exact posterior distribution once the data has been randomised at scale ε [24];
• it remains a formal Bayesian procedure albeit applied to an estimated likelihood.
Those answers are not fully satisfactory, in particular because using ABC implies an ad hoc modification to the sampling model, but they are also illuminating about the tension that exists between information and precision in complex models. ABC indeed provides a worse approximation of the posterior distribution when the dimension of the summary statistics increases, at a given computational cost. This may sound paradoxical from a purely statistical perspective but it is in fine a consequence of the curse of dimensionality and of the fact that the signal-to-noise ratio may be higher in a low-dimensional statistic than in the raw data. While $\pi(\theta|S(y^0))$ is less concentrated than the original $\pi(\theta|y^0)$, the ABC versions of these two posteriors,
$$ \pi_\varepsilon(\theta \mid d(S(Y), S(y^0)) \le \varepsilon) \qquad \text{and} \qquad \pi_\varepsilon(\theta \mid d(Y, y^0) \le \varepsilon), $$
may exhibit the opposite feature. (In the above, we introduce the tolerance ε to stress the dependence of the choice of the tolerance on the summary statistics.) A related difficulty with ABC is that the approximation error, of using $\pi_\varepsilon(\theta \mid d(S(Y), S(y^0)) \le \varepsilon)$ instead of $\pi(\theta|S(y^0))$ or the original $\pi(\theta|y^0)$, is unknown unless one is ready to run costly simulations.

2 ABC Consistency
ABC was first treated with suspicion by the mainstream statistical community (as
well as some population geneticists, see the fierce debate between [63, 64] and [5, 8])
because it sounded like a rudimentary version of standard Monte Carlo methods like
MCMC algorithms [53]. However, the perspective later changed, due to representations of the ABC posterior distribution as (i) a genuine posterior distribution [71] and
of ABC as an auxiliary variable method [71], (ii) a non-parametric technique [10,
11], connected with both indirect inference [20] and k-nearest neighbour estimation
[9]. This array of interpretations helped to turn ABC into an acceptable (if not fully
accepted) component of Bayesian computational methods, albeit requiring caution
and calibration [69]. The following entries cover some of the advances made in the
statistical analysis of the method. While some of the earlier justifications are about
computational consistency, namely a converging approximation when the computing
power grows to infinity, the more recent analyses are mostly focused on statistical
consistency. This perspective shift signifies that ABC is increasingly considered as
an inference method per se.

2.1 ABC as Knn


In [9], the authors made a significant contribution to the statistical foundations of
ABC. It analyses the convergence properties of the ABC algorithm in accordance

with the way it is truly implemented. In practice, as in the DIYABC software [16], the tolerance bound ε is determined as in Algorithm 3: a quantile of the simulated distances, say the 10 % or the 1 % quantile, is chosen as ε. This means in particular that the interpretation of ε as a non-parametric density estimation bandwidth, while interesting and prevalent in the literature (see, e.g., [10, 24]), is only an approximation of the actual practice.
The focus of [9] is on the mathematical foundations of this practice, an advance obtained by (re)analysing ABC as a k-nearest neighbour (knn) method. Using generic knn results, they derive a consistency property for the ABC algorithm by imposing some constraints upon the rate of decrease of the quantile $k_N$ as a function of N. More specifically, provided
$$ k_N / \log\log N \longrightarrow \infty \qquad \text{and} \qquad k_N / N \longrightarrow 0 $$
when $N \to \infty$, for almost all $s_0$ (with respect to the distribution of $S(Y)$), with probability 1, convergence occurs, i.e.
$$ \frac{1}{k_N} \sum_{j=1}^{k_N} \varphi(\theta_j) \longrightarrow \mathbb{E}[\varphi(\theta_j) \mid S = s_0]. $$
The setting is restricted to the use of sufficient statistics or, equivalently, to a distance over the whole sample. The issue of summary statistics is not addressed in the paper. The paper also contains a rigorous proof of the convergence of ABC when the tolerance ε goes to zero. The mean integrated square error consistency of the conditional kernel density estimate is established for a generic kernel (under usual assumptions). Further assumptions (both on the target and on the kernel) allow the authors to obtain precise convergence rates (as a power of the sample size), derived from classical k-nearest neighbour regression, like
$$ k_N \approx N^{(p+4)/(m+p+4)} $$
in dimensions m larger than 4 (where N is the simulation size). The paper [9] is theoretical and highly mathematical; however, this work clearly constitutes a major reference for the justification of ABC. In addition, it creates a link with machine-learning techniques, where ABC is yet at an early stage of development.

2.2 Convergence Rates


In [17], the authors address ABC consistency in the special setting of hidden Markov
models. It relates to [24] discussed below in that those authors also establish ABC
consistency for the noisy ABC, given in Algorithm 4, where h(·) is a kernel bounded
by one (as for instance the unnormalised normal density).

Algorithm 4 ABC (noisy version)
Given an observation x^0
Generate S^0 ~ h({S − S(x^0)}/ε)
for t = 1 to N do
  repeat
    Generate θ' from the prior π(·)
    Generate x from the model f(·|θ')
    Accept θ' with probability h({S^0 − S(x)}/ε)
  until acceptance
end for
return the N accepted values of θ'

In [17], an ABC scheme is derived in such a way that the ABC simulated sequence remains an HMM, the conditional distribution of the observables given the latent Markov chain being modified by the ABC acceptance ball. This means that conducting maximum likelihood (or Bayesian) estimation based on the ABC sample is equivalent to exact inference under the perturbed HMM scheme. In this sense, this equivalence also connects with the perspectives of [24, 71] on exact ABC. While the paper provides the asymptotic bias for a fixed value of the tolerance ε, it also proves that an arbitrary level of accuracy can be attained with enough data and a small enough ε. The authors of the paper show in addition (as in [24]) that ABC inference based on noisy observations $y_1 + \varepsilon \zeta_1, \ldots, y_n + \varepsilon \zeta_n$, with the same tolerance ε, is equivalent to a regular inference based on the original data $y_1, \ldots, y_n$, hence the consistency of Algorithm 4. Furthermore, the asymptotic variance of the ABC version is shown to always be larger than the asymptotic variance of the standard MLE, the difference decreasing as ε². The paper also contains an illustration on an HMM with α-stable observables. Notice that the restriction to summary statistics that preserve the HMM structure is paramount for the results in the paper to apply, hence prevents the use of truly summarising statistics that would not grow linearly in dimension with the size of the HMM series.

2.3 Checking ABC Convergence


The authors of [47] evaluate several diagnostics for ABC validation via coverage
diagnostics. Getting valid approximation diagnostics for ABC is of obvious importance, while being under-represented in the literature. When simulation time remains
manageable, the DIYABC [16] software does implement a limited coverage assessment by computing the type I error, i.e. through simulating pseudo-data under the
null model and evaluating the number of times it is rejected at the 5% level (see
Sects. 2.11.3 and 3.8 in the DIYABC documentation).
The core idea advanced by [47] is that a Bayesian credible interval on the parameter θ at a given credible level should have a similar confidence level (at least
asymptotically and even more so for matching priors). Furthermore, they support the
notion that simulating pseudo-data (à la ABC) with a known parameter value allows
for a Monte Carlo evaluation of the genuine coverage of the credible interval, hence for
a calibration of the tolerance ε. The delicate issue is about the generation of these
known parameters. For instance, if the pair (θ, y) is generated from the joint distribution made of prior times likelihood, and if the credible region is also based on
the true posterior, the average coverage is the nominal one. On the other hand, if
the credible interval is based on a poor (ABC) approximation to the posterior, the
average coverage should differ from the nominal one. Given that ABC is only an
approximation, however, this approach may fail to return a powerful diagnostic. In
their implementation, the authors end up approximating the p-value P(θ_0 < θ) and
checking for uniformity.
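
As a rough illustration of this coverage idea (not the exact procedure of [47]), the Python sketch below repeatedly draws a "known" parameter from the prior, simulates pseudo-data, runs a simple quantile-tolerance ABC, records the p-value P(θ < θ_0) under the ABC output, and tests the collection of p-values for uniformity; the model and all tuning constants are illustrative.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 50                                        # size of each pseudo-dataset

def abc_posterior(y_obs, N=20_000, alpha=0.01):
    # quantile-tolerance rejection ABC for the toy model x_i ~ N(theta, 1)
    theta = rng.normal(0.0, np.sqrt(10.0), N)
    s_sim = theta + rng.standard_normal(N) / np.sqrt(n)   # law of the sample mean
    dist = np.abs(s_sim - y_obs.mean())
    return theta[dist <= np.quantile(dist, alpha)]

p_values = []
for _ in range(200):                          # pseudo-experiments with known theta_0
    theta0 = rng.normal(0.0, np.sqrt(10.0))   # drawn from the prior
    y0 = rng.normal(theta0, 1.0, n)
    post = abc_posterior(y0)
    p_values.append(np.mean(post < theta0))   # ABC estimate of P(theta < theta_0 | y0)

# a well-calibrated approximation should give (roughly) uniform p-values
ks = stats.kstest(p_values, "uniform")
print(f"KS statistic = {ks.statistic:.3f}, p-value = {ks.pvalue:.3f}")
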

3 Improvements, Implementations, and Applications


3.1 ABC for State-Space Models
As stressed in the survey written by [30] on the use of ABC methods in a rather
general class of time-series models, these methods allow us to handle settings where
the likelihood of the current observation conditional on the past observations and on a
latent (discrete-time) process cannot be computed. The author makes the preliminary
useful remark that, in most cases, the probabilistic structure of the model (e.g.,
a hidden Markov type of dependence) is lost within the ABC representation. An
exception he and others [14, 17, 21, 31, 33, 41, 42] exploit quite thoroughly is when
the difference between the observed data and the simulated pseudo-data is operated
time step by time step, as in

∏_{t=1}^{T} I_{d(y_t, y_t^0) ≤ ε},

where y^0 = (y_1^0, . . . , y_T^0) is the actual observation. The ABC approximation indeed
retains the same likelihood structure and allows for derivations of consistency properties (in the number of observations) of the ABC estimates. In particular, using such
a distance in the algorithm allows for the approximation to converge to the genuine
posterior when the tolerance ε goes to zero [9]. This is the setting where [24] (see also
[17]) show that noisy ABC is well-calibrated, i.e. has asymptotically proper convergence properties. Most of the results obtained by Jasra and co-authors are dedicated
to specific classes of models, from iid models [17, 24, 31] to observation-driven
time-series [31] to other forms of HMM [17, 21, 41], mostly for MLE consistency
results. The constraint mentioned above leads to computational difficulties as the
acceptance rate quickly decreases with n (unless the tolerance ε is increased with n).
The authors of [31] then suggest raising the number of pseudo-observations to average indicators in the above product and to make it random in order to ensure a
fixed number of acceptances. Moving to ABC-SMC (for sequential Monte Carlo,
see [4] and Algorithm 5), [32] establish unbiasedness and convergence within this
framework, in connection with the alive particle filter [35].
Algorithm 5 ABC-SMC
Given an observation x^0, a quantile level 0 < α < 1, and a proposal distribution q_0(·)
Set ε_0 = +∞ and i = 0
repeat
  for t = 1 to N do
    Generate θ_t from the proposal q_i(·)
    Generate x' from the model f(·|θ_t)
    Compute the distance d_t = ρ(x', x^0) and the weight ω_t = π(θ_t)/q_i(θ_t)
  end for
  Set i = i + 1
  Update ε_i as the weighted α-quantile of the d_t's and q_i based on the weighted θ_t's
until ε_i is stable
return the N weighted values (θ_t, ω_t)
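
A minimal Python sketch of the adaptive loop in Algorithm 5 follows, for a toy Gaussian model. The proposal update (a Gaussian refitted to the weighted surviving particles, with inflated spread) is one common choice and is not necessarily the update used in [4]; all constants are illustrative.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

n, N, alpha = 50, 2000, 0.5
prior = stats.norm(0.0, np.sqrt(10.0))        # illustrative prior
x_obs = rng.normal(1.5, 1.0, n)
s_obs = x_obs.mean()

def simulate_summary(theta):
    # summary (sample mean) of n draws from N(theta, 1)
    return theta + rng.standard_normal(theta.size) / np.sqrt(n)

proposal, eps = prior, np.inf
for it in range(15):                          # cap on the number of SMC rounds
    theta = proposal.rvs(size=N)
    dist = np.abs(simulate_summary(theta) - s_obs)
    w = prior.pdf(theta) / proposal.pdf(theta)          # importance weights
    w /= w.sum()
    order = np.argsort(dist)
    new_eps = dist[order][np.searchsorted(np.cumsum(w[order]), alpha)]
    if np.isfinite(eps) and abs(new_eps - eps) < 1e-3 * eps:
        break                                 # "until eps is stable"
    eps = new_eps
    keep = dist <= eps
    mu = np.average(theta[keep], weights=w[keep])
    sd = np.sqrt(np.average((theta[keep] - mu) ** 2, weights=w[keep]))
    proposal = stats.norm(mu, 2.0 * sd)       # refit (and inflate) the proposal

print(f"final tolerance = {eps:.4f} after {it + 1} rounds, proposal mean = {mu:.3f}")
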

3.2 ABC with Empirical Likelihood


In [43], an ABC algorithm based on an empirical likelihood (EL) approximation is
introduced. The foundations of empirical likelihood are provided in the comprehensive book of [45]. The core idea of empirical likelihood is to use a maximum entropy
discrete distribution supported by the data and constrained by estimating equations
related with the parameters of interest or of the whole model. Given a dataset x
comprising n independent replicates x = (x_1, . . . , x_n) of a random variable X ∼ F,
and a collection of generalised moment conditions that identify the parameter (of
interest) θ,

E_F[h(X, θ)] = 0,

where h is a known function, the induced empirical likelihood [44] is defined as

L_el(θ | x) = max_p ∏_{i=1}^{n} p_i,

where the maximum is taken over all p's on the simplex of R^n such that

∑_{i=1}^{n} p_i h(x_i, θ) = 0.
As such, empirical likelihood is a non-parametric approach in the sense that the distribution of the data does not need to be specified, only some of its characteristics.
Econometricians have developed this kind of approach over the years, see e.g. [26].
However, this empirical likelihood technique can also be seen as a convergent

approximation to the likelihood and hence able to be exploited for cases when the
exact likelihood cannot be derived. For instance, [43] propose using it as a substitute
to the exact likelihood in Bayes formula, as sketched in Algorithm 6.
Algorithm 6 ABC (with empirical likelihood)
Given an observation x^0
for i = 1 to N do
  Generate θ_i from the prior π(·)
  Set the weight ω_i = L_el(θ_i | x^0)
end for
return the weighted sample (θ_i, ω_i), i = 1, . . . , N
Use the weighted sample as in importance sampling
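
To make Algorithm 6 concrete, here is a minimal Python sketch of the weight computation for the simplest scalar moment condition E_F[X − θ] = 0 (θ is the mean), with the weights obtained from the standard dual representation p_i = 1/{n(1 + λ h(x_i, θ))}; the data-generating model and the prior are illustrative and the implementation is deliberately naive.

import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(4)

def log_el(x, theta):
    # log empirical likelihood for the moment condition E[X - theta] = 0
    h = x - theta
    if h.min() >= 0.0 or h.max() <= 0.0:
        return -np.inf                        # theta outside the convex hull of the data
    # p_i = 1/(n (1 + lam h_i)) with lam solving sum_i h_i / (1 + lam h_i) = 0
    lo = -1.0 / h.max() + 1e-10
    hi = -1.0 / h.min() - 1e-10
    lam = brentq(lambda l: np.sum(h / (1.0 + l * h)), lo, hi)
    return -np.sum(np.log(x.size * (1.0 + lam * h)))

x_obs = rng.normal(1.5, 1.0, 50)              # illustrative data
N = 5000
theta = rng.normal(0.0, np.sqrt(10.0), N)     # draws from an illustrative prior
log_w = np.array([log_el(x_obs, t) for t in theta])
w = np.exp(log_w - log_w.max())               # stabilised importance weights
w /= w.sum()

print(f"ABCel posterior mean = {np.sum(w * theta):.3f}")
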

Furthermore, [43] examine the consequences of using an empirical likelihood
in ABC contexts through a collection of examples. Note that the resulting method
(termed ABCel) differs from genuine ABC algorithms in that it does not simulate
pseudo-data. (Simulated data versions produced poor performances.) The principal
difficulty with this method is in connecting the parameter of the distribution with
some moments of the (iid) data. While this link operates rather straightforwardly for
quantile distributions [1], since theoretical quantiles are available in closed form,
implementing empirical likelihood is less clear for time-series models like ARCH and
GARCH [13]. Those models actually relate to hidden Markov structures, meaning that
the underlying iid generating process is latent and has to be recovered by simulation.
Independence is indeed paramount when defining the empirical likelihood. Through a
range of simulation models and experiments, [43] demonstrates that ABCel clearly
improves upon ABC for the GARCH(1, 1) model but also that it remains less informative than a regular MCMC analysis. The difficulty in implementing the principle
is steeper for population genetic models, where parameters like divergence dates,
effective population sizes, and mutation rates cannot be expressed in terms of moments
of the distribution of the sample at a given locus. In particular, the data-points are not
iid. To bypass this difficulty, [43] resort instead to a composite likelihood formulation
[57], approximating for instance a likelihood by a product of pairwise likelihoods
over all pairs of genes. In Kingman's coalescent theory [58], the pairwise likelihoods
can indeed be expressed in closed form. However, instead of using this composite
likelihood per se, since it constitutes a rather poor substitute to the genuine likelihood, [43] rely on the associated pairwise composite score functions ∇_θ log L(θ) to
build their generalised moment conditions as E[∇_θ log L(θ)] = 0. The comparison
with optimal standard ABC outcomes shows an improvement brought by ABCel in
the approximation, at an overall computing cost that is negligible against the cost of
ABC (in the sense that it takes minutes to produce the ABCel outcome, compared
with hours for ABC).
The potential for use of the empirical likelihood approximation is much less
widespread than the possibility of simulating pseudo-data in regular ABC, since
EL essentially relies on an iid sample structure, plus the availability of parameter

defining moments. While the composite likelihood alternative provided an answer


in the important case of population genetic models, there are in fact many instances
where one simply cannot come up with a regular EL approximation. However, the
range of applications of straight EL remains wide enough to be of interest, as it
includes most dynamical models like hidden Markov models. In cases when it is
available, ABCel provides an almost free benchmark against which regular ABC
can be tested.

4 Summary Statistics, the ABC Conundrum


The main focus in the recent ABC literature has been on the selection and evaluation
of summary statistics, including a Royal Statistical Society Read Paper [24] that set a
reference and gave prospective developments in the discussion section. Transforming
the data into a statistic of small dimension but nonetheless sufficiently informative
constitutes a fundamental difficulty with ABC. Indeed, it is most often the case that
there is no non-trivial sufficient statistic and that summary statistics are not already
provided by the software (like DIYABC, [16]) or predetermined by practitioners
from the field. This choice has to balance a loss of statistical information against a
gain in ABC precision, with little available on the amounts of error and information
loss involved in the ABC substitution.

4.1 The Read Paper


In what is now a reference paper, [24] proposed an original approach to ABC, where
ABC is considered from a purely inferential viewpoint and calibrated for estimation
purposes. Quite logically, Fearnhead and Prangle (2012) do not follow the more
traditional perspective of representing ABC as a converging approximation to the
true posterior density. Like [71], they take instead a randomised or noisy version of
the observed summary statistic and then derive a calibrated version of ABC, i.e. an
algorithm that gives proper predictions, the drawback being that it is for the posterior
given this randomised version of the summary statistics. The paper also contains an
important result in the form of a consistency theorem which shows that noisy ABC is
a convergent estimation method when the number of observations or datasets grows
to infinity. The most interesting aspect in this switch of perspective is that the kernel
K(·) used in the acceptance probability

K((s − s_obs)/h)

does not have to act as an estimate of the true sampling density, since it appears in
the (randomised) pseudo-model. (Everything collapses to the true model when the
bandwidth h goes to zero.) The Monte Carlo error is taken into account through the

average acceptance probability, which converges to zero with h, demonstrating that
h = 0 is a suboptimal choice.
A form of tautology stems from the comparison of ABC posteriors via a loss
function

(θ_0 − θ̂)^T A (θ_0 − θ̂),

that ends up with the best asymptotic summary statistic being the Bayes estimate
itself,

E[θ | y_obs].
This result indeed follows from the very choice of the loss function rather than from an
intrinsic criterion. Using the posterior expectation as the summary statistic still makes
sense, especially when the calibration constraint implies that the ABC approximation has the same posterior mean as the true (randomised) posterior. Unfortunately
this result is parameterisation dependent and unlikely to be available in settings
where ABC is necessary. In the semi-automatic implementation proposed by [24],
the authors suggest using a pilot run of ABC to approximate the above statistics.
The simplification in the paper follows from a linear regression on the parameters,
thus linking the approach with [6]. The paper also accounts for computing costs and
stresses the relevance of the indirect inference literature [20, 27].
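
As a schematic illustration of this semi-automatic construction (only of its linear-regression core, not of the full procedure in [24]), the following Python sketch simulates a pilot set of (θ, y) pairs, regresses θ linearly on a few candidate transformations of y, and then uses the fitted values Ê[θ | y] as the summary statistic in a second, standard rejection ABC pass; the model, the candidate features and the quantile are illustrative.

import numpy as np

rng = np.random.default_rng(5)
n = 50

def simulate(theta):
    # illustrative model: Gamma observations whose shape depends on theta
    return rng.gamma(np.exp(theta)[:, None], 1.0, (theta.size, n))

def features(x):
    # candidate transformations of the data used as regressors
    return np.column_stack([x.mean(axis=1), x.std(axis=1), np.log(x).mean(axis=1)])

x_obs = simulate(np.array([1.0]))             # pretend observation (theta = 1)

# pilot run: fit the linear predictor of theta from the features of y
M_pilot = 5000
theta_p = rng.uniform(-1.0, 2.0, M_pilot)     # illustrative prior
Z = np.column_stack([np.ones(M_pilot), features(simulate(theta_p))])
beta, *_ = np.linalg.lstsq(Z, theta_p, rcond=None)

def summary(x):
    z = np.column_stack([np.ones(x.shape[0]), features(x)])
    return z @ beta                           # fitted E[theta | y]

# main ABC run with the constructed one-dimensional summary
M = 50_000
theta = rng.uniform(-1.0, 2.0, M)
dist = np.abs(summary(simulate(theta)) - summary(x_obs))
eps = np.quantile(dist, 0.01)
print(f"semi-automatic ABC posterior mean = {theta[dist <= eps].mean():.3f}")
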
As exposed in my discussion [52], I remain skeptical about the optimality
resulting from the choice of summary statistics in the paper, partly because practice
shows that proper approximation to genuine posterior distributions stems from using
a (much) larger number of summary statistics than the dimension of the parameter
(albeit un-achievable at a given computing cost), partly because the validity of the
approximation to the optimal summary statistics depends on the quality of the pilot
run, and partly because there are some imprecisions in the mathematical derivation
of the results [52]. Furthermore, important inferential issues like model choice are
not covered by this approach. But, nonetheless, the paper provides a way to construct
default summary statistics that should come as a supplement to summary statistics
provided by the experts, or even as a substitute.

4.2 A Review of Dimension Reduction Techniques


In [12], the authors offer a detailed review of dimension reduction methods in ABC,
along with a comparison of three specific models. Given that, as put above, the
choice of the vector of summary statistics is presumably the most important single
step in an ABC algorithm and keeping in mind that selecting too large a vector is
bound to fall victim of the curse of dimensionality, this constitutes a reference for the
ABC literature. Therein, the authors compare regression adjustments à la [6], subset
selection methods, as in [34], and projection techniques, as in [24]. They add to this
impressive battery of methods the potential use of AIC and BIC.

The paper also suggests a further regularisation of [6] by ridge regression, although
an L_1 penalty à la Lasso would be more appropriate in my opinion for removing extraneous summary statistics. Unsurprisingly, ridge regression does better than plain
regression in the comparison experiment when there are many almost collinear summary statistics, but an alternative conclusion could be that regression analysis is not
that appropriate with many summary statistics. Indeed, summary statistics are not
quantities of interest but data summarising tools towards a better approximation of
the posterior at a given computational cost.
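
As a small illustration of the regression-adjustment family reviewed there, the following Python sketch performs plain rejection ABC and then corrects the accepted draws by a ridge-regularised linear fit of θ on the summary discrepancies s − s_obs, in the spirit of [6] and of the ridge variant discussed in [12]; the toy model, the (deliberately near-collinear) summaries and the penalty are illustrative.

import numpy as np

rng = np.random.default_rng(6)
n = 50
x_obs = rng.normal(1.5, 1.0, n)               # toy data, theta is the mean

def summaries(x):
    # several nearly collinear summaries of each simulated dataset
    return np.column_stack([x.mean(axis=1), np.median(x, axis=1),
                            np.percentile(x, 25, axis=1), np.percentile(x, 75, axis=1)])

M = 50_000
theta = rng.normal(0.0, np.sqrt(10.0), M)     # illustrative prior
s_sim = summaries(theta[:, None] + rng.standard_normal((M, n)))
s_obs = summaries(x_obs[None, :])[0]

dist = np.linalg.norm(s_sim - s_obs, axis=1)
keep = dist <= np.quantile(dist, 0.01)        # rejection step

# ridge-regularised local-linear adjustment on the accepted draws
D = s_sim[keep] - s_obs                       # summary discrepancies
X = np.column_stack([np.ones(D.shape[0]), D])
lam = 1e-3                                    # illustrative ridge penalty
coef = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ theta[keep])
theta_adj = theta[keep] - D @ coef[1:]        # remove the fitted linear trend

print(f"rejection mean = {theta[keep].mean():.3f}, adjusted mean = {theta_adj.mean():.3f}")
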

4.3 ABC with Score Functions


In connection with [43] and their application in population genetics, [57] advocate
the use of composite score functions for ABC. While the paper provides a survey of
composite likelihood methods, the core idea of the paper is to use the score function
(of the composite likelihood) as the summary statistic,

∇_θ cℓ(θ; y),

when evaluated at the maximum composite likelihood estimate and at the observed data point.
In the specific (but unrealistic) case of an exponential family, an ABC based on the
score is asymptotically (i.e., as the tolerance ε goes to zero) exact. Working with a
composite likelihood thus leads to a natural summary statistic. As with the empirical
likelihood approach, the composite likelihoods that are available for computation are
usually restricted in number, thus leading to an almost automated choice of a summary
statistic.
An interesting (common) feature in most examples found in this paper is that
comparisons are made between ABC using the (truly) sufficient statistic and ABC
based on the pairwise score function, which essentially relies on the very same
statistics. So the difference, when there is a difference, pertains to the choice of
a different combination of the summary statistics or, somehow equivalently to the
choice of a different distance function. One of the examples starts from the MA(2)
toy-example of [39]. The composite likelihood is then based on the consecutive triplet
marginal densities.
In a related vein, [40] offer a new perspective on ABC based on pseudo-scores.
For one thing, it concentrates on the selection of summary statistics from a more
econometric point of view than usual, defining asymptotic sufficiency in this context and demonstrating that both asymptotic sufficiency and Bayes consistency can
be achieved when using maximum likelihood estimators of the parameters of an
auxiliary model as summary statistics. In addition, the proximity to (asymptotic)
sufficiency yielded by the MLE is replicated by the score vector. Using the score
instead of the MLE as a summary statistic allows for huge gains in terms of speed.
The method is then applied to a continuous time state space model, using as auxiliary
model an augmented unscented Kalman filter. The various state space models tested
therein demonstrate that the ABC approach based on the marginal [likelihood] score
performs quite well, including against the approach of [24]. It strongly supports the idea
of using such a generic object as the unscented Kalman filter for state space models,
even when it is not a particularly accurate representation of the true model. Another
appealing feature is found in the connections made with indirect inference.

5 ABC Model Choice


While ABC is a substitute for a proper (possibly MCMC-based) Bayesian inference, and thus pertains to all aspects of Bayesian inference, including testing and
model checking, the special issue of comparing models via ABC is highly delicate
and has attracted most of the criticisms addressed against ABC [63, 64]. The implementation of ABC model choice follows by treating the model index m as an extra
parameter with an associated prior, as detailed in the following algorithm:
Algorithm 7 ABC (model choice)
Given an observation x^0
for i = 1 to N do
  repeat
    Generate m from the prior π(M = m)
    Generate θ_m from the prior π_m(θ_m)
    Generate x' from the model f_m(·|θ_m)
  until ρ{S(x'), S(x^0)} ≤ ε
  Set m^(i) = m and θ^(i) = θ_m
end for
return the values m^(i) associated with the k smallest distances
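
A minimal Python sketch of Algorithm 7 follows, for two illustrative models (Poisson versus geometric observations, both summarised by the sample sum) and uniform prior model weights; the posterior model probabilities are estimated by the frequencies of the model indices among the k simulations closest to the observed summary. Priors and constants are illustrative.

import numpy as np

rng = np.random.default_rng(7)

n = 30
x_obs = rng.poisson(3.0, n)                   # pretend data (generated under model 1)
s_obs = x_obs.sum()                           # common summary statistic

N, k = 200_000, 1000
m = rng.integers(1, 3, N)                     # model index, uniform on {1, 2}
s_sim = np.empty(N)

is1 = m == 1
lam = rng.exponential(5.0, is1.sum())         # illustrative prior on the Poisson mean
s_sim[is1] = rng.poisson(lam * n)             # sum of n Poisson(lam) draws ~ Poisson(n*lam)
p = rng.uniform(0.01, 1.0, (~is1).sum())      # illustrative prior, bounded away from 0
s_sim[~is1] = rng.negative_binomial(n, p)     # sum of n geometric draws ~ NegBin(n, p)

dist = np.abs(s_sim - s_obs)
nearest = np.argsort(dist)[:k]                # k smallest distances
freq = np.bincount(m[nearest], minlength=3)[1:] / k
print(f"ABC posterior probabilities: P(m=1|y) ~ {freq[0]:.3f}, P(m=2|y) ~ {freq[1]:.3f}")
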

Improvements upon returning raw model index frequencies as ABC estimates


have been proposed in [23], via a regression regularisation. In this approach, indices
are processed as categorical variables in a formal multinomial regression, using for
instance logistic regression. Rejection-based approaches as in Algorithm 7 were
introduced in [16, 28, 65], in a Monte Carlo perspective simulating model indices as
well as model parameters. Those versions are widely used by the population genetics
community, as exemplified by [7, 15, 22, 25, 29, 37, 46, 49, 67, 68]. As described
in the following sections, this adoption may be premature or over-optimistic, since
caution and cross-checking are necessary to completely validate the output.

5.1 ABC Model Criticism


The approach in [51] is very original in its view of ABC model criticism and thus
indirectly of ABC model choice. It is about the use of the ABC approximation error ε
in an altogether different way, namely as a tool for assessing the goodness of fit of a
given model. The fundamental idea is to process ε as an additional parameter of the
model, simulating from a joint posterior distribution

f(ε, θ | x^0) ∝ ξ(ε | x^0, θ) π_θ(θ) π_ε(ε),

where x^0 is the data and ξ(ε | x^0, θ) plays the role of the likelihood. (The π's are
obviously the priors on θ and ε.) In fact, ξ(ε | x^0, θ) is the prior predictive density of
ρ(S(x), S(x^0)) given θ and x^0 when x is distributed from f(x | θ). The authors then
derive an ABC algorithm they call ABCμ to simulate an MCMC chain targeting this
joint distribution, replacing ξ(ε | x^0, θ) with a non-parametric kernel approximation.
For each model under comparison, the marginal posterior distribution on the error ε
is then used to assess the fit of the model, the logic of it being that this posterior
should include 0 in a reasonable credible interval. (Contrary to other ABC papers,
ε can be negative and multidimensional in this paper.)
Given the wealth of innovations contained in the paper, let me add here that,
while the authors stress they use the data once (a point always uncertain to me), they
also define the above target by using simultaneously a prior distribution on ε and a
conditional distribution on the same ε, which they interpret as the likelihood in (θ, ε).
The product being most often defined as a density in (θ, ε), it can be simulated from,
but this is hardly a regular Bayesian problem, especially because it seems the prior
on ε significantly contributes to the final assessment. Further and more developed
criticisms are published as [55], along with a reply by the authors [50]. Let me stress
one more time how original this paper is and deplore a lack of follow-up in the
subsequent literature for a practical method that should be implemented on existing
ABC software.

5.2 A Clear Lack of Confidence


The analysis in [54] leads to the conclusion that ABC approximations to posterior probabilities cannot be blindly and uniformly trusted. Posterior probabilities are
approximated as in Algorithm 7, i.e. by using the frequencies of acceptances of simulations from the respective models (assuming the use of a common summary statistic to
define the distance to the observations). Rather obviously, the limiting behaviour of
the procedure is ruled by a true Bayes factor, except that it is the Bayes factor based
on the distributions of the summary statistics under both models.
While this does not sound like a particularly fundamental remark, given that all
ABC approximations rely on posterior distributions based on these statistics, rather
than on the whole dataset, and while this approximation only has consequences in
terms of inferential precision for most inferential purposes, it induces a dramatic arbitrariness in the Bayes factor. To illustrate this arbitrariness, consider the case of using
a sufficient statistic S(x) for both models. Then, by the factorisation theorem [36],
the true likelihoods factorise as

ℓ_1(θ_1 | x) = g_1(x) p_1(θ_1 | S(x))   and   ℓ_2(θ_2 | x) = g_2(x) p_2(θ_2 | S(x)),

resulting in a true Bayes factor equal to

B_12(x) = (g_1(x)/g_2(x)) B_12^S(x),

where the last term is the limiting ABC Bayes factor. Therefore, in the favourable
case of the existence of a sufficient statistic, using only the sufficient statistic induces
a difference in the result that fails to converge with the number of observations or
simulations. Quite the opposite, it may diverge one way or another as the number
of observations increases. Again, this is in the favourable case of sufficiency. In the
realistic setting of insufficient statistics, things deteriorate even further. This practical
situation implies a wider loss of information compared with the exact inferential
approach, hence a wider discrepancy between the exact Bayes factor and the quantity
produced by an ABC approximation. The paper is thus intended as a warning to the
community about the dangers of this approximation, especially when considering the
rapidly increasing number of applications using ABC for conducting model choice
and hypothesis testing.
This paper stresses a fundamental and even foundational distinction between ABC
point (and confidence) estimation, and ABC model choice, namely that the problem
stands at another level for Bayesian model choice (using posterior probabilities).
When doing point estimation with insufficient statistics, the information content is
poorer, but unless one uses very degraded (i.e., ancillary) summary statistics, inference is converging. The posterior distribution stays different from the true posterior
in this case but, at least, increasing the number of observations brings more information
about the parameter (and convergence when this number goes to infinity). For model
choice, this is not guaranteed if we use summary statistics that are not inter-model
sufficient, as shown by the Poisson and normal examples in [54]. Furthermore, except
for very specific cases such as Gibbs random fields [28], it is almost always impossible to derive inter-model sufficient statistics, beyond the raw sample. The paper
includes a realistic and computationally costly population genetic illustration, where
it exhibits a clear divergence in the numerical values of both approximations of the
posterior probabilities. The error rates in using the ABC approximation to choose
between two scenarios, labelled 1 and 2, are 14.5 and 12.5 % (under scenarios 1 and 2),
respectively.
A quite related if less pessimistic paper is [18], also concerned with the limiting
behaviour for the ratio

B_12(x) = (g_1(x)/g_2(x)) B_12^S(x).
Indeed, the authors reach the opposite conclusion from ours, namely that the problem
can be solved by a sufficiency argument. Their point is that, when comparing models
within exponential families (which is the natural realm for sufficient statistics), it
is always possible to build an encompassing model with a sufficient statistic that

remains sufficient across models. This construction is correct from a mathematical


perspective, as seen for instance in the Poisson versus geometric example we first
mentioned in [28]: adding

∏_{i=1}^{n} x_i!

to the sum of the observables into a large sufficient statistic produces a ratio g_1/g_2 that
is equal to 1, hence avoids any discrepancy. However, this encompassing property
only applies for exponential families. Looking at what happens in the limiting case
when one is relying on a common sufficient statistic is a formal study that sheds no
light on the (potentially huge) discrepancy between the ABC-based Bayes factor and
the true Bayes factor in the typical case.
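
For concreteness, the two factors can be written out explicitly in the Poisson versus geometric comparison, with S(x) = ∑_i x_i as the common (sufficient) summary; this short derivation is added here for illustration and follows directly from the factorisation theorem:

g_1(x) = \frac{f_1(x\mid\lambda)}{f_1^S(S\mid\lambda)}
       = \frac{e^{-n\lambda}\lambda^{S}\big/\prod_{i=1}^{n} x_i!}{e^{-n\lambda}(n\lambda)^{S}/S!}
       = \frac{S!}{n^{S}\prod_{i=1}^{n} x_i!},
\qquad
g_2(x) = \frac{p^{n}(1-p)^{S}}{\binom{S+n-1}{S}\,p^{n}(1-p)^{S}}
       = \binom{S+n-1}{S}^{-1},

so that

\frac{g_1(x)}{g_2(x)} = \frac{(S+n-1)!}{(n-1)!\, n^{S}\prod_{i=1}^{n} x_i!}

depends on x through ∏_i x_i! and not only through S(x), which is exactly the term that the encompassing statistic discussed above absorbs.
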

5.3 Validating Summaries for ABC Model Choice


The subsequent [38] deals with the contrasted performances and the resulting evaluation of summary statistics for Bayesian model choice (and not solely in ABC
settings). The central result in this paper is that the summary statistic should enjoy a
different range of means (as a vector) under different models for the corresponding
Bayes factor to be consistent (as the number of observations goes to infinity). Otherwise, the model with the least parameters will be asymptotically selected. Even
though the idea of separating the mean behaviour of the summary statistic under
both models is intuitive, establishing a complete theoretical framework that validated this intuition requires assumptions borrowed from the asymptotic Bayesian
literature [66]. The main theorem in [38] states that, under such assumptions, when
the true mean E[S(Y )] of the summary statistic can be recovered for both models
under comparison, then the Bayes factor is of order

O(n^{−(d_1 − d_2)/2}),

where d_i is the intrinsic dimension of the parameters driving the summary statistic
in model i = 1, 2, irrespective of which model is true. (Precisely, the dimensions
d_i are the dimensions of the asymptotic mean of the summary statistic under both
models.) Therefore, the Bayes factor always asymptotically selects the model having
the smallest effective dimension and cannot be consistent. If, instead, the true
mean E[S(Y )] cannot be represented in the wrong model, then the Bayes factor is
consistent. This implies that the best statistics to be used in ABC model choice
are ancillary statistics with different mean values under both models. Otherwise, the
summary statistic must have enough components to prohibit a parameter value under
the wrong model meeting the true mean of the summary statistic.
The paper remains quite theoretical, with the mathematical assumptions required
to obtain the convergence theorems being rather overwhelming and difficult to check

in practical cases. Nonetheless, this paper comes as a third if not last step in a series
of papers on the issue of ABC model choice. Indeed, we first identified a sufficiency
property [28], then realised that this property was a quite rare occurrence, and we
finally made the theoretical advance in [38]. This last step characterises when a
statistic is good enough to conduct model choice, with a clear answer that the ranges
of the mean of the summary statistic under each model should not intersect. From a
methodological point of view, only the conclusion should be taken into account, as
it is then straightforward to come up with quick simulation devices to check whether
a summary statistic behaves differently under both models, taking advantage of the
reference table already available (instead of having to run Monte Carlo experiments
with ABC steps). The paper [38] includes a χ² check about the relevance of a given
summary statistic.
In [59], the authors consider summary statistics for ABC model choice in hidden
Gibbs random fields. The move to a hidden Markov random field means that the
original approach of [28] does not apply: there is no dimension-reducing sufficient
statistic in that case. The authors introduce a small collection of (four!) focused
statistics to discriminate between Potts models. They further define a novel misclassification rate, conditional on the observed value and derived from the ABC reference
table. It is the predictive error rate

P_ABC(m̂(Y) ≠ m | S(y^obs)),

integrating out both the model index m and the corresponding random variable Y (and
the hidden intermediary parameter) given the observations, more precisely given the
transform of the observations by the summary statistic S. In a simulation experiment,
this paper shows that the predictive error rate significantly decreases by including
2 or 4 geometric summary statistics on top of the no-longer-sufficient concordance
statistics.
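
As a rough sketch of how such a conditional misclassification rate can be estimated from an existing reference table (this is only the generic k-nearest-neighbour idea, not the exact procedure of [59]), the Python snippet below restricts the table to simulations whose summaries lie near S(y_obs) and classifies each of them, leave-one-out, by a majority vote among its nearest neighbours; the table itself is a synthetic placeholder.

import numpy as np

rng = np.random.default_rng(8)

# synthetic placeholder for an ABC reference table: model index + 2-D summaries
N = 20_000
m = rng.integers(1, 3, N)
s = np.where(m[:, None] == 1,
             rng.normal(0.0, 1.0, (N, 2)),
             rng.normal(0.7, 1.0, (N, 2)))    # two overlapping models

s_obs = np.array([0.4, 0.3])                  # observed summary S(y_obs)

def knn_classify(query, table_s, table_m, k=50):
    idx = np.argsort(np.linalg.norm(table_s - query, axis=1))[:k]
    return np.bincount(table_m[idx], minlength=3)[1:].argmax() + 1

# restrict to the reference points nearest to S(y_obs), then classify leave-one-out
near = np.argsort(np.linalg.norm(s - s_obs, axis=1))[:500]
errors = []
for i in near:
    mask = np.arange(N) != i
    errors.append(knn_classify(s[i], s[mask], m[mask]) != m[i])

print(f"estimated conditional misclassification rate: {np.mean(errors):.3f}")
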

6 Conclusion
This survey reflects upon the diversity and the many directions of progress in this
field of ABC research. The overall message is that the on-going research has led both
to consider ABC as part of the statistical tool-kit and to envision different approaches
to statistical modelling, where a complete representation of the whole world is no
longer feasible. Over the evolution of ABC in the past fifteen years we have thus
moved from approximate methods to approximate models, which is a positive move
in my opinion.
Acknowledgments The author is most grateful to an anonymous referee for her or his help with
the syntax and grammar of this survey. He also thanks the organisers of MCqMC 2014 in Leuven
for their kind invitation.

References
1. Allingham, D., King, R., Mengersen, K.: Bayesian estimation of quantile distributions. Stat.
Comput. 19, 189201 (2009)
2. Andrieu, C., Roberts, G.: The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Stat. 37(2), 697725 (2009)
3. Beaumont, M.: Approximate Bayesian computation in evolution and ecology. Annu. Rev. Ecol.
Evol. Syst. 41, 379406 (2010)
4. Beaumont, M., Cornuet, J.-M., Marin, J.-M., Robert, C.: Adaptive approximate Bayesian computation. Biometrika 96(4), 983990 (2009)
5. Beaumont, M., Nielsen, R., Robert, C., Hey, J., Gaggiotti, O., Knowles, L., Estoup, A., Mahesh,
P., Coranders, J., Hickerson, M., Sisson, S., Fagundes, N., Chikhi, L., Beerli, P., Vitalis, R.,
Cornuet, J.-M., Huelsenbeck, J., Foll, M., Yang, Z., Rousset, F., Balding, D., Excoffier, L.: In
defense of model-based inference in phylogeography. Mol. Ecol. 19(3), 436446 (2010)
6. Beaumont, M., Zhang, W., Balding, D.: Approximate Bayesian computation in population
genetics. Genetics 162, 20252035 (2002)
7. Belle, E., Benazzo, A., Ghirotto, S., Colonna, V., Barbujani, G.: Comparing models on the
genealogical relationships among Neandertal, Cro-Magnoid and modern Europeans by serial
coalescent simulations. Heredity 102(3), 218225 (2008)
8. Berger, J., Fienberg, S., Raftery, A., Robert, C.: Incoherent phylogeographic inference. Proc.
Natl. Acad. Sci. 107(41), E57 (2010)
9. Biau, G., Cérou, F., Guyader, A.: New insights into approximate Bayesian computation.
Annales de l'IHP (Probab. Stat.) 51, 376403 (2015)
10. Blum, M.: Approximate Bayesian computation: a non-parametric perspective. J. Am. Stat.
Assoc. 105(491), 11781187 (2010)
11. Blum, M., François, O.: Non-linear regression models for approximate Bayesian computation.
Stat. Comput. 20, 6373 (2010)
12. Blum, M.G.B., Nunes, M.A., Prangle, D., Sisson, S.A.: A comparative review of dimension
reduction methods in approximate Bayesian computation. Stat. Sci. 28(2), 189208 (2013)
13. Bollerslev, T., Chou, R., Kroner, K.: ARCH modeling in finance. A review of the theory and
empirical evidence. J. Econom. 52, 559 (1992)
14. Calvet, C., Czellar, V.: Accurate methods for approximate Bayesian computation filtering. J.
Econom. (2014, to appear)
15. Cornuet, J.-M., Ravigné, V., Estoup, A.: Inference on population history and model checking using DNA sequence and microsatellite data with the software DIYABC (v1.0). BMC
Bioinform. 11, 401 (2010)
16. Cornuet, J.-M., Santos, F., Beaumont, M., Robert, C., Marin, J.-M., Balding, D., Guillemaud, T.,
Estoup, A.: Inferring population history with DIYABC: a user-friendly approach to approximate
Bayesian computation. Bioinformatics 24(23), 27132719 (2008)
17. Dean, T., Singh, S., Jasra, A., Peters, G.: Parameter inference for hidden Markov models with
intractable likelihoods. Scand. J. Stat. (2014, to appear)
18. Didelot, X., Everitt, R., Johansen, A., Lawson, D.: Likelihood-free estimation of model evidence. Bayesian Anal. 6, 4876 (2011)
19. Diggle, P., Gratton, R.: Monte Carlo methods of inference for implicit statistical models. J. R.
Stat. Soc. Ser. B 46, 193227 (1984)
20. Drovandi, C., Pettitt, A., Faddy, M.: Approximate Bayesian computation using indirect inference. J. R. Stat. Soc. Ser. A 60(3), 503524 (2011)
21. Ehrlich, E., Jasra, A., Kantas, N.: Gradient free parameter estimation for hidden markov models
with intractable likelihoods. Method. Comp. Appl. Probab. (2014, to appear)
22. Excoffier, C., Leuenberger, D., Wegmann, L.: Bayesian computation and model selection in
population genetics (2009)
23. Fagundes, N., Ray, N., Beaumont, M., Neuenschwander, S., Salzano, F., Bonatto, S., Excoffier,
L.: Statistical evaluation of alternative models of human evolution. Proc. Natl. Acad. Sci.
104(45), 1761417619 (2007)

24. Fearnhead, P., Prangle, D.: Constructing summary statistics for Approximate Bayesian computation: semi-automatic approximate Bayesian computation. J. R. Stat. Soc.: Ser. B (Stat.
Method.), 74(3), 419474 (2012). (With discussion)
25. Ghirotto, S., Mona, S., Benazzo, A., Paparazzo, F., Caramelli, D., Barbujani, G.: Inferring
genealogical processes from patterns of bronze-age and modern DNA variation in Sardinia.
Mol. Biol. Evol. 27(4), 875886 (2010)
26. Gouriéroux, C., Monfort, A.: Simulation Based Econometric Methods. CORE Lecture Series.
CORE, Louvain (1995)
27. Gouriéroux, C., Monfort, A., Renault, E.: Indirect inference. J. Appl. Econom. 8, 85118 (1993)
28. Grelaud, A., Marin, J.-M., Robert, C., Rodolphe, F., Tally, F.: Likelihood-free methods for
model choice in Gibbs random fields. Bayesian Anal. 3(2), 427442 (2009)
29. Guillemaud, T., Beaumont, M., Ciosi, M., Cornuet, J.-M., Estoup, A.: Inferring introduction
routes of invasive species using approximate Bayesian computation on microsatellite data.
Heredity 104(1), 8899 (2009)
30. Jasra, A.: Approximate Bayesian Computation for a Class of Time Series Models. e-prints
(2014)
31. Jasra, A., Kantas, N., Ehrlich, E.: Approximate inference for observation driven time series
models with intractable likelihoods. TOMACS (2014, to appear)
32. Jasra, A., Lee, A., Yau, C., Zhang, X.: The Alive Particle Filter. e-prints (2013)
33. Jasra, A., Singh, S., Martin, J., McCoy, E.: Filtering via approximate Bayesian computation.
Stat. Comp. 22, 12231237 (2012)
34. Joyce, P., Marjoram, P.: Approximately sufficient statistics and Bayesian computation. Stat.
Appl. Genet. Mol. Biol. 7(1), Article 26 (2008)
35. Le Gland, F., Oudjane, N.: A Sequential Particle Algorithm that Keeps the Particle System
Alive. Lecture Notes in Control and Information Sciences, vol. 337, pp. 351389. Springer,
Berlin (2006)
36. Lehmann, E., Casella, G.: Theory of Point Estimation, revised edn. Springer, New York (1998)
37. Leuenberger, C., Wegmann, D.: Bayesian computation and model selection without likelihoods.
Genetics 184(1), 243252 (2010)
38. Marin, J., Pillai, N., Robert, C., Rousseau, J.: Relevant statistics for Bayesian model choice. J.
R. Stat. Soc. Ser. B 76(5), 833859 (2014)
39. Marin, J., Pudlo, P., Robert, C., Ryder, R.: Approximate Bayesian computational methods.
Stat. Comput. 21(2), 279291 (2011)
40. Martin, G.M., McCabe, B.P.M., Maneesoonthorn, W., Robert, C.P.: Approximate Bayesian
Computation in State Space Models. e-prints (2014)
41. Martin, J., Jasra, A., Singh, S., Whiteley, N., Del Moral, P., McCoy, E.: Approximate Bayesian
computation for smoothing. Stoch. Anal. Appl. 32(3), (2014)
42. McKinley, T., Ross, J., Deardon, R., Cook, A.: Simulation-based Bayesian inference for epidemic models. Comput. Stat. Data Anal. 71, 434447 (2014)
43. Mengersen, K., Pudlo, P., Robert, C.: Bayesian computation via empirical likelihood. Proc.
Natl. Acad. Sci. 110(4), 13211326 (2013)
44. Owen, A.B.: Empirical likelihood ratio confidence intervals for a single functional. Biometrika
75, 237249 (1988)
45. Owen, A.B.: Empirical Likelihood. Chapman & Hall, Boca Raton (2001)
46. Patin, E., Laval, G., Barreiro, L., Salas, A., Semino, O., Santachiara-Benerecetti, S., Kidd, K.,
Kidd, J., Van Der Veen, L., Hombert, J., et al.: Inferring the demographic history of African
farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet.
5(4), e1000448 (2009)
47. Prangle, D., Blum, M.G.B., Popovic, G., Sisson, S.A.: Diagnostic tools of approximate
Bayesian computation using the coverage property. e-prints (2013)
48. Pritchard, J., Seielstad, M., Perez-Lezaun, A., Feldman, M.: Population growth of human Y
chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16, 17911798
(1999)

49. Ramakrishnan, U., Hadly, E.: Using phylochronology to reveal cryptic population histories:
review and synthesis of 29 ancient DNA studies. Mol. Ecol. 18(7), 13101330 (2009)
50. Ratmann, O., Andrieu, C., Wiuf, C., Richardson, S.: Reply to Robert et al.: Model criticism
informs model choice and model comparison. Proc. Natl. Acad. Sci. 107(3), E6E7 (2010)
51. Ratmann, O., Andrieu, C., Wiuf, C., Richardson, S.: Model criticism based on likelihood-free
inference, with an application to protein network evolution. Proc. Natl. Acad. Sci. USA 106,
16 (2009)
52. Robert, C.: Discussion of constructing summary statistics for Approximate Bayesian Computation by Fearnhead, P., Prangle, D., J. R. Stat. Soc. Ser. B, 74(3), 447448 (2012)
53. Robert, C., Casella, G.: Monte Carlo Statistical Methods, 2nd edn. Springer, New York (2004)
54. Robert, C., Cornuet, J.-M., Marin, J.-M., Pillai, N.: Lack of confidence in ABC model choice.
Proc. Natl. Acad. Sci. 108(37), 1511215117 (2011)
55. Robert, C., Mengersen, K., Chen, C.: Model choice versus model criticism. Proc. Natl. Acad.
Sci. 107(3), E5 (2010)
56. Rubin, D.: Bayesianly justifiable and relevant frequency calculations for the applied statistician.
Ann. Stat. 12, 11511172 (1984)
57. Ruli, E., Sartori, N., Ventura, L.: Approximate Bayesian Computation with composite score
functions. e-prints (2013)
58. Stephens, M., Donnelly, P.: Inference in molecular population genetics. J. R. Stat. Soc.: Ser. B
(Stat. Method.) 62(4), 605635 (2000)
59. Stoehr, J., Pudlo, P., Cucala, L.: Adaptive ABC model choice and geometric summary statistics
for hidden Gibbs random fields. Stat. Comput. pp. 113 (2014)
60. Sunnåker, M., Busetto, A., Numminen, E., Corander, J., Foll, M., Dessimoz, C.: Approximate
Bayesian computation. PLoS Comput. Biol. 9(1), e1002803 (2013)
61. Tanner, M., Wong, W.: The calculation of posterior distributions by data augmentation. J. Am.
Stat. Assoc. 82, 528550 (1987)
62. Tavaré, S., Balding, D., Griffiths, R., Donnelly, P.: Inferring coalescence times from DNA
sequence data. Genetics 145, 505518 (1997)
63. Templeton, A.: Statistical hypothesis testing in intraspecific phylogeography: nested clade
phylogeographical analysis vs. approximate Bayesian computation. Mol. Ecol. 18(2), 319
331 (2008)
64. Templeton, A.: Coherent and incoherent inference in phylogeography and human evolution.
Proc. Natl. Acad. Sci. 107(14), 63766381 (2010)
65. Toni, T., Welch, D., Strelkowa, N., Ipsen, A., Stumpf, M.: Approximate Bayesian computation
scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface
6(31), 187202 (2009)
66. van der Vaart, A.: Asymptotic Statistics. Cambridge University Press, Cambridge (1998)
67. Verdu, P., Austerlitz, F., Estoup, A., Vitalis, R., Georges, M., Théry, S., Froment, A., Le Bomin,
S., Gessain, A., Hombert, J.-M., Van der Veen, L., Quintana-Murci, L., Bahuchet, S., Heyer,
E.: Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr.
Biol. 19(4), 312318 (2009)
68. Wegmann, D., Excoffier, L.: Bayesian inference of the demographic history of chimpanzees.
Mol. Biol. Evol. 27(6), 14251435 (2010)
69. Wikipedia: Approximate Bayesian computation. Wikipedia, The Free Encyclopedia (2014)
70. Wilkinson, R.D.: Approximate Bayesian computation (ABC) gives exact results under the
assumption of model error. Technical Report (2008)
71. Wilkinson, R.: Approximate Bayesian computation (ABC) gives exact results under the
assumption of model error. Stat. Appl. Genet. Mol. Biol. 12(2), 129141 (2013)
72. Wilkinson, R.D.: Accelerating ABC methods using Gaussian processes. e-prints (2014)

Part II

Contributed Papers

Multilevel Monte Carlo Simulation


of Statistical Solutions to the Navier–Stokes
Equations
Andrea Barth, Christoph Schwab and Jonas Šukys

Abstract We propose Monte Carlo (MC), single level Monte Carlo (SLMC) and
multilevel Monte Carlo (MLMC) methods for the numerical approximation of statistical solutions to the viscous, incompressible Navier–Stokes equations (NSE) on
a bounded, connected domain D ⊂ R^d, d = 1, 2, with no-slip or periodic boundary
conditions on the boundary ∂D. The MC convergence rate of order 1/2 is shown
to hold independently of the Reynolds number with constant depending only on
the mean kinetic energy of the initial velocity ensemble. We discuss the effect of
space-time discretizations on the MC convergence. We propose a numerical MLMC
estimator, based on finite samples of numerical solutions with finite mean kinetic
energy in a suitable function space and give sufficient conditions for mean-square
convergence to a (generalized) moment of the statistical solution. We provide in
particular error bounds for MLMC approximations of statistical solutions to the viscous Burgers equation in space dimension d = 1 and to the viscous, incompressible
Navier-Stokes equations in space dimension d = 2 which are uniform with respect
to the viscosity parameter. For a more detailed presentation and proofs we refer the
reader to Barth et al. (Multilevel Monte Carlo approximations of statistical solutions
of the Navier–Stokes equations, 2013, [6]).
Keywords Multilevel Monte Carlo method · Navier–Stokes equations · Statistical
solutions · Finite volume

A. Barth (B)
SimTech, University of Stuttgart, Pfaffenwaldring 5a, 70569 Stuttgart, Germany
e-mail: andrea.barth@mathematik.uni-stuttgart.de
C. Schwab
Seminar für Angewandte Mathematik, ETH Zürich, Rämistrasse 101,
8092 Zurich, Switzerland
e-mail: schwab@math.ethz.ch
J. Šukys
Computational Science Laboratory, ETH Zürich, Clausiusstrasse 33,
8092 Zurich, Switzerland
e-mail: jonas.sukys@mavt.ethz.ch
© Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_8

1 Navier–Stokes Equations and Statistical Solutions


In the connected bounded domain D ⊂ R^d, for d = 1, 2, with boundary ∂D and in the
finite time interval J := [0, T], for T < ∞, we consider a viscous, incompressible
flow, subject to a prescribed divergence-free initial velocity field u_0 : D → R^d and to
a body force f acting on the fluid particles in D. The NSE for viscous, incompressible
flow of a Newtonian fluid are given in terms of the velocity field u : J × D → R^d, and
the pressure p : J × D → R. The pressure takes the role of a Lagrange multiplier,
enforcing the divergence-free constraint. The NSE in J × D, for d = 2, read (see,
e.g., [16]),

∂u/∂t − νΔu + (u · ∇)u + ∇p = f ,   ∇ · u = 0,     (1)

with the kinematic viscosity ν ≥ 0 and with a given initial velocity field u(0) = u_0.
In space dimension d = 1, i.e. for D = (0, 1), the NSE reduce to the (viscous for
ν > 0) Burgers equation. We focus here on Eq. (1) with periodic or no-slip boundary
conditions. We provide numerical examples for periodic boundary conditions, but
emphasize that the theory of statistical solutions extends also to other boundary conditions (see [7]). Apart from not exhibiting viscous boundary layers, homogeneous
statistical solutions to the NSE with periodic boundary conditions appear in certain
physical models [7, Chaps. IV and V].
Statistical solutions aim at describing the evolution of ensembles of solutions
through their probability distribution. In space dimension d = 2, for no-slip boundary
conditions we define the function space

H_nsp = {v ∈ L²(D)^d : ∇ · v = 0 in H^{−1}(D), v · n|_{∂D} = 0 in H^{−1/2}(∂D)},

where n is the unit outward-pointing normal vector on ∂D. For D = (0, 1)² and
periodic boundary conditions, we denote the corresponding space of functions with
vanishing average over D by H_per. We remark that H_per coincides with the space
Ḣ(L) in [7, Chap. V.1.2] of L-periodic functions with vanishing average, with period
L = 1. Whenever we discuss generic statements valid for either boundary conditions,
we write H ∈ {H_nsp, H_per}.
We assume given a probability measure μ on H, where H is equipped with the Borel σ-algebra B(H). Statistical solutions to the NSE as defined in [7, 8] are parametric
families of probability measures on H. Rather than being restricted to one single
initial condition, a (Foias–Prodi) statistical solution to the NSE is a one-parameter
family of probability measures which describes the evolution of statistics of initial
velocity ensembles. Individual solutions of Eq. (1) are special cases of statistical
solutions, for an initial measure μ_0 charging one initial velocity u_0 ∈ H. In general, the
initial distribution μ_0 is defined via an underlying probability space (Ω, F, P). The
distribution of initial velocities is assumed to be given as the image measure under an
H-valued random variable with distribution μ_0. This random variable X is defined as
a mapping from the measurable space (Ω, F) into the measurable space (H, B(H))
such that μ_0 = X_*P. Consider the NSE (1) in space dimension d = 2 with viscosity

ν > 0 without forcing, i.e. with f ≡ 0. In this case, the solution to the NSE is unique
and the initial-data-to-solution map is a semigroup S^ν = (S^ν(t, 0), t ∈ J) on H [7,
Chap. III.3.1]. Then, a (unique) time-dependent family of probability measures μ =
(μ_t, t ∈ J) on H is given by [7, Chap. IV.1.2]

μ_t(E) = μ_0(S^ν(t, 0)^{−1} E),   E ∈ B(H),     (2)

i.e., for every t ≥ 0, and every E ∈ B(H), P({u(t) ∈ E}) = P({u_0 ∈ S^ν(t, 0)^{−1} E}) =
μ_0((S^ν(t, 0))^{−1} E). We remark that for nonzero, time-dependent forcing f, S^ν in general
does not define a semigroup on H [7, Chap. V.1.1]. For any time t ∈ J, we may then
define the generalized moment

∫_H Φ(w) dμ_t(w)

for a suitable, μ_t-integrable function Φ on H. The time-evolution of generalized


moments of the Navier–Stokes flow is formally given by

d/dt ∫_H Φ(v) dμ_t(v) = ∫_H (F(t, v), Φ'(v))_H dμ_t(v),     (3)

for suitable test functionals Φ. Here, F is given by F(t, u) = f − νAu − B(u, u),
where A denotes the Stokes operator and B(u, u) the quadratic momentum advection
term (see [7, Eq. IV.1.10] for details). For the functional setting in space dimension
d = 2, in the no-slip case, we define V_nsp := {v ∈ H_0^1(D)^d : ∇ · v = 0 in L²(D)} ⊂
H_nsp and in an analogous fashion V_per ⊂ H_per. Again, we write V ∈ {V_nsp, V_per} for
generic statements valid in either case.
A suitable class of test functionals is given by the following:
Definition 1 [7, Definitions V.1.1, V.1.3] Let C be the space of cylindrical test
functionals Φ on H which are real-valued and depend only on a finite number of
components of v ∈ H, i.e. there exists k < ∞ such that

Φ(v) = φ((v, g_1)_H, . . . , (v, g_k)_H),     (4)

where φ is a compactly supported C^1 scalar function on R^k and g_1, . . . , g_k ∈ V.

Provided that the support of μ_0 in H is bounded, the condition of compact support
of φ in Eq. (4) can be relaxed; we refer to [7, Appendix V.A] for details.
For Φ ∈ C we denote by Φ' its differential in H, which is given by

Φ'(v) = ∑_{i=1}^{k} ∂_i φ((v, g_1)_H, . . . , (v, g_k)_H) g_i.

As a linear combination of elements in V, Φ'(v) belongs to V.

Energy equalities are central for statistical solutions to Eq. (1); we integrate Eq. (3),
which leads, in space dimension d = 2 and for all t ∈ J, to (cp. [7, Eq. V.1.9])

∫_H ‖v‖²_H dμ_t(v) + 2ν ∫_0^t ∫_V ‖v‖²_V dμ_s(v) ds
   = 2 ∫_0^t ∫_H (f(s), v)_H dμ_s(v) ds + ∫_H ‖v‖²_H dμ_0(v).     (5)

Equations (3) and (5) motivate the definition of statistical solutions to Eq. (1).
Definition 2 [7, Definitions V.1.2, V.1.4] In space dimension d = 1, 2, a one-parametric family μ = (μ_t, t ∈ J) of Borel probability measures on H is a statistical
solution to Eq. (1) on the time interval J if
1. the initial Borel probability measure μ_0 on H has finite mean kinetic energy, i.e.,

   ∫_H ‖v‖²_H dμ_0(v) < ∞,

2. f ∈ L²(J; H) and the Borel probability measures μ_t satisfy Eq. (3) for all Φ ∈ C
and Eq. (5) holds.
We note that in space dimension d = 3 the notion of statistical solution is more
delicate, cp. [8]. We recall an existence (and, in space dimension d = 2, uniqueness)
result (see [7, Theorems V.1.2, V.1.3, V.1.4], [8]): if μ_0 is supported in B_H(R) for
some 0 < R < ∞, and if the forcing term f ∈ H is time-independent, the statistical
solution is unique and given by Eq. (2).

2 Discretization Methods
Our goal is the numerical approximation of (generalized) moments of the statistical
solution (μ_t, t ∈ J) for a given initial distribution μ_0 on H. We achieve this by
approximating, for given Φ ∈ C (with C as in Definition 1) and for given μ_0 with
finite mean kinetic energy on H, the expression

e_t(Φ) = ∫_H Φ(w) dμ_t(w),   t ∈ J.

As a first approach, we assume that we can sample from the exact initial distribution
μ_0. Since μ_0 is a distribution on the infinite-dimensional space H, this is, in general,
a simplifying assumption. However, if the probability measure μ_0 is supported on
a finite-dimensional subspace of H, the assumption is no constraint. We discuss an
appropriate approximation of the initial distribution in Sect. 3. We generate M ∈ N
independent copies (w^i, i = 1, . . . , M) of u_0, where u_0 is μ_0-distributed. Assume for

now that for each draw w^i ∈ H, distributed according to μ_0, we can solve u^i(t) =
S^ν(t, 0)w^i exactly, and that we can evaluate the (real-valued) functional Φ(u^i(t))
exactly. Then

e_t(Φ) ≈ E_M^t(Φ(u(t))) := (1/M) ∑_{i=1}^{M} Φ(u^i(t)) = (1/M) ∑_{i=1}^{M} Φ(S^ν(t, 0)w^i),     (6)

where we denoted by (E_M^t, M ∈ N) the sequence of MC estimators which approximate the (generalized) expectation e_t(Φ) for Φ ∈ C. To state the error bound on
the variance of the MC estimator, given in Eq. (6), we assume for simplicity that the
right hand side of Eq. (1) is equal to zero, i.e., f ≡ 0 (all results that follow have an
analog for nonzero forcing f ∈ L²(D)).
Proposition 1 Let Φ ∈ C be a test function. Then, an error bound on the mean-square error of the Monte Carlo estimator E_M^t, for M ∈ N, is given by

‖e_t(Φ) − E_M^t(Φ(u(t)))‖_{L²((H,μ_t);R)} = (1/√M) (Var_{μ_t}(Φ))^{1/2}
   ≤ (1/√M) C (1 + ∫_H ‖w‖²_H dμ_0(w))^{1/2}.

For ν > 0, the latter inequality is strict. Here, we used the notation Var_P(X) =
e_P(‖e_P(X) − X‖²_E) for a square-integrable, E-valued random variable X under the
measure P. We define, further, L²((Ω, P); E) as the set of square-summable (with
respect to the measure P) random variables taking values in the separable Banach
space E, and equip it with the norm ‖X‖_{L²((Ω,P);E)} := e_P(‖X‖²_E)^{1/2}. Test functions in C
fulfill, for some constant C > 0, the linear growth condition |Φ(w)| ≤ C(1 + ‖w‖_H),
for all w ∈ H. We remark that the MC error estimate in Proposition 1 is uniform with
respect to ν > 0 (see [6]). With E_M^t being a convex combination of individual Leray–
Hopf solutions, by [8, Theorem 4.2] the MC estimator E_M^t converges as M → ∞ (in
the sense of sequential convergence of measures, and uniformly on bounded time
intervals) to a Vishik–Foursikov statistical solution as defined in [8].
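
To make the estimator (6) concrete in the simplest setting of the paper (the viscous Burgers equation in d = 1 with periodic boundary conditions), the following Python sketch draws M initial velocity fields from a finite-dimensional surrogate of μ_0 (a few random Fourier modes), propagates each with a crude explicit solver standing in for S^ν(t, 0), and averages a smooth bounded functional of one component (v, g)_H; the solver, the surrogate for μ_0 and the functional are all illustrative and not taken from [6].

import numpy as np

rng = np.random.default_rng(9)

def rusanov_step(u, dx, dt, nu):
    # local Lax-Friedrichs (Rusanov) flux for the convection, central diffusion
    up = np.roll(u, -1)
    flux = 0.25 * (u * u + up * up) - 0.5 * np.maximum(np.abs(u), np.abs(up)) * (up - u)
    return (u - dt / dx * (flux - np.roll(flux, 1))
              + dt * nu * (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2)

def solve_burgers(u0, T=0.5, nu=0.01):
    # crude explicit solver for periodic viscous Burgers on [0, 1); stands in for S^nu(t, 0)
    u, dx, t = u0.copy(), 1.0 / u0.size, 0.0
    while t < T:
        dt = min(0.4 * dx / (np.abs(u).max() + 1e-12), 0.4 * dx**2 / (2.0 * nu), T - t)
        u = rusanov_step(u, dx, dt, nu)
        t += dt
    return u

x = np.linspace(0.0, 1.0, 128, endpoint=False)

def sample_u0():
    # finite-dimensional surrogate of mu_0: three random sine modes
    a = rng.normal(0.0, 1.0, 3)
    return sum(a[k] * np.sin(2.0 * np.pi * (k + 1) * x) / (k + 1) for k in range(3))

def Phi(u):
    # smooth bounded functional of one component (v, g)_H with g(x) = sin(2 pi x)
    g = np.sin(2.0 * np.pi * x)
    return np.tanh(np.mean(u * g))

M = 200
vals = [Phi(solve_burgers(sample_u0())) for _ in range(M)]
est, se = np.mean(vals), np.std(vals, ddof=1) / np.sqrt(M)
print(f"E_M^t(Phi) = {est:.4f} +/- {se:.4f} (Monte Carlo standard error)")
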
Space and Time Discretization
The MC error bounds in Proposition 1 are semi-discrete in the sense that they assume
the availability of an exact Leray–Hopf solution to the NSE for each initial velocity
sample drawn from μ_0, and they pertain to bulk properties of the flow in the sense
that they depend on the H-norm of the individual flows. We have, therefore, to
perform additional space and time discretizations in order to obtain computationally
feasible approximations of (generalized) moments of statistical solutions. In MLMC
sampling strategies such as those proposed subsequently, we consider a sequence
of (space and time) discretizations which are indexed by a level index ℓ ∈ N_0. We
consider a dense, nested family of finite dimensional subspaces V = (V_ℓ, ℓ ∈ N_0)
of V and therefore of H. Associated to the subspaces V_ℓ, we have the refinement

levels ℓ ∈ N_0, the refinement sizes (h_ℓ, ℓ ∈ N_0) and the H-orthogonal projections
(P_ℓ, ℓ ∈ N_0). Furthermore, we endow the finite dimensional spaces in V with the
norm induced by H. As ℓ → ∞, the sequence is supposed to be dense in H in the sense
that, for every v ∈ H, lim_{ℓ→+∞} ‖v − P_ℓ v‖_H = 0. In order to obtain a computationally
feasible method, we introduce a sequence of time discretizations 𝒯 = (𝒯_ℓ, ℓ ∈ N_0)
of the time interval J, each with equidistant/maximum time steps of size Δt_ℓ. The time
discretization at level ℓ ∈ N_0, 𝒯_ℓ, is the partition of [0, T] which is given by

𝒯_ℓ = {t^i ∈ [0, T] : t^i = i Δt_ℓ, i = 0, . . . , T/Δt_ℓ}.

We view the fully-discrete solution to Eq. (1) as the solution to a nonlinear dynamical
system according to

D_t(u_ℓ) = F_ℓ(t, u_ℓ),

where D_t denotes the weak derivative with respect to time and the right hand side is

F_ℓ(t, v) = f − νA_ℓ v − B_ℓ(v, v).

Here, A_ℓ denotes the discrete Stokes operator and B_ℓ the associated bilinear form.
We denote by S_ℓ^ν = (S_ℓ^ν(t^i, 0), i = 0, . . . , T/Δt_ℓ) the fully-discrete solution operator that maps u_0 into u_ℓ = (u_ℓ(t^i), i = 0, . . . , T/Δt_ℓ). We assume that the spaces
in V and the time discretizations are chosen such that the following error bound
holds.
Assumption 1 The sequence of fully-discrete solutions (u_ℓ, ℓ ∈ N_0) converges to
the solution u to Eq. (1). The space and time discretization error is bounded, for
ℓ ∈ N and t ∈ 𝒯_ℓ, with h_ℓ ≃ Δt_ℓ, either by

1.  ‖u(t) − u_ℓ‖_H = ‖S^ν(t, 0)u_0 − S_ℓ^ν(t, 0)u_0‖_H ≤ C (h_ℓ^s + (Δt_ℓ)^s) ≤ C h_ℓ^s,     (7)

for some s ∈ [0, 1], or by

2.  ‖u(t) − u_ℓ‖_H = ‖S^ν(t, 0)u_0 − S_ℓ^ν(t, 0)u_0‖_H ≤ C (h_ℓ/ν^γ + (Δt_ℓ)/ν^γ) ≤ C h_ℓ/ν^γ,     (8)

for some γ > 0.

Equation (8) implies the scale resolution convergence requirement ℓ > ℓ_ν, where
ℓ_ν ∈ N_0 is such that h_{ℓ_ν} ≲ ν^γ.
Let us comment on Assumption 1. The convergence estimates are explicit in the
discretization parameter h_ℓ (equal to, for example, a mesh width of a Finite Volume
mesh, or to N^{−1} where N denotes the spectral order of a spectral method) and in the

MLMC Simulation of Statistical Solutions

215

kinematic viscosity . Finite Element based space-time discretizations of the NSE in


space dimension d = 2, such as those in [9, 16] will, in general, not satisfy Eq. (7).
In spatial dimension d = 1, it is shown in [10, Main Corollary, p. 373] that Eq. (7)
holds, with s = 1/2 and for some constant C > 0 independent of . The rate is bound
to s = 1/2 since solutions to the inviscid limit problem form shocks in finite time.
In space dimension d = 2, for small data and with periodic boundary conditions,
for = 0 the equations of inviscid, incompressible flow are well-posed and for
sufficiently regular initial data, the unique solutions do not form shocks ([3, 17]).
First order convergent Finite Volume discretizations for the corresponding problem
of inviscid, incompressible flow which satisfy the error bound Eq. (7) with s = 1
are available in [11], based on [3, Chap. 7]. Finite Element discretizations based
on the heat equation result in discretization error bounds as in Eq. (8) with, to our
knowledge, constants C > 0 which implicitly depend (exponentially) on T / and
which are, therefore, not suitable to infer statements on the performance of the MLMC
approximation of statistical solutions for small values of .

2.1 Single Level Monte Carlo Method


With the discretization in hand we can combine the error in the spatial and temporal
domain with the statistical sampling by the MC method, leading to what we shall
refer to as the single level Monte Carlo (SLMC) approach.
We define, for N0 and t , the SLMC estimator with M independent and
identically according to 0 distributed samples wi H by
EMt ((u (t))) :=

M
M
1 
1 
((u (t))) =
(S (t, 0)wi ).
M i=1
M i=1

Here, S denotes the fully-discrete solution operator, defined above, and C .


We assume that C satisfies a Lipschitz condition:
for all v, w H :

|(v) (w)| C v w H .

(9)

We remark that Eq. (9) follows from being continuously differentiable and with
compact support. The constant C depends on the maximum of and on the H norms
of g1 , . . . , gk . Under Eq. (9), the SLMC estimator admits the following mean-square
error bound (see [6]).
Proposition 2 If, for C fulfilling Eq. (9) and N0 , the generalized moment
of the statistical solution fulfills Assumption 1, for some s [0, 1] or some > 0
and h  t, then the fully-discrete single level Monte Carlo estimator EMt ((u ))
admits, for t , the bound

216

A. Barth et al.

et () EMt ((u )) L2 ((H,t );R)


1/2
1 
Var t ()

+ et ( (u )) L2 ((H,t );R)
M
 1

C
+ (h ) .
M
For robust discretizations, (z) = zs , with C > 0 independent of , h and of .
The error bound for the SLMC estimator consists of two additive components, the
approximation of the spatial and temporal discretization and of the MC sampling.
Although we only established an upper bound, one can show that this error is, indeed,
of additive nature. This, in turn, indicates that the lack of scale-resolution in the
spatial and temporal approximation, i.e. if the discretization underresolves the scale
of viscous cut-off, can partly (in a mean-square sense) be offset by increasing the
number of samples, on the mesh-level in the MC approximation. This is in line
with similar findings for MLMC Galerkin discretizations for elliptic homogenization
problems in [2]. To ensure that the total error in Proposition 2 is smaller than a
prescribed tolerance > 0, we require
1/2
1 
+ et ( (u )) L2 ((H,t );R) .
Var t ()

M
A sufficient condition for this is for some (0, 1)
1/2
1 
Var t ()
and et ( (u )) L2 ((H,t );R) (1 ).

2.2 Multilevel Monte Carlo Method


The idea of the MLMC estimator is to expand the expectation of the approximation of
the solution on some discretization level L N0 as the expectation of the solution on
the (initial) discretization level 0 and a sum of correcting terms on all discretization
levels = 1, . . . , L, i.e., for C ,
et ((uL )) = et ((u0 )) +

L


et ((u ) (u 1 )).

=1

Then we approximate the expectation in each term on the right hand side with a
SLMC estimator with a level dependent number of samples, so that we may write
EL t ((uL )) = EMt0 ((u0 )) +

L

=1

EMt ((u ) (u 1 )).

MLMC Simulation of Statistical Solutions

217

We call EL t the MLMC estimator for discretization level L N0 . The MLMC estimator has the following mean-square error bound.
Proposition 3 If, for C fulfilling Eq. (9) and L N0 , the generalized moment
of the statistical solution fulfills Assumption 1, for = 0, . . . , L with s [0, 1) or
> 0 and h  t, the error of the fully-discrete multilevel Monte Carlo estimator
EL t ((uL )) admits, for t L , the bound
et () EL t ((uL )) L2 ((H,t );R)
et ( (uL )) L2 ((H,t );R) +

L

=0

1
1 
Var t ((u ) (u 1 )) 2

L

 

1 
1 
1 + (h0 ) +
(h ) + (h 1 ) ,
C (hL ) +

M
M0
=1

where (u1 ) 0, (z) = zs or (z) = z and z [0, 1]. If, further, for all =
1, . . . , L, it holds that h  t and that h 1 h , with some reduction factor
0 < < 1 independent of . Then, there exists C() > 0 independent of L, such
that there holds the error bound
L



1
1
et () EL t ((uL )) L2 ((H,t );R) C() (hL ) +
+
(h ) .
M0 =0 M

A proof can be found in [6]. This result leads again to the question how to chose
the sample numbers (M , = 1, . . . , L) that yield a given (mean kinetic energy)
error threshold . We have, if we assume that L (0, 1), the requirement
et ( (uL )) L2 ((H,t );R) (1 L ) and
L

=0

1/2
1 
L .
Var t ((u ) (u 1 ))

If we have that, for some > 0, for = 0, . . . , L,




Var t ((u ) (u 1 ))

1/2

then, to equilibrate the error for each level = 1, . . . , L, we choose the sample sizes
M = 2 (L )2

(10)


for a sequence ( , = 1, . . . , L) with [0, 1], subject to the constraint L =1 =


1. We determine the required number M of SLMC samples on each discretization level = 0, . . . , L based on equilibration of the errors arising from each term
Var t ((u ) (u 1 )) such that the total mean-square error from Proposition 3

218

A. Barth et al.

is bounded by the prescribed tolerance > 0. This is only possible if the convergence requirement is fulfilled for level L, since then we can choose L accordingly
to satisfy a preset error bound. However, the convergence requirement might not
be fulfilled for all < L, hence, for those levels we have to sample accordingly. In
particular, denote by 0 the first level where the solution is scale-resolved. Then
Var t ((u ) (u 1 )) might be large, as might be ; thus has to be chosen
accordingly. Since it is infeasible to determine the values Var t ((u ) (u 1 ))
we estimate sample numbers from the second (more general) bound in Proposition 3.
We refer to [4] for an analysis of the computational complexity of MLMC estimators
in the case of weak or strong errors of SPDEs.
We proceed to determine the numbers M of SLMC samples. To this end, we
continue to work under Assumption 1. We either assume Eq. (7) or we work with
Eq. (8) under the assumption that at least on the finest level the scale resolution
requirement is fulfilled, i.e., hL < . For the latter, we consider the case where the
scale resolution requirement is not fulfilled for all levels up to level (). In this
case, for 0 () < L (meaning h () and h ()+1 < ), we choose on the
first level the sample number
M0 = O


2 
((hL ))1

(11)

to equilibrate the statistical and the discretization error contributions. Here, and in
what follows, all constants implied in the Landau symbols O() are independent
of . According to this convergence analysis, the SLMC sample numbers M , for
discretization levels = 1, . . . , (), . . . , L should be chosen according to
M = O



2
(h )((hL ))1 2(1+ ) ,

(12)

for > 0 arbitrary (with the constant implied in O depending on ). Note that (h )
might be large for underresolved discretization levels. This choice of sample numbers
is in line with Eq. (10) for one particular sequence ( , = 1, . . . , L).

3 Numerics
We describe numerical experiments in the unit interval D = (0, 1) in space dimension
d = 1, i.e. for the viscous Burgers equation, and in space dimension d = 2, in
D = (0, 1)2 , with periodic boundary conditions, and with stochastic initial data. As
indicated in Sect. Space and Time Discretization, in space dimension d = 1, i.e. for
scalar problems, the bound in Assumption 1 holds with s = 1/2 and with a constant
C > 0 independent of (see [10]). If the mesh used for the space discretization
resolves the viscous scale, the first order Finite Volume method even converges with
rate s = 1 in L 1 (D) due to the high spatial regularity of the solution u, albeit with
constants which blow up as the viscosity tends to zero. Specifically, we consider

MLMC Simulation of Statistical Solutions

219

Eq. (1) with periodic boundary conditions in the physical domain D = [0, 1], i.e.
2

1 2
u+
(u ) = 2 u + f , for all x D, t [0, T ], ,
t
2 x
x

(13)

which is completed with the random initial condition u(0) = u0 L 2 (, L 1 (D)


L (D)), inducing an initial measure 0 on L 1 (D) L (D) H = L 2 (D) with finite
second moments.
The numerical simulation of a statistical solution requires sampling from the
measure 0 defined on the generally infinite dimensional space H. To give a convergence result for finite dimensional, principal component approximations of this
initial measure 0 , we follow closely the approach in [5].
The initial distribution 0 is defined on a probability space (, F , P) and is
assumed to be given as an image measure under an H-valued random variable with
distribution 0 . This random variable is defined as a mapping from the measurable space (, F ) into the measurable space (H, B(H)) such that 0 = X P.
We assume throughout the numerical experiments that 0 is a Gaussian measure
supported on H or on a subspace of H. Gaussian measures on a separable, infinitedimensional Hilbert space H are completely characterized by the mean m H and
covariance operator Q defined on H, being a symmetric, nuclear trace-class operator. Any Gaussian random variable X L 2 (; H) can then be represented by its
KarhunenLove expansion
X =m+

i i wi ,

iN

where ((i , wi ), i N) is a complete orthonormal system in H and consists of eigenvalues and eigenfunctions of Q. The sequence (i , i N) consists of real-valued,
independent, (standard) normal-distributed random variables. With -term truncations of KarhunenLove expansions
define a sequence of random variables
 we
(X , N) given by X = m + i=1 i i wi , with mean m H and covariance
operator Q . The sequence of truncated sums X converge P-a.s. to X in the H-norm
as +. Then, we have the following lemma (see [5] for a proof).
Lemma 1 ([5]) If the eigenvalues (i , i N) of the covariance operator Q of the
Gaussian random variable X on H have a rate of decay of i C i for some
> 1, then the sequence (X , N) converges to X in L 2 (; H) and the error is
bounded by
1
1
2 .
X X L2 (;H) C
1
For the numerical realization of the MLMC method, and in particular for the
numerical experiments ahead, we need to draw samples from the initial distribution.
2
(D), where
As an example we therefore introduce a Gaussian distribution on H = Lper
D = (0, 1). In the univariate
case,
the
condition

u
=
0
in
(1)
becomes
void and

2
2
(D) = {u L 2 (D) : D u = 0}. A basis of Lper
(D) is given by (wi , i N), where
Lper

220

A. Barth et al.

wi (x) = sin(2i x). Then the covariance operator Q is with Mercers theorem defined,
2
(D), as
for Lper

Q(x) =
q(x, y)(y)dy
D


where the kernel is q(x, y) = iN i wi (x)wi (y) = 
x) sin(2i y).
iN i sin(2i

<

to define
Now, we may choose any sequence (i , i N) with
i
iN
a covariance operator Q on H which is trace class. One possible choice would
be i  i , for > 2. In our numerical experiments, we choose as eigenvalues
i = i2.5 for i 8 and zero otherwise, and the mean field m 0, i.e.
u0 (x, ) =

8

1
sin(2 ix)Yi ().
5/4
i
i=1

(14)

The kinematic viscosity is chosen to be = 103 and the source term is set to f 0.
All simulations reported below were performed on Cray XE6 in CSCS [14] with the
recently developed massively parallel code ALSVID-UQ [1, 13, 15]. Simulations
were executed on Cray XE6 (see [14]) with 1496 AMD Interlagos 2 16-core 64bit CPUs (2.1 GHz), 32 GB DDR3 memory per node, 10.4 GB/s Gemini 3D torus
interconnect with a theoretical peak performance of 402 TFlops.
The initial data in Eq. (14) and the reference solution uref at time t = 2 are depicted
in Fig. 1. The solid line represents the mean Et (uref ) and the dashed lines represent
the mean plus/minus the standard deviation (Var t (uref ))1/2 of the (random) solution
uref at every point x D. The variance
 and1therefore the2standard deviation can easily
sin(2 ix)) , for x D. The solution is
be calculated by Var 0 (u0 (x)) = 8i=1 ( i5/4
computed with a standard first-order Finite Volume scheme using the Rusanov HLL
solver on a spatial grid in D of size 32768 cells and the explicit forward Euler
time stepping (see [12]) with the CFL number set to 0.9. The number of levels of
refinement is 9 (the coarsest level has 64 cells). The number of samples is chosen
according to the analysis in Sect. Space and Time Discretization with s = 1, i.e.

Fig. 1 Reference solution computed using the MLMC finite volume method

MLMC Simulation of Statistical Solutions

221

M = ML 22(L ) , for = 0, . . . , L, where the number of samples on the finest mesh


set to ML = 4 (this leads to M0 = 262144). The simulation took 50 min (wall-clock
time) on 256 cores.
Next, following Definition 1 and the remarks thereafter, for k = 1, ( ) =
and for a given kernel g1 L (D), we define a continuous, linear functional on
L 1 (D) L (D) by

u(x, t, )g1 (x)dx, for all t [0, T ] .

(u)(t, ) =

(15)

Note, that formally the function is not compactly supported. However, for onedimensional problems, there holds an energy bound (we refer to the results in [12])
with respect to the initial data u0 (, ), i.e. u(, t, ) L2 (D) u0 (, ) L2 (D) . Since
the values of the inner product can be bounded for every t and P-a.e. by
|(u(, t, ), g1 )H | u(, t, ) L2 (D) g1 L2 (D) u0 (, ) L2 (D) g1 L2 (D) < ,
the function () may be modified for large values, enforcing the required compact
support of in the Definition 1. We note, that such modification is -dependent,
and hence a more stringent bound of the L (, L 2 (D))-norm of the initial data
is required instead, i.e. we require that u0 (, ) L2 (D) < C holds P-a.s. for some
constant C < . Such a bound holds for the uniformly distributed initial condition,
however, it does not hold for the Gaussian distributed initial condition considered
here. In the following numerical experiment, we choose the function g1 in Eq. (15)
to be g1 (x) = (x 0.5)3 . With this choice it can be easily verified that in Eq. (15)
fulfills the Lipschitz condition in Eq. (9).
Using MLMC Finite Volume approximations for the mean Et ((uref )) and the
variance Var t ((uref )) from Fig. 1 as a reference solution, we compute approximate solutions u using both, SLMC Finite Volume and MLMC Finite Volume
methods, on a family of meshes with spatial resolutions ranging from n0 = 64 cells
up to nL = 2048 cells. We monitor the convergence of the errors in EL t ((uL )) and
Var Lt ((uL )),








LE = Et ((uref )) EL t ((uL )) , LV = Var t ((uref )) Var Lt ((uL )) .
The number of samples on the finest mesh is set to ML = 4. The number of levels
for the MLMC Finite Volume method
is chosen so that the coarsest level contains
64 cells. Since 1/64 0.015 < = 101.5 0.03, the viscous cut-off scale
(which, in the present problem coincides with the scale of the viscous shock profile)
of the solution u is resolved on every mesh resolution level = 0, . . . , L.
Since the solution is a random field, the discretization error L is a random quantity
as well. For error convergence analysis we, therefore, compute a statistical estimator by averaging estimated discretization errors from several independent runs. We
compute the error in Proposition 3 by approximating the L 2 (H, R)-norm by MC

222

A. Barth et al.

sampling. Let (uref ) denote the reference solution and (((uL ))(k) , k = 1, . . . , K)
be a sequence of independent approximate solutions obtained by running the SLMC
Finite Volume or MLMC Finite Volume solver K N times. The L 2 (; H)-based
relative percentage error estimator is defined to be




RLE = 100 EK

e,(k)
L

|Et ((uref ))|

2





, RLV = 100 EK

2
V,(k)
L
.
| Var t ((uref ))|

In order to obtain an accurate estimate of RLE and RLV , the number K must be
large enough to ensure a sufficiently small (<0.1) relative variance 2 (RLE ) and
2 (RLV ). We found K = 30 to be sufficient for our numerical experiments. Next,
we analyze the relative percentage error convergence plots of mean and variance.
In Fig. 2, we plot the error LE against the number of cells on the finest discretization level L in the left subplot and versus the computational work (runtime) in the
right subplot. The coarse level stays the same when we increase the finest discretization level L to obtain a convergence plot. Both SLMC and MLMC methods give
similar relative percentage errors for the same spatial resolution. However, there is a
significant difference in the runtime: MLMC methods are two orders of magnitude
faster than plain SLMC methods. The lower dashed line in the top-right corner of
each plot in Fig. 2 (and all subsequent figures) indicates the expected convergence
rate of the MLMC method obtained in Proposition 3. These expected convergence
rates coincide with the observations in the numerical experimental data. In Fig. 3,
we plot the error LV versus the number of cells on the finest discretization level L
in the left subplot and versus the computational work (runtime) in the right subplot.
Analogously as in the plots for the expectation, both SLMC and MLMC methods
give similar errors for the same spatial resolution. In terms of the required computational work for one percent error, MLMC methods are, in this example, two orders
of magnitude faster than plain SLMC methods.
We repeat the error convergence analysis for Burgers equation, but this time
with much fewer cells on the coarsest mesh resolution in the MLMC estimator. In

Fig. 2 Convergence of the error LE of the mean Et () of the viscous Burgers equation

MLMC Simulation of Statistical Solutions

223

Fig. 3 Convergence of the error LV of the variance Var t () of the viscous Burgers equation

particular, instead of taking 64 cells on the coarsest mesh resolution, we will take
only 8
cells, i.e. adding three more levels of mesh refinement. Since in this case
1/8 > = 101.5 0.03, the viscous cut-off length scale of the solution u is not
resolved on every mesh resolution level, in particular, it is resolved only on the mesh
resolution levels = 3, . . . , L, and it is under-resolved on = 0, 1, 2. Notice, that
the number of cells on the finer mesh resolutions stays the same as in the previous
experiment, where n3 = 64, . . . , nL = 2048. Note also that by the theory in [10],
the presently used numerical scheme converges robustly in H with order s  1/2,
meaning that the constant in the convergence bound is independent of . In Fig. 4,
we plot the error LE against the number of cells nL in the left subplot and versus
computational work (runtime) in the right subplot for the case of 8 cells on the
coarsest resolution. Even in the presence of multiple under-resolved levels, the error
convergence of the MLMC Finite Volume method is faster than the previous setup
(compared to Fig. 2). In Fig. 5, we plot the error LV versus the number of cells nL in
the left subplot and versus the computational work (runtime) in the right subplot for
the case of 8 cells on the coarsest resolution. Again, even in the presence of multiple
under-resolved levels, the error convergence of the MLMC Finite Volume method is
faster than the previous setup (compared to Fig. 3).

Fig. 4 Convergence of the error LE of the mean Et () of the viscous Burgers equation

224

A. Barth et al.

Fig. 5 Convergence of the error nV of the variance Var t () of the viscous Burgers equation

We conclude with preliminary numerical experiments in space dimension d = 2,


from [11]. We consider Eq. (1) in the physical domain D = [0, 1]2 , with periodic
boundary conditions. For d = 2 and > 0, individual and statistical solutions exist
and are unique. Moreover, in this setting Eq. (1) admits equivalent vorticity reformulations in terms of a scalar vorticity obtained from the velocity u(t) = (u1 (t), u2 (t))
by
(16)
(t) := rot u(t) = 2 u1 (t) 1 u2 (t)
which maps Sobolev spaces of divergence-free velocity fields isomorphically to
spaces of (scalar) vorticities . The relation in Eq. (16) is invertible via the BiotSavart law:
u(t) = curl ()1 (t) = (2 ()1 , 1 ()1 ) =: rot1 (t).

(17)

In terms of the (scalar in space dimension d = 2) vorticity (t), Eq. (1) becomes
the viscous vorticity equation: in the periodic setting, for s 0, given > 0, find
s+1
s1
(D)) H 1 (J; Hper
(D)) such that there holds Eq. (17) and
Xs := L 2 (J; Hper
s1
(D)),
t + u = , in L 2 (J; Hper
s+1
= in L 2 (J; Hper
(D)),

|t=0 = 0 in

(18)

s
Hper
(D).

The relations Eqs. (16) and (17) are bijective in certain scales of (Sobolev) spaces
of D-periodic functions so that Eqs. (16)(18) and (1) are equivalent. Moreover,
the isomorphisms rot and rot1 in Eqs. (16) and (17) allow to transfer the statistical
solutions = (t , t 0) equivalently to a one-parameter family = (t , t 0)
of probability measures on sets of admissible vorticities, defined for every ensemble
F of 0 -measurable initial vorticities 0 by
t (F) = 0 ((T (t))1 (F)), T (t)0 := (rot S (t, 0) rot1 )0 .

MLMC Simulation of Statistical Solutions

225

Fig. 6 L2 error of the mean for different viscosities with SLMC and MLMC, with respect to the
mesh width h and wall clock time

Here, we defined 0 (F) := (0 rot1 )(F). Existence and uniqueness of the velocity
statistical solutions imply existence and uniqueness of the vorticity statistical
solutions . We refer to [11] for further details, and also for detailed description of
the Finite Volume discretization and convergence analysis of Eq. (18) (Fig. 6).
In the ensuing numerical experiments, we consider a probability measure 0
concentrated on initial vorticities of the form:
0 (x; ) = 0 (x) + Y1 ()1 (x)
1
(D) denotes the mean initial vorticwith Y1 U (1, 1) and where 0 (x) Hper
1
ity, and the fluctuation is given by 1 (x) := sin(2 x1 ) sin(2 x2 ) Hper
(D). We
choose as the mean vorticity 0 (x) := x1 (1 x1 )x2 (1 x2 ). Note that then 0 ()
1
(D) P-a.s.
Hper
The ensuing numerical results are obtained using a forward in time, central in space
(FTCS), vorticity solver, described in detail in [11]. In this case, for small data, the
individual Leray-Hopf solutions converge, as 0, to the unique incompressible,
inviscid Euler flow (see [3, Chap. 13], [17]) in C([0, T ]; L 2 (D)). Contrary to the
one-dimensional setting, in space dimension d = 2 and for sufficiently regular initial
data, incompressible, inviscid Euler flow solutions do not form shocks. To construct
a reference solution, we approximate the ensemble average by 1-dimensional Gauss
Legendre quadrature (using 20 nodes) and a fine discretization in space and time. This
is sufficient to accurately resolve the mean of the statistical solution. This solution,
computed with a space discretization on 10242 equal sized cells, is used as a reference
solution for the error convergence analysis of the SLMC and MLMC Finite Volume
discretization error for the 1-parametric random initial data. Simulations of individual
solutions are performed up to final time T = 1. We compare SLMC and MLMC
approximations. We select the sample numbers on the discretization levels so that
the sampling error and the discretization errors remain balanced. Due to the absence
of boundary layers, for periodic boundary conditions, and of shocks in solutions of the

226

A. Barth et al.

limiting problem, we are in the setting of Assumption 1, with s = 1. Then, the SLMC
error behaves like O(M 1/2 ) + O(h ) with O() independent of . A sufficient choice
of the sample numbers for a first order numerical scheme on individual solutions
is M = h 2 . For MLMC, with the choice M = 22s(Ll) we achieve an asymptotic
error bound of O(hL log(hL )). On the finest meshes we choose ML = 10 samples
in order to remove sampling fluctuations. Concerning the computational work, the
computational cost of a single deterministic simulation behaves like WDET hL3
(in two spatial dimensions and one temporal dimension). We remark, that Multigrid
methods allow for implicit time-stepping for the viscous part and for the velocity
reconstruction in work and memory of O(hL2 ) per time step. For SLMC, we perform
O(hL2 ) deterministic runs. This yields a scaling of the overall work of WSLMC hL5 .
With MLMC we require M = O(h 2s /hL2s ) simulations per level, for a total work of:
WMLMC

L

l=0

h 3 h 2s /hL2s

hL2

L


h 1 hL3 ,

=0

neglecting the logarithmic term. That is, for SLMC with the mentioned choices of
1/5
1/3
sample numbers M, we obtain WSLMC ErrSLMC , whereas for MLMC, WMLMC
ErrMLMC (see Fig. 6). From the discussion above and from the numerical results,
SLMC has prohibitive complexity for small space and timesteps. As predicted by the
theoretical analysis, MLMC exhibits, in terms of work vs. accuracy, a performance
which is comparable to that of one individual numerical solution on the finest mesh.
As in the one-dimensional setting, for the computation of the error, a sample of
K = 10 experiments was generated and the error is estimated by the sample average.
The number K of repetitions of experiments is chosen in such a way that the variance
of the relative error is sufficiently small.
Acknowledgments The research of Ch. S. and A. B. is partially supported under ERC AdG 247277.
The research of J. . was supported by ETH CHIRP1-03 10-1 and CSCS production project ID
S366. The research of A.B. leading to these results has further received funding from the German
Research Foundation (DFG) as part of the Cluster of Excellence in Simulation Technology (EXC
310/2) at the University of Stuttgart, and it is gratefully acknowledged. The research of A. B. and
J. . partially took place at the Seminar fr Angewandte Mathematik, ETH Zrich. The authors
thank S. Mishra and F. Leonardi for agreeing to cite numerical tests from [11] in space dimension
d = 2.

References
1. ALSVID-UQ. Version 3.0. http://www.sam.math.ethz.ch/alsvid-uq
2. Abdulle, A., Barth, A., Schwab, Ch.: Multilevel Monte Carlo methods for stochastic elliptic
multiscale PDEs. Multiscale Model. Simul. 11(4), 10331070 (2013)
3. Bahouri, H., Chemin, J.-Y., Danchin, R.: Fourier Analysis and Nonlinear Partial Differential Equations. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of
Mathematical Sciences], vol. 343. Springer, Heidelberg (2011)
4. Barth, A., Lang, A.: Multilevel Monte Carlo method with applications to stochastic partial
differential equations. Int. J. Comput. Math. 89(18), 24792498 (2012)

MLMC Simulation of Statistical Solutions

227

5. Barth, A., Lang, A.: Simulation of stochastic partial differential equations using finite element
methods. Stochastics 84(23), 217231 (2012)
6. Barth, A., Schwab, Ch., ukys, J.: Multilevel Monte Carlo approximations of statistical solutions of the NavierStokes equations. Research report 2013-33, Seminar for Applied Mathematics, ETH Zrich (2013)
7. Foias, C., Manley, O., Rosa, R., Temam, R.: Navier-Stokes equations and turbulence. Encyclopedia of Mathematics and its Applications, vol. 83. Cambridge University Press, Cambridge
(2001)
8. Foias, C., Rosa, R., Temam, R.: Properties of time-dependent statistical solutions of the threedimensional Navier-Stokes equations. Annales de lInstitute Fourier 63(6), 25152573 (2013)
9. Heywood, J.G., Rannacher, R.: Finite element approximation of the nonstationary NavierStokes problem. I. Regularity of solutions and second-order error estimates for spatial discretization. SIAM J. Numer. Anal. 19(2), 275311 (1982)
10. Karlsen, K.H., Koley, U., Risebro, N.H.: An error estimate for the finite difference approximation to degenerate convection-diffusion equations. Numer. Math. 121(2), 367395 (2012)
11. Leonardi, F., Mishra, S., Schwab, Ch.: Numerical Approximation of Statistical Solutions of
Incompressible Flow. Research report 2015-27, Seminar for Applied Mathematics, ETH Zrich
(2015)
12. LeVeque, R.: Numerical Solution of Hyperbolic Conservation Laws. Cambridge Press, Cambridge (2005)
13. Mishra, S., Schwab, Ch., ukys, J.: Multi-level Monte Carlo Finite Volume methods for nonlinear systems of conservation laws in multi-dimensions. J. Comput. Phys. 231(8), 33653388
(2012)
14. Rosa (Cray XE6). Swiss National Supercomputing Center (CSCS), Lugano. http://www.
cscs.ch
15. ukys, J., Mishra, S., Schwab, Ch.: Static load balancing for Multi-Level Monte Carlo finite
volume solvers. PPAM 2011, Part I, LNCS, vol. 7203, pp. 245254. Springer, Heidelberg
(2012)
16. Temam, R.: Navier-stokes equations and nonlinear functional analysis. CBMS-NSF Regional
Conference Series in Applied Mathematics, vol. 41. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (1983)
17. Yudovic, V.I.: A two-dimensional non-stationary problem on the flow of an ideal incompressible
fluid through a given region. Mat. Sb. (N.S.) 64(106), 562588 (1964)

Unbiased Simulation of Distributions


with Explicitly Known Integral Transforms
Denis Belomestny, Nan Chen and Yiwei Wang

Abstract In this paper, we propose an importance-sampling based method to obtain


unbiased estimators to evaluate expectations involving random variables whose probability density functions are unknown while their Fourier transforms have explicit
forms. We give a general principle about how to choose appropriate importance sampling density under various Lvy processes. Compared with the existing methods,
our method avoids time-consuming numerical Fourier inversion and can be applied
effectively to high dimensional option pricing under different models.
Keywords Monte Carlo Unbiased simulation
processes Importance sampling

Fourier transform

Levy

1 Introduction
Nowadays Monte Carlo simulation becomes an influential tool in financial applications such as derivative pricing and risk management; see Glasserman [12] for a
comprehensive overview, Staum [25] and Chen and Hong [8] for introductory tutorials of the topic. A standard MC procedure typically starts with using some general
methods of random number generation, such as inverse transform and acceptancerejection, to sample from descriptive probabilistic distributions of market variables.

D. Belomestny (B)
Duisburg-Essen University, Thea-Leymann-Str. 9, Essen, Germany
e-mail: denis.belomestny@uni-due.de
D. Belomestny
IITP RAS, Moscow, Russia
N. Chen Y. Wang
The Chinese University of Hong Kong, Hong Kong, China
e-mail: nchen@se.cuhk.edu.hk
Y. Wang
e-mail: ywwang@se.cuhk.edu.hk
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_9

229

230

D. Belomestny et al.

Therefore, explicit knowledge about the functional forms of the underlying distributions is a prerequisite for the applications of MC technique.
However, a growing literature of Lvy-driven processes and their applications in
finance calls for research to investigate how to simulate from a distribution whose
cumulative probability function or probability density function may not be available in explicit form. As an important building block of asset price modeling, Lvy
processes can capture well discontinuous price changes and thus are widely used to
model the skewness/smile of implied volatility curves in the option market; see, e.g.
Cont and Tankov [26], for the modeling issues of Lvy process. According to the
celebrated Lvy-Khintchine representation, the joint distribution of the increments
of a Lvy process is analytically characterized by its Fourier transform. Utilizing this
fact, we can evaluate the price function of options written on the underlying assets
modelled by a Lvy process in two steps. First, we apply the Fourier transform (with
some suitable adjustments) on the risk-neutral presentation of option prices in order
to obtain an explicit form of the transformed price function. Second, we numerically
invert the transform to recover the original option price. This research line can be
traced back to Carr and Madan [5], which proposed Fast Fourier Transform (FFT) to
accelerate the computational speed of the method. One may also refer to Lewis [20],
Lee [19], Lord and Kahl [21] and Kwok et al. [17] for more detailed discussion and
extension of FFT. Kou et al. [16] used a trapezoidal rule approximation developed
by Abate and Whitt [1] to invert Laplace transforms for the purpose of option pricing
under the double exponential jump diffusion model, a special case of Lvy process.
Feng and Linetsky [11] introduced Hilbert transform to simplify Fourier transform
of discretely monitored barrier option by backward induction. More recently, Biagini
et al. [4] and Hurd and Zhou [14] extended the Fourier-transform based method to
price options on several assets, including basket options, spread options and catastrophe insurance derivatives.
The numerical inversion of Fourier transforms turns out to be the computational
bottleneck of the above approach. It essentially involves using a variety of numerical discretization schemes to evaluate one- or multi-dimensional integrals. Hence,
such methods will suffer seriously from the curse of dimensionality as the problem dimension increases when we try to price options written on multiple assets.
Monte Carlo method, as a competitive alternative for calculating integrals in a high
dimensional setting, thus becomes a natural choice in addressing this difficulty. To
overcome the barrier that explicit forms of the distribution functions for Lvy driven
processes are absent, some early literature relies on somehow ad hoc techniques to
derive upper bounds for the underlying distribution for the purpose of applying the
acceptance-rejection principle (see, e.g. Glynn [2] and Devroye [18]). More recently,
some scholars, such as Glasserman and Liu [13] and Chen et al. [9], proposed to
numerically invert the transformed distributions to tabulate the original distribution
on a uniform grid so that they can simulate from. Both directions work well in one
dimension. Nevertheless, it is difficult for them to be extended to simulate high
dimensional distributions.
In this paper we propose a novel approach for computing high-dimensional integrals with respect to distributions with explicitly known Fourier transforms based on

Unbiased Simulation of Distributions with Explicitly

231

a genuine combination of Fourier and Monte Carlo techniques. In order to illustrate


the main idea of our approach, let us first consider a simple problem of computing
expectations with respect to one-dimensional stable distributions. Let p (x) be the
density of a random variable X having a symmetric stable law with the stability
parameter (1, 2), i.e., its Fourier transform is
.
F [p ](u) =

eiux p (x)dx = exp(|u| ),

Suppose we want to compute the expectation Q = e[g(X)] for some nonnegative


function g. Since there are several algorithms of sampling from stable distribution
(see, e.g. Chambers et al. [7]), we could use Monte Carlo to construct the estimate
1
g(Xi ),
n i=1
n

Qn =

where X1 , . . . , Xn is an i.i.d. sample from the corresponding -stable distribution.


Recall that in the theory of Fourier transform, we have Parsevals identity (see,
e.g. Rudin [23]) such that for g and p,

Rd

g(x)p(x)dx =

1
(2 )d


Rd

F [g](u)F [p](u)du.

Take, for example, g(x) = (max{x, 0}) with some (0, ), then Parsevals identity implies


g(x)
[x p (x)] dx
x

1
F [x p (x)](u)F [g(x)/x](u) du.
=
2

Q=

According to
F [x p (x)](u) = i
and
F [g(x)/x](u) =

d
F [p ](u) = isign(u)|u|1 exp(|u| )
du
()
(cos(/2) + isign(u) sin(/2)),
|u|

we have
() sin(/2)
Q=


0

u1 exp(u ) du.

(1)

232

D. Belomestny et al.

Consider a new random variable X  with a power exponential distribution density


f (x) =

1
exp(|x| ), < x < +
2 (1 + 1/)

and a new function g (x) such that


(1/) () sin(/2) 1
|x|
, < x < +,

g (x) =

we can easily show that Q = e[g (X  )] from (1). If = 1, noting that g is in


fact a constant function, we have Var[g (X  )] = 0. On the other hand, Var[g(X)] >
B(2 )1 for some constant B > 0 (not depending on ).
This shows that even in the above very simple situation, moving to the Fourier
domain can significantly reduce the variance of Monte Carlo estimates. More importantly, by using our approach, we replace the problem of sampling from the stable
distribution p by a much simpler problem of drawing from the exponential power
distribution f . Of course, the main power of Monte Carlo methods can be observed in
high-dimensional integration problems, which will be considered in the next section.

2 General Framework
Let g be a real-valued function on Rd and let p be a probability density on Rd . Our
aim is to compute the integral of g with respect to p :

V=

g(x)p(x) dx.
Rd

Suppose that there is a vector R Rd , such that


g(x)ex,R L 1 (Rd ), p(x)ex,R L 1 (Rd ),
then we have by the Parsevals formula

V=

Rd

g(x)ex,R p(x)ex,R dx =

1
(2 )d


Rd

F [g](iR u)F [p](u iR) du.

(2)

Let q be a probability density function with the property that q(x) = 0 whenever
|F [p](u iR)| = 0, | | denoting the complex modulus. That is, q has the same
support as |F [p](u iR)|. Then we can write
V=
where

1
(2 )d


Rd

F [g](iR u)

F [p](u iR)
q(u) du = eq [h(X)] ,
q(u)

(3)

Unbiased Simulation of Distributions with Explicitly

h(x) =

233

1
F [p](x iR)
.
F [g](iR x)
d
(2 )
q(x)

and X is a random variable distributed according to q.


The variance of the corresponding Monte Carlo estimator is given by

Varq [h(X)] =

1
2

2d 
Rd

|F [g](iR u)|2

|F [p](u iR)|2
du V 2 .
q(u)

Note that the function |F [p](u iR)| is, up to a constant, a probability density and
in order to minimize the variance, we need to find a density q, that minimizes the
ratio
|F [p](u iR)|
q(u)
and that we are able to simulate from. In the next section, we discuss how to get a
tight upper bound for |F [p](iR u)| in the case of an infinitely divisible distribution
p, corresponding to the marginal distributions of Lvy processes. Such a bound can
be then used to find a density q leading to small values of variance Varq [h(X)].

3 Lvy Processes
Let (Zt ) be a pure jump d-dimensional Levy process with the characteristic exponent
, that is


E eiu,Zt  = et(u) , u Rd .
Consider the process Xt = Zt , where is a real m d matrix. Let a vector R Rm

.
be such that R (dz) = e R,z (dz) is again a Lvy measure, i.e.



 2
|z| 1 R (dz) < .

Suppose that there exist a constant C > 0 and a real number (0, 2), such
that, for sufficiently small > 0, the following estimate holds

{zR:|z,h|}

z, h2 R (dz) C 2 , h Rd , |h| = 1.

(4)

The above condition is known as Oreys condition in the literature (see Sato [24]). It
is usually used to ensure that the process admits continuous transition densities. The
value is called by the BlumenthalGetoor index of the process. Under it, we have
Lemma 1 Suppose that (4) holds, then there exists constant AR > 0 such that, for
any u Rm and sufficiently large | u|,

234

D. Belomestny et al.




2tC

|F [pt ](u iR)| AR exp 2


u
,

(5)

where pt is the density of Xt .


Proof For any u Rm , we have

 



 R,z

|F [pt ](u iR)| = exp t


1e
cos  u, z +  R, z1{|z|1} (dz)
Rd

 

1 e R,z +  R, z1{|z|1} (dz)


= exp t
Rd

 




(dz)
e R,z 1 cos  u, z
exp t
Rd
 




1 cos  u, z R (dz) ,
= AR exp t
Rd

where
 
AR = exp t

Rd


  R,z

e
1  R, z1{|z|1} (dz) < ,

since

 R,z

1  R, z1{|z|1}
C1 ( R) |z|2 1{|z|1} + C2 ( R)e R,z 1{|z|>1} .

First, note that the condition (4) is equivalent to the following one

{zR:|z,k|1}

z, k2 R (dz) C |k| ,

for sufficiently large k Rd , say |k| c0 . To see this, it is enough to change in (4)
the vector h to the vector k. Fix u Rm with |u| 1 and | u| c0 , then using
the inequality 1 cos(x) 22 |x|2 , |x| , we find






2
1 cos  u, z R (dz) 2
 u, z2 R (dz)
{zR:| u,z|1}
Rd

2C

2
u
.

Lemma 1 provides us a general guideline how to choose the importance sampling


density q used in our unbiased simulation. Note that, after a proper rescaling, the
function on the right hand side of the inequality (5) gives us the probability density
of a power exponential distribution. Hence, letting

Unbiased Simulation of Distributions with Explicitly

235




2tC

q(u) := C exp 2
u
,

we know from Lemma 1 that our simulation scheme will have a finite variance.
Discussion
The condition (4) is not very restrictive. We can show that it is true for many commonly used Lvy models in financial applications, such as CGMY, NIG and -stable
models. Below we discuss a special case, which can be viewed as a generalization
of -stable processes.
For simplicity we take R = 0. Clearly, if (Zt ) is a d-dimensional -stable process
which is rotation invariant ((h) = c |h| , for h Rd ), then (4) holds. Consider now
general -stable processes. It is known that Z is -stable if and only if its components
Z 1 , . . . , Z d are -stable and if the Levy copula C of Z is homogeneous of order 1
(see Cont and Tankov [26]), i.e.
C (r 1 , . . . , r d ) = r C (1 , . . . , d )
for all = (1 , . . . , d ) Rd and r > 0. As an example of such homogeneous Levy
copula one can consider

1/
d


j

1 ...
C (1 , . . . , d ) = 22d
1

d 0


(1 )11 ...d <0 ,

j=1

where > 0 and [0, 1]. If the marginal tail integrals given by


j (xj ) = R, . . . , I (xj ), . . . R sgn(xj )


with
I (x) =

(x, ),
x 0,
(, x], x < 0,

are absolutely continuous, we can compute the Lvy measure for the Lvy copula
C by differentiation as follows:
(dx1 , . . . , dxd ) = 1 . . . d C |1 =1 (x1 ),...,d =d (xd ) 1 (dx1 ) . . . d (dxd ),
where 1 (dx1 ), . . . , d (dxd ) are the marginal Lvy measures.
Suppose that the marginal Lvy measures are absolutely continuous with a stablelike behaviour:
j (dxj ) = kj (xj ) dxj =

lj (|xj |)
dxj , j = 1, . . . , d,
|xj |1+

236

D. Belomestny et al.

where l1 , . . . , ld are some nonnegative bounded nonincreasing functions on [0, )


with lj (0) > 0 and [0, 2]. Then
(dx1 , . . . , dxd ) = G(1 (x1 ), . . . , d (xd )) k1 (x1 ) . . . kd (xd ) dx1 . . . dxd
with G(1 , . . . , d ) = 1 . . . d C |1 ...,d . Note that for any r > 0,
kj (rxj ) = r 1 k j (xj , r), j (rxj ) = r j (xj , r), j = 1, . . . , d,
where
k j (xj , r) =

lj (rxj )
|xj |

, j (xj , r) = 1{xj 0}
1+

xj

k j (s, r) ds + 1{xj <0}

 xj

k j (s, r) ds.

Since the function G is homogeneous with order 1 d, we get for (0, 1),

{zR:|z,h|}


z, h2 (dz) = 2

{zR:|y,h|1}



y, h2 G 1 (y1 , ), . . . , d (yd , )

k 1 (y1 , ) . . . k d (yd , ) dy1 . . . dyd





2
y, h2 G 1 (y1 , 1), . . . , d (yd , 1)
{zR:|y,h|1}

k 1 (y1 , 1) . . . k d (yd , 1) dy1 . . . dyd

and the condition (4) holds, provided



inf

h: |h|=1 {zR:|z,h|1}

z, h2 (dz) > 0.

If for some R = (R1 , . . . , Rd ) the functions exRi li (x), i = 1, . . . , d, are bounded, the
.
condition (4) holds for R (dz) = eR,z (dz).
Of course, the power exponential distribution may not be a proper candidate for
q(u) if the condition (4) fails to hold. Nevertheless, we need to stress that the principle
behind Parsevals identity still applies here and thus our unbiased simulation should
work in that case.
For example, for the variance gamma process Xt with parameters , drift
and volatility of Brownian motion and variance of the subordinator, the Fourier
transform is
u2 2
t
i u) .
E[eiuXt ] = (1 +
2
There exists some constant 1 < <

2t
,

providing

iuX

E[e t ]
<

2t

1
(1 + |u|)

> 1, such that

Unbiased Simulation of Distributions with Explicitly

237

for sufficiently large |u|, so we can use the power density


1
2(1 + |u|)

q(u) =

as our importance sampling density.


We leave the investigation on the variance property of the simulator when the
condition (4) is not satisfied to the future research work.

4 Positive Definite Densities


Let p be a probability density on Rd , which is positive definite. For example, all
symmetric infinite divisible absolute continuous distributions have positive definite
densities. Let furthermore g be a nonnegative integrable function on Rd . Suppose
that we want to compute the expectation

V = Ep [g(X)] =

g(x)p(x) dx.
Rd

We have by the Parsevals identity


1
V=
(2 )d


F [g](x)F [p](x) dx.

Rd

Note that p (x) = F [p](x)/((2 )d p(0)) is a probability density and therefore we


have another dual representation for V :
V = Ep [g (X)]
with g (x) = p(0)F [g](x).
Let us compare the variances of the random variables g(X) under X p and g (X)
under X p . It holds

Var p [g(X)] =

Rd

g2 (x)p(x) dx V 2

and

p(0)
|F [g](x)|2 F [p](x) dx V 2
(2 )d Rd

= p(0)
(g  g)(x)p(x) dx V 2 ,

Var p [g (X)] =

Rd

238

D. Belomestny et al.

where
(g  g)(x) =

g(x y)g(y) dy.

As a result,


Var p [g(X)] Var p [g (X)] =

Rd

 2

g (x) p(0)(g  g)(x) p(x) dx.

Note that if p(0) > 0 is small, then it is likely that Var p [g(X)] > Var p [g (X)].
This means that estimating V under p with Monte Carlo can be viewed as a variance
reduction method in this case. Apart from the variance reduction effect, the density
p may has in many cases (for example, for infinitely divisible distributions) much
simpler form than p and therefore is easy to simulate from.

5 Numerical Examples
5.1 European Put Option Under CGMY Model
The CGMY process {Xt }t0 with drift is a pure jump process with the Lvy measure
(see Carr et al. [6])

exp(Gx)
exp(Mx)
CGMY (x) = C
1x<0 +
1x>0 , C, G, M > 0, 0 < Y < 2.
|x|1+Y
x 1+Y


As can be easily seen, the Lvy measure CGMY satisfies the condition (4) with = Y .
The characteristic function of XT is given by


(u) = e[eiuXT ] = exp iuT + TC (Y )[(M iu)Y M Y + (G + iu)Y GY ] ,

where
= r C (Y )[(M 1)Y M Y + (G + 1)Y GY
ensures that {ert eXt }t0 is a martingale. Suppose the stock price follows the model
St = S0 eXt ,
then due to (2), for any R < 0, the price of the European put option is given by
rT

e
where

erT
e[(K ST ) ] =
2
+


F [g](iR u)F [p](u iR)du,

(6)

Unbiased Simulation of Distributions with Explicitly

F [g](iR u) =

239

K 1R eiu ln K
, F [p](u iR) = ei(uiR) ln S0 e[ei(uiR)XT ].
(iu + R 1)(iu + R)

To ensure the finiteness of F [p](u iR), we have to select an R such that G <
R < 0. In fact, under such R,

eRx CGMY (x)dx < +,
|x|1

which is equivalent to E[eRXT ] < + (see Sato [24], Theorem 25.17). Therefore,
|F [p](u iR)| eR ln S0 E[eRXT ] < +.
Lemma 1 implies that we can find constants , A, and such that Y , A > 0,
> 0, and
|F [p](u iR)| Ae

|u|

for sufficiently large u. So the following exponential power density


q(u) =

1
1

2 (1 + 1 )

|u|

can be used as the sampling density in (3).


We choose the values of , , and R to minimize the second moment of our
estimator, i.e., we solve the following optimization problem

min

G<R<0,,

Eq


|F [g](iR U)|2 |F [p](U iR))|2
, U q().
q2 (U)

Since the expectation usually does not have the explicit form, we propose the
following stochastic optimization algorithm to solve the problem.
.
Step 1 Noting that W = |U| is gamma distributed with the density
qW (w) =

( 1 )

w 1 e , w > 0,
1

we first generate n i.i.d. samples Wi ( 1 , ) and i.i.d. samples Ri which have


1

equal probability to be 1 or 1. Then Ui = Wi Ri have the common distribution


function q.
Step 2 Obtain the optimal parameters by

240

D. Belomestny et al.

arg

min

G<R<0,,

N
1  |F [g](iR Ui )|2 |F [p](Ui iR))|2
N i=1
q2 (Ui )

We use the parameters C = 1, G = 5, M = 5, Y = 0.5, r = 0.1, S0 = K =


100, T = 1 from Feng and Lin [10] to calculate the price of the European put option.
The option price obtained via numerical integration was 10.2967. All numerical
experiments were conducted on a PC equipped with Intel Core i5 CPU at 2.50 GHz
with 8 GB RAM. Our numerical results are shown in Table 1 and compared with the
results obtained by PT method (given by Poirot and Tankov [22]) and KM method
(given by Kawai and Masuda [15]). The results show that our proposed scheme is
more efficient in option pricing.
Here we choose initial point R = 5, = Y , = 1/TC (Y ) and repeat
the above optimization scheme until the optimal solution converges. To assess how
sensitive the simulation efficiency is with respect to the choice of (R, , ), we also
run two more arbitrarily chosen value sets for the parameters, as shown in Tables 2
and 3. The performance is still better than existing methods in terms of RMSE.
European Put Option Under NIG Model
The NIG (Normal Inverse Gaussian) Lvy process can be constructed by subordinating Brownian Motion with an Inverse Gaussian process: (see Barndorff-Nielsen,
O. [3])

Table 1 Put option in CGMY model (R = 4.6, = 0.49, = 0.36)


No. of simulation Price
95 %-interval
RMSE
100,000

10.3073

400,000

10.2999

1,600,000

10.2970

100,000(PT)
100,000(KM)

11.6421
10.2938

[10.2896,
10.3251]
[10.2910,
10.3088]
[10.2926,
10.3014]
[9.3455, 13.9387]
[10.2016,
10.3861]

0.0091

0.06

0.0045

0.27

0.0023

1.05

3.5172
0.0471

0.03
13096.13

Table 2 Put option in CGMY model (R = 1.5, = 0.49, = 0.36)


No. of simulation Price
95 %-interval
RMSE
100,000

10.3074

400,000

10.2990

1,600,000

10.2958

[10.2705,
10.3444]
[10.2805,
10.3175]
[10.2866,
10.3050]

Time (s)

Time (s)

0.0188

0.07

0.0094

0.30

0.0047

1.07

Unbiased Simulation of Distributions with Explicitly

241

Table 3 Put option in CGMY model (R = 3, = 0.4, = 0.6)


No. of simulation Price
95 %-interval
RMSE
100,000

10.3062

400,000

10.2925

1,600,000

10.2961

[10.2614,
10.3511]
[10.2701,
10.3150]
[10.2849,
10.3073]

Time (s)

0.0229

0.07

0.0114

0.29

0.0057

1.08

Xt (a, , ) = Tt (, ) + W (Tt (, )),



where a = 2 + 2 and Tt (, ) is the Inverse Gaussian Lvy process defined by
Tt (, ) = inf{s > 0 : s + Bs = t}. We have

 

e[eiuXt ] = exp t a2 2 a2 ( + iu)2
and the corresponding Lvy measure NIG fulfils the condition (4) with = 1. Suppose the stock price is modelled by
St = S0 et+Xt ,
where



= r q ( a2 2 a2 ( + 1)2 )

ensures the martingale condition. Then for any a < R < 0, the price of European put option is given by
rT

erT
e[(K ST ) ] =
2
+


F [g](iR u)F [p](u iR)du,

where
F [g](iR u) =

K 1R eiu ln K
, F [p](u iR) = ei(uiR)(ln S0 +T ) e[ei(uiR)XT ].
(iu + R 1)(iu + R)

Lemma 1 implies that one can use the Laplace density


q(u) =

1 |u|
e
2

as the importance sampling density, where the parameter can be chosen by minimizing the simulated second moment.

242

D. Belomestny et al.

Table 4 Put option in NIG model (R = 9.3, = 2.4)


No. of
Price
95 %RMSE
Time
simulation
interval
100,000

4.5900

400,000

4.5896

1,600,000

4.5897

[4.5879,
4.5922]
[4.5886,
4.5907]
[4.5891,
4.5902]

RMSE
(direct)

Time

0.0011

0.06

0.0238

0.04

0.0006

0.23

0.0119

0.13

0.0003

0.92

0.0059

0.49

Chen et al. [9] used parameters a = 15, = 5, = 0.5, r = 0.03, S0 = K =


100, T = 0.5 to calculate the price of the European put option and obtained the value
4.5898. Table 4 shows numerical results with the same parameters and compares the
RMSE and the computational time of our method with those of the method direct
simulating the subordinator.
Barrier Option Under CGMY Model
The payoff function of barrier option with m monitoring time points is
(ST K)+ 1{LSt1 ...Stm U} ,
where St = S0 eXt and 0 < t1 < . . . < tm < T . According to (2), the option price is
equal to

erT
F [g](iR u)F [p](u iR)du,
(2 )m+1 Rm+1
where u = (u1 , . . . , um+1 ), R = (0, . . . , 0, R), 1 < R < M,
F [g](iR u) =

e(ium+1 ium +R) ln U e(ium+1 ium +R) ln L


e(ium+1 R+1) ln K
(ium+1 + R 1)(ium+1 + R)
ium+1 ium + R
e(iu2 iu1 ) ln U e(iu2 iu1 ) ln L
e(ium ium1 ) ln U e(ium ium1 ) ln L

ium ium1
iu2 iu2

and F [p](u iR) = eiu1 ln S0

m



(uj ) (um+1 iR).

j=1

We use parameters C = 1, G = 5, M = 5, Y = 1.5, r = 0.1, s = 100, K =


100, T = 2, U = 105, L = 95 and calculate the price of the barrier option when
there is only one monitoring time at t = 1. The benchmark price calculated by numerical integration is 1.2266. We use the method described in Sect. 2 with the sampling
density
h(u1 , u2 ) =

1
1/1

21

(1 +

1
)
1

|u1 |1
1

1
1/2

22

(1 +

1
)
2

|u2 |2
2

Unbiased Simulation of Distributions with Explicitly

243

Table 5 Barrier option in CGMY model (R = 1.1, 1 = 1.4, 1 = 0.9, 2 = 0.7, 2 = 0.2)
No. of simulation Price
95 %-interval
RMSE
Time (s)
100,000
400,000
1,600,000

1.2235
1.2260
1.2264

[1.2164, 1.2305]
[1.2225, 1.2295]
[1.2247, 1.2282]

0.0036
0.0018
0.0009

0.18
0.61
2.34

where 1 , 2 Y . The numerical results are presented in Table 5.


Acknowledgments The research by Denis Belomestny was made in IITP RAS and supported by
Russian Scientific Foundation grant (project N 14-50-00150). The second and third authors are grateful for the financial support of a GRF grant from HK SAR government (Grant ID: CUHK411113).

References
1. Abate, J., Whitt, W.: The fourier-series method for inverting transforms of probability distributions. Queueing Syst. 10(12), 587 (1992)
2. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis, vol. 57. Springer
Science & Business Media, New York (2007)
3. Barndorff-Nielsen, O.E.: Processes of normal inverse gaussian type. Financ. Stoch. 2(1), 4168
(1997)
4. Biagini, F., Bregman, Y., Meyer-Brandis, T.: Pricing of catastrophe insurance options written
on a loss index with reestimation. Insur.: Math. Econ. 43(2), 214222 (2008)
5. Carr, P., Madan, D.: Option valuation using the fast Fourier transform. J. Comput. Financ. 2(4),
6173 (1999)
6. Carr, P., Geman, H., Madan, D.B., Yor, M.: The fine structure of asset returns: an empirical
investigation. J. Bus. 75(2), 305333 (2002)
7. Chambers, J.M., Mallows, C.L., Stuck, B.: A method for simulating stable random variables.
J. Am. Stat. Assoc. 71(354), 340344 (1976)
8. Chen, N., Hong, L.J.: Monte Carlo simulation in financial engineering. In: Proceedings of the
39th Conference on Winter Simulation, pp. 919931. IEEE Press (2007)
9. Chen, Z., Feng, L., Lin, X.: Simulating Lvy processes from their characteristic functions and
financial applications. ACM Trans. Model. Comput. Simul. (TOMACS) 22(3), 14 (2012)
10. Feng, L., Lin, X.: Inverting analytic characteristic functions and financial applications. SIAM
J. Financ. Math. 4(1), 372398 (2013)
11. Feng, L., Linetsky, V.: Pricing discretely monitored barrier options and defaultable bonds in
Lvy process models: a fast Hilbert transform approach. Math. Financ. 18(3), 337384 (2008)
12. Glasserman, P.: Monte Carlo Methods in Financial Engineering, vol. 53. Springer, New York
(2004)
13. Glasserman, P., Liu, Z.: Sensitivity estimates from characteristic functions. Oper. Res. 58(6),
16111623 (2010)
14. Hurd, T.R., Zhou, Z.: A Fourier transform method for spread option pricing. SIAM J. Financ.
Math. 1(1), 142157 (2010)
15. Kawai, R., Masuda, H.: On simulation of tempered stable random variates. J. Comput. Appl.
Math. 235(8), 28732887 (2011)
16. Kou, S., Petrella, G., Wang, H.: Pricing path-dependent options with jump risk via Laplace
transforms. Kyoto Econ. Rev. 74(1), 123 (2005)

244

D. Belomestny et al.

17. Kwok, Y.K., Leung, K.S., Wong, H.Y.: Efficient options pricing using the fast Fourier transform.
Handbook of Computational Finance, pp. 579604. Springer, Heidelberg (2012)
18. LEcuyer, P.: Non-uniform random variate generations. International Encyclopedia of Statistical Science, pp. 991995. Springer, New York (2011)
19. Lee, R.W., et al.: Option pricing by transform methods: extensions, unification and error control.
J. Comput. Financ. 7(3), 5186 (2004)
20. Lewis, A.L.: A simple option formula for general jump-diffusion and other exponential Lvy
processes. Available at SSRN 282110 (2001)
21. Lord, R., Kahl, C.: Optimal Fourier inversion in semi-analytical option pricing (2007)
22. Poirot, J., Tankov, P.: Monte Carlo option pricing for tempered stable (CGMY) processes.
Asia-Pac. Financ. Mark. 13(4), 327344 (2006)
23. Rudin, W.: Real and Complex Analysis. Tata McGraw-Hill Education, New York (1987)
24. Sato, K.I.: Lvy Processes and Infinitely Divisible Distributions. Cambridge University Press,
Cambridge (1999)
25. Staum, J.: Monte Carlo computation in finance. Monte Carlo and Quasi-Monte Carlo Methods
2008, pp. 1942. Springer, New York (2009)
26. Tankov, P.: Financial Modelling with Jump Processes, vol. 2. CRC Press, Boca Raton (2004)

Central Limit Theorem for Adaptive


Multilevel Splitting Estimators
in an Idealized Setting
Charles-Edouard Brhier, Ludovic Goudenge and Loc Tudela

Abstract The Adaptive Multilevel Splitting (AMS) algorithm is a powerful and versatile iterative method to estimate the probabilities of rare events. We prove a new central limit theorem for the associated AMS estimators introduced in [5], and which have been recently revisited in [3], the main result there being (non-asymptotic) unbiasedness of the estimators. To prove asymptotic normality, we rely on and extend the technique presented in [3]: the (asymptotic) analysis of an integral equation. Numerical simulations illustrate the convergence and the construction of Gaussian confidence intervals.

Keywords Monte-Carlo simulation · Rare events · Multilevel splitting · Central limit theorem

Mathematics Subject Classification: 65C05 · 65C35 · 60F05

1 Introduction
Many models from physics, chemistry or biology involve stochastic systems for
different purposes: taking into account uncertainty with respect to data parameters,
C.-E. Bréhier (B)
Université Paris-Est, CERMICS (ENPC), 6-8-10 Avenue Blaise Pascal,
Cité Descartes, 77455 Marne-la-Vallée, France
e-mail: brehierc@cermics.enpc.fr
C.-E. Bréhier
INRIA Paris-Rocquencourt, Domaine de Voluceau - Rocquencourt,
B.P. 105, 78153 Le Chesnay, France
L. Goudenège
Fédération de Mathématiques de l'École Centrale Paris, CNRS,
Grande voie des Vignes, 92295 Châtenay-Malabry, France
e-mail: goudenege@math.cnrs.fr
L. Tudela
Ensae ParisTech, 3 Avenue Pierre Larousse, 92240 Malakoff, France
e-mail: loic.tudela@ensae-paristech.fr
© Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_10


or allowing for dynamical phase transitions between different configurations of the system. This phenomenon, often referred to as metastability, is observed, for instance, when one studies a d-dimensional overdamped Langevin dynamics
$$dX_t = -\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t,$$
associated with a potential function V with several local minima. Here W denotes a d-dimensional standard Wiener process. When the inverse temperature $\beta$ increases, the transitions become rare events (their probability decreases exponentially fast).
In this paper, we adopt a numerical point of view and analyze a method which outperforms a pure Monte-Carlo method for a given computational effort in the small probability regime (in terms of relative error). Two important families of methods have been introduced in the 1950s and have since been extensively developed in order to efficiently address this rare event estimation problem: importance sampling, and importance/multilevel splitting; see [11], and [9] for a more recent treatment. We refer for instance to [12] for a more general presentation.
The method we study in this work is a multilevel splitting algorithm. The main advantage of this kind of method is that it is non-intrusive: the model does not need to be modified in order to obtain a more efficient Monte-Carlo method. The method we study has an additional feature: adaptive computations (of levels) are made on-the-fly. To explain more precisely the algorithm and its properties, from now on we only focus on a simpler, generic setting for the rare event estimation problem.
Let X be a real random variable, and let a be a given threshold. We want to estimate the tail probability $p := P(X > a)$. The splitting strategy, in the regime when a becomes large, consists in introducing the following decomposition of p as a product of conditional probabilities:
$$P(X > a) = P(X > a_n \mid X > a_{n-1}) \cdots P(X > a_2 \mid X > a_1)\, P(X > a_1),$$
for a sequence of levels $a_1 < \cdots < a_{n-1} < a_n = a$. The common interpretation of this formula is that the event $\{X > a\}$ is split into n conditional probabilities for X, which are each much larger than p, and are thus easier to estimate.
To optimize the variance, the levels must be chosen such that all the conditional probabilities are equal to $p^{1/n}$, with n as large as possible. However, levels satisfying this condition are not known a priori in practical cases.
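As an illustration of this fixed-level splitting decomposition, the following minimal sketch (ours, not part of the original paper; the function names and the exponential toy example are illustrative assumptions) estimates each conditional factor by Monte-Carlo in the idealized setting where one can sample from the conditional distributions:

```python
import numpy as np

def fixed_level_splitting(sample_cond, levels, m=10**4, rng=None):
    """Estimate p = P(X > a) as a product of conditional probabilities.

    sample_cond(x, size, rng) returns samples of L(X | X > x);
    levels is the increasing list a_1 < ... < a_n = a.
    """
    rng = np.random.default_rng(rng)
    p_hat, x = 1.0, -np.inf
    for a_i in levels:
        y = sample_cond(x, m, rng)        # sample from L(X | X > x)
        p_hat *= np.mean(y > a_i)         # estimate P(X > a_i | X > x)
        x = a_i
    return p_hat

# Toy example: X ~ Exp(1), so L(X | X > x) = x + Exp(1) and P(X > a) = exp(-a).
sample_cond = lambda x, size, rng: max(x, 0.0) + rng.exponential(size=size)
print(fixed_level_splitting(sample_cond, levels=[2.0, 4.0, 6.0]))  # close to exp(-6)
```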
Notice that, in principle, to apply this splitting strategy, one needs to know how to
sample according to the conditional distributions appearing in the splitting formula.
If this condition holds, we say that we are in an idealized setting.
Adaptive techniques based on multilevel splitting, where the levels are computed
on-the-fly, have been introduced in the 2000s in various contexts, under different
names: Adaptive Multilevel Splitting (AMS) [5-7], Subset simulation [2] and Nested
sampling [13] for instance.


In this paper, we focus on the versions of the AMS algorithm studied in [3], following [5]. Such algorithms depend on two parameters: a number of (interacting) replicas n, and a fixed integer $k \in \{1, \ldots, n-1\}$, such that a proportion k/n of replicas are killed and resampled at each iteration. The version with k = 1 has been studied in [10], and is also (in the idealized setting) a special case of the Adaptive Last Particle Algorithm of [14].
A family of estimators $(\hat{p}^{n,k})_{n \geq 2,\, 1 \leq k \leq n-1}$ is introduced in [3]; see (2) and (3). The main property established there is unbiasedness: for all values n and k the equality $E[\hat{p}^{n,k}] = p$ holds true; note that this statement is not an asymptotic result. Moreover, an analysis of the computational cost is provided there, in the regime $n \to +\infty$, with fixed k. However, comparisons, when k changes, are made using a cumbersome procedure: M independent realizations of the algorithm are necessary to define a new estimator, as an empirical mean of $\hat{p}^{n,k}_1, \ldots, \hat{p}^{n,k}_M$, and finally one studies the limit when $M \to +\infty$. The aim of this paper is to remove this procedure: we prove directly an asymptotic normality result for the estimator $\hat{p}^{n,k}$, when $n \to +\infty$, with fixed k. Such a result allows one to rely directly on asymptotic Gaussian confidence intervals.
Note that other central limit theorems for Adaptive Multilevel Splitting estimators (in different parameter regimes for n and k) have been obtained in [4, 5, 8].
The main result of this paper is Theorem 1: if k and a are fixed, under the assumption that the cumulative distribution function of X is continuous, when $n \to +\infty$, the random variable $\sqrt{n}\,(\hat{p}^{n,k} - p)$ converges in law to a centered Gaussian random variable, with variance $-p^2 \log(p)$ (independent of k).
The main novelty of the paper is the treatment of the case k > 1: indeed, when k = 1 (see [10]) the law of the estimator is explicitly known (it involves a Poisson random variable with parameter $-n \log(p)$): the asymptotic normality of $\log(\hat{p}^{n,1})$ is a consequence of a straightforward computation, and the central limit theorem for $\hat{p}^{n,1}$ easily follows using the delta-method. When k > 1, the law is more complicated and not explicitly known; the key idea is to prove that the characteristic function of $\log(\hat{p}^{n,k})$ satisfies a functional equation, following the strategy in [3]; the basic ingredient is a decomposition according to the first step of the algorithm.
One of the main messages of this paper is thus that the functional equation technique is a powerful tool to prove several key properties of the AMS algorithm in the idealized setting: unbiasedness and asymptotic normality.
The paper is organized as follows. In Sect. 2, we introduce the main objects:
the idealized setting (Sect. 2.1) and the AMS algorithm (Sect. 2.2). Our main result
(Theorem 1) is stated in Sect. 2.3. Section 3 is devoted to the detailed proof of this
result. Finally Sect. 4 contains a numerical illustration of the Theorem.


2 Adaptive Multilevel Splitting Algorithms

2.1 Setting

Let X be a real random variable. We assume that X > 0 almost surely. The aim is the estimation of the probability $p = P(X > a)$, where a > 0 is a threshold. When a goes to $+\infty$, p goes to 0. More generally, we introduce the conditional probability, for $0 \leq x \leq a$,
$$P(x) = P(X > a \mid X > x). \qquad (1)$$
Note that the quantity of interest satisfies $p = P(0)$; moreover $P(a) = 1$.
Let F denote the cumulative distribution function of X: $F(x) = P(X \leq x)$ for $x \in \mathbb{R}$.
The following standard assumption [3, 5] is crucial for the study in this paper.
Assumption 1 The function F is assumed to be continuous.

2.2 The AMS Algorithm

The algorithm depends on two parameters:
- the number of replicas $n \geq 2$;
- the number $k \in \{1, \ldots, n-1\}$ of replicas that are resampled at each iteration.

The other necessary parameters are the stopping threshold a and the initial condition $x \in [0, a]$. On the one hand, in practice, one applies the algorithm with x = 0 to estimate p. On the other hand, introducing an additional variable x for the initial condition is a key tool for the theoretical analysis of the algorithm.
In the sequel, when a random variable $X_i^j$ is written, the subscript i denotes the index in $\{1, \ldots, n\}$ of a particle, and the superscript j denotes the iteration of the algorithm.
In the algorithm below and in the following, we use classical notations for kth order statistics. For $Y = (Y_1, \ldots, Y_n)$ independent and identically distributed (i.i.d.) real valued random variables with continuous cumulative distribution function, there exists almost surely a unique (random) permutation $\sigma$ of $\{1, \ldots, n\}$ such that $Y_{\sigma(1)} < \cdots < Y_{\sigma(n)}$. For any $k \in \{1, \ldots, n\}$, we then use the classical notation $Y_{(k)} = Y_{\sigma(k)}$ to denote the kth order statistic of the sample Y.
We are now in position to describe the Adaptive Multilevel Splitting (AMS) algorithm.
Algorithm 1 (Adaptive Multilevel Splitting)
Initialization: Define $Z^0 = x$. Sample n i.i.d. realizations $X_1^0, \ldots, X_n^0$, with the law $\mathcal{L}(X \mid X > x)$.
Define $Z^1 = X_{(k)}^0$, the kth order statistic of the sample $X^0 = (X_1^0, \ldots, X_n^0)$, and $\sigma^1$ the (a.s.) unique associated permutation: $X^0_{\sigma^1(1)} < \cdots < X^0_{\sigma^1(n)}$.
Set j = 1.
Iterations (on $j \geq 1$): While $Z^j < a$:
- Conditionally on $Z^j$, sample k new independent random variables $(Y_1^j, \ldots, Y_k^j)$, according to the conditional distribution $\mathcal{L}(X \mid X > Z^j)$.
- Set
$$X_i^j = \begin{cases} Y^j_{(\sigma^j)^{-1}(i)} & \text{if } (\sigma^j)^{-1}(i) \leq k, \\ X_i^{j-1} & \text{if } (\sigma^j)^{-1}(i) > k. \end{cases}$$
In other words, the particle with index i is killed and resampled according to the law $\mathcal{L}(X \mid X > Z^j)$ if $X_i^{j-1} \leq Z^j$, and remains unchanged if $X_i^{j-1} > Z^j$. Notice that the condition $(\sigma^j)^{-1}(i) \leq k$ is equivalent to $i \in \{\sigma^j(1), \ldots, \sigma^j(k)\}$.
- Define $Z^{j+1} = X_{(k)}^j$, the kth order statistic of the sample $X^j = (X_1^j, \ldots, X_n^j)$, and $\sigma^{j+1}$ the (a.s.) unique associated permutation: $X^j_{\sigma^{j+1}(1)} < \cdots < X^j_{\sigma^{j+1}(n)}$.
- Finally increment $j \to j + 1$.
End of the algorithm: Define $J^{n,k}(x) = j - 1$ as the (random) number of iterations. Notice that $J^{n,k}(x)$ is such that $Z^{J^{n,k}(x)} < a$ and $Z^{J^{n,k}(x)+1} \geq a$.
For a schematic representation of the algorithm, we refer for instance to [5].
We are now in position to define the estimator $\hat{p}^{n,k}(x)$ of the probability P(x):
$$\hat{p}^{n,k}(x) = C^{n,k}(x)\left(1 - \frac{k}{n}\right)^{J^{n,k}(x)} \qquad (2)$$
with
$$C^{n,k}(x) = \frac{1}{n}\,\mathrm{Card}\left\{\, i \,;\, X_i^{J^{n,k}(x)} \geq a \right\}. \qquad (3)$$
When x = 0, to simplify notations we set $\hat{p}^{n,k} = \hat{p}^{n,k}(0)$.
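For concreteness, a minimal implementation of Algorithm 1 and of the estimator (2)-(3) might look as follows. This is our own sketch, not part of the paper; the function and variable names are ours, and the resampling uses the shifted-exponential form of $\mathcal{L}(X \mid X > z)$, which is valid only in the idealized exponential setting of Assumption 2 below.

```python
import numpy as np

def ams_estimator(n, k, a, x=0.0, rng=None):
    """One realization of the AMS estimator p_hat^{n,k}(x) for X ~ Exp(1)."""
    rng = np.random.default_rng(rng)
    X = x + rng.exponential(size=n)       # n i.i.d. samples of L(X | X > x)
    J = 0                                  # number of iterations
    Z = np.sort(X)[k - 1]                  # kth order statistic
    while Z < a:
        idx = np.argsort(X)[:k]            # the k lowest replicas (those <= Z)
        X[idx] = Z + rng.exponential(size=k)   # resample them above the level Z
        J += 1
        Z = np.sort(X)[k - 1]
    C = np.mean(X >= a)                    # C^{n,k}(x), see (3)
    return C * (1.0 - k / n) ** J          # p_hat^{n,k}(x), see (2)

# Example: estimate p = exp(-6) with n = 10**4 replicas and k = 10
print(ams_estimator(n=10**4, k=10, a=6.0))
```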

2.3 The Central Limit Theorem

The main result of the paper is the following asymptotic normality statement.
Theorem 1 Under Assumption 1, for any fixed $k \in \mathbb{N}^*$ and $a \in \mathbb{R}^+$, the following convergence in distribution holds true:
$$\sqrt{n}\left(\hat{p}^{n,k} - p\right) \xrightarrow[n \to +\infty]{} \mathcal{N}\left(0, -p^2 \log(p)\right). \qquad (4)$$


Notice that the asymptotic variance does not depend on k. As a consequence of this result, one can define asymptotic Gaussian confidence intervals, for one realization of the algorithm and $n \to +\infty$. However, the speed of convergence is not known and may depend on the estimated probability p and on the parameter k.
Thanks to Theorem 1, we can study the cost of the use of one realization of the AMS algorithm to obtain a given accuracy when $n \to +\infty$. In [3], the cost was analyzed when using a sample of M independent realizations of the algorithm, giving an empirical estimator, and the analysis was based on an asymptotic analysis of the variance in the large n limit.
Let $\varepsilon$ be some fixed tolerance error, and let $\alpha > 0$. Denote by $r_\alpha$ the value such that $P(Z \in [-r_\alpha, r_\alpha]) = 1 - \alpha$, where Z is a standard Gaussian random variable. Then for n large, an asymptotic confidence interval with level $1 - \alpha$, centered around $\hat{p}$, is
$$\left[\hat{p} - r_\alpha\sqrt{\frac{-p^2\log(p)}{n}},\; \hat{p} + r_\alpha\sqrt{\frac{-p^2\log(p)}{n}}\right].$$
Then the $\varepsilon$-error criterion $|\hat{p}^{n,k} - p| \leq \varepsilon$ is achieved for n of size $\dfrac{-p^2\log(p)\, r_\alpha^2}{\varepsilon^2}$.
However, on average one realization of the AMS algorithm requires a number of steps of the order $-n\log(p)/k$, with k random variables sampled at each iteration (see [3]). Another source of cost is the sorting of the replicas at initialization, and the insertion at each iteration of the k new sampled replicas in the sorted ensemble of the non-resampled ones. Thus the cost to achieve an accuracy of size $\varepsilon$ is, in the large n regime, of size $n\log(n)$ with n of the order $-p^2\log(p)\, r_\alpha^2/\varepsilon^2$, which does not depend on k.
This cost can be compared with the one when using a pure Monte-Carlo approximation, with an ensemble of non-interacting replicas of size n: thanks to the Central Limit Theorem, the tolerance criterion error $\varepsilon$ is satisfied for n of size $\dfrac{p(1-p)\, r_\alpha^2}{\varepsilon^2}$.
Despite the log(n) factor in the AMS case, the performance is improved since $-p^2\log(p) = o(p)$ when $p \to 0$.
Remark 1 In [3], the authors are able to analyze the effect of the change of k on the asymptotic variance of the estimator. Here, we do not observe significant differences when k changes, theoretically and numerically.
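As a sketch of how Theorem 1 can be used in practice (our own illustration, not part of the paper), an asymptotic confidence interval can be built from a single AMS run by plugging the estimate into the limiting variance $-p^2\log(p)$:

```python
import numpy as np
from scipy.stats import norm

def ams_confidence_interval(p_hat, n, alpha=0.05):
    """Asymptotic (1 - alpha) confidence interval for p from one AMS run,
    using the CLT variance -p^2 log(p) with p replaced by its estimate."""
    r_alpha = norm.ppf(1.0 - alpha / 2.0)
    half_width = r_alpha * np.sqrt(-p_hat**2 * np.log(p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Example with a hypothetical estimate close to exp(-6) and n = 10**4 replicas
print(ams_confidence_interval(p_hat=2.5e-3, n=10**4))
```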

3 Proof of the Central Limit Theorem

The proof is divided into the following steps. First, thanks to Assumption 1, we explain why, in order to theoretically study the statistical behavior of the algorithm, it is sufficient to study the case when X is distributed according to the exponential law with parameter 1: $P(X > z) = \exp(-z)$ for any z > 0. The second step is the introduction of the characteristic function of $\log(\hat{p}^{n,k}(x))$; then, following the definition of the algorithm, we prove that it is a solution of a functional equation with respect to x, which can be transformed into a linear ODE of order k. Finally, we study the solution of this ODE in the limit $n \to +\infty$.


3.1 Reduction to the Exponential Case

We first recall arguments from [3] which prove that it is sufficient to study the statistical behavior of Algorithm 1 and of the estimator (2) in a special case (Assumption 2 below); the more general result, Theorem 1 (valid under Assumption 1), is deduced from that special case.
It is sufficient to study the case when the random variable X is exponentially distributed with parameter 1. This observation is based on a change of variable with the following function:
$$\Lambda(x) = -\log\left(1 - F(x)\right). \qquad (5)$$
It is well-known that F(X) is uniformly distributed on (0, 1) (thanks to the continuity Assumption 1), and thus $\Lambda(X)$ is exponentially distributed with parameter 1. Thanks to Corollary 3.4 in [3], this property has the following consequence for the study of the AMS algorithm: the law of the estimator $\hat{p}^{n,k}$ is equal to the law of $\hat{q}^{n,k}$, which is the estimator defined, with (2), using the same values of the parameters n and k, but with two differences. First, the law of the underlying random variable is the exponential distribution with parameter 1; second, the stopping level a is replaced with $\Lambda(a)$, where $\Lambda$ is defined by (5). Note the following consistency: $E[\hat{q}^{n,k}] = \exp(-\Lambda(a)) = 1 - F(a) = p$ (by the unbiasedness result of [3]).
Since the arguments are intricate, we do not repeat them here and we refer the interested reader to [3]; from now on, we thus assume the following.
Assumption 2 Assume that X is exponentially distributed with parameter 1: we denote $\mathcal{L}(X) = \mathcal{E}(1)$.
When Assumption 2 is satisfied, the analysis is simpler and the rest of the paper is devoted to the proof of the following Proposition 1.
Proposition 1 Under Assumption 2, the following convergence in distribution holds true:
$$\sqrt{n}\left(\hat{p}^{n,k} - p\right) \xrightarrow[n \to +\infty]{} \mathcal{N}\left(0, a\exp(-2a)\right). \qquad (6)$$
We emphasize again that even if the exponential case appears as a specific example (Assumption 2 obviously implies Assumption 1), giving a detailed proof of Proposition 1 is sufficient, thanks to Corollary 3.4 in [3], to obtain our main general result, Theorem 1. Since the exponential case is more convenient for the computations below, in the sequel we work under Assumption 2. Moreover, we abuse notation: we use the general notations from Sect. 2, even under Assumption 2.
The following notations will be useful:
- $f(z) = \exp(-z)\,1_{z>0}$ (resp. $F(z) = \left(1 - \exp(-z)\right)1_{z>0}$) is the density (resp. the cumulative distribution function) of the exponential law $\mathcal{E}(1)$ with parameter 1.
- $f_{n,k}(z) = k\binom{n}{k} F(z)^{k-1} f(z)\left(1 - F(z)\right)^{n-k}$ is the density of the kth order statistic $X_{(k)}$ of a sample $(X_1, \ldots, X_n)$, where the $X_i$ are independent and exponentially distributed, with parameter 1.


Finally, in order to deal with the conditional distributions $\mathcal{L}(X \mid X > x)$ (which, thanks to Assumption 2, is a shifted exponential distribution $x + \mathcal{E}(1)$) in the algorithm, we set for any $x \geq 0$ and any $y \geq 0$
$$f(y; x) = f(y - x), \quad F(y; x) = F(y - x), \quad f_{n,k}(y; x) = f_{n,k}(y - x),$$
$$F_{n,k}(y) = \int_{-\infty}^{y} f_{n,k}(z)\,dz, \quad F_{n,k}(y; x) = F_{n,k}(y - x). \qquad (7)$$
Straightforward computations (see also [3]) yield the following useful formulae:
$$\frac{d}{dx} f_{n,1}(y; x) = n f_{n,1}(y; x);$$
$$\text{for } k \in \{2, \ldots, n-1\}, \quad \frac{d}{dx} f_{n,k}(y; x) = (n-k+1)\left(f_{n,k}(y; x) - f_{n,k-1}(y; x)\right). \qquad (8)$$
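The identities (8) can be checked symbolically; the following sketch (ours, not from the paper) verifies them for the exponential law and small values of n and k using SymPy:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)

def f_nk(n, k, z):
    """Density of the kth order statistic of n i.i.d. Exp(1) variables."""
    F = 1 - sp.exp(-z)
    return k * sp.binomial(n, k) * F**(k - 1) * sp.exp(-z) * (1 - F)**(n - k)

n = 5
for k in range(2, n):  # second line of (8)
    lhs = sp.diff(f_nk(n, k, y - x), x)
    rhs = (n - k + 1) * (f_nk(n, k, y - x) - f_nk(n, k - 1, y - x))
    assert sp.simplify(lhs - rhs) == 0
# first line of (8): d/dx f_{n,1}(y; x) = n f_{n,1}(y; x)
assert sp.simplify(sp.diff(f_nk(n, 1, y - x), x) - n * f_nk(n, 1, y - x)) == 0
print("identities (8) verified for n =", n)
```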

3.2 Proof of Proposition 1

The first important idea is to prove Proposition 1 for all possible initial conditions $x \in [0, a]$, even if the value of interest is x = 0: in fact we prove the convergence
$$\sqrt{n}\left(\hat{p}^{n,k}(x) - P(x)\right) \xrightarrow[n \to +\infty]{} \mathcal{N}\left(0, (a-x)\exp(-2(a-x))\right). \qquad (9)$$
A natural idea is to introduce the characteristic function of $\hat{p}^{n,k}(x)$, and to follow the strategy developed in [3]. Nevertheless, we are not able to derive a useful functional equation with respect to the x variable. The strategy we adopt is to study the asymptotic normality of the logarithm $\log(\hat{p}^{n,k}(x))$ of the estimator, and to use a particular case of the delta-method (see for instance [15], Sect. 3): if for a sequence of real random variables $(\theta_n)_{n \in \mathbb{N}}$ and a real number $\theta \in \mathbb{R}$ one has $\sqrt{n}(\theta_n - \theta) \to \mathcal{N}(0, \sigma^2)$, then $\sqrt{n}\left(\exp(\theta_n) - \exp(\theta)\right) \to \mathcal{N}\left(0, \exp(2\theta)\sigma^2\right)$, where convergence is in distribution.
We thus introduce, for any $t \in \mathbb{R}$ and any $0 \leq x \leq a$,
$$\Phi^{n,k}(t, x) := E\left[\exp\left(it\sqrt{n}\left(\log(\hat{p}^{n,k}(x)) - \log(P(x))\right)\right)\right]. \qquad (10)$$
We also introduce an additional auxiliary function (using $P(x) = \exp(x - a)$)
$$\Psi^{n,k}(t, x) := E\left[\exp\left(it\sqrt{n}\log(\hat{p}^{n,k}(x))\right)\right] = \exp\left(it\sqrt{n}(x - a)\right)\Phi^{n,k}(t, x), \qquad (11)$$


for which Lemma 1 states a functional equation, with respect to the variable $x \in [0, a]$. By Lévy's Theorem, Proposition 1 is a straightforward consequence (choosing x = 0) of Proposition 2 below.
Proposition 2 For any $k \in \mathbb{N}^*$, any $0 \leq x \leq a$ and any $t \in \mathbb{R}$,
$$\Phi^{n,k}(t, x) \xrightarrow[n \to +\infty]{} \exp\left(\frac{t^2(x-a)}{2}\right). \qquad (12)$$

The rest of this section is devoted to the statement and the proof of four lemmas, and finally to the proof of Proposition 2.
Lemma 1 (Functional Equation) For any $n \in \mathbb{N}^*$ and any $k \in \{1, \ldots, n-1\}$, and for any $t \in \mathbb{R}$, the function $x \mapsto \Psi^{n,k}(t, x)$ is a solution of the following functional equation (with unknown $\Psi$): for any $0 \leq x \leq a$
$$\Psi(t, x) = e^{it\sqrt{n}\log(1-\frac{k}{n})}\int_x^a \Psi(t, y)\, f_{n,k}(y; x)\,dy \qquad (13)$$
$$\qquad\quad + \sum_{l=0}^{k-1} e^{it\sqrt{n}\log(1-\frac{l}{n})}\, P\left(S(x)^n_{(l)} < a \leq S(x)^n_{(l+1)}\right), \qquad (14)$$
where $(S(x)^n_j)_{1 \leq j \leq n}$ are i.i.d. with law $\mathcal{L}(X \mid X > x)$ and where $S(x)^n_{(l)}$ is the lth order statistic of this sample (with the convention $S(x)^n_{(0)} = x$).
Proof The idea (as in the proof of Proposition 4.2 in [3]) is to decompose the expectation according to the value of the first level $Z^1 = X^0_{(k)}$. On the event $\{Z^1 \geq a\} = \{J^{n,k}(x) = 0\}$, the algorithm stops and $\hat{p}^{n,k}(x) = \frac{n-l}{n}$ for the unique $l \in \{0, \ldots, k-1\}$ such that $S(x)^n_{(l)} < a \leq S(x)^n_{(l+1)}$. Thus
$$E\left[e^{it\sqrt{n}\log(\hat{p}^{n,k}(x))}\,1_{J^{n,k}(x)=0}\right] = \sum_{l=0}^{k-1} e^{it\sqrt{n}\log(1-\frac{l}{n})}\, P\left(S(x)^n_{(l)} < a \leq S(x)^n_{(l+1)}\right). \qquad (15)$$
If $Z^1 < a$, for the next iteration the algorithm restarts from $Z^1$, and
$$\begin{aligned}
E\left[e^{it\sqrt{n}\log(\hat{p}^{n,k}(x))}\,1_{J^{n,k}(x)>0}\right]
&= E\left[e^{it\sqrt{n}\log(1-\frac{k}{n})}\, E\left[e^{it\sqrt{n}\log\left(C^{n,k}(x)(1-\frac{k}{n})^{J^{n,k}(x)-1}\right)} \,\middle|\, Z^1\right] 1_{Z^1 < a}\right] \\
&= e^{it\sqrt{n}\log(1-\frac{k}{n})}\, E\left[E\left[e^{it\sqrt{n}\log(\hat{p}^{n,k}(Z^1))} \,\middle|\, Z^1\right] 1_{Z^1 < a}\right] \qquad (16) \\
&= e^{it\sqrt{n}\log(1-\frac{k}{n})}\, E\left[\Psi^{n,k}(t, Z^1)\, 1_{Z^1 < a}\right] \\
&= e^{it\sqrt{n}\log(1-\frac{k}{n})}\int_x^a \Psi^{n,k}(t, y)\, f_{n,k}(y; x)\,dy.
\end{aligned}$$
Then (13) follows from (15), (16) and the definition (11) of $\Psi^{n,k}$.


We exploit the functional equation (13) for $x \mapsto \Psi^{n,k}(t, x)$ to prove that this function is a solution of a linear Ordinary Differential Equation (ODE).
Lemma 2 (ODE) Let n and $k \in \{1, \ldots, n-2\}$ be fixed. There exist real numbers $\lambda^{n,k}$ and $(r_m^{n,k})_{0 \leq m \leq k-1}$, depending only on n and k, such that for all $t \in \mathbb{R}$ the function $x \mapsto \Psi^{n,k}(t, x)$ satisfies the following linear Ordinary Differential Equation (ODE) of order k: for $x \in [0, a]$
$$\frac{d^k}{dx^k}\Psi^{n,k}(t, x) = e^{it\sqrt{n}\log(1-\frac{k}{n})}\,\lambda^{n,k}\,\Psi^{n,k}(t, x) + \sum_{m=0}^{k-1} r_m^{n,k}\,\frac{d^m}{dx^m}\Psi^{n,k}(t, x). \qquad (17)$$
The coefficients $\lambda^{n,k}$ and $(r_m^{n,k})_{0 \leq m \leq k-1}$ satisfy the following properties:
$$\lambda^{n,k} = (-1)^k\, n \cdots (n-k+1), \qquad \mu^k - \sum_{m=0}^{k-1} r_m^{n,k}\,\mu^m = (\mu - n)\cdots(\mu - n + k - 1) \quad \text{for all } \mu \in \mathbb{R}. \qquad (18)$$

Observe that the ODE (17) is linear and that the coefficients are constant (with respect to the variable $x \in [0, a]$, for fixed parameters n, k and t). This nice property is the main reason why we consider the function $\Psi^{n,k}$ (given by (11)) instead of $\Phi^{n,k}$ (given by (10)); moreover it is also the reason why we study the characteristic function of $\log(\hat{p}^{n,k}(x))$ instead of the one of $\hat{p}^{n,k}(x)$.
Proof The proof follows the same lines as Proposition 6.4 in [3]. We introduce
$$\Theta^{n,k}(t, x) := \sum_{l=0}^{k-1} e^{it\sqrt{n}\log(1-\frac{l}{n})}\, P\left(S(x)^n_{(l)} < a \leq S(x)^n_{(l+1)}\right).$$

Then by recursion, using the second line in (8), for $0 \leq l \leq k-1$ and for any $x \leq a$ and $t \in \mathbb{R}$,
$$\frac{d^l}{dx^l}\left(\Psi^{n,k}(t, x) - \Theta^{n,k}(t, x)\right) = \lambda_l^{n,k}\, e^{it\sqrt{n}\log(1-\frac{k}{n})}\int_x^a \Psi^{n,k}(t, y)\, f_{n,k-l}(y; x)\,dy + \sum_{m=0}^{l-1} r_{m,l}^{n,k}\,\frac{d^m}{dx^m}\left(\Psi^{n,k}(t, x) - \Theta^{n,k}(t, x)\right), \qquad (19)$$
with the associated recursion
$$\lambda_0^{n,k} = 1, \quad \lambda_{l+1}^{n,k} = -(n-k+l+1)\,\lambda_l^{n,k};$$
$$r_{0,l+1}^{n,k} = -(n-k+l+1)\, r_{0,l}^{n,k}, \text{ if } l > 0, \qquad r_{m,l+1}^{n,k} = r_{m-1,l}^{n,k} - (n-k+l+1)\, r_{m,l}^{n,k}, \quad 1 \leq m \leq l, \qquad r_{l,l+1}^{n,k} = (n-k+l+1) + r_{l-1,l}^{n,k}. \qquad (20)$$


Using (19) for $l = k-1$ and the first line of (8), one eventually obtains, by differentiation, an ODE of order k:
$$\frac{d^k}{dx^k}\left(\Psi^{n,k}(t, x) - \Theta^{n,k}(t, x)\right) = e^{it\sqrt{n}\log(1-\frac{k}{n})}\,\lambda^{n,k}\,\Psi^{n,k}(t, x) + \sum_{m=0}^{k-1} r_m^{n,k}\,\frac{d^m}{dx^m}\left(\Psi^{n,k}(t, x) - \Theta^{n,k}(t, x)\right), \qquad (21)$$
with $\lambda^{n,k} := \lambda_k^{n,k}$ and $r_m^{n,k} := r_{m,k}^{n,k}$.
It is key to observe that the coefficients $\lambda^{n,k}$ and $(r_m^{n,k})_{0 \leq m \leq k-1}$ are defined by the same recursion as in [3]. In particular, they do not depend on the parameter $t \in \mathbb{R}$. For a proof of (18), we refer to Sect. 6.4 in [3].
It is clear that the polynomial equality in (18) is equivalent to the following identity: for all $j \in \{0, \ldots, k-1\}$
$$\frac{d^k}{dx^k}\exp\left((n-k+j+1)(x-a)\right) = \sum_{m=0}^{k-1} r_m^{n,k}\,\frac{d^m}{dx^m}\exp\left((n-k+j+1)(x-a)\right).$$
Due to the definition of the cumulative distribution functions of order statistics (7), one easily checks that $\Theta^{n,k}(t, \cdot)$ is a linear combination of the exponential functions $x \mapsto \exp(nx), \ldots, \exp((n-k+1)x)$; therefore
$$\frac{d^k}{dx^k}\Theta^{n,k}(t, x) = \sum_{m=0}^{k-1} r_m^{n,k}\,\frac{d^m}{dx^m}\Theta^{n,k}(t, x).$$
Thus the terms depending on $\Theta^{n,k}$ in (21) cancel out, and (17) holds true.

The next steps are to give an explicit expression of the solution of (17) as a linear combination of exponential functions, and to study the coefficients and the modes in the asymptotic regime $n \to +\infty$. Since the ODE is of order k, in order to uniquely determine the solution, more information is required: we need to know the derivatives of order $0, 1, \ldots, k-1$ of $x \mapsto \Psi^{n,k}(t, x)$ at some point. We choose the terminal point x = a (notice that by the change of variable $x \mapsto a-x$ the ODE (17) can then be seen as an ODE with an initial condition). This is the content of Lemma 3 below.
Lemma 3 (Terminal condition) For any fixed $k \in \{1, \ldots\}$ and any $t \in \mathbb{R}$, we have
$$\Psi^{n,k}(t, a) = 1, \qquad \frac{d^m}{dx^m}\Psi^{n,k}(t, x)\Big|_{x=a} = O\!\left(\tfrac{1}{\sqrt{n}}\right) n^m \quad \text{if } m \in \{1, \ldots, k-1\}. \qquad (22)$$
Proof The equality $\Psi^{n,k}(t, a) = 1$ is trivial, since $\hat{p}^{n,k}(a) = 1$. Equations (19) and (21) immediately imply (by recursion) that for $1 \leq m \leq k-1$




$$\frac{d^m}{dx^m}\Psi^{n,k}(t, x)\Big|_{x=a} = \frac{d^m}{dx^m}\Theta^{n,k}(t, x)\Big|_{x=a}.$$
Introduce the following decomposition:
$$\begin{aligned}
\Theta^{n,k}(t, x) &= \sum_{l=0}^{k-1} e^{it\sqrt{n}\log(1-\frac{l}{n})}\, P\left(S(x)^n_{(l)} < a \leq S(x)^n_{(l+1)}\right) \\
&= \sum_{l=0}^{k-1}\left(e^{it\sqrt{n}\log(1-\frac{l}{n})} - 1\right)\left(F_{n,l}(a; x) - F_{n,l+1}(a; x)\right) + \sum_{l=0}^{k-1} P\left(S(x)^n_{(l)} < a \leq S(x)^n_{(l+1)}\right) \\
&=: \widetilde{\Theta}^{n,k}(t, x) + 1 - F_{n,k}(a; x),
\end{aligned}$$
where $F_{n,l}$ denotes the cumulative distribution function of the lth order statistic (with the convention $F_{n,0}(a; x) = 1$ for $x \leq a$), see (7).
Thanks to (8) and a simple recursion on l, it is easy to prove that for any $0 \leq l \leq k$ and any $m \geq 1$
$$\frac{d^m}{dx^m} F_{n,l}(a; x)\Big|_{x=a} = O(n^m); \qquad (23)$$
this immediately yields
$$\frac{d^m}{dx^m}\widetilde{\Theta}^{n,k}(t, x)\Big|_{x=a} = O\!\left(\tfrac{1}{\sqrt{n}}\right) n^m.$$
In fact, it is possible to prove a stronger result: if $1 \leq l \leq k$ and $0 \leq m < l$ then
$$\frac{d^m}{dx^m} F_{n,l}(a; x)\Big|_{x=a} = 0,$$
by recursion on l and using (8) recursively on m. We thus obtain, for $1 \leq m \leq k-1$,
$$\frac{d^m}{dx^m}\left(1 - F_{n,k}(a; x)\right)\Big|_{x=a} = 0.$$
This concludes the proof of Lemma 3.


The last result we require is given by Lemma 4.
Lemma 4 (Asymptotic expansion) Let $k \in \{1, \ldots\}$ and $t \in \mathbb{R}$ be fixed. Then, for n large enough, we have
$$\Psi^{n,k}(t, x) = \sum_{l=1}^{k} \gamma^l_{n,k}(t)\, e^{\mu^l_{n,k}(t)(x-a)}, \qquad (24)$$


for complex coefficients satisfying:
$$\mu^1_{n,k}(t) = it\sqrt{n} + \frac{t^2}{2} + o(1), \qquad \gamma^1_{n,k}(t) \xrightarrow[n \to +\infty]{} 1, \qquad (25)$$
and for $2 \leq l \leq k$
$$\mu^l_{n,k}(t) \underset{n \to +\infty}{\sim} n\left(1 - e^{\frac{2i\pi(l-1)}{k}}\right), \qquad \gamma^l_{n,k}(t) \xrightarrow[n \to +\infty]{} 0. \qquad (26)$$

Proof We denote by $(\mu^l_{n,k}(t))_{1 \leq l \leq k}$ the roots of the characteristic equation associated with the linear ODE with constant coefficients (17) (with unknown $\mu \in \mathbb{C}$): thanks to (18),
$$\frac{(n-\mu)\cdots(n-k+1-\mu)}{n\cdots(n-k+1)} - e^{it\sqrt{n}\log(1-\frac{k}{n})} = 0.$$
By the continuity property of the roots of a complex polynomial of degree k with respect to its coefficients, we have
$$\nu^l_{n,k}(t) := \frac{\mu^l_{n,k}(t)}{n} \xrightarrow[n \to +\infty]{} \nu^l(t),$$
where $(\nu^l(t))_{1 \leq l \leq k}$ are the roots of $(1-\nu)^k = 1$: thus $\mu^1_{n,k}(t) = o(n)$ and, for $2 \leq l \leq k$,
$$\mu^l_{n,k}(t) \underset{n \to +\infty}{\sim} n\left(1 - e^{\frac{2i\pi(l-1)}{k}}\right).$$
To study more precisely the asymptotic behavior of $\mu^1_{n,k}(t)$, we postulate the ansatz
$$\mu^1_{n,k}(t) = c_t\sqrt{n} + d_t + o(1).$$
We then identify the coefficients $c_t = it$ and $d_t = t^2/2$ thanks to the expansions
$$\left(1 - \frac{c_t}{\sqrt{n}} - \frac{d_t}{n} + o\!\left(\frac{1}{n}\right)\right)^{k} = 1 - \frac{c_t k}{\sqrt{n}} - \frac{d_t k - \binom{k}{2}c_t^2}{n} + o\!\left(\frac{1}{n}\right),$$
$$e^{it\sqrt{n}\log(1-\frac{k}{n})} = 1 - \frac{itk}{\sqrt{n}} - \frac{t^2k^2}{2n} + o\!\left(\frac{1}{n}\right).$$
In particular, for n large enough, the $(\mu^l_{n,k}(t))_{1 \leq l \leq k}$ are pairwise distinct, and (24) follows.
Then the coefficients $(\gamma^l_{n,k}(t))_{1 \leq l \leq k}$ are solutions of the following linear system of k equations:


$$\begin{cases}
\gamma^1_{n,k}(t) + \cdots + \gamma^k_{n,k}(t) = \Psi^{n,k}(t, a), \\
\gamma^1_{n,k}(t)\,\nu^1_{n,k}(t) + \cdots + \gamma^k_{n,k}(t)\,\nu^k_{n,k}(t) = \dfrac{1}{n}\dfrac{d}{dx}\Psi^{n,k}(t, a), \\
\quad\vdots \\
\gamma^1_{n,k}(t)\left(\nu^1_{n,k}(t)\right)^{k-1} + \cdots + \gamma^k_{n,k}(t)\left(\nu^k_{n,k}(t)\right)^{k-1} = \dfrac{1}{n^{k-1}}\dfrac{d^{k-1}}{dx^{k-1}}\Psi^{n,k}(t, a).
\end{cases} \qquad (27)$$
Using Cramer's rule, we express each $\gamma^l_{n,k}(t)$ as a ratio of determinants (the denominator is a Vandermonde determinant and is non-zero when n is large enough). For $l \in \{2, \ldots, k\}$, we have
$$\gamma^l_{n,k}(t) = \frac{\det\left(M^l_{n,k}(t)\right)}{V\left(\nu^1_{n,k}(t), \ldots, \nu^k_{n,k}(t)\right)} \xrightarrow[n \to +\infty]{} 0,$$
where the matrix
$$M^l_{n,k}(t) = \begin{pmatrix}
1 & \cdots & 1 & \cdots & 1 \\
\nu^1_{n,k}(t) & \cdots & O\!\left(\tfrac{1}{\sqrt{n}}\right) & \cdots & \nu^k_{n,k}(t) \\
\vdots & & \vdots & & \vdots \\
\left(\nu^1_{n,k}(t)\right)^{k-1} & \cdots & O\!\left(\tfrac{1}{\sqrt{n}}\right) & \cdots & \left(\nu^k_{n,k}(t)\right)^{k-1}
\end{pmatrix}$$
(the lth column of the Vandermonde matrix is replaced by the right-hand side of (27)) is such that $\det(M^l_{n,k}(t)) \to 0$ (since $\nu^1_{n,k}(t) \to 0$), while the denominator is the Vandermonde determinant $V\left(\nu^1_{n,k}(t), \ldots, \nu^k_{n,k}(t)\right) \to V\left(\nu^1(t), \ldots, \nu^k(t)\right) \neq 0$.
Finally, $\gamma^1_{n,k}(t) = \Psi^{n,k}(t, a) - \sum_{l=2}^{k} \gamma^l_{n,k}(t) \xrightarrow[n \to +\infty]{} 1$. This concludes the proof of Lemma 4.




We are now in position to prove Proposition 2. Indeed, recall that $\Psi^{n,k}(t, x) = \exp\left(it\sqrt{n}(x-a)\right)\Phi^{n,k}(t, x)$ thanks to (10) and (11). Then taking the limit $n \to +\infty$ thanks to Lemma 4 gives the convergence (12) of the characteristic function $\Phi^{n,k}$.

4 Numerical Results
In this section, we provide a numerical illustration of the Central Limit Theorem 1. We apply the algorithm with an exponentially distributed random variable with parameter 1; this is justified by the discussion in Sect. 3.1.
In the simulations below, the estimated probability is $e^{-6}$ ($\approx 2.48 \times 10^{-3}$).
In Fig. 1, we fix the value k = 10, and we show histograms for $n = 10^2, 10^3, 10^4$, with different values for the number M of independent realizations of the algorithm, such that $n M = 10^8$ (we thus have empirical variance of the same order for all cases). In Fig. 2, we give the associated Q-Q plots, where the empirical quantiles of


Fig. 1 Histograms for k = 10 and $p = \exp(-6)$: $n = 10^2, 10^3, 10^4$ from left to right

Fig. 2 Q-Q plot for k = 10 and $p = \exp(-6)$: $n = 10^2, 10^3, 10^4$ from left to right

Fig. 3 Histograms for $n = 10^4$ and $p = \exp(-6)$: k = 1, 10, 100 from left to right

the sample are compared with the exact quantiles of the standard Gaussian random variable (after normalization).
In Fig. 3, we show histograms for $M = 10^4$ independent realizations of the AMS algorithm with $n = 10^4$ and $k \in \{1, 10, 100\}$; we also provide the associated Q-Q plots in Fig. 4.
From Figs. 1 and 2, we observe that when n increases, the normality of the estimator is confirmed. Moreover, from Figs. 3 and 4, no significant difference is observed when k varies.
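A sketch of the experiment behind these figures (our own reconstruction, reusing the ams_estimator sketch given after (2)-(3); the only simulation parameters taken from the paper are those stated in the text):

```python
import numpy as np
import matplotlib.pyplot as plt

def clt_histogram(n, k, a=6.0, M=10**4, rng=None):
    """Histogram of sqrt(n)*(p_hat - p)/sigma over M independent AMS runs."""
    rng = np.random.default_rng(rng)
    p = np.exp(-a)
    sigma = np.sqrt(-p**2 * np.log(p))           # CLT standard deviation
    estimates = np.array([ams_estimator(n, k, a, rng=rng) for _ in range(M)])
    normalized = np.sqrt(n) * (estimates - p) / sigma
    plt.hist(normalized, bins=50, density=True)  # should approach N(0, 1)
    plt.title(f"n = {n}, k = {k}")
    plt.show()

clt_histogram(n=10**3, k=10)
```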


Fig. 4 Q-Q plot for $n = 10^4$ and $p = \exp(-6)$: k = 1, 10, 100 from left to right

Acknowledgments C.-E. B. would like to thank G. Samaey, T. Lelièvre and M. Rousset for the invitation to give a talk on the topic of this paper at the 11th MCQMC Conference, in the special session on Mathematical aspects of Monte Carlo methods for molecular dynamics. We would also like to thank the referees for suggestions which improved the presentation of the paper.

References
1. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis. Springer, New York (2007)
2. Au, S.K., Beck, J.L.: Estimation of small failure probabilities in high dimensions by subset simulation. J. Probab. Eng. Mech. 16, 263-277 (2001)
3. Bréhier, C.E., Lelièvre, T., Rousset, M.: Analysis of adaptive multilevel splitting algorithms in an idealized case. ESAIM Probab. Stat., to appear
4. Cérou, F., Del Moral, P., Furon, T., Guyader, A.: Sequential Monte Carlo for rare event estimation. Stat. Comput. 22(3), 795-808 (2012)
5. Cérou, F., Guyader, A.: Adaptive multilevel splitting for rare event analysis. Stoch. Anal. Appl. 25(2), 417-443 (2007)
6. Cérou, F., Guyader, A.: Adaptive particle techniques and rare event estimation. In: Conférence Oxford sur les méthodes de Monte Carlo séquentielles, ESAIM Proceedings, vol. 19, pp. 65-72. EDP Sci., Les Ulis (2007)
7. Cérou, F., Guyader, A., Lelièvre, T., Pommier, D.: A multiple replica approach to simulate reactive trajectories. J. Chem. Phys. 134, 054108 (2011)
8. Cérou, F., Guyader, A., Del Moral, P., Malrieu, F.: Fluctuations of adaptive multilevel splitting. e-preprints (2014)
9. Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: Multilevel splitting for estimating rare event probabilities. Oper. Res. 47(4), 585-600 (1999)
10. Guyader, A., Hengartner, N., Matzner-Løber, E.: Simulation and estimation of extreme quantiles and extreme probabilities. Appl. Math. Optim. 64(2), 171-196 (2011)
11. Kahn, H., Harris, T.E.: Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. 12, 27-30 (1951)
12. Rubino, G., Tuffin, B.: Rare Event Simulation using Monte Carlo Methods. Wiley, Chichester (2009)
13. Skilling, J.: Nested sampling for general Bayesian computation. Bayesian Anal. 1(4), 833-859 (2006)
14. Simonnet, E.: Combinatorial analysis of the adaptive last particle method. Stat. Comput. (2014)
15. van der Vaart, A.W.: Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics, vol. 3. Cambridge University Press, Cambridge (1998)

Comparison Between LS-Sequences and β-Adic van der Corput Sequences

Ingrid Carbone

Abstract In 2011 the author introduced a generalization of van der Corput sequences, the so-called LS-sequences, defined for integers L, S such that $L \geq 1$, $S \geq 0$, $L + S \geq 2$, where $\gamma \in \,]0, 1[$ is the positive solution of $S\gamma^2 + L\gamma = 1$. These sequences coincide with the classical van der Corput sequences whenever S = 0, are uniformly distributed for all L, S and have low discrepancy when $L \geq S$. In this paper we compare the LS-sequences and the β-adic van der Corput sequences, where $\beta > 1$ is the Pisot root of $x^2 - Lx - L$. Using a suitable numeration system $G = \{G_n\}_{n \geq 0}$, where the base sequence is the linear recurrence of order two $G_{n+2} = L G_{n+1} + L G_n$, with initial conditions $G_0 = 1$ and $G_1 = L + 1$, we prove that when L = S the (L, L)-sequence, with $L\gamma^2 + L\gamma = 1$, and the β-adic van der Corput sequence, with $\beta = 1/\gamma$ and $\beta^2 = L\beta + L$, can be obtained from each other by a permutation. In particular for $\beta = \varphi$, the golden ratio, the β-adic van der Corput sequence coincides with the Kakutani–Fibonacci sequence obtained for L = S = 1, which has already been studied.

Keywords Uniform distribution · Discrepancy · Numeration systems · van der Corput sequences

1 Introduction

In this paper we compare two classes of low discrepancy sequences which have been introduced relatively recently and which have an interesting overlap.
We are interested in β-adic van der Corput sequences, which have been introduced in [3, 22, 23]. Their motivation stems from algebraic arguments and is related to the β-adic representation of real numbers introduced by [24]. They have been studied
I. Carbone (B)
Department of Mathematics and Informatics, University of Calabria,
Ponte P. Bucci Cubo 30B, 87036, Arcavacata di Rende, Cosenza, Italy
e-mail: i.carbone@unical.it
© Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_11

261

262

I. Carbone

quite extensively. For good references on the subject we suggest [3, 18, 22, 23]. For
the original definition of van der Corput sequence see [25].
The other, more recent, class is represented by the LS-sequences, which have been introduced in [4] and have been the object of several papers. These sequences have a more geometric motivation and are related to a generalization of an idea of Kakutani ([20]), which appeared in [26]. For another generalization of the Kakutani splitting procedure to the multidimensional setting see [8]; for two possible generalizations of LS-sequences to dimension 2, see [6]. Other papers dedicated to the subject are [9-11, 19].
As has been shown in [4], each LS-sequence of points corresponds to the reordering by a suitable algorithm of the points defining a specific sequence of partitions of [0, 1[, which depends on two nonnegative integers L and S, such that $L + S \geq 2$, $L \geq 1$ and $S \geq 0$. These sequences will be defined in the next section. An interesting role is played by the (1, 1)-sequence, called in [4] the Kakutani–Fibonacci sequence of points, which has also been studied from an ergodic point of view in [7, 18].
Each β-adic van der Corput sequence is associated to a characteristic equation $x^d = a_0 x^{d-1} + a_1 x^{d-2} + \cdots + a_{d-1}$, with some restrictions on the coefficients. These restrictions imply, when d = 2, that $a_1 = a_0$ or $a_1 = a_0 + 1$. In the latter case the β-adic sequences are nothing else but the classical van der Corput sequences with base $b = a_0 + 1$.
This paper is concerned with the study of the interesting overlap between β-adic sequences of order two (with $a_0 = a_1$) and the corresponding LS-sequences for L = S.
It should be noted that both families of sequences are much richer: the β-adic sequences can be defined for any order $d \geq 2$ (with appropriate restrictions on the coefficients), while the LS-sequences can be defined for any pair of positive integers L, S and have low discrepancy whenever $L \geq S$.
The main result is Theorem 2, which states that for L = S the (L, L)-sequence and the β-adic van der Corput sequence, which corresponds to the positive root of $x^2 - Lx - L$, can be obtained from each other by a permutation. In particular, when L = S = 1, the Kakutani–Fibonacci (1, 1)-sequence coincides with the β-adic van der Corput sequence where $\beta = \varphi$ is the golden ratio, i.e. the positive root of $x^2 - x - 1$ (see also [18] for more details).
β-adic sequences and LS-sequences provide, in dimension 1, low-discrepancy sequences.
Having new low-discrepancy sequences at our disposal, it is important to obtain a more complete understanding of their behavior in order to use them in the Quasi-Monte Carlo method, pairing them à la Halton (as has been done in [16] and in [17]). This is the main motivation of this paper.
For the LS-sequences the problem has been posed for the first time in [6]. It should be noted that partial negative results have been obtained quite recently by [2]. This is one of the most interesting open problems concerning LS-sequences. On the other hand, a recent result ([18]) proved uniform distribution of the Halton-type sequences for β-adic van der Corput sequences.


2 Preliminaries and Results

We recall that, given a sequence $\{x_n\}_{n \geq 1} \subset [0, 1[$, its discrepancy is defined by the sequence $\{D_N\}_{N \geq 1}$, where
$$D_N = D(x_1, \ldots, x_N) = \sup_{0 \leq a < b \leq 1}\left|\frac{1}{N}\sum_{j=1}^{N} 1_{[a, b[}(x_j) - (b - a)\right|.$$
We say that $\{x_n\}_{n \geq 1}$ has low discrepancy if there exists a constant C such that $N D_N \leq C \log N$.
We recall that a sequence $\{x_n\}_{n \geq 1}$ is uniformly distributed if $D_N \to 0$ as N tends to infinity. For extensive accounts on uniform distribution, discrepancy and applications see [12] and [21].
If we consider a sequence $\{\pi_n\}_{n \geq 1}$ of finite partitions of [0, 1[, with $\pi_n = \{[y_i^{(n)}, y_{i+1}^{(n)}[,\; 1 \leq i \leq t_n\}$, where $y_1^{(n)} = 0$ and $y_{t_n+1}^{(n)} = 1$, according to the definition above we say that its discrepancy $D(\pi_n)$ is the discrepancy of $\{Q_n\}_{n \geq 1}$, where $Q_n = \{y_1^{(n)}, \ldots, y_{t_n}^{(n)}\}$, and that $\{\pi_n\}_{n \geq 1}$ is uniformly distributed if $D(\pi_n) \to 0$ as n tends to infinity. Moreover, if there exists a constant C such that $t_n D(\pi_n) \leq C$, we say that $\{\pi_n\}_{n \geq 1}$ has low discrepancy.
Definition 1 (Kakutani, [20]) Given a finite partition $\omega$ of [0, 1[ and given $\alpha \in \,]0, 1[$, its $\alpha$-refinement, denoted by $\alpha\omega$, is the partition obtained by subdividing all the intervals of $\omega$ having maximal length proportionally to $\alpha$ and $1 - \alpha$. The so-called Kakutani $\alpha$-sequence of partitions $\{\alpha^n\omega\}_{n \geq 1}$ is obtained by successive $\alpha$-refinements of $\omega$.
When $\omega$ is the trivial partition $\omega = \{[0, 1[\}$ of [0, 1[, the sequence $\{\alpha^n\omega\}_{n \geq 1}$ is uniformly distributed ([20]).
In [26] the following generalization of Kakutani's splitting procedure is given.
Definition 2 (Volčič, [26]) For a given non-trivial finite partition $\rho$ of [0, 1[, the $\rho$-refinement of a partition $\pi$ of [0, 1[ (denoted by $\rho\pi$) is obtained by subdividing all the intervals of $\pi$ having maximal length positively (or directly) homothetically to $\rho$. If for any $n \in \mathbb{N}$ we denote by $\rho^n\pi$ the $\rho$-refinement of $\rho^{n-1}\pi$, we get a sequence of partitions $\{\rho^n\pi\}_{n \geq 1}$, called the sequence of successive $\rho$-refinements of $\pi$.
Obviously, if $\rho = \{[0, \alpha[, [\alpha, 1[\}$, then the $\rho$-refinement is just Kakutani's $\alpha$-refinement.
In [26] Volčič proved that the sequence $\{\rho^n\omega\}_{n \geq 1}$ is uniformly distributed for any partition $\rho$, and in [1] the authors, solving a problem posed in [26], provided necessary and sufficient conditions on $\rho$ and $\pi$ under which the sequence $\{\rho^n\pi\}_{n \geq 1}$ is uniformly distributed.
The LS-sequences of partitions are a special case.


Definition 3 (Carbone, [4]) Let us fix two nonnegative integers $L \geq 1$ and $S \geq 0$ with $L + S \geq 2$ and let $\gamma$ be the positive solution of the quadratic equation $Sx^2 + Lx = 1$ (if S = 0, the equation is linear). Denote by $\rho_{L,S}$ the partition defined by L long intervals having length $\gamma$ followed by S short intervals having length $\gamma^2$ (if S = 0, all the L intervals have the same length $\gamma = 1/L$). The sequence of successive $\rho_{L,S}$-refinements of the trivial partition $\omega$ is called LS-sequence of partitions and is denoted by $\{\rho^n_{L,S}\omega\}_{n \geq 1}$ (or $\{\rho^n_{L,S}\}$ for short).
Whenever L = b and S = 0, we simply get the b-adic sequence of partitions in base b.
If we denote by $t_n$ the total number of intervals of $\rho^n_{L,S}$, by $l_n$ the number of its long intervals and by $s_n$ the number of its short intervals, it is very simple to see that
$$t_n = l_n + s_n, \qquad l_n = L\, l_{n-1} + s_{n-1}, \qquad s_n = S\, l_{n-1}.$$
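A small sketch (ours; the name gamma below stands for the root of $Sx^2 + Lx = 1$) that builds the LS-sequence of partitions by successive $\rho_{L,S}$-refinements and checks the counting recursions above:

```python
import numpy as np

def ls_refine(lengths, L, S, gamma):
    """One rho_{L,S}-refinement: split every interval of maximal length
    into L intervals of relative length gamma and S of relative length gamma**2."""
    lmax = max(lengths)
    out = []
    for h in lengths:
        if np.isclose(h, lmax):
            out += [h * gamma] * L + [h * gamma**2] * S
        else:
            out.append(h)
    return out

L = S = 1
gamma = (np.sqrt(5) - 1) / 2            # positive root of x**2 + x = 1
lengths, l_prev, s_prev = [1.0], 1, 0   # trivial partition: one "long" interval
for n in range(1, 8):
    lengths = ls_refine(lengths, L, S, gamma)
    l_n = sum(np.isclose(h, gamma**n) for h in lengths)        # long intervals
    s_n = sum(np.isclose(h, gamma**(n + 1)) for h in lengths)  # short intervals
    assert len(lengths) == l_n + s_n
    assert (l_n, s_n) == (L * l_prev + s_prev, S * l_prev)
    l_prev, s_prev = l_n, s_n
print("counts verified:", len(lengths), "intervals after 7 refinements")
```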
The sequence $\{\rho^n_{1,1}\}$, called in [4] the Kakutani–Fibonacci sequence of partitions, corresponds to L = S = 1 and $\gamma = \frac{1}{2}(\sqrt{5} - 1)$, which is the inverse of the golden ratio $\varphi$. It is a Kakutani $\alpha$-sequence, with $\alpha = \gamma$.
In Theorem 2.2 of [4] we gave explicit and very precise estimates of the discrepancy of LS-sequences of partitions, proving in particular that they have low discrepancy when $L \geq S$.
To each LS-sequence of partitions $\{\rho^n_{L,S}\}$ we associate the LS-sequence of points, denoted by $\{\xi^n_{L,S}\}$. Actually, the underlying geometric construction is strongly based on the partitions. We refer to [5] for further details. For the original definition, based on reordering the points of each partition $\rho^n_{L,S}$, see [4].
We define for every $0 \leq i \leq L-1$ the functions $\psi_i(x) = \gamma x + i\gamma$ restricted to $0 \leq x < 1$, and for every $L \leq i \leq L+S-1$ (and S > 0) the functions $\psi_i(x) = \gamma x + L\gamma + (i-L)\gamma^2$ restricted to $0 \leq x < \gamma$.
We denote by $E_{L,S}$ the set consisting of all the pairs of indices which correspond to the forbidden compositions $\psi_{i,j} = \psi_i \circ \psi_j$, i.e.
$$E_{L,S} = \{L, L+1, \ldots, L+S-1\} \times \{1, \ldots, L+S-1\}. \qquad (1)$$
If S = 0, the first factor is empty, so $E_{L,0} = \emptyset$.


We recall that any natural number $n \geq 1$ can be expressed in base b > 1 as
$$n = \sum_{k=0}^{M} a_k(n)\, b^k, \qquad (2)$$
with $a_k(n) \in \{0, 1, \ldots, b-1\}$ for all $0 \leq k \leq M$, and $M = \lfloor \log_b n \rfloor$ (here and in the sequel $\lfloor\cdot\rfloor$ denotes the integer part). The expression (2) leads to the representation of n in base b
$$[n]_b = a_M(n)\, a_{M-1}(n) \ldots a_0(n). \qquad (3)$$
If n = 0, we write $[0]_b = 0$. The representation of n in base b given by (2) is used to define the radical-inverse function $\phi_b$ on $\mathbb{N}$ which associates the number
$$\phi_b(n) = \sum_{k=0}^{M} a_k(n)\, b^{-k-1} \qquad (4)$$
to the string of digits (3), whose representation in base b is $0.a_0(n)\, a_1(n) \ldots a_M(n)$.
Of course $0 \leq \phi_b(n) < 1$ for all $n \geq 0$.
The sequence $\{\phi_b(n)\}_{n \geq 0}$ is the van der Corput sequence in base b.
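For reference, the radical-inverse function (4) and the van der Corput sequence admit a direct implementation (a standard construction, written here by us as an illustration):

```python
def radical_inverse(n, b):
    """phi_b(n): mirror the base-b digits of n about the radix point, see (4)."""
    phi, scale = 0.0, 1.0 / b
    while n > 0:
        n, digit = divmod(n, b)
        phi += digit * scale
        scale /= b
    return phi

# First points of the van der Corput sequence in base 2: 0, 1/2, 1/4, 3/4, 1/8, ...
print([radical_inverse(n, 2) for n in range(8)])
```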
Definition 4 (Carbone, [5]) Let $\gamma$ be the positive solution of the equation $Sx^2 + Lx = 1$. We denote by $\mathbb{N}_{L,S}$ the set of all positive integers n, ordered by magnitude, with $[n]_{L+S} = a_M(n)\, a_{M-1}(n) \ldots a_0(n)$ such that $(a_k(n), a_{k+1}(n)) \notin E_{L,S}$ for all $0 \leq k \leq M-1$. If S = 0, we have $\mathbb{N}_{L,S} = \mathbb{N}$. For all $n \in \mathbb{N}_{L,S}$ we define the LS-radical inverse function as follows:
$$\phi_{L,S}(n) = \sum_{k=0}^{M} \bar{a}_k(n)\, \gamma^{k+1}, \qquad (5)$$
where $\bar{a}_k(n) = a_k(n)$ if $0 \leq a_k(n) \leq L-1$ and $\bar{a}_k(n) = L + (a_k(n) - L)\gamma$ if $L \leq a_k(n) \leq L+S-1$. If S = 0, (5) coincides with the radical-inverse function (4).
Definition 5 (Carbone, [5]) The sequence $\{\phi_{L,S}(n)\}$ defined on $\mathbb{N}_{L,S}$ is the LS-sequence of points.
If the LS-sequence of partitions $\{\rho^n_{L,S}\}$ has low discrepancy, the corresponding LS-sequence of points $\{\phi_{L,S}(n)\}$ has low discrepancy, too. In fact, if we denote by $\Lambda^N_{L,S} = \{\xi^1_{L,S}, \xi^2_{L,S}, \ldots, \xi^N_{L,S}\}$ the first N elements of the sequence $\{\phi_{L,S}(n)\}$, we have the following result.
Theorem 1 (Carbone, [4])
(i) If $S \leq L$ there exists $k_1 > 0$ such that $N D(\Lambda^N_{L,S}) \leq k_1 \log N$ for any $N \in \mathbb{N}$.
(ii) If $S = L + 1$ there exist $k_2, k_2' > 0$ such that $k_2' \log N \leq N D(\Lambda^N_{L,S}) \leq k_2 \log^2 N$ for any $N \in \mathbb{N}$.
(iii) If $S \geq L + 2$ there exist $k_3, k_3' > 0$ such that $k_3' N^{\tau-1} \leq D(\Lambda^N_{L,S}) \leq k_3 N^{\tau-1} \log N$ for any $N \in \mathbb{N}$, where $\tau = \dfrac{\log(S\gamma)}{\log(\gamma^{-1})} > 0$.
Let us observe that if L = b and S = 0, the LS-sequence reduces to the van der Corput sequence in base b.
The simple case L = S = 1 has been widely studied. For a dynamical approach to the (1, 1)-sequence, see [7]. The sequence $\{\xi^n_{1,1}\}$, called in [5] the Kakutani–Fibonacci sequence of points, corresponds to the Kakutani–Fibonacci sequence of partitions $\{\rho^n_{1,1}\}$.
The set (1) reduces to {(1, 1)} and, according to Definition 4, $\mathbb{N}_{1,1}$ is the set of all natural numbers n such that the binary representation (3) does not contain two consecutive digits equal to 1. Moreover, the (1, 1)-radical inverse function defined by (5) on $\mathbb{N}_{1,1}$ is


$$\phi_{1,1}(n) = \sum_{k=0}^{M} a_k(n)\, \gamma^{k+1}, \qquad (6)$$
with the same coefficients $a_k(n)$ of the representation of n given by (3) for b = 2.
We will use the notation $\{\xi^n_{1,1}\}_{n \in \mathbb{N}}$ or $\{\phi_{1,1}(n)\}_{n \in \mathbb{N}_{1,1}}$ for the Kakutani–Fibonacci sequence of points.
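The Kakutani–Fibonacci radical inverse (6) can be computed directly from the binary digits. The following is our own sketch; GAMMA denotes the root of $x^2 + x = 1$:

```python
import math

GAMMA = (math.sqrt(5) - 1) / 2   # positive solution of x**2 + x = 1

def admissible(n):
    """n belongs to N_{1,1}: no two consecutive binary digits equal to 1."""
    return (n & (n >> 1)) == 0

def phi_11(n):
    """(1,1)-radical inverse (6): sum of a_k(n) * GAMMA**(k+1) over binary digits."""
    value, k = 0.0, 0
    while n > 0:
        value += (n & 1) * GAMMA ** (k + 1)
        n >>= 1
        k += 1
    return value

points = [phi_11(n) for n in range(30) if admissible(n)]
print(points[:8])
```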
We conclude this section with some basic notions on numeration systems with respect to a linear recurrence base sequence (for more details see [13]).
If $G = \{G_n\}_{n \geq 0}$ is an increasing sequence of natural numbers with $G_0 = 1$, any $n \in \mathbb{N}$ can be expanded with respect to this sequence as follows:
$$n = \sum_{k \geq 0} \varepsilon_k(n)\, G_k. \qquad (7)$$
This expansion is finite and unique if for every $N \in \mathbb{N}$ we have $\sum_{k=0}^{N} \varepsilon_k(n)\, G_k < G_{N+1}$. G is called a numeration system and (7) the G-expansion of n. The digits $\varepsilon_k$ can be computed by the greedy algorithm (see, for instance, [14]).
Let us consider now a special numeration system, where the base sequence is a linear recurrence of order $d \geq 1$, namely
$$G_{n+d} = a_0 G_{n+d-1} + \cdots + a_{d-1} G_n, \qquad n \geq 0, \qquad (8)$$
with $G_0 = 1$ and $G_k = a_0 G_{k-1} + \cdots + a_{k-1} G_0 + 1$ for k < d.
When the coefficients of the characteristic equation
$$x^d = a_0 x^{d-1} + \cdots + a_{d-1} \qquad (9)$$
associated to the linear recurrence (8) are decreasing, namely $a_0 \geq \cdots \geq a_{d-1} \geq 1$, we know that the largest root of (9) is a Pisot number.
We recall that a Pisot number is a real algebraic integer q > 1 such that all its Galois conjugates have absolute value strictly less than 1. If P(x) is a polynomial with exactly one Pisot number $\beta$ as a zero, $\beta$ is called the Pisot root of P.
The most famous example of a Pisot number is the golden ratio $\varphi$, which is the Pisot root of the equation $x^2 = x + 1$ associated to the numeration system $G = \{G_n\}_{n \geq 0}$, where $\{G_n\}_{n \geq 0}$ is the Fibonacci sequence.
Definition 6 (Barat–Grabner, [3]) If (7) is the G-expansion of the natural number n and $\beta$ is the Pisot root of (9), the sequence $\{\phi_\beta(n)\}_{n \geq 0}$, where $\phi_\beta$ is the β-adic Monna map defined by
$$\phi_\beta(n) = \sum_{k \geq 0} \varepsilon_k(n)\, \beta^{-k-1}, \qquad (10)$$
is called the β-adic van der Corput sequence.
If $\beta = b$ is a natural number greater than 1, the sequence $\{\phi_\beta(n)\}_{n \geq 0}$ is the classical van der Corput sequence in base b.
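A sketch (ours) of the greedy G-expansion (7) and of the β-adic Monna map (10), written for the order-two recurrence considered below, with $a_0 = a_1 = L$:

```python
def numeration_system(L, n_max):
    """Base sequence G_0 = 1, G_1 = L + 1, G_{k+2} = L*(G_{k+1} + G_k)."""
    G = [1, L + 1]
    while G[-1] <= n_max:
        G.append(L * (G[-1] + G[-2]))
    return G

def greedy_expansion(n, G):
    """Digits eps_k of the G-expansion (7), computed by the greedy algorithm."""
    digits = [0] * len(G)
    for k in reversed(range(len(G))):
        digits[k], n = divmod(n, G[k])
    return digits

def monna_map(n, L):
    """phi_beta(n) with beta the Pisot root of x**2 = L*x + L, see (10)."""
    beta = (L + (L**2 + 4 * L) ** 0.5) / 2
    G = numeration_system(L, n)
    eps = greedy_expansion(n, G)
    return sum(e * beta ** (-k - 1) for k, e in enumerate(eps))

print([round(monna_map(n, L=1), 6) for n in range(8)])   # golden-ratio case
```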

3 Results

In order to compare LS-sequences and β-adic van der Corput sequences, let us recall that the sequence $\{\phi_\beta(n)\}_{n \geq 0}$ defined by (10) is not necessarily contained and dense in [0, 1[. A partial answer can be found in [3], where it is proved that if $\beta$ is the Pisot root of the characteristic Eq. (9) associated to the numeration system G defined by (8), where $a_0 = \cdots = a_{d-1}$, then the sequence $\{\phi_\beta(n)\}_{n \geq 0}$ is uniformly distributed in [0, 1[ and has low discrepancy. In this case, the sequence is called the Multinacci sequence.
A complete answer has been given very recently by [18], where the authors proved the following result.
Lemma 1 (Hofer–Iacò–Tichy, [18]) Let $a = (a_0, \ldots, a_{d-1})$, where the integers $a_0, \ldots, a_{d-1} \geq 0$ are the coefficients of the numeration system G, and assume that the corresponding characteristic root $\beta$ satisfies (9). Furthermore, assume that there is no $b = (b_0, \ldots, b_{k-1})$ with k < d such that $\beta$ is the characteristic root of the polynomial defined by b. Then $\phi_\beta(\mathbb{N}) \subseteq [0, 1[$ and $\phi_\beta(\mathbb{N}) \nsubseteq [0, x[$ for any $0 < x < 1$ if and only if a can be written either as
$$a = (a_0, \ldots, a_0) \qquad (11)$$
or
$$a = (a_0, a_0 - 1, \ldots, a_0 - 1, a_0), \qquad (12)$$
where $a_0 > 0$.
We notice that the above lemma does not require the assumption of decreasing coefficients. In [18] it is also observed that, if the condition that d has to be minimal is dropped, then there exist two more cases in which the above theorem is satisfied. We are interested in the following case:
$$a = (a_0, \ldots, a_0, a_0 + 1). \qquad (13)$$

From now on we shall restrict our attention to the case d = 2, and consequently to (11) and (12). Let us consider the numeration system $G = \{G_n\}_{n \geq 0}$ defined by the linear recurrence of order d = 2
$$G_{n+2} = a_0 G_{n+1} + a_1 G_n, \qquad n \geq 0, \qquad (14)$$
with the initial conditions
$$G_0 = 1 \quad \text{and} \quad G_1 = a_0 + 1. \qquad (15)$$
According to [18], if $\beta$ is the solution of the characteristic equation $x^2 = a_0 x + a_1$, the β-adic van der Corput sequence $\{\phi_\beta(n)\}_{n \geq 0}$ is uniformly distributed if and only if $a_0 = a_1$ (and $\beta$ is not the root of any equation of order 1), or $a_1 = a_0 + 1$ (and $\beta$ is the root of the equation of order 1 associated to the linear recurrence $G_{n+1} = (a_0 + 1) G_n$).
At this point we come back to our LS-sequences and state our main result.
Theorem 2 When L = S, the LS-sequence $\{\xi^n_{L,L}\}$ is a reordering of the β-adic van der Corput sequence, where $1/\beta$ is the solution of the equation $Lx^2 + Lx = 1$.
Proof In the case L = S = 1, the Kakutani–Fibonacci sequence $\{\xi^n_{1,1}\}$ actually coincides with the β-adic van der Corput sequence, where $\beta = 1/\gamma$ is the golden ratio $\varphi$.
We know that $\{\xi^n_{1,1}\}$ can be written as $\{\phi_{1,1}(n)\}_{n \in \mathbb{N}_{1,1}}$ (see (6)), where $\mathbb{N}_{1,1}$ (see Definition 4) is the set of all the natural numbers whose binary representation (3) does not contain two consecutive digits equal to 1. Moreover, $\gamma = \frac{\sqrt{5}-1}{2}$ is the solution of the equation $\gamma + \gamma^2 = 1$. If we consider now the linear recurrence (14), namely $G_{n+2} = G_{n+1} + G_n$ with the initial conditions (15) given by $G_0 = 1$ and $G_1 = 2$, we have already noticed that the golden ratio $\varphi = \frac{1+\sqrt{5}}{2}$ is the solution of the equation $\varphi^2 = \varphi + 1$ and that $1/\varphi = \gamma$. Furthermore, it is clear that $\{G_n\}_{n \geq 0} = \{t_n\}_{n \geq 0}$, where $t_n$ is the total number of intervals of the nth partition of the Kakutani–Fibonacci sequence of partitions $\{\rho^n_{1,1}\}$ defined in Sect. 2, which satisfies $t_{n+2} = t_{n+1} + t_n$, with $t_0 = 1$ and $t_1 = 2$. Here $t_0 = 1$ corresponds to $\rho^0_{1,1}\omega = \{[0, 1[\}$.
The coefficients $\varepsilon_k(n)$ of the related β-adic van der Corput sequence $\{\phi_\beta(n)\}_{n \geq 0}$ defined by (10) can be evaluated with the greedy algorithm: it is very simple to see that $\varepsilon_k(n) \in \{0, 1\}$ and that the expansion (7) does not contain two consecutive coefficients equal to 1. In both representations, the β-adic Monna map and the (1, 1)-radical inverse function coincide on their domain, and the proof is complete in this case.
This result appears also in [18].
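As a quick numerical check of the L = S = 1 case (our own verification, combining the phi_11/admissible and monna_map sketches given earlier in this chapter), one can confirm that the point sets produced by (6) and by (10) coincide:

```python
# Reuses phi_11 / admissible (Kakutani-Fibonacci) and monna_map (beta-adic) from above.
kf = [phi_11(n) for n in range(200) if admissible(n)]      # n runs through N_{1,1}
vdc = [monna_map(m, L=1) for m in range(len(kf))]          # m runs through N
assert all(abs(a - b) < 1e-9 for a, b in zip(kf, vdc))
print("first", len(kf), "points of the two sequences coincide")
```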

Now we prove the statement of the theorem in the case $L = S \geq 2$, showing that the set of the images of the radical inverse function $\phi_{L,L}(n)$ defined by (5) coincides with the set of the images of the β-adic Monna map $\phi_\beta(n)$ defined by (10).
More precisely, we consider $n \in \mathbb{N}_{L,L}$. According to Definition 4, n has a representation $[n]_{2L} = a_M(n)\, a_{M-1}(n) \ldots a_0(n)$ in base 2L such that $(a_k(n), a_{k+1}(n)) \notin E_{L,L}$ for all $0 \leq k \leq M-1$, where $E_{L,L} = \{L, L+1, \ldots, 2L-1\} \times \{1, 2, \ldots, 2L-1\}$ (see (1)). For such $n \in \mathbb{N}_{L,L}$ we consider the (L, L)-sequence $\{\phi_{L,L}(n)\}$, where
$$\phi_{L,L}(n) = \sum_{k=0}^{M} \bar{a}_k(n)\, \gamma^{k+1}, \qquad (16)$$
with $\bar{a}_k(n) = a_k(n)$ if $0 \leq a_k(n) \leq L-1$ and $\bar{a}_k(n) = L + (a_k(n) - L)\gamma$ if $L \leq a_k(n) \leq 2L-1$, and where $L\gamma + L\gamma^2 = 1$.
We now restrict our attention to the digits $\bar{a}_k(n)$ in the case $L \leq a_k(n) \leq 2L-1$. If we put $a_k(n) = L + m$, with $0 \leq m \leq L-1$, we can write $\bar{a}_k(n) = L + m\gamma$. Consequently, we have $\bar{a}_k(n)\,\gamma^{k+1} = L\gamma^{k+1} + m\gamma^{k+2}$.
From the condition $(a_k(n), a_{k+1}(n)) \notin E_{L,L}$ we derive that $a_{k+1}(n)$ must be equal to 0, and that $a_{k-1}(n)$ has to belong to the set $\{0, 1, \ldots, L-1\}$. Three consecutive powers of $\gamma$ can be grouped in the partial sum
$$\bar{a}_{k-1}(n)\,\gamma^k + \bar{a}_k(n)\,\gamma^{k+1} + \bar{a}_{k+1}(n)\,\gamma^{k+2} = a_{k-1}(n)\,\gamma^k + L\gamma^{k+1} + m\gamma^{k+2},$$
and in (16) we also admit two consecutive digits belonging to the set $\{L\} \times \{1, \ldots, L-1\}$.
Taking the set $E_{L,L}$ into account, (16) can be written with new coefficients $a'_k(n)$, which are nonnegative integer numbers such that $(a'_k(n), a'_{k+1}(n)) \notin E'_{L,L}$, where
$$E'_{L,L} = E_{L,L} \setminus \left(\{L\} \times \{0, 1, \ldots, L-1\}\right) = \left(\{L+1, \ldots, 2L-1\} \times \{1, \ldots, 2L-1\}\right) \cup \left(\{L\} \times \{L, \ldots, 2L-1\}\right). \qquad (17)$$
Now we consider the β-adic van der Corput sequence $\{\phi_\beta(n)\}_{n \geq 0}$, where
$$\phi_\beta(n) = \sum_{k \geq 0} \varepsilon_k(n)\, \beta^{-k-1},$$
and $\beta = 1/\gamma$ is the Pisot root of $x^2 = a_0 x + a_0$, where $a_0 = L$, which is the characteristic equation associated to the numeration system $G = \{G_n\}_{n \geq 0}$, with $G_{n+2} = a_0(G_{n+1} + G_n)$ and initial conditions $G_0 = 1$ and $G_1 = a_0 + 1$.
By Theorem 2 of [13] we know that the digits $\varepsilon_k$ of the G-expansion (7) of the natural number n have to satisfy the condition $(\varepsilon_k, \varepsilon_{k+1}) \notin E'_{L,L}$, where $E'_{L,L}$ is defined by (17), and the theorem is completely proved.

It follows from Lemma 1 that in Theorem 2 we considered all the β-adic van der Corput sequences of order two, apart from the classical van der Corput sequences. On the other hand, there exist many other LS-sequences having low discrepancy.

References
1. Aistleitner, C., Hofer, M.: Uniform distribution of generalized Kakutani's sequences of partitions. Annali di Matematica Pura e Applicata (4) 192(4), 529-538 (2013)
2. Aistleitner, C., Hofer, M., Ziegler, V.: On the uniform distribution modulo 1 of multidimensional LS-sequences. Annali di Matematica Pura e Applicata (4) 193(5), 1329-1344 (2014)
3. Barat, G., Grabner, P.: Distribution properties of G-additive functions. J. Number Theory 60, 103-123 (1996)
4. Carbone, I.: Discrepancy of LS-sequences of partitions and points. Annali di Matematica Pura e Applicata (4) 191(4), 819-844 (2012)
5. Carbone, I.: Extension of van der Corput algorithm to LS-sequences. Appl. Math. Comput. 255, 2072013 (2015)
6. Carbone, I., Iacò, M.R., Volčič, A.: LS-sequences of points in the unit square. Submitted, arXiv:1211.2941 (2012)
7. Carbone, I., Iacò, M.R., Volčič, A.: A dynamical system approach to the Kakutani-Fibonacci sequence. Ergod. Theory Dyn. Syst. 34(6), 1794-1806 (2014)
8. Carbone, I., Volčič, A.: Kakutani splitting procedure in higher dimension. Rendiconti dell'Istituto Matematico dell'Università di Trieste 39, 119-126 (2007)
9. Carbone, I., Volčič, A.: A von Neumann theorem for uniformly distributed sequences of partitions. Rendiconti del Circolo Matematico di Palermo 60(1-2), 83-88 (2011)
10. Chersi, F., Volčič, A.: λ-equidistributed sequences of partitions and a theorem of the de Bruijn-Post type. Annali di Matematica Pura e Applicata (4) 162, 23-32 (1992)
11. Drmota, M., Infusino, M.: On the discrepancy of some generalized Kakutani's sequences of partitions. Unif. Distrib. Theory 7(1), 75-104 (2012)
12. Drmota, M., Tichy, R.F.: Sequences, Discrepancies and Applications. Lecture Notes in Mathematics. Springer, Berlin (1997)
13. Fraenkel, A.S.: Systems of numeration. Am. Math. Mon. 92(2), 105-114 (1985)
14. Frougny, C., Solomyak, B.: Finite beta-expansions. Ergod. Theory Dyn. Syst. 12, 713-723 (1992)
15. Grabner, P., Hellekalek, P., Liardet, P.: The dynamical point of view of low-discrepancy sequences. Unif. Distrib. Theory 7(1), 11-70 (2012)
16. Halton, J.H.: On the efficiency of certain quasi-random sequences of points in evaluating multidimensional integrals. Numerische Mathematik 2, 84-90 (1960)
17. Hammersley, J.M.: Monte-Carlo methods for solving multivariate problems. Ann. N. Y. Acad. Sci. 86, 844-874 (1960)
18. Hofer, M., Iacò, M.R., Tichy, R.: Ergodic properties of the β-adic Halton sequences. Ergod. Theory Dyn. Syst. 35, 895-909 (2015)
19. Infusino, M., Volčič, A.: Uniform distribution on fractals. Unif. Distrib. Theory 4(2), 47-58 (2009)
20. Kakutani, S.: A problem on equidistribution on the unit interval [0, 1[. In: Measure Theory (Proc. Conf., Oberwolfach, 1975), Lecture Notes in Mathematics 541, pp. 369-375. Springer, Berlin (1976)
21. Kuipers, L., Niederreiter, H.: Uniform Distribution of Sequences. Pure and Applied Mathematics. Wiley, New York (1974)
22. Ninomiya, S.: Constructing a new class of low-discrepancy sequences by using the β-adic transformation. IMACS Seminar on Monte Carlo Methods (Brussels, 1997). Math. Comput. Simul. 47(2-5), 403-418 (1998)
23. Ninomiya, S.: On the discrepancy of the β-adic van der Corput sequence. J. Math. Sci. 5, 345-366 (1998)
24. Rényi, A.: Representations for real numbers and their ergodic properties. Acta Mathematica Academiae Scientiarum Hungaricae 8, 477-493 (1957)
25. van der Corput, J.G.: Verteilungsfunktionen. Proc. Koninklijke Nederlandse Akademie van Wetenschappen 38, 813-821 (1935)
26. Volčič, A.: A generalization of Kakutani's splitting procedure. Annali di Matematica Pura e Applicata (4) 190(1), 45-54 (2011)

Computational Higher Order Quasi-Monte Carlo Integration

Robert N. Gantner and Christoph Schwab

Abstract The efficient construction of higher-order interlaced polynomial lattice rules introduced recently in [Dick et al., SIAM Journal on Numerical Analysis, 52(6):2676-2702, 2014] is considered, and the computational performance of these higher-order QMC rules is investigated on a suite of parametric, high-dimensional test integrand functions. After reviewing the principles of their construction by the fast component-by-component (CBC) algorithm due to Nuyens and Cools as well as recent theoretical results on their convergence rates from [Dick, J., Kuo, F.Y., Le Gia, Q.T., Nuyens, D., Schwab, C.: Higher order QMC Petrov-Galerkin discretization for affine parametric operator equations with random field inputs. SIAM J. Numer. Anal. 52(6) (2014), pp. 2676-2702], we indicate algorithmic aspects and implementation details of their efficient construction. Instances of higher-order QMC quadrature rules are applied to several high-dimensional test integrands which belong to weighted function spaces with weights of product and of SPOD type. Practical considerations that lead to improved quantitative convergence behavior for various classes of test integrands are reported. The use of (analytic or numerical) estimates on the Walsh coefficients of the integrand provides quantitative improvements of the convergence behavior. The sharpness of theoretical, asymptotic bounds on memory usage and operation counts, with respect to the number of QMC points N and to the dimension s of the integration domain, is verified experimentally to hold starting with dimension as low as s = 10 and with N = 128. The efficiency of the proposed algorithms for computation of the generating vectors is investigated for the considered classes of functions in dimensions s = 10, ..., 1000. A pruning procedure for components of the generating vector is proposed and computationally investigated. The use of pruning is shown to yield quantitative improvements in the QMC error, but also to not affect the asymptotic convergence rate, consistent with recent theoretical findings from [Dick, J., Kritzer, P.: On a projection-corrected component-by-component construction. Journal of Complexity (2015). DOI 10.1016/j.jco.2015.08.001].

Keywords Quasi-Monte Carlo · Higher order polynomial lattice rules

R.N. Gantner (B) · C. Schwab
Seminar for Applied Mathematics, ETH Zürich, Rämistrasse 101, Zurich, Switzerland
e-mail: robert.gantner@sam.math.ethz.ch
C. Schwab
e-mail: christoph.schwab@sam.math.ethz.ch
© Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_12

1 Introduction
The efficient approximation of high-dimensional integrals is a core task in many areas
of scientific computing. We mention only uncertainty quantification, computational
finance, computational physics and chemistry, and computational biology. In particular, high-dimensional integrals arise in the computation of statistical quantities of
solutions to partial differential equations with random inputs.
In addition to efficient spatial and temporal discretizations of partial differential
equation models, it is important to devise high-dimensional quadrature schemes that
are able to exploit an implicitly lower-dimensional structure in parametric input data
and solutions of such PDEs. The rate of convergence of Monte Carlo (MC) methods is
dimension-robust, i.e. the convergence rate bound holds with constants independent
of the problem dimension provided that the variances are bounded independent of
the dimension, but it is limited to 1/2. Thus it is important to devise integration
methods which converge of higher order than 1/2, independent of the dimension of
the integration domain.
In recent years, numerous approaches to achieve this type of higher-order convergence have been proposed; we mention only quasi Monte-Carlo integration, adaptive
Smolyak quadrature, adaptive polynomial chaos discretizations, and related methods.
In the present paper, we consider the realization of novel higher-order interlaced
polynomial lattice rules introduced in [6, 10, 11], which allow an integrand-adapted
construction of a quasi-Monte Carlo quadrature rule that exploits sparsity of the
parameter-to-solution map. We consider in what follows the problem of integrating
a function f : [0, 1)s R of s variables y1 , . . . , ys over the s-dimensional unit
cube,

f (y1 , . . . , ys ) dy1 dys .
(1)
I [ f ] :=
[0,1)s

Exact computation quickly becomes infeasible and we must, in most applications,


resort to an approximation of (1) by a quadrature rule. We focus on quasi-Monte Carlo
quadrature rules; more specifically, we consider interlaced polynomial lattice point
sets for functions in weighted spaces with weights of product and smoothness-driven
product-and-order-dependent (SPOD) type. Denoting the interlaced polynomial lattice point set by P = {x (0) , . . . , x (N 1) } with x (n) [0, 1)s for n = 0, . . . , N 1,
we write the QMC quadrature rule as
QP [ f ] :=

N 1
1 
f (x (n) ).
N n=0

Computational Higher Order Quasi-Monte Carlo Integration

273

In Sect. 2 we first define in more detail the structure of the point set P considered
throughout and derive worst-case error bounds for integrand functions which belong
to certain weighted spaces of functions introduced in [13]. Then, the component-bycomponent construction is reviewed and the worst-case error reformulated to allow
efficient computation. The main contribution of this paper is found in Sects. 4 and 5,
which mention some practical considerations required for efficient implementation
and application of these rules. In Sect. 5, we give measured convergence results for
several model integrands, showing the applicability of these methods.

2 Interlaced Polynomial Rank-1 Lattice Rules


Polynomial rank-1 lattice point sets, introduced by Niederreiter in [14], are a modification of standard rank-1 lattice point sets to polynomial arithmetic in Zb [x] (defined
in the next section). A polynomial lattice rule is an equal-weight quasi-Monte Carlo
(QMC) quadrature rule based on such point sets. Here, we consider the higher-order
interlaced polynomial lattice rules introduced in [6, Def. 3.6], [7, Def. 5.1] and focus
on computational techniques for their efficient construction.

2.1 Definitions
For a given prime number b, let Zb denote the finite field of order b and Zb [x] the set
of polynomials with coefficients in Zb . Let P Zb [x] be an irreducible polynomial
of degree m. Then, the finite field of order bm is isomorphic to the residue class
(Zb [x]/P, +, ), where both operations are carried out in Zb [x] modulo P. We denote
by G b,m = ((Zb [x]/P) , ) the cyclic group formed by the nonzero elements of the
residue class together with polynomial multiplication modulo P.
Throughout, we frequently interchange an integer n, 0 n < N = bm , with its
associated polynomial n(x) = 0 + 1 x + 2 x 2 + . . . + m1 x m1 , the coefficients
of which are given by the b-adic expansion n = 0 + 1 b + 2 b2 + . . . + m1 bm1 .
Given a generating vector q G sb,m , we have the following expression for the
ith component of the nth point x(n) [0, 1)s of a polynomial lattice point set P:
xi(n) = vm

 n(x)q (x) 
i
, i = 1, . . . , s, n = 0, . . . , N 1,
P(x)

1
where the mapping
any integer w by the
 vm :Zb ((xm)) [0, 1)is given for 1
=

b
,
and
Z
((x
)) denotes the set of
expression vm


b
=w 
=min(1,w)
k
formal Laurent series
a
x
with
a

Z
for
some
integer
w.
k
b
k=w k

274

R.N. Gantner and C. Schwab

A key ingredient for obtaining QMC formulas which afford higher-order convergence rates is the interlacing of lattice point sets, as introduced in [1, 2]. We define
the digit interlacing function, which maps points in [0, 1) to one point in [0, 1).
Definition 1 (Digit Interlacing Function) We define the digit interlacing function
D with interlacing factor N acting on the points {x j [0, 1), j = 1, . . . , }
by



D (x1 , . . . , x ) =
j,a b j(a1) ,
a=1 j=1

where by j,a we denote the ath component of the b-adic decomposition of x j ,


x j = j,1 b1 + j,2 b2 + . . ..
An interlaced polynomial lattice point set based on the generating vector q G s
b,m ,
whose dimension is now times larger than before, is then given by the points
bm 1
with
{x(n) }n=0
(n)

xi



n(x)q(i1)+1 (x)
n(x)q(i1)+ (x)
, . . . , vm
, i = 1, . . . , s,
= D vm
P(x)
P(x)

i.e. the ith coordinate of the nth point is obtained by interlacing a block of coordinates.

2.2 Worst-Case Error Bound


We give here an overview of bounds on the worst case error which are required for
the CBC construction; for details we refer to [6]. The results therein were based on a
new function space setting, which generalizes the notion of a reproducing kernel
Hilbert space to a Banach space setting. We also refer to [13] for an overview of
related function spaces.

2.2.1

Function Space Setting

In order to derive a worst-case error (WCE) bound, consider the higher-order unanchored Sobolev space Ws,, ,q,r := { f L 1 ([0, 1)s ) :  f s,, ,q,r < } which is
defined in terms of the higher order unanchored Sobolev norm


 f s,, ,q,r :=


u{1:s}





|v|

[0,1]

uq

vu u\v {1:}|u\v|

[0,1]s|v|

( ,
,0)
( y v u\v

r/q
1/r

f )( y) d y{1:s}\v d yv
,

(2)

Computational Higher Order Quasi-Monte Carlo Integration

275

with the obvious modifications if q or r is infinite. Here {1 : s} is a shorthand notation


for the set {1, 2, . . . , s}, and ( v , u\v , 0) denotes a sequence with j = for j v,
/ u. For non-negative weights u , the space
j = j for j u \ v, and j = 0 for j
Ws,, ,q,r consists of smooth functions with integrable mixed derivatives of orders up
to with respect to each variable, and L q -integrable (q [1, ]) mixed derivatives
containing a derivative of order in at least one variable.
This space is called unanchored because the innermost integral over [0, 1]s|v|
in the definition of the norm  s,, ,q,r integrates out the inactive coordinates,
i.e. those with respect to which a derivative of order less than is taken, rather
than anchoring these variables by fixing their values equal to an anchor point
a [0, 1)s . The weights u in the definition of the norm can be interpreted as the
relative importance of groups of variables u. Below, we will assume either product
structure or so-called SPOD structure on the weights u ; here, the acronym SPOD
stands for smoothness-driven, product and order dependent weights, which were first
introduced in [6].
We remark that the sum over all subsets u {1 : s} also
r includes the empty


r
set u = , for which we obtain the term [0,1]s f ( y) d y , which contains the
average of the function f over the s-dimensional unit cube.

2.2.2

Error Bound

The worst-case error eWC (P, W ) of a point set P = { y(0) , . . . , y(b 1) } over the
function space W is defined by the following supremum over the unit ball in W :
m

eWC (P, W ) = sup |I [ f ] QP [ f ]|.


 f W 1

Assume that 1 r, r
with 1/r + 1/r
= 1 and , s N with > 1. Define a
collection of positive weights = (u )uN . Then, by [6, Theorem 3.5], we have the
following bound on the worst-case error in the space Ws,, ,q,r ,
sup

 f Ws,, ,q,r 1

|I [ f ] QP [ f ]| es,, ,r
(P),

with the bound for the worst case error es,, ,r


(P) given by


es,, ,r
(P) =
=u{1:s}

|u|
C,b
u

r
1/r

b (ku ) .

(3)


ku Du

The inner sum is over all elements of the


dual net without zero, see [10, Def. 5]. For
a number k with b-adic expansion k = Jj=1 j ba j with a1 > . . . > a J , we define

)
(a j + 1) as in [6]. The constant C,b is obtained by
the weight (k) = min(,J
j=1

276

R.N. Gantner and C. Schwab

bounding the Walsh coefficients of functions in Sobolev spaces, see [3, Thm.14] for
details. Here, it has the value


1
2
C,b = max
,
max
(2 sin b ) z=1,...,1 (2 sin b )z


1
2 2b + 1
1
.
(4)
3+ +
1+ +
b b(b + 1)
b
b1
The bound (3) holds for general digital nets; however, we wish to restrict ourselves
to polynomial lattice rules. We additionally choose r
= 1 (and thus r = , i.e. the
 norm over the sequence indexed by u {1 : s} in the norm  s,, ,q,r ). We
 a point set in s dimensions, and use in the following the definition
denote by P
logb y(1) b 1
where (0) = bb1
(y) = bb1
b b
b . Using [6, Theorem 3.9], we
b b

bound the sum over the dual net Du in (3) by a computationally amenable expression,
b 1

1  
es,, ,1 (P) E s (q) = m
v

(y (n)
j ),
b n=0 v{1:s} jv
m


y(n) P,

(5)

v =



n(x)q j (x)
where y (n)
depends on the jth component of the generating vector,
j = vm
P(x)
v , v {1 : s} depends on the choice of weights u .
q j (x), and the auxiliary weight 
Assume given a sequence ( j ) j  p (N) for 0 < p < 1 and denote by u(v) {1 :
s} an indicator set containing a dimension i {1, . . . , s} if any of the corresponding
dimensions {(i 1) + 1, . . . , i} is in v {1 : s}. This can be given explicitly
by u(v) = { j/ : j v}. For product weights, we define
v =


j = C,b b(1)/2

j,

!2(,) j ,

(6)

=1

ju(v)

and obtain from (5) the worst-case error bound for d = 1, . . . , s


b 1

1      
E d (q) = m
j
(y (n)
j ) .
b n=0 u{1:s} ju
v{1:d} jv
m

u =

(7)

u(v)=u

For SPOD weights we have


v =



u(v) {1:}|u(v)|

| u(v) |!


ju(v)

j ( j ), j ( j ) = C,b b(1)/2 2( j ,) j j , (8)

Computational Higher Order Quasi-Monte Carlo Integration

277

for which we obtain


b 1
1  
E d (q) = m
b n=0 v{1:d}

v =

{1:}|u(v)|

||!

 

j ( j )

 

ju(v)


(y (n)
)
.
j

(9)

jv

These two expressions will be the basis of the component-by-component (CBC)


construction elucidated in the next section. We note that the powers of C,b arising
in (7) and (9) can become very large, leading to a pronounced negative impact on
the construction procedure (see Sect. 4.1 below). The constant C,b , defined in (4),
stems from bounds on the Walsh coefficients of smooth functions [3].

3 Component-by-Component Construction
The component-by-component construction (CBC) [12, 18, 19] is a simple but nevertheless effective algorithm for computing generating vectors for rank-1 lattice rules,
of both standard and polynomial type. In each iteration of the algorithm, the worstcase error is computed for all candidate elements of the generating vector, and the one
with minimal WCE is taken as the next component. After s iterations, a generating
vector of length s is obtained, which can then be used for QMC quadrature.
Nuyens and Cools reformulated in [15, 16] the CBC construction to exploit the
cyclic structure inherent in the point sets for standard lattice rules when the number
of points N is a prime number. This leads to the so-called fast CBC algorithm based
on the fast Fourier transform (FFT) which speeds up the computation drastically. It
is also the basis for the present construction.
Fast CBC is based on reformulating (7) and (9): instead of iterating over the index
d = 1, . . . , smax , we iterate over the dimension s = 1, . . . , smax and for each s
over t = 1, . . . , . Thus, the index d above is replaced by the pair s, t through
d = (s 1) + t and we write
(n)
y (n)
j,i = y( j1)+i ,

j = 1, . . . , smax , i = 1, . . . , .

(10)

In order to obtain an efficient algorithm we further reformulate (7) and (9) such that
only intermediate quantities are updated instead of recomputing E d (q) in (7) and (9).

3.1 Product Weights


In the product weight case, we have for t = the expression



bm 1 s

1 
(n)
E s, (q) = m
(1 + (y j,i )) 1
1.
1 + j
b n=0 j=1
i=1

(11)

278

R.N. Gantner and C. Schwab



 
(n)

We define the quantity Ys (n) = sj=1 1 + j


which
i=1 (1 + (y j,i )) 1
will be updated at the end of each iteration over t. To emphasize the independence of
certain quantities on the current unknown component qs,t , we denote the truncated
generating vector by q d = (q1 , . . . , qd ) or in analogy to (10), q s,t = (q1,1 , . . . , qs,t ).
 s,1 , . . . , qs,t ), such that (11) can be
We now write E s,t (q s,t ) = E s1, (q s1, ) + E(q
used for E s1, (q s1, ) during the iteration over t. For t < , we have


 t
bm 1

1 
(n)
(1 + (ys,i )) 1 Ys1 (n) 1,
1 + s
E s,t (q) = m
b n=0
i=1
which can be written in terms of E s1, (q s1, ) as

 t
bm 1
bm 1
s 
s  
(n)
E s,t (q) = E s1, (q s1, ) m
Ys1 (n)+ m
(1 + (ys,i )) Ys1 (n).
b n=0
b n=0 i=1
t
(n)
For later use and ease of exposition, we define Vs,t (n) = i=1
(1 + (ys,i
)), which



(n)
(n) 
satisfies Vs,t (n) = Vs,t1 (n) 1 + (ys,t ) for t > 1 and Vs,1 (n) = 1 + (ys,1
) . We
 b 1 t
(0)
t
also note that Vs,t (0) = (1 + (0)) = b b , since ys,t = 0, independent of the
generating vector. This leads to the following decomposition of the error for product
weights

s 
(1 + (0))t 1 Ys1 (0)
bm
bm 1

s  
+ m
Vs,t1 (n) 1 Ys1 (n)
b n=1

E s,t (q) = E s1, (q s1, ) +

b 1
s 
(n)
+ m
(ys,t
)Vs,t1 (n)Ys1 (n),
b n=1
m

(12)

(n)
. This reformulation permits
where only (12) depends on the unknown qs,t through ys,t
efficient computation of the worst-case error bound E s,t during the CBC construction
by updating intermediate quantities.

3.2 SPOD Weights


The search criterion (9) can be reformulated to obtain [6, 3.43]
b 1 s
s
 
1   
E s,t (q) = m
!
j ( j )
b n=0 =1 {0:}s j=1
v{1:d} s.t.
m

||=

j >0

u(v)={1 js: j >0}


jv

(y (n)
j ).

(13)

Computational Higher Order Quasi-Monte Carlo Integration

For a complete block (i.e. t = ), we write E s, (q) =


where Us, (n) is given by
Us, (n) = !

279
1
bm

bm 1 s
n=0

=1 Us, (n),

s 


 


j ( j )
1 + (y (n)
j,i ) 1 .
{0:}s j=1
||= j >0

i=1

Proceeding as in the product weight case, we separate out the E s1, (q s1, ) term,
E s,t (q) = E s1, (q s1, )
bm 1
t
s min(,)
 


1  
!
(n)
Us1,s (n) .
+ m
(1 + (ys,i )) 1
s (s )
b
( s )!
n=0

=1 s =1

i=1

 min(,)
!
Defining Vs,t (n) as above and with Ws (n) = s
s (s ) (
Us1,s
s =1
=1
s )!
(n), we again aim to isolate the term depending on the unknown qs,t . This yields
E s,t (q) = E s1, (q s1, ) +


1  b 1 t
1 Ws (0)
m

b
b b

b 1
1 
(Vs,t1 (n) 1)Ws (n)
+ m
b n=1

(14)

b 1
1 
(n)
+ m
Vs,t1 (n)Ws (n)(ys,t
),
b n=1

(15)

(n)
.
where only the last sum (15) depends on qs,t through ys,t
The remaining terms can be ignored, since the error E(q d1 , z) is shifted by the
same amount for all candidates z G b,m . This optimization saves O(N ) operations
due to the omission of the sum (14). An analogous optimization is possible in the
product weight case. Since the value of the error bound E smax , (q) is sometimes a
useful quantity, one may choose to compute the full bounds given above.

3.3 Efficient Implementation


As currently written, the evaluation of the sums (12) and (15) for all possible bm 1
values for qs,t requires O(N 2 ) operations. Following [15], we view this sum as a
matrix-vector multiplication of the matrix

280

R.N. Gantner and C. Schwab


n(x)q(x)
:= vm
P(x)

1nbm 1
qG b,m

(16)



with the vector consisting of the component-wise product Vs,t1 (n)Ws (n) 1nbm 1 .
The elements of depend on n(x)q(x), which is a product of polynomials in G b,m .
Since the nonzero elements of a finite field form a cyclic group under multiplication,
there exists a primitive element g that generates the group, i.e. every element of G b,m
can be given as some exponent of g.
By using the so-called Rader transform, originally developed in [17], the rows
and columns of can be permuted to obtain a circulant matrix perm . Application
of the fast Fourier transform allows the multiplications (12) and (15) to be executed
in O(N log N ) operations. This technique was applied to the CBC algorithm in [16];
we also mention the exposition in [8, Chap. 10.3].
The total work complexity is O(s N log N + 2 s 2 N ) for SPOD weights and
O(s N log N ) for product weights [6, Theorems 3.1, 3.2]. In Sect. 5, we show measurements of the CBC construction time that indicate that the constants in these
asymptotic estimates are small, allowing these methods to be applied in practice.

3.4 Algorithms
In Algorithms 1 and 2 below, V, W, Y, U() and X() denote vectors of length N . E
is a vector of length N 1 and E old , E 1 , E 2 are scalars. By  we denote componentwise multiplication and z,: denotes the zth row of .
Algorithm 1 CBC_product(b, m, , smax , {1 , . . . , s })
Y 1 bm , E old 0
for s = 1, . . . , smax do
V1
for t = 1, . 
. . , do

1 t
E 1 s bb b
1 Y(0)

bm 1 
E 2 s n=1 V(n) 1 Y(n)
E s (V  Y) + (E old + E 1 + E 2 ) 1
qs,t argminqG b,m E(q)
V (1 + qs,t ,: )  V
end for


Y 1 + s (V 1)  Y
E old E(qs, )
end for
return q, E old

Computational Higher Order Quasi-Monte Carlo Integration

281

Algorithm 2 CBC_SPOD(b, m, , smax , { j ()}sj=1 )


U(0) 1, U(1 : smax ) 0
E old 0
for s = 1, . . . , smax do
V1
W0
for  = 1, . . . , s do
X() 0
for = 1, . . . , min(, ) do
!
X() X() + s () ()!
U( )
end for
W W + b1m X()
end for
for t = 1,. . . , do 
1 t
E 1 bb b
1 W(0)

bm 1 
E 2 n=1 V(n) 1 W(n)
E (V  W) + (E old + E 1 + E 2 ) 1
qs,t argminqG b,m E(q)
V (1 + qs,t ,: )  V
end for
E old E(qs, )
for  = 1, . . . , s do
U() U() + (V 1)  X()
end for
end for
return q, E old

4 Implementation Considerations
4.1 Walsh Coefficient Bound
The definition of the auxiliary weights (6) and (8) contain powers of the Walsh
constant bound C,b defined in (4), which for b = 2 is bounded from below by
 2
C,2 = 29 53
29 . For base b = 2, it was recently shown in [20] that C,2 can
be replaced by C = 1. Large values of the worst-case error bounds (7) and (9) have
been found to lead to generating vectors with bad projections. For integrand functions
with small Walsh coefficients, C,b may be replaced with a tighter bound C; this will
yield a worst-case error bound better adapted to the integrand and a generating vector
with the desired properties. Since additionally C,b is increasing in for fixed b, this
becomes more important as the order of the quadrature rule increases.

282

R.N. Gantner and C. Schwab

4.2 Pruning
For large values of the WCE, the elements of the generating vector can repeat, leading
to very bad projections in certain dimensions. For polynomial lattice rules, if qs,k =
qs,k k = 1, . . . , for two dimensions s and s , the corresponding components of the
quadrature points will be identical, xs(n) = xs(n) for all values of n = 0, . . . , bm 1.
Thus, in the projection onto the (s, s )-plane, only points on the diagonal are obtained,
which is obviously a very bad choice. One way this problem could be avoided is to
consider a second error criterion, as in [4]. We propose here a simpler method that
requires only minor modification of the CBC iteration.
To alleviate this effect, we formulate a pruning procedure that incorporates this
observation into the construction of the generating vector. We impose the additional
condition that the newest element of the generating vector is unique, i.e. is not
equal to a previously constructed component of q. This can be achieved in the CBC
construction by replacing the minimization of E(q) over all possible bm 1 values
of the new component by the restricted version
qd =

argmin E(q1 , . . . , qd1 , z).

zG b,m ,
z {q
/ 1 ,...,qd1 }

(17)

This procedure requires d 1 operations in iteration d to check the previous entries of


the vector, or O( 2 s 2 ) in total, and thus does not increase the asymptotic complexity.
Alternatively, the indices can be stored in an additional sorted data structure with
logarithmic (in s) cost for both inserting new indices and checking for membership.
This yields a cost of O(s log(s)) additional operations, with an additional storage
cost of O(s). It was shown in [5] that the presently proposed pruning procedure
preserves the higher order QMC convergence estimates. In the case where the set of
candidates in (17), G b,m \{q1 , . . . , qd1 }, is empty (which happens e.g. when smax >
bm 1), the restriction is dropped. In other words, pruning is applied as long as it
still allows for at least one possible value for qd .

5 Results
We present several tests of an implementation of Algorithms 1 and 2, and of the
resulting higher order QMC quadrature rules. Rather than solving concrete application problems, the purpose of the ensuing numerical experiments is a) to verify
the validity of the asymptotic (as s, N ) complexity estimates and QMC error
bounds, in particular to determine the range where the asymptotic complexity bounds
give realistic descriptions of the CBC constructions performance; b) to investigate
the quantitative effect of (not) pruning the generating vector on the accuracy and convergence rates of the QMC quadratures, and c) to verify the necessity of the weighted
spaces Ws,, ,q,r and the norms in (2) for classifying integrand function regularity. We

Computational Higher Order Quasi-Monte Carlo Integration

283

remark that, due to the limited space of these proceedings, only few representative
simulations can be presented in detail; for further results and a complete description
of our implementation, we refer to [9].

5.1 Model Problems


For our numerical results, we consider two model parametric integrands designed to
mimic the behavior of parametric solution families of parametric partial differential
equations. Both integrand functions are smooth (in fact, analytic) functions of all
integration variables and admit stable extensions to a countable number of integration
variables. However, their sparsity is controlled, as expressed by the growth of their
higher derivatives. The first integrand function belongs to weighted spaces Ws,, ,q,r
with the norms in (2) where the weights are of SPOD type [6], whereas the second
integrand allows for product weights. The SPOD-type integrand we consider was first
mentioned in [13], and models a parametric partial differential equation depending
in an affine manner on s parameters y1 , . . . , ys , as considered, for example, in [6]:

f ,s, ( y) = 1 +

s


1
aj yj

, a j = j , N .

(18)

j=1


||+1
We have the differential y f ,s, ( y) = (1)|| ||! f ,s, ( y) sj=1 (a j ) j , leading


to the bound | y f ,s, ( y)| C f ||! sj=1 j j for all {0, 1, . . . , }s and for a
C f 1 with the weights j given by j = a j = j , j = 1, . . . , s. Additionally,
for s , we have ( j ) j  p (N) with p > 1 and thus = 1/ p + 1 = .
Therefore, by Theorem 3.2 of [6], an interlaced polynomial lattice rule of order
with N = bm points (b prime, m 1) and point set P N can be constructed such
that the QMC quadrature error fulfills
|I [ f ,s, ] QP N [ f ,s, ]| C(, , b, p)N 1/ p ,

(19)

for a constant C(, , b, p) independent of s and N . Convergence rates were computed with respect to a reference value of the integral I [ f ,s, ] obtained with
dimension-adaptive Smolyak quadrature with tolerance 1014 . We also consider separable integrand functions, which, on account of their separability, trivially belong
to the product weight class. They are given by

g,s, ( y) = exp

s

j=1

a j y j , a j = j ,

(20)

284

R.N. Gantner and C. Schwab


and satisfy y g( y) = g( y) sk=1 (ak )k . Under the assumption that there exists
 > 0 that is independent of s and such that g( y) C
 for all y
a constant C
s
s


[0, 1] , which holds here with C = exp( j=1 j ), > 1, we have the bound

 sj=1 j j , for all {0, 1, . . . , }s with the weights j given
| y g,s, ( y)| C
by j = a j = j for j = 1, . . . , s. We have the following analytically given
formula for the integral

s 
s
 






j
j
, (21)
exp( j ) 1 = exp
I [g,s, ] =
log

( + 1)!
=0
j=1
j=1
and have an error bound of the form (19), with a different value for C(, , b, p).

5.2 Validation of Work Bound


The results below are based on an implementation of the CBC algorithm in the
C++ programming language, and exploits shared-memory parallelism to reduce the
computation time for large m and s. Fourier transforms were realized using the FFTW
library, with shared-memory parallelization enabled. Timings were executed on a
system with 48 Intel Xeon E5-2696 cores at 2.70 GHz, where at most 8 CPUs were
used at a time. The timing results in Fig. 1 show that the work bounds O(s N log N +
2 s 2 N ) for SPOD weights from [6, Thm.3.1] and O(s N log N ) for product weights
from [6, Thm.3.2] are fulfilled in practice and seem to be tight. The work O(N log N )
in the number of QMC points N also appears tight for moderate s and N .

5.3 Pruning and Adapting the Walsh Coefficient Bound


We consider construction of the generating vector with and without application of
the pruning procedure defined in Sect. 4.2. Convergence rates for both cases can be
seen in Fig. 2: for = 2 no difference was observed when pruning the entries.
Results for the constant C,b from (4) as well as for C = 1 are shown; in this
example, adapting the constant C to the integrand seems to yield better results than
pruning. In the case of the integrand (18), this can be justified by estimating the
Walsh coefficients by numerical computation of the WalshHadamard transform.
The maximal values of these numerically computed coefficients is bounded by 1 for
low dimensions, indicating that the bound C,b is too pessimistic. For base b = 2 in
(4), it was recently shown in [20] that C = 1.

Computational Higher Order Quasi-Monte Carlo Integration

(a)

(b)

(c)

(d)

285

Fig. 1 CPU time required for the construction of generating vectors of varying order = 2, 3, 4
for product and SPOD weights with j = j versus the dimension s in a and b and versus the
number of points N = 2m in c and d

(a)

(b)

Fig. 2 Effect of pruning the generating vectors: convergence of QMC approximation for the SPOD
integrand (18) with = 4, s = 100, base b = 2 and = 2, 3, 4, with and without pruning. Results
a obtained with Walsh constant (4). In b, the Walsh constant C = 1 and pruning are theoretically
justified in [5] and [20], respectively

286

R.N. Gantner and C. Schwab

(a)

(b)

(c)

(d)

Fig. 3 Convergence of QMC approximation to (21) for the product weight integrand (20) in s =
100, 1000 dimensions with interlacing parameter = 2, 3, 4 with pruning. a s = 100, = 2, b
s = 100, = 4, c s = 1000, = 2, d s = 1000, = 4

(a)

(b)

Fig. 4 Convergence of QMC approximation for the SPOD weight integrand (18) in s = 100
dimensions with interlacing parameter = 2, 3, 4 with pruning. a = 2. b = 4

Computational Higher Order Quasi-Monte Carlo Integration

287

5.4 Higher-Order Convergence


As can be seen in Figs. 3 and 4, the higher-order convergence rates proved in [6] can
be observed in practice for the two classes of tested integrand functions. To generate
the QMC rules used in Figs. 3 and 4, the ad hoc value C = 0.1 was used. We
also mention that for more general, non-affine, holomorphic parameter dependence
of operators the same convergence rates and derivative bounds as in [6] have been
recently established in [7]. The CBC constructions apply also to QMC rules for
these (non affine-parametric) problems. The left subgraphs ( = 2) show that higher
values of the interlacing parameter do not imply higher convergence rates, if the
integrand does not exhibit sufficient sparsity as quantified by the norm (2). The right
subgraphs ( = 4) in Figs. 3 and 4 show that the convergence rate is indeed dimension
independent, but limited by the interlacing parameter = 2: the integrand function
with = 4 affords higher rates than 2 for interlaced polynomial lattice rules with
higher values of the interlacing parameter .
The fast CBC constructions [15, 16], as adapted to higher order, interlaced polynomial lattice rules in [6], attain the asymptotic scalings for work and memory with
respect to N and to integration dimension s already for moderate values of s and
N . Theoretically predicted, dimension-independent convergence orders beyond first
order were achieved with pruned generating vectors obtained with base b = 2 and
Walsh constant C = 1. QMC rule performance was observed to be sensitive to overestimated values of the Walsh constant C,b . The choice b = 2 and C = 1 with
pruning of generating vectors, theoretically justified in [5] and [20], respectively,
yielded satisfactory results for = 2, 3, 4 in up to s = 1000 dimensions.
Acknowledgments This work is supported by the Swiss National Science Foundation (SNF)
under project number SNF149819 and by the European Research Council (ERC) under FP7 Grant
AdG247277. Work of CS was performed in part while CS visited ICERM / Brown University in
September 2014; the excellent ICERM working environment is warmly acknowledged.

References
1. Dick, J.: Explicit constructions of quasi-Monte Carlo rules for the numerical integration of highdimensional periodic functions. SIAM J. Numer. Anal. 45(5), 21412176 (2007) (electronic).
doi:10.1137/060658916
2. Dick, J.: Walsh spaces containing smooth functions and quasi-Monte Carlo rules of arbitrary
high order. SIAM J. Numer. Anal. 46(3), 15191553 (2008). doi:10.1137/060666639
3. Dick, J.: The decay of the Walsh coefficients of smooth functions. Bull. Aust. Math. Soc. 80(3),
430453 (2009). doi:10.1017/S0004972709000392
4. Dick, J.: Random weights, robust lattice rules and the geometry of the cbcr c algorithm.
Numerische Mathematik 122(3), 443467 (2012). doi:10.1007/s00211-012-0469-5
5. Dick, J., Kritzer, P.: On a projection-corrected component-by-component construction. J. Complex. (2015). doi:10.1016/j.jco.2015.08.001
6. Dick, J., Kuo, F.Y., Le Gia, Q.T., Nuyens, D., Schwab, C.: Higher order QMC PetrovGalerkin
discretization for affine parametric operator equations with random field inputs. SIAM J.
Numer. Anal. 52(6), 26762702 (2014)

288

R.N. Gantner and C. Schwab

7. Dick, J., Le Gia, Q.T., Schwab, C.: Higher-order quasi-Monte Carlo integration for holomorphic, parametric operator equations. SIAM/ASA J. Uncertain. Quantif. 4(1), 4879 (2016).
doi:10.1137/140985913
8. Dick, J., Pillichshammer, F.: Digital nets and sequences. Cambridge University Press, Cambridge (2010). doi:10.1017/CBO9780511761188
9. Gantner, R. N.: Dissertation ETH Zrich (in preparation)
10. Goda, T.: Good interlaced polynomial lattice rules for numerical integration in weighted Walsh
spaces. J. Comput. Appl. Math. 285, 279294 (2015). doi:10.1016/j.cam.2015.02.041
11. Goda, T., Dick, J.: Construction of interlaced scrambled polynomial lattice rules of arbitrary
high order. Found. Comput. Math. (2015). doi:10.1007/s10208-014-9226-8
12. Kuo, F.Y.: Component-by-component constructions achieve the optimal rate of convergence
for multivariate integration in weighted Korobov and Sobolev spaces. J. Complexity 19(3),
301320 (2003). doi:10.1016/S0885-064X(03)00006-2
13. Kuo, F.Y., Schwab, C., Sloan, I.H.: Quasi-Monte Carlo methods for high-dimensional integration: the standard (weighted Hilbert space) setting and beyond. ANZIAM J. 53, 137 (2011).
doi:10.1017/S1446181112000077
14. Niederreiter, H.: Random number generation and quasi-Monte Carlo methods. CBMS-NSF
Regional Conference Series in Applied Mathematics, vol. 63. Society for Industrial and Applied
Mathematics (SIAM), Philadelphia, PA (1992). doi:10.1137/1.9781611970081
15. Nuyens, D., Cools, R.: Fast algorithms for component-by-component construction of rank-1
lattice rules in shift-invariant reproducing kernel Hilbert spaces. Math. Comp. 75(254), 903
920 (2006) (electronic). doi:10.1090/S0025-5718-06-01785-6
16. Nuyens, D., Cools, R.: Fast component-by-component construction, a reprise for different
kernels. Monte Carlo and quasi-Monte Carlo methods 2004, pp. 373387. Springer, Berlin
(2006). doi:10.1007/3-540-31186-6_22
17. Rader, C.: Discrete Fourier transforms when the number of data samples is prime. Proc. IEEE
3(3), 12 (1968)
18. Sloan, I.H., Kuo, F.Y., Joe, S.: Constructing randomly shifted lattice rules in weighted Sobolev
spaces. SIAM J. Numer. Anal. 40(5), 16501665 (2002). doi:10.1137/S0036142901393942
19. Sloan, I.H., Reztsov, A.V.: Component-by-component construction of good lattice rules. Math.
Comp. 71(237), 263273 (2002). doi:10.1090/S0025-5718-01-01342-4
20. Yoshiki, T.: Bounds on Walsh coefficients by dyadic difference and a new Koksma- Hlawka
type inequality for Quasi-Monte Carlo integration (2015)

Numerical Computation of Multivariate


Normal Probabilities Using Bivariate
Conditioning
Alan Genz and Giang Trinh

Abstract New methods are derived for the computation of multivariate normal
probabilities defined for hyper-rectangular probability regions. The methods use conditioning with a sequence of truncated bivariate probability densities. A new approximation algorithm based on products of bivariate probabilities will be described.
Then a more general method, which uses sequences of simulated pairs of bivariate
normal random variables, will be considered. Simulations methods which use Monte
Carlo, and quasi-Monte Carlo point sets will be described. The new methods will be
compared with methods which use univariate normal conditioning, using tests with
random multivariate normal problems.
Keywords Multivariate normal probabilities Bivariate conditioning

1 Introduction
Many problems in applied statistical analysis require the computation of multivariate
normal (MVN) probabilities in the form
1
(a, b; ) =
|| (2 )n

b1

a1


...

bn

e 2 x
1 t

dx,

an

where x = (x1 , x2 , . . . , xn )t , dx = d xn d xn1 d x1 , and is an n n symmetric


positive definite covariance matrix. There are in general no exact methods for the
computation of the MVN probabilities, so various methods (see Genz and Bretz [5])
have been developed to provide suitably accurate approximations. And now there
A. Genz (B) G. Trinh
Department of Mathematics, Washington State University, Pullman,
WA 99164-3113, USA
e-mail: alangenz@wsu.edu
G. Trinh
e-mail: alangenz@wsu.edu
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_13

289

290

A. Genz and G. Trinh

are implementations in scientific computing environments of efficient simulation


methods (see the R pvmnorm package and Matlab mvncdf function, for example),
which can often provide highly accurate MVN probabilities.
The purpose of this paper is to consider generalizations of some simulation
methods which use univariate conditioning. The generalizations we study here use
bivariate conditioning, with the goal of providing more accurate simulations without
significantly increasing the computational cost, compared to a univariate conditioning method. In order to provide background for the new simulation methods, we
first describe the basic univariate conditioning method. Then we derive our bivariate
conditioning methods, and finish with some tests comparing the different methods.

2 Univariate Conditioning Algorithms


We start with the Cholesky decomposition of = CC t , where C is a lower trianguwe use the transformation x = Cy,
lar matrix. Then xt 1 x = xt C t C 1 x, and if
we have xt 1 x = yt y with dx = |C| dy = ||dy. The probability region for
(a, b; ) is now given by a Cy b. Taking advantage of the lower triangular
structure of C, this set of inequalities can be rewritten in more detail in the form
a1 /c11 y1 b1 /c11
(a2 c21 y1 )/c22 y2 (b2 c21 y1 )/c22
..
.
(an

n1


cnm ym )/cnn yn (bn

m=1

Then, using ai = (ai


have

n1


cnm ym )/cnn .

m=1

i1

m=1 cim ym )/cii ,

(a, b; ) =

1
(2 )n

b1

a1

y12

e 2

and bi = (bi




b2

a2

y22

e 2

i1


bn1


an1

m=1 cim ym )/cii ,


yn2

e 2 dy.

we
(1)

This conditioned form for MVN probabilities has been used as the basis for several
numerical approximation and simulation methods (see Genz and Bretz [4, 5]).

2.1 Univariate Conditioning Simulations


We can use (1) with successive truncated conditional simulations to estimate
(a, b; ). In what follows we will use y N (a, b), to denote the simulation of a random y value from a univariate Normal distribution with truncation

Numerical Computation of Multivariate Normal Probabilities

291

limits a and b. A standard method


for computing these y values is to use y =

1 (a) + ((b) (a))u , with u U (0, 1) (u is a random number from the
uniform distribution on [0, 1]). The basic simulation step k is:
1. start with y1 N (a1 , b1 ),
(ai , bi ) for i = 1, . . . , n 1;
2. given y1 , . . . , yi1 , compute
n yi N

3. compute the final Pk = i=1 ((bi ) (ai )) (a, b; )
After computing M estimates Pk , k = 1, . . . , M, for (a, b; ), we compute the
mean and standard error
M
1 
Pk (a, b; ),
PM =
M k=1

 M
EM =


PM )2 2
.
M(M 1)

k=1 (Pk

(2)

The scaled standard error is used to provide error estimates for PM . If QMC points
are used instead of the u i U (0, 1) MC points, the result is a QMC algorithm,
with faster convergence to (a, b; ) (see Hickernell [8], where the use of lattice
rule QMC point sets is analyzed). Sndor and Andrs [12] also showed how QMC
point sets can provide faster convergence than MC point sets for this problem, and
compared several types of QMC point sets.

2.2 Variable Prioritization


This algorithm uses an ordering of the variables that is specified by the original ,
but there are n! possible orderings of the variables for (a, b, ). These orderings
do not change the MVN value as long as the integration limits and corresponding
rows and columns of are also permuted. Schervish [13] originally proposed sorting
the variables so that the variables with the shortest integration interval widths were
the outer integration variables. This approach often reduces the overall variation of
the integrand and consequently produces and easier simulation problem. Gibson,
Glasbey and Elston (GGE [7]) suggested an improved prioritization of the variables.
They proposed sorting the variables so that the outermost integrals have the smallest
expected values. With this heuristic, the outer variables, which have the most influence on the innermost integrals, tend to have smaller variation, and this often reduces
the overall variance for the resulting integration problem. Test results have shown
that this variable prioritized reordering, when combined with the univariate conditioning algorithm can often produce more accurate results. We use this reordering
as preliminary step for our bivariate conditioning algorithms, so we provide some
details for the GGE reordering method here. We will use = E(a, b) to denote the
expected value for a Normal distribution; this is defined by
1
E(a, b) =
2


a

x2

xe 2 d x/((b) (a)).

292

A. Genz and G. Trinh

The GGE variable prioritization method first chooses the outermost integration variable by selecting the variable i so that





bi
ai

.
i = argmin
ii
ii
1in
The integration limits and the rows and columns of for variables 1 and i are interchanged. Then the first column of the Cholesky decomposition C of is computed

using c11 = 11 and ci1 = c11i1 for i = 2, . . . , n. Letting a 1 = ca111 , b1 = cb111 , we set
1 = E(a 1 , b1 ).
At stage j, the jth integration variable is chosen by selecting a variable i so that

j1
j1

a
b

i
im
m
i
im
m
m=1
m=1

.
i = argmin 

j1 2
j1 2
jin
ii m=1 cim
ii m=1 cim

The integration limits, rows and columns of , and partially completed rows of C for
variables j and i are interchanged. Then the jth column of C is computed using The
integration limits, rows and columns of , and partially completed rows of C for
variables
 j and i are interchanged. Then the jth column of C is computed using
cjj =

j1

j1

2
ii m=1 cim
and ci j = (i j m=1 cim c jm )/c j j , for i = j + 1, . . . , n.
j1
j1
Letting a j = (a j m=1 c jm m )/c j j , and b j = (b j m=1 c jm m )/c j j , we set
j = E(a j , b j ). The algorithm finishes when j = n, and then the final Cholesky
factor C and permuted integration limits a and b are used for the Pk computations
in (2).
Tests of the univariate conditioned simulation algorithm, with this variable
reordering algorithm show that the resulting Pk have smaller variation, reducing
the overall variation for the MVN estimates (see Genz and Bretz [4]). This (variable
prioritized) algorithm is widely used with QMC or deterministic us for implementations in Matlab, R, and Mathematica.

3 Bivariate Conditioning Simulation


We will now derive algorithms which use a bivariate conditioned form for (a, b; ).
These algorithms depend on methods for fast and accurate bivariate normal (BVN)
computations which are now available (see Drezner and Wesolowsky [2], and
Genz [3]). The algorithms also depend on a bivariate decomposition for , which
we now describe.

Numerical Computation of Multivariate Normal Probabilities

293

3.1 L DL t Decomposition
In order to transform the MVN problem into a sequence of conditioned BVN integrals, we define k =  n2  and use the covariance matrix decomposition = L DL t .
If n is even this decomposition of has

I2 O2

L 21 . . . . . .
L=
. .
.. . . I2
L k1 . . . L k,k1

D1 O 2
O2

..
..

.
, D = O2 .
. .

.. . .
O2
I2
O2 . . .

O2
.
..
. ..
,

Dk1 O2
O 2 Dk

where Di , L i, j , and O2 , are 2 2s matrices.


For odd n, there is an extra row in L, and the final entry in D is dnn .
For example, if

1 0 0
2 1 1 1 2
0 1 0
1 2 1 1 2

=
1 1 4 3 1 , L = 1 1 1
1 1 0
1 1 3 4 1
2 2 1
2 2 1 1 16

21 0 00
00
1 2 0 0 0
0 0

0 0, D =
0 0 2 1 0 .

0 0 1 2 0
10
00 0 02
11

This
 block t decomposition
 using the partitioning

 can be recursivelycomputed
D1 O
I2 O
1,1 R
=
, and D =
, where 1,1 is a 2 2
, with L =
R
M L
O D
matrix. Then D1 = 1,1 , M = R D11 , D = M D1 M t , and the decomposition
procedure continues by applying the same operations to the (n 2) (n 2) matrix
This is a 2 2 block form for the standard Cholesky decomposition algorithm
.
(see Golub and Van Loan [6]).

3.2 The Bivariate Approximation Algorithm


We start with = L DL t , and use the transformation x = Ly, so that dx = |L| dy =
dy. The y constraints which define the integration region are now determined from
a Ly b. Defining (, ) j = (a j g j , b j g j ), with
 j1
g j = m=1 l jm ym , and y2k = (y2k1 , y2k )t , we have
 2k1  2k
 1  2
1
1
1 t
21 y2t D11 y2
(a, b; ) =
e

e 2 y2k Dk y2k
|D| (2 )n 1 2
2k1
2k

dy
if n = 2k;

n 2d1 yn2
nn
e
dy
if
n = 2k + 1.
n

294

A. Genz and G. Trinh

A final transformation using yi =



(a, b; ) =

b1

a1

b2

a2

dii z i for i = 1, . . . , n, gives us

1 t

2 |12 | 2

1
2

e 2 z2 12 z2


b2k1


a2k1


b2k


a2k

e 2 z2k 2k1,2k z2k


1 t

2 |2k1,2k | 2

dz
if n = 2k,
21 z n2
e
dz if n = 2k + 1.

(3)

bn
an


!
1 k
, k = d2k1,2k / d2k1,2k1 d2k,2k , and
k 1

(a  , b )i = (, )i / dii .
The bivariate approximation algorithm begins with the computation outermost
BVN probability P1 = ((a1 , a2 ), (b1 , b2 ); 12 ). We then use explicit formulas,
derived by Muthn [11], for truncated BVN moments 1 and 2 : using q1 =

with 2k1,2k =

1 12 ,

(1 , 2 ) = E((a1 , a2 ), (b1 , b2 ); 1 )


 b1  b2
u 2 +v2 2uv1

1
2q12
(u, v)e
dvdu.
=
2 P1 q1 a1 a2
The Muthn formula for 1 is
1 = 1
+

 (a2 )
P1

(a1 )

P1

"

"

a1 1 a2 b1 1 a2


, q1
q1

a2 1 a1 b2 1 a1


, q1
q1

(b2 )

P1

(b1 )

P1

"

"

a1 1 b2 b1 1 b2


, q1
q1

a2 1 b1 b2 1 b1


, q1
q1

#
,

(4)

using the univariate (a, b) = (b) (a). The 2 formula is the same, except
for the interchanges a1  a2 and b1  b2 . Note that the i formulas depend only
on easily computed univariate pdf and cdf values.
Now, approximate the second BVN by P2 = ((a 3 , a 4 ), (b3 , b4 ); 3,4 ), where
a i , bi , are ai , bi , with z 1 , z 2 replaced by 1 , 2 . Then, compute (3 , 4 ) =
E((a 3 , a 4 ), (b3 , b4 ); 2 ). At the ith stage we compute
Pi = ((a 2i1 , a 2i ), (b2i1 , b2i ); 2i1,2i ),
with a i , bi , computed ai , bi , with z 1 , ..., z 2i2 replaced by the expected values 1 ,
..., 2i2
After k stages the bivariate conditioning approximation is
(a, b; )

k
$
i=1

Pi

1
if n = 2k;
(a n , bn ) if n = 2k + 1.

Numerical Computation of Multivariate Normal Probabilities

295

This algorithm was proposed and studied by Trinh and Genz [15], where the
BVN conditioned approximations were found to be more accurate than approximations using univariate means with conditioning. In that paper variable reorderings
were also studied, where a natural strategy is to reorder the variables at stage i to
minimize the Pi . But this strategy uses O(n 2 ) BVN values overall, which can take
a lot more time than the strategy described previously which uses only UVN values. Tests by Trinh and Genz showed that UVN value prioritization results provided
approximations which were usually as accurate, or almost as accurate as the BVN
prioritized approximations.

4 BVN Conditioned Simulation Algorithms


4.1 Basic BVN Conditioned Simulation Algorithm
We will use the approximation algorithm described in the previous section, except
that the i values will be replaced by simulated z i values. We focus on (a, b; )
in the form given by Eq. (3).
Basic BVN Conditioned Simulation Algorithm Steps
First compute the outermost BVN P1 = ((a1 , a2 ), (b1 , b2 ); 12 ). and simulate
(z 1 , z 2 ) values from the (a1 , a2 ), (b1 , b2 ) truncated density
1

e 2 z2 12 z2
1 t

2 P1 |12 | 2

At stage i: given simulated (z 1 , z 2 ), . . . , (z 2i3 , z 2i2 ) compute





, a2i ), (b2i1
, b2i
); 2i1,2i ).
Pi = ((a2i1



, a2i ), (b2i1
, b2i
) the truncated density
and simulate (z 2i1 , z 2i ) values from (a2i1
1

e 2 zi 2i1,2i zi
1 t

2 Pi |2i1,2i | 2

After k stages
(a, b; )

k
$
i=1

Pi

1
if n = 2k;
(a n , bn ) if n = 2k + 1.

(5)

296

A. Genz and G. Trinh

The k stages in the algorithm are repeated and the results are averaged to approximate
(a, b; ). The primary complication with this algorithm compared to the algorithm
for univariate simulation is the truncated BVN simulation. In contrast to the univariate
simulation, there is no direct inversion formula for truncated BVN simulation.
Currently, the most efficient methods for truncated BVN simulation use an algorithm derived by Chopin [1], with variations for special cases. The basic algorithm is
an Acceptance-Rejection (AR) algorithm which we now describe. At each stage in
the BVN conditioned simulation algorithm, we need to simulate x, y from a truncated
BVN. We consider a generic BVN problem
with

 truncated region (a, b) (c, d) and
1
correlation coefficient . Using =
, we first define
1
P=

1
1

e 2 z z dz 2 dz 1 =
1 t

e 2 x

2
1

dx

1 2

cx
2 || 2 a c
a
1 2
 b 1 x2
 dx 1 y 2
2
2
e
1 2 e

f (x)d x, with f (x) =

dy.
cx
2
2

a
2

e 2 y
d yd x
2
1

The AR algorithm first simulates x (using AR) from the (a, b) truncated density
h(x) =

1 2

e 2 x

2 P

f (x). Then, given x, y is simulated directly from a truncated Normal


with limits ( 2 , dx 2 ). For the AR x simulation, x is first simulated directly
1
1

1 2
using the (a, b) truncated Normal density g(x) = e 2 x /( 2 ((b, a)). This x is
accepted if u < h(x)/Cg(x), where u U (0, 1), and where the AR constant C
is given by C = max x[a, b] h(x/g(x)). Now h(x)/g(x) = f (x)(a, b)/P, so C is
given by the x [a, b] which maximizes f (x).
it can be shown
 Using basic analysis,

),
b
,
so
we
define f =
that a unique maximum occurs at x = min max(a, c+d
2

f (x ), with C = f (a, b)/P. This makes h(x)/(Cg(x)) = f (x)/ f . Putting the


steps together we have the following truncated AR algorithm for (x, y):
cx

Truncated BVN AR Simulation Algorithm


1. Input truncation limits

(a, b) and (c, d),and correlation coefficient .
2. Compute f = f min max(a, c+d
), b , and
2
Repeat: compute x N (a, b), u U (0, 1)
Until u f (x)/ f (accepting the final

 x);
3. Using the accepted x, compute y N cx 2 , dx 2 ;
1

4. Return (x,y).
The notation (x, y) B N ((a, b), (c, d); ) will be used to denote an (x, y) pair produced by this algorithm. We need  n1
 (x, y) pairs for each approximate (a, b, )
2
computation (5). We will present some test results using this MC algorithm in
Sect. 4.4.

Numerical Computation of Multivariate Normal Probabilities

297

4.2 BVN Conditioned Simulation with QMC Points


We also investigated the use of QMC point sets with BVN conditioned simulations,
because of the improved convergence properties for QMC point sets compared to MC
point sets for the univariate conditioned algorithms. Initially, we considered methods
which use QMC points in a fairly direct manner, by simply replacing the MC points
required for the truncated BVN AR simulations with QMC points. The validity of
the use of QMC points with AR algorithms has been analyzed previously by various
authors and this work was recently reviewed with further analysis in the paper by
Zhu and Dick [17].
An implementation problem with the truncated BVN AR algorithm is the indeterminate length AR loop, which is repeated for each approximate (a, b, ) com times). Each approximate (a, b, ) computation requires a
putation (5) ( n1
2
vector of components from a QMC sequence, but the vector length is different for
each approximate (a, b, ), because of the AR loops. While the expected length
of these vectors can be estimated, a robust implementation requires the use of a QMC
sequence with dimension larger than this expected length, to allow for the cases when
the AR loops all have several rejections. We ran some tests for this type of algorithm
using both Kronecker and lattice rule QMC sequences, with similar results, and
the results for lattice rules are reported in Sect. 4.4. An alternate method for using
QMC sequences with AR algorithms, which does not require indeterminate length
AR loops, uses smoothing. In the next section, we will describe how a smoothing
method can be used with the truncated BVN AR algorithm.

4.3 Smoothed AR for BVN Simulation


Smoothed Acceptance-Rejection has been studied in several forms (see, for example,
Wang [16], or Moskowitz and Caflish [10]). For truncated BVN simulations, we will
use an algorithm similar to the Wang algorithm. In order to describe our algorithm,
we use notation similar to that used in the previous section, and consider the basic
calculation for each stage in the conditioned BVN simulation algorithm. There we
used an approximation in the form

a

e 2 x

2
1

dx

1 2

cx

1 2

e 2 y
y ),
F(x, y)d yd x P F(x,
2
1

(6)

with (x,
y ) B N ((a, b), (c, d); ), and we used AR to determine x.
In order to use
a smoothed AR simulation for x,
we rewrite the BVN integral as

P=
a

e 2 x f (x)
dx
f

f
2
1


a

e 2 x
f

2
1


0

I (r (x) < u)dud x,

298

A. Genz and G. Trinh

where r (x) = f (x)/ f , and I (s) is the indicator function (with value 1 if s is true and
0 otherwise). This setup can be used for MC or QMC simulations (first simulate x
N (a, b) by inversion from U (0, 1), then use u U (0, 1)), but the nonsmooth I (s)
is not expected to lead to an efficient algorithm. However, we tested this unsmoothed
(USAR) algorithm, where the approximation which replaces P in (6) is
P = (a, b)) f I (r (x) < u).
These approximations, which are sometimes zero, are used to replace the Pi values
in (5), and the primary problem is that the USAR simulation algorithm can often
have zero value for (5).
Smoothed AR replaces I (r < u) with a smoother function wr (u) which satisfies
1
the condition 0 wr (u)du = r . After some experimentation and consideration of
the possibilities discussed by Wang [16], we chose to replace I (r (x) < u) by the
continuous

(x)
u, if u r (x);
1 1r
r (x)
wr (x) (u) =
r (x)
(1 u), otherwise.
1r (x)
1
0

It is is easy to check that



P=
a

wr (u)du = r , so that now we have

e 2 x
f (x)d x

2
1

e 2 x
f

2
1


0

w f (x) (u)dud x.
f

This leads to a smoothed AR algorithm for BVN


 simulation where, at each stage, x
N (a, b), followed by y N cx 2 , dx 2 , and u U (0, 1) is used to provide an
1

additional weight for that stage. The resulting contribution to the product for each
(a, b, ) approximation in (5) is
Pi = (a, b) f wr (x) (u)
instead of Pi . Notice that Pi is not needed for the SAR algorithm, and the algorithm
is similar to the univariate conditioned algorithm which uses
%
(a, b)

c x d x
!
,!
1 2
1 2

&

instead of Pi . After k stages


(a, b; )

k
$
i=1

Pi

1
if n = 2k;
(a n , bn ) if n = 2k + 1.

(7)

Numerical Computation of Multivariate Normal Probabilities

299

As with AR, the k stages in the algorithm are repeated and the results are averaged
to produce the final approximation to (a, b; ).
The SAR algorithm requires one additional u U (0, 1) for each stage so, assuming that x,
and y are both computed using truncated univariate Normal inversion of
U (0, 1)s, the total number of U (0, 1)s is m = 3n/2 1 for each approximation
to (a, b; ) for an MC SAR algorithm. For a QMC SAR algorithm, m-dimensional
QMC vectors with components from (0, 1) replace the m-dimensional U (0, 1) component vectors for the MC algorithm.

4.4 Randomized AR and SAR Tests


We completed a series of tests to compare MATLAB implementations of the algorithms discussed in this paper. For each n = 4, . . . , 15, we generated 250 random
(b, ) combinations. Each = Q D Q t was determined from a randomly generated
n n orthogonal matrix Q (see Stewart [14]) and a diagonal matrix with diagonal
entries di = u i , and each b vector had bi = nvi , with u i , vi uniform random from
[0, 1]. We used ai = for all i for all tests. Given a randomly chosen (a, b; )
problem, all of the tested algorithms were used for that problem. The term points
used in the Tables refers to the number of approximations to a randomly chosen
(a, b; ) problem that were used by each algorithm to compute that algorithms
final approximation. The QMC point set that we used for all tests was a lattice
rule point set determined using the fast CBC algorithm developed by Nuyens and
Cools [9].
Table 1 provides some test results for errors for the six algorithms:

AR(MC) used BVN simulation with AR and MC points;


USAR used unsmoothed AR with QMC points;
SAR used smoothed AR with QMC points;
AR(QMC) used BVN simulation with AR and QMC points;
UV(QMC) used univariate simulation with QMC points;
UV(MC) used univariate simulation with MC points.

All of the algorithm used the GGE univariate variable prioritization algorithm
described in Sect. 2.1. We used the Matlab mvncdf function to compute exact
values for each .
The results in Table 1 show, as was expected, that USAR is clearly not competitive
with any of the other algorithms. Somewhat surprisingly, the AR(MC) algorithm had
average errors that were somewhat smaller than the SAR errors, and (2 3) smaller
than the univariate conditioned MC algorithm. The AR(QMC) algorithm had errors
(5 10) smaller than the AR(MC) algorithm and were similar to the UV(QMC)
algorithm errors.
Table 2 provides some test results for times for the six algorithms using Matlab
with a 3.5Ghz processor Linux workstation. The results in Table 2 show that the


Table 1 Average absolute errors for MVN simulation algorithms, 2500 points

n    AR(MC)    USAR      SAR       AR(QMC)   UV(QMC)   UV(MC)
4    0.000039  0.000285  0.000054  0.000008  0.000008  0.000125
5    0.000042  0.000282  0.000097  0.000010  0.000005  0.000137
6    0.000040  0.000370  0.000066  0.000008  0.000005  0.000154
7    0.000056  0.000279  0.000071  0.000007  0.000005  0.000109
8    0.000052  0.000341  0.000075  0.000007  0.000005  0.000111
9    0.000039  0.000335  0.000094  0.000007  0.000005  0.000138
10   0.000066  0.000324  0.000224  0.000005  0.000006  0.000126
11   0.000045  0.000278  0.000073  0.000003  0.000004  0.000113
12   0.000046  0.000298  0.000101  0.000005  0.000003  0.000107
13   0.000036  0.000316  0.000072  0.000004  0.000003  0.000100
14   0.000050  0.000354  0.000079  0.000003  0.000003  0.000106
15   0.000026  0.000406  0.000066  0.000006  0.000003  0.000099
Table 2 Average Matlab times (s) for MVN simulation algorithms, 2500 points

n    AR(MC)  USAR   SAR    AR(QMC)  UV(QMC)  UV(MC)
4    0.486   0.479  0.471  0.509    0.007    0.008
5    0.899   0.657  0.656  0.926    0.009    0.011
6    1.072   0.829  0.836  1.096    0.011    0.013
7    1.478   1.007  1.014  1.519    0.013    0.016
8    1.649   1.183  1.195  1.686    0.015    0.018
9    2.069   1.357  1.378  2.107    0.016    0.021
10   2.226   1.519  1.553  2.271    0.018    0.023
11   2.626   1.689  1.725  2.695    0.020    0.026
12   2.800   1.864  1.910  2.862    0.022    0.029
13   3.208   2.067  2.087  3.284    0.024    0.032
14   3.380   2.204  2.269  3.440    0.026    0.034
15   3.784   2.405  2.449  3.865    0.028    0.037

AR algorithms take more time (the difference increasing with dimension) compared to the approximately equal-time USAR and SAR algorithms; these AR versus SAR/USAR time differences are caused by the extra random number generation and acceptance testing needed by AR. The UV algorithms take much less time (≈ 1/100) because these algorithms can easily be implemented in Matlab in a vectorized form which allows large sets of Φ(a, b; Σ) approximations to be computed simultaneously.


5 Conclusions
The Monte Carlo MVN simulation methods described in this paper, which use bivariate conditioning, are more accurate than the univariate conditioned Monte Carlo simulation methods that we tested. However, there is a significant additional time cost for the bivariate algorithms because there is no simple algorithm for simulation from truncated BVN distributions.
We also considered the use of QMC methods with bivariate conditioned MVN computations, but the lack of a direct algorithm for truncated BVN simulation does not allow the straightforward use of QMC point sequences. We did, however, test a simple QMC algorithm which replaces the MC vectors for the truncated BVN AR simulations with QMC vectors, and this algorithm was significantly more accurate than the MC algorithm, with error levels comparable to the univariate conditioned QMC algorithm. We also derived a smoothed AR algorithm which could be used with a QMC sequence for truncated BVN simulation. When this algorithm was combined with the bivariate conditioned MVN algorithm, however, the testing showed that this smoothed AR BVN conditioned algorithm had larger errors than the MC AR BVN conditioned algorithm. The complete algorithm was not as accurate as a univariate conditioned QMC algorithm. The bivariate conditioned algorithms also require significantly more time than the (easily vectorized) univariate conditioned algorithms. Unfortunately, the goal of finding an effective bivariate conditioned QMC MVN algorithm has not been achieved. It is possible that a more direct algorithm for truncated BVN simulation could lead to a more efficient MVN computation algorithm based on bivariate conditioning with QMC sequences, but this is a subject for future research.

References
1. Chopin, N.: Fast simulation of truncated Gaussian distributions. Stat. Comput. 21, 275–288 (2011)
2. Drezner, Z., Wesolowsky, G.O.: On the computation of the bivariate normal integral. J. Stat. Comput. Simul. 35, 101–107 (1990)
3. Genz, A.: Numerical computation of rectangular bivariate and trivariate normal and t probabilities. Stat. Comput. 14, 151–160 (2004)
4. Genz, A., Bretz, F.: Methods for the computation of multivariate t-probabilities. J. Comput. Graph. Stat. 11, 950–971 (2002)
5. Genz, A., Bretz, F.: Computation of Multivariate Normal and t Probabilities. Lecture Notes in Statistics, vol. 195. Springer, New York (2009)
6. Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. Johns Hopkins University Press, Baltimore (2012)
7. Gibson, G.J., Glasbey, C.A., Elston, D.A.: Monte Carlo evaluation of multivariate normal integrals and sensitivity to variate ordering. In: Dimov, I.T., Sendov, B., Vassilevski, P.S. (eds.) Advances in Numerical Methods and Applications, pp. 120–126. World Scientific Publishing, River Edge (1994)
8. Hickernell, F.J.: Obtaining O(N^{-2+ε}) convergence for lattice quadrature rules. In: Fang, K.T., Hickernell, F.J., Niederreiter, H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2000, pp. 274–289. Springer, Berlin (2002)
9. Nuyens, D., Cools, R.: Fast algorithms for component-by-component construction of rank-1 lattice rules in shift-invariant Reproducing Kernel Hilbert Spaces. Math. Comput. 75, 903–920 (2006)
10. Moskowitz, B., Caflisch, R.E.: Smoothness and dimension reduction in quasi-Monte Carlo methods. Math. Comput. Model. 23, 37–54 (1996)
11. Muthén, B.: Moments of the censored and truncated bivariate normal distribution. Br. J. Math. Stat. Psychol. 43, 131–143 (1991)
12. Sándor, Z., András, P.: Alternative sampling methods for estimating multivariate normal probabilities. J. Econ. 120, 207–234 (2002)
13. Schervish, M.J.: Algorithm AS 195: multivariate normal probabilities with error bound. J. R. Stat. Soc. Ser. C 33, 81–94 (1984); correction 34, 103–104 (1985)
14. Stewart, G.W.: The efficient generation of random orthogonal matrices with an application to condition estimators. SIAM J. Numer. Anal. 17(3), 403–409 (1980)
15. Trinh, G., Genz, A.: Bivariate conditioning approximations for multivariate normal probabilities. Stat. Comput. (2014). doi:10.1007/s11222-014-9468-y
16. Wang, X.: Improving the rejection sampling method in quasi-Monte Carlo methods. J. Comput. Appl. Math. 114, 231–246 (2000)
17. Zhu, H., Dick, J.: Discrepancy bounds for deterministic acceptance-rejection samplers. Electron. J. Stat. 8, 687–707 (2014)

Non-nested Adaptive Timesteps in Multilevel Monte Carlo Computations

Michael B. Giles, Christopher Lester and James Whittle

Abstract This paper shows that it is relatively easy to incorporate adaptive timesteps into multilevel Monte Carlo simulations without violating the telescoping sum on which multilevel Monte Carlo is based. The numerical approach is presented for both SDEs and continuous-time Markov processes. Numerical experiments are given for each, with the full code available for those who are interested in seeing the implementation details.

Keywords: Multilevel Monte Carlo · Adaptive timestep · SDE · Continuous-time Markov process

1 Multilevel Monte Carlo and Adaptive Simulations


Multilevel Monte Carlo methods [4, 6, 8] are a very simple and general approach to improving the computational efficiency of a wide range of Monte Carlo applications. Given a set of approximation levels $\ell = 0, 1, \ldots, L$ giving a sequence of approximations $P_\ell$ of a stochastic output $P$, with the cost and accuracy both increasing as $\ell$ increases, then a trivial telescoping sum gives
$$ E[P_L] = E[P_0] + \sum_{\ell=1}^{L} E[P_\ell - P_{\ell-1}], \qquad (1) $$
expressing the expected value on the finest level as the expected value on the coarsest level of approximation plus a sum of expected corrections.

M.B. Giles (B) C. Lester J. Whittle


Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
e-mail: mike.giles@maths.ox.ac.uk
C. Lester
e-mail: christopher.lester@maths.ox.ac.uk


Approximating each of the expectations on the r.h.s. of (1) independently using $N_\ell$ samples, we obtain the multilevel estimator
$$ Y = \sum_{\ell=0}^{L} Y_\ell, \qquad Y_\ell = N_\ell^{-1} \sum_{n=1}^{N_\ell} \bigl( P_\ell^{(n)} - P_{\ell-1}^{(n)} \bigr), $$
with $P_{-1} \equiv 0$. The Mean Square Error of this estimator can be shown to be
$$ E\bigl[ (Y - E[P])^2 \bigr] = \bigl( E[P_L] - E[P] \bigr)^2 + \sum_{\ell=0}^{L} N_\ell^{-1} V_\ell, $$
where $V_\ell \equiv V[P_\ell - P_{\ell-1}]$ is the variance of a single multilevel correction sample on level $\ell$. To ensure that the MSE is less than some given accuracy $\varepsilon^2$, it is then sufficient to choose the finest level $L$ so that the bias $|E[P_L] - E[P]|$ is less than $\varepsilon/\sqrt{2}$, and the number of samples $N_\ell$ so that the variance sum is less than $\varepsilon^2/2$.
If $C_\ell$ is the cost of a single sample $P_\ell - P_{\ell-1}$, then a constrained optimisation, minimising the computational cost for a fixed total variance, leads to
$$ N_\ell = 2\,\varepsilon^{-2} \sqrt{V_\ell / C_\ell}\; \sum_{\ell'=0}^{L} \sqrt{V_{\ell'} C_{\ell'}}. $$
In the particular case in which $|E[P_\ell] - E[P]| \propto 2^{-\alpha \ell}$, $V_\ell \propto 2^{-\beta \ell}$, $C_\ell \propto 2^{\gamma \ell}$, as $\ell \to \infty$, this results in the total cost to achieve the $\varepsilon^2$ MSE accuracy being
$$ C = \begin{cases} O(\varepsilon^{-2}), & \beta > \gamma,\\ O\bigl(\varepsilon^{-2} (\log \varepsilon^{-1})^2\bigr), & \beta = \gamma,\\ O\bigl(\varepsilon^{-2-(\gamma-\beta)/\alpha}\bigr), & \beta < \gamma. \end{cases} $$
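To make the sample-allocation formula concrete, here is a small Python sketch (illustrative only; the per-level variances, costs and tolerance are invented for the example) that computes the optimal numbers of samples $N_\ell$:

```python
import math

def mlmc_sample_sizes(V, C, eps):
    """Optimal MLMC sample sizes N_l = 2 eps^-2 sqrt(V_l/C_l) * sum_l' sqrt(V_l' C_l'),
    rounded up and forced to be at least one per level."""
    total = sum(math.sqrt(v * c) for v, c in zip(V, C))
    return [max(1, math.ceil(2.0 * eps**-2 * math.sqrt(v / c) * total))
            for v, c in zip(V, C)]

# illustrative values: variance halving (beta=1) and cost doubling (gamma=1) per level
V = [2.0**-l for l in range(5)]
C = [2.0**l for l in range(5)]
print(mlmc_sample_sizes(V, C, eps=0.01))
```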
The above is a quick overview of the multilevel Monte Carlo (MLMC) approach. In the specific context of outputs which are functionals of the solution of an SDE, most MLMC implementations use a set of levels with exponentially decreasing uniform timesteps, i.e. on level $\ell$ the uniform timestep is
$$ h_\ell = M^{-\ell}\, h_0, $$
where $M$ is an integer. When using the Euler–Maruyama approximation it is usually found that the optimum value for $M$ is in the range 4–8, whereas for higher order strong approximations such as the Milstein first order approximation it is found that $M = 2$ is best.
The MLMC implementation is then very straightforward. In computing a single correction sample $P_\ell - P_{\ell-1}$, one can first generate the Brownian increments for the fine path simulation which leads to the output $P_\ell$. The Brownian increments can then be summed in groups of size $M$ to provide the Brownian increments for the coarse


path simulation which yields the output $P_{\ell-1}$. The strong convergence properties of the numerical approximation ensure that the difference between the fine and coarse path simulations decays exponentially as $\ell \to \infty$, and therefore the output difference $P_\ell - P_{\ell-1}$ also decays exponentially; this is an immediate consequence if the output is a Lipschitz functional of the path solution, but in other cases it requires further analysis.
In the computational finance applications which have motivated a lot of MLMC research, it is appropriate to use uniform timesteps on each level because the drift and volatility in the SDEs do not vary significantly from one path to another, or from one time to another. However, in other applications with large variations in drift and volatility, adaptive timestepping can provide very significant reductions in computational cost for a given level of accuracy [15]. It can also be used to address difficulties with SDEs such as
$$ dS_t = -S_t^3\, dt + dW_t, $$
which have a super-linear growth in the drift and/or the volatility, which otherwise leads to strong instabilities when using uniform timesteps [11].
The most significant prior research on adaptive timestepping in MLMC has been by Hoel, von Schwerin, Szepessy and Tempone [9, 10]. In their research, they construct a multilevel adaptive timestepping discretisation in which the timesteps used on level $\ell$ are a subdivision of those used on level $\ell-1$, which in turn are a subdivision of those on level $\ell-2$, and so on. By doing this, the payoff $P_\ell$ on level $\ell$ is the same regardless of whether one is computing $P_\ell - P_{\ell-1}$ or $P_{\ell+1} - P_\ell$, and therefore the MLMC telescoping summation (1) is respected. Another notable aspect of their work is the use of adjoint/dual sensitivities to determine the optimal timestep size, so that the adaptation is based on the entire path solution.
In this paper, we introduce an alternative approach in which the adaptive timesteps are not nested, so that the timesteps on level $\ell$ do not correspond to a subdivision of the timesteps on level $\ell-1$. This leads to an implementation which is perhaps a little simpler, and perhaps a more natural extension to existing adaptive timestepping methods. The local adaptation is based on the current state of the computed path, but it would also work with adjoint-based adaptation based on the entire path. We also show that it extends very naturally to continuous-time Markov processes, extending ideas due to Anderson and Higham [1, 2]. The key point to be addressed is how to construct a tight coupling between the fine and coarse path simulations, and at the same time ensure that the telescoping sum is fully respected.

2 Non-nested Adaptive Timesteps


The essence of the approach to non-nested adaptive timestepping in MLMC is illustrated in Fig. 1.


Fig. 1 Simulation times for multilevel Monte Carlo with adaptive timesteps

Algorithm 1 Outline of the algorithm for a single MLMC sample for $\ell > 0$ for a scalar Brownian SDE with adaptive timestepping for the time interval $[0, T]$.

t := 0;  t^c := 0;  t^f := 0
h^c := 0;  h^f := 0
ΔW^c := 0;  ΔW^f := 0
while (t < T) do
    t_old := t
    t := min(t^c, t^f)
    δW ∼ N(0, t − t_old)
    ΔW^c := ΔW^c + δW
    ΔW^f := ΔW^f + δW
    if t = t^c then
        update coarse path using h^c and ΔW^c
        compute new adapted coarse path timestep h^c
        h^c := min(h^c, T − t^c)
        t^c := t^c + h^c
        ΔW^c := 0
    end if
    if t = t^f then
        update fine path using h^f and ΔW^f
        compute new adapted fine path timestep h^f
        h^f := min(h^f, T − t^f)
        t^f := t^f + h^f
        ΔW^f := 0
    end if
end while
compute P_ℓ − P_{ℓ−1}

For Brownian diffusion SDEs, level $\ell$ uses an adaptive timestep of the form $h_\ell = M^{-\ell} H(S_n)$, where $M > 1$ is a real constant, and $H(S)$ is independent of level. This automatically respects the telescoping summation (1), since the adaptive timestep on level $\ell$ is the same regardless of whether it is the coarser or finer of the two paths being computed. On average, the adaptive timestepping leads to simulations on level $\ell$ having approximately $M$ times as many timesteps as level $\ell-1$, but it also results in timesteps which are not naturally nested, so the simulation times for the coarse path do not correspond to simulation times on the fine path. It may appear that this would cause difficulties in the strong coupling between the coarse and fine

paths in the MLMC implementation, but it does not. As usual, what is essential to achieve a low multilevel correction variance $V_\ell$ is that the same underlying Brownian path is used for both the fine and coarse paths. Figure 1 shows a set of simulation times which is the union of the fine and coarse path times. This defines a set of intervals, and for each interval we generate a Brownian increment with the appropriate variance. These increments are then summed to give the Brownian increments for the fine and coarse path timesteps.
An outline implementation to compute a single sample of $P_\ell - P_{\ell-1}$ for $\ell > 0$ is given in Algorithm 1. This could use either an Euler–Maruyama discretisation of the SDE, or a first order Milstein discretisation for those SDEs which do not require the simulation of Lévy area terms.
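The following Python sketch (illustrative; not the authors' MATLAB implementation, and with placeholder drift, volatility, timestep function and payoff $P = S_T$) implements the logic of Algorithm 1 for one coupled sample $P_\ell - P_{\ell-1}$ of a scalar SDE:

```python
import math, random

def adapted_sample(l, T=1.0, M=2.0, drift=lambda s: -s, vol=lambda s: 1.0,
                   H=lambda s: 0.1 / (1.0 + s * s)):
    """One coupled sample of P_l - P_{l-1} (payoff P = S_T) with non-nested
    adaptive timesteps: fine level uses h = M^-l H(S), coarse level M^-(l-1) H(S)."""
    t = tc = tf = 0.0
    hc = hf = 0.0                      # zero initial timesteps force an immediate update
    dWc = dWf = 0.0
    Sc = Sf = 1.0                      # common initial condition
    while t < T:
        t_old, t = t, min(tc, tf)
        dW = random.gauss(0.0, math.sqrt(max(t - t_old, 0.0)))
        dWc += dW
        dWf += dW
        if t == tc:                    # coarse path update (Euler-Maruyama)
            Sc += drift(Sc) * hc + vol(Sc) * dWc
            hc = min(M ** -(l - 1) * H(Sc), T - tc)
            tc += hc
            dWc = 0.0
        if t == tf:                    # fine path update (Euler-Maruyama)
            Sf += drift(Sf) * hf + vol(Sf) * dWf
            hf = min(M ** -l * H(Sf), T - tf)
            tf += hf
            dWf = 0.0
    return Sf - Sc

print(sum(adapted_sample(3) for _ in range(1000)) / 1000)
```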
Adaptive timestepping for continuous-time Markov processes works in a very similar fashion. The evolution of a continuous-time Markov process can be described by
$$ S_t = S_0 + \sum_j \nu_j\, P_j\!\left( \int_0^t \lambda_j(S_s)\, ds \right), $$
where the summation is over the different reactions, $\nu_j$ is the change due to reaction $j$ (the number of molecules of each species which are created or destroyed), the $P_j$ are independent unit-rate Poisson processes, and $\lambda_j$ is the propensity function for the $j$th reaction, meaning that $\lambda_j(S_t)\, dt$ is the probability of reaction $j$ taking place in the infinitesimal time interval $(t, t+dt)$.
$\lambda_j(S_t)$ should be updated after each individual reaction, since each reaction changes $S_t$, but in the tau-leaping approximation [7] $\lambda_j$ is updated only at a fixed set of update times.
This is the basis for the MLMC construction due to Anderson and Higham [1]. Using nested uniform timesteps, with $h^c = 2\, h^f$, each coarse timestep is split into two fine timesteps, and for each of the fine timesteps one has to compute appropriate Poisson increments $P_j(\lambda_j^c\, h^f)$ for the coarse path and $P_j(\lambda_j^f\, h^f)$ for the fine path. To achieve a tight coupling between the coarse and fine paths, they use the fact that
$$ \lambda_j^c = \min(\lambda_j^c, \lambda_j^f) + |\lambda_j^c - \lambda_j^f|\, 1_{\lambda_j^c > \lambda_j^f}, \qquad \lambda_j^f = \min(\lambda_j^c, \lambda_j^f) + |\lambda_j^c - \lambda_j^f|\, 1_{\lambda_j^c < \lambda_j^f}, $$
together with the fact that a Poisson variate $P(a+b)$ is equivalent in distribution to the sum of independent Poisson variates $P(a)$, $P(b)$. Hence, they generate common Poisson variates $P(\min(\lambda_j^c, \lambda_j^f)\, h^f)$ and $P(|\lambda_j^c - \lambda_j^f|\, h^f)$ and use these to give the Poisson variates for the coarse and fine paths over the same fine timestep.
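To illustrate this coupling, here is a minimal Python sketch (not from the paper) of the shared/extra Poisson variates for a single reaction channel over an interval of length h:

```python
import numpy as np

rng = np.random.default_rng(0)

def coupled_poisson_increments(lam_c, lam_f, h):
    """Coupled numbers of firings of one reaction over an interval of length h,
    for the coarse (propensity lam_c) and fine (propensity lam_f) paths."""
    common = rng.poisson(min(lam_c, lam_f) * h)   # shared between both paths
    extra = rng.poisson(abs(lam_c - lam_f) * h)   # added to the path with larger propensity
    n_c = common + (extra if lam_c > lam_f else 0)
    n_f = common + (extra if lam_f > lam_c else 0)
    return n_c, n_f

print(coupled_poisson_increments(2.0, 2.5, 0.1))
```

By construction each marginal is Poisson with the correct mean, while the two counts are highly correlated whenever the propensities are close.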
As outlined in Algorithm 2, the extension of adaptive timesteps to continuous-time Markov processes based on the tau-leaping approximation is quite natural. The Poisson variates are computed for each time interval in the time grid formed by the union of the coarse and fine path simulation times. At the end of each coarse timestep, the propensity functions $\lambda^c$ are updated, and a new adapted timestep $h^c$ is defined. Similarly, $\lambda^f$ and $h^f$ are updated at the end of each fine timestep.


Algorithm 2 Outline of the algorithm for a single MLMC sample for a continuous-time Markov process with adaptive timestepping for the time interval $[0, T]$.

t := 0;  t^c := 0;  t^f := 0
λ^c := 0;  λ^f := 0
h^c := 0;  h^f := 0
while (t < T) do
    t_old := t
    t := min(t^c, t^f)
    h := t − t_old
    for each reaction, generate Poisson variates P(min(λ^c, λ^f) h), P(|λ^c − λ^f| h)
    use Poisson variates to update fine and coarse path solutions
    if t = t^c then
        update coarse path propensities λ^c
        compute new adapted coarse path timestep h^c
        h^c := min(h^c, T − t^c)
        t^c := t^c + h^c
    end if
    if t = t^f then
        update fine path propensities λ^f
        compute new adapted fine path timestep h^f
        h^f := min(h^f, T − t^f)
        t^f := t^f + h^f
    end if
end while
compute P_ℓ − P_{ℓ−1}

The telescoping sum is respected because, for each timestep of either the coarse or fine path simulation, the sum of the Poisson variates for the sub-intervals is equivalent in distribution to the Poisson variate for the entire timestep, and therefore the expected value $E[P_\ell]$ is unaffected.

3 Numerical Experiments
3.1 FENE SDE Kinetic Model
A kinetic model for a dilute solution of polymers in a fluid considers each molecule as
a set of balls connected by springs. The balls are each subject to random forcing from
the fluid, and the springs are modelled with a FENE (finitely extensible nonlinear
elastic) potential which increases without limit as the length of the bond approaches
a finite value [3].


In the case of a molecule with just one bond, this results in the following 3D SDE for the vector length of the bond:
$$ dq_t = -\frac{4}{1 - \|q_t\|^2}\, q_t\, dt + \sqrt{2}\, dW_t, $$
where the value 4 is the drift coefficient used for the numerical experiments to be presented, and $W_t$ is a 3D driving Brownian motion. Note that the drift term ensures that $\|q_t\| < 1$ for all time, and this property should be respected in the numerical approximation.
An Euler–Maruyama discretisation of the SDE using timestep $h_n$ gives
$$ q_{n+1} = q_n - \frac{4\, h_n}{1 - \|q_n\|^2}\, q_n + \sqrt{2}\, \Delta W_n, $$
and because the volatility is constant, one would expect this to give first order strong convergence. The problem is that this discretisation leads to $\|q_{n+1}\| > 1$ with positive probability, since $\Delta W_n$ is unbounded.
[Fig. 2 MLMC results for the FENE model using adaptive timesteps. Four panels: log₂ variance and log₂ |mean| of P_ℓ and P_ℓ − P_{ℓ−1} against level ℓ; the number of samples N_ℓ per level for accuracies ε = 0.0005, 0.001, 0.002, 0.005, 0.01; and cost against accuracy ε for standard MC and MLMC.]


This problem is addressed in two ways. The first is to use adaptive timesteps which become much smaller as $\|q_n\| \to 1$. Since $\Delta W_n = \sqrt{h}\, Z_n$, where the component of $Z_n$ in the direction normal to the boundary is a standard Normal random variable which is very unlikely to take a value with magnitude greater than 3, we choose the timestep so that
$$ 6\sqrt{h_n} \le 1 - \|q_n\|, $$
so the stochastic term is highly unlikely to take the path across the boundary. In addition, the drift term is singular at the boundary and therefore for accuracy we want the drift term to be not too large relative to the distance to the boundary so that it will not change by too much during one timestep. Hence, we impose the restriction
$$ \frac{2\, h_n}{1 - \|q_n\|} \le 1 - \|q_n\|. $$
Combining these two gives the adaptive timestep
$$ H(q_n) = \frac{(1 - \|q_n\|)^2}{\max(2, 36)} $$
on the coarsest level of approximation. On finer levels, the timestep is $h_n = 2^{-\ell} H(q_n)$ so that level $\ell$ has approximately $2^{\ell}$ times as many timesteps as level 0.
Despite the adaptive timestep there is still an extremely small possibility that the numerical approximation gives $\|q_{n+1}\| > 1$. This is handled by introducing clamping with
$$ q_{n+1}^{\mathrm{clamped}} := (1-\delta)\, \frac{q_{n+1}}{\|q_{n+1}\|} \quad \text{if } \|q_{n+1}\| > 1-\delta, $$
with $\delta$ typically chosen to be $10^{-5}$, which corresponds to an adaptive timestep of order $10^{-10}$ for the next timestep. Numerical experiments suggest that this value for $\delta$ does not lead to any significant bias in the output of interest.
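A minimal Python sketch of the adaptive timestep, the clamping, and one Euler–Maruyama step (illustrative only; the constants follow the reconstructed formulas above):

```python
import numpy as np

DELTA = 1e-5  # clamping parameter delta

def adapted_timestep(q, level):
    """Adaptive FENE timestep h_n = 2^-level (1 - |q|)^2 / max(2, 36)."""
    dist = 1.0 - np.linalg.norm(q)
    return 2.0 ** -level * dist ** 2 / max(2.0, 36.0)

def clamp(q):
    """Pull q back just inside the unit ball if the Euler step overshoots."""
    norm = np.linalg.norm(q)
    return (1.0 - DELTA) * q / norm if norm > 1.0 - DELTA else q

def euler_step(q, h, dW):
    """Euler-Maruyama step for dq = -4 q /(1-|q|^2) dt + sqrt(2) dW, then clamp."""
    q_new = q - 4.0 * h / (1.0 - np.dot(q, q)) * q + np.sqrt(2.0) * dW
    return clamp(q_new)

q = np.zeros(3)
h = adapted_timestep(q, level=0)
q = euler_step(q, h, np.sqrt(h) * np.random.default_rng(1).standard_normal(3))
print(h, q)
```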
The output of interest in the initial experiments is $E[\|q\|^2]$ at time $T = 1$, having started from initial data $q = 0$ at time $t = 0$. Figure 2 presents the MLMC results, showing first order convergence for the weak error (top right plot) and second order convergence for the multilevel correction variance (top left plot). Thus, in terms of the standard MLMC theory we have $\alpha = 1$, $\beta = 2$, $\gamma = 1$, and hence the computational cost for RMS accuracy $\varepsilon$ is $O(\varepsilon^{-2})$; this is verified in the bottom right plot, with the bottom left plot showing the number of MLMC samples on each level as a function of the target accuracy.


3.2 Dimerization Model


This dimerization model involving 3 species and 4 reactions has been used widely as a test of stochastic simulation algorithms [7, 16] as it exhibits behaviour on multiple timescales. The reaction network is given by:
$$ R_1: S_1 \xrightarrow{\;1\;} \emptyset, \quad R_2: S_2 \xrightarrow{\;1/25\;} S_3, \quad R_3: S_1 + S_1 \xrightarrow{\;1/500\;} S_2, \quad R_4: S_2 \xrightarrow{\;1/2\;} S_1 + S_1, \qquad (2) $$
and the corresponding propensity functions for the 4 reactions are
$$ \lambda_1 = S_1, \quad \lambda_2 = (1/25)\, S_2, \quad \lambda_3 = (1/500)\, S_1 (S_1 - 1), \quad \lambda_4 = (1/2)\, S_2, \qquad (3) $$
where $S_1, S_2, S_3$ are the numbers of each of the 3 species.


We take the initial conditions to be $[S_1, S_2, S_3]^T = [10^5, 0, 0]^T$. In order to understand the dynamics of system (2), Fig. 3 presents the temporal evolution of a single sample path of the system generated by the Gillespie method which simulates each individual reaction. The behaviour is characterised by two distinct time scales, an initial transient phase in which there is rapid change, and a subsequent long phase in which the further evolution is very slow.
This motivates the use of adaptive timesteps. The expected change in species $S_i$ in one timestep of size $h$ is approximately equal to $h \sum_j \nu_{ij}\, \lambda_j$, where $\nu_{ij}$ is the change in species $i$ due to reaction $j$ and the summation is over all of the reactions. Hence,

[Fig. 3 The temporal evolution of a single sample path of reaction system (2) on two different time-scales (copy numbers against time for the transient phase and the long phase). Reaction rates are given in (3) and initial conditions are as described in the text.]


to ensure that there is no more than a 25 % change in any species in one timestep, the timestep on the coarsest level is taken to be
$$ H = 0.25\, \min_i \frac{S_i + 1}{\bigl|\sum_j \nu_{ij}\, \lambda_j\bigr|}. \qquad (4) $$

On level $\ell$, this timestep is multiplied by $M^{-\ell}$. The choice $M = 4$ is found to be good; this is in line with experience and analysis of SDEs which shows that values for $M$ in the range 4–8 are good when the multilevel variance is $O(h)$, as it is with this continuous-time Markov process application [2].
The output quantity of interest is $E[S_3]$ at time $T = 30$, which is the maximum time shown in Fig. 3. The value is approximately 20,000, so much larger values for $\varepsilon$ are appropriate in this case. The MLMC results for this testcase in Fig. 4 indicate that the MLMC parameters are $\alpha = 2$, $\beta = 2$, $\gamma = 2$, and hence the computational cost is $O(\varepsilon^{-2} (\log \varepsilon)^2)$. Additional results show that the computational efficiency is much greater than using uniform timesteps.
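To make the propensities (3) and the coarsest-level timestep (4) concrete, here is an illustrative Python sketch (it only evaluates these quantities, not the full MLMC coupling):

```python
import numpy as np

# stoichiometry: rows = reactions R1..R4, columns = species S1, S2, S3
NU = np.array([[-1,  0, 0],    # R1: S1 -> 0
               [ 0, -1, 1],    # R2: S2 -> S3
               [-2,  1, 0],    # R3: S1 + S1 -> S2
               [ 2, -1, 0]])   # R4: S2 -> S1 + S1

def propensities(S):
    """Propensity functions (3) for the dimerization model."""
    S1, S2, _ = S
    return np.array([S1, S2 / 25.0, S1 * (S1 - 1) / 500.0, S2 / 2.0])

def coarsest_timestep(S):
    """Timestep (4): at most ~25% expected relative change in any species."""
    lam = propensities(S)
    drift = NU.T @ lam                          # expected change per unit time, per species
    with np.errstate(divide="ignore"):
        ratios = (np.array(S) + 1) / np.abs(drift)
    return 0.25 * np.min(ratios[np.isfinite(ratios)])

S0 = [10**5, 0, 0]
print(propensities(S0), coarsest_timestep(S0))
```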

[Fig. 4 MLMC results for the continuous-time Markov process using adaptive timesteps. Four panels: log₂ variance and log₂ |mean| of P_ℓ − P_{ℓ−1} against level ℓ; the number of samples N_ℓ per level for accuracies ε = 1, 2, 5, 10, 20; and cost against accuracy ε for standard MC and MLMC.]


Note that these numerical results do not include a final multilevel correction which
couples the tau-leaping approximation on the finest grid level to the unbiased Stochastic Simulation Algorithm which simulates each individual reaction. This additional coupling is due to Anderson and Higham [1], and the extension to adaptive
timestepping is discussed in [12]. Related research on adaptation has been carried out
by [13, 14].

4 Conclusions
This paper has just one objective, to explain how non-nested adaptive timesteps
can be incorporated very easily within multilevel Monte Carlo simulations, without
violating the telescoping sum on which MLMC is based.
Outline algorithms and accompanying numerical demonstrations are given for
both SDEs and continuous-time Markov processes. For those interested in learning
more about the implementation details, the full MATLAB code for the numerical
examples is available with other example codes prepared for a recent review paper
[5, 6].
Future papers will investigate in more detail the FENE simulations, including
results for molecules with multiple bonds and the interaction with fluids with nonuniform velocity fields, and the best choice of adaptive timesteps for continuous-time
Markov processes [12].
The adaptive approach could also be extended easily to Lévy processes and other
processes in which the numerical approximation comes from the simulation of increments of a driving process over an appropriate set of time intervals formed by a union
of the simulation times for the coarse and fine path approximations.
Acknowledgments MBG's research was funded in part by EPSRC grant EP/H05183X/1, and CL and JW were funded in part by a CCoE grant from NVIDIA. In compliance with EPSRC's open access initiative, the data in this paper, and the MATLAB codes which generated it, are available from doi:10.5287/bodleian:s4655j04n. This work has benefitted from extensive discussions with Ruth Baker, Endre Süli, Kit Yates and Shenghan Ye.

References
1. Anderson, D., Higham, D.: Multi-level Monte Carlo for continuous time Markov chains with applications in biochemical kinetics. SIAM Multiscale Model. Simul. 10(1), 146–179 (2012)
2. Anderson, D., Higham, D., Sun, Y.: Complexity of multilevel Monte Carlo tau-leaping. SIAM J. Numer. Anal. 52(6), 3106–3127 (2014)
3. Barrett, J., Süli, E.: Existence of global weak solutions to some regularized kinetic models for dilute polymers. SIAM Multiscale Model. Simul. 6(2), 506–546 (2007)
4. Giles, M.: Multilevel Monte Carlo path simulation. Oper. Res. 56(3), 607–617 (2008)
5. Giles, M.: Matlab code for multilevel Monte Carlo computations. http://people.maths.ox.ac.uk/gilesm/acta/ (2014)
6. Giles, M.: Multilevel Monte Carlo methods. Acta Numer. 24, 259–328 (2015)
7. Gillespie, D.: Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 115(4), 1716–1733 (2001)
8. Heinrich, S.: Multilevel Monte Carlo methods. In: Multigrid Methods. Lecture Notes in Computer Science, vol. 2179, pp. 58–67. Springer, Heidelberg (2001)
9. Hoel, H., von Schwerin, E., Szepessy, A., Tempone, R.: Adaptive multilevel Monte Carlo simulation. In: Engquist, B., Runborg, O., Tsai, Y.H. (eds.) Numerical Analysis of Multiscale Computations. Lecture Notes in Computational Science and Engineering, vol. 82, pp. 217–234. Springer, Heidelberg (2012)
10. Hoel, H., von Schwerin, E., Szepessy, A., Tempone, R.: Implementation and analysis of an adaptive multilevel Monte Carlo algorithm. Monte Carlo Methods Appl. 20(1), 1–41 (2014)
11. Hutzenthaler, M., Jentzen, A., Kloeden, P.: Divergence of the multilevel Monte Carlo method. Ann. Appl. Prob. 23(5), 1913–1966 (2013)
12. Lester, C., Yates, C., Giles, M., Baker, R.: An adaptive multi-level simulation algorithm for stochastic biological systems. J. Chem. Phys. 142(2) (2015)
13. Moraes, A., Tempone, R., Vilanova, P.: A multilevel adaptive reaction-splitting simulation method for stochastic reaction networks. Preprint arXiv:1406.1989 (2014)
14. Moraes, A., Tempone, R., Vilanova, P.: Multilevel hybrid Chernoff tau-leap. SIAM J. Multiscale Model. Simul. 12(2), 581–615 (2014)
15. Müller-Gronbach, T.: Strong approximation of systems of stochastic differential equations. Habilitation thesis, TU Darmstadt (2002)
16. Tian, T., Burrage, K.: Binomial leap methods for simulating stochastic chemical kinetics. J. Chem. Phys. 121(10), 356 (2004)

On ANOVA Decompositions of Kernels and Gaussian Random Field Paths

David Ginsbourger, Olivier Roustant, Dominic Schuhmacher, Nicolas Durrande and Nicolas Lenz

Abstract The FANOVA (or Sobol-Hoeffding) decomposition of multivariate functions has been used for high-dimensional model representation and global sensitivity analysis. When the objective function f has no simple analytic form and is costly to evaluate, computing FANOVA terms may be unaffordable due to numerical integration costs. Several approximate approaches relying on Gaussian random field (GRF) models have been proposed to alleviate these costs, where f is substituted by a (kriging) predictor or by conditional simulations. Here we focus on FANOVA decompositions of GRF sample paths, and we notably introduce an associated kernel decomposition into 4^d terms called KANOVA. An interpretation in terms of tensor product projections is obtained, and it is shown that projected kernels control both the sparsity of GRF sample paths and the dependence structure between FANOVA effects. Applications on simulated data show the relevance of the approach for designing new classes of covariance kernels dedicated to high-dimensional kriging.
D. Ginsbourger (B)
Uncertainty Quantification and Optimal Design group, Idiap Research Institute, Rue Marconi 19, 1920 Martigny, Switzerland
e-mail: ginsbourger@stat.unibe.ch
D. Ginsbourger
IMSV, Department of Mathematics and Statistics, University of Bern, Alpeneggstrasse 22, 3012 Bern, Switzerland
O. Roustant · N. Durrande
Mines Saint-Étienne, UMR CNRS 6158, LIMOS, 42023 Saint-Étienne, France
e-mail: roustant@emse.fr
N. Durrande
e-mail: durrande@emse.fr
D. Schuhmacher
Institut für Mathematische Stochastik, Georg-August-Universität Göttingen, Goldschmidtstraße 7, 37077 Göttingen, Germany
e-mail: dominic.schuhmacher@mathematik.uni-goettingen.de
N. Lenz
geo7 AG, Neufeldstrasse 5-9, 3012 Bern, Switzerland
e-mail: nicolas.lenz@geo7.ch


Keywords: Gaussian processes · Sensitivity analysis · Kriging · Covariance functions · Conditional simulations

1 Introduction: Metamodel-Based Global Sensitivity


Analysis
Global Sensitivity Analysis (GSA) is a topic of importance for the study of complex
systems as it aims at uncovering among many candidates which variables and interactions are influential with respect to some response of interest. FANOVA (Functional
ANalysis Of VAriance) [2, 10, 13, 32] has become commonplace for decomposing
a real-valued function f of d-variables into a sum of 2d functions (a.k.a. effects) of
increasing dimensionality, and quantifying the influence of each variable or group
of variables through the celebrated Sobol indices [27, 33]. In practice f is rarely
known analytically and a number of statistical procedures have been proposed for
estimating Sobol indices based on a finite sample of evaluations of f ; see, e.g., [15].
Alternatively, a pragmatic approach to GSA, when the evaluation budget is drastically limited by computational cost or time, is to first approximate f using some
class of surrogate models (e.g., regression, neural nets, splines, wavelets, kriging;
see [37] for an overview) and then to perform the analysis on the obtained cheapto-evaluate surrogate model. Here we focus on kriging and Gaussian random field
(GRF) models, with an emphasis on the interplay between covariance kernels and
FANOVA decompositions of corresponding centred GRF sample paths.
While GSA relying on kriging has been used for at least two decades [40],
Bayesian GSA under a GRF prior seems to originate in [24], where posterior effects
and related quantities were derived. Later on, posterior distributions of Sobol indices
were investigated in [14, 22] relying on conditional simulations, an approach revisited
and extended to multi-fidelity computer codes in [20]. From a different perspective,
FANOVA-graphs were used in [23] to incorporate GSA information into a kriging
model, and a special class of kernels was introduced in [6] for which Sobol indices
of the kriging predictor are analytically tractable. Moreover, kernels leading to GRFs
with additive paths have been discussed in [5], and FANOVA decompositions of GRFs
and their covariance were touched upon in [21] where GRFs with ortho-additive paths
were introduced. Also, kernels investigated in [6] were revisited in [4] in the context
of GSA with dependent inputs, and a class of kernels related to ANOVA decompositions
was studied in [8, 9]. In a different setup, GRF priors have been used for Bayesian
FANOVA with functional responses [16].
In the present paper we investigate ANOVA decompositions both for
(symmetric positive definite) kernels and for associated centred GRFs. We show
that under standard integrability conditions, s.p.d. kernels can be decomposed into
4^d terms that govern the joint distribution of the 2^d terms of the associated GRF FANOVA decomposition. This has some serious consequences in kriging-based GSA, as for instance the choice of a sparse kernel induces almost sure sparsity of the associated GRF paths, and such a phenomenon cannot be compensated for by conditioning on data.


2 Preliminaries and Notation


FANOVA. We focus on measurable $f : D \subseteq \mathbb{R}^d \to \mathbb{R}$ ($d \in \mathbb{N}\setminus\{0\}$). In FANOVA with independent inputs, $D$ is typically assumed to be of the form $D = \prod_{i=1}^d D_i$ for some measurable subsets $D_i \in \mathcal{B}(\mathbb{R})$, where each $D_i$ is endowed with a probability measure $\nu_i$ and $D$ is equipped with the product measure $\nu = \bigotimes_{i=1}^d \nu_i$. Assuming further that $f$ is square-integrable w.r.t. $\nu$, $f$ can be expanded into a sum of $2^d$ terms indexed by the subsets $u \subseteq I = \{1, \ldots, d\}$ of the $d$ variable indices,
$$ f = \sum_{u \subseteq I} f_u, \qquad (1) $$
where each $f_u \in \mathcal{F} = L^2(\nu)$ depends only on the variables $x_j$ with $j \in u$ (up to a $\nu$-a.e. equality, as for all statements involving $L^2$ from Eq. (1) on). Uniqueness of this decomposition is classically guaranteed by imposing that $\int f_u\, \nu_j(dx_j) = 0$ for every $j \in u$. Any $f_u$, or FANOVA effect, can then be expressed in closed form as
$$ f_u : x \in D \mapsto f_u(x_1, \ldots, x_d) = \sum_{u' \subseteq u} (-1)^{|u| - |u'|} \int f(x_1, \ldots, x_d)\, \nu_{-u'}(dx_{-u'}), \qquad (2) $$
where $\nu_{-u'} = \bigotimes_{j \in I \setminus u'} \nu_j$ and $x_{-u'} = (x_i)_{i \in I \setminus u'}$. As developed in [19], Eq. (2) is a special case of a decomposition relying on commuting projections. Denoting by $P_j : f \in \mathcal{F} \mapsto \int f\, d\nu_j$ the orthogonal projector onto the subspace $\mathcal{F}_j$ of $f \in \mathcal{F}$ not depending on $x_j$, the identity on $\mathcal{F}$ can be expanded as
$$ I_{\mathcal{F}} = \prod_{j=1}^{d} \bigl( (I_{\mathcal{F}} - P_j) + P_j \bigr) = \sum_{u \subseteq I}\; \prod_{j \in u} (I_{\mathcal{F}} - P_j) \prod_{j \in I \setminus u} P_j. \qquad (3) $$
FANOVA effects appear then as images of $f$ under the orthogonal projection operators onto the associated subspaces $\mathcal{F}_u = \bigotimes_{j \in u} \mathcal{F}_j^{\perp} \otimes \bigotimes_{j \notin u} \mathcal{F}_j$, i.e. we have that $f_u = T_u(f)$, where $T_u = \prod_{j \in u} (I_{\mathcal{F}} - P_j) \prod_{j \notin u} P_j$. Finally, the squared norm of $f$ decomposes by orthogonality as $\|f\|^2 = \sum_{u \subseteq I} \|T_u(f)\|^2$ and the influence of each (group of) variable(s) on $f$ can be quantified via the Sobol indices
$$ S_u(f) = \frac{\|T_u(f)\|^2}{\|f - T_\emptyset(f)\|^2} = \frac{\|T_u(f - T_\emptyset(f))\|^2}{\|f - T_\emptyset(f)\|^2}, \qquad u \neq \emptyset. \qquad (4) $$
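As a computational illustration of the effects (2) and indices (4) (not part of the original paper; uniform measures and a toy function are assumed), the following Python sketch computes the FANOVA decomposition of a two-dimensional function on a grid:

```python
import numpy as np

n = 200
x = (np.arange(n) + 0.5) / n                      # midpoint grid on [0, 1]
X1, X2 = np.meshgrid(x, x, indexing="ij")
f = X1 + 2 * X2**2 + np.sin(2 * np.pi * X1 * X2)  # example function on [0,1]^2

m = f.mean()                                       # f_{} (constant term)
f1 = f.mean(axis=1) - m                            # f_{1}(x1): integrate out x2
f2 = f.mean(axis=0) - m                            # f_{2}(x2): integrate out x1
f12 = f - m - f1[:, None] - f2[None, :]            # interaction f_{1,2}(x1, x2)

# sanity checks: reconstruction of f and zero-mean conditions on the effects
assert np.allclose(f, m + f1[:, None] + f2[None, :] + f12)
assert abs(f1.mean()) < 1e-12 and np.abs(f12.mean(axis=0)).max() < 1e-12

# Sobol indices (4), with variances estimated on the grid
var_tot = np.mean((f - m) ** 2)
print({"S_{1}": np.mean(f1**2) / var_tot,
       "S_{2}": np.mean(f2**2) / var_tot,
       "S_{1,2}": np.mean(f12**2) / var_tot})
```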

Gaussian random fields (GRFs). A random field indexed by $D$ is a collection of random variables $Z = (Z_x)_{x \in D}$ defined on a common probability space $(\Omega, \mathcal{A}, \mathbb{P})$. The random field is called a Gaussian random field (GRF) if $(Z_{x^{(1)}}, \ldots, Z_{x^{(n)}})$ is $n$-variate normally distributed for any $x^{(1)}, \ldots, x^{(n)} \in D$ ($n \ge 1$). The distribution of $Z$ is then characterized by its mean function $m(x) = E[Z_x]$, $x \in D$, and


covariance function $k(x, y) = \mathrm{Cov}(Z_x, Z_y)$, $x, y \in D$. It is well-known that admissible covariance functions coincide with symmetric positive definite (s.p.d.) kernels on $D \times D$ [3].
A multivariate GRF taking values in $\mathbb{R}^p$ is a collection of $\mathbb{R}^p$-valued random vectors $Z = (Z_x)_{x \in D}$ such that the $Z_{x^{(i)}}^{(j)}$, $1 \le i \le n$, $1 \le j \le p$, are jointly $np$-variate normally distributed for any $x^{(1)}, \ldots, x^{(n)} \in D$. The distribution of $Z$ is characterized by its $\mathbb{R}^p$-valued mean function and a matrix-valued covariance function $(k_{ij})_{i, j \in \{1, \ldots, p\}}$.
In both real- and vector-valued cases (assuming additional technical conditions where necessary) $k$ governs a number of pathwise properties ranging from square-integrability to continuity, differentiability and more; see e.g. Sect. 1.4 of [1] or Chap. 5 of [30] for details. As we will see in Sect. 4, $k$ actually also governs the FANOVA decomposition of GRF paths $Z(\omega) \in \mathbb{R}^D$. Before establishing this result, let us first introduce a functional ANOVA decomposition for kernels.

3 KANOVA: A Kernel ANOVA Decomposition


Essentially we apply the $2d$-dimensional version of the decomposition introduced in Sect. 2 to $\nu \otimes \nu$-square integrable kernels $k$ (s.p.d. or not). From a formal point of view it is more elegant and leads to more efficient notation if we work with the tensor products $T_u \otimes T_v : \mathcal{F} \otimes \mathcal{F} \to \mathcal{F} \otimes \mathcal{F}$. It is well known that $L^2(\nu \otimes \nu)$ and $\mathcal{F} \otimes \mathcal{F}$ are isometrically isomorphic (see [17] for details on tensor products of Hilbert spaces), and we silently identify them here for simplicity. Then $T_u \otimes T_v = T_u^{(1)} \circ T_v^{(2)} = T_v^{(2)} \circ T_u^{(1)}$, where $T_u^{(1)}, T_v^{(2)} : L^2(\nu \otimes \nu) \to L^2(\nu \otimes \nu)$ are given by $(T_u^{(1)} k)(x, y) = (T_u(k(\cdot, y)))(x)$ and $(T_v^{(2)} k)(x, y) = (T_v(k(x, \cdot)))(y)$.
Theorem 1 Let $k$ be $\nu \otimes \nu$-square integrable.
(a) There exist $k_{u,v} \in L^2(\nu \otimes \nu)$ depending solely on $(x_u, y_v)$ such that $k$ can be decomposed in a unique way as $k = \sum_{u, v \subseteq I} k_{u,v}$ under the conditions
$$ \forall u, v \subseteq I\ \ \forall i \in u\ \ \forall j \in v: \quad \int k_{u,v}\, \nu_i(dx_i) = 0 \ \text{ and } \ \int k_{u,v}\, \nu_j(dy_j) = 0. \qquad (5) $$
We have
$$ k_{u,v}(x, y) = \sum_{u' \subseteq u} \sum_{v' \subseteq v} (-1)^{|u| + |v| - |u'| - |v'|} \iint k(x, y)\, \nu_{-u'}(dx_{-u'})\, \nu_{-v'}(dy_{-v'}). \qquad (6) $$
Moreover, $k_{u,v}$ may be written concisely as $k_{u,v} = [T_u \otimes T_v] k$.
(b) Suppose that $D$ is compact and $k$ is a continuous s.p.d. kernel. Then, for any $(\varepsilon_u)_{u \subseteq I} \in \mathbb{R}^{2^d}$, the following function is also a s.p.d. kernel:
$$ (x, y) \in D \times D \;\mapsto\; \sum_{u \subseteq I} \sum_{v \subseteq I} \varepsilon_u \varepsilon_v\, k_{u,v}(x, y) \in \mathbb{R}. \qquad (7) $$

Proof The proofs are in the appendix to facilitate the reading.

Example 1 (The Brownian kernel) Consider the covariance kernel $k(x, y) = \min(x, y)$ of the Brownian motion on $D = [0, 1]$, and suppose that $\nu$ is the Lebesgue measure. The $k_{u,v}$'s can then easily be obtained by direct calculation: $k_{\emptyset,\emptyset} = \tfrac{1}{3}$, $k_{\emptyset,\{1\}}(y) = y - \tfrac{y^2}{2} - \tfrac{1}{3}$, $k_{\{1\},\emptyset}(x) = x - \tfrac{x^2}{2} - \tfrac{1}{3}$, and $k_{\{1\},\{1\}}(x, y) = \min(x, y) - x + \tfrac{x^2}{2} - y + \tfrac{y^2}{2} + \tfrac{1}{3}$.
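A quick numerical check of Example 1 (illustrative only; plain midpoint quadrature on a grid):

```python
import numpy as np

n = 2000
x = (np.arange(n) + 0.5) / n
K = np.minimum.outer(x, x)                      # Brownian kernel min(x, y) on the grid

k00 = K.mean()                                   # should be close to 1/3
k01 = K.mean(axis=0) - k00                       # function of y
k10 = K.mean(axis=1) - k00                       # function of x
k11 = K - k00 - k01[None, :] - k10[:, None]      # interaction term

print(abs(k00 - 1/3))
print(np.abs(k01 - (x - x**2/2 - 1/3)).max())
analytic_k11 = K - x[:, None] + x[:, None]**2/2 - x[None, :] + x[None, :]**2/2 + 1/3
print(np.abs(k11 - analytic_k11).max())
```

All three printed discrepancies shrink with the grid size, in line with the closed-form expressions above.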

Example 2 Consider the very common class of tensor product kernels: $k(x, y) = \prod_{i=1}^{d} k_i(x_i, y_i)$ where the $k_i$'s are 1-dimensional symmetric kernels. It turns out that Eq. (6) boils down to a sum depending on 1- and 2-dimensional integrals, since
$$ \iint k(x, y)\, d\nu_{-u}(x_{-u})\, d\nu_{-v}(y_{-v}) = \prod_{i \in u \cap v} k_i(x_i, y_i) \prod_{i \in u \setminus v} \int k_i(x_i, \cdot)\, d\nu_i \prod_{i \in v \setminus u} \int k_i(\cdot, y_i)\, d\nu_i \prod_{i \notin u \cup v} \iint k_i\, d(\nu_i \otimes \nu_i). \qquad (8) $$
By symmetry of $k$, Eq. (8) solely depends on the integrals $\iint k_i\, d(\nu_i \otimes \nu_i)$ and the integral functions $t \mapsto \int k_i(\cdot, t)\, d\nu_i$, $i = 1, \ldots, d$. We refer to Sect. 7 for explicit calculations using typical $k_i$'s. A particularly convenient case is considered next.
Corollary 1 Let $k_i^{(0)} : D_i \times D_i \to \mathbb{R}$ ($1 \le i \le d$) be argumentwise centred, i.e. such that $\int k_i^{(0)}(\cdot, t)\, d\nu_i = \int k_i^{(0)}(s, \cdot)\, d\nu_i = 0$ for all $i \in I$ and $s, t \in D_i$, and consider $k(x, y) = \prod_{i=1}^{d} (1 + k_i^{(0)}(x_i, y_i))$. Then the KANOVA decomposition of $k$ consists of the terms $[T_u \otimes T_u] k(x, y) = \prod_{i \in u} k_i^{(0)}(x_i, y_i)$ and $[T_u \otimes T_v] k = 0$ if $u \neq v$.

Remark 1 By taking $k(x, y) = \prod_{i=1}^{d} (1 + k_i^{(0)}(x_i, y_i))$, where the $k_i^{(0)}$ are s.p.d., we recover the so-called ANOVA kernels [6, 38, 39]. Corollary 1 guarantees for argumentwise centred $k_i^{(0)}$ (see, e.g., [6, Sect. 2]) that the associated $k$ has a simple KANOVA decomposition, with analytically tractable $k_{u,u}$ and vanishing $k_{u,v}$ terms (for $u \neq v$), as also reported in [4] where a GRF model with this structure is postulated.
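Argumentwise centring, as used in Corollary 1, is straightforward to approximate numerically; this illustrative Python sketch (uniform measure on [0, 1], squared-exponential base kernel, grid quadrature; not from the paper) centres a one-dimensional kernel and assembles the corresponding ANOVA kernel for d = 2:

```python
import numpy as np

grid = (np.arange(400) + 0.5) / 400              # uniform measure on [0, 1]

def k_se(s, t, length=0.3):
    return np.exp(-((s - t) / length) ** 2)

def centre(k1d):
    """Argumentwise-centred version of a 1-d kernel w.r.t. the uniform measure,
    with integrals approximated by grid averages."""
    mean_s = lambda t: np.mean(k1d(grid, t))     # E_s k(s, t)
    total = np.mean([mean_s(t) for t in grid])   # E_{s,t} k(s, t)
    return lambda s, t: k1d(s, t) - np.mean(k1d(s, grid)) - mean_s(t) + total

k0 = centre(k_se)
print(np.mean([k0(s, 0.37) for s in grid]))      # ~ 0: centred in the first argument

def anova_kernel(xvec, yvec):
    """ANOVA kernel of Corollary 1 / Remark 1 with identical centred 1-d kernels."""
    return np.prod([1.0 + k0(xi, yi) for xi, yi in zip(xvec, yvec)])

print(anova_kernel([0.2, 0.8], [0.5, 0.1]))
```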

4 FANOVA Decomposition of Gaussian Random Field Paths

Let $Z = (Z_x)_{x \in D}$ be a centred GRF with covariance function $k$. To simplify the arguments we assume for the rest of the article that the $D_i$ are compact subsets of $\mathbb{R}$

and that Z has continuous sample paths. The latter can be guaranteed by a weak
condition on the covariance kernel; see [1], Theorem 1.4.1. For r N \ {0} write
Cb (D, Rr ) for the space of (bounded) continuous functions D Rr equipped
with the supremum norm, and set in particular Cb (D) = Cb (D, R). We reinterpret
Tu as maps Cb (D) Cb (D), which are still bounded linear operators, and set
Z x(u) = (Tu Z )x .
Theorem 2 The 2d -dimensional vector-valued random field (Z x(u) , u I )xD is
Gaussian, centred, and has continuous sample paths again. Its matrix-valued covariance function is given by
Cov(Z x(u) , Z y(v) ) = [Tu Tv ]k (x, y).

(9)

Example 3 Continuing from Example 1, let $B = (B_x)_{x \in [0,1]}$ be the Brownian motion on $D = [0, 1]$, which is a centred GRF with continuous paths. Theorem 2 yields that $(T_\emptyset B, T_{\{1\}} B) = \bigl( \int_0^1 B_u\, du,\ B_x - \int_0^1 B_u\, du \bigr)_{x \in D}$ is a bivariate random field on $D$, where $T_\emptyset B$ is a $N(0, 1/3)$-distributed random variable, while $(T_{\{1\}} B_x)$ is a centred GRF with covariance kernel $k_{\{1\},\{1\}}(x, y) = \min(x, y) - x + \tfrac{x^2}{2} - y + \tfrac{y^2}{2} + \tfrac{1}{3}$. The cross-covariance function of the components is given by $\mathrm{Cov}(T_\emptyset B, T_{\{1\}} B_x) = x - \tfrac{x^2}{2} - \tfrac{1}{3}$.
Remark 2 Under our conditions on $Z$ and using the notation from the proof of Theorem 1, we have a Karhunen–Loève expansion $Z_x = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, \varepsilon_i\, \phi_i(x)$, where $\varepsilon = (\varepsilon_i)_{i \in \mathbb{N}\setminus\{0\}}$ is a standard Gaussian white noise sequence and the series converges uniformly (i.e. in $C_b(D)$) with probability 1 (and in $L^2(\mathbb{P})$); for $d = 1$ see [1, 18]. Thus by the continuity of $T_u$, we can expand the projected random field as
$$ Z_x^{(u)} = T_u\!\left( \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, \varepsilon_i\, \phi_i \right)\!(x) = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, \varepsilon_i\, T_u(\phi_i)(x), \qquad (10) $$
where the series converges uniformly in $x$ with probability 1 (and in $L^2(\mathbb{P})$). This is the basis for an alternative proof of Theorem 2. We can also verify Eq. (9) under these conditions. Using the left/right-continuity of cov in $L^2(\mathbb{P})$, we obtain indeed $\mathrm{cov}\bigl( Z_x^{(u)}, Z_y^{(v)} \bigr) = \sum_{i=1}^{\infty} \lambda_i\, T_u(\phi_i)(x)\, T_v(\phi_i)(y) = k_{u,v}(x, y)$.

Corollary 2 (a) For any $u \subseteq I$ the following statements are equivalent:
(i) $T_u(k(\cdot, y)) = 0$ for every $y \in D$
(ii) $[T_u \otimes T_u] k = 0$
(iii) $[T_u \otimes T_u] k(x, x) = 0$ for every $x \in D$
(iv) $P(Z^{(u)} = 0) = 1$
(b) For any $u, v \subseteq I$ with $u \neq v$ the following statements are equivalent:
(i) $[T_u \otimes T_v] k = 0$
(ii) $Z^{(u)}$ and $Z^{(v)}$ are two independent GRFs


Remark 3 A consequence of Corollary 2 is that choosing a kernel without $u$ component in GRF-based GSA will lead to a posterior distribution without $u$ component whatever the conditioning observations, i.e. $P(Z^{(u)} = 0 \mid Z_{x_1}, \ldots, Z_{x_n}) = 1$ (a.s.). However, the analogous result does not hold for cross-covariances between $Z^{(u)}$ and $Z^{(v)}$ for $u \neq v$. Let us take for instance $D = [0, 1]$, $\nu$ arbitrary, and $Z_t = U + Y_t$, where $U \sim N(0, \sigma^2)$ ($\sigma > 0$) and $(Y_t)$ is a centred Gaussian process with argumentwise centred covariance kernel $k^{(0)}$. Assuming that $U$ and $Y$ are independent, it is clear that $(T_\emptyset Z)_s = U$ and $(T_{\{1\}} Z)_t = Y_t$, so $\mathrm{Cov}((T_\emptyset Z)_s, (T_{\{1\}} Z)_t) = 0$. If in addition $Z$ was observed at a point $r \in D$, Eq. (9) yields $\mathrm{Cov}((T_\emptyset Z)_s, (T_{\{1\}} Z)_t \mid Z_r) = (T_\emptyset \otimes T_{\{1\}})\bigl( k(\cdot, \cdot) - k(\cdot, r)\, k(r, \cdot)/k(r, r) \bigr)(s, t)$, where $k(s, t) = \sigma^2 + k^{(0)}(s, t)$ is the covariance kernel of $Z$. By Eq. (6) we obtain $\mathrm{Cov}((T_\emptyset Z)_s, (T_{\{1\}} Z)_t \mid Z_r) = -\sigma^2 k^{(0)}(r, t)/(\sigma^2 + k^{(0)}(r, r))$, which in general is nonzero.

Remark 4 Coming back to the ANOVA kernels discussed in Remark 1, Corollary 2(b) implies that for a centred GRF with continuous sample paths and covariance kernel of the form $k(x, y) = \prod_{i=1}^{d} (1 + k_i^{(0)}(x_i, y_i))$, where the $k_i^{(0)}$ are argumentwise centred, the FANOVA effects $Z^{(u)}$, $u \subseteq I$, are actually independent.
To close this section, let us finally touch upon the distribution of Sobol indices of GRF sample paths, relying on Theorem 2 and Remark 2.

Corollary 3 For $u \subseteq I$, $u \neq \emptyset$, we can represent the Sobol indices of $Z$ as
$$ S_u(Z) = \frac{Q_u(\varepsilon, \varepsilon)}{\sum_{v \neq \emptyset} Q_v(\varepsilon, \varepsilon)}, $$
where the $Q_u$'s are quadratic forms in a standard Gaussian white noise sequence. In the notation of Remark 2, $Q_u(\varepsilon, \varepsilon) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \sqrt{\lambda_i \lambda_j}\, \langle T_u \phi_i, T_u \phi_j \rangle\, \varepsilon_i \varepsilon_j$, where the convergence is uniform with probability 1.
the convergence is uniform with probability 1.

Remark 5 Consider the GRF $\tilde Z = Z - T_\emptyset Z$ with Karhunen–Loève expansion $\tilde Z_x = \sum_{i=1}^{\infty} \sqrt{\tilde\lambda_i}\, \tilde\phi_i(x)\, \varepsilon_i$. From Eq. (4) and (the proof of) Corollary 3 we can see that $S_u(Z) = S_u(\tilde Z) = \sum_{i,j=1}^{\infty} g_{ij}\, \varepsilon_i \varepsilon_j \big/ \sum_{i=1}^{\infty} \tilde\lambda_i\, \varepsilon_i^2$, where $g_{ij} = \sqrt{\tilde\lambda_i \tilde\lambda_j}\, \langle T_u \tilde\phi_i, T_u \tilde\phi_j \rangle$. Truncating both series above at $K \in \mathbb{N}$, applying the theorem in Sect. 2 of [29] and then Lebesgue's theorem for $K \to \infty$, we obtain
$$ \mathbb{E}\, S_u(Z) = \sum_{i=1}^{\infty} g_{ii} \int_0^{\infty} (1 + 2\tilde\lambda_i t)^{-3/2} \prod_{l \neq i} (1 + 2\tilde\lambda_l t)^{-1/2}\, dt, $$
$$ \mathbb{E}\, S_u(Z)^2 = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \bigl( g_{ii}\, g_{jj} + 2 g_{ij}^2 \bigr) \int_0^{\infty} t\, (1 + 2\tilde\lambda_i t)^{-3/2} (1 + 2\tilde\lambda_j t)^{-3/2} \prod_{l \notin \{i, j\}} (1 + 2\tilde\lambda_l t)^{-1/2}\, dt. $$

5 Making New Kernels from Old with KANOVA

While kernel methods and Gaussian process modelling have proven efficient in a number of classification and prediction problems, finding a suitable kernel for a given application is often judged difficult. It should simultaneously express the desired features of the problem at hand while respecting positive definiteness, a mathematical constraint that is not straightforward to check in practice. In typical implementations of kernel methods, a few classes of standard stationary kernels are available for which positive definiteness was established analytically based on the Bochner theorem. On the other hand, some operations on kernels are known to preserve positive definiteness, which enables enriching the available dictionary of kernels notably by multiplication by a positive constant, convex combinations, products and convolutions of kernels, or deformations of the input space. The section "Making new kernels from old" of [26] (Sect. 4.2.4) covers a number of such operations. We now consider some new ways of creating admissible kernels in the context of the KANOVA decomposition of Sect. 3. Let us first consider as before some square-integrable symmetric positive definite kernel $k_{\mathrm{old}}$ and take $u \subseteq I$.
One straightforward approach to create a kernel whose associated Gaussian random field has paths in $\mathcal{F}_u$ is then to plainly take the simple projected kernel
$$ k_{\mathrm{new}} = \Pi_u\, k_{\mathrm{old}} \quad \text{with} \quad \Pi_u = T_u \otimes T_u. \qquad (11) $$
From Theorem 1(b), and also from the fact that $k_{\mathrm{new}}$ is the covariance function of $Z^{(u)}$ where $Z$ is a centred GRF with covariance function $k_{\mathrm{old}}$, it is clear that such kernels are s.p.d.; however, they will generally not be strictly positive definite.
Going one step further, one obtains a richer class of 22 symmetric positive definite
kernels by considering parts of P(I ), and designing kernels accordingly. Taking
U P(I ), we obtain a further class of projected kernels as follows:
knew = U kold with U = TU TU =



Tu Tv , where TU =

uU vU

Tu . (12)

uU

The resulting kernel is again s.p.d., which follows from Theorem 1(b) by choosing
u = 1 if u U and u = 0 otherwise, or again by noting that knew is the covariance
function of uU Z (u) where Z is a centred GRF with covariance function kold .
Such a kernel contains not only the covariances of the effects associated with the
different subsets of U , but also cross-covariances between these effects. Finally,
another relevant class of positive definite projected kernels can be designed by taking
knew = U
kold with U
=

Tu Tu .

(13)

uU

This kernel corresponds to the one of a sum of independent random fields with same
individual distributions as the Z (u) (u U ). In addition, projectors of the form

On ANOVA Decompositions of Kernels and Gaussian

323

U1 ,U
2 (U1 , U2 P(I )) can be combined (e.g. by sums or convex combinations)
in order to generate a large class of s.p.d. kernels, as illustrated here and in Sect. 6.
Example 4 Let us consider $A = \{\emptyset, \{1\}, \{2\}, \ldots, \{d\}\}$ and $O$, the complement of $A$ in $\mathcal{P}(I)$. While $A$ corresponds to the constant and main effects forming the additive component in the FANOVA decomposition, $O$ corresponds to all higher-order terms, referred to as the ortho-additive component in [21]. Taking $\Pi_A k = (T_A \otimes T_A) k$ amounts to extracting the additive component of $k$ with cross-covariances between the various main effects (including the constant); see Fig. 1c. On the other hand, $\Pi'_A k = \sum_{u \in A} \Pi_u k$ retains these main effects without their possible cross-covariances; see Fig. 1b. In the next theorem (proven in [21]), analytical formulae are given for $\Pi_A k$ and related terms for the class of tensor product kernels.
Theorem 3 Let $D_i = [a_i, b_i]$ ($a_i < b_i$) and $k = \prod_{i=1}^{d} k_i$, where the $k_i$ are s.p.d. kernels on $D_i$ such that $k_i(x_i, y_i) > 0$ for all $x_i, y_i \in D_i$. Then, the additive and ortho-additive components of $k$ with their cross-covariances are given by
$$ (\Pi_A k)(x, y) = \frac{a(x)\, a(y)}{\mathcal{E}} + \mathcal{E} \sum_{i=1}^{d} \left( \frac{k_i(x_i, y_i)}{\mathcal{E}_i} - \frac{E_i(x_i)\, E_i(y_i)}{\mathcal{E}_i^2} \right), $$
$$ (T_O \otimes T_A\, k)(x, y) = (T_A \otimes T_O\, k)(y, x) = E(x) \left( 1 - d + \sum_{j=1}^{d} \frac{k_j(x_j, y_j)}{E_j(x_j)} \right) - (\Pi_A k)(x, y), $$
$$ (\Pi_O k)(x, y) = k(x, y) - (T_A \otimes T_O\, k)(x, y) - (T_O \otimes T_A\, k)(x, y) - (\Pi_A k)(x, y), $$
where $E_i(x_i) = \int_{a_i}^{b_i} k_i(x_i, y_i)\, dy_i$, $E(x) = \prod_{i=1}^{d} E_i(x_i)$, $\mathcal{E}_i = \int_{a_i}^{b_i} E_i(x_i)\, \nu_i(dx_i)$, $\mathcal{E} = \prod_{i=1}^{d} \mathcal{E}_i$, and $a(x) = E(x) \left( 1 - d + \sum_{i=1}^{d} \dfrac{E_i(x_i)}{\mathcal{E}_i} \right)$.

6 Numerical Experiments

We consider 30-dimensional numerical experiments where we compare the prediction abilities of sparse kernels obtained from the KANOVA decomposition of
$$ k(x, y) = \exp(-\|x - y\|^2), \qquad x, y \in [0, 1]^{30}. \qquad (14) $$
As detailed in the previous sections, $k$ can be expanded as a sum of $4^{30}$ terms, and sparsified versions of $k$ can be obtained by projections such as in Example 4. We will focus hereafter on eight sub-kernels (all summations are over $u, v \subseteq I$):
$$ k_{\mathrm{full}} = k, \qquad k_{A} = \sum_{|u| \le 1,\, |v| \le 1} (T_u \otimes T_v)\, k, $$
$$ k_{\mathrm{diag}} = \sum_{u} \Pi_u k, \qquad k_{A+O} = k_A + (k_{\mathrm{diag}} - k_{A'}), $$
$$ k_{A'} = \sum_{|u| \le 1} \Pi_u k, \qquad k_{\mathrm{inter}} = \sum_{|u| \le 2} \Pi_u k, $$
$$ k_{A'+O} = k_{A'} + \Pi_O k, \qquad k_{\mathrm{sparse}} = (\Pi_\emptyset + \Pi_{\{1\}} + \Pi_{\{2\}} + \Pi_{\{2,3\}} + \Pi_{\{4,5\}})\, k. \qquad (15) $$


[Fig. 1 Schematic representations (a)–(h) of a reference kernel k and various projections or sums of projections. The expressions of these kernels are detailed in Sect. 6 (Eq. 15).]

A schematic representation of these kernels can be found in Fig. 1. Note that the tensor product structure of $k$ allows us to use Theorem 3 in order to get more tractable expressions for all kernels above. Furthermore, the integrals appearing in the $E_i$ and $\mathcal{E}_i$ terms can be calculated analytically as detailed in the appendix.
We now compare kriging predictions based on paths simulated from centred GRFs, selecting any combination of two of the kernels in Fig. 1 and using one for simulation ("generating kernel") and one for prediction ("prediction kernel"). Each prediction is performed at $n_{\mathrm{test}} = 200$ locations based on observations of an individual path at $n_{\mathrm{train}} = 500$ locations. We judge the performance of the prediction by averaging over $n_{\mathrm{path}} = 200$ sample paths for each combination of kernels. Whenever the kernel used for prediction is not the same as the one used for simulation, a Gaussian observation noise with variance $\tau^2$ is assumed in the models used in prediction, where $\tau^2$ is chosen so as to reflect the part of variance that cannot be approximated by the model. For simplicity, only one $n_{\mathrm{train}}$-point training set and one $n_{\mathrm{test}}$-point test set are considered for the whole experiment. For both, design points are chosen by maximizing the minimal interpoint distance among random Latin hypercube designs [28] using DiceDesign [7, 11]. For each path $y_\ell$ ($\ell = 1, \ldots, n_{\mathrm{path}}$), the criterion used for quantifying prediction accuracy is:
$$ C_\ell = 1 - \frac{\sum_{i=1}^{n_{\mathrm{test}}} (y_{\ell,i} - \hat y_{\ell,i})^2}{\sum_{i=1}^{n_{\mathrm{test}}} y_{\ell,i}^2}, \qquad (16) $$
where $y_{\ell,i}$ and $\hat y_{\ell,i}$ are the actual and predicted values of the $\ell$th path at the $i$th test point. While $C_\ell = 1$ means a null prediction error, $C_\ell = 0$ means that $\hat y_\ell$ predicts as badly as the null function. Average values of $C_\ell$ are summarized in


Table 1 Average values of C_ℓ over the n_path = 200 replications

Generating model   k_full  k_diag  k_A'+O  k_A+O  k_inter  k_A'  k_A   k_sparse
Z_full             0.06    0.05    0.06    0.05   0.05     0.03  0.04  0.01
Z_diag             0.05    0.05    0.05    0.05   0.04     0.03  0.03  0.01
Z_A'+O             0.05    0.04    0.05    0.04   0.04     0.03  0.03  0.01
Z_A+O              0.06    0.06    0.06    0.06   0.05     0.04  0.04  0.01
Z_inter            0.33    0.37    0.34    0.37   0.70     0.28  0.28  0.07
Z_A'               0.67    0.76    0.71    0.75   0.96     1     1     0.20
Z_A                0.69    0.77    0.71    0.77   0.96     1     1     0.18
Z_sparse           0.75    0.83    0.80    0.78   0.95     0.90  0.90  1
Mean               0.33    0.37    0.35    0.36   0.47     0.41  0.42  0.19

Rows correspond to generating GRF models (characterized by generating kernels) while columns correspond to prediction kernels. The four last rows of the k_inter column are in bold blue to highlight the superior performances of that prediction kernel when the class of generating GRF models is as sparse or sparser than Z_inter.

Table 1 for all couples of generating versus prediction kernel. Note that Table 1 was
slightly perturbed but the conclusions unchanged when replicating the training and
test designs.
First, this example illustrates that, unless the correlation range is increased, predicting a GRF based on 500 points in dimension 30 is hopeless when the generating
kernel is full or close to full (first four rows of Table 1) no matter what prediction
kernel is chosen. However, for GRFs with a sparser generating kernel, prediction
performances are strongly increased (last four rows of Table 1).
Second, still focusing on the four last lines of Table 1, kinter seems to offer a
nice compromise as it works much better than other prediction kernels on Z inter and
achieves very good performances on sample paths of sparser GRFs. Besides this, it
is not doing notably worse than the best prediction kernels on rows 1–4.
Third, neglecting cross-correlations has very little or no influence on the results,
so that the Gaussian kernel appears to have a structure relatively close to what we
refer to as diagonal (diag) here. This point remains to be studied analytically.

7 Conclusion and Perspectives


We have proposed an ANOVA decomposition of kernels (KANOVA), and shown
how KANOVA governs the probability distribution of FANOVA effects of Gaussian
random field paths. This has enabled us in turn to establish that ANOVA kernels
correspond to centred Gaussian random fields (GRFs) with independent FANOVA
effects, to make progress towards the distribution of Sobol indices of GRFs, and
also to suggest a number of operations for making new symmetric positive definite
kernels from existing ones. Particular cases include the derivation of additive and

326

D. Ginsbourger et al.

ortho-additive kernels extracted from tensor product kernels, for which a closed form
formula was given. Besides this, a 30-dimensional numerical experiment supports
our claim that KANOVA may be a useful approach to designing kernels for high-dimensional kriging, as the performances of the interaction kernel suggest. Perspectives include analytically calculating the norm of terms appearing in the KANOVA
decomposition to better understand the structure of common GRF models. From a
practical point of view, a next challenge will be to parametrize decomposed kernels
adequately so as to recover from data which terms of the FANOVA decomposition
are dominating and to automatically design adapted kernels from this.
Acknowledgments The authors would like to thank Dario Azzimonti for proofreading, as well as
the editors and an anonymous referee for their valuable comments and suggestions.

Proofs

Theorem 1 (a) The first part and the concrete solution (6) follow directly from the corresponding statements in Sect. 2. Having established (6), it is easily seen that $[T_u \otimes T_v] k = T_u^{(1)} T_v^{(2)} k$ coincides with $k_{u,v}$.
(b) Under these conditions Mercer's theorem applies (see [34] for an overview and recent extensions). So there exist a non-negative sequence $(\lambda_i)_{i \in \mathbb{N}\setminus\{0\}}$, and continuous representatives $(\phi_i)_{i \in \mathbb{N}\setminus\{0\}}$ of an orthonormal basis of $L^2(\nu)$ such that $k(x, y) = \sum_{i=1}^{\infty} \lambda_i\, \phi_i(x)\, \phi_i(y)$, $x, y \in D$, where the convergence is absolute and uniform. Noting that $T_u$, $T_v$ are also bounded as operators on continuous functions, applying $T_u^{(1)} T_v^{(2)}$ from above yields that
$$ \sum_{u \subseteq I} \sum_{v \subseteq I} \varepsilon_u \varepsilon_v\, k_{u,v}(x, y) = \sum_{i=1}^{\infty} \lambda_i\, \psi_i(x)\, \psi_i(y), \qquad (17) $$
where $\psi_i = \sum_{u \subseteq I} \varepsilon_u (T_u \phi_i)$. Thus the considered function is indeed s.p.d.

Corollary 1 Expand the product $\prod_{l=1}^{d} (1 + k_l^{(0)}(x_l, y_l))$ and conclude by uniqueness of the KANOVA decomposition, noting that $\int \prod_{l \in u} k_l^{(0)}(x_l, y_l)\, \nu_i(dx_i) = \int \prod_{l \in u} k_l^{(0)}(x_l, y_l)\, \nu_j(dy_j) = 0$ for any $u \subseteq I$ and any $i, j \in u$.
Theorem 2 Sample path continuity implies product-measurability of Z and Z (u) ,
respectively, as can be shown by an approximation argument; see e.g. Prop. A.D.
kernel k is continuous, hence
in [31]. Due to Theorem 3 in [35], the covariance
1/2
E|Z
|

(dx
)

(
k(x,
x)

(dx
))
<
for any u I and by
x u
u 
u
u
D
D
CauchySchwarz D D E|Z x Z y | u (dxu )v (dyv ) < for any u, v I .
Replacing f by Z in Formula (2), taking expectations and using Fubinis theorem
yields that Z (u) is centred again. Combining (2), Fubinis theorem, and (6) yields

On ANOVA Decompositions of Kernels and Gaussian

Cov(Z x(u) , Z y(v) )


u u




(1)|u|+|v||u ||v | Cov

327

Cov(Z x ,Z y ) u (dxu ) v (dyv )

 

Z y v (dyv )
Z x u (dxu ),

v v

= [Tu Tv ]k (x, y).


(18)
It remains to show the joint Gaussianity of the Z (u) . First note that Cb (D, Rr ) is a
separable Banach space for r N \ {0}. We may and do interprete Z as a random
element of Cb (D), equipped with the -algebra B D generated by the evaluation
maps [Cb (D)  f  f (x)  R]. By Theorem 2 in [25] the distribution PZ 1
of Z is a Gaussian measure on Cb (D), B(Cb (D)) . Since Tu is a bounded linear
operator Cb (D) Cb (D), we obtain immediately that the combined operator
d
T : Cb (D) Cb (D, R2 ), defined by (T( f ))(x) = (Tu f (x))uI , is also bounded
and linear. Corollary 3.7 of [36] yields that the image measure (PZ 1 )T1 is a
d
Gaussian measure on Cb (D, R2 ). This means that for every bounded linear operator
d
 : Cb (D, R2 ) R the image measure ((PZ 1 )T1 )1 is a univariate normal
distribution, i.e. (TZ ) is a Gaussian random variable. Thus,
N, x(i) D
 all n (u)
n for
(u)
and ai R, where 1 i n, u I , we obtain that i=1 uI ai (Tu Z )x(i) is
Gaussian by the fact that [Cb (D)  f  f (x) R] is continuous (and linear) for
every x D. We conclude that TZ = (Z x(u) , u I )xD is a vector-valued GRF. 
Corollary 2 (a) If (i) holds, [Tu Tu ]k = Tu(2) (Tu(1) k) = 0 by (Tu(1) k)(, y) =
Tu (k(, y)); thus (ii) holds. (ii) trivially implies (iii). Statement (iii) means that
Var(Z x(u) ) = 0, which implies that Z x(u) = 0 a.s., since Z (u) is centred. (iv) follows by noting that P(Z x(u) = 0) = 1 for all x D implies P(Z (u) = 0) = 1 by the
fact that Z (u) has continuous sample paths and is therefore separable. Finally, (iv)
implies (i) because Tu (k(, y)) = Cov(Z (u) , Z y ) = 0; see (18) for the first equality.
(b) For any m, n N and x1 , . . . , xm , y1 , . . . , yn D we obtain by Theorem 2
, . . . , Z x(u)
, Z y(v)
, . . . , Z y(v)
are jointly normally distributed. Statement (i) is
that Z x(u)
1
m
1
n
(u)
, . . . , Z x(u)
)
equivalent to saying that Cov(Z x , Z y(v) ) = 0 for all x, y D. Thus (Z x(u)
1
m
(v)
(v)
and (Z y1 , . . . , Z yn ) are independent. Since the sets
{( f, g) R D R D : ( f (x1 ), . . . , f (xm )) A, (g(y1 ), . . . , g(yn )) B}

(19)

with m, n N, x1 , . . . , xm , y1 , . . . , yn D, A B(Rm ), B B(Rn ) generate


B D B D (and the system of such sets is stable under intersections), statement (ii)
follows. The converse direction is straightforward.

Corollary 3 By Remark
2, there is a Gaussian white noise sequence = (i )iN\{0}

such
that
Z
=
i i i (x) uniformly with probability 1. From Z x(u) =
x
i=1

i i Tu i (x), we obtain Z (u) 2 = Q u (, ) with Q u as defined
i=1
 in the statement. A similar calculation for the denominator of Su (Z ) leads to v
= Q v (, ).


328

D. Ginsbourger et al.

Additional Examples
Here we give useful expressions to compute the KANOVA decomposition of some
tensor product kernels with respect to the uniform measure on [0, 1]d . For simplicity
we denote the 1-dimensional kernels on which they are based by k (corresponding
to the notation ki in Example 2). The uniform measure on [0, 1] is denoted by .


, then:
Example 5 (Exponential kernel) If k(x, y) = exp |xy|

1

0 k(., y)d = [2 k(0, y) k(1, y)]
[0,1]2 k(., .)d( ) = 2 (1 + e1/ )
Example 6 (Matrn kernel, = p + 21 ) Define for = p +
p!  ( p + i)!
k(x, y) =
(2 p)! i=0 i!( p i)!
p

Then, denoting p =


1
0

,
2

|x y|

/ 8

pi

1
2

( p N):

|x y|
.
exp
/ 2

we have:



y
1y
p!
Ap
,
2c p,0 A p
k(., y)d = p
(2 p)!
p
p

 p
 u
 p

where A p (u) =
with c p, = !1 i=0 ( p+i)!
2 pi . This generalizes
=0 c p, u e
i!
Example 5, corresponding to = 1/2. Also, this result can be
written more explicitly
for the commonly selected value = 3/2 ( p = 1, 1 = / 3):




exp |xy|
k(x, y) = 1 + |xy|
1
1




1
0 k(., y)d = 1 4 A1 y1 A1 1y
with A1 (u) = (2 + u)eu
1



[0,1]2 k(., .)d( ) = 21 2 31 + (1 + 31 )e1/1

Similarly, for = 5/2 ( p = 2, 2 = / 5):






2
exp |xy|
+ 13 (xy)
k(x, y) = 1 + |xy|
2
(2 )2
2




1
with A2 (u) = (8+5u+u 2 )eu
0 k(., y)d = 13 2 16 A2 y2 A2 1y
2

[0,1]2 k(., .)d( ) = 13 2 (16 30 2 ) + 23 (1 + 7 2 + 15 (2 )2 )e1/2


2
, then
Example 7 (Gaussian kernel) If k(x, y) = exp 21 (xy)
2





1


0 k(., y)d = 2 1y
+ y 1




 
2
[0,1]2 k(., .)d( ) = 2(e1/(2 ) 1) + 2 2 1 1
where denotes the cdf of the standard normal distribution.

On ANOVA Decompositions of Kernels and Gaussian

329

References
1. Adler, R., Taylor, J.: Random Fields and Geometry. Springer, Boston (2007)
2. Antoniadis, A.: Analysis of variance on function spaces. Statistics 15, 5971 (1984)
3. Berlinet, A., Thomas-Agnan, C.: Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer Academic Publishers, Boston (2004)
4. Chastaing, G., Le Gratiet, L.: ANOVA decomposition of conditional Gaussian processes for
sensitivity analysis with dependent inputs. J. Stat. Comput. Simul. 85(11), 21642186 (2015)
5. Durrande, N., Ginsbourger, D., Roustant, O.: Additive covariance kernels for high-dimensional
Gaussian process modeling. Ann. Fac. Sci. Toulous. Math. 21, 481499 (2012)
6. Durrande, N., Ginsbourger, D., Roustant, O., Carraro, L.: ANOVA kernels and RKHS of zero
mean functions for model-based sensitivity analysis. J. Multivar. Anal. 115, 5767 (2013)
7. Dupuy, D., Helbert, C., Franco, J.: DiceDesign and DiceEval: Two R packages for design and
analysis of computer experiments. J. Stat. Softw. 65(11): 138 (2015)
8. Duvenaud, D.: Automatic model construction with Gaussian processes. Ph.D. thesis, Department of Engineering, University of Cambridge (2014)
9. Duvenaud, D., Nickisch, H., Rasmussen, C.: Additive Gaussian Processes. NIPS conference.
(2011)
10. Efron, B., Stein, C.: The jackknife estimate of variance. Ann. Stat. 9, 586596 (1981)
11. Franco, J., Dupuy, D., Roustant, O., Damblin, G., Iooss, B.: DiceDesign: Designs of computer
experiments. R package version 1.7 (2015)
12. Gikhman, I.I., Skorokhod, A.V.: The theory of stochastic processes. Springer, Berlin (2004).
Translated from the Russian by S. Kotz, Reprint of the 1974 edition
13. Hoeffding, W.: A class of statistics with asymptotically normal distributions. Ann. Math. Stat.
19, 293325 (1948)
14. Jan, B., Bect, J., Vazquez, E., Lefranc, P.: approche baysienne pour lestimation dindices de
Sobol. In 45mes Journes de Statistique - JdS 2013. Toulouse, France (2013)
15. Janon, A., Klein, T., Lagnoux, A., Nodet, M., Prieur, C.: Asymptotic Normality and Efficiency
of Two Sobol Index Estimators. Probability And Statistics, ESAIM (2013)
16. Kaufman, C., Sain, S.: Bayesian functional ANOVA modeling using Gaussian process prior
distributions. Bayesian Anal. 5, 123150 (2010)
17. Kre, P.: Produits tensoriels complts despaces de Hilbert. Sminaire Paul Kre Vol 1, No. 7
(19741975)
18. Kuelbs, J.: Expansions of vectors in a Banach space related to Gaussian measures. Proc. Am.
Math. Soc. 27(2), 364370 (1971)
19. Kuo, F.Y., Sloan, I.H., Wasilkowski, G.W., Wozniakowski, H.: On decompositions of multivariate functions. Math. Comput. 79, 953966 (2010)
20. Le Gratiet, L., Cannamela, C., Iooss, B.: A Bayesian approach for global sensitivity analysis
of (multi-fidelity) computer codes. SIAM/ASA J. Uncertain. Quantif. 2(1), 336363 (2014)
21. Lenz, N.: Additivity and ortho-additivity in Gaussian random fields. Masters thesis, Departement of Mathematics and Statistics, University of Bern (2013). http://hal.archives-ouvertes.fr/
hal-01063741
22. Marrel, A., Iooss, B., Laurent, B., Roustant, O.: Calculations of Sobol indices for the Gaussian
process metamodel. Reliab. Eng. Syst. Saf. 94, 742751 (2009)
23. Muehlenstaedt, T., Roustant, O., Carraro, L., Kuhnt, S.: Data-driven Kriging models based on
FANOVA-decomposition. Stat. Comput. 22(3), 723738 (2012)
24. Oakley, J., OHagan, A.: Probabilistic sensitivity analysis of complex models: a Bayesian
approach. J. R. Stat. Soc. 66, 751769 (2004)
25. Rajput, B.S., Cambanis, S.: Gaussian processes and Gaussian measures. Ann. Math. Stat. 43,
19441952 (1972)
26. Rasmussen, C.R., Williams, C.K.I.: Gaussian Processes for Machine Learning. Cambridge,
MIT Press (2006)
27. Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M.,
Tarantola, S.: Global sensitivity analysis: the primer. Wiley Online Library (2008)

330

D. Ginsbourger et al.

28. Santner, T., Williams, B., Notz, W.: The design and analysis of computer experiments. Springer,
New York (2003)
29. Sawa, T.: The exact moments of the least squares estimator for the autoregressive model. J.
Econom. 8(2), 159172 (1978)
30. Scheuerer, M.: A comparison of models and methods for spatial interpolation in statistics and
numerical analysis. Ph.D. thesis, Georg-August-Universitt Gttingen (2009)
31. Schuhmacher, D.: Distance estimates for poisson process approximations of dependent thinnings. Electron. J. Probab. 10(5), 165201 (2005)
32. Sobol, I.: Multidimensional Quadrature Formulas and Haar Functions. Nauka, Moscow
(1969). (In Russian)
33. Sobol, I.: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo
estimates. Math. Comput. Simul. 55(13), 271280 (2001)
34. Steinwart, I., Scovel, C.: Mercers theorem on general domains: on the interaction between
measures, kernels, and RKHSs. Constr. Approx. 35(3), 363417 (2012)
35. Talagrand, M.: Regularity of Gaussian processes. Acta Math. 159(12), 99149 (1987)
36. Tarieladze, V., Vakhania, N.: Disintegration of Gaussian measures and average-case optimal
algorithms. J. Complex. 23(46), 851866 (2007)
37. Touzani, S.: Response surface methods based on analysis of variance expansion for sensitivity
analysis. Ph.D. thesis, Universit de Grenoble (2011)
38. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
39. Wahba, G.: Spline Models for Observational Data. Siam, Philadelphia (1990)
40. Welch, W.J., Buck, R.J., Sacks, J., Wynn, H.P., Mitchell, T.J., Morris, M.D.: Screening, predicting, and computer experiments. Technometrics 34, 1525 (1992)

The Mean Square Quasi-Monte Carlo Error


for Digitally Shifted Digital Nets
Takashi Goda, Ryuichi Ohori, Kosuke Suzuki and Takehito Yoshiki

Abstract In this paper, we study randomized quasi-Monte Carlo (QMC) integration


using digitally shifted digital nets. We express the mean square QMC error of the nth
discrete approximation f n of a function f : [0, 1)s R for digitally shifted digital
nets in terms of the Walsh coefficients of f . We then apply a bound on the Walsh coefficients for sufficiently smooth integrands to obtain a quality measure called Walsh
figure of merit for the root mean square error, which satisfies a KoksmaHlawka
type inequality on the root mean square error. Through two types of experiments, we
confirm that our quality measure is of use for finding digital nets which show good
convergence behavior of the root mean square error for smooth integrands.
Keywords Randomized quasi-Monte Carlo
functions Walsh figure of merit

Digital shift Digital net Walsh

T. Goda
Graduate School of Engineering, The University of Tokyo,
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
e-mail: goda@frcer.t.u-tokyo.ac.jp
R. Ohori
Fujitsu Laboratories Ltd., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki,
Kanagawa 211-8588, Japan
e-mail: ohori.ryuichi@jp.fujitsu.com
K. Suzuki (B) T. Yoshiki
School of Mathematics and Statistics, The University of New South Wales,
Sydney, NSW 2052, Australia
e-mail: kosuke.suzuki1@unsw.edu.au
K. Suzuki T. Yoshiki
Graduate School of Mathematical Sciences, The University of Tokyo,
3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan
e-mail: takehito.yoshiki1@unsw.edu.au
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_16

331

332

T. Goda et al.

1 Introduction
Quasi-Monte Carlo (QMC) integration is one of the well-known methods for
high-dimensional numerical integration [5, 11]. Let P be a point set in the
s-dimensional unit cube [0, 1)s with finite cardinality |P|, and f : [0, 1)s R a
Riemann integrable
function. The QMC integration by P gives

 an approximation
of I ( f ) := [0,1)s f (x) d x by the average IP ( f ) := |P|1 xP f (x).
Let Zb = Z/bZ be the residue class ring modulo b, which is identified with the set
the set of s n matrices over Zb for a positive integer n. The
{0, . . . , b 1}, and Zsn
b
is
an
additive
group
with respect to the operation +, the usual summation
set Zsn
b
of matrices over Zb . As QMC point sets, we consider digital nets defined as follows.
Definition 1 Let m, n 
be positive integers. Let 0 k bm 1 be an integer with
m
i bi1 . Let Ci Znm
. For 1 i s and 1 j n,
b-adic expansion k = i=1
b
define yi, j,k Zb by (yi,1,k, , . . . , yi,n,k ) = Ci (1 , . . . , m ) . Then we define
xi,k =

yi,2,k
yi,1,k
yi,n,k
+ 2 + + n [0, 1)
b
b
b

for 1 i s. In this way we obtain the k-th point x k = (x1,k , . . . , xs,k ). We call the
set P := {x 0 , . . . , x bm 1 } (P is considered as a multiset) a digital net over Zb with
precision n generated by C1 , . . . , Cs , or simply a digital net.
Recently, the discretization f n of a function f : [0, 1)s R has been introduced
to analyze QMC integration in the framework of digital computation [9]. We define
R by
the n-digit discretization f n : Zsn
b
f n (X ) :=

1
Vol(In (X ))


In (X )

f (x) d x,

s n

xi, j b j , nj=1 xi, j b j + bn ).
for X = (xi, j ) Zsn
i=1 [
j=1 
b . Here In (X ) :=
We denote the true integral of f n by I ( f n ) := bsn X Zbsn f n (X ), which indeed

s
equals I ( f ). Define a function : Zsn
 [0, 1)s by (X ) := ( nj=1 xi, j b j )i=1
b
sn
for X = (xi, j ) Zb , where xi, j is considered to be an integer and the sum is taken
in R. Then it is easy to check that for any digital net P there exists a subgroup
such that P = (P). Thus, in discretized setting, our main concern is
P Zsn
b
is a subgroup. By abuse of terminology, a subgroup of Zsn
the case that P Zsn
b
b
is also called a digital net in this paper.
In [9], Matsumoto, Saito and Matoba
 treat the QMC integration of the n-th discrete approximation I P ( f n ) := |P|1 X P f n (X ) for b = 2. They consider the discretized integration error Err( f n ; P) := I P ( f n ) I ( f n ) instead of the usual integration error Err( f ; (P)) := I(P) ( f ) I ( f ). The difference between them, which
is equal to I(P) ( f ) I P ( f n ), is called the discretization error and bounded by
sup X Zbsn , xIn (X ) | f (x) f n (X )|. If f is continuous with Lipschitz constant K ,

then the discretization error is bounded by K sbn , which is negligibly small in

The Mean Square Quasi-Monte Carlo

333

practice (say n = 30) [9, Lemma 2.1]. Hence, in this case, we have Err( f n ; P)
Err( f ; (P)), which is a part of their setting we adopt.
Assume that f : [0, 1)s R is a function whose mixed partial derivatives up to
order n in each variable are continuous and P Zsn
is a subgroup. Matsumoto
b
et al. [9] proved the KoksmaHlawka type inequality for Err( f n ; P);
|Err( f n ; P)| Cb,s,n || f ||n WAFOM(P),

(1)

where Cb,s,n is a constant independent of f and P and WAFOM(P) is the Walsh


figure of merit, a quantity which depends only on P and can be computed in O(sn|P|)
steps. || f ||n is the norm of f defined as in [4] (see also Sect. 4). More recently, this
result has been generalized by Suzuki [13] for digital nets over a finite abelian group.
WAFOM was suggested as a criterion for the quality of digital nets in [9]. The first
advantage of WAFOM is that the inequality (1) implies that if WAFOM(P) is small,
Err( f n ; P) can also be small. The second is that WAFOM is efficiently computable.
It means that we can find P with small WAFOM(P) by computer search. Numerical
experiments showed that by stochastic optimization we can find P with WAFOM(P)
small enough, and that such P performs well for a financial problem [9]. Moreover,
the existence of a low-WAFOM digital net P of size N has been proved in [10, 13]
such that WAFOM(P) N C(log N )/s+D for positive constants C and D when
(log N )/s is large enough. Thus, a low-WAFOM digital net is asymptotically superior to well-known low-discrepancy point sets for sufficiently smooth integrands.
In this paper, as a continuation of [9, 13], we discuss randomized QMC integration
using digitally shifted digital nets for the n-digit discretization f n . A digitally shifted
is defined as P + = {B + | B P} for a subgroup
digital net P + Zsn
b
sn
and

Z
.
Here
is chosen uniformly and randomly. Randomized
P Zsn
b
b
QMC integration by P + of the n-digit discretization f n gives the approximation
I P+ ( f n ) of I ( f n ). By adding a random element , it becomes possible to obtain
some statistical estimate on the integration error. Such an estimate is not available
for deterministic digital nets.
We note that randomized QMC integration using digitally shifted digital nets has
already been studied in previous works, see for instance [1, 7] among many others,
where a digital shift is chosen from [0, 1)s and the QMC integration using P
is considered to give the approximation of I ( f ). Here denotes digitwise addition
modulo b applied componentwise. It is known that the estimator IP ( f ) is an
unbiased estimator of I ( f ), so that the mean square QMC error for a function f
with respect to [0, 1)s equals the variance of the estimator.
In the n-digit discretized setting which we consider in this paper, it is also possible to show that the estimator I P+ ( f n ) is an unbiased estimator of I ( f n ), so that
equals the
the mean square QMC error for a function f n with respect to Zsn
b
variance of the estimator, see Proposition 2. For our case, where the discretization
error is negligible, we also have Var [0,1)s [I(P) ( f )] Var Zbsn [I(P+) ( f )]
Var Zbsn [I P+ ( f n )].
The variance Var Zbsn [I(P+) ( f )] is for practical computation where each
real number in [0, 1) is represented as a finite-digit binary fraction. The estima-

334

T. Goda et al.

tor I(P+) ( f ) of I ( f ) has so small a bias that the variance Var Zbsn [I(P+) ( f )] is
a good approximation of the mean square error EZbsn [(I(P+) ( f ) I ( f ))2 ].
From the above justifications of the n-digit discretization for digitally shifted
point sets, we focus on analyzing the variance Var Zbsn [I P+ ( f n )] of the estimator
I P+ ( f n ). As the main result of this paper, in Sect. 4 below, we give a Koksma
Hlawka type inequality to bound the variance:

Var Zbsn [I P+ ( f n )] Cb,s,n f n W (P; ),

(2)

where Cb,s,n and f n are the same as in (1), denotes the Dick weight defined
later in Definition 3, and W (P; ) is a quantity which depends only on P and can
be computed in O(sn|P|) steps. Thus, similarly to WAFOM(P), W (P; ) can be a
useful measure for the quality of digital nets.
The remainder of this paper is organized as follows. We give some preliminaries
in Sect. 2. In Sect. 3, we consider the randomized QMC integration over Zsn
b . For
R, a subgroup P Zsn
and an element Zsn
a function F : Zsn
b
b
b , we first
prove the unbiasedness of the estimator I P+ (F) as mentioned above, and then that
the variance Var Zbsn [I P+ (F)] can be written in terms of the discrete Fourier coefficients of F, see Theorem 2. In Sect. 4, we apply a bound on the Walsh coefficients
for sufficiently smooth functions to the variance Var Zbsn [I P+ ( f n )], and obtain a
quality measure W (P; ) which satisfies a KoksmaHlawka type inequality on the
root mean square error. By using the MacWilliams-type identity given in [13], we
give a computable formula for W (P; ) in Sect. 5. Finally, in Sect. 6, we conduct
two types of experiments to show that our new quality measure is of use for finding
digital nets which show good convergence behavior of the root mean square error
for smooth integrands.

2 Preliminaries
Throughout this paper, we use the following notation. Let N be the set of positive
integers and N0 := N {0}. For a set S, we denote by |S| the cardinality
of S. For
z C, we denote by z the complex conjugate of z. Let b = exp(2 1/b).
In the following, we recall the notion of the discrete Fourier transform and see
the correspondence of discrete Fourier coefficients to Walsh coefficients.
hg
define the pairing as g h := b . We also define the pairing
For g, h Zb , we 
sn
with
on Zb as A B := 1is,1 jn ai j bi j for A = (ai j ) and B = (bi j ) in Zsn
b
1 i s, 1 j n. We note the following properties used in this paper:
A B = (A B)1 = (A) B and A (B + C) = (A B)(A C).
We now define the discrete Fourier transform.

The Mean Square Quasi-Monte Carlo

335

Definition 2 Let f : Zsn


C. The discrete
by
b
Fourier transform of f , denotedsn
sn


sn f (B)(A B) for A Z


C,
is
defined
by
f
(A)
=
b
f : Zsn
BZb
b
b .

Each value f (A) is called a discrete Fourier coefficient.
We assume that P Zsn
is a digital net. We define the dual net of P as
b
|
A

B
=
1
for
all B P}. Several important properties of the
P := {A Zsn
b
discrete Fourier transform are summarized below (for a proof, see [13] for example).

Lemma 1 We have


AB =

AZbsn

bsn if B = 0,
0 if B = 0.

C be a function and
Theorem 1 (Poisson summation formula) Let f : Zsn
b


C
its
discrete
Fourier
transform.
Then
we
have
f : Zsn
b

1 

f (B) =
f (A).
|P| BP

AP

Walsh functions and Walsh coefficients are widely used to analyze QMC integration using digital nets, and are defined as follows. Let f : [0, 1)s R and
k = (k1 , . . . , ks ) Ns0 . We define the k-th Walsh function wal k by
wal k (x) :=

j1

i, j i, j

i=1


where for 1 i s, we write the b-adic expansion of ki by ki = j1 i, j b j1

and xi by xi = j1 i, j b j , where for each i, infinitely many of the digits i, j are
different from b 1. By using Walsh functions, we define the k-th Walsh coefficient
F ( f )(k);

F ( f )(k) :=

[0,1)s

f (x) wal k (x) d x.

We refer to [5, Appendix A] for general information on Walsh functions. We denote the kth Walsh coefficient of f by F ( f )(k), while it is denoted by 
f (k)
in [5, Appendix A]. The relationship between Walsh coefficients and discrete
Fourier coefficients is stated in the following proposition (for a proof, see [13,
sn
 Ns0 by
Lemma 2]).
(ai, j ) Zsn
b . We define the function : Zb
nLet A = j1
sn
s
(A) := ( j=1 ai, j b )i=1 for A = (ai, j ) Zb . Note that each element of
(A) is strictly less than bn .
Proposition 1 Let A = (ai, j ) Zsn
and assume that f : [0, 1)s R is integrable.
b
Then we have
F ( f )((A)) = 
f n (A).

336

T. Goda et al.

3 Mean Square Error with Respect to Digital Shifts


Let P Zsn
be a subset and F : Zsn
R a real-valued
function. Then QMC
b
b
1 
(F)
:=
|P|
integration by P is an approximation
I
P
BP F(B) of the actual

.
average value I (F) := bsn BZbsn F(B) of F over Zsn
b
For Zsn
b , we define the digitally shifted point set P + by P + = {B + |
B P}. We consider the mean and the variance of the estimator I P+ (F) for digitally
shifted point sets of P Zsn
b .
First we consider the average EZbsn [I P+ (F)]. We have
bsn

I P+ (F) = bsn

Zbsn


Zbsn

1 
1  sn
F(B + ) =
b
|P|
|P|
BP

BP

1 
I (F) = I (F),
|P|

F(B + )

Zbsn

BP

and thus we have the following proposition, showing that randomized QMC integration using a digitally shifted point set P + gives an unbiased estimator I P+ (F)
of I (F).
Proposition 2 For an arbitrary subset P Zsn
b , we have
EZbsn [I P+ (F)] = I (F).
It follows from this proposition that the mean square QMC error equals the variance Var Zbsn [I P+ (F)], namely we have
EZbsn [(I P+ (F) I (F))2 ] = Var Zbsn [I P+ (F)].
is a subgroup of Zsn
Hereafter we assume that P Zsn
b
b .
Lemma 2 Let P Zsn
be a subgroup. Then we have
b
I P+ (F) =


(A )1 F(A).

AP


Proof Let F (B) := F(B + ). Then for A Zsn
b , we can calculate F (A) as

F (A) = bsn

F (B)(A B)

BZbsn

= (A ())bsn


BZbsn


= (A )1 F(A),

F(B + )(A (B + ))

The Mean Square Quasi-Monte Carlo

337


where we use the definition of F(A)
in the last equality. Thus by Theorem 1 we have
I P+ (F) =



1 


F (B) =
(A )1 F(A),
F (A) =
|P| BP

AP

AP

which proves the result.


By Proposition 2 and Lemma 2, we have


Var Zbsn [I P+ (F)] := bsn

(I P+ (F) EZbsn [I P+ (F)])2

Zbsn

= bsn

|I P+ (F) I (F)|2

Zbsn

2



 

sn
1 

=b
(A ) F(A)



Zbsn AP \{0}
 


 )
= bsn
(A ) F(A)
(A ) F(A
Zbsn AP \{0}

AP \{0}

A P \{0}

= bsn
=

A P \{0}

 F(A
 )
F(A)

((A A) )

Zbsn



F(A)
 2 ,

AP \{0}

where the last equality follows from Lemma 1. Now we proved:


Theorem 2 Let P Zsn
be a subgroup. Then we have
b
Var Zbsn [I P+ (F)] =



F(A)
 2 .

AP \{0}

In particular, we immediately obtain the following corollary for the most important
case.
be a subgroup, i.e., a digital net over Zb , and f n be the
Corollary 1 Let P Zsn
b
n-digit discretization of f : [0, 1)s R. Then we have
Var Zbsn [I P+ ( f n )] =

2


f n (A) .

AP \{0}

Our results obtained in this section can be regarded as the discretized version of
known results [1, 7].

338

T. Goda et al.

4 WAFOM for the Root Mean Square Error


In the previous section, we obtained that the mean square QMC error is equal to
a certain sum of the squared discrete Fourier coefficients, and thus we would like
to bound the value | 
f n (A)|. By Proposition 1, it is sufficient to bound the Walsh
coefficients of f , and several types of upper bounds on the Walsh coefficients are
already known. In order to introduce bounds on the Walsh coefficients proved by
Dick [2, 3, 5], we define the Dick weight.
sn
Definition 3 Let A = (ai, j ) Zsn
N0 is defined as
b . The Dick weight : Zb

(A) :=

j (ai, j ),

1is
1 jn

where : Zb {0, 1} is defined as (a) = 0 for a = 0 and (a) = 1 for a = 0.


Here we consider functions f whose mixed partial derivatives up to order N,
> 1, in each variable are continuous. In [2, 3], Dick proved upper bounds on
Walsh coefficients for these functions. By letting = n, we have the following, see
also [4].
Lemma 3 (Dick) There exists a constant Cb,s,n depending only on b, s and n such
it holds that
that for any n-smooth function f : [0, 1)s R and any A Zsn
b



f n (A) Cb,s,n f n b(A) ,

(3)

where f n denotes the norm of f for a Sobolev space, which is defined as

f n :=

uS S\u {0,...,n1}s|u|





|u|

[0,1]

[0,1]s|u|

1/2
2

f ( S\u ,nu ) (x) d x S\u d x u ,

where we used the following notation: Let S := {1, . . . , s}, x = (x1 , . . . , xs ), and for
u S let x u = (x j ) ju . ( S\u , nu ) denotes a sequence ( j ) j with j = n for j u
/ u. Moreover, we write f (n 1 ,...,n s ) = n 1 ++n s f /x1n 1 xsn s .
and j = j for j
Another upper bound on the Walsh coefficients of f has been shown by Yoshiki
[14] for b = 2. Applying Proposition 1, we also have the following;
Lemma 4 (Yoshiki) Let f : [0, 1]s R and define Ni := |{ j = 1, . . . , n | ai, j =
0}| and N := (Ni )1is Ns0 for A = (ai, j ) Zsn
2 . If the Nth mixed partial derivN1
(N)
N1 ++Ns
Ns
=
f /x1 xs of f exists and is continuous, then we have
ative f




f n (A)  f (N)  2((A)+h(A)) ,
where h(A) :=


i, j

(4)

(ai, j ) is the Hamming weight and the supremum norm.

The Mean Square Quasi-Monte Carlo

339

Generally speaking, we cannot prove an inequality between f (N) and f n .


But it happens that f n is much larger than f (N) since f n is the summation of sn positive terms for large n. For example, when s = 1 and f = exp(x),
f (N ) = 1 while f n = ((n + 1)(1 e1 )2 + (1 e2 )/2)1/2 . In this case, if
we take n large enough, f (N ) / f n goes to 0. In this way, f (N) tends to be
small compared with f n .
Similar to [9] and [13], we define a kind of figure of merit corresponding to these
bounds on Walsh coefficients. Since Yoshikis bound (4) tends to be tighter than
Dicks bound (3), we use the figure of merit obtained by Yoshikis bound in the
experiment in the last section.
Definition 4 (Walsh figure of merit for the root mean square error) Let s, n be
a subgroup. We define two Walsh figures of merit
positive integers and P Zsn
b
for the root mean square error of P by
W (P; ) :=

 

b2(A) ,

AP \{0}

W (P; + h) :=

 

b2((A)+h(A)) .

AP \{0}

We have the following main result.


Theorem 3 (KoksmaHlawka type inequalities for the root mean square error) For
we have
an arbitrary subgroup P Zsn
b

Var Zbsn [I P+ ( f n )] Cb,s,n f n W (P; ).
Moreover, if b = 2 then


 (N) 
Var Z2sn [I P+ ( f n )] max  f  W (P; + h)
0Nn
N=0

holds where the condition for the maximum is denoted by a multi-index, i.e., the
maximum value is taken over N = (N1 , . . . , Ns ) such that 0 Ni n for all i and
Ni = 0 for some i.
Proof Since the proofs of these inequalities are almost identical, we only show the
latter. Apply Lemma
4 to each term in the right-hand side of the result in Corollary 1.


For the factor  f (N)  , note that N depends only on A, that A runs through all
non-zero elements of P , and that Ni n for all i. Then we have
Var Zbsn [I P+ ( f n )]


AP \{0}

2
 (N) 
max  f  22((A)+h(A))

0Nn
N=0

340

T. Goda et al.

and the result follows.

5 Inversion Formula for W ( P; )


sn
For A = (ai, j ) Zsn
R given by
b , we consider a general weight : Zb

(A) =

i, j (ai, j ),

1is
1 jn

where i, j R for 1 i s, 1 j n. In this section, we give a practically computable formula for


 
b2(A) .
W (P; ) :=
AP \{0}

Note that the Dick weight is given by i, j = j and the Hamming weight h is given
by i, j = 1. The key to the formula [9, (4.2)] for WAFOM is the discrete Fourier
transform. In order to obtain a formula for W (P; ), we use a MacWilliams-type
identity [13], which is also based on the discrete Fourier transform.
Let X := {xi, j (l)} be a set of indeterminates for 1 i s, 1 j n, and l Zb .
The complete weight enumerator polynomial of P , in a standard sense [8, Chap. 5],
is defined by


xi, j (ai, j ).
GW P (X ) :=
AP 1is
1 jn

Similarly, the complete weight enumerator polynomial of P is defined by


GW P (X ) :=

xi, j (bi, j ),

BP 1is
1 jn

where B = (bi, j )1is,1 jn and X := {xi, j (g)} is a set of indeterminates for


1 i s, 1 j n, and g Zb . We define Y := {yi, j (g)} for 1 i s, 1 j
n and g Z with
yi, j (0) = 1, yi, j (l) = b2i, j (l = 0).
Note that, by substituting Y into X for GW P (X ), we have
GW P (Y ) = W (P; )2 + 1.
By the MacWilliams-type identity for GW [13, Proposition 2], we have

The Mean Square Quasi-Monte Carlo

341

GW P (X ) =

1
GW P (Z ),
|P|

(5)

where in the right hand side every xi, j (g) X is substituted by z i, j (g) Z , which
is defined by

z i, j (g) :=
(l g)xi, j (l).
lZb

By substituting Y into X for (5), we have the following result. Since the result
follows in the same way as in [13, Corollary 2], we omit the proof.
be a subgroup. Then we have
Theorem 4 Let P Zsn
b


1 


W (P; ) = 1 +
(1 + (bi, j )b2i, j ),

|P| BP 1is
1 jn

where (bi, j ) = b 1 if bi, j = 0 and (bi, j ) = 1 if bi, j = 0.


In particular, we can compute W (P; ) and W (P; + h) as follows.
Corollary 2 Let P Zsn
be a subgroup. Then we have
b


1 


W (P; ) = 1 +
(1 + (bi, j )b2 j ),

|P| BP 1is
1 jn



1 


W (P; + h) = 1 +
(1 + (bi, j )b2( j+1) ),

|P| BP 1is
1 jn

where (bi, j ) = b 1 if bi, j = 0 and (bi, j ) = 1 if bi, j = 0.


While computing WAFOM by definition needs an iteration through P , Theorem 4 and Corollary 2 give it by iterating over P. For QMC, the size |P| cannot
exceed a reasonable number of computer operations opposed to huge |P |, and thus
Theorem 4 and Corollary 2 are useful in many cases.
We use the figure of merit W (P; + h) obtained by Yoshikis bound (4) in the
experiment of the next section.

6 Numerical Experiments
To show that W works as a useful bound on the root mean square error we conduct
two types of experiments. The first one is to generate many point sets at random, and

342

T. Goda et al.

to observe the distribution of the criterion W and the standard deviation E . The other
one is to search for low-W point sets and to compare with digital nets consisting of
the first terms of a known low-discrepancy sequence.
In this section we consider only the case b = 2. The dimension of a digital net P
is denoted by m, i.e., |P| = 2m . We set s = 4, 12 and
as a subvector space of Zsn
2
use the following eight test functions for x = (xi )1is :

Polynomial
f 0 (x) = ( i xi
)6 ,
Exponential
f j (x) = exp(a
 i xi ) (a = 2/3 for j = 1 and a = 3/2 for j = 2),
Oscillatory
f 3 (x) = cos( i xi ),
exp( i xi2 ),
Gaussian
f 4 (x) = 
Product peak
f 5 (x) = i (xi2 + 1)1 ,
Continuous
f 6 (x) = i T (xi ) where T (x) = minkZ |3x 2k|,
Discontinuous f 7 (x) = i C(xi ) where C(x) = (1)3x .
Assuming that the discretization error is negligible, we have that I(P+) ( f ) is a
practically unbiased
 estimator of I ( f ). Thus we may say that if the standard deviation E ( f ; P) := Var Z2sn [I(P+) ( f )] of the quasi-Monte Carlo integration is

small then the root mean square error EZ2sn [(I(P+) ( f ) I ( f ))2 ] is as small as
E (
f ; P). From the same assumption we also have that E ( f ; P) is well approximated
by Var Z2sn [I P+ ( f n )], on which we have a bound in Theorem 3.
In this section we implicitly use the weight + h so W (P) denotes W (P; + h).
The aim of the experiments is to establish that if W (P) is small then so is E ( f ; P).
For this we 
compute W by the inversion formula in Corollary 2 and approximate
uniE ( f ; P) = Var Z2sn [I(P+) ( f )] by sampling 210 digital shifts Zsn
2
formly, randomly and independently of each other. We shall observe both the criterion
W and the variance E in binary logarithm, which is denoted by lg.

6.1 The Distribution of (W , E )


In this experiment we set m = 10, 12 and n = 32, generate point sets P, compute
W (P), approximate E ( f ; P) for test functions f and observe (W , E ). We generate
1000 point sets P by random and uniform choice of generating matrices C1 , . . . , Cs
from the set (Znm
)s .
2
For each (s, m, f ) we calculate the correlation coefficient between W (P) and
E ( f ; P) log-scaled, obtaining the result as in Table 1. For typical distributions of
(W (P), E ( f ; P)) for smooth, continuous nondifferentiable and discontinuous functions we refer the readers to Figs. 1, 2, 3 and 4. We observe that there are very
high correlations (the correlation coefficient is larger than 0.85) between W (P) and
E ( f ; P) if f is smooth. Though f 6 is a nondifferentiable function we have moderate
correlation coefficients around 0.35. However, for the discontinuous function f 7 it
seems we can do almost nothing for the root mean square error through W (P).

The Mean Square Quasi-Monte Carlo

343

Table 1 The correlation coefficient between lg W (P) and lg E ( f ; P)


s
4
4
12
m
10
12
10
f0
f1
f2
f3
f4
f5
f6
f7

0.9861
0.9907
0.9897
0.9794
0.9723
0.9421
0.3976
0.0220

Fig. 1 s = 4 and m = 10.


The integrand is the
oscillatory function

f 3 (x) = cos( i xi )

0.9920
0.9901
0.9887
0.9818
0.9599
0.9144
0.3218
0.0102

12
12

0.9821
0.9842
0.9821
0.8900
0.9975
0.9912
0.4077
0.0208

0.9776
0.9866
0.9851
0.8916
0.9951
0.9839
0.3258
0.0171

2
3
4
5
6
+
7
+++
+
+
+
+
+
8
+
++ +
+
+
+
+
+
++
++
lg E 9
+
+
+
++
+
+
+
+
+
++
+
+
+
10
++
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
11
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
++
+
+
+
+
+
+
+
12
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
13
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
14
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
15 +
+
++
+
16
15 1413121110 9 8 7 6

+
+
+
+ +
+

5 4 3 2 1

lg W

Fig. 2 s = 12 and m = 12.


The integrand is the product
peak function

f 5 (x) = i (xi2 + 1)1

+
++
+

9
10

lg E 11
12
13
14
15
16
10

+++
+ +
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
++
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+++
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ +
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
9

lg W

344

T. Goda et al.

Fig. 3 s = 12 and m = 10.


The integrand is the
continuous nondifferentiable

function f 6 (x) = i T (xi )
where
T (x) = minkZ |3x 2k|

13

14

lg E
15

16
8

+
+
+
+
+
+++
+
+ +
+
+
+
+
+ +
+
++ ++
+++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ++++
+
+
+
+
+
+
+
+++
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
++
+
+
+
+
+
+
+
+
++ +
+
+ ++
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
++
+
++ +
+++
+
+
+
++
+
+
+
+++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+ +
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++ ++
++++
++++

+
+
+
+

lg W

Fig. 4 s = 4 and m = 12.


The integrand is the
discontinuous
 function
f 7 (x) = i C(xi ) where
C(x) = (1)3x

+ +
+
4

+ +
+++
+ + +
+
+

+ +

+ +
+
++
+
+
+ ++
+
+
+
+
+
+
+ +
+ +
+++
+ +
++ ++ +
++
+++
+
+
++
+
+
+ +
+
+ ++
++
+
++++
++
+ ++ +
+
+
+
+
+
+
+
+ +
+
+
++
+
+
++
+
++
+
+ ++
+ +
+
++
+
++
+
+
+
+
+
+
6
+
++
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
++
+
+
+
+
+ +
+ +++
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+ ++
+
+
+
+
+
+
+
+
+
+++
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ++ +
+
+++
+++
+
+
+
+
++
+
+
+
+
+
+
+
+
+
++
+
+
+
+
++++
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+++
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+ +
+
+
+
+
+
+ +
+
+
+
+
+
+
+
+
+
+++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ++
+
+
+
+
+
+
7 +
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ +
+
+
++
+
+
+
++ +
+ ++
+
+ + +
+
+
+++ + +
+
5

lg E

+
+
+
+
+
+
+
+

+
+

+
+
+
+
+ +
+
+ + +
+

191817161514131211109 8 7 6 5 4 3

lg W

6.2 Comparison to Known Low-Discrepancy Sequence


In this experiment we set n = 30. For 8 m < 16, let P be a low-W point set and
PNX the digital net consisting of the first 2m points of an s-dimensional NiederreiterXing sequence from [12]. Here we search for low-W point sets based on simulated
annealing as follows:
1. Let s, m, n N be fixed.
2. For = 4, . . . , 12, do the following:
a. Choose generating matrices C1(i) , . . . , Cs(i) randomly and uniformly from the
set (Znm
)s and denote by P (i) the digital net generated by C1(i) , . . . , Cs(i) for
2
i = 1, . . . , 2 .

The Mean Square Quasi-Monte Carlo

345

b. Find C1(i) , . . . , Cs(i) such that W (P (i) ) W (P ( j) ) for all j = 1, . . . , 2 . Let


C1 = C1(i) , . . . , Cs = Cs(i) and P = P (i) .
c. For l = 1, . . . , 2 , do the following:
i. Choose matrices A = (ai j ) and B = (bi ) randomly and uniformly from
and Zm
the sets Zsn
2 , respectively.
2
(s)
nm
ii. Construct generating matrices D1 = (di(1)
by
j ), . . . , Ds = (di j ) Z2
(h)
di(h)
j = ci j + b j ahi

for 1 i n, 1 j m and 1 h s, where we write C1 = (ci(1)


j ), . . . ,
(s)
Cs = (ci j ). Denote by Q the digital net generated by D1 , . . . , Ds .
iii. Replace C1 , . . . , Cs and P by D1 , . . . , Ds and Q with probability

min

W (P)
W (Q)

1/Tl


,1 .

3. Output P which gives a minimum value of W during the process 2.


In the above algorithm, Ti is called the temperature and is given in the form T i
for 0 < < 1. T and are determined such that T1 = 1 and T2 = 0.01 for a given
. Note that point sets we obtain by this algorithm are not extensible in m, i.e.,
one cannot increase the size of P while retaining the existing points. For a search
for extensible point sets which are good in a W -like (but different in weight and
exponent) criterion, see [6].
Varying m, we observe lg W (PNX ), lg W (P) and lg E ( f ; PNX ), lg E ( f ; P) for
each test function in Table 2. As shown in Figs. 5 and 6, the W -value of point sets P
optimized in W by our algorithm is far better than that of PNX , however this is not
surprising. The W -values of PNX have plateaus and sudden drops. In Figs. 7 and 8 are
the root mean square errors for two test functions; we clearly observe higher order
convergence in the former for the smooth function f 5 and for the discontinuous
function f 7 in the latter only lower order convergence can be achieved by both
methods.

6.3 Discussion
The first experiment shows that W works as a useful bound on E for some of the
functions tested above. The other experiment shows that point sets with low W
values are easy enough to find and perform better for smooth test functions, while
these point sets work as badly as the Niederreiter-Xing sequence for non-smooth or
discontinuous functions.

lg W (PNX )
lg W (P)
lg E ( f 0 ; PNX )
lg E ( f 0 ; P)
$ lg E ( f 1 ; PNX )
lg E ( f 1 ; P)
lg E ( f 2 ; PNX )
lg E ( f 2 ; P)
lg E ( f 3 ; PNX )
lg E ( f 3 ; P)
lg E ( f 4 ; PNX )
lg E ( f 4 ; P)
lg E ( f 5 ; PNX )
lg E ( f 5 ; P)
lg E ( f 6 ; PNX )
lg E ( f 6 ; P)
lg E ( f 7 ; PNX )
lg E ( f 7 ; P)
lg W (PNX )
lg W (P)
lg E ( f 0 ; PNX )

4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
12
12
12

10.31
12.59
0.19
2.14
9.81
12.74
3.76
5.25
10.93
13.13
12.44
13.16
13.24
13.81
9.77
8.93
4.32
4.53
5.18
6.16
9.95

m=8
12.40
14.39
2.17
3.99
11.99
14.72
5.60
6.87
13.62
14.91
14.57
15.69
15.39
16.24
11.23
10.31
4.96
4.12
6.07
6.93
8.89

9
12.90
16.39
3.22
6.03
12.07
16.54
6.67
8.82
14.14
17.00
15.00
17.26
15.57
17.89
11.54
11.70
5.70
5.25
6.68
7.89
8.00

10
12.98
17.91
3.45
7.51
12.12
18.62
6.93
10.20
14.47
18.57
15.14
18.05
15.67
18.30
12.13
9.55
6.17
5.68
6.82
8.67
7.84

11
15.74
19.50
5.93
9.35
15.01
20.58
9.42
11.55
16.84
20.17
17.88
19.75
18.48
20.66
12.20
11.88
6.47
6.21
6.92
9.66
7.80

12

Table 2 Comparison between NiederreiterXing sequences (PNX ) and low-W point sets (P) in lg W and lg E .
13
15.77
21.82
5.98
11.95
15.00
23.09
9.50
13.51
16.84
22.40
17.97
21.43
18.55
21.79
14.57
14.85
6.65
7.40
6.98
10.73
7.76

14
15.77
23.67
5.94
13.63
14.98
24.82
9.46
15.34
16.86
24.28
17.95
24.32
18.55
25.12
15.92
15.56
8.06
7.05
11.52
11.67
1.39

15

(continued)

23.20
26.00
12.75
16.40
23.26
27.47
15.92
17.45
24.03
27.04
25.30
24.46
26.47
24.66
17.60
17.19
9.22
8.84
12.01
12.64
0.09

346
T. Goda et al.

lg E ( f 0 ; P)
lg E ( f 1 ; PNX )
lg E ( f 1 ; P)
lg E ( f 2 ; PNX )
lg E ( f 2 ; P)
lg E ( f 3 ; PNX )
lg E ( f 3 ; P)
lg E ( f 4 ; PNX )
lg E ( f 4 ; P)
lg E ( f 5 ; PNX )
lg E ( f 5 ; P)
lg E ( f 6 ; PNX )
lg E ( f 6 ; P)
lg E ( f 7 ; PNX )
lg E ( f 7 ; P)

12
12
12
12
12
12
12
12
12
12
12
12
12
12
12

Table 2 (continued)

8.09
0.57
2.20
11.02
10.58
6.14
7.18
10.56
11.54
10.69
12.00
13.64
13.87
4.06
4.00

m=8
7.19
1.60
3.05
10.20
9.91
7.34
8.01
11.52
12.36
11.70
12.86
14.31
14.16
4.45
4.51

9
6.05
2.43
4.12
9.77
9.07
8.32
9.01
12.07
13.28
12.33
13.97
14.93
14.83
4.93
5.00

10
4.98
2.60
5.07
9.54
8.53
8.64
10.16
12.27
14.09
12.62
14.86
15.65
15.48
5.50
5.52

11
4.15
2.64
5.97
9.40
7.53
8.97
10.78
12.39
14.82
12.70
15.17
16.11
15.97
6.01
5.96

12
2.46
2.69
7.35
9.25
6.80
9.27
11.86
12.41
16.17
12.71
16.99
16.62
16.45
6.48
6.50

13
1.49
8.27
8.36
6.00
5.84
12.74
12.90
16.99
17.10
18.09
17.90
17.10
17.30
7.02
6.95

14

0.31
8.99
9.61
5.45
5.18
13.51
13.76
17.47
18.20
18.69
19.34
17.54
18.09
7.48
7.52

15

The Mean Square Quasi-Monte Carlo


347

348
Fig. 5 W values for s = 4

T. Goda et al.
10
11
12
13
14
15
16
17
lg W 18
19
20
21
22
23
24
25
26

Niederreiter-Xing sequence
Low-W digital nets
+
+

10

11

12

13

14

15

dimension/F2

Fig. 6 W values for s = 12

10

11

12
13

lg W

Niederreiter-Xing sequence
Low-W digital nets

10

11

12

13

14

15

dimension/F2

Fig. 7 s = 4. The integrand


is the product
 peak function
f 5 (x) = i (xi2 + 1)1

13
14
15
16
17
18
19
lg E 20
21
22
23
24
25
26
27

Niederreiter-Xing sequence
Low-W digital nets
+
+

10

11

12

dimension/F2

13

14

15

The Mean Square Quasi-Monte Carlo


Fig. 8 s = 12. The
integrand is the
discontinuous
 function
f 7 (x) = i C(xi ) where
C(x) = (1)3x

349

+
+

Niederreiter-Xing sequence
Low-W digital nets

lg E 6

8
8

10

11

12

13

14

15

dimension/F2

Acknowledgments The authors would like to thank Prof. Makoto Matsumoto for helpful discussions and comments. The work of T.G. was supported by Grant-in-Aid for JSPS Fellows No.24-4020.
The works of R.O., K.S. and T.Y. were supported by the Program for Leading Graduate Schools,
MEXT, Japan. The work of K.S. was partially supported by Grant-in-Aid for JSPS Fellows Grant
number 15J05380.

References
1. Baldeaux, J., Dick, J.: QMC rules of arbitrary high order: reproducing kernel Hilbert space
approach. Constr. Approx. 30(3), 495527 (2009)
2. Dick, J.: Walsh spaces containing smooth functions and quasi-Monte Carlo rules of arbitrary
high order. SIAM J. Numer. Anal. 46(3), 15191553 (2008)
3. Dick, J.: The decay of the Walsh coefficients of smooth functions. Bulletin of the Australian
Mathematical Society 80(3), 430453 (2009)
4. Dick, J.: On quasi-Monte Carlo rules achieving higher order convergence. In: Monte Carlo and
Quasi-Monte Carlo Methods 2008, pp. 7396. Springer, Berlin (2009)
5. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and quasi-Monte
Carlo integration. Cambridge University Press, Cambridge (2010)
6. Harase, S., Ohori, R.: A search for extensible low-WAFOM point sets (2013)
7. LEcuyer, P., Lemieux, C.: Recent advances in randomized quasi-Monte Carlo methods.
Modeling uncertainty. International Series in Operations Research and Management Science,
vol. 46, pp. 419474. Kluwer Academic Publishers, Boston, MA (2002)
8. MacWilliams, F.J., Sloane, N.J.A.: The theory of error-correcting codes. I. North-Holland
Mathematical Library, North-Holland Publishing Co., Amsterdam (1977)
9. Matsumoto, M., Saito, M., Matoba, K.: A computable figure of merit for quasi-Monte Carlo
point sets. Math. Comput. 83(287), 12331250 (2014)
10. Matsumoto, M., Yoshiki, T.: Existence of higher order convergent quasi-Monte Carlo rules via
Walsh figure of merit. In: Monte Carlo and Quasi-Monte Carlo Methods 2012, pp. 569579.
Springer, Heidelberg (2013)
11. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods, CBMS-NSF
Regional Conference Series in Applied Mathematics, vol. 63. Society for Industrial and Applied
Mathematics (SIAM), Philadelphia, PA (1992)

350

T. Goda et al.

12. Nuyens, D.: The magic point shop of QMC point generators and generating vectors, http://
people.cs.kuleuven.be/~dirk.nuyens/qmc-generators/
13. Suzuki, K.: WAFOM on abelian groups for quasi-Monte Carlo point sets. Hiroshima Math. J.
45(3), 341364 (2015)
14. Yoshiki, T.: Bounds on Walsh coefficients by dyadic difference and a new Koksma-Hlawka
type inequality for Quasi-Monte Carlo integration (2015)

Uncertainty and Robustness in Weather


Derivative Models
Ahmet Gnc, Yaning Liu, Giray kten and M. Yousuff Hussaini

Abstract Pricing of weather derivatives often requires a model for the underlying
temperature process that can characterize the dynamic behavior of daily average
temperatures. The comparison of different stochastic models with a different number
of model parameters is not an easy task, especially in the absence of a liquid weather
derivatives market. In this study, we consider four widely used temperature models
in pricing temperature-based weather derivatives. The price estimates obtained from
these four models are relatively similar. However, there are large variations in their
estimates with respect to changes in model parameters. To choose the most robust
model, i.e., the model with smaller sensitivity with respect to errors or variation in
model parameters, the global sensitivity analysis of Sobol is employed. An empirical
investigation of the robustness of models is given using temperature data.
Keywords Weather derivatives Sobol sensitivity analysis Model robustness

1 Introduction
Weather related risks exist in many economic sectors, especially in agriculture,
tourism, energy, and construction. Hanley [10] reports that about one-seventh of
the industrialized economy is sensitive to weather. The weather related risks can be
A. Gnc
Xian Jiaotong Liverpool University, Suzhou 215123, China
e-mail: Ahmet.Goncu@xjtlu.edu.cn
Y. Liu
Hydrogeology Department, Earth Sciences Division,
Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
e-mail: yaningliu@lbl.gov
G. kten (B) M.Y. Hussaini
Florida State University, Tallahassee, FL 32306, USA
e-mail: okten@math.fsu.edu
M.Y. Hussaini
e-mail: yousuff@fsu.edu
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_17

351

352

A. Gnc et al.

hedged via weather derivatives, which is a relatively new form of a financial instrument that has contingent payoffs with respect to possible weather events or indices.
The market for weather derivatives was established in the USA in 1997 following
the deregulation of the energy market. The Weather Risk Management Association
(WRMA) reported that as of 2011 the weather derivatives market has grown to 12
billion US dollars. The Chicago Mercantile Exchange (CME) trades standardized
weather derivatives with the highest trading volume in temperature-based weather
derivatives; this type of derivatives is the focus of this study.
There are different approaches to price weather derivatives, such as, historical
burn analysis, index modeling, and stochastic modeling of daily average temperatures ([13]). In the stochastic modelling approach, a mean-reverting process such as
the OrnsteinUhlenbeck process is often used for modelling the evolution of daily
average temperatures at a particular measurement station. Amongst others, some
examples of studies that follow this approach are given by Alaton et. al. [1], Benth
and Benth [3], Brody et al. [4], Cao and Wei [6], Platen and West [20], Huang et al.
[12], and Gnc [8]. Some studies suggest the superiority of daily temperature modelling over the index modelling approach (Oetomo and Stevenson [18], Schiller et al.
[25]). Another important modeling approach uses time series to model daily average
temperatures. An example is the model of Campbell and Diebold [5], which forecasts
daily average temperatures using an autoregressive conditional heteroscedasticity
(ARCH) model.
Within the class of dynamic models of daily temperatures, four models that are
highly cited in the literature (see, for example, the survey by Schiller et. al. [25]) and
widely used in the weather derivatives industry are given by Alaton et al. [1], Benth
and Benth [3], Brody et al. [4] and Campbell and Diebold [5]. In the study by Gnc
[9] these four models are compared in terms of their forecasting power of the futures
prices for different locations. Different models come with different parameters that
need to be estimated from the historical data, and although we may know how accurately a certain parameter can be estimated, the question of the impact of the parameter estimation error on the overall model has not been investigated in the literature. In
this paper, we propose a framework based on global sensitivity analysis to assess the
robustness of a model with respect to the uncertainties in its parameters. We apply our
methodology to the four different temperature models given in [1], [35].
The paper is organized as follows. In Sect. 2, we describe the dataset utilized,
introduce the temperature models investigated, and present estimation results of
each model. Section 3 discusses the global sensitivity analysis employed and Sect. 4
presents numerical results and conclusions.

2 Modelling of Daily Average Temperatures


In the weather derivatives market, daily temperatures are defined as the average of
the minimum and maximum temperatures observed during a given day. The most
common type of weather derivative contracts are based on the heating and cooling
degree days index, defined as follows.

Uncertainty and Robustness in Weather Derivative Models

353

Definition 1 (Heating/Cooling Degree Days) Let Ti denote the temperature for


day i. We define heating degree-days (HDD) and cooling degree-days for a given
day i and reference temperature Tr e f as H D Di = max(Tr e f Ti , 0), and C D Di =
max(Ti Tr e f , 0), respectively. The industry convention for the reference temperature Tr e f is 18 C (or, 65 Fahrenheit), which we adopt in this paper.
The number 
of HDDs and CDDs accumulated
for a contract period of n days are
n
n
H D Di and Cn = i=1
C D Di , respectively.
given by Hn = i=1
Definition 2 (Weather Options) Call and put options are defined with respect to the
accumulated HDDs or CDDs during a contract period of n days and a predetermined
strike level K . The payoff of the call and put options written on the accumulated
HDDs (or CDDs) during a contract period of n days is given as max(Hn K , 0) and
max(K Hn , 0), respectively.
In the standard approach to price financial derivatives, one uses the risk neutral
dynamics of the underlying variables, which are often tradable, and from no-arbitrage
arguments an arbitrage free price is obtained. On the other hand, the underlying
for weather derivatives is a temperature index, which is not tradable, and thus noarbitrage arguments do not apply. However, one can still find a risk neutral measure
(which will be model dependent) from the market price of weather derivatives. (eg.
see [11])
In this section, we describe temperature models given by Alaton et. al. [1], Benth
and Benth [3], Brody et. al. [4], and Campbell and Diebold [5]. In the first three
models, the long-term dynamics of daily average temperatures are modeled deterministically. The long-term mean temperature at time t is given by
Ttm = A + Bt + C sin(t) + D cos(t),

(1)

where = 2/365. The sine and cosine functions capture the seasonality of daily
temperatures, whereas the linear term captures the trend in temperatures which might
be due to global warming or urbanization effects. The parameters A, B, C, D can
be estimated from the data by a linear regression. An improvement in the fit can
be obtained by increasing the number of sine and cosine functions in the above
representation. However, in our dataset, we did not observe any significant improvements by adding more terms. Our dataset consists of daily average temperatures1
and HDD/CDD monthly futures prices for the measurement station at New York La
Guardia International Airport. Daily average temperature data for the period between
01/01/1997 and 01/21/2012 is used to estimate the parameters of each model considered. In Fig. 1, the historical temperatures for New York are plotted.

1 Daily

average temperatures are measured by the Earth Satellite Corporation and our dataset is
provided by the Chicago Mercantile Exchange (CME).

354

A. Gnc et al.

Fahrenheit

100
90
80
70
60
50
40
30
20
10
0
0

1000

2000

3000

4000

5000

6000

Sample Size (Number of days)

Fig. 1 Daily average temperatures at New York La Guardia Airport: 19972012

2.1 The Model by Alaton, Djehiche, and Stillberger (2002)


In the model by Alaton et. al. [1], the daily temperatures are modeled by a mean
reverting OrnsteinUhlenbeck process

dTt =


dTtm
+ a(Ttm Tt ) dt + t dWt ,
dt

(2)

where Tt is the temperature at time t, a is the mean reversion parameter, t is a piecewise constant volatility function, Wt is P-Brownian motion (the physical probability
measure) and Ttm is the long-term mean temperature given by Eq. (1).
The volatility of daily temperatures t is assumed to be constant for each month
of the year. We will not discuss the estimation of model parameters since they are
explained in [1]. We estimate the piecewise constant volatility function for our dataset
using the regression and quadratic variation methods. Figure 2 plots these results,
Fig. 2 Empirical versus estimated volatility $\sigma(t)$ over a year: monthly volatility from the quadratic variation and regression methods, the Fourier series fitted to the empirical volatility, and the empirical volatility for each day of the year

Table 1 Estimated parameters for the model by Alaton, Djehiche, and Stillberger (standard errors of estimators in parentheses)

  A = 55.7952 (0.1849)   B = 3.0 × 10⁻⁴ (5.6 × 10⁻⁵)   C = 8.7965 (0.1307)   D = 20.0178 (0.1307)   a = 0.3491 (0.01)

Table 2 Estimated monthly volatility (σ_t) for each month of the year, for the model by Alaton, Djehiche, and Stillberger (standard errors of estimators in parentheses)

  Jan 6.36 (0.76)   Feb 5.84 (0.64)   Mar 5.82 (0.64)   Apr 5.52 (0.57)   May 4.69 (0.41)   Jun 4.53 (0.39)
  Jul 3.61 (0.25)   Aug 3.53 (0.23)   Sep 4.03 (0.30)   Oct 4.67 (0.41)   Nov 5.00 (0.47)   Dec 5.96 (0.67)

together with the empirical daily volatility and its Fourier series fit. Tables 1 and 2
display the estimated model parameters (including the parameters for Eq. (1)) for
our dataset with the standard errors given in parentheses.
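The following sketch shows one way to simulate the model of Eq. (2) with a daily Euler step and to estimate the HDD call payoff of Definition 2 by Monte Carlo. It is our own illustration, not the authors' pricing code; the 65 °F HDD reference temperature, the daily step size, the use of the January volatility throughout the contract, and the seed are assumptions.

```python
# Sketch: daily Euler discretization of Eq. (2) using the estimates in Tables 1-2,
# followed by a Monte Carlo estimate of the HDD call payoff max(H_n - K, 0).
import numpy as np

A, B, C, D, a = 55.7952, 3.0e-4, 8.7965, 20.0178, 0.3491
monthly_sigma = [6.36, 5.84, 5.82, 5.52, 4.69, 4.53, 3.61, 3.53, 4.03, 4.67, 5.00, 5.96]
omega = 2 * np.pi / 365

def mean_temp(t):
    return A + B * t + C * np.sin(omega * t) + D * np.cos(omega * t)

def simulate_hdd_call(t0, n_days, strike, n_paths, base_temp=65.0, seed=1):
    rng = np.random.default_rng(seed)
    payoffs = np.empty(n_paths)
    for p in range(n_paths):
        T = mean_temp(t0)                      # start each path at the seasonal mean
        hdd = 0.0
        for i in range(n_days):
            t = t0 + i
            sigma = monthly_sigma[0]           # January contract assumed here
            dTm = mean_temp(t + 1) - mean_temp(t)
            T = T + dTm + a * (mean_temp(t) - T) + sigma * rng.standard_normal()
            hdd += max(base_temp - T, 0.0)
        payoffs[p] = max(hdd - strike, 0.0)
    return payoffs.mean()

print(simulate_hdd_call(t0=5475, n_days=31, strike=800, n_paths=2000))
```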

2.2 The Model by Benth and Benth (2007)


Benth and Benth [3] use the same mean reverting Ornstein–Uhlenbeck process used
by Alaton et al. [1], but model the volatility function differently:
$$\sigma_t^2 = c_0 + \sum_{i=1}^{I_1} c_i \sin(i\omega t) + \sum_{j=1}^{J_1} d_j \cos(j\omega t), \qquad (3)$$
where $\omega = 2\pi/365$. Following [3], we set $I_1 = J_1 = 4$ in the above equation in
our numerical results. Volatility estimates obtained from Eq. (3) are given in Fig. 2
(the curve labeled as "Fourier series fitted to empirical volatility"). The long-term
average temperatures are modeled in the same way as in Alaton et al. [1] by Eq. (1),
hence the estimated parameters, $A, B, C, D$, are the same as given in Table 1. The
estimates for the rest of the parameters of the model are displayed in Table 3.
Table 3 Estimated parameters for the model by Benth and Benth (standard errors of estimators in parentheses)

  c₀ = 24.0422 (0.5450)   c₁ = 6.9825 (0.7708)   c₂ = 0.1127 (0.7708)   c₃ = 0.3783 (0.7708)   c₄ = 1.2162 (0.7708)
  d₁ = 9.3381 (0.7708)   d₂ = 0.1068 (0.7708)   d₃ = 0.4847 (0.7708)   d₄ = 1.1303 (0.7708)

Fig. 3 Estimation of the Hurst exponent using the estimator in [4]: log σ(T) against log(T), with fitted slope −0.36 (H = 0.64) compared with the H = 0.50 reference

2.3 The Model by Brody, Syroka, and Zervos


Brody et al. [4] generalize the Ornstein–Uhlenbeck stochastic process used in the
previous models by replacing the Brownian motion in the stochastic differential
equation (2) with a fractional Brownian motion, giving the following equation:
$$dT_t = \left[\frac{dT_t^m}{dt} + a(T_t^m - T_t)\right]dt + \sigma_t\, dW_t^H. \qquad (4)$$
Here $W_t^H$ is a fractional Brownian motion defined on a probability space $(\Omega, \mathcal{F}, P^H)$. See
[4] for the properties of fractional Brownian motion. The motivation for the use of
fractional Brownian motion is to capture possible long memory effects in the data.
The Hurst exponent, $H$, characterizes the persistence in the fractional Brownian
motion process. We estimated the Hurst exponent using the statistic described in
Brody et al. [4], which measures the variability of temperature with respect to time.
In the absence of long-memory effects, we would expect to observe a decay in
the standard deviation proportional to $\sigma(T) \propto T^{-0.5}$, whereas an exponent between
0 and 0.5 in absolute value suggests the existence of temporal correlation between daily average
temperatures. As can be seen in Fig. 3, the decay of the standard deviation follows
$\sigma(T) \propto T^{-0.36}$, which supports the existence of such temporal correlation, and thus a
long-memory effect. The deterministic part of the temperature dynamics, i.e., the
trend and seasonal terms, are modeled as given in Eq. (1).
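A sketch of one standard estimator consistent with the scaling σ(T) ∝ T^{H−1} used above (the aggregated-variance method) is given below; the exact statistic of [4] may differ in details, and the block sizes and test data are assumptions.

```python
# Sketch: Hurst exponent from the decay of the standard deviation of block means,
# sigma(T) ~ T^(H-1); a fitted slope of -0.36 corresponds to H = 0.64 as in Fig. 3.
import numpy as np

def hurst_aggregated_variance(x, block_sizes):
    x = np.asarray(x, dtype=float)        # detrended/deseasonalized temperatures
    log_T, log_sigma = [], []
    for T in block_sizes:
        n_blocks = len(x) // T
        if n_blocks < 2:
            continue
        block_means = x[:n_blocks * T].reshape(n_blocks, T).mean(axis=1)
        log_T.append(np.log(T))
        log_sigma.append(np.log(block_means.std(ddof=1)))
    slope, _ = np.polyfit(log_T, log_sigma, 1)
    return 1.0 + slope                    # H = 1 + slope

rng = np.random.default_rng(0)
iid_noise = rng.standard_normal(5000)     # no long memory: expect H close to 0.5
print(hurst_aggregated_variance(iid_noise, block_sizes=[2**k for k in range(1, 9)]))
```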


2.4 The Model by Campbell and Diebold


The model proposed by Campbell and Diebold [5] follows a non-structural ARCH-type
time series modeling approach. Different from [1] and [3], autoregressive lags
of daily average temperatures are also included as explanatory variables in the model.
The time series model proposed in [5] is given by
$$T_t = \beta_1 + \beta_2 t + \sum_{l=1}^{L} \left[\delta_l \sin(l\omega t) + \lambda_l \cos(l\omega t)\right] + \sum_{p=1}^{P} \rho_p T_{t-p} + \sigma_t \varepsilon_t, \qquad (5)$$
$$\sigma_t^2 = \alpha_0 + \sum_{q=1}^{Q} \left[\gamma_q \sin(q\omega t) + \phi_q \cos(q\omega t)\right] + \sum_{r=1}^{R} \alpha_r \left(\sigma_{t-r}\varepsilon_{t-r}\right)^2, \qquad (6)$$
where $\varepsilon_t \sim N(0, 1)$ iid. Based on a similar preliminary data analysis as described
in [5], we set $L = 1$, $P = 10$, $Q = 1$, $R = 9$. First we regress the temperature data on
the trend, seasonal term and autoregressive lags. We follow Engle's [7] two-step
estimation approach, which is also used in [5], to remove the heteroscedasticity and
seasonality in the data. The estimated parameters are given in Tables 4 and 5.
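A simplified OLS sketch of this two-step procedure is given below. It is our own illustration of the idea (regress the mean equation first, then fit the variance equation to squared residuals); the estimation in [5] and in the paper may differ in details, and the variable names are assumptions.

```python
# Sketch: two-step OLS estimation in the spirit of Sect. 2.4 with L = 1, P = 10, R = 9.
import numpy as np

def campbell_diebold_ols(temps, P=10, R=9):
    temps = np.asarray(temps, dtype=float)
    n = len(temps)
    t = np.arange(1, n + 1, dtype=float)
    omega = 2 * np.pi / 365
    seas = np.column_stack([np.sin(omega * t), np.cos(omega * t)])
    # Step 1: conditional mean with trend, one seasonal pair and P autoregressive lags
    X1 = np.column_stack(
        [np.ones(n - P), t[P:], seas[P:]]
        + [temps[P - p:n - p] for p in range(1, P + 1)])
    beta, *_ = np.linalg.lstsq(X1, temps[P:], rcond=None)
    resid = temps[P:] - X1 @ beta
    # Step 2: ARCH-type variance equation fitted to the squared residuals
    e2 = resid ** 2
    m = len(e2)
    X2 = np.column_stack(
        [np.ones(m - R), seas[P + R:]]
        + [e2[R - r:m - r] for r in range(1, R + 1)])
    gamma, *_ = np.linalg.lstsq(X2, e2[R:], rcond=None)
    return beta, gamma
```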
The four models we have discussed share the common characteristic that seasonal temperature patterns are modeled via sine and cosine functions and thus have
the same expected value for future long-term mean temperatures. Furthermore, the
models by Alaton et al. [1], Benth and Benth [3], and Campbell and Diebold [5]

Table 4 Estimated parameters for the model by Campbell and Diebold (standard errors of estimators in parentheses)

  β₁ = 15.2851 (0.8534)   β₂ = 0.0001 (3.6 × 10⁻⁵)   δ₁ = 1.0969 (0.1424)   λ₁ = 5.9156 (0.3247)
  ρ₁ = 0.8820 (0.0137)   ρ₂ = 0.3184 (0.0184)   ρ₃ = 0.1193 (0.0187)   ρ₄ = 0.0149 (0.0189)   ρ₅ = 0.0160 (0.0192)
  ρ₆ = 0.0185 (0.0189)   ρ₇ = 0.0019 (0.0186)   ρ₈ = 0.0066 (0.0189)   ρ₉ = 0.0207 (0.0183)   ρ₁₀ = 0.0017 (0.0134)

Table 5 Estimated parameters for the model by Campbell and Diebold, cont'd (standard errors of estimators in parentheses)

  α₀ = 16.4401 (0.9091)   γ₁ = 2.2933 (0.6893)   φ₁ = 7.3571 (0.7528)
  α₁ = 0.0294 (0.0133)   α₂ = 0.0366 (0.0133)   α₃ = 0.0110 (0.0133)   α₄ = 0.0465 (0.0133)   α₅ = 0.0505 (0.0133)
  α₆ = 0.0114 (0.0133)   α₇ = 0.0151 (0.0133)   α₈ = 0.0611 (0.0133)   α₉ = 0.0043 (0.0133)


assume a Gaussian noise term after removing the effects of trend, seasonality, and
heteroscedasticity in the daily temperatures, whereas the model by Brody et al. [4]
captures the long-memory effects by using fractional Brownian motion, different
from the other models. For option pricing of short-term weather contracts it is possible to assume a simpler form of heteroscedasticity in the volatility, which would
be sufficient to price monthly weather options (see [9]). The model by Campbell
and Diebold [5] might be prone to pricing errors due to the large number of ARCH
coefficients to be estimated, whereas the model by Brody et al. [4] suffers from the
difficulty of estimating the Hurst exponent and the long-term sensitivity with respect to
this parameter. These issues are investigated in the next section.

3 Global Sensitivity Analysis


Global sensitivity analysis (SA) measures parameter importance by considering variations of all input parameters at the same time. As a result, interactions among
different inputs can be detected. Among all global SA methods, Sobol sensitivity measures [16, 23, 26, 27] that utilize the analysis of variance (ANOVA) of the
model output are the most widely used. Variance-based global sensitivity analysis
has the advantage that type II errors (failure to identify a significant parameter) can
be avoided with a higher probability (Saltelli [24]). Other advantages include model
independence, full exploration of input parameter ranges, as well as capabilities to
capture parameter interactions and tackle groups of parameters (Saltelli [24]). Other
techniques (e.g. EFAST (Saltelli [22]) and DGSM (Sobol [28], Kucherenko [14]))
have been developed to approximate Sobols sensitivity measures with less computational cost. However, they can give inaccurate sensitivity indices in certain situations
(e.g. Sobol [28]) and computational efficiency is not a focus in this study.
There is an extensive literature on applications of Sobol sensitivity measures, for
example, Kucherenko et. al. [15] use Sobol sensitivity measures to identify model
effective dimensions, which are closely related to the effectiveness of applying quasiMonte Carlo sequences; Rohmer et. al. [21] perform Sobol global sensitivity analysis
in computationally intensive landslide modelling with the help of Gaussian-process
surrogate modeling; Alexanderian et. al. [2] compute Sobol sensitivity measures for
an ocean general circulation model by constructing a polynomial chaos expansion of
the model outputs; and Liu et. al. [17] utilize Sobol sensitivity measures to identify
the important input parameters in a wildland surface fire spread model to develop
efficient simulations.
Let $u \subseteq \{1, \ldots, d\}$ be an index set and let $x^u$ denote the $|u|$-dimensional vector
with elements $x_j$ for $j \in u$. The ANOVA decomposition writes a square integrable function
$f(x)$, defined on the $d$-dimensional unit hypercube $I^d = [0, 1]^d$, as
$f(x) = \sum_{u \subseteq \{1,\ldots,d\}} f_u(x^u)$, where $f_u(x^u)$ is a function that only depends on the
variables in $u$. Each component function $f_u(x^u)$ is associated with a variance, called
a partial variance, defined as $\sigma_u^2 = \int_{[0,1]^{|u|}} f_u(x^u)^2\, dx^u$. The variance of the function $f(x)$, called the total variance, is $\sigma^2 = \int_{[0,1]^d} f(x)^2\, dx - \left(\int_{[0,1]^d} f(x)\, dx\right)^2$. The total variance
can be written as the sum of all partial variances: $\sigma^2 = \sum_{\emptyset \ne u \subseteq \{1,\ldots,d\}} \sigma_u^2$. Based on
the ANOVA decomposition, Sobol' [26] introduced two types of global sensitivity
indices (GSI) for an index set $u$: $\underline{S}^u = \frac{1}{\sigma^2}\sum_{v \subseteq u} \sigma_v^2$ and $\overline{S}^u = \frac{1}{\sigma^2}\sum_{v \cap u \ne \emptyset} \sigma_v^2$. The
sensitivity index $\underline{S}^u$ sums all the normalized partial variances whose index sets are subsets
of $u$, and $\overline{S}^u$ sums all those whose index sets have non-empty intersections with $u$.
Clearly, $\underline{S}^u \le \overline{S}^u$, and hence they can be used as lower and upper bounds for
the sensitivity measures on the parameters $x^u$. The GSI with respect to singletons,
$\underline{S}^{\{i\}}$, for instance, represents the impact on the output of the parameter $x_i$ alone, and
$\overline{S}^{\{i\}}$ considers the individual impact as well as the cooperative impact of $x_i$ and the
other parameters. In this sense, $\underline{S}^{\{i\}}$ and $\overline{S}^{\{i\}}$ are called main effects and total effects,
respectively. In the general case, $\underline{S}^u$ and $\overline{S}^u$ are also called lower and upper Sobol'
indices. The main effects $\underline{S}^{\{i\}}$ can be used to prioritize the model parameters in terms
of their importance, while the total effects $\overline{S}^{\{i\}}$ can be used as a tool to reduce model
complexity. If $\overline{S}^{\{i\}}$ is relatively small, then the corresponding parameter can be frozen
at its nominal value.
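The following sketch shows how the lower and upper indices of a group of inputs can be estimated by Monte Carlo with the standard Saltelli/Jansen estimators; it is our own illustration (the paper does not specify its estimator), and the toy test function, sample size, and variable names are assumptions.

```python
# Sketch: Monte Carlo estimates of the lower index S_u and upper (total) index
# for a group u, using two independent sample matrices A, B and the hybrid AB
# whose columns in u are taken from B.
import numpy as np

def sobol_group_indices(model, d, u, N=20000, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.random((N, d))
    B = rng.random((N, d))
    AB = A.copy()
    AB[:, u] = B[:, u]                              # swap the columns of group u
    fA, fB, fAB = model(A), model(B), model(AB)
    var = np.var(np.concatenate([fA, fB]), ddof=1)
    lower = np.mean(fB * (fAB - fA)) / var          # main effect of the group u
    upper = 0.5 * np.mean((fA - fAB) ** 2) / var    # total effect of the group u
    return lower, upper

# Toy usage: additive function, group u = {0, 1}; both indices should be near 5/5.25.
g = lambda X: X[:, 0] + 2 * X[:, 1] + 0.5 * X[:, 2]
print(sobol_group_indices(g, d=3, u=[0, 1]))
```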

4 Numerical Results
In our global sensitivity analysis, the model output is the estimate of the HDD call
option price that is calculated by averaging the payoff in Definition 2. The model
inputs are the temperature model parameters, which are estimated from the historical
temperatures. In our numerical results, the pricing of the weather derivatives is done
under the physical probability measure. We estimate the price of an HDD call option
on December 31, 2012² with strike price 800 HDDs. The contract period is January
1–31, 2012. We will refer to the four weather derivatives models considered in Sect. 2
by simply using the name of the first author. The parameters of the weather derivatives models can be classified into six groups: trend, seasonality, volatility, mean
reversion, Hurst parameters, and ARCH parameters. Trend, seasonality and volatility are common to Alaton's, Benth's and Brody's models. Brody's model assumes a
fractional Brownian motion and thus involves the additional Hurst parameter. Campbell's model considers an AR(P) process for the temperatures and an ARCH(R) for
the volatility process. Least squares regression is used to obtain the mean of each
estimate and its standard error. The detailed grouping is listed in Table 6. We apply
global sensitivity analysis to these groups of parameters. Table 7 shows the Sobol'
indices $\overline{S}$ with respect to groups of parameters for all models. The Sobol' indices
are computed using a sample size of 20,000, and the price of the derivative is computed using a randomly permuted random-start Halton sequence ([19]) of sample
size 10,000.
² Our historical data starts from 1/1/1997, which corresponds to t = 1. The date we price the option, December 31, 2012, corresponds to t = 5475.


Table 6 Parameter grouping for daily average temperature models

                    Alaton               Benth                      Brody                Campbell
  Trend             A, B                 A, B                       A, B                 β₁, β₂
  Seasonality       C, D                 C, D                       C, D                 δ₁, λ₁
  Volatility        σᵢ, i = 1, ..., 12   c₀, cᵢ, dᵢ, i = 1, ..., 4   σᵢ, i = 1, ..., 12   α₀, α₁, ..., α₉
  Mean reversion    a                    a                          a                    N/A
  Hurst parameter   N/A                  N/A                        H                    N/A
  ARCH parameters   N/A                  N/A                        N/A                  ρ₁, ..., ρ₁₀

Table 7 Upper Sobol' indices for groups of parameters

                    Alaton    Benth     Brody     Campbell
  Trend             0.8240    0.8794    0.6317    0.2073
  Seasonality       0.1053    0.1148    0.0823    0.0278
  Volatility        0.0736    0.0019    0.2666    0.00001
  Mean reversion    0.0040    0.0027    0.0118    N/A
  Hurst parameter   N/A       N/A       0.0134    N/A
  ARCH parameters   N/A       N/A       N/A       0.8313

The sample sizes used for sensitivity analysis and for calculating the prices are 20,000 and 10,000, respectively. M = 31, t₀ = 5475, and regression standard errors are chosen as standard deviations.

For all models, the sum of the upper Sobol' indices is approximately 1, indicating
that the secondary interactions between groups of parameters are small. From Table 7,
we see that the largest sensitivity in the models by Alaton, Benth, and Brody is
due to the trend parameters. The sensitivities of the mean reversion parameters are
negligible. For Campbell's model, the ARCH parameters are the most sensitive,
while the seasonality and volatility parameters are the most insensitive.
We first compare Alaton's, Benth's and Brody's models due to their similarities.
Note that the trend and seasonality parameters are the same for the three models and
the characterization of volatility by Benth is different from Alaton. Despite the fact
that Brody's model considers volatility in the same way as Alaton's model, the use of
fractional Brownian motion changes the behavior of the underlying stochastic process
and thus changes the volatility part as well. We keep the uncertainties of all groups
of parameters, excluding volatility, fixed at their regression standard errors. We vary
the uncertainty of the volatility group by increasing the coefficient of variation (CoV,
defined as the ratio of the standard deviation to the mean) for each parameter in the
volatility group from 1 to 35 %. For example, when the CoV is 1 % for the first-month
volatility parameter $\sigma_1$ in Alaton's model, then $\sigma_1$ is modeled as a normal
distribution with mean 6.36 and standard deviation $0.01 \times 6.36$. (The estimated
mean for $\sigma_1$ is 6.36, as shown in Table 2.)
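A small sketch of this perturbation, under our own naming, assuming the January volatility estimate from Table 2:

```python
# Sketch: drawing a parameter at a chosen coefficient of variation (CoV),
# here the January volatility 6.36 with CoV = 1 %, i.e. N(6.36, (0.01 * 6.36)^2).
import numpy as np

rng = np.random.default_rng(0)
mean_sigma_jan, cov = 6.36, 0.01
samples = rng.normal(loc=mean_sigma_jan, scale=cov * mean_sigma_jan, size=20000)
print(samples.mean(), samples.std())
```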
Figure 4a shows that as the CoV of volatility increases, Sobol's non-normalized
upper index $\sigma^2 \overline{S}^V$, which represents the sum of all the partial variances of groups
Fig. 4 Model robustness using Sobol' indices. a Sobol's non-normalized upper index $\sigma^2\overline{S}^V$ for the volatility parameters against the coefficient of variation of volatility; b Sobol's non-normalized lower index $\sigma^2\underline{S}^{\tilde V}$ for the complement of the volatility parameters against the coefficient of variation of volatility (Alaton, Benth, and Brody models)

of parameters that include a volatility parameter, increases monotonically for all
three models. However, for each CoV of volatility, Benth's model has the smallest
sensitivity while Brody's model has the largest. In addition, the sensitivity of Benth's
model increases at a much smaller rate than that of Brody's model. On the other
hand, Fig. 4b shows that the values of Sobol's non-normalized lower index $\sigma^2\underline{S}^{\tilde V}$
are relatively constant for all models (here, the notation $\tilde V$ stands for the complement
of the set $V$). Since $\sigma^2\overline{S}^V + \sigma^2\underline{S}^{\tilde V} = \sigma^2$, this result suggests that the faster rate
of increase in the total variance of Brody's model is explained by the faster rate of
increase in the sensitivity of the volatility parameters.
These observations suggest the following qualitative approach to compare two
models in terms of their robustness. Consider models A and B with the same
output. Let $x$ be an input parameter (or a group of parameters) for the models. This
input parameter is estimated from data, and has uncertainty due to the estimation
process. Assume the uncertainty in $x$ leads to its modeling by a normal distribution,
with mean equal to the estimated value, and a standard deviation characterizing the
estimation error. If the growth of the (non-normalized) upper Sobol' index for $x$ in
model B, as a function of the estimation error of the input, is at a higher rate than
that of model A, but the rates of increase of the (non-normalized) lower Sobol'
indices for the complementary parameters are similar for both models, then model A
will be deemed more robust than model B with respect to $x$. For example, assume
that the total variances of the two models are equal, i.e., $\sigma^2\overline{S}^x + \sigma^2\underline{S}^{\tilde x} = \sigma^2$ is
the same for each model, but the rate of growth in model B for the term $\sigma^2\overline{S}^x$
is higher than that of model A. Then model A would be preferable since it is less
sensitive to estimation error in the input parameter $x$. With this understanding, and
the observations made in the previous paragraph, we conclude that Benth's model is
more robust than Alaton's and Brody's models.


Fig. 5 a Sobol's upper index $\sigma^2\overline{S}^T$ for the trend parameters against the coefficient of variation of trend; b Sobol's lower index $\sigma^2\underline{S}^{\tilde T}$ for the complement of the trend parameters against the coefficient of variation of trend (Benth and Campbell models)

Fig. 6 a Sobol's upper index for the seasonality parameters against the coefficient of variation of seasonality; b Sobol's lower index for the complement of the seasonality parameters against the coefficient of variation of seasonality (Benth and Campbell models)

Next we compare Benth's model with Campbell's time series model. Figure 5a
shows that as the CoV of the trend parameters increases, the non-normalized upper
Sobol' index $\sigma^2\overline{S}^T$ increases monotonically in a similar pattern for both models.
However, when we examine the lower Sobol' index $\sigma^2\underline{S}^{\tilde T}$ plot in Fig. 5b, we
observe that Campbell's model has significantly larger sensitivity for components
other than the trend. This also means that the total variance of the model output for
Campbell's model is much larger. Figure 6 plots the sensitivity for the seasonality
parameters. The upper Sobol' index increases at a similar rate for both Benth's
and Campbell's models. However, the lower Sobol' index for Campbell's model

Fig. 7 a Sobol's upper index for the volatility parameters against the coefficient of variation of volatility; b Sobol's lower index for the complement of the volatility parameters against the coefficient of variation of volatility (Benth and Campbell models)
Fig. 8 Total variance $\sigma^2$ against CoV of trend, seasonality and volatility parameters in Campbell's model

is very large relative to Benth's model. In Fig. 7, we conduct a similar analysis
for the volatility parameters, and observe a similar behavior. Finally, we plot the
total variance $\sigma^2$ of the output for Campbell's model as a function of the CoV in the
trend, seasonality, and volatility coefficients in Fig. 8. We observe that the model is
most sensitive to increasing uncertainty in the trend parameters. This observation
makes sense if we note that any initial uncertainty in the trend coefficients applies
throughout time, affecting the whole trajectory of temperatures during the contract
period. We also observe that the total variance does not change much with respect to
increasing CoV in volatility.
A summary of the many observations we have discussed, in a more general context,
will be useful. When one sets out to compare the accuracy of different models for the
same problem, a reasonable first step is to compare their total variances, which we

Table 8 Mean and total variance for all models

              Alaton    Benth     Brody     Campbell
  Mean        106.69    104.86    118.95    140.70
  Variance    108.04    104.16    114.61    20337.33

The sample sizes used for sensitivity analysis and for calculating the prices are 20,000 and 10,000, respectively. M = 31, t₀ = 5475, and regression standard errors are chosen as standard deviations.

did in Table 8 for the four weather derivative models considered in the paper. From
this table, one can deduce that the models by Alaton, Benth and Brody perform equally
well, and the model by Campbell is unsatisfactory. However, the information in this
table does not reveal how the variances will change as the models are recalibrated
with different input, resulting in different standard errors for the input parameters.
In other words, the total variance information does not explain how robust a model
is with respect to its input parameter(s). Our qualitative analysis computes Sobol'
sensitivity indices for each model, with inputs (or input groups) that match across
models, and compares the growth of the sensitivity indices as the estimation error in
the input parameters (CoV) increases. Based on our empirical results, we conclude
that Benth's model is the most robust; it is the model with the smallest rate of increase in
the sensitivity indices as a function of input parameter error. In future work, we will
investigate developing a quantitative approach to define the robustness of a model.

References
1. Alaton, P., Djehiche, B., Stillberger, D.: On modelling and pricing weather derivatives. Appl. Math. Financ. 9, 1–20 (2002)
2. Alexanderian, A., Winokur, J., Sraj, I., Srinivasan, A., Iskandarani, M., Thacker, W.C., Knio, O.M.: Global sensitivity analysis in an ocean general circulation model: a sparse spectral projection approach. Comput. Geosci. 16, 757–778 (2012)
3. Benth, F.E., Benth, J.S.: The volatility of temperature and pricing of weather derivatives. Quant. Financ. 7, 553–561 (2007)
4. Brody, D.C., Syroka, J., Zervos, M.: Dynamical pricing of weather derivatives. Quant. Financ. 3, 189–198 (2002)
5. Campbell, S., Diebold, F.X.: Weather forecasting for weather derivatives. J. Am. Stat. Assoc. 100, 6–16 (2005)
6. Cao, M., Wei, J.: Weather derivatives valuation and market price of weather risk. J. Futur. Mark. 24, 1065–1089 (2004)
7. Engle, R.F.: Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50, 987–1008 (1982)
8. Göncü, A.: Pricing temperature-based weather derivatives in China. J. Risk Financ. 13, 32–44 (2011)
9. Göncü, A.: Comparison of temperature models using heating and cooling degree days futures. J. Risk Financ. 14, 159–178 (2013)
10. Hanley, M.: Hedging the force of nature. Risk Prof. 1, 21–25 (1999)
11. Härdle, W.K., Cabrera, B.L.: The implied market price of weather risk. Appl. Math. Financ. 19, 59–95 (2012)
12. Huang, H.-H., Shiu, Y.-M., Lin, P.-S.: HDD and CDD option pricing with market price of weather risk for Taiwan. J. Futur. Mark. 28, 790–814 (2008)
13. Jewson, S.: Weather Derivative Valuation: The Meteorological, Statistical, Financial and Mathematical Foundations. Cambridge University Press, Cambridge (2005)
14. Kucherenko, S., Rodriguez-Fernandez, M., Pantelides, C., Shah, N.: Monte Carlo evaluation of derivative-based global sensitivity measures. Reliab. Eng. Syst. Saf. 94, 1135–1148 (2009)
15. Kucherenko, S., Feil, B., Shah, N., Mauntz, W.: The identification of model effective dimensions using global sensitivity analysis. Reliab. Eng. Syst. Saf. 96, 440–449 (2011)
16. Liu, R., Owen, A.: Estimating mean dimensionality of analysis of variance decompositions. J. Am. Stat. Assoc. 101, 712–721 (2006)
17. Liu, Y., Jimenez, E., Hussaini, M.Y., Ökten, G., Goodrick, S.: Parametric uncertainty quantification in the Rothermel model with randomized quasi-Monte Carlo methods. Int. J. Wildland Fire 24, 307–316 (2015)
18. Oetomo, T., Stevenson, M.: Hot or cold? A comparison of different approaches to the pricing of weather derivatives. J. Emerg. Mark. Financ. 4, 101–133 (2005)
19. Ökten, G., Shah, M., Goncharov, Y.: Random and deterministic digit permutations of the Halton sequence. In: Plaskota, L., Woźniakowski, H. (eds.) 9th International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Warsaw, Poland, August 15–20, pp. 589–602. Springer, Berlin (2012)
20. Platen, E., West, J.: A fair pricing approach to weather derivatives. Asia-Pac. Financ. Mark. 11, 23–53 (2005)
21. Rohmer, J., Foerster, E.: Global sensitivity analysis of large-scale numerical landslide models based on Gaussian-process meta-modeling. Comput. Geosci. 37, 917–927 (2011)
22. Saltelli, A., Tarantola, S., Chan, K.P.-S.: A quantitative model-independent method for global sensitivity analysis of model output. Technometrics 41, 39–56 (1999)
23. Saltelli, A.: Making best use of model evaluations to compute sensitivity indices. Comput. Phys. Commun. 145, 280–297 (2002). doi:10.1016/S0010-4655(02)00280-1
24. Saltelli, A.: Global Sensitivity Analysis: The Primer. Wiley, New Jersey (2008)
25. Schiller, F., Seidler, G., Wimmer, M.: Temperature models for pricing weather derivatives. Quant. Financ. 12, 489–500 (2012)
26. Sobol', I.M.: Sensitivity estimates for non-linear mathematical models. Math. Model. Comput. Exp. 1, 407–414 (1993)
27. Sobol', I.M.: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. 55, 271–280 (2001). doi:10.1016/S0378-4754(00)00270-6
28. Sobol', I.M., Kucherenko, S.: Derivative based global sensitivity measures and their link with global sensitivity indices. Math. Comput. Simul. 79, 3009–3017 (2009)

Reliable Adaptive Cubature Using Digital Sequences

Fred J. Hickernell and Lluís Antoni Jiménez Rugama

In honor of Ilya M. Sobol'

Abstract Quasi-Monte Carlo cubature methods often sample the integrand using
Sobol' (or other digital) sequences to obtain higher accuracy than IID sampling. An
important question is how to conservatively estimate the error of a digital sequence
cubature so that the sampling can be terminated when the desired tolerance is reached.
We propose an error bound based on the discrete Walsh coefficients of the integrand
and use this error bound to construct an adaptive digital sequence cubature algorithm.
The error bound and the corresponding algorithm are guaranteed to work for integrands lying in a cone defined in terms of their true Walsh coefficients. Intuitively,
the inequalities defining the cone imply that the ordered Walsh coefficients do not
dip down for a long stretch and then jump back up. An upper bound on the cost of our
new algorithm is given in terms of the unknown decay rate of the Walsh coefficients.

Keywords Quasi-Monte Carlo methods · Multidimensional integration · Digital
sequences · Sobol' sequences · Adaptive algorithms · Automatic algorithms

1 Introduction
Quasi-Monte Carlo cubature rules approximate multidimensional integrals over the
unit cube by an equally weighted sample average of the integrand values at the first n

nodes from some sequence $\{z_i\}_{i=0}^{\infty}$. This node sequence should be chosen to minimize
the error, and for this one can appeal to Koksma–Hlawka type error bounds of the
form
$$\left| \int_{[0,1)^d} f(x)\, dx - \frac{1}{n}\sum_{i=0}^{n-1} f(z_i) \right| \le D(\{z_i\}_{i=0}^{n-1})\, V(f). \qquad (1)$$
The discrepancy, $D(\{z_i\}_{i=0}^{n-1})$, measures how far the empirical distribution of the first
$n$ nodes differs from the uniform distribution. The variation, $V(f)$, is some semi-norm of the integrand, $f$. The definitions of the discrepancy and variation are linked
to each other. Examples of such error bounds are given by [3, Chaps. 2–3], [4], [11,
Sect. 5.6], [12, Chaps. 2–3], and [14, Chap. 9].
A practical problem is how large to choose $n$ so that the absolute error is smaller
than some user-defined tolerance, $\varepsilon$. Error bounds of the form (1) do not help in this
regard because it is too hard to compute $V(f)$, which is typically defined in terms
of integrals of mixed partial derivatives of $f$.
This article addresses the challenge of reliable error estimation for quasi-Monte
Carlo cubature based on digital sequences, of which Sobol' sequences are the most
popular example. The vector space structure underlying these digital sequences facilitates a convenient expression for the error in terms of the (Fourier–)Walsh coefficients
of the integrand. Discrete Walsh coefficients can be computed efficiently, and their
decay provides a reliable cubature error estimate. Underpinning this analysis is the
assumption that the integrands lie in a cone defined in terms of their true Walsh
coefficients; see (13).
The next section introduces digital sequences and their underlying algebraic structure. Section 3 explains how the cubature error using digital sequences as nodes can
be elegantly formulated in terms of the Walsh series representation of the integrand.
Our contributions begin in Sect. 4, where we derive a reliable data-based cubature
error bound for a cone of integrands, (16), and an adaptive cubature algorithm based
on that error bound, Algorithm 2. The cost of the algorithm is also represented in
terms of the unknown decay of the Walsh series coefficients and the error tolerance
in Theorem 1. A numerical example and discussion then conclude this article. A
parallel development for cubature based on lattice rules is given in [9].

2 Digital Sequences
The integrands considered here are defined over the half open d-dimensional unit
cube, [0, 1)d . For integration problems on other domains one may often transform the
integration variable so that the problem is defined on [0, 1)d . See [1, 58] for some
discussion of variable transformations and the related error analysis. The example in
Sect. 5 also employs a variable transformation.


Digital sequences are defined in terms of digitwise addition. Let $b$ be a prime
number; $b = 2$ is the choice made for Sobol' sequences. Digitwise addition, $\oplus$, and
negation, $\ominus$, are defined in terms of the proper $b$-ary expansions of points in $[0, 1)^d$:
$$x = \Bigl(\sum_{\ell=1}^{\infty} x_{j\ell} b^{-\ell}\Bigr)_{j=1}^{d}, \quad t = \Bigl(\sum_{\ell=1}^{\infty} t_{j\ell} b^{-\ell}\Bigr)_{j=1}^{d}, \quad x_{j\ell}, t_{j\ell} \in \mathbb{F}_b := \{0, \ldots, b-1\},$$
$$x \oplus t := \Bigl(\sum_{\ell=1}^{\infty} [(x_{j\ell} + t_{j\ell}) \bmod b]\, b^{-\ell}\Bigr)_{j=1}^{d} \pmod 1, \quad \ominus x := \Bigl(\sum_{\ell=1}^{\infty} [-x_{j\ell} \bmod b]\, b^{-\ell}\Bigr)_{j=1}^{d},$$
$$x \ominus t := x \oplus (\ominus t), \qquad ax := \underbrace{x \oplus \cdots \oplus x}_{a \text{ times}} \quad \forall a \in \mathbb{F}_b.$$
We do not have associativity for all of $[0, 1)^d$. For example, for $b = 2$,
$$1/6 = {}_2 0.001010\ldots, \quad 1/3 = {}_2 0.010101\ldots, \quad 1/2 = {}_2 0.1000\ldots,$$
$$1/3 \oplus 1/3 = {}_2 0.00000\ldots = 0, \quad 1/3 \oplus 1/6 = {}_2 0.011111\ldots = 1/2,$$
$$(1/3 \oplus 1/3) \oplus 1/6 = 0 \oplus 1/6 = 1/6, \quad 1/3 \oplus (1/3 \oplus 1/6) = 1/3 \oplus 1/2 = 5/6.$$
This lack of associativity comes from the possibility of digitwise addition resulting
in an infinite trail of digits $b - 1$, e.g., $1/3 \oplus 1/6$ above.
Define the Boolean operator that checks whether digitwise addition of two points
does not result in an infinite trail of digits $b - 1$:
$$\mathrm{ok}(x, t) = \begin{cases} \text{true}, & \min_{j=1,\ldots,d} \sup\{\ell : [(x_{j\ell} + t_{j\ell}) \bmod b] \ne b - 1\} = \infty, \\ \text{false}, & \text{otherwise.} \end{cases} \qquad (2)$$
If $P \subseteq [0, 1)^d$ is some set that is closed under $\oplus$ and $\mathrm{ok}(x, t) = \text{true}$ for all $x, t \in P$,
then associativity holds for all points in $P$. Moreover, $P$ is an Abelian group and
also a vector space over the field $\mathbb{F}_b$.
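For $b = 2$ and points whose binary expansions terminate, digitwise addition reduces to a bitwise XOR of the digit strings. The short sketch below illustrates this; it is our own illustration, the fixed digit count `m` is an assumption, and points with infinite expansions (such as 1/3 in the example above) fall outside this finite model.

```python
# Sketch: digitwise addition in base b = 2 for coordinates carrying m binary digits,
# stored as the integer 2^m * x, so that x (+) t becomes a bitwise XOR.
m = 10                        # number of binary digits carried

def to_int(x):                # x in [0, 1) with at most m binary digits
    return int(round(x * 2**m))

def from_int(i):
    return i / 2**m

def digitwise_add(x, t):      # componentwise XOR of the digit strings
    return tuple(from_int(to_int(a) ^ to_int(b)) for a, b in zip(x, t))

print(digitwise_add((0.5, 0.25), (0.75, 0.25)))   # -> (0.25, 0.0)
```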

Suppose that $P_{\infty} = \{z_i\}_{i=0}^{\infty} \subseteq [0, 1)^d$ is such a vector space that satisfies the
following additional conditions:
$$\{z_1, z_b, z_{b^2}, \ldots\} \text{ is a set of linearly independent points}, \qquad (3a)$$
$$z_i = \bigoplus_{\ell=0}^{\infty} i_\ell z_{b^\ell}, \quad \text{where } i = \sum_{\ell=0}^{\infty} i_\ell b^\ell \in \mathbb{N}_0, \ i_\ell \in \mathbb{F}_b. \qquad (3b)$$
Such a $P_{\infty}$ is called a digital sequence. Moreover, any $P_m := \{z_i\}_{i=0}^{b^m - 1}$ is a subspace
of $P_{\infty}$ and is called a digital net. From this definition it is clear that
Fig. 1 a 256 Sobol' points, b 256 scrambled and digitally shifted Sobol' points

$$P_0 = \{0\} \subseteq P_1 = \{0, z_1, \ldots, (b-1)z_1\} \subseteq P_2 \subseteq \cdots \subseteq P_{\infty} = \{z_i\}_{i=0}^{\infty}.$$
This digital sequence definition is equivalent to the traditional one in terms of
generating matrices. By (3) and according to the $b$-ary expansion notation introduced
earlier, the $(m, \ell)$ element of the generating matrix, $C_j$, for the $j$th coordinate is the $\ell$th
$b$-ary digit of the $j$th component of $z_{b^{m-1}}$, i.e.,
$$C_j = \begin{pmatrix} (z_1)_{j1} & (z_b)_{j1} & (z_{b^2})_{j1} & \cdots \\ (z_1)_{j2} & (z_b)_{j2} & (z_{b^2})_{j2} & \cdots \\ (z_1)_{j3} & (z_b)_{j3} & (z_{b^2})_{j3} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \quad \text{for } j = 1, \ldots, d.$$
The Sobol' sequence works in base $b = 2$ and makes a careful choice of the basis
$\{z_1, z_2, z_4, \ldots\}$ so that the points are evenly distributed. Figure 1a displays the initial
points of the two-dimensional Sobol' sequence. In Fig. 1b the Sobol' sequence has
been linearly scrambled to obtain another digital sequence and then digitally shifted.

3 Walsh Series
Non-negative integer vectors are used to index the Walsh series for the integrands.
The set $\mathbb{N}_0^d$ is a vector space under digitwise addition, $\oplus$, and the field $\mathbb{F}_b$. Digitwise
addition and negation are defined as follows for all $k, l \in \mathbb{N}_0^d$:
$$k = \Bigl(\sum_{\ell=0}^{\infty} k_{j\ell} b^{\ell}\Bigr)_{j=1}^{d}, \quad l = \Bigl(\sum_{\ell=0}^{\infty} l_{j\ell} b^{\ell}\Bigr)_{j=1}^{d}, \quad k_{j\ell}, l_{j\ell} \in \mathbb{F}_b,$$
$$k \oplus l = \Bigl(\sum_{\ell=0}^{\infty} [(k_{j\ell} + l_{j\ell}) \bmod b]\, b^{\ell}\Bigr)_{j=1}^{d}, \quad \ominus k = \Bigl(\sum_{\ell=0}^{\infty} [(b - k_{j\ell}) \bmod b]\, b^{\ell}\Bigr)_{j=1}^{d},$$
$$ak := \underbrace{k \oplus \cdots \oplus k}_{a \text{ times}} \quad \forall a \in \mathbb{F}_b.$$

For each wavenumber k Nd0 a function


k, : [0, 1)d Fb is defined as

k, x :=

d 


k j x j,+1

(mod b).

(4a)

j=1 =0

For all points t, x [0, 1)d , wavenumbers k, l Nd0 , and a Fb , it follows that

k, 0 =
0, x = 0,

k, ax t = a
k, x +
k, t (mod b) if ok(ax, t)

ak l, x = a
k, x +
l, x (mod b),

k, x = 0 k

Nd0

= x = 0.

(4b)
(4c)
(4d)
(4e)

The digital sequences $P_{\infty} = \{z_i\}_{i=0}^{\infty}$ considered here are assumed to contain sufficiently many points so that
$$\langle k, z_i\rangle = 0 \ \forall i \in \mathbb{N}_0 \implies k = 0. \qquad (5)$$
Defining $\mathbb{N}_{0,m} := \{0, \ldots, b^m - 1\}$, the dual net corresponding to the net $P_m$ is
the set of all wavenumbers for which $\langle k, \cdot\rangle$ maps the whole net to 0:
$$P_m^{\perp} := \{k \in \mathbb{N}_0^d : \langle k, z_i\rangle = 0, \ \forall i \in \mathbb{N}_{0,m}\} = \{k \in \mathbb{N}_0^d : \langle k, z_{b^\ell}\rangle = 0, \ \ell = 0, \ldots, m - 1\}.$$
The properties of the bilinear transform defined in (4) imply that the dual nets $P_m^{\perp}$
are subspaces of each other:
$$P_0^{\perp} = \mathbb{N}_0^d \supseteq P_1^{\perp} \supseteq \cdots \supseteq P_{\infty}^{\perp} = \{0\}.$$
The integrands are assumed to belong to some subset of $L^2([0, 1)^d)$, the space of
square integrable functions. The $L^2$ inner product is defined as
$$\langle f, g\rangle_2 = \int_{[0,1)^d} f(x)\, \overline{g(x)}\, dx.$$
The Walsh functions $\{\exp(2\pi\sqrt{-1}\,\langle k, \cdot\rangle/b) : k \in \mathbb{N}_0^d\}$ [3, Appendix A] are a complete orthonormal basis for $L^2([0, 1)^d)$. Thus, any function in $L^2$ may be written in
series form as
$$f(x) = \sum_{k \in \mathbb{N}_0^d} \hat{f}(k)\, e^{2\pi\sqrt{-1}\langle k, x\rangle/b}, \quad \text{where } \hat{f}(k) := \bigl\langle f, e^{2\pi\sqrt{-1}\langle k, \cdot\rangle/b}\bigr\rangle_2, \qquad (6)$$
and the $L^2$ inner product of two functions is the $\ell^2$ inner product of their Walsh series
coefficients:
$$\langle f, g\rangle_2 = \sum_{k \in \mathbb{N}_0^d} \hat{f}(k)\, \overline{\hat{g}(k)} =: \bigl\langle (\hat{f}(k))_{k \in \mathbb{N}_0^d}, (\hat{g}(k))_{k \in \mathbb{N}_0^d}\bigr\rangle_2.$$

Since the digital net Pm is a group under , one may derive a useful formula for
the average of a Walsh function sampled over a net. For all wavenumbers k Nd0
and all x Pm one has
b 1

1  2 1
k,zi /b
[e
e2 1
k,zi x /b ]
0= m
b i=0
m

b 1

1  2 1
k,zi /b
= m
[e
e2 1{
k,zi +
k,x }/b ] by (4c)
b i=0
m

= [1 e

2 1
k,x /b

b 1
1  2 1
k,zi /b
] m
e
.
b i=0
m

By this equality it follows that the average of the sampled Walsh function values is
either one or zero, depending on whether the wavenumber is in the dual net or not:

bm 1
1  2 1
k,zi /b
1, k Pm
e
= 1Pm (k) =
m
b i=0
0, k Nd0 \ Pm .

(7)

Multivariate integrals may be approximated by the average of the integrand sampled over a digitally shifted digital net, namely,
$$I_m(f) := \frac{1}{b^m}\sum_{i=0}^{b^m - 1} f(z_i \oplus \Delta). \qquad (8)$$
Under the assumption that $\mathrm{ok}(z_i, \Delta) = \text{true}$ (see (2)) for all $i \in \mathbb{N}_0$, it follows that
the error of this cubature rule is the sum of the Walsh coefficients of the integrand
over those wavenumbers in the dual net:

$$\left|\int_{[0,1)^d} f(x)\, dx - I_m(f)\right| = \left|\hat{f}(0) - \sum_{k \in \mathbb{N}_0^d} \hat{f}(k)\, I_m\bigl(e^{2\pi\sqrt{-1}\langle k, \cdot\rangle/b}\bigr)\right|$$
$$= \left|\hat{f}(0) - \sum_{k \in \mathbb{N}_0^d} \hat{f}(k)\, 1_{P_m^{\perp}}(k)\, e^{2\pi\sqrt{-1}\langle k, \Delta\rangle/b}\right|
= \left|\sum_{k \in P_m^{\perp}\setminus\{0\}} \hat{f}(k)\, e^{2\pi\sqrt{-1}\langle k, \Delta\rangle/b}\right|. \qquad (9)$$

Adaptive Algorithm 2 that we construct in Sect. 4 works with this expression for the
cubature error in terms of Walsh coefficients.
Although the true Walsh series coefficients are generally not known, they can be
estimated by the discrete Walsh transform, defined as follows:
$$\tilde{f}_m(k) := I_m\bigl(e^{-2\pi\sqrt{-1}\langle k, \cdot\rangle/b} f(\cdot)\bigr) = \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\langle k, z_i \oplus \Delta\rangle/b} f(z_i \oplus \Delta)$$
$$= \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\langle k, z_i \oplus \Delta\rangle/b} \sum_{l \in \mathbb{N}_0^d} \hat{f}(l)\, e^{2\pi\sqrt{-1}\langle l, z_i \oplus \Delta\rangle/b}$$
$$= \sum_{l \in \mathbb{N}_0^d} \hat{f}(l)\, e^{2\pi\sqrt{-1}\langle l \ominus k, \Delta\rangle/b}\, \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{2\pi\sqrt{-1}\langle l \ominus k, z_i\rangle/b}
= \sum_{l \in \mathbb{N}_0^d} \hat{f}(l)\, e^{2\pi\sqrt{-1}\langle l \ominus k, \Delta\rangle/b}\, 1_{P_m^{\perp}}(l \ominus k)$$
$$= \hat{f}(k) + \sum_{l \in P_m^{\perp}\setminus\{0\}} \hat{f}(k \oplus l)\, e^{2\pi\sqrt{-1}\langle l, \Delta\rangle/b}, \qquad k \in \mathbb{N}_0^d. \qquad (10)$$
The discrete transform, $\tilde{f}_m(k)$, is equal to the true Walsh transform, $\hat{f}(k)$, plus aliasing
terms proportional to $\hat{f}(k \oplus l)$ where $l$ is a nonzero wavenumber in the dual net.


4 Error Estimation and an Adaptive Cubature Algorithm

4.1 Wavenumber Map

Since the discrete Walsh transform has aliasing errors, some assumptions must be
made about how quickly the true Walsh coefficients decay and which coefficients
are more important. This is done by way of a map of the non-negative integers onto
the space of all wavenumbers, $\tilde{k} : \mathbb{N}_0 \to \mathbb{N}_0^d$, according to the following algorithm.

Algorithm 1 Given a digital sequence, $P_{\infty} = \{z_i\}_{i=0}^{\infty}$, define $\tilde{k} : \mathbb{N}_0 \to \mathbb{N}_0^d$ as follows:
Step 1. Define $\tilde{k}(0) = 0$.
Step 2. For $m = 0, 1, \ldots$
  For $\nu = 0, \ldots, b^m - 1$
    Choose the values of $\tilde{k}(\nu + b^m), \ldots, \tilde{k}(\nu + (b-1)b^m)$ from the sets
    $$\bigl\{k \in \mathbb{N}_0^d : k \ominus \tilde{k}(\nu) \in P_m^{\perp}, \ \langle k \ominus \tilde{k}(\nu), z_{b^m}\rangle = a\bigr\}, \quad a = 1, \ldots, b - 1,$$
    but not necessarily in that order.

There is some flexibility in the choice of this map. One might choose $\tilde{k}$ to map
smaller values of $\nu$ to smaller values of $k$ based on some standard measure of size
such as that given in [3, (5.9)]. The motivation is that larger $\nu$ should generally lead
to smaller $\hat{f}(\tilde{k}(\nu))$. We use Algorithm 3 below to construct this map implicitly.
To illustrate the initial steps of Algorithm 1, consider the Sobol' points in dimension 2. In this case, $z_1 = (1/2, 1/2)$, $z_2 = (1/4, 3/4)$ and $z_4 = (1/8, 5/8)$. For
$m = \nu = 0$, one needs
$$\tilde{k}(1) \in \bigl\{k \in \mathbb{N}_0^d : k \ominus \tilde{k}(0) \in P_0^{\perp}, \ \langle k \ominus \tilde{k}(0), z_1\rangle = 1\bigr\} = \bigl\{k \in \mathbb{N}_0^d : \langle k, z_1\rangle = 1\bigr\}.$$
Thus, one may choose $\tilde{k}(1) = (1, 0)$. Next, $m = 1$ and $\nu = 0$ leads to
$$\tilde{k}(2) \in \bigl\{k \in \mathbb{N}_0^d : k \ominus \tilde{k}(0) \in P_1^{\perp}, \ \langle k \ominus \tilde{k}(0), z_2\rangle = 1\bigr\} = \bigl\{k \in \mathbb{N}_0^d : k \in P_1^{\perp}, \ \langle k, z_2\rangle = 1\bigr\}.$$
Hence, we can take $\tilde{k}(2) := (1, 1)$. Continuing with $m = \nu = 1$ requires
$$\tilde{k}(3) \in \bigl\{k \in \mathbb{N}_0^d : k \ominus \tilde{k}(1) \in P_1^{\perp}, \ \langle k \ominus \tilde{k}(1), z_2\rangle = 1\bigr\},$$
so the next choice can be $\tilde{k}(3) := (0, 1)$.

Introducing the shorthand notation $\hat{f}_\nu := \hat{f}(\tilde{k}(\nu))$ and $\tilde{f}_{m,\nu} := \tilde{f}_m(\tilde{k}(\nu))$, the
aliasing relation (10) may be written as
$$\tilde{f}_{m,\nu} = \hat{f}_\nu + \sum_{\lambda=1}^{\infty} \hat{f}_{\nu + \lambda b^m}\, e^{2\pi\sqrt{-1}\langle \tilde{k}(\nu + \lambda b^m) \ominus \tilde{k}(\nu), \Delta\rangle/b}, \qquad (11)$$
and the cubature error in (9) may be bounded as
$$\left|\int_{[0,1)^d} f(x)\, dx - I_m(f)\right| = \left|\sum_{\lambda=1}^{\infty} \hat{f}_{\lambda b^m}\, e^{2\pi\sqrt{-1}\langle \tilde{k}(\lambda b^m), \Delta\rangle/b}\right| \le \sum_{\lambda=1}^{\infty} \bigl|\hat{f}_{\lambda b^m}\bigr|. \qquad (12)$$
We will use the discrete transform, $\tilde{f}_{m,\nu}$, to estimate the true Walsh coefficients, $\hat{f}_\nu$, for
$m$ significantly larger than $\log_b(\nu)$.

4.2 Sums of Walsh Series Coefficients and Cone Conditions

Consider the following sums of the true and approximate Walsh series coefficients.
For $\ell, m \in \mathbb{N}_0$ and $\ell \le m$ let
$$S_m(f) = \sum_{\nu = \lfloor b^{m-1}\rfloor}^{b^m - 1} \bigl|\hat{f}_\nu\bigr|, \qquad \hat{S}_{\ell,m}(f) = \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \sum_{\lambda=1}^{\infty} \bigl|\hat{f}_{\nu + \lambda b^m}\bigr|,$$
$$\check{S}_m(f) = \hat{S}_{0,m}(f) + \cdots + \hat{S}_{m,m}(f) = \sum_{\nu = b^m}^{\infty} \bigl|\hat{f}_\nu\bigr|, \qquad \tilde{S}_{\ell,m}(f) = \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \bigl|\tilde{f}_{m,\nu}\bigr|.$$
The first three sums, $S_m(f)$, $\hat{S}_{\ell,m}(f)$, and $\check{S}_m(f)$, cannot be observed because they
involve the true series coefficients. But the last sum, $\tilde{S}_{\ell,m}(f)$, is defined in terms of
the discrete Walsh transform and can easily be computed in terms of function values.
The details are described in the Appendix.
We now make critical assumptions about how certain sums provide upper bounds
on others. Let $\ell_* \in \mathbb{N}$ be some fixed integer and $\omega$ and $\mathring{\omega}$ be some non-negative valued
functions with $\lim_{m \to \infty} \mathring{\omega}(m) = 0$ such that $\omega(r)\mathring{\omega}(r) < 1$ for some $r \in \mathbb{N}$. Define
the cone of integrands
$$\mathcal{C} := \{f \in L^2([0,1)^d) : \hat{S}_{\ell,m}(f) \le \omega(m - \ell)\, \check{S}_m(f), \ \ell \le m, \ \ \check{S}_m(f) \le \mathring{\omega}(m - \ell)\, S_\ell(f), \ \ell_* \le \ell \le m\}. \qquad (13)$$
This is a cone because $f \in \mathcal{C} \implies af \in \mathcal{C}$ for all real $a$.


Fig. 2 The magnitudes of the true Walsh coefficients for $f(x) = e^{3x}\sin(10x^2)$
The first inequality asserts that the sum of the larger indexed Walsh coefficients
bounds a partial sum of the same coefficients. For example, this means that $\hat{S}_{0,12}(f)$, the
sum of the values of the large black dots in Fig. 2, is no greater than some factor times
$\check{S}_{12}(f)$, the sum of the values of the gray dots. Possible choices of $\omega$ are $\omega(m) = 1$
or $\omega(m) = C b^{-\alpha m}$ for some $C > 1$ and $0 \le \alpha \le 1$. The second inequality asserts
that the sum of the smaller indexed coefficients provides an upper bound on the sum
of the larger indexed coefficients. In other words, the fine scale components of the
integrand are not unduly large compared to the gross scale components. In Fig. 2 this
means that $\check{S}_{12}(f)$ is no greater than some factor times $S_8(f)$, the sum of the values
of the black squares. This implies that $|\hat{f}_\nu|$ does not dip down and then bounce back
up too dramatically as $\nu \to \infty$. The reason for enforcing the second inequality only
for $\ell_* \le \ell$ is that for small $\ell$, one might have a coincidentally small $S_\ell(f)$, while
$\check{S}_m(f)$ is large.





[0,1)d

 





 fbm  = S0,m ( f )

f (x) dx Im ( f )

by (12)

=1

(m) Sm ( f ) (m)(m

)S ( f ).

(14)

Thus, the faster S ( f ) decays as  , the faster the cubature error must decay.
Unfortunately, the true Walsh coefficients are unknown. Thus we must bound
S,m ( f ). This
S ( f ) in terms of the observable sum of the approximate coefficients, !
is done as follows:

$$S_\ell(f) = \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \bigl|\hat{f}_\nu\bigr|
= \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \Bigl|\tilde{f}_{m,\nu} - \sum_{\lambda=1}^{\infty} \hat{f}_{\nu + \lambda b^m}\, e^{2\pi\sqrt{-1}\langle \tilde{k}(\nu + \lambda b^m) \ominus \tilde{k}(\nu), \Delta\rangle/b}\Bigr| \quad \text{by (11)}$$
$$\le \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \bigl|\tilde{f}_{m,\nu}\bigr| + \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \sum_{\lambda=1}^{\infty} \bigl|\hat{f}_{\nu + \lambda b^m}\bigr| = \tilde{S}_{\ell,m}(f) + \hat{S}_{\ell,m}(f)$$
$$\le \tilde{S}_{\ell,m}(f) + \omega(m - \ell)\, \mathring{\omega}(m - \ell)\, S_\ell(f) \quad \text{by (13)},$$
$$S_\ell(f) \le \frac{\tilde{S}_{\ell,m}(f)}{1 - \omega(m - \ell)\, \mathring{\omega}(m - \ell)} \quad \text{provided that } \omega(m - \ell)\mathring{\omega}(m - \ell) < 1. \qquad (15)$$
Combining (14) with (15) leads to the following conservative upper bound on the
cubature error for $\ell, m \in \mathbb{N}$, $\ell_* \le \ell \le m$:
$$\left|\int_{[0,1)^d} f(x)\, dx - I_m(f)\right| \le \frac{\tilde{S}_{\ell,m}(f)\, \omega(m)\, \mathring{\omega}(m - \ell)}{1 - \omega(m - \ell)\, \mathring{\omega}(m - \ell)}. \qquad (16)$$
This error bound suggests the following algorithm.

4.3 An Adaptive Cubature Algorithm and Its Cost

Algorithm 2 (Adaptive Digital Sequence Cubature, cubSobol_g) Given the
parameter $\ell_* \in \mathbb{N}$ and the functions $\omega$ and $\mathring{\omega}$ that define the cone $\mathcal{C}$ in (13), choose the
parameter $r \in \mathbb{N}$ such that $\omega(r)\mathring{\omega}(r) < 1$. Let $C(m) := \omega(m)\mathring{\omega}(r)/[1 - \omega(r)\mathring{\omega}(r)]$
and $m = \ell_* + r$. Given a tolerance, $\varepsilon$, and a routine that produces values of the
integrand, $f$, do the following:
Step 1. Compute the sum of the discrete Walsh coefficients, $\tilde{S}_{m-r,m}(f)$, according
to Algorithm 3.
Step 2. Check whether the error tolerance is met, i.e., whether $C(m)\tilde{S}_{m-r,m}(f) \le \varepsilon$.
If so, then return the cubature $I_m(f)$ defined in (8) as the answer.
Step 3. Otherwise, increment $m$ by one, and go to Step 1.

There is a balance to be struck in the choice of $r$. Choosing $r$ too large causes the
error bound to depend on the Walsh coefficients with smaller indices, which may be
large, even though the Walsh coefficients determining the error are small. Choosing
$r$ too small makes $\omega(r)\mathring{\omega}(r)$ large, and thus the inflation factor, $C$, large to guard
against aliasing.

Theorem 1 If the integrand, $f$, lies in the cone, $\mathcal{C}$, then Algorithm 2 is successful:
$$\left|\int_{[0,1)^d} f(x)\, dx - I_m(f)\right| \le \varepsilon.$$
The number of integrand values required to obtain this answer is $b^m$, where the
following upper bound on $m$ depends on the tolerance and the unknown decay rate of
the Walsh coefficients:
$$m \le \min\bigl\{m' \ge \ell_* + r : C(m')\,[1 + \omega(r)\mathring{\omega}(r)]\, S_{m' - r}(f) \le \varepsilon\bigr\}.$$
The computational cost of this algorithm beyond that of obtaining the integrand
values is $O(m b^m)$ to compute the discrete Walsh transform.

Proof The success of this algorithm comes from applying (16). To bound the number
of integrand values required, note that the argument leading to (15) can be modified to
provide an upper bound on $\tilde{S}_{\ell,m}(f)$ in terms of $S_\ell(f)$:
$$\tilde{S}_{\ell,m}(f) = \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \bigl|\tilde{f}_{m,\nu}\bigr|
= \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \Bigl|\hat{f}_\nu + \sum_{\lambda=1}^{\infty} \hat{f}_{\nu + \lambda b^m}\, e^{2\pi\sqrt{-1}\langle \tilde{k}(\nu + \lambda b^m) \ominus \tilde{k}(\nu), \Delta\rangle/b}\Bigr| \quad \text{by (11)}$$
$$\le \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \bigl|\hat{f}_\nu\bigr| + \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \sum_{\lambda=1}^{\infty} \bigl|\hat{f}_{\nu + \lambda b^m}\bigr| = S_\ell(f) + \hat{S}_{\ell,m}(f)
\le [1 + \omega(m - \ell)\mathring{\omega}(m - \ell)]\, S_\ell(f) \quad \text{by (13)}.$$
Thus, the upper bound on the error in Step 2 of Algorithm 2 is itself bounded above
by $C(m)[1 + \omega(r)\mathring{\omega}(r)] S_{m-r}(f)$. Therefore, the stopping criterion in Step 2 must be
satisfied no later than when this quantity falls below $\varepsilon$.
The computation of the discrete Walsh transform and $\tilde{S}_{m-r,m}(f)$ is described in
Algorithm 3 in the Appendix. The cost of this algorithm is $O(m b^m)$ operations. $\square$

5 Numerical Experiments
Algorithm 2 has been implemented in MATLAB code as the function cubSobol_g.
It is included in our Guaranteed Automatic Integration Library (GAIL) [2]. Our
cubSobol_g utilizes MATLAB's built-in Sobol' sequences, so $b = 2$. The default
algorithm parameters are
Reliable Adaptive Cubature Using Digital Sequences


Fig. 3 Time required and
error observed for
cubSobol_g (Algorithm 2)
for the Keister example, (17).
Small dots denote the time
and error when the tolerance
of = 0.001 was met. Large
dots denote the time and
error when the tolerance was
not met. The solid line
denotes the empirical
distribution function of the
error, and the dot-dashed line
denotes the empirical
distribution function of the
time

10

0.2

0.4

0.6

0.8

0.8
1

10

0.6

0.4

10

0.2
3

10

0
6

10

 = 6,

379

r = 4,

10

10

10

10

10

C(m) = 5 2m ,

and mapping k is fixed heuristically according to Algorithm 3. Fixing C partially


determines and since (m) = C(m)/(r ) and (r )(r
) = C(r )/[1 + C(r )].
We have tried cubSobol_g on an example from [10]:

I =

Rd

et cos(t) dt = d/2


2


[0,1)d

"

# d
#1 
cos $
[ 1 (x j )]2 dx,
2 j=1

(17)

where is the standard Gaussian distribution function (Fig. 3). We generated 1000
IID random values of the dimension d = e D  with D being uniformly distributed
between 0 and log(20). Each time cubSobol_g was run, a different scrambled and
shifted Sobol sequence was used. The tolerance was met about 97 % of the time
and failures were more likely among the higher dimensions. For those cases where
the tolerance was not met, mostly the larger dimensions, the integrand lay outside
the cone C . Our choice of k via Algorithm 3 depends somewhat on the particular
scrambling and digital shift, so the definition of C also depends mildly on these.
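A minimal sketch of the Keister integrand (17), mapped to the unit cube with the inverse Gaussian distribution function, is given below. It is our own illustration: a plain scrambled-Sobol' average stands in for cubSobol_g, and the dimension, sample size, and seed are assumptions.

```python
# Sketch: the Keister test integrand (17) on [0,1)^d, with Phi^{-1} = norm.ppf,
# estimated by a scrambled Sobol' average (not the adaptive cubSobol_g).
import numpy as np
from scipy.stats import norm, qmc

def keister(x):                       # x: (n, d) points in [0, 1)^d
    z = norm.ppf(x)                   # Phi^{-1} applied componentwise
    return np.cos(np.sqrt(0.5 * np.sum(z**2, axis=1)))

d, m = 3, 12
sobol = qmc.Sobol(d, scramble=True, seed=7)
x = sobol.random_base2(m)             # 2^m scrambled Sobol' points
print(np.pi**(d / 2) * keister(x).mean())
```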

6 Discussion
There are few quasi-Monte Carlo cubature algorithms available that adaptively determine the sample size needed based on integrand values. The chief reason is that
reliable error estimation for quasi-Monte Carlo is difficult. Quasi-standard error has
serious drawbacks, as explained in [15]. Internal replications have no explicit theory.


IID replications of randomized quasi-Monte Carlo rules are sometimes used, but one
does not know how many replications are needed.
The proposed error bound and adaptive algorithm here are practical and have
theoretical justification. The conditions imposed on the sums of the (true) Fourier–Walsh coefficients make it possible to bound the cubature error in terms of discrete
Fourier–Walsh coefficients. The set of integrands satisfying these conditions is a non-convex cone (13), thereby placing us in a setting where adaption has the opportunity
to be beneficial.
Problems requiring further consideration include how to choose the default parameters for Algorithm 2. We would also like to extend our algorithm and theory to
the case of relative error.

Acknowledgments This work was partially supported by US National Science Foundation grants
DMS-1115392, DMS-1357690, and DMS-1522687. The authors thank Ronald Cools and Dirk
Nuyens for organizing MCQMC 2014. We thank Sergei Kucherenko and Art Owen for organizing
the special session in honor of Ilya M. Sobol'. We are grateful for Professor Sobol's many contributions to MCQMC and related fields. The suggestions made by Sou-Cheng Choi, Yuhan Ding, Lan
Jiang, and the anonymous referees to improve this manuscript are greatly appreciated.

Appendix: Fast Computation of the Discrete Walsh Transform

Let $y_0, y_1, \ldots$ be some data. Define $Y_\nu^{(m)}$ for $\nu = 0, \ldots, b^m - 1$ as follows:
$$Y_\nu^{(m)} := \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\sum_{\ell=0}^{m-1} \nu_\ell i_\ell/b}\, y_i = \frac{1}{b^m}\sum_{i_0=0}^{b-1}\cdots\sum_{i_{m-1}=0}^{b-1} e^{-2\pi\sqrt{-1}\sum_{\ell=0}^{m-1} \nu_\ell i_\ell/b}\, y_i,$$
where $i = i_0 + i_1 b + \cdots + i_{m-1} b^{m-1}$ and $\nu = \nu_0 + \nu_1 b + \cdots + \nu_{m-1} b^{m-1}$. For all $i_j, \nu_j \in \mathbb{F}_b$, $j, \ell = 0, \ldots, m - 1$, recursively define
$$Y_{m,0}(i_0, \ldots, i_{m-1}) := y_i,$$
$$Y_{m,\ell+1}(\nu_0, \ldots, \nu_\ell, i_{\ell+1}, \ldots, i_{m-1}) := \frac{1}{b}\sum_{i_\ell=0}^{b-1} e^{-2\pi\sqrt{-1}\,\nu_\ell i_\ell/b}\, Y_{m,\ell}(\nu_0, \ldots, \nu_{\ell-1}, i_\ell, \ldots, i_{m-1}).$$
This allows us to identify $Y_\nu^{(m)} = Y_{m,m}(\nu_0, \ldots, \nu_{m-1})$. By this iterative process one
can compute $Y_0^{(m)}, \ldots, Y_{b^m - 1}^{(m)}$ in only $O(m b^m)$ operations.
Note also that $Y_{m+1,m}(\nu_0, \ldots, \nu_{m-1}, 0) = Y_{m,m}(\nu_0, \ldots, \nu_{m-1}) = Y_\nu^{(m)}$.
This means that the work done to compute $Y^{(m)}$ can be used to compute $Y^{(m+1)}$.
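For $b = 2$ this butterfly recursion is a fast Walsh–Hadamard transform scaled by $2^{-m}$. The sketch below is our own illustration of that base-2 case; the indexing conventions and test data are assumptions.

```python
# Sketch: the butterfly recursion above for b = 2, i.e. a fast Walsh-Hadamard
# transform of y_0, ..., y_{2^m - 1} scaled by 2^{-m}; cost O(m 2^m) as stated.
import numpy as np

def fast_walsh_mean(y):
    y = np.asarray(y, dtype=float).copy()
    n = y.size                       # n = 2^m
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            a = y[start:start + h].copy()
            b = y[start + h:start + 2 * h].copy()
            y[start:start + h] = a + b
            y[start + h:start + 2 * h] = a - b
        h *= 2
    return y / n                     # Y_0^(m), ..., Y_{2^m-1}^(m)

print(fast_walsh_mean([1.0, 2.0, 3.0, 4.0]))   # first entry is the plain average 2.5
```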


Next, we relate the $Y_\nu^{(m)}$ to the discrete Walsh transform of the integrand $f$. For
every $k \in \mathbb{N}_0^d$ and every digital sequence $P_{\infty} = \{z_i\}_{i=0}^{\infty}$, let
$$\tilde{\nu}_0(k) := 0, \qquad \tilde{\nu}_m(k) := \sum_{\ell=0}^{m-1} \langle k, z_{b^\ell}\rangle\, b^\ell \in \mathbb{N}_{0,m}, \quad m \in \mathbb{N}. \qquad (18)$$
If we set $y_i = f(z_i \oplus \Delta)$, and if $\tilde{\nu}_m(k) = \nu$, then
$$\tilde{f}_m(k) = \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\langle k, z_i \oplus \Delta\rangle/b}\, y_i
= e^{-2\pi\sqrt{-1}\langle k, \Delta\rangle/b}\, \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\langle k, z_i\rangle/b}\, y_i \quad \text{by (4c)}$$
$$= e^{-2\pi\sqrt{-1}\langle k, \Delta\rangle/b}\, \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\langle k, \bigoplus_{j=0}^{m-1} i_j z_{b^j}\rangle/b}\, y_i \quad \text{by (3)}$$
$$= e^{-2\pi\sqrt{-1}\langle k, \Delta\rangle/b}\, \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\sum_{j=0}^{m-1} i_j \langle k, z_{b^j}\rangle/b}\, y_i \quad \text{by (4c)}$$
$$= e^{-2\pi\sqrt{-1}\langle k, \Delta\rangle/b}\, \frac{1}{b^m}\sum_{i=0}^{b^m - 1} e^{-2\pi\sqrt{-1}\sum_{\ell=0}^{m-1} \nu_\ell i_\ell/b}\, y_i \quad \text{by (18)}$$
$$= e^{-2\pi\sqrt{-1}\langle k, \Delta\rangle/b}\, Y_\nu^{(m)}. \qquad (19)$$

Using the notation in Sect. 4, for all $m \in \mathbb{N}_0$ define a pointer $\varrho_m : \mathbb{N}_{0,m} \to \mathbb{N}_{0,m}$
as $\varrho_m(\ell) := \tilde{\nu}_m(\tilde{k}(\ell))$. It follows that
$$\tilde{f}_{m,\ell} = \tilde{f}_m(\tilde{k}(\ell)) = e^{-2\pi\sqrt{-1}\langle \tilde{k}(\ell), \Delta\rangle/b}\, Y_{\varrho_m(\ell)}^{(m)},
\qquad \tilde{S}_{\ell,m}(f) = \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \bigl|\tilde{f}_{m,\nu}\bigr| = \sum_{\nu = \lfloor b^{\ell-1}\rfloor}^{b^\ell - 1} \bigl|Y_{\varrho_m(\nu)}^{(m)}\bigr|. \qquad (20)$$
The quantity $\tilde{S}_{m-r,m}(f)$ is the key to the stopping criterion in Algorithm 2.
If the map $\tilde{k} : \mathbb{N}_0 \to \mathbb{N}_0^d$ defined in Algorithm 1 is known explicitly, then specifying $\varrho_m$ is straightforward. However, in practice the bookkeeping involved in constructing $\tilde{k}$ might be tedious, so we take a data-dependent approach to constructing
the pointer $\varrho_m(\ell)$ for $\ell \in \mathbb{N}_{0,m}$ directly, which then defines $\tilde{k}$ implicitly.

Algorithm 3 Let $r \in \mathbb{N}$ be fixed. Given the input $m \in \mathbb{N}_0$, the discrete Walsh coefficients $Y_\nu^{(m)}$ for $\nu \in \mathbb{N}_{0,m}$, and also the pointer $\varrho_{m-1}(\ell)$ defined for $\ell \in \mathbb{N}_{0,m-1}$,
provided $m > 0$,
Step 1. If $m = 0$, then define $\varrho_0(0) = 0$ and go to Step 4.
Step 2. Otherwise, if $m \ge 1$, then initialize $\varrho_m(\ell) = \varrho_{m-1}(\ell)$ for $\ell \in \mathbb{N}_{0,m-1}$ and
$\varrho_m(\ell) = \ell$ for $\ell = b^{m-1}, \ldots, b^m - 1$.
Step 3. For $\ell' = m - 1, m - 2, \ldots, \max(1, m - r)$,
  for $\lambda = 1, \ldots, b^{\ell'} - 1$:
    Find $a^*$ such that $\bigl|Y_{\varrho_m(\lambda + a^* b^{\ell'})}^{(m)}\bigr| \ge \bigl|Y_{\varrho_m(\lambda + a b^{\ell'})}^{(m)}\bigr|$ for all $a \in \mathbb{F}_b$.
    Swap the values of $\varrho_m(\lambda)$ and $\varrho_m(\lambda + a^* b^{\ell'})$.
Step 4. Return $\varrho_m(\ell)$ for $\ell \in \mathbb{N}_{0,m}$. If $m \ge r$, then compute $\tilde{S}_{m-r,r}(f)$ according to
(20), and return this value as well.

Lemma 1 Let $P_{m,\ell}^{\perp} := \{k \in \mathbb{N}_0^d : \tilde{\nu}_m(k) = \varrho_m(\ell)\}$ for $\ell \in \mathbb{N}_{0,m}$, $m \in \mathbb{N}_0$, where $\varrho_m$
is given by Algorithm 3. Then we have implicitly defined the map $\tilde{k}$ in the sense
that any map $\tilde{k} : \mathbb{N}_{0,m} \to \mathbb{N}_0^d$ that chooses $\tilde{k}(0) = 0 \in P_{m,0}^{\perp}$, and $\tilde{k}(\ell) \in P_{m,\ell}^{\perp}$ for
all $\ell = 1, \ldots, b^m - 1$, gives the same value of $\tilde{S}_{m-r,r}(f)$. It is also consistent with
Algorithm 1 for $\ell \in \mathbb{N}_{0,m-r}$.

Proof The constraint that $\tilde{k}(\ell) \in P_{m,\ell}^{\perp}$ implies that $\tilde{S}_{m-r,r}(f)$ is invariant under all
$\tilde{k}$ chosen according to the assumption that $\tilde{k}(\ell) \in P_{m,\ell}^{\perp}$. By definition $0 \in P_{m,0}^{\perp}$
remains true for all $m$ for Algorithm 3.
The remainder of the proof is to show that choosing $\tilde{k}(\ell)$ by the hypothesis of
this lemma is consistent with Algorithm 1. To do this we show that for $m \in \mathbb{N}_0$
$$k \in P_{m,\ell}^{\perp}, \ l \in P_{m,\ell + a b^{\ell'}}^{\perp} \implies k \ominus l \in P_{\ell'}^{\perp} \quad \text{for all } a = 1, \ldots, b - 1, \ \ell' < m, \qquad (21)$$
and that
$$P_{m+1,\ell}^{\perp} \subseteq P_{m,\ell}^{\perp} \quad \text{for } \ell \in \mathbb{N}_{0,m-r} \text{ provided } m \ge r. \qquad (22)$$
The proof proceeds by induction. Since $P_{0,0}^{\perp} = \mathbb{N}_0^d$, the above two conditions are
satisfied automatically.
If they are satisfied for $m - 1$ (instead of $m$), then the initialization stage in Step
2 of Algorithm 3 preserves (21) for $m$. The swapping of $\varrho_m(\lambda)$ and $\varrho_m(\lambda + a^* b^{\ell'})$
values in Step 3 also preserves (21). Step 3 may cause $P_{m-1,\ell}^{\perp} \cap P_{m,\ell}^{\perp} = \emptyset$ for some
larger values of $\ell$, but the constraint on the values of $\ell'$ in Step 3 means that (22) is
preserved. $\square$

References
1. Caflisch, R.E.: Monte Carlo and quasi-Monte Carlo methods. Acta Numer. 7, 1–49 (1998)
2. Choi, S.C.T., Ding, Y., Hickernell, F.J., Jiang, L., Jiménez Rugama, Ll.A., Tong, X., Zhang, Y., Zhou, X.: GAIL: Guaranteed Automatic Integration Library (versions 1.0–2.1). MATLAB software (2013–2015). https://github.com/GailGithub/GAIL_Dev
3. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
4. Hickernell, F.J.: A generalized discrepancy and quadrature error bound. Math. Comput. 67, 299–322 (1998)
5. Hickernell, F.J., Sloan, I.H., Wasilkowski, G.W.: On strong tractability of weighted multivariate integration. Math. Comput. 73, 1903–1911 (2004)
6. Hickernell, F.J., Sloan, I.H., Wasilkowski, G.W.: On tractability of weighted integration for certain Banach spaces of functions. In: Niederreiter [13], pp. 51–71
7. Hickernell, F.J., Sloan, I.H., Wasilkowski, G.W.: On tractability of weighted integration over bounded and unbounded regions in R^s. Math. Comput. 73, 1885–1901 (2004)
8. Hickernell, F.J., Sloan, I.H., Wasilkowski, G.W.: The strong tractability of multivariate integration using lattice rules. In: Niederreiter [13], pp. 259–273
9. Jiménez Rugama, Ll.A., Hickernell, F.J.: Adaptive multidimensional integration based on rank-1 lattices. In: Cools, R., Nuyens, D. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2014, vol. 163, pp. 407–422. Springer, Heidelberg (2016)
10. Keister, B.D.: Multidimensional quadrature algorithms. Comput. Phys. 10, 119–122 (1996)
11. Lemieux, C.: Monte Carlo and Quasi-Monte Carlo Sampling. Springer Science+Business Media Inc, New York (2009)
12. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia (1992)
13. Niederreiter, H. (ed.): Monte Carlo and Quasi-Monte Carlo Methods 2002. Springer, Berlin (2004)
14. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems Volume II: Standard Information for Functionals. No. 12 in EMS Tracts in Mathematics. European Mathematical Society, Zürich (2010)
15. Owen, A.B.: On the Warnock–Halton quasi-standard error. Monte Carlo Methods Appl. 12, 47–54 (2006)

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate Periodic Functions with Bounded Mixed Derivatives

Aicke Hinrichs and Jens Oettershagen

Abstract We investigate quasi-Monte Carlo (QMC) integration of bivariate periodic
functions with dominating mixed smoothness of order one. While there exist several
QMC constructions which asymptotically yield the optimal rate of convergence of
$O(N^{-1} \log(N)^{1/2})$, it is yet unknown which point set is optimal in the sense that it is
a global minimizer of the worst case integration error. We will present a computer-assisted proof by exhaustion that the Fibonacci lattice is the unique minimizer of the
QMC worst case error in periodic $H^1_{\mathrm{mix}}$ for small Fibonacci numbers $N$. Moreover,
we investigate the situation for point sets whose cardinality $N$ is not a Fibonacci
number. It turns out that for $N = 1, 2, 3, 5, 7, 8, 12, 13$ the optimal point sets are
integration lattices.

Keywords Multivariate integration · Quasi-Monte Carlo · Optimal quadrature
points · Fibonacci lattice

1 Introduction
Quasi-Monte Carlo (QMC) rules are equal-weight quadrature rules which can be
used to approximate integrals defined on the $d$-dimensional unit cube $[0, 1)^d$,
$$\int_{[0,1)^d} f(x)\, dx \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i),$$
where $P_N = \{x_1, x_2, \ldots, x_N\}$ are deterministically chosen quadrature points in
$[0, 1)^d$. The integration error for a specific function $f$ is given as

386

A. Hinrichs and J. Oettershagen



N


1 


f (x) dx
f (x i ) .

 [0,1)d

N i=1
To study the behavior of this error as N increases for f from a Banach space (H , )
one considers the worst case error


N


1 


wce(H , P N ) = sup 
f (x) dx
f (x i ) .


d
N
[0,1)
f H
i=1

 f 1

Particularly nice examples of such function spaces are reproducing kernel Hilbert
1
spaces [1]. Here, we will consider the reproducing kernel Hilbert space Hmix
of
1-periodic functions with mixed smoothness. Details on these spaces are given in
Sect. 2. The reproducing kernel is a tensor product kernel of the form
K d, (x, y) =

d


K 1, (x j , y j ) for x = (x1 , . . . , xd ), y = (y1 , . . . , yd ) [0, 1)d

j=1

with K 1, (x, y) = 1 + k(|x y|) and k(t) = 21 (t 2 t + 16 ) and a parameter > 0.


1
, P N ) among all N -point
It turns out that minimizing the worst case error wce(Hmix
sets P N = {x 1 , . . . , x N } with respect to the Hilbert space norm corresponding to
the kernel K d, is equivalent to minimizing the double sum
G (x 1 , . . . , x N ) =

N


K d, (x i , x j ).

i, j=1

There is a general connection between the discrepancy of a point set and the worst case
error of integration. Details can be found in [11, Chap. 9]. In our case, the relevant
notion is the L 2 -norm of the periodic discrepancy. We describe the connection in
detail in Sect. 2.3.
There are many results on the rate of convergence of worst case errors and of
the optimal discrepancies for N , see e.g. [9, 11], but results on the optimal
point configurations for fixed N and d > 1 are scarce. For discrepancies, we are
only aware of [21], where the point configurations minimizing the standard L -stardiscrepancy for d = 2 and N = 1, 2, . . . , 6 are determined, [14], where for N = 1
the point minimizing the standard L - and L 2 -star discrepancy for d 1 is found,
and [6], where this is extended to N = 2.
It is the aim of this paper to provide a method which for d = 2 and N > 2
yields the optimal points for the periodic L 2 -discrepancy and worst case error in
1
Hmix
. Our approach is based on a decomposition of the global optimization problem
into exponentially many local ones which each possess unique solutions that can be
approximated efficiently by a nonlinear block GauSeidel method. Moreover, we

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

387

use the symmetries of the two-dimensional torus to significantly reduce the number
of local problems that have to be considered.
It turns out that in the case that N is a (small) Fibonacci number, the Fibonacci
lattice yields the optimal point configuration. It is common wisdom, see e.g.
[3, 10, 1518], that the Fibonacci lattice provides a very good point set for integrating periodic functions. Now our results support the conjecture that they are actually
the best points.
These results may suggest that the optimal point configurations are integration
lattices or at least lattice point sets. This seems to be true for some numbers N of
points, for example for Fibonacci numbers, but not always. However, it can be shown
1
, P N ). Moreover, our
that integration lattices are always local minima of wce(Hmix
numerical results also suggest that for small the optimal points are always close
to a lattice point set, i.e. N -point sets of the form


i (i)
,
N N


: i = 0, . . . , N 1 ,

where is a permutation of {0, 1, . . . , N 1}.


The remainder of this article is organized as follows: In Sect. 2 we recall Sobolev
spaces with bounded mixed derivatives, the notion of the worst case integration error
in reproducing kernel Hilbert spaces and the connection to periodic discrepancy.
In Sect. 3 we discuss necessary and sufficient conditions for optimal point sets and
derive lower bounds of the worst case error on certain local patches of the whole
[0, 1)2N . In Sect. 4 we compute candidates for optimal point sets up to machine
precision. Using arbitrary precision rational arithmetic we prove that they are indeed
near the global minimum which also turns out to be unique up to torus-symmetries.
For certain point numbers the global minima are integration lattices as is the case if
N is a Fibonacci number. We close with some remarks in Sect. 5.

1 (T2 )
2 Quasi-Monte Carlo Integration in Hmix

2.1 Sobolev Spaces of Periodic Functions


We consider univariate 1-periodic functions f : R R which are given by their
values on the torus T = [0, 1).

1 For k Z, the kth Fourier coefficient of a function


f L 2 (T) is given by fk = 0 f (x) exp(2 i kx) dx. The definition
 f 2H 1, = f02 +


kZ

|2 k|2 fk2 =



2
T

f (x) dx


+

f (x)2 dx

(1)

388

A. Hinrichs and J. Oettershagen

for a function f in the univariate Sobolev space H 1 (T) = W 1,2 (T) L 2 (T) of
functions with first weak derivatives bounded in L 2 gives a Hilbert space norm
 f  H 1, on H 1 (T) depending on the parameter > 0. The corresponding inner
product is given by

( f, g) H 1, (T) =

 
f (x) dx

g(x) dx +

f (x)g (x) dx.

We denote the Hilbert space H 1 (T) equipped with this inner product by H 1, (T).
Since H 1, (T) is continuously embedded in C 0 (T) it is a reproducing
kernel Hilbert space (RKHS), see [1], with a symmetric and positive definite kernel
K 1, : T T R, given by [20]
K 1, (x, y) := 1 +

|2 k|2 exp(2 ik(x y))

kZ\{0}

(2)

= 1 + k(|x y|),
where k(t) = 21 (t 2 t + 16 ) is the Bernoulli polynomial of degree two divided by
two.
This kernel has the property that it reproduces point evaluations in H 1 , i.e.
f (x) = ( f (), K (, x)) H 1, for all f H 1 . The reproducing kernel of the tensor
1,
product space Hmix (T2 ) := H 1 (T) H 1 (T) C(T2 ) is the product of the univariate kernels, i.e.
K 2, (x, y) = K 1, (x1 , y1 ) K 1, (x2 , y2 )
= 1 + k(|x1 y1 |) + k(|x2 y2 |) + 2 k(|x1 y1 |)k(|x2 y2 |).
(3)

2.2 Quasi-Monte Carlo Cubature


N
A linear cubature algorithm Q N ( f ) := N1 i=1
f (x i ) with uniform weights N1 on a
point set P N = {x 1 , . . . , x N } is called a QMC cubature rule. Well-known examples
for point sets used in such quadrature methods are digital nets, see e.g. [4, 9], and
lattice rules [15]. A two-dimensional integration lattice is a set of N points given as


i ig
,
N N



mod 1 : i = 0, . . . , N 1

for some g {1, . . . , N 1} coprime to N . A special case of such a rank-1 lattice


rule is the so called Fibonacci lattice that only exists for N being a Fibonacci number
Fn and is given by the generating vector (1, g) = (1, Fn1 ), where Fn denotes the

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

389

nth Fibonacci number. It is well known that the Fibonacci lattices yield the optimal
rate of convergence in certain spaces of periodic functions.
In the setting of a reproducing kernel Hilbert space with kernel K on a general
domain D, the worst case error of the QMC-rule Q N can be computed as
 
wce(H , P N )2 =

K (x, y) dx d y

N 
N
2 
1 
K (x i , y) dy + 2
K (x i , x j ),
N
N
i=1 D
i, j=1

which is the norm of the error functional, see e.g. [4, 11]. For the kernel K 2, we
obtain
1,

wce(Hmix (T2 ), P N )2 = 1 +

N
N
1 
K 2, (x i , x j ).
N 2 i=1 j=1

There is a close connection between the worst case error of integration in


1,
wce(Hmix (T2 ), P N ) for the case = 6 and periodic L 2 -discrepancy, which we
will describe in the following.

2.3 Periodic Discrepancy


The periodic L 2 -discrepancy is measured with respect to periodic boxes. In dimension d = 1, periodic intervals I (x, y) for x, y [0, 1) are given by
I (x, y) = [x, y) if x y

and

I (x, y) = [x, 1) [0, y) if x > y.

In dimension d > 1, the periodic boxes B(x, y) for x = (x1 , . . . , xd ) and y =


(y1 , . . . , yd ) [0, 1)d are products of the one-dimensional intervals, i.e.
B(x, y) = I (x1 , y1 ) I (xd , yd ).
The discrepancy of a set P N = {x 1 , . . . , x N } [0, 1)d with respect to such a
periodic box B = B(x, y) is the deviation of the relative number of points of P N
in B from the volume of B
D(P N , B) =

#P N B
vol(B).
N

Finally, the periodic L 2 -discrepancy of P N is the L 2 -norm of the discrepancy function taken over all periodic boxes B = B(x, y), i.e.

390

A. Hinrichs and J. Oettershagen


D2 (P N ) =

1/2


D(P N , B(x, y)) d y dx
2

[0,1)d

[0,1)d

It turns out, see [11, p. 43] that the periodic L 2 -discrepancy can be computed as
D2 (P N )2 = 3d +

1
N2

K d (x, y)

x, yP N

1,6
(Td ), P N )2 ,
= 3d wce(Hmix

where K d is the tensor product of d kernels K 1 (x, y) = |x y|2 |x y| + 21 . So


minimizing the periodic L 2 -discrepancy is equivalent to minimizing the worst case
1,
error in Hmix for = 6. Let us also remark that the periodic L 2 -discrepancy is (up to
a factor) sometimes also called diaphony. This terminology was introduced in [22].

3 Optimal Cubature Points


In this section we deal with (local) optimality conditions for a set of two-dimensional
points P N (x, y) T2 , where x, y T N denote the vectors of the first and
second components of the points, respectively.

3.1 Optimization Problem


We want to minimize the squared worst case error
1,

wce(Hmix (T2 ), P N )2 = 1 +

N 1
1 
K 1, (xi , x j ) K 1, (yi , y j )
N2
i, j=0

=1+

=
=

N2

1
N2

N
1


N
1




1 + k(|xi x j |) + k(|yi y j |) + 2 k(|xi x j |)k(|yi y j |)

i, j=0



k(|xi x j |) + k(|yi y j |) + k(|xi x j |)k(|yi y j |)

i, j=0

(2k(0) + k(0)2 )
N
N 2 N 1

2  
+ 2
k(|xi x j |) + k(|yi y j |) + k(|xi x j |)k(|yi y j |)
N
i=0 j=i+1

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

391

1,

Thus, minimizing wce(Hmix (T2 ), P N )2 is equivalent to minimizing either


F (x, y) :=

N 1
N 2 



k(|xi x j |) + k(|yi y j |) + k(|xi x j |)k(|yi y j |)
i=0 j=i+1

(4)
or
G (x, y) :=

N 1


(1 + k(|xi x j |))(1 + k(|yi y j |)).

(5)

i, j=0

For theoretical considerations we will sometimes use G , while for the numerical
implementation we will use F as objective function, since it has less summands.
Let , S N be two permutations of {0, 1, . . . , N 1}. Define the sets


D,

x
x (1) x (N 1)
= x [0, 1) , y [0, 1) : (0)
y (0) y (1) y (N 1)
N


(6)

on which all points maintain the same order in both components and hence it holds
|xi x j | = si, j (xi x j ) for si, j {1, 1}. It follows that the restriction of F to
D, , i.e. F (x, y)|D, , is a polynomial of degree 4 in (x, y). Moreover, F |D, is
convex for sufficiently small .
Proposition 1 F (x, y)|D, and G (x, y)|D, are convex if [0, 6].
Proof It is enough to prove the claim for
G (x, y) =

N 1


(1 + k(|xi x j |))(1 + k(|yi y j |)).

i, j=0

Since the sum of convex functions is convex


and since f (x y) is convex if f is, it
is enough to show that f (s, t) = 1 + k(s) 1 + k(t) is convex for s, t [0, 1].
To this end, we
show that the Hesse matrix H ( f ) is positive definite if 0 < 6.
First, f ss = 1 + k(t) is positive if < 24. Hence is is enough to check that the
determinant of H ( f ) is positive, which is equivalent to the inequality

 




1 2
1 2
t
1 + k(s) 1 + k(t) > 2 s
.
2
2
So it remains to see that

1 + k(s) = 1 +
2

1
s s+
6
2



1 2
> s
.
2

392

A. Hinrichs and J. Oettershagen

But this is elementary to check for 0 < 6 and s [0, 1]. In the case = 6
the determinant of H ( f ) = 0 and some additional argument is necessary which we
omit here.

Since
[0, 1) N [0, 1) N =

D, ,

(, )S N S N

one can obtain the global minimum of F on [0, 1) N [0, 1) N by computing


argmin(x, y)D, F (x, y) for all (, ) S N S N and choose the global minimum
as the smallest of all the local ones.

3.2 Using the Torus Symmetries


We now want to analyze how symmetries of the two dimensional torus T2 allow to
reduce the number of regions D, for which the optimization problem has to be
solved.
The symmetries of the torus T2 which do not change the worst case error for the
considered classes of periodic functions are generated by
1. Shifts in the first coordinate x  x +c mod 1 and shifts in the second coordinate
y  y + c mod 1.
2. Reflection of the first coordinate x  1x and reflection of the second coordinate
y  1 y.
3. Interchanging the first coordinate x and the second coordinate y.
4. The points are indistinguishable, hence relabeling the points does not change the
worst case error.
Applying finite compositions of these symmetries to all the points in the point set
P N = {(x0 , y0 ), . . . , (x N 1 , y N 1 )} leads to an equivalent point set with the same
worst case integration error. This shows that the group of symmetries G acting on
the pairs (, ) indexing D, generated by the following operations
1. replacing or by a shifted permutation:  ( (0) + k mod N , . . . ,
(N 1) + k mod N ) or  ( (0) + k mod N , . . . , (N 1) + k mod N )
2. replacing or by its flipped permutation:  ( (N 1), (N 2), . . . , (1),
(0)) or  ( (N 1), (N 2), . . . , (1), (0))
3. interchanging and : (, )  (, )
4. applying a permutation S N to both and : (, )  ( , )
lead to equivalent optimization problems. So let us call the pairs (, ) and ( , )
in S N S N equivalent if they are in the same orbit with respect to the action of G.
In this case we write (, ) ( , ).
Using the torus symmetries 1. and 4. it can always be arranged that = id and
(0) = 0, which together with fixing the point (x0 , y0 ) = (0, 0) leads to the sets

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate



0 = x0 x1 . . . x N 1
,
D = x [0, 1) N , y [0, 1) N :
0 = y0 y (1) y (N 1)

393

(7)

where S N 1 denotes a permutation of {1, 2, . . . , N 1}.


But there are many more symmetries and it would be algorithmically desirable
to cycle through exactly one representative of each equivalence class without ever
touching the other equivalent . This seems to be difficult to implement, hence we
settled for a little less which still reduces the amount of permutations to be handled
considerably.
To this end, let us define the symmetrized metric
d(i, j) = min{|i j|, N |i j|}

for

0 i, j N 1

(8)

and the following subset of S N .


Definition 1 The set of semi-canonical permutations C N S N consists of permutations which fulfill
(i)
(ii)
(iii)
(iv)

(0) = 0
d( (1), (2)) d(0, (N 1))
(1) = min {d( (i), (i + 1)) | i = 0, 1, . . . , N 1}
is lexicographically smaller than 1 .

Here we identify (N ) with 0 = (0).


This means that is semi-canonical if the distance between 0 = (0) and (1)
is minimal among all distances between (i) and (i + 1), which can be arranged
by a shift. Moreover, the distance between (1) and (2) is at most as large as the
distance between (0) and (N 1), which can be arranged by a reflection and a
shift if it is not the case. Hence we have obtained the following lemma.
Lemma 1 For any permutation S N with (0) = 0 there exists a semi-canonical
such that the sets D and D are equivalent up to torus symmetry.
Thus we need to consider only semi-canonical which is easy to do algorithmically.
Remark 1 If S N is semi-canonical, it holds (1) N /2.
Another main advantage in considering our objective function only in domains
D is that it is not only convex but strictly convex here. This is due to the fact that
we fix (x0 , y0 ) = (0, 0).
Proposition 2 F (x, y)|D and G (x, y)|D are strictly convex if [0, 6].
Proof Again it is enough to prove the claim for
G (x, y) =

N 1

i, j=0

(1 + k(|xi x j |))(1 + k(|yi y j |)).

394

A. Hinrichs and J. Oettershagen

Now we use that the sum of a convex and a strictly convex function is again strictly
convex. Hence it is enough to show that the function
f (x1 , . . . , x N 1 , y1 , . . . , y N 1 ) =

N 1


(1 + k(|xi x0 |))(1 + k(|yi y0 |))

i=1

N 1


(1 + k(xi ))(1 + k(yi ))

i=1

is strictly convex on [0, 1] N 1 [0, 1] N 1 . In the proof of Proposition 1 it was


actually shown that f i (xi , yi ) = (1 + k(xi ))(1 + k(yi )) is strictly convex for
(xi , yi ) [0, 1]2 for each fixed i = 1, . . . , N 1. Hence the strict convexity of f
follows from the following easily verified lemma.

Lemma 2 Let f i : Di R, i = 1, . . . , m be strictly convex functions on the convex
domains Di Rdi . Then the function
f : D = D1 Dm R, (z 1 , . . . , z m ) 

m


f i (z i )

i=1

is strictly convex.

Hence we have indeed a unique point in each D where the minimum of F is


attained.

3.3 Minimizing F on D
Our strategy will be to compute the local minimum of F on each region
D [0, 1) N [0, 1) N for all semi-canonical permutations C N S N and
determine the global minimum by choosing the smallest of all the local ones.
This gives for each C N the constrained optimization problem
min F (x, y)

(x, y)D

subject to vi (x) 0 and wi ( y) 0 for all i = 1, . . . , N 1,


(9)

where the inequality constraints are linear and given by


vi (x) = xi xi1

and

wi ( y) = y (i) y (i1)

for i = 1, . . . , N 1. (10)

In order to use the necessary (and due to local strict convexity also sufficient) conditions for local minima

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

F (x, y) = 0
xk

F (x, y) = 0
yk

and

395

for k = 1, . . . , N 1

for (x, y) D we need to evaluate the partial derivatives of F .


Proposition 3 For a given permutation C N the partial derivative of F |D with
respect to the second component y is given by

N 1
k1
N 1




F (x, y)|D = yk
ci,k
ci,k yi +
ci,k si,k
ck, j sk, j ,
yk
2 i=0
i=0
i=0
j=k+1
N 1

i=k

i=k

(11)
where si, j = sgn(yi y j ) and ci, j := 1 + k(|xi x j |) = c j,i .
Interchanging x and y the same result holds for the partial derivatives with respect
to x with the obvious modification to ci, j and the simplification that si, j = 1.
The second order derivatives with respect to y are given by

N 1
k1
2
i=0 ci,k + i=k+1 ci,k
F(x, y)|D =
yk y j
ck, j

for j = k
, k, j {1, . . . , N 1}
for j  = k

(12)
Again, the analogue for xk x j F(x, y)|D is obtained with the obvious modification
ci, j = 1 + k(|yi y j |).
2

Proof We prove the claim for the partial derivative with respect to y:
1
N
2 N



F (x, y) =
k(|yi y j |) 1 + k(|xi x j |) +
k(|xi x j |)
yk
yk


 yk
i=0 j=i+1

N
2 N
1



=:ci, j

ci, j

i=0 j=i+1

1
N
2 N



ci, j

i=0 j=i+1

N
1


ck, j sk, j

j=k+1

k(|yi y j |)
yk

for i = k
si, j
k (si, j (yi y j )) si, j for j = k

0
else
 



k1
1
1

sk, j (yk y j )
ci,k si,k si,k (yi yk )
2
2
i=0

1
1
k1
N
1

N
N
1 

= yk
ci,k
ci,k yi +
ci,k si,k
ck, j sk, j .
2
i=0
i=k

i=0
i=k

i=0

From this we immediately get the second derivative (12).

j=k+1

396

A. Hinrichs and J. Oettershagen

3.4 Lower Bounds of F on D


Until now we are capable of approximating local minima of F on a given D . If this
is done for all C N we can obtain a candidate for a global minimum, but due to
the finite precision of floating point arithmetic one can never be sure to be close to the
actual global minimum. However, it is also possible to compute a lower bound for
the optimal point set for each D using Wolfe-duality for constrained optimization.
It is known [12] that for a convex problem with linear inequality constraints like (9)
the Lagrangian
L F (x, y, , ) := F(x, y) T v(x) T w( y)
= F(x, y)

N 1


(i vi (x) + i wi ( y))

(13)
(14)

i=1

gives a lower bound on F, i.e.


min F(x, y) L F ( x , y, , )

(x, y)D

for all ( x , y, , ) that fulfill the constraint


(x, y) L F ( x , y, , ) = 0

and

, 0 (component-wise).

(15)

Here, (x, y) = ( x , y ), where x denotes the gradient of a function with respect to


the variables in x. Hence it is our goal to find for each D such an admissible point
( x , y, , ) which yields a lower bound that is larger than some given candidate for
the global minimum. If the relevant computations are carried out in infinite precision
rational number arithmetic these bounds are mathematically reliable.
In order to accomplish this we first have to compute the Lagrangian of (9). To this
end, let P {1, 0, 1}(N 1)(N 1) denote the permutation matrix corresponding to
S N 1 and

1 1 0 . . . 0 0
0 1 1 . . . 0 0

.. R(N 1)(N 1) .
..
B := ...
(16)

.
.

0
. . . 0 1 1
0
...
0 1
Then the partial derivatives of L F with respect to x and y are given by

1 2
..
.

x L F (x, y, , ) = x F(x, y)
= x F(x, y) B
N 2 N 1
N 1

(17)

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

397

and

(1) (2)
..
.

y L F (x, y, , ) = y F(x, y)
= y F(x, y) BP .
(N 2) (N 1)
(N 1)
(18)
This leads to the following theorem.
Theorem 1 For C N and > 0 let the point ( x , y ) D fulfill

F( x , y ) =
xk

F( x , y ) =
yk

and

for k = 1, . . . , N 1.

(19)

Then
F(x, y) F( x , y )

N 1



(N i) vi ( x ) + (N i)wi ( y )
i=1
2

> F( x , y ) N

(20)
(21)

holds for all (x, y) D .


Proof Choosing
= B 1 x F( x , y )

and

= P1 B 1 y F( x , y )

(22)

x F( x , y) = B

and

y F( x , y) = BP .

(23)

yields

A short computation shows that the inverse of B from (16) is given by

B 1

1 1 ...
0 1 . . .

:= .
.. 0 . . .
0 ... 0

1
1

(N 1)(N 1)
,
.. R
.
1

which yields y, > 0 and hence by Wolfe duality gives (20). The second inequality
(21)
follows from
noting that both |vi (x)| and |wi ( y)| are bounded by 1 and
Nthen
1
N 1
(N i) = 2 i=1
i = (N 1)(N 2) < N 2 .

2 i=1
Now, suppose we had some candidate (x , y ) D for an optimal point set. If we
can find for all other C N points ( x , y ) that fulfills (19) and
F( x , y ) N 2 F (x , y )

398

A. Hinrichs and J. Oettershagen

for some > 0, we can be sure that D is (up to torus symmetry) the unique domain
D that contains the globally optimal point set.

4 Numerical Investigation of Optimal Point Sets


In this section we numerically obtain optimal point sets with respect to the worst
1
. Moreover, we present a proof by exhaustion that these point
case error in Hmix
sets are indeed approximations to the unique (modulo torus symmetry) minimizers
of F . Since integration lattices are local minima, if the D containing the global
minimizer corresponds to an integration lattice, this integration lattice is the exact
global minimizer.

4.1 Numerical Minimization with Alternating Directions


In order to obtain the global minimum (x , y ) of F we are going to compute
:= argmin min F (x, y),
C N

(x, y)D

(24)

where the inner minimum has a unique solution due to Proposition 2. Moreover, since
D is a convex domain we know that the local minimum of F (x, y)|D is not on
the boundary. Hence we can restrict our search for optimal point sets to the interior
of D , where F is differentiable.
Instead of directly employing a local optimization technique, we will make use
of the special structure of F . While F (x, y)|D is a polynomial of degree four, the
functions
(25)
x  F (x, y0 )|D and y  F (x 0 , y)|D ,
where one coordinate direction is fixed, are quadratic polynomials, which have unique
minima in D . We are going to use this property within an alternating minimization
approach. This means, that the objective function F is not minimized along all coordinate directions simultaneously, but with respect to certain successively alternating
blocks of coordinates. If these blocks have size one this method is usually referred
to as coordinate descent [7] or nonlinear GauSeidel method [5]. It is successfully employed in various applications, like e.g. expectation maximization or tensor
approximation [8, 19].
In our case we will alternate between minimizing F (x, y) along the first coordinate block x (0, 1) N 1 and the second one y (0, 1) N 1 , which can be done
exactly due to the quadratic polynomial property of the partial objectives (25). The
method is outlined in Algorithm 1, which for threshold-parameter = 0 approximates the local minimum of F on D . For > 0 it obtains feasible points that

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

399

Algorithm 1 Alternating minimization algorithm. For off-set = 0 it finds local


minima of F . For > 0 it obtains feasible points used by Algorithm 2.
Given: Permutation CN , tolerance > 0 and off-set 0.
Initialize:
1. x (0) := (0,
2. k := 0.

1
N

,...,

N 1
N )

and y(0) = (0,

(1)
(N 1)
).
N ,...,
N

repeat
N
N


1. compute H x := xi x j F (x (k) , y(k) i, j=1 and x = xi F (x (k) , y(k) i=1 by (12) and (11).

2. Update x (k+1) := H 1
( x + 1) via Cholesky factorization.
x

N
N
3. compute H y := yi y j F (x (k+1) , y(k) i, j=1 and y = yi F (x (k+1) , y(k) i=1 .


4. Update y(k+1) := H 1
y + 1 via Cholesky factorization.
y
5. k := k + 1.

until  x 2 +  y 2 <
Output: point set (x, y) D with x F (x, y) 1 and y F (x, y) 1.

fulfill (19), i.e. (x, y) F = (, . . . , ) = 1. Linear convergence of the alternating


optimization method for strictly convex functions was for example proven in [2, 13].

4.2 Obtaining Lower Bounds


By now we are able to obtain a point set (x , y ) D as a candidate for a global
minimum of F by finding local minima on each D , C N . On first sight we can
not be sure that we chose the right , because the value of min(x, y)D F (x, y) can
only be computed numerically.
On the other hand, Theorem 1 allows to compute lower bounds for all the other
domains D with C N . If we were able to obtain for each a point ( x , y ),
such that
min

(x, y)D

F (x, y) N := F (x , y ) < L F ( x , y ) 2N 2 F (x, y),

we could be sure that the global optimum is indeed located in D and (x , y ) is a


good approximation to it. Luckily, this is the case. Of course certain computations
can not be done in standard double floating point arithmetic. Instead we use arbitrary
precision rational number (APR) arithmetic from the GNU Multiprecision library
GMP from http://www.gmplib.org. Compared to standard floating point arithmetic
in double precision this is very expensive, but it has only to be used at certain parts of
the algorithm. The resulting procedure is outlined in Algorithm 2, where we marked
those parts which require APR arithmetic.

400

A. Hinrichs and J. Oettershagen

Algorithm 2 Computation of lower bound on D .

Given: Optimal point candidate P N := (x , y ) D with CN , tolerance > 0 and off-set


0.
Initialize:
1. Compute N := F (x , y ) (in APR arithmetic).
2. N := .
for all CN
1.
2.
3.
4.
5.

Find ( x , y ) D s.t. (x, y) F ( x , y ) 1 by Algorithm 1.


Compute := B 1 x F( x , y ) and := P1 B 1 y F( x , y ) (in APR arithmetic).
Verify , > 0.
Evaluate := L F ( x , y , , ) (in APR arithmetic).
If ( N ) N := N .

Output: Set of permutations in which D contained a lower bound smaller than N .

4.3 Results
In Figs. 1 and 2 the optimal point sets for N = 2, . . . , 16 and both = 1 and = 6
are plotted. It can be seen that they are close to lattice point sets, which justifies using
them as start points in Algorithm 1. The distance to lattice points seems to be small
if is small.
In Table 1 we list the permutations for which D contains an optimal set of
cubature points. In the second column the total number of semi-canonical permutations C N that had to be considered is shown. It grows approximately like 21 (N 2)!.
Moreover, we computed the minimal worst case error and periodic L 2 -discrepancies.
In some cases we found more than one semi-canonical permutation for which
D contained a point set which yields the optimal worst case error. Nevertheless, they
represent equivalent permutations. In the following list, the torus symmetries used
to show the equivalency of the permutations are given. All operations are modulo 1.

N = 7: (x, y)  (1 y, x)
N = 9: (x, y)  (y 2/9, x 1/9)
N = 11: (x, y)  (y + 5/11, x 4/11)
N = 14: (x, y)  (x 4/14, y + 6/14)
N = 15: (x, y)  (y + 3/15, x + 2/15), (y 2/15, 12/15 x), (y 6/15,
4/15 x)
N = 16: (x, y)  (1/16 x, 3/16 y)

In all the examined cases N {2, . . . , 16} Algorithm 2 produced sets N which
contained exactly the permutations that were previously obtained by Algorithm 1
and are listed in Table 1. Thus we can be sure, that the respective D contained
minimizers of F , which on each D are unique. Hence we know that our numerical
approximation of the minimum is close to the true global minimum, which (modulo
torus symmetries) is unique. In the cases N = 1, 2, 3, 5, 7, 8, 12, 13 the obtained
global minima are integration lattices.

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

Fig. 1 Optimal point sets for N = 2, . . . , 16 and = 1

401

402

Fig. 2 Optimal point sets for N = 2, . . . , 16 and = 6

A. Hinrichs and J. Oettershagen

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

403

Table 1 List of semi-canonical permutations , such that D contains an optimal set of cubature
points for N = 1, . . . , 16
1,1
N
|CN |
wce(Hmix
, P N ) D2 (P N )

Lattice
1
2
3
4
5
6
7

0
1
1
2
5
13
57

0.416667
0.214492
0.146109
0.111307
0.0892064
0.0752924
0.0650941

0.372678
0.212459
0.153826
0.121181
0.0980249
0.0850795
0.0749072

8
9

282
1,862

0.056846
0.0512711

0.0651562
0.0601654

10

14,076

0.0461857

0.054473

11

124,995

0.0422449

0.050152

12

1,227,562

0.0370732

0.0456259

13

13,481,042

0.0355885

0.0421763

14

160,456,465

0.0333232

0.0400524

15

2,086,626,584

0.0312562

0.0379055

16

29,067,602,676

0.0294507

0.0359673

(0)
(0 1)
(0 1 2)
(0 1 3 2)
(0 2 4 1 3)
(0 2 4 1 5 3)
(0 2 4 6 1 3 5), (0
3 6 2 5 1 4)
(0 3 6 1 4 7 2 5)
(0 2 6 3 8 5 1 7 4),
(0 2 7 4 1 6 3 8 5)
(0 3 7 1 4 9 6 2 8
5)
(0 3 8 1 6 10 4 7 2
9 5), (0 3 9 5 1 7
10 4 8 2 6)
(0 5 10 3 8 1 6 11
4 9 2 7)
(0 5 10 2 7 12 4 9
1 6 11 3 8)
(0 5 10 2 8 13 4
11 6 1 9 3 12 7),
(0 5 10 3 12 7 1 9
4 13 6 11 2 8)
(0 4 9 13 6 1 11 3
8 14 5 10 2 12 7),
(0 5 11 2 7 14 9 3
12 6 1 10 4 13 8),
(0 5 11 2 8 13 4
10 1 6 14 9 3 12
7), (0 5 11 2 8 13
6 1 10 4 14 7 12 3
9)
(0 3 11 5 14 9 1 7
12 4 15 10 2 6 13
8), (0 3 11 6 13 1
9 4 15 7 12 2 10 5
14 8)











404

A. Hinrichs and J. Oettershagen

5 Conclusion
In the present paper we computed optimal point sets for quasi-Monte Carlo cubature
of bivariate periodic functions with mixed smoothness of order one by decomposing
the required global optimization problem into approximately (N 2)!/2 local ones.
Moreover, we computed lower bounds for each local problem using arbitrary precision rational number arithmetic. Thereby we obtained that our approximation of the
global minimum is in fact close to the real solution.
In the special case of N being a Fibonacci number our approach showed that for
N {1, 2, 3, 5, 8, 13} the Fibonacci lattice is the unique global minimizer of the
1
. We strongly conjecture that this is true for all
worst case integration error in Hmix
Fibonacci numbers. Also in the cases N = 7, 12, the global minimizer is the obtained
integration lattice.
In the future we are planning to prove that optimal points are close to lattice
r
, i.e. Sobolev spaces with dominating
points. Moreover, we will investigate Hmix
mixed smoothness of order r 2 and other suitable kernels and discrepancies.
Acknowledgments The authors thank Christian Kuske and Andr Uschmajew for valuable hints
and discussions. Jens Oettershagen was supported by the Sonderforschungsbereich 1060 The Mathematics of Emergent Effects of the DFG.

References
1. Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68, 337404 (1950)
2. Bezdek, J.C., Hathaway, R.J., Howard, R.E., Wilson, C.A., Windham, M.P.: Local convergence
analysis of a grouped variable version of coordinate descent. J. Optim. Theory Appl. 54(3),
471477 (1987)
3. Bilyk, D., Temlyakov, V.N., Yu, R.: Fibonacci sets and symmetrization in discrepancy theory.
J. Complex. 28, 1836 (2012)
4. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte
Carlo Integration. Cambridge University Press, Cambridge (2010)
5. Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gau-Seidel method
under convex constraints. Oper. Res. Lett. 26(3), 127136 (2000)
6. Larcher, G., Pillichshammer, F.: A note on optimal point distributions in [0, 1)s . J. Comput.
Appl. Math. 206, 977985 (2007)
7. Luo, Z.Q., Tseng, P.: On the convergence of the coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl. 72(1), 735 (1992)
8. McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions. Wiley series in probability
and statistics. Wiley, New York (1997)
9. Niederreiter, H.: Quasi-Monte Carlo Methods and Pseudo-Random Numbers, Society for
Industrial and Applied Mathematics (1987)
10. Niederreiter, H., Sloan, I.H.: Integration of nonperiodic functions of two variables by Fibonacci
lattice rules. J. Comput. Appl. Math. 51, 5770 (1994)
11. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems. Volume II: Standard
Information for Functionals. European Mathematical Society Publishing House, Zrich (2010)
12. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, New York (2006)
13. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables,
Society for Industrial and Applied Mathematics (1987)

Optimal Point Sets for Quasi-Monte Carlo Integration of Bivariate

405

14. Pillards, T., Vandewoestyne, B., Cools, R.: Minimizing the L 2 and L star discrepancies of a
single point in the unit hypercube. J. Comput. Appl. Math. 197, 282285 (2006)
15. Sloan, I.H., Joe, S.: Lattice Methods for Multiple Integration. Oxford University Press, New
York and Oxford (1994)
16. Ss, V.T., Zaremba, S.K.: The mean-square discrepancies of some two-dimensional lattices.
Stud. Sci. Math. Hung. 14, 255271 (1982)
17. Temlyakov, V.N.: Error estimates for Fibonacci quadrature formulae for classes of functions.
Trudy Mat. Inst. Steklov 200, 327335 (1991)
18. Ullrich, T., Zung, D.: Lower bounds for the integration error for multivariate functions with
mixed smoothness and optimal Fibonacci cubature for functions on the square. Math. Nachr.
288(7), 743762 (2015)
19. Uschmajew, A.: Local convergence of the alternating least squares algorithm for canonical
tensor approximation. SIAM J. Matrix Anal. Appl. 33(2), 639652 (2012)
20. Wahba, G.: Smoothing noisy data with spline functions. Numer. Math. 24(5), 383393 (1975)
21. White, B.E.: On optimal extreme-discrepancy point sets in the square. Numer. Math. 27, 157
164 (1977)
22. Zinterhof, P.: ber einige Abschtzungen bei der Approximation von Funktionen mit Gleichverteilungsmethoden. sterreich. Akad. Wiss. Math.-Naturwiss. Kl. S.-B. II 185, 121132
(1976)

Adaptive Multidimensional Integration


Based on Rank-1 Lattices
Llus Antoni Jimnez Rugama and Fred J. Hickernell

Abstract Quasi-Monte Carlo methods are used for numerically integrating multivariate functions. However, the error bounds for these methods typically rely on
a priori knowledge of some semi-norm of the integrand, not on the sampled function values. In this article, we propose an error bound based on the discrete Fourier
coefficients of the integrand. If these Fourier coefficients decay more quickly, the
integrand has less fine scale structure, and the accuracy is higher. We focus on rank-1
lattices because they are a commonly used quasi-Monte Carlo design and because
their algebraic structure facilitates an error analysis based on a Fourier decomposition of the integrand. This leads to a guaranteed adaptive cubature algorithm with
computational cost O(mbm ), where b is some fixed prime number and bm is the
number of data points.
Keywords Quasi-Monte Carlo methods Multidimensional integration
lattices Adaptive algorithms Automatic algorithms

Rank-1

1 Introduction
Quasi-Monte Carlo (QMC) methods use equally weighted sums of integrand values
at carefully chosen nodes to approximate multidimensional integrals over the unit
cube,

n1
1
f (z i )
f (x) dx.
n i=0
[0,1)d
Ll.A. Jimnez Rugama (B) F.J. Hickernell
Department of Applied Mathematics, Illinois Institute of Technology,
10 W 32nd Street, E1-208, Chicago, IL 60616, USA
e-mail: ljimene1@hawk.iit.edu
F.J. Hickernell
e-mail: hickernell@iit.edu
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_20

407

408

Ll.A. Jimnez Rugama and F.J. Hickernell

Integrals over more general domains may often be accommodated by a transformation


of the integration variable. QMC methods are widely used because they do not suffer
from a curse of dimensionality. The existence of QMC methods with dimensionindependent error convergence rates is discussed in [11, Chaps. 1012]. See [3] for
a recent review.
The QMC convergence rate of O(n (1) ) does not give enough information about
the absolute error to determine how large n must be to satisfy a given error tolerance, .
The objective of this research is to develop a guaranteed, QMC algorithm based on
rank-1 lattices that determines n adaptively by calculating a data-driven upper bound
on the absolute error. The KoksmaHlawka inequality is impractical for this purpose
because it requires the total variation of the integrand. Our data-driven bound is
expressed in terms of the integrands discrete Fourier coefficients.
Sections 24 describe the group structure of rank-1 lattices and how the complex
exponential functions are an appropriate basis for these nodes. For computation
purposes, there is also an explanation of how to obtain the discrete Fourier transform
of f with an O(n log(n)) computational cost. New contributions are described in
Sects. 5 and 6. Initially, a mapping from N0 to the space of wavenumbers, Zd , is
defined according to constraints given by the structure of our rank-1 lattice node
sets. With this mapping, we define a set of integrands for which our new adaptive
algorithm is designed. This set is defined in terms of cone conditions satisfied by
the (true) Fourier coefficients of the integrands. These conditions make it possible
to derive an upper bound on the rank-1 lattice rule error in terms of the discrete
Fourier coefficients, which can be used to construct an adaptive algorithm. An upper
bound on the computational cost of this algorithm is derived. Finally, there is an
example of option pricing using the MATLAB implementation of our algorithm,
cubLattice_g, which is part of the Guaranteed Automatic Integration Library
[1]. A parallel development for Sobol cubature is given in [5].

2 Rank-1 Integration Lattices


Let b be prime number, and let Fn := {0, . . . , n 1} denote the set of the first n nonnegative integers for any n N. The aim is to construct a sequence of embedded
node sets with bm points for m N0 :
{0} =: P0 P1 Pm := {z i }iFbm P := {z i }iN0 .
Specifically, the sequence z 1 , z b , z b2 , . . . [0, 1)d is chosen such that
z 1 = b1 a0 ,
1

z bm = b (z bm1 + am ) = b

a0 {1, . . . , b 1}d ,

am + + b

m1

a0 ,

am

(1a)
Fdb ,

m N. (1b)

Adaptive Multidimensional Integration Based on Rank-1 Lattices

409

From this definition it follows that for all m N0 ,



z bm ,  = 0, . . . , m

b z bm mod 1 =
0,
 = m + 1, m + 2, . . . .

(2)

Next, for any i N with proper b-ary expansion i = i 0 + i 1 b + i 2 b2 + , and m =


logb (i) + 1 define
z i :=


=0

i  z b mod 1 =

m1


i  z b mod 1 =

=0

m1


i  bm1 z bm1 mod 1

=0

= j z bm1 mod 1,

where j =

m1


i  bm1 , (3)

=0

where (2) was used. This means that node set Pm defined above may be written as
the integer multiples of the generating vector z bm1 since


m1

Pm := {z i }iFbm = z bm1
i  bm1 mod 1 : i 0 , . . . , i m1 Fb
=0

= { j z bm1 mod 1} jFbm .


Integration lattices, L , are defined as discrete groups in Rd containing Zd and
closed under normal addition [13, Sects. 2.7 and 2.8]. The node set of an integration
lattice is its intersection with the half-open unit cube, P := L [0, 1)d . In this case,
P is also a group, but this time under addition modulo 1, i.e., operator : [0, 1)d
[0, 1)d [0, 1)d defined by x y := (x + y) mod 1, and where x := 1 x.
Sets Pm defined above are embedded node sets of integration lattices. The sufficiency of a single generating vector for each of these Pm is the reason that Pm is
called the node set of a rank-1 lattice. The theoretical properties of good embedded
rank-1 lattices for cubature are discussed in [6].
The set of d-dimensional integer vectors, Zd , is used to index Fourier series
expressions for the integrands, and Zd is also known as the wavenumber space.
We define the bilinear operation , : Zd [0, 1)d [0, 1) as the dot product
modulo 1:
k, x := k T x mod 1

k Zd , x [0, 1)d .

(4)

This bilinear operation has the following properties: for all t, x [0, 1)d , k, l Zd ,
and a Z, it follows that
k, 0 = 0, x = 0,
k, ax mod 1 t = (a k, x + k, t ) mod 1

(5a)
(5b)

410

Ll.A. Jimnez Rugama and F.J. Hickernell

ak + l, x = (a k, x + l, x ) mod 1,
k, x = 0 k Z

(5c)

= x = 0.

(5d)

An additional constraint placed on the embedded lattices is that


k, z bm = 0 m N0 = k = 0.

(6)

The bilinear operation defined in (4) is also used to define the dual lattice corresponding to Pm :
Pm := {k Zd : k, z i = 0, i Fbm }
= {k Zd : k, z bm1 = 0}

by (3) and (5b).

(7)

By this definition P0 = Zd , and the properties (2), (4), and (6), imply also that the
Pm are nested subgroups with

= {0}.
Zd = P0 Pm P

(8)

Analogous to the dual lattice definition, for j Fbm one can define the dual


cosets as Pm, j := {k Zd : bm k, z bm1 = j}. Hence, a similar extended property (8)
applies:
Pm, j =

b1

, j+abm

Pm+1

, j+abm

= Pm, j Pm+1

, a Fb , j Fbm .

(9)

a=0
, j+abm b1
}a=0

The overall dual cosets structure can be represented as a tree, where {Pm+1
, j
are the children of Pm .

(a)

(b)

20
15

0.8

10
0.6

0.4

0
10

0.2

15
0
0

0.2

0.4

0.6

0.8

20
20

10

10

20

Fig. 1 Plots of a the node set P6 depicted as {z 0 , z 1 }, {z 2 , z 3 }, {z 4 , . . . , z 7 }, {z 8 , . . . , z 15 },


+{z 16 , . . . , z 31 }, {z 32 , . . . , z 63 }, and b some of the dual lattice points, P6 [20, 20]2

Adaptive Multidimensional Integration Based on Rank-1 Lattices

411

Figure 1 shows an example of a rank-1 lattice node set with 64 points in dimension
2 and its dual lattice. The parameters defining this node set are b = 2, m = 6, and
z 32 = (1, 27)/64. It is useful to see how Pm = Pm1 {Pm1 + z 2m1 mod 1}.

3 Fourier Series
The integrands considered here are absolutely continuous periodic functions. If the
integrand is not initially periodic, it may be periodized as discussed in [4, 12], or
[13, Sect. 2.12]. More general box domains may be considered, also by using variable
transformations, see e.g., [7, 8].

The L 2 ([0, 1)d ) inner product is defined as f, g 2 = [0,1)d f (x)g(x) dx. The

complex exponential functions, {e2 1 k, }kZd form a complete orthonormal basis


for L 2 ([0, 1)d ). So, any function in L 2 ([0, 1)d ) may be written as its Fourier series
as


f (x) =
(10)
f(k)e2 1 k,x , where f(k) = f, e2 1 k, ,
2

kZd

and the inner product of two functions in L 2 ([0, 1)d ) is the 2 inner product of their
series coefficients:






f, g 2 =

=: f(k) kZd , g(k)

.
f(k)g(k)
d
kZ
2

kZd

Note that for any z Pm and k Pm , we have e2 1 k,z = 1. The special


group structure of the lattice node set, Pm , leads to a useful formula for the average
of any Fourier basis function over Pm . According to [10, Lemma 5.21],

bm 1
1  2 1 k,zi
1, k Pm
e
= 1Pm (k) =
m
b i=0
0, k Zd \ Pm .

(11)

This property of the dual lattice is used below to describe the absolute error of a
shifted rank-1 lattice cubature rule in terms of the Fourier coefficients for wavenumbers in the dual lattice. For fixed [0, 1)d , the cubature rule is defined as
b 1
1 
Im ( f ) := m
f (z i ),
b i=0
m

m N0 .

(12)




Note from this definition that Im e2 1 k, = e2 1 k, 1Pm (k). The series


decomposition defined in (10) and Eq. (11) are used in intermediate results from

412

Ll.A. Jimnez Rugama and F.J. Hickernell

[10, Theorem 5.23] to show that,







[0,1)d

 
  

f (x) dx Im ( f ) = 

f(k)e2





1 k, 

kPm \{0}



 
 f (k) . (13)

kPm \{0}

4 The Fast Fourier Transform for Function Values


at Rank-1 Lattice Node Sets
Adaptive Algorithm 1 (cubLattice_g) constructed in Sect. 6 has an error analysis
based on the above expression. However, the true Fourier coefficients are unknown
and they must be approximated by the discrete coefficients, defined as:



fm (k) := Im e2 1 k, f ()




2
1 k,
2
1 l,
f(l)e
= Im e
=

lZd



f(l) Im e2 1 lk,

lZd

f(l)e2

1 lk,

lZd

(14a)

f(k + l)e2

lPm

= f(k) +

1Pm (l k)

1 l,

f(k + l)e2

1 l,

k Zd .

(14b)

lPm \{0}

Thus, the discrete transform fm (k) equals the integral transform f(k), defined in
(10), plus aliasing terms corresponding to f(k + l) scaled by the shift, , where
l Pm \ {0}.
To facilitate the calculation of fm (k), we define the map 
m : Zd Fbm as follows:

0 (k) := 0,


m (k) := bm k, z bm1 , m N.
, j

(15)

A simple but useful remark is that Pm corresponds to all k Zd such that


m (k) =
j for j Fbm . The above definition implies that k, z i appearing in fm (k), may be
written as

Adaptive Multidimensional Integration Based on Rank-1 Lattices


k, z i = k,

m1



i  z b mod 1 =

=0

m1


413

i  k, z b mod 1

=0

m1


i 
+1 (k)b1 mod 1. (16)

=0

The map 
m depends on the choice of the embedded rank-1 lattice node sets
defined in (1) and (3). We can confirm that the right hand side of this definition lies
in Fbm by appealing to (1) and recalling that the a are integer vectors:
bm k, z bm1 = bm [(b1 k T am1 + + bm k T a0 ) mod 1]
= (bm1 k T am1 + + k T a0 ) mod bm Fbm , m N.
Moreover, note that for all m N

m+1 (k) 
m (k) = bm+1 k, z bm bm k, z bm1
= bm [b k, z bm k, z bm1 ]
= bm [a + k, bz bm mod 1 k, z bm1 ], for some a Fb
= bm [a + k, z bm1 k, z bm1 ], by (2)
= abm for some a Fb .

(17)

For all N0 with proper b-ary expansion = 0 + 1 b + N0 , let m denote


the integer obtained by keeping only the first m terms of its b-ary expansion, i.e.,
m := 0 + + m1 bm1 = [(bm ) mod 1]bm Fbm

(18)

The derivation in (17) means that if 


m (k) = Fbm , then

 (k) =  ,

 = 1, . . . , m.

(19)

Letting yi := f (z i ) for i N0 and considering (16), the discrete Fourier


transform defined in (14a) can now be written as follows:
bm 1



1  2 1 k,zi
2 1 k,

f m (k) := Im e
f () = m
e
yi
b i=0

= e2

1 k,

Ym (
m (k)),

m N0 , k Zd ,

(20)

where for all m, N0 ,




b1
b1


m1
1 
Ym () := m

yi0 ++im1 bm1 exp 2 1


i  +1 b1
b i =0
i =0
=0
m1

= Ym ( m ).

414

Ll.A. Jimnez Rugama and F.J. Hickernell

The quantity Ym (), Fbm , which is essentially the discrete Fourier transform, can
be computed efficiently via some intermediate quantities. For p {0, . . . , m 1},
m, N0 define Ym,0 (i 0 , . . . , i m1 ) := yi0 ++im1 bm1 and let
Ym,m p (, i m p , . . . , i m1 )
:=

b1


1
bm p

i m p1 =0

b1


m p1

yi0 ++im1 bm1 exp 2 1

i 0 =0

i  +1 b1 .

=0

Note that Ym,m p (, i m p , . . . , i m1 ) = Ym,m p ( m p , i m p , . . . , i m1 ), and thus


takes on only bm distinct values. Also note that Ym,m () = Ym (). For p = m
1, . . . , 0, compute
Ym,m p (, i m p , . . . , i m1 )
=

b1


1
bm p
1
b

i m p1 =0

b1


b1


p1

m
1

yi0 ++im1 bm1 exp 2 1


i  +1 b

i 0 =0

=0

Ym,m p1 (, i m p1 , . . . , i m1 ) exp 2 1i m p1 m p bm+ p .

i m p1 =0

For each p one must perform O(bm ) operations, so the total computational cost to
obtain Ym () for all Fbm is O(mbm ).

5 Error Estimation
As seen in Eq. (13), the absolute error is bounded by a sum of the absolute value of
the Fourier coefficients in the dual lattice. Note that increasing the number of points
in our lattice, i.e. increasing m, removes wavenumbers from the set over which this
summation is defined. However, it is not obvious how fast is this error decreasing
with respect to m. Rather than deal with a sum over the vector wavenumbers, it is
more convenient to sum over scalar non-negative integers. Thus, we define another
mapping k : N0 Zd .

Definition 1 Given a sequence of points in embedded lattices, P = {z i }i=0


define
k : N0 Zd one-to-one and onto recursively as follows:

Set k(0)
=0
For m N0
For Fbm ,

Let a Fb be such that 


m+1 ( k())
=
m ( k())
+ abm .
m
d

(i) If a = 0, choose k( + ab ) {k Z : 
m+1 (k) = 
m ( k())}.
 m
d

(ii) Choose k( + a b ) {k Z : 
m+1 (k) = 
m ( k()) + a  bm },
for a  {1, . . . , b 1}\{a}.

Adaptive Multidimensional Integration Based on Rank-1 Lattices

415

Definition 1 is intended to reflect the embedding of the dual cosets described in (8)
, j+abm

= j. In (i), if k()
Pm+1
with a > 0,
and (9). For clarity, consider 
m ( k())
, j
m

we choose k( + ab ) Pm+1 . Otherwise by (ii), we simply choose k( + a  bm )


, j+a  bm
, j
. Condition (i) forces us to pick wavenumbers in Pm+1 .
Pm+1
This mapping is not uniquely defined and one has the flexibility to choose part
of it. For example, defining a norm such as in [13, Chap. 4] one can assign smaller
values of to smaller wavenumbers k. In the end, our goal is to define this mapping

such that f( k())


0 as . In addition, it is one-to-one since at each step the
+ a  bm ) are chosen from sets of wavenumbers that

new values k( + abm ) or k(

exclude those wavenumbers already assigned to k().


The mapping can be made
onto by choosing the smallest wavenumber in some sense.

m+1 (k) = 
m ( k())
+
It remains to be shown that for any Fbm , {k Zd : 
 m


a b } is nonempty for all a Fb with a = a. Choose l such that l, z 1 = b1 .
This is possible because z 1 = b1 a0 = 0. For any m N0 , Fbm , and a  Fb ,
note that


k()
+ a  bm l, z bm = k(),
by (5c)
z bm + a  bm l, z bm mod 1



= [bm1
m+1 ( k())
+ a  l, bm z bm mod 1 ] mod 1
by (5b) and (15)


m ( k())
+ ab1 + a  l, z 1 ] mod 1

= [b

m1

= [b

m1

by (2)


m ( k())
+ (a + a )b ] mod 1,


Then it follows that

+ a  bm l) = 
m ( k())
+ (a + a  mod b)bm

m+1 ( k()

by (15).

By choosing a  such that a  = (a + a  mod b), we have shown that the set Fbm ,

m+1 (k) = 
m ( k())
+ a  bm } is nonempty.
{k Zd : 
To illustrate the initial steps of a possible mapping, consider the lattice in Fig. 1

and Table 1. For m = 0, {0} and a = 0. This skips i) and implies k(1)
{k

1 (k) = 2 k, (1, 27)/2 = 1}, so one may choose k(1)


:= (1, 0). After that,
Zd : 
m = 1 and {0, 1}. Starting with = 0, again a = 0 and we jump to ii) where

2 (k) = 4 k, (1, 27)/4 = 2} and thus, we can take


we require k(2)
{k Zd : 

k(2) := (1, 1). When = 1, we note that 


=
((1, 0)) = 3. Here a = 1
2 ( k(1))
d

2 (k) = 1}, so we may choose k(3)


:= (1, 0).
leading to i) and k(3) {k Z : 

Continuing, we may take k(4) := (1, 1), k(5) := (0, 1), k(6) := (1, 1) and

k(7)
:= (0, 1).
Lemma 1 The map in Definition 1 has the property that for m N0 and Fbm ,

+ bm )} = {l Zd : k()
l Pm }.
{ k(
=0

416

Ll.A. Jimnez Rugama and F.J. Hickernell

Table 1 Values 
1 , 
2 and 
3 for some wavenumbers and a possible assignment of k()

k()


1 ( k())
=


(
k())
=


(
k())
=
2
3


2 k(),
4 k(),
8 k(),
(1, 27)/2
(1, 27)/4
(1, 27)/8
(0, 0)
(1, 1)
(1, 1)
(1, 1)
(1, 0)
(1, 0)
(0, 1)
(0, 1)
(1, 1)

0
4
2
6
1
3
7
5

0
0
0
0
1
1
1
1
0

0
0
2
2
3
1
1
3
0

0
4
2
6
7
1
5
3
4

The reader should notice that 


m+1 ( k())

m ( k())
is either 0 or 2m

Proof This statement holds trivially for m = 0 and = 0. For m N it is noted that
by (7)
k l Pm k l, z bm1 = 0
by (5c)
k, z bm1 = l, z bm1
bm
m (k) = bm
m (l)

m (k) = 
m (l).

by (15)
(21)

This implies that for all m N and Fbm ,

m (l) = 
m ( k())}
= {l Zd : k()
l Pm }.
{l Zd : 

(22)

By Definition 1 it follows that for m N and Fbm ,

+ bm )}b1 {k Zd : 
m+1 (k) = 
m ( k())
+ abm , a Fb }
{ k(
=0

= {k Zd : 
m (k) = 
m ( k())}.
Applying property (19) on the right side,
 ))},
+ bm )}b1 {k Zd : 
 (k) = 
 ( k(
{ k(
=0

 = 1, . . . , m.

Because one can say the above equation holds  = 1, . . . , n < m, the left hand side
can be extended,
+ bm )} {k Zd : 

{ k(
m (k) = 
m ( k())}.
=0

(23)

Adaptive Multidimensional Integration Based on Rank-1 Lattices

417

Now suppose that l is any element of {k Zd : 


m (k) = 
m ( k())}.
Since the
 ). Choose  such that
map k is onto, there exists some  N0 such that l = k(
 =  m +  bm , where the overbar notation was defined in (18). According to (23) it
 m )) = 
 m +  bm )) = 

follows that 
m ( k(
m ( k(
m (l) = 
m ( k()).
Since  m and

+ bm )} . Thus,
are both in Fbm , this implies that m = , and so l { k(
=0
+ bm )} {k Zd : 

{ k(
m (k) = 
m ( k())},
and the lemma is proved.

=0

and fm, := fm ( k()).


For convenience we adopt the notation f := f( k())
Then, by Lemma 1 the error bound in (13) may be written as





 



 m
f (x) dx Im ( f )
 f b  ,

[0,1)d

(24)

=1

and the aliasing relationship in (14b) becomes


fm, = f +

2
f+bm e

1 k(+b
) k(),

(25)

=1

Given an integrand with absolutely summable Fourier coefficients, consider the


following sums defined for , m N0 ,  m:
Sm ( f ) =

m
b
1

 
 f ,


b
1


S,m ( f ) =

=bm1 




 f+bm ,

=b1  =1


 
 f ,
Sqm ( f ) = 
S0,m ( f ) + + 
Sm,m ( f ) =
=bm


S,m ( f ) =


b
1



 fm, .

=b1 

Note that 
S,m ( f ) is the only one that can be observed from data because it
involves
the
coefficients. In fact, from (20) one can identify

  discrete transform

 and our adaptive algorithm will be based on this sum bound fm,  = Ym (

m ( k()))
ing the other three, Sm ( f ), 
S,m ( f ), and Sqm ( f ), which cannot be readily observed.
and be some bounded non-negative
Let  N be some fixed integer and 
valued functions. We define a cone, C , of absolutely continuous functions whose
Fourier coefficients decay according to certain inequalities:
S,m ( f ) 
(m ) Sqm ( f ),  m,
C := { f AC([0, 1)d ) : 
Sqm ( f ) (m

)S ( f ),   m}. (26)

=
We also require the existence of r such that 
(r )(r
) < 1 and that limm (m)
0. This set is a cone, i.e. f C = a f C a R, but it is not convex. A wider
discussion on the advantages and disadvantages of designing numerical algorithms
for cones of functions can be found in [2].

418

Ll.A. Jimnez Rugama and F.J. Hickernell

Fig. 2 The magnitudes of


true Fourier coefficients for
some integrand

Functions in C have Fourier coefficients that do not oscillate wildly. According


to (24), the error of our integration is bounded by 
S0,m ( f ). Nevertheless, for practical purposes we will use S ( f ) as an indicator for the error. Intuitively, the cone
conditions enforce these two sums to follow a similar trend. Thus, one can expect
S0,m ( f ).
that small values of S ( f ) imply small values of 
The first inequality controls how an infinite sum of some of the larger wavenumber
coefficients are bounded above by a sum of all the surrounding coefficients. The
second inequality controls how the sum of these surrounding coefficients is bounded
above by a finite sum of some smaller wavenumber Fourier coefficients. In Fig. 2 we
S0,12 ( f ). The
can see how S8 ( f ) can be used to bound Sq12 ( f ) and Sq12 ( f ) to bound 
former sum also corresponds to the error bound in (24).
For small  the sum S ( f ) includes only a few summands. Therefore, it could accidentally happen that S ( f ) is too small compared to Sqm ( f ). To avoid this possibility,
the cone definition includes the constraint that  is greater than some minimum  .
Because we do not assume the knowledge of the true Fourier coefficients, for
functions in C we need bounds on S ( f ) in terms of the sum of the discrete coefficients 
S,m ( f ). This is done by applying (25), and the definition of the cone in
(26):
S ( f ) =


b
1

=b1 


b
1


b
1

 
 f  =

=b1 



 fm,  +

=b1 

b
1








m

2 1 k(+b
) k(),


f+bm e

 f m,


=1




 f+bm  = 
S,m ( f ) + 
S,m ( f )

=b1  =1

(m )(m

)S ( f )

S,m ( f ) + 
and provided that 
(m )(m

) < 1,

(27)

Adaptive Multidimensional Integration Based on Rank-1 Lattices

S ( f )


S,m ( f )
.
1
(m )(m

)

419

(28)

By (24) and the cone conditions, (28) implies a data-based error bound:





[0,1)d

 


 fbm  = 

S0,m ( f ) 
f (x) dx Im ( f )
(m) Sqm ( f )
=1


(m)(m

)S ( f )

(m)(m

)


S,m ( f ).
1
(m )(m

)

(29)

In Sect. 6 we construct an adaptive algorithm based on this conservative bound.

6 An Adaptive Algorithm Based for Cones of Integrads


Inequality (29) suggests the following algorithm. First, choose  and fix r := m
 N such that 
(r )(r
) < 1 for   . Then, define
C(m) :=


(m)(r
)
.
1
(r )(r
)

The choice of the parameter r is important. Larger r means a smaller C(m), but it
also makes the error bound more dependent on smaller indexed Fourier coefficients.
Algorithm 1 (Adaptive Rank-1 Lattice Cubature, cubLattice_g) Fix r and  ,

and describing C in (26). Given a tolerance, , initialize m =  + r and do:
Step 1. According to Sect. 4, compute 
Smr,m ( f ).

Step 2. Check whether C(m) Smr,m ( f ) . If true, return Im ( f ) defined in (12).
If not, increment m by one, and go to Step 1.
Theorem 1 For m = min{m   + r : C(m  )
Sm  r,m  ( f ) }, Algorithm 1 is successful whenever f C ,





[0,1)d



f (x)dx Im ( f ) .

Thus, the number of function data needed is bm . Defining m = min{m   + r :

(r )(r
)]Sm  r ( f ) }, we also have bm bm . This means that the
C(m  )[1 + 
computational cost can be bounded,


Im , f, $( f )bm + cm bm
cost 
where $( f ) is the cost of evaluating f at one data point.

420

Ll.A. Jimnez Rugama and F.J. Hickernell

Proof By construction, the algorithm must be successful. Recall that the inequality
used for building the algorithm is (29).
To find the upper bound on the computational cost, an argument similar to (27) provides
$$\tilde{S}_{\ell,m}(f) = \sum_{\kappa=\lceil b^{\ell-1}\rceil}^{b^\ell-1} |\tilde{f}_{m,\kappa}| = \sum_{\kappa=\lceil b^{\ell-1}\rceil}^{b^\ell-1} \Big|\hat{f}_\kappa + \sum_{\lambda=1}^{\infty} \hat{f}_{\kappa+\lambda b^m}\, e^{2\pi\sqrt{-1}\,(\tilde{k}(\kappa+\lambda b^m)-\tilde{k}(\kappa))\cdot\Delta}\Big|$$
$$\le \sum_{\kappa=\lceil b^{\ell-1}\rceil}^{b^\ell-1} |\hat{f}_\kappa| + \sum_{\kappa=\lceil b^{\ell-1}\rceil}^{b^\ell-1}\sum_{\lambda=1}^{\infty} |\hat{f}_{\kappa+\lambda b^m}| = S_\ell(f) + \hat{S}_{\ell,m}(f) \le [1+\hat{\omega}(m-\ell)\,\mathring{\omega}(m-\ell)]\, S_\ell(f).$$
Replacing $\tilde{S}_{\ell,m}(f)$ in the error bound in (29) by the right hand side above proves that the choice of $m$ needed to satisfy the tolerance is no greater than $\bar{m}$ defined above.
In Sect. 4, the computation of $\tilde{S}_{m-r,m}(f)$ is described in terms of $O(m b^m)$ operations. Thus, the total cost of Algorithm 1 is
$$\mathrm{cost}(I_m, f, \varepsilon) \le \$(f)\, b^{\bar{m}} + c\, \bar{m}\, b^{\bar{m}}.$$

7 Numerical Example
Algorithm 1 has been coded in MATLAB as cubLattice_g in base 2, and is part of GAIL [1]. To test it, we priced an Asian call option under geometric Brownian motion with $S_0 = K = 100$, $T = 1$ and $r = 3\,\%$. The test is performed on 500 samples whose dimensions are chosen IID uniformly among 1, 2, 4, 8, 16, 32, and 64, and the volatility also IID uniformly from 10 to 70 %. Results, in Fig. 3, show 97 % success in meeting the error tolerance.
The algorithm cone parametrization was $\ell_* = 6$, $r = 4$ and $C(m) = 5 \cdot 2^{-m}$. In addition, each replication used a shifted lattice with a random shift $\Delta \sim U(0,1)^d$. However, results are strongly dependent on the generating vector that was used for creating the rank-1 lattice embedded node sets. The vector applied to this example was found with the latbuilder software from Pierre L'Ecuyer and David Munger [9], obtained for $2^{26}$ points, $d = 250$ and coordinate weights $\gamma_j = j^{-2}$, optimizing the $P_2$ criterion.
For this particular example, the choice of $C(m)$ does not have a noticeable impact on the success rate or execution time. In other cases, such as discontinuous functions, it is more sensitive.

Being an adaptive algorithm, if the Fourier coefficients decrease quickly, the cone conditions have a weaker effect. One can see that the number of summands involved in $\tilde{S}_{m-r,m}(f)$ is $2^{m-r-1}$ for a fixed $r$. Thus, in order to give a uniform weight to each wavenumber, we chose $C(m)$ proportional to $2^{-m}$.

Fig. 3 Empirical distribution functions obtained from 500 samples, for the error (continuous line) and time in seconds (dash-dotted line). Quantiles are specified on the right and top axes respectively. The tolerance of 0.02 (vertical dashed line) is an input of the algorithm and will be a guaranteed bound on the error if the function lies inside the cone

8 Discussion and Future Work


Quasi-Monte Carlo methods rarely provide guaranteed adaptive algorithms. This
new methodology that bounds the absolute error via the discrete Fourier coefficients
allows us to build an adaptive automatic algorithm guaranteed for cones of integrands. The non-convexity of the cone allows our adaptive, nonlinear algorithm to
be advantageous in comparison with non-adaptive, linear algorithms.
Unfortunately, the definition of the cone does contain parameters, $\hat{\omega}$ and $\mathring{\omega}$, whose optimal values may be hard to determine. Moreover, the definition of the cone does not yet correspond to traditional sets of integrands, such as Korobov spaces. These topics deserve further research.
Concerning the generating vector used in Sect. 7, some further research should be
carried out to specify the connection between dimension weights and cone parameters. This might lead to the existence of optimal weights and generating vector.
Our algorithm provides an upper bound on the complexity of the problem, but
we have not yet obtained a lower bound. We are also interested in extending our
algorithm to accommodate a relative error tolerance. We would like to understand
how the cone parameters might depend on the dimension of the problem, and we
would like to extend our adaptive algorithm to infinite dimensional problems via
multi-level or multivariate decomposition methods.


Acknowledgments The authors thank Ronald Cools and Dirk Nuyens for organizing MCQMC
2014 and greatly appreciate the suggestions made by Sou-Cheng Choi, Frances Kuo, Lan Jiang,
Dirk Nuyens and Yizhi Zhang to improve this manuscript. In addition, the first author also thanks
Art B. Owen for partially funding traveling expenses to MCQMC 2014 through the US National
Science Foundation (NSF). This work was partially supported by NSF grants DMS-1115392, DMS-1357690, and DMS-1522687.

References
1. Choi, S.C.T., Ding, Y., Hickernell, F.J., Jiang, L., Jimnez Rugama, Ll.A., Tong, X., Zhang,
Y., Zhou, X.: GAIL: Guaranteed Automatic Integration Library (versions 1.02.1). MATLAB
software. https://github.com/GailGithub/GAIL_Dev (20132015)
2. Clancy, N., Ding, Y., Hamilton, C., Hickernell, F.J., Zhang, Y.: The cost of deterministic,
adaptive, automatic algorithms: cones, not balls. J. Complex. 30(1), 2145 (2014)
3. Dick, J., Kuo, F., Sloan, I.H.: High dimensional integration the Quasi-Monte Carlo way. Acta
Numer. 22, 133288 (2013)
4. Hickernell, F.J.: Obtaining O(N^{-2+ε}) convergence for lattice quadrature rules. In: Fang, K.T.,
Hickernell, F.J., Niederreiter, H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2000,
pp. 274289. Springer, Berlin (2002)
5. Hickernell, F.J., Jimnez Rugama, Ll.A.: Reliable adaptive cubature using digital sequences.
In: Cools, R., Nuyens, D., (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2014, vol. 163,
pp. 367383. Springer, Heidelberg (2016)
6. Hickernell, F.J., Niederreiter, H.: The existence of good extensible rank-1 lattices. J. Complex.
19, 286300 (2003)
7. Hickernell, F.J., Sloan, I.H., Wasilkowski, G.W.: On tractability of weighted integration over
bounded and unbounded regions in Rs . Math. Comput. 73, 18851901 (2004)
8. Hickernell, F.J., Sloan, I.H., Wasilkowski, G.W.: The strong tractability of multivariate integration using lattice rules. In: Niederreiter, H. (ed.) Monte Carlo and Quasi-Monte Carlo Methods
2002, pp. 259273. Springer, Berlin (2004)
9. LEcuyer, P., Munger, D.: Algorithm xxx: A general software tool for constructing rank-1 lattice
rules. ACM Trans. Math. Softw. (2016). To appear, http://www.iro.umontreal.ca/~lecuyer/
myftp/papers/latbuilder.pdf
10. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. CBMS-NSF
Regional Conference Series in Applied Mathematics. SIAM, Philadelphia (1992)
11. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems Volume II: Standard Information for Functionals. No. 12 in EMS Tracts in Mathematics. European Mathematical Society,
Zrich (2010)
12. Sidi, A.: A new variable transformation for numerical integration. In: Brass, H., Hmmerlin,
G. (eds.) Numerical Integration IV, No. 112 in International Series of Numerical Mathematics,
pp. 359373. Birkhuser, Basel (1993)
13. Sloan, I.H., Joe, S.: Lattice Methods for Multiple Integration. Oxford University Press, Oxford
(1994)

Path Space Filtering


Alexander Keller, Ken Dahm and Nikolaus Binder

Abstract We improve the efficiency of quasi-Monte Carlo integro-approximation by using weighted averages of samples instead of the samples themselves. The proposed deterministic algorithm is constructed such that it converges to the solution of the given integro-approximation problem. The improvements and wide applicability of the consistent method are demonstrated by visual evidence in the setting of light transport simulation for photorealistic image synthesis, where the weighted averages correspond to locally smoothed contributions of path space samples.

Keywords Transport simulation · Integro-approximation · Photorealistic image synthesis · Rendering

1 Introduction
Modeling with physical entities like cameras, light sources, and materials on top of
a scene surface stored in a computer, light transport simulation may deliver photorealistic images. Due to complex discontinuities and the curse of dimension, analytic
solutions are out of reach. Thus simulation algorithms have to rely on sampling path
space and summing up the contributions of light transport paths that connect camera sensors and light sources. Depending on the complexity of the modeled scene,
the inherent noise of sampling may vanish only slowly with the progression of the
computation.
This noise may be efficiently reduced by smoothing the contribution of light transport paths before reconstructing the image. So far, intermediate approximations were

computed for this purpose. However, removing frequent persistent visual artifacts
due to insufficient approximation then forces simulation from scratch. In addition,
optimizing the numerous parameters of such methods in order to increase efficiency
has been challenging.
We therefore propose a simple and efficient deterministic algorithm that has fewer
parameters. Furthermore, visual artifacts are guaranteed to vanish by progressive
computation and the consistency of the scheme, which in addition overcomes tedious
parameter tuning. While the algorithm unites the advantages of previous work, it also
provides the desired noise reduction as shown by many practical examples.

1.1 Light Transport Simulation by Connecting


Path Segments
Just following photon trajectories and counting photons incident on the camera is
hopelessly inefficient. Therefore light transport paths are sampled by following both
photon trajectories from the lights and tracing paths from the camera aiming to
connect both classes of path segments by proximity and shadow rays [3, 6, 30].
Instead of (pseudo-) random sampling, we employ faster quasi-Monte Carlo methods [22], which for the context of computer graphics are reviewed in [11]. The extensive survey provides all algorithmic building blocks for generating low discrepancy
sequences and in depth explains how to transform them into light transport paths. For
the scope of our article, it is sufficient to know that quasi-Monte Carlo methods in
computer graphics use deterministic low discrepancy sequences to generate path segments. Other than (pseudo-) random sequences, such low discrepancy sequences lack
independence, however, are much more uniformly distributed. In order to generate
light transport path segments, the components of the ith vector of a low discrepancy
sequence are partitioned into two sets (for example by separating the odd and even
components), which then are used to trace the ith camera and light path segment.
Such path segments usually are started by using two components to select an origin
on an area, as for example a light source, and then selecting a direction by two more
components to trace a ray. At the first point of intersection with the scene surface,
another component may be used to decide on path termination, otherwise, the next
two components are used to determine a direction of scattering to trace the next ray,
repeating the procedure.
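As an illustration of the preceding paragraph, the following Python sketch consumes the components of one low discrepancy vector to build a path segment. The scene interface (sample_light_surface, sample_direction, intersect) and the record fields are hypothetical placeholders for whatever a concrete renderer provides.

```python
def trace_segment(point, scene, max_depth=8):
    """Sketch: turn one low discrepancy vector `point` (components in [0,1))
    into a path segment as outlined in the text."""
    dims = iter(point)
    # two components select an origin on an area, e.g. a light source
    origin = scene.sample_light_surface(next(dims), next(dims))
    # two more components select the initial direction
    direction = scene.sample_direction(origin, next(dims), next(dims))
    vertices = [origin]
    for _ in range(max_depth):
        hit = scene.intersect(origin, direction)
        if hit is None:
            break
        vertices.append(hit)
        # one component decides on path termination ...
        if next(dims) > hit.continuation_probability:
            break
        # ... and two components determine the scattering direction
        direction = scene.sample_direction(hit, next(dims), next(dims))
        origin = hit
    return vertices
```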
As illustrated in Fig. 1a, one way to establish a light transport path is by means
of a shadow ray, testing whether both end points of two path segments are mutually
visible. While shadow rays work fine for mostly diffuse surfaces, they may become
inefficient for light transport paths that include specular-diffuse-specular segments
as for example light that is reflected by a mirror onto a diffuse surface and reflected
back by the mirror. To overcome this problem of insufficient techniques [15, Fig. 2],
connecting photon trajectories to camera path segments by proximity, which is called
photon mapping [7], aims to efficiently capture contributions that shadow rays fail
on.

(a) connecting path segments by shadow rays and proximity (b) path space filtering

Fig. 1 Illustration of connecting path segments in light transport simulation: a Segments of light transport paths are generated by following photon trajectories from the light source L and tracing paths from the camera. End points of path segments are then connected either if they are mutually visible (dashed line, shadow ray) or if they are sufficiently close (indicated by the dashed circle). b Complementary to these connection techniques, path space filtering is illustrated by the green part of the schematic: The contribution $c_i$ to the vertex $x_i$ of a light transport path is replaced by a smoothed contribution $\bar{c}_i$ resulting from averaging contributions $c_{s_i+j}$ to vertices inside the ball $B(n)$. This averaged contribution $\bar{c}_i$ is then multiplied by the throughput $\tau_i$ of the path segment towards the camera and accumulated on the image plane P. In order to guarantee a consistent algorithm, the radius $r(n)$ of the ball $B(n)$ must vanish with an increasing number $n$ of samples

Photon mapping connects end points of path segments that are less than a specified radius apart. Decreasing such a radius r (n) with the increasing number n of
sampled light transport paths as introduced by progressive photon mapping [5], the
scheme became consistent: In the limit it in fact becomes equivalent to shadow ray
connections. A consistent and numerically robust quasi-Monte Carlo method for
progressive photon mapping has been developed in [12], while the references in this
article reflect the latest developments in photon mapping as well. Similar to stochastic progressive photon mapping [4], the computation is processing consecutive
batches of light transport paths. Depending on the low discrepancy sequence used,
some block sizes are preferable over others and we stick to integer block sizes of the form $b^m$ as derived in [12]. Note that $b$ is fixed by the underlying low discrepancy sequence.

2 Consistent Weighted Averaging


Already in [20, 21] it has been shown that a very sparse set of samples may provide sufficient information for high quality image synthesis. Progressive path space
filtering is a new simpler, faster, and consistent variance reduction technique that is
complementary to shadow rays and progressive photon mapping.


Considering the $i$th out of a current total of $n$ light transport paths, selecting a vertex $x_i$ suitable for filtering the radiance contribution $c_i$ of the light path segment towards $x_i$ also determines the throughput $\tau_i$ along the path segment towards the camera (see Fig. 1b). While any or even multiple vertices of a light transport path may be selected, a simple and practical choice is the first vertex along the path from the camera whose optical properties are considered sufficiently diffuse.
As mentioned before, one low discrepancy sequence is transformed to sample path space in contiguous batches of $b^m \in \mathbb{N}$ light transport paths, where for each path one selected tuple $(x_i, \tau_i, c_i)$ is stored for path space filtering. As the memory consumption is proportional to the batch size $b^m$ and given the size of the tuples and the maximum size of a memory block, it is straightforward to determine the maximum natural number $m$.
Processing the batch of $b^m$ paths starting at index $s_i := \lfloor i/b^m \rfloor\, b^m$, the image is formed by accumulating $\tau_i \bar{c}_i$, where
$$\bar{c}_i := \frac{\sum_{j=0}^{b^m-1} \chi_{B(n)}(x_{s_i+j} - x_i)\, w_{i,j}\, c_{s_i+j}}{\sum_{j=0}^{b^m-1} \chi_{B(n)}(x_{s_i+j} - x_i)\, w_{i,j}} \qquad (1)$$
is the weighted average of the contributions $c_{s_i+j}$ of all vertices $x_{s_i+j}$ in a ball $B(n)$ of radius $r(n)$ centered in $x_i$, normalized by the sum of weights $w_{i,j}$, as illustrated in Fig. 1. While the weights will be detailed in Sect. 2.1, for the moment it is sufficient to postulate $w_{i,i} \neq 0$.
Centered in $x_i$, the characteristic function $\chi_{B(n)}$ always includes the $i$th path (as opposed to, for example, [28]). Therefore, given an initial radius $r_0$ (see Sect. 2.2 for details), a radius (see [12])
$$r(n) = \frac{r_0}{n^\alpha} \quad\text{for } \alpha \in (0,1) \qquad (2)$$
vanishing with the total number $n$ of paths guarantees $\lim_{n\to\infty} \bar{c}_i = c_i$ and thus consistency. As a consequence, all artifacts visible during progressive computation must be transient, even if they may vanish slowly. However, selecting a small radius to hide the transient artifacts is a goal competing with a large radius to include as many contributions as possible in the weighted average.
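A minimal sketch of the weighted average (1) with the shrinking radius (2), assuming the vertex positions and contributions of one batch are stored as NumPy arrays; the function and parameter names are ours.

```python
import numpy as np

def filter_batch(x, c, n_total, r0, alpha=0.25, weight=None):
    """Replace every contribution c[i] by the weighted average (1) over all
    vertices of the batch inside the ball of radius r(n) = r0 / n**alpha."""
    r = r0 / n_total ** alpha                     # shrinking radius, Eq. (2)
    if weight is None:
        weight = lambda i, j: 1.0                 # uniform weights; w_{i,i} must not vanish
    c_bar = np.empty_like(c)
    for i in range(len(x)):
        d2 = np.sum((x - x[i]) ** 2, axis=1)      # squared distances to x_i
        inside = np.flatnonzero(d2 <= r * r)      # characteristic function of B(n)
        w = np.array([weight(i, j) for j in inside])
        c_bar[i] = np.dot(w, c[inside]) / np.sum(w)
    return c_bar
```

In practice the quadratic loop over all pairs is replaced by the range search of Sect. 2.2, and the weight callback can implement the similarity tests of Sect. 2.1.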
Given the path space samples of a path tracer with next event estimation and
implicit multiple importance sampling [11, 19], Fig. 2 illustrates progressive path
space filtering, especially its noise reduction, transient artifacts, and consistency
for an increasing number n of light transport paths. The lighting consists of a high
dynamic range environment map. The first hit points as seen from the camera are
stored as the vertices xi , where the range search and filtering takes place.
In spite of the apparent similarity of Eq. 1 to methods used for scattered data interpolation [14, 25] and weighted uniform sampling [23, 27], there are principal differences: First, an interpolation property $\bar{c}_i = c_i$ would inhibit any averaging right from the beginning, and second, the batch size $b^m$ must remain finite, as $b^m$ is proportional to the required amount of memory to store light transport paths. Nevertheless, the batch size $b^m$ should be chosen as large as memory permits, because the efficiency results from simultaneously filtering as many vertices as possible.

Fig. 2 The series of images illustrates progressive path space filtering. Each image shows the unfiltered input above and the accumulation of weighted averages below the diagonal. As more and more batches of paths are processed, the splotchy artifacts vanish due to the consistency of the algorithm as guaranteed by the decreasing range search radius r(n). Model courtesy M. Dabrovic and Crytek
Caching samples of irradiance and interpolating them to increase the efficiency
of light transport simulation [32] has been intensively investigated [18] and has been
implemented in many renderers (see Fig. 5b). Scintillation in animations is the key
artifact of this method, which appears due to interpolating cached irradiance samples
that are noisy [17, Sect. 6.3.2] and cannot be placed in a coherent way over time. Such
artifacts require adjusting a set of multiple parameters followed by simulation from scratch, because the method is not consistent.
Other than irradiance interpolation, path space filtering efficiently can filter across
discontinuities such as detailed geometry (for examples, see Fig. 6). It overcomes
the necessity of excessive trajectory splitting to reduce noise in the cached samples,
too, which enables path tracing using the fire-and-forget paradigm as required for
efficient parallel light transport simulation. This in turn fits the observation that
with an increasing number of simulated light transport paths trajectory splitting
becomes less efficient. In addition, reducing artifacts in a frame due to consistency
only requires to continue computation instead of starting over from scratch.
The averaging process defined in Eq. 1 may be iterated within a batch of light
transport paths, i.e. computing ci from ci and so on. This yields a further dramatic


Fig. 3 Iterated weighted averaging very efficiently smooths the solution by relaxation at the cost
of losing some detail. Obviously, path space filtering replaces the black pixels of the input with the
weighted averages, which brightens up the image in the expected way. Model courtesy G. M. Leal Llaguno

speed up at the cost of some blurred illumination detail as can be seen in Fig. 3. Note
that such an iteration is consistent, too, because the radius r (n) decreases with the
number of batches.

2.1 Weighting by Similarity


Although Eq. 1 is consistent even without weighting, i.e. $w_{i,j} \equiv 1$, for larger radii $r(n)$
the resulting images may look overly blurred as contributions csi + j become included
in the average that actually never could have been gathered in xi (see Fig. 4). In order
to reduce this transient artifact of light leaking and to benefit from larger radii to
include more contributions in the average, the weights wi, j should value how likely
the contribution csi + j could have been created in xi by trajectory splitting.
Various heuristics for such weights are known from irradiance interpolation [18],
the discontinuity buffer [10, 31], photon mapping [7], light field reconstruction
[20, 21], and Fourier histogram descriptors [2]. The effect of the following weights
of similarity is shown in Fig. 4:
Blur across geometry: The similarity of the surface normal $n_i$ in $x_i$ and other surface normals $n_{s_i+j}$ in $x_{s_i+j}$ can be determined by their scalar product $\langle n_{s_i+j}, n_i\rangle \in [-1,1]$. While obviously contributions with negative scalar product will be excluded in order to prevent light leaking through the backside of an opaque surface, including only contributions with $\langle n_{s_i+j}, n_i\rangle \ge 0.95$ (in our implementation) avoids light being transported across geometry that is far from planar.
Blur across textures: The images would be most crisp if for all contributions included
in the average the optical surface properties were evaluated in xi . For surfaces
other than diffuse surfaces, like for example glossy surfaces, these properties also
depend on the direction of observation, which then must be explicitly stored with
the xi . Some of this additional memory can be saved when directions are implicitly known to be similar, as for example for query locations xi as directly seen
from the camera.

Path Space Filtering

429

Fig. 4 The effect of weighting: The top left image was rendered by a forward path tracer at 16 path
space samples per pixel. The bottom left image shows the same algorithm with path space filtering.
The improvement is easy to see in the enlarged insets. The right column illustrates the effect of the
single components of the weights. From top to bottom: Using uniform weights, the image looks
blurred and light is transported around corners. Including only samples with similar surface normals
(middle), removes a lot of blur resulting in crisp geometry. The image at the bottom right in addition
reduces texture blur by not filtering contributions with too different local throughput by the surface
reflectance properties. Finally, the bottom left result adds improvements on the shadow boundaries
by excluding contributions that have too different visibility. Model courtesy M. Dabrovic and Crytek

In situations where this evaluation is too costly or not feasible, the algorithm has
to rely on data stored during path segment generation. Such data usually includes
a color term, which is the bidirectional scattering distribution function (BSDF)
multiplied by the ratio of the cosine between surface normal and direction of
incidence and the probability density function (pdf) evaluated for the directions
of transport. For the example of cosine distributed samples on diffuse surfaces
only the diffuse albedo remains, because all other terms cancel. If a norm of the
difference of these terms in $x_{s_i+j}$ and $x_i$ is below a threshold ($\ell_2$ norm below 0.05 in our implementation), the contribution of $x_{s_i+j}$ is included in the average. Unless the
surface is diffuse, the similarity of the directions of observation must be checked
as well to avoid incorrect in-scattering on glossy materials. Including more and


more heuristics of course excludes more and more candidates, decreasing the
potential of noise reduction. In the real-time implementation of path space filtering [2], the weighted average is computed for each component resulting from a
decomposition of path space induced by the basis functions used to represent the
optical surface properties.
Blurred shadows: Given a point light source, its visibility as seen from xi and xsi + j
may be either identical or different. In order to avoid sharp shadow boundaries
to be blurred, contributions may be only included upon identical visibility. For
ambient occlusion and illumination by an environment map, blur can be reduced
by comparing the lengths of each one ray shot into the hemisphere at xi and xsi + j
by thresholding their difference.
Using only binary weights that are either zero or one, the denominator of the ratio in
Eq. 1 amounts to the number of included contributions. Although seemingly counterintuitive, using the norms to directly weight the contributions results in higher variance. This effect already has been observed in an article [1] on efficient anti-aliasing:
Having other than uniform weights, the same contribution may be weighted differently in neighboring queries, which in turn results in increased noise. In a similar
way, using kernels (for examples see [26] or kernels used in the domain of smoothed particle hydrodynamics (SPH)) other than the characteristic function $\chi_{B(n)}$ to weight contributions by their distance to the query location $x_i$ increases the variance.
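A sketch of such binary weights, assuming hypothetical vertex records that store a surface normal, the color term, and a visibility bit; the thresholds are the values quoted above.

```python
import numpy as np

def binary_weight(vi, vj, normal_cos_min=0.95, albedo_tol=0.05):
    """Binary similarity weight for two vertex records with fields
    `normal` (unit vector), `albedo` (color term) and `occluded` (bool)."""
    if np.dot(vi.normal, vj.normal) < normal_cos_min:
        return 0.0                       # avoid blur across geometry
    if np.linalg.norm(vi.albedo - vj.albedo) >= albedo_tol:
        return 0.0                       # avoid blur across textures
    if vi.occluded != vj.occluded:
        return 0.0                       # avoid blurred shadow boundaries
    return 1.0                           # include the contribution
```

A function like this can serve as the weight callback in an implementation of Eq. (1).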

2.2 Range Search


The vertices xsi + j selected by the characteristic function B(n) centered at xi efficiently may be queried by a range search using a hash grid [29], a bounding volume
hierarchy or a kd-tree organizing the entirety of stored vertices in space, or a divide-and-conquer method [13] simultaneously considering all queries. As the sets of query and data points are identical, data locality is high and implementation is simplified.
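One simple realization of the range search is a uniform hash grid in the spirit of [29]; the following sketch (our own, with illustrative names) stores vertex indices per cell and answers ball queries by scanning the neighboring cells.

```python
from collections import defaultdict
import numpy as np

def build_hash_grid(points, cell_size):
    """Uniform grid: each vertex index is stored in the cell containing it."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[tuple(np.floor(p / cell_size).astype(int))].append(idx)
    return grid

def query_ball(points, grid, cell_size, center, radius):
    """Indices of all points within `radius` of `center`, scanning only the
    3x3x3 block of neighboring cells (assumes cell_size >= radius)."""
    c = np.floor(center / cell_size).astype(int)
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for idx in grid.get((c[0] + dx, c[1] + dy, c[2] + dz), ()):
                    if np.sum((points[idx] - center) ** 2) <= radius * radius:
                        result.append(idx)
    return result
```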
Note that storing vertex information only in screen space even enables real-time
path space filtering [2], however, can only query a subset of the neighborhood relations as compared to the full 3d range search. In fact, real-time path space filtering
[2] improves on previous work [24, Chap. 4, p. 83] with respect to similarity criteria
and path space decomposition, while the basic ideas are similar and both based on
earlier attempts of filtering approaches [10, 16, 31] in order to improve efficiency.
As already observed in [12], the parameter $\alpha$ in Eq. 2 does not have much influence and $\alpha = 1/4$ is a robustly working choice. In fact, the radius is decreasing arbitrarily slowly, which leaves the initial radius $r_0$ as the most important parameter.
As fewer and fewer contributions are averaged with decreasing radius, there is
a point in time, where actually almost no averaging takes place any longer as only
the central vertex xi is included in the queries. On the one hand, this leads to the
intuition that comparing the maximum of the number of averaged contributions to
a threshold can be utilized to automatically switch off the algorithm. On the other


hand, it indicates that the initial radius needs to be selected sufficiently large in order
to include a meaningful number of contributions in the weighted averages from Eq. 1.
The initial radius $r_0$ also may depend on the query location $x_i$. For example, it may be derived from the definition of the solid angle $\Omega := \pi r_0^2 / d^2$ of a disk of radius $r_0$ in $x_i$ perpendicular to a ray at a distance $d$ from the ray origin. For a fixed solid angle $\Omega$, the initial radius
$$r_0 = \sqrt{\frac{\Omega}{\pi}}\; d$$
then is proportional to the distance $d$. The factor of proportionality may be either chosen by the user or can be determined using a given solid angle. For example, $\Omega$ can be chosen as the solid angle determined by the area of $3 \times 3$ pixels on the screen with respect to the focal point. Finally, the distance $d$ may be chosen as the length of the camera path towards $x_i$.
Note that considering an anisotropic footprint (area of averaging determined by
projecting the solid angle of a ray onto the intersected surface) is not practical for
several reasons: The requirement of dividing by the cosine between surface normal
in xi and the ray direction may cause numerical issues for vectors that are close to
perpendicular. In addition the efficiency of the range search may be decreased, as
now the query volume may have an arbitrarily large extent. Finally, this would result
in possibly averaging contributions from vertices that are spatially far apart, although
the local environment of the vertex xi may be small such as for example in foliage
or hair.

2.3 Differentiation of Path Space Filtering


and Photon Mapping
Progressive path space filtering is different from progressive photon mapping: First
of all, progressive photon mapping is not a weighted average as it determines radiance
by querying the flux of photons inside a ball around a query point divided by the
corresponding disk area. Without progressive photon mapping the contribution of
light transport paths that are difficult to sample [15, Fig. 2] would be just missing or
add high variance sporadically. Second, the query locations in photon mapping are
not part of the photon cloud queried by range search, while in path space filtering the
ensemble of vertices subject to range search includes both data and query locations.
Third, progressive photon mapping is concerned with light path segments, while
progressive path space filtering is concerned with camera path segments.
Temporally visible light leaks and splotches are due to a large range search radius
r (n), which allows for collecting light beyond opaque surfaces and due to the shape of
the ball B(n) blurs light into disk-like shapes. If the local environment around a query
point is not a disk, as for example close to a geometric edge, the division by the disk
area in photon mapping causes an underestimation resulting in a darkening along such


edges. While this does not happen for the weighted average of path space filtering,
contrast may be reduced (see the foliage rendering in Fig. 6). In addition, so-called
fire flies that actually are rarely sampled peaks of the integrand, are attenuated by
the weighted average and therefore may look more like splotches instead of single
bright pixels. Since both progressive photon mapping and path space filtering are
consistent, all of these artifacts must be transient.

3 More Applications in Light Transport Simulation


Path space filtering is simple to implement and due to linearity (see Eq. 1) works for
any decomposition of path space including any variant of (multiple) importance sampling. It can overcome the need for excessive trajectory splitting (see the schematic in
Fig. 5) for local averaging in xi in virtually all common use cases in rendering: Ambient occlusion, shadows from extended and/or many light sources (like for example
instant radiosity [9]), final gathering, ray marching, baking light probes and textures
for games, rendering with participating media, or effects like depth of field simulation can be determined directly from path space samples. Some exemplary results
are shown in Fig. 6 and some more applications are briefly sketched in the following:
Animations: A common artifact in animations rendered with interpolation methods
is scintillation due to for example temporally incoherent cached samples, noisy
cached samples, or temporal changes in visibility. Then parameters have to be
tweaked and computation has to be started from scratch. Progressive path space
filtering removes this critical source of inefficiency: Storing the next batch starting
index si with each frame (see Sect. 2), any selected frame can be refined by just
continuing the computation as all artifacts are guaranteed to be transient.
Multiple views: In addition, path space filtering can be applied across vertices generated from multiple views. As such, rendering depth of field, stereo pairs of images (see Fig. 6f), multiple views of a scene, rendering for light field displays, or an animation of a static scene can greatly benefit, as vertices can be shared among all frames to be rendered.

Fig. 5 (panels: a trajectory splitting, b irradiance interpolation, c path space filtering, d super-sampling) In order to determine the radiance in $x_i$ as seen by the long ray, a many rays are shot into the hemisphere to sample the contributions. As this becomes too expensive due to the large number of rays, b irradiance interpolation interpolates between cached irradiance samples that were smoothed by trajectory splitting. c Path space filtering mimics trajectory splitting by averaging the contributions of paths in the proximity. d Supersampling the information provided by the paths used for path space filtering is possible by tracing additional path segments from the camera. Note that then $x_i$ does not have an intrinsic contribution

Fig. 6 (panels: a ambient occlusion, b shadows, c light transport simulation, d complex geometry, e translucent material, f red-cyan superimposed stereo image pair) The split image comparisons show how path space filtering can remove substantial amounts of noise in various example settings. Models courtesy S. Laine, cgtrader, Laubwerk, Stanford Computer Graphics Laboratory, and G.M. Leal Llaguno
Motion blur: Identical to [11], the consistent simulation of motion blur may be realized by averaging images at distinct points in time. As an alternative, extending
the range search to include proximity in time allows for averaging across vertices with different points in time. In cases where linear motion is a sufficient
approximation and storing linear motion vectors is affordable, reconstructing the
visibility as introduced in [20, 21] may improve the speed of convergence.
Spectral rendering: The consistent simulation of spectral light transport may be realized by averaging monochromatic contributions $c_i$ associated to a wavelength $\lambda_i$. The projection onto a suitable color system may happen during the averaging process, where the suitable basis functions are multiplied as factors to the weights. One example of such a set of basis functions are the CIE XYZ response curves.


Fig. 7 The left image shows a fence rendered with one light transport path per pixel. The image on
the right shows the result of anti-aliasing by path space filtering using the paths from the left image
and an additional three camera paths per pixel. Model courtesy Chris Wyman

One very compact continuous approximation of these response curves is proposed in [33].
Participating media and translucency: As path space filtering works for any kind
of path space samples, it readily can be applied to the simulation of subsurface
scattering and participating media in order to improve rendering performance.
Figure 6e features a statuette with light transported through translucent matter,
where path space filtering has been performed across the surface of the statuette.
At this level of efficiency, the consistent direct simulation may become affordable
over approximations like for example bidirectional subsurface scattering distribution functions (BSSRDF) based on the dipole approximation [8].
Decoupling anti-aliasing from shading: As illustrated in Fig. 5d, it is straightforward
to just sample more camera path segments. Contributions to their query locations
are computed as before. However, similar to [28], these queries may be empty
due to the lack of a guaranteed central contribution ci and in that case must not
be considered in the accumulation process. Figure 7 illustrates how anti-aliasing
by super-sampling with path space filtering works nicely across discontinuities.

4 Conclusion
Path space filtering is simple to implement on top of any sampling-based rendering
algorithm and has low overhead. The progressive algorithm efficiently reduces variance and is guaranteed to converge without persistent artifacts due to consistency.
It will be interesting to explore the principle applied to integro-approximation problems other than computer graphics and to investigate how the method fits into the
context of multilevel Monte Carlo methods.

References
1. Ernst, M., Stamminger, M., Greiner, G.: Filter importance sampling. In: Proceedings of the
IEEE/EG Symposium on Interactive Ray Tracing, pp. 125132 (2006)


2. Gautron, P., Droske, M., Wchter, C., Kettner, L., Keller, A., Binder, N., Dahm, K.: Path space
similarity determined by Fourier histogram descriptors. In: ACM SIGGRAPH 2014 Talks,
SIGGRAPH14, pp. 39:139:1. ACM (2014)
3. Georgiev, I., Krivnek, J., Davidovic, T., Slusallek, P.: Light transport simulation with vertex
connection and merging. ACM Trans. Graph. (TOG) 31(6), 192:1192:10 (2012)
4. Hachisuka, T., Jensen, H.: Stochastic progressive photon mapping. In: SIGGRAPH Asia09:
ACM SIGGRAPH Asia Papers, pp. 18. ACM (2009)
5. Hachisuka, T., Ogaki, S., Jensen, H.: Progressive photon mapping. ACM Trans. Graph. 27(5),
130:1130:8 (2008)
6. Hachisuka, T., Pantaleoni, J., Jensen, H.: A path space extension for robust light transport
simulation. ACM Trans. Graph. (TOG) 31(6), 191:1191:10 (2012)
7. Jensen, H.: Realistic Image Synthesis Using Photon Mapping. AK Peters, Natick (2001)
8. Jensen, H., Buhler, J.: A rapid hierarchical rendering technique for translucent materials. ACM
Trans. Graph. 21(3), 576581 (2002)
9. Keller, A.: Instant radiosity. In: SIGGRAPH97: Proceedings of the 24th Annual Conference
on Computer Graphics and Interactive Techniques, pp. 4956 (1997)
10. Keller, A.: Quasi-Monte Carlo Methods for Photorealistic Image Synthesis. Ph.D. thesis, University of Kaiserslautern, Germany (1998)
11. Keller, A.: Quasi-Monte Carlo image synthesis in a nutshell. In: Dick, J., Kuo, F., Peters, G.,
Sloan, I. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2012, pp. 203238. Springer,
Heidelberg (2013)
12. Keller, A., Binder, N.: Deterministic consistent density estimation for light transport simulation.
In: Dick, J., Kuo, F., Peters, G., Sloan, I. (eds.) Monte Carlo and Quasi-Monte Carlo Methods
2012, pp. 467480. Springer, Heidelberg (2013)
13. Keller, A., Droske, M., Grnschlo, L., Seibert, D.: A divide-and-conquer algorithm for simultaneous photon map queries. Poster at High-Performance Graphics
in Vancouver. http://www.highperformancegraphics.org/previous/www_2011/media/Posters/
HPG2011_Posters_Keller1_abstract.pdf (2011)
14. Knauer, E., Brz, J., Mller, S.: A hybrid approach to interactive global illumination and soft
shadows. Vis. Comput.: Int. J. Comput. Graph. 26(68), 565574 (2010)
15. Kollig, T., Keller, A.: Efficient bidirectional path tracing by randomized quasi-Monte Carlo
integration. In: Niederreiter, H., Fang, K., Hickernell, F. (eds.) Monte Carlo and Quasi-Monte
Carlo Methods 2000, pp. 290305. Springer, Berlin (2002)
16. Kontkanen, J., Rsnen, J., Keller, A.: Irradiance filtering for Monte Carlo ray tracing. In: Talay,
D., Niederreiter, H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2004, pp. 259272.
Springer, Berlin (2004)
17. Krivnek, J.: Radiance caching for global illumination computation on glossy surfaces. Ph.D.
thesis, Universit de Rennes 1 and Czech Technical University in Prague (2005)
18. Krivnek, J., Gautron, P.: Practical Global Illumination with Irradiance Caching. Synthesis
lectures in computer graphics and animation. Morgan & Claypool, San Rafael (2009)
19. Lafortune, E.: Mathematical Models and Monte Carlo Algorithms for Physically Based Rendering. Ph.D. thesis, Katholieke Universiteit Leuven, Belgium (1996)
20. Lehtinen, J., Aila, T., Chen, J., Laine, S., Durand, F.: Temporal light field reconstruction for
rendering distribution effects. ACM Trans. Graph. 30(4), 55:155:12 (2011)
21. Lehtinen, J., Aila, T., Laine, S., Durand, F.: Reconstructing the indirect light field for global
illumination. ACM Trans. Graph. 31(4), 51 (2012)
22. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM,
Philadelphia (1992)
23. Powell, M., Swann, J.: Weighted uniform sampling - a Monte Carlo technique for reducing
variance. IMA J. Appl. Math. 2(3), 228236 (1966)
24. Schwenk, K.: Filtering techniques for low-noise previews of interactive stochastic ray tracing.
Ph.D. thesis, Technische Universitt Darmstadt (2013)
25. Shepard, D.: A two-dimensional interpolation function for irregularly-spaced data. In: Proceedings of the 23rd ACM National Conference, pp. 517524. ACM (1968)


26. Silverman, B.: Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC,
London (1986)
27. Spanier, J., Maize, E.: Quasi-random methods for estimating integrals using relatively small
samples. SIAM Rev. 36(1), 1844 (1994)
28. Suykens, F., Willems, Y.: Adaptive filtering for progressive Monte Carlo image rendering. In:
WSCG 2000 Conference Proceedings (2000)
29. Teschner, M., Heidelberger, B., Mller, M., Pomeranets, D., Gross, M.: Optimized spatial
hashing for collision detection of deformable objects. In: Proceedings of VMV03, pp. 4754.
Munich, Germany (2003)
30. Veach, E.: Robust Monte Carlo Methods for Light Transport Simulation. Ph.D. thesis, Stanford
University (1997)
31. Wald, I., Kollig, T., Benthin, C., Keller, A., Slusallek, P.: Interactive global illumination using
fast ray tracing. In: Debevec, P., Gibson, S. (eds.) Rendering Techniques (Proceedings of the
13th Eurographics Workshop on Rendering), pp. 1524 (2002)
32. Ward, G., Rubinstein, F., Clear, R.: A ray tracing solution for diffuse interreflection. Comput.
Graph. 22, 8590 (1988)
33. Wyman, C., Sloan, P., Shirley, P.: Simple analytic approximations to the CIE XYZ color matching functions. J. Comput. Graph. Tech. (JCGT) 2, 111 (2013). http://jcgt.org/published/0002/
02/01/

Tractability of Multivariate Integration in Hybrid Function Spaces

Peter Kritzer and Friedrich Pillichshammer

Abstract We consider tractability of integration in reproducing kernel Hilbert spaces which are a tensor product of a Walsh space and a Korobov space. The main result provides necessary and sufficient conditions for weak, polynomial, and strong polynomial tractability.

Keywords Multivariate integration · Quasi-Monte Carlo · Tractability · Korobov space · Walsh space

1 Introduction

In this paper we study multivariate integration $I_s(f) = \int_{[0,1]^s} f(\mathbf{x})\,\mathrm{d}\mathbf{x}$ in reproducing kernel Hilbert spaces $H(K)$ of functions $f : [0,1]^s \to \mathbb{R}$, equipped with the norm $\|\cdot\|_{H(K)}$, where $K$ denotes the reproducing kernel. We refer to Aronszajn [1] for an introduction to the theory of reproducing kernel Hilbert spaces. Without loss of generality, see, e.g., [19, 23], we can restrict ourselves to approximating $I_s(f)$ by means of linear algorithms $Q_{N,s}$ of the form
$$Q_{N,s}(f, \mathcal{P}) := \sum_{k=0}^{N-1} q_k\, f(\mathbf{x}_k),$$


where $N \in \mathbb{N}$, with coefficients $q_k \in \mathbb{C}$ and deterministically chosen sample points $\mathcal{P} = \{\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_{N-1}\}$ in $[0,1)^s$. In this paper we further restrict ourselves to considering only $q_k$ of the form $q_k = 1/N$ for all $0 \le k < N$, in which case one speaks of quasi-Monte Carlo (QMC) algorithms. QMC algorithms are often used in practical applications, especially if $s$ is large. We are interested in studying the worst-case integration error,
$$e(H(K), \mathcal{P}) = \sup_{\substack{f \in H(K) \\ \|f\|_{H(K)} \le 1}} \big|I_s(f) - Q_{N,s}(f, \mathcal{P})\big|.$$

For $N \in \mathbb{N}$ let $e(N, s)$ be the $N$th minimal QMC worst-case error,
$$e(N, s) = \inf_{\mathcal{P}} e(H(K), \mathcal{P}),$$
where the infimum is extended over all $N$-element point sets in $[0,1)^s$. Additionally, the initial error $e(0, s)$ is defined as the worst-case error of the zero algorithm,
$$e(0, s) = \sup_{\substack{f \in H(K) \\ \|f\|_{H(K)} \le 1}} |I_s(f)|,$$
and is used as a reference value. In this paper we are interested in the dependence of the worst-case error on the dimension $s$. To study this dependence systematically we consider the so-called information complexity defined as
$$N_{\min}(\varepsilon, s) = \min\{N \in \mathbb{N}_0 : e(N, s) \le \varepsilon\, e(0, s)\},$$
which is the minimal number of points required to reduce the initial error by a factor of $\varepsilon$, where $\varepsilon > 0$.
We would like to avoid cases where the information complexity $N_{\min}(\varepsilon, s)$ grows exponentially or even faster with the dimension $s$ or with $\varepsilon^{-1}$. To quantify the behavior of the information complexity we use the following notions of tractability. We say that the integration problem in $H(K)$ is
weakly QMC-tractable, if $\lim_{s+\varepsilon^{-1}\to\infty} \frac{\log N_{\min}(\varepsilon, s)}{s+\varepsilon^{-1}} = 0$;
polynomially QMC-tractable, if there exist non-negative numbers $c$, $p$, and $q$ such that $N_{\min}(\varepsilon, s) \le c\, s^{q}\, \varepsilon^{-p}$;
strongly polynomially QMC-tractable, if there exist non-negative numbers $c$ and $p$ such that $N_{\min}(\varepsilon, s) \le c\, \varepsilon^{-p}$.

Of course, strong polynomial QMC-tractability implies polynomial QMC-tractability


which in turn implies weak QMC-tractability. If we do not have weak QMC-tractability, then we say that the integration problem in $H(K)$ is intractable.
In the existing literature, many authors have studied tractability (since we only
deal with QMC-rules here we omit the prefix QMC from now on) of integration
in many different reproducing kernel Hilbert spaces. The current state of the art of

Tractability of Multivariate Integration in Hybrid Function Spaces

439

tractability theory is summarized in the three volumes of the book of Novak and
Woźniakowski [19–21] which we refer to for extensive information on this subject
and further references. Most of these investigations have in common that reproducing
kernel Hilbert spaces are tensor products of one-dimensional spaces whose kernels
are all of the same type (but maybe equipped with different weights). In this paper
we consider the case where the reproducing kernel is a tensor product of spaces
with kernels of different type. We call such spaces hybrid spaces. Some results on
tractability in general hybrid spaces can be found in the literature. For example, in
[20] multivariate integration is studied for arbitrary reproducing kernels Kd without
relation to Kd+1 . Here we consider as a special instance the tensor product of Walsh
and Korobov spaces. As far as we are aware of, this specific problem has not been
studied in the literature so far. This paper is a first attempt in this direction.
In particular, we consider the tensor product of an s1 -dimensional weighted Walsh
space and an s2 -dimensional weighted Korobov space (the exact definitions will be
given in the next section). The study of such spaces could be important in view of the
integration of functions which are periodic with respect to some of the components
and, for example, piecewise constant with respect to the remaining components.
Moreover, it has been pointed out by several scientists (see, e.g., [11, 17]) that
hybrid integration problems may be relevant for certain integration problems in
applications. Indeed, communication with the authors of [11] and [17] have motivated
our idea for considering function spaces, where we may have very different properties
of the integrands with respect to different components, as for example regarding
smoothness.
From the analytical point of view, it is very challenging to deal with integration
in hybrid spaces. The reason for this is the rather complex interplay between the
different analytic and algebraic structures of the kernel functions. In the present study
we are concerned with Fourier analysis carried out simultaneously with respect to
the Walsh and the trigonometric function system. The problem is also closely related
to the study of hybrid point sets which received much attention in recent times (see,
for example, [5, 6, 8–10, 13–15]).
The paper is organized as follows. In Sect. 2 we introduce the Hilbert space under
consideration in this paper. The main result states necessary and sufficient conditions
for various notions of tractability and is stated in Sect. 3. In Sect. 4 we prove the
necessary conditions and in Sect. 5 the sufficient ones.

2 The Hilbert Space


2.1 Basic Notation

Let $k \in \mathbb{N}_0$ with $b$-adic representation $k = \sum_{i=0}^{\infty} \kappa_i b^i$, $\kappa_i \in \{0,\ldots,b-1\}$. Furthermore, let $x \in [0,1)$ with $b$-adic representation $x = \sum_{i=1}^{\infty} \xi_i b^{-i}$, $\xi_i \in \{0,\ldots,b-1\}$, unique in the sense that infinitely many of the $\xi_i$ differ from $b-1$. If $\kappa_a \neq 0$ is the most significant nonzero digit of $k$, we define the $k$th Walsh function $\mathrm{wal}_k : [0,1) \to \mathbb{C}$ (in base $b$) by
$$\mathrm{wal}_k(x) := \mathrm{e}\!\left(\frac{\xi_1 \kappa_0 + \cdots + \xi_{a+1}\kappa_a}{b}\right),$$
where $\mathrm{e}(v) := \exp(2\pi\mathrm{i} v)$. For dimension $s \ge 2$ and vectors $\mathbf{k} = (k_1,\ldots,k_s) \in \mathbb{N}_0^s$ and $\mathbf{x} = (x_1,\ldots,x_s) \in [0,1)^s$ we define the $\mathbf{k}$th Walsh function $\mathrm{wal}_{\mathbf{k}} : [0,1)^s \to \mathbb{C}$ by $\mathrm{wal}_{\mathbf{k}}(\mathbf{x}) := \prod_{j=1}^{s} \mathrm{wal}_{k_j}(x_j)$.
Furthermore, for $\mathbf{l} \in \mathbb{Z}^s$ and $\mathbf{y} \in \mathbb{R}^s$ we define the $\mathbf{l}$th trigonometric function by $\mathrm{e}_{\mathbf{l}}(\mathbf{y}) := \mathrm{e}(\mathbf{l}\cdot\mathbf{y})$, where $\cdot$ denotes the usual dot product.
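For illustration, the one-dimensional Walsh function can be evaluated digit by digit; the following Python sketch (the function name and the truncation of the digit expansion are our own choices) implements the definition above.

```python
import cmath

def walsh(k, x, b=2, digits=53):
    """Evaluate the k-th base-b Walsh function at x in [0,1):
    wal_k(x) = e((xi_1*kappa_0 + ... + xi_{a+1}*kappa_a)/b), where kappa_i
    are the b-adic digits of k and xi_i those of x (truncated after `digits`)."""
    total = 0
    i = 0
    while k > 0 and i < digits:
        kappa_i = k % b                       # i-th digit of k
        k //= b
        xi_next = int(x * b ** (i + 1)) % b   # (i+1)-st digit of x
        total += xi_next * kappa_i
        i += 1
    return cmath.exp(2j * cmath.pi * total / b)
```

For $b = 2$ the values are $\pm 1$, recovering the classical Walsh functions.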
We define two functions $r^{(1)}$ and $r^{(2)}$: let $\alpha > 1$ and $\gamma > 0$ be reals and let $\boldsymbol{\gamma} = (\gamma_j)_{j\ge 1}$ be a sequence of positive reals.
For integer $b \ge 2$ and $k \in \mathbb{N}_0$ let
$$r^{(1)}_{\alpha,\gamma}(k) := \begin{cases} 1 & \text{if } k = 0,\\ \gamma\, b^{-\alpha \lfloor \log_b k\rfloor} & \text{if } k \neq 0.\end{cases}$$
For $\mathbf{k} = (k_1,\ldots,k_s) \in \mathbb{N}_0^s$ let $r^{(1)}_{\alpha,\boldsymbol{\gamma}}(\mathbf{k}) := \prod_{j=1}^{s} r^{(1)}_{\alpha,\gamma_j}(k_j)$. Even though the parameter $b$ occurs in the definition of $r^{(1)}_{\alpha,\gamma}$, we do not explicitly include it in our notation as the choice of $b$ will usually be clear from the context.
For $l \in \mathbb{Z}$ let
$$r^{(2)}_{\alpha,\gamma}(l) := \begin{cases} 1 & \text{if } l = 0,\\ \dfrac{\gamma}{|l|^{\alpha}} & \text{if } l \neq 0.\end{cases}$$
For $\mathbf{l} = (l_1,\ldots,l_s) \in \mathbb{Z}^s$ let $r^{(2)}_{\alpha,\boldsymbol{\gamma}}(\mathbf{l}) := \prod_{j=1}^{s} r^{(2)}_{\alpha,\gamma_j}(l_j)$.

2.2 Definition of the Hilbert Space


Let $s_1, s_2 \in \mathbb{N}_0$ such that $s_1 + s_2 \ge 1$. We write $\mathbf{s} = (s_1, s_2)$. For $\mathbf{x} = (x_1,\ldots,x_{s_1}) \in [0,1)^{s_1}$ and $\mathbf{y} = (y_1,\ldots,y_{s_2}) \in [0,1)^{s_2}$, we use the short hand $(\mathbf{x},\mathbf{y})$ for $(x_1,\ldots,x_{s_1},y_1,\ldots,y_{s_2}) \in [0,1)^{s_1+s_2}$.
Let $\boldsymbol{\gamma}^{(1)} = (\gamma_j^{(1)})_{j\ge1}$ and $\boldsymbol{\gamma}^{(2)} = (\gamma_j^{(2)})_{j\ge1}$ be non-increasing sequences of positive real numbers. We write $\boldsymbol{\gamma}$ for the tuple $(\boldsymbol{\gamma}^{(1)},\boldsymbol{\gamma}^{(2)})$. Furthermore, let $\alpha_1, \alpha_2 \in \mathbb{R}$ with $\alpha_1, \alpha_2 > 1$ and write $\boldsymbol{\alpha} = (\alpha_1,\alpha_2)$.
We first define a function $K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}} : [0,1]^{s_1+s_2} \times [0,1]^{s_1+s_2} \to \mathbb{C}$ (which will be the kernel function of a Hilbert space, as we shall see later) by
$$K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}((\mathbf{x},\mathbf{y}),(\mathbf{x}',\mathbf{y}')) := \sum_{\mathbf{k}\in\mathbb{N}_0^{s_1}} \sum_{\mathbf{l}\in\mathbb{Z}^{s_2}} r^{(1)}_{\alpha_1,\boldsymbol{\gamma}^{(1)}}(\mathbf{k})\, r^{(2)}_{\alpha_2,\boldsymbol{\gamma}^{(2)}}(\mathbf{l})\, \mathrm{wal}_{\mathbf{k}}(\mathbf{x})\, \mathrm{e}_{\mathbf{l}}(\mathbf{y})\, \overline{\mathrm{wal}_{\mathbf{k}}(\mathbf{x}')\, \mathrm{e}_{\mathbf{l}}(\mathbf{y}')}$$

for $(\mathbf{x},\mathbf{y}), (\mathbf{x}',\mathbf{y}') \in [0,1]^{s_1+s_2}$ (to be more precise, we should write $\mathbf{x}, \mathbf{x}' \in [0,1]^{s_1}$ and $\mathbf{y}, \mathbf{y}' \in [0,1]^{s_2}$; from now on, when using the notation $(\mathbf{x},\mathbf{y}) \in [0,1]^{s_1+s_2}$, we shall always tacitly assume that $\mathbf{x} \in [0,1]^{s_1}$ and $\mathbf{y} \in [0,1]^{s_2}$).
Note that $K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}$ can be written as
$$K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}((\mathbf{x},\mathbf{y}),(\mathbf{x}',\mathbf{y}')) = K^{\mathrm{Wal}}_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}}(\mathbf{x},\mathbf{x}')\; K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}}(\mathbf{y},\mathbf{y}'), \qquad (1)$$
where $K^{\mathrm{Wal}}_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}}$ is the reproducing kernel of a Hilbert space based on Walsh functions.
This space is defined as
$$H(K^{\mathrm{Wal}}_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}}) := \Big\{ f = \sum_{\mathbf{k}\in\mathbb{N}_0^{s_1}} \hat{f}_{\mathrm{wal}}(\mathbf{k})\, \mathrm{wal}_{\mathbf{k}} \;:\; \|f\|_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}} < \infty \Big\},$$
where the $\hat{f}_{\mathrm{wal}}(\mathbf{k}) := \int_{[0,1]^{s_1}} f(\mathbf{x})\, \overline{\mathrm{wal}_{\mathbf{k}}(\mathbf{x})}\, \mathrm{d}\mathbf{x}$ are the Walsh coefficients of $f$ and
$$\|f\|_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}} = \Big(\sum_{\mathbf{k}\in\mathbb{N}_0^{s_1}} \big(r^{(1)}_{\alpha_1,\boldsymbol{\gamma}^{(1)}}(\mathbf{k})\big)^{-1} |\hat{f}_{\mathrm{wal}}(\mathbf{k})|^2\Big)^{1/2}.$$

This so-called Walsh space was first introduced and studied in [3]. The kernel $K^{\mathrm{Wal}}_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}}$ can be written as (see [3, p. 157])
$$K^{\mathrm{Wal}}_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}}(\mathbf{x},\mathbf{x}') = \sum_{\mathbf{k}\in\mathbb{N}_0^{s_1}} r^{(1)}_{\alpha_1,\boldsymbol{\gamma}^{(1)}}(\mathbf{k})\, \mathrm{wal}_{\mathbf{k}}(\mathbf{x})\, \overline{\mathrm{wal}_{\mathbf{k}}(\mathbf{x}')} \qquad (2)$$
$$= \prod_{j=1}^{s_1} \Big(1 + \gamma_j^{(1)} \sum_{k\in\mathbb{N}} b^{-\alpha_1\lfloor\log_b k\rfloor}\, \mathrm{wal}_k(x_j \ominus x_j')\Big) = \prod_{j=1}^{s_1} \big(1 + \gamma_j^{(1)}\, \phi_{\mathrm{wal},\alpha_1}(x_j, x_j')\big), \qquad (3)$$
where $\ominus$ denotes digit-wise subtraction modulo $b$, and where the function $\phi_{\mathrm{wal},\alpha_1}$ is defined as in [3, p. 170], where it is also noted that $1 + \gamma_j\, \phi_{\mathrm{wal},\alpha_1}(u, v) \ge 0$ for any $u, v$ as long as $\gamma_j \le 1$.
Furthermore, $K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}}$ is the reproducing kernel of a Hilbert space based on trigonometric functions. This second function space is defined as
$$H(K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}}) := \Big\{ f = \sum_{\mathbf{l}\in\mathbb{Z}^{s_2}} \hat{f}_{\mathrm{trig}}(\mathbf{l})\, \mathrm{e}_{\mathbf{l}} \;:\; \|f\|_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}} < \infty \Big\},$$
where the $\hat{f}_{\mathrm{trig}}(\mathbf{l}) := \int_{[0,1]^{s_2}} f(\mathbf{y})\, \overline{\mathrm{e}_{\mathbf{l}}(\mathbf{y})}\, \mathrm{d}\mathbf{y}$ are the Fourier coefficients of $f$ and
$$\|f\|_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}} = \Big(\sum_{\mathbf{l}\in\mathbb{Z}^{s_2}} \big(r^{(2)}_{\alpha_2,\boldsymbol{\gamma}^{(2)}}(\mathbf{l})\big)^{-1} |\hat{f}_{\mathrm{trig}}(\mathbf{l})|^2\Big)^{1/2}.$$

This so-called Korobov space is studied in many papers. We refer to [20, 22] and the references therein for further information. The kernel $K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}}$ can be written as (see [22])
$$K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}}(\mathbf{y},\mathbf{y}') = \sum_{\mathbf{l}\in\mathbb{Z}^{s_2}} r^{(2)}_{\alpha_2,\boldsymbol{\gamma}^{(2)}}(\mathbf{l})\, \mathrm{e}_{\mathbf{l}}(\mathbf{y})\, \overline{\mathrm{e}_{\mathbf{l}}(\mathbf{y}')} \qquad (4)$$
$$= \prod_{j=1}^{s_2} \Big(1 + \gamma_j^{(2)} \sum_{l\in\mathbb{Z}\setminus\{0\}} \frac{\mathrm{e}_l(y_j - y_j')}{|l|^{\alpha_2}}\Big) = \prod_{j=1}^{s_2} \Big(1 + 2\gamma_j^{(2)} \sum_{l=1}^{\infty} \frac{\cos(2\pi l (y_j - y_j'))}{l^{\alpha_2}}\Big). \qquad (5)$$
Note that $K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}}(\mathbf{y},\mathbf{y}') \ge 0$ as long as $\gamma_j^{(2)} \le (2\zeta(\alpha_2))^{-1}$ for all $j \ge 1$, where $\zeta$ is the Riemann zeta function.
Furthermore, [1, Part I, Sect. 8, Theorem I, p. 361] implies that $K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}$ is the reproducing kernel of the tensor product of the spaces $H(K^{\mathrm{Wal}}_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}})$ and $H(K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}})$, i.e., of the space
$$H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}) = H(K^{\mathrm{Wal}}_{s_1,\alpha_1,\boldsymbol{\gamma}^{(1)}}) \otimes H(K^{\mathrm{Kor}}_{s_2,\alpha_2,\boldsymbol{\gamma}^{(2)}}).$$
The elements of $H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}})$ are defined on $[0,1]^{s_1+s_2}$, and the space is equipped with the norm
$$\|f\|_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}} = \Big(\sum_{\mathbf{k}\in\mathbb{N}_0^{s_1}} \sum_{\mathbf{l}\in\mathbb{Z}^{s_2}} \big(r^{(1)}_{\alpha_1,\boldsymbol{\gamma}^{(1)}}(\mathbf{k})\big)^{-1} \big(r^{(2)}_{\alpha_2,\boldsymbol{\gamma}^{(2)}}(\mathbf{l})\big)^{-1} |\hat{f}(\mathbf{k},\mathbf{l})|^2\Big)^{1/2},$$
where $\hat{f}(\mathbf{k},\mathbf{l}) := \int_{[0,1]^{s_1+s_2}} f(\mathbf{x},\mathbf{y})\, \overline{\mathrm{wal}_{\mathbf{k}}(\mathbf{x})\, \mathrm{e}_{\mathbf{l}}(\mathbf{y})}\, \mathrm{d}\mathbf{x}\, \mathrm{d}\mathbf{y}$. From (1), (3) and (5) it follows that
$$K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}((\mathbf{x},\mathbf{y}),(\mathbf{x}',\mathbf{y}')) = \prod_{j=1}^{s_1}\big(1 + \gamma_j^{(1)}\,\phi_{\mathrm{wal},\alpha_1}(x_j,x_j')\big)\; \prod_{j=1}^{s_2}\Big(1 + 2\gamma_j^{(2)}\sum_{l=1}^{\infty}\frac{\cos(2\pi l(y_j - y_j'))}{l^{\alpha_2}}\Big).$$
In particular, if $\gamma_j^{(1)} \le 1$ and $\gamma_j^{(2)} \le (2\zeta(\alpha_2))^{-1}$ for all $j \ge 1$, then the kernel $K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}$ is nonnegative.
We study the problem of numerically integrating a function $f \in H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}})$, i.e., we would like to approximate
$$I_{\mathbf{s}}(f) = \int_{[0,1]^{s_1}} \int_{[0,1]^{s_2}} f(\mathbf{x},\mathbf{y})\, \mathrm{d}\mathbf{x}\, \mathrm{d}\mathbf{y}.$$
We use a QMC rule based on a point set $\mathcal{S}_{N,\mathbf{s}} = ((\mathbf{x}_n,\mathbf{y}_n))_{n=0}^{N-1} \subseteq [0,1)^{s_1+s_2}$, so we approximate $I_{\mathbf{s}}(f)$ by
$$\frac{1}{N}\sum_{n=0}^{N-1} f(\mathbf{x}_n,\mathbf{y}_n).$$
Using [4, Proposition 2.11] we obtain that $e(0, s_1+s_2) = 1$ for all $s_1, s_2$ and
$$e^2(H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}), \mathcal{S}_{N,\mathbf{s}}) = -1 + \frac{1}{N^2}\sum_{n,n'=0}^{N-1} K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}((\mathbf{x}_n,\mathbf{y}_n),(\mathbf{x}_{n'},\mathbf{y}_{n'})). \qquad (6)$$
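Formula (6) is straightforward to evaluate numerically once the kernel is available. The following sketch (the function names, the truncation of the cosine series in (5), and the example are ours) illustrates this for small $N$.

```python
import numpy as np

def korobov_factor(y, yp, alpha, gamma, terms=1000):
    """Truncation of 1 + 2*gamma*sum_{l>=1} cos(2*pi*l*(y - yp))/l**alpha, cf. (5)."""
    l = np.arange(1, terms + 1)
    return 1.0 + 2.0 * gamma * np.sum(np.cos(2.0 * np.pi * l * (y - yp)) / l ** alpha)

def squared_worst_case_error(points, kernel):
    """Formula (6): e^2 = -1 + N^{-2} * sum over all pairs of K(p_n, p_{n'})."""
    N = len(points)
    total = sum(kernel(p, q) for p in points for q in points)
    return -1.0 + total / N ** 2

# Example for the pure Korobov case s1 = 0, s2 = 1, alpha2 = 2, gamma = 1:
# kernel = lambda p, q: korobov_factor(p[0], q[0], 2.0, 1.0)
# e2 = squared_worst_case_error(np.linspace(0, 1, 8, endpoint=False).reshape(-1, 1), kernel)
```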

3 The Main Result


The main result of this paper states necessary and sufficient conditions for the various notions of tractability.

Theorem 1 We have strong polynomial QMC-tractability of multivariate integration in $H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}})$ iff
$$\limsup_{s_1+s_2\to\infty} \Big(\sum_{j=1}^{s_1}\gamma_j^{(1)} + \sum_{j=1}^{s_2}\gamma_j^{(2)}\Big) < \infty. \qquad (7)$$
We have polynomial QMC-tractability of multivariate integration in $H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}})$ iff
$$\limsup_{s_1+s_2\to\infty} \Big(\frac{\sum_{j=1}^{s_1}\gamma_j^{(1)}}{\log_+ s_1} + \frac{\sum_{j=1}^{s_2}\gamma_j^{(2)}}{\log_+ s_2}\Big) < \infty, \qquad (8)$$
where $\log_+ s = \max(1, \log s)$.
We have weak QMC-tractability of multivariate integration in $H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}})$ iff
$$\lim_{s_1+s_2\to\infty} \frac{\sum_{j=1}^{s_1}\gamma_j^{(1)} + \sum_{j=1}^{s_2}\gamma_j^{(2)}}{s_1+s_2} = 0. \qquad (9)$$


The necessity of the conditions in Theorem 1 will be proven in Sect. 4 and the
sufficiency in Sect. 5. In the latter section we will see that the notions of tractability
can be achieved by using so-called hybrid point sets made of polynomial lattice point
sets and of classical lattice point sets. We will construct these by a component-by-component algorithm.

4 Proof of the Necessary Conditions


First we prove the following theorem.

Theorem 2 For any point set $\mathcal{S}_{N,\mathbf{s}} = ((\mathbf{x}_n,\mathbf{y}_n))_{n=0}^{N-1} \subseteq [0,1)^s$, we have
$$e^2(H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}), \mathcal{S}_{N,\mathbf{s}}) \ge -1 + \frac{1}{N}\prod_{j=1}^{s_1}\big(1+\gamma_j^{(1)}\mu(\alpha_1)\big)\prod_{j=1}^{s_2}\big(1+2\gamma_j^{(2)}\zeta(\alpha_2)\big),$$
where $\mu(\alpha) := \frac{b^{\alpha}(b-1)}{b^{\alpha}-b}$ for $\alpha > 1$, and where $\zeta$ is the Riemann zeta function.

Proof Let us, for the sake of simplicity, assume that
$$\gamma_j^{(1)} \le 1 \quad\text{and}\quad \gamma_j^{(2)} \le \frac{1}{2\zeta(\alpha_2)},$$
respectively, for $j \ge 1$. This imposes no loss of generality due to the fact that if we decrease product weights, then the problem becomes easier. Under the assumption on the weights we know from Sect. 2.2 that $K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}$ is nonnegative. Now, taking only the diagonal elements in (6), and from the representations of the kernels in (1), (3) and (5) we obtain
$$e^2(H(K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}), \mathcal{S}_{N,\mathbf{s}}) \ge -1 + \frac{1}{N^2}\sum_{n=0}^{N-1} K_{\mathbf{s},\boldsymbol{\alpha},\boldsymbol{\gamma}}((\mathbf{x}_n,\mathbf{y}_n),(\mathbf{x}_n,\mathbf{y}_n)) = -1 + \frac{1}{N}\prod_{j=1}^{s_1}\big(1+\gamma_j^{(1)}\mu(\alpha_1)\big)\prod_{j=1}^{s_2}\big(1+2\gamma_j^{(2)}\zeta(\alpha_2)\big),$$
since $\phi_{\mathrm{wal},\alpha}(x,x) = \mu(\alpha)$ according to [3, p. 170].


From Theorem 2, we conclude that for $\varepsilon \in (0,1)$ we have
$$N_{\min}(\varepsilon, s_1+s_2) \ge \frac{1}{1+\varepsilon^2}\prod_{j=1}^{s_1}\big(1+\gamma_j^{(1)}\mu(\alpha_1)\big)\prod_{j=1}^{s_2}\big(1+2\gamma_j^{(2)}\zeta(\alpha_2)\big).$$

Now the two products can be analyzed in the same way as it was done in [3] and [22],
respectively. This finally leads to the necessary conditions (7) and (8) in Theorem 1.
Now assume that we have weak QMC-tractability. Then for $\varepsilon = 1$ we have
$$\log N_{\min}(1, s_1+s_2) \ge \log\frac{1}{2} + \sum_{j=1}^{s_1}\log\big(1+\gamma_j^{(1)}\mu(\alpha_1)\big) + \sum_{j=1}^{s_2}\log\big(1+2\gamma_j^{(2)}\zeta(\alpha_2)\big)$$
and
$$\lim_{(s_1+s_2)\to\infty} \frac{\sum_{j=1}^{s_1}\log\big(1+\gamma_j^{(1)}\mu(\alpha_1)\big) + \sum_{j=1}^{s_2}\log\big(1+2\gamma_j^{(2)}\zeta(\alpha_2)\big)}{s_1+s_2} = 0.$$

This implies that $\lim_{j\to\infty}\gamma_j^{(k)} = 0$ for $k \in \{1,2\}$. For small enough $x > 0$ we have $\log(1+x) \ge cx$ for some $c > 0$. Hence, for some $j_1, j_2 \in \mathbb{N}$ and $s_1 \ge j_1$ and $s_2 \ge j_2$ we have
$$\sum_{j=1}^{s_1}\log\big(1+\gamma_j^{(1)}\mu(\alpha_1)\big) + \sum_{j=1}^{s_2}\log\big(1+2\gamma_j^{(2)}\zeta(\alpha_2)\big) \ge c_1\,\mu(\alpha_1)\sum_{j=j_1}^{s_1}\gamma_j^{(1)} + c_2\,2\zeta(\alpha_2)\sum_{j=j_2}^{s_2}\gamma_j^{(2)}$$
and therefore, under the assumption of weak QMC-tractability,
$$\lim_{(s_1+s_2)\to\infty} \frac{c_1\,\mu(\alpha_1)\sum_{j=j_1}^{s_1}\gamma_j^{(1)} + c_2\,2\zeta(\alpha_2)\sum_{j=j_2}^{s_2}\gamma_j^{(2)}}{s_1+s_2} = 0.$$
This implies the necessity of (9).

5 Proof of the Sufficient Conditions


We construct, component-by-component (or, for short, CBC), a QMC algorithm
whose worst-case error implies the sufficient conditions in Theorem 1. This QMC
algorithm is based on lattice rules and on polynomial lattice rules, where the lattice
rules are used to integrate the Korobov part of the integrand and the polynomial
lattice rules are used to integrate the Walsh part. We quickly recall the concepts of
(polynomial) lattice rules:
Lattice point sets (according to Hlawka [7] and Korobov [12]). Let $N\in\mathbb{N}$ be an integer and let $z = (z_1,\ldots,z_{s_2})\in\mathbb{Z}^{s_2}$. The lattice point set $(y_n)_{n=0}^{N-1}$ with generating vector $z$, consisting of $N$ points in $[0,1)^{s_2}$, is defined by

$$y_n = \Bigl(\Bigl\{\frac{n z_1}{N}\Bigr\},\ldots,\Bigl\{\frac{n z_{s_2}}{N}\Bigr\}\Bigr) \quad\text{for all } 0\le n\le N-1,$$

where $\{\cdot\}$ denotes the fractional part of a number. Note that it suffices to choose $z\in Z_N^{s_2}$, where

$$Z_N := \{z\in\{0,1,\ldots,N-1\} : \gcd(z,N)=1\}.$$
Polynomial lattice point sets (according to Niederreiter [18]). Let $\mathbb{F}_b$ be the finite field of prime order $b$. Furthermore let $\mathbb{F}_b[x]$ be the set of polynomials over $\mathbb{F}_b$, and let $\mathbb{F}_b((x^{-1}))$ be the field of formal Laurent series over $\mathbb{F}_b$. The latter contains the field of rational functions as a subfield. Given $m\in\mathbb{N}$, set $G_{b,m} := \{a\in\mathbb{F}_b[x] : \deg(a) < m\}$ and define a mapping $\nu_m\colon\mathbb{F}_b((x^{-1}))\to[0,1)$ by

$$\nu_m\Bigl(\sum_{l=\omega}^{\infty} t_l\, x^{-l}\Bigr) := \sum_{l=\max(1,\omega)}^{m} t_l\, b^{-l}.$$

Let $f\in\mathbb{F}_b[x]$ with $\deg(f) = m$ and $g = (g_1,\ldots,g_{s_1})\in\mathbb{F}_b[x]^{s_1}$. The polynomial lattice point set $(x_h)_{h\in G_{b,m}}$ with generating vector $g$, consisting of $b^m$ points in $[0,1)^{s_1}$, is defined by

$$x_h := \Bigl(\nu_m\Bigl(\frac{h(x)g_1(x)}{f(x)}\Bigr),\ldots,\nu_m\Bigl(\frac{h(x)g_{s_1}(x)}{f(x)}\Bigr)\Bigr) \quad\text{for all } h\in G_{b,m}.$$

A QMC rule using a (polynomial) lattice point set is called a (polynomial) lattice rule.
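To make the two constructions above concrete, the following sketch (ours, not part of the paper; the helper names and the choice of base $b=2$ with the irreducible modulus $f(x)=x^3+x+1$ are illustrative assumptions) generates a rank-1 lattice point set and a base-2 polynomial lattice point set.

```python
import numpy as np

def lattice_points(N, z):
    """Rank-1 lattice point set y_n = ({n z_1/N}, ..., {n z_s/N}), n = 0,...,N-1."""
    n = np.arange(N).reshape(-1, 1)
    return (n * np.asarray(z) / N) % 1.0

def poly_lattice_points_base2(m, f_mod, gen):
    """Base-2 polynomial lattice point set with modulus f_mod (degree m, encoded as an
    int with bit j = coefficient of x^j) and generating vector gen (entries of degree < m)."""
    def poly_mul(a, b):                       # carry-less multiplication in F_2[x]
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            b >>= 1
        return r

    def poly_mod(a, f):                       # reduction modulo f in F_2[x]
        df = f.bit_length() - 1
        while a and a.bit_length() - 1 >= df:
            a ^= f << (a.bit_length() - 1 - df)
        return a

    N = 1 << m
    pts = np.zeros((N, len(gen)))
    for h in range(N):                        # h runs through G_{b,m}
        for j, g in enumerate(gen):
            r = poly_mod(poly_mul(h, g), f_mod)
            digits = []
            for _ in range(m):                # first m digits of the Laurent expansion of r/f
                r <<= 1
                if r.bit_length() - 1 >= m:   # degree-m term appears -> digit 1, reduce
                    digits.append(1)
                    r ^= f_mod
                else:
                    digits.append(0)
            pts[h, j] = sum(d * 2.0 ** -(l + 1) for l, d in enumerate(digits))
    return pts

# 8-point examples: N = 8 lattice with z = (1, 5), and a polynomial lattice with
# m = 3, f(x) = x^3 + x + 1 (0b1011), generating vector (1, x + 1).
Y = lattice_points(8, [1, 5])
X = poly_lattice_points_base2(3, 0b1011, [0b001, 0b011])
```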

5.1 Component-by-Component Construction


We now show a CBC construction algorithm for point sets that are suitable for integration in the space $H(K_{s,\alpha,\gamma})$. For practical reasons, we will, in the following, denote the worst-case error of a hybrid point set $S_{N,s} = ((x_n,y_n))_{n=0}^{N-1}$, consisting of an $s_1$-dimensional polynomial lattice generated by $g$ and an $s_2$-dimensional lattice generated by $z$, by $e^2_{s,\alpha,\gamma}(g,z)$, where $g$ is the generating vector of the polynomial lattice part, and $z$ is the generating vector of the lattice part. Using the kernel representations in (2) and (4) we have

$$e^2_{s,\alpha,\gamma}(g,z) = -1 + \frac{1}{N^2}\sum_{n,n'=0}^{N-1}\prod_{j=1}^{s_1}\Biggl(1+\gamma_j^{(1)}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,j}\ominus x_{n',j})}{b^{\alpha_1\lfloor\log_b k\rfloor}}\Biggr)\prod_{j=1}^{s_2}\Biggl(1+\gamma_j^{(2)}\sum_{l\in\mathbb{Z}\setminus\{0\}}\frac{e^{2\pi i l(y_{n,j}-y_{n',j})}}{|l|^{\alpha_2}}\Biggr), \qquad (10)$$

where $x_{n,j}$ is the $j$th component of $x_n$, and similarly for $y_{n,j}$.


We now proceed to our construction algorithm. Note that we state the algorithm in a way such that we exclude the cases $s_1=0$ or $s_2=0$, as these are covered by the results in [2] and [16]. For $s\in\mathbb{N}$ let $[s] := \{1,\ldots,s\}$.

Algorithm 1 Let $s_1, s_2, m\in\mathbb{N}$, a prime number $b$, and an irreducible polynomial $f\in\mathbb{F}_b[x]$ with $\deg(f)=m$ be given. We write $N = b^m$.
1. For $d_1=1$, choose $g_1 = 1\in G_{b,m}$.
2. For $d_2=1$, choose $z_1\in Z_N$ such that $e^2_{(1,1),\alpha,\gamma}(g_1,z_1)$ is minimized as a function of $z_1$.
3. For $d_1\in[s_1]$ and $d_2\in[s_2]$, assume that $g_{d_1}=(g_1,\ldots,g_{d_1})$ and $z_{d_2}=(z_1,\ldots,z_{d_2})$ are given. If $d_1<s_1$ and $d_2<s_2$ go to either Step (3a) or (3b). If $d_1=s_1$ and $d_2<s_2$ go to Step (3b). If $d_1<s_1$ and $d_2=s_2$, go to Step (3a). If $d_1=s_1$ and $d_2=s_2$, the algorithm terminates.
   a. Choose $g_{d_1+1}\in G_{b,m}$ such that $e^2_{(d_1+1,d_2),\alpha,\gamma}((g_{d_1},g_{d_1+1}),z_{d_2})$ is minimized as a function of $g_{d_1+1}$. Increase $d_1$ by 1 and repeat Step 3.
   b. Choose $z_{d_2+1}\in Z_N$ such that $e^2_{(d_1,d_2+1),\alpha,\gamma}(g_{d_1},(z_{d_2},z_{d_2+1}))$ is minimized as a function of $z_{d_2+1}$. Increase $d_2$ by 1 and repeat Step 3.
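The control flow of Algorithm 1 can be summarized in a short sketch (ours, under the assumption that a routine `sq_worst_case_error(g, z)` evaluating $e^2_{(d_1,d_2),\alpha,\gamma}$ via (10) is available; when both parts can still be extended, the sketch simply alternates between Steps 3a and 3b, which is one admissible choice).

```python
from math import gcd

def cbc_hybrid(s1, s2, b, m, sq_worst_case_error):
    """Structural sketch of Algorithm 1 (component-by-component construction).
    sq_worst_case_error(g, z) is assumed to evaluate the squared worst-case error
    of the hybrid rule with polynomial-lattice generators g and lattice generators z."""
    N = b ** m
    G_bm = range(N)                              # polynomials of degree < m, encoded as integers
    Z_N = [z for z in range(N) if gcd(z, N) == 1]

    g = [1]                                                          # Step 1: g_1 = 1
    z = [min(Z_N, key=lambda z1: sq_worst_case_error(g, [z1]))]      # Step 2

    d1, d2 = 1, 1
    while d1 < s1 or d2 < s2:                    # Step 3
        if d1 < s1:                              # Step 3a: extend the polynomial lattice part
            g.append(min(G_bm, key=lambda gn: sq_worst_case_error(g + [gn], z)))
            d1 += 1
        if d2 < s2:                              # Step 3b: extend the lattice part
            z.append(min(Z_N, key=lambda zn: sq_worst_case_error(g, z + [zn])))
            d2 += 1
    return g, z
```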
Remark 1 As pointed out in, e.g., [22] and [3], the infinite sums in (10) can be represented in closed form, so the construction cost of Algorithm 1 is of order $O(N^3(s_1+s_2)^2)$. Of course it would be desirable to lower this cost bound. If $s_1=0$ or $s_2=0$ one can use the fast CBC approach based on FFT as done by Cools and Nuyens to reduce the construction cost to $O(sN\log N)$, where $s\in\{s_1,s_2\}$. It is not yet clear if these ideas also apply to the hybrid case.
Theorem 3 Let $d_1\in[s_1]$ and $d_2\in[s_2]$ be given. Then the generating vectors $g_{d_1}$ and $z_{d_2}$ constructed by Algorithm 1 satisfy

$$e^2_{(d_1,d_2),\alpha,\gamma}(g_{d_1},z_{d_2}) \le \frac{2}{N}\prod_{j=1}^{d_1}\bigl(1+\gamma_j^{(1)}\,2\mu_b(\alpha_1)\bigr)\prod_{j=1}^{d_2}\bigl(1+\gamma_j^{(2)}\,4\zeta(\alpha_2)\bigr).$$
The proof of Theorem 3 is deferred to the appendix.

5.2 Proof of the Sufficient Conditions


From Theorem 3 it follows that for $N = b^m$ we have

$$e^2(N, s_1+s_2) \le \frac{2}{N}\prod_{j=1}^{s_1}\bigl(1+\gamma_j^{(1)}\,2\mu_b(\alpha_1)\bigr)\prod_{j=1}^{s_2}\bigl(1+\gamma_j^{(2)}\,4\zeta(\alpha_2)\bigr).$$


Assuming that (7) holds, we know that $\sum_{j=1}^{\infty}\gamma_j^{(1)} < \infty$, and hence

$$\prod_{j=1}^{s_1}\bigl(1+2\gamma_j^{(1)}\mu_b(\alpha_1)\bigr) \le \exp\Bigl(2\mu_b(\alpha_1)\sum_{j=1}^{\infty}\gamma_j^{(1)}\Bigr) =: C_1(\alpha_1,\gamma^{(1)}).$$

A similar argument shows that $\prod_{j=1}^{s_2}\bigl(1+4\gamma_j^{(2)}\zeta(\alpha_2)\bigr) \le C_2(\alpha_2,\gamma^{(2)})$. Hence

$$e^2(N, s_1+s_2) \le \frac{2}{N}\,C_1(\alpha_1,\gamma^{(1)})\,C_2(\alpha_2,\gamma^{(2)}) =: \frac{C(\alpha,\gamma)}{N}.$$

For $\varepsilon>0$ choose $m\in\mathbb{N}$ such that $b^{m-1} < \lceil C(\alpha,\gamma)\,\varepsilon^{-2}\rceil =: N_\varepsilon \le b^m$. Then we have $e(b^m, s_1+s_2)\le\varepsilon$ and hence

$$N_{\min}(\varepsilon, s_1+s_2) \le b^m < b\,N_\varepsilon = b\,\lceil C(\alpha,\gamma)\,\varepsilon^{-2}\rceil.$$

This implies strong polynomial QMC-tractability. The corresponding bounds can be achieved with the point set constructed by Algorithm 1.
The sufficiency of the condition for polynomial QMC-tractability is shown in a similar fashion by standard arguments (cf. [3, 22]).
For weak QMC-tractability we deduce from Theorem 3 that

$$N_{\min}(\varepsilon, s_1+s_2) \le \Bigl\lceil 2\varepsilon^{-2}\prod_{j=1}^{s_1}\bigl(1+\gamma_j^{(1)}\,2\mu_b(\alpha_1)\bigr)\prod_{j=1}^{s_2}\bigl(1+\gamma_j^{(2)}\,4\zeta(\alpha_2)\bigr)\Bigr\rceil.$$

Hence

$$\log N_{\min}(\varepsilon, s_1+s_2) \le \log 4 + 2\log\varepsilon^{-1} + 2\mu_b(\alpha_1)\sum_{j=1}^{s_1}\gamma_j^{(1)} + 4\zeta(\alpha_2)\sum_{j=1}^{s_2}\gamma_j^{(2)},$$

and this together with (9) implies the result.

6 Open Questions
The findings of this paper naturally lead to the following two open problems:
– Study tractability for general algorithms (not only QMC rules) and compare the tractability conditions with those given in Theorem 1.
– From Theorem 3 we obtain a convergence rate of order $O(N^{-1/2})$ for the worst-case error, which is the same as for plain Monte Carlo. Improve this convergence rate.


Acknowledgments The authors would like to thank the anonymous referees for their remarks
which helped to improve the presentation of this paper. P. Kritzer is supported by the Austrian
Science Fund (FWF), Projects P23389-N18 and F05506-26. The latter is part of the Special Research
Program Quasi-Monte Carlo Methods: Theory and Applications. F. Pillichshammer is supported
by the Austrian Science Fund (FWF) Project F5509-N26, which is part of the Special Research
Program Quasi-Monte Carlo Methods: Theory and Applications.

Appendix: The Proof of Theorem 3


Proof We show the result by an inductive argument. We start our considerations by dealing with the case where $d_1 = d_2 = 1$. According to Algorithm 1, we have chosen $g_1 = 1\in G_{b,m}$ and $z_1\in Z_N$ such that $e^2_{(1,1),\alpha,\gamma}(g_1,z_1)$ is minimized as a function of $z_1$. In the following, we denote the points generated by $(g,z)\in G_{b,m}\times Z_N$ by $(x_n(g), y_n(z))$.
According to Eq. (10), we have

$$e^2_{(1,1),\alpha,\gamma}(g_1,z_1) = e^2_{1,\alpha_1,\gamma^{(1)}}(1) + \theta_{(1,1)}(z_1),$$

where $e^2_{1,\alpha_1,\gamma^{(1)}}(1)$ denotes the squared worst-case error of the polynomial lattice rule generated by 1 in the Walsh space $H(K^{\mathrm{Wal}}_{1,\alpha_1,\gamma^{(1)}})$, and where

$$\theta_{(1,1)}(z_1) := \frac{\gamma_1^{(2)}}{N^2}\sum_{n,n'=0}^{N-1}\Biggl(1+\gamma_1^{(1)}\sum_{k_1\in\mathbb{N}}\frac{\mathrm{wal}_{k_1}(x_{n,1}(1)\ominus x_{n',1}(1))}{b^{\alpha_1\lfloor\log_b k_1\rfloor}}\Biggr)\sum_{l_1\in\mathbb{Z}\setminus\{0\}}\frac{e^{2\pi i l_1(y_{n,1}(z_1)-y_{n',1}(z_1))}}{|l_1|^{\alpha_2}}.$$

By results in [2], we know that

$$e^2_{1,\alpha_1,\gamma^{(1)}}(1) \le \frac{2}{N}\bigl(1+\gamma_1^{(1)}\mu_b(\alpha_1)\bigr).$$

Then, as $z_1$ was chosen to minimize the error,

$$\theta_{(1,1)}(z_1) \le \frac{1}{\varphi(N)}\sum_{z\in Z_N}\theta_{(1,1)}(z) = \frac{\gamma_1^{(2)}}{N^2}\sum_{n,n'=0}^{N-1}\Biggl(1+\gamma_1^{(1)}\sum_{k_1\in\mathbb{N}}\frac{\mathrm{wal}_{k_1}(x_{n,1}(1)\ominus x_{n',1}(1))}{b^{\alpha_1\lfloor\log_b k_1\rfloor}}\Biggr)\frac{1}{\varphi(N)}\sum_{z\in Z_N}\sum_{l_1\in\mathbb{Z}\setminus\{0\}}\frac{e^{2\pi i l_1(y_{n,1}(z)-y_{n',1}(z))}}{|l_1|^{\alpha_2}} \le \gamma_1^{(2)}\bigl(1+\gamma_1^{(1)}\mu_b(\alpha_1)\bigr)B, \qquad (11)$$


where

$$B := \frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{n'=0}^{N-1}\Biggl|\frac{1}{\varphi(N)}\sum_{z\in Z_N}\sum_{l_1\in\mathbb{Z}\setminus\{0\}}\frac{e^{2\pi i(n-n')z l_1/N}}{|l_1|^{\alpha_2}}\Biggr| = \frac{1}{N}\sum_{n=1}^{N}\Biggl|\frac{1}{\varphi(N)}\sum_{z\in Z_N}\sum_{l_1\in\mathbb{Z}\setminus\{0\}}\frac{e^{2\pi i n z l_1/N}}{|l_1|^{\alpha_2}}\Biggr|,$$

since the inner sum in the second line always has the same value. We now use [16, Lemmas 2.1 and 2.3] and obtain $B \le 4\zeta(\alpha_2)N^{-1}$, where we used that $N$ has only one prime factor. Hence we obtain

$$\theta_{(1,1)}(z_1) \le \frac{\gamma_1^{(2)}}{N}\bigl(1+\gamma_1^{(1)}\mu_b(\alpha_1)\bigr)\,4\zeta(\alpha_2). \qquad (12)$$

Combining Eqs. (11) and (12) yields the desired bound for (g1 , z1 ).
Let us now assume $d_1\in[s_1]$ and $d_2\in[s_2]$ and that we have already found generating vectors $g_{d_1}$ and $z_{d_2}$ such that the bound in Theorem 3 is satisfied.
In what follows, we are going to distinguish two cases: in the first case, we assume that $d_1<s_1$ and add a component $g_{d_1+1}$ to $g_{d_1}$, and in the second case, we assume that $d_2<s_2$ and add a component $z_{d_2+1}$ to $z_{d_2}$. In both cases, we will show that the corresponding bounds on the squared worst-case errors hold.
Let us first consider the case where we start from $(g_{d_1}, z_{d_2})$ and add, by Algorithm 1, a component $g_{d_1+1}$ to $g_{d_1}$. According to Eq. (10), we have

$$e^2_{(d_1+1,d_2),\alpha,\gamma}\bigl((g_{d_1}, g_{d_1+1}), z_{d_2}\bigr) = e^2_{(d_1,d_2),\alpha,\gamma}(g_{d_1}, z_{d_2}) + \theta_{(d_1+1,d_2)}(g_{d_1+1}),$$

where

$$\theta_{(d_1+1,d_2)}(g_{d_1+1}) := \frac{\gamma^{(1)}_{d_1+1}}{N^2}\sum_{n,n'=0}^{N-1}\prod_{j=1}^{d_1}\Biggl(1+\gamma_j^{(1)}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,j}(g_j)\ominus x_{n',j}(g_j))}{b^{\alpha_1\lfloor\log_b k\rfloor}}\Biggr)\prod_{j=1}^{d_2}\Biggl(1+\gamma_j^{(2)}\sum_{l\in\mathbb{Z}\setminus\{0\}}\frac{e^{2\pi i l(y_{n,j}(z_j)-y_{n',j}(z_j))}}{|l|^{\alpha_2}}\Biggr)\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g_{d_1+1})\ominus x_{n',d_1+1}(g_{d_1+1}))}{b^{\alpha_1\lfloor\log_b k\rfloor}}.$$


However, by the assumption, we know that

$$e^2_{(d_1,d_2),\alpha,\gamma}(g_{d_1}, z_{d_2}) \le \frac{2}{N}\prod_{j=1}^{d_1}\bigl(1+\gamma_j^{(1)}\,2\mu_b(\alpha_1)\bigr)\prod_{j=1}^{d_2}\bigl(1+\gamma_j^{(2)}\,4\zeta(\alpha_2)\bigr). \qquad (13)$$

Furthermore, as $g_{d_1+1}$ was chosen to minimize the error,

$$\theta_{(d_1+1,d_2)}(g_{d_1+1}) \le \frac{1}{N}\sum_{g\in G_{b,m}}\theta_{(d_1+1,d_2)}(g) \le \gamma^{(1)}_{d_1+1}\prod_{j=1}^{d_1}\bigl(1+\gamma_j^{(1)}\mu_b(\alpha_1)\bigr)\prod_{j=1}^{d_2}\bigl(1+\gamma_j^{(2)}\,2\zeta(\alpha_2)\bigr)\,C_\gamma,$$

where

$$C_\gamma := \frac{1}{N^2}\sum_{n,n'=0}^{N-1}\Biggl|\frac{1}{N}\sum_{g\in G_{b,m}}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g)\ominus x_{n',d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}}\Biggr| = \frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{n'=0}^{N-1}\Biggl|\frac{1}{N}\sum_{g\in G_{b,m}}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n\ominus n',d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}}\Biggr| = \frac{1}{N}\sum_{n=0}^{N-1}\Biggl|\frac{1}{N}\sum_{g\in G_{b,m}}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}}\Biggr|,$$

where we used the group structure of the polynomial lattice points (see [4, Sect. 4.4.4]) in order to get from the first to the second line, and where we again used that the inner sum in the second line always has the same value. We now write

$$C_\gamma = \frac{1}{N}\sum_{k\in\mathbb{N}}\frac{1}{b^{\alpha_1\lfloor\log_b k\rfloor}} + \frac{1}{N}\sum_{n=1}^{N-1}\Biggl|\frac{1}{N}\sum_{g\in G_{b,m}}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}}\Biggr| = \frac{\mu_b(\alpha_1)}{N} + \frac{1}{N}\sum_{n=1}^{N-1}\Biggl|\frac{1}{N}\sum_{g\in G_{b,m}}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}}\Biggr|.$$


Let now $n\in\{1,\ldots,N-1\}$ be fixed, and consider the term

$$C_{\gamma,n} := \frac{1}{N}\sum_{g\in G_{b,m}}\sum_{k\in\mathbb{N}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}} = \frac{1}{N}\sum_{g\in G_{b,m}}\sum_{\substack{k\in\mathbb{N}\\ k\equiv 0\,(N)}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}} + \frac{1}{N}\sum_{g\in G_{b,m}}\sum_{\substack{k\in\mathbb{N}\\ k\not\equiv 0\,(N)}}\frac{\mathrm{wal}_k(x_{n,d_1+1}(g))}{b^{\alpha_1\lfloor\log_b k\rfloor}} =: C_{\gamma,n,1} + C_{\gamma,n,2}.$$

By results in [2],

$$C_{\gamma,n,1} = \sum_{\substack{k\in\mathbb{N}\\ k\equiv 0\,(N)}}\frac{1}{b^{\alpha_1\lfloor\log_b k\rfloor}} \le \frac{\mu_b(\alpha_1)}{b^{\alpha_1 m}} \le \frac{\mu_b(\alpha_1)}{N}.$$

Furthermore,

$$C_{\gamma,n,2} = \sum_{\substack{k\in\mathbb{N}\\ k\not\equiv 0\,(N)}}\frac{1}{b^{\alpha_1\lfloor\log_b k\rfloor}}\,\frac{1}{N}\sum_{g\in G_{b,m}}\mathrm{wal}_k(x_{n,d_1+1}(g)) = \sum_{\substack{k\in\mathbb{N}\\ k\not\equiv 0\,(N)}}\frac{1}{b^{\alpha_1\lfloor\log_b k\rfloor}}\,\frac{1}{N}\sum_{g=0}^{b^m-1}\mathrm{wal}_k\Bigl(\frac{g}{b^m}\Bigr),$$

where we used that

$$\sum_{g\in G_{b,m}}\mathrm{wal}_k(x_{n,d_1+1}(g)) = \sum_{g\in G_{b,m}}\mathrm{wal}_k\Bigl(\nu_m\Bigl(\frac{n(x)g(x)}{f(x)}\Bigr)\Bigr) = \sum_{g\in G_{b,m}}\mathrm{wal}_k\Bigl(\nu_m\Bigl(\frac{g(x)}{f(x)}\Bigr)\Bigr) = \sum_{g=0}^{b^m-1}\mathrm{wal}_k\Bigl(\frac{g}{b^m}\Bigr),$$

since $n\ne 0$ and since $ng$ takes on all values in $G_{b,m}$, and $f$ is irreducible. However, $\sum_{g=0}^{b^m-1}\mathrm{wal}_k\bigl(\frac{g}{b^m}\bigr) = 0$ and so $C_{\gamma,n,2} = 0$. This yields $C_{\gamma,n}\le\mu_b(\alpha_1)N^{-1}$ and $C_\gamma\le 2\mu_b(\alpha_1)N^{-1}$, which in turn implies

$$\theta_{(d_1+1,d_2)}(g_{d_1+1}) \le \frac{2\gamma^{(1)}_{d_1+1}\mu_b(\alpha_1)}{N}\prod_{j=1}^{d_1}\bigl(1+\gamma_j^{(1)}\mu_b(\alpha_1)\bigr)\prod_{j=1}^{d_2}\bigl(1+\gamma_j^{(2)}\,2\zeta(\alpha_2)\bigr).$$


Combining the latter result with Eq. (13), we obtain

$$e^2_{(d_1+1,d_2),\alpha,\gamma}\bigl((g_{d_1},g_{d_1+1}),z_{d_2}\bigr) \le \frac{2}{N}\prod_{j=1}^{d_1+1}\bigl(1+2\gamma_j^{(1)}\mu_b(\alpha_1)\bigr)\prod_{j=1}^{d_2}\bigl(1+\gamma_j^{(2)}\,4\zeta(\alpha_2)\bigr).$$

The case where we start from $(g_{d_1}, z_{d_2})$ and add, by Algorithm 1, a component $z_{d_2+1}$ to $z_{d_2}$ can be shown by a similar reasoning. We just sketch the basic points: according to Eq. (10), we have

$$e^2_{(d_1,d_2+1),\alpha,\gamma}\bigl(g_{d_1},(z_{d_2},z_{d_2+1})\bigr) = e^2_{(d_1,d_2),\alpha,\gamma}(g_{d_1},z_{d_2}) + \theta_{(d_1,d_2+1)}(z_{d_2+1}),$$

where $e^2_{(d_1,d_2),\alpha,\gamma}(g_{d_1},z_{d_2})$ satisfies (13) and where

$$\theta_{(d_1,d_2+1)}(z_{d_2+1}) \le \gamma^{(2)}_{d_2+1}\prod_{j=1}^{d_1}\bigl(1+\gamma_j^{(1)}\mu_b(\alpha_1)\bigr)\prod_{j=1}^{d_2}\bigl(1+\gamma_j^{(2)}\,2\zeta(\alpha_2)\bigr)\,D_\gamma,$$

with

$$D_\gamma = \frac{1}{N}\sum_{n=0}^{N-1}\Biggl|\frac{1}{\varphi(N)}\sum_{z\in Z_N}\sum_{l\in\mathbb{Z}\setminus\{0\}}\frac{e^{2\pi i n z l/N}}{|l|^{\alpha_2}}\Biggr| \le \frac{4\zeta(\alpha_2)}{N},$$

according to [16, Lemmas 2.1 and 2.3]. This implies

$$\theta_{(d_1,d_2+1)}(z_{d_2+1}) \le \frac{4\gamma^{(2)}_{d_2+1}\zeta(\alpha_2)}{N}\prod_{j=1}^{d_1}\bigl(1+\gamma_j^{(1)}\mu_b(\alpha_1)\bigr)\prod_{j=1}^{d_2}\bigl(1+\gamma_j^{(2)}\,2\zeta(\alpha_2)\bigr).$$

Combining these results we obtain the desired bound. □

References
1. Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68, 337–404 (1950)
2. Dick, J., Kuo, F.Y., Pillichshammer, F., Sloan, I.H.: Construction algorithms for polynomial lattice rules for multivariate integration. Math. Comput. 74, 1895–1921 (2005)
3. Dick, J., Pillichshammer, F.: Multivariate integration in weighted Hilbert spaces based on Walsh functions and weighted Sobolev spaces. J. Complex. 21, 149–195 (2005)
4. Dick, J., Pillichshammer, F.: Digital Nets and Sequences. Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
5. Hellekalek, P.: Hybrid function systems in the theory of uniform distribution of sequences. In: Plaskota, L., Wozniakowski, H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2010, pp. 435–449. Springer, Berlin (2012)
6. Hellekalek, P., Kritzer, P.: On the diaphony of some finite hybrid point sets. Acta Arithmetica 156, 257–282 (2012)
7. Hlawka, E.: Zur angenäherten Berechnung mehrfacher Integrale. Monatshefte für Mathematik 66, 140–151 (1962)
8. Hofer, R., Kritzer, P.: On hybrid sequences built of Niederreiter–Halton sequences and Kronecker sequences. Bull. Aust. Math. Soc. 84, 238–254 (2011)
9. Hofer, R., Kritzer, P., Larcher, G., Pillichshammer, F.: Distribution properties of generalized van der Corput–Halton sequences and their subsequences. Int. J. Number Theory 5, 719–746 (2009)
10. Hofer, R., Larcher, G.: Metrical results on the discrepancy of Halton–Kronecker sequences. Mathematische Zeitschrift 271, 1–11 (2012)
11. Keller, A.: Quasi-Monte Carlo image synthesis in a nutshell. In: Dick, J., Kuo, F.Y., Peters, G.W., Sloan, I.H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods, pp. 213–249. Springer, Berlin (2013)
12. Korobov, N.M.: Approximate evaluation of repeated integrals. Doklady Akademii Nauk SSSR 124, 1207–1210 (1959) (in Russian)
13. Kritzer, P.: On an example of finite hybrid quasi-Monte Carlo point sets. Monatshefte für Mathematik 168, 443–459 (2012)
14. Kritzer, P., Leobacher, G., Pillichshammer, F.: Component-by-component construction of hybrid point sets based on Hammersley and lattice point sets. In: Dick, J., Kuo, F.Y., Peters, G.W., Sloan, I.H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2012, pp. 501–515. Springer, Berlin (2013)
15. Kritzer, P., Pillichshammer, F.: On the existence of low-diaphony sequences made of digital sequences and lattice point sets. Mathematische Nachrichten 286, 224–235 (2013)
16. Kuo, F.Y., Joe, S.: Component-by-component construction of good lattice rules with a composite number of points. J. Complex. 18, 943–976 (2002)
17. Larcher, G.: Discrepancy estimates for sequences: new results and open problems. In: Kritzer, P., Niederreiter, H., Pillichshammer, F., Winterhof, A. (eds.) Uniform Distribution and Quasi-Monte Carlo Methods, Radon Series in Computational and Applied Mathematics, pp. 171–189. DeGruyter, Berlin (2014)
18. Niederreiter, H.: Low-discrepancy point sets obtained by digital constructions over finite fields. Czechoslovak Mathematical Journal 42, 143–166 (1992)
19. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems, Volume I: Linear Information. EMS, Zurich (2008)
20. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems, Volume II: Standard Information for Functionals. EMS, Zurich (2010)
21. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems, Volume III: Standard Information for Operators. EMS, Zurich (2012)
22. Sloan, I.H., Wozniakowski, H.: Tractability of multivariate integration for weighted Korobov classes. J. Complex. 17, 697–721 (2001)
23. Traub, J.F., Wasilkowski, G.W., Wozniakowski, H.: Information-Based Complexity. Academic Press, New York (1988)

Derivative-Based Global Sensitivity Measures and Their Link with Sobol Sensitivity Indices

Sergei Kucherenko and Shufang Song

Abstract The variance-based method of Sobol sensitivity indices is very popular among practitioners due to its efficiency and easiness of interpretation. However, for high-dimensional models the direct application of this method can be very time-consuming and prohibitively expensive to use. One of the alternative global sensitivity analysis methods, known as the method of derivative based global sensitivity measures (DGSM), has recently become popular among practitioners. It has a link with the Morris screening method and Sobol sensitivity indices. DGSM are very easy to implement and evaluate numerically. The computational time required for numerical evaluation of DGSM is generally much lower than that for estimation of Sobol sensitivity indices. We present a survey of recent advances in DGSM and new results concerning new lower and upper bounds on the values of the Sobol total sensitivity indices $S_i^{tot}$. Using these bounds it is possible in most cases to get a good practical estimation of the values of $S_i^{tot}$. Several examples are used to illustrate an application of DGSM.

Keywords Global sensitivity analysis · Monte Carlo methods · Quasi Monte Carlo methods · Derivative based global measures · Morris method · Sobol sensitivity indices

1 Introduction
Global sensitivity analysis (GSA) is the study of how the uncertainty in the model
output is apportioned to the uncertainty in model inputs [9, 14]. GSA can provide
valuable information regarding the dependence of the model output to its input parameters. The variance-based method of global sensitivity indices developed by Sobol
[11] became very popular among practitioners due to its efficiency and easiness of
S. Kucherenko (B) S. Song
Imperial College London, SW7 2AZ, London, UK
e-mail: s.kucherenko@imperial.ac.uk
S. Song
e-mail: shufangsong@nwpu.edu.cn

interpretation. There are two types of Sobol sensitivity indices: the main effect
indices, which estimate the individual contribution of each input parameter to the
output variance, and the total sensitivity indices, which measure the total contribution
of a single input factor or a group of inputs [3]. The total sensitivity indices are used
to identify non-important variables which can then be fixed at their nominal values
to reduce model complexity [9]. For high-dimensional models the direct application
of variance-based GSA measures can be extremely time-consuming and impractical.
A number of alternative SA techniques have been proposed. In this paper we
present derivative based global sensitivity measures (DGSM) and their link with
Sobol sensitivity indices. DGSM are based on averaging local derivatives using
Monte Carlo or Quasi Monte Carlo sampling methods. These measures were briefly
introduced by Sobol and Gershman in [12]. Kucherenko et al. [6] introduced some
other derivative-based global sensitivity measures (DGSM) and coined the acronym
DGSM. They showed that the computational cost of numerical evaluation of DGSM
can be much lower than that for estimation of Sobol sensitivity indices which later
was confirmed in other works [5]. DGSM can be seen as a generalization and formalization of the Morris importance measure also known as elementary effects [8].
Sobol and Kucherenko [15] proved theoretically that there is a link between DGSM and the Sobol total sensitivity index $S_i^{tot}$ for the same input. They showed that DGSM can be used as an upper bound on the total sensitivity index $S_i^{tot}$. They also introduced modified DGSM which can be used both for a single input and for groups of inputs [16]. Such measures can be applied for problems with a high number of input variables to reduce the computational time. Lamboni et al. [7] extended results of Sobol and Kucherenko for models with input variables belonging to the class of Boltzmann probability measures.
The numerical efficiency of the DGSM method can be improved by using the automatic differentiation algorithm for calculating DGSM, as was shown in [5]. However, the number of required function evaluations still remains proportional to the number of inputs. This dependence can be greatly reduced using an approach based on algorithmic differentiation in the adjoint or reverse mode [1]. It allows estimating all derivatives at a cost of at most 4–6 times that of evaluating the original function [4].
This paper is organised as follows: Sect. 2 presents Sobol global sensitivity
indices. DGSM and lower and upper bounds on total Sobol sensitivity indices
for uniformly distributed variables and random variables are presented in Sects. 3
and 4, respectively. In Sect. 5 we consider test cases which illustrate an application
of DGSM and their links with total Sobol sensitivity indices. Finally, conclusions
are presented in Sect. 6.

2 Sobol Global Sensitivity Indices


The method of global sensitivity indices developed by Sobol is based on ANOVA decomposition [11]. Consider the square integrable function $f(x)$ defined in the unit hypercube $H^d = [0,1]^d$. The decomposition of $f(x)$

$$f(x) = f_0 + \sum_{i=1}^{d} f_i(x_i) + \sum_{i=1}^{d}\sum_{j>i}^{d} f_{ij}(x_i,x_j) + \cdots + f_{12\ldots d}(x_1,\ldots,x_d), \qquad (1)$$

where $f_0 = \int_{H^d} f(x)\,dx$, is called ANOVA if the conditions

$$\int_{H^d} f_{i_1\ldots i_s}\,dx_{i_k} = 0 \qquad (2)$$

are satisfied for all different groups of indices $i_1,\ldots,i_s$ such that $1\le i_1 < i_2 < \cdots < i_s \le d$. These conditions guarantee that all terms in (1) are mutually orthogonal with respect to integration.
The variances of the terms in the ANOVA decomposition add up to the total variance:

$$D = \int_{H^d} f^2(x)\,dx - f_0^2 = \sum_{s=1}^{d}\sum_{i_1<\cdots<i_s} D_{i_1\ldots i_s},$$

where $D_{i_1\ldots i_s} = \int_{H^d} f^2_{i_1\ldots i_s}(x_{i_1},\ldots,x_{i_s})\,dx_{i_1}\cdots dx_{i_s}$ are called partial variances.
Total partial variances account for the total influence of the factor $x_i$:

$$D_i^{tot} = \sum_{\langle i\rangle} D_{i_1\ldots i_s},$$

where the sum $\sum_{\langle i\rangle}$ is extended over all different groups of indices $i_1,\ldots,i_s$ satisfying the condition $1\le i_1 < i_2 < \cdots < i_s \le d$, $1\le s\le d$, where one of the indices is equal to $i$. The corresponding total sensitivity index is defined as

$$S_i^{tot} = D_i^{tot}/D.$$

Denote by $u_i(x)$ the sum of all terms in the ANOVA decomposition (1) that depend on $x_i$:

$$u_i(x) = f_i(x_i) + \sum_{j=1, j\ne i}^{d} f_{ij}(x_i,x_j) + \cdots + f_{12\ldots d}(x_1,\ldots,x_d).$$

From the definition of the ANOVA decomposition it follows that

$$\int_{H^d} u_i(x)\,dx = 0. \qquad (3)$$


The total partial variance $D_i^{tot}$ can be computed as

$$D_i^{tot} = \int_{H^d} u_i^2(x)\,dx = \int_{H^d} u_i^2(x_i,z)\,dx_i\,dz.$$

Denote by $z = (x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_d)$ the vector of all variables but $x_i$; then $x\equiv(x_i,z)$ and $f(x)\equiv f(x_i,z)$. The ANOVA decomposition of $f(x)$ in (1) can be presented in the following form

$$f(x) = u_i(x_i,z) + v(z),$$

where $v(z)$ is the sum of terms independent of $x_i$. Because of (2) and (3) it is easy to show that $v(z) = \int_{H^d} f(x)\,dx_i$. Hence

$$u_i(x_i,z) = f(x) - \int_{H^d} f(x)\,dx_i. \qquad (4)$$

Then the total sensitivity index $S_i^{tot}$ is equal to

$$S_i^{tot} = \frac{\int_{H^d} u_i^2(x)\,dx}{D}. \qquad (5)$$

We note that in the case of independent random variables all definitions of the ANOVA decomposition remain correct, but all derivations should be considered in the probabilistic sense, as shown in [14] and presented in Sect. 4.
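As a point of comparison for the DGSM-based bounds discussed below, $S_i^{tot}$ itself can be estimated by Monte Carlo with $N(d+1)$ function evaluations (cf. [10]); the following sketch is ours and only illustrates the standard two-block estimator, it is not taken from the paper.

```python
import numpy as np

def total_indices(f, d, N, rng=None):
    """Monte Carlo estimate of all total Sobol indices S_i^tot.
    Uses two independent sample blocks A, B and N*(d+1) evaluations of f on [0,1]^d."""
    rng = np.random.default_rng(0) if rng is None else rng
    A = rng.random((N, d))
    B = rng.random((N, d))
    fA = f(A)
    D = fA.var()                                  # estimate of the total variance
    S_tot = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                       # resample only the i-th input
        S_tot[i] = 0.5 * np.mean((fA - f(ABi)) ** 2) / D
    return S_tot
```

QMC points (e.g. Sobol sequences, as used later in Sect. 5) can be substituted for the random samples without changing the structure.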

3 DGSM for Uniformly Distributed Variables


Consider a continuously differentiable function $f(x)$ defined in the unit hypercube $H^d = [0,1]^d$ such that $\partial f/\partial x_i\in L_2$.

Theorem 1 Assume that $c\le\bigl|\frac{\partial f}{\partial x_i}\bigr|\le C$. Then

$$\frac{c^2}{12D} \le S_i^{tot} \le \frac{C^2}{12D}. \qquad (6)$$

The proof is presented in [15].

The Morris importance measure, also known as elementary effects, originally defined as finite differences averaged over a finite set of random points [8], was generalized in [6]:

$$\mu_i = \int_{H^d}\Bigl|\frac{\partial f(x)}{\partial x_i}\Bigr|\,dx. \qquad (7)$$

Kucherenko et al. [6] also introduced a new DGSM measure:

$$\nu_i = \int_{H^d}\Bigl(\frac{\partial f(x)}{\partial x_i}\Bigr)^2 dx. \qquad (8)$$

In this paper we define two new DGSM measures:

$$w_i^{(m)} = \int_{H^d} x_i^m\,\frac{\partial f(x)}{\partial x_i}\,dx, \qquad (9)$$

where $m$ is a constant, $m>0$, and

$$\varsigma_i = \frac12\int_{H^d} x_i(1-x_i)\Bigl(\frac{\partial f(x)}{\partial x_i}\Bigr)^2 dx. \qquad (10)$$

We note that $\nu_i$ is in fact the mean value of $\bigl(\partial f/\partial x_i\bigr)^2$. We also note that

$$\frac{\partial u_i}{\partial x_i} = \frac{\partial f}{\partial x_i}. \qquad (11)$$
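All four quantities (7)–(10) are plain integrals of functions of $\partial f/\partial x_i$ and can be estimated by MC or QMC sampling. The sketch below is our own illustration; it approximates the derivatives by forward finite differences (an analytical gradient or algorithmic differentiation would normally be preferred, and points with $x_i+h>1$ would need a one-sided backward step, which we ignore here for brevity).

```python
import numpy as np

def dgsm_measures(f, d, N, m=1, h=1e-6, rng=None):
    """Crude Monte Carlo estimates of the DGSM quantities (7)-(10)."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = rng.random((N, d))
    fx = f(x)
    mu = np.empty(d); nu = np.empty(d); w_m = np.empty(d); varsigma = np.empty(d)
    for i in range(d):
        xh = x.copy()
        xh[:, i] += h
        dfi = (f(xh) - fx) / h                                    # approximate df/dx_i
        mu[i] = np.mean(np.abs(dfi))                              # (7)
        nu[i] = np.mean(dfi ** 2)                                 # (8)
        w_m[i] = np.mean(x[:, i] ** m * dfi)                      # (9)
        varsigma[i] = 0.5 * np.mean(x[:, i] * (1 - x[:, i]) * dfi ** 2)   # (10)
    return mu, nu, w_m, varsigma
```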

3.1 Lower Bounds on $S_i^{tot}$


Theorem 2 There exists the following lower bound between DGSM and the Sobol total sensitivity index:

$$\frac{\Bigl(\int_{H^d}[f(1,z)-f(0,z)]\,[f(1,z)+f(0,z)-2f(x)]\,dx\Bigr)^2}{4\,\nu_i\,D} < S_i^{tot}. \qquad (12)$$

Proof Consider the integral

$$\int_{H^d} u_i(x)\,\frac{\partial u_i(x)}{\partial x_i}\,dx. \qquad (13)$$

Applying the Cauchy–Schwarz inequality we obtain the following result:

$$\Bigl(\int_{H^d} u_i(x)\,\frac{\partial u_i(x)}{\partial x_i}\,dx\Bigr)^2 \le \int_{H^d} u_i^2(x)\,dx\,\int_{H^d}\Bigl(\frac{\partial u_i(x)}{\partial x_i}\Bigr)^2 dx. \qquad (14)$$

It is easy to prove that the left and right parts of this inequality cannot be equal. Indeed, for them to be equal the functions $u_i(x)$ and $\partial u_i(x)/\partial x_i$ would have to be linearly dependent. For simplicity consider a one-dimensional case, $x\in[0,1]$. Let us assume

$$\frac{\partial u(x)}{\partial x} = A\,u(x),$$


where $A$ is a constant. The general solution of this equation is $u(x) = B\exp(Ax)$, where $B$ is a constant. It is easy to see that this solution is not consistent with condition (3), which should be imposed on the function $u(x)$.
The integral $\int_{H^d} u_i(x)\,\frac{\partial u_i(x)}{\partial x_i}\,dx$ can be transformed as

$$\int_{H^d} u_i(x)\,\frac{\partial u_i(x)}{\partial x_i}\,dx = \frac12\int_{H^d}\frac{\partial u_i^2(x)}{\partial x_i}\,dx = \frac12\int_{H^{d-1}}\bigl[u_i^2(1,z)-u_i^2(0,z)\bigr]\,dz = \frac12\int_{H^{d-1}}\bigl(u_i(1,z)-u_i(0,z)\bigr)\bigl(u_i(1,z)+u_i(0,z)\bigr)\,dz = \frac12\int_{H^{d-1}}\bigl(f(1,z)-f(0,z)\bigr)\bigl(f(1,z)+f(0,z)-2v(z)\bigr)\,dz. \qquad (15)$$

All terms in the last integrand are independent of $x_i$, hence we can replace integration with respect to $dz$ by integration with respect to $dx$ and substitute $f(x)$ for $v(z)$ in the integrand due to condition (3). Then (15) can be presented as

$$\int_{H^d} u_i(x)\,\frac{\partial u_i(x)}{\partial x_i}\,dx = \frac12\int_{H^d}[f(1,z)-f(0,z)]\,[f(1,z)+f(0,z)-2f(x)]\,dx. \qquad (16)$$

From (11), $\partial u_i(x)/\partial x_i = \partial f(x)/\partial x_i$, hence the right-hand side of (14) can be written as $\nu_i D_i^{tot}$. Finally, dividing (14) by $\nu_i D$ and using (16), we obtain the lower bound (12). □
We call

$$\frac{\Bigl(\int_{H^d}[f(1,z)-f(0,z)]\,[f(1,z)+f(0,z)-2f(x)]\,dx\Bigr)^2}{4\,\nu_i\,D}$$

the lower bound number one (LB1).


Theorem 3 There exists the following lower bound between DGSM (9) and the Sobol total sensitivity index:

$$\frac{(2m+1)\Bigl(\int_{H^d}\bigl(f(1,z)-f(x)\bigr)\,dx - w_i^{(m+1)}\Bigr)^2}{(m+1)^2\,D} < S_i^{tot}. \qquad (17)$$

Proof Consider the integral

$$\int_{H^d} x_i^m\,u_i(x)\,dx. \qquad (18)$$

Applying the Cauchy–Schwarz inequality we obtain the following result:

$$\Bigl(\int_{H^d} x_i^m\,u_i(x)\,dx\Bigr)^2 \le \int_{H^d} x_i^{2m}\,dx\,\int_{H^d} u_i^2(x)\,dx. \qquad (19)$$


It is easy to see that equality in (19) cannot be attained. For this to happen the functions $u_i(x)$ and $x_i^m$ would have to be linearly dependent. For simplicity consider a one-dimensional case, $x\in[0,1]$. Let us assume

$$u(x) = A\,x^m,$$

where $A\ne 0$ is a constant. This solution does not satisfy condition (3) which should be imposed on the function $u(x)$.
Further we use the following transformation:

$$\int_{H^d}\frac{\partial\bigl(x_i^{m+1}u_i(x)\bigr)}{\partial x_i}\,dx = (m+1)\int_{H^d} x_i^m u_i(x)\,dx + \int_{H^d} x_i^{m+1}\frac{\partial u_i(x)}{\partial x_i}\,dx$$

to present the integral (18) in the form

$$\int_{H^d} x_i^m u_i(x)\,dx = \frac{1}{m+1}\Bigl(\int_{H^d}\frac{\partial(x_i^{m+1}u_i(x))}{\partial x_i}\,dx - \int_{H^d} x_i^{m+1}\frac{\partial u_i(x)}{\partial x_i}\,dx\Bigr) = \frac{1}{m+1}\Bigl(\int_{H^{d-1}} u_i(1,z)\,dz - \int_{H^d} x_i^{m+1}\frac{\partial u_i(x)}{\partial x_i}\,dx\Bigr) = \frac{1}{m+1}\Bigl(\int_{H^d}\bigl(f(1,z)-f(x)\bigr)\,dx - \int_{H^d} x_i^{m+1}\frac{\partial u_i(x)}{\partial x_i}\,dx\Bigr). \qquad (20)$$

We notice that

$$\int_{H^d} x_i^{2m}\,dx = \frac{1}{2m+1}. \qquad (21)$$

Using (20) and (21) and dividing (19) by $D$ we obtain (17). □
This second lower bound on $S_i^{tot}$ we denote by $\gamma(m)$:

$$\gamma(m) = \frac{(2m+1)\Bigl(\int_{H^d}\bigl(f(1,z)-f(x)\bigr)\,dx - w_i^{(m+1)}\Bigr)^2}{(m+1)^2\,D} < S_i^{tot}. \qquad (22)$$

In fact, this is a set of lower bounds depending on the parameter $m$. We are interested in the value of $m$ at which $\gamma(m)$ attains its maximum. Further we use a star to denote such a value of $m$: $m^* = \arg\max(\gamma(m))$, and we call

$$\gamma(m^*) = \frac{(2m^*+1)\Bigl(\int_{H^d}\bigl(f(1,z)-f(x)\bigr)\,dx - w_i^{(m^*+1)}\Bigr)^2}{(m^*+1)^2\,D} \qquad (23)$$

the lower bound number two (LB2).
We define the maximum lower bound $LB^*$ as

$$LB^* = \max(LB1, LB2). \qquad (24)$$


We note that both lower and upper bounds can be estimated by the same set of derivative-based measures:

$$\{\nu_i,\ w_i^{(m)}\},\quad m>0. \qquad (25)$$

3.2 Upper Bounds on $S_i^{tot}$


Theorem 4

$$S_i^{tot} \le \frac{\nu_i}{\pi^2 D}. \qquad (26)$$

The proof of this theorem is given in [15].

Consider the set of values $\nu_1,\ldots,\nu_d$, $1\le i\le d$. One can expect that smaller $\nu_i$ correspond to less influential variables $x_i$.
We further call (26) the upper bound number one (UB1).

Theorem 5

$$S_i^{tot} \le \frac{\varsigma_i}{D}, \qquad (27)$$

where $\varsigma_i$ is given by (10).

Proof We use the following inequality [2]:

$$0 \le \int_0^1 u^2\,dx - \Bigl(\int_0^1 u\,dx\Bigr)^2 \le \frac12\int_0^1 x(1-x)\,u'^2\,dx. \qquad (28)$$

The inequality is reduced to an equality only if $u$ is constant. Assume that $u$ is given by (3); then $\int_0^1 u\,dx = 0$, and from (28) we obtain (27). □

Further we call $\varsigma_i/D$ the upper bound number two (UB2). We note that $\frac12 x_i(1-x_i)$ for $0\le x_i\le 1$ is bounded: $0\le\frac12 x_i(1-x_i)\le\frac18$. Therefore, $0\le\varsigma_i\le\frac18\nu_i$.

3.3 Computational Costs


All DGSM can be computed using the same set of partial derivatives $\frac{\partial f(x)}{\partial x_i}$, $i=1,\ldots,d$. Evaluation of $\frac{\partial f(x)}{\partial x_i}$ can be done analytically for explicitly given, easily differentiable functions, or numerically.
In the case of straightforward numerical estimation of all partial derivatives and computation of the integrals using MC or QMC methods, the number of required function evaluations for a set of all input variables is equal to $N(d+1)$, where $N$ is the number of sampled points. Computing LB1 also requires the values of $f(0,z)$ and $f(1,z)$, while computing LB2 requires only the values of $f(1,z)$. In total, numerical computation of $LB^*$ for all input variables would require $N_F^{LB} = N(d+1) + 2Nd = N(3d+1)$ function evaluations. Computation of all upper bounds requires $N_F^{UB} = N(d+1)$ function evaluations. We recall that the number of function evaluations required for computation of $S_i^{tot}$ is $N_F^{S} = N(d+1)$ [10]. The number of sampled points $N$ needed to achieve numerical convergence can be different for DGSM and $S_i^{tot}$; it is generally lower for the case of DGSM. The numerical efficiency of the DGSM method can be significantly increased by using algorithmic differentiation in the adjoint (reverse) mode [1]. This approach allows estimating all derivatives at a cost of at most 6 times that of evaluating the original function $f(x)$ [4]. However, as mentioned above, lower bounds also require computation of $f(0,z)$ and $f(1,z)$, so $N_F^{LB}$ would only be reduced to $N_F^{LB} = 6N + 2Nd = N(2d+6)$, while $N_F^{UB}$ would be equal to $6N$.
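As a quick reference, the evaluation counts quoted above (for the straightforward finite-difference/MC setting, without adjoint differentiation) can be restated as a small helper; this is our own summary of the formulas of this subsection.

```python
def evaluation_counts(N, d):
    """Function-evaluation counts from Sect. 3.3 (no adjoint differentiation)."""
    return {
        "S_i_tot": N * (d + 1),        # variance-based total indices, cf. [10]
        "upper_bounds": N * (d + 1),   # UB1, UB2
        "lower_bounds": N * (3 * d + 1),  # LB* also needs f(0,z) and f(1,z)
    }

# e.g. evaluation_counts(2**14, 8) -> {'S_i_tot': 147456, 'upper_bounds': 147456, 'lower_bounds': 409600}
```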

4 DGSM for Random Variables


Consider a function $f(x_1,\ldots,x_d)$, where $x_1,\ldots,x_d$ are independent random variables with distribution functions $F_1(x_1),\ldots,F_d(x_d)$. Thus the point $x=(x_1,\ldots,x_d)$ is defined in the Euclidean space $\mathbb{R}^d$ and its measure is $dF_1(x_1)\cdots dF_d(x_d)$.
The following DGSM was introduced in [15]:

$$\nu_i = \int_{\mathbb{R}^d}\Bigl(\frac{\partial f(x)}{\partial x_i}\Bigr)^2 dF(x). \qquad (29)$$

We introduce a new measure

$$w_i = \int_{\mathbb{R}^d}\frac{\partial f(x)}{\partial x_i}\,dF(x). \qquad (30)$$

4.1 The Lower Bounds on $S_i^{tot}$ for Normal Variables

Assume that $x_i$ is normally distributed with finite variance $\sigma_i^2$ and mean value $\mu_i$.

Theorem 6

$$\frac{\sigma_i^2\,w_i^2}{D} \le S_i^{tot}. \qquad (31)$$

Proof Consider $\int_{\mathbb{R}^d} x_i\,u_i(x)\,dF(x)$. Applying the Cauchy–Schwarz inequality we obtain

$$\Bigl(\int_{\mathbb{R}^d} x_i\,u_i(x)\,dF(x)\Bigr)^2 \le \int_{\mathbb{R}^d} x_i^2\,dF(x)\int_{\mathbb{R}^d} u_i^2(x)\,dF(x). \qquad (32)$$

Equality in (32) can be attained if the functions $u_i(x)$ and $x_i$ are linearly dependent. For simplicity consider a one-dimensional case. Let us assume

$$u(x) = A(x-\mu),$$

where $A\ne 0$ is a constant. This solution satisfies condition (3) for a normally distributed variable $x$ with mean value $\mu$: $\int_{\mathbb{R}^d} u(x)\,dF(x) = 0$.
For normally distributed variables the following equality is true [2]:

$$\int_{\mathbb{R}^d} x_i\,u_i(x)\,dF(x) = \int_{\mathbb{R}^d} x_i^2\,dF(x)\int_{\mathbb{R}^d}\frac{\partial u_i(x)}{\partial x_i}\,dF(x). \qquad (33)$$

By definition $\int_{\mathbb{R}^d} x_i^2\,dF(x) = \sigma_i^2$. Using (32) and (33) and dividing the resulting inequality by $D$ we obtain the lower bound (31). □
inequality by D we obtain the lower bound (31).


4.2 The Upper Bounds on $S_i^{tot}$ for Normal Variables

The following Theorem 7 is a generalization of Theorem 1.

Theorem 7 Assume that $c\le\bigl|\frac{\partial f}{\partial x_i}\bigr|\le C$; then

$$\frac{\sigma_i^2 c^2}{D} \le S_i^{tot} \le \frac{\sigma_i^2 C^2}{D}. \qquad (34)$$

The constant factor $\sigma_i^2$ cannot be improved.

Theorem 8

$$S_i^{tot} \le \frac{\sigma_i^2}{D}\,\nu_i. \qquad (35)$$

The constant factor $\sigma_i^2$ cannot be reduced.

Proofs are presented in [15].

5 Test Cases
In this section we present the results of analytical and numerical estimation of $S_i$, $S_i^{tot}$, LB1, LB2 and UB1, UB2. The analytical values for DGSM and $S_i^{tot}$ were calculated and compared with numerical results. For test case 2 we present convergence plots in the form of the root mean square error (RMSE) versus the number of sampled points $N$. To reduce the scatter in the error estimation the values of the RMSE were averaged over $K=25$ independent runs:

$$\varepsilon_i = \Biggl(\frac{1}{K}\sum_{k=1}^{K}\Bigl(\frac{I_{i,k}-I_0}{I_0}\Bigr)^2\Biggr)^{1/2}.$$

Here $I_{i,k}$ can be either the numerically computed $S_i^{tot}$, LB1, LB2 or UB1, UB2, and $I_0$ is the corresponding analytical value of $S_i^{tot}$, LB1, LB2 or UB1, UB2. The RMSE can be approximated by a trend line $cN^{-\alpha}$. Values of $(-\alpha)$ are given in brackets on the plots. QMC integration based on Sobol sequences was used in all numerical tests.
Example 1 Consider a function linear with respect to $x_i$:

$$f(x) = a(z)\,x_i + b(z).$$

For this function $S_i = S_i^{tot}$, $D_i^{tot} = \frac{1}{12}\int_{H^{d-1}} a^2(z)\,dz$, $\nu_i = \int_{H^{d-1}} a^2(z)\,dz$, $LB1 = \bigl(\int_{H^d}(a^2(z)-2a^2(z)x_i)\,dz\,dx_i\bigr)^2/\bigl(4D\int_{H^{d-1}}a^2(z)\,dz\bigr) = 0$ and

$$\gamma(m) = \frac{(2m+1)\,m^2\bigl(\int a(z)\,dz\bigr)^2}{4(m+2)^2(m+1)^2\,D}.$$

A maximum value of $\gamma(m)$ is attained at $m^*=3.745$, when $\gamma(m^*)\approx\frac{0.0401}{D}\bigl(\int a(z)\,dz\bigr)^2$. The lower and upper bounds are $LB^*\approx 0.48\,S_i^{tot}$ and $UB1\approx 1.22\,S_i^{tot}$, while $UB2 = \frac{1}{12D}\int_0^1 a(z)^2\,dz = S_i^{tot}$. For this test function UB2 < UB1.

Example 2 Consider the so-called g-function which is often used in GSA for illustration purposes:

$$f(x) = \prod_{i=1}^{d} g_i, \quad\text{where } g_i = \frac{|4x_i-2|+a_i}{1+a_i},$$

and $a_i$ ($i=1,\ldots,d$) are constants. It is easy to see that for this function $f_i(x_i) = (g_i-1)$, $u_i(x) = (g_i-1)\prod_{j=1, j\ne i}^{d} g_j$ and as a result LB1 = 0. The total variance is

$$D = -1 + \prod_{j=1}^{d}\Bigl(1+\frac{1/3}{(1+a_j)^2}\Bigr).$$

The analytical values of $S_i$, $S_i^{tot}$ and LB2 are given in Table 1.
LB2 are given in Table 1.

Table 1 The analytical expressions for $S_i$, $S_i^{tot}$ and LB2 for the g-function

  $S_i = \dfrac{1/3}{(1+a_i)^2\,D}$

  $S_i^{tot} = \dfrac{1/3}{(1+a_i)^2}\prod_{j=1, j\ne i}^{d}\Bigl(1+\dfrac{1/3}{(1+a_j)^2}\Bigr)\Big/ D$

  $\gamma(m) = \dfrac{16\,(2m+1)\Bigl(\tfrac14-\tfrac{1-(1/2)^{m+1}}{m+2}\Bigr)^2}{(1+a_i)^2(m+1)^2\,D}$


By solving the equation $\frac{d\gamma(m)}{dm}=0$, we find that $m^*=9.64$ and $\gamma(m^*) = \frac{0.0772}{(1+a_i)^2 D}$. It is interesting to note that $m^*$ does not depend on $a_i$, $i=1,2,\ldots,d$, or on $d$. In the extreme cases: if $a_i\to\infty$ for all $i$, then $\frac{\gamma(m^*)}{S_i^{tot}}\to 0.257$ and $\frac{S_i}{S_i^{tot}}\to 1$, while if $a_i\to 0$ for all $i$, then $\frac{\gamma(m^*)}{S_i^{tot}}\to\frac{0.257}{(4/3)^{d-1}}$ and $\frac{S_i}{S_i^{tot}}\to\frac{1}{(4/3)^{d-1}}$. The analytical expressions for $S_i^{tot}$, UB1 and UB2 are given in Table 2.
For this test function $\frac{UB1}{S_i^{tot}} = \frac{48}{\pi^2}$ and $\frac{UB2}{S_i^{tot}} = 4$, hence $\frac{UB2}{UB1} = \frac{\pi^2}{12} < 1$. Values of $S_i$, $S_i^{tot}$, UB and LB2 for the case of $a = [0, 1, 4.5, 9, 99, 99, 99, 99]$, $d=8$ are given in Table 3 and shown in Fig. 1. We can conclude that for this test the knowledge of LB2 and UB1, UB2 allows one to rank correctly all the variables in the order of their importance.
Figure 2 presents the RMSE of numerical estimations of $S_i^{tot}$, UB1 and LB2. For an individual input, LB2 has the highest convergence rate, followed by $S_i^{tot}$ and UB1, in terms of the number of sampled points. However, we recall that computation of all indices requires $N_F^{LB} = N(3d+1)$ function evaluations for LB, while for $S_i^{tot}$ this number is $N_F^{S} = N(d+1)$ and for UB it is also $N_F^{UB} = N(d+1)$.


Example 3 Consider the Hartmann function $f(x) = -\sum_{i=1}^{4} c_i\exp\bigl(-\sum_{j=1}^{n}\alpha_{ij}(x_j-p_{ij})^2\bigr)$, $x_i\in[0,1]$. For this test case the relationship between the values LB1, LB2 and $S_i$ varies with the change of input (Table 4, Fig. 3): for variables $x_2$ and $x_6$, LB1 > $S_i$ > LB2, while for all other variables LB1 < LB2 < $S_i$. LB* is much smaller than $S_i^{tot}$ for all inputs. Values of $m^*$ vary with the change of input. For all variables but variable 2, UB1 > UB2.
Table 2 The analytical expressions for $S_i^{tot}$, UB1 and UB2 for the g-function

  $S_i^{tot} = \dfrac{1/3}{(1+a_i)^2}\prod_{j=1, j\ne i}^{d}\Bigl(1+\dfrac{1/3}{(1+a_j)^2}\Bigr)\Big/ D$

  $UB1 = \dfrac{16\prod_{j=1, j\ne i}^{d}\bigl(1+\frac{1/3}{(1+a_j)^2}\bigr)}{(1+a_i)^2\,\pi^2\,D}$

  $UB2 = \dfrac{4\prod_{j=1, j\ne i}^{d}\bigl(1+\frac{1/3}{(1+a_j)^2}\bigr)}{3\,(1+a_i)^2\,D}$
Table 3 Values of LB*, $S_i$, $S_i^{tot}$, UB1 and UB2. Example 2, a = [0, 1, 4.5, 9, 99, 99, 99, 99], d = 8

              x1         x2         x3         x4         x5...x8
  LB*         0.166      0.0416     0.00549    0.00166    0.000017
  S_i         0.716      0.179      0.0237     0.00720    0.0000716
  S_i^tot     0.788      0.242      0.0343     0.0105     0.000105
  UB1         3.828      1.178      0.167      0.0509     0.000501
  UB2         3.149      0.969      0.137      0.0418     0.00042
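The entries of Table 3 follow directly from the closed-form expressions in Tables 1 and 2; the short check below (our own, with $\pi^2$ in UB1 as reconstructed above) reproduces the first column up to rounding.

```python
import numpy as np

a = np.array([0.0, 1.0, 4.5, 9.0, 99.0, 99.0, 99.0, 99.0])
factors = 1 + (1 / 3) / (1 + a) ** 2
D = np.prod(factors) - 1                     # total variance of the g-function
for i in range(4):                           # x1..x4 (x5..x8 are identical by symmetry)
    prod_rest = np.prod(np.delete(factors, i))
    Si     = (1 / 3) / ((1 + a[i]) ** 2 * D)
    Si_tot = (1 / 3) / (1 + a[i]) ** 2 * prod_rest / D
    UB1    = 16 / ((1 + a[i]) ** 2 * np.pi ** 2) * prod_rest / D
    UB2    = 4 / (3 * (1 + a[i]) ** 2) * prod_rest / D
    print(f"x{i+1}: Si={Si:.3f}  Si_tot={Si_tot:.3f}  UB1={UB1:.3f}  UB2={UB2:.3f}")
# x1: Si=0.716  Si_tot=0.787  UB1=3.828  UB2=3.149  (cf. the first column of Table 3)
```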


Fig. 1 Values of $S_i$, $S_i^{tot}$, LB2 and UB1 for all input variables. Example 2, a = [0, 1, 4.5, 9, 99, 99, 99, 99], d = 8

Fig. 2 RMSE of $S_i^{tot}$, UB1 and LB2 versus the number of sampled points. Example 2, a = [0, 1, 4.5, 9, 99, 99, 99, 99], d = 8. Variable 1 (a), variable 3 (b) and variable 5 (c). Fitted slopes $(-\alpha)$: (a) $S_i^{tot}$ 0.977, UB1 0.962, LB2 1.134; (b) $S_i^{tot}$ 0.953, UB1 0.844, LB2 1.048; (c) $S_i^{tot}$ 0.993, UB1 0.894, LB2 0.836
Table 4 Values of m*, LB1, LB2, UB1, UB2, $S_i$ and $S_i^{tot}$ for all input variables

              x1         x2         x3         x4         x5         x6
  LB1         0.0044     0.0080     0.0009     0.0029     0.0014     0.0357
  LB2         0.0515     0.0013     0.0011     0.0418     0.0390     0.0009
  m*          4.6        10.2       17.0       5.5        3.6        19.9
  LB*         0.0515     0.0080     0.0011     0.0418     0.0390     0.0357
  S_i         0.115      0.00699    0.00715    0.0888     0.109      0.0139
  S_i^tot     0.344      0.398      0.0515     0.381      0.297      0.482
  UB1         1.089      0.540      0.196      1.088      1.073      1.046
  UB2         1.051      0.550      0.150      0.959      0.932      0.899



Fig. 3 Values of $S_i$, $S_i^{tot}$, UB1, LB1 and LB2 for all input variables. Example 3

6 Conclusions
We can conclude that using lower and upper bounds based on DGSM it is possible
in most cases to get a good practical estimation of the values of Sitot at a fraction of
the CPU cost for estimating Sitot . Small values of upper bounds imply small values
of Sitot . DGSM can be used for fixing unimportant variables and subsequent model
reduction. For linear function and product function, DGSM can give the same variable
ranking as Sitot . In a general case variable ranking can be different for DGSM and
variance based methods. Upper and lower bounds can be estimated using MC/QMC
integration methods using the same set of partial derivative values. Partial derivatives
can be efficiently estimated using algorithmic differentiation in the reverse (adjoint)
mode.
We note that all bounds should be computed with sufficient accuracy. Standard
techniques for monitoring convergence and accuracy of MC/QMC estimates should
be applied to avoid erroneous results.
Acknowledgments The authors would like to thank Prof. I. Sobol for his invaluable contributions to this work. The authors also gratefully acknowledge the financial support by the EPSRC grant EP/H03126X/1.


References
1. Griewank, A., Walther, A.: Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. SIAM, Philadelphia (2008)
2. Hardy, G.H., Littlewood, J.E., Polya, G.: Inequalities, 2nd edn. Cambridge University Press, Cambridge (1973)
3. Homma, T., Saltelli, A.: Importance measures in global sensitivity analysis of model output. Reliab. Eng. Syst. Saf. 52(1), 1–17 (1996)
4. Jansen, K., Leovey, H., Nube, A., Griewank, A., Mueller-Preussker, M.: A first look at quasi-Monte Carlo for lattice field theory problems. Comput. Phys. Commun. 185, 948–959 (2014)
5. Kiparissides, A., Kucherenko, S., Mantalaris, A., Pistikopoulos, E.N.: Global sensitivity analysis challenges in biological systems modeling. J. Ind. Eng. Chem. Res. 48(15), 7168–7180 (2009)
6. Kucherenko, S., Rodriguez-Fernandez, M., Pantelides, C., Shah, N.: Monte Carlo evaluation of derivative based global sensitivity measures. Reliab. Eng. Syst. Saf. 94(7), 1135–1148 (2009)
7. Lamboni, M., Iooss, B., Popelin, A.L., Gamboa, F.: Derivative based global sensitivity measures: general links with Sobol indices and numerical tests. Math. Comput. Simul. 87, 45–54 (2013)
8. Morris, M.D.: Factorial sampling plans for preliminary computational experiments. Technometrics 33, 161–174 (1991)
9. Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., Tarantola, S.: Global Sensitivity Analysis: The Primer. Wiley, New York (2008)
10. Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., Tarantola, S.: Variance based sensitivity analysis of model output: design and estimator for the total sensitivity index. Comput. Phys. Commun. 181(2), 259–270 (2010)
11. Sobol, I.M.: Sensitivity estimates for nonlinear mathematical models. Matem. Modelirovanie 2, 112–118 (1990) (in Russian). English translation: Math. Modelling and Comput. Experiment 1(4), 407–414 (1993)
12. Sobol, I.M., Gershman, A.: On an alternative global sensitivity estimators. In: Proceedings of SAMO 1995, Belgirate, pp. 40–42 (1995)
13. Sobol, I.M.: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. 55(1–3), 271–280 (2001)
14. Sobol, I.M., Kucherenko, S.: Global sensitivity indices for nonlinear mathematical models. Rev. Wilmott Mag. 1, 56–61 (2005)
15. Sobol, I.M., Kucherenko, S.: Derivative based global sensitivity measures and their link with global sensitivity indices. Math. Comput. Simul. 79(10), 3009–3017 (2009)
16. Sobol, I.M., Kucherenko, S.: A new derivative based importance criterion for groups of variables and its link with the global sensitivity indices. Comput. Phys. Commun. 181(7), 1212–1217 (2010)

Bernstein Numbers and Lower Bounds for the Monte Carlo Error

Robert J. Kunsch

Abstract We are interested in lower bounds for the approximation of linear operators between Banach spaces with algorithms that may use at most n arbitrary linear functionals as information. Lower error bounds for deterministic algorithms can easily be found by Bernstein widths; for mappings between Hilbert spaces it is already known how Bernstein widths (which are the singular values in that case) provide lower bounds for Monte Carlo methods. Here, a similar connection between Bernstein numbers and lower bounds is shown for the Monte Carlo approximation of operators between arbitrary Banach spaces. For non-adaptive algorithms we consider the average case setting with the uniform distribution on finite-dimensional balls and in this way we obtain almost optimal prefactors. By combining known results about Gaussian measures and their connection to the Monte Carlo error we also cover adaptive algorithms, however with weaker constants. As an application, we find that for the $L_\infty$ approximation of smooth functions from the class $C^\infty([0,1]^d)$ with uniformly bounded partial derivatives, randomized algorithms suffer from the curse of dimensionality, as it is known for deterministic algorithms.

Keywords Monte Carlo · Lower error bounds · Bernstein numbers · Approximation of smooth functions · Curse of dimensionality

1 Basic Notions and Prerequisites


1.1 Types of Errors and Information
Let $S\colon\widetilde{F}\to G$ be a compact linear operator between Banach spaces over the reals, the so-called solution mapping. We aim to approximate $S$ for an input set $F\subseteq\widetilde{F}$ with respect to the norm of the target space $G$. In this work $F$ will always be the unit ball of $\widetilde{F}$.
R.J. Kunsch (B)
Friedrich-Schiller-Universität Jena, Institut für Mathematik, 07737 Jena, Germany
e-mail: robert.kunsch@uni-jena.de


Let $(\Omega,\Sigma,\mathbb{P})$ be a suitable probability space. Further let $\mathcal{B}(\widetilde{F})$ and $\mathcal{B}(G)$ denote the Borel $\sigma$-algebras of $\widetilde{F}$ and $G$, respectively. Under randomized algorithms, also called Monte Carlo algorithms, we understand $\bigl(\Sigma\otimes\mathcal{B}(\widetilde{F})\bigr)$–$\mathcal{B}(G)$-measurable mappings $A_n = (A_n^\omega)_{\omega\in\Omega}\colon\Omega\times\widetilde{F}\to G$. This means that the output $A_n^\omega(f)$ for an input $f$ is random, depending on $\omega\in\Omega$. We consider algorithms that use at most $n$ continuous linear functionals as information, i.e. $A_n^\omega = \phi^\omega\circ N^\omega$, where $N^\omega\colon\widetilde{F}\to\mathbb{R}^n$ is the so-called information mapping. The mapping $\phi^\omega\colon\mathbb{R}^n\to G$ generates an output $g = \phi^\omega(y)\in G$ as a compromise for all possible inputs that lead to the same information $y = N^\omega(f)\in\mathbb{R}^n$. An information mapping is called non-adaptive if

$$N^\omega(f) = (y_1,\ldots,y_n) = [L_1^\omega(f),\ldots,L_n^\omega(f)], \qquad (1)$$

where all functionals $L_k^\omega$ are chosen at once. In that case $N^\omega$ is a linear mapping for fixed $\omega$. For adaptive information $N^\omega$ the choice of the functionals may depend on previously obtained information; we assume that the choice of the $k$-th functional is a measurable mapping $(\omega; y_1,\ldots,y_{k-1})\mapsto L^\omega_{k;y_1,\ldots,y_{k-1}}$ for $k=1,\ldots,n$, see [3] for more details on measurability assumptions for adaptive algorithms. By $\mathcal{A}_n^{\mathrm{ran,ada}}$ we denote the class of all Monte Carlo algorithms that use $n$ pieces of adaptively obtained information; for the subclass of non-adaptive algorithms we write $\mathcal{A}_n^{\mathrm{ran,nonada}}$. We regard the class of deterministic algorithms as a subclass $\mathcal{A}_n^{\mathrm{det},\bullet}\subseteq\mathcal{A}_n^{\mathrm{ran},\bullet}$ ($\bullet\in\{\mathrm{ada},\mathrm{nonada}\}$) of algorithms that are independent from $\omega$ (this means in particular that we assume deterministic algorithms to be measurable); for a particular deterministic algorithm we write $A_n = \phi\circ N$, omitting $\omega$. For a deterministic algorithm $A_n$ the (absolute) error at $f$ is defined as the distance between output and exact solution,

$$e(A_n, S, f) := \|S(f) - A_n(f)\|_G. \qquad (2)$$

For randomized algorithms $A_n = (A_n^\omega)$ this can be generalized as the expected error at $f$,

$$e(A_n, S, f) := \mathbb{E}\,\|S(f) - A_n^\omega(f)\|_G, \qquad (3)$$

however some authors prefer the root mean square error

$$e_2(A_n, S, f) := \sqrt{\mathbb{E}\,\|S(f) - A_n^\omega(f)\|_G^2}. \qquad (4)$$

(The expectation $\mathbb{E}$ is written for the integration over all $\omega\in\Omega$ with respect to $\mathbb{P}$.) Since $e(A_n,S,f)\le e_2(A_n,S,f)$, for lower bounds we may stick to the first version.
The global error of an algorithm $A_n$ is defined as the error for the worst input from the input set $F\subseteq\widetilde{F}$; we write

$$e(A_n, S, F) := \sup_{f\in F} e(A_n, S, f). \qquad (5)$$

For technical purposes we also need the average error, which is defined for any (sub-)probability measure $\mu$ (the so-called input distribution) on the input space $\widetilde{F}$,

$$e(A_n, S, \mu) := \int e(A_n, S, f)\,d\mu(f). \qquad (6)$$

(A sub-probability measure on $\widetilde{F}$ is a positive measure $\mu$ with $0<\mu(\widetilde{F})\le 1$.)
The difficulty of a problem within a particular setting refers to the error of optimal algorithms; we define

$$e^{\star,\bullet}(n,S,F) := \inf_{A_n\in\mathcal{A}_n^{\star,\bullet}} e(A_n,S,F) \quad\text{and}\quad e^{\star,\bullet}(n,S,\mu) := \inf_{A_n\in\mathcal{A}_n^{\star,\bullet}} e(A_n,S,\mu),$$

where $\star\in\{\mathrm{ran},\mathrm{det}\}$ and $\bullet\in\{\mathrm{ada},\mathrm{nonada}\}$. These quantities are inherent properties of the problem $S$, so $e^{\mathrm{ran},\bullet}(n,S,F)$ is called the Monte Carlo error, $e^{\mathrm{det},\bullet}(n,S,F)$ the worst case error, and $e^{\mathrm{det},\bullet}(n,S,\mu)$ the $\mu$-average case error of the problem $S$.
Since adaption and randomization are additional features for algorithms we have

$$e^{\mathrm{ran},\bullet}(n,S,\cdot) \le e^{\mathrm{det},\bullet}(n,S,\cdot) \quad\text{and}\quad e^{\star,\mathrm{ada}}(n,S,\cdot) \le e^{\star,\mathrm{nonada}}(n,S,\cdot), \qquad (7)$$

where the last argument is fixed, either standing for an input set $F\subseteq\widetilde{F}$, or for an input distribution $\mu$.
Another important relationship connects average errors and the Monte Carlo error. It has already been used by Bakhvalov [1, Sect. 1].

Proposition 1 (Bakhvalov's technique) Let $\mu$ be an arbitrary (sub-)probability measure supported on $F$. Then

$$e^{\mathrm{ran},\bullet}(n,S,F) \ge e^{\mathrm{det},\bullet}(n,S,\mu).$$

Proof Let $A_n = (A_n^\omega)\in\mathcal{A}_n^{\mathrm{ran},\bullet}$ be a Monte Carlo algorithm. We find

$$e(A_n,S,F) = \sup_{f\in F}\mathbb{E}\,e(A_n^\omega,S,f) \ge \int\mathbb{E}\,e(A_n^\omega,S,f)\,d\mu(f) \stackrel{\text{Fubini}}{=} \mathbb{E}\int e(A_n^\omega,S,f)\,d\mu(f) = \mathbb{E}\,e(A_n^\omega,S,\mu) \ge \inf_{\omega} e(A_n^\omega,S,\mu) \ge \inf_{\bar A_n\in\mathcal{A}_n^{\mathrm{det},\bullet}} e(\bar A_n,S,\mu).$$

In the last step we used that for any fixed elementary event $\omega\in\Omega$ the realization $A_n^\omega$ can be seen as a deterministic algorithm. □

We will prove lower bounds for the Monte Carlo error by considering particular average case situations where we have to deal with only deterministic algorithms. We have some freedom to choose a suitable distribution $\mu$.
For more details on error settings and types of information see [11].


1.2 Bernstein Numbers

The compactness of $S$ can be characterized by the Bernstein numbers

$$b_m(S) := \sup_{X_m\subseteq\widetilde{F}}\ \inf_{\substack{x\in X_m\\ \|x\|=1}}\|S(x)\|_G, \qquad (8)$$

where the supremum is taken over $m$-dimensional linear subspaces $X_m\subseteq\widetilde{F}$. These quantities are closely related to Bernstein widths of the image $S(F)$ within $G$,

$$b_m(S(F),G) := \sup_{Y_m\subseteq G}\ \sup\{r\ge 0\mid B_r(0)\cap Y_m\subseteq S(F)\}, \qquad (9)$$

where the first supremum is taken over $m$-dimensional linear subspaces $Y_m\subseteq G$. By $B_r(g)$ we denote the (closed) ball around $g\in G$ with radius $r$. In general Bernstein widths are greater than Bernstein numbers, however for injective operators (like embeddings) both notions coincide (consider $Y_m = S(X_m)$); in the case of Hilbert spaces $\widetilde{F}$ and $G$, Bernstein numbers and widths match the singular values $\sigma_m(S)$.
For deterministic algorithms it can easily be seen that

$$e^{\mathrm{det,ada}}(n,S,F) \ge b_{n+1}(S(F),G) \ge b_{n+1}(S), \qquad (10)$$

since for any information mapping $N\colon\widetilde{F}\to\mathbb{R}^n$ and all $\delta>0$ there always exists an $f\in N^{-1}(0)$ with $\|S(f)\|_G \ge b_{n+1}(S(F),G)\,(1-\delta)$ and $f\in F$, i.e. $f$ cannot be distinguished from $-f$.
If both $\widetilde{F}$ and $G$ are Hilbert spaces, lower bounds for the (root mean square) Monte Carlo error have been found by Novak [7]:

$$e_2^{\mathrm{ran,ada}}(n,S,F) \ge \frac{1}{\sqrt{2}}\,\sigma_{2n}(S). \qquad (11)$$

The new result for operators between arbitrary Banach spaces (see Theorem 1) reads quite similar; for non-adaptive algorithms we have

$$e^{\mathrm{ran,nonada}}(n,S,F) \ge \frac{1}{2}\,b_{2n+1}(S). \qquad (12)$$

For adaptive algorithms we get at least the existence of a constant $c\ge 1/215$ such that

$$e^{\mathrm{ran,ada}}(n,S,F) \ge c\,b_{2n}(S). \qquad (13)$$

1.3 Some Convex Geometry


Since our aim is to consider arbitrary real Banach spaces, we recall some facts about
the geometry of unit balls.


Proposition 2 (Structure of unit balls) Let $(V,\|\cdot\|)$ be a normed vector space over the reals with its closed unit ball $B := \{x\in V : \|x\|\le 1\}$. Then
– for any finite-dimensional subspace $U\subseteq V$ the intersection $B\cap U$ is compact and has a non-empty interior with respect to the standard topology of $U$ as a finite-dimensional vector space, i.e. $B\cap U\subseteq U$ is a $d$-dimensional body, where $d := \dim U$,
– $B$ is symmetric, i.e. if $x\in B$ then $-x\in B$,
– $B$ is convex, i.e. for $x,y\in B$ and any $\lambda\in(0,1)$ it contains the convex combination $(1-\lambda)x+\lambda y\in B$.
If conversely a given set $B$ fulfills those properties, it induces a norm by

$$\|x\|_B := \inf\{r\ge 0\mid x\in rB\},\quad x\in V, \qquad (14)$$

where $rB := \{ry\mid y\in B\}$ is the dilation of $B$ by a factor $r$. The closure of $B$ is then the corresponding closed unit ball.
Henceforth by $\mathrm{Vol}_d$ we denote the $d$-dimensional volume for sets within $\mathbb{R}^{d+n}$ as the standard Euclidean space; for $n=0$ this is the standard $d$-dimensional Lebesgue measure.
Now, for arbitrary sets $A,B\subseteq\mathbb{R}^d$ and $\lambda\in(0,1)$ consider their convex combination

$$(1-\lambda)A + \lambda B := \{(1-\lambda)a + \lambda b\mid a\in A,\ b\in B\}.$$

The following fundamental statement provides a lower bound for the volume of the convex combination. Note that this set is empty if one of the sets $A$ or $B$ is empty, so we will exclude that case.

Proposition 3 (Brunn–Minkowski inequality) Let $A,B\subseteq\mathbb{R}^d$ be non-empty compact sets. Then for $0<\lambda<1$ we obtain

$$\mathrm{Vol}_d\bigl((1-\lambda)A+\lambda B\bigr)^{1/d} \ge (1-\lambda)\,\mathrm{Vol}_d(A)^{1/d} + \lambda\,\mathrm{Vol}_d(B)^{1/d}.$$

For a proof and more general conditions see [2]. We apply this inequality to parallel slices through convex bodies:

Corollary 1 (Parallel slices) Let $F\subseteq\mathbb{R}^{d+n}$ be a convex body and $N\colon\mathbb{R}^{d+n}\to\mathbb{R}^n$ a surjective linear mapping. Considering the parallel slices $F_y := F\cap N^{-1}(y)$, the function

$$R\colon\mathbb{R}^n\to[0,\infty),\quad y\mapsto\mathrm{Vol}_d(F_y)^{1/d}$$

is concave on its support $\mathrm{supp}\,R = N(F)$, which again is a convex body in $\mathbb{R}^n$.
If in addition $F$ is symmetric, the image $N(F)$ is symmetric as well and the function $R$ is even; its maximum lies in $y=0$.
We omit the easy proof and complete this section by a special consequence of Corollary 1 which we will need for Lemma 1 in Sect. 2.1.


Corollary 2 Let $G$ be a normed vector space and $U\subseteq G$ a $d$-dimensional linear subspace with a $d$-dimensional volume measure $\mathrm{Vol}_d$ that extends to parallel affine subspaces $U+x_0\subseteq G$ for $x_0\in G$ by parallel projection, i.e. for a measurable set $A\subseteq U+x_0$ we have

$$\mathrm{Vol}_d(A) = \mathrm{Vol}_d(A - x_0).$$

For $g\in G$ we denote the closed ball around $g$ with radius $r\ge 0$ by

$$B_r(g) := \{x\in G\mid \|x-g\|_G\le r\}.$$

Then

$$\mathrm{Vol}_d\bigl(B_r(g)\cap(U+x_0)\bigr) \le \mathrm{Vol}_d\bigl(B_r(0)\cap U\bigr) \qquad (15)$$

and the mapping

$$r\mapsto\mathrm{Vol}_d\bigl(B_r(g)\cap(U+x_0)\bigr)$$

is continuous and strictly increasing for

$$r \ge \mathrm{dist}(g, U+x_0) = \inf_{x\in U+x_0}\|x-g\|_G.$$

Proof Without loss of generality, after replacing $x_0$ by $x_0-g$, we assume $g=0$. Now we suppose $x_0\notin U$ (since otherwise the result is trivial with equality holding in (15)) and restrict to the $(d+1)$-dimensional vector space $V = U + \mathbb{R}x_0$. We may apply Corollary 1 to this finite-dimensional situation, where for $r>0$ we get

$$\mathrm{Vol}_d\bigl(B_r(0)\cap(U+x_0)\bigr) = r^d\,\mathrm{Vol}_d\Bigl(B_1(0)\cap\Bigl(U+\frac{x_0}{r}\Bigr)\Bigr) \le r^d\,\mathrm{Vol}_d\bigl(B_1(0)\cap U\bigr) = \mathrm{Vol}_d\bigl(B_r(0)\cap U\bigr),$$

since the central slice through the unit ball has the greatest volume. By Corollary 1 the function

$$R(s) := \bigl(\mathrm{Vol}_d(B_1(0)\cap(U+s\,x_0))\bigr)^{1/d} \ge 0$$

is concave on $[0, 1/\mathrm{dist}(0,U+x_0)]$, takes its maximum for $s=0$, and by this it is continuous and monotonically decreasing for $s\in[0, 1/\mathrm{dist}(0,U+x_0)]$. Therefore the function $r\mapsto\mathrm{Vol}_d(B_r(0)\cap(U+x_0)) = r^d R\bigl(\tfrac1r\bigr)^d$ is continuous and monotonically increasing for $r\ge\mathrm{dist}(0,U+x_0)$ since it is composed of continuous and monotone functions. It is actually strictly increasing because $r^d$ is strictly increasing. □



2 The Main Results on Lower Bounds

2.1 The Non-adaptive Setting

The proof of the following theorem needs Lemmas 1 and 2 that are provided later.

Theorem 1 (Non-adaptive Monte Carlo methods) Let $S\colon\widetilde{F}\to G$ be a compact linear operator and let $F\subseteq\widetilde{F}$ be the closed unit ball in $\widetilde{F}$. Then, for $n<m$, as a lower error bound for non-adaptive Monte Carlo methods we obtain

$$e^{\mathrm{ran,nonada}}(n,S,F) \ge \frac{m-n}{m+1}\,b_m(S).$$

Especially for $m = 2n+1$ we have

$$e^{\mathrm{ran,nonada}}(n,S,F) \ge \frac{1}{2}\,b_{2n+1}(S).$$

Proof For all $\delta>0$ there exists an $m$-dimensional subspace $X_m\subseteq\widetilde{F}$ such that $\|S(f)\|_G \ge \|f\|_{\widetilde{F}}\,b_m(S)\,(1-\delta)$ for $f\in X_m$. Note that for the restricted operator we have $b_m(S|_{X_m}) \ge (1-\delta)\,b_m(S)$ and in general $e^{\mathrm{ran,nonada}}(n,S,F) \ge e^{\mathrm{ran,nonada}}(n,S|_{X_m},F)$. Hence it suffices to show the theorem for $S|_{X_m}$, so without loss of generality we assume $X_m = \widetilde{F} = \mathbb{R}^m$ and therefore $\|S(f)\|_G \ge \|f\|_{\widetilde{F}}\,b_m(S)$ holds for all $f\in F$. Additionally we assume $b_m(S)>0$, i.e. $S$ is injective. Let $\mu$ be the uniform distribution on the input set $F\subseteq\mathbb{R}^m$ with respect to the $m$-dimensional Lebesgue measure. We assume that the mapping $N\colon\widetilde{F}\to\mathbb{R}^n$ is an arbitrary surjective linear (i.e. non-adaptive) information mapping. We will show that for any (measurable) choice of a mapping $\phi\colon\mathbb{R}^n\to G$ we obtain

$$e(\phi\circ N, S, \mu) = \int\|S(f)-\phi(N(f))\|_G\,d\mu(f) \ge \frac{m-n}{m+1}\,b_m(S), \qquad (16)$$

which by Proposition 1 (Bakhvalov's technique) provides a lower bound for non-adaptive Monte Carlo methods.
Within the first step we rewrite the integral in (16) as an integral of local average errors over the information. The set of inputs $F$ and the information mapping $N$ match the situation of Corollary 1 with $m = n+d$; each $d$-dimensional slice $F_y := F\cap N^{-1}(y)$ represents all inputs with the same information $y\in\mathbb{R}^n$. Since $\mu$ is the uniform distribution on $F$, the uniform distribution $\mu_y$ on $F_y$ is a version of the conditional distribution of $\mu$ given $y = N(f)$. Therefore we can write the integral from (16) as

$$\int\|S(f)-\phi(N(f))\|_G\,d\mu(f) = \int\!\!\int\|S(f)-\phi(y)\|_G\,d\mu_y(f)\,d\bigl(\mu\circ N^{-1}\bigr)(y). \qquad (17)$$

The size of the slices $F_y$ compared to the central slice $F_0$ (where $y=0$) shall be described by $R(y) := \bigl(\mathrm{Vol}_d(F_y)/\mathrm{Vol}_d(F_0)\bigr)^{1/d}$. The function $R(y)^d$ is a quasi-density for the distribution $\mu\circ N^{-1}$ of the information $y\in\mathbb{R}^n$. Further, by subsequent Lemma 1 we have a lower bound for the inner integral, which we call the local average error:

$$\int\|S(f)-\phi(y)\|_G\,d\mu_y(f) \ge \frac{d}{d+1}\,R(y)\,b_m(S).$$

Therefore the integral (17) is bounded from below by an expression that only depends on the volumes of the parallel slices $F_y$:

$$\int\|S(f)-\phi(N(f))\|_G\,d\mu(f) \ge \frac{d}{d+1}\,\frac{\int_{N(F)} R(y)^{d+1}\,d^n y}{\int_{N(F)} R(y)^{d}\,d^n y}\,b_m(S), \qquad (18)$$

where $d^n y$ denotes the integration by the $n$-dimensional Lebesgue measure.
The problem now is reduced to a variational problem on $n$-variate functions $R(y)$. Note that $0\le R(y)\le R(0)=1$ since $R$ is symmetric and concave on its support $N(F)$, which is a convex and symmetric body in $\mathbb{R}^n$, see Corollary 1. The set $N(F)$ satisfies the structure of a unit ball, hence it defines a norm $\|\cdot\|_{N(F)}$ on $\mathbb{R}^n$ (compare Proposition 2). We switch to a kind of spherical coordinates, representing any information vector $y\ne 0$ by its length $r = r(y) := \|y\|_{N(F)}$ and its direction $k = k(y) := \frac1r y$, i.e. $\|k\|_{N(F)}=1$ and $y = r\,k$. Let $\sigma$ denote the cone measure (see [6] for an explicit construction) on the set of directions $\partial N(F)$. The $n$-dimensional Lebesgue integration is to be replaced by $d^n y = n\,r^{n-1}\,dr\,d\sigma(k)$, i.e.

$$\frac{\int_{N(F)} R(y)^{d+1}\,d^n y}{\int_{N(F)} R(y)^{d}\,d^n y} = \frac{\int_{\partial N(F)}\int_0^1 R(rk)^{d+1}\,r^{n-1}\,dr\,d\sigma(k)}{\int_{\partial N(F)}\int_0^1 R(rk)^{d}\,r^{n-1}\,dr\,d\sigma(k)}, \qquad (19)$$

where we have cancelled the factor $n$. For all directions $k\in\partial N(F)$ the ratio of the integrands with respect to $k$ is globally bounded from below, in detail

$$\frac{\int_0^1 R(rk)^{d+1}\,r^{n-1}\,dr}{\int_0^1 R(rk)^{d}\,r^{n-1}\,dr} \ge \frac{d+1}{d+n+1} = \frac{d+1}{m+1},$$

where the function $r\mapsto R(rk)\in[0,1]$ is concave on $[0,1]$ and $R(0)=1$. For the solution of this univariate variational problem see subsequent Lemma 2. It follows that (19) is bounded from below by $\frac{d+1}{m+1}$ as well, which along with (18) proves the theorem. □



The following lemma is about local average errors. Its quintessence is that ball-shaped slices $S(F_y) \subseteq G$ (with respect to the norm in $G$) are optimal. For the general notion of local average radius of information see [11, pp. 197–204].

Lemma 1 (Local average error)  Let $S: \mathbb{R}^m \to G$ be an injective linear mapping between Banach spaces, where $F \subset \mathbb{R}^m$ is the unit ball with respect to an arbitrary norm on $\mathbb{R}^m$, and let $\mu$ be the uniform distribution on $F$. Let $N: \mathbb{R}^m \to \mathbb{R}^n$ be a linear surjective information mapping, where for $y \in \mathbb{R}^n$ the conditional measure $\mu_y$ is the uniform distribution on the slice $F_y := F \cap N^{-1}(y)$. With $R(y) := \bigl(\mathrm{Vol}_d(F_y)/\mathrm{Vol}_d(F_0)\bigr)^{1/d}$ and $d := m - n$, for the local average error we have

  $\inf_{g \in G} \int_{F_y} \|S(f) - g\|_G \, d\mu_y(f) \ge \frac{d}{d+1}\, R(y)\, b_m(S).$
Proof  Since $S: \mathbb{R}^m \to G$ is linear and $b_m(S) > 0$, the exact solutions $S(F_y)$ for inputs with the same information $y \in \mathbb{R}^n$ in each case form (convex) sets within $d$-dimensional affine subspaces $U_y := S(N^{-1}(y))$ of the output space $G$. We compare the volume of subsets within different parallel affine subspaces, i.e. any $d$-dimensional Lebesgue-like measure $\widetilde{\mu}$ on $U_0$ is also defined for subsets of the affine subspaces $U_y$ just as in Corollary 2. The linear mapping $S$ preserves the ratio of volumes, i.e.

  $\frac{\mathrm{Vol}_d(F_y)}{\mathrm{Vol}_d(F_0)} = R(y)^d = \frac{\widetilde{\mu}(S(F_y))}{\widetilde{\mu}(S(F_0))}.$   (20)

Therefore for each information $y \in N(F)$ the image measure $\mu_y \circ S^{-1}$ is the uniform distribution on $S(F_y)$ with respect to $\widetilde{\mu}$. This means that for any $g \in G$ we need to show the inequality

  $\int_{F_y} \|S(f) - g\|_G \, d\mu_y(f) = \frac{1}{\widetilde{\mu}(S(F_y))} \int_{S(F_y)} \|x - g\|_G \, d\widetilde{\mu}(x) \ge \frac{d}{d+1}\, R(y)\, b_m(S).$   (21)

For convenience we assume $\widetilde{\mu}$ to be scaled such that

  $\widetilde{\mu}(B_r(0) \cap U_0) = r^d,$   (22)

where $B_r(g) := \{x \in G \mid \|x - g\|_G \le r\}$ is the ball around $g \in G$ with radius $r \ge 0$.

Given the information $N(f) = y$ let $g \in G$ be any (possibly non-interpolatory) choice for a return value. For $r \ge \operatorname{dist}(g, U_y) =: \delta$ we define the set of those points in $U_y$ that have a distance of at most $r$ to $g$,

  $C_r := B_r(g) \cap U_y,$


and write its volume as a function $V(r) := \widetilde{\mu}(C_r)$. By Corollary 2 the function $V$ is continuous and strictly increasing for $r \ge \delta$ and

  $V(r) \le \widetilde{\mu}(B_r(0) \cap U_0) \overset{(22)}{=} r^d.$   (23)

Therefore also the inverse function, which we denote by

  $\psi: [V(\delta), \infty) \to [\delta, \infty), \quad \text{with } \psi(V(r)) = r \text{ for } r \ge \delta,$

is strictly increasing. By (23) we have $\psi(r^d) \ge \psi(V(r)) = r$ for $r \ge \delta$. That means, for $v = r^d \ge \delta^d$, and trivially for $V(\delta) \le v \le \delta^d$, we obtain

  $\psi(v) \ge \sqrt[d]{v}, \quad \text{for } v \ge V(\delta).$   (24)

Especially

  $\delta = \psi(V(\delta)) \ge \sqrt[d]{v}, \quad \text{for } v \le V(\delta).$   (25)

If $\widetilde{\mu}(S(F_y)) \le V(\delta)$ we obtain

  $\int_{S(F_y)} \|x - g\|_G \, d\widetilde{\mu}(x) \ge \delta\, \widetilde{\mu}(S(F_y)) \overset{(25)}{\ge} \frac{d}{d+1}\, \widetilde{\mu}(S(F_y))^{(d+1)/d}.$   (26)

Otherwise we introduce the abbreviation $\rho_y := \psi(\widetilde{\mu}(S(F_y)))$, where by definition we have

  $\widetilde{\mu}(S(F_y)) = \widetilde{\mu}(C_{\rho_y}).$

Note that

  $\widetilde{\mu}(S(F_y) \setminus C_{\rho_y}) = \widetilde{\mu}(C_{\rho_y} \setminus S(F_y)),$   (27)
  $\|x - g\|_G \ge \rho_y$ for $x \in S(F_y) \setminus C_{\rho_y}$, and
  $\|x - g\|_G \le \rho_y$ for $x \in C_{\rho_y} \setminus S(F_y)$.

This enables us to carry out a symmetrization:

  $\int_{S(F_y)} \|x - g\|_G \, d\widetilde{\mu}(x) \overset{(27)}{\ge} \int_{C_{\rho_y}} \|x - g\|_G \, d\widetilde{\mu}(x) \ge \delta\, V(\delta) + \int_{V(\delta)}^{\widetilde{\mu}(S(F_y))} \psi(v)\, dv \overset{(24),(25)}{\ge} \int_0^{\widetilde{\mu}(S(F_y))} v^{1/d}\, dv = \frac{d}{d+1}\, \widetilde{\mu}(S(F_y))^{(d+1)/d}.$   (28)

Together with (20), both cases, (26) and (28), give us

  $\frac{1}{\widetilde{\mu}(S(F_y))} \int_{S(F_y)} \|x - g\|_G \, d\widetilde{\mu}(x) \ge \frac{d}{d+1}\, R(y)\, \widetilde{\mu}(S(F_0))^{1/d} \ge \frac{d}{d+1}\, R(y)\, b_m(S),$

which is (21). For the second inequality we have used the definition of the Bernstein number, i.e. $B_{b_m(S)}(0) \cap S(\mathbb{R}^m) \subseteq S(F)$ and therefore $B_{b_m(S)}(0) \cap U_0 \subseteq S(F_0)$, which with our scaling (22) implies $b_m(S)^d \le \widetilde{\mu}(S(F_0))$. $\square$
Remark 1 (Alternative to Bernstein numbers)  In the very end of the above proof we have replaced $\widetilde{\mu}(S(F_0))$ by an expression using the Bernstein number $b_m(S)$. In fact, due to the scaling of $\widetilde{\mu}$, the expression $\widetilde{\mu}(S(F_0))$ is a volume comparison of an $(m-n)$-dimensional slice of the image of the input set $S(F)$ and the unit ball in $G$. We could replace the Bernstein numbers within Theorem 1 by new quantities

  $k_{m,n}(S) := \sup_{X_m} \inf_{Y_{m-n}} \left( \frac{\mathrm{Vol}_{m-n}(S(F) \cap Y_{m-n})}{\mathrm{Vol}_{m-n}(B_G \cap Y_{m-n})} \right)^{1/(m-n)},$   (29)

where $X_m \subseteq \widetilde{F}$ and $Y_{m-n} \subseteq S(X_m)$ are linear subspaces with dimension $\dim(X_m) = \dim(S(X_m)) = m$ and $\dim(Y_{m-n}) = m - n$, further $B_G$ denotes the unit ball in $G$ and for each choice of $Y_{m-n}$ the volume measure $\mathrm{Vol}_{m-n}$ may be any $(m-n)$-dimensional Lebesgue measure, since we are only interested in the ratio of volumes.
Lemma 2 (Variational problem)  For $d, n \in \mathbb{N}$ consider the variational problem of minimizing the functional

  $F[R(r)] := \frac{\int_0^1 R(r)^{d+1}\, r^{n-1}\, dr}{\int_0^1 R(r)^{d}\, r^{n-1}\, dr},$

where $R: [0,1] \to [0,1]$ is concave and $R(0) = 1$. Then

  $F[R(r)] \ge \frac{d+1}{d+n+1}$

with equality holding only for $R(r) = 1 - r$.

Proof  For $p > 0$ with repeated integration by parts we obtain

  $\int_0^1 (1-r)^{p-1}\, r^{n-1}\, dr = \frac{(n-1)!}{p\,(p+1)\cdots(p+n-1)},$


which is a special value of the beta function (see for example [12, p. 103]). Knowing the value of this integral we get

  $F[1-r] = \frac{d+1}{d+n+1}.$   (30)

The maximum is $F[1] = 1$. For other linear functions $\widetilde{R}(r) = (1-r) + \alpha r$ with $\widetilde{R}(1) = \alpha \in (0,1)$ we can write

  $F[(1-r) + \alpha r] = \frac{\int_0^1 (1 - (1-\alpha) r)^{d+1}\, r^{n-1}\, dr}{\int_0^1 (1 - (1-\alpha) r)^{d}\, r^{n-1}\, dr} \overset{[x = (1-\alpha) r]}{=} \frac{\int_0^{1-\alpha} (1-x)^{d+1}\, x^{n-1}\, dx}{\int_0^{1-\alpha} (1-x)^{d}\, x^{n-1}\, dx},$

where we have cancelled the factor $(1-\alpha)^{n}$. We can express this as a conditional expectation using a random variable $X \in [0,1]$ with quasi-density $(1-x)^d\, x^{n-1}$:

  $F[(1-r) + \alpha r] = \mathbb{E}[(1-X) \mid X \le 1-\alpha] = \mathbb{E}[(1-X) \mid (1-X) \ge \alpha],$

which obviously is monotonically increasing in $\alpha$.
For any nonlinear concave function $R: [0,1] \to [0,1]$ with $R(0) = 1$ there exists exactly one linear function $\widetilde{R}(r) = (1-r) + \alpha r$ with

  $\int_0^1 R(r)^{d}\, r^{n-1}\, dr - \int_0^1 \widetilde{R}(r)^{d}\, r^{n-1}\, dr = 0.$   (31)

Due to the concavity of $R$ there is exactly one $r_0 \in (0,1)$ with $R(r_0) = \widetilde{R}(r_0)$. For $r \in (0, r_0)$ we have

  $R(r) > \widetilde{R}(r) > R(r_0) \;\Longrightarrow\; R(r)^{d+1} - \widetilde{R}(r)^{d+1} > R(r_0)\,\bigl(R(r)^{d} - \widetilde{R}(r)^{d}\bigr) > 0.$

Meanwhile for $r \in (r_0, 1]$ we have

  $R(r) < \widetilde{R}(r) < R(r_0) \;\Longrightarrow\; 0 > R(r)^{d+1} - \widetilde{R}(r)^{d+1} > R(r_0)\,\bigl(R(r)^{d} - \widetilde{R}(r)^{d}\bigr).$

Therefore

  $\int_0^1 R(r)^{d+1}\, r^{n-1}\, dr - \int_0^1 \widetilde{R}(r)^{d+1}\, r^{n-1}\, dr > R(r_0) \left( \int_0^1 R(r)^{d}\, r^{n-1}\, dr - \int_0^1 \widetilde{R}(r)^{d}\, r^{n-1}\, dr \right) \overset{(31)}{=} 0,$

which with (31) implies $F[R] > F[\widetilde{R}]$. $\square$
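As a quick numerical sanity check of (30) and of the minimality claim (added here purely as an illustration, not part of the original argument), the functional $F$ can be evaluated by quadrature for the minimizer $R(r) = 1-r$ and for a nonlinear concave competitor:

```python
# Numerical sanity check of Lemma 2 (added illustration, not part of the proof):
# F[R] should equal (d+1)/(d+n+1) for R(r) = 1 - r and be strictly larger for
# any other concave R with R(0) = 1, e.g. R(r) = sqrt(1 - r).
import numpy as np

def F(R, d, n, num=200001):
    r = np.linspace(0.0, 1.0, num)
    values = R(r)
    numerator = np.trapz(values**(d + 1) * r**(n - 1), r)
    denominator = np.trapz(values**d * r**(n - 1), r)
    return numerator / denominator

d, n = 3, 2
print(F(lambda r: 1.0 - r, d, n), (d + 1) / (d + n + 1))   # both approx. 0.6667
print(F(lambda r: np.sqrt(1.0 - r), d, n))                  # approx. 0.73, i.e. larger
```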


Remark 2 (Quality of the prefactor)  Consider the identity $\mathrm{id}_m^1: \ell_1^m \to \ell_1^m$ with Bernstein number $b_m(\mathrm{id}_m^1) = 1$. (For notation see Sect. 3.1.) For any $J \subseteq \{1, \ldots, m\}$ being an index set containing $n$ indices define the deterministic algorithm

  $A_J(x) := \sum_{i \in J} x_i\, e_i, \quad x \in \ell_1^m,$

where $e_i = (\delta_{ij})_{j=1}^m$ are the vectors of the standard basis. With $\mu$ being the uniform distribution on the unit ball $B_1^m \subset \ell_1^m$, for the average case setting this type of algorithm is optimal.

We add some randomness to the above method. Let $J = J(\omega)$ be uniformly distributed on the system of index sets $\{I \subseteq \{1, \ldots, m\} \mid \#I = n\}$ and define the Monte Carlo algorithm $A_n = (A_n^\omega)_\omega$ by

  $A_n^\omega(x) := \sum_{i \in J(\omega)} x_i\, e_i.$

The error is

  $e(A_n, \mathrm{id}_m^1, x) = \mathbb{E}_\omega \|x - A_n^\omega(x)\|_1 = \sum_{i=1}^m \mathbb{P}(i \notin J(\omega))\, |x_i| = \frac{m-n}{m}\, \|x\|_1.$

Along with Theorem 1 we have

  $\frac{m-n}{m+1} \le e^{\mathrm{ran,nonada}}(n, \mathrm{id}_m^1, B_1^m) \le \frac{m-n}{m}.$

The remaining gap may be due to the fact that the distribution within the average case setting was no distribution on the surface of $F$ but the uniform distribution on the whole volume of $F$. Yet, for high dimensions most of the mass is concentrated near the surface.
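The two bounds of Remark 2 are easy to reproduce empirically. The following sketch (an added illustration with arbitrarily chosen $m$, $n$ and input $x$) estimates the error of the randomized coordinate-selection algorithm $A_n$ by simulation and compares it with the exact value $\frac{m-n}{m}\|x\|_1$:

```python
# Empirical check of Remark 2 (added illustration): the randomized algorithm A_n
# keeps a uniformly chosen subset J of n out of m coordinates, and its expected
# l1-error on an input x equals (m - n)/m * ||x||_1.
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 5
x = rng.standard_normal(m)
x /= np.abs(x).sum()                       # put x on the l1 unit sphere

errors = []
for _ in range(20000):
    J = rng.choice(m, size=n, replace=False)
    a = np.zeros(m)
    a[J] = x[J]                            # A_n(x): keep only the observed coordinates
    errors.append(np.abs(x - a).sum())

print(np.mean(errors))                     # approx. (m - n)/m = 0.75
print((m - n) / m * np.abs(x).sum())       # exact value 0.75
```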

2.2 The Adaptive Setting

A different approach was taken by Heinrich [3]. Gaussian measures can be downscaled in a way such that for their truncation to the unit ball $F$ the mass of the truncated area is small and for any adaptive information $N$ we have a big portion (with respect to the Gaussian measure) of slices $F_y = F \cap N^{-1}(y)$ that are in a certain sense close to the center, so that truncation does not make a big difference for the local average error. The Gaussian measure should, however, not be exaggeratedly concentrated around the origin, so that the local average error of those central slices is still sufficiently high. For the next theorem we combine Heinrich's general results on truncated Gaussian measures with Lewis' theorem, which gives us a way to choose a suitable Gaussian measure for our average case setting.
Theorem 2 (Adaptive Monte Carlo methods)  Let $S: \widetilde{F} \to G$ be a compact linear operator and $F \subset \widetilde{F}$ be the closed unit ball in $\widetilde{F}$. Then for $n < m$ for adaptive Monte Carlo methods we obtain

  $e^{\mathrm{ran,ada}}(n, S, F) \ge c\, \frac{m-n}{m}\, b_m(S),$

where the constant can be chosen as $c = \frac{1 - 2e^{-1}}{16} \ge \frac{1}{108}$.

Remark 3  The given constant can be directly extracted from the proof in Heinrich [3]. However, by optimizing some parts of the proof one can show that the theorem is still valid with $c = \frac{1}{16}$. When restricting to homogeneous algorithms (i.e. $A_n^\omega(\lambda f) = \lambda\, A_n^\omega(f)$ for $\lambda \in \mathbb{R}$) we may show the above result with the optimal constant $c = 1$ (see also Remark 2). The proofs for these statements will be published in future work.
Proof (Theorem 2)  As before we assume $\widetilde{F} = \mathbb{R}^m$. We start with the existence of in some sense optimal Gaussian measures on $\widetilde{F}$. Let $x$ be a standard Gaussian random vector in $\mathbb{R}^m$. Then $\alpha(J) := \mathbb{E}\, \|Jx\|_F$ defines a norm on the set of linear operators $J: \mathbb{R}^m \to \mathbb{R}^m$. By Lewis' theorem (see for example [10, Theorem 3.1]) there exists a linear mapping $J: \mathbb{R}^m \to \mathbb{R}^m$ with maximal determinant subject to $\alpha(J) = 1$, and $\operatorname{tr}(J^{-1} T) \le m\, \alpha(T)$ for any linear mapping $T: \mathbb{R}^m \to \mathbb{R}^m$. In particular with $T = JP$ for any rank-$(m-n)$ projection $P$ within $\mathbb{R}^m$ this implies

  $\mathbb{E}\, \|JPx\|_F \ge \frac{m-n}{m}.$   (32)

For the average setting let $\nu$ denote the Gaussian measure for the distribution of the rescaled random vector $c_\tau Jx$, where $c_\tau = \tfrac{1}{8}$, and let $\widetilde{\nu}$ be the truncated measure, i.e. $\widetilde{\nu}(A) = \nu(A \cap F)$ for measurable sets $A \subseteq \mathbb{R}^m$. Note that $\widetilde{\nu}$ is no probability measure, but a sub-probability measure with $\widetilde{\nu}(F) < 1$, which is sufficient for the purpose of lower bounds.

Then by Heinrich [3, Proposition 2] we have

  $e^{\mathrm{det,ada}}(n, S, \widetilde{\nu}) \ge c_\tau\, c''\, \inf_P \mathbb{E}\, \|SJPx\|_G,$   (33)

where the infimum is taken over orthogonal rank-$(m-n)$ projections $P$ and $c'' = \tfrac{1}{2} - e^{-1}$. (The conditional measure $\nu_y$ for $\nu$ given the information $y = N(f)$ can be represented as the distribution of $c_\tau J P_y x$ with a suitable orthogonal projection $P_y$.) With $\|SJPx\|_G \ge \|JPx\|_F\, b_m(S)$ and (32) and $c = c_\tau\, c''$ we obtain the theorem. $\square$

Note that we consider Monte Carlo algorithms with fixed information cost $n$, whereas in [3] $n$ denotes the average information cost $\mathbb{E}_\omega\, n(\omega)$, which leads to slightly different bounds, like $\frac{c}{4}\, b_{4n}(S)$ instead of $\frac{c}{2}\, b_{2n}(S)$.


3 Applications

3.1 Recovery of Sequences

We compare the results we obtain by Theorems 1 and 2 with some asymptotic lower bounds of Heinrich [3] for the Monte Carlo approximation of the identity

  $\mathrm{id}: \ell_p^N \to \ell_q^N.$

Here $\ell_p^N$ denotes $\mathbb{R}^N$ equipped with the $p$-norm $\|x\|_p = (|x_1|^p + \ldots + |x_N|^p)^{1/p}$ for $p < \infty$, or $\|x\|_\infty = \max_{i=1,\ldots,N} |x_i|$ for $p = \infty$; the input set is the unit ball $B_p^N$ of $\ell_p^N$. Since the identity is injective, Bernstein widths and Bernstein numbers coincide.

Proposition 4 (Heinrich 1992)  Let $1 \le p, q \le \infty$ and $n \in \mathbb{N}$. Then

  $e^{\mathrm{ran,ada}}(n, \mathrm{id}: \ell_p^{4n} \to \ell_q^{4n}, B_p^{4n}) \succsim \begin{cases} n^{1/q - 1/p}, & \text{if } 1 \le p, q < \infty, \\ n^{-1/p}\, (\log n)^{1/2}, & \text{if } 1 \le p < q = \infty, \\ n^{1/q}\, (\log n)^{-1/2}, & \text{if } 1 \le q < p = \infty, \\ 1, & \text{if } p = q = \infty. \end{cases}$

The above result is a direct application of Heinrich's technique of truncated Gaussian measures to a scaled version of the standard Gaussian distribution on $\mathbb{R}^m$, here $m = 4n$. In detail, we need the asymptotics of the norm expectations for a standard Gaussian vector $x \in \mathbb{R}^m$, which are $\mathbb{E}\, \|x\|_p \asymp m^{1/p}$ for $1 \le p < \infty$, and $\mathbb{E}\, \|x\|_\infty \asymp (\log m)^{1/2}$.
Now we cite some asymptotic results on Bernstein numbers, see [4, Lemma 3.6].

Lemma 3  Let $1 \le p, q \le \infty$ and $m \in \mathbb{N}$. Then

  $b_m(\mathrm{id}: \ell_p^{2m} \to \ell_q^{2m}) \asymp \begin{cases} m^{1/q - 1/p}, & \text{if } 1 \le p \le q \le \infty \text{ or } 1 \le q \le p \le 2, \\ m^{1/q - 1/2}, & \text{if } 1 \le q \le 2 \le p \le \infty, \\ 1, & \text{if } 2 \le q \le p \le \infty. \end{cases}$

Combining this with Theorem 2 for $m = 2n$ one may obtain a result similar to Proposition 4, though without the logarithmic factor for $1 \le p < q = \infty$ and even with a weaker polynomial order for $1 \le q < p \le \infty$ if $p > 2$. However for the non-adaptive setting with Theorem 1 we can use the quantities $k_{m,n}(S)$ defined in Remark 1. The following result on volume ratios due to Meyer and Pajor [5] is relevant to the problematic case $1 \le q < p \le \infty$.

Proposition 5 (Meyer, Pajor 1988)  For every $d$-dimensional subspace $Y_d \subseteq \mathbb{R}^m$ and for $1 \le q \le p \le \infty$ we have

  $\frac{\mathrm{Vol}_d(B_p^m \cap Y_d)}{\mathrm{Vol}_d(B_q^m \cap Y_d)} \ge \frac{\mathrm{Vol}_d(B_p^d)}{\mathrm{Vol}_d(B_q^d)}.$

Corollary 3  For $1 \le p, q \le \infty$ we have

  $e^{\mathrm{ran,nonada}}(n, \mathrm{id}: \ell_p^{4n} \to \ell_q^{4n}, B_p^{4n}) \succsim n^{1/q - 1/p}.$

Note that by this for the case $1 \le q < p = \infty$ we even have stronger lower bounds than in Proposition 4, namely without the logarithmic term; however this only holds for non-adaptive algorithms. On the other hand, for the case $1 \le p < q = \infty$ this result is weaker by a logarithmic factor compared to Heinrich's result.
Proof (Corollary 3)  For $1 \le p \le q \le \infty$ we apply Theorem 1 using the Bernstein numbers from Lemma 3 with $m = 2n$.

For $1 \le q < p \le \infty$ let $m = 4n$ and $d = m - n = 3n$. By Proposition 5 we have

  $k_{m,n}(\mathrm{id}: \ell_p^m \to \ell_q^m) = \inf_{Y_d \subseteq \mathbb{R}^m} \left( \frac{\mathrm{Vol}_d(B_p^m \cap Y_d)}{\mathrm{Vol}_d(B_q^m \cap Y_d)} \right)^{1/d} \ge \left( \frac{\mathrm{Vol}_d(B_p^d)}{\mathrm{Vol}_d(B_q^d)} \right)^{1/d}.$   (34)

The volume of the unit ball in $\ell_p^d$ can be found e.g. in [10, Eq. (1.17)], it is

  $\mathrm{Vol}_d(B_p^d) = \frac{\bigl(2\, \Gamma(1 + \tfrac{1}{p})\bigr)^d}{\Gamma(1 + \tfrac{d}{p})}.$

For $1 \le p < \infty$ we apply Stirling's formula to the denominator

  $\Gamma\!\left(1 + \frac{d}{p}\right) = \sqrt{2\pi\, \frac{d}{p}}\, \left(\frac{d}{e p}\right)^{d/p} e^{\theta(d/p)}, \quad \text{where } 0 \le \theta\!\left(\frac{d}{p}\right) \le \frac{p}{12 d},$

and by this we obtain the asymptotics $(\mathrm{Vol}_d(B_p^d))^{1/d} \asymp d^{-1/p}$. For $p = \infty$ we simply have $(\mathrm{Vol}_d(B_\infty^d))^{1/d} = 2$. Putting this into (34), by Remark 1 together with Theorem 1 we obtain the corollary. $\square$

Finally observe that in the case $1 \le p \le q \le \infty$ Proposition 5 provides upper bounds for the quantities $k_{m,n}(\mathrm{id}: \ell_p^m \to \ell_q^m)$. By this we see that taking these quantities instead of the Bernstein numbers will not change the order of the lower bounds for the error $e^{\mathrm{ran,nonada}}(n, \mathrm{id}: \ell_p^{4n} \to \ell_q^{4n}, B_p^{4n})$.

3.2 Curse of Dimensionality for Approximation of Smooth Functions

For each dimension $d \in \mathbb{N}$ consider the problem

  $S_d = \mathrm{id}: \widetilde{F}_d \to L_\infty([0,1]^d),$   (35)

where the input space is

  $\widetilde{F}_d := \{ f \in C^\infty([0,1]^d) \mid \sup_{\alpha \in \mathbb{N}_0^d} \|D^\alpha f\|_\infty < \infty \},$   (36)

equipped with the norm

  $\|f\|_F := \sup_{\alpha \in \mathbb{N}_0^d} \|D^\alpha f\|_\infty.$   (37)

Here $D^\alpha f = \partial_1^{\alpha_1} \cdots \partial_d^{\alpha_d} f$ denotes the partial derivative of $f$ belonging to a multi-index $\alpha \in \mathbb{N}_0^d$. The input set $F_d$ is the unit ball in $\widetilde{F}_d$.

Novak and Woźniakowski have shown in [9] that this problem suffers from the curse of dimensionality for deterministic algorithms. The proof is based on the Bernstein numbers given in the following lemma; we sketch the idea on how to obtain these values.

Lemma 4 (Novak, Woźniakowski 2009)  For the problems $S_d$ we have

  $b_m(S_d) = 1 \quad \text{for } m \le 2^{\lfloor d/2 \rfloor}.$

Proof (idea)  Note that $\|f\|_\infty \le \|f\|_F$ and therefore $b_m(S_d) \le 1$ for all $m \in \mathbb{N}$. Further, with $s := \lfloor d/2 \rfloor$ consider the linear subspace

  $V_d := \Bigl\{ f \;\Big|\; f(x) = \sum_{i \in \{0,1\}^s} a_i\, (x_1 + x_2)^{i_1} (x_3 + x_4)^{i_2} \cdots (x_{2s-1} + x_{2s})^{i_s},\ a_i \in \mathbb{R} \Bigr\}$   (38)

of $\widetilde{F}_d$ with $\dim V_d = 2^{\lfloor d/2 \rfloor}$. For $f \in V_d$ one can show $\|D^\alpha f\|_\infty \le \|f\|_\infty$ for all multi-indices $\alpha \in \mathbb{N}_0^d$, i.e. $\|f\|_F = \|f\|_\infty$. Therefore with $m = 2^{\lfloor d/2 \rfloor}$ and $X_m = V_d$ we obtain $b_m(S) = 1$. Since the sequence of Bernstein numbers is decreasing, we know the first $2^{\lfloor d/2 \rfloor}$ Bernstein numbers. $\square$
Knowing this, by Theorems 1 and 2 we directly obtain the following result for randomized algorithms.

Corollary 4 (Curse of dimensionality)  For the problems $S_d$ we have

  $e^{\mathrm{ran,nonada}}(n, S_d, F_d) \ge \frac{1}{2} \quad \text{for } n \le 2^{\lfloor d/2 \rfloor - 1} - 1,$

and

  $e^{\mathrm{ran,ada}}(n, S_d, F_d) \ge c \quad \text{for } n \le 2^{\lfloor d/2 \rfloor - 1},$   (39)

with a suitable constant $c \ge 1/215$.


Note that if we do not collect any information about the problem, the best algorithm would simply return $0$ and the so-called initial error is

  $e(0, S_d, F_d) = 1.$

Even after evaluating exponentially many (in $d$) information functionals, with non-adaptive algorithms we only halve the initial error, if at all. The problem suffers from the curse of dimensionality. For more details on tractability notions see [8].
Acknowledgments  I want to thank E. Novak and A. Hinrichs for all the valuable hints and their encouragement during the process of compiling this work.
In addition I would like to thank S. Heinrich for his crucial hint on Bernstein numbers and Bernstein widths.
Last but not least I would like to express my gratitude to Brown University's ICERM for its support with a stimulating research environment and the opportunity of having scientific conversations that finally inspired the solution of the adaptive case during my stay in fall 2014.

References
1. Bakhvalov, N.S.: On the approximate calculation of multiple integrals. Vestnik MGU, Ser. Math. Mech. Astron. Phys. Chem. 4, 3–18 (1959) (in Russian). English translation: J. Complexity 31, 502–516 (2015)
2. Gardner, R.J.: The Brunn-Minkowski inequality. Bulletin of the AMS 39(3), 355–405 (2002)
3. Heinrich, S.: Lower bounds for the complexity of Monte Carlo function approximation. J. Complex. 8, 277–300 (1992)
4. Li, Y.W., Fang, G.S.: Bernstein n-widths of Besov embeddings on Lipschitz domains. Acta Mathematica Sinica, English Series 29(12), 2283–2294 (2013)
5. Meyer, M., Pajor, A.: Sections of the unit ball of $\ell_p^n$. J. Funct. Anal. 80, 109–123 (1988)
6. Naor, A.: The surface measure and cone measure on the sphere of $\ell_p^n$. Trans. AMS 359, 1045–1079 (2007)
7. Novak, E.: Optimal linear randomized methods for linear operators in Hilbert spaces. J. Complex. 8, 22–36 (1992)
8. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume I: Linear Information. European Mathematical Society, Zürich (2008)
9. Novak, E., Woźniakowski, H.: Approximation of infinitely differentiable multivariate functions is intractable. J. Complex. 25, 398–404 (2009)
10. Pisier, G.: The Volume of Convex Bodies and Banach Space Geometry. Cambridge University Press, Cambridge (1989)
11. Traub, J.F., Wasilkowski, G.W., Woźniakowski, H.: Information-Based Complexity. Academic Press, New York (1988)
12. Wang, Z.X., Guo, D.R.: Special Functions. World Scientific, Singapore (1989)

A Note on the Importance of Weak Convergence Rates for SPDE Approximations in Multilevel Monte Carlo Schemes

Annika Lang

Abstract  It is a well-known rule of thumb that approximations of stochastic partial differential equations have essentially twice the order of weak convergence compared to the corresponding order of strong convergence. This is already known for many approximations of stochastic (ordinary) differential equations while it is recent research for stochastic partial differential equations. In this note it is shown how the availability of weak convergence results influences the number of samples in multilevel Monte Carlo schemes and therefore reduces the computational complexity of these schemes for a given accuracy of the approximations.

Keywords  Stochastic partial differential equations · Multilevel Monte Carlo methods · Finite element approximations · Weak error analysis · Stochastic heat equation

1 Introduction
Since the publication of Giles' articles about multilevel Monte Carlo methods [8, 9], which applied an earlier idea of Heinrich [10] to stochastic differential equations, an enormous amount of literature on the application of multilevel Monte Carlo schemes to various applications has been published. For an overview of the state of the art in the area, the reader is referred to the scientific program and the proceedings of MCQMC14 in Leuven.

This note is intended to show the consequences of the availability of different types of convergence results for stochastic partial differential equations of Itô type (SPDEs for short in what follows). Here we consider so-called strong and weak
A. Lang (B)
Department of Mathematical Sciences, Chalmers University of Technology
and University of Gothenburg, SE-412 96 Gothenburg, Sweden
e-mail: annika.lang@chalmers.se

convergence rates, where a sequence of approximations $(Y_\ell,\ \ell \in \mathbb{N}_0)$ of an $H$-valued random variable $Y$ converges strongly (also called in mean square) to $Y$ if

  $\lim_{\ell \to +\infty} \mathbb{E}[\|Y - Y_\ell\|_H^2]^{1/2} = 0.$

In the context of this note, $H$ denotes a separable Hilbert space. The sequence is said to converge weakly to $Y$ if

  $\lim_{\ell \to +\infty} |\mathbb{E}[\varphi(Y_\ell)] - \mathbb{E}[\varphi(Y)]| = 0$

for $\varphi$ in an appropriately chosen class of functionals that depends in general on


the treated problem. While strong convergence results for approximations of many
SPDEs are already well-known, corresponding orders of weak convergence that are
better than the strong ones are just rarely available. For an overview on the existing
literature on weak convergence, the reader is referred to [1, 11] and the literature
therein. The necessity to do further research in this area is besides other motivations
also due to the efficiency of multilevel Monte Carlo approximations, which is the
content of this note. By a rule of thumb one expects the order of weak convergence to
be twice the strong one for SPDEs. This is shown under certain smoothness assumptions on the SPDE and its approximation in [1]. We use the SPDE from [1] and its
approximations with the desired strong and weak convergence rates to show that
the additional knowledge of better weak than strong convergence rates changes the
choices of the number of samples per level in a multilevel Monte Carlo approximation according to the theory. Since, for a given accuracy, the number of samples
reduces with the availability of weak rates, the overall computational work decreases.
Computing numbers, we shall see in the end that for high dimensional problems and
low regularity of the original SPDE the work using only strong approximation results
is essentially the squared work using also weak approximation rates. In other words
the order of the complexity of the work in terms of accuracy decreases essentially by
a factor of 2, when weak convergence rates are available. The intention of this note
is to point out this important fact by writing down the resulting numbers explicitly.
First simulation results are presented in the end for a stochastic heat equation in
one dimension driven by additive space-time white noise, which, to the best of my
knowledge, are the first simulation results of that type in the literature. The obtained
results confirm the theory.
This work is organized as follows: In Sect. 2 the multilevel Monte Carlo method
is recalled including results for the approximation of Hilbert-space-valued random
variables on arbitrary refinements. SPDEs and their approximations are introduced in
Sect. 3 and results for strong and weak errors from [1] are summarized. The results
from Sects. 2 and 3 are combined in Sect. 4 to a multilevel Monte Carlo scheme
for SPDEs and the consequences of the knowledge of weak convergence rates are
outlined. Finally, the theory is confirmed by simulations in Sect. 5.


2 Multilevel Monte Carlo for Random Variables

In this section we recall and improve a convergence and a work-versus-accuracy result for the multilevel Monte Carlo estimator of a Hilbert-space-valued random variable from [3]. This is used to calculate errors and computational work for the approximation of stochastic partial differential equations in Sect. 4. A multilevel Monte Carlo method for (more general) Banach-space-valued random variables has been introduced in [10], where the author derives bounds on the error for given work. Here, we do the contrary and bound the overall work for a given accuracy.

We start with a lemma on the convergence in the number of samples of a Monte Carlo estimator. Therefore let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and let $Y$ be a random variable with values in a Hilbert space $(B, (\cdot,\cdot)_B)$ and $(Y^i, i \in \mathbb{N})$ be a sequence of independent, identically distributed copies of $Y$. Then the strong law of large numbers states that the Monte Carlo estimator $E_N[Y]$ defined by

  $E_N[Y] := \frac{1}{N} \sum_{i=1}^N Y^i$

converges $\mathbb{P}$-almost surely to $\mathbb{E}[Y]$ for $N \to +\infty$. In the following lemma we see that it also converges in mean square to $\mathbb{E}[Y]$ if $Y$ is square integrable, i.e., $Y \in L^2(\Omega; B)$ with

  $L^2(\Omega; B) := \{ v: \Omega \to B,\ v \text{ strongly measurable},\ \|v\|_{L^2(\Omega;B)} < +\infty \},$

where $\|v\|_{L^2(\Omega;B)} := \mathbb{E}[\|v\|_B^2]^{1/2}$. In contrast to the almost sure convergence of $E_N[Y]$ derived from the strong law of large numbers, a convergence rate in mean square can be deduced from the following lemma in terms of the number of samples $N \in \mathbb{N}$.

Lemma 1  For any $N \in \mathbb{N}$ and for $Y \in L^2(\Omega; B)$, it holds that

  $\|\mathbb{E}[Y] - E_N[Y]\|_{L^2(\Omega;B)} = \frac{1}{\sqrt{N}}\, \mathrm{Var}[Y]^{1/2} \le \frac{1}{\sqrt{N}}\, \|Y\|_{L^2(\Omega;B)}.$

The lemma is proven in, e.g., [6, Lemma 4.1]. It shows that the sequence of so-called Monte Carlo estimators $(E_N[Y], N \in \mathbb{N})$ converges with rate $O(N^{-1/2})$ in mean square to the expectation of $Y$.
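As a small illustration of Lemma 1 (added here; it is not part of the original note), the root mean square error of the Monte Carlo estimator of a real-valued random variable indeed decays like $N^{-1/2}$:

```python
# Empirical illustration of the O(N^{-1/2}) rate of Lemma 1 for Y ~ Uniform(0,1).
import numpy as np

rng = np.random.default_rng(1)
true_mean, repetitions = 0.5, 2000
for N in (10, 100, 1000, 10000):
    estimates = rng.random((repetitions, N)).mean(axis=1)   # E_N[Y], repeated
    rmse = np.sqrt(np.mean((estimates - true_mean) ** 2))
    print(N, rmse, np.sqrt(1.0 / (12 * N)))                  # rmse vs Var[Y]^{1/2}/sqrt(N)
```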
Next let us assume that $(Y_\ell, \ell \in \mathbb{N}_0)$ is a sequence of approximations of $Y$, e.g., $Y_\ell \in V_\ell$, where $(V_\ell, \ell \in \mathbb{N}_0)$ is a sequence of finite dimensional subspaces of $B$. For given $L \in \mathbb{N}_0$, it holds that

  $Y_L = Y_0 + \sum_{\ell=1}^L (Y_\ell - Y_{\ell-1})$

and due to the linearity of the expectation that

  $\mathbb{E}[Y_L] = \mathbb{E}[Y_0] + \sum_{\ell=1}^L \mathbb{E}[Y_\ell - Y_{\ell-1}].$

A possible way to approximate $\mathbb{E}[Y_L]$ is to approximate $\mathbb{E}[Y_\ell - Y_{\ell-1}]$ with the corresponding Monte Carlo estimator $E_{N_\ell}[Y_\ell - Y_{\ell-1}]$ with a number of independent samples $N_\ell$ depending on the level $\ell$. We set

  $E^L[Y_L] := E_{N_0}[Y_0] + \sum_{\ell=1}^L E_{N_\ell}[Y_\ell - Y_{\ell-1}]$

and call $E^L[Y_L]$ the multilevel Monte Carlo estimator of $\mathbb{E}[Y_L]$.
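A minimal sketch of this estimator for real-valued quantities of interest reads as follows (an added illustration; the function `sample_level`, which is assumed to return coupled samples of $Y_\ell$ and $Y_{\ell-1}$ driven by the same underlying randomness, is hypothetical and must be supplied by the user):

```python
# Generic multilevel Monte Carlo estimator E^L[Y_L] for a real-valued quantity
# (illustrative sketch). sample_level(l, size) is a hypothetical user-supplied
# function returning two arrays (samples of Y_l and of Y_{l-1}) computed from
# the same underlying random input; for l = 0 the second array is ignored.
import numpy as np

def mlmc_estimate(sample_level, N):
    """N = [N_0, ..., N_L]: number of independent samples on each level."""
    y0, _ = sample_level(0, N[0])
    estimate = y0.mean()                          # E_{N_0}[Y_0]
    for l in range(1, len(N)):
        y_fine, y_coarse = sample_level(l, N[l])
        estimate += (y_fine - y_coarse).mean()    # E_{N_l}[Y_l - Y_{l-1}]
    return estimate
```

For Hilbert-space-valued $Y_\ell$ the same structure applies with the scalar samples replaced by coefficient vectors of the discrete solutions.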
The following lemma gives convergence results for the estimator depending on the order of weak convergence of $(Y_\ell, \ell \in \mathbb{N}_0)$ to $Y$ and on the convergence of the variance of $(Y_\ell - Y_{\ell-1}, \ell \in \mathbb{N})$. If neither estimates on weak convergence rates nor on the convergence of the variances are available, one can use the (in general slower) strong convergence rates.
Lemma 2  Let $Y \in L^2(\Omega; B)$ and let $(Y_\ell, \ell \in \mathbb{N}_0)$ be a sequence in $L^2(\Omega; B)$; then, for $L \in \mathbb{N}_0$, it holds that

  $\|\mathbb{E}[Y] - E^L[Y_L]\|_{L^2(\Omega;B)} \le \|\mathbb{E}[Y - Y_L]\|_B + \|\mathbb{E}[Y_L] - E^L[Y_L]\|_{L^2(\Omega;B)}$
  $\quad = \|\mathbb{E}[Y - Y_L]\|_B + \Bigl( N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^L N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}] \Bigr)^{1/2}$
  $\quad \le \|Y - Y_L\|_{L^2(\Omega;B)} + \sqrt{2}\, \Bigl( \sum_{\ell=0}^L N_\ell^{-1} \bigl( \|Y - Y_\ell\|_{L^2(\Omega;B)}^2 + \|Y - Y_{\ell-1}\|_{L^2(\Omega;B)}^2 \bigr) \Bigr)^{1/2},$

where $Y_{-1} := 0$.

Proof  This is essentially [3, Lemma 2.2] except that the square root is kept outside the sum. Therefore it remains to show the property of the multilevel Monte Carlo estimator that

  $\|\mathbb{E}[Y_L] - E^L[Y_L]\|_{L^2(\Omega;B)}^2 = N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^L N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}].$


To prove this we first observe that

  $\mathbb{E}[Y_L] - E^L[Y_L] = \mathbb{E}[Y_0] - E_{N_0}[Y_0] + \sum_{\ell=1}^L \bigl( \mathbb{E}[Y_\ell - Y_{\ell-1}] - E_{N_\ell}[Y_\ell - Y_{\ell-1}] \bigr)$

and that all summands are independent, centered random variables by the construction of the multilevel Monte Carlo estimator. Thus [7, Proposition 1.12] implies that

  $\mathbb{E}[\|\mathbb{E}[Y_L] - E^L[Y_L]\|_B^2] = \mathbb{E}[\|\mathbb{E}[Y_0] - E_{N_0}[Y_0]\|_B^2] + \sum_{\ell=1}^L \mathbb{E}[\|\mathbb{E}[Y_\ell - Y_{\ell-1}] - E_{N_\ell}[Y_\ell - Y_{\ell-1}]\|_B^2]$

and Lemma 1 yields the claim. $\square$

This lemma enables us to choose, for a given order of weak convergence of $(Y_\ell, \ell \in \mathbb{N}_0)$ and for given convergence rates of the variances of $(Y_\ell - Y_{\ell-1}, \ell \in \mathbb{N})$, the number of samples $N_\ell$ on each level $\ell \in \mathbb{N}_0$ such that all terms in the error estimate are equilibrated.

The following theorem is essentially Theorem 2.3 in [3]. While it was previously formulated for a sequence of discretizations obtained by regular subdivision, i.e., $h_\ell = C\, 2^{-\ell}$, it is written down for general sequences of discretizations here with improved sample sizes. For completeness we include the proof. We should also remark that the convergence with basis 2 by regular subdivision in [3] is useful and important for SPDEs since most available approximation schemes that can be implemented are obtained in that way. Nevertheless, it is also known that the refinement with respect to basis 2 is not optimal for multilevel Monte Carlo approximations. Therefore it makes sense to reformulate the theorem in this more general way.
Theorem 1  Let $(a_\ell, \ell \in \mathbb{N}_0)$ be a decreasing sequence of positive real numbers that converges to zero and let $(Y_\ell, \ell \in \mathbb{N}_0)$ converge weakly to $Y$, i.e., there exists a constant $C_1$ such that

  $\|\mathbb{E}[Y - Y_\ell]\|_B \le C_1\, a_\ell$

for $\ell \in \mathbb{N}_0$. Furthermore assume that the variance of $(Y_\ell - Y_{\ell-1}, \ell \in \mathbb{N})$ converges with order $2\eta \in [0, 2]$ with respect to $(a_\ell, \ell \in \mathbb{N}_0)$, i.e., there exists a constant $C_2$ such that

  $\mathrm{Var}[Y_\ell - Y_{\ell-1}] \le C_2\, a_\ell^{2\eta},$

and that $\mathrm{Var}[Y_0] = C_3$. For a chosen level $L \in \mathbb{N}_0$, set $N_\ell := \lceil a_L^{-2} a_\ell^{2\eta} \ell^{1+\epsilon} \rceil$, $\ell = 1, \ldots, L$, $\epsilon > 0$, and $N_0 := \lceil a_L^{-2} \rceil$; then the error of the multilevel Monte Carlo approximation is bounded by

  $\|\mathbb{E}[Y] - E^L[Y_L]\|_{L^2(\Omega;B)} \le \bigl( C_1 + (C_3 + C_2\, \zeta(1+\epsilon))^{1/2} \bigr)\, a_L,$

where $\zeta$ denotes the Riemann zeta function, i.e., $\|\mathbb{E}[Y] - E^L[Y_L]\|_{L^2(\Omega;B)}$ has the same order of convergence as $\|\mathbb{E}[Y - Y_L]\|_B$.
Assume further that the work $W_\ell^B$ of one calculation of $Y_\ell - Y_{\ell-1}$, $\ell \ge 1$, is bounded by $C_4\, a_\ell^{-\beta}$ for a constant $C_4$ and $\beta > 0$, that the work to calculate $Y_0$ is bounded by a constant $C_5$, and that the addition of the Monte Carlo estimators costs $C_6\, a_L^{-\gamma}$ for some $\gamma \ge 0$ and some constant $C_6$. Then the overall work $W_L$ is bounded by

  $W_L \lesssim a_L^{-2} \Bigl( C_5 + C_4 \sum_{\ell=1}^L a_\ell^{2\eta - \beta}\, \ell^{1+\epsilon} \Bigr) + C_6\, a_L^{-\gamma}.$

If furthermore $(a_\ell, \ell \in \mathbb{N}_0)$ decreases polynomially, i.e., there exists a $\rho > 1$ such that $a_\ell = O(\rho^{-\ell})$, then the bound on the computational work simplifies to

  $W_L = \begin{cases} O(a_L^{-\max\{2, \gamma\}}) & \text{if } \beta < 2\eta, \\ O(\max\{ a_L^{-(2+\beta-2\eta)}\, L^{2+\epsilon},\ a_L^{-\gamma} \}) & \text{if } \beta \ge 2\eta. \end{cases}$

Proof  First, we calculate the error of the multilevel Monte Carlo estimator. It holds with the made assumptions that

  $N_0^{-1}\, \mathrm{Var}[Y_0] \le C_3\, a_L^{2}$

and, for $\ell = 1, \ldots, L$, that

  $N_\ell^{-1}\, \mathrm{Var}[Y_\ell - Y_{\ell-1}] \le C_2\, a_L^{2}\, a_\ell^{-2\eta}\, \ell^{-(1+\epsilon)}\, a_\ell^{2\eta} = C_2\, a_L^{2}\, \ell^{-(1+\epsilon)}.$

So overall we get that

  $\sum_{\ell=1}^L N_\ell^{-1}\, \mathrm{Var}[Y_\ell - Y_{\ell-1}] \le C_2\, a_L^{2} \sum_{\ell=1}^L \ell^{-(1+\epsilon)} \le C_2\, a_L^{2}\, \zeta(1+\epsilon),$

where $\zeta$ denotes the Riemann zeta function. To finish the calculation of the error we apply Lemma 2 and assemble all estimates to

  $\|\mathbb{E}[Y] - E^L[Y_L]\|_{L^2(\Omega;B)} \le \bigl( C_1 + (C_3 + C_2\, \zeta(1+\epsilon))^{1/2} \bigr)\, a_L.$

Next we calculate the necessary work to achieve this error. The overall work consists of the work $W_\ell^B$ to compute $Y_\ell - Y_{\ell-1}$ times the number of samples $N_\ell$ on all levels $\ell = 1, \ldots, L$, the work $W_0^B$ on level 0, and the addition of the Monte Carlo estimators in the end. Therefore, using the observation that $N_\ell \lesssim a_L^{-2} a_\ell^{2\eta} \ell^{1+\epsilon}$, $\ell = 1, \ldots, L$, and $N_0 \lesssim a_L^{-2}$ with equality if the right hand side is an integer, we obtain that

  $W_L \le C_5\, N_0 + C_4 \sum_{\ell=1}^L N_\ell\, a_\ell^{-\beta} + C_6\, a_L^{-\gamma}$
  $\quad \lesssim C_5\, a_L^{-2} + C_4 \sum_{\ell=1}^L a_L^{-2} a_\ell^{2\eta} \ell^{1+\epsilon} a_\ell^{-\beta} + C_6\, a_L^{-\gamma}$
  $\quad \le a_L^{-2} \Bigl( C_5 + C_4 \sum_{\ell=1}^L a_\ell^{2\eta-\beta}\, \ell^{1+\epsilon} \Bigr) + C_6\, a_L^{-\gamma},$

which proves the first claim of the theorem on the necessary work.

If $\beta < 2\eta$ and additionally $(a_\ell, \ell \in \mathbb{N}_0)$ decreases polynomially, the sum on the right hand side is absolutely convergent and therefore

  $W_L \lesssim (C_5 + C_4\, C)\, a_L^{-2} + C_6\, a_L^{-\gamma} = O(a_L^{-\max\{2,\gamma\}}).$

For $\beta \ge 2\eta$, it holds that

  $W_L \lesssim a_L^{-2} \bigl( C_5 + C_4\, a_L^{2\eta-\beta}\, L^{2+\epsilon} \bigr) + C_6\, a_L^{-\gamma} = O(\max\{ a_L^{-(2+\beta-2\eta)}\, L^{2+\epsilon},\ a_L^{-\gamma} \}).$

This finishes the proof of the theorem. $\square$

We remark that the computation of the sum over different levels of the Monte Carlo estimators does not increase the computational complexity if $Y_\ell \in V_\ell$ for all $\ell \in \mathbb{N}_0$ and $(V_\ell, \ell \in \mathbb{N}_0)$ is a sequence of nested finite dimensional subspaces of $B$.

3 Approximation of Stochastic Partial Differential Equations

In this section we use the framework of [1] and recall the setting and the results presented in that manuscript. We use the different orders of strong and weak convergence of a Galerkin method for the approximation of a stochastic parabolic evolution problem in Sect. 4 to show that it is essential for the efficiency of multilevel Monte Carlo methods to consider also weak convergence rates and not only strong ones as was presented in [6].

Let $(H, (\cdot,\cdot)_H)$ be a separable Hilbert space with induced norm $\|\cdot\|_H$ and $Q: H \to H$ be a self-adjoint positive semidefinite linear operator. We define the reproducing kernel Hilbert space $\mathcal{H} = Q^{1/2}(H)$ with inner product $(\cdot,\cdot)_{\mathcal{H}} = (Q^{-1/2}\cdot, Q^{-1/2}\cdot)_H$, where $Q^{-1/2}$ denotes the square root of the pseudo inverse of $Q$ which exists due to the made assumptions. Let us denote by $L_{HS}(\mathcal{H}; H)$ the space of all Hilbert–Schmidt operators from $\mathcal{H}$ to $H$, which will be abbreviated by $L_{HS}$ in what follows. Furthermore $L(H)$ is assumed to be the space of all bounded linear operators from
496

A. Lang

H to H. Finally, let (, A , (Ft )t0 , P) be a filtered probability space satisfying


the usual conditions which extends the probability space already introduced in
Sect. 2. The corresponding Bochner spaces are denoted by L p (; H), p 2, with
p
norms given by  Lp (;H) = E[ H ]1/p . In this framework we denote by W =
(W (t), t 0) a (Ft )t0 -adapted Q-Wiener process. Let us consider the stochastic
partial differential equation
dX(t) = (AX(t) + F(X(t))) dt + dW (t)

(1)

as Hilbert-space-valued stochastic differential equation on the finite time interval (0, T ], T < +, with deterministic initial condition X(0) = X0 . We pose
the following assumptions on the parameters, which ensure the existence of a mild
solution and some properties of the solution which are necessary for the derivation
and convergence of approximation schemes.
Assumption 1  Assume that the parameters of (1) satisfy the following:
1. Let $A$ be a negative definite, linear operator on $H$ such that $(-A)^{-1} \in L(H)$ and $A$ is the generator of an analytic semigroup $(S(t), t \ge 0)$ on $H$.
2. The initial value $X_0$ is deterministic and satisfies $(-A)^{\gamma} X_0 \in H$ for some $\gamma \in [0, 1]$.
3. The covariance operator $Q$ satisfies $\|(-A)^{(\gamma-1)/2}\|_{L_{HS}} < +\infty$ for the same $\gamma$ as above.
4. The drift $F: H \to H$ is twice differentiable in the sense that $F \in C_b^1(H; H) \cap C_b^2(H; H^{-1})$, where $H^{-1}$ denotes the dual space of the domain of $(-A)^{1/2}$.
Under Assumption 1, the SPDE (1) has a continuous mild solution

  $X(t) = S(t) X_0 + \int_0^t S(t-s) F(X(s))\, ds + \int_0^t S(t-s)\, dW(s)$   (2)

for $t \in [0, T]$, which is in $L^p(\Omega; H)$ for all $p \ge 2$ and satisfies for some constant $C$ that

  $\sup_{t \in [0,T]} \|X(t)\|_{L^p(\Omega;H)} \le C\, (1 + \|X_0\|_H).$

We approximate the mild solution by a Galerkin method in space and a semi-implicit Euler–Maruyama scheme in time, which is made precise in what follows and spares us the treatment of stability issues. Therefore let $(V_\ell, \ell \in \mathbb{N}_0)$ be a nested family of finite dimensional subspaces of $V$ with refinement level $\ell \in \mathbb{N}_0$, refinement sizes $(h_\ell, \ell \in \mathbb{N}_0)$, associated $H$-orthogonal projections $P_\ell$, and norm induced by $\|\cdot\|_H$. For $\ell \in \mathbb{N}_0$, the sequence $(V_\ell, \ell \in \mathbb{N}_0)$ is supposed to be dense in $H$ in the sense that for all $\phi \in H$, it holds that

  $\lim_{\ell \to +\infty} \|\phi - P_\ell \phi\|_H = 0.$

We denote the approximate operator by $A_\ell: V_\ell \to V_\ell$ and specify the necessary properties in Assumption 2 below. Furthermore let $(\Theta_n, n \in \mathbb{N}_0)$ be a sequence of equidistant time discretizations with step sizes $\Delta t_n$, i.e., for $n \in \mathbb{N}_0$,

  $\Theta_n := \{ t_k^n = \Delta t_n\, k,\ k = 0, \ldots, N(n) \},$

where $N(n) = T/\Delta t_n$, which we assume to be an integer for simplicity reasons. We define the fully discrete semigroup approximation by $S_{\ell,n} := (I - \Delta t_n A_\ell)^{-1} P_\ell$ and assume the following:
Assumption 2  The linear operators $A_\ell: V_\ell \to V_\ell$, $\ell \in \mathbb{N}_0$, and the orthogonal projectors $P_\ell: H \to V_\ell$, $\ell \in \mathbb{N}_0$, satisfy for all $k = 1, \ldots, N(n)$ that

  $\|(-A_\ell)^{\theta} S_{\ell,n}^k\|_{L(H)} \le C\, (t_k^n)^{-\theta}$

for $\theta \ge 0$ and

  $\|(-A_\ell)^{\theta} P_\ell (-A)^{-\theta}\|_{L(H)} \le C$

for $\theta \in [0, 1/2]$ uniformly in $\ell, n \in \mathbb{N}_0$. Furthermore they satisfy for all $\theta \in [0, 2]$, $\rho \in [-\theta, \min\{1, 2-\theta\}]$, and $k = 1, \ldots, N(n)$,

  $\|(S(t_k^n) - S_{\ell,n}^k)(-A)^{\rho/2}\|_{L(H)} \le C\, (h_\ell^{\theta} + (\Delta t_n)^{\theta/2})\, (t_k^n)^{-(\theta+\rho)/2}.$

The fully discrete semi-implicit Euler–Maruyama approximation is then given in recursive form for $t_k^n = \Delta t_n\, k \in \Theta_n$ and for $\ell \in \mathbb{N}_0$ by

  $X_{\ell,n}(t_k^n) := S_{\ell,n} X_{\ell,n}(t_{k-1}^n) + S_{\ell,n} F(X_{\ell,n}(t_{k-1}^n))\, \Delta t_n + S_{\ell,n} (W(t_k^n) - W(t_{k-1}^n))$

with $X_{\ell,n}(0) := P_\ell X_0$, which may be rewritten as

  $X_{\ell,n}(t_k^n) = S_{\ell,n}^k X_0 + \Delta t_n \sum_{j=1}^k S_{\ell,n}^{k-j+1} F(X_{\ell,n}(t_{j-1}^n)) + \sum_{j=1}^k \int_{t_{j-1}^n}^{t_j^n} S_{\ell,n}^{k-j+1}\, dW(s).$   (3)
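For the one-dimensional stochastic heat equation considered in Sect. 5, one realization of a scheme of the type (3) can be sketched as follows (an added illustration: it uses a finite-difference Laplacian and a grid approximation of space-time white noise instead of the finite element discretization of the paper, so constants and function spaces differ, and the drift is $F \equiv 0$):

```python
# One path of a semi-implicit Euler-Maruyama scheme for dX = (d^2X/dx^2) dt + dW
# on (0,1) with homogeneous Dirichlet boundary conditions (illustrative sketch
# with a finite-difference Laplacian and a grid approximation of white noise).
import numpy as np

def simulate_path(J=64, K=4096, T=1.0, rng=np.random.default_rng()):
    h, dt = 1.0 / J, T / K
    x = np.linspace(0.0, 1.0, J + 1)
    X = np.sin(np.pi * x)                                   # initial condition
    A = (np.diag(-2.0 * np.ones(J - 1)) +
         np.diag(np.ones(J - 2), 1) +
         np.diag(np.ones(J - 2), -1)) / h**2                # discrete Laplacian
    M = np.eye(J - 1) - dt * A                              # I - dt*A_l, cf. (3)
    for _ in range(K):
        dW = np.sqrt(dt / h) * rng.standard_normal(J - 1)   # white-noise increment
        X[1:-1] = np.linalg.solve(M, X[1:-1] + dW)          # implicit step, noise included
    return x, X
```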

We remark here that we do not approximate the noise, which might cause problems in implementations. One way to treat this problem is to truncate the Karhunen-Loève expansion of the $Q$-Wiener process depending on the decay of the spectrum of $Q$ (see [2, 5]).

The theory on strong convergence of the introduced approximation scheme has already been developed for some time and the convergence rates are well-known and stated in the following theorem.
Theorem 2 (Strong convergence [1])  Let the stochastic evolution equation (1) with mild solution $X$ and the sequence of its approximations $(X_{\ell,n}, \ell, n \in \mathbb{N}_0)$ given by (3) satisfy Assumptions 1 and 2 for some $\gamma \in (0, 1]$. Then, for every $\gamma' \in (0, \gamma)$, there exists a constant $C > 0$ such that for all $\ell, n \in \mathbb{N}_0$,

  $\max_{k=1,\ldots,N(n)} \|X(t_k^n) - X_{\ell,n}(t_k^n)\|_{L^2(\Omega;H)} \le C\, (h_\ell^{\gamma'} + (\Delta t_n)^{\gamma'/2}).$


It should be remarked at this point that the order of strong convergence does not exceed 1/2 although we are considering additive noise, since the regularity of the parameters of the SPDE is assumed to be rough. Under smoothness assumptions the rate of strong convergence attains one for additive noise since the higher order Milstein scheme is equal to the Euler–Maruyama scheme. Nevertheless, under the made assumptions on the regularity of the initial condition $X_0$ and the covariance operator $Q$ of the noise, this does not happen in the considered case.
The purpose of the multilevel Monte Carlo method is to approximate expressions of the form $\mathbb{E}[\varphi(X(t))]$ efficiently, where $\varphi: H \to \mathbb{R}$ is a sufficiently smooth functional. Therefore weak error estimates of the form $|\mathbb{E}[\varphi(X(t_k^n))] - \mathbb{E}[\varphi(X_{\ell,n}(t_k^n))]|$ are of importance. Before we state the convergence theorem from [1], we specify the necessary properties of $\varphi$ in the following assumption.

Assumption 3  The functional $\varphi: H \to \mathbb{R}$ is twice continuously Fréchet differentiable and there exists an integer $m \ge 2$ and a constant $C$ such that for all $x \in H$ and $j = 1, 2$,

  $\|\varphi^{(j)}(x)\|_{L^{[j]}(H;\mathbb{R})} \le C\, (1 + \|x\|_H^{m-j}),$

where $\|\varphi^{(j)}(x)\|_{L^{[j]}(H;\mathbb{R})}$ is the smallest constant $K > 0$ such that for all $u_1, \ldots, u_j \in H$,

  $|\varphi^{(j)}(x)(u_1, \ldots, u_j)| \le K\, \|u_1\|_H \cdots \|u_j\|_H.$
Combining this assumption on the functional $\varphi$ with Assumptions 1 and 2 on the parameters and approximation of the SPDE, we obtain the following result, which was proven in [1] using Malliavin calculus.

Theorem 3 (Weak convergence [1])  Let the stochastic evolution equation (1) with mild solution $X$ and the sequence of its approximations $(X_{\ell,n}, \ell, n \in \mathbb{N}_0)$ given by (3) satisfy Assumptions 1 and 2 for some $\gamma \in (0, 1]$. Then, for every $\varphi: H \to \mathbb{R}$ satisfying Assumption 3 and all $\gamma' \in [0, \gamma)$, there exists a constant $C > 0$ such that for all $\ell, n \in \mathbb{N}_0$,

  $\max_{k=1,\ldots,N(n)} |\mathbb{E}[\varphi(X(t_k^n)) - \varphi(X_{\ell,n}(t_k^n))]| \le C\, (h_\ell^{2\gamma'} + (\Delta t_n)^{\gamma'}).$
An example that satisfies Assumptions 1 and 2 is presented in Sect. 5 of [1] and


consists of a (general) heat equation on a bounded, convex, and polygonal domain
which is approximated with a finite element method using continuous piecewise
linear functions.

4 SPDE Multilevel Monte Carlo Approximation


In the previous section, we considered weak error analysis for expressions of the form $\mathbb{E}[\varphi(X(t))]$, where we approximated the mild solution $X$ of the SPDE (1) with a fully discrete scheme. Unluckily, this is not yet sufficient to compute numbers


since we are in general not able to compute the expectation exactly. Going back to Sect. 2, we recall that the first approach to approximate the expected value is to do a (single-level) Monte Carlo approximation. This leads to the overall error given in the following corollary, which is proven similarly to [3, Corollary 3.6] and included for completeness.
Corollary 1  Let the stochastic evolution equation (1) with mild solution $X$ and the sequence of its approximations $(X_{\ell,n}, \ell, n \in \mathbb{N}_0)$ given by (3) satisfy Assumptions 1 and 2 for some $\gamma \in (0, 1]$. Then, for every $\varphi: H \to \mathbb{R}$ satisfying Assumption 3 and all $\gamma' \in [0, \gamma)$, there exists a constant $C > 0$ such that for all $\ell, n \in \mathbb{N}_0$, the error of the Monte Carlo approximation is bounded by

  $\max_{k=1,\ldots,N(n)} \|\mathbb{E}[\varphi(X(t_k^n))] - E_N[\varphi(X_{\ell,n}(t_k^n))]\|_{L^2(\Omega;\mathbb{R})} \le C \Bigl( h_\ell^{2\gamma'} + (\Delta t_n)^{\gamma'} + \frac{1}{\sqrt{N}} \Bigr)$

for $N \in \mathbb{N}$.
Proof  By the triangle inequality we obtain that

  $\|\mathbb{E}[\varphi(X(t_k^n))] - E_N[\varphi(X_{\ell,n}(t_k^n))]\|_{L^2(\Omega;\mathbb{R})} \le \|\mathbb{E}[\varphi(X(t_k^n))] - \mathbb{E}[\varphi(X_{\ell,n}(t_k^n))]\|_{L^2(\Omega;\mathbb{R})} + \|\mathbb{E}[\varphi(X_{\ell,n}(t_k^n))] - E_N[\varphi(X_{\ell,n}(t_k^n))]\|_{L^2(\Omega;\mathbb{R})}.$

The first term is bounded by the weak error in Theorem 3 while the second one is the Monte Carlo error in Lemma 1. Putting these two estimates together yields the claim. $\square$

The errors are all converging with the same speed if we couple $\ell$ and $n$ such that $h_\ell^2 \simeq \Delta t_n$ as well as the number of Monte Carlo samples $N_\ell$ for $\ell \in \mathbb{N}_0$ by $N_\ell \simeq h_\ell^{-4\gamma'}$. This implies for the overall work that

  $W_\ell = W_\ell^H \cdot W^T \cdot W^{MC} = O(h_\ell^{-d}\, (\Delta t_n)^{-1}\, N_\ell) = O(h_\ell^{-(d+2+4\gamma')}),$

where we assumed that the computational work in space is bounded by $W_\ell^H = O(h_\ell^{-d})$ for some $d \ge 0$, which refers usually to the dimension of the underlying spatial domain.
Since we have just seen that a (single-level) Monte Carlo simulation is rather expensive, the idea is to use a multilevel Monte Carlo approach instead, which is obtained by the combination of the results of the previous two sections. In what follows we show that it is essential for the computational costs that weak convergence results are available, since the number of samples that should be chosen according to the theory depends heavily on this fact, if weak and strong convergence rates do not coincide.

Let us start under the assumption that Theorem 3 (weak convergence rates) is not available. This leads to the following numbers of samples and computational work.


Corollary 2 (Strong convergence)  Let the stochastic evolution equation (1) with mild solution $X$ and the sequence of its approximations $(X_{\ell,n}, \ell, n \in \mathbb{N}_0)$ given by (3) satisfy Assumptions 1 and 2 for some $\gamma \in (0, 1]$. Furthermore couple $\ell$ and $n$ such that $\Delta t_n \simeq h_\ell^2$ and for $L \in \mathbb{N}_0$, set $N_0 \simeq h_L^{-2\gamma'}$ as well as $N_\ell \simeq h_L^{-2\gamma'} h_\ell^{2\gamma'} \ell^{1+\epsilon}$ for all $\ell = 1, \ldots, L$ and arbitrary fixed $\epsilon > 0$. Then, for every $\varphi: H \to \mathbb{R}$ satisfying Assumption 3 and all $\gamma' \in [0, \gamma)$, there exists a constant $C > 0$ such that for all $\ell, n \in \mathbb{N}_0$, the error of the multilevel Monte Carlo approximation is bounded by

  $\max_{k=1,\ldots,N(n_L)} \|\mathbb{E}[\varphi(X(t_k^{n_L}))] - E^L[\varphi(X_{L,n_L}(t_k^{n_L}))]\|_{L^2(\Omega;\mathbb{R})} \le C\, h_L^{\gamma'},$

where $n_L$ is chosen according to the coupling with $L$. If the work of one computation in space is bounded by $W_\ell^H = O(h_\ell^{-d})$ for $\ell = 0, \ldots, L$ and fixed $d \ge 0$, which includes the summation of different levels, the overall work will be bounded by

  $W_L = O(h_L^{-(d+2)}\, L^{2+\epsilon}).$
Proof  We first observe that

  $\max_{k=1,\ldots,N(n_L)} \|X(t_k^{n_L}) - X_{L,n_L}(t_k^{n_L})\|_{L^2(\Omega;H)} \le C\, (h_L^{\gamma'} + (\Delta t_{n_L})^{\gamma'/2}) \le 2 C\, h_L^{\gamma'}$

by Theorem 2 and the coupling of the space and time discretizations. Furthermore it holds that

  $\max_{k=1,\ldots,N(n_L)} |\mathbb{E}[\varphi(X(t_k^{n_L}))] - \mathbb{E}[\varphi(X_{L,n_L}(t_k^{n_L}))]| \le \max_{k=1,\ldots,N(n_L)} \|\varphi(X(t_k^{n_L})) - \varphi(X_{L,n_L}(t_k^{n_L}))\|_{L^2(\Omega;\mathbb{R})} \le \max_{k=1,\ldots,N(n_L)} \|X(t_k^{n_L}) - X_{L,n_L}(t_k^{n_L})\|_{L^2(\Omega;H)} \le C\, h_L^{\gamma'},$

since $\varphi$ is assumed to be a Lipschitz functional (cf. [5, Proposition 3.4]). Furthermore Lemma 2 implies that

  $\mathrm{Var}[\varphi(X_{\ell,n_\ell}(t)) - \varphi(X_{\ell-1,n_{\ell-1}}(t))] \le 2 \bigl( \|\varphi(X(t)) - \varphi(X_{\ell,n_\ell}(t))\|_{L^2(\Omega;\mathbb{R})}^2 + \|\varphi(X(t)) - \varphi(X_{\ell-1,n_{\ell-1}}(t))\|_{L^2(\Omega;\mathbb{R})}^2 \bigr) \le C\, h_\ell^{2\gamma'}.$

Setting $a_\ell = h_\ell^{\gamma'}$, $\eta = 1$, and the sample numbers according to Theorem 1, we obtain the claim. $\square$

If the additional information of better weak convergence rates from Theorem 3 is available, the parameters that are plugged into Theorem 1 change, which leads for a given accuracy to fewer samples and therefore to less computational work. This is made precise in the following corollary and the computations for given accuracy afterwards.
Corollary 3 (Weak convergence)  Let the stochastic evolution equation (1) with mild solution $X$ and the sequence of its approximations $(X_{\ell,n}, \ell, n \in \mathbb{N}_0)$ given by (3) satisfy Assumptions 1 and 2 for some $\gamma \in (0, 1]$. Furthermore couple $\ell$ and $n$ such that $\Delta t_n \simeq h_\ell^2$ and for $L \in \mathbb{N}_0$, set $N_0 \simeq h_L^{-4\gamma'}$ as well as $N_\ell \simeq h_L^{-4\gamma'} h_\ell^{2\gamma'} \ell^{1+\epsilon}$ for all $\ell = 1, \ldots, L$ and arbitrary fixed $\epsilon > 0$. Then, for every $\varphi: H \to \mathbb{R}$ satisfying Assumption 3 and all $\gamma' \in [0, \gamma)$, there exists a constant $C > 0$ such that for all $\ell, n \in \mathbb{N}_0$, the error of the multilevel Monte Carlo approximation is bounded by

  $\max_{k=1,\ldots,N(n_L)} \|\mathbb{E}[\varphi(X(t_k^{n_L}))] - E^L[\varphi(X_{L,n_L}(t_k^{n_L}))]\|_{L^2(\Omega;\mathbb{R})} \le C\, h_L^{2\gamma'},$

where $n_L$ is chosen according to the coupling with $L$. If the work of one computation in space is bounded by $W_\ell^H = O(h_\ell^{-d})$ for $\ell = 0, \ldots, L$ and fixed $d \ge 0$, which includes the summation of different levels, the overall work will be bounded by

  $W_L = O(h_L^{-(d+2+2\gamma')}\, L^{2+\epsilon}).$

Proof  The proof is the same as for Corollary 2 except that we obtain

  $\max_{k=1,\ldots,N(n_L)} |\mathbb{E}[\varphi(X(t_k^{n_L}))] - \mathbb{E}[\varphi(X_{L,n_L}(t_k^{n_L}))]| \le C\, h_L^{2\gamma'}$

directly from Theorem 3 and therefore set $a_\ell = h_\ell^{2\gamma'}$, $\eta = 1/2$, and the sample numbers according to these choices in Theorem 1. $\square$
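The practical difference between the two corollaries is easiest to see by listing the prescribed levels and sample numbers for a common target accuracy (an added illustration with all constants set to one, $h_\ell = 2^{-\ell}$, $\gamma' = 1$ and $\epsilon = 1$, mirroring the choices used in Sect. 5):

```python
# Levels and samples per level for a target accuracy eps, following Corollary 2
# (strong rates only) and Corollary 3 (weak rates available). Added illustration
# with all constants set to one, h_l = 2^{-l}, gamma' = 1 and epsilon = 1.
import math

def levels_and_samples(eps, weak, gp=1.0, eps_exp=1.0):
    rate = 2 * gp if weak else gp              # order of the error bound in h_L
    L = math.ceil(-math.log2(eps) / rate)      # smallest L with h_L^rate <= eps
    prefactor = 2.0 ** (2 * rate * L)          # h_L^{-2*rate} = a_L^{-2}
    N = [math.ceil(prefactor)]                 # N_0
    N += [math.ceil(prefactor * 2.0 ** (-2 * gp * l) * l ** (1 + eps_exp))
          for l in range(1, L + 1)]
    return L, N

for weak in (False, True):
    L, N = levels_and_samples(2.0 ** -8, weak)
    print("weak rates" if weak else "strong rates", "L =", L, "N =", N)
```

Both variants use the same per-level sample profile, but the weak-rate version reaches the target accuracy with roughly half as many levels and hence a much coarser finest grid, which is where the work reduction of Corollary 3 comes from.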

If we take regular subdivisions of the grids, i.e., we set, up to a constant, $h_\ell := 2^{-\ell}$ for $\ell \in \mathbb{N}_0$, and rescale both corollaries such that the convergence rates are the same, i.e., the errors are bounded by $O(h_\ell^{2\gamma'})$, we obtain that for a given accuracy $\epsilon_L$ on level $L \in \mathbb{N}$, Corollary 2 leads to computational work

  $W_L = O\bigl( 2^{2+\epsilon}\, (2\gamma')^{-(2+\epsilon)}\, \epsilon_L^{-(d+2)/\gamma'}\, |\log_2 \epsilon_L|^{2+\epsilon} \bigr)$

while the estimators in Corollary 3 can be computed in

  $W_L = O\bigl( (2\gamma')^{-(2+\epsilon)}\, \epsilon_L^{-((d+2)/(2\gamma')+1)}\, |\log_2 \epsilon_L|^{2+\epsilon} \bigr).$

Therefore the availability of weak convergence rates implies a reduction of the computational complexity of the multilevel Monte Carlo estimator which depends on the regularity $\gamma$ and on $d$ referring to the dimension of the problem in space. For large $d$, the work using strong convergence rates is essentially the squared work that is needed with the knowledge of weak rates. Additionally, for all $d \ge 0$, the rates are better, and especially in dimension $d = 1$ we obtain $\epsilon_L^{-(3/(2\gamma')+1)}$ for the weak rates versus $\epsilon_L^{-3/\gamma'}$, where $\gamma' \in (0, 1)$.


Table 1  Computational work of different Monte Carlo type approximations for a given precision $\epsilon_L$

  General:
    Monte Carlo: $\epsilon_L^{-((d+2)/(2\gamma')+2)}$
    MLMC with strong conv.: $2^{2+\epsilon}\, (2\gamma')^{-(2+\epsilon)}\, \epsilon_L^{-(d+2)/\gamma'}\, |\log_2 \epsilon_L|^{2+\epsilon}$
    MLMC with weak conv.: $(2\gamma')^{-(2+\epsilon)}\, \epsilon_L^{-((d+2)/(2\gamma')+1)}\, |\log_2 \epsilon_L|^{2+\epsilon}$
  $\gamma' = 1$, omitting constants:
    Monte Carlo: $\epsilon_L^{-(d/2+3)}$
    MLMC with strong conv.: $\epsilon_L^{-(d+2)}\, |\log_2 \epsilon_L|^{2+\epsilon}$
    MLMC with weak conv.: $\epsilon_L^{-(d/2+2)}\, |\log_2 \epsilon_L|^{2+\epsilon}$
Nevertheless, one should also mention that Corollary 2 already reduces the work for $4\gamma' > d + 2$ compared to a (single-level) Monte Carlo approximation according to weak convergence rates. The results are put together in Table 1 for a quick overview.

5 Simulation

In this section simulation results of the theory of Sect. 4 are shown, where it has to be admitted that the chosen example fits better the framework of [6] since we estimate the expectation of the solution instead of the expectation of a functional of the solution. Simulations that fit the conditions of Sect. 4 are under investigation. Here we simulate similarly to [4] and [5] the heat equation driven by additive Wiener noise

  $dX(t) = \Delta X(t)\, dt + dW(t)$

on the space interval $(0, 1)$ and the time interval $[0, 1]$ with initial condition $X(0, x) = \sin(\pi x)$ for $x \in (0, 1)$. In contrast to previous simulations, the noise is assumed to be white in space to reduce the strong convergence rate of the scheme to (essentially) 1/2. The solution to the corresponding deterministic system with $u(t) = \mathbb{E}[X(t)]$ for $t \in [0, 1]$,

  $du(t) = \Delta u(t)\, dt,$

is in this case $u(t, x) = \exp(-\pi^2 t) \sin(\pi x)$ for $x \in (0, 1)$ and $t \in [0, 1]$.

The space discretization is done with a finite element method and the hat function basis, i.e., with the spaces $(S_h, h > 0)$ of piecewise linear, continuous polynomials (see, e.g., [6, Example 3.1]). The numbers of multilevel Monte Carlo samples are calculated according to Corollaries 2 and 3 with $\epsilon = 1$ to compare the convergence and complexity properties with and without the availability of weak convergence rates. In the left graph in Fig. 1, the multilevel Monte Carlo estimator $E^L[X_{L,2^L}(1)]$ was calculated for $L = 1, \ldots, 5$ for available weak convergence rates as in Corollary 3 while just for $L = 1, \ldots, 4$ in the other case to finish the simulations in a reasonable time on an ordinary laptop. The plot shows the approximation of

  $\|\mathbb{E}[X(1)] - E^L[X_{L,2^L}(1)]\|_H = \Bigl( \int_0^1 \bigl( \exp(-\pi^2) \sin(\pi x) - E^L[X_{L,2^L}(1, x)] \bigr)^2 dx \Bigr)^{1/2},$

i.e.,

  $e_1(X_{L,2^L}) := \Bigl( \frac{1}{m} \sum_{k=1}^m \bigl( \exp(-\pi^2) \sin(\pi x_k) - E^L[X_{L,2^L}(1, x_k)] \bigr)^2 \Bigr)^{1/2}.$

Here, for all levels $L = 1, \ldots, 5$, $m = 2^5 + 1$ and $x_k$, $k = 1, \ldots, m$, are the nodal points of the finest discretization, i.e., on level 5 respectively 4. The multilevel Monte Carlo estimator $E^L[X_{L,2^L}]$ is calculated at these points by its basis representation for $L = 1, \ldots, 4$, which is equal to the linear interpolation to all grid points $x_k$, $k = 1, \ldots, m$. One observes the convergence of one multilevel Monte Carlo estimator, i.e., the almost sure convergence of the method, which can be shown using the mean square convergence and the Borel-Cantelli lemma. In the graph on the right hand side of Fig. 1, the error is estimated by
  $e_N(X_{L,2^L}) := \Bigl( \frac{1}{N} \sum_{i=1}^N e_1(X_{L,2^L}^i)^2 \Bigr)^{1/2},$
where $(X_{L,2^L}^i,\ i = 1, \ldots, N)$ is a sequence of independent, identically distributed samples of $X_{L,2^L}$ and $N = 10$. The simulation results confirm the theory. In Fig. 2 the computational costs per level of the simulations on a laptop using Matlab are shown for both frameworks. It is obvious that the computations using weak convergence rates are substantially faster. One observes especially that the computations with weak rates on level 5 take less time than the ones with strong rates on level 4. The computing times match the bounds of the computational work that were obtained in Corollaries 3 and 2.

[Fig. 1  Mean square error of the multilevel Monte Carlo estimator with samples chosen according to Corollaries 2 and 3. Left panel: error of 1 MLMC run; right panel: error of 10 MLMC runs; L2 error versus grid points on finest level, for the strong and weak sample choices with $\epsilon = 1$ and $\epsilon = 0$.]


[Fig. 2  Computational work of the multilevel Monte Carlo estimator with samples chosen according to Corollaries 2 and 3. Computational costs in seconds versus grid points on finest level, for the strong and weak sample choices with $\epsilon = 1$ and $\epsilon = 0$.]

Finally, Figs. 1 and 2 include besides $\epsilon = 1$ also simulation results for the border case $\epsilon = 0$ in the choices of sample sizes per level. One observes in the left graph in Fig. 1 that the variance of the errors for $\epsilon = 0$ in combination with Corollary 2 is high, which is visible in the non-alignment of the single simulation results. Furthermore the combination of Figs. 1 and 2 shows that $\epsilon = 0$ combined with Corollary 3 and $\epsilon = 1$ with Corollary 2 lead to similar errors, but that the first choice of sample sizes is essentially less expensive in terms of computational complexity. Therefore the border case $\epsilon = 0$, which is not included in the theory, might be worth considering in practice.
Acknowledgments This research was supported in part by the Knut and Alice Wallenberg foundation as well as the Swedish Research Council under Reg. No. 621-2014-3995. The author thanks
Lukas Herrmann, Andreas Petersson, and two anonymous referees for helpful comments.

References
1. Andersson, A., Kruse, R., Larsson, S.: Duality in refined Sobolev-Malliavin spaces and weak approximations of SPDE. Stoch. PDE: Anal. Comp. 4(1), 113–149 (2016). doi:10.1007/s40072-015-0065-7
2. Barth, A., Lang, A.: Milstein approximation for advection-diffusion equations driven by multiplicative noncontinuous martingale noises. Appl. Math. Opt. 66(3), 387–413 (2012). doi:10.1007/s00245-012-9176-y
3. Barth, A., Lang, A.: Multilevel Monte Carlo method with applications to stochastic partial differential equations. Int. J. Comp. Math. 89(18), 2479–2498 (2012). doi:10.1080/00207160.2012.701735
4. Barth, A., Lang, A.: Simulation of stochastic partial differential equations using finite element methods. Stochastics 84(2–3), 217–231 (2012). doi:10.1080/17442508.2010.523466
5. Barth, A., Lang, A.: L^p and almost sure convergence of a Milstein scheme for stochastic partial differential equations. Stoch. Process. Appl. 123(5), 1563–1587 (2013). doi:10.1016/j.spa.2013.01.003
6. Barth, A., Lang, A., Schwab, Ch.: Multilevel Monte Carlo method for parabolic stochastic partial differential equations. BIT Num. Math. 53(1), 3–27 (2013). doi:10.1007/s10543-012-0401-5
7. Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge (1992). doi:10.1017/CBO9780511666223
8. Giles, M.B.: Improved multilevel Monte Carlo convergence using the Milstein scheme. In: Keller, A., et al. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2006. Selected papers based on the presentations at the 7th International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Ulm, Germany, August 14–18, 2006, pp. 343–358. Springer, Berlin (2008). doi:10.1007/978-3-540-74496-2_20
9. Giles, M.B.: Multilevel Monte Carlo path simulation. Oper. Res. 56(3), 607–617 (2008). doi:10.1287/opre.1070.0496
10. Heinrich, S.: Multilevel Monte Carlo methods. In: Margenov, S., Wasniewski, J., Yalamov, P.Y. (eds.) Large-Scale Scientific Computing, Third International Conference, LSSC 2001, Sozopol, Bulgaria, June 6–10, 2001, Revised Papers. Lecture Notes in Computer Science, pp. 58–67. Springer, Heidelberg (2001). doi:10.1007/3-540-45346-6_5
11. Jentzen, A., Kurniawan, R.: Weak convergence rates for Euler-type approximations of semilinear stochastic evolution equations with nonlinear diffusion coefficients (2015)

A Strategy for Parallel Implementations of Stochastic Lagrangian Simulation

Lionel Lenôtre

Abstract  In this paper, we present some investigations on the parallelization of stochastic Lagrangian simulations. The challenge is the proper management of the random numbers. We review two different object-oriented strategies: to draw the random numbers on the fly within each MPI process or to use a different random number generator for each simulated path. We show the benefits of the second technique, which is implemented in the PALMTREE library. The efficiency of PALMTREE is demonstrated on two classical examples.

Keywords  Parabolic partial differential equations · Stochastic differential equations · Monte Carlo methods · Lagrangian methods · High performance computing

1 Introduction

Monte Carlo simulation is a very convenient method to solve problems arising in physics like the advection-diffusion equation with a Dirichlet boundary condition

  $\begin{cases} \partial_t c(x,t) = \operatorname{div}\bigl(\sigma(x)\, \nabla c(x,t) - v(x)\, c(x,t)\bigr), & (x,t) \in D \times [0,T], \\ c(x,0) = c_0(x), & x \in D, \\ c(x,t) = 0, & t \in [0,T] \text{ and } x \in \partial D, \end{cases}$   (1)

where, for each $x \in D$, $\sigma(x)$ is a $d$-dimensional square matrix which is definite, positive, symmetric, $v(x)$ is a $d$-dimensional vector such that $\operatorname{div}(v(x)) = 0$, $D \subset \mathbb{R}^d$ is a regular open bounded subset and $T$ is a positive real number. In order to have
L. Lenôtre (B)
Inria, Research Centre Rennes - Bretagne Atlantique, Campus de Beaulieu,
35042 Rennes Cedex, France
e-mail: lionel.lenotre@inria.fr


a well-posed problem [4, 5] and to be able to use later the theory of stochastic differential equations, we require that $\sigma$ satisfies an ellipticity condition and has its coefficients at least in $C^2(D)$, and that $v$ is bounded and in $C^1(D)$.

Interesting computations involving the solution $c(t, x)$ are the moments

  $M_k(T) = \int_D x^k\, c(T, x)\, dx, \quad k \ge 1 \text{ such that } M_k(T) < +\infty.$

One possibility for their computation is to perform a numerical integration of an approximated solution of (1). Eulerian methods (like the Finite Difference Method, Finite Volume Method or Finite Element Method) are classical to obtain such an approximated solution. However, for advection-diffusion problems, they can induce numerical artifacts such as oscillations or artificial diffusion. This mainly occurs when advection dominates [7].

An alternative is to use Monte Carlo simulation [6, 19], which is really simple. Indeed, the theory of stochastic processes implies that there exists $X = (X_t)_{t \ge 0}$ whose law is linked to (1) and is such that

  $M_k(T) = \mathbb{E}\bigl[X_T^k\bigr].$   (2)

The above expectation is nothing more than an average of the positions at time $T$ of particles that move according to a scheme associated with the process $X$. This requires a large number of these particles to be computed. For linear equations, the particles do not interact with each other and move according to a Markovian process.

The great advantage of the Monte Carlo method is that its rate of convergence is not affected by the curse of dimensionality. Nevertheless, the slowness of the rate caused by the Central Limit theorem can be considered as a drawback. Precisely, the computation of the moments requires a large number of particles to achieve a reliable approximation. Thus, the use of supercomputers and parallel architectures becomes a key ingredient to obtain reasonable computational times. However, the main difficulty when one deals with parallel architectures is to manage the random numbers such that the particles are not correlated; otherwise a bias in the approximation of the moments is obtained.

In this paper, we investigate the parallelization of the Monte Carlo method for the computation of (2). We will consider two implementation strategies where the total number of particles is divided into batches distributed over the Floating Point Units (FPUs):

1. SAF: the Strategy of Attachment to the FPUs, where each FPU receives a Virtual Random Number Generator (VRNG), which is either one of several independent Random Number Generators (RNGs) or a copy of the same RNG in a different state [10]. In this strategy, the random numbers are generated on demand and do not bear any attachment to the particles.
2. SAO: the Strategy of Attachment to the Object, where each particle carries its own Virtual Random Number Generator.


Both schemes clearly guarantee the non-correlation of the particles, assuming that all the drawn random numbers have enough independence, which is a matter of RNGs.

Sometimes particles with a singular behavior are encountered and the examination of the full paths of such particles is necessary. With the SAF, a particle replay requires either to re-run the simulation with a condition to record only the positions of this particle or to keep track of the random numbers used for this particle. In both cases, it would drastically increase the computational time and add unnecessary complications to the code. On the contrary, a particle replay is straightforward with the SAO, as the sketch below illustrates.
The present paper is organized in two sections. The first one describes SAF and
SAO. It also treat of the work done in PALMTREE, a library we developed with the
generator RNGStreams [11] and which contains an implementation of the SAO. The
second section presents two numerical experiments which illustrate the performance
of PALMTREE [17] and the SAO. Characteristic curves like speedup and efficiency
are provided for both experiment.

2 Parallel and Object-Oriented Implementations


in Monte Carlo
All along this section, we assume that we are able to simulate the transition law
of particles undergoing a Markovian dynamics such that there is no interactions
between them. As a result, the presentation below can be applied to various Monte
Carlo schemes involving particle tracking where the goal is to compute moments.
Moreover, this shows the flexibility of the implementation we choose.

2.1 An Object-Oriented Design for Monte Carlo


C++ offers very interesting features which are of great help for fast execution or for treating multidimensional processes. In addition, a consistent implementation of MPI is available in this language. As a result, it is a natural choice for PALMTREE. In what follows, we describe and motivate the choices we made in the implementation of PALMTREE. We refer to an FPU as an MPI process.
We choose to design an object called the Launcher which conducts the Monte Carlo simulation. Roughly speaking, it collects all the generic parameters for the simulation (the number of particles or the repository for the writing of outputs). It also determines the architecture of the computer (cartography of the nodes, number of MPI processes, etc.) and is responsible for the parallelization of the simulation (managing the VRNGs and collecting the results on each MPI process to allow the final computations).


Some classical designs introduce an object consisting of a Particles Factory which contains all the requirements for the particle simulations (the motion scheme or the diffusion and advection coefficients). The Launcher's role is then to distribute to each MPI process a factory with the number of particles that must be simulated and the necessary VRNGs. The main job of the factory is to create objects which are considered as the particles and to store them. Each one of these objects contains all the necessary information for path simulation, including the current time-dependent position and also the motion simulation algorithm.
This design is very interesting for interacting particles as it requires the storage of the path of each particle. For the case we deal with, this implementation suffers from two major flaws: a slowdown, since many objects are created, and a massive memory consumption, as a large number of objects stay instantiated.
As a result, we decide to avoid the above approach and to use a design based on recycling. In fact, we choose to code a unique object that is similar to the factory, but does not create redundant particle objects. Let us call this object the Particle.
In a few words, the recycling concept is the following. When the final position at time T is reached for a path, the Particle resets to the initial position and performs another simulation. This solution avoids high memory consumption and allows complete management of the memory. In addition, we do not use a garbage collector, which can cause memory leaks.
We also adopt in our design the latest standards of C++11 [1], which offer the possibility to program an object with a template whose parameter is the spatial dimension of the process we want to simulate. Thus, one can include this template parameter in the implementation of the function governing the motion of the particle. If so, the object is declared with the correct dimension and automatically adapts the function template. Otherwise, it checks the compatibility of the declared dimension with the function.
Such a feature makes it possible to preallocate the exact size required by the chosen dimension for the position in a static array. Subsequently, we avoid writing multiple objects or using pointers and dynamic memory allocation, which cause slowdowns. Moreover, templates allow for a better optimization during compilation.
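As an illustration only, here is a minimal sketch of such a recycled, dimension-templated particle; the class and member names are hypothetical and do not reproduce PALMTREE's actual interface:

```cpp
#include <array>
#include <cstddef>

// Illustrative sketch: a recycled, dimension-templated particle object.
template <std::size_t Dim>
class Particle {
public:
    explicit Particle(const std::array<double, Dim>& start) : start_(start), x_(start) {}

    // Recycling: reset to the initial position before simulating a new path.
    void reset() { x_ = start_; }

    // One explicit step of a generic motion scheme; Drift and Diffusion are
    // user-supplied callables and increment holds the Brownian increments.
    template <typename Drift, typename Diffusion>
    void step(double dt, Drift v, Diffusion sigma,
              const std::array<double, Dim>& increment) {
        for (std::size_t k = 0; k < Dim; ++k)
            x_[k] += v(x_, k) * dt + sigma(x_, k) * increment[k];
    }

    const std::array<double, Dim>& position() const { return x_; }

private:
    std::array<double, Dim> start_;  // initial position (static array, no dynamic allocation)
    std::array<double, Dim> x_;      // current position, reused from one path to the next
};
```

The std::array members are statically sized by the template parameter, which is the preallocation discussed above.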
Now a natural parallel scheme for a Monte Carlo simulation consists in the distribution of the particles over the different MPI processes. Then, a small number of paths are sequentially simulated on each MPI process. When each MPI process has finished, the data are regrouped on the master MPI process using MPI communications between the MPI processes. Thus, the quantities of interest can be computed by the master MPI process.
This scheme is typically embarrassingly parallel and can be used with either the shared or the distributed memory paradigm. Here we choose the distributed memory paradigm as it offers the possibility to use supercomputers based on SGI Altix or IBM Blue Gene technologies. Furthermore, if the paths of the particles need to be recorded, the shared memory paradigm cannot be used due to a very high memory consumption.
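A minimal sketch of this embarrassingly parallel layout is given below; simulate_one_path() is a hypothetical stand-in for the path simulation of Sect. 3, and the master process gathers the partial sums with MPI_Reduce:

```cpp
#include <mpi.h>

// Hypothetical placeholder: would return X_T^k for one freshly simulated path.
static double simulate_one_path() { return 0.0; }

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long total_particles = 1000000;
    const long local_particles = total_particles / size;   // assume size divides evenly

    double local_sum = 0.0;
    for (long i = 0; i < local_particles; ++i)
        local_sum += simulate_one_path();

    // Regroup the partial sums on the master process (rank 0).
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        const double moment_estimate = global_sum / total_particles;  // estimate of M_k(T)
        (void)moment_estimate;
    }
    MPI_Finalize();
    return 0;
}
```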


Fig. 1 The structure of RNGStreams

2.2 Random Number Generators


The main difficulty with the parallelization of the Monte Carlo method is to ensure the independence of all the random numbers distributed over the different MPI processes. To be precise, if the same random numbers are used on two different processes, the simulation will end up with non-independent paths and the targeted quantities will be erroneous.
Various recognized RNGs such as RNGStreams [11], SPRNG [12] or MT19937 [13] offer the possibility to use VRNGs and can be used on parallel architectures. Recently, algorithms have been proposed to produce advanced and customized VRNGs with MRG32k3a and MT19937 [3].
In PALMTREE, we choose RNGStreams, which possesses the following two nested subdivisions of the backbone generator MRG32k3a:
1. Stream: 2^127 consecutive random numbers,
2. Substream: 2^76 consecutive random numbers,
and the VRNGs are just the same MRG32k3a in different states (see Fig. 1). Moreover, this RNG already provides implemented VRNGs [11] and passes several statistical tests from TestU01 that ensure the independence of the random numbers [9].
Now a possible strategy with RNGStreams is to use a stream for each new simulation of a moment, as we must have a new set of independent paths, and to use the 2^51 substreams contained in each stream to allocate VRNGs to the FPUs or to the objects for each moment simulation. This decision clearly avoids the need to store the state of the generator after the computations.
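A minimal sketch of this allocation is shown below, assuming the RngStream C++ class shipped with the RNGStreams package [11]; the ResetNextSubstream and RandU01 method names come from that package, while the allocation logic itself is only illustrative:

```cpp
#include "RngStream.h"   // assumed: L'Ecuyer's RNGStreams C++ package

// Illustrative: one stream per moment, one substream per particle.
void simulate_batch(long first_particle, long n_particles) {
    RngStream vrng("moment");            // each construction yields a new stream
    // Naive positioning on the substream of the first particle handled here;
    // Sect. 2.4 replaces this loop by a fast jump-ahead.
    for (long i = 0; i < first_particle; ++i)
        vrng.ResetNextSubstream();

    for (long p = 0; p < n_particles; ++p) {
        double u = vrng.RandU01();       // draw the uniforms of this particle's path
        (void)u;                         // ... simulate the path here ...
        vrng.ResetNextSubstream();       // the next particle uses the next substream
    }
}
```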

2.3 Strategy of Attachment to the FPUs (SAF)


An implementation of SAF with RNGStreams and the C++ design proposed in Sect. 2.1 is very easy to perform, as the only task is to attach a VRNG to each MPI


process in the Launcher. Then the particles distributed on each MPI process are simulated, drawing the random numbers from the attached VRNG.
Sometimes a selective replay may be necessary to capture some singular paths, in order to enable a physical understanding or for debugging purposes. However, recording the path of every particle is a memory-intensive task, as is keeping track of the random numbers used by each particle. This constitutes a major drawback of this strategy. SAO is preferred in that case.

2.4 Strategy of Object-Attachment (SAO) and PALMTREE


Here a substream is attached to each particle, which can be considered as an object, and all that is needed to implement this scheme is a subroutine to quickly jump from the first substream to the nth one. We show why in the following example: suppose that we need 1,000,000 paths to compute the moment and have 5 MPI processes; then we distribute 200,000 paths to each MPI process, which therefore requires 200,000 VRNGs to perform the simulations (see Fig. 2).
The easiest way to solve this problem is to have the mth FPU start at the ((m − 1) × 200,000 + 1)st substream and then jump to the next substream until it reaches the (m × 200,000)th substream.
RNGStreams possesses a function that allows one to go from one substream to the next one (see Fig. 3). Thus the only problem is to go quickly from the first substream

Fig. 2 Distribution of 200,000 particles to each FPU


Fig. 3 Distribution of VRNGs or substreams to each FPU


to the ((m − 1) × 200,000 + 1)st substream, so that we can compete with the speed of the SAF.
A naive algorithm using a loop containing the default function that passes through each substream one at a time is clearly too slow. As a result, we decide to adapt the algorithm for MRG32k3a proposed in [3].
The current state of the generator RNGStreams is a sequence of six numbers. Suppose that {s₁, s₂, s₃, s₄, s₅, s₆} is the start of a substream. With the vectors Y₁ = {s₁, s₂, s₃} and Y₂ = {s₄, s₅, s₆}, the matrices

$$A_1 = \begin{pmatrix} 82758667 & 1871391091 & 4127413238 \\ 3672831523 & 69195019 & 1871391091 \\ 3672091415 & 3528743235 & 69195019 \end{pmatrix}$$

and

$$A_2 = \begin{pmatrix} 1511326704 & 3759209742 & 1610795712 \\ 4292754251 & 1511326704 & 3889917532 \\ 3859662829 & 4292754251 & 3708466080 \end{pmatrix},$$

and the numbers m₁ = 4294967087 and m₂ = 4294944443, the jump from one substream to the next is performed with the computations

$$X_1 = A_1 Y_1 \bmod m_1 \quad\text{and}\quad X_2 = A_2 Y_2 \bmod m_2,$$

with X₁ and X₂ the states providing the first number of the next substream.
As we said above, it is too slow to run these computations n times to make a jump from the 1st substream to the nth substream. Subsequently, we propose to use the algorithm developed in [3], based on the storage in memory of already computed matrices and the decomposition

$$s = \sum_{j=0}^{k} g_j\, 8^{j}, \qquad g_j \in \{0, \ldots, 7\},$$

for any s ∈ ℕ.
Since a stream contains 2^51 = 8^17 substreams, we decide to only store the already computed matrices

$$\begin{matrix} A_i & A_i^{2} & \cdots & A_i^{7} \\ A_i^{8} & A_i^{2\cdot 8} & \cdots & A_i^{7\cdot 8} \\ \vdots & \vdots & \ddots & \vdots \\ A_i^{8^{16}} & A_i^{2\cdot 8^{16}} & \cdots & A_i^{7\cdot 8^{16}} \end{matrix}$$

for i = 1, 2 with A₁ and A₂ as above. Thus we can reach any substream s with the formula

$$A_i^{s} Y_i = \Bigg(\prod_{j=0}^{k} A_i^{g_j 8^{j}}\Bigg) Y_i \bmod m_i.$$

514

L. Lentre

Fig. 4 Illustration of the stream repartition on FPUs

This solution provides a jump that can be completed with a complexity of at most O(log₂ p), which is much faster [3] than the naive solution. Figure 4 illustrates this idea. In effect, we clearly see that the second FPU receives a stream and then performs a jump from the initial position of this stream to the first random number of the (n + 1)th substream of this exact same stream.
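A sketch of this jump-ahead is given below; the pow8 table of precomputed powers A_i^{g·8^j} mod m_i and the function names are illustrative, and 64-bit unsigned arithmetic is enough here because every factor is below 2^32:

```cpp
#include <cstdint>
#include <array>

using Vec3 = std::array<std::uint64_t, 3>;
using Mat3 = std::array<std::array<std::uint64_t, 3>, 3>;

// y <- A * y (mod m); all entries are < 2^32, so each product fits in 64 bits.
Vec3 mat_vec_mod(const Mat3& A, const Vec3& y, std::uint64_t m) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i) {
        std::uint64_t acc = 0;
        for (int j = 0; j < 3; ++j)
            acc = (acc + (A[i][j] % m) * (y[j] % m) % m) % m;
        r[i] = acc;
    }
    return r;
}

// Jump to substream s using the base-8 digits of s and precomputed powers
// A^(g * 8^j) mod m, stored in a (hypothetical) table pow8[j][g] for g = 1..7.
Vec3 jump_to_substream(std::uint64_t s, const Vec3& y0,
                       const Mat3 pow8[17][8], std::uint64_t m) {
    Vec3 y = y0;
    for (int j = 0; s > 0; ++j, s >>= 3) {
        int g = static_cast<int>(s & 7);          // base-8 digit g_j
        if (g != 0)
            y = mat_vec_mod(pow8[j][g], y, m);    // apply A^(g_j * 8^j)
    }
    return y;
}
```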

3 Experiments with the Advection–Diffusion Equation

3.1 The Advection–Diffusion Equation

In physics, the solution c(x, t) of (1) is interpreted as the evolution at the position x of the initial concentration c₀(x) during the time interval [0, T]. The first moment of c is often called the center of mass.
Let us first recall that there exists a unique regular solution of (1). Proofs can be found in [5, 14]. This clearly means, as we said in the introduction, that we deal with a well-posed problem.
The notion of fundamental solution [2, 4, 5, 14], which is motivated by the fact that c(x, t) depends on the initial condition, plays an important role in the treatment of the advection–diffusion equation. It is the unique solution Γ(x, t, y) of

$$\begin{cases} \dfrac{\partial \Gamma}{\partial t}(x,t,y) = \operatorname{div}_x\big(\sigma(x)\nabla_x \Gamma(x,t,y)\big) - v(x)\cdot\nabla_x \Gamma(x,t,y), & (x,t,y) \in D \times [0,T] \times D,\\[4pt] \Gamma(x,0,y) = \delta_y(x), & (x,y) \in D \times D,\\[4pt] \Gamma(x,t,y) = 0, & t \in [0,T],\ y \in D,\ x \in \partial D. \end{cases} \qquad (3)$$

This parabolic partial differential equation derived from (1) is often called the Kolmogorov forward equation or the Fokker–Planck equation. Probability theory provides us with the existence of a unique Feller process X = (X_t)_{t≥0} whose transition density is the solution of the adjoint of (3), that is

$$\begin{cases} \dfrac{\partial \Gamma}{\partial t}(x,t,y) = \operatorname{div}_y\big(\sigma(y)\nabla_y \Gamma(x,t,y)\big) + v(y)\cdot\nabla_y \Gamma(x,t,y), & (x,t,y) \in D \times [0,T] \times D,\\[4pt] \Gamma(x,0,y) = \delta_x(y), & (x,y) \in D \times D,\\[4pt] \Gamma(x,t,y) = 0, & t \in [0,T],\ x \in D,\ y \in \partial D, \end{cases} \qquad (4)$$

which is easy to compute since div(v(x)) = 0 for every x.
Assuming that σ and v satisfy the hypotheses stated in (1), then using the Feynman–Kac formula [15] and (4), we can define the process X as the unique strong solution of the Stochastic Differential Equation

$$dX_t = v(X_t)\,dt + \sigma(X_t)\,dB_t, \qquad (5)$$

starting at the position y and killed on the boundary ∂D. Here, (B_t)_{t≥0} is a d-dimensional Brownian motion with respect to the filtration (F_t)_{t≥0} satisfying the usual conditions [18].
The path of such a process can be simulated step by step with a classical Euler scheme. Therefore a Monte Carlo algorithm for the simulation of the center of mass simply consists in the computation until time T of a large number of paths and the average of all the final positions of every simulated particle still inside the domain.
As we are mainly interested in computational time and efficiency, the numerical experiments that follow are performed in free space. Working on a bounded domain would only require setting the appropriate stopping condition, which, as a direct consequence of the Feynman–Kac formula, is to terminate the simulation of a particle when it leaves the domain.

3.2 Brownian Motion Simulation


Let us take an example in dimension one. We suppose that the drift term v is zero and that σ(x) is constant. We then obtain the renormalized heat equation, whose associated process is the standard Brownian motion.


Let us divide the time interval [0, T] into N subintervals by setting Δt = T/N and t_n = n Δt, n = 0, ..., N, and use the Euler scheme

$$X_{t_{n+1}} = X_{t_n} + \Delta B_n, \qquad (6)$$

with ΔB_n = B_{t_{n+1}} − B_{t_n}. In this case, the Euler scheme presents the advantage of being exact.
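As an illustration, here is a minimal self-contained sketch of scheme (6) for one path, written with the standard C++ &lt;random&gt; facilities rather than RNGStreams (names are illustrative):

```cpp
#include <random>
#include <cmath>

// Simulate X_T for a standard Brownian motion started at 0, using scheme (6).
double brownian_endpoint(double T, int N, std::mt19937_64& gen) {
    const double dt = T / N;
    std::normal_distribution<double> gauss(0.0, std::sqrt(dt));  // law of the increments
    double x = 0.0;
    for (int n = 0; n < N; ++n)
        x += gauss(gen);            // X_{t_{n+1}} = X_{t_n} + Delta B_n
    return x;
}
```

Averaging the kth power of the returned endpoints over many paths gives the Monte Carlo estimate of M_k(T) in (2).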
Since the Brownian motion is easy to simulate, we choose to sample 10,000,000 paths starting from the position 0 until time T = 1 with 0.001 as time step. We compute the speedup S and the efficiency E, which are defined as

$$S = \frac{T_1}{T_p} \quad\text{and}\quad E = \frac{T_1}{p\,T_p} \times 100,$$

where T₁ is the sequential computational time with one MPI process and T_p is the time in parallel using p MPI processes.
The speedup and efficiency curves, together with the values used to plot them, are respectively given in Fig. 5 and Table 1. The computations were realized with the supercomputer Lambda from the Igrida Grid of the INRIA Research Center Rennes Bretagne Atlantique. This supercomputer is composed of 11 nodes with 2 × 6 Intel Xeon(R) E5647 CPUs at 2.40 GHz on the Westmere-EP architecture. Each node possesses 48 GB of Random Access Memory and is connected to the others through InfiniBand. We choose GCC 4.7.2 as C++ compiler and use the MPI library OpenMPI 1.6, as we prefer to use open-source and portable software. These tests include the time used to write the output file for the speedup computation, so that we also show the power of the HDF5 library.
Table 1 clearly illustrates PALMTREE's performance. It appears that the SAO does not suffer a significant loss of efficiency even though it requires a complex


Fig. 5 Brownian motion: a The dashed line represents the linear acceleration and the black curve shows the speedup. b The dashed line represents the 100 % efficiency and the black curve shows PALMTREE's efficiency


Table 1 The values used to plot the curves in Fig. 5

Processes    1     12     24     36     48     60     72     84     96     108    120
Time (s)     4842  454    226    154    116    93     78     67     59     53     48
Speedup      1     10.66  21.42  31.44  41.74  52.06  62.07  72.26  82.06  91.35  100.87
Efficiency   100   88.87  89.26  87.33  86.96  86.77  86.21  86.03  85.48  84.59  84.06

preprocessing. Moreover, the data show that the optimum efficiency (89.26 %) is obtained with 24 MPI processes.
As we mentioned in Sect. 2.2, the independence between the particles is guaranteed by the non-correlation of the random numbers generated by the RNG. Moreover, Fig. 6 shows that the sum of the squares of the positions of the particles at T = 1 follows a χ² distribution in two different cases: (a) between substreams i and i + 1 for i = 0, ..., 40,000 of the first stream; (b) between substreams i of the first and second streams for i = 0, ..., 10,000.

3.3 Advection–Diffusion Equation with an Affine Drift Term

We now consider the advection–diffusion equation whose drift term v is an affine function, that is, for each x ∈ ℝ, v(x) = ax + b, and σ is a constant. We simulate the associated stochastic process X through the exact scheme

$$X_{t_{n+1}} = e^{a\Delta t} X_{t_n} + \frac{b}{a}\left(e^{a\Delta t} - 1\right) + \sigma\sqrt{\frac{e^{2a\Delta t} - 1}{2a}}\; \mathcal{N}(0,1),$$

where 𝒩(0, 1) is a standard Gaussian law [8].
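A sketch of this exact one-step update, under the assumption that σ is constant and a ≠ 0 (function and parameter names are illustrative):

```cpp
#include <random>
#include <cmath>

// One step of the exact scheme for dX = (a X + b) dt + sigma dB, with a != 0.
double exact_step(double x, double dt, double a, double b, double sigma,
                  std::mt19937_64& gen) {
    std::normal_distribution<double> gauss(0.0, 1.0);
    const double e = std::exp(a * dt);
    const double mean = e * x + (b / a) * (e - 1.0);
    const double std_dev = sigma * std::sqrt((std::exp(2.0 * a * dt) - 1.0) / (2.0 * a));
    return mean + std_dev * gauss(gen);
}
```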

Fig. 6 χ² test: a between substreams i and i + 1 for i = 0, ..., 40,000 of the first stream. b between substreams i of the first and second streams for i = 0, ..., 10,000


For this scheme with an initial position at 0 and the parameters σ = 1, a = 1, b = 2 and T = 1, we give the speedup and efficiency curves represented in Fig. 7, based on the simulation of one hundred million particles. Table 2 provides the data resulting from the simulation and used for the plots.
Whatever the number of MPI processes involved, we obtain the same empirical expectation E = 3.19 and empirical variance V = 13.39, with a standard error S.E. = 0.0011 and a confidence interval C.I. = 0.0034. Moreover, a good efficiency (89.29 %) is obtained with 60 MPI processes.
In this case, the drift term naturally pushes the particles away from 0 relatively quickly. If this behavior is not clearly observed in a simulation, then the code has a bug, and a replay of a selection of a few paths can be useful to track it, instead of reviewing all the code. This can clearly save time.
With the SAO, this replay can be easily performed since we know which substream is used by each particle, as shown in Fig. 4. Precisely, in the case presented in Figs. 2 and 3, the nth particle is simulated by a certain FPU using the nth substream. As a result, it is easy to replay the nth particle since we just have to use the random numbers of the nth substream. The point is that the parameters must stay exactly the same, particularly the time step. Otherwise, the replay of the simulation will use the same random numbers but not for the exact same calls of the generator during the simulation.


Fig. 7 Constant diffusion with an affine drift: a The dashed line represents the linear acceleration and the black curve shows the speedup. b The dashed line represents the 100 % efficiency and the black curve shows PALMTREE's efficiency
Table 2 The values used to plot the curves in Fig. 7

Processes    1      12     24     36     48     60     72     84     96     108    120
Time (s)     19020  1749   923    627    460    355    302    273    248    211    205
Speedup      1      10.87  20.60  30.33  41.34  53.57  62.98  69.67  76.69  90.14  92.78
Efficiency   100    90.62  85.86  84.26  86.14  89.29  87.47  82.94  79.88  83.46  73.31


4 Conclusion
The parallelization of stochastic Lagrangian solvers relies on a careful and efficient management of the random numbers. In this paper, we proposed a strategy based on the attachment of the Virtual Random Number Generators to the Object.
The main advantage of our strategy is the possibility to easily replay some particle paths. This strategy is implemented in the PALMTREE software. PALMTREE uses RNGStreams to benefit from the native split of the random numbers into streams and substreams.
We have shown the efficiency of PALMTREE on two examples in dimension one: the simulation of the Brownian motion in the whole space and the simulation of an advection–diffusion problem with an affine drift term. Independence of the paths was also checked.
Our current work is to perform more tests with various parameters and to link PALMTREE to the platform H2OLAB [16], dedicated to simulations in hydrogeology. In H2OLAB, the drift term is computed in parallel so that the drift data are split over MPI processes. The challenge is that the computation of the paths will move from one MPI process to another, which raises issues about communications, good workload balance and an advanced management of the VRNGs in PALMTREE.
Acknowledgments I start by thanking S. Maire and M. Simon who offered me the possibility to present this work at MCQMC. I thank J. Erhel and G. Pichot for the numerous discussions on Eulerian methods. I am also grateful to T. Dufaud and L.-B. Nguenang for the instructive talks on the MPI library. C. Deltel and G. Andrade-Barroso of IRISA were of great help for the deployment on supercomputers and understanding the latest C++ standards. Many thanks to G. Landurein for his help in the implementation of PALMTREE. I am in debt to P. L'Ecuyer and B. Tuffin for the very interesting discussions about RNGStreams. I am grateful to D. Imberti for his help with the English language during the writing of this article. I finish with a big thanks to A. Lejay. This work was partly funded by a grant from ANR (H2MNO4 project).

References
1. The C++ Programming Language. https://isocpp.org/std/status (2014)
2. Aronson, D.G.: Non-negative solutions of linear parabolic equations. Annali della Scuola Normale Superiore di Pisa - Classe di Scienze 22(4), 607–694 (1968)
3. Bradley, T., du Toit, J., Giles, M., Tong, R., Woodhams, P.: Parallelization techniques for random number generations. GPU Comput. Gems Emerald Ed. 16, 231–246 (2011)
4. Evans, L.C.: Partial differential equations. In: Graduate Studies in Mathematics, 2nd edn. American Mathematical Society, Providence (2010)
5. Friedman, A.: Partial differential equations of parabolic type. In: Dover Books on Mathematics Series. Dover Publications, New York (2008)
6. Gardiner, C.: A handbook for the natural and social sciences. In: Springer Series in Synergetics, 4th edn. Springer, Heidelberg (2009)
7. Hundsdorfer, W., Verwer, J.G.: Numerical solution of time-dependent advection-diffusion-reaction equations. In: Springer Series in Computational Mathematics. Springer, Heidelberg (2003)
8. Kloeden, P.E., Platen, E.: Numerical solution of stochastic differential equations. In: Stochastic Modelling and Applied Probability. Springer, Heidelberg (1992)
9. L'Ecuyer, P.: TestU01. http://simul.iro.umontreal.ca/testu01/tu01.html
10. L'Ecuyer, P., Munger, D., Oreshkin, B., Simard, R.: Random numbers for parallel computers: requirements and methods, with emphasis on GPUs. In: Mathematics and Computers in Simulation, Revision Submitted (2015)
11. L'Ecuyer, P., Simard, R., Chen, E.J., Kelton, W.D.: An object-oriented random-number package with many long streams and substreams. Oper. Res. 50(6), 1073–1075 (2002)
12. Mascagni, M., Srinivasan, A.: Algorithm 806: SPRNG: a scalable library for pseudorandom number generation. ACM Trans. Math. Softw. 26(3), 436–461 (2000)
13. Matsumoto, M., Nishimura, T.: Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul. 8(1), 3–30 (1998)
14. Nash, J.: Continuity of solutions of parabolic and elliptic equations. Am. J. Math. 80(4), 931–954 (1958)
15. Øksendal, B.: Stochastic Differential Equations. Universitext. Springer, Heidelberg (2003)
16. Project-team Sage. H2OLAB. https://www.irisa.fr/sage/research.html
17. Lenôtre, L., Pichot, G.: Palmtree Library. http://people.irisa.fr/Lionel.Lenotre/software.html
18. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Grundlehren der mathematischen Wissenschaften, 3rd edn. Springer, Berlin (1999)
19. Zheng, C., Bennett, G.D.: Applied Contaminant Transport Modelling. Wiley, New York (2002)

A New Rejection Sampling Method for Truncated Multivariate Gaussian Random Variables Restricted to Convex Sets

Hassan Maatouk and Xavier Bay

Abstract Statistical researchers have shown increasing interest in generating truncated multivariate normal distributions. In this paper, we only assume that the acceptance region is convex and we focus on rejection sampling. We propose a new algorithm that outperforms the crude rejection method for the simulation of truncated multivariate Gaussian random variables. The proposed algorithm is based on a generalization of Von Neumann's rejection technique which requires the determination of the mode of the truncated multivariate density function. We provide a theoretical upper bound for the ratio of the target probability density function over the proposal probability density function. The simulation results show that the method is especially efficient when the probability of the multivariate normal distribution being inside the acceptance region is low.

Keywords Truncated Gaussian vector · Rejection sampling · Monte Carlo method

1 Introduction
The need for simulation of truncated multivariate normal distributions appears in
many fields, like Bayesian inference for truncated parameter space [10] and [11],
H. Maatouk (B) · X. Bay
École Nationale Supérieure des Mines de St-Étienne, 158 Cours Fauriel,
Saint-Étienne, France
e-mail: hassan.maatouk@mines-stetienne.fr
X. Bay
e-mail: bay@emse.fr
H. Maatouk
Institut Camille Jordan, Université de Lyon, UMR 5208, F-69622
Villeurbanne Cedex, France
H. Maatouk
Institut de Radioprotection et de Sûreté Nucléaire (IRSN),
92260 Fontenay-aux-Roses, France

Gaussian processes for computer experiments subject to inequality constraints [5, 8, 9, 20] and regression models with linear constraints (see e.g. [12] and [28]).
In general, we have two types of methods. The first ones are based on Markov chain Monte Carlo (McMC) simulation [3, 18, 25], such as Gibbs sampling [2, 12, 15, 17, 19, 24, 26]. They provide samples from an approximate distribution which converges asymptotically to the true one. The second ones are exact simulation methods based on rejection sampling (Von Neumann [27]) and its extensions [6, 16, 18]. In this paper, we focus on the second type of methods.
Recently, researchers in statistics have used an adaptive rejection technique with Gibbs sampling [12, 13, 21, 22, 24]. Let us mention that in one dimension rejection sampling with a high acceptance rate has been developed by Robert [24] and Geweke [12]. In [24] Robert developed simulation algorithms for one-sided and two-sided truncated normal distributions. His rejection algorithm is based on exponential functions and uniform distributions. The multidimensional case where the acceptance region is a convex subset of ℝ^d is based on the same algorithm, using Gibbs sampling to reduce the simulation problem to a sequence of one-dimensional simulations. In this case, the method requires the determination of slices of the convex acceptance region. Also, Geweke [12] proposed an exponential rejection sampling to simulate a truncated normal variable. The multidimensional case is deduced by using the Gibbs algorithm. In one dimension, Chopin [4] designed an algorithm that is computationally faster than alternative algorithms. A multidimensional rejection sampling to simulate a truncated Gaussian vector outside arbitrary ellipsoids has been developed by Ellis and Maitra [7]. For higher dimensions, Philippe and Robert [23] developed a simulation method for a Gaussian distribution restricted to positive quadrants. Also, Botts [1] improves an accept-reject algorithm to simulate positive multivariate normal distributions.
In this article, we develop a new rejection technique to simulate a truncated multivariate normal distribution restricted to any convex subset of ℝ^d. The method only requires the determination of the mode of the probability density function (pdf) restricted to the convex acceptance region. We provide a theoretical upper bound for the ratio of the target probability density function over the proposal probability density function.
The article is organized as follows. In Sect. 2, we recall the rejection method.
Then, we present our new method, called rejection sampling from the mode (RSM)
and we give the main theoretical results and the associated algorithm. In Sect. 3, we
compare RSM with existing rejection algorithms.

2 Multivariate Normal Distribution


2.1 The General Rejection Method
Let f be a probability density function (pdf) defined on ℝ^d. Von Neumann [27] proposed the rejection method, using the notion of dominating density function.


Suppose that g is another density function close to f such that for some finite constant c ≥ 1, called the rejection constant,

$$f(x) \le c\, g(x), \qquad \forall x \in \mathbb{R}^d. \qquad (1)$$

The acceptance/rejection method is an algorithm for generating random samples from f by drawing from the proposal pdf g and the uniform distribution:
1. Generate X with density g.
2. Generate U uniformly on [0, 1]. If cg(X)U ≤ f(X), accept X; otherwise, go back to step 1.
The random variable X resulting from the above algorithm is distributed according to f. Furthermore it can be shown that the acceptance rate is equal to 1/c. In practice it is crucial to get a small c.
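As an illustration only, here is a minimal sketch of this accept/reject loop in one dimension; the density f, the proposal g, its sampler and the constant c are all supplied by the caller (names are not from the paper):

```cpp
#include <random>
#include <functional>

// Generic von Neumann rejection sampler: draws X ~ f given f(x) <= c g(x).
double rejection_sample(const std::function<double(double)>& f,
                        const std::function<double(double)>& g,
                        const std::function<double(std::mt19937_64&)>& sample_g,
                        double c, std::mt19937_64& gen) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    while (true) {
        double x = sample_g(gen);               // step 1: proposal draw
        if (c * g(x) * unif(gen) <= f(x))       // step 2: accept test
            return x;
    }
}
```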
Notice that the rejection sampling algorithm is immediately extended to unnormalized density functions, avoiding the computation of the normalizing constants.

Proposition 1 Let C be a subset of ℝ^d and f̃ and g̃ be two unnormalized density functions on C such that f̃(x) ≤ k̃ g̃(x), k̃ ∈ ℝ. Then the rejection algorithm is still valid if the inequality condition cg(X)U ≤ f(X) is replaced by

$$\tilde{k}\, \tilde{g}(X)\, U \le \tilde{f}(X). \qquad (2)$$

The rejection constant is $c = \tilde{k}\, \dfrac{\int_C \tilde{g}(t)\,dt}{\int_C \tilde{f}(t)\,dt}$.

Proof We have f̃(x) ≤ k̃ g̃(x), and so

$$f(x) = \frac{\tilde{f}(x)}{\int_C \tilde{f}(t)\,dt} \le \tilde{k}\, \frac{\int_C \tilde{g}(t)\,dt}{\int_C \tilde{f}(t)\,dt}\; \frac{\tilde{g}(x)}{\int_C \tilde{g}(t)\,dt} = c\, g(x), \qquad (3)$$

with $c = \tilde{k}\, \frac{\int_C \tilde{g}(t)\,dt}{\int_C \tilde{f}(t)\,dt}$. The condition cg(X)U ≤ f(X) is equivalent to k̃ g̃(X)U ≤ f̃(X).


2.2 Rejection Sampling from the Mode


Suppose that X has a multivariate normal distribution with probability density function:

$$f(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^{\top} \Sigma^{-1} (x-\mu)\right), \qquad x \in \mathbb{R}^d, \qquad (4)$$

where μ = E[X] and Σ is the covariance matrix, assumed to be invertible.


We consider a convex subset C of ℝ^d representing the acceptance region. We assume that μ does not belong to C, which is a hard case for crude rejection sampling. Furthermore, as explained in Remark 1 (see below), the proposed method is not different from crude rejection sampling if μ ∈ C. Without loss of generality, let μ = 0. Our aim is to simulate the multivariate normal distribution X restricted to the convex set C. The idea is twofold. Firstly, we determine the mode μ* corresponding to the maximum of the probability density function f restricted to C. It is the solution of the following convex optimization problem:

$$\mu^{*} = \arg\min_{x \in C} \frac{1}{2} x^{\top} \Sigma^{-1} x. \qquad (5)$$

Secondly, let g be the pdf obtained from f by shifting the center to μ*:

$$g(x \mid \mu^{*}, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu^{*})^{\top} \Sigma^{-1} (x-\mu^{*})\right). \qquad (6)$$

Then we prove in the next theorem and corollary that g can be used as a proposal pdf for rejection sampling on C, and we derive the optimal constant.
Theorem 1 Let f̃ and g̃ be the unnormalized density functions defined as

$$\tilde{f}(x) = f(x \mid 0, \Sigma)\,\mathbf{1}_{x \in C} \quad\text{and}\quad \tilde{g}(x) = g(x \mid \mu^{*}, \Sigma)\,\mathbf{1}_{x \in C},$$

where f and g are defined respectively in (4) and (6). Then there exists a constant k̃ such that f̃(x) ≤ k̃ g̃(x) for all x in C, and the smallest such value of k̃ is

$$\tilde{k}^{*} = \exp\left(-\frac{1}{2} (\mu^{*})^{\top} \Sigma^{-1} \mu^{*}\right). \qquad (7)$$

Proof Let us start with the one-dimensional case. Without loss of generality, we suppose that C = [μ*, +∞[, where μ* is positive and Σ = σ². In this case, the condition f̃(x) ≤ k̃ g̃(x) is written

$$\forall x \ge \mu^{*}, \qquad e^{-\frac{x^{2}}{2\sigma^{2}}} \le \tilde{k}\, e^{-\frac{(x-\mu^{*})^{2}}{2\sigma^{2}}}$$

and so

$$\tilde{k}^{*} = e^{\frac{(\mu^{*})^{2}}{2\sigma^{2}}} \max_{x \ge \mu^{*}} e^{-\frac{x\mu^{*}}{\sigma^{2}}} = e^{\frac{(\mu^{*})^{2}}{2\sigma^{2}}}\, e^{-\frac{(\mu^{*})^{2}}{\sigma^{2}}} = e^{-\frac{(\mu^{*})^{2}}{2\sigma^{2}}}.$$

In the multidimensional case, we have $\tilde{k}^{*} = \max_{x \in C} e^{\frac{1}{2}(\mu^{*})^{\top}\Sigma^{-1}\mu^{*} - x^{\top}\Sigma^{-1}\mu^{*}}$. Since μ* ∈ C, we only need to show that

$$\forall x \in C, \qquad x^{\top} \Sigma^{-1} \mu^{*} \ge (\mu^{*})^{\top} \Sigma^{-1} \mu^{*}.$$


Fig. 1 Scalar product between the gradient vector Σ⁻¹μ* of the function ½ xᵀΣ⁻¹x at μ* and the dashed vector (x − μ*). The ellipses centered at the origin are the level curves of the function x ↦ ½ xᵀΣ⁻¹x

The angle between the gradient vector Σ⁻¹μ* of the function ½ xᵀΣ⁻¹x at the mode μ* and the dashed vector (x − μ*) is acute for all x in C since C is convex (see Fig. 1). Therefore, (x − μ*)ᵀΣ⁻¹μ* is non-negative for all x in C. □
We can now write the RSM algorithm as follows:
Corollary 1 (RSM Algorithm) Let f̃ and g̃ be the unnormalized density functions defined as

$$\tilde{f}(x) = f(x \mid 0, \Sigma)\,\mathbf{1}_{x \in C} \quad\text{and}\quad \tilde{g}(x) = g(x \mid \mu^{*}, \Sigma)\,\mathbf{1}_{x \in C},$$

where f and g are defined by (4)–(6). Then the random vector X resulting from the following algorithm is distributed according to f̃:
1. Generate X with unnormalized density g̃.
2. Generate U uniformly on [0, 1]. If $U \le \exp\left((\mu^{*})^{\top}\Sigma^{-1}\mu^{*} - X^{\top}\Sigma^{-1}\mu^{*}\right)$, accept X; otherwise go back to step 1.

Proof The proof is done by applying Proposition 1 with the optimal constant k̃* of Theorem 1. □

Remark 1 In practice, we use a crude rejection method to simulate X with unnormalized density g̃ in the RSM algorithm. So if μ ∈ C, RSM degenerates to crude rejection sampling since μ* = μ and f̃ = g̃. Therefore, the RSM method can be seen as a generalization of naive rejection sampling.
Remark 2 Our method requires only the maximizer of the pdf restricted to the acceptance region, that is, the mode of the truncated multivariate normal distribution. Its numerical calculation is a standard convex quadratic programming problem, see e.g. [14].
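A minimal one-dimensional sketch of the RSM algorithm, under the simplifying assumptions that the target is N(0, σ²) and the acceptance region is C = [μ⁻, +∞[ with μ⁻ > 0, so that the mode is simply μ* = μ⁻ (function and variable names are illustrative):

```cpp
#include <random>
#include <cmath>

// RSM sketch in dimension one: sample N(0, sigma^2) truncated to [mu_minus, +inf).
double rsm_truncated_normal(double mu_minus, double sigma, std::mt19937_64& gen) {
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    const double mode = mu_minus;               // solution of (5) in this simple case
    while (true) {
        // Step 1: draw from the shifted proposal N(mode, sigma^2) restricted to C
        // by crude rejection (about one draw in two is kept here).
        double x;
        do { x = mode + sigma * gauss(gen); } while (x < mu_minus);
        // Step 2: accept with probability exp((mode^2 - x*mode) / sigma^2) <= 1.
        if (unif(gen) <= std::exp((mode * mode - x * mode) / (sigma * sigma)))
            return x;
    }
}
```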


3 Performance Comparisons
To investigate the performance of the RSM algorithm, we compare it with existing rejection algorithms. Robert [24], for example, proposed a rejection sampling method in the one-dimensional case. To compare the acceptance rates of RSM with Robert's method, we consider a standard normal variable truncated between μ⁻ and μ⁺, with μ⁻ fixed to 1. In Robert's method, the average acceptance rate is high when the acceptance interval is small (see Table 2.2 in [24]). In the proposed algorithm, simulating from shifted distributions (first step in the RSM algorithm) leads to the fact that the average acceptance rate is higher when the acceptance interval is large. As expected, the advantage of the proposed algorithm appears when we have a large gap between μ⁻ and μ⁺, as shown in Table 1. Thus the RSM algorithm can be seen as complementary to Robert's one.
The performance of the method appears when the probability to be inside the acceptance region is low. In Table 2, we consider the one-dimensional case d = 1 and we only change the position of μ⁻, where the acceptance region is C = [μ⁻, +∞[.
Table 1 Comparison of average acceptance rate between Robert's method [24] and RSM under the variability of the distance between μ⁻ and μ⁺

μ⁺ − μ⁻    Robert's method (%)    Rejection sampling from the mode (%)    Gain
0.5        77.8                   18.0                                    0.2
1          56.4                   21.2                                    0.3
2          35.0                   27.4                                    0.7
5          11.6                   28.2                                    2.4
10         7.0                    28.4                                    4.0

The acceptance region is C = [μ⁻, μ⁺], where μ⁻ is fixed to 1


Table 2 Comparison between crude rejection sampling and RSM when the probability to be inside
the acceptance region becomes low

Acceptance rate with Acceptance rate with Gain


crude rejection
RSM (%)
sampling (%)
0.5
1
1.5
2
2.5
3
3.5
4
4.5

30.8
15.8
6.7
2.2
0.6
0.1
0.0
0.0
0.0

The acceptance region is C = [ , +[

34.9
26.2
20.5
16.8
14.2
12.2
10.6
9.3
8.4

1.1
1.6
3.0
7.4
23.1
92.0
455.6
2936.7
14166.0


Fig. 2 Crude rejection sampling using 2000 simulations. The acceptance rate is 3 %

From the last column, we observe that our algorithm outperforms crude rejection sampling. For instance, the proposed algorithm is approximately 14,000 times faster than crude rejection sampling when the acceptance region is [4.5, +∞[. Note also that the acceptance rate remains stable for large μ⁻ (near 10 %) for the RSM method, whereas it decreases rapidly to zero for crude rejection sampling.
Now we investigate the performance of the RSM algorithm using a convex set in two dimensions. To do this, we consider a zero-mean bivariate Gaussian random vector x with covariance matrix Σ equal to $\begin{pmatrix} 4 & 2.5 \\ 2.5 & 2 \end{pmatrix}$. Assume that the convex set C ⊂ ℝ² is defined by the following inequality constraints:

$$-10 \le x_2 \le 0, \qquad x_1 \ge -15, \qquad 5x_1 - x_2 + 15 \le 0.$$

It is the acceptance region used in Figs. 2 and 3. By minimizing a quadratic form subject to linear constraints, we find the mode

$$\mu^{*} = \arg\min_{x \in C} \frac{1}{2} x^{\top} \Sigma^{-1} x \approx (-3.4, -2.0),$$

and then we compare crude rejection sampling to RSM.
In Fig. 2, we use crude rejection sampling in 2000 simulations of a N(0, Σ). Given the number of points in C (black points), it is clear that the algorithm is not efficient. The reason is that the mean of the bivariate normal distribution is outside the acceptance region. In Fig. 3, we first simulate from the shifted distribution centered at the mode μ* with the same covariance matrix Σ (step one of the RSM algorithm). Now in the second step of the RSM algorithm, we have two types of points (black and gray ones) in the convex set C. The gray points are in C but do not respect the inequality constraint in the RSM algorithm (see Corollary 1). The black points are in C, and

Fig. 3 Rejection sampling from the mode using 2000 simulations. The acceptance rate is 21 %

Table 3 Comparison between crude rejection sampling and RSM with respect to the dimension d

Dimension d    μ⁻      Acceptance rate with crude rejection sampling (%)    Acceptance rate with RSM (%)    Gain
1              2.33    1.0                                                   15.0                            15.0
2              1.29    1.0                                                   5.2                             5.2
3              0.79    1.0                                                   2.5                             2.5
4              0.48    1.0                                                   1.5                             1.5
5              0.25    1.0                                                   1.2                             1.2

The acceptance region is C = [μ⁻, +∞[^d

respect this inequality constraint. We observe that RSM outperforms crude rejection sampling, with an acceptance rate of 21 % against 3 %.
Now we investigate the influence of the problem dimension d. We simulate a standard multivariate normal distribution X restricted to C = [μ⁻, +∞[^d, where μ⁻ is chosen such that P(X ∈ C) = 0.01. The mean of the multivariate normal distribution is outside the acceptance region. Simulation of truncated normal distributions in multidimensional cases is a difficult problem for rejection algorithms. As shown in Table 3, the RSM algorithm is interesting up to dimension three. However, simulation of truncated multivariate normal distributions in high dimensions is a difficult problem for exact rejection methods. In that case, an adaptive rejection sampling for Gibbs sampling is needed, see e.g. [13]. From Table 3, we can remark that when the dimension increases, the parameter μ⁻ tends to zero. Hence, the mode μ* = (μ⁻, ..., μ⁻) tends to the zero mean of the Gaussian vector X. And so, the acceptance rate of the proposed method converges to the acceptance rate of crude rejection sampling.


4 Conclusion
In this paper, we develop a new rejection technique, called RSM, to simulate a truncated multivariate normal distribution restricted to convex sets. The proposed method only requires finding the mode of the target probability density function restricted to the convex acceptance region. The proposal density function in the RSM algorithm is the shifted target distribution centered at the mode. We provide a theoretical formula for the optimal constant such that the proposal density function is as close as possible to the target density. An illustrative example comparing RSM with crude rejection sampling is included. The simulation results show that using rejection sampling from the mode is more efficient than crude rejection sampling. A comparison with Robert's method in the one-dimensional case is also discussed. The RSM method outperforms Robert's method when the acceptance interval is large and the probability of the normal distribution to be inside it is low. The proposed rejection method has been applied in the case where the acceptance region is a convex subset of ℝ^d, and can be extended to non-convex regions by using the convex hull. Note that it is an exact method and it is easy to implement, since the mode is calculated as a Bayesian estimator in many applications. For instance, the proposed algorithm has been used to simulate a conditional Gaussian process with inequality constraints (see [20]). An adaptive rejection sampling for Gibbs sampling is needed to improve the acceptance rate of the proposed method.
Acknowledgments This work has been conducted within the frame of the ReDice Consortium, gathering industrial (CEA, EDF, IFPEN, IRSN, Renault) and academic (École des Mines de Saint-Étienne, INRIA, and the University of Bern) partners around advanced methods for Computer Experiments. The authors wish to thank Olivier Roustant (EMSE), Laurence Grammont (ICJ, Lyon 1) and Yann Richet (IRSN, Paris) for helpful discussions, as well as the anonymous reviewers for constructive comments and the participants of the MCQMC2014 conference.

References
1. Botts, C.: An accept-reject algorithm for the positive multivariate normal distribution. Comput. Stat. 28(4), 1749–1773 (2013)
2. Breslaw, J.: Random sampling from a truncated multivariate normal distribution. Appl. Math. Lett. 7(1), 1–6 (1994)
3. Casella, G., George, E.I.: Explaining the Gibbs sampler. Am. Stat. 46(3), 167–174 (1992)
4. Chopin, N.: Fast simulation of truncated Gaussian distributions. Stat. Comput. 21(2), 275–288 (2011)
5. Da Veiga, S., Marrel, A.: Gaussian process modeling with inequality constraints. Annales de la faculté des sciences de Toulouse 21(3), 529–555 (2012)
6. Devroye, L.: Non-Uniform Random Variate Generation. Springer, New York (1986)
7. Ellis, N., Maitra, R.: Multivariate Gaussian simulation outside arbitrary ellipsoids. J. Comput. Graph. Stat. 16(3), 692–708 (2007)
8. Emery, X., Arroyo, D., Peláez, M.: Simulating large Gaussian random vectors subject to inequality constraints by Gibbs sampling. Math. Geosci. 1–19 (2013)
9. Freulon, X., Fouquet, C.: Conditioning a Gaussian model with inequalities. In: Soares, A. (ed.) Geostatistics Tróia '92, Quantitative Geology and Geostatistics, vol. 5, pp. 201–212. Springer, Netherlands (1993)
10. Gelfand, A.E., Smith, A.F.M., Lee, T.M.: Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J. Am. Stat. Assoc. 87(418), 523–532 (1992)
11. Geweke, J.: Exact inference in the inequality constrained normal linear regression model. J. Appl. Econom. 1(2), 127–141 (1986)
12. Geweke, J.: Efficient simulation from the multivariate normal and student-t distributions subject to linear constraints and the evaluation of constraint probabilities. In: Proceedings of the 23rd Symposium on the Interface Computing Science and Statistics, pp. 571–578 (1991)
13. Gilks, W.R., Wild, P.: Adaptive rejection sampling for Gibbs sampling. J. R. Stat. Soc. Series C (Applied Statistics) 41(2), 337–348 (1992)
14. Goldfarb, D., Idnani, A.: A numerically stable dual method for solving strictly convex quadratic programs. Math. Progr. 27(1), 1–33 (1983)
15. Griffiths, W.E.: A Gibbs sampler for the parameters of a truncated multivariate normal distribution. Department of Economics - Working Papers Series 856, The University of Melbourne (2002)
16. Hörmann, W., Leydold, J., Derflinger, G.: Automatic Nonuniform Random Variate Generation. Statistics and Computing. Springer, Berlin (2004)
17. Kotecha, J.H., Djuric, P.: Gibbs sampling approach for generation of truncated multivariate Gaussian random variables. IEEE Int. Conf. Acoust. Speech Signal Process. 3, 1757–1760 (1999)
18. Laud, P.W., Damien, P., Shively, T.S.: Sampling some truncated distributions via rejection algorithms. Commun. Stat. - Simulation Comput. 39(6), 1111–1121 (2010)
19. Li, Y., Ghosh, S.K.: Efficient sampling method for truncated multivariate normal and student t-distribution subject to linear inequality constraints. http://www.stat.ncsu.edu/information/library/papers/mimeo2649_Li.pdf
20. Maatouk, H., Bay, X.: Gaussian process emulators for computer experiments with inequality constraints (2014). https://hal.archives-ouvertes.fr/hal-01096751
21. Martino, L., Miguez, J.: An adaptive accept/reject sampling algorithm for posterior probability distributions. In: IEEE/SP 15th Workshop on Statistical Signal Processing, SSP '09, pp. 45–48 (2009)
22. Martino, L., Miguez, J.: A novel rejection sampling scheme for posterior probability distributions. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP, pp. 2921–2924 (2009)
23. Philippe, A., Robert, C.P.: Perfect simulation of positive Gaussian distributions. Stat. Comput. 13(2), 179–186 (2003)
24. Robert, C.P.: Simulation of truncated normal variables. Stat. Comput. 5(2) (1995)
25. Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. Springer, Berlin (2004)
26. Rodriguez-Yam, G., Davis, R.A., Scharf, L.L.: Efficient Gibbs sampling of truncated multivariate normal with application to constrained linear regression (2004). http://www.stat.columbia.edu/~rdavis/papers/CLR.pdf
27. Von Neumann, J.: Various techniques used in connection with random digits. J. Res. Nat. Bur. Stand. 12, 36–38 (1951)
28. Yu, J.-W., Tian, G.-L.: Efficient algorithms for generating truncated multivariate normal distributions. Acta Mathematicae Applicatae Sinica, English Series 27(4), 601 (2011)

Van der Corput and Golden Ratio Sequences Along the Hilbert Space-Filling Curve

Colas Schretter, Zhijian He, Mathieu Gerber, Nicolas Chopin and Harald Niederreiter

Abstract This work investigates the star discrepancies and squared integration errors of two quasi-random point constructions using a generator one-dimensional sequence and the Hilbert space-filling curve. This recursive fractal is proven to maximize locality and passes uniquely through all points of the d-dimensional space. The van der Corput and the golden ratio generator sequences are compared for randomized integro-approximations of both Lipschitz continuous and piecewise constant functions. We found that the star discrepancy of the construction using the van der Corput sequence reaches the theoretical optimal rate when the number of samples is a power of two, while using the golden ratio sequence performs optimally for Fibonacci numbers. Since the Fibonacci sequence increases at a slower rate than the exponential in base 2, the golden ratio sequence is preferable when the budget of samples is not known beforehand. Numerical experiments confirm this observation.

Keywords Quasi-random points · Hilbert curve · Discrepancy · Golden ratio sequence · Numerical integration

C. Schretter (B)
ETRO Department, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
e-mail: cschrett@vub.ac.be
C. Schretter
iMinds, Gaston Crommenlaan 8, Box 102, 9050 Ghent, Belgium
Z. He
Tsinghua University, Haidian Dist., Beijing 100084, China
M. Gerber
Université de Lausanne, 1015 Lausanne, Switzerland
N. Chopin
Centre de Recherche en Économie et Statistique, ENSAE, 92245 Malakoff, France
H. Niederreiter
RICAM, Austrian Academy of Sciences, Altenbergerstr. 69, 4040 Linz, Austria


1 Introduction
The Hilbert space-filling curve in two dimensions [1], first described in 1891 by David
Hilbert, is a recursively defined fractal path that passes uniquely through all points
of the unit square. The Hilbert curve generalizes naturally in higher dimensions
and presents interesting potential for the construction of quasi-random point sets
and sequences. In particular, its construction ensures the bijectivity, adjacency and
nesting properties that we define in the following.
For integers d ≥ 2 and m ≥ 0, let

$$I_m^d = \left\{ I_m^d(k) := [k, k+1]\cdot 2^{-dm} \right\}_{k=0}^{2^{dm}-1} \qquad (1)$$

be the splitting of [0, 1] into closed intervals of equal size 2^{−dm} and S_m^d be the splitting of [0, 1]^d into 2^{dm} closed hypercubes of volume 2^{−dm}. First, writing H : [0, 1] → [0, 1]^d for the Hilbert space-filling curve mapping, the set S_m^d(k) := H(I_m^d(k)) is a hypercube that belongs to S_m^d (bijectivity property). Second, for any k ∈ {0, ..., 2^{dm} − 2}, S_m^d(k) and S_m^d(k + 1) have at least one edge in common (adjacency property). Finally, if we split I_m^d(k) into the 2^d successive closed intervals I_{m+1}^d(k_i), k_i = 2^d k + i and i ∈ {0, ..., 2^d − 1}, then the S_{m+1}^d(k_i) are simply the splitting of S_m^d(k) into 2^d closed hypercubes of volume 2^{−d(m+1)} (nesting property).
The Hilbert space-filling curve has already been applied to many problems in
computer science such as clustering points [2] and optimizing cache coherence for
efficient database access [3]. The R*-tree data structure has also been proposed
for efficient searches of points and rectangles [4]. Similar space-filling curves have
been used to heuristically propose approximate solutions to the traveling salesman
problem [5]. In computer graphics, the Hilbert curve has been used to define strata
prior to stratified sampling [6]. Very recently, the inverse Hilbert mapping has also
been applied to sequential quasi-Monte Carlo methods [7].

Fig. 1 First three steps of the recursive construction of the Hilbert space-filling curve in two dimensions. The dots snap to the closest vertex on an implicit Cartesian grid that covers the space with an arbitrary precision increasing with the recursion order of the mapping calculations


The recursive definition of the Hilbert space-filling curve provides levels of detail for approximations of a continuous mapping from 1-D to d-D with d ≥ 2, up to any arbitrary numerical precision. An illustration of the generative process of the curve with increasing recursion order is shown in Fig. 1. Efficient computer implementations exist for computing Hilbert mappings, both in two dimensions [8, 9] and up to 32 or 64 dimensions [10]. Therefore, the Hilbert space-filling curve allows fast constructions of point sets and sequences using a given generator set of coordinates in the unit interval. The remainder of this work focuses on comparing the efficiency of two integro-approximation constructions, using either the van der Corput sequence or the golden ratio sequence [11].

2 Integro-Approximations
Let f(·) be a d-dimensional function that is not analytically integrable on the unit cube [0, 1]^d. We aim at estimating an integral

$$\mu = \int_{[0,1]^d} f(X)\, dX. \qquad (2)$$

Given a one-dimensional sequence x₀, ..., x_{n−1} in [0, 1), we can get a corresponding sequence of points P₀, ..., P_{n−1} in [0, 1)^d in the domain of integration via the mapping function H : [0, 1] → [0, 1]^d towards samples in the d-dimensional unit cube. The integral can therefore be estimated by the following average:

$$\hat{\mu} = \frac{1}{n} \sum_{i=0}^{n-1} f(H(x_i)). \qquad (3)$$
Recent prior work by He and Owen [12] studied such approximations with the van der Corput sequence as the one-dimensional input for the Hilbert mapping function H. To define the van der Corput sequence, let

$$i = \sum_{k=1}^{\infty} d_k(i)\, b^{k-1} \quad\text{for}\quad d_k(i) \in \{0, 1, \ldots, b-1\} \qquad (4)$$

be the digit expansion in base b ≥ 2 of the integer i ≥ 0. Then, the ith element of the van der Corput sequence is defined as

$$x_i = \sum_{k=1}^{\infty} d_k(i)\, b^{-k}. \qquad (5)$$



Fig. 2 The first 13 coordinates generated by the van der Corput (top) and the golden ratio (bottom)
sequences. For this specific choice of number of samples, the points are more uniformly spread on
the unit interval with the golden ratio sequence and the maximum distance between the two closest
coordinates is smaller than in the van der Corput sequence

Fig. 3 The first hundred (top row) and thousand (bottom row) points generated by marching along
the Hilbert space-filling curve with distances given by the van der Corput sequence (left) and the
golden ratio sequence (right). In contrast to using the golden ratio number, the van der Corput
construction generates points that are implicitly aligned on a regular Cartesian grid


Alternatively, one can choose as input a specific instance of the one-dimensional Richtmyer sequences [13], based on the golden ratio number. Given a seed parameter s ~ U([0, 1)) for randomization, the golden ratio sequence is defined as

$$x_i = \{ s + i\varphi \}, \qquad (6)$$

where {t} denotes the fractional part of the real number t and φ is the golden ratio (or golden section) number

$$\varphi = \frac{1 + \sqrt{5}}{2} \approx 1.6180339887\ldots; \qquad (7)$$

however, since only fractional parts are retained, we can as well substitute φ by the golden ratio conjugate number

$$\bar{\varphi} = \varphi - 1 = \frac{1}{\varphi} \approx 0.6180339887\ldots. \qquad (8)$$

In prior work, we explored applications of these golden ratio sequences for generating randomized integration quasi-lattices [14] and for non-uniform sampling [15].
Figure 2 compares the first elements of the van der Corput generator and the golden
ratio sequence with s = 0. Figure 3 shows their images in two dimensions through
the Hilbert space-filling curve mapping. It is worth pointing out that both of the van
der Corput and the golden ratio sequences are extensible, while the latter spans the
unit interval over a larger range.
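For illustration, here is a minimal sketch of the two generator sequences compared in Fig. 2; the function names are ours, not from the paper:

```cpp
#include <cmath>

// i-th element of the van der Corput sequence in base b (radical inverse), Eq. (5).
double van_der_corput(unsigned long long i, unsigned b = 2) {
    double x = 0.0, base_inv = 1.0 / b;
    while (i > 0) {
        x += (i % b) * base_inv;     // digit d_k(i) placed at weight b^{-k}
        i /= b;
        base_inv /= b;
    }
    return x;
}

// i-th element of the golden ratio sequence with seed s in [0,1), Eqs. (6)-(8).
// Note: for very large i, a modular recursion avoids floating-point loss of precision.
double golden_ratio_seq(unsigned long long i, double s = 0.0) {
    const double phi_conj = 0.6180339887498949;   // golden ratio conjugate, Eq. (8)
    const double x = s + i * phi_conj;
    return x - std::floor(x);                      // fractional part
}
```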

3 Star Discrepancy
A key corollary of the strong irrationality of the golden ratio is that the set of coordinates will not align on any regular grid in the golden ratio sequence. Therefore, we could expect that irregularities in the generated sequence of point samples could be advantageous in case the function to integrate contains regular alignments or self-repeating structures. In order to compare their potential performance for integro-approximation problems, we use the star discrepancy to measure the uniformity of the resulting sequence P = (P₀, ..., P_{n−1}).
For a = (a₁, ..., a_d) ∈ [0, 1]^d, let [0, a) be the anchored box ∏_{i=1}^{d} [0, a_i). The star discrepancy of P is

$$D_n^{*}(P) = \sup_{a \in [0,1)^d} \left| \frac{A(P, [0,a))}{n} - \lambda_d([0,a)) \right| \qquad (9)$$

with the counting function A giving the number of points from the set P that belong to [0, a) and λ_d being the d-dimensional Lebesgue measure, i.e., the area for d = 2.

Fig. 4 A comparison of the star discrepancies of the dyadic van der Corput (VDC) and the golden ratio (GR) sequences. The dots are evaluated at n = 2^k, k = 1, ..., 12 for the VDC construction and at n = F(k), k = 1, ..., 18 for the GR construction. The reference line is n^{−1}

It is possible to compute exactly the star discrepancy of some one-dimensional sequences by Theorem 2.6 of [16]. It is also known that the star discrepancy of the van der Corput sequence is O(n^{−1} log(n)), and the star discrepancy of the golden ratio sequence is of the same order for n ≥ 2. Figure 4 compares the star discrepancies of the van der Corput sequence and the golden ratio sequence. We observe that the star discrepancies of the two sequences are slightly worse than O(n^{−1}), which is in line with the theoretical rate O(n^{−1+ε}) for any ε > 0.
Let F(k) be the Fibonacci sequence satisfying F(0) = 0, F(1) = 1 and F(k) = F(k − 1) + F(k − 2) for k ≥ 2. It is of interest to investigate the star discrepancy of P = {H(x₀), ..., H(x_{n−1})} when n = F(k), k ≥ 1. We can show that if (x_i)_{i≥0} is the anchored (s = 0) golden ratio sequence, then each interval I_j = [(j − 1)/n, j/n) for j = 1, ..., n, contains precisely one of the x_i if n = F(k) for any k ≥ 1. This follows from the proof of Theorem 3.3 in [16], in which we consider the point set P with n_i = 0 and z = φ or φ̄ in that proof.
If we combine the above observation with Theorem 3.1 in [12], then we have the following star discrepancy bound for P:

$$D_n^{*}(P) \le 4d\sqrt{d+3}\; n^{-1/d} + O(n^{-2/d}) \qquad (10)$$

with n = F(k), k ≥ 1.
From the result above, we can see that in most cases the star discrepancy of the golden ratio sequence is smaller than that of the van der Corput sequence. It is also of interest to compare the performance of the resulting point sequences P generated by the van der Corput and golden ratio sequences. For the former, we can prove that the star discrepancy of P is O(n^{−1/d}) [12].


More generally, for an arbitrary one-dimensional point set x₀, ..., x_{n−1} in [0, 1], the following result provides a bound for the star discrepancy of the resulting d-dimensional point set P:

Theorem 1 Let x₀, ..., x_{n−1} be n ≥ 1 points in [0, 1] and P = {H(x₀), ..., H(x_{n−1})}. Then

$$D_n^{*}(P) \le c\, \Big( D_n^{*}\big(\{x_i\}_{i=0}^{n-1}\big) \Big)^{1/d} \qquad (11)$$

for a constant c depending only on d.
Proof For the sake of simplicity we assume that the Hilbert curve starts at (0, . . . , 0)
[0, 1]d . Let m 0 be an arbitrary integer and a [0, 1)d be such that Smd (0)
B := [0, a). Let SmB = {W Smd : W B}, B = SmB and DmB = {W Smd :
W = }. Then, let D mB be the set of #DmB disjoint subsets of [0, 1]d such
(B\ B)
that
1. W D mB , W DmB | W W, 2. D mB = DmB , 3. B {D mB } = .
(12)
Note that D mB is obtained by removing boundaries of the elements in DmB such that
the above conditions 2 and 3 are satisfied. Then, we have



 



  A(P , W B)

  A(P , B)
 A(P , B)



+
d (B) 
d ( B)
d (W B) .






n
n
n
W D mB

(13)
To bound the first term on the right-hand side, let SmB = {Smd (0)} {Smd (k)
k 1 such that Smd (k) B, Smd (k 1) B c = } so that B contains #SmB
non-consecutive hypercubes belonging to Smd . By the property of the Hilbert curve,
consecutive hypercubes in Smd correspond to consecutive intervals in Imd (adja contains at most #SmB non consecutive intercency property). Therefore, h( B)
d
vals that belong to Im so that there exist disjoint closed intervals I j [0, 1], j =
SmB +1
n1
= #j=1
I j . Hence, since the point set {xi }i=0
is
1, . . . , #SmB + 1 such that h( B)
in [0, 1) we have, using Proposition 2.4 of [16],
Smd ,


 

 A(P, B)
  A({x }, h( B))




i



n1
=
 2(#SmB + 1) D {xi }i=0
d ( B)
1 (h( B))
.


 

n
n
(14)
To bound #SmB , let m 1 m be the smallest positive integer such that Smd 1 (0) B
and let km 1 be the maximal number of hypercubes in SmB1 . Note that km 1 = 2m 1 (d1) .
Indeed, by the definition of m 1 , the only way for B to be made of more than one
hypercube in Smd1 is to stack such hypercubes in at most (d 1) dimensions, otherwise, we can reduce m 1 to (m 1 1) due to the nesting property of the Hilbert curve.

538

C. Schretter et al.

In each dimension we can stack at most 2m 1 hypercubes that belong to SmB1 so that
km 1 = 2m 1 (d1) .
Let m 2 = (m 1 + 1) and Bm 2 = B\ SmB1 . Then,
Bm

#Sm 2 2 km 2 := 2d 2m 2 (d1)

(15)

Bm

since, by construction, #Sm 2 2 is the number of hypercubes in Smd2 required to cover


the faces other than the ones that are along the axis of the hyperrectangle made by
the union of the hypercubes in SmB1 . This hyperrectangle has at most 2d faces of
dimension (d 1). The volume of each face is smaller than 1 so that we need at
most 2m 2 (d1) hypercubes in Smd2 to cover each face.
Bm

Bm

k1
More generally, for m 1 m k m, we define Bm k := Bm k1 \ Sm k 1
and #Sm k k

m k (d1)
is bounded by km k := 2d2
. Note that, for any j = 1, . . . , k 1, the union of

Bm

all hypercubes belonging to Sm j j forms a hyperrectangle having at most 2d faces


of dimension (d 1). Therefore, since d 2, we have
#SmB km +

m1


k j = 2d 2m(d1) + 2d 2m 1 (d1)

j=m 1

2(mm 1 )(d1) 1
4d 2m(d1)
2d1 1
(16)

so that



 A(P, B)





n1
 2(1 + 4d 2m(d1) ) D {xi }i=0
d ( B)
.



n

(17)

For the second term of (13), take W D mB and note that W Smd (k) for a k
{0, . . . , 2dm 1}. Then,


 A(P, W B)

A(P, Smd (k))


d (W B)
+ d (Smd (k))



n
n
A({xi }, Imd (k))
+ 1 (Imd (k))
n


n1
21 (Imd (k)) + 2 D {xi }i=0



n1
= 2 2dm + D {xi }i=0
=

(18)

where the last inequality uses the fact that the xi s are in [0, 1) as well as Proposition
2.4 in [16]. Thus,



  A(P, W B)



n1
d (W B) 2d 2m + 2d 2m(d1) D {xi }i=0
(19)



n
W D mB

Van der Corput and Golden Ratio Sequences

539

since #D mB = #DmB d 2m(d1) , as we show in the following.


Indeed, by construction, #DmB is the number of hypercubes in Smd required to
cover the faces other than the ones that are along the axis of the hyperrectangle made
by the union of the hypercubes in SmB . This hyperrectangle has d faces of dimension
(d 1) that are not along an axis. The volume of each face is smaller than 1 so that
we need at most 2(d1)m hypercubes in Smd to cover each face.
Hence, for all a [0, 1)d such that Smd (0) [0, a) we have


 A(P, [0, a))




n1

d ([0, a)) 2d 2m + D {xi }i=0
2 + 10d 2m(d1) .

n

(20)

Finally, if a [0, 1)d is such that Smd (0)  [0, a), we proceed exactly as above,
but now B is empty and therefore the first term in (13) disappears. To conclude

n1
.
the proof, we choose the optimal value of m such that 2m 2(d1)m D {xi }i=0


n1 1/d
Hence, D (P) c D {xi }i=0
for a constant c depending only on d.

Compared to the result obtained for the van der Corput sequence, which only relies
on the Hlder property of the Hilbert curve [12], it is worth noting that Theorem 1
is based on its three key geometric properties: bijectivity, adjacency and nesting.
Theorem 1 is of key importance in this work as it says that the discrepancy of
the point set is monotonously related to the discrepancy of the generator sequence.
From this point of view, we can see that the star discrepancy of P generated by the
golden ratio sequence is O(n 1/d log(n)1/d ) for n 2. Numerical experiments will
compare the van der Corput and the golden ratio generator sequences and highlight
practical implications for computing the cubatures of four standard test functions.

4 Numerical Experiments
For the scrambled van der Corput sequences, the mean squared error (MSE) for
integration of Lipschitz continuous integrands is in O(n 12/d ) [12]. Additionally,
it is also shown in [12] that for discontinuous functions whose boundary of discontinuities has bounded (d 1)-dimensional Minkowski content, one can get an MSE
of O(n 11/d ). We will compare the two quasi-Monte Carlo constructions using
randomized sequences in our following numerical experiments.
We consider first two smooth functions that were studied in [17, 18] and are shown
in the first row of Fig. 5. The Additive function
f 1 (X ) = X 1 + X 2 ,

X = (X 1 , X 2 ) [0, 1]2 ,

(21)

and the Smooth function that is the exponential surface


f 2 (X ) = X 2 exp(X 1 X 2 ),

X = (X 1 , X 2 ) [0, 1]2 .

(22)

540

C. Schretter et al.

3
2.5
2
1.5
1
0.5
0

3
2.5
2
1.5
1
0.5
0
1

0.8
0

0.8

0.6
0.2

0.4

0.4
0.6

0.8

0.6
0.2

0.2

0.4

0.4
0.6

1 0

3
2.5
2
1.5
1
0.5
0

0.8

0.2
1 0

3
2.5
2
1.5
1
0.5
0
1

1
0.8

0.8
0

0.6
0.2

0.4

0.4
0.6

0.8

0.6
0.2

0.2
1 0

0.4

0.4
0.6

0.8

0.2
10

Fig. 5 The four test functions used for integro-approximation experiments. The smooth functions
on the first rows are fairly predictable as their variations are locally coherent. However, the functions
on the second row contain sharp changes that are difficult to capture with discrete sampling

This Lipschitz function in particular has infinitely many continuous derivatives.


It is known that for Lipschitz continuous functions, the scrambled van der Corput
sequence yields an MSE O(n 2 log(n)2 ) for arbitrary sample size n 2 and when
n = bk , k = 1, . . . , the MSE becomes O(n 2 ) [12]. Figure 6 shows that the MSEs
for the randomized van der Corput and golden ratio sequences are nearly O(n 2 ).
When n = 2k , the van der Corput sequence performs better than the golden ratio
sequence. But in most cases of n = 2k , the golden ratio sequence outperforms the
van der Corput sequence. In plots, the dots are evaluated at n = 2k , k = 1, . . . , 12
for the VDC construction and at n = F(k), k = 1, . . . , 18 for the GR construction.
The MSEs are computed based on 100 repetitions.
We consider now the examples in the second row of Fig. 5: the Cusp function
f 3 (X ) = max(X 1 + X 2 1, 0),

X = (X 1 , X 2 ) [0, 1]2 ,

(23)

and the Discontinuous function that is the indicator


f 4 (X ) = 1{X 1 +X 2 >1} (X ),

X = (X 1 , X 2 ) [0, 1]2 .

(24)

Van der Corput and Golden Ratio Sequences

541

Additive
10

VDC
k
VDC:n=2
GR
GR:n=F(k)
n2

10

MSE

10

10

10

10

10

10

10

10

10

Number of samples

Smooth
10

VDC
k
VDC:n=2
GR
GR:n=F(k)
2
n

MSE

10

10

10

10

10

10

10

10

Number of samples

Fig. 6 A comparison of the mean squared errors (MSEs) of the randomized van der Corput and
the golden ratio sequences for the smooth functions f 1 (top) and f 2 (bottom). The reference line is
n 2

In particular, the discontinuity boundary of this indicator function has finite


Minkowski content. This step function was previously studied with a van der Corput
generator sequence in [12]. It was found that for this function the scrambled van
der Corput sequence yields an MSE O(n 3/2 ) for arbitrary sample size n. Figure 7
shows that the MSEs for the randomized van der Corput and golden ratio sequences
are close to O(n 3/2 ). In most cases, the golden ratio sequence seems to outperform
the construction of quasi-random samples using the van der Corput sequence.

542

C. Schretter et al.

Cusp
0

10

VDC
k
VDC:n=2
GR
GR:n=F(k)
2
n

10

MSE

10

10

10

10

10

10

10

10

10

Number of samples

Discontinuous
0

10

VDC
VDC:n=2k
GR
GR:n=F(k)
n1.5

MSE

10

10

10

10

10

10

10

10

Number of samples

Fig. 7 A comparison of the mean squared errors (MSEs) of the randomized van der Corput sequence
and golden ratio sequences for the functions f 3 (top) and f 4 (bottom). The reference line is n 1.5
for the discontinuous step function and n 2 for the continuous function

5 Conclusions
This work evaluated the star discrepancy and squared integration error for two constructions of quasi-random points, using the Hilbert space-filling curve. We found that
using the fractional parts of integer multiples of the golden ratio number often leads to
improved results, especially when the number of samples is close to a Fibonacci number. The discrepancy of the point sets increases monotonously with the discrepancy
of the generator one-dimensional sequence, therefore the van der Corput sequence

Van der Corput and Golden Ratio Sequences

543

leads to optimal results in the specific cases when the generating coordinates are
equally-spaced.
In future work, we plan to investigate generalizations of the Hilbert space-filling
curve in higher dimensions. A deterioration of the discrepancy is expected as the
dimension increases, an effect linked to the curse of dimensionality. Since the Hilbert
space-filling curve is accepted by a pseudo-inverse operator, the problem of constructing quasi-random samples is reduced to choosing a suitable generator onedimensional sequence. We therefore hope that the preliminary observations presented
here may spark subsequent research towards designing adapted generator sequences,
given specific integration problems at hand.
Acknowledgments The authors thank Art Owen for suggesting conducting the experimental comparisons presented here, his insightful discussions and his reviews of the manuscript.

References
1. Bader, M.: Space-Filling CurvesAn Introduction with Applications in Scientific Computing.
Texts in Computational Science and Engineering, vol. 9. Springer, Berlin (2013)
2. Moon, B., Jagadish, H.V., Faloutsos, C., Saltz, J.H.: Analysis of the clustering properties of
Hilbert space-filling curve. Technical report, University of Maryland, College Park, MD, USA
(1996)
3. Terry, J., Stantic, B., Terenziani, P., Sattar, A.: Variable granularity space filling curve for indexing multidimensional data. In: Proceedings of the 15th International Conference on Advances
in Databases and Information Systems, ADBIS11, pp. 111124. Springer (2011)
4. Beckmann, N., Kriegel, H.P., Schneider, R., Seeger, B.: The R*-tree: an efficient and robust
access method for points and rectangles. In: Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, pp. 322331 (1990)
5. Platzman, L.K., Bartholdi III, J.J.: Spacefilling curves and the planar travelling salesman problem. J. ACM 36(4), 719737 (1989)
6. Steigleder, M., McCool, M.: Generalized stratified sampling using the Hilbert curve. J. Graph.
Tools 8(3), 4147 (2003)
7. Gerber, M., Chopin, N.: Sequential quasi-Monte Carlo. J. R. Stat. Soc. Ser. B 77(3), 509579
(2015)
8. Butz, A.: Alternative algorithm for Hilberts space-filling curve. IEEE Trans. Comput. 20(4),
424426 (1971)
9. Jin, G., Mellor-Crummey, J.: SFCGen: a framework for efficient generation of multidimensional space-filling curves by recursion. ACM Trans. Math. Softw. 31(1), 120148 (2005)
10. Lawder, J.K.: Calculation of mappings between one and n-dimensional values using the Hilbert
space-filling curve. Research report BBKCS-00-01, University of London (2000)
11. Coxeter, H.S.M.: The golden section, phyllotaxis, and Wythoffs game. Scr. Math. 19, 135143
(1953)
12. He, Z., Owen, A.B.: Extensible grids: uniform sampling on a space-filling curve. e-print (2014)
13. Franek, V.: An algorithm for QMC integration using low-discrepancy lattice sets. Comment.
Math. Univ. Carolin 49(3), 447462 (2008)
14. Schretter, C., Kobbelt, L., Dehaye, P.O.: Golden ratio sequences for low-discrepancy sampling.
J. Graph. Tools 16(2), 95104 (2012)
15. Schretter, C., Niederreiter, H.: A direct inversion method for non-uniform quasi-random point
sequences. Monte Carlo Methods Appl. 19(1), 19 (2013)

544

C. Schretter et al.

16. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM,
Philadelphia (1992)
17. Sloan, I.H., Joe, S.: Lattice Methods for Multiple Integration. Clarendon Press, Oxford (1994)
18. Owen, A.B.: Local antithetic sampling with scrambled nets. Ann. Stat. 36(5), 23192343
(2008)

Uniform Weak Tractability


of Weighted Integration
Pawe Siedlecki

Abstract We study a relatively new notion of tractability called uniform weak


tractability that was recently introduced in (Siedlecki, J. Complex. 29:438453,
2013 [5]). This notion holds for a multivariable problem iff the information complexity n(, d) of its d-variate component to be solved to within is not an exponential
function of any positive power of 1 and/or d. We are interested in necessary and
sufficient conditions on uniform weak tractability for weighted integration. Weights
are used to control the role or importance of successive variables and groups
of variables. We consider here product weights. We present necessary and sufficient
conditions on product weights for uniform weak tractability for two Sobolev spaces
of functions defined over the whole Euclidean space with arbitrary smoothness,
and of functions defined over the unit cube with smoothness 1. We also briefly consider (s, t)-weak tractability introduced in (Siedlecki and Weimar, J. Approx. Theory
200:227258, 2015 [6]), and show that as long as t > 1 then this notion holds for
weighted integration defined over quite general tensor product Hilbert spaces with
arbitrary bounded product weights.
Keywords Tractability Multivariate integration Weighted integration

1 Introduction
There are many practical applications for which we need to approximate integrals of
multivariate functions. The number of variables d in many applications is huge. It is
desirable to know what is the minimal number of function evaluations that is needed
to approximate the integral to within and how this number depends on 1 and d.
In this paper we consider weighted integration. We restrict ourselves to product weights which control the importance of successive variables and groups of
variables. We consider weighted integration defined over two Sobolev spaces. One
space consists of smooth functions defined over the whole Euclidean space, whereas
P. Siedlecki (B)
Faculty of Mathematics, Informatics and Mechanics, University of Warsaw,
Banacha 2, 02-097 Warszawa, Poland
e-mail: psiedlecki@mimuw.edu.pl
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_29

545

546

P. Siedlecki

the second one is an anchored space of functions defined on the unit cube that are
once differentiable with respect to all variables.
We find necessary and sufficient conditions on product weights to obtain uniform
weak tractability for weighted integration. This problem is solved by first establishing
a relation between uniform weak tractability and so called T -tractability. Then we
apply known results on T -tractability from [4].
We compare necessary and sufficient conditions on uniform weak tractability with
the corresponding conditions on strong polynomial, polynomial, quasi-polynomial
and weak tractability. All these conditions require some specific decay of product
weights. For different notions of tractability the decay is usually different.
We also briefly consider (s, t)-weak tractability introduced recently in [6]. This
notion holds if the minimal number of function evaluations is not exponential in s
and d t . We stress that now s and t can be arbitrary positive numbers. We show that as
long as t > 1 then weighted integration is (s, t)-weakly tractable for a general tensor
product Hilbert space whose reproducing univariate kernel is finitely integrable over
its diagonal. This means that as long as we accept a possibility of an exponential
dependence on d with < t then we do not need decaying product weights and
we may consider even the case where all product weights are the same.

2 Multivariate Integration
Assume that for every d N we have a Borel measurable subset
Dd of Rd , and

d : Dd R+ is a Lebesgue probability density function, Dd d (x)d x = 1. Let
Fd be a reproducing kernel Hilbert space of real integrable functions defined on a
common domain Dd with respect to the measure d (A) = A d (x)d x defined on
all Borel subsets of Dd .
A multivariate integration is a problem INT = {INTd } such that

INTd : Fd R : f 

f (x)d (x)d x
Dd

for every d N
We approximate INTd ( f ) for f Fd by algorithms which use only partial information about f . The information about f consists of a finite number of function
values f (t j ) at sample points t j Dd . In general, the points t j can be chosen adaptively, that is the choice of t j may depend on f (ti ) for i = 1, 2, . . . , j 1. The
approximation of INTd ( f ) is then
Q n,d ( f ) = n ( f (t1 ), f (t2 ), . . . , f (tn ))
for some, not necessarily linear, function n : Rn R.
The worst case error of Q n,d is defined as
e(Q n,d ) = sup |INTd ( f ) Q n,d ( f )|.
 f  Fd 1

Uniform Weak Tractability of Weighted Integration

547

Since the use of adaptive information does not help we can restrict ourselves to
considering only non-adaptive algorithms, i.e., t j can be given simultaneously, see
[1]. It is also known that the best approximations can be achieved by means of linear
functions, i.e., n can be chosen as a linear function. This is the result of Smolyak
which can be found in [1]. Therefore without loss of generality, we only need to
consider non-adaptive and linear algorithms of the form
Q n,d ( f ) =

n


a j f (t j )

j=1

for some a j R and for some t j Dd .


For (0, 1) and d N, the information complexity n(, INTd ) of the problem
INTd is defined as the minimal number n N for which there exists an algorithm
Q n,d with the worst case error at most CRId ,
n(, INTd ) = min{ n : Q n,d such that e(Q n,d ) CRId }.
Here, CRId = 1 if we consider the absolute error criterion, and CRId = INTd  if
we consider the normalized error criterion.

3 Generalized Tractability and Uniform Weak Tractability


We first remind the reader of the basic notions of tractability. For more details we
refer to [3] and references therein. Recall that a function
T : [1, ) [1, ) [1, )
is called a generalized tractability function iff T is nondecreasing in each of its
arguments and
ln T (x, y)
= 0.
lim
x+y
x+y
As in [2], we say that INT = {INTd } is T -tractable iff there are nonnegative numbers
C and t such that
n(, INTd ) C T (1 , d)t

(0, 1], d N.

We say that INT = {INTd } is strongly T -tractable iff there are nonnegative numbers C and t such that
n(, INTd ) C T (1 , 1)t

(0, 1], d N.

548

P. Siedlecki

Examples of T -tractability include polynomial tractability (PT) and strong polynomial tractability (SPT) if T (x, y) = x y, and quasi-polynomial tractability (QPT)
if T (x, y) = exp((1 + ln x)(1 + ln y)).
We say that INT = {INTd } is weakly tractable (UWT) iff
lim

1 +d

ln n(, INTd )
= 0.
1 + d

As in [5], we say that INT = {INTd } is uniformly weakly tractable (UWT) iff
lim

1 +d

ln n(, INTd )
= 0 , (0, 1).
+ d

Here we adopt convention that ln 0 = 0.


The following lemma gives a characterization of uniform weak tractability in
terms of a certain family of generalized tractability functions.
Lemma 1 For every , (0, 1) the function
T, (x, y) = exp(x + y ) for all x, y [1, )
is a generalized tractability function. Moreover,
INT is uniformly weakly tractable iff INT is T, -tractable for every , (0, 1).
Proof It is obvious that for every , (0, 1) and fixed x, y [1, )
T, (x, ) : [1, ) [1, ) and T, (, y) : [1, ) [1, )
are non-increasing functions. Since for every , (0, 1) we have
ln T, (x, y)
x + y
= lim
= 0,
x+y
x+y x + y
x+y
lim

it follows that T,(0,1) is a generalized tractability function for every , (0, 1).
Suppose that INT is uniformly weakly tractable, i.e.,
lim

1 +d

ln n(, INTd )
=0
+ d

, > 0.

Thus, for arbitrary but fixed , (0, 1), there exists t > 0 such that
ln n(, INTd ) t ( + d )
Hence


t
n(, INTd ) exp( + d )

(0, 1], d N.
(0, 1], d N.

Therefore a problem S is T, -tractable for all , (0, 1).

Uniform Weak Tractability of Weighted Integration

549

Assume now that INT is T, -tractable for every , (0, 1). That is, for all
, (0, 1) there are positive C(, ) and t (, ) such that


n(, INTd ) C(, ) exp t (, ) ( + d ) (0, 1], d N.
Take now arbitrary positive and which may be larger than 1. Obviously there
exist 0 , 0 (0, 1) such that 0 < and 0 < . Since INTd is T0 ,0 -tractable then
ln n(, INTd )
ln C(0 , 0 ) + t (0 , 0 )(0 + d 0 )

lim
= 0.
+ d
+ d
1 +d
1 +d
lim

Since the choice of , > 0 was arbitrary, we conclude that


lim

1 +d

ln n(, INTd )
=0
+ d

, > 0,


and the problem INT is uniformly weakly tractable, as claimed.

We add that Lemma 1 holds not only for multivariate integration but also for all
multivariate problems.

4 Weighted Sobolev Spaces Over Unbounded Domain


In this section we specify the class Fd, as a weighted Sobolev space of smooth
functions f : Rd R. More precisely, assume that a set of weights =
{d,u }dN,u1,2,...,d , with d,u 0, is given. Then for r N, Fd = H (K d ) is a reproducing kernel Hilbert space whose reproducing kernel is of the form
K d, (x, t) =


u{1,2,...,d}

d,u

R(x j , t j )

ju

where


R(x, t) = 1 M (x, t)
0

(|t| z)r+1 (|x| z)r+1


dz
[(r 1)!]2

for

x, y R,

and
M = {(x, t) R2 : xt 0}.
We assume that the weights are bounded product weights, i.e., d, = 1 and
d,u =


ju

d, j for non-empty u {1, 2, . . . , d}

(1)

550

P. Siedlecki

where d, j satisfy

0 d, j <

for some positive number .


The weighted integration problem INT = {INTd, } is given as in [4, Sect. 12.4.2]:

INTd, : Fd, R : f 

Rd

f (t1 , t2 , . . . , td )(t1 )(t2 ) . . . (td )dt,

where : R R is a non-negative function satisfying





R

(t)dt = 1 and

(t)|t|r 1/2 dt < .

Theorem 1 Consider weighted integration problem INT for bounded product


weights. Assume that
(t) c > 0 for t [a, b] for some a, b and c with a < b.
Then for both the absolute and normalized error criteria
d
INT is uniformly weakly tractable iff

lim

j=1 d, j
d

= 0 for all > 0.

Proof Lemma 1 implies that it is sufficient to prove that INT is T, -tractable for
every , (0, 1). Here T, is defined as in Sect. 3. From [4, Corollary 12.4] we
know that INT is T, -tractable iff the following two conditions hold:
ln 1
< ,
ln T, (1 , 1)
0
d
j=1 d, j
lim lim sup
< .
1 d
ln T, (1 , d)

(2)

lim sup

(3)

Since
ln 1
ln 1
=
lim
=0
0 ln T, ( 1 , 1)
0 + 1
lim

the first condition is satisfied for every , (0, 1) regardless of the choice of
weights . Note that for the second condition on T, -tractability we have the following equivalence:
d
lim lim sup

j=1 d, j

+ d

d
<

lim sup
d

j=1 d, j
d

< .

Uniform Weak Tractability of Weighted Integration

551

Therefore the weighted integration INT is uniformly weakly tractable iff


d
lim sup
d

j=1 d, j
d

< (0, 1).

(4)

Note that the last condition holds iff


d
lim

j=1 d, j
d

= 0 > 0.

(5)

Indeed, suppose that (4) holds. Obviously, it is enough to consider arbitrary


(0, 1). Then we take = /2, which also belongs to (0, 1), and
d
0 lim

j=1 d, j
d

1
= lim
d d

d

j=1 d, j
d

d

1
j=1 d, j
lim lim sup
= 0.
d d
d
d

Since (5) obviously implies (4) we have shown that the weighted integration INT
is uniformly weakly tractable iff the condition (5) is satisfied.

After obtaining a necessary and sufficient condition on uniform weak tractability
of the weighted integration INT it is interesting to compare it with conditions on
other types of tractability, which were obtained in [4, Corollary 12.4].
The weighted integration INT is :
strongly polynomially tractable

lim sup
d

d


d, j < ,

j=1

d
polynomially tractable

quasi-polynomially tractable

j=1

lim sup

ln d

d

j=1

lim sup
d

d

uniformly weakly tractable


weakly tractable

lim

lim

d, j
d, j

ln d

j=1 d, j
d

d

j=1

d, j

< ,
< ,

= 0 > 0,
= 0.

Note that depending on the weights , the weighted integration INT can satisfy
one or some types of tractability.
Let d, j =

1
j

for > 0. Then weighted integration INT is:

strongly polynomially tractable iff > 1,


polynomially tractable, but not strongly polynomially tractable, iff = 1,
weakly tractable, but not uniformly weakly tractable, if < 1.

552

P. Siedlecki

Let d, j = [ln ( j+1)]


for R. Then weighted integration INT is uniformly
j
weakly tractable, but not polynomially tractable.

5 Weighted Anchored Sobolev Spaces


In this section we specify the class Fd, as a weighted anchored Sobolev space of
functions f : [0, 1]d R that are once differentiable with respect to each variable.
More precisely, assume that a set of weights = {d,u }dN,u1,2,...,d , with d,u 0,
is given. Then Fd = H (K d ) is a reproducing kernel Hilbert space whose reproducing
kernel is of the form


d,u
R(x j , t j )
K d, (x, t) =
u{1,2,...,d}

ju

where
R(x, t) = 1 M (x, t) min(|x a|, |t a|)

for

x, y [0, 1],

for some a [0, 1] and


M = {(x, t) [0, 1]2 : (x a)(t a) 0}.
We assume that the weights are product weights, i.e., d, = 1 and
d,u =

d, j for non-empty u {1, 2, . . . , d}

ju

for non-negative d, j .
The weighted integration problem INT = {INTd, } is given as in [4, Sect. 12.6.1]:

INTd, : Fd, R : f 

[0,1]d

f (t)dt.

Theorem 2 Consider weighted integration problem INT for product weights. Then
for both the absolute and normalized error criteria
d
INT is uniformly weakly tractable iff

lim

j=1 d, j
d

= 0 for all > 0.

Proof Again, applying Lemma 1 it is enough to verify T, -tractability for all ,


(0, 1). From [4, Corollary 12.11] we know that conditions on T, -tractability of
the weighted integration INT have the same form as those used in the proof of

Uniform Weak Tractability of Weighted Integration

553

Theorem 1. Therefore we can repeat the reasoning used in the proof of Theorem 1 to
obtain the same condition on uniform weak tractability of the presently considered
weighted integration problem.


6 (s, t)-Weak Tractability with t > 1


As in [6], by (s, t)-weak tractability of the integration INT for positive s and t we
mean that
ln n(, INTd )
= 0.
lim
1
s + d t
+d
We now prove that (s, t)-weak tractability for any s > 0 and t > 1 holds for
weighted integration defined over quite general tensor product Hilbert spaces
equipped with bounded product weights . More precisely, let D be a Borel subset
of the
 real line R and : D R+ be a Lebesgue probability density function on
D, D (x)d x = 1. Let H (K ) be an arbitrary reproducing kernel Hilbert space of
integrable real functions defined on D with the kernel K : D D R such that

K (x, x)(x)d x < .

(6)

Let be a set of bounded product weights defined as in Sect. 4, see (1).


For d N and j = 1, 2, . . . , d, let
K 1,d, j (x, y) = 1 + d, j K (x, y) for x, y D
and
Fd, =

H (K 1,d, j ).

j=1

The weighted integration problem INT = {INTd, } is now given as



INTd, : Fd, R : f 

Dd

f (x1 , x2 , . . . , xd )(x1 )(x2 ) (xd )d x.

It is well known that


INTd,  =

d

j=1

1/2


1 + d, j

D2

K (x, t)(x)(t) d x dt

Hence, INTd,  1 and the absolute error criterion is harder than the normalized
error criterion.

554

P. Siedlecki

Theorem 3 Consider weighted integration problem INT for bounded product


weights. If s > 0 and t > 1 then for both the absolute and normalized error criteria INT = {INTd, } is (s, t)-weakly tractable.
Proof It is well known, see e.g. [4, p. 102], that


d

1
(1 + d, j
K (x, x)(x)d x)
n(, INT )
2
.
D
j=1

From this it follows that


0

lim

1 +d

2 ln 1
+
s + d t


lim

1 +d

lim

1 +d



2 ln 1
+
s + d t

ln n(, INT )

s + d t

K (x, x)(x)d x

 d
j=1

d, j

s + d t

D



K (x, x)(x)d x d
=0
s + d t

for every s > 0 and t > 1. Hence, we have (s, t)-weak tractability for INT .

From Theorems 1, 2 and 3 we see that strong polynomial, polynomial and weak
tractability for weighted integration requires some decay conditions on product
weights even for specific Hilbert spaces, whereas (s, t)- weak tractability for t > 1,
which is the weakest notion of tractability considered here, holds for all bounded
product weights and for general tensor product Hilbert spaces for which the univariate
reproducing kernel satisfies (6).
Acknowledgments I would like to thank Henryk Wozniakowski for his valuable suggestions. This
project was financed by the National Science Centre of Poland based on the decision number DEC2012/07/N/ST1/03200. I gratefully acknowledge the support of ICERM during the preparation of
this manuscript.

References
1. Bakhvalov, N.S.: On the optimality of linear methods for operator approximation in convex
classes of functions. USSR Comput. Math. Math. Phys. 11, 244249 (1971)
2. Gnewuch, M., Wozniakowski, H.: Quasi-polynomial tractability. J. Complex. 27, 312330
(2011)
3. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems, vol. I. European Mathematical Society, Zrich (2008)

Uniform Weak Tractability of Weighted Integration

555

4. E. Novak, H. Wozniakowski. Tractability of Multivariate Problems Volume II: Standard Information for Functionals. European Mathematical Society, Zrich (2010)
5. Siedlecki, P.: Uniform weak tractability. J. Complex. 29, 438453 (2013)
6. Siedlecki, P., Weimar, M.: Notes on (s, t)-weak tractability: a refined classification of problems
with (sub)exponential information complexity. J. Approx. Theory 200, 227258 (2015)

Incremental Greedy Algorithm


and Its Applications in Numerical
Integration
Vladimir Temlyakov

Abstract Applications of the Incremental Algorithm, which was developed in the


theory of greedy algorithms in Banach spaces, to approximation and numerical integration are discussed. In particular, it is shown that the Incremental Algorithm provides an efficient way for deterministic construction of cubature formulas with equal
weights, which give good rate of error decay for a wide variety of function classes.
Keywords Greedy algorithm Discrepancy Approximation

1 Introduction
The paper provides some progress in the fundamental problem of algorithmic construction of good methods of approximation and numerical integration. Numerical
integration seeks good ways of approximating an integral


f (x)d

by an expression of the form


m ( f, ) :=

m


j f ( j ), = ( 1 , . . . , m ), j ,

j = 1, . . . , m.

(1)

j=1

It is clear that we must assume that f is integrable and defined at the points 1 , . . . , m .
The expression (1) is called a cubature formula (, ) (if Rd , d 2) or a
quadrature formula (, ) (if R) with knots = ( 1 , . . . , m ) and weights
V. Temlyakov (B)
University of South Carolina, Columbia, SC, USA
e-mail: temlyakovv@gmail.com
V. Temlyakov
Steklov Institute of Mathematics, Moscow, Russia
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_30

557

558

V. Temlyakov

= (1 , . . . , m ). For a function class W we introduce a concept of error of the


cubature formula m (, ) by

m (W, ) := sup |
f W

f d m ( f, )|.

(2)

There are many different ways to construct good deterministic cubature formulas
beginning with heuristic guess of good knots for a specific class and ending with finding a good cubature formula as a solution (approximate solution) of the optimization
problem
m (W, ).
inf
1 ,..., m ;1 ,...,m

Clearly, the way of solving the above optimization problem is the preferable one.
However, in many cases this problem is very hard (see a discussion in [11]). It was
observed in [10] that greedy-type algorithms provide an efficient way for deterministic constructions of good cubature formulas for a wide variety of function classes.
This paper is a follow up to [10]. In this paper we discuss in detail a greedy-type
algorithmIncremental Algorithmthat was not discussed in [10]. The main advantage of the Incremental Algorithm over the greedy-type algorithms considered in [10]
is that it provides better control of weights of the cubature formula and gives the same
rate of decay of the integration error.
We remind some notations from the theory of greedy approximation in Banach
spaces. The reader can find a systematic presentation of this theory in [12], Chap. 6.
Let X be a Banach space with norm  . We say that a set of elements (functions) D
from X is a dictionary if each g D has norm less than or equal to one (g 1) and
the closure of D coincides with X . We note that in [9] we required in the definition
of a dictionary normalization of its elements (g = 1). However, it is pointed out
in [11] that it is easy to check that the arguments from [9] work under assumption
g 1 instead of g = 1. In applications it is more convenient for us to have an
assumption g 1 than normalization of a dictionary.
For an element f X we denote by Ff a norming (peak) functional for f :
F f  = 1,

F f ( f ) =  f .

The existence of such a functional is guaranteed by the Hahn-Banach theorem.


We proceed to the Incremental Greedy Algorithm (see [11] and [12], Chap. 6).
Let = {n }
n=1 , n > 0, n = 1, 2, . . . . For a Banach space X and a dictionary D
define the following algorithm IA() := IA(, X, D).
Incremental Algorithm with schedule (IA(, X, D)). Denote f 0i, := f and
i,
G 0 := 0. Then, for each m 1 we have the following inductive definition.
(1) mi, D is any element satisfying
i,
i, (
F fm1
m f ) m .

Incremental Greedy Algorithm and Its Applications in Numerical Integration

559

(2) Define
i,
i,
G i,
m := (1 1/m)G m1 + m /m.

(3) Let
f mi, := f G i,
m .
We show how the Incremental Algorithm can be used in approximation and
numerical integration. We begin with a discussion of the approximation problem. A
detailed discussion, including historical remarks, is presented in Sect. 2. For simplicity, we illustrate how the Incremental Algorithm works in approximation of univariate
trigonometric polynomials.
An expression
m


c j g j , g j D, c j R,

j = 1, . . . , m

j=1

is called m-term polynomial with respect to D. The concept of best m-term approximation with respect to D
m ( f, D) X :=

inf

{c j },{g j D}

f

m


c j g j X

j=1

plays an important role in our consideration.


By RT (N ) we denote the set of real 1-periodic trigonometric polynomials of
order N and by RT N denote the real trigonometric system
1, cos 2 x, sin 2 x, . . . , cos N 2 x, sin N 2 x.
For a real trigonometric polynomial denote
a0 +

N
N


(ak cos k2 x + bk sin k2 x) A := |a0 | +
(|ak | + |bk |).
k=1

k=1

We formulate here a result from [11]. We use the short notation   p := 


 L p ([0,1]) .
Theorem 1 There exists a constructive method A(N , m) such that for any t
RT (N ) it provides an m-term trigonometric polynomial A(N , m)(t) with the following approximation property
t A(N , m)(t) Cm 1/2 (ln(1 + N /m))1/2 t A
with an absolute constant C.

560

V. Temlyakov

An advantage of the IA() over other greedy-type algorithms is that the IA() gives
precise control of the coefficients of the approximant. For all approximants G i,
m we

=
1.
Moreover,
we
know
that
all
nonzero
coefficients
of
have the property G i,
m A
the approximant have the form a/m where a is a natural number. In Sect. 2 we prove
the following result.
Theorem 2 For any t RT (N ) the IA(, L p , RT N ) with an appropriate schedule , applied to f := t/t A , provides after m iterations an m-term trigonometric
polynomial G m (t) := G i,
m ( f )t A with the following approximation property
t G m (t) Cm 1/2 (ln N )1/2 t A , G m (t) A = t A ,
with an absolute constant C.
Comparing Theorems 1 and 2 we see that the error bound in Theorem 1 is better
than in Theorem 2ln(1 + N /m) versus lnN . It is important in applications in the
m-term approximation of smoothness classes. The proof of Theorem 1 is based on
the Weak Chebyshev Greedy Algorithm (WCGA). The WCGA is the most powerful
and the most popular in applications greedy-type algorithm. Its Hilbert space version
is known in signal processing under the name Weak Orthogonal Matching Pursuit.
For this reason for the readers convenience we discuss the WCGA in some detail in
Sect. 2 despite the fact that we do not obtain any new results on the WCGA in this
paper.
We note that the implementation of the IA() depends on the dictionary and the
ambient space X . The IA() from Theorem 2 acts with respect to the real trigonometric system 1, cos 2 x, sin 2 x, . . . , cos N 2 x, sin N 2 x in the space X = L p
with p  lnN . Relation p  lnN means that there are two positive constants C1 and
C2 , which do not depend on N , such that C1 N p C2 N .
We now proceed to results from Sect. 3 on numerical integration. As in [10] we
define a set Kq of kernels possessing the following properties. Let K (x, y) be
a measurable function on x y . We assume that for any x  x K (x, )
L q ( y ), for any y y the K (, y) is integrable over x and x K (x, )dx
L q ( y ), 1 q .
For a kernel K K p we define the class

W pK

:= { f : f =

K (x, y)(y)dy,  L p ( y ) 1}, 1 p .

Then each f W pK is integrable on x (by Fubinis theorem) and defined at


each point of x . We denote for convenience

J (y) := JK (y) :=

K (x, y)dx.

For p [1, ] denote the dual p := p/( p 1). Consider a dictionary

Incremental Greedy Algorithm and Its Applications in Numerical Integration

561

D := {K (x, ), x x }
and define a Banach space X (K , p ) as the L p ( y )-closure of span of D. In Sect. 3
the following theorem is proved.
Theorem 3 Let W pK be a class of functions defined above. Assume that K K p
satisfies the condition
K (x, ) L p ( y ) 1, x x , |x | = 1
and JK X (K , p ). Then for any m there exists (provided by an appropriate Incremental Algorithm) a cubature formula m (, ) with = 1/m, = 1, 2, . . . , m,
and
m (W pK , ) C( p 1)1/2 m 1/2 , 1 < p 2.
Theorem 3 provides a constructive way of finding for a wide variety of classes
W pK cubature formulas that give the error bound similar to that of the Monter Carlo
method. We stress that in Theorem 3 we do not assume any smoothness of the kernel
K (x, y).

2 Approximation by the Incremental Algorithm


First, we discuss the known Theorem 1 from the Introduction. The proof of Theorem
1 is based on a greedy-type algorithmthe Weak Chebyshev Greedy Algorithm. We
now describe it. Let := {tk }
k=1 be a given sequence of nonnegative numbers tk 1,
k = 1, . . . . We define (see [9]) the Weak Chebyshev Greedy Algorithm (WCGA)
that is a generalization for Banach spaces of Weak Orthogonal Greedy Algorithm
defined and studied in [8] (see also [12]).
Weak Chebyshev Greedy Algorithm (WCGA). We define f 0c := f 0c, := f .
Then for each m 1 we inductively define
(1) mc := mc, D is any element satisfying
c
c
(mc )| tm sup |F fm1
(g)|.
|F fm1

gD

(2) Define

m := m := span{ cj }mj=1 ,

and define G cm := G c,
m to be the best approximant to f from m .
(3) Denote
f mc := f mc, := f G cm .
The term weak in this definition means that at the step (1) we do not shoot for
the optimal element of the dictionary, which realizes the corresponding supremum,

562

V. Temlyakov

but are satisfied with weaker property than being optimal. The obvious reason for
this is that we do not know in general that the optimal one exists. Another, practical
reason is that the weaker the assumption the easier to satisfy it and, therefore, easier
to realize in practice.
We consider here approximation in uniformly smooth Banach spaces. For a
Banach space X we define the modulus of smoothness
(u) :=

sup

x=y=1

1
( (x + uy + x uy) 1).
2

The uniformly smooth Banach space is the one with the property
lim (u)/u = 0.

u0

It is well known (see for instance [3], Lemma B.1) that in the case X = L p ,
1 p < we have

u p/ p
if 1 p 2,
(u)
2
( p 1)u /2 if 2 p < .

(3)

Denote by A1 (D) := A1 (D, X ) the closure in X of the convex hull of D. The


following theorem from [9] gives the rate of convergence of the WCGA for f in
A1 (D).
Theorem 4 Let X be a uniformly smooth Banach space with the modulus of smoothness (u) u q , 1 < q 2. Then for t (0, 1] we have for any f A1 (D) that
p 1/ p
 f G c,
,
m ( f, D) C(q, )(1 + mt )

p :=

q
,
q 1

with a constant C(q, ) which may depend only on q and .


In [11] we demonstrated the power of the WCGA in classical areas of harmonic
analysis. The problem concerns the trigonometric m-term approximation in the uniform norm. The first result that indicated an advantage of m-term approximation with
respect to the real trigonometric system RT over approximation by trigonometric
polynomials of order m is due to Ismagilov [5]
m (| sin 2 x|, RT ) C m 6/5+ , for any > 0.

(4)

Maiorov [6] improved the estimate (4):


m (| sin 2 x|, RT )  m 3/2 .

(5)

Both R.S. Ismagilov [5] and V.E. Maiorov [6] used constructive methods to get
their estimates (4) and (5). V.E. Maiorov [6] applied number theoretical methods

Incremental Greedy Algorithm and Its Applications in Numerical Integration

563

based on Gaussian sums. The key point of that technique can be formulated in terms
of best m-term approximation of trigonometric polynomials. Let as above RT (N )
be the subspace of real trigonometric polynomials of order N . Using the Gaussian
sums one can prove (constructively) the estimate
m (t, RT ) C N 3/2 m 1 t1 , t RT (N ).

(6)

Denote as above
a0 +

N
N


(ak cos k2 x + bk sin k2 x) A := |a0 | +
(|ak | + |bk |).
k=1

k=1

We note that by the simple inequality


t A 2(2N + 1)t1 , t RT (N ),
the estimate (6) follows from the estimate
m (t, RT ) C(N 1/2 /m)t A , t RT (N ).

(7)

Thus (7) is stronger than (6). The following estimate was proved in [1]
m (t, RT ) Cm 1/2 (ln(1 + N /m))1/2 t A , t RT (N ).

(8)

In a way (8) is much stronger than (7) and (6). The proof of (8) from [1] is not
constructive. The estimate (8) has been proved in [1] with the help of a nonconstructive theorem of Gluskin [4]. In [11] we gave a constructive proof of (8). The key
ingredient of that proof is the WCGA. In the paper [2] we already pointed out that
the WCGA provides a constructive proof of the estimate
m ( f, RT ) p C( p)m 1/2  f  A ,

p [2, ).

(9)

The known proofs (before [2]) of (9) were nonconstructive (see discussion in [2],
Sect. 5). Thus, the WCGA provides a way of building a good m-term approximant.
However, the step (2) of the WCGA makes it difficult to control the coefficients of
the approximantthey are obtained through the Chebyshev projection of f onto
m . This motivates us to consider the IA() which gives explicit coefficients of the
approximant. We note that the IA() is close to the Weak Relaxed Greedy Algorithm (WRGA) (see [12], Chap. 6). Contrary to the IA(), where we build the
mth approximant G m as a convex combination of the previous approximant G m1
and the newly chosen dictionary element m with a priori fixed coefficients: G m =
(1 1/m)G m1 + m /m, in the WRGA we build G m = (1 m )G m1 + m m
with m [0, 1] chosen from an optimization problem, which depends on f and m.

564

V. Temlyakov

For more detailed comparison of the IA() and the WRGA in application in numerical
integration see [12], pp. 402403.
Second, we proceed to a discussion and proof of Theorem 2. In order to be able
to run the IA() for all iterations we need existence of an element mi, D at the
step (1) of the algorithm for all m. It is clear that the following condition guarantees
such existence.
Condition B. We say that for a given dictionary D an element f satisfies Condition
B if for all F X we have
F( f ) sup F(g).
gD

It is well known (see, for instance, [12], p. 343) that any f A1 (D) satisfies
Condition B. For completeness we give this simple argument here. Take any f
A1 (D). Then for any > 0 there exist g1 , . . . , g N D and numbers a1 , . . . , a N
such that ai > 0, a1 + + a N = 1 and
f

N


ai gi  .

i=1

Thus
F( f ) F + F(

N


ai gi ) F + sup F(g)


gD

i=1

which proves Condition B.


We note that Condition B is equivalent to the property f A1 (D). Indeed, as
we showed above, the property f A1 (D) implies Condition B. Let us show that
/ A1 (D) by the
Condition B implies that f A1 (D). Assuming the contrary f
separation theorem for convex bodies we find F X such that
F( f ) >

sup F() sup F(g)

A1 (D)

gD

which contradicts Condition B.


We formulate results on the IA() in terms of Condition B because in the application from Sect. 3 it is easy to check Condition B.
Theorem 5 Let X be a uniformly smooth Banach space with modulus of smoothness
(u) u q , 1 < q 2. Define
n := 1/q n 1/ p ,

p=

q
, n = 1, 2, . . . .
q 1

Then, for every f satisfying Condition B we have


 f mi,  C() 1/q m 1/ p ,

m = 1, 2 . . . .

Incremental Greedy Algorithm and Its Applications in Numerical Integration

565

In the case f A1 (D) this theorem is proved in [11] (see also [12], Chap. 6). As
we mentioned above Condition B is equivalent to f A1 (D).
We now give some applications of Theorem 5 in the construction of special polynomials. We begin with a general result.
Theorem 6 Let X be a uniformly smooth Banach space with modulus of smoothness
(u) u q , 1 < q 2. For any n elements 1 , 2 , . . . , n ,  j  1, j = 1, . . . , n,
there exists a subset [1, n] of cardinality || m < n and natural numbers a j ,
j such that


n
 aj
1
j  X C 1/q m 1/q1 ,
j
n j=1
m
j

a j = m.

Proof For a given set 1 , 2 , . . . , n consider a new Banach space X n := span(1 , 2 ,


. . . , n ) with norm   X . In the space X n consider the dictionary Dn := { j }nj=1 .
Then the space X n is a uniformly smooth
 Banach space with modulus of smoothness
(u) u q , 1 < q 2 and f := n1 nj=1 j A1 (Dn ). Applying the IA() to f
with respect to Dn we obtain by Theorem 5 after m iterations
f

m

1
jk  X C 1/q m 1/q1 ,
m
k=1

where jk is obtained at the kth iteration of the IA(). Clearly,



a
written in the form j mj j with || m.

m

1
k=1 m jk

can be


Corollary 1 Let m N and n = 2m. For any n trigonometric polynomials j


(0, ), there exist a set
RT (N ),  j  1, j = 1, . . . , n with N n b , b
and natural numbers a j , j , such that || m, j a j = m and


n
 aj
1
j  C(b)(ln m)1/2 m 1/2 .
j
n j=1
m
j

(10)

Proof First, we apply Theorem 6 with X = L p , 2 p < . Using (3) we get




n
 a j ( p)
1
j  p C p 1/2 m 1/2 ,
j
n j=1
m
j( p)

with |( p)| m.


j( p)

a j ( p) = m,

(11)

566

V. Temlyakov

Second, by the Nikolskii inequality (see [7], Chap. 1, S2): for a trigonometric
polynomial t of order N one has
t p C N 1/q1/ p tq ,
we obtain from (11)


C N 1/ p 

1 q < p ,

n
 a j ( p)
1
j 
j
n j=1
m
j( p)

n
 a j ( p)
1
j  p C p 1/2 N 1/ p m 1/2 .
j
n j=1
m
j( p)

Choosing p  lnN  lnm we obtain (10).

We note that Corollary 1 provides a construction of analogs of the Rudin-Shapiro


polynomials (see, for instance, [12], p.155) in a much more general situation than
in the case of the Rudin-Shapiro polynomials, albeit with a little bit weaker bound,
which contains an extra (lnm)1/2 factor.
Proof of Theorem 2. It is clear that it is sufficient to prove Theorem 2 for
t RT (N ) with t A = 1. Then t A1 (RT (N ), L p ) for all p [2, ). Now,
applying Theorem 6 and using its proof with X = L p , 1 , 2 , . . . , n , n = 2N + 1,
being the real trigonometric system 1, cos 2 x, sin 2 x, . . . , cos N 2 x, sin N 2 x,
we obtain that

 aj
j  p C 1/2 m 1/2 ,
a j = m,
(12)
t
m
j
j

a
where j mj j is the G i,
m (t). By (3) we find p/2. Next, by the Nikolskii
inequality we get from (12)
t

 aj
j

j  C N 1/ p t

 aj
j

j  p C p 1/2 N 1/ p m 1/2 .

Choosing p  lnN we obtain the desired in Theorem 2 bound.


We point out that the above proof of Theorem 2 gives the following statement.
Theorem 7 Let 2 p < . For any t RT (N ) the IA(, L p , RT N ) with an
appropriate schedule , applied to f := t/t A , provides after m iterations an mterm trigonometric polynomial G m (t) := G i,
m ( f )t A with the following approximation property
t G m (t) p Cm 1/2 p 1/2 t A , G m (t) A = t A ,
with an absolute constant C.

Incremental Greedy Algorithm and Its Applications in Numerical Integration

567

3 Numerical Integration and Discrepancy


For a cubature formula m (, ) we have

m (W pK , ) =

sup

 L p ( y ) 1

= J ()

J (y)

m



K ( , y) (y)dy| =

=1

m


K ( , ) L p ( y ) .

(13)

=1

Define the error of optimal cubature formula with m knots for a class W
m (W ) :=

inf

1 ,...,m ; 1 ,..., m

m (W, ).

The above identity (13) obviously implies the following relation.


Proposition 1
m (W pK )

inf

1 ,...,m ; 1 ,..., m

J ()

m


K ( , ) L p ( y ) .

=1

Thus, the problem of finding the optimal error of a cubature formula with m knots
for the class W pK is equivalent to the problem of best m-term approximation of a
special function J with respect to the dictionary D = {K (x, ), x x }.
Consider a problem of numerical integration of functions K (x, y), y y , with
respect to x, K Kq :

x

K (x, y)dx

m


K ( , y).

=1

Definition 1 (K , q)-discrepancy of a cubature formula m with knots 1 , . . . , m


and weights 1 , . . . , is

D(m , K , q) := 

K (x, y)dx

m


K ( , y) L q ( y ) .

=1

The above definition of the (K , q)-discrepancy implies right a way the following
relation.

568

V. Temlyakov

Proposition 2
inf

1 ,...,m ; 1 ,..., m

inf

1 ,...,m ; 1 ,..., m

D(m , K , q)

J ()

m


K ( , ) L q ( y ) .

=1

Therefore, the problem of finding minimal (K , q)-discrepancy is equivalent to


the problem of best m-term approximation of a special function J with respect to
the dictionary D = {K (x, ), x x }.

The particular case K (x, y) = [0,y] (x) := dj=1 [0,y j ] (x j ), y j [0, 1), j = 1,
. . . , d, where [0,y] (x), y [0, 1) is a characteristic function of an interval [0, y),
leads to a classical concept of the L q -discrepancy.
Proof of Theorem 3. By (13)
m (W pK , ) = J ()

m


K ( , ) L p ( y ) .

=1

We are going to apply Theorem 5 with X = X (K , p ) L p ( y ), f = JK . We need


to check the Condition B. Let F be a bounded linear functional on L p . Then by the
Riesz representation theorem there exists h L p such that for any L p

F() =

h(y)(y)dy.
y

By the Hlder inequality for any x x we have



y

|h(y)K (x, y)|dy h p .

Therefore, the functions |h(y)K (x, y)| and h(y)K (x, y) are integrable on x y
and by Fubinis theorem

F(JK ) =

h(y)
y

K (x, y)dx =


y

h(y)K (x, y)dy dx


=

F(K (x, y))dx sup F(K (x, y)),


xx

which proves the Condition B. Applying Theorem 5 and taking into account (3) we
complete the proof.
Proposition 2 and the above proof imply the following theorem on (K , q)discrepancy.

Incremental Greedy Algorithm and Its Applications in Numerical Integration

569

Theorem 8 Assume that K Kq satisfies the condition


K (x, ) L q ( y ) 1, x x , |x | = 1
and JK X (K , q). Then for any m there exists (provided by an appropriate Incremental Algorithm) a cubature formula m (, ) with = 1/m, = 1, 2, . . . , m,
and
D(m , K , q) Cq 1/2 m 1/2 , 2 q < .
We note that in the case X = L q ([0, 1]d ), q [2, ), D = {K (x, y), x [0, 1]d },
f = J (y) the implementation of the IA() is a sequence of maximization steps, when
we maximize functions of d variables. An important advantage of the L q spaces is a
simple and explicit form of the norming functional F f of a function f L q ([0, 1]d ).
The F f acts as (for real L q spaces)

F f (g) =

[0,1]d

 f q1q | f |q2 f gdy.

Thus the IA() should find at a step m an approximate solution to the following
optimization problem (over x [0, 1]d )

[0,1]d

i,
i,
| f m1
(y)|q2 f m1
(y)K (x, y)dy

max.

Acknowledgments Research was supported by NSF grant DMS-1160841.

References
1. DeVore, R.A., Temlyakov, V.N.: Nonlinear approximation by trigonometric sums. J. Fourier
Anal. Appl. 2, 2948 (1995)
2. Dilworth, S.J., Kutzarova, D., Temlyakov, V.N.: Convergence of some Greedy Algorithms in
Banach spaces. J. Fourier Anal. Appl. 8, 489505 (2002)
3. Donahue, M., Gurvits, L., Darken, C., Sontag, E.: Rate of convex approximation in non-Hilbert
spaces. Constr. Approx. 13, 187220 (1997)
4. Gluskin, E.D.: Extremal properties of orthogonal parallelpipeds and their application to the
geometry of Banach spaces. Math USSR Sbornik 64, 8596 (1989)
5. Ismagilov, R.S.: Widths of sets in normed linear spaces and the approximation of functions by
trigonometric polynomials, Uspekhi Mat. Nauk, 29 (1974), 161178; English transl. in Russian
Math. Surveys, 29 (1974)
6. Maiorov, V.E.: Trigonometric diameters of the Sobolev classes W pr in the space L q . Math.
Notes 40, 590597 (1986)
7. Temlyakov, V.N.: Approximation of Periodic Functions, Nova Science Publishers, Inc., New
York (1993)
8. Temlyakov, V.N.: Weak greedy algorithms. Adv. Comput. Math. 12, 213227 (2000)
9. Temlyakov, V.N.: Greedy algorithms in Banach spaces. Adv. Comput. Math. 14, 277292
(2001)

570

V. Temlyakov

10. Temlyakov, V.N.: Cubature formulas, discrepancy, and nonlinear approximation. J. Complex.
19, 352391 (2003)
11. Temlyakov, V.N.: Greedy-type approximation in Banach spaces and applications. Constr.
Approx. 21, 257292 (2005)
12. Temlyakov, V.N.: Greedy Approximation. Cambridge University Press, Cambridge (2011)

On Upper Error Bounds for Quadrature


Formulas on Function Classes
by K.K. Frolov
Mario Ullrich

Abstract This is a tutorial paper that gives the complete proof of a result of Frolov
(Dokl Akad Nauk SSSR 231:818821, 1976, [4]) that shows the optimal order
of convergence for numerical integration of functions with bounded mixed derivatives. The presentation follows Temlyakov (J Complex 19:352391, 2003, [13]),
see also Temlyakov (Approximation of periodic functions, 1993, [12]).
Keywords Frolov cubature Numerical Integration Sobolev space Tutorial

1 Introduction
We study cubature formulas for the approximation of the d-dimensional integral

I( f ) =

[0,1]d

f (x) dx

for functions f with bounded mixed derivatives. For this, let D f , Nd0 , be the
usual (weak) partial derivative of a function f and define the norm


 f 2s,mix :=

D f 2L 2 ,

(1)

Nd0 :  s

where s N. In the following we will study the class (or in fact the unit ball)
Hds,mix :=


f C sd ([0, 1]d ) :  f s,mix 1 ,

(2)

i.e. the closure in C([0, 1]d ) (with respect to  s,mix ) of the set of sd-times continuously differentiable functions f with  f s,mix 1. Note that these well-studied
classes of functions often appear with different notations, like M W2s , S2s W or S2s H .
M. Ullrich (B)
Johannes Kepler Universitt, 4040 Linz, Austria
e-mail: mario.ullrich@jku.at
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_31

571

572

M. Ullrich

Additionally, we will study the class


H ds,mix :=


f Hds,mix : supp( f ) (0, 1)d .

(3)

The algorithms under consideration are of the form


Qn ( f ) =

n


a j f (x j )

(4)

j=1
j

for a given set of nodes {x j }nj=1 , x j = (x1 , . . . , xd ) [0, 1]d , and weigths (a j )nj=1 ,
a j R, i.e. the algorithm Q n uses at most n function evaluations of the input function.
The worst case error of Q n in the function class H is defined as
e(Q n , H ) = sup |I ( f ) Q n ( f )|.
f H

We will prove the following theorem, which is Theorem 2 of [4].


Theorem 1 Let s, d N. Then there exists a sequence of algorithms (Q n )nN such
that
d1
e(Q n , H ds,mix ) Cs,d n s (log n) 2 ,
where Cs,d may depend on s and d.
Using standard techniques, see e.g. [11, Sect. 2.12] or [13, Theorem 1.1], one can
deduce (constructively) from the algorithm that is used to prove Theorem 1 a cubature
rule for the non-periodic classes Hds,mix that has the same order of convergence. More
precisely, one uses a properly chosen mapping, say M, which maps Hds,mix to H ds,mix
and preserves the integral. Then, the cubature rule applied to M f gives the optimal
order as long as M has bounded norm. Such mappings (in a more general setting)
will be analyzed in [8].
This results in the following corollary.
Corollary 1 Let s, d N. Then there exists a sequence of algorithms (Q n )nN such
that
s,d n s (log n) d1
2 ,
e(Q n , Hds,mix ) C
s,d may depend on s and d.
where C
The proof of Theorem 1, and hence also of Corollary 1, is constructive, i.e. we
will show how to construct the nodes and weights of the used algorithms.
Remark 1 The upper bounds of Theorem 1 and Corollary 1 that will be proven in the
next section for a specific algorithm, see (10), are best possible in the sense of the
order of convergence. That is, there are matching lower bounds that hold for arbitrary
cubature rules that use only function values, see e.g. [13, Theorem 3.2].

On Upper Error Bounds for Quadrature Formulas on Function Classes

573

s,mix
Remark 2 There is a natural generalization of the spaces H ds,mix , say H d,
p , where
the L 2 -norm in (1) is replaced by an L p -norm, 1 < p < . The same lower bounds
as mentioned in Remark 1 are valid also in this case, see [13, Theorem 3.2]. Obviously,
the upper bounds from Theorem 1 hold for these spaces if p 2, since the spaces get
smaller for larger p. For 1 < p < 2 it was proven by Skriganov [10, Theorem 2.1]
that the same algorithm satisfies the optimal order. We refer to [13] and references
therein for more details on this and the more delicate case p = 1.

Remark 3 Besides the cubature rule of Frolov that is analyzed in this paper, there
are several other constructions. Two prominent examples are the Smolyak algorithm
and (higher order) digital nets, see [9, Chap. 15] and [1], respectively. However, it is
proven that the Smolyak algorithm cannot achieve the optimal order of convergence
for the function classes under consideration, see [2, Theorem 5.2], and that the upper
bounds on the error for digital nets are (at the moment) restricted to small smoothness,
see e.g. [6]. In this sense Frolovs cubature is universal, i.e. the same cubature rule
gives the optimal order of convergence for every choice of the parameters s and d.
This is also true in the more general setting of Besov and Triebel-Lizorkin spaces,
see [14].

2 Proof of Theorem 1
2.1 The Algorithm
We start with the construction of the nodes of our cubature rule. See Sloan and Joe [11]
for a more comprehensive introduction to this topic. In the setting of Theorem 1 the
set X [0, 1)d of nodes will be a subset of a lattice X Rd , i.e. x, y X implies
x y X. In fact, we take all points inside the unit cube.
The lattice X will be d-dimensional, i.e. there exists a non-singular matrix
T Rdd such that


(5)
X := T (Zd ) = T x : x Zd .
The matrix T is called the generator of the lattice X. Obviously, every multiple of
X, i.e. cX for some c R, is again a lattice and note that while X is a lattice, it is
not necessarily an integration lattice, i.e. in general we do not have X Zd .
In the following we will fix a generator T and consider all points inside the cube
[0, 1)d of the shrinked lattice a 1 T (Zd ), a > 1, as nodes for our cubature rule for
functions from H ds,mix . That is, we will use the set of points


X ad := a 1 X [0, 1)d ,
where X is given by (5).

a > 1,

(6)

574

M. Ullrich

For the construction of the nodes it remains to present a specific generator matrix
T that is suitable for our purposes. For this, define the polynomials
d



Pd (t) :=
t 2 j + 1 1,

t R.

(7)

j=1

Obviously, the polynomial Pd has only integer coefficients, and it is easy to check
that it is irreducible1 (over Q) and has d different real roots. Let 1 , . . . , d R be
the roots of Pd . Using these roots we define the d d-matrix B by
d


d
j1
B = Bi, j i, j=1 := i

i, j=1

(8)

This matrix is a Vandermonde matrix and hence invertible and we define the generator
matrix of our lattice by
T = (B
)1 ,
(9)
where B
is the transpose of B. It is well known that X := B(Zd ) is the dual lattice
associated with X = T (Zd ), i.e. y X if and only if x, y Z for all x X.
We define the cubature rule for functions f from H ds,mix by
Q a ( f ) := a d det(T )

f (x),

a > 1.

(10)

xX ad

In the next subsection we will prove that Q a has the optimal order of convergence
for H ds,mix .
Note that Q a ( f ) uses |X ad | function values of f and that the weights of this
algorithm are equal, but do not (in general) sum up to one, i.e. Q a is not a quasiMonte Carlo method. While the number |X ad | of points can be estimated in terms of the
determinant of the corresponding generator matrix, it is in general not equal. In fact, if
a 1 X would be an integration lattice, then it is well known that |X ad | = a d det(T 1 ),
see e.g. [11]. For the general lattices that we consider, we know, however, that these
numbers are of the same order, see Skriganov [10, Theorem 1.1].2
Lemma 1 Let X = T (Zd ) Rd be a lattice with generator T of the form (9), and
let X ad be given by (6). Then there exists a constant C T that is independent of a such
that

d


|X | a d det(T 1 ) C T lnd1 1 + a d
a
polynomial P is called irreducible over Q if P = G H for two polynomials G, H with rational
coefficients implies that one of them has degree zero. This implies that all roots of P must be irra
tional. In fact, every polynomial of the form dj=1 (x b j ) 1 with different b j Z is irreducible,
but has not necessarily d different real roots.
2 Skriganov proved this result for admissible lattices. The required property will be proven in
Lemma 3, see also [10, Lemma 3.1(2)].
1A

On Upper Error Bounds for Quadrature Formulas on Function Classes

575

for all a > 1. In particular, we have


lim

|X ad |
= 1.
a d det(T 1 )

Remark 4 It is still not clear if the corresponding QMC algorithm, i.e. the cubature
rule (10) with a d det(T ) replaced by |X ad |1 , has the same order of convergence. If
true, this would imply the optimal order of the L p -discrepancy, p < , of a (deterministic) modification of the set X ad , see [5, 10]. We leave this as an open problem. In
fact, Skriganov [10, Corollary 2.1] proved that for every a > 0 there exists a vector
z a Rd such that the translated set X ad z a satisfies the above conditions.
In the remaining subsection we prove the crucial property of these nodes. For
this we need the following corollary of the Fundamental Theorem of Symmetric
Polynomials, see, [3, Theorem 6.4.2].

Lemma 2 Let P(x) = dj=1 (x j ) and G(x1 , . . . , xd ) be polynomials with integer coefficients. Additionally, assume that G(x1 , . . . , xd ) is symmetric in x1 , . . . , xd ,
i.e. invariant under permutations of x1 , . . . , xd . Then, G(1 , . . . , d ) Z.
We obtain that the elements of the dual lattice B(Zd ) satisfy the following.

Lemma 3 Let 0 = z = (z 1 , . . . , z d ) B(Zd ) with B from (8). Then, dj=1 z i
Z \ 0.
Proof Fix m = (m 1 , . . . , m d ) Zd such that Bm = z. Hence,
zi =

d


m j i

j1

j=1


depends only on i . This implies that dj=1 z i is a symmetric polynomial in 1 , . . . , d

with integer coefficients. By Lemma 2, we have dj=1 z i Z.
to prove z i = 0 for i = 1, . . . , d. Define the polynomial R1 (x) :=
dIt remains
j1
m
x
and assume that z  = R1 ( ) = 0 for some  = 1, . . . , d. Then there
j
j=1
exist unique polynomials G and R2 with rational coefficients such that
Pd (x) = G(x)R1 (x) + R2 (x),
where degree(R2 ) < degree(R1 ). By assumption, R2 ( ) = 0. If R2 0 this is a
contradiction to the irreducibility of Pd . If not, divide Pd by R2 (instead of R1 ).
Iterating this procedure, we will eventually find a polynomial R with degree(R ) >
0 (since it has a root) and rational coefficients that divides Pd : a contradiction to the
irreducibility. This completes the proof of the lemma.

We finish the subsection with a result on the maximal number of nodes in the dual
lattice that lie in an axis-parallel box of fixed volume.

576

M. Ullrich

Corollary 2 Let B be the matrix from (8) and a > 0. Then, for each axis-parallel
box Rd we have


a B(Zd ) a d vold () + 1.
Proof Assume first that vold () < a d . If contains 2 different points z, z 
a B(Zd ), then, using that this implies z  = z z  a B(Zd ), we obtain
vold ()

|z i z i | =

i=1

|z i | a d

i=1

from Lemma 3: a contradiction. For vold () a d we divide along one coordinate


in a d vold () + 1 equal pieces, i.e. pieces with volume less than a d , and use the
same argument as above.

Remark 5 Although we focus in the following on the construction of nodes that
is based on the polynomial Pd from (7), the same construction works with any
irreducible polynomial of degree d with d different real roots and leading coefficient
1, cf. [12, Section 4.4]. For example, if the dimension is a power of 2, i.e. d = 2k for
some k N, we can be even more specific. In this case we can choose the polynomial


Pd (x) = 2 cos d arccos(x/2) ,
cf. the Chebyshev polynomials. The roots of this polynomial are given by

(2i 1)
,
i = 2 cos
2d


i = 1, . . . , d.

Hence, the construction of the lattice X that is based on this polynomial is completely
explicit. For a suitable polynomial if 2d + 1 is prime, see [7]. We didnt try to find
a completely explicit construction in the intermediate cases.

2.2 The Error Bound


In this subsection we prove that the algorithm Q a from (10) has the optimal order of
convergence for functions from H ds,mix , i.e. that
d1
e( Q a , H ds,mix ) Cs,d n s (log n) 2 ,

where n = n(a, T ) := |X ad | is the number of nodes used by Q a and Cs,d is independent of n.

On Upper Error Bounds for Quadrature Formulas on Function Classes

577

For this we need the following two lemmas. Recall that the Fourier transform of
an integrable function f L 1 (Rd ) is given by
f(y) :=
with y, x :=
s (y) =

d
j=1

Rd

f (x) e2 i y,x dx,

y Rd ,

y j x j . Furthermore, let

 s
d


j=1


|2 y j |

2

=0

|2 y j |2 j ,

y Rd .

(11)

Nd0 :  s j=1

Clearly,


s (y)| f(y)|2 =

Nd0 :  s

2



d


j
2 i y,x


(2
i
y
)
f
(x)
e
dx
j

d

R j=1

2


D f (y)

Nd0 :  s

for all f H ds,mix .


Throughout the rest of the paper we study only functions from H ds,mix . Since their
supports are contained strictly inside the unit cube, we can identify each function
f H ds,mix with its continuation to the whole space by zero, i.e. we set f (x) = 0 for
x
/ [0, 1]d .
We begin with the following result on the sum of values of the Fourier transform.
1

Lemma 4 Let B Rdd be an invertible


matrix, T = (B
) and define the number


d

d
d
M B := # m Z : B ([0, 1] ) m + (0, 1) = . Then, for each f H ds,mix ,
s N, we have



MB
 f 2s,mix .
s (z)| f(z)|2
det(B)
d
zB(Z )

Proof Let s := { Nd0 :  s} and define the function


g(x) :=

f (T (m + x)),

x [0, 1]d .

mZd

Clearly, at most M B of the summands are not zero and g is 1-periodic. Hence, we
obtain by Parsevals identity and Jensens inequality that

578

M. Ullrich


s (z)| f(z)|2 =

2

  
f (z) =


D

s zB(Zd )

zB(Zd )

  

= det(T )2

s yZd

Rd

2

D f (x) e2 i By,x dx

2

D f (T x) e2 i y,x dx

s yZd

Rd

s yZd

[0,1]d


2


   

2

2
i
y,x


= det(T )
D
f
(T
(m
+
x))
e
dx


d
[0,1]


s yZd mZd



 
2

= det(T )2
D g(x) e2 i y,x dx

= det(T )2



[0,1]d

= det(T )2 M B2
det(T )2 M B2
= det(T )2 M B



D g(x) 2 dx


s

[0,1]d

[0,1]d




Rd


2


1 


dx
D
f
(T
(m
+
x))
M

B mZd

2
1 
D f (T (m + x)) dx
MB
d
mZ



D f (T x) 2 dx = det(T ) M B  f 2
s,mix

as claimed.

Additionally, we need the following version of the Poisson summation formula


for lattices.
Lemma 5 Let X = T (Zd ) Rd be a full-dimensional lattice and X Rd be the
associated dual lattice. Additionally, let f H ds,mix , s N. Then,
det(T )

f (x) =

f(y).

yX

xX[0,1)d

In particular, the right-hand-side is convergent.


Proof Let g(x) = f (T x), x Rd . Then, by the definition of the lattice, we have

xX[0,1)d

f (x) =


xX

f (x) =


xZd

f (T x) =

g(x).

xZd

Additionally, note that B = (T


)1 is the generator of X and hence

On Upper Error Bounds for Quadrature Formulas on Function Classes




f(y) =

yX



f(By) =

yZd

= det(T )
= det(T )

yZd


yZd

Rd

f (x) e2 i By,x dx =

yZd

f (T z) e2 i y,z dz = det(T )

Rd



yZd

579

f (x) e2 i y,B

Rd

dx

g(z) e2 i y,z dz

Rd

g(y),

yZd

where we performed the substitution x = T z. (Here, we need that the lattice is fulldimensional.) In particular, the series on the left hand side converges if and only
if the right hand side does. For the proof of this convergence note that f H ds,mix ,
s 1, implies g1,mix gs,mix < . We obtain by Lemma 4 that


2
1 (y)|g(y)|

M B g21,mix <

yZd

with M B from Lemma 4, since supp(g) T 1 ([0, 1]d ) = B


([0, 1]d ). Hence,


1/2
1/2



2
|g(y)|


|1 (y)|1
1 (y) |g(y)|

< ,
y=0

yZd

y=0

which proves the convergence. We finish the proof of Lemma 5 by




g(y)

yZd


yZd

g(z) e2 i y,z dz

Rd



yZd

[0,1]d

g(m + z) e2 i y,z dz =

mZd

g(m).

mZd

The
 last equality is simply d the evaluation of the Fourier series of the function
mZd g(m + x), x [0, 1] , at the point x = 0. It follows from the absolute convergence of the left hand side that this Fourier series is pointwise convergent.

By Lemma 5 we can write the algorithm Q a , a > 1, as
Q a ( f ) = a d det(T )


xX ad

f (x) =

f(z),

f H ds,mix ,

za B(Zd )

where a B (see (8)) is the generator of the dual lattice of a 1 T (Zd ) (see (9)) and
X ad = (a 1 X) [0, 1)d . Since I ( f ) = f(0) we obtain

580

M. Ullrich












|s (z)|1/2 s (z)1/2 f(z)


|I ( f ) Q a ( f )| =
f (z)
za B(Zd )\0

za B(Zd )\0

1/2
1/2



|s (z)|1
s (z) | f(z)|2 .
za B(Zd )\0

za B(Zd )\0

with s from (11). We bound both sums separately. First, note that Lemma 4 implies
that

s (z) | f(z)|2 C(a, B)  f 2s,mix
za B(Zd )\0

with C(a, B) := det(a B)1 Ma B . Using that B


([0, 1]d ) is Jordan measurable, we
obtain lima C(a, B) = 1 and, hence, for a > 1 large enough,


s (z) | f(z)|2 2 f 2s,mix .

(12)

za B(Zd )\0

This follows from the fact that Ma B is the number of unit cubes that are necessary
to cover the set a B
([0, 1]d ), and det(a B) is its volume.
Now we treat the first sum. Define, for m = (m 1 , . . . , m d ) Nd0 , the sets
(m) := {x Rd : 2m j 1  |x j | < 2m j for j = 1, . . . , d}.

and note that dj=1 |x j | < 2m1 for all x (m). Recall from Lemma 3 that
d
d
d
d
j=1 z j Z \ 0 for all z B(Z ) \ 0 and, consequently,
j=1 |z j | a for z
d
d
d
a B(Z ) \ 0. This shows that |(a B(Z ) \ 0) (m)| = 0 for all m N0 with m1 <

d log2 (a) =: r . Hence, with |z | := dj=1 max{1, 2 |z j |}, we obtain

za B(Zd )\0

|s (z)|1


za B(Zd )\0

|z |2s =

|z |2s .

=r m:m1 = z(a B(Zd )\0)(m)


Note that for z (m) we have |z | dj=1 max{1, 2 2m j 1 } 2m1 . Since (m)
is a union of 2d axis-parallel boxes each with volume less than 2m1 , Corollary 2
implies that (a B(Zd ) \ 0) (m) 2d (a d 2m1 + 1) 2d+2 2m1 r for m with



m1 r . Additionally, note that {m Nd0 : m1 = } = d+1
< ( + 1)d1 .

We obtain

On Upper Error Bounds for Quadrature Formulas on Function Classes

|s (z)|1

581



(a B(Zd ) \ 0) (m) 22sm1

=r m:m1 =

za B(Zd )\0

2d+2

2m1 r 22sm1

=r m:m1 =

2d+2



( + 1)d1 2r 22s = 2d+2
(t + r + 1)d1 2t 22s(t+r )
=r

< 2d+2 22sr


 d1 
log2 a d
1+
t=0


 d1
2d+2 a 2sd log2 a d

t=0

t +2
d log2 (a)

d1

2(12s)t

e(t+2)/ log2 (a) 2(12s)t

t=0

where we have used that d log2 (a) r < d log2 (a) + 1. Clearly, the last series converges iff a > e1/(2s1) and, in particular, it is bounded by 23 for a e2 and all
s N.
So, all together

  d1
e( Q a , H ds,mix ) 2d/2+3 a sd log2 a d 2

(13)

for a > 1 large enough. From Lemma 1 we know that the number of nodes used by
Q a is proportional to a d . This proves Theorem 1.
Remark 6 It is interesting to note that the proof of Theorem 1 is to a large extent
independent of the domain of integration. For an arbitrary Jordan measurable set
Rd we can consider
Q a from (10) with the set of nodes X ad
 1the algorithm

d
d
replaced by X a () = a T (Z ) . The only difference in the estimates would
be that C(a, B), cf. (12), converges to vold () instead of 1.

References
1. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy theory and quasi-Monte
Carlo integration. Cambridge University Press, Cambridge (2010)
2. Dung, D., Ullrich, T.: Lower bounds for the integration error for multivariate functions
with mixed smoothness and optimal Fibonacci cubature for functions on the square, Math.
Nachrichten (2015) (to appear)
3. Fine, B., Rosenberger, G.: The fundamental theorem of algebra. Springer-Verlag, New York,
Undergraduate Texts in Mathematics (1997)
4. Frolov, K.K.: Upper error bounds for quadrature formulas on function classes. Dokl. Akad.
Nauk SSSR 231, 818821 (1976)
5. Frolov, K.K.: Upper bound of the discrepancy in metric L p , 2 p < . Dokl. Akad. Nauk
SSSR 252, 805807 (1980)

582

M. Ullrich

6. Hinrichs, A., Markhasin, L., Oettershagen, J., Ullrich, T.: Optimal quasi-Monte Carlo rules
on higher order digital nets for the numerical integration of multivariate periodic functions.
e-prints (2015)
7. Lee, C.-L., Wong, K.B.: On Chebyshevs polynomials and certain combinatorial identities.
Bull. Malays. Math. Sci. Soc. 2(34), 279286 (2011)
8. Nguyen, V.K., Ullrich, M. Ullrich, T.: Change of variable in spaces of mixed smoothness and
numerical integration of multivariate functions on the unit cube (2015) (preprint)
9. Novak, E., Wozniakowaski, H.: Tractability of Multivariate Problems, Volume II: Standard
Information for Functionals EMS Tracts in Mathematics, Vol. 12, Eur. Math. Soc. Publ. House,
Zrich (2010)
10. Skriganov, M.M.: Constructions of uniform distributions in terms of geometry of numbers.
Algebra i Analiz 6, 200230 (1994)
11. Sloan, I.H., Joe, S.: Lattice Methods for Multiple Integration. Oxford Science Publications,
New York (1994)
12. Temlyakov, V.N.: Approximation of Periodic Functions. Computational Mathematics and
Analysis Series. Nova Science Publishers Inc, NY (1993)
13. Temlyakov, V.N.: Cubature formulas, discrepancy, and nonlinear approximation. J. Complex.
19, 352391 (2003)
14. Ullrich, M., Ullrich, T.: The role of Frolovs cubature formula for functions with bounded
mixed derivative, SIAM J. Numer. Anal. 54(2), 969993 (2016)

Tractability of Function Approximation


with Product Kernels
Xuan Zhou and Fred J. Hickernell

Abstract This article studies the problem of approximating functions belonging to


a Hilbert space Hd with a reproducing kernel of the form
d (x, t) :=
K

d



1 2 + 2 K  (x , t ) for all x, t Rd .
=1

The  [0, 1] are scale parameters, and the  > 0 are sometimes called shape parameters. The reproducing kernel K corresponds to some Hilbert space of functions
d generalizes the anisotropic Gaussian reproducing kerdefined on R. The kernel K
nel, whose tractability properties have been established in the literature. We present
sufficient conditions on {  }
=1 under which function approximation problems on
Hd are polynomially tractable. The exponent of strong polynomial tractability arises
from bounds on the eigenvalues of positive definite linear operators.
Keywords Function approximation Tractability Product kernels

1 Introduction
This article addresses the problem of function approximation. In a typical application
we are given data of the form yi = f (x i ) or yi = L i ( f ) for i = 1, . . . , n. That is,
a function f is sampled at the locations {x 1 , . . . , x n }, usually referred to as the data

X. Zhou (B) F.J. Hickernell


Department of Applied Mathematics, Illinois Institute of Technology,
Room E1-232, 10 W. 32nd Street, Chicago, IL 60616, USA
e-mail: xzhou23@hawk.iit.edu
F.J. Hickernell
e-mail: hickernell@iit.edu
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_32

583

584

X. Zhou and F.J. Hickernell

sites or the design, or more generally we know the values of n linear functionals
L 1 , . . . , L n applied to f . Here we assume that the domain of f is a subset of Rd . The
goal is to construct An f , a good approximation to f that is inexpensive to evaluate.
Algorithms for function approximation based on symmetric positive definite kernels have arisen in both the numerical computation literature [3, 5, 13, 18], and the
statistical learning literature [1, 4, 7, 12, 1417]. These algorithms go by a variety
of names, including radial basis function methods [3], scattered data approximation
[18], meshfree methods [5], (smoothing) splines [17], kriging [15], Gaussian process
models [12] and support vector machines [16].
Many kernels commonly used in practice are associated with a sequence of shape
parameters = { }
=1 , which allows more flexibility in the function approximation problem. Examples of such kernels include the Matrn, the multiquadrics, the
inverse multiquadrics, and the extensively studied Gaussian kernel (also known as the
squared exponential kernel). The anisotropic stationary Gaussian kernel, is given by
d (x, t) := e12 (x1 t1 )2 d2 (xd td )2 =
K

d


e (x t )
2

for all x, t Rd ,

(1)

=1

where  is a positive shape parameter corresponding to the variable x . Choosing a


small  has a beneficial effect on the rate of decay of the eigenvalues of the Gaussian
kernel. The optimal choice of  is application dependent and much work has been
spent on the quest for the optimal shape parameter. Note that taking  = for all 
recovers the isotropic Gaussian kernel.
For the Gaussian kernel (1), convergence rates with polynomial tractability results
are established in [6]. These rates are summarized in Table 1. Note that the error of
an algorithm An in this context is the worst case error based on the following L2
criterion:

e

wor

(An ) :=

sup

 f Hd 1

 f An f L2 ,

 f L2 :=

1/2
f (t) d (t) dt
2

Rd

, (2)

where d is a probability density function with independent marginals, namely


d (x) = 1 (x1 ) 1 (xd ). For real q, the notation  n q (with n implied)
means that for all > 0 the quantity is bounded above by C n q+ for all n > 0,
where C is some positive constant that is independent of the sample size, n, and the
dimension, d, but may depend on . The notation  n q is defined analogously, and
means that the quantity is bounded from below by C n q for all > 0. The notation
n q means that the quantity is both  n q and  n q . The term r ( ) appearing in
Table 1 denotes the rate of convergence to zero of the shape parameter sequence
and is defined by

Tractability of Function Approximation with Product Kernels

585

Table 1 Error decay rates for the Gaussian kernel as a function of sample size n
Data available
Absolute error criterion
Normalized error criterion
Linear functionals
Function values

n max(r ( ),1/2)
 n max(r ( )/[1+1/(2r ( ))],1/4)

n r ( ) , if r ( ) > 0
 n r ( )/[1+1/(2r ( ))] , if r ( ) > 1/2

1/
r ( ) := sup > 0

 < .

(3)

=1

The kernel studied in this article has the more general product form given below:
d,, (x, t) :=
d (x, t) = K
K

d


 , (x , t ) for all x, t Rd ,


K

(4)

=1

where 0  1,  > 0 and


, (x, t) := 1 2 + 2 K (x, t),
K

x, t R.

(5)

We assume that we know the eigenpair expansion of the kernel K for univariate
functions in terms of its shape parameter . Many kernels in the numerical integration
and approximation literature take the form of (4), where  governs the vertical scale
of the kernel across the th dimension. In particular, taking  = 1 for all  and
K (x, t) = exp( 2 (x t)2 ) recovers the anisotropic Gaussian kernel (1).
The goal of this paper is to extend the results in Table 1 to the kernel in (4). In
essence we are able to replace r ( ) by r (, ), defined as




1/

(  ) < = r {  }N ,
r (, ) := sup > 0

(6)

=1

with the convention that the supremum of the empty set is taken to be zero.
The known eigenpair expansion of K does not give us explicit formulae for the
, is a convex
, . However, since K
eigenvalues and eigenfunctions of the kernel K
combination of the constant kernel and a kernel with a known eigenpair expansion,
, by approximatwe can derive upper and lower bounds on the eigenvalues of K
ing the corresponding linear operators by finite rank operators and applying some
inequalities for eigenvalues of matrices. These bounds then imply bounds on the
d , which is of tensor product form. Bounds on the eigenvalues of
eigenvalues of K

K d lead to tractability results for function approximation on Hd .

586

X. Zhou and F.J. Hickernell

2 Function Approximation
2.1 Reproducing Kernel Hilbert Spaces
d ) denote a reproducing kernel Hilbert space of real functions
Let Hd = H ( K
d
defined on R . The goal is to approximate any function in Hd given a finite number
d : Rd Rd R is symmetric and positive
of data. The reproducing kernel K
definite. It takes the form (4), where K satisfies the unit trace condition:

R

K (t, t) 1 (t) dt = 1 for all > 0.

(7)

This condition implies that Hd is continuously embedded in the space L2 =


L2 (Rd , d ) of square Lebesgue integrable functions, where the L2 norm was defined
in (2). Continuous embedding means that Id f L2 =  f L2 Id   f Hd for all
f Hd .
Functions in Hd are approximated by linear algorithms of the form
(An f ) (x) :=

L j ( f )a j (x) for all f Hd , x Rd ,

j=1

for some continuous linear functionals L j Hd , and functions a j L2 . Note that


for known functions a j , the cost of computing An f is O(n), if we do not consider
the cost of generating the data samples L j ( f ). The linear functionals, L j , used by an
algorithm An may either come from the class of arbitrary bounded linear functionals,
all = Hd , or from the class of function evaluations, std . The nth minimal worst
case error over all possible algorithms is defined as
ewor (n, Hd ) :=

inf

An with L j

ewor (An ) {std, all}.

2.2 Tractability
While typical numerical analysis focuses on the rate of convergence, it does not take
into consideration the effects of d. The study of tractability arises in informationbased complexity and it considers how the error depends on the dimension, d, as
well as the number of data, n.
In particular, we would like to know how ewor (n, Hd ) depends not only on n
but also on d. Because of the focus on d-dependence, the absolute and normalized
error criteria mentioned in Table 1 may lead to different answers. For a given positive
(0, 1) we want to find an algorithm An with the smallest n for which the error does

Tractability of Function Approximation with Product Kernels

587

not exceed for the absolute error criterion, and does not exceed ewor (0, Hd ) =
Id  for the normalized error criterion. That is,

n

wor

(, Hd ) = min n | e

wor

,
= abs,
(n, Hd )
Id , = norm,


.

Let I = {Id }dN denote the sequence of function approximation problems. We


say that I is polynomially tractable if and only if there exist numbers C, p and q
such that
n wor (, Hd ) C d q p for all d N and (0, 1).

(8)

If q = 0 above then we say that I is strongly polynomially tractable and the infimum
of p satisfying the bound above is the exponent of strong polynomial tractability.
The essence of polynomial tractability is to guarantee that a polynomial number
of linear functionals is enough to solve the function approximation problem up to an
error at most . Obviously, polynomial tractability depends on which class, all or
std , is considered and whether the absolute or normalized error is used.
The property of strong polynomial tractability is especially challenging since then
the number of linear functionals needed for an -approximation is independent of d.
Nevertheless, we provide here positive results on strong polynomial tractability.

3 Eigenvalues for the General Kernel


d as
Let us define the linear operator corresponding to any kernel K

Wf =

Rd

d (, t)d (t) dt for all f Hd .


f (t) K

d is a positive definite
It is known that W is self-adjoint and positive definite if K
kernel. Moreover (7) implies that W is compact. Let us define the eigenpairs of W
by (d, j , d, j ), where the eigenvalues are ordered, d,1 d,2 , and
W d, j = d, j d, j with d, j , d,i Hd = i, j for all i, j N.
Note also that for any f Hd we have
f, d, j L2 = d, j f, d, j Hd .
Taking f = d,i we see that {d, j } is a set of orthogonal functions in L2 . Letting
1/2

d, j = d, j d, j for all j N,

588

X. Zhou and F.J. Hickernell

we obtain an orthonormal sequence {d, j } in L2 . Since {d, j } is a complete


orthonormal basis of Hd , we have

d (x, t) =
K

d, j (x) d, j (t) =

j=1

d, j d, j (x) d, j (t) for all x, t Rd .

j=1

To standardize the notation, we shall always write the eigenvalues of the linear
d,, in (4) in a weakly decreasing order
operator corresponding to the kernel K
d,, ,1 d,, ,2 . We drop the dependency on the dimension d to denote the
,
eigenvalues of the linear operator corresponding to the one-dimensional kernel K
in (5) by , ,1 , ,2 . Similarly the eigenvalues of the linear operator corresponding to the one-dimensional kernel K (x, t) are denoted by ,1 ,2 .
A useful relation between the sum of the th power of the multivariate eigenvalues
d,, , j and the sums of the th powers of the univariate eigenvalues , , j is given
by [6, Lemma 3.1]:


j=1

d,,
,j

=
, , j
=1

> 0.

j=1

We are interested in the high dimensional case where d is large, and we want
to establish convergence and tractability results when  and/or  tend to zero
as  . According to [10], strong polynomial tractability holds if the sum of
some powers of eigenvalues are bounded. The following lemma provides us with
some useful inequalities on eigenvalues of the linear operators corresponding to
reproducing kernels.
Lemma 1 Let H (K A ), H (K B ), H (K C ) L2 (R, 1 ) be Hilbert spaces with
symmetric positive definite reproducing kernels K A , K B and K C such that

R

K (t, t)1 (t) dt < , {A, B, C},

(9)

and K C = a K A + bK B , a, b 0. Define the linear operators W A , W B , and WC by



W f =

f (t)K (, t)1 (t) dt, for all f H (K ), {A, B, C}.

Let the eigenvalues of the operators be sorted in a weakly decreasing order, i.e.
,1 ,2 . Then these eigenvalues satisfy
C,i+ j+1 a A,i+1 + b B, j+1 , i, j = 1, 2, . . .

(10)

C,i max(a A,i , b B,i ), i = 1, 2, . . .

(11)

Tractability of Function Approximation with Product Kernels

589

Proof Let {u j } jN be any orthonormal basis in L2 (R, 1 ). We assign the orthogonal


projections Pn given by
Pn x =

n

x, u j u j , x L2 (R, 1 ).
j=1

Since W A is compact due to (9), it can be shown that (I Pn )W A  0 as n ,


where the operator norm
(I Pn )W A  := sup (I Pn )W A xL2 (R,1 ) .
x1

Furthermore [11, Lemma 11.1 (O S2 )] states that for every pair T1 , T2 : X Y of


compact operators we have |s j (T1 )s j (T2 )| T1 T2 , j N, where the singular
values s j (Tk ), k = 1, 2 are the square rootsof the eigenvalues j (Tk Tk ) arranged in
a weakly decreasing order, thus s j (Tk ) = j (Tk Tk ). Now we can bound
|s j (W A ) s j (Pn W A Pn )| |s j (W A ) s j (Pn W A )| + |s j (Pn W A ) s j (Pn W A Pn )|
W A Pn W A  + Pn W A Pn W A Pn 
(I Pn )W A  + W A (I Pn ) 0
as n . Thus the eigenvalues Pn W A Pn , j W A , j for all j as n . Similarly
this applies to the operators W B and WC . Note that we have
Pn WC Pn = a Pn W A Pn + b Pn W B Pn
and these finite rank operators correspond to self-adjoint matrices. These matrices
are symmetric and positive definite because the kernels are symmetric and positive
definite. The inequalities (10) are found by Weyl (see [8]) and (11) are a direct result
of [2, Fact 8.19.4]. Since (10) and (11) hold for the eigenvalues of symmetric positive
definite matrices, they also hold for the operators corresponding to symmetric and
positive definite kernels.

We are now ready to present the main results of this article in the following two
sections.

4 Tractability for the Absolute Error Criterion


We now consider the function approximation problem for Hilbert spaces Hd =
d ) with a general kernel using the absolute error criterion. From the discussion
H (K
of eigenvalues in the previous section and from (7) it follows that

590

X. Zhou and F.J. Hickernell

, j =

j=1


R

K (t, t)1 (t) dt = 1,

> 0.

(12)

We want to verify whether polynomial tractability holds, namely whether (8) holds.

4.1 Arbitrary Linear Functionals


Recall that the rate of decay of scale and shape parameters r (, ) is defined in (6).
We first analyze the class all and polynomial tractability.
Theorem 1 Consider the function approximation problem I = {Id }dN for Hilbert
spaces for the class all and the absolute error criterion with the kernels (4) satisfying
(12). Let r (, ) be given by (6). If r (, ) = 0 or there exist constants C1 , C2 , C3 >
0, which are independent of but may depend on r (, ) and sup{ | N}, such
that

K (x, t)1 (x)1 (t) dx dt 1 C1 2 ,
(13)
R2

1

 2r (,

)

, j
C3
C2
2
j=2

(14)

hold for all 0 < < sup{ | N}, then it follows that
I is strongly polynomially tractable with exponent

p all = min 2,


1
.
r (, )

For all d N we have


ewor-all (n, Hd )  n 1/ p = n max(r (, ),1/2) n ,
all
n wor-abs-all (, Hd )  p 0,
all

where  n q with n was defined in Sect. 1, and  q with 0 is analogous


to  (1/)q with 1/ .
For the isotropic kernel with  = and  = for all , the exponent of
strong tractability is 2. Furthermore strong polynomial tractability is equivalent
to polynomial tractability.
Proof From [10, Theorem 5.1] it follows that I is strongly polynomially tractable
if and only if there exist two positive numbers c1 and such that

Tractability of Function Approximation with Product Kernels

c2 := sup
dN

591

1/

d,,
,j

< ,

(15)

j=c1 

Furthermore, the exponent p all of strong polynomial tractability is the infimum of 2


for which this condition holds. Obviously (15) holds for c1 = 1 and = 1 because

d,, , j


d

d 




=
 , , j =
[1 2 + 2 K  (t, t)]1 (t) dt
=1

j=1

=1

j=1

d



1 2 + 2 = 1.
=1

This shows that p all 2.


The case r (, ) = 0 is trivial. Take now r (, ) > 0. Consider first the case
, . We will show
d,, in (4) becomes K
d = 1 for simplicity. Then the kernel K
, satisfy
that for = 1/(2r (, )), the eigenvalues of K

2
,
, j 1 + C U ( ) ,

(16)

j=1

where the constant CU does not depend on or . Since all the eigenvalues of K
are non-negative, we clearly have for the first eigenvalue of K ,
, ,1 1.

(17)

,
On the other hand, (13) gives the lower bound of the first eigenvalue of K


 

, (x, t)1 (x)1 (t) dtdx =
1 2 + 2 K (x, t) 1 (x)1 (t) dtdx
K
R2
R2

2
2
=1 +
K (x, t)1 (x)1 (t) dtdx 1 C1 ( )2 .
(18)

, ,1

R2

It follows from (12) that


, ,2 C1 ( )2 .

(19)

For j 3, the upper bound of , , j is given by (10) with i = 1:


, , j 2 , j1 ,

(20)

592

X. Zhou and F.J. Hickernell

which in turn yields

2
,
,j

j=3

, j1 C3 ( )2

(21)

j=3

by (14). Combining (17), (19) and (21) gives (16), where the constant CU = C1 +C3 .
The lower bound we want to establish is that for < 1/(2r (, )),

,
,j


1 + CL ( )

if <

j=1

C2
2C1

1/[2(1 )]

(22)

where CL := C2 /2. It follows from (18) that

, ,1 1 C1 ( )2 .
,
,1

(23)

In addition we apply the eigenvalue inequality (10) to obtain


, , j 2 , j ,

j = 2, 3, . . .

which in turn gives

2
,
,j

j=2

, j C2 ( )2 ,

(24)

j=2

where the last inequality follows from (14). Inequalities (23) and (24) together give

2
2
,
1 + (C2 /2)( )2
, j 1 C 1 ( ) + C 2 ( )

j=1

under the condition in (22) on small enough . Thus we obtain (22).


For the multivariate case, the sum of the th power of the eigenvalues is bounded
from above for any > 1/(2r (, )) because


j=1

d,
j







1 + CU (  )2
=
, , j
=1

= exp

j=1

=1

ln 1 + CU (  )

=1

This shows that p all 1/r (, ).





exp CU


=1


(  )

< . (25)

Tractability of Function Approximation with Product Kernels

593

We now consider the lower bound in the multivariate case and define the set A by






C2 1/[2(1 )]

A = 
  <
.

2C1
Then

sup

dN







d,,
 , , j =
 , , j
 , , j .
,j
=1

j=1

j=1

A

j=1

N\A

j=1

We want to show that this supremum is infinite for < 1/(2r (, )). We do this by
proving that the first product on the right is infinite. Indeed for < 1/(2r (, )),






1 + CL (  )2 1 + CL
 , , j
(  )2 = .

A

A

j=1

A

Therefore, p all 1/r (, ), which establishes the formula for p all . The estimates on
ewor-all (n, Hd ) and n wor-abs-all (, Hd ) follow from the definition of strong tractability.
Finally, the exponent of strong tractability is 2 for the isotropic kernel because
r (, ) = 0 in this case. To prove that strong polynomial tractability is equivalent
to polynomial tractability, it is enough to show that polynomial tractability implies
strong polynomial tractability. From [10, Theorem 5.1] we know that polynomial
tractability holds if and only if there exist numbers c1 > 0, q1 0, q2 0 and > 0
such that

1/


d, j
< .
c2 := sup d q2

dN
q1
j=C1 d

If so, then

n wor-abs-all (, Hd ) (c1 + c2 ) d max(q1 ,q2 ) 2

for all (0, 1) and d N. Note that for all d we have


d q2

d q2 (c1  1),
,
,j
,1 c2 < .
j=1

This implies that 1. On the other hand, for = 1 we can take q1 = q2 = 0 and

arbitrarily small C1 , and obtain strong tractability. This completes the proof.
Theorem 1 states that the exponent of strong polynomial tractability is at most
2, while for all shape parameters for which r (, ) > 1/2 the exponent is smaller
than 2. Again, although the rate of convergence of ewor-all (n, Hd ) is always excellent,

594

X. Zhou and F.J. Hickernell

the dependence on d is eliminated only at the expense of the exponent which must
be roughly 1/ p all . Of course, if we take an exponentially decaying sequence of
the products of scale parameters and shape parameters, say,   = q  for some
q (0, 1), then r (, ) = and p all = 0. In this case, we have an excellent rate
of convergence without any dependence on d.

4.2 Only Function Values


The tractability results for the class std are stated in the following theorem.
Theorem 2 Consider the function approximation problem I = {Id }dN for Hilbert
spaces for the class std and the absolute error criterion with the kernels (4) satisfying
(12). Let r (, ) be given by (6). If r (, ) = 0 or there exist constants C1 , C2 , C3 >
0, which are independent of but may depend on r (, ) and sup{ | N}, such
that (13) and (14) are satisfied for all 0 < < sup{ | N}, then
I is strongly polynomially tractable with exponent of strong polynomial tractability at most 4. For all d N and (0, 1) we have


1 1/2
2
(n, Hd ) 1/4 1 +
,
e
n
2 n
!
"

(1 + 1 + 2 )2
worabsstd
n
(, Hd )
.
4
wor-std

For the isotropic kernel with  = and  = for all , the exponent of
strong tractability is at least 2 and strong polynomial tractability is equivalent to
polynomial tractability.
Furthermore if r (, ) > 1/2, then
I is strongly polynomially tractable with exponent of strong polynomial tractability at most
1
1
1
+ 2
= p all + ( p all )2 < 4.
p std =
r (, ) 2r (, )
2
For all d N we have
ewor-std (n, Hd )  n 1/ p = n r (, )/[1+1/(2r (, ))] n ,
std

n wor-abs-std (, Hd )  p

std

0.

Proof The same proofs as for [6, Theorems 5.3 and 5.4] can be used. We only need
to show that the assumption of [9, Theorem 5], which is used in [6, Theorem 5.4], is
satisfied. It is enough to show that there exists p > 1 and B > 0 such that for any
n N,

Tractability of Function Approximation with Product Kernels

d,, ,n

595

B
.
np

(26)

Take = 1/(2r (, )). Since the eigenvalues  ,n are ordered, we have for n 2,
 ,n

C3 2
1
1
,
 , j
 , j
n 1 j=2
n 1 j=2
n1
n

where the last inequality follows from (14). Raising to the power 1/ gives
 ,n 2

C3
n1

1/
.

Furthermore (20) implies that for n 3,


 , ,n 2  ,n1 2 2

C3
n2

1/
1/

= 2 2 C3

n
n2

1/ 

n 1/

2 2 (3C3 )1/
.
n 1/

Since  , ,n 1 for all n N, we have that for all 1  d and n 3,


d,, ,n  , ,n

C4
,
np

where C4 = 2 2 (3C3 )1/ and p = 1/ > 1. For n = 1 and n = 2, we can


always find C5 large enough such that d,, ,n C5 /n p . Therefore (26) holds for

B = max{C4 , C5 }.
Note that (26) can be easily satisfied for many kernels used in practice. This
theorem implies that for large r (, ), the exponents of strong polynomial tractability
are nearly the same for both classes all and std . For an exponentially decaying
sequence of shape parameters, say,   = q  for some q (0, 1), we have p all =
p std = 0, and the rates of convergence are excellent and independent of d.

5 Tractability for the Normalized Error Criterion


d )
We now consider the function approximation problem for Hilbert spaces Hd ( K
with a general kernel for the normalized error criterion. That is, we want to find the
smallest n for which
ewor (n, Hd ) Id ,

{std, all}.

596

X. Zhou and F.J. Hickernell

Note that Id  = d,, ,1 1 and it can be exponentially small in d. Therefore


the normalized error criterion may be much harder than the absolute error criterion.
It follows from [6, Theorem 6.1] that for the normalized error criterion, lack of
polynomial tractability holds for the isotropic kernel for the class all and hence for
the class std .

5.1 Arbitrary Linear Functionals


We do not know if polynomial tractability holds for kernels with 0 r (, ) < 1/2.
For r (, ) 1/2, we have the following theorem.
Theorem 3 Consider the function approximation problem I = {Id }dN for Hilbert
spaces for the class std and the normalized error criterion with the kernels (4)
satisfying (12). Let r (, ) be given by (6) and r (, ) 1/2. If there exist constants C1 , C2 , C3 > 0, which are independent of but may depend on r (, ) and
sup{ | N}, such that (13) and (14) are satisfied for all 0 < < sup{ | N},
then
I is strongly polynomially tractable with exponent of strong polynomial tractability
1
p all =
.
r (, )
For all d N we have
ewor-all (n, Hd )  Id n 1/ p = n r (, ) n ,
all

n wor-abs-all (, Hd )  p

all

0.

Proof From [10, Theorem 5.2] we know that strong polynomial tractability holds if
and only if there exits a positive number such that
c2 := sup
d




d,, , j
d,, ,1

j=1

= sup
d

d,,
,1

d,,
,j

j=1

< .

If so, then n wor-nor-all (, Hd ) c2 2 for all (0, 1) and d N, and the exponent
of strong polynomial tractability
infimum of 2 for which c2 < .
# is the

r (, )) from (25).
For all d N, we have
j=1 d,, , j < for = 1/(2

}
<

if
and
only
if
sup
It remains to note that supd {1/d,,
d {1/d,, ,1 } < .
,1
Furthermore note that (18) implies that

sup
d

1
d,, ,1


=1

1
.
1 C1 (  )2

Tractability of Function Approximation with Product Kernels

597

#
2
Clearly, r (, ) 1/2 implies that
=1 (  ) < , which yields c2 < .
all
1/r (, ). The estimates on ewor-all (n, Hd ) and
This also proves that p
wor-nor-all
(, Hd ) follow from the definition of strong tractability.

n

5.2 Only Function Values


We now turn to the class std . We do not know if polynomial tractability holds for
the class std for 0 r (, ) 1/2. For r (, ) > 1/2, we have the following
theorem.
Theorem 4 Consider the function approximation problem I = {Id }dN for Hilbert
spaces with the kernel (4) for the class std and the normalized error criterion. Let
r (, ) be given by (6) and r (, ) > 1/2. If there exist constants C1 , C2 , C3 > 0,
which are independent of but may depend on r (, ) and sup{ | N}, such that
(13) and (14) are satisfied for all 0 < < sup{ | N}, then
I is strongly polynomially tractable with exponent of strong polynomial tractability at most
1
1
1
p std =
+
= p all + ( p all )2 < 4.
r (, ) 2r 2 (, )
2
For all d N we have
ewor-std (n, Hd )  n 1/ p n ,
std
wor-nor-std
n
(, Hd )  p 0.
std

Proof The initial error is




d
d


1
(1 C1 (  )2 )1/2 = exp O(1)
(  )2 .
Id 
2
=1
=1
r (, ) > 1/2 implies that Id  is uniformly bounded from below by a positive
number. This shows that there is no difference between the absolute and normalized
error criteria. This means that we can apply Theorem 2 for the class std with

replaced by Id  = (). This completes the proof.
Acknowledgments We are grateful for many fruitful discussions with Peter Math and several
other colleagues. This work was partially supported by US National Science Foundation grants
DMS-1115392 and DMS-1357690.

598

X. Zhou and F.J. Hickernell

References
1. Berlinet, A., Thomas-Agnan, C.: Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer Academic, Boston (2004)
2. Bernstein, D.S.: Matrix Mathematics. Princeton University, New Jersey (2008)
3. Buhmann, M.D.: Radial Basis Functions. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge (2003)
4. Cucker, F., Zhou, D.X.: Learning Theory: An Approximation Theory Viewpoint. Cambridge
Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge (2007)
5. Fasshauer, G.E.: Meshfree Approximation Methods with Matlab, Interdisciplinary Mathematical Sciences, vol. 6. World Scientific Publishing Co., Singapore (2007)
6. Fasshauer, G.E., Hickernell, F.J., Wozniakowski, H.: On dimension-independent rates of convergence for function approximation with Gaussian kernels. SIAM J. Numer. Anal. 50, 247271
(2012). doi:10.1137/10080138X
7. Hastie, T., Tibshirani, R., Friedman, J.: Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics, 2nd edn. Springer Science+Business Media
Inc, New York (2009)
8. Knutson, A., Tao, T.: Honeycombs and sums of Hermitian matrices. Not. AMS 482, 175186
(2001)
9. Kuo, F.Y., Wasilkowski, G.W., Wozniakowski, H.: On the power of standard information for
multivariate approximation in the worst case setting. J. Approx. Theory 158, 97125 (2009)
10. Novak, E., Wozniakowski, H.: Tractability of Multivariate Problems Volume I: Linear Information. EMS Tracts in Mathematics, vol. 6. European Mathematical Society, Zrich (2008)
11. Pietsch, A.: Operator Ideals. North-Holland Publishing Co., Amsterdam (1980)
12. Rasmussen, C.E., Williams, C.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006). http://www.gaussianprocess.org/gpml/
13. Schaback, R., Wendland, H.: Kernel techniques: from machine learning to meshless methods.
Acta Numer. 15, 543639 (2006)
14. Schlkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization,
Optimization, and Beyond. MIT Press, Cambridge (2002)
15. Stein, M.L.: Interpolation of Spatial Data: Some theory for Kriging. Springer, New York (1999)
16. Steinwart, I., Christmann, A.: Support Vector Machines. Springer Science+Business Media,
Inc., New York (2008)
17. Wahba, G.: Spline Models for Observational Data, CBMS-NSF Regional Conference Series
in Applied Mathematics, vol. 59. SIAM, Philadelphia (1990)
18. Wendland, H.: Scattered Data Approximation. Cambridge Monographs on Applied and Computational Mathematics, vol. 17. Cambridge University Press, Cambridge (2005)

Discrepancy Estimates
For Acceptance-Rejection Samplers
Using Stratified Inputs
Houying Zhu and Josef Dick

Abstract In this paper we propose an acceptance-rejection sampler using stratified


inputs as driver sequence. We estimate the discrepancy of the N -point set in
(s 1)-dimensions generated by this algorithm. First we show an upper bound
on the star-discrepancy of order N 1/21/(2s) . Further we prove an upper bound on
q
the qth moment of the L q -discrepancy (E[N q L q,N ])1/q for 2 q , which is
(11/s)(11/q)
of order N
. The proposed approach is numerically tested and compared
with the standard acceptance-rejection algorithm using pseudo-random inputs. We
also present an improved convergence rate for a deterministic acceptance-rejection
algorithm using (t, m, s)nets as driver sequence.
Keywords Monte Carlo method
theory

Acceptance-rejection sampler

Discrepancy

1 Introduction
The acceptance-rejection algorithm is one of the widely used techniques for sampling
from a distribution when direct simulation is not possible or is expensive. The idea
of this method is to determine a good choice of proposal density (also known as
hat function), and then sample from the proposal density with low cost. For a given
target density : D R+ and a well-chosen proposal density H : D R+ , one
assumes that there exists a constant L < such that (x) < L H (x) for all x in the
domain D. Let u have uniform distribution in the unit interval, i.e. u U ([0, 1]).
Then the plain acceptance-rejection algorithm works in the following way. One first
,
draws X H and u U ([0, 1]), then accepts X as a sample of if u L(X)
H (X)
H. Zhu (B) J. Dick
School of Mathematics and Statistics, The University of New South Wales,
Sydney NSW 2052, Australia
e-mail: houying.zhu@unsw.edu.au
J. Dick
e-mail: josef.dick@unsw.edu.au
Springer International Publishing Switzerland 2016
R. Cools and D. Nuyens (eds.), Monte Carlo and Quasi-Monte Carlo Methods,
Springer Proceedings in Mathematics & Statistics 163,
DOI 10.1007/978-3-319-33507-0_33

599

600

H. Zhu and J. Dick

otherwise reject this sample and repeat the sampling step. Note that by applying
this algorithm, one needs to know the value of L. However, in many situations, this
constant is known for the given function or can be estimated.
Devroye [6] gave a construction method of a proposal density for log-concave
densities and Hrmann [17] proposed a rejection procedure, called transformed density rejection, to construct a proposal density. Detailed summaries of this technique
and some extensions can be found in the monographs [3, 18]. For many target densities finding a good proposal density is difficult. To improve efficiency one can also
determine a better choice of driver sequence having the designated proposal density,
which yields a deterministic type of acceptance-rejection method.
The deterministic acceptance-rejection algorithm has been discussed by
Moskowitz and Caflisch [22], Wang [31, 32] and Nguyen and kten [23], where
empirical evidence and a consistency result were given. Two measurements included
therein are the empirical root mean square error (RMSE) and the empirical standard
deviation. However, the discrepancy of samples has not been directly investigated.
Motivated by those papers, in [33] we investigated the discrepancy properties of
points produced by a totally deterministic acceptance-rejection method. We proved
that the discrepancy of samples generated by the acceptance-rejection sampler using
(t, m, s)nets as driver sequences is bounded from above by N 1/s , where the target
density function is defined in (s 1)-dimension and N is the number of samples
generated by the deterministic acceptance-rejection sampler. A lower bound shows
that for any given driver sequence, there always exists a target density such that the
star-discrepancy is bounded below by cs N 2/(s+1) , where cs is a constant depending
only on s.
Without going into details, in the following we briefly review known results in
the more general area of deterministic Markov chain quasi-Monte Carlo.

1.1 Literature Review of Markov Chain Quasi-Monte Carlo


Method
Markov chain Monte Carlo (MCMC) sampling is a classical method widely used in
simulation. Using a deterministic set as driver sequence in the MCMC procedure,
known as Markov chain quasi-Monte Carlo (MCQMC) algorithm, shows potential
to improve the convergence rate. Tribble and Owen [30] proved a consistency result
for MCMC estimation for finite state spaces. A construction of weakly completely
uniformly distributed (WCUD) sequences is also proposed. As a sequel to the work
of Tribble, Chen [4] and Chen et al. [5] demonstrated that MCQMC algorithms
using a completely uniformly distributed (CUD) sequence as driver sequence give
a consistent result under certain assumptions on the update function and Markov
chain. Further, Chen [4] also showed that MCQMC can achieve a convergence rate
of O(N 1+ ) for any > 0 under certain stronger assumptions, but he only showed

Discrepancy Estimates For Acceptance-Rejection Samplers Using

601

the existence of a driver sequence. More information on (W)CUD sequences can be


found in [4, 5, 30].
In a different direction, LEcuyer et al. [20] proposed a randomized quasi-Monte
Carlo method, namely the so-called array-RQMC method, which simulates multiple
Markov chains in parallel, then applies a suitable permutation to provide a more
accurate approximation of the target distribution. It gives an unbiased estimator to
the mean and variance and also achieves good empirical performance. Gerber and
Chopin in [12] adapted low discrepancy point sets instead of random numbers in
sequential Monte Carlo (SMC). They proposed a new algorithm, named sequential
quasi-Monte Carlo (SQMC), through the use of a Hilbert space-filling curve. They
proved consistency and stochastic bounds based on randomized QMC point sets for
this algorithm. More literature review about applying QMC to MCMC problems can
be found in [5, Sect. 1] and the references therein.
In [10], jointly done with Rudolf, we prove upper bounds on the discrepancy for
uniformly ergodic Markov chains driven by a deterministic sequence rather than independent random variables. We show that there exists a deterministic driver sequence
such that the discrepancy of the Markov chain from the target distribution with respect
to certain test sets converges with almost the usual Monte Carlo rate of N 1/2 . In the
sequential work [9] done by Dick and Rudolf, they consider upper bounds on the discrepancy under the assumption that the Markov chain is variance bounding and the
driver sequence is deterministic. In particular, they proved a better existence result,
showing a discrepancy bound having a rate of convergence of almost N 1 under
a stronger assumption on the update function, the so called anywhere-to-anywhere
condition. Roughly, variance bounding is a weaker property than geometric ergodicity for reversible chains. It was introduced by Roberts and Rosenthal in [28], who
also proved relations among variance bounding, central limit theorems and Peskun
ordering, which indicated that variance bounding is a reasonable and convenient
property to study MCMC algorithms.

1.2 Our Contribution


In this work we first present an acceptance-rejection algorithm using stratified inputs
as driver sequence. Stratified sampling is one of the variance reduction methods used
in Monte Carlo sampling. More precisely, grid-based stratified sampling improves
the RMSE to N 1/21/s for Monte Carlo, see for instance [26, Chap. 10]. In this
paper, we are interested in the discrepancy properties of points produced by the
acceptance-rejection method with stratified inputs as driver sequence. We obtain a
convergence rate of the star-discrepancy of order N 1/21/(2s) . Also an estimation of
the L q -discrepancy is considered for this setting.
One would expect that the convergence rate which can be achieved using deterministic sampling methods also depends on properties of the target density function.
One such property is the number of elementary intervals (for a precise definition see
Definition 3 below) of a certain size needed to cover the graph of the density. We

602

H. Zhu and J. Dick

show that if the graph can be covered by a small number of elementary intervals,
then an improved rate of convergence can be achieved using (t, m, s)-nets as driver
sequence. In general, this strategy does not work with stratified sampling, unless one
knows the elementary intervals explicitly.
The paper is organized as follows. In Sect. 2 we provide the needed notation and
background. Section 3 introduces the proposed acceptance-rejection sampler using
stratified inputs, followed by the theoretical results including an upper bound on the
star-discrepancy and the L q -discrepancy. Numerical tests are presented in Sect. 3.3
together with a discussion of the results in comparison with the theoretical bounds of
Theorems 1 and 2. For comparison purpose only we do the numerical tests also with
pseudo-random inputs. Section 4 illustrates an improved rate of convergence when
using (t, m, s)-nets as driver sequences. The paper ends with concluding remarks.

2 Preliminaries
We are interested in the discrepancy properties of samples generated by the
acceptance-rejection sampler. We consider the L q -discrepancy and the stardiscrepancy.
Definition 1 (L q -discrepancy) Let 1 q be a real number. For a point set
PN = {x 0 , . . . , x N 1 } in [0, 1)s , the L q -discrepancy is defined by
L q,N (PN ) =



N 1
q 1/q
1 

1[0,t) (x n ) ([0, t)) dt
,
[0,1]s N n=0


1, if x n [0, t),
, [0, t) = sj=1 [0, t j ) and is the Lebesgue
0, otherwise.
measure, with the obvious modification for q = . The L ,N -discrepancy is called
the star-discrepancy which is also denoted by D N (PN ).

where 1[0,t) (x n ) =

Later we will consider the discrepancy of samples associated with a density function.
The acceptance-rejection algorithm accepts all points below the graph of the
density function. In order to prove bounds on the discrepancy, we assume that the
set below the graph of the density function admits a so-called Minkowski content.
Definition 2 (Minkowski content) For a set A Rs , let A denote the boundary of
A and let
(( A) )
,
M ( A) = lim
0
2
where ( A) = {x Rs |x y for y A} and   denotes the Euclidean
norm. If M ( A) (abbreviated as M A ) exists and is finite, then A is said to admit
an (s 1)dimensional Minkowski content.

Discrepancy Estimates For Acceptance-Rejection Samplers Using

603

For simplicity, we consider the Minkowski content associated with the boundary
of a given set, however one could define it in more general sense. Ambrosio et al.
[1] present a detailed discussion of general Minkowski content.

3 Acceptance-Rejection Sampler Using Stratified Inputs


We now present the acceptance-rejection algorithm using stratified inputs.
Algorithm 1 Let the target density $\psi : [0,1]^{s-1} \to \mathbb{R}^+$, where $s \ge 2$, be given. Assume that we know a constant $L < \infty$ such that $\psi(z) \le L$ for all $z \in [0,1]^{s-1}$. Let $A = \{z \in [0,1]^s : \psi(z_1,\ldots,z_{s-1}) \ge L z_s\}$.
(i) Let $M \in \mathbb{N}$ and let $\{Q_0, \ldots, Q_{M-1}\}$ be a disjoint covering of $[0,1)^s$ with $Q_i$ of the form $\prod_{j=1}^{s}\left[\frac{c_j}{M^{1/s}}, \frac{c_j+1}{M^{1/s}}\right)$ with $0 \le c_j \le \lceil M^{1/s}\rceil - 1$. Then $\lambda(Q_i) = 1/M$ for all $0 \le i \le M-1$. Generate a point set $P_M = \{x_0, \ldots, x_{M-1}\}$ such that $x_i \in Q_i$ is uniformly distributed in the subcube $Q_i$ for each $i = 0, 1, \ldots, M-1$.
(ii) Use the acceptance-rejection method for the points in $P_M$ with respect to the density $\psi$, i.e. accept the point $x_n$ if $x_n \in A$, otherwise reject. Let $P_N^{(s)} = A \cap P_M = \{z_0, \ldots, z_{N-1}\}$ be the set of accepted points.
(iii) Project the accepted points $P_N^{(s)}$ onto the first $s-1$ coordinates. Let $Y_N^{(s-1)} = \{y_0, \ldots, y_{N-1}\}$ be the projections of the points of $P_N^{(s)}$.
(iv) Return the point set $Y_N^{(s-1)}$.
Note that $M^{1/s}$ is not necessarily an integer in Algorithm 1 and hence the sets $Q_i$ do not necessarily partition the unit cube $[0,1)^s$. The restriction that $M^{1/s}$ be an integer would force one to choose $M = K^s$ for some $K \in \mathbb{N}$, which grows fast for large $s$. However, this restriction is not necessary and hence we do not assume here that $M^{1/s}$ is an integer.
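A minimal Python sketch of Algorithm 1 follows (illustrative only; the function names, the example density and the use of $K = \lceil M^{1/s}\rceil$ strata per axis, which gives $K^s \ge M$ driver points, are our own simplifying assumptions, not prescriptions from the paper):

```python
import numpy as np

def stratified_ar_sampler(psi, L, s, M, rng=None):
    """Sketch of Algorithm 1: acceptance-rejection with stratified inputs.

    psi : unnormalised target density on [0,1]^(s-1), bounded by L.
    Returns the accepted points projected onto the first s-1 coordinates.
    Simplification: K = ceil(M**(1/s)) strata per axis, K**s cells in total.
    """
    rng = np.random.default_rng(rng)
    K = int(np.ceil(M ** (1.0 / s)))
    grid = np.stack(np.meshgrid(*([np.arange(K)] * s), indexing="ij"), axis=-1)
    corners = grid.reshape(-1, s) / K
    x = corners + rng.random(corners.shape) / K        # one uniform point per cell
    accept = psi(x[:, : s - 1]) >= L * x[:, s - 1]     # keep points below the graph
    return x[accept, : s - 1]

if __name__ == "__main__":
    psi = lambda y: 0.25 * np.exp(-y).sum(axis=1)      # density from Sect. 3.3 (s = 5)
    samples = stratified_ar_sampler(psi, L=1.0, s=5, M=4 ** 5)
    print(samples.shape)
```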

3.1 Existence Result of Samples with Small Star Discrepancy


We present some results that we use to prove an upper bound on the star-discrepancy of points generated by the acceptance-rejection sampler using stratified inputs. For any $0 < \delta \le 1$, a set $\Gamma_\delta$ of anchored boxes $[0,x) \subseteq [0,1)^s$ is called a $\delta$-cover of the set of anchored boxes $[0,t) \subseteq [0,1)^s$ if for every point $t \in [0,1]^s$ there exist $[0,x), [0,y) \in \Gamma_\delta$ such that $[0,x) \subseteq [0,t) \subseteq [0,y)$ and $\lambda([0,y)\setminus[0,x)) \le \delta$. The following result on the size of the $\delta$-cover is obtained from [13, Theorem 1.15].
Lemma 1 For any $s \in \mathbb{N}$ and $\delta > 0$ there exists a $\delta$-cover $\Gamma_\delta$ of the set of anchored boxes $[0,t) \subseteq [0,1)^s$ which has cardinality at most $(2e)^s(\delta^{-1}+1)^s$.


By a simple generalization, the following result holds for our setting.


Lemma 2 Let $\psi : [0,1]^{s-1} \to \mathbb{R}^+$, where $s \ge 2$, be a function. Assume that there exists a constant $L < \infty$ such that $\psi(z) \le L$ for all $z \in [0,1]^{s-1}$. Let $A = \{z \in [0,1]^s : \psi(z_1,\ldots,z_{s-1}) \ge L z_s\}$ and $J_t = ([0,t)\times[0,1]) \cap A$ for $t \in [0,1]^{s-1}$. Consider the measurable space $(A, \mathscr{B}(A))$, where $\mathscr{B}(A)$ is the Borel $\sigma$-algebra of $A$, and define the set $\mathcal{A} \subseteq \mathscr{B}(A)$ of test sets by
$$\mathcal{A} = \{J_t : t \in [0,1]^{s-1}\}.$$
Then for any $\delta > 0$ there exists a $\delta$-cover $\Gamma_\delta$ of $\mathcal{A}$ with
$$|\Gamma_\delta| \le (2e)^{s-1}(\delta^{-1}+1)^{s-1}.$$
Lemma 3 Let the unnormalized density function $\psi : [0,1]^{s-1} \to \mathbb{R}^+$, with $s \ge 2$, be given. Assume that there exists a constant $L < \infty$ such that $\psi(z) \le L$ for all $z \in [0,1]^{s-1}$.
Let $M \in \mathbb{N}$ and let the subsets $Q_0, \ldots, Q_{M-1}$ be a disjoint covering of $[0,1)^s$ of the form $\prod_{j=1}^{s}\left[\frac{c_j}{M^{1/s}}, \frac{c_j+1}{M^{1/s}}\right)$ where $0 \le c_j \le \lceil M^{1/s}\rceil - 1$. Each set $Q_i$ satisfies $\lambda(Q_i) = 1/M$. Let
$$A = \{z \in [0,1]^s : \psi(z_1,\ldots,z_{s-1}) \ge L z_s\}.$$
Assume that $\partial A$ admits an $(s-1)$-dimensional Minkowski content $\mathcal{M}_{\partial A}$. Let $J_t = ([0,t)\times[0,1]) \cap A$, where $t = (t_1,\ldots,t_{s-1}) \in [0,1]^{s-1}$.
Then there exists an $M_0 \in \mathbb{N}$ such that $\partial J_t$ intersects at most $3 s^{1/2}\,\mathcal{M}_{\partial A}\, M^{1-1/s}$ subcubes $Q_i$, for all $M \ge M_0$.
This result can be obtained utilizing a similar proof as in [14, Theorem 4.3]. For
the sake of completeness, we give the proof here.
Proof Since $\partial A$ admits an $(s-1)$-dimensional Minkowski content, it follows that
$$\mathcal{M}_{\partial A} = \lim_{\epsilon \to 0} \frac{\lambda((\partial A)_\epsilon)}{2\epsilon} < \infty.$$
Thus, by the definition of the limit, for any fixed $\kappa > 2$ there exists an $\epsilon_0 > 0$ such that $\lambda((\partial A)_\epsilon) \le \kappa\,\epsilon\,\mathcal{M}_{\partial A}$ whenever $0 < \epsilon \le \epsilon_0$.
Based on the form of the subcubes $\prod_{j=1}^{s}\left[\frac{c_j}{M^{1/s}}, \frac{c_j+1}{M^{1/s}}\right)$, the largest diagonal length is $\sqrt{s}\, M^{-1/s}$. We can assume that $M > (\sqrt{s}/\epsilon_0)^s$, so that $\epsilon := \sqrt{s}\,M^{-1/s} < \epsilon_0$ and $\bigcup_{i \in J} Q_i \subseteq (\partial A)_\epsilon$, where $J$ is the index set of those $Q_i$ which satisfy $Q_i \cap \partial A \ne \emptyset$. Therefore
$$|J| \le \frac{\lambda((\partial A)_\epsilon)}{\lambda(Q_i)} \le \frac{\kappa\,\epsilon\,\mathcal{M}_{\partial A}}{1/M} = \kappa\,\sqrt{s}\,\mathcal{M}_{\partial A}\, M^{1-1/s}.$$


Without loss of generality we can set $\kappa = 3$. Note that the number of boxes $Q_i$ which intersect $\partial J_t$ is bounded by the number of boxes $Q_i$ which intersect $\partial A$, which completes the proof. $\square$

Remark 1 Ambrosio et al. [1] showed that for a closed set $A \subseteq \mathbb{R}^s$, if $A$ has a Lipschitz boundary, then $\partial A$ admits an $(s-1)$-dimensional Minkowski content. In particular, a convex set $A \subseteq [0,1]^s$ has an $(s-1)$-dimensional Minkowski content. Note that the surface area of a convex set in $[0,1]^s$ is bounded by the surface area of the unit cube $[0,1]^s$, which is $2s$, and it was shown by Niederreiter and Wills [25] that $2s$ is best possible. It follows that the Minkowski content satisfies $\mathcal{M}_{\partial A} \le 2s$ when $A$ is a convex set in $[0,1]^s$.
Lemma 4 Suppose that all the assumptions of Lemma 3 are satisfied. Let $N$ be the number of points accepted by Algorithm 1. Then we have
$$M\big(\lambda(A) - 3 s^{1/2}\mathcal{M}_{\partial A} M^{-1/s}\big) \le N \le M\big(\lambda(A) + 3 s^{1/2}\mathcal{M}_{\partial A} M^{-1/s}\big).$$
Proof The number of points accepted in Algorithm 1 is a random number since the driver sequence given by the stratified inputs is random. Let $\mathbb{E}[N]$ denote the expectation of $N$. The number of $Q_i$ which have non-empty intersection with $\partial A$ is bounded by $l = 3 s^{1/2}\mathcal{M}_{\partial A} M^{1-1/s}$ by Lemma 3. Thus
$$\mathbb{E}[N] - l \le N \le \mathbb{E}[N] + l. \qquad (1)$$
Further we have
$$\mathbb{E}[N] = \sum_{i=0}^{M-1} \frac{\lambda(Q_i \cap A)}{\lambda(Q_i)} = M\lambda(A). \qquad (2)$$
Combining (1) and (2) and substituting $l = 3 s^{1/2}\mathcal{M}_{\partial A} M^{1-1/s}$ yields the desired result. $\square$

Before proving the upper bound on the star-discrepancy, we recall the well-known Bernstein–Chernoff inequality.
Lemma 5 ([2, Lemma 2]) Let $\eta_0, \ldots, \eta_{l-1}$ be independent random variables with $\mathbb{E}[\eta_i] = 0$ and $|\eta_i| \le 1$ for all $0 \le i \le l-1$. Denote by $\sigma_i^2$ the variance of $\eta_i$, i.e. $\sigma_i^2 = \mathbb{E}[\eta_i^2]$, and set $\sigma = \big(\sum_{i=0}^{l-1}\sigma_i^2\big)^{1/2}$. Then for any $\alpha > 0$ we have
$$\mathbb{P}\left(\Big|\sum_{i=0}^{l-1}\eta_i\Big| \ge \alpha\right) \le \begin{cases} 2\,e^{-\alpha^2/(4\sigma^2)}, & \text{if } \alpha \le 2\sigma^2,\\[4pt] 2\,e^{-\alpha/4}, & \text{if } \alpha \ge 2\sigma^2. \end{cases}$$

The star-discrepancy of the samples $Y_N^{(s-1)}$ obtained by Algorithm 1 with respect to $\psi$ is given by
$$D_{N,\psi}^{*}(Y_N^{(s-1)}) = \sup_{t \in [0,1]^{s-1}} \left| \frac{1}{N}\sum_{n=0}^{N-1} 1_{[0,t)}(y_n) - \frac{1}{C}\int_{[0,t)}\psi(z)\,dz \right|,$$
where $C = \int_{[0,1]^{s-1}}\psi(z)\,dz$ and $s \ge 2$.
Theorem 1 Let an unnormalized density function $\psi : [0,1]^{s-1} \to \mathbb{R}^+$, with $s \ge 2$, be given. Assume that there exists a constant $L < \infty$ such that $\psi(z) \le L$ for all $z \in [0,1]^{s-1}$. Let $C = \int_{[0,1]^{s-1}}\psi(z)\,dz > 0$ and let the graph under $\psi$ be described by
$$A = \{z \in [0,1]^s : \psi(z_1,\ldots,z_{s-1}) \ge L z_s\}.$$
Assume that $\partial A$ admits an $(s-1)$-dimensional Minkowski content $\mathcal{M}_{\partial A}$. Then for all large enough $N$, with positive probability, Algorithm 1 yields a point set $Y_N^{(s-1)} \subseteq [0,1]^{s-1}$ such that
$$D_{N,\psi}^{*}(Y_N^{(s-1)}) \le s^{\frac{3}{4}}\, 2^{\frac{1}{2}-\frac{1}{2s}}\left(\frac{6\,\mathcal{M}_{\partial A}}{\lambda(A)}\right)^{\frac{1}{2}-\frac{1}{2s}}\left(\frac{\log N}{N}\right)^{\frac{1}{2}+\frac{1}{2s}} + \frac{2\lambda(A)}{N}, \qquad (3)$$
where $\lambda(A) = C/L$.

Proof Let $J_t = ([0,t)\times[0,1]) \cap A$, where $t = (t_1,\ldots,t_{s-1})$. Using the notation from Algorithm 1, let $y_n$ denote the first $s-1$ coordinates of $z_n \in A$, for $n = 0,\ldots,N-1$. We have
$$\sum_{n=0}^{M-1} 1_{J_t}(x_n) = \sum_{n=0}^{N-1} 1_{[0,t)}(y_n).$$
Therefore
$$\left|\frac{1}{N}\sum_{n=0}^{N-1}1_{[0,t)}(y_n) - \frac{1}{C}\int_{[0,t)}\psi(z)\,dz\right| = \frac{1}{N}\left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - \frac{N}{\lambda(A)}\lambda(J_t)\right|. \qquad (4)$$
It is noted that
$$\left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - \frac{N}{\lambda(A)}\lambda(J_t)\right| \le \left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - M\lambda(J_t)\right| + \frac{\lambda(J_t)}{\lambda(A)}\big|M\lambda(A) - N\big|$$
$$\le \left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - M\lambda(J_t)\right| + \big|M\lambda(A) - N\big| = \left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - M\lambda(J_t)\right| + \left|M\lambda(A) - \sum_{n=0}^{M-1}1_A(x_n)\right|$$
$$\le 2\sup_{t\in[0,1]^{s-1}}\left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - M\lambda(J_t)\right|. \qquad (5)$$


Let us associate with each $Q_i$ a random point $x_i \in Q_i$ with probability distribution
$$\mathbb{P}(x_i \in V) = \frac{\lambda(V)}{\lambda(Q_i)} = M\lambda(V) \quad\text{for all measurable sets } V \subseteq Q_i.$$
It follows from Lemma 3 that $\partial J_t$ intersects at most $l := 3 s^{1/2}\mathcal{M}_{\partial A} M^{1-1/s}$ sets $Q_i$. Therefore, $J_t$ is representable as the disjoint union of the sets $Q_i$ entirely contained in $J_t$ and the union of at most $l$ sets $Q_i$ for which $Q_i \cap J_t \ne \emptyset$ and $Q_i \cap ([0,1]^s\setminus J_t) \ne \emptyset$, i.e.
$$J_t = \bigcup_{i\in I} Q_i \cup \bigcup_{i\in J}(Q_i \cap J_t),$$
where the index set $J$ has cardinality at most $3 s^{1/2}\mathcal{M}_{\partial A} M^{1-1/s}$. Since $\lambda(Q_i) = 1/M$ and $x_i \in Q_i$ for $i = 0,1,\ldots,M-1$, the discrepancy of $\bigcup_{i\in I}Q_i$ is zero. Therefore, it remains to investigate the discrepancy of $\bigcup_{i\in J}(Q_i\cap J_t)$.
Since $\lambda(A) = C/L$ and $N \ge M\big(C/L - 3 s^{1/2}\mathcal{M}_{\partial A}M^{-1/s}\big)$ by Lemma 4, we have $M \le 2LN/C$ for all $M > (6Ls^{1/2}\mathcal{M}_{\partial A}/C)^s$. Consequently,
$$l = 3 s^{1/2}\mathcal{M}_{\partial A} M^{1-1/s} \le 3 s^{1/2}(2L)^{1-1/s}C^{1/s-1}\mathcal{M}_{\partial A}\, N^{1-\frac{1}{s}} = \beta\, N^{1-1/s},$$
where $\beta = 3 s^{1/2}(2L)^{1-1/s}C^{1/s-1}\mathcal{M}_{\partial A}$.
Let us define the random variables $\eta_i$ for $0 \le i \le l-1$ as follows:
$$\eta_i = \begin{cases} 1, & \text{if } x_i \in Q_i \cap J_t,\\ 0, & \text{if } x_i \notin Q_i \cap J_t.\end{cases}$$
By definition,
$$\sum_{n=0}^{M-1}1_{J_t}(x_n) - M\lambda(J_t) = \sum_{i=0}^{l-1}\eta_i - M\sum_{i=0}^{l-1}\lambda(Q_i\cap J_t). \qquad (6)$$
Because $\mathbb{P}(\eta_i = 1) = \lambda(Q_i\cap J_t)/\lambda(Q_i) = M\lambda(Q_i\cap J_t)$, we have
$$\mathbb{E}[\eta_i] = M\lambda(Q_i\cap J_t), \qquad (7)$$
where $\mathbb{E}(\cdot)$ denotes the expected value. By (6) and (7),
$$\Delta_N(J_t) := \left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - M\lambda(J_t)\right| = \left|\sum_{i=0}^{l-1}(\eta_i - \mathbb{E}\eta_i)\right|. \qquad (8)$$
Since the random variables $\eta_i$ for $0 \le i \le l-1$ are independent of each other, we can estimate the sum $\sum_{i=0}^{l-1}(\eta_i - \mathbb{E}\eta_i)$ by the classical Bernstein–Chernoff inequality of large deviation type (Lemma 5).


Let $\sigma_i^2 = \mathbb{E}(\eta_i - \mathbb{E}\eta_i)^2$ and set $\sigma = \big(\sum_{i=0}^{l-1}\sigma_i^2\big)^{1/2}$. Let
$$\alpha = c\, l^{1/2}(\log N)^{1/2},$$
where $c$ is a constant depending only on the dimension $s$ which will be fixed later. Without loss of generality, assume that $N \ge 3$.
Case 1: If $\alpha \le 2\sigma^2$, then, since $\sigma^2 \le l \le \beta N^{1-1/s}$, Lemma 5 yields
$$\mathbb{P}\big(\Delta_N(J_t) \ge c\, l^{1/2}(\log N)^{1/2}\big) = \mathbb{P}\left(\Big|\sum_{i=0}^{l-1}(\eta_i - \mathbb{E}\eta_i)\Big| \ge \alpha\right) \le 2\,e^{-\alpha^2/(4\sigma^2)} \le 2\,N^{-c^2/4}. \qquad (9)$$
Though the class of axis-parallel boxes is uncountable, it suffices to consider a small subclass. Based on the argument in Lemma 2, there is a $1/M$-cover $\Gamma_{1/M}$ of cardinality $(2e)^{s-1}(M+1)^{s-1} \le (2e)^{s-1}(2LN/C+1)^{s-1}$ for $M > M_0$ such that there exist $R_1, R_2 \in \Gamma_{1/M}$ with $R_1 \subseteq J_t \subseteq R_2$ and $\lambda(R_2\setminus R_1) \le 1/M$. From this it follows that
$$\Delta_N(J_t) \le \max_{i=1,2}\Delta_N(R_i) + 1,$$
see, for instance, [11, Lemma 3.1] and [16, Section 2.1]. This means that we can restrict ourselves to the elements of $\Gamma_{1/M}$.
In view of (9),
$$\mathbb{P}\Big(\max_{R\in\Gamma_{1/M}}\Delta_N(R) \ge \alpha\Big) \le |\Gamma_{1/M}|\, 2N^{-c^2/4} \le 2N^{-c^2/4}(2e)^{s-1}\left(\frac{2LN}{C}+1\right)^{s-1} < 1,$$
for $c = 2\sqrt{2s}$ and all sufficiently large $N$.
Case 2: On the other hand, if $\alpha \ge 2\sigma^2$, then by Lemma 5 we obtain
$$\mathbb{P}\big(\Delta_N(J_t) \ge c\, l^{1/2}(\log N)^{1/2}\big) = \mathbb{P}\left(\Big|\sum_{i=0}^{l-1}(\eta_i - \mathbb{E}\eta_i)\Big| \ge \alpha\right) \le 2\,e^{-c\, l^{1/2}(\log N)^{1/2}/4}. \qquad (10)$$
Similarly, using the $1/M$-cover from above, for $c = 2\sqrt{2s}$ and sufficiently large $N$ we have
$$\mathbb{P}\Big(\max_{R\in\Gamma_{1/M}}\Delta_N(R) \ge \alpha\Big) \le |\Gamma_{1/M}|\, 2\,e^{-c\,l^{1/2}(\log N)^{1/2}/4} \le 2\,e^{-c\,l^{1/2}(\log N)^{1/2}/4}(2e)^{s-1}\left(\frac{2LN}{C}+1\right)^{s-1} < 1,$$
where the last inequality is satisfied for all large enough $N$.


By (4) and (5), we obtain that, with positive probability, Algorithm 1 yields a point
set Y N(s1) such that
D N , (Y N(s1) )

1
1
2s 1/2 N 2 2s (log N )1/2 + 1/M.

As above, by Lemma 4 we have 1/M 2C/(L N ) for sufficiently large N . Thus


the proof of Theorem 1 is complete.


3.2 Upper Bound on the $L_q$-Discrepancy

In this section we prove an upper bound on the expected value of the $L_q$-discrepancy for $2 \le q \le \infty$. We establish an upper bound for $\big(\mathbb{E}[N^q L_{q,N}^q(Y_N^{(s-1)})]\big)^{1/q}$, which is given by
$$\left(\mathbb{E}\big[N^q L_{q,N}^q(Y_N^{(s-1)})\big]\right)^{1/q} = \left(\mathbb{E}\int_{[0,1)^{s-1}}\left|\sum_{n=0}^{N-1}1_{[0,t)}(y_n) - \frac{N}{C}\int_{[0,t)}\psi(z)\,dz\right|^q dt\right)^{1/q},$$
where $Y_N^{(s-1)}$ is the sample set associated with the density function $\psi$.
Theorem 2 Let the unnormalized density function $\psi : [0,1]^{s-1}\to\mathbb{R}^+$ satisfy all the assumptions stated in Theorem 1. Let $Y_N^{(s-1)}$ be the samples generated by the acceptance-rejection sampler using stratified inputs in Algorithm 1. Then for $2 \le q \le \infty$ we have
$$\left(\mathbb{E}\big[N^q L_{q,N}^q(Y_N^{(s-1)})\big]\right)^{1/q} \le \frac{(L^2+C^2)^{1/2}}{\sqrt{2}\,C}\,\big(3s^{1/2}\mathcal{M}_{\partial A}\big)^{1-1/q}\left(\frac{2}{\lambda(A)}\right)^{(1-1/s)(1-1/q)}N^{(1-1/s)(1-1/q)}, \qquad (11)$$
where $\mathcal{M}_{\partial A}$ is the $(s-1)$-dimensional Minkowski content of $\partial A$ and the expectation is taken with respect to the stratified inputs.

Proof Let $J_t = ([0,t)\times[0,1]) \cap A$, where $t = (t_1,\ldots,t_{s-1}) \in [0,1]^{s-1}$. Let
$$\eta_i(t) = 1_{Q_i\cap J_t}(x_i) - \lambda(Q_i\cap J_t)/\lambda(Q_i),$$
where $Q_0,\ldots,Q_{M-1}$ is a disjoint covering of $[0,1)^s$ with $\lambda(Q_i) = 1/M$. Then $\mathbb{E}[\eta_i(t)] = 0$ since $\mathbb{E}[1_{Q_i\cap J_t}(x_i)] = M\lambda(Q_i\cap J_t)$. Hence for any $t \in [0,1]^{s-1}$,
$$\mathbb{E}[\eta_i^2(t)] = \mathbb{E}\big[(1_{Q_i\cap J_t}(x_i) - M\lambda(Q_i\cap J_t))^2\big] = M\lambda(Q_i\cap J_t)\big(1 - M\lambda(Q_i\cap J_t)\big) \le 1/4.$$


If $Q_i \subseteq J_t$ or if $Q_i \cap J_t = \emptyset$, we have $\eta_i(t) = 0$. We order the sets $Q_i$ such that $Q_0, Q_1, \ldots, Q_{i_0}$ satisfy $Q_i\cap J_t \ne \emptyset$ and $Q_i \not\subseteq J_t$ (i.e. $Q_i$ intersects the boundary of $J_t$), and the remaining sets $Q_i$ either satisfy $Q_i\cap J_t = \emptyset$ or $Q_i \subseteq J_t$. If $\partial A$ admits an $(s-1)$-dimensional Minkowski content, it follows from Lemma 3 that
$$\sum_{i=0}^{M-1}\eta_i^2(t) = \sum_{i=0}^{l-1}\eta_i^2(t) \le l/4 \quad\text{for all } t \in [0,1]^{s-1}.$$
Again, $\mathbb{E}[N] = M\lambda(A)$ from Eq. (2). Now for $q = 2$,
$$\left(\mathbb{E}\big[N^2 L_{2,N}^2(Y_N^{(s-1)})\big]\right)^{1/2} = \left(\mathbb{E}\int_{[0,1)^{s-1}}\left|\sum_{n=0}^{N-1}1_{[0,t)}(y_n) - \frac{N}{C}\int_{[0,t)}\psi(z)\,dz\right|^2 dt\right)^{1/2}$$
$$= \left(\mathbb{E}\int_{[0,1)^{s-1}}\left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - \frac{N\lambda(J_t)}{\lambda(A)}\right|^2 dt\right)^{1/2}$$
$$\le \left(\mathbb{E}\int_{[0,1)^{s-1}}\left(\left|\sum_{n=0}^{M-1}1_{J_t}(x_n) - M\lambda(J_t)\right| + \frac{\lambda(J_t)}{\lambda(A)}\big|\mathbb{E}(N)-N\big|\right)^2 dt\right)^{1/2}$$
$$\le \left(2\,\mathbb{E}\int_{[0,1)^{s-1}}\left|\sum_{i=0}^{M-1}\eta_i(t)\right|^2 dt + 2\int_{[0,1)^{s-1}}\frac{\lambda^2(J_t)}{(\lambda(A))^2}\,\mathbb{E}\big(\mathbb{E}(N)-N\big)^2\, dt\right)^{1/2},$$
where we use $(a+b)^2 \le 2(a^2+b^2)$.
Then, writing $\mathbf{1} = (1,\ldots,1)$ so that $N - \mathbb{E}(N) = \sum_{i=0}^{M-1}\eta_i(\mathbf{1})$, and using the independence of the $\eta_i$, $\lambda(J_t)\le 1$ and $1/\lambda(A) = L/C$, we have
$$\left(\mathbb{E}\big[N^2 L_{2,N}^2(Y_N^{(s-1)})\big]\right)^{1/2} \le \left(2\int_{[0,1]^{s-1}}\sum_{i=0}^{M-1}\mathbb{E}[\eta_i^2(t)]\,dt + \frac{2L^2}{C^2}\sum_{i=0}^{l-1}\mathbb{E}[\eta_i^2(\mathbf{1})]\right)^{1/2}$$
$$\le \left(\frac{2l}{4} + \frac{2L^2}{C^2}\,\frac{l}{4}\right)^{1/2} = \frac{(L^2+C^2)^{1/2}}{\sqrt{2}\,C}\, l^{1/2}.$$

Since $|\eta_i(t)| \le 1$, for $q = \infty$ we have
$$\sup_{P_M\subset[0,1]^s}\ \sup_{t\in[0,1]^{s-1}}\left|\sum_{i=0}^{M-1}\eta_i(t)\right| = \sup_{P_M\subset[0,1]^s}\ \sup_{t\in[0,1]^{s-1}}\left|\sum_{i=0}^{l-1}\eta_i(t)\right| \le l,$$
where the suprema are taken over all realizations of the stratified input set $P_M$.


Therefore, for $2 \le q \le \infty$,
$$\left(\mathbb{E}\big[N^q L_{q,N}^q(Y_N^{(s-1)})\big]\right)^{1/q} \le \frac{(L^2+C^2)^{1/2}}{\sqrt{2}\,C}\, l^{\,1-1/q},$$
which is a consequence of the log-convexity of the $L_p$-norms, i.e. $\|f\|_p \le \|f\|_{p_0}^{1-\theta}\|f\|_{p_1}^{\theta}$, where $1/p = (1-\theta)/p_0 + \theta/p_1$; in our case $p_0 = 2$ and $p_1 = \infty$.
Additionally, it follows from Lemma 4 that $M \le 2LN/C$ whenever $M > (6Ls^{1/2}\mathcal{M}_{\partial A}/C)^s$. Hence we obtain the desired result by substituting $l = 3s^{1/2}\mathcal{M}_{\partial A}M^{1-1/s}$ and expressing $M$ in terms of $N$. $\square$

Remark 2 It would also be interesting to find out whether (11) still holds for 1 <
q < 2. See Heinrich [15] for a possible proof technique.
We leave it as an open problem.

3.3 Numerical Tests and Discussion of Results


We consider the discrepancy of samples generated by Algorithm 1 with respect to
the given density $\psi$ defined by
$$\psi(x_1,x_2,x_3,x_4) = \frac{1}{4}\left(e^{-x_1}+e^{-x_2}+e^{-x_3}+e^{-x_4}\right), \quad (x_1,x_2,x_3,x_4) \in [0,1]^4.$$

To compute the star-discrepancy, we utilize the same technique as in [33], a so-called $\delta$-cover, to estimate the supremum in the definition of the star-discrepancy. We also calculate the $L_q$-discrepancy of the samples for this example. The $L_q$-discrepancy with respect to a density function $\psi$ is defined by
$$L_q(Y_N^{(s-1)},\psi) = \left(\int_{[0,1]^{s-1}}\left|\frac{1}{N}\sum_{n=0}^{N-1}1_{[0,t)}(y_n) - \frac{1}{C}\int_{[0,t)}\psi(z)\,dz\right|^q dt\right)^{1/q}, \qquad (12)$$
where $C = \int_{[0,1]^{s-1}}\psi(z)\,dz$ and $t = (t_1,\ldots,t_{s-1})$. One can write down a precise formula for the squared $L_2$-discrepancy for the $\psi$ given in this example, namely
$$L_2(Y_N^{(s-1)},\psi)^2 = \frac{1}{N^2}\sum_{m,n=0}^{N-1}\prod_{j=1}^{4}\big(1-\max\{y_{m,j},y_{n,j}\}\big) + \frac{1}{4C^2}\left(\frac{71}{54e^2} - \frac{16}{27e} + \frac{7}{108}\right)$$
$$\qquad - \frac{1}{16NC}\sum_{i=0}^{N-1}\sum_{j=1}^{4}\left(1 - y_{i,j} + e^{-1} - e^{-y_{i,j}}\right)\frac{\prod_{k=1}^{4}(1-y_{i,k}^2)}{1-y_{i,j}^2},$$
where $C = 1 - 1/e$.
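The closed-form expression above can be implemented directly; the following Python sketch (our own code, with the formula as reconstructed above and illustrative function names) evaluates it for a given sample set $Y_N^{(4)}$:

```python
import numpy as np

E = np.e
C = 1.0 - 1.0 / E   # normalisation constant of psi

def l2_discrepancy_psi(y):
    """Squared L2-discrepancy of y (shape (N, 4)) with respect to the
    density psi(x) = (exp(-x1)+...+exp(-x4))/4, using the formula above."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    # (1/N^2) sum_{m,n} prod_j (1 - max(y_mj, y_nj))
    mx = np.maximum(y[:, None, :], y[None, :, :])
    t1 = np.prod(1.0 - mx, axis=2).sum() / n ** 2
    # constant term
    t2 = (71.0 / (54.0 * E ** 2) - 16.0 / (27.0 * E) + 7.0 / 108.0) / (4.0 * C ** 2)
    # cross term
    prod_all = np.prod(1.0 - y ** 2, axis=1, keepdims=True)   # prod_k (1 - y_k^2)
    factor = 1.0 - y + 1.0 / E - np.exp(-y)                    # per coordinate j
    t3 = (factor * prod_all / (1.0 - y ** 2)).sum() / (16.0 * n * C)
    return t1 + t2 - t3

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(l2_discrepancy_psi(rng.random((512, 4))))
```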


Theorem 1 shows that Algorithm 1 can yield a point set satisfying the discrepancy bound (3). To test this result numerically and to compare it with the acceptance-rejection algorithm using random inputs, we performed the following experiment. We generated 100 independent stratified inputs and 100 independent pseudo-random inputs for the acceptance-rejection algorithm. From the sample sets obtained from the acceptance-rejection algorithm we chose those samples which yielded the fastest rate of convergence for stratified inputs and also for pseudo-random inputs.
Theorem 1 suggests a convergence rate of order $N^{-1/2-1/(2s)} = N^{-0.6}$ for stratified inputs. The numerical results in this test show an empirical convergence rate of order $N^{-0.62}$, see Fig. 1. In comparison, the same test carried out with the stratified inputs replaced by pseudo-random inputs shows a convergence rate of order $N^{-0.55}$. As expected, stratified inputs outperform random inputs.
We also performed numerical experiments to test Theorem 2. For $q = \infty$, the left side of (11) is the essential supremum of the random variable $N L_{\infty,N}(Y_N^{(s-1)})$. Theorem 2 suggests a convergence rate of order $N^{-1/s} = N^{-0.2}$. To compare this result with the numerical performance in our example, we again used 100 independent runs, but now chose the one with the worst convergence rate for each case. With stratified inputs we get a convergence rate of order $N^{-0.55}$ in this case (see Fig. 1), which may suggest that Theorem 2 is too pessimistic. Note that Theorem 2 only requires very weak smoothness assumptions on the target density, whereas the density in our example is very smooth. This may also explain the difference between the theoretical and numerical results.
We also test Theorem 2 for the case $q = 2$. In this case, the left side of (11) is an $L_2$ average of $N L_{2,N}(Y_N^{(s-1)})$. Theorem 2 with $q = 2$ suggests a convergence rate of $L_{2,N}(Y_N^{(s-1)})$ of order $N^{-1/2-1/(2s)} = N^{-0.6}$. The numerical experiment in Fig. 2

Fig. 1 Convergence order of the star-discrepancy (log–log plot of discrepancy versus number of points; fitted rates: Random-worst $\approx 0.74\,N^{-0.45}$, Random-best $\approx 1.99\,N^{-0.55}$, Stratified-worst $\approx 0.98\,N^{-0.55}$, Stratified-best $\approx 2.03\,N^{-0.62}$)

Fig. 2 Convergence order of the $L_2$-discrepancy (log–log plot of discrepancy versus number of points; fitted rates: L2-Stratified $\approx 0.26\,N^{-0.59}$, L2-Random $\approx 0.24\,N^{-0.50}$)

yields a convergence rate of order $N^{-0.59}$, roughly in agreement with Theorem 2 for $q = 2$. For random inputs we observe a convergence rate of order $N^{-0.50}$, as one would expect.
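The empirical convergence orders quoted above come from fitting a power law $D \approx c\,N^{-r}$ to measured discrepancies. A short sketch of such a least-squares fit on the log–log scale (our own code, not taken from the paper) is:

```python
import numpy as np

def fit_rate(ns, discrepancies):
    """Fit D ~ c * N**(-r) by least squares in log-log coordinates; returns (c, r)."""
    logn = np.log(np.asarray(ns, dtype=float))
    logd = np.log(np.asarray(discrepancies, dtype=float))
    slope, intercept = np.polyfit(logn, logd, 1)
    return np.exp(intercept), -slope

# example: data decaying roughly like 2 * N**(-0.6)
ns = np.array([2 ** k for k in range(6, 15)])
print(fit_rate(ns, 2.0 * ns ** -0.6))
```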

4 Improved Rate of Convergence for a Deterministic Acceptance-Rejection Sampler

In this section we prove a convergence rate of order $N^{-\rho}$ for $1/s \le \rho < 1$, where $\rho$ depends on the target density $\psi$; see Corollary 1 below for details. For this result we use $(t,m,s)$-nets (see Definition 5 below) as inputs instead of stratified samples. The value of $\rho$ depends on how well the graph of $\psi$ can be covered by certain rectangles (see Eq. (13)). In practice this covering rate of order $N^{-\rho}$ is hard to determine precisely; $\rho$ can range anywhere in $[1/s, 1)$, and $\rho$ arbitrarily close to $1$ can be achieved if $\psi$ is constant. We also provide a simple example in dimension $s = 2$ for which $\rho$ can take on the values $\rho = 1 - \frac{1}{\ell}$ for $\ell \in \mathbb{N}$, $\ell \ge 2$; see Example 1 for details.
We first establish some notation and useful definitions and then derive the theoretical results. We begin with the definition of $(t,m,s)$-nets in base $b$ (see [8]), which we use as the driver sequence. The following definitions of elementary intervals and fair sets are needed to define a $(t,m,s)$-net in base $b$.
Definition 3 ($b$-adic elementary interval) Let $b \ge 2$ be an integer. An $s$-dimensional $b$-adic elementary interval is an interval of the form
$$\prod_{i=1}^{s}\left[\frac{a_i}{b^{d_i}}, \frac{a_i+1}{b^{d_i}}\right)$$
with integers $0 \le a_i < b^{d_i}$ and $d_i \ge 0$ for all $1 \le i \le s$. If $d_1,\ldots,d_s$ are such that $d_1 + \cdots + d_s = k$, then we say that the elementary interval is of order $k$.
Definition 4 (fair sets) For a given point set $P_N = \{x_0, x_1, \ldots, x_{N-1}\}$ consisting of $N$ points in $[0,1)^s$, we say that a subset $J$ of $[0,1)^s$ is fair with respect to $P_N$ if
$$\frac{1}{N}\sum_{n=0}^{N-1}1_J(x_n) = \lambda(J),$$
where $1_J(x_n)$ is the indicator function of the set $J$.


Definition 5 ($(t,m,s)$-nets in base $b$) For a given dimension $s \ge 1$, an integer base $b \ge 2$, a positive integer $m$ and an integer $t$ with $0 \le t \le m$, a point set $Q_{m,s}$ of $b^m$ points in $[0,1)^s$ is called a $(t,m,s)$-net in base $b$ if $Q_{m,s}$ is fair with respect to all $b$-adic $s$-dimensional elementary intervals of order at most $m-t$.
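For small $m$ the net property in Definition 5 can be checked by brute force, enumerating all $b$-adic elementary intervals of order at most $m - t$. The sketch below (our own code; restricted to $s = 2$ and base $b = 2$ for brevity) also uses the fact that the two-dimensional Hammersley point set is a $(0,m,2)$-net in base 2:

```python
import numpy as np
from itertools import product

def is_tms_net(points, t, m, b=2):
    """Brute-force check of the (t, m, s)-net property for s = 2:
    every b-adic elementary interval of order at most m - t must be fair."""
    n = len(points)
    assert n == b ** m
    for k in range(m - t + 1):                  # order of the elementary interval
        for d1 in range(k + 1):
            d2 = k - d1
            for a1, a2 in product(range(b ** d1), range(b ** d2)):
                lo = np.array([a1 / b ** d1, a2 / b ** d2])
                hi = np.array([(a1 + 1) / b ** d1, (a2 + 1) / b ** d2])
                count = np.sum(np.all((points >= lo) & (points < hi), axis=1))
                if count * b ** k != n:         # fair <=> count / n == b**(-k)
                    return False
    return True

# 2D Hammersley set {(i / 2^m, bit-reversal of i / 2^m)}: a (0, m, 2)-net in base 2
m = 5
idx = np.arange(2 ** m)
vdc = np.array([int(format(i, f"0{m}b")[::-1], 2) for i in idx]) / 2 ** m
pts = np.column_stack([idx / 2 ** m, vdc])
print(is_tms_net(pts, t=0, m=m))
```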
We now present the acceptance-rejection algorithm using a $(t,m,s)$-net as driver sequence.
Algorithm 2 Let the target density $\psi : [0,1]^{s-1}\to\mathbb{R}^+$, where $s \ge 2$, be given. Assume that we know a constant $L < \infty$ such that $\psi(x) \le L$ for all $x \in [0,1]^{s-1}$. Let $A = \{z \in [0,1]^s : \psi(z_1,\ldots,z_{s-1}) \ge L z_s\}$. Suppose we aim to obtain approximately $N$ samples from $\psi$.
(i) Let $M = b^m \ge N\Big/\int_{[0,1]^{s-1}}\psi(x)/L\,dx$, where $m \in \mathbb{N}$ is the smallest integer satisfying this inequality. Generate a $(t,m,s)$-net $Q_{m,s} = \{x_0, x_1, \ldots, x_{b^m-1}\}$ in base $b$.
(ii) Use the acceptance-rejection method for the points $Q_{m,s}$ with respect to the density $\psi$, i.e. accept the point $x_n$ if $x_n \in A$, otherwise reject. Let $P_N^{(s)} = A \cap Q_{m,s} = \{z_0,\ldots,z_{N-1}\}$ be the set of accepted points.
(iii) Project the points $P_N^{(s)}$ onto the first $s-1$ coordinates. Let $Y_N^{(s-1)} = \{y_0,\ldots,y_{N-1}\} \subseteq [0,1]^{s-1}$ be the projections of the points of $P_N^{(s)}$.
(iv) Return the point set $Y_N^{(s-1)}$.
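A sketch of Algorithm 2 for $s = 2$ (our own illustration; the paper does not prescribe a particular net construction, and we simply reuse the two-dimensional Hammersley net, a $(0,m,2)$-net in base 2, as the driver sequence):

```python
import numpy as np

def hammersley_net(m):
    """Two-dimensional Hammersley point set with 2^m points, a (0, m, 2)-net in base 2."""
    idx = np.arange(2 ** m)
    vdc = np.array([int(format(i, f"0{m}b")[::-1], 2) for i in idx]) / 2 ** m
    return np.column_stack([idx / 2 ** m, vdc])

def net_ar_sampler(psi, L, C, n_target):
    """Sketch of Algorithm 2 for s = 2: acceptance-rejection driven by a net.

    psi : unnormalised density on [0,1], bounded by L, with C = integral of psi.
    n_target : desired (approximate) number of accepted samples.
    """
    m = int(np.ceil(np.log2(n_target * L / C)))   # smallest m with 2^m >= N * L / C
    x = hammersley_net(m)
    accept = psi(x[:, 0]) >= L * x[:, 1]          # keep points below the graph of psi / L
    return x[accept, 0]                            # project onto the first coordinate

if __name__ == "__main__":
    psi = lambda u: np.exp(-u)                    # example density: L = 1, C = 1 - 1/e
    y = net_ar_sampler(psi, L=1.0, C=1.0 - np.exp(-1.0), n_target=1000)
    print(len(y))
```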
In the following we show that an improvement of the discrepancy bound for the deterministic acceptance-rejection sampler is possible. Let an unnormalized density function $\psi : [0,1]^{s-1}\to\mathbb{R}^+$, with $s \ge 2$, be given. Let again
$$A = \{z = (z_1,\ldots,z_s) \in [0,1]^s : \psi(z_1,\ldots,z_{s-1}) \ge L z_s\}$$


and $J_t = ([0,t)\times[0,1]) \cap A$. Let $\partial J_t$ denote the boundary of $J_t$ and let $\partial[0,1]^s$ denote the boundary of $[0,1]^s$. For $k \in \mathbb{N}$ we define the covering number
$$\mathcal{N}_k(\psi) = \sup_{t\in[0,1]^{s-1}}\min\Big\{v : \exists\, U_1,\ldots,U_v \in \mathcal{E}_k \text{ with } \partial J_t\setminus\partial[0,1]^s \subseteq \bigcup_{i=1}^{v}U_i \text{ and } U_i\cap U_{i'} = \emptyset \text{ for } 1\le i<i'\le v\Big\}, \qquad (13)$$
where $\mathcal{E}_k$ is the family of $b$-adic elementary intervals of order $k$.


Lemma 6 Let : [0, 1]s1 [0, 1] be an unnormalized target density and let the
covering number mt () be given by (13). Then the discrepancy of the point set
Y N(s1) = { y0 , y1 , . . . , y N 1 } [0, 1]s1 generated by Algorithm 2 using a (t, m, s)net in base b, for large enough N , satisfies
D N , (Y N(s1) ) 4C 1 bt mt ()N 1 ,
where C =


[0,1]s1

(z)d z.

Proof Let $t \in [0,1]^{s-1}$ be given. Let $v = \mathcal{N}_{m-t}(\psi)$ and let $U_1,\ldots,U_v$ be elementary intervals of order $m-t$ such that $U_1\cup\cdots\cup U_v \supseteq \partial J_t\setminus\partial[0,1]^s$ and $U_i\cap U_{i'} = \emptyset$ for $1 \le i < i' \le v$. Let $V_1,\ldots,V_z \in \mathcal{E}_{m-t}$ with $V_i\subseteq J_t$, $V_i\cap V_{i'} = \emptyset$ for all $1\le i<i'\le z$ and $V_i\cap U_{i'} = \emptyset$, such that $\bigcup_{i=1}^{z}V_i \cup \bigcup_{i=1}^{v}U_i \supseteq J_t$. We define
$$\overline{W} = \bigcup_{i=1}^{z}V_i \cup \bigcup_{i=1}^{v}U_i \qquad\text{and}\qquad W^{\circ} = \bigcup_{i=1}^{z}V_i.$$
Then $\overline{W}$ and $W^{\circ}$ are fair with respect to the $(t,m,s)$-net, $W^{\circ} \subseteq J_t \subseteq \overline{W}$, and
$$\lambda(\overline{W}\setminus J_t),\ \lambda(J_t\setminus W^{\circ}) \le \lambda(\overline{W}\setminus W^{\circ}) = \sum_{i=1}^{v}\lambda(U_i) = \sum_{i=1}^{v} b^{-m+t} = b^{-m+t}\,\mathcal{N}_{m-t}(\psi).$$
The result now follows by the same arguments as in the proofs of [33, Lemma 1 & Theorem 1]. $\square$

From Lemma 3 we have that if $\partial A$ admits an $(s-1)$-dimensional Minkowski content, then
$$\mathcal{N}_k(\psi) \le c_s\, b^{(1-1/s)k}.$$
This yields a convergence rate of order $N^{-1/s}$ in Lemma 6. Another known example is the following. Assume that $\psi$ is constant. Since the graph of $\psi$ can be covered by just one elementary interval of order $m-t$, this is the simplest possible case. The results from [24, Sect. 3] (see also [8, pp. 184–190] for an exposition in dimensions $s = 1, 2, 3$) imply that $\mathcal{N}_k(\psi) \le C_s\, k^{s-1}$ for some constant $C_s$ which depends only on $s$. This yields a convergence rate of order $(\log N)^{s-1}N^{-1}$ in Lemma 6. Thus, in general, there are constants $c_{s,\psi}$ and $C_{s,\psi}$ depending only on $s$ and $\psi$ such that
$$c_{s,\psi}\, k^{s-1} \le \mathcal{N}_k(\psi) \le C_{s,\psi}\, b^{(1-1/s)k}, \qquad (14)$$
whenever the set $\partial A$ admits an $(s-1)$-dimensional Minkowski content. This yields a convergence rate in Lemma 6 of order $N^{-\rho}$ with $1/s \le \rho < 1$, where the precise value of $\rho$ depends on $\psi$. We obtain the following corollary.
Corollary 1 Let $\psi : [0,1]^{s-1}\to[0,1]$ be an unnormalized target density and let $\mathcal{N}_k(\psi)$ be given by (13). Assume that there is a constant $\alpha > 0$ such that
$$\mathcal{N}_k(\psi) \le \alpha\, b^{(1-\rho)k}\, k^{\beta} \quad\text{for all } k\in\mathbb{N},$$
for some $1/s \le \rho \le 1$ and $\beta \ge 0$. Then there is a constant $\gamma_{s,t,\alpha} > 0$, which depends only on $s$, $t$ and $\alpha$, such that the discrepancy of the point set $Y_N^{(s-1)} = \{y_0, y_1, \ldots, y_{N-1}\} \subseteq [0,1]^{s-1}$ generated by Algorithm 2 using a $(t,m,s)$-net in base $b$ satisfies, for large enough $N$,
$$D_{N,\psi}^{*}(Y_N^{(s-1)}) \le \gamma_{s,t,\alpha}\, N^{-\rho}(\log N)^{\beta}.$$
Example 1 To illustrate the bound in Corollary 1, we now consider an example for which we can obtain an explicit bound on $\mathcal{N}_k(\psi)$ of order $b^{k(1-\rho)}$ for $1/2 \le \rho < 1$. For simplicity let $s = 2$ and $\rho = 1 - \frac{1}{\ell}$ for some $\ell\in\mathbb{N}$ with $\ell \ge 2$. We define a function $\psi_\ell : [0,1)\to[0,1)$ in the following way: let $x\in[0,1)$ have $b$-adic expansion
$$x = \frac{\xi_1}{b} + \frac{\xi_2}{b^2} + \frac{\xi_3}{b^3} + \cdots,$$
where $\xi_i\in\{0,1,\ldots,b-1\}$ and infinitely many of the $\xi_i$ are different from $b-1$. Then set
$$\psi_\ell(x) = \frac{\xi_1}{b^{\ell-1}} + \frac{\xi_2}{b^{2(\ell-1)}} + \frac{\xi_3}{b^{3(\ell-1)}} + \cdots.$$

Let $t\in[0,1)$. In the following we define elementary intervals of order $k\in\mathbb{N}$ which cover $\partial J_t\setminus\partial[0,1]^2$. Assume first that $k$ is a multiple of $\ell$ and let $g = k/\ell$. We define the following elementary intervals of order $k = g\ell$:
$$\left[\frac{a_1}{b}+\cdots+\frac{a_{g-1}}{b^{g-1}}+\frac{a_g}{b^g},\ \frac{a_1}{b}+\cdots+\frac{a_{g-1}}{b^{g-1}}+\frac{a_g+1}{b^g}\right)\times\left[\frac{a_1}{b^{\ell-1}}+\cdots+\frac{a_{g-1}}{b^{(g-1)(\ell-1)}}+\frac{a_g}{b^{g(\ell-1)}},\ \frac{a_1}{b^{\ell-1}}+\cdots+\frac{a_{g-1}}{b^{(g-1)(\ell-1)}}+\frac{a_g+1}{b^{g(\ell-1)}}\right), \qquad (15)$$
where $a_1,\ldots,a_g\in\{0,1,\ldots,b-1\}$ run through all possible choices such that
$$\frac{a_1}{b}+\cdots+\frac{a_{g-1}}{b^{g-1}}+\frac{a_g+1}{b^g} \le t.$$
The number of these choices for $a_1,\ldots,a_g$ is bounded by $b^g$. Let
$$t = \frac{t_1}{b}+\cdots+\frac{t_g}{b^g}+\frac{t_{g+1}}{b^{g+1}}+\cdots.$$
For integers $1\le u\le g(\ell-1)$ and $0\le c_u<t_{g+u}$, we define the intervals
$$\left[\frac{t_1}{b}+\cdots+\frac{t_{g+u-1}}{b^{g+u-1}}+\frac{c_u}{b^{g+u}},\ \frac{t_1}{b}+\cdots+\frac{t_{g+u-1}}{b^{g+u-1}}+\frac{c_u+1}{b^{g+u}}\right)\times\left[\frac{d_1}{b}+\cdots+\frac{d_{g(\ell-1)-u}}{b^{g(\ell-1)-u}},\ \frac{d_1}{b}+\cdots+\frac{d_{g(\ell-1)-u}}{b^{g(\ell-1)-u}}+\frac{1}{b^{g(\ell-1)-u}}\right), \qquad (16)$$
where $d_i=0$ if $\ell\nmid i$, $d_i=t_{i/\ell}$ if $\ell\mid i$, and we set $\frac{d_1}{b}+\cdots+\frac{d_{g(\ell-1)-u}}{b^{g(\ell-1)-u}} = 0$ if $u = g(\ell-1)$.
Further we define the interval
$$\left[\frac{t_1}{b}+\cdots+\frac{t_g}{b^g},\ \frac{t_1}{b}+\cdots+\frac{t_g}{b^g}+\frac{1}{b^g}\right)\times[0,1). \qquad (17)$$
The intervals defined in (15)–(17) cover $\partial J_t\setminus\partial[0,1]^2$. Thus we have
$$\mathcal{N}_{g\ell}(\psi_\ell) \le b^g + b\,g(\ell-1) + 1.$$
For arbitrary $k\in\mathbb{N}$ we can use elementary intervals of order $k$ which cover the same area as the intervals (15)–(17). Thus we have at most $b^{\ell-1}$ times as many intervals and we therefore obtain
$$\mathcal{N}_k(\psi_\ell) \le b^{\lceil k/\ell\rceil + 1}.$$
Thus we obtain
$$\sup_{t\in[0,1]}\left|\frac{1}{N}\sum_{n=0}^{N-1}1_{[0,t)}(y_n) - \frac{1}{C}\int_0^{t}\psi_\ell(z)\,dz\right| \le \gamma_{s,t,\alpha}\, N^{-(1-\frac{1}{\ell})}.$$
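For concreteness (our own illustration, not from the paper), the digit-spreading function $\psi_\ell$ can be evaluated to finite precision directly from the $b$-adic digits of $x$:

```python
def psi_ell(x, ell, b=2, digits=30):
    """Evaluate psi_ell(x) = sum_i xi_i * b**(-i*(ell-1)) from the first
    `digits` b-adic digits xi_i of x in [0, 1)."""
    value, frac = 0.0, x
    for i in range(1, digits + 1):
        frac *= b
        xi = int(frac)          # i-th b-adic digit of x
        frac -= xi
        value += xi * b ** (-i * (ell - 1))
    return value

print(psi_ell(0.625, ell=2))    # for ell = 2 the function is the identity: 0.625
```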
Remark 3 In order to obtain results similar to those of this section for stratified inputs rather than $(t,m,s)$-nets, one would have to use elementary intervals $U_1,\ldots,U_v$ of order $k$ which yield a covering of $\partial J_t\setminus\partial[0,1]^s$ for all $t\in[0,1]^{s-1}$. From this covering one would then have to construct a covering of $\partial A\setminus\partial[0,1]^s$ and use this covering to obtain stratified inputs. Since such a covering is not easily available in general, we did not pursue this approach further.


5 Concluding Remarks
In this paper we study an acceptance-rejection sampling method using stratified inputs. We examine the star-discrepancy and the $L_q$-discrepancy and show that the star-discrepancy is bounded by a quantity of order $N^{-1/2-1/(2s)}$, which is slightly better than the rate of plain Monte Carlo sampling. A bound on the $L_q$-discrepancy is given through an estimate of $\big(\mathbb{E}[N^qL_{q,N}^q]\big)^{1/q}$. It is established that $\big(\mathbb{E}[N^qL_{q,N}^q]\big)^{1/q}$ achieves an order of convergence of $N^{(1-1/s)(1-1/q)}$ for $2 \le q \le \infty$. Unfortunately, our arguments do not yield an improvement for the case $1 < q < 2$. Our numerical experiments show that using stratified inputs in the acceptance-rejection sampler outperforms the original algorithm with pseudo-random inputs. The numerical results are roughly in agreement with the upper bounds of Theorems 1 and 2.
We also show that the upper bound on the star-discrepancy for a deterministic driver sequence can be improved to $N^{-\rho}$ for $1/s \le \rho < 1$ under additional assumptions on the target density. An example illustrates these theoretical results.
Acknowledgments The work was supported by Australian Research Council Discovery Project
DP150101770. We thank Daniel Rudolf and the anonymous referee for many very helpful comments.

References
1. Ambrosio, L., Colesanti, A., Villa, E.: Outer Minkowski content for some classes of closed sets. Math. Ann. 342, 727–748 (2008)
2. Beck, J.: Some upper bounds in the theory of irregularities of distribution. Acta Arith. 43, 115–130 (1984)
3. Botts, C., Hörmann, W., Leydold, J.: Transformed density rejection with inflection points. Stat. Comput. 23, 251–260 (2013)
4. Chen, S.: Consistency and convergence rate of Markov chain quasi Monte Carlo with examples. Ph.D. thesis, Stanford University (2011)
5. Chen, S., Dick, J., Owen, A.B.: Consistency of Markov chain quasi-Monte Carlo on continuous state spaces. Ann. Stat. 39, 673–701 (2011)
6. Devroye, L.: A simple algorithm for generating random variates with a log-concave density. Computing 33, 247–257 (1984)
7. Devroye, L.: Nonuniform Random Variate Generation. Springer, New York (1986)
8. Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
9. Dick, J., Rudolf, D.: Discrepancy estimates for variance bounding Markov chain quasi-Monte Carlo. Electron. J. Probab. 19, 1–24 (2014)
10. Dick, J., Rudolf, D., Zhu, H.: Discrepancy bounds for uniformly ergodic Markov chain quasi-Monte Carlo. http://arxiv.org/abs/1303.2423 [stat.CO], submitted (2013)
11. Doerr, B., Gnewuch, M., Srivastav, A.: Bounds and constructions for the star-discrepancy via δ-covers. J. Complex. 21, 691–709 (2005)
12. Gerber, M., Chopin, N.: Sequential quasi-Monte Carlo. J. R. Stat. Soc. B 77, 1–44 (2015)
13. Gnewuch, M.: Bracketing numbers for axis-parallel boxes and applications to geometric discrepancy. J. Complex. 24, 154–172 (2008)
14. He, Z., Owen, A.B.: Extensible grids: uniform sampling on a space-filling curve. J. R. Stat. Soc. B, 1–15 (2016)
15. Heinrich, S.: The multilevel method of dependent tests. In: Balakrishnan, N., Melas, V.B., Ermakov, S.M. (eds.) Advances in Stochastic Simulation Methods, pp. 47–62. Birkhäuser (2000)
16. Heinrich, S., Novak, E., Wasilkowski, G.W., Woźniakowski, H.: The inverse of the star-discrepancy depends linearly on the dimension. Acta Arith. 96, 279–302 (2001)
17. Hörmann, W.: A reject technique for sampling from T-concave distributions. ACM Trans. Math. Softw. 21, 182–193 (1995)
18. Hörmann, W., Leydold, J., Derflinger, G.: Automatic Nonuniform Random Variate Generation. Springer, Berlin (2004)
19. Kuipers, L., Niederreiter, H.: Uniform Distribution of Sequences. Wiley, New York (1974)
20. L'Ecuyer, P., Lécot, C., Tuffin, B.: A randomized quasi-Monte Carlo simulation method for Markov chains. Oper. Res. 56, 958–975 (2008)
21. Morokoff, W.J., Caflisch, R.E.: Quasi-Monte Carlo integration. J. Comput. Phys. 122, 218–230 (1995)
22. Moskowitz, B., Caflisch, R.E.: Smoothness and dimension reduction in quasi-Monte Carlo methods. Math. Comput. Model. 23, 37–54 (1996)
23. Nguyen, N., Ökten, G.: The acceptance-rejection method for low discrepancy sequences (2014)
24. Niederreiter, H.: Point sets and sequences with small discrepancy. Monatshefte für Mathematik 104, 273–337 (1987)
25. Niederreiter, H., Wills, J.M.: Diskrepanz und Distanz von Maßen bezüglich konvexer und Jordanscher Mengen (German). Mathematische Zeitschrift 144, 125–134 (1975)
26. Owen, A.B.: Monte Carlo Theory, Methods and Examples. http://www-stat.stanford.edu/~owen/mc/. Last accessed Apr 2016
27. Robert, C., Casella, G.: Monte Carlo Statistical Methods, 2nd edn. Springer, New York (2004)
28. Roberts, G.O., Rosenthal, J.S.: Variance bounding Markov chains. Ann. Appl. Probab. 18, 1201–1214 (2008)
29. Tribble, S.D.: Markov chain Monte Carlo algorithms using completely uniformly distributed driving sequences. Ph.D. thesis, Stanford University (2007)
30. Tribble, S.D., Owen, A.B.: Constructions of weakly CUD sequences for MCMC. Electron. J. Stat. 2, 634–660 (2008)
31. Wang, X.: Quasi-Monte Carlo integration of characteristic functions and the rejection sampling method. Comput. Phys. Commun. 123, 16–26 (1999)
32. Wang, X.: Improving the rejection sampling method in quasi-Monte Carlo methods. J. Comput. Appl. Math. 114, 231–246 (2000)
33. Zhu, H., Dick, J.: Discrepancy bounds for deterministic acceptance-rejection samplers. Electron. J. Stat. 8, 678–707 (2014)

Index

B
Barth, Andrea, 209
Bay, Xavier, 521
Belomestny, Denis, 229
Binder, Nikolaus, 423
Brhier, Charles-Edouard, 245

C
Carbone, Ingrid, 261
Chen, Nan, 229
Chopin, Nicolas, 531

D
Dahm, Ken, 423
Dereich, Steffen, 3
Dick, Josef, 599
Durrande, Nicolas, 315

G
Gantner, Robert N., 271
Genz, Alan, 289
Gerber, Mathieu, 531
Giles, Michael B., 303
Ginsbourger, David, 315
Goda, Takashi, 331
Gnc, Ahmet, 351
Goudenge, Ludovic, 245

H
He, Zhijian, 531
Hickernell, Fred J., 367, 407, 583
Hinrichs, Aicke, 385

Hoel, Hkon, 29
Hofer, Roswitha, 87
Hussaini, M. Yousuff, 351
Hppl, Juho, 29

J
Jakob, Wenzel, 107
Jimnez Rugama, Llus Antoni, 367, 407

K
Keller, Alexander, 423
Kritzer, Peter, 437
Kucherenko, Sergei, 455
Kunsch, Robert J., 471

L
Lang, Annika, 489
Lentre, Lionel, 507
Lenz, Nicolas, 315
Lester, Christopher, 303
Li, Sangmeng, 3
Liu, Yaning, 351

M
Maatouk, Hassan, 521
Matsumoto, Makoto, 143

N
Niederreiter, Harald, 87, 531
Novak, Erich, 161

O
Oettershagen, Jens, 385
Ohori, Ryuichi, 143, 331
kten, Giray, 351

P
Pillichshammer, Friedrich, 437

R
Robert, Christian P., 185
Roustant, Olivier, 315

S
Schretter, Colas, 531
Schuhmacher, Dominic, 315
Schwab, Christoph, 209, 271
Siedlecki, Pawe, 545
Song, Shugfang, 455
ukys, Jonas, 209
Suzuki, Kosuke, 331

T
Temlyakov, Vladimir, 557
Tempone, Ral, 29
Trinh, Giang, 289
Tudela, Loc, 245

U
Ullrich, Mario, 571

W
Wang, Yiwei, 229
Whittle, James, 303

Y
Yoshiki, Takehito, 331

Z
Zhou, Xuan, 583
Zhu, Houying, 599
