Roman

SPRINGER BRIEFS IN OPTIMIZATION
Yaroslav D.Sergeyev
Roman G.Strongin
DanielaLera
Introduction
to Global
Optimization
Exploiting SpaceFilling Curves
123
SpringerBriefs in Optimization
Series Editors
Panos M. Pardalos
Janos D. Pinter
Stephen Robinson
Tamas Terlaky
My T. Thai
SpringerBriefs in Optimization showcases algorithmic and theoretical techniques, case studies, and applications within the broad-based field of optimization.
Manuscripts related to the ever-growing applications of optimization in applied
mathematics, engineering, medicine, economics, and other applied sciences are
encouraged.
For further volumes:

http://www.springer.com/series/8918
Yaroslav D. Sergeyev Roman G. Strongin

Daniela Lera
Introduction to Global
Optimization Exploiting
Space-Filling Curves
123
Yaroslav D. Sergeyev
Universit`a della Calabria
Department of Computer Engineering,
Modeling, Electronics and Systems
Rende, Italy
Roman G. Strongin
N.I. Lobachevsky University
of Nizhni Novgorod
Software Department
Nizhni Novgorod, Russia
Daniela Lera
University of Cagliari
Department of Mathematics
and Computer Science
Cagliari, Italy
ISSN 2190-8354
ISSN 2191-575X (electronic)
ISBN 978-1-4614-8041-9
ISBN 978-1-4614-8042-6 (eBook)
DOI 10.1007/978-1-4614-8042-6
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2013943827
Mathematics Subject Classification (2010): 90C26, 14H50, 68W01, 65K05, 90C56, 90C30, 68U99,
65Y99
Yaroslav D. Sergeyev, Roman G. Strongin, Daniela Lera 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publishers location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations
are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Before I speak, I have something important to say.

Groucho Marx
In the literature there exist a lot of traditional local search techniques that have been
designed for problems where the objective function F(y), y D RN , has only one
optimum and a strong a priori information is known about F(y) (for instance, it is
supposed that F(y) is convex and differentiable). In such cases it is used to speak
about local optimization problems. However, in practice the objects and systems to
be optimized are frequently such that the respective objective function F(y) does not
satisfy these strong suppositions. In particular, F(y) can be multiextremal with an
unknown number of local extrema, non-differentiable, each function evaluation can
be a very time-consuming operation (from minutes to hours for just one evaluation
of F(y) on the fastest existing computers), and nothing is known about the internal
structure of F(y) but its continuity. Very often when it is required to find the best
among all the existing locally optimal solutions, in the literature problems of this
kind are called black-box global optimization problems and exactly this kind of
problems and methods for their solving are considered in this book.
The absence of a strong information about F(y) (i.e., convexity, differentiability,
etc.) does not allow one to use traditional local search techniques that require this
kind of information and the necessity to develop algorithms of a new type arises.
In addition, an obvious extra difficulty in using local search algorithms consists
of the presence of several local solutions. When one needs to approximate the
global solution (i.e., the best among the local ones), something more is required in
comparison with local optimization procedures that lead to a local optimum without
discussing the main issue of global optimization: whether the found solution is the
global one we are interested in or not.
Thus, numerical algorithms for solving multidimensional global optimization
problems are the main topic of this book and an important part of the lives of the
authors who have dedicated several decades of their careers to global optimization.
Results of their research in this direction have been presented as plenary lectures
v
vi
Preface
at dozens of international congresses. Together with their collaborators the authors

have published more than a hundred of research papers and several monographs in
English and Russian since 1970s of the twentieth century. Among these publications
the following three volumes [117, 132, 139] can be specially mentioned:
1. Strongin, R.G.: Numerical Methods in Multi-Extremal Problems: InformationStatistical Algorithms. Nauka, Moscow (1978), in Russian
2. Strongin, R.G., Sergeyev, Ya.D.: Global Optimization and Non-Convex Constraints: Sequential and Parallel Algorithms. Kluwer Academic Publishers, DD
(2000)
3. Sergeyev, Ya.D., Kvasov, D.E.: Diagonal Global Optimization Methods. FizMatLit, Moscow (2008), in Russian
Each of these volumes was in some sense special at the time of its appearance:
the monograph of 1978 was one of the first books in the world entirely dedicated to
global optimization; the second monograph for the first time has presented results of
the authors in English in a comprehensive form giving a special emphasis to parallel
computationsa peculiarity that was extremely innovative at that time; finally, the
monograph of 2008 was one of the first books dedicated to global optimization and
published in Russian since events occurred in the Soviet Union in 1990s filling so
in the gap in publications in this direction that was 1520 years long.
The decision to write the present Brief has been made due to the following two
reasons. First, as it becomes clear from the format of this publication (Springer
Brief) and the title of the bookIntroduction to Global Optimization Exploiting
Space-Filling Curvesthe authors wished to give a brief introduction to the subject.
In fact, the monograph [139] of 2000 has been also dedicated to space-filling
curves and global optimization. However, it considers a variety of topics and is
very detailed (it consists of 728 pages). Second, it is more than 10 years since the
monograph [139] has been published in 2000 and the authors wished to present new
results and developments made in this direction in the field.
The present book introduces quite an unusual combination of such a practical
field as global optimization with one of the examples per eccellenza of pure
mathematicsspace-filling curves. The reason for such a combination is the
following. The curves have been first introduced by Giuseppe Peano in 1890 who
has proved that they fill in a hypercube [a, b] RN , i.e., they pass through every point
of [a, b], and this gave rise to the term space-filling curves. Then, in the second half
of the twentieth century it has been independently shown in the Soviet Union and
the USA (see [9,132,139]) that, by using space-filling curves, the multidimensional
global minimization problem over the hypercube [a, b] can be turned into a onedimensional problem giving so a number of new exciting possibilities to attack hard
multidimensional problems using such a reduction.
The book proposes a number of algorithms using space-filling curves for
solving the core global optimization problemminimization of a multidimensional,
multiextremal, non-differentiable Lipschitz (with an unknown Lipschitz constant)
function F(y) over a hypercube [a, b] RN . A special attention is dedicated both
to techniques allowing one to adaptively estimate the Lipschitz constant during
Preface
vii
the optimization process and to strategies leading to a substantial acceleration

of the global search. It should be mentioned that there already exist a lot of
generalizations of the ideas presented here in several directions: algorithms that
use new efficient partition techniques and work with discontinuous functions and
functions having Lipschitz first derivatives; algorithms for solving multicriteria
problems and problems with multiextremal non-differentiable partially defined
constrains; algorithms for finding the minimal root of equations (and sets of
equations) having a multiextremal (and possibly non-differentiable) left-hand part
over an interval; parallel non-redundant algorithms for Lipschitz global optimization
problems and problems with Lipschitz first derivatives, etc. Due to the format of this
volume (Springer Brief) these generalizations are not considered here. However, in
order to guide the reader in possible future investigations, references to a number of
them were collected and systematized (see p. 117).
In conclusion to this preface the authors would like to thank the institutions
they work at: University of Calabria, Italy; N.I. Lobachevsky State University of
Nizhni Novgorod, Russia; University of Cagliari, Italy; and the Institute of High
Performance Computing and Networking of the National Research Council of Italy.
During the recent years the research of the authors has been supported by Italian
and Russian Ministries of University, Education and Science and by the Italian
National Institute of High Mathematics F. Severi. Actually research activities of
the authors are partially supported by the Ministry of Education and Science of
Russian Federation, project 14.B37.21.0878 as well as by the grant 11-01-00682a of the Russian Foundation for Fundamental Research and by the international
program Italian-Russian University.
The authors are very grateful to the following friends and colleagues for their
inestimable help and useful discussions: K. Barkalov, M. Gaviano, V. Gergel,
S. Gorodetskiy, V. Grishagin, and D. Kvasov. The authors thank Prof. Panos
Pardalos for his continuous support that they do appreciate. The authors express
their gratitude to Springers publishing editor Razia Amsad for guiding them during
the publication process.
Finally, the authors cordially thank their families for their love and continuous
support during the preparation of this book.
Rende, Italy
Nizhni Novgorod, Russia
Cagliari, Italy
Yaroslav D. Sergeyev
Roman G. Strongin
Daniela Lera
Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.1 Examples of Space-Filling Curves . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.2 Statement of the Global Optimization Problem . . .. . . . . . . . . . . . . . . . . . . .
1
1
6
2 Approximations to Peano Curves: Algorithms and Software .. . . . . . . . . .

2.1 Space-Filling Curves and Reduction of Dimensionality . . . . . . . . . . . . . .
2.2 Approximations to Peano Curves . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.2.1 Partitions and Numerations.. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.2.2 Types of Approximations and Their Analysis .. . . . . . . . . . . . . . . .
2.3 Standard Routines for Computing Approximations
to Peano Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
9
9
13
13
28
37
3 Global Optimization Algorithms Using Curves to Reduce

Dimensionality of the Problem . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
3.2 One-Dimensional Information and Geometric Methods
in Euclidean Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
3.2.1 Convergence Conditions and Numerical Examples .. . . . . . . . . .
3.2.2 Relationship Between the Information
and the Geometric Approaches . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
3.3 One-Dimensional Geometric Methods in Holderian Metrics . . . . . . . . .
3.3.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
3.3.2 Convergence Properties and Numerical Experiments .. . . . . . . .
3.4 A Multidimensional Information Method.. . . . . . . . .. . . . . . . . . . . . . . . . . . . .
3.5 A Multidimensional Geometric Method . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
59
60
61
65
73
81
4 Ideas for Acceleration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
4.2 Local Tuning and Local Improvement in One Dimension . . . . . . . . . . . .
91
91
93
47
47
50
54
ix
Contents
4.3 Acceleration of Multidimensional Geometric Algorithms .. . . . . . . . . . . 102

4.4 Fast Information Algorithms . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 109
5 A Brief Conclusion .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 117
References .. .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 119
Chapter 1
Introduction
Everyone knows what a curve is, until he has studied enough

mathematics to become confused through the countless number
of possible exceptions.
Felix Kleins
1.1 Examples of Space-Filling Curves

In this chapter, we give just a fast tour in the history of the subject, provide examples
of space-filling curves, discuss some of their interesting (at least for us) properties
(in this section), and introduce global optimization problems that will be considered
in this book (see Sect. 1.2). All technical considerations related to the details of the
construction of space-filling curves, their usage in global optimization, numerical
algorithms, etc. will be moved to the subsequent chapters.
As it happens for any fundamental concept, it is very difficult to give a
general and precise definition of the notion curve. It can be defined differently in
different branches of mathematics in dependence on each concrete aspect taken
into consideration for one or another study. For instance, we can think that a
curve is either the graph of a function of one variable or the path of a point
continuously moving in the space or the intersection of two surfaces. The aspect
that is important for us in the present study is our perception of a curve as a onedimensional object. From this point of view the main definition of this chapter
and of the entire book sounds very surprisingly because it links a one-dimensional
object to a multidimensional one: A curve that passes through every point of a
multidimensional region is called space-filling curve or Peano curve.
In fact, the famous Italian mathematician Giuseppe Peano, who was born in
Spinetta, a hamlet near Cuneo, Italy, on 27 August 1858 and died in Turin, Italy,
on 20 April 1932, was the first who has constructed such a curve in 1890 (see [88]).
Due to the key importance for our investigations and its historical value, this small
paper of Peano is entirely reproduced here for the joy of the reader, see Fig. 1.1
Y.D. Sergeyev et al., Introduction to Global Optimization Exploiting Space-Filling
Curves, SpringerBriefs in Optimization, DOI 10.1007/978-1-4614-8042-6 1,
Fig. 1.1 The paper of G. Peano published in Mathematische Annalen in 1890
1 Introduction
that presents this mathematical masterpiece: G. Peano, Sur une courbe, qui remplit
toute une aire plane, Mathematische Annalen, 36, Janvier 1890, 157160. Further
examples of Peano curves have then been proposed by Hilbert in 1891 (see [62]),
Moore in 1900 (see [84]), Sierpinski in 1912 (see [125]), and others (see [99] and
references given therein).
However, it should be mentioned that the story has begun some years
before 1890, precisely in 1878, when Cantor (see [11]) has proved that any two
finite-dimensional smooth manifolds have the same cardinality. In particular, this
result implies that the interval [0, 1] can be mapped bijectively onto the square
[0, 1] [0, 1]. A year later (and in the same journal) Netto has shown that such a
mapping is necessarily discontinuous (see [85]). Thus, since bijective mappings
are discontinuous, the next important question regarding existence of continuous
mappings from an interval into the space asks about surjective mappings.
These results are of our interest because a continuous mapping from an interval
into the plane (or, more generally, into the space) is one of the ways used to define a
curve. In the two-dimensional case the problem that has been formulated above
is the question of the existence of a curve that passes through every point of a
two-dimensional region having a positive Jordan area. The answer exactly to this
question has been given by Peano in [88] where he has constructed the first instance
of such a curve. In his turn, a year later, Hilbert (see [62]) has made a very important
contribution by explaining how Peano curves can be constructed geometrically.
The paper D. Hilbert, Uber

die steitige abbildung einer linie auf ein flachenst
uck,
Mathematische Annalen, 38, 1891, 459460 has been published in the same journal
where Peano has introduced the space-filling curves. It consists of just two pages
that are shown in Fig. 1.2.
Hilbert has introduced the following procedure that can be explained easily. Let
us take the interval [0, 1] and divide it into four equal subintervals. Let us then take
the square [0, 1] [0, 1] and divide it into four equal subsquares. Since the interval
[0, 1] can be mapped continuously onto the square [0, 1] [0, 1], each of its four
subintervals can be mapped continuously onto one of the four subsquares. Then each
of the subintervals and subsquares is partitioned again and the procedure is repeated
infinitely many times. Hilbert has shown first that subsquares can be rotated in an
opportune way in order to ensure the continuity of the curve on the square. Second,
that the inclusion relationships are preserved; i.e., if a square corresponds to an
interval, then its subsquares correspond to the subintervals of that interval. He has
shown also that the curve constructed in this way is nowhere differentiable.
Figure 1.2 shows that in his paper Hilbert sketches the first steps of his iterative
construction having as the limit to his space-filling curve and consisting of the
sequence of piecewise linear continuous curves that approximate closer and closer
the space-filling curve. It can be seen that at each iteration, the current curve is
substituted by four reduced copies of itself. By using the modern language we can
say that the curve is constructed by applying the principle of self-similarity. Remind
that a structure is said to be self-similar if it can be broken down into arbitrary small
pieces each of which is a small replica of the entire structure.
1 Introduction
Fig. 1.2 The paper of D. Hilbert showing in particular how Peano curves can be constructed
geometrically
Fig. 1.3 Construction of the original Peano curve
The original Peano curve possesses the same property but to be precise it is
necessary to mention that Peanos construction is slightly different. Some of its first
steps are presented in Fig. 1.3. In this book, Hilberts version of Peano curves will
be mainly used; however, in order to emphasize the priority of Peano and following
the tradition used in the literature, the terminology Peano curve will be used to call
this precise curve.
As has been already mentioned, several kinds of space-filling curves have been
then proposed after publishing the seminal Peano and Hilbert articles. In Fig. 1.4 we
show Moores version of Peano curve (see [84]) and Fig. 1.5 presents the procedure
constructing a curve introduced by Sierpinski in 1912 (see [125]) that in contrast to
Fig. 1.4 Construction of Moores version of Peano curve

Fig. 1.5 Construction of
Sierpinski closed curve
the previous ones is closed. Notice that all the curves presented in Figs. 1.21.5 are
in two dimensions. However, Peano curves can be generalized for n > 2 dimensions
and such generalizations will be actually used in this book. To illustrate this point,
Fig. 1.6 shows the procedure of generation of the three-dimensional space-filling
curve.
Space-filling curves for a long time were considered by many people just like
monsters or a kind of noodling until Benot Mandelbrot has published his famous
book B. Mandelbrot, Les objets fractals: forme, hasard et dimension, Flammarion,
Paris, 1975, describing objects that everybody knows nowadays under the name
Mandelbrot has given to them: fractals. Space-filling curves are, in fact, examples of
fractalsobjects that are constructed using principles of self-similarity. They have
a number of amazing properties (the important and interesting notion of fractional
dimension is one of them) and can be frequently met in nature (see, e.g., [20, 25,
60, 89] and references given therein). Fractal objects have been broadly studied and
fractal models have been successfully applied in various fields. The reader interested
in space-filling curves and fractals can continue his/her studies using, for instance,
the following publications [20, 25, 60, 66, 80, 89, 90, 95, 99, 110112, 139, 148].
1 Introduction
Fig. 1.6 Generation of the

three-dimensional
space-filling curve
1
1
0
1 1
level 1
1
1
0
1 1
level 2
1
1
0
1 1
level 3
1
1
0
1 1
level 4
1.2 Statement of the Global Optimization Problem

Many decision-making problems arising in various fields of human activity (technological processes, economic models, etc.) can be stated as global optimization
problems (see, e.g., [7,10,13,21,23,2830,47,64,69,70,72,73,82,93,104,127129,
132, 134, 139, 143, 146, 153156] and references given therein). Objective functions
describing real-life applications are very often multiextremal, non-differentiable,
and hard to be evaluated. Numerical techniques for finding solutions to such
problems have been widely discussed in literature (see [2, 3, 12, 16, 24, 26
30,37,39,41,45,48,51,53,55,61,64,71,78,113,118,120,123,126,136,140,156158],
etc.).
One of the natural and powerful assumptions on these problems (from both
theoretical and applied points of view) is that the objective function has bounded
slopes, i.e., it satisfies the Lipschitz condition. More precisely, a large number
of decision problems in the world of applications (especially if we consider
engineering ones, see, e.g., [12, 18, 19, 26, 27, 73, 93, 121, 139, 149]) may be
formulated as searching for a global optimum (minimum, for certainty)
F = F(y ) = min {F(y) : y D},
(1.1)
where the domain of the search

D = {y RN :
21 y j 21 ,
1 j N},
(1.2)
RN is the N-dimensional Euclidean space and the objective function F(y) satisfies
the Lipschitz condition with a constant L, 0 < L < , i.e., for any two points y , y
D it is true that
1.2 Statement of the Global Optimization Problem

|F(y ) F(y )| L||y y || = L
1/2
(yj yj )2
j=1
(1.3)
Obviously, if the domain of the search is a hyperinterval

S = {w RN :
a j w j b j,
1 j N},
(1.4)
then, by introducing the transformation

y j = (w j (a j + b j )/2)/ ,
(1.5)
= max{b j a j :
(1.6)
1 j N},
it is possible to keep up the initial presentation (1.2) for the domain of the search
(which is assumed to be the standard one) not altering the relations of Lipschitzian
properties in dimensions.
Numerical methods for finding solutions to the Lipschitz global optimization
problem (1.1) have been widely discussed in the literature (see [12, 17, 22
24, 26, 27, 37, 39, 41, 45, 49, 51, 55, 58, 64, 78, 86, 87, 91, 93, 107, 108, 113, 117,
118, 120, 123, 130, 136, 140142, 144, 147, 151, 152], etc.). There exist a number
of generalizations of the problem (1.1). Among them, global optimization problems
with multiextremal, non-differentiable, partially defined constraints deserve a special attention. However, due to the format of this monograph (Springer Brief) they
are not considered here. For the interested reader we advise to see the monograph
[139] together with the following publications [109, 119, 122124, 134, 137, 140]
where an original approach that does not require the introduction of penalties to
deal with constraints has been proposed and used together with the space-filling
curves.
The assumption (1.3) on the function F(y) says that the relative differences of
F(y) are bounded by the constant L. This assumption is very practical because it can
be interpreted as a mathematical description of a limited power of change present
in real systems. If we suppose that the constant L is known, then this information
can be successfully used to develop global optimization algorithms (see [2224, 37,
39,41,51,55,58,64,86,87,91,93,117,139,144,147,151], etc.). From the theoretical
point of view this supposition is certainly very useful. In practice, there exists a
variety of techniques allowing one to approximate L (see [12, 26, 27, 45, 51, 78, 93,
107, 108, 113, 118, 120, 139, 140], etc.). Some of these techniques will be introduced
and discussed in the subsequent chapters.
It is well known that Lipschitz global optimization algorithms (see, e.g., [22, 23,
38, 58, 59, 91, 93, 141, 142]) require, in general, substantially fewer function evaluations than the plain uniform grid technique. This happens because Lipschitz global
optimization methods in order to select each subsequent trial point (hereinafter
evaluation of the objective function at a point is called trial) use all the previously
1 Introduction
computed values of the objective function (see [139]). The one-dimensional case,
N = 1, has been deeply studied and powerful numerical algorithms allowing
us to work with it efficiently have been proposed in the literature (see, e.g.,
[22, 23, 58, 59, 91, 93, 117, 118, 139]). In this case, after k executed trials there are no
serious problems in choosing the point xk+1 of the next trial. In fact, this is reduced
to selecting the minimal value among k values, each of which is usually easy to
compute (see, e.g., [58, 59, 93, 117, 139]). If the dimension N > 1, then the relations
between the location of the next evaluation point and the results of the already
performed evaluations become significantly more complicated. Therefore, finding
the point xk+1 that is optimal with respect to certain criteria is usually the most
time-consuming operation of the multidimensional algorithm, and its complexity
increases with the growth of the problem dimension.
This happens because an optimal selection of xk+1 turns into solving at each
step of the search process an auxiliary multidimensional optimization problem of
an increasing multi-extremality along with the accumulation of the trial points. As a
result, an algorithm aiming to effectively use the acquired search information to
reduce the number of trials needed to estimate the sought optimum also includes
an inherent multi-extremal optimization problem (see [58, 93, 117, 134, 139] where
this subject is covered in much greater detail). But, as was already mentioned, the
case N = 1 is effectively solvable. Therefore it is of great interest to reduce
the multivariate optimization problem (1.1) to its one-dimensional equivalent, which
could be then effectively solved by using techniques developed for dealing with
one-dimensional global optimization problems.
A possible way to do so (see, e.g., [9,131135,139]) is to employ a single-valued
Peano curve y(x) continuously mapping the unit interval [0, 1] from the x-axis onto
the hypercube (1.2) and, thus, yielding the equality
F = F(y ) = F(y(x )) = min{F(y(x)) : x [0, 1]}.
(1.7)
As was already said in the previous section, these curves, first introduced by Peano
in [88] and Hilbert in [62], fill in the cube D, i.e., they pass through every point of
D giving so the possibility to construct numerical univariate algorithms for solving
the problem (1.7) and, therefore, the original problem (1.1). Putting this possibility
in practice is the main goal of this monograph that both describes how to build
approximations to Peano curves on a computer and introduces a number of efficient
global optimization algorithms using these approximations.
Chapter 2
Approximations to Peano Curves: Algorithms

and Software
In rallying every curve, every hill may be different than you

thought. That makes it interesting.
Kimi Raikkonen
2.1 Space-Filling Curves and Reduction of Dimensionality

Due to the important role the space-filling curves play in the subsequent treatment
it is appropriate to fix this term by some formal statement.
Definition 2.1. Single-valued continuous correspondence y(x) mapping the unit
interval [0, 1] on the x-axis onto the hypercube D from (1.2) is said to be a
Peano-type curve or a space-filling curve.
Hereinafter we describe a particular type of space-filling curves emerging as the
limit objects generated by the scheme from [132, 134, 135, 139, 140] succeeding the
ideas from Hilbert [62]. Computable approximations to these curves are employed
in the algorithms suggested in the rest of this book for solving multivariate Lipschitz
global optimization problems.
We introduce now the Curve construction scheme.
Divide the hypercube D from (1.2) into 2N equal hypercubes of the first
partition by cutting D with N mutually orthogonal hyperplanes (each plain is
parallel to one of the coordinate ones and passes through the middle points of
the D edges orthogonal to this hyperplane); note that each of these subcubes has
edge length equal to 21 . Use the index z1 , 0 z1 2N 1, to number all the
subcubes obtained in the above partitioning; each particular subcube is, henceforth,
designated D(z1 ) (for the sake of illustration case N = 2 is presented in Fig. 2.1;
see the left picture). Then divide (in the above manner) each of the obtained
first-partition cubes into 2N second-partition subcubes numbered with the index

10
2 Approximations to Peano Curves: Algorithms and Software

y2
y2
D(3,3) D(3,0) D(2,3) D(2,2)
D(2)
D(3)
D(3,2) D(3,1) D(2,0) D(2,1)
y1
y1
D(0)
D(0,1) D(0,2) D(1,3) D(1,2)
D(1)
D(0,0) D(0,3) D(1,0) D(1,1)
M=2
M=1
Fig. 2.1 Case N = 2. Subcubes of the first partition (left picture) and of the second partition (right
picture) of the initial cube D
d(0)
0
)[
)[
d(1)
)[
)[
)[
)[
)[
)[
d(2)
)[
)[
)[
)[
)[
d(3)
)[
)[
)[
)[
)[
)[
)[
d(0,0) d(0,1) d(0,2) d(0,3) d(1,0) d(1,1) d(1,2) d(1,3) d(2,0) d(2,1) d(2,2) d(2,3) d(3,0) d(3,1) d(3,2) d(3,3)
Fig. 2.2 Case N = 2. Subintervals d(z1 ) of the first partition and subintervals d(z1 , z2 ) of the
second partition of the unit interval [0, 1] on the x-axis
z2 , 0 z2 2N 1. Each particular subcube obtained by such a partitioning of D(z1 )

is designated D(z1 , z2 ) and it has edge length equal to 22 ; see the right picture in
Fig. 2.1.
Continuing this process, i.e., consequently cutting each hypercube of a current
partition into 2N subcubes of the subsequent partition (with a twice shorter edge
length), yields hypercubes D(z1 , . . . , zM ) of any Mth partition with edge length equal
to 2M . The total number of subcubes of the Mth partition is equal to 2MN and
D D(z1 ) D(z1 , z2 ) . . . D(z1 , . . . , zM ),
(2.1.1)
where 0 z j 2N 1, 1 j M.
Next, cut the interval [0,1] on the x-axis into 2N equal parts; each particular part
is designated d(z1 ), 0 z1 2N 1: the numeration streams from left to right along
the x-axis. Then, once again, cut each of the above parts into 2N smaller (equal)
parts, etc. Designate d(z1 , . . . , zM ), 0 z j 2N 1, 1 j M, the subinterval of
the Mth partition; the length of any such interval is equal to 2MN . Assume that
each interval contains its left-end-point, but it does not contain its right-end-point;
the only exception is for the case when the right-end-point is equal to unity, which
corresponds to the relations z1 = z2 = . . . = zM = 2N 1. Obviously,
[0, 1] d(z1 ) d(z1 , z2 ) . . . d(z1 , . . . , zM ) ;
case N = 2 is illustrated by Fig. 2.2 (for M = 1 and M = 2).
(2.1.2)
2.1 Space-Filling Curves and Reduction of Dimensionality
11
Present the left-end-point v of the subinterval

d(z1 , . . . , zM ) = [v, v + 2MN )
(2.1.3)
in the binary form

MN
0 v = i 2i < 1,
(2.1.4)
i=1
where 1 , 2 , . . . , MN are binary digits (i.e., i = 0 or i = 1). From (2.1.3), (2.1.4)

and the already accepted condition that the numeration of the subintervals from
(2.1.2) with any index z j , 1 j M, streams from left to right along the x-axis, this
index is possible to present as
zj =
N1
jNi 2i,
1 jM.
(2.1.5)
i=0
In the sequel, the interval (2.1.3) will also be referred to as d(M, v). The relations
(2.1.4), (2.1.5) provide a basis for computing the parameters from one side of the
identity
d(M, v) = d(z1 , . . . , zM )
(2.1.6)
(i.e., M, v or z1 , . . . , zM ) being given the parameters from the other side of this
identity (i.e., z1 , . . . , zM or M, v).
Now, establish a mutually single-valued correspondence between all the
subintervals of any particular Mth partition and all the subcubes of the same
Mth partition by accepting that d(M, v) from (2.1.6) corresponds to D(z1 , . . . , zM )
and vice versa. The above subcube will also be designated D(M, v), i.e.,
D(M, v) = D(z1 , . . . , zM ),
(2.1.7)
where the indexes z1 , . . . , zM have the same values as in (2.1.6) and they could be
computed through (2.1.3)(2.1.5).
In accordance with (2.1.1) and (2.1.2), the introduced correspondence
satisfies the
Condition 1. D(M + 1, v ) D(M, v ) if and only if d(M + 1, v ) d(M, v ).
We also require this correspondence to satisfy the following
Condition 2. Two subintervals d(M, v ) and d(M, v ) have a common end-point
(this point may only be either v or v ) if and only if the corresponding subcubes D(M, v ) and D(M, v ) have a common face (i.e., these subcubes must be
contiguous).
Two linked systems of partitioning (i.e., the partitioning of the cube D from
(1.2) and the partitioning of the unit interval [0, 1] on the x-axis) that meet the
12
above two conditions provide the possibility for constructing the evolvent curve
which may be employed in (1.7). Note that Condition 1 is already met, but
Condition 2 has to be ensured by a special choice of numeration for the subcubes
D(z1 , . . . , zM ), M 1, which actually establishes the juxtaposition of the subcubes
(2.1.7) to the subintervals (2.1.6). The particular scheme of such numeration
suggested in [132, 134] will be introduced in the next section.
Theorem 2.1. Let y(x) be a correspondence defined by the assumption that for any
M 1 the image y(x) D(M, v) if and only if the inverse image x d(M, v). Then:
1. y(x) is the single-valued continuous mapping of the unit interval [0, 1] onto the
hypercube D from (1.2); hence, y(x) is a space-filling curve.
2. If F(y), y D, is Lipschitzian with some constant L, then the univariate function
1
F(y(x)), x [0,
1], satisfies Holder conditions with the exponent N and the
coefficient 2L N + 3, i.e.,
|F(y(x )) F(y(x ))| 2L N + 3(|x x |)1/N ,

(2.1.8)
x , x [0, 1] .
Proof. Any nested sequence (2.1.2) of intervals d(M, v) from (2.1.3), (2.1.6) and the
corresponding nested sequence (2.1.1) of subcubes D(M, v) from (2.1.7) contract,
respectively, to some point in [0, 1] and to some point in D with M because
of the geometrically decreasing values 2MN and 2M which are, respectively, the
length of d(M, v) and the edge length of D(M, v). Hence, the correspondence y(x)
really maps the x-intercept [0, 1] onto the hypercube D; the last property is due to
the fact that for any M 1 the union of all the subcubes D(z1 , . . . , zM ) constitutes D.
The continuity of y(x) is an obvious consequence of the second condition. Let
x , x [0, 1] and x = x . Then there exists an integer M 1 such that
2(M+1)N |x x | 2MN .
(2.1.9)
Therefore, either there is some interval d(M, v) containing both points x , x or there
are two intervals d(M, v ), d(M, v ) having a common end-point and containing x
and x in the union. In the first case, y , y D(M, v) and
|y j (x ) y j (x )| 2M ,
1 jN.
(2.1.10)
In the second case, y(x ) D(M, v ), y(x ) D(M, v ), but the subcubes D(M, v )
and D(M, v ) are contiguous due to Condition 2. This means that for some particular
index k, 1 k N,
|yk (x ) yk (x )| 2(M1) ;
(2.1.11)
but for all integer values j = k, 1 j N, the statement (2.1.10) is still true.
From (2.1.10) and (2.1.11),
2.2 Approximations to Peano Curves
13

||y(x ) y(x )|| =
1/2
[yl (x ) yl (x

2
)]
l=1
1/2
(N 1)22M + 22(M1)
= 2M N + 3,
and, in consideration of (2.1.9), we derive the estimate
||y(x ) y(x )|| 2 N + 3(|x x |)1/N ,
(2.1.12)
whence it follows that the Euclidean distance between the points y(x ) and y(x )
vanishes with |x x | 0. Finally, employ (1.3), (2.1.12) to substantiate the
relation (2.1.9) for the function F(y(x)), x [0, 1], which is the superposition of
the Lipschitzian function F(y), y D, and the introduced space-filling curve y(x).

Once again, we recall that the space-filling curve (or Peano curve) y(x) is
defined as a limit object emerging in some sequential construction. Therefore,
in practical application some appropriate approximations to y(x) are to be used.
Particular techniques for computing such approximations (with any preset accuracy)
are suggested and substantiated in [46, 132, 134, 138]. Some of them are presented
in the next section.

2.2.1 Partitions and Numerations
The left-end-points v of the subintervals d(M, v) introduced in Sect. 2.1 are strictly
ordered from the left to the right along the x-axis, which induces a strict order for
the corresponding vectors (z1 , . . . , zM ) from (2.1.6); vectors (0, . . . , 0) and (2N
1, . . . , 2N 1) are, respectively, the minimum and the maximum elements of this
order.
Definition 2.2. The vector (z1 , . . . , zM ), M 1, is said to precede the vector
(z1 , . . . , zM ) if either z1 < z1 or there exists an integer k, 1 k < M, such that
zj = zj , 1 j k, and zk+1 < zk+1 . Two vectors (z1 , . . . , zM ), (z1 , . . . , zM ) and the
corresponding subcubes D(z1 , . . . , zM ), D(z1 , . . . , zM ) are said to be adjacent if one
of these vectors precedes the other and there is no third vector (z1 , . . . , zM ) satisfying
the relations
(z1 , . . . , zM ) (z1 , . . . , zM ) (z1 , . . . , zM )
or
(z1 , . . . , zM ) (z1 , . . . , zM ) (z1 , . . . , zM ) ;
here is the precedence sign.
14
From this definition, if two subintervals d(M, v ), d(M, v ) have a common
end-point, then the corresponding vectors (z1 , . . . , zM ), (z1 , . . . , zM ) from (2.1.6)
have to be adjacent.
Therefore, Condition 2 from Sect. 2.1 is possible to interpret as the necessity for
any two adjacent subcubes D(z1 , . . . , zM ), D(z1 , . . . , zM ) to have a common face (i.e.,
to be contiguous).
Introduce the auxiliary hypercube
= {y RN : 21 y j 3 21, 1 j N}
(2.2.1)
and designate (s), 0 s 2N 1, the subcubes constituting the first partition

of . Due to the special choice of set by (2.2.1), the central points u(s) of the
corresponding subcubes (s) (in the sequel, these points are referred as centers) are
N-dimensional binary vectors (each coordinate is presented by some binary digit).
Install the numeration of the above centers (and, hence, the numeration of the
corresponding subcubes (s)) by the relations
u j (s) = ( j + j1)mod 2,
1 j N,
uN (s) = N1 ,
(2.2.2)
where j , 0 j < N, are the digits in the binary presentation of the number s:
s = N1 2N1 + . . . + 0 20 .
(2.2.3)
Theorem 2.2. The numeration of subcubes (s) set by the relations (2.2.2), (2.2.3)
ensures that:
1. All the centers u(s), 0 s 2N 1, are different.
2. Any two centers u(s), u(s+1), 0 s < 2N 1, are different just in one coordinate.
3.
u(0) = (0, . . . , 0, 0),
u(2N 1) = (0, . . . , 0, 1) .
(2.2.4)
Proof. 1. Consider the first statement and assume the opposite, i.e., that the
relations (2.2.2) juxtapose the same center to the different numbers s, s :
u(s) = u(s ),
s = s ,
0 s, s 2N 1 ;
(2.2.5)
where s is from (2.2.3) and s is also given in the binary form

2N1 + . . . + 0 20 .
s = N1
(2.2.6)
From (2.2.2), (2.2.3), (2.2.6) and the first equality in (2.2.5) follows that

N1 = N1
(2.2.7)
15
and

( j + j1)mod 2 = ( j + j1
)mod 2,
1 j<N.
(2.2.8)
The relations (2.2.8) imply that for any integer j, 1 j < N, either the equalities
j = j ,

j1 = j1
j = j ,

j1 = j1
(2.2.9)
or the equalities
(2.2.10)
have to be true; here is the negation symbol inverting the value of the binary
digit (i.e., 0 = 1, 1 = 0).
Suppose that (2.2.9) is true for any integer j in the range 1 j k < N. This
implies the validity of the relations
k = k ,

k1 = k1
.
(2.2.11)
If k + 1 < N, then the conditions (2.2.10) could not be met for j = k + 1 because
the equalities

k+1 =k+1
,
k =k
are in contradiction with (2.2.11). Case k + 1 = N, in consideration of (2.2.7),

(2.2.9), leads to the conclusion that s = s which is in contradiction with the
assumption (2.2.5).
Another option, i.e., the supposition that (2.2.10) is true for any integer j in
the range 1 j k < N, being considered in the analogous way, also brings a
contradiction.
2. Assume that the number s from (2.2.3) satisfies the inequalities 0 s < 2N 1.
Then there exists an integer k, 1 k N, such that the binary form of the next
number s + 1 can be written as
s + 1 = N1 2N1 + . . . + Nk+1 2Nk+1 +Nk 2Nk + . . . +0 20 . (2.2.12)
From (2.2.2), (2.2.3), (2.2.12) and the relations
( j + j1)mod 2 = ( j + j1)mod 2,
1 j < N,
(2.2.13)
follows that
u j (s + 1) = u j (s),
1 j N,
j = N k + 1 .
(2.2.14)
16
If k = 1, then
uN (s + 1) =N1 =uN (s) ;
(2.2.15)
otherwise, i.e., if 1 < k N,

uNk+1 (s + 1) = (Nk+1 +Nk )mod 2
=(Nk+1 + Nk )mod 2 =uNk+1 (s) .
(2.2.16)
Hence, the second statement of the Theorem is also proved.

3. The relations (2.2.4) directly follow from (2.2.2), (2.2.3); the value for u(0)
corresponds to 0 = . . . = N1 = 0 and the value for u(2N 1) corresponds
to 0 = . . . = N1 = 1.

The suggested scheme (2.2.2), (2.2.3) is possible to use for numbering the
subcubes D(z1 ), 0 z1 2N 1, constituting the first partition of the hypercube
D from (1.2). This will ensure that adjacent subcubes have a common face. To do
so we introduce the linear mapping
g(y) = 2y + p,
y RN ,
(2.2.17)
p = (21 , . . . , 21 ) RN ,
(2.2.18)
g(D) =
(2.2.19)
with
meeting the condition
and assume that the subcube D(z1 ) has the number z1 = s if and only if D(z1 ) is the
inverse image of (s), i.e.,
g(D(z1 )) = (s) .
(2.2.20)
To employ the above approach for numbering the second partition subcubes
D(z1 , z2 ), 0 z2 2N 1, from D(z1 ), 0 z1 2N 1, we introduce the linear
mappings
g(z1 ; y) = 22 {y [u(z1) p]21} + p
meeting the conditions
g(z1 ; D(z1 )) = ,
0 z1 2N 1,
(2.2.21)
17
and assume that the subcube D(z1 , z2 ) from D(z1 ) has the index z2 = s if and only if
D(z1 , z2 ) is the inverse image of (s), i.e.,
g(z1 ; D(z1 , z2 )) = (s) .
(2.2.22)
Note that u(z1 ) from (2.2.21) is the center of the subcube (z1 ) juxtaposed to D(z1 )
by the relation (2.2.20).
The suggested numeration of the second-partition subcubes D(z1 , z2 ) with
indexes z2 , 0 z2 2N 1, by means of the scheme (2.2.21), (2.2.22) ensures
contiguity of any two adjacent subcubes D(z1 , z2 ), D(z1 , z2 + 1), 0 z2 < 2N 1,
from the same cube D(z1 ). But there is still a problem to be solved. Any two
subcubes D(z1 , 2N 1) and D(z1 + 1, 0), 0 z1 < 2N 1, are adjacent too and,
therefore, they should also have a common face. This means that there should be
some special linkage in the numerations of elements D(z1 , z2 ), 0 z2 2N 1,
from different subcubes D(z1 ), 0 z1 2N 1.
To provide the basis for such a linkage we, first, introduce a variety of
numerations for the elements (s), 0 s 2N 1. As follows from (2.2.4), the
numeration defined by the rules (2.2.2), (2.2.3) ensures that the initial center u(0)
is the zero-vector and that the centers u(0) and u(2N 1) differ only in the Nth
coordinate. The permutation of uN and ut in u(s) resulting in the vector designated
ut (s) = (u1 (s), . . . , ut1 (s), uN (s), ut+1 (s), . . . , uN1 (s), ut (s)),
1 t N, does not change the initial vector, i.e.,
ut (0) = u(0),
1 t N,
and moves the only nonzero coordinate in u(2N 1) to the t-th position, which
means that ut (0) and ut (2N 1) are different only in the t-th coordinate:

uti (2N
1) =
uti (0), i = t,
uti (0), i = t.
(2.2.23)
The addition of some binary vector q RN to the vectors ut (s), 0 s 2N 1,

resulting in the new vectors utq (s) from
tq
ui (s) = (uti (s) + qi )mod 2,
1 i N,
(2.2.24)
ensures that the used vector q is the center of the initial subcube (0), i.e.,
utq (0) = q .
Thereby, these two operations allow us to construct the numeration which assures
that the initial subcube (0) has the preset center q and the centers of the subcubes
(0) and (2N 1) are different only in the t-th coordinate, i.e., they satisfy the
condition (2.2.23) for any preset integer t, 1 t N.
18
Next, we introduce the integer function

l(s 1) = l(s) = min{ j : 2 j N, j1 = 1},
(2.2.25)
where s, 2 s 2N 2, is an even integer, j1 is from (2.2.3), and

l(0) = l(2N 1) = 1 .
(2.2.26)
As it follows from (2.2.2), (2.2.3), (2.2.12), (2.2.14), and (2.2.25),

l(s) = N k + 1 ;
hence, l(s) from (2.2.25) is the number of the only coordinate in which the centers
u(s) and u(s 1) are different, i.e.,

ui (s) =
ui (s 1), i = l(s),

ui (s 1), i = l(s).
We assume that the centers of the initial subcube D(z1 , 0) and of the last subcube
D(z1 , 2N 1) from the cube D(z1 ), 0 z1 2N 1, should also be different just
in one coordinate and that the function (2.2.25) determines the number of this
coordinate.
We also introduce a binary function

ui (s), i = 1,
wi (s + 1) = wi (s) =
(2.2.27)
ui (s), 2 i N,
where s is supposed to be the odd number and
w(0) = u(0) ;
(2.2.28)
as already mentioned is a negation sign inverting the value of the binary digit. The
binary vector w(z1 ) is to be used for determining the position of the center of the
subcube (0) employed in (2.2.22) with z2 = 0, i.e., it will used to set the value of
the vector q embedded into (2.2.24).
Now, we suggest carrying out the numbering of the subcubes D(z1 , z2 ), 0 z2
2N 1, in the following way. From (2.2.25) to (2.2.28), compute the values l =
l(z1 ), w = w(z1 ) for the given index z1 , 0 z1 2N 1. Select
t = l(z1 ),
q = w(z1 )
(2.2.29)
and, using the above described permutations and additions, determine the vectors
utq (s), 0 s 2N 1; note that the vector function utq (s) may be different for
different values of z1 , 0 z1 2N 1. Finally, we employ (2.2.22) to number
the subcubes D(z1 , z2 ), 0 z2 2N 1, under the condition that the index s of

Table 2.1 Vectors (0 , 1 )
with 0 , 1 from (2.2.3), u(s)
from (2.2.2), w(s) from
(2.2.27), (2.2.28) as the
functions of s, 0 s 3 (case
N = 2, M = 2)
Table 2.2 Centers utq (s)
from (2.2.24), (2.2.29)
juxtaposed to the subcubes
(s) from (2.2.22) as the
functions of z1 , 0 z1 3
Fig. 2.3 Case N = 2, M = 2.

Solid-line arrows link the
centers (marked by the red
dots) of the second-partition
subsquares in the order of
numbering providing
contiguity of the adjacent
squares. The dotted-line
arrows link the centers of the
first-partition subsquares
19
s
0
1
2
3
(0 , 1 )
(0,0)
(1,0)
(0,1)
(1,1)
z1
0
1
2
3
utq (0)
(0,0)
(0,0)
(0,0)
(1,1)
(u1 , u2 )
(0,0)
(1,0)
(1,1)
(0,1)
utq (1)
(0,1)
(1,0)
(1,0)
(1,0)
l
1
2
2
1
utq (2)
(1,1)
(1,1)
(1,1)
(0,0)
utq (3)
(1,0)
(0,1)
(0,1)
(0,1)
D(2)
D(3)
D(0)
(w1 , w2 )
(0,0)
(0,0)
(0,0)
(1,1)
l(3)=1
l(2)=2
l(0)=1
l(1)=2
D(1)
the subcube (s) from the right-hand side of (2.2.22) has to be identical with the
number of the center utq (s) of this subcube generated for the given index z1 in the
above way.
Tables 2.1, 2.2 and Fig. 2.3 illustrate the role of the numbers l from (2.2.25),
(2.2.26) and of the vectors w from (2.2.27), (2.2.28) in establishing the numeration
of the second-partition subcubes D(z1 , z2 ) which were already pictured in Fig. 2.1
(case N = 2, M = 2). Table 2.1 presents the vectors (0 , 1 ) with 0 , 1 from (2.2.3),
u(s) from (2.2.2), w(s) from (2.2.27), (2.2.28), and the number l(s) from (2.2.25),
(2.2.26) as the functions of s, 0 s 3. Table 2.2 contains the sets of the centers
utq (s), 0 s 3, juxtaposed to the subcubes (s) from (2.2.22) for the given index
z1 , 0 z1 3. These centers are computed from (2.2.24) under the conditions
(2.2.29).
Circles in Fig. 2.3 mark the centers of the first-partition subcubes D(z1 ),
0 z1 3. These centers are linked with the dotted-line arrows in the order of
20
numeration. The corresponding centers u(s) juxtaposed to the subcubes (s) from
(2.2.20) while numbering D(z1 ), 0 z1 3, are given in the third column of
Table 2.1.
Red dots in Fig. 2.3 mark the centers of the second-partition subcubes D(z1 , z2 ).
Centers of the adjacent subcubes are linked with solid-line arrows streaming from
the initial subcube D(0, 0) to the last subcube D(3, 3). Centers utq (s) juxtaposed to
the subcubes (s) from (2.2.22) while numbering D(z1 , z2 ), 0 z2 3, are given
in Table 2.2 (each row of this table corresponds to some particular value of the first
index z1 ).
The picture allows us to clarify the role of the vector w(z1 ) from the last column
of Table 2.1 which points the position for the center of the initial subcube D(z1 , 0)
from the next partition of the cube D(z1 ). As it is clear from the picture, the values
l(z1 ) and w(z1 ) are coherent in such a way that the centers of the subcubes D(z1 , 3)
and D(z1 + 1, 0), 0 z1 < 3, are different just in one coordinate, though these
adjacent subcubes belong to different cubes of the foregoing partition. The centers
of such cubes are linked with thick arrows in Fig. 2.3.
Let us consider now how to link numerations in subsequent partitions. Note that
the already considered two cases (numbering in the first partition and numbering in
the second partition) were treated in somewhat different ways. In the first case we
used the relations (2.2.17)(2.2.20) juxtaposing the centers u(s) from (2.2.2), (2.2.3)
to the cubes (s). In the second case we employed the relations (2.2.21), (2.2.22)
and the cubes (s) which juxtaposed the centers utq (s) from (2.2.24) linked with
the corresponding centers u(z1 ) due to (2.2.25)(2.2.29); in fact, each center utq (z2 )
depends also on some value z1 (we use the short notation utq (s) just to compact the
writing; this should not cause any confusion).
It is possible to unify both considered cases by introducing the linear mappings

M
g(z1 , . . . , zM ; y) = 2M+1 y [utq (z j ) p]2 j + p
(2.2.30)
j=1
and assuming that the subcube D(z1 , . . . , zM , zM+1 ) is characterized by the index
zM+1 if and only if
g(z1 , . . . , zM ; D(z1 , . . . , zM , zM+1 )) = (s) ;
(2.2.31)
here the cube (s) is the one having the center utq (s) from (2.2.24) with

N,
M = 0,
t = t(zM+1 ) =
(2.2.32)
l(zM ), M > 0,
and

q = q(zM+1 ) =
(0, . . . , 0) RN , M = 0,
M > 0.
w(zM ),
(2.2.33)
21
If M = 1, then the relations (2.2.30) and (2.2.32), (2.2.33) are, respectively, identical
to the relations (2.2.21) and (2.2.29). If M = 0, which corresponds to the numeration
in the first partition, then (2.2.30) is identical to (2.2.17) and application of (2.2.24)
in conjunction with (2.2.32), (2.2.33) yields
utq (s) = u(s),
0 s 2N 1 .
Thus, (2.2.30), (2.2.31) together with (2.2.24), (2.2.32), (2.2.33), and (2.2.25)
(2.2.28) combine the rules for numbering in the first and in the second partitions.
Moreover, it is possible to generalize this scheme for any M > 1. The only
amendment needed is to accept that the rule (2.2.24) transforming u(s) into utq (s)
has to be appended with similar transformation for the vector w(s)
t
wtq
i (s) = (wi (s) + qi )mod 2,
1 i N,
(2.2.34)
and with the following transformation for the integer l(s)
N, l(s) = t,
lt (s) = t, l(s) = N,
l(s), l(s) = N and l(s) = t,
(2.2.35)
where t is the pointer used in the permutations yielding ut (s) and wt (s).
It has to be clarified that all the values u(zM ), l(zM ), w(zM ) embedded into
the right-hand sides of the expressions (2.2.27), (2.2.32), (2.2.33) to produce the
subsequent auxiliary values w, t, q for the numeration in the next partition are
functions of the corresponding values u, l, w generated in the foregoing partition.
Once again, we stress that utq (zM+1 ), wtq (zM+1 ), and lt (zM+1 ) are dependent on
z1 , .., zM if M 1.
Theorem 2.3. The introduced system of the linked numerations ensures the contiguity of any two adjacent subcubes from any Mth (M 1) partition of the cube D
from (1.2); see [132].
Proof. 1. Consider any two adjacent subcubes D(z1 ) and D(z1 + 1), 0 z1 < 2N
1, of the first partition mapped by the correspondence (2.2.17) onto the auxiliary
subcubes (z1 ) and (z1 + 1); see (2.2.20). As already proved in Theorem 2.2,
the centers u(z1 ), u(z1 + 1), 0 z1 < 2N 1, of the subcubes (z1 ), (z1 + 1) are
different just in one coordinate if they are numbered in accordance with the rules
(2.2.2), (2.2.3). That is, the subcubes (z1 ), (z1 + 1) have to be contiguous and,
therefore, the corresponding cubes D(z1 ), D(z1 + 1) are contiguous too.
Suppose that the Theorem is true for any adjacent subcubes of the k-th
partition of the cube D, where 1 k M. Then it is left to prove that it is also
true for the adjacent subcubes of the (M + 1)st partition.
As long as for the given z1 , 0 z1 2N 1, the set of all the subcubes
D(z1 , z2 , . . . , zM+1 ) constitutes the Mth partition of the cube D(z1 ), then, due
to the assumption, all the adjacent subcubes D(z1 , z2 , . . . , zM+1 ) from D(z1 ) are
22
contiguous. Thus, it is left to demonstrate that for any given z1 , 0 z1 2N 1,

the subcubes
D(z1 , 2N 1, . . . , 2N 1) and D(z1 + 1, 0, . . ., 0)
(2.2.36)
of the (M + 1)st partition are also contiguous.

In accordance with (2.2.30), (2.2.31), the point
y(z1 , . . . , zM ) =
[utq (z j ) p]2 j
(2.2.37)
j=1
belongs to all the subcubes D(z1 , . . . , zM , zM+1 ), 0 zM+1 2N 1, from

D(z1 , . . . , zM ). Therefore, in the sequel, the point (2.2.37) is to be referred to
as the center of the subcube D(z1 , . . . , zM ). Then, the necessary and sufficient
condition for the cubes from (2.2.36) to be contiguous could be stated as the
existence of a number l, 1 l N, such that the centers of these cubes satisfy
the requirement
yi (z1 , 2N 1, . . . , 2N 1) yi(z1 + 1, 0, . . ., 0)
=

0,
i = l ,
=
2(M+1) , i = l ;
(2.2.38)
i.e., the centers of the cubes from (2.2.36) have to be different just in one, l-th,
coordinate and the absolute difference in this coordinate has to be equal to the
edge length for the (M + 1)st partition subcube. We proceed with computing the
estimate for the left-hand side of (2.2.38) for the accepted system of numeration.
2. Introduce the notations u(z1 , . . . , zM ; zM+1 ), w(z1 , . . . , zM ; zM+1 ) for the
vectors utq (zM+1 ), wtq (zM+1 ) corresponding to the particular subcube
D(z1 , . . . , zM , zM+1 ) from the cube D(z1 , . . . , zM ).
Suppose that z1 = 2k 1, 1 k 2N1 1, i.e., z1 is the odd number and
z1 < 2N 1, and consider the sequence of indexes z1 , z2 , . . . ; z j = 2N 1, j 2.
First, we study the sequence of numbers t(z j ), j 1, corresponding to the
introduced sequence of indexes. From (2.2.32),
t(z1 ) = N
(2.2.39)
l(z1 ) = l(z1 + 1) > 1 .
(2.2.40)
and, as it follows from (2.2.25),
Now, from (2.2.35), (2.2.39), (2.2.40), we derive that t(z2 ) = l(z1 ).

In accordance with (2.2.26), l(z2 ) = l(2N 1) = 1; hence, due to (2.2.32),
(2.2.35), (2.2.40), we get the value t(z3 ) = 1.
23
Reproduction of the above reasoning for z3 = 2N 1 and z4 = 2N 1 yields

the estimates t(z4 ) = N, t(z5 ) = 1; and by inductive inference, finally, we obtain
the dependence
j = 2 + 1, 1,
1,
t(z j ) = l(z1 ), j = 2,
N,
j = 1, j = 2 , 2.
(2.2.41)
From (2.2.33), q(z1 ) = (0, . . . , 0) and, with account of (2.2.24), (2.2.33), (2.2.34)
and (2.2.41), we derive the relations
utq (z1 ) = u(z1 ),
q(z2 ) = wtq (z1 ) = w(z1 ) .
(2.2.42)
Now, it is possible to analyze the second-partition subcubes from D(z1 ). From

(2.2.4), (2.2.24), (2.2.41), (2.2.42) follows that

ui (z1 ; 2N 1) =
wi (z1 ), i = l(z1 ),
wi (z1 ), i = l(z1 ),
1 i N,
(2.2.43)
whence, in consideration of (2.2.27), (2.2.40),

ui (z1 ; 2N 1) =
ui (z1 ), i = 1, i = l(z1 ),
ui (z1 ), i = 1, i = l(z1 ),
1 i N.
(2.2.44)
In the analogous way, from (2.2.4), (2.2.27), (2.2.33), (2.2.34), (2.2.40), and
(2.2.41), obtain

qi (z3 ) = wi (z1 ; 2 1) =
N
ui (z1 ), i = l(z1 ),
ui (z1 ), i = l(z1 ),
1 i N.
(2.2.45)
Next, from (2.2.4), (2.2.24), (2.2.40)(2.2.45), establish the identity

u(z1 , 2N 1; 2N 1) = u(z1 ; 2N 1)
(2.2.46)
and, due to (2.2.27), (2.2.34), derive the relation

=
wi (z1 , 2N 1; 2N 1) =
ui (z1 ), l(zi ) = i = 1, l(z1 ) = i = N, l(z1 ) = i = N,
ui (z1 ), l(zi ) = i = 1, l(z1 ) = i = N, l(z1 ) = i = N,
for 1 i N, which, due to (2.2.33), represents also the vector q(z4 ). By

repetition of the above discourse for z4 = 2N 1, obtain the identities
24
u(z1 , 2N 1, 2N 1; 2N 1) = u(z1 ; 2N 1),

w(z1 , 2N 1, 2N 1; 2N 1) = w(z1 ; 2N 1),
(2.2.47)
whence, due to (2.2.41) and (2.2.45), follows

t(z5 ) = t(z3 ) = 1,
q(z5 ) = q(z3 ) = w(z1 ; 2N 1) .
This means that each subsequent repetition of the above discourse will just add
one more parameter (equal to 2N 1) into the left-hand side of (2.2.47).
Therefore, for any M > 1
u(z1 , 2N 1, . . . , 2N 1; 2N 1) = u(z1 ; 2N 1),
which being substituted into (2.2.37) yields
y(z1 , . . . , zM , zM+1 ) = y(z1 , 2N 1, . . . , 2N 1) =
1
= {u(z1 ) + (1 2M )u(z1 ; 2N 1) (2 2M)p}.
2
(2.2.48)
Proceed to the numbering of subcubes from D(z1 + 1) where z1 + 1 is the even

number (2 z1 + 1 2N 2) and consider the sequence of indexes z1 + 1, z2 , . . .
under the condition that z j = 0, j 2.
From (2.2.27),
w(z1 + 1) = w(z1 )
(2.2.49)
and, in accordance with (2.2.32), (2.2.33),

t(z1 + 1) = N,
q(z1 + 1) = (0, . . . , 0) .
Therefore, from (2.2.24), (2.2.34)

utq (z1 + 1) = u(z1 + 1),
q(z2 ) = w(z1 ) .
For z2 = 0, from (2.2.4), (2.2.24), (2.2.34), obtain that

ut (0) = wt (0) = (0, . . . , 0),
u(z1 + 1; 0) = w(z1 ),
1 t N,
q(z3 ) = w(z1 + 1; 0) = w(z1 ) .
One more iteration (for z3 = 0) results in similar relations

u(z1 + 1, 0; 0) = w(z1 ),
q(z4 ) = w(z1 + 1, 0; 0) = w(z1 ),
(2.2.50)
(2.2.51)
25
which means that the successive application of (2.2.24), (2.2.34), in consideration

of (2.2.49)(2.2.51), ensures the validity of
u(z1 + 1, 0, . . ., 0; 0) = w(z1 )
(2.2.52)
for any M > 1. By plugging (2.2.52) into (2.2.37), obtain

y(z1 + 1, z2 , . . . , zM+1 ) = y(z1 + 1, 0, . . . , 0) =
1
= {u(z1 + 1) + (1 2M)w(z1 ) (2 2M )p}.
2
(2.2.53)
Finally from (2.2.48) and (2.2.53), we derive the estimate
i = |yi (z1 , 2N 1, . . . , 2N 1) yi(z1 + 1, 0, . . ., 0)| =

1
= |ui (z1 ) ui (z1 + 1) + ui(z1 ; 2N 1)
2
wi (z1 ) + 2M [wi (z1 ) ui(z1 ; 2N 1)]|.
From the comment following the definition (2.2.25) and from (2.2.43),
ui (z1 + 1) = ui (z1 ),
ui (z1 ; 2N 1) = wi (z1 ),
i = l(z1 ) .
Therefore, i = 0 if i = l(z1 ). Consider the case when i = l(z1 ). In accordance

with (2.2.27),
wl (z1 ) = ul (z1 )
and, in consideration of (2.2.25) and (2.2.44),
ul (z1 + 1) =ul (z1 ) = ul (z1 ; 2N 1),
which means that l = 2(M+1) . So, the relations (2.2.38) are validated for the
odd number z1 , 1 < z1 < 2N 1, with l = l(z1 ).
3. Suppose that z1 = 2k, 1 k 2N1 1, i.e., z1 > 0 is the even integer and
consider the sequence of indexes z1 , z2 , . . . ; z j = 2N 1, j 2 (note that (2.2.41)
is valid for the elements of this sequence).
In consideration of the linking between u(s) and u(s 1) introduced by
(2.2.25) for the case of the even integer s and due to (2.2.27), derive

wi (z1 ) = wi (z1 1) =

qi (z2 ) = wi (z1 ) =
ui (z1 1), i = 1,

ui (z1 1), i = 1,
ui (z1 ), i = 1, i = l(z1 ),

1, i =
l(z1 ),
ui (z1 ), i =
1 i N.
(2.2.54)
26
From t(z2 ) = l(z1 ) > 1 and (2.2.4), (2.2.24), (2.2.54),

ui (z1
; 2N
1) =

=
wi (z1 ), i = l(z1 ),

=
l(z1 ),
wi (z1 ), i =
1 i N,
(2.2.55)
ui (z1 ), i = 1,
1,
ui (z1 ), i =
for 1 i N, and due to (2.2.27), (2.2.34),

q(z3 ) = w(z1 ; 2N 1) = u(z1 ) .
(2.2.56)
By analogy, and in consideration of (2.2.56), obtain

t(z3 ) = 1,
q(z3 ) = u(z1 ),
u(z1 , 2N 1; 2N 1) = u(z1 ; 2N 1),

ui (z1 ), i = 1, i = N,
N
N
qi (z4 ) = wi (z1 , 2 1; 2 1) =
N,
ui (z1 ), i = 1, i =
(2.2.57)
(2.2.58)
for 1 i N. One more iteration yields t(z4 ) = N,

u(z1 , 2N 1, 2N 1; 2N 1) = u(z1 ; 2N 1),
q(z5 ) = w(z1 , 2N 1, 2N 1; 2N 1) = u(z1 ) .
(2.2.59)
Next, due to (2.2.59), we have the relations

t(z5 ) = 1,
q(z5 ) = u(z1 ),
u(z1 , 2N 1, 2N 1, 2N 1; 2N 1) = u(z1 ; 2N 1),

which reproduce the state of discourse presented by (2.2.57), (2.2.58). Therefore,
for any M > 1
u(z1 , 2N 1, . . . , 2N 1; 2N 1) = u(z1 ; 2N 1),
where u(z1 ; 2N 1) is from (2.2.55). Hence, the equality (2.2.48) is valid also for
the even number z1 > 0.
From (2.2.4), (2.2.28), for any t, 1 t N, follows
ut (0) = wt (0) = (0, . . . , 0) .
These zero-vectors, being substituted into (2.2.24), (2.2.34), produce
27
u(z1 + 1; 0) = w(z1 + 1),

q(z3 ) = w(z1 + 1; 0) = w(z1 + 1),
u(z1 + 1, 0; 0) = w(z1 + 1),
q(z4 ) = w(z1 + 1, 0; 0) = w(z1 + 1),
consequently, for any M > 1:
u(z1 + 1, 0, . . ., 0; 0) = w(z1 + 1),
(2.2.60)
where, in accordance with (2.2.27)

wi (z1 + 1) =
ui (z1 + 1), i = 1,

ui (z1 + 1), i = 1,
1 i N.
(2.2.61)
From (2.2.2), (2.2.3)

ui (z1 + 1) =
ui (z1 ), i = 1,
1,
ui (z1 ), i =
1 i N.
(2.2.62)
Recall in this occasion that z1 is the even integer. Therefore, due to (2.2.61), we
obtain that w(z1 + 1) = u(z1 ). The last equality, in conjunction with (2.2.60) and
(2.2.37), implies
y(z1 + 1, z2 , . . . , zM+1 ) = y(z1 + 1, 0, . . ., 0) =
1
= {u(z1 + 1) + (1 2M)u(z1 ) (2 2M )p}.
2
(2.2.63)
Now, from (2.2.48) and (2.2.63) follows the validity of (2.2.38) also for even
indexes z1 > 0 because, due to (2.2.55), (2.2.62),
u(z1 ; 2N 1) = u(z1 + 1)
and the vectors u(z1 ; 2N 1) and u(z1 ) are different only in the first coordinate
(i.e., l = 1); see (2.2.55).
4. Suppose that z1 = 0 and consider the sequence of indexes z1 , z2 , . . . ; z j = 2N
1, j 2. In this case, from (2.2.26), (2.2.32) and (2.2.35) follows the relation for
the parameter t in the operation of permutation

t(z j ) =
1, j = 2 , 1,
N, j = 2 + 1, 0.
From (2.2.24), (2.2.28), (2.2.33), (2.2.34) and (2.2.64),
(2.2.64)
28
t(z4 ) = t(z2 ) = 1,
q(z4 ) = q(z2 ) = w(0),
u(0, 2N 1, 2N 1; 2N 1) = u(0, 2N 1; 2N 1) =
= u(0; 2N 1) = u(1),
i.e., the case j = 4 is the reproduction of the state of discourse at j = 2. Therefore
for any M > 1:
u(0, 2N 1, . . . , 2N 1; 2N 1) = u(0; 2N 1) = u(1) ;
(2.2.65)
and formula (2.2.48) is true also for z1 = 0.

Next consider the sequence of indexes z1 + 1, z2 , . . . = 1, 0, 0, . . . ; z j = 0, j 2
(note that (2.2.41) is true for the elements of this sequence with l(1) = 2).
At z1 = 0, in consideration of (2.2.27),
u(z1 + 1) = u(1),
q(z2 ) = w(1) = u(0) .
(2.2.66)
In accordance with (2.2.24), (2.2.34),

u(1, 0, . . . , 0; 0) = u(1; 0) = w(1),
w(1, 0, . . . , 0; 0) = w(1; 0) = w(1),
(2.2.67)
where (2.2.67) is similar to (2.2.52). Therefore, formula (2.2.53) is true also

for z1 = 0. Thus, from (2.2.48), (2.2.53) and (2.2.2)(2.2.4), (2.2.65), (2.2.66)
follows the validity of (2.2.38) at z1 = 0 with l = 1.

2.2.2 Types of Approximations and Their Analysis

Consider the space-filling curve y(x) introduced in Theorem 2.1. This curve,
continuously mapping the unit interval [0,1] onto the hypercube D from (1.2), was
defined by establishing a correspondence between the subintervals d(z1 , . . . , zM )
from (2.1.3)(2.1.6) and the subcubes D(z1 , . . . , zM ) of each Mth partition (M =
1, 2, . . .) and assuming that the inclusion x d(z1 , . . . , zM ) induces the inclusion
y(x) D(z1 , . . . , zM ). Therefore, for any preset accuracy , 0 < < 1, it is possible
to select a large integer M > 1 such that the deviation of any point y(x), x
d(z1 , . . . , zM ), from the center y(z1 , . . . , zM ) of the hypercube D(z1 , . . . , zM ) introduced in (2.2.37) will not exceed (in each coordinate) because
|y j (x) y j (z1 , . . . , zM )| 2(M+1) ,
1 jN.
29
This allows us to outline the following scheme for computing the approximation
y(z1 , . . . , zM ) for any point y(x), x [0, 1], with the preset accuracy , 0 < < 1:
1. Select the integer M (ln / ln 2 + 1).
2. Detect the interval d(M, v) containing the inverse image x, i.e., x d(M, v) =
[v, v + 2MN ] and estimate the indexes z1 , . . . , zM from (2.1.4), (2.1.5).
3. Compute the center y(z1 , . . . , zM ) from (2.2.37). This last operation is executed
by sequential estimation of the centers utq (z j ), 1 j M, from (2.2.24) with t
from (2.2.32), (2.2.35) and q from (2.2.33), (2.2.34).
In all the above numerical examples the curve y(x) was approximated by (2.2.37)
at N = 2, M = 10.
Remark 2.1. The centers (2.2.37) constitute a uniform orthogonal net of 2MN nodes
in the hypercube D with mesh width equal to 2M . Therefore, all the points
x d(z1 , . . . , zM ) have the same image y(z1 , . . . , zM ). But in some applications it
is preferable to use a one-to-one continuous correspondence lM (y) approximating
Peano curve y(x) with the same accuracy as is ensured by the implementation of
(2.2.37).
A piecewise-linear curve of this type is now described; it maps the interval [0,1]
into (not onto) the cube D, but it covers the net constituted by the centers (2.2.37).
Establish the numeration of all the intervals (2.1.3) constituting the Mth partition
of the interval [0, 1] by subscripts in increasing order of the coordinate:
d(z1 , . . . , zM ) = [vi , vi + 2MN ),
0 i 2MN 1 .
Next, assume that the center y(z1 , . . . , zM ) of the hypercube D(z1 , . . . , zM ) is assigned
the same number (the superscript) as the number of the subinterval d(z1 , . . . , zM )
corresponding to this subcube, i.e.,
yi = y(z1 , . . . , zM ),
0 i 2MN 1 .
This numeration ensures that any two centers yi , yi+1 , 0 i < 2MN 1, correspond
to the contiguous hypercubes (see Condition 2 from Sect. 2.1), which means that
they are different just in one coordinate.
Consider the following curve l(x) = lM (x) mapping the unit interval [0, 1] into
the hypercube D from (1.2):
l(x) = yi + (yi+1 yi )[(w(x) vi )/(vi+1 vi )],
(2.2.68)
where the index i is from the conditions

vi w(x) vi+1 ,
and
w(x) = x(1 2MN ),
0x1.
(2.2.69)
30
Fig. 2.4 Image of the

interval [0, 1] generated by
Peano-like piecewise-linear
evolvent l(x) from (2.2.68) at
N = 2, M = 3; red dots
correspond to the centers of
the third-partition subsquares
from (2.2.37) got through by
the curve l(x) in the order of
established numeration
The image of any particular subinterval

[vi (1 2MN )1 , vi+1 (1 2MN )1 ],
0 i < 2MN 1,
(2.2.70)
generated by this curve is the linear segment connecting the nodes yi , yi+1 and, thus,
l(x), 0 x 1, is the piecewise-linear curve running through the centers yi , 0
i 2MN 1 in the order of the established numeration. The curve l(x) = lM (x)
henceforth to be referred to as a Peano-like piecewise-linear evolvent because it
approximates the Peano curve y(x) from Theorem 2.1 with accuracy not worse than
2M in each coordinate; note that M is the parameter of the family of curves (2.2.68)
as long as it determines the number and the positions of the nodes (2.2.37) used in
the construction of l(x). For the sake of illustration, Fig. 2.4 presents the image of
the interval [0, 1] generated by l(x) at N = 2, M = 3 (the corresponding centers
yi , 0 i 63, are marked by red dots).
Remark 2.2. The expression (2.2.68), (2.2.69) allow us to determine the point l(x)
for any given x [0, 1] by, first, estimating the difference
= (w(x) vi )/(vi+1 vi ) = 2MN (x vi ) x

and then employing (2.2.37) to compute the centers yi = y(z1 , . . . , zM ), yi+1 of the
two adjacent subcubes of the Mth partition corresponding to the intervals [vi , vi+1 ) =
d(z1 , . . . , zM ) and [vi+1 , vi+2 ); note that the index i is defined by the condition
w(x) = x(1 2MN ) d(z1 , . . . , zM ) .
The scheme for computing centers y(z1 , . . . , zM ) from (2.2.37) was already discussed
in the previous subsection. As long as the adjacent centers yi and yi+1 are different
31
just in one coordinate, it is sufficient to compute only the center yi = y(z1 , . . . , zM )

and the number = (z1 , . . . , zM ) of this coordinate. Then for any k, 1 k N,
lk (x) = yk (z1 , . . . , zM )+
0, k = ,
tq
[uk (zM ) 21]2(M1) , k = , zM = 2N 1,
, k = , zM =
2N 1,
where utq (zM ) is from (2.2.37). Now, it is left to outline the scheme for computing
the number .
Represent the sequence z1 , . . . , zM as z1 , . . . , z , z +1 , . . . , zM where 1 M
and z = 2N 1, z +1 = . . . = zM = 2N 1; note that the case z1 = . . . = zM = 2N 1
is impossible because the center y(2N 1, . . . , 2N 1) does not coincide with the
node yq , q = 2MN 1. As it follows from the construction of y(x), the centers
y(z1 , . . . , z , 2N 1, . . ., 2N 1)
and
y(z1 , . . . , z 1 , z + 1, 0, . . ., 0)
corresponding to the adjacent subcubes are different in the same coordinate as the
auxiliary centers
u(z1 , . . . , z 1 ; z )
and u(z1 , . . . , z 1 , z + 1) ;
see the notations introduced in the second clause from the proof of Theorem 2.3.
Therefore, if z is the odd number, then, in accordance with (2.2.25),
(z1 , . . . , zM ) = l(z1 , . . . , z 1 ; z ) .
If z is even, then from (2.2.26), (2.2.32), (2.2.62), and the permutation rule,

(z1 , . . . , zM ) =
1, t =
N,
N, t = 1,
where t = l(z1 , . . . , z 2 ; z 1 ) if > 1 and t = N if = 1.

Theorem 2.4. If the function g(y), y D, is Lipschitzian with the constant L,
then the one-dimensional function g(l(x)), x [0, 1], satisfies the uniform Holder
conditions (2.1.9).
Proof. Suppose that l(x) = lM (x), M > 1, and let x , x [0, 1], x = x . If there exists
an integer n < M meeting the conditions
2(n+1)M |x x | 2nN ,
(2.2.71)
32
which are similar to (2.1.9), then justification of the relations (2.1.9) is just a
reproduction of the corresponding discourse from the proof of Theorem 2.1.
Suppose that the conditions (2.2.71) are met at n M. If the points x , x are from
the same interval (2.2.70) and the corresponding images l(x ), l(x ) belong to the
same linear segment connecting the nodes yi , yi+1 , which are different just in one
coordinate, then from (2.2.68), (2.2.69), and (2.2.71),
||l(x ) l(x )|| = 2MN ||yi yi+1 ||(1 2MN )|x x |
2M(N1) 2nN = 2 2(n+1)2(Mn)(N1) 2(|x x |)1/N
(2.2.72)
because
||yi yi+1 || = 2M .
If the points l(x ), l(x ) belong to two different linear segments linked at the
common end-point yi+1 , then
||l(x ) l(x )|| ||l(x ) yi || + ||yi+1 l(x )|| =
= ||yi+1 yi ||
1 (w(x ) vi )2MN
+ ||yi+2 yi+1 ||
w(x ) vi+1
=
= 2M(N1) (|w(x ) vi+1 | + |w(x ) vi+1|)
2 2M(N1) |x x | < 2M(N1) 2nN < 2(|x x |)1/N ,
which is equivalent to (2.2.72). Therefore, in consideration of the function g(y), y
D, being Lipschitzian, we obtain the relation
||g(l(x ))g(l(x ))|| 2L N+3(|x x |)1/N ,
x , x [0, 1] .
(2.2.73)

The last statement proves the validity of (2.1.9).

Let us study now Peano curves in comparison with spirals and TV evolvents.
The Peano-like curve l(x), x [0, 1], covers the grid
H(M, N) = {yi ; 0 i 2MN 1}
(2.2.74)
having, as already mentioned, a mesh width equal to 2M . It should be stressed

that the most important feature of the evolvent l(x) is not in its piecewise linearity,
covering the grid (2.2.74). The most important property is presented by the relation
(2.2.72) which is similar to the relation (2.1.12) for the Peano curve y(x). This
property ensures the boundedness for the first divided differences of the function
F(l(x)), x [0, 1], corresponding to the Lipschitzian function F(y), y D; see
(2.2.73).
33
Fig. 2.5 Piecewise-linear curves covering the set (2.2.75) at N = 2, M = 3: spiral (the left picture)
and TV evolvent (the right picture); nodes of the set (2.2.75) are marked by the red dots
We confine our consideration to the class of piecewise-linear curves and characterize the complexity of any particular curve from this family by the number of
linear segments it is built of (each linear segment is assumed to be parallel to one of
the coordinate axes). Spiral and TV evolvents (the images of the unit interval [0, 1]
generated by these curves are given in Fig. 2.5; case N = 2, M = 3) are clearly from
this family and they are much simpler than the Peano-like curve l(x); see Fig. 2.4 (in
both figures the nodes of the grid (2.2.75) are marked with red dots). For example,
the TV evolvent is possible to present in the parametric form t(x), 0 x 1, by the
following coordinate functions
t1 (x) = (1)q+1 21 {2M 1 + | | },
t2 (x) = 21 {(1 + 2q)2M 1 + | | },
(2.2.75)
where q = k, k = x(2M 2M ), = k q 1 + 2M . The curve (2.2.75) is defined

for N = 2. It is obviously much simpler than l(x) and it is possible to generalize this
scheme for N > 2.
But these simple piecewise-linear curves have essential drawback emerging from
the very fact that the image of the unit interval [0, 1] generated by such a curve
contains some linear segments covering a large number of nodes from (2.2.74).
Let us focus on this feature assuming that there exists at least one linear segment
covering 2M nodes (which is exactly the case for the TV evolvent).
Total length of all the segments constituting the image of the interval [0, 1]
generated by the curve s(x) = sM (x), 0 x 1, is equal 2M(N1) 2M . Suppose
that s(x ) and s(x ) are, respectively, the initial and the end points of the above linear
segment containing 2M nodes of the net (2.2.75). Then
||s(x ) s(x )|| = 1 2M > 21 ,
|x x | = (2M 1)/(2MN 1) < 2M(N1) ,
34
whence it follows
||s(x ) s(x )|| > 2M(11/N)1 (|x x |)1/N .
This means that there does not exist any coefficient that ensures the validity of a
relation similar to (2.1.12), (2.2.72) and does not dependent on M.
The Peano curve y(x) is defined as the limit object and, therefore, only approximations to this curve are applicable in the actual computing. The piecewise linear
evolvent l(x) = lM (x) suggested above covers all the nodes of the grid H(M, N)
from (2.2.74) and, thus, it allows us to ensure the required accuracy in analyzing
multidimensional problems by solving their one-dimensional images produced by
the implementation of l(x). But this evolvent has some deficiencies.
The first one is due to the fact that the grid H(M + , N) with mesh width equal
to 2(M+ ) does not contain the nodes of the less finer grid H(M, N). Therefore, in
general, the point lM (x ) may not be covered by the curve lM+ (x), 0 x 1, and,
hence, the outcomes already obtained while computing the values F(lM (x)) will not
be of any use if the demand for greater accuracy necessitates switching to the curve
lM+ (x), 1. This difficulty is possible to overcome by setting the parameter M
equal to a substantially larger value than seems to be sufficient at the beginning of
the search.
Another problem arises from the fact that l(x) is a one-to-one correspondence
between the unit interval [0, 1] and the set {l(x) : 0 x 1} D though the Peano
curve y(x) has a different property: the point y D = {y(x) : 0 x 1} could have
several inverse images in [0, 1] (but not more than 2N ). That is, the points y D could
be characterized by their multiplicity with respect to the correspondence y(x). This
is due to the fact that though each point x [0, 1] is contained just in one subinterval
of any Mth partition, some subcubes corresponding to several different subintervals
of the same Mth partition (e.g., all the subcubes of the first partition) could have
a common vertex. Therefore, some different inverse images x , x [0, 1], x = x ,
could have the same image, i.e., y(x ) = y(x ).
This multiplicity of points y D with respect to the correspondence y(x) is
the fundamental property reflecting the essence of the dimensionality notion: the
segment [0, 1] and the cube D are sets of equal cardinality and the first one could
be mapped onto the other by some single-valued mapping, but if this mapping
is continuous, then it could not be univalent (i.e., it could not be a one-to-one
correspondence), and the dimensionality N of the hypercube D determines the
bound from above (2N ) for the maximal possible multiplicity of y(x).
Therefore, the global minimizer y of the function F(y) over D could have several
inverse images xi , 1 i m, i.e., y = y(xi ), 1 i m, which are the global
minimizers of the function F(y(x)) over [0, 1].
To overcome the above deficiencies of l(x), we suggest one more evolvent n(x) =
nM (x) mapping some uniform grid in the interval [0, 1] onto the grid P(M, N) in the
hypercube D from (1.2) having mesh width equal to 2M (in each coordinate) and
meeting the condition
P(M, N) P(M + 1, N) .
(2.2.76)
35
The evolvent n(x) approximates the Peano curve y(x) and its points in D own the
property of multiplicity; each node of the grid P(M, N) could have several (but not
more than 2N ) inverses in the interval [0, 1].
Construction of n(x). Assume that the set of nodes in P(M, N) coincides with
the set of vertices of the hypercubes D(z1 , . . . , zM ) of the Mth partition. Then the
mesh width for such a grid is equal to 2M and the total number of all the nodes in
P(M, N) is (2M + 1)N . As long as the vertices of the Mth partition hypercubes are
also the vertices of some hypercubes of any subsequent partition M + , 1, then
the inclusion (2.2.77) is valid for the suggested grid.
Note that each of 2N vertices on any hypercube D(z1 , . . . , zM ) of the Mth
partition is simultaneously the vertex of just one hypercube D(z1 , . . . , zM , zM+1 ) from
D(z1 , . . . , zM ). Denote P(z1 , . . . , zM+1 ) the common vertex of the hypercubes
D(z1 , . . . , zM , zM+1 ) D(z1 , . . . , zM ) .
(2.2.77)
Due to (2.2.37), the center of the hypercube from the left-hand part of (2.2.77) and
the center of the hypercube from the right-hand part of (2.2.77) are linked by the
relation
y(z1 , . . . , zM+1 ) = y(z1 , . . . , zM ) + (utq (zM+1 ) 21)2(M+1) ,
whence it follows that
n(z1 , . . . , zM+1 ) = y(z1 , . . . , zM ) + (utq (zM+1 ) 21)2M ,
(2.2.78)
and varying zM+1 from 0 to 2N 1 results in computing from (2.2.78) all the 2N
vertices of the hypercube D(z1 , . . . , zM ).
Formula (2.2.78) establishes the single-valued correspondence between 2(M+1)N
intervals d(z1 , . . . , zM+1 ) of the Mth partition of [0, 1] and (2M + 1)N nodes
n(z1 , . . . , zM+1 ) of the grid P(M, N); this correspondence is obviously not a univalent
(not one-to-one) correspondence.
Number all the intervals d(z1 , . . . , zM+1 ) from left to right with subscript i, 0
i 2(M+1)N 1 and denote vi , vi+1 the end-points of the ith interval. Next, introduce
the numeration of the centers yi = y(z1 , . . . , zM+1 ) from (2.2.37) assuming that the
center corresponding to the hypercube D(z1 , . . . , zM+1 ) is assigned the same number
i as the number of the interval d(z1 , . . . , zM+1 ) = [vi , vi+1 ). Thus, we have defined
the one-to-one correspondence of the nodes
vi ,
0 i 2(M+1)N 1,
(2.2.79)
constituting a uniform grid in the interval [0, 1] and of the centers yi which, in
accordance with (2.2.78), generates the one-to-one correspondence of the end-points
vi from (2.2.79) and of the nodes p P(M, N).
Note that if the centers yi and yi+1 are from the same hypercube of the Mth
partition, then these centers (and consequently the corresponding points vi and vi+1 )
36
Fig. 2.6 Nodes of the grid

P(2, 2) (marked by the red
dots). Integers around the
nodes indicate the numbers of
the points from the uniform
grid (2.2.80) mapped onto the
corresponding nodes by n(x)
32
48
47 37
36
45
46 38
42
39 35
25
34
26 30
31
29
44
4
43 41
5 7
40
8 24
27
21
28
20
6
2 10
9 23
13
22
14 18
19
17
1 11
15
16
12
33
are juxtaposed with some different nodes from P(M, N). Therefore, the node p may
be juxtaposed to the points vi , vi+1 if and only if the corresponding centers yi and
yi+1 are from some different (but adjacent) subcubes of the Mth partition. As long
as the number of subcubes in the Mth partition of D is equal to 2MN , then there
are exactly 2MN 1 pairs vi , vi+1 juxtaposed with the same node from P(M, N) (in
general, this node is different for different pairs vi , vi+1 of the above type).
To ensure that any two vicinal nodes in [0, 1] are juxtaposed with different nodes
from P(M, N) we substitute each of the above pairs vi , vi+1 with just one node in
[0, 1]. Next, we rearrange the collocation of nodes in [0, 1] to keep up the uniformity
of the grid.
To do so we construct the uniform grid in the interval [0, 1] with the nodes
h j,
0 j 2(M+1)N 2MN = q,
(2.2.80)
where h0 = 0 and hq = 1, and juxtapose to the node h j of the grid (2.2.80) the node
vi of the grid (2.2.79), where
i = j + ( j 1)/(2N 1) .
(2.2.81)
Next, we assume that the node h j is juxtaposed with the node of the grid P(M, N)
generated by (2.2.78) for the center yi with i from (2.2.81). This mapping of the
uniform grid (2.2.81) in the interval [0, 1] onto the grid P(M, N) in the hypercube
D will be referred to as Non-Univalent Peano-like Evolvent (NUPE, for short) and
designated n(x) = nM (x).
For the sake of illustration, Fig. 2.6 presents the nodes of the grid P(2, 2) (marked
by the red dots) and each node is assigned with the numbers j of the points h j from
(2.2.80) mapped onto this node of the grid P(2, 2). These numbers are plotted around
the relevant nodes.
2.3 Standard Routines for Computing Approximations to Peano Curves
37
The inverse images h j corresponding to a given node p P(M, N) could be

computed in the following way. Let U be the set of 2N different binary vectors
u RN . Then, in accordance with (2.2.78),
Y (p) = {p (u 21)2M : u U},
p P(M, N),
is the set of all centers of the (M + 1)st partition subcubes from D generating the
same given node p P(M, N). If the center y Y (p) is assigned the number i, i.e.,
if y = yi , then it corresponds to the node vi of the grid (2.2.79), and being given this
number i it is possible to compute the number j = i i/2N of the corresponding
node h j from the grid (2.2.80). Different nodes h j obtained as the result of these
computations performed for all centers y Y (p) constitute the set of all inverse
images for the node p P(M, N) with respect to n(x). From (2.1.4), (2.1.5), follows
that the point vi corresponding to the center yi = y(z1 , . . . , zM+1 ) can be estimated
from the expression
vi =
M+1
z j 2 jN ,
j=1
where the numbers z1 , . . . , zM+1 are to be computed by presenting the corresponding

vector y(z1 , . . . , zM+1 ) in the form (2.2.37) and then analyzing the right-hand part of
this expression.
2.3 Standard Routines for Computing Approximations

to Peano Curves
This section presents the software package for computing images and inverse
images for the suggested approximations to Peano curves (centers of the Mth
partition hypercubes, piecewise-linear evolvents, and non-univalent evolvents). The
package originally was written in FORTRAN by R.Strongin and later rewritten in
C++ by V.Gergel. This last version is given here.
The principal function of the package is mapd which computes the image of the
point x from [0, 1] and places this image into the array y. The required accuracy is
set by selecting the number M of the corresponding partition which is to be assigned
as the value of the function parameter m. The dimensionality N of the hypercube D
from (1.2) is to be indicated by the parameter n. Notations m and n have the above
meaning in the description of all the functions presented in this section. As long
as the expansion (2.1.4) requires MN binary digits for representation of x, it has
to be mentioned that there is the constraint MN < where the value of is the
number of digits in the mantissa and, therefore, depends on the computer that is used
for the implementation. To select the particular evolvent it is necessary to assign
some appropriate integer value to the function parameter key: 1 corresponds to the
38
approximation by centers of the Mth partition hypercubes, 2 corresponds to the

approximation by the piecewise-linear evolvent lM (x), and 3 corresponds to the nonunivalent evolvent nM (x).
The function mapd uses the auxiliary function node which computes the vector
u(s) (designated iu) similar to the one from (2.2.2), (2.2.3) (parameter is corresponds
to s). But the particular software realization we consider uses the scheme in which
|ui (s)| = 1, 1 i N, i.e., u(s) is not a binary vector (zero values of the coordinates
from (2.2.2), (2.2.3) are replaced with 1 and, consequently, the negation operation
is replaced by the invertor changing the sign of the coordinates). It should also be
mentioned that iu[0] and iu[2N 1] are different in the first coordinate, but not in the
Nth one as in (2.2.4).
This function also computes the vector w(s) (designated iv) and the integer
l(s), respectively, from (2.2.27), (2.2.28) and (2.2.25), (2.2.26); it is assumed that
|wi (s)| = 1, 1 i N.
The inverse images h j from (2.2.80) for the given point p P(M, N) generated
by the non-univalent evolvent nM (x) are computed by the function invmad. It has to
be mentioned that, instead of the term inverse image, the shorter term preimage
is used in the comments in the bodies of the functions. The computed coordinates
of the inverse images are placed into the array xp. The size of this array is set by the
parameter kp; the number of the actually computed inverse images is reflected by the
value of the parameter kxx. The parameter incr provides the possibility for the user
to stipulate the condition that the difference in the location of any two computed
vicinal inverse images should be not less than the distance prescribed by the value
of incr; the required distance in incr is recorded as the number of the sequential
nodes of the grid (2.2.80) parting the vicinal inverse images.
The function invmad uses the auxiliary function xyd which computes the left-end
points vi (designated xx) from (2.2.79) corresponding to the centers yi (placed into
the array y) employed in (2.2.68). This function is also possible to use as the means
for computing approximations to some inverse images of the points y D with
respect to the Peano curve y(x). In this case, the function xyd, first, computes the
nearest center yi = y(z1 , . . . , zM ) to the given point y D and then estimates the leftend point vi of the interval d(M, vi ) = d(z1 , . . . , zM ). This point is the approximation
(with accuracy not worse than 2MN ) to one of the inverse images of the given
point y.
Routines
File map.h
/* map modules */
#ifndef MAP
#define MAP
void mapd ( double, int, float *, int, int ); /* map x to y */
void invmad ( int, double *, int , int *, float *, int , int );
/* map y to x */
#endif
File x to y.c
#include <math.h>

#include "map.h"
int n1,nexp,l,iq,iu[10],iv[10];
void mapd( double x, int m, float y[], int n, int key ) {
/* mapping y(x) : 1 - center, 2 - line, 3 - node */
double d,mne,dd,dr;
float p, r;
int iw[11], it, is, i, j, k;
void node ( int );
p=0.0;
n1=n-1;
for ( nexp=1,i=0; i<n; nexp*=2,i++ ); /* nexp=2**n */
d=x;
r=0.5;
it=0;
dr=nexp;
for ( mne=1,i=0; i<m; mne*=dr,i++ ); /* mne=dr**m */
for ( i=0; i<n; i++ ) {
iw[i]=1; y[i]=0.0;
}
if ( key == 2 ) {
d=d*(1.0-1.0/mne); k=0;
} else
if ( key > 2 ) {
dr=mne/nexp;
dr=dr-fmod(dr,1.0);
dd=mne-dr;
dr=d*dd;
dd=dr-fmod(dr,1.0);
dr=dd+(dd-1.0)/(nexp-1.0);
dd=dr-fmod(dr,1.0);
d=dd*(1./mne);
}
for ( j=0; j<m; j++ ) {

iq=0;
if ( x == 1.0 ) {
is=nexp-1; d=0.0;
} else {
d=d*nexp;
is=d;
d=d-is;
}
i=is;
node(i);
i=iu[0];
iu[0]=iu[it];
iu[it]=i;
i=iv[0];
iv[0]=iv[it];
iv[it]=i;
if ( l == 0 )
l=it;
else if ( l == it ) l=0;
if ( (iq>0)||((iq==0)&&(is==0)) ) k=l;
else if ( iq<0 ) k = ( it==n1 ) ? 0 : n1;
39
40

r=r*0.5;
it=l;
for ( i=0; i<n; i++ ) {
iu[i]=iu[i]*iw[i];
iw[i]=-iv[i]*iw[i];
p=r*iu[i];
p=p+y[i];
y[i]=p;
if ( key == 2 ) {
if ( is==(nexp-1) ) i=-1;
else i=1;
p=2*i*iu[k]*r*d;
p=y[k]-p;
y[k]=p;
} else if ( key == 3 ) {
for ( i=0; i<n; i++ ) {
p=r*iu[i];
p=p+y[i];
y[i]=p;
}
void node ( int is ) {

/* calculate iu=u[s], iv=v[s], l=l[s] by is=s */
int n,i,j,k1,k2,iff;
n=n1+1;
if ( is == 0 ) {
l=n1;
for ( i=0; i<n; i++ ) {
iu[i]=-1; iv[i]=-1;
}
} else if ( is == (nexp-1) ) {
l=n1; iu[0]=1; iv[0]=1;

for ( i=1; i<n; i++ ) {
iu[i]=-1; iv[i]=-1;
}
iv[n1]=1;
} else {
iff=nexp;
k1=-1;
for ( i=0; i<n; i++ ) {
iff=iff/2;
if ( is >= iff ) {
if ( (is==iff)&&(is != 1) ) { l=i; iq=-1; }
is=is-iff;
k2=1;
}
else {
k2=-1;
if ( (is==(iff-1))&&(is!= 0) ) { l=i; iq=1; }
j=-k1*k2;
41
iv[i]=j;
iu[i]=j;
k1=k2;
}
iv[l]=iv[l]*iq;
iv[n1]=-iv[n1];
}
}
File y to x.c
#include <math.h>
#include "map.h"
static void xyd ( double *, int, float *, int ); /* get a
preimage */
static void numbr ( int *iss);
extern int n1,nexp,l,iq,iu[10],iv[10];
double del;
void
invmad ( int m, double xp[], int kp, int *kxx, float p[], int n,
int incr ) {
/*
preimages calculation
- m - map level (number of partitioning)
- xp - preimages to be calculated
- kp - number of preimages that may be calculated (size of xp)
- kxx - number of preimages being calculated
- p - image for which preimages are calculated
- n - dimension of image (size of p)
- incr - minimum number of map nodes that must be between
preimages
*/
double mne,d1,dd,x,dr;
float r,d,u[10],y[10];
int i,k,kx,nexp;
void xyd ( double *, int, float *, int );
kx=0;
kp--;
for ( nexp=1,i=0; i<n; i++ ) { nexp*=2; u[i]=-1.0; }
dr=nexp;
for ( mne=1, r=0.5, i=0; i<m; i++ ) { mne*=dr; r*=0.5; }
dr=mne/nexp;
dr=dr-fmod(dr,1.0);
del=1./(mne-dr);
d1=del*(incr+0.5);
for ( kx=-1; kx<kp; ) {
for ( i=0; i<n; i++ ) { /* label 2 */
d=p[i];
y[i]=d-r*u[i];
}
for ( i=0; (i<n) && (fabs(y[i]) < 0.5) ; i++ );

if ( i>=n ) {
xyd(&x,m,y,n);
dr=x*mne;
42

dd=dr-fmod(dr,1.0);
dr=dd/nexp;
dd=dd-dr+fmod(dr,1.0);
x=dd*del;
if ( kx>kp ) break;
k=kx++; /* label 9 */
if ( kx == 0 ) xp[0]=x;
else {
while ( k>=0 ) {
dr=fabs(x-xp[k]); /* label 11 */
if ( dr<=d1 ) {
for ( kx-- ; k<kx; k++,xp[k]=xp[k+1] );
goto m6;
} else
if ( x <= xp[k] ) {
xp[k+1]=xp[k]; k--;
} else break;
}
xp[k+1]=x;
}
m6 : for ( i=n-1; (i>=0)&&(u[i]=(u[i]<=0.0) ? 1 : -1)<0; i-);

if ( i<0 ) break;
}
*kxx=++kx;
}
void xyd ( double *xx, int m, float y[], int n){

/* calculate preimage xx for the nearest m-level center of y */
/* (xx - left boundary point of m-level interval) */
double x,r1;
float r;
int iw[10];
int i,j,it,is;
void numbr ( int * );
n1=n-1;
for ( nexp=1,i=0; i<n; i++ ) { nexp*=2; iw[i]=1; }
r=0.5;
r1=1.0;
x=0.0;
it=0;
for ( j=0; j<m; j++ ) {
r*=0.5;
for ( i=0; i<n; i++ ) {
iu[i] = ( y[i]<0 ) ? -1 : 1;
y[i]-=r*iu[i];
iu[i]*=iw[i];
}
i=iu[0];
iu[0]=iu[it];
iu[it]=i;
numbr ( &is );
i=iv[0];
iv[0]=iv[it];
43
iv[it]=i;
for ( i=0; i<n; i++ )
iw[i]=-iw[i]*iv[i];
if ( l == 0 ) l=it;
else if ( l == it ) l=0;
it=l;
r1=r1/nexp;
x+=r1*is;
*xx=x;
}
void numbr ( int *iss) {

/* calculate s(u)=is,l(u)=l,v(u)=iv by u=iu */
int i,n,is,iff,k1,k2,l1;
n=n1+1;
iff=nexp;
is=0;
k1=-1;
for ( i=0; i<n; i++ ) {
iff=iff/2;
k2=-k1*iu[i];
iv[i]=iu[i];
k1=k2;
if ( k2<0 ) l1=i;
else { is+=iff; l=i; }
if ( is == 0 ) l=n1;
else {
iv[n1]=-iv[n1];
if ( is == (nexp-1) ) l=n1;
else if ( l1 == n1 ) iv[l]=-iv[l]; else l=l1;
}
*iss=is;
}
Example 2.1. We consider in Fig. 2.7 an approximation of the Peano curve, the
piecewise-linear evolvent (obtained with key=2), in dimension N = 2 and with level
M = 3. The points 1, 2 and 3 have, respectively, coordinates in the interval [0, 1] and
Fig. 2.7 Peano-like piecewise-linear evolvent, N = 2, M = 3
44
Peano curve
and its rotated copy
Fig. 2.8 Several rotated Peano curves can help to construct one-dimensional schemes better
representing the information about vicinity of the points in the multidimensional domain
in the domain D = {y R2 : 21 y j 21 , j = 1, 2}:

1 : (0.0470)
(0.3125, 0.4326)
2 : (0.0320)
(0.3125, 0.3145)
3 : (0.2060)
(0.3125, 0.1848)
It can be seen that in the domain D the point 2 is equidistant from the points 1 and 3
whereas at the interval [0, 1] the point 2 is significantly more distant from the point
3 than from the point 1.
It has been recently shown (see [6]) that a simultaneous usage in one-dimensional
reduction of several curves rotated with respect to the search domain (see Fig. 2.8)
gives the possibility to have a better representation of the information about vicinity
of the points in the multidimensional domain.
Example 2.2. The main function suggested here presents the test illustrating the
way in which the functions are to be called. First, it computes the image PM (0.5)
at M = 10, N = 2 and estimates all the inverse images for the obtained image; then
it generates the images for all three estimated inverse images demonstrating that in
all three cases the same image (0, 0) is obtained. Second, it computes the image
PM (0.55) at M = 14, N = 3 and regenerates eight inverse images which is followed
by generation of eight images (all the same).
File test.c
#include <conio.h>
#include <stdio.h>
#include "map.h"
#define KEY 3
main() {
double x, xp[32], d, del;
45
int i,j,n,m,kp,kx;
float y[10];
/* Initialization */
clrscr();
n=2;
m=5;
y[0] = 0.0;
y[1] = 0.0;
printf("\t\tTesting the map modules\n\n");
/* parameters input */
printf("Input dimension (2-5) - ");
scanf("%d",&n);
printf("Input map level (n*m< ) - ");
scanf("%d",&m);
printf("Input a preimage value - ");
scanf("%lf",&x);
/* calculation and output */
printf("\n\t\tCalculation results\n\n");
printf("Preimage = %lf, Dimension = %d, Map level = %d\n",x,n,m);
/* image calculation */
mapd ( x, m, y, n, KEY );
printf("Image (");
for ( i=0; i<n; i++ )
printf(" %f%c",y[i],(i==n-1)? :,);
printf(" )\n\n");
/* back calculation preimages of image */
printf("Input number of preimages that may be calculated - ");
scanf("%d",&kp);
invmad ( m, xp, kp, &kx, y, n, 1 );
printf("\nPreimages \n\n");
for ( i=0; i<kx; i++ )
printf(" %30.25f\n",xp[i]);
printf("\n");
/* testing preimages that are calculated */
for ( d=1.0, i=0; i<n; i++, d /= 2.0 );
for ( del=1.0, i=0; i<m; i++, del *= d );
for ( i=0; i<kx; i++ ) {
mapd ( ( (xp[i]==1.0) ? xp[i] : xp[i] + 0.5 * del ), m, y, n,
KEY );
printf("Image for %d preimage = (",i);
for ( j=0; j<n; j++ )
printf(" %f%c",y[j],(j==n-1)? :,);
printf(" )\n\n");
}
getch();
}
Results of the test

Testing the map modules
Input dimension (2--5) - 2
Input map level (n*m< ) - 10
Input a preimage value - 0.5
Calculation results
46
Preimage = 0.500000, Dimension = 2, Map level = 10

Image ( 0.000000, 0.000000 )
Input number of preimages that may be calculated - 4
Preimages
0.1666666666666666574000000
0.5000000000000000000000000
0.8333333333333332593000000
Image for 0 preimage = ( 0.000000, 0.000000 )
Testing the map modules
Input dimension (2--5) - 3
Input map level (n*m< ) - 14
Input a preimage value - 0.55
Calculation results
Preimage = 0.550000, Dimension = 3, Map level = 14
Image ( 0.099976, 0.099976, -0.100098 )
Input number of preimages that may be calculated - 8
Preimages
0.5499999998870147566000000
0.5499999998932513234000000
0.5499999998942907142000000
0.5499999999005272810000000
0.5499999999914767512000000
0.5499999999925161420000000
0.5499999999987527088000000
0.5499999999997920996000000
Image for 0 preimage = ( 0.099976, 0.099976, -0.100098
)
)
)
)
)
)
)
)
Chapter 3
Global Optimization Algorithms Using Curves

to Reduce Dimensionality of the Problem
An algorithm must be seen to be believed.

Donald Knuth
3.1 Introduction
In this chapter, we return to the global optimization problem of a multiextremal
function satisfying the Lipschitz condition over a hyperinterval. Let us recollect
briefly some of the achievements we have got by now. To deal with the multidimensional global optimization problems we would like to develop algorithms that use
numerical approximations of space-filling curves to reduce the original Lipschitz
multidimensional problem to a univariate one satisfying the Holder condition.
In particular, we consider the following problem:
min{F(y) : y [a, b]},
(3.1.1)
where [a, b] is a hyperinterval in RN and F(y) is a multiextremal function that

satisfies the Lipschitz condition
|F(y ) F(y )| Ly y ,
y , y [a, b],
(3.1.2)
with a constant L, 0 < L < , generally unknown; denotes the Euclidean norm.
Due to Theorem 2.1, the multidimensional global minimization problem (3.1.1),
(3.1.2) is turned into a one-dimensional problem. In particular, finding the global
minimum of the Lipschitz function F(y), y RN , over a hypercube is equivalent to
determining the global minimum of the function f (x):
f (x) = F(y(x)),
x [0, 1],
(3.1.3)
where y(x) is the Peano curve. Moreover, the Holder condition

47
48
3 Global Optimization Algorithms Using Curves to Reduce Dimensionality of . . .
| f (x ) f (x )| H|x x |1/N ,
x , x [0, 1],
holds (in accordance with (2.1.9)) for the function f (x) with the constant
H = 2L N + 3,
(3.1.4)
(3.1.5)
where L is the Lipschitz constant of the multidimensional function F(y). Thus, we

can solve the problem (3.1.1), (3.1.2) by using algorithms proposed for minimizing
functions in one dimension (see [74, 77, 100, 103, 132, 134, 136, 139, 140]). If such
a method uses an approximation of the Peano curve pM () of level M and provides
a lower bound UM for the one-dimensional function y(x), then this value will be a
lower bound for the function F(y) but only along the curve pM (). Naturally, the
following question becomes very important in this connection: Can we establish a
lower bound for the function F(y) over the entire multidimensional search region
D? The following theorem answers this question. Without loss of generality we use
D = {y RN : 1 y j 1, 1 j N}.
Theorem 3.1. Let UM be a lower bound along the Peano curve pM (x) for a
multidimensional function F(y) satisfying Lipschitz condition with constant L, i.e.,
UM F(pM (x)),
x [0, 1].
(3.1.6)
Then the value
U = UM 2(M+1)L N
is a lower bound for F(y) over the entire region D, i.e.,
U F(y),
y D.
(3.1.7)
Proof. Every point y D is approximated by points (called images) i (y), 1 i J,

on the curve minimizing the Euclidean distance from y to the curve
i (y) = arg min{y y : y = pM (x), x [0, 1]},
1 i J.
It has been shown that the number of images, J, ranges between 1 and 2N . For
example, for N = 2 (see Fig. 3.1), the point A has four images on the curve, C has
three images, B has two, and D has only one image.
Let us consider now a point y D and its approximation (y) on the Peano curve.
Since the function F(y) satisfies the Lipschitz condition, we have
|F(y) F( (y))| Ly (y),
F(y) F( (y)) Ly (y).
3.1 Introduction
49
0,5
0,4
0,3
0,2
2
1
0,1
3 C
B
2
0,1
1
1
0,2
D
A
0,3
4
0,4 dm
0,5
0,5
0,4
0,3
0,2
0,1
0,1
0,2
0,3
0,4
0,5
Fig. 3.1 Images of points, belonging to the 2-dimensional domain, on the Peano curve
The point (y) belongs to the Peano curve and UM is a lower bound for F(y) along
the curve. Thus, it follows from (3.1.6) that F( (y)) UM and then
F(y) UM Ly (y) UM LdM ,
(3.1.8)
where the designation

dM = max y (y)
yD
has been used. It is easy to understand how the distance dM can be calculated. The
Peano curves establish a correspondence between subintervals of the curve and Ndimensional sub-cubes of D RN (these sub-cubes for N = 2 are shown in Fig. 3.1)
with the side equal to 2M . Thus, dM is equal to thedistance between the center of
a sub-cube and one of its vertex, i.e., dM = 2(M+1) N. From this result and (3.1.8)
we obtain the final estimate
F(y) UM L2(M+1) N = U
that concludes the proof.
In this chapter, our goal is to introduce algorithms for solving the problem
(3.1.1), (3.1.2) by using algorithms proposed for minimizing functions in one
dimension. We reach our goal in three steps. First, in order to explain how Lipschitz
50
information can be used for global optimization purposes we introduce

one-dimensional methods in Euclidian metrics. Then, we show how these ideas
can be generalized to the cases where the univariate objective function satisfies
the Holder condition. Finally, we close the circle by using the developed
methods together with space-filling curves for solving the problem (3.1.1), (3.1.2).
3.2 One-Dimensional Information and Geometric Methods

in Euclidean Metrics
In this section, we study the global optimization problem of finding a point x
belonging to a finite one-dimensional interval [a, b] and the value f = f (x )
such that
f = f (x ) = min{ f (x) : x [a, b]},
(3.2.1)
where the objective function f (x) can be non-differentiable, multiextremal, blackbox and it satisfies the Lipschitz condition over [a, b], i.e.,
| f (x) f (y)| L|x y|,
x, y [a, b],
(3.2.2)
with a constant L, 0 < L < .

It follows from (3.2.2) that, for all x, y [a, b] we have
f (x) f (y) L|x y|.
Then, if we fix a point y [a, b], the function
C(x) = f (y) L|x y|
is such that f (x) C(x), x [a, b].
The popular method proposed by Piyavskii (see [92]) is based directly on the
last observation together with the assumption that the constant L is known; it was
one of the first methods that used the geometric approach. In fact, this algorithm,
in the course of its work, constructs piece-wise linear support functions for f (x)
over every subinterval [xi1 , xi ], 2 i k, where the points x1 , . . . , xk are trial points
previously produced by the algorithm, i.e., zi = f (xi ), 2 i k. More precisely, at
the (k + 1)th iteration the method constructs (see Fig. 3.2) an auxiliary piece-wise
linear function
Ck (x) =
k

i=2
ci (x),
Ck (x) f (x),
x [a, b],
3.2 One-Dimensional Information and Geometric Methods in Euclidean Metrics

Fig. 3.2 A piece-wise linear
support function constructed
by the method of Piyavskii
after five evaluations of f (x)
51
2
f(x) zi-1
zi
2
xi)
ci (
xi-1
x
i
xi
where
ci (x) = max{zi1 L(x xi1), zi + L(x xi)},
x [xi1 , xi ].
The function Ck (x) is called either minorant for the objective function f (x) or lower
bounding or support function. For its shape Ck (x) is often called a saw-tooth cover
of f (x) over [a, b] and the function ci (x), i fixed, is called a tooth. It is simple to
show that the minimum of the tooth ci (x) is reached at the point
xi =
xi + xi1 zi1 zi
+
2
2L
(3.2.3)
and the minimum value is

ci (xi ) =
xi xi1
zi + zi1
L
.
2
2
(3.2.4)
In the Piyavskii method the next trial point xk+1 is chosen in such a way that
Ck (xk+1 ) = min Ck (x),
x[a,b]
i.e., xk+1 is the point that corresponds to the minimum of the deepest tooth:
Ck (xk+1 ) = min ci (xi ).
1ik
After a finite number, K, of trials, the global minimum f from (3.2.1) can be
estimated by the value
fK = min{zi : 1 i K},
52
and the point

xK = arg min{zi : 1 i K}
can be taken as an estimate of the global minimizer x .
The idea to construct minorants for solving global optimization problems has
proved to be very fruitful. In fact, there exist a huge number of various versions and
generalizations of Piyavskii method (see [4, 17, 44, 49, 51, 58, 64, 78, 86, 87, 93, 117,
118, 120, 130, 136, 144, 147, 150152], etc.). Algorithms constructing minorants for
functions having Lipschitzian first derivatives are of a special interest (see [5, 8, 43,
44, 50, 51, 7072, 77, 102, 103, 106, 117, 118, 120], etc.).
Piyavskiis method requires, for its correct work, the knowledge of the exact
value of the Lipschitz constant L (obviously, an overestimate of L also will work).
However, increasing the estimate for L will cause a respective increase in the number
of trials required to sustain the same accuracy of the search. If the constant L is
not available (and as has been already mentioned, this situation can be very often
encountered in practice), it is necessary to look for a procedure that would allow
one to provide an adaptive approximation of L during the search.
The information approach introduced by Strongin in [131133, 139] proposes a
bunch of ideas for global optimization. In particular, it shows how the constant L can
be adaptively estimated. The information approach has been successfully used for
developing a number of powerful numerical methods for solving global optimization
problems of various kind (see [6, 56, 75, 120, 123, 132140], etc.). In particular,
this approach has allowed the authors and their collaborators to develop sequential
and parallel numerical global optimization algorithms for working both with
unconstrained problems and with problems with multiextremal partially defined
constrained (see the monograph [139] for a comprehensive presentation of these
results). It is interesting that the constrained case is treated without the introduction
of any additional (penalty) parameters.
Now we provide a detailed description of the core global optimization algorithm
from [132, 139] for solving problems of the type (3.2.1), (3.2.2). It will be referred
to from now on as the Information Algorithm.
Information Algorithm (IA)
Step 0. The first two trials are to be carried out at the end-points of the search
interval [a, b], i.e.,
x0 = a,
x1 = b.
(3.2.5)
The choice of the point xk+1 , k > 1, of any subsequent (k + 1)th trial is determined
by the following rules:
Step 1. Renumber the points x0 , . . . , xk of the previous trials by subscripts in
increasing order of the coordinate, i.e.,
a = x0 < x1 < . . . < xk1 < xk = b,
(3.2.6)
53
and juxtapose to them the values zi = f (xi ), 0 i k, which are the outcomes
z0 , . . . , zk renumbered by subscripts.
Step 2. Compute the maximal absolute value of the divided differences:
zi zi1
.
M = max
1ik xi xi1
(3.2.7)
Step 3. Accept the estimate

m=
1, M = 0,
rM, M > 0,
(3.2.8)
where r > 1 is the reliability parameter of the algorithm.

Step 4. For each interval [xi1 , xi ], 1 i k, calculate the value
Ri = m(xi xi1) +
(zi zi1 )2
2(zi + zi1 )
m(xi xi1)
(3.2.9)
called characteristic of the interval.

Step 5. Select the interval [xt1 , xt ] corresponding to the maximal characteristic
Rt = max Ri .
1ik
(3.2.10)
If the condition (3.2.10) has more than one solution, i.e., if the maximal value of the
characteristic is attained for several intervals, then the minimal integer satisfying
(3.2.10) is accepted as t.
Step 6. If
xt xt1 > ,
(3.2.11)
where > 0 is a given search accuracy, then accept

xk+1 =
xt + xt1 zt zt1
2
2m
(3.2.12)
as the point for the next trial and go to Step 1. Otherwise, calculate an estimate of
the minimum as
f = min{zi : 1 i k}
and STOP.
54
It is important to recollect that to work with the Piyavskii method we have to

find the exact Lipschitz constant or its overestimatethe problem that can happen
to be as much expensive as the global search itself. In contrast, in the IA, the simple
adaptive estimator (3.2.7), (3.2.8) is used in the course of the search. It has withstood
the test of time and become the core for many new efficient generalizations (see,
e.g., [75, 101, 117, 118, 139]). In the next chapter, the idea of an adaptive estimating
of a Lipschitz information will be developed in a new promising direction and
adaptive local estimates of the local Lipschitz constants over each subinterval
[xi1 , xi ] [a, b] will be introduced (see [74, 100, 101, 103, 105]). Let us make now
some observations with regard to the scheme described above.
Remark 3.1. (1) The scheme IA is convenient for analytic investigation of the
suggested algorithm. The implementation of this scheme on a computer can be
much more efficient if some more or less obvious changes are introduced. For
example, the value M = Mk from (3.2.7) may be computed as

|zk+1 zt1 | |zt zk+1 |
Mk = max Mk1 , k+1
,
x xt1 xt xk+1
where Mk1 is the value used at the previous step, etc.
(2) The introduced ordering of trials (3.2.6) by subscripts depends on the number k
of trials performed. In fact, it assumes that the notation xi for i = i(k) and for
i = i(k + 1) may correspond to different points of the sequence {xk }.
(3) In further consideration we shall often use the value M from (3.2.7) as a
denominator, which assumes that M > 0, i.e., that not all the outcomes zi =
f (xi ), 0 i k, are equal; we shall not stipulate this any more.
(4) After normalization the characteristic Ri from (3.2.9) can be interpreted as the
probability that a global minimizer is located within the interval [xi1 , xi ], in the
course of the (k + 1)th iteration.
3.2.1 Convergence Conditions and Numerical Examples

Lemma 3.1. Let x be a limit point of the sequence {xk } generated by the decision
rules of the Information Algorithm while minimizing the bounded function f (x), x
[a, b]; besides x = a and x = b. Then the sequence {xk } contains two subsequences,
one of which converges to x from the left and the other from the right.
Proof. Any regular trial executed at the point xk+1 partitions the interval [xt1 , xt ]
into two subintervals [xt1 , xk+1 ] and [xk+1 , xt ]. Due to (3.2.12) and (3.2.7), (3.2.8),
these subintervals meet the inequality
1
1
max{xt xk+1 , xk+1 xt1 } (1 + )(xt xt1 )
2
r
(3.2.13)
55
where r > 1. Denote t = t(k) the number of the interval [xt1 , xt ] including the point
x at the step k(k > 0). If the trial points do not coincide with the limit point x,
i.e., if
x = xk for any k > 0, then, from (3.2.13), it follows that
lim (xt xt1 ) = 0.
In this case, the left-hand points xq = xt(q)1 and the right-end points x p = xt(p) of
the above intervals bracketing the point x constitute two subsequences convergent
to x from left and right, respectively.
Now consider the case when, at some step q, the trial is carried out exactly at the
point xq = x and, thus, at any step k > q, there exists an integer j = j(k) such that
x j = x = xq . Suppose that in this case there is no any subsequence convergent to the
point x from the left. Then
lim (x j x j1) > 0,
and there exists a number p such that the trials do not hit the interval (x j1 , x j ) =
(x p , xq ) = (x p , x)
if k > max(p, q). From (3.2.9), the characteristic R j(k) of this
interval is equal to
2
(z p f (x))
2(z p f (x))
4 f (x)
=
m(x x p)
= m(x x p )(1 )2 4 f (x)
R j = m(x x p ) +
where
z p f (x)
.
m(x x p)
Similarly, by introducing the notation t = j(k) + 1 for the number of the interval
(x,
x j+1 ), we obtain
Rt = m(xt x)(1
)2 4 f (x)
where
zt f (x)
.
m(xt x)
From (3.2.7), (3.2.8) we have

(1 )2 > (1 r1)2 ,
therefore,
(1 )2 < (1 + r1)2 < 4;

R j Rt > m (x x p )(1 r1)2 4(xt x)
whence it follows that after some step

R j(k) > Rt(k)
(3.2.14)
56
because xt x 0 with k (as a consequence of x being a limit point). But, due

to the condition (3.2.10), the inequality (3.2.14) is in contradiction with the assumed
impossibility of trials in the interval (x p , xq ).
The assumption that there is no convergence to the point x from the right may be
disproved in a similar way.

Theorem 3.2. Let the point x be the limit point of the sequence {xk } generated
by the rules of the Information Algorithm while minimizing Lipschitzian with the
constant L function f (x), x [a, b]. Then:
1. If the function f (x) has a finite number of local minimizers at the interval [a, b],
then the point x is locally optimal, i.e., it is the local minimizer.
2. If side by side with x there exists another limit point x of the sequence {xk }, then
f (x)
= f (x).
3. For any k 0, zk = f (xk ) f (x).
4. If, at some step, the value m from (3.2.8) satisfies the inequality
m > 2L,
(3.2.15)
then any global minimizer x from (3.2.1) is the limit point of the sequence {xk };
besides, any limit point x of this sequence is the global minimizer of f (x).
Proof. Due to our assumption, the function to be minimized possesses a finite
number of local minimizers. Then the function f (x) is strictly monotonous in the
intervals (x , x)
and (x,
x + ) for sufficiently small real number > 0 (there
exists just one of these intervals if x = b or x = a). If we admit that the point x is not
locally optimal, then for all points x from at least one of these intervals it is true that
f (x) < f (x).
Due to the existence of two subsequences convergent from left and right, respectively, to x (see Lemma 3.1), the validity of the last inequality contradicts the third
statement of this theorem.
The assumption of the existence of some subsequence convergent to x also
contradicts the third statement of the theorem if f (x)
= f (x).
Let us show the validity of the third statement. Assume the opposite, i.e., that at
some step q 0 the outcome
zq = f (xq ) < f (x)
(3.2.16)
is obtained. Denote by j = j(k) the subscript corresponding to the point xq at the

step k, i.e., z j = zq , and consider the characteristic of the interval [x j1 , x j ) computed
from (3.2.9) (if q = 0, then R1 has to be considered).
By introducing the notation
m(x j x j1)
|z j z j1|
57
where > 1 due to (3.2.7), (3.2.8), we derive the relations

R j = |z j z j1 |( + 1 ) 2(z j + z j1 ) >
> 2{max(z j , z j1 ) min(z j , z j1 )} 2(z j + z j1 ) = 4 min(z j , z j1 ).
The last inequality, which also holds true for the case z j = z j1 (this can easily be
checked directly from (3.2.9)), and the assumption (3.2.16) results in the inequality
R j > 4 f (x)
+ ( f (x)
zq ).
(3.2.17)
If t = t(k) is the number of the interval [xt1 , xt ] containing the point x at the step k,
then
lim Rt(k) = 4 f (x)
(3.2.18)
because for the Lipschitzian function f (x) the estimate m from (3.2.7), (3.2.8)
is bound. From (3.2.17), (3.2.18) follows the validity of (3.2.14) for steps with
sufficiently large k. Hence, the point x cannot be the limit point if the assumption
(3.2.16) is true.
Let us prove the last statement. Let the condition (3.2.15) be met at some step q.
Then, from (3.2.7), (3.2.8), it will be met at any subsequent step k q. Denote
j = j(k) the number of the interval including the point x from (3.2.1) at the step k.
If x is not a limit point, then there exists such a number p > 0 that for any k p
xk+1
/ [x j1 , x j ].
(3.2.19)
As far as the function satisfies Lipschitz condition (3.2.2), the inequality

z j + z j1 L(x j x j1) f (x )
(3.2.20)
is met for the above interval, whence, due to (3.2.9) and (3.2.15), follows the
estimate for R j
R j(k) > 4 f (x ).
(3.2.21)
This estimate is true for any k > max(p, q). But, on the other hand, (3.2.18) is valid
for any limit point x;
besides f (x ) f (x).
Therefore, from (3.2.18) and (3.2.21),
follows the validity of (3.2.14) if k is sufficiently large, which is the contradiction
to (3.2.19). Thus, the point x of the absolute minimum of the function f (x) over
[a, b] is the limit point of the sequence {xk } if the condition (3.2.15) is met. Then,
as a consequence of the second statement of the theorem, any other limit point x is
necessarily the global minimizer.

58
Table 3.1 Numbers of trials

done by the methods before
satisfaction of the stopping
rule
Problem
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
GA
377
308
581
923
326
263
383
530
314
416
779
746
1,829
290
1,613
992
1,412
620
302
1,412
PM
149
155
195
413
151
129
153
185
119
203
373
327
993
145
629
497
549
303
131
493
IA
127
135
224
379
126
112
115
188
125
157
405
271
472
108
471
557
470
243
117
81
Average
720.80
314.60
244.15
Corollary 3.1. Under the condition (3.2.15) the set of all limit points of the trial
sequence generated by the IA coincides with the union of all global minimizers of
the function.
In particular, if x is the unique global minimizer, i.e., for any x [a, b], f (x ) <
f (x) if x = x , then
lim xk = x .
To illustrate the performance of the IA let us present results of some numerical

experiments that were carried out using the set of 20 test functions proposed in
[59] for comparing one-dimensional global optimization algorithms. The method IA
was compared (see Table 3.1) with the Piyavskii method (denoted by PM) and the
Galperin algorithm (see [35]) denoted by GA. Both algorithms do not use derivatives
and work with the exact values of the Lipschitz constant. The problem of how to find
this constant is not discussed by the methods. In general, a predefined estimate of
the Lipschitz constant is used in place of this value.
To be accurate, when comparing the algorithms PM and GA with IA it would
be proper to add the computational efforts spent in obtaining an estimate of the
Lipschitz constant to the efforts required in finding an -approximation of the global
minimum. The methods PM and GA do this in two separate phases and the algorithm
IA does it simultaneously. But as the first item of this sum varies according to each
59
practical problem and is inestimable for the algorithms PM and GA, we shall use
only the number of trials carried out before satisfying a stopping rule for comparing
the methods. As they all belong to the class of Divide the Best algorithms (see [54,
56]), we can use the condition
xt xt1 ,
(3.2.22)
where t is taken from (3.2.10), as a stopping rule. All experiments were carried out
with the accuracy
= 104 (b a).
(3.2.23)
The parameters of the methods were chosen in accordance with recommendations made by the authors. For GA and PM the exact values of the Lipschitz constant
were used for all test functions (see [35, 92]). Following the suggestions of [35] for
this algorithm we chose ai as the trial point in a current interval [ai , bi ]. The partition
operator which at every iteration divides [ai , bi ] into p equal subintervals was taken
as p = 4. The parameter of IA was chosen equal to 2. Results obtained by testing
the methods are presented in Table 3.1.
In all the experiments the a priori known locations and values of the global
minima were determined by placing an observation in the global minimizer vicinity
and the methods were stopped because the width of the interval [xi1 , xi ] (see
(3.2.22)) was less than from (3.2.23). We calculated the average number of trials
carried out by a method j, 1 j 4, until the stopping rule was satisfied as
j
=
Naver
1 20 j
ni ,
20 i=1
(3.2.24)
where ni is the number of trials carried out by the method j to solve the i-th problem.
3.2.2 Relationship Between the Information and the Geometric

Approaches
In this subsection, we show that both approaches we discuss in this chapter, the
information and the geometric one, are very closely related. In fact, the method IA
has an interesting geometric interpretation obtained by looking at it under the light
of the local tuning approach first introduced in [100,101] (we shall provide a detailed
description of this approach in the next chapter). This interpretation allows us to
obtain new formulae for estimating local Lipschitz constants for each subinterval
[xi1 , xi ], 1 i k. It has been shown (see [74, 77, 102, 103, 105, 117119, 123, 124,
139], etc.) that the adaptive local estimates can be incorporated into numerous global
optimization algorithms instead of the a priori given global constant L (or its global
estimates) and the usage of local estimates can accelerate the search significantly
60
(the words global constant mean here that the same value estimating L is used over
the whole search region). It should be noticed that the local estimates we are going to
develop in this subsection can be successfully used both in the geometric approach
and in the framework of the information algorithms.
Let us now consider the algorithm IA. Its characteristics Ri associated with each
subinterval [xi1 , xi ], 0 < i k, of the search region [a, b] are written (see (3.2.9)) as
follows
Ri = rM(xi xi1 ) +
(zi zi1 )2
2(zi + zi1 ).
rM(xi xi1)
(3.2.25)
It has been observed in [83] and [100, 101, 103], that (3.2.25) can be rewritten in the
form
Ri = (xi xi1 )(rM +
Mi2
) 2(zi + zi1 ),
rM
(3.2.26)
where
Mi =
| zi zi1 |
.
xi xi1
By comparing formula (3.2.26) with the formula (3.2.4) of the characteristic that is
used in Piyavskiis method we observe that the IA can be interpreted as a method
constructing an auxiliary piecewise-linear function with the local slopes si used at
each subinterval [xi1 , xi ], 0 < i k, where si have the following form
si = 0.5(rM +
Mi2
),
rM
1 i k,
(3.2.27)
Thus, the IA method based on a stochastic model has a geometric interpretation.

Moreover, in the original stochastic model the adaptive estimate (3.2.7), (3.2.8)
of the global Lipschitz constant is used, in contrast with this fact, the geometric
interpretation of the method is based on the local estimates from (3.2.27). These
estimates, of course, can be applied in order to accelerate the search in all the
methods using Lipschitz constants. In the next chapter, we shall focus our attention
on a number of ideas allowing one to speed up global optimization algorithms.
3.3 One-Dimensional Geometric Methods

in Holderian Metrics
Let us consider now the geometric approach in order to find the global minimum
of a function, but by using the Holderian metric. We consider the one-dimensional
global optimization problem
3.3 One-Dimensional Geometric Methods in Holderian Metrics
min{ f (x) : x [a, b]},
a, b R,
61
(3.3.1)
where the function f (x) satisfies the Holder condition

| f (x) f (y)| H|x y|1/N ,
x, y [a, b],
(3.3.2)
with a constant 0 < H < . It is supposed that the value N is known and the
objective function (3.3.1) can be represented by a black box procedure. This
problem arises in many applications, for instance, in the plant location problem
under a uniform delivered price policy (see [57]), in infinite horizon optimization
problems (see [68]), etc.
Two cases can be examined:
i) a constant M H is given a priori;
ii) nothing is known about the Holder constant H.
For the case i), an extension to Holder optimization of the Piyavskii method
(see [92]) has been proposed by Gourdin, Jaumard, and Ellaia in [52] (hereinafter
this method will be called GJE) in order to solve a global maximization problem
analogous to (3.3.1), (3.3.2). They consider an iterative construction of an upperbounding function corresponding to the lower envelope of parabolic functions.
At each iteration they must determine the maxima of the piecewise concave function
through line search techniques, i.e. by solving an equation of degree N. The
drawback of this approach is that for N large the computation of the local maxima
of the upper-bounding function can be tricky. We propose a technique that at each
iteration does not require the solution of nonlinear equations of degree N, so that
this method turns out to be very easy to apply even with N large, and it requires a
smaller time for its execution.
Moreover, the approach introduced here can be extended to the case ii) where
the constant H is not available a priori (in particular, we use an adaptive procedure
that estimates the global constant H during the search). Note that the algorithm GJE
cannot be extended in such a way. In fact, if an adaptive estimate H k of H would be
used in this algorithm in the course of the kth iteration, then, each time when H k is
updated, it would become necessary to solve k 1 equations of degree N making so
the whole algorithm too expensive from the computational point of view.
3.3.1 Algorithms
Let us describe first a general algorithm in a compact form and then, by specifying
Step 2, we will give two different algorithms.
62
General Geometric Algorithm (GA)

Step 0. The first two trials are performed at the points x1 = a and x2 = b. The point
xk+1 , k 2, of the current (k+1)-th iteration is chosen as follows.
Step 1. Renumber the trial points x1 , x2 , . . . , xk of the previous iterations by
subscripts so that
a = x1 < < xk = b.
Step 2. Compute in a certain way the value mi being an estimate of the Holder
constant of f (x) over the interval [xi1 , xi ], 2 i k, such that
0 < rmi <
(3.3.3)
where r > 1 is a reliability parameter of the method. The way to calculate the value
mi will be specified in each concrete algorithm.
Step 3. For each interval [xi1 , xi ], 2 i k, compute
1
zi zi1
yi = (xi + xi1)
,
1N
2
2rmi (xi xi1) N
(3.3.4)
where z j = f (x j ), 1 j k.
Step 4. Calculate, for the interval [xi1 , xi ], 2 i k, the characteristic
Ri = min{ f (xi1 ) rmi (yi xi1)1/N , f (xi ) rmi (xi yi )1/N }.
(3.3.5)
Step 5. Find the interval [xt1 , xt ] with the minimal characteristic

t = argmin{Ri : 2 i k}.
(3.3.6)
xt xt1 >
(3.3.7)
Step 6. If
where > 0 is a given search accuracy, then execute the next trial at the point
xk+1 = yt
(3.3.8)
and go to Step 1. Otherwise, calculate the estimate of the global minimum as

f = min{zi : 1 i k}
and STOP.
63
Let us now introduce some observations with regard to the GA scheme described
above and compare it with the method GJE from [52]. During the course of the
(k + 1)th iteration the GA constructs an auxiliary piecewise function
Ck (x) =
ci (x),
(3.3.9)
i=2
where
ci (x) = max{ f (xi1 ) rmi (x xi1)1/N ,
x [xi1 , xi ].
f (xi ) rmi (xi x)1/N },
(3.3.10)
Let us compare (3.3.9), (3.3.10) used in the GA with the method GJE from [52]
where the authors also work with the functions Ck (x) but instead of the adaptive
estimate mi used by the GA they apply the a priori given Holder constant H
supposing that it has been granted. In the GJE, the new trial point xk+1 is chosen
as follows
xk+1 = arg min{ci (x) : x [xi1 , xi ], 2 i k}.
Since the GJE uses for constructing ci (x) the a priori given Holder constant H,
then due to (3.3.2), the function Ck (x) from (3.3.9) used in [52] has the following
property
Ck (x) f (x),
x [a, b],
i.e., Ck (x) is a low-bounding function for f (x) over [a, b] and, respectively, functions
ci (x) are also minorants over the intervals [xi1 , xi ], 2 i k, (see Fig. 3.3) and
Ck (xk+1 ) Ck (x) f (x),
x [xi1 , xi ],
2 i k.
In order to find the point xk+1 , the GJE requires, for each interval [xi1 , xi ], 2 i k,
to solve the following system

Ai = f (xi1 ) rmi (pi xi1)1/N

Ai = f (xi ) rmi (xi pi )1/N
(3.3.11)
to determine the peak value Ai and the corresponding point pi (see Fig. 3.3) and
then to choose among them the point xk+1 .
In our method GA introduced above, for each interval [xi1 , xi ], 2 i k, we
approximate the point pi by the point yi from (3.3.4) found as intersection of the
lines rle f t (x) and rright (x) (see Fig. 3.4):
rle f t (x) = rmi (xi xi1 )
1N
N
x + rmi (xi xi1)
1N
N
xi1 + f (xi1 ),
64
Fig. 3.3 The function Ck (x)

is a minorant for f (x) over
the search interval [a, b]
Ai
xi1
Fig. 3.4 The point yi is very

easy to calculate even with N
large, the fact that is not true
for the point pi from Fig. 3.3
pi
xi
rleft(x)
rright(x)
Bi
c (x)
+
i
c (x)
Ri
yi
xi1
rright (x) = rmi (xi xi1)
1N
N
x rmi (xi xi1 )
1N
N
xi
xi + f (xi ).
It is important to notice that following (3.3.4), the value yi is very easy to calculate
even with N large.
Thus, the characteristic Ri calculated in Step 3 and related to the interval [xi1 , xi ],
+
represents the minimum value among the auxiliary functions c
i (x) and ci (x)
evaluated at the point yi (see Fig. 3.4)
+
Ri = min{c
i (yi ), ci (yi )},
1/N
1/N
, c+
.
c
i (x) = f (xi1 ) rmi (x xi1 )
i (x) = f (xi ) rmi (xi x)
(3.3.12)
Let us now introduce two different choices of the value mi in Step 2 of the GA in
order to get two different algorithms. The first choice is the traditional one.
65
Algorithm GA1
Step 2. Set
rmi = H,
2 i k, (r = 1).
(3.3.13)
Here the exact value of the a priori given Holder constant H is used as it was in the
GJE. Since it is quite difficult to know the Holder constant a priori, the typical way
to avoid this obstacle is to look for an approximation of H during the course of the
search (see [74, 105, 123, 132, 139, 140]). Thus, in the second algorithm we consider
the following adaptive global estimate of the Holder constant.
Algorithm GA2
Step 2. Set
mi = max{ , hk },
(3.3.14)
where > 0 is a small number (the second parameter of the method) that takes into
account our hypothesis that f (x) is not constant over the interval [a, b] and the value
hk is calculated as follows
hk = max{hi : 2 i k}
(3.3.15)
with
hi =
|zi zi1 |
,
|xi xi1|1/N
2 i k.
(3.3.16)
Note that if during the course of the k-th iteration hk = hk1 , then the auxiliary
function Ck+1 (x) will differ from Ck (x) only in the two subintervals obtained after
splitting the interval [xt1 , xt ] by the point xk+1 . Otherwise, if hk > hk1 , we have to
recalculate completely the function Ck+1 (x).
3.3.2 Convergence Properties and Numerical Experiments

In this subsection, we study convergence properties of the two algorithms introduced
above by considering an infinite trial sequence {xk } generated by an algorithm
belonging to the general scheme GA for solving the problem (3.3.1), (3.3.2). First
of all, we obtain a simple result regarding the properties of the characteristic Ri .
Lemma 3.2. If rmi > Hi , where Hi is the local Holder constant related to the
interval [xi1 , xi ], then
66
Ri < f (x),
x [xi1 , xi ].
(3.3.17)
Proof. If rmi > Hi , then, due to (3.3.2), (3.3.10), and (3.3.12), the function
+
ci (x) = max{c
i (x), ci (x)}
is a low-bounding function for f (x) over the interval [xi1 , xi ]. Moreover, since r > 1,
it follows
ci (x) < f (x),
x [xi1 , xi ].
+
The function c
i (x) is strictly decreasing on [xi1 , xi ] and ci (x) is strictly increasing
on this interval. Thus, it follows
+
min{c
i (x), ci (x)} min{ci (x) : x [xi1 , xi ]},
x [xi1 , xi ].
Particularly, this is true for x = yi , where yi is from (3.3.4). To conclude the proof it
is sufficiently to recall that, due to (3.3.4), (3.3.5), and (3.3.10),
+
Ri = min{c
i (yi ), ci (yi )}
and yi [xi1 , xi ].

{xk }
Let us now return to the trial sequence

generated by the GA for solving the
problem (3.3.1), (3.3.2). We need the following definition.
Definition 3.1. The convergence to a point x (a, b) is said bilateral if there exist
two subsequences of {xk } converging to x one from the left, the other from the
right.
Theorem 3.3. Let x be any limit point of {xk } such that x = a, x = b. Then the
convergence to x is bilateral.
Proof. Consider the interval [xt1 , xt ] determined by (3.3.6) at the (k + 1)-th
iteration. By (3.3.4) and (3.3.8) we have that the new trial point xk+1 divides
the interval [xt1 , xt ] into the subintervals [xt1 , xk+1 ] and [xk+1 , xt ]. From (3.3.4),
(3.3.13), and (3.3.14) we can write
max(x
k+1
xt1 , xt x
k+1

) 0.5 xt xt1 +
|zt zt1 | (xt xt1 )

rmt
(xt xt1 )1/N

1
0.5 xt xt1 + (xt xt1 )
r
1
0.5(1 + )(xt xt1 ).
r
(3.3.18)
67
Consider now an interval [xs1 , xs ], s = s(k), such that x [xs1 , xs ]; then,

because x is a limit point of {xk } and using (3.3.4), (3.3.6), (3.3.8), and (3.3.18),
we obtain
lim (xs(k) xs(k)1) = 0.
(3.3.19)
/ {xk }, the subsequences {xs(k)1} and {xs(k) } are the ones we are looking
If x
for, and the theorem has been proved. Suppose now that x {xk } and that the
convergence to x is not bilateral, i.e., no sequence converging to x from the left
exists. In this case there exist integers q, n > 0, such that x = xq and for any iteration
number k > max(q, n) no trials will fall into the interval [xn , xq ] = [x j(k)1 , x j(k) ]. For
the value R j of this interval we have:
1
R j = min{z j1 rm j (y j x j1 ) N , f (x ) rm j (x j y j ) N }
(3.3.20)
that is
1
R j f (x ) rm j (x j y j ) N < f (x ).

(3.3.21)
>0
On the other hand, it follows from (3.3.3) and (3.3.19) that

lim Rs(k) = f (x )
(3.3.22)
thus, for a sufficiently large iteration number k the inequality

R j(k) < Rs(k)
(3.3.23)
is satisfied. This means that, by (3.3.6) and (3.3.8), a trial will fall into the interval
[xn , xq ] which contradicts our assumption that there is no subsequence converging
to x from the left. In the same way we can consider the case when there is no
subsequence converging to x from the right. Hence convergence to x is bilateral.

Corollary 3.2. For all trial points xk , it follows f (xk ) f (x ), k 1.
Proof. Suppose that there exists a point xq such that
zq = f (xq ) < f (x ).
(3.3.24)
Consider the value R j of the interval [x j1 , x j ] where x j = xq . We have:

1
R j = min{z j1 rm j (y j x j1 ) N , z j rm j (x j y j ) N },
R j < min{z j1 , z j } < f (x ).
68
Again, from (3.3.22) and (3.3.24) the inequality (3.3.23) holds. By (3.3.6) and
(3.3.8) this fact contradicts the assumption that x is a limit point of {xk }. Thus
f (xq ) f (x ) and the Corollary has been proved.

Corollary 3.3. If another limit point x = x exists, then f (x ) = f (x ).

Proof. Follows directly from Corollary 3.2.
Corollary 3.4. If the function f (x) has a finite number of local minima in [a, b],
then the point x is locally optimal.
Proof. If the point x is not a local minimizer then, taking into account the bilateral
convergence of {xk } to x and the fact that f (x) has a finite number of local minima
in [a, b], a point w such that f (w) < f (x ) will be found. But this is impossible by
Corollary 3.2.

Let us introduce now sufficient conditions for global convergence.
Theorem 3.4. Let x be a global minimizer of f (x). If there exists an iteration
number k such that for all k > k the inequality
rm j(k) > H j(k)
(3.3.25)
holds, where H j(k) is the Holder constant for the interval [x j(k)1 , x j(k) ], i.e.,
| f (x) f (y)| H j(k) |x y|1/N ,
x, y [x j(k)1 , x j(k) ]
(3.3.26)
and the interval [x j(k)1 , x j(k) ] is such that x [x j(k)1 , x j(k) ], then x is a limit point
of {xk }.
Proof. Suppose that x is not a limit point of the sequence {xk } and a point x = x is
a limit point of {xk }. Then there exists an iteration number n such that for all k n
xk+1
/ [x j1 , x j ],
j = j(k).
Lemma 3.2 and (3.3.25) imply

R j < f (x ).
(3.3.27)
However, since x is a global minimizer, the inequality

f (x ) f (x )
(3.3.28)
holds. Thus, considering (3.3.27), (3.3.22), and (3.3.28) together with the decision
rules of the algorithm, we conclude that a trial will fall into the interval [x j1 , x j ].
This fact contradicts our assumption and proves that x is a limit point of the
sequence {xk }.

Corollary 3.5. If the conditions of Theorem 3.4 are satisfied, then all limit points
of {xk } are global minimizers of f (x).
69
Table 3.2 The set of test functions from [52]

Name
F1
F2
Interval
[4, 4]
[5, 5]
F5
F6
Formula
x6 15x4 + 27x2 + 250
2
2
(x
5x +2 6)/(x + 1)
(x 2)
if x 3
2ln(x
2)
+
1
otherwise

2x x2
if x 2
x2 + 8x 12 otherwise
(3x 1.4) sin 18x
2
2(x 3)2 + ex /2
F7
5k=1 k sin[(k + 1)x + k]
[10, 10]
F8
5k=1 k cos[(k + 1)x + k]
[10, 10]
F3
F4
Solution
3.0
2.414213
[0, 6]
2.0
[0, 6]
4.0
[0, 1]
[3, 3]
Proof. The result follows immediately from Corollary 3.3.
0.966085
1.590717
6.774576
0.49139
5.791785
7.083506
0.8003
5.48286
Theorem 3.5. For every function f (x) satisfying (3.3.2) with H < there exists r
such that for all r > r the Algorithm GA2 determines all global minimizers of the
function f (x) over the search interval [a, b].
Proof. Since H < and any value of r can be chosen in the Algorithm GA2, it
follows that there exists r such that condition (3.3.25) will be satisfied for all global
minimizers for r > r . This fact, due to Theorem 3.4, proves the Theorem.

We report now some numerical results showing a comparison of the algorithms
GA1 and GA2 with the method GJE from [52]. Three series of experiments have
been executed.
In the first and the second series of experiments, a set of eight functions described
in [52] have been used (see Table 3.2). Since the GJE algorithm requires, at each
iteration, the solution to the system (3.3.11) in order to find the peak point (pi , Ai )
(see Fig. 3.2) we distinguish two cases. In the first series we use the integers N = 2, 3,
and 4 because it is possible to use explicit expressions for the coordinates of the
intersection point (pi , Ai ) (see [52]). The second series of experiments considers the
case of fractional N.
Table 3.3 contains the number of trials executed by the algorithms with accuracy
= 104 (b a) (this accuracy is used in all series of experiments). The exact
constants H (see [52]) have been used in the GJE and the GA1. Parameters of the
GA2 were = 108 and r = 1.1. In this case, all the global minimizers have been
found by all the methods.
In Table 3.4 we present numerical results for the problems from Table 3.2 with
fractional values of N. In this case, in the method GJE, the system (3.3.11) should
70
Table 3.3 Numerical results for N = 2, 3, 4

Method
GJE
N
2
3
4
F1
5,569
11,325
12,673
F2
4,517
5,890
7,027
F3
1,683
2,931
3,867
F4
4,077
4,640
6,286
F5
1,160
3,169
4,777
F6
2,879
5,191
5,370
F7
4,273
8,682
10,304
F8
3,336
8,489
8,724
Average
3,436
6,289
7,378
GA1
2
3
4
5,477
11,075
15,841
5,605
7,908
8,945
1,515
2,521
3,162
4,371
7,605
9,453
1,091
2,823
4,188
2,532
4,200
5,093
4,478
11,942
15,996
3,565
9,516
15,538
3,579
7,198
9,777
GA2
2
3
4
1,477
2,368
2,615
2,270
3,801
3,486
1,249
1,574
1,697
1,568
2,023
2,451
279
367
424
1,761
3,186
4,165
580
710
756
380
312
550
1,195
1,792
2,018
Table 3.4 Numerical experiments with fractional values of N

Method N
F1
GJE
4/3
1,913
53/2 10,483
100/3 10,195
F2
F3
F4
F5
2,341
705 1,213
397
6,883
6,763 5,895
7,201
6,609 4,127
F6
F7
1,160
730
9,833
10,078
GA1
4/3
1,923 2,329
680 1,768
381 1,108
722
53/2 15,899 9,243 5,921 10,057 8,056 5,050 16,169
100/3 15,757 8,671 5,399 9,458 6,783 4,699 15,982
GA2
4/3
53/2
100/3
1,053 1,484
649
2,972 4,215 2,207
2,108 4,090 2,023
1,025
3,073
2,828
278 1,664
725 4,491
667 4,196
473
103
154
F8
557
9,617
9,094
Average
1,127
549
16,083
15,617
1,182
10,809
10,295
378
94
153
875
2,235
2,027
be solved by using a line search technique (see [52]) at each iteration. The following
methods have been used for this goal:
(i) the routine FSOLVE from the Optimization Toolbox of MATLAB 5.3;
(ii) the routine NEWT of Numerical Recipes (see [96]) that combines the Newtons
method for solving nonlinear equations with a globally convergent strategy that
will guarantee progress towards the solution at each iteration even if the initial
guess is not sufficiently close to the root.
These methods have been chosen because they can be easily found by a final
user. Unfortunately, our experience with both algorithms has shown that solving the
system (3.3.11) can be a problem itself. Particularly, we note that, when N increases,
the two curves li and li+ from (3.3.12) tend to flatten (see Figs. 3.5 and 3.6) and if
the intersection point (pi , Ai ) is close to the boundaries of the subinterval [xi1 , xi ],
then the system (3.3.11) can be difficult to solve. In some cases the methods looking
for the roots of the system do not converge to the solution.
For example, Fig. 3.6 presents the case when the point (denoted by *) which
approximates the root is obtained out of the search interval [xi1 , xi ]. Thus, the
system (3.3.11) is not solved and, as a consequence, the algorithm GJE does
not find the global minima of the objective function. These cases are shown in
Table 3.4 by .

Fig. 3.5 The two curves li
and li+
71
5
0
5
10
15
20
25
Fig. 3.6 No convergence
0.95
5
0
5
10
15
20
25
30
35
40
0.75
0.8
0.85
0.9
Numerical experiments described in Table 3.4 have been executed with the
following parameters. The exact constants H h have been used in the methods
GJE and GA1. Parameters = 108 and r = 1.5 have been used in the GA2.
All global minimizers have been found by the algorithms GJE and GA1. Note
that the parameter r influences the reliability of the method GA2. For example, the
algorithm GA2 has found only one global minimizer in the experiments marked by
*. The value r = 3.5 allows one to find all global minimizers.
The third series of experiments (see Table 3.5) has been executed with the
following function from [74] shown in Fig. 3.7
FN (x) =
k| sin((3k + 1)x + k)||x k|1/N ,
x [0, 10].
(3.3.29)
k=1
Over the interval [0, 10] it satisfies the Holder condition with a constant h, i.e.,
|FN (x) FN (y)| h|x y|1/N ,
x, y [0, 10].
72

Table 3.5 Numerical experiments with the function FN (x) from (3.3.29)
N
5
10
20
40
60
80
100
Optimal point
2.82909266
2.83390034
2.83390034
2.83390034
2.83390034
2.83390034
2.83390034
Fig. 3.7 The function FN (x)

from (3.3.29) with N = 5
Optimal value
1.15879294
1.15176044
1.14960372
1.14946908
1.14956783
1.14964447
1.14969913
HN
77
58
51
48
47
47
47
GJE
1,886
208
GA0
2,530
1,761
760
220
53
41
34
GA1
1,995
1,295
518
69
87
94
71
GA2
258
82
171
949
581
241
261
20
18
16
14
12
10
8
6
4
2
0
10
It can be shown (see [42]) that the constant

5
HN = 15 + k211/N (3k + 1)1/N (10 k)1/N h.

k=1
In this series of experiments a new algorithm, GA0, has been used to show
efficiency of the choice of the point yi from (3.3.4). The method GA0 works as
the algorithm GA1 but in Step 3 and Step 4 it uses the following characteristic
Ri = min{ f (xi1 ) rmi (yi xi1 )1/N , f (xi ) rmi (xi yi )1/N },
(3.3.30)
where the point yi = 0.5(xi + xi1 ).

The parameters of the methods have been chosen as follows: for each given N,
the corresponding value HN has been used in the GJE, GA0, and GA1 algorithms as
an overestimate for h. The parameter = 108 has been used in the method GA2.
Due to Theorem 3.5, every Holderian function optimized by GA1 or GA2 has a
crucial value r of the parameter r. Thus, different values of r have been chosen for
3.4 A Multidimensional Information Method
73
different values of N. In the method GA2 we have chosen r = 1.3 for the case N = 5,
r = 1.7 for N = 10, r = 2.8 for N = 20 and r = 9.3 if N = 40, 60, 80; the result for
N = 100 has been obtained with r = 15.
The algorithm GA2 has found good estimates of the global solution in all the
cases. The methods GA0 and GA1 have made the same for N = 5, 10, 20. It can be
seen that GA1 outperforms GA0. For N = 40, 60, 80, 100, these methods stop after
a few iterations in neighborhoods of local minimizers because the used accuracy
was not small enough in order to find the global solution. Augmenting accuracy
allows to locate the global minimizer. These cases are shown in Table 3.5 by *.
The symbol has the same meaning as in Table 3.4.

In the remaining part of the chapter we consider the multidimensional case, i.e., the
problem (3.1.1), (3.1.2) with N 2. In particular, we first study a generalization
of the information algorithm IA, and then, in the successive section, a multidimensional geometric algorithm. In order to facilitate the reading, let us restate the
problem we deal with. We are to solve the following global optimization problem
min f (x) = F(y(x)),
x [0, 1],
(3.4.1)
where y(x) is the Peano curve and

1
| f (x ) f (x )| H|x x | N ,
x , x [0, 1],
(3.4.2)
i.e., f (x) is a Holderian function, and H = 2L N + 3, where L is the Lipschitz

constant of the original multidimensional function F(y).
Considerations made above immediately suggest how, being in the information
approach framework, we should generalize the method IA from Sect. 3.2 for solving
the problem (3.4.1). This generalization leads us to the following algorithm.
Multidimensional global Information Algorithm (MIA)
Step 0. The first two trials are to be carried out at the points
y0 = y(0), y1 = y(1).
(3.4.3)
The choice of the point yk+1 , k > 1, of any subsequent (k + 1) st trial is done as
follows.
Step 1. Renumber the inverse images x0 , . . . , xk of all the points
y0 = y(x0 ), . . . , yk = y(xk )
(3.4.4)
74
of the already performed trials by subscripts in increasing order of the coordinate

(note that x0 = 0 and x1 = 1), i.e.,
0 = x0 < x1 < . . . < xk = 1,
(3.4.5)
and juxtapose to them the values z j = F(y(x j )), 1 j k, which are the outcomes
z0 = F(y(x0 )), . . . , zk = F(y(xk ))
(3.4.6)
of the trials renumbered by subscripts.

Step 2. Compute the maximal absolute value of the observed first divided
differences
M = max{|zi zi1 |/i :
1 i k}
(3.4.7)
where
i = (xi xi1 )1/N ;
(3.4.8)
if (3.4.7) yields a zero value, then accept M = 1.

Step 3. For each interval [xi1 , xi ], 1 i k, calculate the value (the characteristic
of the interval)
Ri = rM i + (zi zi1 )2 /rM i 2(zi + zi1 ).
(3.4.9)
The real number r > 1 is the reliability parameter of the algorithm.

Step 4. Select the interval [xt1 , xt ] corresponding to the maximal characteristic
Rt = max{Ri : 1 i k} .
(3.4.10)
|xt xt1 |1/N
(3.4.11)
Step 5. If
where > 0 is a given search accuracy, then calculate an estimate of the minimum
as
Fk = min{zi : 1 i k}
(3.4.12)
and STOP. Otherwise, execute the next trial at the point yk+1 = y(xk+1 ) from [a, b]
where

|zt zt1 | N 1
k+1
sgn(zt zt1 )
= 0.5(xt + xt1 )
(3.4.13)
x
M
2r
and go to Step 1.; sgn(zt zt1 ) denotes the sign of (zt zt1 ).
75
Step 0Step 5 of the scheme MIA describe the sequence of decision functions
xk+1 = Grk (x0 , . . . , xk ; z0 , . . . , zk )
generating the sequence of inverse images {xk } [0, 1] and also the sequence
{yk } [a, b] RN of trial points; see (3.4.4). These decision functions are obviously
dependent on the particular mapping y(x) used in (3.4.6). By analogy with (3.2.11),
the search sequence {yk } may be truncated by meeting the stopping condition
(3.4.11). But it should be stressed that in solving applied multidimensional problems
the actual termination of the search process is very often caused by exhaustion of the
available computing resources or by assuming the available running estimate from
(3.4.12) as already satisfactory and, thus, economizing on the computing effort.
Now, we proceed to the study of convergence properties of MIA, but before
embarking on the detailed treatment of this subject we single out one more feature
of the space-filling curve y(x) introduced in Theorem 2.1.
Lemma 3.3. Let {yk } = {y(xk )} be the sequence of points in [a, b] induced by
the sequence {xk } [0, 1]; here y(x) is the space-filling curve from Theorem 2.1.
Then:
is a limit point
1. If x is a limit point of the sequence {xk }, then the image y = y(x)
of the sequence {yk }.
2. If y is a limit point of the sequence {yk }, then there exists some inverse image x
of this point, i.e., y = y(x),
which is a limit point of the sequence {xk }.
Proof. 1. If x is a limit point of the sequence{xk }, then there exists some
subsequence {xkq }, k1 < k2 < . . ., converging to x,
i.e.,
=0.
lim |xkq x|
Hence, in accordance with (2.1.12),

lim ||y(xkq ) y(x)||
= 0,
whence it follows that y = y(x)

is a limit point of the sequence {yk }.
2. If y is the limit point of the sequence {yk }, then there exists some subsequence
{ykq } = {y(xkq )}, k1 < k2 < . . ., converging to y,
i.e.,
lim ykq = y .
(3.4.14)
If the corresponding sequence { q } = {xkq } has two different limit points x and
x , i.e., there are two subsequences { qi } and { q j } satisfying the conditions
lim qi = x ,
lim q j = x ,
76
and x = x , then y = y(x ) = y(x ) because, due to (3.4.14), the subsequence
{ykq } has just one limit point. Therefore, any limit point x of the sequence { q }
is the inverse image of y.

Theorem 3.6. (Sufficient convergence conditions). Let the point y be a limit point of
the sequence {yk } generated by the rules of MIA while minimizing the Lipschitzian
with the constant L function F(y), y [a, b] RN . Then:
1) If side by side with y there exists another limit point y of the sequence {yk }, then
F(y)
= F(y ).
2) For any k zk = F(yk ) F(y).
3) If at some step of the search process the value M from (3.4.7) satisfies the
condition
rM > 231/N L N + 3,
(3.4.15)
then y is a global minimizer of the function F(y) over [a, b] and any global
minimizer y of F(y(x)), x [0, 1], is also a limit point of the sequence {yk }.
Proof. The assumption that F(y)
= F(y ) where y and y are some limit points of
the sequence {xk } obviously contradicts the second statement of the theorem, and
we proceed to proving this second statement.
Any point xk+1 from (3.4.13) partitions the interval [xt1 , xt ] into two subintervals
[xt1 , xk+1 ], [xk+1 , xt ]. Due to (3.4.7) and (3.4.13), these subintervals meet the
inequality
max{xt xk+1 , xk+1 xt1 } (xt xt1 )
(3.4.16)
where = (1 + r)/2r < 1 (note that r > 1).

In accordance with (3.4.4), the sequence of trial points {yk } is the image of the
sequence {xk } generated by the rule (3.4.13); hence, in accordance with Lemma 3.3,
the point y should have some inverse image x which is the limit point of the sequence
{xk }. Denote j = j(k) the number of the interval [x j1 , x j ] containing the point x at
the step k(k > 1). If x
/ {xk }, i.e., if x is the interior point of (x j1 , x j ), j = j(k), k >
1, then due to (3.4.16), these intervals constitute the nested sequence contracting to
the point x with k . This statement holds true also if x = 0 or x = 1.
Suppose that x {xk } and x = 0, x = 1. Then, since some index q > 1, x = xq ,
which means that at k q, x = xl , l = l(k), is the common end-point of two intervals
[xl1 , xl ] and [xl , xl+1 ]. As long as x is the limit point of the sequence {xk }, then at
least one of these intervals should contract, with k being hit with subsequent
trials. Without loss of generality, we assume that the selected index j = j(k)
indicates exactly this interval. Thus, in any case, there is a nested sequence of
intervals [x j1 , x j ], j = j(k), contracting to the point x with k and
lim j = lim (x j x j1) = 0
where j is from (3.4.8).
(3.4.17)
77
In consideration of the function F(y), y [a, b], being Lipschitzian and with
account of (3.4.2), (3.4.6)(3.4.8), we derive that the value M (which is obviously
the function of the index k) is positive and bounded from above. Then, from (3.4.6)
(3.4.9) and (3.4.17), follows that
= 4F(y)
.
lim R j(k) = 4F(y(x))
(3.4.18)
Assume that the second statement is not true, i.e., that at some step q 0 the
outcome
zq = F(yq ) < F(y)
(3.4.19)
is obtained. Denote l = l(k) the subscript corresponding to the inverse image xq of

the point yq at the step k, i.e., zq = zl = F(y(xl )), yq = y(xq ), k q, and consider the
characteristic of the interval [xl1 , xl ) computed from (3.4.9) (if q = 0, then R1 has
to be considered). By introducing the notation
= rM l /|zl zl1 |,
where l is from (3.4.8) and > 1, due to (3.4.7), we derive the relations
Rl = |zl zl1 |( + 1 ) 2(zl + zl1 ) >
> 2{max(zl , zl1 ) min(zl , zl1 )} 2(zl + zl1 ) =
= 4 min(zl , zl1 ) .
The last inequality, which also holds true for the case zl = zl1 (this can easily be
checked directly from (3.4.9)), and the assumption (3.4.19) results in the estimate
+ 4[F(y)
F(yq )] .
Rl(k) > 4zq = 4F(y)
(3.4.20)
From (3.4.18) and (3.4.20) follows the validity of the relation

Rl(k) > R j(k)
for sufficiently large k. Hence, in accordance with the rule (3.4.10), the point y
cannot be a limit point of the sequence {yk } if the assumption (3.4.19) is true.
Let x be the inverse image of some global minimizer y of F(y(x)), x [0, 1],
i.e., y = y(x ), and denote t = t(k) the index of the interval [xt1 , xt ] containing
this inverse image at the step k > 1. From (3.4.2), (3.4.6) and the assumption of the
function F(y) being Lipschitzian, obtain the inequalities
zt1 F + 2L N + 3(x xt1 )1/N ,
zt F + 2L N + 3(xt x )1/N ,
(3.4.21)
(3.4.22)
78
where F = F(y ). By summarizing (3.4.21) and (3.4.22), obtain (see [138]) the
estimate
zt +zt1 2F +2L N+3[(x xt1 )1/N + (xt x )1/N ]
2F + 2Lt N + 3 max [ 1/N + (1 )1/N ] =

0 1
(3.4.23)
= 2F + 221/N Lt N + 3.
Now, from (3.4.9), (3.4.15), and (3.4.23) follows the validity of the relation
Rt(k) > 4F
for sufficiently large values of the index k. This last estimate together with (3.4.18)
and the rule (3.4.10) leads to the conclusion that the interval [xt1 , xt ], t = t(k), is to
be hit by some subsequent trials generating, thereafter, a nested sequence of intervals
each containing x and, thus, due to (3.4.16), contracting to this inverse image of the
global minimizer y . Hence, y is to be a limit point of the sequence {yk }. Then due
to the first statement of the theorem, any limit point y of the sequence {yk } has to be
a global minimizer of F(y) over [a, b] RN .
Therefore, under the condition (3.4.15), the set of all limit points of the sequence
{yk } generated by MIA is identical to the set of all global minimizers of the
Lipschitzian function F(y) over [a, b].

Let us make some considerations with regard to the effective use of approximations to the Peano curve.
Corollary 3.6. From Theorem 2.4 follows that it is possible to solve the onedimensional problem
min{F(l(x)) :
x [0, 1]}
(3.4.24)
for a Lipschitzian with the constant L function F(y), y [a, b], and where l(x) =
lM (x) is the Peano-like piecewise-linear evolvent (2.2.68), (2.2.69), by employing
the decision rules of the algorithm for global search in many dimensions MIA. The
obvious amendment to be done is to substitute y(x) in the relations (3.4.3), (3.4.4),
and (3.4.6) with l(x).
But the problem (3.4.24) is not exactly equivalent to the problem of minimizing
F(y) over [a, b]. For l(x) = lM (x), where M is from (2.2.69), the relation between
these two problems is set defined by the inequality
min{F(l(x)) : x [0, 1]}min{F(y)) : y D} L N2(M+1)
(3.4.25)
because the evolvent l(x), x [0, 1], covers the grid

H(M, N) = {yi ; 0 i 2MN 1}
(3.4.26)
79
having, as already mentioned in Chap. 2, a mesh width equal to 2M , but not the
entire cube [a, b] RN .
Therefore, the accuracy of solving the one-dimensional problem (3.4.24) should
not essentially exceed the precision assured by (3.4.25), i.e., these two sources of
inexactness have to be simultaneously taken into account.
Remark 3.2. The curve l(x) = lM (x) is built of linear segments, and for any pair of
points l(x ), l(x ) from the same segment, it is true that
||l(x ) l(x )|| 2M(N1) |x x | ;
see (2.2.72). Therefore, the function F(l(x)), x [0, 1], from (3.4.24) is Lipschitzian
with the constant LM = L2M(N1) increasing with the rise of M (i.e., along with
an increase of the required accuracy in solving the problem of minimizing the
Lipschitzian with constant L function F(y) over [a, b]). Due to this reason, the
one-dimensional algorithms based on the classical Lipschitz conditions are not
effective in solving problems similar to (3.4.24).
Search algorithm employing NUPE. Suppose that the function F(y), y [a, b],
is Lipschitzian with the constant L. Then the function F(nM (h j )), in which nM is
the Non-Univalent Peano-like Evolvent, NUPE (see Chap 2), defined on the grid
(2.2.80) satisfies the condition
min{F(nM (h j )) : 1 j q} min{F(y) : y D} L N2(M+1)

which is similar to (3.4.25). Therefore, being given the required accuracy of solving
the initial N-dimensional problem it is possible to select the appropriate evolvent
nM (x) approximating y(x). Consequently, it is possible to minimize F(n(x)) in the
same way as already considered for the function F(l(x)) employing the piecewiselinear evolvent l(x).
But there are some differences to be taken into account. The function F(n(x))
is defined only at the points from (2.2.80) and not over the entire interval [0, 1].
Besides, computing of a single value z = F(p), p P(M, N), has to be interpreted
as executing trials at the points h j1 , . . . , h j , which are the inverse images of p,
yielding the same outcome
z = F(nM (h j1 ) = . . . = F(nM (h j ) .
(3.4.27)
Actually, as already mentioned, the whole idea of introducing nM (x) was to use
the property (3.4.27) as a kind of compensation for losing some information
about nearness of the performed trials in the hypercube [a, b] when reducing
the dimensionality with evolvents. Therefore, the algorithm of global search in
many dimensions MIA has to be modified for employing nM (x) due to the above
circumstances. Now, we outline the skeleton of the new algorithm.
The first two trials have to be selected as prescribed by (3.4.3), i.e.,
p0 = nM (0),
p1 = nM (1),
80
because h0 = 0 and hq = 1 are from (2.2.80) and, as it follows from the construction
of nM (x), the above points p0 , p1 are characterized by unit multiplicity. Suppose
that l > 1 trials have already been performed at the points p0 , . . . , pl from [a, b]
and x0 , . . . , xk (k l) are the inverse images of these points with respect to nM (x).
We assume that these inverse images are renumbered with subscripts as prescribed
by (3.4.5) and juxtaposed to the values zi = F(nM (xi )), 1 i k; note that this
juxtaposition is based on the outcomes of just l trials where, in general, due to
(3.4.27), l < k. Further selection of trial points follows the scheme:
Step 1. Employing (3.4.7)(3.4.13) execute the rules 25 of MIA and detect the
interval [h j , h j+1 ) from (2.2.80) containing the point xk+1 from (3.4.13), i.e., h j
xk+1 < h j+1 .
Step 2. Determine the node pl+1 = nM (h j ) P(M, N) and compute the outcome
zl+1 = F(pl+1 ).
Step 3. Compute all the inverse images h j1 , . . . , h j of the point pl+1 with respect
to nM (x).
Step 4. Introduce the new points
xk+1 = h j1 , . . . , xk+ = h j
characterized by the same outcome zl+1 , increment k by and pass over to the first
clause.
Termination could be forced by the condition t , where t ,t are, respectively,
from (3.4.2), (3.4.10), and > 0 is the preset accuracy of search which is supposed
to be greater than the mesh width of the grid (2.2.80). This rule may also be set
defined by another option: the search is to be terminated if the node h j generated
in accordance with the first clause of the above scheme coincides with one of the
already selected points from the series (3.4.5). Note that if there is a need to continue
the search after the coincidence, then it is sufficient just to augment the parameter
M (i.e., to use the evolvent corresponding to a finer grid; recall that the points of all
the already accumulated trials are the nodes of this finer grid).
We present now some numerical examples in order to test the behavior of the
algorithm MIA. First, we consider the following test function
F(y1 , y2 ) = y21 + y22 cos18y1 cos18y2 ,
21 y1 , y2 1,
from [97]. Minimization of this function carried out by applying the above algorithm
with r = 2, = 0.01 with non-univalent evolvent n(x) covering the grid P(9, 2)
required 63 trials and at the moment of termination there were 199 points in the
series (3.4.5). Solving of the same problem by the MIA employing the piecewiselinear approximation l(x) to the Peano curve with the grid H(10, 2) required 176
trials with r = 2, = 0.01.
3.5 A Multidimensional Geometric Method
81
Table 3.6 Results of experiments with six problems produced by the GKLS-generator
from [40]. In the table, N is the dimension of the problem; rg is the radius of the attraction
region of the global minimizer; n is the number of the function taken from the considered
class of test functions; is the radius of the ball used at the stopping rule; r is the
reliability parameter of the MIA.
N
2
3
4
5
6
7
rg
0.20
0.20
0.20
0.20
0.20
0.20
n
70
16
21
85
14
25
0.01N
0.01N
0.01N
0.02N
0.03N
0.05 N
r
2.1
3.4
3.3
4.2
4.2
4.2
Direct
733
6641
24569
127037
232593
400000
LBDirect
1015
10465
70825
224125
400000
400000
MIA
169
809
2947
33489
84053
80223
Second, we use a set of six functions produced by the GKLS-generator of classes

of test function from [40] (see a detailed description of the generator in Sect. 3.5).
Each class consists of 100 N-dimensional test functions. In our experiment we
consider 6 functions from 6 different classes with the dimension that varies from
N = 2 to N = 7. Each function has 10 local minima and the value of the global
minimum is equal to 1 for all the functions. The parameter d of the classes (see
Sect.3.5) is fixed equal to 0.66 and in Table 3.6 we report the value of the parameter
rg for each function. The column n denotes the number of the function in the fixed
class.
The algorithm MIA is compared with the original Direct algorithm proposed in
[67] and with its recent locally biased modification LBDirect introduced in [31,
34]. The FORTRAN implementations of these two methods downloadable from
[32] and a FORTRAN implementation of the MIA algorithm have been used in all
the experiments. In the MIA algorithm a piecewise-linear M-approximation to the
Peano curve has been considered. In particular, the level M of the curve has been
chosen taking in mind the constraint NM (see Sect. 2.3). For the computer used
in our experiments we had = 52 and the following values of M have been used:
M = 10 for N = 2, 3, 4, 5; M = 8 for N = 6, and M = 7 for N = 7. The real number
r > 1 is the reliability parameter of the algorithm, from (3.4.9).
The following stopping has been used: = 0 is fixed in rule (3.4.11) and the
search terminates when a trial point falls in a ball B having the center at the global
minimizer of the considered function and a radius (see Table 3.6). In Table 3.6,
we report the number of trials executed by the algorithms. In all experiments the
maximal number of function evaluations has been taken equal to 400, 000 and the
symbol * in Table 3.6 denotes that the global minimum has not been found by a
method after 400, 000 trials.

In this section, we generalize the geometric algorithm introduced in Sect. 3.3 for
minimizing Holderian one-dimensional functions. Thus, we can solve the problem
(3.1.1), (3.1.2) by using algorithms proposed for minimizing functions (3.1.3),
82
(3.1.4) in one dimension. Naturally, in order to realize the passage from the
multidimensional problem to the one-dimensional one, computable approximations
to the Peano curve should be employed in the numerical algorithms. Hereinafter
we use the designation pM (x) for an M-level piecewise-linear approximation of the
Peano curve.
By Theorem 3.1 one-dimensional methods from Sect. 3.3 constructing at each
iteration auxiliary functions providing a lower bound of the univariate objective
function can be used as a basis for developing new methods for solving the
multidimensional problem. The general scheme for solving problem (3.1.1), (3.1.2)
that we name MGA (Multidimensional Geometric Algorithm) is obtained by using
the scheme GA from Sect. 3.3 as follows.
Multidimensional Geometric Algorithm

Step 0. Set x1 = 0, x2 = 1 and compute the values of the function z j = f (x j ) =
F(pM (x j )), j = 1, 2, where pM (x) is the M-approximation of the Peano curve. After
executing k trials the choice of new trial points is done as follows.
Step 1. Execute Step 1 of the GA.
Step 2. Set
mi = max{ , hk },
2 i k,
(3.5.1)
where > 0 is a small number that takes into account our hypothesis that f (x) is
not constant over the interval [0, 1] and the value hk is calculated as follows
(3.5.2)
with
hi =
|zi zi1 |
,
|xi xi1|1/N
2 i k.
(3.5.3)
Step 3. For each interval [xi1 , xi ], 2 i k, compute the point yi and the
characteristic Ri , according to (3.3.4) and (3.3.5), replacing the values zi = f (xi )
by F(pM (xi )).
Step 4. Select the interval [xt1 , xt ] according to (3.3.6) of the GA.
Step 5. If
|xt xt1 |1/N ,
(3.5.4)
83
where > 0 is a given search accuracy, then calculate an estimate of the global
minimum as
and STOP. Otherwise, execute the next trial at the point
xk+1 = yt
(3.5.5)
set k = k + 1 and go to Step 1.

As it was in the one-dimensional case, in the course of its work the MGA
constructs an auxiliary piecewise function in one dimension that after executing
k trials is
Ck (x) = ci (x),
for x [xi1 , xi ],
2 i k,
(3.5.6)
where ci (x), 2 i k, are from (3.3.10).

If the constant rmi is equal or larger than the Holder constant H, then it follows
from (3.1.4) that the function Ck (x) is a low-bounding function for f (x) for every
interval [xi1 , xi ], 2 i k, i.e.,
Ck (x) f (x),
x [0, 1].
(3.5.7)
The characteristic Ri from (3.3.5) for each interval [xi1 , xi ], 2 i k, at a generic

iteration k + 1, represents the minimum among the values of the auxiliary functions
1/N
1/N
c
, c+
i (x) = f (xi1 ) rmi (x xi1 )
i (x) = f (xi ) rmi (xi x)
evaluated at the point yi from (3.3.4). By making use of the Peano curves we have a
correspondence between a cube in dimension N and an interval in one dimension.
In the MGA we suppose that the Holder constant is unknown and in Step 2
we compute the value mi being an estimate of the Holder constant for f (x) over
the interval [xi1 , xi ], 2 i k. In this case the same estimates mi are used over the
whole search region for f (x). However, as it was already mentioned above, global
estimates of the constant can provide a very poor information about the behavior
of the objective function over every small subinterval [xi1 , xi ] [0, 1]. In the next
chapter we shall describe the local tuning technique that adaptively estimates the
local Holder constants over different subintervals of the search region allowing so
to accelerate the process of optimization.
Let us study now convergence properties of the MGA algorithm. Theorem 3.1
linking the multidimensional global optimization problem (3.1.1), (3.1.2) to the onedimensional problem (3.1.3), (3.1.4) allows us to concentrate our attention on the
one-dimensional case using the curve. We shall study properties of an infinite (i.e.,
= 0 in (3.5.4)) sequence {xk }, xk [0, 1], k 1, of trial points generated by the
algorithm MGA.
84
Theorem 3.7. Assume that the objective function f (x) satisfies the condition
(3.1.4), and let x be any limit point of {xk } generated by the MGA. Then the
following assertions hold:
Convergence to x is bilateral, if x (0, 1);
f (xk ) f (x ), for any k 1;
If there exists another limit point x = x , then f (x ) = f (x );
If the function f (x) has a finite number of local minima in [0, 1], then the point x
is locally optimal;
5. (Sufficient conditions for convergence to a global minimizer). Let x be a global
minimizer of f (x). If there exists an iteration number k such that for all k > k
the inequality
1.
2.
3.
4.
rm j(k) > H j(k)
(3.5.8)
holds, where m j(k) is an estimate (calculated at Step 2 of the algorithm) of

the Holder constant H j(k) for the interval [x j(k)1 , x j(k) ] containing x , and r
is the reliability parameter of the method. Then the set of limit points of the
sequence {xk } coincides with the set of global minimizers of the function f (x).
Proof. Theorem is proved analogously to proofs of Theorems 3.33.4, with Corollaries 3.23.5 from Sect. 3.3.

Note that assertion 4 in Theorem 3.7 describes conditions of local optimality on
the curve. In the multidimensional region the point pM (x ) can be a point which is
not a local optimum. Such situations for the class of Dived the Best algorithms the
Scheme MGA belongs to have been studied in detail (see [103]). It is also important
to emphasize that assertion 5 regards the global optimum x of the one-dimensional
problem. Since the global minimizer, y , in the N-dimensional space can have up
to 2N images on the curve (see Fig. 3.1) and in the process of optimization a curve
pM (x) is used, in order to have convergence to the point y it is sufficient to have
convergence to one of the images of y on the curve. Of course, in the limit case
(M and = 0 in (3.5.4)) if condition (3.5.8) is satisfied for one of the images,
all global minimizers will be found. But in practice we work with a finite M <
and > 0, i.e., with a finite trial sequence, and the search can stop after finding the
only image of y providing nevertheless the required approximation of the global
minimizer y . This effect leads to a serious acceleration of the search, in the next
chapter we will introduce a local improvement technique [76] in order to enforce
this effect.
The following Theorem ensures existence of the values of the parameter r
satisfying condition (3.5.8) providing so location of all global minimizers of f (x)
by the proposed method.
Theorem 3.8. For any function f (x) satisfying (3.1.4) with H < there exists a
value r such that for all r > r the algorithm MGA determines all global minimizers
of the function f (x) over the search interval [0, 1].
85
Proof. It follows from (3.5.1), and the finiteness of > 0 that approximations of
the Holder constant mi in the method are always greater than zero. Since H < in
(3.1.4) and any positive value of the parameter r can be chosen in the Scheme MGA,
it follows that there exists an r such that condition (3.5.8) will be satisfied for all
global minimizers for r > r . This fact, due to Theorem 3.7, proves the Theorem.

We present now numerical results of experiments executed for testing performance of the algorithm MGA. In all the experiments we have considered the
FORTRAN implementation of the methods tested. Since in many real life problems
each evaluation of the objective function is usually a very time-consuming operation
[63, 93, 98, 117, 139, 156], the number of function evaluations executed by the
methods until the satisfaction of a stopping rule has been chosen as the main
criterion of the comparison.
Classes of test functions. In the field of global optimization there exists an old
set of standard test functions (see [21]). However, recently it has been discovered
by several authors (see [1, 79, 145]) that these problems are not suitable for testing
global optimization methods since the functions belonging to the set are too simple
and methods can hardly miss the region of attraction of the global minimizer.
As a consequence, the number of trials executed by methods is usually very small
and, therefore, non-representative. These functions are especially inappropriate
for testing algorithms proposed to work with the global optimization of real
multiextremal black-box functions where it is necessary to execute many trials in
order to better explore the search region and to reduce the risk of missing the global
solution. The algorithms proposed in this book are oriented exactly on such a type
of hard global optimization problems. Hence, more sophisticated and systematic
tests are required to verify their performance. In our numerical experiments several
classes of N-dimensional test functions generated by the GKLS-generator (see [40])
have been used; an example of a function generated by the GKLS can be seen in
Fig. 3.8. This generator has several advantages that allow one to use it as a good
tool for the numerical comparison of algorithms (in fact, it is used to test numerical
methods in more than 40 countries in the world).
It generates classes of 100 test functions (see [40] for a detailed explanation,
examples of its usage, etc.) with the same number of local minima and supplies
a complete information about each of the functions: its dimension, the values of
all local minimizers, their coordinates, regions of attraction, etc. It is possible to
generate harder or simpler test classes easily. Only five parameters (see Table 3.7)
should be defined by the user and the other parameters are generated randomly.
An important feature of the generator consists of the complete repeatability of the
experiments: if you use the same five parameters, then each run of the generator will
produce the same class of functions.
The GKLS-generator works by constructing test functions F(y) in RN using a
convex quadratic function g(y), i.e., a paraboloid g(y) = y T 2 + t, that is then
distorted over the sets
86
3.5
3
2.5
2
1.5
1
0.5
0
0.5
1
1
0.5
0.5
0.5
0.5
1
Fig. 3.8 A function produced by the GKLS generator shown together with a piecewise-linear
approximation to Peano curve used for optimization
k = {y RN | y Pk rk },
1 k m,
by assigning function values fk at Pk . The general form of F(y) is

F(y) =
Ck (y)
if y k , 1 k m

/ 1 , , m .
y T 2 + t if y
(3.5.9)
where

Ck (y) =

2 y Pk , T Pk 2
A
y Pk 3
y Pk
rk2
rk3

4 y Pk , T Pk 3
+ 2 A y Pk 2 + fk
+ 1
rk
y Pk
rk
(3.5.10)
with A = T Pk 2 + t fk .
The generator gives the possibility to use several types of functions. In the
described experiments, cubic continuous multiextremal functions have been used.
In all series of experiments we have considered classes of 100 N-dimensional
functions with 10 local minima over the domain [1, 1] RN . For each dimension
N = 2, 3, 4 two test classes were considered: a simple class and a difficult one. Note
(see Table 3.7) that a more difficult test class can be created either by decreasing
the radius, rg , of the approximate attraction region of the global minimizer, or by
increasing the distance, d, from the global minimizer to the paraboloid vertex.
Experiments have been carried out by using the following stopping criteria:
87
Table 3.7 Description of GKLS classes of test functions used in the

experiments: the global minimum value
Class
1
2
3
4
5
6
Difficulty
Simple
Hard
Simple
Hard
Simple
Hard
N
2
2
3
3
4
4
m
10
10
10
10
10
10
f
1.0
1.0
1.0
1.0
1.0
1.0
d
0.66
0.90
0.66
0.90
0.66
0.90
rg
0.33
0.20
0.33
0.20
0.33
0.20
f ; the distance from the global minimizer to the vertex of the paraboloid,
d; the radius of the attraction region of the global minimizer, rg
Stopping criteria. The value = 0 is fixed in the stopping rule (3.5.4) and the
search terminates when a trial point falls in a ball Bi having a radius and the
center at the global minimizer of the considered function, i.e.,
Bi = {y RN : y yi },
(3.5.11)
where yi denotes the global minimizer of the i-th function of the test class,
1 i 100.
Comparison MGADirectLBDirect. In this series of experiments we compare
the algorithm MGA with the original Direct algorithm proposed in [67] and its
recent locally biased modification LBDirect introduced in [31, 34]. These methods
have been chosen for comparison because they, just as the MGA method, do not
require the knowledge of the Lipschitz constant of the objective function and the
knowledge of the objective function gradient. The FORTRAN implementations of
these two methods described in [31,33] and downloadable from [32] have been used
in all the experiments. Parameters recommended by the authors have been used in
both methods.
In all experiments the

stopping rule was used with = 0 and = 0.01 N for
classes 1 4 and = 0.02 N for classes 5 and 6, where is from (3.5.11). In all the
cases the maximal number of function evaluations has been taken equal to 90, 000;
the parameter = 108 , from (3.5.1). The choices of the reliability parameter
r, for all the experiments, are the following: for the class 1 the value r = 1.1 for
98 functions, and r = 1.2 for the remaining two functions. The value r = 1.4 was
used for 97 functions of the class 2 and r = 1.5 for the remaining 3 functions of
this class. In dimension N = 3 the value r = 1.1 was applied for all 100 functions of
classes 3 and for 99 functions of class 4; the value r = 1.2 for 1 function of class 4.
In dimension N = 4 the value r = 1.1 was used for all the functions of class 5 and
for 98 functions of class 6, and r = 1.3 for the remaining two function of class 6.
When one executes tests with a class of 100 different functions it becomes
difficult to use specific values of r for each function, hence in our experiments at
most two values of this parameter have been fixed for the entire class. Clearly, such
a choice does not allow the MGA to show its complete potential.
88
Table 3.8 Results of experiments with six classes of test functions generated by the GKLS
Max trials
Class
1
2
3
4
5
6
N
2
2
3
3
4
4
Direct
127
1159
1179
77951
90000(1)
90000(43)
Average
LBDirect
165
2665
1717
85931
90000(15)
90000(65)
MGA
239
938
3945
26964
27682
90000(1)
Direct
68.14
208.54
238.06
5857.16
>12206.49
>57333.89
90
80
80
70
60
50
40
30
20
LBDirect
10
Direct
MGA
50
100
150
Iterations
200
250
No. of solved functions
100
90
100
LBDirect
70.74
304.28
355.30
9990.54
>23452.25
>65236.00
MGA
90.06
333.14
817.74
3541.82
3950.36
>22315.59
70
60
50
40
30
20
LBDirect
10
Direct
MGA
500
1000
1500 2000
Iterations
2500
3000
Fig. 3.9 Methods MGA, Direct and LBDirect, N = 2. Class no.1, left; Class no.2, right
Results of numerical experiments with the six GKLS tests classes from Table 3.7
are shown in Table 3.8. The columns Max trials report the maximal number of
trials required for satisfying the stopping rule (a) for all 100 functions of the class.
The notation 90,000 ( j) means that after 90,000 function evaluations the method
under consideration was not able to solve j problems. The Average columns in
Table 3.8 report the average number of trials performed during minimization of
the 100 functions from each GKLS class. The symbol > reflects the situations
when not all functions of a class were successfully minimized by the method under
consideration: that is the method stopped when 90,000 trials had been executed
during minimizations of several functions of this particular test class. In these cases,
the value 90,000 was used in calculations of the average value, providing in such a
way a lower estimate of the average.
Figure 3.9 shows the behavior of the three methods for N = 2 on classes 1 and
2 from Table 3.7, respectively (for example, it can be seen in Fig. 3.9-left that after
100 function evaluations the LBDirect has found the solution at 79 problems, Direct
at 91 problems and the MGA at 63 problems). Figure 3.10 illustrates the results of
the experiment for N = 3 on classes 3 and 4 from Table 3.7, respectively.
Figure 3.11 shows the behavior of the three methods for N = 4 on classes 5 and
6 from Table 3.7, respectively (it can be seen in Fig. 3.11-left that after 10,000
function evaluations the LBDirect has found the solution at 58 problems, Direct
at 73 problems, and the MGA at 93 problems). It can be seen from Fig. 3.11-left
89
90
80
80
100
90
100
70
60
50
40
30
70
60
50
40
30
20
LBDirect
20
10
Direct
10
LBDirect
Direct
MGA
MGA
500 1000 1500 2000 2500 3000 3500 4000

Iterations
4
5
Iterations
9
x 104
90
80
80
70
60
50
40
30
20
LBDirect
Direct
10
0
MGA
4
5
iterations
9
x 104
100
90
100
70
60
50
40
30
20
LBDirect
10
Direct
MGA
4
5
Iterations
9
x 104
that after 90,000 evaluations of the objective function the Direct method has not
found the solution for 1 function, and the LBDirect has not found the solution for
15 functions of the class 5. Figure 3.11-right shows that the Direct and LBDirect
methods were not able to locate after executing the maximal possible value of
function evaluations, 90, 000, the global minimum of 43 and 62 functions of the
class 6, respectively. The MGA was able to solve all the problems in the classes
15; the MGA has not found the solution only in 1 function in the class 6.
As it can be seen from Table 3.7 and Figs. 3.93.11, for simple problems Direct
and LBDirect are better than the MGA and for harder problems the MGA is better
than its competitors. The advantage of the MGA becomes more pronounced both
when classes of test functions become harder and when the dimension of problems
increases. It can be noticed also that on the taken test classes the performance of
the LBDirect is worse with respect to the Direct (note that these results are in a
good agreement with experiments executed in [116, 117]). A possible reason of this
behavior can be the following. Since the considered test functions have many local
minima and due to its locally biased character, LBDirect spends too much time
exploring various local minimizers which are not global.
Chapter 4
Ideas for Acceleration
I think my acceleration is very good. Thats the key for me.

Usain Bolt
4.1 Introduction
Let us return to the Lipschitz-continuous multidimensional function F(y), y D,
from (3.1.2) and the corresponding global minimization problem (3.1.1). In the
previous chapter, algorithms that use dynamic estimates of the Lipschitz information
for the entire hyperinterval D have been presented. This dynamical estimating
procedures have been introduced since the precise information about the value of
the constant L required by Piyavskiis method for its correct work is often hard to get
in practice. Thus, we have used the procedure (3.2.7), (3.2.8) to obtain an estimate
of the global Lipschitz constant L during the search (the word global means that
the same value is used over the whole region D). In this chapter, we introduce ideas
that can accelerate the global search significantly by using a local information about
F(y). In order to have a warm start let us first introduce these ideas informally for
the one-dimensional case.
Notice that both the a priori given exact constant L and its global overestimates
can provide a poor information about the behavior of the objective function f (x)
over a small subinterval [xi1 , xi ] [a, b]. In fact, over such an interval, the
corresponding local Lipschitz constant L[xi1 ,xi ] can be significantly less than the
global constant L. This fact would significantly slow down methods using global
value L or its estimate over [xi1 , xi ].
In order to overcome this difficulty and to accelerate the search a new approach
called local tuning technique has been introduced in [100,101]. The new approach
allows one to construct global optimization algorithms that tune their behavior to
the shape of the objective function at different sectors of the search region by
using adaptive estimates of the local Lipschitz constants in different subintervals

91
92
4 Ideas for Acceleration
of the search domain during the course of the optimization process. It has been
successfully applied to a number of global optimization methods providing a
high level of the speed up both for problems with Lipschitz objective functions
and for problems with objective functions having Lipschitz first derivatives (see
[74, 77, 102, 103, 105, 117119, 123, 124, 139], etc.).
The main idea lies in the adaptive automatic concordance of the local and global
information obtained during the search for every subinterval [xi1 , xi ] of [a, b]. When
an interval [xi1 , xi ] is narrow, only the local information obtained within the near
vicinity of the trial points xi1 , xi has a decisive influence on the method. In this
case, the results of trials executed at points lying far away from the interval [xi1 , xi ]
are less significant for the method. In contrast, when the method works with a wide
subinterval, it takes into consideration data obtained from the whole search region
because the local information represented by the values f (xi1 ), f (xi ) becomes
less reliable due to the width of the interval [xi1 , xi ]. Thus, for every subinterval
both comparison and the balancing of global and local information is automatically
effected by the method. Such a balancing is very important because the usage of a
local information only can lead to the loss of the global solution (see [126]). It is
important to mention that the local tuning works during the global search over the
whole search region and does not require to stop the global procedure as it is usually
done by traditional global optimization methods when it is required to switch on a
local procedure.
Furthermore, the second accelerating technique, called local improvement, (see
[7577]) that can be used together with the local tuning technique, is presented
in this chapter. This approach forces the global optimization method to make a
local improvement of the best approximation of the global minimum immediately
after a new approximation better than the current one is found. The proposed local
improvement technique is of a particular interest due to the following reasons. First,
usually in the global optimization methods the local search phases are separated
from the global ones. This means that it is necessary to introduce a rule that: stops
the global phase and starts the local one; then it stops the local phase and starts the
global one. It happens very often (see, e.g., [63, 65, 93, 117, 139]) that the global
search and the local one are realized by different algorithms and the global search
is not able to use all evaluations of f (x) made during the local search losing so an
important information about the objective function that has been already obtained.
The local improvement technique does not have this defect and allows the global
search to use all the information obtained during the local phases.
Second, the local improvement technique can work without any usage of
the derivatives. This is a valuable asset because many traditional local methods
require the derivatives and therefore when one needs to solve the problem (3.1.1),
(3.1.2) they cannot be applied because, clearly, Lipschitz functions can be nondifferentiable.
4.2 Local Tuning and Local Improvement in One Dimension
93

In order to introduce the local tuning and local improvement techniques, in this
section, we focus our attention on the most simple case, i.e., the Lipschitz global
optimization problem in one dimension, i.e.
min f (x),
x [a, b],
(4.2.1)
where
| f (x) f (y)| L|x y|,
x, y [a, b],
(4.2.2)
with a constant 0 < L < .

First, we present a general algorithmic scheme which contains two steps, Step 2
and Step 4, that are not defined. They will be clarified later and each specification
of these two steps will give a concrete algorithm. As in the previous chapter, by the
term trial we denote the evaluation of the function f (x) at a point x that, in its turn,
is called the trial point.
General Scheme (GS)

Step 0. The first two trials are performed at the points x1 = a and x2 = b. The point
xk+1 , k 2, of the current (k+1)-th iteration is chosen as follows.
Step 1. Renumber the trial points x1 , x2 , . . . , xk1 , xk of the previous iterations by
subscripts so that
a = x1 < x2 < < xk1 < xk = b.
(4.2.3)
Step 2. Compute in a certain way the values mi being estimates of the local
Lipschitz constants of f (x) over the intervals [xi1 , xi ], 2 i k. The way to
calculate the values mi will be specified in each concrete algorithm described
hereinafter.
Step 3. Calculate for each interval [xi1 , xi ], 2 i k, its characteristic
Ri =
(xi xi1 )
zi + zi1
mi
,
2
2
(4.2.4)
where the values zi = f (xi ), 1 i k.

Step 4. Find an interval [xt1 , xt ] where the next trial will be executed. The
way to choose such an interval will be specified in each concrete algorithm
described below.
94
Step 5. If
|xt xt1 | > ,
(4.2.5)
where > 0 is a given search accuracy, then execute the next trial at the point
xk+1 =
xt + xt1 zt1 zt
+
2
2mt
(4.2.6)
and go to Step 1. Otherwise, take as an estimate of the global minimum f from

(4.2.1) the value
fk = min{zi : 1 i k},
and a point
xk = argmin{zi : 1 i k},
as an estimate of the global minimizer x , after executing these operations STOP.
Let us make some observations with regard to the scheme GS introduced above.
During the course of the (k + 1)th iteration the algorithm constructs an auxiliary
piecewise linear function
Ck (x) =
ci (x),
i=2
where
ci (x) = max{zi1 mi (x xi1), zi + mi (x xi )},
x [xi1 , xi ],
and the characteristic Ri from (4.2.4) represents the minimum of the auxiliary
function ci (x) over the interval [xi1 , xi ].
If the constants mi are equal or larger than the local Lipschitz constant Li
corresponding to the interval [xi1 , xi ], for all i, 2 i k, then the function Ck (x)
is a low-bounding function for f (x) over the interval [a, b], i.e., for every interval
[xi1 , xi ], 2 i k, we have
f (x) ci (x),
x [xi1 , xi ],
2 i k.
Moreover, if mi = L, we obtain the Piyavskii support functions. But if mi , for

each subinterval [xi1 , xi ], is an overestimate of the local Lipschitz constant in this
interval, we can construct at each iteration k, a piecewise support function which
takes into account the behavior of the objective function in the search region and
better approximates f (x) (see Fig. 4.1).
In order to obtain from the general scheme GS a concrete global optimization
algorithm, it is necessary to define Step 2 and Step 4 of the scheme. This

Fig. 4.1 Piecewise linear
support functions constructed
by using the global Lipschitz
constant (green) and the local
Lipschitz constants (blue)
95
6
4
2
f(x)
0
2
4
6
2
10
12
14
16
18
20
22
section proposes four specific algorithms executing this operation in different ways.
In Step 2, we can make two different choices of computing the constant mi that lead
to two different procedures that are called Step 2.1 and Step 2.2, respectively. In the
first procedure we use an adaptive estimate of the global Lipschitz constant (see
[117, 139]), for each iteration k. More precisely we have:
Step 2.1.
Set
mi = r max{ , hk },
2 i k,
(4.2.7)
where > 0 is a small number that takes into account our hypothesis that f (x) is
not constant over the interval [a, b] and r > 1 is a reliability parameter.
The value hk is calculated as follows
hk = max{hi : 2 i k, }
(4.2.8)
with
hi =
|zi zi1 |
,
xi xi1
2 i k,
(4.2.9)
where the values zi = f (xi ), 1 i k. In Step 2.1, at each iteration k all quantities
mi assume the same value over the whole search region [a, b]. However, as it was
already mentioned this global estimate (4.2.7) of the Lipschitz constant can provide
a poor information about the behavior of the objective function f (x) over every
small subinterval [xi1 , xi ] [a, b]. In fact, when the local Lipschitz constant related
to the interval [xi1 , xi ] is significantly smaller than the global constant L, then the
methods using only this global constant or its estimate can work slowly over such
an interval (see [100, 117, 139]).
96
In order to overcome this difficulty, we consider the local tuning approach (see
[74, 100, 117]) that adaptively estimates the values of the local Lipschitz constants
Li corresponding to the intervals [xi1 , xi ], 2 i k. The auxiliary function Ck (x) is
then constructed by using these local estimates for each interval [xi1 , xi ], 2 i k.
This technique is described below as the rule Step 2.2.
Step 2.2 (local tuning technique).
Set
mi = r max{i , i , }
(4.2.10)
i = max{hi1 , hi , hi+1 }, 3 i k 1,
(4.2.11)
with
where hi is from (4.2.9), and when i = 2 and i = k only h2 , h3 , and hk1 , hk , should
be considered, respectively. The value
i = hk
(xi xi1 )
,
X max
(4.2.12)
where hk is from (4.2.8) and

X max = max{xi xi1 : 2 i k}.
The parameter > 0 has the same sense as in Step 2.1.
Note that in (4.2.10) we consider two different components, i and i , that take
into account, respectively, the local and the global information obtained during the
previous iterations. When the interval [xi1 , xi ] is large, the local information is not
reliable and the global part i has a decisive influence on mi thanks to (4.2.10) and
(4.2.12). When [xi1 , xi ] is small, then the local information becomes relevant, i is
small (see (4.2.12)), and the local component i assumes the key role. Thus, Step 2.2
automatically balances the global and the local information available at the current
iteration. It has been proved for a number of global optimization algorithms that the
usage of the local tuning can accelerate the search significantly (see [100, 103, 109,
117, 119, 124, 139]).
Let us introduce now possible ways to fix Step 4 of the GS. At this step, we select
an interval where a new trial will be executed. We consider both the traditional rule
used, e.g., in [92] and [139] and a new one that we shall call the local improvement
technique [7577]). The traditional way to choose an interval for the next trial is the
following.
Step 4.1.
Select the interval [xt1 , xt ] such that
Rt = min{Ri : 2 i k}
and t is the minimal number satisfying (4.2.13).
(4.2.13)
97
This rule used together the exact Lipschitz constant in Step 2 gives us Piyavskiis
algorithm. In this case, the new trial point xk+1 (xt1 , xt ) is chosen in such a
way that
Rt = min{Ri : 2 i k} = ct (xk+1 ) = min{Ck (x) : x [a, b]}.
The new way to fix Step 4 is introduced below.
Step 4.2 (local improvement technique).
f lag is a parameter initially equal to zero. imin is the index corresponding to the
current estimate of the minimal value of the function, that is: zimin = f (ximin )
f (xi ), 1 i k. zk is the result of the last trial corresponding to a point x j in the line
(4.2.3), i.e., xk = x j .
IF (flag=1) THEN
IF zk < zimin THEN imin = j.
Local improvement: Alternate the choice of the interval [xt1 , xt ]
among t = imin + 1 and t = imin, if 2 imin k 1, (if imin = 1
or imin = k take t = 2 or t = k, respectively) in such a way that for
> 0 it follows
|xt xt1 | > .
(4.2.14)
ELSE (flag=0)
t = argmin{Ri : 2 i k}
ENDIF
flag=NOTFLAG(flag)
The motivation of the introduction of Step 4.2 presented above is the following.
In Step 4.1, at each iteration, we continue the search at an interval corresponding to
the minimal value of the characteristic Ri , 2 i k, see (4.2.13). This choice admits
occurrence of such a situation where the search goes on for a certain finite (but
possibly high) number of iterations at subregions of the domain that are distant
from the best found approximation to the global solution and only successively
concentrates trials at the interval containing a global minimizer. However, very
often it is of a crucial importance to be able to find a good approximation of
the global minimum in the lowest number of iterations. Due to this reason, in
Step 4.2 we take into account the rule (4.2.13) used in Step 4.1 and related to the
minimal characteristic, but we alternate it with a new selection method that forces
the algorithm to continue the search in the part of the domain close to the best value
of the objective function found up to now. The parameter flag assuming values 0
or 1 allows us to alternate the two methods of the selection.
More precisely, in Step 4.2 we start by identifying the index imin corresponding
to the current minimum among the found values of the objective function f (x), and
then we select the interval (ximin , ximin+1 ) located on the right of the best current
point, ximin , or the interval on the left of ximin , i.e., (ximin1 , ximin ). Step 4.2 keeps
98
working alternatively on the right and on the left of the current best point ximin until a
new trial point with value less than zimin is found. The search moves from the right to
the left of the best found approximation trying to improve it. However, since we are
not sure that the found best approximation ximin is really located in the neighborhood
of a global minimizer x , the local improvement is alternated in Step 4.2 with
the usual rule (4.2.13) providing so the global search of new subregions possibly
containing the global solution x . The parameter defines the width of the intervals
that can be subdivided during the phase of the local improvement. Note that the
trial points produced during the phases of the local improvement (obviously, there
can be more than one phase in the course of the search) are used during the further
iterations of the global search in the same way as the points produced during the
global phases.
Let us consider now possible combinations of the different choices of Step 2 and
Step 4 allowing us to construct the following four algorithms.
GE: GS with Step 2.1 and Step 4.1 (the method using the Global Estimate of the
Lipschitz constant L).
LT : GS with Step 2.2 and Step 4.1 (the method executing the Local Tuning on
the local Lipschitz constants).
GE LI: GS with Step 2.1 and Step 4.2 (the method using the Global Estimate of
L enriched by the Local Improvement technique).
LT LI: GS with Step 2.2 and Step 4.2 (the method executing the Local Tuning
on the local Lipschitz constants enriched by the Local Improvement technique).
Let us consider convergence properties of the introduced algorithms by studying
an infinite trial sequence {xk } generated by an algorithm belonging to the general
scheme GS for solving problem (4.2.1), (4.2.2).
Theorem 4.1. Assume that the objective function f (x) satisfies the condition
(4.2.2), and let x be any limit point of {xk } generated by the GE or by the LT
algorithm. Then the following assertions hold:
Convergence to x is bilateral, if x (a, b) (see definition 3.1);
f (xk ) f (x ), for all trial points xk , k 1;
If there exists another limit point x = x , then f (x ) = f (x );
If the function f (x) has a finite number of local minima in [a, b], then the point x
is locally optimal;
5. (Sufficient conditions for convergence to a global minimizer). Let x be a global
minimizer of f (x). If there exists an iteration number k such that for all k > k
the inequality
1.
2.
3.
4.
m j(k) L j(k)
(4.2.15)
holds, where L j(k) is the Lipschitz constant for the interval [x j(k)1 , x j(k) ]
containing x , and m j(k) is its estimate (see (4.2.7) and (4.2.10)). Then the set
of limit points of the sequence {xk } coincides with the set of global minimizers of
the function f (x).
99
Proof. The proofs of assertions 15 are analogous to the proofs of Theorems 4.14.2
and Corollaries 4.14.4 from [139].

Theorem 4.2. Assertions 15 of Theorem 4.1 hold for the algorithms GE LI and
LT LI for a fixed finite > 0 and = 0, where is the accuracy of the local
improvement from (4.2.14) and is from (4.2.5).
Proof. Since > 0 and = 0, the algorithms GE LI and LT LI use the local
improvement only at the initial stage of the search until the selected interval [xt1 , xt ]
is greater than . When |xt xt1 | the interval cannot be divided by the local
improvement technique and the selection criterion (4.2.13) is used. Thus, since the
one-dimensional search region has a finite length and is a fixed finite number,
there exists a finite iteration number j such that at all iterations k > j only selection
criterion (4.2.13) will be used. As a result, at the remaining part of the search,
the methods GE LI and LT LI behave themselves as the algorithms GE and LT ,
respectively. This consideration concludes the proof.

The next Theorem ensures the existence of the values of the parameter r such
that the global minimizers of f (x) will be located by the four proposed methods that
do not use the a priori known Lipschitz constant.
Theorem 4.3. For any function f (x) satisfying (4.2.2) with L < there exists a
value r such that for all r > r condition (4.2.15) holds for the four algorithms GE,
LT , GE LI, and LT LI.
Proof. It follows from (4.2.7), (4.2.10), and the finiteness of > 0 that approximations of the Lipschitz constant mi in the four methods are always greater than zero.
Since L < in (4.2.2) and any positive value of the parameter r can be chosen in
the scheme GS, it follows that there exists an r such that condition (4.2.15) will be
satisfied for all global minimizers for r > r . This fact, due to Theorems 4.1 and 4.2,
proves the Theorem.

Let us present now results of numerical experiments executed on 120 functions
taken from the literature to compare the performance of the four algorithms
described in this section. In order to test the effectiveness of the acceleration
techniques we have carried out the numerical tests considering also the Piyavskii
method (denoted by PM) and an algorithm obtained by modifying that of Piyavskii
with the addition of the local improvement procedure (denoted by PM LI). We note
that these two methods belong to the general scheme GS, in particular in Step 2 they
use the exact value of the Lipschitz constant.
Two series of experiments have been done. In the first series, a set of 20 functions
described in [59] has been considered. In Tables 4.1 and 4.2, we present numerical
results for the six methods proposed to work with the problem (4.2.1), (4.2.2).
In particular, Table 4.1 contains the numbers of trials executed by the algorithms
with the accuracy = 104(b a), where is from (4.2.5). Table 4.2 presents the
results for = 106 (b a). The parameters of the methods have been chosen as
follows: = 108 for all the methods, r = 1.1 for the algorithms GE, LT , and
GE LI, LT LI. The exact values of the Lipschitz constant of the functions f (x)
100

Table 4.1 Results of numerical experiments executed on 20 test problems from [59] by
the six methods belonging to the scheme GS; the accuracy = 104 (b a), r = 1.1
Problem
PM
GE
LT
PM LI
GE LI
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
149
155
195
413
151
129
153
185
119
203
373
327
993
145
629
497
549
303
131
493
158
127
203
322
142
90
140
184
132
180
428
99
536
108
550
588
422
257
117
70
37
36
145
45
46
84
41
126
44
43
74
71
73
43
62
79
100
44
39
70
37
33
67
39
151
39
41
55
37
43
47
45
993
39
41
41
43
41
39
41
35
35
25
39
145
41
33
41
37
37
43
33
536
25
37
43
79
39
31
37
Average
314.60
242.40
65.10
95.60
68.55
LT LI
35
35
41
37
53
41
35
29
35
39
37
35
75
27
37
41
81
37
33
33
40.80
have been used in the methods PM and PM LI. For all the algorithms using the
local improvement technique the accuracy from (4.2.14) has been fixed = .
All the global minima have been found by all the methods in all the experiments
presented in Tables 4.1 and 4.2. In the last rows of these tables, the average values
of the numbers of trials points generated by the algorithms are given. It can be seen
from Tables 4.1 and 4.2 that both accelerating techniques, the local tuning and the
local improvement, allow us to speed up the search significantly when we work with
the methods belonging to the scheme GS. With respect to the local tuning we can
see that the method LT is faster than the algorithms PM and GE. Analogously, the
LT LI is faster than the methods PM LI and GE LI. The introduction of the local
improvement also was very successful. In fact, the algorithms PM LI, GE LI, and
LT LI work significantly faster than the methods PM, GE, and LT , respectively.
Finally, it can be clearly seen from Tables 4.1 and 4.2 that the acceleration effects
produced by both techniques are more pronounced when the accuracy of the search
increases.
In the second series of experiments, a class of 100 one-dimensional randomized
test functions from [94] has been taken. Each function f j (x), 1 j 100, of this
class is defined over the interval [5, 5] and has the following form
f j (x) = 0.025(x xj )2 + sin2((x xj ) + (x xj )2 ) + sin2 (x xj ),
(4.2.16)
101
Table 4.2 Results of numerical experiments executed on 20 test problems from [59] by the six
methods belonging to the scheme GS; the accuracy = 106 (b a), r = 1.1
PM
GE
LT
PM LI
GE LI
LT LI
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Problem
1,681
1,285
1,515
4,711
1,065
1,129
1,599
1,641
1,315
1,625
4,105
3,351
8,057
1,023
7,115
4,003
5,877
3,389
1,417
2,483
1,242
1,439
1,496
3,708
1,028
761
1,362
1,444
1,386
1,384
3,438
1,167
6,146
1,045
4,961
6,894
4,466
2,085
1,329
654
60
58
213
66
67
81
64
194
64
65
122
114
116
66
103
129
143
67
60
66
55
53
89
63
59
63
65
81
61
59
71
67
8,057
57
65
63
69
65
61
61
55
61
51
63
65
65
55
67
59
63
63
57
6,146
49
61
65
103
61
57
61
57
57
61
59
74
61
59
49
57
57
61
55
119
49
59
63
103
57
53
53
Average
2919.30
2371.75
95.90
464.20
366.35
63.15
Table 4.3 Average number

of trial points generated by
the six methods belonging to
the scheme GS on 100 test
functions from [94] with the
accuracies = 104 and
= 106
Method
PM
GE
LT
PM LI
GE LI
LT LI
r
1.1
1.1
1.1
1.3*
= 104
400.54
167.63
47.28
44.82
40.22
38.88
r
1.1
1.1
1.2
1.2
= 106
2928.48
1562.27
70.21
65.70
62.96
60.04
where the global minimizer xj , 1 j 100, is chosen randomly from the interval
[5, 5] and differently for the 100 functions of the class. Figure 4.2 shows the graph
of the function no. 38 from the set of test functions (4.2.16) and the trial points
generated by the six methods during minimization of this function with the accuracy
= 104(b a). The global minimum of the function, f = 0, is attained at the point
x = 3.3611804993. In Fig. 4.2 the effects of the acceleration techniques, the local
tuning and the local improvement, can be clearly seen.
Table 4.3 shows the average numbers of trial points generated by the six methods
belonging to the scheme GS. In columns 2 and 4, the values of the reliability
parameter r are given. The parameter was again taken equal to 108 and = .
In Table 4.3, the asterisk denotes that in the algorithm LT LI (for = 104(b a))
the value r=1.3 has been used for 99 functions, and for the function no. 32 the value
102

4
3
2
1
0
PM
421
GE
163
LT
43
PM_LI
41
GE_LI
37
LT_LI
33
Fig. 4.2 Graph of the function number 38 from (4.2.16) and trial points generated by the six
methods during their work with this function
r=1.4 has been applied. Table 4.3 confirms for the second series of experiments the
same conclusions that have been made with respect to the effects of the introduction
of the acceleration techniques for the first series of numerical tests.
4.3 Acceleration of Multidimensional Geometric Algorithms

In this section we present multidimensional geometric methods that use in their
work the metric of Holder, and in which local tuning and local improvement
techniques are exploited.
We consider some modifications of the algorithm MGA from Sect. 3.5 of the
previous chapter in order to obtain three new algorithms that solve the Lipschitz
global minimization problem (3.1.1), (3.1.2). As it was in the previous section, we
first describe a general algorithmic scheme for solving problem (3.1.1), (3.1.2) in a
compact form without specifying Steps 2 and 4.
Multidimensional Geometric Scheme (MGS)
Step 0. Set x1 = 0, x2 = 1 and compute the values of the function z j = f (x j ) =
F(pM (x j )), j = 1, 2, where pM (x) is the M-approximation of the Peano
curve. After executing k trials the choice of new trial points is done as
follows.
103
Step 1. Execute Step 1 of the GS from Sect. 4.2.

Step 2. Call the function HOLDER-ESTIMATE(setH) in order to compute the
value mi being an estimate of the Holder constant of f (x) over the interval
[xI1 , xi ], 2 i k. The parameter setH can assume the values 1 or 2.
Step 3. For each interval [xi1 , xi ], 2 i k, compute the point yi and the
characteristic Ri , according to (3.3.4) and (3.3.5), replacing the values
zi = f (xi ) by F(pM (xi )).
Step 4. Call the function SELECT(setINT ) that returns an interval [xt1 , xt ] for the
next possible trial. The parameter setINT can assume the values 1 or 2.
Step 5. If
|xt xt1 |1/N ,
(4.3.1)
where > 0 is a given search accuracy, then calculate an estimate of the

global minimum as
xk+1 = yt
(4.3.2)
set k = k + 1 and go to Step 1.

In order to obtain from the general scheme MGS a global optimization algorithm,
it is necessary to define the routines used in Steps 2 and 4 of the scheme. In Step 2,
we can make two different choices of the constant mi , according to the value of
the parameter setH. For setH = 1, we consider a procedure that estimates the global
constant during the search for each iteration k (as in the algorithm MGA of Sect. 3.5),
whereas for setH = 2 we consider the local tuning procedure introduced in Sect. 4.2
that is used to determine estimates of the local Holder constants at subintervals of
the search domain in the course of the work of the algorithm. Let us describe both
procedures.
HOLDER-ESTIMATE(1)
Set
mi = max{ , hk },
2 i k,
(4.3.3)
where > 0 is a small number that takes into account our hypothesis that
f (x) is not constant over the interval [0, 1] and the value hk is calculated
as follows
(4.3.4)
104
with
hi =
|zi zi1 |
,
|xi xi1|1/N
2 i k.
(4.3.5)
HOLDER-ESTIMATE(2)
Set
mi = max{i , i , },
2 i k,
(4.3.6)
with
i = max{hi1 , hi , hi+1 }, 3 i k 1,
(4.3.7)
where Hi is from (4.3.5), and when i = 2 and i = k we consider only h2 ,

h3 , and hk1 , hk , respectively. The value
i = hk
|xi xi1 |
,
X max
(4.3.8)
where hk is from (4.3.4) and

X max = max{|xi xi1 |1/N , 2 i k}.
The parameter > 0 has the same sense as in HOLDER-ESTIMATE(1).
In Step 4, we choose the interval for performing a new trial. Following the
path from the previous section we have considered both the traditional criterion
choosing the minimal characteristic used (see, e.g., [74]) and corresponding to the
value 1 of the parameter SetINT and the local improvement criterion introduced in
the previous section (see [75, 76]), corresponding to the value 2 of the parameter
SetINT .
SELECT(1)
Execute Step 4.1 of the GS from Sect. 4.2.
SELECT(2)
Execute Step 4.2 of the GS from Sect. 4.2.
Thus, by considering the Scheme MGS with the function HOLDER-ESTIMATE(1)
and SELECT(1) or SELECT(2) we have two algorithms that we shall call AG (the
Algorithm with Global approximation of the Holder constant) and AGI (the Algorithm with Global approximation of the Holder constant and local Improvement).
We note that algorithm AG is exactly the algorithm MGA from Sect. 3.5 of the
105
previous chapter. The Scheme MGS with the function HOLDER-ESTIMATE(2)

and SELECT(1) or SELECT(2) gives two other algorithms that we shall call AL
(the Algorithm with the Local tuning) and ALI (the Algorithm with the Local tuning
and local Improvement).
As regards the convergence properties of algorithms belonging in the MGS
scheme, we must keep in mind that Theorem 3.1, linking the multidimensional
global optimization problem (3.1.1), (3.1.2) to the one-dimensional problem (3.1.3),
(3.1.4) allows us to concentrate our attention on the one-dimensional methods
working on the one-dimensional curve. Therefore convergence of the four methods
AG, AGI, AL, and ALI is proved (see [76]) by Theorems analogous to Theorems
3.23.4, with Corollaries 3.23.5, from Sect. 3.3 and Theorem 4.2 from Sect. 4.2.
Let us report now numerical results obtained by comparing the four algorithms
AG, AGI, AL, and ALI, in order to test the effectiveness of the acceleration procedures. Three series of experiments have been executed and in all the experiments
we have considered the FORTRAN implementation of the methods tested. In our
numerical experiments the GKLS-generator of classes of test functions from [40]
has been used. The GKLS-generator works by constructing test functions F(y) in RN
from (3.5.9) (see [36]). An example of a function generated by the GKLS-generator
can be seen in Fig. 3.8). In all the series of experiments we have considered classes
(described in Table 3.7) of 100 N-dimensional functions with 10 local minima over
the domain [1, 1] RN .
The experiments have been carried out by using one of the following two
stopping criteria:
(a) The value = 0 is fixed in the stopping rule (4.3.1) and the search terminates
when a trial point falls in a ball Bi having a radius and the center at the global
minimizer of the considered function, i.e.,
Bi = {y RN : y yi },
(4.3.9)
where yi denotes the global minimizer of the i-th function of the test class,
1 i 100.
(b) A value > 0 is fixed and the search terminates when the rule (4.3.1) is satisfied;
then it is counted the number of functions of the class for which the method
under consideration was able to put a point in the ball Bi , 1 i 100.
Comparison AGAGI and ALALI. In the first series of experiments, the
efficiency of the local improvement technique was studied. For this purpose,
the algorithms AG and AL were compared with the algorithms AGI and ALI,
respectively, on the class 1 from Table 3.7 (see Fig. 4.3). All experiments were
performed with = 108, =106 , from (4.2.14), and = 0 using the strategy
(a) with the radius = 0.01 N, where is from (4.3.9). The choices of the
reliability parameter r are given below in subsection The choice of parameters in
the experiments.
In order to illustrate a different behavior of the methods using the local
improvement technique, Fig. 4.4 shows behavior of the AG and the AGI on problem
106
100
100
90
90
80
80
70
70
60
60
50
50
40
40
30
30
20
20
10
0
AL, (average number of trials 119, max 343)

ALI, (average number of trials 76, max 241)
10
AG, (average number of trials 132, max 468)

AGI, (average number of trials 76, max 205)
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
350
Fig. 4.3 Methods AGI and AG using the global estimate, left. Methods ALI and AL using local
estimates, right
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0.2
0.2
0.4
0.4
0.6
0.6
0.8
0.8
1
1 0.8 0.6 0.4 0.2
0.2 0.4 0.6 0.8
1
1 0.8 0.6 0.4 0.2
0.2 0.4 0.6 0.8
Fig. 4.4 Function no.55, class 1. Trial points produced by the AG, left. Trial points produced by
the AGI, right. Trial points chosen by the local improvement strategy are shown by the symbol *
no.55 from class 1. Figure 4.4-left shows 337 points of trials executed by the AG
to find the global minimum of the problem and Fig. 4.4-right presents 107 points
of trials executed by the AGI to solve the same problem. Recall that the search has
been stopped using the rule (a), i.e., as soon as a point within the ball B55 has been
placed.
Comparison AGIALI. In the second series of experiments (see Fig. 4.5), the
algorithms AGI and ALI are compared in order to study the influence of the local
tuning technique in the situation when the local improvement is applied too. The
choices of the reliability parameter r are given below in subsection The choice of
parameters in the experiments and the other parameters have been chosen as in the
first series of experiments. In Fig. 4.5-left, the rule (a) is used. It can be noticed that
the method ALI is faster in finding the global solution: the maximum number of
iterations executed by ALI is 241 against 1,054 carried out by the algorithm AGI.
In Fig. 4.5-right, the strategy (b) is used, the algorithms stop when the rule (4.3.1) is

100
100
90
90
80
80
70
70
60
60
50
50
40
40
30
30
20
107
20
AGI, (average no. of trials 164, max 1054)
ALI, (average no. of trials 76, max 241)
10
10
AGI, (av. trials 3910, max 6880)

ALI, (av. trials 437, max 586)
0
0
200
400
600
800
1000
1200
1000 2000 3000 4000 5000 6000 7000
Fig. 4.5 ALI and AGI: = 0, left. ALI and AGI: = .001, right
100
100
90
90
80
80
70
70
60
60
50
50
40
40
30
30
20
20
AG, (av trials 1023, max 1910)
ALI, (av trials 437, max 586)
10
0
0
200
400
600
800 1000 1200 1400 1600 1800 2000
10
0
AG, (average no. of trials 1349, max 2952)

5000
10000
15000
Fig. 4.6 N = 2, class 1, methods AG and ALI, left. N = 3, class 3, methods AG and ALI, right
satisfied, with = 0.001. This criterion is very important because in solving real-life
problems we do not know a priori the global solution of the problem. Thus, it is very
important to study, how many trials should execute the methods to find the solution
and to stop by using the practical criterion (b). It can be seen that the ALI is very
fast to stop, whereas the method AGI executes a global analysis of the whole domain
of each objective function so that the stopping rule (4.3.1) is verified after a higher
number of trials.
Comparison AGALI. In the third series of experiments, we compare the basic
algorithm AG with the algorithm ALI using both the local tuning and the local
improvement, on classes 1, 3 and 5 from Table 3.7. The practical rule (b) was used
in these experiments. The choices of the reliability parameter r are given below.
In dimension 2, the values of , , and were the same as in the experiments
above; was fixed equal to 0.001. In Fig. 4.6-left the behavior of the two methods
can be seen. Note that after 500 iterations the stopping rule in the ALI was verified
for 84 functions and all the minima have been found, whereas the algorithm AG
stopped only at 2 functions.
108
Fig. 4.7 N = 4, class 5,
methods AG and ALI

100
90
80
70
60
50
40
30
20
10
0
AG, (average no. of trials >40036, max >90000)

8
x 104
For N = 3, the radius = 0.01 N has been used. The parameters of the methods
have been chosen as follows: the search accuracy = 0.0022, = 106, and =
108. In Fig. 4.6-right the behavior of the methods can be seen. All global minima
have been found.
In the last experiment of this series, the class of functions with N = 4 has been
used. Themethods AG and ALI worked with the following parameters: = 0.005,
= 0.04 N, = 108 , = 108 . The algorithm AG was not able to stop within
the maximal number of trials, 90,000, for 11 functions; however, the a posteriori
analysis has shown that the global minima have been found for these functions, too.
Figure 4.7 illustrates the results of the experiment.
The choice of parameters in the experiments. In this subsection we specify the
values of the reliability parameter r used in all the experiments. As has been already
discussed above (see also Theorem 3.3 in [76]), every function optimized by the AG,
AGI, AL, and ALI algorithms has a crucial value r of this parameter. Therefore,
when one executes tests with a class of 100 different functions it becomes difficult
to use specific values of r for each function, hence in our experiments at most two
or three values of this parameter have been fixed for the entire class. Clearly, such
a choice does not allow the algorithms to show their complete potential because
both the local tuning and local improvement techniques have been introduced to
capture the peculiarities of each concrete objective function. However, even under
these unfavorable conditions, the four algorithms proposed in the paper have shown
a nice performance. Note that the meaning of r and other parameters of this kind
in Lipschitz global optimization is discussed in detail in a number of fundamental
monographs (see, e.g., [93, 117, 132, 139, 156]).
4.4 Fast Information Algorithms
109
The following values of the reliability parameter r were used in the first series
of experiment: in the methods AG and AGI the reliability parameter r = 1.3; in the
ALI the value r = 2.8 was used for all 100 functions of the class and in the method
AL the same value r = 2.8 was used for 98 functions and r = 2.9 for the remaining
two functions.
In the second series of experiments the same value of the parameter r = 2.8 has
been used in both methods (AGI and ALI).
In the third series of experiments the following values of the parameter r have
been used: in dimension N = 2, in the AG the value r = 1.3 and in the ALI the
value r = 2.8. In dimension N = 3, the value r = 1.1 has been applied in the method
AG for all 100 functions of the class; in the method ALI, r = 3.1 has been used for
73 functions of the class, r = 3.4 for 20 functions, and r = 3.9 for the remaining 7
functions. In dimension N = 4, r = 1.1 in the method AG; r = 6.5 in the ALI for 77
functions of the class, r = 6.9 for 17 functions, r = 7.7 for the remaining 6 functions.

We present now algorithms that belong to the class of information methods and use
estimates of the local Lipschitz constants and the local improvement technique to
accelerate the search as has be done in the previous sections for the methods working
within the framework of the geometric approach. We introduce first the onedimensional algorithm and then the multidimensional one. The former algorithm
uses both the local tuning and local improvement techniques to accelerate the
algorithm IA from Sect. 3.2.
One-Dimensional Information Algorithm with Local Tuning

and Local Improvement (OILI)
Step 0. Starting points x1 , x2 , . . . , xm , m > 2, are fixed in such a way that x1 = a, xm =
b and the other m 2 points are chosen arbitrarily. Values f (x1 ), . . . , f (xm )
are calculated at these points. The point xk+1 , k m, of the current (k+1)-th
iteration is chosen as follows.
Step 1. Renumber the trial points x1 , x2 , . . . , xk of the previous iterations by
subscripts so that
a = x1 < x2 < < xk1 < xk = b.
(4.4.1)
Step 2. Compute the value mi being an estimate of the Lipschitz constant of f (x)
over the interval [xi1 , xi ], 2 i k, according to Step 2.2 of Sect.4.2 (the
local tuning).
110
Step 3. For each interval [xi1 , xi ], 2 i k, compute its characteristics

Ri = mi (xi xi1) +
(zi zi1 )2
2(zi + zi1 ),
mi (xi xi1)
(4.4.2)
where zi = f (xi ).
Step 4. Select the interval [xt1 , xt ] for the next possible trial according to Step 4.2
of Sect. 4.2 (the local improvement).
Step 5. If
|xt xt1 | > ,
(4.4.3)
where > 0 is a given search accuracy, then execute the next trial at the
point

zt zt1
xk+1 = 0.5 xt + xt1
mt
(4.4.4)
and go to Step 1. Otherwise, calculate an estimate of the minimum as

fk = min{zi : 1 i k}
and STOP.
Notice that, as it was in the previous methods, in the algorithm OILI the local
tuning and local improvement techniques are used, respectively, in Step 2 and Step 4
of the scheme. Furthermore, if in Step 4 we select the interval [xt1 , xt ] for the
next possible trial according to (3.2.10) of the IA algorithm from Sect. 3.2, i.e.
we use the traditional rule to choose this interval, then we obtain the information
algorithm with local tuning introduced in [101] that we shall denote hereinafter by
OIL. Convergence conditions of both algorithms are similar to the results considered
in the previous sections (see [75, 101] for a comprehensive discussion).
We present now some numerical results in order to compare the performance of
the information methods OILI and OIL with the methods that we have considered in
Chap. 3 (see Table 3.1): Galperins algorithm [35] denoted by GA, Piyavskii method
[92] denoted by PM, and the information method IA of Sect. 3.2. The set of 20 test
functions proposed by Hansen, Jaumard, and Lu in [59] has been considered for the
comparison.
The following parameters have been chosen for the experiments. In the methods
GA and PM, the precise values of Lipschitz constant have been used. In the methods
IA, OIL, and OILI the parameter r = 2 has been taken. The experiments have
been performed by using the value 0.0001 of the accuracy from (4.4.3). All the
global minima have been found by all the methods. We can see from Table 4.4 that

Table 4.4 Numerical
comparison of the
information algorithms using
the local tuning and local
improvement techniques to
accelerate the search with
traditional univariate
algorithms
111
Function
GA
PM
IA
OIL
OILI
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
377
308
581
923
326
263
383
530
314
416
779
746
1,829
290
1,613
992
1,412
620
302
1,412
149
155
195
413
151
129
153
185
119
203
373
327
993
145
629
497
549
303
131
493
127
135
224
379
126
112
115
188
125
157
405
271
472
108
471
557
470
243
117
81
35
36
136
41
45
54
39
132
42
40
71
68
45
46
63
53
101
41
34
42
34
36
40
40
42
40
40
34
38
38
36
40
32
36
38
38
48
40
32
38
Average
720.80
314.60
244.15
58.20
38.00
both acceleration techniques allow us to obtain a good speed up for information

algorithms in the same way as it was for geometric methods.
Let us present now the Multidimensional Information algorithm with Local
tuning and local Improvement (MILI). It is described by the following scheme.
Algorithm MILI
Step 0. Starting points x1 , x2 , . . . , xm , m > 2, are fixed in such a way that x1 = 0,
xm = 1 and the other m 2 points are chosen arbitrarily. Values z j = f (x j ) =
F(pM (x j )), 1 j m, are calculated, where pM (x) is the M-approximation
of the Peano curve. After executing k trials the choice of new trial points is
done as follows.
Step 1. Execute Step 1 of OILI.
Step 2. (Local tuning.) Evaluate the values mi according to (4.2.10) of Sect. 4.2
replacing (xi xi1 ) by (xi xi1 )1/N in (4.2.11), (4.2.12) and X max by
(X max )1/N in (4.2.12). The values f (x j ) are replaced by F(pM (x j )).
Step 3. For each interval [xi1 , xi ], 2 i k, calculate characteristics Ri according
to (4.4.2) of algorithm OILI, replacing (xi xi1 ) by (xi xi1 )1/N .
Step 4. Execute Step 4 of OILI for select the index t.
Step 5. If
112
|xt xt1 |1/N
(4.4.5)
where > 0 is a given search accuracy, then calculate an estimate of the

minimum as

xk+1 = 0.5(xt + xt1 )
|zt zt1 |
mt
N
1
sgn(zt zt1 )
2r
(4.4.6)
and go to Step 1.
If in Step 4 of the scheme MILI we consider the traditional selection rule in
according to (3.2.10) from Sect. 3.2, i.e. we select the interval [xt1 , xt ] for the
next possible trial corresponding to the maximal characteristic, then we obtain the
Multidimensional Information algorithm with Local tuning that we will denote as
MIL hereinafter (see [101]).
Theorem 4.4. Let > 0 be fixed finite, and = 0, and x be a global minimizer of
f (x) = F(y(x)) and {k} be the sequence of all iteration numbers {k} = {1, 2, 3, . . .}
corresponding to trials generated by the MILI or the MIL. If there exists an infinite
subsequence {h} of iteration numbers {h} {k} such that for an interval
[xi1 , xi ], i = i(p), p {h},
containing the point x at the p-th iteration, the inequality
mi 211/N Ki +

222/N Ki2 Mi2
(4.4.7)
holds for the estimate mi of the local Lipschitz constant corresponding to the interval
[xi1 , xi ], then the set of limit points of the sequence {xk } of trials generated by the
MILI or the MIL coincides with the set of global minimizers of the function f (x).
In (4.4.7), the values Ki and Mi are the following:
Ki = max{(zi1 f (x ))/(x xi1 )1/N , (zi f (x ))/(xi x )1/N },
Mi = |zi1 zi |/(xi xi1)1/N .
Proof. Convergence properties of the method are similar to that ones described in
the previous section (see [75] for a detailed discussion).

Let us consider now two series of numerical experiments that involve a total
of 600 test functions in dimension N = 2, 3, 4. More precisely, six classes of 100
113
functions each generated by the GKLS-generator described in [40] and presented in

Table 3.7 have been considered. An example of a function generated by the GKLS
can be seen in Fig. 3.8 (for more details see Sect. 3.5). The numerical experiments
have been carried out by using one of the two stopping rules (a) or (b) defined in
Sect. 4.3.
Two criteria have been used to compare the performance of the original information method MIA (see [132,133]) described in Sect. 3.4 with the two methods using
acceleration techniques introduced in this section: the information algorithm with
the local tuning MIL and the MILI method using both the local tuning and local
improvement techniques.
Let Ts be the number of trials performed by a method to solve the problem
number s, 1 s 100, is the number that identifies the considered function of a
fixed test class.
Criterion C1. Number of trials Ts required for a method to satisfy a fixed stopping
criterion, for all 100 function of a particular test class, i.e.,
Ts = max Ts ,
1s100
s = arg max Ts .
1s100
(4.4.8)
Criterion C2. Average number of trials Tavg performed by the method during
minimization of all 100 functions from a particular test class, i.e.,
Tavg =
1 100
Ts .
100 s=1
(4.4.9)
Note that results reflected by Criterion C1 are influenced by minimization of the

most difficult function of a class whereas Criterion C2 considers average data of a
class.
Let us describe the first series of experiments in which we compare the
original information method MIA described in Sect. 3.4 that does not use the
acceleration techniques with the information algorithm MIL working with the local
tuning and with the MILI method that uses both the local tuning and the local
improvement. We consider these experiments in order to study the influence of the
local tuning strategy in the situation when the stopping rule (b) is used. In Table 4.5,
we summarize the results of the numerical experiments. In this table, the parameter
d is from (4.3.9). The value of from (4.4.5) is fixed equal to 103 and the parameter
= 108 , where is from Step 2 of the MILI. The same value of has been used
in the MIL and the MIA, as well.
The choices of the reliability parameter r for all the methods are given below in
the subsection The choice of parameters in the experiments. Due to Theorem 5 of
[75], every function optimized by the MILI has a crucial value r of this parameter.
The same situation takes place for both the MIA and MIL algorithms (see [101]).
Since the algorithm MIL uses a global estimate of the Lipschitz constant, the
value r for this method is less variable for different functions of a fixed class.
114
Table 4.5 Results of experiments with respect to criteria C1 and C2

C1
Class
1
2
3
4
5
6
.01N
.01N
.01N
.01N
.02N
.02 N
C2
MIA
1219
4678
22608
70492
100000(53)
100000(96)
MIL
1065
2699
10800
47456
100000(2)
100000(24)
MILI
724
2372
8459
37688
100000(1)
100000(27)
MIA
710.39
1705.89
8242.19
20257.50
83362.00
99610.97
MIL
332.48
956.67
2218.82
12758.14
23577.91
61174.69
MILI
354.82
953.58
2312.93
11505.38
23337.03
61900.93
The algorithms MILI and MIL have been constructed in order to be tuned on each
concrete function. Therefore, when one executes tests with a class of 100 different
functions it becomes difficult to use specific values of r for each function and in our
experiments only one or two values of this parameter have been fixed for the entire
class. Clearly, such a choice does not allow the algorithms MILI and MIL to show
their complete potential in the comparison with the MIA. However, as it can be seen
from Table 4.5, even under these unfavorable conditions, the algorithms show a very
nice performance. In the results described in Table 4.5, all the algorithms were able
to find the solution to all 100 functions of each class. It can be seen that the MIL
and the MILI were very fast to stop, whereas the MIA executed a deeper global
analysis of the whole domain of each objective function so that the stopping rule
(4.4.5) was verified after a higher number of trials. In all the cases, the maximal
number of function evaluations has been taken equal to 100,000 and in Table 4.5, in
the C1 columns, the numbers in brackets present the number of functions for which
the algorithm has reached this number.
In the second series of experiments, the efficiency of the local improvement
technique was studied. For this purpose, the algorithm MILI has been compared
with the algorithm MIL by using the stopping strategy (a), i.e., the search went
on until a point within the ball Bi from (4.3.9), has been placed. In solving many
concrete problems very often it is crucial to find a good approximation of the global
minimum in the lowest number of iterations. The most important aim of the local
improvement is that of quicken the search: thus, we use the stopping criterion (a)
that allows us to see which of the two methods faster approaches the global solution.
In these experiments, we considered the criteria C1 and C2 previously described,
and a new criterion defined as follows.
Criterion C3. Number p (number q) of functions from a class for which the MIL
algorithm executed less (more) function evaluations than the algorithm MILI. If Ts
is the number of trials performed by the MILI and Ts is the corresponding number
of trials performed by the MIL method, p and q are evaluated as follows:
p=
100
s=1
s ,
s

=
1, Ts < Ts ,
0, otherwise;
(4.4.10)
115
Table 4.6 Results of the second series of experiments

C1
Class
1
2
3
4
5
6
MIL
668
1517
7018
40074
67017
76561
C2
MILI
434
1104
5345
15355
36097
73421
MIL
153.72
423.39
1427.06
6162.02
10297.14
21961.91
q=
100
s ,
s=1
C3
MILI
90.73
198.82
838.96
2875.06
6784.37
16327.21
MIL
20
22
25
25
36
40
s =
MILI
79
77
75
75
64
60
1, Ts < Ts ,
0, otherwise.
Ratio C1
MIL/MILI
1.5391
1.3741
1.3130
2.6098
1.8565
1.0427
Ratio C2
MIL/MILI
1.6942
2.1295
1.7008
2.1432
1.5177
1.3451
(4.4.11)
If p + q < 100, then both the methods solve the remaining 100 (p + q) problems
with the same number of function evaluations.
Table 4.6 presents results of numerical experiments in the second series. The
C1 and C2 columns have the same meaning as before. The C3 column
presents results of the comparison between the two methods in terms of this
criterion: the MIL sub-column presents the number of functions, p, of a particular
test class, for which MIL spent fewer trials than the MILI method. Analogously, the
MILI sub-column shows the number of functions, q, for which the MILI executed
less function evaluations with respect to the MIL (p and q are from (4.4.10) and
(4.4.11), respectively). For example, in the line corresponding to the test class 1,
for N = 2, we can see that the method MILI was better (was worse) than the MIL
on q = 79 (p = 20) functions, and for one function of this class the two methods
generated the same number of function trials.
In all the cases, the maximal number of function evaluations has been taken equal
to 100,000. The parameters d, , and and the values of the reliability parameter r
used in these experiments for the MIL and MILI methods are the same as in the first
series of experiments. It can be seen from Table 4.6 that on these test classes the
method MILI worked better than the information algorithm MIL. In particular, the
columns Ratio C1 and Ratio C2 of Table 4.6 show the improvement obtained
by the MILI with respect to Criteria C1 and C2. They represent the ratio between
the maximal (and the average) number of trials performed by the MIL with respect
to the corresponding number of trials performed by the MILI algorithm.
The choice of parameters in the experiments. The following values of the
reliability parameter r have been used for the methods in the first series of
experiments: for the test class 1 the value r = 4.9 in the MIL and MILI algorithms
and the value r = 3.1 in the MIA algorithm. For the class 2 the value r = 5.4 was
used in the MIL and the MILI for 97 functions, and r = 5.5 for the remaining 3
functions of this class; in the MIA the values r = 4.1 and r = 4.3 were used for 97
and 3 functions of the same class, respectively.
116
In dimension N = 3, the values r = 5.5 and r = 5.7 were applied in the MIL
and the MILI methods for 97 and 3 functions of the class 3, respectively; the values
r = 3.2 and r = 3.4 for 97 and 3 functions of this class when the MIA algorithm has
been used. By considering the test class 4 the following values of the parameter r
have been used: r = 6.5 and r = 6.6 in the MIL and the MILI methods for 99 and
1 function, respectively; r = 3.8 for 99 functions in the MIA and r = 4.1 for the
remaining function.
In dimension N = 4, the value r = 6.2 was used for all 100 functions of test class 5
in the MIL and the MILI, and r = 3.3, r = 3.5 in the MIA, for 96 and 4 functions,
respectively. The value r = 6.2 was applied for 92 functions of test class 6 in the
MIL and the MILI, and the values r = 6.6 and 6.8 were used for 5 and 3 functions,
respectively; in the MIA algorithm the value r = 3.8 has been used for 98 functions
of the class 6 and r = 4.1 for the remaining 2 functions.
Finally, the parameter from Step 4 of the MILI algorithm has been fixed equal
to 106 for N = 2, and equal to 108 for N = 3, 4.
Chapter 5
A Brief Conclusion
What we call the beginning is often the end. And to make an end
is to make a beginning. The end is where we start from.
T. S. Eliot
We conclude this brief book by emphasizing once again that it is just an introduction
to the subject. We have considered the basic Lipschitz global optimization problem,
i.e., global minimization of a multiextremal, non-differentiable Lipschitz function
over a hyperinterval with a special emphasis on Peano curves, strategies for adaptive
estimation of Lipschitz information, and acceleration of the search. There already
exists a lot of generalizations of the ideas presented here in several directions.
For the reader interested in a deeper immersion in the subject we give below some
of them:
Algorithms working with discontinuous functions and functions having Lipschitz
first derivatives (see [43,70,72,77,102,103,106,117,118,121,139] and reference
given therein).
Algorithms working with diagonal partitions and adaptive diagonal curves
for solving multidimensional problems with Lipschitz objective functions and
Lipschitz first derivatives (see [72, 107, 108, 116118] and reference given
therein).
Algorithms for multicriteria problems and problems with multiextremal nondifferentiable partially defined constrains (see [109, 119, 122124, 134, 137, 139,
140] and reference given therein).
Algorithms combining the ideas of Lipschitz global optimization with the
Interval Analysis framework (see [1416, 81], etc.).
Parallel non-redundant algorithms for Lipschitz global optimization problems
and problems with Lipschitz first derivatives (see [44, 105, 113115, 120, 138
140], etc.).

117
118
5 A Brief Conclusion
Algorithms for finding the minimal root of equations (and sets of equations)
having a multiextremal (and possibly non-differentiable) left-hand part over an
interval (see [15, 16, 18, 19, 83, 121], etc.).
Thus, this book is a demonstration that the demand from the world of applications
entails a continuous intensive activity in the development of new global optimization
approaches. The authors hope that what is written here may serve not only as
a tool for people from different applied areas but also as the source of many
other successful developments (especially by young researchers just coming to the
scene of global optimization). Therefore, we expect this book to be a valuable
introduction in the subject to faculty, students, and engineers working in local and
global optimization, applied mathematics, computer sciences, and in related areas.
References
1. Addis, B., Locatelli, M.: A new class of test functions for global optimization. J. Global
Optim. 38, 479501 (2007)
2. Addis, B., Locatelli, M., Schoen, F.: Local optima smoothing for global optimization. Optim.
Meth. Software 20, 417437 (2005)
3. Aguiar e Oliveira, H., Jr., Ingber, L., Petraglia, A., Rembold Petraglia, M., Augusta Soares
Machado, M.: Stochastic Global Optimization and Its Applications with Fuzzy Adaptive
Simulated Annealing. Springer, Berlin (2012)
4. Baritompa, W.P.: Customized method for global optimizationa geometric viewpoint.
J. Global Optim. 3, 193212 (1993)
5. Baritompa, W.P.: Accelerations for a variety of global optimization methods. J. Global Optim.
4, 3745 (1994)
6. Barkalov, K., Ryabov, V., Sidorov, S.: Parallel scalable algorithms with mixed local-global
strategy for global optimization problems. In: Hsu, C.H., Malyshkin, V. (eds.) MTPP 2010.
LNCS 6083, pp. 232240. Springer, Berlin (2010)
7. Bomze, I.M., Csendes, T., Horst, R., Pardalos, P.M.: Developments in Global Optimization.
Kluwer, Dordrecht (1997)
8. Breiman, L., Cutler, A.: A deterministic algorithm for global optimization. Math. Program.
58, 179199 (1993)
9. Butz, A.R.: Space filling curves and mathematical programming. Inform. Contr. 12, 313330
(1968)
10. Calvin, J., Zilinskas,

A.: One-dimensional global optimization for observations with noise.
Comput. Math. Appl. 50, 157169 (2005)
11. Cantor, G.: Ein Beitrag zur Mannigfaltigkeitslehre. Journal fur die reine und angewandte
Mathematik (Crelles Journal) 84, 242258 (1878)
12. Carotenuto, L., Pugliese, P., Sergeyev, Ya.D.: Maximizing performance and robustness of PI
and PID controllers by global optimization. Int. J. Contr. Intell. Syst. 34, 225235 (2006)
13. Carter, R.G., Gablonsky, J.M., Patrick, A., Kelley, C.T., Eslinger, O.J.: Algorithms for noisy
problems in gas transmission pipeline optimization. Optim. Eng. 2, 139157 (2001)
14. Casado, L.G., Garca, I., Martnez, J.A., Sergeyev, Ya.D.: New interval analysis support
functions using gradient information in a global minimization algorithm. J. Global Optim.
25, 345362 (2003)
15. Casado, L.G., Garca, I., Sergeyev, Ya.D.: Interval Branch and Bound global optimization for
finding the first zero-crossing in one-dimensional functions. Reliable Comput. 6, 179191
(2000)

Curves, SpringerBriefs in Optimization, DOI 10.1007/978-1-4614-8042-6,
119
120
References
16. Casado, L.G., Garca, I., Sergeyev, Ya.D.: Interval algorithms for finding the minimal root in
a set of multiextremal non-differentiable one-dimensional functions. SIAM J. Sci. Comput.
24, 359376 (2002)
17. Clausen, J., Zilinskas,

A.: Subdivision, sampling, and initialization strategies for simplical
branch and bound in global optimization. Comput. Math. Appl. 44, 943955 (2002)
18. Daponte, P., Grimaldi, D., Molinaro, A., Sergeyev, Ya.D.: An algorithm for finding the zero
crossing of time signals with Lipschitzean derivatives. Measurement 16, 3749 (1995)
19. Daponte, P., Grimaldi, D., Molinaro, A., Sergeyev, Ya.D.: Fast detection of the first
zero-crossing in a measurement signal set. Measurement 19, 2939 (1996)
20. Devaney, R.L.: An Introduction to Chaotic Dynamical Systems. Westview Press Inc,
New York (2003)
21. Dixon, L.C.W., Szego, G.P. (eds.): Towards Global Optimization, vol. 2. North-Holland,
Amsterdam (1978)
22. Evtushenko, Yu. G.: Numerical methods for finding global extrema of a nonuniform mesh.
USSR Comput. Math. Math. Phys. 11, 13901403 (1971)
23. Evtushenko, Yu. G.: Numerical Optimization Techniques. Translation Series in Mathematics
and Engineering. Optimization Software Inc., Publication Division, New York (1985)
24. Evtushenko, Yu. G., Posypkin, M.A.: An application of the nonuniform covering method to
global optimization of mixed integer nonlinear problems. Comput. Math. Math. Phys. 51,
12861298 (2011)
25. Falconer, K.: Fractal Geometry: Mathematical Foundations and Applications. Wiley,
Chichester (1995)
26. Famularo, D., Pugliese, P., Sergeyev, Ya.D.: A global optimization technique for checking
parametric robustness. Automatica 35, 16051611 (1999)
27. Famularo, D., Pugliese, P., Sergeyev, Ya.D.: A global optimization technique for fixed-order
control design. Int. J. Syst. Sci. 35, 425434 (2004)
28. Floudas, C.A.: Deterministic Global Optimization: Theory, Methods and Applications.
Kluwer, Dordrecht (1999)
29. Floudas, C.A., Pardalos, P.M.: Recent Advances in Global Optimization. Princeton University
Press, Princeton (1992)
30. Floudas, C.A., Pardalos, P.M.: State of the Art in Global Optimization. Kluwer, Dordrecht
(1996)
31. Gablonsky, M.J.: Modifications of the DIRECT Algorithm. Ph.D thesis, North Carolina State
University, Raleigh, NC (2001)
32. Gablonsky, M.J.: DIRECT v2.04 FORTRAN code with documentation. http://www4.ncsu.
edu/ctk/SOFTWARE/DIRECTv204.tar.gz (2001)
33. Gablonsky, M.J.: An implemention of the Direct Algorithm. Technical report CRSC-TR0430, Center for Research in Scientific Computation, North Carolina State University, Raleigh,
NC (2004)
34. Gablonsky, M.J., Kelley, C.T.: A locally-biased form of the DIRECT Algorithm. J. Global
Optim. 21, 2737 (2001)
35. Galperin, E.A.: The cubic algorithm. J. Math. Anal. Appl. 112, 635640 (1985)
36. Gaviano, M., Lera, D.: Test functions with variable attraction regions for global optimization
problems. J. Global Optim. 13, 207223 (1998)
37. Gaviano, M., Lera, D.: Complexity of general continuous minimization problems: a survey.
Optim. Meth. Software 20, 525544 (2005)
38. Gaviano, M., Lera, D.: A global minimization algorithm for Lipschitz functions. Optim. Lett.
2, 113 (2008)
39. Gaviano, M., Lera, D.: Properties and numerical testing of a parallel global optimization
algorithm. Numer. Algorithm 60, 613629 (2012)
40. Gaviano, M., Kvasov, D.E., Lera, D., Sergeyev, Ya.D.: Software for generation of classes
of test functions with known local and global minima for global optimization. ACM Trans.
Math. Software 29, 469480 (2003)
References
121
41. Gaviano, M., Lera, D., Steri, A.M.: A local search method for continuous global optimization.
J. Global Optim. 48, 7385 (2010)
42. Gelfand, I., Raikov, D., Shilov, G.: Commutative Normed Rings. AMS Chelsea Publishing,
New York (1991)
43. Gergel, V.P.: A global search algorithm using derivatives. In: Systems Dynamics and
Optimization, pp. 161178. N.Novgorod University Press, N. Novgorod (1992) (In Russian)
44. Gergel, V.P., Sergeyev, Ya.D.: Sequential and parallel global optimization algorithms using
derivatives. Comput. Math. Appl. 37, 163180 (1999)
45. Gergel, V.P., Strongin, R.G.: Multiple Peano curves in recognition problems. Pattern Recogn.
Image Anal. 2, 161164 (1992)
46. Gergel, V.P., Strongin, L.G., Strongin, R.G.: Neighbourhood method in recognition problems.
Soviet J. Comput. Syst. Sci. 26, 4654 (1988)
47. Glover, F., Kochenberger, G.A.: Handbook on Metaheuristics, Kluwer, Dordrecht (2003)
48. Gornov, A.Yu., Zarodnyuk, T.S.: A method of stochastic coverings for optimal control
problems. Comput. Technol. 17, 3142 (2012) (In Russian)
49. Gorodetsky, S.Yu.: Multiextremal optimization based on domain triangulation. The Bulletin
of Nizhni Novgorod Lobachevsky University: Math. Model. Optim. Contr. 21, 249268
(1999) (In Russian)
50. Gorodetsky, S.Yu.: Paraboloid triangulation methods in solving multiextremal optimization
problems with constraints for a class of functions with lipschitz directional derivatives. The
Bulletin of Nizhni Novgorod Lobachevsky University: Math. Model. Optim. Contr. 1, 144
155 (2012) (In Russian)
51. Gorodetsky, S.Yu., Grishagin, V.A.: Nonlinear Programming and Multiextremal Optimization. NNGU Press, Nizhni Novgorod (2007) (In Russian)
52. Gourdin, E., Jaumard, B., Ellaia, R.: Global optimization of Holder functions. J. Global
Optim. 8, 323348 (1996)
53. Grishagin, V.A.: Operation characteristics of some global optimization algorithms. Prob.
Stoch. Search 7, 198206 (1978) (In Russian)
54. Grishagin, V.A.: On convergence conditions for a class of global search algorithms. In: Proceedings of the 3-rd All-Union Seminar Numerical Methods of Nonlinear Programming,
Kharkov, pp. 8284 (1979)
55. Grishagin, V.A.: On properties of a class of optimization algorithms. Transactions of the 3-rd
Conference of Young Scientists of Applied Mathematics and Cybernetics Research Institute
of Gorky University, Gorky, pp. 5058. Deposited with VINITI, Aug.14, 1984, No.583684
Dep. (1983)
56. Grishagin, V.A., Sergeyev, Ya.D., Strongin, R.G.: Parallel characteristical global optimization
algorithms. J. Global Optim. 10, 185206 (1997)
57. Hanjoul, P., Hansen, P., Peeters, D., Thisse, J.F.: Uncapacitated plant location under alternative space price policies. Manag. Sci. 36, 4147 (1990)
58. Hansen, P., Jaumard, B.: Lipshitz optimization. In: Horst, R., Pardalos, P.M. (eds.) Handbook
of Global Optimization, pp. 407493. Kluwer, Dordrecht (1995)
59. Hansen, P., Jaumard, B., Lu, S.H.: Global optimization of univariate Lipschitz functions: 2.
New algorithms and computational comparison. Math. Program. 55, 273293 (1992)
60. Hastings, H.M., Sugihara, G.: Fractals: A Users Guide for the Natural Sciences. Oxford
University Press, Oxford (1994)
61. Hendrix, E.M.T., G.-Toth, B.: Introduction to Nonlinear and Global Optimization. Springer,
New York (2010)
62. Hilbert, D.: Uber

die steitige abbildung einer linie auf ein flachenstuck. Math. Ann. 38,
459460 (1891)
63. Horst, R., Pardalos, P.M.: Handbook of Global Optimization. Kluwer, Dordrecht (1995)
64. Horst, R., Pardalos, P.M., Thoai, N.V.: Introduction to Global Optimization. Kluwer,
Dordrecht (1995)
65. Horst, R., Tuy, H.: Global Optimization: Deterministic Approaches. Springer, Berlin (1996)
122
References
66. Iudin, D.I., Sergeyev, Ya.D., Hayakawa, M.: Interpretation of percolation in terms of infinity
computations. Appl. Math. Comput. 218, 80998111 (2012)
67. Jones, D.R., Perttunen, C.D., Stuckman, B.E.: Lipschitzian optimization without the Lipschitz
constant. J. Optim. Theor. Appl. 79, 157181 (1993)
68. Kiatsupaibul, S., Smith, R.L.: On the solution of infinite horizon optimization problems
through global optimization algorithms. Tech. Report 9819, DIOE, University of Michigan,
Ann Arbor (1998)
69. Kushner, H.: A new method for locating the maximum point of an arbitrary multipeak curve
in presence of noise. J. Basic Eng. 86, 97106 (1964)
70. Kvasov, D.E., Sergeyev, Ya.D.: A univariate global search working with a set of Lipschitz
constants for the first derivative. Optim. Lett. 3, 303318 (2009)
71. Kvasov, D.E., Sergeyev, Ya.D.: Univariate geometric Lipschitz global optimization
algorithms. Numer. Algebra Contr. Optim. 2, 6990 (2012)
72. Kvasov, D.E., Sergeyev, Ya.D.: Lipschitz gradients for global optimization in a one-pointbased partitioning scheme. J. Comput. Appl. Math. 236, 40424054 (2012)
73. Kvasov, D.E., Menniti, D., Pinnarelli, A., Sergeyev, Ya.D., Sorrentino, N.: Tuning fuzzy
power-system stabilizers in multi-machine systems by global optimization algorithms based
on efficient domain partitions. Elec. Power Syst. Res. 78, 12171229 (2008)
74. Lera, D., Sergeyev, Ya.D.: Global minimization algorithms for Holder functions. BIT 42,
119133 (2002)
75. Lera, D., Sergeyev, Ya.D.: An information global minimization algorithm using the local
improvement technique. J. Global Optim. 48, 99112 (2010)
76. Lera, D., Sergeyev, Ya.D.: Lipschitz and Holder global optimization using space-filling
curves. Appl. Numer. Math. 60, 115129 (2010)
77. Lera, D., Sergeyev, Ya.D.: Acceleration of univariate global optimization algorithms working
with Lipschitz functions and Lipschitz first derivatives. SIAM J. Optim. 23(1), 508529
(2013)
78. Liuzzi, G., Lucidi, S., Piccialli, V.: A partition-based global optimization algorithm. J. Global
Optim. 48, 113128 (2010)
79. Locatelli, M.: On the multilevel structure of global optimization problems. Comput. Optim.
Appl. 30, 522 (2005)
80. Mandelbrot, B.: Les objets fractals: forme, hasard et dimension. Flammarion, Paris (1975)
81. Martnez, J.A., Casado, L.G., Garca, I., Sergeyev, Ya.D., G.-Toth, B.: On an efficient use
of gradient information for accelerating interval global optimization algorithms. Numer.
Algorithm 37, 6169 (2004)
82. Mockus, J.: Bayesian Approach to Global Optimization. Kluwer, Dordrecht (1988)
83. Molinaro, A., Pizzuti, C., Sergeyev, Ya.D.: Acceleration tools for diagonal information global
optimization algorithms. Comput. Optim. Appl. 18, 526 (2001)
84. Moore, E.H.: On certain crinkly curves. Trans. Am. Math. Soc. 1, 7290 (1900)
85. Netto, E.: Beitrag zur Mannigfaltigkeitslehre. Journal fur die reine und angewandte Mathematik (Crelles Journal) 86, 263268 (1879)
86. Paulavicius, R., Zilinskas,

J.: Analysis of different norms and corresponding Lipschitz
constants for global optimization in multidimensional case. Inform. Technol. Contr. 36, 383
387 (2007)
87. Paulavicius, R., Zilinskas,

J., Grothey, A.: Investigation of selection strategies in branch and
bound algorithm with simplicial partitions and combination of Lipschitz bounds. Optim. Lett.
4, 173183 (2010)
88. Peano, G.: Sur une courbe, qui remplit toute une aire plane. Math. Ann. 36, 157160 (1890)
89. Peitgen, H.-O., Jurgens, H., Saupe, D.: Chaos and Fractals. Springer, New York (1992)
90. Pickover, C.A.: Chaos and Fractals: A Computer Graphical Journey. Elsevier, Amsterdam
(1998)
91. Pijavskii, S.A.: An algorithm for finding the absolute minimum of a function. Optimum
Decision Theory 2. Inst. of Cybern. of the Acad. of Sci. of the Ukr. SSR, Kiev, 1324 (1967)
(In Russian)
References
123
92. Pijavskii, S.A.: An algorithm for finding the absolute extremum of a function. USSR Comput.
Math. Math. Phys. 12, 5767 (1972)
93. Pinter, J.: Global Optimization in Action (Continuous and Lipschitz Optimization:
Algorithms, Implementations and Applications). Kluwer, Dordrecht (1996)
94. Pinter, J.: Global optimization: software, test problems, and applications. In: Pardalos, P.M.,
Romeijn, H.E. (eds.) Handbook of Global Optimization, vol. 2, pp. 515569. Kluwer,
Dordrecht (2002)
95. Platzman, L.K., Bartholdi, J.J. III: Spacefilling curves and the planar travelling salesman
problem. J. ACM 36, 719737 (1989)
96. Press, W.H., Teukolsky, S.A., Vettering, W.T., Flannery, B.P.: Numerical Recipes in Fortran,
The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge (1992)
97. Rastrigin, L.A.: Random Search in Optimization Problems for Multiparameter Systems. Air
Force System Command, Foreign Technical Division, FTD-HT-67-363 (1965)
98. Ratscek, H., Rokne, J.: New Computer Methods for Global Optimization. Ellis Horwood,
Chichester (1988)
99. Sagan, H.: Space-Filling Curves. Springer, New York (1994)
100. Sergeyev, Ya.D.: A one-dimensional deterministic global minimization algorithm. Comput.
Math. Math. Phys. 35, 705717 (1995)
101. Sergeyev, Ya.D.: An information global optimization algorithm with local tuning. SIAM
J. Optim. 5, 858870 (1995)
102. Sergeyev, Ya.D.: A method using local tuning for minimizing functions with Lipschitz
derivatives. In: Bomze, E., Csendes, T., Horst, R., Pardalos, P.M. (eds.) Developments in
Global Optimization, pp. 199215. Kluwer, Dordrecht (1997)
103. Sergeyev, Ya.D.: Global one-dimensional optimization using smooth auxiliary functions.
Math. Program. 81, 127146 (1998)
104. Sergeyev, Ya.D.: On convergence of Divide the Best global optimization algorithms.
Optimization 44, 303325 (1998)
105. Sergeyev, Ya.D.: Parallel information algorithm with local tuning for solving multidimensional GO problems. J. Global Optim. 15, 157167 (1999)
106. Sergeyev, Ya.D.: Multidimensional global optimization using the first derivatives. Comput.
Math. Math. Phys. 39, 743752 (1999)
107. Sergeyev Ya.D.: An efficient strategy for adaptive partition of N-dimensional intervals in the
framework of diagonal algorithms. J. Optim. Theor. Appl. 107, 145168 (2000)
108. Sergeyev, Ya.D.: Efficient partition of N-dimensional intervals in the framework of one-pointbased algorithms. J. Optim. Theor. Appl. 124, 503510 (2005)
109. Sergeyev, Ya.D.: Univariate global optimization with multiextremal nondifferentiable
constraints without penalty functions. Comput. Optim. Appl. 34, 229248 (2006)
110. Sergeyev, Ya.D.: Blinking fractals and their quantitative analysis using infinite and
infinitesimal numbers. Chaos Solitons Fract. 33, 5075 (2007)
111. Sergeyev, Ya.D.: Evaluating the exact infinitesimal values of area of Sierpinskis carpet and
volume of Mengers sponge. Chaos Solitons Fract. 42, 30423046 (2009)
112. Sergeyev, Ya.D.: Using blinking fractals for mathematical modelling of processes of growth
in biological systems. Informatica 22, 559576 (2011)
113. Sergeyev, Ya.D., Grishagin, V.A.: Sequential and parallel algorithms for global optimization.
Optim. Meth. Software 3, 111124 (1994)
114. Sergeyev, Ya.D., Grishagin, V.A.: A parallel method for finding the global minimum of
univariate functions. J. Optim. Theor. Appl. 80, 513536 (1994)
115. Sergeyev, Ya.D., Grishagin, V.A.: Parallel asynchronous global search and the nested
optimization scheme. J. Comput. Anal. Appl. 3, 123145 (2001)
116. Sergeyev, Ya.D., Kvasov, D.E.: Global search based on efficient diagonal partitions and a set
of Lipschitz constants. SIAM J. Optim. 16, 910937 (2006)
117. Sergeyev, Ya.D., Kvasov, D.E.: Diagonal Global Optimization Methods. FizMatLit, Moscow
(2008) (In Russian)
124
References
118. Sergeyev, Ya.D., Kvasov, D.E.: Lipschitz global optimization. In: Cochran, J.J., Cox, L.A.,
Keskinocak, P., Kharoufeh, J.P., Smith, J.C. (eds.) Wiley Encyclopaedia of Operations
Research and Management Science, vol. 4, pp. 28122828. Wiley, New York (2011)
119. Sergeyev, Ya.D., Markin, D.L.: An algorithm for solving global optimization problems with
nonlinear constraints. J. Global Optim. 7, 407419 (1995)
120. Sergeyev, Ya.D., Strongin, R.G.: A global minimization algorithm with parallel iterations.
Comput. Math. Math. Phys. 29, 715 (1990)
121. Sergeyev, Ya.D., Daponte, P., Grimaldi, D., Molinaro, A.: Two methods for solving optimization problems arising in electronic measurements and electrical engineering. SIAM J. Optim.
10, 121 (1999)
122. Sergeyev, Ya.D., Famularo, D., Pugliese, P.: Index Branch-and-Bound Algorithm for Lipschitz univariate global optimization with multiextremal constraints. J. Global Optim. 21,
317341 (2001)
123. Sergeyev, Ya.D., Pugliese, P., Famularo, D.: Index information algorithm with local tuning
for solving multidimensional global optimization problems with multiextremal constraints.
Math. Program. 96, 489512 (2003)
124. Sergeyev, Ya.D., Kvasov, D., Khalaf, F.M.H.: A one-dimensional local tuning algorithm for
solving GO problems with partially defined constraints. Optim. Lett. 1, 8599 (2007)
125. Sierpinski, W.: O krzywych, wypelniajacych kwadrat. Prace Mat.-Fiz. 23, 193219 (1912)
126. Stephens, C.P., Baritompa, W.P.: Global optimization requires global information. J. Optim.
Theor. Appl. 96, 575588 (1998)
127. Strekalovsky, A.S.: Global optimality conditions for nonconvex optimization. J. Global
Optim. 4, 415434 (1998)
128. Strekalovsky, A.S.: Elements of Nonconvex Optimization. Nauka, Novosibirsk (2003)
(In Russian)
129. Strekalovsky, A.S., Orlov, A.V., MalyshevA.V.: On computational search for optimistic
solutions in bilevel problems. J. Global Optim. 48, 159 172 (2010)
130. Strigul, O.I.: Search for a global extremum in a certain subclass of functions with the Lipschitz
condition. Cybernetics 6, 7276 (1985)
131. Strongin, R.G.: On the convergence of an algorithm for finding a global extremum. Eng.
Cybern. 11, 549555 (1973)
132. Strongin, R.G.: Numerical Methods in Multiextremal Problems. Nauka, Moskow (1978)
(In Russian)
133. Strongin, R.G.: The information approach to multiextremal optimization problems. Stoch.
Stoch. Rep. 27, 6582 (1989)
134. Strongin, R.G.: Search for Global Optimum. Series of Mathematics and Cybernetics 2.
Znanie, Moscow (1990) (In Russian)
135. Strongin, R.G.: Algorithms for multi-extremal mathematical programming problems employing the set of joint space-filling curves. J. Global Optim. 2, 357378 (1992)
136. Strongin, R.G., Gergel, V.P.: On realization of the generalized multidimensional global
search algorithm on a computer. Problems of Cybernetics. Stochastic Search in Optimization
Problems. Scientific Council of Academy of Sciences of USSR for Cybernetics, Moscow
(1978) (In Russian)
137. Strongin, R.G., Markin, D.L.: Minimization of multiextremal functions with nonconvex
constraints. Cybernetics 22, 486493 (1986)
138. Strongin, R.G., Sergeyev, Ya.D.: Global multidimensional optimization on parallel computer.
Parallel Comput. 18, 12591273 (1992)
139. Strongin, R.G., Sergeyev, Ya.D.: Global Optimization with Non-convex Constraints: Sequential and Parallel Algorithms. Kluwer, Dordrecht (2000)
140. Strongin, R.G., Sergeyev, Ya.D.: Global optimization: fractal approach and non-redundant
parallelism. J. Global Optim. 27, 2550 (2003)
141. Sukharev, A.G.: Global extrema and methods of its search. In: Moiseev, N.N., Krasnoshchekov, P.S. (eds.) Mathematical Methods in Operations Research, pp. 437. Moscow
University, Moscow (1981) (In Russian)
References
125
142. Sukharev, A.G.: Minimax Algorithms in Problems of Numerical Analysis. Nauka, Moscow
(1989) (In Russian)
143. Tawarmalani, M., Sahinidis, N.V.: Convexification and Global Optimization in Continuous
and MixedInteger Nonlinear Programming: Theory, Algorithms, Software, and Applications. Kluwer, Dordrecht (2002)
144. Timonov, L.N.: An algorithm for search of a global extremum. Eng. Cybern. 15, 3844 (1977)
145. Torn, A., Ali, M.M., Viitanen, S.: Stochastic global optimization: problem classes and
solution techniques. J. Global Optim. 14, 437447 (1999)
146. Torn, A.A., Zilinskas,

A.: Global Optimization. Lecture Notes in Computer Science, vol. 350.
Springer, Berlin (1989)
147. Vanderbei, R.J.: Extension of Piyavskiis algorithm to continuous global optimization.
J. Global Optim. 14, 205216 (1999)
148. Vita, M.C., De Bartolo, S., Fallico, C., Veltri, M.: Usage of infinitesimals in the Mengers
Sponge model of porosity. Appl. Math. Comput. 218, 81878196 (2012)
149. Watson, L.T., Baker, C.: A fully-distributed parallel global search algorithm. Eng. Comput.
18, 155169 (2001)
150. Wood, G.R.: Multidimensional bisection and global optimization. Comput. Math. Appl. 21,
161172 (1991)
151. Wood, G.R., Zhang Baoping: Estimation of the Lipschitz constant of a function. J. Global
Optim. 8, 91103 (1996)
152. Baoping, Z., Wood, G.R., Baritompa, W.: Multidimensional bisection: The performance and
the context. J. Global Optim. 3, 337358 (1993)
153. Zhigljavsky, A.A.: Mathematical Theory of the Global Random Search. St. Petersburg
University Press, St. Petersburg (1985) (In Russian)
154. Zhigljavsky, A.A.: Theory of Global Random Search. Kluwer, Dordrecht (1991)
155. Zhigljavsky, A.A.: Stochastic global optimization. In: Lovric, M. (ed.) International Encyclopedia of Statistical Science, pp. 15211524. Springer, New York (2011)
156. Zhigljavsky, A.A., Zilinskas,

A.: Stochastic Global Optimization. Springer, New York (2008)
157. Zilinskas,
A.: One-step Bayesian method for the search of the optimum of one-variable
functions. Cybernetics 1, 139144 (1975)
158. Zilinskas,
A., Mockus, J.: On one Bayesian method of search of the minimum. Avtomatica i
Vychislitelnaya Teknika 4, 4244 (1972) (In Russian)

Roman

Загружено:

Сведения о документе

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

Roman

Загружено:

Авторское право:

Доступные форматы

SPRINGER BRIEFS IN OPTIMIZATION

For further volumes:

Yaroslav D. Sergeyev Roman G. Strongin

Before I speak, I have something important to say.

at dozens of international congresses. Together with their collaborators the authors

the optimization process and to strategies leading to a substantial acceleration

2 Approximations to Peano Curves: Algorithms and Software .. . . . . . . . . .

3 Global Optimization Algorithms Using Curves to Reduce

4 Ideas for Acceleration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

4.3 Acceleration of Multidimensional Geometric Algorithms .. . . . . . . . . . . 102

Everyone knows what a curve is, until he has studied enough

1.1 Examples of Space-Filling Curves

Fig. 1.1 The paper of G. Peano published in Mathematische Annalen in 1890

1.1 Examples of Space-Filling Curves

The paper D. Hilbert, Uber

Fig. 1.3 Construction of the original Peano curve

1.1 Examples of Space-Filling Curves

Fig. 1.4 Construction of Moores version of Peano curve

Fig. 1.6 Generation of the

1.2 Statement of the Global Optimization Problem

where the domain of the search

1.2 Statement of the Global Optimization Problem

|F(y ) F(y )| L||y y || = L

Obviously, if the domain of the search is a hyperinterval

then, by introducing the transformation

Approximations to Peano Curves: Algorithms

In rallying every curve, every hill may be different than you

2.1 Space-Filling Curves and Reduction of Dimensionality

Y.D. Sergeyev et al., Introduction to Global Optimization Exploiting Space-Filling

2 Approximations to Peano Curves: Algorithms and Software

D(3,3) D(3,0) D(2,3) D(2,2)

D(3,2) D(3,1) D(2,0) D(2,1)

D(0,1) D(0,2) D(1,3) D(1,2)

z2 , 0 z2 2N 1. Each particular subcube obtained by such a partitioning of D(z1 )

2.1 Space-Filling Curves and Reduction of Dimensionality

Present the left-end-point v of the subinterval

in the binary form

where 1 , 2 , . . . , MN are binary digits (i.e., i = 0 or i = 1). From (2.1.3), (2.1.4)

2 Approximations to Peano Curves: Algorithms and Software

|F(y(x )) F(y(x ))| 2L N + 3(|x x |)1/N ,

2.2 Approximations to Peano Curves

||y(x ) y(x )|| =

||y(x ) y(x )|| 2 N + 3(|x x |)1/N ,

2.2 Approximations to Peano Curves

2 Approximations to Peano Curves: Algorithms and Software

and designate (s), 0 s 2N 1, the subcubes constituting the first partition

where s is from (2.2.3) and s is also given in the binary form

2.2 Approximations to Peano Curves

are in contradiction with (2.2.11). Case k + 1 = N, in consideration of (2.2.7),

2 Approximations to Peano Curves: Algorithms and Software

otherwise, i.e., if 1 < k N,

Hence, the second statement of the Theorem is also proved.

meeting the condition

2.2 Approximations to Peano Curves

The addition of some binary vector q RN to the vectors ut (s), 0 s 2N 1,

ui (s) = (uti (s) + qi )mod 2,

2 Approximations to Peano Curves: Algorithms and Software

Next, we introduce the integer function

where s, 2 s 2N 2, is an even integer, j1 is from (2.2.3), and

As it follows from (2.2.2), (2.2.3), (2.2.12), (2.2.14), and (2.2.25),

ui (s 1), i = l(s),

2.2 Approximations to Peano Curves

Fig. 2.3 Case N = 2, M = 2.

2 Approximations to Peano Curves: Algorithms and Software

|F(y(x )) F(y(x ))| 2L N + 3(|x x |)1/N ,

||y(x ) y(x )|| 2 N + 3(|x x |)1/N ,

where s is from (2.2.3) and s is also given in the binary form

ui (s 1), i = l(s),

ui (z1 1), i = 1,

ui (z1 ), i = 1, i = l(z1 ),

wi (z1 ), i = l(z1 ),

ui (z1 + 1), i = 1,