这是用户在 2024-5-30 14:50 为 https://ieeexplore.ieee.org/document/7942105 保存的双语快照页面,由 沉浸式翻译 提供双语支持。了解如何保存?
双层优化综述:从经典到进化的方法和应用 |IEEE期刊和杂志 |IEEE Xplore(IEEE Xplore) --- A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications | IEEE Journals & Magazine | IEEE Xplore

A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications
双能级优化综述:从经典到进化的方法与应用

Publisher: IEEE 发行商: IEEE
Ankur Sinha; Pekka Malo; Kalyanmoy Deb
安库尔·辛哈;佩卡·马洛;卡利安莫伊·德布

Abstract: 抽象:

Bilevel optimization is defined as a mathematical program, where an optimization problem contains another optimization problem as a constraint. These problems have receiv...
双层优化被定义为一个数学程序,其中优化问题包含另一个优化问题作为约束。这些问题已经收到...
View more 查看更多

Abstract: 抽象:

Bilevel optimization is defined as a mathematical program, where an optimization problem contains another optimization problem as a constraint. These problems have received significant attention from the mathematical programming community. Only limited work exists on bilevel problems using evolutionary computation techniques; however, recently there has been an increasing interest due to the proliferation of practical applications and the potential of evolutionary algorithms in tackling these problems. This paper provides a comprehensive review on bilevel optimization from the basic principles to solution strategies; both classical and evolutionary. A number of potential application problems are also discussed. To offer the readers insights on the prominent developments in the field of bilevel optimization, we have performed an automated text-analysis of an extended list of papers published on bilevel optimization to date. This paper should motivate evolutionary computation researchers to pay more attention to this practical yet challenging area.
双层优化被定义为一个数学程序,其中优化问题包含另一个优化问题作为约束。这些问题受到了数学规划界的极大关注。使用进化计算技术对双能级问题的研究有限;然而,最近由于实际应用的激增以及进化算法在解决这些问题方面的潜力,人们越来越感兴趣。本文从基本原理到求解策略对双层优化进行了全面综述;既是经典的,也是进化的。还讨论了一些潜在的应用问题。为了让读者深入了解双能级优化领域的突出发展,我们对迄今为止发表的关于双能级优化的论文进行了自动文本分析。这篇论文应该会激励进化计算研究人员更多地关注这个实用但具有挑战性的领域。
Published in: IEEE Transactions on Evolutionary Computation ( Volume: 22, Issue: 2, April 2018)
发表于: IEEE Transactions on Evolutionary Computation ( Volume: 22, Issue: 2, April 2018)
Page(s): 276 - 295 页数: 276 - 295
Date of Publication: 07 June 2017
发布日期: 2017 年 6 月 7 日

ISSN Information:  ISSN信息:

DOI: 10.1109/TEVC.2017.2712906
DOI: 10.1109/TEVC.2017.2712906
Publisher: IEEE 发行商: IEEE

Funding Agency:  资助机构:


SECTION I. 第一节

Introduction 介绍

Many large-scale optimization and decision-making processes faced by public and private organizations are hierarchical in the sense that the realized outcome of any solution or decision taken by the upper level authority (leader) to optimize their goals is affected by the response of lower level entities (follower), who will seek to optimize their own outcomes. Fig. 1 illustrates a general bilevel problem solving structure involving interlinked optimization and decision-making tasks at both levels. The figure shows that for any given upper level decision vector, there is a corresponding (parametric) lower level optimization problem to be solved that provides the rational (optimal) response of the follower for the leader’s decision. The leader’s decision vector is represented by xu and the follower’s decision vector is represented by xl . An (xu , xl ) pair, where xl is an optimal response to xu represents a feasible solution to the upper level optimization problem provided that it also satisfies the constraints in the problem. Each level has its own objectives and constraints. One aspect of bilevel problems is that it is not symmetric in terms of two levels. The upper level decision maker usually has complete knowledge of the lower level problem, while the lower level decision maker only observes the decisions of the leader and then optimizes its own strategies. Interestingly, an incomplete knowledge about the follower’s optimization problem to the leader may lead to bilevel optimization problems involving uncertainties.
公共和私人组织面临的许多大规模优化和决策过程都是分层的,因为上级机构(领导者)为优化其目标而采取的任何解决方案或决策的实现结果都受到下级实体(追随者)的反应的影响,他们将寻求优化自己的结果。 Fig. 1 说明了一般的双层问题解决结构,涉及两个层次的相互关联的优化和决策任务。该图显示,对于任何给定的上层决策向量,都有一个相应的(参数化)下层优化问题需要解决,该问题为领导者的决策提供了追随者的合理(最优)响应。领导者的决策向量用 xu 表示,追随者的决策向量用 xl 表示。( xu xl ) 对,其中 xl 是 的最 xu 优响应,表示上层优化问题的可行解,前提是它也满足问题中的约束条件。每个级别都有自己的目标和约束。双能级问题的一个方面是它在两个能级方面是不对称的。上层决策者通常对下层问题有充分的了解,而下层决策者只观察领导者的决策,然后优化自己的策略。有趣的是,对领导者的追随者优化问题的不完全了解可能会导致涉及不确定性的双层优化问题。

Fig. 1. - General sketch of a bilevel problem.
Fig. 1.  图 1.

General sketch of a bilevel problem.
双层问题的一般草图。

It is not uncommon that the objectives of generally profit-seeking private agents can well be in conflict with those of the controlling authority. What makes such bilevel problem-solving tasks highly relevant is that they are typically characterized by very large spillover effects to the economy as well as the surrounding environment. Given the far-reaching future impacts of the decisions, it is not surprising that the interest toward bilevel programming has grown strong especially among researchers and practitioners dealing with large-scale public sector decision-making problems. For instance, farmers often tend to overuse fertilizers to increase the productivity, which leads to negative externalities such as pollution. Whittaker et al. [176] used a bilevel model to design policy measures to control the overuse of fertilizers and its negative impact on the environment. Apart from problems concerned with environmentally sensitive decisions (such as allocation of mining permits or controlling the use of fertilizers), there has been widespread interest across a number of fields in operations research. A good example is homeland security, where bilevel as well as trilevel optimization frameworks have been utilized in problems ranging from interdiction of nuclear-weapons projects to defending critical infrastructure and solving border security problems [9], [32], [33], [173]. In addition to the public sector challenges, there is abundant research on bilevel decision-making problems in economics, logistics, as well as diverse areas of computer science.
一般追求利润的私人代理人的目标很可能与控制当局的目标相冲突,这种情况并不少见。使这种双层次问题解决任务高度相关的是,它们通常具有对经济和周围环境非常大的溢出效应。鉴于这些决定对未来的深远影响,对双层次规划的兴趣日益浓厚也就不足为奇了,尤其是在处理大规模公共部门决策问题的研究人员和从业人员中。例如,农民往往倾向于过度使用化肥来提高生产力,从而导致污染等负面外部性。Whittaker等人 [176] 使用双层次模型设计了政策措施,以控制肥料的过度使用及其对环境的负面影响。除了与环境敏感决策有关的问题(例如采矿许可证的分配或控制化肥的使用)外,运筹学的许多领域都引起了广泛的兴趣。一个很好的例子是国土安全,其中双层和三层优化框架已被用于从拦截核武器项目到保卫关键基础设施和解决边境安全问题 [9] 等问题 [173][32] [33] 除了公共部门的挑战外,还有关于经济学、物流以及计算机科学不同领域的双层次决策问题的大量研究。

The research on decision-making problems with hierarchical leader-follower structures (bilevel optimization) can be traced to two roots. The first root is in the domain of game theory, where von Stackelberg [156] used bilevel programming to build descriptive models of decision behavior and establish game-theoretic equilibria. The second root is in the domain of mathematical programming, where the problems appeared as bilevel optimization problems containing a nested inner optimization problem as a constraint of an outer optimization problem [29]. Since then a substantial body of mathematical literature on bilevel optimization has emerged. Given that the hierarchical optimization structure may introduce difficulties such as nonconvexity and disconnectedness even for simpler instances of bilevel optimization, the problems have turned out to be surprisingly difficult to handle mathematically. Bilevel programming is known to be strongly NP-hard [79], and it has been proven that merely evaluating a solution for optimality is also an NP-hard task [165]. Even in the simplest case of linear bilevel programs, where the lower level problem has a unique optimal solution for all the parameters, it is not likely to find a polynomial algorithm that is capable of solving the linear bilevel program to global optimality. The proof for the nonexistence of a polynomial time algorithm for linear bilevel problems can be found in [63].

Due to lack of well-established solution procedures, a complex practical problem is usually modified into simpler single level optimization task, which is solved to arrive at a satisfying instead of an optimal solution. For the complex bilevel problems, classical methods often fail due to real world difficulties such as nonlinearity and discreteness. Under such circumstances, evolutionary methods can be useful tools to offset some of these difficulties. Recent initiatives on bilevel optimization using evolutionary algorithms suggest that a coordinated effort on bilevel optimization by the evolutionary community could help make significant progress on this challenging class of optimization problems (e.g., [11], [56], [144], and [151]).

Fig. 2 provides a network map of different themes on bilevel optimization that have been studied since 1950s. The network map shows different theoretical and application topics that have evolved under bilevel optimization. Each link in the map connects a subtopic with a higher level topic that are differentiated by font sizes. Subtopics connected with a link denote an overlap. To provide a more comprehensive overview on the past as well as recent developments in the field of bilevel optimization, we have organized this review paper along the three lines; theory, applications, and text-analysis of the entire bilevel literature body. First, to formalize the notion of bilevel programming, we begin by introducing a few central definitions and discuss the differences between optimistic and pessimistic formulations of bilevel problems. Once the common terminology has been established, we offer an overview on the algorithms that have been proposed for bilevel optimization. After a brief coverage of the commonly used classical approaches (e.g., descent methods, penalty function methods, and trust region methods), we move on to discuss the developments in the field of evolutionary computation, discrete bilevel optimization and multiobjective bilevel optimization. The method sections are followed by a review on the central application areas. Finally, we study the research topics, and the evolution of interest over time. The entire bilevel literature is divided into topics and a time series analysis across each topic is performed. The text-analysis performed in this paper is based on a recently developed nonparametric topic model [26], [160] for analyzing unstructured information. The technical details on the automated text-analysis approach are provided in an appendix. This paper concludes with a brief discussion on the directions for future research.

Fig. 2. - Bilevel network-map showing connections between various applications and theory since 1950s. Each connecting link represents either a topic connected with a subtopic, or an overlap between two subtopics.
Fig. 2.

Bilevel network-map showing connections between various applications and theory since 1950s. Each connecting link represents either a topic connected with a subtopic, or an overlap between two subtopics.

SECTION II.

General Formulation and Definitions

In this section, we provide a general formulation for bilevel optimization problem. These problems contain two levels of optimization tasks where one optimization task is nested within the other. The outer optimization problem is commonly referred as the leader’s (upper level) optimization problem and the inner optimization problem is known as the follower’s (or lower level) optimization problem. The two levels have their own objectives and constraints. Correspondingly, there are also two classes of decision vectors, namely, leader’s (upper level) decision vectors and follower’s (lower level) decision vectors. The lower level optimization is a parametric optimization problem that is solved with respect to the lower level decision vectors while the upper level decision vectors act as parameters. The lower level optimization problem is a constraint to the upper level optimization problem, such that, only those members are considered feasible that are lower level optimal and also satisfy the upper level constraints. A summary of the terminologies and notations used in the context of bilevel optimization is given in Table I.

TABLE I Summary of Central Notations
Table I- 
Summary of Central Notations

Definition 1:

For the upper-level objective function F : Rn×RmR and lower-level objective function f : Rn×RmR , the bilevel problem is given by

min''xuXU,xlXLsubject toxlargminxlXLF(xu,xl){f(xu,xl):gj(xu,xl)0,j=1,,J} Gk(xu,xl)0,k=1,,K
View SourceRight-click on figure for MathML and additional features. where Gk : XU×XLR , k=1,,K denote the upper level constraints, and gj : XU×XLR represent the lower level constraints, respectively. Equality constraints may also exist that have been avoided for brevity.

An equivalent formulation of the above problem can be stated in terms of set-valued mapping (multivalued function) as follows.

Definition 2:

Let Ψ : RnRm be a set-valued mapping

Ψ(xu)=argminxlXL{f(xu,xl) : gj(xu,xl)0,j=1,,J}
View SourceRight-click on figure for MathML and additional features. which represents the constraint defined by the lower-level optimization problem, i.e., Ψ(xu)XL for every xuXU . Then the bilevel optimization problem can be expressed as a constrained optimization problem as follows:
minxuXU,xlXL  subject to  F(xu,xl)xlΨ(xu)Gk(xu,xl)0,k=1,,K
View SourceRight-click on figure for MathML and additional features.
where Ψ can be interpreted as a parameterized range-constraint for the lower-level decision vector xl .

In the above two definitions, quotes have been used while specifying the upper level minimization problem because of an ambiguity that arises in case of multiple lower level optimal solutions for any given upper level decision vector. In the presence of multiple lower level optimal solutions there is lack of clarity at the upper level as to which optimal solution from the lower level should be utilized. This ambiguity can be sorted by defining different positions that may be assumed by the leader. The two common positions that have been widely studied are optimistic (weak) position and pessimistic (strong) position, which we discuss next.

A. Optimistic Position

In an optimistic position, in the presence of multiple lower level optimal solutions, the leader expects the follower to choose that solution from the optimal set Ψo(xu) , which leads to the best objective function value at the upper level. The choice function of the follower in this case may be defined as follows:

Ψo(xu)=argminxlXL{F(xu,xl) : xlΨ(xu)}.
View SourceRight-click on figure for MathML and additional features. This formulation assumes some extent of cooperation between the two players. The bilevel optimization problem under an optimistic position has been defined as follows:
minxuXU,xlXLsubject toF(xu,xl)xl=Ψo(xu)Gk(xu,xl)0,k=1,,K.
View SourceRight-click on figure for MathML and additional features.

Optimistic position is more tractable as compared to the pessimistic position; therefore, most of the studies handle optimistic version of the bilevel optimization problem. The optimistic formulation is guaranteed to have an optimal solutions under reasonable assumptions of regularity and compactness that are stated in the theorem below.

Theorem 1:

If the functions F,f,Gk , and gi are sufficiently smooth, the constraint region Φ of the bilevel optimization problem is nonempty and compact, and the Mangasarian-Fromowitz constraint qualification holds at all points, then the problem is guaranteed to have an optimistic bilevel optimum provided there exists a feasible solution.

See [58], [60], [80], [105], [106], [126] for further discussion on existence of optimistic bilevel optimum and additional results on optimality conditions.

B. Pessimistic Position

In a pessimistic position, in the presence of multiple lower level optimal solutions, the leader optimizes for the worst case, i.e., she assumes that the follower may choose that solution from the optimal set which leads to the worst objective function value at the upper level. Such a worst case choice function of the follower may be defined as

Ψp(xu)=argmaxxl{F(xu,xl) : xlΨ(xu)}.
View SourceRight-click on figure for MathML and additional features. This formulation does not assume any form of cooperation. The bilevel optimization problem under a pessimistic position has been defined as
minxuXU,xlXLsubject toF(xu,xl)xl=Ψp(xu)Gk(xu,xl)0,k=1,,K.
View SourceRight-click on figure for MathML and additional features.
Pessimistic position is relatively less tractable when compared to optimistic position. In case of an optimistic formulation with a convex lower level problem, it is possible to reduce the bilevel problem to single level using the variational inequality corresponding to the lower level problem. However, such a straightforward single level reduction is not possible in case of a pessimistic bilevel program. This poses significant challenges in designing methodologies that can handle pessimistic bilevel problems. For every lower level optimization problem solved one has to keep track of that lower level optimal solution that is worst for the upper level. This essentially makes a pessimistic bilevel optimization a three level task. The pessimistic formulation is guaranteed to have an optimal solutions under stronger assumptions, as compared to the optimistic formulation, that are given below.

Theorem 2:

If the functions F,f,Gk , and gi are sufficiently smooth, the constraint region Φ of the bilevel optimization problem is nonempty and compact, and the set-value mapping, Ψp , is lower semicontinuous for all upper level decision vectors, then the problem is guaranteed to have a pessimistic bilevel optimum.

For discussion on existence of pessimistic bilevel optimum and additional results on optimality conditions, the authors may refer to [60], [62], [109], [111], and [177].

C. Example

Below we provide a simple example of a bilevel optimization problem [71] that arises in case of two firms in a Stackelberg competition. The leader has complete knowledge about the follower’s inverse-demand function and the cost function, and desires to maximize it’s own profits by taking into account the actions of the follower firm. The two firms compete solely by choosing their production levels that maximize their profits (Πl and Πf ), and the follower acts only after observing the actions of the leader. Formally, this model can be presented as follows:

maxql,qfs.t.Πl=P(ql,qf)qlCl(ql) qfargmaxqf{Πf=P(ql,qf)qfCf(qf)}ql,qf0(1)(2)(3)
View SourceRight-click on figure for MathML and additional features. where P(ql,qf) is the unit price of the goods sold, which depends on the total production. The assumption is that at the optimum, all demand is satisfied. Cl() is the cost of production of the leader and Cf() is the cost of production of the follower. The variables in this model are the production levels of each firm ql and qf . The leader sets its production level first, and then the follower chooses its production level based on the leader’s decision. This simple model assumes homogeneity of the products manufactured by the firms.

By assuming that the firms produce and sell homogeneous goods, we may assume a single linear price function for both firms as an inverse demand function of the form

P(ql,qf)=αβ(ql,qf)(4)
View SourceRight-click on figure for MathML and additional features. where α and β>0 are constants. Additionally, since costs often tend to increase with the amount of production, we assume convex quadratic cost functions for both firms to be of the form
C(ql)=C(qf)=δlq2l+γlql+clδfq2f+γfqf+cf(5)(6)
View SourceRight-click on figure for MathML and additional features.
where ci denotes the fixed costs of the respective firm, and δi and γi are positive constants. It is possible to solve the above model analytically, using the first order conditions of the lower level problem to reduce it to single level, and then using the first order conditions of the reduced problem.

The optimal level of production of the leader (ql) and the follower (qf) in terms of the constants of the model is given as follows:

ql=qf=2(β+δf)(αγl)β(αγf)4(β+δf)(β+δl)2β2αγf2(β+δf)β(αγl)β2(αγf)2(β+δf)4(β+δf)(β+δl)2β2.(7)(8)
View SourceRight-click on figure for MathML and additional features.

SECTION III.

Classical Approaches

In this section, we provide a brief overview of the classical algorithms that have been proposed for bilevel optimization. Given the difficult nature of bilevel problems, it is not surprising that much of the classical literature considers bilevel problems that are mathematically well-behaved; i.e., contains functions that are linear, quadratic or convex. Strong assumptions like continuous differentiability and lower semicontinuity are quite common. A significant amount of attention has been given to linear bilevel optimization problems with continuous [20], [174] and combinatorial [166] variables. For more complex bilevel problems, the readers may refer [10] and [48].

A. Single-Level Reduction

When the lower level problem is convex and sufficiently regular, it is possible to replace the lower level optimization problem with its Karush-Kuhn–Tucker (KKT) conditions. The KKT conditions appear as Lagrangian and complementarity constraints, and reduce the overall bilevel optimization problem to a single-level constrained optimization problem. For example, the problem in Definition 1 can be reduced to the following form, when the convexity and regularity conditions at the lower level are met:

minxuXU,xlXL,λsubject toF(xu,xl)Gk(xu,xl)0,k=1,,KxlL(xu,xl,λ)=0gj(xu,xl)0,j=1,,Jλjgj(xu,xl)=0,j=1,,Jλj0,j=1,,J
View SourceRight-click on figure for MathML and additional features. where
L(xu,xl,λ)=f(xu,xl)+j=1Jλjgj(xu,xl).
View SourceRight-click on figure for MathML and additional features.
The above formulation, though a single level optimization task, is not necessarily simple to handle. The Lagrangian constraints can lead to nonconvexities even when suitable convexity assumptions are made on all the objectives and constraints in the bilevel formulation. The complementarity condition, inherently being combinatorial, renders the single-level optimization problem as a mixed integer program.

Interestingly, for linear bilevel optimization problems, the Lagrangian constraint is also linear. Therefore, the single-level optimization problem is a mixed integer linear program. Approaches based on vertex enumeration [25], [45], [161], as well as branch-and-bound (B&B) [16], [70] have been used to solve these problems. It is noteworthy that B&B methods constitute an exponentially slow algorithm with the number of integer variables. But, B&B approaches have been successfully applied to single-level reductions of linear-quadratic [17] and quadratic-quadratic [5], [64] bilevel problems. An extended KKT approach has also been considered [138] for handling linear bilevel problems.

B. Descent Methods

In addition to KKT-based approaches, a number of descent methods have been proposed for solving bilevel optimization problems. A descent direction in bilevel optimization leads to decrease in upper level function value while keeping the new point feasible. Given that a point is considered feasible only if it is lower level optimal, finding the descent direction can be quite challenging. To resolve the problem, researchers have investigated ways to approximate the gradient of the upper level objective [95] as well as considered formulation of auxiliary programs [136], [165] to determine the direction of descent.

C. Penalty Function Methods

In penalty function methods the bilevel optimization problem is handled by solving a series of unconstrained optimization problems. The unconstrained problem is generated by adding a penalty term that measures the extent of violation of the constraints. The penalty term often requires a parameter and takes the value zero for feasible points and positive (minimization) for infeasible points. For bilevel problems, the first attempt toward using a penalized approach was made by Aiyoshi and Shimizu [1], [2]. They replaced the lower level problem by a penalized problem; however, the bilevel hierarchy was still maintained and the reduced problem was still difficult to solve. Later double penalty method was introduced in [84], where both upper and lower level objective functions were penalized. The problem was reduced into a single level task by replacing the penalized lower level problem with its KKT conditions, and then solving the single level formulation by penalization. In a number of studies, the lower level problem is directly replaced by its KKT conditions and then a penalized approach is used to solve the single level problem. Few studies where penalty function approach has been used for linear bilevel problems are [112] and [175]. White and Anandalingam [175] converted the linear bilevel program into a penalized bilinear optimization problem, and then solve a series of bilinear problems to find the optimum. Lv et al.. [112] reduced the linear bilevel program into single level using KKT conditions, and then append the complementary slackness condition to the upper level objective function with a penalty. The penalized problem is then handled using a series of linear programs.

D. Trust-Region Methods

In trust-region methods, the algorithms approximate a certain region of the objective function with a model function. The region is expanded if the approximation is good, otherwise it is contracted. The first study using trust-region method to solve nonlinear bilevel programs was presented in [108], where the lower level problem had a convex objective function and linear constraints. However, no upper level constraints were considered. Later, a more general idea was proposed in [116], where the authors locally approximate the bilevel program with a model involving a linear program at the upper level and a linear variational inequality at the lower level. Trust-region and line search ideas have been combined to approach the bilevel optimum over iterations. Similarly, Colson et al. [50] approximated the bilevel program around an iterate with a model that itself is a linear-quadratic bilevel program. The authors proposed to solve the linear-quadratic bilevel program using a mixed integer solver after reducing it to a single level problem using its lower level KKT conditions. Convergence is achieved by sequentially solving linear-quadratic bilevel models.

Next, we discuss about evolutionary algorithms for bilevel optimization. At this point, we would like to refer the readers to other review papers [49], [57], [90], [150], [164] and books [18], [60], [61], [140], [159] on bilevel optimization.

SECTION IV.

Evolutionary Approaches

A. Nested Methods

Nested evolutionary algorithms are a popular approach to handle bilevel problems, where lower level optimization problem is solved corresponding to each and every upper level member [153]. Though effective, nested strategies are computationally very expensive and not viable for large scale bilevel problems. Nested methods in the area of evolutionary algorithms have been used in primarily two ways. The first approach has been to use an evolutionary algorithm at the upper level and a classical algorithm at the lower level, while the second approach has been to utilize evolutionary algorithms at both levels. Of course, the choice between two approaches is determined by the complexity of the lower level optimization problem.

One of the first evolutionary algorithms for solving bilevel optimization problems was proposed in the early 1990s. Mathieu et al. [118] used a nested approach with genetic algorithm at the upper level, and linear programming at the lower level. Another nested approach was proposed in [184], where the upper level was an evolutionary algorithm and the lower level was solved using Frank-Wolfe algorithm (reduced gradient method) for every upper level member. The authors demonstrated that the idea can be effectively utilized to solve nonconvex bilevel optimization problems.

Nested particle swarm optimization (PSO) was used in [103] to solve bilevel optimization problems. The effectiveness of the technique was shown on a number of standard test problems with small number of variables, but the computational expense of the nested procedure was not reported. A hybrid approach was proposed in [102], where simplex-based crossover strategy was used at the upper level, and the lower level was solved using one of the classical approaches. The authors report the generations and population sizes required by the algorithm that can be used to compute the upper level function evaluations, but they do not explicitly report the total number of lower level function evaluations, which presumably is high.

Differential evolution (DE) based approaches have also been used, for instance, Zhu et al. [188] used DE at the upper level and relied on the interior point algorithm at the lower level; similarly, Angelo et al. [11] have used DE at both levels. Authors have also combined two different specialized evolutionary algorithms to handle the two levels, for example, Angelo and Barbosa [12] used an ant colony optimization to handle the upper level and DE to handle the lower level in a transportation routing problem. Another nested approach utilizing ant colony algorithm for solving a bilevel model for production-distribution planning is [36]. Scatter search algorithms have also been employed for solving production-distribution planning problems, for instance [39].

Through a number of approaches involving evolutionary algorithms at one or both levels, the authors have demonstrated the ability of their methods in solving problems that might otherwise be difficult to handle using classical bilevel approaches. However, as already stated, most of these approaches are practically nonscalable. With increasing number of upper level variables, the number of lower level optimization tasks required to be solved increases exponentially. Moreover, if the lower level optimization problem itself is difficult to solve, numerous instances of such a problem cannot be solved, as required by these methods.

B. Single-Level Reduction

The idea behind single-level reduction, in the context of evolutionary algorithms, is similar to the discussion in Section III-A. A number of researchers in the area of evolutionary computation have also used the KKT conditions of the lower level to reduce the bilevel problem into a single-level problem. Most often, such an approach is able to solve problems that adhere to certain regularity conditions at the lower level because of the requirement of the KKT conditions. However, as the reduced single-level problem is solved with an evolutionary algorithm, usually the upper level objective function and constraints can be more general and not adhering to such regularities. For instance, one of the earliest papers using such an approach is by Hejazi et al. [82], who reduced the linear bilevel problem to single-level and then used a genetic algorithm, where chromosomes emulate the vertex points, to solve the problem. Wang et al. [171] reduced the bilevel problem into a single-level optimization problem using KKT conditions, and then utilized a constraint handling scheme to successfully solve a number of standard test problems. Their algorithm was able to handle nondifferentiability at the upper level objective function, but not elsewhere. Later on, Wang et al. [172] introduced an improved algorithm that was able to handle nonconvex lower level problem and performed better than the previous approach [171]. However, the number of function evaluations in both approaches remained quite high (requiring function evaluations to the tune of 100 000 for 2–5 variable bilevel problems). Wang et al. [169] used a simplex-based genetic algorithm to solve linear-quadratic bilevel problems after reducing it to a single level task. More recently, Jiang et al. [86] reduced the bilevel optimization problem into a nonlinear optimization problem with complementarity constraints, which is sequentially smoothed and solved with a PSO algorithm. Along similar lines of using lower level optimality conditions, Li [101] solved a fractional bilevel optimization problem by utilizing optimality results of the linear fractional lower level problem. Wan et al. [167] embedded the chaos search technique in PSO to solve single-level reduced problem.

C. Metamodeling-Based Methods

Metamodeling-based solution methods are commonly used for optimization problems [168], where actual function evaluations are expensive. A meta-model or surrogate model is an approximation of the actual model that is relatively quicker to evaluate. Based on a small sample from the actual model, a surrogate model can be trained and used subsequently for optimization. Given that, for complex problems, it is hard to approximate the entire model with a small set of sample points, researchers often resort to iterative meta modeling techniques, where the actual model is approximated locally during iterations.

Bilevel optimization problems contain an inherent complexity that leads to a requirement of large number of evaluations to solve the problem. Metamodeling, when used with population-based algorithms, offers a viable means to handle bilevel optimization problems. In this section, we discuss four ways in which metamodeling can be applied to bilevel optimization. The discussion related to approximation of the rational reaction set and lower level optimal value function is a review of some recent work. However, before starting, we refer the readers to Fig. 3, which provides an understanding of these two mappings graphically for a hypothetical bilevel problem. We also provide a brief discussion on approximating the bilevel problem with an auxiliary problem.

Fig. 3. - Graphical representation of rational reaction set (
$\Psi $
) and lower level optimal value function (
$\varphi $
).
Fig. 3.

Graphical representation of rational reaction set (Ψ ) and lower level optimal value function (φ ).

1) Reaction Set Mapping:

One of the approaches to solve bilevel optimization problems using evolutionary algorithms would be through iterative approximation of the reaction set mapping Ψ . If the Ψ -mapping (introduced in Table I) in a bilevel optimization problem is known, it effectively reduces the problem to single level optimization. However, this mapping is seldom available; therefore, the approach could be to solve the lower level problem for a few upper level members and then utilize the lower level optimal solutions and corresponding upper level members to generate an approximate mapping Ψ^ . It is noteworthy that approximating a set-valued Ψ -mapping offers its own challenges and is not a straightforward task. Assuming that an approximate mapping, Ψ^ , can be generated, the following single level optimization problem can be solved for a few generations of the algorithm before deciding to further refine the reaction set:

minxuXU,xlXLsubject toF(xu,xl)xlΨ^(xu)Gk(xu,xl)0,k=1,,K.
View SourceRight-click on figure for MathML and additional features. Evolutionary algorithms that rely on this idea to solve bilevel optimization problems are [13], [144], [145], and [149]. Sinha et al. [144], [145] have used quadratic approximation to approximate the local reaction set. This helps in saving lower level optimization calls when the approximation for the local reaction set is good. In case the approximations generated by the algorithm are not acceptable, the method defaults to a nested approach. It is noteworthy that a bilevel algorithm that uses a surrogate model for reaction set mapping may need not be limited to quadratic models but other models can also be used.

2) Optimal Lower Level Value Function:

Another way to use metamodeling would be through the approximation of the optimal value function φ . If the φ -mapping (introduced in Table I) is known, the bilevel problem can once again be reduced to single level optimization problem as follows [183]:

minxuXU,xlXLsubject toF(xu,xl)f(xu,xl)φ(xu)gj(xu,xl)0,j=1,,JGk(xu,xl)0,k=1,,K.
View SourceRight-click on figure for MathML and additional features. However, since the value function is seldom known, one can attempt to approximate this function using metamodeling techniques. The optimal value function is a single-valued mapping; therefore, approximating this function avoids the complexities associated with set-valued mapping. As described previously, an approximate mapping φ^ , can be generated with the population members of an evolutionary algorithm and the following single level optimization problem can be solved with refinements at every few generations:
minxuXU,xlXL  subject to  F(xu,xl)f(xu,xl)φ^(xu)gj(xu,xl)0,j=1,,JGk(xu,xl)0,k=1,,K.
View SourceRight-click on figure for MathML and additional features.
An evolutionary approach that relies on this idea can be found in [143] and [148].

3) Bypassing Lower Level Problem:

Another way to use a meta-model in bilevel optimization would be completely by-pass the lower level problem, as follows:

minxuXU  subject to  F^(xu)G^k(xu)0,k=1,,K.
View SourceRight-click on figure for MathML and additional features. Given that the optimal xl are essentially a function of xu , it is possible to construct a single level problem by ignoring xl completely. However, the landscape for such a single level problem can be highly nonconvex, disconnected and nondifferentiable. Advanced metamodeling techniques might be required to use this approach, which may be beneficial for certain classes of bilevel problems. A training set for the meta-model can be constructed by solving few lower level problems for different xu . Both upper level objective F and constraint set (Gk ) can then be meta-modeled using xu alone. Given the complex structure of such a single-level problem, it might be sensible to create such an approximation locally. We are currently pursuing such an approach using artificial neural network as the metamodeling approach.

4) Auxiliary Bilevel Meta-Model:

Building up on the trust-region methods for solving bilevel optimization problems, it is possible to utilize the population members in an evolutionary algorithm to formulate auxiliary bilevel problem(s). The auxiliary bilevel problem(s) may be simple enough to be solved using faster specialized techniques. The population members could then be updated based on the obtained auxiliary solution(s). For the moment, there does not exist any evolutionary algorithm based on this idea, but it may be an interesting direction to pursue in the future.

SECTION V.

Discrete Bilevel Optimization

In this section, we would like to discuss the contributions made toward solving discrete bilevel optimization problems. The formulation of the bilevel problem remains the same as described in Definitions 1 and 2, along with one or more variables at either of the levels being discrete. Presence of discrete variables can pose a variety of challenges depending upon, whether the discrete variables are present at upper level, lower level or both levels. In the classical literature, B&B and branch-and-cut are some of the commonly used techniques to handle discreteness in variables. Most of the work on discrete bilevel optimization employs an extension of these ideas from single-level optimization. To highlight the kind of complexities induced in the presence of discrete variables, we consider a simple linear bilevel problem described in [18] and [166] to show how the inducible region changes based on the upper, lower or both level variables being discrete or continuous.

Consider the following lower level optimization problem which we use for identifying the inducible region of a bilevel problem:

minxl subject to xlxu+xl2,xu+xl25xu4xl10,5xu4xl10.
View SourceRight-click on figure for MathML and additional features. For the above lower level problem there are four possible scenarios based on the variables being continuous or discrete.

  1. Continuous-Continuous Bilevel Program: Consider xuR and xlR .

  2. Discrete-Continuous Bilevel Program: Consider xuZ and xlR .

  3. Discrete-Discrete Bilevel Program: Consider xuZ and xlZ .

  4. Continuous-Discrete Bilevel Program: Consider xuR and xlZ .

For each scenario, the inducible region can be very different that has been shown in Fig. 4. The figure clearly demonstrates how a discrete variable at any level of the problem can lead to a disconnected search space. Of course there could also be situations where each level has a mix of continuous and discrete variables.
Fig. 4. - Inducible region in different cases when the upper and lower level variables belong to continuous or discrete sets.
Fig. 4.

Inducible region in different cases when the upper and lower level variables belong to continuous or discrete sets.

A. Discrete Bilevel Optimization Survey

One of the early works on discrete bilevel optimization was by Vicente et al. [166], which focused on discrete linear bilevel programs, and analyzed the properties and existence of the optimal solution for different kinds of discretizations arising from the upper and lower level variables. The authors have shown in this paper that certain compactness conditions guarantee the existence of optimal solution in continuous-continuous linear bilevel programs, discrete-continuous linear bilevel programs, and discrete-discrete linear bilevel programs. The conditions are equivalent to stating that the inducible region is nonempty. However, the existence conditions in the case of continuous-discrete linear bilevel programs are not straightforward. For instance, the inducible region for the continuous-discrete linear bilevel problem in Fig. 4 is a noncompact set that may lead to nonexistence of a bilevel optimal solution even when the inducible region is nonempty.

A few studies preceded the study by Vicente et al. [166]. For instance, Moore and Bard [123] solved mixed integer linear bilevel problems. The authors pointed out the difficulties involved in fathoming while solving mixed integer bilevel problems using traditional B&B techniques. Certain fathoming rules used in case of mixed integer linear programming, like fathoming when the relaxed subproblem is worse than the value of the incumbent or fathoming when the solution of the relaxed subproblem is feasible for the mixed integer problem, are not directly applicable to mixed integer linear bilevel problems. Therefore, the authors proposed a B&B approach involving stricter fathoming conditions. However, the algorithm has a nested structure and is not scalable beyond few integer variables, and to counter which the authors also proposed some heuristics. This paper was followed by [19], where the same authors solved discrete linear bilevel programs involving only binary variables using an implicit enumeration scheme. In this approach, the authors place a cut, similar to the one used by Bialas and Karwan [25] (for the continuous linear bilevel program), seeking incremental improvements in the upper level objective function. A cutting plane method utilizing the Chvátal-Gomory cut for the continuous-discrete bilevel program was proposed in [59]. Benders-decomposition-based techniques have also been employed to solve bilevel problems with mixed integers at the upper level and continuous linear programs at the lower level. The original problem is decomposed into a master and a slave problem. Fixing the integer values converts the slave problem into a bilevel linear program which is solved by KKT-based reduction techniques, and the solution to the slave is utilized to create a cut for the master problem. The algorithm switches between master and slave problems until the optimality criteria is met. Certain studies in this direction are [40], [69], and [135].

Despite the attempts made toward algorithm development for discrete bilevel programs, the research is still open for new methods and ideas as none of the proposed techniques would scale well for problems with larger number of variables. Apart from a few nested approaches and KKT-based single level reduction approaches, to our best knowledge, there does not exist any algorithmic study involving evolutionary algorithms for mixed integer bilevel problems that attempt to solve the problem efficiently by utilizing its properties. However, given that the evolutionary approaches are, in particular, potent for handling difficulties such as discreteness and nondifferentiabilities they offer a significant scope for solving discrete bilevel optimization problems. Some attempts toward mixed integer bilevel optimization using evolutionary methods are [4], [14], [37], [43], [78], [81], [100], and [120].

There is no dearth of application problems involving discrete variables. While mixed integer bilevel problems are ubiquitous, even combinatorial bilevel programs find innumerable applications in the areas of network design, facility location, hub-and-spoke networks, etc. In these areas, these problems are commonly studied in the context of interdiction, protection, robust design, competition and supply chain management, among others. Some of the related applications are highlighted in Section VII.

SECTION VI.

Multiobjective Bilevel Optimization

In many of the practical problems, a leader and/or the follower might face multiple objectives. This gives rise to multiobjective bilevel optimization problems that we define below.

Definition 3:

For the upper-level objective function F : Rn×RmRp and lower-level objective function f : Rn×RmRq , the multiobjective bilevel problem is given by

minxuXU,xlXLsubject toF(xu,xl)=(F1(xu,xl),,Fp(xu,xl))xlargminxlXL{f(xu,xl)=(f1(xu,xl),,fq(xu,xl)):gj(xu,xl)0,j=1,,J}Gk(xu,xl)0,k=1,,K
View SourceRight-click on figure for MathML and additional features. where Gk : XU×XLR , k=1,,K denote the upper level constraints, and gj : XU×XLR represent the lower level constraints, respectively.

The set-valued mapping in this case can be defined as follows and an equivalent definition can be written as in the single-objective case:

Ψ(xu)=argminxlXL{f(xu,xl)=(f1(xu,xl),,fq(xu,xl)) :gj(xu,xl)0,j=1,,J}.
View SourceRight-click on figure for MathML and additional features.

A. Optimistic Versus Pessimistic

The optimistic or pessimistic position becomes more prominent in multiobjective bilevel optimization. In the presence of multiple objectives at the lower level, the set-valued mapping Ψ() normally represents a set of Pareto-optimal (PO) solutions corresponding to any given xu , which we refer as follower’s PO frontier. A solution to the overall problem (with optimistic or pessimistic position) is expected to produce a tradeoff frontier for the leader that we refer as the leader’s PO frontier. From the perspective of the leader, it becomes important that what kind of position she seeks to take while solving the problem, as it determines that which solution(s) from the lower level frontier should be considered at the upper level.

Though optimistic positions have commonly been studied in classical [66] and evolutionary [56] literature in the context of multiobjective bilevel optimization; it is far from realism to expect that the follower will cooperate to an extent that she chooses any point from her PO frontier that is most suitable for the leader. This relies on the assumption that the follower is indifferent to the entire set of optimal solutions, and therefore decides to cooperate. The situation was entirely different in the single-objective case, where, in case of multiple optimal solutions, all the solutions offered an equal value to the follower. However, this cannot be assumed in the multiobjective case. Solution to the optimistic formulation in multiobjective bilevel optimization leads to the best possible PO frontier that can be achieved by the leader. Similarly, solution to the pessimistic formulation leads to the worst possible PO frontier at the upper level.

If the value function or the choice function of the follower is known to the leader, it provides an information as to what kind of tradeoff is preferred by the follower. A knowledge of such a function effectively, casually speaking, reduces the lower level optimization problem into a single-objective optimization task, where the value function may be directly optimized. The leader’s PO frontier for such intermediate positions lies between the optimistic and the pessimistic frontiers. Fig. 5 shows the optimistic and pessimistic frontiers for a hypothetical multiobjective bilevel problem with two objectives at upper and lower levels. Follower’s frontier corresponding to x1u , x2u , and x3u , and her decisions Al , Bl , and Cl are shown in the insets. The corresponding representations of the follower’s frontier and decisions (Au , Bu , and Cu ) in the leader’s space are also shown.

Fig. 5. - Leader’s PO frontiers for optimistic and pessimistic positions. Few follower’s PO frontiers are shown (in insets) along with their representations in the leader’s objective space.
Fig. 5.

Leader’s PO frontiers for optimistic and pessimistic positions. Few follower’s PO frontiers are shown (in insets) along with their representations in the leader’s objective space.

B. Example

Below, we provide a bilevel optimization problem involving design of a tax policy [152]. The upper level in this example is the government that wants to tax the lower level, a mining company, based on the pollution it causes to the environment. The government here has two objectives: the first objective is to maximize the revenues generated by the mining project, which may include the additional jobs, taxes, etc.; and the second objective is to minimize the harm caused to the environment as a result of mining. Obviously, there is a tradeoff between the two objectives, and the government as a decision maker needs to choose one of the preferred tradeoff solutions. The mining company has a sole objective of maximizing its profit under the constraints set by the government. In this scenario, the government would like to have a tax structure such that it is able to maximize its own revenues in addition to being able to restrain the mining company from causing extensive damage to the environment. It is possible for the leader to optimally regulate the problem in its favor, provided that it has complete knowledge of the follower’s strategies. The hierarchical optimization problem in this case can be formulated as follows:

maxτ,qs.t. F(q,τ)=(R,D)qargmaxq{π(q)π(q)=p(q)qc(q)R0}q0,τ0.(9)(10)(11)
View SourceRight-click on figure for MathML and additional features.

In (9), the first objective deals with the tax revenue, where R=τq ; τ is the per unit tax imposed on the mine, and q is the amount of metal extracted from the ore by the follower. The second objective denotes the environmental damage caused by the mine that the government ultimately wants to minimize. D=kq , where k is the pollution coefficient signifying the negative impact of extraction on the environment. The damages are thus linear and scale proportionately with the amount of gold extracted from the earth since a larger base of operation implies larger environmental damage.

Equation (10) gives the profit of the mine, where p(q)q (price function times amount of metal extracted) is the revenue function, and c(q) is the extraction cost function followed by the additional tax levied on the mine. The mine is most likely to be a price taker when it comes to the price of gold and must base its mining decisions on the possible price paid by their customers. It would therefore be plausible to replace the price function for gold in the above equation by a constant. However, given the assumption that the mine can extract a large amount of ore, and subsequently gold, at one time, it might be possible for it to affect the price of gold slightly. Therefore, we assume the price function to be linear with a small slope. Extraction cost can be considered to be quadratic. Thus, we have the following model:

maxτ,qs.t. F(q,τ)=(τq,kq)qargmaxqπ(q)π(q)=(αβq)q(δq2+γq+ϕ)τq0q0,τ0(12)(13)(14)
View SourceRight-click on figure for MathML and additional features. where α,β,δ,γ , and ϕ are constants, and ϕ represents the fixed costs of setting up operations. The above problem can be solved analytically by taking a weighted sum of squares of the upper level objectives (wτq(1w)kq : w[0,1] ). The optimal solution to the above problem is given as follows:
τ(w)=q(w)=αγk2+k2ww(αγ)(1w)k4w(β+δ).(15)(16)
View SourceRight-click on figure for MathML and additional features.

We assume the parameters as α=100,β=1,δ=1,γ=1 , and ϕ=0 . By varying the government’s preference weights (w ) in its domain, one can generate the entire PO solutions for the leader. Note that a very high taxation (or weight to the environmental objective) may lead to no production at the lower level, for instance, we find that w<0.01 does not lead to any production. The Pareto frontier generated using weights 0.01w1 has been provided in Fig. 6 for the above model.

Fig. 6. - PO frontier for the government showing the tradeoff between tax revenues and environmental pollution.
Fig. 6.

PO frontier for the government showing the tradeoff between tax revenues and environmental pollution.

C. Multiobjective Bilevel Optimization Survey

There exists a significant amount of work on single objective bilevel optimization; however, little has been done on multiobjective bilevel optimization primarily because of the computational and decision making complexities that these problems offer. For results on optimality conditions in multiobjective bilevel optimization, the readers may refer to [15], [72], and [182]. On the methodology side, Eichfelder [65], [66] solved simple multiobjective bilevel problems using a classical approach. The lower level problems in these studies have been solved using a numerical optimization technique, and the upper level problem is handled using an adaptive exhaustive search method. This makes the solution procedure computationally demanding and nonscalable to large-scale problems. In another study, Shi and Xia [139] used ϵ -constraint method at both levels of multiobjective bilevel problem to convert the problem into an ϵ -constraint bilevel problem. The ϵ -parameter is elicited from the decision maker, and the problem is solved by replacing the lower level constrained optimization problem with its KKT conditions.

One of the first studies, utilizing an evolutionary approach for multiobjective bilevel optimization was by Yin [184]. The study involved multiple objectives at the upper lever, and a single objective at the lower level. The study suggested a nested genetic algorithm, and applied it on a transportation planning and management problem. Later Halter and Mostaghim [76] used a PSO-based nested strategy to solve a multicomponent chemical system. The lower level problem in their application was linear for which they used a specialized linear multiobjective PSO approach. Recently, a hybrid bilevel evolutionary multiobjective optimization algorithm coupled with local search was proposed in [56] (for earlier versions, refer [53]–​[55] and [142]). In this paper, the authors handled nonlinear as well as discrete bilevel problems with relatively larger number of variables. The study also provided a suite of test problems for bilevel multiobjective optimization.

There has been some work done on decision making aspects at upper and lower levels. For example, in [141] an optimistic version of multiobjective bilevel optimization, involving interaction with the upper level decision maker, has been solved. The approach leads to the most preferred point at the upper level instead of the entire Pareto-frontier. Since multiobjective bilevel optimization is computationally expensive, such an approach was justified as it led to enormous savings in computational expense. Studies that have considered decision making at the lower level include [146] and [151]. Sinha et al. [146] have replaced the lower level with a value function that effectively reduces the lower level problem to single-objective optimization task. In [151], the follower’s value function is known with uncertainty, and the authors propose a strategy to handle such problems. Other work related to bilevel multiobjective optimization can be found in [107], [128], [129], [134], and [187].

SECTION VII.

Applications

Bilevel optimization commonly appears in many practical problems. They are often encountered in the fields of economics, transportation, engineering, and management, among others. The following list will provide an insight to the readers on the relevance of these problems to practice.

  1. Toll Setting Problem: Toll-setting problem is essentially a part of network problems. In this problem, there is an authority that wants to optimize the tolls for a network of roads. The authority acts as a leader and the network users act as followers. Papers on toll-setting problem and its multiobjective extensions can be found in [31], [51], [67], [75], [89], [91], [97], [115], [121], [147], [170], and [185]. Bilevel optimization is quite commonly used in network design problems. Instead of going through specific problems, we refer the readers to a variety of applications of bilevel optimization to the area of network design [21], [38], [42], [44], [68], [73], [99], [114], [119], [133], [178], [180], [181], [184].

  2. Environmental Economics: Bilevel optimization commonly appears in environmental economics, where an authority wants to tax an organization or individual that is polluting the environment as a result of its operations. Finding an optimal level of tax that offers a compromise between revenues and pollution results in a bilevel optimization problem with the regulator as the leader and the polluting entity as a follower [8], [27], [28], [152], [176].

  3. Chemical Industry: In chemical industries, the chemists often face a bilevel optimization problem where they have to decide upon the conditions (state variables and quantity of reactants) for the reaction to achieve optimal output. While optimizing the output is the upper level problem, the lower level appears as an equilibrium condition, which is an entropy functional minimization problem. Such applications of bilevel optimization can be found in [47], [77], [130], and [155].

  4. Optimal Design: Bilevel problems are very common in structural optimization or optimal shape design. For instance, in structural optimization one often requires to minimize the weight or cost of a structure as an upper level objective with the decision variables as shape of the structure, choice of materials, amount of material, etc. The constraints at the upper level involve bounds on displacements, stresses and contact forces whose values are determined by solving the potential energy minimization problem at the lower level. The equilibrium condition in many optimal shape design problems appears in the form of variational inequalities which require the overall problem to formulated as a two level task. For optimal design applications, the readers may refer to [22], [46], [83], [93], and [94].

  5. Defense Applications: Bilevel optimization has a number of applications in the defense sector [30], for example attacker-defender Stackelberg games [9], [34], [85], [125], [131], [132], [137]. Specifically, some recent applications include planning the prepositioning of defensive missile interceptors to counter an attack threat [32], interdicting nuclear weapons project [33], homeland security applications [110], [173], and location problems [3]. The bilevel problem, while offending, involves maximizing the damage caused to the opponent by taking into account the optimal reactions of the opponent. Conversely, while defending, the bilevel problem involves minimizing the maximum damage that an attacker can cause.

  6. Facility Location: Facility location problems may take the form of a Stackelberg game if a firm, while locating its facility, decides to account the actions of its competitors. For instance, Küçükaydin et al. [96] studied the scenario where a firm enters a market by locating new facilities, and its competitor reacts by adjusting the attractiveness of its existing facilities. Another study considers location of logistics distribution centers by minimizing the planners’ cost at the upper level and customers’ cost at the lower level [157]. Other applications of bilevel optimization to facility location problem may be found in [7], [35], [37], [40], [87], [113], [117], [127], [157], and [162].

  7. Inverse Optimal Control: Inverse optimal control problems are essentially bilevel in nature [6], [88], [122], [158] with wide applications in robotics, computer vision, communication theory and remote sensing to name a few. One of the major challenges in control theory is deriving the performance index or reward function which fits best on a given dataset. Such tasks lie in the category of inverse optimal control theory, where one solicits the calculation of the cause based on the given result. Such a requirement necessitates solving a parameter estimation problem with an optimal control problem.

  8. Machine Learning: Most of the machine learning and evolutionary optimization techniques often involve a number of parameters. A proper choice of these parameters has a substantial effect on the accuracy and efficiency of the approach. Tuning of these parameters is often achieved using brute force strategies, such as grid search and random search. A bilevel formulation of this problem allows for systematic and more efficient search when the number of parameters are large. Some of the approaches that have acknowledged the bilevel nature of this problem are [23], [24], [104], and [154].

  9. Principal-Agent Problems: Principal-agent problem [98] is a classical problem in economics, where a principal (leader) subcontracts a job to an agent (follower). Given that the agent prefers to act in his own interests rather than those of the principal, it becomes important for the principal to have an incentive scheme that aligns the interests of the agent with the principal. Design of such contracts appears as a bilevel optimization problem. In real life, principal-agent relationships are commonly found in doctor-patient, senior management-lower management, employer-employee, corporate board-shareholders, and politician-voters scenarios. Studies that the readers may refer to are [41], [74], [163], [179], and [187].

SECTION VIII.

Interest Over Time

In this section, we perform a text analysis of papers on bilevel programming (and Stackelberg games) that are indexed in SCOPUS. To begin with, we analyze the volume of publications every year on bilevel programming since 1950s to present, and then closely look at the themes within bilevel programming that have contributed to the growth over the years. The themes were discovered using a nonparametric Bayesian approach [160], which clusters the documents together based on similarities. The documents may probabilistically belong to multiple clusters at the same time.

Fig. 7 shows how the interest on bilevel programming has been growing at a slow pace until early 2000s and then picked-up significantly at the middle of the previous decade. The studies on bilevel programming using evolutionary algorithms appeared for the first time during the mid 1990s that took another decade to pick up to the extent that almost 10% of all studies on bilevel optimization utilize evolutionary methods. Fig. 7 shows the growth of evolutionary methods in the context of bilevel optimization from the 1990s to present. While the early papers on bilevel programming (pre-2000) were mainly focused on solution methods and optimality conditions, the growth in the post-2000 period was fueled by papers on applications of bilevel programming.

Fig. 7. - Interest on bilevel optimization over time.
Fig. 7.

Interest on bilevel optimization over time.

Fig. 8. - Interest on evolutionary bilevel optimization over time.
Fig. 8.

Interest on evolutionary bilevel optimization over time.

To identify the themes that have contributed toward the literature on bilevel optimization we used topic models. “Topic models are algorithms for discovering the main themes that pervade a large and otherwise unstructured collection of documents” [26]. These models can be used to organize unstructured collections as well as develop insights from large text databases that made it suitable for our purposes. The results from the topic model for the documents retrieved from SCOPUS are given in Figs. 9–​24. The figures consist of themes that have a volume at least to the order of around 1%. Along with the identification of themes, the analysis helped in determining the attention received by particular themes at any point in time.

Fig. 9. - Topic: methods.
Fig. 9.

Topic: methods.

Fig. 10. - Topic: optimality conditions.
Fig. 10.

Topic: optimality conditions.

Fig. 11. - Topic: classical game theory.
Fig. 11.

Topic: classical game theory.

Fig. 12. - Topic: network design.
Fig. 12.

Topic: network design.

Fig. 13. - Topic: supply chain applications.
Fig. 13.

Topic: supply chain applications.

Fig. 14. - Topic: optimal design applications.
Fig. 14.

Topic: optimal design applications.

Fig. 15. - Topic: electricity transmission applications.
Fig. 15.

Topic: electricity transmission applications.

Fig. 16. - Topic: telecommunication applications.
Fig. 16.

Topic: telecommunication applications.

Fig. 17. - Topic: business applications.
Fig. 17.

Topic: business applications.

Fig. 18. - Topic: computer architecture and circuit design.
Fig. 18.

Topic: computer architecture and circuit design.

Fig. 19. - Topic: hierarchical decision making applications.
Fig. 19.

Topic: hierarchical decision making applications.

Fig. 20. - Topic: environment applications.
Fig. 20.

Topic: environment applications.

Fig. 21. - Topic: facility location applications.
Fig. 21.

Topic: facility location applications.

Fig. 22. - Topic: railway applications.
Fig. 22.

Topic: railway applications.

Fig. 23. - Topic: machine learning applications.
Fig. 23.

Topic: machine learning applications.

Fig. 24. - Topic: defense applications.
Fig. 24.

Topic: defense applications.

Each figure contains a word cloud in the inset that describes the theme and volume of papers as the other inset that describes the number of papers published on that theme over the years. Interestingly, we observe that number of papers on classical bilevel methods and optimality conditions peaked during 1995–2000. Since 2000, a number of bilevel applications picked-up, for example, we see a growth in supply chain applications, electricity transmission applications, telecommunication applications, facility location applications, railway applications, and machine learning applications. Defense applications that appeared to be minimally present before 2000 show a significant presence later. Network design, optimal design, and business applications do not show a trend but represent the highest volume of applications on bilevel optimization.

SECTION IX.

Conclusion

In this concluding section, we will raise a few perspectives that have not yet received much attention but may offer interesting directions for future research. Apart from metamodeling-based techniques to solve bilevel problems, we would like to highlight the importance of being able to account for different forms of uncertainties that are often encountered when solving practical problems. Another interesting direction is concerned with scalability of bilevel algorithms and ability to leverage distributed computing platforms to handle large scale problems. Further information about evolutionary bilevel optimization can be found at http://www.bilevel.org.

A. Metamodeling-Based Algorithms

Though we have already highlighted the importance of metamodeling-based methods in an earlier section, we have decided to discuss it once again because of its potential in solving practical bilevel optimization problems. Bilevel optimization problems inherit a number of mappings, and any knowledge about the structure of these mappings can simplify the solution procedure extensively. In this paper, we have highlighted approaches that are based on approximating the reaction set mapping and the optimal lower level value function mapping. Knowing one of these mappings reduces any bilevel optimization problem to a single-level optimization problem. For solving large scale bilevel problems, one has to exploit the structure and properties of bilevel problems that are essentially contained in these mappings. Other ways of utilizing metamodeling for bilevel problems that we have discussed are: approximating the bilevel problem by bypassing the lower level problem completely; and utilizing auxiliary bilevel models to locally approximate a bilevel optimization problem. Only few studies in the context of evolutionary algorithms utilize such a strategies and offer opportunities for future contributions.

B. Multiobjective Bilevel Optimization and Decision Making

Multiobjective bilevel optimization has received only lukewarm interest from researchers. A number of issues, like decision interaction between the two levels and uncertainties in decision making, remain unexplored. It might be of interest for researchers working in the area of multicriteria decision making and multiobjective evolutionary optimization to explore how two levels of decision makers interact to arrive at a compromising or an equilibrium solution in different situations. Similarly, the notion of uncertain decision-maker’s preferences at one or both levels has also not received enough attention [151]. In the field of decision analysis, however, plenty of research has been carried out to extend traditional frameworks of decision making such as expected utility theory [124] and multiattribute utility theory [92] to account for uncertain preferences on, for instance, the tradeoffs among multiple decision objectives or the risk-attitude. However, preferential uncertainty in bilevel optimization problems still requires development of theory as well as methods to account for decision behavior in a hierarchical setting. The approaches that have been proposed so far are still very preliminary and require substantial future research.

C. Bilevel Optimization Under Variable Uncertainty

Another important research topic is concerned with the inherent uncertainty of decision variables. This poses several challenges for the existing more deterministic optimization frameworks that may fail to find solutions that are robust and sufficiently close to the optimal solutions. The fact that bilevel problems have nested optimization tasks makes the search of robust solutions substantially more challenging compared with single-level optimization problems. A few preliminary ideas for handling variable uncertainty have already been suggested [52], but the required algorithm side innovations that would make these problems accessible to practitioners are still missing.

D. Scaling of Evolutionary Bilevel Algorithms

Bilevel problems are well-known for being highly computationally intensive already before considering any types of uncertainties. One of the promising directions for handling larger bilevel problems could be the use of the recent distributed computing platforms such as Apache Spark project. The programming model of Spark is quite different from the Hadoop MapReduce framework, and it has managed to overcome many of the earlier limitations. In particular, its current form has turned out to support the use and development of iterative algorithms quite well. Therefore, it may be interesting to investigate whether this novel platform will be able to offer a helping hand for researchers and practitioners who seek to solve bigger bilevel problems faster.

Though considerable progress has already been made during the last few years, evolutionary bilevel optimization is still a relatively young field with numerous opportunities for both computational as well as theoretical innovations. The growing availability of algorithms is also opening the field to more applied research, and we believe that in the future we are likely to see a considerable amount of novel applications.

ACKNOWLEDGMENT

Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

References

1.
E. Aiyoshi and K. Shimizu, "Hierarchical decentralized systems and its new solution by a barrier method", IEEE Trans. Syst. Man Cybern., vol. 11, no. 6, pp. 444-449, Jun. 1981.
2.
E. Aiyoshi and K. Shimizu, "A solution method for the static constrained Stackelberg problem via penalty method", IEEE Trans. Autom. Control, vol. 29, no. 12, pp. 1111-1114, Dec. 1984.
3.
D. Aksen and N. Aras, "A bilevel fixed charge location model for facilities under imminent attack", Comput. Oper. Res., vol. 39, no. 7, pp. 1364-1381, 2012.
4.
D. Aksen and N. Aras, "A matheuristic for leader–follower games involving facility location-protection-interdiction decisions" in Metaheuristics for Bi-Level Optimization, Heidelberg, Germany:Springer, pp. 115-151, 2013.
5.
F. Al-Khayyal, R. Horst and P. M. Pardalos, "Global optimization of concave functions subject to quadratic constraints: An application in nonlinear bilevel programming", Ann. Oper. Res., vol. 34, no. 1, pp. 125-147, 1992.
6.
S. Albrecht et al., "Imitating human reaching motions using physically inspired optimization principles", Proc. 11th IEEE RAS Int. Conf. Humanoid Robots (Humanoids), pp. 602-607, 2011.
7.
E. Alekseeva, N. Kochetova, Y. Kochetov and A. Plyasunov, "A hybrid memetic algorithm for the competitive p-median problem", IFAC Proc. Vol., vol. 42, no. 4, pp. 1533-1537, 2009.
8.
M. A. Amouzegar and K. Moshirvaziri, "Determining optimal pollution control policies: An application of bilevel programming", Eur. J. Oper. Res., vol. 119, no. 1, pp. 100-120, 1999.
9.
B. An et al., "A deployed quantal response-based patrol planning system for the U.S. coast guard", Interfaces, vol. 43, no. 5, pp. 400-420, 2013.
10.
G. Anandalingam and T. L. Friesz, "Hierarchical optimization: An introduction", Ann. Oper. Res., vol. 34, no. 1, pp. 1-11, 1992.
11.
J. S. Angelo, E. Krempser and H. J. C. Barbosa, "Differential evolution for bilevel programming", Proc. Congr. Evol. Comput. (CEC), pp. 470-477, 2013.
12.
J. S. Angelo and H. J. C. Barbosa, "A study on the use of heuristics to solve a bilevel programming problem", Int. Trans. Oper. Res., vol. 22, no. 5, pp. 861-882, 2015.
13.
J. S. Angelo, E. Krempser and H. J. C. Barbosa, "Differential evolution assisted by a surrogate model for bilevel programming problems", Proc. IEEE Congr. Evol. Comput. (CEC), pp. 1784-1791, 2014.
14.
J. M. Arroyo and F. J. Fernández, "A genetic algorithm approach for the analysis of electric grid interdiction with line switching", Proc. 15th Int. Conf. Intell. Syst. Appl. Power Syst. (ISAP), pp. 1-6, 2009.
15.
B. Bank, J. Guddat, D. Klatte, B. Kummer and K. Tammer, Non-Linear Parametric Optimization, Basel, Switzerland:Birkhäuser, 1983.
16.
J. F. Bard and J. E. Falk, "An explicit solution to the multi-level programming problem", Comput. Oper. Res., vol. 9, no. 1, pp. 77-100, 1982.
17.
J. F. Bard and J. T. Moore, "A branch and bound algorithm for the bilevel programming problem", SIAM J. Sci. Stat. Comput., vol. 11, no. 2, pp. 281-292, 1990.
18.
J. F. Bard, Practical Bilevel Optimization: Algorithms and Applications, Dordrecht, The Netherlands:Kluwer, 1998.
19.
J. F. Bard and J. T. Moore, "An algorithm for the discrete bilevel programming problem", Naval Res. Logistics, vol. 39, no. 3, pp. 419-435, 1992.
20.
O. Ben-Ayed, "Bilevel linear programming", Comput. Oper. Res., vol. 20, no. 5, pp. 485-501, 1993.
21.
O. Ben-Ayed, C. E. Blair, D. E. Boyce and L. J. LeBlanc, "Construction of a real-world bilevel linear programming model of the highway network design problem", Ann. Oper. Res., vol. 34, no. 1, pp. 219-254, 1992.
22.
M. P. Bendsøe, Optimization of Structural Topology Shape and Material, Heidelberg, Germany:Springer, vol. 414, 1995.
23.
K. P. Bennett, J. Hu, X. Ji, G. Kunapuli and J.-S. Pang, "Model selection via bilevel optimization", Proc. Int. Joint Conf. Neural Netw. (IJCNN), pp. 1922-1929, 2006.
24.
K. P. Bennett, G. Kunapuli, J. Hu and J.-S. Pang, "Bilevel optimization and machine learning" in Computational Intelligence: Research Frontiers, Heidelberg, Germany:Springer, pp. 25-47, 2008.
25.
W. F. Bialas and M. H. Karwan, "Two-level linear programming", Manag. Sci., vol. 30, no. 8, pp. 1004-1020, 1984.
26.
D. M. Blei, "Probabilistic topic models", Commun. ACM, vol. 55, no. 4, pp. 77-84, 2012.
27.
M. Bostian, G. Whittaker, A. Sinha and B. Barnhart, "Incorporating data envelopment analysis solution methods into bilevel multi-objective optimization", Proc. IEEE Congr. Evol. Comput. (CEC), pp. 1667-1674, 2015.
28.
M. Bostian, G. Whittaker, B. Barnhart, R. Färe and S. Grosskopf, "Valuing water quality tradeoffs at different spatial scales: An integrated approach using bilevel optimization", Water Resources Econ., vol. 11, pp. 1-12, Jul. 2015.
29.
J. Bracken and J. McGill, "Mathematical programs with optimization problems in the constraints", Oper. Res., vol. 21, no. 1, pp. 37-44, 1973.
30.
J. Bracken and J. T. McGill, "Defense applications of mathematical programs with optimization problems in the constraints", Oper. Res., vol. 22, no. 5, pp. 1086-1096, 1974.