Introduction 介绍
Many large-scale optimization and decision-making processes faced by public and private organizations are hierarchical in the sense that the realized outcome of any solution or decision taken by the upper level authority (leader) to optimize their goals is affected by the response of lower level entities (follower), who will seek to optimize their own outcomes. Fig. 1 illustrates a general bilevel problem solving structure involving interlinked optimization and decision-making tasks at both levels. The figure shows that for any given upper level decision vector, there is a corresponding (parametric) lower level optimization problem to be solved that provides the rational (optimal) response of the follower for the leader’s decision. The leader’s decision vector is represented by
公共和私人组织面临的许多大规模优化和决策过程都是分层的,因为上级机构(领导者)为优化其目标而采取的任何解决方案或决策的实现结果都受到下级实体(追随者)的反应的影响,他们将寻求优化自己的结果。 Fig. 1 说明了一般的双层问题解决结构,涉及两个层次的相互关联的优化和决策任务。该图显示,对于任何给定的上层决策向量,都有一个相应的(参数化)下层优化问题需要解决,该问题为领导者的决策提供了追随者的合理(最优)响应。领导者的决策向量用
It is not uncommon that the objectives of generally profit-seeking private agents can well be in conflict with those of the controlling authority. What makes such bilevel problem-solving tasks highly relevant is that they are typically characterized by very large spillover effects to the economy as well as the surrounding environment. Given the far-reaching future impacts of the decisions, it is not surprising that the interest toward bilevel programming has grown strong especially among researchers and practitioners dealing with large-scale public sector decision-making problems. For instance, farmers often tend to overuse fertilizers to increase the productivity, which leads to negative externalities such as pollution. Whittaker et al. [176] used a bilevel model to design policy measures to control the overuse of fertilizers and its negative impact on the environment. Apart from problems concerned with environmentally sensitive decisions (such as allocation of mining permits or controlling the use of fertilizers), there has been widespread interest across a number of fields in operations research. A good example is homeland security, where bilevel as well as trilevel optimization frameworks have been utilized in problems ranging from interdiction of nuclear-weapons projects to defending critical infrastructure and solving border security problems [9], [32], [33], [173]. In addition to the public sector challenges, there is abundant research on bilevel decision-making problems in economics, logistics, as well as diverse areas of computer science.
一般追求利润的私人代理人的目标很可能与控制当局的目标相冲突,这种情况并不少见。使这种双层次问题解决任务高度相关的是,它们通常具有对经济和周围环境非常大的溢出效应。鉴于这些决定对未来的深远影响,对双层次规划的兴趣日益浓厚也就不足为奇了,尤其是在处理大规模公共部门决策问题的研究人员和从业人员中。例如,农民往往倾向于过度使用化肥来提高生产力,从而导致污染等负面外部性。Whittaker等人 [176] 使用双层次模型设计了政策措施,以控制肥料的过度使用及其对环境的负面影响。除了与环境敏感决策有关的问题(例如采矿许可证的分配或控制化肥的使用)外,运筹学的许多领域都引起了广泛的兴趣。一个很好的例子是国土安全,其中双层和三层优化框架已被用于从拦截核武器项目到保卫关键基础设施和解决边境安全问题 [9] 等问题 [173] 。 [32] [33] 除了公共部门的挑战外,还有关于经济学、物流以及计算机科学不同领域的双层次决策问题的大量研究。
The research on decision-making problems with hierarchical leader-follower structures (bilevel optimization) can be traced to two roots. The first root is in the domain of game theory, where von Stackelberg [156] used bilevel programming to build descriptive models of decision behavior and establish game-theoretic equilibria. The second root is in the domain of mathematical programming, where the problems appeared as bilevel optimization problems containing a nested inner optimization problem as a constraint of an outer optimization problem [29]. Since then a substantial body of mathematical literature on bilevel optimization has emerged. Given that the hierarchical optimization structure may introduce difficulties such as nonconvexity and disconnectedness even for simpler instances of bilevel optimization, the problems have turned out to be surprisingly difficult to handle mathematically. Bilevel programming is known to be strongly NP-hard [79], and it has been proven that merely evaluating a solution for optimality is also an NP-hard task [165]. Even in the simplest case of linear bilevel programs, where the lower level problem has a unique optimal solution for all the parameters, it is not likely to find a polynomial algorithm that is capable of solving the linear bilevel program to global optimality. The proof for the nonexistence of a polynomial time algorithm for linear bilevel problems can be found in [63].
Due to lack of well-established solution procedures, a complex practical problem is usually modified into simpler single level optimization task, which is solved to arrive at a satisfying instead of an optimal solution. For the complex bilevel problems, classical methods often fail due to real world difficulties such as nonlinearity and discreteness. Under such circumstances, evolutionary methods can be useful tools to offset some of these difficulties. Recent initiatives on bilevel optimization using evolutionary algorithms suggest that a coordinated effort on bilevel optimization by the evolutionary community could help make significant progress on this challenging class of optimization problems (e.g., [11], [56], [144], and [151]).
Fig. 2 provides a network map of different themes on bilevel optimization that have been studied since 1950s. The network map shows different theoretical and application topics that have evolved under bilevel optimization. Each link in the map connects a subtopic with a higher level topic that are differentiated by font sizes. Subtopics connected with a link denote an overlap. To provide a more comprehensive overview on the past as well as recent developments in the field of bilevel optimization, we have organized this review paper along the three lines; theory, applications, and text-analysis of the entire bilevel literature body. First, to formalize the notion of bilevel programming, we begin by introducing a few central definitions and discuss the differences between optimistic and pessimistic formulations of bilevel problems. Once the common terminology has been established, we offer an overview on the algorithms that have been proposed for bilevel optimization. After a brief coverage of the commonly used classical approaches (e.g., descent methods, penalty function methods, and trust region methods), we move on to discuss the developments in the field of evolutionary computation, discrete bilevel optimization and multiobjective bilevel optimization. The method sections are followed by a review on the central application areas. Finally, we study the research topics, and the evolution of interest over time. The entire bilevel literature is divided into topics and a time series analysis across each topic is performed. The text-analysis performed in this paper is based on a recently developed nonparametric topic model [26], [160] for analyzing unstructured information. The technical details on the automated text-analysis approach are provided in an appendix. This paper concludes with a brief discussion on the directions for future research.
General Formulation and Definitions
In this section, we provide a general formulation for bilevel optimization problem. These problems contain two levels of optimization tasks where one optimization task is nested within the other. The outer optimization problem is commonly referred as the leader’s (upper level) optimization problem and the inner optimization problem is known as the follower’s (or lower level) optimization problem. The two levels have their own objectives and constraints. Correspondingly, there are also two classes of decision vectors, namely, leader’s (upper level) decision vectors and follower’s (lower level) decision vectors. The lower level optimization is a parametric optimization problem that is solved with respect to the lower level decision vectors while the upper level decision vectors act as parameters. The lower level optimization problem is a constraint to the upper level optimization problem, such that, only those members are considered feasible that are lower level optimal and also satisfy the upper level constraints. A summary of the terminologies and notations used in the context of bilevel optimization is given in Table I.
Definition 1:
For the upper-level objective function
An equivalent formulation of the above problem can be stated in terms of set-valued mapping (multivalued function) as follows.
Definition 2:
Let
In the above two definitions, quotes have been used while specifying the upper level minimization problem because of an ambiguity that arises in case of multiple lower level optimal solutions for any given upper level decision vector. In the presence of multiple lower level optimal solutions there is lack of clarity at the upper level as to which optimal solution from the lower level should be utilized. This ambiguity can be sorted by defining different positions that may be assumed by the leader. The two common positions that have been widely studied are optimistic (weak) position and pessimistic (strong) position, which we discuss next.
A. Optimistic Position
In an optimistic position, in the presence of multiple lower level optimal solutions, the leader expects the follower to choose that solution from the optimal set
Optimistic position is more tractable as compared to the pessimistic position; therefore, most of the studies handle optimistic version of the bilevel optimization problem. The optimistic formulation is guaranteed to have an optimal solutions under reasonable assumptions of regularity and compactness that are stated in the theorem below.
Theorem 1:
If the functions
See [58], [60], [80], [105], [106], [126] for further discussion on existence of optimistic bilevel optimum and additional results on optimality conditions.
B. Pessimistic Position
In a pessimistic position, in the presence of multiple lower level optimal solutions, the leader optimizes for the worst case, i.e., she assumes that the follower may choose that solution from the optimal set which leads to the worst objective function value at the upper level. Such a worst case choice function of the follower may be defined as
Theorem 2:
If the functions
For discussion on existence of pessimistic bilevel optimum and additional results on optimality conditions, the authors may refer to [60], [62], [109], [111], and [177].
C. Example
Below we provide a simple example of a bilevel optimization problem [71] that arises in case of two firms in a Stackelberg competition. The leader has complete knowledge about the follower’s inverse-demand function and the cost function, and desires to maximize it’s own profits by taking into account the actions of the follower firm. The two firms compete solely by choosing their production levels that maximize their profits (
By assuming that the firms produce and sell homogeneous goods, we may assume a single linear price function for both firms as an inverse demand function of the form
The optimal level of production of the leader
Classical Approaches
In this section, we provide a brief overview of the classical algorithms that have been proposed for bilevel optimization. Given the difficult nature of bilevel problems, it is not surprising that much of the classical literature considers bilevel problems that are mathematically well-behaved; i.e., contains functions that are linear, quadratic or convex. Strong assumptions like continuous differentiability and lower semicontinuity are quite common. A significant amount of attention has been given to linear bilevel optimization problems with continuous [20], [174] and combinatorial [166] variables. For more complex bilevel problems, the readers may refer [10] and [48].
A. Single-Level Reduction
When the lower level problem is convex and sufficiently regular, it is possible to replace the lower level optimization problem with its Karush-Kuhn–Tucker (KKT) conditions. The KKT conditions appear as Lagrangian and complementarity constraints, and reduce the overall bilevel optimization problem to a single-level constrained optimization problem. For example, the problem in Definition 1 can be reduced to the following form, when the convexity and regularity conditions at the lower level are met:
Interestingly, for linear bilevel optimization problems, the Lagrangian constraint is also linear. Therefore, the single-level optimization problem is a mixed integer linear program. Approaches based on vertex enumeration [25], [45], [161], as well as branch-and-bound (B&B) [16], [70] have been used to solve these problems. It is noteworthy that B&B methods constitute an exponentially slow algorithm with the number of integer variables. But, B&B approaches have been successfully applied to single-level reductions of linear-quadratic [17] and quadratic-quadratic [5], [64] bilevel problems. An extended KKT approach has also been considered [138] for handling linear bilevel problems.
B. Descent Methods
In addition to KKT-based approaches, a number of descent methods have been proposed for solving bilevel optimization problems. A descent direction in bilevel optimization leads to decrease in upper level function value while keeping the new point feasible. Given that a point is considered feasible only if it is lower level optimal, finding the descent direction can be quite challenging. To resolve the problem, researchers have investigated ways to approximate the gradient of the upper level objective [95] as well as considered formulation of auxiliary programs [136], [165] to determine the direction of descent.
C. Penalty Function Methods
In penalty function methods the bilevel optimization problem is handled by solving a series of unconstrained optimization problems. The unconstrained problem is generated by adding a penalty term that measures the extent of violation of the constraints. The penalty term often requires a parameter and takes the value zero for feasible points and positive (minimization) for infeasible points. For bilevel problems, the first attempt toward using a penalized approach was made by Aiyoshi and Shimizu [1], [2]. They replaced the lower level problem by a penalized problem; however, the bilevel hierarchy was still maintained and the reduced problem was still difficult to solve. Later double penalty method was introduced in [84], where both upper and lower level objective functions were penalized. The problem was reduced into a single level task by replacing the penalized lower level problem with its KKT conditions, and then solving the single level formulation by penalization. In a number of studies, the lower level problem is directly replaced by its KKT conditions and then a penalized approach is used to solve the single level problem. Few studies where penalty function approach has been used for linear bilevel problems are [112] and [175]. White and Anandalingam [175] converted the linear bilevel program into a penalized bilinear optimization problem, and then solve a series of bilinear problems to find the optimum. Lv et al.. [112] reduced the linear bilevel program into single level using KKT conditions, and then append the complementary slackness condition to the upper level objective function with a penalty. The penalized problem is then handled using a series of linear programs.
D. Trust-Region Methods
In trust-region methods, the algorithms approximate a certain region of the objective function with a model function. The region is expanded if the approximation is good, otherwise it is contracted. The first study using trust-region method to solve nonlinear bilevel programs was presented in [108], where the lower level problem had a convex objective function and linear constraints. However, no upper level constraints were considered. Later, a more general idea was proposed in [116], where the authors locally approximate the bilevel program with a model involving a linear program at the upper level and a linear variational inequality at the lower level. Trust-region and line search ideas have been combined to approach the bilevel optimum over iterations. Similarly, Colson et al. [50] approximated the bilevel program around an iterate with a model that itself is a linear-quadratic bilevel program. The authors proposed to solve the linear-quadratic bilevel program using a mixed integer solver after reducing it to a single level problem using its lower level KKT conditions. Convergence is achieved by sequentially solving linear-quadratic bilevel models.
Next, we discuss about evolutionary algorithms for bilevel optimization. At this point, we would like to refer the readers to other review papers [49], [57], [90], [150], [164] and books [18], [60], [61], [140], [159] on bilevel optimization.
Evolutionary Approaches
A. Nested Methods
Nested evolutionary algorithms are a popular approach to handle bilevel problems, where lower level optimization problem is solved corresponding to each and every upper level member [153]. Though effective, nested strategies are computationally very expensive and not viable for large scale bilevel problems. Nested methods in the area of evolutionary algorithms have been used in primarily two ways. The first approach has been to use an evolutionary algorithm at the upper level and a classical algorithm at the lower level, while the second approach has been to utilize evolutionary algorithms at both levels. Of course, the choice between two approaches is determined by the complexity of the lower level optimization problem.
One of the first evolutionary algorithms for solving bilevel optimization problems was proposed in the early 1990s. Mathieu et al. [118] used a nested approach with genetic algorithm at the upper level, and linear programming at the lower level. Another nested approach was proposed in [184], where the upper level was an evolutionary algorithm and the lower level was solved using Frank-Wolfe algorithm (reduced gradient method) for every upper level member. The authors demonstrated that the idea can be effectively utilized to solve nonconvex bilevel optimization problems.
Nested particle swarm optimization (PSO) was used in [103] to solve bilevel optimization problems. The effectiveness of the technique was shown on a number of standard test problems with small number of variables, but the computational expense of the nested procedure was not reported. A hybrid approach was proposed in [102], where simplex-based crossover strategy was used at the upper level, and the lower level was solved using one of the classical approaches. The authors report the generations and population sizes required by the algorithm that can be used to compute the upper level function evaluations, but they do not explicitly report the total number of lower level function evaluations, which presumably is high.
Differential evolution (DE) based approaches have also been used, for instance, Zhu et al. [188] used DE at the upper level and relied on the interior point algorithm at the lower level; similarly, Angelo et al. [11] have used DE at both levels. Authors have also combined two different specialized evolutionary algorithms to handle the two levels, for example, Angelo and Barbosa [12] used an ant colony optimization to handle the upper level and DE to handle the lower level in a transportation routing problem. Another nested approach utilizing ant colony algorithm for solving a bilevel model for production-distribution planning is [36]. Scatter search algorithms have also been employed for solving production-distribution planning problems, for instance [39].
Through a number of approaches involving evolutionary algorithms at one or both levels, the authors have demonstrated the ability of their methods in solving problems that might otherwise be difficult to handle using classical bilevel approaches. However, as already stated, most of these approaches are practically nonscalable. With increasing number of upper level variables, the number of lower level optimization tasks required to be solved increases exponentially. Moreover, if the lower level optimization problem itself is difficult to solve, numerous instances of such a problem cannot be solved, as required by these methods.
B. Single-Level Reduction
The idea behind single-level reduction, in the context of evolutionary algorithms, is similar to the discussion in Section III-A. A number of researchers in the area of evolutionary computation have also used the KKT conditions of the lower level to reduce the bilevel problem into a single-level problem. Most often, such an approach is able to solve problems that adhere to certain regularity conditions at the lower level because of the requirement of the KKT conditions. However, as the reduced single-level problem is solved with an evolutionary algorithm, usually the upper level objective function and constraints can be more general and not adhering to such regularities. For instance, one of the earliest papers using such an approach is by Hejazi et al. [82], who reduced the linear bilevel problem to single-level and then used a genetic algorithm, where chromosomes emulate the vertex points, to solve the problem. Wang et al. [171] reduced the bilevel problem into a single-level optimization problem using KKT conditions, and then utilized a constraint handling scheme to successfully solve a number of standard test problems. Their algorithm was able to handle nondifferentiability at the upper level objective function, but not elsewhere. Later on, Wang et al. [172] introduced an improved algorithm that was able to handle nonconvex lower level problem and performed better than the previous approach [171]. However, the number of function evaluations in both approaches remained quite high (requiring function evaluations to the tune of 100 000 for 2–5 variable bilevel problems). Wang et al. [169] used a simplex-based genetic algorithm to solve linear-quadratic bilevel problems after reducing it to a single level task. More recently, Jiang et al. [86] reduced the bilevel optimization problem into a nonlinear optimization problem with complementarity constraints, which is sequentially smoothed and solved with a PSO algorithm. Along similar lines of using lower level optimality conditions, Li [101] solved a fractional bilevel optimization problem by utilizing optimality results of the linear fractional lower level problem. Wan et al. [167] embedded the chaos search technique in PSO to solve single-level reduced problem.
C. Metamodeling-Based Methods
Metamodeling-based solution methods are commonly used for optimization problems [168], where actual function evaluations are expensive. A meta-model or surrogate model is an approximation of the actual model that is relatively quicker to evaluate. Based on a small sample from the actual model, a surrogate model can be trained and used subsequently for optimization. Given that, for complex problems, it is hard to approximate the entire model with a small set of sample points, researchers often resort to iterative meta modeling techniques, where the actual model is approximated locally during iterations.
Bilevel optimization problems contain an inherent complexity that leads to a requirement of large number of evaluations to solve the problem. Metamodeling, when used with population-based algorithms, offers a viable means to handle bilevel optimization problems. In this section, we discuss four ways in which metamodeling can be applied to bilevel optimization. The discussion related to approximation of the rational reaction set and lower level optimal value function is a review of some recent work. However, before starting, we refer the readers to Fig. 3, which provides an understanding of these two mappings graphically for a hypothetical bilevel problem. We also provide a brief discussion on approximating the bilevel problem with an auxiliary problem.
1) Reaction Set Mapping:
One of the approaches to solve bilevel optimization problems using evolutionary algorithms would be through iterative approximation of the reaction set mapping
2) Optimal Lower Level Value Function:
Another way to use metamodeling would be through the approximation of the optimal value function
3) Bypassing Lower Level Problem:
Another way to use a meta-model in bilevel optimization would be completely by-pass the lower level problem, as follows:
4) Auxiliary Bilevel Meta-Model:
Building up on the trust-region methods for solving bilevel optimization problems, it is possible to utilize the population members in an evolutionary algorithm to formulate auxiliary bilevel problem(s). The auxiliary bilevel problem(s) may be simple enough to be solved using faster specialized techniques. The population members could then be updated based on the obtained auxiliary solution(s). For the moment, there does not exist any evolutionary algorithm based on this idea, but it may be an interesting direction to pursue in the future.
Discrete Bilevel Optimization
In this section, we would like to discuss the contributions made toward solving discrete bilevel optimization problems. The formulation of the bilevel problem remains the same as described in Definitions 1 and 2, along with one or more variables at either of the levels being discrete. Presence of discrete variables can pose a variety of challenges depending upon, whether the discrete variables are present at upper level, lower level or both levels. In the classical literature, B&B and branch-and-cut are some of the commonly used techniques to handle discreteness in variables. Most of the work on discrete bilevel optimization employs an extension of these ideas from single-level optimization. To highlight the kind of complexities induced in the presence of discrete variables, we consider a simple linear bilevel problem described in [18] and [166] to show how the inducible region changes based on the upper, lower or both level variables being discrete or continuous.
Consider the following lower level optimization problem which we use for identifying the inducible region of a bilevel problem:
Continuous-Continuous Bilevel Program: Consider
andxu∈R .xl∈R Discrete-Continuous Bilevel Program: Consider
andxu∈Z .xl∈R Discrete-Discrete Bilevel Program: Consider
andxu∈Z .xl∈Z Continuous-Discrete Bilevel Program: Consider
andxu∈R .xl∈Z
A. Discrete Bilevel Optimization Survey
One of the early works on discrete bilevel optimization was by Vicente et al. [166], which focused on discrete linear bilevel programs, and analyzed the properties and existence of the optimal solution for different kinds of discretizations arising from the upper and lower level variables. The authors have shown in this paper that certain compactness conditions guarantee the existence of optimal solution in continuous-continuous linear bilevel programs, discrete-continuous linear bilevel programs, and discrete-discrete linear bilevel programs. The conditions are equivalent to stating that the inducible region is nonempty. However, the existence conditions in the case of continuous-discrete linear bilevel programs are not straightforward. For instance, the inducible region for the continuous-discrete linear bilevel problem in Fig. 4 is a noncompact set that may lead to nonexistence of a bilevel optimal solution even when the inducible region is nonempty.
A few studies preceded the study by Vicente et al. [166]. For instance, Moore and Bard [123] solved mixed integer linear bilevel problems. The authors pointed out the difficulties involved in fathoming while solving mixed integer bilevel problems using traditional B&B techniques. Certain fathoming rules used in case of mixed integer linear programming, like fathoming when the relaxed subproblem is worse than the value of the incumbent or fathoming when the solution of the relaxed subproblem is feasible for the mixed integer problem, are not directly applicable to mixed integer linear bilevel problems. Therefore, the authors proposed a B&B approach involving stricter fathoming conditions. However, the algorithm has a nested structure and is not scalable beyond few integer variables, and to counter which the authors also proposed some heuristics. This paper was followed by [19], where the same authors solved discrete linear bilevel programs involving only binary variables using an implicit enumeration scheme. In this approach, the authors place a cut, similar to the one used by Bialas and Karwan [25] (for the continuous linear bilevel program), seeking incremental improvements in the upper level objective function. A cutting plane method utilizing the Chvátal-Gomory cut for the continuous-discrete bilevel program was proposed in [59]. Benders-decomposition-based techniques have also been employed to solve bilevel problems with mixed integers at the upper level and continuous linear programs at the lower level. The original problem is decomposed into a master and a slave problem. Fixing the integer values converts the slave problem into a bilevel linear program which is solved by KKT-based reduction techniques, and the solution to the slave is utilized to create a cut for the master problem. The algorithm switches between master and slave problems until the optimality criteria is met. Certain studies in this direction are [40], [69], and [135].
Despite the attempts made toward algorithm development for discrete bilevel programs, the research is still open for new methods and ideas as none of the proposed techniques would scale well for problems with larger number of variables. Apart from a few nested approaches and KKT-based single level reduction approaches, to our best knowledge, there does not exist any algorithmic study involving evolutionary algorithms for mixed integer bilevel problems that attempt to solve the problem efficiently by utilizing its properties. However, given that the evolutionary approaches are, in particular, potent for handling difficulties such as discreteness and nondifferentiabilities they offer a significant scope for solving discrete bilevel optimization problems. Some attempts toward mixed integer bilevel optimization using evolutionary methods are [4], [14], [37], [43], [78], [81], [100], and [120].
There is no dearth of application problems involving discrete variables. While mixed integer bilevel problems are ubiquitous, even combinatorial bilevel programs find innumerable applications in the areas of network design, facility location, hub-and-spoke networks, etc. In these areas, these problems are commonly studied in the context of interdiction, protection, robust design, competition and supply chain management, among others. Some of the related applications are highlighted in Section VII.
Multiobjective Bilevel Optimization
In many of the practical problems, a leader and/or the follower might face multiple objectives. This gives rise to multiobjective bilevel optimization problems that we define below.
Definition 3:
For the upper-level objective function
The set-valued mapping in this case can be defined as follows and an equivalent definition can be written as in the single-objective case:
A. Optimistic Versus Pessimistic
The optimistic or pessimistic position becomes more prominent in multiobjective bilevel optimization. In the presence of multiple objectives at the lower level, the set-valued mapping
Though optimistic positions have commonly been studied in classical [66] and evolutionary [56] literature in the context of multiobjective bilevel optimization; it is far from realism to expect that the follower will cooperate to an extent that she chooses any point from her PO frontier that is most suitable for the leader. This relies on the assumption that the follower is indifferent to the entire set of optimal solutions, and therefore decides to cooperate. The situation was entirely different in the single-objective case, where, in case of multiple optimal solutions, all the solutions offered an equal value to the follower. However, this cannot be assumed in the multiobjective case. Solution to the optimistic formulation in multiobjective bilevel optimization leads to the best possible PO frontier that can be achieved by the leader. Similarly, solution to the pessimistic formulation leads to the worst possible PO frontier at the upper level.
If the value function or the choice function of the follower is known to the leader, it provides an information as to what kind of tradeoff is preferred by the follower. A knowledge of such a function effectively, casually speaking, reduces the lower level optimization problem into a single-objective optimization task, where the value function may be directly optimized. The leader’s PO frontier for such intermediate positions lies between the optimistic and the pessimistic frontiers. Fig. 5 shows the optimistic and pessimistic frontiers for a hypothetical multiobjective bilevel problem with two objectives at upper and lower levels. Follower’s frontier corresponding to
B. Example
Below, we provide a bilevel optimization problem involving design of a tax policy [152]. The upper level in this example is the government that wants to tax the lower level, a mining company, based on the pollution it causes to the environment. The government here has two objectives: the first objective is to maximize the revenues generated by the mining project, which may include the additional jobs, taxes, etc.; and the second objective is to minimize the harm caused to the environment as a result of mining. Obviously, there is a tradeoff between the two objectives, and the government as a decision maker needs to choose one of the preferred tradeoff solutions. The mining company has a sole objective of maximizing its profit under the constraints set by the government. In this scenario, the government would like to have a tax structure such that it is able to maximize its own revenues in addition to being able to restrain the mining company from causing extensive damage to the environment. It is possible for the leader to optimally regulate the problem in its favor, provided that it has complete knowledge of the follower’s strategies. The hierarchical optimization problem in this case can be formulated as follows:
In (9), the first objective deals with the tax revenue, where
Equation (10) gives the profit of the mine, where
We assume the parameters as
C. Multiobjective Bilevel Optimization Survey
There exists a significant amount of work on single objective bilevel optimization; however, little has been done on multiobjective bilevel optimization primarily because of the computational and decision making complexities that these problems offer. For results on optimality conditions in multiobjective bilevel optimization, the readers may refer to [15], [72], and [182]. On the methodology side, Eichfelder [65], [66] solved simple multiobjective bilevel problems using a classical approach. The lower level problems in these studies have been solved using a numerical optimization technique, and the upper level problem is handled using an adaptive exhaustive search method. This makes the solution procedure computationally demanding and nonscalable to large-scale problems. In another study, Shi and Xia [139] used
One of the first studies, utilizing an evolutionary approach for multiobjective bilevel optimization was by Yin [184]. The study involved multiple objectives at the upper lever, and a single objective at the lower level. The study suggested a nested genetic algorithm, and applied it on a transportation planning and management problem. Later Halter and Mostaghim [76] used a PSO-based nested strategy to solve a multicomponent chemical system. The lower level problem in their application was linear for which they used a specialized linear multiobjective PSO approach. Recently, a hybrid bilevel evolutionary multiobjective optimization algorithm coupled with local search was proposed in [56] (for earlier versions, refer [53]–[55] and [142]). In this paper, the authors handled nonlinear as well as discrete bilevel problems with relatively larger number of variables. The study also provided a suite of test problems for bilevel multiobjective optimization.
There has been some work done on decision making aspects at upper and lower levels. For example, in [141] an optimistic version of multiobjective bilevel optimization, involving interaction with the upper level decision maker, has been solved. The approach leads to the most preferred point at the upper level instead of the entire Pareto-frontier. Since multiobjective bilevel optimization is computationally expensive, such an approach was justified as it led to enormous savings in computational expense. Studies that have considered decision making at the lower level include [146] and [151]. Sinha et al. [146] have replaced the lower level with a value function that effectively reduces the lower level problem to single-objective optimization task. In [151], the follower’s value function is known with uncertainty, and the authors propose a strategy to handle such problems. Other work related to bilevel multiobjective optimization can be found in [107], [128], [129], [134], and [187].
Applications
Bilevel optimization commonly appears in many practical problems. They are often encountered in the fields of economics, transportation, engineering, and management, among others. The following list will provide an insight to the readers on the relevance of these problems to practice.
Toll Setting Problem: Toll-setting problem is essentially a part of network problems. In this problem, there is an authority that wants to optimize the tolls for a network of roads. The authority acts as a leader and the network users act as followers. Papers on toll-setting problem and its multiobjective extensions can be found in [31], [51], [67], [75], [89], [91], [97], [115], [121], [147], [170], and [185]. Bilevel optimization is quite commonly used in network design problems. Instead of going through specific problems, we refer the readers to a variety of applications of bilevel optimization to the area of network design [21], [38], [42], [44], [68], [73], [99], [114], [119], [133], [178], [180], [181], [184].
Environmental Economics: Bilevel optimization commonly appears in environmental economics, where an authority wants to tax an organization or individual that is polluting the environment as a result of its operations. Finding an optimal level of tax that offers a compromise between revenues and pollution results in a bilevel optimization problem with the regulator as the leader and the polluting entity as a follower [8], [27], [28], [152], [176].
Chemical Industry: In chemical industries, the chemists often face a bilevel optimization problem where they have to decide upon the conditions (state variables and quantity of reactants) for the reaction to achieve optimal output. While optimizing the output is the upper level problem, the lower level appears as an equilibrium condition, which is an entropy functional minimization problem. Such applications of bilevel optimization can be found in [47], [77], [130], and [155].
Optimal Design: Bilevel problems are very common in structural optimization or optimal shape design. For instance, in structural optimization one often requires to minimize the weight or cost of a structure as an upper level objective with the decision variables as shape of the structure, choice of materials, amount of material, etc. The constraints at the upper level involve bounds on displacements, stresses and contact forces whose values are determined by solving the potential energy minimization problem at the lower level. The equilibrium condition in many optimal shape design problems appears in the form of variational inequalities which require the overall problem to formulated as a two level task. For optimal design applications, the readers may refer to [22], [46], [83], [93], and [94].
Defense Applications: Bilevel optimization has a number of applications in the defense sector [30], for example attacker-defender Stackelberg games [9], [34], [85], [125], [131], [132], [137]. Specifically, some recent applications include planning the prepositioning of defensive missile interceptors to counter an attack threat [32], interdicting nuclear weapons project [33], homeland security applications [110], [173], and location problems [3]. The bilevel problem, while offending, involves maximizing the damage caused to the opponent by taking into account the optimal reactions of the opponent. Conversely, while defending, the bilevel problem involves minimizing the maximum damage that an attacker can cause.
Facility Location: Facility location problems may take the form of a Stackelberg game if a firm, while locating its facility, decides to account the actions of its competitors. For instance, Küçükaydin et al. [96] studied the scenario where a firm enters a market by locating new facilities, and its competitor reacts by adjusting the attractiveness of its existing facilities. Another study considers location of logistics distribution centers by minimizing the planners’ cost at the upper level and customers’ cost at the lower level [157]. Other applications of bilevel optimization to facility location problem may be found in [7], [35], [37], [40], [87], [113], [117], [127], [157], and [162].
Inverse Optimal Control: Inverse optimal control problems are essentially bilevel in nature [6], [88], [122], [158] with wide applications in robotics, computer vision, communication theory and remote sensing to name a few. One of the major challenges in control theory is deriving the performance index or reward function which fits best on a given dataset. Such tasks lie in the category of inverse optimal control theory, where one solicits the calculation of the cause based on the given result. Such a requirement necessitates solving a parameter estimation problem with an optimal control problem.
Machine Learning: Most of the machine learning and evolutionary optimization techniques often involve a number of parameters. A proper choice of these parameters has a substantial effect on the accuracy and efficiency of the approach. Tuning of these parameters is often achieved using brute force strategies, such as grid search and random search. A bilevel formulation of this problem allows for systematic and more efficient search when the number of parameters are large. Some of the approaches that have acknowledged the bilevel nature of this problem are [23], [24], [104], and [154].
Principal-Agent Problems: Principal-agent problem [98] is a classical problem in economics, where a principal (leader) subcontracts a job to an agent (follower). Given that the agent prefers to act in his own interests rather than those of the principal, it becomes important for the principal to have an incentive scheme that aligns the interests of the agent with the principal. Design of such contracts appears as a bilevel optimization problem. In real life, principal-agent relationships are commonly found in doctor-patient, senior management-lower management, employer-employee, corporate board-shareholders, and politician-voters scenarios. Studies that the readers may refer to are [41], [74], [163], [179], and [187].
Interest Over Time
In this section, we perform a text analysis of papers on bilevel programming (and Stackelberg games) that are indexed in SCOPUS. To begin with, we analyze the volume of publications every year on bilevel programming since 1950s to present, and then closely look at the themes within bilevel programming that have contributed to the growth over the years. The themes were discovered using a nonparametric Bayesian approach [160], which clusters the documents together based on similarities. The documents may probabilistically belong to multiple clusters at the same time.
Fig. 7 shows how the interest on bilevel programming has been growing at a slow pace until early 2000s and then picked-up significantly at the middle of the previous decade. The studies on bilevel programming using evolutionary algorithms appeared for the first time during the mid 1990s that took another decade to pick up to the extent that almost 10% of all studies on bilevel optimization utilize evolutionary methods. Fig. 7 shows the growth of evolutionary methods in the context of bilevel optimization from the 1990s to present. While the early papers on bilevel programming (pre-2000) were mainly focused on solution methods and optimality conditions, the growth in the post-2000 period was fueled by papers on applications of bilevel programming.
To identify the themes that have contributed toward the literature on bilevel optimization we used topic models. “Topic models are algorithms for discovering the main themes that pervade a large and otherwise unstructured collection of documents” [26]. These models can be used to organize unstructured collections as well as develop insights from large text databases that made it suitable for our purposes. The results from the topic model for the documents retrieved from SCOPUS are given in Figs. 9–24. The figures consist of themes that have a volume at least to the order of around 1%. Along with the identification of themes, the analysis helped in determining the attention received by particular themes at any point in time.
Each figure contains a word cloud in the inset that describes the theme and volume of papers as the other inset that describes the number of papers published on that theme over the years. Interestingly, we observe that number of papers on classical bilevel methods and optimality conditions peaked during 1995–2000. Since 2000, a number of bilevel applications picked-up, for example, we see a growth in supply chain applications, electricity transmission applications, telecommunication applications, facility location applications, railway applications, and machine learning applications. Defense applications that appeared to be minimally present before 2000 show a significant presence later. Network design, optimal design, and business applications do not show a trend but represent the highest volume of applications on bilevel optimization.
Conclusion
In this concluding section, we will raise a few perspectives that have not yet received much attention but may offer interesting directions for future research. Apart from metamodeling-based techniques to solve bilevel problems, we would like to highlight the importance of being able to account for different forms of uncertainties that are often encountered when solving practical problems. Another interesting direction is concerned with scalability of bilevel algorithms and ability to leverage distributed computing platforms to handle large scale problems. Further information about evolutionary bilevel optimization can be found at http://www.bilevel.org.
A. Metamodeling-Based Algorithms
Though we have already highlighted the importance of metamodeling-based methods in an earlier section, we have decided to discuss it once again because of its potential in solving practical bilevel optimization problems. Bilevel optimization problems inherit a number of mappings, and any knowledge about the structure of these mappings can simplify the solution procedure extensively. In this paper, we have highlighted approaches that are based on approximating the reaction set mapping and the optimal lower level value function mapping. Knowing one of these mappings reduces any bilevel optimization problem to a single-level optimization problem. For solving large scale bilevel problems, one has to exploit the structure and properties of bilevel problems that are essentially contained in these mappings. Other ways of utilizing metamodeling for bilevel problems that we have discussed are: approximating the bilevel problem by bypassing the lower level problem completely; and utilizing auxiliary bilevel models to locally approximate a bilevel optimization problem. Only few studies in the context of evolutionary algorithms utilize such a strategies and offer opportunities for future contributions.
B. Multiobjective Bilevel Optimization and Decision Making
Multiobjective bilevel optimization has received only lukewarm interest from researchers. A number of issues, like decision interaction between the two levels and uncertainties in decision making, remain unexplored. It might be of interest for researchers working in the area of multicriteria decision making and multiobjective evolutionary optimization to explore how two levels of decision makers interact to arrive at a compromising or an equilibrium solution in different situations. Similarly, the notion of uncertain decision-maker’s preferences at one or both levels has also not received enough attention [151]. In the field of decision analysis, however, plenty of research has been carried out to extend traditional frameworks of decision making such as expected utility theory [124] and multiattribute utility theory [92] to account for uncertain preferences on, for instance, the tradeoffs among multiple decision objectives or the risk-attitude. However, preferential uncertainty in bilevel optimization problems still requires development of theory as well as methods to account for decision behavior in a hierarchical setting. The approaches that have been proposed so far are still very preliminary and require substantial future research.
C. Bilevel Optimization Under Variable Uncertainty
Another important research topic is concerned with the inherent uncertainty of decision variables. This poses several challenges for the existing more deterministic optimization frameworks that may fail to find solutions that are robust and sufficiently close to the optimal solutions. The fact that bilevel problems have nested optimization tasks makes the search of robust solutions substantially more challenging compared with single-level optimization problems. A few preliminary ideas for handling variable uncertainty have already been suggested [52], but the required algorithm side innovations that would make these problems accessible to practitioners are still missing.
D. Scaling of Evolutionary Bilevel Algorithms
Bilevel problems are well-known for being highly computationally intensive already before considering any types of uncertainties. One of the promising directions for handling larger bilevel problems could be the use of the recent distributed computing platforms such as Apache Spark project. The programming model of Spark is quite different from the Hadoop MapReduce framework, and it has managed to overcome many of the earlier limitations. In particular, its current form has turned out to support the use and development of iterative algorithms quite well. Therefore, it may be interesting to investigate whether this novel platform will be able to offer a helping hand for researchers and practitioners who seek to solve bigger bilevel problems faster.
Though considerable progress has already been made during the last few years, evolutionary bilevel optimization is still a relatively young field with numerous opportunities for both computational as well as theoretical innovations. The growing availability of algorithms is also opening the field to more applied research, and we believe that in the future we are likely to see a considerable amount of novel applications.
ACKNOWLEDGMENT
Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.