Introduction

Injection molding is the most commonly used method for producing plastic products because of its fast production time, excellent surface characteristics, feasibility of producing complex structures, and low cost. Owing to the diversification of products, the demand for optimization of the injection molding process has increased. However, process parameter design for injection molding has relied mostly on the experience of field personnel and the trial-and-error method, although heuristic methods do not guarantee optimal conditions and require a long time for fine-tuning. Hence, a systematic process optimization method is necessary to meet this increasing demand.

Although the development of computer-aided engineering (CAE) software based on the finite element method (FEM) has reduced the cost of injection molding process analysis, it is still difficult to explore a wide parameter space with highly nonlinear relationships between the process conditions and outcomes due to the high computational costs involved. The injection molding process consists of filling, packing, cooling, and extraction stages as shown in Fig. 1. In the filling stage, the plastic resin flows into the cavity of the mold as the screw of the injection molding machine moves forward. The filling stage is performed under the speed control of the screw. In the packing stage, pressure is continuously applied to compensate for the cooling shrinkage of the resins. The packing stage, which generally proceeds until the gate solidifies, is closely related to the shrinkage rate of the product. In the cooling stage, the resin filled in the cavity is cooled until the part is sufficiently solidified for extraction. In the extraction stage, the cooled part is separated from the cavity. After the extraction, the mold is closed again for the next filling.

Fig. 1

Schematic of the injection molding process. In the filling stage, a screw moves forward and the mold cavity is filled with resin. In the packing stage, pressure is continuously applied to compensate for the cooling shrinkage of the resins. In the cooling stage, the screw moves backward with rotation, and the resin is replenished. In the extraction stage, the mold opens, and the part is extracted

The ultimate goal of optimizing the injection molding process is to reduce the process cost and prevent injection molding defects such as short shots, sink marks, weld lines, air traps, flow marks, and warpage. The process parameters for injection molding exhibit complex relationships. For example, when the injection speed is high, the temperature of the flow front may increase owing to a decrease in heat loss and the generation of shear heat. As the temperature increases, the thickness of the solidification layer decreases, whereas the solidification time increases. Conversely, a low injection speed may lower the temperature of the flow front and increase the thickness of the solidification layer, thereby decreasing solidification time. The non-uniform temperature distribution within the part causes warpage owing to the high deviation of the shrinkage rate.

Ideally, production optimization should be conducted by holistically considering the design of the structure, the geometries of the gate and runner, and the process parameters. However, owing to the nature of the industry, in most cases the process parameters for injection molding are adjusted after the structure of the plastic part and the location of the injection gate have been fixed. To date, process parameter design has been performed using data-based surrogate models. For instance, knowledge-based expert systems (Steadman & Pell, 1995) and case-based reasoning (CBR) approaches (Jiang et al., 2019; Yu et al., 2020) have been proposed. However, expert systems are inadequate for cases in which quantitative values are required, and CBR is suitable only when associated cases exist and are easy to find (Khosravani & Nasiri, 2020). In other studies, the Taguchi method has been applied to optimize process parameters to minimize shrinkage or warpage (Altan, 2010; Oktem et al., 2007). However, while the Taguchi method can identify the best combination among the tested parameter levels, it cannot locate the optimal solution in the continuous parameter space. Some researchers have devised an adaptive process for the optimization of injection molding using the Kriging surrogate model and the expected improvement acquisition function (Gao & Wang, 2009). Several other studies have performed optimization by combining artificial neural networks and genetic algorithms (Kurtaran et al., 2005; Shen et al., 2007; Tsai & Luo, 2017), artificial neural networks and particle swarm optimization (Xu et al., 2015), or response surface methodology (RSM) and a genetic algorithm (Ozcelik & Erzurumlu, 2005). In these studies, the process parameters were optimized for a single objective.

However, because product quality (warpage and shrinkage), production cost (cycle time), and energy consumption (clamping force) have trade-off relationships, a systematic multi-objective optimization approach is required. Many studies have devised multi-objective optimization approaches by combining neural networks and multi-objective genetic algorithms (Cheng et al., 2013; Sibalija & Majstorovic, 2012; Song et al., 2020; Yin et al., 2011), neural networks and multi-objective particle swarm optimization (Chen et al., 2014; Xu & Yang, 2015; Zhang et al., 2016), Gaussian process regression and multi-objective genetic algorithms (Zhou & Turng, 2007), improved efficient global optimization and non-dominated sorting-based genetic algorithm II (NSGA-II) (Zhao et al., 2015), and RSM and NSGA-II (Li et al., 2019). However, genetic algorithms and particle swarm optimization have several drawbacks. First, setting their hyperparameters requires relevant experience and expertise because the optimization performance depends significantly on these hyperparameters. For example, the genetic algorithm involves hyperparameters such as population size, number of generations, probability of crossover, and probability of mutation (Pongcharoen et al., 2002), while particle swarm optimization involves swarm size, neighborhood size, number of iterations, and acceleration coefficients (Du & Swamy, 2016). Second, although genetic algorithms and particle swarm optimization are well-known global optimization algorithms, they tend to get trapped in local optima in some cases (Haklı & Uğuz, 2014; Hwang & He, 2006). They are also feedforward design approaches that generate candidates heuristically rather than targeting promising regions, so many candidate points and iterations are required to reach the global optimum. Therefore, systematic backward design approaches are necessary, in which promising candidates can be sought solely based on the model’s ability, even without an engineer’s prior knowledge.

In this study, we propose two systematic optimization frameworks for the injection molding process using multi-objective Bayesian optimization (MBO) and constrained generative inverse design networks (CGIDN). Bayesian optimization (BO) is a sequential design strategy for finding a global optimum through an iterative process of learning surrogate models for the target objective value and selecting the best design choice (Gardner et al., 2014; Pelikan & Goldberg, 2006; Snoek et al., 2012). MBO, an extension of BO, constructs the Pareto front by extending the single-objective acquisition function of BO to account for the relationships between multiple, potentially conflicting objectives. For example, a hypervolume-based acquisition function has been used (Daulton et al., 2020; Yang et al., 2019). As the first systematic approach to the injection molding process, we propose an MBO utilizing the expected hypervolume improvement (EHVI) (Emmerich, 2005; Emmerich et al., 2006; Wagner et al., 2010).

As a second approach, we propose a CGIDN method for injection molding. The generative inverse design network (GIDN) uses backpropagation to calculate the analytical gradients of the objective function with respect to the design variables and outperforms the genetic algorithm in some cases (Chen & Gu, 2020). The original GIDN has a limitation in that the range of design variables cannot be constrained (i.e., an unbounded range of input variables is considered during the optimization), whereas process optimization should be performed within a feasible range of process parameters. In our proposed CGIDN method, a sigmoid function is added to the trained neural network architecture to constrain the available range of process parameters.

The two systematic data-driven optimization frameworks proposed are based on an active learning scheme where promising candidate recommendations and model updates are repeated, allowing the global optimum to be reached faster with relatively small data (Kim et al., 2021). Because the two methods have different strengths and limitations, they can be employed according to the target problem. Thus, this study provides insights into these two methods to help engineers select an appropriate method.

The remainder of this paper is organized as follows. The MBO framework for the injection molding process is introduced in “Multi-objective bayesian optimization (MBO)” section. The CGIDN framework is described in “Constrained generative inverse design networks (CGIDN)” section. A description of the door trim part for the verification of the optimization frameworks is presented in “Description of the door trim part” section. The feasibility of the MBO and CGIDN frameworks for injection molding is verified in “Results” section. Discussions on the two frameworks are presented in “Discussion” section. The conclusions are presented in “Conclusion” section.

Boldface letters indicate vectors or matrices and non-bold letters indicate scalar values.

Method

In this section, the frameworks using MBO and CGIDN to optimize the injection molding process are explained. Both frameworks were implemented in Python and coupled with Moldflow Insight 2021.1, the software used for the injection molding simulations. Workflow diagrams for the frameworks are shown in Figs. 2 and 3. In addition, a door trim part is introduced to verify the frameworks; the objectives are to minimize the deflection after extraction and the cycle time.

Fig. 2

Workflow of multi-objective Bayesian optimization (MBO) for the injection molding process. A single candidate is recommended in each optimization loop and combined with the previous dataset

Fig. 3

Workflow of constrained generative inverse design networks (CGIDN) for the injection molding process. Forty candidates are recommended in each optimization loop and combined with the previous dataset

Multi-objective bayesian optimization (MBO)

Black-box optimization refers to a type of optimization problem in which a complex and highly nonlinear input–output relationship is not available in an analytical form, thereby making it difficult to apply gradient-based optimization methods. Bayesian optimization is an excellent choice for such optimization problems, particularly when obtaining abundant data is difficult owing to the high computational cost or time. Bayesian optimization proceeds in the following order: first, the initial training dataset is used to train the Gaussian process regression (GPR) model, which is a non-parametric and statistical regression model commonly used as a surrogate model for Bayesian optimization. The trained GPR model is then used to evaluate the values of the acquisition function over the entire design space, and a new candidate is recommended at the point where the value of the acquisition function is maximized in the design space.

GPR approximates a complicated input–output relationship $y = f(\mathbf{x}) + \varepsilon$ based on a dataset $\mathbf{D} = \{(\mathbf{x}_i, y_i) \mid i = 1, \ldots, n\}$, where the noise $\varepsilon$ is assumed to follow a Gaussian distribution (Denzel & Kästner, 2018; Lizotte et al., 2005). GPR predicts not only the mean $E[f(\mathbf{x})]$ but also the variance $V[f(\mathbf{x})]$ of the target value $f(\mathbf{x})$ for an unknown input $\mathbf{x}$, thus effectively quantifying the uncertainty in its prediction. Given an unknown input $\mathbf{x}^{*}$, the GPR model assumes that the observation data $\mathbf{y} = \{y_i \mid i = 1, \ldots, n\}$ and the prediction $y^{*}$ follow a multivariate Gaussian distribution, as follows:

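$$P_{\mathbf{y}, y^{*}} = \begin{bmatrix} \mathbf{y} \\ y^{*} \end{bmatrix} \sim \mathcal{N}\left( \mathbf{0}, \begin{bmatrix} \mathbf{K} & \mathbf{k} \\ \mathbf{k}^{T} & k(\mathbf{x}^{*}, \mathbf{x}^{*}) \end{bmatrix} \right) \tag{1}$$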

where $\mathbf{K}$ is a covariance matrix whose elements are expressed as $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$, and $\mathbf{k}^{T}$ refers to a covariance vector whose elements are $k_i = k(\mathbf{x}_i, \mathbf{x}^{*})$. The covariance function $k(\mathbf{x}_i, \mathbf{x}_j)$, also referred to as the kernel function, characterizes the covariance between the two function outputs, $f(\mathbf{x}_i)$ and $f(\mathbf{x}_j)$.

In this study, we adopted the Matern 5/2 kernel function to evaluate the covariance of the two input features (Williams & Rasmussen, 1995):

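$$k(\mathbf{x}_i, \mathbf{x}_j) = \sigma_f^{2}\left(1 + \frac{\sqrt{5}\,r}{l} + \frac{5r^{2}}{3l^{2}}\right)\exp\left(-\frac{\sqrt{5}\,r}{l}\right) + \delta_{ij}\sigma_{\epsilon}^{2}, \quad \text{where } r = \sqrt{(\mathbf{x}_i - \mathbf{x}_j)^{T}(\mathbf{x}_i - \mathbf{x}_j)} \tag{2}$$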

The value of the kernel function depends strongly on the distance $r$ between the two points, which implies that the predicted mean and variance at an unknown $\mathbf{x}^{*}$ are estimated from all observation data $\mathbf{D}$, weighted by how far $\mathbf{x}^{*}$ is from each observation. The Matern 5/2 kernel function has three hyperparameters: the signal variance ($\sigma_f^{2}$), the characteristic length scale ($l$), and the observation noise ($\sigma_{\epsilon}^{2}$). The signal variance adjusts the magnitude of the overall covariance values, the length scale represents how far apart two inputs must be for their outputs to become mutually uncorrelated, and the observation noise assigns an appropriate variance to all the observation data points to ensure the smoothness of the regression near them.

The optimal kernel hyperparameter values for GPR are determined by maximum likelihood estimation (MLE), where the likelihood function represents how effectively the resultant statistical regression model represents the observed data (Blum & Riedmiller, 2013). From the multivariate Gaussian distribution in Eq. (1), a predictive Gaussian distribution of an unknown output ($y^{*}$) can be derived based on the standard rules for conditioning Gaussians:

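$$p(y^{*} \mid \mathbf{x}^{*}, \mathbf{D}) = \mathcal{N}(y^{*} \mid \mu, \sigma^{2}) \tag{3}$$

$$\mu = \mathbf{k}^{T}\mathbf{K}^{-1}\mathbf{y} \tag{4}$$

$$\sigma^{2} = k(\mathbf{x}^{*}, \mathbf{x}^{*}) - \mathbf{k}^{T}\mathbf{K}^{-1}\mathbf{k} \tag{5}$$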
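As a rough illustration of Eqs. (2), (4), and (5), the following sketch computes the GPR predictive mean and variance in plain Python. It is not the authors' GPy-based implementation: the function names, the fixed hyperparameter values, and the small linear solver are assumptions made for this example, and the hyperparameters are not fitted by MLE here. The noise term is added only on the diagonal of the training covariance, which is one common reading of the $\delta_{ij}$ term.

```python
import math

def matern52(xi, xj, sigma_f=1.0, length=1.0):
    """Matern 5/2 covariance of Eq. (2), without the noise term."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))
    s = math.sqrt(5.0) * r / length
    return sigma_f ** 2 * (1.0 + s + s ** 2 / 3.0) * math.exp(-s)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def gpr_predict(X, y, x_star, sigma_f=1.0, length=1.0, sigma_n=1e-3):
    """Predictive mean (Eq. 4) and variance (Eq. 5) at the unknown input x_star."""
    n = len(X)
    # Training covariance K; the observation noise enters on the diagonal only.
    K = [[matern52(X[i], X[j], sigma_f, length) + (sigma_n ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k = [matern52(xi, x_star, sigma_f, length) for xi in X]
    alpha = solve(K, y)   # K^{-1} y
    beta = solve(K, k)    # K^{-1} k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = matern52(x_star, x_star, sigma_f, length) - sum(ki * bi for ki, bi in zip(k, beta))
    return mean, max(var, 0.0)  # clip tiny negative values caused by round-off
```

Near an observed point the predicted variance collapses toward the noise level, while far from all observations the mean returns to zero and the variance approaches $\sigma_f^{2}$, which is the behavior the acquisition function exploits.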

Because the GPR model is cheap to evaluate, the acquisition function can be computed for any input, which makes it possible to search the design space for the candidate input that maximizes it. The expected improvement (EI) is a representative acquisition function for single-objective Bayesian optimization, which balances exploitation (sampling a new point close to the optimum) and exploration (sampling a new point that can reduce the uncertainty of the current regression model) (Feng et al., 2015). Thus, by repeatedly sampling a new data point with the maximum EI value, Bayesian optimization allows us to efficiently explore the design space and approach the optimum. Bayesian optimization for a single objective function can be extended to MBO by adopting a multidimensional acquisition function based on the concepts of hypervolume and Pareto front. Let $f_i(\mathbf{x})$ be the $i$th objective function, where $\mathbf{f}(\mathbf{x}) = [f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_m(\mathbf{x})]$, and let $\mathbf{y}$ be the observation vector for $\mathbf{f}(\mathbf{x})$. When a point $\mathbf{y} \in \mathbb{R}^{m}$ is preferred to (strictly dominates) another point $\mathbf{y}' \in \mathbb{R}^{m}$ (that is, for the minimization problem, when all components of $\mathbf{y}$ are less than or equal to those of $\mathbf{y}'$ and at least one component of $\mathbf{y}$ is strictly less than that of $\mathbf{y}'$), this is expressed as $\mathbf{y} \succ \mathbf{y}'$. The Pareto front is defined as follows:

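$$P(Y) = \left\{ \mathbf{y} \in Y : \left\{ \mathbf{y}' \in Y : \mathbf{y}' \succ \mathbf{y},\ \mathbf{y}' \neq \mathbf{y} \right\} = \emptyset \right\} \tag{6}$$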
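The dominance relation and the Pareto front of Eq. (6) can be sketched in a few lines of Python for the minimization case. The function names are chosen for this illustration only.

```python
def dominates(a, b):
    """True if a strictly dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def pareto_front(points):
    """Eq. (6): keep every point that no other point strictly dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the objective pairs `(1, 5)`, `(2, 3)`, `(3, 4)`, and `(4, 1)`, only `(3, 4)` is removed, because `(2, 3)` is better in both objectives.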

The hypervolume is an area dominated by the Pareto front and the reference point, as shown in Fig. 4. The hypervolume improvement (HVI) dominated by a newly recommended point is expressed as follows:

Fig. 4

Configuration for hypervolume domain. The current hypervolume (HV) area and hypervolume improvement (HVI) area are shaded. HVI is dominated by the newly recommended point

$$\mathrm{HVI}(P, \mathbf{f}(\mathbf{x})) = \mathrm{HV}(P \cup \mathbf{f}(\mathbf{x})) - \mathrm{HV}(P) \tag{7}$$

We adopted expected hypervolume improvement (EHVI) as an acquisition function for MBO (Emmerich et al., 2016), which is defined as follows,

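$$\mathrm{EHVI} = E\left[\mathrm{HVI}(P, \mathbf{f}(\mathbf{x}))\right] = \int \mathrm{HVI}(P, \mathbf{f}(\mathbf{x}))\, p(\mathbf{f}(\mathbf{x}))\, d\mathbf{f}(\mathbf{x}) \tag{8}$$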
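To make Eqs. (7) and (8) concrete for two objectives, the sketch below computes the hypervolume by a simple sweep and estimates EHVI by Monte Carlo sampling. The sampling estimator is a stand-in chosen for brevity, not necessarily how the integral in Eq. (8) was evaluated in this study. It assumes independent Gaussian predictions per objective, as stated for the GPR models, and all names are hypothetical.

```python
import random

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by the reference point `ref`
    (two objectives, minimization)."""
    pts = sorted(tuple(p) for p in points if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:                    # sweep in increasing f1
        if f2 < best_f2:                  # non-dominated point adds a new strip
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

def hvi(front, new_point, ref):
    """Eq. (7): hypervolume gained by adding `new_point` to the front."""
    return hypervolume_2d(list(front) + [tuple(new_point)], ref) - hypervolume_2d(front, ref)

def ehvi_mc(front, mean, std, ref, n_samples=20000, seed=0):
    """Monte Carlo estimate of Eq. (8) from per-objective predictive mean and std."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        sample = (rng.gauss(mean[0], std[0]), rng.gauss(mean[1], std[1]))
        total += hvi(front, sample, ref)
    return total / n_samples
```

With the front `[(1, 3), (2, 2), (3, 1)]` and reference point `(4, 4)`, the hypervolume is 6 and a new point at `(1.5, 1.5)` gives an HVI of 1.25. A candidate whose predicted mean is dominated can still have a positive EHVI when its predictive uncertainty is large, which is how the acquisition function rewards exploration.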

The MBO framework can be applied to multidimensional objective optimization using a high-dimensional hypervolume as the acquisition function. The Bayesian optimization framework was implemented in Python code using an open-source GPR library named GPy, developed by the Sheffield machine learning group (Sheffield, 2012). For each objective function, individual single-output GPR models were constructed assuming that each output was independent, which enabled efficient calculation of the acquisition function without lowering the performance of MBO.

To apply MBO to optimize the injection molding process, we followed the workflow shown in Fig. 2. We generated ten initial training data points at random locations in the design space. Moldflow simulation was used to compute the two objective functions, the maximum deflection after extraction (referred to as deflection in the remainder of this paper) and the cycle time, for a given set of input process parameters. The initial dataset was then passed through the optimization loop. In the first step of the optimization loop, two GPR models were fitted, one for each objective. Subsequently, based on the trained GPR models, a new sampling location was identified in the design space as the location where the acquisition function was maximized. Moldflow analysis was performed for the newly recommended point. If the optimization criterion was satisfied, the optimization cycle was terminated; otherwise, the previous data ($\mathbf{D}_{\text{previous}}$) and the newly recommended data ($\mathbf{D}_{\text{new}}$) were combined, and the next optimization loop commenced. Here, the optimization criterion was considered satisfied when no improved output was obtained over the next 80 optimization loops.
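The loop structure and the stopping rule described above can be sketched as follows. This is only a skeleton under stated assumptions: `simulate` stands in for a Moldflow run, `recommend` stands in for fitting the GPR models and maximizing the acquisition function, and "improved output" is interpreted here as a new point that is neither dominated by nor identical to any earlier observation. The source does not spell out that criterion, so this interpretation is an assumption.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one (minimization)."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def optimize(simulate, recommend, initial_inputs, patience=80, max_loops=1000):
    """Add one candidate per loop; stop after `patience` consecutive loops
    without improvement (or after `max_loops` as a safety cap)."""
    X = list(initial_inputs)
    Y = [simulate(x) for x in X]          # e.g. (deflection, cycle time)
    stale, loops = 0, 0
    while stale < patience and loops < max_loops:
        x_new = recommend(X, Y)           # placeholder: fit surrogates, maximize acquisition
        y_new = simulate(x_new)           # placeholder: run the simulation
        # Assumed meaning of "improved": not dominated by, or equal to, any earlier output.
        improved = not any(dominates(y, y_new) or tuple(y) == tuple(y_new) for y in Y)
        X.append(x_new)                   # combine previous data with the new point
        Y.append(y_new)
        stale = 0 if improved else stale + 1
        loops += 1
    return X, Y, loops
```

The `max_loops` cap is an addition for safety in this sketch; the source describes only the patience-based criterion.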

Constrained generative inverse design networks (CGIDN)

As a second method for optimizing the injection molding process, we proposed CGIDN. The GIDN uses backpropagation with analytical gradients, which enables fast calculation for various inputs while avoiding falling into a local optimum. In addition, because the GIDN is combined with active learning to gradually reach the optimal solution, the amount of data required can be reduced. The framework of GIDN is as follows: First, as in traditional deep neural network (DNN) training, the weights and biases of the DNN are trained on the relationship between input and output. Second, the weights and biases of the trained DNN are considered as fixed constants, and input parameters that minimize the target objective function using backpropagation are obtained. The output values of the recommended inputs are calculated through simulations or experiments, and the newly generated data are combined with previous data to update the neural network. The processes are repeated until optimum values are reached. A detailed explanation of the GIDN method can be found in a previous study (Chen & Gu, 2020).

However, the original GIDN method has the disadvantage that the ranges of the input parameters are unbounded. We propose a CGIDN method that improves on the original GIDN method by limiting the range of the input parameters. A description of the CGIDN method is presented in Fig. 5. In the predictor stage, the weights and biases are trained based on the relationship between the input and output. In the designer stage, a one-to-one mapping layer, in which each component of the layer has a value of 1 and which is followed by a sigmoid function $\sigma(x)$, is added in front of the trained DNN. In other words, the weights $w_{ii}^{0,1}$ are added to each input variable. The trained weights and biases are considered as fixed constants, and the newly added weights are considered as trainable parameters. The trainable weights $w_{ii}^{0,1}$, which minimize the target objective function, are trained using backpropagation starting from random initial values, and the optimal process parameters are obtained by substituting the weights into a sigmoid function as follows:

Fig. 5

The framework of constrained generative inverse design networks (CGIDN). In the predictor stage, the neural network is trained with a training set. In the designer stage, trained weights and biases are fixed and a new layer is added before the input layer. The newly added weights are trained with backpropagation, and candidates for optimized design are obtained from $\sigma(w_{ii}^{0,1})$, where $\sigma$ is the sigmoid function

$$\text{optimized input} = \sigma\left(w_{ii}^{0,1}\right), \quad \text{where } \sigma(x) = \frac{1}{1 + e^{-x}} \tag{9}$$

Because the output of the sigmoid function is limited to 0–1, the range of the optimized input is bounded.

For the application of the CGIDN to the injection molding process optimization, we followed the workflow shown in Fig. 3. Forty initial data points were randomly generated, and the maximum deflection and cycle time were obtained from the Moldflow analysis. The initial data were passed through an optimization loop. In the first step of the optimization loop, the DNN model was trained. Then, based on the trained DNN model, 40 new data points were recommended by setting random initial values for the weights $w_{ii}^{0,1}$ and using backpropagation. Moldflow analysis was performed on these 40 data points. The newly generated data were combined with the previous data. At every data collection step, the data collected from each optimization loop were newly shuffled and separated into a training set and a validation set as follows:

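$$\begin{aligned} \mathbf{D}_{\text{initial}} &\rightarrow \mathbf{D}_{0}^{\text{train}}, \mathbf{D}_{0}^{\text{valid}} \\ \mathbf{D}_{1} &\rightarrow \mathbf{D}_{1}^{\text{train}}, \mathbf{D}_{1}^{\text{valid}} \\ &\ \ \vdots \\ \mathbf{D}_{n} &\rightarrow \mathbf{D}_{n}^{\text{train}}, \mathbf{D}_{n}^{\text{valid}} \end{aligned} \tag{10}$$

$$\mathbf{D}^{\text{train}} = \mathbf{D}_{0}^{\text{train}} \cup \cdots \cup \mathbf{D}_{n}^{\text{train}} \tag{11}$$

$$\mathbf{D}^{\text{valid}} = \mathbf{D}_{0}^{\text{valid}} \cup \cdots \cup \mathbf{D}_{n}^{\text{valid}} \tag{12}$$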
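The per-loop shuffling and splitting of Eqs. (10) to (12) can be sketched as below. The validation fraction of 0.2 is an assumption for illustration, since the split ratio is not stated here, and the function names are hypothetical.

```python
import random

def split_batch(batch, valid_fraction, rng):
    """Shuffle one loop's data and split it into (train, valid), as in Eq. (10)."""
    shuffled = list(batch)
    rng.shuffle(shuffled)
    n_valid = max(1, int(round(valid_fraction * len(shuffled))))
    return shuffled[n_valid:], shuffled[:n_valid]

def build_sets(batches, valid_fraction=0.2, seed=None):
    """Union of the per-loop splits, as in Eqs. (11) and (12).
    `batches` is [D_initial, D_1, ..., D_n]; calling this again with a new
    seed reshuffles every batch, as done at each data collection step."""
    rng = random.Random(seed)
    train, valid = [], []
    for batch in batches:
        t, v = split_batch(batch, valid_fraction, rng)
        train.extend(t)
        valid.extend(v)
    return train, valid
```

Splitting each batch separately, rather than splitting the pooled data once, keeps every optimization loop represented in both the training and validation sets.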

where $\mathbf{D}_{\text{initial}}$ and $\mathbf{D}_{n}$ are the initial dataset and the dataset generated by the $n$th optimization loop, respectively. $\mathbf{D}^{\text{train}}$ and $\mathbf{D}^{\text{valid}}$ are the training and validation sets for training the DNN, respectively. Newly shuffling the data in every optimization loop can prevent the solution from falling into a local minimum and improve the generalization of the GIDN. If the optimization criterion was satisfied, the optimization cycle was terminated; otherwise, the next optimization loop commenced. In this study, the optimization criterion was considered satisfied when no improved output was obtained over the next two optimization loops. The numbers of hidden layers and neurons for each layer, which are hyperparameters of the DNN, were selected using the grid-search hyperparameter tuning technique (see Supplementary Note 1). Based on the results of hyperparameter tuning, four hidden layers and 30 neurons for each layer were used. The Adam optimizer (Kingma & Ba, 2014) was used to train the DNN, and an early stopping method was applied. In the predictor stage, the mean squared error (MSE) loss function was minimized, which is defined as follows:

$$\mathrm{MSE} = \frac{1}{2N}\sum_{i}^{N}\left(\hat{\mathbf{y}}_{i} - \mathbf{y}_{i}\right)^{2} \tag{13}$$

where $\hat{\mathbf{y}}_{i}$ and $\mathbf{y}_{i}$ are the predicted and ground-truth values of the $i$th sample, respectively, and $N$ refers to the number of samples. In the designer stage, the following objective function was considered for the minimization problem:

$$L = \sum_{i}^{m} \gamma_{i}\,(y_{i})^{2}, \quad \text{where } \sum_{i}^{m} \gamma_{i} = 1 \text{ and } 0 < \gamma_{i} < 1 \tag{14}$$

where $y_{i}$ is the $i$th component of $\mathbf{y}$, $m$ is the dimension of $\mathbf{y}$, and $\gamma_{i}$ is a random variable between 0 and 1. If there is an intended relationship between the objective variables, $\gamma_{i}$ may be fixed at specific values. In this study, because two-dimensional objective values were considered, the objective function was expressed as follows:

$$L = \gamma\,(y_{1})^{2} + (1 - \gamma)\,(y_{2})^{2}, \quad \text{where } 0 < \gamma < 1 \tag{15}$$
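The designer stage can be illustrated with a small, self-contained sketch that combines the sigmoid bound of Eq. (9) with the objective of Eq. (15). It is not the authors' implementation: the frozen predictor below is a toy network with one tanh hidden layer and made-up weights (the study used four hidden layers of 30 neurons), plain gradient descent replaces whatever optimizer was used, and the learning rate, step count, and sampling ranges are assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy stand-in for a trained predictor; these weights are frozen in the designer stage.
W1 = [[1.2, -0.7], [0.5, 0.9], [-1.1, 0.4]]   # 3 hidden units x 2 inputs
B1 = [0.1, -0.2, 0.3]
W2 = [[0.8, -0.5, 0.6], [-0.3, 0.7, 0.9]]     # 2 outputs x 3 hidden units
B2 = [0.2, 0.1]

def forward(x):
    """Return hidden activations and the two predicted (normalized) objectives."""
    h = [math.tanh(sum(W1[j][i] * x[i] for i in range(len(x))) + B1[j]) for j in range(len(W1))]
    y = [sum(W2[k][j] * h[j] for j in range(len(h))) + B2[k] for k in range(len(W2))]
    return h, y

def objective(y, gamma):
    """Eq. (15): weighted sum of squared objectives."""
    return gamma * y[0] ** 2 + (1.0 - gamma) * y[1] ** 2

def design(w, gamma=0.5, lr=0.05, steps=500):
    """Gradient descent on the added input weights w only (predictor stays fixed)."""
    w = list(w)
    for _ in range(steps):
        x = [sigmoid(wi) for wi in w]                        # Eq. (9): inputs bounded in (0, 1)
        h, y = forward(x)
        dL_dy = [2.0 * gamma * y[0], 2.0 * (1.0 - gamma) * y[1]]
        dL_dh = [sum(dL_dy[k] * W2[k][j] for k in range(len(W2))) for j in range(len(h))]
        dL_dz = [dL_dh[j] * (1.0 - h[j] ** 2) for j in range(len(h))]        # tanh derivative
        dL_dx = [sum(dL_dz[j] * W1[j][i] for j in range(len(h))) for i in range(len(x))]
        w = [wi - lr * g * xi * (1.0 - xi) for wi, g, xi in zip(w, dL_dx, x)]  # sigmoid derivative
    x = [sigmoid(wi) for wi in w]
    return x, objective(forward(x)[1], gamma)

def recommend(n_candidates, n_inputs=2, seed=0):
    """Generate candidates from random initial weights and random gamma values."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_candidates):
        w0 = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        gamma = rng.uniform(0.05, 0.95)
        candidates.append(design(w0, gamma)[0])
    return candidates
```

Because the gradient is taken with respect to `w` rather than the inputs themselves, every recommended input stays strictly inside the normalized range, whatever values the weights reach. Drawing a different `gamma` for each candidate spreads the recommendations along the trade-off between the two objectives.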

Description of the door trim part

To demonstrate the applicability of the injection molding process optimization frameworks, the door trim part (Yilmaz, 2021) of an automobile was considered, as shown in Fig. 6. As illustrated in Fig. 6a, b, the length and width of the part were about 394.2 mm × 125.2 mm, and the thickness was about 3 mm. The gate was located at the center of this part. The gate, runner, and sprue specifications are listed in Table 1. Moldflow, a commercial software package, was used for injection molding simulations. The mesh configuration of this part is shown in Fig. 6c, where 831,177 tetrahedral elements were used. For the analysis, the fill + pack + warp module in Moldflow was used to obtain the deflection and cycle time. A fill analysis predicts the thermoplastic polymer flow inside the mold in the filling stage: it calculates the flow front that grows through the part incrementally from the injection location and continues until the velocity/pressure switch-over point has been reached. A pack analysis predicts the thermoplastic polymer flow inside the mold in the packing stage: it calculates a flow front that grows from the locations in the model that the flow front had filled when the velocity/pressure switch-over point was reached, and it continues until the flow front has expanded to fill the last location in the model. A warp analysis is used to diagnose the cause of warping. Warpage in an injection molded part is caused by variations in shrinkage that occur from region to region in the part, through the thickness of the part, and parallel and perpendicular to the direction of material orientation (AUTODESK, 2021). Polypropylene (PP), the most commonly used vehicle interior material, was selected as the resin, and the material properties of the material library in Moldflow were used (see Supplementary Note 2 for material properties). There are various process parameters for injection molding.
In this study, we considered multiple parameters that affect the injection molding process, including mold temperature ($T_{\text{mold}}$), melt temperature ($T_{\text{melt}}$) for the resin, filling speed ($v_{\text{fill}}$), packing pressure ($P_{\text{pack}}$), packing time ($t_{\text{pack}}$), and cooling time ($t_{\text{cool}}$). It is well known that the application of a multistage packing pressure, where the packing pressure decreases step by step, is more efficient than applying a single-stage packing pressure. In this study, a three-stage packing pressure was considered. Therefore, the following 10 process parameters were considered: $T_{\text{mold}}$, $T_{\text{melt}}$, $v_{\text{fill}}$, $P_{\text{pack1}}$, $P_{\text{pack2}}$, $P_{\text{pack3}}$, $t_{\text{pack1}}$, $t_{\text{pack2}}$, $t_{\text{pack3}}$, and $t_{\text{cool}}$. The speed profile shown in Table 2, which is widely used by field engineers, was used to prevent a flow mark or a rapid increase in injection pressure owing to a sudden change in injection speed. The flow rate in Table 2 is defined as the ratio of the filling rate to the nominal flow rate, and the nominal flow rate is defined by dividing the total volume of 156.8 cm$^{3}$, including the runner system, by the nominal fill time $t_{\text{fill}}$ (nominal flow rate $= 156.8\ \text{cm}^{3} / t_{\text{fill}}$). Note that the nominal fill time is different from the actual fill time. The ranges for each process parameter are summarized in Table 3, considering the moldability. The ranges of the mold temperature and melt temperature were conservatively selected from the recommended mold temperature range ([20, 80]) and melt temperature range ([180, 260]) for PP. The packing pressure ranged from 0 to 80% of the filling pressure, as recommended, and the packing pressures were set to decrease monotonically ($P_{\text{pack1}} > P_{\text{pack2}} > P_{\text{pack3}}$). The ranges of nominal fill time and cooling time were set in consideration of the temperature at the flow front and the solidification of the part, respectively (see Supplementary Note 3). In this study, the ten process parameters form the multidimensional design space, and all of them were normalized to have values between 0 and 1 for the given ranges.
The deflection and cycle time were considered as outputs. Here, the mold opening time, which is independent of the other process parameters, was not included in the cycle time. Deflection was defined as the maximum deflection among the deflection values of all nodes (which indicates that the effects of warpage and shrinkage are considered together). The deflection and cycle time were normalized by assuming that their ranges were bounded in [3, 8] and [8, 50], respectively, based on the results of the initial data.
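The normalization described above amounts to min-max scaling over fixed ranges. The sketch below uses the output ranges [3, 8] and [8, 50] quoted in the text, with units of mm and s assumed from the reported results. The helper names are hypothetical, and the input ranges from Table 3 are not reproduced here.

```python
def normalize(value, low, high):
    """Map a value from [low, high] to [0, 1]."""
    return (value - low) / (high - low)

def denormalize(u, low, high):
    """Map a normalized value in [0, 1] back to [low, high]."""
    return low + u * (high - low)

# Output ranges quoted in the text (units assumed: mm for deflection, s for cycle time).
OUTPUT_RANGES = {"deflection": (3.0, 8.0), "cycle_time": (8.0, 50.0)}

def nominal_flow_rate(t_fill, volume_cm3=156.8):
    """Nominal flow rate in cm^3/s: total volume divided by the nominal fill time."""
    return volume_cm3 / t_fill
```

For instance, a deflection of 3.6886 mm maps to about 0.138 and a cycle time of 16.305 s to about 0.198 on the normalized scales, and a nominal fill time of 2 s corresponds to a nominal flow rate of 78.4 cm³/s.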

Fig. 6 Design of door trim part for injection molding. a Top side and b bottom side view of the part. c Meshed part with a runner system

Table 1 The specifications of the gate, runner, and sprue. The configurations for them can be found in Fig. 6c
Table 2 Filling speed profile according to shot volume
Table 3 Ranges of each injection molding process parameter

Results

MBO results

MBO was applied to optimize the injection molding process of the door trim, starting from ten randomly generated initial data points. As shown in Fig. 7, the results of the MBO were plotted for every 50 iterations of the optimization loop, and the Pareto front formed as the optimization proceeded. Although the optimal process parameters were reached in the 128th loop, 250 optimization loops were performed to verify convergence. The obtained process parameters minimize the deflection and cycle time simultaneously, as shown in Fig. 7f. Among the points with minimal deflection, the point with minimal cycle time was chosen as the optimal point for the process (colored in red). The optimized deflection and cycle time were 3.6886 mm and 16.305 s, respectively. As shown in Table 4, the optimized process conditions improved the deflection and cycle time by 16.57% and 21.65%, respectively, relative to the best parameters in the initial dataset. Figure 8 visualizes the deflection analysis for the optimal process parameters obtained from MBO, and the optimized process parameters are listed in Table 5. Contrary to the expectation that a three-stage packing pressure would be better, the parameters in Table 5 show that a single-stage packing pressure was sufficient to minimize both the maximum deflection and the cycle time for the door trim.
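The selection rule described above (take the points with minimal deflection, then the one with minimal cycle time) can be sketched as follows. This is an illustrative reading of the rule, not the authors' code; the candidate values are hypothetical, and an exact tie in deflection stands in for whatever tolerance was used in practice.

```python
# Sketch: extract a Pareto front and pick the reported optimum from it.
# Each point is (deflection, cycle_time); both objectives are minimized.

def pareto_front(points):
    """Return the points that no other point dominates."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


def select_optimum(points):
    """Lexicographic choice: lowest deflection, ties broken by cycle time."""
    return min(points, key=lambda p: (p[0], p[1]))


# Hypothetical simulation results (mm, s), not taken from the study.
candidates = [(4.2, 14.0), (3.9, 18.0), (3.9, 16.5), (5.0, 12.0), (4.5, 20.0)]
front = pareto_front(candidates)
chosen = select_optimum(front)
```

Here the front keeps three mutually non-dominated points, and the rule picks (3.9, 16.5), the lowest-deflection point with the shorter cycle time.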

Fig. 7 The results of multi-objective Bayesian optimization (MBO). The results of MBO were plotted for every 50 iterations of the optimization loop

Table 4 Performance comparison between optimal parameters from MBO and best initial parameters
Fig. 8 The deflection analysis result for the optimum design from multi-objective Bayesian optimization (MBO)

Table 5 Optimized process parameters obtained from MBO

CGIDN results

As the second method for injection molding process optimization, CGIDN was applied to the injection molding process of the door trim, starting from forty randomly generated initial data points. The results of the CGIDN were plotted for every two iterations of the optimization loop in Fig. 9. The optimal candidates gradually approached the Pareto front as the optimization proceeded. The optimal process parameters were reached in the 9th loop, but three additional optimization loops were performed to verify convergence. As in the “MBO results” section, among the points with minimal deflection, the point with minimal cycle time was taken as the optimal point for the process (colored in red). The results for the optimized process parameters are shown in Fig. 9f. The optimized maximum deflection and cycle time were 3.7573 mm and 15.4105 s, respectively. As shown in Table 6, the optimized process conditions improved the deflection and cycle time by 8.76% and 28.06%, respectively, relative to the best parameters in the initial dataset. Figure 10 shows the results of the deflection analysis for the optimal process parameters obtained from the CGIDN, and the process parameters optimized by the CGIDN are listed in Table 7. Because the second- and third-stage packing times (tpack2 and tpack3) are negligibly small, a single-stage packing pressure was sufficient for the door trim part, as in the MBO results. The results of CGIDN with the γ of Eq. (15) fixed to 0.5 can be found in the supplementary material (see Supplementary Note 4). With a fixed γ, the data generated through the optimization loop were biased toward one objective function; however, the optimal point was still reached.
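As a small worked check, the two reported optima can be compared with a Pareto dominance test. The numbers are the ones quoted in the “MBO results” and “CGIDN results” sections; the helper name is an illustrative choice.

```python
# Sketch: Pareto dominance test on the two reported optima.
# Tuples are (deflection in mm, cycle time in s); both are minimized.

def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


mbo_optimum = (3.6886, 16.305)
cgidn_optimum = (3.7573, 15.4105)

mbo_dominates = dominates(mbo_optimum, cgidn_optimum)
cgidn_dominates = dominates(cgidn_optimum, mbo_optimum)
```

Neither point dominates the other: the MBO optimum has the lower deflection, while the CGIDN optimum has the shorter cycle time, so the two results sit at different positions of the deflection and cycle time trade-off.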

Fig. 9 The results of constrained generative inverse design networks (CGIDN). The results of CGIDN were plotted for every two iterations of the optimization loop

Table 6 Performance comparison between optimal parameters from CGIDN and best initial parameters
Fig. 10 The deflection analysis result of optimum design from constrained generative inverse design networks (CGIDN)

Table 7 Optimized process parameters obtained from CGIDN

Discussion

In this study, the injection molding process parameters were optimized using MBO and CGIDN. Both frameworks yielded optimized process parameters that simultaneously minimized deflection and cycle time. In the optimized process parameters shown in Tables 5 and 7, a low melt temperature (Tmelt) and a high packing pressure (Ppack) were preferred. A low mold temperature (Tmold) reduces the cycle time but increases the warpage because of the large difference between the mold temperature (Tmold) and melt temperature (Tmelt). A long packing time (tpack) may decrease the deflection but increases the cycle time. A proper fill time (tfill) stabilizes the temperature at the flow front and thereby reduces the deflection, whereas a short fill time is preferred for the cycle time. Unlike classical methods such as the Taguchi method, which derive the best combination of predefined levels, the proposed frameworks could obtain numerical values for the optimum process parameters. The MBO and CGIDN frameworks obtained a smooth and dense Pareto front compared to the feedforward approaches using genetic algorithms or particle swarm optimization (Cheng et al., 2013; Zhang et al., 2016), which allows the engineer to flexibly select optimal process parameters according to the allowable tolerances of the objectives. Because the region close to the gate is under relatively high pressure compared to the region far from the gate, under a single-stage packing pressure the regions within the part solidify under different pressures depending on their location, which causes unbalanced residual stress and warpage. Therefore, applying a multistage packing pressure, which lowers the pressure as the part solidifies, is generally beneficial compared to a single-stage packing pressure. However, in the “Results” section, the optimized process parameters for the door trim showed that a single-stage packing pressure was sufficient. This is because the door trim part is not large enough to cause a severe imbalance in the solidification pressure. 
The MBO and CGIDN frameworks would still be valid for larger parts that require three-stage packing pressures, which will be the topic of our follow-up study.

For identical optimization objectives, ten initial data points were used for MBO, whereas 40 initial data points were used for CGIDN. MBO balances exploitation and exploration, which allow the derivation of optimal points and the improvement of the model on the unseen domain, respectively. Therefore, candidate points are recommended efficiently even when starting from a small initial dataset. In contrast, the DNN is vulnerable to overfitting when only a small amount of data is available, and CGIDN performs exploitation based on the performance of the DNN without exploration. Therefore, a relatively large amount of initial data was used for CGIDN compared to MBO.
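The balance between exploitation and exploration can be illustrated with a generic lower confidence bound rule for a single minimized objective. This is a textbook-style sketch and is not the acquisition function used in this study; the candidate names, predicted means, and standard deviations are hypothetical stand-ins for the output of a surrogate such as GPR.

```python
# Sketch: lower confidence bound (LCB) for a single minimized objective.
# kappa = 0 is pure exploitation; larger kappa rewards uncertain regions.

def lower_confidence_bound(mean, std, kappa):
    """Optimistic estimate of the objective at a candidate point."""
    return mean - kappa * std


def recommend(candidates, kappa):
    """Return the name of the candidate with the smallest LCB.

    candidates: list of (name, predicted_mean, predicted_std).
    """
    best = min(
        candidates,
        key=lambda c: lower_confidence_bound(c[1], c[2], kappa),
    )
    return best[0]


# Hypothetical surrogate predictions on a normalized objective.
predictions = [
    ("well_sampled", 0.30, 0.01),  # low mean, low uncertainty
    ("unexplored", 0.40, 0.20),    # higher mean, high uncertainty
]

exploit_choice = recommend(predictions, kappa=0.0)
explore_choice = recommend(predictions, kappa=2.0)
```

With kappa set to zero the rule selects the well-sampled candidate, which mirrors a purely exploitative search based on the surrogate prediction alone. With kappa set to 2 it selects the unexplored candidate, because the large predictive uncertainty there makes a low objective value plausible.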

In the demonstration task for the door trim part manufacturing process, MBO performed better with smaller datasets than CGIDN. This is because MBO carries out exploitation and exploration together to collect data efficiently, which allows it to reach the optimum with fewer data than CGIDN. However, it is difficult to judge the general superiority of either framework from a single test case because each framework has its own strengths. Although the process parameters were optimized using simulation data in this study, experimental data obtained from an actual injection molding machine can also be used for the optimization. In an actual experiment, the characteristics of parts produced under identical process conditions may show variance. The GPR in MBO reflects the variance (i.e., error bars) of the measured data. Moreover, because obtaining a large amount of actual experimental data is difficult, the CGIDN framework can be used if the transfer learning technique (Jung et al., 2022) is applied to combine the simulation and experimental datasets. When a DNN is initially trained with a large amount of simulation data and then fine-tuned with experimental data, it serves as an efficient surrogate model for the experimental dataset. Both MBO and CGIDN can be applied even when more than two objectives must be optimized, by considering a higher-dimensional hypervolume and by increasing the number of neurons in the output layer, respectively. In this study, process optimization was performed for predetermined geometries of the runner system and part. The optimization frameworks for injection molding can be further extended to holistic optimization, which also treats the structures of the runner system and the part as input parameters.

Conclusion

In this study, two systematic data-driven optimization frameworks were proposed for the injection molding process. First, an MBO framework using GPR was proposed. Second, a CGIDN framework, which is an improved version of the original GIDN, was proposed. Because both frameworks share the spirit of active learning, which repeats data recommendation and model updates, the optimum can be reached quickly with a relatively small dataset. This approach is particularly useful when data are costly to obtain. To verify the frameworks, the process parameters of the door trim part were optimized. Compared to randomly generated process parameters, the optimized process parameters substantially reduced both the deflection and the cycle time. For the door trim part, MBO performed better with smaller datasets than CGIDN. Because each framework has different strengths, the frameworks can be applied according to the engineer’s purpose. The MBO framework can reflect variations in the output data, and the CGIDN framework can be combined with transfer learning, which enables the application of the frameworks to experimental data obtained from injection molding machines. In this study, the design space was constructed based on the knowledge of the engineer, but optimization of an undefined design space would also be possible in combination with classification models, which would enable process optimization without the engineer’s knowledge. In addition, the frameworks enable comprehensive optimization that simultaneously considers product quality, production cost, and energy consumption by increasing the dimensions of the target objectives. The process optimization frameworks proposed in this study are expected to contribute not only to injection molding but also to various other manufacturing processes (Ashhab et al., 2014; Ma et al., 2020; Park et al., 2022; Zhao et al., 2021).