Elsevier

分子结构杂志

卷号 1330,2025 年 5 月 15 日,141429
Journal of Molecular Structure

基于 SHP2E76A 的别构抑制剂的虚拟筛选与结合机制

https://doi.org/10.1016/j.molstruc.2025.141429获取版权和内容
完整文本访问

精华

  • 应用多阶段分子对接策略进行抑制剂筛选。
  • 目标是蛋白质 SHP2E76A 的别构位点 1 和 2。
  • 具有新型骨架的 Lig1 和 Comp1 是显示最佳结合亲和力的顶尖命中。
  • Lig1 具有刚性的三环 N-杂环核,具有更强的范德华相互作用。
  • Comp1 的纺锤体结构通过 Arg265 和 Ala76 的相互作用增强了结合。

摘要

作为蛋白激酶通路中的关键调节因子,SHP2 表达的上调和激活突变可以导致各种癌症的发生。本项目针对 SHP2E76A 的两个别构位点(“隧道”型和“锁扣”型)进行了基于对接的抑制剂虚拟筛选,并探讨了结合机制。最终,在别构位点 1 和 2 分别获得了 9 个和 5 个命中化合物。在别构位点 1,Lig1 表现出最佳的结合亲和力。它具有刚性的三元 N-杂环母核,Leu216、Glu250、Arg111 和 Thr253 是与相互作用相关的关键残基。在别构位点 2,Comp1 由于特定的纺锤型结构表现出最高的亲和力,Arg265 和 Ala76 是对结合自由能贡献最大的两个残基。其中,带负电的 Arg265 通过静电相互作用有助于避免非靶向作用,而 Ala76 主要通过范德华相互作用贡献于结合,这提醒我们一种针对特定别构抑制剂设计的新策略。 这项工作为 SHP2E76A 变构抑制剂提供了全新的骨架结构,这将对进一步的药物设计和合成大有裨益,为其开发奠定了坚实的理论基础。

图形摘要

本项目针对 SHP2E76A 的两个变构位点进行了基于对接的抑制剂虚拟筛选,并探讨了结合机制。
Image, graphical abstract
  1. Download: Download high-res image (212KB)
  2. Download: Download full-size image

关键词

SHP2E76A
别构抑制剂 基于对接的虚拟筛选 分子动力学模拟 结合机制

1. 引言

含有 Src 同源性-2 的蛋白酪氨酸磷酸酶 2(SHP2)是一种非受体蛋白酪氨酸磷酸酶,能够特异性催化细胞质中磷酸酪氨酸的去磷酸化,在调节细胞增殖、分化、代谢、凋亡和免疫调节功能中发挥着至关重要的作用[[1], [2], [3], [4], [5], [6], [7], [8]]。SHP2 的异常调节和激活突变与血液疾病、各种实体肿瘤和发育障碍相关,如黑色素瘤、食管鳞状细胞癌、肺癌和肝癌等[1,6,[9], [10], [11], [12], [13]]。具体而言,在来自 Noonan 综合症和白血病患者的病理样本中发现,Glu76 是 SHP2 的 N-SH2 结构域中一个常见的突变氨基酸,对人类生命和健康构成威胁。
在结构上,SHP2 由三个部分组成:两个功能性 SH2 结构域(N-SH2 和 C-SH2)、一个具有催化功能的 PTP 结构域,以及一个 C 末端区域,具有两个主要的酪氨酸磷酸化位点[Y542 和 Y580]。传统的催化活性位点位于 PTP 结构域中,在那里它可以特异性地与抑制剂结合。除了其催化功能外,它还协助底物的周转,并稳定 SHP2 的构象转变。2018 年,诺华团队揭示了 SHP2 上的三个潜在的别构位点。其中,别构位点 1 已被广泛研究。它在 C-SH2/PTP 结构域的界面上呈现“隧道”型,具有一个深的结合口袋,中间收窄,两端扩宽,其形状相对有利于药物结合。别构位点 2,称为“闩”型,位于 N-SH2 和 PTP 结构域之间的界面上,距离别构位点 1 约 20 Å。它由多个活性腔体组成,具有浅口袋和大接触表面,使得非靶向药物结合的可能性更大。 别构位点 3,呈“沟槽”类型,位于蛋白质的对面,具有较浅的开放口袋,不利于药物结合。目前,SHP2 抑制剂主要靶向别构位点 1,而针对别构位点 2 和 3 的药物设计仍待开发。
SHP2 的“分子开关”别构机制在 PTP 家族中是独特的,它通过自身抑制构象的开放和关闭进行调节[[19], [20], [21], [22], [23]]。在这个过程中,有时伴随着 SHP2 的残基突变,例如 SHP2 的 E76K 突变体(SHP2E76K)。2021 年,Youqi Tao 团队利用单分子福斯特共振能量转移(smFRET)技术结合分子动力学(MD)模拟,从 SHP2 的 E76A 突变体(SHP2E76A)中捕获到一种在水溶液中先前未被识别的第三种部分开放状态[24]。在这种状态下,突变打破了 Glu76 和 Ser502 之间的氢键,最终削弱了 N-SH2 与 PTP 之间的结合,留下了一种介于已知的开放和闭合状态之间的半活性构象。
因此,由于 SHP2 激活突变(SHP2E76A、SHP2G60V、SHP2S502P)引起的疾病为开发这种突变的 SHP2 别构抑制剂带来了新的挑战。目前,针对野生型 SHP2,已有九种别构位点抑制剂进入临床研究阶段,其中,吡嗪类 SHP099(6-(4-amino-4-methylpiperidin-1-yl)-3-(2,3-dichlorophenyl)pyrazin-2-amine)[25]和 TNO155((3S,4S)-8-(6-amino-5-((2-amino-3-chloropyridin-4-yl)thio)pyrazin-2-yl)-3-methyl-2-oxa-8-azaspiro[4.5]decan-4-amine)[26]是最著名的候选化合物,IC50 分别为 0.071 µM 和 0.011 µM。然而,关于 SHP2E76A 抑制剂的研究仍然相当有限。仅有报道称在 2017 年合成了一系列针对突变 SHP2E76A 的化合物[27]。其中,(1-[4-(6-bromo-naphthol)-2-thiazolyl]-4-methylpiperidine-4-amine)显示出最高的抑制活性,IC50 为 0.71 µM。该化合物可以抑制多种癌细胞系和肺癌细胞的增殖,并具有良好的药代动力学特性和抗肿瘤活性。
在本研究中,为了开发新型有效的 SHP2E76A 抑制剂,我们进行了针对别构位点 1 和 2 的基于结构的虚拟筛选,利用来自一系列商业数据库的基于对接的虚拟筛选策略。最初遵循 Lipinski 和 Veber 的规则筛选出药物相似性分子。随后,通过使用三角匹配对接算法进行分子对接,进行了基于受体的虚拟筛选。进行了可视化分析,以帮助放弃不合理的构象。保留了具有良好药代动力学特性的分子。在此基础上,对剩余分子进行了分子动力学模拟,并计算了它们相应的结合自由能以确认最终的命中分子。最后,探讨了详细的结合模式。这项工作为 SHP2E76A 别构抑制剂提供了新的骨架结构,为其开发奠定了坚实的理论基础。此外,这可能为未来治疗与 SHP2E76A 靶点相关的疾病带来全新的希望。

2. 材料与方法

2.1. 数据库准备

选择了来自 Taosu 生化公司的商业数据库进行基于对接的筛选,包括 Interbioscreen (IBS)、TargetMol 以及我们团队的内部数据库,共包含 573,412 个小分子。使用分子操作环境(MOE2020)[28]软件的洗涤模块对所有小分子进行了优化,例如添加氢原子、转换为 3D 结构,以及在 Amber10:EHT 力场下进行 AM1-BCC 电荷拟合。使用 Lipinski 的[29]和 Veber 的规则[30]筛选药物相似性分子。

2.2. 蛋白质预处理

受体 SHP2E76A 的结构来自于 SHP2E76A 与 9b 的共晶结构(PDB ID: 5XZR)[27],该数据在 RCSB 蛋白质数据银行中可查阅(https://www.rcsb.org/)。SHP2E76A 蛋白经过预处理,包括残基校正、缺失环修复、在 pH 7.0 下添加氢原子,以及使用 MOE2020 软件进行电荷拟合,遵循类似研究中常用的标准协议[31,32]。然后在 Amber10: EHT 力场下进行了优化。
如报告所述[[33], [34], [35], [36]],变构位点 1 包含残基 Thr108、Glu110、Arg111、Phe113、His114、Thr218、Thr219、Glu249、Glu250、Thr253、Leu254、Gln257、Asp489、Pro491、Lys492 和 Gln495 等。相比之下,变构位点 2 由残基 Gln79、Tyr80、His84、Gln87、Leu262、Tyr263、Ser264、Arg265、Gln269、Lys274、Asn281、Ile282 和 Leu283 组成[18]。本研究针对这两个变构位点进行了对接。

2.3. 分子对接策略

分子对接是一种广泛应用于各种药物靶点的筛选策略[[37], [38], [39]],它通过预测结合亲和力和相互作用来有效识别潜在化合物,从而提高筛选的准确性。为了更准确地获得靶化合物并降低假阳性的风险,我们进行了基于对接的两步虚拟筛选过程。首先,我们使用半柔性对接策略筛选出匹配度差和亲和力低的化合物,仅保留得分最高的化合物(前 30%)。然后,这些命中化合物使用柔性对接策略重新对接,仅保留得分最高的化合物(前 10%)。通过分析 SHP2E76A 与命中化合物的实际结合匹配度,结合评分函数,最终选择了具有高生物活性的化合物。
所有对接均使用 MOE2020 软件的 DOCK 模块进行。对于第一轮半柔性对接,选择了 Triangle Matcher 构象搜索算法和 London delta G 评分函数,每个导出 30 个构象。然后,应用 Rigid Receptor 算法和 GBVI/WAS 评分函数进行构象优化,为每个化合物导出最佳拟合构象。在第二轮柔性对接中,我们将构象优化方法更改为 Induced Fit。

2.4. ADMET 筛选

为了进一步降低药物开发的成本,使用 ADMETlab2.0 进行了 ADMET 筛选[40](https://admetmesh.scbdd.com/)。选择了马丁-达比犬肾(MDCK)渗透性、血脑屏障穿透(BBB)、人以太-去-去相关基因(hERG)、大鼠口服急性毒性(ROA)和艾姆斯突变试验(AMES)作为 ADMET 性质预测的主要描述符。此外,还进行了泛测定干扰化合物(PAINS)[41]过滤,以避免假阳性分子。

2.5. 分子动力学模拟

以最佳对接的受体-配体复合物作为初始结构模型,所有分子动力学模拟均使用 Amber 18 软件包中的 GPU 加速版本 PMEMD [42,43]模块进行。所有最终命中的化合物首先在 B3LYP /6-31G (d, p)水平上使用 Gaussian16 软件 [45]进行了优化。一般 AMBER 力场(GAFF) [46]和 AMBER ff14SB [47]力场分别应用于命中化合物和蛋白质。每个系统中添加了钠离子以维持电中性。所有研究的系统浸入到一个 TIP3P [48]水箱中,距离复合物边缘的最小距离为 12 Å。
首先,采用最陡下降法和共轭梯度法各 5000 步相结合的方法来最小化初始模型。然后,使用 Langevin 动力学[49,50]将温度逐渐加热至 300K,碰撞频率为 5.0 ps−1。随后,系统在 300K 和 1 bar 的 NPT 系综中平衡了 10 ns。使用 Berendsen 压力调节器以各向异性的方式调节压力,压力松弛时间为 2.0 ps。最后,进行了 100 ns 的生产 MD 模拟。采用周期性边界条件,并使用粒子网 Ewald(PME)方法[51]描述长程静电相互作用,同时使用 8 Å的截断半径来评估短程相互作用。使用 SHAKE 算法[52]约束所有含有 H 原子的化学键,约束常数为 10-6,时间步长为 2 fs。使用 AmberTools CPPTRAJ 程序[53]对轨迹的平衡部分进行了分析。

2.6. 结合自由能计算

考虑到计算效率和准确性之间的良好平衡,最终选择了 MM/PBSA 方法通过 Amber18 包的 MMPBSA.py [54]脚本计算蛋白质-配体结合自由能。MM/PBSA 是一种后处理终态方法,广泛应用于作为高效可靠的自由能模拟方法来建模分子识别。从最后 50 纳秒的轨迹中提取了 5,000 个快照,间隔为 20 皮秒,用于计算结合自由能。离子强度设置为 0.15 M,外部和内部介电常数分别设置为 80 和 1。
结合自由能 (∆Gbind) 是通过以下方程计算的,其中 ∆Gcomplex、∆Greceptor 和 ∆Gligand 分别是复合物、受体和配体的自由能。(1)ΔGbind=ΔGcomplexΔGreceptorΔGligand
在上述方程中,相应的自由能(∆Gbind = ∆Gcomplex −∆Greceptor −∆Gligand)通过 MM/PBSA 方法进行估算:(2)ΔGbind=ΔHTΔSΔEgas+ΔGsolvTΔS(3)ΔEgas=ΔEint+ΔEvdw+ΔEele(4)ΔGsolv=ΔGPB+ΔGnonp
这里,–T∆S 表示熵的贡献。熵的计算耗时较长,如果采用的快照数量较少,其值会波动。在本研究中,熵的贡献被省略[55]。∆H 表示结合自由能的焓贡献,由∆Egas 和∆Gsolv 组成。∆Egas 是气相分子力学能,∆Gsolv 是溶剂化自由能,∆Eint、∆Evdw、∆Eele 分别是内能、范德华相互作用能和静电能。在本研究中,考虑到键、角和二面角能的总值∆Eint 为零。∆Gsolv 被计算为通过泊松-玻尔兹曼(PB)模型计算的溶剂化自由能的极性贡献(∆GPB)与根据与溶剂可接触表面积(SASA)的线性关系计算的非极性贡献(∆Gnonp)之和。(5)ΔGnonp=γ×ΔSASA+β
其中两个经验常数 γ 和 β 分别设定为 0.00542 kcal/mol/Ų 和 0.92 kcal/mol。SASA 由 1.4 Å 的探针半径确定。
To further explore the detailed interaction information of complex, free energy decomposition was performed using the MM/PBSA method to identify the key residues responsible for binding energy. The contribution of each residue was calculated without considering the contribution of entropies. The contribution was defined as the sum of ∆Evdw, ∆Eele, ∆GPB, and ∆Gnonp:(6)ΔGresidue=ΔEvdw+ΔEele+ΔGPB+ΔGnonp

3. Results and discussion

3.1. Database screening

Docking-based virtual screening was performed as the flow diagram depicted in Fig. 1. The screening process included drug-likeness screening, two rounds of molecular docking-based screening with the strategies of semi-flexible and flexible dockings, visualized analysis, ADMET screening, and binding free energy-based screening. Following the Lipinski's and Veber's rules 385,405 compounds were screened out after drug-likeness screening.
Fig 1
  1. Download: Download high-res image (303KB)
  2. Download: Download full-size image

Fig. 1. The flow diagram of the docking-based virtual screening.

3.2. Docking-based virtual screening

3.2.1. Docking method Selection

In order to verify the reliability of the docking algorithm, eight complexes, including the positive controls SHP099 and SHP844 ((3-chloro-4-((1-(2-hydroxy-3-methoxyphenyl)-5-oxo-[1,2,4]triazolo[4,3-a]quinazolin-4(5H)-yl)methyl)benzoyl)proline), whose co-crystal structures have already been able to obtain from the RCSB PDB database, were selected to re-dock into the original binding pocket of SHP2. Comparing the re-docking results with the original structures (Table 1), all calculated root mean square deviation (RMSD) values were within 0.5 Å, and their binding positions overlap well structurally. These results indicate that the MOE2020 procedure's parameters settings were relatively suitable and reliable for the docking of SHP2E76A in our systems. Moreover, MOE2020 has demonstrated its effectiveness in this protein system in our previous studies [56], further underscoring its applicability.

Table 1. Selected compounds with existed co-crystal structures used for re-docking to validate the reliability of MOE docking procedure.

Image, table 1

3.3. Molecular docking studies

In the first round of semi-flexible docking, considering the flexibility of ligands in molecular docking as the emphasis, 32,517 compounds (allosteric sites 1) for allosteric site 1 and 21,829 compounds (allosteric sites 2) for allosteric site 2 were retained after filtering out some compounds with a poor matching degree and low affinity with binding pockets. To improve the accuracy of docking results, the flexible docking method was applied for screening, since it thoroughly considered the dynamic flexibility changes of ligand and receptor and was close to the real drug-protein binding. After the second round, the top 10% high-scoring compounds were retained, resulting in 5,376 and 5,645 compounds for allosteric sites 1 and 2, respectively. Through comprehensively observing the three-dimensional binding mode of protein-ligand, interaction, and the matching degree between compounds and binding pockets, 23 and 11 compounds were screened out, respectively, targeting allosteric sites 1 and 2.

3.4. ADMET screening

The ADMET properties of compounds were evaluated to reduce false positives and avoid molecules possessing poor drug profiles using ADMETlab2.0. Eventually, 12 (allosteric site 1) and 7 (allosteric site 2) compounds were reserved, as listed in Table 2, Table 3, and all compounds met the ADMET screening criteria (Table S1-S3). The hit compounds had an MDCK value higher than 2 × 10-6 cm/s, indicating that they all satisfy with medium or higher permeability. Their BBB and hERG values are less than 0.5, suggesting limited penetration of the blood-brain barrier and weak inhibition of cardiac function. The AMES and ROA values are also less than 0.5, indicating extremely low mutagenic and toxic potentials.

Table 2. Compounds obtained after ADMET screening targeted on allosteric site 1.

Image, table 2

Table 3. Compounds obtained after ADMET screening targeted on allosteric site 2.

Image, table 3

3.5. Screening based on binding free energy calculations

Further, MD simulations and binding free energy calculations of the complex systems with SHP2E76A were performed to narrow down the region. Taking the SHP2E76A_ligand docking conformations with the highest score function as the initial dynamics simulation models, the binding free energies of complexes, as well as those of the positive control systems SHP2E76A_SHP099 (allosteric site 1) and SHP2E76A_SHP844 (allosteric site 2), were calculated using the last 50 ns equilibrated trajectories of 100 ns MD simulations. Ultimately, 9 hit compounds were reserved for allosteric site 1, and 5 hit compounds were reserved for allosteric site 2 (Table 4, Table 5). These hit compounds exhibit more negative ∆Gbind values compared to the positive controls (-26.38 kcal/mol for SHP099 and -23.59 kcal/mol for SHP844). For allosteric site 1, Lig1 and Lig2 are two hit compounds with the most negative binding free energies of -33.40 kcal/mol and -33.12 kcal/mol, respectively, indicating the highest affinity among these hit compounds. As well, notably, among these 9 hit compounds, a class of three-membered heterocyclic compounds with a similar parent nucleus draw our attention, including compounds Lig1, Lig5, Lig6, Lig8, and Lig9. Based on this, Lig1, which exhibits the most favorable binding free energy, was selected as a representative for subsequent detailed analysis of the interaction mechanism compared to the solo-protein and the positive control SHP099. For allosteric site 2, Comp1 demonstrates outstanding binding affinity with the most negative binding free energy of -35.41 kcal/mol, significantly larger than that of the positive control SHP844. Furthermore, Comp1 is the only hit compound whose ∆Eele larger than its ∆Evdw, which suggests that there are more electronic interaction contribute for the binding. Such the stronger interaction is also beneficial in reducing the possibility of off-target effects. So Comp1 is considered as a promising potential molecular structure.

Table 4. Binding Free Energy calculation results of the screened compounds with the more negative ∆Gbind than positive control SHP099 targeted on allosteric site 1 (unit: kcal/mol).

CompoundRename∆Evdw∆Eele∆GPB∆Gnonp∆Gbind
SHP099-51.92 (±0.09)-24.20 (±0.11)53.91 (±0.17)-4.17-26.38 (±0.14)
STOCK7S-32325Lig1-53.94 (±0.09)-30.25 (±0.15)55.44 (±0.11)-4.65-33.40 (±0.12)
STOCK6S-14956Lig2-51.77 (±0.05)-26.52 (±0.09)49.93 (±0.07)-4.76-33.12 (±0.06)
STOCK5S-38284Lig3-56.81 (±0.05)-21.06 (±0.13)52.99 (±0.11)-5.44-30.32 (±0.08)
STOCK6S-52092Lig4-58.70 (±0.05)-31.42 (±0.13)66.36 (±0.12)-5.06-28.82 (±0.07)
STOCK6S-34336Lig5-51.96 (±0.04)-21.96 (±0.09)50.09 (±0.08)-4.68-28.51 (±0.07)
STOCK7S-32878Lig6-55.17 (±0.04)-19.90 (±0.06)51.46 (±0.06)-4.87-28.48 (±0.06)
STOCK7S-33416Lig7-51.18 (±0.04)-16.22 (±0.08)44.67 (±0.07)-4.56-27.29 (±0.06)
STOCK6S-37289Lig8-48.23 (±0.06)-18.45 (±0.09)43.92 (±0.08)-4.40-27.16 (±0.06)
STOCK6S-39116Lig9-52.51 (±0.05)-30.14 (±0.09)60.60 (±0.08)-4.83-26.88 (±0.07)

Table 5. Binding Free Energy calculation results of the screened compounds with the more negative ∆Gbind than positive control SHP844 targeted on allosteric site 2 (unit: kcal/mol).

CompoundRename∆Evdw∆Eele∆GPB∆Gnonp∆Gbind
SHP844-44.98 (±0.08)-40.03 (±0.15)66.02 (±0.12)-4.60-23.59 (±0.07)
STOCK4S-23418Comp1-44.59 (±0.04)-57.20 (±0.08)70.70 (±0.08)-4.32-35.41 (±0.05)
G357-0324Comp2-56.72 (±0.08)-8.37 (±0.10)42.26 (±0.12)-5.67-28.50 (±0.06)
T1634Comp3-45.31 (±0.05)-28.22 (±0.19)51.35 (±0.15)-4.56-26.74 (±0.06)
G346-0245Comp4-40.39 (±0.06)-10.27 (±0.13)30.80 (±0.12)-4.31-24.17 (±0.06)
E596-0029Comp5-45.48 (±0.06)-20.58 (±0.12)46.42 (±0.12)-4.40-24.04 (±0.05)
Analyzing the detailed contributions of various energy components, we observed that the favorable contributions to ∆Gbind are consistent for both allosteric sites 1 and 2. These contributions primarily originated from ∆Eele, ∆Evdw, and ∆Gnonp.
In general, more hit compounds were identified targeting allosteric site 1 compared to allosteric site 2. This discrepancy is likely related to the shape of the binding pocket, as compounds easily bind to the "tunnel" shape of allosteric site 1. In contrast, allosteric site 2 has a shallow and wide shape, which increases the possibility of off-target compounds and poses challenges for drug design. Additionally, since the binding pocket of allosteric site 2 spans the N-SH2 and PTP domains exactly, inhibitors binding to this region can stabilize the self-inhibited state of SHP2. This finding is also of great interest for the development of allosteric site 2 inhibitors as well.

3.6. Binding mode analysis

3.6.1. Allosteric site 1

As noted in Section 3.5, Lig1, with its favorable binding free energy and representative characteristics within the class of three-membered heterocyclic compounds, was selected for detailed analysis of the interaction mechanism alongside the solo-protein and the positive control SHP099. Fig. 2 presents the RMSD and radius of gyration (Rg) results for solo protein SHP2E76A and complexes SHP2E76A_Lig1/SHP099. Comparatively, both Lig1 and SHP099 enhances the stability and compactness of the complex systems, as indicated by lower average RMSD values of 2.87 Å and 2.93 Å and Rg values of 25.77 Å and 25.91 Å, respectively.
Fig 2
  1. Download: Download high-res image (293KB)
  2. Download: Download full-size image

Fig. 2. The RMSD and Rg results of solo protein SHP2E76A and complexes SHP2E76A_Lig1/SHP099.

The residue free energy decompositions were calculated to explore the key residues and their functions in SHP2E76A_SHP099/Lig1 complex systems (Fig. 3). Compared with these two complex systems, the favor contributions to free energy consistently exceeds -2.0 kcal/mol for residue Arg111, Glu250, and Thr253. In SHP2E76A_Lig1 complex system, Glu250, an electronegative residue, whose skeleton carbonyl oxygen and the carboxyl oxygen of the sidechain can form hydrogen bonds with the hydroxyl-linked hydrogen of the tricyclic parent nucleus and the methoxyl hydrogen of sidechain, respectively. (Fig. 4) Similarly, in SHP2E76A_SHP099 complex system, Glu250 can also form hydrogen bonds with the amidogen of pyrazine in the middle. However, it has a minor residue energy contribution compared to Arg111, which forms a hydrogen bond with the nitrogen of pyrazine on the opposite side. (Fig. 4) Notably, no stable hydrogen bond interaction involving Arg111 was found in SHP2E76A_Lig1, although it still makes a minor residue energy contribution. Analysis of the electrostatic potential energy diagram reveals that Arg111 is exactly located at one terminal of the “tunnel” in allosteric site 1, and its strong electropositivity leads to great repulsive interaction with the pyridine nitrogen of the tricyclic of Lig1. This partially blocks Lig1’s off-target binding, resulting in the decomposition energy of -2.18 kcal/mol. In addition, Leu216 is the residue that contributed most to the total binding free energy in SHP2E76A_Lig1, up to -3.13 kcal/mol. It can form hydrogen bonds with the carbonyl oxygen and the imino hydrogen, respectively. In addition, Lig1 forms hydrophobic interactions with Thr219 and Pro491, as well as stronger hydrogen bonds with His114, Thr218, and Thr219.
Fig 3
  1. Download: Download high-res image (233KB)
  2. Download: Download full-size image

Fig. 3. Residue free energy decomposition diagrams of SHP2E76A_SHP099 and hit complex Lig1.

Fig 4
  1. Download: Download high-res image (875KB)
  2. Download: Download full-size image

Fig. 4. The interaction diagram of SHP2E76A_Lig1/SHP099.

Furthermore, the scaffold novelty of Lig1 was explored through the structural characterization of the compound. Compared to the pyrazine-piperidine ring parent nucleus of the positive control SHP099, Lig1 stands out with a more unique and flexible scaffold, containing a densely fused dihydropyrazole-pyridine system that provides both greater flexibility and structural complexity. Compared to the limited number of polar groups of conventional inhibitors such as SHP099, Lig1 incorporates multiple functional groups (e.g., hydroxyl, methoxy, and isopropyl) that optimize the balance between hydrophobic and polar interactions. These factors also collectively explain the higher binding affinity of Lig1 compared to SHP099.

3.6.2. Allosteric site 2

Due to the special "latch" shape of allosteric site 2, the hit compounds are mostly chain-like structures, with monocyclic or bicyclic structures on the terminal, connected by carbon-carbon single bonds. Such linker mode provides greater flexibility, allowing for better binding to allosteric site 2. Herein, the special interactional mechanism of Comp1, which exhibits the strongest binding affinity to SHP2E76A, was analyzed in detail. Structurally, Comp1 contains mainly of two benzene rings and a triazole ring connected by rotatable bonds, enabling the flexible rotation possible. The central benzene ring has two branches consisting of allyl and ether bonds, respectively.
Since there is still no inhibitor targeting allosteric site 2 discovered to enter the clinical stage, we took SHP844, with the IC50 of 18.9µM, as the positive control in this study. Its bioactivity is the best for allosteric site 2. Fig. 5 present the RMSD and Rg results of solo protein SHP2E76A and complexes SHP2E76A_Comp1/SHP844. It can be found that compared to SHP844, the binding of Comp1 significantly increases the stability and the compactness of complex system. The averaged RMSD and Rg values of SHP2E76A_Comp1 are 2.71 Å and 25.33 Å, respectively.
Fig 5
  1. Download: Download high-res image (338KB)
  2. Download: Download full-size image

Fig. 5. The RMSD and Rg results of solo protein SHP2E76A, complexes SHP2E76A_Comp1/SHP844.

The residue free energy decompositions were calculated to further explore the key residues and their functions in SHP2E76A_SHP844/Comp1 complex systems (Fig. 6). In SHP2E76A_Comp1, the greatest favorable contribution comes from the electropositive residue Arg265, up to -4.95 kcal/mol, due to the hydrogen bond formed between its sidechain and the α, β-unsaturated carboxyl group of Comp1, respectively (Fig. 7). This probably the main reason why ∆Eele is larger than ∆Evdw, as listed in Table 5. This relatively strong electrostatic interaction effectively anchores Comp1 to the pocket surface, preventing off-target effect.
Fig 6
  1. Download: Download high-res image (164KB)
  2. Download: Download full-size image

Fig. 6. Residue free energy decomposition diagrams of SHP2E76A_Comp1/SHP844.

Fig 7
  1. Download: Download high-res image (769KB)
  2. Download: Download full-size image

Fig. 7. The interaction diagram of SHP2E76A_Comp1/SHP844.

The mutated Ala76 makes a minor favorable contribution (-2.86 kcal/mol), which performs comparatively much better than in SHP2E76A_SHP844. Along with the other key hydrophobic residues, such as Tyr263, Leu262, Phe71, and Tyr80, Ala76 mainly increases the binding affinity through hydrophobic interactions. Apart from these two residues, all the other residues has energy contributions less than -2 kcal/mol, which also explains the different binding modes compared to SHP2E76A_SHP844 complex system. In SHP2E76A_SHP844, its ∆Evdw is larger than ∆Eele (Table 5), indicating that van der Waals energy interactions are the main favorable contribution compared to electrostatic interactions, and these interactions are more evenly distributed. This is still mainly since there are a series of bumpy sub-active cavities existed on the surface of allosteric site 2, which greatly restricts the orientation and binding of different structural compounds. Therefore, Comp1 can be acted as a potential hit compound with ample room for the further structural modification and optimization to obtain a more suitable candidate compound. Furthermore, the key effect of residue Ala76 also provides valuable insights for specific drug design targeting allosteric site 2 of SHP2E76A. In addition, Comp1 forms hydrophobic interactions with Lys280, hydrogen bonds with Gln79, Glu83, His84, Gln269, and Asn281, and π-cation interactions with Lys274.
Furthermore, the scaffold novelty of Comp1 was explored through the structural characterization of the compound. Firstly, Comp1 don't have large rigid cyclic structure moiety, compared to a triazolo [4,3-a] quinazolin tricyclic structure involved in SHP844. This increases its degree of freedom and occupied effect, matching the wide “latch” shape of binding pocket of site 2 better. Secondly, Comp1 contains more polar groups, compared to more hydrophobic group involved in SHP844. This can lead to the formation of stronger hydrogen bonds or salt bridges when binding, thereby enhancing the binding affinity. These factors also collectively explain the higher binding affinity of Comp1 compared to SHP844.

4. Conclusions

SHP2 is a key regulatory factor in the protein kinase pathways involved in various signal regulations. Our study aims to find novel and highly selective inhibitors by computer-aided drug design methods. For the allosteric binding pocket of SHP2E76A, virtual screening methods based on molecular docking strategies were adopted. Through drug-likeness evaluation and ADMET screening, combined with the MD simulations and free energy-based virtual screening, 9 and 5 candidate compounds were screened from allosteric site 1 and allosteric site 2 respectively, out of 573,412 small molecules. Eventually, two hit compounds, Lig1 and Comp1, were also selected, respectively. Then, their corresponding specific binding mechanisms were explored in depth.
Based on the previous analysis, several optimization directions for the hit compounds were proposed to further improve the stability, selectivity, and off-target effects, thereby improving their safety and efficacy in clinical applications. For Lig1, replacing the hydroxyl group on the benzene ring with a more stable substituent, such as a fluorine atom, is recommended to enhance metabolic stability. The introduction of a fluorine atom could also improve hydrophobicity. Additionally, reducing the number of polar groups and incorporating nonpolar groups, such as fluorinated aryl groups, may help minimize potential off-target effects. For Comp1, substituting the carboxyl group with a more stable hydroxycarbamate or a five-membered heterocyclic imidazole could improve metabolic stability. Alternatively, introducing neutral polar groups, such as nitro groups, may enhance target selectivity.
The new compounds framework provided by the study opens a new starting point for the chemical synthesis route of drugs. It also lays a solid foundation for further structure and mechanism-based drug design, providing theoretical support and impetus for obtaining SHP2E76A inhibitors with different characteristics.

CRediT authorship contribution statement

Fanru Yuan: Writing – original draft, Validation, Methodology, Investigation, Formal analysis, Data curation. MengGuo Chen: Writing – original draft, Validation, Methodology, Investigation, Formal analysis, Data curation. Shaohui Liu: Writing – review & editing, Visualization, Formal analysis. NanNan Ma: Investigation, Data curation. Jiangfeng Du: Writing – review & editing, Software, Resources. Hongmin Liu: Project administration, Funding acquisition, Conceptualization. Yong Guo: Writing – review & editing, Project administration. Longhua Yang: Writing – review & editing, Writing – original draft, Supervision, Project administration, Funding acquisition.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:
LongHua Yang reports financial support was provided by National Natural Science Foundation of China. Hongmin Liu reports financial support was provided by Key Scientific and Technological Projects of Henan Province. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was funded by the National Natural Science Foundation of China (No. 21803058 and No. U21A20416), the Outstanding Youth Science Foundation of Henan Province (No. 242300421078), and Education and Teaching Reform Research and Practice Project of Zhengzhou University (No. 2022ZZUJGXMLXS-013 and No. 2022ZZUJG268). We sincerely thank the National Supercomputing Center in Zhengzhou for providing partial computing resources and software. We also thank for the impetus to our research origination brought by the concept and content of the education and teaching reform programs.

Appendix. Supplementary materials

Data availability

No data was used for the research described in the article.

References

Cited by (0)

1
These authors contributed equally to this work.
View Abstract