NAR Genom Bioinform. 2024 Mar; 6(1): lqae028.
Published online 2024 Mar 12. doi: 10.1093/nargab/lqae028
PMCID: PMC10935487
PMID: 38482061

mRNAid, an open-source platform for therapeutic mRNA design and optimization strategies

Nikita Vostrosablin, Shuhui Lim, Pooja Gopal, Kveta Brazdilova, Sushmita Parajuli, Xiaona Wei, Anna Gromek, David Prihoda, Martin Spale, Anja Muzdalo, Jamie Greig, Constance Yeo, Joanna Wardyn, Petr Mejzlik, Brian Henry, Anthony W Partridge, and Danny A Bitton (corresponding author)

Abstract

Recent COVID-19 vaccines unleashed the potential of mRNA-based therapeutics. A common bottleneck across mRNA-based therapeutic approaches is the rapid design of mRNA sequences that are translationally efficient, long-lived and non-immunogenic. Currently, an accessible software tool to aid in the design of such high-quality mRNA is lacking. Here, we present mRNAid, an open-source platform for therapeutic mRNA optimization, design and visualization that offers a variety of optimization strategies for sequence and structural features, allowing one to customize desired properties into their mRNA sequence. We experimentally demonstrate that transcripts optimized by mRNAid have characteristics comparable with commercially available sequences. To encompass additional aspects of mRNA design, we experimentally show that incorporation of certain uridine analogs and untranslated regions can further enhance stability, boost protein output and mitigate undesired immunogenicity effects. Finally, this study provides a roadmap for rational design of therapeutic mRNA transcripts.

Introduction

mRNA-based therapeutics continue to revolutionize vaccine development (1), immunotherapy (2) and targeted degradation methodologies (3), advancing the battle of modern medicine against infectious diseases, genetic disorders and cancer [4–6, reviewed in (7)]. Irrespective of the various indications and the diverse mechanisms of action of mRNA-based drugs, all derived therapeutics and vaccines share the same underlying principles. First, the mRNA sequence is designed and optimized in silico. Then, the optimized sequence is transcribed in vitro, often with selected chemical modifications. Finally, the synthetic transcript is packaged and delivered to the cytoplasm of host cells, where it is translated into a protein that exerts the desired cellular effect. Therefore, in silico mRNA design is undeniably instrumental to the success of any mRNA-based therapeutic. Transcript design is typically initiated with a decoration of the coding sequence (CDS) with flanking 5′- and 3′-UTRs (untranslated regions) and other signals (e.g. translational ramps, miRNA-binding sites, etc.) that can improve stability and translation efficiency, and enable tissue-specific expression (8, 9). Then, rigorous sequence engineering is required to eliminate immunogenic properties (10) and further enhance transcript stability and translation (11–13). During in vitro transcription, chemical modifications such as a 5′-cap and nucleoside analogs can be incorporated to protect against degradation and evade host immune surveillance (14, 15). At present, mRNA design is largely dependent on expert knowledge, manual sequence editing, and distributed optimization and visualization tools that are often proprietary. There is no freely available tool specifically tailored for therapeutic mRNA design that combines multiple optimization strategies.
Here, we present mRNAid, an open-source, integrated software that bundles several modified and extended algorithms and tools for constraint propagation, sequence optimization and secondary structure visualization. Via an intuitive and user-friendly interface, mRNAid orchestrates simultaneous optimization of several sequence and structural properties including codon usage, GC content, minimum free energy (MFE), uridine depletion and exclusion of specific motifs and/or rare codons, thereby providing a powerful platform for therapeutic mRNA design. mRNAid is available at https://github.com/MSDLLCpapers/mRNAid and as a web application at https://mrnaid.dichlab.org.

Materials and methods

Tool architecture

The application consists of several parts, which are containerized and can be easily built and run with the ‘docker-compose’ utility (Supplementary Figure S1). The frontend is served as static files by the Nginx server, which can also be configured as a reverse proxy. The frontend container communicates with the backend through the uwsgi protocol. The backend is a Python Flask application served by the uWSGI server. The optimization tasks are handled by a Celery task queue, with the Redis in-memory database working as a message broker. A mounted volume is used to keep logs of the backend execution. All individual parts reside within separate containers that communicate with each other inside a Docker network. The user interface is written in React.js and consists of an input form and a results page. The input form allows users to select different optimization strategies, set optimization parameters and submit the optimization job. The results page includes a visualization of the optimized sequences generated via the rgw Forna JavaScript visualization container, combined with an MFE mountain plot and a summary of optimized sequence properties. Users can export the results in PDF or Excel format.

Optimization strategies

The core of the tool is the freely available sequence optimization framework for Python, DNA Chisel (16). DNA Chisel allows the use of built-in specifications to approach some of the common optimization tasks (such as matching target codon usage in the host or ensuring correct translation to protein by using only synonymous codons during the optimization, etc.) and it is very flexible with respect to defining completely new optimization specifications. These specifications can either be hard constraints, which cannot be violated in the final sequence, or they can be considered as soft constraints or objectives, whose score is maximized in the final sequence. Some specifications can be used as both constraint and objective, depending on user requirements. When multiple objectives are defined in the optimization problem, the total weighted score is maximized.

DNA Chisel solves the constraint satisfaction problem using a combination of constraint propagation and local search methods. The optimization algorithm consists of two main steps, the resolution of all hard constraints and maximization of objectives’ scores with respect to the constraints. The solver reduces the optimization problem to a set of local optimization problems, which are resolved individually. The optimization is performed either by random mutations on the sequence or by exhaustive search through the pre-computed mutation space, depending on the size of the latter. During the mRNAid optimization, the tool combines all the specified constraints and objectives together into the optimization problem and calls the DNA Chisel optimization method. This procedure is re-executed in parallel either until a user-specified number of sequences is produced or until the number of attempts is exceeded. After the optimization is completed, mRNAid ranks the optimized sequences based on the scoring function, as described below.

In our optimization approach, the following built-in specifications were used as hard constraints: AvoidPattern to ensure that certain motifs are excluded; EnforceGCContent to keep GC levels within set bounds across the sequence; AvoidRareCodons to exclude rare codons with a codon frequency below the threshold; and EnforceTranslation to ensure that the optimized sequence is translated back to the same protein as the input. As objectives, the built-in MatchTargetCodonUsage was used to match codon usage frequencies in the host and EnforceGCContent to optimize GC content in a sliding window across the sequence. The details of these specifications can be found in the DNA Chisel documentation (16).

Additional custom specifications were implemented and integrated into the tool. The Uridine Depletion hard constraint was implemented to ensure no codons with uridine in the third position are present in the sequence. In addition, three new objectives were implemented: MatchTargetPairUsage to account for dinucleotide usage; MatchTargetCodonPairUsage to optimize for codon pair usage frequencies; and MinimizeMFE to use different algorithms for MFE estimation.

A short description of the specifications is also provided in the mRNAid tooltip widget in the form of a question mark (‘?’) adjacent to a given specification. On the submission page of the mRNAid tool, the built-in specifications appear as ‘Avoid motifs’ (AvoidPattern), ‘Global GC content’ and ‘Window size for local GC content’ (EnforceGCContent), ‘Codon usage frequency threshold’ (AvoidRareCodons), ‘CAI optimization’ (MaximizeCAI) and the custom specifications, ‘Uridine depletion’, ‘Match dinucleotide usage’, ‘Match codon-pair usage’, ‘Use more accurate MFE estimation’ and ‘Entropy window size’ (MFE optimization).

In the following sections, we describe the custom specifications in more detail.

Uridine depletion

This constraint ensures that no codon in the optimized sequence has uridine in its third position. It is built on DNA Chisel's CodonSpecification class.
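
As a rough illustration of what this constraint implies for codon choice, the sketch below lists the synonymous codons that remain for an amino acid once codons ending in uridine are excluded. It uses the standard genetic code written in DNA letters (T for U). The helper name `allowed_codons` is hypothetical, and this is not mRNAid's CodonSpecification-based implementation.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def allowed_codons(amino_acid):
    """Synonymous codons for `amino_acid` that do not end in T (U in RNA).

    Hypothetical helper for illustration only.
    """
    return sorted(
        codon
        for codon, aa in CODON_TABLE.items()
        if aa == amino_acid and not codon.endswith("T")
    )


print(allowed_codons("F"))  # phenylalanine: only the C-ending codon remains
print(allowed_codons("I"))  # isoleucine keeps two of its three codons
```

Under the standard code every amino acid keeps at least one codon that does not end in U, so the constraint can be satisfied without changing the encoded protein.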

Dinucleotides, codon-pair, CAI and MatchCodonUsage optimizations

Dinucleotides and codon-pair are custom objectives derived from usage tables from the CoCoPUTs database (17). They account for the difference between dinucleotide or codon-pair frequencies in the host organism (Homo sapiens or Mus musculus) and the current sequence. The score is calculated by the following formula:

equation M0001
equation M0001a

where equation M0002 is the score for a given nucleotide pair or codon-pair, equation M0003 is the total score being the mean of all the individual scores, equation M0004 is the frequency for a given pair, equation M0005 is a corresponding frequency from the database and equation M0006 is the total number of pairs across the sequence. The total score is maximized by the DNA Chisel optimization algorithm. Codon Adaptation Index (CAI) optimization is a built-in objective, used if the user selects it. CAI optimization is a common optimization strategy introduced in (18). MatchCodonUsage is a built-in constraint which minimizes the sum of discrepancies over all possible codon frequencies in a given sequence and in the target organism, and is set as the default. All codon optimization objectives are considered mutually exclusive, so it is not possible to combine them in our tool.
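
The pair-usage objective described above can be sketched as follows. The exact formula is not reproduced in this text, so the sketch assumes one plausible form: each pair in the sequence is scored by the negative absolute difference between its frequency in the sequence and its frequency in the reference table, and the total is the mean of these scores. The function names and the reference values are hypothetical.

```python
from collections import Counter


def adjacent_pairs(seq, step):
    """List adjacent pairs: step=1 gives dinucleotides, step=3 gives codon pairs."""
    width = 2 * step
    return [seq[i:i + width] for i in range(0, len(seq) - width + 1, step)]


def pair_usage_score(seq, reference, step=1):
    """Mean of per-pair scores; 0 is a perfect match, more negative is worse.

    ASSUMPTION: the per-pair score is -|f_seq - f_db|. This is an
    illustrative choice, not necessarily the expression used by mRNAid.
    """
    pairs = adjacent_pairs(seq.upper(), step)
    if not pairs:
        raise ValueError("sequence too short to contain a pair")
    counts = Counter(pairs)
    n = len(pairs)
    scores = [-abs(counts[p] / n - reference.get(p, 0.0)) for p in pairs]
    return sum(scores) / n


# Hypothetical reference frequencies, for illustration only.
reference = {"AC": 0.5, "CA": 0.5}
print(pair_usage_score("ACAC", reference))
```

Because the score is at most 0 under this assumption, maximizing it pulls the pair frequencies of the sequence towards those of the host table.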

MFE optimization

We aim to maximize the MFE in a specified region of the sequence starting from the 5′ end of the mRNA molecule. We call this region an entropy window. Maximization of MFE in this region enforces a more open structure with fewer base pairs formed, which makes it more accessible to ribosomes. The aim is to have the MFE of the 5′ end as close to 0 as possible (it is usually negative). The user can choose between two algorithms for MFE estimation. The first one is the RNAfold algorithm (19), based on dynamic programming which thoroughly explores all possible secondary structures. This process can take up to several seconds depending on the size of the sequence of interest and might not be the best option when multiple runs are required (which is exactly the case for mRNAid). However, the benefit gained in exchange for the long computational time is the high accuracy of the estimates. The RNAfold package is also used to provide the calculated secondary structures to the frontend for subsequent visualization.

The alternative option is to use a faster MFE estimation algorithm, which is based on the correlated stem–loop prediction approach proposed in (20). In this approach, all possible single stem–loop conformations are considered, and their interaction energies are averaged. This algorithm has quadratic complexity O(n^2), where ‘n’ is the number of nucleotides in the sequence, compared with the cubic complexity O(n^3) of the RNAfold algorithm. The simplified algorithm is used during the optimization, when the mutation space is explored, to estimate the score of each mutated sequence. However, the final MFE value reported for the best sequence after the optimization is estimated with the RNAfold algorithm.

Scoring function

A scoring function which evaluates sequences for different criteria was applied to the list of optimized sequences. The final score used in ranking was:

equation M0007

where equation M0008 are individual weights of each score (equation M0009), equation M00010 is the uridine depletion score, equation M00011 is the GC content score, equation M00012 is the CAI score, equation M00013 is the total MFE score and equation M00014 is the MFE 5′-end score. The weights were assigned to accommodate the order of importance for different objectives, as defined by the experimental scientists and based on previous reports. These weights were later fine-tuned based on the experimental data. As more data become available, these parameters can be further optimized and changed accordingly.
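
The ranking step can be sketched as below, assuming the final score is a plain weighted sum of the individual scores (the text speaks of individual weights for each score, but the formula itself is not reproduced here). The weights, score values and function names are hypothetical and chosen only to make the example easy to follow.

```python
def total_score(scores, weights):
    """Weighted sum of individual scores; a weight of 0 drops that term.

    ASSUMPTION: a linear combination, used here for illustration only.
    """
    return sum(weight * scores[name] for name, weight in weights.items())


def rank_sequences(candidates, weights):
    """Sort (name, scores) pairs from highest to lowest total score."""
    return sorted(
        candidates,
        key=lambda item: total_score(item[1], weights),
        reverse=True,
    )


# HYPOTHETICAL weights and scores, not the values used by mRNAid.
weights = {"uridine_depletion": 0, "gc": 1, "cai": 2, "mfe_total": 1, "mfe_5": 3}
candidates = [
    ("seq_a", {"uridine_depletion": 0.2, "gc": 0.5, "cai": 0.75,
               "mfe_total": 0.25, "mfe_5": 0.5}),
    ("seq_b", {"uridine_depletion": 0.0, "gc": 0.5, "cai": 0.5,
               "mfe_total": 0.25, "mfe_5": 0.75}),
]

for name, scores in rank_sequences(candidates, weights):
    print(name, total_score(scores, weights))
```

Setting a weight to 0, as in the uridine depletion entry here, removes that criterion from the ranking without changing the rest of the calculation.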

Uridine depletion score

Uridine depletion is checked by counting each uridine at the third position in a codon and normalizing to the codon number. Maximum and minimum values are 1 and 0 (all/no codons have uridine at the third position). When uridine depletion is not specified by the user, this is not included in the final scoring function (by setting the weight to 0).
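
A minimal sketch of this count, assuming the input is an in-frame coding sequence given in either RNA or DNA letters (the function name is hypothetical):

```python
def uridine_depletion_score(cds):
    """Fraction of codons with uridine (U, or T in DNA) at the third position.

    Returns 1.0 if every codon ends in U and 0.0 if none does.
    Trailing bases that do not form a full codon are ignored.
    """
    cds = cds.upper().replace("T", "U")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    return sum(codon[2] == "U" for codon in codons) / len(codons)


print(uridine_depletion_score("AUGUUUGCC"))  # one of three codons ends in U
```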

GC score

GC content is calculated for the whole sequence and checked to be within the user-defined range (GC_min and GC_max). The score is calculated as a growing linear function of GC content value to favor sequences with larger GC values:

equation M00015

The score is bounded in the range 0 to 1. As GC content has an influence on properties and expression rates of mRNA, we optimize the sequence to fit the GC content in a specified window.
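
One way to realize a growing linear score bounded between 0 and 1 is to rescale the GC fraction between the user-defined limits. The sketch below assumes exactly that form, which may differ from the expression used in mRNAid; the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc_score(seq, gc_min, gc_max):
    """Linear score rising from 0 at gc_min to 1 at gc_max.

    ASSUMPTION: linear rescaling with clamping to [0, 1]; illustrative only.
    """
    if gc_max <= gc_min:
        raise ValueError("gc_max must be greater than gc_min")
    scaled = (gc_content(seq) - gc_min) / (gc_max - gc_min)
    return min(1.0, max(0.0, scaled))


print(gc_score("ATGC", 0.25, 0.75))  # GC fraction 0.5, midway between limits
```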

Codon Adaptation Index score

The CAI is a widely used metric of synonymous codon usage bias, which measures the deviation of codon usage from that in a reference set of highly expressed genes (18). The CAI score is equal to the value of the CAI itself and is bounded in the range 0 to 1, with 1 being the optimal CAI.
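
The CAI is conventionally computed as the geometric mean of the relative adaptiveness of each codon, where relative adaptiveness is the codon's frequency in the reference set divided by the frequency of the most used synonymous codon. The sketch below follows that standard definition with a small, hypothetical usage table covering two amino acids; real calculations would use a full reference table for the host.

```python
import math


def relative_adaptiveness(usage):
    """Map each codon to w = f(codon) / max f over its synonymous codons."""
    weights = {}
    for codon_freqs in usage.values():
        best = max(codon_freqs.values())
        for codon, freq in codon_freqs.items():
            weights[codon] = freq / best
    return weights


def cai(cds, usage):
    """Geometric mean of w over the codons of `cds` that appear in `usage`.

    Codons missing from the table are skipped; all table frequencies are
    assumed to be greater than zero.
    """
    weights = relative_adaptiveness(usage)
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# HYPOTHETICAL reference frequencies for alanine (A) and lysine (K).
USAGE = {
    "A": {"GCC": 0.4, "GCT": 0.2},
    "K": {"AAG": 0.6, "AAA": 0.3},
}

print(cai("GCCAAG", USAGE))  # only the most frequent codons are used
print(cai("GCTAAA", USAGE))  # only the less frequent codons are used
```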

Total MFE score

It is preferable to have sequences with a lower value of MFE. To enable efficient sorting of the sequences according to this requirement, we use the following score:

equation M00016

A value of 5000 was chosen based on the observed value of many input sequences that did not exceed an MFE value of 3500. In that way, the score remains between 0 and 1, where the score tends to zero when MFE goes to zero, and does not exceed 1 for MFE values around 3500 bp.

5′-MFE score (mfe_5_score)

The MFE of the 5′ end is calculated using RNAfold. MFE has a theoretical maximum of 0, but in practice does not reach that value. The score is calculated as a decreasing exponential function of the 5′-MFE:

equation M00017

In this case equation M00018 when equation M00019 and equation M00020 when equation M00021. The score is thus bounded in the range 0 to 1, with the aim of bringing the 5′-MFE as close to 0 as possible.

Comparison with COVID-19 vaccines

The sequence for the native spike surface glycoprotein gene was retrieved from the NCBI (NC_045512.2) and then optimized by mRNAid using Strategy 5 (Supplementary Table S1). The resultant optimized sequence was compared with the coding sequences within the putative Pfizer/BioNtech and Moderna vaccines (21, 22). Putative Pfizer/BioNtech and Moderna assembled vaccine sequences were downloaded from: https://github.com/NAalytics/Assemblies-of-putative-SARS-CoV2-spike-encoding-mRNA-sequences-for-vaccines-BNT-162b2-and-mRNA-1273. Following CAI optimization, the mean Levenshtein distance between a given mRNAid-optimized sequence and the putative Moderna or Pfizer/BioNtech assembled vaccine sequences was 131 and 350, respectively, reflecting ∼3.5% and 9.1% sequence variation, when 3819 nucleotides of the spike CDS are considered (excluding the stop codon).
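
For readers unfamiliar with the metric, the Levenshtein distance is the minimum number of single-character substitutions, insertions and deletions needed to turn one sequence into another. The sketch below is a standard two-row dynamic programming implementation, not the code used in the study; `percent_variation` is a hypothetical helper that expresses a distance relative to a sequence length.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b using two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string along the row to save memory
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def percent_variation(distance, length):
    """Express an edit distance as a percentage of the sequence length."""
    return 100.0 * distance / length


print(levenshtein("AUGGCCAAG", "AUGGCUAAA"))
```

For two coding sequences of a few thousand nucleotides this quadratic-time approach is fast enough to run directly.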

Experimental validation

In vitro transcription

mRNAs with ARCA or CleanCap® with or without uridine modification were in vitro transcribed using the mMESSAGE mMACHINE® T7 Ultra transcription kit (Ambion, AMB13455). Linearized plasmid DNA containing the target gene downstream of a T7 RNA polymerase promoter was used as the template, and synthesis reactions were performed according to the manufacturer's protocol. For mRNAs with CleanCap®, T7 2× NTP/ARCA was substituted with 8 mM CleanCap® Reagent AG (TriLink Biotechnologies, N-7113) and 10 mM of each NTP. Modified uridines used included pseudouridine-5′-triphosphate (TriLink Biotechnologies, N-1019), N1-methyl-pseudouridine-5′-triphosphate (TriLink Biotechnologies, N-1081) or 5-methoxyuridine-5′-triphosphate (TriLink Biotechnologies, N-1093). mRNAs were subsequently purified by the MegaClear Transcription Clean-up kit (Ambion, AM1908) and quantified on the NanoDrop spectrophotometer.

Cell culture

All cell lines were obtained from the American Type Culture Collection (ATCC) and grown at 37°C, 5% CO2. MIA PaCa-2 (CRL-1420) cells were maintained in Dulbecco’s modified Eagle’s medium (DMEM) with high glucose and GlutaMAX™ supplement (Gibco), 10% fetal bovine serum (FBS; HyClone) and 2.5% horse serum (Gibco). BJ fibroblasts (CRL-2522) were cultured in minimal essential medium (MEM) with GlutaMAX™ supplement (Gibco) and 10% FBS (HyClone). SJCRH30 (CRL-2061) cells were cultured in RPMI with GlutaMAX™ supplement (Gibco) and 10% FBS (HyClone).

Luminescence assays

Lipofectamine™ MessengerMAX™ (Life Technologies) was diluted in opti-MEM to the desired working concentration and dispensed onto 384-well white assay plates (Greiner 781080). A source plate (Labcyte LP-0200) containing serial dilutions of the mRNAs was prepared using the Bravo liquid handler (Agilent), and a 10-point 2-fold dose titration of each mRNA was dispensed onto the assay plate using Echo555 (Labcyte). After a 10 min incubation, 4000 MIA PaCa-2 or SJCRH30 cells or 6000 BJ cells were added per well. For kinetic monitoring, 20 μM Endurazine (Promega), an extended time-released live cell substrate, was added to each well. Luminescence was measured continuously at 1 h intervals for 48 h on the Tecan Spark 10M set to 37°C, 5% CO2. For end-point HiBiT protein detection, the NanoGlo HiBiT lytic detection assay (Promega, N3040) was performed as per the manufacturer's instructions. Luminescence signal was determined using the Envision plate reader, and values were normalized to a HiBiT-control protein (Promega, N3010).

Western blot analysis

A total of 80 000 MIA PaCa-2 cells were seeded per well in a 24-well poly-d-lysine-coated cell culture plate (Greiner) and allowed to attach overnight before mRNA transfection with Lipofectamine™ MessengerMAX™ (Life Technologies) according to the manufacturer's protocol. After 24 h incubation, 100 μl of Bolt™ lithium dodecyl sulfate (LDS) sample buffer supplemented with Bolt™ sample reducing agent was added per well of a 24-well plate. The wells were scraped using wide orifice tips and the lysate was transferred into polymerase chain reaction (PCR)-strip tubes and sonicated for 10 × 10 s in a chilled water bath sonicator (QSonica). A 15 μl aliquot of protein extract was separated on 4–12% Bis-Tris plus gels, transferred onto nitrocellulose membranes using the Trans-Blot® Turbo™ semi-dry system (Bio-Rad), and blocked for 1 h at room temperature with Intercept™ (TBS) blocking buffer (Li-Cor). Blots were probed with the appropriate primary antibodies overnight at 4°C in blocking buffer supplemented with 0.1% Tween-20, followed by the secondary antibodies IRDye® 680RD donkey anti-mouse IgG or IRDye® 800CW donkey anti-rabbit IgG (Li-Cor) for 1 h at room temperature. Fluorescent signals were imaged and quantified using Odyssey® CLx. Primary antibodies used were: NanoLuc (Promega, N7000) and glyceraldehyde phosphate dehydrogenase (GAPDH; Cell Signaling Technology, #5174).

IFN-β detection in BJ fibroblasts

BJ fibroblasts were seeded in 96-well poly-d-lysine-coated cell culture plates (Greiner) at 20 000 cells per well and transfected the next day with 50 ng per well of the respective mRNA using Lipofectamine™ MessengerMAX™ (Life Technologies). The supernatant was harvested 48 h post-transfection and interferon-β (IFN-β) levels were determined using the Bio-Plex Pro Human Inflammation Panel 1 (BioRad) as per the manufacturer's protocol. Data were acquired on the Bio-Plex Pro 200 system (BioRad).

Results and discussion

The core backbone for sequence optimization in mRNAid is based on the DNA Chisel framework (16) (Supplementary Figure S1). DNA Chisel permits global and local optimization of hard and soft constraints, which in turn enables the adjustment of the desired sequence properties along the entire transcript. Hard constraints refer to criteria that must be satisfied in the final sequence, whereas soft constraints refer to criteria whose score must be maximized. Furthermore, the ability to flexibly define new constraints makes DNA Chisel an ideal sandbox for probing the effect of a multitude of sequence properties on stability and expression.

mRNAid builds on several hard constraints implemented in DNA Chisel that enforce global GC content and translation and avoid rare codons and specific motifs. mRNAid extends this list with an important uridine depletion constraint that avoids codons with uridine at their third position, which reportedly improves expression and reduces immunogenicity (10).

Codon usage optimization aims to improve expression by systematic replacement of synonymous codons based on the organism's codon frequency table. Numerous proprietary and freely available codon optimization algorithms have been reported to date (11) and many show a strong preference towards the CAI (18) since it highly correlates with gene expression (23). The CAI reflects the deviation of codon frequencies in a sequence from those observed in a reference set of highly expressed genes in the target host organism (18). DNA Chisel includes CAI and Matched Codon Usage optimizations as soft constraints. The latter method ensures that the relative frequencies of the codons in the sequence match the overall codon usage of the target organism (16). While the above methods consider each codon independently, a recent study reported a bias in codon-pair utilization and dinucleotide usage that are inevitably inter-related (17).

The CoCoPUTs database provides pre-computed codon-pair and dinucleotide frequency tables in various organisms that can be used for sequence optimization. We implemented the dinucleotide and codon-pair usage optimizations since they have been shown to affect translation fidelity and efficiency (17, 24). There is a significant codon-pair usage bias in all three domains of life, and between proteins expressed at a low and high level within a species (25), that cannot be explained by individual codon bias, pointing towards a distinct mechanism of translation modulation. It has been suggested that codon-pair effects on the translation rate may be mediated by interactions of adjacent aminoacyl-tRNA molecules bound to ribosomes (25). To account for structural properties, we also incorporated the Vienna-RNA MFE optimization (19) and the correlated stem–loop prediction approach (20), given the pivotal role that mRNA secondary structures play in regulating translation efficiency (26). The Vienna-RNA method uses a thermodynamic energy model to compute the MFE of a given RNA sequence to identify the most thermodynamically stable secondary structure (19). To reduce the complexity of the MFE computation, the stem–loop method (20) estimates a pseudo-MFE as the average energy of all possible stem–loop conformations for a given sequence, increasing the computational efficiency of sequence optimization. Multiple studies reported a correlation between highly structured features in the CDS and functional mRNA half-life (27, 28). In contrast, the region around the translation start site is less structured in highly expressed genes (29), which presumably facilitates ribosome loading and prevents jamming. Thus, different transcript regions possess different structural properties that must be reflected in the optimization strategy. Furthermore, transcripts are often fused to already optimized UTRs, obviating the need for optimization of these regions. Given the above, and the fact that MFE optimization is extremely computationally expensive, we provide users with the flexibility to optimize MFE within an adjustable window in the 5′ end of the CDS, while accounting for global MFE in our ranking approach. To this end, we ensure that the combination of multiple constraints generates optimal sequence properties by applying a novel scoring formula that ranks sequences based on weighted scores of uridine depletion, GC content, CAI, local 5′ end and global MFEs.
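The ranking step above can be illustrated with a generic weighted-score sketch. The actual mRNAid scoring formula is not reproduced in this section, so everything here is an assumption: the min-max scaling, the feature names, the weights and the choice of which features count as "lower is better" are all hypothetical and supplied by the caller.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant feature contributes 0 for everyone."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def rank_candidates(candidates, weights, lower_is_better):
    """Return candidate names sorted from best to worst weighted score.

    candidates: {name: {feature: value}}
    weights: {feature: weight}
    lower_is_better: set of features where a smaller value is preferred
    """
    names = list(candidates)
    totals = {name: 0.0 for name in names}
    for feature, weight in weights.items():
        scaled = min_max([candidates[name][feature] for name in names])
        for name, s in zip(names, scaled):
            totals[name] += weight * ((1.0 - s) if feature in lower_is_better else s)
    return sorted(names, key=lambda name: totals[name], reverse=True)
```

A caller might, for example, pass `lower_is_better={"uridine", "global_mfe"}` and leave `"cai"` as higher-is-better; whether a weakly structured 5′ window should be rewarded is likewise a choice made through the same argument.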

Next, we selected five distinct optimization strategies in mRNAid that are based on a codon optimization approach coupled with uridine depletion, GC content and MFE optimizations (Strategies 1–5: dinucleotide, matched codon-pair usage, matched codon usage with or without uridine depletion and CAI, respectively) (Table 1). To experimentally validate these strategies, we adopted NanoLuciferase-PEST (Nluc-PEST) as the reporter system since the short half-lives of the individually produced luciferase proteins prevent confounding effects on interrogation of mRNA properties as they relate to translation efficiency or mRNA stability. Indeed, these effects can be conveniently measured through kinetic tracking of the target protein's luminescence. Area under the curve (AUC) and luminescence at 48 h (RLU @ 48 h) are represented as indicators of total protein output and functional mRNA stability, respectively, given that protein sequence and other mRNA features are kept constant. A de-optimized mRNA version of Nluc-PEST encoded by the least frequent codon for each amino acid (Rare, red, Figure 1A–C) was used as the input for mRNAid and the top four ranked mRNA sequences generated under each software setting were selected (Supplementary Tables S1 and S2). These were benchmarked against a proprietary sequence from Promega (Promega, blue, Figure 1A–C) as the codon-optimized control.
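As an illustration of the two readouts named above, the sketch below integrates a luminescence time course with the trapezoid rule and expresses a value as fold change over a control. The integration method is an assumption (the section does not state how AUC was computed), and the readings used in the usage note are hypothetical.

```python
def trapezoid_auc(times_h, signal):
    """Area under a kinetic curve by the trapezoid rule (time in hours)."""
    if len(times_h) != len(signal) or len(times_h) < 2:
        raise ValueError("need at least two paired time points")
    area = 0.0
    for i in range(1, len(times_h)):
        width = times_h[i] - times_h[i - 1]
        area += 0.5 * (signal[i] + signal[i - 1]) * width
    return area

def fold_change(value, control):
    """Express a readout relative to a control measured the same way."""
    if control == 0:
        raise ValueError("control readout must be non-zero")
    return value / control
```

For hypothetical readings of 0, 10 and 4 RLU at 0, 24 and 48 h, `trapezoid_auc([0, 24, 48], [0, 10, 4])` gives 288 RLU·h, and the final reading (4 RLU) would play the role of RLU @ 48 h.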

Table 1. Parameters for the experimentally tested optimization strategies

| Optimization strategy | U-depletion | Codon optimization |
|---|---|---|
| 1. Dinucleotide | Yes | Dinucleotides |
| 2. Matched Codon Pair Usage | Yes | Matched codon pair usage |
| 3. Matched Codon Usage U-depletion | Yes | Matched codon usage (default) |
| 4. Matched Codon Usage no U-depletion | No | Matched codon usage (default) |
| 5. Codon Adaptation Index (CAI) | Yes | CAI |

Other parameters were the same for all strategies: codon usage frequency threshold (10%), avoid motifs (NheI EcoRV NotI BamHI BspEI), minimal GC content (50), maximal GC content (70), window size for local GC content (100) and entropy window size (80).
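The shared constraints listed above can be checked on a candidate sequence with a short script. This is a sketch written from scratch, not mRNAid or DNA Chisel code: it assumes the GC limits are percentages evaluated in a sliding window, and it uses the standard recognition sequences of the five listed restriction enzymes.

```python
# Recognition sites of the enzymes named in the "avoid motifs" parameter.
SITES = {
    "NheI": "GCTAGC",
    "EcoRV": "GATATC",
    "NotI": "GCGGCCGC",
    "BamHI": "GGATCC",
    "BspEI": "TCCGGA",
}

def gc_percent(seq):
    """GC content of a non-empty sequence, in percent."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def windows_outside_gc_range(seq, window=100, min_gc=50.0, max_gc=70.0):
    """Start positions of sliding windows whose GC% is outside [min_gc, max_gc]."""
    seq = seq.upper()
    if not seq:
        return []
    size = min(window, len(seq))  # short sequences are treated as one window
    return [
        start
        for start in range(len(seq) - size + 1)
        if not (min_gc <= gc_percent(seq[start:start + size]) <= max_gc)
    ]

def forbidden_sites_present(seq, sites=SITES):
    """Names of restriction sites found in the sequence (RNA or DNA alphabet)."""
    dna = seq.upper().replace("U", "T")
    return sorted(name for name, motif in sites.items() if motif in dna)
```

All five sites are palindromic, so scanning one strand is sufficient here; a non-palindromic motif would also need its reverse complement checked.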

Figure 1. Impact of various sequence optimization strategies in mRNAid on NanoLuc-PEST expression. Effects of five different sequence optimization strategies on NanoLuc-PEST expression in MIA PaCa-2 cells are represented as (A) area under the curve (AUC) of luminescence over 48 h and (B) relative luminescence (RLU) at 48 h post-transfection. Rare (red) denotes the de-optimized input and Promega (blue) denotes the codon-optimized control. The top four outputs from each strategy were tested at a 6.25 ng dose of the respective mRNA. Scatter plots with bars represent the mean from two independent biological replicates. (C) Western blot analysis of the sequence variants 24 h post-transfection in MIA PaCa-2 cells. GAPDH was used as a loading control. Band intensities for NanoLuc were normalized to GAPDH and represented as fold change over the Promega control. Two additional outputs for each strategy were evaluated. Data from a repeat experiment are included in Supplementary Figure S2. (D) Correlation plots for AUC (top) and RLU @ 48 h (bottom) versus MFE (kcal/mol). Mean values of AUC and RLU @ 48 h were determined for the 6.25 ng mRNA dose in (A and B) and represented as fold change over the respective mean values of the Promega control (blue dashed line). The red dot denotes the de-optimized Rare input. Pearson's r is indicated, as determined by GraphPad Prism.

All 20 mRNAid-optimized sequences resulted in significantly higher expression relative to the de-optimized input (Figure 1A–C; Supplementary Figure S2), attesting to the robustness of the tool. Among the five optimization strategies, all four sequences optimized by Strategy-5 (CAI optimization) were consistently better than or comparable with Promega and exhibited the highest GC content, lowest global MFE and lowest uridine content (Figure 1A–C; Supplementary Table S2). Similarly, sequences that were optimized by Strategy-2 (matched codon-pair optimization) exhibited expression comparable with Promega. The importance of codon-pair context for enhanced gene expression was also recognized in previous studies (30). Conversely, all four sequences that were optimized by Strategy-1 (dinucleotide optimization) were expressed less well than the Promega control (Figure 1A–C; Supplementary Table S2; Supplementary Figure S2). Further analyses revealed that both AUC and RLU @ 48 h were strongly correlated with global MFE (r = –0.87 and r = –0.92, respectively, Figure 1D), in line with previous reports on the impact of secondary structures on mRNA half-lives and ultimately final protein output (27). Correlations were also observed with GC (%) and U (%), but not 5′-MFE (Supplementary Figure S3). Sequence optimization using mRNAid has a strong impact on rate of decay but not translation initiation as compared with the de-optimized input (Supplementary Figure S4). These findings highlight the importance of combining a multitude of hard, soft, local and global constraints to achieve balanced sequence and structural properties that cooperatively define total protein expression.
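The correlation analysis above was done in GraphPad Prism; for readers who want to repeat it on their own measurements, Pearson's r can be computed directly from its definition. The function below is a plain-Python sketch, and no data from the study are embedded in it.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

Because a more stable structure has a more negative MFE, a negative r between MFE and expression means that more structured transcripts tended to give higher readouts.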

An in silico optimization of the native SARS-CoV-2 surface glycoprotein gene using mRNAid Strategy-5 (CAI optimization) generated a transcript with 96.5% and 90.9% similarity to the putative assembled Moderna and Pfizer/BioNTech vaccine sequences, respectively (Table 2) (21, 22). We also compared the expression of the mRNAid-optimized spike CDS with the putative Moderna and Pfizer/BioNTech sequences in SJCRH30 (muscle, relevant for vaccine testing) and MIA PaCa-2 cells. In both cell lines, mRNAid optimization significantly boosted expression of spike protein as compared with the native input sequence (Figure 2). Although the mRNAid sequence is inferior to the putative Moderna sequence in both cell lines, it yielded expression comparable with the putative Pfizer vaccine sequence in the muscle cell line (Figure 2). These data confirm that mRNAid sequence optimization can be applied to therapeutically relevant proteins whose expression can be modulated based on the sequence optimization parameters applied.

Table 2. Summary of GC content, MFE and percentage similarity between mRNAid-optimized native SARS-CoV-2 spike sequence and the indicated CDS

| Sequence | GC ratio | MFE (kcal/mol) | % Similarity to mRNAid CDS |
|---|---|---|---|
| Putative Moderna CDS | 0.62 | –1476.40 | 96.5 |
| Putative Pfizer CDS | 0.57 | –1321.30 | 90.9 |
| Native CDS | 0.37 | –1068.60 | 68.9 |
| mRNAid CDS | 0.62 | –1451.70 | |

GC-rich RNA sequences are more likely to form stable secondary structures, such as stem–loop structures or hairpins, whereas lower MFE values reflect more stable structures.
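A similarity column like the one in Table 2 can be approximated for two coding sequences of the same protein with a position-wise comparison. This is only a sketch under a stated assumption: the section does not say how similarity was computed, and the version below requires equal-length, already-aligned sequences rather than performing an alignment.

```python
def to_rna(seq):
    """Normalize case and alphabet so DNA and RNA inputs compare equal."""
    return seq.upper().replace("T", "U")

def percent_identity(seq_a, seq_b):
    """Percentage of positions carrying the same nucleotide."""
    a, b = to_rna(seq_a), to_rna(seq_b)
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def differing_codons(seq_a, seq_b):
    """List (codon index, codon in a, codon in b) wherever the codons differ."""
    a, b = to_rna(seq_a), to_rna(seq_b)
    if len(a) != len(b) or len(a) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    return [
        (i // 3, a[i:i + 3], b[i:i + 3])
        for i in range(0, len(a), 3)
        if a[i:i + 3] != b[i:i + 3]
    ]
```

For two synonymous recodings of one protein, `differing_codons` shows where the optimizers made different choices, which is often more informative than the single percentage.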

Figure 2. Impact of mRNAid sequence optimization on spike protein expression. Luminescence signals (normalized to a HiBiT control protein) of HiBiT-tagged spike CDS optimized by mRNAid (teal) or from the Moderna (black) and Pfizer (pink) vaccines are compared with the native spike CDS (blue) at (A) 24 h and (B) 48 h in SJCRH30 and MIA PaCa-2 cell lines at the 12.5 ng dose of the respective mRNA. Scatter plots, with bars representing the mean from two independent biological replicate experiments. (C) Western blot analysis of the spike sequence variants 24 or 48 h post-transfection in MIA PaCa-2 and SJCRH30 cells. HSP90 was used as a loading control. A representative blot is shown from two independent biological replicates. (D) Band intensities from two biological replicate experiments for HiBiT–spike (full-length, FL; and cleaved S2, S2) were normalized to HSP90 and represented as fold change over the native control. The presence of the expected product and the addition of a poly(A) tail for all mRNA constructs synthesized in this study were confirmed by gel electrophoresis (Supplementary Table S4; Supplementary Figure S5).

We next explored additional features that can be incorporated into mRNAid-optimized sequences to yield optimal therapeutic mRNAs. The discovery that uridine analogs dramatically reduce immune stimulation (10) and increase protein production from synthetic mRNA (14) marked a breakthrough in mRNA-based therapeutics. We evaluated the impact of substituting uridine (U) with pseudouridine (pU), 5-methoxyuridine (5moU) or N1-methylpseudouridine (N1m) on Nluc-PEST protein expression and pro-inflammatory cytokine (IFN-β) release in human BJ fibroblasts. For the de-optimized input (Rare) and codon-optimized control (Promega) sequences, pU and N1m modifications significantly improved protein expression as compared with U (Figure 3A), in line with previous reports. Instead of improving expression, U substitution with 5moU reduced protein output from the Promega sequence. Out of 20 mRNAid sequences, we picked the best expressed sequence (Strategy-5:output-2, Figure 1C) and were able to recapitulate this improvement compared with Rare and Promega in BJ fibroblasts using U (black, Figure 3A). However, protein levels were not further enhanced using uridine analogs in the Strategy-5:output-2 sequence. We postulate that since the Strategy-5:output-2 sequence had the lowest uridine content (14% in Strategy-5:output-2; 21% in Promega; 27% in Rare), the effect of uridine substitution on expression may be minimal. In terms of immunogenicity, unmodified mRNAs (U) caused the highest cytokine release, as expected (comparable with the positive control poly I:C), followed by pU, N1m and 5moU, respectively (Figure 3B). Thus, our findings demonstrate how sequence optimization in addition to incorporation of uridine analogs can reduce undesired innate immune responses while maintaining high target protein expression.

Figure 3. Additional ways to engineer mRNA for therapeutic use. (A) Effect of modified nucleotides on NanoLuc-PEST expression in BJ fibroblasts. U, uridine; pU, pseudouridine; 5moU, 5-methoxyuridine; N1m, N1-methyl-pseudouridine. Individual values from two independent biological replicate experiments have been plotted. (B) Effect of modified nucleotides on innate immune activation in BJ fibroblasts. Cytokine release assay 48 h post-transfection with 50 ng of the indicated mRNAs. IFN-β levels were normalized to the Promega sequence with pU incorporated. OOR (out-of-range), where values are below the detection limits of the assay. The dashed line represents the positive control poly I:C. Scatter plot, with bars representing the mean from two biological replicate experiments. (C) Effect of AG versus GG initiator sequence after the T7 promoter on NanoLuc protein expression in MIA PaCa-2 cells 24 h post-mRNA transfection. Individual values from two independent biological replicate experiments have been plotted. (D) Effect of different UTRs on NanoLuc protein expression in MIA PaCa-2 cells 24 h post-mRNA transfection.

mRNAs are capped at the 5′ end to protect against degradation, facilitate ribosome loading and evade innate immune responses (31). Compared with legacy cap analogs such as ARCA, the proprietary co-transcriptional capping reagent CleanCap AG from TriLink was shown to have higher capping efficiency, increased RNA yield and reduced immunogenicity. We modified the conventional GG initiator sequence after the T7 promoter to AG and synthesized mRNAs using CleanCap AG. Indeed, the AG initiator significantly improved NanoLuc expression (Figure 3C) and total RNA yield (data not shown). This highlights how protein output can be further boosted and emphasizes the importance of tailoring template design to the desired capping technique. As noted, the role of 5′- and 3′-UTRs in modulating mRNA stability and translation is well established (28). To this end, we selected four pairs of UTR sequences that have been reported to boost protein expression in human cells (Supplementary Table S3). Indeed, all four UTRs increased NanoLuc expression compared with the plasmid in which default sequences flank its CDS (Figure 3D). Together, we present various opportunities to enhance mRNA potency by optimizing key mRNA components.
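The GG-to-AG initiator change described above amounts to a two-base edit in the DNA template immediately downstream of the T7 promoter. The helper below is a hypothetical sketch, not part of mRNAid: it assumes the template contains the common T7 promoter core `TAATACGACTCACTATA` and that transcription starts at the base right after it.

```python
T7_CORE = "TAATACGACTCACTATA"  # assumed promoter core; +1 is the next base

def switch_initiator_to_ag(template):
    """Return the template with the GG initiator after the T7 core set to AG."""
    dna = template.upper()
    start = dna.find(T7_CORE)
    if start < 0:
        raise ValueError("T7 promoter core not found")
    init = start + len(T7_CORE)
    initiator = dna[init:init + 2]
    if initiator == "AG":
        return dna  # already compatible, nothing to change
    if initiator != "GG":
        raise ValueError("unexpected initiator: " + repr(initiator))
    return dna[:init] + "AG" + dna[init + 2:]
```

Raising on anything other than GG or AG is a deliberate choice in this sketch, so that an unexpected template is inspected by hand instead of being silently rewritten.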

In summary, this study represents a first attempt to create a comprehensive playbook for rational design of therapeutic mRNA transcripts. mRNAid is an open-source software that offers advanced sequence and structural optimization strategies that generate transcripts with desired expression properties. We also experimentally demonstrate that incorporation of certain uridine analogs, and inclusion of key mRNA components can further enhance stability, boost protein output and mitigate undesired immunogenicity effects.

Despite the encouraging results of mRNAid, it is important to note its limitations. mRNAid does not optimize for MFE along the entire transcript, and the thermodynamic parameters of uridine analogs are not accounted for. Yet the experimental data we presented here clearly indicate that global MFE optimization is worthwhile, albeit computationally expensive. However, the flexible backbone of mRNAid presents an opportunity for the broader scientific community to make additional enhancements to the tool. These improvements may involve extending support to other species, incorporating MFE calculations that account for uridine analogs, implementing new constraints or optimization strategies, or considering any other sequence features that might influence stability, immunogenicity and expression.


Acknowledgements

We thank Jens Christensen, Vincent Antonucci and Carol A. Rohl for supporting this work. We are immensely grateful to David Dzamba for his help with the initial research on transcript stability and expression, which was ultimately not included in the final manuscript.

Author contributions: SL, PG, BH, AP and DB conceived the study. DB planned and supervised the study. NV designed and orchestrated the implementation of mRNAid. KB and XW contributed to the development of the backend, SP to the development of the frontend, MS to the architecture and open-source, and PM to code review and scoring function. AG acted as the product owner and managed the backlog. AM helped with scientific research throughout, particularly in the context of MFE. SL and PG scientifically led and conducted the experimental work with the help of CY, JG and JW. DP created a command line tool and improved the mRNAid user interface. DB wrote the manuscript; SL, PG, JW, JG, AM and NV also contributed to the main text and the Materials and Methods section. All authors read and approved the final version of the manuscript.

Contributor Information

Nikita Vostrosablin, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

Shuhui Lim, Quantitative Biosciences, MSD Singapore, 138665, Singapore.

Pooja Gopal, Quantitative Biosciences, MSD Singapore, 138665, Singapore.

Kveta Brazdilova, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic. Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 160 00, Czech Republic.

Sushmita Parajuli, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

Xiaona Wei, Bioinformatics, MSD Singapore, 138665, Singapore.

Anna Gromek, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

David Prihoda, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

Martin Spale, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

Anja Muzdalo, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

Jamie Greig, Quantitative Biosciences, MSD Singapore, 138665, Singapore.

Constance Yeo, Quantitative Biosciences, MSD Singapore, 138665, Singapore.

Joanna Wardyn, Quantitative Biosciences, MSD Singapore, 138665, Singapore.

Petr Mejzlik, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

Brian Henry, Quantitative Biosciences, MSD Singapore, 138665, Singapore.

Anthony W Partridge, Quantitative Biosciences, MSD Singapore, 138665, Singapore.

Danny A Bitton, Discovery Informatics, MSD Czech Republic s.r.o., Prague, 150 00, Czech Republic.

Data availability

All code for this publication is available in the following GitHub repository: https://github.com/MSDLLCpapers/mRNAid and as a web application at https://mrnaid.dichlab.org. The code is also available in Zenodo: https://zenodo.org/doi/10.5281/zenodo.10693976.

Supplementary data

Supplementary Data are available at NARGAB Online.

Funding

Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA.

Conflict of interest statement. All authors that are/were employees of Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ 07065, USA may hold stocks and/or stock options in Merck & Co., Inc., Rahway, NJ, USA.

References

1. Pardi N., Hogan M.J., Porter F.W., Weissman D. mRNA vaccines—a new era in vaccinology. Nat. Rev. Drug Discovery. 2018; 17:261–279.
2. Pastor F., Berraondo P., Etxeberria I., Frederick J., Sahin U., Gilboa E., Melero I. An RNA toolbox for cancer immunotherapy. Nat. Rev. Drug Discovery. 2018; 17:751–767.
3. Lim S., Khoo R., Juang Y.-C., Gopal P., Zhang H., Yeo C., Peh K.M., Teo J., Ng S., Henry B. et al. Exquisitely specific anti-KRAS biodegraders inform on the cellular prevalence of nucleotide-loaded states. ACS Cent. Sci. 2020; 7:274–291.
4. Corbett K.S., Edwards D.K., Leist S.R., Abiona O.M., Boyoglu-Barnum S., Gillespie R.A., Himansu S., Schäfer A., Ziwawo C.T., DiPiazza A.T. et al. SARS-CoV-2 mRNA vaccine design enabled by prototype pathogen preparedness. Nature. 2020; 586:567–571.
5. Martini G.V.P., Guey L.T. A new era for rare genetic diseases: messenger RNA therapy. Hum. Gene Ther. 2019; 30:1180–1189.
6. Hewitt S.L., Bai A., Bailey D., Ichikawa K., Zielinski J., Karp R., Apte A., Arnold K., Zacharek S.J., Iliou M.S. et al. Durable anticancer immunity from intratumoral administration of IL-23, IL-36γ, and OX40L mRNAs. Sci. Transl. Med. 2019; 11:eaat9143.
7. Damase T.R., Sukhovershin R., Boada C., Taraballi F., Pettigrew R.I., Cooke J.P. The limitless future of RNA therapeutics. Front. Bioeng. Biotechnol. 2021; 9:628137.
8. Jain R., Frederick J.P., Huang E.Y., Burke K.E., Mauger D.M., Andrianova E.A., Farlow S.J., Siddiqui S., Pimentel J., Cheung-Ong K. et al. MicroRNAs enable mRNA therapeutics to selectively program cancer cells to self-destruct. Nucleic Acid Ther. 2018; 28:285–296.
9. Verma M., Choi J., Cottrell K.A., Lavagnino Z., Thomas E.N., Pavlovic-Djuranovic S., Szczesny P., Piston D.W., Zaher H.S., Puglisi J.D. et al. A short translational ramp determines the efficiency of protein synthesis. Nat. Commun. 2019; 10:5774.
10. Vaidyanathan S., Azizian K.T., Haque A.K.M.A., Henderson J.M., Hendel A., Shore S., Antony J.S., Hogrefe R.I., Kormann M.S.D., Porteus M.H. et al. Uridine depletion and chemical modification increase Cas9 mRNA activity and reduce immunogenicity without HPLC purification. Mol. Ther. Nucleic Acids. 2018; 12:530–542.
11. Gould N., Hendy O., Papamichail D. Computational tools and algorithms for designing customized synthetic genes. Front. Bioeng. Biotechnol. 2014; 2:41.
12. Zhang H., Zhang L., Lin A., Xu C., Li Z., Liu K., Liu B., Ma X., Zhao F., Yao W. et al. Algorithm for optimized mRNA design improves stability and immunogenicity. Nature. 2023; 621:396–403.
13. Lee J., Kladwang W., Lee M., Cantu D., Azizyan M., Kim H., Limpaecher A., Gaikwad S., Yoon S., Treuille A. et al. RNA design rules from massive open laboratory. Proc. Natl Acad. Sci. USA. 2014; 111:2122–2127.
14. Kariko K., Muramatsu H., Welsh F.A., Ludwig J., Kato H., Akira S., Weissman D. Incorporation of pseudouridine into mRNA yields superior nonimmunogenic vector with increased translational capacity and biological stability. Mol. Ther. 2008; 16:1833–1840.
15. Kormann M.S.D., Hasenpusch G., Aneja M.K., Nica G., Flemmer A.W., Herber-Jonat S., Huppmann M., Mays L.E., Illenyi M., Schams A. et al. Expression of therapeutic proteins after delivery of chemically modified mRNA in mice. Nat. Biotechnol. 2011; 29:154–159.
16. Zulkower V., Rosser S. DNA Chisel, a versatile sequence optimizer. Bioinformatics. 2020; 36:4508–4509.
17. Alexaki A., Kames J., Holcomb D.D., Athey J., Santana-Quintero L.v., Lam P.V.N., Hamasaki-Katagiri N., Osipova E., Simonyan V., Bar H. et al. Codon and codon-pair usage tables (CoCoPUTs): facilitating genetic variation analyses and recombinant gene design. J. Mol. Biol. 2019; 431:2434–2441.
18. Sharp P.M., Li W.H. The codon adaptation index—a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 1987; 15:1281–1295.
19. Hofacker I.L. Vienna RNA secondary structure server. Nucleic Acids Res. 2003; 31:3429–3431.
20. Gaspar P., Moura G., Santos M.A.S., Oliveira J.L. mRNA secondary structure optimization using a correlated stem–loop prediction. Nucleic Acids Res. 2013; 41:5490.
21. Jeong D.-E., McCoy M., Artiles K., Ilbay O., Fire A., Nadeau K., Park H., Betts B., Boyd S., Hoh R. et al. Assemblies-of-Putative-SARS-CoV2-Spike-Encoding-mRNA-Sequences-for-Vaccines-BNT-162b2-and-mRNA-1273. (1 March 2022, date last accessed) https://virological.org/t/assemblies-of-putative-sars-cov2-spike-encoding-mrna-sequences-for-vaccines-bnt-162b2-and-mrna-1273/663.
22. World Health Organization. Messenger RNA Encoding the Full-length SARS-CoV-2 Spike Glycoprotein. (1 March 2022, date last accessed) https://web.archive.org/web/20210105162941/https://mednet-communities.net/inn/db/media/docs/11889.doc.
23. Hale R.S., Thompson G. Codon optimization of the gene encoding a domain from human type 1 neurofibromin protein results in a threefold improvement in expression level in Escherichia coli. Protein Expr. Purif. 1998; 12:185–188.
24. Diambra L.A. Differential bicodon usage in lowly and highly abundant proteins. PeerJ. 2017; 5:e3081.
25. Gutman G.A., Hatfield G.W. Nonrandom utilization of codon pairs in Escherichia coli. Proc. Natl Acad. Sci. USA. 1989; 86:3699–3703.
26. Tuller T., Waldman Y.Y., Kupiec M., Ruppin E. Translation efficiency is determined by both codon bias and folding energy. Proc. Natl Acad. Sci. USA. 2010; 107:3645–3650.
27. Mauger D.M., Cabral J., Presnyak V., Su S.V., Reid D.W., Goodman B., Link K., Khatwani N., Reynders J., Moore M.J. et al. mRNA structure regulates protein expression through changes in functional half-life. Proc. Natl Acad. Sci. USA. 2019; 116:24075–24083.
28. Leppek K., Byeon G.W., Kladwang W., Wayment-Steele H.K., Kerr C.H., Xu A.F., Kim D.S., Topkar V.V., Choe C., Rothschild D. et al. Combinatorial optimization of mRNA structure, stability, and translation for RNA-based therapeutics. Nat. Commun. 2022; 13:1536–1558.
29. Mortimer S.A., Kidwell M.A., Doudna J.A. Insights into RNA structure and function from genome-wide studies. Nat. Rev. Genet. 2014; 15:469–479.
30. Chung B.K.S., Lee D.Y. Computational codon optimization of synthetic gene for protein expression. BMC Syst. Biol. 2012; 6:134–148.
31. Ramanathan A., Robb G.B., Chan S.H. mRNA capping: biological functions and applications. Nucleic Acids Res. 2016; 44:7511–7526.

Articles from NAR Genomics and Bioinformatics are provided here courtesy of Oxford University Press
