IPA410 Digital Methods in Practice
Workshop 6: Sentiment Analysis
Wednesday, May 7, 2025
Thursday, May 8, 2025
Dr. Isabella Magni
Today in class we will experiment with sentiment analysis. This handout includes all the necessary links and all the steps we will take to analyse simple text prompts. You can follow along in class.
There are many commercial sentiment analysis tools that companies use to understand reviews and ratings of their products, and perform market research, among other things. Here are some of them:
These are all commercial tools, mostly behind a paywall, that are specifically designed for corporations to perform data analysis (including sentiment analysis) to evaluate their own brand and market space.
During our workshop today, we will not be using any of these tools. Instead, we will work with Google Colab notebooks that we wrote ourselves for our specific needs. These are very simple notebooks that take us step by step through simple sentiment analysis tasks and through different approaches to performing sentiment analysis. Specifically we will be:
A Google Colab notebook is a list of cells. Cells contain either explanatory text or executable code and its output. You can click a cell to select it.
Below is an example of a text cell. Text cells usually contain explanatory text and necessary context regarding your code.
You can double-click to edit these cells. Text cells use markdown syntax.
Below is an example of a code cell. Once the toolbar button indicates CONNECTED, click in the cell to select it and execute the contents in the following ways:
There are additional options for running some or all cells in the Runtime menu.
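As a minimal illustration of what a code cell might contain (this snippet is not taken from the workshop notebooks, and the variable name is arbitrary), a cell holds ordinary Python and shows its output directly beneath it when run:

```python
# A hypothetical code cell: compute a value and display it below the cell.
seconds_in_a_day = 24 * 60 * 60
print(seconds_in_a_day)
```

Variables defined in one cell stay available to any cell run later in the same session, which is why the order in which you run cells matters.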
This is a Google Colab notebook that lets you experiment first with a pre-trained sentiment analysis model (cells 1-5), then with a pre-trained zero-shot classification model (cells 6-12). Let’s walk through and run all the cells, one by one. You can save a copy of this Colab in your Google Drive and follow along by running the cells.
What have we learnt from experimenting with these two different sentiment analysis models?
You can now experiment with your own texts and classes, after trying the examples we provided (here is how you can do that). Some more potential sentences you could try:
This Google Colab notebook actually trains a sentiment analysis model based on DistilBERT using IMDB reviews. This is a smaller-scale version of what Flair did to create the sentiment analysis model we used in the previous Colab. Let’s walk through and run all the cells, one by one. You can save a copy of this Colab in your Google Drive and follow along by running the cells.
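To give a rough idea of what fine-tuning DistilBERT on IMDB reviews can involve, here is a minimal sketch. It is not the workshop notebook's code: it assumes the Hugging Face transformers and datasets libraries, the distilbert-base-uncased checkpoint, and small subset sizes chosen only to keep the run short. The function names fine_tune and accuracy are hypothetical.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of reviews whose predicted label matches the true label."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


def fine_tune(train_size=2000, eval_size=500, model_name="distilbert-base-uncased"):
    """Fine-tune a DistilBERT classifier on a subset of IMDB; return test accuracy."""
    # Heavy imports live here; assumes `pip install transformers datasets torch accelerate`.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    # IMDB reviews are labelled 0 (negative) or 1 (positive).
    imdb = load_dataset("imdb")
    train = imdb["train"].shuffle(seed=42).select(range(train_size))
    test = imdb["test"].shuffle(seed=42).select(range(eval_size))

    # Turn each review into fixed-length token ids the model can read.
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    def tokenize(batch):
        return tokenizer(
            batch["text"], truncation=True, padding="max_length", max_length=256
        )

    train = train.map(tokenize, batched=True)
    test = test.map(tokenize, batched=True)

    # Start from the pre-trained model and add a two-class classification head.
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
    args = TrainingArguments(
        output_dir="distilbert-imdb",  # hypothetical output folder
        num_train_epochs=1,
        per_device_train_batch_size=16,
        report_to="none",  # do not send logs to external tracking services
    )
    trainer = Trainer(model=model, args=args, train_dataset=train, eval_dataset=test)
    trainer.train()

    # Pick the class with the highest score for each held-out review.
    output = trainer.predict(test)
    predicted = output.predictions.argmax(axis=-1).tolist()
    return accuracy(predicted, output.label_ids.tolist())
```

Calling fine_tune() downloads the dataset and the model, so it is best run with a GPU runtime selected. Training on a few thousand reviews rather than the full dataset is one reason a model like this is a smaller-scale version of a production one.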
What have we learnt from experimenting with this training model?
You can now experiment with your own texts, after trying the examples we provided. Some more potential sentences you could try:
What have you learned about sentiment analysis? And about the different models we used?
Running cell 1
Running cell 2
Running cell 3
Running cell 4
Running cell 5
Insert new text in cell 5, and then run the cell again
Re-run cell 4 and wait for the new results
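The two steps above (insert new text, then re-run the prediction) can be sketched in code. This is a minimal sketch rather than the notebook's actual cells: it assumes the first exercise uses Flair's pre-trained sentiment classifier, as mentioned earlier in this handout, and the function names load_classifier, predict_sentiment and format_prediction are hypothetical.

```python
def load_classifier():
    """Load Flair's pre-trained sentiment model (downloaded on first use)."""
    from flair.nn import Classifier  # assumes `pip install flair`

    return Classifier.load("sentiment")


def predict_sentiment(classifier, text):
    """Return (label, confidence) for one piece of text."""
    from flair.data import Sentence

    sentence = Sentence(text)
    classifier.predict(sentence)  # attaches the predicted label to the sentence
    label = sentence.labels[0]
    return label.value, label.score


def format_prediction(text, label, score):
    """Format one result for display."""
    return f"{text!r} -> {label} ({score:.2f})"


# Usage, mirroring the steps above (left commented out because it downloads a model):
# classifier = load_classifier()
# text = "Replace this with your own sentence."   # step 1: insert new text
# label, score = predict_sentiment(classifier, text)  # step 2: re-run the prediction
# print(format_prediction(text, label, score))
```

The model is loaded once and reused. Only the text and the prediction step need to be run again when you try a new sentence.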
You can do a similar process when experimenting with the second exercise, the zero-shot classification model (cells 6-12). First run all the cells once, then add in your own text in cell 11 (the second to last) and run the cell. Then go back to cell number 9 (the one that starts with the comment # Here we can see the results) and run that cell again to see the new results.
You can also experiment with changing both the input text and the parameters. In that case you can change the text and the parameters in cell 12 (the last one), and run the cell. Then go back to cell number 9 (the one that starts with the comment # Here we can see the results) and run that cell again to see the new results.
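Changing both the text and the parameters of a zero-shot classifier might look like the sketch below. The handout does not name the library behind cells 6-12, so the use of the Hugging Face transformers zero-shot-classification pipeline is an assumption, and the function name classify and the example labels are hypothetical.

```python
def classify(text, candidate_labels, multi_label=False, classifier=None):
    """Score `text` against each candidate label; return {label: score}."""
    if classifier is None:
        # Assumption: Hugging Face transformers is installed (`pip install transformers`).
        # Loading is slow, so pass an already loaded classifier when calling repeatedly.
        from transformers import pipeline

        classifier = pipeline("zero-shot-classification")
    result = classifier(
        text, candidate_labels=candidate_labels, multi_label=multi_label
    )
    # The pipeline returns parallel lists of labels and scores, highest score first.
    return dict(zip(result["labels"], result["scores"]))


# Usage (left commented out because it downloads a model):
# from transformers import pipeline
# zero_shot = pipeline("zero-shot-classification")
# print(classify("The plot was thin but the acting was superb.",
#                ["positive", "negative", "mixed"], classifier=zero_shot))
```

Unlike the first exercise, the classes here are not fixed in advance. The list of candidate labels is itself a parameter you can edit before re-running.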