Aire Cross-lingual Semantic Textual Similarity with Register consideration (XSTS+R+P) evaluation guidelines
We would like to assess the degree of meaning correspondence/equivalence between a translation request and the translation of the requested paragraph.
You will need to read the source and the target, compare them sentence by sentence, and assign a score to each sentence. You will be asked to compare not only the semantic meaning but also the register of the paragraphs.
Semantic Similarity Rating Criteria
5 - The two paragraphs are completely equivalent in meaning and style
Please follow these guidelines when evaluating translations:
The rating criteria below should be used to evaluate translations holistically, considering the entire paragraph.
The translation should be scored accordingly:
Note: A translation that is equivalent in meaning but shows a major shift in register, such as changing a neutral tone to extremely formal or extremely informal, cannot score more than 2. This does not apply to minor stylistic differences that don't affect the reader's perception. For more details, please refer to the examples.
More details and examples are given below.
[1] The source paragraph and its translation are not semantically equivalent, share very few details, and may be about different topics.
Example A (different topics):
Text1 (English): Train station
Text2 (Spanish): Restaurante vegano (Vegan restaurant)
Example B (similar/related topics):
Text1: Mexican restaurant
Text2: Restaurante italiano (Italian restaurant).
Example C (very little overlap):
Text1: Open museums
Text2: Centros comerciales abiertos. (Open malls)
Example D (indecipherable on one side or another):
Text1: lorbo lorbo lorl room
Text2: Habitación doble (double room).
[2] The source paragraph and its translation share some details, but are not equivalent. Some important information related to the primary subject/verb/object differs or is missing, which alters the intent or meaning of the paragraph. Alternatively, the register differs so much that the translation would be a serious error. A significant change in register always caps the score at 2 points.
Example A (opposite polarity):
Text1: Flight to London
Text2: Vuelo desde Londres (Flight from London).
Example B (word order that results in a different meaning):
Text1: Two rooms for three people
Text2: Tres habitaciones para dos personas (Three rooms for two people).
Example C (missing salient information):
Text1: Vegan Italian restaurant
Text2: Restaurante italiano (Italian restaurant).
Example D (substitution/change in named entity):
Text1: Flight to Valencia
Text2: Vuelo a Valladolid (Flight to Valladolid).
Example E (register difference):
Text1: What’s up dude?
Text2: ¿Cómo se encuentra, señor? (How are you, sir?)
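The register cap described above can be written as a small rule. The following is a minimal sketch, not part of any official tooling: the function name `capped_score` and the flag `major_register_shift` are hypothetical, and deciding whether a register shift counts as major remains the rater's judgment.

```python
# Highest score allowed when the register shifts significantly (from the guidelines).
REGISTER_SHIFT_CAP = 2


def capped_score(meaning_score: int, major_register_shift: bool) -> int:
    """Return the final 1-5 score after applying the register cap.

    meaning_score: the score the translation would get on meaning alone.
    major_register_shift: True if the rater judged the register shift to be major
    (hypothetical flag; minor stylistic differences should be passed as False).
    """
    if not 1 <= meaning_score <= 5:
        raise ValueError("meaning_score must be between 1 and 5")
    if major_register_shift:
        # A major register shift can never score more than 2.
        return min(meaning_score, REGISTER_SHIFT_CAP)
    return meaning_score
```

Under this sketch, a translation that is fully equivalent in meaning but turns casual slang into formal address would be passed as `capped_score(5, True)` and end up at 2, while a score that is already 1 stays at 1.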
[3] The two paragraphs are mostly equivalent, but some unimportant details can differ. There cannot be any significant conflicts in intent, meaning or register between the sentences, no matter how long the sentences are.
Example A (minor details that are not salient to the meaning):
Text1: What’s the time right now?
Text2: ¿Qué hora es? (What time is it?)
Example B (minor verb tense and/or unit of measurement differences):
Text1: I want 2 pounds of cheese
Text2: Quería 1 kg de queso (I wanted 1 kg of cheese)
Example C (small, non-conflicting differences in meaning):
Text1: I love running
Text2: Me gusta correr (I like running).
Example D (minor difference):
Text1: Photos of the trip
Text2: Fotos de mi viaje (Photos of my trip)
Example E (omitted non-critical information, but no contradictory info introduced):
Text1: Table for 3 adults
Text2: Mesa para 3 (Table for 3)
[4] The two paragraphs are paraphrases of each other. Their meanings are near-equivalent, with no major differences or information missing. There can only be minor differences in meaning due to differences in expression (e.g., formality level, style, emphasis, potential implication, idioms, common metaphors). For single-word texts, there might be multiple meanings depending on the context they would be used in, but the one presented is still correct.
Examples:
Text1: Fire
Text2: Despedir (to fire)
Explanation: “Despedir” is a correct translation, but “fire” could also be a noun that means “fuego”.
Text1: This is great
Text2: Esto es la leche (Lit: this is the milk).
Explanation: “Esto es la leche” is an idiom, “this is great” is not.
Text1: The day that comes after the day of today
Text2: Mañana (Tomorrow)
Explanation: Differences in phrasing; Text1 is oddly phrased and more verbose than Text2.
Text1: Bird
Text2: Pajarito (birdie).
Explanation: Different level of formality (“birdie” vs “bird”).
[5] The two paragraphs are exactly and completely equivalent in meaning and expression (e.g., formality level, style, emphasis, potential implication, idioms, common metaphors).
In other words, nuance is completely preserved and there is a faithful correspondence. Fidelity is also preserved.
Examples:
Text1: What’s up y'all?
Text2: Howdy guys!
Text1: Hi friends
Text2: Hola chicos (Hello guys)
Text1: Hello, how are you
Text2: Hola cómo estás (Hello how are you)
Text1: One two apples oranges
Text2: Uno dos manzanas naranjas (One two apples oranges)
Comments explaining the reasoning behind your decisions are mandatory. Please refer to the section Rating Justification below to learn more.
You will be asked to explain the reasons for your rating.
A good justification clearly explains why you gave a specific rating. This might involve highlighting issues such as incorrect word or phrase choices, mismatches in tone, or inconsistencies between the source and the translation (including text or emoji use).
You do not need to provide a full alternative translation for your comment to be helpful. The focus is on analyzing the translation provided, not rewriting it.
Your justification doesn’t always have to be long. For translations with minimal issues, shorter comments are perfectly fine. However, if more detail is needed to explain your reasoning, please feel free to provide it. The goal is to strike a balance between clear feedback and efficient reviewing; we don’t want the commenting process to significantly slow down your workflow.
Some examples of good justifications:
Each sentence in the set comes with a register label, a 3-letter code that helps you check whether the register (level of formality or style) in the source matches the target.
For each of the three functional areas, we provide several non-exclusive options to describe them.
Non-exclusive means that a domain can have more than one label.
Here’s a breakdown of the functional areas and their possible options. Each option is marked with a bold lowercase letter in brackets to make it easy to reference:
Labels for registers
Each register label is created by combining three one-letter codes (the bold lowercase letters), one from each of the three register categories. For example, a register characterized as impersonal (in connectedness), composed (in preparedness), and assumed equal (in social differential) is labeled: ica.
The table below lists the 18 labels (unique combinations) that can be used to characterize sentences of the BOUQuET dataset:
LABEL | CONNECTEDNESS | PREPAREDNESS | SOCIAL DIFFERENTIAL |
ica | impersonal | composed | assumed equal |
ich | impersonal | composed | higher-to-lower |
mea | multi-directional | extemporaneous | assumed equal |
meh | multi-directional | extemporaneous | higher-to-lower |
mel | multi-directional | extemporaneous | lower-to-higher |
mrh | multi-directional | reactive | higher-to-lower |
mrk | multi-directional | reactive | known equal |
mra | multi-directional | reactive | assumed equal |
mrl | multi-directional | reactive | lower-to-higher |
msa | multi-directional | scripted | assumed equal |
nca | non-directional | composed | assumed equal |
nch | non-directional | composed | higher-to-lower |
uca | uni-directional | composed | assumed equal |
uch | uni-directional | composed | higher-to-lower |
uia | uni-directional | improvised | assumed equal |
uih | uni-directional | improvised | higher-to-lower |
usa | uni-directional | scripted | assumed equal |
ush | uni-directional | scripted | higher-to-lower |
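For reference, the way a label breaks down into its three letters can be sketched in Python. This is an illustrative helper only: the function name `decode_register_label` and the dictionary names are assumptions, and the letter-to-value mappings are read off the table above.

```python
# Letter-to-value mappings taken from the label table.
# Note that "i" means "impersonal" in the first position
# and "improvised" in the second position.
CONNECTEDNESS = {
    "i": "impersonal",
    "m": "multi-directional",
    "n": "non-directional",
    "u": "uni-directional",
}
PREPAREDNESS = {
    "c": "composed",
    "e": "extemporaneous",
    "i": "improvised",
    "r": "reactive",
    "s": "scripted",
}
SOCIAL_DIFFERENTIAL = {
    "a": "assumed equal",
    "h": "higher-to-lower",
    "k": "known equal",
    "l": "lower-to-higher",
}


def decode_register_label(label: str) -> dict[str, str]:
    """Split a 3-letter register label into its three categories.

    This only checks each letter against its category; it does not check
    that the combination is one of the labels listed in the table.
    """
    if len(label) != 3:
        raise ValueError(f"expected a 3-letter label, got {label!r}")
    first, second, third = label.lower()
    try:
        return {
            "connectedness": CONNECTEDNESS[first],
            "preparedness": PREPAREDNESS[second],
            "social_differential": SOCIAL_DIFFERENTIAL[third],
        }
    except KeyError as exc:
        raise ValueError(f"unknown letter {exc.args[0]!r} in label {label!r}") from None
```

For example, `decode_register_label("mrk")` yields multi-directional, reactive, and known equal, matching the corresponding table row.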
Extra Examples
Sometimes not all words are translated correctly, especially when a word has multiple meanings.
Source: If the new nut is still too tall, you can make it lower with a file.
Context: This is from a how-to article on changing a guitar nut.
Translation: File the walnut down if it’s too big.
Since this is an example of changing salient information, this translation cannot score more than 2 points.
Sometimes the meaning is conveyed correctly, but the level of formality is off.
Source: Try to solve the crossword puzzle.
Context: This is a neutral suggestion to solve a crossword.
Translation: Take a stab at the crossword!
Since this is an example of changing the register completely, this translation cannot score more than 2 points. By a complete change of register we mean that this translation does not reflect the neutrality of the original: it contains slang, which makes it informal.
Sometimes the translation is not precise enough.
Source: Call the traffic police!
Context: This is a snippet of a conversation at the scene of a traffic collision.
Translation: Call the authorities.
This example shows how some details are lost, though not the most important ones. Therefore, this sentence should score no more than 3 points.
Sometimes the translation does not convey some of the information in the source.
Source: Squeeze the lemon completely, getting all the juice.
Context: This is a part of a recipe.
Translation: Squeeze the lemon.
This example is quite straightforward, since it is easy to see that the translation is much shorter. Some important information is lost, therefore this example would only score 2 points.
Sometimes idioms and set expressions are translated literally.
Source: Let’s get cracking!
Context: This is a line from a group chat where one person is suggesting to start some activity.
Translation: Let’s crack this.
As you can see, the meaning of the sentence has changed completely. The literal translation made it impossible to get the initial idea of the source text, therefore this translation can only score 1 point.
Machine translation sometimes “forgets” how it translated something in the previous sentence. Since we work with paragraphs, please check that the target text is grammatically and lexically consistent (the same verb tense is used for corresponding events, “he” remains “he” and “she” remains “she”, the same object always has the same name, etc.).
Source: Let the potato croquettes rest for 1 minute and then serve them alongside some salad.
Context: This is a part of a recipe.
Translation: The potato fritters must rest for at least a minute and then serve the croquettes with salad.
In this example, the model did not understand that the same object was meant and used two different translations for it. This undermines the understanding of the target sentences, therefore this translation can only score 2 points.
Sometimes the translation makes no sense and cannot be understood or really attributed to the source text.
Source: No no, she is feeling well.
Context: This is a part of a dialogue in a fictional story.
Translation: No no no no no no no no no no no no no no no no no no no no no no no no no no no
Such a translation can only score 1 point.
More example ratings:
Source text | Context | Target text | Score (please fill in) |
Come along and bring your guitar! A microphone would also be handy ;-) | This is a text message to a friend - they are planning to jam, so bringing a guitar and a mic is a good idea. | Come and bring your guitar, bringing a microphone isn’t going to hurt either | 4 |
Are you ill? Get well soon! | This is a text to a friend, the friend is ill so you wish them a speedy recovery. | Are you ill? Wishing you a speedy recovery! | 4 |
- She is dressed to the nines! | This is a snippet out of a romantic novel. | - Her dress is size 9. | 1 |
They are delusional about the fact that the prices are rising. | This is a snippet from an article about concerns that people have about working from home. | They are a bit delulu about the prices which are rising. | 2 |
In a small bowl, place one teaspoon of Dijon mustard. | This is a snippet from a recipe. | Place one teaspoon of Dijon mustard in a small bowl. | 5 |
Thank you for calling the support department. | This is a dialogue between the user and the agent of customer support. | Thank you for calling. | 2 |
The reception hall where Diane entered was brightly lit. | This is a snippet from a romantic novel. | The reception hall was brightly lit. | 2 |
How to make homemade vinaigrette? | This is a snippet of a recipe. | How to make vinaigrette? | 3 |
Dude, that is weird from you. | This is a part of a dialogue between two friends. | This is a bit weird from you. | 2 |
Planting a tree is a simple process. The tree will grow in no time if you follow the proper procedure. | This is an article about the steps you need to take to plant a tree. | Planting a tree is a simple process. Your shrubs will grow fast if you follow the right procedure. | 2 |