Gujarat Video Annotation Content
Quality Control Guidelines
• Quality inspection volume: 250 valid hours.
• Workflow
After receiving the data, the quality inspectors inspect each segment of the data. The quality inspection covers the following:
1. Determine whether the roles are correct:
In each data segment, the first speaker's role should be O1, and this person's role remains unique and consistent within that piece of audio. The same applies to subsequent segments. Valid speech segments should have speaker roles, while invalid speech segments should have the fixed role N.
2. Select the appropriate accent for each role:
By default, the speaker is assigned a standard accent. If they speak in a different dialect of Czech, the accent should be adjusted accordingly.
3. Verify the correctness and completeness of the transcribed text in the transcription box:
A. If there are any text errors, correct them so that the text reflects the actual content.
B. Follow the principle of 'what is heard is what is transcribed'. If there are speaker pauses, stutters, or incomplete pronunciation, use "-" to represent them.
C. If there is overlapping speech in a segment, right-click and select the "overlap" tag. Enclose the overlapping portion of the text between the opening and closing tags. For example: "we go to [OVERLAP/]school today[/OVERLAP]". This sentence indicates that while the segment's speaker says "school today," another person is speaking at the same time. The content of the second person's speech does not need to be transcribed, because this segment belongs to the primary speaker's ID, not the second person's.
*Note: The "overlap" tags must appear in pairs, enclosing the overlapping content between them. A tag should never appear on its own.
D. If the speech content involves personal privacy, the segment should be marked as invalid, and the content in the text box should be deleted. Right-click and select the [PIL] tag as a replacement.
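The role and overlap-tag rules above lend themselves to an automated pre-check before manual review. The following is a minimal sketch, not part of any annotation tool: the `Segment` record, the `check_segments` and `overlap_tags_paired` names, and the assumption that speaker roles follow the pattern O1, O2, ... are all hypothetical. Only the role labels O1 and N and the tag strings `[OVERLAP/]` and `[/OVERLAP]` come from the guideline itself. The list passed to `check_segments` is assumed to hold the segments of one piece of audio, in order.

```python
import re
from dataclasses import dataclass

OPEN_TAG = "[OVERLAP/]"    # opening tag, as written in the guideline's example
CLOSE_TAG = "[/OVERLAP]"   # closing tag, as written in the guideline's example
TAG_PATTERN = re.compile(r"\[OVERLAP/\]|\[/OVERLAP\]")
ROLE_PATTERN = re.compile(r"O[1-9]\d*")  # assumption: roles are O1, O2, ...


@dataclass
class Segment:
    """Hypothetical record; a real export format may differ."""
    role: str
    text: str
    valid: bool


def overlap_tags_paired(text: str) -> bool:
    """Return True if each opening tag is followed by its own closing tag."""
    is_open = False
    for match in TAG_PATTERN.finditer(text):
        if match.group() == OPEN_TAG:
            if is_open:          # second opening tag before a close
                return False
            is_open = True
        else:
            if not is_open:      # closing tag with nothing open
                return False
            is_open = False
    return not is_open           # an unclosed opening tag is also an error


def check_segments(segments: list[Segment]) -> list[str]:
    """Collect rule violations for the segments of one piece of audio."""
    issues = []
    first_speaker_seen = False
    for i, seg in enumerate(segments):
        if seg.valid:
            if not ROLE_PATTERN.fullmatch(seg.role):
                issues.append(f"segment {i}: valid speech needs a speaker role")
            elif not first_speaker_seen:
                first_speaker_seen = True
                if seg.role != "O1":
                    issues.append(f"segment {i}: first speaker should be O1")
        elif seg.role != "N":
            issues.append(f"segment {i}: invalid speech must have role N")
        if not overlap_tags_paired(seg.text):
            issues.append(f"segment {i}: overlap tags must appear in pairs")
    return issues


if __name__ == "__main__":
    sample = [
        Segment("O1", "we go to [OVERLAP/]school today[/OVERLAP]", True),
        Segment("N", "", False),
    ]
    print(check_segments(sample))  # no violations, so this prints []
```

A check like this only catches structural mistakes. Whether the transcription matches the audio, or whether a segment contains private information, still requires a human listener.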
• In addition to the brief overview above, below are common rules encountered during