[Joseph Sweeney] 12:59:38 Oh.

[Joseph Sweeney] 12:59:44 Okay?

[Joseph Sweeney] 12:59:49 Okay. Hello, everyone, and welcome to today's lecture. I'm going to introduce Dr. Rajat Nag.

[Joseph Sweeney] 12:59:59 He is an assistant professor here in the School of Biosystems and Food Engineering, and he is with us for this session and a subsequent session.

[Joseph Sweeney] 13:00:11 This is part one of his two-part series on data analysis techniques and presenting your results.

[Joseph Sweeney] 13:00:18 Dr. Nag has been very generous today: he is going to record the lectures as well and provide the material on Brightspace, so that when you finally generate your data, you can look back through the slides and use the learnings from these two sessions. So I want to pass over to

[Joseph Sweeney] 13:00:40 you. Yeah, and thanks again.

[Rajat Nag] 13:00:41 Yeah, thanks. Sure. Thanks. So thanks.

[Rajat Nag] 13:00:48 So good afternoon. As Joe mentioned, I am a colleague of Joe, and today I'll be recording the meeting.

[Rajat Nag] 13:00:58 So you do not need to turn on your videos. But this is my only request: please mute yourself if you're not talking. You do not need to raise your hand, because I'll be in my flow

[Rajat Nag] 13:01:11 when I present. But that doesn't mean you cannot ask me questions, so just stop me and ask your question, okay? The session should be interactive.

[Rajat Nag] 13:01:24 So I'd like to receive as many questions as you have. And what else? Yeah, please do not join with your phone

[Rajat Nag] 13:01:35 if you can avoid it, because we'll be trying to work in parallel, and that's the main problem

[Rajat Nag] 13:01:43 of the online session: you cannot see my screen and work on your laptop simultaneously. But also, last year, based on the student feedback,

[Rajat Nag] 13:01:53 many mentioned that it could be a pre-recorded lecture or an online lecture, so that it can be recorded, and students can look at the calculation or the class, pause at the steps, and then do the same on their own data.

[Rajat Nag] 13:02:13 So I'll record from now onwards. If you have any objection, please let me know

[Rajat Nag] 13:02:22 in two seconds.

[Rajat Nag] 13:02:29 Silence means you do not have any objection, so thank you very much. I'll also open

[Rajat Nag] 13:02:38 the AI companion, in case you need the minutes of this meeting.

[Rajat Nag] 13:02:47 Yeah, please mute yourself if you're not participating. This is my only request.

[Manidip Mandal] 13:02:48 And then.

[Rajat Nag] 13:03:09 Okay, so here we are.

[Rajat Nag] 13:03:14 So, today's topic. As Joe mentioned, the entire session on data will be divided into two parts: part one today, and then I'll come back for the second part once you have your data

[Rajat Nag] 13:03:31 set. It will be more beneficial for you if I conduct part two of this class later on, so I think that is the twenty-eighth of February.

[Rajat Nag] 13:03:42 Is that right? Just unmute yourself and let me know.

[Rajat Nag] 13:03:48 So when is the second lecture of this? Do you know the date? Yes or no?

[Rajat Nag] 13:04:01 Hello? Okay, not sure. I think it was discussed with Joe, so that will be on the twenty-eighth,

[Saksorn Techasutjalidsuntorn] 13:04:02 Not sure yet.

[Rajat Nag] 13:04:12 same timing. Okay?

[Rajat Nag] 13:04:14 Okay. So.

[Rajat Nag] 13:04:18 These are some of the learning outcomes. By the end of the program, students will learn the very basic principles of descriptive statistics, and then be aware of different plots,

[Rajat Nag] 13:04:30 histograms, and the visualization and interpretation of data. Descriptive means a kind of summary statistics, for the time being.

[Rajat Nag] 13:04:39 So just understand that it's a kind of summarization of data and getting some pattern out of your data set.

[Rajat Nag] 13:04:47 So these are the types of data analytics. Descriptive analytics summarizes past events, like changes in views or sales performance, or climate: not a prediction as such, but a

[Rajat Nag] 13:05:00 kind of summary from which you can get a pattern. Then we have diagnostic analytics, which investigates

[Rajat Nag] 13:05:08 why events occurred. So it's a kind of interpretation, or a kind of diagnostics, like a doctor:

[Rajat Nag] 13:05:16 why things happened, through correlations across diverse data sources. Then we have predictive analytics.

[Rajat Nag] 13:05:22 So it forecasts: weather predictions, climate change predictions, and all.

[Rajat Nag] 13:05:28 So it forecasts short-term future events based on historical data. We'll do some of that in this class; it's called time series forecasting.

[Rajat Nag] 13:05:38 So we'll do a little bit of that, but there are precise tools and sophisticated methodologies to do that. For example,

[Rajat Nag] 13:05:47 take sales during a previous hot summer: based on that, you can predict the future sales. Then we have prescriptive analytics, which recommends actions,

[Rajat Nag] 13:05:58 suggesting, for instance, adding an evening shift and renting a tank

[Rajat Nag] 13:06:07 if a weather model predicts a hot summer with a likelihood above fifty-eight percent. So you need to predict something as a consequence of something else. It's a kind of conditional probability, if you have heard of the term: if there is a probability of something happening, then on top of that come the consequences, so what would be the

[Rajat Nag] 13:06:34 prescription, or the follow-up prediction based on that. So then we have different types of data.

[Rajat Nag] 13:06:42 So, categorical data and numerical data. Categorical means non-number data.

[Rajat Nag] 13:06:48 For example, agreement, disagreement, or being neutral in a survey; then weak, medium, and strong.

[Rajat Nag] 13:06:59 So it's a kind of perspective, a descriptive thing.

[Rajat Nag] 13:07:06 It is not like one, two, three, and forty, so you cannot quantify things properly. But with some educated judgment, by assigning some number to your qualitative

[Rajat Nag] 13:07:20 data or variable, you can also get some results.

[Rajat Nag] 13:07:25 Ordinal data has a particular ordering. So for the ordinal one, it has an order: agreement, neutral, disagreement. It is not agreement, disagreement, and neutral.

[Rajat Nag] 13:07:37 Then weak, medium, and strong; it's not weak, strong, and medium. Then nominal data has no particular ordering, such as male, female; normal, abnormal; black, brown, blue.

[Rajat Nag] 13:07:48 So there is no relationship between black and blue for eye color. If you ask different kinds of people for their choice of color, it is randomly distributed. However, a certain society, country, or culture may have a preferred choice of color, as a kind of order. So these things are interesting. There are exceptions, but that

[Rajat Nag] 13:08:21 actually follows the rule, and that describes why there is a need for classification and for reading all these things.
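
The idea of assigning numbers to ordered categories can be sketched in Python. This is only an illustration: the label names, the 1-to-3 coding, and the helper `encode_ordinal` are assumptions, not something shown in the lecture.

```python
# Hypothetical survey labels; the numeric codes are an illustrative assumption.
AGREEMENT_ORDER = ["disagree", "neutral", "agree"]  # ordinal: the order is meaningful
EYE_COLORS = {"black", "brown", "blue"}             # nominal: no order, so no ranking codes


def encode_ordinal(answer, order=AGREEMENT_ORDER):
    """Map an ordinal label to its 1-based position in the agreed order."""
    return order.index(answer) + 1


responses = ["agree", "neutral", "agree", "disagree"]
codes = [encode_ordinal(r) for r in responses]
print(codes)
```

Nominal values such as eye color are deliberately left uncoded here, since any ranking number given to them would suggest an order that does not exist.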

[Rajat Nag] 13:08:31 So it's a kind of classification of data. In terms of numerical data, we have the ratio scale, and we have discrete data and continuous data.

[Rajat Nag] 13:08:39 For example, income, height, weight, annual sales: these can vary from one person to another, with infinite possibilities of numbers in between. Then ranks:

[Rajat Nag] 13:08:52 rank data represent the relative positions of a set of measurements, and then

[Soham Deshpande] 13:08:54 I know

[Soham Deshpande] 13:08:59 Almost a

[Rajat Nag] 13:09:05 Soham, this one. Could you at least turn off your mic, please?

[Rajat Nag] 13:09:13 Then we have rates, such as percentage, proportion, and ratio. Such data arise when we take the ratio of two quantities.

[Rajat Nag] 13:09:24 So do we have a question? Okay, that's fine.

[Rajat Nag] 13:09:32 So we now have two classifications, discrete data and continuous data. That's what, most of the time, we will be

[Rajat Nag] 13:09:38 following. Discrete data means whole numbers. Say, for example, how many cups of coffee you drink a day, or

[Rajat Nag] 13:09:48 I should rephrase it: how many cups of coffee you order a day, because not all of us would drink a coffee to the bottom; there can be some wastage, okay?

[Rajat Nag] 13:10:01 So in the shop you cannot order two point five cups of coffee, or two point three seven cups of coffee.

[Rajat Nag] 13:10:07 Okay. So these are called discrete data. Another example: you count the change in your pocket, right?

[Rajat Nag] 13:10:17 So there are many other examples; please read them from the slide. I'm just giving one example.

[Rajat Nag] 13:10:25 For continuous data, say, for example, the height of the Irish population: it can range from one number to another with infinite possibilities. By infinite I mean that I'm treating the entire seven point five million data points as infinity, okay? And say, for example, the temperature

[Rajat Nag] 13:10:46 of Dublin: it can vary from the minimum number to the maximum number with a kind of infinite possibility,

[Rajat Nag] 13:10:55 like ten point five degrees, ten point five two degrees, and so on.

[Rajat Nag] 13:11:01 So then we have measurements of data. For this class we will mostly be talking about continuous data.

[Rajat Nag] 13:11:11 So these are called measures of central tendency, such as the mode, mean, and median of a data set.

[Rajat Nag] 13:11:19 So, the mode. I'll come to some examples, but these are some definitions. The value which occurs most frequently is called the mode, or the most likely value.

[Rajat Nag] 13:11:30 So just take a note: mode and most likely are the same thing. It may not be unique. For example,

[Rajat Nag] 13:11:42 how many cups of coffee? Okay, so if I say I take at least one cup of coffee a day from the store and at most three cups, but most likely I take two, then one is the minimum value, two is the most likely or mode value, and the maximum is three. It can also be a fraction. So, for example,

[Rajat Nag] 13:12:11 two and something, or how can I say? Hmm. So if you record the temperature of Dublin at twelve noon for an entire year, you may have some

[Rajat Nag] 13:12:28 data which is not in whole numbers. Say, for example, ten point five can be the most frequently appearing number in the entire data set of three sixty-five days, okay? And the mode is not defined when there are no repeats in a data set. So if there is no repetition in the data set, if I have three sixty-five unique values for the temperature of Dublin for the year,

[Rajat Nag] 13:12:58 there will not be any mode. Then we have the mean, which is nothing but the average of the entire data:

[Rajat Nag] 13:13:07 sum all the data and divide by the total number. That's your mean value. And then we have the median value.

[Rajat Nag] 13:13:15 Median means that if you sort your data from low to high, it is the middle point. So, three sixty-five:

[Rajat Nag] 13:13:24 three sixty-five is an odd number. So three sixty-four divided by two is one eighty-two, and then plus one is one eighty-three, so the one hundred and eighty-third data point is your median value. It's the middle value of the entire data set if you sort it from the smallest to

[Rajat Nag] 13:13:53 the highest one. But the thing is, it doesn't use all the data; it's just the middle value. Most of the time,

[Rajat Nag] 13:14:02 if you have a large data set, the mean is accepted, but for small data sets the median is sometimes widely used.

[Rajat Nag] 13:14:13 Then we have a problem. For example, I have a series of data such as fifteen, sixteen, eighteen, sixteen, fourteen, sixteen, and eighty.

[Rajat Nag] 13:14:24 Calculate the mode, mean, and median. First things first: I should order the data.

[Rajat Nag] 13:14:32 So fourteen, fifteen, sixteen, sixteen, sixteen, eighteen, and eighty, correct? In our case, the mode, the most likely value, the most frequently appearing number, is sixteen, right?

[Rajat Nag] 13:14:46 Because it appears three times.

[Rajat Nag] 13:14:48 And the mean value: that's the sum of the data set divided by the total number of values.

[Rajat Nag] 13:14:57 I have seven numbers in total, so divided by seven. And the median is the middle value. Let me use my laser pointer: I have one, two, three, four, five, six, and seven, so the middle value is sixteen. Sometimes we have an even number of points. Say, for example, we didn't have that eighty value, so we have one, two,

[Rajat Nag] 13:15:25 three, four, five, and six. So what would be the median value? It should be

[Rajat Nag] 13:15:33 this: we have two middle values, sixteen and sixteen, right? So in that case it is the average of these two.

[Rajat Nag] 13:15:41 So in that case we have sixteen, because sixteen plus sixteen, divided by two, is sixteen. So, next.
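
The worked example can be checked with a short Python sketch. It uses the series from the lecture and the standard-library `statistics` module; the variable names are illustrative, and the lecture itself works in Excel rather than Python.

```python
import statistics

# The series from the lecture example.
data = [15, 16, 18, 16, 14, 16, 80]

ordered = sorted(data)            # step 1: order the data
mode = statistics.mode(data)      # most frequent value (16 appears three times)
mean = statistics.mean(data)      # sum of the values (175) divided by the count (7)
median = statistics.median(data)  # middle (4th) value of the 7 ordered values

# Even number of points: drop the 80 and average the two middle values.
without_80 = [x for x in data if x != 80]
median_even = statistics.median(without_80)

print(ordered, mode, mean, median, median_even)
```

The mean comes out at 25, well above the median of 16, because the single large value of 80 pulls the average upward while leaving the middle value unchanged.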

[Rajat Nag] 13:15:50 So, how do we visualize that large amount of data within a short boundary to see the nature of the data?

[Rajat Nag] 13:15:59 We plot them with what is called the box plot, and if we have some outliers, we can also detect them.

[Rajat Nag] 13:16:07 I'll come to that later: what an outlier is and how we can detect it. So, how to measure dispersion.

[Rajat Nag] 13:16:17 It's called the measure of dispersion, that is, how much the data is spread out.

[Rajat Nag] 13:16:24 So what is its nature? Is it widely spread, or is it locally spread?

[Rajat Nag] 13:16:30 For that we need the interquartile range, and with that we can check the twenty-fifth percentile of the data set and the seventy-fifth percentile of the data set.

[Rajat Nag] 13:16:44 The X in the middle stands for the mean value, and this line stands for the median value. Then we have the maximum value of the data points and the minimum value of the data points, and if we have any outlier, it would appear as a dot. I'll come back to that later. So this is the way of quantifying differences between people's

[Rajat Nag] 13:17:07 observations, and how certain we are about our statistics or our prediction. So these are some definitions: range, so the minimum and maximum values; interquartile range, so that is the twenty-fifth and seventy-fifth percentiles, and I'll expand on that in my next slide; and we have standard deviation, which is the square root of the variance.

[Rajat Nag] 13:17:34 So before we perform any data analysis, we need to clean the data. These are the steps. Get rid of extra spaces,

[Rajat Nag] 13:17:47 if we have them in the data set for the, you know, text or categorical variables.

[Rajat Nag] 13:17:57 Hello, do you have any question? If not, please turn off your mic. Then we need to select and treat all blank cells.

[Rajat Nag] 13:18:08 So if we have any blank cells, we need to delete them, and blank rows as well.

[Rajat Nag] 13:18:15 Then convert numbers stored as text. So if you have any numbers stored as text, we need to convert that text into numbers,

[Rajat Nag] 13:18:24 because, for the time being, just understand that you can only perform data analysis on numbers. We can do data analysis with text also, with dummy variables,

[Rajat Nag] 13:18:35 but that's beyond the scope of this module. So we need to convert text into numbers, so that we have a full set of numbers and can perform the data

[Rajat Nag] 13:18:46 analysis. Then remove duplicates: if we have any duplicates in our data set, we need to delete them.

[Rajat Nag] 13:18:54 Luckily, in the example problem we have some duplicates, so we'll do that. Then highlight errors,

[Rajat Nag] 13:19:00 if there is some mistake from manual input. If the people who entered the data made some human error, you can notice it, because blunders can happen, right?

[Rajat Nag] 13:19:13 That is also why identifying the outliers is important. If you can identify the outliers, you can dig into the data set and check whether a value is actually an outlier or whether there is some human mistake, okay?

[Rajat Nag] 13:19:29 I can hear some noise.

[Rajat Nag] 13:19:34 One second.

[Rajat Nag] 13:19:38 This is my only device. Could you please turn off your mic? It's distracting me a lot.

[Rajat Nag] 13:19:44 Then change text to lower... one second. Let me see.

[Rajat Nag] 13:20:11 Okay?

[Rajat Nag] 13:20:15 Okay. It's already sharing my screen, right?

[Rajat Nag] 13:20:26 Yeah. So then.

[Rajat Nag] 13:20:31 Okay.

[Rajat Nag] 13:20:34 So change text to lower, upper, or proper case. We should be consistent, because otherwise Excel will treat the same text as different objects.

[Rajat Nag] 13:20:46 Okay, so if you are trying to categorize data into different categories, then we should have the same kind of text. And then we have, you know, parsing data using Text to Columns.

[Rajat Nag] 13:20:58 So if you are moving a data set which is written in, say, a text file, we need to fit the entire thing into columns. There are different methods. You can also use a CSV file if you're doing it in R, which takes that data in automatically, so that's not a problem. But for Excel you need to copy,

[Rajat Nag] 13:21:23 paste, and then use Text to Columns. I don't think I have that example here, but if you are interested, you can just watch on YouTube, or find on the web, how to fit text data into columns. It's a one-step thing. Then we can do the spell check, delete all formatting, and use

[Rajat Nag] 13:21:47 Find and Replace to clean data in Excel. So after this kind of housekeeping for the data cleaning, we can move on to the data analysis.
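
The cleaning steps described for Excel can also be sketched in plain Python. This is a minimal illustration under stated assumptions: the `(label, value)` row layout, the sample rows, and the helper `clean_rows` are hypothetical and are not part of the lecture material.

```python
def clean_rows(rows):
    """Clean (label, value) string pairs; return (cleaned rows, rows flagged as errors)."""
    cleaned, errors, seen = [], [], set()
    for label, value in rows:
        label = " ".join(label.split()).lower()  # trim extra spaces, use one consistent case
        value = value.strip()
        if not label or not value:               # drop rows that contain blank cells
            continue
        try:
            number = float(value)                # convert numbers stored as text
        except ValueError:
            errors.append((label, value))        # highlight likely manual-entry errors
            continue
        row = (label, number)
        if row in seen:                          # remove duplicates
            continue
        seen.add(row)
        cleaned.append(row)
    return cleaned, errors


# Hypothetical raw rows, as they might arrive from a hand-typed sheet.
raw = [
    ("  Site A ", "12.5"),
    ("site a", "12.5"),   # duplicate once spaces and case are normalized
    ("Site  B", " 7"),
    ("", "3.1"),          # blank label
    ("Site C", ""),       # blank value
    ("Site D", "n/a"),    # not a number
]
cleaned, errors = clean_rows(raw)
print(cleaned)
print(errors)
```

Flagged rows are returned rather than silently dropped, so that a suspicious entry can be checked against the source before deciding whether it is a typing mistake.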

[Rajat Nag] 13:21:56 And this is the example of the variance. So if this line is the mean value of your entire data set, what is the distance of each point from that best-fit line?

[Rajat Nag] 13:22:13 Okay. If you draw a perpendicular to it, that's your distance: how far the data point is from the best-fit line.

[Rajat Nag] 13:22:23 So this distance we measure is xi minus the average of all the data points, and then squared, because the inner term can be negative or

[Rajat Nag] 13:22:34 it can be positive. Okay, so to get rid of the sign we square it, sum the squares, and then divide. We have two options: n minus one, and n.

[Rajat Nag] 13:22:43 This is very important also. If you're doing a survey of the entire population of Ireland, then you need to select only n, the number of people or participants.

[Rajat Nag] 13:22:54 But if you're taking a subset, a very small sample, then the sample should be a representative sample.

[Rajat Nag] 13:23:03 Say, for example, you are considering a survey of the annual income of the entire population, and you take a sample based on, say, a university setup. That cannot be a fair representative sample, right?

[Rajat Nag] 13:23:25 You should have people from every background, and then you can consider that a representative sample of the population. So perhaps you are conducting the survey on, say, one hundred random people at a live concert. One hundred will not be a good example, so say, for example, five hundred or one thousand people at a concert. Then n minus one should

[Rajat Nag] 13:23:57 be there, because you are actually calculating for the sample, not for the population. Once you have that, it is s squared, which is your variance, and then the square root of that

[Rajat Nag] 13:24:09 is your s, which is called the standard deviation. So standard deviation is a parameter to see how far your data is stretched from the mean value.
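
The two divisors, n and n minus one, can be compared in a short Python sketch. It reuses the seven-value series from the lecture's earlier mode, mean, and median example purely as sample numbers; the variable names are illustrative.

```python
import math
import statistics

# Series from the lecture's earlier worked example, reused here as sample numbers.
data = [15, 16, 18, 16, 14, 16, 80]

mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]  # (xi - mean)^2, always non-negative

population_variance = sum(squared_deviations) / len(data)    # divide by n (whole population)
sample_variance = sum(squared_deviations) / (len(data) - 1)  # divide by n - 1 (a sample)
sample_sd = math.sqrt(sample_variance)                       # s = square root of s squared

# The standard library offers both conventions directly.
print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
print(sample_sd, statistics.stdev(data))
```

Dividing by n minus one always gives the larger of the two variances, and the gap shrinks as the number of observations grows.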

[Rajat Nag] 13:24:19 Okay? Do you have any questions so far? If not, you can unmute yourself.

[Rajat Nag] 13:24:26 Sorry, I mean you can stay muted, and if you have any question, unmute and ask me.

[Rajat Nag] 13:24:37 Okay, thank you. Then we have the outliers. If you plot the data in the form of a box plot, you'll see these kinds of dots, and these are called outliers. So how can I calculate them? As I mentioned, we can calculate the twenty-fifth percentile, the median value, and then the

[Rajat Nag] 13:25:01 seventy-fifth percentile. Then we can take the seventy-fifth percentile value minus the twenty-fifth percentile value, and we have the value in between.

[Rajat Nag] 13:25:15 So if the first is, say, fifty, and the second is, say, thirty, then fifty minus thirty is twenty, right? That is called the IQR, the interquartile range.

[Rajat Nag] 13:25:27 So the outlier boundary will be Q3, the seventy-fifth percentile, plus one point five times the IQR.

[Rajat Nag] 13:25:36 The IQR is fifty minus thirty; I think I said fifty and thirty. So the difference would be this one, and you multiply it by one point five.

[Rajat Nag] 13:25:46 So that is your boundary. Any data beyond this point is considered an outlier.

[Rajat Nag] 13:25:52 So those are positive outliers, and these are negative outliers. See the equation to calculate that:

[Rajat Nag] 13:26:01 it is the Q1 value minus one point five into the IQR. The IQR is the same for both this and this.
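
The boundary calculation can be written out as a small Python sketch using the lecture's own numbers (Q1 = 30, Q3 = 50). The function names `iqr_fences` and `is_outlier` are illustrative assumptions, not names used in the lecture.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier boundaries: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(x, q1, q3, k=1.5):
    """True if x lies strictly outside the boundaries."""
    lower, upper = iqr_fences(q1, q3, k)
    return x < lower or x > upper


# Numbers from the lecture: Q1 = 30, Q3 = 50, so IQR = 20.
lower, upper = iqr_fences(30, 50)
print(lower, upper)            # 30 - 1.5*20 and 50 + 1.5*20
print(is_outlier(85, 30, 50))  # above the upper boundary
print(is_outlier(45, 30, 50))  # inside the box
```

With these numbers the boundaries are 0 and 80, so a value of 85 would be drawn as a dot above the upper whisker of the box plot.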

[Rajat Nag] 13:26:09 And yeah, an outlier is an observation that lies outside the overall distribution of values.

[Rajat Nag] 13:26:14 So usually it depends on how far the data is from the central tendency, and

[Rajat Nag] 13:26:22 it can be an inherent property of the data. So sometimes you need to judge: should I delete the outlier,

[Rajat Nag] 13:26:29 or should I keep these outliers? Because sometimes they can be a genuine part of the data.

[Rajat Nag] 13:26:36 Say, for example, I'm trying to get some survey data from the people who come to our university campus by car, so I can ask them: how many kilometres did you travel this morning?

[Rajat Nag] 13:26:58 Perhaps people respond with two kilometres, three kilometres, five, and so on. But there can be a few people who travel

[Rajat Nag] 13:27:06 from far away. I mean, they can travel by bus, or they can also drive, so some may take a car to travel, say, one hundred and fifty kilometres from here, and those can be considered outliers.

[Rajat Nag] 13:27:25 But if you are trying to calculate the emissions from vehicles, these are very valuable data, so you should keep these outliers.

[Rajat Nag] 13:27:33 Okay, but if you know, it can happen due to some sampling error and all so the very extreme values you should delete them or after identifying the outlayers, and just apart from determines on your main data set.
[Rajat Nag] 13:27:33 好的,但如果你知道,这可能是由于某些抽样误差等原因造成的,因此你应该删除非常极端的值,或者在识别出异常值后,将其与主要数据集分开。

[Rajat Nag] 13:27:50 And yeah, that's it. And then we have some examples like here. So we'll solve this problem.
[Rajat Nag] 13:27:50 嗯,就是这样。然后我们有一些例子,比如这里。所以我们将解决这个问题。

[Rajat Nag] 13:27:58 And we will not see any kind of outlay in our example. Problem.
[Rajat Nag] 13:27:58 在我们的例子中,我们不会看到任何形式的支出。问题。

[Rajat Nag] 13:28:04 And this is called the normal distribution. When we have a very large data set, mostly things will be normally distributed.

[Rajat Nag] 13:28:15 The term "normal" comes from this: the normal distribution is the one that describes things as they normally occur.

[Rajat Nag] 13:28:22 For example, if you survey the entire population of Ireland on height and classify people into different bins, say zero to one metre, sorry, zero to one foot, how many; one to two

[Rajat Nag] 13:28:42 feet, how many; and so on, then you'll see this kind of plot.

[Rajat Nag] 13:28:47 It's also called the bell-shaped plot. This is the mean value, and these are the standard deviations.

[Rajat Nag] 13:28:54 This is mean minus one standard deviation, this is mean plus one standard deviation, and this is mean plus two standard deviations.

[Rajat Nag] 13:29:02 And so on. Okay, so the most common and most useful continuous distribution is the normal distribution.

[Rajat Nag] 13:29:09 Many everyday data sets approximately follow the normal distribution. Normality is an important assumption when considering other statistical distributions. For example, if you're considering the beta distribution, your data should be normally distributed, which means you should have enough data to run the beta distribution.

[Rajat Nag] 13:29:27 So it's a kind of assumption behind your other distributions. Okay, many statistical inference...

[Rajat Nag] 13:29:36 Yeah, this is the thing I mentioned: it's a kind of bell-shaped curve. So this is important.

[Rajat Nag] 13:29:43 Within one standard deviation, so the mean value plus and minus one standard deviation, you capture about sixty-eight percent of your entire data set. By adding and subtracting two standard deviations you capture about ninety-five percent of your data.

[Rajat Nag] 13:29:58 And within three standard deviations you capture ninety-nine point seven percent of your data.

[Rajat Nag] 13:30:03 So if you have the mean and the standard deviation, they give a good picture of the central tendency. Okay.
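
The 68 / 95 / 99.7 percent figures can be checked with a short Python sketch. The heights are simulated, not survey data, and the mean of 170 cm and standard deviation of 10 cm are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Simulated heights in cm; mean 170 and SD 10 are illustrative assumptions.
heights = [random.gauss(170, 10) for _ in range(100_000)]

mean = statistics.fmean(heights)
sd = statistics.pstdev(heights)

def share_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    inside = sum(1 for h in heights if abs(h - mean) <= k * sd)
    return inside / len(heights)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} SD of the mean: {share:.1%}")
```

The printed shares should land close to 68 %, 95 % and 99.7 %; real data that is not normally distributed will deviate from them.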

[Rajat Nag] 13:30:15 Then we have something called kurtosis, a statistical measure used to describe the distribution of observed data around the mean.

[Rajat Nag] 13:30:26 We can have three kinds of values: it can be less than three, it can be exactly three, or it can be more than three.

[Rajat Nag] 13:30:36 So, for example, we'll start with the first one, platykurtic data.

[Rajat Nag] 13:30:46 That is this one, the orange one.

[Rajat Nag] 13:30:52 Then we have the one with a value of three, the mesokurtic data.

[Rajat Nag] 13:30:59 So that's your green one.

[Rajat Nag] 13:31:01 And you can see a more unusual shape like this one; that is when we have leptokurtic data.

[Rajat Nag] 13:31:10 There the value is higher than three. So you may have this kind of data. So, ideally...

[Rajat Nag] 13:31:18 So just tell me which one is more expected.

[Rajat Nag] 13:31:24 Just look at the curves and, based on your understanding of the normal distribution, let me know which one should be

[Rajat Nag] 13:31:32 ideal.

[Rajat Nag] 13:31:41 Have a guess. Anyway, you should come up with your answers,

[Rajat Nag] 13:31:47 no matter whether they are right or wrong.

[Rajat Nag] 13:31:56 Overcome your fear of failure, please.

[Rajat Nag] 13:32:01 I need to know whether you are understanding. You don't need to type in the chat box. Just tell me, because I need to know.

[Rajat Nag] 13:32:07 So unmute yourself and just talk.

[Saksorn Techasutjalidsuntorn] 13:32:14 Maybe the first one.

[Rajat Nag] 13:32:16 Yeah, it looks like it. In most cases we get this kind of curve, yes, that's true. But when we have only a few outliers, we may get this kind of shape. In most cases you will get a value near three, so that's the correct answer. Thank you very much.
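
As a rough illustration of the three cases, the Python sketch below computes kurtosis as the fourth central moment divided by the squared variance, the convention in which a normal distribution scores three. The three simulated samples (uniform, normal, Laplace) are assumptions picked to produce a flat, a normal and a peaked shape; note that some software reports "excess" kurtosis, which is this value minus three.

```python
import random
import statistics

def kurtosis(values):
    """Fourth central moment divided by squared variance (normal data gives about 3)."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(x - mean) ** 2 for x in values])
    m4 = statistics.fmean([(x - mean) ** 4 for x in values])
    return m4 / m2 ** 2

random.seed(3)
n = 100_000
flat = [random.uniform(-1, 1) for _ in range(n)]      # platykurtic: below 3
normal = [random.gauss(0, 1) for _ in range(n)]       # mesokurtic: about 3
# Laplace sample: exponential magnitude with a random sign (leptokurtic: above 3).
peaked = [random.choice((-1, 1)) * random.expovariate(1.0) for _ in range(n)]

k_flat, k_normal, k_peaked = kurtosis(flat), kurtosis(normal), kurtosis(peaked)
print(f"platykurtic {k_flat:.2f}, mesokurtic {k_normal:.2f}, leptokurtic {k_peaked:.2f}")
```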

[Rajat Nag] 13:32:39 Now there is something called skewness. Skewness is an indication of how the data is placed towards the left-hand side or the right-hand side.

[Rajat Nag] 13:32:52 In the case of positive skewness, a distribution is positively skewed when its tail is more pronounced on the right side.

[Rajat Nag] 13:33:02 So we have extreme values towards the higher side. Okay, like this one: we have the long tail towards the higher end, right?

[Rajat Nag] 13:33:14 So these are the higher values, and we have most of our data in that zone.

[Rajat Nag] 13:33:20 Okay?

[Rajat Nag] 13:33:24 That means most of the extreme values are on the right side. And then we have negative skewness.

[Rajat Nag] 13:33:29 If we have a plot like this, see my pointer, yeah, like this, that means negative skewness: most extreme values are found towards the left, towards the lower values.

[Rajat Nag] 13:33:43 In most cases, in reality, you may find positively skewed data in terms of

[Rajat Nag] 13:33:53 your test results in the thesis, but negative skewness can also happen.

[Rajat Nag] 13:34:01 Okay. So after getting this value, or after looking at the shape of the histogram,

[Rajat Nag] 13:34:10 you'll be better placed to judge whether your data is skewed to the left or to the right. So the histogram is the indicator.

[Rajat Nag] 13:34:17 Plot the histogram, and then we can see whether it is positively or negatively skewed.

[Rajat Nag] 13:34:21 Sometimes it's called left-skewed and right-skewed. Okay, and this is where people make mistakes.

[Rajat Nag] 13:34:29 Left skew does not mean the peak is on your left. It is the tail of the data that is on the left.

[Rajat Nag] 13:34:36 Okay. So this one is right-skewed data. And the opposite is left-skewed data.
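
A small Python sketch of the sign convention, assuming skewness is measured as the third central moment divided by the cubed standard deviation. The right-skewed sample is simulated from an exponential distribution (an illustrative choice), and the left-skewed sample is simply its mirror image.

```python
import random
import statistics

def skewness(values):
    """Third central moment divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([(x - mean) ** 3 for x in values]) / sd ** 3

random.seed(7)
right_skewed = [random.expovariate(1.0) for _ in range(50_000)]  # long tail to the right
left_skewed = [-x for x in right_skewed]                         # long tail to the left

skew_right = skewness(right_skewed)
skew_left = skewness(left_skewed)
print(f"right-skewed sample: skewness {skew_right:+.2f}")
print(f"left-skewed sample:  skewness {skew_left:+.2f}")

# In right-skewed data the tail pulls the mean above the median.
print(statistics.fmean(right_skewed) > statistics.median(right_skewed))
```

A positive value means the tail points to the right (higher values) and a negative value means it points to the left, which matches what the histogram shows.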

[Rajat Nag] 13:34:47 Then we have the t-test. It is a statistical test used to compare the means of two groups of data. Say, for example, you are conducting a test on a set of samples, and then you are repeating it; you have to repeat, and I think you have come across the terminology: these are just replicates.

[Rajat Nag] 13:35:12 So, for example, you're testing bacteria. There can be variability and uncertainty around the data.

[Rajat Nag] 13:35:18 One day you count the number of bacteria you are culturing in the lab, and then you should repeat

[Rajat Nag] 13:35:28 the experiment again and again. Okay, because you should not come up with a value that arises by chance.

[Rajat Nag] 13:35:37 Okay. If you conduct only one test, the result can be by chance, right? So you repeat the test again and again, say five times, and then you run the t-test.

[Rajat Nag] 13:35:48 With the t-test we get the p-value, which you have seen most of the time in papers. If you have a very small p-value, let's say below five percent, so point zero five, or sometimes point zero one, one percent,

[Rajat Nag] 13:36:09 then we can say the results are statistically significant, that is, not by chance. We start with the hypothesis that the results are by chance, and then we calculate the p-value.

[Rajat Nag] 13:36:24 If the p-value is less than the chosen significance level, like five percent or one percent, then we can say: okay, we need to reject that hypothesis and consider the alternative hypothesis, which is that the results I'm getting are not by chance. Okay, so there are three

[Rajat Nag] 13:36:47 stages in getting this t-test value. We'll use the Excel Analysis ToolPak, and you'll see there are three options.

[Rajat Nag] 13:36:55 The first question is: are the two sample sets the same or related? If you're running the bacterial sample test on the same sample multiple times on the same day,

[Rajat Nag] 13:37:08 then probably you can do the paired, or dependent, t-test. If no, then you need to ask your second question.

[Rajat Nag] 13:37:16 Are the two sample sets of the same sample size? If you're running the bacterial count, say, three times on day one and five times on day two, you cannot go for the equal variance independent t-test.

[Rajat Nag] 13:37:36 So if you have the same sample size, you go with that one. But if you have different sample sizes, then you go with this one, the unequal variance independent t-test.
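
The three options can be sketched in plain Python. This is a hand-written illustration, not what Excel's Analysis ToolPak runs internally: the bacterial counts are hypothetical, the function names are made up for this example, and the p-value is obtained by numerically integrating the Student's t density rather than from a statistics library.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: Simpson's rule on the density from 0 to |t| (steps is even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)

def paired_t(a, b):
    """Paired (dependent) t-test: works on the pairwise differences."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n)), n - 1

def equal_variance_t(a, b):
    """Independent t-test with a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

def unequal_variance_t(a, b):
    """Independent t-test without pooling (Welch), with Welch-Satterthwaite df."""
    na, nb = len(a), len(b)
    sa, sb = statistics.variance(a) / na, statistics.variance(b) / nb
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df

# Hypothetical bacterial counts in log10 CFU/mL: 3 replicates on day one, 5 on day two.
day_one = [4.1, 4.3, 3.9]
day_two = [4.8, 5.0, 4.7, 5.1, 4.9]

t_stat, df = unequal_variance_t(day_one, day_two)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df:.2f}, p = {p_value:.4f}")
print("significant at 5 %" if p_value < 0.05 else "not significant at 5 %")
```

`paired_t` needs two lists of equal length measured on the same units, which is why it cannot be applied to the three-versus-five replicate example.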

[Rajat Nag] 13:37:50 If you have any questions, please ask me, because most of the time I have seen my students struggle a little with this p-value. But we'll do some calculations, of course.

[Rajat Nag] 13:38:05 Yeah.

[Rajat Nag] 13:38:07 But from the theory point of view, do you have any questions?

[Rajat Nag] 13:38:21 Thank you. Next, we have variability and uncertainty. I should give you examples to understand things better.

[Rajat Nag] 13:38:32 Variability is an inherent property of the data. Take the bacterial count: in reality there can be a range, not a fixed value.

[Rajat Nag] 13:38:46 If you're sampling the bacterial count in a lake or a pond, you cannot take just one sample, right? You need to sample at different points of that pond, and while doing so you may get different numbers of bacteria.

[Rajat Nag] 13:39:06 Right, but that is the beauty of it, and it is called variability. It refers to the inherent heterogeneity or diversity of a data set. Variability cannot be reduced, because it is the nature of the bacteria

[Rajat Nag] 13:39:25 that they can form colonies, and they can grow in different ways.

[Rajat Nag] 13:39:35 So the number of bacteria can differ in different parts of the pond. It cannot be reduced, but it can be better characterized, so we need to capture that variability. Examples: variability between individual body weights,

[Rajat Nag] 13:39:47 CO2 emissions from residential heating oil. So that is variability. And then uncertainty.

[Rajat Nag] 13:39:54 Take the same example of the bacterial count. Say I sample from that pond and take fifty mL

[Rajat Nag] 13:40:04 of water, and I give it to all of my students. If I have five students and I say, perform the bacterial count test, grow the bacteria in the lab, and tell me the number of bacteria in the water, you may come up with different values for the same thing. That is called uncertainty. It

[Rajat Nag] 13:40:33 is basically human error. And if I send this sample to five different labs, and they use five different machines or technologies to detect those bacteria,

[Rajat Nag] 13:40:43 we may get different results depending on the sensitivity of those instruments and their accuracy or calibration.

[Rajat Nag] 13:40:54 So you got my point? What is the difference between variability and uncertainty? Uncertainty can be minimized by

[Rajat Nag] 13:41:01 conducting more tests with sophisticated instruments, but variability

[Rajat Nag] 13:41:08 we cannot reduce.

[Rajat Nag] 13:41:10 Okay?

[Rajat Nag] 13:41:15 So the total uncertainty is the sum of the variability and the uncertainty: it can come from the variable nature of the data and from the way it is treated, detected, or measured.

[Rajat Nag] 13:41:28 So uncertainty can come from that side.

[Niambee Ruth Palacio] 13:41:32 Professor, I have a question on this part. When you're calculating total uncertainty, does that mean you would prefer higher uncertainty over variability, because between the two you can only control the uncertainty?

[Rajat Nag] 13:41:32 Yes, yes, please.

[Rajat Nag] 13:41:48 No, it should be the sum. You should carry the variability from your data, and also, if you do not have any quality control, if you do not have a choice, it is better to use a distribution rather than an average value, so that you carry the overall variability and uncertainty of the data set. So instead of a deterministic value, which is

[Niambee Ruth Palacio] 13:41:52 Yeah.

[Rajat Nag] 13:42:14 an average of your lab test bacterial counts, you can go with a range.

[Rajat Nag] 13:42:24 That's why we draw box plots and so on. You measure your bacterial count and fit a distribution, or you plot it as a box plot, and you can have confidence around the range of that data set. Why the range arises, that is the explanation, and that is the total uncertainty. It can be based on the

[Rajat Nag] 13:42:47 variability, or it can be only because of the variability. But it would be very odd to have uncertainty with no variability or variation at all. So there is always a chance of variability plus uncertainty, and we call that the total uncertainty.

[Niambee Ruth Palacio] 13:43:07 Thank you, Professor.
[Niambee Ruth Palacio] 13:43:07 谢谢您,教授。

[Rajat Nag] 13:43:10 So, next we have deterministic and probabilistic, which form the foundation of two kinds of model.

[Rajat Nag] 13:43:17 One is the deterministic model. That's the simplistic one: ten plus five is always fifteen, right? But in a probabilistic model you consider a range of bacterial counts and how the initial count of bacteria feeds into a growth model. So if you're interested in building a growth model of the

[Rajat Nag] 13:43:43 bacteria, and observing through the week, or through the month, how the bacteria fluctuate due to temporal variation,

[Rajat Nag] 13:43:54 then instead of one average value you can plot the entire thing.

[Rajat Nag] 13:44:02 Based on the distribution of the data, you can calculate a distribution ranging from, say, two logs to six logs. Then you can add or multiply other parameters, each of which can be a fixed value or a distribution, to get your final distribution, a final range. That is called the probabilistic model.

[Rajat Nag] 13:44:28 Most of the time we also opt for the deterministic model for a scenario like going with the maximum value.

[Rajat Nag] 13:44:34 That does not reflect reality, but it is good for a kind of screening check. Then the output is also a single value.

[Rajat Nag] 13:44:44 Input values can be altered to reflect a what-if scenario. So, what if the bacterial count is only two log?

[Rajat Nag] 13:44:53 What if the bacterial count is, say, five log? And you can do the calculation. And in the case of the probabilistic model,

[Rajat Nag] 13:45:00 multiple scenarios are run, with inputs represented by probability distributions rather than fixed values, and then the output is also a distribution. We use Monte Carlo simulation for that modelling part. So what is Monte Carlo simulation?

[Rajat Nag] 13:45:18 If some of you are working on probabilistic modelling for your thesis, this applies to you. It is

[Rajat Nag] 13:45:25 used to model the probability of different outcomes in a process that cannot easily be predicted because of the variability in the input parameters.

[Rajat Nag] 13:45:37 It is a technique used to understand the impact of risk and uncertainty. What happens is that you select one value at a time from the individual distribution of each input parameter.

[Rajat Nag] 13:45:51 Then you calculate the output, which is a deterministic one, and then you repeat the thing again and again,

[Rajat Nag] 13:45:59 a thousand or a hundred thousand times. Each time you take one random value from the individual distribution of each parameter and calculate your output, and then all your iterated outputs together form a distribution. So the overall result is a distribution, a range, and that is the beauty

[Rajat Nag] 13:46:26 of Monte Carlo simulation. And you can have a kind of confidence interval, like your fifth percentile:

[Rajat Nag] 13:46:33 what would the value be? And you can be ninety-five percent sure by looking at your ninety-fifth percentile value.

[Rajat Nag] 13:46:40 And so on. We'll do that kind of thing in a few minutes.
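
The loop described above can be sketched in a few lines of Python. Everything numeric here is a hypothetical placeholder: the initial count is assumed uniform between 2 and 6 log10 (the range mentioned in the lecture, with the distribution shape chosen for illustration), and the growth term is an invented normally distributed parameter.

```python
import random
import statistics

random.seed(1)        # fixed seed so the run is repeatable
ITERATIONS = 10_000

def one_iteration():
    """Draw one value per input distribution and return one deterministic output."""
    # Hypothetical inputs; the distributions and parameters are illustrative only.
    initial_log_count = random.uniform(2.0, 6.0)  # initial count, 2 to 6 log10
    log_growth = random.gauss(1.0, 0.3)           # growth added afterwards, in log10
    return initial_log_count + log_growth

outputs = [one_iteration() for _ in range(ITERATIONS)]

# quantiles(..., n=20) gives 19 cut points at 5 %, 10 %, ..., 95 %.
cuts = statistics.quantiles(outputs, n=20)
p5, p95 = cuts[0], cuts[-1]

# The deterministic counterpart: the same model with fixed average inputs.
deterministic = 4.0 + 1.0

print(f"deterministic output: {deterministic:.2f} log10")
print(f"probabilistic output: mean {statistics.fmean(outputs):.2f}, "
      f"5th percentile {p5:.2f}, 95th percentile {p95:.2f}")
```

The deterministic model returns a single number, while the simulation returns a whole distribution from which the percentiles are read.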

[Rajat Nag] 13:46:46 Monte Carlo simulation is used to tackle a range of problems in many fields, including investing, business, physics, and engineering.

[Rajat Nag] 13:46:54 My PhD and current research work are based on this probabilistic modelling and Monte Carlo simulation.

[Rajat Nag] 13:47:03 So it is kind of bread and butter for me. It is also referred to as a multiple probability simulation

[Rajat Nag] 13:47:09 sometimes. So that's it, that's the theory. We may come back to the slides later.

[Rajat Nag] 13:47:16 But for the time being I'll show you how to download data. There can be large data sets.

[Rajat Nag] 13:47:23 If you are a researcher who can conduct only desktop-based thesis work,

[Rajat Nag] 13:47:32 that's fine. You can download WHO data.

[Rajat Nag] 13:47:35 You can download different kinds of data, but to keep the task manageable for this data analysis

[Rajat Nag] 13:47:44 I'll use a relatively small data set. As I said, if you understand it with that small amount of data, you can apply it to your own data set.

[Rajat Nag] 13:47:52 Okay, so we'll use a data set from the Central Statistical Institute, sorry, the Central Statistics Office, the CSO,

[Rajat Nag] 13:48:01 for Ireland.

[Rajat Nag] 13:48:07 The website looks like this. I can send you the link in the chat box, but

[Rajat Nag] 13:48:17 I'll also send you the exact Excel file to save some time. So that's the source.

[Rajat Nag] 13:48:23 And then, when you go to Environment,

[Rajat Nag] 13:48:27 go to Environmental Accounts. We are interested in greenhouse gas emissions for today's sample calculation.

[Rajat Nag] 13:48:36 This is important for every country in meeting its climate targets. So we'll look at some data from the government.

[Rajat Nag] 13:48:47 There are many data sets. I think that's the most recent one.

[Rajat Nag] 13:48:54 If I click, you will see you can select all the greenhouse gases or you can be selective. But for the sake of the class, I can filter all the data in Excel, so I prefer to select all.

[Rajat Nag] 13:49:11 In reality, if you want to download some data from a big data set, you can make your life easy by selecting only the items you are interested in. Okay? Then you can also select the years,

[Rajat Nag] 13:49:24 whichever years you are interested in, and then the different sectors: manufacturing, land transport, and so on.

[Rajat Nag] 13:49:33 So you can select these three and download your data.

[Rajat Nag] 13:49:39 You can download it as a CSV file. Now I'm going to send you the same CSV file so that you can work in parallel.

[Rajat Nag] 13:50:04 It's not allowing me to download that. One second.
[Rajat Nag] 13:50:04 它不让我下载那个。等一下。

[Rajat Nag] 13:50:12 Okay, I'll put that in Google Drive.
[Rajat Nag] 13:50:12 好的,我会把它放在 Google Drive 里。

[Rajat Nag] 13:51:01 Could you please follow along and download it from the website?

[Rajat Nag] 13:51:09 So go to the website.

[Rajat Nag] 13:51:15 This is the website, so go to the website.

[Rajat Nag] 13:51:24 And then I'll repeat everything, okay? Probably you can see this one.

[Rajat Nag] 13:51:33 So I'll repeat. Just go along with me. C-S-O.

[Rajat Nag] 13:51:38 And then CSO data, or the statistical database.

[Rajat Nag] 13:51:45 Can all of you see the data like this?

[Rajat Nag] 13:51:57 So you can select Environment.

[Saksorn Techasutjalidsuntorn] 13:51:57 Yes.

[Rajat Nag] 13:52:06 If any of you are struggling with the CSO data download, let me know now, because once I download the data you will have to follow along.

[Rajat Nag] 13:52:22 So click Environment, then go to Environmental Accounts.

[Rajat Nag] 13:52:33 Within Environment and Environmental Accounts there is an option called Air Emissions, so click that.

[Rajat Nag] 13:52:46 And you can click the first one. Okay, the top one.

[Rajat Nag] 13:52:51 Greenhouse gas emissions from 2010 to 2020. The first one.

[Rajat Nag] 13:53:00 Then you can select all greenhouse gases.

[Rajat Nag] 13:53:07 You can select all the years and then the sectors, and then download the data.

[Rajat Nag] 13:53:18 You can click the CSV file or the XLS file,

[Rajat Nag] 13:53:24 the Excel file. Select the CSV file, because some of you may work in R and can work on it directly, and in Excel we can convert this CSV file into an Excel file with one click. So just download the CSV.
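
For anyone working in code rather than Excel, a CSV export of this kind can be read with Python's standard `csv` module. The sketch below uses a small in-memory stand-in for the download: the column names (`Year`, `Sector`, `Gas`, `VALUE`) and the figures are hypothetical, so check the header row of the real file before reusing it.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the downloaded file; column names and figures are hypothetical.
sample_csv = """Year,Sector,Gas,VALUE
2019,Manufacturing,Carbon dioxide,100.5
2019,Land transport,Carbon dioxide,80.0
2020,Manufacturing,Carbon dioxide,95.5
2020,Land transport,Carbon dioxide,60.0
"""

# Sum the emission values per year, reading one row at a time.
totals_by_year = defaultdict(float)
for row in csv.DictReader(io.StringIO(sample_csv)):
    totals_by_year[row["Year"]] += float(row["VALUE"])

for year, total in sorted(totals_by_year.items()):
    print(year, total)
```

To read a real download, replace `io.StringIO(sample_csv)` with an open file handle, for example `open(path, newline="", encoding="utf-8-sig")`, where `path` is wherever the file was saved.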

[Rajat Nag] 13:53:42 So this is the csv file I downloaded today.
[Rajat Nag] 13:53:42 所以这是我今天下载的 csv 文件。

[Rajat Nag] 13:53:48 Now I'm opening that file.
[Rajat Nag] 13:53:48 现在我正在打开那个文件。

[Rajat Nag] 13:53:53 First thing first. Go to file.
[Rajat Nag] 13:53:53 首先,去文件。

[Rajat Nag] 13:53:59 Save As, and give it a name: data analysis day one.

[Rajat Nag] 13:54:09 And change the file...

[Rajat Nag] 13:54:16 Change the name, and change the type of file to Excel Workbook.

[Rajat Nag] 13:54:23 Unmute yourself and tell me if you have any problem up to this point.

[Niambee Ruth Palacio] 13:54:31 Professor? With regards to converting the CSV file into an Excel file, I have to open Yeah, I opened it, but it's showing me something different. It's giving me a different format.
[Niambee Ruth Palacio] 13:54:31 教授?关于将 CSV 文件转换为 Excel 文件,我必须打开。是的,我打开了,但它显示给我的东西不同。它给了我一个不同的格式。

[Rajat Nag] 13:54:33 Yes.
[Rajat Nag] 13:54:33 是的。

[Rajat Nag] 13:54:39 The csv file
[Rajat Nag] 13:54:39 CSV 文件

[Rajat Nag] 13:54:46 But that's fine i think if you go to file
[Rajat Nag] 13:54:46 但我认为如果你去文件的话,这没问题

[Niambee Ruth Palacio] 13:54:50 Okay. Okay, I'm going to figure it out. I'm going to try.
[Niambee Ruth Palacio] 13:54:50 好的。好的,我会想办法的。我会尝试的。

[Rajat Nag] 13:54:54 Yeah. There can be a prompt called OK or Edit, depending on the version of your Excel, but it should be fine. Then save the file as an Excel Workbook and give it a name.

[Rajat Nag] 13:55:10 Data analysis.

[Rajat Nag] 13:55:15 Day one. Just save the file. Okay, so now you have an Excel file.

[Rajat Nag] 13:55:23 Then you can select the entire data like this, column by column: just click on the column letters, E, D, C, B and A, and drag.

[Rajat Nag] 13:55:35 So you can select the entire data set, and selecting beyond it is fine. Then double-click on the border between, say, D and E, or A and B, or B and C, anywhere. This is a border: double-click it and you can see all the text.

[Rajat Nag] 13:55:57 Okay, so we are not going to do any calculation on this sheet, so just call it raw data.

[Rajat Nag] 13:56:06 Okay, so you can revisit that page at any time. Now.

[Rajat Nag] 13:56:13 Next step.
[Rajat Nag] 13:56:13 下一步。

[Rajat Nag] 13:56:17 Right click on the raw data click move or copy.
[Rajat Nag] 13:56:17 右键单击原始数据,点击移动或复制。

[Rajat Nag] 13:56:25 And then select... and please close all the Excel files other than this one, okay? Otherwise you may see several workbooks listed. With the same workbook selected, choose "move to end" and "Create a copy".

[Rajat Nag] 13:56:41 So "move to end", tick "Create a copy" here, and select OK.

[Saksorn Techasutjalidsuntorn] 13:56:49 Can you do it again?
[Saksorn Techasutjalidsuntorn] 13:56:49 你能再做一次吗?

[Rajat Nag] 13:56:51 Yes, raw data right click. Move or copy.
[Rajat Nag] 13:56:51 是的,右键点击原始数据。移动或复制。

[Rajat Nag] 13:56:56 And then select "move to end" and tick "Create a copy" here.

[Caroline Lema] 13:57:04 Sorry, can you repeat the first stage, after you have copied the data into Excel?

[Caroline Lema] 13:57:12 The thing that you did to get the empty raw. Can you repeat on that, please?
[Caroline Lema] 13:57:12 你做的事情是为了得到空的原始数据。你能再说一遍吗?

[Rajat Nag] 13:57:20 So… Okay, so like like this.
[Rajat Nag] 13:57:20 所以……好的,就像这样。

[Rajat Nag] 13:57:25 So I selected the entire data. I think it was something like this, from here, right? You converted the CSV file into an Excel file and it should look like this.

[Caroline Lema] 13:57:47 Yes.
[Caroline Lema] 13:57:47 是的。

[Rajat Nag] 13:57:47 Yeah, so select the entire data from the top, like this, and then double-click on the boundary between these two columns.

[Rajat Nag] 13:58:01 Double-click, and each column will be stretched to the width its cells require. Some may have short names, some may have long names, so the columns are stretched in such a way that you can read everything.

[Rajat Nag] 13:58:22 Now, this is my raw data. So I just renamed the tab as raw data.
[Rajat Nag] 13:58:22 现在,这就是我的原始数据。所以我只是将标签重命名为原始数据。

[Rajat Nag] 13:58:28 Now I'm copy-pasting the tab for my working file.

[Rajat Nag] 13:58:35 So Move or Copy, "move to end".

[Rajat Nag] 13:58:42 Tick this box, "Create a copy", and hit OK. So I'm basically copy-pasting the entire thing to a new tab. Just name it

[Rajat Nag] 13:58:54 working data. You should not work on the raw data at any point in time, because you can make a mistake, and this way you can revisit the raw data at any point in time.

[Rajat Nag] 13:59:05 Okay. And do whatever you want in your working data and the sheets that follow.
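
The same raw-versus-working habit carries over to code. The lines below are a small sketch, assuming the rows have already been read into a list of dicts; the field names and values are hypothetical.

```python
import copy

# Hypothetical raw rows, as read from the downloaded file.
raw_data = [
    {"Year": "2010", "Sector": "Crop and animal production", "VALUE": "100.0"},
    {"Year": "2010", "Sector": "Mining and quarrying", "VALUE": "5.0"},
]

# Work on a deep copy so that edits never touch the raw rows.
working_data = copy.deepcopy(raw_data)
working_data[0]["VALUE"] = "999.0"  # a deliberate "mistake" in the working copy

print(raw_data[0]["VALUE"])      # the raw value is unchanged
print(working_data[0]["VALUE"])  # only the working copy was edited
```

A deep copy is used here because a plain `list(raw_data)` would still share the inner dicts with the raw data.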

[Rajat Nag] 13:59:13 Next, select the entire first row like this. Or, there may be no need to select; you can just go to Data and select Filter. Oh, sorry, I think you do need to select. So you can select like this, only the one, two, three, four, five columns,

[Rajat Nag] 13:59:37 or you can select all the columns. If I have large data I select everything; if you have only five columns of data you can select like this and add the filter.

[Rajat Nag] 13:59:50 So in the Data tab you have an option called Filter. Click it, and then you will see different kinds of filtering options, okay?

[Rajat Nag] 14:00:10 Our main interest is the agriculture sector, because, as you may know, agriculture contributes the most in terms of methane and nitrous oxide emissions. So first unselect everything and then select agriculture. There should be something called agriculture.

[Rajat Nag] 14:00:39 Can you see that?
[Rajat Nag] 14:00:39 你能看到那个吗?

[Rajat Nag] 14:00:45 Agriculture.
[Rajat Nag] 14:00:45 农业。

[Rajat Nag] 14:00:52 Did they change?
[Rajat Nag] 14:00:52 他们改变了吗?

[Soham Deshpande] 14:00:57 Sir, there is crop and animal production.
[Soham Deshpande] 14:00:57 先生,那里有农作物和动物生产。

[Rajat Nag] 14:01:01 Crop. Earlier there was one just called agriculture.

[Rajat Nag] 14:01:30 This one, right? Crop is the only relevant one.

[Soham Deshpande] 14:01:34 Yes
[Soham Deshpande] 14:01:34 是的

[Rajat Nag] 14:01:36 So just select only that one: crop and animal production, hunting and so on.

[Rajat Nag] 14:01:42 Hunting and other related activities. Okay.

[Rajat Nag] 14:01:47 So we have the data for our sector, correct?
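
The Excel filter step has a direct equivalent in code. This is a minimal sketch; the sector label is written out as an assumption about how the export names it, and the rows are made up.

```python
# Hypothetical label; check the exact text used in your own download.
AGRI_SECTOR = "Crop and animal production, hunting and related service activities"

rows = [
    {"Sector": AGRI_SECTOR, "Gas": "Methane", "Value": 1.0},
    {"Sector": "Manufacturing", "Gas": "Methane", "Value": 2.0},
    {"Sector": AGRI_SECTOR, "Gas": "Nitrous oxide", "Value": 3.0},
]

def filter_rows(rows, column, wanted):
    """Keep only the rows whose value in `column` equals `wanted`."""
    return [row for row in rows if row[column] == wanted]

agri_rows = filter_rows(rows, "Sector", AGRI_SECTOR)
print(len(agri_rows), "rows kept out of", len(rows))
```

Listing the distinct values with `sorted({row["Sector"] for row in rows})` before filtering is a quick way to spot a renamed category, which is what happened live in the session.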

[Rajat Nag] 14:01:56 So now the point is to check whether I have a good data set or not: whether I have any blank cells or inconsistency in the data. I actually found some inconsistency in terms of

[Rajat Nag] 14:02:13 tons and thousand tons, so the unit is not the same.
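
Both checks mentioned here, blank cells and mixed units, are easy to automate. The sketch below assumes rows stored as dicts with a `UNIT` column; those names and the sample values are illustrative only.

```python
def find_blank_cells(rows):
    """Return (row_index, column) pairs for every empty or missing cell."""
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column, value in row.items()
        if value is None or str(value).strip() == ""
    ]

def distinct_units(rows, unit_column="UNIT"):
    """Return the sorted set of units that appear in the data."""
    return sorted({row[unit_column] for row in rows})

# Hypothetical rows with one blank value and two different units.
rows = [
    {"Gas": "Carbon dioxide", "UNIT": "Thousand Tonnes", "VALUE": "120.0"},
    {"Gas": "Methane", "UNIT": "Tonnes", "VALUE": "450.0"},
    {"Gas": "Nitrous oxide", "UNIT": "Tonnes", "VALUE": ""},
]

print(find_blank_cells(rows))
print(distinct_units(rows))
```

More than one entry in the unit list is the signal that values cannot yet be compared directly.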

[Rajat Nag] 14:02:27 Okay, so I have… carbon dioxide.
[Rajat Nag] 14:02:27 好的,我有……二氧化碳。

[Rajat Nag] 14:02:31 Carbon dioxide from biomass, okay, that is different. Okay, hydro-

[Rajat Nag] 14:02:38 fluorocarbons, okay. Methane, and methane in CO2 equivalent. Okay, that's the one we actually need, CO2 equivalent.

[Rajat Nag] 14:03:06 Okay, that characterization factor for all. Okay, okay, okay.
[Rajat Nag] 14:03:06 好的,这个特征因子适用于所有。好的,好的,好的。

[Rajat Nag] 14:03:22 Give me two minutes to assimilate the new nature of the data.

[Rajat Nag] 14:03:31 Okay, so just some rationale. We have different kinds of gases in the atmosphere, and the greenhouse effect of a gas depends on three things. One is

[Rajat Nag] 14:03:50 the abundance of the gas in the atmosphere. For example, what are the most abundant gases in the atmosphere? Oxygen and nitrogen, right?

[Rajat Nag] 14:04:00 And carbon dioxide, perhaps. Then we have the lifetime of these gases in the atmosphere, so how long they can stay in the atmosphere.

[Rajat Nag] 14:04:12 For example, methane stays in its form for a much shorter period of time than nitrous oxide. I think nitrous oxide can stay more than 100 years or so. And methane, because of the decay, the gas actually breaks down into

[Rajat Nag] 14:04:33 different things, and different things means carbon and hydrogen, the basic constituents of methane.

[Rajat Nag] 14:04:40 And I think the lifetime is around 20 years or so.

[Rajat Nag] 14:04:45 So that is the lifespan of the gas in the atmosphere. And the third point is the potential of that gas as compared to carbon dioxide, so we convert everything in terms of carbon dioxide. If carbon dioxide can heat up the atmosphere by

[Rajat Nag] 14:05:01 one degree, the same amount of methane can heat up the atmosphere 27.9 times as much, okay? And the potential of nitrous oxide is, say, 298 or something.

[Rajat Nag] 14:05:19 So we have... sorry, different characterization factors. First things first: we cannot compare apples with oranges, we need to compare apples with apples. So I'm first interested only in the quantity. Carbon dioxide stays the same, because it needs no characterization factor.
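
The conversion being described is a single multiplication: mass of gas times its characterization factor. The sketch below uses the approximate factors quoted in this session (27.9 for methane, roughly 298 for nitrous oxide); published factors differ between assessment reports, so treat these numbers as placeholders rather than reference values.

```python
# Characterization factors as quoted in the lecture (approximate, assumed).
GWP = {"CO2": 1.0, "CH4": 27.9, "N2O": 298.0}

def to_co2_equivalent(gas, tonnes):
    """Convert a mass of gas (tonnes) to tonnes of CO2 equivalent."""
    return tonnes * GWP[gas]

def tonnes_to_thousand_tonnes(tonnes):
    """Bring a value in tonnes onto the 'thousand tonnes' scale."""
    return tonnes / 1000.0

# Hypothetical example: 450 tonnes of methane.
methane_co2e = to_co2_equivalent("CH4", 450.0)
print(tonnes_to_thousand_tonnes(methane_co2e), "thousand tonnes CO2 equivalent")
```

Only after this step are methane, nitrous oxide and carbon dioxide on the same scale and safe to add or compare.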

[Rajat Nag] 14:05:42 And this is also fine. So this is… of… Okay.
[Rajat Nag] 14:05:42 这也没问题。所以这是……的……好的。

[Rajat Nag] 14:05:54 Thousand tons.
[Rajat Nag] 14:05:54 千吨。

[Rajat Nag] 14:05:59 Is the value in thousand tons? Okay, this is fine, this is fine, this is fine, but it is not converted.

[Rajat Nag] 14:06:07 Yeah, it is converted to CO2 equivalent, so we have the value in CO2

[Rajat Nag] 14:06:15 equivalent, and everything should be in this form. Carbon dioxide is fine because it is the same thing.

[Rajat Nag] 14:06:23 Carbon dioxide from biomass, CO2 bio, so that is also fine.

[Rajat Nag] 14:06:29 Then we have hydrofluorocarbons, converted into CO2 equivalent, so that is also fine. Then we have methane. This methane we cannot consider,

[Rajat Nag] 14:06:40 because this is just tons, and we need methane converted to CO2 equivalent. That's why you can actually delete this. I know there are different values in between these cells because it's filtered, but still you can just

[Rajat Nag] 14:07:01 drag them and delete them.

[Soham Deshpande] 14:07:03 Sir, we can also unfilter methane in the filter option
[Soham Deshpande] 14:07:03 先生,我们也可以在过滤选项中取消过滤甲烷

[Rajat Nag] 14:07:08 Yes, we can do that, without deleting anything. So we can select here, instead of Select All: carbon dioxide, this is okay; this one was in CO2 equivalent, that is fine; methane in CO2 equivalent; and then nitrous oxide in CO2 equivalent.

[Rajat Nag] 14:07:31 So that is fine. And then, okay, that is also calculated already.
[Rajat Nag] 14:07:31 那很好。然后,好吧,那也已经计算过了。

[Rajat Nag] 14:07:40 And this is also CO2 equivalent, and total greenhouse gases in CO2 equivalent, yeah. So now we have filtered data like this. Please let me know if you have different values. These are actually zero values; I do not know why.

[Rajat Nag] 14:08:02 Perhaps they... okay, I got it. From agriculture it is very unlikely to have these gases, so we have only these gases. So you can also unfilter these perfluorocarbons

[Rajat Nag] 14:08:19 and sulphur hexafluoride.

[Rajat Nag] 14:08:26 This one and this one. That's fine now. So we have one, two; basically these two are the same gas, but biogenic and non-biogenic, I think; that is the classification of the two. And then we have hydrofluorocarbons, we have methane in CO2 equivalent,

[Rajat Nag] 14:08:46 we have nitrous oxide in CO2 equivalent, and total greenhouse gases, right? So now do one thing: we are only interested in this data, so just copy the entire thing.
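
The two selection rules used here (keep only series already in CO2 terms, and drop series that are zero throughout) can be sketched as below. The gas labels are assumptions modelled on the names read out in the session, and the values are invented.

```python
def is_comparable(label):
    """True for series already expressed in comparable CO2 terms."""
    return label.startswith("Carbon dioxide") or "CO2 equivalent" in label

def keep_comparable(rows):
    """Drop series reported in raw tonnes of a non-CO2 gas."""
    return [row for row in rows if is_comparable(row["Gas"])]

def drop_all_zero_series(rows):
    """Drop every gas whose values are zero in all rows."""
    nonzero = {row["Gas"] for row in rows if row["Value"] != 0}
    return [row for row in rows if row["Gas"] in nonzero]

# Hypothetical rows for a single year.
rows = [
    {"Gas": "Carbon dioxide", "Year": 2010, "Value": 120.0},
    {"Gas": "Methane", "Year": 2010, "Value": 450.0},
    {"Gas": "Methane in CO2 equivalent", "Year": 2010, "Value": 12555.0},
    {"Gas": "Perfluorocarbons in CO2 equivalent", "Year": 2010, "Value": 0.0},
]

selected = drop_all_zero_series(keep_comparable(rows))
print([row["Gas"] for row in selected])
```

As in the spreadsheet, nothing is deleted from `rows`; the filters only build a narrower view.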

[Rajat Nag] 14:08:59 So copying that way and then paste it in a new tab.
[Rajat Nag] 14:08:59 所以以那种方式复制,然后粘贴到一个新标签页中。

[Rajat Nag] 14:09:10 So that's the data we are interested in to run the data analysis.
[Rajat Nag] 14:09:10 所以这是我们感兴趣的数据,用于进行数据分析。

[Rajat Nag] 14:09:18 And you can
[Rajat Nag] 14:09:18 你可以

[Rajat Nag] 14:09:23 Use these borders. So please tell me if you are stuck you should have this thing in order to move further.
[Rajat Nag] 14:09:23 使用这些边界。所以请告诉我如果你卡住了,你应该有这个东西才能继续前进。

[Rajat Nag] 14:09:37 So, apart from the housekeeping, this data is now ready. Still, check whether there is any inconsistency in the text labels. With the filter we can see that these are all correct; otherwise a label would appear twice in the filter list.

[Rajat Nag] 14:10:06 Okay, so first thing first we'll check
[Rajat Nag] 14:10:06 好的,首先我们先检查一下

[Rajat Nag] 14:10:14 We can check... first, I'm just checking what I should do. Okay, so a summary stat can be a good one.

[Rajat Nag] 14:10:35 Okay, in order to do so, go to File, go to Options, go to Add-ins, and go to Excel Add-ins.

[Rajat Nag] 14:10:56 I have a special software, @RISK, so it's appearing here. Just select Analysis ToolPak and hit OK.

[Rajat Nag] 14:11:22 I can also do one thing for better visibility of the data. I can arrange it year-wise, for example. So this is the year, and I can put it here.

[Rajat Nag] 14:11:43 I can copy-paste the year. And then we have the gas, right?

[Rajat Nag] 14:11:48 So I can type the heading, carbon dioxide. And then we have the values, up to this one, right?

[Rajat Nag] 14:12:00 Just copy, and that's the value for carbon dioxide.

[Rajat Nag] 14:12:07 Just coloring things for better visibility.

[Ayse Nur Tonay] 14:12:14 Sorry, can you show the add-in part again? Yeah.
[Ayse Nur Tonay] 14:12:14 对不起,你能再展示一下插件部分吗?好的。

[Rajat Nag] 14:12:15 Yeah. Yeah, file.
[Rajat Nag] 14:12:15 是的。是的,文件。

[Rajat Nag] 14:12:20 Options, Add-ins, then Go, and then select Analysis ToolPak and hit OK. I am just coloring things for better understanding. I use a different software that can do this for me, but that's fine. Okay, so we have the same

[Ayse Nur Tonay] 14:12:25 Yeah.
[Ayse Nur Tonay] 14:12:25 是的。

[Ayse Nur Tonay] 14:12:30 Okay.
[Ayse Nur Tonay] 14:12:30 好的。

[Rajat Nag] 14:12:47 Number of years repeating, from 2010 to 2022. Same here; everything is the same, so we have consistent data. Next is this one, so just copy-pasting that. Then we'll have this gas after that.

[Rajat Nag] 14:13:12 And I have methane. Methane can be listed here.

[Rajat Nag] 14:13:20 And then nitrous oxide, and the total greenhouse gas emissions.

[Rajat Nag] 14:13:34 I can double-click to see the entire text. So the first one, carbon dioxide, is copied. Then you can copy the next column, for carbon dioxide from biomass.

[Rajat Nag] 14:13:55 So that's the carbon dioxide from biomass. Next, we can just copy this bit for the next gas.

[Rajat Nag] 14:14:05 Then you can go to the following one,

[Rajat Nag] 14:14:15 and then the remaining two. Okay, so once you have all the data in this good format,

[Rajat Nag] 14:14:27 you are good, and you can delete these extra columns and extra rows. Okay.
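
The manual copy-paste above turns a long list (one row per gas and year) into a wide table (one row per year, one column per gas). A rough sketch of the same rearrangement in code, with made-up field names and values:

```python
def pivot_by_year(rows):
    """Build a wide table: header row, then one row per year with one column per gas."""
    years = sorted({row["Year"] for row in rows})
    gases = []
    for row in rows:  # keep gases in first-seen order
        if row["Gas"] not in gases:
            gases.append(row["Gas"])
    lookup = {(row["Year"], row["Gas"]): row["Value"] for row in rows}
    table = [["Year"] + gases]
    for year in years:
        # Missing combinations become None, which makes gaps easy to spot.
        table.append([year] + [lookup.get((year, gas)) for gas in gases])
    return table

# Hypothetical long-format rows.
rows = [
    {"Year": 2010, "Gas": "CO2", "Value": 1.0},
    {"Year": 2011, "Gas": "CO2", "Value": 2.0},
    {"Year": 2010, "Gas": "CH4", "Value": 3.0},
    {"Year": 2011, "Gas": "CH4", "Value": 4.0},
]

for line in pivot_by_year(rows):
    print(line)
```

Each gas column then holds one value per year, which is the layout the descriptive statistics step expects.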

[Rajat Nag] 14:14:40 Now we can do some... yeah?

[Samkelisiwe Tengwa] 14:14:42 Sorry. What's the difference between the sheet four and the working data tab?
[Samkelisiwe Tengwa] 14:14:42 对不起。第四张表和工作数据标签之间有什么区别?

[Rajat Nag] 14:14:48 So the raw data and the working data are the same, but the working data has some filters. I filtered everything by sector, so I'm only interested in crop and animal production, hunting and so on,

[Rajat Nag] 14:15:02 that is, the influence of that sector on greenhouse gas emissions in Ireland.

[Samkelisiwe Tengwa] 14:15:03 Okay.
[Samkelisiwe Tengwa] 14:15:03 好的。

[Samkelisiwe Tengwa] 14:15:09 Okay.
[Samkelisiwe Tengwa] 14:15:09 好的。

[Rajat Nag] 14:15:09 Then I just copy-pasted the data related to that sector.

[Rajat Nag] 14:15:15 And these are different gases. I filtered for consistency of the data, so that every value I have is in CO2 equivalent. I had some data for methane with small values, which we would need to multiply by the

[Rajat Nag] 14:15:33 global warming potential of methane. So I filtered here: I excluded methane and included only methane in CO2 equivalent.

[Rajat Nag] 14:15:43 Okay, so that was the history. Then I used different colors to see the values for the different gases, and I copy-pasted and rearranged the data in this format.

[Samkelisiwe Tengwa] 14:15:59 Okay.
[Samkelisiwe Tengwa] 14:15:59 好的。

[Rajat Nag] 14:16:01 Okay. Now, once you have turned on the Analysis ToolPak, you can see Data Analysis in the Data tab; otherwise you cannot see this Data Analysis option in Excel.

[Rajat Nag] 14:16:21 All right. So hit Data Analysis and then select Descriptive Statistics.

[Rajat Nag] 14:16:33 So select or click on Data Analysis and select Descriptive Statistics.

[Rajat Nag] 14:16:41 This is a kind of summary stat for everything. Hit OK.

[Rajat Nag] 14:16:48 Then it asks for the input range. This is my input range: I am interested in this part, and I'm not interested in the first column, the year, because I'm not getting any trend from my descriptive statistics. For that I need to perform

[Rajat Nag] 14:17:08 time series forecasting. I'll also cover that.

[Soham Deshpande] 14:17:10 Sir, can you again show us what to do after data? We were copying the data.
[Soham Deshpande] 14:17:10 先生,您能再给我们演示一下在数据之后该怎么做吗?我们正在复制数据。

[Rajat Nag] 14:17:16 Okay, after that go to this tab data analysis and select descriptive statistics.
[Rajat Nag] 14:17:16 好的,之后去这个标签数据分析,选择描述性统计。

[Rajat Nag] 14:17:27 So that was the kind of objective of today's lecture.
[Rajat Nag] 14:17:27 所以这就是今天讲座的目标。

[Rajat Nag] 14:17:38 And in the next class we'll do those ANOVA and regression stuff okay just kind of a teaser for you.
[Rajat Nag] 14:17:38 在下一节课我们会做那些方差分析和回归的内容,好吧,这只是给你们的一个小预告。

[Rajat Nag] 14:17:51 So select Descriptive Statistics and hit OK, and then it will prompt you for the data, the input range. Select this arrow and you can select the entire data, but do not select the first column, because that's just the years.

[Rajat Nag] 14:18:10 Then you can click the arrow again and the data is selected. It says Labels in first row. I do have labels in the first row, so just tick that. And then it says New Workbook, or sorry,

[Rajat Nag] 14:18:31 New Worksheet Ply, which means a new tab in the same Excel file. Or you can select New Workbook, which means a new Excel file will open. I do not want that. I just need it in the next tab.

[Rajat Nag] 14:18:46 Then there are some choices about what you want from this. See here, we have Grouped By Columns, but without that copy-paste step you could also do it with Grouped By Rows. Okay. So for understanding, I just

[Rajat Nag] 14:19:07 copy-pasted everything with column headings, for the later plots. You can also do that with rows.

[Rajat Nag] 14:19:14 So our data is grouped by columns, the first row is the label, and I want my summary in the same Excel file as a new tab, so that's why this is selected. And then Summary statistics.

[Rajat Nag] 14:19:33 So in the summary stat, what are the things I want? I need that confidence

[Rajat Nag] 14:19:39 level for the mean, so 95% is perfect. That means, if there are outliers, explain everything in terms of a 95% confidence interval, with 95% confidence. You can also change that to 99, okay? Depending on your choice of,

[Rajat Nag] 14:20:05 you know, confidence. Then you can hit OK and you will see the output. As I double-clicked, you can see the text runs beyond the cells, so the best practice is to select the entire

[Rajat Nag] 14:20:24 Excel sheet, go to Home, and choose this option, Wrap Text.

[Rajat Nag] 14:20:33 What happens is, if I now select everything and shorten the width of this column, I can see all the text on two lines.

[Rajat Nag] 14:20:47 Otherwise, without wrapping the text, I cannot see everything in two lines, okay?

[Rajat Nag] 14:20:56 So see here
[Rajat Nag] 14:20:56 所以看看这里

[Rajat Nag] 14:21:02 Yeah, so this is the thing I do not like, and you can fine-tune it. See here: mean, standard deviation, the labels are appearing again and again. I don't need that.

[Rajat Nag] 14:21:15 So you can move this heading to carbon dioxide, then move the next one here, and move each heading on top of its values, and we are going to delete the repetition of the text, okay?

[Rajat Nag] 14:21:33 So I do not need this stuff.

[Rajat Nag] 14:21:40 Okay, so feel free to delete these columns.

[Rajat Nag] 14:21:46 So that's a very good summary stat. For your study.
[Rajat Nag] 14:21:46 这是一个非常好的总结统计数据。对于你的研究。
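
For readers who want to reproduce a summary table like this outside Excel, here is a minimal sketch with the Python standard library. The yearly values are hypothetical, the small table of t critical values covers only 10 to 12 degrees of freedom, and treating the spreadsheet's confidence-level row as the t-based half-width of the interval for the mean is an assumption, not something confirmed in the session.

```python
import math
import statistics
from collections import Counter

# Two-sided 95% critical values of Student's t for a few degrees of freedom.
T_CRIT_95 = {10: 2.228, 11: 2.201, 12: 2.179}

def describe(values):
    """Return summary statistics similar to a spreadsheet summary table."""
    n = len(values)
    sd = statistics.stdev(values)   # sample standard deviation
    se = sd / math.sqrt(n)          # standard error of the mean
    top_value, top_count = Counter(values).most_common(1)[0]
    t_crit = T_CRIT_95.get(n - 1)   # None if df is not in the small table
    return {
        "mean": statistics.mean(values),
        "standard_error": se,
        "median": statistics.median(values),
        "mode": top_value if top_count > 1 else None,  # None when nothing repeats
        "standard_deviation": sd,
        "variance": statistics.variance(values),
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "sum": sum(values),
        "count": n,
        "ci95_half_width": t_crit * se if t_crit is not None else None,
    }

# Hypothetical yearly values for one gas, 2010 to 2022 (13 observations).
emissions = [float(v) for v in range(100, 113)]
summary = describe(emissions)
for name, value in summary.items():
    print(name, value)
```

The mean plus or minus `ci95_half_width` gives the 95% interval for the mean; a mode of `None` corresponds to the "not applicable" entry when no value repeats.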

[Rajat Nag] 14:21:52 So if you have big data, within a click you can do the summary stat here, and you can report everything in your thesis or paper. One thing I must say: actually, you do not need the first... So these are all fine-tuning, okay? If you're not following everything that's fine, it is recorded.

[Rajat Nag] 14:22:13 So this is fine-tuning stuff. For a table it is always recommended to set No Border first, then add a bottom border at the bottom of the table, and then a top border and bottom border for the first

[Rajat Nag] 14:22:36 row. And you can also go to View and turn off the gridlines.

[Rajat Nag] 14:22:43 So now you can see this is the professional way you see the data summary stat presented in papers right Yes or no?
[Rajat Nag] 14:22:43 现在你可以看到,这就是你在论文中看到的数据摘要统计的专业方式,对吗?

[Rajat Nag] 14:22:57 Yeah, so you will see data or summary stats presented this way in many articles. Now, a few things. There is no mode. Why is that?

[Rajat Nag] 14:23:17 Why there is no mode?
[Rajat Nag] 14:23:17 为什么没有模式?

[Rajat Nag] 14:23:25 Anybody? Very good.
[Rajat Nag] 14:23:25 有人吗?很好。

[Caroline Lema] 14:23:26 Yeah, because then…
[Caroline Lema] 14:23:26 是的,因为那样…

[Ayse Nur Tonay] 14:23:26 Because there's no repetition.
[Ayse Nur Tonay] 14:23:26 因为没有重复。

[Rajat Nag] 14:23:33 So this, sorry, N/A is also a finding, so do not delete this row, okay? Not applicable means it's not possible to find that. And what I always do, and my recommendation is that you do it too: select everything,

[Rajat Nag] 14:23:50 and select this top option so that all the data is aligned left.

[Rajat Nag] 14:24:00 And be consistent with your
[Rajat Nag] 14:24:00 并保持一致性

[Rajat Nag] 14:24:08 Yeah. Now, you do not need this many digits after the decimal point. I hope you can reduce them; with this "not applicable" value I was not sure whether you can, but yes, you still can.

[Rajat Nag] 14:24:27 So three digits or maybe two digits, I'm happy either way. You can see the mean value, the standard error, the median value; the mode is not applicable; and then the standard deviation.

[Rajat Nag] 14:24:43 Then the sample variance: the square root of that is actually your standard deviation, so let's check.

[Rajat Nag] 14:24:53 SQRT, the square root of that, is your 184. That makes sense, right?
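
The same sanity check takes a few lines of code. The values below are hypothetical and are not the lecture's data.

```python
import math
import statistics

# Hypothetical values; any numeric column works the same way.
values = [100.0, 140.0, 90.0, 160.0, 120.0]

variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # sample standard deviation

# The square root of the sample variance should match the standard deviation.
print(math.isclose(math.sqrt(variance), std_dev))
```

Both functions use the n - 1 denominator, so they are the sample versions of the two statistics.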

[Rajat Nag] 14:25:01 Just say yes or no sometimes; it's kind of encouraging for me, so that I know you are following. Okay, so now, who will be the person to describe this kurtosis to me?

[Rajat Nag] 14:25:36 Any brave heart?

[Soham Deshpande] 14:25:45 Sir, can you again repeat what do we have to do after compiling all the data?
[Soham Deshpande] 14:25:45 先生,您能再重复一遍我们在整理完所有数据后需要做什么吗?

[Rajat Nag] 14:25:53 One second, can I just merge it?
[Rajat Nag] 14:25:53 等一下,我可以合并一下吗?

[Rajat Nag] 14:26:06 You're compiling all the data, meaning like this?

[Soham Deshpande] 14:26:11 No sir after the last step after we remove the grid lines
[Soham Deshpande] 14:26:11 不,先生,在最后一步之后,我们去掉网格线

[Rajat Nag] 14:26:15 That's fine. You can watch the recording. It's not major thing. So it's like I'm saying like for the kurtosis so What would be the interpretation so there are three kinds of the values right phi less than three adder than 3.
[Rajat Nag] 14:26:15 没关系。你可以观看录音。这不是大事。所以我在说峰度时,解释是什么,有三种值,对吧,phi 小于 3 和大于 3。

[Rajat Nag] 14:26:28 So I hope you can describe that. From from watching that from watching slide okay I have only i think 20 minutes left, right?
[Rajat Nag] 14:26:28 所以我希望你能描述一下。根据观看幻灯片的情况,我想我只剩下 20 分钟,对吗?

[Rajat Nag] 14:26:38 So it should be very quick now Yeah, so for this one, it's kind of this shape like the middle one Right? This one.
[Rajat Nag] 14:26:38 所以现在应该很快了。是的,对于这个,它有点像中间的那个形状,对吧?就是这个。

[Soham Deshpande] 14:26:39 Sir, the windows is less than three.
[Soham Deshpande] 14:26:39 先生,窗户少于三个。

[Rajat Nag] 14:26:49 Then we have some negative value or less than this less than three so all are platycur tick.
[Rajat Nag] 14:26:49 然后我们有一些负值或小于这个小于三,所以都是扁平的。

[Rajat Nag] 14:26:57 So platy means like the orange line.
[Rajat Nag] 14:26:57 所以 platy 的意思就像橙色线。

[Rajat Nag] 14:27:02 So we have few outliers. So that's fine. That's a good sign.
[Rajat Nag] 14:27:02 所以我们有一些异常值。这很好。这是个好兆头。
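
The three cases around the value 3 can be checked numerically. This is a minimal sketch, assuming the plain (Pearson) definition of kurtosis, where a normal distribution gives 3; the function names and both data lists are hypothetical. Note that Excel's summary statistics report, as far as I know, excess kurtosis with a sample correction, where the reference value is 0 rather than 3, which would explain the negative values mentioned above.

```python
def pearson_kurtosis(data):
    """Fourth central moment divided by the squared variance (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2


def classify(kurtosis, tolerance=1e-9):
    """Name the shape relative to the normal-distribution reference value of 3."""
    if abs(kurtosis - 3.0) <= tolerance:
        return "mesokurtic"
    return "platykurtic" if kurtosis < 3.0 else "leptokurtic"


# Hypothetical examples: an evenly spread series and one with two extreme values.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
peaked = [0, 0, 0, 0, 0, 0, 0, 0, 10, -10]

for name, series in (("flat", flat), ("peaked", peaked)):
    k = pearson_kurtosis(series)
    print(name, round(k, 3), classify(k))
```

The flat series has light tails and comes out platykurtic; the series with two extreme values has heavy tails and comes out leptokurtic.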

[Rajat Nag] 14:27:07 Okay, so… we need some visualization of this thing.
[Rajat Nag] 14:27:07 好的,所以…我们需要对这个东西进行一些可视化。

[Rajat Nag] 14:27:13 Sometimes the number so with this number you cannot make interpretation very fast okay for that what we should
[Rajat Nag] 14:27:13 有时候这个数字,所以用这个数字你无法很快做出解释,好吧,为此我们应该

[Rajat Nag] 14:27:27 That's summary stat.
[Rajat Nag] 14:27:27 那是总结统计。

[Rajat Nag] 14:27:37 You can select the entire thing and go to Insert, Recommended Charts, All Charts, and select... where is the box plot? Box plot.
[Rajat Nag] 14:27:37 你可以选中全部内容,然后依次进入插入、推荐的图表、所有图表,然后选择……箱线图在哪里?箱线图。

[Rajat Nag] 14:27:55 Okay. And then you can then use legends.
[Rajat Nag] 14:27:55 好的。然后你可以使用图例。

[Rajat Nag] 14:28:03 You can use axis titles. And perhaps the data labels will be too much.
[Rajat Nag] 14:28:03 你可以使用坐标轴标题。数据标签可能就太多了。

[Rajat Nag] 14:28:10 That's fine. Now we need to adjust this thing
[Rajat Nag] 14:28:10 没问题。现在我们需要调整这个东西。

[Rajat Nag] 14:28:21 Yeah, that's a problem actually we have the difference of huge range like See here, can you see one outlier?
[Rajat Nag] 14:28:21 是的,这实际上是个问题,我们有很大的差异。你看这里,你能看到一个异常值吗?

[Rajat Nag] 14:28:43 I think there is one outlier here for carbon dioxide extreme value Can you see it?
[Rajat Nag] 14:28:43 我认为这里有一个二氧化碳极值的异常值,你能看到吗?

[Samkelisiwe Tengwa] 14:28:55 Yes, we can.
[Samkelisiwe Tengwa] 14:28:55 是的,我们可以。
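
An extreme point like this can also be flagged numerically rather than by eye. This is a minimal sketch, assuming the common 1.5 × IQR rule (the lecture does not state which rule the chart uses) and a hypothetical series; `find_outliers` is an illustrative name.

```python
import statistics


def find_outliers(data, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical yearly values with one extreme reading (not the lecture's data).
co2 = [40, 42, 41, 43, 39, 44, 42, 90]

outliers = find_outliers(co2)
print(outliers)
```

The quartile method (`inclusive` here) is a choice; different tools interpolate quartiles slightly differently, so borderline points may be classified differently.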

[Rajat Nag] 14:28:56 Yeah. So there is another way to identify that. Set No Fill, then you can select the entire range, go to Conditional Formatting, and do Color Scales from, like, low to high or high to low. So from this color code you can see
[Rajat Nag] 14:28:56 是的。还有另一种方法可以识别它。先设为无填充,然后选中整个范围,进入条件格式,使用色阶,比如从低到高或从高到低。这样从颜色编码你就可以看到

[Rajat Nag] 14:29:21 Which one is the highest one so for here i think The lowest is the greenest and the highest is this one so you can you I think you can do the same thing like for all.
[Rajat Nag] 14:29:21 这里哪个是最高的,我认为最低的是最绿色的,最高的是这个,所以你可以,我认为你可以对所有的做同样的事情。

[Rajat Nag] 14:29:35 Oh, that's the problem. It treats everything as the same range, so you need to do the same thing for each column separately: select, Conditional Formatting, Color Scales, the second one; Conditional Formatting, Color Scales, the second one.
[Rajat Nag] 14:29:35 哦,问题就在这里。它把所有内容都当作同一个范围,所以你需要对每一列分别做同样的操作:选中,条件格式,色阶,第二个;条件格式,色阶,第二个。

[Rajat Nag] 14:29:53 Conditional formatting color scale to the second one okay
[Rajat Nag] 14:29:53 条件格式化颜色刻度改为第二个可以
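
Why the color scale has to be applied column by column can be seen with a small calculation. This is a minimal sketch, assuming a color scale is a linear map from minimum to maximum; the column names and numbers are hypothetical.

```python
def scale_column(values):
    """Map a column onto 0..1 using that column's own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant column has no spread to color
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical columns on very different scales (not the lecture's data).
table = {
    "CO2": [100.0, 150.0, 200.0],
    "CH4": [10000.0, 12000.0, 11000.0],
}

# One shared scale over the whole table: the small column collapses to the low end.
all_values = [v for column in table.values() for v in column]
lo_all, hi_all = min(all_values), max(all_values)
shared = {
    name: [(v - lo_all) / (hi_all - lo_all) for v in column]
    for name, column in table.items()
}

# One scale per column: every column uses its full color range.
per_column = {name: scale_column(column) for name, column in table.items()}

print(shared["CO2"])
print(per_column["CO2"])
```

With the shared scale, every CO2 cell would get almost the same color; with a per-column scale, the highest year in each column stands out.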

[Rajat Nag] 14:30:05 All right so you can identify the highest value of like which year so in 2022 So we have the highest value for total greenhouse gas emissions.
[Rajat Nag] 14:30:05 好的,所以你可以确定哪个年份的最高值,所以在 2022 年,我们的温室气体排放总量达到了最高值。

[Rajat Nag] 14:30:16 Or perhaps not in 2021. Okay, just after the COVID.
[Rajat Nag] 14:30:16 或许在 2021 年并不是这样。好吧,就在 COVID 之后。

[Rajat Nag] 14:30:22 So things are not good, right? So we can do one time series forecasting here. So…
[Rajat Nag] 14:30:22 所以情况不好,对吧?所以我们可以在这里做一个时间序列预测。所以……

[Rajat Nag] 14:30:34 How to do that time series forecasting? We can do it with a moving average over different numbers of years.
[Rajat Nag] 14:30:34 如何做时间序列预测呢?我们可以用不同年数的移动平均来做。

[Rajat Nag] 14:30:42 So this is the trend these are the number of years these are the total greenhouse gas emission So if I just make a plot If I just… make a plot like
[Rajat Nag] 14:30:42 所以这是趋势,这些是年份,这些是总温室气体排放量。所以如果我只是做一个图表,如果我只是……做一个图表像

[Rajat Nag] 14:31:02 Like this chart
[Rajat Nag] 14:31:02 像这个图表

[Rajat Nag] 14:31:10 And in order to see the effect properly. So I do not have any data beyond 15,000.
[Rajat Nag] 14:31:10 为了正确地看到效果。所以我没有超过 15,000 的数据。

[Rajat Nag] 14:31:18 I should write 15,000 here. Okay.
[Rajat Nag] 14:31:18 我应该在这里写 15,000。好的。

[Rajat Nag] 14:31:24 And perhaps further down to 19,000.
[Rajat Nag] 14:31:24 或许进一步下跌到 19,000。

[Rajat Nag] 14:31:29 Your data your uh uh Lino.
[Rajat Nag] 14:31:29 你的数据,你的呃呃 Lino。

[Rajat Nag] 14:31:34 This kind of curve should capture 70 to 90% of the entire area of the plot, so adjust your chart properly. And you do not need to write the title, because you are going to describe the figure at the bottom of that figure, right?
[Rajat Nag] 14:31:34 这种曲线应该占到整个绘图区域的 70% 到 90%,所以请把图表调整好。你不需要写标题,因为你会在图的下方描述这张图,对吧?

[Rajat Nag] 14:31:52 And this is some housekeeping stuff which I like; you will not necessarily like it. So I do not use the outline.
[Rajat Nag] 14:31:52 这些是我个人喜欢的一些整理习惯,你们不一定会喜欢。所以我不使用轮廓线。

[Rajat Nag] 14:32:00 So I use no outline But for the inside rectangle i use the black color And for the entire text i use black i'm going a little bit fast so that you can follow later this is just ornamentation you will get the same score but for
[Rajat Nag] 14:32:00 所以我不使用轮廓,但对于内部矩形我使用黑色,对于整个文本我也使用黑色。我稍微快一点,这样你可以稍后跟上,这只是装饰,你会得到相同的分数,但对于

[Rajat Nag] 14:32:18 You know, for any journal publication, although sometimes I have seen people going for making plots in R or Python, just for the sake of, you know, looking better as compared to the Excel one.
[Rajat Nag] 14:32:18 你知道,对于任何期刊发表,虽然有时我看到人们选择用 R 或 Python 作图,只是为了看起来比 Excel 的图更好。

[Rajat Nag] 14:32:30 You can also delete these grid lines. But you can make a good
[Rajat Nag] 14:32:30 你也可以删除这些网格线。不过你其实可以做出一张好的

[Rajat Nag] 14:32:39 graph, actually, in Excel. So this is kind of a standard curve, you can see, and also you can adjust the tick marks, so go to, I think, perhaps Line... One second. What is the option for tick marks? Okay.
[Rajat Nag] 14:32:39 图表,其实就在 Excel 里。所以你可以看到,这算是一种标准曲线,另外你还可以调整刻度线,所以进入,我想可能是“线条”……等一下。刻度线的选项在哪里?好的。

[Rajat Nag] 14:32:59 Yeah, so labels: next to axis.
[Rajat Nag] 14:32:59 是的,标签:位于轴旁。

[Rajat Nag] 14:33:05 Okay, tick marks. Major type outside, like this, and this one outside. And what else?
[Rajat Nag] 14:33:05 好的,刻度线。主刻度线类型设为外部,像这样,这个也设为外部。还有什么?

[Rajat Nag] 14:33:16 I think that's a good one. And if you want, you can add data labels, and you can see now it's a messy one. So you can change them: select everything, go to Number, and right now it is General.
[Rajat Nag] 14:33:16 我认为这样不错。如果你愿意,可以添加数据标签,你可以看到现在有点乱。所以你可以修改它们:全选,进入“数字”,现在它的格式是“常规”。

[Rajat Nag] 14:33:33 So you need to convert that thing to number and perhaps you do not need a decimal place So you can put zero.
[Rajat Nag] 14:33:33 所以你需要把那个东西转换成数字,也许你不需要小数位,所以你可以填零。

[Rajat Nag] 14:33:43 So that's the kind of cleaner version but still you need to adjust few things manually so for example if there's a there's a cut So you can move your text that way so that things cannot get cut okay So you should read the text without any difficulty.
[Rajat Nag] 14:33:43 这就是一种更简洁的版本,但你仍然需要手动调整一些东西,比如如果有剪切的话,你可以那样移动你的文本,以便不会被剪切。好的,所以你应该能够毫无困难地阅读文本。

[Rajat Nag] 14:34:06 So that's an example of a good way of presenting your data. Data analysis is important presenting data is also important okay
[Rajat Nag] 14:34:06 这就是一个很好地展示数据的例子。数据分析很重要,展示数据也很重要,好吗。

[Rajat Nag] 14:34:25 Okay, so that was the trend of the data; we have some fluctuation. Now we need to predict what it would be in, say, 2023. For example, I wanted to check what it would be in
[Rajat Nag] 14:34:25 好的,这就是数据的趋势,我们看到有一些波动。现在我们需要预测,比如说,2023 年的值。例如,我想看看下面这一年会是多少:

[Rajat Nag] 14:34:42 2023.

[Rajat Nag] 14:34:50 So I can go with three year moving average but you can do the same thing with five years so for example three year moving average P, ER.
[Rajat Nag] 14:34:50 所以我可以使用三年移动平均,但你也可以用五年移动平均,比如三年移动平均 P,ER。

[Rajat Nag] 14:35:05 Moving average. And then perhaps I'll do one thing for you.
[Rajat Nag] 14:35:05 移动平均。然后也许我会为你做一件事。

[Rajat Nag] 14:35:11 So it can be a five-year moving average.
[Rajat Nag] 14:35:11 所以也可以是五年移动平均。

[Rajat Nag] 14:35:19 So what it does you need to average out the first three years so average average the value of first three years and then hit enter and you can do the same thing beyond your last point.
[Rajat Nag] 14:35:19 所以它的作用是你需要对前三年进行平均,所以平均前三年的值,然后按回车,你可以在最后一点之后做同样的事情。

[Rajat Nag] 14:35:39 So the last point is average of these last three years okay And with the five-year moving average you cannot start beyond this line.
[Rajat Nag] 14:35:39 所以最后一点是这三年的平均值,好吗?而且五年移动平均线不能超出这一线。

[Rajat Nag] 14:35:51 One, two, three, four, five. So for the sixth year, you can take the average of years one, two, three, four, five, and report that for the sixth year, and then you can drag it for this. So based on
[Rajat Nag] 14:35:51 一二三四五 所以在第六年之后,你可以对一二三四五进行平均计算,并可以将其报告为第六年,然后你可以将其拖动到这里。因此,基于

[Rajat Nag] 14:36:10 The three-year moving average So these are your answer actually.
[Rajat Nag] 14:36:10 三年移动平均值 所以这实际上是你的答案。

[Rajat Nag] 14:36:19 Are you getting any interest or is kind of boring stuff
[Rajat Nag] 14:36:19 你有兴趣吗,还是觉得这有点无聊?

[Rajat Nag] 14:36:28 Silence can… can be interpreted in many ways.
[Rajat Nag] 14:36:28 沉默可以…可以有很多种解读。

[Rajat Nag] 14:36:35 So here but like that's why I said like to watch it pause and watch.
[Rajat Nag] 14:36:35 所以在这里,但这就是我说的,暂停并观看。

[Soham Deshpande] 14:36:35 So it is a little bit hard to understand.
[Soham Deshpande] 14:36:35 所以这有点难以理解。

[Rajat Nag] 14:36:44 So it's kind of a prediction So that's why we are recording this year because i have only two slots the entire thesis module.
[Rajat Nag] 14:36:44 所以这有点像预测。这就是我们今年录制的原因,因为我在整个论文模块中只有两个时间段。

[Rajat Nag] 14:36:53 So this was for 2022. So we cannot predict without some prediction from the past, correct?
[Rajat Nag] 14:36:53 所以这是关于 2022 年的。所以我们不能在没有过去一些预测的情况下进行预测,对吗?

[Rajat Nag] 14:37:01 Yes or no? So that's why a three-year moving average means I need to take the average of the last three years and predict for the fourth year.
[Rajat Nag] 14:37:01 是还是不是?所以这就是为什么三年移动平均意味着我需要从过去三年中计算平均值,并预测第四年。

[Soham Deshpande] 14:37:04 Yes.
[Soham Deshpande] 14:37:04 是的。
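
The moving-average forecast described above can be sketched in a few lines. This is a minimal sketch, assuming the forecast for a year is simply the mean of the preceding `window` years; the function name and the numbers are hypothetical, not the lecture's emissions data.

```python
def moving_average_forecast(series, window):
    """Forecast each position from the mean of the `window` values before it.

    The first forecast is for position `window`; the last one is one step
    beyond the end of the data.
    """
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series) + 1)
    ]


def forecast_years(years, window):
    """Years that the forecasts refer to, including the next unseen year."""
    return years[window:] + [years[-1] + 1]


# Hypothetical totals for five years (not the lecture's data).
years = [2018, 2019, 2020, 2021, 2022]
totals = [10.0, 12.0, 11.0, 13.0, 15.0]

three_year = moving_average_forecast(totals, 3)
five_year = moving_average_forecast(totals, 5)

print(list(zip(forecast_years(years, 3), three_year)))
print(list(zip(forecast_years(years, 5), five_year)))
```

With five years of data, the three-year window gives forecasts for 2021 to 2023, while the five-year window can only produce the single forecast for 2023, which is why a longer window needs a longer record.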

[Rajat Nag] 14:37:13 And with that, you can actually make a kind of uh graph and which can actually you can actually plot in that curve but that's fine so you can actually actually predict you can actually plot or you can So this is for year 2023 right
[Rajat Nag] 14:37:13 这样的话,你实际上可以制作一种图表,并且你实际上可以在那个曲线上绘制,但没关系,所以你实际上可以预测,你实际上可以绘制,或者你可以。所以这是 2023 年,对吧

[Rajat Nag] 14:37:43 But this is for 2023. And you… Yeah, so for example if i want to plot now so Insert.
[Rajat Nag] 14:37:43 但这是针对 2023 年的。你… 是的,比如说如果我现在想绘图,就插入。

[Rajat Nag] 14:37:56 Recommended okay that's fine. Insert scatter plot right click select data so first data is first data is
[Rajat Nag] 14:37:56 推荐好的,没问题。插入散点图,右键选择数据,所以第一组数据是第一组数据是

[Rajat Nag] 14:38:09 So it can be your prediction with three years, so the series name can be this one; x values can be only these values, only these values.
[Rajat Nag] 14:38:09 所以这可以是你的三年预测,系列名称可以是这个,x 值只取这些值,只取这些值。

[Rajat Nag] 14:38:25 I'm sorry yeah yeah and y values can be values only these values okay for the three year and hit ok and you can add with the same curve series name is now your five year moving average x values can be only this.
[Rajat Nag] 14:38:25 对不起,是的,y 值只能是这些值,好的,三年的值,点击确定,你可以用相同的曲线系列名称添加,现在是你的五年移动平均,x 值只能是这个。

[Rajat Nag] 14:38:44 And y values can be
[Rajat Nag] 14:38:44 y 值可以是

[Rajat Nag] 14:38:49 These values okay so this is the comparison of your three-year moving average and five year moving average so you can see here with the five-year moving average it's kind of a straight line But this is the three-year moving average so you have a kind of pattern still
[Rajat Nag] 14:38:49 这些值很好,所以这是你三年移动平均和五年移动平均的比较。你可以看到,五年移动平均呈现出一种直线的趋势,但这是三年移动平均,所以你仍然有一种模式

[Rajat Nag] 14:39:10 Which is not that kind of you know that kind of ups and down it's kind of a smooth line.
[Rajat Nag] 14:39:10 这并不是那种你知道的起伏,而是一条平滑的线。

[Rajat Nag] 14:39:17 So if I add So if I change actually change chart type to the similar one So you can see here this is the actual data So this is the actual data.
[Rajat Nag] 14:39:17 所以如果我更改图表类型为类似的类型,你可以看到这里是实际数据。所以这是实际数据。

[Rajat Nag] 14:39:36 And this is your, if I insert the legend So that's your
[Rajat Nag] 14:39:36 这是你的,如果我插入图例,那么这就是你的

[Rajat Nag] 14:39:53 Let's say the blue one is your three year moving average so it's the the trend is a little bit smoother And then the five-year is kind of a straight line.
[Rajat Nag] 14:39:53 假设蓝色的是你的三年移动平均线,所以趋势会稍微平滑一些,然后五年线就像是一条直线。

[Rajat Nag] 14:40:02 Okay, so based on different kinds of, you know, number of years like moving average. So if you have a large data, you have data for 100 years, you can do five year moving average, you can do 10 year moving average and see what are the predictions okay
[Rajat Nag] 14:40:02 好的,所以基于不同种类的,比如说,移动平均的年数。如果你有大量数据,比如说有 100 年的数据,你可以做五年移动平均,也可以做十年移动平均,看看预测结果。

[Rajat Nag] 14:40:18 Okay, next is some other plots: error plots. These are very important for your thesis work.
[Rajat Nag] 14:40:18 好的,接下来是另一些图:误差图,这些对你的论文工作非常重要。

[Rajat Nag] 14:40:31 And for that
[Rajat Nag] 14:40:31 为此

[Rajat Nag] 14:40:37 This legends actually you can actually
[Rajat Nag] 14:40:37 这个图例,其实你可以

[Rajat Nag] 14:40:42 Yeah, that's fine. Okay, now we are trying to calculate a few values, such as the fifth percentile, the mean value, and the 95th percentile.
[Rajat Nag] 14:40:42 是的,没问题。好的,现在我们要计算几个值,例如第 5 百分位数、均值和第 95 百分位数。

[Rajat Nag] 14:41:06 And then negative error and positive error, to develop a custom error plot, okay? So the fifth percentile of a data set is: equals PERCENTILE, and you select the entire data set, comma, we are interested in the fifth percentile, so that is zero point
[Rajat Nag] 14:41:06 然后是负误差和正误差,用来制作自定义误差图,好吗?所以数据集的第 5 百分位数是:等于 PERCENTILE,然后选中整个数据集,逗号,我们关心的是第 5 百分位数,所以是零点

[Rajat Nag] 14:41:34 05. Okay. And then for the mean it is AVERAGE, the average of the entire data set. And then we have the 95th percentile: PERCENTILE again, of the entire data set, comma, this is the 95th percentile, 0.95, and then bracket close.
[Rajat Nag] 14:41:34 05。好的。然后均值是 AVERAGE,即整个数据集的平均值。然后是第 95 百分位数:同样用 PERCENTILE,选中整个数据集,逗号,这是第 95 百分位数,0.95,然后右括号。

[Rajat Nag] 14:42:01 And negative error is nothing but: equals mean value minus fifth percentile value. And positive error is always the higher one, like the 95th percentile or the 99th percentile, based on your data and accuracy level. If you want to have the 99th percentile, that is also fine. So the higher
[Rajat Nag] 14:42:01 负误差就是:等于均值减去第 5 百分位数的值。而正误差总是取较高的那个,比如第 95 或第 99 百分位数,取决于你的数据和精度要求。如果你想用第 99 百分位数,也可以。所以用较高的

[Rajat Nag] 14:42:25 interval minus your mean value. So that's the positive error, and once you have that, you can drag it for all gases, okay? Now, how to plot? I mean, these things you already have in your summary stat, but this is just a kind of
[Rajat Nag] 14:42:25 间隔减去你的平均值。所以这是正误差,一旦你有了这个,你可以将其拖动到所有气体上。好的,现在如何绘制,如何,我的意思是这些东西你已经在你的摘要统计中有了,但这只是一种
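
The error-bar arithmetic above can be written out as follows. This is a minimal sketch, assuming linear interpolation between ranked values (which, to my understanding, is how Excel's PERCENTILE behaves); `percentile_inc` is a hypothetical helper and the data are made up.

```python
def percentile_inc(data, p):
    """Percentile with linear interpolation between ranked values, p in 0..1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    ordered = sorted(data)
    rank = (len(ordered) - 1) * p          # fractional position in the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical series for one gas (not the lecture's data).
values = [10.0, 20.0, 30.0, 40.0, 50.0]

p05 = percentile_inc(values, 0.05)
mean = sum(values) / len(values)
p95 = percentile_inc(values, 0.95)

# Custom error bars: distances from the mean down to P5 and up to P95.
negative_error = mean - p05
positive_error = p95 - mean

print(p05, mean, p95, negative_error, positive_error)
```

The two errors are equal here only because the made-up data are symmetric; for skewed data the bar above the mean and the bar below it will have different lengths.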

[Rajat Nag] 14:42:46 visualization of your data, okay? So select the gases, like the top row, and, pressing Control, select the mean values, okay? And then go to Insert, Recommended Charts, All Charts, Column chart.
[Rajat Nag] 14:42:46 你的数据可视化没问题。所以选择像第一行那样的气体,然后按住控制键选择平均值,好的,然后去插入推荐图表,所有图表,柱状图。

[Rajat Nag] 14:43:07 So we have a column chart. For all the gases.
[Rajat Nag] 14:43:07 所以我们有一个柱状图。针对所有气体。

[Rajat Nag] 14:43:12 Now we want a kind of an interval so these are the all mean values Correct. So always.
[Rajat Nag] 14:43:12 现在我们想要一种区间,所以这些都是平均值。正确。所以总是。

[Rajat Nag] 14:43:21 Use the axis title So these are all tons of CO2 equivalent right If I'm not wrong.
[Rajat Nag] 14:43:21 使用轴标题 所以这些都是二氧化碳当量的吨数,对吧 如果我没记错的话。

[Rajat Nag] 14:43:31 Or thousand tons, right?
[Rajat Nag] 14:43:31 还是千吨,对吗?

[Rajat Nag] 14:43:36 Thousand tons okay yeah Thousand tons CO2 equivalent So this is
[Rajat Nag] 14:43:36 千吨,好吧,千吨二氧化碳当量。所以这是

[Rajat Nag] 14:43:50 Okay.
[Rajat Nag] 14:43:50 好的。

[Rajat Nag] 14:43:58 Think about it, okay? And these are the
[Rajat Nag] 14:43:58 想想看,好吗?这些是

[Rajat Nag] 14:44:05 Greenhouse gases and the total. So the total is this one Why I'm plotting the total as well you can see the contribution of methane which is the most followed by nitrous oxide.
[Rajat Nag] 14:44:05 温室气体和总量。所以总量就是这个。我为什么也要绘制总量,你可以看到甲烷的贡献是最多的,其次是氧化亚氮。

[Rajat Nag] 14:44:21 So are you happy with the interpretation that methane is the hotspot followed by the nitrous oxide and then carbon dioxide?
[Rajat Nag] 14:44:21 那么你对甲烷是热点,其次是氧化亚氮,然后是二氧化碳的解释感到满意吗?

[Rajat Nag] 14:44:30 Yes or no?
[Rajat Nag] 14:44:30 是还是不是?

[Rajat Nag] 14:44:36 Perfect. Thank you. So in order to make that error plot, so the next step is Selecting these kind of columns go to chart design go to add chart element go to error bars Oh, wait, there should be custom error bar.
[Rajat Nag] 14:44:36 完美。谢谢。那么为了制作那个误差图,下一步是选择这些列,进入图表设计,添加图表元素,选择误差线。哦,等等,应该有自定义误差线。

[Rajat Nag] 14:45:02 Error bars, yeah, More Error Bars Options.
[Rajat Nag] 14:45:02 误差线,对,更多误差线选项。

[Rajat Nag] 14:45:06 So go to custom and then specify value so it says what is the positive error value We already calculated that.
[Rajat Nag] 14:45:06 所以去自定义,然后指定值,这样它就会显示正误差值是什么。我们已经计算过了。

[Rajat Nag] 14:45:17 Equals positive errors so positive error is So select the entire row okay for negative error bars select the entire negative error.
[Rajat Nag] 14:45:17 等于正误差,所以正误差是。所以选择整行,好的,对于负误差条选择整个负误差。

[Rajat Nag] 14:45:35 Okay. And you perhaps cannot see the lower end of those. So you can do one thing: you can go to Chart Design, sorry, Format, Shape Fill, and select a light color so that you can see the bounds of this negative error and positive error, okay?
[Rajat Nag] 14:45:35 好的。你可能看不到它们的下端。所以你可以这样做:进入图表设计,抱歉,是格式、形状填充,选择一个浅色,这样你就能看到负误差和正误差的边界了,好吗?

[Rajat Nag] 14:46:00 So that's it. That's the error bar and then we can plot some histograms You can do individual histograms so that's better i hope I cannot have multiple histogram
[Rajat Nag] 14:46:00 就这样。这就是误差条,然后我们可以绘制一些直方图。你可以做单独的直方图,这样更好。我希望我不能有多个直方图。

[Rajat Nag] 14:46:18 I do not know. Yeah, I cannot have like selecting everything So for histogram say for example for the last one I can select the data and go to go insert and then go insert histogram.
[Rajat Nag] 14:46:18 我不知道。是的,我不能选择所有内容。所以以直方图为例,对于最后一个,我可以选择数据,然后去插入,然后再插入直方图。

[Rajat Nag] 14:46:33 In histogram you can see we have only two bins so you can actually change format axes and you can actually select the number of bins say for example i'm interested in five bins. So there will be five category So we have five defined ranges of the entire
[Rajat Nag] 14:46:33 在直方图中,你可以看到我们只有两个区间,所以你实际上可以更改格式轴,并且你可以选择区间的数量,比如说我对五个区间感兴趣。所以将会有五个类别,因此我们有整个数据的五个定义范围。

[Rajat Nag] 14:46:52 Data set and data is distributed like this so it is not a normal distribution Because we have very limited amount of data.
[Rajat Nag] 14:46:52 数据集和数据分布如下,因此它不是正态分布,因为我们的数据量非常有限。
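
What choosing "five bins" does to the data can be sketched directly. This is a minimal sketch, assuming equal-width bins that include their left edge (Excel's chart may assign values on a bin edge differently); the function name and data are hypothetical.

```python
def histogram_counts(data, bins):
    """Count values in `bins` equal-width intervals spanning min(data)..max(data)."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The maximum value would land in bin index `bins`; keep it in the last bin.
        index = min(int((x - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical small sample (not the lecture's data).
sample = [1, 2, 2, 3, 5, 8, 9, 10, 10, 11]

counts = histogram_counts(sample, 5)
print(counts)
```

With so few points the counts are lumpy and two-peaked rather than bell-shaped, which is the kind of picture that suggests the sample should not be treated as normally distributed.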

[Rajat Nag] 14:47:00 So that is also called non-parametric data. So this terminology is important: non-parametric data.
[Rajat Nag] 14:47:00 这也被称为非参数数据。所以这个术语很重要:非参数数据。

[Rajat Nag] 14:47:11 Because when you select the choice of coefficient for the Sensitivity analysis or the correlation coefficient For parametric data set.
[Rajat Nag] 14:47:11 因为当你选择灵敏度分析或参数数据集的相关系数的系数选项时。

[Rajat Nag] 14:47:24 Which is data that is normally distributed, you can use Pearson's correlation coefficient. Otherwise, we have to select Spearman's rank-order coefficient for the non-parametric data set.
[Rajat Nag] 14:47:24 即数据服从正态分布时,可以使用皮尔逊相关系数。否则,对于非参数数据集,我们必须选择斯皮尔曼等级相关系数。
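
The difference between the two coefficients shows up clearly on a relationship that always increases but not in a straight line. This is a minimal sketch, assuming Spearman's coefficient is computed as Pearson's coefficient on ranks (with average ranks for ties); the helper names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y always increases with x, but far from linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Spearman's coefficient is 1 because the ordering is perfectly preserved, while Pearson's coefficient is pulled down by the single very large value.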

[Rajat Nag] 14:47:38 Write it down or just write it down follow my video recording and just write it down then okay then we can do what else we can do something else correlation. I hope that would be your last one okay time series done correlation
[Rajat Nag] 14:47:38 写下来或者跟着我的视频录制写下来,然后好吧,我们可以做其他的事情,相关性。我希望这会是你最后一个,好吧,时间序列完成相关性。

[Rajat Nag] 14:47:58 So for correlation it's similar way data data analysis now you can you can actually play around everything okay For histograms and all you can also make from here So I leave a few things for you.
[Rajat Nag] 14:47:58 所以对于相关性,它的方式与数据分析类似,现在你可以实际操作一切。好的,对于直方图等,你也可以从这里制作。所以我留了一些东西给你。

[Rajat Nag] 14:48:13 So for correlation you can select this, and in a similar way select the input, this is your input. And… then, column labels in the first row. I don't want a new workbook; I want it in this workbook as a new tab. And select OK.
[Rajat Nag] 14:48:13 所以对于相关性,你可以选择这个,以类似的方式选择输入,这就是你的输入。然后……列标签位于第一行。我不要新工作簿,我要它放在这个工作簿里,作为一个新的工作表标签。然后点确定。

[Rajat Nag] 14:48:38 Boom. So you can see the correlation again I do not want that thing to be stretched that far.
[Rajat Nag] 14:48:38 轰。所以你可以再次看到相关性,我不想让那个东西被拉得那么远。

[Rajat Nag] 14:48:48 I can select entire thing go to home wrap text and if i
[Rajat Nag] 14:48:48 我可以选择整个内容,去主页,换行文本,如果我

[Rajat Nag] 14:48:58 Uh-oh.
[Rajat Nag] 14:48:58 哦哦。

[Rajat Nag] 14:49:04 I have a better plot here. Okay.
[Rajat Nag] 14:49:04 我这里有一个更好的图。好的。

[Rajat Nag] 14:49:08 What is wrong with the previous ones?
[Rajat Nag] 14:49:08 前面的那些有什么问题?

[Rajat Nag] 14:49:17 Yeah, I could not see the first few columns anyway Yeah, that's sorted now.
[Rajat Nag] 14:49:17 是的,我反正看不到前几列。是的,现在已经解决了。

[Rajat Nag] 14:49:27 Also you can see I have many digits after
[Rajat Nag] 14:49:27 你也可以看到我后面有很多数字

[Rajat Nag] 14:49:36 And I actually leave this one as 1, not 1.0. That's why I'm not selecting everything.
[Rajat Nag] 14:49:36 我实际上把这个留作 1,而不是 1.0。这就是我不选择所有内容的原因。

[Rajat Nag] 14:49:46 Who can explain the correlation? Anybody?
[Rajat Nag] 14:49:46 谁能解释一下相关性?有人吗?

[Rajat Nag] 14:49:53 And then you can select the entire thing Or perhaps.
[Rajat Nag] 14:49:53 然后你可以选择整个内容,或者也许。

[Rajat Nag] 14:50:01 You can select the entire thing. And go to conditional formatting and then you can select this one okay So tell me, this is our main Target.
[Rajat Nag] 14:50:01 你可以选择整个内容。然后去条件格式设置,然后你可以选择这个,好吗?所以告诉我,这是我们的主要目标。

[Rajat Nag] 14:50:15 Total greenhouse gas emission. So tell me, without looking at this one (I know you have done this one already), if you just look at this correlation, right?
[Rajat Nag] 14:50:15 温室气体总排放量。所以告诉我,不看这个(我知道你们已经做过这个了),如果只看这个相关性,对吧

[Rajat Nag] 14:50:42 Correlation. So just rename the tab; it is easier for you. So these are some of your plots.
[Rajat Nag] 14:50:42 相关性。所以把工作表标签重命名一下,这样对你更方便。这些是你的一些图。

[Rajat Nag] 14:50:54 And others. What was there here? Oh, that's the data.
[Rajat Nag] 14:50:54 还有其他人。这里有什么?哦,那是数据。

[Rajat Nag] 14:51:00 Okay, so which gas is most correlated with the total? The correlation coefficient can vary from minus one to one. Minus means negatively correlated, okay? And positive means positively correlated. So you can see here the one: one means, like, the correlation between this parameter and this same parameter; it is always one, right?
[Rajat Nag] 14:51:00 好的,那么哪种气体与总量的相关性最大,因此相关系数可以从负一到正一变化。负数意味着负相关。好的,正数意味着正相关,所以你可以看到这里的一个,意味着这个参数和这个参数之间的相关性总是为一,对吧。
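
A correlation table like this one can be rebuilt by hand to see where the diagonal of ones and the ranking against the total come from. This is a minimal sketch with hypothetical gas series (not the lecture's data), in which the total is simply the sum of the three gases.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient, always between -1 and 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical yearly series; HFC falls while the larger gases rise.
series = {
    "CH4": [10.0, 12.0, 11.0, 14.0, 15.0],
    "N2O": [5.0, 6.0, 5.0, 6.0, 7.0],
    "HFC": [4.0, 3.0, 4.0, 2.0, 2.0],
}
series["Total"] = [sum(year_values) for year_values in zip(*series.values())]

names = list(series)
matrix = {a: {b: pearson(series[a], series[b]) for b in names} for a in names}

# Rank the individual gases by their correlation with the total.
ranking = sorted(
    (name for name in names if name != "Total"),
    key=lambda name: matrix[name]["Total"],
    reverse=True,
)

for name in ranking:
    print(name, round(matrix[name]["Total"], 3))
```

In this made-up example the small HFC series correlates negatively with the total even though it is part of the total, simply because it declines in the years when the dominant gases increase. That is one possible reading of a negative entry in such a table, not a confirmed explanation for the lecture's data.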

[Rajat Nag] 14:51:22 Because this is carbon dioxide and this is carbon dioxide. Next, say for example, methane, and here we also have methane. So that doesn't make any sense, but this is the way of presentation.
[Rajat Nag] 14:51:22 因为这个是二氧化碳,这个也是二氧化碳。接下来,比如甲烷,这里也是甲烷。所以这本身没什么意义,但这就是呈现的方式。

[Rajat Nag] 14:51:33 So the highest, other than this one, is 0.99, correct? Yes or no?
[Rajat Nag] 14:51:33 所以除了这个之外,最高的是 0.99,对吗?是还是不是?

[Samkelisiwe Tengwa] 14:51:43 Yes.
[Samkelisiwe Tengwa] 14:51:43 是的。

[Rajat Nag] 14:51:43 So methane is the most impactful gas in order to have my total greenhouse gases.
[Rajat Nag] 14:51:43 所以甲烷是对我的总温室气体影响最大的气体。

[Rajat Nag] 14:51:54 Make sense? So that is also coming here right methane And then followed by Which one?
[Rajat Nag] 14:51:54 有道理吗?所以这也在这里对吧,甲烷 然后接着是哪个?

[Samkelisiwe Tengwa] 14:52:06 Yes, it does.
[Samkelisiwe Tengwa] 14:52:06 是的,确实如此。

[Samkelisiwe Tengwa] 14:52:08 Oxide.
[Samkelisiwe Tengwa] 14:52:08 氧化物。

[Rajat Nag] 14:52:08 Perfect. 0.9. So that is here also, nitrous oxide. So now you know. And then for negative correlation, the highest is, I think, this one.
[Rajat Nag] 14:52:08 完美。0.9。所以这里也是氧化亚氮。现在你知道了,然后负相关最高的是我认为这个。

[Rajat Nag] 14:52:26 Which means… So that is correlated it is not correlated to the total one so with With total, we have always Oh, we have one. Okay. We have, yeah, this one So…
[Rajat Nag] 14:52:26 这意味着……所以它与总的没有相关性,所以与总的相比,我们总是哦,我们有一个。好的。我们有,是的,这个。所以……

[Rajat Nag] 14:52:45 Hydrofluorocarbon in CO2 equivalent I do not know why it is coming as Negative.
[Rajat Nag] 14:52:45 以二氧化碳当量计的氢氟碳化物,我不知道为什么它会显示为负值。

[Rajat Nag] 14:52:52 So perhaps with some characterization factor and all so This is something, could you please let me know why this is coming as negative?
[Rajat Nag] 14:52:52 所以也许通过一些特征因子等等。这是一个问题,您能告诉我为什么会出现负值吗?

[Rajat Nag] 14:53:00 In terms of the total, it says, like, if I introduce this one, perhaps it is a substitution for other gases like methane.
[Rajat Nag] 14:53:00 就总量而言,它的意思是,如果我引入这个,或许它替代了甲烷之类的其他气体。

[Rajat Nag] 14:53:10 I'm not sure. So that's the question from my end for next class so if If there is something fishy you need to explain. Perhaps it is not fishy okay So you need to question if you cannot explain things so you can look back to the
[Rajat Nag] 14:53:10 我不确定。所以这是我对下一节课的问题,如果有什么可疑的地方你需要解释。也许这并不可疑,好吧。所以你需要质疑,如果你无法解释事情,你可以回顾一下。

[Rajat Nag] 14:53:26 raw data and see. So it should be all positively correlated. It means, like, if I increase the carbon dioxide, the total greenhouse gas will be higher.
[Rajat Nag] 14:53:26 原始数据看一看。所以它们应该都是正相关的。意思是,如果我增加二氧化碳,温室气体总量就会更高。

[Rajat Nag] 14:53:35 So all positively correlated. So highest is methane followed by nitrous oxide.
[Rajat Nag] 14:53:35 所以都是正相关的。最高的是甲烷,其次是氧化亚氮。

[Rajat Nag] 14:53:40 Then this one carbon dioxide and this one and last is this And then… I don't think anything is left.
[Rajat Nag] 14:53:40 然后这个二氧化碳和这个,最后是这个。然后……我觉得没有什么剩下的了。

[Rajat Nag] 14:53:50 And you are kind of fried at this point. But anyway, I have only four hours in the entire module. So thank you very much for joining.
[Rajat Nag] 14:53:50 你们现在大概已经很累了。不过无论如何,我在整个模块里只有四个小时。非常感谢大家的参与。

[Rajat Nag] 14:54:00 If you have any questions so you can send me an email
[Rajat Nag] 14:54:00 如果你有任何问题,可以给我发邮件