Ten short guidelines for clear thinking and collaborative truth-seeking, followed by extensive discussion of what exactly they mean and why Duncan thinks they're important defaults.
This post examines the virtues of hope, optimism, and trust. It is meant mostly as an exploration of what other people have learned about these virtues, rather than as me expressing my own opinions about them, though I’ve been selective about what I found interesting or credible, according to my own inclinations. I wrote this not as an expert on the topic, but as someone who wants to learn more about it. I hope it will be helpful to people who want to know more about these virtues and how to nurture them.
These virtues have in common a sort of “look on the bright side” / “expect the best” approach to life. But there are a number of ways to interpret this, and if...
Not really. Robert Anton Wilson's description is more on-point:
Let me differentiate between scientific method and the neurology of the individual scientist. Scientific method has always depended on feedback [or flip-flopping as the Tsarists call it]; I therefore consider it the highest form of group intelligence thus far evolved on this backward planet. The individual scientist seems a different animal entirely. The ones I've met seem as passionate, and hence as egotistic and prejudiced, as painters, ballerinas or even, God save the mark, novelists. My hop...
There's an analogy between the Zurich r/changemyview curse of evals and the METR/Epoch curse of evals. You do this dubiously ethical measuring/elicitation project (dubious according to more US-pilled IRBs, or to more paranoid/purist AI safety advocates) because you think the world deserves to know. But to get there, you had to run dubiously ethical experiments on unconsenting redditors / help labs improve capabilities. The catch is, you only come out net positive if the world chooses to act on this information.
I’ve been thinking recently about what sets apart the people who’ve done the best work at Anthropic.
You might think that the main thing that makes people really effective at research or engineering is technical ability, and among the general population that’s true. Among people hired at Anthropic, though, we’ve restricted the range by screening for extremely high-percentile technical ability, so the remaining differences, while they still matter, aren’t quite as critical. Instead, people’s biggest bottleneck eventually becomes their ability to get leverage—i.e., to find and execute work that has a big impact-per-hour multiplier.
For example, here are some types of work at Anthropic that tend to have high impact-per-hour, or a high impact-per-hour ceiling when done well (of course this list is extremely non-exhaustive!):
and computer use started off as a fraction of one person’s time,
Curious when this was. When was the earliest point that someone was working on computer use? When was the latest point that ONLY one person was?
Suppose that I have some budget set aside for philanthropic funding, say $1,000, but I think there are big returns to scale, so that it would be >1,000x better if I had $1,000,000.[1]
What are my best options for some bets I can make to get some chance of turning that $1,000 into $1,000,000?
I imagine the bets are generally better the more positive their EV is (which in practice means the less negative the EV is), the easier they are to find (like if they're standardised), and the better their tax treatment is, especially if you repeat them multiple times (maybe they can be done by DAFs).
NB: I don't believe this
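For intuition, here's a quick sketch (my own arithmetic, not part of the question) of why the EV constraint bites: for an all-or-nothing bet, the expected value pins down the chance of reaching the target, so a fair bet gives at most a 1-in-1,000 shot and any house edge only lowers it.

```python
# Illustrative arithmetic for an all-or-nothing bet: stake the full bankroll,
# win the target with probability p, else end with nothing.
# EV = p * target - bankroll, so fixing EV as a fraction of the bankroll
# determines p exactly.
bankroll, target = 1_000, 1_000_000

def win_probability(ev_fraction: float) -> float:
    """Win probability of an all-or-nothing bet whose EV is
    ev_fraction * bankroll:  p = (1 + ev_fraction) * bankroll / target."""
    return (1 + ev_fraction) * bankroll / target

print(win_probability(0.0))    # fair bet: 0.001
print(win_probability(-0.05))  # 5% house edge: 0.00095
```

On the post's premise, a fair bet exactly breaks even at 1,000x returns to scale, so a bet with a house edge needs correspondingly larger returns to scale to be worth taking.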
This is a follow-up to last week's D&D.Sci scenario: if you intend to play that, and haven't done so yet, you should do so now before spoiling yourself.
There is a web interactive here you can use to test your answer, and generation code available here if you're interested, or you can read on for the ruleset and scores.
Each good is assigned a value for tax purposes:
| Good | Value |
| --- | --- |
| Cockatrice Eye | 6gp |
| Dragon Head | 14gp |
| Lich Skull | 10gp |
| Unicorn Horn | 7gp |
| Zombie Arm | 2gp |
Depending on the total value of all goods you have, you determine a tax bracket:
| Total Value | Tax Rate |
| --- | --- |
| <30gp | 20% |
| 30-59gp | 30% |
| 60-99gp | 40% |
| 100-299gp | 50% |
| 300gp+[1] | 60% |
Your taxes due are equal to your Tax Rate multiplied by the total value of your goods.
So if you have two Lich Skulls (20gp), your tax rate is 20% and you will owe 4gp of taxes.
If you have three Lich...
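For concreteness, here's a minimal sketch of the ruleset as stated above (the names are mine, and I'm assuming the bracket rate applies to the full value rather than marginally, consistent with the worked example):

```python
# Values of each good for tax purposes, per the table above.
GOOD_VALUES = {
    "Cockatrice Eye": 6,
    "Dragon Head": 14,
    "Lich Skull": 10,
    "Unicorn Horn": 7,
    "Zombie Arm": 2,
}

def tax_rate(total_value: int) -> float:
    """Return the tax rate implied by the total value of goods in gp."""
    if total_value < 30:
        return 0.20
    elif total_value < 60:
        return 0.30
    elif total_value < 100:
        return 0.40
    elif total_value < 300:
        return 0.50
    else:
        return 0.60

def taxes_due(goods: list[str]) -> float:
    """Taxes owed: the bracket rate times the total value of all goods."""
    total = sum(GOOD_VALUES[g] for g in goods)
    return tax_rate(total) * total

# Two Lich Skulls: 20gp total, 20% bracket -> 4gp due (matches the example).
print(taxes_due(["Lich Skull", "Lich Skull"]))  # 4.0
```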
Should we expect the future to be good? This is an important question for many reasons. One such reason is that the answer to this question has implications for what our intermediate goals should be. If we should expect the future to be good, then it would be relatively more important for us to focus on ensuring that we survive long into the future, e.g. by working on mitigating extinction risks. If we should not expect the future to be good, then it would be relatively more important for us to focus on mitigating risks of astronomical suffering.
In this paper, I critique Paul Christiano's (2013) argument that the future will be good. In Section 2, I reconstruct Christiano's argument in premise form and articulate some simplifying...
EDIT: Read a summary of this post on Twitter
Working in the field of genetics is a bizarre experience. No one seems to be interested in the most interesting applications of their research.
We’ve spent the better part of the last two decades unravelling exactly how the human genome works and which specific letter changes in our DNA affect things like diabetes risk or college graduation rates. Our knowledge has advanced to the point where, if we had a safe and reliable means of modifying genes in embryos, we could literally create superbabies. Children that would live multiple decades longer than their non-engineered peers, have the raw intellectual horsepower to do Nobel prize worthy scientific research, and very rarely suffer from depression or other mental health disorders.
The scientific establishment,...
Standard deviations are used to characterize the spread around the mean of a normal distribution -- they are not intended to characterize the tails. This is why discussion around them tends to focus on 1-2 SDs, where the bulk of the data is, and rarely 3-4 SDs -- it is rare to have data of sufficient size or low enough noise to support meaningful interpretation of even 4 SDs in the real world.
So in practice, using precise figures like 5, 7, or 20 SDs is misleading, because the tails aren't usually sufficiently characterized (and it certainly isn't w...
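To put numbers on this (my own back-of-the-envelope calculation, not the commenter's), here are the normal upper-tail probabilities at a few SD levels, and roughly how many samples you'd need to expect even one observation that far out:

```python
import math

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in [2, 3, 4, 5, 7]:
    p = upper_tail(z)
    print(f"{z} SD: P = {p:.3e}, ~{1 / p:,.0f} samples per expected observation")
```

At 4 SDs you'd already need tens of thousands of clean samples, at 5 SDs millions, and at 7 SDs on the order of a trillion, which is why precise figures that far out outrun any real-world dataset.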
Every now and then, some AI luminaries
I agree with (1) and strenuously disagree with (2).
The last time I saw something like this, I responded by writing: LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem.
Well, now we have a second entry in the series, with the new preprint book chapter “Welcome to the Era of Experience” by...
I think this is revealing some differences of terminology and intuitions between us. To start with, in the §2.1 definitions, both "goal misgeneralization" and "specification gaming" (a.k.a. "reward hacking") can be associated with "competent pursuit of goals we don't want", whereas you seem to be treating "goal misgeneralization" as a competent thing and "reward hacking" as harmless but useless. And "reward hacking" is broader than wireheading.
...For example, if the AI forces the user into eternal cardio training on pain of death, and accordingly the reward
I wish this were called "Duncan's Guidelines for Discourse" or something similar. I like most of the guidelines given, but they aren't consensus. And while I support Duncan's right to block people on his posts (and agree with him more than with the people he blocked, as far as discourse norms go), it does mean that people who disagree with his rules can't state their case in the comments. That feels like an unbalanced playing field to me.