Johnson
The language instinct
ChatGPT’s way with words raises questions about how humans acquire language
It has reignited a debate over the ideas of Noam Chomsky, the world’s most famous linguist
[Paragraph 1]
WHEN DEEP BLUE, a chess computer, defeated Garry Kasparov, a world champion, in 1997, many gasped in fear of machines triumphing over mankind.
In the intervening years, artificial intelligence has done some astonishing things, but none has managed to capture the public imagination in quite the same way.
Now, though, the astonishment of the Deep Blue moment is back, because computers are employing something that humans consider their defining ability: language.
[Paragraph 2]
Or are they? Certainly, large language models (LLMs), of which the most famous is ChatGPT, produce what looks like impeccable human writing.
But a debate has ensued about what the machines are actually doing internally, what it is that humans, in turn, do when they speak—and, inside the academy, about the theories of the world’s most famous linguist, Noam Chomsky.
[Paragraph 3]
Although Professor Chomsky’s ideas have changed considerably since he rose to prominence in the 1950s, several elements have remained fairly constant.
He and his followers argue that human language is different in kind (not just degree of expressiveness) from all other kinds of communication.
All human languages are more similar to each other than they are to, say, whale song or computer code.
Professor Chomsky has frequently said a Martian visitor would conclude that all humans speak the same language, with surface variation.
[Paragraph 4]
Perhaps most notably, Chomskyan theories hold that children learn their native languages with astonishing speed and ease despite “the poverty of the stimulus”: the sloppy and occasional language they hear in childhood.
The only explanation for this can be that some kind of predisposition for language is built into the human brain.
[Paragraph 5]
Chomskyan ideas have dominated the linguistic field of syntax since their birth.
But many linguists are strident anti-Chomskyans. And some are now seizing on the capacities of LLMs to attack Chomskyan theories anew.
[Paragraph 6]
Grammar has a hierarchical, nested structure involving units within other units. Words form phrases, which form clauses, which form sentences and so on.
Chomskyan theory posits a mental operation, “Merge”, which glues smaller units together to form larger ones that can then be operated on further (and so on).
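To make "Merge" concrete, here is a toy sketch in Python (the function and example sentence are ours, purely illustrative, not Chomsky's formal system): each call welds two units into a larger one, and the outputs can feed back in as inputs, which is exactly what produces the nested, hierarchical structure described above.

```python
def merge(left, right):
    """Combine two syntactic units into a single larger unit."""
    return (left, right)

# Build "the cat chased the mouse" bottom-up. Each merged unit can
# itself be merged again, so nesting falls out automatically.
np1 = merge("the", "cat")       # ('the', 'cat')
np2 = merge("the", "mouse")     # ('the', 'mouse')
vp = merge("chased", np2)       # ('chased', ('the', 'mouse'))
sentence = merge(np1, vp)
print(sentence)
# (('the', 'cat'), ('chased', ('the', 'mouse')))
```

The printed result is a tree, not a flat string: the hierarchy, not the left-to-right order, is what the theory treats as fundamental.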
In a recent New York Times op-ed, the man himself (now 94) and two co-authors said “we know” that computers do not think or use language as humans do, referring implicitly to this kind of cognition.
LLMs, in effect, merely predict the next word in a string of words.
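"Predicting the next word" can itself be made concrete. Below is a deliberately crude sketch, assuming nothing beyond the Python standard library (the corpus and function names are invented for illustration): count which word follows which in a training text, then predict the most frequent successor. Real LLMs replace the counting with a neural network trained on vast corpora, but the objective, guessing the next token, is the same in spirit.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for w1, w2 in zip(words, words[1:]):
        counts[w1][w2] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent successor of `word`, or None."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # 'cat', its most common successor
```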
[Paragraph 7]
Yet it is hard, for several reasons, to fathom what LLMs “think”.
Details of the programming and training data of commercial ones like ChatGPT are proprietary.
And not even the programmers know exactly what is going on inside.
[Paragraph 8]
Linguists have, however, found clever ways to test LLMs’ underlying knowledge, in effect tricking them with probing tests.
And indeed, LLMs seem to learn nested, hierarchical grammatical structures, even though they are exposed to only linear input, ie, strings of text.
They can handle novel words and grasp parts of speech.
Tell ChatGPT that “dax” is a verb meaning to eat a slice of pizza by folding it, and the system deploys it easily: “After a long day at work, I like to relax and dax on a slice of pizza while watching my favourite TV show.” (The imitative element can be seen in “dax on”, which ChatGPT probably patterned on the likes of “chew on” or “munch on”.)
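One common probing technique, though not necessarily the one used in the studies cited here, is the minimal pair: present a model with a grammatical sentence and a nearly identical ungrammatical one, and check which it finds more probable. A sketch, assuming the Hugging Face transformers library and the freely downloadable GPT-2 model (the example pair is ours):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Score a sentence by the model's mean per-token cross-entropy loss:
# lower loss means the model finds the word sequence more probable.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def score(sentence):
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return loss.item()

# Subject-verb agreement across an intervening phrase: only the
# hierarchical subject ("keys") should control the verb ("are").
good = "The keys to the cabinet are on the table."
bad = "The keys to the cabinet is on the table."
print(score(good) < score(bad))  # True if the model prefers the grammatical one
```

Agreement across an intervening phrase is a natural test of hierarchy: the linearly nearest noun ("cabinet") is singular, so a model that tracked only adjacent words would wrongly prefer "is".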
[Paragraph 9]
What about the “poverty of the stimulus”?
After all, GPT-3 (the LLM underlying ChatGPT until the recent release of GPT-4) is estimated to be trained on about 1,000 times the data a human ten-year-old is exposed to.
That leaves open the possibility that children have an inborn tendency to grammar, making them far more proficient than any LLM.
In a forthcoming paper in Linguistic Inquiry, researchers claim to have trained an LLM on no more text than a human child is exposed to, finding that it can use even rare bits of grammar.
But other researchers have tried to train an LLM on a database of only child-directed language (that is, of transcripts of carers speaking to children).
Here LLMs fare far worse. Perhaps the brain really is built for language, as Professor Chomsky says.
[Paragraph 10]
It is difficult to judge. Both sides of the argument are marshalling LLMs to make their case.
The eponymous founder of his school of linguistics has offered only a brusque riposte.
For his theories to survive this challenge, his camp will have to put up a stronger defence.
(Congratulations on finishing; this article uses a vocabulary of roughly 800 English words.)
Original article: The Economist, Culture section, April 29th 2023.
Reading notes by: 自由英语之路
Translated and compiled by: Irene
Edited and proofread by: Irene
For personal English-learning and exchange use only.
【Supplementary Material】(from the internet)
Noam Chomsky is an American linguist and the founder of transformational-generative grammar, born on December 7th 1928 in Philadelphia, Pennsylvania. He has advanced influential ideas across many fields, including language acquisition, political theory and media analysis, but his standing rests on his academic contribution: he laid the foundations of modern linguistics and is often called the father of the discipline. He proposed the "language acquisition device", the idea that humans are born with a capacity for acquiring language that lets children pick up their mother tongue quickly and easily, as well as the theory of "universal grammar", which holds that all languages share certain basic grammatical rules encoded in the human brain.
The "Deep Blue moment" refers to a landmark event in chess on May 11th 1997, when IBM's supercomputer Deep Blue beat Garry Kasparov, then the world's top-ranked grandmaster, over a six-game match, becoming the first computer to defeat a reigning human world chess champion. The match marked a major breakthrough for computing in artificial intelligence, prompted a rethinking of the relationship between computers and human intelligence, and sharpened awareness of the challenges and risks AI brings, such as protecting data privacy and avoiding algorithmic discrimination.
"Poverty of the stimulus" is a linguistic term describing how children manage to learn language without sufficient linguistic input. The concept was first proposed by the linguist Noam Chomsky. On this view, the human capacity for language rests on innate linguistic knowledge: the brain comes equipped with a mechanism, the "language acquisition device", that lets people master the structure and rules of a language naturally. The core claim is that the grammar of a natural language is far richer than the actual input children receive; learning from a limited sample of speech, they face a "poverty of the stimulus". That children nonetheless master their mother tongue's grammar easily and naturally suggests an inborn faculty for language learning.
【Key Sentences】(3)
Now, though, the astonishment of the Deep Blue moment is back, because computers are employing something that humans consider their defining ability: language.
Grammar has a hierarchical, nested structure involving units within other units. Words form phrases, which form clauses, which form sentences and so on.
That leaves open the possibility that children have an inborn tendency to grammar, making them far more proficient than any LLM.