diff --git a/_posts/2024-01-21-insight-and-the-limits-of-language.md b/_posts/2024-01-21-insight-and-the-limits-of-language.md
index 6ed56e5ecc..0dcee9bb1a 100644
--- a/_posts/2024-01-21-insight-and-the-limits-of-language.md
+++ b/_posts/2024-01-21-insight-and-the-limits-of-language.md
@@ -6,7 +6,7 @@ category: Jekyll
 layout: post
 ---
 
-Like almost everyone else on the planet right now, I've been excitedly following the development of large language models (LLMs) like that ChatGPT. A friend recently showed me a video of Ilya Sutskever explaining some of the major concepts underlying LLMs, and the belief that they present a path to the development of truly intelligent systems. The core idea is that *'text is a projection of the world'*. That is, that events in the world have all sorts of statistical relationships, and that these relationships are reflected in the statistics of text. People often disparagingly criticise LLMs of being simply next-token predictors that can mimic the text they are trained on. But what Ilya and other LLM proponents emphasise is that through having a good statistical understanding of text, LLMs must also have an understanding of the world this text refers to.
+Like almost everyone else on the planet right now, I've been excitedly following the development of large language models (LLMs) like ChatGPT. A friend recently showed me a [video](https://youtu.be/8dMOdz2rcSI?feature=shared&t=706) of Ilya Sutskever (a cofounder of OpenAI) explaining some of the major concepts underlying LLMs, and the belief that they present a path to the development of truly intelligent systems. The core idea is that *'text is a projection of the world'*. That is, events in the world have all sorts of statistical relationships, and these relationships are reflected in the statistics of text. People often disparagingly criticise LLMs for being simply next-token predictors that can mimic the text they are trained on. But what Ilya and other LLM proponents emphasise is that, by acquiring a good statistical understanding of text, LLMs must also acquire an understanding of the world this text refers to.
 
 An example here can be useful. If I say 'I threw a ball in the air and it came back down and hit the ...', you might guess that the next word is 'ground'. An LLM could predict 'ground' simply on the basis of similar sentences it has been trained on, or alternatively, it could have learnt through its training some representation of the world which includes an understanding of gravity and of how balls move, and of the layout of the world, in which the ground typically lies underneath things in the air. The latter case sounds much more like an intelligent system. A good world model enables an LLM to make sensible predictions about scenarios it has never encountered in text before, which is a sure hallmark of true intelligence.
 
@@ -27,6 +27,8 @@ must be made alone: if it can't be communicated meaningfully, it can only be don
 There is a striking similarity here with Buddhist practice. The Buddha famously rejected many philosophical questions, and instead encouraged his followers simply to practice. In his book 'Jhana Consciousness: Buddhist Meditation in the Age of Neuroscience', Paul Dennison gives a beautiful description of the Buddhist path. Our everyday waking consciousness is permeated by language: we all experience a constant voice in our heads describing all of our experiences as they happen.
 The Buddha presented a precise training program to step outside of this everyday waking consciousness. Through years of mindfulness and meditation, one can cultivate states known as the Jhanas, in which all internal voices cease. These states lie completely outside the realm of description, and therefore can only be understood through direct experience. Their purpose is to develop true insight into the nature of the world. Insight differs from knowledge in that it lies beyond language and therefore cannot be shared. It is, in a sense, devoid of internal structure, consisting instead of atoms that come to us in a flash. And it is insight that really matters. Attempts to put insight into language are doomed to fail, and so language acts only as a barrier to insight. This is why we must cultivate states outside language, in which we can patiently wait for these flashes to come to us.
 
+![image](assets/superhans.jpg)
+
 This, to me, is fundamentally why I don't find the discussion around LLMs and other forms of AI particularly appealing. While it is true that LLMs can learn valid models of empirical facts about the world, and in this sense are intelligent, the stuff that actually matters is probably missing.
 
 I want to quickly clarify that I am using language as a particular example of a representation of the world, but all of these discussions do in fact apply to *any* representation. In neuroscience, we model how information about the world, entering our nervous system through the senses, gets processed to form compressed representations of the world, manifested in patterns of neural activity. Much neuroscience research revolves around studying how relationships between different internal representations reflect relationships between states in the world: a very Wittgensteinian research program. These representations are multi-modal, reflecting the fact that we receive information through a multitude of different senses. In everyday waking consciousness, we experience these representations as sounds, images, language, and other modalities. Sometimes they are very closely tethered to our senses, in which case we take them to be actual experiences of the world. Other times they take on a more dreamy or hallucinatory quality, reflecting experiences of the world we are not currently having (in Buddhism this is called 'mind', which is seen as an additional sense). And one of the things that makes Jhana consciousness impossible to talk about is that it lies beyond any representation of the external world - it is a state 'excluded from the senses'.