Hannah Arendt wrote about the trial of Adolf Eichmann, a top-ranking Nazi, in Jerusalem after World War II. She argued that Eichmann's evil was an example of the banality of evil, or at least, I think that's what she meant. I never understood what she meant by "Banality of Evil." But does my LLM understand what Hannah Arendt meant by "Banality of Evil"?
Without the context of the trial, you and I might interpret "Banality of Evil" in various ways. Large Language Models run into the same problem if they don't know the context. And by context I mean the real context of the outside world: not just words and written history, but also images, videos, and audio. In other words, both we and the LLMs need to see what Hannah Arendt saw at the Eichmann trial, to be grounded in the real world, in order to truly understand her.
Natural language is not transparent; it is heavily context-dependent, so there are many ways "Banality of Evil" can be interpreted. Because we cannot even establish whether the possible interpretations of a text are finite or infinite, we cannot check, in finite time, that its meaning exactly equals some other meaning. Therefore, we cannot check whether an LLM's meaning of "Banality of Evil" matches Arendt's. The paper linked at the end of this post shows this limitation: without grounding LLMs in something other than text, we cannot verify that they truly understand meaning. In other words, for a language with variables, where "Banality of Evil" behaves like a variable, we can neither prove that the set of possible interpretations is finite, nor assert with a computer that the LLM's understanding of "Banality of Evil" aligns with Hannah Arendt's.
What do we mean by a language with variables? You should agree with me that English is a variable language: you must have reached for the Oxford Dictionary now and then to understand, perhaps, what Hamlet was saying in his soliloquies. By contrast, a language built only from integer constants, such as 1 + 1, is transparent; you can assert, for example, that 1 = 0 + 1 or that 1 + 1 = 2 and check that the meanings match. On the other hand, (X + 1) is a language with a variable. Since X might take infinitely many possible values, the language is no longer transparent, and checking the correctness of a meaning is not computable, because we don't know whether there are finitely or infinitely many options for X. Therefore, we cannot test whether the meanings match. It is like constantly uncovering new layers of meaning: there is always more to explore and understand, potentially leading to Kant's "infinite regress," where you can always delve deeper into the analysis.
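As a rough illustration of the difference (this is my own toy sketch, not a construction from the paper; the same_meaning helper and the sample-based check are hypothetical, introduced only for this post), the transparent case can be settled by evaluating both sides, while the variable case can only be spot-checked:

```python
# Transparent language: only integer constants, so equality of meaning can be
# checked by evaluating both sides. This terminates and gives a definite answer.
assert 1 == 0 + 1
assert 1 + 1 == 2

# Language with a variable X: "X + 1" has no single value to evaluate. The best
# a purely evaluation-based check can do is try some sample values of X.
def same_meaning(expr_a, expr_b, samples):
    """Spot-check whether two expressions in X agree on the given sample values.

    Passing this check is evidence, not proof: it rules out mismatches only on
    the values of X we happened to try, not on every possible X.
    """
    return all(expr_a(x) == expr_b(x) for x in samples)

# Agrees on these samples, but that alone does not settle the meaning for all X.
print(same_meaning(lambda x: x + 1, lambda x: (x + 2) - 1, range(100)))  # True
```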
If you have taken some CS, you might remember the concept of compilers. A compiler translates human-readable source code into instructions the CPU can execute. From the paper, you can learn how to check whether a meaning is correct through assertions: for example, you can assert that 1 = 1, and the check will return true. Similarly, you can think of our brains as compilers reading written code, whether Python, SQL, Java, or C. We compile the code in our heads to understand the order of execution that will happen on the computer once the code is actually compiled and run. The person writing the code is imagining, and trying to assert, what the real Python compiler would do, agreeing with the computer on the meaning of each line as they write it.
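Here is a minimal sketch of that line-by-line agreement (my own example; the word_count function and the specific assertions are hypothetical, chosen just for illustration): while writing the code, I predict what each piece should mean, write the prediction down as an assert, and let the machine confirm or reject it.

```python
def word_count(sentence):
    # My mental model: splitting on whitespace yields one item per word.
    return len(sentence.split())

# Predictions of meaning, written down as assertions. Running them is the
# computer and me agreeing (or disagreeing) on what the code means.
assert word_count("the banality of evil") == 4
assert word_count("") == 0  # an edge case I also predicted in my head
print("All assertions passed: my mental compilation matched the machine's.")
```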
Unfortunately, checking whether an LLM understands "Banality of Evil," a variable in our English language, is not computable in finite time, just as you and I might differ in our understanding of "Banality of Evil" because we differ on the meaning of a word like "justice." By grounding LLMs in the outside world through images, audio, and video, we could check, or assert, whether an LLM's understanding of "Banality of Evil" matches Hannah Arendt's.
For further reading, check out "Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?" at https://arxiv.org/abs/2104.10809