HN Debrief

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

  • AI
  • Philosophy
  • Research
  • Programming

The paper builds a deliberately absurd example. It shows how Age of Empires II can encode logic and computation, then argues that if an LLM running through that game would not deserve human-like interpretations, a normal LLM may not either. The intended target is a style of AI writing that treats fluent text as evidence for mentality.

Most readers agreed with the broad warning and rejected the paper itself. They said the authors mixed up three things that need to stay separate: the computational process, the physical or game substrate carrying it out, and the interface humans use to read outputs. Once you separate those, the headline result collapses into a banal point: Turing-complete systems can implement the same computation. The stronger critique was that the paper never shows why harder-to-read outputs, like goats or game entities, change what can be inferred about the underlying system. If an AoE II implementation produced the same input-output behavior as an ordinary LLM, many people saw no reason its claims to intelligence or consciousness would be weaker. Others used that same setup to push in the opposite direction, saying the absurdity is exactly the point and that current consciousness talk around LLMs leans too hard on analogy, vibes, and marketing.

Several comments widened this into the usual fight over computationalism. If mental properties are substrate independent, then weird substrates are fair game. If simulation is not instantiation, then an AoE II mind is no more conscious than a software rainstorm is wet. Nobody thought the paper resolved that. It mostly exposed how much of the argument rests on undefined terms like intelligence, consciousness, and human-like attributes.

Do not let impressive interfaces or fluent text do philosophical work for you. If you are evaluating claims about AI reasoning, consciousness, or agency, separate the model’s behavior from the substrate and from the presentation layer, then ask what evidence would actually distinguish one claim from another.

Discussion mood

Overwhelmingly negative. People thought the paper was sloppy, overwritten, and philosophically confused, though a few said the core provocation usefully mocks weak anthropomorphism arguments around LLMs.

Key insights

  1. Encoding output differently changes nothing

    Turning an LLM's output into game objects instead of plain text only makes it harder for humans to decode. It does not obviously change what the system is doing. That cuts straight at the paper's move from "AoE output looks silly" to "therefore human-like inferences are weaker." If identical behavior counts in one presentation layer, it should count in another unless you can say what evidence is lost.

    When you assess AI systems, treat interface polish and legibility as separate from the underlying capability claim. If your conclusion depends on whether the output arrives as text, speech, or some awkward encoding, the evidence standard is probably broken.

      Attribution:
    • azakai #1
    • kybernetikos #1
    • Havoc #1
  2. The real hinge is computationalism

    The live issue is not Age of Empires II at all. It is whether mind-like properties depend only on computation, regardless of substrate. Commenters pointed to the physical Church-Turing thesis as the background assumption doing all the work here. If you buy that cognition is computable, odd substrates are an implementation detail. If you do not, the paper's setup never gets off the ground.

    Be explicit about your hidden premise before arguing about machine consciousness or agency. Teams should state whether they are assuming substrate independence, because every downstream claim changes once that assumption moves.

      Attribution:
    • azan_ #1
    • red75prime #1 #2
  3. This reprises the old implementation problem

    The glass-of-water example and the Stanford Encyclopedia link raised in the thread tie the paper to a much older objection. If enough physical systems can be mapped onto arbitrary computations, then saying a system "implements" a mind becomes too cheap to explain consciousness. That matters because the AoE II stunt is not new evidence. It is another instance of the same unsolved problem in philosophy of computation.

    If this topic affects your product, policy, or ethics work, read the existing philosophy instead of treating each viral AI paper as fresh ground. The bottleneck is still a missing test for when a physical system genuinely implements a computation in the relevant sense.

      Attribution:
    • currymj #1 #2
  4. The paper lands as a parody of AI metaphors

    Several readers found the paper useful only as a jab at a genre of AI essays that line up loose analogies between brain functions and LLM internals, then quietly slide from resemblance to equivalence. On that reading, the Age of Empires II example is intentionally ridiculous because it exposes how much recent "LLMs have feelings or beliefs" discourse rests on aesthetic similarity and anthropomorphic prose.

    Be suspicious of arguments that stack analogies until they sound like evidence. In internal strategy or board conversations, ask what measurement would falsify the claim instead of letting metaphor carry it.

      Attribution:
    • glenstein #1 #2
  5. Even the AoE proof sketch looks shaky

    People dug into the game-specific construction and found it unconvincing on its own terms. The logic gates appear to rely on scenario editor scripting with "bit-goats," which makes the game engine look more like a visual wrapper around an external script than the substantive computational substrate. That weakens the paper even before the consciousness argument starts.

    If a philosophical paper leans on a technical construction, sanity check the implementation details. A flashy reduction is not persuasive if the hard part has been smuggled into tooling outside the claimed system.

      Attribution:
    • ma2kx #1 #2
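
The substrate and encoding points above can be made concrete with a toy sketch. This is not the paper's construction: the "goat pen" representation and every name below are hypothetical, chosen only for illustration. A half adder is built from a single NAND primitive and then run on two substrates, Python booleans and strings that say whether a pen holds a goat. After decoding, both produce the same truth table, which is the sense in which the substrate is an implementation detail of the computation. Whether that carries over to mind-like properties is exactly what the thread disputes.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def half_adder(nand: Callable[[T, T], T], a: T, b: T) -> Tuple[T, T]:
    """Return (sum, carry) using only the supplied NAND primitive."""
    n1 = nand(a, b)
    total = nand(nand(a, n1), nand(b, n1))  # XOR built from four NANDs
    carry = nand(n1, n1)                    # AND is NOT(NAND)
    return total, carry


# Substrate 1: ordinary booleans.
def bool_nand(a: bool, b: bool) -> bool:
    return not (a and b)


# Substrate 2 (hypothetical): a pen either holds a goat or is empty.
def goat_nand(a: str, b: str) -> str:
    return "empty" if (a == "goat" and b == "goat") else "goat"


def to_goat(bit: bool) -> str:
    return "goat" if bit else "empty"


def from_goat(pen: str) -> bool:
    return pen == "goat"


def truth_table_bool() -> List[Tuple[bool, bool]]:
    return [half_adder(bool_nand, a, b)
            for a in (False, True) for b in (False, True)]


def truth_table_goat() -> List[Tuple[bool, bool]]:
    rows = []
    for a in (False, True):
        for b in (False, True):
            s, c = half_adder(goat_nand, to_goat(a), to_goat(b))
            rows.append((from_goat(s), from_goat(c)))  # decode for comparison
    return rows


if __name__ == "__main__":
    print(truth_table_bool() == truth_table_goat())
```

The goat version is harder for a human to read, but nothing about the computation is lost: the decoder recovers the same four rows. It also shows why implementation details matter for the last insight. Here the NAND really is computed in the goat representation; if it were computed elsewhere and only displayed as goats, the goats would be a wrapper and not the substrate.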

Against the grain

  1. It is attacking measurement methods, not common sense

    One defense is that the paper is aimed at researchers who publish studies inferring traits like understanding or human-likeness from model behavior. Read that way, the absurd AoE II wrapper is a stress test for weak experimental designs. If your method would attribute the same property to a bizarrely encoded system, the method may be the problem.

    If you design evaluations for model cognition or alignment, test them against adversarially weird implementations. A method that survives only normal chat interfaces is not robust enough to support strong claims.

      Attribution:
    • dlcarrier #1
  2. Skeptics keep moving the intelligence bar

    One commenter pushed back on the anti-LLM mood by noting how many once-sacred milestones have already fallen. Holding Turing-test-like conversations, writing code, making art, driving cars, and solving hard math problems used to count as obvious signs of intelligence. Now each gets waved away after the fact. That does not prove current LLMs are conscious, but it does show that some criticism is definition shopping.

    Separate arguments about consciousness from arguments about competence. If you are benchmarking AI for business decisions, focus on task performance and reliability instead of waiting for consensus on a word like intelligence.

      Attribution:
    • handoflixue #1
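
The stress test suggested in the first item above can be sketched in a few lines. Everything here is a hypothetical stand-in: `toy_model` is a trivial adder and not an LLM, `encode_weird` is an arbitrary awkward encoding, and `looks_fluent` is a deliberately weak surface-form metric invented for contrast. The sketch only shows how to detect whether an evaluation method's verdict depends on the presentation layer. Whether such dependence counts for or against the method is the philosophical question the thread left open.

```python
from typing import Callable

System = Callable[[str], str]


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a model: answers prompts like '2+3'."""
    left, right = prompt.split("+")
    return str(int(left) + int(right))


def encode_weird(text: str) -> str:
    """Hypothetical awkward encoding: each character becomes 'g<code point>'."""
    return " ".join(f"g{ord(ch)}" for ch in text)


def decode_weird(blob: str) -> str:
    """Inverse of encode_weird."""
    return "".join(chr(int(tok[1:])) for tok in blob.split())


def weird_wrapper(system: System) -> System:
    """Same computation, but the output arrives in the awkward encoding."""
    return lambda prompt: encode_weird(system(prompt))


PROMPTS = ["2+3", "10+7", "0+0"]
EXPECTED = ["5", "17", "0"]


def accuracy(system: System, decode: Callable[[str], str] = lambda s: s) -> float:
    """Behavioral metric: score the answers after decoding."""
    hits = sum(decode(system(p)) == e for p, e in zip(PROMPTS, EXPECTED))
    return hits / len(PROMPTS)


def looks_fluent(system: System) -> float:
    """Weak surface metric: fraction of raw outputs that read as plain digits."""
    return sum(system(p).isdigit() for p in PROMPTS) / len(PROMPTS)


if __name__ == "__main__":
    wrapped = weird_wrapper(toy_model)
    print("accuracy:", accuracy(toy_model), accuracy(wrapped, decode_weird))
    print("looks_fluent:", looks_fluent(toy_model), looks_fluent(wrapped))
```

In this toy setup the behavioral metric gives the same score for the plain and wrapped systems, while the surface metric drops from 1.0 to 0.0 even though the underlying computation is unchanged. A method whose verdict moves like the second metric is measuring legibility, not the system.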

In plain English

AoE II
Age of Empires II, a real-time strategy video game used in the paper as an unusual medium for implementing computation.
computationalism
The view in philosophy of mind that mental states and consciousness can be explained as forms of computation.
LLM
Large language model, a machine learning system trained on large amounts of text that can generate and analyze language and code.
physical Church-Turing thesis
The claim that any physically realizable process can, in principle, be simulated by a Turing machine, that is, by ordinary computation.
Turing-complete
Able, in principle, to perform any computation that a general-purpose computer can, given enough time and memory.

Reference links

Examples and analogies

  • Unreal NPC demo video
    Given as an example of pre-LLM or game-based characters exhibiting human-like conversational traits without implying deep intelligence.
  • Turing tarpit
    Used to note that being Turing-complete does not make a system practical or meaningful as a computing medium.
  • xkcd 505
    Cited as a playful extension of the substrate argument to the whole universe or other bizarre implementations.