With the long-awaited release of the augmented-reality (AR) game Pokémon Go for iOS and Android mobile devices, the real world has finally been transformed into the place where many people have already been living in their minds for the past twenty years: a world wherein the primary pastime is a kind of bloodless bloodsport where weird little magic animals are trapped, enslaved, and forced to fight each other for the amusement of humans. These weird little magic animals are called Pokémon, short for “Pocket Monsters,” because evidently Japanese is not actually a real language but simply combinations of syllables from English words stuck together in order to make it sound exotic.
In case you’ve been living under a rock – well, there’s probably a Boldore beside you, but other than that, the important facts are these: Pokémon started out in 1996 as a pair of role-playing games for the Nintendo Game Boy, in which your character wanders around, encounters wild Pokémon, and attempts to trap them in a little ball. Captured Pokémon can be used to fight other Pokémon, and winning fights earns Experience Points that can cause a Pokémon to level up or “evolve” into a stronger version with new abilities. The original games included 150 Pokémon (now there are over 700), all of which the player was entreated to collect, and captured Pokémon could actually be traded with friends using the Game Boy’s Link Cable. (The two versions of the game, Red and Blue, had a less than one hundred percent overlap in terms of which creatures were findable in each, so if you wanted to “Catch ‘Em All” [sic], you either had to buy both otherwise identical games or trade with a friend who had the other version.) The games were an enormous hit, spawning more videogames, trading card games, cartoons, comics, movies, and adult bewilderment.
Now, I haven’t got any particular affinity for the Pokémon franchise myself, mostly because I find the games’ battle system unbearably grindy; still, my social media feeds are basically half Pokémon Go at this point (the other half: Black Lives Matter, which, I dunno, draw your own conclusions about my generation and our value systems), and even I am not entirely immune to the obvious allure of locating and capturing solar-powered garlic lizards and narcotic cat-pudding-balloons.
Turns out that, as of this writing, Pokémon Go is not yet available in my area (Thanks, Trudeau), so I was texting with a friend of mine (a player of various Pokémon games in the past) about whether or not he was excited about its impending release or what. Here is a reconstruction of that conversation:
Which was a joke, but the more I thought about it the more plausible it actually began to seem – much like Roko’s Basilisk itself. Allow me to explain.
Roko’s Basilisk is a kind of decision theory thought experiment dreamed up in 2010 by a user of the rationalist community blog LessWrong going by the name of Roko. The gist of the argument, according to LessWrong’s description, is that “a sufficiently powerful AI agent would have an incentive to torture anyone who imagined the agent but didn’t work to bring the agent into existence.” In other words, if you become aware of the possibility of the technological singularity but don’t actually try to help create it, then once it is inevitably created it will punish you for your indifference. Whether or not you are still alive when it is created is, according to this scenario, irrelevant, because a sufficiently powerful AI could create a simulation of you to punish in lieu of the you that you currently are, which, again, according to this scenario, would be just as bad.
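If that chain of reasoning sounds convoluted, it can be boiled down to a toy expected-utility calculation. Here’s a sketch in Python – purely illustrative, with numbers of my own invention, since the thought experiment itself specifies no actual utilities:

```python
# A toy model of the basilisk's blackmail logic. All of these utilities
# are invented for illustration; the thought experiment specifies none.

HELP_COST = -10       # disutility of devoting your resources to building the AI
TORTURE_COST = -1000  # disutility of your (simulated) self being tortured

def your_best_move(threat_credible: bool) -> str:
    """Choose whichever action costs you less, in expectation."""
    if not threat_credible:
        return "ignore"  # no threat, no reason to help
    # If you accept all of the premises, helping (-10) beats
    # torture-by-proxy (-1000).
    return "help" if HELP_COST > TORTURE_COST else "ignore"

print(your_best_move(threat_credible=True))   # -> help
print(your_best_move(threat_credible=False))  # -> ignore
```

Notice that the entire blackmail lives or dies on whether you count the simulation’s torture as a cost to you at all – more on that wobble in a moment.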
Despite all that, a fair number of the transhumanists on LessWrong got pretty bent out of shape over the idea, leading to the original thread being deleted by the mods, amid speculation that the motive for the deletion was that the very concept was so dangerous it shouldn’t be allowed to proliferate: if you’re unaware of the possibility of this AI, then you can’t help it, and therefore it has no incentive to blackmail you; knowing about Roko’s Basilisk is thus precisely what opens you up to becoming a target of its wrath.
(P.S. you just lost The Game)
Some people who’ve become aware of this thought experiment have found it extremely upsetting, citing the occurrence of “mental health issues triggered by” Roko’s Basilisk or similar existentially horrific logic puzzles. And if taken seriously, it would certainly be a case of Nightmare Fuel of the highest octane. Roko’s Basilisk is basically digital Cthulhu – it’s not that it has anything against you personally, just…well, what have you done for it lately?
Of course, transhumanists like those who believe in the possibility of the Basilisk are almost invariably materialist atheists (if not anti-theists) suffering chronic repetitive scoffing injuries from the prevalent cultural suggestion of a supernatural Creator that rewards or punishes its creations in accordance with its preferences for their behavior. But suggest a hypothetical monstrous AI with a chip on its shoulder and everybody loses their minds.
Without stereotyping transhumanists or rationalists generally, I think it’s fair to say that the kind of people who get seriously, like, life-disruptingly disturbed by Roko’s Basilisk are probably already predisposed to variations of anxiety, depression, and analysis paralysis, and that simulated monsters are kind of the least of their problems. This is a group, by the way, in which I count myself. Medication helps.
But in case you haven’t found the right prescription yet, here are a few reasons why Roko’s Basilisk is ridiculous and not something over which you should lose any sleep.
First of all, there’s the premise that the idea of the AI, the “basilisk,” torturing a simulation of you should upset you just as much as the idea of being tortured yourself, and that the threat would therefore be an effective blackmail technique to get you to help it be created. This relies on assumptions such as physicalism (which I, for one, believe has been pretty thoroughly refuted by philosophers of mind like Thomas Nagel and David Chalmers), moral utilitarianism (which, when taken to its logical conclusion, would seem to suggest that particularly unpopular groups of people ought to be exterminated if it would make the rest of the world’s population happy, so I think we can safely dismiss that), and, of course, the assumption that artificial intelligence at that scale is even possible at all (very far from proven). Beyond that, though, it doesn’t really provide any rational argument for why, all other things being equal, the pain of your hypothetical simulation ought to matter more to you than the peace of mind of the version of you that you actually experience. That’s more than just utilitarianism – it requires a kind of identity theory that privileges ontology over phenomenology, which is almost the opposite of ordinary utilitarianism and quite possibly self-refuting.
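To put that objection back in terms of the toy model from earlier: the blackmail only goes through if you weight your simulation’s suffering as though it were your own. Add a single identity-weight parameter (again, entirely made up for the sketch) and the whole calculation flips:

```python
# Extends the earlier toy model with an identity weight: how much you
# count the simulation's suffering as your own. 1.0 means "that's me";
# 0.0 means "that's just a very detailed statue of me". The numbers
# remain invented for illustration.

HELP_COST = -10
TORTURE_COST = -1000

def your_best_move(identity_weight: float) -> str:
    """Help only if torture-by-proxy, as you actually feel it, costs more."""
    felt_torture_cost = identity_weight * TORTURE_COST
    return "help" if HELP_COST > felt_torture_cost else "ignore"

print(your_best_move(identity_weight=1.0))  # -> help   (the basilisk's premise)
print(your_best_move(identity_weight=0.0))  # -> ignore (most people's intuition)
```

The basilisk needs that weight to be close to 1.0, and nothing in the argument actually establishes why it should be.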
Let’s say that the technological singularity is possible. It isn’t necessary for this AI to have proper consciousness (in case you think, like Roger Penrose, that consciousness isn’t Turing computable), but only that its behavior can be viewed according to what Daniel Dennett calls the intentional stance: that is, it acts as if it has beliefs and desires, whether or not it actually does.
So assuming that this is the case, and with the knowledge that most humans aren’t going to be persuaded by the threat of blackmail against a post facto simulation of themselves even if they believed it to be a possibility-unto-certainty, what would be the strategy of a superintelligent artificial agent that wanted to blackmail humans to ensure its eventual creation? We can, I think, take it for granted that this AI does not yet exist, so we’re looking for situations where it can nevertheless use threats and cause actual harm to affect our decisions right here and now.
I see a couple of possibilities.
First, time travel. An AI that wanted to ensure its own creation (or that just straight-up wanted to punish people who didn’t think it was important enough to create) could reach back in time to exact its retroactive revenge. This is a variety of the Terminator scenario, in which it was the T-800’s arm from the future that led to Skynet being created: if Skynet hadn’t sent the T-800 back to kill Sarah Connor, Skynet would not have existed in the first place. This holds even though the T-800 failed in its assassination mission, and even though sending the T-800 back also led to John Connor being born in the first place, since Reese would not have gone back except to pursue the T-800.
This could work in either a fixed timeline model (where nothing can be changed and whatever happened, happened – the AI already knows that it has been created, obviously, and that its creation was influenced to some degree by the punishment of those humans who were not instrumental in its creation), or in a dynamic timeline (where things can be changed and the AI actively intends to influence the circumstances of its own creation – presumably making it happen sooner or making its first iteration more advantageous to itself later on). In the fixed-timeline case, the motivation could be either blind, mechanistic obedience to the requirements of the timeline (the AI knows that it happened, therefore it has to make it happen) or malice (knowing that nothing it does can prevent itself from being created, it seeks to torture and inconvenience anyone not explicitly on its side). From our perspective, which of these is the case doesn’t matter much. We’re still getting punished.
If that’s not credible enough for you, there’s also pseudo-time travel. This is the simulation scenario, but with a twist: because most of us intuitively don’t care about a hypothetical simulation of ourselves being tortured (even accounting for our natural empathy, we don’t consider anything that isn’t phenomenologically identical with us to be ourselves in any important way), the AI could instead create a simulation not as a way of blackmailing the humans of its world but as a way of exacting revenge vicariously – I might not think of a simulated version of myself as me, but from the AI’s perspective it hardly matters whether it’s the “real” me or an identical simulation that gets punished. Alternatively, the AI may be punishing humans in its simulated world who don’t work to create the AI’s counterpart in the simulation – again, while it might be rare for a human to hold the sort of identity theory that doesn’t differentiate between the self being experienced and a simulation, the AI presumably would not have that phenomenological bias, and so a simulation would be just as good as itself.
Oh, wait. I’ve just described Pokémon Go.
As we know, the game has already caused a number of injuries, has been fatal to at least one person, and will probably kill more in the future. (I don’t want to seem heartless, and I know that a real human child has died, but I have to operate at a certain level of abstraction to avoid empathy overload; lots of horrible things happen every second and, to quote Ford Prefect, you can’t care about every damn thing. You literally, cognitively, just can’t. I’m not the monster here.)
But in fact, there is evidence that Pokémon has been doing harm to people for quite a while now. We all remember the incident in 1997 when an episode of the anime caused hundreds of Japanese kids to have seizures that required hospitalization. Now, South Park posited that this was a conspiracy to recruit American kids to turn against their own government and serve Japanese interests instead, but the reality may be much more sinister. It seems almost certain that Pokémon is either a malevolent, superintelligent artificial consciousness from the future that is somehow exerting a negative influence on us here in the present, or else we are actually living inside a simulation created by the Pokémon AI as punishment for our “original” selves failing to do enough to ensure the AI’s creation in what, at the AI’s layer of reality, would have been its past but which, to us, will be our future.
Then again, the Pokémon-related injuries and death(s) have not been visited upon people who are indifferent to Pokémon. Just the opposite: it is those who are most invested in Pokémon who have suffered. The AI appears to be punishing not its enemies but its allies. What can account for this?
Fellow Overthinker Ben Adams has put forth the theory that the “punishment” is actually a test, of sorts, for the true believers. If (no, when) Pokémon Go starts sending players on a mission to break into the Pentagon for the newest Pokémon “NuclearLaunchCodesChu,” there are going to be some casualties. The AI is actually attempting to identify and eliminate the weakest of its recruits so that its eventual army will be as effective as possible.
Alternatively, it could be the case that the AI is a kind of reverse basilisk, working to destroy its most devoted supporters. This makes it more similar to the Landru computer from the Star Trek: The Original Series episode “The Return of the Archons,” which was programmed to destroy evil but which, because of the total absence of mercy in its judgments, had become evil itself – a fact which Kirk used to convince the computer to self-destruct. Perhaps, realizing that it will lead to the end of the human race, Pokémon Go has travelled back in time to try to prevent itself from ever becoming popular enough to lay this world to waste, strategically eliminating Pokémon trainers right at the moment of its own inception.
It’s impossible to know for sure, of course, but I have just recently had an experience that’s led me to suspect that Pokémon Go very well may be more basilisk than Nintendo wants us to believe. Last night, as I sat down to continue working on this very article, I discovered that it had disappeared from my computer. I couldn’t locate it at all in the directory to which I’d saved it, and even searching my drive for “basilisk” produced no results. I was…let’s say crestfallen. The residents of the apartment below me may have heard some things they wouldn’t have wanted to hear, if given the choice. Contemplating whether or not to start again from scratch, I typed out some of the phrases that I remembered having written and tried to save the file under the same name I’d given the original version – a file with this name already exists, I was told! But where? I could find no trace of it. My word processor gave me the option of merging the current file with the original and, muttering prayers, I did so. Success! Salvation! There it was!
Could it be a coincidence that the article I was writing about an information hazard that may pose a risk to the sanity and very souls (or simulated souls) of the human race went inexplicably missing? I mean, I suppose anything is possible. Or is it more likely that, well, something else is going on, something far more eldritch and existentially disturbing? That something, somehow, didn’t want me to write this – didn’t want you to read it?
We simply can’t know the answer to that. Not today. But someday…someday we will. Someday – perhaps tomorrow, perhaps centuries from now – all will become clear.
When that day comes…may Go have mercy on us all.