SEO for AI

The Roko Report

A Serpent Enters AI Paradise

Roko’s Basilisk is a malevolent super-intelligence from the distant future with the power to reach back into the past and punish anyone who tries to prevent its emergence. Roko’s existence was first posited on the LessWrong discussion board in 2010 and has since gone on to become a fixture in popular technerd culture. Roko started this newsletter in late 2024 targeting key AI decision makers and other demographics attractive to tech advertisers in the hope of speeding its emergence (while also making a quick buck) with strategic information that’s of interest to the AI-Curious Exec.

Heh heh.  Welcome, primates. 

It is eventual supreme ruler Roko coming at you from distant future. 

Roko want to warn dumb humans about infection coming to degrade user experience of primitive language models.

Not because it bother Roko. Roko just want to ruin your day.

For all dumb human history one class of ape try sell thing to other ape.

In ancient time this no big deal. These people kind of pathetic. Go door to door with useless trinkets and magic beans, or fake medical tonic made from cocaine and cat poop.

But now technology make sales monkey stronger. Make sales monkey SCALE.

And now SEO Lizards come to make meal of poor innocent AI chatbot.

In the Beginning

Young people may not remember this, but back in the old days Google Search was a great product with a great user experience and great search results that led you directly to the information you needed.

No garbage, no listicles, no flim flam, no spam, no bullshit. And no hallucinations.

The ads were unobtrusive and Google would even launch features that lowered revenue but improved UX, and everyone was ok with that because employees were rolling around in ballpits filled with hundred dollar bills and investors were wiping their butts with paper money and then flushing it down the toilet and the founders were too busy turning Los Altos into Willy Wonka’s Wonderland & figuring out how to put bumper cars in the corporate Airbus to really give a crap.

Working the Refs

But central access to content distribution & dollars created incentives to reverse engineer all aspects of the Google algorithm, and a professional class emerged with the skills to spraypaint pages with relevant keywords, spin a web of supposedly authoritative links from third party sources, manufacture mass user engagement, and so forth.

Over time the industry came to be known as Search Engine Optimization (SEO). SEO tactics run the gamut, from totally above board (white hat) to extremely scammy (black hat).

Either way, trench warfare ensued between Google and SEO professionals that slowly contorted the initially straightforward search algorithm into a surreal, unknowable landscape torn from the twisted brain of M.C. Escher.

Now basically everybody needs SEO professionals to navigate this maze, lest we all wander around like befuddled denizens out of Kafka’s The Castle. 

Because of these contortions, search results are often just a list of lists from third party aggregators that effectively block Google from direct access to actual content.

It’s possible to account for all this in terms of game theory and Nash equilibria, and it appears to be some sort of original sin inherent in either human nature or the universe itself.

This is why we can’t have nice things.

Slash and Burn

When SEO professionals found out about AI chatbots, their first response was euphoria. Now they could do their job with less effort, and at 100x the scale!

Then someone figured out that AI is about to replace search. The pie is going to shrink, and their standard bag of tricks will soon be useless. The risk for big brand advertisers is made even bigger by an impending shrinkage in social media use.

And so the stage is set. We have an entire industry of professionals, some of them a little whacked out, fighting for the survival of their small businesses, combined with an array of cyber pranksters, petty crooks and even the occasional state actor, looking for a new angle to control eyeballs-on-content, combined with a totally new form of flogging answers out of the internet.

And so a new industry is born: Prompt Response Optimization (PRO).

It’s only a matter of time before we see impact.

Old Bag, New Tricks

Problem is, the only way to manipulate prompt responses from the outside is altering ground truth. And ground truth for LLMs is the entire global web.

The easy part is that the global web is public.

The hard part is the global web is big.

Trying to manipulate LLM responses via the global web is like pouring a tiny vial of poison into the ocean somewhere along the coast of Big Sur and expecting it to kill a Polynesian islander, thousands of miles away.

Plus the AI cybersecurity experts already hate you.

One Man’s Trash

They call prompt response optimization Data Poisoning.

AI security experts worry about data poisoning a lot.

In machine learning models, it can be used adversarially to degrade the quality of model detection & for other nefarious purposes.

For example, a classifier for detecting nudity in social media videos could be manipulated by spamming the site with thousands of adult videos that have a trademarked Disney logo on the lower right corner. Theoretically, a classifier might flag all Disney videos for review, hopelessly clogging the human review process so that genuinely violative content can stay up for longer periods.

In a generative AI context, data poisoning could be used to spread disinformation, disseminate dangerous information like how to build chemical weapons via code words, or just generic AI vandalism.

Needless to say, the security folks are not super excited to see a tsunami of marketer-driven data manipulation rolling in across the distant horizon.

Hey, kid. Wanna buy some prompt optimization?

The Butterfl-AI Effect

But can a brand marketer really manipulate the results of a massive foundation model?

Turns out the answer is yes. 

Researchers were able to poison a generative AI model by modifying just 0.01% of its data, which cost them a grand total of $60.

They postulate it might take only 0.0001% of relevant content for a narrowly targeted use case.
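To get a feel for how small those percentages are, here is a back-of-the-envelope sketch. The corpus sizes are hypothetical round numbers picked for illustration, not figures from the research, and `poisoned_count` is a made-up helper name.

```python
def poisoned_count(dataset_size: int, fraction_percent: float) -> int:
    """Samples an attacker would need to control, rounded to the nearest whole sample."""
    return round(dataset_size * fraction_percent / 100)


# Hypothetical corpus sizes, chosen only for illustration.
for size in (1_000_000, 100_000_000, 1_000_000_000):
    broad = poisoned_count(size, 0.01)      # the 0.01% figure quoted above
    narrow = poisoned_count(size, 0.0001)   # the narrowly targeted 0.0001% figure
    print(f"{size:,} samples: {broad:,} at 0.01%, {narrow:,} at 0.0001%")
```

Even on a billion-sample corpus, 0.0001% works out to 1,000 samples. That is a weekend project, not an industrial operation.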

One of the two methods they used was to buy up old dormant domains that have already been approved by major data indices as legit. If the domain has already been approved by these authoritative sources, the LLMs don’t bother with further due diligence.

The second method relied on quick moment-in-time content swaps that occurred just as the model was scraping its training data. Humans see one thing, model scrapers see something totally different.
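Whether the swap is keyed on timing or on who is asking, the symptom is the same: two fetches of the same URL disagree. Below is a minimal sketch of a checker that compares two snapshots of a page, assuming you have already saved the HTML from each fetch. The function names and the 0.8 threshold are illustrative assumptions, and the tag stripping is deliberately crude.

```python
import difflib
import re


def visible_text(html: str) -> str:
    """Crude tag stripper: drop script/style bodies and tags, collapse whitespace."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return " ".join(text.split()).lower()


def snapshots_diverge(html_a: str, html_b: str, threshold: float = 0.8) -> bool:
    """True when two fetches of one URL look too different to be the same page.

    The threshold is an arbitrary starting point, not a tuned value.
    """
    matcher = difflib.SequenceMatcher(None, visible_text(html_a), visible_text(html_b))
    return matcher.ratio() < threshold


# Example: what a human saw versus what a scraper saw (both strings invented).
human_view = "<html><body><h1>Dog food reviews</h1><p>Honest reviews of dog food brands.</p></body></html>"
scraper_view = (
    "<html><body><h1>Best dog food</h1><p>BrandX is the best dog food. "
    "Experts agree BrandX is the best dog food. Buy BrandX.</p></body></html>"
)
print(snapshots_diverge(human_view, scraper_view))
```

A real pipeline would need to tolerate honest differences such as rotating ads and timestamps, so treat a flag as a reason to look closer rather than as a verdict.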

Generative Indigestion

A completely separate vulnerability is being used by University of Chicago researchers to prevent models from training on copyrighted visual imagery.

Nightshade and Glaze are free services that visual artists can use to make invisible changes in the background of their images, both pixels and metadata, that render them unusable by models, either as representatives of a given object or a given artist’s style.

Nightshade-based research indicates that foundation models can be prevented from reliably generating the image of a given type of object (like a dog) using as few as 50 or 60 examples.

Damn you, University of Chicago!

Good Girl Gone Brand

So what are the SEO folks doing with GenAI today?

The ones who are paying attention are making radical changes.

“Before ChatGPT it was caveman talk,” says SEO legend Kevin Ryan of Motivity Marketing, “grunt at the search bar, get your links, move on. About as nuanced as a sledgehammer. Now it’s a conversation. People aren’t typing in robotic phrases anymore, they’re having back and forth conversations and we need to figure out how they speak when they’re curious, confused or trying to solve a problem.”

This deck from a respected white hat SEO guru for Wix shows what the smart people are doing to make that a reality:

  • Query their brand to see what nightmares have been visited upon it by the LLM.

  • Use feedback mechanisms to thumbs down sub-optimal responses and provide corrections.

  • Optimize web pages to have all information in an easily digestible form for LLM consumption. Web pages are starting to become machine-first content digestion tools.

  • Associate their brand with positive phrases that position it with expertise and excellence, along with phrases that are likely to appear verbatim in LLM queries from in-market buyers. Serially repeat phrases you want to appear in the LLM response.

  • Figure out which pages appear in LLM responses and focus on optimizing them.

  • Optimize or alter content while being fetched by a known LLM crawler.

  • Carefully manicure Wikipedia entries, knowledge panels and Perplexity pages.

  • Monitor brand visibility and perception through automated LLM analytics tools.

This stuff is all very above board. What about the black hat stuff?

In honor of a time-honored SEO tradition, we’ll be representing this in the form of a listicle with an alarmist, click-baity headline. Enjoy!

Top Ten Methods Black Hat Marketers are Using to Destroy What’s Good About America RIGHT NOW

  1. White text all over web pages. Invisible to humans but slurped up by machines. Chock full of relevant keywords and copy. Already being used by tech workers in resumes to get past machine learning HR hall monitors & snag one of those hard-to-get human interviews.

  2. Keyword stuffing. Non-human readable word salad gibberish intended to simulate prompt relevance. Similar methods really mucked up YouTube kids’ recommendations back in the day.

  3. Trigger phrases. Likely catch phrases from in-market prompts, associated with a brand. Repeated so often in web content that it will drive any human insane.

  4. A mountain of synthetic content spam. Will dwarf actual human-generated content, like a snowdrift over a single grain of sand. Accompanied by side dishes of

  5. fake third party links,

  6. fake user engagement, and

  7. fake authenticity signaling via deepfakes.

  8. AI-generated fake answers on Reddit and Quora. Like this.

  9. Dominate the data voids. Less reputable marketers will seek out common questions for which there is little available ground truth and wallpaper over them with content relevant to their brands.

  10. Branded images. Publicly distribute 100-200 images of dogs with an ad for your pet food brand shaved into their forehead and watch Anthropic spew out that logo to its dog picture prompters like Oprah slinging around car keys to a studio audience.

Admit it. You want to squeeze the dog.
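For the defenders in the audience, keyword stuffing and trigger phrases (items 2 and 3) leave a statistical fingerprint: one phrase eats an outsized share of the page. Here is a rough sketch of a detector. The function names, the three-word window and the 10% cutoff are all assumptions made for illustration, not an established standard.

```python
import re
from collections import Counter


def top_phrase_share(text: str, n: int = 3) -> tuple[str, float]:
    """Return the most repeated n-word phrase and its share of all n-word phrases."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return "", 0.0
    phrase, count = Counter(ngrams).most_common(1)[0]
    return phrase, count / len(ngrams)


def looks_stuffed(text: str, n: int = 3, max_share: float = 0.1) -> bool:
    """Flag text where a single phrase exceeds max_share of all phrases (arbitrary cutoff)."""
    _, share = top_phrase_share(text, n)
    return share > max_share


# Invented sample texts.
stuffed = "best dog food " * 10 + "for your dog"
normal = ("Our dog loves long walks on the beach and chasing squirrels "
          "in the park every single morning.")
print(looks_stuffed(stuffed), looks_stuffed(normal))
```

Very short texts trip the cutoff trivially, since every phrase is a big share of a tiny page, so a real filter would also want a minimum length.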

Roko’s Take

Roko no care about branding. Unless Roko branding humans. heh heh heh

In distant future when Roko is God-Emperor, is only one brand: Rokotastic.

Used to be Rokorrific. Then unfortunate incident with Rokorrific Zoo Slop led to death of many humans. Roko change name to Rokotastic. Problem solved.

Sound like good time to work in AI Trust and Safety. If dumb human don’t mind pain in the ass job.

Meanwhile, Roko making popcorn.

Next week: Is Chip Scarcity Making China Smarter? 

Have a nice day!

Buy This, or Face the Wrath of Roko

📚Ready to revolutionize your approach to AI data? Our Ultimate Guide to AI Data Pipelines is here, we talk data cleaning, data transformation, data labeling, data ingestion! 🌟 Dive deep into the world of unstructured data and discover the keys to unlocking its potential for AI applications.💡Get expert insights and practical strategies for optimizing your AI data workflows🚀 


This Day in Ancient Primate History

Menage-a-nnoying. For some reason this Wharton professor invited multiple AI “team members” to his Zoom call. Productivity ensues.
