<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en_GB"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://weirdgloop.org/feed.xml" rel="self" type="application/atom+xml" /><link href="https://weirdgloop.org/" rel="alternate" type="text/html" hreflang="en_GB" /><updated>2026-06-08T15:15:34+00:00</updated><id>https://weirdgloop.org/feed.xml</id><title type="html">Weird Gloop</title><subtitle>MediaWiki host for official and independent wikis
</subtitle><entry><title type="html">We wrote a guide to help you get your wiki off Fandom</title><link href="https://weirdgloop.org/blog/we-wrote-a-guide" rel="alternate" type="text/html" title="We wrote a guide to help you get your wiki off Fandom" /><published>2026-03-20T00:00:00+00:00</published><updated>2026-03-20T00:00:00+00:00</updated><id>https://weirdgloop.org/blog/we-wrote-a-guide</id><content type="html" xml:base="https://weirdgloop.org/blog/we-wrote-a-guide"><![CDATA[<p>Hi wiki friends! We moved the <a href="https://gta.wiki/">GTA Wiki</a> off Fandom last weekend, so that’s kicked off a new round of wiki admins finding my <a href="https://weirdgloop.org/blog/why-were-helping-more-wikis-move-away-from-fandom">blog post</a> and asking for advice on how to get their wikis off Fandom too. Since I started offering wiki-moving advice in 2024, somewhere around 100 people have hit us up in private, and I’ve realized we’re saying a lot of the same things over and over, most of which could easily just be somewhere public.
<!--excerpt--></p>

<p>So…we made <strong><a href="https://meta.weirdgloop.org/w/Leaving_Fandom">a guide</a></strong>! It’s a collection of practical things you’ll want to be thinking about if you’re moving your wiki off Fandom, based on the experiences that <a href="https://river.me/">River</a>, <a href="https://meta.weirdgloop.org/w/User:Mudscape">Momo</a>, and I [cook] have had in the last few years, helping wikis big and small beat up Goliath and migrate somewhere better.</p>

<p>Some topics:</p>

<ul>
  <li><a href="https://meta.weirdgloop.org/w/Leaving_Fandom#Moving_wiki_content">Scripts for getting the relevant content onto your new wiki</a></li>
  <li><a href="https://meta.weirdgloop.org/w/Leaving_Fandom/moving_checklist">Some Fandom-specific stuff you’re going to want to convert or remove</a></li>
  <li><a href="https://meta.weirdgloop.org/w/Leaving_Fandom/effective_Reddit_posts">Advice for announcing the move on Reddit, with some examples</a></li>
  <li><a href="https://meta.weirdgloop.org/w/Leaving_Fandom/coalition_building">Advice for working with the game developers and other important people in the community</a></li>
</ul>

<p>I hope this gives everyone some tangible examples, demystifies the process, and makes it clear which parts of this are actually hard, vs which parts just look hard.</p>

<p>If you’re a wiki editor, go check it out! Even if you’re not a wiki editor, we have a section with <a href="https://meta.weirdgloop.org/w/Leaving_Fandom/what_to_do_if_you%27re_not_an_editor">some advice</a> for how you can still help get the ball rolling for your favorite wiki.</p>

<h2 id="nearly-everyone-is-gonna-be-rooting-for-the-new-wiki-go-build-a-coalition">Nearly everyone is gonna be rooting for the new wiki. Go build a coalition!</h2>

<p>As we’ve helped more wikis get from the “idea stage” to actually launching and announcing the move away from Fandom, I’ve noticed a consistent gap in expectations, where the wiki admins think moving-from-Fandom is basically internal “hobby drama” that only affects editors, and nobody else will care. Naturally, this means they’re caught off guard when the move gets a huge amount of interest - <a href="https://reddit.com/r/Warframe/comments/1iemokz/the_warframe_wiki_is_officially_moving_from/">10k upvotes on Reddit</a> with 99% support, <a href="https://www.youtube.com/watch?v=qcfuA_UAz3I">YouTube videos with 4 million views</a>, <a href="https://kotaku.com/gta-wiki-leaving-fandom-rules-censorship-ads-videos-ditching-2000679115">mainstream gaming press</a> - by far the most attention the wiki’s ever gotten.</p>

<p>This is fascinating to me, and I think I finally get it: wiki editors consistently underestimate how much the average internet person uses their wiki every day, would benefit from the wiki being great, and consequently, is frustrated as heck with Fandom being so damn bad.</p>

<p><strong>The most successful wiki moves are the ones where the wiki admins recognize this, and are pragmatic about building a coalition from their game’s wider community to help get the word out about the new wiki.</strong></p>

<p>I cannot stress enough that this is usually the deciding factor for whether a wiki move is successful. Build the coalition! It’s not as hard as it sounds. Here’s two important groups you should be thinking about:</p>

<h3 id="game-devs">Game devs</h3>
<p>I’ve yet to meet anyone working for a game studio that’s enthusiastic about their wiki being on Fandom. I don’t think this dynamic is widely understood by wiki editors.</p>

<p>For most editors, it’s insanely valuable to get a chance to talk to the devs: get some data, get some clarification on lore or mechanics…because it can make the wiki so much better, often with only a few minutes of effort from the developers. Fandom does an excellent job giving the <em>impression</em> to wiki editors that, because they’re a “big company” too, they’ll have an “in” with the game studio, and can make these kinds of interactions happen. In other words, it’s a “perk” of being on Fandom’s platform, that you can’t get anywhere else.</p>

<p>In my experience, the opposite is usually true! I’ve seen at least 3 cases where a game studio just refused to engage with their game’s wiki because it was hosted on Fandom, tragically not understanding that the people making the edits were not the people running the horrible intrusive ads. It’s not that hard to see why:</p>
<ul>
  <li>Game designers dislike it because they’ll often be reading it for work (a lot of the time the wiki is better than their internal documentation!), so they have all the same usability/ad complaints that your average reader would</li>
  <li>Community managers dislike it because the ad situation tanks community sentiment, and makes it harder for new players to find what they need</li>
  <li>Marketers freak out about how their core audience is getting bombarded with ads for games that directly compete with them (the developer of RuneScape had a <a href="https://runescape.wiki/w/Fansite?oldid=10212589#Requirements_and_benefits">fansite program</a> circa 2013, that gave fansites special access to stuff if they agreed to not run ads for other video games, which, we learned years later, was mostly aimed at freezing out the Fandom wiki that was constantly running ads for WoW, etc)</li>
</ul>

<p>The list goes on: <a href="https://www.youtube.com/watch?v=IWpSJj_bYUk">here’s a video</a> from the developer of Satisfactory talking about their unpleasant interactions with Fandom. A community manager for a different, very popular game once told me that their studio got fed up with the terrible ad situation on their Fandom wiki, and inquired with Fandom how much they’d have to pay to get the wiki to not have any ads. They stopped pursuing this when Fandom said they’d be willing to consider it…for about 5 million dollars a year. For just the one wiki.</p>

<p>So…go talk to your game’s community managers! There’s likely someone at the studio who understands how important the wiki is and is frustrated that it’s on Fandom, but who has no idea that the wiki contributors also want out. If that’s the case, there’s probably a reasonable path forward where you work together, and if you can find the right person to talk to at the studio, they’ll be your most important ally in the whole thing. There’s a lot they can do to put their thumb on the scale to help the new wiki, including:</p>

<ul>
  <li>Helping announce the move, and linking to the wiki in prominent places on their website</li>
  <li>Putting the wiki on a subdomain of their website like <a href="https://wiki.leagueoflegends.com/">wiki.leagueoflegends.com</a> or <a href="https://wiki.warframe.com/">wiki.warframe.com</a>, or hosting it on their servers, like the <a href="https://poewiki.net">Path of Exile devs</a></li>
  <li>Requesting a takedown of any content on the Fandom wiki that infringes their copyright, like art or other game assets</li>
  <li>Working more closely with the wiki editors on projects that make the wiki better (more on this at the end!)</li>
</ul>

<h3 id="youtubers-and-other-community-power-users">YouTubers and other community “power users”</h3>
<p>There’s a lot of great videos from established YouTubers that focus on a wiki leaving Fandom. <a href="https://www.youtube.com/watch?v=qcfuA_UAz3I">This one about Hollow Knight</a> is probably the most well-known, but there’s some popular ones from a <a href="https://www.youtube.com/watch?v=ySXP4hxFNFw">few</a> <a href="https://www.youtube.com/shorts/DPqa0C4DqeE">Minecraft</a> <a href="https://www.youtube.com/watch?v=T6FdYNyIaUI">people</a>, <a href="https://www.youtube.com/watch?v=KATj_UQu9DY">League</a> people, and more. These are one of the main ways that casual wiki readers find out about the move, so you should be doing everything you can to get popular video makers for your game to help spread the word.</p>

<p>Figure out who else is important - make friends with the Reddit and Discord mods, figure out what other websites and community projects rely on the wiki in some way, and see if they’re up to help.</p>

<p>Wiki editors will often assume they need to offer something in return (linking to the video, some official affiliation, etc), but you usually just don’t need this at all. In my experience like 95% of these folks are super eager to help, because remember, <strong>they all use the wiki too</strong>.</p>

<h2 id="the-light-at-the-end-of-the-tunnel">The light at the end of the tunnel</h2>
<p>Every single person I know who’s moved their wiki off Fandom feels like it’s one of the best decisions they’ve ever made. It’s so, so, worth it. Here’s why:</p>

<h3 id="youll-probably-beat-fandom-on-google">You’ll probably beat Fandom on Google</h3>
<p>Whenever I see a Reddit post complaining about Fandom, there’s a bunch of comments about how Fandom has some SEO voodoo that makes them always be the #1 result on Google. Fandom leans in to this narrative in their explanation for why you should keep your wiki there:</p>

<blockquote>
  <p>Forking is a drastic and often difficult move. We would, of course, like you to stay on Fandom! We have the advantage of a large and stable platform, <strong>with strong SEO and a large and dedicated staff team</strong>.</p>
</blockquote>

<p>In my experience, this is pretty overblown. If you do a great job spreading the word, then droves of people will seek out the new wiki on Google, which bumps the click-through rate of the new wiki. In practice, Google is actually quite willing to “rank up” search results that out-perform expectations on click-through rate, and I believe the current consensus is that <a href="https://reddit.com/r/SEO/comments/1m2v45q/is_google_quietly_using_ctr_as_a_ranking_signal/">CTR is the most important Google ranking factor by a wide margin</a>.</p>

<p>So Fandom doesn’t have a moat, you just need to do a great job yelling from the rooftops about the new wiki. Don’t believe me?</p>

<p><img src="/images/posts/google_search_collage.png" alt="Collage of search results for Weird Gloop wikis" width="650" style="display: block; margin: auto;" /></p>

<h3 id="your-wiki-and-community-are-gonna-get-way-better">Your wiki (and community) are gonna get way better</h3>
<p>I noted in 2024 that every wiki we’d moved from Fandom <a href="https://weirdgloop.org/blog/why-were-helping-more-wikis-move-away-from-fandom#why-ditching-fandom-is-cool-and-based">ended up doubling the number of people actively editing</a>. In the 2 years since then, that pattern has not only held up, it’s gotten even bigger: <strong>on average, the rate of new editors on a wiki <a href="https://docs.google.com/spreadsheets/d/1liBtY4biPNWIbeCB9MicN9i8VKTSAQRL/edit?usp=sharing&amp;ouid=101068541543340005104&amp;rtpof=true&amp;sd=true">triples when they move away from Fandom</a></strong>. That’s three times as many people to help with game updates, work on new projects…everything, really. More contributors makes everything easier.</p>

<p>Moving away from Fandom also gives you a ton of new technical flexibility to implement cool shit - integrations that let you <a href="https://minecraft.wiki/w/Minecraft_Wiki:Hey_Wiki">look things up from the game</a>, <a href="https://oldschool.runescape.wiki/w/RuneScape:WikiSync">sync your in-game progress</a> to the wiki…so many examples like this that the readers <em>love</em>, and that are just impossible to imagine happening on a Fandom wiki. Not to mention that once you get the game developer involved in the wiki and trusting the wiki editors, there are usually some very obvious areas for collaboration (like sharing game data, or pre-release access, or even just being open to answering questions) that will make the wiki massively better.</p>

<hr />

<p>Go check out <a href="https://meta.weirdgloop.org/w/Leaving_Fandom">the guide</a>! It’s got a ton of detailed, useful stuff I haven’t talked about here. If you’ve checked it out and want moving advice that’s specific to your situation, we just opened up the <a href="https://weirdgloop.org/discord">Weird Gloop discord</a>, so come say hi there, or <a href="https://weirdgloop.org/contact/">contact me another way</a>.</p>]]></content><author><name>cookmeplox, River</name></author><summary type="html"><![CDATA[Hi wiki friends! We moved the GTA Wiki off Fandom last weekend, so that’s kicked off a new round of wiki admins finding my blog post and asking for advice on how to get their wikis off Fandom too. Since I started offering wiki-moving advice in 2024, somewhere around 100 people have hit us up in private, and I’ve realized we’re saying a lot of the same things over and over, most of which could easily just be somewhere public.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://weirdgloop.org/images/Weird%20Gloop%20square%20logo.png" /><media:content medium="image" url="https://weirdgloop.org/images/Weird%20Gloop%20square%20logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Aggressive AI scrapers are making it kinda suck to run wikis</title><link href="https://weirdgloop.org/blog/clankers" rel="alternate" type="text/html" title="Aggressive AI scrapers are making it kinda suck to run wikis" /><published>2026-03-13T00:00:00+00:00</published><updated>2026-03-13T00:00:00+00:00</updated><id>https://weirdgloop.org/blog/clankers</id><content type="html" xml:base="https://weirdgloop.org/blog/clankers"><![CDATA[<p><img src="/images/posts/clankers_graph1.png" alt="Graph of estimated AI scraper requests per month" width="350" class="float-right" /> Bots are currently scraping the internet for LLM training data at unprecedented rates<a 
href="https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/">[1]</a><a href="https://www.glamelab.org/products/are-ai-bots-knocking-cultural-heritage-offline/">[2]</a><a href="https://www.theverge.com/column/885244/smart-tv-web-crawler-ai">[3]</a>, driving up costs and destabilizing public-facing websites. I want to talk about how <strong>this has been particularly difficult for wikis</strong>, and has gotten much worse in the last few months.
<!--excerpt--></p>

<p>I run <a href="https://weirdgloop.org">Weird Gloop</a>, which hosts some of the biggest video game wikis ever, like <a href="https://minecraft.wiki">Minecraft</a>, <a href="https://oldschool.runescape.wiki/">OSRS</a> and <a href="https://wiki.leagueoflegends.com/">League</a>. Over the last 3 years, we’ve had to spend more and more of our time fighting with this bot traffic that is spiky, disproportionately expensive, and getting harder to distinguish from humans. <strong>If we weren’t constantly mitigating the bots, they would use ~10x more of our compute resources than everything else put together</strong> - even though that “everything else” includes tens of millions of (human) pageviews and tens of thousands of edits a day.</p>

<p>Everyone who runs wikis is dealing with the exact same problem. The Wikimedia Foundation has a <a href="https://diff.wikimedia.org/2025/04/01/how-crawlers-impact-the-operations-of-the-wikimedia-projects/">post</a> about it impacting operations, every major wiki farm has had varying degrees of service outages, and some smaller independent wikis have been knocked completely offline. Overall, I’d guess that about 95% of all server issues in the wiki ecosystem this year have been caused by bad scrapers.</p>

<p>Every wiki sysadmin I’ve talked to is dealing with these specific problems:</p>

<h2 id="the-scrapers-are-pretending-to-be-human-visitors-and-getting-pretty-good-at-it">The scrapers are pretending to be human visitors, and getting pretty good at it</h2>
<p>Most of the discussion I’ve seen about scrapers has focused on bots operated by the major AI companies (GPTBot, ClaudeBot, PerplexityBot, etc). Although these “official” bots have at times <a href="https://www.plagiarismtoday.com/2025/07/23/chatgpt-ignores-robots-txt-rehashes-my-column/">struggled to respect robots.txt</a>, at least they <em>usually</em> properly identify themselves as bots in their User Agent string, which makes it really easy for a website operator to block them with <a href="https://developers.cloudflare.com/bots/additional-configurations/block-ai-bots/">Cloudflare</a>, nginx, or any number of other techniques.</p>
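
<p>A minimal sketch of what User Agent blocking boils down to. The function name and the token list are illustrative assumptions (the three bot names are just the ones mentioned above), and a real setup would usually express this as a Cloudflare or nginx rule rather than application code:</p>

```python
# Hypothetical list of self-identifying crawler tokens. The three names are
# the ones mentioned in the text; real blocklists are longer.
DECLARED_AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def is_declared_ai_bot(user_agent: str) -> bool:
    """Return True if the User-Agent header names a known AI crawler."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in DECLARED_AI_BOTS)


# A crawler that announces itself is trivial to catch...
print(is_declared_ai_bot("Mozilla/5.0 (compatible; GPTBot/1.0)"))
# ...but the same check tells you nothing about a scraper sending a Chrome string.
print(is_declared_ai_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"))
```

<p>The second call is the whole problem in miniature: the check only works for bots that choose to tell the truth.</p>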

<p>The problem is that when webmasters started blocking AI scrapers based on User Agent, it created a massive incentive for bots to pretend to be human traffic, so as to avoid getting blocked. This game of cat and mouse has played out over the last few years, and the bots have gotten pretty darn good at imitating human requests. Now the majority of AI scraper traffic that hits our wikis is carefully crafting the requests, sending the right headers so it can pretend to be recent versions of Google Chrome, which eliminates the obvious “bot or real person” signals that we previously could use to block them.</p>

<h2 id="theyre-using-tens-of-millions-of-ip-addresses">They’re using tens of millions of IP addresses</h2>
<p>Before 2023, if we had a problem with how someone was scraping the wiki, 95% of the time they would only be using a single IP address, or a single datacenter with a small subnet of IPs. So it was mostly effective to block bad actors based on IP or ISP characteristics.</p>
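
<p>For reference, that older style of blocking is roughly the sketch below. The blocklist entries are reserved documentation ranges standing in for a single bad IP and a small datacenter subnet, and the names are made up for illustration:</p>

```python
import ipaddress

# Hypothetical blocklist: one single address and one small datacenter subnet.
# These are reserved documentation ranges, not real offenders.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.17/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if the client address falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked("198.51.100.42"))  # inside the blocked subnet
print(is_blocked("203.0.113.5"))    # unrelated address
```

<p>A list like this stays short and manageable only as long as the offenders stay on a handful of addresses.</p>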

<p>…Enter <a href="https://cloud.google.com/blog/topics/threat-intelligence/disrupting-largest-residential-proxy-network">residential proxies</a>, where anyone with a credit card can get all of their scraping requests “laundered” through a network of millions of IP addresses. The wikis get hit sometimes by scraper runs that cycle through a million IPs a day, and they <em>look like</em> they’re coming from legit places: mostly residential ISPs (Comcast, AT&amp;T, Charter, etc) where the customer probably doesn’t even know their IP is being used as an exit node for a residential proxy.</p>

<p>Beyond residential proxies, a lot of the scraping is happening on IPs that belong to Facebook and Google. Bad actors are able to use <a href="https://datadome.co/threat-research/how-facebook-was-used-as-a-proxy-by-web-scraping-bots/">facebookexternalhit link preview</a> or Google Translate to make the requests happen on Google/Facebook servers, which completely obscures the source of the requests. At times we’ve had to break Google Translate’s URL tool for all our wikis, because 99.99% of the requests coming through it are abusive.</p>

<h2 id="theyre-mostly-crawling-stupid-urls">They’re mostly crawling stupid URLs</h2>

<p>Most of these AI scrapers seem to select their targets in the dumbest way possible:</p>

<ol>
  <li>visit the homepage of the wiki</li>
  <li>visit all the links on that page</li>
  <li>visit all the links on THOSE pages</li>
  <li>…</li>
  <li>repeat until all links are visited</li>
</ol>

<p>They don’t seem to have any awareness that there’s a <a href="https://runescape.wiki/robots.txt">robots.txt</a> and sitemap that tells them which URLs are worth scraping. There’s a reason this is an especially dumb strategy for wikis.</p>

<p>OSRS Wiki has about 40,000 “articles”, so that’s 40,000 URLs that make up the vast majority of the useful information on the site. But once you account for all the <a href="https://oldschool.runescape.wiki/w/Scrambled!?diff=prev&amp;oldid=14947138">old revisions</a>, <a href="https://oldschool.runescape.wiki/w/Mythical_Cape_Store?action=edit&amp;section=1">edit screens</a> and <a href="https://oldschool.runescape.wiki/w/Special:LongPages?limit=50&amp;offset=50">special pages</a> that are used by people editing the wiki, there’s at least a <strong>billion</strong> navigable URLs. That means two things for scrapers hitting wikis:</p>

<ol>
  <li>this naive scraping process is never going to finish</li>
  <li>the vast majority of the requests are not doing anything useful</li>
</ol>

<p><img src="/images/posts/clankers_logs.png" alt="Requests from scrapers using browser User-Agents" width="70%" style="display: block; margin: auto;" /></p>

<p>Most of these URLs can’t possibly be useful data for training an LLM, but it seems like that’s what they’re spending most of their resources on. These weird requests are also unusually expensive for <em>us</em> to serve, since they bypass the <a href="https://www.mediawiki.org/wiki/Manual:Parser_cache">various layers of caching</a> that most requests from real users hit. Cache hits usually take less than 20 milliseconds of processing time, but these weird old diffs can frequently take 1-2 seconds. This means that top-line metrics (“8 million bot requests a day”, “bots are using 65% of my bandwidth”, etc) <strong>seriously undersell</strong> the scope of the problem, because CPU capacity is usually the important bottleneck, and the bot requests with all the weird query parameters are often 50-100x as expensive to serve.</p>
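
<p>To make that concrete, here’s a back-of-the-envelope sketch. The 20 millisecond and 1 second costs come from the figures above; the 10/90 request mix is a made-up assumption for illustration, not a measurement, and it pretends every human request is a cache hit:</p>

```python
def bot_cpu_share(bot_requests, bot_seconds, human_requests, human_seconds):
    """Fraction of total CPU time consumed by bot requests."""
    bot_cpu = bot_requests * bot_seconds
    human_cpu = human_requests * human_seconds
    return bot_cpu / (bot_cpu + human_cpu)


# Hypothetical mix: bots are only 10% of requests by count...
share = bot_cpu_share(
    bot_requests=10, bot_seconds=1.0,       # uncached old diff, low end of 1-2 s
    human_requests=90, human_seconds=0.02,  # cache hit, about 20 ms
)
# ...yet they account for most of the CPU time.
print(f"{share:.0%}")
```

<p>Under those assumptions the bots come out at roughly 85% of CPU time while making up only 10% of the request count, which is why request counts and bandwidth figures understate the load.</p>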

<h2 id="the-worst-bot-traffic-is-very-spiky-so-aggregate-metrics-undersell-the-problem">The worst bot traffic is very spiky, so aggregate metrics undersell the problem</h2>
<p><img src="/images/posts/clankers_es.png" alt="Graph of minute-by-minute requests to minecraft.wiki" width="350" style="display: block; margin: auto;" /></p>

<p>I said earlier that we get about 250 million bot requests per month (about 100 per second), but that’s just the long term average: these scrapers frequently operate in short bursts of 1000+ requests a second, almost indistinguishable from a good old-fashioned DDOS attack. So even though the bots might only be ~50% of our total CPU usage long-term, their abusive traffic spikes are responsible for ~95% of the slowness and outages that wikis have been dealing with.</p>
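
<p>A small sketch of the difference between the average and the bursts, assuming a 30-day month and a plain list of request timestamps in seconds. The function names and the per-second bucketing are illustrative choices, not a description of any particular monitoring stack:</p>

```python
from collections import Counter

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assuming a 30-day month


def average_rate(requests_per_month: int) -> float:
    """Long-term average requests per second."""
    return requests_per_month / SECONDS_PER_MONTH


def burst_seconds(timestamps, threshold):
    """Return the whole seconds in which at least `threshold` requests arrived."""
    per_second = Counter(int(ts) for ts in timestamps)
    return sorted(sec for sec, count in per_second.items() if count >= threshold)


# 250 million a month works out to a bit under 100 requests per second.
print(round(average_rate(250_000_000)))
```

<p>The average looks tame; it’s the seconds returned by something like <code>burst_seconds</code>, at ten times the average or more, that actually knock things over.</p>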

<h2 id="its-not-clear-whos-doing-it">It’s not clear who’s doing it</h2>
<p>I keep calling this bad traffic “AI scrapers”, but because everyone’s pretending to be Google Chrome, I have no idea who’s actually responsible for it, or what they’re doing with all the wiki data they’re slurping up. Is it a data broker? A frontier lab double-dipping? Is it just random independent projects with access to residential proxies? Am I underestimating how low the barrier to entry is now?</p>

<p>I really have no idea. If by some miracle, someone reading this is behind one of these efforts…honestly, send me an email or something. I would love to know what you’re getting out of this, and find a less stupid way for you to do it.</p>

<hr />

<h2 id="whats-worked-for-us">What’s worked for us</h2>
<p>The whole situation sounds kinda grim, right? If this were a sales pitch, this is about the point where I’d switch gears and tell you we have some magic solution to sell you for a million bucks. We don’t. I’m pretty sure it’s just a really hard problem that everyone’s struggling with.</p>

<p>The most common technique is to put your website behind something like Cloudflare challenges or <a href="https://github.com/TecharoHQ/anubis">Anubis</a>, which has become ubiquitous on the internet in the last year. This kind of works, but has two main problems:</p>

<p><img src="/images/posts/clankers_challenge.png" alt="Cloudflare challenge" style="display: block; margin: auto; width: 250px;" /></p>

<ol>
  <li>There will be periods where some bots are able to consistently pass the challenges. We don’t have much insight into what methods they’re using, but I would assume there’s an arms race going on behind-the-scenes between Cloudflare and the bot developers. Cloudflare is winning that battle maybe 90% of the time, but that remaining 10% can be rough.</li>
  <li>Actual readers <strong>hate</strong> having to see the challenges before they get to the wiki. Can you blame them? So ideally you have good heuristic rules that decide what traffic is worth challenging, so most people aren’t impacted…which just gets back to it being hard to reliably detect which traffic is automated.</li>
</ol>

<p>Pretty much everyone has some <strong>handwritten firewall rules</strong> that will be specific to their infra and the attacks they’ve had in the past. Most often these filters will be based on specific User Agent strings, IP groups or ASNs. We do most of this at the Cloudflare/CDN level, but some other wikis do it on the nginx/webserver side instead.</p>

<p>Blocking just on User Agent/IP is, of course, rarely sufficient these days. So over time we’ve had to look at more complicated attributes of the requests - HTTP version, headers, TLS ciphers and <a href="https://github.com/FoxIO-LLC/ja4">ja4</a>-related hashes - to try to find simple rules for which traffic is bots.</p>

<p>One perspective we’ve found really useful is to <strong>look for things humans do in aggregate, that the bots don’t do</strong>. For wikis that run the MediaWiki software (like us), there are many types of HTTP requests that normal people on real browsers often make when they use the wiki, that the bots usually don’t. So if you see some chunk of traffic that you can segment off (based on headers, ja4 hashes, whatever else…), that visits a lot of articles but doesn’t do any of the classic “human” requests, it’s a strong indicator that you could get away with challenging that chunk.</p>
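
<p>Here’s a hedged sketch of that idea. It assumes each log record has already been reduced to a <code>(segment, kind)</code> pair, where the segment is whatever key you grouped on (a ja4 hash, a header combination, etc) and the kind is either <code>"article"</code> or <code>"companion"</code> for one of the “human” request types. Which requests count as companions, and both thresholds, are placeholders you’d have to pick for your own wiki:</p>

```python
from collections import defaultdict


def suspicious_segments(records, min_articles=100, max_companion_ratio=0.05):
    """Flag segments that read many articles but make almost no "human" requests.

    records: iterable of (segment, kind) pairs, kind is "article" or "companion".
    The thresholds are arbitrary placeholders, not tuned values.
    """
    articles = defaultdict(int)
    companions = defaultdict(int)
    for segment, kind in records:
        if kind == "article":
            articles[segment] += 1
        elif kind == "companion":
            companions[segment] += 1

    flagged = []
    for segment, article_count in articles.items():
        if article_count >= min_articles:
            ratio = companions[segment] / article_count
            if max_companion_ratio >= ratio:
                flagged.append(segment)
    return sorted(flagged)
```

<p>Anything this flags is only a candidate for challenging; low-volume segments are skipped entirely because a ratio computed from a handful of requests is mostly noise.</p>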

<p>This technique, of looking at the human-behavior requests that <strong>aren’t</strong> present in the bot traffic, is quite powerful. We started building a system that looks at our “missing” traffic and automatically proposes “decision tree”-based heuristics for which traffic to challenge. In testing so far, it seems to do an excellent job of rooting out nearly all of the scrapers, but I’ve hesitated to let it run unsupervised because we don’t have a clear idea of what sort of false positives it would create for people with unusual browsing habits (NoScript users, screen readers, unexpected types of devices). I also don’t love the idea of building and permanently maintaining our own ML/data analysis infrastructure for this. I’d sure love to focus on making wikis instead.</p>

<p>There’s more exotic techniques out there, too:</p>
<ul>
  <li>I’ve seen people have success identifying residential proxies based on <a href="https://www.osti.gov/servlets/purl/2530825">TCP/TLS timing discrepancies</a></li>
  <li>There are companies out there selling <a href="https://spur.us/platform/residential-proxy-detection">realtime databases of residential proxy IPs</a>, although it’s not clear to me how actionable that is when <a href="https://blog.cloudflare.com/residential-proxy-bot-detection-using-machine-learning/#detecting-residential-proxies-using-network-and-behavioral-signals">most residential proxies are also used by real people at the same time</a>.</li>
  <li>This bot-detection problem feels intuitively like it <strong>should</strong> be solvable at scale, where someone like Cloudflare or the big cloud providers could use the packet-level network info from their absurd amount of traffic to make awesome heuristics…but nobody I’ve talked to has been impressed with the heuristics on any of these commercial bot detection products, including the ones that run to six figures a year.</li>
</ul>

<h2 id="some-other-ideas-that-are-bad-for-readers">Some other ideas that are bad for readers</h2>
<p>There’s a couple “nuclear options” for stopping AI scrapers, but they are much more disruptive for real people. The most common I’ve seen:</p>

<ul>
  <li>Require readers to log in to view any pages that could potentially be expensive to generate. This is what <a href="https://community.fandom.com/wiki/User_blog:Fandom/New_Measures_to_Reduce_Bot_Activity">Fandom did on all their wikis a few months ago</a></li>
  <li>Serve Cloudflare challenges to all traffic</li>
</ul>

<p>Both of these are understandable things to do as a webmaster, but they’re terrible for the long-term health of the wikis and their communities. The main lesson I’ve learned from 16 years of building wiki communities is that <strong>the best thing you can do to attract new contributors to a wiki is to eliminate friction</strong> - make things easier to edit, easier to explore the internals of the wiki, and tear down the barriers-to-entry that separate the readers from the editors.</p>

<p>All of these more extreme anti-bot techniques add new friction, and have predictable consequences - I did a small analysis and found that contributions from new users across Fandom are down about 40% after Fandom’s changes that hid “internal pages” from the &gt;95% of readers that don’t have accounts. It would be hard for me to ever think that’s a worthwhile tradeoff.</p>

<hr />

<h2 id="where-do-we-go-from-here">Where do we go from here?</h2>

<p>Just so people don’t get too bummed out - <em>we’re still doing okay!</em> As much as I wish I could turn the clock back 3 years and never have to deal with this scraper nonsense again…I still love hosting wikis, I love <a href="https://weirdgloop.org/blog/why-were-helping-more-wikis-move-away-from-fandom">helping wikis move off Fandom</a>, and I can’t imagine a scenario where the bots seriously change either part of that. I have longer-term concerns about “AI Overviews” killing the pipeline that turns wiki readers into wiki contributors, but that’s a story for another day.</p>

<p>I’ve had a couple friends half-seriously suggest that the wave of bots might even be <strong>good</strong> for Weird Gloop, because we’re better-than-average at MediaWiki tech stuff, and maybe we benefit because scrapers raised the bar for expertise and time needed to host a wiki. But I think <strong>the internet is worse off if people can’t easily host wikis</strong>. I can imagine a nightmare scenario where you eventually need an oncall rotation or an ML engineer or an enterprise product if you want to host a wiki without getting intermittently splattered by scraper bots…this would be extremely bad news for the independent wiki community overall.</p>

<p>I’m not really sure what the endgame is here. I suspect the arms race between bot owners and webmasters will continue until someone comes up with a clever way to change the structural incentives around scraping. For example, I think <a href="https://developers.cloudflare.com/changelog/post/2026-03-10-br-crawl-endpoint/">Cloudflare’s new crawling API</a> could possibly change the dynamic, if using that API ends up being less effort for the bots than building their own systems that ignore robots.txt and cause problems for us. Of course I’d prefer the scraping just didn’t happen at all, but we probably can’t un-ring that bell.</p>

<p>But! The thing that makes me optimistic is that there are literally <strong>thousands of people out there just like us</strong> - running their websites and finding more and more clever techniques to deal with the bots. I’ve heard some very cool, very specific ideas from other sysadmin folks in private conversations, and I have to assume there’s a lot of discussion happening in random Slacks and Discords and other small groups. But <strong>I wish there was more public discussion about the practicalities and specifics</strong>, because a lot of sysadmins I talk to don’t realize the extent to which their problems are identical to everyone else’s.</p>

<p>I understand not everyone wants to tell the whole world how they’re stopping the bots, because they’re worried they’ll lose whatever edge they have. I have a slight fear that my post here will make our own tactics less effective, but if it helps people put their heads together, it’s worth it.</p>

<p>So with that in mind:</p>

<ul>
  <li>If you’re a sysadmin dealing with AI scrapers in any capacity, consider sharing what’s worked for you, in whatever space makes sense for you</li>
  <li>If you’re a company that is selling a product that purports to solve this bot problem, PLEASE put out more case studies with tangible data about <a href="https://en.wikipedia.org/wiki/Precision_and_recall">precision and recall</a> rates in non-contrived situations. The people making purchasing decisions on this topic aren’t just checking off a box, they really care about the results.</li>
  <li>If you run a wiki (or other indie website) and you wanna talk shop about bot detection, <a href="https://weirdgloop.org/contact/">send me an email or Discord message</a>. I might make a little Discord server if there’s enough interest.</li>
</ul>]]></content><author><name>cookmeplox</name></author><summary type="html"><![CDATA[Bots are currently scraping the internet for LLM training data at unprecedented rates[1][2][3], driving up costs and destabilizing public-facing websites. I want to talk about how this has been particularly difficult for wikis, and has gotten much worse in the last few months.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://weirdgloop.org/images/Weird%20Gloop%20square%20logo.png" /><media:content medium="image" url="https://weirdgloop.org/images/Weird%20Gloop%20square%20logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Why we’re helping more wikis move away from Fandom</title><link href="https://weirdgloop.org/blog/why-were-helping-more-wikis-move-away-from-fandom" rel="alternate" type="text/html" title="Why we’re helping more wikis move away from Fandom" /><published>2024-10-10T00:00:00+00:00</published><updated>2024-10-10T00:00:00+00:00</updated><id>https://weirdgloop.org/blog/why-were-helping-more-wikis-move-away-from-fandom</id><content type="html" xml:base="https://weirdgloop.org/blog/why-were-helping-more-wikis-move-away-from-fandom"><![CDATA[<p>Hi! You may have seen that Weird Gloop is now hosting the <a href="https://wiki.leagueoflegends.com">official League of Legends Wiki</a>. We’ve spent the last couple months working with the Riot folks and the League wiki editors to move it off of Fandom, and turn it into something the players will (hopefully!) really dig. I also love that it got started because one of the Riot guys plays a ton of Old School RuneScape and thinks <a href="https://oldschool.runescape.wiki">our wiki</a> is awesome. How cool is that??</p>

<p>I want this to kick off a new era where communities and developers take control from Fandom, and make some really great wikis. We’ve already been doing a bit of this, starting when we helped the <a href="https://minecraft.wiki">Minecraft Wiki</a> leave Fandom, but I think it’s time for me (and the rest of our group) to be more explicit about what we want to do.
<!--excerpt--></p>

<p>So if you’re any of these things:</p>
<ul>
  <li>A frustrated wiki editor trying to figure out your options</li>
  <li>A community manager trying to get internal support for an official wiki</li>
  <li>Someone contemplating making a new wiki</li>
</ul>

<p><strong>I will give you (free, very specific) advice on how to get your wiki off Fandom, and make a kickass wiki somewhere else</strong>. We might even be able to host you ourselves.</p>

<p>If you think this sounds cool, <a href="/contact">come talk to me</a>.</p>

<ul id="markdown-toc">
  <li><a href="#why-do-we-actually-care" id="markdown-toc-why-do-we-actually-care">Why do we actually care?</a></li>
  <li><a href="#why-ditching-fandom-is-cool-and-based" id="markdown-toc-why-ditching-fandom-is-cool-and-based">Why ditching Fandom is cool and based</a></li>
  <li><a href="#what-im-offering" id="markdown-toc-what-im-offering">What I’m offering</a></li>
  <li><a href="#how-to-not-turn-into-fandom-20-with-these-2-simple-tricks" id="markdown-toc-how-to-not-turn-into-fandom-20-with-these-2-simple-tricks">How to not turn into Fandom 2.0 (with these 2 simple tricks)</a>    <ul>
      <li><a href="#point-1---wiki-communities-need-to-be-able-to-freely-leave-their-host" id="markdown-toc-point-1---wiki-communities-need-to-be-able-to-freely-leave-their-host">Point 1 - wiki communities need to be able to freely leave their host</a></li>
      <li><a href="#point-2---global-branding-is-extremely-negative-value-for-wiki-farms" id="markdown-toc-point-2---global-branding-is-extremely-negative-value-for-wiki-farms">Point 2 - global branding is extremely negative value for wiki farms</a></li>
    </ul>
  </li>
</ul>

<h2 id="why-do-we-actually-care">Why do we actually care?</h2>

<p><a href="https://archive.ph/kwt1b">This post</a> (and many others) have done a much better job than I could, explaining from a reader’s perspective why Fandom is a bad place to host a wiki, but I thought it might be useful to give my take on it as a long-time wiki editor.</p>

<p>I love wikis. I think it’s unbelievably cool that this completely insane idea (“what if we just had a website that anyone can edit?”) doesn’t descend into anarchy, and instead self-organizes into a fun, project-oriented community. I think that despite its flaws, Wikipedia is the single coolest thing the internet has ever done. And wikis on niche topics feel like some of the last remnants of a friendlier, more collaborative, early 2000s web. I loved contributing to wikis, building something with other people, and feeling a sense of ownership (and pride that so many people were using stuff I made).</p>

<p>Which is why it’s so concerning that Fandom has taken this wonderful concept and turned it into one of the most dreadful parts of the internet. Being deeply involved with the RuneScape Wiki on Fandom had a huge psychological cost – what wonderful thing did they add today that made our wiki harder to use? Scammy green link ads? Comically bad videos on the top of our most popular pages? Garbage AI-generated Q&amp;A? Ads that take up literally 100% of the content window?</p>

<p>I (and so many others) had spent countless nights trying to make the best possible resource for RuneScape, and it was brutal to realize that it didn’t matter how hard we worked or how creative we were – our wiki was never gonna be that great, because Fandom was in charge. That sense of ownership and pride…slowly turned into feeling like my passion was being exploited by a company that didn’t want the same things I did.</p>

<p>We weren’t the only ones feeling this way, of course – some wiki communities got fed up and moved somewhere independent. But here’s the key thing you need to understand: even when a wiki community unanimously wants to leave, Fandom keeps their copy of the wiki up, even though it no longer has a community. Google remembers years of people searching, linking, and visiting the Fandom wiki URLs, and continues to rank the increasingly stale Fandom results first. Since roughly 85% of a wiki’s traffic comes from Google, it’s nearly impossible for the new wiki to win without fixing this ranking disparity. It’s an extremely draining thing to do – nobody likes to spend their waking hours competing against the thing they helped lovingly build.</p>

<p>Historically, independent wikis have had an extremely hard time winning this battle. Most of the traffic stayed on the Fandom wiki, and the independent wikis often fizzled out. This had a chilling effect on the remaining communities, and emboldened Fandom to further prioritize revenue extraction.</p>

<p>That’s the key takeaway: <strong>if leaving Fandom was easy, they wouldn’t be able to enshittify as much as they have</strong>.</p>

<p>But don’t lose hope! Google has gotten much friendlier to independent wikis over the last decade. With a large, sustained effort, we were able to recover 95% of RuneScape Wiki traffic within the first year.</p>

<h2 id="why-ditching-fandom-is-cool-and-based">Why ditching Fandom is cool and based</h2>

<p>The main advantage of leaving Fandom is likely clear to anyone who’s ever visited one of their wikis without an ad blocker. But there’s more than that! When you have a site that people are happy to go to (instead of something they’re forced to grimace and use), you get all these wonderful secondary effects that are worth mentioning.</p>

<p>For starters: on average, moving away from Fandom doubles the number of people editing. I’ve seen the pattern across every wiki we’ve ever moved off Fandom, but here’s a pretty striking graph from OSRS Wiki:</p>

<p><img src="/images/posts/osrs-edit-count.png" alt="Graph of OSRS editors by editcount per month" width="650" style="display: block; margin: auto;" /></p>

<p>It’s incredibly consistent: way more people show up and want to help, when they feel like they’re contributing to something that isn’t taking advantage of them.</p>

<p>It’s not a coincidence that OSRS Wiki got really good once we left Fandom in 2018. Once we had way more people wanting to contribute (and the only objective was “make the best possible wiki for the game”) the wiki magically got way better! Crazy!</p>

<p>Departing from Fandom has also opened the door for a number of custom technical projects that otherwise would have been downright impossible to implement on the old wiki. <a href="https://oldschool.runescape.wiki/w/RuneScape:Lookup">In-game item lookup</a>, <a href="https://oldschool.runescape.wiki/w/RuneScape:WikiSync">WikiSync</a> and <a href="https://prices.runescape.wiki/osrs/">real-time prices</a> are core parts of our offering now, with hundreds of thousands of users. They’re all made possible by the new flexibility we gained when we took control of the hosting.</p>

<h2 id="what-im-offering">What I’m offering</h2>

<p>I think a lot of people would love to get their wiki off Fandom, but it’s extremely not obvious what that even involves, so it’s hard to formulate a plan. <strong>I will help you figure out a viable, detailed strategy for you to get your wiki off Fandom, and bring the traffic along</strong>.</p>

<p>In the next couple weeks, we’ll be posting some general advice on this blog that goes through the main steps and pitfalls involved with leaving Fandom. Most of it should be broadly applicable, but the real power comes from looking at the specifics of your topic (how big is it? does it change frequently? is it a game? are you the rights-holder?) and tailoring the plan to fit.</p>

<p>As far as where you host it…there’s plenty of decent options. Wiki hosting is not nearly as hard as Fandom makes it out to be – for example, if you’re the Path of Exile devs and you already host a bunch of PHP web stuff, then hosting the wiki yourself is objectively a really good option.</p>

<p>Sometimes Weird Gloop will be the good option for your situation, and being totally honest, sometimes it won’t be. And that’s okay! I want to help communities get away from Fandom, regardless of who’s running the servers.</p>

<p>I will say, I don’t think we would ever do a “self-service” thing where you could just sign up and immediately make a wiki. We want to do projects where we get to know the community, and closely support every wiki we host.</p>

<h2 id="how-to-not-turn-into-fandom-20-with-these-2-simple-tricks">How to not turn into Fandom 2.0 (with these 2 simple tricks)</h2>

<p>As we’ve started hosting more wikis besides RuneScape, some people have asked a pretty reasonable question: what’s stopping us from eventually getting enshittified, just like Fandom (or the other wiki farms that eventually sold to Fandom)?</p>

<p>From my perspective, there are two key choices that Fandom made that have had major negative consequences for communities. And we’re just going to do the exact opposite on both points.</p>

<h3 id="point-1---wiki-communities-need-to-be-able-to-freely-leave-their-host">Point 1 - wiki communities need to be able to freely leave their host</h3>

<p>You can probably tell that I think wiki editors (as opposed to hosts) are the ones who create the vast majority of the value on a wiki. So the premise is simple:</p>

<p><strong>If a wiki community is unhappy, and they have a better option somewhere else, they should be able to leave and take their stuff with them</strong>. We won’t prop up the old wiki, Weekend-at-Bernie’s style, abusing the dominant Google position that the wiki editors built up while they were on our platform.</p>

<p>In my opinion, <em>this is really the only rule that matters</em>. If you have the ability to leave (and take your revenue-driving wiki with you) when things go to shit, then your host has an extremely strong incentive to not let things completely go to shit.</p>

<p>There’s a long history of wiki farms vaguely handwaving that they’d agree to something like this, and then backtracking later. So why believe us?</p>

<p>It helps that Weird Gloop literally only exists because we were on the losing end of this sort of situation with Fandom back in 2018, and that we have no outside investors or debt (the company’s owned by wiki nerds)…but I don’t think that’s convincing enough on its own. So we’ve been voluntarily entering into agreements with the wikis we host (<a href="https://meta.minecraft.wiki/w/Memorandum_of_Understanding_with_Weird_Gloop">here’s an example</a>) where we set very clear obligations for what happens if the wiki community wants to go somewhere else (hint: it’s all about the domain). If we ever start going down the same path as Fandom, everyone can just leave! I would love to see other wiki platforms start to do this, because I think it’s the only way you really solve the problem.</p>

<h3 id="point-2---global-branding-is-extremely-negative-value-for-wiki-farms">Point 2 - global branding is extremely negative value for wiki farms</h3>

<p>If you go to any page on a Fandom wiki, even if you’ve got an ad blocker…you’ll be greeted by an absurd amount of Fandom-related branding: a gaudy sidebar that links to Fan Central (whatever that is), a bunch of other links to wikis that aren’t relevant to you, and buttons to follow Fandom on Instagram and TikTok or to take “Fan Quizzes”. The brand strategy seems like it was cooked up by a bunch of market researchers who think that people are fans of…media properties in general? It’s super cringey and totally irrelevant to the people who are on a Fandom wiki to, say, look up the stats of a new pickaxe they got.</p>

<p>It’s easy to laugh about how bad the branding and identity are, but there’s a bigger issue: the fact that it’s so overwhelmingly branded as “Fandom” (as opposed to, say, the Warframe Wiki) makes it way harder for each of the individual wikis to develop a public identity, because anything they do will be subordinate to the (very loud) global brand. These individual wikis are the only popular thing that Fandom has ever operated, <strong>and the focus on global branding makes each individual wiki worse</strong>.</p>

<p>Our position: the actual wikis should be front and center, because it’s way more important for the wiki itself to have a great reputation, rather than sucking all the oxygen out to make sure people know who owns the platform. We have extremely minimal branding (<a href="https://minecraft.wiki/">can you even find it?</a>), and I can’t imagine ever trying to put wikis on subdomains of weirdgloop.org (or anywhere else) unless there were no decent domain options. We don’t actually get anything out of everyday readers knowing who we are.</p>

<hr />

<p>That’s all I’ve got right now. If you liked this and want to talk to me about wiki things, please <a href="/contact">come say hi</a> – it doesn’t matter if you have a big wiki or a small wiki (or no wiki at all!) – I really just love talking to people about this stuff.</p>]]></content><author><name>cookmeplox</name></author><summary type="html"><![CDATA[Hi! You may have seen that Weird Gloop is now hosting the official League of Legends Wiki. We’ve spent the last couple months working with the Riot folks and the League wiki editors to move it off of Fandom, and turn it into something the players will (hopefully!) really dig. I also love that it got started because one of the Riot guys plays a ton of Old School RuneScape and thinks our wiki is awesome. How cool is that?? I want this to kick off a new era where communities and developers take control from Fandom, and make some really great wikis. We’ve already been doing a bit of this, starting when we helped the Minecraft Wiki leave Fandom, but I think it’s time for me (and the rest of our group) to be more explicit about what we want to do.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://weirdgloop.org/images/Weird%20Gloop%20square%20logo.png" /><media:content medium="image" url="https://weirdgloop.org/images/Weird%20Gloop%20square%20logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>