Wikipedia:Salting is usually a bad idea

There is a saying among criminals: "Locks keep an honest person honest."[1] What they mean by this is that no lock will stop a sufficiently determined person from picking it... or, failing that, from taking an axe to the door or throwing a brick through the adjacent window.

The same is true of salting a page on Wikipedia (restricting certain categories of user from creating it): like a lock on one's front door, it will keep out curious good-faith parties and drive-by vandals, but it will not keep out a determined attacker. It will only make them harder to find. The same is true, in most cases, for adding terms to the title blacklist, and in many cases for adding terms to the edit filter. It is easier to watch a specific known honeypot page that tends to attract bad edits than to watch all the other pages that might be used if the original target is protected.

Salting as "super-deletion"

The logic for salting a page (or blacklisting a pattern) is often one of "super-deletion": deleting the page hasn't been enough to deter a bad actor, so we should make them unable to create it at all. The problem is that there is an effectively[2] infinite number of potential titles, and an effectively infinite number of ways to work around a salting. Suppose you were tasked with guarding an infinite number of doors. If someone opens one of these doors, you can close it, and then you have two choices: you can put a lock on the door, or you can put a silent alarm on it. You look at the door's logs. It's been opened and closed four times now. The intruder is clearly interested in this one door. You look to your right and left and back and front and see, again, an effectively infinite number of nearly identical doors. Do you lock this door, pushing the intruder toward all the others? Or do you set an alarm, quietly walk away from the door, and hope that they'll stick to just this door?
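For a rough sense of what "effectively infinite" means here, the figures from note 2 can be plugged into a few lines of Python. This is only a back-of-the-envelope sketch: it assumes every Unicode 15 character is usable at every position in a title, which overstates the true count, and the names `possible_titles` and `order_of_magnitude` are made up for this illustration.

```python
# Figures taken from note 2. Treating every character as valid in every
# position is a deliberate overestimate, used only to show the scale.
UNICODE_15_CHARACTERS = 149_186
MAX_TITLE_LENGTH = 256

GOOGOL = 10 ** 100
IPV6_ADDRESSES = 2 ** 128

# Count only titles of exactly the maximum length; shorter titles add more.
possible_titles = UNICODE_15_CHARACTERS ** MAX_TITLE_LENGTH


def order_of_magnitude(n: int) -> int:
    """Return the exponent e such that 10**e <= n < 10**(e + 1)."""
    return len(str(n)) - 1


print(order_of_magnitude(possible_titles))  # 1324
print(order_of_magnitude(IPV6_ADDRESSES))   # 38
print(IPV6_ADDRESSES // GOOGOL)             # 0: IPv6 "rounds down to zero"
```

Python integers are arbitrary-precision, so the count is computed exactly rather than estimated with floating point.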

Silent alarms instead of locks

The obvious way to set a "silent alarm" is to watchlist a page. This works if you are very active and check your watchlist regularly. A more advanced approach is to set a "stalk bot" on IRC to notify you if a page matching a certain regex is created. This has the added benefit of potentially getting ahead of attempts to bypass scrutiny (for instance, being notified on the creation of any page containing a particular substring). There is also the option of setting a log-only abuse filter to trip when a page matching either an exact title or a regex is created, and then setting DatBot to report to AIV if this happens.
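The regex-based alarms described above share one idea: match every newly created title against a pattern and report the hits instead of blocking them. The sketch below illustrates only that idea in plain Python. It is not how any particular stalk bot, abuse filter, or DatBot is implemented, and the function name `titles_to_flag`, the pattern, and the sample titles are all hypothetical.

```python
import re
from typing import Iterable, List


def titles_to_flag(created_titles: Iterable[str], watch_pattern: str) -> List[str]:
    """Return the newly created titles that match the watch pattern.

    Nothing is blocked here: matching titles are only reported, which is
    the "silent alarm" behavior described above.
    """
    watcher = re.compile(watch_pattern, re.IGNORECASE)
    return [title for title in created_titles if watcher.search(title)]


# Hypothetical page-creation feed and watch pattern.
new_pages = [
    "Example Widgets",
    "Draft:ExampleWidgets Inc.",
    "Example Widgets (company)",
    "History of pottery",
]
flagged = titles_to_flag(new_pages, r"example\s*widgets")
print(flagged)
```

Because the pattern is a substring match rather than an exact title, it also catches the retitled variants that an attacker might use to bypass scrutiny.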

All of the "alarms" above stop working when an LTA's target page is salted. Salting actively makes LTAs harder to catch, without meaningfully hindering their abuse.

When salting is a good idea

The key question is whether it seems more important to the repeat-creators to add their content at a specific title, or just anywhere they can. In the former case, salting may in fact be a good idea. Examples include:

  • When you really do want to keep an honest person honest. This applies to two cases:
    • Pages that are frequently created accidentally or due to a good-faith misunderstanding, such as generic file names (e.g. File:pic.jpg) or titles used as examples (e.g. Wikipedia:Sockpuppet investigations/SOCKMASTER).
    • Pages that have been deleted at AfD or by CSD A7 deletion but repeatedly recreated in good faith. Sometimes someone might not notice the previous deletion, or might not realize its implications, and in these cases a salting may deter them.
  • Drive-by vandalism. A page that attracts vandalism from unrelated low-effort vandals, such as Pooop, makes sense to salt. Since the restriction of article creation to autoconfirmed accounts (effectively semi-salting all of mainspace), this is needed less often, but may occasionally come up in other namespaces (e.g. Draft:Poop).
  • Corporate spam. Corporate spammers often care about their spam subject's name being represented literatim, especially if it is a URL or legal company name. Forcing them to use workarounds may well scare them off. There are also times that spambots for one reason or another lock on to a specific non-mainspace title; salting makes sense there too.
    However, spammers representing a person are among the most willing to try endless workarounds, to the extent that some people have made up new professional aliases just so they could use them as new salting workarounds.
  • Cases where an LTA does seem to have a fixation on a particular page and nothing else. This one is a bit of a gamble, since if you guess wrong you may send the LTA scurrying for workarounds. But there definitely are cases where salting a page that an LTA is fixated on has deterred them, particularly with low-competence LTAs, and particularly if title-blacklisting or edit-filtering is used rather than standard salting (see below).
  • Egregiously inappropriate titles, of the sort where creation and deletion need to be log-deleted or suppressed. Sometimes in these cases it may be worth it to play Whac-a-Mole with an LTA. However, note that all protected titles are listed at Special:ProtectedTitles, even if you log-delete the protection,[3] so there is some downside to this.

Further considerations

Title blacklisting

Above, salting and title-blacklisting have been discussed together, but there are some differences with the blacklist. Most significantly, the blacklist matches regexes, meaning that you can counter specific workarounds (e.g. (f|ph)[oO0u]+ for an LTA creating variants of foo). This may deter lower-skill LTAs. But, as with salting, there is an effectively infinite number of ways to bypass such filtering. Furthermore, the title blacklist is public, so, unlike some complex edit filters, this will not even be a difficult problem for the determined attacker to solve.
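To make the bypass problem concrete, here is a small sketch that runs the example pattern above against a few variants of foo. It uses Python's `re` module rather than the title blacklist's own matching engine, and both the case-insensitive flag and the sample titles are assumptions chosen for illustration.

```python
import re

# The example pattern from the paragraph above. IGNORECASE is an assumption
# made for this sketch, not a statement about how the blacklist is configured.
BLACKLIST_PATTERN = re.compile(r"(f|ph)[oO0u]+", re.IGNORECASE)


def is_blocked(title: str) -> bool:
    """Return True if the pattern matches anywhere in the title."""
    return BLACKLIST_PATTERN.search(title) is not None


# Variants the pattern anticipates.
caught = ["foo", "phoo", "f00", "Fuuu"]
# Trivial workarounds it does not: spacing, punctuation, and a lookalike
# Cyrillic "o" (U+043E) in place of the Latin letter.
missed = ["f o o", "f-o-o", "f\u043e\u043e"]

for title in caught + missed:
    print(f"{title!r}: {'blocked' if is_blocked(title) else 'allowed'}")
```

Each missed variant could be patched with a longer pattern, but every patch leaves further workarounds open, which is the point of the paragraph above.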

Edit-filtering

The edit filter has the most versatile set of options for preventing usage of a term in titles, and has the benefit of a "private" mode that can only be viewed by admins and a few other highly trusted groups. If one is going to filter out a term, this is usually the best option, and this essay stops short of condemning its use as strongly as it condemns salting and title blacklisting. However, as with the other two, a sufficiently determined attacker will still always win. For this reason, as discussed above, it is often better to set filters to private log-only, and let attackers create pages and be instantly reported to AIV.

Other forms of protection

Some of the above can also be applied to protection of existing pages, particularly those outside of mainspace. In mainspace, the reader always comes first, but if an LTA is fixated on vandalizing some low-visibility projectspace page or spammy draft, and the vandalism is not particularly obnoxious, it may often be better to leave the page unprotected and let it serve as a honeypot.

Notes

  1. ^ Source: A convicted armed robber who worked lighting for stage productions at the essayist's high school—said moments before he latch-slipped the lock to an off-limits area of the theater, using only a plastic plate he'd cut into thirds.
  2. ^ Where there are 149,186 valid characters in Unicode 15, and a title can run up to 256 characters, $149{,}186^{256} \approx 3 \times 10^{1324}$. Factoring in technically invalid and blacklisted character combinations would bump this down a fair bit, but the end result would still be a mind-bogglingly large number. Even $10^{100}$ (one googol) is so large a number that the number of IPv6 addresses, designed to be effectively infinite, rounds down to zero if taken as a fraction of it. See Orders of magnitude (numbers) for further context of just how incredibly large a googol is, let alone $149{,}186^{256}$.
  3. ^ See inclusion of testwiki:User:Tamzin/salting test at testwiki ProtectedTitles.