Many years ago when I was studying statistics we had a project where we built an email spam filter using statistical methods. Filtering out spam is a difficult task, and even back then it required a statistical approach. Since then the volume of spam has exploded, and today vastly more spam emails than real emails are attempted each day.
Email was originally developed as a system that assumed trust, especially on the side of the sending party. The initial design of email had no explicit focus on preventing abuse, a problem that grew worse with each year of email adoption. Spammers and scammers have tried many ways to bypass anti-spam mechanisms, which has led many people to tackle the question of "how can we prevent abuse in an open system?"
There have been a lot of geopolitical events in the last month, and I've noticed an absolute tsunami of fake content being pushed to sway public opinion. Often this involves fake accounts inflating the "engagement" numbers on existing posts, but I've also seen a lot of email spam. What this all has in common is the abuse of systems that were designed to be open, by groups that have something to gain from hijacking the attention of those systems' users with content those users did not ask to see.
Why we have to use statistical models to filter out spam
Even in the days when only the ASCII character set was available, there were many ways to evade simple list-based approaches to banning words.
For example, let's say you are running the email service for an organization and you are facing a huge number of spam emails along the lines of "buy pharmaceuticals online". A simplistic approach would be to block the keyword "pharmaceuticals", but spammers can respond in a number of annoying ways, such as substituting visually similar characters to produce "pharmaceutica1s" or similar.
Once you consider Unicode, there is a huge number of ways to substitute characters so that the text remains human readable but is never quite the same twice, evading any blocked keyword. As you can see, a list of blocked keywords alone is insufficient to prevent spam in the modern era.
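To make this concrete, here is a toy sketch of a keyword blocklist being defeated by character substitution. The blocklist, the substitution table, and the function names are all invented for illustration:

```python
# A naive keyword blocklist, and a character substitution that defeats it.
BLOCKED = {"pharmaceuticals"}

def naive_filter(text: str) -> bool:
    """Return True if the text should be blocked."""
    return any(word in text.lower() for word in BLOCKED)

# Spammers substitute visually similar characters to slip past the list.
# Note the last entry maps Latin "a" to the Cyrillic letter "а".
HOMOGLYPHS = {"l": "1", "o": "0", "a": "\u0430"}

def evade(word: str) -> str:
    return "".join(HOMOGLYPHS.get(c, c) for c in word)

print(naive_filter("buy pharmaceuticals online"))                   # True: blocked
print(naive_filter("buy " + evade("pharmaceuticals") + " online"))  # False: slips through
```

A human still reads the evaded word as "pharmaceuticals", but to an exact-match filter it is a completely different string, and there are combinatorially many such variants.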
As more spammers and scammers deliberately structure their operations to evade detection, a more sophisticated approach is required to filter out spam. You want to ensure that real emails don't get deleted while filtering out as many spam emails as possible; in other words, the false positive rate for spam detection matters a lot. As time has gone on, the false negative rate has also come under pressure, since vastly more spam emails are sent than real ones.
The way these statistical filters work is that they automatically scan the contents of an email and assign it a score that rates how likely the model considers the email to be real. Emails whose score indicates spam get moved automatically to the spam folder.
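A classic way to produce such a score is a naive Bayes word model. The sketch below is a minimal version with made-up training data; real filters use large corpora, better tokenization, and many additional features:

```python
import math
from collections import Counter

# Tiny hypothetical training sets, for illustration only.
spam_docs = ["buy pharmaceuticals online now", "cheap pharmaceuticals buy now"]
ham_docs  = ["meeting notes attached", "lunch at noon attached agenda"]

def train(docs):
    counts = Counter(w for d in docs for w in d.split())
    return counts, sum(counts.values())

spam_counts, spam_total = train(spam_docs)
ham_counts, ham_total = train(ham_docs)
vocab = set(spam_counts) | set(ham_counts)

def log_prob(word, counts, total):
    # Laplace smoothing so unseen words don't zero out the score.
    return math.log((counts[word] + 1) / (total + len(vocab)))

def spam_score(text):
    """Log-odds of spam vs ham; positive means 'more likely spam'."""
    return sum(
        log_prob(w, spam_counts, spam_total) - log_prob(w, ham_counts, ham_total)
        for w in text.split()
    )

print(spam_score("buy pharmaceuticals") > 0)  # True: scored as spam
print(spam_score("meeting notes") > 0)        # False: scored as real
```

The filter then compares the score against a threshold; where you set that threshold is exactly the false-positive versus false-negative trade-off described above.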
Anomaly detection and baseline usage
A way you might think of setting up a system to prevent abuse is to use various anomaly detection techniques.
The idea goes something like this: if you have a lot of regular use, you can establish a baseline for what "real" use looks like, and then spam will show up as an anomaly that can be filtered out.
What's the statistical problem here?
The main problem is that if the dominant way the system is used is fraudulent, then the real users will be picked up as the anomaly.
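Here is a toy illustration of this inversion, using a deliberately simple anomaly rule (flag any message whose exact text is rare in the observed traffic). The traffic mix and threshold are invented for the example:

```python
from collections import Counter

# Hypothetical traffic where spam makes up 90% of all messages.
spam = "buy pharmaceuticals online"
traffic = [spam] * 90 + ["meeting notes attached"] * 5 + ["lunch at noon?"] * 5

counts = Counter(traffic)

def is_anomaly(msg, threshold=0.1):
    """Flag a message if it is less than 10% of observed traffic."""
    return counts[msg] / len(traffic) < threshold

print(is_anomaly(spam))                      # False: spam IS the baseline
print(is_anomaly("meeting notes attached"))  # True: real mail gets flagged
```

Because the spam dominates the traffic, it defines the baseline, and it is the legitimate messages that get flagged as anomalous.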
Another recent issue for content moderation is that there are teams of people who are paid essentially to spam sites, as well as AI versions of the same.
All in all, there's so much fraudulent content that using the "average" email or post as a baseline for "good" use is no longer possible on many platforms. For example, if I enabled comments on this blog I'd get vastly more spam comments than real comments. Some have called this situation the great statistical moderation inversion.
Some companies predicted this issue and acted accordingly. However, many companies don't have staff with enough grounding in statistical concepts to understand that the issue exists.
I predict that this problem is going to grow substantially over the next few years. The price of relying on statistical models that your company doesn't understand will rise as the failure modes of those models get exposed, whether through deliberate adversarial attack or incidentally as patterns of use change. Large language models and advances in generative content creation are likely to drastically reduce the cost of spam, and as a result more spam is to be expected.
Other approaches to filtering out spam emails
Email uses the Simple Mail Transfer Protocol, SMTP, which became popular in the early 1980s. The total number of email users back then was orders of magnitude smaller than it is today in 2023, so the economics of spamming and scamming weren't so profitable at the beginning. The protocol wasn't designed with countermeasures against abuse because there just wasn't much abuse of email back then. This is in stark contrast to today, where vastly more spam email is sent than real email.
As spam became more and more crippling for legitimate use of email as a communications tool, it became obvious that protocol-level improvements were needed if email were to remain viable. Because SMTP has been around for so long, extremely strong network effects have kept it in use; essentially a path dependency has arisen where, despite the protocol's problems, it would be hard to migrate such a large user base to something else. Defenses against spam had to be built in, and this is why SPF and DKIM were invented.
An example of where SMTP implicitly trusts users at the design level is that any computer can send email claiming to be from any source address. This is very useful in some situations: say your company has automated systems that send email alerts, it is highly convenient to set up something like sendmail to send emails to whoever needs to receive them, with the return address being something like a support email address. But this convenience is equally convenient for spammers and scammers, who can send emails claiming to be from other addresses.
SPF - Sender Policy Framework
SPF ensures that the sending mail server is authorized to originate mail for the sender's domain. If the server isn't allowed to send mail for that domain, receiving mail servers can silently drop the message as spam.
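The domain owner publishes an SPF policy as a DNS TXT record, and the receiving server checks the connecting IP against it. The sketch below evaluates a hypothetical record against an IP; real SPF (RFC 7208) also handles include:, redirect=, a/mx mechanisms, macros, and DNS lookup limits:

```python
import ipaddress

# Simplified SPF evaluation: check the connecting IP against a TXT record.
def check_spf(record: str, sending_ip: str) -> str:
    ip = ipaddress.ip_address(sending_ip)
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        if term.startswith("ip4:"):
            if ip in ipaddress.ip_network(term[4:]):
                return "pass"
        elif term in ("-all", "~all"):
            # -all: hard fail (drop), ~all: soft fail (treat as suspicious)
            return "fail" if term == "-all" else "softfail"
    return "neutral"

# Hypothetical record for a domain whose mail servers live in 192.0.2.0/24.
record = "v=spf1 ip4:192.0.2.0/24 -all"
print(check_spf(record, "192.0.2.10"))    # pass: authorized server
print(check_spf(record, "198.51.100.7"))  # fail: unauthorized, can be dropped
```

The key point is that the policy is published by the domain owner, so a spammer's machine can no longer originate mail for that domain without failing the check.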
DKIM - DomainKeys Identified Mail
This is a technique used to validate that an email was sent by who it claims to be sent by. This matters because it is possible to construct an email that claims to be from a certain sender without it actually originating from that sender; scammers in particular abuse this to make emails that look like they come from official addresses but are really sent by the scammers. DKIM uses cryptography to ensure both that the sender of the email is who it claims to be AND that the message hasn't been modified in flight.
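The flavor of the scheme: the sending domain signs selected headers plus a hash of the body, and receivers verify the signature. Real DKIM (RFC 6376) uses public-key signatures (RSA or Ed25519) with the public key published in DNS; this sketch substitutes a shared-secret HMAC just to keep it dependency-free, and the key and messages are invented:

```python
import hashlib
import hmac

DOMAIN_KEY = b"example.com-selector1-secret"  # hypothetical signing key

def sign(headers: str, body: str) -> str:
    """Sign the headers together with a hash of the body."""
    body_hash = hashlib.sha256(body.encode()).hexdigest()
    msg = (headers + body_hash).encode()
    return hmac.new(DOMAIN_KEY, msg, hashlib.sha256).hexdigest()

def verify(headers: str, body: str, signature: str) -> bool:
    return hmac.compare_digest(sign(headers, body), signature)

sig = sign("From: support@example.com", "Your invoice is attached.")
print(verify("From: support@example.com", "Your invoice is attached.", sig))  # True
# Any in-flight modification breaks the signature:
print(verify("From: support@example.com", "Your invoice is att4ched.", sig))  # False
```

Because the body hash is inside the signed data, tampering with either the headers or the body invalidates the signature, which is exactly the "not modified in flight" guarantee described above.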
DMARC - Domain-based Message Authentication, Reporting and Conformance
Checks that the domain in the visible From: address aligns with the domain that SPF or DKIM actually authenticated, and lets domain owners publish a policy telling receivers what to do when the check fails. This is an extension of SPF and DKIM.
Some concluding thoughts
Much of what we saw with email decades ago we are now seeing across the broader internet. With vastly more people online, the number of potential targets for spamming and scamming has greatly increased, shifting the economics in a way that has made certain unethical activities far more lucrative.
Open systems that were built on trust are frequently under attack, since the rewards for attacking them have grown. Unfortunately, this is making these systems, including the phone system, less and less usable for people with legitimate needs.
Broadly speaking, we are starting to see more fake content than real content generated in many spheres. This has happened in part because sending fake or fraudulent content has become much cheaper. Once upon a time international phone calls were costly, so scams based on placing many international calls just weren't financially feasible. Globalization combined with cheaper telecommunications has enabled geo-arbitrage: scamming citizens of more prosperous nations using workers in less prosperous nations lets you use cheap labor while targeting a wealthier demographic far away. You could also employ cheap labor to create fake content, or, more recently, use some sort of generative AI to do it.
The advent of generative AI really shifts the balance here, because you no longer need humans doing the work of creating spam content. This will likely reduce the costs of spam and scams dramatically. Part of the broader problem is that many people still don't understand the full extent to which fake content generation has expanded, and even fewer understand the shifts in the economics of spamming and scamming that newer technologies make possible.
As time goes on we may need to introduce measures like those added to email into other systems. For example, almost everyone I know in 2023 receives more spam and scam phone calls than real ones. Being able to cryptographically verify that the number calling me is the number I think it is would be a great help, since it would allow some filtering of these calls at the protocol level. Most scam calls don't reveal the number they are calling from, and a disturbing new trend is spoofing numbers from address books, using information stolen from phones via apps that exist only to collect data for scammers. For various reasons changes to the calling system are still being resisted, but much like email, I think it has to change if we want phone calls to remain useful. The telephone companies do know whom to bill for call and bandwidth charges, so I suspect the issue is that money is being made by letting spam through: if you charge based on call volume and many of those calls are fraudulent, then cutting scam and spam calls might really cut into profit margins, depending on the business model.