AI and problems of scale

Published on April 29, 2024 by Benedict Evans

There’s a story in one of Georges Simenon’s 1930s detective novels that I think about sometimes when talking about a certain kind of AI problem. Simenon’s hero, Inspector Maigret of the judicial police in Paris, scares a witness, and then goes across the street to a café and calls the telephone exchange. He tells them that someone will place a call from the ‘Pelican’ nightclub to Cannes: they are to hold the call until he gets there. Then he takes a taxi to the exchange, where they are indeed holding the call, and listens in.

I told this story to someone at a Three Letter Agency a few years ago, and got a wry smile - they can’t really do that now, but there are other things that they could do, and they could do them at the scale not of one phone call but of millions. That seems different. We accept the police listening to phone calls one at a time, with a warrant, but not listening to all of them, all of the time.

Something similar comes up when we talk about AI and face recognition by law enforcement today. We’re all (I think) comfortable with the idea of ‘Wanted’ posters. We understand that the police put them up in their offices, and maybe have some on the dashboard of their patrol car. In parallel, we have a pretty wide deployment today of licence plate recognition cameras for law enforcement (or just tolls), and no-one has really noticed. Meanwhile, public and private surveillance cameras have become a basic investigative tool, with a little more concern. But what if every police patrol car had a bank of cameras that scanned not just every number plate but every face within a hundred yards against a national database of outstanding warrants? What if the cameras in the subway did that? All the connected cameras in the city? China is already trying to do this, and we seem to be pretty sure we don’t like that, but why? One could argue that there’s no difference in principle, only in scale, but a change in scale can itself be a change in principle.

We had a lot of these kinds of puzzles with the rise of databases in the 1960s and 1970s. Things that had always been possible in theory at a small scale became practical at a massive scale, and people wrote books about the threats this posed. There was some panic in this, but some of the arguments were entirely correct and remain concerns today. Automation changes things, though our reactions to that can be hard to predict. In United States v Jones (2012), the court held that the police cannot put a GPS tracker on a suspect’s car without a warrant, though they would not need a warrant to follow the suspect around manually, the old-fashioned way. Is it that we don’t want it, or that we don’t want it to be too easy, or too automated? At the extreme, the US firearms agency is banned from storing gun records in a searchable database - everything has to be analogue, and searched by hand. There’s something about the automation itself that might change things. And there was a time when people told the music industry that Napster was the same as mix tapes - that didn’t stop recorded music revenue falling by more than half from 2000 to 2014 (before streaming changed everything).

Generative AI is now creating a lot of new examples of scale itself as a difference in principle. You could look at the emergent abuse of AI image generators, shrug, and talk about Photoshop: there have been fake nudes on the web for as long as there’s been a web. But when high-school boys can load photos of 50 or 500 classmates into an ML model and generate thousands of such images (let’s not even think about video) on a home PC (or their phone), that does seem like an important change. Faking people’s voices has been possible for a long time, but it’s new and different that any idiot can do it themselves. People have always cheated at homework and exams, but the internet made it easy and now ChatGPT makes it (almost) free. Again, something that has always been theoretically possible on a small scale becomes practically possible on a massive scale, and that changes what it means.

Part of the experience of databases, though, was that some things create discomfort only because they’re new and unfamiliar. Part of the ambivalence, for any given scenario, is the novelty, and that may settle and resettle. This might be a genuinely new and bad thing that we don’t like at all; it may be new but we decide we don’t care; we may decide that it’s just a new (worse?) expression of an old thing we don’t worry about; or it may be that this was indeed being done before, even at scale, but somehow doing it like this makes it different, or just makes us more aware that it’s being done at all. Cambridge Analytica was a hoax, but it catalysed awareness of issues that were real.

Meanwhile, all the examples I’ve given so far involve systems working as designed, but of course half of the problems are actually when they break. Each new wave of automation creates new ways to do bad things at scale, or at least things we’re not sure about, but they also create new ways to screw up at scale, and people have been screwing up and ruining people’s lives with databases for generations - the UK’s Post Office Scandal is just the latest and most obvious example. Machine learning and now generative AI create new ways to screw up, most obviously around ‘AI bias’ (which I wrote about here five years ago). Just as with databases, some of the answer is to train technologists to try to avoid this, but you can’t regulate away bugs, and a lot of the solution, again, has to be making sure people know that the computer can be wrong.

These problems might be new, or new expressions of old problems, or they may become entirely new kinds of problems through scale. But our reaction to that is a matter of perception, culture and politics, not technology, and while we can easily agree on the extremes, there’s a very large grey area in the middle where reasonable people will disagree. Fake nudes are bad, but should they be illegal? This will probably also be different in different places - a good illustration of this is in attitudes to compulsory national identity cards. The UK sees the very idea as a fundamental breach of civil liberties. France, the land of ‘liberté’, has them and doesn’t worry about it (though the French census does not collect ethnicity, because the Nazis used this to round up Jews during the occupation), and the USA pretends not to have them, but demands for ID are everywhere. There’s not necessarily any right answer here, and no way to get to one through any analytic process - this is a social, cultural and political question, with all sorts of unpredictable outcomes. The US bans a gun database, and yet it also has a company that scans your driving licence against a private blacklist of tens of thousands of people (another database), shared across thousands of bars and nightclubs. And no privacy laws.