<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Singular Paths]]></title><description><![CDATA[In search of understanding]]></description><link>https://singularpaths.com/</link><image><url>https://singularpaths.com/favicon.png</url><title>Singular Paths</title><link>https://singularpaths.com/</link></image><generator>Ghost 3.29</generator><lastBuildDate>Thu, 26 Mar 2026 16:04:38 GMT</lastBuildDate><atom:link href="https://singularpaths.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Microthought: Safety isn't first, and it shouldn't be]]></title><description><![CDATA[<p>"Safety first!" makes a great soundbite. But it's a silly way to make decisions. "Safety" isn't binary - if you really put maximizing safety before anything else you'd never leave the house! You make trade-offs all the time, and you'll make better ones if you're explicit that you're doing</p>]]></description><link>https://singularpaths.com/microthought-safety-isnt-first-and-it-shouldnt-be/</link><guid isPermaLink="false">5f7395d1e6801d217bc9143f</guid><category><![CDATA[Microthoughts]]></category><dc:creator><![CDATA[Z]]></dc:creator><pubDate>Tue, 29 Sep 2020 21:12:46 GMT</pubDate><content:encoded><![CDATA[<p>"Safety first!" makes a great soundbite. But it's a silly way to make decisions. "Safety" isn't binary - if you really put maximizing safety before anything else you'd never leave the house! You make trade-offs all the time, and you'll make better ones if you're explicit that you're doing so.</p><p>Everything you do has risks. Consider driving. About 10% of deaths for 25-44 year old Americans are traffic accidents! "Safety first" - best not leave the house! 
Is driving somewhere really worth a 10% increased chance of dying each day?</p><p>Of course! The absolute risk is still very low (that same age group has an all-causes risk of only about 4 <a href="https://en.wikipedia.org/wiki/Micromort">micromorts</a> each day). And getting around is really useful - how much less happy would you be if you could never go beyond walking distance?</p><p>But you're not putting "safety first". You're making a reasonable trade-off. You try to mitigate the risk where practical to do so - you <em>do</em> wear your seatbelt, don't you? - but you implicitly accept some risk in exchange for getting around.</p><p>You should do the same for other "risky" activities. Being able to get places, to do things, to <em>have fun</em>; these are all very valuable! Conversely, if you can reduce a small risk for low cost - like putting on your seatbelt - you probably still should.</p><p>Don't ask "is X safe?"; ask "is X worth the risk?".</p>]]></content:encoded></item><item><title><![CDATA[The difficulty of AI benchmarks]]></title><description><![CDATA[<p>How much progress have we made in AI, and how close are we to a general human-level intelligence (AGI)? It's tempting to pick some benchmarks that surely only something intelligent could pass. Superhuman chess, go, or the ability to hold a conversation have all been given as examples, among many</p>]]></description><link>https://singularpaths.com/the-difficulty-of-ai-benchmarks/</link><guid isPermaLink="false">5f4d433de6801d217bc9113f</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Z]]></dc:creator><pubDate>Mon, 31 Aug 2020 20:27:25 GMT</pubDate><content:encoded><![CDATA[<p>How much progress have we made in AI, and how close are we to a general human-level intelligence (AGI)? It's tempting to pick some benchmarks that surely only something intelligent could pass. Superhuman chess, go, or the ability to hold a conversation have all been given as examples, among many others. 
Those three have arguably been beaten (by Deep Blue, AlphaGo, and ELIZA), but there's no AGI in sight yet. What's going on?</p><p>I claim the problem is that benchmarks fail to measure something as general as "intelligence". Not that <em>these</em> benchmarks fail, but that <em>all</em> benchmarks fail. And it's not obvious how any given benchmark will fail until you see a clearly non-AGI system beat it.</p><p>It was once thought that outplaying humans at chess would require real "intelligence". Deep Blue beating Kasparov was a dramatic occasion! But superhuman chess clearly turned out to be a much easier problem than intelligence. It's now mundane. You can even run it on your smartphone, and if you call it "AI" there it's as a common name for a computer opponent, not a claim that it's "intelligent".</p><p>Computer vision has long been another hard task for computers. A lot of "AI" work went into having computers be able to recognize objects from pictures we'd find easy. Computers progressed from useless, to OK, to arguably superhuman at major benchmarks like ImageNet. Even my car - not a Tesla - quietly does optical image processing for automated steering and braking. I haven't found the words "AI" anywhere in the manual or the marketing, just bland terms like "lane keeping assistance".</p><blockquote>A lot of cutting edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labelled AI anymore.<br>-Nick Bostrom</blockquote><p>This sentiment is so common it has its own name, the "AI effect". It's easy to underestimate how much progress computers have made into domains once thought to be the preserve of human intelligence. Tasks we've succeeded at addressing with computers seem mundane, mere advances in some other field, not true AI. We miss that it was work in AI that led to them.</p><p>This might suggest AGI is closer than we think. 
We retroactively reduce the significance of benchmarks AI attains, and so underestimate the progress made so far in AI. The remaining distance might look much shorter after accounting for how far we've come already. If you want to be able to convince others - for any reason from getting investment to worrying about AI risks - you might want some agreed way to show it's near.</p><p>One way you could address this issue is to ask people to commit now to some benchmark(s) that they'll accept as a marker of progress towards AGI. That protects against this revisionism. You can ensure that when a proto-AGI passes that test it can't be neglected.</p><p>The AI effect has another side though. Perhaps the benchmarks were always flawed, because we set them as measures of a general system, forgetting that the first systems to break through might be specialized to the task. You only see how "hackable" the test was after you see it "passed" by a system that clearly isn't "intelligent". Taking previous benchmarks at face value might falsely suggest progress towards AGI from task-specific systems.</p><p>An obvious solution is to define better benchmarks. We've seen previous benchmarks that failed. We can sit down and come up with new ones, being careful that they're not vulnerable to any of the non-generalizable systems or approaches we've learned exist. Then we can commit to these new benchmarks.</p><p>I claim that not only will these new benchmarks turn out to have the same issue, but that this is a fundamental problem of defining precise benchmarks for something as general as intelligence.</p><p>We falsely extrapolate correlations that hold within one class of system - humans - to other classes. Generally speaking, smarter humans are better at chess, so we saw "being good at chess" as a sign of intelligence. It seemed natural to expect that a system beating humans at chess would similarly have human intelligence. 
Yet two decades after Deep Blue beat Kasparov, with top computers now unambiguously better than any humans at chess, those systems remain far from general intelligence. It seems strange now to think that many once saw chess as equivalent to general intelligence for computers. Easy to forget.</p><p>Of course, this shouldn't be surprising. Consider the limited domain that is "motion" instead of "intelligence". As with intelligence, humans are relatively general systems. Humans who are faster on flat ground are generally also faster on rough ground, or in water. You might conjecture that anything much faster than humans on the flat would surely also beat them in water. But then you see that a car is much faster than a human on flat ground, yet is useless in water, or on rough-enough ground. It's a system specialized to a smaller range of problems.</p><p>Similarly, <em>among a class of general intelligences</em>, it's easy to find measures that correlate with how intelligent they are. But that's not what you want to measure for AI benchmarks! It's much harder to construct precise benchmarks that measure whether a system <em>is</em> generally intelligent.</p><p>It gets worse. Whenever you <em>do</em> define a precise benchmark that gets widely accepted by the community, it becomes a target for AI researchers to beat. Everyone agreed that this benchmark measures intelligence, so improving benchmark results must be good work. Against such relentless optimization, both individual and community-wide, any decoupling between the new benchmark and AGI progress will manifest.</p><p>This is Goodhart's law in action. The very act of agreeing on a benchmark for AGI can make it useless in that role! 
If you use it as an early warning, then long before the first proto-AGI passes that test, countless false alarms will have been raised and the benchmark long since ignored.</p><p>It's all but impossible to be both precise and quantifiable enough to easily agree on what progress has been made, and general and flexible enough not to succumb to Goodhart.</p><p>The Turing Test highlights this tension. Well-defined, easily measurable and quantifiable versions, like the ability to fool ~50% of random people in a short conversation, have arguably been passed already by chat bots. (GPT-3 even more so.) But the old chat bots relied heavily on specialized countermeasures to certain lines of questioning, and even GPT-3 is specialized at "producing plausible-looking text". None are AGI.</p><p>The most general version is that an expert with an arbitrarily long time to discuss any topic can't tell the difference better than chance. The chat bots and even GPT-3 fail this one. But it's impractical to run and impossible to replicate - was the expert expert enough, and did they spend long enough?</p><p>Where does this leave us for measuring AI progress? No option seems great:</p><ul><li>Benchmarks known at the time turn out to be poor measures of general progress (as above)</li><li>New benchmarks might not be computable for old systems and risk bias from knowing history</li><li>Expert surveys have substantial flaws (I'll write about this in a later post)</li></ul><p>Still, we can try to piece together a view from the available evidence, taking into account and attempting to adjust for its flaws. I'll write more about my personal conclusions on AI progress from this later.</p>]]></content:encoded></item><item><title><![CDATA[Microthought: "Be more careful" isn't a good solution]]></title><description><![CDATA[<p>Humans make mistakes. Even when trying to be careful. 
In a good environment, your long-term level of care tends to be roughly whatever is sustainable for you in that environment. If I tell you to "be more careful" it might create a temporary boost, but it's unlikely to prevent all</p>]]></description><link>https://singularpaths.com/microthought-be-more-careful-isnt-a-good-solution/</link><guid isPermaLink="false">5f3ebe61e6801d217bc90d44</guid><category><![CDATA[Microthoughts]]></category><dc:creator><![CDATA[Z]]></dc:creator><pubDate>Thu, 20 Aug 2020 18:48:48 GMT</pubDate><content:encoded><![CDATA[<p>Humans make mistakes. Even when trying to be careful. In a good environment, your long-term level of care tends to be roughly whatever is sustainable for you in that environment. If I tell you to "be more careful" it might create a temporary boost, but it's unlikely to prevent all accidents in future. It's not a <em>solution</em> to mistakes.</p><p>Worse yet, <em>punishing</em> individuals who mess up discourages people from owning up to their mistakes or near-misses, denying valuable data to identify and fix the root causes.</p><p>Instead, focus on the broader system humans are a part of. Can automated checks prevent the issue, or free up cognitive load from elsewhere to spend on this task? Can training be given (or improved) to reduce mistakes? Should the culture be shifted towards safety (often at the cost of other features - you can't make everything your top priority)? Is the current level of expected mistakes actually the correct trade-off, and should we just accept it?</p><p>Using a mistake to remind people to be more careful can still help contribute to a culture of care. But it's not a solution by itself, and it's important that it doesn't feel like punishment.</p><h3 id="addendum-">Addendum:</h3><p>Yes, <em>technically</em> yelling at people to be more careful all the time or terrifying them with dire consequences should they mess up <em>is</em> one way to create a culture of care. 
But the costs are high, both in happiness and in discouraging them from owning up to mistakes. This is why air accident investigation reports are typically restricted from use as legal evidence against any individual: the benefits of open discourse about accidents and how to prevent them far outweigh those of punishing individuals who make mistakes.</p>]]></content:encoded></item><item><title><![CDATA[Microthought: Zoom's insecurity as a competitive advantage]]></title><description><![CDATA[<p>Zoom has suffered several well-known security issues, ranging from leaving a persistent server on your computer that turned out to be remotely exploitable, to weak encryption and authentication. You might call their security situation a weakness. I argue it highlights one of their strengths.</p><p>If Google (Meet, Hangouts), Microsoft (Teams,</p>]]></description><link>https://singularpaths.com/microthought-zooms-insecurity-as-a-competitive-advantage/</link><guid isPermaLink="false">5f3ac99be6801d217bc90ce0</guid><category><![CDATA[Microthoughts]]></category><dc:creator><![CDATA[Z]]></dc:creator><pubDate>Mon, 17 Aug 2020 18:31:47 GMT</pubDate><content:encoded><![CDATA[<p>Zoom has suffered several well-known security issues, ranging from leaving a persistent server on your computer that turned out to be remotely exploitable, to weak encryption and authentication. You might call their security situation a weakness. I argue it highlights one of their strengths.</p><p>If Google (Meet, Hangouts), Microsoft (Teams, Skype), Cisco (Jabber), or many other big players had serious security issues in their chat software, they'd suffer significant reputational harm to their broader business. Zoom isn't constrained by that and has been free to aggressively trade off security for usability (if the uninstaller doesn't fully uninstall, at least reinstalling is really easy!) or development of other features.</p><p>I personally hate this. 
I'm sad that I'm forced into using questionable software just because that's what everyone has settled on. But many people's revealed preferences are that real security and privacy get little consideration. I have to admit that Zoom - as an amoral business - adopted an effective strategy.</p><p>Where do they go from here?</p>]]></content:encoded></item><item><title><![CDATA[A journey of a thousand miles begins with a single step]]></title><description><![CDATA[<p>Welcome to Singular Paths!</p><p>Communicating purely in writing leaves me outside my comfort zone. I much prefer the richness of a face-to-face conversation with all the nuance of intonation and gestures, being able to see what is confusing and adapt. I thrive in private conversation. But that limits who and</p>]]></description><link>https://singularpaths.com/a-journey-of-a-thousand-miles/</link><guid isPermaLink="false">5f39ab78e6801d217bc90c02</guid><category><![CDATA[Meta]]></category><dc:creator><![CDATA[Z]]></dc:creator><pubDate>Sun, 16 Aug 2020 22:27:21 GMT</pubDate><media:content url="https://images.unsplash.com/photo-1527856263669-12c3a0af2aa6?ixlib=rb-1.2.1&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb&amp;w=2000&amp;fit=max&amp;ixid=eyJhcHBfaWQiOjExNzczfQ" medium="image"/><content:encoded><![CDATA[<img src="https://images.unsplash.com/photo-1527856263669-12c3a0af2aa6?ixlib=rb-1.2.1&q=80&fm=jpg&crop=entropy&cs=tinysrgb&w=2000&fit=max&ixid=eyJhcHBfaWQiOjExNzczfQ" alt="A journey of a thousand miles begins with a single step"><p>Welcome to Singular Paths!</p><p>Communicating purely in writing leaves me outside my comfort zone. I much prefer the richness of a face-to-face conversation with all the nuance of intonation and gestures, being able to see what is confusing and adapt. I thrive in private conversation. But that limits who and how many people I can speak to and learn from.</p><p>Talking publicly like this also feels arrogant. I know I always have more to learn. 
Who am I to expect anyone to listen to me? Being told I'm wrong is fine - indeed that's the best way to learn! But when I assert something wrong publicly, who might I mislead before I'm corrected? Yet it's silly to worry about being just another person who's wrong on the internet.</p><p>So writing this blog is a big leap for me. It scares me. That itself is a reason to do this, to grow.</p><p>I'm ultimately writing to better myself and to understand the world. I'll write my thoughts on whatever topics interest me. They'll often be wrong, or at least an incomplete understanding. Please challenge me, and I hope we can all learn more together.</p>]]></content:encoded></item></channel></rss>