Spectre Explained

A recent massive security exploit broken down by the Mesh team.

This is a two-part series that covers both Spectre and Meltdown. In the following piece we will attempt to give an explanation of a recent security exploit known as Spectre

In the news

If you’ve been paying attention to tech chatter over the past week, you have probably heard about two catastrophic security flaws called Spectre and Meltdown. With so many security breaches in the news, it can be hard to discern which headlines you should be concerned with.

Bad news: these massive security flaws are as bad as they have been hyped up to be.

Good news: There are publicly reported instances of them being used maliciously… yet.

What should I do: What you should always be doing - updating your laptops/phones/servers/smart-refrigerators with the latest available software updates and security patches. Current updates will patch most of the attack vectors revealed by Spectre and all Meltdown.

While just knowing that you should update your software is good enough for some, here at Mesh we like to dig into the full story and to always know what is at stake. Security topics can be pretty complicated, so let’s try to break it down by covering Spectre:

Spectre

We’re going to cover the partly unpatchable and spookier of the two vulnerabilities first - Spectre.

This train took a wrong turn

Spectre is exploiting a flaw related to branch prediction and speculative execution. CPUs from nearly every vendor have used these two techniques for well over twenty years to help make computers faster. They are basically a feature where your CPU gets to rub the crystal ball and try to predict what it needs to do next without being told explicitly.

Why would it want to do that? Well, sometimes the CPU is waiting on something to tell it what to do next. Some smart engineers realized that if the CPU can make some educated guesses, it could potentially lead to some real performance gains.

Developers: If you don’t know what branch prediction is, you have probably never taken the time to wonder what the most popular question ever on Stack Overflow is. Now would be a good time to check it out!

Hide… (Think of subtitle)

In computing, we rely on memory to be protected from other parts of your computer so that if there’s a malicious actor (hacker, virus, etc), they can’t access what shouldn’t be seen. This is fundamental to how all modern computers work today, and how they provide ‘guarantees of protection’ from malicious code. It’s why the webpage you’re reading this on can’t run some code to find out what your bank password, which could be open in a different browser tab, is.

It actually potentially can now with the Spectre vulnerability being released, but we’ll get to that.

There’s always a tradeoff

As working consumers of tech, we demand from Apple, Google, Microsoft, etc. that our hardware and software be secure from threats. Safety isn’t enough though - we also want our computers to be fast, and to pull that off there’s some tricks along the way.

The CPU is a worker, it loves to work and process stuff. It loves it so much that if it’s blocked by something… say waiting for memory to arrive from your RAM, it will actually guess what it can/should be doing in the meantime. This is that speculative execution thing we mentioned earlier.

Clearly this feature would be a bad one if the CPU guessed wrong (which it does many times per second). So how do we mitigate the CPU running code it isn’t supposed to? We simply throw all the speculative execution work into a bucket called a reorder buffer while we wait for confirmation that we were supposed to execute it.

Remember how the CPU was working on other stuff while it was waiting on something to come back to tell it what it really should be working on? Well, if that direction finally arrives and it proves that the CPU was working on the wrong stuff - no worries! That speculative work is discarded, and we start working on the new stuff - life is good. Or, it was for the past 20 or so years…

Always clean up after yourself!

What if you knew about the CPU’s inclination to always be busy and speculatively execute? You could potentially setup a situation where you starved the CPU of that next instruction it wanted. What does a CPU do without clear instructions? It picks what it thinks it should be working on. The trick here is knowing what influences that decision and guiding it to work on things it shouldn’t be, or isn’t allowed to be working on.

But wait! We just said that all of that speculative, and potentially malicious, work would be discarded, and would never see the light of day because when that next real instruction finally arrives, the CPU will be put back on the right path. This is true, but the footprints of that malicious activity remain, and can be read/detected. This is the basis for Spectre.

Same explanation, but in not in nerd-language

Ok, let’s say you are a mason, and you can build exactly one wall per day. You don’t have a lot of say in where you build your wall though - that’s your boss’s call, and she shows up at noon every day. You, on the other hand, show up at 7AM because you want to get work over with as early as possible.

You’ve been working this insanely large housing development for a few months, and it has hundreds of empty lots. Your job entails either building a red brick wall on the left hand side of a lot, or a grey brick wall on the right hand side of the lot every day. It’s a bit monotonous, so you’re always looking for ways to speed up the process…

Every day you’ve been waiting for your boss to show up and tell you which side to build on. Once she does, you need to call in supplies for the correct colored bricks, and start building. For the past 60 days in a row she has arrived and told you to build on the right (grey bricks).

Finally you decide that you don’t feel like waiting around for someone to show up and tell you what you already know - that you’re building a grey brick wall on the right. So you start working prior to knowing what she really wants.

You take matters into your own hands - you call up the brick supplier every morning at 7AM, get the grey bricks delivered to your lot, and build that wall. You guess correctly, mostly, and life is good - you’re getting done early now because you’ve been guessing correctly for a while. What a novel performance optimization! Recall speculative execution? :)

Until… one day she shows up and wants a red brick wall built on the left. But, you already got the bricks delivered and are halfway done with the wall on the right! No worries though, you can just as easily knock it down, order the correct colored bricks, and build the wall on the right.

What about all those bricks you ordered for the wrong wall? They’re just sitting on the curb now, taking up space! Well, there are still a lot of walls to build in your development, and chances are you’ll need those supplies in the near future.

This where the vulnerability comes into play. The mason reverted the bad work and built the correct wall, but there are artifacts (the unused bricks) left over from his miscalculation of duties. Spectre isn’t about the erroneous work; it’s about garnering information, about your system, via the presence of artifacts of discarded work. If you do this enough repeatedly, you can begin to build a story of secrets.

There’s some serious glossing over of facts/details here, but this delivers the gist of the situation Spectre is taking advantage of.

Security v Performance

It’s the CPU’s job to execute your instructions, but the CPU is also supposed to do it in a safe way. Unfortunately, these two desires are often at odds with each other, such as with Spectre. The technology industry is always trying to optimize for performance, and safety often bares the burden of neglect… sound familiar? As the real world costs of these security slights escalate, it will be interesting to watch the what the public appetite is for performance hits in exchange for a safer experience.

For better or worse, security vulnerabilities are becoming more en vogue

Look at the sites and effort to connect this esoteric topic of speculative execution exploits to the general development community. Researchers are trying to almost sell it to developers. Your run-of-the-mill exploit is honored with a cryptic and intimidating title (e.g. CVE-2017-6655) and a pretty uninviting site. Check out the sites for Spectre and Meltdown.

Does anyone remember something about a bleeding heart that was also a security flaw about two years ago… perhaps the OpenSSL Vulnerability HeartBleed?

These sites look great, and for a reason - this is serious stuff! They’re trying to convey that in marketing appeal to developers and the broader public who otherwise wouldn’t know the difference between this CVE and the other hundred that came out in the past 24 hours. If you want a comparison, this is the clearing house for all vulnerabilities, and it looks like they hired the guy who designed the 90’s Excite Search Engine.

Don’t take it for granted

Just because the world has been using computers incessantly, and continue to entrench them into every facet of life, we can still find flaws that are so basic and fundamental that they destroy a more than a decade’s worth of modern microarchitectural design assumptions. To draw a parallel, it’s as if we discovered the foundation design for every building ever erected was flawed and needed to be repaired immediately.

Think about what’s been going on in the 20 or so years - we’ve have legions of genius black hat (malicious) hackers vying for every chance to exploit your information via finding vulnerabilities like Spectre and Meltdown. On the flip side of that battle, we’ve had an equal number (roughly) of security researchers mulling over code and design to prevent these exploits. In addition to your prototypical hacker, we’ve had the NSA, and the like, looking for things just like Spectre and Meltdown that could be weaponized for surveillance, or more nefarious activities. Not one of them found these vulnerabilities, and they’ve been staring the world in the face for about two decades.

The point I’m trying to make is that this stuff is complicated and there are no absolutes in security. Which is exactly why this stuff is so much fun to work with 🎉!