How A Steam Bug Deleted Someone’s Entire PC

Kevin Fang
19 Jan 2024, 11:49

Summary

TLDR: In 2015, Linux Steam users reported that starting Steam recursively deleted all files owned by their user account. Investigating the Steam startup script revealed a bug where a variable meant to hold the Steam root directory path was instead left empty under certain conditions, causing 'rm -rf' to target the root directory and wipe every file the user owned. Though the exact sequence of user actions that triggered this remains unclear, the bug was fixed by adding checks before deleting directories and by stopping the script when a command fails.

Takeaways

  • 😀 This is the first episode of a video series about interesting software issues
  • 🤔 The Linux Steam client had a bug that caused it to delete a user's entire filesystem
  • 😮 Running 'steam.sh' directly with bash caused STEAMROOT to be an empty string
  • 🤨 This empty STEAMROOT variable triggered the 'reset_steam' function
  • 😠 'reset_steam' dangerously used 'rm -rf' on STEAMROOT, deleting everything
  • 😕 Multiple users experienced total data loss from this bug
  • 🧐 The root cause was incorrect handling of the $0 variable in 'steam.sh'
  • 🤔 There were debates about the best way to fix this dangerous bug
  • 😊 Multiple improvements were made to 'steam.sh' to avoid deleting user data
  • ✅ The Steam deletion bug has not resurfaced in many years and is considered fixed

Q & A

  • What caused Steam to delete user files on Linux?

    -A bug in the Steam Linux client caused it to run 'rm -rf' on the root directory if the STEAMROOT variable was set to an empty string. This happened if steam.sh was run directly rather than through the steam binary.

  • How did the 'rm -rf' command end up targeting the root directory?

    -The STEAMROOT variable contained an empty string instead of the path to the Steam directory. When substitution occurred in the 'rm -rf $STEAMROOT/*' command, it expanded to 'rm -rf /*', targeting the root directory.

  • What triggered the reset_steam() function that contained the 'rm -rf' command?

    -The reset_steam() function runs if the INSTALLED_BOOTSTRAP variable is empty. This variable should be set by the STEAMEXE executable, but STEAMEXE was never run because its path is built from STEAMROOT, which was empty.

  • How did running 'steam.sh' cause the STEAMROOT variable to be empty?

    -When run directly, 'steam.sh' fails to get the proper $0 path variable, causing a cd command to fail and STEAMROOT assignment to output nothing.

  • Did Keyvin actually run the steam.sh script?

    -It's unclear. Keyvin claimed he only ran the steam binary, but others speculate he may have run steam.sh and forgotten.

  • What was Keyvin trying to do prior to launching Steam?

    -Keyvin was trying to move the Steam installation to another drive. He symlinked the original Steam folder to the new location.

  • How could the bug have been triggered by Keyvin's setup?

    -If the symlink was created incorrectly or accidentally deleted, it could cause STEAMROOT path resolution to fail when starting Steam.

  • What were some of the proposed fixes for the bug?

    -Using readlink/dirname for STEAMROOT, adding checks for valid paths, making steam.sh exit on failures, writing scripts in other languages, and removing use of rm -rf wildcard.

  • Has the deletion bug issue resurfaced since it was reported?

    -No, the issue has not reappeared over the past 8+ years since it was originally reported and fixed.

  • What was the overall impact of this Steam bug?

    -It caused total data loss for at least two Linux users. Beyond that, the impact seems to have been limited.

Outlines

00:00

😮 How a Steam bug accidentally deleted users' files

Paragraph 1 describes how in 2015, a bug in the Steam Linux client caused it to recursively delete all files when launched in certain ways. It explains the Linux directory structure, how Steam is installed, the steam.sh script, and how passing it incorrectly to Bash caused variable issues leading it to wipe files.

05:03

😲 Narrowing down the exact trigger of the wipe bug

Paragraph 2 analyzes the conditions in steam.sh that trigger the reset_steam() function that wipes files. It explains how running steam.sh directly with Bash causes key variables to be empty, meeting the if statement conditions that run reset_steam().

10:05

😅 Speculation on how the original poster triggered the wipe

Paragraph 3 speculates on how Keyvin, the original poster, might have accidentally triggered the bug even though he only claimed to run the Steam client normally. It suggests race conditions around symlinks as one unlikely theory.

Keywords

💡Steam

Steam is a video game digital distribution service and platform developed by Valve Corporation. It is used by gamers on Linux, macOS, and Windows to buy, play and organize video games. The bug discussed in the video caused Steam to recursively delete files on a Linux user's computer when run in a certain way.

💡symlink

A symlink, also known as a symbolic link, is a type of file in Linux that contains a reference to another file or directory. It allows you to make a file or directory visible in multiple locations. In the video, a symlink was created to move the Steam installation to another drive, which contributed to the bug.

💡rm -rf

In Linux, 'rm -rf' is a command that recursively deletes files and directories. The 'rm' command removes files and directories, while the '-r' option makes it recursive and '-f' forces deletion without prompting. This dangerous command was unintentionally run by the buggy Steam script, wiping users' drives.

💡Linux filesystem

The Linux filesystem refers to the logical way Linux organizes files and directories on a drive or disk. Understanding the basic structure of the Linux filesystem, like the root directory and home directories, is key to following how the Steam bug was able to delete arbitrary files.

💡environment variable

Environment variables are dynamic values that can affect the behavior of programs and scripts on Linux/Unix systems. In the Steam bug, the script's STEAMROOT variable ended up empty and INSTALLED_BOOTSTRAP was never set, causing the Steam script to malfunction and delete files.

💡shell script

A shell script is a simple program written for the Linux/Unix shell. The steam.sh shell script contained the buggy code that led to the recursive deletion when Steam was started in a particular way.

💡race condition

A race condition arises when the timing or order of events impacts program behavior. In one hypothetical theory, a race condition caused the Steam symlink to be deleted at just the wrong time, triggering the bug.

💡NTFS

NTFS is a proprietary Microsoft Windows filesystem that is also readable on Linux. One theory was that the Steam files being on an NTFS drive contributed to the bug, but this was later ruled out.

💡executable bit

The executable permission bit determines whether a file can be executed and run as a program in Linux. One theory was that the NTFS drive failed to properly set the executable bit on Steam, but this was debunked.

💡bash

Bash is a common Linux shell program used to execute commands and scripts. Running the Steam script directly in bash in a certain way reliably reproduced the recursive delete bug.

Highlights

Steam for Linux accidentally deleted user's entire computer by recursively deleting root directory

Bug caused by steam.sh script changing to invalid working directory and then trying to delete contents of steam root directory

Symlinks used to move Steam folder, but Steam couldn't find new location

steam.sh deletes everything recursively with no checks, very dangerous

STEAMROOT variable gets empty string instead of path due to script being run incorrectly

Script fails when run directly with bash instead of as executable

Bug triggers because bootstrap variable not set after failed launch

Possible NTFS drive issues with executable bit permissions

Multiple recommendations to improve script - use Python/Perl, add checks before deletes, etc.

Race condition theory - symlink deleted after creation causing issues

Fixes implemented - exit on error, verify paths, no recursive deletes

Debate on best practices for scripts and deleting files

Some skepticism about details of original report

Issue caused lots of discussion but hasn't resurfaced since

Funny theory about lasagna dinner before steam install issues

Transcripts

play00:00

Welcome to the first episode of Issues Insights, a series where we look at some of the most

play00:04

interesting issues out there, from Github to Gitlab, to proprietary issue trackers,

play00:08

even to your favorite mailing lists - we’ll cover them all.

play00:11

We expect to produce at least one episode of this series every 10 years, so buckle up

play00:16

as we dive into today’s issue from Valve - the Linux version of Steam unintentionally

play00:21

rm rf-ing or recursively force deleting someone’s entire computer!

play00:27

Steam for Linux has changed a lot since 2015, but here is how it worked back in the day:

play00:32

you first install steam, after which you can run the binary file called “steam”

play00:36

found in the directory /usr/bin to start the client.

play00:40

This binary does not contain the full steam application

play00:43

but invokes files in the installation directory called STEAMROOT in each user’s home.

play00:48

Before getting too deep, let’s take a high level view of the Linux file system, first there

play00:53

is the root directory, which as we know contains everything.

play00:56

Then, we see that /usr/bin contains user programs like steam, compared to /bin which usually

play01:01

contains programs essential for the operating system to function.

play01:05

The home directory, abbreviated as a ~ and located at /home/name works as you’d

play01:11

expect and is where each user stores all their stuff.

play01:14

In each user's home is the steam root directory where Steam installs itself and stores its games.

play01:20

It also has an important script called steam.sh, which is what the steam binary invokes to

play01:25

configure and launch steam.

play01:27

After setting everything up, this script will then invoke the main executable called STEAMEXE.

play01:33

All three of these entry points can technically be used to start steam directly, just for

play01:38

different purposes and with certain limitations.

play01:40

Anyways, on January 14, 2015, Keyvin opened an issue on the steam-for-linux Github repo

play01:46

stating the following:

play01:48

>I moved the steam folder to a drive mounted somewhere else, and symlinked the original

play01:53

>directory to the new location.

play01:56

Symlinks or symbolic links are just special files which point to the file or directory

play02:00

that they are linked to.

play02:01

These are kind of like shortcuts, except symlinks are completely transparent to the application

play02:06

layer as the operating system automatically resolves it to the target.

play02:10

So you can kind of see what Keyvin was trying to do here - move the steam folder to a drive

play02:14

with better performance or storage capacity, and create this symlink so applications can

play02:19

still find that folder using the original path.
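The move-and-symlink approach described here can be sketched roughly as follows. This is only an illustration: every path is a hypothetical stand-in created inside a throwaway temporary directory, not Keyvin's actual layout.

```shell
# Sketch: move a directory elsewhere and leave a symlink at the old path.
# All paths are hypothetical stand-ins inside a temporary directory.
demo_symlink_move() (
    work=$(mktemp -d) || exit 1
    mkdir -p "$work/home/Steam" "$work/bigdrive"
    echo "game data" > "$work/home/Steam/file.txt"

    # Move the folder to the "other drive", then link the old path to it.
    mv "$work/home/Steam" "$work/bigdrive/Steam"
    ln -s "$work/bigdrive/Steam" "$work/home/Steam"

    # Reading through the original path still works: the OS resolves the link.
    cat "$work/home/Steam/file.txt"

    rm -rf -- "$work"
)

demo_symlink_move
```

The file is read through the old path even though it now lives under the new directory, which is the transparency the symlink is meant to provide.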

play02:22

>Then I launched Steam.

play02:25

>It did not launch.

play02:26

>It offered to let me browse and still could not find it when I pointed to the new location.

play02:31

>Steam crashed.

play02:32

>I restarted it.

play02:33

>It re-installed itself and everything looked great.

play02:36

>Until I looked and saw that steam had apparently deleted everything owned by my user recursively

play02:41

>from the root directory.

play02:43

>Including my 3tb external drive I back everything up to.

play02:47

Doofy then follows up with a reply, having encountered the same issue.

play02:51

>This is terrible.

play02:52

>I just lost my home directory.

play02:54

>All I did was start steam.sh with the STEAM_DEBUG flag enabled.

play02:58

If you recall, steam.sh is the helper script located in steamroot that can be used to start steam.

play03:04

After a bit of digging in this script, they found a line which runs rm -rf for all the

play03:08

contents of the steam root directory.

play03:09

I think we’re all familiar with what the rm command and its options -rf do.

play03:10

One line above this command was the comment “Scary” indicating this was a very scary operation.

play03:16

Soon enough, TcM1911 chimed in with a hypothesis: this steamroot variable must have somehow

play03:22

returned an empty string, causing the rm -rf argument to be the root directory.

play03:27

Taking a closer look at STEAMROOT, this enclosing dollar sign and parentheses is a command substitution

play03:33

where the output of the commands run inside will substitute the expression and thus be

play03:37

assigned to STEAMROOT.

play03:39

What is the output?

play03:40

Well it’s going to be what is echoed or printed, in this case, the current working

play03:44

directory signified by this PWD environment variable.

play03:49

The working directory of a script is where it is run from, not where the script is located.

play03:53

If you execute a script from the /tmp directory, the working directory returned by PWD will

play03:58

be /tmp regardless of where the script is located.
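A small sketch of this point, assuming a hypothetical script named whereami.sh: the script lives in one temporary directory and is run from another, and it reports the directory it was run from.

```shell
# Sketch: $PWD reflects where a script is run from, not where the file lives.
# whereami.sh is a hypothetical stand-in script.
demo_pwd() (
    scripts=$(mktemp -d) || exit 1     # where the script file lives
    elsewhere=$(mktemp -d) || exit 1   # where we run it from
    printf '%s\n' 'echo "$PWD"' > "$scripts/whereami.sh"

    cd "$elsewhere" || exit 1
    reported=$(sh "$scripts/whereami.sh")

    # Compare the script's answer with the directory we ran it from.
    if [ "$reported" = "$(pwd)" ]; then
        echo "PWD is the directory we ran from"
    else
        echo "PWD is something else: $reported"
    fi
    rm -rf -- "$scripts" "$elsewhere"
)

demo_pwd
```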

play04:00

So the strategy here is to change directory or cd to the directory of the script, and

play04:05

then print the working directory.

play04:07

We have another OS provided environment variable zero which refers to the path of the script

play04:12

being executed.

play04:13

Then, this percent signifies you want to remove the following pattern from the end of the

play04:18

preceding string.

play04:20

The pattern here is a forward slash, followed by a wildcard which matches anything.

play04:23

So from the preceding string which is the path of the script, we start from the end

play04:28

and delete the first slash we see, followed by everything after it.

play04:32

In this case, we delete the name of the file, leaving us with… the STEAMROOT directory.
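The percent expansion can be tried on its own. In this sketch, strip_last_component is a hypothetical helper and the paths are made-up examples; the second call shows what happens when the string contains no slash at all.

```shell
# Sketch: "${var%/*}" removes the shortest trailing match of "/*",
# i.e. the last slash and everything after it.
strip_last_component() {
    script_path=$1
    printf '%s\n' "${script_path%/*}"
}

strip_last_component "/home/user/Steam/steam.sh"   # prints /home/user/Steam
strip_last_component "steam.sh"                    # no slash to match: prints steam.sh
```

When there is no slash, nothing is removed, so the result is still the file name rather than a directory.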

play04:37

But this appears to work.

play04:38

What causes it to return an empty string?

play04:40

You see, if any part of this command fails prior to the echo, the command substitution

play04:45

will output nothing and therefore nothing will be assigned to STEAMROOT so it will remain

play04:50

an empty string. Rcxdude notes that the command definitely fails in such a way when running the script

play04:54

directly with bash like so.
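The failure mode can be sketched in isolation. Here resolve_dir is a hypothetical helper that mirrors the "cd, then echo the working directory" pattern, and the directory name is assumed not to exist; stderr is silenced only to keep the output tidy.

```shell
# Sketch: when cd fails, "&&" skips the echo, so the substitution is empty.
resolve_dir() {
    # Prints the absolute path of "$1", or nothing if cd fails.
    (cd "$1" 2>/dev/null && echo "$PWD")
}

ROOT="$(resolve_dir /nonexistent/steam-demo-dir)" ||
    echo "cd failed, but the script keeps going"
echo "ROOT=[$ROOT]"          # prints ROOT=[]

ROOT="$(resolve_dir /)"
echo "ROOT=[$ROOT]"          # prints ROOT=[/]
```

Without an explicit check, the script has no way of noticing that the variable ended up empty.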

play04:56

Why is this the case?

play04:58

Normally, when you run a script in Linux by invoking its path, the kernel will detect

play05:02

that it is an interpreter script, since it starts with a shebang specifying which interpreter to use.

play05:08

Then, it will invoke the interpreter with the script path as the first argument.

play05:13

This first argument is what the environment variable zero we talked about previously refers to.

play05:18

However, bash and similar shells allow you to run a script by passing a plain filename

play05:23

into the command as long as it’s in the current directory.

play05:27

This will start a new bash shell and run the script using the filename as the first argument,

play05:32

which results in a dollar zero variable that is an invalid path.

play05:36

Here is the phenomenon in action: we run the same script but in the second case, it only

play05:41

prints a file name, not a valid path.
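A rough way to reproduce this, assuming a hypothetical script called show0.sh: it prints $0 (strictly a special shell parameter rather than an environment variable) along with the result of stripping the last path component. Passing ./show0.sh to the shell gives the same $0 that a shebang launch of ./show0.sh would; sh is used here, but bash behaves the same way.

```shell
# Sketch: compare $0 when the shell is given a path versus a bare file name.
# show0.sh is a hypothetical stand-in for steam.sh.
demo_dollar_zero() (
    dir=$(mktemp -d) || exit 1
    printf '%s\n' 'echo "arg0=$0 dir=${0%/*}"' > "$dir/show0.sh"
    cd "$dir" || exit 1

    sh ./show0.sh   # prints arg0=./show0.sh dir=.
    sh show0.sh     # prints arg0=show0.sh dir=show0.sh

    rm -rf -- "$dir"
)

demo_dollar_zero
```

In the second case the "directory" is really the script file itself, so a cd into it cannot succeed.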

play05:44

This causes the directory change command to error out and return no output, setting Steam

play05:49

root as an empty string, causing rm -rf to delete the root directory’s contents.
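To see what the delete target turns into without deleting anything, the pattern can simply be printed. describe_delete_target is a hypothetical helper, the install path is a made-up example, and rm is never executed here.

```shell
# Sketch: show what the delete target becomes, WITHOUT running rm.
# The pattern stays inside quotes so the shell never expands the glob.
describe_delete_target() {
    steamroot=$1
    printf 'rm -rf %s\n' "$steamroot/*"
}

describe_delete_target "/home/user/.local/share/Steam"   # prints rm -rf /home/user/.local/share/Steam/*
describe_delete_target ""                                # prints rm -rf /*
```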

play05:54

At this point, most were satisfied, and began suggesting fixes, discussing how terrible

play05:59

the code was, and wondering how it was possible for Keyvin to be so cool and collected.

play06:03

But wait, this recursive deletion is in reset_steam(), why was that called in the first place?

play06:10

reset_steam() is only invoked under two circumstances.

play06:13

First, the user can intentionally pass the --reset flag into the script.

play06:18

Given the bug report did not mention the reset flag, let’s take a closer look at the other case.

play06:24

All of these conditions need to be met, as indicated by the “and” operator between each condition.

play06:29

Only one of them is important, the rest is just housekeeping, so let’s get the obvious

play06:33

ones out of the way first.

play06:34

Initial launch just verifies there isn’t already a steam process running, so this is easily true.

play06:40

Next, current status must not be set to this magic exitcode, which is just a variable to

play06:44

ensure this doesn’t get stuck in an infinite loop.

play06:47

Next, this steam starting file must exist, and it should exist given the steamconfig

play06:53

folder has remained untouched, which would include the starting file.

play06:56

Lastly, it confirms that the steam script is not out of date, which should be expected.

play07:01

So the key condition which triggers reset_steam() is that this INSTALLED_BOOTSTRAP environment

play07:06

variable has to be an empty string, the default value for an unset variable.

play07:11

Before the reset_steam check, steam.sh should run the main executable STEAMEXE which sets

play07:17

this variable to 1 after installing some bootstrap software.

play07:21

The developers intended for reset_steam() to trigger if STEAMEXE failed to do this, indicating

play07:27

something went wrong.

play07:28

However in this case, nothing was wrong with STEAMEXE, it is just executed with a path

play07:33

that is based on STEAMROOT, which we’ve already established is set incorrectly as

play07:38

an empty string due to the other bug.

play07:41

So steam.sh fails to run the main executable at all since it has the wrong path, the INSTALLED_BOOTSTRAP

play07:47

environment variable is not exported and remains blank, all the conditions of the

play07:51

if-statement are met, and then reset_steam() is called.
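The chain of conditions can be sketched like this. Only INSTALLED_BOOTSTRAP and reset_steam are names taken from the video; every other name and value is a hypothetical stand-in, and reset_steam is replaced by a harmless stub that only prints a message.

```shell
# Sketch of the reset decision described above, with stubbed-out pieces.
MAGIC_EXITCODE=42                     # hypothetical value

reset_steam() { echo "reset_steam would run here"; }
script_out_of_date() { return 1; }    # stub: pretend the script is current

maybe_reset() {
    initial_launch=$1 status=$2 starting_file=$3 installed_bootstrap=$4
    if [ -n "$initial_launch" ] &&
       [ "$status" -ne "$MAGIC_EXITCODE" ] &&
       [ -f "$starting_file" ] &&
       ! script_out_of_date &&
       [ -z "$installed_bootstrap" ]; then
        reset_steam
    else
        echo "no reset"
    fi
}

marker=$(mktemp) || exit 1            # stands in for the "steam starting" file
maybe_reset yes 1 "$marker" ""        # bootstrap never set: prints reset_steam would run here
maybe_reset yes 1 "$marker" 1         # bootstrap set to 1: prints no reset
rm -f -- "$marker"
```

With the housekeeping conditions all true, the empty bootstrap value alone decides whether the reset runs.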

play07:55

So now we know how to trigger the bug: run steam.sh inline with a shell like bash.

play08:01

This explains what happened to Doofy, who clearly ran steam.sh, but how does this play

play08:07

into Keyvin’s original account, who only mentioned launching the main steam binary?

play08:07

Well, a valve employee had a hypothesis: Keyvin first moves steam to an NTFS drive, which

play08:17

being made for Windows doesn’t play nice with the Linux filesystem and would strip

play08:22

the executable bit in the version of Ubuntu used here.

play08:26

This causes the steam binary to inevitably fail and crash as it cannot execute anything in the drive.

play08:32

Then, he gets really annoyed and runs “bash steam.sh”, and most importantly, he gets

play08:37

amnesia and forgets he ever ran such a command.

play08:41

Keyvin replies that he only ran steam normally, and he wasn’t even sure if that line

play08:45

was the culprit, which makes sense given from his perspective he never ran steam.sh.

play08:50

Furthermore, he recalls he may have mounted the drive with options to allow execution

play08:55

of all files, which is reasonable given that ValveSoftware themselves have tutorials to

play09:00

mount NTFS drives in this way.

play09:03

And of course he is still very calm and collected.

play09:06

However, the exact play-by-play actions cannot be confirmed since the computer with all of

play09:10

its configurations was wiped.

play09:13

So what now?

play09:14

I suppose all we can do is guess, as I have my own never-seen-before theory as to

play09:18

how this happened:

play09:22

So it all started on a stormy Wednesday evening, when Keyvin finishes cooking up his lasagna

play09:27

and heads to the computer room to work on his new gaming setup.

play09:31

After setting everything up, he attempts to run steam.

play09:34

But steam fails to run.

play09:36

Not because the NTFS drive did not add the executable bit, but because Keyvin created

play09:41

the symlink incorrectly in home/local, rather than home/local/share, where steamroot is

play09:47

supposed to be.

play09:48

This is a fairly easy typo to make, as he made the same mistake on the Github issue he created.

play09:54

He realizes his mistake, and fixes the symlink, creating it in the proper folder, and re-runs steam.

play10:01

While steam is still starting up, he absent-mindedly goes to clean up the incorrect symlink

play10:05

because why not, and forgets that he ever did so.

play10:08

But unbeknownst to him, he actually accidentally unlinks the correct symlink instead.

play10:13

In a stunning race condition, he manages to do this in the literal microseconds after

play10:18

steam.sh is invoked but before the assignment of STEAMROOT, and since the symlinked path

play10:23

no longer exists, the directory change fails, and STEAMROOT remains an empty string.

play10:29

And we’ve already established that if STEAMROOT is an empty string, the bug will trigger.

play10:34

To fix the bug, various alterations to the STEAMROOT command such as using readlink or

play10:39

dirname can guarantee a valid path is returned, and checks can be added to verify steamroot

play10:45

is a real path.

play10:46

Furthermore, bash has built in options which make the shell exit when a command fails,

play10:51

so when the directory change or STEAMEXE execution fails, the script immediately ends instead

play10:57

of proceeding with buggy behavior.
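The kinds of guards mentioned here might look roughly like the sketch below. This is not Valve's actual patch: resolve_root and safe_to_clear are hypothetical helper names, and readlink is assumed to support -f (as in GNU coreutils).

```shell
# Sketch of defensive checks before any recursive delete.
resolve_root() {
    # Directory containing the given script path, with symlinks resolved.
    dirname "$(readlink -f "$1")"
}

safe_to_clear() {
    # Refuse empty, root, or non-directory targets.
    [ -n "$1" ] && [ "$1" != "/" ] && [ -d "$1" ]
}

for candidate in "" "/" "/nonexistent/steam-demo-dir"; do
    if safe_to_clear "$candidate"; then
        echo "passes checks: [$candidate]"
    else
        echo "rejected: [$candidate]"
    fi
done

# With "set -e", a failing cd ends the script instead of falling through.
if ! sh -c 'set -e; cd /nonexistent/steam-demo-dir 2>/dev/null; echo "not reached"'; then
    echo "script stopped at the failed cd"
fi
```

All three candidates are rejected, and the child shell exits at the failed cd before its echo can run.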

play10:58

These are what ended up being the actual fix, but many disagreed that this was enough.

play11:04

Some suggested to write all scripts in a real language like Python or Perl, or check for

play11:08

typical Linux directories, lookup files from an internal database and only remove known files,

play11:13

not writing untreated amateur hour code, never use rm -rf, use an empty file as a marker,

play11:17

don’t use the wildcard, you should just lock this thread,

play11:20

just don't delete user data, by the way here is my public domain script that has my preferred fix,

play11:25

Valve has no idea how to linux, has this problem been fixed?

play11:27

It’s still happening.

play11:29

Question mark?

play11:30

You know, we could talk for centuries about the ideal way to fix this, but sometimes you

play11:34

don’t need to achieve perfection.

play11:35

Anyways, it’s been like 8 years and the issue hasn’t resurfaced for anyone except this

play11:40

guy, but he didn’t provide further evidence so we can mark this issue resolved.

play11:44

See you in the next issue of issues insights.