Thomas Countz

Ownership in Rust, Part 2 →

As a Rubyist, all that I know about memory allocation is that it’s handled by some process called garbage collection and that it’s Aaron Patterson’s problem, not mine.

So, when I cracked open the Rust Book and saw that one of Rust’s defining features is its alternative to garbage collection, I became a bit worried.

Was the responsibility of dealing with memory management about to be heaped onto me?

Apparently, with other system programming languages, like C, dealing with memory allocation is a big deal, and can have significant consequences when done poorly.

With all of the other new things to learn, I felt things beginning to stack up.

Stack & Heap

No, its not a hipster clothing brand, the stack and heap are ways of managing memory at runtime.

Stack & Heap

First, we have the stack. The stack is considered fast because it stores and accesses data based on order. The last thing that was placed — pushed — onto* *the stack is the first thing removed — popped — from the stack. This is referred to as *LIFO, *Last In First Out, and means that we only need to keep track of where the top of the stack is when it comes time to free up memory.

The stack can also be fast because the amount of space needed from the stack is known at compile time. This means that we can allocate a fix-size portion of memory before we store anything into it.

For example, if you know that four people are coming to your dinner party, you can decide ahead of time where everyone will sit, how much food to prepare, and practice their names before they get there. This is super efficient!

Next, we have the alternative, the heap. When you don’t know exactly how many people are coming to your dinner party ahead of time, you can use the heap. Using the heap means finding extra chairs and giving out name tags as more and more people arrive to your dinner party.

When data of unknown-size needs to be stored during runtime, the computer searches for memory on the heap, marks it, and returns a pointer, which points back to that place in memory. This is called allocating. You can then push this pointer onto a stack, however, when you want to retrieve the actual data, you need to follow the pointer back to the heap.

As I keep digging into the stack & heap rabbit hole, it seems like managing data in the heap can be difficult. For example, you need to ensure that you allow the computer to reallocate a place in memory once you’re done using it. But if one part of your code frees a place in memory that another part of your code still has a pointer to, funky things can happen.

Keeping track of what parts of code are using what data on the heap, minimizing the amount of duplicate data on the heap, and cleaning up unused data on the heap so you don’t run out of space are all problems that ownership addresses.

Rust Book

Ownership

Ownership & Scope

There are three rules about ownership in Rust:

Each value in Rust has a variable that’s called its *owner*.

There can only be one owner at a time.

When the owner goes out of scope, the value will be dropped.

The simplest illustration of this ownership magic is with variable scope:

fn main() {
  let hello = "Hello, World!";
  println("{}", hello);
} // variable `hello` is now invalid

Once the current function scope is over, denoted by the }, the variable hello goes out of scope, and is dropped.

“Well, duh!” That’s what I thought when I first read this. This is the same in most other programming languages. This is what I know as the behavior of a “locally-scoped variable.”

If this is all ownership does, I’m not sure what all the hubbub is about.

However, things get more interesting when we start passing around values and switching from using a string literal, which is stored on the stack, to using a String type, which is stored on the heap.

fn main() {
  let hello = "Hello, World!"; // string literal
  let hello1 = hello; // copy the value of `hello` and bind it to `hello1`
  println("{}", hello); // this works!
  
  let hello = String::from("Hello, World!"); // `String` type
  let hello1 = hello; // move the data of `hello` into `hello1`
  println("{}", hello); // error[E0382]: use of moved value: `hello`
}

We can see here, that when using a string literal, Rust is copying the value of hello into hello1, as we might expect. But when using a String type, Rust moves the value instead. Rust tells us that we attempted to retrieve a valued that has been moved by throwing the error:error[E0382]: use of moved value: 'hello’

It seems like when using a string literal, Rust will copy the value of that one variable into another variable, but when we use a String type, it moves the value instead.

In order to find out which types implement the copy trait “…you can check the documentation… but as a general rule, any group of simple scalar values can be Copy, and nothing that requires allocation or is some form of resource is Copy”

Why Not Copy Everything?

Updated June 13, 2018

For the related discussion which lead to this update, please visit this Rust language forum thread.

Copy trait when using &str

The string literal, "Hello, World!", is stored somewhere in read-only memory, (neither in the stack nor heap), and a pointer to that string is stored on the stack. Because it’s a string literal, it usually shows up as a reference, meaning that we use a pointer to a string stored in permanent memory, (see Ownership in Rust, Part 2 for more on references), and it’s guaranteed to be valid for the duration of the entire program, (it has a static lifetime).

Here, the pointers stored in hello and hello1 are using the stack. When we use the = operator, Rust pushes a new copy of the pointer stored in hello onto the stack, and binds it to hello1. At the end of the scope, Rust adds a call to [drop](https://doc.rust-lang.org/1.6.0/std/ops/trait.Drop.html) which pops the values from the stack in order to free up memory. These pointers can be stored and easily copied to the stack because their size is known at compile-time.

Move trait when using the heap

Over on the heap, the String type with value "Hello, World!" is bound to the variable hello , using the String::from method. However, unlike the string literal, there’s more data bound to hello than just a pointer, and the size of this data can change during runtime. Here, the = operator binds the data from hello to a new variable hello1, effectively *moving *the data from one variable to another. Poor hello is now invalid, as per ownership rule #2: “There can only be one owner at a time.”

But why do this? Why doesn’t Rust always just make a copy of the data and bind it to the new variable?

If we think back to the differences between the stack and heap, we remember that the size of data stored on the heap is not known at compile time, which means we need to run through some memory allocation steps during runtime. This can be expensive. Depending on how much data we’re storing, we could quickly run out of memory if we sit around making copies of data all day.

Besides that, the default behavior of Rust helps protect us from memory issues that we might run into in other languages.

Part of storing data on the heap, is store a pointer to that data on the stack. However, unlike using a pointer to locate read-only memory, like when using a string literal, the data at the end of the pointer that leads to the heap, can change. A pointer is part of the <<DATA>> that is bound to the hello variable that stores the String type. If we bind the same pointer data to two different variables, it might look something like this:

A rough sketch of copying String type data

We have two variables, hello and hello1, which share ownership of the same value. This violates rule #2: “There can only be one owner at a time,” but let’s keep going.

At the end of the scope in which hello and hello1 are defined, we have to drop the memory in the heap, which frees it up to be used again elsewhere.

Dropping hello1

First, we call drop on the data stored at the end of the pointer bound to hello1, but what happens now when we call drop on hello, next?

Double Free Error

This is called a double free error, which I think is best summarized in this Stack Overflow answer:

A double free in C, technically speaking, leads to undefined behavior. This means that the program can behave completely arbitrarily and all bets are off about what happens. That’s certainly a bad thing to have happen! In practice, double-freeing a block of memory will corrupt the state of the memory manager, which might cause existing blocks of memory to get corrupted or for future allocations to fail in bizarre ways (for example, the same memory getting handed out on two different successive calls of malloc).

Double frees can happen in all sorts of cases. A fairly common one is when multiple different objects all have pointers to one another and start getting cleaned up by calls to free. When this happens, if you aren’t careful, you might free the same pointer multiple times when cleaning up the objects. There are lots of other cases as well, though.

— templatetypedef

This is what Rust is trying to prevent! By invalidating hello, the compiler knows to only make a call to drop, (which calls free behind the scenes), on hello1.

This is all well and good, but there are instances when we do want to copy data that’s stored in the heap. Rust provides an easy way of doing that with clone().

fn main() {
  let hello = String::from("Hello, World!"); // `String` type
  let hello1 = hello.clone(); // clone data from `hello` into `hello1`
  println("{}", hello); // => "Hello, World!"
}

Keep in mind that calls to clone() can be expensive, which is why Rust prevents this “deep copying” by default.

Apparently, there’s a lot more about Rust ownership than covered here; there are concepts called borrowing, referencing, and slicing, too!

So far, it seems like learning about ownership is more to do with navigating Rust’s memory management solution than it is to learn about the problem it solves. But, instead of taking it as a quirk of the language, the Rust Book encourages you to learn about why the language writers were eager create a safer language.

Read Ownership in Rust, Part 2 →

References

July 9, 2018 · rust

Ownership in Rust, Part 1

Stack & Heap

Ownership & Scope

Why Not Copy Everything?

References