Facts and myths about Python names and values

This page is also available in Turkish.

The behavior of names and values in Python can be confusing. Like many parts of Python, it has an underlying simplicity that can be hard to discern, especially if you are used to other programming languages. Here I’ll explain how it all works, and present some facts and myths along the way.

BTW: I worked this up into a presentation for PyCon 2015: Python Names and Values.

Names and values

Let’s start simple:

Fact: Names refer to values.

As in many programming languages, a Python assignment statement associates a symbolic name on the left-hand side with a value on the right-hand side. In Python, we say that names refer to values, or a name is a reference to a value:

x = 23

Now the name “x” refers to the value 23. The next time we use the name x, we’ll get the value 23:

print(x+2)       # prints 25

Exactly how the name refers to the value isn’t really important. If you’re experienced with the C language, you might like to think of it as a pointer, but if that means nothing to you then don’t worry about it.

To help explain what’s going on, I’ll use diagrams. A gray rectangular tag-like shape is a name, with an arrow pointing to its value. Here’s the name x referring to an integer 23:

(Diagram: the name x refers to the value 23)

I’ll be using these diagrams to show how Python statements affect the names and values involved. (The diagrams are SVG; if they don’t render, let me know.)

Another way to explore what’s going on with these code snippets is to try them on pythontutor.com, which cleverly diagrams your code as it runs. I’ve included links there with some of the examples.

Fact: Many names can refer to one value.

There’s no rule that says a value can only have one name. An assignment statement can make a second (or third, ...) name refer to the same value.

x = 23
y = x

Now x and y both refer to the same value:

(Diagram: x and y both refer to the same value, 23)

Neither x nor y is the “real” name. They have equal status: each refers to the value in exactly the same way.

Fact: Names are reassigned independently of other names.

If two names refer to the same value, this doesn’t magically link the two names. Reassigning one of them won’t reassign the other also:

x = 23
y = x
x = 12

(Diagram: x and y aren’t magically linked; x refers to 12, y refers to 23)

When we said “y = x”, that doesn’t mean that they will always be the same forever. Reassigning x leaves y alone. Imagine the chaos if it didn’t!

Fact: Values live until nothing references them.

Python keeps track of how many references each value has, and automatically cleans up values that have none. This is called “garbage collection,” and it means that you don’t have to get rid of values: they go away by themselves when they are no longer needed.

Exactly how Python keeps track is an implementation detail, but if you hear the term “reference counting,” that’s an important part of it. Sometimes cleaning up a value is called reclaiming it.
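Here’s a minimal sketch of watching that bookkeeping, assuming CPython (other implementations may reclaim values later). The class Thing and the names first, second, and watcher are made up for illustration. sys.getrefcount reports a count that includes a temporary reference from the call itself, so only the difference between two readings is meaningful.

```python
import sys
import weakref


class Thing:
    """A trivial class; its instances can be weakly referenced."""


first = Thing()
watcher = weakref.ref(first)    # a weak reference does not keep the value alive

count_one_name = sys.getrefcount(first)
second = first                  # a second name for the same value
count_two_names = sys.getrefcount(first)
print(count_two_names - count_one_name)    # 1

del first                       # one name is gone, but the value survives
still_alive = watcher() is second
print(still_alive)              # True

del second                      # the last reference is gone
print(watcher() is None)        # True on CPython: the value was reclaimed
```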

Assignment

An important fact about assignment:

Fact: Assignment never copies data.

When values have more than one name, it’s easy to get confused and think of it as two names and two values:

x = 23
y = x
# "Now I have two values: x and y!"
# NO: you have two names, but only one value.

Assigning a value to a name never copies the data, it never makes a new value. Assignment just makes the name on the left refer to the value on the right. In this case, we have only one 23, and x and y both refer to it, just as we saw in the last diagrams.

Things get more interesting when we have more complicated values, like a list:

nums = [1, 2, 3]

(Diagram: nums refers to a list of the numbers 1, 2, 3)

Now if we assign nums to another name, we’ll have two names referring to the same list:

nums = [1, 2, 3]
tri = nums

(Diagram: nums and tri both refer to the same list of 1, 2, 3)

Remember: assignment never makes new values, and it never copies data. This assignment statement doesn’t magically turn my list into two lists.
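One way to check this for yourself is with the is operator and the built-in id function, which compare identity rather than contents. This is only a small sketch; the name other is an extra I’ve added to show what a genuinely separate list looks like.

```python
nums = [1, 2, 3]
tri = nums                      # a second name, not a second list

print(tri is nums)              # True: both names refer to one list
print(id(tri) == id(nums))      # True: same identity

other = [1, 2, 3]               # a list display always builds a new list
print(other == nums)            # True: the contents are equal
print(other is nums)            # False: two separate lists
```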

At this point, we have one list, referred to by two names. This can lead to a big surprise, one common enough that I’m going to give it a catchy name: the Mutable Presto-Chango.

Fact: Changes in a value are visible through all of its names. (Mutable Presto-Chango)

Values fall into two categories based on their type: mutable or immutable. Immutable values include numbers, strings, and tuples. Almost everything else is mutable, including lists, dicts, and user-defined objects. Mutable means that the value has methods that can change the value in-place. Immutable means that the value can never change; when you think you are changing the value, you are really making new values from old ones.

Since numbers are immutable, you can’t change one in-place, you can only make a new value and assign it to the same name:

x = 1
x = x + 1

Here, x+1 computes an entirely new value, which is then assigned to x.

With a mutable value, you can change the value directly, usually with a method on the value:

nums = [1, 2, 3]
nums.append(4)

First we assign a list to a name:

(Diagram: nums refers to a list of the numbers 1, 2, 3; statement: nums = [1, 2, 3])

Then we append another value onto the list:

(Diagram: nums refers to the same list, now holding 1, 2, 3, 4; statement: nums.append(4))

Here we haven’t changed which value nums refers to. At first, the name nums refers to a three-element list. Then we use the name nums to access the list, but we don’t assign to nums, so the name continues to refer to the same list. The append method modifies that list by appending 4 to it, but it’s the same list, and nums still refers to it. This distinction between assigning a name and changing a value is sometimes described as “rebinding the name vs. mutating the value.”

Notice that informal English descriptions can be ambiguous. We might say that “x = x+1” is changing x, and “nums.append(4)” is changing nums, but they are very different kinds of change. The first makes x refer to a new value (rebinding), the second is modifying the value nums refers to (mutating).

Here’s where people get surprised: if two names refer to the same value, and the value is mutated, then both names see the change:

nums = [1, 2, 3]
tri = nums
nums.append(4)

print(tri)      # [1, 2, 3, 4]

Why did tri change!? The answer follows from what we’ve learned so far. Assignment never copies values, so after the assignment to tri, we have two names referring to the same list:

(Diagram: nums and tri both refer to the same list of 1, 2, 3; statement: tri = nums)

Then we mutate the list by calling .append(4), which modifies the list in place. Since tri refers to that list, when we look at tri we see the same list as nums, which has been changed, so tri now shows four numbers also:

(Diagram: changing the list means both names see the change; nums and tri both refer to the list 1, 2, 3, 4; statement: nums.append(4))

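This Mutable Presto-Chango is the biggest issue people have with Python’s names and values. A value is shared by more than one name, and is modified, and all names see the change. To make the Presto-Chango happen, you need a mutable value, more than one name referring to that value, and code that changes the value through one of the names.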

Keep in mind, this is not a bug in Python, however much you might wish that it worked differently. Many values have more than one name at certain points in your program, and it’s perfectly fine to mutate values and have all the names see the change. The alternative would be for assignment to copy values, and that would make your programs unbearably slow.
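When you really do want two independent values, you have to ask for a copy explicitly. Here’s a hedged sketch using list() for a shallow copy and copy.deepcopy from the standard library for nested data; the names nested, shallow, and deep are only for illustration.

```python
import copy

nums = [1, 2, 3]
tri = list(nums)                # explicit shallow copy: a new list, same elements
nums.append(4)
print(nums)                     # [1, 2, 3, 4]
print(tri)                      # [1, 2, 3]

nested = [[1, 2], [3, 4]]
shallow = list(nested)          # new outer list, but the same inner lists
deep = copy.deepcopy(nested)    # new outer list and new inner lists
nested[0].append(99)
print(shallow[0])               # [1, 2, 99]
print(deep[0])                  # [1, 2]
```

Note that the shallow copy still shares its inner lists, so the Presto-Chango can reappear one level down.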

Myth: Python assigns mutable and immutable values differently.

Because the Presto-Chango only happens with mutable values, some people believe that assignment works differently for mutable values than for immutable values. It doesn’t.

All assignment works the same: it makes a name refer to a value. But with an immutable value, no matter how many names are referring to the same value, the value can’t be changed in-place, so you can never get into a surprising Presto-Chango situation.
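A small sketch makes the point: the same assignments behave identically for a string (immutable) and a list (mutable). The names here are invented for the example.

```python
word = "abc"                    # an immutable value
alias_word = word
items = ["a", "b", "c"]         # a mutable value
alias_items = items

print(alias_word is word)       # True
print(alias_items is items)     # True: assignment did the same thing both times

word = word + "d"               # rebinding: + makes a new string
items = items + ["d"]           # also rebinding: + makes a new list
print(alias_word)               # abc
print(alias_items)              # ['a', 'b', 'c']
```

The only difference is that the list also offers in-place operations like append, and the string offers none.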

Python’s diversity

I said earlier that Python has an underlying simplicity. Its mechanisms are quite simple, but they manifest in a number of ways.

Fact: References can be more than just names.

All of the examples I’ve been using so far used names as references to values, but other things can be references. Python has a number of compound data structures each of which hold references to values: list elements, dictionary keys and values, object attributes, and so on. Each of those can be used on the left-hand side of an assignment, and all the details I’ve been talking about apply to them. Anything that can appear on the left-hand side of an assignment statement is a reference, and everywhere I say “name” you can substitute “reference”.

In our diagrams of lists, I’ve shown numbers as the elements, but really, each element is a reference to a number, so it should be drawn like this:

(Diagram: nums refers to a list, whose three elements each refer to an int: 1, 2, 3; statement: nums = [1, 2, 3])

But that gets complicated quickly, so I’ve used a visual shorthand:

(Diagram: nums refers to a list of numbers, drawn with 1, 2, 3 inside the list; statement: nums = [1, 2, 3])

If you have list elements referring to other mutable values, like sub-lists, it’s important to remember that the list elements are just references to values.
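For instance, here is a sketch of a 3x3 board built two ways. The names row, board, and fresh are illustrative. In the first version all three elements refer to one row list, so a change through one element shows up in all of them.

```python
row = [0, 0, 0]
board = [row, row, row]         # three references to one list
board[0][0] = 1
print(board)                    # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]

fresh = [[0, 0, 0] for _ in range(3)]   # builds a new row list each time
fresh[0][0] = 1
print(fresh)                    # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
```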

Here are some other assignments. Each of these left-hand sides is a reference:

my_obj.attr = 23
my_dict[key] = 24
my_list[index] = 25
my_obj.attr[key][index].attr = "etc, etc"

and so on. Lots of Python data structures hold values, and each of those is a reference. All of the rules here about names apply exactly the same to any of these references. For example, the garbage collector doesn’t just count names, it counts any kind of reference to decide when a value can be reclaimed.

Note that “i = x” assigns to the name i, but “i[0] = x” doesn’t, it assigns to the first element of i’s value. It’s important to keep straight what exactly is being assigned to. Just because a name appears somewhere on the left-hand side of the assignment statement doesn’t mean the name is being rebound.
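A quick sketch of the difference, using a second name j (added for illustration) to watch what happens to the original list:

```python
i = [1, 2, 3]
j = i

i[0] = "changed"    # assigns to element 0 of the list; the name i is not rebound
print(i is j)       # True
print(j)            # ['changed', 2, 3]

i = "rebound"       # assigns to the name i
print(i is j)       # False
print(j)            # ['changed', 2, 3]
```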

Fact: Lots of things are assignment

Just as many things can serve as references, there are many operations in Python that are assignments. Each of these lines is an assignment to the name X:

X = ...
for X in ...
[... for X in ...]
(... for X in ...)
{... for X in ...}
class X(...):
def X(...):
def fn(X): ... ; fn(12)
with ... as X:
except ... as X:
import X
from ... import X
import ... as X
from ... import ... as X

I don’t mean that these statements act kind of like assignments. I mean that these are assignments: they all make the name X refer to a value, and everything I’ve been saying about assignments applies to all of them uniformly.

For the most part, these statements define X in the same scope as the statement, but not all of them, especially the comprehensions, and the details differ slightly between Python 2 and Python 3. But they are all real assignments, and every fact about assignment applies to all of them.
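Here’s a sketch that runs several of these forms in a row against the same name, assuming Python 3 (where a comprehension’s loop name lives in the comprehension’s own scope). Each statement simply makes X refer to something new.

```python
X = "plain assignment"

for X in range(3):              # each iteration assigns to X
    pass
print(X)                        # 2

def X():                        # def assigns a function object to X
    return "called"
print(X())                      # called

import math as X                # import assigns a module object to X
print(X.sqrt(16))               # 4.0

class X:                        # class assigns a class object to X
    pass
print(X.__name__)               # X

squares = [X * X for X in range(4)]    # Python 3: this X is local to the comprehension
print(squares)                  # [0, 1, 4, 9]
print(X.__name__)               # X: the outer X is untouched
```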

Fact: Python passes function arguments by assigning to them.

Let’s examine the most interesting of these alternate assignments: calling a function. When I define a function, I name its parameters:

def my_func(x, y):
    return x+y

Here x and y are the parameters of the function my_func. When I call my_func, I provide actual values to be used as the arguments of the function. These values are assigned to the parameter names just as if an assignment statement had been used:

def my_func(x, y):
    return x+y

print(my_func(8, 9))

When my_func is called, the name x has 8 assigned to it, and the name y has 9 assigned to it. That assignment works exactly the same as the simple assignment statements we’ve been talking about. The names x and y are local to the function, so when the function returns, those names go away. But if the values they refer to are still referenced by other names, the values live on.
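You can confirm that a parameter refers to the very same value the caller passed, not a copy. This sketch uses a hypothetical helper, identity_seen_inside, that just reports id() of its argument.

```python
def identity_seen_inside(param):
    """Return the identity of whatever value the parameter refers to."""
    return id(param)

value = [1, 2, 3]
same = identity_seen_inside(value) == id(value)
print(same)                     # True: param referred to the caller's list
```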

Just like every other assignment, mutable values can be passed into functions, and changes to the value will be visible through all of its names:

def augment_twice(a_list, val):
    """Put `val` on the end of `a_list` twice."""
    a_list.append(val)
    a_list.append(val)

nums = [1, 2, 3]
augment_twice(nums, 4)
print(nums)         # [1, 2, 3, 4, 4]

This can produce surprising results, so let’s take this step by step. When we call augment_twice, the names and values look like this:

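(Diagram: the moment we call augment_twice; nums refers to the list 1, 2, 3; in the augment_twice frame, a_list refers to that same list and val refers to 4; statement: augment_twice(nums, 4))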

The local names in the function are drawn in a new frame. Calling the function assigned the actual values to the parameter names, just like any other assignment statement. Remember that assignment never makes new values or copies any data, so here the local name a_list refers to the same value that was passed in, nums.

Then we call a_list.append twice, which mutates the list:

(Diagram: after appending twice; nums and a_list both refer to the list 1, 2, 3, 4, 4; val refers to 4; statement: a_list.append(4))

When the function ends, the local names are destroyed. Values that are no longer referenced are reclaimed, but others remain:

(Diagram: after returning from augment_twice; nums refers to the list 1, 2, 3, 4, 4; statement: print(nums))

You can try this example code yourself on pythontutor.com.

We passed the list into the function, which modified it. No values were copied. Although this behavior might be surprising, it’s essential. Without it, we couldn’t write methods that modify objects.

Here’s another way to write the function, but it doesn’t work. Let’s see why.

def augment_twice_bad(a_list, val):
    """Put `val` on the end of `a_list` twice."""
    a_list = a_list + [val, val]

nums = [1, 2, 3]
augment_twice_bad(nums, 4)
print(nums)         # [1, 2, 3]

At the moment we call augment_twice_bad, it looks the same as we saw earlier with augment_twice:

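(Diagram: the moment we call augment_twice_bad; nums refers to the list 1, 2, 3; in the augment_twice_bad frame, a_list refers to that same list and val refers to 4; statement: augment_twice_bad(nums, 4))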

The next statement is an assignment. The expression on the right-hand side makes a new list, which is then assigned to a_list:

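(Diagram: after assigning to a_list; nums still refers to the list 1, 2, 3; in the augment_twice_bad frame, a_list now refers to a new list 1, 2, 3, 4, 4 and val refers to 4; statement: a_list = a_list + [val, val])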

When the function ends, its local names are destroyed, and any values no longer referenced are reclaimed, leaving us just where we started:

(Diagram: after returning from augment_twice_bad; nums refers to the list 1, 2, 3; statement: print(nums))

(Try this code on pythontutor.com.)

It’s really important to keep in mind the difference between mutating a value in place, and rebinding a name. augment_twice worked because it mutated the value passed in, so that mutation was available after the function returned. augment_twice_bad used an assignment to rebind a local name, so the changes weren’t visible outside the function.

Another option for our function is to make a new value, and return it:

def augment_twice_good(a_list, val):
    a_list = a_list + [val, val]
    return a_list

nums = [1, 2, 3]
nums = augment_twice_good(nums, 4)
print(nums)         # [1, 2, 3, 4, 4]

Here we make an entirely new value inside augment_twice_good, and return it from the function. The caller uses an assignment to hold onto that value, and we get the effect we want.

This last function is perhaps the best, since it creates the fewest surprises. It avoids the Presto-Chango by not mutating a value in-place, and only creating new values.

There’s no right answer to choosing between mutating and rebinding: which you use depends on the effect you need. The important thing is to understand how each behaves, to know what tools you have at your disposal, and then to pick the one that works best for your particular problem.
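If mutation is the effect you need, append isn’t the only tool. As a sketch, here are two more variants (the function names are mine, not part of the examples above): extend mutates the list directly, and slice assignment replaces the contents of the existing list rather than rebinding the local name.

```python
def augment_twice_extend(a_list, val):
    """Mutate `a_list` in place with a method call."""
    a_list.extend([val, val])

def augment_twice_slice(a_list, val):
    """Build a new list, then store its contents into the existing list."""
    a_list[:] = a_list + [val, val]     # slice assignment mutates; it doesn't rebind a_list

nums = [1, 2, 3]
alias = nums
augment_twice_extend(nums, 4)
print(nums)                     # [1, 2, 3, 4, 4]
augment_twice_slice(nums, 5)
print(nums)                     # [1, 2, 3, 4, 4, 5, 5]
print(alias is nums)            # True: still one list throughout
```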

Dynamic typing

Some details about Python names and values:

Fact: Any name can refer to any value at any time.

Python is dynamically typed, which means that names have no type. Any name can refer to any value at any time. A name can refer to an integer, and then to a string, and then to a function, and then to a module. Of course, this could be a very confusing program, and you shouldn’t do it, but the Python language won’t mind.
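A tiny sketch of exactly that confusing program. The name thing, the helper greet, and the list kinds are invented here to record what type of value the name refers to at each step.

```python
import math

def greet():
    return "hi"

kinds = []

thing = 23                      # an integer
kinds.append(type(thing).__name__)
thing = "twenty-three"          # now a string
kinds.append(type(thing).__name__)
thing = greet                   # now a function
kinds.append(type(thing).__name__)
thing = math                    # now a module
kinds.append(type(thing).__name__)

print(kinds)                    # ['int', 'str', 'function', 'module']
```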

Fact: Names have no type, values have no scope.

Just as names have no type, values have no scope. When we say that a function has a local variable, we mean that the name is scoped to the function: you can’t use the name outside the function, and when the function returns, the name is destroyed. But as we’ve seen, if the name’s value has other references, it will live on beyond the function call. It is a local name, not a local value.
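As a sketch: the name local_name below exists only inside the hypothetical function build, but the list it refers to outlives the call because the caller keeps a reference.

```python
def build():
    local_name = [1, 2, 3]      # the name is local to build()
    return local_name           # the value is handed back to the caller

kept = build()
print(kept)                     # [1, 2, 3]

try:
    local_name                  # the name does not exist out here
    name_is_visible = True
except NameError:
    name_is_visible = False
print(name_is_visible)          # False
```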

Fact: Values can’t be deleted, only names can.

Python’s memory management is so central to its behavior that not only do you not have to delete values, there is no way to delete values. You may have seen the del statement:

nums = [1, 2, 3]
del nums

This does not delete the value nums, it deletes the name nums. The name is removed from its scope, and then the usual reference counting kicks in: if nums’ value had only that one reference, then the value will be reclaimed. But if it had other references, then it will not.
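Here’s a sketch of the second case, where another name (tri, as in the earlier examples) keeps the list alive after del removes nums:

```python
nums = [1, 2, 3]
tri = nums
del nums                        # removes the name, not the list
print(tri)                      # [1, 2, 3]

try:
    nums
    name_exists = True
except NameError:
    name_exists = False
print(name_exists)              # False: only the name is gone
```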

Myth: Python has no variables.

Some people like to say, “Python has no variables, it has names.” This slogan is misleading. The truth is that Python has variables, they just work differently than variables in C.

Names are Python’s variables: they refer to values, and those values can change (vary) over the course of your program. Just because another language (albeit an important one) behaves differently is no reason to describe Python as not having variables.

Wrapping up

Myth? Python is confusing.

I hope this has helped clarify how names and values work in Python. It’s a very simple mechanism that can be surprising, but is very powerful. Especially if you are used to languages like C, you’ll have to think about your values differently.

There are lots of side trips that I skipped here:

See also

If you are looking for more information about these topics, try:

Comments

[gravatar]
Hello!

These are good explanations. Make complicated things simple by using proper and clear abstractions. You are a good writer.

I would like to see an explanation of the execution model on the skip list.
The documentation is hard to understand and full of loose ends. I am not alone with this.

http://docs.python.org/2/reference/executionmodel.html
http://bugs.python.org/issue12374

This is an attempt by me to give an explanation for it, on apropos of the default mutable value:
http://www.daniweb.com/software-development/python/threads/457567/somebody-please-explain-why-this-is-happening#post1990653



Bye
Daniel
[gravatar]
@Daniel, thanks. Your issue is already on the skip list: "What's the deal with mutable default arguments to functions?"

In a nutshell: default values for arguments are evaluated just once, when the function is defined. When the function is called, if a value isn't supplied for that argument, then the default value is assigned to the local name, and execution proceeds. If you mutate that value, then you have a Presto-Chango, where the two references are: 1) the default for the function argument, and 2) the local parameter name. Since values live until there are no more references, and the function object continues to refer to the default value, that mutable default value will live a very long time, and will collect mutations as it goes.
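A minimal sketch of that behavior, with made-up function names collect and collect_fresh. The second function shows the common workaround of using None as the default and creating the list inside the call.

```python
def collect(item, bucket=[]):   # the default list is created once, at def time
    bucket.append(item)
    return bucket

first = collect("a")
second = collect("b")
print(second)                           # ['a', 'b']
print(first is second)                  # True: both are the one default list
print(collect.__defaults__[0] is first) # True: the function object holds it

def collect_fresh(item, bucket=None):
    if bucket is None:
        bucket = []             # a new list on every call that omits bucket
    bucket.append(item)
    return bucket

print(collect_fresh("a"))       # ['a']
print(collect_fresh("b"))       # ['b']
```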
[gravatar]
Ned
Can you explain if n and n2 refer to same list ?
If Y: why no mutable-presto-changeo
If N: why so ? why python creates 2 lists here ?

>>> n=[1,2,3]
>>> n2=[1,2,3]
>>> n[0]='x'
>>> n,n2
(['x', 2, 3], [1, 2, 3])
[gravatar]
Robert:

Take Ned's advice and try this on pythontutor.com.

You'll see that the creation of both 'n' and 'n2' create separate lists with the same value. Why? Python could get impossibly slow if it had to search every single value for equality every time a name was assigned a new value. Imagine you have a program that holds 100,000 dictionaries each with 50,000 keys and you start to see the point.

If your code had said 'n2=n', then Python says, "Oh, I already know that value; I'll just create a new name to point to it."

More to the point, Python does this when you assign a value to a name via another name. It would hardly be sporting if you had a list of value [1,2,3], and so did some code in a third-party module, and it changed the list for its own purpose. If Python checked all values before assigning names to avoid duplicates, then anyone changing that list would change your list! Insanity would rule. Black is down, up is north, and the sun would set in liters.

You can wind up assigning many names to one value, but every time you assign by value (and not by name), you by necessity create a new instance of the value.

Or so I believe.
[gravatar]
@Robert: Matt has provided a good explanation, I'll add just a bit more. The semantics of Python are that [1,2,3] creates a list with three elements. It always does, no matter how many lists have already been created. Once you've executed
n = [1,2,3]
these two statements have very different effects:
n2 = [1,2,3]   # make a list and assign it to n2
n3 = n         # assign n's value to n3
All three names refer to a three-element list, but n and n3 refer to the same list, n2 refers to a different list that happens to be equal.
[gravatar]
Thanks, Matt and Ned

So at the "root" - this operation:

aName = aValue

it goes back to whether aValue is mutable or not, right ?
If non-mutable: Same value can be pointed to by multiple names
if mutable: Python just creates aValue right there.

Is this right ?
[gravatar]
@Robert. Mutable and immutable values are treated exactly the same, refer to the Myth about that very topic above. The expression on the right-hand side of the assignment statement produces a value. How it does that is up to it. A list literal *always* makes a new list. An integer literal might make a new int, or it might not. Python can share integer objects because they are immutable, so there's no way to change their values, so there's no harm in sharing them widely.

Myth: Python assigns mutable and immutable values differently.
[gravatar]
> how come "2 + 2 is 4", but "1000 + 1 is not 1001"

I had to try in shell before I believed. Are small ints (not sure of correct terminology) pre-allocated, interned?
[gravatar]
@Norman: indeed, CPython interns small ints (-5 to 256), and since they are immutable, it doesn't affect the Python semantics.
[gravatar]
Hi Ned,
Thanks for the article... It made very good reading. Initially I got confused with "2 + 2 is 4", but "1000 + 1 is not 1001" - then I realized the "is" is like reference comparison - and == is value comparison
- similar to java's == versus equals(...)
I think that is what you meant by the interns above...

Thanks - and it was enlightening - learnt something new...
[gravatar]
I may be wrong but I think there's a mistake

The first example where
x=23
y=x

The value of 23 is COPIED and assigned to 'y'. So actually you have 2 different values, because when you change the value of y, the value of x remains the same

y = y+2
x # 23
[gravatar]
@vasu: You are mistaken. The diagrams are correct. Initially x and y refer to the same value. Then the statement "y = y+2" makes y refer to an entirely new value, 25. x is left referring to the old value, 23.
[gravatar]
Nice article! Good and thorough explanation of the way Python handles names and values.

One thing I don't agree with: I really think that the "Python has no variables" makes things far easier for the novice. Saying that variables exist but have a different behaviour makes things much more complicated. "variable" contains a very precise built-in idea of "variable within a certain scope" - normally a specific type - which has no match in Python. We shouldn't have variable-envy, let's name a different concept in a different way.
[gravatar]
@Alan: I guess we'll just have to disagree. This is my definition of a variable (taken from Wikipedia): "a symbolic name associated with a value and whose associated value may be changed." That definition applies equally well to C and to Python, without getting into language-specific details.

According to your logic, Javascript has no variables either, which makes explaining the "var" keyword a bit tricky.

Don't let C-myopia make perfectly good terms off-limits to other languages.
[gravatar]
Thanks for the great explanations, Ned! I'll definitely be directing students here when they ask me these questions.

At the risk of seeming pedantic: as I was reading through this, a counter-example came to mind that I thought might be an interesting footnote:
https://gist.github.com/jamalex/5997735

In short: sometimes assignment does make new values, if it's been specifically set up to do so. Yay, magic methods!
[gravatar]
@Jamie, thanks, that's a good point. The complex assignments (attribute, element, item, augmented) can all be overridden with special methods, and therefore can be arbitrarily different from the built-in behavior. There isn't even a guarantee that they will assign anything!

On the other hand, simple name assignment cannot be overridden.
[gravatar]
Hi Ned. I've recently decided that I want to learn Python in-depth. I've written a fair number of scripts with python, but haven't learnt it deeply - e.g I know what generators and decorators are, but never use them, or the fact that lists are mutable types and hence unhashable constantly trips me up (why not have a frozenlist like a frozenset?).

Could you recommend good resources to up my Python-fu? Ideally I'm looking for something comprehensive, like a book/course/workshop/series that I can go through.

Thanks, and nice article. Names and values get tricky because they seem to behave like C most of the time, but they are really quite different.
[gravatar]
Reading through the facts and myths, and looking at the side trips at the end, I felt like these are all things that I pretty much understood, until I got to the very last one, "Why is it easy to make a list class attribute, but hard to make an int class attribute?" Can you explain what you are referring to here? I've never come across this issue. Do you mean that it is hard to do

class Stuff(object):
    a = 1

or

class Stuff(int):
    a = 'myattr'

or what? Are you referring to a use-case in the former where instances mutate the class attribute?
[gravatar]
I find it interesting that you never mention the locals() dictionary, or namespaces in general.
[gravatar]
I really like the diagrams in your post. What software did you use to create them?
Thanks
[gravatar]
@Steve, thanks. I think I had mentioned this in an earlier draft, but it's not in the text here: I used graphviz with some Python code to help make them all uniform.
[gravatar]
@aaron: the difficulty with class attributes: self.my_int = 2 vs. self.my_list.append("thing")

I didn't mention namespaces because this mostly was not about scoping, and I didn't mention locals() because it's an esoteric attractive nuisance. I left out lots of stuff that's more important than locals()!
[gravatar]
I see. That's more of an issue with the way Python makes class variables accessible from instances. I think if you do self.__class__.my_int = 2 you should be fine.
[gravatar]
Great article, I didn't know that about small ints and the is keyword!

But now you've got me wanting more... Why do beginners find it hard to make a tic-tac-toe board in Python??
[gravatar]
Maybe because they try [[], [], []]*3.
[gravatar]
Hi Ned,
one thing that helped me understand these aspects of Python when first learning it was to think "all these things are dictionaries", e.g.

* "names" are a dictionary (ie. NameError is just KeyError from locals() or globals()). The facts you state about assignment etc. follow logically from that.

* object attributes are a dictionary

* function calls are a dictionary (hence args act like assignment... The one subtle exception to this seems to be **kwargs itself, which is always copied rather than assigned in cpython)

So if you understand how Python dictionaries behave, you can understand how all these things behave. Does this seem like a useful viewpoint to you?
[gravatar]
Be careful. Object attributes are actually much more complicated than a dictionary lookup in the general case.
[gravatar]
@Chris Lee: I like the analogy of "all things are dictionaries" for understanding object attributes, including modules and classes. I don't see how function calls are dictionaries though, so I wouldn't go as far as you do.

But also, that analogy is orthogonal to the topics this piece addresses. Once you accept that everything is a dictionary, you still have to understand what d[k]=v does, and that brings you right back to the issues discussed here.
[gravatar]
Super nice
[gravatar]
The diagram after this code block:
x = 23
y = x
x = 12
seems to be wrong. x should be 12 and y should be 23, but the diagram has them reversed.
[gravatar]
@Gordon, thanks, I've fixed it.
[gravatar]
Nice tutorial!
Just for fun, you could add this example:
def augment_twice_not_so_bad(a_list, val):
    """Put `val` on the end of `a_list` twice."""
    a_list +=  [val, val]

nums = [1, 2, 3]
augment_twice_not_so_bad(nums, 4)
print(nums)         # [1, 2, 3, 4, 4]
[gravatar]
Good day Sir, I just wanted to ask if elliptical references are allowed for record types in python such as in namedtuple,dictionaries..etc..?? thanks.
[gravatar]
Hello Ned,

Thanks very much for your clear and detailed explanation of the Python variable concept :
It immediately made me think of the UNIX file system and I would like to know if you would agree with this analogy :
o In an UNIX file system, a file-content is identified by its first INODE.
This INODE is not directly a disk-address but an ID defined by the File systems internals to address file content. Thus, a FILE NAME is just a binding to this file-content INODE and you can assign as many file names as desired to a specific file-content, identified by its first INODE.
As for variables in Python language, a REFERENCE COUNT is associated to the file-content : This file-content (disk memory space) is not freed until its reference count comes down to zero : Then, both the disk memory space and the INODES - first INODE and other chained inodes - are both freed. We could say it's the 'garbage collector' mechanism of an Unix file system.
o Would you accept the analogy between files - in UNIX file system - and variables - in Python language - ?
Indeed, for both of them, names are only references to an internal ID and this ID (either in the UNIX file system or in the PYTHON memory system) is associated with effective disk-addresses (... for Unix files) and RAM-memory-addresses (...for PYTHON variables) by the system internals.
o If we go along with this analogy, I imagine there is an ID associated with each Python variable-content. Let's call it VNODE (variable-node-ID)as an analog to the INODE (file-node-ID) of an UNIX file. If that were the case, I guess Guido Van Rossum (Python conceptor) has been strongly inspired by the UNIX file system model (?). Do you know what lead him to its awesome conception of variable objects (?)
o Consequent remark :
One can't really understand the 'Unix file system' until he catches the INODE mechanism.
One can't really understand the 'Python variable system' until he catches the inner mechanism (VNODE or other). Sadly enough, this mechanism is hidden...
Would be great if you could describe the precise physical internal model of the Python-variable-system ! I'm a newbie with Python and understanding the python variable mechanism is so important that I'm surprised not being able to find anything about its internals on the Python tutorials or FAQS, or forums (maybe I didn't googled enough ?)

Thanks in advance if you can validate/invalidate the analogy, as well as the (guessed?) VNODE mechanism for the 'under the hood' Python variable model.
[gravatar]
@Jean, Python the language doesn't require a particular implementation. Any implementation is fine as long as it works the way the pictures in this post show.

CPython doesn't have anything corresponding to your VNODE idea. Or, the VNODE is simply the memory address of the PyObject structure, and therefore, of the data itself (in many cases).
[gravatar]
Thank you so much. What a wonderful explanation.
I have kids passing lists in as parameters and the two alternatives to modify a list that you give are great.
A. upDate(myList)  # list modified "in place", so nothing is returned
OR
B. myList = upDate(myList)  # list copied in function, modified, and the "new" list returned
I suppose B looks and reads better as it is obvious that myList will be changed by the function.
On the other hand if A is suitably commented it does seem a neater solution ... why copy a list and then return it?
What if it is a long list? Is this an issue? Or is the list never really copied in the sense that the integer elements are immutable?
I am swinging towards A, mutate rather than rebind ... as long as it is commented. I wonder if A is more "Pythonesque"?
Either way thank you so much. You are a great writer!
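
The two alternatives in this comment might look like the sketch below. The function names `update_in_place` and `updated_copy` are hypothetical stand-ins for `upDate`, and appending 99 is just a placeholder for whatever change is wanted:

```python
def update_in_place(a_list):
    """Option A: mutate the caller's list; return None by convention."""
    a_list.append(99)

def updated_copy(a_list):
    """Option B: leave the argument alone and return a new list."""
    new_list = list(a_list)   # shallow copy: new list, same element objects
    new_list.append(99)
    return new_list

scores = [1, 2, 3]
update_in_place(scores)
print(scores)                 # [1, 2, 3, 99]

original = [1, 2, 3]
result = updated_copy(original)
print(original)               # [1, 2, 3]
print(result)                 # [1, 2, 3, 99]
print(result[0] is original[0])   # True: elements are shared, not copied
```

On the long-list question: the copy in option B is shallow, so it costs one new reference per element, and the elements themselves are not duplicated.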
[gravatar]
I like your answer: 'Python the language doesn't require a particular implementation'. If I understand correctly, it's exactly like IP protocols and their RFCs: there's a 'spec' defining what an object is in Python and what a Python variable is, no matter the implementation. This 'specification' would just say: 'a variable is just a type of object, with such and such a data structure (the PyObject structure), including a 'reference count' property and a 'content address' property'.

My concern was: when I tried to compare Python variables with UNIX files, I couldn't imagine them without some kind of ID, different from the mere content address. Indeed, in https://docs.python.org/3/reference/datamodel.html , I can read:
'Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. The ‘is’ operator compares the identity of two objects; the id() function returns an integer representing its identity.'

... This 'integer representing its identity' is exactly what I meant.
I notice the word 'representing', which means it's not necessarily
the mere content_address, but directly and immutably linked to it.

'CPython implementation detail: For CPython, id(x) is the memory address where x is stored.'

Thus we could say: the 'Python variable system' (by analogy with 'file system') can be implemented, and probably is, as a tuple (...,...,type,reference_count,content_address), where each line/element of this tuple is a particular variable (object).
id(x) is then a method returning the 'content_address' property.

Would you agree ?
[gravatar]
@Ned, sorry for my somewhat complicated previous comment.
Let me go back to your first explanations: names 'refer' to values; assignment never copies data.
j = i    # '=' doesn't copy data; j is not a copy
j = i+1  # '=' doesn't copy data, but the '+' operator does generate
         # a new value; thus i+1 is copied+transformed data
j = j+1  # '=' doesn't copy data, but the '+' operator does generate
         # a new value; thus j+1 is copied+transformed data
In a word: '=' indeed doesn't copy values, but the '+' operator (and others as well) does initiate a 'copy', the resulting value being an entirely new one.
j = 0
k = j
for i in range(0,10):
  j = j + 5
print ("j: " + str(j)) # prints: j: 50
print ("k: " + str(k)) # prints: k: 0
As a consequence, I come to the conclusion that a simple iteration i=i+1,
in a 1000-cycle loop, represents 1000 creations of new values = new contents = new memory addresses (memory allocations). Interesting! ... but quite strange from a 'C developer' point of view!!! Luckily, I suppose that the Python 'garbage collector' is efficient enough and soon reclaims the 999 freed memory allocations ...
[gravatar]
One more comment :
Immutable values include numbers, strings, and tuples. Almost everything else is mutable, including lists, dicts, and user-defined objects. Mutable means that the value has methods that can change the value in-place.

Then we could say: immutable variables (= immutable values) are READ-ONLY variables*. Every time I have to change the content of such a READ-ONLY type of variable (numbers, strings, tuples), I have to copy it to a new content, and thus to a new variable.

Would you agree with the 'READ-ONLY' qualifier?
It HELPS me to grasp the Python variable data model.

* ... not so far from PHP CONSTANTS, or PHP class constants
[gravatar]
@Jean Ti: I'm glad you are digging into the mechanics of Python. You might need to find another place to have these long discussions. The #python IRC channel can help if you like.

You say, "Python variables [probably are implemented] as a tuple (type,reference_count,content_address)". No, there is no such structure in CPython's implementation. The memory address of the value is the id. The struct at that address has a type and a reference count.

In general, an iteration could produce many many new objects, yes. CPython optimizes the creation of those objects, and in the case of integers, many of them are reused rather than created and destroyed. An advantage of immutable values is the implementation can decide when to share them and when to make new ones.

And finally, yes, READ-ONLY seems like a reasonable synonym for immutable, though remember: immutability is a characteristic of values, not names. So I wouldn't talk about "read-only variables," because any name in Python can be re-assigned at any time. There is no such thing as an immutable name, so I wouldn't say there were immutable variables. This is an area where informal use of "variable" and "value" can get confusing.
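
A small sketch of that distinction, using a string as the immutable value. The names `text` and `old` are illustrative only:

```python
text = "abc"
old = text                    # keep a second name on the original string

try:
    text[0] = "X"             # str offers no in-place modification
except TypeError as exc:
    print("cannot mutate:", type(exc).__name__)   # cannot mutate: TypeError

text = text + "d"             # builds a new string and rebinds the name
print(text)                   # abcd
print(old)                    # abc
print(text is old)            # False
```

The value "abc" never changed; only the name `text` was pointed at a different value.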
[gravatar]
Perfectly clear ! Thanks again ...
[gravatar]
Hello Ned,

I have been following your blog, and I really liked your PyCon 2015 talk about names and values. I have a question regarding tuples: why can they be changed when they have (mutable) lists inside them?

Ex:

tup1=(1,2,[3,4,5])

tup1[2][0]=6

(New tuple) tup1=(1,2,[6,4,5])

Why can this happen when a tuple is immutable? Please explain the reason behind this. I'm stuck on this. Thanks in advance.

Regards,
Uday.
[gravatar]
Tuples are immutable in this sense: you cannot change what objects they refer to, and you cannot change their size. But objects in a tuple may or may not be immutable, depending on their type. Tuples don't enforce immutability on their contents.
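
A minimal sketch of that answer, reusing the values from the question. The name `inner` is added here only to make the shared list visible:

```python
inner = [3, 4, 5]
tup = (1, 2, inner)

tup[2][0] = 6                 # mutates the list, not the tuple
print(tup)                    # (1, 2, [6, 4, 5])
print(tup[2] is inner)        # True: the tuple still refers to the same list

try:
    tup[2] = [7, 8, 9]        # trying to make the tuple refer to another object
except TypeError:
    print("tuple slots cannot be reassigned")

print(len(tup))               # 3
```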
[gravatar]
Thanks, Now I'm clear about that Ned.
[gravatar]
This article shed some light onto how python works, thanks!

I'd like to add that
import foo 
and
from foo import bar
do behave differently regarding assignment as stated here:
https://stackoverflow.com/a/19185936/4884487
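
A rough sketch of the difference, under the assumption that a throwaway module named `foo` with an attribute `bar` exists. It is built in memory here so the example is self-contained:

```python
import sys
import types

# Hypothetical module "foo", registered by hand so the imports below work.
foo = types.ModuleType("foo")
foo.bar = "original"
sys.modules["foo"] = foo

import foo as foo_module      # a name bound to the module object
from foo import bar           # a name bound to the value foo.bar has right now

foo_module.bar = "changed"    # rebinds the attribute on the module

print(foo_module.bar)         # changed
print(bar)                    # original
```

`from foo import bar` behaves like an assignment `bar = foo.bar`, so later rebinding of `foo.bar` does not affect the local name `bar`.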
[gravatar]
I really liked this blog. One question - why do you avoid using the word object and insist on value?
[gravatar]
@Mark: it's true that all values in Python are objects. But it's not true in other languages, so some readers might miss the breadth of the point if I made this piece about "Names and Objects". The fact that all values in Python are objects is a bit orthogonal to the names and values point that I am focused on here.
[gravatar]
Hi Ned,

I am reading the following piece of code,

def get_value(obj, field):
    if isinstance(data.get(field), (list, tuple)):
        return lower("".join(data.get(field) or []))
    else:
        return lower(data.get(field) or "")

fields = fields or DEFAULT_FIELDS
string = u"".join([get_value(data, field) for field in fields])

When get_value is called, obj is made to refer to the value of data, so shouldn't the function definition look like this instead?

def get_value(obj, field):
    if isinstance(obj.get(field), (list, tuple)):
        return lower("".join(obj.get(field) or []))
    else:
        return lower(obj.get(field) or "")
[gravatar]
Hey, I think there is a typo here:

"The first makes x refer to a new value (rebinding), the second is modifying the value x refers to (mutating)."

I think you meant to say:

"The first makes x refer to a new value (rebinding), the second is modifying the value nums refers to (mutating)."
[gravatar]
@Mate: nice find, after all these years! I've fixed the typo. Thanks.
[gravatar]
Hi Ned. I just tried this: there are two lists that each have an element with the same value. My inspection showed that these lists are two different objects. However, the elements with the same value refer to the same object. Is this an indication that the lists consist of another object?
[gravatar]
Syahid, that's because of an optimization in CPython. Small integers always reference the same object in memory (Python never creates two instances of int(1), for example). If you replace your integers with larger ones, like 1000, you'll find that this is no longer the case. Also note that this optimization is implementation specific to CPython, so you shouldn't rely on it (don't use "is" to compare integers, for instance).
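
A short sketch of why `==` is the right comparison for integers. The values are built with `int("1000")` at run time on the assumption that this avoids any sharing of literal constants. The result of `is` is deliberately not shown, because it is an implementation detail:

```python
a = int("1000")               # built at run time from a string
b = int("1000")

print(a == b)                 # True: same value
same_object = a is b          # may be True or False depending on the implementation
print(same_object)

first = [a]
second = [b]
print(first == second)        # True: lists compare their elements by value
print(first is second)        # False: two separate list objects
```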
[gravatar]
Hi Ned,

Thanks for this article, it is extremely educational and easy to digest. I've been programming for years now and never paid too much attention to this.

I'm curious whether you have written about any of the skip list items. These are all interesting to me, and I'm specifically intrigued by 3 & 4.

Also, pointing out that you have this typo:
"so the here the local name a_list"
I think you meant to write "in here" there.
[gravatar]
@Jeff, thanks for the typo find, I've fixed it.
[gravatar]
Thanks Ned for the excellent article, one of the best on this topic. I was a C++ hobbyist, so Python gave me a lot of culture shock when I started a couple of months ago. I think there is another thing to cover, i.e. what does Python do when I write ls = ls * 2 with ls as a list? I find it a bit conflicting: on one side, quoted from your article, "Assignment never creates new object", so ls should be rebound to a new object created by the * operator; on the other side, a list is mutable, so one would assume that this simply mutates the original list, which is not the case. This is the most confusing part for me for now. I'm pretty much fine with what you said in the article, but this is confusing.
[gravatar]
(I cannot edit the original comment)
BTW I think it would be best for people like me, who came from a C/C++ background, to completely remove the thought of "call by value/ref" from their head, and just start clean and use "names", "binding" and other definition.
[gravatar]
Lists are mutable, but that doesn't mean that every operation mutates them. 2*l creates a new list. In fact, Python makes this fairly easy to remember because the list methods that mutate in place, such as append, sort, and reverse, return None, so that you can't chain operations with them (pop is an exception, since it returns the removed item, and be aware that some Python libraries like NumPy do not follow this convention). So l = mutable_operation(l) would mutate the list l but then rebind l (the name) to None (the list will still be a mutated list if some other variable points to it; mutation on a list can only change its values, not its type).

One gotcha to know about is in place operators like += or *=. These always mutate in place for mutable objects like lists. This matters if you pass a list into a function and that function does in place operators on it without copying it. The same list will be mutated in the calling function.
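
The gotcha can be sketched like this. The function names `extend_in_place` and `extend_rebind` are made up for the illustration:

```python
def extend_in_place(a_list):
    a_list += [4]             # in-place operator: mutates the caller's list

def extend_rebind(a_list):
    a_list = a_list + [4]     # new list, bound to the local name only

first = [1, 2, 3]
extend_in_place(first)
print(first)                  # [1, 2, 3, 4]

second = [1, 2, 3]
extend_rebind(second)
print(second)                 # [1, 2, 3]

doubled = second * 2          # `*` builds a new list
print(doubled)                # [1, 2, 3, 1, 2, 3]
print(second)                 # [1, 2, 3]
print(second.append(5))       # None
```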
[gravatar]
@Aaron Meurer

Thanks for the quick reply. Yeah, at first I thought that since lists are mutable, it is always the original that gets changed. Apparently I was wrong, at least in the case of the * operator. I need to look into how the * operator works for lists (probably an overload?) to better understand the problem. I try to go as deep as possible instead of memorizing everything.

I also tried ls = ls.append("new string") and you are correct that I got a NoneType error.

I'm wondering if I should follow some programming convention to avoid as many of these gotchas as possible...
[gravatar]
> Fact: Names refer to values.

Wrong. Names refer to objects.
[gravatar]
@Bachsau, your comment seems needlessly pedantic and aggressive. In Python, all values are objects. Both of our statements are accurate. Maybe we're using different meanings of "value"?
[gravatar]
I just think that this is the most important thing to understand when someone starts learning Python: that everything is an object and that there are no variables in any traditional sense, only pointers to objects. Misunderstanding them as variables holding values is often the source of great confusion. While in languages like C, raw data "values" and a type definition are bound to a variable, names in Python have no type or value of their own but point to a distinct object which knows everything about itself. It can, theoretically, exist on its own, while a variable's value in C cannot. This also means that assigning something to a name never results in a copy, which is also different from other languages. Not meant to be aggressive.
[gravatar]
@Bachsau, great, we agree. These are exactly the points that the rest of the piece covers.

But I object to the term "traditional sense" of variable. For the last few decades, there have been more languages that have variables like Python than like C. It's time to move past the idea that everything starts with C.
[gravatar]
def augment_twice(a_list, val):
    """Put `val` on the end of `a_list` twice."""
    a_list.append(val)
    a_list.append(val)

nums = [1, 2, 3]
augment_twice(nums, 4)
print(nums)         # [1, 2, 3, 4, 4]
It might be useful to mention here (the diagram doesn't make this obvious) that both 4s at the end at this point would be referring to the same value.

In other words, if you had added a list instead of a number, modifications to either list would show up on the other.
augment_twice(nums, [1, 2])
nums[5].append(3)
print(nums) # [1, 2, 3, 4, 4, [1, 2, 3], [1, 2, 3]]
[gravatar]
Why is “is” different than “==” and how come “2 + 2 is 4”, but “1000 + 1 is not 1001”

I know the difference between == and "is", but why was "1000 + 1 is not 1001" a fact when this article was written, yet no longer is? What changed in Python to cause this?
I have read that Python caches int values within a certain range, and that values outside this range are created at runtime, which could make the objects "1000 + 1" and "1001" different. That makes sense, but maybe it is no longer accurate?
(I tried comparing 1000 + 1 and 1001, and even 10000000000 + 1 and 10000000001, using "is", and all of them return "True".)
Sorry for my bad English.
[gravatar]
It looks like Python is doing some optimization here. When Python sees a purely numerical expression like 1000 + 1, it evaluates it at compile time, so that it doesn't need to execute it when the program runs. So "1000 + 1 is 1001" literally compiles to "1001 is 1001". It would seem that the interpreter further optimizes this by reusing the same 1001 object. It looks like this optimization was introduced in Python 3.7.

If you instead do

a = 1000 + 1
b = 1001
a is b

you will see that it is False, because this no longer causes Python to reuse the same object for 1001.

The moral is that you should never use "is", unless you actually do want to see if something is the same object in memory. For immutable objects, the interpreter is completely free to swap out one instance for another and make optimizations like this, and you shouldn't rely on the specific details of this.
[gravatar]
@Aaron Meurer: thanks for your quick reply. Again, it makes sense but ... it is not what I observe. I tried what you said, but the result was still "True", meaning that a and b refer to the same object "1001". It seems that Python has improved something beyond just the interpreter.
[gravatar]
Oh no, I just realized that if I execute the code one line at a time, then a and b refer to two different 1001 objects. On the other hand, when I executed it as an entire program, the result was "True". I am very confused right now.
Does this mean that the Python interpreter optimizes the code and decides to use the same "1001" object for a and b?
[gravatar]
When Python compiles a file it compiles the entire module at a time, whereas the interactive interpreter compiles each statement as it is executed. So likely the optimizer finds all numeric constants in any block of code that it compiles and if any of them are equal it makes them reference the same object.

This is just me assuming how things work based on behavior. Perhaps someone with more knowledge of the Python internals can give more technical details.
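
One hedged way to poke at this is to compile a small block of source as a single unit and look at the constants the compiler stored. The source string and the `<demo>` filename are made up for the example, and what gets printed for `co_consts` and for `is` depends on the Python implementation and version, so no output is claimed for those lines:

```python
source = "a = 1000 + 1\nb = 1001\n"

# Compile both statements together, as happens for a whole file.
code = compile(source, "<demo>", "exec")
print(code.co_consts)         # constants stored for this block (version-dependent)

namespace = {}
exec(code, namespace)
print(namespace["a"] == namespace["b"])      # True
shared = namespace["a"] is namespace["b"]    # implementation detail
print(shared)
```

Typing the two assignments separately at the interactive prompt compiles them as separate units, which is consistent with the different result reported above.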
