SoFunction
Updated on 2024-11-12

Explaining the scope of index variables in Python's for loops

Let's start with a test. What does the following function do?
 

def foo(lst):
  a = 0
  for i in lst:
    a += i
  b = 1
  for t in lst:
    b *= i
  return a, b

Don't feel bad if you answered "it computes the sum and product of all the elements in lst". The error here is genuinely hard to spot, and it would be even harder to spot buried in a big pile of real code, or when you don't know you're being tested.
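For reference, here is a sketch of the version that actually computes the sum and the product, with the second loop's variable used consistently:

```python
def foo(lst):
  a = 0
  for i in lst:
    a += i
  b = 1
  for t in lst:
    b *= t  # the buggy version multiplied by i here instead of t
  return a, b

print(foo([1, 2, 3, 4]))  # (10, 24)
```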

The error here is the use of i in the second loop body instead of t. Wait, how does this even work? Shouldn't i be invisible outside the first loop? [1] Well, no. In fact, Python formally acknowledges that the names bound by a for loop's target list (often called "index variables") leak into the enclosing function scope. Hence the following code:
 

for i in [1, 2, 3]:
  pass
print(i)

This code is valid and prints 3. In this article, I want to explore why this is the case, why it's unlikely to change, and use it as a tracer bullet to dig into some interesting parts of CPython's internals.

By the way, if you don't believe that this behavior could cause real problems, consider this code snippet:
 

def foo():
  lst = []
  for i in range(4):
    lst.append(lambda: i)
  print([f() for f in lst])

If you expected the code above to print [0, 1, 2, 3], your expectations will be dashed: it prints [3, 3, 3, 3], because there is only one i in the scope of foo, and that single i is what all the lambdas capture.
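The usual workaround, for what it's worth, is to bind the current value of i as a default argument, so that each lambda captures its own copy (a sketch, with the print replaced by a return so the result is easy to check):

```python
def foo():
  lst = []
  for i in range(4):
    # the default argument i=i freezes the current value of i
    # at the moment the lambda is defined
    lst.append(lambda i=i: i)
  return [f() for f in lst]

print(foo())  # [0, 1, 2, 3]
```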

This behavior is explicitly documented in the for loop section of the Python reference documentation:

The for-loop makes assignments to the variables in the target list. ...... Names in the target list are not deleted when the loop is finished, but if the sequence is empty, they will not have been assigned to at all by the loop.

Notice the last sentence. Let's try it:
 

for i in []:
  pass
print(i)

Indeed, the code above throws a NameError exception. Later, we will see that this is a natural consequence of the way the Python VM executes bytecode.
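A common defensive idiom (not something the reference documentation prescribes, just a pattern you'll see in real code) is to pre-bind the name before the loop, so that it exists even when the sequence turns out to be empty:

```python
i = None  # pre-bind the name so it survives an empty sequence
for i in []:
  pass
print(i)  # None, rather than a NameError
```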
Why is that?

I actually asked Guido van Rossum about the reasons for this behavior, and he was kind enough to share some of the historical background (thanks Guido!). The motivation is to keep Python's handling of variables and scopes simple, without resorting to hacks (such as deleting all the variables defined in a loop once it finishes; think of the exceptions that could raise) or to more complex scoping rules.

Python's scoping rules are simple and elegant: the bodies of modules, classes, and functions introduce scopes. Within a function, variables are visible from their first assignment to the end of the body (including nested blocks such as nested functions). The rules differ slightly for local, global, and other nonlocal variables, but that is not very relevant to our discussion.

The most important point here is: the innermost possible scope is a function body. Not a for loop body. Not a with block. Python, unlike other programming languages (such as C and its descendants), does not have nested lexical scopes below the function level.

So if you implemented Python's scoping rules exactly as stated, this behavior is what naturally falls out. Here is another instructive code snippet:
 

for i in range(4):
  d = i * 2
print(d)

The variable d is visible and accessible after the for loop ends. Are you surprised? No, this is exactly how Python works. So why should the scope of index variables be treated any differently?

Incidentally, index variables in list comprehensions also leak into their enclosing scope. Or, more accurately, leaked, before Python 3.

Python 3 contains a number of significant changes, among them a fix for the variable leakage in list comprehensions. No doubt this broke backwards compatibility, which is one reason I don't believe the behavior for for loops will be changed.
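You can verify the Python 3 fix directly: the comprehension's variable lives in its own implicit scope and no longer clobbers an outer name of the same spelling:

```python
x = 10
squares = [x * x for x in range(4)]
print(x)        # 10: the comprehension's x did not leak (Python 3)
print(squares)  # [0, 1, 4, 9]
```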

Furthermore, many people still find this a useful feature in Python. Consider the following code:
 

for i, item in enumerate(somegenerator()):
  dostuffwith(i, item)
print('The loop executed {0} times!'.format(i+1))

If you don't know in advance how many items somegenerator returns, this is a concise way to find out. Otherwise, you'd have to keep a separate counter.
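Note that this idiom assumes the generator yields at least one item; for an empty generator, i would be unbound and the print would raise a NameError. A defensive sketch (somegenerator and dostuffwith are hypothetical stand-ins for the names in the article's example):

```python
def somegenerator():
  # hypothetical generator standing in for the article's example
  yield from ['a', 'b', 'c']

def dostuffwith(i, item):
  pass  # placeholder for real per-item work

count = 0  # pre-bound, so it exists even if the generator is empty
for count, item in enumerate(somegenerator(), start=1):
  dostuffwith(count, item)
print('The loop executed {0} times!'.format(count))
```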

Here's one other example:
 

for i in somegenerator():
  if isinteresting(i):
   break
dostuffwith(i)

This pattern effectively looks for an item in a loop and uses that item later. [2]
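If nothing matches, though, i ends up holding the last (uninteresting) item, or is unbound for an empty sequence. Python's for/else clause handles the not-found case cleanly; here is a sketch with a hypothetical isinteresting predicate:

```python
def isinteresting(x):
  # hypothetical predicate: multiples of 5 are "interesting"
  return x % 5 == 0

for i in [3, 7, 10, 12]:
  if isinteresting(i):
    break
else:        # runs only if the loop finished without hitting break
  i = None   # nothing interesting was found

print(i)  # 10
```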

People have relied on this behavior for years. It is hard to introduce breaking changes even for a feature that developers agree is harmful; it's even harder to remove one that many people find useful and use heavily in real-world code.
Under the hood

Now for the most interesting part. Let's look at how the Python compiler and VM work together to make this behavior possible. In this particular case, I think the clearest presentation is to start from the bytecode and work backwards. I also hope this serves as an example of how to dig into Python's internals [3] (which is great fun!).

Let's look at part of the function presented at the beginning of this article:
 

def foo(lst):
  a = 0
  for i in lst:
    a += i
  return a

The resulting bytecode is:
 

 0 LOAD_CONST        1 (0)
 3 STORE_FAST        1 (a)
 
 6 SETUP_LOOP       24 (to 33)
 9 LOAD_FAST        0 (lst)
12 GET_ITER
13 FOR_ITER        16 (to 32)
16 STORE_FAST        2 (i)
 
19 LOAD_FAST        1 (a)
22 LOAD_FAST        2 (i)
25 INPLACE_ADD
26 STORE_FAST        1 (a)
29 JUMP_ABSOLUTE      13
32 POP_BLOCK
 
33 LOAD_FAST        1 (a)
36 RETURN_VALUE

As a hint, LOAD_FAST and STORE_FAST are the bytecodes (opcodes) Python uses for variables that are local to a function. Since the Python compiler knows at compile time how many local variables each function has, they can be accessed via fixed array offsets rather than through a hash table, which makes access faster (hence the _FAST suffix). But I digress. What really matters here is that the variables a and i are treated identically: both are read with LOAD_FAST and written with STORE_FAST. There is absolutely no reason to believe their visibility differs. [4]
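You can reproduce a disassembly like the one above with the dis module. The exact opcodes vary across CPython versions (SETUP_LOOP, for instance, was removed in 3.8), but the symmetric treatment of a and i is visible in any version:

```python
import dis

def foo(lst):
  a = 0
  for i in lst:
    a += i
  return a

dis.dis(foo)  # prints the full disassembly for your interpreter version

# Both a and i are ordinary fast locals of foo:
opnames = {ins.opname for ins in dis.get_instructions(foo)}
print('FOR_ITER' in opnames)              # True
print(sorted(foo.__code__.co_varnames))   # ['a', 'i', 'lst']
```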

So how does this come about? Why does the compiler consider i to be just another local variable of foo? The logic lives in the symbol table code, which the compiler runs over the AST before it starts creating a control flow graph and, from that, bytecode. More details of this process are described in my article on symbol tables, so I'll only mention the highlights here.

The symbol table code does not treat for statements as special in any way. Here is the relevant part of symtable_visit_stmt:
 

case For_kind:
  VISIT(st, expr, s->v.For.target);
  VISIT(st, expr, s->v.For.iter);
  VISIT_SEQ(st, stmt, s->v.For.body);
  if (s->v.For.orelse)
    VISIT_SEQ(st, stmt, s->v.For.orelse);
  break;

The index variable is visited like any other expression. Since this code works on the AST, it's worth seeing what the AST node for our for statement looks like:
 

For(target=Name(id='i', ctx=Store()),
  iter=Name(id='lst', ctx=Load()),
  body=[AugAssign(target=Name(id='a', ctx=Store()),
          op=Add(),
          value=Name(id='i', ctx=Load()))],
  orelse=[])
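This dump is easy to reproduce with the ast module (the exact field formatting varies slightly across versions, but the Store and Load contexts are always there):

```python
import ast

# parse the loop from our example function and dump its For node
tree = ast.parse("for i in lst:\n  a += i")
print(ast.dump(tree.body[0]))
```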

So i is in a node named Name. These are handled by the symbol table code via the following statements in symtable_visit_expr:
 

case Name_kind:
  if (!symtable_add_def(st, e->v.Name.id,
             e->v.Name.ctx == Load ? USE : DEF_LOCAL))
    VISIT_QUIT(st, 0);
  /* ... */

The variable i is clearly marked as DEF_LOCAL (this is evident from the *_FAST bytecodes, and also easy to observe directly with the symtable module), so the code above must have called symtable_add_def with DEF_LOCAL as the third argument. Now look back at the AST dump above and notice the ctx=Store part of the Name node for i: it is the AST that carries the information that i is stored to in the target part of the For node. Let's see how that comes about.
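The symtable module mentioned above makes this easy to check without reading any C (a sketch):

```python
import symtable

src = '''
def foo(lst):
  a = 0
  for i in lst:
    a += i
  return a
'''
mod = symtable.symtable(src, '<example>', 'exec')
foo_table = mod.get_children()[0]       # the symbol table for foo
sym = foo_table.lookup('i')
print(sym.is_local(), sym.is_global())  # True False
```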

The AST-building part of the compiler walks over the parse tree (a fairly low-level representation of the source code; some background information is available here) and, among other things, sets the expr_context attribute on certain nodes, most notably Name nodes. Consider the following statement:
 

foo = bar + 1

Both variables, foo and bar, end up in Name nodes. But bar is only loaded here, while foo is actually stored to. The expr_context attribute is what the symbol table code later uses to distinguish between these uses [5].

Back to the index variables of our for loop. These are handled in the function ast_for_for_stmt, which creates the AST node for for statements. Here is the relevant part of that function:
 

static stmt_ty
ast_for_for_stmt(struct compiling *c, const node *n)
{
  asdl_seq *_target, *seq = NULL, *suite_seq;
  expr_ty expression;
  expr_ty target, first;
  const node *node_target;
 
  /* ... */
 
  node_target = CHILD(n, 1);
  _target = ast_for_exprlist(c, node_target, Store);
  if (!_target)
    return NULL;
  /* Check the # of children rather than the length of _target, since
    for x, in ... has 1 element in _target, but still requires a Tuple. */
  first = (expr_ty)asdl_seq_GET(_target, 0);
  if (NCH(node_target) == 1)
    target = first;
  else
    target = Tuple(_target, Store, first->lineno, first->col_offset, c->c_arena);
 
  /* ... */
 
  return For(target, expression, suite_seq, seq, LINENO(n), n->n_col_offset,
        c->c_arena);
}

The Store context is created by the call to ast_for_exprlist, which creates the nodes for the index variables (note that a for loop's target may also be a tuple of variables, not just a single variable).

This function is the last essential piece of the explanation of why for loop variables are treated the same as any other variable in the function. After this marking is done in the AST, the code that handles loop variables in the symbol table and in the VM is the very same code that handles all other variables.
Concluding remarks

This article discusses some specific behaviors in Python that might be considered "gotchas". I hope this article does explain the code execution behavior of Python's variables and scopes, why this behavior is useful and unlikely to ever change, and how the internals of the Python compiler make it work. Thank you for reading!

[1] I'd love to make a Microsoft Visual C++ 6 joke here, but the fact that most readers of this blog wouldn't get the joke in 2015 is a little disturbing (which reflects my age, not my readers' abilities).

[2] You might argue that the call dostuffwith(i) could simply be moved into the if, just before the break. However, this is not always convenient. Moreover, according to Guido's explanation, there is a nice separation of concerns here: the loop is used for searching, and only for searching. What happens to the loop variable once the search is over is no longer the loop's business. I think that's a very good point.

[3] As usual, the code in my posts is based on Python 3. Specifically, I'm looking at the default branch of the CPython repository, where work towards the next version (3.5) is being done. But for this particular topic, the source code of any version in the 3.x series should work.

[4] Another thing that is apparent in the disassembly is why i remains invisible if the loop never executes: the GET_ITER and FOR_ITER bytecode pair turns whatever we loop over into an iterator and then calls its __next__ method. If that call ends up raising a StopIteration exception, the VM catches it and ends the loop. Only if an actual value is returned does the VM go on to execute STORE_FAST for i, thereby bringing the name into existence for subsequent code to refer to.
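What FOR_ITER does can be spelled out in pure Python: the target name is assigned only after __next__ returns a value, so an empty iterable means it is never assigned at all. A sketch:

```python
def emulate_for(iterable):
  it = iter(iterable)    # what GET_ITER does
  while True:
    try:
      i = next(it)       # FOR_ITER; i is bound only when next() succeeds
    except StopIteration:
      break              # empty or exhausted: the VM just ends the loop
    # (the loop body would run here)
  return locals()

print('i' in emulate_for([]))      # False: i was never bound
print('i' in emulate_for([1, 2]))  # True: i was left bound to 2
```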

[5] This is a somewhat odd design, and I suspect its rationale is to keep the recursive visiting code in the AST consumers (such as the symbol table code and the CFG generator) relatively clean.