How to use pdb for Python debugging

Debugging is not always a welcome activity: when you have been coding for a long time, you just want the code you've written to run smoothly. At other times, though, you may be learning a new language feature or experimenting with a new approach, and you want to understand how it actually works.

Either way, debugging code is necessary, so it is worth learning how to use a debugger. In this tutorial, I cover the basics of pdb, Python's interactive source code debugger.

We start with the basics of pdb; you can save this article for later reference. Like other debuggers, pdb is a standalone tool that is hard to replace when you need one. By the end of this tutorial, you will know how to use the debugger to inspect any variable in your application and how to stop or resume execution at any point, so you can see how each line of code affects the application's internal state.

This helps in tracking down hard-to-find bugs and in fixing defective code quickly and reliably. Sometimes, single-stepping through code in pdb and watching how the variable values change also gives a deeper understanding of the application. pdb is part of the Python standard library, so it is available wherever the Python interpreter is, which is very convenient.

The example code used in this article is linked at the end. It targets Python 3.6 and later, and you can download the source code to follow along.

1. Getting started: printing a variable's value

Let's begin with the simplest use of pdb: checking the value of a variable. First, place the following statement in a source file, on the line where you want execution to stop:

import pdb; pdb.set_trace()

When this line is executed, Python pauses the program and waits for you to tell it what to do next. You will see the (Pdb) prompt in the command-line interface, which means that execution has stopped and pdb is waiting for a command.

Since Python 3.7, you can use the built-in function breakpoint() instead of the line above (note: the code accompanying this article uses the form shown above):

breakpoint()

By default, breakpoint() imports the pdb module and calls pdb.set_trace(); it is a thin wrapper around that call. However, breakpoint() is more flexible: you can control its behavior through its API and through the environment variable PYTHONBREAKPOINT. For example, setting PYTHONBREAKPOINT=0 in the environment completely disables breakpoint(), and with it debugging.
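
The mechanism behind this flexibility can be sketched in a few lines. breakpoint() calls whatever function is currently stored in sys.breakpointhook, and the default hook is the one that consults PYTHONBREAKPOINT. The sketch below (Python 3.7 or later) swaps in a recording function, so it runs without opening a debugger; the names recording_hook and hook_calls are made up for illustration.

```python
import sys

hook_calls = []


def recording_hook(*args, **kwargs):
    """Stand-in for a debugger entry point: only record that it was called."""
    hook_calls.append((args, kwargs))


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook  # breakpoint() looks this up on every call
try:
    breakpoint()  # calls recording_hook() instead of pdb.set_trace()
finally:
    sys.breakpointhook = original_hook  # restore the previous behavior

print(f"hook was called {len(hook_calls)} time(s)")
```

In everyday use you would rarely replace the hook yourself; setting PYTHONBREAKPOINT in the environment is usually enough.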

Additionally, instead of adding breakpoint code to the source file by hand, you can run pdb directly from the command line and pass it the script and its arguments, for example:

$ python3 -m pdb  arg1 arg2

Now to the point of this subsection: accessing the values of variables in the code. Consider the following example source file:

#!/usr/bin/env python3

filename = __file__
import pdb; pdb.set_trace()
print(f'path={filename}')

Run this file from your command line and you get the following output:

$ ./
> /code/(5)<module>()
-> print(f'path={filename}')
(Pdb)

Next, let's enter the command p filename to see the value of the variable filename, as you can see below:

(Pdb) p filename
'./'
(Pdb)

Since we are working in a command-line interface, pay attention to the characters and format of the output:

> starts the first line, which gives the name of the source file being run, followed by the current line number in parentheses and then the function name. Here, since we have not called any function and are at module level, the function name is shown as <module>().
-> starts the second line, which shows the source code of the current line.
(Pdb) is the pdb prompt, waiting for the next command.

Use the q (quit) command to exit the debugger.
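
To experiment with these commands without typing at the prompt, a session can also be scripted. The sketch below is not part of the article's example code: run_pdb_session is a hypothetical helper that feeds a list of commands to pdb.Pdb through its stdin and stdout parameters, and './example.py' is only sample data. Because the source is passed as a string, pdb reports the file as <string> and has no source text for the -> lines.

```python
import io
import pdb


def run_pdb_session(source, commands):
    """Run source under pdb, reading commands from a list instead of the keyboard."""
    stdin = io.StringIO("\n".join(commands) + "\n")
    stdout = io.StringIO()
    debugger = pdb.Pdb(stdin=stdin, stdout=stdout, nosigint=True, readrc=False)
    debugger.run(source, {})  # pdb stops before the first line is executed
    return stdout.getvalue()


# Hypothetical two-line script used only for this demonstration.
source = "filename = './example.py'\nmessage = 'path=' + filename\n"

# n executes line 1 so that filename exists, p prints values, q quits.
session = run_pdb_session(source, ["n", "p filename", "p len(filename)", "q"])
print(session)
```

The captured text contains the same three elements described above: the > location line, the (Pdb) prompt, and the values printed by p.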

2. Print expressions

The p command also accepts an expression, which Python evaluates and whose value pdb prints. If you pass a variable name, pdb prints the variable's current value. But you can go further and investigate the running state of your application.

In the following example, I have inserted pdb.set_trace() into the function get_path() so that execution pauses there and we can see what happens when the function is called. The source code is:

#!/usr/bin/env python3
import os
def get_path(filename):
    """Return file's path or empty string if no path."""
    head, tail = os.path.split(filename)
    import pdb; pdb.set_trace()
    return head

filename = __file__
print(f'path = {get_path(filename)}')

If you run this file from the command line, you get the following output:

$ ./ 
> /code/(10)get_path()
-> return head
(Pdb)

So where are we at this point?

>: We are at line 10 of the source file, inside the function get_path(). This is the frame of reference that the p command uses to resolve variable names, that is, the current scope or context.
->: Execution has stopped at return head. This line, line 10 of the file inside get_path(), has not been executed yet.

To see the code around the point where execution is paused, use the command ll (longlist), which lists the source of the current function:

(Pdb) ll
  6     def get_path(filename):
  7         """Return file's path or empty string if no path."""
  8         head, tail = os.path.split(filename)
  9         import pdb; pdb.set_trace()
 10  ->     return head
(Pdb) p filename
'./'
(Pdb) p head, tail
('.', '')
(Pdb) p 'filename: ' + filename
'filename: ./'
(Pdb) p get_path
<function get_path at 0x100760e18>
(Pdb) p getattr(get_path, '__doc__')
"Return file's path or empty string if no path."
(Pdb) p [os.path.split(p)[1] for p in os.path.sys.path]
['pdb-basics', '', 'python3.6', 'lib-dynload', 'site-packages']
(Pdb)

After p, you can enter any valid Python expression to have it evaluated. This is especially useful when you are debugging and want to test an alternative implementation directly in your application at runtime. You can also use the command pp (pretty-print), which is helpful for variables or expressions with a lot of output, such as lists and dictionaries. Pretty-printing keeps an object on a single line if it fits within the allowed width and breaks it onto multiple lines if it doesn't.
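
The difference between p and pp can be tried outside the debugger, since pp formats values with the standard pprint module. The sketch below uses made-up sample data (short_value, long_value) to show both cases: a value that fits on one line and one that is wrapped.

```python
import pprint

# Hypothetical sample data: one short list, one dictionary too wide for one line.
short_value = ['pdb-basics', 'site-packages']
long_value = {f"module_{i}": list(range(i, i + 6)) for i in range(4)}

short_text = pprint.pformat(short_value)  # fits within the default width of 80
long_text = pprint.pformat(long_value)    # too wide, so it is split across lines

print(short_text)
print(long_text)
```

For the short list the result is identical to its repr, which is what p would show; for the dictionary each key ends up on its own line.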

3. Debugging code

In this section, we mainly use two commands to step through code: n (next) and s (step).

The difference between n (next) and s (step) is where pdb stops. With n (next), pdb executes until it reaches the next line in the current function or module; if another function is called on the way, pdb does not stop inside it. Think of it as "step over". With s (step), pdb executes the current line and stops at the first possible opportunity; if another function is called, pdb stops inside it and prints --Call--. Think of it as "step into".

Both n (next) and s (step) also stop when execution reaches the end of the current function, in which case pdb prints --Return--. The source file for this example is:

#!/usr/bin/env python3

import os

def get_path(filename):
    """Return file's path or empty string if no path."""
    head, tail = os.path.split(filename)
    return head

filename = __file__
import pdb; pdb.set_trace()
filename_path = get_path(filename)
print(f'path = {filename_path}')

If you run this file from the command line and enter the command n, you get the following output:

$ ./ 
> /code/(14)<module>()
-> filename_path = get_path(filename)
(Pdb) n
> /code/(15)<module>()
-> print(f'path = {filename_path}')
(Pdb)

With n (next), we stop at line 15, the next line. We are still in the module and did not stop inside the function get_path(). The function is shown as <module>(), indicating that we are at module level and not inside any function.

Now let's try the s (step) command:

$ ./ 
> /code/(14)<module>()
-> filename_path = get_path(filename)
(Pdb) s
--Call--
> /code/(6)get_path()
-> def get_path(filename):
(Pdb)

With s (step), we stop at line 6 inside the function get_path(), because that function is called at line 14 of the file. Notice the --Call-- line printed after the s command, indicating a function call. Conveniently, pdb remembers your last command, so when stepping through a lot of code you can simply press Enter to repeat it.

Here is an example that mixes both commands: first s (step), because we want to enter the function get_path(), then n (next) to step through the function locally, pressing Enter to repeat the last command:

$ ./ 
> /code/(14)<module>()
-> filename_path = get_path(filename)
(Pdb) s
--Call--
> /code/(6)get_path()
-> def get_path(filename):
(Pdb) n
> /code/(8)get_path()
-> head, tail = os.path.split(filename)
(Pdb) 
> /code/(9)get_path()
-> return head
(Pdb) 
--Return--
> /code/(9)get_path()->'.'
-> return head
(Pdb) 
> /code/(15)<module>()
-> print(f'path = {filename_path}')
(Pdb) 
path = .
--Return--
> /code/(15)<module>()->None
-> print(f'path = {filename_path}')
(Pdb)

Note the --Call-- and --Return-- lines. These are messages from pdb about the state of the debugging session. Both n (next) and s (step) stop before a function returns, which is why the --Return-- message appears. Also note the ->'.' at the end of the line after the first --Return-- above:

--Return--
> /code/(9)get_path()->'.'
-> return head
(Pdb)

When pdb stops at the end of a function, just before it returns, it also prints the return value, which in the example above is '.'.
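
The contrast between "step over" and "step into" can also be reproduced with a scripted session. This is a sketch under assumptions, not the article's example file: run_pdb_session is a hypothetical helper that drives pdb.Pdb from a list of commands, the five-line SOURCE is invented, and str.rsplit stands in for the path handling to keep the snippet free of imports. Since the source is a string, pdb shows the file name as <string>.

```python
import io
import pdb

# Hypothetical five-line script: a small function and a call to it.
SOURCE = """\
def get_path(filename):
    head = filename.rsplit('/', 1)[0]
    return head
path = get_path('./dir/example.py')
result = path
"""


def run_pdb_session(source, commands):
    """Run source under pdb, reading commands from a list instead of the keyboard."""
    stdin = io.StringIO("\n".join(commands) + "\n")
    stdout = io.StringIO()
    debugger = pdb.Pdb(stdin=stdin, stdout=stdout, nosigint=True, readrc=False)
    debugger.run(source, {})
    return stdout.getvalue()


# The first n runs the def statement and stops on line 4, the call to get_path().
step_over = run_pdb_session(SOURCE, ["n", "n", "q"])  # second n stays in the module
step_into = run_pdb_session(SOURCE, ["n", "s", "q"])  # s enters get_path()

print(step_over)
print(step_into)
```

Only the second session contains --Call-- and a location line that names get_path(); the first one moves straight on to line 5 of the module.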

3.1 Display code

Don't forget the command ll (longlist), mentioned above, which displays the source code of the current function or frame. It is very useful when you step into unfamiliar code and want to see the whole function, for example:

$ ./ 
> /code/(14)<module>()
-> filename_path = get_path(filename)
(Pdb) s
--Call--
> /code/(6)get_path()
-> def get_path(filename):
(Pdb) ll
  6  -> def get_path(filename):
  7         """Return file's path or empty string if no path."""
  8         head, tail = os.path.split(filename)
  9         return head
(Pdb)

To see a shorter snippet, use the command l (list). Without arguments, it prints 11 lines around the current line or continues the previous listing. Pass the argument . to always list 11 lines around the current line, for example:

$ ./ 
> /code/(14)<module>()
-> filename_path = get_path(filename)
(Pdb) l
  9         return head
 10     
 11     
 12     filename = __file__
 13     import pdb; pdb.set_trace()
 14  -> filename_path = get_path(filename)
 15     print(f'path = {filename_path}')
[EOF]
(Pdb) l
[EOF]
(Pdb) l .
  9         return head
 10     
 11     
 12     filename = __file__
 13     import pdb; pdb.set_trace()
 14  -> filename_path = get_path(filename)
 15     print(f'path = {filename_path}')
[EOF]
(Pdb)

4. Use of breakpoints

Used well, breakpoints save a lot of debugging time. Instead of single-stepping through lines you are not interested in, set a breakpoint where you want to investigate and run straight to it. You can also add a condition that tells pdb whether to stop at the breakpoint. Breakpoints are set with the command b (break), which takes either a line number or the name of the function where execution should stop. The syntax is:

b(reak) [ ([filename:]lineno | function) [, condition] ]

If filename: is not specified before the line number, the current source file is used. Note the optional second argument to b: condition. This is a very powerful feature. Suppose you want to stop only when a certain condition holds: if you pass a Python expression as the second argument, pdb stops at the breakpoint only when the expression evaluates to true. An example follows later in this section. The following example uses a utility module, util, so that we can set a breakpoint in the function get_path(). Here is the main script:

#!/usr/bin/env python3

import util

filename = __file__
import pdb; pdb.set_trace()
filename_path = util.get_path(filename)
print(f'path = {filename_path}')

Here are the contents of the utility module file:

def get_path(filename):
    """Return file's path or empty string if no path."""
    import os
    head, tail = os.path.split(filename)
    return head

First, let's set a breakpoint using the source file name and a line number:

$ ./ 
> /code/(7)<module>()
-> filename_path = util.get_path(filename)
(Pdb) b util:5
Breakpoint 1 at /code/:5
(Pdb) c
> /code/(5)get_path()
-> return head
(Pdb) p filename, head, tail
('./', '.', '')
(Pdb)

The command c (continue) resumes execution until a breakpoint is reached. Next, let's set the breakpoint using the function name:

$ ./ 
> /code/(7)<module>()
-> filename_path = util.get_path(filename)
(Pdb) b util.get_path
Breakpoint 1 at /code/:1
(Pdb) c
> /code/(3)get_path()
-> import os
(Pdb) p filename
'./'
(Pdb)

If you enter b with no arguments, you can see information about all the breakpoints that have been set:

(Pdb) b
Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /code/:1
(Pdb)

You can use the commands disable bpnumber and enable bpnumber to disable and re-enable breakpoints. bpnumber is the breakpoint number in the first Num column of the breakpoint list. Note the change in value in the Enb column:

(Pdb) disable 1
Disabled breakpoint 1 at /code/:1
(Pdb) b
Num Type         Disp Enb   Where
1   breakpoint   keep no    at /code/:1
(Pdb) enable 1
Enabled breakpoint 1 at /code/:1
(Pdb) b
Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /code/:1
(Pdb)

In order to remove a breakpoint, the command cl (clear) can be used:

cl(ear) filename:lineno
cl(ear) [bpnumber [bpnumber...]]

Now let's pass an expression when setting a breakpoint. In this scenario, suppose we only want to stop in get_path() when it receives a relative path, i.e., a path name that does not start with /. We write an expression that evaluates to true in that case and pass it to b as the second argument:

$ ./ 
> /code/(7)<module>()
-> filename_path = util.get_path(filename)
(Pdb) b util.get_path, not filename.startswith('/')
Breakpoint 1 at /code/:1
(Pdb) c
> /code/(3)get_path()
-> import os
(Pdb) a
filename = './'
(Pdb)

After you create the breakpoint above and continue with c (continue), pdb stops only when the expression evaluates to true. The command a (args) prints the argument list of the current function.

In the example above, where the breakpoint is set by function name rather than by line number, note that the expression should use only function arguments or global variables that are available when the function is entered. Otherwise, the breakpoint stops execution in the function regardless of the expression's value. If you need the condition to use a variable that is defined inside the function, that is, one that is not in the function's argument list, specify the breakpoint by line number instead, as in the following example:

$ ./ 
> /code/(7)<module>()
-> filename_path = util.get_path(filename)
(Pdb) b util:5, not head.startswith('/')
Breakpoint 1 at /code/:5
(Pdb) c
> /code/(5)get_path()
-> return head
(Pdb) p head
'.'
(Pdb) a
filename = './'
(Pdb)
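
A conditional breakpoint can be tried end to end with a scripted session. The sketch below rests on several assumptions: run_pdb_on_file is a hypothetical helper, demo.py is a throwaway file written to a temporary directory (pdb needs a real file to validate the line number of a breakpoint), and the sample paths are invented. The condition is the same idea as above: stop only for a relative path.

```python
import io
import os
import pdb
import tempfile

# Hypothetical script: get_path() is called with absolute and relative paths.
SOURCE = """\
def get_path(filename):
    head = filename.rsplit('/', 1)[0]
    return head

for name in ['/abs/a.py', 'rel/b.py', '/abs/c.py']:
    get_path(name)
"""


def run_pdb_on_file(source, commands):
    """Write source to a temporary file and run it under a scripted pdb session."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.py")
        with open(path, "w") as handle:
            handle.write(source)
        stdin = io.StringIO("\n".join(commands) + "\n")
        stdout = io.StringIO()
        debugger = pdb.Pdb(stdin=stdin, stdout=stdout, nosigint=True, readrc=False)
        # Compiling with the real path lets pdb find the source lines on disk.
        debugger.run(compile(source, path, "exec"), {})
        return stdout.getvalue()


# Break on line 3 (return head), but only when filename is a relative path.
session = run_pdb_on_file(
    SOURCE, ["b 3, not filename.startswith('/')", "c", "a", "q"]
)
print(session)
```

The first call, with an absolute path, passes the breakpoint without stopping; pdb stops on the second call, and a prints the relative path that triggered it.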

5. Continue executing the code

So far, we have used n (next) and s (step) to step through code, and b (break) and c (continue) to stop and resume execution. There is one more related command: unt (until). Like c, unt continues execution, but it stops at the next line whose number is greater than the current line. Sometimes unt is more convenient and quicker to use, as the example below shows. First, the syntax:

unt(il) [lineno]

Depending on whether you pass the line number argument lineno, unt behaves in one of two ways:

Without lineno, execution continues until a line with a number greater than the current one is reached. This is similar to n (next), that is, another way to "step over" code. The difference is that unt stops only when a line with a greater number is reached, whereas n stops at the next logically executed line, which inside a loop may be an earlier line.
With lineno, execution continues until a line with a number greater than or equal to lineno is reached. This is similar to c (continue) with a line number argument.

In both cases, unt only stops in the current frame (or function), in the same way as n (next); it also stops when the current frame returns.

Use unt when you want to continue execution and stop further in the current source file. You can think of it as a mix of n (next) and b (break), depending on whether or not you pass a line number argument.
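
The behavior without lineno can be sketched with a scripted session. As before, run_pdb_session is a hypothetical helper that feeds commands to pdb.Pdb, and the four-line loop is invented for the demonstration; the source is a string, so pdb reports the file as <string>.

```python
import io
import pdb

# Hypothetical four-line script with a loop on lines 2 and 3.
SOURCE = """\
total = 0
for i in range(3):
    total += i
done = total
"""


def run_pdb_session(source, commands):
    """Run source under pdb, reading commands from a list instead of the keyboard."""
    stdin = io.StringIO("\n".join(commands) + "\n")
    stdout = io.StringIO()
    debugger = pdb.Pdb(stdin=stdin, stdout=stdout, nosigint=True, readrc=False)
    debugger.run(source, {})
    return stdout.getvalue()


# Two n commands move to line 3, inside the loop. unt then runs until a line
# numbered higher than 3 is reached, which only happens after the loop ends.
session = run_pdb_session(SOURCE, ["n", "n", "unt", "p total", "q"])
print(session)
```

After unt, pdb is on line 4 and total already holds the sum of all iterations, which shows that the whole loop ran even though pdb did not stop in it again.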

The following example has a function with a loop. We want to continue execution and stop after the loop, without single-stepping through every iteration or setting a breakpoint. Here is the source file:

#!/usr/bin/env python3

import os
def get_path(fname):
    """Return file's path or empty string if no path."""
    import pdb; pdb.set_trace()
    head, tail = os.path.split(fname)
    for char in tail:
        pass  # Check filename char
    return head

filename = __file__
filename_path = get_path(filename)
print(f'path = {filename_path}')

The output of a session that uses unt is as follows:

$ ./ 
> /code/(9)get_path()
-> head, tail = os.path.split(fname)
(Pdb) ll
  6     def get_path(fname):
  7         """Return file's path or empty string if no path."""
  8         import pdb; pdb.set_trace()
  9  ->     head, tail = os.path.split(fname)
 10         for char in tail:
 11             pass  # Check filename char
 12         return head
(Pdb) unt
> /code/(10)get_path()
-> for char in tail:
(Pdb) 
> /code/(11)get_path()
-> pass  # Check filename char
(Pdb) 
> /code/(12)get_path()
-> return head
(Pdb) p char, tail
('y', '')

The ll command is used first to print the function's source, followed by unt. pdb remembers the last command entered, so I just pressed Enter to repeat unt. This continues execution until a source line with a number greater than the current line is reached.

Note that in the console output above, pdb stopped only once on lines 10 and 11. Because unt was used, execution stopped only in the first iteration of the loop. However, every iteration of the loop was executed. You can verify this in the last line of output: the value 'y' of the variable char is the last character of the value of tail.

6. Display Expressions

Similar to printing expressions with p and pp, you can use the command display [expression] to tell pdb to show the value of an expression automatically, whenever it has changed, each time execution stops. Likewise, use undisplay [expression] to clear a display expression.

Below is an example session that reuses the loop code from the previous section:

$ ./ 
> /code/(9)get_path()
-> head, tail = os.path.split(fname)
(Pdb) ll
  6     def get_path(fname):
  7         """Return file's path or empty string if no path."""
  8         import pdb; pdb.set_trace()
  9  ->     head, tail = os.path.split(fname)
 10         for char in tail:
 11             pass  # Check filename char
 12         return head
(Pdb) b 11
Breakpoint 1 at /code/:11
(Pdb) c
> /code/(11)get_path()
-> pass  # Check filename char
(Pdb) display char
display char: 'e'
(Pdb) c
> /code/(11)get_path()
-> pass  # Check filename char
display char: 'x'  [old: 'e']
(Pdb) 
> /code/(11)get_path()
-> pass  # Check filename char
display char: 'a'  [old: 'x']
(Pdb) 
> /code/(11)get_path()
-> pass  # Check filename char
display char: 'm'  [old: 'a']

In the output above, pdb automatically displayed the value of char each time the breakpoint was hit, because the value had changed. Sometimes this is helpful and exactly what you want, but there is another way to use display.

You can type display multiple times to build a watch list of expressions. This is easier to use than p. After you add all the expressions you're interested in, just type display to see the current value:

$ ./ 
> /code/(9)get_path()
-> head, tail = os.path.split(fname)
(Pdb) ll
  6     def get_path(fname):
  7         """Return file's path or empty string if no path."""
  8         import pdb; pdb.set_trace()
  9  ->     head, tail = os.path.split(fname)
 10         for char in tail:
 11             pass  # Check filename char
 12         return head
(Pdb) b 11
Breakpoint 1 at /code/:11
(Pdb) c
> /code/(11)get_path()
-> pass  # Check filename char
(Pdb) display char
display char: 'e'
(Pdb) display fname
display fname: './'
(Pdb) display head
display head: '.'
(Pdb) display tail
display tail: ''
(Pdb) c
> /code/(11)get_path()
-> pass  # Check filename char
display char: 'x'  [old: 'e']
(Pdb) display
Currently displaying:
char: 'x'
fname: './'
head: '.'
tail: ''
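
The automatic reporting of changed values can also be reproduced in a scripted session. This is a sketch with a hypothetical helper (run_pdb_session) and an invented four-line loop; it steps with n instead of using a breakpoint, because the source is passed as a string rather than a file.

```python
import io
import pdb

# Hypothetical four-line script; total changes while the loop runs.
SOURCE = """\
total = 0
for i in range(3):
    total += i
done = total
"""


def run_pdb_session(source, commands):
    """Run source under pdb, reading commands from a list instead of the keyboard."""
    stdin = io.StringIO("\n".join(commands) + "\n")
    stdout = io.StringIO()
    debugger = pdb.Pdb(stdin=stdin, stdout=stdout, nosigint=True, readrc=False)
    debugger.run(source, {})
    return stdout.getvalue()


# Step to the loop body, register total as a display expression, keep stepping.
commands = ["n", "n", "display total", "n", "n", "n", "n", "n", "q"]
session = run_pdb_session(SOURCE, commands)
print(session)
```

pdb prints a display line only at stops where the value differs from the last one shown, together with the old value, so stops at which total is unchanged produce no extra output.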

7. Caller ID

In this last section, we build on what we have learned so far and finish with a useful trick: finding out who called a function ("caller ID"). Below is the example code:

#!/usr/bin/env python3
import fileutil
def get_file_info(full_fname):
    file_path = fileutil.get_path(full_fname)
    return file_path
filename = __file__
filename_path = get_file_info(filename)
print(f'path = {filename_path}')

and the contents of the utility module file:

def get_path(fname):
    """Return file's path or empty string if no path."""
    import os
    import pdb; pdb.set_trace()
    head, tail = os.path.split(fname)
    return head

Imagine a large code base in which the utility function get_path() is being called with invalid input, and the function is called from many places in different packages.

How can you find out who the caller is?

Use the command w (where) to print a stack trace, with the most recent frame at the bottom:

$ ./ 
> /code/(5)get_path()
-> head, tail = os.path.split(fname)
(Pdb) w
  /code/(12)<module>()
-> filename_path = get_file_info(filename)
  /code/(7)get_file_info()
-> file_path = fileutil.get_path(full_fname)
> /code/(5)get_path()
-> head, tail = os.path.split(fname)
(Pdb)

Don't worry if this seems confusing, or if you're not sure what a stack trace or frame is. I will explain these terms below. It's not as difficult as it sounds.

Since the most recent frame is at the bottom, start there and read upward. Look at the lines starting with ->, but skip the first one, since that is where pdb.set_trace() was used to enter pdb in the function get_path(). In this example, the source line that called get_path() is:

-> file_path = fileutil.get_path(full_fname)

The line above each -> line contains the file name, the line number (in parentheses), and the name of the function that the source line belongs to. So the caller is:

 /code/(7)get_file_info()
-> file_path = fileutil.get_path(full_fname)

In this small example it is obvious who the caller is, but imagine a large application where you set a breakpoint with a condition to identify where a bad input value originates. Let's dig a little deeper below.
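
The information that w prints is also available programmatically. As a rough illustration, and not something the article's example code does, the standard traceback module returns the same call chain, oldest frame first. The function names below mirror the example; the file name is sample data.

```python
import traceback


def get_path(fname):
    # Names of the functions on the call stack, oldest first (the order w uses).
    return [frame.name for frame in traceback.extract_stack()]


def get_file_info(full_fname):
    return get_path(full_fname)


call_chain = get_file_info("./example.py")

# The last entry is get_path() itself; the one before it is its caller.
print(" -> ".join(call_chain[-2:]))
```

Reading the list from the end gives the same answer as reading the w output from the bottom up.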

What are stack traces and stack frame contents?

A stack trace is just a list of all the frames Python has created to keep track of function calls. A frame is a data structure that Python creates when a function is called and deletes when the function returns. The stack is simply an ordered list of frames, or function calls, at any point in time. The stack of function calls grows and shrinks throughout the life of an application as functions are called and then return. When printed, this ordered list of frames, the stack, is called a stack trace. You can view it at any time by entering the command w, just as we did above to find the caller.

What does the current stack frame mean?

Think of the current frame as the current function where pdb has stopped execution. In other words, the current frame is where your application is currently paused, and it serves as the "frame of reference" for pdb commands such as p (print). p and other commands use the current frame as context when needed. In the case of p, the current frame is used to look up and print variable references. When pdb prints a stack trace, the arrow > indicates the current frame.

How do I use and switch stack frames?

You can use two commands, u (up) and d (down), to change the current frame. Combined with p, this lets you inspect variables and state at any point along the call stack, in any frame. The syntax is:

u(p) [count]
d(own) [count]

Let's look at an example using the u and d commands. In this case, we want to check the local variable full_fname in the function get_file_info(). To do this, we must change the current frame one level up with the command u:

$ ./ 
> /code/(5)get_path()
-> head, tail = os.path.split(fname)
(Pdb) w
  /code/(12)<module>()
-> filename_path = get_file_info(filename)
  /code/(7)get_file_info()
-> file_path = fileutil.get_path(full_fname)
> /code/(5)get_path()
-> head, tail = os.path.split(fname)
(Pdb) u
> /code/(7)get_file_info()
-> file_path = fileutil.get_path(full_fname)
(Pdb) p full_fname
'./'
(Pdb) d
> /code/(5)get_path()
-> head, tail = os.path.split(fname)
(Pdb) p fname
'./'
(Pdb)

Because the call to pdb.set_trace() is in the function get_path(), that is where the current frame is initially set, as shown below:

> /code/(5)get_path()

To access and print the local variable full_fname in the function get_file_info(), the command u moves up one stack frame:

(Pdb) u
> /code/(7)get_file_info()
-> file_path = fileutil.get_path(full_fname)

Notice that in the output of u above, pdb printed the arrow > at the beginning of the first line. This is pdb letting you know that the frame has changed and that this source location is now the current frame. The variable full_fname is now accessible. Also, it is important to realize that the source line beginning with ->, on the second line of that output, has already been executed: since the frame was moved up the stack, fileutil.get_path() has already been called. Using u, we moved up the stack (in a sense, back in time) to the function get_file_info(), which called fileutil.get_path().

Continuing the example, after printing full_fname we use d to move the current frame back to its original position and print the local variable fname in get_path(). If we wanted to, we could move several frames at once by passing a count argument to u or d. For example, we can move to the module level by typing u 2:

$ ./ 
> /code/(5)get_path()
-> head, tail = os.path.split(fname)
(Pdb) u 2
> /code/(12)<module>()
-> filename_path = get_file_info(filename)
(Pdb) p filename
'./'
(Pdb)

It's easy to forget where you are when you're debugging and thinking about many different things. Remember that you can always use the appropriately named command w (where) to see where the execution is paused and what the current frame is.
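
What u followed by p does can be imitated in plain Python, which may help make frames less abstract. The sketch below relies on inspect.currentframe(), which is available in CPython but may return None on other interpreters; the function names mirror the example, and the file name is sample data.

```python
import inspect


def get_path(fname):
    # f_back is the frame one level up the stack, which is where u would move.
    caller_frame = inspect.currentframe().f_back
    # Reading the caller's local variable is what p full_fname does after u.
    return caller_frame.f_code.co_name, caller_frame.f_locals["full_fname"]


def get_file_info(full_fname):
    return get_path(full_fname)


caller_name, caller_value = get_file_info("./example.py")
print(f"{caller_name}() was called with full_fname={caller_value!r}")
```

This is only meant to illustrate what a frame holds; in a real debugging session, u, d, and p are the more convenient way to get the same information.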

8. Summary

In this tutorial, we covered some basic and common uses of pdb:

Printing expressions
Debugging code with the n (next) and s (step) commands
Using breakpoints
Using unt (until) to continue execution
Displaying expressions
Finding a function's caller

Finally, I hope this tutorial has been helpful to you. The accompanying code can be downloaded from the code repository:/1311440131/pdb_basic_tutorial
