SoFunction
Updated on 2024-11-10

5 Tips to Boost Python's Running Speed

This is an original article, and all of the code in it has been run.

Python is one of the most widely used programming languages in the world. It is an interpreted, high-level, general-purpose language that you can use for almost anything, and it is known for its simple syntax, elegant code, and rich ecosystem of third-party libraries. Python has many other advantages as well, but it also has one significant disadvantage: speed.

Although Python code runs slowly, you can boost the speed of Python with the 5 tips shared below!

First, define a timing decorator, timeshow, which prints the running time of the function it decorates.

This function will be used several times in the following examples.

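from functools import wraps
from time import perf_counter

def timeshow(func):
    """Decorator that prints how long each call to func takes."""
    @wraps(func)  # keep the wrapped function's name and docstring
    def newfunc(*arg, **kw):
        t1 = perf_counter()  # high-resolution clock intended for timing
        res = func(*arg, **kw)
        t2 = perf_counter()
        print(f"{func.__name__: >10} : {t2-t1:.6f} sec")
        return res
    return newfunc

@timeshow
def test_it():
    print("hello pytip")

test_it()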
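A single timed call can be noisy, so it may help to cross-check results with the standard library's timeit module, which runs the code many times. The following is a minimal sketch; build_squares is a hypothetical workload chosen only for illustration and is not part of the original article.

```python
import timeit

def build_squares(n=1000):
    """Hypothetical workload used only for this illustration."""
    return [i * i for i in range(n)]

# Each trial calls build_squares 200 times; repeat() returns one total per trial.
trials = timeit.repeat(build_squares, number=200, repeat=5)

# The fastest trial is usually the least disturbed by other system activity.
best = min(trials) / 200
print(f"build_squares: {best:.6f} sec per call (best of 5 trials)")
```

The timeshow decorator is still convenient for quick comparisons; timeit is simply a more repeatable way to confirm them.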

1. Choosing the right data structure

Using the right data structures can have a significant impact on the runtime of Python scripts. Python has four built-in data structures:

  • List
  • Tuple
  • Set
  • Dictionary

However, most developers use lists in all cases. This is an incorrect approach and you should use the appropriate data structure according to the task.

Running the code below, you can see that a function that builds a tuple and reads one element from it executes fewer bytecode instructions than the same function written with a list. The dis module disassembles the bytecode of a function, which is useful for seeing the difference between lists and tuples.

import dis
def a():
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    x = data[5]
    return x
def b():
    data = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    x = data[5]
    return x
print("-----:Machine code for using the list:------")
dis.dis(a)  # disassemble the list version
print("-----:Machine code using tuples:------")
dis.dis(b)  # disassemble the tuple version

Output:

-----:Machine code for using the list:------
  3           0 LOAD_CONST               1 (1)
              2 LOAD_CONST               2 (2)
              4 LOAD_CONST               3 (3)
              6 LOAD_CONST               4 (4)
              8 LOAD_CONST               5 (5)
             10 LOAD_CONST               6 (6)
             12 LOAD_CONST               7 (7)
             14 LOAD_CONST               8 (8)
             16 LOAD_CONST               9 (9)
             18 LOAD_CONST              10 (10)
             20 BUILD_LIST              10
             22 STORE_FAST               0 (data)
  4          24 LOAD_FAST                0 (data)
             26 LOAD_CONST               5 (5)
             28 BINARY_SUBSCR
             30 STORE_FAST               1 (x)
  5          32 LOAD_FAST                1 (x)
             34 RETURN_VALUE
-----:Machine code using tuples:------
  7           0 LOAD_CONST               1 ((1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
              2 STORE_FAST               0 (data)
  8           4 LOAD_FAST                0 (data)
              6 LOAD_CONST               2 (5)
              8 BINARY_SUBSCR
             10 STORE_FAST               1 (x)
  9          12 LOAD_FAST                1 (x)
             14 RETURN_VALUE

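Look at the bytecode for the list above: it is lengthy and redundant! The list has to be built element by element every time the function runs, while the tuple is loaded as a single constant. The exact instructions vary between Python versions.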
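Choosing the right structure matters for more than tuples and lists. As a rough sketch, the comparison below checks membership in a list and in a set holding the same values; the names items, as_list, as_set, and target are made up for this example, and the actual timings will depend on your machine.

```python
import timeit

items = list(range(100_000))
as_list = items
as_set = set(items)
target = 99_999  # last element: the worst case for a list scan

# A list checks elements one by one; a set uses a hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list membership: {list_time:.6f} sec")
print(f" set membership: {set_time:.6f} sec")
```

If the code mostly asks "is this value present?", a set or dictionary is usually the better fit than a list.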

2. Make good use of powerful built-in functions and third-party libraries

If you are using Python and still writing general-purpose functions (e.g. addition, subtraction) on your own, then you are insulting Python. Python has tons of libraries and built-in functions so that you do not have to write them yourself. If you look into it, you will be surprised to find that almost 90% of problems already have third-party packages or built-in functions that solve them.

You can view all the built-in functions in the official documentation. You can also find more scenarios that use built-in functions on the Python wiki.

For example, now we want to merge all the words in the list into one sentence, comparing the difference between writing our own and calling a library function.

# ❌ The approach most people think of first
@timeshow
def f1(words):
    s = ""
    for substring in words:
        s += substring  # builds a new string on every iteration
    return s

# ✅ The Pythonic method
@timeshow
def f2(words):
    s = "".join(words)  # joins all the pieces in one call
    return s

words = ["I", "Love", "Python"] * 1000  # enlarged so the difference is visible
f1(words)
f2(words)

Output:

f1 : 0.000227 sec
f2 : 0.000031 sec
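The same idea applies to numeric work. The sketch below compares a hand-written summing loop with the built-in sum(); manual_sum is a hypothetical helper written only for this comparison, and the timings it prints will vary from machine to machine.

```python
import timeit

def manual_sum(numbers):
    """Hand-written loop, for comparison with the built-in sum()."""
    total = 0
    for value in numbers:
        total += value
    return total

numbers = list(range(100_000))

# Both approaches must agree before their speed is compared.
assert manual_sum(numbers) == sum(numbers)

loop_time = timeit.timeit(lambda: manual_sum(numbers), number=20)
builtin_time = timeit.timeit(lambda: sum(numbers), number=20)
print(f"manual loop : {loop_time:.6f} sec")
print(f"built-in sum: {builtin_time:.6f} sec")
```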

3. Use fewer loops

  • Use list comprehensions instead of loops
  • Replace loops with generators
  • Replace the loop with filter()
  • Reduce the number of iterations with precise control, so no CPU time is wasted
# Return all numbers below n that are divisible by 7.

# ❌ The approach most people think of first
@timeshow
def f_loop(n):
    L = []
    for i in range(n):
        if i % 7 == 0:
            L.append(i)
    return L

# ✅ List comprehension
@timeshow
def f_list(n):
    L = [i for i in range(n) if i % 7 == 0]
    return L

# ✅ Generator expression (lazy: values are produced only when consumed)
@timeshow
def f_iter(n):
    L = (i for i in range(n) if i % 7 == 0)
    return L

# ✅ filter() (also lazy)
@timeshow
def f_filter(n):
    L = filter(lambda x: x % 7 == 0, range(n))
    return L

# ✅ Precise control of the number of iterations
@timeshow
def f_mind(n):
    # (n + 6) // 7 rounds up, so the last multiple below n is included
    L = (i * 7 for i in range((n + 6) // 7))
    return L

n = 1_000_000
f_loop(n)
f_list(n)
f_iter(n)
f_filter(n)
f_mind(n)

The output is:

f_loop : 0.083017 sec
f_list : 0.056110 sec
f_iter : 0.000015 sec
f_filter : 0.000003 sec
f_mind : 0.000002 sec

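You can tell at a glance who's fast and who's slow! Keep in mind, though, that f_iter, f_filter, and f_mind return lazy objects: the time shown covers only creating the generator or filter object, and the actual work happens later, when the values are consumed.

filter combined with lambda is really powerful!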
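Because generator expressions and filter() are lazy, a fair comparison should also time the step that consumes the values. The sketch below separates creating a generator from consuming it; the helper timed is hypothetical and exists only for this illustration, and the printed timings depend on your machine.

```python
from time import perf_counter

def timed(label, make_result):
    """Hypothetical helper: time one call of make_result and print it."""
    start = perf_counter()
    result = make_result()
    elapsed = perf_counter() - start
    print(f"{label: >12} : {elapsed:.6f} sec")
    return result

n = 1_000_000

# Creating the generator does almost no work yet.
lazy = timed("create gen", lambda: (i for i in range(n) if i % 7 == 0))

# Consuming it is where all n iterations actually run.
values = timed("consume gen", lambda: list(lazy))

# Stepping by 7 visits only the multiples, so far fewer iterations are needed.
stepped = timed("range step", lambda: list(range(0, n, 7)))
```

Laziness is still useful: it saves memory and lets you stop early, but the cost of the loop is postponed rather than removed.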

4. Avoid repeated calculations inside loops

If you have an iterator and must perform some time-consuming computation on its elements, such as matching a regular expression, define the regular expression pattern outside the loop. It is better to compile the pattern only once than to compile it again in each iteration of the loop.

Whenever possible, do as much of the computation as you can outside the loop, for example by assigning the result of a function call to a local variable and then using that variable inside the loop.

import re

# ❌ Avoid: the pattern is handed to re again in every iteration
@timeshow
def f_more(s):
    for i in s:
        m = re.search(r'a*[a-z]?c', i)

# ✅ Better: compile the pattern once, outside the loop
@timeshow
def f_less(s):
    regex = re.compile(r'a*[a-z]?c')
    for i in s:
        m = regex.search(i)

s = ["abctestabc"] * 1_000
f_more(s)
f_less(s)

The output is:

f_more : 0.001068 sec
f_less : 0.000365 sec
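The same principle applies to any value that does not change between iterations. As a minimal sketch, the two functions below compute identical results, but the second one calculates the constant factor once and keeps a local reference to math.hypot. The function names, points, and scale are assumptions made up for this example.

```python
import math

def norms_slow(points, scale):
    out = []
    for x, y in points:
        # The factor does not depend on x or y, yet it is recomputed every time.
        factor = math.sqrt(scale) * 2.0
        out.append(factor * math.hypot(x, y))
    return out

def norms_fast(points, scale):
    factor = math.sqrt(scale) * 2.0  # computed once, outside the loop
    hypot = math.hypot               # local name avoids repeated attribute lookups
    return [factor * hypot(x, y) for x, y in points]

points = [(3, 4), (6, 8)]
print(norms_slow(points, 4))
print(norms_fast(points, 4))
```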

5. Use less memory and fewer global variables

Memory footprint is the amount of memory used while the program is running. To make Python code run faster, you should reduce the amount of memory used by the program, i.e., minimize the number of variables or objects.
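One way to keep fewer objects alive is to produce values on demand instead of storing them all. The following sketch, with variable names chosen only for illustration, compares the size of a list with that of an equivalent generator. Note that sys.getsizeof reports only the container itself, not the integers it refers to.

```python
import sys

n = 1_000_000

squares_list = [i * i for i in range(n)]  # every element is stored at once
squares_gen = (i * i for i in range(n))   # elements are produced on demand

print(f"list object     : {sys.getsizeof(squares_list)} bytes")
print(f"generator object: {sys.getsizeof(squares_gen)} bytes")

# The generator can still feed functions such as sum() without building a list.
total = sum(squares_gen)
print(total)
```

The trade-off is that a generator can be consumed only once, so a list remains the right choice when the values are needed repeatedly.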

Python accesses local variables more efficiently than global variables, so you should avoid declaring global variables unless they are necessary. A global variable that has been defined in a program persists until the program exits, so it occupies memory for the whole run. Local variables, on the other hand, are faster to access and can be reclaimed when the function returns. Therefore, it is better to use local variables rather than global variables where you can.

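# ❌ Avoid: each += builds a new string object
message = "Line1\n"
message += "Line2\n"
message += "Line3\n"

# ✅ Better: collect the pieces and join them once
lines = ["Line1", "Line2", "Line3"]
message = "\n".join(lines) + "\n"  # trailing newline matches the version above

# ❌ Avoid: the function depends on global variables
x = 5
y = 6
def add():
    return x + y
add()

# ✅ Better: keep the values local to the function
def add():
    x = 5
    y = 6
    return x + y
add()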
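To see the access-speed difference for yourself, you could time a loop that reads a global name against one that first copies it into a local name. This is only a sketch: counter_limit, loop_global, and loop_local are hypothetical names, and the size of the gap depends on the Python version.

```python
import timeit

counter_limit = 100_000  # module-level (global) name

def loop_global():
    total = 0
    for _ in range(counter_limit):
        total += counter_limit  # global lookup on every iteration
    return total

def loop_local():
    limit = counter_limit  # copy the global into a local name once
    total = 0
    for _ in range(limit):
        total += limit  # local lookup on every iteration
    return total

print(f"global: {timeit.timeit(loop_global, number=20):.6f} sec")
print(f" local: {timeit.timeit(loop_local, number=20):.6f} sec")
```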

Summary

That's all for this post, I hope it was helpful and I hope you'll check back for more from me!