
An in-depth look at the use of the yield from syntax in Python

1. Why Use Coroutines

In the previous article, we moved from the basic understanding and use of generators to coroutines.

But many people only know what a coroutine is without really knowing why they should use one. In other words, under what circumstances should you use a coroutine? What advantages does it have over multithreading?
Before I start talking about yield from, I'd like to address this question, which has confused a lot of people.

Here's an example. Suppose we're writing a crawler. We want to crawl multiple web pages; to keep it simple, say two pages (two spider functions). For each one we fetch the HTML (IO-intensive and time-consuming) and then parse the HTML to extract the data we're interested in.

Our code structure is streamlined as follows:

def spider_01(url):
    html = get_html(url)
    ...
    data = parse_html(html)

def spider_02(url):
    html = get_html(url)
    ...
    data = parse_html(html)

We all know that get_html(), waiting for the web page to come back, is very IO-intensive. One page is fine, but if we're crawling a very large number of pages, the accumulated waiting time is alarming and a huge waste.

Smart programmers naturally wonder: wouldn't it be nice if get_html() could pause here, so that instead of waiting idly for the page to come back we could go do something else? After a while, we could return to where we paused, receive the HTML, and then continue with parse_html(html).

There's almost no way to achieve this with conventional approaches, so Python provides it at the language level: the yield syntax, which lets us pause inside a function.
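Here is a minimal sketch of what that pausing looks like (the function and URL names are only illustrative, not the article's crawler): a bare yield suspends the function, and the caller decides when to resume it.

def fetch_page(url):
    print("start fetching", url)
    yield                        # pause here and hand control back to the caller
    print("resume and parse", url)

task = fetch_page("http://example.com")
next(task)                       # runs up to the yield, then pauses
# ... the caller is free to do other work here ...
next(task, None)                 # resume from the pause point (the default silences StopIteration)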

Think about how we would write concurrent programs without coroutines. We would run into the following problems:

1. Using conventional synchronous programming to achieve asynchronous concurrency is far from ideal, or extremely difficult.

2. Because of the GIL, multithreaded code needs frequent locking, unlocking, and thread switching, which greatly reduces concurrency performance.

Coroutines solve exactly these problems. They are characterized by:

1. Concurrency is achieved by switching between tasks within a single thread (see the sketch after this list).

2. Asynchronous behavior written in a synchronous style.

3. No need for locks, which improves concurrency performance.
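To make point 1 concrete, here is a toy sketch (my own illustration, not from the original article): two generator "tasks" pause themselves with yield, and a plain loop switches between them within a single thread, with no locks involved.

def task(name, steps):
    for i in range(steps):
        print(name, "step", i)
        yield                      # pause and hand control back to the scheduler loop

tasks = [task("spider_01", 2), task("spider_02", 2)]
while tasks:
    current = tasks.pop(0)         # take the next runnable task
    try:
        next(current)              # run it until its next pause point
        tasks.append(current)      # not finished yet, so reschedule it
    except StopIteration:
        pass                       # this task has finished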

2. yield from usage details

yield from is syntax that was introduced in Python 3.3, so this feature is not available in Python 2.

yield from is followed by an iterable object, which can be an ordinary iterable, an iterator, or even a generator.

2.1 Simple Application: Concatenating Iterable Objects

Let's compare an example that uses yield with one that uses yield from.

Using yield

# String
astr='ABC'
# List
alist=[1,2,3]
# Dictionary
adict={"name":"wangbm","age":18}
# Generator
agen=(i for i in range(4,8))
def gen(*args, **kw):
    for item in args:
        for i in item:
            yield i
new_list=gen(astr, alist, adict, agen)
print(list(new_list))
# ['A', 'B', 'C', 1, 2, 3, 'name', 'age', 4, 5, 6, 7]

Using yield from

# String
astr='ABC'
# List
alist=[1,2,3]
# Dictionary
adict={"name":"wangbm","age":18}
# Generator
agen=(i for i in range(4,8))
def gen(*args, **kw):
    for item in args:
        yield from item
new_list=gen(astr, alist, adict, agen)
print(list(new_list))
# ['A', 'B', 'C', 1, 2, 3, 'name', 'age', 4, 5, 6, 7]

Comparing the two versions, you can see that yield from takes the iterable that follows it and yields each of its elements one by one. Compared with the plain yield version, the code is more concise and the structure is clearer.

2.2 Complex Application: Nesting Generators

If you think that's all yield from can do, you're underestimating it; there's much more to it.

When yield from is followed by a generator, generator nesting is achieved.

Of course, nesting generators doesn't strictly require yield from, but using yield from lets us avoid handling all sorts of unexpected exceptions ourselves, so we can focus on the business logic.

Implementing it yourself with plain yield only makes the code harder to write, lowers development efficiency, and hurts readability. Since Python has been so thoughtful, we should certainly take advantage of it.

Before explaining it, you need to know a few concepts first:

1. Caller: the client code that calls the delegating generator.
2. Delegating generator: the generator function that contains the yield from expression.
3. Subgenerator: the generator that follows yield from.

You may not know what these mean yet; that's okay, take a look at this example.

This example computes a running average. For instance, the first time you send in 10, the returned average is naturally 10. The second time you send in 20, the returned average is (10+20)/2 = 15. The third time you send in 30, the returned average is (10+20+30)/3 = 20.

# Subgenerator
def average_gen():
    total = 0
    count = 0
    average = 0
    while True:
        new_num = yield average
        count += 1
        total += new_num
        average = total/count

# Delegating generator
def proxy_gen():
    while True:
        yield from average_gen()

# Caller
def main():
    calc_average = proxy_gen()
    next(calc_average)            # Prime the generator
    print(calc_average.send(10))  # Printed: 10.0
    print(calc_average.send(20))  # Printed: 15.0
    print(calc_average.send(30))  # Printed: 20.0

if __name__ == '__main__':
    main()

Read the code above carefully and you should easily understand the relationship between the caller, the delegating generator, and the subgenerator. I won't elaborate further.

The role of the delegating generator is to establish a bidirectional channel between the caller and the subgenerator.

What is a bidirectional channel? The caller can send a value directly to the subgenerator via send(), and whatever the subgenerator yields is returned directly to the caller.

You may also often see code where the result of yield from is assigned to a variable. What is that usage?

You might think the values yielded by the subgenerator are being intercepted by the delegating generator. You can write a small demo to test this yourself; that is not what happens. As we said, the delegating generator only acts as a bridge: it builds a bidirectional channel, and it has neither the right nor the means to intercept what the subgenerator yields.
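For instance, a quick demo along these lines (my own, with made-up names) shows that both directions pass straight through the delegating generator:

def sub():
    received = yield "value from sub"     # yielded straight through to the caller
    print("sub received:", received)      # sent straight through from the caller

def proxy():
    yield from sub()                      # pure bridge, no interception

g = proxy()
print(next(g))         # prints: value from sub
try:
    g.send("hello")    # prints: sub received: hello
except StopIteration:
    pass               # sub (and then proxy) finished after receiving the value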

To explain this usage, I'll reuse the example above with a few modifications and some added comments; hopefully they make it easy to follow.

As usual, let's give an example.

# Subgenerator
def average_gen():
    total = 0
    count = 0
    average = 0
    while True:
        new_num = yield average
        if new_num is None:
            break
        count += 1
        total += new_num
        average = total/count
    # Each return means the current coroutine is finished.
    return total, count, average
# Delegating generator
def proxy_gen():
    while True:
        # The variables to the left of yield from are assigned, and the code after it runs,
        # only when the subgenerator is about to finish (i.e. return).
        total, count, average = yield from average_gen()
        print("Calculation finished!\nReceived {} values in total, sum: {}, average: {}".format(count, total, average))
# Caller
def main():
    calc_average = proxy_gen()
    next(calc_average)            # Prime the coroutine
    print(calc_average.send(10))  # Printed: 10.0
    print(calc_average.send(20))  # Printed: 15.0
    print(calc_average.send(30))  # Printed: 20.0
    calc_average.send(None)      # End the coroutine
    # If calc_average.send(10) were called again here, a new coroutine would start, since the previous one has ended.
if __name__ == '__main__':
    main()

After running, the output is:

10.0
15.0
20.0
Calculation finished!
Received 3 values in total, sum: 60, average: 20.0

3. Why Use yield from

At this point you must be asking: since the delegating generator only acts as a bidirectional channel, why do I need it at all? Why not just have the caller call the subgenerator directly?

Heads up, here comes the good part!

Let's explore what's so great about yield from that we have to use it.

3.1 Because it helps us handle exceptions

If we remove the delegating generator and call the subgenerator directly, we have to change the code to something like the following, catching and handling the exception ourselves, whereas yield from would have saved us the trouble.

# Subgenerator
def average_gen():
    total = 0
    count = 0
    average = 0
    while True:
        new_num = yield average
        if new_num is None:
            break
        count += 1
        total += new_num
        average = total/count
    return total,count,average
# Caller
def main():
    calc_average = average_gen()
    next(calc_average)            # Prime the coroutine
    print(calc_average.send(10))  # Printed: 10.0
    print(calc_average.send(20))  # Printed: 15.0
    print(calc_average.send(30))  # Printed: 20.0
    # ---------------- note -----------------
    try:
        calc_average.send(None)
    except StopIteration as e:
        total, count, average = e.value
        print("Calculation finished!\nReceived {} values in total, sum: {}, average: {}".format(count, total, average))
    # ---------------- note -----------------
if __name__ == '__main__':
    main()

At this point you may be saying: it's just a StopIteration exception, catching it myself is no big deal.

If you knew everything yield from does for us behind the scenes, you wouldn't say that.

What yield from actually does for us can be seen in the equivalent code below.

# Some notes
"""
_i: the subgenerator (which is also an iterator)
_y: the value yielded by the subgenerator
_r: the final value of the yield from expression
_s: the value the caller sends in via send()
_e: an exception object
"""
_i = iter(EXPR)
try:
    _y = next(_i)
except StopIteration as _e:
    _r = _e.value
else:
    while 1:
        try:
            _s = yield _y
        except GeneratorExit as _e:
            try:
                _m = _i.close
            except AttributeError:
                pass
            else:
                _m()
            raise _e
        except BaseException as _e:
            _x = sys.exc_info()
            try:
                _m = _i.throw
            except AttributeError:
                raise _e
            else:
                try:
                    _y = _m(*_x)
                except StopIteration as _e:
                    _r = _e.value
                    break
        else:
            try:
                if _s is None:
                    _y = next(_i)
                else:
                    _y = _i.send(_s)
            except StopIteration as _e:
                _r = _e.value
                break
RESULT = _r

The code above is a bit complex; interested readers can study it together with the notes below.

  • The values produced by the iterator (i.e. the subgenerator) are passed straight through to the caller.
  • Any value sent to the delegating generator (i.e. the outer generator) with send() is passed directly to the iterator. If the sent value is None, the iterator's next() method is called; otherwise, the iterator's send() method is called. If that call raises StopIteration, the delegating generator resumes and continues with the statement after yield from; any other exception propagates to the delegating generator.
  • The subgenerator may be a plain iterator rather than a generator-based coroutine, in which case it does not support the throw() and close() methods, and calling them may raise AttributeError.
  • Exceptions other than GeneratorExit thrown into the delegating generator are passed to the iterator's throw() method. If that call raises StopIteration, the delegating generator resumes and continues execution; any other exception propagates to the delegating generator.
  • If a GeneratorExit exception is thrown into the delegating generator, or its close() method is called, then the iterator's close() method is also called if it has one (see the sketch after this list). If that close() call raises an exception, the exception propagates to the delegating generator; otherwise the delegating generator raises GeneratorExit.
  • When the iterator finishes and raises StopIteration, the value of the yield from expression is the first argument of that StopIteration exception.
  • A return expr statement in a generator exits the generator and raises StopIteration(expr).
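As a small illustration of the close()/GeneratorExit rule above (my own sketch, not from the original article): closing the delegating generator forwards the close to the subgenerator, which observes it as a GeneratorExit raised at its paused yield.

def sub():
    try:
        while True:
            yield
    except GeneratorExit:
        print("subgenerator: being closed")
        raise                        # re-raise so the generator really closes

def proxy():
    yield from sub()
    print("never reached once close() is called")

g = proxy()
next(g)        # prime the delegating generator (and the subgenerator)
g.close()      # prints: subgenerator: being closed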

If you're not interested in digging into it, just know that yield from does a lot of thorough exception handling for us. If we had to implement all of this ourselves, it would not only make the code harder to write and much less readable; worse, we could easily miss a case, and a single unhandled exception could crash the program.

That's it for this in-depth look at Python's yield from syntax. For more on yield from, please search my previous articles or keep browsing the related articles below. I hope you'll continue to support me!