1. Sharing through communication
Concurrent programming is a large topic; there is space here only for some Go-specific highlights.
In many environments, the subtleties required to implement correct access to shared variables make concurrent programming difficult. Go encourages a different approach in which shared values are passed around on channels and, in fact, never actively shared by separate threads of execution. Only one goroutine has access to the value at any given time. Data races cannot occur, by design. To encourage this way of thinking, we reduce it to a slogan:
Do not communicate by sharing memory; instead, share memory by communicating.
This approach can be taken too far. Reference counts, for example, may be best done by putting a mutex around an integer variable. But as a high-level approach, using channels to control access makes it easier to write clear, correct programs.
One way to think about this model is to consider a typical single-threaded program running on one CPU. It has no need for synchronization primitives. Now run another such instance; it too needs no synchronization. Now let those two communicate; if the communication is the synchronizer, there is still no need for other synchronization. Unix pipelines, for example, fit this model perfectly. Although Go's approach to concurrency originates in Hoare's Communicating Sequential Processes (CSP), it can also be seen as a type-safe generalization of Unix pipes.
2. Goroutines
They are called goroutines because the existing terms (threads, coroutines, processes, and so on) convey inaccurate connotations. A goroutine has a simple model: it is a function executing concurrently with other goroutines in the same address space. It is lightweight, costing little more than the allocation of stack space. The stacks start small, so they are cheap, and grow by allocating (and freeing) heap storage as required.
Goroutines are multiplexed onto multiple operating system threads, so if one should block, such as while waiting for I/O, others continue to run. Their design hides many of the complexities of thread creation and management.
Prefix a function or method call with the go keyword to run the call in a new goroutine. When the call completes, the goroutine exits, silently. (The effect is similar to the Unix shell's & notation for running a command in the background.)
go list.Sort()  // run list.Sort concurrently; don't wait for it.
A function literal can be handy in a goroutine invocation.
func Announce(message string, delay time.Duration) {
    go func() {
        time.Sleep(delay)
        fmt.Println(message)
    }()  // Note the parentheses - must call the function.
}
In Go, function literals are closures: the implementation makes sure the variables referred to by the function survive as long as they are active.
3. Channels
Like maps, channels are allocated with make, and the resulting value acts as a reference to an underlying data structure. If an optional integer parameter is provided, it sets the buffer size for the channel. The default is zero, for an unbuffered or synchronous channel.
ci := make(chan int)            // unbuffered channel of integers
cj := make(chan int, 0)         // unbuffered channel of integers
cs := make(chan *os.File, 100)  // buffered channel of pointers to Files
Unbuffered channels combine communication (the exchange of a value) with synchronization, guaranteeing that two calculations (goroutines) are in a known state.
There are many good idioms for using channels. Here is one to get us started. In the previous section we launched a sort in the background. A channel can allow the launching goroutine to wait for the sort to complete.
c := make(chan int)  // Allocate a channel.
// Start the sort in a goroutine; when it completes, signal on the channel.
go func() {
    list.Sort()
    c <- 1  // Send a signal; value does not matter.
}()
doSomethingForAWhile()
<-c  // Wait for sort to finish; discard sent value.
Receivers always block until there is data to receive. If the channel is unbuffered, the sender blocks until the receiver has received the value. If the channel has a buffer, the sender blocks only until the value has been copied to the buffer; if the buffer is full, this means waiting until some receiver has retrieved a value. (See section 3.1.)
A buffered channel can be used like a semaphore, for instance to limit throughput. In this example, incoming requests are passed to handle, which sends a value into the channel, processes the request, and then receives a value from the channel to ready the "semaphore" for the next consumer. The capacity of the channel buffer limits the number of simultaneous calls to process.
var sem = make(chan int, MaxOutstanding)

func handle(r *Request) {
    sem <- 1    // Wait for active queue to drain.
    process(r)  // May take a long time.
    <-sem       // Done; enable next request to run.
}

func Serve(queue chan *Request) {
    for {
        req := <-queue
        go handle(req)  // Don't wait for handle to finish.
    }
}
Once MaxOutstanding handlers are executing process, any more will block trying to send into the filled channel buffer, until one of the existing handlers finishes and receives from the buffer.
This design has a problem, though: Serve creates a new goroutine for every incoming request, even though only MaxOutstanding of them can run at any moment. As a result, the program can consume unlimited resources if the requests come in too fast. We can address that deficiency by changing Serve to gate the creation of the goroutines. Here is an obvious solution, but beware it has a bug we will fix subsequently:
func Serve(queue chan *Request) {
    for req := range queue {
        sem <- 1
        go func() {
            process(req)  // Buggy; see explanation below.
            <-sem
        }()
    }
}
The bug is that in a Go for loop (before Go 1.22), the loop variable is reused for each iteration, so the req variable is shared across all goroutines. That is not what we want. We need to make sure that req is unique for each goroutine. Here is one way to do that, passing the value of req as an argument to the closure:
func Serve(queue chan *Request) {
    for req := range queue {
        sem <- 1
        go func(req *Request) {
            process(req)
            <-sem
        }(req)
    }
}
Compare this version with the previous one to see the difference in how the closure is declared and run. Another solution is simply to create a new variable with the same name, as in this example:
func Serve(queue chan *Request) {
    for req := range queue {
        req := req  // Create new instance of req for the goroutine.
        sem <- 1
        go func() {
            process(req)
            <-sem
        }()
    }
}
It may seem odd to write req := req, but it is legal and idiomatic in Go to do this. You get a fresh version of the variable with the same name, deliberately shadowing the loop variable locally but unique to each goroutine.
Going back to the general problem of writing a server, another approach that manages resources well is to start a fixed number of handle goroutines, all reading from the request channel. The number of goroutines limits the number of simultaneous calls to process. This Serve function also accepts a channel on which it will be told to exit; after launching the goroutines it blocks receiving from that channel.
func handle(queue chan *Request) {
    for r := range queue {
        process(r)
    }
}

func Serve(clientRequests chan *Request, quit chan bool) {
    // Start handlers
    for i := 0; i < MaxOutstanding; i++ {
        go handle(clientRequests)
    }
    <-quit  // Wait to be told to exit.
}
3.1 Characteristics of channels
Channels in Go have the following characteristics:
Thread safety
Channels are safe for concurrent use: multiple goroutines can read from and write to a channel at the same time without data races. The channel implementation in Go uses internal locking to ensure that access from multiple goroutines is safe.
Blocking sending and receiving
When a goroutine sends data to a channel, the send blocks if the channel is full until another goroutine takes data out of the channel. Similarly, when a goroutine receives from a channel, the receive blocks if there is no data in the channel until another goroutine sends data into it. This blocking mechanism provides synchronization and communication between goroutines.
Ordering
Data sent through a channel arrives in the order it was sent. That is, if goroutine A first sends value x to the channel and goroutine B then sends value y, a receiver will get x first and y next.
Can be closed
Closing a channel notifies other goroutines that no more values will be sent on it. After a channel is closed, goroutines can still receive from it (draining any buffered values), but can no longer send to it. Closing channels helps avoid goroutine and memory leaks.
Buffer size
A channel can have a buffer that stores a certain number of values. If the buffer is full, a send blocks until another goroutine takes data from the channel; if the buffer is empty, a receive blocks until another goroutine sends data into the channel. The buffer size is specified when the channel is created, for example:
ch := make(chan int, 10)
Situations that panic:
1. Sending on a closed channel
2. Closing an already-closed channel
3. Closing an uninitialized (nil) channel
Situations that block:
1. Receiving from an uninitialized (nil) channel
2. Sending to an uninitialized (nil) channel
3. Sending to an unbuffered channel when no goroutine is ready to receive, or to a buffered channel whose buffer is full
4. Receiving from an unbuffered channel when no sender is ready, or from a buffered channel whose buffer is empty
Situations that return the zero value:
Receiving from a closed channel (once any buffered values have been drained)
3.2 Channel best practices
When using channels, follow these best practices:
Avoid deadlocks
Be careful to avoid deadlock when using channels. If a goroutine sends to a channel but no other goroutine ever receives from it, the send blocks forever, producing a deadlock. One way to avoid this is to use a select statement to wait on several channels (or to provide a default case) so the goroutine does not block indefinitely.
Avoid leaks
Be careful to avoid leaks when using channels. If a channel is no longer needed but goroutines are still blocked sending to or receiving from it, those goroutines and the data they hold can never be released, producing a leak. To avoid this, close the channel when the producing goroutine is finished with it so that blocked receivers can exit.
Avoid competition
Be careful to avoid race conditions. Channel operations themselves are race-free, but if multiple goroutines also read and write shared data around those operations, races can still occur and leave data inconsistent. To avoid this, protect shared state with a lock, or use directional (send-only or receive-only) channel types to restrict what each goroutine may do.
Avoid overuse
Be careful not to overuse channels. A program that funnels everything through large numbers of channels may suffer degraded performance. In such cases, other concurrency mechanisms, such as mutexes or condition variables, may be more appropriate.
4. Channels of channels
One of the most important properties of Go is that a channel is a first-class value that can be allocated and passed around like any other. A common use of this property is to implement safe, parallel demultiplexing.
In the example in the previous section, handle was an idealized handler for a request, but we did not define the type it was handling. If that type includes a channel on which to reply, each client can provide its own path for the answer. Here is a schematic definition of the type Request.
type Request struct {
    args       []int
    f          func([]int) int
    resultChan chan int
}
The client provides a function and its arguments, as well as a channel inside the request object on which to receive the answer.
func sum(a []int) (s int) {
    for _, v := range a {
        s += v
    }
    return
}

request := &Request{[]int{3, 4, 5}, sum, make(chan int)}
// Send request
clientRequests <- request
// Wait for response.
fmt.Printf("answer: %d\n", <-request.resultChan)
On the server side, the only thing that needs to be changed is the handler function.
func handle(queue chan *Request) {
    for req := range queue {
        req.resultChan <- req.f(req.args)
    }
}
There is clearly a lot more to do to make it realistic, but this code is a framework for a rate-limited, parallel, non-blocking RPC system, and there is not a mutex in sight.
5. Parallelization
Another application of these ideas is to parallelize a calculation across multiple CPU cores. If the calculation can be broken into separate pieces that can execute independently, it can be parallelized, with a channel to signal when each piece completes.
Suppose we have an expensive operation to perform on a vector of items, and that the value of the operation on each item is independent, as in this idealized example.
type Vector []float64

// Apply the operation to v[i], v[i+1] ... up to v[n-1].
func (v Vector) DoSome(i, n int, u Vector, c chan int) {
    for ; i < n; i++ {
        v[i] += u.Op(v[i])
    }
    c <- 1  // signal that this piece is done
}
We launch the pieces independently in a loop, one per CPU. They can complete in any order, but it doesn't matter; we just count the completion signals by draining the channel after launching all the goroutines.
const numCPU = 4  // number of CPU cores

func (v Vector) DoAll(u Vector) {
    c := make(chan int, numCPU)  // Buffering optional but sensible.
    for i := 0; i < numCPU; i++ {
        go v.DoSome(i*len(v)/numCPU, (i+1)*len(v)/numCPU, u, c)
    }
    // Drain the channel.
    for i := 0; i < numCPU; i++ {
        <-c  // wait for one task to complete
    }
    // All done.
}
Rather than create a constant value for numCPU, we can ask the runtime what value is appropriate. The function runtime.NumCPU returns the number of hardware CPU cores in the machine, so we could write

var numCPU = runtime.NumCPU()

There is also a function runtime.GOMAXPROCS, which reports (or sets) the user-specified number of cores that a Go program can have running simultaneously. It defaults to the value of runtime.NumCPU but can be overridden by setting the similarly named shell environment variable or by calling the function with a positive number. Calling it with zero just queries the value. Therefore, if we want to honor the user's resource request, we should write

var numCPU = runtime.GOMAXPROCS(0)
Be sure not to confuse the ideas of concurrency (structuring a program as independently executing components) and parallelism (executing calculations in parallel for efficiency on multiple CPUs). Although Go's concurrency features can make some problems easy to structure as parallel computations, Go is a concurrent language, not a parallel one, and not all parallelization problems fit Go's model. For a discussion of the distinction, see the cited article.
6. A leaky buffer
The tools of concurrent programming can even make non-concurrent ideas easier to express. Here is an example abstracted from an RPC package. The client goroutine loops receiving data from some source, perhaps a network. To avoid allocating and freeing buffers, it keeps a free list, and uses a buffered channel to represent it. If the channel is empty, a new buffer gets allocated. Once the message buffer is ready, it is sent to the server on serverChan.
var freeList = make(chan *Buffer, 100)
var serverChan = make(chan *Buffer)

func client() {
    for {
        var b *Buffer
        // Grab a buffer if available; allocate if not.
        select {
        case b = <-freeList:
            // Got one; nothing more to do.
        default:
            // None free, so allocate a new one.
            b = new(Buffer)
        }
        load(b)          // Read next message from the net.
        serverChan <- b  // Send to server.
    }
}
The server loop receives each message from the client, processes it, and returns the buffer to the free list.
func server() {
    for {
        b := <-serverChan  // Wait for work.
        process(b)
        // Reuse buffer if there's room.
        select {
        case freeList <- b:
            // Buffer on free list; nothing more to do.
        default:
            // Free list full, just carry on.
        }
    }
}
The client attempts to retrieve a buffer from freeList; if none is available, it allocates a fresh one. The server's send to freeList puts b back on the free list unless the list is full, in which case the buffer is dropped on the floor to be reclaimed by the garbage collector. (The default clauses in the select statements execute when no other case is ready, meaning that the selects never block.) This implementation builds a leaky-bucket free list in just a few lines, relying on the buffered channel and the garbage collector for bookkeeping.