Source code analysis: how Golang forks a process

Creating a new process takes two steps: the fork system call and the execve system call. fork creates a child that starts out as a copy of the parent (its memory, including the stack, is shared copy-on-write), while execve replaces the program image of the current process, stack included, and resumes execution at the entry point of the new executable file.

Before analyzing the source code, let's look at how to start a child process in Go. (Strictly speaking, this forks first and then calls execve in the child.)

cmd := exec.Command("/bin/sh")
cmd.Env = os.Environ()
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
err = cmd.Run()

The code above forks a child process, and the child then calls execve to replace its own program with the new executable /bin/sh. The parent's current standard input, output, and error streams are passed to the child as well.

We will focus on how Go passes the parent process's file descriptors to the child process.

cmd.Run() calls the cmd.Start() method, which contains the logic for handing over the standard input and output streams. Let's take a look.

// /usr/local/go/src/os/exec/:625
func (c *Cmd) Start() error {
	......
	childFiles := make([]*os.File, 0, 3+len(c.ExtraFiles))
	// Create the child's standard input.
	stdin, err := c.childStdin()
	if err != nil {
		return err
	}
	childFiles = append(childFiles, stdin)
	// Create the child's standard output.
	stdout, err := c.childStdout()
	if err != nil {
		return err
	}
	childFiles = append(childFiles, stdout)
	// Create the child's standard error.
	stderr, err := c.childStderr(stdout)
	if err != nil {
		return err
	}
	// At this point childFiles holds the three standard streams.
	childFiles = append(childFiles, stderr)
	childFiles = append(childFiles, c.ExtraFiles...)

	env, err := c.environ()
	if err != nil {
		return err
	}
	// Start the child process. It inherits exactly the parent's file
	// descriptors listed in childFiles.
	c.Process, err = os.StartProcess(c.Path, c.argv(), &os.ProcAttr{
		Dir:   c.Dir,
		Files: childFiles,
		Env:   env,
		Sys:   c.SysProcAttr,
	})
	.....
}

As shown above, childStdin, childStdout, and childStderr are each called to create the standard streams for the child process. Let's look at how childStdin is implemented; childStdout and childStderr work in a similar way.

// /usr/local/go/src/os/exec/:489
func (c *Cmd) childStdin() (*os.File, error) {

	.....

	pr, pw, err := os.Pipe()
	if err != nil {
		return nil, err
	}

	c.childIOFiles = append(c.childIOFiles, pr)
	c.parentIOPipes = append(c.parentIOPipes, pw)
	// The data written to pw comes from c.Stdin: the parent starts a
	// goroutine that copies c.Stdin into pw.
	c.goroutine = append(c.goroutine, func() error {
		_, err := io.Copy(pw, c.Stdin)
		if skipStdinCopyError(err) {
			err = nil
		}
		if err1 := pw.Close(); err == nil {
			err = err1
		}
		return err
	})
	....
	return pr, nil
}

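childStdin creates a pipe, which yields the read end pr and the write end pw. Data written to pw can be read from pr. The data written to pw comes from c.Stdin: the parent process starts a goroutine that copies c.Stdin into pw. c.Stdin is the value assigned as standard input in the demonstration code at the start of this article. (Note that when c.Stdin is already an *os.File, as os.Stdin is, the elided code at the top of childStdin returns that file directly and no pipe is created; the pipe path is taken for any other io.Reader.)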

cmd := exec.Command("/bin/sh")
cmd.Env = os.Environ()
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
err = cmd.Run()

pr is returned to Start, placed in childFiles, and passed to the child process, where it becomes the child's standard input. Once the child starts, it reads its standard input data from pr.

At this point you should understand how the child process obtains the parent's terminal input: the parent creates a pipe and passes one end of it to the child, so that the parent and child processes can communicate.

Let's return to the main flow of process creation. So far we have only seen that the parent process prepares the standard input and output streams for the child, in some cases wrapped in a pipe. We have not yet looked in detail at how os.StartProcess passes the parent's file descriptors to the child.

Note that in Go, fork and execve are wrapped into a single function, forkExec, which makes the child process inherit only specific file descriptors and closes all others. The kernel's fork system call, however, copies every file descriptor of the parent into the child. So how does Go manage to pass on only specific file descriptors? That is the focus of the analysis below.

Next, let's dive into os.StartProcess and see how, across the fork and execve calls, Go makes the child inherit only the file descriptors the parent passed through childFiles.

Underneath, os.StartProcess ends up calling forkAndExecInChild1. Since the code is fairly long, only the key steps are listed here, with comments.

func forkAndExecInChild1(argv0 *byte, argv, envv []*byte, chroot, dir *byte, attr *ProcAttr, sys *SysProcAttr, pipe int) (r1 uintptr, err1 Errno, p [2]int, locked bool) {
	...
	// Before the fork, attr.Files is copied into the fd array. The childFiles
	// we passed in have already been converted to file descriptor numbers and
	// stored in attr.Files by the time this code runs. nextfd is set past
	// every descriptor in the list so that the dup calls made later cannot
	// overwrite a descriptor the child still needs (see step 1 below).
	nextfd = len(attr.Files)
	for i, ufd := range attr.Files {
		if nextfd < int(ufd) {
			nextfd = int(ufd)
		}
		fd[i] = int(ufd)
	}
	nextfd++
	.....
	// Create the new process. Note that the clone system call is used here
	// rather than fork. The two are similar, but clone takes flags that select
	// which attributes of the parent the child shares and which it does not.
	// For example, to give the child a new network namespace, set
	// syscall.CLONE_NEWNET in the flags (syscall.CLONE_NEWNS is the flag for a
	// new mount namespace).
	r1, err1 = rawVforkSyscall(SYS_CLONE, flags, 0)
	....

	// After the clone call above, the child process exists. The parent returns
	// at this point by checking err1 != 0 || r1 != 0, so the two steps below
	// run only in the child.
	//
	// Step 1: every fd[i] < i is first duplicated to a fresh descriptor with
	// dup. Step 2 copies fd[i] onto descriptor i, so when fd[i] < i,
	// descriptor fd[i] has already been overwritten by an earlier iteration of
	// step 2 and is no longer the file the parent passed. Such descriptors are
	// therefore copied out of the way to nextfd, beyond the fd array, with
	// O_CLOEXEC set so that the temporary copy is closed automatically by the
	// later execve.
	for i = 0; i < len(fd); i++ {
		if fd[i] >= 0 && fd[i] < i {
			....
			_, _, err1 = RawSyscall(SYS_DUP3, uintptr(fd[i]), uintptr(nextfd), O_CLOEXEC)
			if err1 != 0 {
				goto childerror
			}
			fd[i] = nextfd
			nextfd++
		}
	}
	....
	// Step 2: walk fd and copy each fd[i] onto descriptor i in the child.
	// O_CLOEXEC is not set here, because these descriptors must still exist
	// after execve.
	for i = 0; i < len(fd); i++ {
		....
		_, _, err1 = RawSyscall(SYS_DUP3, uintptr(fd[i]), uintptr(i), 0)
		if err1 != 0 {
			goto childerror
		}
	}

	....
	// Make the execve system call.
	_, _, err1 = RawSyscall(SYS_EXECVE,
		uintptr(unsafe.Pointer(argv0)),
		uintptr(unsafe.Pointer(&argv[0])),
		uintptr(unsafe.Pointer(&envv[0])))
}

As we can see, before execve Go uses the dup system call to hand the parent's file descriptors to the child. The end result is that the child inherits exactly the file descriptors listed in attr.Files, with entry i becoming descriptor i. The extra descriptors created along the way by dup are marked O_CLOEXEC, so they are closed when the SYS_EXECVE system call runs.

However, this alone does not show that Go closes the other file descriptors, because at fork time the child automatically inherits every file descriptor of the parent. Are those inherited descriptors closed automatically after execve? By default, yes.

Go's os.OpenFile function calls the following code to open a file. You can see that the syscall.O_CLOEXEC flag is set on open, so when the child process calls execve, these file descriptors are closed automatically.

func openFileNolog(name string, flag int, perm FileMode) (*File, error) {
	setSticky := false
	if !supportsCreateWithStickyBit && flag&O_CREATE != 0 && perm&ModeSticky != 0 {
		if _, err := Stat(name); IsNotExist(err) {
			setSticky = true
		}
	}
	var r int
	for {
		var e error
		r, e = syscall.Open(name, flag|syscall.O_CLOEXEC, syscallMode(perm))
		if e == nil {
			break
		}
		...
	}
	...
}

Sockets, including the ones you listen on, are likewise created by default with the syscall.SOCK_CLOEXEC flag (alongside syscall.SOCK_NONBLOCK):

// Wrapper around the socket system call that marks the returned file
// descriptor as nonblocking and close-on-exec.
func sysSocket(family, sotype, proto int) (int, error) {
	s, err := socketFunc(family, sotype|syscall.SOCK_NONBLOCK|syscall.SOCK_CLOEXEC, proto)
	if err != nil {
		return -1, os.NewSyscallError("socket", err)
	}
	return s, nil
}
