I. Unzip the package
zip file format
Common archive and compression formats include zip, gzip, and bzip2. In Go, files in these formats can be processed with the standard library packages archive/zip, compress/gzip, and compress/bzip2. Note that compress/bzip2 supports decompression only.
Unzip the zip file
Using the functions in the archive/zip package, we can easily manipulate zip files. First, we need to open the zip file:
    zipFile, err := zip.OpenReader(zipPath)
    if err != nil {
        return err
    }
    defer zipFile.Close()
The above code uses zip.OpenReader to open the zip file. It returns a *zip.ReadCloser, which acts as both the reader and the closer for the contents of the zip file. Note: the deferred Close call makes sure the file is closed once we have finished reading it.
Next, we iterate over the File slice of the reader, decompress each entry, and write it to local disk:
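    for _, zipFileInfo := range zipFile.File {
        // Build the destination path and create its parent directory.
        dstPath := filepath.Join(outputDir, zipFileInfo.Name)
        dstDir := filepath.Dir(dstPath)
        err = os.MkdirAll(dstDir, 0755)
        if err != nil {
            return err
        }
        // Create the local file with the permissions stored in the archive.
        dstFile, err := os.OpenFile(dstPath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, zipFileInfo.Mode())
        if err != nil {
            return err
        }
        srcFile, err := zipFileInfo.Open()
        if err != nil {
            dstFile.Close()
            return err
        }
        // Copy the decompressed contents, then close both files.
        _, err = io.Copy(dstFile, srcFile)
        dstFile.Close()
        srcFile.Close()
        if err != nil {
            return err
        }
    }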
In the above code, we first build the local file path with filepath.Join and create its parent directory with os.MkdirAll. Next, os.OpenFile opens the local file in write mode, and the entry's Mode() method supplies the permission bits stored in the zip file. The entry's Open() method opens the compressed file for reading, and io.Copy writes its contents to the local file. Whether or not an error occurs, both the local file and the file inside the zip must be closed so that resources are released correctly.
Decompress the gzip file
Using the compress/gzip package, we can also easily decompress gzip files. The specific method is as follows:
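    gzipFile, err := os.Open(gzipPath)
    if err != nil {
        return err
    }
    defer gzipFile.Close()
    // Wrap the file in a gzip reader that decompresses on the fly.
    gzipReader, err := gzip.NewReader(gzipFile)
    if err != nil {
        return err
    }
    defer gzipReader.Close()
    dstPath := filepath.Join(outputDir, filepath.Base(gzipPath))
    dstFile, err := os.OpenFile(dstPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0644)
    if err != nil {
        return err
    }
    defer dstFile.Close()
    _, err = io.Copy(dstFile, gzipReader)
    if err != nil {
        return err
    }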
The above code first opens the gzip file with os.Open and then uses gzip.NewReader to create a *gzip.Reader, which represents the reader for the decompressed contents. Both the gzip file and the gzip reader are closed through defer once reading is done. Then we open the target file in write mode with os.OpenFile and copy the contents of the gzip reader into it with io.Copy. The target file is also closed through defer so that its resources are released. Note that the target keeps the base name of the source, including any .gz suffix, so outputDir should not be the directory that holds the source file.
Decompress the bzip2 file
You can easily decompress bzip2 files using the compress/bzip2 package. The method is as follows:
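    bzip2File, err := os.Open(bzip2Path)
    if err != nil {
        return err
    }
    defer bzip2File.Close()
    // bzip2.NewReader returns a plain io.Reader, so there is nothing to close.
    bzip2Reader := bzip2.NewReader(bzip2File)
    dstPath := filepath.Join(outputDir, filepath.Base(bzip2Path))
    dstFile, err := os.OpenFile(dstPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0644)
    if err != nil {
        return err
    }
    defer dstFile.Close()
    _, err = io.Copy(dstFile, bzip2Reader)
    if err != nil {
        return err
    }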
In the above code, we use the bzip2.NewReader function to create an io.Reader for the decompressed contents of the bzip2 file. Unlike gzip.NewReader, it returns no error and the reader has no Close method, so only the underlying file needs to be closed. Then the target file is opened in write mode, the contents of the reader are copied to it with io.Copy, and the open files are closed through defer to free up resources when finished.
II. Reading docx/doc files
Word documents are usually saved in .doc or .docx format. A .doc file is a binary format, while a .docx file is a zip archive of XML files. We can use third-party libraries to read the contents of both. Next, we will introduce how to read each of these two file formats.
Reading .doc files
We can use /LopPay/office-parser/ole, /LopPay/office-parser/common and /LopPay/office-parser/msdoc to process ole files, parse doc files and read data from doc files respectively. The library has encapsulated all the parsing and conversion of text, images, tables and other elements.
Here is a simple program to read a doc file:
docFile, err := (docPath) if err != nil { return err } defer () docData, err := (docFile) if err != nil { return err } for _, para := range { for _, run := range { () } () }
In the above code, we first open the doc file and then pass it to the library's parsing function. That function returns a document object that includes text, images, tables, and other information. The loop then iterates through each paragraph and the runs within it and outputs their contents to the console.
Reading .docx format files
We can use the third-party library /unidoc/unioffice to read files in .docx format. The library supports operations such as reading and writing a single file, reading and writing multiple files, converting and manipulating tables, images, paragraphs, styles, etc.
Here is a simple program to read a .docx file:
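    docFile, err := os.Open(docxPath)
    if err != nil {
        return err
    }
    defer docFile.Close()
    // document.Read takes an io.ReaderAt together with the size of the data.
    info, err := docFile.Stat()
    if err != nil {
        return err
    }
    doc, err := document.Read(docFile, info.Size())
    if err != nil {
        return err
    }
    for _, para := range doc.Paragraphs() {
        for _, run := range para.Runs() {
            fmt.Print(run.Text())
        }
        fmt.Println()
    }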
In the above code, we first open the docx file and then pass it to the library to be parsed. The result is a document object that includes text, images, tables, and other information. The loop then iterates through each paragraph and the runs in it and prints the text of each run to the console, with a line break after each paragraph.
III. Summary
This article describes how to use Go to unpack compressed files and read docx/doc files. The decompression examples rely on packages from the Go standard library, while the Word examples rely on third-party libraries.
The code above is deliberately simple, which makes it suitable for beginners to learn from and practice with. I hope it helps; you can modify and extend it according to your own needs.