This article introduces the TSV file format and its uses, then shows how to read TSV files in Go (Golang) and parse their rows into structs.
Understanding TSV files
If you have not come across TSV files before, don't worry: the format is simple and very common. A TSV (tab-separated values) file stores a series of records as text, with the values in each record separated by tab characters (\t). It is similar to the CSV format, except that CSV separates values with a half-width comma (,).
Like CSV files, TSV files are general-purpose and are supported by most platforms and data-processing software. Because the delimiter is an invisible tab character that rarely appears inside ordinary field values, a value is less likely to be mistaken for a delimiter than with commas, so TSV tends to be more fault-tolerant than CSV.
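To make the delimiter difference concrete, here is a minimal sketch that encodes the same record twice with Go's encoding/csv writer, once with a comma and once with a tab. The helper name encodeRow and the sample record are assumptions made up for this illustration.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encodeRow writes one record with the given delimiter and returns the
// encoded line. It is a hypothetical helper used only for this comparison.
func encodeRow(delim rune, row []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	w.Comma = delim
	if err := w.Write(row); err != nil {
		return "", err
	}
	w.Flush()
	if err := w.Error(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// The first value contains a comma but no tab.
	row := []string{"Smith, John", "42"}

	asCSV, err := encodeRow(',', row)
	if err != nil {
		panic(err)
	}
	asTSV, err := encodeRow('\t', row)
	if err != nil {
		panic(err)
	}

	// %q makes the tab character and any added quotes visible.
	fmt.Printf("CSV: %q\n", asCSV)
	fmt.Printf("TSV: %q\n", asTSV)
}
```

In the CSV line the first value has to be wrapped in double quotes because it contains the delimiter. In the TSV line it is written as-is.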
Reading TSV files in Go
Go's standard encoding/csv package reads and writes CSV files. Since the only difference between TSV and CSV is the delimiter, the following code reads a TSV file simply by setting the reader's delimiter to a tab:
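package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
)

func main() {
	f, err := os.Open("") // path to the TSV file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	r := csv.NewReader(f)
	r.Comma = '\t'   // fields are separated by tabs
	r.Comment = '#'  // lines starting with '#' are skipped

	records, err := r.ReadAll()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(records)
}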
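The same reader settings can be tried on in-memory text without a file. The sketch below is only an illustration: the function name parseTSV and the sample data are made up, and setting LazyQuotes is an extra assumption that goes beyond the code above. It is included because encoding/csv otherwise rejects a bare double quote inside an unquoted field, which TSV data may contain.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseTSV parses TSV text held in memory and returns all records.
// It is a hypothetical helper for this example.
func parseTSV(input string) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.Comma = '\t'      // tab-separated fields
	r.Comment = '#'     // skip lines that start with '#'
	r.LazyQuotes = true // allow a bare " inside an unquoted field
	return r.ReadAll()
}

func main() {
	// Sample data invented for this example.
	input := "name\tage\n# a comment line\nalice\t30\nbob \"bobby\"\t25\n"

	records, err := parseTSV(input)
	if err != nil {
		panic(err)
	}
	for _, rec := range records {
		fmt.Println(len(rec), rec)
	}
}
```

The comment line is dropped, so three records come back, each with two fields.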
Parsing into a struct
Usually we want to read a TSV file and parse each row into a struct. Let's look at how an open source implementation does it. The TSV file may include a header row, in which case struct fields are matched to columns through a tsv tag. For example:
type TestTaggedRow struct {
	Age    int    `tsv:"age"`
	Active bool   `tsv:"active"`
	Gender string `tsv:"gender"`
	Name   string `tsv:"name"`
}
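Struct tags like these are read at run time through the reflect package. As a minimal sketch of that lookup, the hypothetical helper tsvTags below (not part of the library) lists the tsv tag of every field of a struct.

```go
package main

import (
	"fmt"
	"reflect"
)

type TestTaggedRow struct {
	Age    int    `tsv:"age"`
	Active bool   `tsv:"active"`
	Gender string `tsv:"gender"`
	Name   string `tsv:"name"`
}

// tsvTags returns the tsv tag of each field of the struct that v points to,
// in declaration order. Fields without a tsv tag yield an empty string.
// It is a hypothetical helper for illustration only.
func tsvTags(v interface{}) []string {
	t := reflect.TypeOf(v).Elem()
	tags := make([]string, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		tags[i] = t.Field(i).Tag.Get("tsv")
	}
	return tags
}

func main() {
	fmt.Println(tsvTags(&TestTaggedRow{})) // prints [age active gender name]
}
```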
Next, define the Parser type:
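import (
	"encoding/csv"
	"errors"
	"io"
	"reflect"
	"strconv"

	"golang.org/x/text/unicode/norm"
)

// Parser has information for parser
type Parser struct {
	Headers    []string      // header row
	Reader     *csv.Reader   // underlying reader
	Data       interface{}   // pointer to the struct that receives each row
	ref        reflect.Value // reflection value of the struct
	indices    []int         // indices is field index list of header array
	structMode bool          // struct mode: the struct has tsv tags
	normalize  norm.Form     // Unicode normalization form for strings; -1 disables it
}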
Define a constructor for files without a header row:
// NewParserWithoutHeader creates new TSV parser with given io.Reader
func NewParserWithoutHeader(reader io.Reader, data interface{}) *Parser {
	r := csv.NewReader(reader)
	r.Comma = '\t'

	p := &Parser{
		Reader:    r,
		Data:      data,
		ref:       reflect.ValueOf(data).Elem(),
		normalize: -1,
	}
	return p
}
And a constructor for files with a header row:
// NewParser creates new TSV parser with given io.Reader as struct mode
func NewParser(reader io.Reader, data interface{}) (*Parser, error) {
	r := csv.NewReader(reader)
	r.Comma = '\t'

	// Read the first line, which is the header row, as a string slice
	headers, err := r.Read()
	if err != nil {
		return nil, err
	}

	p := &Parser{
		Reader:     r,
		Headers:    headers,
		Data:       data,
		ref:        reflect.ValueOf(data).Elem(),
		indices:    make([]int, len(headers)),
		structMode: false,
		normalize:  -1,
	}

	// get type information
	t := p.ref.Type()

	for i := 0; i < t.NumField(); i++ {
		// get TSV tag
		tsvtag := t.Field(i).Tag.Get("tsv")
		if tsvtag != "" {
			// find tsv position by header
			for j := 0; j < len(headers); j++ {
				if headers[j] == tsvtag {
					// indices are 1 start
					p.indices[j] = i + 1
					p.structMode = true
				}
			}
		}
	}

	// no tag matched a header: map columns to fields by position
	if !p.structMode {
		for i := 0; i < len(headers); i++ {
			p.indices[i] = i + 1
		}
	}

	return p, nil
}
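Compared with the header-less constructor above, this one adds the logic for matching tsv tags against the header row. If no tag matches any header, columns are mapped to fields by position.
Now let's parse each row of data. Here is the Next() method: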
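The tag-matching step can be tried in isolation. The sketch below pulls it out into a hypothetical standalone function, buildIndices, which is not part of the library. It leaves out the positional fallback and uses a made-up header row with one column that has no matching field.

```go
package main

import (
	"fmt"
	"reflect"
)

type TestTaggedRow struct {
	Age    int    `tsv:"age"`
	Active bool   `tsv:"active"`
	Gender string `tsv:"gender"`
	Name   string `tsv:"name"`
}

// buildIndices maps each header column to a 1-based struct field index.
// A zero entry means no field carries a matching tsv tag, so that column
// would be skipped when rows are parsed.
func buildIndices(headers []string, v interface{}) []int {
	t := reflect.TypeOf(v).Elem()
	indices := make([]int, len(headers))
	for i := 0; i < t.NumField(); i++ {
		tag := t.Field(i).Tag.Get("tsv")
		if tag == "" {
			continue
		}
		for j, h := range headers {
			if h == tag {
				indices[j] = i + 1
			}
		}
	}
	return indices
}

func main() {
	// Invented header row: the column order differs from the field order,
	// and "comment" has no matching field.
	headers := []string{"name", "age", "comment"}
	fmt.Println(buildIndices(headers, &TestTaggedRow{})) // prints [4 1 0]
}
```

Storing indices starting at 1 lets the zero value of the slice double as the "no matching field" marker.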
// Next puts reader forward by a line
func (p *Parser) Next() (eof bool, err error) {
	// Get next record
	var records []string
	for {
		// read until valid record
		records, err = p.Reader.Read()
		if err != nil {
			if err == io.EOF {
				return true, nil
			}
			return false, err
		}
		if len(records) > 0 {
			break
		}
	}

	if len(p.indices) == 0 {
		p.indices = make([]int, len(records))
		// mapping simple index
		for i := 0; i < len(records); i++ {
			p.indices[i] = i + 1
		}
	}

	// record should be a pointer
	for i, record := range records {
		idx := p.indices[i]
		if idx == 0 {
			// skip empty index
			continue
		}
		// get target field
		field := p.ref.Field(idx - 1)
		switch field.Kind() {
		case reflect.String:
			// Normalize text
			if p.normalize >= 0 {
				record = p.normalize.String(record)
			}
			field.SetString(record)
		case reflect.Bool:
			if record == "" {
				field.SetBool(false)
			} else {
				col, err := strconv.ParseBool(record)
				if err != nil {
					return false, err
				}
				field.SetBool(col)
			}
		case reflect.Int:
			if record == "" {
				field.SetInt(0)
			} else {
				col, err := strconv.ParseInt(record, 10, 0)
				if err != nil {
					return false, err
				}
				field.SetInt(col)
			}
		default:
			return false, errors.New("Unsupported field type")
		}
	}

	return false, nil
}
The main logic above reads one row, then uses reflection to parse each value and store it in the corresponding struct field. Only string, bool, and int fields are handled here; support for more types can be added in the same way.
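As one possible way to support more types, the sketch below widens the type switch to cover all sized integers, unsigned integers, and floats. The names setField, fillRow, and Measurement are hypothetical and not part of the library, and fillRow maps columns to fields purely by position.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// setField parses record according to the kind of field and stores the
// result. An empty record resets the field to its zero value.
func setField(field reflect.Value, record string) error {
	if record == "" {
		field.Set(reflect.Zero(field.Type()))
		return nil
	}
	switch field.Kind() {
	case reflect.String:
		field.SetString(record)
	case reflect.Bool:
		v, err := strconv.ParseBool(record)
		if err != nil {
			return err
		}
		field.SetBool(v)
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		// Bits() gives the field width, so out-of-range values are rejected.
		v, err := strconv.ParseInt(record, 10, field.Type().Bits())
		if err != nil {
			return err
		}
		field.SetInt(v)
	case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		v, err := strconv.ParseUint(record, 10, field.Type().Bits())
		if err != nil {
			return err
		}
		field.SetUint(v)
	case reflect.Float32, reflect.Float64:
		v, err := strconv.ParseFloat(record, field.Type().Bits())
		if err != nil {
			return err
		}
		field.SetFloat(v)
	default:
		return fmt.Errorf("unsupported field type %s", field.Type())
	}
	return nil
}

// fillRow stores records into the fields of the struct dst points to,
// matching columns to fields by position.
func fillRow(dst interface{}, records []string) error {
	ref := reflect.ValueOf(dst).Elem()
	if len(records) != ref.NumField() {
		return fmt.Errorf("got %d columns, want %d", len(records), ref.NumField())
	}
	for i, rec := range records {
		if err := setField(ref.Field(i), rec); err != nil {
			return err
		}
	}
	return nil
}

// Measurement is an invented row type with kinds the original switch lacks.
type Measurement struct {
	Name  string
	Count uint8
	Value float64
	OK    bool
}

func main() {
	var m Measurement
	if err := fillRow(&m, []string{"probe", "3", "1.5", "true"}); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", m)
}
```

Passing the field's bit size to the strconv functions means a value such as 300 for a uint8 field produces an error instead of silently overflowing.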
The following is the main function:
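import (
	"fmt"
	"os"
)

type TestRow struct {
	Name   string // 0
	Age    int    // 1
	Gender string // 2
	Active bool   // 3
}

func main() {
	file, err := os.Open("") // path to the TSV file
	if err != nil {
		panic(err)
	}
	defer file.Close()

	data := TestRow{}
	// NewParser consumes the first line of the file as the header row
	parser, err := NewParser(file, &data)
	if err != nil {
		panic(err)
	}

	for {
		eof, err := parser.Next()
		if eof {
			return
		}
		if err != nil {
			panic(err)
		}
		// data now holds the row that was just parsed
		fmt.Println(data)
	}
}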
Open the file, define the struct value, and then create the parser, passing in the file and a pointer to the struct. Each call to Next() stores the parsed row in that struct. The code above comes from the open source tsv project: /dogenzaka/tsv. There is also a more powerful open source tool, /shenwei356/csvtk, which not only parses CSV/TSV files but also converts between formats.
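Because every row is written into the same struct, a program that needs all rows at once has to copy the value after each call. The sketch below shows that pattern in self-contained form. The function next is a hand-written, reflection-free stand-in for Parser.Next, and both it and collect are hypothetical names. The input is assumed to have no header row.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

type TestRow struct {
	Name   string // column 0
	Age    int    // column 1
	Gender string // column 2
	Active bool   // column 3
}

// next reads one record into dst and reports whether the input is
// exhausted. It mimics the shape of Parser.Next for a fixed row type.
func next(cr *csv.Reader, dst *TestRow) (eof bool, err error) {
	rec, err := cr.Read()
	if errors.Is(err, io.EOF) {
		return true, nil
	}
	if err != nil {
		return false, err
	}
	age, err := strconv.Atoi(rec[1])
	if err != nil {
		return false, err
	}
	active, err := strconv.ParseBool(rec[3])
	if err != nil {
		return false, err
	}
	*dst = TestRow{Name: rec[0], Age: age, Gender: rec[2], Active: active}
	return false, nil
}

// collect parses header-less TSV text and returns every row.
func collect(input string) ([]TestRow, error) {
	cr := csv.NewReader(strings.NewReader(input))
	cr.Comma = '\t'
	cr.FieldsPerRecord = 4 // reject rows with a different column count

	var rows []TestRow
	var data TestRow // reused for every row
	for {
		eof, err := next(cr, &data)
		if err != nil {
			return nil, err
		}
		if eof {
			return rows, nil
		}
		rows = append(rows, data) // append stores a copy of the struct
	}
}

func main() {
	// Sample data invented for this example.
	rows, err := collect("alice\t30\tfemale\ttrue\nbob\t25\tmale\tfalse\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(rows), rows[0].Name, rows[1].Name) // prints 2 alice bob
}
```

Appending the struct by value is what keeps earlier rows intact. Appending a pointer to data instead would leave every element pointing at the last row.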
This concludes the walkthrough of working with TSV files in Go.