SoFunction
Updated on 2025-03-04

A practical example of working with TSV files in Golang

This article introduces the TSV file format and its uses, and then walks through Go code that reads a TSV file and converts each row into a struct.

Understand TSV files

You may not have come across TSV files before, but they are simple and very common. A TSV (tab-separated values) file stores a series of records in which the values on each line are separated by tab characters (\t). The format is similar to CSV, which uses a half-width comma (,) as the delimiter instead.

Like CSV files, TSV files are supported by most platforms and data-processing software. Because TSV uses the invisible tab character as its delimiter, and tabs rarely appear inside ordinary field values, the delimiter is less likely to be misused and the format is more fault-tolerant than CSV.
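To make the delimiter difference concrete, here is a minimal sketch that parses an in-memory TSV string with the standard encoding/csv reader. The helper name parseTSV and the sample data are made up for illustration. Note that the address field contains a comma, which needs no quoting when tabs are the delimiter.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseTSV is a hypothetical helper: it reads all records from a
// tab-separated string using the standard CSV reader.
func parseTSV(s string) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(s))
	r.Comma = '\t' // split fields on tabs instead of commas
	return r.ReadAll()
}

func main() {
	// Illustrative data: the second field contains a comma.
	input := "name\taddress\nAlice\t12 Main St, Springfield\n"

	records, err := parseTSV(input)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, rec := range records {
		fmt.Printf("%d fields: %q\n", len(rec), rec)
	}
}
```

Each line is reported as two fields, because the comma inside the address is treated as ordinary text.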

Reading TSV Files in Golang

Go's encoding/csv package provides functions for reading and writing CSV files. Since the only difference between TSV and CSV is the delimiter, the following code reads a TSV file by setting the reader's Comma field to a tab. Setting Comment to '#' makes the reader skip lines that start with #:

package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "os"
)

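func main() {
    // "example.tsv" is a placeholder path; substitute your own file.
    f, err := os.Open("example.tsv")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    r := csv.NewReader(f)
    r.Comma = '\t'   // fields are separated by tabs
    r.Comment = '#'  // lines beginning with '#' are ignored

    // ReadAll returns every record as a [][]string.
    records, err := r.ReadAll()
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(records)
}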
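Since encoding/csv also covers writing, the same delimiter setting works in the other direction. The following is a small sketch, not part of the original example: writeTSV is a hypothetical helper that serializes records to a TSV string with csv.Writer.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// writeTSV is a hypothetical helper that serializes records as TSV text.
func writeTSV(records [][]string) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	w.Comma = '\t' // emit tabs between fields

	// WriteAll writes every record and then flushes the writer.
	if err := w.WriteAll(records); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := writeTSV([][]string{
		{"name", "age"},
		{"Alice", "30"},
	})
	if err != nil {
		fmt.Println("write error:", err)
		return
	}
	fmt.Printf("%q\n", out)
}
```

To write to a file instead, pass an *os.File to csv.NewWriter in place of the strings.Builder.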

Parsing into a struct

Usually we want to read a TSV file and parse each row into a struct. Let's look at how an open source implementation does this. The TSV file may include a header row, and each struct field can carry a tsv tag naming the column it maps to. For example:

type TestTaggedRow struct {
    Age    int    `tsv:"age"`
    Active bool   `tsv:"active"`
    Gender string `tsv:"gender"`
    Name   string `tsv:"name"`
}
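As a quick illustration of how such tags can be read at run time, the sketch below lists the tsv tag of every field using the reflect package. The helper name tsvTags is an assumption made for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

type TestTaggedRow struct {
	Age    int    `tsv:"age"`
	Active bool   `tsv:"active"`
	Gender string `tsv:"gender"`
	Name   string `tsv:"name"`
}

// tsvTags is a hypothetical helper that returns the `tsv` tag of each
// field of a struct value, in field order. Untagged fields yield "".
func tsvTags(v interface{}) []string {
	t := reflect.TypeOf(v)
	tags := make([]string, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		tags[i] = t.Field(i).Tag.Get("tsv")
	}
	return tags
}

func main() {
	fmt.Println(tsvTags(TestTaggedRow{})) // prints [age active gender name]
}
```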

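Next, define the Parser type. The parser code in this section uses the packages encoding/csv, errors, io, reflect, and strconv, plus golang.org/x/text/unicode/norm for the normalize field: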

// Parser has information for parser
type Parser struct {
    Headers    []string      // header row
    Reader     *csv.Reader   // underlying reader
    Data       interface{}   // pointer to the struct that receives each row
    ref        reflect.Value // reflection value of the target struct
    indices    []int         // indices is field index list of header array
    structMode bool          // true when the struct has tsv tags
    normalize  norm.Form     // UTF-8 normalization form; -1 disables it
}

Define a constructor for files without a header row:

// NewParserWithoutHeader creates new TSV parser with given io.Reader
func NewParserWithoutHeader(reader io.Reader, data interface{}) *Parser {
    r := csv.NewReader(reader)
    r.Comma = '\t'

    p := &Parser{
        Reader:    r,
        Data:      data,
        ref:       reflect.ValueOf(data).Elem(), // data must be a pointer to a struct
        normalize: -1,
    }

    return p
}

A constructor for files with a header row:

// NewParser creates new TSV parser with given io.Reader as struct mode
func NewParser(reader io.Reader, data interface{}) (*Parser, error) {
    r := csv.NewReader(reader)
    r.Comma = '\t'

    // Read the first line, which is the header row, as a string slice
    headers, err := r.Read()

    if err != nil {
        return nil, err
    }

    // Headers are stored unchanged (this loop is a no-op)
    for i, header := range headers {
        headers[i] = header
    }

    p := &Parser{
        Reader:     r,
        Headers:    headers,
        Data:       data,
        ref:        reflect.ValueOf(data).Elem(),
        indices:    make([]int, len(headers)),
        structMode: false,
        normalize:  -1,
    }

    // get type information
    t := p.ref.Type()

    for i := 0; i < t.NumField(); i++ {
        // get TSV tag
        tsvtag := t.Field(i).Tag.Get("tsv")
        if tsvtag != "" {
            // find tsv position by header
            for j := 0; j < len(headers); j++ {
                if headers[j] == tsvtag {
                    // indices are 1 start
                    p.indices[j] = i + 1
                    p.structMode = true
                }
            }
        }
    }

    // No tsv tags matched: map columns to fields by position
    if !p.structMode {
        for i := 0; i < len(headers); i++ {
            p.indices[i] = i + 1
        }
    }

    return p, nil
}

Compared with the header-less constructor above, this version adds the logic that matches tsv tags against the header row, so columns can appear in any order.
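The mapping step can be tried in isolation. The sketch below mirrors the idea under simplified assumptions: the row type and the fieldIndices helper are hypothetical names, and the function returns, for each header column, the 1-based index of the matching struct field, or 0 when no field matches.

```go
package main

import (
	"fmt"
	"reflect"
)

// row is an illustrative struct; its field order differs from the header order.
type row struct {
	Age  int    `tsv:"age"`
	Name string `tsv:"name"`
}

// fieldIndices is a hypothetical helper. For each header column j it stores
// the 1-based index of the struct field whose tsv tag equals the header,
// leaving 0 for columns that match no field.
func fieldIndices(headers []string, v interface{}) []int {
	t := reflect.TypeOf(v)
	indices := make([]int, len(headers))
	for i := 0; i < t.NumField(); i++ {
		tag := t.Field(i).Tag.Get("tsv")
		if tag == "" {
			continue
		}
		for j, h := range headers {
			if h == tag {
				indices[j] = i + 1
			}
		}
	}
	return indices
}

func main() {
	headers := []string{"name", "unknown", "age"}
	fmt.Println(fieldIndices(headers, row{})) // prints [2 0 1]
}
```

Using 1-based indices keeps 0 free as the "skip this column" marker, which is how unmatched columns are ignored when rows are parsed.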

Now let's parse each row of data with the Next() method:

// Next puts reader forward by a line
func (p *Parser) Next() (eof bool, err error) {

    // Get next record
    var records []string

    for {
        // read until valid record
        records, err = p.Reader.Read()
        if err != nil {
            if err == io.EOF {
                return true, nil
            }
            return false, err
        }
        if len(records) > 0 {
            break
        }
    }

    if len(p.indices) == 0 {
        p.indices = make([]int, len(records))
        // mapping simple index
        for i := 0; i < len(records); i++ {
            p.indices[i] = i + 1
        }
    }

    // p.Data must be a pointer to a struct so that its fields can be set
    for i, record := range records {
        idx := p.indices[i]
        if idx == 0 {
            // skip empty index
            continue
        }
        // get target field
        field := p.ref.Field(idx - 1)
        switch field.Kind() {
        case reflect.String:
            // Normalize text
            if p.normalize >= 0 {
                record = p.normalize.String(record)
            }
            field.SetString(record)
        case reflect.Bool:
            if record == "" {
                field.SetBool(false)
            } else {
                col, err := strconv.ParseBool(record)
                if err != nil {
                    return false, err
                }
                field.SetBool(col)
            }
        case reflect.Int:
            if record == "" {
                field.SetInt(0)
            } else {
                col, err := strconv.ParseInt(record, 10, 0)
                if err != nil {
                    return false, err
                }
                field.SetInt(col)
            }
        default:
            return false, errors.New("Unsupported field type")
        }
    }

    return false, nil
}

The main logic above parses each row and fills the struct through reflection. Only string, bool, and int fields are handled here; support for more types can be added in the same way.

The following is the main function:
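As one possible way to extend the type support, the sketch below adds the sized integer kinds and floats. It is an illustration rather than code from the project: setField, fillStruct, and the product type are hypothetical names, and an empty value is assumed to mean the field's zero value.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
	"strconv"
)

// setField is a hypothetical helper that parses s and stores it in a
// settable struct field. An empty string resets the field to its zero value.
func setField(field reflect.Value, s string) error {
	if s == "" {
		field.Set(reflect.Zero(field.Type()))
		return nil
	}
	switch field.Kind() {
	case reflect.String:
		field.SetString(s)
	case reflect.Bool:
		b, err := strconv.ParseBool(s)
		if err != nil {
			return err
		}
		field.SetBool(b)
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		// Bits() gives the field's width, so out-of-range values are rejected.
		n, err := strconv.ParseInt(s, 10, field.Type().Bits())
		if err != nil {
			return err
		}
		field.SetInt(n)
	case reflect.Float32, reflect.Float64:
		f, err := strconv.ParseFloat(s, field.Type().Bits())
		if err != nil {
			return err
		}
		field.SetFloat(f)
	default:
		return errors.New("unsupported field type")
	}
	return nil
}

// fillStruct assigns values to the exported fields of the struct that dst
// points to, in field order.
func fillStruct(dst interface{}, values []string) error {
	v := reflect.ValueOf(dst).Elem()
	if len(values) > v.NumField() {
		return errors.New("more values than struct fields")
	}
	for i, s := range values {
		if err := setField(v.Field(i), s); err != nil {
			return err
		}
	}
	return nil
}

// product is an illustrative row type.
type product struct {
	Name    string
	Price   float64
	Stock   int64
	InStock bool
}

func main() {
	var p product
	if err := fillStruct(&p, []string{"Widget", "9.5", "12", "true"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```

Passing the field's bit size to strconv means a value such as 300 is reported as an error for an int8 field instead of silently overflowing.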

import (
    "fmt"
    "os"
    )

type TestRow struct {
  Name   string // 0
  Age    int    // 1
  Gender string // 2
  Active bool   // 3
}

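func main() {
  // "example.tsv" is a placeholder path; its first line must be the header row.
  file, err := os.Open("example.tsv")
  if err != nil {
    panic(err)
  }
  defer file.Close()

  data := TestRow{}
  parser, err := NewParser(file, &data)
  if err != nil {
    panic(err)
  }

  for {
    // Each call to Next overwrites data with the next row.
    eof, err := parser.Next()
    if eof {
      return
    }
    if err != nil {
      panic(err)
    }
    fmt.Println(data)
  }
}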

The program opens the file, defines the struct value, and creates the parser, passing in the file and a pointer to the struct. Each call to Next() stores the parsed row in that struct. The code above is based on the open source tsv project: /dogenzaka/tsv. A more powerful open source tool is /shenwei356/csvtk, which not only parses CSV/TSV files but also converts between different formats.
