Reading tsv file concurrently with multiple goroutines #38

Open
Wkalmar opened this issue Nov 26, 2022 · 1 comment


Wkalmar commented Nov 26, 2022

Hello,
I'm using your tsv package to read a .tsv file.
The code below works fine:

type row struct {
	Tconst         string `tsv:"tconst"`
	TitleType      string `tsv:"titleType"`
	PrimaryTitle   string `tsv:"primaryTitle"`
	OriginalTitle  string `tsv:"originalTitle"`
	IsAdult        byte   `tsv:"isAdult"`
	StartYear      uint16 `tsv:"startYear"`
	EndYear        string `tsv:"endYear"`
	RuntimeMinutes uint16 `tsv:"runtimeMinutes"`
	Genres         string `tsv:"genres"`
}

func ReadFilePlain() {
	file, err := os.Open("/static/data.tsv")
	if err != nil {
		panic(err)
	}
	defer file.Close()
	r := tsv.NewReader(file)
	r.HasHeaderRow = true
	r.UseHeaderNames = true
	for i := 0; i < 1000; i++ {
		var v row
		err = r.Read(&v)
		if err == nil {
			fmt.Printf("%+v\n", v)
		} else {
			fmt.Println(err)
		}
	}
}

However, when I try to speed things up by reading from multiple goroutines, like this:

func ReadFileGoRoutines() {
	file, err := os.Open("/static/data.tsv")
	if err != nil {
		panic(err)
	}
	defer file.Close()
	r := tsv.NewReader(file)
	r.HasHeaderRow = true
	r.UseHeaderNames = true
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			var v row
			err = r.Read(&v)
			if err == nil {
				fmt.Printf("%+v\n", v)
			} else {
				fmt.Println(err)
			}
			wg.Done()
		}()
	}
	wg.Wait()
}

I get

column tconst does not appear in the header: map[0:4 1:7 1894:5 Carmencita:3 Documentary,Short:8 \N:6 short:1 tt0000001:0]
panic: runtime error: slice bounds out of range [60:42]

Am I doing something non-idiomatic, or is this a concurrency issue?
For your convenience, I have the complete code here

Thank you in advance
Bohdan

Contributor

jcharum commented Nov 27, 2022

(*tsv.Reader).Read is not safe to call concurrently. This is consistent with other "read" APIs, like (*encoding/csv.Reader).Read and (io.Reader).Read.
