Waiting Goroutines: fetching stock prices

Photo by Jonathan Chng on Unsplash

Goroutines are a lightweight approach to concurrent processing. They let you trigger several long-running tasks and leave it to Go and your hardware to figure out how to use the available resources. However, your program will not automatically wait for those tasks to finish: execution can simply stop, and your goroutines will die along with it. This article shows how you can wait for all goroutines to finish in a common use case: fetching data from an external API, such as stock prices.

The basics

  • All code examples are available in this repository. Feel free to reach out if you have any questions!
  • This article relies heavily on the go-quote project.
  • There are other approaches to dealing with concurrency in Go, such as Mutexes and even Channels (see the short channel sketch right after this list). Concurrency is a broader CS topic, and there is a lot to consider about how and when to use it in complex systems.
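For example, a minimal sketch of the channel alternative could look like the snippet below. It is just an illustration, not code from the repository: waitWithChannel and the work inside the goroutine are placeholders.

func waitWithChannel(tickers []string) {
	done := make(chan string)
	for _, ticker := range tickers {
		go func(t string) {
			// ... do the long-running work for t here ...
			done <- t // signal completion by sending the ticker back
		}(ticker)
	}
	for range tickers {
		fmt.Printf("[%s] finished\n", <-done) // block until every goroutine reports back
	}
}

The receive on the channel blocks, so the final loop only ends once every goroutine has sent its result. In this article we will stick with WaitGroups instead.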

Our idea begins with a simple entry point. We want quotes within a time range for the biggest tech companies out there. How about the last 10 years?

func main() {
	tickers := []string{"FB", "AAPL", "AMZN", "MSFT", "GOOG"}
	from := time.Now().AddDate(-10, 0, 0)
	to := time.Now()
	fetchQuotes(tickers, from, to)
}

This means we need a fetchQuotes() function that does exactly that: for every ticker we provide, it calls fetchQuote() to bring back what we need:

func fetchQuotes(tickers []string, from time.Time, to time.Time) {
	for _, ticker := range tickers {
		fetchQuote(ticker, from, to)
	}
}

#1 Naive Approach: Blocking operations

Let’s suppose our tasks do not seem to be that long. So we start with the following implementation:

func fetchQuote(ticker string, from time.Time, to time.Time) {
	quotes, err := quote.NewQuoteFromYahoo(ticker,
			from.Format("2006-01-02"),
			to.Format("2006-01-02"),
			quote.Daily,
			true,
		)
	if err != nil {
		fmt.Printf("[%s] Error fetching: %s\n", ticker, err)
		return
	}
	totalDataPoints := len(quotes.Close) // only read quotes after checking err
	fmt.Printf("[%s] Fetched %d data points\n", ticker, totalDataPoints)
}

In this approach we wait for every fetchQuote() call to finish before moving on to the next ticker. The result?

[FB] Fetched 2025 data points
[AAPL] Fetched 2518 data points
[AMZN] Fetched 2518 data points
[MSFT] Fetched 2518 data points
[GOOG] Fetched 2518 data points

real	0m5,112s
user	0m0,746s
sys	0m0,104s
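Those timings look like output from the shell's time command. If you would rather measure inside the program itself, a rough variation of main() using time.Since (just an illustration, not how the repository does it) would be:

func main() {
	start := time.Now()
	tickers := []string{"FB", "AAPL", "AMZN", "MSFT", "GOOG"}
	fetchQuotes(tickers, time.Now().AddDate(-10, 0, 0), time.Now())
	fmt.Printf("took %s\n", time.Since(start)) // wall-clock time for the whole fetch
}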

That is not the worst, but we can do better: all these calls are completely independent of one another. That makes this example an excellent use case for goroutines! So can we just add go before our call, like this?

func fetchQuotes(tickers []string, from time.Time, to time.Time) {
	for _, ticker := range tickers {
		go fetchQuote(ticker, from, to)
	}
}

The output:

real	0m0,430s
user	0m0,549s
sys	0m0,098s

As we can see, our program did not wait for our goroutines to finish at all. Not what you expected? Well, there are simple ways to prevent this from happening.

#2 Every man for himself: waiting groups

The simple, standard sync package can help us here with its WaitGroups. The following implementation uses a WaitGroup to hang until all tasks declare they are done:

func fetchQuotes(tickers []string, from time.Time, to time.Time) error {
	var group sync.WaitGroup
	group.Add(len(tickers)) // the group will wait for len(tickers) "Done" signals before Wait() returns
	for _, ticker := range tickers {
		go fetchQuote(ticker, from, to, &group)
	}
	group.Wait()
	return nil
}

func fetchQuote(ticker string, from time.Time, to time.Time, group *sync.WaitGroup) {
	defer group.Done() // defer the Done signal so it is sent whether we hit an error or not
	quotes, err := quote.NewQuoteFromYahoo(ticker,
			from.Format("2006-01-02"),
			to.Format("2006-01-02"),
			quote.Daily,
			true,
		)
	if err != nil {
		fmt.Printf("[%s] Error fetching: %s\n", ticker, err)
		return
	}
	totalDataPoints := len(quotes.Close) // only read quotes after checking err
	fmt.Printf("[%s] Fetched %d data points\n", ticker, totalDataPoints)
}

Now all tasks are performed concurrently. We need to make sure group.Done() is called for each task no matter the outcome; otherwise the group.Wait() call will hang indefinitely. The result?

[AMZN] Fetched 2518 data points
[MSFT] Fetched 2518 data points
[FB] Fetched 2025 data points
[GOOG] Fetched 2518 data points
[AAPL] Fetched 2518 data points

real	0m2,351s
user	0m0,793s
sys	0m0,133s

Not only did we get a much faster execution, but you can also see that the output order is different from the input order: tasks that get faster API responses (or have less work to do internally) finish first.
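As a side note, the WaitGroup does not have to be threaded through fetchQuote() at all. A common variation, sketched below under the assumption that fetchQuote() keeps its original, WaitGroup-free signature (this is not how the repository does it), calls Add(1) per iteration and wraps the call in an anonymous goroutine:

func fetchQuotes(tickers []string, from time.Time, to time.Time) {
	var group sync.WaitGroup
	for _, ticker := range tickers {
		group.Add(1) // one Done expected for this ticker
		go func(t string) {
			defer group.Done() // always signal, no matter how fetchQuote returns
			fetchQuote(t, from, to) // original signature, no WaitGroup parameter
		}(ticker)
	}
	group.Wait() // block until every goroutine has called Done
}

Both forms are equivalent; the per-iteration Add(1) just keeps the counter bookkeeping next to the go statement and leaves fetchQuote() unaware of the synchronization.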

Conclusion

Many languages have a huge toolbox for concurrency: threads, processes, mutexes, event loops and many others. The discussion can also scale up to parallelism, distributed processing and more. However, Go offers native, elegant and straightforward tooling for concurrent approaches. Huge performance gains can be achieved with small changes, especially because we are dealing with a concurrency-ready ecosystem: many important projects deal seamlessly with goroutines and channels.

We went over a simple approach to dealing with concurrent tasks. This example came up in a side project I’m working on: those data points are being stored in an InfluxDB instance, so they can be used in further analysis. I will try to keep bringing examples from this project to illustrate some of the wonders of Golang. Stay tuned!