Created on February 5, 2021 at 9:20 pm
Updated on February 22, 2021 at 3:23 pm

Connecting Google BigQuery to AWS Lambda with Go

Update:

This method of getting credentials in AWS Lambda is not recommended. The function makes a GET request to S3 whenever the init function runs, which happens on every cold start. This has two problems. One, it slows down the function because it has to pull a file from S3. Two, it adds to costs. It's not expensive, but every 2.5 million GET requests cost about 1 USD.
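To put a rough number on the cost point, here is a back-of-the-envelope sketch. The price of 0.0004 USD per 1,000 GET requests is an assumption on my part that matches the 2.5 million requests per dollar figure above, so check the current S3 pricing for your region. The s3GetCost helper is hypothetical and only there for the arithmetic.

```go
package main

import "fmt"

// Assumed price: 0.0004 USD per 1,000 S3 GET requests.
// This is an illustration, not a quote from the S3 price list.
const usdPerThousandGets = 0.0004

// s3GetCost returns the estimated cost in USD for n GET requests.
func s3GetCost(n int) float64 {
	return float64(n) / 1000 * usdPerThousandGets
}

func main() {
	// Estimated cost of 2.5 million GET requests.
	fmt.Printf("%.2f USD\n", s3GetCost(2500000))
}
```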

Here's how to make Google BigQuery calls from AWS Lambda.

Setup

There are three initial steps before putting any code to editor. First, generate the credentials JSON file from BigQuery. Second, create a Lambda function. Third, create an S3 bucket and upload the credentials JSON file into it. These steps are pretty straightforward, so I won't go in depth. Please follow the links in the References section to get started.

Once those are all set up, it's time to put code to editor. Here is the code in its entirety to start off with:

package main

import (
	"context"
	"io/ioutil"
	"log"

	"cloud.google.com/go/bigquery"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

var client *bigquery.Client
var ctx context.Context

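func init() {
	// Fetch the credentials file from S3. This runs once per cold start.
	svc := s3.New(session.Must(session.NewSession()))
	input := &s3.GetObjectInput{
		Bucket: aws.String("name of bucket"),
		Key:    aws.String("name of json file"),
	}
	result, err := svc.GetObject(input)
	if err != nil {
		log.Fatalf("s3.GetObject: %v", err)
	}
	defer result.Body.Close()

	// Read the credentials JSON into a slice of bytes.
	body, err := ioutil.ReadAll(result.Body)
	if err != nil {
		log.Fatalf("reading credentials: %v", err)
	}

	ctx = context.Background()

	client, err = bigquery.NewClient(ctx, "name of Bigquery project", option.WithCredentialsJSON(body))
	if err != nil {
		log.Fatalf("bigquery.NewClient: %v", err)
	}
}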

func main() {
	lambda.Start(handler)
}

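func handler() ([]string, error) {
	it, err := datasets(ctx, client)
	if err != nil {
		// Return the error so Lambda reports it, rather than exiting the process.
		return nil, err
	}
	// Pass any iteration error back to Lambda instead of discarding it.
	return SliceResults(it)
}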

func datasets(ctx context.Context, client *bigquery.Client) (*bigquery.DatasetIterator, error) {
	it := client.Datasets(ctx)
	return it, nil
}

func SliceResults(iter *bigquery.DatasetIterator) ([]string, error) {
	var datasets []string
	for {
		dataset, err := iter.Next()

		if err == iterator.Done {
			break
		}
		if err != nil {
			return nil, err
		}

		datasets = append(datasets, dataset.DatasetID)
	}
	return datasets, nil
}

There's not much code here, so let's go through it. As with any Go program, the package declaration comes first, followed by the import statements. The AWS SDK packages are needed because the credentials file is pulled from S3. The Google packages provide the BigQuery client, its options, and the iterator.

package main

import (
	"context"
	"io/ioutil"
	"log"

	"cloud.google.com/go/bigquery"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

Initialize BigQuery Client

I've declared the BigQuery client and the ctx variable globally so that both can be used wherever necessary. The Go context needs to be passed to most of the functions that BigQuery has. Don't quote me on what the ctx variable does exactly. The Go blog post on context, linked under Go Context in the references, gives an in-depth explanation of it.

var client *bigquery.Client
var ctx context.Context

AWS Lambda runs a Go program's init function just like any other Go program does. I've used this init function to:

- Create the S3 session and get the credentials file

func init() {
	svc := s3.New(session.Must(session.NewSession()))
	input := &s3.GetObjectInput{
		Bucket: aws.String("name of bucket"),
		Key:    aws.String("name of json file"),
	}
	result, err := svc.GetObject(input)
	if err != nil {
		log.Fatalf("s3.GetObject: %v", err)
	}
	defer result.Body.Close()
	...
}

- Read the body into a slice of bytes and pass that into the option.WithCredentialsJSON function. Please also note that ctx is now initialized with an empty context object

func init() {
	...
	body, err := ioutil.ReadAll(result.Body)
	if err != nil {
		log.Fatalf("reading credentials: %v", err)
	}

	ctx = context.Background()

	client, err = bigquery.NewClient(ctx, "name of Bigquery project", option.WithCredentialsJSON(body))
	if err != nil {
		log.Fatalf("bigquery.NewClient: %v", err)
	}
}
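Since the bytes read from S3 go straight into the client, it can help to sanity-check them first. The sketch below assumes the credentials file is a Google service account key, which carries "type" and "project_id" fields. The checkCredentials helper and the sample values are hypothetical, and it uses only the standard library.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// serviceAccountKey holds the two fields of a service account key file
// that this check looks at. The real file has more fields.
type serviceAccountKey struct {
	Type      string `json:"type"`
	ProjectID string `json:"project_id"`
}

// checkCredentials is a hypothetical helper. It returns the project ID
// found in the key, or an error if the bytes do not look like a key.
func checkCredentials(body []byte) (string, error) {
	var key serviceAccountKey
	if err := json.Unmarshal(body, &key); err != nil {
		return "", fmt.Errorf("credentials are not valid JSON: %w", err)
	}
	if key.Type != "service_account" {
		return "", errors.New("credentials are not a service account key")
	}
	if key.ProjectID == "" {
		return "", errors.New("credentials have no project_id")
	}
	return key.ProjectID, nil
}

func main() {
	// A made-up, incomplete key used only to exercise the check.
	body := []byte(`{"type":"service_account","project_id":"my-project"}`)
	project, err := checkCredentials(body)
	fmt.Println(project, err)
}
```

In init, a check like this could run right after the body is read, with log.Fatalf on error, so a wrong file in the bucket fails with a clear message.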

Testing the Client

You now have a BigQuery client that you can have your way with. For my test, I did a simple call to get a list of datasets. The datasets function returns a custom iterator type, which I passed off to a simple function that converts it to a slice of strings. Lambda requires a main function that passes a handler to lambda.Start, and the handler returns the results.

func main() {
	lambda.Start(handler)
}

func handler() ([]string, error) {
	it, err := datasets(ctx, client)
	if err != nil {
		// Return the error so Lambda reports it, rather than exiting the process.
		return nil, err
	}
	// Pass any iteration error back to Lambda instead of discarding it.
	return SliceResults(it)
}

func datasets(ctx context.Context, client *bigquery.Client) (*bigquery.DatasetIterator, error) {
	it := client.Datasets(ctx)
	return it, nil
}

func SliceResults(iter *bigquery.DatasetIterator) ([]string, error) {
	var datasets []string
	for {
		dataset, err := iter.Next()

		if err == iterator.Done {
			break
		}
		if err != nil {
			return nil, err
		}

		datasets = append(datasets, dataset.DatasetID)
	}
	return datasets, nil
}
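To try the loop in SliceResults without a BigQuery project, the same shape can be run against a stand-in iterator. In this sketch, errDone plays the role of iterator.Done, and fakeIterator and drain are hypothetical names of my own; none of them come from the Google libraries.

```go
package main

import (
	"errors"
	"fmt"
)

// errDone is a stand-in for iterator.Done: a sentinel error that means
// "no more items", not a real failure.
var errDone = errors.New("no more items in iterator")

// fakeIterator is a hypothetical in-memory iterator over dataset IDs.
type fakeIterator struct {
	ids []string
	pos int
}

// Next returns the next ID, or errDone once the IDs are used up.
func (it *fakeIterator) Next() (string, error) {
	if it.pos >= len(it.ids) {
		return "", errDone
	}
	id := it.ids[it.pos]
	it.pos++
	return id, nil
}

// drain collects every ID, using the same loop shape as SliceResults.
func drain(it *fakeIterator) ([]string, error) {
	var ids []string
	for {
		id, err := it.Next()
		if err == errDone {
			break // normal end of iteration
		}
		if err != nil {
			return nil, err // a real error: stop and report it
		}
		ids = append(ids, id)
	}
	return ids, nil
}

func main() {
	ids, err := drain(&fakeIterator{ids: []string{"sales", "logs"}})
	fmt.Println(ids, err)
}
```

The order of the two checks matters: the sentinel is tested first so that reaching the end is not treated as a failure.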

You can test that the function works through the AWS Lambda console. A successful test returns the list of dataset IDs.
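As far as I understand it, the Go runtime for Lambda encodes the handler's return value as JSON, so the slice of dataset IDs should show up in the console as a JSON array. Here is a minimal sketch of that encoding with made-up dataset names. The responseJSON helper is hypothetical and only wraps the standard library's json.Marshal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// responseJSON is a hypothetical helper that encodes a handler result
// the way the standard encoding/json package does.
func responseJSON(datasets []string) (string, error) {
	out, err := json.Marshal(datasets)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	withData, _ := responseJSON([]string{"sales", "logs"})
	fmt.Println(withData)

	// A nil slice, which is what SliceResults returns when the project
	// has no datasets, encodes as null rather than an empty array.
	noData, _ := responseJSON(nil)
	fmt.Println(noData)
}
```

If callers expect an empty array for a project with no datasets, initializing the slice with make([]string, 0) in SliceResults would be one way to get that.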

Thanks for reading!

References

BigQuery Quickstart

https://cloud.google.com/bigquery/docs/quickstarts/quickstart-client-libraries

Create New Lambda Function

https://docs.aws.amazon.com/lambda/latest/dg/getting-started-create-function.html

Getting Started with S3

https://docs.aws.amazon.com/AmazonS3/latest/userguide/GetStartedWithS3.html

Get S3 Object

https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#S3.GetObject

BigQuery NewClient documentation

https://pkg.go.dev/cloud.google.com/go/bigquery#NewClient

BigQuery Datasets function

https://pkg.go.dev/cloud.google.com/go/bigquery#Client.Datasets

BigQuery NewClient option WithCredentialsJSON

https://pkg.go.dev/google.golang.org/api/option#WithCredentialsJSON

AWS Lambda with Go

https://docs.aws.amazon.com/lambda/latest/dg/lambda-golang.html

Go Iterator Package

https://pkg.go.dev/google.golang.org/api/iterator

Go Context

https://blog.golang.org/context