Created on February 5, 2021 at 9:20 pm
Updated on February 22, 2021 at 3:23 pm
Connecting Google BigQuery to AWS Lambda with Go
Update:
This method of getting credentials through AWS Lambda is not recommended. Because the credentials are fetched in init, a GET request is made to S3 every time Lambda starts a new instance of the function (a cold start). This has two problems. First, it slows the function down, since it has to pull a file from S3. Second, it adds to the cost. It's not expensive, but every 2.5 million GET requests cost about 1 USD.
Here's how to make Google BigQuery calls through AWS Lambda.
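To put that cost figure in perspective, here is a minimal sketch that converts a number of S3 GET requests into dollars. The rate is simply the one quoted above; actual S3 pricing depends on region and changes over time, and the function name estimateS3GetCost is made up for this illustration.

```go
package main

import "fmt"

// getRequestsPerDollar is the figure quoted in the text (roughly 2.5 million
// S3 GET requests per US dollar). Treat it as an assumption, not a price list.
const getRequestsPerDollar = 2_500_000.0

// estimateS3GetCost returns the approximate cost in USD of the given number
// of S3 GET requests at the assumed rate.
func estimateS3GetCost(requests int) float64 {
	return float64(requests) / getRequestsPerDollar
}

func main() {
	for _, n := range []int{100_000, 2_500_000, 10_000_000} {
		fmt.Printf("%d GET requests: about $%.2f\n", n, estimateS3GetCost(n))
	}
}
```

The dollar amount stays small even at millions of requests, so the added latency is probably the bigger concern of the two.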
Setup
There are three initial steps before putting any code to editor. The credentials JSON file from BigQuery will need to be generated. A Lambda function will need to be created. And you will need to create an S3 bucket and upload the credentials JSON file into it. These steps are pretty straightforward, so I won't be going in depth. Please follow the links in the References section to get started.
Once those are all set up, it's time to put code to editor. Here is the code in its entirety to start off with:
package main

import (
	"context"
	"io/ioutil"
	"log"

	"cloud.google.com/go/bigquery"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

// client and ctx are package-level so the handler can reuse them.
var client *bigquery.Client
var ctx context.Context

func init() {
	// Fetch the credentials JSON file from S3.
	svc := s3.New(session.Must(session.NewSession()))
	input := &s3.GetObjectInput{
		Bucket: aws.String("name of bucket"),
		Key:    aws.String("name of json file"),
	}
	result, err := svc.GetObject(input)
	if err != nil {
		log.Fatalf("s3.GetObject: %v", err)
	}
	defer result.Body.Close()

	// Read the file into a byte slice for option.WithCredentialsJSON.
	body, err := ioutil.ReadAll(result.Body)
	if err != nil {
		log.Fatalf("ioutil.ReadAll: %v", err)
	}

	ctx = context.Background()
	client, err = bigquery.NewClient(ctx, "name of Bigquery project", option.WithCredentialsJSON(body))
	if err != nil {
		log.Fatalf("bigquery.NewClient: %v", err)
	}
}

func main() {
	lambda.Start(handler)
}

// handler returns the IDs of all datasets in the project.
// Errors are returned to Lambda instead of terminating the process.
func handler() ([]string, error) {
	it, err := datasets(ctx, client)
	if err != nil {
		return nil, err
	}
	return SliceResults(it)
}

// datasets returns an iterator over the datasets in the project.
func datasets(ctx context.Context, client *bigquery.Client) (*bigquery.DatasetIterator, error) {
	it := client.Datasets(ctx)
	return it, nil
}

// SliceResults drains the iterator into a slice of dataset IDs.
func SliceResults(iter *bigquery.DatasetIterator) ([]string, error) {
	var datasets []string
	for {
		dataset, err := iter.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return nil, err
		}
		datasets = append(datasets, dataset.DatasetID)
	}
	return datasets, nil
}
There's not too much code here. Let's go through it. As with any Go app, there is the package declaration at the top followed by the import statements. The AWS packages are needed because the credentials file is pulled from S3 and the code runs on Lambda; the Google packages provide the BigQuery client, its iterator, and the credentials option.
package main
import (
	"context"
	"io/ioutil"
	"log"
	"cloud.google.com/go/bigquery"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)
Initialize BigQuery Client
I've initialized the BigQuery client and ctx variable globally so they can both be used whenever necessary. The Go context needs to be passed to most of the functions that BigQuery has. Don't quote me on what the ctx variable does exactly. There is a good blog post here that gives an in-depth explanation of it.
var client *bigquery.Client
var ctx context.Context
AWS Lambda can execute an init function just as any other Go program. I've used this init function to:
- Create the S3 session and get the credentials file
func init() {
	svc := s3.New(session.Must(session.NewSession()))
	input := &s3.GetObjectInput{
		Bucket: aws.String("name of bucket"),
		Key:    aws.String("name of json file"),
	}
	// Stop early if the object cannot be fetched; result would be unusable.
	result, err := svc.GetObject(input)
	if err != nil {
		log.Fatalf("s3.GetObject: %v", err)
	}
	defer result.Body.Close()
	...
}
- Read out the body into a slice of bytes and pass that into the option.WithCredentialsJSON function. Please also note that ctx is now initialized with an empty context object.
func init() {
	...
	body, err := ioutil.ReadAll(result.Body)
	if err != nil {
		log.Fatalf("ioutil.ReadAll: %v", err)
	}
	ctx = context.Background()
	client, err = bigquery.NewClient(ctx, "name of Bigquery project", option.WithCredentialsJSON(body))
	if err != nil {
		log.Fatalf("bigquery.NewClient: %v", err)
	}
}
Testing the Client
You now have a BigQuery client that you can have your way with. For my test, I did a simple call to get a list of datasets. The datasets function returns a custom iterator type that I passed off to a simple function to convert it to a slice of strings. Lambda requires a main function that passes a handler to lambda.Start, and the handler returns the results.
func main() {
	lambda.Start(handler)
}
// handler returns errors to Lambda instead of terminating the process.
func handler() ([]string, error) {
	it, err := datasets(ctx, client)
	if err != nil {
		return nil, err
	}
	return SliceResults(it)
}
func datasets(ctx context.Context, client *bigquery.Client) (*bigquery.DatasetIterator, error) {
	it := client.Datasets(ctx)
	return it, nil
}
// SliceResults drains the iterator into a slice of dataset IDs.
func SliceResults(iter *bigquery.DatasetIterator) ([]string, error) {
	var datasets []string
	for {
		dataset, err := iter.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return nil, err
		}
		datasets = append(datasets, dataset.DatasetID)
	}
	return datasets, nil
}
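The loop in SliceResults follows a common pattern: call Next until it returns a sentinel "done" error, and treat any other error as a failure. The sketch below reproduces that pattern with the standard library only, so it can be run without Google credentials. The errDone value and the sliceIterator type are stand-ins I made up for iterator.Done and bigquery.DatasetIterator; they are not part of either library.

```go
package main

import (
	"errors"
	"fmt"
)

// errDone is a local stand-in for iterator.Done.
var errDone = errors.New("no more items in iterator")

// sliceIterator is a hypothetical iterator over a fixed list of names.
type sliceIterator struct {
	items []string
	pos   int
}

// Next returns the next item, or errDone once the list is exhausted.
func (it *sliceIterator) Next() (string, error) {
	if it.pos >= len(it.items) {
		return "", errDone
	}
	item := it.items[it.pos]
	it.pos++
	return item, nil
}

// collect drains the iterator into a slice. It stops cleanly on errDone
// and returns any other error to the caller.
func collect(it *sliceIterator) ([]string, error) {
	var out []string
	for {
		item, err := it.Next()
		if errors.Is(err, errDone) {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, item)
	}
}

func main() {
	it := &sliceIterator{items: []string{"sales", "marketing"}}
	names, err := collect(it)
	fmt.Println(names, err)
}
```

Checking for the sentinel before the general error check matters: the "done" value is itself an error, so reversing the two checks would report a normal end of iteration as a failure.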
You can test that the function works through the AWS Lambda console. The response should be the list of dataset IDs in your project.
Thanks for reading!
References
BigQuery Quickstart
https://cloud.google.com/bigquery/docs/quickstarts/quickstart-client-libraries
Create New Lambda Function
https://docs.aws.amazon.com/lambda/latest/dg/getting-started-create-function.html
Getting Started with S3
https://docs.aws.amazon.com/AmazonS3/latest/userguide/GetStartedWithS3.html
Get S3 Object
https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#S3.GetObject
BigQuery NewClient documentation
https://pkg.go.dev/cloud.google.com/go/bigquery#NewClient
BigQuery Datasets function
https://pkg.go.dev/cloud.google.com/go/bigquery#Client.Datasets
BigQuery NewClient option WithCredentialsJSON
https://pkg.go.dev/google.golang.org/api/option#WithCredentialsJSON
AWS Lambda with Go
https://docs.aws.amazon.com/lambda/latest/dg/lambda-golang.html
Go Iterator Package
https://pkg.go.dev/google.golang.org/api/iterator
Go Context