AWS Lambda function for inserting DynamoDB records into Google BigQuery
This package depends on gcloud, which uses native code, so you have to build the package on Amazon Linux for it to run on AWS Lambda.
- Launch a new Amazon Linux instance
- Install required packages on the instance
sudo yum update
sudo yum install git-core
sudo yum groupinstall "Development Tools"
- Install nvm
curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.31.0/install.sh | bash
- Install Node.js
nvm install v0.10.44
- Clone repository
git clone https://github.com/runtakun/lambda-dynamodb-bigquery.git
- Change directory and install npm packages
cd lambda-dynamodb-bigquery
npm install
- Add configuration file and account key file
- Install grunt and build package
npm install -g grunt-cli
grunt lambda_package
- Create AWS Lambda function and upload package
Create the Lambda function in the AWS console and specify Node.js 0.10 in the Runtime section.
Create a service account in the developer console (link below) and download the key file.
https://console.cloud.google.com/permissions/serviceaccounts
You should specify the project name and dataset ID in a JSON file named gcpconfig.json.
Example:
{"project": "lambda-bigquery-sample", "dataset": "sample"}
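A minimal sketch of how such a file could be read and checked before use. The helper names validateGcpConfig and loadGcpConfig and the error message are hypothetical and are not taken from this package; only the file name gcpconfig.json and the project and dataset fields come from this README.

```javascript
'use strict';

var fs = require('fs');

// Hypothetical helper: check that an already-parsed configuration object
// carries the two fields this README describes as required.
function validateGcpConfig(config) {
  ['project', 'dataset'].forEach(function (key) {
    if (typeof config[key] !== 'string' || config[key].length === 0) {
      throw new Error('gcpconfig.json: missing required field "' + key + '"');
    }
  });
  return config;
}

// Hypothetical helper: read the JSON file from disk and validate it.
function loadGcpConfig(path) {
  return validateGcpConfig(JSON.parse(fs.readFileSync(path, 'utf8')));
}
```

Failing early on a missing field may be easier to debug than a rejected BigQuery request later on.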
By default, the BigQuery table has the same name as the DynamoDB table, but you can specify a different name in the configuration.
Example:
{"project": "lambda-bigquery-sample", "dataset": "sample", "table": "Sample"}
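The fallback described above could look roughly like the following sketch. It assumes the function is triggered by a DynamoDB stream and that the DynamoDB table name is read from the stream ARN; this README does not say how the package obtains the name, and the function names here are hypothetical.

```javascript
'use strict';

// Assumption: the event source is a DynamoDB stream, whose ARN has the form
// arn:aws:dynamodb:<region>:<account>:table/<name>/stream/<label>.
function dynamoTableFromArn(arn) {
  var match = /:table\/([^\/]+)/.exec(arn);
  if (!match) {
    throw new Error('Not a DynamoDB table ARN: ' + arn);
  }
  return match[1];
}

// Use the configured "table" when present, otherwise the DynamoDB table name.
function resolveTableName(config, eventSourceArn) {
  return config.table || dynamoTableFromArn(eventSourceArn);
}
```

With an ARN ending in table/Sample/stream/..., an empty configuration resolves to Sample, while a configuration containing "table": "Other" resolves to Other.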
You can use table partitioning by setting the configuration field tablePartitionPeriod to daily or monthly. You should create the base table before any data is inserted. For example, if you create a table named Sample on BigQuery, BigQuery creates a Sample20160401 table.
Example:
{"project": "lambda-bigquery-sample", "dataset": "sample", "tablePartitionPeriod": "monthly"}
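A rough sketch of how a partitioned table name might be derived from the base name. The suffix rules are assumptions, since this README only shows the single example Sample20160401: here daily is taken to mean a YYYYMMDD suffix and monthly a suffix for the first day of the month, both in UTC. The function name partitionedTableName is hypothetical.

```javascript
'use strict';

// Left-pad a one-digit number with a zero.
function pad2(n) {
  return (n < 10 ? '0' : '') + n;
}

// Assumptions (not confirmed by this package): dates are taken in UTC,
// "daily" appends YYYYMMDD, and "monthly" appends YYYYMM01.
// Any other period leaves the base name unchanged.
function partitionedTableName(baseName, period, date) {
  var year = date.getUTCFullYear();
  var month = pad2(date.getUTCMonth() + 1);
  if (period === 'daily') {
    return baseName + year + month + pad2(date.getUTCDate());
  }
  if (period === 'monthly') {
    return baseName + year + month + '01';
  }
  return baseName;
}
```

Under these assumptions, a record dated 15 April 2016 goes to Sample20160401 with monthly and to Sample20160415 with daily.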