How to build a RESTful API data pipeline with a Firebase Cloud Function and BigQuery
by bernt & torsten
In this article, I will show how I build a data pipeline with a Google Cloud Function. My use case, in this case, was to use a RESTful API to get the data from an external system and create new entries from the data and store it in BigQuery.
Setting up
Let’s get into it. First, we have to create a Firebase project in the
You must have the Firebase CLI installed. If you don’t have it install it with
npm install -g firebase-tools
and then configure it with
firebase login
Set up your project by running
firebase use --add
and select the firebase project you had created.
Install dependencies in the functions directory
cd functions;
npm install;
I have written my cloud function with javascript using node.js, the function reads an external RESTful API. To call the RESTful API I’m using the AXIOS, I also have to install firebase libraries and the BigQuery library. Here are the libraries I used.
npm install axios;
npm install firebase-functions;
npm install firebase-admin;
npm install @google-cloud/bigquery;
npm install http;
My package.json file looks like this you can copy this and run npm install
{
"name": "gpi-noonreport-to-bigquery",
"description": "This is a function that reads Emily API for noonreport table and store in BigQuery",
"scripts": {
"serve": "firebase serve --only functions",
"shell": "firebase functions:shell",
"start": "npm run shell",
"deploy": "firebase deploy --only functions",
"logs": "firebase functions:log",
"build": "echo 'build script executed'",
"lint": "eslint.js"
},
"dependencies": {
"@google-cloud/bigquery": "^4.2.1",
"axios": "^0.19.0",
"firebase-admin": "^8.5.0",
"firebase-functions": "^3.2.0"
},
"private": true,
"engines": {
"node": "8"
},
"devDependencies": {
"eslint": "^6.6.0"
}
}
This is the full code of my function, you could just copy it.
strict';
// Request Data From A URL
var axios = require('axios');
var https = require('https');
// Var Firebase Functions
var functions = require('firebase-functions');
const admin = require('firebase-admin');
// Initalise App
admin.initializeApp;
// Imports the Google Cloud client library
const {BigQuery} = require('@google-cloud/bigquery');
// Setting Timeout in Seconds - default is 1 minute
// The maximum value for timeoutSeconds is 540, or 9 minutes. Valid values for memory are:
// 128MB, 256MB, 512MB, 1GB, 2GB
const runtimeOpts = {
timeoutSeconds: 300,
memory: '512MB'
}
/**
* Read a RESTFul API and create new entries in a BigQuery table
*/
exports.getRESTFULApitoBigQuery = functions
.runWith(runtimeOpts)
.region('europe-west1')
.https.onRequest(async(req, res) => {
try {
// Intialise the read message and insert to BigQuery
await createMessage();
res.status(200).send("OK");
}
catch (error) {
// Handle the error
console.log(error);
res.status(500).send(error);
};
//This Query inserts data after charges completed
async function createMessage() {
// The API Key
let apikey = '<Your API Key>';
// Table to get data from
var apitable = '<your API function>';
// Get the current date
var today = new Date();
var dd = String(today.getDate()).padStart(2, '0');
var mm = String(today.getMonth() + 1).padStart(2, '0'); //January is 0!
var yyyy = today.getFullYear();
today = '0000' + '-' + '00' + '-' + '00';
// today = yyyy + '-' + mm + '-' + dd;
var url = 'https://test.com/api/' + apitable + '?apikey=' + apikey + '&syncdate=' + today;
// At request level
const agent = new https.Agent({
rejectUnauthorized: false
});
// axios.get(url)
await axios.get(url, { httpsAgent: agent })
.then(resp => {
try {
for(let artno in resp.data.noon_report) {
console.log(console.log(JSON.stringify(resp.data)));
// Create the BigQuery Row
var rows = {
local_time: resp.data[artno].local_time || resp.data.report_date,
status: resp.data[artno].status,
location: resp.data[artno].location,
country: resp.data[artno].country
};
insertBigQuery(rows);
}
console.log("For Loop end");
}
catch (error) {
// Handle the error
console.log(error);
res.status(500).send(error);
}
})
};
//This Query inserts data after charges completed
async function insertBigQuery(rows) {
// Create a client
const bigqueryClient = new BigQuery();
//Make use of a dataset
const datasetId = 'my-dataset';
//Make use of a table
const tableId = 'my-table';
console.log("ROW TO INSERT " + JSON.stringify(rows));
await bigqueryClient
.dataset(datasetId)
.table(tableId)
.insert(rows)
.catch(err => {
if (err && err.name === 'PartialFailureError') {
if (err.errors && err.errors.length > 0) {
console.log('Insert errors:');
err.errors.forEach(err => console.error(err));
}
} else {
console.error('ERROR:', err);
}
});
console.log("Inserted rows");
};
// End Function with status code 200
//res.status(200).send("OK");
});
URL Request and SSL handshake
A few things about the code, the URL that is used are for a specific system, so you would need to change the following code for the URL that you are going to use.
var url = 'https://test.com/api/' + apitable + '?apikey=' + apikey + '&syncdate=' + today;
In my URL I’m requesting data from an https:// site, I did not write any code to handle the SSL certificate handshake, instead, I created an https agent, that will ignore the SSL certificate authentication, this is the code snippet.
// At request level
const agent = new https.Agent({
rejectUnauthorized: false
});
Node.js server-specific settings
Depending on the size of the data that you are requesting – you can see the timeout value – default is 1 minute, you can also alter the amount of memory you need in running your function. This is the code that you alter that.
// Setting Timeout in Seconds - default is 1 minute
// The maximum value for timeoutSeconds is 540, or 9 minutes. Valid values for memory are:
// 128MB, 256MB, 512MB, 1GB, 2GB
const runtimeOpts = {
timeoutSeconds: 300,
memory: '512MB'
}
Specify Region
In the code you can specify the region that you want your function to run in, this is done in the below code.
exports.getRESTFULApitoBigQuery = functions
.runWith(runtimeOpts)
.region('europe-west1')
.https.onRequest(async(req, res)
Deploy your project using
firebase deploy
If you have many firebase cloud functions you need to specify the name of the cloud functions like this when you deploy
firebase deploy --only functions:getRESTFULApitoBigQuery
BigQuery
After you deploy your function and before you run it. You will need to create a BigQuery Dataset and a table – you can read https://cloud.google.com/bigquery/docs/datasets on how to setup.
In the cloud function code, you would need to change the following with the name of your dataset and table.
//Make use of a dataset
const datasetId = 'my-dataset';
//Make use of a table
const tableId = 'my-table';
Creating a BgQuery row, you would need to alter this code to match your URL response, depending on your data structure from the API call you make.
// Create the BigQuery Row
var rows = {
local_time: resp.data[artno].local_time || resp.data.report_date,
status: resp.data[artno].status,
location: resp.data[artno].location,
country: resp.data[artno].country
};
If you run the function the first time and you want the whole dataset from the RESTful API, in my cases I set the URL date to 0000-00-00
As BigQuery is a database that adds new records, you could create duplicates when testing your function, if that does happen you may want to empty your BigQuery table you can do it in many ways, you can read How to empty a table in BigQuery to learn how I do it.
To Run
As this is a function that is triggered by an HTTP request, you should go to the Firebase console and select functions from the left side menu. It will list your function and you will also see the https URL that you can copy and paste into another tab in your browser to run the function. After you run the function you can go to BigQuery and make a query to see if your data is there.
Why It’s Important to Exercise When You’re Over 60
Many of us find ourselves in a pickle as the years pile up. Once reliable sidekicks, our...
A Poem: The Last Time
You never know when it will be,
The last time you ski down slopes of snow,
A Poem: Time Millionaire
When the morning wakes, still and clear,
No more alarms, no more rush....