A serverless, scalable website preview service built using Node.js, Express.js, and memory-cache, and deployed using Up.
This repository is a follow-up to an article I wrote.
It is a RESTful API service (a microservice) that takes a website URL and replies with the site's title, description, and site name, along with a thumbnail preview of the first image found on the page. Scraping is done using @nunkisoftware/link-preview. It is serverless and runs on AWS Lambda as Function as a Service (FaaS). Since there are no servers or other hardware to manage, it can scale to very high loads, as Amazon automatically runs additional copies of our exported functions depending on the load.
• Install Up globally.
$ npm i -g up
• Then, you have two choices. Either clone this repo, install the local dependencies, and skip to the very last step,
$ git clone https://github.com/MustansirZia/serverless-link-preview
$ cd serverless-link-preview
$ npm i
OR follow along,
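• First, initialise the project yourself by creating these files. Use npm init so that package.json starts out as valid JSON, since npm cannot install packages into an empty package.json.
$ npm init -y
$ touch up.json app.js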
• Then, add a few local packages.
$ npm i express memory-cache cors @nunkisoftware/link-preview --save
• Add a scripts section to your package.json so Up knows how to start your Express server.
{
  "name": "serverless-link-preview",
  "version": "0.0.1",
  "description": "Serverless service to get website description and preview deployed on AWS Lambda.",
  "main": "app.js",
  "license": "MIT",
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {
    "@nunkisoftware/link-preview": "^0.2.0",
    "cors": "^2.8.4",
    "express": "^4.16.2",
    "memory-cache": "^0.2.0"
  }
}
• Write an Express server inside app.js with a single GET endpoint at / that takes a query param url. This is the URL of the website whose preview we require.
const express = require('express');
const linkPreview = require('@nunkisoftware/link-preview');
const mCache = require('memory-cache');
const cors = require('cors');
const app = express();
// Apply cors to provide asynchronous access from browsers.
app.use(cors());
// Validation middleware to simply check the url query param.
const validate = function (req, res, next) {
  const url = req.query.url;
  if (!url) {
    res.status(400).json({ message: 'url query param missing.' });
    return;
  }
  next();
};
// Function which returns an in-memory cache middleware.
const cache = function (duration) {
  return function (req, res, next) {
    const key = req.query.url;
    // Try to get cached response using url param as key.
    const cachedResponse = mCache.get(key);
    if (cachedResponse) {
      // Send cached response.
      res.json(cachedResponse);
      return;
    }
    // If cached response not present,
    // pass the request to the actual handler.
    res.originalJSON = res.json;
    res.json = function (result) {
      // Cache only successful responses for later use, so that an error
      // body is never replayed with a 200 status, then send the result.
      if (res.statusCode === 200) {
        mCache.put(key, result, duration * 1000);
      }
      res.originalJSON(result);
    };
    next();
  };
};
// Actual get handler with cache set to 3 minutes.
app.get('/', validate, cache(180), function (req, res) {
  const url = req.query.url;
  // Get the actual response from link-preview.
  linkPreview(url)
    .then(function (response) {
      if (!response.title) {
        // If the url given is incorrect.
        res.status(400).json({ message: 'Invalid URL given.' });
        return;
      }
      res.json(response);
    })
    .catch(function (err) {
      res.status(500).send('Internal Server Error.');
    });
});
// Listen on the port provided by Up.
app.listen(process.env.PORT || 3000);
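The validate middleware above only checks that the url param is present. If you want to reject malformed or non-HTTP values before they reach link-preview, a stricter check could look roughly like the sketch below. The names isHttpUrl and validateStrict are hypothetical and not part of this repo; the sketch relies only on the URL class that ships with Node.js.

```javascript
// Hypothetical helper (not part of this repo): true only for strings
// that parse as absolute http: or https: URLs.
function isHttpUrl(value) {
  if (typeof value !== 'string') return false;
  try {
    const parsed = new URL(value);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch (err) {
    // new URL() throws on input it cannot parse.
    return false;
  }
}

// Hypothetical stricter variant of the validate middleware.
function validateStrict(req, res, next) {
  if (!isHttpUrl(req.query.url)) {
    res.status(400).json({ message: 'url query param missing or not an http(s) URL.' });
    return;
  }
  next();
}
```

It would be used in place of validate in the app.get call, with everything else unchanged.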
Please note that we also keep an in-memory cache of recent website previews, so repeated requests for the same URL are served from the cache instead of querying link-preview every time (which is expensive in both time and resources).
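To make the caching idea concrete, here is a minimal sketch of a time-limited (TTL) cache built on a plain Map. It is an illustration of the technique only, not memory-cache's actual implementation; createTtlCache and its injectable clock are assumptions made for this example.

```javascript
// Illustrative TTL cache (hypothetical; not memory-cache's source).
// `now` is injectable so expiry can be exercised without waiting.
function createTtlCache(now = Date.now) {
  const entries = new Map();
  return {
    // Store a value that stays valid for ttlMs milliseconds.
    put(key, value, ttlMs) {
      entries.set(key, { value: value, expiresAt: now() + ttlMs });
    },
    // Return the cached value, or null if it is missing or expired.
    get(key) {
      const entry = entries.get(key);
      if (!entry) return null;
      if (now() >= entry.expiresAt) {
        entries.delete(key); // Evict lazily on read.
        return null;
      }
      return entry.value;
    },
  };
}
```

Keep in mind that a cache like this lives in the memory of a single process. On AWS Lambda each running copy of the function has its own memory, so a cached preview is only reused by requests that happen to reach the same warm instance.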
The following two steps can also be accomplished using environment variables, but keeping the credentials in a separate file is much cleaner and makes deployment as easy as running a single command, up.
• Add a single entry to our up.json so Up knows where and how to find our AWS credentials.
{
  "profile": "aws"
}
This is a one-time step and won't be required for subsequent Up deployments.
• Finally, create the AWS credentials file at ~/.aws/ and fill in your IAM credentials.
$ mkdir -p ~/.aws && touch ~/.aws/credentials
$ gedit ~/.aws/credentials
or
$ nano ~/.aws/credentials
and paste the following in. Replace $YOUR_ACCESS_ID and $YOUR_ACCESS_KEY with your own. You can generate them in the IAM section of the AWS console. It could be beneficial to create a new IAM user just for this purpose.
[aws]
aws_access_key_id = $YOUR_ACCESS_ID
aws_secret_access_key = $YOUR_ACCESS_KEY
Save the file and that's it.
To verify our installation, run npm start from the directory that houses our app.js.
From another terminal window, request our service like so. The quotes keep the shell from interpreting the ? characters.
$ curl "localhost:3000?url=https://www.youtube.com/watch?v=NUWViXhvW3k"
You should see JSON like the following, which verifies our installation.
{
  "url": "https://www.youtube.com/watch?v=NUWViXhvW3k",
  "image": "https://i.ytimg.com/vi/NUWViXhvW3k/maxresdefault.jpg",
  "imageWidth": null,
  "imageHeight": null,
  "imageType": null,
  "title": "Building the CLEANEST Desk Setup!!!",
  "description": "My setup tour: https://goo.gl/nv0nja ADD ME ON SNAPCHAT TO STAY UP TO DATE WITH MY SETUP PROGRESS: Snapchat: Kenneth.YT or KDKHD SNAP CODE: http://kennethkre...",
  "siteName": "YouTube"
}
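The request above works because the target URL carries only one query parameter of its own. A target that contains & would be split by the server into separate params, so the url value would arrive truncated. One way to avoid that is to percent-encode the target when building the request, roughly as sketched below. buildPreviewRequest is a hypothetical helper, not part of this repo.

```javascript
// Hypothetical helper: builds the request URL for the preview service,
// percent-encoding the target so its own ?, & and = survive intact.
function buildPreviewRequest(serviceBase, targetUrl) {
  const request = new URL(serviceBase);
  request.searchParams.set('url', targetUrl);
  return request.toString();
}

const example = buildPreviewRequest(
  'http://localhost:3000/',
  'https://example.com/search?q=cats&page=2'
);
console.log(example);
```

Express decodes the value again on the server side, so req.query.url receives the original, complete target URL.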
Inside the same directory, deploy the service with a single command.
$ up
After the deployment is complete, get the service's URL like so.
$ up url
The URL with the query param could look similar to this.
https://hfnuua77fd.execute-api.us-west-2.amazonaws.com/development?url=https://www.youtube.com/watch?v=NUWViXhvW3k
And there you have it, your own serverless and scalable website preview service built and deployed on AWS Lambda.
Query it with your favourite HTTP client from inside any application.
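As one possible client, the sketch below calls the service with fetch, which is available globally in browsers and in recent Node.js versions (18 and later). getPreview is a hypothetical helper, and serviceUrl stands for whatever up url printed for your deployment. The fetch function is passed in as a parameter only so the helper can be exercised without a network.

```javascript
// Hypothetical client helper: resolves with the preview JSON, or rejects
// if the service answers with a non-2xx status (for example 400 or 500).
async function getPreview(serviceUrl, targetUrl, fetchImpl = globalThis.fetch) {
  // URLSearchParams percent-encodes the target URL for us.
  const query = new URLSearchParams({ url: targetUrl });
  const response = await fetchImpl(`${serviceUrl}?${query}`);
  if (!response.ok) {
    throw new Error(`Preview request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

A caller would then use something like getPreview(serviceUrl, 'https://www.youtube.com/watch?v=NUWViXhvW3k').then(function (preview) { console.log(preview.title); }).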
• Documentation for Up.
• License: MIT.