How to collect and store data periodically with Serverless Functions

Build
Thomas Tacquet

Some applications need to collect and store data periodically, for instance when you poll a resource once an hour or monitor systems. Allocating static resources to this kind of periodic collection is inefficient because the workload is intermittent: the resources sit idle between runs. In these contexts, using Serverless Containers and Functions can help you save energy and time.

We will demonstrate how to use Serverless products to efficiently build and deploy tools that periodically collect images and metadata, storing them in a bucket and a database respectively. For this tutorial, we’ll put in place a global warming monitoring system.

Monitoring the impact of global warming through Serverless

The impact of global warming is becoming increasingly serious and is most evident in the melting of our glaciers. To observe and analyze this melting, we need ways to gather data about the phenomenon.

The use case we will look at is monitoring glaciers. To monitor the state of a group of glaciers, we want to create a public dataset of images and their associated metadata.

Data is not a solution, but it can help to understand our environment and increase awareness of what is happening. Plus, we can aggregate data to find interesting correlations.

Project outline

A lot of ski resort websites offer webcam data to check weather and skiing conditions, so we can take advantage of those livestreams to create our proof of concept with the fresh images.

How to build an infrastructure which periodically collects and stores data

We aim to create an infrastructure that easily scales up and down. As we don’t know anything about who will use the platform we’re building, we need flexibility first and foremost. We want something that can manage a lot of inputs from webcams and handle a lot of requests with a decent response time.

Our goal is to leverage the scaling benefits of serverless to create just such an infrastructure.

Using Containers and Functions is the way to go here because they scale to and from zero, up to multiple containers and functions running at the same time. All the scaling is managed by Scaleway, so we don’t need to configure and run multiple servers to handle the load.

Infrastructure diagram

For this use case, thinking Serverless and implementing managed databases and single purpose functions results in a simple but highly effective application and frees the developer from repetitive, low-value tasks related to configuration.

By design, Serverless handles a lot of different inputs, and in terms of the processing, we can easily add other resources. For example, we can have a GPU instance that starts once a week to analyze everything we collected.

In this case, we are polling webcams periodically, but it’s also easy to add IoT devices that push sensor data directly to our functions.

First, you need to create two persistent resources: an S3-compatible bucket for the images and a Postgres database for their metadata.

You need to keep the S3 and database credentials safe. To do so, use secrets to keep credentials stored safely when creating a function or namespace.

Secrets are environment variables that are injected into your function and stored securely, but not displayed in the console after initial validation. Add a key and a value.

We do this via the following variables:

DB_HOST
DB_PORT = 1234
DB_USER = username
DB_PASS = password
DB_NAME = dbname
S3_ENDPOINT = sample.s3.fr-par.scw.cloud
S3_ACCESSKEY = accessKey
S3_SECRET = secretKey

These are accessible as environment variables to our resources in the namespace, and all of them are available in your favorite programming language.

Functions and containers can be deployed in several ways, the most common of which are the Scaleway Console, the API, Terraform, and the Serverless Framework.

For now, we will use the Serverless Framework to deploy functions and containers. It’s a powerful solution that deploys our environment with just one command.

All instructions for Scaleway’s Serverless Framework plugin are available on our GitHub.

In this tutorial, I will use Go as my primary language, but note that everything can be done with other supported runtimes such as Node.js or Python.

The purpose of the function is to download images from a URL, then to store the images themselves in S3, and their metadata in a Postgres database.

To build this project, I used Go and the Serverless Framework. I published the source code of the Function, the Container, and the deployment configuration in my GitHub repository. It can serve as a bootstrap for many types of projects, and you can find instructions on how to use it at https://github.com/thomas-tacquet/scw-blogpost-glacier

What’s next for Serverless

There are multiple improvements you can make to the initial project. For example, you can experiment with modifications to the source code:

  • Function: Compress and/or resize images before sending them to S3.
  • Container: Add a route to get all the images at once.
  • Function: For each inserted record in the database, fetch the temperature from an API and add it to a new column in the database.
  • Container: Improve the front end to display thumbnails.

Now take a step back and consider how smoothly building an application goes when using Serverless. No configuration, no tiresome repetitive tasks, just building. I hope this inspired you to experiment with Serverless or even implement it in your projects.

If you have remarks or questions, or simply want to chat with us and our community about Serverless, join #serverless-functions and #serverless-containers in our open Scaleway Community Slack.
