How To Deploy An App On AWS Serverless Infrastructure With FaaS?
Machine learning and neural networks are becoming increasingly indispensable for many companies. One of the main problems they face is the deployment of such applications.
We want to show you a practical and convenient way to do this, one that does not require you to be an expert in cloud technologies or clusters. For this, we will use a serverless infrastructure.
Introduction
Recently, many tasks in products are solved using models created with machine learning or neural networks. Often these are tasks that were solved for years by conventional deterministic methods and are now easier and cheaper to solve with ML.
With modern frameworks such as Keras or TensorFlow and catalogs of ready-made solutions, it is becoming easier to create models that give the accuracy required for a product.
My colleagues call this the “commoditization of machine learning,” and in some ways they are right. The most important thing is that today it is easy to find, download, or train a model, and you want deploying it to be just as easy.
Also, when working at a start-up or a small company, you often need to quickly validate assumptions, not only technical but also market ones. And for this you need to deploy a model quickly and easily, expecting modest but real traffic.
Amazon, Google, and Microsoft have recently introduced FaaS – function as a service. These offerings are relatively cheap, easy to deploy (no Docker required), and can run an almost unlimited number of instances in parallel.
Now I’ll show you how to run TensorFlow / Keras models on AWS Lambda – Amazon’s FaaS. The result is an API for recognizing content in images that costs about $1 per 20,000 recognitions. Could it be cheaper? Maybe. Could it be simpler? Hardly.
Function-as-a-Service
Consider a diagram of various types of applications:
On the left is on-premise, where we own the server. Next is Infrastructure-as-a-Service, where we already work with a virtual machine – a server located in a data center. The next step is Platform-as-a-Service, where we no longer have access to the machine itself but manage the container in which the application runs. And finally, Function-as-a-Service, where we control only the code and everything else is hidden from us. As we will see later, this gives us some very useful capabilities.
AWS Lambda is the implementation of FaaS on the AWS platform. Briefly about the implementation: the deployment package is a zip archive containing your code and libraries – the same code as on your local machine. AWS deploys this code to containers depending on the number of external requests (triggers). There is effectively no upper limit: the default is 1,000 concurrent executions, but it can easily be raised to 10,000 and higher through support.
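To make this concrete, here is a minimal sketch of what a Lambda handler looks like in Python (the function name and the 'name' field are illustrative, not part of the package used later in this article):

def handler(event, context):
    # 'event' carries the trigger payload (an API Gateway request,
    # an S3 notification, and so on); 'context' holds runtime metadata.
    name = event.get('name', 'world')
    # The returned value is serialized and handed back to the caller.
    return {'message': 'Hello, %s!' % name}

You zip this file together with its dependencies and upload the archive; AWS takes care of running it in containers.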
The main advantages of AWS Lambda:
- It’s easy to deploy (no Docker required) – just code and libraries
- It is easy to connect to triggers (API Gateway, S3, SNS, DynamoDB)
- Good scaling – in production we have launched more than 40 thousand invocations at the same time, and more is possible.
- Low call cost. For my colleagues from business development, it is also important that microservices support a pay-as-you-go model. This makes the unit economics of using the model clear when scaling (see the back-of-the-envelope sketch below).
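To make the unit economics tangible, here is a hedged back-of-the-envelope calculation based on Lambda's pay-per-use pricing at the time of writing (a per-request fee plus a per-GB-second compute fee); the execution time below is an assumption, so check current pricing before relying on it:

# Rough Lambda cost sketch; prices and duration are assumptions.
PRICE_PER_REQUEST = 0.20 / 1e6       # $0.20 per million requests
PRICE_PER_GB_SECOND = 0.00001667     # compute price per GB-second

memory_gb = 1536 / 1024.0            # the memorySize we will use later
duration_s = 2.0                     # assumed warm execution time

cost_per_call = PRICE_PER_REQUEST + memory_gb * duration_s * PRICE_PER_GB_SECOND
print('%d requests per $1' % (1.0 / cost_per_call))  # ~20 thousand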
Why port a neural network to serverless
First of all, we should clarify that for our examples we use TensorFlow, an open-source framework that allows developers to create, train, and deploy machine learning models. It is currently the most popular library for deep learning and is used by experts and beginners alike.
At the moment, the main way to deploy machine learning models is a cluster. If we want to build a REST API for deep learning, it will look like this:
Seems cumbersome? On top of that, you will have to take care of the following things:
- Write the logic for distributing traffic across the cluster machines.
- Write the scaling logic, trying to find the golden mean between idle capacity and slow responses.
- Write the logic of the container’s behavior – logging, managing incoming requests.
On AWS Lambda, the architecture looks much simpler:
- First, this approach is very scalable. It can process up to 10 thousand concurrent requests without any additional scaling logic. This makes the architecture ideal for handling peak loads, since no capacity needs to be provisioned in advance.
- Second, you do not pay for server idle time. In a serverless architecture, you pay per request. This means that if you have 25,000 requests, you pay only for 25,000 requests, regardless of how the traffic arrived. Thus, not only does the cost become more transparent, but the cost itself is very low. For example, for the TensorFlow model I will show later, the cost is about 20-25 thousand requests per $1. A cluster with similar functionality costs much more and becomes more profitable only at a very large number of requests (> 1 million).
- Third, the infrastructure becomes much simpler. There is no need to work with Docker or to write scaling and load-distribution logic. In short, the company will not have to hire an extra person to support the infrastructure, and if you are a data scientist, you can do it yourself.
As you will see below, deploying the entire infrastructure for the application above requires no more than 4 lines of code.
It would be wrong not to mention the drawbacks of a serverless infrastructure and the cases where it will not work. AWS Lambda has strict limits on execution time and available memory, which should be kept in mind.
- First, as mentioned earlier, clusters become more profitable beyond a certain number of requests. If you have no peak load and a steadily large number of requests, a cluster will be more cost-effective.
- Second, AWS Lambda has a small but non-zero startup time (100-200 ms). On top of that, deep learning applications need time to download the model from S3. For the example below, the cold start is 4.5 seconds and the warm start is 3 seconds. For some applications this is not critical, but if your application is focused on processing a single request as fast as possible, a cluster will be the better option. A common mitigation for cold starts is sketched just below.
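If cold starts matter but you still want Lambda, a common workaround is to keep containers warm with a scheduled no-op invocation. A hedged sketch of how this could look in the serverless.yml we use below (the five-minute rate is an arbitrary illustrative choice):

functions:
  main:
    handler: index.handler
    events:
      # A periodic invocation keeps at least one container warm.
      - schedule: rate(5 minutes)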
Application
Now let’s move on to the practical part.
For this example, we use a fairly popular application of neural networks – image recognition. Our application takes a picture as input and returns a description of the object in it. Applications like this are widely used to filter images and classify collections of images into groups. Our application will try to recognize a photo of a panda.
We will use the following stack:
- API Gateway for managing requests.
- AWS Lambda for processing.
- Serverless framework for deployment.
"Hello world" code
First, you need to install and configure the Serverless framework, which we will use to orchestrate and deploy the application. Link to the guide.
Make an empty folder and run the following commands (‘serverless install’ fetches a ready-made template, ‘deploy’ creates the stack, and ‘invoke’ calls the deployed function and prints its logs):
serverless install -u -n tensorflow
cd tensorflow
serverless deploy
serverless invoke --function main --log
You will receive the following response:
/tmp/imagenet/imagenet_synset_to_human_label_map.txt
/tmp/imagenet/imagenet_2012_challenge_label_map_proto.pbtxt
/tmp/imagenet/classify_image_graph_def.pb
/tmp/imagenet/inputimage.jpg
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89107)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00779)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00296)
custard apple (score = 0.00147)
earthstar (score = 0.00117)
As you can see, our application has successfully recognized the panda picture (0.89).
Voila. We have successfully deployed a TensorFlow neural network for image recognition on AWS Lambda.
Let’s consider the code in more detail
Let’s start with the configuration file. Nothing non-standard – we use the basic configuration of AWS Lambda, requesting 1536 MB of memory and a 300-second timeout (the maximums available at the time of writing).
service: tensorflow

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python2.7
  memorySize: 1536
  timeout: 300

functions:
  main:
    handler: index.handler
If we look at the file ‘index.py’ itself, we will see that it first downloads the model (a ‘.pb’ file) to the ‘/tmp/’ folder on AWS Lambda and then imports it in the standard way via TensorFlow.
Below are links to the parts of the code on GitHub that you should keep in mind if you want to plug in your own model:
Downloading the model from S3:
strBucket = 'ryfeuslambda'
strKey = 'tensorflow/imagenet/classify_image_graph_def.pb'
strFile = '/tmp/imagenet/classify_image_graph_def.pb'
downloadFromS3(strBucket,strKey,strFile)
print(strFile)
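The ‘downloadFromS3’ helper is defined elsewhere in the package; a plausible implementation using boto3 (a sketch only, the actual helper may differ) could look like this:

import os
import boto3

def downloadFromS3(strBucket, strKey, strFile):
    # /tmp is the only writable path inside a Lambda container.
    strDir = os.path.dirname(strFile)
    if not os.path.exists(strDir):
        os.makedirs(strDir)
    # boto3 picks up the Lambda execution role's credentials automatically.
    boto3.client('s3').download_file(strBucket, strKey, strFile)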
Importing the model:

def create_graph():
    with tf.gfile.FastGFile(os.path.join('/tmp/imagenet/', 'classify_image_graph_def.pb'), 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        _ = tf.import_graph_def(graph_def, name='')
Downloading the input image:

strFile = '/tmp/imagenet/inputimage.jpg'
if ('imagelink' in event):
    urllib.urlretrieve(event['imagelink'], strFile)
else:
    strBucket = 'ryfeuslambda'
    strKey = 'tensorflow/imagenet/cropped_panda.jpg'
    downloadFromS3(strBucket,strKey,strFile)
print(strFile)
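Note that ‘/tmp’ persists for the lifetime of a warm container, which is likely why the warm start measured earlier is faster than the cold one. A natural optimization (a sketch, not necessarily what the package does) is to skip the S3 download when a previous invocation has already fetched the file:

# Reuse files cached by a previous invocation of the same warm container.
if not os.path.exists(strFile):
    downloadFromS3(strBucket, strKey, strFile)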
Getting predictions from the model:
softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})
predictions = np.squeeze(predictions)
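The softmax output is just a vector of probabilities over the ImageNet classes. To produce the human-readable lines seen in the output earlier, the top entries are mapped to labels, roughly like this (a sketch; ‘node_lookup’ is assumed to be built from the label-map files downloaded above):

# Print the top-5 predictions with human-readable labels.
top_k = predictions.argsort()[-5:][::-1]
for node_id in top_k:
    human_string = node_lookup.id_to_string(node_id)  # assumed helper
    score = predictions[node_id]
    print('%s (score = %.5f)' % (human_string, score))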
Now let’s add an API to the Lambda.
Example with API
The easiest way to add an API is to modify the configuration YAML file, attaching an HTTP event to the function:
service: tensorflow

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python2.7
  memorySize: 1536
  timeout: 300

functions:
  main:
    handler: index.handler
    events:
      - http: GET handler
Now let’s redeploy the stack:
serverless deploy
We get the following:
Service Information
service: tensorflow
stage: dev
region: us-east-1
stack: tensorflow-dev
api keys:
  None
endpoints:
  GET - https://<urlkey>.execute-api.us-east-1.amazonaws.com/dev/handler
functions:
  main: tensorflow-dev-main
To test the API, you can simply open it as a link:
https://<urlkey>.execute-api.us-east-1.amazonaws.com/dev/handler
Or use curl:
curl https://<urlkey>.execute-api.us-east-1.amazonaws.com/dev/handler
We will get:
{"return": "giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89107)"}
Conclusion
We created an API for a TensorFlow model based on AWS Lambda using the Serverless framework. Everything was done quite simply, and this approach saved us a lot of time compared to the traditional cluster approach.
By modifying the configuration file, you can connect many other AWS services, for example SQS for stream processing tasks, or build a chatbot using AWS Lex.
As a hobby, we port many libraries to make serverless more friendly. You can find them here. The project has an MIT license, so you can safely modify and use it for your tasks.
Libraries include the following examples:
- Machine learning (Scikit, LightGBM)
- Computer vision (Skimage, OpenCV, PIL)
- Text recognition (Tesseract)
- Text analysis (Spacy)
- Web scraping (Selenium, PhantomJS, lxml)
- API testing (WRK, pyresttest)
I’m very pleased to see how others use serverless for their projects. Be sure to leave feedback in the comments, and happy developing!