Building a serverless file upload system with AWS Lambda, S3, and API Gateway lets you handle file uploads without provisioning or maintaining servers, and you pay only for what you use.
In this system, API Gateway invokes a Lambda function when a client uploads a file, and the function processes the file and stores it in S3.
API Gateway acts as the entry point for the file upload, handling the request and passing it on to the Lambda function.
The Lambda function can then use the AWS SDK to upload the file to S3, where it can be stored and retrieved as needed.
This approach allows for a scalable and cost-effective way to handle file uploads, with the added benefit of being able to process and manipulate the uploaded files in real-time.
In a TypeScript implementation, the Lambda function can be written with the AWS SDK for JavaScript, which ships with TypeScript type definitions and provides a convenient API for interacting with AWS services.
Infrastructure Setup
To set up the infrastructure for this project, you'll need to create a DynamoDB table, an API in API Gateway, two Lambda functions, and an S3 bucket. CORS is enabled on the bucket so that your frontend can upload files using presigned URLs.
One Lambda function uploads a file to S3 and the other lists all the files uploaded by a user. Both are plugged into API Gateway, with CORS enabled on the API Gateway resources so that your frontend can call the API.
Here's a list of the infrastructure components you'll need to create:
- DynamoDB table
- API (in API Gateway)
- S3 bucket
- Two Lambda functions
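The second Lambda function, which lists a user's files, might look roughly like the sketch below. It assumes a hypothetical table schema with `userId` as the partition key and a `fileName` attribute, a `TABLE_NAME` environment variable, and a `userId` query string parameter; none of these names come from the original setup. The DynamoDB client is injectable so the handler can be exercised without AWS access.

```python
import json
import os


def list_files_handler(event, context, dynamodb_client=None, table_name=None):
    """Return the file names stored for one user (hypothetical schema)."""
    if dynamodb_client is None:
        import boto3  # available in the Lambda Python runtime; imported lazily

        dynamodb_client = boto3.client("dynamodb")
    table_name = table_name or os.environ["TABLE_NAME"]  # assumed variable name

    user_id = (event.get("queryStringParameters") or {}).get("userId")
    if not user_id:
        return {"statusCode": 400, "body": json.dumps({"error": "userId is required"})}

    # Query all items whose partition key matches the user.
    # Pagination (LastEvaluatedKey) is ignored to keep the sketch short.
    result = dynamodb_client.query(
        TableName=table_name,
        KeyConditionExpression="userId = :uid",
        ExpressionAttributeValues={":uid": {"S": user_id}},
    )
    files = [item["fileName"]["S"] for item in result.get("Items", [])]
    return {"statusCode": 200, "body": json.dumps({"files": files})}
```

Because the function only reads from the table, it needs no write permissions.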
Provisioning the Infrastructure
Provisioning means creating each component and wiring them together: a DynamoDB table, an API, two Lambda functions, and an S3 bucket.
We start with the API and the DynamoDB table, which serve as the foundation for the application, and an S3 bucket, which stores the uploaded files.
To let the frontend upload files directly with presigned URLs, we need to enable CORS on the S3 bucket. You can restrict CORS to your frontend domain, which is advisable in production, but for simplicity we'll allow all origins here.
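As a rough illustration, the bucket's CORS rules could be built and applied with the AWS SDK for Python as follows. The function names and the `MaxAgeSeconds` value are assumptions made for this sketch; `put_bucket_cors` is the SDK call that applies the configuration.

```python
def build_bucket_cors(allowed_origins=("*",)):
    """Build a CORS configuration that lets browsers PUT files to the bucket."""
    return {
        "CORSRules": [
            {
                "AllowedMethods": ["PUT"],  # presigned uploads use HTTP PUT
                "AllowedOrigins": list(allowed_origins),  # "*" allows every origin
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,  # assumed preflight cache duration
            }
        ]
    }


def apply_bucket_cors(s3_client, bucket_name, allowed_origins=("*",)):
    """Apply the CORS rules to a bucket using a boto3 S3 client."""
    s3_client.put_bucket_cors(
        Bucket=bucket_name,
        CORSConfiguration=build_bucket_cors(allowed_origins),
    )
```

Passing something like `("https://app.example.com",)` instead of the default restricts uploads to a single frontend origin.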
We create two Lambda functions, each with a specific purpose: one for uploading files to S3 and another for listing all files uploaded by a user.
We plug the Lambda functions into API Gateway, which will handle incoming requests and route them to the correct Lambda function. We also enable CORS on API Gateway resources, so our frontend can make API calls without issues.
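One detail worth keeping in mind: with a Lambda proxy integration, the function's response is passed through to the browser, so the function generally needs to return the CORS header itself. A minimal helper, with a hypothetical name and a permissive default origin, could look like this:

```python
import json


def cors_response(status_code, payload, allowed_origin="*"):
    """Build a Lambda proxy response that carries the CORS header."""
    return {
        "statusCode": status_code,
        "headers": {
            "Access-Control-Allow-Origin": allowed_origin,
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),  # proxy responses expect a string body
    }
```

Each handler can then end with something like `return cors_response(200, {"ok": True})`.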
To ensure our Lambda functions have the necessary permissions, we need to grant them access to the DynamoDB table and S3 bucket. The first Lambda function requires read and write access to the DynamoDB table and write access to the S3 bucket. The second Lambda function requires read access to both the DynamoDB table and S3 bucket.
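The permissions described above can be expressed as IAM policy documents. The sketch below builds them as Python dictionaries; the function names and the exact choice of actions are assumptions, picked to match "read and write" and "read" access as narrowly as seems reasonable.

```python
def upload_function_policy(table_arn, bucket_arn):
    """Policy for the upload function: read/write on the table, write on the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": table_arn,
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"{bucket_arn}/*",  # objects inside the bucket
            },
        ],
    }


def list_function_policy(table_arn, bucket_arn):
    """Policy for the listing function: read-only on the table and the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": table_arn,
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }
```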
Here's a summary of the components we need to create:
- DynamoDB table
- API
- S3 Bucket
- Two Lambda functions
Configuring the Environment
Before starting your project, create IAM credentials for programmatic access to AWS, granting only the permissions your user needs.
Install the Serverless Framework through npm by running "npm install -g serverless". This gives you the command-line tooling you need to configure your environment.
Next, configure the Serverless Framework to use your AWS credentials when interacting with AWS. This is where your IAM credentials come in.
To improve productivity, consider using an editor such as VSCode, which can make your development process smoother.
With everything configured, you're now ready to start your project.
API Gateway Setup
To set up API Gateway for file uploads, you'll need to create an S3 bucket and an IAM role for API Gateway to access it. Create a new API Gateway REST API and a resource for uploading files, then add a PUT method with an integration type of AWS Service, pointing to the S3 PUT API.
You can also use API Gateway to upload files to S3 via a Lambda function, which lets you validate or process a file before storing it. This involves creating a Lambda function with Python 3.10 as the runtime and a role with full S3 access (the AmazonS3FullAccess managed policy); in production, a policy scoped to the upload bucket is preferable.
Here are the general steps to set up API Gateway for file uploads:
- Create an S3 bucket and an IAM role for API Gateway
- Create a new API Gateway REST API and resource for uploading files
- Add a PUT method with an integration type of AWS Service, pointing to the S3 PUT API
- Or, use API Gateway to upload files to S3 via a Lambda function
Remember to add authentication to your API to prevent anonymous users from uploading files to your S3 storage.
API Gateway as Direct Proxy
API Gateway can be used as a direct proxy to S3 uploads if the payload size is always below the 10 MB limit.
You can create an S3 bucket for uploads, such as "my-apigw-uploads", and grant API Gateway access to it.
Create an IAM Role for API Gateway to access the bucket, specifying the "s3:PutObject" action and the bucket's ARN.
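A sketch of the two policy documents this role needs is shown below: a trust policy that lets API Gateway assume the role, and a permission policy limited to "s3:PutObject" on the bucket. The function names are illustrative, and the default bucket name is the example name used above.

```python
def apigw_trust_policy():
    """Trust policy allowing the API Gateway service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "apigateway.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def apigw_put_object_policy(bucket_name="my-apigw-uploads"):
    """Permission policy restricted to writing objects into one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
```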
You can create a new API Gateway REST API and add a resource for uploads, with a child resource for the object path.
Configure the PUT method for the object resource, selecting the "AWS Service" integration type and specifying the S3 PUT API.
In the Integration Request box, add the object path parameter and configure the binary media types to treat all media types as binaries.
Deploy the API to a new stage, such as "v1", and test the upload using Postman.
Once deployed, verify that the uploaded file is present in the S3 bucket.
Here's a summary of the steps:
- Create an S3 bucket for uploads.
- Grant API Gateway access to the bucket.
- Create a new API Gateway REST API and add a resource for uploads.
- Configure the PUT method for the object resource.
- Deploy the API to a new stage.
- Test the upload using Postman.
- Verify the uploaded file in the S3 bucket.
API Gateway Setup
You can create an API Gateway setup using the AWS Console, the AWS CLI, or a framework such as the Serverless Framework, which defines AWS resources in a serverless.yml file and deploys them through CloudFormation.
To start, you'll need to define the AWS resources you'll be using, such as an S3 bucket to store uploaded files. This can be done by specifying a variable for the bucket name, which can be overridden by a stage variable or a default value.
Next, you'll need to define an AWS IAM Role (UploadRole) that your Lambda function will use to access S3 and write logs to a CloudWatch log group. This role should follow the IAM principle of least privilege.
The stage variable is useful for distinguishing between development, QA, and production environments. It can be set in the provider configuration in serverless.yml, making it easy to switch between environments.
Generating a Pre-Signed URL
Generating a pre-signed URL lets clients send a file straight to S3 instead of passing it through your backend. Because the file never travels through API Gateway or Lambda, this approach suits large file uploads.
You can generate a pre-signed URL from a Lambda function using the AWS SDK. The function uses the SDK to sign a URL that allows the user to upload a file to S3. In this example the URL is valid for 60 seconds, so the client should start the upload promptly after receiving it.
The Lambda function will read the environment variables, such as the name of the DynamoDB table and the S3 bucket. It will also read the body of the request, which contains the user's ID and the file name. The function will then put an item in the DynamoDB table, storing the user's ID and the file name.
Here are the steps to generate a pre-signed URL:
- Read the environment variables
- Read the body of the request
- Put an item in the DynamoDB table
- Generate a presigned URL
- Return the presigned URL
The presigned URL is generated using the generate_presigned_url() call of the AWS SDK for Python. This call takes several parameters, including the operation to sign, the bucket name, the object name, and the expiration time. The expiration time is set to 60 seconds here; if you omit it, the SDK defaults to 3600 seconds.
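Putting the steps together, a handler along these lines is one way to do it. This is a sketch under several assumptions: the environment variable names `TABLE_NAME` and `BUCKET_NAME`, the body fields `userId` and `fileName`, the item layout, and the object key format are all hypothetical. The clients are injectable so the logic can be run without AWS access.

```python
import json
import os


def presign_handler(event, context, s3_client=None, dynamodb_client=None):
    """Record the upload in DynamoDB and return a presigned PUT URL."""
    if s3_client is None or dynamodb_client is None:
        import boto3  # available in the Lambda Python runtime; imported lazily

        s3_client = s3_client or boto3.client("s3")
        dynamodb_client = dynamodb_client or boto3.client("dynamodb")

    # 1. Read the environment variables (names are assumptions).
    table_name = os.environ["TABLE_NAME"]
    bucket_name = os.environ["BUCKET_NAME"]

    # 2. Read the body of the request.
    body = json.loads(event.get("body") or "{}")
    user_id, file_name = body.get("userId"), body.get("fileName")
    if not user_id or not file_name:
        return {"statusCode": 400, "body": json.dumps({"error": "userId and fileName are required"})}

    # 3. Put an item in the DynamoDB table.
    dynamodb_client.put_item(
        TableName=table_name,
        Item={"userId": {"S": user_id}, "fileName": {"S": file_name}},
    )

    # 4. Generate a presigned URL for a PUT upload, valid for 60 seconds.
    upload_url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket_name, "Key": f"{user_id}/{file_name}"},
        ExpiresIn=60,
    )

    # 5. Return the presigned URL.
    return {"statusCode": 200, "body": json.dumps({"uploadUrl": upload_url})}
```

Prefixing the object key with the user ID is a design choice for this sketch; it keeps each user's files under a separate prefix in the bucket.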
To expose the function, create a new API Gateway REST API and set up a GET method that returns the pre-signed URL. The integration type should be set to Lambda Function, with Use Lambda Proxy integration selected.
To test the flow, use Postman to send a GET request to the API Gateway endpoint, which should return the pre-signed URL. Then send a PUT request with the file as its body to that URL before it expires. If successful, you should receive an HTTP 200 response and find your uploaded file in the S3 bucket.
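The same upload can be scripted instead of sent from Postman. The sketch below uses only the Python standard library; the function names are illustrative, and it assumes the URL was signed for a PUT. If a content type was included when the URL was signed, the same value should be sent here.

```python
import urllib.request


def build_presigned_put(url, data, content_type="application/octet-stream"):
    """Build (but do not send) a PUT request carrying the file bytes."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def upload_to_presigned_url(url, data, content_type="application/octet-stream"):
    """Send the file to S3 through the presigned URL and return the HTTP status."""
    request = build_presigned_put(url, data, content_type)
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call such as `upload_to_presigned_url(upload_url, open("photo.jpg", "rb").read())` must happen before the URL expires.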
File Upload Process
The file upload process using AWS Lambda, S3, and API Gateway is a powerful and scalable solution for handling file uploads. This process involves several key steps that work together to ensure secure, efficient, and reliable file uploads.
API Gateway acts as a front door for applications to access data, business logic, or functionality from backend services, such as applications running on Amazon EC2 or code running on AWS Lambda. It provides a managed interface to handle incoming HTTP requests, enabling secure access and throttling.
To create a Lambda function, you'll need to choose Python 3.10 as the runtime and create a new role for the function. Once the function is created, you can attach a full-access S3 policy (the AmazonS3FullAccess managed policy) to the role to grant it the permissions it needs to upload files to S3.
When uploading files, it's essential to configure the API Gateway to accept binary media types, such as multipart/form-data. This allows the API Gateway to transform the payload into a base64 string when the Content-Type header matches the API's binary media types.
Here are the key steps to follow when uploading files using API Gateway and Lambda:
- Create a Lambda function with Python 3.10 as the runtime and a new role.
- Attach a full-access S3 policy (AmazonS3FullAccess) to the role.
- Configure the API Gateway to accept binary media types.
- Use a Lambda proxy integration, which delivers the payload to the function as a base64-encoded string.
- Parse the APIGatewayProxy event to extract the file and other fields from the form data.
- Upload the file to S3 and customize the filename when provided.
Note that API Gateway limits the payload size to 10 MB, so be aware of this limitation when uploading files.
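The steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the form field names `file` and `filename`, the `BUCKET_NAME` environment variable, and the small hand-written multipart parser are all choices made for this example. The parser handles well-formed "multipart/form-data" bodies only.

```python
import base64
import json
import os
from email.message import Message
from email.parser import HeaderParser


def parse_multipart(body, content_type):
    """Split a multipart/form-data body into text fields and uploaded files."""
    header = Message()
    header["Content-Type"] = content_type
    boundary = header.get_boundary()
    fields, files = {}, {}
    if boundary is None:
        return fields, files
    delimiter = b"\r\n--" + boundary.encode()
    for chunk in (b"\r\n" + body).split(delimiter)[1:]:
        if chunk.startswith(b"--"):  # closing delimiter reached
            break
        if chunk.startswith(b"\r\n"):
            chunk = chunk[2:]  # drop the line break after the delimiter
        header_blob, _, content = chunk.partition(b"\r\n\r\n")
        part = HeaderParser().parsestr(header_blob.decode("utf-8", "replace"))
        name = part.get_param("name", header="content-disposition")
        filename = part.get_filename()
        if filename:
            files[name] = (filename, content)
        else:
            fields[name] = content.decode("utf-8")
    return fields, files


def upload_handler(event, context, s3_client=None, bucket_name=None):
    """Decode the proxy event, extract the file, and store it in S3."""
    if s3_client is None:
        import boto3  # available in the Lambda Python runtime; imported lazily

        s3_client = boto3.client("s3")
    bucket_name = bucket_name or os.environ["BUCKET_NAME"]  # assumed variable name

    raw = event.get("body") or ""
    body = base64.b64decode(raw) if event.get("isBase64Encoded") else raw.encode()
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    fields, files = parse_multipart(body, headers.get("content-type", ""))
    if "file" not in files:
        return {"statusCode": 400, "body": json.dumps({"error": "no file field"})}

    original_name, content = files["file"]
    key = fields.get("filename") or original_name  # custom name when provided
    s3_client.put_object(Bucket=bucket_name, Key=key, Body=content)
    return {"statusCode": 200, "body": json.dumps({"key": key})}
```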
Coding and Testing
Coding the function involves installing the "parse-multipart" package using "npm install parse-multipart" to extract the payload's content. The "multipart/form-data" type was chosen to show how to build a function that integrates with the AWS ecosystem, but the approach is not restricted to this kind of binary data.
For API Gateway to pass files to the function as base64, you'll need to add the desired type to the binary media types in the API Gateway settings. The handler receives the event from API Gateway and uses "parseMultipart" to extract the file's content and name, then saves the file to the S3 bucket. The bucket name is an environment variable that will be injected into our function at the time of deployment.
The Lambda function needs to be configured in "serverless.yml", with the events section specifying "http" to integrate API Gateway with our Lambda function. We used HTTP APIs for this function because they offer features that are useful for web apps, such as CORS, OIDC support, and OAuth 2 authorization. Note that in the Serverless Framework the "http" event provisions a REST API, while "httpApi" provisions an HTTP API.
Coding the Function
To code the function, we need to extract the payload's content, which can be achieved using the "parse-multipart" package.
You can install it by running "npm install parse-multipart" in your Node.js environment.
API Gateway passes binary files base64-encoded, so you'll need to decode the body from base64 before passing it to parse-multipart.
In the handler, we receive the event from API Gateway and use "parseMultipart" to extract the file's content and name, which we then use to save the file to the S3 bucket.
The bucket name is an environment variable that will be injected into our function at the time of deployment.
We set the ACL (Access Control List) to "public-read", which makes the uploaded file readable by anyone who has its URL; omit it if uploads should stay private.
The Lambda function needs to be configured in "serverless.yml", where we declare each function inside the "functions" tree and give it a name and the other required attributes.
In the events section, we specify "http" to integrate API Gateway with our Lambda function.
API Gateway offers two ways to integrate an HTTP endpoint with Lambda: HTTP APIs and REST APIs.
Time to Test
Now that we've set up our function and deployed the service, it's time to test it.
We'll be using Insomnia, an open-source API client that lets us send various types of requests quickly and easily.
To test our function, we'll upload a single file as "multipart/form-data": create a request that points at the deployed endpoint, attach the file to a multipart form body, and send it.
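If you prefer a script to a GUI client, an equivalent "multipart/form-data" request can be built with the Python standard library. This is a sketch: the form field names `file` and `filename`, the POST method, and the endpoint URL are assumptions that depend on how your function and API are defined.

```python
import urllib.request
import uuid


def encode_multipart(fields, files):
    """Encode text fields and files as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    for name, (filename, content) in files.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            b"Content-Type: application/octet-stream",
            b"",
            content,
        ]
    lines += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def build_upload_request(endpoint_url, filename, content):
    """Build (but do not send) a POST request that uploads one file."""
    body, content_type = encode_multipart({}, {"file": (filename, content)})
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={"Content-Type": content_type},
    )
```

Passing the result to `urllib.request.urlopen(...)` sends the upload to the deployed endpoint.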
Sources
- https://medium.com/@vaishnavipolichetti/uploading-files-to-s3-using-api-gateway-via-lambda-function-57fb160f7b7c
- https://dev.to/slsbytheodo/learn-serverless-on-aws-step-by-step-upload-files-on-s3-50d4
- https://tmmr.uk/post/upload-to-s3-through-api-gateway/
- https://dev.to/jsangilve/uploading-files-to-s3-with-serveless-4ai1
- https://moduscreate.com/blog/upload-files-to-aws-s3-using-a-serverless-framework/