
To read a CSV file from an S3 bucket with an AWS Lambda trigger, you'll need to create an event trigger that watches for new or updated files in your S3 bucket.
The AWS Lambda function will then be triggered automatically whenever a new CSV file is uploaded to the S3 bucket.
You can configure your AWS Lambda function to read the CSV file from the S3 bucket by using the AWS SDK for your chosen programming language.
To do this, you'll need to install the AWS SDK and import it into your Lambda function code, as shown in the "Configuring the AWS SDK" section.
Setting Up S3 Bucket Event
To set up an S3 bucket event, you'll need to create a trigger for your Lambda function. Open your Lambda function and click "Add trigger" to get started. Select S3 as the trigger source and choose the bucket you've created, then select the event type "PUT", add a suffix of ".csv" to the trigger, and click "Add" to complete the setup.
You'll also need to write the function code and configure a test event to simulate an S3 upload. Navigate to the "Code" tab and write Python code that reads the contents of the CSV file using the Boto3 library; the test event then lets you exercise that code without uploading a real file.
Here's a step-by-step guide to creating a test event:
- Click Deploy to update the Lambda function with the new code.
- Choose "Create a New Test Event" and name your test event for reference.
- Search for and choose the "s3-put" template to mimic the event structure generated when an object is uploaded to an S3 bucket.
- Configure the Test Event JSON by replacing the placeholder values with actual data relevant to your S3 bucket and file.
- Trigger the test event by clicking the "Test" button to simulate an S3 upload event and trigger the execution of your Lambda function.
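For reference, the part of the s3-put template that the handler code in this tutorial actually reads boils down to the following structure. This is a sketch with placeholder bucket and key names; the real template contains many more fields:

```python
# Minimal "s3-put"-style test event, reduced to the fields the handler
# reads. The bucket name and object key are placeholders for your values.
test_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-test-bucket"},
                "object": {"key": "sample.csv"},
            }
        }
    ]
}
```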
Step 3b: Add Triggers to Lambda
To add triggers to your Lambda function, you'll need to open it from the functions list and click on "Add Trigger" in the function overview. This will bring up a window where you can set up the trigger.
Choose S3 as the source of the trigger. Select the bucket you created earlier from the Bucket dropdown menu. For Event types, select PUT and deselect "All object create events" so the Lambda is triggered only when a new file is uploaded via PUT; this way, you can ignore objects created in other ways, like HTTP POST or COPY.
In the Suffix field, enter ".csv" to make sure you only trigger the function for CSV files and ignore other file uploads. Be aware of the Recursive invocation warning, but in this case it shouldn't be an issue, since the Lambda function lacks write access to S3.
Here's a quick summary of the trigger settings:
- Source: S3
- Bucket: the bucket you created earlier
- Event types: PUT (with "All object create events" deselected)
- Suffix: .csv
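The console applies these settings for you, but the same trigger can be configured programmatically. Here's a sketch using Boto3, with a placeholder bucket name and function ARN; it also assumes S3 has already been granted permission to invoke the function:

```python
import boto3

s3 = boto3.client("s3")

# Equivalent of the console trigger settings above: PUT events only,
# filtered to keys ending in ".csv". Names and ARN are placeholders.
s3.put_bucket_notification_configuration(
    Bucket="my-test-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:csv_s3_Lambda",
                "Events": ["s3:ObjectCreated:Put"],
                "Filter": {
                    "Key": {
                        "FilterRules": [{"Name": "suffix", "Value": ".csv"}]
                    }
                },
            }
        ]
    },
)
```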
Creating Lambda Function
To create a Lambda function that reads a CSV file from an S3 bucket, start by navigating to the Lambda console and clicking on "Create function." Select "Author from Scratch" and name your function, such as "csv_s3_Lambda." Choose Python as the runtime and select the role you created with the necessary policy attached.
You will then need to import three modules in your code, which are used to invoke the S3 client and DynamoDB resource for interacting with AWS services programmatically, as sketched below.
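The text doesn't name the three modules at this point; for the S3-to-DynamoDB pattern described here, a plausible trio looks like this:

```python
import json   # serialize/deserialize JSON payloads
import csv    # parse rows out of the uploaded CSV file
import boto3  # AWS SDK for Python

s3_client = boto3.client("s3")         # invoke the S3 client
dynamodb = boto3.resource("dynamodb")  # invoke the DynamoDB resource
```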
Next, add the S3 trigger to the function as described above: select S3 as the source, choose the bucket you created, configure the event types so the Lambda is triggered only when a new CSV file is uploaded, and acknowledge the recursive invocation warning.
Creating a Lambda Execution Role with S3 Read Permissions
To create a Lambda execution role with S3 read permissions, you need to sign in to the AWS Console and navigate to the Identity and Access Management (IAM) console.
Click Roles and then click Create role to start the process. This will take you to the Create Role page.
Under Trusted entity type and Use case, specify the type of role you want to create by choosing the AWS service that will use the role, which in this case is Lambda.
Add the AmazonS3ReadOnlyAccess policy for read-only S3 access by clicking on it in the list of available policies.
Give the role a name and description, and review the attached policies to ensure everything is correct. Finally, click Create role to create the IAM role.
To attach the role to your Lambda function, go to the Lambda function in the Lambda console and click on the Execution role section. Click Edit and choose the IAM role you created.
Save the changes to attach the role to your Lambda function.
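If you'd rather script this than click through the console, the same role can be created with Boto3. This is a sketch with a hypothetical role name; a real function would also want the AWSLambdaBasicExecutionRole policy attached for CloudWatch logging:

```python
import json
import boto3

iam = boto3.client("iam")

# Trust policy that lets the Lambda service assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="csv-s3-lambda-role",  # hypothetical role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
    Description="Execution role for the CSV-reading Lambda function",
)

# Attach read-only S3 access, as in the console walkthrough above.
iam.attach_role_policy(
    RoleName="csv-s3-lambda-role",
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
)
```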
Step 4: Coding the Lambda Function
To code the Lambda function, navigate to the "Code" tab after creating the Lambda function trigger. Here, you will find the default code provided by Lambda.
The code source editor is where you'll write the Python code to read the contents of the CSV file. Import the Boto3 library, the Amazon Web Services (AWS) SDK for Python, to interact with AWS services programmatically.

Write Python code to read the contents of the CSV file: it should read in the data from the uploaded CSV file and print out each row.
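A minimal sketch of such a handler, using the standard Boto3 `get_object` call and the event fields shown earlier (names here are illustrative):

```python
import csv
import urllib.parse

import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Pull the bucket name and object key out of the S3 event record.
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(event["Records"][0]["s3"]["object"]["key"])

    # Fetch the uploaded object and decode its body into lines of text.
    response = s3.get_object(Bucket=bucket, Key=key)
    lines = response["Body"].read().decode("utf-8").splitlines()

    # Parse the CSV and print each row (output lands in CloudWatch Logs).
    for row in csv.reader(lines):
        print(row)
```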
Click Deploy to update the Lambda function with the new code you just wrote. This will make the changes live and ready for testing.
Next, you'll need to create a new test event to simulate an event that triggers your Lambda function. This is a crucial step in testing your function before making it live.
Setting Up Lambda Function Layers
To add external libraries to your Lambda function, you need to set up layers. This is necessary when your function requires libraries like pandas to process data efficiently.
You can create a deployment package that includes pandas and upload it as a Lambda layer. This is what we did in the example where we used pandas to calculate the average score for each student.
To attach a layer to your Lambda function, follow these steps:
- In the Lambda Function window, scroll down to find the “Layers” tab.
- Click on the “Add a layer” button.
You can select from AWS Layers, a custom layer you've created, or specify a layer ARN. We used the public pandas layer for AWS in the example.
Once you've selected the layer, click “Add” to attach it to your Lambda function. After attaching the layer, click on the “Test” button to run the Lambda function and confirm that it can import the library successfully.
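As an illustration of the average-score example mentioned above, here's a sketch of a handler that uses pandas once the layer is attached. The `student` and `score` column names are assumptions about the CSV's contents:

```python
import boto3
import pandas as pd  # resolved from the attached pandas layer

s3 = boto3.client("s3")

def lambda_handler(event, context):
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = event["Records"][0]["s3"]["object"]["key"]

    # The S3 response body is file-like, so pandas can read it directly.
    obj = s3.get_object(Bucket=bucket, Key=key)
    df = pd.read_csv(obj["Body"])

    # Average score per student; column names are assumed for illustration.
    averages = df.groupby("student")["score"].mean()
    print(averages)
```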
Lambda Function Configuration
To configure a Lambda function for a read-CSV-from-S3 trigger, you'll need to set up the S3 event notification, which includes the S3 bucket name and an optional prefix or suffix filter.
The Lambda function's handler is set to the Python function `lambda_handler`, which is defined in the `lambda_function.py` file.
The `event` parameter in the `lambda_handler` function contains the S3 object key and bucket name, which are used to read the CSV file from S3.
The `context` parameter provides information about the function's execution environment, such as the function name and the request ID of the invocation.
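As a quick illustration of what each parameter carries, here's a sketch; the event fields follow the standard S3 notification shape:

```python
def lambda_handler(event, context):
    # event: the S3 record that triggered this invocation.
    record = event["Records"][0]
    print(record["s3"]["bucket"]["name"])   # source bucket
    print(record["s3"]["object"]["key"])    # uploaded object key

    # context: runtime metadata about the execution environment.
    print(context.function_name)        # name of this Lambda function
    print(context.aws_request_id)       # unique ID for this invocation
    print(context.memory_limit_in_mb)   # configured memory, e.g. 512
```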
The `boto3` library is used to interact with AWS services, including S3; it comes preinstalled in the Lambda Python runtime.
The `pandas` library is used to read and manipulate the CSV file; since it is not preinstalled, it has to be added to the function's environment through a layer or deployment package, as described above.
The Lambda function's memory size is set to 512 MB, which is sufficient for reading a small to medium-sized CSV file from S3.
Input and Execution
To trigger a read from an S3 bucket, you need to configure an event source that monitors the bucket for new or updated files.
The event source can be an S3 bucket notification, which is a simple and efficient way to trigger a read from the bucket.
S3 bucket notifications can be configured to trigger on specific events, such as object creation or updates.
A bucket policy, by contrast, is not an event source: it is a JSON document that defines the permissions and access controls for the bucket, and it can be used to grant your Lambda function's execution role read access to the bucket's objects. What actually lets S3 invoke the function is a resource-based policy on the function itself, which the console adds automatically when you create the trigger; a sketch of the equivalent API call follows below.
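For completeness, that resource-based permission can also be granted explicitly with Boto3. This is a sketch with placeholder names and a hypothetical statement ID:

```python
import boto3

lambda_client = boto3.client("lambda")

# Allow S3 (restricted to one specific bucket) to invoke the function.
lambda_client.add_permission(
    FunctionName="csv_s3_Lambda",
    StatementId="allow-s3-invoke",          # hypothetical statement ID
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn="arn:aws:s3:::my-test-bucket",
)
```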
Frequently Asked Questions
How to read CSV file from S3 bucket using pandas?
To read a CSV file from an S3 bucket using pandas, use the `pd.read_csv()` function with the S3 URL as the argument, like this: `pd.read_csv("s3://my-test-bucket/sample.csv")`. Note that this requires the `s3fs` package so pandas can resolve `s3://` URLs. Alternatively, you can use the `awswrangler` library for a more efficient and AWS-native approach.