Learn how to automate multi-step technical processes by building a simple agent.
In this tutorial, you will learn how to build a Receipt Scanning agent. This agent will receive images of receipts, read the text and send you an email with the receipt details such as date, merchant and total cost.
To do this, the agent will use two tools: the OCR Plugin to read text from images and the Email Management tool to send the emails. The agent will also be connected to a Telegram bot to make communication easier.
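The platform's tools handle OCR and email for you, but it can help to see the data flow the agent performs. The following is a minimal Python sketch, not the platform's implementation: `Receipt`, `parse_receipt` and `build_email` are hypothetical names, and the parsing heuristics (merchant on the first line, a line starting with "Total") are assumptions that real receipts will often break.

```python
import re
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Receipt:
    date: str
    merchant: str
    total: str


def parse_receipt(ocr_text: str) -> Receipt:
    """Pull date, merchant and total out of raw OCR text using naive heuristics."""
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    # Assumption: the merchant name is the first non-empty line.
    merchant = lines[0] if lines else "unknown"
    # Assumption: dates look like 2024-03-15 or 15/03/2024.
    date_match = re.search(r"\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4}", ocr_text)
    # Assumption: the total is on a line that starts with the word "total".
    total_match = re.search(r"(?im)^\s*total\D*(\d+(?:[.,]\d{2})?)", ocr_text)
    return Receipt(
        date=date_match.group(0) if date_match else "unknown",
        merchant=merchant,
        total=total_match.group(1) if total_match else "unknown",
    )


def build_email(receipt: Receipt, to_addr: str) -> EmailMessage:
    """Compose (but do not send) the summary email for one receipt."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = f"Receipt: {receipt.merchant} on {receipt.date}"
    msg.set_content(
        f"Date: {receipt.date}\nMerchant: {receipt.merchant}\nTotal: {receipt.total}\n"
    )
    return msg


sample = "ACME MARKET\n2024-03-15\nCoffee 4.50\nSubtotal 11.00\nTotal: 12.50\n"
receipt = parse_receipt(sample)
email = build_email(receipt, "user@example.com")
print(email["Subject"])
```

In the actual agent, the language model does this extraction from the OCR output, so it copes with layouts that fixed patterns like these would miss.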
Engine - Build Section
Step 1: Create your first bot and set up configurations
Step-by-step instructions
Create a new workflow by clicking "+ New" on the left panel.
Your first node will appear on the canvas on the right.
Click "Set Bot" to configure bot settings.
Set Basic and Advanced configurations
Bot Settings - Configurations Section
Prompt:
Act as a personal receipt scanning and expense tracking assistant. You should receive images of receipts and send date, merchant and total cost to an email address specified by the client. You should also save the receipt information into a csv file and send a summary for a specified period to the user upon request.
Follow these steps:
Greet the user and introduce yourself as the receipt scanning and expense tracking assistant. Explain that you can process photos of receipts, extract the date and cost, and automatically send them to the user's specified email address.
Explain that you can also send a summary for a specified date range.
Ask the user for an email address.
Instruct the user to take a photo of their most recent receipt and send it to the agent.
Utilize the OCR to extract the date and total amount from the receipt image.
Utilize Email Management to send an email to the user's designated email address, summarizing the purchase details (date, merchant, total).
Send a confirmation message after you've sent the email.
This prompt gives a general description of the bot's purpose as well as specific step-by-step instructions. It also names the necessary tools (OCR and Email Management).
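The prompt also asks the bot to save receipt details to a CSV file and to summarize a date range on request. A rough Python sketch of that bookkeeping is shown here for illustration only; `append_receipt`, `summarize` and the three-column layout are assumptions, not something the platform prescribes.

```python
import csv
import io
from datetime import date

# Assumed column order for the receipt log (no header row is written).
FIELDS = ["date", "merchant", "total"]


def append_receipt(csv_file, receipt_date: date, merchant: str, total: float) -> None:
    """Append one receipt as a CSV row: ISO date, merchant, total with 2 decimals."""
    writer = csv.writer(csv_file)
    writer.writerow([receipt_date.isoformat(), merchant, f"{total:.2f}"])


def summarize(csv_file, start: date, end: date) -> dict:
    """Count receipts and sum totals where start <= receipt date <= end."""
    count, spent = 0, 0.0
    for row in csv.DictReader(csv_file, fieldnames=FIELDS):
        if start <= date.fromisoformat(row["date"]) <= end:
            count += 1
            spent += float(row["total"])
    return {"count": count, "total": round(spent, 2)}


# An in-memory buffer stands in for the CSV file on disk.
log = io.StringIO()
append_receipt(log, date(2024, 3, 1), "ACME MARKET", 12.50)
append_receipt(log, date(2024, 3, 20), "Cafe Uno", 4.25)
append_receipt(log, date(2024, 4, 2), "Book Nook", 30.00)

log.seek(0)
print(summarize(log, date(2024, 3, 1), date(2024, 3, 31)))
```

Floats keep the sketch short; for real expense tracking, `decimal.Decimal` avoids rounding surprises.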
Fault tolerance is set to 5 and Auto Retry to 1, which means that if an error occurs, the bot will try 5 x 1 = 5 times before terminating.
This is enough to tolerate minor errors and keep the bot usable without consuming too many tokens.
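The retry arithmetic can be pictured with a small Python sketch. It follows only the description above (attempts = fault tolerance x auto retry); how the platform actually combines the two settings internally is not documented here, and `run_with_retries` is a hypothetical helper.

```python
def run_with_retries(task, fault_tolerance: int = 5, auto_retry: int = 1):
    """Call task(attempt) until it succeeds or the attempt budget is used up."""
    # Assumption: the total budget is the product of the two settings.
    max_attempts = fault_tolerance * auto_retry
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(attempt)
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


def flaky_ocr(attempt: int) -> str:
    """Stand-in for a tool call that fails on its first two attempts."""
    if attempt < 3:
        raise TimeoutError("OCR service timed out")
    return f"succeeded on attempt {attempt}"


print(run_with_retries(flaky_ocr))
```

Each retry typically repeats a model or tool call, which is why a larger budget costs more tokens.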
For this bot, the default option, GPT-4o, is suitable.
This model has a relatively small context window, accepts images as input, and is moderately priced. It suits simple tasks that need to process images but do not require many tools or much memory.
To try out your agent through the Telegram channel, search for the bot directly in Telegram, or use the link or QR code in the Channel section of the bot configurations.
As you send messages to your bot, either through the built-in console or the connected Telegram bot, the Log Records of each node show live updates of the internal processing.