
Leora: AI-Powered Voice Sales Assistant

An AI agent built on a corporate LLM that enhances customer experience and sales efficiency

About the Project

Client:

Leobit (Internal Initiative)

Location:

Ukraine | USA

Company Size:

200+ Employees

Industry:

IT Services

Technologies:

Azure OpenAI

Azure Blob Storage

Azure Cognitive Search

Azure Cosmos DB

Azure Functions

.NET

Selenium

Azure Speech-to-Text

Azure Text-to-Speech

Three.js


Leora is Leobit’s AI-powered voice sales assistant designed to deliver instant, tailored responses to potential clients. Unlike traditional chatbots, Leora uses voice interaction and advanced AI to simulate natural conversations, so prospects and customers can receive the information they need without delay or manual search. Trained on the company’s data and case studies, Leora reflects Leobit’s domain knowledge and service portfolio.

This project showed that AI shouldn’t be used just for its own sake — it needs to solve real business problems. We created Leora to address specific challenges in our sales process, and by using the RAG approach, we ensured the assistant gave accurate, domain-specific answers instead of generic ones. The rapid prototyping allowed us to validate the concept within days and then focus on refining the user experience.

Yurii Shunkin

R&D Director at Leobit


Customer

Leobit is a .NET, AI, and web application development provider for technology companies and startups in the US and the EU. Our technology focus covers .NET, Angular, iOS, Android, Azure, .NET MAUI, Blazor, Flutter, Ruby, PHP, React, and a comprehensive range of other technologies from Microsoft, web, and mobile stacks. Leobit has a representative office in Austin, TX (USA) and development centers in Lviv (Ukraine), Tallinn (Estonia), and Krakow (Poland).

Business Challenge

Leobit aimed to develop a voice-based AI assistant capable of delivering accurate, personalized answers based on internal data and client use cases. The goal was to improve the user journey, reduce manual sales workload, and support seamless, natural communication with prospects.

Why Leobit

With deep expertise in AI and custom software development, Leobit was uniquely positioned to solve this internal challenge and build a sophisticated AI assistant that seamlessly integrates with its sales and marketing workflows. The team’s ability to combine voice interfaces, real-time response logic, and domain-specific knowledge ensured the project’s success.

Project in Detail

Leobit developed Leora with an emphasis on natural voice interaction and real-time information delivery.


We started by integrating the Azure OpenAI service using the GPT‑3.5 Turbo model. To manage and store documents, we set up Azure Blob Storage, which works like a central folder in the cloud. Then, we connected everything using Azure Cognitive Search, allowing the assistant to quickly find and understand information from the stored documents.
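For illustration, here is a minimal TypeScript sketch of the storage layer, assuming a container named "knowledge-base" with a Cognitive Search indexer configured over it; the names and environment variables are placeholders, not the actual configuration:

```typescript
// Storage layer sketch: source documents land in one Blob container,
// which an Azure Cognitive Search indexer then picks up (indexer config omitted).
import { BlobServiceClient } from "@azure/storage-blob";

const blobService = BlobServiceClient.fromConnectionString(
  process.env.STORAGE_CONNECTION_STRING!
);
const container = blobService.getContainerClient("knowledge-base");

// Upload one document, e.g. a case study exported as plain text.
export async function storeDocument(name: string, text: string): Promise<void> {
  await container.createIfNotExists();
  await container.getBlockBlobClient(name).upload(text, Buffer.byteLength(text));
}
```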

To improve the accuracy of responses, we created a prioritization strategy, according to which Leobit’s service descriptions were given the highest importance, followed by case studies and blog posts. That way, when the LLM queries Cognitive Search, it gets the most reliable, relevant content first.
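One way to express such a prioritization is a Cognitive Search scoring profile. The sketch below assumes each indexed document carries a numeric sourcePriority field (3 for service descriptions, 2 for case studies, 1 for blog posts) and that the index defines a profile, here called "boost-by-source", that boosts it:

```typescript
import { SearchClient, AzureKeyCredential } from "@azure/search-documents";

// Assumption: documents carry a numeric "sourcePriority" field
// (3 = service description, 2 = case study, 1 = blog post) and the index
// defines a scoring profile, here called "boost-by-source", that boosts it.
const search = new SearchClient(
  "https://<search-service>.search.windows.net",
  "leora-index",
  new AzureKeyCredential(process.env.SEARCH_KEY!)
);

export async function prioritizedSearch(query: string) {
  const results = await search.search(query, {
    scoringProfile: "boost-by-source", // service pages surface first
    top: 5,
  });
  for await (const hit of results.results) {
    console.log(hit.score, hit.document);
  }
}
```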

The web scraping part of our system uses Selenium, a popular web automation tool. Selenium is usually used by automation testers to simulate user actions in a web browser. We adapted Selenium to scrape web pages by automatically browsing the Leobit website and extracting the needed information. This data is then saved into a spreadsheet for further use.
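A simplified version of that scraping step might look like the following; the CSS selectors and output file are illustrative rather than the actual ones used:

```typescript
// Scraping sketch with selenium-webdriver; requires a local Chrome/chromedriver.
import { Builder, By } from "selenium-webdriver";
import { writeFileSync } from "node:fs";

export async function scrapePages(urls: string[]): Promise<void> {
  const driver = await new Builder().forBrowser("chrome").build();
  const rows: string[] = ["url,title,body"]; // spreadsheet header
  try {
    for (const url of urls) {
      await driver.get(url);
      const title = await driver.findElement(By.css("h1")).getText();
      const body = await driver.findElement(By.css("main")).getText();
      // JSON.stringify quotes each field so commas in text don't break the CSV.
      rows.push([url, title, body].map((f) => JSON.stringify(f)).join(","));
    }
  } finally {
    await driver.quit();
  }
  writeFileSync("scraped-content.csv", rows.join("\n"));
}
```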

We built the front end using modern React patterns to manage state and handle asynchronous communication with Azure services, including speech recognition, text generation, and voice output. React’s modular design allowed us to prototype quickly, test features in isolation, and update components independently as the system evolved. We wrote the codebase in TypeScript to reduce errors and improve long-term maintainability. For a faster and smoother development experience, we used Vite with Rollup, which also helped speed up the release process.
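As a rough sketch of the front-end pattern, the hook below wraps a hypothetical /api/ask backend endpoint that runs the pipeline and returns the generated answer:

```typescript
// Front-end sketch: a hook that sends the user's question to a hypothetical
// /api/ask endpoint and tracks loading state for the UI.
import { useCallback, useState } from "react";

export function useAssistant() {
  const [answer, setAnswer] = useState("");
  const [busy, setBusy] = useState(false);

  const ask = useCallback(async (question: string) => {
    setBusy(true);
    try {
      const res = await fetch("/api/ask", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ question }),
      });
      const data = await res.json();
      setAnswer(data.answer); // shown in the UI and handed to text-to-speech
    } finally {
      setBusy(false);
    }
  }, []);

  return { answer, busy, ask };
}
```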

We explored a few options and decided to go with a pre-made 3D character model, which we customized to reflect Leobit’s branding. We also used publicly available animations, like hand gestures and head movements, to make the avatar feel more natural. We used the Three.js engine to display the animated avatar in a browser. The result is a speaking, animated character that gestures and moves in sync with its voice.
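In broad strokes, loading and animating such an avatar with Three.js looks like this; the model path and the "Wave" clip name are placeholders:

```typescript
// Avatar sketch: load a glTF character, play a gesture clip, render in a loop.
import * as THREE from "three";
import { GLTFLoader } from "three/examples/jsm/loaders/GLTFLoader.js";

const scene = new THREE.Scene();
scene.add(new THREE.AmbientLight(0xffffff, 1));
const camera = new THREE.PerspectiveCamera(45, innerWidth / innerHeight, 0.1, 100);
camera.position.set(0, 1.5, 2.5);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

const clock = new THREE.Clock();
let mixer: THREE.AnimationMixer | undefined;

new GLTFLoader().load("/models/leora-avatar.glb", (gltf) => {
  scene.add(gltf.scene);
  mixer = new THREE.AnimationMixer(gltf.scene);
  // Play one of the publicly available gesture clips bundled with the model.
  const wave = THREE.AnimationClip.findByName(gltf.animations, "Wave");
  if (wave) mixer.clipAction(wave).play();
});

renderer.setAnimationLoop(() => {
  mixer?.update(clock.getDelta());
  renderer.render(scene, camera);
});
```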

While some parts of the system are already multilingual (e.g., Leobit’s LLM can understand and respond in many languages), the current voice interface works only in English. Users can speak to the system and hear its responses only in English, even though the model itself could handle other languages if given written input. In future updates, we plan to add multilingual support to the speech recognition and text-to-speech components, allowing users to speak and receive responses in different languages and making the entire experience fully multilingual.

Since the webpage is not public yet and is accessible only within the company, it is secured behind Azure Active Directory (Azure AD) using Enterprise ID authentication. This means users must be authorized through Azure AD to access the site, so that only company employees can access the content.
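Depending on the setup, this gatekeeping can be handled by Azure's built-in authentication or in the client. A client-side variant with MSAL might look like the sketch below, where the application and tenant IDs are placeholders:

```typescript
// Client-side sign-in sketch with @azure/msal-browser; IDs are placeholders.
import { PublicClientApplication } from "@azure/msal-browser";

const msal = new PublicClientApplication({
  auth: {
    clientId: "<app-client-id>",
    authority: "https://login.microsoftonline.com/<tenant-id>",
    redirectUri: window.location.origin,
  },
});

// Call once on startup; unauthenticated users are sent to the Azure AD login.
export async function ensureSignedIn(): Promise<void> {
  await msal.initialize();
  await msal.handleRedirectPromise(); // completes a login redirect, if any
  if (msal.getAllAccounts().length === 0) {
    await msal.loginRedirect({ scopes: ["User.Read"] });
  }
}
```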

We gather content from several sources, not just our public website, but also our CRM records, internal documents, case studies, and presentations. After collecting these materials, we run them through a cleanup pipeline: we remove duplicates, filter out irrelevant items, convert the files to a consistent format, and organize everything into a clear structure. Initially, we handled some of this work manually, but we have since automated most tasks with custom tools that run as Azure Functions. These functions now process new data once a day, keeping the knowledge base current without extra effort from the team.
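As a sketch, the daily refresh could be expressed as an Azure Functions v4 timer trigger in TypeScript; the collector and indexing steps below are placeholders standing in for the real tools:

```typescript
// Daily refresh sketch as an Azure Functions v4 timer trigger (NCRONTAB schedule).
import { app, InvocationContext, Timer } from "@azure/functions";

type Doc = { id: string; source: string; text: string };

// Placeholder steps standing in for the real collection and indexing tools.
async function collectSources(): Promise<Doc[]> {
  return []; // website scrape, CRM export, internal docs, presentations
}
async function pushToIndex(docs: Doc[]): Promise<void> {
  // upload the cleaned documents to the search index
}

app.timer("refreshKnowledgeBase", {
  schedule: "0 0 3 * * *", // once a day at 03:00 UTC
  handler: async (_timer: Timer, context: InvocationContext) => {
    const raw = await collectSources();
    // Deduplicate by id, then drop empty items before re-indexing.
    const unique = [...new Map(raw.map((d) => [d.id, d])).values()];
    const cleaned = unique.filter((d) => d.text.trim().length > 0);
    await pushToIndex(cleaned);
    context.log(`Refreshed ${cleaned.length} documents`);
  },
});
```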


Retrieval‑Augmented Generation (RAG) approach

To make the AI assistant smarter and more accurate, we used a technique called Retrieval‑Augmented Generation (RAG). When a user submits a query, it first goes to Azure Cognitive Search. This tool scans through stored documents to find the most relevant pieces of information. It starts with basic keyword matching, but it also supports semantic search, which means it can understand the meaning behind words, even if the exact terms aren’t used.

More advanced still is vector search, a key feature in modern AI systems. It converts both the query and the documents into numerical representations called embeddings. This makes it possible to match content based on meaning rather than wording, and even across languages. For example, someone could search in English and still get accurate results from documents written in German.

Once the search identifies the most relevant snippets, those are passed to the Azure OpenAI model (GPT‑3.5 Turbo). The model then generates a response based specifically on the retrieved content. This RAG setup was the foundation of our solution. We quickly built a prototype to test its potential, and within a few days, we had a working system. From there, we focused on refining and improving its performance.
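Putting the pieces together, the retrieval-and-generation flow might look like the following TypeScript sketch. The index, deployment, and embedding model names are assumptions, not the actual configuration:

```typescript
// RAG sketch: embed the query, retrieve top snippets by vector similarity,
// then generate an answer grounded only in the retrieved content.
import { AzureKeyCredential } from "@azure/core-auth";
import { SearchClient } from "@azure/search-documents";
import { OpenAIClient } from "@azure/openai";

type KbDoc = { content: string; contentVector: number[] };

const search = new SearchClient<KbDoc>(
  "https://<search-service>.search.windows.net",
  "leora-index",
  new AzureKeyCredential(process.env.SEARCH_KEY!)
);
const openai = new OpenAIClient(
  "https://<openai-resource>.openai.azure.com",
  new AzureKeyCredential(process.env.OPENAI_KEY!)
);

export async function answer(question: string): Promise<string> {
  // 1. Embed the query so matching works on meaning, not exact wording.
  const { data } = await openai.getEmbeddings("text-embedding-ada-002", [question]);

  // 2. Vector search for the most relevant snippets.
  const hits = await search.search(question, {
    vectorSearchOptions: {
      queries: [{
        kind: "vector",
        vector: data[0].embedding,
        kNearestNeighborsCount: 3,
        fields: ["contentVector"],
      }],
    },
    top: 3,
  });
  const snippets: string[] = [];
  for await (const hit of hits.results) snippets.push(hit.document.content);

  // 3. Let GPT-3.5 Turbo answer using only the retrieved context.
  const chat = await openai.getChatCompletions("gpt-35-turbo", [
    { role: "system", content: `Answer using only this context:\n${snippets.join("\n---\n")}` },
    { role: "user", content: question },
  ]);
  return chat.choices[0].message?.content ?? "";
}
```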


Speech‑to‑text and text‑to‑speech integration

For speech recognition and voice output, we used Azure Speech Studio’s real-time speech-to-text and custom neural voice services. When a user speaks into the microphone, the audio is sent to Azure’s Speech-to-Text service. This service converts the spoken words into text, which we show in the UI and pass along to the AI model for processing. Once the model generates a response, we send that text to Azure’s Text-to-Speech service. It returns an audio file, which we play so the user hears the response.
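A minimal sketch of that round trip with the Speech SDK for JavaScript/TypeScript, where the key and region come from the Azure Speech resource:

```typescript
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

const config = sdk.SpeechConfig.fromSubscription(
  process.env.SPEECH_KEY!,
  process.env.SPEECH_REGION!
);

// Speech-to-text: capture one utterance from the default microphone.
export function listen(): Promise<string> {
  const recognizer = new sdk.SpeechRecognizer(
    config,
    sdk.AudioConfig.fromDefaultMicrophoneInput()
  );
  return new Promise((resolve, reject) =>
    recognizer.recognizeOnceAsync(
      (result) => { recognizer.close(); resolve(result.text); },
      (err) => { recognizer.close(); reject(err); }
    )
  );
}

// Text-to-speech: speak the model's reply through the default speaker.
export function speak(text: string): Promise<void> {
  const synthesizer = new sdk.SpeechSynthesizer(config);
  return new Promise((resolve, reject) =>
    synthesizer.speakTextAsync(
      text,
      () => { synthesizer.close(); resolve(); },
      (err) => { synthesizer.close(); reject(err); }
    )
  );
}
```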

Azure also allows customization of the voice experience. You can choose different voices, adjust pitch and speed, and tweak pronunciation. For instance, Azure initially struggled with the pronunciation of “Leobit,” so we added our company name and its common variations to a custom pronunciation dictionary. We also fine-tuned the voice settings to match our preferred tone and pacing to create a smoother and more natural user experience.
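Such tweaks can be expressed in SSML and played through the same synthesizer; the voice name, prosody values, and the IPA transcription of "Leobit" below are illustrative guesses, not the actual dictionary entry:

```typescript
// Voice-customization sketch: SSML controls voice, pacing, and pronunciation.
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

const config = sdk.SpeechConfig.fromSubscription(
  process.env.SPEECH_KEY!,
  process.env.SPEECH_REGION!
);

const ssml = `
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <prosody rate="-5%" pitch="+2%">
      Hi, I'm Leora,
      <phoneme alphabet="ipa" ph="ˈliː.oʊ.bɪt">Leobit</phoneme>'s
      sales assistant. How can I help you today?
    </prosody>
  </voice>
</speak>`;

const synthesizer = new sdk.SpeechSynthesizer(config);
synthesizer.speakSsmlAsync(
  ssml,
  () => synthesizer.close(),
  (err) => { synthesizer.close(); console.error(err); }
);
```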

Technology Solution

  • Used embedding-based vector search to represent data numerically, enabling language-independent search.
  • Integrated speech-to-text and text-to-speech using Azure Speech Studio’s services.
  • Adopted Azure OpenAI (GPT-3.5 Turbo) for natural language processing and response generation.
  • Implemented a .NET Azure Function with Selenium for dynamic web data extraction.
  • Developed a React-powered front-end interface for interacting with the AI assistant.

Value Delivered

  • Real-time, tailored information without manual browsing.
  • Hands-free, voice-based interaction for greater convenience.
  • Increased productivity for the sales team.
  • Positive feedback from conference attendees on Leora’s voice interaction and unique functionality.