AI PoC: Video Transcription and Summarization Tool Built with Azure Services

Custom development of an AI-powered video transcription and summarization solution with a flexible architecture

ABOUT
the project

Client:

Internal Project of Leobit

Location:

Ukraine

Company Size:

100+ Employees

Services:

Technologies:

.NET

ASP.NET Core Web API

Azure Cosmos DB

Azure Blob Storage

Azure Speech Service

Azure OpenAI

Onion

Angular

TypeScript

RxJS

VideoJS

The Leobit team developed a .NET-based proof of concept (PoC) of a software solution that provides video transcription, storage, and AI-powered summarization. Our software development specialists leveraged API endpoints to integrate the app with Azure services that provide the above-mentioned functionality.

We experimented with an architectural approach to deliver an efficient AI-powered solution for video transcription and summarization. Its multi-layered architecture, based on API integrations with Azure services, allowed us to build the solution in a short time and keep it flexible, which is essential for its future expansion.

Yurii Shunkin

R&D Director at Leobit

Customer

We at Leobit decided to build this solution as our internal tool. It allowed us to experiment with an innovative architectural approach and fully utilize the vast potential of various Azure cloud services. In addition, this solution can be used in our internal workflows and can be especially useful to our marketing team.

Business Challenge

We had to find the most time- and resource-efficient way to build a flexible PoC that is ready for future updates, and to do it in the shortest time possible.

Project
in detail

While our primary goal was developing a PoC, we still needed to break the software development cycle into several critical stages.

We started the project by defining the features and selecting the tech stack behind the concept’s core functionality. Our specialists chose .NET and Angular as the core technologies. We also decided to use Azure Speech Service for speech recognition, Azure Cosmos DB for data storage, and Azure OpenAI for AI functionality. After that, our specialists defined integration points with Azure services and planned the solution’s architecture.

We developed a simple yet efficient application back end using .NET. Our primary goal was setting up API endpoints and Azure service connections. We configured data models and ensured storage integrations with the app’s back end. Our specialists also created a responsive app UI with Angular components. In particular, we integrated a video player into the solution.
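To make the idea of a data model concrete, a video record stored as a document in a database such as Azure Cosmos DB might look roughly like the sketch below. Every field name, the `ProcessingStatus` values, and the `advanceStatus` helper are illustrative assumptions, not the actual schema of the PoC. The PoC's back end is written in .NET; TypeScript is used here only to keep the example compact.

```typescript
// Hypothetical document shape for one uploaded video (not the PoC's real schema).
type ProcessingStatus = "uploaded" | "transcribing" | "summarizing" | "ready" | "failed";

interface TranscriptSegment {
  speaker: string;      // label produced by speaker recognition, e.g. "Speaker 1"
  startSeconds: number; // offset from the start of the video
  endSeconds: number;
  text: string;
}

interface VideoRecord {
  id: string;
  title: string;
  blobUrl: string;          // where the video file lives in blob storage
  status: ProcessingStatus;
  segments: TranscriptSegment[];
  summary?: string;         // filled in once summarization finishes
}

// Allowed transitions; "failed" is reachable from every non-final state.
const nextStatus: Record<ProcessingStatus, ProcessingStatus[]> = {
  uploaded: ["transcribing", "failed"],
  transcribing: ["summarizing", "failed"],
  summarizing: ["ready", "failed"],
  ready: [],
  failed: [],
};

// Returns a copy of the record with the new status, or throws on an invalid jump.
function advanceStatus(record: VideoRecord, to: ProcessingStatus): VideoRecord {
  if (nextStatus[record.status].indexOf(to) === -1) {
    throw new Error(`Cannot move from ${record.status} to ${to}`);
  }
  return { ...record, status: to };
}
```

Keeping the status on the record is one simple way to let a UI show how far background processing has progressed.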

After building the solution’s core, we connected its front end and back end and ran test workflows. Testing revealed several issues, which we resolved. In particular, we improved performance with large files by creating a workflow that uploads video files to Azure Blob Storage. We sped up speech-to-text processing by extracting the audio from the video before transcription, and we adjusted video encoding for quicker video loading. We also tuned speech transcription requests for higher accuracy, refined the AI prompts used for summary generation, set up proper error handling, and configured CI/CD deployment.
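One common way to handle large uploads is to split the file into fixed-size blocks and send them separately; Azure block blobs support this by staging blocks and then committing them as a list. The sketch below only plans the byte ranges. The 4 MiB block size and the `planBlocks` helper are assumptions for illustration, and the source does not say which upload strategy the PoC actually uses.

```typescript
interface BlockRange {
  index: number; // position of the block in the final blob
  start: number; // inclusive byte offset
  end: number;   // exclusive byte offset
}

const DEFAULT_BLOCK_SIZE = 4 * 1024 * 1024; // 4 MiB, an assumed value

// Splits a file of `fileSize` bytes into consecutive ranges of at most `blockSize` bytes.
function planBlocks(fileSize: number, blockSize: number = DEFAULT_BLOCK_SIZE): BlockRange[] {
  if (fileSize < 0 || blockSize <= 0) {
    throw new Error("fileSize must be >= 0 and blockSize must be > 0");
  }
  const blocks: BlockRange[] = [];
  for (let start = 0, index = 0; start < fileSize; start += blockSize, index++) {
    // The last block is shorter when fileSize is not a multiple of blockSize.
    blocks.push({ index, start, end: Math.min(start + blockSize, fileSize) });
  }
  return blocks;
}
```

In a browser, each range could be passed to `File.slice(start, end)` to obtain the chunk to upload.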

Once we stabilized our PoC, we configured hosting and deployed the solution. We also documented each stage of the development process, along with the solution’s critical endpoints. This documentation will help us in later stages, when we decide to build a full-fledged product based on our PoC.

Efficient and Flexible Architecture

We created an efficient layered architecture that uses APIs to connect the app to the Azure services that power its core functionality. A custom API controller connects the solution to the service layer, which includes Azure Blob Storage for efficient video upload, Azure Speech Service for speech recognition, and Azure OpenAI Service for generating video summaries.

This approach to the concept's development ensures great flexibility and efficiency. The solution fully leverages the capacities of Azure's cloud services and can be expanded further with new functionality.
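The layering described above can be sketched as follows: the core defines abstractions, an application service orchestrates them, and infrastructure adapters implement them. All interface, class, and method names here are hypothetical, and in-memory stand-ins take the place of real Azure clients. The PoC's back end is written in .NET; TypeScript is used only for illustration.

```typescript
// Core layer: abstractions the application depends on (names are hypothetical).
interface VideoStorage {
  upload(name: string, data: Uint8Array): Promise<string>; // resolves to a storage URL
}
interface SpeechTranscriber {
  transcribe(videoUrl: string): Promise<string>;
}
interface Summarizer {
  summarize(transcript: string): Promise<string>;
}

interface ProcessedVideo {
  url: string;
  transcript: string;
  summary: string;
}

// Application layer: orchestrates the workflow without knowing about Azure.
class VideoProcessingService {
  constructor(
    private readonly storage: VideoStorage,
    private readonly transcriber: SpeechTranscriber,
    private readonly summarizer: Summarizer,
  ) {}

  async process(name: string, data: Uint8Array): Promise<ProcessedVideo> {
    const url = await this.storage.upload(name, data);
    const transcript = await this.transcriber.transcribe(url);
    const summary = await this.summarizer.summarize(transcript);
    return { url, transcript, summary };
  }
}

// Infrastructure layer: in-memory stand-ins used here instead of real Azure clients.
const fakeStorage: VideoStorage = {
  upload: async (name) => `memory://videos/${name}`,
};
const fakeTranscriber: SpeechTranscriber = {
  transcribe: async (videoUrl) => `transcript of ${videoUrl}`,
};
const fakeSummarizer: Summarizer = {
  summarize: async (transcript) => `summary: ${transcript}`,
};

const videoService = new VideoProcessingService(fakeStorage, fakeTranscriber, fakeSummarizer);
```

Because the service depends only on the interfaces, an adapter for a different storage or AI provider could be swapped in without touching the orchestration code, which is the kind of flexibility the layered approach aims for.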

Comprehensive Functionality for Processing Videos

The solution provides a rich set of features for storing, editing, and managing videos. It involves functionality for:

– Uploading video files.
– Playing the video back with standard player controls.
– Automatically transcribing speech in videos, with speaker recognition.
– Processing transcription in the background, with progress displayed in real time.
– Generating summaries of transcriptions.
– Finding particular moments in videos by clicking on transcription segments.
– Automatically generating thumbnails for a video.
– Storing videos, transcriptions, and summaries.
– Sharing them via unique links.

As mentioned above, the PoC has strong potential for further expansion of its functionality.
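Jumping to a moment by clicking a transcription segment, and highlighting the segment that matches the current playback position, both come down to mapping between segments and timestamps. A minimal sketch, assuming segments carry start and end times in seconds and are sorted without overlaps; the `Segment` shape and both helper functions are hypothetical:

```typescript
interface Segment {
  startSeconds: number;
  endSeconds: number;
  text: string;
}

// Binary search for the segment that contains `time`; segments must be sorted
// by startSeconds and must not overlap. Returns -1 when `time` falls in a gap.
function findSegmentIndex(segments: Segment[], time: number): number {
  let lo = 0;
  let hi = segments.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const s = segments[mid];
    if (time < s.startSeconds) hi = mid - 1;
    else if (time >= s.endSeconds) lo = mid + 1;
    else return mid;
  }
  return -1;
}

// Anything with a settable currentTime, e.g. an HTMLVideoElement.
interface Seekable {
  currentTime: number;
}

// Moves playback to the start of the clicked segment.
function seekToSegment(player: Seekable, segment: Segment): void {
  player.currentTime = segment.startSeconds;
}
```

With a Video.js player, the equivalent of the assignment in `seekToSegment` is the method call `player.currentTime(seconds)`.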

A Functional App PoC Built with .NET and Angular

We leveraged our .NET expertise to create an efficient app back end that connects seamlessly with our API controller. We also created a functional interface that lets users test the PoC's core features. Our specialists used Angular to build the app's front end, which provides features for uploading, viewing, transcribing, and summarizing video content.

Technology Solutions

  • Flexible layered architecture built according to the principles of the Onion architectural pattern.
  • Custom API controller connecting the app with a variety of Azure’s services, including AI tools.
  • Access to the secure and well-organized Azure Cosmos DB database.
  • Azure Blob Storage for fast video uploads and convenient storage.
  • Integration with the Azure Speech Service for transcribing videos and speaker recognition.
  • Integration with Azure OpenAI for AI-powered video summarization.

Value Delivered

  • An innovative and highly efficient architectural approach based on integrations with Azure’s services.
  • Great flexibility and potential for continuous improvement of the PoC into a full-fledged product.
  • A fully usable PoC that verifies our innovative architectural concept, built in less than two weeks.