Government-Funded Multimodal Large Language Model Initiative for Indian Languages

Supported by the National Mission on Interdisciplinary Cyber-Physical Systems, Department of Science & Technology (NM-ICPS, DST)

Implementing Agency

Consortium led by the TIH Foundation for IoT and IoE at IIT Bombay

About BharatGen

BharatGen is a multimodal large language model initiative, developing advanced generative AI models tailored to India's linguistic, cultural, and socio-economic diversity. At its core is Bharat Data Sagar, a vast repository of India-centric data that ensures the AI models are deeply rooted in the country’s unique context. By integrating text, speech, and images, BharatGen builds accessible AI technologies that foster innovation across key sectors like agriculture, education, and healthcare, ensuring inclusivity for India’s diverse population.

Objectives

Bharat Data Sagar

Bharat Data Sagar is a multilingual data repository that reflects India's diversity, supplying AI models with locally relevant datasets while preserving data sovereignty.

Data and Compute-Efficient Learning

BharatGen focuses on efficient AI models for Indian languages with limited digital presence, ensuring high performance with minimal data.

Multimodal AI Models

BharatGen integrates text, speech, and images into AI models for inclusivity and robust solutions across Indian languages.

Startup Ecosystem and Innovation

BharatGen supports the startup ecosystem with tools, mentorship, and collaboration, empowering entrepreneurs to create innovative AI applications for India.

Academic Research and Industry Implementation

BharatGen combines academic research and industry expertise through public-private partnerships to build scalable AI, positioning India as a global leader across agriculture, education, and healthcare.

Skilling and Capacity Building

BharatGen aims to strengthen India's AI talent pool through fellowships, hackathons, and courses, positioning India as a global innovation hub.

Impact