Preserving Indian folk music in digital repositories is difficult because existing classification systems do not adequately capture its linguistic, instrumental, and acoustic diversity. As a cornerstone of India’s intangible cultural heritage, this music risks marginalisation and loss unless systematic, scalable methods are used to identify and preserve it. This research develops an automated, multimodal framework for the regional classification of Indian folk music, enabling structured archiving and improved accessibility. The proposed machine learning pipeline uses Whisper for speech recognition and regional language identification, YAMNet for instrument detection, including traditional instruments, and Librosa for extracting acoustic features such as MFCCs, chroma, and spectral descriptors. Together, these tools characterise the linguistic, instrumental, and rhythmic content of each song. The curated dataset covers folk music from several linguistically rich traditions of India, including Marathi, Punjabi, Urdu Qawwali, and dialects from Uttar Pradesh and Bihar. Seven supervised learning algorithms were trained and evaluated, including Random Forest, Support Vector Machine (SVM), and Gradient Boosting, alongside simpler classifiers such as K-Nearest Neighbours, Naive Bayes, and Logistic Regression. A hybrid ensemble model combining Random Forest, SVM, and Gradient Boosting through soft voting achieved a classification accuracy of 99%. This result demonstrates that ensemble learning combined with multimodal features can distinguish the nuanced differences between regional folk genres. The research thus addresses a critical gap in scalable, automated tools for preserving folk music and highlights the potential of artificial intelligence in safeguarding endangered cultural assets.
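To illustrate the soft-voting step mentioned above, the following is a minimal sketch in plain Python, not the study's implementation. It assumes each of the three base classifiers outputs a probability for each region; the function name `soft_vote`, the label order, and all probability values are hypothetical and chosen only for illustration.

```python
# Minimal sketch of soft voting. All numbers below are made-up illustrative
# values, not results from the study.

# Assumed label order (hypothetical); the regions themselves come from the text.
REGIONS = ["Marathi", "Punjabi", "Urdu Qawwali", "Uttar Pradesh", "Bihar"]


def soft_vote(prob_vectors, weights=None):
    """Average per-class probabilities across models.

    Returns the index of the winning class and the averaged probabilities.
    """
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    if weights is None:
        weights = [1.0] * n_models  # equal weighting by default
    total = sum(weights)
    averaged = [
        sum(w * p[c] for w, p in zip(weights, prob_vectors)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged


# Hypothetical class probabilities for one song from the three base models.
rf_probs = [0.10, 0.55, 0.05, 0.20, 0.10]   # Random Forest
svm_probs = [0.05, 0.40, 0.05, 0.45, 0.05]  # SVM (alone, it would pick Uttar Pradesh)
gb_probs = [0.10, 0.50, 0.10, 0.20, 0.10]   # Gradient Boosting

winner, averaged = soft_vote([rf_probs, svm_probs, gb_probs])
print(REGIONS[winner])
```

Unlike majority (hard) voting, soft voting lets a confident model outweigh a hesitant one, because the averaged probabilities rather than the individual labels decide the outcome. In scikit-learn, the same idea is available as `VotingClassifier` with `voting="soft"`; whether the study used that class is not stated in the text.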