Sagnik De

I completed my B.Tech in Electronics and Communication Engineering in 2025 from the Institute of Radio Physics and Electronics (IRPE), University of Calcutta, India.

I was a Research Assistant at the Cognitive Brain Dynamics Lab, School of AI & Data Science, IIT Jodhpur, under Prof. Dipanjan Roy, where I worked on computational models of brain resting-state dynamics and neuromodulation using tACS. Previously, I was a Winter Research Intern at IIT Delhi with Prof. Tapan Kumar Gandhi, focusing on deep learning optimization for multimodal anxiety detection from biopotential signals.

I also served as a Research Intern at MANIT Bhopal under Dr. Varun Bajaj, contributing to EEG-based anxiety detection frameworks, and at IIIT Naya Raipur under Dr. Anurag Singh, where I developed multimodal deep learning methods for Major Depressive Disorder diagnosis. Earlier, I worked at CDAC Pune with Dr. Anil Kumar Gupta on EEG-based early detection of Parkinson’s Disease and pathological brain state classification. At the University of Calcutta, under Dr. Anisha Halder Roy, I explored multimodal EEG–sEMG fusion for pain assessment and brain activity analysis during olfactory and taste perception.

Email / CV / Google Scholar / LinkedIn / GitHub

Research

My primary research interests span Artificial Intelligence & IoT in Healthcare, Biomedical Signal Processing, Human-Computer Interaction (HCI), Computational Neuroscience and Medical Image Analysis. I am deeply fascinated by the potential of these fields to shape the future of technology and transform the way we interact with machines and information.

I am always open to new collaborations and research ideas. Feel free to reach out if you are interested in working together!

Publications / Pre-Prints

2025

	A Novel Vision Transformer based Multimodal Fusion Approach for Clinical MDD Diagnosis Using EEG and Audio Signals (NEW!) Sagnik De, Anurag Singh, Ashish Kumar Bhandari IEEE Transactions on Computational Biology and Bioinformatics, 2025 Abstract / Article Major Depressive Disorder (MDD) is a debilitating mental health condition characterized by persistent sadness, anhedonia, and cognitive impairments that significantly disrupt daily functioning. Accurate diagnosis remains difficult due to the subjective nature of clinical assessments, highlighting the need for objective and automated diagnostic tools. Hence, this study proposes a novel multimodal framework integrating electroencephalography (EEG) and audio signals for accurate MDD detection. EEG signals undergo preprocessing and are transformed into 2D time-frequency (T-F) representations using the Superlet Transform, while audio signals are converted into Mel-spectrograms. The 2D representations from each modality are independently fed into a novel Vision Transformer (ViT) architecture. The proposed ViT first slices the T-F representation along frequency bands and applies positional encoding to each slice. The resulting slice embeddings are subsequently processed through a parallel Transformer Encoder (PE) module to effectively capture temporal dependencies. After the PE module has extracted sufficient information from the embedded slices, a learnable class token is appended to them, and the combined representation is passed through the Class Encoder (CE) module, allowing the model to capture global contextual information. Features extracted independently from EEG and audio streams are then fused and fed into a fully connected layer for final classification. Evaluation on the MODMA clinical dataset shows the framework achieves 98.86% accuracy, 98.32% F1-score, and 0.9403 MCC, surpassing unimodal baselines. 
The lightweight feature extraction and transformer-based fusion mechanisms make the proposed architecture deployable in an edge–fog–cloud Internet of Medical Things (IoMT) system, resulting in low-latency, resource-efficient, and scalable remote diagnosis that enhances accessibility and real-time clinical decision-making.

	GLEAM: A Multimodal Deep Learning Framework for Chronic Lower Back Pain Detection Using EEG and sEMG Signals Sagnik De, Prithwijit Mukherjee, Anisha Halder Roy Computers in Biology and Medicine, 2025 Abstract / Article Low Back Pain (LBP) is the most prevalent musculoskeletal condition worldwide and a leading cause of disability, significantly affecting mobility, work productivity, and overall quality of life. Due to its high prevalence and substantial economic burden, LBP presents a critical global public health challenge that demands innovative diagnostic and therapeutic solutions. This study introduces a novel deep-learning approach for diagnosing LBP intensity using electroencephalography (EEG) signals and surface electromyography (sEMG) signals from back muscles. A GAN-Convolution-Transformer-based model, named GLEAM (GAN-ConvoLution-sElf Attention-ETLSTM), is designed to classify LBP intensity into four categories: no LBP, mild LBP, moderate LBP, and intolerable LBP. A denoising GAN is central to the model’s functionality, playing a pivotal role in enhancing the quality of EEG and sEMG signals by removing noise, resulting in cleaner and more accurate input data. Various features are extracted from the GAN-denoised EEG and sEMG signals, and the combined features from both EEG and sEMG are used for LBP detection. After the feature extraction, the CNN is employed to capture local temporal patterns within the data, allowing the model to focus on smaller, region-specific trends in the signals. Subsequently, the self-attention module identifies global correlations among these locally extracted features, enhancing the model’s ability to recognize broader patterns. The proposed ETLSTM network performs the final classification, which achieves an impressive LBP detection accuracy of 98.95%. 
This research presents several innovative contributions: (i) the development of a novel denoising GAN for cleaning EEG and sEMG signals, (ii) the design and integration of a new ETLSTM architecture as a classifier within the GLEAM model, and (iii) the introduction of the GLEAM hybrid deep learning framework, which enables robust and reliable LBP intensity assessment.

	TasteNet: A Novel Deep Learning Approach for EEG-Based Basic Taste Perception Recognition Using CEEMDAN Domain Entropy Features Sagnik De, Prithwijit Mukherjee, Anisha Halder Roy Journal of Neuroscience Methods, 2025 Abstract / Article Taste perception is the process by which the gustatory system detects and interprets chemical stimuli from food and beverages, involving activation of taste receptors on the tongue. Analyzing taste perception is essential for understanding human sensory responses and diagnosing taste-related disorders. This research focuses on developing a deep learning framework to effectively recognize basic taste stimuli from EEG signals. Initially, the recorded EEG signals undergo preprocessing to remove noise and artifacts. The CEEMDAN (complete ensemble empirical mode decomposition with adaptive noise) method is then applied to decompose the EEG signals into various frequency rhythms, referred to as intrinsic mode functions (IMFs). From the chosen IMFs, six distinct entropy features — sample, bubble, approximate, dispersion, slope, and permutation entropy — are extracted for further analysis. A novel deep learning model, TasteNet, is then developed, integrating a convolutional neural network (CNN) module, a multi-head attention module, and the Att-BiPLSTM (Attention-Bidirectional Potent Long Short-Term Memory) network. The proposed architecture classifies the input data into six categories: no taste, sweet, sour, bitter, umami, and salty, achieving a remarkable accuracy of 97.52 ± 0.48%. TasteNet outperforms existing taste perception classification methods, as demonstrated through extensive experiments. This study presents TasteNet, a robust framework for precise taste perception recognition using EEG signals. Using CEEMDAN for effective signal decomposition and extracting key entropy features, the model captures intricate patterns in taste stimuli. 
The incorporation of the multi-head attention module and the Att-BiPLSTM network further enhances the model’s ability to identify various taste sensations accurately.
	Identification of patients with de novo Parkinson's Disease from chemosensory EEG signals using ICEEMDAN domain Entropy Features (Spotlight!) Sagnik De, Sreenija Pavuluri, Anil Kumar Gupta IEEE Sensors Letters, 2025 Abstract / Article / Spotlight Parkinson's disease (PD) is a progressive neurodegenerative disorder that impairs motor and sensory functions, with early symptoms often involving olfactory dysfunction. Given the importance of detecting these early biomarkers for timely intervention, this letter proposes the novel use of chemosensory EEG for the early detection of PD, as it captures the brain's responses to olfactory stimuli, one of the primary sensory modalities affected by the disease. The proposed method employs an improved complete ensemble empirical mode decomposition with adaptive noise to decompose EEG signals into intrinsic mode functions (IMFs). Entropic features, including approximate entropy, sample entropy, and Rényi permutation entropy (RpEn), are extracted from these IMFs to identify distinguishing characteristics. These features are then evaluated using several machine learning classifiers. A comprehensive evaluation reveals that combining RpEn features with least squares support vector machine classifier achieves optimal performance, with an accuracy of 96.47%, a precision of 96.14%, and a kappa score of 0.95.
	Quantifying the Impact of Speaker and Content Features on ASR Systems Using Unsupervised Distance Metrics Sreenija Pavuluri, Sagnik De, Anil Kumar Gupta IEEE Sensors Reviews, 2025 Abstract / Article Automatic speech recognition (ASR) models have become increasingly sophisticated, yet the underlying mechanisms driving their translation accuracy remain underexplored. This article explores the comparative influence of speaker characteristics and content similarity on ASR model performance, utilizing unsupervised distance metrics and clustering algorithms to gain deeper insights. By conducting a series of experiments using custom datasets, we aim to understand ASR model performance by examining whether the latent space features correlate more with speaker traits, such as accent, pitch, and speaking style, or with the semantic and syntactic content of the speech. Our findings reveal significant insights into the biases and strengths of current ASR technologies, highlighting the balance between speaker-dependent and content-dependent factors. Understanding these dynamics not only enhances the development of more robust and inclusive ASR systems but also paves the way for innovations in speech technology applications. This research contributes to the broader discourse on improving ASR models to better serve diverse populations and varied linguistic contexts.
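
The Parkinson's Disease entry above names Rényi permutation entropy (RpEn) as its best-performing feature but does not give a formula or parameters. The following is a minimal sketch of one common definition, assuming an embedding order of 3, a delay of 1, and a Rényi order alpha of 2; these values and the function names are illustrative assumptions, not the settings used in the letter.

```python
import math
from collections import Counter


def ordinal_pattern_probs(signal, order=3, delay=1):
    """Relative frequency of each ordinal pattern of length `order`."""
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    counts = Counter()
    for start in range(n_windows):
        window = [signal[start + k * delay] for k in range(order)]
        # Ordinal pattern: window positions sorted by value (ties keep time order).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    return [c / n_windows for c in counts.values()]


def renyi_permutation_entropy(signal, order=3, delay=1, alpha=2.0):
    """Rényi entropy of the ordinal-pattern distribution, normalized by log(order!)."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    probs = ordinal_pattern_probs(signal, order, delay)
    entropy = math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)
    return entropy / math.log(math.factorial(order))


# Toy series (not EEG data): a strictly increasing signal has a single
# ordinal pattern, so its normalized entropy is zero.
flat = renyi_permutation_entropy([1, 2, 3, 4, 5, 6])
mixed = renyi_permutation_entropy([4, 7, 9, 10, 6, 11, 3])
```

In a pipeline like the one described, such a function would be applied to each selected IMF, and the resulting values would form the feature vector passed to the classifier.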

2024

	SLiTRANet: An EEG-Based Automated Diagnosis Framework for Major Depressive Disorder Monitoring Using a Novel LGCN and Transformer-Based Hybrid Deep Learning Approach Sagnik De, Anurag Singh, Vivek Tiwari, Harshita Patel, G. N. Vivekananda, Dharmendra Singh Rajput IEEE Access, 2024 Abstract / Article Major depressive disorder (MDD) is a mental ailment marked by a loss of interest in activities, persistent depression, and hopelessness. MDD has been on the rise in society in recent decades for varied reasons and has spurred suicidal tendencies among individuals. Early detection, continuous monitoring, and effective treatment are crucial for its impact on quality of life and society. EEG signal models the brain’s electrical activities and has emerged as a potential tool to assess the depression status of a person. Due to advancements in sensor technology, fast, convenient, and cost-effective EEG acquisition is now possible, resulting in many EEG-based healthcare monitoring applications in recent years. This work proposes an EEG-headset-based smart monitoring system for real-time diagnosis of MDD in the Internet of Medical Things (IoMT) framework. In this study, we proposed a novel Linear Graph Convolution Network-Transformer-based deep learning approach for categorizing MDD through a time-frequency analysis of EEG signals. The Stockwell transform (S-transform) is employed to exploit the spectro-temporal information from the EEG and the resulting 2D representation is then fed into customized Linear Graph Convolution Network for MDD detection. We have utilized the Weighted Focal Binary Hinge Loss function, specifically designed for customized Linear Graph Convolution Network, to improve learning and handle unbalanced input. Subsequently, a novel Transformer model is designed to refine the MDD classification further. 
The proposed methodology, named SLiTRANet, blends spectral analysis with the S-transform, graph-based learning with Linear Graph Convolution Network, and the sequence modeling capability of the Transformer. The proposed SLiTRANet model can be further integrated within an IoMT framework for automated real-time MDD diagnosis using EEG signals. The proposed methodology is evaluated on two publicly available datasets, MODMA and HUSM. The evaluation results demonstrate the superior performance of the proposed SLiTRANet framework against the existing pre-trained and hybrid deep learning models, achieving remarkable accuracy, sensitivity, specificity, and precision rates of 99.92%, 99.90%, 99.95%, and 99.97%, respectively, on the HUSM dataset, followed by an equally good performance on the MODMA dataset with an accuracy of 99.68%. The proposed comprehensive approach implemented on two varied datasets highlights significant advancements in depression detection by outperforming state-of-the-art approaches.

	Maestro: A Robust Multi-Head Attention Enhanced CNN Architecture for Heat-Induced Stress Recognition Using EEG Signals Sagnik De, Sreenija Pavuluri, Amaan Sayyad, Anil Kumar Gupta IEEE CSITSS, 2024 Abstract / Paper Heat-induced stress impacts various physiological parameters in the body. Elevated temperature can cause tachycardia (an increase in heart rate), as the body attempts to dissipate heat through vasodilation, leading to dehydration and electrolyte imbalances. In addition, the hypothalamus triggers sweating in order to regulate body temperature; that can culminate in fluid and electrolyte loss, which could impact metabolic processes and blood pressure. A prolonged exposure to high temperatures can cause heat stroke, heat exhaustion, and other ailments including organ damage and systemic dysfunction. Existing electroencephalography (EEG)-based heat-induced stress detection often considers the entire EEG frequency range (delta to gamma), concealing redundant and lossy information and increasing the likelihood of false detection rates. To address the limitations of conventional handcrafted feature engineering approaches in heat stress detection, this paper introduces MAESTRO, a novel model comprising two blocks: Convolutional and Multi-head Attention. The Convolutional block extracts precise information from individual EEG frequency bands, while the Multi-head Attention block enhances feature representation through an attention mechanism. Finally, two dense layers are employed to classify heat stress into three classes: Acute, Chronic, and Control. The proposed framework undergoes validation using EEG data obtained from 40 rodents in a simulated laboratory environment. The outcomes illustrate the viability of the method in classifying heat-induced stress, yielding remarkable results for overall accuracy, precision, recall, and F1 score of 98.88%, 98.54%, 98.67%, and 98.60%, respectively.
	ParViT: A modified Vision Transformer architecture for Parkinson’s Disease identification using EEG signals Sagnik De, Amaan Sayyad, Hani Kotian, Anil Kumar Gupta IEEE ICSSES, 2024 Abstract / Paper Parkinson’s disease (PD) is a degenerative neurological condition that affects millions of individuals worldwide and is marked by both motor and non-motor indicators. In order to enhance patient outcomes through timely intervention, prompt detection and forecasting of PD is essential. Early detection of PD risk factors can help with the execution of management and treatment strategies that are beneficial in delaying the progression of the ailment. In this investigation, we present a novel hybrid framework intended for prediction of PD. The methodology begins by employing the Short-Time Fourier Transform (STFT) to process raw electroencephalography (EEG) data, facilitating the extraction of pertinent time-frequency characteristics. Subsequently, these features are organized into time-frequency blocks and inputted into an enhanced Vision Transformer (ViT) architecture, named ParViT, for classification of subjects into PD and Healthy Controls (HC). The proposed method demonstrates superior performance compared to existing techniques in tasks related to PD identification, achieving an impressive accuracy rate of 98.25%, as well as precision, recall, specificity and F1 scores of 98.20%, 98.27%, 98.47% and 98.24%, respectively, as evidenced by experiments conducted on the publicly available UC San Diego dataset.
	A Quantum Machine Learning framework for Driver Drowsiness Detection using Biopotential Signals and Head Movement Analysis Sagnik De, Anil Kumar Gupta IEEE ICWITE, 2024 Abstract / Paper Road accidents claim numerous lives annually, with drowsiness identified as a primary catalyst for a substantial portion of these incidents. This study addresses this critical issue by introducing an innovative approach to gauge human drowsiness levels during driving. The primary objective of this study is to introduce a novel deep-learning technique capable of detecting various alertness levels—awake, drowsy, and very sleepy—while driving. For this purpose, a hybrid model is proposed, leveraging Convolutional Neural Networks (CNN) in conjunction with an Attention-based Quantum Long Short-Term Memory (QLSTM) network. The designed model employs different biopotential signals, including electroencephalogram (EEG), facial electromyography (EMG), pulse rate, and head movement, to discern a person’s alertness level. Demonstrating remarkable accuracy, the proposed model achieves detection rates of 99%, 98.5%, and 99% for awake, drowsy, and very sleepy states, respectively, thus offering a promising solution to mitigate the impact of drowsiness-related accidents.
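
Several entries above report accuracy, precision, recall, specificity, F1 score, and the Matthews correlation coefficient (MCC). As a reading aid, the sketch below shows how these quantities follow from a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from any of the listed papers.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard classification metrics from binary confusion-matrix counts."""

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called sensitivity
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Made-up counts purely for illustration.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

For multi-class results such as the three-class stress and drowsiness tasks, the same formulas are typically applied per class (one class versus the rest) and then averaged; which averaging scheme each paper uses is not stated here.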

2023

	A Novel Deep Learning-Based Approach for Hypertension Level Detection Using PPG Sagnik De, Prithwijit Mukherjee, Anisha Halder Roy IEEE SILCON, 2023 Abstract / Paper In the contemporary era, a significant portion of individuals endure cardiovascular ailments (CVDs). Hypertension stands as the principal cause behind blood pressure (BP) irregularities and diverse CVDs. Addressing this exigency, the ceaseless monitoring of BP has emerged as an urgent priority. Our study endeavors to devise an efficacious deep learning-powered automated technique for measuring BP (specifically systolic blood pressure (SBP) and diastolic blood pressure (DBP)), leveraging potentially cost-effective technology. Within our investigation, we have formulated two Long Short-Term Memory (LSTM)-based regression models that prognosticate SBP and DBP based on recorded photoplethysmogram (PPG) readings. Furthermore, an attention mechanism-based TLSTM (tanh Long-Short Term Memory) model has been proposed that can accurately predict distinct stages of hypertension, namely Normal, Pre-Hypertension, Hypertension stage 1, and Hypertension stage 2. The attained root-mean-squared error (RMSE) values are 10.503 and 9.284 for SBP and DBP, respectively, whereas the mean absolute error (MAE) values are 7.529 and 4.218 for SBP and DBP, respectively. The proposed attention mechanism-based TLSTM model exhibits a classification accuracy of 96%. The novelty of this investigation resides in the incorporation of an attention module into the TLSTM network for increasing its classification accuracy.
	A Hybrid Pain Assessment Approach with Stacked Autoencoders and Attention-Based CP-LSTM Sagnik De, Prithwijit Mukherjee, Anisha Halder Roy IEEE AIKIIE, 2023 Abstract / Paper Pain assessment is an integral part of healthcare since it enables the optimal management of patient well-being and the prompt administration of therapies. The ability to precisely diagnose pain is essential for ensuring appropriate medical attention and treatment. This study presents a novel pain categorization approach based on EEG (Electroencephalography) signals. A hybrid deep learning-based model is utilized in the study, which combines a Stacked Autoencoder for automated feature extraction and Chebyshev polynomial Long Short-Term Memory (CP-LSTM) with an attention mechanism for classification. The primary objective is to distinguish between two distinct pain states: ‘No Pain’ and ‘Pain’. The proposed model can detect the ‘Pain’ and ‘No Pain’ states of an individual with 99.32% and 98.91% accuracy, respectively. Experimental observations indicate an increase in delta wave power in the frontal and central cortex, as well as an increase in alpha wave power in the frontal, temporal, and occipital lobes during pain.
	A Novel Human Stress Level Detection Technique Using EEG Dipanjan Konar, Sagnik De, Prithwijit Mukherjee, Anisha Halder Roy IEEE NMITCON, 2023 Abstract / Paper In the 21st century, a significant portion of the world's population is plagued by stress. Stress is harmful to humans and can cause various physical and mental illnesses, such as headaches, anxiety, depression, and heart diseases. The aim of this study is to design a machine learning-based model to measure the mental stress level of a person using electroencephalography (EEG) signals of the frontal lobe. The proposed stress level assessment technique can classify the stress level of a person into four categories: no stress, low stress, moderate stress, and high stress. In our study, first EEG signals of the subjects have been recorded while they solve four mathematical question sets with different complexity levels. Subsequently, different handcrafted feature extraction techniques have been employed for extracting eight features, namely Skewness, Kurtosis, Maximum, Mean, Mean Absolute Value, Minimum, Standard Deviation, and Power Spectral Density from the pre-processed EEG signals. A majority voting-based ensemble classifier model has been designed by combining the predictions of three different classifiers, i.e., Support Vector Machine (SVM), K-Nearest neighbour (KNN), and Naive Bayes, to predict a person's mental stress level. The obtained classification accuracy of the proposed ensemble classifier model is 93.85%.
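
The stress-detection entry above combines handcrafted statistical features with a majority-voting ensemble of three classifiers. The sketch below illustrates those two steps using only the Python standard library. It is an assumption-laden illustration, not the paper's implementation: Power Spectral Density is omitted, the classifiers themselves are not trained here, and the tie-breaking rule (the earliest vote among tied labels wins) is an assumption.

```python
from collections import Counter
from statistics import fmean, pstdev


def time_domain_features(samples):
    """Seven of the eight listed features (Power Spectral Density is omitted)."""
    mean = fmean(samples)
    centered = [x - mean for x in samples]
    m2 = fmean([c ** 2 for c in centered])  # population variance
    m3 = fmean([c ** 3 for c in centered])
    m4 = fmean([c ** 4 for c in centered])
    return {
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "maximum": max(samples),
        "mean": mean,
        "mean_absolute_value": fmean([abs(x) for x in samples]),
        "minimum": min(samples),
        "std": pstdev(samples),
    }


def majority_vote(votes):
    """Most frequent label; on a tie, the earliest vote among tied labels wins."""
    counts = Counter(votes)
    top = max(counts.values())
    for label in votes:
        if counts[label] == top:
            return label


def ensemble_predict(per_classifier_predictions):
    """Combine per-sample predictions from several classifiers by majority vote."""
    return [majority_vote(list(votes)) for votes in zip(*per_classifier_predictions)]


# Hypothetical label predictions from three classifiers for two samples.
svm_pred = ["low", "high"]
knn_pred = ["low", "moderate"]
nb_pred = ["no", "moderate"]
combined = ensemble_predict([svm_pred, knn_pred, nb_pred])
features = time_domain_features([1, 2, 3, 4, 5])
```

With three voters and four stress classes, all three classifiers can disagree, so an explicit tie rule is needed; listing the most trusted classifier first makes its prediction the fallback under the rule assumed here.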

Miscellanea

Awards & Achievements

Recipient of IASc-INSA-NASI Summer Research Fellowship Program 2024

Recipient of Satyendra Nath Bose Summer Research Internship Program 2024, NIT Silchar

Secured the 3rd Runner-Up position in TELECAST 2024, organized by the University of Calcutta, Kolkata, in collaboration with CTiF, India

Won the 1st Prize in COGNITECH 2023 organized by AI & Robotics Club in collaboration with IEEE Calcutta University Student Branch

Won the 1st Prize in Research Work Presentation 2023 organized by IEEE Photonics Society Kolkata Chapter, IEEE APS Kolkata Chapter & IEEE Calcutta University Student Branch

Reviewer

IEEE Access	Biomedical Signal Processing and Control	Food Chemistry
Scientific Reports	Artificial Intelligence in Medicine	Biological Psychology

Positions of Responsibility

Secretary
IEEE Calcutta University Student Branch
(Nov 2023 – Apr 2025)

President
AI & Robotics Club, IEEE CUSB
(Nov 2023 – Apr 2025)

Secretary
AI & Robotics Club, IEEE CUSB
(May 2023 – Oct 2023)