Sagnik De
I completed my B.Tech in Electronics and Communication Engineering in 2025 at the
Institute of Radio Physics and Electronics (IRPE),
University of Calcutta, India.
I also completed a minor in Business Administration at the
Indian Institute of Technology (IIT), Patna.
I was a Winter Research Intern at IIT Delhi with
Prof. Tapan Kumar Gandhi,
focusing on a graph-based deep learning framework for anxiety state detection from neural signals.
Further, I analyzed intracranial EEG (iEEG) data for real-time epileptic seizure prediction in
humans.
I also served as a Research Intern at MANIT Bhopal under
Dr. Varun Bajaj,
contributing to EEG-based identification of dementia subtypes, and at
IIIT Naya Raipur under
Dr. Anurag Singh,
where I developed multimodal deep learning methods for Major Depressive Disorder diagnosis.
Earlier, I worked at CDAC Pune with
Dr. Anil Kumar Gupta
on EEG-based early detection of Parkinson’s Disease and pathological brain state classification.
Further, at the University of Calcutta, under
Dr. Anisha Halder Roy,
I worked on multimodal EEG–sEMG fusion for chronic lower back pain assessment,
performed dedicated neuropathic pain analysis using EEG signals,
and investigated brain activity patterns during olfactory and gustatory perception.
Email
 / 
CV
 / 
Google Scholar
 / 
LinkedIn
 / 
GitHub

Research
My primary research interests span
Artificial Intelligence & IoT in Healthcare,
Biomedical Signal Processing,
Brain-Computer Interface (BCI),
Neural Engineering,
Computational Neuroscience
and
Medical Image Analysis.
I am deeply fascinated by the potential of these fields to shape the future of technology and
transform the way we interact with machines and information.
I am always open to new collaborations and research ideas.
Feel free to reach out if you are interested in working together!
|
Publications / Pre-Prints
2025
|
A Novel Vision Transformer based Multimodal Fusion
Approach for Clinical MDD Diagnosis Using EEG and
Audio Signals
(NEW!)
Sagnik De,
Anurag Singh,
Ashish Kumar Bhandari
IEEE Transactions on Computational Biology and
Bioinformatics, 2025
Abstract
/
Article
Major Depressive Disorder (MDD) is a debilitating
mental health condition characterized by persistent
sadness, anhedonia, and cognitive impairments that
significantly disrupt daily functioning.
Accurate diagnosis remains difficult due to the
subjective nature of clinical assessments,
highlighting the need for objective and automated
diagnostic tools.
Hence, this study proposes a novel multimodal
framework integrating electroencephalography (EEG)
and audio signals for accurate MDD detection.
EEG signals undergo preprocessing and are transformed
into 2D time-frequency (T-F) representations using
the Superlet Transform, while audio signals are
converted into Mel-spectrograms.
The 2D representations from each modality are
independently fed into a novel Vision Transformer
(ViT) architecture.
The proposed ViT first slices the T-F representation
along frequency bands and applies positional encoding
to each slice.
The resulting slice embeddings are subsequently
processed through a parallel Transformer Encoder (PE)
module to effectively capture temporal dependencies.
After the PE module has extracted sufficient
information from the embedded slices, a learnable
class token is appended to them, and the combined
representation is passed through the Class Encoder
(CE) module, allowing the model to capture global
contextual information.
Features extracted independently from EEG and audio
streams are then fused and fed into a fully connected
layer for final classification.
Evaluation on the MODMA clinical dataset shows the
framework achieves 98.86% accuracy, 98.32% F1-score,
and 0.9403 MCC, surpassing unimodal baselines.
The lightweight feature extraction and
transformer-based fusion mechanisms make the
proposed architecture deployable in an edge–fog–cloud
Internet of Medical Things (IoMT) system, enabling
low-latency, resource-efficient, and scalable
remote diagnosis and enhancing accessibility and
real-time clinical decision-making.
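For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, F1-score, and the Matthews correlation coefficient (MCC) follow from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage line are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, f1, mcc) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # F1 is the harmonic mean of precision and recall.
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    # MCC uses all four cells, so it stays informative under class imbalance.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return accuracy, f1, mcc

# Made-up counts, for illustration only.
acc, f1, mcc = binary_metrics(tp=45, fp=5, tn=40, fn=10)
print(f"accuracy={acc:.3f}  F1={f1:.3f}  MCC={mcc:.3f}")
```

MCC ranges from -1 (every prediction wrong) to +1 (every prediction right), which is why it is often reported alongside accuracy for clinical datasets.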
|
|
|
GLEAM: A Multimodal Deep Learning Framework for
Chronic Lower Back Pain Detection Using EEG and
sEMG Signals
Sagnik De,
Prithwijit Mukherjee,
Anisha Halder Roy
Computers in Biology and Medicine, 2025
Abstract
/
Article
Low Back Pain (LBP) is the most prevalent
musculoskeletal condition worldwide and a leading
cause of disability, significantly affecting
mobility, work productivity, and overall quality of
life.
Due to its high prevalence and substantial economic
burden, LBP presents a critical global public health
challenge that demands innovative diagnostic and
therapeutic solutions.
This study introduces a novel deep-learning approach
for diagnosing LBP intensity using
electroencephalography (EEG) signals and surface
electromyography (sEMG) signals from back muscles.
A GAN-Convolution-Transformer-based model, named
GLEAM (GAN-ConvoLution-sElf Attention-ETLSTM), is
designed to classify LBP intensity into four
categories: no LBP, mild LBP, moderate LBP, and
intolerable LBP.
A denoising GAN is central to the model’s
functionality, playing a pivotal role in enhancing
the quality of EEG and sEMG signals by removing
noise, resulting in cleaner and more accurate input
data.
Various features are extracted from the GAN-denoised
EEG and sEMG signals, and the combined features from
both EEG and sEMG are used for LBP detection.
After feature extraction, a CNN is employed to
capture local temporal patterns within the data,
allowing the model to focus on smaller,
region-specific trends in the signals.
Subsequently, the self-attention module identifies
global correlations among these locally extracted
features, enhancing the model’s ability to recognize
broader patterns.
The proposed ETLSTM network performs the final
classification, which achieves an impressive LBP
detection accuracy of 98.95%.
This research presents several innovative
contributions:
(i) the development of a novel denoising GAN for
cleaning EEG and sEMG signals,
(ii) the design and integration of a new ETLSTM
architecture as a classifier within the GLEAM model,
and
(iii) the introduction of the GLEAM hybrid deep
learning framework, which enables robust and
reliable LBP intensity assessment.
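The self-attention step mentioned in this abstract can be illustrated with a small sketch of scaled dot-product self-attention. This is a generic textbook form, not the GLEAM implementation: the learned query, key, and value projections are replaced by the identity as a simplifying assumption, and the names `softmax`, `self_attention`, and `local_features` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    `features` is a list of equal-length vectors, one per local segment.
    A trained model would apply learned projection matrices first; they
    are omitted here (assumption) to keep the sketch short.
    """
    dim = len(features[0])
    outputs = []
    for query in features:
        # Similarity of this segment to every segment, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in features]
        weights = softmax(scores)
        # Each output is a weighted average of all segment features.
        outputs.append([sum(w * vec[d] for w, vec in zip(weights, features))
                        for d in range(dim)])
    return outputs

# Toy local features standing in for CNN outputs (made-up values).
local_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(local_features)
```

Because every output row mixes information from all segments, the result reflects correlations across the whole sequence rather than a single local window.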
|
|
|
TasteNet: A Novel Deep Learning Approach for
EEG-Based Basic Taste Perception Recognition
Using CEEMDAN Domain Entropy Features
Sagnik De,
Prithwijit Mukherjee,
Anisha Halder Roy
Journal of Neuroscience Methods, 2025
Abstract
/
Article
Taste perception is the process by which the
gustatory system detects and interprets chemical
stimuli from food and beverages, involving
activation of taste receptors on the tongue.
Analyzing taste perception is essential for
understanding human sensory responses and diagnosing
taste-related disorders.
This research focuses on developing a deep learning
framework to effectively recognize basic taste
stimuli from EEG signals.
Initially, the recorded EEG signals undergo
preprocessing to remove noise and artifacts.
The CEEMDAN (complete ensemble empirical mode
decomposition with adaptive noise) method is then
applied to decompose the EEG signals into various
frequency rhythms, referred to as intrinsic mode
functions (IMFs).
From the chosen IMFs, six distinct entropy
features — sample, bubble, approximate, dispersion,
slope, and permutation entropy — are extracted for
further analysis.
A novel deep learning model, TasteNet, is then
developed, integrating a convolutional neural
network (CNN) module, a multi-head attention module,
and the Att-BiPLSTM (Attention-Bidirectional Potent
Long Short-Term Memory) network.
The proposed architecture classifies the input data
into six categories: no taste, sweet, sour, bitter,
umami, and salty, achieving a remarkable accuracy of
97.52 ± 0.48%.
TasteNet outperforms existing taste perception
classification methods, as demonstrated through
extensive experiments.
This study presents TasteNet, a robust framework for
precise taste perception recognition using EEG
signals.
Using CEEMDAN for effective signal decomposition and
extracting key entropy features, the model captures
intricate patterns in taste stimuli.
The incorporation of a multi-head attention module and
the Att-BiPLSTM network further enhances the model’s
ability to identify various taste sensations
accurately.
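Of the six entropy features listed in this abstract, sample entropy is sketched below in plain Python. This follows the common textbook definition with Chebyshev distance; the embedding dimension `m`, the absolute tolerance `r`, and the function name are assumptions, since the abstract does not state the parameters used. In practice `r` is often set relative to the signal's standard deviation.

```python
import math

def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy of a 1-D sequence (textbook form, absolute tolerance r)."""
    n = len(signal)

    def count_matches(length):
        # Use the first n - m templates for both lengths so the counts are comparable.
        templates = [signal[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance: largest element-wise difference.
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    matches += 1
        return matches

    matches_m = count_matches(m)
    matches_m_plus_1 = count_matches(m + 1)
    if matches_m == 0 or matches_m_plus_1 == 0:
        return float("inf")  # undefined for this signal and tolerance
    return -math.log(matches_m_plus_1 / matches_m)

# A perfectly regular toy signal has zero sample entropy.
regular = [0.0, 1.0] * 5
print(sample_entropy(regular, m=2, r=0.1))
```

Lower values indicate a more regular, predictable signal; higher values indicate more irregularity.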
|
|
Identification of patients with de novo
Parkinson's Disease from chemosensory EEG
signals using ICEEMDAN domain Entropy Features
(Spotlight Paper!)
Sagnik De,
Sreenija Pavuluri,
Anil Kumar Gupta
IEEE Sensors Letters, 2025
Abstract
/
Article
/
Spotlight
Parkinson's disease (PD) is a progressive
neurodegenerative disorder that impairs motor and
sensory functions, with early symptoms often
involving olfactory dysfunction.
Given the importance of detecting these early
biomarkers for timely intervention, this letter
proposes the novel use of chemosensory EEG for the
early detection of PD, as it captures the brain's
responses to olfactory stimuli, one of the primary
sensory modalities affected by the disease.
The proposed method employs an improved complete
ensemble empirical mode decomposition with adaptive
noise to decompose EEG signals into intrinsic mode
functions (IMFs).
Entropic features, including approximate entropy,
sample entropy, and Rényi permutation entropy
(RpEn), are extracted from these IMFs to identify
distinguishing characteristics.
These features are then evaluated using several
machine learning classifiers.
A comprehensive evaluation reveals that combining
RpEn features with a least squares support vector
machine classifier achieves optimal performance,
with an accuracy of 96.47%, a precision of 96.14%,
and a kappa score of 0.95.
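As an illustration of the Rényi permutation entropy (RpEn) feature named above, the sketch below computes the Rényi entropy of the ordinal-pattern distribution of a sequence, normalised by the logarithm of the number of possible patterns. The embedding `order`, `delay`, and Rényi parameter `alpha` are assumed values, and the function name is hypothetical; the letter's exact settings are not given in the abstract.

```python
import math
from collections import Counter

def renyi_permutation_entropy(signal, order=3, delay=1, alpha=2.0):
    """Normalised Rényi entropy of the ordinal-pattern distribution."""
    if alpha == 1.0:
        raise ValueError("alpha = 1 is the Shannon limit; use alpha != 1 here")
    span = (order - 1) * delay
    patterns = Counter()
    for start in range(len(signal) - span):
        window = [signal[start + i * delay] for i in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    total = sum(patterns.values())
    power_sum = sum((count / total) ** alpha for count in patterns.values())
    entropy = math.log(power_sum) / (1.0 - alpha)
    # Divide by log(order!) so the result lies between 0 and 1.
    return entropy / math.log(math.factorial(order))

# Short toy sequence (made-up values, not EEG data).
value = renyi_permutation_entropy([4, 7, 9, 10, 6, 11, 3])
```

In a pipeline like the one described, this function would be applied to each selected IMF to produce one feature per IMF.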
|
|
Quantifying the Impact of Speaker and Content
Features on ASR Systems Using Unsupervised
Distance Metrics
Sreenija Pavuluri,
Sagnik De,
Anil Kumar Gupta
IEEE Sensors Reviews, 2025
Abstract
/
Article
Automatic speech recognition (ASR) models have
become increasingly sophisticated, yet the
underlying mechanisms driving their translation
accuracy remain underexplored.
This article explores the comparative influence of
speaker characteristics and content similarity on
ASR model performance, utilizing unsupervised
distance metrics and clustering algorithms to gain
deeper insights.
By conducting a series of experiments using custom
datasets, we aim to understand ASR model
performance by examining whether the latent space
features correlate more with speaker traits, such
as accent, pitch, and speaking style, or with the
semantic and syntactic content of the speech.
Our findings reveal significant insights into the
biases and strengths of current ASR technologies,
highlighting the balance between speaker-dependent
and content-dependent factors.
Understanding these dynamics not only enhances the
development of more robust and inclusive ASR systems
but also paves the way for innovations in speech
technology applications.
This research contributes to the broader discourse
on improving ASR models to better serve diverse
populations and varied linguistic contexts.
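One simple way to picture the comparison described above is to measure how tightly latent vectors cluster when grouped by speaker versus by content. The sketch below uses cosine distance for this; the embeddings, labels, and function names are invented for illustration and do not come from the article, which does not name its specific metrics in the abstract.

```python
import math
from itertools import combinations

def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def mean_within_group_distance(embeddings, labels):
    """Average cosine distance over all pairs that share a label."""
    distances = [cosine_distance(embeddings[i], embeddings[j])
                 for i, j in combinations(range(len(labels)), 2)
                 if labels[i] == labels[j]]
    return sum(distances) / len(distances)

# Toy latent vectors (hypothetical): two speakers, each reading two sentences.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
speakers = ["A", "A", "B", "B"]
sentences = ["s1", "s2", "s1", "s2"]

by_speaker = mean_within_group_distance(embeddings, speakers)
by_content = mean_within_group_distance(embeddings, sentences)
print(f"within-speaker={by_speaker:.3f}  within-content={by_content:.3f}")
```

In this toy case the within-speaker distance is smaller, which would suggest the latent space is organised more by speaker than by content.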
|
2024
|
SLiTRANet: An EEG-Based Automated Diagnosis
Framework for Major Depressive Disorder
Monitoring Using a Novel LGCN and
Transformer-Based Hybrid Deep Learning Approach
Sagnik De,
Anurag Singh,
Vivek Tiwari,
Harshita Patel,
G. N. Vivekananda,
Dharmendra Singh Rajput
IEEE Access, 2024
Abstract
/
Article
Major depressive disorder (MDD) is a mental ailment
marked by a loss of interest in activities,
persistent depression, and hopelessness.
MDD has been on the rise in society in recent
decades for varied reasons and has spurred suicidal
tendencies among individuals.
Early detection, continuous monitoring, and
effective treatment are crucial given its impact on
quality of life and society.
EEG signal models the brain’s electrical activities
and has emerged as a potential tool to assess the
depression status of a person.
Due to advancements in sensor technology, fast,
convenient, and cost-effective EEG acquisition is
now possible, resulting in many EEG-based healthcare
monitoring applications in recent years.
This work proposes an EEG-headset-based smart
monitoring system for real-time diagnosis of MDD in
the Internet of Medical Things (IoMT) framework.
In this study, we propose a novel Linear Graph
Convolution Network-Transformer-based deep learning
approach for categorizing MDD through a
time-frequency analysis of EEG signals.
The Stockwell transform (S-transform) is employed
to exploit the spectro-temporal information from the
EEG, and the resulting 2D representation is then fed
into a customized Linear Graph Convolution Network for
MDD detection.
We utilize the Weighted Focal Binary Hinge
Loss function, specifically designed for the customized
Linear Graph Convolution Network, to improve
learning and handle unbalanced input.
Subsequently, a novel Transformer model is designed
to refine the MDD classification further.
The proposed methodology, named SLiTRANet, blends
spectral analysis with the S-transform, graph-based
learning with the Linear Graph Convolution Network, and
the sequence modeling capability of the Transformer.
The proposed SLiTRANet model can be further
integrated within an IoMT framework for automated
real-time MDD diagnosis using EEG signals.
The proposed methodology is evaluated on two
publicly available datasets, MODMA and HUSM.
The evaluation results demonstrate the superior
performance of the proposed SLiTRANet framework
against existing pre-trained and hybrid deep
learning models, achieving accuracy,
sensitivity, specificity, and precision rates of
99.92%, 99.90%, 99.95%, and 99.97%, respectively, on
the HUSM dataset, followed by comparably strong performance
on the MODMA dataset with an accuracy of 99.68%.
The proposed comprehensive approach, implemented on
two varied datasets, highlights significant
advancements in depression detection by
outperforming state-of-the-art approaches.
|
|
|
Maestro: A Robust Multi-Head Attention Enhanced
CNN Architecture for Heat-Induced Stress
Recognition Using EEG Signals
Sagnik De,
Sreenija Pavuluri,
Amaan Sayyad,
Anil Kumar Gupta
IEEE CSITSS, 2024
Abstract
/
Paper
Heat-induced stress impacts various physiological
parameters in the body.
Elevated temperature can cause tachycardia (an
increase in heart rate), as the body attempts to
dissipate heat through vasodilation, leading to
dehydration and electrolyte imbalances.
In addition, the hypothalamus triggers sweating in
order to regulate body temperature, which can
culminate in fluid and electrolyte loss and, in turn,
impact metabolic processes and blood pressure.
Prolonged exposure to high temperatures can cause
heat stroke, heat exhaustion, and other ailments
including organ damage and systemic dysfunction.
Existing electroencephalography (EEG)-based
heat-induced stress detection often considers the
entire EEG frequency range (delta to gamma),
concealing redundant and lossy information and
increasing the likelihood of false detection rates.
To address the limitations of conventional
handcrafted feature engineering approaches in heat
stress detection, this paper introduces MAESTRO, a
novel model comprising two blocks: Convolutional and
Multi-head Attention.
The Convolutional block extracts precise information
from individual EEG frequency bands, while the
Multi-head Attention block enhances feature
representation through an attention mechanism.
Finally, two dense layers are employed to classify
heat stress into three classes: Acute, Chronic, and
Control.
The proposed framework undergoes validation using
EEG data obtained from 40 rodents in a simulated
laboratory environment.
The outcomes illustrate the viability of the method
in classifying heat-induced stress, yielding
remarkable results for overall accuracy, precision,
recall, and F1 score of 98.88%, 98.54%, 98.67%,
and 98.60%, respectively.
|
|
ParViT: A modified Vision Transformer
architecture for Parkinson’s Disease
identification using EEG signals
Sagnik De,
Amaan Sayyad,
Hani Kotian,
Anil Kumar Gupta
IEEE ICSSES, 2024
Abstract
/
Paper
Parkinson’s disease (PD) is a degenerative
neurological condition that affects millions of
individuals worldwide and is marked by both motor
and non-motor indicators.
In order to enhance patient outcomes through timely
intervention, prompt detection and forecasting of PD
are essential.
Early detection of PD risk factors can help with the
execution of management and treatment strategies
that are beneficial in delaying the progression of
the ailment.
In this investigation, we present a novel hybrid
framework for the prediction of PD.
The methodology begins by employing the Short-Time
Fourier Transform (STFT) to process raw
electroencephalography (EEG) data, facilitating the
extraction of pertinent time-frequency
characteristics.
Subsequently, these features are organized into
time-frequency blocks and fed into an enhanced
Vision Transformer (ViT) architecture, named ParViT,
for classification of subjects into PD and Healthy
Controls (HC).
The proposed method demonstrates superior
performance compared to existing techniques
in PD identification, achieving an
impressive accuracy rate of 98.25%, as well as
precision, recall, specificity and F1 scores of
98.20%, 98.27%, 98.47% and 98.24%, respectively, as
evidenced by experiments conducted on the publicly
available UC San Diego dataset.
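The first stage described in this abstract, turning a raw signal into time-frequency features with the Short-Time Fourier Transform, can be sketched as follows. This is a deliberately naive pure-Python version using a Hann window and a direct DFT; the frame length, hop size, and function name are assumptions, and a real pipeline would normally rely on an FFT library instead.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=16, hop=8):
    """Return one list of magnitudes (bins 0..frame_len/2) per frame."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            coeff = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames

# Toy input: a pure tone sitting exactly on bin 2 of a 16-sample frame.
tone = [math.cos(2 * math.pi * 2 * n / 16) for n in range(64)]
spectrogram = stft_magnitude(tone)
```

The resulting frames-by-bins grid is the kind of 2D array that can then be cut into time-frequency blocks and passed to a Vision Transformer.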
|
|
A Quantum Machine Learning framework for Driver
Drowsiness Detection using Biopotential Signals
and Head Movement Analysis
Sagnik De,
Anil Kumar Gupta
IEEE ICWITE, 2024
Abstract
/
Paper
Road accidents claim numerous lives annually, with
drowsiness identified as a primary catalyst for a
substantial portion of these incidents.
This study addresses this critical issue by
introducing an innovative approach to gauge human
drowsiness levels during driving.
The primary objective of this study is to introduce
a novel deep-learning technique capable of detecting
various alertness levels—awake, drowsy, and very
sleepy—while driving.
For this purpose, a hybrid model is proposed,
leveraging Convolutional Neural Networks (CNN) in
conjunction with an Attention-based Quantum Long
Short-Term Memory (QLSTM) network.
The designed model employs different biopotential
signals, including electroencephalogram (EEG),
facial electromyography (EMG), pulse rate, and head
movement, to discern a person’s alertness level.
Demonstrating remarkable accuracy, the proposed
model achieves detection rates of 99%, 98.5%, and
99% for awake, drowsy, and very sleepy states,
respectively, thus offering a promising solution to
mitigate the impact of drowsiness-related accidents.
|
2023
|
A Novel Deep Learning-Based Approach for
Hypertension Level Detection Using PPG
Sagnik De,
Prithwijit Mukherjee,
Anisha Halder Roy
IEEE SILCON, 2023
Abstract
/
Paper
In the contemporary era, a significant portion of
individuals suffer from cardiovascular diseases (CVDs).
Hypertension stands as the principal cause behind
blood pressure (BP) irregularities and diverse CVDs.
Continuous monitoring of BP has therefore
emerged as an urgent priority.
Our study endeavors to devise an efficacious deep
learning-powered automated technique for measuring
BP (specifically systolic blood pressure (SBP) and
diastolic blood pressure (DBP)), leveraging
potentially cost-effective technology.
In this investigation, we formulate two
Long Short-Term Memory (LSTM)-based regression
models that predict SBP and DBP from
recorded photoplethysmogram (PPG) readings.
Furthermore, an attention mechanism-based TLSTM
(tanh Long Short-Term Memory) model is
proposed that can accurately predict distinct stages
of hypertension, namely Normal, Pre-Hypertension,
Hypertension stage 1, and Hypertension stage 2.
The attained root-mean-squared error (RMSE) values
are 10.503 and 9.284 for SBP and DBP, respectively,
whereas the mean absolute error (MAE) values are
7.529 and 4.218 for SBP and DBP, respectively.
The proposed attention mechanism-based TLSTM model
exhibits a classification accuracy of 96%.
The novelty of this investigation resides in the
incorporation of an attention module into the TLSTM
network for increasing its classification accuracy.
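The two regression error measures quoted in this abstract are defined as in the sketch below. The reference and predicted readings are made-up numbers used only to show the calculation, and the function names are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root-mean-squared error: penalises large deviations more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error: the average size of the deviation."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical systolic readings and model estimates (illustration only).
reference_sbp = [120.0, 130.0]
estimated_sbp = [123.0, 126.0]
print(f"RMSE={rmse(reference_sbp, estimated_sbp):.3f}  "
      f"MAE={mae(reference_sbp, estimated_sbp):.3f}")
```

RMSE is never smaller than MAE on the same data, so a large gap between the two usually points to a few large individual errors.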
|
|
A Hybrid Pain Assessment Approach with Stacked
Autoencoders and Attention-Based CP-LSTM
Sagnik De,
Prithwijit Mukherjee,
Anisha Halder Roy
IEEE AIKIIE, 2023
Abstract
/
Paper
Pain assessment is an integral part of healthcare
since it enables the optimal management of patient
well-being and the prompt administration of
therapies.
The ability to precisely diagnose pain is essential
for ensuring appropriate medical attention and
treatment.
This study presents a novel pain categorization
approach based on EEG (Electroencephalography)
signals.
A hybrid deep learning-based model is utilized in
the study, which combines a Stacked Autoencoder for
automated feature extraction and Chebyshev
polynomial Long Short-Term Memory (CP-LSTM) with an
attention mechanism for classification.
The primary objective is to distinguish between two
distinct pain states: ‘No Pain’ and ‘Pain’.
The proposed model can detect the ‘Pain’ and
‘No Pain’ states of an individual with 99.32% and
98.91% accuracy, respectively.
Experimental observations indicate an increase in
delta wave power in the frontal and central cortex,
as well as an increase in alpha wave power in the
frontal, temporal, and occipital lobes during pain.
|
|
A Novel Human Stress Level Detection Technique
Using EEG
Dipanjan Konar,
Sagnik De,
Prithwijit Mukherjee,
Anisha Halder Roy
IEEE NMITCON, 2023
Abstract
/
Paper
In the 21st century, a significant portion of the
world's population is plagued by stress.
Stress is harmful to humans and can cause various
physical and mental illnesses, such as headaches,
anxiety, depression, and heart diseases.
The aim of this study is to design a machine
learning-based model to measure the mental stress
level of a person using electroencephalography (EEG)
signals of the frontal lobe.
The proposed stress level assessment technique can
classify the stress level of a person into four
categories: no stress, low stress, moderate stress,
and high stress.
In our study, EEG signals of the subjects are
first recorded while they solve four mathematical
question sets of differing complexity.
Subsequently, different handcrafted feature
extraction techniques have been employed for
extracting eight features, namely Skewness,
Kurtosis, Maximum, Mean, Mean Absolute Value,
Minimum, Standard Deviation, and Power Spectral
Density from the pre-processed EEG signals.
A majority voting-based ensemble classifier model
has been designed by combining the predictions of
three different classifiers, i.e., Support Vector
Machine (SVM), K-Nearest Neighbour (KNN), and Naive
Bayes, to predict a person's mental stress level.
The obtained classification accuracy of the proposed
ensemble classifier model is 93.85%.
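The majority-voting step described in this abstract can be sketched in a few lines. The per-classifier outputs below are invented for illustration, and the tie-breaking rule (prefer the earliest classifier's label) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label; ties go to the earliest classifier."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

def ensemble_predict(per_classifier_predictions):
    """Combine per-classifier label lists sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]

# Hypothetical outputs of three base classifiers on four samples.
svm_out = ["low", "high", "none", "moderate"]
knn_out = ["low", "moderate", "none", "high"]
nb_out = ["moderate", "high", "low", "low"]

combined = ensemble_predict([svm_out, knn_out, nb_out])
print(combined)
```

With three voters and four classes a three-way tie is possible, as in the last sample here, which is why an explicit tie-breaking rule is needed.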
|
|
Awards & Achievements
Recipient of
IASc-INSA-NASI Summer Research Fellowship
Program 2024
Recipient of
Satyendra Nath Bose Summer Research Internship Program
2024, NIT Silchar
Awarded the
Outstanding Volunteer 2023-24
by IEEE Calcutta University Student Branch
Secured the
3rd Runner-Up position
in
TELECAST 2024
organized by University of Calcutta, Kolkata in collaboration
with CTiF, India
Received the
Best Paper Award
at
IEEE ICCCCM 2024
and
IEEE IACIS 2024
Won the
1st Prize
in
COGNITECH 2023
organized by AI & Robotics Club in collaboration with IEEE
Calcutta University Student Branch
Won the
1st Prize
in
Research Work Presentation 2023
organized by IEEE Photonics Society Kolkata Chapter,
IEEE APS Kolkata Chapter &
IEEE Calcutta University Student Branch
|
Reviewer
IEEE Access
Biomedical Signal Processing & Control
Food Chemistry
Scientific Reports
Artificial Intelligence in Medicine
Biological Psychology
|
Positions of Responsibility
Secretary
IEEE Calcutta University Student Branch
(Nov 2023 – Apr 2025)
|
President
AI & Robotics Club, IEEE CUSB
(Nov 2023 – Apr 2025)
|
Founding Secretary
AI & Robotics Club, IEEE CUSB
(May 2023 – Oct 2023)
|