Bipin Saha
Bipin Saha (বিপিন সাহা in Bangla/বাংলা); origin of name: Bipin/বিপিন/विपिन (Sanskrit), meaning "Vast Natural Grove"
Autonomous Vehicle/AI-ML Researcher & Engineer
Exploring the frontiers of autonomy by uniting perception and intelligence.
About Me
Hi, I'm Bipin Saha (please call me Bipin), originally from Bangladesh, and currently living in Greensboro, North Carolina. From developing real-world AI agents as a Machine Learning Engineer to researching the future of mobility, my journey has been driven by a passion for solving complex problems through Computer Vision and Robotics.
I am currently a PhD student in Electrical and Computer Engineering at North Carolina A&T State University, where I conduct my doctoral research at the ACCESS Lab.
I earned my B.Sc. in Electrical and Electronic Engineering from the University of Rajshahi in September 2022. During my undergraduate studies, I worked on robust appearance-based eye-gaze classification under natural head poses, focusing on real-world variability and robustness. I also contributed to the development of a comprehensive vehicle-detection dataset designed for challenging and underrepresented regions, evaluating its effectiveness using transfer learning techniques.
Before starting my PhD, I worked as a Machine Learning Engineer with the Industry 4.0 team at Business Automation Limited, where I built and deployed intelligent systems involving Computer Vision, Vision–Language Models, Large Language Models, and vision-enabled robotic platforms. This industry experience strongly shaped my research perspective, emphasizing the importance of scalable, explainable, and deployable AI systems.
My current research focuses on visual navigation and camera–LiDAR fusion for autonomous systems, with the goal of making self-driving technologies more perceptive, robust, and explainable. I am particularly interested in self-supervised perception, multimodal sensor fusion, and reasoning-driven robotic decision-making. Beyond research, I am deeply passionate about STEM education and aspire to pursue a long-term career in academic research and teaching, contributing to intelligent transportation systems that will shape the next century.
Recent Updates: 😁 😁 😁
1. IEEE QPAIN 2026 Acceptance
"Comparative Evaluation of CNN, Transformer, and Vision-Language Approaches for Vehicle Detection in Complex Scenarios"
February 15, 2026
2. Joined ACCESS Lab as Graduate Research Assistant
Started PhD in Electrical and Computer Engineering at North Carolina A&T State University
August 20, 2025
- IEEE QPAIN 2026 paper acceptance on robust vehicle detection across CNN, Transformer, and VLM methods.
- 43.77% simulated PCE in dual-heterojunction solar-cell modeling research published in AIP Advances.
- ~98% accuracy in appearance-based eye-gaze estimation across 13 directional targets.
Selected Publications
Accepted in IEEE QPAIN 2026.
Published in IEEE QPAIN 2025.
Selected Projects
OCR + layout analysis + LLM structuring for clinical text extraction.
Retrieval-augmented assistant for technical paper exploration.
End-to-end steering prediction with speed control loop.
Technical Writeups
Education
North Carolina A&T State University, Greensboro, NC, USA
Fall 2025 - Present
Focus: Visual Navigation and Camera-LiDAR Fusion for Autonomous Systems
Advisor – Dr. Ali Karimoddini
University of Rajshahi, Rajshahi-6205, Bangladesh
January 2017 - August 2022
• Research focus: Robust appearance-based eye-gaze classification, vehicle detection in underrepresented regions
• Developed comprehensive datasets and evaluated transfer learning techniques for real-world robustness
Relevant Coursework: Control System, Digital Signal Processing, Microprocessors & Embedded Systems, Power Electronics, Computer Networks
Experiences
Professional Experiences
August 2025 - Present (Greensboro, NC)
• Implemented unsupervised LiDAR 3D object detection and scene understanding using DBSCAN and K-Means with PCA, improving autonomous vehicle perception and safety in unstructured environments.
• Reviewing state-of-the-art BEVFusion and self-supervised learning research and reimplementing existing methods to build a foundation for advancing camera–LiDAR fusion in autonomous vehicle perception.
Advisor – Dr. Ali Karimoddini
December 2023 - February 2025 (Rajshahi, BD, Full-Time)
• Digitized handwritten prescriptions by introducing efficient layout segmentation and OCR (CER 0.0951) and fine-tuning LLMs for output formation (IDs, diagnoses, medications, frequencies, and medical histories); optimized training with a custom loss function, cut inference time to around 2.28 seconds, and reduced human workload by 52.68%, impacting over 64k patients.
• Implemented a Bangla law consultancy chatbot using Llama-3 with an agentic RAG system and TF-IDF indexing; reduced the training memory footprint by 3.2x through QLoRA, QDoRA, and PEFT techniques and achieved 20% faster inference.
• Developed a multi-agent code suggestion and generation framework that leverages LangGraph’s state memory, DeepSeek-R1, and a Milvus dataset based on analyzing GitLab issue titles and descriptions.
• Built text summarization for BIDA and EBS employee behavioral logs using Mistral, Pegasus, and ML models, improving efficacy in task-based accomplishment.
Projects Managed: Hajj Medical Prescriptions, Mutation
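For readers unfamiliar with the CER figure quoted above, the metric can be sketched as follows. This is a minimal illustration assuming the common definition (character-level edit distance divided by reference length); the function names and sample strings are hypothetical, and the source does not state which tooling produced the reported value.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_char != hyp_char),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical OCR output with one missing character.
    print(cer("paracetamol", "paracetmol"))
```

Under this definition, a CER near 0.1 corresponds to roughly one character error per ten reference characters.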
January 2023 - November 2023 (Dhaka, BD, Full-Time)
• Led the design and development of an object detection pipeline using YOLOv5 and YOLOv8 to recognize over 20 grocery products, achieving an mAP50 of 0.955 and mAP50:95 of 0.872.
• Deployed production-ready models for reliable detection and recognition of diverse grocery items, following best practices for production environments.
• Led the design and development of a 2R robotic arm driven by inverse kinematics for navigation within a 2D plane.
• Designed and tuned Proportional-Integral-Derivative (PID) control algorithms to improve control accuracy, movement precision, and slot navigation for robotic systems.
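The inverse kinematics of a 2R planar arm has a well-known closed form, sketched below. This is a textbook derivation and not the code used in the role above; the link lengths, function names, and the `flip_elbow` flag are illustrative assumptions.

```python
import math


def ik_2r(x: float, y: float, l1: float, l2: float, flip_elbow: bool = False):
    """Closed-form IK for a planar two-link arm; returns (theta1, theta2) in radians."""
    cos_t2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0 + 1e-9:
        raise ValueError("target is outside the reachable workspace")
    cos_t2 = max(-1.0, min(1.0, cos_t2))  # guard against rounding at the boundary
    sin_t2 = math.sqrt(1.0 - cos_t2 * cos_t2)
    if flip_elbow:
        sin_t2 = -sin_t2  # select the mirrored elbow configuration
    theta2 = math.atan2(sin_t2, cos_t2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * sin_t2, l1 + l2 * cos_t2)
    return theta1, theta2


def fk_2r(theta1: float, theta2: float, l1: float, l2: float):
    """Forward kinematics, used here only to check the IK solution."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


if __name__ == "__main__":
    # Hypothetical link lengths (1.0 and 0.8) and target point (1.2, 0.5).
    angles = ik_2r(1.2, 0.5, 1.0, 0.8)
    print(angles, fk_2r(*angles, 1.0, 0.8))
```

Any reachable target away from the workspace boundary has two solutions (the two elbow configurations); which one suits a given task depends on joint limits and obstacles.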
Undergraduate Research Experience
October 2021 - April 2022
Worked under the supervision of Dr. Jaker Hossain
• Designed and simulated a dual-heterojunction n-ZnSe/p-CuSbSe2/p+-CGS solar device using Solar Cell Capacitance Simulator-1D software.
• Investigated the impact of absorber layer thickness, doping levels, and defect densities on device performance.
• Achieved 43.77% power conversion efficiency (PCE) with the dual-heterojunction structure, compared to 27.74% for single heterojunction, approaching the Shockley–Queisser efficiency limit through back surface field (BSF) assisted upconversion.
• Published results in AIP Advances 13, 025255 (2023) with Impact Factor 1.6.
2020 - 2021
Worked under the supervision of Shaikh Khaled Mostaque, co-supervised by Dr. Johirul Islam
• Developed a real-time eye gaze tracking interface using active appearance methods with standard web/smartphone cameras in unconstrained environments accounting for natural head movements.
• Performed feature extraction by separating sclera pixel areas from masked eye images and evaluated performance using decision tree, random forest, and extra tree classifiers.
• Achieved ~98% accuracy across 13 gaze targets using an extra-trees classifier, with applications in human-computer interaction, robotics, and medical science.
• Published in 2021 International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering (IC4ME2)
Volunteer Experiences
November 2020 - October 2022
• Founded and led the IEEE RAS Student Branch Chapter at University of Rajshahi, fostering innovation and education in robotics and automation.
• Promoted scientific and technological knowledge exchange in robotics for advancement of humanity and fundamental research.
July 2018 - December 2020
• Served as Secretary and Publicity Coordinator, promoting IEEE events and international seminars.
• Developed soft and hard skills through engagement with fellow IEEE members and professional networking.
May 2020 - July 2021
• Selected as IEEE Brand Ambassador to enhance brand recognition and create a consistent IEEE brand experience.
• Educated others about IEEE brand standards and raised awareness among professional and volunteer communities.
August 2018 - August 2019
• Developed technological and enriched educational materials for primary education students.
• Led 6 human training programs including "Skill Journey" and "In Search of Future Scientist" initiatives.
September 2018 - June 2019
• Served as freelance content writer for Bigganbortika, a Bengali science and technology news portal.
• Published quality-assured articles promoting science education in Bengali across digital platforms.
Publications
For details, visit Google Scholar.
Bipin Saha, TK Taton, MH Islam, MJ Islam, SK Mostaque
Published in - 2025 International Conference on Quantum Photonics, Artificial Intelligence (July 31 - August 2, 2025)
TK Taton, Bipin Saha, MJ Islam, SK Mostaque
Published in - Discover Analytics 3 (1), 9
TK Taton, Bipin Saha, A Akter, MJ Islam, SK Mostaque
Published in - 2024 International Conference on Recent Progresses in Science, Engineering and Technology
U Rumman, A Ferdousi, Bipin Saha, MS Hossain, MJ Islam, S Ahmad, et al.
Available on - arXiv preprint arXiv:2409.07484
Bipin Saha, MJ Islam, SK Mostaque, A Bhowmik, TK Taton, MNH Chowdhury, et al.
Available on - arXiv preprint arXiv:2405.12150
Bipin Saha, BK Mondal, SK Mostaque, M Hossain, J Hossain
Published in - AIP Advances 13 (2)
Bipin Saha, MJ Islam, AS Dipto, SK Mostaque
Published in - 2021 International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering (IC4ME2)
Competitions
Best Performances
🥈 1st Runner Up — December 2022, Dhaka
Honorable Mention — September 2020, IEEE Bangladesh Section
Projects
Project page | View on GitHub
PaddleOCR · TrOCR · YOLO · Gemma-7B
The project focuses on digitizing handwritten prescriptions using advanced layout analysis and OCR pipelines. Line-by-line segmentation extracts patient IDs, medication names, dosage, frequency, quantity, and medical history. Optimized OCR models ensure handwriting robustness, while a generative model converts extracted text into structured clinical records for efficient healthcare data management.
Project page | View on GitHub
Gemini-Pro · LLaMA-3 · Semantic RAG
ChatPDF enables conversational interaction with research papers using Retrieval-Augmented Generation. The system retrieves relevant paper segments, summarizes complex ideas, answers technical queries, and can even translate theoretical concepts into executable code to accelerate research understanding and implementation.
Project page | View on GitHub
Imitation Learning · TensorFlow · PID Control
This project implements an end-to-end autonomous driving pipeline using behavioral cloning. A CNN maps camera images directly to steering commands, while a PID controller regulates vehicle speed. Careful tuning of control gains ensures smooth, stable, and safe driving behavior in simulation environments.
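The speed-regulation half of such a pipeline can be sketched with a textbook discrete PID loop. The gains, the throttle clamp, and the toy first-order vehicle model below are assumptions chosen for illustration, not values from the project.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    """Textbook discrete PID controller; gains are illustrative only."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target: float, measured: float, dt: float) -> float:
        error = target - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_speed(target: float = 10.0, steps: int = 400, dt: float = 0.05) -> float:
    """Drive a toy vehicle model toward the target speed and return the final speed."""
    pid = PIDController(kp=0.8, ki=0.4, kd=0.02)
    speed = 0.0
    for _ in range(steps):
        # Clamp the controller output to a throttle/brake command in [-1, 1].
        throttle = max(-1.0, min(1.0, pid.step(target, speed, dt)))
        # Assumed dynamics: acceleration proportional to throttle, minus linear drag.
        speed += (5.0 * throttle - 0.1 * speed) * dt
    return speed


if __name__ == "__main__":
    print(simulate_speed())
```

In this toy setup the integral term is what removes the steady-state error caused by drag; the clamp allows some overshoot from integral windup, which a real controller would typically limit.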
View on GitHub
CoppeliaSim · Lua · Inverse Kinematics
A robotic sorting system using a KUKA manipulator operating over a conveyor belt. Objects are detected using color sensors, picked using a gripper, and transferred to designated locations through inverse kinematics-based motion planning.
View on GitHub
YOLO · Tkinter · Multithreading
A real-time industrial monitoring system for detecting slab bending anomalies using multi-camera feeds. YOLO-based inference identifies defects, while a Tkinter GUI displays camera ID, slab position, timestamps, and confidence scores alongside user authentication and database logging.
View on GitHub
Pegasus · Mistral · ROUGE
Automated summarization of employee work logs using Pegasus. The system generates concise summaries and achieves ROUGE scores of R-1: 0.31 and R-2: 0.08, supporting performance evaluation and managerial decision-making.
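As a rough guide to what the R-1 and R-2 figures measure, ROUGE-N counts overlapping n-grams between a generated summary and a reference. The sketch below assumes whitespace tokenization and no stemming, so it will not reproduce scores from standard ROUGE packages exactly; the function names and sample sentences are hypothetical, and the source does not say whether the reported values are recall or F1.

```python
from collections import Counter


def ngram_counts(tokens: list[str], n: int) -> Counter:
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict[str, float]:
    """ROUGE-N precision, recall, and F1 using clipped n-gram overlap."""
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    summary = "the cat sat"
    reference = "the cat sat on the mat"
    print(rouge_n(summary, reference, n=1))
    print(rouge_n(summary, reference, n=2))
```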
View on GitHub
PyTorch · CNN · LSTM
A Bangla sentiment classification system achieving 90.31% accuracy using CNN-BiLSTM architectures with data augmentation. The model classifies text into positive, negative, and neutral sentiments with strong generalization.
View on GitHub
Scikit-Learn · Dlib · Random Forest
A real-time eye gaze tracking system using standard cameras. Active appearance modeling and ensemble classifiers achieve ~98% accuracy across 13 gaze targets, enabling robust human-computer interaction.
View on GitHub
NumPy · PyGame
A full Python implementation of the Rapidly-Exploring Random Tree algorithm for 2D path planning. The simulation visually demonstrates tree expansion and obstacle-aware navigation.
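The core loop of the algorithm can be sketched without any visualization. This is a minimal, standard-library version with circular obstacles, a 10% goal bias, and sampled edge checks, all of which are assumptions for illustration; the project itself uses NumPy and PyGame and may differ in every detail.

```python
import math
import random

Point = tuple[float, float]
Circle = tuple[float, float, float]  # (center_x, center_y, radius)


def collides(p: Point, obstacles: list[Circle]) -> bool:
    """True if the point lies inside any circular obstacle."""
    return any(math.hypot(p[0] - cx, p[1] - cy) <= r for cx, cy, r in obstacles)


def segment_free(a: Point, b: Point, obstacles: list[Circle], checks: int = 10) -> bool:
    """Approximate edge check: sample evenly spaced points along the segment."""
    for i in range(checks + 1):
        t = i / checks
        p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        if collides(p, obstacles):
            return False
    return True


def rrt(start: Point, goal: Point, obstacles: list[Circle], size: float = 100.0,
        step: float = 5.0, goal_tol: float = 5.0, max_iters: int = 5000, seed: int = 0):
    """Grow a tree from start; return a path ending within goal_tol of goal, or None."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Goal bias: sample the goal itself 10% of the time.
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, size), rng.uniform(0, size))
        near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        d = math.dist(nodes[near], sample)
        if d == 0.0:
            continue
        # Steer from the nearest node toward the sample by at most one step.
        t = min(1.0, step / d)
        new = (nodes[near][0] + t * (sample[0] - nodes[near][0]),
               nodes[near][1] + t * (sample[1] - nodes[near][1]))
        if not segment_free(nodes[near], new, obstacles):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = near
        if math.dist(new, goal) <= goal_tol:
            # Walk parent links back to the root to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


if __name__ == "__main__":
    route = rrt((5.0, 5.0), (95.0, 95.0), [(50.0, 50.0, 15.0)])
    print(None if route is None else len(route))
```

The nearest-node search here is a linear scan, which is fine for a small demo; larger trees usually call for a spatial index such as a k-d tree.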
View on GitHub
Arduino · C++
An autonomous landmine-detecting robot using sonar sensors and a rotating metal detector. The system transmits detection alerts to a base station and supports manual smartphone control.
Certifications
Basic Image Classification with TensorFlow
Coursera
Issued Jun 2020
Computer Vision – Image Basics with OpenCV and Python
Coursera
Issued Jun 2020
Computer Vision – Object Tracking with OpenCV and Python
Coursera
Issued Jun 2020
Electric Power System
Coursera
Issued Jun 2020
Image Classification with CNNs using Keras
Coursera
Issued Jun 2020
RPA Lifecycle: Introduction, Discovery and Design
Coursera
Issued Jun 2020
Robotics: Aerial Robotics
Coursera
Issued Jun 2020
Control of Mobile Robots
Coursera
Issued May 2020