Gynaecological Disease Detection Using Machine Learning
DOI: https://doi.org/10.61808/jsrt234
Keywords: Gynaecology, Machine Learning, TF-IDF, Disease Detection
Abstract
Gynaecological illnesses are commonly underdiagnosed or misdiagnosed, particularly in low-income countries, because of cultural taboos, lack of information, and limited access to specialised treatment. Left undiagnosed, conditions such as urinary tract infections (UTIs) and polycystic ovary syndrome (PCOS, also called PCOD) can harm women's physical, emotional, and reproductive health. Traditional diagnostic techniques are often invasive, time-consuming, and dependent on in-person referrals, which makes them hard to access for women in remote and underprivileged areas. This research proposes an AI-driven, non-invasive, and widely accessible method for the early prediction of gynaecological disorders from user-reported symptoms. The system accepts free-form English symptom descriptions via voice or text, making it user-friendly and inclusive. Symptom descriptions are cleaned, normalised, and standardised using NLP methods including tokenisation, lemmatisation, part-of-speech tagging, and spell correction. The Term Frequency-Inverse Document Frequency (TF-IDF) method then converts the processed symptoms into numerical feature vectors that weight each term by its relative importance across the dataset. A Multinomial Naive Bayes classifier is trained on a balanced, medically vetted dataset of symptom-disease-treatment mappings. Given the symptoms entered, the model predicts the most probable illness with 87% accuracy and high precision, recall, and F1-score, distinguishing UTI from PCOS. The system is delivered through Streamlit, a lightweight web application framework, giving users an interactive, intuitive, real-time prediction experience. The predicted condition is then used to suggest medically accepted treatments, giving users immediate, actionable recommendations while emphasising the importance of consulting a medical professional. This project combines AI with user-centric design to make women's healthcare more accessible. Though currently limited to English input and a small number of disorders, the system lays the groundwork for scalable, multilingual, and comprehensive gynaecological health diagnosis.
Future work will expand the disease database, add multilingual support (including Hindi and Tamil), introduce user profiles and history tracking, deploy cloud-based infrastructure for scalability, and add conversational chatbot features for patient interaction.
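
The core of the pipeline described in the abstract (TF-IDF features fed to a Multinomial Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, replaces the full preprocessing chain (lemmatisation, part-of-speech tagging, spell correction) with simple lowercase tokenisation, and trains on six made-up symptom sentences rather than the paper's dataset. The names `TRAIN`, `tokenize`, `fit_idf`, `tfidf`, `train_nb`, and `predict` are hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter

# Hypothetical toy data, invented for illustration only.
TRAIN = [
    ("burning sensation while urinating and frequent urination", "UTI"),
    ("pain in lower abdomen with cloudy urine", "UTI"),
    ("frequent urge to urinate with burning pain", "UTI"),
    ("irregular periods and weight gain", "PCOS"),
    ("excess facial hair acne and irregular periods", "PCOS"),
    ("missed periods with weight gain and acne", "PCOS"),
]

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (stand-in for full NLP cleaning)."""
    return re.findall(r"[a-z]+", text.lower())

def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(tokens, idf):
    """Return an L2-normalised TF-IDF vector as a dict; unknown terms are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def train_nb(samples, alpha=1.0):
    """Fit Multinomial Naive Bayes on TF-IDF weights with additive smoothing alpha."""
    docs = [tokenize(text) for text, _ in samples]
    idf = fit_idf(docs)
    labels = [label for _, label in samples]
    class_counts = Counter(labels)
    weights = {label: Counter() for label in class_counts}
    for tokens, label in zip(docs, labels):
        for term, value in tfidf(tokens, idf).items():
            weights[label][term] += value
    model = {"idf": idf, "log_prior": {}, "log_lik": {}}
    for label, count in class_counts.items():
        total = sum(weights[label].values()) + alpha * len(idf)
        model["log_prior"][label] = math.log(count / len(samples))
        model["log_lik"][label] = {
            term: math.log((weights[label].get(term, 0.0) + alpha) / total)
            for term in idf
        }
    return model

def predict(model, text):
    """Return the label with the highest log-posterior score for the given text."""
    vec = tfidf(tokenize(text), model["idf"])
    scores = {
        label: prior + sum(v * model["log_lik"][label][t] for t, v in vec.items())
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)

model = train_nb(TRAIN)
print(predict(model, "burning pain when I urinate"))
print(predict(model, "irregular periods and acne"))
```

On this toy data the first query shares terms only with the UTI examples and the second mostly with the PCOS examples, so the two classes separate cleanly. A real deployment would need the vetted dataset and the full preprocessing steps described above, and the accuracy figure reported in the abstract cannot be inferred from a sketch of this size.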