Multimodal AI for clothing assistive solutions for the visually impaired

Kathure, B. M.

dc.contributor.author	Kathure, B. M.
dc.date.accessioned	2026-04-13T09:39:05Z
dc.date.issued	2025
dc.description	Full-text thesis
dc.description.abstract	This study presents an Artificial Intelligence (AI)-powered Image-to-Text-to-Speech (ITTS) system that improves accessibility for visually impaired individuals in the clothing domain. Using the DeepFashion2 dataset, the Bootstrapped Language Image Pretraining (BLIP) model generated enriched captions that integrate metadata such as clothing scale, viewpoint, and category. These enriched captions were synthesized into audio with Google Text-to-Speech (gTTS), giving users an accessible and descriptive experience. The system was evaluated under zero-shot and fine-tuned settings. Fine-tuning improved Bilingual Evaluation Understudy (BLEU)-1 from 0.09 to 0.19, BLEU-2 from 0.04 to 0.07, BLEU-3 from 0.02 to 0.04, and Metric for Evaluation of Translation with Explicit Ordering (METEOR) from 0.09 to 0.13, while Recall-Oriented Understudy for Gisting Evaluation (ROUGE-L) remained stable at 0.16. Although Consensus-based Image Description Evaluation (CIDEr) scores remained at 0.0, the fine-tuned model produced contextually richer and more descriptive captions because of the metadata integration. The study highlights the potential of multimodal AI systems to address accessibility challenges, offering a solution that empowers visually impaired users and lays the groundwork for future innovations in inclusive design. Keywords: Multimodal AI, Assistive Reading, Digital Accessibility, Fashion Content, Image-to-Speech, Inclusive Design.
dc.identifier.citation	Kathure, B. M. (2025). Multimodal AI for clothing assistive solutions for the visually impaired [Strathmore University]. https://hdl.handle.net/11071/16378
dc.identifier.uri	https://hdl.handle.net/11071/16378
dc.language.iso	en_US
dc.publisher	Strathmore University
dc.title	Multimodal AI for clothing assistive solutions for the visually impaired
dc.type	Thesis
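
The abstract reports BLEU-1 to BLEU-3 scores for generated captions. As a rough illustration of what those scores measure, the sketch below computes sentence-level BLEU against a single reference using clipped n-gram precision and a brevity penalty. It is not the thesis's evaluation code: the function names, the whitespace tokenization, the absence of smoothing, and the example captions are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=1):
    """Sentence-level BLEU, single reference, uniform weights, no smoothing.

    Hypothetical helper: tokenizes on whitespace after lowercasing.
    """
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: candidates shorter than the reference are penalized.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


# Made-up captions, not taken from DeepFashion2 or the thesis.
generated = "a red dress"
reference = "a blue dress shown"
bleu_1 = bleu(generated, reference, max_n=1)
bleu_2 = bleu(generated, reference, max_n=2)
```

In this toy pair, two of the three generated words appear in the reference but no bigram matches, so the unsmoothed BLEU-2 is zero while BLEU-1 is not. Short captions often behave this way, which is one reason higher-order BLEU scores tend to be much lower than BLEU-1.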

Files

Original bundle

Name: Multimodal AI for clothing assistive solutions for the visually impaired.pdf
Size: 3.61 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections

MSc. DSA Theses and Dissertations (2025)