Multimodal AI for clothing assistive solutions for the visually impaired

dc.contributor.author: Kathure, B. M.
dc.date.accessioned: 2026-04-13T09:39:05Z
dc.date.issued: 2025
dc.description: Full-text thesis
dc.description.abstract: This study presents an Artificial Intelligence (AI)-powered Image-to-Text-to-Speech (ITTS) system to enhance accessibility for visually impaired individuals in the clothing domain. Using the DeepFashion2 dataset, the Bootstrapping Language-Image Pre-training (BLIP) model generated enriched captions that integrate metadata such as clothing scale, viewpoint, and category. These enriched captions were synthesized into audio using Google Text-to-Speech (gTTS), offering an accessible and descriptive experience. The system’s performance was evaluated under zero-shot and fine-tuned settings. Fine-tuning improved Bilingual Evaluation Understudy (BLEU)-1 (from 0.09 to 0.19), BLEU-2 (from 0.04 to 0.07), BLEU-3 (from 0.02 to 0.04), and Metric for Evaluation of Translation with Explicit Ordering (METEOR) (from 0.09 to 0.13), while Recall-Oriented Understudy for Gisting Evaluation (ROUGE-L) remained stable at 0.16. Although Consensus-based Image Description Evaluation (CIDEr) scores remained at 0.0, the fine-tuned model produced contextually rich and descriptive captions owing to metadata integration. This study highlights the potential of multimodal AI systems to address accessibility challenges, providing a solution to empower visually impaired users and laying the groundwork for future innovations in inclusive design. Keywords: Multimodal AI, Assistive Reading, Digital Accessibility, Fashion Content, Image-to-Speech, Inclusive Design.
dc.identifier.citation: Kathure, B. M. (2025). Multimodal AI for clothing assistive solutions for the visually impaired [Strathmore University]. https://hdl.handle.net/11071/16378
dc.identifier.uri: https://hdl.handle.net/11071/16378
dc.language.iso: en_US
dc.publisher: Strathmore University
dc.title: Multimodal AI for clothing assistive solutions for the visually impaired
dc.type: Thesis
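
The abstract describes two steps that a short sketch may help make concrete: enriching a caption with clothing metadata, and scoring a caption against a reference by n-gram overlap. The Python below is an illustration only, not the thesis's implementation. The function names, the caption template, and the sample metadata values are assumptions, and `clipped_precision` computes only the modified n-gram precision that BLEU builds on, without the brevity penalty or the geometric mean over n-gram orders.

```python
from collections import Counter


def enrich_caption(base_caption, scale, viewpoint, category):
    """Append metadata fields to a base caption (hypothetical template)."""
    return (f"{base_caption.rstrip('.')}. "
            f"Category: {category}. Scale: {scale}. Viewpoint: {viewpoint}.")


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Modified n-gram precision: candidate n-gram counts are clipped
    to their counts in the reference before dividing by the total."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Illustrative values only; not taken from the thesis or the dataset.
caption = enrich_caption("A woman in a dress.", "large", "frontal",
                         "long sleeve dress")
unigram = clipped_precision("a red dress", "a long red dress", n=1)
bigram = clipped_precision("a red dress", "a long red dress", n=2)

# Speech synthesis would be a separate step, for example with the gTTS
# package (needs network access, so it is left commented out here):
# from gtts import gTTS
# gTTS(text=caption, lang="en").save("caption.mp3")
```

In this toy case every candidate unigram appears in the reference, while only one of the two candidate bigrams does, which shows why higher-order BLEU scores are typically lower than BLEU-1.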

Files

Original bundle

Name: Multimodal AI for clothing assistive solutions for the visually impaired.pdf
Size: 3.61 MB
Format: Adobe Portable Document Format
