Multimodal AI for clothing assistive solutions for the visually impaired
| dc.contributor.author | Kathure, B. M. | |
| dc.date.accessioned | 2026-04-13T09:39:05Z | |
| dc.date.issued | 2025 | |
| dc.description | Full-text thesis | |
| dc.description.abstract | This study presents an Artificial Intelligence (AI) powered Image-to-Text-to-Speech (ITTS) system to enhance accessibility for visually impaired individuals in the clothing domain. Using the DeepFashion2 dataset, the Bootstrapping Language-Image Pre-training (BLIP) model generated enriched captions that integrate metadata such as clothing scale, viewpoint, and category. These enriched captions were synthesized into audio using Google Text-to-Speech (gTTS), offering an accessible and descriptive experience. The system’s performance was evaluated under zero-shot and fine-tuned settings. Fine-tuning improved Bilingual Evaluation Understudy (BLEU)-1 (from 0.09 to 0.19), BLEU-2 (from 0.04 to 0.07), and BLEU-3 (from 0.02 to 0.04), as well as Metric for Evaluation of Translation with Explicit Ordering (METEOR) (from 0.09 to 0.13), while Recall-Oriented Understudy for Gisting Evaluation (ROUGE-L) remained stable at 0.16. Although Consensus-based Image Description Evaluation (CIDEr) scores remained at 0.0, the fine-tuned model generated contextually rich and descriptive captions due to metadata integration. This study highlights the potential of multimodal AI systems to address accessibility challenges, providing a solution to empower visually impaired users and laying the groundwork for future innovations in inclusive design. Keywords: Multimodal AI, Assistive Reading, Digital Accessibility, Fashion Content, Image-to-Speech, Inclusive Design. | |
| dc.identifier.citation | Kathure, B. M. (2025). Multimodal AI for clothing assistive solutions for the visually impaired [Strathmore University]. https://hdl.handle.net/11071/16378 | |
| dc.identifier.uri | https://hdl.handle.net/11071/16378 | |
| dc.language.iso | en_US | |
| dc.publisher | Strathmore University | |
| dc.title | Multimodal AI for clothing assistive solutions for the visually impaired | |
| dc.type | Thesis |
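
The BLEU-1 to BLEU-3 scores reported in the abstract measure n-gram overlap between a generated caption and a reference caption. The record does not say which BLEU implementation or tokenization the thesis used, so the following is only a minimal sketch of cumulative sentence-level BLEU with a single reference, uniform weights, and no smoothing. The function names and the example captions are hypothetical and are not taken from the thesis or from DeepFashion2.

```python
import math
from collections import Counter


def clipped_precision(cand, ref, n):
    """Clipped n-gram precision of a tokenized candidate against one reference."""
    cand_counts = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_counts = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


def bleu(candidate, reference, max_n=1):
    """Cumulative BLEU up to max_n (whitespace tokens, single reference, no smoothing)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        p = clipped_precision(cand, ref, n)
        if p == 0.0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_sum += math.log(p) / max_n
    # Brevity penalty: captions shorter than the reference are penalized.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_sum)


# Hypothetical captions, for illustration only.
reference = "a short sleeve top shown from the front"
candidate = "a short sleeve shirt"
bleu_1 = bleu(candidate, reference, max_n=1)
bleu_2 = bleu(candidate, reference, max_n=2)
```

In this sketch a short caption scores low even when most of its words match, because the brevity penalty scales the precision down. That is one way short generated captions can receive low BLEU scores against longer, metadata-enriched references.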
Files
Original bundle
- Name: Multimodal AI for clothing assistive solutions for the visually impaired.pdf
- Size: 3.61 MB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed upon to submission