Oculight: AI Based System for Visual Assistance to Blind and Visually Impaired People

Authors

  • Shreya Tandale, Student, Department of Computer Science, Vishwaniketan's Institute of Management Entrepreneurship and Engineering Technology (ViMEET), Khalapur, Raigad, Maharashtra, India
  • Shyam Pawar, Student, Department of Computer Science, Vishwaniketan's Institute of Management Entrepreneurship and Engineering Technology (ViMEET), Khalapur, Raigad, Maharashtra, India
  • Sanjivani Singh, Student, Department of Computer Science, Vishwaniketan's Institute of Management Entrepreneurship and Engineering Technology (ViMEET), Khalapur, Raigad, Maharashtra, India
  • Sukriti Singh, Student, Department of Computer Science, Vishwaniketan's Institute of Management Entrepreneurship and Engineering Technology (ViMEET), Khalapur, Raigad, Maharashtra, India
  • Prajakta Jadhav, Professor, Department of Computer Science, Vishwaniketan's Institute of Management Entrepreneurship and Engineering Technology (ViMEET), Khalapur, Raigad, Maharashtra, India

Keywords

AI, Assistive technologies, captioning model, image understanding, blind people, Android device, scene detection, multimodal interfaces, human-computer interaction

Abstract

This research introduces a new captioning model that combines an image model and a caption model to produce textual descriptions of images. The deep learning architecture pairs a convolutional neural network (CNN), which extracts meaningful features from the input image, with a long short-term memory (LSTM) recurrent neural network (RNN), which uses those features to generate the description. The system is implemented on an Android device, which makes it accessible and user-friendly, and the integration of text-to-speech technology further improves human-computer interaction. Together, these AI-based components offer a promising way to improve the quality of life and independence of blind people. Future development could extend the system to a wider range of visual tasks and environments and improve the efficiency and accuracy of the model. With further refinement, this technology could have a profound impact on the lives of blind people, improving their overall well-being and sense of independence.
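
As a rough illustration of the encoder-decoder flow described in the abstract, the following Python sketch encodes an image into a feature vector and then generates words one at a time with an LSTM cell using greedy decoding. It is not the authors' implementation: the weights are random and untrained, so the output is not a meaningful caption, and the toy vocabulary, the layer sizes, and the function names (encode_image, lstm_step, generate_caption) are all assumptions made for illustration. A single 3x3 convolution with global average pooling stands in for a full CNN.

```python
import math
import random

# Toy vocabulary and sizes: assumptions for illustration only.
VOCAB = ["<start>", "<end>", "a", "person", "dog", "on", "street"]
HIDDEN = 4  # LSTM state size; also the number of conv filters here

rng = random.Random(0)  # fixed seed; weights are random, NOT trained


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


FILTERS = [rand_matrix(3, 3) for _ in range(HIDDEN)]  # 3x3 conv kernels
EMBED = rand_matrix(len(VOCAB), HIDDEN)               # word embeddings
W_LSTM = rand_matrix(4 * HIDDEN, 2 * HIDDEN)          # gate weights over [x; h]
B_LSTM = [0.0] * (4 * HIDDEN)
W_OUT = rand_matrix(len(VOCAB), HIDDEN)               # hidden state -> word scores


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def encode_image(image):
    """Tiny CNN stand-in: 3x3 convolution, ReLU, global average pooling."""
    rows, cols = len(image), len(image[0])
    features = []
    for kernel in FILTERS:
        total, count = 0.0, 0
        for r in range(rows - 2):
            for c in range(cols - 2):
                act = sum(kernel[i][j] * image[r + i][c + j]
                          for i in range(3) for j in range(3))
                total += max(0.0, act)  # ReLU
                count += 1
        features.append(total / count)
    return features


def lstm_step(x, h, c):
    """One LSTM cell update: input, forget, output gates and candidate state."""
    z = [a + b for a, b in zip(matvec(W_LSTM, x + h), B_LSTM)]
    n = HIDDEN
    i_gate = [sigmoid(v) for v in z[0:n]]
    f_gate = [sigmoid(v) for v in z[n:2 * n]]
    o_gate = [sigmoid(v) for v in z[2 * n:3 * n]]
    cand = [math.tanh(v) for v in z[3 * n:4 * n]]
    c_new = [f * cp + i * g for f, cp, i, g in zip(f_gate, c, i_gate, cand)]
    h_new = [o * math.tanh(cn) for o, cn in zip(o_gate, c_new)]
    return h_new, c_new


def generate_caption(image, max_len=10):
    """Greedy decoding: image features seed the LSTM state, then words follow."""
    h = encode_image(image)
    c = [0.0] * HIDDEN
    token = VOCAB.index("<start>")
    words = []
    for _ in range(max_len):
        h, c = lstm_step(EMBED[token], h, c)
        scores = matvec(W_OUT, h)
        token = max(range(len(VOCAB)), key=scores.__getitem__)  # argmax
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return words


# Fake 8x8 grayscale image with values in [0, 1].
image = [[(r * c) % 7 / 6.0 for c in range(8)] for r in range(8)]
caption = generate_caption(image)
print(" ".join(caption))
```

In a trained system of the kind the abstract describes, the resulting word sequence would then be passed to a text-to-speech component so the description can be spoken to the user.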

Published

2023-04-21

Section

Review Article