Visual Question Answering Using Deep Learning Techniques
Ahmed Mostafa Soliman Radwan
Abstract
This thesis reviews state-of-the-art visual question answering (VQA) algorithms and datasets, describes our work on constructing the first Arabic VQA dataset, and details our approach on that dataset.
This thesis is divided into 7 chapters, as shown below:
Chapter 1
Introduces recent advances in integrating language and vision research to show the motivation for working on the VQA problem. It explains the big picture of the problem, then gives an overview of the work done in this thesis.
Chapter 2
Gives an overview of the different categories of datasets used to benchmark VQA algorithms, the methodology used to construct those datasets, and state-of-the-art approaches to the problem.
Chapter 3
Presents the basic theoretical foundations on which our work is based. It explains
Other data
| Field | Value |
|---|---|
| Title | Visual Question Answering Using Deep Learning Techniques |
| Other Titles | إجابة الأسئلة حول المحتوى المرئى للصورة باستخدام تقنيات التعلم العميق |
| Authors | Ahmed Mostafa Soliman Radwan |
| Issue Date | 2021 |
Attached Files
| File | Size | Format | |
|---|---|---|---|
| BB11804.pdf | 486.52 kB | Adobe PDF | View/Open |
Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.