Multi-modal video retrieval using Dilated Pyramidal Residual network

An Ngoc Thuy La; Dat Phuoc Nguyen; Nhut Minh Pham; Quan Hai Vu

doi:10.32508/stdjns.v2i5.789


Open Access

Abstract

Pyramidal Residual Networks have achieved high accuracy in image classification tasks. However, no previous work has applied this model to sequence recognition tasks. We present how to extend its architecture to form the Dilated Pyramidal Residual Network (DPRN) for this long-standing research topic, and we evaluate it on the problems of automatic speech recognition and optical character recognition. Together, the two recognizers form a multi-modal video retrieval framework for Vietnamese broadcast news. Experiments were conducted on caption images and speech frames extracted from VTV broadcast videos. The results show that DPRN is not only end-to-end trainable but also performs well in sequence recognition tasks.
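
The abstract does not give the layer configuration of DPRN, so the following is only an illustrative sketch of the two ideas the name combines: a gradual, additive increase in channel width across residual blocks (the pyramidal part) and dilated convolutions that enlarge the temporal context seen by each output frame (the dilated part). The function names, the base width of 16, the widening factor of 48, the kernel size of 3, and the dilation schedule 1, 2, 4, 8 are all assumptions chosen for illustration, not values taken from the paper.

```python
def pyramidal_widths(base, alpha, num_blocks):
    """Channel width after each block when widths grow additively.

    Hypothetical helper: block k (1-based) has floor(base + alpha * k / num_blocks)
    channels, so the width rises in small equal steps instead of doubling.
    """
    return [int(base + alpha * k / num_blocks) for k in range(1, num_blocks + 1)]


def receptive_field(kernel_size, dilations):
    """Receptive field, in input frames, of stacked stride-1 dilated 1-D convolutions.

    Each layer with dilation d extends the receptive field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Assumed configuration, for illustration only.
widths = pyramidal_widths(base=16, alpha=48, num_blocks=4)
dilated_rf = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
plain_rf = receptive_field(kernel_size=3, dilations=[1, 1, 1, 1])

print("channel widths per block:", widths)
print("receptive field with dilation:", dilated_rf)
print("receptive field without dilation:", plain_rf)
```

Under these assumed settings, four dilated layers cover a much longer span of the input sequence than four undilated layers of the same kernel size, which is the usual motivation for dilation in speech and text-line recognition.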

Authors' Affiliations

An Ngoc Thuy La

University of Science, VNU-HCM

Dat Phuoc Nguyen

University of Science, VNU-HCM

Nhut Minh Pham

University of Science, VNU-HCM

Quan Hai Vu

University of Science, VNU-HCM

Article Details

Issue: Vol 2 No 5 (2018)

Page No.: 138-143

Published: Jul 2, 2019

Section: Original Research

DOI: https://doi.org/10.32508/stdjns.v2i5.789

Copyright: The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

How to Cite

La, A., Nguyen, D., Pham, N., & Vu, Q. (2019). Multi-modal video retrieval using Dilated Pyramidal Residual network. Science & Technology Development Journal: Natural Sciences, 2(5), 138-143. https://doi.org/10.32508/stdjns.v2i5.789

Science & Technology Development Journal: NATURAL SCIENCES

An official journal of University of Science, Viet Nam National University Ho Chi Minh City, Viet Nam

Science & Technology Development Journal: Natural Sciences (STDJNS, ISSN 2588-106X) is published by Viet Nam National University Ho Chi Minh City, Ho Chi Minh City, Viet Nam.
