A Large-scale Video Text Dataset
DAVAR LAB
[Download (Please email us [See Contact])]
Introduction
Here we release a large-scale video text dataset named LSVTD. Research on video scene text remains relatively unpopular despite its promising application prospects. Existing video scene text datasets are limited in both the number of videos and the range of scenarios, which may hold back research on video scene text spotting. We therefore collected and annotated LSVTD, which contains 129 scene videos acquired from 21 typical real-life scenarios.
The dataset contains 129 video clips (ranging from several seconds to over 1 minute long) from 21 real-life scenes. It extends the LSVTD dataset by adding 15 videos for the 'harbor surveillance' scenario and 14 videos for the 'train watch' scenario, with the aim of addressing the video text spotting problem in industrial transportation applications.
The main characteristics of LSVTD are described in detail as follows:
Dataset released
Contact
If you have any questions about the dataset, please contact Jing Lu or Zhanzhan Cheng (lujing6kh@163.com or 11821104@zju.edu.cn).
Terms of Use
Change Log
Recommended Citation
If you find this dataset helpful to your research, please feel free to cite us:
@inproceedings{cheng2019you,
  title={You Only Recognize Once: Towards Fast Video Text Spotting},
  author={Cheng, Zhanzhan and Lu, Jing and Niu, Yi and Pu, Shiliang and Wu, Fei and Zhou, Shuigeng},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia},
  pages={855--863},
  year={2019},
  organization={ACM}
}