Massive image retrieval based on Hadoop distribution
WANG Li, CHEN Junfeng
Abstract:
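Traditional content-based image retrieval methods obtain their results through similarity measurement algorithms, and on massive image collections they suffer from low retrieval efficiency and poor accuracy. A massive image retrieval method based on Hadoop distribution is therefore designed. The method performs distributed computation over massive digital image collections on the Hadoop cloud platform: it extracts SURF features from the images, groups similar SURF features with K-Means clustering, and quantizes the image features with the TF-IDF data mining technique. The index module and the search module for the massive image data are then built on the Lucene framework within the Hadoop platform, and the index of the massive image data is constructed from the SURF features of the image supplied by the user, which completes the accurate retrieval of similar images. Experimental results show that the proposed method retrieves images of good quality and achieves high efficiency and accuracy when searching massive image collections.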
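The quantization steps named in the abstract (cluster local descriptors with K-Means, weight the resulting visual words with TF-IDF, then rank by similarity) can be illustrated on a single machine. The following is a minimal sketch and not the authors' implementation: it uses hand-made 2-D points in place of real SURF descriptors, plain cosine similarity in place of a Lucene index, and no Hadoop at all. All function names, the farthest-point seeding, and the smoothed IDF formula are assumptions made for illustration.

```python
import math
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid (visual word) closest to a descriptor."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Lloyd's K-Means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def term_frequencies(descriptors, centroids):
    """Normalized histogram of visual words for one image."""
    counts = Counter(nearest(d, centroids) for d in descriptors)
    total = max(len(descriptors), 1)
    return [counts[w] / total for w in range(len(centroids))]


def inverse_document_frequencies(tf_rows):
    """Smoothed IDF per visual word (one common variant, assumed here)."""
    n, k = len(tf_rows), len(tf_rows[0])
    return [math.log((1 + n) / (1 + sum(1 for row in tf_rows if row[w] > 0))) + 1
            for w in range(k)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query_descriptors, database, k):
    """Return database image names ordered from most to least similar."""
    names = list(database)
    all_points = [d for name in names for d in database[name]]
    centroids = kmeans(all_points, k)
    tf_rows = [term_frequencies(database[name], centroids) for name in names]
    idf = inverse_document_frequencies(tf_rows)

    def weigh(tf):
        return [t * w for t, w in zip(tf, idf)]

    query_vec = weigh(term_frequencies(query_descriptors, centroids))
    scores = {name: cosine(query_vec, weigh(tf)) for name, tf in zip(names, tf_rows)}
    return sorted(names, key=lambda name: scores[name], reverse=True)


# Toy stand-ins for SURF descriptors (real ones are 64- or 128-dimensional).
database = {
    "a": [(0.0, 0.0), (0.5, 0.2), (10.0, 10.0)],
    "b": [(10.0, 10.0), (10.3, 9.8), (20.0, 0.0)],
    "c": [(20.0, 0.0), (19.7, 0.4), (20.2, 0.1)],
}
query = [(0.1, 0.1), (0.2, 0.4), (10.1, 9.9)]
ranking = rank(query, database, k=3)
print(ranking)
```

On this toy data the query shares its visual-word histogram with image "a", so "a" is ranked first. In a distributed setting, descriptor extraction and word assignment are the steps that parallelize naturally across images, while an inverted index would replace the exhaustive cosine comparison used here.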
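Keywords: Hadoop distribution; massive images; SURF features; K-Means clustering; retrieval; data mining
Foundation: Key Special Project of the Beijing Municipal Science and Technology Commission: Research, development, and application demonstration of an intelligent archive management and 3D visualization system for massive data (Z161100001116072)
Authors: WANG Li, CHEN Junfeng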
DOI: 10.16652/j.issn.1004-373x.2018.09.014