Paper overview |
This paper first analyzes the causes of the small-file problem in HDFS: (1) large numbers of small files impose a heavy metadata burden on the HDFS NameNode; (2) correlations between small files are not considered during data placement; and (3) no optimization mechanism, such as prefetching, is provided to improve I/O performance. Second, in the context of HDFS, a clear cut-off point between large and small files is determined experimentally, which answers the question of ‘how small is small’. Third, based on correlation features, files are classified into three types: structurally related files, logically related files, and independent files. Finally, building on these three steps, an optimized approach is designed to improve the storage and access efficiency of small files on HDFS: a file merging and prefetching scheme is applied to structurally related small files, while a file grouping and prefetching scheme is used to manage logically related small files. Experimental results demonstrate that the proposed schemes effectively improve the storage and access efficiency of small files compared with native HDFS and a Hadoop file archiving facility. (http://www.sciencedirect.com/science/article/pii/S1084804512001610) |
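The core idea behind file merging is that many small files are packed into one large file, with a small index mapping each logical filename to an (offset, length) pair, so the NameNode tracks one file instead of thousands. The sketch below illustrates that general technique in plain Python; the function names and in-memory layout are illustrative assumptions, not the paper's actual implementation or the HDFS API.

```python
import io

def merge_small_files(files):
    """Pack {name: bytes} into one blob plus an index of (offset, length).

    Illustrative sketch of the file-merging idea: one merged file means
    one NameNode metadata entry instead of one entry per small file.
    """
    index = {}
    buf = io.BytesIO()
    for name, data in files.items():
        index[name] = (buf.tell(), len(data))  # record where this file starts
        buf.write(data)
    return buf.getvalue(), index

def read_file(blob, index, name):
    """Read one logical small file back out of the merged blob."""
    offset, length = index[name]
    return blob[offset:offset + length]

# Example: two small files become one 10-byte blob with a 2-entry index.
blob, idx = merge_small_files({"a.txt": b"hello", "b.txt": b"world"})
print(read_file(blob, idx, "b.txt"))  # b'world'
```

A prefetching layer, as described for structurally and logically related files, would extend this by reading the index entries of correlated neighbors ahead of time so that subsequent accesses hit a local cache instead of the NameNode.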