Large Scale Image Registration Utilizing Data-Tunneling in the MapReduce Cluster

Section 1: Publication

Publication Type

Authorship

Roy B, Roy CK, and Schneider KA

Title

Large Scale Image Registration Utilizing Data-Tunneling in the MapReduce Cluster

Year

2022

Publication Outlet

Proceedings of the International Conference on Big Data, IoT, and Machine Learning. Lecture Notes on Data Engineering and Communications Technologies, vol 95, pp. 167-180, Springer, Singapore

DOI

https://doi.org/10.1007/978-981-16-6636-0_14

ISBN

ISSN

Citation

Roy B, Roy CK, and Schneider KA, Large Scale Image Registration Utilizing Data-Tunneling in the MapReduce Cluster, In: Arefin M.S., Kaiser M.S., Bandyopadhyay A., Ahad M.A.R., Ray K. (eds) Proceedings of the International Conference on Big Data, IoT, and Machine Learning. Lecture Notes on Data Engineering and Communications Technologies, vol 95, pp. 167-180, Springer, Singapore. https://doi.org/10.1007/978-981-16-6636-0_14.

Abstract

Image registration applications are computation-intensive, memory-intensive, and communication-intensive. They demand robust error recovery and re-usability of both data and operations, along with performance optimization. With these requirements in mind, we explore programming models that aim to minimize folding operations (such as join and reduce), which are the primary sources of data shuffling, concurrency bugs, and expensive communication in a distributed cluster. In particular, we analyze modular MapReduce execution of an image registration pipeline (IRP) with external and internal (data-tunneling) data flow mechanisms and compare both with the compact model. Experimental analyses on the ComputeCanada cluster with a crop field data set containing 1000 images show that these design options are valuable for large-scale IRPs executed on a MapReduce cluster. Additionally, we present an effectiveness metric for analyzing the impact of a design model on the Big IRP; it combines error-recovery and re-usability metrics with data size and execution time. Our explored design models and their performance analysis can serve as a benchmark for researchers and application developers who deploy large-scale image registration and other image processing tasks.

Plain Language Summary