Providing an improved method for the implementation of Hash Join algorithm within the framework of Map Reduce
Ebrahim Azhdari Pour1, Elham Ghaffari2, Somayeh Kargaran3
Processing queries over massive databases is one of the most important concerns of today's information systems. Google's parallel processing system, Map Reduce, and its open-source implementation, Hadoop, can spread a job across many processing units that execute concurrently. Despite their relatively high performance, their processing methods still need improvement when dealing with very large databases. SQL is a well-known and powerful query language for relational databases, which store and maintain large amounts of data and provide access to the information required for processing. Such databases offer an operation called Hash Join, which is one way of implementing a join between the tables of a database. In this study, a method is proposed for executing Hash Join in Map Reduce that increases the speed of joining the tables and producing the outputs. Compared with the usual implementation of Hash Join on the Map Reduce system, the proposed approach achieves an improvement of approximately 30 percent in running the system.
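For readers unfamiliar with the baseline, the following is a minimal single-process sketch of how a hash-partitioned join is commonly expressed in the map, shuffle, and reduce style. It is not the method proposed in this study, and it does not use the Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `hash_join`), the table tags "L" and "R", and the tuple-based row layout are all assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(table_name, rows, key_index):
    """Emit (join_key, (table_name, row)) pairs, as a mapper would."""
    for row in rows:
        yield row[key_index], (table_name, row)


def shuffle(pairs):
    """Group mapped pairs by join key in a hash table (a dict),
    imitating the shuffle step that routes equal keys to one reducer."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values, build_table):
    """For one join key, pair every row of the build table with every
    row of the other table that arrived under the same key."""
    build_rows = [row for name, row in values if name == build_table]
    probe_rows = [row for name, row in values if name != build_table]
    for build_row in build_rows:
        for probe_row in probe_rows:
            yield build_row + probe_row


def hash_join(left, right, left_key, right_key):
    """Inner equi-join of two lists of tuples on the given column indexes."""
    pairs = list(map_phase("L", left, left_key))
    pairs += list(map_phase("R", right, right_key))
    joined = []
    for values in shuffle(pairs).values():
        joined.extend(reduce_phase(values, "L"))
    return joined


if __name__ == "__main__":
    # Hypothetical sample data: customers(id, name) and orders(id, customer_id).
    customers = [(1, "Ann"), (2, "Bob")]
    orders = [(10, 1), (11, 1), (12, 3)]
    for joined_row in hash_join(customers, orders, 0, 1):
        print(joined_row)
```

In this sketch every row of both tables passes through the shuffle step, which is generally the expensive part of such a join on a real cluster, since that data would travel over the network.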