ASCF: Apriori Algorithm on Spark Based on Cuckoo Filter Structure
Bana Ahmad Alrahwan
Abstract
The Apriori algorithm is one of the most basic techniques used to discover frequent patterns in a dataset. Apriori is iterative and works sequentially. In each iteration, it generates candidate sets containing all possible combinations of the frequent itemsets produced by the previous iteration, then compares each combination of items with every transaction record. As a result, Apriori is inefficient and becomes computationally more expensive as the data size increases. The rapid growth of data makes it necessary to run data-intensive algorithms in a parallel, distributed environment to achieve acceptable performance. Many approaches have been proposed to address Apriori's major drawbacks, which severely degrade performance as datasets grow larger, a common feature of today's data. This thesis introduces the Apriori Algorithm on Spark based on Cuckoo Filter structure (ASCF). ASCF removes the candidate generation step from the Apriori algorithm to reduce computational complexity and avoid costly comparisons. The proposed algorithm is implemented on Spark, an in-memory distributed processing environment, to reduce processing time. ASCF offers a substantial performance improvement over other Spark-based implementations of the Apriori algorithm.
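To make the cost described in the abstract concrete, the following is a minimal sketch of classic, single-machine Apriori in plain Python. It is not the ASCF algorithm and does not use Spark or a cuckoo filter; it only illustrates the two steps the abstract identifies as expensive, candidate generation and the comparison of every candidate with every transaction. The function name `apriori`, the `min_support` parameter (an absolute count), and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Illustrative classic Apriori only; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune: every (k-1)-subset of a candidate must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Support counting: each candidate is compared with every transaction.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the thesis.
    sample = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, support in sorted(apriori(sample, 2).items(),
                                   key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), support)
```

The support-counting step scans all transactions once per iteration for every candidate, which is the repeated comparison the abstract describes as the main bottleneck on large datasets.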
Other data
| Title | ASCF: Apriori Algorithm on Spark Based on Cuckoo Filter Structure |
| Authors | Bana Ahmad Alrahwan |
| Issue Date | 2018 |