Abstract :
FP-growth is a classical algorithm for frequent pattern mining that is typically applied to static data. Some research
has applied the FP-growth algorithm to streaming data. However, the two database scans required for FP-tree
creation are a serious bottleneck in streaming data analysis. The sliding window technique can alleviate this problem to a certain degree, but
it can still produce an inaccurate FP-tree, which may affect subsequent data mining. In this paper, a new FP-tree
algorithm for streaming data is presented, which creates the FP-tree with a single-pass scan (SPSFP) of the database.
Compared with traditional FP-tree creation, the new method scans the database only once and does not need to store the whole data
set in memory, which not only saves memory but also makes it possible to mine frequent patterns accurately in streaming
data. Furthermore, the time cost of the new algorithm is almost equivalent to that of the traditional one.
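
To make the two-scan bottleneck concrete, the following is a minimal sketch of the traditional FP-tree construction, not of the SPSFP method proposed in the paper. The names `Node` and `build_fp_tree` and the tie-breaking rule for items of equal support are assumptions chosen for illustration.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Classical two-scan FP-tree construction (illustrative sketch).

    `transactions` must be replayable (e.g. a list), because it is read
    twice. A stream that can be read only once does not satisfy this.
    """
    # Scan 1: count the support of every item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    # Scan 2: insert each transaction, keeping only frequent items and
    # ordering them by descending support (ties broken by item name,
    # an assumption made here for determinism).
    root = Node(None)
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent
```

The item order used in the second scan depends on the global counts from the first scan, which is why the traditional construction cannot begin inserting transactions until the whole database has been read once.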
Author/Authors :
Qiang Tu, Jian-feng Lu, Jiu-bin Tang, Jing-yu Yang