- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 47215 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Jalili, Rasool
- Abstract:
- Diverse data sources generate huge volumes of data at ever-increasing speeds, and this data requires real-time processing. Such data is called a "big data stream". Although mining and analyzing this type of data is very useful for companies, it may also cause many privacy breaches. The principal issues in big data stream anonymization are real-time processing and information loss. Some existing works address data stream anonymization, but they have drawbacks: they anonymize big data streams inefficiently, and they do not consider the time expiration of tuples, which increases the information loss and the cost of data publishing. In this thesis, in order to speed up big data stream anonymization, we present FAST, a parallel anonymization algorithm that protects the privacy of big data streams. The algorithm also uses a time-expiration heuristic to decrease the information loss and the cost of the system. To harness cloud computing power and achieve high scalability, we designed and implemented FAST in a distributed cloud-based framework built on Apache Storm, and called it "SD-FAST". Our simulation results indicate significant improvement in big data stream anonymization in terms of speed, information loss, and the cost metric.
- Keywords:
- Privacy ; Anonymity ; Data Stream ; Data Management ; Big Data ; Fast Anonymization of Bigdata Stream (FAST)