Format-aware Detection of Malicious Documents (FORMAD)
Summary of Project Results
The FORMAD project developed new methods for classifying malicious documents. Their key feature is that they can be adapted to new formats with relatively little effort. Using two popular formats, PDF and SWF, we demonstrated that jointly treating the structure and the content extracted by the respective document parsers yields useful features for reliable detection of malicious documents. Experiments on several hundred thousand documents verified the high effectiveness of our methods; the detection rates achieved, exceeding 98%, demonstrate the high potential of the developed techniques.
Another key result of the project is the demonstration of successful attacks against deployed classifiers. The security of deployed classifiers is a key requirement for their practical use, so techniques for experimental security evaluation of such systems are an essential instrument for practitioners. We showed that even with limited knowledge of a deployed classifier, attackers can successfully evade it by collecting enough information from publicly available resources and filling in the gaps with algorithmic approximations.
The main results of the project have had a substantial impact on subsequent research in reactive computer security. Several important publications in the field of document security have cited our results. As the most impressive follow-up to our work, a method was proposed for successfully evading Google Chrome’s Phishing Pages Filter using techniques similar to those proposed in our work.
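The idea of deriving features from the structure that a document parser exposes can be illustrated with a small sketch. It is an illustration under stated assumptions, not the project's implementation: a nested dictionary stands in for parser output, only structure (not content) is used, and the names `structural_paths` and `to_binary_vectors` are hypothetical.

```python
from typing import Any, List, Set, Tuple


def structural_paths(node: Any, prefix: str = "") -> Set[str]:
    """Return the set of key paths leading from the root to each leaf.

    Assumption: the parsed document is a tree of dicts, lists and scalars.
    List elements share their parent's path, so repeated siblings
    (e.g. many pages) collapse into a single path.
    """
    if isinstance(node, dict):
        children = [(f"{prefix}/{key}", child) for key, child in node.items()]
    elif isinstance(node, list):
        children = [(prefix, child) for child in node]
    else:
        children = []
    if not children:
        # Leaf (or empty container): the path itself is the feature.
        return {prefix} if prefix else set()
    paths: Set[str] = set()
    for child_prefix, child in children:
        paths |= structural_paths(child, child_prefix)
    return paths


def to_binary_vectors(docs: List[Any]) -> Tuple[List[str], List[List[int]]]:
    """Map each document to a 0/1 vector over the sorted path vocabulary."""
    per_doc = [structural_paths(doc) for doc in docs]
    vocab = sorted(set().union(*per_doc))
    vectors = [[1 if path in paths else 0 for path in vocab] for paths in per_doc]
    return vocab, vectors


# Toy stand-ins for parser output (hypothetical, heavily simplified).
plain_doc = {"Root": {"Type": "Catalog",
                      "Pages": {"Kids": [{"Type": "Page"}, {"Type": "Page"}]}}}
scripted_doc = {"Root": {"Type": "Catalog",
                         "OpenAction": {"S": "JavaScript", "JS": "app.alert(1)"}}}

vocab, vectors = to_binary_vectors([plain_doc, scripted_doc])
```

The resulting vectors could be passed to any standard classifier. Adapting the sketch to another format would only require a parser that produces a comparable tree, which reflects the ease of adaptation emphasized in the summary.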
Project-Related Publications (Selection)
- Detection of malicious PDF files based on hierarchical document structure. In Proceedings of the 20th Network and Distributed System Security Symposium (NDSS), 2013
N. Šrndić and P. Laskov
- Practical evasion of a learning-based classifier: A case study. In Proceedings of the IEEE Symposium on Security and Privacy, pages 197–211, 2014
N. Šrndić and P. Laskov
(See online at https://doi.org/10.1109/SP.2014.20)
- Hidost: a static machine-learning-based malicious software detector for multiple file formats. EURASIP Journal on Information Security, 2016(1):22, 2016
N. Šrndić and P. Laskov
(See online at https://doi.org/10.1186/s13635-016-0045-0)