Project Details
Security and Privacy of PDF Documents
Applicants
Dr.-Ing. Christian Mainka; Dr.-Ing. Vladislav Mladenov
Subject Area
Security and Dependability, Operating-, Communication- and Distributed Systems
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 500595941
PDF is the de facto standard for document exchange worldwide. It is used daily by the business sector, governmental organizations, and civil society. Because of this wide usage, PDF documents are an attractive carrier for various risks, such as malware exploiting implementation flaws, weaknesses in the cryptographic protection, and leaks of confidential information.
In this project proposal, we will provide a comprehensive and systematic security analysis of PDF documents covering three main building blocks: malware, digital signatures, and privacy leaks.
First, we will provide a comprehensive overview and a systematic classification of existing PDF malware by considering multiple vulnerability databases. We will compile a list of current malware detectors, ranging from open-source implementations to commercial software such as antivirus programs. We will use the collected malicious PDFs to evaluate the malware detectors regarding their efficiency. Afterward, we will analyze the detection techniques and algorithms. This analysis paves the way for developing new evasion techniques that bypass the malware detectors.
Second, we will conduct an in-depth analysis of PDF signatures. Digitally signed PDFs provide integrity and authenticity and are used to detect unauthorized changes to documents such as contracts, agreements, and receipts. In recent years, multiple attacks breaking existing implementations have been discovered, but all of them were created manually, and none of this work claims to be complete or to cover all existing variants. To fill this gap, we will design and implement a fully automated tool capable of creating attack vectors and evaluating the security of PDF signatures in PDF applications. Implementing such an approach is challenging due to the involvement of cryptographic mechanisms. The tool should understand the structure of digitally signed files and create a meaningful set of attack variants instead of a huge set of irrelevant test cases.
Third, PDF documents can contain unintended but sensitive data introduced during generation or editing. Currently, there is no systematic evaluation of such privacy leaks. To address this, we plan to create a comprehensive data set by crawling the Internet and downloading publicly available documents. We will analyze them and provide novel insights into the privacy level of PDFs. Moreover, we will systematically analyze redaction tools that aim to remove privacy leaks from PDF documents.
Finally, we will propose concrete countermeasures that harden the PDF document format and the existing implementations. We will develop substantial improvements at both the specification and the implementation level.
DFG Programme
Research Grants