BayesPDGImVD: Bayesian Hyperparameter-Optimized Image-Based Vulnerability Detection via Program Dependency Graph Representation

Keywords: Machine learning; Bayesian optimization; Program dependency graph; Image-based representation; Source code vulnerability detection

Authors

  • Xingquan Mao, School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
  • Zhangpei Huang, School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
Vol. 13 No. 07 (2025)
Engineering and Computer Science
July 29, 2025

As software systems grow more complex and open-source components are adopted more widely, traditional vulnerability detection approaches face bottlenecks in both efficiency and accuracy. Recent advances in machine learning have opened new avenues for intelligent vulnerability detection. This paper presents BayesPDGImVD, a Bayesian hyperparameter-optimized, image-based vulnerability detection method built on a program dependency graph (PDG) representation. It combines PDG image representation with Bayesian hyperparameter optimization to address limitations of conventional detection methods. The implemented system statically extracts PDGs from C/C++ source code using the Joern analyzer, then constructs multi-channel image features by combining Sent2Vec semantic embeddings with three node centrality metrics (degree, closeness, and Katz centrality). Bayesian optimization automatically tunes the CNN classifier's key hyperparameters (learning rate, kernel size, dropout rate, etc.), removing the need for manual tuning. Experiments on the SARD benchmark dataset yield 86.43% detection accuracy and an 80.38% F1-score, with 40% less performance fluctuation than non-optimized models, supporting the effectiveness of Bayesian optimization in improving model robustness and detection capability. Unlike existing approaches such as VulCNN, the key contribution is the integration of image-based representation with a hyperparameter optimization mechanism, providing a more interpretable and practical solution for source code vulnerability detection.
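
To make the multi-channel image construction more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each channel is formed by scaling every node's embedding by one centrality score, which is one plausible reading of "integrating" embeddings with centrality metrics; the abstract does not specify the exact combination. The toy graph, the 2-dimensional placeholder vectors (standing in for Sent2Vec embeddings), the function names, and the Katz parameters alpha and beta are all hypothetical. The centrality definitions are simplified, stdlib-only versions chosen for illustration.

```python
from collections import deque


def degree_centrality(adj):
    """In-degree plus out-degree of each node, divided by (n - 1); needs n >= 2."""
    n = len(adj)
    degree = [len(neighbours) for neighbours in adj]  # out-degree
    for neighbours in adj:
        for v in neighbours:
            degree[v] += 1  # in-degree
    return [d / (n - 1) for d in degree]


def closeness_centrality(adj):
    """Closeness over outgoing shortest paths, scaled by the reachable fraction."""
    n = len(adj)
    scores = []
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        reachable = len(dist) - 1
        total = sum(dist.values())
        scores.append((reachable / (n - 1)) * (reachable / total) if total else 0.0)
    return scores


def katz_centrality(adj, alpha=0.1, beta=1.0, max_iter=1000, tol=1e-12):
    """Iterate x_v = alpha * sum(x_u over edges u -> v) + beta until stable.

    Converges only if alpha is below 1 / (largest eigenvalue of the adjacency matrix).
    """
    n = len(adj)
    x = [beta] * n
    for _ in range(max_iter):
        new_x = [beta] * n
        for u in range(n):
            for v in adj[u]:
                new_x[v] += alpha * x[u]
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


def build_image(adj, embeddings):
    """Return 3 channels, each a (nodes x embedding_dim) matrix of scaled embeddings."""
    channels = []
    for scores in (degree_centrality(adj), closeness_centrality(adj), katz_centrality(adj)):
        channels.append([[s * value for value in vec] for s, vec in zip(scores, embeddings)])
    return channels


# Toy PDG with edges 0->1, 0->2, 1->2, 2->3 (adjacency list of successors).
toy_pdg = [[1, 2], [2], [3], []]
# Placeholder 2-dimensional vectors standing in for Sent2Vec embeddings (assumption).
toy_embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]]
image = build_image(toy_pdg, toy_embeddings)
print(len(image), len(image[0]), len(image[0][0]))  # channels, nodes, embedding dim
```

The resulting nested list has one channel per centrality metric, so it can be fed to a CNN much like a three-channel image. In a real pipeline the graph would come from Joern output and the rows would typically be padded or truncated to a fixed height before training.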