Syscall-Based HIDS Generalisation: From CVE to CWE

Why it matters

This research is relevant to AI builders developing security systems. It measures how far a syscall-based HIDS generalizes beyond the specific vulnerabilities represented in its training data, which can inform the design of more robust intrusion detection models. It also shows that the choice of training data and features strongly affects detection of novel threats.

What changed

The paper investigates whether Host Intrusion Detection Systems (HIDS) can generalize from specific Common Vulnerabilities and Exposures (CVE) entries to broader Common Weakness Enumeration (CWE) classes. Traditionally, HIDS are trained and evaluated on individual CVE instances. In practice, however, security professionals need to identify new exploits that fall under known weakness types. The study empirically tests whether a one-class anomaly detector, trained on normal behavior from CVE scenarios belonging to a particular CWE, can detect a different, previously unseen CVE within that same CWE.

The researchers used six scenarios from the LID-DS-2021 dataset, grouped into three CWE families: CWE-307 (improper restriction of excessive authentication attempts), CWE-89 (SQL injection), and CWE-434 (unrestricted file upload). For each scenario, they extracted a 66-dimensional Peng-Guo-style feature vector for each sliding window. Two anomaly detectors were trained: Isolation Forest and SGD One-Class SVM. Detection thresholds were calibrated on normal-only data to achieve fixed target false positive rates (FPR).

Four specific research questions were addressed: the self-detection capability of the trained models, the effectiveness of asymmetric cross-CVE transfer (i.e., training on one CVE and testing on another within the same CWE), the value of using a combined CWE-level normal profile for training, and the impact of feature filtering on the system's transferability. The results showed that the combined CWE-307 detector achieved an F1 score of 0.6976 at a calibration target FPR of 0.05, with a precision of 0.8994 and a recall of 0.5698. However, for CWE-89 and CWE-434, the F1 scores dropped to 0.21 or lower under the same protocol. The study also found that cross-CVE transfer is highly direction-dependent and is more influenced by the breadth of the source normal profile than by the CWE label itself.
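As a quick consistency check on the CWE-307 figures quoted above, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall; the function name `f1_from_pr` is only illustrative.

```python
def f1_from_pr(precision, recall):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported values for the combined CWE-307 detector at target FPR 0.05.
f1 = f1_from_pr(0.8994, 0.5698)
print(round(f1, 4))  # 0.6976
```

The result matches the reported F1 of 0.6976 and makes the trade-off visible: most alerts are correct (high precision), but a little over 40% of attack windows are missed (recall of 0.5698).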

Why it matters for builders

For AI builders working on security solutions, the paper clarifies both the limits and the potential of anomaly detection systems. Generalizing a HIDS to a broader weakness category is achievable for some vulnerability types (CWE-307 in this study), but it is not a universal solution. How well generalization works depends heavily on the specific weakness family and on the nature of the training data. The research underscores the importance of selecting and engineering features carefully, and of understanding what normal system behavior looks like, when building more resilient and adaptable security tools.

Practical impact

The study demonstrates that CWE-level generalization is empirically attainable for some weakness families, but not all, using current system-call features. Security systems may therefore need to be tailored to specific types of vulnerabilities or to employ hybrid approaches that combine different detection strategies. Cross-CVE transfer is strongly direction-dependent, and the breadth of the source normal profile matters more than the CWE label, which suggests that the quality and scope of the 'normal' data used for training are paramount. Developers should focus on building comprehensive normal behavior profiles and should evaluate transfer in both directions between CVEs rather than assuming it is symmetric. The paper also advocates calibrated FPR as a methodological prerequisite for honest reporting in HIDS research, which could lead to more standardized and reliable evaluation metrics for security AI.

Caveats and source limits

The findings of this study are based on a specific set of six scenarios drawn from the LID-DS-2021 dataset and evaluated using Isolation Forest and SGD One-Class SVM detectors with a 66-dimensional Peng-Guo-style feature vector. The research explicitly states that CWE-level generalization is empirically attainable for some but not all weakness families with current syscall features. The performance varied significantly across different CWE families, with CWE-307 showing moderate success while CWE-89 and CWE-434 performed poorly. The study did not explore other types of HIDS, feature extraction methods, or anomaly detection algorithms. Therefore, the conclusions regarding the generalizability of HIDS should be considered within the context of these specific experimental conditions and may not directly apply to all HIDS implementations or vulnerability types. The research is presented as a pre-print on arXiv, indicating it has not yet undergone peer review.
