Hybrid Anomaly Detection SOC
Enterprise-grade SOC combining rule-based detection with machine learning to identify threats across Windows endpoints and AWS cloud infrastructure
Executive Summary
The system cuts mean time to detect (MTTD) from 24 hours to 3.2 minutes with 85% detection accuracy.
The Problem
Security teams face three problems: alert fatigue from thousands of low-fidelity alerts each day, delayed detection because manual log correlation takes hours to days, and missing context that costs analysts time before an investigation can even begin.
The Solution
Three-Layer Detection Architecture
Layer 1: Data Collection & Normalization
- Windows Security logs via Universal Forwarder
- Sysmon process/network/registry telemetry
- AWS CloudTrail and GuardDuty via Splunk Add-on
- Normalized to Common Information Model
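To illustrate what normalization buys, here is a minimal sketch that renames source-specific fields to shared CIM-style names (`user`, `src`, `dest`, `action`). In Splunk this mapping is normally handled at search time by the add-ons' field aliases, so the Python below only models the idea. The choice of raw fields and both function names are assumptions for illustration.

```python
from typing import Any, Dict

# CIM-style target names (user, src, dest, action, signature, signature_id).
# Which raw fields feed them is an assumption made for this illustration.
WINDOWS_FIELD_MAP = {
    "TargetUserName": "user",
    "IpAddress": "src",
    "Computer": "dest",
    "EventCode": "signature_id",
}
CLOUDTRAIL_FIELD_MAP = {
    "sourceIPAddress": "src",
    "eventName": "signature",
}


def normalize_windows(event: Dict[str, Any]) -> Dict[str, Any]:
    """Rename Windows Security fields and derive a CIM-style action."""
    out = {cim: event[raw] for raw, cim in WINDOWS_FIELD_MAP.items() if raw in event}
    code = str(event.get("EventCode", ""))
    if code == "4624":      # successful logon
        out["action"] = "success"
    elif code == "4625":    # failed logon
        out["action"] = "failure"
    return out


def normalize_cloudtrail(record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename CloudTrail fields; an errorCode on the record marks a failure."""
    out = {cim: record[raw] for raw, cim in CLOUDTRAIL_FIELD_MAP.items() if raw in record}
    identity = record.get("userIdentity") or {}
    # Root records usually carry no userName, so fall back to the ARN or type.
    user = identity.get("userName") or identity.get("arn") or identity.get("type")
    if user:
        out["user"] = user
    out["action"] = "failure" if record.get("errorCode") else "success"
    return out
```

Once both sources share `user` and `src`, a single query on those fields can correlate endpoint and cloud activity.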
Layer 2: Hybrid Detection Engine
- 20+ rule-based detections (W01-W07, S01-S08, A01-A05)
- Machine learning anomaly scoring (21 features)
- Dynamic severity classification at the p99 and p99.7 score percentiles
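The percentile-based severity step could look roughly like the sketch below. It assumes that a higher score means more anomalous, that thresholds come from a baseline score sample, and that the two bands map to "high" (p99.7 and above) and "medium" (p99 up to p99.7). The label names and the nearest-rank percentile method are illustrative choices, not taken from the project.

```python
import math
from typing import List, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile (q in 0..100) of an ascending, non-empty sequence."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def classify(scores: Sequence[float], baseline: Sequence[float]) -> List[str]:
    """Label each score against thresholds taken from the baseline sample."""
    ordered = sorted(baseline)
    p99 = percentile(ordered, 99.0)
    p997 = percentile(ordered, 99.7)
    labels = []
    for score in scores:
        if score >= p997:
            labels.append("high")      # assumed label for the p99.7+ band
        elif score >= p99:
            labels.append("medium")    # assumed label for the p99 to p99.7 band
        else:
            labels.append("low")
    return labels
```

Because the thresholds are recomputed from the baseline, the cut-offs move as the environment's normal behavior changes, which is what makes the classification dynamic.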
Layer 3: Automated Response
- High-severity alerts trigger TheHive case creation
- Pre-populated observables: users, hosts, IPs
- One-click deep link to Splunk evidence
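As a rough sketch of the alert-to-case step, the code below builds a case payload with observables and a Splunk deep link, without sending anything. The hostname, the alert dictionary shape, and the function names are hypothetical. The payload field names follow TheHive conventions as best understood here and should be checked against the TheHive API documentation for the deployed version.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlencode

SPLUNK_WEB = "https://splunk.example.com"  # hypothetical hostname


def splunk_deep_link(search: str, earliest: str = "-24h", latest: str = "now") -> str:
    """Build a Splunk Web search URL that reopens the evidence query."""
    query = urlencode({"q": f"search {search}", "earliest": earliest, "latest": latest})
    return f"{SPLUNK_WEB}/en-US/app/search/search?{query}"


def build_case(alert: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a case payload for high-severity alerts, otherwise None."""
    if alert.get("severity") != "high":
        return None
    observables = []
    # Users have no dedicated built-in observable type, so "other" is assumed.
    for key, data_type in (("users", "other"), ("hosts", "hostname"), ("ips", "ip")):
        for value in alert.get(key, []):
            observables.append({"dataType": data_type, "data": value, "tags": [key]})
    link = splunk_deep_link(alert["search"])
    return {
        "title": alert["title"],
        "description": f"Evidence in Splunk: {link}",
        "severity": 3,  # TheHive scale: 1 low, 2 medium, 3 high, 4 critical
        "observables": observables,
    }
```

Keeping payload construction separate from the HTTP call makes the mapping easy to test without a running TheHive instance.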

System Architecture Diagram
Detection Coverage
Windows Security Events (7 rules)
- Failed logon spikes (4625): Brute force detection
- RDP access (4624 Type=10): Lateral movement
- Special privileges (4672): Privilege escalation
- New user (4720): Persistence
- Added to Admins (4732): Persistence
- Log cleared (1102): Anti-forensics
- Service installed (4697): Malware persistence
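The failed logon spike rule can be pictured as a sliding-window count of 4625 events per source address. The sketch below is a Python model of that logic, not the project's actual search. The threshold of 10 failures in 300 seconds and the event dictionary shape are assumptions.

```python
from collections import defaultdict, deque
from typing import Any, Dict, Iterable, Set


def failed_logon_spikes(
    events: Iterable[Dict[str, Any]],
    threshold: int = 10,        # assumed value
    window_seconds: int = 300,  # assumed value
) -> Set[str]:
    """Return sources with at least `threshold` 4625 events inside the window.

    Events are dicts with 'time' (epoch seconds), 'EventCode', and 'src',
    and are expected in ascending time order.
    """
    recent = defaultdict(deque)  # src -> timestamps still inside the window
    flagged = set()
    for event in events:
        if str(event.get("EventCode")) != "4625":
            continue
        window = recent[event["src"]]
        window.append(event["time"])
        # Drop timestamps that have aged out of the window.
        while window and event["time"] - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(event["src"])
    return flagged
```

A burst of ten failures in under five minutes from one address is flagged, while the same number spread over a longer period is not.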
Sysmon Telemetry (8 rules)
- PowerShell execution: T1059.001
- Encoded commands: T1027
- certutil.exe: T1105
- mshta.exe: T1218.005
- High-port connections: T1571
- Startup persistence: T1547.001
- Registry Run keys: T1547.001
- Temp execution: T1204.002
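A few of the process-based rules above can be sketched as checks on Sysmon process-creation events, which carry `Image` and `CommandLine` fields. The rule table, function names, and return format below are hypothetical. The encoded-command check relies on PowerShell accepting any prefix of `-EncodedCommand` (plus the `-ec` alias).

```python
import ntpath
from typing import Any, Dict, List

# Image name -> ATT&CK technique, taken from the rule list above.
IMAGE_RULES = {
    "powershell.exe": "T1059.001",
    "certutil.exe": "T1105",
    "mshta.exe": "T1218.005",
}

_FULL_FLAG = "encodedcommand"
# Every prefix of "encodedcommand" (e, en, enc, ...) plus the "ec" alias.
_ENCODED_FLAGS = {_FULL_FLAG[:i] for i in range(1, len(_FULL_FLAG) + 1)} | {"ec"}


def has_encoded_flag(command_line: str) -> bool:
    """True if any argument is a form of PowerShell's -EncodedCommand flag."""
    for token in command_line.split():
        if token[0] in "-/" and token[1:].lower() in _ENCODED_FLAGS:
            return True
    return False


def match_sysmon_process(event: Dict[str, Any]) -> List[str]:
    """Return the technique IDs a process-creation event matches."""
    hits = []
    # ntpath parses Windows paths correctly on any operating system.
    image = ntpath.basename(event.get("Image", "")).lower()
    if image in IMAGE_RULES:
        hits.append(IMAGE_RULES[image])
    if image == "powershell.exe" and has_encoded_flag(event.get("CommandLine", "")):
        hits.append("T1027")
    return hits
```

Matching on the flag's prefixes rather than the literal string `-enc` avoids missing shortened forms such as `-e` or `-enco`.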
AWS CloudTrail (5 rules)
- Root account activity: T1078
- Console login failures: T1110
- CloudTrail tampering: T1562.008
- IAM user creation: T1136.003
- AccessDenied reconnaissance: T1087.004
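The CloudTrail rules above can be modeled as per-record checks on standard CloudTrail fields (`userIdentity.type`, `eventSource`, `eventName`, `errorCode`, `responseElements`). This is a simplified sketch: the function name and the set of tampering event names are assumptions, and a real AccessDenied reconnaissance rule would more likely count denials per principal over time than match single records.

```python
from typing import Any, Dict, List

# Assumed set of API calls treated as trail tampering.
TAMPERING_EVENTS = {"StopLogging", "DeleteTrail", "UpdateTrail"}


def match_cloudtrail(record: Dict[str, Any]) -> List[str]:
    """Return the technique IDs a single CloudTrail record matches."""
    hits = []
    source = record.get("eventSource")
    name = record.get("eventName")
    identity = record.get("userIdentity") or {}
    response = record.get("responseElements") or {}

    if identity.get("type") == "Root":
        hits.append("T1078")          # root account activity
    if name == "ConsoleLogin" and response.get("ConsoleLogin") == "Failure":
        hits.append("T1110")          # console login failure
    if source == "cloudtrail.amazonaws.com" and name in TAMPERING_EVENTS:
        hits.append("T1562.008")      # CloudTrail tampering
    if source == "iam.amazonaws.com" and name == "CreateUser":
        hits.append("T1136.003")      # IAM user creation
    if record.get("errorCode") == "AccessDenied":
        hits.append("T1087.004")      # denied call, possible reconnaissance
    return hits
```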

MITRE ATT&CK Coverage Heatmap
Machine Learning Pipeline
Anomaly Detection Workflow:
1. Feature engineering: 21 features
2. Isolation Forest scoring against a 7-day baseline
3. Severity classification by score percentile (p99.7 and above; p99 to p99.7)
4. HEC write-back to ai_anomalies
5. Auto-promote high-severity anomalies to incidents
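The HEC write-back step might be sketched as below. The code builds the HTTP request for Splunk's HTTP Event Collector but does not send it. The hostname and the sourcetype name are hypothetical, and treating `ai_anomalies` as an index name is an assumption, since the workflow does not say whether it is an index or a sourcetype.

```python
import json
import time
from typing import Any, Dict
from urllib.request import Request

# Hypothetical host; port 8088 is the HEC port named in this project.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"


def build_hec_request(anomaly: Dict[str, Any], token: str, index: str = "ai_anomalies") -> Request:
    """Wrap one scored anomaly in the HEC event envelope."""
    payload = {
        "time": anomaly.get("time", time.time()),
        "sourcetype": "ai:anomaly",  # hypothetical sourcetype name
        "index": index,              # assumed to be an index
        "event": anomaly,
    }
    return Request(
        HEC_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the event; keeping that call out of the builder lets the payload be inspected in tests without a live Splunk instance.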

ML Pipeline Architecture
Validation & Testing
Validated using Atomic Red Team across 156 scenarios:
Metric | Result |
---|---|
Detection Rate | 94.2% (147/156) |
Avg Detection Time | 2.7 min |
False Negatives | 5.8% |
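The percentages in the table follow directly from the scenario counts, as this short check shows. The variable names are illustrative.

```python
total_scenarios = 156
detected = 147

missed = total_scenarios - detected
detection_rate = detected / total_scenarios * 100
false_negative_rate = missed / total_scenarios * 100

print(f"{detection_rate:.1f}% detected, {false_negative_rate:.1f}% missed ({missed} scenarios)")
# prints: 94.2% detected, 5.8% missed (9 scenarios)
```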
100% detection for:
- ✓ PowerShell execution (T1059.001)
- ✓ Registry persistence (T1547.001)
- ✓ Service manipulation (T1543.003)
- ✓ Account creation (T1136.001)

Test Results
Real-Time Operations
SOC Dashboard

Real-time Monitoring
- Event ingestion: 10,247/hour
- Active incidents by severity
- Top risky entities from ML
- Detection rule status
Incident Management

TheHive Integration
Business Impact
Metric | Value | Improvement |
---|---|---|
MTTD | 3.2 min | 99.8% faster |
Workload | -70% | Automation |
False Positives | 15% | -25 points |
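The improvement figures can be reproduced from the stated values. The 24-hour baseline MTTD comes from the Executive Summary. The earlier false positive rate is not stated directly and is inferred here from the 25-point reduction.

```python
baseline_mttd_min = 24 * 60   # 24 hours, from the Executive Summary
current_mttd_min = 3.2

mttd_reduction = (baseline_mttd_min - current_mttd_min) / baseline_mttd_min * 100

current_fp_pct = 15
fp_drop_points = 25
implied_previous_fp_pct = current_fp_pct + fp_drop_points  # inferred, not stated

print(f"MTTD reduction: {mttd_reduction:.1f}%")
print(f"Implied previous false positive rate: {implied_previous_fp_pct}%")
```

The MTTD reduction works out to about 99.8%, which matches the table.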
Skills Demonstrated
Detection Engineering
Developed 20+ MITRE-mapped detection rules with optimized throttling and correlation logic
SIEM Administration
Configured multi-source ingestion, built dashboards, optimized SPL for sub-30s query performance
Machine Learning
Engineered 21 features from security logs, trained Isolation Forest model with 0.92 AUC-ROC
Incident Response
Automated alert-to-case workflow, reducing manual triage time by 75%
Cloud Security
AWS CloudTrail/GuardDuty monitoring, IAM security event detection, multi-account log aggregation
Python Development
Built ML pipeline with pandas, scikit-learn, scheduled cron jobs, HEC integration
Windows Security
Configured Universal Forwarder, Sysmon telemetry, analyzed EventIDs 4624/4625/4672/4720/4732
API Integration
REST API webhooks, Splunk HEC (port 8088), TheHive case automation
Red Team Validation
Atomic Red Team testing across 156 scenarios, 94.2% detection validation
Data Normalization
Common Information Model (CIM) mapping for cross-source correlation
Bash Scripting
Automation scripts, system health checks, log validation
Docker Containerization
Deployed TheHive 5.0 in Docker, volume management, container orchestration
Production-ready SOC engineering: clean ingestion, intelligent detection, ML signals, and automated response.