Malware Runtime Analysis Environment — Design & Implementation Plan

Document Version: 1.0
Date: April 2026
Purpose: Comprehensive design specification for an automated malware detonation and artifact analysis system built on Proxmox with STIX 2.1 output and Postgres persistence.


Executive Summary

This document outlines a production-grade malware runtime analysis system that automates the detonation of suspicious binaries in isolated Windows virtual machines, captures behavioral artifacts, and exports findings as structured STIX 2.1 objects stored in PostgreSQL. The system enforces isolation at the infrastructure level using Proxmox network bridges, integrates industry-standard monitoring tools (ProcMon, RegShot, Wireshark, CaptureBAT), and provides an extensible job queue for scaling analysis capacity.

Key Design Principles:


Architecture Overview

1. Infrastructure Topology

┌─────────────────────────────────────────────────────────────────┐
│ PROXMOX HOST (KVM)                                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌────────────────────────────────┐  ┌──────────────────────┐  │
│  │ Management Bridge (vmbr0)      │  │ Analysis Bridge      │  │
│  │ 172.16.0.0/24                  │  │ (vmbr.analysis)      │  │
│  │ No IP on Proxmox host          │  │ 192.168.100.0/24     │  │
│  │ (management isolated)          │  │ No IP on Proxmox     │  │
│  │                                │  │ (untrusted VMs only) │  │
│  └────────────────────────────────┘  └──────────────────────┘  │
│                                               │                 │
│         ┌─────────────────────────────────────┘                │
│         │                                                      │
│    ┌────▼─────────────────────────────────────────────────┐   │
│    │ OPNsense/pfSense Firewall VM (on vmbr.analysis)     │   │
│    │ - DMZ gateway for analysis network                  │   │
│    │ - Explicit allow rules (INetSim, DNS control)       │   │
│    │ - Default deny egress (kill-switch policy)          │   │
│    └────┬─────────────────────────────────────────────────┘   │
│         │                                                      │
│    ┌────▼──────────────────────────────────────────────────┐  │
│    │ Job Orchestrator VM (Debian/Rocky on vmbr.analysis)  │  │
│    │ - Python job queue (Celery or RQ)                    │  │
│    │ - Proxmox API client                                 │  │
│    │ - STIX model layer + Postgres driver                 │  │
│    │ - Malware sample intake & validation                 │  │
│    │ - Result aggregation & export                        │  │
│    └────┬──────────────────────────────────────────────────┘  │
│         │                                                      │
│    ┌────▼──────────────────────────────────────────────────┐  │
│    │ Analysis Guest VMs (Windows 10/11 Hardened)          │  │
│    │ [FLARE-VM Template]                                  │  │
│    │ - Minimal attack surface (no AV, bloatware removed)  │  │
│    │ - ProcMon, RegShot, Wireshark, CaptureBAT            │  │
│    │ - RDP bridge (inbound only, firewall restricted)     │  │
│    │ - Dedicated snapshot for revert                      │  │
│    │ - Dropped file collector agent (Windows Service)     │  │
│    └────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ PostgreSQL VM (Dedicated LVM volume)                    │   │
│  │ - STIX schema (Malware, Indicator, File, Process, etc.) │   │
│  │ - Analysis metadata tables                              │   │
│  │ - Full-text index on IOCs                               │   │
│  │ - Audit trail & versioning                              │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ Quarantine/Storage (Dedicated LVM + NFS)                │   │
│  │ - Isolated dropped file repository                      │   │
│  │ - Immutable archive (append-only)                       │   │
│  │ - Hash verification on ingestion                        │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

                    Dual ISP (Xfinity + Verizon 5G)
                  Host WAN Isolation (no outbound to VMs)

Key Decisions:


2. Analysis VM Configuration (FLARE-VM Template)

2.1 Base Image & Hardening

Template Spec:

Installed Tools (FLARE-VM Kit):

Hardening:

Snapshot Workflow:

  1. Create VM from template.
  2. Boot, install FLARE-VM, run system optimization.
  3. Boot clean state, start all monitoring tools, take “clean” snapshot.
  4. Run malware, capture artifacts, shut down.
  5. Revert to “clean” snapshot.
  6. Repeat.
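
Steps 3 to 5 map onto a small number of Proxmox VE REST calls per job. The sketch below only builds the request plan (method, path, parameters), so it runs without a Proxmox host. The node name `pve`, VM ID `9001`, and the helper names are placeholders, not part of this design; authentication, the HTTP client, and waiting for task completion are left out. It uses a hard stop rather than a guest shutdown on the assumption that an infected guest cannot be relied on to honour an ACPI request.

```python
from urllib.parse import quote

API_ROOT = "/api2/json"  # Proxmox VE REST API prefix


def vm_path(node: str, vmid: int) -> str:
    """Base path for one QEMU guest on one cluster node."""
    return f"{API_ROOT}/nodes/{quote(node, safe='')}/qemu/{int(vmid)}"


def clean_snapshot_request(node: str, vmid: int, snapname: str = "clean"):
    """Step 3: snapshot the booted guest with RAM state (vmstate=1),
    so the monitoring tools are already running after a rollback."""
    return ("POST", f"{vm_path(node, vmid)}/snapshot",
            {"snapname": snapname, "vmstate": 1})


def reset_plan(node: str, vmid: int, snapname: str = "clean"):
    """Steps 4-5: stop the dirty guest, then roll back to the clean snapshot."""
    base = vm_path(node, vmid)
    return [
        ("POST", f"{base}/status/stop", {}),
        ("POST", f"{base}/snapshot/{quote(snapname, safe='')}/rollback", {}),
    ]


if __name__ == "__main__":
    # Placeholder node and VM ID, for illustration only.
    for method, path, params in reset_plan("pve", 9001):
        print(method, path, params)
```

Keeping the plan as data makes it easy to log each call against the job record before it is sent.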

3. Monitoring & Artifact Capture

3.1 Pre-Execution Baseline

  1. Registry Snapshot (RegShot 1st Shot): Captures HKLM, HKCU, HKU. Output: Registry.txt (baseline).
  2. File Baseline (FSUtil):

    fsutil fsinfo statistics c: > c:\baseline\fsstat.txt
    dir /s c:\windows > c:\baseline\windows_baseline.txt
    dir /s "c:\program files" > c:\baseline\programfiles_baseline.txt
    
  3. Process Baseline:

    tasklist /v > c:\baseline\processes_baseline.txt
    
  4. Wireshark Start:

    tshark -i Ethernet -w c:\captures\capture.pcap -f "not (arp or stp)"
    
  5. ProcMon Start: Capture filter: include all events; filtering is applied post-execution. Export path: c:\captures\procmon.pml.
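
The `dir /s` listings in step 2 are only useful once they can be compared with a later listing. As an alternative that is easier to diff programmatically, a collector agent could record the file tree itself. This is a minimal sketch, assuming a Python agent runs inside the guest; `snapshot_tree` and `diff_trees` are illustrative names, not existing components of this design.

```python
import os


def snapshot_tree(root):
    """Map every file path under root to (size, mtime_ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file is locked or vanished between walk and stat
            state[path] = (info.st_size, info.st_mtime_ns)
    return state


def diff_trees(before, after):
    """Compare two snapshots and report created, deleted, and modified paths."""
    before_paths, after_paths = set(before), set(after)
    return {
        "created": sorted(after_paths - before_paths),
        "deleted": sorted(before_paths - after_paths),
        "modified": sorted(
            p for p in before_paths & after_paths if before[p] != after[p]
        ),
    }
```

The baseline would be taken once before execution (for example over `c:\windows` and `c:\program files`) and diffed against a second snapshot after the sample has run. Size and modification time are cheap to collect but can be forged by malware, so the diff is a triage aid rather than proof of integrity.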

3.2 Malware Execution

3.3 Post-Execution Capture

  1. Registry Snapshot (RegShot 2nd Shot): Diff → Registry.txt.html.
  2. ProcMon Stop & Export: Export CSV filtered by malware PID + children. Extract RegSetValue, WriteFile, RegCreateKey, and RegDeleteKey events.
  3. Wireshark Stop & Export:

    tshark -r c:\captures\capture.pcap -T fields ^
      -e frame.time -e ip.src -e ip.dst -e dns.qry.name ^
      > c:\captures\network_summary.txt
    
  4. File I/O Collection: Scan AppData\Local\Temp, AppData\Roaming, ProgramData, Windows\Temp, $Recycle.Bin. Hash all new files (MD5, SHA1, SHA256). Copy to quarantine.
  5. Memory Dump (Optional): Taken only when there are signs of code injection (e.g., CreateRemoteThread or WriteProcessMemory calls).
  6. Artifact Compression & Staging: Copy c:\artifacts\* to \\192.168.100.1\analysis\[job_id]\.
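
Filtering the ProcMon export by the malware PID and its children (step 2) can be done on the orchestrator after the CSV has been staged. The sketch below makes several assumptions that should be checked against a real export: the column names (`PID`, `Operation`, `Path`, `Detail`), rows in chronological order, child processes announced by `Process Create` events whose Detail field begins with `PID: <n>`, and ProcMon's operation labels `RegCreateKey` and `RegDeleteKey` for the key events. The sample rows are invented for illustration.

```python
import csv
import io
import re

INTERESTING_OPS = {"RegSetValue", "WriteFile", "RegCreateKey", "RegDeleteKey"}
CHILD_PID = re.compile(r"^PID:\s*(\d+)")


def filter_process_tree(rows, root_pid):
    """Keep interesting events from root_pid and every process it spawns.

    Returns (kept_rows, tracked_pids). Rows must be in chronological order
    so a child is registered before its own events are seen.
    """
    tracked = {root_pid}
    kept = []
    for row in rows:
        if int(row["PID"]) not in tracked:
            continue
        if row["Operation"] == "Process Create":
            match = CHILD_PID.match(row.get("Detail", ""))
            if match:
                tracked.add(int(match.group(1)))
        elif row["Operation"] in INTERESTING_OPS:
            kept.append(row)
    return kept, tracked


# Invented sample rows; a real export should be opened with encoding="utf-8-sig".
SAMPLE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:01","sample.exe","4120","Process Create","C:\Windows\System32\cmd.exe","SUCCESS","PID: 5300, Command line: cmd.exe /c whoami"
"10:00:02","cmd.exe","5300","WriteFile","C:\Users\analyst\AppData\Local\Temp\x.dat","SUCCESS","Offset: 0, Length: 12"
"10:00:03","svchost.exe","900","RegSetValue","HKLM\SOFTWARE\Foo","SUCCESS","Type: REG_SZ"
"10:00:04","sample.exe","4120","RegSetValue","HKCU\Software\Microsoft\Windows\CurrentVersion\Run\upd","SUCCESS","Type: REG_SZ"
"10:00:05","sample.exe","4120","ReadFile","C:\Windows\win.ini","SUCCESS","Offset: 0"
'''

if __name__ == "__main__":
    rows = csv.DictReader(io.StringIO(SAMPLE))
    events, pids = filter_process_tree(rows, root_pid=4120)
    for event in events:
        print(event["PID"], event["Operation"], event["Path"])
```

A single pass is enough because a `Process Create` event always precedes the child's own activity. PID reuse within one short detonation window is ignored here.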

4. STIX 2.1 Object Modeling

All captured artifacts are modeled as STIX 2.1 Cyber-observable Objects (SCOs) and Malware objects. Design constraint: all hypothesis and evidence objects are STIX types, stored in Postgres as JSONB with normalized indices.

Object types used: malware, file, process, network-traffic, indicator, directory, ipv4-addr, domain-name. Every object carries an x_analysis_metadata extension tying it back to an analysis_id, VM UUID, sample hash, and tool provenance.

See orchestrator/stix_builder.py for the canonical factories and docs/stix_examples/ for wire-format samples.
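
To make the shape of these objects concrete, here is a minimal sketch of a `file` observable carrying the `x_analysis_metadata` property, built with the standard library only. It is not the code in orchestrator/stix_builder.py. The field names inside `x_analysis_metadata` are assumptions derived from the description above, and the random UUIDv4 identifier is a simplification (STIX 2.1 recommends deterministic UUIDv5 identifiers for observables).

```python
import json
import uuid


def make_file_sco(name, hashes, analysis_id, vm_uuid, sample_sha256, tool):
    """Build a STIX 2.1 file observable as a plain dict.

    hashes uses STIX hash names, e.g. {"MD5": ..., "SHA-1": ..., "SHA-256": ...}.
    """
    return {
        "type": "file",
        "spec_version": "2.1",
        "id": f"file--{uuid.uuid4()}",  # simplification: random, not UUIDv5
        "name": name,
        "hashes": dict(hashes),
        "x_analysis_metadata": {  # hypothetical field names
            "analysis_id": analysis_id,
            "vm_uuid": vm_uuid,
            "sample_sha256": sample_sha256,
            "tool": tool,
        },
    }


def make_bundle(objects):
    """Wrap objects in a STIX 2.1 bundle (bundles carry no spec_version)."""
    return {
        "type": "bundle",
        "id": f"bundle--{uuid.uuid4()}",
        "objects": list(objects),
    }


if __name__ == "__main__":
    dropped = make_file_sco(
        name="x.dat",
        hashes={"SHA-256": "0" * 64},  # placeholder digest
        analysis_id="job-0001",        # placeholder identifiers
        vm_uuid=str(uuid.uuid4()),
        sample_sha256="f" * 64,
        tool="procmon",
    )
    print(json.dumps(make_bundle([dropped]), indent=2))
```

Because the result is a plain dict, it can be serialized with `json.dumps` and written to a JSONB column without an intermediate model layer.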


5. PostgreSQL Schema

The authoritative schema lives at migrations/001_initial_schema.sql. It defines:


6. Job Queue & Orchestration

6.1 Job Lifecycle

1. Sample submission
2. Intake validation (hash check, YARA scan, file-type validation)
3. Enqueue (row in analysis_jobs, message to Celery)
4. VM lifecycle (spin up from snapshot, boot, start monitoring)
5. Malware execution (SMB/RDP delivery, timeout)
6. Artifact capture (stop tools, enumerate drops, copy to quarantine)
7. STIX generation (parsers → STIX factories)
8. Postgres persistence (bundle + normalized tables + audit log)
9. VM reset (revert snapshot)
10. Ready for next job
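
The ten steps above can be sketched as an explicit state sequence, so the orchestrator always knows whether a failed job has left a dirty guest behind. The state names and the `needs_revert_on_failure` helper are illustrative assumptions, not the contents of the `analysis_jobs` table.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"    # 1. sample submission
    VALIDATED = "validated"    # 2. intake validation
    QUEUED = "queued"          # 3. enqueue
    VM_READY = "vm_ready"      # 4. VM lifecycle
    EXECUTING = "executing"    # 5. malware execution
    CAPTURING = "capturing"    # 6. artifact capture
    STIX_BUILT = "stix_built"  # 7. STIX generation
    PERSISTED = "persisted"    # 8. Postgres persistence
    VM_RESET = "vm_reset"      # 9. VM reset
    READY = "ready"            # 10. ready for next job


HAPPY_PATH = list(JobState)  # Enum iteration preserves definition order


def advance(state):
    """Return the next state on the happy path."""
    index = HAPPY_PATH.index(state)
    if index == len(HAPPY_PATH) - 1:
        raise ValueError(f"{state.name} is terminal")
    return HAPPY_PATH[index + 1]


def needs_revert_on_failure(state):
    """True if a failure in this state leaves a guest that must be reverted."""
    first = HAPPY_PATH.index(JobState.VM_READY)
    last = HAPPY_PATH.index(JobState.VM_RESET)
    return first <= HAPPY_PATH.index(state) < last


if __name__ == "__main__":
    state = JobState.SUBMITTED
    while state is not JobState.READY:
        state = advance(state)
        print(state.value, needs_revert_on_failure(state))
```

Treating every state from VM boot until the reset as revert-required errs on the side of caution: a failure during STIX generation still triggers a snapshot revert.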

6.2 Tech Stack


7. Isolation & Security Best Practices

7.1 Network Isolation

OPNsense default-deny ruleset:

Default: DENY all

Allow:
- Inbound DNS (UDP 53) to INetSim (192.168.100.2)
- Inbound NTP (UDP 123) to time server
- Inbound HTTP/HTTPS (80, 443) to INetSim honeypot
- Inbound SMB (445) from Job Orchestrator only
- Inbound RDP (3389) from Job Orchestrator only

Deny:
- ALL outbound to Xfinity/Verizon gateway
- ALL outbound to management network (172.16.0.0/24)
- ALL multicast, broadcast
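
Before loading the ruleset into OPNsense, it can help to model it and test a few flows. The sketch below is a simplified first-match evaluator, not OPNsense's own rule engine. The orchestrator address `192.168.100.10` is hypothetical, the NTP rule is omitted because no time-server address is given, and multicast and broadcast traffic falls through to the default deny.

```python
import ipaddress
from dataclasses import dataclass

ANALYSIS_NET = "192.168.100.0/24"
MGMT_NET = "172.16.0.0/24"
INETSIM = "192.168.100.2/32"
ORCHESTRATOR = "192.168.100.10/32"  # hypothetical address


@dataclass(frozen=True)
class Rule:
    action: str        # "allow" or "deny"
    proto: str         # "tcp", "udp", or "any"
    dst: str           # destination network in CIDR form
    ports: tuple = ()  # empty tuple means any port
    src: str = "0.0.0.0/0"

    def matches(self, src, dst, proto, port):
        return (
            self.proto in ("any", proto)
            and (not self.ports or port in self.ports)
            and ipaddress.ip_address(src) in ipaddress.ip_network(self.src)
            and ipaddress.ip_address(dst) in ipaddress.ip_network(self.dst)
        )


# Explicit denies come first so a later allow cannot shadow them.
RULES = [
    Rule("deny", "any", MGMT_NET),
    Rule("allow", "udp", INETSIM, (53,)),
    Rule("allow", "tcp", INETSIM, (80, 443)),
    Rule("allow", "tcp", ANALYSIS_NET, (445, 3389), src=ORCHESTRATOR),
]


def evaluate(src, dst, proto, port):
    """Return the action of the first matching rule; deny if none match."""
    for rule in RULES:
        if rule.matches(src, dst, proto, port):
            return rule.action
    return "deny"


if __name__ == "__main__":
    guest = "192.168.100.50"  # hypothetical analysis guest
    print(evaluate(guest, "192.168.100.2", "udp", 53))  # DNS to INetSim
    print(evaluate(guest, "8.8.8.8", "tcp", 443))       # real internet
```

Flows that must never succeed, such as guest-to-management or guest-to-internet, make good regression checks whenever the ruleset changes.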

7.2 Host-Level Hardening

7.3 Snapshot-Based Reset

7.4 Sample Handling & Chain of Custody

  1. Hash sample on intake.
  2. Check the hash against VirusTotal and known-bad lists.
  3. Store sample in encrypted, access-controlled quarantine.
  4. Never expose sample hash/path in logs or UI outside authenticated contexts.
  5. After analysis, delete original sample; retain only quarantined drops + STIX objects.
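
Step 1 and the storage half of step 3 can be sketched as a content-addressed intake: the sample is hashed in streaming fashion and stored under its SHA-256, so the original filename never appears in the quarantine. The function names are illustrative, and the `MZ` check is only a coarse first gate, not a full file-type validation. Encryption and access control are assumed to be provided by the underlying volume and are not shown.

```python
import hashlib
import shutil
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Compute MD5, SHA-1, and SHA-256 in a single pass over the file."""
    digests = {
        "MD5": hashlib.md5(),
        "SHA-1": hashlib.sha1(),
        "SHA-256": hashlib.sha256(),
    }
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


def looks_like_pe(path):
    """Coarse check: Windows PE files start with the bytes 'MZ'."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"MZ"


def intake_sample(src, quarantine_dir):
    """Hash the sample and store it in quarantine under its SHA-256."""
    hashes = hash_file(src)
    dest = Path(quarantine_dir) / hashes["SHA-256"]
    if not dest.exists():  # identical samples are stored only once
        shutil.copyfile(src, dest)
    return {
        "hashes": hashes,
        "stored_as": str(dest),
        "is_pe": looks_like_pe(src),
    }
```

The same `hash_file` helper would serve for dropped files collected after execution, so intake and capture report digests in one format.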

8. STIX Output & Threat Intelligence Integration


9. Implementation Roadmap

Phase 1: Foundation (Weeks 1–3)

Phase 2: Monitoring & Capture (Weeks 4–5)

Phase 3: STIX Generation & Persistence (Weeks 6–8)

Phase 4: Job Queue & Automation (Weeks 9–10)

Phase 5: Integration & Hardening (Weeks 11–12)


10. Operational Considerations

10.1 Capacity Planning

10.2 Observability

10.3 Incident Response


11. Conclusion

This design provides a production-grade malware runtime analysis environment that enforces infrastructure-level isolation, captures comprehensive behavioral artifacts, models findings as STIX 2.1, persists them in Postgres, and automates the workflow through a job queue, with integration hooks for threat-intelligence platforms.

Immediate action items:

  1. Validate Proxmox dual-ISP connectivity and network topology.
  2. Stand up OPNsense firewall VM + rules.
  3. Create FLARE-VM template and test snapshot/revert cycle.
  4. Apply migrations/001_initial_schema.sql and exercise the STIX builder.
  5. Prototype the end-to-end flow against a single VM.