
GNAT High-Level Architecture

GNAT (CTM Toolkit) is a production-ready Python library providing a unified client interface for security and threat-intelligence platforms. This document describes the overall system architecture and links to the individual Architecture Decision Records (ADRs) that capture the rationale behind each major design choice.

Layers at a Glance

┌─────────────────────────────────────────────────────────────────────┐
│                        CLI / Web Dashboard                          │
│          gnat/cli/  ·  gnat/serve/  ·  gnat/viz/tui.py             │
├─────────────────────────────────────────────────────────────────────┤
│              Dissemination  (gnat/dissemination/)                   │
│     ExportService · WebhookNotifier · TAXII 2.1 · REST Gateway      │
├────────────────────┬────────────────────┬───────────────────────────┤
│   Analysis Layer   │   Reporting Layer  │   Investigation Builder   │
│   gnat/analysis/   │   gnat/reporting/  │   gnat/investigations/    │
│ Confidence · TLP   │ Report lifecycle   │ 5-step evidence graph     │
│ Correlation        │ STIX SDO export    │ Cross-platform pipeline   │
│ Timeline · Graph   │ AI drafting assist │ Workspace materialisation │
├────────────────────┴────────────────────┴───────────────────────────┤
│                         GNATClient facade                           │
│                          gnat/client.py                             │
├────────────────────┬────────────────────┬───────────────────────────┤
│   Ingest Pipeline  │  AI Agent Layer    │  Research Library         │
│   gnat/ingest/     │  gnat/agents/      │  gnat/research/           │
├────────────────────┴────────────────────┴───────────────────────────┤
│                     STIX 2.1 ORM  (gnat/orm/)                       │
├──────────────────────────────────┬──────────────────────────────────┤
│  159 Platform Connectors         │  Export Pipeline                 │
│   gnat/connectors/               │  gnat/export/                   │
├──────────────────────────────────┴──────────────────────────────────┤
│          HTTP Client Layer  (gnat/clients/  ·  gnat/async_client/)  │
│            urllib3 (sync)  ·  httpx (async)                         │
├─────────────────────────────────────────────────────────────────────┤
│   Context & Workspace  │  Scheduling    │  Search Sidecar           │
│   gnat/context/        │  gnat/schedule/│  gnat/search/ (Solr)      │
└─────────────────────────────────────────────────────────────────────┘

Core Subsystems

HTTP Client Layer

All network I/O is handled by a thin wrapper around urllib3.PoolManager for synchronous work and httpx.AsyncClient for async work. The layer provides connection pooling, configurable retries, and a uniform GNATClientError exception that carries HTTP status and body.
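
The uniform error described above can be pictured with a minimal sketch. Only the name GNATClientError and the fact that it carries HTTP status and body come from this document; the constructor signature and the raise_for_status helper are assumptions for illustration, and the transport (urllib3 or httpx) is deliberately left out.

```python
from __future__ import annotations


class GNATClientError(Exception):
    """Sketch of a uniform error carrying the HTTP status and response body."""

    def __init__(self, status: int, body: str, message: str | None = None) -> None:
        self.status = status
        self.body = body
        # Truncate the body in the message so large error pages stay readable.
        super().__init__(message or f"HTTP {status}: {body[:200]}")


def raise_for_status(status: int, body: bytes) -> bytes:
    """Hypothetical helper: return the body on 2xx, raise GNATClientError otherwise."""
    if 200 <= status < 300:
        return body
    raise GNATClientError(status, body.decode("utf-8", errors="replace"))
```

Because the helper takes only a status code and raw bytes, the same check could sit behind both the synchronous and the asynchronous path.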

ADR-0001: HTTP Client Layer · ADR-0007: Async Client


STIX 2.1 ORM

STIXBase is a pure Python class — not a SQLAlchemy model or Pydantic model. Core STIX fields are real instance attributes; all other properties live in a _properties dict exposed via __getattr__/__setattr__. Serialisation is done via to_dict() / from_dict() / to_stix_bundle(). Non-standard extension fields use the x_ prefix per STIX 2.1.
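
A minimal sketch of the attribute scheme described above, assuming three core fields (type, id, spec_version). The class name StixSketch is a stand-in and not the library's STIXBase; the real field list and method bodies may differ.

```python
from __future__ import annotations

import uuid
from typing import Any


class StixSketch:
    """Illustrative stand-in for STIXBase; not the library's actual class."""

    _CORE = ("type", "id", "spec_version")  # assumed set of core fields

    def __init__(self, type: str, id: str | None = None, **properties: Any) -> None:
        # Use object.__setattr__ so the custom __setattr__ hook is bypassed here.
        object.__setattr__(self, "_properties", {})
        object.__setattr__(self, "type", type)
        object.__setattr__(self, "id", id or f"{type}--{uuid.uuid4()}")
        object.__setattr__(self, "spec_version", "2.1")
        for key, value in properties.items():
            setattr(self, key, value)

    def __getattr__(self, name: str) -> Any:
        # Only called when normal lookup fails, i.e. for non-core properties.
        try:
            return self.__dict__["_properties"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name: str, value: Any) -> None:
        if name in self._CORE:
            object.__setattr__(self, name, value)
        else:
            self._properties[name] = value

    def to_dict(self) -> dict[str, Any]:
        core = {name: getattr(self, name) for name in self._CORE}
        return {**core, **self._properties}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "StixSketch":
        data = dict(data)  # copy so the caller's dict is not mutated
        data.pop("spec_version", None)
        return cls(data.pop("type"), data.pop("id", None), **data)
```

With this layout, an extension field such as x_acme_score (a made-up name) is stored in _properties but reads like any other attribute.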

ADR-0002: ORM / STIX Compatibility


Analysis Layer

The gnat.analysis package is the analyst-facing layer that transforms ingested CTI data into intelligence products. It provides:

ADR-0031: Analysis Layer Architecture · ADR-0033: Confidence Scoring Model · ADR-0051: Attribution & Campaign Tracking · ADR-0053: Infrastructure Graph Labels · How-to: Use the Analysis Layer


Investigation Builder

gnat.investigations.InvestigationBuilder orchestrates a five-step cross-platform evidence collection pipeline: seed expansion → incident expansion → normalisation → correlation → materialisation. It translates raw platform records into a unified EvidenceGraph of EvidenceNode and EvidenceEdge objects, then writes them to a GNAT workspace as STIX objects and Relationship SROs. The builder works with any subset of connected platform clients.

ADR-0031: Analysis Layer Architecture · How-to: Build Cross-Platform Investigations


HuntGNAT (Detection Rule Translation)

gnat.plugins.huntgnat translates STIX indicator patterns into platform-native detection rules. A recursive descent parser produces a typed AST from STIX patterns, which four translators consume:

Hunt packages (HuntPackage) bundle hypotheses, evidence, detection rules, and ATT&CK coverage into STIX Grouping objects with a lifecycle (DRAFT → PEER_REVIEWED → ACTIVE → RETIRED). CoverageAnalyzer builds ATT&CK technique × rule coverage matrices and identifies gaps. DeploymentTracker monitors where rules are deployed and DriftDetector identifies when on-platform copies diverge from canonical versions. ValidationRun scores whether rules actually fire during Atomic Red Team-style test executions.

ADR-0050: HuntGNAT — Detection Rule Translation


Telemetry Ingestion

gnat.ingest.telemetry provides high-volume sensor event ingestion for lab infrastructure:

Install with pip install "gnat[telemetry]" (kafka-python-ng + redis).

ADR-0052: Telemetry Ingestion


Analysis Rule Engine

gnat.analysis.rules provides automated hypothesis evaluation via declarative rules. Three engine implementations are available, selectable via [rules] engine in config:

All engines share the evaluation pipeline: RuleContext, Decision types, AuditWriter, RuleOrchestrator, and 26 helper functions (evidence, confidence, temporal, status, policy, source/trust). Rules are advisors — they return decisions without mutating state. The orchestrator applies decisions via InvestigationService. Feature flag default: OFF.
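
The "rules are advisors" idea can be sketched as pure functions that return decisions and leave the context untouched. RuleContext and Decision are names from this document, but their fields, the example rule, its thresholds, and the evaluate function are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RuleContext:
    # Field names are illustrative; frozen so a rule cannot mutate state.
    hypothesis_id: str
    confidence: int
    evidence_count: int


@dataclass(frozen=True)
class Decision:
    rule: str
    action: str  # e.g. "escalate" or "no_op" (hypothetical action names)
    reason: str


Rule = Callable[[RuleContext], Decision]


def escalate_when_well_supported(ctx: RuleContext) -> Decision:
    """Hypothetical rule with made-up thresholds."""
    name = "escalate_when_well_supported"
    if ctx.confidence >= 70 and ctx.evidence_count >= 3:
        return Decision(name, "escalate", "confidence and evidence thresholds met")
    return Decision(name, "no_op", "thresholds not met")


def evaluate(rules: list[Rule], ctx: RuleContext, enabled: bool = False) -> list[Decision]:
    """Collect decisions only; applying them is left to the caller."""
    if not enabled:  # mirrors a feature flag that defaults to off
        return []
    return [rule(ctx) for rule in rules]
```

Keeping evaluation free of side effects means the returned list of decisions can be audited or discarded before anything is applied.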

ADR-0054: Analysis Rule Engine


Reporting Layer

gnat.reporting provides first-class intelligence report objects with a formal five-state lifecycle (DRAFT → REVIEW → APPROVED → PUBLISHED → ARCHIVED). ReportService enforces the state machine and generates a STIX 2.1 report SDO bundle automatically on publish(). Published reports are immutable; revisions create a new draft linked via parent_report_id. Distinct from gnat.reports (operational PDF/DOCX generator) — this layer produces structured, traceable finished intelligence.
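
A minimal sketch of the five-state lifecycle, assuming transitions are strictly linear (the document gives only the forward path) and that a revision is allowed only from a published report. The Report dataclass, advance, and revise are hypothetical names, not the ReportService API.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from enum import Enum


class ReportState(Enum):
    DRAFT = "draft"
    REVIEW = "review"
    APPROVED = "approved"
    PUBLISHED = "published"
    ARCHIVED = "archived"


# Assumption: each state has exactly one successor; ARCHIVED is terminal.
_NEXT = {
    ReportState.DRAFT: ReportState.REVIEW,
    ReportState.REVIEW: ReportState.APPROVED,
    ReportState.APPROVED: ReportState.PUBLISHED,
    ReportState.PUBLISHED: ReportState.ARCHIVED,
}


class InvalidTransition(Exception):
    pass


@dataclass(frozen=True)
class Report:
    report_id: str
    state: ReportState = ReportState.DRAFT
    parent_report_id: str | None = None


def advance(report: Report) -> Report:
    """Return a copy of the report moved to the next lifecycle state."""
    if report.state not in _NEXT:
        raise InvalidTransition(f"{report.state.name} is terminal")
    return replace(report, state=_NEXT[report.state])


def revise(report: Report, new_id: str) -> Report:
    """Published reports stay untouched; a revision is a new linked draft."""
    if report.state is not ReportState.PUBLISHED:
        raise InvalidTransition("only published reports can be revised")
    return Report(new_id, parent_report_id=report.report_id)
```

The frozen dataclass models immutability: every transition yields a new object rather than editing the old one.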

ADR-0034: Report Lifecycle · ADR-0032: STIX Custom Objects · How-to: Create Intelligence Reports


Dissemination Layer

gnat.dissemination handles the outbound delivery of finished intelligence:

ADR-0028: TAXII 2.1 Server · ADR-0031: Analysis Layer Architecture · How-to: Disseminate Intelligence


Connector Architecture

Each connector uses dual inheritance — BaseClient (HTTP) and ConnectorMixin (STIX contract). Every connector must implement authenticate(), to_stix(), from_stix(), health_check(), and the four CRUD methods. Connectors are registered in CLIENT_REGISTRY in gnat/clients/__init__.py. The library ships with 159 connectors covering SIEM, XDR, TIP, ASM, OT/IoT, vulnerability management, sandboxes, MDR, identity/ITDR, email security, insider risk/UEBA, BAS, DFIR, certificate transparency, bug bounty, and AI platforms.
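
The dual-inheritance pattern and registry can be sketched as follows. BaseClient, ConnectorMixin, CLIENT_REGISTRY, and the four method names come from this document; the class bodies, the register decorator, and ExampleConnector are assumptions. The four CRUD methods are omitted because their names are not given here.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class BaseClient:
    """Stand-in for the HTTP side; the real class wraps a connection pool."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")


class ConnectorMixin(ABC):
    """Stand-in for the STIX contract (CRUD methods left out of this sketch)."""

    @abstractmethod
    def authenticate(self) -> None: ...

    @abstractmethod
    def to_stix(self, record: dict[str, Any]) -> dict[str, Any]: ...

    @abstractmethod
    def from_stix(self, obj: dict[str, Any]) -> dict[str, Any]: ...

    @abstractmethod
    def health_check(self) -> bool: ...


CLIENT_REGISTRY: dict[str, type] = {}


def register(name: str):
    """Hypothetical decorator; the library may populate the registry differently."""
    def decorator(cls: type) -> type:
        CLIENT_REGISTRY[name] = cls
        return cls
    return decorator


@register("example")
class ExampleConnector(BaseClient, ConnectorMixin):
    def authenticate(self) -> None:
        self.authenticated = True  # a real connector would obtain a token here

    def to_stix(self, record: dict[str, Any]) -> dict[str, Any]:
        return {"type": "ipv4-addr", "value": record["ip"]}

    def from_stix(self, obj: dict[str, Any]) -> dict[str, Any]:
        return {"ip": obj["value"]}

    def health_check(self) -> bool:
        return True
```

Declaring the contract as abstract methods means a connector that forgets one of them fails at instantiation rather than deep inside a pipeline.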

ADR-0003: Connector Architecture


Ingestion Framework

Three composable abstractions form the ingest pipeline:

| Abstraction | Role |
|---|---|
| SourceReader | Reads raw records from any source (file, API, TAXII, RSS, SQL…) |
| RecordMapper | Converts raw records into STIXBase objects |
| IngestPipeline | Wires reader → mapper → dedup → connector write |

14 built-in readers and 12 built-in mappers cover the most common formats. Custom readers and mappers can be dropped in by subclassing.
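
The reader → mapper → dedup → write flow can be sketched in a few lines. ListReader, DomainMapper, and run_pipeline are hypothetical names, plain dicts stand in for STIXBase objects, and the UUIDv5 identifier is a simple way to get stable IDs for deduplication rather than the STIX 2.1 deterministic-ID algorithm.

```python
from __future__ import annotations

import uuid
from typing import Any, Callable, Iterator


class ListReader:
    """Plays the SourceReader role: yields raw records from an in-memory list."""

    def __init__(self, rows: list[dict[str, Any]]) -> None:
        self.rows = rows

    def read(self) -> Iterator[dict[str, Any]]:
        yield from self.rows


class DomainMapper:
    """Plays the RecordMapper role: raw record -> STIX-like dict."""

    def map(self, record: dict[str, Any]) -> dict[str, Any]:
        value = record["domain"].strip().lower()
        # Assumption: a name-based UUID keeps IDs stable across runs.
        stix_id = f"domain-name--{uuid.uuid5(uuid.NAMESPACE_DNS, value)}"
        return {"type": "domain-name", "id": stix_id, "value": value}


def run_pipeline(reader: ListReader, mapper: DomainMapper,
                 write: Callable[[dict[str, Any]], None]) -> int:
    """Wire reader -> mapper -> dedup -> write; return the number written."""
    seen: set[str] = set()
    written = 0
    for record in reader.read():
        obj = mapper.map(record)
        if obj["id"] in seen:  # drop duplicates within this run
            continue
        seen.add(obj["id"])
        write(obj)
        written += 1
    return written
```

In this sketch the write callable stands in for a connector; passing list.append is enough to observe what would be written.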

ADR-0004: Ingestion Framework


Context and Workspace

A GlobalContextRegistry tracks named connector instances and their read/write permissions. WorkspaceManager creates and manages investigation workspaces, each with its own object graph and diff/commit lifecycle. Workspaces are serialised to JSON for persistence; an optional SQLAlchemy back end is available via the persist extra.

ADR-0005: Context System · ADR-0006: Workspace Persistence · ADR-0027: Multi-Tenant Workspace Isolation


Visualization

Three rendering targets are supported out of the box:

| Target | Module | Best for |
|---|---|---|
| Tabular (pandas / rich) | gnat/viz/tabular.py | CLI output, quick review |
| Graph (sigma.js / pyvis) | gnat/viz/graph.py | Relationship exploration |
| Grafana / Power BI export | gnat/viz/ | Operational dashboards |

ADR-0008: Visualization — Tabular · ADR-0009: Visualization — Graph · ADR-0010: Visualization — Grafana vs Power BI


CLI

The CLI (gnat/cli/main.py) uses argparse subcommands with no framework dependency. It surfaces ingest, export, scheduling, workspaces, connectors, reports, and code generation as top-level subcommands.

ADR-0011: CLI Design · ADR-0023: Terminal UI — Textual


Code Generation

gnat/codegen/ scaffolds new connector packages from an OpenAPI specification. It generates the directory layout, __init__.py, client.py stub with the full ConnectorMixin contract, unit test skeleton, INI example block, and ADR stub.

ADR-0012: Code Generation · ADR-0024: XSOAR Content Pack Generator


Configuration

Configuration is INI-based and uses the stdlib configparser. Search order: GNAT_CONFIG env var → ~/.gnat/config.ini → ./gnat.ini. Each platform gets its own section; shared settings live in [global]. No external config library is used.
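
A minimal sketch of the search order, read here as GNAT_CONFIG first, then ~/.gnat/config.ini, then ./gnat.ini. The function names are hypothetical, and "first existing file wins" is an assumption: the library could equally merge several files.

```python
from __future__ import annotations

import configparser
import os
from pathlib import Path
from typing import Mapping


def candidate_paths(env: Mapping[str, str] | None = None) -> list[Path]:
    """Return config locations in priority order."""
    env = os.environ if env is None else env
    paths: list[Path] = []
    if env.get("GNAT_CONFIG"):
        paths.append(Path(env["GNAT_CONFIG"]))
    paths.append(Path.home() / ".gnat" / "config.ini")
    paths.append(Path("gnat.ini"))  # relative to the current working directory
    return paths


def load_config(env: Mapping[str, str] | None = None) -> configparser.ConfigParser:
    """Load the first config file that exists; return an empty parser otherwise."""
    parser = configparser.ConfigParser()
    for path in candidate_paths(env):
        if path.is_file():
            parser.read(path, encoding="utf-8")
            break  # assumption: first match wins, no merging
    return parser
```

Shared settings would then be read with ordinary configparser access, for example load_config()["global"], with per-platform values in their own sections.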

ADR-0013: Configuration


Testing Strategy

Unit tests live in tests/unit/ and mock at the HTTP layer via mock_pool_manager. Integration tests in tests/integration/ are gated behind @pytest.mark.integration and the --run-integration pytest flag; they require live credentials in GNAT_CONFIG. Minimum coverage is 70 %.

ADR-0014: Testing Strategy


Packaging and Extras

GNAT uses setuptools extras so users install only what they need. The core package requires only urllib3. Optional feature groups (yaml, taxii, ingest, async, persist, schedule, reports, viz, serve) are installed on demand. The all extra pulls everything.

ADR-0015: Packaging and Extras


Feed Scheduling

FeedJob wraps a (SourceReader, RecordMapper, connector) triple with a cron expression. FeedScheduler runs jobs via croniter, tracks last_success, and passes a JobRunContext to each reader factory so incremental fetches work correctly.
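
To illustrate why the run context matters for incremental fetches, here is a minimal sketch. JobRunContext and last_success are names from this document; the single-field dataclass, the incremental_reader function, and the "modified" record key are assumptions, and cron evaluation via croniter is left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Any, Iterator


@dataclass(frozen=True)
class JobRunContext:
    # None means the job has never completed successfully.
    last_success: datetime | None


def incremental_reader(ctx: JobRunContext,
                       records: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield only records modified after the last successful run."""
    for record in records:
        if ctx.last_success is None or record["modified"] > ctx.last_success:
            yield record
```

On the first run everything is fetched; later runs skip records at or before the stored timestamp.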

ADR-0016: Feed Scheduling


Export Pipeline

The export layer converts STIXBase objects to delivery-ready formats. Built-in targets include EDL (plain-text IP/domain/URL block lists) and Netskope CE. A filter chain (ConfidenceFilter, TLPFilter, SectorFilter) gates what reaches each target.
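
A filter chain feeding an EDL target might look like the sketch below. The function names are hypothetical, plain dicts stand in for STIXBase objects, and the x_tlp field is a simplification: STIX 2.1 expresses TLP through marking definitions rather than a flat property.

```python
from __future__ import annotations

from typing import Any, Callable, Iterable

Filter = Callable[[dict[str, Any]], bool]

# Least to most restrictive.
_TLP_RANK = {"clear": 0, "green": 1, "amber": 2, "red": 3}


def confidence_filter(minimum: int) -> Filter:
    """Pass objects whose confidence is at least `minimum`."""
    return lambda obj: obj.get("confidence", 0) >= minimum


def tlp_filter(max_level: str) -> Filter:
    """Pass objects marked at or below `max_level`; unmarked objects count as red."""
    limit = _TLP_RANK[max_level]
    return lambda obj: _TLP_RANK[obj.get("x_tlp", "red")] <= limit


def to_edl(objects: Iterable[dict[str, Any]], filters: list[Filter]) -> str:
    """Render one value per line for objects that pass every filter."""
    lines = [obj["value"] for obj in objects if all(f(obj) for f in filters)]
    return "\n".join(lines) + ("\n" if lines else "")
```

Treating unmarked objects as the most restrictive level is a conservative default chosen for this sketch, so nothing leaks to a block list by omission.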

ADR-0017: Export / Integration Pipeline


AI Agent Layer

ResearchAgent (a SourceReader) and ParsingAgent (a RecordMapper) drop directly into the existing IngestPipeline and FeedJob infrastructure. They call the Claude API using stdlib urllib (no anthropic SDK dependency). Every AI-extracted STIX object is capped at ai_confidence_ceiling (default 60) and tagged x_source_type: "ai_extracted" to require human review before high-stakes propagation. CopilotReader connects to Microsoft 365 via the Bot Framework DirectLine v3 API.
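
The confidence ceiling amounts to a clamp plus a provenance tag. The sketch below assumes STIX objects are plain dicts and that objects without a confidence value are left without one; apply_ai_ceiling is a hypothetical name.

```python
from __future__ import annotations

from typing import Any


def apply_ai_ceiling(obj: dict[str, Any], ceiling: int = 60) -> dict[str, Any]:
    """Return a copy with confidence capped at `ceiling` and tagged as AI-extracted."""
    capped = dict(obj)  # do not mutate the caller's object
    if "confidence" in capped:
        capped["confidence"] = min(int(capped["confidence"]), ceiling)
    capped["x_source_type"] = "ai_extracted"
    return capped
```

Values already under the ceiling pass through unchanged, so the cap limits only over-confident extractions.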

ADR-0018: AI Agent Layer


Research Library

ResearchLibrary provides a curated, searchable store of threat reports, news, and analyst notes. CurationJob automates ingestion from monitored RSS/web sources. The library integrates with the AI agent layer for AI-assisted summarisation and with the Solr search sidecar for full-text search.

ADR-0019: Shared Research Library


NLP Query Layer

A natural-language query interface sits in front of workspace objects and the research library. Queries are translated to structured filters by the AI agent layer, allowing analysts to ask questions like “show me all ransomware indicators added this week” without writing code.

ADR-0020: NLP Query Layer


Rust Native Extension

An optional Rust extension module (gnat._core) accelerates hot-path IOC operations: classify, defang, refang, extract pattern value, and batch classify. The Python shim (gnat/ingest/_ioc_classifier.py) detects whether the compiled extension is available and falls back to the pure-Python implementation transparently.

ADR-0021: Rust Native Extension


Web Dashboard

gnat/serve/ exposes a FastAPI-based web dashboard for browsing workspaces, running queries, and reviewing AI-extracted objects. It is an optional component installed via the serve extra (fastapi + uvicorn).

ADR-0022: Web Dashboard


Upstream Contribution Pipeline

This pipeline formats GNAT-curated intelligence as pull requests or API submissions to open-source threat-intel communities (MISP galaxies, OpenCTI, TAXII 2.1 servers). Submissions are governed by configurable confidence thresholds and TLP markings.

ADR-0025: Upstream Contribution Pipeline


Connector Health Monitor

A background service that calls each registered connector’s health_check() method on a configurable interval, records latency and availability metrics, and surfaces connector status in the web dashboard and CLI.
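
One polling pass of such a monitor can be sketched as below. HealthSample and poll_once are hypothetical names; connectors are represented by their health_check callables, and the interval loop and metric storage are left out.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HealthSample:
    name: str
    healthy: bool
    latency_ms: float


def poll_once(checks: dict[str, Callable[[], bool]],
              clock: Callable[[], float] = time.perf_counter) -> list[HealthSample]:
    """Run every health check once, timing each call."""
    samples: list[HealthSample] = []
    for name, check in checks.items():
        start = clock()
        try:
            healthy = bool(check())
        except Exception:
            # A check that raises is recorded as unavailable, not propagated.
            healthy = False
        samples.append(HealthSample(name, healthy, (clock() - start) * 1000.0))
    return samples
```

Injecting the clock keeps the function easy to test; a background service would call poll_once on its configured interval and forward the samples to the dashboard and CLI.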

ADR-0026: Connector Health Monitor


TAXII 2.1 Server

An embedded TAXII 2.1 server (gnat/serve/taxii/) allows GNAT to act as a threat-intel distribution point. Collections map to connector namespaces or workspace snapshots. Requires the serve extra.

ADR-0028: TAXII 2.1 Server


Docker Containerisation

Official Docker images and a docker-compose.yml ship with the repository. The compose stack includes the GNAT API server, Solr (search sidecar), and a scheduler container. Configuration is injected via environment variables that map to INI keys.

ADR-0029: Docker Containerization


Architecture Decision Records Index

All ADRs are stored in docs/explanation/architecture/adrs/ and listed in the ADR README.

| # | Title | Topic |
|---|---|---|
| 0001 | HTTP Client Layer | Infrastructure |
| 0002 | ORM / STIX Compatibility | Data model |
| 0003 | Connector Architecture | Integration |
| 0004 | Ingestion Framework | Data pipeline |
| 0005 | Context System | State management |
| 0006 | Workspace Persistence | State management |
| 0007 | Async Client | Infrastructure |
| 0008 | Visualization — Tabular | UX |
| 0009 | Visualization — Graph | UX |
| 0010 | Visualization — Grafana vs Power BI | UX |
| 0011 | CLI Design | UX |
| 0012 | Code Generation | Developer experience |
| 0013 | Configuration | Infrastructure |
| 0014 | Testing Strategy | Quality |
| 0015 | Packaging and Extras | Distribution |
| 0016 | Feed Scheduling | Data pipeline |
| 0017 | Export / Integration Pipeline | Data pipeline |
| 0018 | AI Agent Layer | Intelligence |
| 0019 | Shared Research Library | Intelligence |
| 0020 | NLP Query Layer | Intelligence |
| 0021 | Rust Native Extension | Performance |
| 0022 | Web Dashboard | UX |
| 0023 | Terminal UI — Textual | UX |
| 0024 | XSOAR Content Pack Generator | Developer experience |
| 0025 | Upstream Contribution Pipeline | Integration |
| 0026 | Connector Health Monitor | Operations |
| 0027 | Multi-Tenant Workspace Isolation | State management |
| 0028 | TAXII 2.1 Server | Integration |
| 0029 | Docker Containerization | Operations |
| 0030 | Adopt Diátaxis and ADRs | Documentation |
| 0031 | Analysis Layer Architecture | Intelligence |
| 0032 | STIX Custom Objects | Data model |
| 0033 | Confidence Scoring Model | Intelligence |
| 0034 | Report Lifecycle State Machine | Intelligence |
| 0035 | Quality Agents | Quality |
| 0036 | Security Agents (Phase B) | Quality |
| 0037 | Responsible Disclosure, DCO, and Apache 2.0 Compliance | Governance |

Key Design Principles

| Principle | Rationale |
|---|---|
| urllib3 over requests | Direct control, no extra abstraction layer, compatible with async path |
| Pure-Python ORM | STIX objects are not DB-bound; serialise to JSON, not sessions |
| ConnectorMixin contract | Every connector exposes the same CRUD + STIX surface; no special casing in pipelines |
| Extras-based packaging | Users pay only for the dependencies they actually use |
| AI confidence ceiling | AI-extracted intel requires human review before high-stakes propagation |
| INI configuration | Zero external config library; works everywhere configparser works |
| Diátaxis docs | Each document has one purpose: tutorial, how-to, reference, or explanation |

Licensed under the Apache License, Version 2.0