
How to configure YARA rules

SandGNAT has two independent YARA scan points:

  1. Intake-time quick scan (INTAKE_YARA_RULES_DIR) — runs on the orchestrator against every submission before enqueue. Matches bump priority and annotate the job.
  2. Deep scan on the Linux static-analysis guest (STATIC_YARA_DEEP_RULES_DIR on the host; LINUX_GUEST_YARA_DEEP_RULES_DIR on the guest) — runs as part of the static stage with a heavier ruleset.

Both use yara-python. Both are optional: a missing library or an empty rules directory degrades to a no-op.

When to use which

They’re independent; you can run both, either, or neither.

Install yara-python

The intake path needs the yara optional extra:

pip install -e '.[yara]'

yara-python bundles libyara by default on Linux, but the install can fail if your build toolchain is missing or misconfigured. If you see “libyara.so not found,” your distro probably ships libyara as a separate package:

# Debian/Ubuntu:
apt-get install libyara-dev yara

On the Linux static-analysis guest, the tool wrapper imports yara-python lazily and reports the scan as skipped if the library is missing. A guest without libyara still works; the deep scan just doesn’t run.
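The lazy-import-and-degrade behaviour can be sketched as follows. This is an illustrative sketch only, not SandGNAT's actual wrapper: the function names `load_yara` and `deep_scan` and the shape of the returned dict are hypothetical, while `rules.match(filepath=...)` and the `rule` attribute on each match are standard yara-python.

```python
import importlib
from typing import Any, Optional


def load_yara() -> Optional[Any]:
    """Return the yara module, or None if yara-python cannot be imported."""
    try:
        return importlib.import_module("yara")
    except (ImportError, OSError):
        # ImportError: yara-python is not installed or its extension failed to load.
        # OSError: defensive catch in case a shared library fails to load.
        return None


def deep_scan(path: str, rules: Any = None) -> dict:
    """Hypothetical wrapper: report 'skipped' rather than failing the static stage."""
    yara = load_yara()
    if yara is None or rules is None:
        # No library or no compiled rules: the scan is a no-op.
        return {"status": "skipped", "matches": []}
    matches = rules.match(filepath=path)
    return {"status": "ok", "matches": [m.rule for m in matches]}
```

The import happens inside the function rather than at module level, so the rest of the static stage can be loaded on a guest that has no libyara at all.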

Configure rule directories

On the orchestrator (intake + export):

INTAKE_YARA_RULES_DIR=/etc/sandgnat/yara-intake
STATIC_YARA_DEEP_RULES_DIR=/etc/sandgnat/yara-deep

On the Linux static-analysis guest:

LINUX_GUEST_YARA_DEEP_RULES_DIR=/etc/sandgnat/yara-deep

(You’ll typically mount the same rules volume on both host and guest via a shared filesystem; see build-linux-guest.md.)

Both directories are scanned recursively for *.yar / *.yara files. Every rule file is compiled once at service start; compile errors surface at boot, not at first sample.
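A minimal sketch of that startup step, assuming a plain recursive walk and one YARA namespace per file. The helper names `collect_rule_files` and `compile_rules` are hypothetical and the real service may organise this differently; `yara.compile(filepaths=...)` is the standard yara-python call for compiling several files at once.

```python
from pathlib import Path
from typing import Dict

RULE_SUFFIXES = {".yar", ".yara"}


def collect_rule_files(rules_dir: str) -> Dict[str, str]:
    """Map a namespace (the relative path) to each rule file under rules_dir."""
    root = Path(rules_dir)
    if not root.is_dir():
        return {}
    return {
        p.relative_to(root).as_posix(): str(p)
        for p in sorted(root.rglob("*"))
        if p.is_file() and p.suffix in RULE_SUFFIXES
    }


def compile_rules(rules_dir: str):
    """Compile every rule file once; return None when scanning should be a no-op."""
    files = collect_rule_files(rules_dir)
    if not files:
        return None  # missing or empty rules directory
    try:
        import yara
    except ImportError:
        return None  # missing library
    # A yara.SyntaxError raised here propagates, so a bad rule fails at startup.
    return yara.compile(filepaths=files)
```

Calling `compile_rules` once during service initialisation and keeping the result is what makes compile errors surface at boot rather than at first sample.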

Writing rules that SandGNAT cares about

Intake promotes priority (prioritized decision, priority ≤ 2) for:

Everything else is matched and recorded but doesn’t bump priority.

Example “high-severity” rule that would promote:

rule EvilCorp_Stealer_v3 : stealer malware
{
    meta:
        author = "your-analyst"
        severity = "high"
        description = "Known EvilCorp credential-stealer variant v3"

    strings:
        $config_magic = "ECSC3" wide
        $c2_pattern = /\bec[a-z0-9]{3,}\.example\b/

    condition:
        $config_magic and $c2_pattern
}

An “advisory” rule that would just annotate:

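import "math"

rule High_Entropy_Code_Section
{
    meta:
        severity = "info"
        description = "Code section entropy suggests packing"

    condition:
        // Entropy of the final 1 KiB; the size guard avoids a negative offset on tiny files.
        filesize >= 1024 and math.entropy(filesize - 1024, 1024) >= 7.0
}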

Deep-scan rules

The deep scan is free to use heavier features:

If you have vendor-licensed rulesets (e.g. from a threat-intel feed), put them in the deep dir. They’re typically too heavy to run against every submission at intake.

Verify rules loaded

Check the intake-service logs at startup. You should see:

INFO orchestrator.yara_scanner: Compiling 14 YARA rule files from /etc/sandgnat/yara-intake

(The number is the count of distinct rule files, not individual rules.)

Submit a known-bad sample and verify the /submit response:

{
  "decision": "prioritized",
  "priority": 2,
  "yara_matches": [
    {"rule": "EvilCorp_Stealer_v3", "tags": ["stealer", "malware"], "meta": {"severity": "high"}}
  ]
}
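If you want to script this check, the response can be inspected along these lines. This is a sketch that assumes only the fields shown in the example response above; the helper names `matched_rules` and `was_promoted` are hypothetical, and how you obtain the response body (HTTP client, URL, authentication) is left out.

```python
import json
from typing import List


def matched_rules(response_body: str) -> List[str]:
    """Return the names of the YARA rules listed in a /submit response body."""
    payload = json.loads(response_body)
    return [m["rule"] for m in payload.get("yara_matches", [])]


def was_promoted(response_body: str) -> bool:
    """True when intake marked the submission as prioritized."""
    payload = json.loads(response_body)
    return payload.get("decision") == "prioritized"


# Example body copied from the response shown above.
SAMPLE_RESPONSE = """
{
  "decision": "prioritized",
  "priority": 2,
  "yara_matches": [
    {"rule": "EvilCorp_Stealer_v3", "tags": ["stealer", "malware"],
     "meta": {"severity": "high"}}
  ]
}
"""

if __name__ == "__main__":
    print(matched_rules(SAMPLE_RESPONSE))
    print(was_promoted(SAMPLE_RESPONSE))
```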

And in the DB:

SELECT yara_matches FROM analysis_jobs WHERE id = '...';

Failure modes

Managing rules

SandGNAT has no opinion about how you maintain the rules directory. Common patterns:

Security