⚙️ The 3 Types of Automation in GRC Engineering (pick the right one)

Your objective determines your automation type. Not what sounds cool, or what sounds most like GRC Engineering. Outcomes before tools!


📣 Moving from Scripting to Engineering

Most GRC teams conflate scripting with systems engineering.

They're writing Python scripts to collect evidence. Calling APIs to check control status. Building dashboards to track compliance.

This is GRC Scripting.

Valuable? Absolutely.

Critical infrastructure? Yes.

But it's a different discipline from GRC Systems Engineering.

Here's the distinction: scripting solves the immediate task (the plumbing). Systems engineering architects solutions that scale (the infrastructure design).

Both matter. Both require technical skill. But understanding the difference changes what you build and how you approach problems.

As a GRC leader, you need to know: which discipline does this problem need? Are you doing plumbing or architecture?

With that distinction in mind, let's walk through the three types.

The 3 Types of Automation

The framework: The three types of GRC Engineering automations

Type 1: Deterministic Automation

This is the scripting era. You write rules, the system executes them. Pass or fail. Yes or no.

What it looks like:

  • Python scripts collecting compliance evidence

  • APIs pulling patch data from vulnerability scanners

  • Scheduled tasks gathering logs for audit trails

  • Dashboards aggregating control status across tools

When it works brilliantly:

  • Repeatable processes with clear inputs and outputs

  • Structured data from reliable sources

  • Binary outcomes (compliant/non-compliant)

  • Known good states you can check against

# evidence_collector.py
from google.cloud import storage

def check_gcs_encryption():
    client = storage.Client()
    bucket = client.get_bucket('compliance-evidence')
    
    if bucket.default_kms_key_name:
        return 'Compliant: Encryption enabled'
    else:
        return 'Non-Compliant: No encryption'

# Run monthly via cron
result = check_gcs_encryption()
print(result)

Simple. Mechanical. Predictable.

Why it breaks:

  • Vendors change their UI (screenshot script fails)

  • Data sources return unexpected formats

  • Edge cases you didn't anticipate

  • Any ambiguity in requirements

The limitation isn't technical. It's conceptual. Deterministic automation assumes GRC work is mechanical. Follow these steps, get this result. But compliance isn't mechanical. Risk isn't binary. Security doesn't fit into if/then statements.

When to use Type 1: Evidence collection from structured sources. Status aggregation across tools. Scheduled compliance reporting. Any task where you can write clear if/then rules upfront.

This is where 90% of GRC teams live. They've automated evidence collection and think they're done.

Type 2: AI-Powered Automation

This is where we are right now. LLMs can read unstructured documents, analyze security posture, extract controls from frameworks. It feels like magic compared to regex patterns.

What it is: LLM API calls within deterministic workflows. Script collects data → AI analyzes → Script executes action. Hybrid architecture: deterministic flow, AI reasoning step.

The key pattern:

# vendor_risk_assessment.py
import json

from google.cloud import storage
import anthropic

def assess_vendor_risk(vendor_id):
    # Step 1: Deterministic - Collect vendor data
    # (assumes the questionnaire is already exported to plain text;
    # a raw PDF would need text extraction first)
    client = storage.Client()
    bucket = client.get_bucket('vendor-assessments')
    blob = bucket.blob(f'{vendor_id}/security_questionnaire.txt')
    questionnaire_text = blob.download_as_text()
    
    # Step 2: AI-Powered - LLM analyzes security posture
    claude = anthropic.Anthropic()
    response = claude.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"""Analyze this vendor security questionnaire and assess risk:

{questionnaire_text}

Evaluate against these criteria:
- Data encryption practices (at-rest and in-transit)
- Access control implementation
- Incident response capabilities
- Third-party security assessments

Provide:
1. Overall risk score (1-10, where 10 is highest risk)
2. Key security gaps identified
3. Recommendation: APPROVE, REVIEW, or REJECT

Format as JSON with keys: risk_score, key_gaps, recommendation."""
        }]
    )
    
    # Parse the LLM's JSON response (production code should handle
    # malformed output - more on that below)
    analysis = json.loads(response.content[0].text)
    risk_score = analysis['risk_score']
    
    # Step 3: Deterministic - Execute based on AI analysis
    # (ticketing and routing helpers are defined elsewhere)
    if risk_score >= 8:
        create_high_risk_ticket(vendor_id, analysis)
        return 'REJECT: High risk - requires security review'
    elif risk_score >= 5:
        assign_to_senior_analyst(vendor_id, analysis)
        return 'REVIEW: Medium risk - manual assessment needed'
    else:
        auto_approve_vendor(vendor_id)
        return 'APPROVE: Low risk - auto-approved'

# Run for each vendor
result = assess_vendor_risk('vendor_523')

The AI provides reasoning. The script controls execution.

When it works brilliantly:

  • Converting unstructured content into structured formats

  • Analysis requiring reasoning over context

  • Pattern matching across large document sets

  • Judgment calls within defined boundaries

Real examples:

  • Vendor questionnaire analysis: Extract responses → Claude assesses security posture → Route based on risk score

  • Control extraction: Collect framework documents → LLM identifies applicable controls → Map to your requirements

  • Policy generation: Gather your actual practices → AI writes policy matching implementation → Format for compliance

Here's the trap: Type 2 uses probabilistic tools (LLMs) but many teams still expect deterministic outcomes. You want Claude to analyze a vendor and you want the exact same output every time.

That's not how probability works.

Most GRC teams using AI are forcing Type 2 tools into Type 1 thinking. They treat LLMs like deterministic APIs, get frustrated with variation they haven't designed for, and don't build validation frameworks. They don't trust results without manual review. They've added AI to their workflow but haven't changed their workflow for AI.
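Designing for that variation is the fix. Here's a minimal sketch, building on the hypothetical assess_vendor_risk workflow above and the JSON keys it requests: validate the structure of the LLM's output before acting on it, retry once, and route anything that fails validation to a human instead of trusting it.

# llm_output_validation.py
# Design for variation: validate the LLM's JSON before acting on it,
# retry once, then fall back to human review instead of trusting it.
import json

REQUIRED_KEYS = {'risk_score', 'key_gaps', 'recommendation'}
VALID_RECOMMENDATIONS = {'APPROVE', 'REVIEW', 'REJECT'}

def validate_analysis(raw_text):
    """Return the parsed analysis if it passes structural checks, else None."""
    try:
        analysis = json.loads(raw_text)
    except json.JSONDecodeError:
        return None  # model wrapped the JSON in prose, or truncated it
    if not REQUIRED_KEYS.issubset(analysis):
        return None  # missing fields
    score = analysis['risk_score']
    if not isinstance(score, (int, float)) or not 1 <= score <= 10:
        return None  # wrong type or out-of-range score
    if analysis['recommendation'] not in VALID_RECOMMENDATIONS:
        return None  # model invented a fourth category
    return analysis

def assess_with_validation(vendor_id, call_llm, max_attempts=2):
    """call_llm(vendor_id) is the Type 2 API call above; returns raw text."""
    for _ in range(max_attempts):
        analysis = validate_analysis(call_llm(vendor_id))
        if analysis is not None:
            return analysis
    # Output we can't validate goes to a human, not into production
    return {'risk_score': None, 'key_gaps': ['LLM output failed validation'],
            'recommendation': 'REVIEW'}

The specific schema doesn't matter. What matters is that the workflow expects variation and has a defined path for output it can't trust.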

When to use Type 2: Reasoning over unstructured data within a clear workflow. Analysis, extraction, and classification where context matters, with programmatic action on the results.

Type 3: Intelligent Automation

This is where GRC moves from scripting to systems engineering. You're not implementing solutions to specific tasks.

You're architecting systems that make autonomous decisions at scale. This is the future of GRC Engineering - but it's not appropriate for most GRC work.

Most problems need good scripting (Types 1 and 2), not complex systems architecture. When you DO need it, here's what it looks like: Autonomous agents with tool-calling frameworks. Self-directed based on context. Adaptive execution.

Real example - TPRM (third-party risk management) agents:

  • Read vendor documentation (autonomous tool call)

  • Extract security controls (analyzes what's relevant)

  • Evaluate against our requirements (contextual assessment)

  • Identify gaps (adaptive analysis)

  • Calculate risk scores (judgment based on findings)

  • Generate recommendations (context-specific guidance)

The agent decides which tools to call next based on what it learns. Not following a pre-written script. Making decisions.
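Here's a minimal sketch of that loop using Anthropic's tool-use API. The tool names, schemas, and stub dispatcher are hypothetical stand-ins for a real TPRM toolset:

# tprm_agent.py
# Minimal agent loop: the model, not a script, decides which tool
# to call next based on what it has learned so far.
import anthropic

# Hypothetical TPRM tools the agent can choose between
TOOLS = [
    {"name": "read_vendor_docs",
     "description": "Fetch the vendor's security documentation as text.",
     "input_schema": {"type": "object",
                      "properties": {"vendor_id": {"type": "string"}},
                      "required": ["vendor_id"]}},
    {"name": "lookup_requirement",
     "description": "Return our internal requirement for a control domain.",
     "input_schema": {"type": "object",
                      "properties": {"domain": {"type": "string"}},
                      "required": ["domain"]}},
]

def execute_tool(name, tool_input):
    # Replace with real integrations (document store, GRC platform, etc.)
    return f"[stub result for {name} with {tool_input}]"

def run_assessment(vendor_id, max_turns=10):
    client = anthropic.Anthropic()
    messages = [{"role": "user",
                 "content": f"Assess vendor {vendor_id} against our security "
                            "requirements. Identify gaps, score the risk 1-10, "
                            "and recommend APPROVE, REVIEW, or REJECT."}]
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=2048,
            tools=TOOLS,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            return response.content[0].text  # final recommendation
        # The model chose a tool; run it and feed the result back
        messages.append({"role": "assistant", "content": response.content})
        results = [{"type": "tool_result", "tool_use_id": block.id,
                    "content": execute_tool(block.name, block.input)}
                   for block in response.content if block.type == "tool_use"]
        messages.append({"role": "user", "content": results})
    return 'REVIEW: agent exceeded turn budget - manual assessment needed'

Notice there's no fixed sequence of steps. The loop is the entire workflow; the model's tool choices are the control flow.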

Results: Minutes per vendor assessment vs hours when manual. 100% accuracy on control extraction. 60 vendors in hours that would take analysts weeks.

What makes it different:

  • Type 1: You write all the rules (if vendor_score > 7, alert)

  • Type 2: You write the workflow, AI does reasoning step (analyze this, then I'll route)

  • Type 3: Agent decides what to do next (assess vendor, determine next steps autonomously)

When it works brilliantly:

  • Complex judgment requiring context

  • Decisions that adapt to new information

  • Scale makes human review impractical

  • Problems too expensive to solve manually

What makes it different from Types 1 and 2:

  • Embraces probabilistic outputs instead of fighting them

  • Includes evaluation frameworks (how do we know it's working?)

  • Learns from patterns across assessments

  • Produces trustworthy results, not perfect results

When to use Type 3: Complex decisions at scale. Vendor assessments. Risk analysis. Continuous monitoring requiring autonomous response. When the next step depends on what you learn.

When NOT to use Type 3: Simple repeatable tasks. Tasks where you can write the complete workflow upfront. When you can't build evaluation frameworks to verify quality.
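What does an evaluation framework actually look like? A minimal sketch, assuming you have analysts hand-label a small golden set of vendors: run the agent over the set and measure agreement before letting it make decisions at scale.

# agent_eval.py
# Minimal evaluation harness: compare agent output against an
# analyst-labeled golden set before trusting the agent at scale.

# Hypothetical golden set: vendor_id -> analyst-decided recommendation
GOLDEN_SET = {
    'vendor_101': 'APPROVE',
    'vendor_102': 'REVIEW',
    'vendor_103': 'REJECT',
    # ideally 20-30 vendors covering easy, hard, and edge cases
}

def evaluate_agent(run_assessment, threshold=0.90):
    """run_assessment(vendor_id) -> 'APPROVE' | 'REVIEW' | 'REJECT'"""
    disagreements = []
    for vendor_id, expected in GOLDEN_SET.items():
        actual = run_assessment(vendor_id)
        if actual != expected:
            disagreements.append((vendor_id, expected, actual))
    agreement = 1 - len(disagreements) / len(GOLDEN_SET)
    print(f'Agreement with analysts: {agreement:.0%}')
    for vendor_id, expected, actual in disagreements:
        print(f'  {vendor_id}: analyst said {expected}, agent said {actual}')
    # Gate deployment on the evaluation, not on vibes
    return agreement >= threshold

If you can't build something like this, you can't verify the agent's quality, and Type 3 is the wrong choice.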


Ralph building Type 3 automation for a Type 1 problem. (Don't be Ralph.)

How to Choose the Right Type

Decision matrix: how to know what to use

Do You Actually Need Automation?

Before choosing between Type 1, 2, or 3, ask whether you need automation at all.

Maybe you need better process design. Maybe you need to optimize your workflow. I've seen teams spend weeks automating 2-hour tasks that become 15-minute tasks with better processes.

The engineering mindset: Question the problem before solving it. Optimize the process before automating it.

If you've done that and still need automation, here's the framework for choosing which type.

Ask the Honesty Questions

1. What's my actual objective?

Not "automate evidence collection" but "demonstrate continuous compliance" or "enable risk-based decisions."

Different objectives need different automation:

  • "Evidence ready for annual audit" → Might not need automation

  • "Continuous compliance visibility" → Type 1 (scheduled scripts)

  • "Risk-based vendor decisions at scale" → Type 3 (autonomous agents)

2. What's the true complexity?

  • Can I write clear if/then rules? → Type 1

  • Need reasoning over unstructured data? → Type 2

  • Need autonomous decisions that adapt? → Type 3

3. What's the scale?

  • <20 items → Probably don't automate

  • 20-100 items → Type 1 or 2

  • 100+ with context-dependent decisions → Consider Type 3

4. What happens if it's wrong?

  • If a wrong answer means an audit failure → Keep human review

  • If a wrong answer just means inefficient triage → Type 3 with evaluation frameworks works

Decision Matrix

Your Situation                  Type     Why
10 GCP logs monthly             Type 1   Mechanical, structured
40 vendor questionnaires        Type 2   Needs reasoning, clear workflow
60 vendor assessments           Type 3   Complex judgment at scale
Weekly dashboard updates        Type 1   Scheduled, structured
Framework control extraction    Type 2   Unstructured → structured

The test: Can you write the complete workflow upfront? → Type 1 or 2. Does next step depend on what you learn? → Type 3.

Common Mistakes to Avoid

Mistake 1: Building Type 3 for Type 1 Problems

Multi-agent orchestration for 10 monthly logs. Cost: $427/month to save 15 minutes. You escape the manual loop but create a maintenance loop.

Mistake 2: Using Type 1 for Type 3 Problems

If/then rules for complex judgment mean 500-line scripts with nested conditionals that break constantly.
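A flavor of the anti-pattern, with hypothetical vendor fields for illustration:

# The anti-pattern: encoding judgment as nested conditionals
def route_vendor(vendor):
    if vendor['has_soc2']:
        if vendor['handles_pii']:
            if vendor['subprocessor_count'] > 5:
                if vendor['region'] == 'EU':
                    return 'REVIEW'   # GDPR exposure... probably?
                return 'APPROVE'      # unless they're a data broker?
            return 'APPROVE'
        return 'APPROVE'              # wait, what about availability risk?
    return 'REJECT'                   # even for a $50/year screenshot tool?
    # ...plus another branch for every edge case production uncovers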

Mistake 3: Expecting Type 1 Outputs from Type 2

LLMs are probabilistic. Demanding identical results means you don't trust outputs, add manual review everywhere, gain zero value.

The pattern: Mismatched type to problem creates new problems instead of solving old ones.

My Journey Through All Three

In building production GRC systems, I've learned all three types are essential - they're just different disciplines.

Type 1 (Scripting): Python scripts for evidence collection. Our operational backbone. Simple, reliable, mechanical. Good implementation engineering, not systems architecture.

Type 2 (AI-Enhanced Scripting): LLM APIs in assessment workflows. Vendor questionnaires → Claude analyzes → Scripts route. Still scripting, enhanced with AI reasoning.

Type 3 (Systems Engineering): Autonomous TPRM agents. Required different thinking - evaluation frameworks, probabilistic outputs, orchestration architecture. Not scripting. System design.

The lesson: Most of our work is still scripting (Types 1 and 2). Type 3 is reserved for problems that genuinely need autonomous decisions at scale.

Understanding which discipline your problem needs - that's the difference between building the right thing and building the impressive thing.

What This Means For You

The application: what to put in place now.

Understanding the three types isn't about declaring Type 3 superior. It's about recognizing they're different disciplines solving different problems.

Type 1 is GRC Scripting. Type 2 is AI-Enhanced Scripting. Type 3 is GRC Systems Engineering.

All three are necessary. The question is: which discipline does your problem need?

Ask yourself:

  • Am I doing plumbing (implementing a solution) or architecture (designing a system)?

  • Am I solving an immediate task or building scalable infrastructure?

  • Do I need a script or do I need a system?

For most GRC work, you need scripting (Types 1 and 2). Evidence collection, status checks, compliance reporting - these need good implementation, not complex architecture.

But for problems that require autonomous decisions at scale, you need systems engineering (Type 3). That's a different skill set: evaluation frameworks, probabilistic thinking, orchestration design.

The honest questions:

  • Could I optimize the manual process? (Maybe no automation needed)

  • Is this a plumbing problem? (Type 1 or 2)

  • Is this an architecture problem? (Type 3)

The future of GRC Engineering is understanding when you're doing plumbing and when you're architecting systems. Both matter. They're just different disciplines.

Build what the problem needs, not what sounds most sophisticated.


That’s all for this week’s issue, folks!


See you next week!
