The highest tagged major version is v2.

semanticfw

package module

Version: v1.1.0 (latest) | Published: Jan 11, 2026 | License: MIT | Imports: 18 | Imported by: 0

Repository

github.com/BlackVectorOps/semantic_firewall

README ¶

Semantic Firewall

Detect logic corruption that bypasses code reviews.

Semantic Firewall generates deterministic fingerprints of your Go code's behavior, not its bytes. It uses Scalar Evolution (SCEV) analysis to prove that syntactically different loops are mathematically identical, and a Semantic Zipper to diff architectural changes without the noise.

Quick Start

# Install
go install github.com/BlackVectorOps/semantic_firewall/cmd/sfw@latest

# Fingerprint a file
sfw check ./main.go

# Semantic diff between two versions
sfw diff old_version.go new_version.go

Check Output:

{
  "file": "./main.go",
  "functions": [
    { "function": "main", "fingerprint": "005efb52a8c9d1e3..." }
  ]
}

Diff Output (The Zipper):

{
  "summary": {
    "semantic_match_pct": 92.5,
    "preserved": 12,
    "modified": 1
  },
  "functions": [
    {
      "function": "HandleLogin",
      "status": "modified",
      "added_ops": ["Call <log.Printf>", "Call <net.Dial>"],
      "removed_ops": []
    }
  ]
}

Why Use This?

"Don't unit tests solve this?" No. Unit tests verify correctness (does input A produce output B?). sfw verifies intent and integrity.

A developer refactors a function but secretly adds a network call → unit tests pass, sfw fails.
A developer changes a switch to a Strategy Pattern → git diff shows 100 lines changed, sfw diff shows zero logic changes.

Traditional Tooling	Semantic Firewall
Git Diff — Shows lines changed (whitespace, renaming = noise)	sfw check — Verifies control flow graph identity
Unit Tests — Verify input/output (blind to side effects)	sfw diff — Isolates actual logic drift from cosmetic changes

Use cases:

Supply chain security — Detect backdoors like the xz attack that pass code review
Safe refactoring — Prove your refactor didn't change behavior
CI/CD gates — Block PRs that alter critical function logic

CI Integration: Blocker & Reporter Modes

sfw supports two distinct CI roles:

Blocker Mode: When a PR claims to be a refactor (via title or semantic-safe label), sfw enforces strict semantic equivalence. Any logic change fails the build.
Reporter Mode: On feature PRs, sfw runs a semantic diff and generates a drift report (e.g., "Semantic Match: 80%"), helping reviewers focus on the code where behavior actually changed.

GitHub Actions Workflow

name: Semantic Firewall

on:
  pull_request:
    branches: [ "main" ]
    types: [opened, synchronize, reopened, labeled]

jobs:
  semantic-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-go@v5
        with:
          go-version: '1.24'

      - name: Install sfw
        run: go install github.com/BlackVectorOps/semantic_firewall/cmd/sfw@latest

      - name: Determine Mode
        id: mode
        run: |
          if [[ "${{ contains(github.event.pull_request.labels.*.name, 'semantic-safe') }}" == "true" ]] || \
             [[ "${{ contains(github.event.pull_request.title, 'refactor') }}" == "true" ]]; then
            echo "mode=BLOCKER" >> $GITHUB_OUTPUT
          else
            echo "mode=REPORTER" >> $GITHUB_OUTPUT
          fi

      - name: Run Blocker Check
        if: steps.mode.outputs.mode == 'BLOCKER'
        run: sfw check ./

      - name: Run Reporter Diff
        if: steps.mode.outputs.mode == 'REPORTER'
        run: |
          BASE_SHA=${{ github.event.pull_request.base.sha }}
          git diff --name-only "$BASE_SHA" HEAD -- '*.go' | while IFS= read -r file; do
            [ -f "$file" ] || continue
            git show "$BASE_SHA:$file" > old.go 2>/dev/null || touch old.go
            sfw diff old.go "$file" | jq .
            rm old.go
          done

Library Usage

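package main

import (
	"fmt"
	"log"

	semanticfw "github.com/BlackVectorOps/semantic_firewall"
)

func main() {
	// Go source to fingerprint, supplied as an in-memory string.
	src := `package main
func Add(a, b int) int { return a + b }
`

	// Fingerprint every function in the snippet using the default literal policy.
	results, err := semanticfw.FingerprintSource("example.go", src, semanticfw.DefaultLiteralPolicy)
	if err != nil {
		log.Fatal(err)
	}

	// Print one "name: fingerprint" line per function.
	for _, r := range results {
		fmt.Printf("%s: %s\n", r.FunctionName, r.Fingerprint)
	}
}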

Technical Deep Dive

How It Works

Parse — Load Go source into SSA (Static Single Assignment) form
Canonicalize — Normalize variable names, branch ordering, loop structures
Fingerprint — SHA-256 hash of the canonical IR

The result: semantically equivalent code produces identical fingerprints.
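
The three steps above can be illustrated with a deliberately tiny sketch. It does not use the library or real SSA: the instruction strings, the % prefix for value names, and the helpers canonicalize and fingerprint are all hypothetical, chosen only to show why renaming does not change a hash of the canonical form.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// canonicalize rewrites every value name (here: tokens starting with '%')
// to v0, v1, ... in order of first appearance, so the output no longer
// depends on the names chosen by the author or the compiler.
func canonicalize(instrs []string) string {
	names := map[string]string{}
	var out []string
	for _, in := range instrs {
		fields := strings.Fields(in)
		for i, f := range fields {
			if !strings.HasPrefix(f, "%") {
				continue
			}
			if _, ok := names[f]; !ok {
				names[f] = fmt.Sprintf("v%d", len(names))
			}
			fields[i] = names[f]
		}
		out = append(out, strings.Join(fields, " "))
	}
	return strings.Join(out, "\n")
}

// fingerprint hashes the canonical text with SHA-256.
func fingerprint(instrs []string) string {
	sum := sha256.Sum256([]byte(canonicalize(instrs)))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := []string{"%sum = add %a %b", "ret %sum"}
	b := []string{"%t0 = add %x %y", "ret %t0"}
	c := []string{"%t0 = sub %x %y", "ret %t0"}

	fmt.Println(fingerprint(a) == fingerprint(b)) // true: only the names differ
	fmt.Println(fingerprint(a) == fingerprint(c)) // false: the operation differs
}
```

The real canonicalizer also normalizes branch ordering and loop structure, which this sketch leaves out.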

Scalar Evolution (SCEV) Analysis

Standard hashing is brittle: rewriting a for i := 0 loop as a for range loop changes the hash. sfw solves this with an SCEV engine (scev.go) that solves loops algebraically:

Induction Variable Detection: Classifies loop variables as Add Recurrences: {Start, +, Step}
Trip Count Derivation: Proves that a range loop and an index loop iterate the same number of times
Loop Invariant Hoisting: Invariant expressions (e.g., len(s)) are virtually hoisted, so manual optimizations don't alter fingerprints

Result: Refactor loop syntax freely. If the math is the same, the fingerprint is the same.

The Semantic Zipper

When logic does change (e.g., architectural refactors), fingerprint comparison fails. The Zipper algorithm (zipper.go) takes two SSA graphs and "zips" them together starting from function parameters:

Anchor Alignment: Parameters and free variables establish deterministic entry points
Forward Propagation: Follows def-use chains forward from the anchors to match semantically equivalent nodes
Divergence Isolation: Reports exactly what changed (e.g., "added Call <net.Dial>, preserved all assignments")

Result: A semantic changelog that ignores renaming, reordering, and helper extraction.
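
To make the shape of the added_ops and removed_ops output concrete, here is a much simpler stand-in for the divergence step: it compares two flat lists of operation strings as multisets. It performs no graph alignment at all, so it is not the Zipper algorithm; diffOps and the checkPassword call are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// diffOps compares two lists of canonical operations as multisets and
// returns the operations found only in newOps (added) or only in oldOps
// (removed). Results are sorted so the output is deterministic.
func diffOps(oldOps, newOps []string) (added, removed []string) {
	counts := map[string]int{}
	for _, op := range oldOps {
		counts[op]--
	}
	for _, op := range newOps {
		counts[op]++
	}
	for op, n := range counts {
		for ; n > 0; n-- {
			added = append(added, op)
		}
		for ; n < 0; n++ {
			removed = append(removed, op)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	oldOps := []string{"Call <checkPassword>", "Return"}
	newOps := []string{"Call <checkPassword>", "Call <log.Printf>", "Call <net.Dial>", "Return"}

	added, removed := diffOps(oldOps, newOps)
	fmt.Println(added)        // [Call <log.Printf> Call <net.Dial>]
	fmt.Println(len(removed)) // 0
}
```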

Security Hardening

Cycle Detection: Prevents stack overflow DoS from malformed cyclic graphs
IR Injection Prevention: Sanitizes string literals and struct tags to prevent fake instruction injection
NaN-Safe Comparisons: Limits branch normalization to integer/string types to avoid floating-point edge cases

License

MIT License — See LICENSE for details.

Documentation ¶

Index ¶

Variables
func AnalyzeSCEV(info *LoopInfo)
func BuildSSAFromPackages(initialPkgs []*packages.Package) (*ssa.Program, *ssa.Package, error)
func ReleaseCanonicalizer(c *Canonicalizer)
type Canonicalizer
- func AcquireCanonicalizer(policy LiteralPolicy) *Canonicalizer
- func NewCanonicalizer(policy LiteralPolicy) *Canonicalizer
- func (c *Canonicalizer) ApplyVirtualControlFlowFromState(swappedBlocks map[*ssa.BasicBlock]bool, ...)
- func (c *Canonicalizer) CanonicalizeFunction(fn *ssa.Function) string
type FingerprintResult
- func FingerprintPackages(initialPkgs []*packages.Package, policy LiteralPolicy, strictMode bool) ([]FingerprintResult, error)
- func FingerprintSource(filename string, src string, policy LiteralPolicy) ([]FingerprintResult, error)
- func FingerprintSourceAdvanced(filename string, src string, policy LiteralPolicy, strictMode bool) ([]FingerprintResult, error)
- func GenerateFingerprint(fn *ssa.Function, policy LiteralPolicy, strictMode bool) FingerprintResult
- func (r FingerprintResult) GetSSAFunction() *ssa.Function
type IVType
type InductionVariable
type LiteralPolicy
- func (p *LiteralPolicy) ShouldAbstract(c *ssa.Const, usageContext ssa.Instruction) bool
type Loop
- func (l *Loop) String() string
type LoopInfo
- func DetectLoops(fn *ssa.Function) *LoopInfo
type Renamer
type SCEV
type SCEVAddRec
- func (s *SCEVAddRec) EvaluateAt(k *big.Int) *big.Int
- func (s *SCEVAddRec) IsLoopInvariant(loop *Loop) bool
- func (s *SCEVAddRec) Name() string
- func (s *SCEVAddRec) Parent() *ssa.Function
- func (s *SCEVAddRec) Pos() token.Pos
- func (s *SCEVAddRec) Referrers() *[]ssa.Instruction
- func (s *SCEVAddRec) String() string
- func (s *SCEVAddRec) StringWithRenamer(r Renamer) string
- func (s *SCEVAddRec) Type() types.Type
type SCEVConstant
- func SCEVFromConst(c *ssa.Const) *SCEVConstant
- func (s *SCEVConstant) EvaluateAt(k *big.Int) *big.Int
- func (s *SCEVConstant) IsLoopInvariant(loop *Loop) bool
- func (s *SCEVConstant) Name() string
- func (s *SCEVConstant) Parent() *ssa.Function
- func (s *SCEVConstant) Pos() token.Pos
- func (s *SCEVConstant) Referrers() *[]ssa.Instruction
- func (s *SCEVConstant) String() string
- func (s *SCEVConstant) StringWithRenamer(r Renamer) string
- func (s *SCEVConstant) Type() types.Type
type SCEVGenericExpr
- func (s *SCEVGenericExpr) EvaluateAt(k *big.Int) *big.Int
- func (s *SCEVGenericExpr) IsLoopInvariant(loop *Loop) bool
- func (s *SCEVGenericExpr) Name() string
- func (s *SCEVGenericExpr) Parent() *ssa.Function
- func (s *SCEVGenericExpr) Pos() token.Pos
- func (s *SCEVGenericExpr) Referrers() *[]ssa.Instruction
- func (s *SCEVGenericExpr) String() string
- func (s *SCEVGenericExpr) StringWithRenamer(r Renamer) string
- func (s *SCEVGenericExpr) Type() types.Type
type SCEVUnknown
- func (s *SCEVUnknown) EvaluateAt(k *big.Int) *big.Int
- func (s *SCEVUnknown) IsLoopInvariant(loop *Loop) bool
- func (s *SCEVUnknown) Name() string
- func (s *SCEVUnknown) Parent() *ssa.Function
- func (s *SCEVUnknown) Pos() token.Pos
- func (s *SCEVUnknown) Referrers() *[]ssa.Instruction
- func (s *SCEVUnknown) String() string
- func (s *SCEVUnknown) StringWithRenamer(r Renamer) string
- func (s *SCEVUnknown) Type() types.Type
type Zipper
- func NewZipper(oldFn, newFn *ssa.Function, policy LiteralPolicy) (*Zipper, error)
- func (z *Zipper) ComputeDiff() (*ZipperArtifacts, error)
type ZipperArtifacts

Constants ¶

This section is empty.

Variables ¶

var DefaultLiteralPolicy = LiteralPolicy{
	AbstractControlFlowComparisons: true,
	KeepSmallIntegerIndices:        true,
	KeepReturnStatusValues:         true,

	KeepStringLiterals: false,
	SmallIntMin:        -16,
	SmallIntMax:        16,
	AbstractOtherTypes: true,
}

DefaultLiteralPolicy represents the standard policy for fingerprinting; it preserves small integers used for indexing and status codes while masking magic numbers and large constants.

var KeepAllLiteralsPolicy = LiteralPolicy{
	AbstractControlFlowComparisons: false,
	KeepSmallIntegerIndices:        true,
	KeepReturnStatusValues:         true,
	KeepStringLiterals:             true,
	SmallIntMin:                    math.MinInt64,
	SmallIntMax:                    math.MaxInt64,
	AbstractOtherTypes:             false,
}

KeepAllLiteralsPolicy is designed for testing or exact matching by disabling most abstractions and expanding the "small" integer range to the full int64 spectrum.
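
The effect of the small-integer window can be sketched without the library. The toyPolicy type, its render method, and the <int> placeholder below are hypothetical; the real ShouldAbstract also inspects the constant's type and usage context, which this sketch ignores.

```go
package main

import "fmt"

// toyPolicy models only the small-integer window of a literal policy.
type toyPolicy struct {
	KeepSmallIntegers        bool
	SmallIntMin, SmallIntMax int64
}

// render returns the literal itself when the policy keeps it,
// or a generic placeholder when the value is abstracted away.
func (p toyPolicy) render(v int64) string {
	if p.KeepSmallIntegers && v >= p.SmallIntMin && v <= p.SmallIntMax {
		return fmt.Sprint(v)
	}
	return "<int>"
}

func main() {
	p := toyPolicy{KeepSmallIntegers: true, SmallIntMin: -16, SmallIntMax: 16}

	fmt.Println(p.render(0))    // 0
	fmt.Println(p.render(-1))   // -1
	fmt.Println(p.render(4096)) // <int>
}
```

With a window like this, two functions that differ only in a large constant would produce the same canonical text, while a change to a small index or status code would still be visible.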

Functions ¶

func AnalyzeSCEV ¶

func AnalyzeSCEV(info *LoopInfo)

AnalyzeSCEV is the main entry point for SCEV analysis on a LoopInfo.

func BuildSSAFromPackages ¶

func BuildSSAFromPackages(initialPkgs []*packages.Package) (*ssa.Program, *ssa.Package, error)

BuildSSAFromPackages constructs the Static Single Assignment (SSA) form from loaded Go packages. It returns the complete program and the target package for analysis.

func ReleaseCanonicalizer ¶

func ReleaseCanonicalizer(c *Canonicalizer)

Types ¶

type Canonicalizer ¶

type Canonicalizer struct {
	Policy     LiteralPolicy
	StrictMode bool
	// contains filtered or unexported fields
}

Canonicalizer transforms an SSA function into a deterministic string representation.

func AcquireCanonicalizer ¶

func AcquireCanonicalizer(policy LiteralPolicy) *Canonicalizer

func NewCanonicalizer ¶

func NewCanonicalizer(policy LiteralPolicy) *Canonicalizer

func (*Canonicalizer) ApplyVirtualControlFlowFromState ¶

func (c *Canonicalizer) ApplyVirtualControlFlowFromState(swappedBlocks map[*ssa.BasicBlock]bool, virtualBinOps map[*ssa.BinOp]token.Token)

func (*Canonicalizer) CanonicalizeFunction ¶

func (c *Canonicalizer) CanonicalizeFunction(fn *ssa.Function) string

type FingerprintResult ¶

type FingerprintResult struct {
	FunctionName string
	Fingerprint  string
	CanonicalIR  string
	Pos          token.Pos
	Line         int
	Filename     string
	// contains filtered or unexported fields
}

FingerprintResult encapsulates the output of the semantic fingerprinting process for a function.

func FingerprintPackages ¶

func FingerprintPackages(initialPkgs []*packages.Package, policy LiteralPolicy, strictMode bool) ([]FingerprintResult, error)

FingerprintPackages iterates over loaded packages to construct SSA and generate results.

func FingerprintSource ¶

func FingerprintSource(filename string, src string, policy LiteralPolicy) ([]FingerprintResult, error)

FingerprintSource analyzes a single Go source file provided as a string. This is the primary entry point for verifying code snippets or patch hunks.

func FingerprintSourceAdvanced ¶

func FingerprintSourceAdvanced(filename string, src string, policy LiteralPolicy, strictMode bool) ([]FingerprintResult, error)

FingerprintSourceAdvanced provides an extended interface for source analysis with strict mode control.

func GenerateFingerprint ¶

func GenerateFingerprint(fn *ssa.Function, policy LiteralPolicy, strictMode bool) FingerprintResult

GenerateFingerprint generates the hash and canonical string representation for an SSA function. This function uses a pooled Canonicalizer to ensure high throughput and low allocation overhead.
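
An acquire/release pair such as AcquireCanonicalizer and ReleaseCanonicalizer is typically backed by an object pool. The sketch below shows that general pattern with sync.Pool from the standard library. Whether this package uses sync.Pool internally is an assumption, and scratch, acquire, release, and render are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// scratch is a hypothetical stand-in for reusable per-call state.
type scratch struct {
	buf strings.Builder
}

var pool = sync.Pool{New: func() any { return new(scratch) }}

// acquire takes a scratch object from the pool, allocating one if needed.
func acquire() *scratch { return pool.Get().(*scratch) }

// release clears the object so no state leaks into the next user,
// then hands it back to the pool.
func release(s *scratch) {
	s.buf.Reset()
	pool.Put(s)
}

// render borrows a scratch object for the duration of one call.
func render(name string) string {
	s := acquire()
	defer release(s)
	s.buf.WriteString("func ")
	s.buf.WriteString(name)
	return s.buf.String()
}

func main() {
	fmt.Println(render("Add")) // func Add
	fmt.Println(render("Sub")) // func Sub
}
```

Resetting on release is the important part of the design: a pooled object that keeps state from a previous function would make the output depend on call order.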

func (FingerprintResult) GetSSAFunction ¶ added in v1.1.0

func (r FingerprintResult) GetSSAFunction() *ssa.Function

GetSSAFunction returns the underlying SSA function for advanced analysis workflows such as semantic diffing with the Zipper algorithm. Returns nil if not available.

type IVType ¶

type IVType int

const (
	IVTypeUnknown    IVType = iota
	IVTypeBasic             // {S, +, C}
	IVTypeDerived           // Affine: A * IV + B
	IVTypeGeometric         // {S, *, C}
	IVTypePolynomial        // Step is another IV
)
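
The comments on these constants describe closed forms that can be written out directly. The following sketch evaluates one example of each category with plain int64 arithmetic; the function names are hypothetical and the code does not reflect how the library classifies phi nodes.

```go
package main

import "fmt"

// basicIV evaluates {start, +, step} at iteration k: start + step*k.
func basicIV(start, step, k int64) int64 { return start + step*k }

// derivedIV evaluates the affine form a*iv + b for a basic IV value iv.
func derivedIV(a, b, iv int64) int64 { return a*iv + b }

// geometricIV evaluates {start, *, ratio} at iteration k: start * ratio^k.
func geometricIV(start, ratio, k int64) int64 {
	v := start
	for i := int64(0); i < k; i++ {
		v *= ratio
	}
	return v
}

// polynomialIV evaluates an IV whose step is itself a basic IV {s0, +, c}:
// v(0) = start and v(k+1) = v(k) + basicIV(s0, c, k).
func polynomialIV(start, s0, c, k int64) int64 {
	v := start
	for i := int64(0); i < k; i++ {
		v += basicIV(s0, c, i)
	}
	return v
}

func main() {
	for k := int64(0); k < 4; k++ {
		i := basicIV(0, 1, k) // i := 0; i++
		fmt.Println(i, derivedIV(4, 8, i), geometricIV(1, 2, k), polynomialIV(0, 0, 1, k))
	}
}
```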

type InductionVariable ¶

type InductionVariable struct {
	Phi   *ssa.Phi
	Type  IVType
	Start SCEV // Value at iteration 0
	Step  SCEV // Update stride
}

InductionVariable describes a detected IV. Reference: Section 3.2 Classification Taxonomy.

type LiteralPolicy ¶

type LiteralPolicy struct {
	AbstractControlFlowComparisons bool
	KeepSmallIntegerIndices        bool
	KeepReturnStatusValues         bool
	// FIX: Added flag to keep string literals.
	KeepStringLiterals bool
	SmallIntMin        int64
	SmallIntMax        int64
	AbstractOtherTypes bool
}

LiteralPolicy defines the configurable strategy for determining which literal values should be abstracted into placeholders during canonicalization. It allows fine-grained control over integer abstraction in different contexts.

func (*LiteralPolicy) ShouldAbstract ¶

func (p *LiteralPolicy) ShouldAbstract(c *ssa.Const, usageContext ssa.Instruction) bool

ShouldAbstract decides whether a given constant should be replaced by a generic placeholder. It analyzes the constant's type, value, and immediate usage context in the SSA graph.

type Loop ¶

type Loop struct {
	Header *ssa.BasicBlock
	Latch  *ssa.BasicBlock // Primary source of the backedge

	// Blocks contains all basic blocks within the loop body.
	Blocks map[*ssa.BasicBlock]bool
	// Exits contains blocks inside the loop that have successors outside.
	Exits []*ssa.BasicBlock

	// Hierarchy
	Parent   *Loop
	Children []*Loop

	// Semantic Analysis (populated in scev.go)
	Inductions map[*ssa.Phi]*InductionVariable
	TripCount  SCEV // Symbolic expression
}

Loop represents a natural loop in the SSA graph. Reference: Section 2.3 Natural Loops.

func (*Loop) String ¶

func (l *Loop) String() string

type LoopInfo ¶

type LoopInfo struct {
	Function *ssa.Function
	Loops    []*Loop // Top-level loops (roots of the hierarchy)
	// Map from Header block to Loop object for O(1) lookup
	LoopMap map[*ssa.BasicBlock]*Loop
}

LoopInfo summarizes loop analysis for a single function.

func DetectLoops ¶

func DetectLoops(fn *ssa.Function) *LoopInfo

DetectLoops reconstructs the loop hierarchy using dominance relations. Reference: Section 2.3.1 Algorithm: Detecting Natural Loops.
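
Dominance-based loop detection can be sketched on a toy control-flow graph with integer block IDs. This is the textbook approach (iterative dominators, then back edges whose target dominates their source), not the library's implementation. The cfg type is hypothetical, every block is assumed reachable from block 0, and each header is assumed to have a single latch.

```go
package main

import (
	"fmt"
	"sort"
)

// cfg is a toy control-flow graph: succs[b] lists the successors of block b.
// Block 0 is the entry block.
type cfg struct {
	succs [][]int
}

// preds builds the predecessor lists from the successor lists.
func (g cfg) preds() [][]int {
	p := make([][]int, len(g.succs))
	for b, ss := range g.succs {
		for _, s := range ss {
			p[s] = append(p[s], b)
		}
	}
	return p
}

// dominators returns dom where dom[b][d] is true when block d dominates
// block b, using the classic iterative data-flow formulation.
func (g cfg) dominators() []map[int]bool {
	n := len(g.succs)
	preds := g.preds()
	dom := make([]map[int]bool, n)
	for b := range dom {
		dom[b] = map[int]bool{}
		if b == 0 {
			dom[b][0] = true
			continue
		}
		for d := 0; d < n; d++ {
			dom[b][d] = true // start from "everything dominates b"
		}
	}
	for changed := true; changed; {
		changed = false
		for b := 1; b < n; b++ {
			// dom(b) = {b} plus the intersection of dom(p) over all preds p.
			next := map[int]bool{b: true}
			for d := 0; d < n; d++ {
				all := len(preds[b]) > 0
				for _, p := range preds[b] {
					all = all && dom[p][d]
				}
				if all {
					next[d] = true
				}
			}
			if len(next) != len(dom[b]) {
				dom[b], changed = next, true
			}
		}
	}
	return dom
}

// naturalLoops maps each loop header to the sorted blocks of its loop.
// A back edge is an edge latch->header where header dominates latch; the
// body is found by walking predecessors backwards from the latch.
func (g cfg) naturalLoops() map[int][]int {
	dom, preds := g.dominators(), g.preds()
	loops := map[int][]int{}
	for latch, ss := range g.succs {
		for _, header := range ss {
			if !dom[latch][header] {
				continue
			}
			body := map[int]bool{header: true}
			stack := []int{latch}
			for len(stack) > 0 {
				b := stack[len(stack)-1]
				stack = stack[:len(stack)-1]
				if body[b] {
					continue
				}
				body[b] = true
				stack = append(stack, preds[b]...)
			}
			var blocks []int
			for b := range body {
				blocks = append(blocks, b)
			}
			sort.Ints(blocks)
			loops[header] = blocks
		}
	}
	return loops
}

func main() {
	// 0 -> 1 -> {2, 3}, 2 -> 1: block 1 is the header, block 2 the latch.
	g := cfg{succs: [][]int{{1}, {2, 3}, {1}, {}}}
	fmt.Println(g.naturalLoops()) // map[1:[1 2]]
}
```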

type Renamer ¶

type Renamer func(ssa.Value) string

Renamer is a function that maps an SSA value to its canonical name. This is used to ensure deterministic output regardless of SSA register naming.

type SCEV ¶

type SCEV interface {
	ssa.Value
	EvaluateAt(k *big.Int) *big.Int
	IsLoopInvariant(loop *Loop) bool
	String() string
	// StringWithRenamer returns a canonical string using the provided renamer
	// function to map SSA values to their canonical names (e.g., v0, v1).
	// This is critical for determinism: without it, raw SSA names (t0, t1)
	// would leak into fingerprints, breaking semantic equivalence.
	StringWithRenamer(r Renamer) string
}

SCEV represents a scalar expression.

type SCEVAddRec ¶

type SCEVAddRec struct {
	Start SCEV
	Step  SCEV
	Loop  *Loop
}

SCEVAddRec represents an Add Recurrence: {Start, +, Step}_L Reference: Section 4.1 The Add Recurrence Abstraction.

func (*SCEVAddRec) EvaluateAt ¶

func (s *SCEVAddRec) EvaluateAt(k *big.Int) *big.Int

func (*SCEVAddRec) IsLoopInvariant ¶

func (s *SCEVAddRec) IsLoopInvariant(loop *Loop) bool

func (*SCEVAddRec) Name ¶

func (s *SCEVAddRec) Name() string

ssa.Value Stubs

func (*SCEVAddRec) Parent ¶

func (s *SCEVAddRec) Parent() *ssa.Function

func (*SCEVAddRec) Pos ¶

func (s *SCEVAddRec) Pos() token.Pos

func (*SCEVAddRec) Referrers ¶

func (s *SCEVAddRec) Referrers() *[]ssa.Instruction

func (*SCEVAddRec) String ¶

func (s *SCEVAddRec) String() string

func (*SCEVAddRec) StringWithRenamer ¶

func (s *SCEVAddRec) StringWithRenamer(r Renamer) string

func (*SCEVAddRec) Type ¶

func (s *SCEVAddRec) Type() types.Type

type SCEVConstant ¶

type SCEVConstant struct {
	Value *big.Int
}

SCEVConstant represents a literal integer constant.

func SCEVFromConst ¶

func SCEVFromConst(c *ssa.Const) *SCEVConstant

func (*SCEVConstant) EvaluateAt ¶

func (s *SCEVConstant) EvaluateAt(k *big.Int) *big.Int

func (*SCEVConstant) IsLoopInvariant ¶

func (s *SCEVConstant) IsLoopInvariant(loop *Loop) bool

func (*SCEVConstant) Name ¶

func (s *SCEVConstant) Name() string

ssa.Value Stubs

func (*SCEVConstant) Parent ¶

func (s *SCEVConstant) Parent() *ssa.Function

func (*SCEVConstant) Pos ¶

func (s *SCEVConstant) Pos() token.Pos

func (*SCEVConstant) Referrers ¶

func (s *SCEVConstant) Referrers() *[]ssa.Instruction

func (*SCEVConstant) String ¶

func (s *SCEVConstant) String() string

func (*SCEVConstant) StringWithRenamer ¶

func (s *SCEVConstant) StringWithRenamer(r Renamer) string

func (*SCEVConstant) Type ¶

func (s *SCEVConstant) Type() types.Type

type SCEVGenericExpr ¶

type SCEVGenericExpr struct {
	Op token.Token
	X  SCEV
	Y  SCEV
}

SCEVGenericExpr represents binary operations like Add/Mul for formulas.

func (*SCEVGenericExpr) EvaluateAt ¶

func (s *SCEVGenericExpr) EvaluateAt(k *big.Int) *big.Int

func (*SCEVGenericExpr) IsLoopInvariant ¶

func (s *SCEVGenericExpr) IsLoopInvariant(loop *Loop) bool

func (*SCEVGenericExpr) Name ¶

func (s *SCEVGenericExpr) Name() string

ssa.Value Stubs

func (*SCEVGenericExpr) Parent ¶

func (s *SCEVGenericExpr) Parent() *ssa.Function

func (*SCEVGenericExpr) Pos ¶

func (s *SCEVGenericExpr) Pos() token.Pos

func (*SCEVGenericExpr) Referrers ¶

func (s *SCEVGenericExpr) Referrers() *[]ssa.Instruction

func (*SCEVGenericExpr) String ¶

func (s *SCEVGenericExpr) String() string

func (*SCEVGenericExpr) StringWithRenamer ¶

func (s *SCEVGenericExpr) StringWithRenamer(r Renamer) string

func (*SCEVGenericExpr) Type ¶

func (s *SCEVGenericExpr) Type() types.Type

type SCEVUnknown ¶

type SCEVUnknown struct {
	Value       ssa.Value
	IsInvariant bool // Explicitly tracks invariance relative to the analysis loop scope
}

SCEVUnknown represents a symbolic value (e.g., parameter or unanalyzable instr).

func (*SCEVUnknown) EvaluateAt ¶

func (s *SCEVUnknown) EvaluateAt(k *big.Int) *big.Int

func (*SCEVUnknown) IsLoopInvariant ¶

func (s *SCEVUnknown) IsLoopInvariant(loop *Loop) bool

func (*SCEVUnknown) Name ¶

func (s *SCEVUnknown) Name() string

ssa.Value Stubs

func (*SCEVUnknown) Parent ¶

func (s *SCEVUnknown) Parent() *ssa.Function

func (*SCEVUnknown) Pos ¶

func (s *SCEVUnknown) Pos() token.Pos

func (*SCEVUnknown) Referrers ¶

func (s *SCEVUnknown) Referrers() *[]ssa.Instruction

func (*SCEVUnknown) String ¶

func (s *SCEVUnknown) String() string

func (*SCEVUnknown) StringWithRenamer ¶

func (s *SCEVUnknown) StringWithRenamer(r Renamer) string

func (*SCEVUnknown) Type ¶

func (s *SCEVUnknown) Type() types.Type

type Zipper ¶ added in v1.1.0

type Zipper struct {
	// contains filtered or unexported fields
}

Zipper implements the semantic delta analysis algorithm.

func NewZipper ¶ added in v1.1.0

func NewZipper(oldFn, newFn *ssa.Function, policy LiteralPolicy) (*Zipper, error)

NewZipper creates a new analysis session.

func (*Zipper) ComputeDiff ¶ added in v1.1.0

func (z *Zipper) ComputeDiff() (*ZipperArtifacts, error)

ComputeDiff executes the Zipper Algorithm Phases.

type ZipperArtifacts ¶ added in v1.1.0

type ZipperArtifacts struct {
	OldFunction  string
	NewFunction  string
	MatchedNodes int
	Added        []string
	Removed      []string
	Preserved    bool
}

ZipperArtifacts encapsulates the results of the semantic delta analysis.

Directories ¶

Path	Synopsis
cmd/sfw (command)	Package main provides the sfw CLI tool for semantic fingerprinting of Go source files.